–Variation in copy number (common in repetitive parts of the genome)
–Numbers of extra or missing bases (also common in repetitive regions)
–Base differences in the sequence
Called single nucleotide polymorphisms (SNPs)
Relatively easy to detect and quantify
Define SNP
Single Nucleotide Polymorphism
Different individuals in a species have a different base at the same position in the genome
Sources of genetic bottlenecks
SNP variation in humans:
–Any two genomes differ at (roughly) 1 base pair in 1000
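To put the 1-in-1000 figure in perspective, here is a rough back-of-the-envelope sketch. The genome length of about 3.2 billion base pairs is an assumption added here (it is not stated on the card), and the function name is illustrative only.

```python
# Assumed haploid human genome length in base pairs (not given in the notes).
GENOME_LENGTH_BP = 3_200_000_000

# From the notes: roughly one differing base pair per 1000.
BASES_PER_DIFFERENCE = 1000


def expected_differences(genome_length_bp, bases_per_difference):
    """Approximate number of single-base differences between two genomes."""
    return genome_length_bp // bases_per_difference


print(expected_differences(GENOME_LENGTH_BP, BASES_PER_DIFFERENCE))
```

Under that assumption, two genomes would differ at a few million positions.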
Sources of variation within species:
What are implications of genetic drift?
Small populations have less variation than larger ones
Population size needs to be thought of as an average over time
Many populations undergo bottlenecks
–Large, rapid reduction in population size
–Followed by population growth
–Domestication
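One common way to average population size over time is the harmonic mean, which is dominated by the smallest generations. The sketch below assumes that is the kind of average meant here; the population sizes are made-up illustrative numbers, not data from the lectures.

```python
def harmonic_mean_size(sizes):
    """Harmonic mean of per-generation population sizes."""
    if not sizes or any(n <= 0 for n in sizes):
        raise ValueError("sizes must be a non-empty list of positive numbers")
    return len(sizes) / sum(1.0 / n for n in sizes)


# Hypothetical population: one bottleneck generation of 100 individuals.
sizes = [10_000, 10_000, 100, 10_000, 10_000]

arithmetic = sum(sizes) / len(sizes)
harmonic = harmonic_mean_size(sizes)
print(f"arithmetic mean: {arithmetic:.0f}, harmonic mean: {harmonic:.0f}")
```

A single bottleneck generation pulls the harmonic mean far below the arithmetic mean, which is why a population that is large today can still carry little variation.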
Linkage disequilibrium
Linkage disequilibrium (LD) is a population-based parameter that describes the degree to which an allele of one genetic variant is inherited or correlated with an allele of a nearby genetic variant within a given population
What happens if the linkage disequilibrium coefficient D = 0?
Haplotype frequencies follow the two-locus Hardy-Weinberg expectation: each equals the product of its allele frequencies (the loci are in linkage equilibrium)
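As a minimal sketch of what D measures, the code below computes D = p(AB) - p(A)p(B) from haplotype counts at two biallelic loci, plus the related r² statistic. The allele labels, function name, and counts are hypothetical, chosen only for illustration.

```python
def ld_stats(counts):
    """Return (D, r_squared) from haplotype counts keyed 'AB', 'Ab', 'aB', 'ab'."""
    total = sum(counts[h] for h in ("AB", "Ab", "aB", "ab"))
    if total == 0:
        raise ValueError("no haplotypes counted")
    p_ab = counts["AB"] / total                    # frequency of the AB haplotype
    p_a = (counts["AB"] + counts["Ab"]) / total    # frequency of allele A
    p_b = (counts["AB"] + counts["aB"]) / total    # frequency of allele B
    d = p_ab - p_a * p_b                           # deviation from independence
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else float("nan")
    return d, r2


# Hypothetical counts: alleles at the two loci combine independently, so D = 0.
print(ld_stats({"AB": 25, "Ab": 25, "aB": 25, "ab": 25}))

# Hypothetical counts: A always travels with B, so D is at its maximum.
print(ld_stats({"AB": 50, "Ab": 0, "aB": 0, "ab": 50}))
```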
How is linkage disequilibrium useful in finding genetic diseases?
Human species is young and small
–LD blocks are large and strong
Many diseases have a heritable component
We would like to find genome locations responsible
–Treatment
–Susceptibility
Difficulties using Genome-wide association studies for common diseases:
Most genetic diseases are poly-genic
In other words, there are a number of variations that contribute to the disease
–Your chance of contracting the disease might depend on how many of the disease alleles you inherit and on which particular alleles those are
The diseases have low penetrance: even if you inherit the “bad” alleles, the probability of getting the disease might be low
GWAS:
Genome-wide association studies
What does linkage analysis do?
Uses recombination to locate unknown genes relative to known ones
–We can locate disease genes by calculating the recombination fraction between the disease phenotype and known markers.
–In humans, this procedure is hampered by high degrees of homozygosity
–Instead, we calculate the likelihood of a gene of interest being at a particular location on the chromosome
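The likelihood calculation is usually reported as a LOD score: the log10 ratio of the likelihood at a candidate recombination fraction θ to the likelihood under no linkage (θ = 0.5). The sketch below assumes that is what the notes refer to, and it uses the simplest textbook case of phase-known meioses. The counts are hypothetical, and real linkage software handles far more general pedigrees.

```python
import math


def lod_score(recombinants, nonrecombinants, theta):
    """LOD score for phase-known meioses at recombination fraction theta."""
    if not 0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    n = recombinants + nonrecombinants
    # log10 likelihood of the data if marker and disease locus are linked at theta
    log_l_theta = (recombinants * math.log10(theta)
                   + nonrecombinants * math.log10(1 - theta))
    # log10 likelihood under free recombination (no linkage)
    log_l_null = n * math.log10(0.5)
    return log_l_theta - log_l_null


# Hypothetical family data: 1 recombinant out of 10 informative meioses.
for theta in (0.05, 0.1, 0.2, 0.5):
    print(theta, round(lod_score(1, 9, theta), 3))
```

The θ with the highest LOD is the best estimate of the distance between the marker and the disease locus; a LOD of 0 at θ = 0.5 reflects no evidence either way.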
How long has genetic drift been going on for?
Since speciation event
You've used GENEHUNTER to locate SNPs around a possible site of a genetic disease. Where can you look up the markers?
A) GenBank
Markers can be looked up directly in GenBank, then located on the human genome using BLAST.
PHYLIP creates trees
It's rare to pin down one gene as the source of a genetic disease
True
False
True
It is rare to pinpoint one gene; most often you find a region for further study
Cons of locating a genetic disease using classic linkage:
–Requires a known pedigree with the disease
–Requires genotyping many individuals in a single family
–Is also computationally expensive
and limited:
–Only works if the disease can be localized in a family
–Essentially works for mono-allelic diseases
How do you find genetic disease with Genome-wide associations?
We can take genetic data from a large number of unrelated individuals
Calculate the association between a large set of SNPs and disease occurrence.
Works based on LD (linkage disequilibrium)
It is very unlikely that the causative genetic difference will be captured by the specific SNPs typed
However, because LD is common in the human genome, SNPs within ~50kb are close enough to show association with the disease
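One simple way to score a single SNP is a chi-square test on a 2x2 table of allele counts in cases versus controls. This is a sketch of that idea, not necessarily the test used in the lectures. The counts are hypothetical, and the P-value uses the identity that a chi-square tail with one degree of freedom equals erfc(sqrt(x/2)).

```python
import math


def allelic_chi_square(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Chi-square statistic and P-value (1 df) for a 2x2 allele-count table."""
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    row_sums = [sum(row) for row in table]
    col_sums = [table[0][j] + table[1][j] for j in range(2)]
    total = sum(row_sums)
    if 0 in row_sums or 0 in col_sums:
        raise ValueError("every row and column needs at least one count")
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical counts: the alternate allele is more common in cases.
chi2, p = allelic_chi_square(case_alt=60, case_ref=40, ctrl_alt=40, ctrl_ref=60)
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

In a genome-wide study the same test is repeated for every typed SNP, which is why the resulting P-values need a multiple-testing correction.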
Describe the image:
Points are SNPs
>1000 individuals
500K SNPs were typed; these are the top hits
The y-axis gives the probability that this level of association between the disease and the SNP would arise by chance
For each locus, calculate the P-value: the probability of seeing an association at least this strong if the locus and the disease were not associated
Looking for cases where this value is small
–Implies it is very unlikely that a given SNP and the disease would co-occur that often by chance
Describe the study that used GWAS to search for causes of Age Related Macular Degeneration
After Bonferroni correction, two SNPs showed association with AMD
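The Bonferroni correction divides the overall significance level by the number of tests. As a small sketch, assuming a conventional significance level of 0.05 (an assumption, not stated in the notes) and the 500K typed SNPs mentioned earlier, with made-up P-values:

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under the Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def significant(p_values, alpha, n_tests):
    """Flag each P-value that survives the Bonferroni correction."""
    threshold = bonferroni_threshold(alpha, n_tests)
    return [p < threshold for p in p_values]


# 500,000 SNPs tested; the P-values below are hypothetical.
print(bonferroni_threshold(0.05, 500_000))
print(significant([1e-8, 1e-6, 0.01], alpha=0.05, n_tests=500_000))
```

With this many tests, a SNP has to reach a P-value on the order of one in ten million before it counts as a hit, so only a handful of SNPs survive.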
What were the SNPs located in the AMD study?
One of the SNPs appears to be an artifact of missing genotyping data (it is unclear how)
Second SNP was in a region with another nearly significant SNP
-In an intron for Complement factor H
-Part of the innate immune system
-Responsible for rapid response to pathogens in the blood
-Complement factor H can initiate this response
-Hypothesis: Improper activation of the complement system is involved in damaging retinal cells
The GWAS analysis of age related macular degeneration suggests that AMD might be an autoimmune disease
True
You are able to locate an essential gene in a GWAS study
False
If there is no variation in the gene because it is essential, it cannot be located via SNPs in a GWAS
What is this an example of?
A bottleneck
What is the expected outcome of a two-locus Hardy-Weinberg model? (when linkage disequilibrium is not in effect, i.e. D = 0)
It is treating the alleles/traits as independent
Author
saucyocelot
ID
360194
Card Set
Bioinformatics I (part 7)
Description
Second half of the semester, pt 7 lectures, Population Genetics/Genomics
Lectures on 10/18, 10/20, 10/25