Bioinformatics I (part 7)

  1. Ways to measure variation in mammals
    –Variation in copy number (common in repetitive parts of the genome)

    –Numbers of extra or missing bases (also common in repetitive regions)

    –Base differences in the sequence
    • Called Single Nucleotide Polymorphisms (SNPs)
    • Relatively easy to detect and quantify
  2. Define SNP
    • Single Nucleotide Polymorphism
    • Image Upload 2
    • Different individuals in a species have a different base at the same position in the sequence
  3. Sources of genetic bottlenecks
    Image Upload 4
  4. SNP variation in humans:
    –Any two genomes differ at (roughly) 1 base pair in 1000
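    As a rough back-of-the-envelope check of what that rate implies, the sketch below multiplies it by a genome length. The haploid genome size of about 3.1 billion bp is an assumption brought in for illustration; it is not stated in these notes.

```python
# Rough estimate of how many positions differ between two human genomes.
GENOME_LENGTH_BP = 3_100_000_000  # assumed haploid human genome size (approximate)
DIFF_RATE = 1 / 1000              # from the card: roughly 1 difference per 1000 bp


def expected_differences(genome_length_bp: int, diff_rate: float) -> float:
    """Expected number of differing base pairs between two genomes."""
    return genome_length_bp * diff_rate


if __name__ == "__main__":
    n = expected_differences(GENOME_LENGTH_BP, DIFF_RATE)
    print(f"about {n:,.0f} differing positions")
```

    Under these assumptions the answer is on the order of a few million differing positions.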
  5. Sources of variation within species:
    Image Upload 6
  6. What are implications of genetic drift?
    Small populations have less variation than larger ones

    Population size needs to be thought of as an average over time

    • Many populations undergo bottlenecks
    –Large, rapid reduction in population size
    –Followed by population growth
    –e.g. domestication
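    One common way to make "an average over time" concrete is the harmonic mean of the per-generation population sizes, which is dominated by the smallest values. The sketch below assumes that convention (these notes do not say which average is meant), and the population sizes are made-up numbers.

```python
from statistics import harmonic_mean, mean


def long_term_size(sizes_per_generation):
    """Harmonic mean of population sizes; small generations dominate the result."""
    return harmonic_mean(sizes_per_generation)


if __name__ == "__main__":
    # Hypothetical history: large population, a 3-generation bottleneck, then regrowth.
    sizes = [10_000] * 5 + [100] * 3 + [10_000] * 5
    print(f"arithmetic mean: {mean(sizes):.0f}")
    print(f"harmonic mean:   {long_term_size(sizes):.0f}")
```

    Even a short bottleneck pulls the harmonic mean far below the arithmetic mean, which is why a population that is large today can still carry little variation.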
  7. Linkage disequilibrium
    Linkage disequilibrium (LD) is a population-based parameter that describes the degree to which an allele of one genetic variant is inherited together with, or correlated with, an allele of a nearby genetic variant within a given population
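    A minimal sketch of how the LD coefficient D is usually computed for two biallelic loci, assuming the standard definition D = p_AB - p_A * p_B (the notes refer to this equation without writing it out). The function name and the haplotype counts are illustrative.

```python
def ld_coefficient(n_AB: int, n_Ab: int, n_aB: int, n_ab: int) -> float:
    """D = p_AB - p_A * p_B from counts of the four two-locus haplotypes."""
    total = n_AB + n_Ab + n_aB + n_ab
    p_AB = n_AB / total
    p_A = (n_AB + n_Ab) / total  # frequency of allele A at the first locus
    p_B = (n_AB + n_aB) / total  # frequency of allele B at the second locus
    return p_AB - p_A * p_B


if __name__ == "__main__":
    # Hypothetical counts: A and B travel together more often than expected.
    print(ld_coefficient(40, 10, 10, 40))  # positive D: alleles are associated
    # Hypothetical counts: every haplotype equally common.
    print(ld_coefficient(25, 25, 25, 25))  # D = 0: alleles are independent
```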
  8. What happens if the linkage disequilibrium equation gives D = 0?
    The distribution of alleles follows the expected Hardy-Weinberg equation (alleles at the two loci are independent)
  9. How is linkage disequilibrium useful in finding genetic diseases?
    • The human species is young and small
    –LD blocks are large and strong

    Many diseases have a heritable component

    • We would like to find the genome locations responsible
    –Treatment
    –Susceptibility
  10. Difficulties using Genome-wide association studies for common diseases:
    Most genetic diseases are polygenic

    • In other words, there are a number of variations that contribute to the disease
    –Your chance of contracting the disease might depend on how many of the disease alleles you inherit and on which particular alleles those are

    The diseases have low penetrance: even if you inherit the “bad” alleles, the probability of getting the disease might be low
  11. GWAS:
    Genome-wide association studies
  12. What does linkage analysis do?
    Uses recombination to locate unknown genes relative to known ones

    –We can locate disease genes by calculating the recombination fraction between the disease phenotype and known markers.

    –In humans, this procedure is hampered by high degrees of homozygosity

    –Instead, we calculate the likelihood of a gene of interest being at a particular location on the chromosome
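    The likelihood calculation is often summarized as a LOD score, which compares the likelihood of the observed recombinants at a candidate recombination fraction theta against free recombination (theta = 0.5). The sketch below assumes the simplest case of fully informative, phase-known meioses; real linkage software handles far more general pedigrees, and the counts used here are made up.

```python
import math


def lod_score(recombinants: int, meioses: int, theta: float) -> float:
    """log10 of L(theta) / L(0.5) for phase-known, fully informative meioses."""
    if not 0 < theta <= 0.5:
        raise ValueError("theta must be in (0, 0.5]")
    non_recombinants = meioses - recombinants
    log_l_theta = (recombinants * math.log10(theta)
                   + non_recombinants * math.log10(1 - theta))
    log_l_null = meioses * math.log10(0.5)  # no linkage: theta = 0.5
    return log_l_theta - log_l_null


if __name__ == "__main__":
    # Hypothetical data: 2 recombinants in 20 meioses, evaluated on a grid of theta.
    for theta in (0.05, 0.1, 0.2, 0.3, 0.5):
        print(f"theta={theta:.2f}  LOD={lod_score(2, 20, theta):.2f}")
```

    The theta with the highest LOD is the best estimate of the recombination fraction between the marker and the disease locus; by definition the score is zero at theta = 0.5.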
  13. How long has genetic drift been going on for?
    Since the speciation event
  14. If you've used Genehunter to locate SNPs around a possible site of a genetic disease, where can you look up the markers?

    A) GenBank

    • Markers can be looked up directly in GenBank, then located on the human genome using BLAST.
    • Phylio creates trees
  15. It's rare to pin down one gene as the source of a genetic disease

    True
    False
    True

    It is rare to pinpoint one gene; most often you find a region for further study
  16. Cons of locating a genetic disease using classic linkage:
    –Requires a known pedigree with the disease

    –Requires genotyping many individuals in a single family

    –Is also computationally expensive

    Also limited:
    –Only works if the disease can be localized in a family
    –Essentially works only for mono-allelic diseases
  17. How do you find genetic disease with Genome-wide associations?
    We can take genetic data from a large number of unrelated individuals

    Calculate the association between a large set of SNPs and disease occurrence.

    Works based on LD (linkage disequilibrium)

    It is very unlikely that the causative genetic difference will be captured by the specific SNPs typed

    However, because LD is common in the human genome, typed SNPs within ~50 kb of the causative variant are often close enough to show association with the disease
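    The per-SNP association step can be sketched as a Pearson chi-square test on a 2x2 table of allele counts in cases versus controls. This is one common choice, not necessarily the test used in the lecture; the function name and the counts are illustrative assumptions.

```python
import math


def allelic_association(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square test (1 df) on a 2x2 table of allele counts.

    Returns (chi2, p_value). No continuity correction is applied.
    """
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    total = sum(map(sum, table))
    row_totals = [sum(row) for row in table]
    col_totals = [table[0][j] + table[1][j] for j in range(2)]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical allele counts at one SNP in cases and controls.
    chi2, p = allelic_association(case_alt=300, case_ref=700,
                                  control_alt=200, control_ref=800)
    print(f"chi2={chi2:.2f}  p={p:.2e}")
```

    In a GWAS this test is repeated for every typed SNP, which is why the resulting P-values need a multiple-testing correction.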
  18. Describe the image:
    Image Upload 8
    Points are SNPs

    >1000 individuals

    500K SNPs were typed; these are the top hits

    The y-axis gives the probability that this level of association between the disease and the SNP would arise by chance

    Calculated as the probability of seeing an association this strong if the locus and the disease were unrelated (P-value)

    • Looking for cases where this value is small
    –Implies it is very unlikely that a given SNP and the disease would co-occur that often by chance
  19. Describe the study that used GWAS to search for causes of Age-Related Macular Degeneration (AMD)
    Image Upload 10


    After Bonferroni correction, two SNPs showed association with AMD
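    A minimal sketch of what the Bonferroni correction does: the significance level is divided by the number of SNPs tested, and only P-values below that stricter threshold count as hits. The alpha of 0.05, the SNP names and the P-values are illustrative assumptions, not values from the AMD study.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test threshold that keeps the family-wise error rate at or below alpha."""
    return alpha / n_tests


def significant_snps(p_values: dict, alpha: float = 0.05) -> list:
    """Return SNP ids whose P-value passes the Bonferroni-corrected threshold."""
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [snp for snp, p in p_values.items() if p < threshold]


if __name__ == "__main__":
    # Hypothetical P-values for four SNPs (a real study tests many thousands).
    p_values = {"snp1": 1e-9, "snp2": 0.02, "snp3": 0.004, "snp4": 0.6}
    print(bonferroni_threshold(0.05, len(p_values)))
    print(significant_snps(p_values))
```

    With hundreds of thousands of SNPs the threshold becomes very small, so only very strong associations survive.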
  20. What were the SNPs located in the AMD study?
    One of the SNPs seems to have resulted from missing genotyping data (unclear how)

    • The second SNP was in a region with another nearly significant SNP
    –In an intron of Complement factor H
    –The complement system is part of the innate immune system
    –It is responsible for rapid response to pathogens in the blood
    –Complement factor H regulates this response
    –Hypothesis: improper activation of the complement system is involved in damaging retinal cells
  21. The GWAS analysis of age-related macular degeneration suggests that AMD might be an autoimmune disease
    True
  22. You are able to locate an essential gene in a GWAS
    False

    If there is no variation in the gene because it is essential, it cannot be located via SNPs in a GWAS
  23. What is this an example of?
    Image Upload 12
    A bottleneck
  24. What is the expected outcome of a two-locus Hardy-Weinberg calculation when linkage disequilibrium is not in effect (D = 0)?
    The alleles/traits at the two loci are treated as independent

    Image Upload 14
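    The independence assumption can be sketched numerically: with D = 0, each two-locus genotype frequency is the product of the single-locus Hardy-Weinberg frequencies. The allele labels (A/a, B/b), function names and allele frequencies below are illustrative assumptions.

```python
def hardy_weinberg(p: float, allele: str) -> dict:
    """Single-locus genotype frequencies p^2, 2pq, q^2 for allele frequency p."""
    q = 1 - p
    upper, lower = allele.upper(), allele.lower()
    return {upper * 2: p * p, upper + lower: 2 * p * q, lower * 2: q * q}


def two_locus_expected(p_A: float, p_B: float) -> dict:
    """Joint genotype frequencies when the two loci are independent (D = 0)."""
    locus1 = hardy_weinberg(p_A, "A")
    locus2 = hardy_weinberg(p_B, "B")
    return {g1 + g2: f1 * f2
            for g1, f1 in locus1.items()
            for g2, f2 in locus2.items()}


if __name__ == "__main__":
    # Hypothetical allele frequencies: p(A) = 0.6, p(B) = 0.7.
    for genotype, freq in two_locus_expected(0.6, 0.7).items():
        print(f"{genotype}: {freq:.4f}")
```

    The nine genotype classes sum to 1; observed frequencies that depart from these products are a sign that D is not zero.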
Author
saucyocelot
ID
360194
Description
Second half of the semester, pt 7 lectures, Population Genetics/Genomics Lectures on 10/18, 10/20, 10/25