EvolutionaryGenetics_03

  1. How can you get estimates on contemporary and ancient events?
    • By estimating population parameters for different genetic markers
    • Some processes recur at equilibrium, while other events occur only once
    • Recall that DNA sequence variation captures the long-term evolutionary history of the population
  2. If FST ≤ 0.05, is differentiation negligible?
    No. As Wright points out, differentiation is by no means negligible even if FST is ≤ 0.05
  3. What are the test statistics for geographic structure?
    • Χ2
    • HST
    • KS
    • Z
  4. Sequence-based stats are best used when your data has:
    _____ mutation rate and _____ n
    High mutation rate and small n
  5. Haplotype-based stats are best used when your data has:
    _____ mutation rate and _____ n
    Low mutation rate and large n
  6. How can you interpret FST when:
    0.00 to 0.05
    0.05 to 0.15
    0.15 to 0.25
    > 0.25
    • 0.00 to 0.05 = little genetic differentiation
    • 0.05 to 0.15 = moderate genetic differentiation
    • 0.15 to 0.25 = great genetic differentiation
    • > 0.25 = very great genetic differentiation
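    The ranges above can be sketched as a small lookup. This is only an illustration: the function name is hypothetical, and because the listed ranges share their end points, the sketch assumes each boundary value falls into the lower category.

```python
def interpret_fst(fst: float) -> str:
    """Map an FST value onto the qualitative categories listed above."""
    # Assumption: a boundary value (0.05, 0.15, 0.25) goes to the lower category.
    if fst <= 0.05:
        return "little genetic differentiation"
    if fst <= 0.15:
        return "moderate genetic differentiation"
    if fst <= 0.25:
        return "great genetic differentiation"
    return "very great genetic differentiation"


for value in (0.02, 0.10, 0.20, 0.40):
    print(value, "=", interpret_fst(value))
```

    Keep card 2 in mind when reading the first category: "little" does not mean "negligible".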
  7. Heterozygosity
    Equivalent to haplotype diversity for haploid data
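    A minimal sketch of haplotype diversity, assuming the usual unbiased form H = n/(n - 1) × (1 - Σ p_i²), where p_i is the sample frequency of haplotype i and n is the sample size. The function name and the toy samples are illustrative only.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Unbiased haplotype (gene) diversity for a sample of haplotype labels."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two sampled haplotypes are needed")
    counts = Counter(haplotypes)
    # Sum of squared sample frequencies, i.e. the expected homozygosity.
    sum_sq = sum((count / n) ** 2 for count in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


print(haplotype_diversity(["A", "A", "B", "B"]))  # two equally common haplotypes
print(haplotype_diversity(["A", "A", "A", "A"]))  # no variation
```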
  8. Which two test statistics are analogous in interpretation to the haplotype-based statistic GST?
    • FST: frequency-based test statistic for haplotype diversity
    • KST: tests for diversity in populations using sequence-based statistics
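    For reference, GST itself can be sketched as (HT - HS) / HT, where HS is the mean within-population diversity and HT is the diversity computed from the pooled (mean) haplotype frequencies. The sketch below assumes equal weighting of populations and no small-sample correction; all names are illustrative.

```python
from collections import Counter


def frequencies(haplotypes):
    """Sample frequency of each haplotype label in one population."""
    n = len(haplotypes)
    return {hap: count / n for hap, count in Counter(haplotypes).items()}


def gene_diversity(freqs):
    """1 minus the sum of squared frequencies (no sample-size correction)."""
    return 1.0 - sum(p * p for p in freqs.values())


def gst(populations):
    """GST = (HT - HS) / HT with every population weighted equally (assumption)."""
    per_pop = [frequencies(pop) for pop in populations]
    h_s = sum(gene_diversity(f) for f in per_pop) / len(per_pop)
    all_haplotypes = set().union(*per_pop)
    mean_freqs = {
        hap: sum(f.get(hap, 0.0) for f in per_pop) / len(per_pop)
        for hap in all_haplotypes
    }
    h_t = gene_diversity(mean_freqs)
    if h_t == 0.0:
        return 0.0  # no variation at all, so nothing to partition
    return (h_t - h_s) / h_t


print(gst([["A"] * 4, ["B"] * 4]))      # populations fixed for different haplotypes
print(gst([["A", "B"], ["A", "B"]]))    # identical populations
```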
  9. How can you increase the power of Hudson's test to detect subdivision?
    Hudson's test (KST) can have increased power in the presence of recombination
  10. How can the presence of recombination increase the power of KST to detect subdivision?
    • increased haplotype diversity in the total population
    • recombination giving rise to unique recombination blocks that distinguish the different subpopulations (clades)
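    A minimal sketch of KST with a permutation test, assuming KST = 1 - KS/KT, where KT is the mean number of pairwise differences in the pooled sample and KS is the weighted mean of the within-population values. The weights n_i/n, the function names, and the toy sequences are assumptions made for illustration; every population needs at least two sequences.

```python
import random
from itertools import combinations


def mean_pairwise_differences(seqs):
    """Average number of differing sites over all pairs of aligned sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return total / len(pairs)


def kst(populations):
    """KST = 1 - KS/KT, with KS weighted by sample size (assumed weights n_i/n)."""
    pooled = [seq for pop in populations for seq in pop]
    k_t = mean_pairwise_differences(pooled)
    k_s = sum(
        len(pop) / len(pooled) * mean_pairwise_differences(pop)
        for pop in populations
    )
    return 1.0 - k_s / k_t


def permutation_p_value(populations, n_perm=1000, seed=0):
    """Share of random reassignments whose KST is at least the observed value."""
    rng = random.Random(seed)
    observed = kst(populations)
    pooled = [seq for pop in populations for seq in pop]
    sizes = [len(pop) for pop in populations]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        shuffled, start = [], 0
        for size in sizes:
            shuffled.append(pooled[start:start + size])
            start += size
        if kst(shuffled) >= observed:
            hits += 1
    return hits / n_perm


pops = [["AAAA", "AAAA", "AAAT"], ["TTTT", "TTTT", "TTTA"]]
print(kst(pops), permutation_p_value(pops))
```

    Shuffling sequences among localities keeps the sample sizes fixed, so the p-value reflects only how sequences are distributed across populations.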
  11. Give the strengths and weaknesses for the following test statistic for geographic structure:
    Χ2
    • Strengths: Useful in almost all cases, especially when n > 50. Can have unequal sample sizes
    • Weaknesses: Poor at high mutation rates and small sample sizes
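    As a sketch of how the statistic is built, the snippet below computes Χ2 from a haplotype-by-locality contingency table as the sum of (observed - expected)² / expected. Only the statistic is computed; assessing significance (for example by permuting individuals among localities) is left out. Names and data are illustrative.

```python
from collections import Counter


def chi_square_statistic(populations):
    """Chi-square statistic for a haplotype-by-locality contingency table."""
    total = sum(len(pop) for pop in populations)
    haplotype_totals = Counter(hap for pop in populations for hap in pop)
    statistic = 0.0
    for pop in populations:
        observed = Counter(pop)
        for hap, hap_total in haplotype_totals.items():
            # Expected count if haplotypes were distributed independently of locality.
            expected = len(pop) * hap_total / total
            statistic += (observed[hap] - expected) ** 2 / expected
    return statistic


print(chi_square_statistic([["A"] * 10, ["B"] * 10]))    # complete differentiation
print(chi_square_statistic([["A", "B"], ["A", "B"]]))    # identical localities
```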
  12. Give the strengths and weaknesses for the following test statistic for geographic structure:
    HST
    • Strengths: Useful with high haplotype diversity. Can have unequal sample sizes
    • Weaknesses: Poor with small sample size and low diversity
  13. Give the strengths and weaknesses for the following test statistic for geographic structure:
    KS
    • Strengths: Powerful with high mutation rates, small sample size, and recombination
    • Weaknesses: must have equal sample sizes
  14. Give the strengths and weaknesses for the following test statistic for geographic structure:
    Z
    • Strengths: Can have unequal sample sizes. Powerful with high mutation rates and small sample size
    • Weaknesses: n/a
  15. What are examples of different clustering methods?
    • Agglomerated
    • Gap Statistic
    • K-means
    • Calinski-Harabasz
  16. Agglomerated (agglomerative clustering)
    • (UPGMA – Unweighted Pair Group Method with Arithmetic mean)
    • Hierarchical clustering method with a ‘bottom-up’ approach; each observation starts in its own cluster, and pairs of clusters are merged as one moves up the hierarchy.
    • Uses a dissimilarity matrix based on the Euclidean distances.
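    The bottom-up procedure can be sketched in a few lines. This is a deliberately naive UPGMA illustration (it recomputes average distances at every step rather than updating a matrix), with Euclidean distances between points; the function name and data are hypothetical.

```python
import math
from itertools import combinations


def upgma(points):
    """Return the merge history as (cluster_a, cluster_b, average_distance)."""
    clusters = [(i,) for i in range(len(points))]  # every observation starts alone
    merges = []

    def average_distance(a, b):
        # UPGMA: unweighted mean of all pairwise distances between members.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: average_distance(pair[0], pair[1]),
        )
        merges.append((a, b, average_distance(a, b)))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)  # the merged cluster moves one level up
    return merges


for step in upgma([(0.0,), (1.0,), (10.0,), (12.0,)]):
    print(step)
```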
  17. Gap-Statistic
    Compares within-cluster dispersion to the expected value under a null distribution
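    In its usual formulation, the comparison can be written as follows, where d_ij is the distance between observations i and j, n_r is the size of cluster C_r, and E*_n denotes the expectation under the null reference distribution (in practice estimated from simulated reference data sets):

```latex
W_k = \sum_{r=1}^{k} \frac{1}{2 n_r} \sum_{i, j \in C_r} d_{ij},
\qquad
\mathrm{Gap}_n(k) = E_n^{*}\left[\log W_k\right] - \log W_k
```

    The number of clusters is then typically taken as the smallest k for which Gap(k) ≥ Gap(k+1) - s_{k+1}, with s_{k+1} a standard-error term from the simulations.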
  18. K-means
    Aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean
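    A minimal sketch of the standard iterative (Lloyd-style) procedure for k-means. As a simplifying assumption, the first k points are used as starting centroids; real analyses usually use random or smarter initialisation and several restarts. Names and data are illustrative.

```python
import math


def k_means(points, k, max_iter=100):
    """Assign each point to the nearest mean and update means until stable."""
    centroids = [tuple(p) for p in points[:k]]  # assumption: first k points as seeds
    assignments = None
    for _ in range(max_iter):
        new_assignments = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break  # no point changed cluster, so the partition has converged
        assignments = new_assignments
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return assignments, centroids


data = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
labels, centres = k_means(data, k=2)
print(labels)
print(centres)
```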
  19. Calinski-Harabasz
    Variance ratio criterion
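    The variance ratio can be sketched as CH = [B / (k - 1)] / [W / (n - k)], where B is the between-cluster and W the within-cluster sum of squares; higher values indicate better-separated clusters. The function name and toy data below are illustrative, and the sketch assumes 1 < k < n and W > 0.

```python
def calinski_harabasz(points, labels):
    """Variance ratio criterion for a given partition of the points."""
    n = len(points)
    clusters = sorted(set(labels))
    k = len(clusters)

    def mean(rows):
        return tuple(sum(col) / len(rows) for col in zip(*rows))

    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    overall = mean(points)
    between = 0.0
    within = 0.0
    for c in clusters:
        members = [p for p, label in zip(points, labels) if label == c]
        centre = mean(members)
        between += len(members) * squared_distance(centre, overall)
        within += sum(squared_distance(p, centre) for p in members)
    return (between / (k - 1)) / (within / (n - k))


pts = [(0.0,), (2.0,), (10.0,), (12.0,)]
print(calinski_harabasz(pts, [0, 0, 1, 1]))  # sensible partition
print(calinski_harabasz(pts, [0, 1, 0, 1]))  # poor partition
```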
  20. What is population stratification?
    Grouping into classes, categories, or clusters; in genetics, subgroups within a population that differ systematically in allele frequencies
  21. What problem does population stratification cause, and how can you correct for it?
    • Stratification may induce false positives
    • This must be corrected for using PCA, as in the EIGENSTRAT method, to identify cryptic structure in the data
  22. How can you use principal component analysis to correct for stratification in population data that may have cryptic structures?
    By regressing out the top PCs (e.g., ev1) obtained with the EIGENSTRAT method from the genotype data, we can identify cryptic structure in the data (e.g., ev3).

    • The CHD and JPT populations are so close to each other that they can be distinguished only on eigenvector 3
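    The "regressing out" step can be sketched for a single SNP and a single ancestry axis. This assumes the axis (for example a top eigenvector) has already been computed elsewhere, and removes from the centred genotypes the part predicted by that axis, g_adj = g - γa with γ = Σ a_i g_i / Σ a_i². Function names and data are hypothetical, and the sketch is not the EIGENSTRAT software itself.

```python
def centre(values):
    """Subtract the mean so the values are expressed as deviations."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def regress_out(values, axis):
    """Remove the component of `values` that is linearly predicted by `axis`."""
    gamma = sum(a * v for a, v in zip(axis, values)) / sum(a * a for a in axis)
    return [v - gamma * a for a, v in zip(axis, values)]


# Hypothetical ancestry axis separating two hidden groups of individuals.
ancestry_axis = [-1.0, -1.0, 1.0, 1.0]

snp_tracking_ancestry = centre([0, 0, 2, 2])   # differs only between the groups
snp_independent = centre([0, 2, 0, 2])         # varies within each group

print(regress_out(snp_tracking_ancestry, ancestry_axis))
print(regress_out(snp_independent, ancestry_axis))
```

    The first SNP is reduced to zeros because all of its variation follows the ancestry axis, while the second is left unchanged, which is why the adjustment reduces false positives driven by structure.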
  23. What is STRUCTURE?
    A program that offers an alternative approach to determining the optimal number of genetic clusters
  24. Why use STRUCTURE?
    • One of the most widely used population analysis tools that assesses patterns of genetic structure in a set of samples
    • Identifies populations from the data and assigns each individual to the population that best fits the variation patterns found.
  25. Compatibility methods (Le Quesne):
    • Evaluate compatibility of genetic data with different evolutionary scenarios.
    • Non-genealogical (frequency-based)
  26. Population parameter estimators (θ):
    • Estimate θ, the population-scaled mutation rate (θ = 4Neμ for diploid loci), which combines the mutation rate and the effective population size
    • Non-genealogical (frequency-based)
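    Two common estimators of θ can be sketched from an alignment: Watterson's estimator, S / a1 with a1 = Σ 1/i for i = 1..n-1, and nucleotide diversity π, the mean number of pairwise differences. Both are computed per locus here (not per site); function names and the toy alignment are illustrative.

```python
from itertools import combinations


def segregating_sites(seqs):
    """Number of alignment columns with more than one state (S)."""
    return sum(len(set(column)) > 1 for column in zip(*seqs))


def nucleotide_diversity(seqs):
    """Mean number of pairwise differences between sequences (pi)."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(x != y for x, y in zip(a, b)) for a, b in pairs)
    return total / len(pairs)


def watterson_theta(seqs):
    """Watterson's estimator: S divided by the harmonic number a1."""
    n = len(seqs)
    a1 = sum(1.0 / i for i in range(1, n))
    return segregating_sites(seqs) / a1


alignment = ["AAAA", "AAAT", "AATT", "ATTT"]
print(segregating_sites(alignment))
print(watterson_theta(alignment))
print(nucleotide_diversity(alignment))
```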
  27. Neutrality tests (Tajima, Fu):
    • Used to detect deviations from neutrality, indicating selection or demographic changes
    • Non-genealogical (frequency-based)
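    As an illustration of how such a test is built, Tajima's D contrasts two estimates of θ: the mean number of pairwise differences (π) and S / a1. The sketch below takes the sample size n, the number of segregating sites S, and π as inputs; the function name is illustrative.

```python
import math


def tajimas_d(n, s, pi):
    """Tajima's D from sample size n, segregating sites s, and pairwise diversity pi."""
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Difference between the two theta estimates, scaled by its standard deviation.
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


print(tajimas_d(n=10, s=16, pi=3.888889))  # roughly -1.45
```

    Negative values indicate an excess of rare variants (for example after population expansion or a selective sweep), while positive values indicate an excess of intermediate-frequency variants.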
  28. Genetic Diversity measures (Nei):
    • Assess genetic variation within a population
    • Non-genealogical (frequency-based)
  29. Classical Wright’s FST statistics:
    • Measures genetic differentiation among populations
    • Non-genealogical (frequency-based)
  30. Population subdivision (Hudson):
    • Examines how populations are divided and structured
    • Non-genealogical (frequency-based)
Author: saucyocelot
ID: 365955
Card Set: EvolutionaryGenetics_03
Description: more for midterm