ANTHR 2750 Lecture 2

  1. Steps of sequencing a genome
    • 1. Obtain genomic DNA sample
    • 2. Sequence genomic DNA
    • 3. Assemble sequences in order
    • 4. Annotate sequence
  2. What sequencing method did public effort use in HGP?
    • Hierarchical sequencing:
    • Genomic DNA is cut into pieces of about 150 kb and inserted into BAC (bacterial artificial chromosome) vectors, which are transformed into E. coli, where they are replicated and stored. The BAC inserts are isolated and mapped to determine the order of each cloned 150 kb fragment before the clones are sequenced.
  3. What sequencing method did Celera (Craig Venter) / private effort use?
    • Shotgun Sequencing:
    • Randomly shears genomic DNA into small pieces, which are cloned into plasmids and sequenced on both strands. This eliminates the BAC step from the HGP’s approach, making it faster.
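    A minimal sketch of the random shearing step, assuming a toy genome string and fixed-length reads. The function name shear, the seed, and the example sequence are all illustrative and not from the lecture. Real fragments vary in length and are cloned before sequencing.

```python
import random

def shear(genome, n_reads, read_len, seed=0):
    """Sample n_reads random fixed-length fragments from genome (toy model of shearing)."""
    rng = random.Random(seed)          # seeded so the example is repeatable
    last_start = len(genome) - read_len
    starts = [rng.randint(0, last_start) for _ in range(n_reads)]
    return [genome[s:s + read_len] for s in starts]

genome = "ATGCGTACCGGATTAGCCA"         # hypothetical 19-base "genome"
reads = shear(genome, n_reads=8, read_len=6)
print(reads)
```

    Because the start positions are random, the reads overlap one another by chance, which is what the assembly step relies on.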
  4. How do we put the sequences in the right order?
    • Genome assembly.
    • Based on finding regions of overlap between individual sequencing fragments.
    • This is a difficult problem for complex genomes.
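    The overlap idea above can be sketched as a greedy merge: repeatedly join the two fragments with the longest suffix-to-prefix overlap. This is a toy illustration, not the assembler used by either project. The function names, the min_len threshold, and the example reads are assumptions, and the approach breaks down on repeats, which is one reason complex genomes are hard.

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that matches a prefix of b (0 if below min_len)."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(reads, min_len=3):
    """Merge the best-overlapping pair until no overlaps of at least min_len remain."""
    reads = list(reads)
    while len(reads) > 1:
        best = (0, None, None)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        if k == 0:
            break                      # nothing left to merge; return separate contigs
        merged = reads[i] + reads[j][k:]
        reads = [r for idx, r in enumerate(reads) if idx not in (i, j)] + [merged]
    return reads

contigs = greedy_assemble(["CGTACC", "ATGCGT", "ACCGGA"])
print(contigs)
```

    With these three hypothetical reads, the fragments chain together through the shared "CGT" and "ACC" overlaps into a single contig.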
  5. What is annotation?
    Identifying which parts of the DNA sequence correspond to genes.
  6. Craig Venter
    Celera (private effort)
  7. How is annotation done?
    • Compare to known genes that have already been described and sequenced. Expressed sequence tags (ESTs), short sequences derived from randomly sampled mRNA, are used to infer genes.
    • Predict genes computationally: use computer predictions to determine the shape of the protein and infer its function.
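    One simple form of computational gene prediction is scanning for open reading frames (ORFs): a start codon followed in frame by a stop codon. The sketch below is a toy version under several assumptions: it only reads the forward strand, ignores introns, and uses a hypothetical min_codons cutoff. Real gene finders are far more elaborate.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=3):
    """Return (start, end) index pairs of ATG...stop stretches on the forward strand."""
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                          # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))    # end index includes the stop codon
                start = None
    return sorted(orfs)

seq = "CCATGAAACCCGGGTAGTT"                        # hypothetical sequence
for start, end in find_orfs(seq):
    print(start, end, seq[start:end])
```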
  8. What are the two types of DNA in the genome?
    • Euchromatic: 93% of DNA, contains most of the genes in your genome, 99% has been sequenced.
    • Heterochromatic: 7% of your DNA, highly repetitive. Some parts are structural, contains centromeres and telomeres. Genes are quite sparse. Very difficult to sequence due to repetition and thus largely unexplored.
  9. Explain euchromatic DNA
    • 93% of DNA,
    • contains most of the genes in your genome
    • 99% has been sequenced.
    • 30,000 genes. (many fewer than expected)
    • 50% have unknown function
    • Less than 2% of the genome codes for genes
    • 98% is “junk” DNA. Many repeated sequences and transposable elements with unknown functions.
    • 98% does not code for genes. The function of non-coding elements is unknown, but potentially very important.
    • 50% of repeated sequences are transposable elements.
  10. What is a transposable element (transposon)?
    Sequences of DNA that can move around to different positions within the genome of a single cell, a process called transposition.
  11. What is alternative splicing? Why is it important?
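    When introns are removed from a transcript, events such as exon skipping and intron retention can occur. This is important because a single gene can then produce several different mRNAs, and so several different proteins.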
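    A minimal sketch of exon skipping, assuming a gene is just an ordered list of exon strings and that the first and last exons are always kept. The exon sequences and the function name isoforms are made up for illustration. In a real cell only some of these combinations are produced.

```python
from itertools import combinations

def isoforms(exons):
    """All transcripts that keep the first and last exon plus any subset of middle exons, in order."""
    first, *middle, last = exons
    out = []
    for n in range(len(middle), -1, -1):           # from no skipping to skipping every middle exon
        for kept in combinations(middle, n):
            out.append("".join([first, *kept, last]))
    return out

exons = ["ATGGCC", "GGTACC", "TTTAAA", "CCGTGA"]   # hypothetical exons
for transcript in isoforms(exons):
    print(transcript)
```

    Four exons with two skippable middle exons give four possible transcripts from the same gene.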
  12. What are duplicate genes?
    • Any duplication of a region of DNA that contains a gene.
    • It may occur as an error during homologous recombination, a retrotransposition event, or duplication of an entire chromosome.
  13. What percentage of the genome is transposable elements?
    They occupy 50% of our genome.
  14. History of Transposon discovery
    Barbara McClintock suspected the existence of transposons in the 1940s. The existence of transposable elements was proven experimentally in the 1970s. She received the Nobel Prize in 1983. She studied at Cornell University.
  15. Four major classes of transposable elements:
    • LINEs (long interspersed nuclear elements) retroposons
    • SINEs (short interspersed nuclear elements) retroposons
    • DNA transposons
    • Retroviruses and retrovirus-like LTR (long terminal repeat) retrotransposons.
  16. Explain LINEs
    • LINE retroposons and retrovirus-like retrotransposons code for reverse transcriptase. They transpose through an RNA intermediate.
    • LINE1 is the most common.
  17. Explain SINEs
    • Acquire reverse transcriptase from other elements.
    • Alu elements contribute to the duplication of genomic regions.
  18. Explain DNA transposons
    Code for transposase and insert double stranded DNA into host genome.
  19. TE distribution as function of GC content:
    • Alu elements tend to insert into GC-rich regions. (GC content tends to be higher in coding areas.)
    • LINE1 elements tend to insert into GC-poor regions.
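    GC content is simply the fraction of bases that are G or C. A short sketch of how it could be measured along a sequence follows. The window and step sizes and the example sequence are arbitrary choices for illustration.

```python
def gc_content(seq):
    """Fraction of bases in seq that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)

def windowed_gc(seq, window, step):
    """GC content of each full window of the given size, moving by step bases."""
    return [gc_content(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]

seq = "GGCCATAT"                                   # hypothetical sequence
print(gc_content(seq))
print(windowed_gc(seq, window=4, step=4))
```

    Comparing where each element family sits against a profile like this is how an insertion preference for GC-rich or GC-poor regions would show up.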
  20. Age Distribution of TE:
    • Activity (insertion) of Alu peaked at 7% divergence from the consensus sequence, which by the molecular clock corresponds to about 40 Myr ago, and then dropped. This peak marks when Alu expanded. Alu elements are also found in chimpanzees.
    • Activity of DNA transposons and LTR retrotransposons started diminishing earlier.
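    The divergence-to-age conversion can be sketched as a linear molecular clock. The rate below is an assumption derived only from the single pair of numbers in these notes (7% divergence for 40 Myr). Real analyses use independently calibrated rates and correct for multiple substitutions at the same site.

```python
# Calibration taken from the notes: 7% divergence corresponds to about 40 Myr.
CALIBRATION_DIVERGENCE = 0.07
CALIBRATION_AGE_MYR = 40.0
RATE_PER_MYR = CALIBRATION_DIVERGENCE / CALIBRATION_AGE_MYR   # assumed constant rate

def age_from_divergence(divergence, rate_per_myr=RATE_PER_MYR):
    """Estimated age in Myr for a given fractional divergence from the consensus."""
    return divergence / rate_per_myr

print(age_from_divergence(0.07))    # the calibration point itself
print(age_from_divergence(0.035))   # half the divergence, half the age
```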
  21. Impact of TE on Human Genome:
    • Some insertions are deleterious or cause inversions, and can therefore cause disease.
    • Can cause genome instability. (cancer)
    • Although it might be rare, TEs can also contribute to the coding and regulatory regions of functional genes.
  22. How do new gene functions originate? How does gene number increase among organisms?
    Gene duplication and mutation.
  23. What are gene duplication mechanisms?
    • Single gene duplication (including retro-transposition)
    • Segmental gene duplication (including whole chromosome duplication)
    • Whole genome duplication
  24. What is homology?
    Two genes are homologous if they have a common ancestor.
  25. What is orthology?
    Two genes are orthologous if they diverged following a speciation event.
  26. What is paralogy?
    Two genes are paralogous if they diverged following a duplication event.
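    These three definitions can be made concrete with a toy gene tree: find the most recent common ancestor of two genes and check whether that node was a speciation or a duplication. The tree, the gene names (alpha and beta copies in human and mouse), and the function names are hypothetical and for illustration only.

```python
# Toy gene tree: node -> (parent, event that happened at this node).
# An ancestral gene duplicated into alpha and beta, then the species split.
TREE = {
    "root": (None, "duplication"),
    "alpha": ("root", "speciation"),
    "beta": ("root", "speciation"),
    "human_alpha": ("alpha", None),
    "mouse_alpha": ("alpha", None),
    "human_beta": ("beta", None),
    "mouse_beta": ("beta", None),
}

def ancestors(node):
    """Path from node up to the root, starting with node itself."""
    path = []
    while node is not None:
        path.append(node)
        node = TREE[node][0]
    return path

def classify(a, b):
    """Orthologs if the genes' last common ancestor is a speciation, paralogs if a duplication."""
    anc_b = set(ancestors(b))
    lca = next(n for n in ancestors(a) if n in anc_b)
    return "orthologs" if TREE[lca][1] == "speciation" else "paralogs"

print(classify("human_alpha", "mouse_alpha"))
print(classify("human_alpha", "human_beta"))
```

    All four genes in this tree are homologous, since they share the root as a common ancestor.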
  27. How many olfactory receptor genes are there?
    500 genes in humans and 1400 genes in mice.
  28. Amylase:
    • Populations that consume high-starch diets have more copies of the amylase gene in their genomes and so produce more amylase.
    • High copy number variation gives us hints about evolutionary history.
  29. What are the consequences of gene duplication?
    • Genetic duplication leads to genetic novelties
    • Nonfunctionalization (pseudogenes)
    • Subfunctionalization
    • Neofunctionalization (evolution of new function, beta version of the function)
  30. Aerobic fermentation capacity in yeast evolved as a result of whole genome duplication 100 million years ago.
  31. Explain the 2R hypothesis in the human genome:
    A hypothesis, formalized in 1970, that the complexity of vertebrate genomes originated through two rounds (2R) of whole genome duplication at the base of the vertebrate lineage. Since then, the hypothesis has remained both popular and controversial.
  32. History of the Human Genome Project
    • 1. The Human Genome Project was proposed by Charles DeLisi in 1985 with the aim of sequencing the DNA of the human genome and determining all sequences coding for genes.
    • 2. The project officially began in 1990. This was also when the Ethical, Legal and Social Implications Program was created. 
    • 3. Craig Venter, an NIH scientist, believed that he could cut costs and time by using "shotgun sequencing" rather than "hierarchical sequencing" and founded Celera.
    • 4. Both projects published a working draft in 2000. 
    • 5. In 2003, the initial project was completed.
  33. What is the future of pharmacogenomics and personalized medicine?
    • 1. Targeted medication with minimal side effects. 
    • 2. More effective vaccines made of DNA and RNA, which reduce the risks of vaccination.
    • 3. More accurate dosages based on specific metabolisms. 
    • 4. Early diagnosis.
    • 5. The rescue of drugs that may work for some people and not for others.
  34. What are the potential adverse effects of the Human Genome Project?
    • 1. Denial of insurance. 
    • 2. Denial of employment. 
    • 3. Increased inequality in access to these technologies, leading to a genetic aristocracy.
    • 4. Genetic determinism about disease and other traits, leading to less effort to improve environmental conditions.
    • 5. Privacy issues with access to DNA data.
Author
pelinpoyraz
ID
351124
Description
transposable elements and HGP