What are the major classes of DNA sequences that make up the human genome? See figure 9.1.
- The human genome comprises two components: the nuclear genome and the mitochondrial genome.
- Nuclear genome:
- 3.1 billion bp (3.1 Gb); ~93.5% is euchromatin (i.e. less condensed, where genes can potentially be transcribed)
- Remainder = constitutive heterochromatin
- Highly conserved sequences: protein-coding sequence (1.1%) plus about 4% other conserved sequence (RNA genes, regulatory sequences).
- About 45% of our genome is transposon-based repeats: sequences that are, or at one time in evolution were, derived from transposons, which can also amplify themselves.
- About 6.5% is constitutive heterochromatin: no identifiable genes, always in a compacted state.
- About 44% is other sequences whose function we don't know. One of the purposes of ENCODE was to begin identifying what all that sequence does. Focus will be here!
- Studies have also shown that, even though we don't know the function of much of our genome, >90% of it is transcribed.
DNA and human genome table
- Only a small fraction of the genome codes for protein, although a lot of genes are expressed.
- Poorly conserved sequences are transposon-based repeats, heterochromatin and other sequences.
- The table summarizes the key numbers. The genome is organized into 23 pairs of chromosomes.
- A big surprise from the Human Genome Project was the total number of protein-coding genes: ~21,000 is the most recent estimate. Another was the number of genes that only produce RNA. Before the project we knew about rRNA, tRNA and snRNA genes, but it turned out that there are large numbers of genes whose final product is not a protein; the estimate is ~6,000.
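To get a feel for the percentages above, they can be converted into approximate amounts of DNA. This is only a rough sketch: it uses the 3.1 Gb genome size and the rounded fractions from these notes (which add up to slightly more than 100% because of rounding), and the function and variable names are made up for illustration.

```python
GENOME_BP = 3.1e9  # nuclear genome size quoted in the notes

# Rounded fractions from the notes; they sum to ~100.6% because each is approximate.
FRACTIONS = {
    "protein-coding": 0.011,
    "other conserved (RNA genes, regulatory)": 0.04,
    "transposon-based repeats": 0.45,
    "constitutive heterochromatin": 0.065,
    "other sequences": 0.44,
}


def class_sizes_mb(genome_bp, fractions):
    """Return the approximate size of each sequence class in megabases."""
    return {name: genome_bp * frac / 1e6 for name, frac in fractions.items()}


sizes = class_sizes_mb(GENOME_BP, FRACTIONS)
for name, mb in sizes.items():
    print(f"{name:42s} ~{mb:6.0f} Mb")
```

The point of the arithmetic: protein-coding sequence is only a few tens of Mb, while transposon-based repeats account for well over 1,000 Mb.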
What is a karyogram? What are the rules used for ordering and orientation of chromosomes in a karyogram?
- A karyogram is an ordered display of the full chromosome set. For a human genome: 23 pairs of chromosomes, diploid.
- Pairs 1-22 are autosomes (the same for males and females) and carry most of the genetic info,
- plus the sex chromosomes X and Y (which differ between males and females).
- Organized by size, largest to smallest, and grouped by centromere placement.
- Divided into 7 groups of chromosomes:
- 1st group: pairs 1-3, metacentric.
- 2nd group: pairs 4 and 5, similar in size to 1-3 but with the centromere more off center, so the two arms are clearly distinguishable.
- 3rd group: 6-12 and X, mostly submetacentric.
- 4th group: 13-15, subtelocentric centromere placement.
- 5th to 7th groups: 16-18, 19-20, and 21-22 plus Y; these are the smallest.
- The shorter arm is the p arm and is always at the top in a karyogram; the longer arm is the q arm, at the bottom.
- A karyogram will be specific for each species. There are many different ways to stain chromosomes, giving different banding patterns, etc.
- Largest vs smallest chromosome size and expression
- Chromosome 1 is the largest, at 249 Mb,
- chromosome 21 is the smallest, at only about 48 Mb of DNA (about 51 Mb is the size of chromosome 22).
- Chromosomes also differ in their proportions of euchromatin and heterochromatin. For most chromosomes the bulk is euchromatin, where genes can potentially be expressed. Heterochromatic regions are not typically expressed. Over 50% of the Y chromosome is heterochromatic, and very few genes have been mapped to it. One of them is SRY (sex-determining region Y), the locus on the Y chromosome that determines male sex.
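The ordering and orientation rules can be written down as a small lookup. This is only an illustrative sketch: the group numbers 1-7 are simply positions in the list from these notes, and `group_of` and `orient` are made-up helper names.

```python
# The seven groups as listed in the notes; numbering is just list position.
GROUPS = [
    ["1", "2", "3"],
    ["4", "5"],
    ["6", "7", "8", "9", "10", "11", "12", "X"],
    ["13", "14", "15"],
    ["16", "17", "18"],
    ["19", "20"],
    ["21", "22", "Y"],
]


def group_of(chrom):
    """Return the 1-based group number for a chromosome label such as '7' or 'X'."""
    for number, members in enumerate(GROUPS, start=1):
        if chrom in members:
            return number
    raise ValueError(f"unknown chromosome: {chrom}")


def orient(arm_a, arm_b):
    """Return (p, q): the shorter arm first (drawn on top), the longer arm second."""
    return (min(arm_a, arm_b), max(arm_a, arm_b))
```

For example, `group_of("X")` places X with chromosomes 6-12, and `orient(120, 40)` puts the 40-unit arm on top as the p arm regardless of the order in which the two arms were measured.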
Discuss the roles that tandem duplications and transposons have played in the evolution of eukaryotic genomes, including the human genome.
- Gene duplications and chromosome rearrangements have occurred a lot over the course of evolution. Duplications range from relatively short noncoding sequences to large sequences that may include genes or large parts of genes.
- E.g. the amylase gene: 1-6 copies of the gene per genome, with the number of copies varying between individuals. Is this due to diet?
- 1.1% of our genome is protein coding.
- Very few genes in our genome are considered unique. Most arose via 1. gene duplication and 2. exon shuffling.
- Tandem duplications typically arise through unequal crossing over: repetitive DNA sequences allow misalignment of homologues.
- If the pairing is not exact, a crossover within the misaligned repeats gives one chromosome with two copies of gene A and three copies of the repeat, while the other chromosome has just a single copy of the repeat and no A. The two products end up in different gametes. If the first is passed on to offspring it provides an extra copy of A, but if the second is passed on the embryo will not survive due to the LACK of A (assuming A is essential).
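The outcome described above can be reproduced with a toy model. In this sketch each homologue is a list of labels, with "R" for a repeat unit and "A" for the gene; the function name and the choice to place the exchange point just after the paired repeat are assumptions made for illustration, not a model of the real molecular mechanism.

```python
def unequal_crossover(hom1, hom2, i, j):
    """Cross over between hom1[i] and hom2[j], which must be matching repeat units.

    The exchange point is placed just after the paired repeat on each homologue.
    When i == j the pairing is exact and both products equal the parents' layout.
    """
    if hom1[i] != hom2[j]:
        raise ValueError("crossover must occur between matching repeat units")
    product1 = hom1[: i + 1] + hom2[j + 1:]
    product2 = hom2[: j + 1] + hom1[i + 1:]
    return product1, product2


# Both homologues start as repeat - gene A - repeat.
hom1 = ["R", "A", "R"]
hom2 = ["R", "A", "R"]

# Misalignment: the second repeat of hom1 pairs with the first repeat of hom2.
long_product, short_product = unequal_crossover(hom1, hom2, 2, 0)
print(long_product)   # two copies of A, three repeats
print(short_product)  # a single repeat, no A
```

Calling `unequal_crossover(hom1, hom2, 0, 0)` instead models exact pairing, and both products keep one copy of A.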
What are transposons?
- DNA sequence that can change its location within the genome
- Transposable elements create REPEAT elements in our genome, creating conditions for UNEQUAL crossing over.
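- LINEs (long interspersed elements) and SINEs (short interspersed elements) = ~30% of genome
- LTR and DNA transposons
- All transposons together = 40-45% of genome
- 2 modes of transposition:
- Copy & paste: the element is copied and the copy integrates elsewhere while the original stays in place; can duplicate genes located between transposons on the same chromosome
- Cut & paste: occurs with DNA transposons; the element is excised and integrates into a NEW location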
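The difference between the two modes is easiest to see on a toy genome. This is a minimal sketch, assuming the genome can be treated as a simple list of labelled segments; the function names and the segment labels are hypothetical.

```python
def cut_and_paste(genome, src, dest):
    """Excise the element at index src and reinsert it at index dest.

    dest is an index into the genome after the element has been removed.
    The number of copies of the element does not change.
    """
    new_genome = list(genome)
    element = new_genome.pop(src)
    new_genome.insert(dest, element)
    return new_genome


def copy_and_paste(genome, src, dest):
    """Insert a copy of the element at index src at index dest.

    The original stays in place, so the copy number goes up by one.
    """
    new_genome = list(genome)
    new_genome.insert(dest, new_genome[src])
    return new_genome


genome = ["geneA", "TE", "geneB", "geneC"]
moved = cut_and_paste(genome, 1, 2)
copied = copy_and_paste(genome, 1, 3)
print(moved)
print(copied)
```

Only copy and paste increases the number of copies, which is why the copy-and-paste elements are the ones that pile up as repeats.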
What is meant by the term gene family? How do gene families arise? Discuss the similarities between gene families and superfamilies.
- Gene family:
- Most genes are members of gene families.
- Family members may be clustered or dispersed in the genome.
- The extent of sequence and structural identity/similarity can vary significantly between gene families.
- Gene families where conservation extends throughout the coding regions or multiple functional domains:
- e.g. tubulin and actin genes, histone genes
- e.g. motor protein genes (myosins, kinesins and dyneins), cell surface receptor genes
- Gene superfamily: groups of genes that share functional properties but often have little sequence similarity.
- Ig (immunoglobulin) superfamily: includes genes that encode antibodies AS WELL AS those having Ig-like domains.
- GPCR (G-protein coupled receptor) superfamily: also known as 7-helix transmembrane receptors. All activate intracellular signaling cascades via binding and activation of trimeric G-proteins. One subfamily = odorant receptors, ~900 members.
What are retrogenes and how are they distinguished from other genes?
- RETROGENES: genes that appear to have arisen via retrotransposition (a DNA copy of a gene made from its RNA by reverse transcription and inserted back into the genome).
- Highly similar to another (parent) gene in the genome and expressed, but lack the introns found in the parent gene.
- Retrogenes are functional: they can code for a protein.
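The three distinguishing criteria can be expressed as a simple check. This is only a sketch: the `GeneCopy` record, the function name, and especially the `min_identity` cutoff of 0.8 are arbitrary assumptions for illustration, not a published threshold or a real annotation pipeline.

```python
from dataclasses import dataclass


@dataclass
class GeneCopy:
    name: str
    intron_count: int
    expressed: bool


def looks_like_retrogene(candidate, parent, identity, min_identity=0.8):
    """Apply the criteria from the notes to a candidate gene copy.

    identity is the fraction of matching sequence between candidate and parent.
    min_identity is an arbitrary illustrative cutoff for "highly similar".
    """
    return (
        identity >= min_identity          # highly similar to the parent gene
        and parent.intron_count > 0       # the parent has introns...
        and candidate.intron_count == 0   # ...which the candidate lacks
        and candidate.expressed           # and the candidate is expressed
    )


parent = GeneCopy("parent_gene", intron_count=5, expressed=True)
retro = GeneCopy("intronless_copy", intron_count=0, expressed=True)
tandem = GeneCopy("tandem_duplicate", intron_count=5, expressed=True)

print(looks_like_retrogene(retro, parent, identity=0.95))
print(looks_like_retrogene(tandem, parent, identity=0.95))
```

A tandem duplicate keeps the parent's introns, so it fails the check even though it is just as similar to the parent.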
What are the four major classes of transposable elements found in the human genome? Which method of transposition do they utilize?
- Four major classes have been identified in the human genome.
- (ALL transposons together make up ~45% of our genome)
- 1. LINEs: the largest transposons.
- Autonomous: they can transpose under the appropriate conditions
- A full-length element is 6-8 kb
- Two ORFs (ORF1 and ORF2); ORF2 encodes the polymerase (reverse transcriptase). They replicate themselves in this manner and spread through the genome.
- Transcribed by RNA polymerase
- 2. SINEs: three classes: Alu, MIR and MIR3.
- Nonautonomous, but new copies can still arise throughout the genome.
- A reverse transcriptase supplied by another element (e.g. a LINE) can make a DNA copy from one of these SINE elements.
- Transcribed by RNA polymerase
- 3. LTR (retrovirus-like) elements: two classes, HERV and MaLR.
- Less common in the genome
- Whether an element is autonomous depends on pol: autonomous versions have an intact pol gene, nonautonomous versions do not.
- The gag, pol and env genes indicate that these elements evolved from retroviruses.
- Move via COPY and paste
- 4. DNA transposons are thought to be fossil elements.
- Most don't move around any more, but the original ones encode a transposase.
- Move via CUT and paste mechanism.
- Transposase makes cuts in the DNA molecule.
- The intermediate is DNA: the element integrates into a new location WITHOUT an RNA intermediate (unlike the three classes above)!
- Transposons create repetitive elements in our genome, which create conditions for unequal crossing over to occur. The result is tandem duplications plus dispersed repeat sequences in the genome.
- piRNA: associate with a particular class of proteins called Piwi proteins; primary function is to repress transposition in the germline. Only expressed in germline cells.
Briefly describe the copy and paste method of transposition.
- Two modes of transposition: copy and paste (appears to be the most common in the human genome) and cut and paste.
- Copy and paste: the element is transcribed into RNA, the RNA is reverse-transcribed into DNA, and the DNA copy integrates at a new location while the original stays in place. It can also duplicate genes located between transposons on the same chromosome.
- Most extant transposons are defective and unable to transpose, but there are recent cases of LINE transposition (a child with hemophilia, 2 separate cases of muscular dystrophy, and various reports of transpositions in specific cancers) and of SINE (Alu element) insertions (BRCA2 in familial breast cancer, factor IX in hemophilia B, ChE in acholinesterasemia, and NF1 in neurofibromatosis).
What is meant by the term functional RNA?
- A functional RNA is an RNA that is itself the final gene product; it may have enzymatic/effector roles.
- Non-coding RNA (ncRNA) is a functional RNA molecule that is NOT translated into a protein.
One of the surprising findings coming from the human genome project and subsequent studies is the large number of genes that encode functional RNAs. What are some examples of functional RNA genes (gene classes) and what are the functions of each class?
- Examples of functional RNA are:
- 1. Ribosomal RNA (rRNA): e.g. 16S and 18S rRNA function as components of the mitochondrial and cytoplasmic ribosomes, respectively.
- 2. Small nuclear RNA (snRNA): involved in RNA processing, mainly splicing (some also regulate transcription). U1, U2, U4 and U5 process GU-AG introns.
- 3. Long regulatory RNAs: can act as trans-acting regulators, e.g. HOTAIR acts in trans (it regulates genes at a locus other than the one it is transcribed from).
What are the classes of RNA genes that are involved in gene regulation?
- 1. MicroRNA (miRNA)
- 2. Piwi-binding RNA (piRNA)
- 3. Endogenous short interfering RNA
- 4. Long noncoding regulatory RNA
At what level of gene regulation do the classes of RNA genes function? Note that the last will vary between different classes of regulatory RNA genes.
- 1. MicroRNA (miRNA), ~22 nt: ~1000 different types
- Level of gene regulation: multiple important roles in gene regulation, notably in development, and implicated in some cancers. RNA-based gene silencing (post-transcriptional).
- 2. Piwi-binding RNA (piRNA)
- Level of gene regulation: derived from repeats; expressed only in germ-line cells, where they LIMIT excess transposon activity.
- 3. Endogenous short interfering RNA (endo-siRNA)
- Level of gene regulation: involved in gene regulation in somatic cells and may be involved in regulating some types of transposon. Often derived from pseudogenes, inverted repeats, etc.
- 4. Long noncoding regulatory RNA
- Level of gene regulation: involved in regulating gene expression; some are involved in monoallelic expression and/or act as antisense regulators.
During lecture I mentioned several surprising findings that came out of the human genome project. What were they and why were these findings surprising? You should be able to list at least three surprises from the human genome project.
- 1. Big surprise out of the human genome project: only a small portion of our genome is made up of protein-coding genes.
- a. Total number of protein-coding genes: ~21,000 is the most recent estimate.
- b. These protein-coding genes are mostly members of gene families and superfamilies.
- c. Total number of genes that only produce RNA: before the human genome project we knew about rRNA, tRNA and snRNA genes, but it turned out that there are large numbers of genes whose final product is not a protein; the estimate is ~6,000.
- 2. On the other hand, transposons make up a significant proportion of our genome (45%). Transposons aid gene duplication and chromosome rearrangement, which have occurred a lot over the course of evolution and continue to occur. This may be one reason why gene therapies may not work, given the unpredictable amount of tandem duplication and transposition.
- a. Gene duplications range from relatively short noncoding sequences to larger ones. E.g. the amylase gene: the number of copies varies between individuals, from a single copy up to six copies per genome. Is this due to diet?
- 3. It is also surprising how many genes encode functional RNAs. This supports the theory that DNA evolved from RNA.
- a. The functions of these RNAs include:
- i. Protein synthesis and support: mRNA, rRNA, tRNAs
- ii. RNA maturation: splicing (snRNA), base modification
- iii. DNA synthesis: TERC (RNA associated with telomerase)
- iv. Gene regulation: miRNA, piRNA, endo-siRNA, long noncoding regulatory RNA (explained in the previous question)
- v. Transposon control: piRNA (suppresses further transposition, prevents mutations)