Friday, July 27, 2007

papers to read

  • Recent papers (June, July) about CNEs
1. Adaptive evolution of conserved non-coding elements in mammals. Su Yeon Kim, Jonathan K Pritchard. PLoS Genetics

The authors developed a statistical method called the 'shared rates test' (SRT) to identify conserved non-coding elements (CNCs in the paper) that show significant variation in substitution rates across branches of a phylogenetic tree, and applied it to 98,910 CNEs from a Hs:Ch:Dog:Mm:Rat alignment. 68% of them evolve under consistent constraint, while the remaining 32% depart from it, including some fast-evolving ones. The authors present this as evidence of adaptive evolution in these CNEs.
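To make the idea of "variation in substitution rates across branches" concrete for myself, here is a minimal sketch. It is not the SRT from the paper: it is a generic Poisson likelihood-ratio statistic, and the function name, the inputs, and the toy numbers are all my own assumptions for illustration.

```python
import math

def rate_variation_lrt(observed, expected):
    """Likelihood-ratio statistic: one shared rate vs. one rate per branch.

    observed[b]: substitutions counted in the element on branch b
    expected[b]: substitutions expected on branch b under neutrality
    Assumes Poisson counts; an illustrative stand-in, NOT the paper's SRT.
    """
    if len(observed) != len(expected) or any(e <= 0 for e in expected):
        raise ValueError("need one positive expected count per branch")
    # Maximum-likelihood estimate of the single rate shared by all branches.
    shared_rate = sum(observed) / sum(expected)
    stat = 0.0
    for k, e in zip(observed, expected):
        if k > 0:  # a branch with zero substitutions adds nothing to the sum
            stat += k * math.log((k / e) / shared_rate)
    return shared_rate, 2.0 * stat

# Toy numbers (made up): five branches, the last one with many substitutions.
rate, stat = rate_variation_lrt([2, 1, 3, 2, 12], [10.0, 8.0, 12.0, 9.0, 11.0])
print(rate, stat)
```

Under this simplified model, a large statistic would be compared against a chi-square distribution with (number of branches - 1) degrees of freedom; the paper's actual test may differ in both model and calibration.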

2. Comprehensive characterization of the cis-regulatory code responsible for the spatio-temporal expression of olSix3.2 in the developing medaka forebrain. Ivan Conte and Paola Bovolenta from Spain, Genome Biology

Ivan Conte and colleagues investigated the CNEs around the Six3 gene in a fish alignment: 10 CNE blocks in the 5' flanking region of the gene, including 2 enhancers (D, I), 2 silencers (A, G) and 2 silencer blockers (E, H). They demonstrated that the entire expression of the newly identified olSix3.2 is orchestrated by the combinatorial use of seven different cis-regulatory modules, and that at least part of this regulation is conserved in the Six3 locus of vertebrates other than fishes.

I guess it's important to show the regulatory code in a spatio-temporal AND combinatorial way. As the paper said, “one limitation of previous studies that have used transgenic analysis to test the function of highly conserved non-coding sequences is the identification of single enhancers uprooted from possible interactions with the remaining regulatory elements”.

3. Statistical information characterization of conserved non-coding elements in vertebrates.
From Greg Elgar's group.

I cannot open the full text, so I am going by the abstract alone; I do not expect very surprising results. I guess this paper could be grouped with one of their previous papers in Trends in Genetics: Striking nucleotide frequency pattern at the borders of highly conserved vertebrate non-coding sequences.

4. A large family of ancient repeat elements in the human genome is under strong selection. PNAS, 2006. Michael Kamal, Xiaohui Xie, Eric S. Lander (@ Harvard)

I guess the paper mainly offered two messages useful for me.
  1. The discovery that a large CNE family falls into the MER121 repeat class (with 1/4 of the 115 perfectly conserved 50-mer instances). And given the exceptional conservation properties of MER121, it is clear that it must have an important function that has been under purifying selection for 200 million years. That is the reasoning behind the claim in the title. This idea of observing purifying selection on ARs that lie within CNEs was applied and extended by David Haussler (@ UCSC) and Gill Bejerano (@ Stanford) in 2007. Their PNAS paper shows "thousands of human mobile element fragments undergo strong purifying selection near developmental genes".
  2. The other thing I could learn from the paper is the method for extracting Ancient Repeat sequences or, more generally, neutrally evolving sequences. In the first paper, they got the AR sequences with the method from the mouse genome sequencing paper in Nature (method). The second one uses a model of neutral evolution computed by PhyloP from 4-fold degenerate sites in the ENCODE regions. But I cannot get hold of the PhyloP program (published at RECOMB 2006).
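As a note to myself on what "4-fold degenerate sites" means in practice, here is a minimal sketch that finds them in a single in-frame coding sequence using the standard genetic code. The function name and the input are hypothetical, and this is not the pipeline used in either paper; real analyses usually also require the site to be 4-fold degenerate in every aligned species.

```python
# First two codon bases of the eight fourfold-degenerate families in the
# standard genetic code: Leu (CTN), Val (GTN), Ser (TCN), Pro (CCN),
# Thr (ACN), Ala (GCN), Arg (CGN), Gly (GGN).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}

def fourfold_degenerate_sites(cds):
    """Return 0-based indices of third codon positions that are 4-fold degenerate.

    cds: an in-frame coding sequence (hypothetical input).
    A trailing partial codon is ignored, as are codons whose third base
    is ambiguous (anything other than A, C, G or T).
    """
    cds = cds.upper()
    sites = []
    for start in range(0, len(cds) - 2, 3):
        codon = cds[start:start + 3]
        if codon[:2] in FOURFOLD_PREFIXES and codon[2] in "ACGT":
            sites.append(start + 2)
    return sites

# Codons: ATG GCT CTA AAA TAA; only GCT (Ala) and CTA (Leu) qualify.
print(fourfold_degenerate_sites("ATGGCTCTAAAATAA"))
```

Any base at these positions leaves the amino acid unchanged, which is why they are commonly used as a proxy for neutrally evolving sequence.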
Information about repeat sequences
  1. RepBase / Repeat Masking / Repeat Map @ with username of xianjun
  2. RepeatMasker, a program that screens DNA sequences for interspersed repeats and low-complexity DNA sequences, @
  3. A good article "DNA repeat sequence and disease" @
5. Widely distributed noncoding purifying selection in the human genome, PNAS, July 2007, Saurabh Asthana... John A. Stamatoyannopoulos (@ University of Washington)

This paper sets out to answer the question "to what extent noncoding sites outside of CNEs are functionally significant in modern humans", using SNP data and CNE data. They conclude that noncoding purifying selection is widely distributed across the genome rather than concentrated in CNEs. From the following figure, we can see that most of the four-genome conserved bases (4GCBs; up to 96.5%) occur outside of CNEs.

The authors validated this partition method (conserved or non-conserved) by testing the selection pressure in coding exons.
They then partitioned the regions into 3 classes (coding, non-CNE noncoding, and CNE) and tested the difference in SNP diversity (allele frequency) between the groups, using subsampling to check reliability. Additionally, they checked that the selective effect was independent of CNE definition, population demographic history, heterogeneity in mutation rate, local G+C content, 4GCB density, and substitution type. (Very strong!!)
They then estimated the proportion of noncoding bases in the human genome under selection, using the "infinite number of sites" model (see two papers [1, 2] and a book). For the neutral theory of molecular evolution, see this wiki page. The result: a minimum of 18.5% of nucleotide positions conserved across four genomes must be under negative selection. "Our results indicate that at a minimum 3.5-fold more noncoding nucleotides (2.8% of nucleotides) are under selection than estimates based on CNSs, and that 71.4% of positions under selection (2% of nucleotides) lie outside CNSs."
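A quick sanity check on the quoted percentages, as a small sketch. The three input numbers come from the quotation above; the variable names and the reading that the "CNS-based estimate" corresponds to the selected positions inside CNSs are my own assumptions.

```python
# Figures taken from the quoted sentence.
under_selection = 2.8     # % of nucleotides under selection
fold_over_cns = 3.5       # fold increase over CNS-based estimates
outside_fraction = 0.714  # share of selected positions lying outside CNSs

# Implied CNS-based estimate (assumption: 2.8% is 3.5 times that estimate).
cns_based = under_selection / fold_over_cns
# Selected positions outside and inside CNSs, as % of all nucleotides.
outside_cns = under_selection * outside_fraction
inside_cns = under_selection - outside_cns

print(f"CNS-based estimate:   {cns_based:.2f}% of nucleotides")
print(f"selected outside CNS: {outside_cns:.2f}% of nucleotides")
print(f"selected inside CNS:  {inside_cns:.2f}% of nucleotides")
```

Both routes give roughly 0.8% of nucleotides for the CNS part, and the outside-CNS part comes to about 2%, so the quoted figures are mutually consistent.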

6. Purifying Selection Maintains Highly Conserved Noncoding Sequences in Drosophila, by Bergman CM. MBE 2007.

The paper uses a model to test predictions of the mutational cold-spot model of CNE evolution in the genus Drosophila. Some of the models and data in this paper are similar to those in the paper above.
  • News from RNA world
1. The RNAz web server: prediction of thermodynamically stable and evolutionarily conserved RNA structures.

From Vienna University; it seems related to one of the best posters. I am not sure whether they offer an API for access.

2. Promoter-associated RNA is required for RNA-directed transcriptional gene silencing in human cells

I guess it's an important paper for understanding how siRNA silences gene expression at the transcriptional stage in human cells. As in yeast(?), the siRNA recognizes promoter-associated RNAs transcribed through RNAPII promoters; these promoter RNAs act as a recognition motif that directs epigenetic silencing complexes to the corresponding targeted promoters to mediate transcriptional silencing in human cells.
  • Interesting story
1. Rapid asymmetric evolution of a dual-coding tumor suppressor INK4a/ARF locus contradicts its function, PNAS, Nekrutenko A
A fun story showing the function of a region where two proteins overlap, possibly an intrinsic property of the dual-coding exon. They also mention 90 newly identified genes with a similar dual-coding structure. It would be interesting to see the common features of these genes ... which reminds me of the similar cases I observed in the GRB study.

There are some related papers about dual-coding genes from Nekrutenko's group.

2. Identification of a locus control region for quadruplicated green-sensitive opsin genes in zebrafish, PNAS

The Shoji group shows that a 0.5-kb region located 15 kb upstream of the RH2 gene array (RH2-1, RH2-2, RH2-3, and RH2-4) is an essential regulator of their expression. Lots of experimental data, but the paper does not say much about orthologs, i.e. whether a similar or different story holds in other fish or in mammals. Today's Science Editors' Choice put this paper on its list.

3. Global analysis of patterns of gene expression during Drosophila embryogenesis
4. The new mutation theory of phenotypic evolution
5. Nucleosome positioning signals in genomic DNA
6. Functional persistence of exonized mammalian-wide interspersed repeat elements (MIRs)
7. Housekeeping genes tend to show reduced upstream sequence conservation
