The study of common human diseases is rapidly moving away from an exclusive focus on common variants using genome-wide association studies and toward sequencing approaches that represent most variants, including those that are rare in the general population.
Although rapidly falling, the per-base costs of next-generation sequencing platforms still preclude the generation of large sample sizes of entirely sequenced genomes at high coverage. In addition to this economic constraint, it is widely appreciated that the very large number of variants identified in such studies will make it difficult to use association evidence alone to identify causal sites. For these reasons, there has been considerable interest in focusing attention on coding variants as a first step toward a complete representation of human variation. Part of the motivation for this approach stems from the experience with Mendelian diseases, in which 59% of the causal variants are either missense or nonsense mutations [1]. Although there has been considerable speculation on the topic, there are in fact no solid data showing that the picture is any different for common diseases, which may also be influenced by variants that are in or near protein-coding sequence [1].
The most comprehensive approach for focusing on exons alone is clearly exome capture, where regions matching a defined set of coding exons are pulled from the genomic DNA (gDNA) using microarrays and then sequenced. However, this approach requires an initial and costly hybridization step. The cost of exome sequencing has contributed to the interest in sequencing the transcriptome (RNA-Seq) as an alternative and possibly easier, less expensive strategy [2]. While this approach will clearly miss genes that are poorly expressed in whatever tissue is being studied, it does have the advantage of generating additional information, such as gene expression levels and splicing patterns.
Although exome capture was demonstrated to identify approximately 95% of genomic single nucleotide variants (SNVs) in curated and non-paralogous exons [3], it is not currently known to what extent SNVs identified by RNA-Seq capture the full set of exonic SNVs identified by genomic sequencing. If the ability to capture SNVs by RNA-Seq is highly dependent on expression level, then this method would be useful only when performed in the appropriate tissue type. If, on the other hand, RNA-Seq at high coverage allows SNVs to be captured even in genes that are not highly expressed, then both methods could be useful for opening up sequencing studies to larger datasets in more diverse scientific studies.
Here, we have sequenced the entire genome and transcriptome of a single individual to high coverage. By comparing the SNVs identified in the transcriptome at different levels of coverage to those identified in the gDNA, we are able to directly evaluate how well RNA-Seq captures genomic variants.
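The comparison described above amounts to asking what fraction of the exonic SNVs called from gDNA are also called from RNA-Seq. The following is a minimal sketch of that calculation and not the pipeline used in this study: the `SNV` record, the `recovery_fraction` function, and the toy call sets are all hypothetical and introduced only for illustration.

```python
from typing import NamedTuple, Set


class SNV(NamedTuple):
    """Hypothetical minimal SNV record: position plus reference and alternate alleles."""
    chrom: str
    pos: int
    ref: str
    alt: str


def recovery_fraction(genomic: Set[SNV], rnaseq: Set[SNV]) -> float:
    """Return the fraction of genomic SNVs that also appear in the RNA-Seq call set."""
    if not genomic:
        raise ValueError("genomic SNV set is empty")
    # Set intersection keeps only SNVs matching on chromosome, position, and both alleles.
    return len(genomic & rnaseq) / len(genomic)


# Toy call sets (invented values, not data from the study).
genomic = {
    SNV("chr1", 100, "A", "G"),
    SNV("chr1", 250, "C", "T"),
    SNV("chr2", 75, "G", "A"),
    SNV("chr3", 10, "T", "C"),
}
rnaseq = {
    SNV("chr1", 100, "A", "G"),
    SNV("chr2", 75, "G", "A"),
    SNV("chr5", 999, "A", "T"),  # called in RNA-Seq only; does not count toward recovery
}

fraction = recovery_fraction(genomic, rnaseq)
print(f"Fraction of genomic SNVs recovered by RNA-Seq: {fraction:.2f}")
```

Repeating the calculation on RNA-Seq call sets produced at different coverage levels, or restricted to genes grouped by expression level, would show how recovery depends on each factor.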