Which of the following processes is the best way to determine whether alternative splicing of a given gene occurs?


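| Species | Prediction source | AS genes | IR | ES | Alt. 5' | Alt. 3' | Total AS events |
|---|---|---|---|---|---|---|---|
| A. thaliana | Gene models | 4,029 | 1,987 (33%) | 550 (9%) | 1,256 (21%) | 2,145 (36%) | 5,938 |
| A. thaliana | SpliceGrapher, no ESTs | 4,901 | 2,248 (30%) | 714 (10%) | 1,560 (21%) | 2,866 (39%) | 7,388 |
| A. thaliana | SpliceGrapher, no ESTs, novel | 885 | 308 (21%) | 164 (11%) | 304 (20%) | 721 (48%) | 1,497 |
| A. thaliana | SpliceGrapher, with ESTs | 6,162 | 3,658 (33%) | 994 (9%) | 2,335 (21%) | 4,128 (37%) | 9,916 |
| A. thaliana | SpliceGrapher, with ESTs, novel | 2,154 | 1,779 (34%) | 444 (8%) | 1,079 (20%) | 1,983 (38%) | 5,285 |
| A. thaliana | Cufflinks, no gene models | 1,263 | 449 (32%) | 383 (28%) | 237 (17%) | 319 (23%) | 1,388 |
| A. thaliana | Cufflinks, no gene models, novel | 699 | 429 (32%) | 380 (28%) | 232 (17%) | 304 (23%) | 1,345 |
| A. thaliana | Cufflinks, with gene models | 6,056 | 4,029 (39%) | 2,857 (27%) | 1,427 (14%) | 2,106 (20%) | 10,419 |
| A. thaliana | Cufflinks, with gene models, novel | 2,319 | 2,232 (38%) | 2,550 (43%) | 552 (9%) | 594 (10%) | 5,928 |
| A. thaliana | TAU, no gene models | 2,777 | 893 (17%) | 475 (9%) | 1,481 (27%) | 2,555 (47%) | 5,404 |
| A. thaliana | TAU, no gene models, novel | 1,591 | 811 (16%) | 460 (9%) | 1,431 (28%) | 2,351 (47%) | 5,053 |
| A. thaliana | TAU, with gene models | 10,458 | 94,571 (85%) | 598 (1%) | 5,972 (5%) | 9,820 (9%) | 110,961 |
| A. thaliana | TAU, with gene models, novel | 8,364 | 94,124 (86%) | 476 (0%) | 5,697 (5%) | 9,219 (8%) | 109,516 |
| V. vinifera | Gene models | 0 | 0 (0%) | 0 (0%) | 0 (0%) | 0 (0%) | 0 |
| V. vinifera | SpliceGrapher | 2,039 | 347 (13%) | 830 (31%) | 640 (24%) | 838 (32%) | 2,655 |
| V. vinifera | TAU, no gene models | 3,099 | 531 (10%) | 684 (13%) | 1,321 (25%) | 2,743 (52%) | 5,279 |
| V. vinifera | TAU, with gene models | 15,874 | 135,585 (72%) | 4,938 (3%) | 23,615 (13%) | 24,406 (13%) | 188,544 |
| V. vinifera | Cufflinks, no gene models | 1,057 | 324 (24%) | 519 (39%) | 140 (11%) | 349 (26%) | 1,332 |
| V. vinifera | Cufflinks, with gene models | 4,263 | 4,120 (34%) | 3,148 (26%) | 2,165 (18%) | 2,818 (23%) | 12,251 |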

Table 1. The number of AS events detected by SpliceGrapher, Cufflinks, and TAU compared with events inferred from the TAIR9 annotations. We track the following AS event types: intron retention (IR), exon skipping (ES), alternative 5' sites (Alt. 5') and alternative 3' sites (Alt. 3'). SpliceGrapher uses the TAIR9 gene models as a baseline, so it includes all of the same AS events along with additional events inferred from RNA-Seq data. Without gene models, nearly all TAU and Cufflinks predictions are novel AS events. With gene models, more than half of Cufflinks predictions reproduce AS events from the gene models. TAU uses known splice sites to predict all possible exons in a gene, generating vast numbers of novel exons and novel IR events.
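
The percentages in parentheses appear to be each event type's share of the row's total AS events. As a minimal sketch under that assumption, the Python snippet below recomputes the total and the shares for the A. thaliana gene-model row. The function name `event_shares` and the dictionary layout are illustrative choices, not part of SpliceGrapher, Cufflinks, or TAU.

```python
# Event counts copied from the A. thaliana gene-model row of the table.
counts = {"IR": 1987, "ES": 550, "Alt. 5'": 1256, "Alt. 3'": 2145}


def event_shares(counts):
    """Return the total event count and each type's share in whole percent.

    Assumes the percentage is count / row total, rounded to the nearest
    integer (an assumption about how the table was built).
    """
    total = sum(counts.values())
    shares = {kind: round(100 * n / total) for kind, n in counts.items()}
    return total, shares


total, shares = event_shares(counts)
print(f"Total AS events: {total:,}")
for kind, pct in shares.items():
    print(f"{kind}: {counts[kind]:,} ({pct}%)")
```

Running the same function over other rows is a quick way to check that the four event counts in a row add up to the reported total.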