Abstract
The use of RNA sequencing (RNA-Seq) data and the generation of de novo transcriptome assemblies have been pivotal for studies in ecology and evolution. This is especially true for nonmodel organisms, for which no genome information is available. In such organisms, studies of differential gene expression, DNA enrichment bait design and phylogenetics can all be accomplished with de novo transcriptome assemblies. Multiple tools are available for transcriptome assembly, but no single tool provides the best assembly for all data sets. Therefore, a multi-assembler approach, followed by a reduction step, is often used to generate an improved representation of the transcriptome. Automated workflows have become essential in the analysis of RNA-Seq data because they reduce errors in these complex analyses while providing reproducibility and scalability. However, most of these workflows are designed for species where genome data are used as a reference for the assembly process, which limits their use in nonmodel organisms. We present TransPi, a comprehensive pipeline for de novo transcriptome assembly that requires minimal user input without sacrificing the ability to perform a thorough analysis. A combination of different model organisms, k-mer sets, read lengths and read quantities was used to assess the tool. Furthermore, a total of 49 nonmodel organisms, spanning different phyla, were also analysed. Compared to approaches using single assemblers only, TransPi produces higher BUSCO completeness percentages together with significantly reduced duplication rates. TransPi is easy to configure and can be deployed seamlessly using Conda, Docker and Singularity.
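The multi-assembler idea mentioned in the abstract (pool the contigs from several assemblers, then reduce the pooled set) can be illustrated with a minimal sketch. This is not TransPi's reduction step, which the abstract does not describe; it is a deliberately naive stand-in that assumes redundancy means "exact duplicate, reverse complement, or fully contained in a longer contig". The function names, the assembler labels and the toy sequences are all hypothetical.

```python
from typing import Dict, List

# Translation table for complementing DNA bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def reduce_assemblies(assemblies: Dict[str, List[str]]) -> List[str]:
    """Pool contigs from several assemblers and drop redundant ones.

    Assumption for this sketch: a contig is redundant if it, or its
    reverse complement, occurs as a substring of an already kept contig.
    Real reduction tools use far more elaborate criteria.
    """
    pooled = [c.upper() for contigs in assemblies.values() for c in contigs]
    # Longest first, so shorter contigs are always tested against longer ones.
    pooled.sort(key=lambda s: (-len(s), s))

    kept: List[str] = []
    for contig in pooled:
        rc = reverse_complement(contig)
        if any(contig in k or rc in k for k in kept):
            continue  # duplicate, reverse complement, or contained fragment
        kept.append(contig)
    return kept


# Toy input: two hypothetical assemblers with overlapping output.
example = {
    "assembler_a": ["ATGGCCATTGTAATGGGCCGC", "ATGGCC"],
    "assembler_b": [
        "ATGGCCATTGTAATGGGCCGC",   # exact duplicate of assembler_a's contig
        "GCGGCCCATTACAATGGCCAT",   # reverse complement of the same contig
        "TTTTAAAACCCC",            # unique contig
    ],
}
reduced = reduce_assemblies(example)
print(reduced)
```

The pairwise substring check is quadratic in the number of contigs, so it only suits toy inputs; it is meant to show why pooling several assemblies inflates duplication and why a reduction step is needed afterwards.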
Document type: | Journal article |
---|---|
Faculty: | Geosciences > Department of Earth and Environmental Sciences |
Subjects: | 500 Natural sciences and mathematics > 550 Earth sciences, geology |
URN: | urn:nbn:de:bvb:19-epub-107077-7 |
ISSN: | 1755-098X |
Language: | English |
Document ID: | 107077 |
Date published on Open Access LMU: | 11 Sep 2023, 13:47 |
Last modified: | 15 Sep 2023, 05:20 |
DFG: | Funded by the Deutsche Forschungsgemeinschaft (DFG) - 491502892 |