Abstract
Identifying distinctive taxa for microbiome-related diseases is considered key to establishing diagnosis and therapy options in precision medicine, and it places high demands on the accuracy of microbiome analysis techniques. We propose an alignment-free and reference-free, subsequence-based analysis of 16S rRNA data as a new paradigm for microbiome phenotype and biomarker detection. Our method, called DiTaxa, replaces standard operational taxonomic unit (OTU) clustering with a segmentation of 16S rRNA reads into the most frequent variable-length subsequences. We compared the performance of DiTaxa with state-of-the-art methods in phenotype and biomarker detection, using human-associated 16S rRNA samples for periodontal disease, rheumatoid arthritis and inflammatory bowel diseases, as well as a synthetic benchmark dataset. DiTaxa performed competitively with the k-mer-based state-of-the-art approach in phenotype prediction, and it outperformed the OTU-based state-of-the-art approach in finding biomarkers, in both resolution and coverage, as evaluated against known links from the literature and on synthetic benchmark datasets.

Availability and implementation: DiTaxa is available under the Apache 2 license at http://llp.berkeley.edu/ditaxa.

Supplementary information: Supplementary data are available at Bioinformatics online.
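
The abstract does not spell out how reads are segmented into frequent variable-length subsequences. As an illustration only, the following minimal sketch assumes a byte-pair-encoding-style procedure that repeatedly merges the most frequent adjacent pair of symbols across all reads. The function names (`learn_merges`, `apply_merge`, `segment`) and the toy reads are hypothetical; this is not DiTaxa's actual implementation.

```python
from collections import Counter


def apply_merge(symbols, pair):
    """Replace each non-overlapping occurrence of `pair` with one joined symbol."""
    merged = []
    i = 0
    while i < len(symbols):
        if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
            merged.append(symbols[i] + symbols[i + 1])
            i += 2
        else:
            merged.append(symbols[i])
            i += 1
    return merged


def learn_merges(reads, num_merges):
    """Learn up to `num_merges` merge rules from a list of reads (assumed scheme)."""
    # Start with every read split into single nucleotides.
    segmented = [list(read) for read in reads]
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter()
        for symbols in segmented:
            pair_counts.update(zip(symbols, symbols[1:]))
        if not pair_counts:
            break
        best_pair, count = pair_counts.most_common(1)[0]
        if count < 2:
            break  # no adjacent pair repeats, so nothing useful is left to merge
        merges.append(best_pair)
        segmented = [apply_merge(symbols, best_pair) for symbols in segmented]
    return merges


def segment(read, merges):
    """Segment one read by replaying the learned merges in order."""
    symbols = list(read)
    for pair in merges:
        symbols = apply_merge(symbols, pair)
    return symbols


# Toy reads, purely illustrative.
reads = ["ACGTACGT", "ACGTTT"]
merges = learn_merges(reads, num_merges=3)
segments = segment("ACGTACGT", merges)
```

Under this assumed scheme, the segments of a read always concatenate back to the original read, and subsequences that recur often end up as longer units. Counting such units per sample would then play the role that OTU counts play in a clustering-based pipeline.
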
| Document type: | Journal article |
|---|---|
| Faculty: | Medicine |
| Subjects: | 600 Technology, Medicine, Applied Sciences > 610 Medicine and Health |
| ISSN: | 1367-4803 |
| Language: | English |
| Document ID: | 79582 |
| Date published on Open Access LMU: | 15 Dec 2021 14:49 |
| Last modified: | 15 Dec 2021 14:49 |