Abstract
The massive size of single-cell RNA sequencing datasets often exceeds the capability of current computational analysis methods to solve routine tasks such as the detection of cell types. Recently, geometric sketching was introduced as an alternative to uniform subsampling. It selects a subset of cells (the sketch) that evenly covers the transcriptomic space occupied by the original dataset, in order to accelerate downstream analyses and highlight rare cell types. Here, we propose Sphetcher, an algorithm that uses the thresholding technique to efficiently pick representative cells within spheres (as opposed to the typically used equal-sized boxes) that cover the entire transcriptomic space. We show that the spherical sketch computed by Sphetcher constitutes a more accurate representation of the original transcriptomic landscape. Our optimization scheme allows the inclusion of fairness aspects that can encode prior biological or experimental knowledge. We show how fair sampling can inform the inference of the trajectory of human skeletal muscle myoblast differentiation.
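To make the idea of covering the transcriptomic space with spheres concrete, the following is a minimal illustrative sketch and not the authors' Sphetcher implementation. It assumes cells are rows of a NumPy array (for example, a low-dimensional embedding), uses Euclidean distance, and combines a simple greedy cover with a binary search over the radius as a stand-in for thresholding. The function names `greedy_sphere_cover` and `sketch_by_thresholding`, and the parameters `k` and `iters`, are hypothetical.

```python
import numpy as np

def greedy_sphere_cover(X, radius):
    """Pick row indices of X so that every row lies within `radius` of a pick."""
    uncovered = np.ones(len(X), dtype=bool)
    centers = []
    while uncovered.any():
        i = int(np.argmax(uncovered))            # first still-uncovered cell
        centers.append(i)
        dist = np.linalg.norm(X - X[i], axis=1)  # distance of every cell to cell i
        uncovered &= dist > radius               # cells inside the sphere are covered
    return centers

def sketch_by_thresholding(X, k, iters=30):
    """Binary-search a radius whose greedy cover needs at most k spheres.

    Assumption: this is a heuristic stand-in for the thresholding step,
    not the optimization scheme described in the abstract.
    """
    lo = 0.0
    # The bounding-box diagonal is an upper bound on any pairwise distance.
    hi = float(np.linalg.norm(X.max(axis=0) - X.min(axis=0)))
    best = greedy_sphere_cover(X, hi)            # a single sphere covers everything
    for _ in range(iters):
        mid = (lo + hi) / 2
        centers = greedy_sphere_cover(X, mid)
        if len(centers) <= k:
            best, hi = centers, mid              # feasible: try a smaller radius
        else:
            lo = mid                             # too many spheres: grow the radius
    return best, hi

# Usage on synthetic data standing in for an embedded expression matrix.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
sketch_idx, r = sketch_by_thresholding(X, k=20)
print(len(sketch_idx), "cells selected at radius", round(r, 3))
```

The returned indices form the sketch: at most `k` cells such that every cell in the dataset lies within the returned radius of a selected cell.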
Document type: | Journal article |
---|---|
Faculty: | Chemistry and Pharmacy > Department of Biochemistry |
Subjects: | 500 Natural sciences and mathematics > 540 Chemistry |
Language: | English |
Document ID: | 89731 |
Date published on Open Access LMU: | 25 Jan. 2022, 09:32 |
Last modified: | 25 Jan. 2022, 09:32 |