PhantomBuster – A tool for removing phantoms in sequencing-based lineage tracing experiments
Abstract
Sequencing-based lineage tracing uses random DNA barcodes to track thousands to millions of lineages simultaneously, but faces the analytical challenge of distinguishing genuine barcodes from "phantom" barcodes that originate from sequencing errors. Index hopping, typically known for causing incorrect sample assignments, produces "intra-sample" phantoms in some setups, and these cannot be removed by current tools. Although index hopping affects only a few reads, we find that it gives rise to a significant number of phantoms, which need to be removed. Hence, we introduce PhantomBuster, a bioinformatics tool for detecting and removing phantoms and an end-to-end pipeline for sequencing-based lineage tracing and CRISPR screens. We showcase its application to lineage tracing and CRISPR-LICHT screening, in which PhantomBuster identifies approximately 95% of barcode combinations as phantoms and eliminates them.