Background Modern cytometry can simultaneously measure dozens of markers, empowering investigation of complex phenotypes. However, manual gating relies on previous biological knowledge, and clustering/dimension-reduction tools fail to capture discrete phenotypes. Consequently, complex phenotypes with potential biological importance are often overlooked. To address this, we developed PhenoComb, an R package that allows agnostic exploration of complex phenotypes by assessing the frequencies of all marker combinations in cytometry datasets.
Methods PhenoComb uses signal intensity thresholds to assign markers to discrete states (e.g. negative, low, high). As PhenoComb works in a memory-safe manner, time and disk space are the only constraints on the number of markers and discrete states that can be evaluated. Next, the number of cells per sample is counted for every possible marker combination, and the corresponding frequencies are calculated. PhenoComb provides several approaches to perform statistical comparisons, evaluate the relevance of phenotypes, and assess the independence of identified phenotypes. PhenoComb also allows users to guide the analysis by adjusting function arguments, for example to specify parent populations of interest, filter out low-frequency populations, and set a maximum marker complexity. PhenoComb is compatible with local computer or server-based use.
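The thresholding and counting steps described above can be illustrated with a minimal Python sketch. It is not PhenoComb's implementation or API (PhenoComb is an R package). The function names, marker names, and thresholds below are hypothetical and chosen only for illustration. The sketch also holds everything in memory, so it does not reproduce PhenoComb's memory-safe processing.

```python
from collections import Counter
from itertools import combinations


def discretize(cell, thresholds):
    """Map each marker intensity to a state index.

    The state is the number of thresholds the intensity meets or exceeds,
    e.g. with thresholds [1.0, 3.0]: 0 = negative, 1 = low, 2 = high.
    """
    return {m: sum(cell[m] >= t for t in thresholds[m]) for m in thresholds}


def count_phenotypes(cells, thresholds, max_markers=None):
    """Count cells for every combination of markers and their states.

    max_markers (hypothetical argument) caps how many markers a phenotype
    may include, mirroring the idea of a maximum marker complexity.
    """
    markers = sorted(thresholds)
    limit = len(markers) if max_markers is None else min(max_markers, len(markers))
    counts = Counter()
    for cell in cells:
        states = discretize(cell, thresholds)
        # A cell contributes to every phenotype defined on any marker subset.
        for k in range(1, limit + 1):
            for subset in combinations(markers, k):
                counts[tuple((m, states[m]) for m in subset)] += 1
    return counts


def frequencies(counts, n_cells):
    """Convert phenotype cell counts to fractions of all cells in the sample."""
    return {phenotype: c / n_cells for phenotype, c in counts.items()}


# Illustrative single sample: two markers, made-up intensities and thresholds.
thresholds = {"CD4": [1.0], "CD8": [1.0, 3.0]}
cells = [
    {"CD4": 2.0, "CD8": 0.5},
    {"CD4": 2.5, "CD8": 0.2},
    {"CD4": 0.1, "CD8": 4.0},
    {"CD4": 0.3, "CD8": 2.0},
]
counts = count_phenotypes(cells, thresholds)
freqs = frequencies(counts, len(cells))
```

Because each cell contributes to every subset of markers, the work per cell grows exponentially with the number of markers, which is consistent with the runtime becoming the practical constraint as the panel grows.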
Results When PhenoComb’s performance was tested on synthetic datasets, computation on 16 markers was completed on the scale of minutes, and computation on up to 26 markers on the scale of hours. We applied PhenoComb to two publicly available datasets: an HIV flow cytometry dataset (12 markers and 421 samples) and the COVIDome CyTOF dataset (40 markers and 99 samples). In the HIV dataset, PhenoComb identified immune phenotypes associated with HIV seroconversion, including those highlighted in the original publication. In the COVID dataset, we identified several immune phenotypes with altered frequencies in infected individuals relative to healthy individuals.
Conclusions PhenoComb is a unique and powerful tool for agnostically assessing phenotypes. By more fully utilizing the high-dimensional data in single-cell datasets, PhenoComb empowers exploratory data analysis and the discovery of phenotypes for further characterization. PhenoComb is publicly available at https://github.com/SciOmicsLab/PhenoComb.