Single-cell RNA-sequencing (scRNAseq) technologies have transformed our understanding of cellular systems. As these technologies rapidly develop, the field has gained an overwhelming number of computational tools that analyze scRNAseq data.
A research group led by Vicky Yao, assistant professor of computer science, has been awarded a Data Insights award from the Chan Zuckerberg Initiative (CZI) to characterize scRNAseq tools and track trends of their use across biomedical literature.
The CZI Data Insights award provides 18 months of support for projects that advance insights into human health from single-cell biology datasets by addressing computational challenges and bottlenecks in single-cell biology.
In a project titled “Meta-Characterization of Single-Cell Workflows,” Yao will use natural language processing to analyze single-cell workflows outlined in biomedical literature, providing clarity on how researchers are using scRNAseq tools.
“As these technologies advance, people are overwhelmed with all the tools and different ways to use them,” said Yao. “Although there are existing resources that list the tools, we don’t have a systematic understanding of how they are being used and how the choice of ordering may affect downstream results.”
Building on her research group’s previous work in procedural knowledge extraction, Yao will catalog the tools and the sequence in which they are used across full-text publications, resulting in a systematic analysis of scRNAseq pipelines. In doing so, she will construct a comprehensive “multiverse” of scRNAseq pipelines in biomedical literature.
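To make the idea of cataloging tools and their order concrete, here is a minimal sketch in Python. It is not the group's method: it swaps the natural language processing described above for plain keyword matching, and the tool vocabulary (`KNOWN_TOOLS`), the helper names, and the two methods snippets are all hypothetical and made up for illustration.

```python
import re
from collections import Counter

# Hypothetical tool vocabulary (assumption); a real catalog would be far larger.
KNOWN_TOOLS = ["Cell Ranger", "Seurat", "Scanpy", "Harmony", "DoubletFinder"]


def extract_pipeline(text, tools=KNOWN_TOOLS):
    """Return the known tools found in the text, ordered by first mention."""
    hits = []
    for tool in tools:
        match = re.search(r"\b" + re.escape(tool) + r"\b", text, flags=re.IGNORECASE)
        if match:
            hits.append((match.start(), tool))
    return [tool for _, tool in sorted(hits)]


def count_transitions(pipelines):
    """Count how often one tool directly precedes another across pipelines."""
    counts = Counter()
    for pipeline in pipelines:
        counts.update(zip(pipeline, pipeline[1:]))
    return counts


# Made-up methods snippets, for illustration only (not from real papers).
toy_methods = [
    "Reads were aligned with Cell Ranger, doublets removed with "
    "DoubletFinder, and clusters found in Seurat.",
    "We processed counts using Cell Ranger and then clustered cells in "
    "Seurat after Harmony integration.",
]

pipelines = [extract_pipeline(text) for text in toy_methods]
transitions = count_transitions(pipelines)

for (first, second), n in transitions.most_common():
    print(f"{first} -> {second}: {n}")
```

In the second snippet, the order of mention (Seurat, then Harmony) differs from the order of use implied by "after Harmony integration." Keyword matching cannot tell these apart, which suggests why recovering the actual sequence of steps calls for more than a tool list.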
This project will help researchers working with single-cell data navigate the complex landscape of single-cell analyses. It also aims to highlight areas where there may be opportunities for new methods development and for careful benchmarking.
“Once we chart the set of pipelines capturing how people are using the tools, we can understand what people are doing at a global level,” Yao said. “By looking at the trends, we can start to understand reproducibility of results.”
Yao’s work could pave the way for future research that sheds light on best practices for single-cell analysis. “This work could enable development of new statistical methods, considering not only multiple hypothesis corrections, but also perhaps ‘multiverse’ hypothesis correction,” said Yao.
Yao’s project is one of 17 awarded projects in the third cycle of CZI’s Data Insights awards, as announced July 23, 2024.