On April 28 and 29, bioinformatics experts from Rice University and other Texas Medical Center institutions held a RAD Microbes Boot Camp at Rice’s BioScience Research Collaborative. The workshop was organized by Todd Treangen, associate professor of computer science at Rice University’s George R. Brown School of Engineering and Computing and lead of the Ken Kennedy Institute’s AI2Health research cluster, in collaboration with colleagues from other Texas Medical Center institutions: UTHealth Houston, Baylor College of Medicine, the University of Texas MD Anderson Cancer Center, and the Houston Methodist Academic Institute.
The workshop gave 40 local researchers (graduate students, postdoctoral fellows, and principal investigators from diverse backgrounds) comprehensive hands-on training in the unbiased analysis and use of genomic data, applying several current and emerging bioinformatics tools and pipelines to answer clinical and research questions.
Analyzing an organism’s genome, its complete genetic blueprint (DNA), offers important insights for biomedical research and clinical care. For instance, it helps researchers track how variants of different pathogens evolve, which is crucial for predicting and mitigating disease outbreaks and antibiotic resistance in communities. Physicians use it to diagnose, monitor, and treat genetic disorders, and biomedical researchers use it to understand the origins of new diseases and develop targeted therapies.
These applications rely on the ability of all researchers, even those without a bioinformatics background, to accurately and reliably analyze and interpret genomic data. Treangen teamed up with researchers and clinical experts from other area institutions to educate the research community on this topic through hands-on demonstrations and in-depth discussions. In addition to building the participants’ skills and knowledge, he hoped this boot camp would encourage cross-disciplinary learning, build connections, and facilitate future collaborations among participants.
“As a developer of computational methods geared towards genomic data analysis at scale, my lab is committed to providing training and education to interested researchers to help bridge the gap between the answers provided by command-line bioinformatics tools and clinical research questions by scientists and clinicians at the Texas Medical Center,” Treangen said. “This workshop is the latest in a series of boot camps and workshops I’ve organized under this theme, the others being RAD Microbes 2023, CINEMA 2023 at the Rice Global Paris Center, and GCC AMR pre-conference workshop on strain level metagenomic analysis in 2024 and 2025.”
The two-day workshop delved deep into each step of genome sequencing and analysis and covered a wide range of topics within those areas.
The first session, conducted by Blake Hanson, epidemiologist and assistant professor of internal medicine at UTHealth Houston’s McGovern Medical School, introduced sampling and study design and short- and long-read sequencing technologies, followed by quality assessment and control. Treangen lab members Michael Nute, a postdoctoral fellow, and Rossie Luo, a graduate student at Rice, conducted the second session on genome alignment, assembly, and visualization. The third session, devoted to genome alignment and variant discovery, was conducted by Daniel Paiva Agustinho, assistant professor at Baylor College of Medicine. These sessions aimed to build a foundational understanding of the common computational tools and techniques used in genome sequencing and analysis.
On the second day of the boot camp, William Shropshire, instructor and research faculty at the University of Texas MD Anderson Cancer Center, and An Dinh, a graduate student at UTHealth School of Public Health, presented an overview of genomic epidemiology and phylogenomics. This session aimed to show how molecular subtyping, genomics, and metadata can be used to uncover evolutionary relationships across genomes and to inform and guide public health. The final session, led by Rodrigo de Paula Baptista, professor of medicine at the Houston Methodist Academic Research Institute, covered the analysis and use of large datasets to custom annotate and visualize pan-genomes, which represent the complete collection of genes within a species, and to extract actionable insights from large numbers of genomes.
The teaching assistants were Rice computer science graduate students Ryan Doughty, Eddy Huang, and Natalie Kokroko; Hossaena Ayele from the School of Public Health and Center for Infectious Diseases at UTHealth Houston; and Austin Marshall from Houston Methodist Research Institute. The workshop organizers are thankful to the Ken Kennedy Institute, Pacific Biosciences of California (PacBio), Rice Office of Information Technology (Clinton Heider, Bryan Raney, Erik Engquist), and the National Institutes of Health (NIH NIAID P01AI15299) for their support.