
Wenyi Wang, Ph.D.
Department of Bioinformatics and Computational Biology, Division of Basic Sciences
About Dr. Wang
Professor Wenyi Wang is a big data scientist with an academic background in both statistics and biology. Her main interests lie in using data wrangling to understand big data generated by cancer multi-omics. Her lab’s statistical methodology and tool development are data-driven and motivated by important and novel biological questions.
Her lab website is at: https://odin.mdacc.tmc.edu/~wwang7/.
Present Title & Affiliation
Primary Appointment
Professor, Department of Bioinformatics and Computational Biology, Division of Quantitative Sciences, The University of Texas MD Anderson Cancer Center, Houston, TX
Dual/Joint/Adjunct Appointment
Professor, Department of Biostatistics, Division of Quantitative Sciences, The University of Texas MD Anderson Cancer Center, Houston, TX
Professor, Department of Statistics, Texas A&M University, College Station, TX
Research Interests
The lab’s current major research focuses are:
1) Tumor heterogeneity and evolution, using computational deconvolution of both transcriptomic and genomic data (integromics), from both bulk-sample and single-cell sequencing data. Our goal is to understand a) the tumor microenvironment that may lead to different outcomes in prognosis or response to treatment, and b) the molecular mechanism of cell-cell interactions in normal tissues.
2) Personalized cancer risk prediction models, using TP53 mutation-associated Li-Fraumeni syndrome as a disease model; and novel statistical modeling and study design to further our understanding of the pan-cancer impact of TP53 mutations.
The lab draws on statistical and computational tools as needed, including deep learning approaches. Our current focus is on mixture modeling and machine learning for high-dimensional data.
Education & Training
Degree-Granting Education
2007 | Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, USA, PHD, Biostatistics |
2003 | Columbia University College of Physicians and Surgeons, New York City, NY, USA, MA, Human Nutrition |
2001 | Fudan University, Shanghai, CHN, BS, Biology |
Experience & Service
Administrative Appointments/Responsibilities
Admissions Committee member, GSBS, Houston, TX, 2018 - 2020
Quantitative Sciences Program Co-director, The University of Texas Graduate School of Biomedical Sciences at Houston, Houston, TX, 2014 - 2018
Other Appointments/Responsibilities
Faculty member, Baylor College of Medicine Graduate Program in Quantitative Computational Biology, Houston, TX, 2017 - Present
Selected Publications
Peer-Reviewed Articles
- Gao F, Pan X, Dodd-Eaton EB, Recio CV, Montierth MD, Bojadzieva J, Mai PL, Zelley K, Johnson VE, Braun D, Nichols KE, Garber JE, Savage SA, Strong LC, Wang W. A pedigree-based prediction model identifies carriers of deleterious de novo mutations in families with Li-Fraumeni syndrome. Genome Res 30(8):1170-1180, 2020. e-Pub 2020. PMID: 32817165.
- Shin SJ, Li J, Ning J, Bojadzieva J, Strong LC, Wang W. Bayesian estimation of a semiparametric recurrent event model with applications to the penetrance estimation of multiple primary cancers in Li-Fraumeni Syndrome. Biostatistics 21(3):467-482, 2020. e-Pub 2018. PMID: 30445420.
- Nikooienejad A, Wang W, Johnson VE. Bayesian variable selection for survival data using inverse moment priors. Ann Appl Stat 14(2):809-828, 2020. e-Pub 2020. PMID: 33456641.
- McCarthy DJ, Rostom R, Huang Y, Kunz DJ, Danecek P, Bonder MJ, Hagai T, Lyu R, HipSci Consortium, Wang W, Gaffney DJ, Simons BD, Stegle O, Teichmann SA. Cardelino: computational integration of somatic clonal substructure and single-cell transcriptomes. Nat Methods 17(4):414-421, 2020. e-Pub 2020. PMID: 32203388.
- Gerstung M, Jolly C, Leshchiner I, Dentro SC, Gonzalez S, Rosebrock D, Mitchell TJ, Rubanova Y, Anur P, Yu K, Tarabichi M, Deshwar A, Wintersinger J, Kleinheinz K, Vázquez-García I, Haase K, Jerman L, Sengupta S, Macintyre G, Malikic S, Donmez N, Livitz DG, Cmero M, Demeulemeester J, Schumacher S, Fan Y, Yao X, Lee J, Schlesner M, Boutros PC, Bowtell DD, Zhu H, Getz G, Imielinski M, Beroukhim R, Sahinalp SC, Ji Y, Peifer M, Markowetz F, Mustonen V, Yuan K, Wang W, Morris QD, PCAWG Evolution & Heterogeneity Working Group, Spellman PT, Wedge DC, Van Loo P, PCAWG Consortium. The evolutionary history of 2,658 cancers. Nature 578(7793):122-128, 2020. e-Pub 2020. PMID: 32025013.
- Wu CC, Beird HC, Andrew Livingston J, Advani S, Mitra A, Cao S, Reuben A, Ingram D, Wang WL, Ju Z, Hong Leung C, Lin H, Zheng Y, Roszik J, Wang W, Patel S, Benjamin RS, Somaiah N, Conley AP, Mills GB, Hwu P, Gorlick R, Lazar A, Daw NC, Lewis V, Futreal PA. Immuno-genomic landscape of osteosarcoma. Nat Commun 11(1):1008, 2020. e-Pub 2020. PMID: 32081846.
- Shin SJ, Dodd-Eaton EB, Gao F, Bojadzieva J, Chen J, Kong X, Amos CI, Ning J, Strong LC, Wang W. Penetrance Estimates Over Time to First and Second Primary Cancer Diagnosis in Families with Li-Fraumeni Syndrome: A Single Institution Perspective. Cancer Res 80(2):347-353, 2020. e-Pub 2019. PMID: 31719099.
- Shin SJ, Dodd-Eaton EB, Peng G, Bojadzieva J, Chen J, Amos CI, Frone MN, Khincha PP, Mai PL, Savage SA, Ballinger ML, Thomas DM, Yuan Y, Strong LC, Wang W. Penetrance of Different Cancer Types in Families with Li-Fraumeni Syndrome: A Validation Study Using Multicenter Cohorts. Cancer Res 80(2):354-360, 2020. e-Pub 2019. PMID: 31719101.
- Haider S, Tyekucheva S, Prandi D, Fox NS, Ahn J, Xu AW, Pantazi A, Park PJ, Laird PW, Sander C, Wang W, Demichelis F, Loda M, Boutros PC, Cancer Genome Atlas Research Network. Systematic Assessment of Tumor Purity and Its Clinical Implications. JCO Precis Oncol 4, 2020. e-Pub 2020. PMID: 33015524.
- Maura F, Agnelli L, Leongamornlert D, Bolli N, Chan WC, Dodero A, Carniti C, Heavican TB, Pellegrinelli A, Pruneri G, Butler A, Bhosle SG, Chiappella A, Di Rocco A, Zinzani PL, Zaja F, Piva R, Inghirami G, Wang W, Palomero T, Iqbal J, Neri A, Campbell PJ, Corradini P. Integration of transcriptional and mutational data simplifies the stratification of peripheral T-cell lymphoma. Am J Hematol 94(6):628-634, 2019. e-Pub 2019. PMID: 30829413.
- Tarabichi M, Martincorena I, Gerstung M, Leroi AM, Markowetz F, PCAWG Evolution and Heterogeneity Working Group, Spellman PT, Morris QD, Lingjærde OC, Wedge DC, Van Loo P. Neutral tumor evolution?. Nat Genet 50(12):1630-1633, 2018. e-Pub 2018. PMID: 30374075.
- Wang Z, Morris JS, Cao S, Ahn J, Liu R, Tyekucheva S, Li B, Lu W, Tang X, Wistuba II, Bowden M, Mucci L, Loda M, Parmigiani G, Holmes CC, Wang W. Transcriptome deconvolution of heterogeneous tumor samples with immune infiltration. iScience, 2018. e-Pub 2018.
- Shin SJ, Yuan Y*, Strong LC, Bojadzieva J, Wang W*. Bayesian semiparametric estimation of cancer-specific age-at-onset penetrance with application to Li-Fraumeni Syndrome. JASA, 2018. e-Pub 2018.
- Li J, Fu C, Speed TP, Wang W*, Symmans F*. Accurate RNA Sequencing From Formalin-Fixed Cancer Tissue to Represent High-Quality Transcriptome From Frozen Tissue. Journal of Clinical Oncology Precision Oncology, 2018. e-Pub 2018. PMID: 29862382.
- Peng G, Bojadzieva J, Ballinger ML, Li J, Blackford AL, Mai PL, Savage SA, Thomas DM, Strong LC, Wang W. Estimating TP53 mutation carrier probability in families with Li-Fraumeni Syndrome using LFSPRO. Cancer Epidemiol Biomarkers Prev 26(6):837-844, 2017. e-Pub 2017. PMID: 28137790.
- Holik AZ, Law CW, Liu R, Wang Z, Wang W, Ahn J, Asselin-Labat ML, Smyth GK, Ritchie ME. RNA-seq mixology: designing realistic control experiments to compare protocols and analysis methods. Nucleic Acids Res 45(5):e30, 2017. e-Pub 2016. PMID: 27899618.
- Lefterova MI, Shen P, Odegaard JI, Fung E, Chiang T, Peng G, Davis RW, Wang W, Kharrazi M, Schrijver I, Scharfe C. Next-generation molecular testing of newborn dried blood spots for cystic fibrosis. J Mol Diagn 18(2):267-82, 2016. e-Pub 2016. PMID: 26847993.
- Fan Y, Xi L, Hughes DS, Zhang J, Zhang J, Futreal PA, Wheeler DA, Wang W. MuSE: accounting for tumor heterogeneity using a sample-specific error model improves sensitivity and specificity in mutation calling from sequencing data. Genome Biol 17(1):178, 2016. e-Pub 2016. PMID: 27557938.
- Ahn J, Liu S, Wang W*, Yuan Y*. Bayesian latent-class mixed-effect hybrid models for dyadic longitudinal data with non-ignorable dropouts. Biometrics 69(4):914-24, 2013. e-Pub 2013. PMID: 24328715.
- Cancer Genome Atlas Research Network, Weinstein JN, Collisson EA, Mills GB, Shaw KR, Ozenberger BA, Ellrott K, Shmulevich I, Sander C, Stuart JM. The Cancer Genome Atlas Pan-Cancer analysis project. Nat Genet 45(10):1113-20, 2013. PMID: 24071849.
- Ahn J, Yuan Y, Parmigiani G, Suraokar MB, Diao L, Wistuba II, Wang W. DeMix: deconvolution for mixed cancer transcriptomes using raw measured data. Bioinformatics 29(15):1865-71, 2013. e-Pub 2013. PMID: 23712657.
- Peng G, Fan Y, Palculict TB, Shen P, Ruteshouser EC, Chi AK, Davis RW, Huff V, Scharfe C, Wang W. Rare variant detection using family-based sequencing analysis. Proc Natl Acad Sci U S A 110(10):3985-90, 2013. e-Pub 2013. PMID: 23426633.
- Srivastava S, Wang W, Manyam G, Ordonez C, Baladandayuthapani V. Integrating Multi-Platform Genomic Data Using Hierarchical Bayesian Relevance Vector Machines. EURASIP J Bioinform Syst Biol 2013(1):9, 2013. e-Pub 2013. PMID: 23809014.
- Hua Y, Gorshkov K, Yang Y, Wang W, Zhang N, Hughes DP. Slow down to stay alive: HER4 protects against cellular stress and confers chemoresistance in neuroblastoma. Cancer 118(20):5140-54, 2012. e-Pub 2012. PMID: 22415601.
- Zhang N, Xu Y, O'Hely M, Speed TP, Scharfe C, Wang W. SRMA: an R package for resequencing array data analysis. Bioinformatics 28(14):1928-30, 2012. e-Pub 2012. PMID: 22581181.
- Wilkins EJ, Rubio JP, Kotschet KE, Cowie TF, Boon WC, O'Hely M, Burfoot R, Wang W, Sue CM, Speed TP, Stankovitch J, Horne MK. A DNA Resequencing Array for Genes Involved in Parkinson's Disease. Parkinsonism Relat Disord 18(4):386-90, 2012. e-Pub 2012. PMID: 22243833.
- Shen P*, Wang W*, Krishnakumar S, Palm C, Chi AK, Enns GM, Davis RW, Speed TP, Mindrinos MN, Scharfe C. High quality DNA sequence capture of 524 disease candidate genes. Proc Natl Acad Sci U S A 108(16):6549-54, 2011. PMID: 21467225.
- Wang W, Shen P, Thiyagarajan S, Lin S, Palm C, Horvath R, Klopstock T, Cutler D, Pique L, Schrijver I, Davis RW, Mindrinos M, Speed TP, Scharfe C. Identification of Rare DNA Variants in Mitochondrial Disorders with Improved Array-based Sequencing. Nucleic Acids Res 39(1):doi: 10.1093/nar/gkq750, 2011. PMID: 20843780.
- Wang W, Niendorf KB, Patel D, Blackford A, Marroni F, Sober AJ, Parmigiani G, Tsao H. Estimating CDKN2A carrier probability and personalizing cancer risk assessments in hereditary melanoma using MelaPRO. Cancer Res 70(2):552-9, 2010. e-Pub 2010. PMID: 20068151.
- Lin S*, Wang W*, Palm C, Davis RW, Juneau K. A Molecular Inversion Probe Assay for Detecting Alternative Splicing. BMC Genomics 11:712, 2010. e-Pub 2010. PMID: 21167051.
- Wang W, Carvalho B, Miller ND, Pevsner J, Chakravarti A, Irizarry RA. Estimating genome-wide copy number using allele-specific mixture models. J Comput Biol 15(7):857-66, 2008. PMID: 18707534.
- Wang W, Chen S, Brune KA, Hruban RH, Parmigiani G, Klein AP. PancPRO: risk assessment for individuals with a family history of pancreatic cancer. J Clin Oncol 25(11):1417-22, 2007. PMID: 17416862.
- Nicodemus KK, Wang W, Shugart YY. Stability of variable importance scores and rankings using statistical learning tools on single-nucleotide polymorphisms and risk factors involved in gene x gene and gene x environment interactions. BMC Proc 1 Suppl 1:S58, 2007. e-Pub 2007. PMID: 18466558.
- González JR, Wang W, Ballana E, Estivill X. A recessive Mendelian model to predict carrier probabilities of DFNB1 for nonsyndromic deafness. Hum Mutat 27(11):1135-42, 2006. PMID: 16941638.
- Chen S, Wang W, Lee S, Nafa K, Lee J, Romans K, Watson P, Gruber SB, Euhus D, Kinzler KW, Jass J, Gallinger S, Lindor NM, Casey G, Ellis N, Giardiello FM, Offit K, Parmigiani G, Colon Cancer Family Registry. Prediction of germline mutations and cancer risk in the Lynch syndrome. JAMA 296(12):1479-87, 2006. PMID: 17003396.
- Xu Z, Sproul A, Wang W, Kukekov N, Greene LA. Siah1 interacts with the scaffold protein POSH to promote JNK activation and apoptosis. J Biol Chem 281(1):303-12, 2006. e-Pub 2005. PMID: 16230351.
- Chen S, Wang W, Broman KW, Katki HA, Parmigiani G. BayesMendel: an R environment for Mendelian risk prediction. Stat Appl Genet Mol Biol 3:Article21, 2004. e-Pub 2004. PMID: 16646800.
- Ewing AD, Houlahan KE, Hu Y, Ellrott K, Caloian C, Yamaguchi TN, Bare JC, P’ng C, Waggott D, Sabelnykova VY; ICGC-TCGA DREAM Somatic Mutation Calling Challenge participants, Kellen MR, Norman TC, Haussler D, Friend SH, Stolovitzky G, Margolin AA, Stuart JM, Boutros PC. Combining tumor genome simulation with crowdsourcing to benchmark somatic single-nucleotide-variant detection. Nature Methods. e-Pub 2015.
- Davis CF et al., The Cancer Genome Atlas Research Network. The Somatic Genomic Landscape of Chromophobe Renal Cell Carcinoma. Cancer Cell. e-Pub 2014.
- Peng G, Fan Y, Wang W. FamSeq: a variant calling program for family-based sequencing data using graphics processing units. PLOS Comput Biol. e-Pub 2014. PMID: 25357123.
- Shen P*, Wang W*, Chi AK, Fan Y, Davis RW, Scharfe C. Multiplex target capture with double-stranded DNA probes. Genome Med 5(5):50. e-Pub 2013. PMID: 23718862.
- Shen P*, Wang W*, Chi AK, Fan Y, Davis RW, Scharfe C. Multiplex target capture with long padlock probes. Genome Medicine(5):50. e-Pub 2013. PMID: 23718862.
- Morris JS, Hassan MM, Zohner YE, Wang Z, Xiao L, Rashid A, Haque A, Abdel-Wahab R, Mohamed YI, Ballard KL, Wolff RA, George B, Li L, Allen G, Weylandt M, Li D, Wang W, Raghav K, Yao J, Amin HM, Kaseb AO. HepatoScore-14: Measures of biological heterogeneity significantly improve prediction of hepatocellular carcinoma risk. Hepatology. e-Pub 2020. PMID: 32931023.
- The ICGC/TCGA Pan-Cancer Analysis of Whole Genomes Consortium. Pan-cancer analysis of whole genomes. Nature. e-Pub 2020.
- Wang Z, Kaseb AO, Amin HM, Hassan MM, Wang W, Morris JS. Bayesian Edge Regression in Undirected Graphical Models to Characterize Interpatient Heterogeneity in Cancer. JASA.
- Ahn J, Morita S, Wang W*, Yuan Y*. Bayesian analysis of longitudinal dyadic data with informative missing data using a dyadic shared-parameter model. Stat Methods Med Res:962280217715051. e-Pub 2017. PMID: 28629259.
- Nikooienejad A, Wang W*, Johnson VE*. Bayesian variable selection for binary outcomes in high dimensional genomic studies using non-local priors. Bioinformatics. e-Pub 2016.
- Palculict TB, Ruteshouser EC, Fan Y, Wang W, Strong L, Huff V. Identification of germline DICER1 mutations and loss of heterozygosity in familial Wilms tumor using whole genome sequencing. Journal of Medical Genetics. e-Pub 2015.
- Fang LT, Afshar PT, Chhibber A, Mohiyuddin M, Fan Y, Mu J, Gibeling G, Barr S, Asadi NB, Gerstein M, Koboldt D, Wang W, Wong WH, Lam H. An ensemble approach to accurately detect somatic mutations using SomaticSeq. Genome Biology. e-Pub 2015.
Book Chapters
- Wang W, Fan Y, Speed T. DNA Variant Calling in Targeted Sequencing Data. In: Advances in Statistical Bioinformatics: Models and Integrative Inference for High-Throughput. Cambridge University Press: New York, 2013.
Grant & Contract Support
Title: | Statistical methods and tools for cancer risk prediction in families with germline mutations in TP53 |
Funding Source: | NIH/NCI |
Role: | Principal Investigator |
Title: | Cell Atlas of the Neural Retina |
Funding Source: | Chan Zuckerberg Initiative via subaward from Baylor College of Medicine |
Role: | Co-Principal Investigator |
Title: | Improving risk prediction for Li-Fraumeni Syndrome: A practical tool for clinical health care providers |
Funding Source: | Cancer Prevention & Research Institute of Texas (CPRIT) |
Role: | Co-Principal Investigator |
Title: | Intratumor Heterogeneity in Anaplastic Thyroid Carcinoma: Implications for Response to Neoadjuvant BRAF- and Immune-Directed Therapies |
Funding Source: | The Mark Foundation for Cancer Research |
Role: | Co-Principal Investigator |