Wenyi Wang, PhD

Present Title & Affiliation

Primary Appointment

Assistant Professor, Department of Bioinformatics and Computational Biology, Division of Quantitative Sciences, The University of Texas MD Anderson Cancer Center, Houston, TX

Dual/Joint/Adjunct Appointment

Assistant Professor, Statistics, Texas A&M, College Station, TX

Bio Statement

Dr. Wang has formal training in both statistical bioinformatics and basic science research. She received her Ph.D. training in the Department of Biostatistics at Johns Hopkins University. Her Ph.D. thesis (advisor, Dr. Giovanni Parmigiani) addresses statistical methods for cancer risk assessment and copy number estimation (with Dr. Rafael Irizarry). As a postdoctoral fellow at both the Stanford Genome Technology Center (advisor, Dr. Ron Davis) and the UC Berkeley Department of Statistics (advisor, Dr. Terry Speed), she completed three years of research on statistical methods for analyzing high-throughput sequencing data, during which she developed a new, improved analysis tool for rare variant calling with resequencing arrays. More information about her lab is available here.

Research Interests

Dr. Wang's research is motivated by large-scale complex data sets in recent genomic and familial studies and by important biological questions that emerge from the analysis of these data. Her current interests can be divided into two parts: 1) Development of methods and software for the accurate measurement of high-throughput genomic data; 2) Development and validation of statistical approaches and software for personalized cancer risk prediction.

It is non-trivial to extract genomic information of interest from the raw signals that come directly from chemical or physical reactions. Current high-throughput technologies all inevitably incorporate multi-level confounders that affect the observed signals. The large amount of data they produce also makes it difficult to calibrate these technologies against "gold standards", which are usually generated by experiments that are more accurate but low-throughput and expensive. My work in this part is focused on the accurate interpretation of raw high-throughput signals using statistical modeling. To date, I have worked with high-throughput data measuring copy number, single nucleotide variants and alternative splicing.

Cancer results from the accumulation of multiple genetic mutations. A germline mutation in a cancer gene predisposes the carrier to the development of cancer, known as "inherited susceptibility". This inheritance results in familial clustering of cancers, known as "familial cancer syndromes". Clinical researchers use model-based prediction algorithms to identify cancer patients at earlier and more treatable stages and/or to identify healthy individuals at high risk of developing cancer in the future. Among these algorithms, Mendelian carrier probability models, which are based on Bayesian methods using detailed family history as input, have shown better performance than empirical models using regression or classification trees alone. My work in this part is focused on a) applying Mendelian models to cancers of interest for personalized risk assessment and b) developing methodologies for evaluation of risk assessment models using family and correlated data.

Education & Training

Degree-Granting Education

2007 Johns Hopkins Bloomberg School of Public Health, Baltimore, MD, PHD, Biostatistics
2003 Columbia University College of Physicians and Surgeons, New York City, NY, MA, Human Nutrition
2001 Fudan University, Shanghai, China, BS, Honor Science Program, Biology


Other Appointments/Responsibilities

Postdoctoral fellow, Stanford University, Palo Alto, CA, 8/2007-8/2010
Visiting Scholar, UC Berkeley, Berkeley, CA, 8/2007-8/2010

Honors and Awards

2011 The Stellar Abstract Award, The 5th Annual Program in Quantitative Genomics, Harvard School of Public Health
2008 Delta Omega Alpha Inducted Member, Johns Hopkins Bloomberg School of Public Health
2008 Phi Beta Kappa Inducted Member, Johns Hopkins University Chapter of Phi Beta Kappa
2008 The Jane and Steve Dykacz Award, Johns Hopkins University, Baltimore, MD
2007 Travel Award, The 11th International Conference on Research in Computational and Molecular Biology
2006 Travel Award, The International Genetic Epidemiology Society 15th Annual Meeting
2005 The June B. Culley Award, Johns Hopkins University, Baltimore, MD
1997-2001 People's Scholarship, Fudan University, Shanghai, China
1994-2001 Honor Science Program, Fudan University, Shanghai, China

Selected Publications

Peer-Reviewed Original Research Articles

1. Peng G, Fan Y, Palculict TB, Shen P, Ruteshouser EC, Chi A, Davis RW, Huff V, Scharfe C, Wang W. Rare variant detection using family-based sequencing analysis. Proc Natl Acad Sci U S A, 2/2013. e-Pub 2/2013.
2. Hua Y, Gorshkov K, Yang Y, Wang W, Zhang N, Hughes DP. Slow down to stay alive: HER4 protects against cellular stress and confers chemoresistance in neuroblastoma. Cancer 118(20):5140-54, 10/2012. PMCID: PMC3414637.
3. Wilkins EJ, Rubio JP, Kotschet KE, Cowie TF, Boon WC, O'Hely M, Burfoot R, Wang W, Sue CM, Speed TP, Stankovitch J, Horne MK. A DNA Resequencing Array for Genes Involved in Parkinson's Disease. Parkinsonism Relat Disord 18(4):386-90, 5/2012. PMID: 22243833.
4. Zhang N, Xu Y, O'Hely M, Speed TP, Scharfe C, Wang W. SRMA: an R package for resequencing array data analysis. Bioinformatics, 5/2012. PMID: 22581181.
5. Shen P*, Wang W*, Krishnakumar S, Palm C, Chi AK, Enns GM, Davis RW, Speed TP, Mindrinos MN, Scharfe C. High quality DNA sequence capture of 524 disease candidate genes. Proc Natl Acad Sci U S A 108(16):6549-54, 4/2011. PMCID: PMC3080966.
6. Wang W, Shen P, Thiyagarajan S, Lin S, Palm C, Horvath R, Klopstock T, Cutler D, Pique L, Schrijver I, Davis RW, Mindrinos M, Speed TP, Scharfe C. Identification of Rare DNA Variants in Mitochondrial Disorders with Improved Array-based Sequencing. Nucleic Acids Res 39(1):doi: 10.1093/nar/gkq750, 1/2011. PMCID: PMC3017602.
7. Wang W, Niendorf KB, Patel D, Blackford A, Marroni F, Sober AJ, Parmigiani G, Tsao H. Estimating CDKN2A carrier probability and personalizing cancer risk assessments in hereditary melanoma using MelaPRO. Cancer Res 70(2):552-9, 1/2010. e-Pub 1/2010. NIHMSID: NIHMS161634.
8. Lin S*, Wang W*, Palm C, Davis RW, Juneau K. A Molecular Inversion Probe Assay for Detecting Alternative Splicing. BMC Genomics 11:712, 2010. e-Pub 12/2010. PMCID: PMC3022918.
9. Wang W, Carvalho B, Miller ND, Pevsner J, Chakravarti A, Irizarry RA. Estimating genome-wide copy number using allele-specific mixture models. J Comput Biol 15(7):857-66, 9/2008. PMCID: PMC2612042.
10. Wang W, Chen S, Brune KA, Hruban RH, Parmigiani G, Klein AP. PancPRO: risk assessment for individuals with a family history of pancreatic cancer. J Clin Oncol 25(11):1417-22, 4/2007. PMCID: PMC2267288.
11. Nicodemus KK, Wang W, Shugart YY. Stability of variable importance scores and rankings using statistical learning tools on single-nucleotide polymorphisms and risk factors involved in gene x gene and gene x environment interactions. BMC Proc 1 Suppl 1:S58, 2007. e-Pub 12/2007. PMCID: PMC2367584.
12. González JR, Wang W, Ballana E, Estivill X. A recessive Mendelian model to predict carrier probabilities of DFNB1 for nonsyndromic deafness. Hum Mutat 27(11):1135-42, 11/2006. PMCID: PMC2268028.
13. Chen S, Wang W, Lee S, Nafa K, Lee J, Romans K, Watson P, Gruber SB, Euhus D, Kinzler KW, Jass J, Gallinger S, Lindor NM, Casey G, Ellis N, Giardiello FM, Offit K, Parmigiani G, Colon Cancer Family Registry. Prediction of germline mutations and cancer risk in the Lynch syndrome. JAMA 296(12):1479-87, 9/2006. PMCID: PMC2538673.
14. Chen S, Wang W, Broman KW, Katki HA, Parmigiani G. BayesMendel: an R environment for Mendelian risk prediction. Stat Appl Genet Mol Biol 3:Article21, 2004. e-Pub 9/2004. PMCID: PMC2274007.

Grant & Contract Support

Title: Bioinformatics tools for genomic analysis of tumor and stromal pathways in cancer
Funding Source: NIH/NCI (Subcontract from Dana Farber Cancer Institute)
Role: Co-Investigator
Principal Investigator: Giovanni Parmigiani
Duration: 2/1/2013 - 1/31/2018

Title: Next-Generation Genomic Sequence Identification of the 19q Familial Wilms Tumor Predisposition Gene
Funding Source: Cancer Prevention & Research Institute of Texas (CPRIT)
Role: Co-Investigator
Principal Investigator: Vicky Huff
Duration: 5/1/2010 - 4/30/2013

Title: The University of Texas SPORE in Lung Cancer (PC-C)
Funding Source: NIH/NCI (Subcontract from the University of Texas Southwestern Medical Center)
Role: Co-Investigator
Principal Investigator: John Minna
Duration: 5/1/2008 - 4/30/2013

Last updated: 4/18/2013