Kim-Anh Do

Present Title & Affiliation

Primary Appointment

Department Chair, Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX
Professor, Department of Biostatistics, The University of Texas MD Anderson Cancer Center, Houston, TX

Dual/Joint/Adjunct Appointment

Adjunct Professor, Statistics, Rice University, Houston, TX
Adjunct Professor, Statistics, Texas A&M University, College Station, TX

Bio Statement

Kim-Anh Do, Ph.D., is a Professor and Chair of the Department of Biostatistics at MD Anderson. She received the MD Anderson Faculty Scholar Award in 2003 and the Texas 4000 Distinguished Professorship in 2013. She is a Fellow of the American Statistical Association, the American Association for the Advancement of Science (AAAS), and the Royal Statistical Society, and an Elected Member of the International Statistical Institute. She has served as a primary statistician or co-investigator on several National Institutes of Health (NIH) funded grants and clinical trials in prostate cancer, epidemiology, leukemia, upper aerodigestive cancer, breast cancer, and brain cancer, including the Early Detection Research Network (EDRN) grant, the Prostate SPORE (as Director of the Biostatistics Core), the Breast SPORE, and the Brain SPORE at MD Anderson. She has published extensively in statistical methodology, computing, biomedical, and other applied specialist journals. Her most recent interest is in the development of clustering and analytic methods for genomic and proteomic expression data. She has developed bioinformatics software and authored or edited three books: (i) Analyzing Microarray Gene Expression Data; (ii) Bayesian Inference for Gene Expression and Proteomics; and (iii) Advances in Statistical Bioinformatics: Models and Integrative Inference for High-Throughput Data. Her contributions to statistical and cancer research at MD Anderson have resulted in more than 160 published articles. Additional information regarding Dr. Do's educational and professional activities can be found here.

Research Interests

  • Computational Statistics and Biostatistics
  • Bioinformatics
  • Statistical Genetics
  • Non-parametric Statistical Methods

Office Address

The University of Texas MD Anderson Cancer Center
Department of Biostatistics
1400 Pressler Street
Unit Number: 1411
Houston, TX 77030
Room Number: FCT4.6040
Phone: (713) 794-4155
Fax: (713) 563-4243

Education & Training

Degree-Granting Education

1990 Stanford University, Stanford, CA, Ph.D., Statistics
1985 Stanford University, Stanford, CA, M.S., Statistics
1983 University of Queensland, Brisbane, Australia, B.Sc., First Class Honors, Mathematics and Computer Science

Honors and Awards

2013 Texas 4000 Distinguished Professorship
2012 Elected Member, International Statistical Institute
2006 Fellow, American Statistical Association
2005 Fellow, Royal Statistical Society
2003 Faculty Scholar Award, University of Texas M.D. Anderson Cancer Center
1994 Australian Academy of Science Travel Award
1983 Amy R. Hughes Award, Australian Federation of University Women, Australia
1982 Caltex Woman Graduate of the Year, Postgraduate Scholarship, University of Queensland

Professional Memberships

American Statistical Association, Houston Chapter (HACASA), Houston, TX
President, 2003-2004
President-Elect, 2002-2003

Selected Publications

Peer-Reviewed Original Research Articles

1. Chekouo T, Stingo FC, Doecke JD, Do K-A. Incorporating microRNA regulatory network and pathways: A Bayesian graphical approach to the selection of miRNAs and genes with censored outcomes. Biometrics. In Press.
2. Gonzalez-Angulo AM, Akcakanat A, Liu S, Green MC, Murray JL, Chen H, Palla SL, Koenig KB, Brewster AM, Valero V, Ibrahim NK, Moulder-Thompson S, Litton JK, Tarco E, Moore J, Flores P, Crawford D, Dryden MJ, Symmans WF, Sahin A, Giordano SH, Pusztai L, Do K-A, Mills GB, Hortobagyi GN, Meric-Bernstam F. Open label randomized clinical trial of standard neoadjuvant chemotherapy with paclitaxel followed by FEC vs. the combination of paclitaxel and everolimus followed by FEC in women with triple receptor-negative breast cancer. Ann Oncol. In Press.
3. Ha MJ, Baladandayuthapani V, Do K-A. Prognostic gene signature identification using causal structural learning: applications in kidney cancer. Cancer Informatics. In Press.
4. Ha MJ, Baladandayuthapani V, Do KA. DINGO: Differential Network Analysis in Genomics. Bioinformatics. e-Pub 7/6/2015. PMID: 26148744.
5. Wang Y, Hobbs BP, Hu J, Ng CS, Do KA. Predictive classification of correlated targets with application to detection of metastatic cancer using functional CT imaging. Biometrics. e-Pub 4/7/2015. PMID: 25851056.
6. Rembach A, Stingo FC, Peterson C, Vannucci M, Do KA, Wilson WJ, Macaulay SL, Ryan TM, Martins RN, Ames D, Masters CL, Doecke JD, AIBL Research Group. Bayesian graphical network analyses reveal complex biological interactions specific to Alzheimer's disease. J Alzheimers Dis 44(3):917-25, 2015. PMCID: PMC4499459.
7. Tam CS, O'Brien S, Plunkett W, Wierda W, Ferrajoli A, Wang X, Do KA, Cortes J, Khouri I, Kantarjian H, Lerner S, Keating MJ. Life After FCR: Outcomes of patients with chronic lymphocytic leukemia who progress after frontline treatment with Fludarabine, Cyclophosphamide and Rituximab. Blood 124(20):3059-64, 11/13/2014. e-Pub 10/3/2014. PMCID: PMC4231417.
8. Hassan B, Akcakanat A, Sangai T, Evans KW, Adkins F, Eterovic AK, Zhao H, Chen K, Chen H, Do KA, Xie SM, Holder AM, Naing A, Mills GB, Meric-Bernstam F. Catalytic mTOR inhibitors can overcome intrinsic and acquired resistance to allosteric mTOR inhibitors. Oncotarget 5(18):8544-57, 9/30/2014. e-Pub 8/10/2014. PMCID: PMC4226703.
9. Pande M, Bondy ML, Do KA, Sahin AA, Ying J, Mills GB, Thompson PA, Brewster AM. Association between germline single nucleotide polymorphisms in the PI3K-AKT-mTOR pathway, obesity, and breast cancer disease-free survival. Breast Cancer Res Treat 147(2):381-7, 9/2014. e-Pub 8/10/2014. PMCID: PMC4174407.
10. Zhang L, Baladandayuthapani V, Mallick BK, Manyam GC, Thompson PA, Bondy ML, Do KA. Bayesian hierarchical structured variable selection methods with application to MIP studies in breast cancer. J R Stat Soc Ser C Appl Stat 63(4):595-620, 8/2014. PMCID: PMC4334391.
11. Meric-Bernstam F, Akcakanat A, Chen H, Sahin A, Tarco E, Carkaci S, Adrada BE, Singh G, Do KA, Garces ZM, Mittendorf E, Babiera G, Bedrosian I, Hwang R, Krishnamurthy S, Symmans WF, Gonzalez-Angulo AM, Mills GB. Influence of Biospecimen Variables on Proteomic Biomarkers in Breast Cancer. Clin Cancer Res 20(14):3870-83, 7/15/2014. e-Pub 6/3/2014. PMCID: PMC4112583.
12. Blanco E, Sangai T, Wu S, Hsiao A, Ruiz-Esparza GU, Gonzalez-Delgado CA, Cara FE, Granados-Principal S, Evans KW, Akcakanat A, Wang Y, Do KA, Meric-Bernstam F, Ferrari M. Colocalized delivery of rapamycin and paclitaxel to tumors enhances synergistic targeting of the PI3K/Akt/mTOR pathway. Mol Ther 22(7):1310-9, 7/2014. e-Pub 2/26/2014. PMCID: PMC4088997.
13. Doecke JD, Chekouo T, Stingo FC, Do K-A. miRNA target gene identification: sourcing miRNA-target gene relationships for the analyses of TCGA Illumina miSeq and RNA-Seq Hiseq platform data. International Journal of Human Genetics 14(1):17-22, 4/2014.
14. Chavez-Macgregor M, Liu S, De Melo-Gagliato D, Chen H, Do KA, Pusztai L, Fraser Symmans W, Nair L, Hortobagyi GN, Mills GB, Meric-Bernstam F, Gonzalez-Angulo AM. Differences in gene and protein expression and the effects of race/ethnicity on breast cancer subtypes. Cancer Epidemiol Biomarkers Prev 23(2):316-23, 2/2014. e-Pub 12/2/2013. PMCID: PMC3946290.
15. Liu Y, Zhou R, Baumbusch LO, Tsavachidis S, Brewster AM, Do KA, Sahin A, Hortobagyi GN, Taube JH, Mani SA, Aarøe J, Wärnberg F, Børresen-Dale AL, Mills GB, Thompson PA, Bondy ML. Genomic copy number imbalances associated with bone and non-bone metastasis of early-stage breast cancer. Breast Cancer Res Treat 143(1):189-201, 1/2014. e-Pub 12/4/2013. PMCID: PMC3993091.
16. Shirazi F, Farmakiotis D, Yan Y, Albert N, Do KA, Kontoyiannis DP. Diet modification and Metformin have a beneficial effect in a model of obesity and mucormycosis. PLoS One 9(9):e108635, 2014. e-Pub 9/30/2014. PMCID: PMC4182538.
17. Gorlov IP, Yang JY, Byun J, Logothetis C, Gorlova OY, Do KA, Amos C. How to get the most from microarray data: advice from reverse genomics. BMC Genomics 15:223, 2014. e-Pub 3/21/2014. PMCID: PMC3997969.
18. León-Novelo LG, Müller P, Arap W, Sun J, Pasqualini R, Do KA. Bayesian decision theoretic multiple comparison procedures: an application to phage display data. Biom J 55(3):478-89, 5/2013. e-Pub 12/20/2012. PMCID: PMC3840910.
19. León-Novelo LG, Müller P, Arap W, Kolonin M, Sun J, Pasqualini R, Do KA. Semiparametric Bayesian Inference for Phage Display Data. Biometrics 69(1):174-83, 3/2013. e-Pub 1/22/2013. PMCID: PMC3622196.
20. Wang W, Baladandayuthapani V, Morris JS, Broom BM, Manyam G, Do KA. iBAG: integrative Bayesian analysis of high-dimensional multi-platform genomics data. Bioinformatics 29(2):149-59, 1/15/2013. e-Pub 11/9/2012. PMCID: PMC3546799.
21. Wang W, Baladandayuthapani V, Holmes CC, Do KA. Integrative network-based Bayesian analysis of diverse genomics data. BMC Bioinformatics 14 Suppl 13:S8, 2013. e-Pub 10/1/2013. PMCID: PMC3849715.
22. Bonato V, Baladandayuthapani V, Broom BM, Sulman EP, Aldape KD, Do KA. Bayesian ensemble methods for survival prediction in gene expression data. Bioinformatics 27(3):359-67, 2/2011. e-Pub 12/2010. PMCID: PMC3031034.
23. Zhang S, Mueller P, Do KA. A Bayesian Semiparametric Survival Model with Longitudinal Markers. Biometrics 66(2):435-43, 6/2010. e-Pub 6/2009. PMCID: PMC3045702.
24. Brewster AM, Do KA, Thompson PA, Hahn KM, Sahin AA, Cao Y, Stewart MM, Murray JL, Hortobagyi GN, Bondy ML. Relationship between epidemiologic risk factors and breast cancer recurrence. J Clin Oncol 25(28). e-Pub 9/2007. PMID: 17785707.
25. Ji Y, Yin G, Tsui K-W, Kolonin MG, Sun J, Arap W, Pasqualini R, Do K-A. Bayesian mixture models for complex high dimensional count data in phage display experiments. Journal of the Royal Statistical Society: Series C (Applied Statistics) 56(2):139-52, 3/2007.
26. Do K-A, McLachlan GJ, Bean R, Wen S. Gene shaving versus mixture models for the clustering of microarray gene expression data. Cancer Informatics 2:25-43, 2007.
27. Kim SJ, Uehara H, Yazici S, Busby JE, Nakamura T, He J, Maya M, Logothetis C, Mathew P, Wang X, Do KA, Fan D, Fidler IJ. Targeting platelet-derived growth factor receptor on endothelial cells of multidrug-resistant prostate cancer. J Natl Cancer Inst 98(11):783-93, 6/2006. PMID: 16757703.
28. Do K-A, Mueller P, Tang F. A Bayesian mixture model for differential gene expression. Journal of the Royal Statistical Society, Series C-Applied Statistics 54(3):1-18, 2005.
29. Do KA, Johnson MM, Lee JJ, Wu XF, Dong Q, Hong WK, Khuri FR, Spitz MR. Longitudinal study of smoking patterns in relation to the development of smoking-related secondary primary tumors in patients with upper aerodigestive tract malignancies. Cancer 101(12):2837-42, 12/2004. PMID: 15536619.
30. Do K-A, Green A, Guthrie JR, Dudley EC, Burger HG, Dennerstein L. Longitudinal study of risk factors for coronary heart disease across the menopausal transition. Am J Epidemiol 151:584-93, 3/2000. PMID: 10733040.
31. Do K-A, Kirk K. Discriminant analysis of event-related potential curves using smoothed principal components. Biometrics 55:174-81, 3/1999. PMID: 11318152.
32. Wood ATA, Do K-A, Broom BM. Sequential linearization of empirical likelihood constraints with application to U-statistics. Journal of Computational and Graphical Statistics 5:365-85, 1996.
33. Booth JG, Do K-A. Simple and efficient methods for constructing bootstrap confidence intervals. Computational Statistics 8:333-46, 1994.
34. Do K-A, Hall P. Distribution estimation using concomitants of order statistics, with application to Monte Carlo simulation for the bootstrap. Journal of the Royal Statistical Society, Series B 54:1-14, 1992.
35. Do K-A, Hall P. On importance sampling for the bootstrap. Biometrika 78:161-167, 1991.
36. Do K-A, McLachlan GJ. Estimation of mixing proportions: a case study. Applied Statistics 33:134-40, 1984.

Book Chapters

1. Wang W, Baladandayuthapani V, Broom BM, Do K-A. Bayesian graphical models for integrating multi-platform genomics data. In: Advances in Statistical Bioinformatics: Models and Integrative Inference for High-Throughput Data. Cambridge University Press, 2013.
2. Broom BM, Do K-A, Bondy M, Thompson P, Coombes K. Methods for the analysis of copy number data in cancer research. In: Advances in Statistical Bioinformatics: Models and Integrative Inference for High-Throughput Data. Cambridge University Press, 2013.

Books (edited and written)

1. Do K-A, Qin Z, Vannucci M. Advances in Statistical Bioinformatics: Models and Integrative Inference for High-Throughput Data. Cambridge University Press, 2013.
2. Do K-A, Müller P, Vannucci M. Bayesian Inference for Gene Expression and Proteomics. Cambridge University Press, 2006. ISBN: 052186092X.
3. McLachlan GJ, Do K-A, Ambroise C. Analyzing Microarray Gene Expression Data. In: Wiley Series in Probability and Statistics. Wiley-Interscience: New Jersey, 2004. ISBN: 0471226165.

Grant & Contract Support

Title: MD Anderson Cancer Center Prostate Cancer SPORE
Funding Source: NIH/NCI
Role: Director
Principal Investigator: Christopher Logothetis
Duration: 4/1/2016 - 3/31/2021

Title: (PQ5) Role of ATM in mitochondrial pathogenesis of mantle cell lymphoma
Funding Source: NIH/NCI
Role: Co-Investigator
Principal Investigator: Varsha Gandhi
Duration: 3/1/2016 - 2/28/2021

Title: Overcoming Aggressive Variant Prostate Cancer by DNA Damage Response-Targeted Therapy
Funding Source: Movember-Prostate Cancer Foundation
Role: Statistician
Principal Investigator: Timothy C. Thompson
Duration: 9/1/2015 - 8/31/2017

Title: Defining and treating targetable lesions in AYA acute lymphoblastic leukemia
Funding Source: Cancer Prevention & Research Institute of Texas (CPRIT)
Role: Collaborator
Principal Investigator: Marina Konopleva
Duration: 3/1/2015 - 2/28/2019

Title: MD Anderson Cancer Center Prostate Cancer SPORE
Funding Source: MDACC
Role: Project Leader
Principal Investigator: Christopher Logothetis
Duration: 3/1/2015 - 11/30/2015

Title: Prostate Cancer Moonshot Program
Funding Source: MD Anderson Cancer Center Moonshot Flagships
Role: Biostatistics Director
Duration: 9/14/2014 - 8/31/2015

Title: CLL Moonshot Program
Funding Source: MD Anderson Cancer Center Moonshot Flagship
Role: Biostatistics Director
Duration: 9/1/2014 - 8/31/2015

Title: MDS/AML Moonshot Program
Funding Source: MD Anderson Cancer Center Moonshot
Role: Biostatistics Director
Duration: 6/1/2014 - 5/31/2015

Title: Optimizing Akt/mTOR targeted breast cancer therapy
Funding Source: Komen
Role: Biostatistician
Principal Investigator: Funda Meric-Bernstam
Duration: 11/3/2013 - 11/2/2015

Title: Sapacitabine therapy to create synthetic lethality in DNA repair-deficient CLL
Funding Source: NIH/NCI
Role: Statistician
Principal Investigator: William Wierda
Duration: 8/1/2012 - 7/30/2015

Title: Center for Clinical and Translational Sciences (PP-2)
Funding Source: NIH/NCI (Subcontract from the UT Health Science Center - Houston)
Role: Collaborator - Statistical Leader
Principal Investigator: David McPherson
Duration: 7/1/2012 - 6/30/2017

Title: Towards Personalized Therapy of Resistant Triple Negative Breast Cancer
Funding Source: American Cancer Society (ACS)
Role: Investigator
Principal Investigator: Naoto Ueno
Duration: 7/1/2011 - 6/30/2015

Title: M D Anderson Cancer Center Prostate SPORE (PC-B)
Funding Source: NIH/NCI
Role: Core Director
Principal Investigator: Christopher J. Logothetis
Duration: 9/2/2009 - 8/31/2014

Title: Ethnic differences in the mutational status of the PI3K pathway and breast cancer outcome
Funding Source: Susan G. Komen Breast Cancer Foundation
Role: Statistician
Principal Investigator: Abenaa Brewster
Duration: 8/3/2009 - 8/2/2012

Title: SPORE in Brain Cancer (PP-3A)
Funding Source: NIH/NCI
Role: Co-Investigator
Principal Investigator: W. K. Alfred Yung
Duration: 9/1/2008 - 8/31/2013

Title: UT Health Science Center - Center for Clinical and Translational Research - (PP-4)
Funding Source: NIH/NCRR (Subcontract from The University of Texas Health Science Center)
Role: Biostatistician
Principal Investigator: David McPherson
Duration: 7/1/2007 - 6/30/2013

Title: UTMDACC SPORE in Breast Cancer (PC-B)
Funding Source: NIH/NCI
Role: Investigator
Principal Investigator: Gabriel Hortobagyi
Duration: 9/23/2005 - 8/31/2011

Title: Cancer Center Support Grant, Biostatistics Shared Resource (Biostatistics Resource Group (BRG))
Funding Source: NIH/NCI
Role: Statistician
Principal Investigator: Ronald DePinho
Duration: 9/4/1998 - 6/30/2018

Last updated: 8/18/2015