Justin Cosentino

📍 PHL
✉️ name.first at name.last dot io

about

I’m a research engineer on the Google Health AI Genomics team, working to better understand the genetic basis of disease and exploring how to incorporate multimodal health data into LLMs. I graduated with a Master’s in Computer Science from Tsinghua University, where I was supervised by Professor Jun Zhu in the Tsinghua Statistical Artificial Intelligence and Learning Group.

Previously, I was a research intern in Uber’s Advanced Technologies Group under the supervision of Professor Raquel Urtasun and an intern with the Google Brain Genomics team. Before Tsinghua, I was a Senior Software Engineer working on Search at Salesforce. I studied Computer Science at Swarthmore College.

research

My current research focuses on the application of machine learning techniques to genomics. While at Tsinghua, I focused on uncertainty and robustness in deep learning.

journal publications

Inference of chronic obstructive pulmonary disease with deep learning on raw spirograms identifies novel genetic loci and improves risk models
Liability scores for COPD obtained from our deep learning model improve genetic association discovery and risk prediction. We trained our model using full spirograms and noisy medical record labels obtained from self-reporting and hospital diagnostic codes, and demonstrated that the machine-learning-based phenotyping approach can be generalized to diseases that lack expert-defined annotations.
{Justin Cosentino, Babak Behsaz, Babak Alipanahi, Zachary R. McCaw}, Davin Hill, Tae-Hwi Schwantes-An, Dongbing Lai, Andrew Carroll, Brian D. Hobbs, Michael H. Cho, Cory Y. McLean, and Farhad Hormozdiari.
Nature Genetics, 2023.
[ paper ] [ research briefing ] [ code ] [ bibtex ]

Large-scale machine learning-based phenotyping significantly improves genomic discovery for optic nerve head morphology
We use machine learning to generate more accurate phenotypes, leading to the discovery of novel loci.
{Babak Alipanahi, Farhad Hormozdiari, Babak Behsaz, Justin Cosentino, Zachary R. McCaw}, Emanuel Schorsch, D. Sculley, Elizabeth H. Dorfman, Paul J. Foster, Lily H. Peng, Sonia Phene, Naama Hammel, Andrew Carroll, and {Anthony P. Khawaja, Cory Y. McLean}.
American Journal of Human Genetics (AJHG), 2021.
[ paper ] [ code ] [ bibtex ]

conference proceedings

Generative Well-intentioned Networks
A novel framework for leveraging uncertainty and rejection-based classifiers.
Justin Cosentino and Jun Zhu.
Neural Information Processing Systems (NeurIPS), 2019.
[ paper ] [ poster ] [ slides ] [ bibtex ]

workshop papers and preprints

Multimodal LLMs for health grounded in individual-specific data

{Anastasiya Belyaeva, Justin Cosentino}, Farhad Hormozdiari, and {Cory Y. McLean, Nicholas A. Furlotte}.
Machine Learning for Multimodal Healthcare Data @ ICML (Oral Presentation), 2023.
[ paper ] [ bibtex ]

Unsupervised representation learning improves genomic discovery for lung function and respiratory disease prediction
We introduce a general deep learning framework, REpresentation learning for Genetic discovery on Low-dimensional Embeddings (REGLE), for discovering associations between genetic variants and high-dimensional clinical data.
Taedong Yun, Justin Cosentino, Babak Behsaz, Zachary R. McCaw, Davin Hill, Robert Luben, Dongbing Lai, John Bates, Howard Yang, Tae-Hwi Schwantes-An, Anthony P. Khawaja, Andrew Carroll, Brian D. Hobbs, Michael H. Cho, Cory Y. McLean, and Farhad Hormozdiari.
medRxiv, 2023.
[ paper ] [ code ] [ bibtex ]

An Empirical Study of ML-based Phenotyping and Denoising for Improved Genomic Discovery
Using synthetic noisy VCDR-based phenotypes, we show that the ML-based phenotyping procedure recovers underlying liability scores across noise levels, significantly improving genetic discovery and PRS predictive power relative to noisy equivalents.
Bo Yuan, Cory Y. McLean, Farhad Hormozdiari, and Justin Cosentino.
Learning Meaningful Representations of Life Workshop @ NeurIPS, 2022.
[ paper ] [ poster ] [ code ] [ bibtex ]

The Search for Sparse, Robust Neural Networks
Showing that winning Lottery Tickets account for the robustness of a network.
{Justin Cosentino, Federico Zaiter}, and {Dan Pei, Jun Zhu}.
Safety and Robustness in Decision Making Workshop @ NeurIPS, 2019.
[ paper ] [ poster ] [ code ] [ bibtex ]

Last updated in Jan, 2024.