Resume
Contact
E-mail: xli74@ncsu.edu
Summary
Ph.D. candidate in Chemistry (expected graduation: Nov. 2020). Expertise in applying machine learning and cheminformatics to chemistry problems such as QSAR modeling. Well-versed in Python and R and in machine learning toolkits such as PyTorch and Scikit-Learn. Proficient in chemical data mining, curation, analysis, visualization, and modeling.
Research Interests
- Cheminformatics
- QSAR/QSPR Modeling
- Molecular Representation Learning (SMILES-based and Graph-based approaches)
- Machine Learning/Deep Learning
- Big Data Mining, Curation, Analysis, Visualization, and Modeling
- Virtual Screening
- Natural Language Processing
Research
- Development of Novel Quantitative Structure-Property/Activity Relationship (QSPR/QSAR) Modeling Methodologies:
- MolPMoFiT: An inductive transfer learning method for QSPR/QSAR modeling. MolPMoFiT pre-trains a universal molecular structure prediction model on one million bioactive molecules from ChEMBL and then fine-tunes it for various QSPR/QSAR tasks.
- Hierarchical QSAR: An effective ensemble modeling method that integrates binary classification, multiclass classification, and regression models for predicting acute oral systemic toxicity.
- Python packages:
- SMILES Pair Encoding (SPE): a data-driven substructure tokenization algorithm for deep learning.
- Molecular DataSets (MolDS): A toolkit (Python Package) for curating, standardizing and diagnosing Molecular Data Sets for benchmarking machine learning methods.
- Software Development:
- CryptoChem: A novel cryptographic and data storage method based on cheminformatics, machine learning, and big chemical data. Two independent programs were developed: MOLWRITE and MOLREAD. MOLWRITE encrypts text/image data into a chemical message, and MOLREAD decrypts the encoded chemical message back to text/image.
Education
- PhD in Chemistry, North Carolina State University, Raleigh, NC, 2017 – 2020
- MS in Chemistry, Beijing University of Chemical Technology, Beijing, China, 2013 – 2016
- BS in Chemistry, Beijing University of Chemical Technology, Beijing, China, 2009 – 2013
Skills
- Programming Toolkits: Python, R, Git
- Cheminformatics Toolkits: KNIME, RDKit, Schrödinger, ChemAxon
- Machine Learning Toolkits: PyTorch, Keras, Scikit-Learn, Streamlit, Jupyter Notebook
Data Science Related Courses
- deeplearning.ai: Deep Learning Specialization
- Algorithmic Toolbox (Coursera)
Publications
- Xinhao Li, Nicole Kleinstreuer and Denis Fourches. Hierarchical Quantitative Structure–Activity Relationship Modeling Approach for Integrating Binary, Multiclass, and Regression Models of Acute Oral Systemic Toxicity. Chemical Research in Toxicology. 2020, 33, 353-366.
- Xinhao Li and Denis Fourches. Inductive Transfer Learning for Molecular Activity Prediction: Next-Gen QSAR Models with MolPMoFiT. J Cheminform 2020, 12, 27.
- Xinhao Li and Denis Fourches. SMILES Pair Encoding: A Data-Driven Substructure Tokenization Algorithm for Deep Learning. ChemRxiv 2020.
- Xinhao Li and Jiaxi Xu. (2017): Effects of the Microwave Power on the Microwave-assisted Esterification. Current Microwave Chemistry. 158-162.
- Xinhao Li and Jiaxi Xu. (2017): Identification of Microwave Selective Heating Effort in an Intermolecular Reaction with Hammett Linear Relationship as a Molecular Level Probe. Current Microwave Chemistry. 339-346.
- Xinhao Li and Jiaxi Xu. (2016): Determination on temperature gradient of different polar reactants in reaction mixture under microwave irradiation with molecular probe. Tetrahedron. 35, 5515-5520.
- Shanyan Mo, Xinhao Li and Jiaxi Xu. (2014): In Situ-Generated Iodonium Ylides as Safe Carbene Precursors for the Chemoselective Intramolecular Buchner Reaction. J. Org. Chem. 19, 9186-9195.
Presentations
- NC State Chemistry Recruitment Weekend (March 2019, Raleigh, NC) Xinhao Li, Denis Fourches. Hierarchical H-QSAR Modeling Method that Integrates Binary/Multi Classification and Regression Models for Predicting Acute Oral Systemic Toxicity. (Poster)
- American Chemical Society Conference (April 2019, Orlando, FL) Xinhao Li, Denis Fourches. Hierarchical H-QSAR Modeling Method that Integrates Binary/Multi Classification and Regression Models for Predicting Acute Oral Systemic Toxicity. (Poster)
- “Innovations in Agriculture” Scientific Poster Session at BASF (May 2019, RTP, NC) Xinhao Li, Denis Fourches. Hierarchical H-QSAR Modeling Method that Integrates Binary/Multi Classification and Regression Models for Predicting Acute Oral Systemic Toxicity. (Poster)
- NCSU/BASF Poster Session (Aug 2019, Raleigh, NC) Xinhao Li, Denis Fourches. Transfer Learning for Molecular Property/Activity Prediction. (Poster)
- Triangle Machine Learning Day (Sep 2019, Durham, NC) Xinhao Li, Denis Fourches. Transfer Learning for Molecular Property/Activity Prediction. (Oral Presentation & Poster)
- AI Powered Drug Discovery and Manufacturing (Feb 2020, MIT, Cambridge, MA) Xinhao Li, Denis Fourches. Inductive Transfer Learning for Molecular Activity Prediction. (Poster)
Awards
- CINF Scholarship for Scientific Excellence awarded by ACS CINF (2019 Spring)