Data Science Contractor | 2020 - Present | AstraZeneca
Technologies: Python 3, Bash Script, Data Science, Machine Learning, Natural Language Processing (NLP), Scikit-learn, Keras, TensorFlow, Streamlit, NGINX, Python, Data Analysis, Spotfire, Flask, Git, Data Visualization
- Developed a machine learning workflow to leverage and interpret genetic data. This included parsing and preprocessing patient data, normalization, dimensionality reduction, statistical tests, and supervised analysis.
- Created a natural language processing solution for mining biomedical literature. The data was stored in an Elasticsearch database, cleaned, tokenized using the Natural Language Toolkit (NLTK), vectorized, and then used in a text classification framework.
- Built dashboards and UIs using Streamlit in Python and deployed them with NGINX.
Data Science Contractor | 2019 - 2020 | Arm
Technologies: Python 3, Scikit-learn, Keras, TensorFlow, Generative Adversarial Networks (GANs), Bash, Jenkins, Git, Slurm, GitHub, Python, Deep Learning, Genetic Algorithms, Numerical Methods, Convex Optimization, Data Visualization
- Built a machine learning framework for maximizing coverage in CPU verification. Developed in Python and deployed on HPC using the Slurm Workload Manager.
- Developed adversarial learning workflows using GANs, implemented in Python with Keras.
- Solved numerical optimization problems using a custom genetic algorithm (GA) implementation.
Principal Data Scientist | 2016 - 2019 | UCB Celltech
Technologies: R, Python 3, Spotfire, Linux, H2O, Keras, LSTM, Git, Python, Data Analysis, Data Analytics, Data Science, Machine Learning, Bioinformatics, Genomics, Data Visualization
- Built machine learning workflows to predict patient response to candidate drugs. Developed in R.
- Led a team of three developers to create exploratory analytics solutions and dashboards for visualizing high-dimensional data. Results were pre-calculated in R, then imported into TIBCO Spotfire.
- Designed machine learning solutions to predict drug activity in assays. Used LSTMs to model chemical structures as free text and applied methods from text classification.
Postdoctoral Research Fellow | 2014 - 2016 | U.S. Food & Drug Administration
Technologies: R, Linux, C, Slurm, Linear Optimization, NetworkX, Bioinformatics, Genomics, Drug Development, Python, Data Science, Data Analytics
- Developed a solution for predicting adverse drug events from the drugs' transcriptomic profiles.
- Created a linear programming formulation to model the structure of directed graphs.
- Applied a solution to predict the adverse effects of new compounds.