Senior Machine Learning Engineer | 2018 - PRESENT | Terramera
Technologies: Python, Scikit-learn, OpenCV, TensorFlow, Keras, RDKit
- Researched deep learning models for object detection (Mask-RCNN, U-Net) using Python and TensorFlow to automate pest counting processes.
- Implemented multispectral image processing pipelines for plant health evaluation using computer vision and machine learning techniques in Python, OpenCV, and Scikit-learn, including geometric transformations and keypoint descriptors for image alignment and stitching, stereo vision for depth estimation, color thresholding and watershed for segmentation, and regression and tree-based models for plant trait estimation.
- Developed a multispectral imaging prototype and implemented machine learning models to estimate grape sugar content based on multispectral reflectance, in Python, OpenCV, and Scikit-learn.
- Implemented machine learning models for drug dose-response modeling, drug synergy analysis and prediction using cheminformatics and machine learning libraries (Scikit-learn, RDKit, PubChemPy) to accelerate the drug discovery process in plant health research.
- Provided advice on experimental design and implemented statistical analysis pipelines with Python, Rpy2, and StatsModels to automate various statistical analyses for plant health research.
Data Scientist | 2017 - 2017 | Phemi Systems
Technologies: Hadoop, Spark, Hive, Zeppelin, Scala, Python, SciPy, Scikit-learn
- Developed distributed data processing and analytics prototypes using Spark (Scala), Hive, and Zeppelin to demonstrate fast querying and analytics on terabytes of clinical data.
- Proposed machine learning and deep learning demos using Python, Scikit-learn, and TensorFlow to show how medical imaging data can be analyzed to support diagnosis.
- Researched time-domain/frequency-domain signal processing and machine learning algorithms for fall detection based on biomedical signal data collected from wearable sensors using Python and Scikit-learn.
- Implemented a natural language processing pipeline in Scala with Apache cTAKES, a library that combines rule-based and machine learning techniques, to extract clinical information from unstructured medical text.
- Developed a term-partitioned index mechanism in Java to enable fast document search in Accumulo.
Data Analytics Engineer | 2012 - 2016 | DBS Bank
Technologies: Java, SAS, SQL, QlikView
- Developed SAS code to extract data from Teradata SQL databases and perform statistical analysis for Card and Unsecured Lending sales and marketing.
- Liaised with the modeling team to deploy predictive models (Recommender System, Location Analytics) for targeted marketing, leading to a twofold lift in the customer response rate in digital campaigns.
- Proposed experimental design and hypothesis testing on different content factors to improve customer engagement in email marketing.
- Developed Java analysis reports and QlikView dashboards to give technology and operations teams insight into process improvement and risk control.
- Performed process mapping and simulation modeling to optimize business processes, resulting in a 10% reduction in operating costs.
- Led infrastructure design and deployment for an enterprise data management system.