Data Scientist
2020 - PRESENT
Toptal
- Implemented liquid chromatography-mass spectrometry (LC-MS) data processing and machine learning models to detect targeted proteins in blood samples, facilitating faster disease diagnosis.
- Researched time-domain/frequency-domain signal processing and machine learning algorithms for fall detection based on biomedical signal data collected from wearable sensors using Python and Scikit-learn.
- Provided advice on data science use cases and data collection protocols for machine learning model development/evaluation.
Technologies: SciPy, Jupyter Notebook, Scikit-learn, Python

Machine Learning Engineer (Computer Vision)
2020 - PRESENT
Fugro
- Developed deep learning object detection models and object tracking prototypes with Python and TensorFlow on Amazon SageMaker to detect and track traffic objects, reducing manual processing cost by 50%.
- Architected a data processing workflow with multiple parallel compute jobs using AWS Batch and Step Functions, reducing processing time tenfold.
- Implemented computer vision algorithms in C++ and OpenCV to improve automated pavement distress extraction.
- Improved the data processing pipeline and implemented bundle adjustment to estimate a road object’s GPS coordinates based on the vehicle’s position/orientation, inertial measurement unit (IMU) data, and camera intrinsic/extrinsic parameters.
- Standardized the team’s DevOps and MLOps processes with Jenkins, Docker, and MLflow.
Technologies: Amazon Web Services (AWS), Amazon SageMaker, Object Detection, Computer Vision, MLflow, Docker, Jenkins, OpenCV, Python, C++

Senior Machine Learning Engineer
2018 - 2019
Terramera
- Researched deep learning models for object detection (Mask R-CNN, U-Net) using Python and TensorFlow to automate pest counting processes.
- Implemented multispectral image processing pipelines for plant health evaluation using computer vision and machine learning techniques in Python, OpenCV, and Scikit-learn.
- Developed a multispectral imaging prototype and implemented machine learning models to estimate grape sugar content based on multispectral reflectance, in Python, OpenCV, and Scikit-learn.
- Implemented machine learning models for drug dose-response modeling, drug synergy analysis, and prediction using cheminformatics and machine learning libraries (Scikit-learn, RDKit) to accelerate the drug discovery process in plant health research.
- Provided advice on experimental design and implemented statistical analysis pipelines with Python, Rpy2, and StatsModels to automate various statistical analyses for plant health research.
Technologies: Keras, TensorFlow, OpenCV, Scikit-learn, Python

Data Scientist
2017 - 2017
Phemi Systems
- Developed distributed data processing and analytics prototypes using Spark (Scala), Hive, and Zeppelin to demonstrate fast query and analytics on terabytes of clinical data.
- Proposed machine learning and deep learning demos using Python, Scikit-learn, and TensorFlow to show how medical imaging data can be analyzed to support diagnosis.
- Implemented a natural language processing pipeline in Scala and Apache cTAKES, a library that combines rule-based and machine learning techniques, to extract clinical information from unstructured medical text.
- Developed a term-partitioned index mechanism in Java to enable fast document search in Accumulo.
Technologies: Scikit-learn, SciPy, Python, Scala, Zeppelin, Apache Hive, Spark, Hadoop

Data Analytics Engineer
2012 - 2016
DBS
- Developed SAS code to extract data from Teradata SQL databases and perform statistical analysis for card and unsecured lending sales and marketing.
- Liaised with the modeling team to deploy predictive models (recommender system, location analytics) for targeted marketing, leading to a twofold lift in the customer response rate in digital campaigns.
- Proposed experimental design and hypothesis testing on different content factors to improve customer engagement in email marketing.
- Developed Java analysis reports and QlikView dashboards to give technology and operations teams insight into process improvement and risk control.
- Performed process mapping and simulation modeling to optimize business processes, resulting in a 10% reduction in operating costs.
- Led infrastructure design and deployment for an enterprise data management system.
Technologies: QlikView, SQL, SAS, Java

Business Intelligence Developer
2011 - 2011
Hutcabb Consulting
- Implemented inventory modeling and demand forecasting functions in a decision support system with Java Servlets to facilitate optimal inventory decision-making.
- Supported infrastructure design and deployment for the decision support system.
- Streamlined data processing from multiple data sources by developing an extract, transform, and load (ETL) pipeline with Microsoft SQL Server Integration Services (SSIS).
Technologies: SQL Server Integration Services (SSIS), SQL, Java