Big Data, AI Developer
2019 - PRESENT
Financial Industry Regulatory Authority
- Developed the Consolidated Audit Trail (CAT), the largest financial regulatory database in the world, handling up to 400 billion records per trading day.
- Developed and implemented a graph-based algorithm to link all market events and track their life cycles at the scale of billions of records using Spark and AWS.
- Created an end-to-end graph-based analytics solution for recommendation and fraud detection, and an end-to-end people analytics recommendation system using machine learning.
Technologies: Amazon Web Services (AWS), TensorFlow, PyTorch, Sklearn, Python, PySpark, Scala

Senior Data Scientist
2018 - 2019
GEICO
- Built a state-of-the-art end-to-end machine learning solution for the second-largest insurance company, serving 17 million customers.
- Delivered an end-to-end machine learning tracking and verification pipeline using blockchain for better machine learning model lifecycle management.
- Oversaw model deployment and designed an integrated pipeline for continuously monitoring model performance and online learning.
Technologies: Azure, PySpark, Deep Learning, XGBoost, Sklearn, Python

Data Scientist
2017 - 2018
IHS Markit
- Drove a cultural change in engineering, encouraging the advanced analytics team to experiment with and adopt more efficient analysis methodologies and tools.
- Collaborated with the energy and maritime team to develop creative analytic solutions to their unique business challenges.
- Streamlined the data mining process and standardized the methodologies for sharing and validating analyses. Automated the daily data analysis pipeline, SQL search, and R code review with web-based applications.
- Designed and experimented with a range of machine learning models for predicting oil prices and major financial events, including ARIMA, VAR, state-space models, regression, neural networks, random forests, elastic net, RBM, and similar methods.
- Translated billions of maritime trip records into business insight through pattern recognition and modeling on AWS.
- Provided in-team technical assistance and knowledge-sharing on best machine learning and coding practices.
Technologies: D3.js, Plotly, Dash, Machine Learning, Python, R

Operational Storm Surge Model Developer
2015 - 2017
NOAA: National Oceanic & Atmospheric Administration
- Built a national hurricane database and performed category analysis.
- Developed and maintained risk scoring for regions with different levels of flooding risk.
- Designed, developed, implemented, and validated a deterministic and ensemble storm surge model for the North Atlantic Ocean.
- Developed statistical metrics and visualizations in Python for evaluating model performance.
- Designed an algorithm, coded in Perl and shell scripts, to deploy an operational storm surge model on Unix cloud clusters.
- Delivered a Python-based open-source library for automatically generating model grids, pre-processing inputs, and post-analyzing model results.
- Developed signal processing algorithms for short- and long-term water level time series using statistical methods, including Fourier transform, PCA, multivariate dimensional analysis, and regression analysis.
Technologies: ShellScript, Linux, Fortran, MATLAB, Python, Microsoft HPC

Numerical Modeler and Data Scientist
2012 - 2015
Environmental Resource Management
- Developed and quantitatively validated coupled four-dimensional numerical coastal ocean and water quality models for global oceans.
- Designed algorithms for four-dimensional fluid dynamic models and deployed them for various water bodies, from ponds and rivers to ocean waters.
- Worked on international projects for the oil & gas, mining, and hydropower industries, using hydrodynamic and environmental models and data analytics tools to assess their impact on the receiving environment.
- Deployed a sophisticated four-dimensional operational hydrodynamic modeling system for the Bohai Sea (www.euler-tech.com) using Java, JavaScript, PHP, HTML5, SQL, and Amazon EC2.
Technologies: Fortran, ShellScript, Python, MATLAB