Jesse Liu, Developer in Irvine, CA, United States

Verified Expert in Engineering

Data Scientist and Quantitative Software Developer

Location
Irvine, CA, United States
Toptal Member Since
March 14, 2022

Jesse is a machine learning and software engineer with 10+ years of industry experience developing robust solutions for social networks, ads engineering, quantitative investments, and wireless systems. Jesse is passionate about delivering high-quality code and solutions that meet the needs and expectations of his clients. He enjoys working professionally with team members and exploring new ideas and technologies.

Availability

Full-time

Preferred Environment

Linux, Visual Studio Code (VS Code), GitHub

The most amazing...

...projects I've worked on are an ML pipeline for Reddit Ads Engineering, statistical models and software for the GPS chips in Samsung Galaxy devices, and algorithmic trading systems.

Work Experience

Machine Learning Engineer

2022 - PRESENT
Reddit, Inc.
  • Developed, updated, and maintained a feature engineering pipeline for various ads prediction models, including click-through and conversion models, with Google BigQuery, Apache Airflow, and dbt.
  • Updated and maintained the ads inference server, modeling pipeline, and prediction interface with the upstream ads server according to various requirements and design changes.
  • Developed ads modeling performance monitoring and tracking dashboards with Mode.com.
Technologies: SQL, Back-end, Data Engineering, TensorFlow, Distributed Systems, Docker, Amazon Web Services (AWS), Python, Java, GitHub, Apache Airflow, Kubernetes, Machine Learning, Ads, Back-end Development, Go, Google BigQuery, Dashboards, Visual Studio Code (VS Code), Predictive Modeling, Data Pipelines, Software Engineering, Software Development, Statistical Data Analysis, Data Analytics, Data Manipulation

Senior Quantitative Researcher

2022 - 2022
Outremont Technologies
  • Conducted research and statistical data analysis on cryptocurrency markets.
  • Developed quantitative crypto trading models based on Bayesian statistics, machine learning (GBDT) algorithms, and signal filtering techniques.
  • Developed data pipelines and quantitative model evaluation and backtesting systems.
Technologies: Python, C++, Amazon EC2, Predictive Modeling, Machine Learning, ClickHouse, SQL, Pandas, Scikit-learn, TensorFlow, NumPy, Visual Studio Code (VS Code), GitHub, Amazon Web Services (AWS), Data Engineering, Quantitative Finance, Docker, Data Pipelines, Software Engineering, Software Development, Statistical Data Analysis, Signal Processing, Finance, Financial Modeling, Blockchain, Data Analytics, Data Manipulation

Quantitative Researcher | Quantitative Developer

2017 - 2021
iFDC Capital Management, LLC
  • Developed a data pipeline and management dashboard with Python for real-time data streaming, historical database queries, model optimization and monitoring, statistics and indicator generation and reporting, and user front-end visualization.
  • Led the quantitative software development of the event-driven algorithm module for a microservice-structured application. Implemented cost- and timing-optimized automatic execution algorithms and deployed them in the cloud.
  • Developed the software implementation of the statistical arbitrage models and strategies for commodity futures portfolio management, cointegration testing, and dynamic hedging with a Bayesian filtering algorithm.
  • Researched modeling with Conv1D, ResNet-1D, and WaveNet, and fine-tuned pre-trained ResNet-2D models on continuous wavelet transform (CWT) signals to adapt the weights for volatility time series forecasting.
Technologies: Python, NumPy, SciPy, Pandas, Scikit-learn, Keras, TensorFlow, MySQL, Data Science, Deep Learning, Machine Learning, Data Visualization, Predictive Modeling, Visualization, Data Pipelines, Artificial Intelligence (AI), Docker, Statistical Data Analysis, Software Engineering, SQL, Back-end Development, Quantitative Finance, Dashboards, Visual Studio Code (VS Code), GitHub, Data Engineering, C++, Amazon EC2, Software Development, Signal Processing, Finance, Financial Modeling, Stock Analysis, Stock Market, Stock Trading, Data Analytics, Data Manipulation

Staff Systems Engineer

2010 - 2017
Broadcom
  • Developed the statistical models and algorithms for the highly accurate GPS receiver on-chip clock system.
  • Analyzed a large amount of lab data for cellular radio chips and built statistical models to trade off between radio communication metrics. Developed numerical programs with Python to search circuit parameters and optimize chips' performance.
  • Developed a comprehensive set of automation tools in Python for regression characterization. The tools manage test cases, error recovery, lab instrument control, and data reporting.
  • Developed GPS receiver host software and signal processing algorithms in C++, performed log analysis and processing with Python, and tracked software and algorithm defects via Jira.
Technologies: Statistical Modeling, Data Modeling, Time Series Analysis, Signal Processing, Python, Data Visualization, Visualization, Predictive Modeling, MATLAB, Software Engineering, C++, Wireless Systems, Software Development, Statistical Data Analysis, Data Analytics, Data Manipulation

Ads Prediction 10x Features

The project added many more features (about ten times the original number) to the new ads prediction model, which is based on a GBDT architecture and implemented with TensorFlow Decision Forests (TF-DF). I coded BigQuery SQL scripts with dbt templates for the daily roll-up and generation of all the features and for building the training and test datasets. I also updated the GBDT model with bid adjustments in the modeling pipeline. The new features, combined with the enhanced model, greatly improved ads prediction performance and revenue.

Ads Prediction Batch Feature Engineering Pipeline

Migrated the legacy Airflow data pipeline to the batch feature engineering platform built with dbt. The project exploited dbt's parallel processing ability and improved dataset-building efficiency on Google BigQuery for daily feature roll-up, generation, ingestion, and transfer to S3 in TFRecord format.

Ads Prediction Modeling Performance-tracking Dashboard

Refactored the Mode dashboards that track the performance of the ads click-through rate model and the generalized conversion rate model. The BigQuery SQL code was reorganized into an Airflow-managed ETL pipeline, with template DDG experiment filters and version management on GitHub.

Quantitative Cryptocurrency Trading Model

I researched cryptocurrency market data and developed a quantitative trading model based on a Bayesian probabilistic filter and machine learning. I implemented the entire backtesting pipeline in Python, with the core algorithm in C++.

Algorithmic Trading Module of an Asset Management Platform

WORK DONE
• Led the quantitative software development of the event-driven algorithm module for the microservice-structured proprietary asset management cloud application.
• Researched convolutional deep learning models (Conv1D, ResNet, EfficientNet, WaveNet, DenseNet, etc.) applied to CWT-transformed signals for time series forecasting.
• Designed new algorithmic order types to reduce transaction cost and risk, and optimized synchronous leg execution for spread trading to reduce timing risk.
• Developed data pipelines and a portfolio management dashboard for real-time market data streaming, historical database queries, model optimization and monitoring, statistics and indicator generation and reporting, and user front-end visualization.

Languages

Python, SQL, C++, Java, Go

Libraries/APIs

Pandas, Scikit-learn, NumPy, PyTorch, Keras, SciPy, TensorFlow, Matplotlib, REST APIs

Other

Machine Learning, Signal Processing, Quantitative Finance, Wireless Systems, Software Development, Finance, Financial Modeling, Stock Trading, Visualization, Statistics, Statistical Analysis, Software Engineering, Deep Learning, Data Modeling, Statistical Data Analysis, Bokeh, Data Visualization, Predictive Modeling, Artificial Intelligence (AI), Back-end Development, Data Build Tool (dbt), Semiconductors, GPS, MLflow, Machine Learning Operations (MLOps), Data Engineering, Google BigQuery, Dashboards, Stock Analysis, Stock Market, Data Analytics, Data Manipulation, Time Series Analysis, RESTful Microservices, Back-end, Distributed Systems, Ads, Statistical Modeling, Signal Filtering, Dash

Tools

MATLAB, Apache Airflow, GitHub

Paradigms

Data Science, ETL, Microservices

Platforms

Jupyter Notebook, Docker, Visual Studio Code (VS Code), Blockchain, Linux, Amazon EC2, Amazon Web Services (AWS), Kubernetes

Storage

MySQL, Data Pipelines, ClickHouse

Frameworks

Flask

2006 - 2010

Ph.D. in Electrical Engineering

University of California, Riverside - Riverside, California, USA