Jesse Liu, Developer in Irvine, CA, United States

Jesse Liu

Bio

Jesse is a software engineer and machine learning engineer with 10+ years of versatile industry experience developing data science solutions. He has worked in social networks, ad engineering, quantitative investments, and engineering applications. He is passionate about delivering high-quality code and solutions with real business impact, and he consistently meets expectations. Jesse enjoys collaborating professionally with team members and exploring new ideas and technologies.

Portfolio

Jinhui Capital
Predictive Modeling, Python, Data Pipelines
Reddit, Inc.
SQL, Back-end, Data Engineering, TensorFlow, Distributed Systems, Docker...
Outremont Technologies
Python, C++, Amazon EC2, Predictive Modeling, Machine Learning, ClickHouse, SQL...

Experience

  • Software Development - 6 years
  • Python - 5 years
  • C++ - 5 years
  • Machine Learning - 5 years
  • Advertising Technology (Adtech) - 2 years
  • Rankings - 2 years
  • Apache Airflow - 2 years
  • Recommendation Systems - 1 year

Preferred Environment

Visual Studio Code (VS Code), GitHub, Jupyter Notebook

The most amazing...

...projects I've developed are the machine learning systems for Reddit ads prediction and conversion, including feature engineering and model training pipelines.

Work Experience

Quantitative Developer

2024 - 2025
Jinhui Capital
  • Developed enhanced equity indexing strategies that aim to outperform the CSI 300 equity index, targeting excess returns with consistent alpha by leveraging machine learning models with factor tilts, security selection, and derivatives/futures.
  • Developed production trading signals for portfolios with 30+ commodity pairs in the futures market, covering ferrous metals, energy, chemicals, agricultural products, and stock indices.
  • Managed quantitative team recruiting and tech infrastructure development.
Technologies: Predictive Modeling, Python, Data Pipelines

Machine Learning Engineer

2022 - 2023
Reddit, Inc.
  • Developed a model training pipeline for ad prediction models based on gradient-boosting decision trees (GBDTs) and bid adjustment algorithms, implemented using TensorFlow Decision Forests (TF-DF).
  • Built a feature engineering pipeline leveraging Google BigQuery, Apache Airflow, and dbt to handle daily roll-ups of hundreds of terabytes of data, feature generation, ingestion, and training and test dataset creation for ads CPC and CPA cost models.
  • Developed and maintained the prediction API between the ads inference server and upstream requesting server, ensuring seamless integration and responsiveness to model updates, requirements, and design changes.
  • Developed ads ranking model performance monitoring and tracking dashboards, extracting business metrics such as conversion rate, model prediction rate and calibration error, conversion distribution, click rate, revenue/ARPU, CPM, and more.
  • Developed dashboards to track and evaluate ad conversion modeling performance with segmentation analysis on features such as placement/platform type, advertiser industry, and sales channel, based on A/B test and DDG experiment results.
Technologies: SQL, Back-end, Data Engineering, TensorFlow, Distributed Systems, Docker, Amazon Web Services (AWS), Python, GitHub, Apache Airflow, Machine Learning, Ads, Back-end Development, Go, Google BigQuery, Dashboards, Visual Studio Code (VS Code), Predictive Modeling, Data Pipelines, Software Engineering, Software Development, Statistical Data Analysis, Data Analytics, Data Manipulation, Data Analysis, Algorithms, Rankings, Advertising Technology (Adtech), Artificial Intelligence (AI), Google Cloud Platform (GCP)

Senior Quantitative Researcher

2022 - 2022
Outremont Technologies
  • Built machine learning models for time series forecasting of cryptocurrency perpetual contracts, using XGBoost gradient-boosted decision tree (GBDT) models and temporal convolutional networks (TCNs).
  • Developed a break-out momentum predictive model and the trading signal generation framework, employing Bayesian filtering techniques and order/trade flow analysis to discover trading opportunities for crypto derivatives/perpetual contracts.
  • Developed a complete backtesting framework for crypto perpetuals in Python, with core filtering algorithms implemented in C++ for speed, enabling rigorous evaluation of strategies and signal performance.
  • Contributed to the statistical analysis and evaluation of off-the-shelf quantitative feature libraries for the cryptocurrency market provided by various vendors.
Technologies: Python, C++, Amazon EC2, Predictive Modeling, Machine Learning, ClickHouse, SQL, Pandas, Scikit-learn, TensorFlow, NumPy, Visual Studio Code (VS Code), GitHub, Amazon Web Services (AWS), Data Engineering, Quantitative Finance, Docker, Data Pipelines, Software Engineering, Software Development, Statistical Data Analysis, Signal Processing, Finance, Financial Modeling, Data Analytics, Data Manipulation, Quantitative Analysis, Forecasting, Regression, Classification, Data Analysis, Algorithms, Quantitative Research, Artificial Intelligence (AI), Convolutional Neural Networks (CNNs)

Quantitative Developer

2017 - 2021
iFDC Capital Management
  • Developed machine learning models to generate directional signals for intraday momentum trading on U.S. stock index futures. Implemented automatic execution and backtesting for statistical arbitrage trading strategies on commodity derivatives.
  • Led the development of the event-driven algorithm module in the firm’s proprietary trading platform, ensuring efficient low-latency event processing.
  • Developed a portfolio management dashboard, enabling real-time market data streaming, position monitoring, indicator and model monitoring, and performance reporting and visualization.
  • Built the historical data and feature engineering pipeline, from data ingestion and transformation to feature generation.
  • Built a robust research ClickHouse database with exchange all-events market data, providing a centralized repository for data-driven analysis and strategy development.
Technologies: Python, NumPy, SciPy, Pandas, Scikit-learn, Keras, TensorFlow, MySQL, Data Science, Deep Learning, Machine Learning, Data Visualization, Predictive Modeling, Visualization, Data Pipelines, Artificial Intelligence (AI), Docker, Statistical Data Analysis, Software Engineering, SQL, Back-end Development, Quantitative Finance, Dashboards, Visual Studio Code (VS Code), GitHub, Data Engineering, C++, Amazon EC2, Software Development, Signal Processing, Finance, Financial Modeling, Data Analytics, Data Manipulation, Quantitative Analysis, Forecasting, Regression, Classification, Data Analysis, Algorithms, Quantitative Research, HTML, Convolutional Neural Networks (CNNs)

Staff Systems Engineer

2010 - 2017
Broadcom
  • Developed GPS receiver host software and signal processing algorithms in C++, performed log analysis and processing with Python, and tracked software and algorithm defects in Jira.
  • Analyzed large volumes of lab data for cellular radio chips and built statistical models to trade off between radio communication metrics. Developed numerical programs in Python to search circuit parameters and optimize chip performance.
  • Developed a comprehensive set of automation tools in Python for regression characterization. The tools manage test cases, error recovery, lab instrument control, and data reporting.
  • Developed the statistical models and algorithms for the highly accurate GPS receiver on-chip clock system.
Technologies: Statistical Modeling, Data Modeling, Time Series Analysis, Signal Processing, Python, Data Visualization, Visualization, Predictive Modeling, MATLAB, Software Engineering, C++, Wireless Systems, Software Development, Statistical Data Analysis, Data Analytics, Data Manipulation, Regression, Data Analysis

Experience

Ad Prediction - 10x Features

The project added many more features (about 10 times as many as the original) to the new ad prediction model based on a GBDT architecture, implemented with TensorFlow Decision Forests (TF-DF). I wrote BigQuery SQL with dbt templates for all the features' daily roll-ups, feature generation, and training/test dataset building. I also reworked the GBDT model with bid adjustments in the modeling pipeline. The new features, together with the enhanced model, greatly improved ad prediction performance and revenue.

Ad Prediction Batch Feature Engineering Pipeline

I migrated the legacy Airflow data pipeline to the new feature engineering platform with dbt. The project exploited dbt's parallel batch-processing capabilities. It improved dataset-building efficiency in Google BigQuery for daily feature roll-up, generation, ingestion, and transfer to S3 in TFRecord format.

Ads Prediction Modeling Performance-tracking Dashboard

I refactored the Mode dashboards that track the performance of the ads click-through rate model or the generalized conversion rate model. The BigQuery SQL code was reorganized into an Airflow-managed ETL pipeline, with templated DDG experiment filters and version management on GitHub.

Evaluation Framework for Cryptocurrency Investment Strategies

I developed a complete backtesting evaluation framework for cryptocurrency derivatives investment strategies in Python, with core signal-generation algorithms implemented in C++ for speed, enabling rigorous evaluation of strategies and signal performance.

Algorithmic Trading Module of the Proprietary Asset Management Platform

CONTRIBUTIONS
• Led the quantitative software development of the event-driven algorithm module for the microservice-structured proprietary asset management cloud application.
• Developed machine learning models to generate directional signals for intraday momentum trading on U.S. stock index futures. Implemented automatic execution and backtesting for statistical arbitrage trading strategies on commodity derivatives.
• Built the historical data and feature engineering pipeline, from data ingestion and transformation to feature generation.
• Built a robust research ClickHouse database with exchange all-events market data, providing a centralized repository for data-driven analysis and strategy development.
• Developed a portfolio management dashboard, enabling real-time market data streaming, position monitoring, indicator and model monitoring, and performance reporting and visualization.

Education

2006 - 2010

PhD in Electrical Engineering

University of California, Riverside - Riverside, California, USA

Skills

Libraries/APIs

Pandas, Scikit-learn, NumPy, PyTorch, Keras, SciPy, TensorFlow, Matplotlib

Tools

MATLAB, Apache Airflow, GitHub

Languages

Python, SQL, C++, Go, HTML

Paradigms

Quantitative Research, ETL, Microservices

Platforms

Jupyter Notebook, Docker, Visual Studio Code (VS Code), Google Cloud Platform (GCP), Linux, Amazon EC2, Amazon Web Services (AWS)

Storage

MySQL, Data Pipelines, ClickHouse, PostgreSQL

Other

Data Science, Machine Learning, Signal Processing, Quantitative Finance, Wireless Systems, Software Development, Finance, Financial Modeling, Visualization, Statistics, Statistical Analysis, Software Engineering, Deep Learning, Data Modeling, Statistical Data Analysis, Bokeh, Data Visualization, Predictive Modeling, Artificial Intelligence (AI), Back-end Development, Data Build Tool (dbt), Semiconductors, GPS, MLflow, Machine Learning Operations (MLOps), Data Engineering, Google BigQuery, Dashboards, Data Analytics, Data Manipulation, Quantitative Analysis, Forecasting, Regression, Classification, Data Analysis, Algorithms, Rankings, Recommendation Systems, Advertising Technology (Adtech), Convolutional Neural Networks (CNNs), Time Series Analysis, Back-end, Distributed Systems, Ads, Statistical Modeling, Signal Filtering
