Jesse Liu, Developer in Irvine, CA, United States

Jesse Liu

Verified Expert in Engineering

Data Scientist and Quantitative Software Developer

Location
Irvine, CA, United States
Toptal Member Since
March 14, 2022

Jesse is a machine learning and software engineer with 10+ years of versatile industry experience developing robust solutions for social networks, ads engineering, quantitative investments, and wireless systems. Jesse is passionate about delivering high-quality code and solutions that meet the needs and expectations of his clients. He brings a professional attitude to collaborating with team members and enjoys exploring new ideas and technologies.

Portfolio

Reddit, Inc.
SQL, Back-end, Data Engineering, TensorFlow, Distributed Systems, Docker...
Outremont Technologies
Python, C++, Amazon EC2, Predictive Modeling, Machine Learning, ClickHouse, SQL...
iFDC Capital Management, LLC
Python, NumPy, SciPy, Pandas, Scikit-learn, Keras, TensorFlow, MySQL...

Experience

Availability

Full-time

Preferred Environment

Linux, Visual Studio Code (VS Code), GitHub

The most amazing...

...projects I've worked on are an ML pipeline for Reddit Ads Engineering, statistical models and software for GPS chips in the Samsung Galaxy, and algorithmic trading.

Work Experience

Machine Learning Engineer

2022 - 2023
Reddit, Inc.
  • Developed a model training pipeline for ads prediction models using gradient-boosted decision trees (GBDTs) and a bid adjustment architecture, implemented with TensorFlow Decision Forests (TF-DF).
  • Built a feature engineering pipeline with Google BigQuery, Apache Airflow, and dbt to handle daily roll-up of petabyte-scale data, feature generation, ingestion, and training/test dataset creation for ads CPC and CPA cost models.
  • Built dashboards on Mode to monitor and track ads prediction model performance, surfacing business metrics such as conversion rate, prediction rate, calibration error, conversion distribution, revenue/ARPU, and CPM, among others.
  • Tracked and evaluated ads conversion modeling performance with segmentation analysis on various features, including placement/platform types, advertiser industry, user privacy flags, and sales channel, based on the results of A/B tests and DDG experiments.
  • Developed and maintained the prediction API between the ads inference server and upstream requesting server, ensuring seamless integration and responsiveness to model updates, requirements, and design changes.
Technologies: SQL, Back-end, Data Engineering, TensorFlow, Distributed Systems, Docker, Amazon Web Services (AWS), Python, GitHub, Apache Airflow, Machine Learning, Ads, Back-end Development, Go, Google BigQuery, Dashboards, Visual Studio Code (VS Code), Predictive Modeling, Data Pipelines, Software Engineering, Software Development, Statistical Data Analysis, Data Analytics, Data Manipulation, Data Analysis, Algorithms, Rankings, Advertising Technology (Adtech), Artificial Intelligence (AI), Google Cloud Platform (GCP)

Senior Quantitative Researcher

2022 - 2022
Outremont Technologies
  • Performed a thorough statistical analysis and evaluation of all-events cryptocurrency market data, examining the efficacy of quantitative signals and features provided by various vendors.
  • Conducted in-depth research on novel machine learning and deep learning approaches for time series forecasting, including gradient-boosted trees, N-BEATS, and Temporal Fusion Transformer (TFT) models, assessing their potential to enhance crypto portfolio performance.
  • Investigated the application of convolutional deep learning architectures to transformed signals, evaluating their effectiveness in momentum and volatility time series forecasting for crypto assets.
  • Developed a robust break-out momentum trading signal generation framework, employing probabilistic filtering and transaction cost analysis techniques to identify opportunities in the cryptocurrency market.
Technologies: Python, C++, Amazon EC2, Predictive Modeling, Machine Learning, ClickHouse, SQL, Pandas, Scikit-learn, TensorFlow, NumPy, Visual Studio Code (VS Code), GitHub, Amazon Web Services (AWS), Data Engineering, Quantitative Finance, Docker, Data Pipelines, Software Engineering, Software Development, Statistical Data Analysis, Signal Processing, Finance, Financial Modeling, Data Analytics, Data Manipulation, Quantitative Analysis, Forecasting, Regression, Classification, Data Analysis, Algorithms, Quantitative Research, Artificial Intelligence (AI)

Quantitative Researcher | Data Scientist

2017 - 2021
iFDC Capital Management, LLC
  • Leveraged statistical data analysis, trading signal hypothesis testing and optimization, and quantitative strategies to enhance financial derivatives portfolio management.
  • Developed time series regression models for parameter estimation of mean-reversion stochastic process models, contributing to refined stock index investment strategies.
  • Created a data pipeline and management dashboard, enabling real-time data streaming, historical data queries, model monitoring, performance reporting, and visualization.
  • Established robust research databases for all-events historical market data using ClickHouse, providing a centralized repository for data-driven analysis and strategy development.
Technologies: Python, NumPy, SciPy, Pandas, Scikit-learn, Keras, TensorFlow, MySQL, Data Science, Deep Learning, Machine Learning, Data Visualization, Predictive Modeling, Visualization, Data Pipelines, Artificial Intelligence (AI), Docker, Statistical Data Analysis, Software Engineering, SQL, Back-end Development, Quantitative Finance, Dashboards, Visual Studio Code (VS Code), GitHub, Data Engineering, C++, Amazon EC2, Software Development, Signal Processing, Finance, Financial Modeling, Data Analytics, Data Manipulation, Quantitative Analysis, Forecasting, Regression, Classification, Data Analysis, Algorithms, Quantitative Research, HTML

Staff Systems Engineer

2010 - 2017
Broadcom
  • Developed the statistical models and algorithms for the highly accurate GPS receiver on-chip clock system.
  • Analyzed large volumes of lab data for cellular radio chips and built statistical models to evaluate trade-offs between radio communication metrics. Developed numerical programs in Python to search circuit parameters and optimize chip performance.
  • Developed a comprehensive set of automation tools in Python for regression characterization. The tools manage test cases, error recovery, lab instrument control, and data reporting.
  • Developed GPS receiver host software and signal processing algorithms in C++, performed log analysis and processing with Python, and tracked software and algorithm defects via Jira.
Technologies: Statistical Modeling, Data Modeling, Time Series Analysis, Signal Processing, Python, Data Visualization, Visualization, Predictive Modeling, MATLAB, Software Engineering, C++, Wireless Systems, Software Development, Statistical Data Analysis, Data Analytics, Data Manipulation, Regression, Data Analysis

Ads Prediction 10x Features

The project added many more features (about ten times as many as the original) to the new ads prediction model based on the GBDT architecture, implemented with TensorFlow Decision Forests (TF-DF). I coded BigQuery SQL scripts with dbt templates for daily roll-up and generation of all the features and for building the training/test datasets. I also updated the GBDT model with bid adjustments in the modeling pipeline. The new features with the enhanced model greatly improved ads prediction performance and revenue.

Ads Prediction Batch Feature Engineering Pipeline

Migrated the legacy Airflow data pipeline to the batch feature engineering platform built with dbt. The project exploited the parallel processing ability of dbt. It improved dataset-building efficiency with Google BigQuery for daily feature roll-up, generation, ingestion, and transfer to S3 in TFRecord format.

Ads Prediction Modeling Performance-tracking Dashboard

Refactored the Mode dashboards that track the performance of the ads click-through rate model and the generalized conversion rate model. The BigQuery SQL code was reorganized into an Airflow-managed ETL pipeline, with templated DDG experiment filters and version management on GitHub.

Quantitative Cryptocurrency Trading Model

I conducted research on cryptocurrency market data and developed a quantitative trading model based on a Bayesian probabilistic filter and machine learning. I implemented the whole backtesting pipeline in Python with a core algorithm in C++.

Algorithmic Trading Module of an Asset Management Platform

WORK DONE
• Led the quantitative software development of the event-driven algorithm module for a proprietary, microservice-based asset management cloud application.
• Researched convolutional deep learning models (Conv1D, ResNet, EfficientNet, WaveNet, DenseNet, etc.) applied to signals transformed with the continuous wavelet transform (CWT) for time series forecasting.
• Designed new algorithmic order types to reduce transaction cost and risk, and optimized synchronized leg execution for spread trading to reduce timing risk.
• Developed data pipelines and a portfolio management dashboard for real-time market data streaming, historical database queries, model optimization and monitoring, generation and reporting of statistics and indicators, and user front-end visualization.

Languages

Python, SQL, C++, Go, HTML

Libraries/APIs

Pandas, Scikit-learn, NumPy, PyTorch, Keras, SciPy, TensorFlow, Matplotlib

Paradigms

Data Science, Quantitative Research, ETL, Microservices

Other

Machine Learning, Signal Processing, Quantitative Finance, Wireless Systems, Software Development, Finance, Financial Modeling, Visualization, Statistics, Statistical Analysis, Software Engineering, Deep Learning, Data Modeling, Statistical Data Analysis, Bokeh, Data Visualization, Predictive Modeling, Artificial Intelligence (AI), Back-end Development, Data Build Tool (dbt), Semiconductors, GPS, MLflow, Machine Learning Operations (MLOps), Data Engineering, Google BigQuery, Dashboards, Data Analytics, Data Manipulation, Quantitative Analysis, Forecasting, Regression, Classification, Data Analysis, Algorithms, Rankings, Recommendation Systems, Advertising Technology (Adtech), Time Series Analysis, Back-end, Distributed Systems, Ads, Statistical Modeling, Signal Filtering

Tools

MATLAB, Apache Airflow, GitHub

Platforms

Jupyter Notebook, Docker, Visual Studio Code (VS Code), Google Cloud Platform (GCP), Linux, Amazon EC2, Amazon Web Services (AWS)

Storage

MySQL, Data Pipelines, ClickHouse, PostgreSQL

Education

2006 - 2010

Ph.D. in Electrical Engineering

University of California, Riverside - Riverside, California, USA
