Advising CTO | Zeguro | 2017 - 2019
- Architected the product stack (front end, middle tier, back end, and database).
- Designed the AWS execution environment around ECS/Fargate and implemented an Infrastructure-as-Code automated build and deployment tool using Python/Boto3.
- Built a generic database layer blending the best of relational and NoSQL approaches by leveraging PostgreSQL's native JSON support and GIN indexing.
- Hired and mentored a high-performing engineering team.
- Architected the data collection process supporting the company's cyber insurance AI model.
- Researched and developed a prototype "real-time" AI approach based on information retrieval (inverted index) techniques, which requires no training stage beyond indexing each new observation.
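A minimal sketch of that indexing-only approach: each observation's discrete features become postings in an inverted index, and prediction is a majority vote over the best-matching indexed observations. All class, method, and feature names below are hypothetical, not the production system.

```python
from collections import defaultdict, Counter

class InvertedIndexKNN:
    """Nearest-neighbour classifier backed by an inverted index.

    'Training' is just indexing: each observation's features become
    postings, so a new observation is usable immediately.
    """

    def __init__(self, k=3):
        self.k = k
        self.postings = defaultdict(set)   # feature -> observation ids
        self.labels = {}                   # observation id -> label

    def index(self, obs_id, features, label):
        for f in features:
            self.postings[f].add(obs_id)
        self.labels[obs_id] = label

    def predict(self, features):
        # Score each candidate by how many features it shares with the query.
        scores = Counter()
        for f in features:
            for obs_id in self.postings.get(f, ()):
                scores[obs_id] += 1
        top = scores.most_common(self.k)
        if not top:
            return None
        # Majority vote among the k best-matching observations.
        votes = Counter(self.labels[obs_id] for obs_id, _ in top)
        return votes.most_common(1)[0][0]
```

New observations indexed at any time are immediately eligible as neighbours, which is the "real-time" property the bullet describes.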
Quantitative Strategist, US Equities | KCG Holdings, Inc. / Virtu Financial, Inc. | 2016 - 2018
Technologies: C++, OCaml, Python, Pandas, Machine Learning, Reinforcement Learning, Stochastic Optimization
- Developed a framework for reinforcement learning in the context of high-frequency trading.
- Built a stochastic optimization framework for the parameters of KCG's flagship algorithmic trading strategy.
- Contributed to the development of a high-throughput trading simulator to evaluate the performance of various market-making strategies and predictors.
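The stochastic-optimization idea above can be sketched as a noisy hill-climb: perturb the strategy parameters, re-estimate a noisy objective by averaging several samples, and keep moves that improve it. The objective below is a toy stand-in for a trading simulator's PnL estimate; all names are illustrative.

```python
import random

def stochastic_optimize(objective, x0, iters=300, step=0.15, samples=8, seed=0):
    """Noisy hill-climbing over a parameter vector.

    `objective(x, rng)` returns one noisy sample of the quantity being
    maximised; averaging `samples` draws reduces the noise before
    accepting or rejecting each candidate move.
    """
    rng = random.Random(seed)

    def estimate(x):
        return sum(objective(x, rng) for _ in range(samples)) / samples

    best, best_val = list(x0), estimate(x0)
    for _ in range(iters):
        cand = [xi + rng.gauss(0.0, step) for xi in best]
        val = estimate(cand)
        if val > best_val:
            best, best_val = cand, val
    return best, best_val

# Toy noisy objective with its maximum at (1.0, -2.0),
# standing in for a simulator-based PnL estimate.
def noisy_pnl(x, rng):
    true_val = -((x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2)
    return true_val + rng.gauss(0.0, 0.05)
```

Averaging repeated simulator runs before each accept/reject decision is what distinguishes this from plain hill-climbing on a deterministic objective.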
VP of Data Science / Chief Technology Officer | Lumity, Inc. | 2014 - 2016
- Developed a novel machine learning algorithm (non-parametric conditional density estimation) for quantitative risk management and efficient pricing of insurance products.
- Built a machine learning model to predict the out-of-pocket expenses of an individual based on the features of a health insurance plan and the individual's risk profile.
- Architected the benefits enrollment system at the heart of Lumity's benefits brokerage platform.
- Hired and mentored a high-functioning engineering and data-science team.
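A minimal sketch of kernel-based conditional density estimation, the family of technique named in the first bullet: weight each training point by a Gaussian kernel in x, then form a weighted Gaussian kernel density over y. Bandwidths and names are illustrative, not the production model.

```python
import math

def conditional_density(x_query, y_grid, xs, ys, hx=1.0, hy=1.0):
    """Kernel estimate of p(y | x = x_query) evaluated on y_grid.

    Each training pair (xs[i], ys[i]) contributes a Gaussian bump in y,
    weighted by how close xs[i] is to the query point in x.
    """
    def gauss(u, h):
        return math.exp(-0.5 * (u / h) ** 2) / (h * math.sqrt(2 * math.pi))

    weights = [gauss(x_query - x, hx) for x in xs]
    total = sum(weights)
    density = []
    for y in y_grid:
        num = sum(w * gauss(y - yi, hy) for w, yi in zip(weights, ys))
        density.append(num / total if total > 0 else 0.0)
    return density
```

Unlike a parametric model, the estimate makes no distributional assumption about y given x, which is what makes it suitable for pricing risks with irregular outcome distributions.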
Research Engineer – High-Frequency Trading | Xambala Capital | 2012 - 2014
Technologies: OCaml, C++, GNU R, Python, NumPy, Matplotlib
- Built an extensive library of signals derived from the raw market event feeds for machine learning applications.
- Developed low-latency market predictors using GNU R and glmnet. The resulting models were extremely fast to evaluate, as required by high-frequency trading, and their statistical performance was competitive with far more complex non-linear models.
- Maintained and extended a stock exchange simulator, supporting feeds from all major US exchanges (Nasdaq, NYSE, Arca, BATS, Direct Edge) and implementing the distinct semantics of each exchange's matching engine.
- Constructed an order router translating from Xambala's native ordering protocol to the protocols of all major US equities exchanges and several dark pools.
- Built a family of two-sided, liquidity-providing market-making algorithms for tight-spread, high-volume stocks.
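The low-latency property of the linear predictors above can be sketched as follows: an L1-regularised fit (such as glmnet produces) leaves most coefficients at exactly zero, so evaluation reduces to a handful of multiply-adds over the active terms. Function names are illustrative.

```python
def compile_sparse_predictor(coefs, intercept=0.0):
    """Turn a mostly-zero coefficient vector into a predictor that
    touches only the non-zero terms.

    This is the property that makes L1-regularised linear models cheap
    enough for a high-frequency critical path: per-tick cost is
    proportional to the number of active coefficients, not the total
    feature count.
    """
    active = [(i, c) for i, c in enumerate(coefs) if c != 0.0]

    def predict(features):
        return intercept + sum(c * features[i] for i, c in active)

    return predict
```

For example, a 5-feature model with only two non-zero coefficients costs two multiplications per evaluation regardless of feature-vector width.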
Lead Engineer, Pricing and Risk Management | The Climate Corporation | 2011 - 2012
Technologies: Java, Clojure, Hadoop
- Built Climate's weather-based crop insurance pricing algorithm, based on a distributed Monte Carlo simulation of predicted weather patterns over the insured farm throughout the growing season.
- Developed a Hadoop/MapReduce algorithm to produce quantitative risk reports on the entire crop insurance portfolio for reinsurance partners.
- Researched the feasibility of expressing an insurance policy's payout calculation as data by representing it as a first-class function in a functional programming language (Clojure).
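A toy sketch of Monte Carlo pricing combined with the payout-as-first-class-function idea from the last bullet: the payout rule is passed into the pricing engine as a function, so new policy structures need no engine changes. The drought payout and rainfall model below are invented for illustration, and the sketch ignores loading, discounting, and spatial weather modelling.

```python
import random

def price_policy(payout, simulate_weather, n_sims=10000, seed=0):
    """Monte Carlo premium estimate.

    `payout` maps one simulated weather outcome to a dollar payout;
    `simulate_weather(rng)` draws one outcome. The fair premium is the
    average payout across simulated growing seasons.
    """
    rng = random.Random(seed)
    total = sum(payout(simulate_weather(rng)) for _ in range(n_sims))
    return total / n_sims

# Toy payout rule: $100 per unit of rainfall shortfall below 10 units.
def drought_payout(rainfall):
    return max(0.0, (10.0 - rainfall) * 100.0)

# Toy weather model: seasonal rainfall ~ Normal(12, 3), floored at zero.
def simulate_rainfall(rng):
    return max(0.0, rng.gauss(12.0, 3.0))
```

Because `price_policy` only sees a callable, a hail policy or a heat-stress policy prices through the identical engine by swapping the payout function.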
Senior Engineer – Search Technology | Wink.com / Mylife.com | 2009 - 2011
Technologies: OCaml, Java, Lucene, Hadoop
- Built a Hadoop/MapReduce indexing algorithm to process the document corpus into a set of Lucene indexes for the search engine cluster. This was 100x faster than the previous version, which was based on a static cluster of Lucene indexing servers.
- Developed a real-time Lucene read-write indexing service to make newly acquired data immediately accessible through the search engine. This complemented the main search cluster, which served the static document corpus and was updated infrequently.
- Leveraged Lucene inverted index technology to support k-nearest neighbors predictive modeling.
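The static-plus-real-time split described in the two indexing bullets can be sketched as a two-tier search structure: a large immutable index serves the bulk corpus while a small read-write index absorbs new documents, and queries merge postings from both. This is a plain-Python illustration; names are not Lucene's actual API.

```python
class TwoTierSearch:
    """Two-tier term search: a batch-built static index plus a small
    live delta index for documents added since the last rebuild."""

    def __init__(self, static_docs):
        self.static_index = self._build(static_docs)   # rebuilt in batch
        self.delta_index = {}                          # updated live

    @staticmethod
    def _build(docs):
        index = {}
        for doc_id, text in docs.items():
            for term in set(text.lower().split()):
                index.setdefault(term, set()).add(doc_id)
        return index

    def add_document(self, doc_id, text):
        # New data is searchable immediately; no batch rebuild needed.
        for term in set(text.lower().split()):
            self.delta_index.setdefault(term, set()).add(doc_id)

    def search(self, term):
        term = term.lower()
        return (self.static_index.get(term, set())
                | self.delta_index.get(term, set()))
```

Periodically, the delta documents would be folded into the next batch rebuild and the delta index cleared, mirroring the infrequent static-cluster refresh described above.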