Advising CTO | Zeguro | 2017 - 2019
- Architected the product stack (front end, middle tier, back end, and database).
- Designed the AWS execution environment around ECS/Fargate and implemented an Infrastructure-as-Code automated build and deployment tool using Python/Boto3.
- Built a generic database layer blending the best of relational and NoSQL approaches by leveraging PostgreSQL's native JSON support and GIN indexing.
- Hired and mentored a high-performing engineering team.
- Architected the data collection process supporting the company's cyber insurance AI model.
- Researched and developed a prototype "real-time" AI approach based on information retrieval (inverted index) techniques, which requires no training stage beyond indexing each new observation.
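A minimal sketch of that indexing-only approach: each observation's discrete features become postings in an inverted index, and prediction is a majority vote over the best-matching indexed observations. All class, method, and feature names below are hypothetical, not the production system.

```python
from collections import defaultdict, Counter

class InvertedIndexKNN:
    """Nearest-neighbour classifier backed by an inverted index.

    'Training' is just indexing: each observation's features become
    postings, so a new observation is usable immediately.
    """

    def __init__(self, k=3):
        self.k = k
        self.postings = defaultdict(set)   # feature -> observation ids
        self.labels = {}                   # observation id -> label

    def index(self, obs_id, features, label):
        for f in features:
            self.postings[f].add(obs_id)
        self.labels[obs_id] = label

    def predict(self, features):
        # Score each candidate by how many features it shares with the query.
        scores = Counter()
        for f in features:
            for obs_id in self.postings.get(f, ()):
                scores[obs_id] += 1
        top = scores.most_common(self.k)
        if not top:
            return None
        # Majority vote among the k best-matching observations.
        votes = Counter(self.labels[obs_id] for obs_id, _ in top)
        return votes.most_common(1)[0][0]
```

New observations indexed at any time are immediately eligible as neighbours, which is the "real-time" property the bullet describes.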
Quantitative Strategist, US Equities | KCG Holdings, Inc. / Virtu Financial, Inc. | 2016 - 2018
Technologies: C++, OCaml, Python, Pandas, Machine Learning, Reinforcement Learning, Stochastic Optimization
- Developed a framework for reinforcement learning in the context of high-frequency trading.
- Built a stochastic optimization framework for the parameters of KCG's flagship algorithmic trading strategy.
- Contributed to the development of a high-throughput trading simulator to evaluate the performance of various market-making strategies and predictors.
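The stochastic-optimization idea above can be sketched as a noisy hill-climb: perturb the strategy parameters, re-estimate a noisy objective by averaging several samples, and keep moves that improve it. The objective below is a toy stand-in for a trading simulator's PnL estimate; all names are illustrative.

```python
import random

def stochastic_optimize(objective, x0, iters=300, step=0.15, samples=8, seed=0):
    """Noisy hill-climbing over a parameter vector.

    `objective(x, rng)` returns one noisy sample of the quantity being
    maximised; averaging `samples` draws reduces the noise before
    accepting or rejecting each candidate move.
    """
    rng = random.Random(seed)

    def estimate(x):
        return sum(objective(x, rng) for _ in range(samples)) / samples

    best, best_val = list(x0), estimate(x0)
    for _ in range(iters):
        cand = [xi + rng.gauss(0.0, step) for xi in best]
        val = estimate(cand)
        if val > best_val:
            best, best_val = cand, val
    return best, best_val

# Toy noisy objective with its maximum at (1.0, -2.0),
# standing in for a simulator-based PnL estimate.
def noisy_pnl(x, rng):
    true_val = -((x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2)
    return true_val + rng.gauss(0.0, 0.05)
```

Averaging repeated simulator runs before each accept/reject decision is what distinguishes this from plain hill-climbing on a deterministic objective.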
VP of Data Science / Chief Technology Officer | Lumity, Inc. | 2014 - 2016
- Developed a novel machine learning algorithm (non-parametric conditional density estimation) for quantitative risk management and efficient pricing of insurance products.
- Built a machine learning model to predict the out-of-pocket expenses of an individual based on the features of a health insurance plan and the individual's risk profile.
- Architected the benefits enrollment system at the heart of Lumity's benefits brokerage platform.
- Hired and mentored a high-functioning engineering and data-science team.
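A minimal sketch of kernel-based conditional density estimation, the family of technique named in the first bullet: weight each training point by a Gaussian kernel in x, then form a weighted Gaussian kernel density over y. Bandwidths and names are illustrative, not the production model.

```python
import math

def conditional_density(x_query, y_grid, xs, ys, hx=1.0, hy=1.0):
    """Kernel estimate of p(y | x = x_query) evaluated on y_grid.

    Each training pair (xs[i], ys[i]) contributes a Gaussian bump in y,
    weighted by how close xs[i] is to the query point in x.
    """
    def gauss(u, h):
        return math.exp(-0.5 * (u / h) ** 2) / (h * math.sqrt(2 * math.pi))

    weights = [gauss(x_query - x, hx) for x in xs]
    total = sum(weights)
    density = []
    for y in y_grid:
        num = sum(w * gauss(y - yi, hy) for w, yi in zip(weights, ys))
        density.append(num / total if total > 0 else 0.0)
    return density
```

Unlike a parametric model, the estimate makes no distributional assumption about y given x, which is what makes it suitable for pricing risks with irregular outcome distributions.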
Research Engineer – High-Frequency Trading | Xambala Capital | 2012 - 2014
Technologies: OCaml, C++, GNU R, Python, NumPy, Matplotlib
- Built an extensive library of signals derived from the raw market event feeds for machine learning applications.
- Developed low-latency market predictors using GNU R and glmnet. The resulting models were extremely fast to evaluate, as required by high-frequency trading, and their statistical performance was competitive with far more complex non-linear models.
- Maintained and extended a stock exchange simulator, supporting feeds from all major US exchanges (Nasdaq, NYSE, Arca, BATS, Direct Edge) and implementing the distinct semantics of each exchange's matching engine.
- Constructed an order router translating from Xambala's native ordering protocol to the protocols of all major US equities exchanges and several dark pools.
- Built a family of two-sided, liquidity-providing market-making algorithms for tight-spread, high-volume stocks.
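The low-latency property of the linear predictors above can be sketched as follows: an L1-regularised fit (such as glmnet produces) leaves most coefficients at exactly zero, so evaluation reduces to a handful of multiply-adds over the active terms. Function names are illustrative.

```python
def compile_sparse_predictor(coefs, intercept=0.0):
    """Turn a mostly-zero coefficient vector into a predictor that
    touches only the non-zero terms.

    This is the property that makes L1-regularised linear models cheap
    enough for a high-frequency critical path: per-tick cost is
    proportional to the number of active coefficients, not the total
    feature count.
    """
    active = [(i, c) for i, c in enumerate(coefs) if c != 0.0]

    def predict(features):
        return intercept + sum(c * features[i] for i, c in active)

    return predict
```

For example, a 5-feature model with only two non-zero coefficients costs two multiplications per evaluation regardless of feature-vector width.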
Lead Engineer, Pricing and Risk Management | The Climate Corporation | 2011 - 2012
Technologies: Java, Clojure, Hadoop
- Built Climate's weather-based crop insurance pricing algorithm, based on a distributed Monte Carlo simulation of predicted weather patterns over the insured farm throughout the growing season.
- Developed a Hadoop/MapReduce algorithm to produce quantitative risk reports on the entire crop insurance portfolio for reinsurance partners.
- Researched the feasibility of expressing an insurance policy's payout calculation as data by representing it as a first-class function in a functional programming language (Clojure).
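A toy sketch of Monte Carlo pricing combined with the payout-as-first-class-function idea from the last bullet: the payout rule is passed into the pricing engine as a function, so new policy structures need no engine changes. The drought payout and rainfall model below are invented for illustration, and the sketch ignores loading, discounting, and spatial weather modelling.

```python
import random

def price_policy(payout, simulate_weather, n_sims=10000, seed=0):
    """Monte Carlo premium estimate.

    `payout` maps one simulated weather outcome to a dollar payout;
    `simulate_weather(rng)` draws one outcome. The fair premium is the
    average payout across simulated growing seasons.
    """
    rng = random.Random(seed)
    total = sum(payout(simulate_weather(rng)) for _ in range(n_sims))
    return total / n_sims

# Toy payout rule: $100 per unit of rainfall shortfall below 10 units.
def drought_payout(rainfall):
    return max(0.0, (10.0 - rainfall) * 100.0)

# Toy weather model: seasonal rainfall ~ Normal(12, 3), floored at zero.
def simulate_rainfall(rng):
    return max(0.0, rng.gauss(12.0, 3.0))
```

Because `price_policy` only sees a callable, a hail policy or a heat-stress policy prices through the identical engine by swapping the payout function.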
Senior Engineer – Search Technology | Wink.com / Mylife.com | 2009 - 2011
Technologies: OCaml, Java, Lucene, Hadoop
- Built a Hadoop/MapReduce indexing algorithm to process the document corpus into a set of Lucene indexes for the search engine cluster. This was 100x faster than the previous version, which was based on a static cluster of Lucene indexing servers.
- Developed a real-time Lucene read-write indexing service to make newly acquired data immediately accessible through the search engine. This complemented the main search cluster, which served the static document corpus and was updated infrequently.
- Leveraged Lucene inverted index technology to support k-nearest neighbors predictive modeling.
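The static-plus-real-time split described in the two indexing bullets can be sketched as a two-tier search structure: a large immutable index serves the bulk corpus while a small read-write index absorbs new documents, and queries merge postings from both. This is a plain-Python illustration; names are not Lucene's actual API.

```python
class TwoTierSearch:
    """Two-tier term search: a batch-built static index plus a small
    live delta index for documents added since the last rebuild."""

    def __init__(self, static_docs):
        self.static_index = self._build(static_docs)   # rebuilt in batch
        self.delta_index = {}                          # updated live

    @staticmethod
    def _build(docs):
        index = {}
        for doc_id, text in docs.items():
            for term in set(text.lower().split()):
                index.setdefault(term, set()).add(doc_id)
        return index

    def add_document(self, doc_id, text):
        # New data is searchable immediately; no batch rebuild needed.
        for term in set(text.lower().split()):
            self.delta_index.setdefault(term, set()).add(doc_id)

    def search(self, term):
        term = term.lower()
        return (self.static_index.get(term, set())
                | self.delta_index.get(term, set()))
```

Periodically, the delta documents would be folded into the next batch rebuild and the delta index cleared, mirroring the infrequent static-cluster refresh described above.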