Alex Eftimiades, Developer in Lenoir, United States
Alex Eftimiades

Verified Expert in Engineering

Data Scientist and Developer

Location
Lenoir, United States
Toptal Member Since
May 31, 2022

Alex is an experienced data scientist, statistician, and Python engineer. He has built models that identify financial crime and classify text communications, using tools ranging from XGBoost to cutting-edge research, and deployed them on AWS Lambda from Docker containers. He authored the now open-source Model Validation Toolkit, used at FINRA to perform statistically rigorous model validation and monitoring.

Portfolio

Penguin Random House
Python, Pandas, Scikit-learn, Jupyter, Kubernetes, Data Science, ETL...
FINRA
Python, XGBoost, Scikit-learn, Mathematics, Statistics, Machine Learning...
Catalist LLC
Python, SQL, Bash, Linux, Git, Machine Learning, Keras, Classification...

Availability

Part-time

Preferred Environment

MacOS, Linux, Jupyter, Vim Text Editor, iTerm2, Tmux, Spacemacs, Python

The most amazing...

...thing I've developed is the Model Validation Toolkit, which became open source after two years of internal R&D on validation and monitoring at FINRA.

Work Experience

Applied ML Scientist

2022 - PRESENT
Penguin Random House
  • Built a Facebook and Instagram ad generation and monitoring pipeline using Python and Kubernetes.
  • Presented an approach and A/B testing techniques at Data Science Salon.
  • Built a model to predict how video adaptations of books would increase sales.
Technologies: Python, Pandas, Scikit-learn, Jupyter, Kubernetes, Data Science, ETL, Forecasting, Data Modeling, Data Visualization

Lead Data Scientist

2019 - 2022
FINRA
  • Led the deployment of NLP models in production using Docker and Lambda on AWS, reducing costs by 80%.
  • Developed and open-sourced a toolkit for validating and monitoring machine learning models, based on internal R&D efforts (https://finraos.github.io/model-validation-toolkit/). Presented it at ODSC East 2022.
  • Mentored junior data scientists and led regular data science-related sessions and workshops.
  • Developed supervised and unsupervised models to identify insider trading (XGBoost; 96% AUC), market manipulation (DBSCAN), and fraud (Bayesian analysis), and to triage external communication (XGBoost, sklearn, and BERT).
  • Led R&D efforts on interpretable machine learning, model validation and monitoring, and various ensemble models.
  • Gave internal talks on software engineering for data scientists, countering sample bias, measuring model drift, thresholding, and normalizing flows.
  • Developed and conducted a technical interview process and brought on seven data scientists.
Technologies: Python, XGBoost, Scikit-learn, Mathematics, Statistics, Machine Learning, Bayesian Statistics, Model Validation, Deep Learning, Explainable Artificial Intelligence (XAI), Keras, TensorFlow, Plotly, Classification, Data Science, Amazon Web Services (AWS), ETL, Data Modeling, Data Visualization, Natural Language Processing (NLP)

Analytics Engineer

2018 - 2019
Catalist LLC
  • Optimized, parallelized, and deployed an NLP model with Keras.
  • Wrote a SQL parser in Python that refactored over one million lines of legacy SQL scripts.
  • Designed and wrote a data processing pipeline for election results as they became available the night of an election.
  • Wrote internal technical guides on parallel processing.
Technologies: Python, SQL, Bash, Linux, Git, Machine Learning, Keras, Classification, Data Science, ETL, Data Modeling, Data Visualization, Natural Language Processing (NLP)

Developer

2016 - 2017
Comsol
  • Researched models and techniques to simulate physical phenomena of interest to engineers and scientists.
  • Wrote technical specifications of new front-end and back-end components.
  • Implemented algorithms used for numerical simulations and user interfaces in Java.
Technologies: Data Visualization

Freelance Developer

2011 - 2016
Self-Employed
  • Used dynamic programming to reduce the run time of a quantum computing simulation from five days to 50 minutes (UMBC Physics Department).
  • Performed data visualization and image processing with Python and was named second author on the publication summarizing the results (American Dental Association Foundation).
  • Wrote code that lets people in countries with internet censorship reach the uncensored internet by tunneling their traffic over Google Chat and Tor (Tor).
  • Helped build initial versions of iCARE, a cancer research and networking nonprofit.
Technologies: Python, Matplotlib, Mathematics, Data Visualization

Model Validation Toolkit

https://finraos.github.io/model-validation-toolkit/
Over the course of several internal R&D efforts at FINRA, I built a series of tools to help the quality assurance department validate, interpret, and monitor machine learning models to the extent that this could be done systematically. FINRA open-sourced this as its Model Validation Toolkit, and I presented it at ODSC East 2022.

Tlang

https://github.com/aeftimia/tlang
Tlang is a Python library and domain-specific language for building transpilers. Inspired by a project in which millions of lines of SQL needed to be updated to match new table layouts, Tlang seeks to make it as easy as possible for a Python developer with no prior background in parsing or compiler theory to write clean and efficient transpilers for major enterprise migrations.

Hexchat

https://github.com/aeftimia/hexchat
I wrote internet censorship circumvention software for Tor that tunnels TCP connections over arbitrary numbers of XMPP chatlines, circumventing bandwidth limitations imposed by the hosts. Packets were fragmented and distributed across arbitrary numbers of bandwidth-limited chatlines, and fragments were reassembled in a queue at the receiving end.
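
The fragment-and-reassemble idea described above can be illustrated with a minimal Python sketch. This is not Hexchat's actual code: the function names, the round-robin assignment, and the heap-based reorder buffer are all assumptions chosen for illustration, and real chatlines are replaced here by plain in-memory lists.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

# Hypothetical fragment format: (sequence number, payload chunk).
Fragment = Tuple[int, bytes]


def fragment(data: bytes, chunk_size: int) -> List[Fragment]:
    """Split a payload into sequence-numbered chunks."""
    offsets = range(0, len(data), chunk_size)
    return [(seq, data[i:i + chunk_size]) for seq, i in enumerate(offsets)]


def distribute(fragments: Iterable[Fragment], n_channels: int) -> List[List[Fragment]]:
    """Assign fragments to channels round-robin (an assumed policy)."""
    channels: List[List[Fragment]] = [[] for _ in range(n_channels)]
    for seq, chunk in fragments:
        channels[seq % n_channels].append((seq, chunk))
    return channels


def reassemble(arrivals: Iterable[Fragment]) -> Iterator[bytes]:
    """Yield chunks in sequence order, buffering any that arrive early."""
    pending: List[Fragment] = []  # min-heap keyed on sequence number
    next_seq = 0
    for frag in arrivals:
        heapq.heappush(pending, frag)
        # Release every buffered chunk that is now next in line.
        while pending and pending[0][0] == next_seq:
            yield heapq.heappop(pending)[1]
            next_seq += 1


# Usage: simulate three channels delivering out of order.
payload = b"hello censorship-free world"
channels = distribute(fragment(payload, 4), 3)
arrivals = [frag for channel in reversed(channels) for frag in channel]
restored = b"".join(reassemble(arrivals))
```

Sequence numbers let the receiver restore order no matter which channel delivers first, so each channel's individual bandwidth limit only bounds its own share of the traffic.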

Kahler

https://github.com/aeftimia/kahler
I developed and reported on an efficient, parallelized finite element framework capable of simulating electromagnetic and thermal radiation in one, two, and three dimensions with arbitrary discretizations and boundary conditions.

Languages

Python, Bash, SQL

Libraries/APIs

XGBoost, Scikit-learn, Matplotlib, TensorFlow, Keras, Sockets, Pandas

Tools

Jupyter, Vim Text Editor, Tmux, Git, Plotly, Spacemacs

Paradigms

Data Science, ETL

Platforms

MacOS, Linux, Amazon Web Services (AWS), Kubernetes

Other

iTerm2, Mathematics, Statistics, Machine Learning, Bayesian Statistics, Explainable Artificial Intelligence (XAI), Deep Learning, Model Validation, Classification, Data Modeling, Data Visualization, Natural Language Processing (NLP), Forecasting, Compilers, TCP/IP, Transmission Control Protocol (TCP), XMPP

2013 - 2015

Bachelor's Degree in Physics

University of Maryland, Baltimore County - Catonsville, Baltimore County, Maryland, United States
