Alex Eftimiades, Data Scientist and Developer in Lenoir, United States

Member since May 31, 2022
Alex is an experienced data scientist, statistician, and Python engineer. He has built models that identify financial crime and classify text communications using tools ranging from XGBoost to cutting-edge research, and deployed them on AWS Lambda from Docker containers. He authored the now open-source Model Validation Toolkit, used at FINRA to perform statistically rigorous model validation and monitoring.


    Python, XGBoost, Scikit-learn, Mathematics, Statistics, Machine Learning...
  • Catalist LLC
    Python, SQL, Bash, Linux, Git, Machine Learning, Keras, Classification
  • Comsol
    Java, C++






Preferred Environment

MacOS, Linux, Jupyter, Vim Text Editor, iTerm2, Tmux, Spacemacs

The most amazing...

...thing I've developed is the Model Validation Toolkit, which was open-sourced after two years of internal R&D on validation and monitoring at FINRA.


  • Lead Data Scientist

    2018 - PRESENT
    • Led deployment of NLP models in production using Docker and Lambda on AWS.
    • Developed supervised and unsupervised models to identify insider trading with XGBoost, market manipulation with DBSCAN, and fraud with Bayesian analysis, and to triage external communications with XGBoost and BERT.
    • Gave internal talks on software engineering for data scientists, countering sample bias, measuring model drift, thresholding, and normalizing flows.
    • Developed and open-sourced a toolkit for validating and monitoring machine learning models, and introduced this software at ODSC East 2022.
    Technologies: Python, XGBoost, Scikit-learn, Mathematics, Statistics, Machine Learning, Bayesian Statistics, Model Validation, Deep Learning, Explainable Artificial Intelligence (XAI), Keras, TensorFlow, Plotly, Classification
  • Analytics Engineer

    2018 - 2019
    Catalist LLC
    • Optimized, parallelized, and deployed an NLP model with Keras.
    • Wrote a SQL parser in Python that refactored over one million lines of legacy SQL scripts.
    • Designed and wrote a data processing pipeline that ingested election results as they became available on election night.
    • Wrote internal technical guides on parallel processing.
    Technologies: Python, SQL, Bash, Linux, Git, Machine Learning, Keras, Classification
  • Developer

    2016 - 2017
    • Researched models and techniques to simulate physical phenomena of interest to engineers and scientists.
    • Wrote technical specifications of new front and back-end components.
    • Implemented algorithms used for numerical simulations and user interfaces in Java.
    Technologies: Java, C++
  • Freelance Developer

    2011 - 2016
    • Used dynamic programming to reduce the run time of a quantum computing simulation from five days to 50 minutes (UMBC Physics Department).
    • Performed data visualization and image processing with Python, and was named second author on the publication summarizing the results (American Dental Association Foundation).
    • Wrote code to tunnel traffic from users in countries with internet censorship to the uncensored internet via Google Chat and Tor (Tor).
    • Helped build initial versions of iCARE, a cancer research and networking nonprofit.
    Technologies: Python, PHP, JavaScript, HTML, Matplotlib, Mathematics


  • Model Validation Toolkit

    Over the course of several internal R&D efforts at FINRA, I built a series of tools to help the quality assurance department validate, interpret, and monitor machine learning models to the extent that this could be done systematically. FINRA open-sourced this as its Model Validation Toolkit, and I presented it at ODSC East 2022.

  • Tlang

    Tlang is a Python library and domain-specific language for building transpilers. Inspired by a project in which millions of lines of SQL needed to be updated to match new table layouts, Tlang seeks to make it as easy as possible for a Python developer with no prior background in parsing or compiler theory to write clean and efficient transpilers for major enterprise migrations.

  • Hexchat

    I wrote internet censorship circumvention software for Tor that tunnels TCP connections over arbitrary numbers of XMPP chat lines, circumventing bandwidth limitations imposed by the hosts. Packets were fragmented and distributed across the bandwidth-limited chat lines, and fragments were reassembled in a queue at the receiving end.
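
    The fragment-and-reassemble idea can be illustrated with a minimal Python sketch. This is not the Hexchat code: the names `fragment`, `distribute`, and `Reassembler`, the round-robin assignment, and the use of a heap as the receiving queue are all assumptions made for illustration, and the chat lines are modeled as plain lists rather than real XMPP connections.

```python
import heapq
from typing import List, Tuple

# Hypothetical fragment format: (sequence number, payload chunk).
Fragment = Tuple[int, bytes]


def fragment(data: bytes, chunk_size: int) -> List[Fragment]:
    """Split data into numbered chunks of at most chunk_size bytes."""
    return [
        (seq, data[start:start + chunk_size])
        for seq, start in enumerate(range(0, len(data), chunk_size))
    ]


def distribute(fragments: List[Fragment], n_lines: int) -> List[List[Fragment]]:
    """Assign fragments to n_lines chat lines round-robin (an assumed policy)."""
    lines: List[List[Fragment]] = [[] for _ in range(n_lines)]
    for seq, chunk in fragments:
        lines[seq % n_lines].append((seq, chunk))
    return lines


class Reassembler:
    """Buffer out-of-order fragments and release them in sequence order."""

    def __init__(self) -> None:
        self._heap: List[Fragment] = []  # min-heap keyed on sequence number
        self._next_seq = 0               # next sequence number to release

    def receive(self, frag: Fragment) -> bytes:
        """Accept one fragment; return any bytes that are now contiguous."""
        heapq.heappush(self._heap, frag)
        ready = bytearray()
        while self._heap and self._heap[0][0] == self._next_seq:
            _, chunk = heapq.heappop(self._heap)
            ready += chunk
            self._next_seq += 1
        return bytes(ready)
```

    Feeding the receiver the contents of each line in any order yields the original byte stream once every fragment has arrived. Keeping each fragment small is what lets a single connection stay under a per-line bandwidth cap.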

  • Kahler

    I developed and reported on an efficient, parallelized finite element framework capable of simulating electromagnetic and thermal radiation in one, two, and three dimensions with arbitrary discretizations and boundary conditions.


  • Languages

    Python, Bash, SQL
  • Libraries/APIs

    XGBoost, Scikit-learn, Matplotlib, TensorFlow, Keras, Sockets
  • Tools

    Jupyter, Vim Text Editor, Tmux, Git, Plotly, Spacemacs
  • Platforms

    MacOS, Linux
  • Other

    iTerm2, Mathematics, Statistics, Machine Learning, Bayesian Statistics, Explainable Artificial Intelligence (XAI), Deep Learning, Model Validation, Classification, Compilers, TCP/IP, TCP, XMPP


  • Bachelor's Degree in Physics
    2013 - 2015
    University of Maryland, Baltimore County - Catonsville, Baltimore County, Maryland, United States
