Alex Eftimiades
Verified Expert in Engineering
Data Scientist and Developer
Lenoir, United States
Toptal member since May 31, 2022
Alex is an experienced data scientist, statistician, and Python engineer. He has built models that identify financial crime and classify text communications using tools ranging from XGBoost to cutting-edge research and deployed them on AWS Lambda from Docker containers. He authored the now open source Model Validation Toolkit, used at FINRA to perform statistically rigorous model validation and monitoring.
Experience
- Bash - 12 years
- Git - 10 years
- Python - 10 years
- Linux - 8 years
- Deep Learning - 4 years
- Statistics - 4 years
- Machine Learning - 4 years
- Explainable Artificial Intelligence (XAI) - 3 years
Preferred Environment
MacOS, Linux, Jupyter, Vim Text Editor, iTerm2, Tmux, Spacemacs, Python
The most amazing...
...thing I've developed is the Model Validation Toolkit, which became open source after two years of internal R&D on validation and monitoring at FINRA.
Work Experience
Applied ML Scientist
Penguin Random House
- Built a Facebook and Instagram ad generation and monitoring pipeline using Python and Kubernetes.
- Presented an approach and A/B testing techniques at Data Science Salon.
- Built a model to predict how video adaptations of books would increase sales.
Lead Data Scientist
FINRA
- Led the deployment of NLP models in production using Docker and Lambda on AWS, reducing costs by 80%.
- Developed and open-sourced a toolkit, based on R&D efforts, for validating and monitoring machine learning models (https://finraos.github.io/model-validation-toolkit/). Presented it at ODSC East 2022.
- Mentored junior data scientists and led regular data science-related sessions and workshops.
- Developed supervised and unsupervised models to identify insider trading (XGBoost; 96% AUC), market manipulation (DBSCAN), and fraud (Bayesian analysis), and to triage external communication (XGBoost, sklearn, and BERT).
- Led R&D efforts on interpretable machine learning, model validation and monitoring, and various ensemble models.
- Gave internal talks on software engineering for data scientists, countering sample bias, measuring model drift, thresholding, and normalizing flows.
- Developed and conducted a technical interview process and brought on seven data scientists.
Analytics Engineer
Catalist LLC
- Optimized, parallelized, and deployed an NLP model with Keras.
- Wrote a SQL parser in Python that refactored over one million lines of legacy SQL scripts.
- Designed and wrote a data processing pipeline for election results as they became available the night of an election.
- Wrote internal technical guides on parallel processing.
Developer
Comsol
- Researched models and techniques to simulate physical phenomena of interest to engineers and scientists.
- Wrote technical specifications for new front-end and back-end components.
- Implemented algorithms used for numerical simulations and user interfaces in Java.
Freelance Developer
Self-Employed
- Used dynamic programming to reduce the run time of a quantum computing simulation from five days to 50 minutes (UMBC Physics Department).
- Performed data visualization and image processing with Python; named second author on the publication summarizing the results (American Dental Association Foundation).
- Wrote code that tunnels internet traffic for people in countries with internet censorship to the uncensored internet via Google Chat and Tor (Tor).
- Helped build initial versions of iCARE, a cancer research and networking nonprofit.
Experience
Model Validation Toolkit
https://finraos.github.io/model-validation-toolkit/
Tlang
https://github.com/aeftimia/tlang
Hexchat
https://github.com/aeftimia/hexchat
Kahler
https://github.com/aeftimia/kahler
Education
Bachelor's Degree in Physics
University of Maryland, Baltimore County - Catonsville, Baltimore County, Maryland, United States
Skills
Libraries/APIs
XGBoost, Scikit-learn, Matplotlib, TensorFlow, Keras, Sockets, Pandas
Tools
Jupyter, Vim Text Editor, Tmux, Git, Plotly, Spacemacs
Languages
Python, Bash, SQL
Paradigms
ETL
Platforms
MacOS, Linux, Amazon Web Services (AWS), Kubernetes
Other
iTerm2, Mathematics, Statistics, Machine Learning, Bayesian Statistics, Explainable Artificial Intelligence (XAI), Deep Learning, Model Validation, Classification, Data Science, Data Modeling, Data Visualization, Natural Language Processing (NLP), Forecasting, Compilers, TCP/IP, Transmission Control Protocol (TCP), XMPP