Dragos Tudor, Technical Leader and Developer in London, United Kingdom
Member since June 10, 2018
Dragos is a technical leader who has touched the lives of 1.5 million users and generated $50 million in business value by building and deploying machine learning solutions for international enterprises, SMEs, and startups. He has worked across the entire engineering pipeline with both executives and data scientists, and has built production-ready recommender systems, advanced NLP models, time-series forecasting data products and classifiers, and other custom advanced analytics capabilities.

Portfolio

  • Quasar Labs
    Data Analysis, Data Engineering, Computer Vision, Deep Learning, XGBoost...
  • DataZip
    Amazon Web Services (AWS), Data Analysis, Data Engineering, Computer Vision...
  • Tessian
    Data Analysis, Data Engineering, Deep Learning, XGBoost...

Experience

  • Python 5 years
  • Data Science 5 years
  • Technical Leadership 4 years
  • Neural Networks 4 years
  • Natural Language Processing (NLP) 4 years
  • TensorFlow 3 years
  • Keras 3 years
  • XGBoost 3 years

Location

London, United Kingdom

Availability

Part-time

Preferred Environment

Google Cloud Platform (GCP), Amazon Web Services (AWS), Amazon WorkSpaces, Amazon SageMaker, Python, R, TensorFlow, Linux

The most amazing...

...project was where I built transformer-based deep learning models and custom embeddings on 1.5 billion multilingual emails for detecting spear-phishing attacks

Employment

  • Founder | Senior Data Scientist

    2018 - PRESENT
    Quasar Labs
    • Consulted enterprise, SMB, and startup clients on the implementation of cutting-edge machine learning capabilities for a variety of use cases with the express goal of increasing performance and impact.
    • Communicated with executives, senior managers, and teams of data scientists from over 20 companies and over 40 countries.
    • Implemented deep learning neural networks using CNNs in TensorFlow for object detection and recognition (earthquake impact detection, receipt text detection, valve defect detection, and wear-and-tear detection).
    • Built custom learners for revenue forecasting in retail using seasonal ARIMA and RNNs on 85 GB of hourly sampled data. Deployed models in a real-time production environment. Used Docker, Flask, AWS, PostgreSQL, and MySQL Server.
    • Implemented OCR (optical character recognition) for automated receipt text extraction and classification using Google OCR, TensorFlow, Flask, and Keras.
    • Developed an end-to-end training pipeline to predict user churn for a telecom client in the Bahamas. The architecture leveraged time-to-event RNNs and gradient boosted decision trees.
    Technologies: Data Analysis, Data Engineering, Computer Vision, Deep Learning, XGBoost, Artificial Intelligence (AI), Keras, Neural Networks, Natural Language Processing (NLP), Python, Data Science, Data Analytics, Spark Streaming, Flask, Sentiment Analysis, Technical Leadership, Data Reporting, Machine Learning, Exploratory Data Analysis, Statistical Analysis, SQL, TensorFlow, R
  • Founder

    2019 - 2020
    DataZip
    • Collected, processed, and controlled the distribution of auto dual dash-cam imagery and telematics data, as well as healthcare imagery.
    • Built pipelines for cleaning, processing, classification, and anomaly detection applied to 1080p and 720p footage at 30 fps.
    • Synchronized the telematics and dash-cam video footage using audio recordings, Fast Fourier Transform (FFT) convolutions, de-noising, and signal processing techniques.
    • Implemented image semantic segmentation, road object classification, and identification of rapid decelerations/braking, near-misses, and collisions.
    • Managed client interactions, projects, and development.
    Technologies: Amazon Web Services (AWS), Data Analysis, Data Engineering, Computer Vision, Deep Learning, XGBoost, Artificial Intelligence (AI), Keras, Neural Networks, Python, Data Science, Data Analytics, Statistical Data Analysis, Flask, Sentiment Analysis, Technical Leadership, Android, Data Reporting, Machine Learning, Exploratory Data Analysis, Statistical Analysis, AWS, TensorFlow, OpenCV
  • Data Scientist | Natural Language Research Engineer

    2019 - 2019
    Tessian
    • Developed language models, transfer learning, text analysis/classification and clustering, few-shot learning, embeddings, and attention RNN networks across 100GB of email data.
    • Pioneered techniques such as unsupervised data augmentation, weak supervision in Snorkel MeTaL, and multi-task learning for malicious data classification.
    • Implemented end-to-end machine learning models, in production, using TensorFlow, AWS S3/Athena and SageMaker on both CPU and GPU based architectures.
    • Proactively explored and analysed the compatibility of string similarity matching using one-shot learning and siamese networks across multiple use cases.
    • Implemented various codebase improvements, testing automation, parallelized processing, and documentation design.
    Technologies: Data Analysis, Data Engineering, Deep Learning, XGBoost, Artificial Intelligence (AI), Keras, Neural Networks, Natural Language Processing (NLP), Python, Data Science, Data Analytics, Statistical Data Analysis, Spark Streaming, Flask, Sentiment Analysis, Technical Leadership, Data Reporting, Machine Learning, Exploratory Data Analysis, Statistical Analysis, AWS Athena, AWS DynamoDB, AWS S3, Docker, Bash, TensorFlow
  • Data Scientist

    2018 - 2018
    Apsara Capital
    • Led the development and implementation of the data analysis and research infrastructure.
    • Developed the AWS S3, Lambda, EC2, and Docker orchestration for extracting, processing, and storing financial, economic, and market data from the Thomson Reuters Eikon API.
    • Built an NLP language model using Snorkel and MeTaL for the analysis of earnings call transcripts.
    • Created the technical analysis infrastructure using R and a set of 20 customizable technical indicators.
    • Designed the codebase, automated the testing, integrated the production, and generated and managed documentation.
    Technologies: Quantitative Modeling, Data Analysis, Data Engineering, Deep Learning, XGBoost, Artificial Intelligence (AI), Keras, Neural Networks, Natural Language Processing (NLP), Python, Data Science, Data Analytics, Statistical Data Analysis, Flask, Sentiment Analysis, Technical Leadership, Data Reporting, Machine Learning, Exploratory Data Analysis, Statistical Analysis, R, Amazon Kinesis Data Firehose, AWS Glue, AWS Athena, AWS S3
  • Data Scientist

    2017 - 2018
    Tracktics GmbH
    • Analyzed time series data for motion classification and identification of activity bursts using CNN, Bayesian models, and Monte Carlo simulations.
    • Supported the development of the analytical pipeline and user segmentation capabilities using AWS S3, AWS Lambda, and EC2.
    • Implemented data management and visualization with AWS SQS, S3, DynamoDB, Python, and Pandas/Bokeh.
    • Developed a general motion analysis over triaxial accelerometer, gyroscope, and magnetometer data, in addition to GPS and video.
    • Proactively researched sports analytics, documentation management, scrum integration and agile methodologies.
    Technologies: Data Analysis, Data Engineering, Deep Learning, XGBoost, Artificial Intelligence (AI), Keras, Neural Networks, Python, Data Science, Data Analytics, Statistical Data Analysis, Flask, Data Reporting, Machine Learning, Exploratory Data Analysis, Statistical Analysis, JavaScript, Amazon Web Services (AWS), Django
  • Data Scientist | Analyst

    2017 - 2018
    PredictX
    • Took the initiative and improved sales forecasting capabilities by more than 20% as part of an MVP for a retail client with 700 POS. Used tree-based and linear models with 40 TB+ of external variables such as weather, events, and client-specific metrics.
    • Drove business decisions by researching, testing, and integrating various regression and classification-based models using Python Scikit-learn, TensorFlow, and Keras.
    • Led the implementation of end-to-end ETL processes using Python, MySQL, PostgreSQL, and Knime.
    • Applied association rule mining with Neo4j Graph data representations for product recommendations in retail. Replicated results in production and supported the transition of the research initiative to a new market-ready product.
    • Developed an insurance algorithm for seismic and flood risk computation using MCMC.
    • Delivered codebase improvements via the use of in-memory processing with Spark and Hadoop.
    Technologies: Quantitative Modeling, Data Analysis, Data Engineering, Deep Learning, XGBoost, Artificial Intelligence (AI), Keras, Neural Networks, Natural Language Processing (NLP), Python, Data Science, Data Analytics, Statistical Data Analysis, Flask, Sentiment Analysis, Data Reporting, Machine Learning, Exploratory Data Analysis, Statistical Analysis, KNIME, TensorFlow, Neo4j, MySQL, JavaScript
  • Research Assistant

    2016 - 2017
    University of Glasgow — Urban Big Data Centre
    • Started with no knowledge of machine learning and coding and ended up building an e-commerce recommender system that relied on RNNs and collaborative filtering to predict user-product relevance.
    • Learned C# from scratch and developed an Android app using Xamarin, which aimed to collect sensitive data from mobile devices. Developed the solution end-to-end (both front/back end and documentation) and paired it with a MySQL database for storage.
    • Manipulated high-dimensional datasets (120 GB+) for feature creation using Python Pandas, PostgreSQL, RDD in Hadoop DFS and Spark. Visualized the data using Tableau, Stata, and LaTeX.
    • Reviewed, replicated and analysed a variety of state of the art research papers about recommender systems, information retrieval and distributed systems.
    • Used GPU and parallel computing for modelling 100 GB+ datasets and Spark and Hadoop in a research environment on an on-premise cluster.
    Technologies: Data Analysis, Data Engineering, Deep Learning, XGBoost, Artificial Intelligence (AI), Keras, Neural Networks, Natural Language Processing (NLP), Python, Data Science, Data Analytics, Statistical Data Analysis, Sentiment Analysis, Android, Machine Learning, Exploratory Data Analysis, Statistical Analysis, STATA, LaTeX, Spark, Hadoop, Xamarin, Java, C#
  • Assistant Brand Manager

    2015 - 2015
    Procter & Gamble
    • Led a competitive analysis initiative across nine SEE regions.
    • Co-led a team of 5-10 people for launching Pampers Premium Care’s biggest innovation in the past five years and a Pampers UNICEF PR campaign across four SEE regions.
    • Identified pricing gaps and researched and presented viable solutions to increase the company’s competitiveness in four SEE regions.
    Technologies: Data Analysis, Data Science, Data Analytics, Statistical Data Analysis, Analytics, Marketing, Branding, Management, Microsoft Excel, Microsoft PowerPoint
  • Co-founder

    2014 - 2015
    Crowd Augur
    • Designed the project to harness video gamers’ actions for augmenting data analysis algorithms. Started in collaboration with five McGill-based bioinformatics and computer science researchers.
    • Took the initiative to secure meetings with top executives, which led to several partnership agreements and four qualified clients from healthcare and finance.
    • Proposed and developed a unique business model for bringing more accurate data analysis to genomics and finance.
    • Ranked 5/150+ in the McGill University’s Dobson Startup Cup.
    Technologies: Data Analysis, Data Science, Statistical Data Analysis, Leadership, Research, Analysis, Microsoft PowerPoint, Strategy, Business Strategy, Microsoft Excel
  • Assistant Manager

    2010 - 2014
    Maximal Group
    • Proposed, built, and promoted (SEO, Google AdWords, and Analytics) the company’s first online store. This initiative led to a 4x increase in new customer acquisition and a 13% increase in sales in the first three months.
    • Took the initiative to propose and coordinate a Kaizen/Lean-inspired waste reduction program that contributed to a 30% leftover reduction.
    • Managed suppliers and negotiated bulk purchases, which led to a 5% reduction in raw material costs.
    Technologies: Data Analysis, Data Science, Statistical Data Analysis, Technical Leadership, Management, Lean, Warehouses, Statistical Modeling, WordPress, CSS, HTML, Python 3

Experience

  • Satellite Building Damage Detection (Development)
    https://github.com/tudoriliuta/CollapseView

    I trained a CNN (convolutional neural network) in TensorFlow to recognize houses in satellite imagery. The aim was to re-run the model on post-earthquake images to identify collapsed units. It achieved 97%+ accuracy.

  • Traffic Accident Modeling (Development)
    https://github.com/tudoriliuta/RoadAccidentPrediction

    I built a model for visualizing clusters of road accidents across the UK. I used KDE and XGB for visualizing and modeling road accidents.
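
    As a rough illustration of the KDE step only (a minimal sketch, not the project's actual code), a 2-D Gaussian kernel density estimate can be written in plain Python. The function name `gaussian_kde_2d`, the bandwidth, and the coordinates are hypothetical placeholders rather than real accident data; the XGBoost modeling step is not shown.

```python
import math


def gaussian_kde_2d(points, bandwidth):
    """Return a function that estimates the density at (x, y) from 2-D points."""
    n = len(points)
    # Normalizing constant of an isotropic 2-D Gaussian kernel, averaged over n points.
    norm = 1.0 / (n * 2.0 * math.pi * bandwidth ** 2)

    def density(x, y):
        total = 0.0
        for px, py in points:
            squared_distance = (x - px) ** 2 + (y - py) ** 2
            total += math.exp(-squared_distance / (2.0 * bandwidth ** 2))
        return norm * total

    return density


# Hypothetical accident locations in arbitrary planar units (illustrative only).
accidents = [(0.0, 0.0), (0.1, 0.1), (0.2, 0.0), (5.0, 5.0)]
density = gaussian_kde_2d(accidents, bandwidth=0.5)

hotspot = density(0.1, 0.0)   # close to three of the four points
isolated = density(5.0, 5.0)  # close to a single point
quiet = density(2.5, 2.5)     # far from every point
print(hotspot > isolated > quiet)
```

    Evaluating `density` on a grid gives the values a heatmap of accident clusters would be drawn from; the bandwidth controls how tightly those clusters are resolved.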

  • Mood Music (Development)
    https://github.com/tudoriliuta/MoodMusic

    This is a project where the music adapts to your emotions with data extracted from your own webcam.

  • Association Rule Learning for eCommerce (Other amazing things)

    I boosted a UK-based industrial retail client's revenues by 11% by recommending opportunities to upsell.
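
    To show what association rule learning computes (a minimal sketch under assumed inputs, not the client implementation), the snippet below derives support, confidence, and lift for item pairs in plain Python. The function `pair_rules`, the thresholds, and the baskets are hypothetical and chosen only for illustration.

```python
from itertools import combinations


def pair_rules(baskets, min_support=0.3, min_confidence=0.6):
    """Return (antecedent, consequent, support, confidence, lift) for item pairs."""
    n = len(baskets)
    item_count = {}
    pair_count = {}
    for basket in baskets:
        items = sorted(set(basket))
        for item in items:
            item_count[item] = item_count.get(item, 0) + 1
        for pair in combinations(items, 2):
            pair_count[pair] = pair_count.get(pair, 0) + 1

    rules = []
    for (a, b), count in pair_count.items():
        support = count / n  # share of baskets containing both items
        if support < min_support:
            continue
        for lhs, rhs in ((a, b), (b, a)):
            confidence = count / item_count[lhs]      # P(rhs | lhs)
            lift = confidence / (item_count[rhs] / n)  # confidence relative to P(rhs)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, support, confidence, lift))
    # Highest lift first: these are the strongest upsell candidates.
    return sorted(rules, key=lambda rule: rule[4], reverse=True)


# Hypothetical baskets for an industrial retailer (not real client data).
baskets = [
    ["drill", "drill bits", "gloves"],
    ["drill", "drill bits"],
    ["gloves", "tape"],
    ["drill", "tape"],
    ["drill bits", "drill", "tape"],
]
rules = pair_rules(baskets)
for lhs, rhs, support, confidence, lift in rules:
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}, lift={lift:.2f}")
```

    A rule with lift above 1 means the consequent is bought more often alongside the antecedent than on its own, which is the signal an upsell recommendation would rely on.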

  • Housing Market Price Prediction (Development)

    This project consists of two main parts:
    1. London housing market price predictions—stacked learners and seasonal ARIMA-based models.
    2. Forecasted the error of Zillow's internal model better than 93% of other submitted models; used stacked models in Python.

  • DermaView: Skin Lesion Detection, Segmentation, and Categorization (Development)

    I used RCNN/DCNN and CRF on 50,000+ samples (ISIC, scraped and generated imagery) for identifying over 1,000 skin condition subtypes from HD images.

  • Allergen-aware Food Recipe Recommendations Using Graph Embeddings (Development)

    The project's goal was to unify and structure the existing knowledge about dietary preferences, allergens, intolerances, and their interactions. Given the distribution of various allergens, both IgE and non-IgE (delayed response), the goal was to identify the likelihood of one of them to be present in a specific ingredient and, therefore, to pose a threat to the user of an app.

    Some users might be allergic to peanuts, which might not be an issue if the dish contains Brazil nuts. Similarly, a user might be intolerant to peanuts, but not if the amount in a given dish is small.

    For the two types of users, the perceived risk can differ. In the first case, the user perceives Brazil nuts as dangerous, while their real risk is low (the restaurant might also process groundnuts), and in the second case, the perceived risk is medium, but the user can decide if it’s acceptable. All of these allergen-ingredient "risk" relationships are approved by an expert and categorized.

  • Secure Aggregation, Analysis, and Sharing of DICOM Radiology Data (Development)
    https://www.quasarlabs.co/holo

    Hospitals' DICOM imagery is curated, securely stored, anonymized, calibrated, and automatically annotated by using custom computer vision algorithms, at scale.

    Access to anonymized imagery is offered on-demand, to verified research departments, startups, and other partners, via virtual machines (VMs) hosted in a private cloud with strict data management and exfiltration prevention protocols.

Skills

  • Languages

    Python, SQL, C#, R, Bash, Python 3, HTML, CSS, Java, JavaScript
  • Frameworks

    Spark, Scrapy, Hadoop, Flask, Django
  • Libraries/APIs

    SciPy, NumPy, Sklearn, TensorFlow, PySpark, XGBoost, Keras, Pandas, Matplotlib, NLTK, OpenCV, Spark ML, AWS EC2 API, NetworkX, Spark Streaming
  • Tools

    Tableau, AWS Athena, PyCharm, IPython Notebook, Amazon SageMaker, Amazon WorkSpaces, Reuters Eikon, Amazon SQS, TensorBoard, LaTeX, AWS Glue, Microsoft PowerPoint, Microsoft Excel
  • Paradigms

    Requirements Analysis, Object-oriented Programming (OOP), Data Science, Siamese Neural Networks, Management
  • Platforms

    Amazon Web Services (AWS), AWS EC2, iOS, Windows, Jupyter Notebook, AWS Lambda, Linux, Docker, Ubuntu, KNIME, Android, WordPress, Google Cloud Platform (GCP)
  • Storage

    MySQL, AWS S3, MongoDB, Databases, AWS DynamoDB, Neo4j, Graph Databases
  • Industry Expertise

    Project Management, Retail & Wholesale, Healthcare, Branding, Marketing
  • Other

    Machine Learning, Data Analysis, Data, Unstructured Data Analysis, Complex Data Analysis, Scientific Data Analysis, Exploratory Data Analysis, Prescriptive Analytics, Prescriptive Modeling, Predictive Analytics, Statistical Analysis, Random Forest Regression, Regression, Regression Models, Decision Tree Regression, Logistic Regression, Linear Regression, Regression Modeling, Classification, Classification Algorithms, Text Classification, Decision Tree Classification, Stacked Ensemble, Startups, Early-stage Startups, Enterprise Startups, High-tech Startups, Lean Startups, Startup Consulting, Time Series Analysis, Predictive Modeling, Data Reporting, Statistics, Data Engineering, OCR, Image Analysis, Statistical Modeling, Statistical Data Analysis, Neural Networks, Statistical Forecasting, Communication, Data Analytics, Natural Language Processing (NLP), Image Recognition, Computer Vision, Natural Language Understanding (NLU), Artificial Intelligence (AI), Artificial Neural Networks (ANN), Deep Neural Networks, Convolutional Neural Networks, Recurrent Neural Networks, Gradient Boosting, Gradient Boosted Trees, Ensemble Methods, Bootstrapping, Deep Learning, Demand Sizing & Segmentation, Image Processing, Signal Processing, Technical Leadership, Sentiment Analysis, Quantitative Modeling, Leadership, Strategy, BERT, Computer Vision Algorithms, Explainable Artificial Intelligence (XAI), Unsupervised Learning, Parquet, Education, Radiology, AWS, Amazon Kinesis Data Firehose, Warehouses, Lean, Analytics, Business Strategy, Analysis, Research, Software Engineering, Lean Project Management, Grakn, Directed Acyclic Graphs (DAG), GNN, GraphSAGE, Food Safety, Food Science, DICOM, Healthcare IT, Healthcare Management Systems

Education

  • Graduate diploma in Mathematics
    2017 - 2019
    London School of Economics - London, UK
  • Master's degree in Economics, Econometrics, and Management
    2012 - 2016
    University of Glasgow - Glasgow, Scotland
  • Exchange in Strategy and Computer Science
    2014 - 2015
    McGill University - Montreal, Canada
  • Bachelor's degree in Mathematics and Management
    2011 - 2012
    Babeș-Bolyai University - Cluj-Napoca, Romania
