Radu Nedelcu, Developer in London, United Kingdom

Radu Nedelcu

Verified Expert in Engineering

Data Scientist and AI Developer

Location
London, United Kingdom
Toptal Member Since
April 9, 2020

Radu has been writing code since the age of 14. He has since built a career in data science, working on projects in computer vision, text analysis, and financial data algorithms. His engineering background, paired with his algorithm expertise, enables him to work on the full pipeline, from idea generation and proof of concept through product development to production. He's excited about his next challenge and can't wait to get started.

Portfolio

SONY
OpenCV, Python 3, Spark, Amazon Web Services (AWS), Deep Learning, PyTorch...
University of London
Data Science, Spark, Hadoop, MapReduce, Data Modeling, Jupyter
Future Anthem
Spark, Recommendation Systems, Python 3, Delta Lake, Microsoft Power BI...

Experience

Availability

Part-time

Preferred Environment

Transformers, Jupyter Notebook, Keras, Pandas, Bash, PyCharm, PySpark, Scikit-learn, PyTorch, Image Processing

The most amazing...

...thing I've built was a vectorization method of text and entities for a news app such that our users could get news recommendations based on multiple topics.

Work Experience

Senior Data Scientist

2021 - PRESENT
SONY
  • Experimented with various technologies to build a better user experience as part of the PlayStation team.
  • Gathered requirements from various stakeholders, then collected and aggregated data from various data sources using technologies such as Alation, Snowflake, SageMaker, AWS EMR, and Databricks.
  • Worked on player-to-player clustering and recommenders based on their game activity.
  • Changed avatar emotions based on people's faces using GANs.
  • Dockerized projects that had to be shared or deployed.
  • Worked on the delivery of a multi-million pound deep learning research infrastructure that involved various suppliers and stakeholders.
  • Developed a computer vision-based neural network that classified whether images were of good quality based on what they contained.
Technologies: OpenCV, Python 3, Spark, Amazon Web Services (AWS), Deep Learning, PyTorch, Computer Vision

University of London Tutor

2020 - PRESENT
University of London
  • Provided online tutoring for the bachelor's degree in Computer Science and the master's degree in Data Science.
  • Answered student questions about financial data modeling, Hadoop, Spark, Python, and cluster processing.
  • Organized webinars for the students that covered a range of topics and prepared them for their mid-terms and finals.
  • Graded coursework and exams for various modules such as Big Data and Software Development.
Technologies: Data Science, Spark, Hadoop, MapReduce, Data Modeling, Jupyter

Senior Data Scientist

2021 - 2022
Future Anthem
  • Aggregated and wrangled data using PySpark in Databricks on Azure.
  • Set up a recommendation system with three subsystems that would recommend games to users.
  • Built a user-item recommendation subsystem based on cosine similarity to make recommendations to new users.
  • Created a sequence-based recommendation system that could be used to make recommendations to early-stage users.
  • Constructed a collaborative filtering system based on implicit feedback using LightFM. The system was trained using the number of plays a user had in a game.
  • Built dashboards and performed data analysis to understand how new Future Anthem customers are performing and to help them get better results.
  • Delivered part of the work through other engineers from the Disruptive Engineering team, whom I managed.
Technologies: Spark, Recommendation Systems, Python 3, Delta Lake, Microsoft Power BI, Apache Spark, ETL

Senior Data Scientist

2020 - 2022
ContractPod AI
  • Worked on information extraction from legal documents.
  • Built an API to understand whether contracts are signed or not based on computer vision and NLP.
  • Researched methodologies for signature detection and obtained open-source, free data to train on.
  • Fine-tuned YOLO to detect signatures to an accuracy of 80%.
  • Built a dotted line detector to extract lines in documents using OpenCV.
  • Created a graph that represented the document and all the extractions.
  • Developed a signature requirement classifier that used an ensemble of mechanisms such as word density, dotted line presence, and neighboring words. The classifier had 90% accuracy on the test set.
  • Built a matching algorithm that matched signature requirements to the signatures. The API was deployed on CUDA-enabled Docker containers.
  • Created and conducted interviews to expand the team and offered support and mentorship to team members.
  • Built a contract clause comparison API to determine whether clauses in contracts match pre-approved clauses across multiple languages. Used a pre-trained BERT transformer that was fine-tuned with in-house data and deployed with Docker containers.
Technologies: Transformers, Data Science, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Natural Language Toolkit (NLTK), SpaCy, Flask, Hugging Face, Jupyter, Apache Spark, ETL, Deep Learning, Computer Vision

Senior Data Scientist

2020 - 2020
Sprout AI
  • Led a small team of consultants to improve information extraction from claims.
  • Performed error analysis to understand current system results and what subsystems needed to be improved.
  • Annotated damaged items in insurance claims to build a custom model.
  • Trained an NER model to detect damaged items in claims using Hugging Face Transformers, reaching an F1 score of 75%.
Technologies: Data Science, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), SpaCy, Jupyter, Deep Learning

Senior Data Scientist

2020 - 2020
Foreign, Commonwealth & Development Office - UK Government
  • Defined and explained a number of experiments that could improve information extraction from news worldwide.
  • Scraped news from news websites and cleaned and deduplicated them.
  • Built an MVP of an automated topic detection mechanism in the news using LDA and extracted topic names.
  • Aggregated processed data into a Power BI visualization.
Technologies: Gensim, SpaCy, Natural Language Toolkit (NLTK), Microsoft Power BI, Agile Data Science, Jupyter

Senior Data Scientist

2020 - 2020
Fortress AI
  • Consulted on the strategic direction to implement machine learning on network devices for home environments.
  • Researched ad blocking with machine learning, scraped ads, and built an MVP of an ad-blocking mechanism using machine learning on JavaScript, with TF-IDF features and logistic regression.
  • Researched information about doing QoS (quality of service) with machine learning and produced a report.
Technologies: Web Scraping, Scikit-learn, Pandas, Jupyter

Technical Trainer

2020 - 2020
OpenClassrooms
  • Developed a practical introductory course on deep learning.
  • Wrote a 3-part course that aimed to introduce students to deep learning, focusing on practicality and simple explanations. The course had the main theme of students working for a pizza company that uses machine learning.
  • Focused the first part on the differences between traditional machine learning and deep learning; the second on neurons, how they work, and fully connected networks; and the third part on convolutional neural networks and recurrent neural networks.
  • Developed a number of practical examples that the students are encouraged to follow and develop in their Jupyter Notebooks to better understand and have a reference tool later on.
Technologies: Linux, Keras, Teamwork, Data Visualization, Pandas, Machine Learning, Jupyter Notebook, Python 3, Jupyter

Senior Data Scientist

2020 - 2020
Cabinet Office
  • Worked on the discovery and alpha phases aimed at understanding user problems and creating MVPs.
  • Defined and explained a number of experiments that could improve knowledge management, such as faceted search and classifiers for different tags.
  • Participated in a number of user interviews to better understand their working methods.
  • Wrote a number of small-scale experiments to test ideas.
  • Built, cleaned, and labeled datasets for the tasks.
  • Created a document type classifier that was able to distinguish between documents based on keywords and structure with an accuracy of 90%. The system used Apache Tika and spaCy to extract features and scikit-learn to build the classifier.
  • Created a duplicate document and near-duplicate document detector using MinHash to make it easy to avoid duplication and understand related documents.
  • Built a 100,000-node knowledge graph using spaCy, DBpedia, Gensim, and Neo4j to better understand connections between people and important topics in the documents.
  • Received a feature for the project in The Times: https://www.thetimes.co.uk/article/ai-trawls-20-000-miles-of-state-papers-j0l9k5gx9.
Technologies: Linux, Teamwork, Data Visualization, Pandas, Machine Learning, Agile Data Science, Scikit-learn, Jupyter

Data Scientist | Machine Learning Engineer

2019 - 2020
Ernst & Young
  • Researched public and internal information on ML models for mergers and acquisitions and participated in workshops to generate ideas for potential use cases of ML in the M&A process.
  • Performed data cleaning to ensure entities existed at different points in time and correct merging of entities from different datasets based on dates.
  • Created the first proof of concept models for applications of ML for M&A using Pandas and random forests in scikit-learn.
  • Set up the ML architecture to ensure integration with the engineering architecture in Azure and selected Databricks. It allows the use of Spark for cluster-based data processing and MLFlow for experiment tracking and deployment into Kubernetes.
  • Researched and experimented with a number of mechanisms for modeling imbalanced datasets: weight balancing, blagging (random forests where decision trees use undersampling), undersampling and oversampling, and transfer learning.
  • Analyzed multiple data sources and selected complementary data sources such as CapIQ for financial data, Factiva for news, and Oxford Economics for forecasts.
  • Managed the machine learning team and had duties such as planning the team's workload, providing guidance on priorities, planning the team structure and size, interviewing, and hiring.
  • Participated in user interviews to help shape how we built the algorithms and the platform on which they would be run. A simple product and model explainability were key takeaways.
  • Participated in a number of presentations to explain how machine learning works and how C-level stakeholders could use it.
  • Implemented a number of best practices in the team, such as random seed start, to get accurate scores of our models.
Technologies: Linux, Keras, Teamwork, Data Engineering, Data Visualization, Pandas, Machine Learning, Agile Data Science, Imbalanced-learn, Scikit-learn, MLflow, Databricks, PySpark, Python, Jupyter, Data Scraping, Finance

Data Scientist and Machine Learning Engineer

2017 - 2019
Serendipity AI
  • Helped put into practice a news classifier and created a topic/user-based news recommendation system using NLP.
  • Used named entity detectors from spaCy, DBpedia, and Jaccard similarity together with Levenshtein distance to detect and match named entities in news and other text data.
  • Developed a new vectorization method for the detected named entities in text and worked on a mechanism to qualify their expertise to different topics.
  • Deployed Spark, Hadoop, and HBase on a cluster of three computers to speed up the machine learning processing.
  • Developed an ML processing pipeline that allowed information to flow to HBase and be processed in parallel using PySpark. Every stage in the pipeline was designed as a microservice with access to only an input and an output table.
  • Implemented a recommendation system using a neural network set up as an autoencoder and cosine similarity from Spotify Annoy.
  • Brought to production level an article judging system. The system had a classification service and a training application. I used Celery to train every night and restart the judging service's worker pool when new models were available.
  • Improved the code quality and reduced repeated code across applications written in Flask and CherryPy by creating a shared library. Added a logging system based on Python logging that had handlers for local logging and Rollbar.
  • Created a number of APIs using Flask that ran on AWS and connected to Neo4j.
  • Set up a testing framework that would allow APIs to be tested before and after deployment using Jenkins and wrote integration tests for the APIs.
Technologies: Linux, Teamwork, Data Engineering, Data Visualization, Pandas, Machine Learning, Agile Data Science, SpaCy, Gensim, Scikit-learn, HBase, PySpark, Python, Jupyter, Data Scraping

Data Scientist and Machine Learning Engineer

2017 - 2017
Cappfinity
  • Researched and integrated an automatic machine learning algorithm picker in Python.
  • Researched Auto-Sklearn (Bayesian optimization for algorithm selection), TPOT (genetic algorithms for feature processing and algorithm selection), and NEAT (genetic algorithms for neural network evolution).
  • Developed the architecture for experimentation and result visualization for machine learning algorithms using services built with C# ASP.NET Core and Python-Flask, which communicate via REST and RabbitMQ.
  • Built the system's presentation layer using Angular 4.
  • Wrote a text extraction service from speech using Google Speech to Text API.
  • Integrated MongoDB and connected all the services to it so that they can save processing results.
  • Integrated all the applications in Docker with their own private network and Docker Compose to allow for continuous integration and faster deployment.
Technologies: Linux, Teamwork, Pandas, Machine Learning, Tree-Based Pipeline Optimization Tool (TPOT), Flask, TensorFlow, Scikit-learn, Python

Research Engineer

2016 - 2017
Oxehealth
  • Led the data engineering team and worked on big data microservices that would connect cameras installed on-site with Oxehealth’s data warehouse.
  • Worked on Oxehealth’s TechCrunch London live demo that connected a room in Oxford with a human being monitored to the stage in London.
  • Designed and developed the microservices architecture for video data retrieval from customer sites using ZeroMQ, gRPC, and Boost Program Options and Property Tree for C++.
  • Set up a VPN to connect customer deployments to a central data repository using pfSense.
  • Built a breathing robot that could replicate different breathing patterns.
  • Designed and developed an application that allowed for multiple room monitoring using Qt.
Technologies: Teamwork, Data Engineering, Machine Learning, RabbitMQ, ZeroMQ, Python, C++, C

Computer Vision and Algorithms Engineer

2016 - 2016
Meta Vision Systems
  • Designed the full stack from image capture and processing to point clouds sent over the network using multiple threads and a pipeline architecture to measure oil pipes with lasers and cameras.
  • Wrote general-purpose GPU (GPGPU) code to accelerate image processing algorithms (convolution and point extraction) via new kernels or through OpenCV, reducing processing time from 40 s to 40 ms for some code paths.
  • Implemented K-means and ordinary least squares algorithms through OpenCV for finding points of interest and then line fitting.
  • Designed and set up the network communication channels to transmit data, commands, and replies using Type Length Value (TLV) messages via Boost ASIO.
  • Designed and developed a logging system using Microsoft ETW.
  • Set up the Point Cloud Library (PCL) for surface reconstruction and visualization of STL files and point clouds.
  • Used Boost Property Tree to implement a configuration file parser that uses JSON files.
  • Deployed Jenkins for automatic build verification and to run test cases.
Technologies: Linux, Teamwork, Machine Learning, NVIDIA CUDA, C++, C, OpenCV

Software Engineer

2013 - 2016
Qualcomm
  • Wrote the first Windows driver for Qualcomm's NFC chip.
  • Participated in a number of integration activities where I helped set up new platforms with our NFC chip.
  • Worked on the launch of a Windows mobile phone that contained the chip I worked on.
  • Advised other teams across the globe on Windows driver development.
  • Developed a script in PowerShell for improving the team’s efficiency.
  • Debugged customer and partner issues and those arising during testing.
  • Trained new team members from different disciplines such as software engineering and testing.
Technologies: Linux, Teamwork, C++, C

M&A Predictor

I built an application that uses the financial data of public companies to predict whether they will go through a merger or acquisition event. The application was built using financial reports and more recent market data. The predictor had an F1 score of 0.2: on average, it returned 600 companies, of which around 100 were correct.
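
The arithmetic behind those figures can be checked with a short sketch. It assumes the rounded counts above are exact (600 companies returned, 100 of them correct); the recall it derives is implied by the stated F1 score and is not a reported number.

```python
# Rounded figures from the project description, treated as exact for illustration.
returned = 600   # companies flagged on average
correct = 100    # flagged companies that did go through an M&A event
f1 = 0.2         # stated F1 score

# Precision is the share of flagged companies that were correct.
precision = correct / returned

# F1 = 2PR / (P + R), so solving for recall gives R = F1 * P / (2P - F1).
implied_recall = f1 * precision / (2 * precision - f1)


def f1_score(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


print(f"precision={precision:.3f}, implied recall={implied_recall:.3f}")
print(f"F1 recomputed from both: {f1_score(precision, implied_recall):.3f}")
```

Under these assumptions, roughly one in six flagged companies was a true positive, which is consistent with the stated F1 score only if the model also recovered about a quarter of all actual events.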

News Recommendation System

I worked on a news recommendation system that allowed users to follow a range of different topics, such as those extracted by named entity recognition and some topics from the DBPedia Ontology.
A vector made out of the same features was extracted for all the different types above, and the system found recommendations using locality-sensitive hashing from Spotify Annoy.
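
A minimal sketch of the idea of placing users and articles in the same feature space and ranking by similarity. It is not the production system: it swaps Annoy's approximate search for an exact brute-force cosine similarity ranking, and the three-dimensional feature space, article names, and vectors are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(user_vector, article_vectors, k=2):
    """Return the ids of the k articles most similar to the user vector."""
    ranked = sorted(
        article_vectors.items(),
        key=lambda item: cosine_similarity(user_vector, item[1]),
        reverse=True,
    )
    return [article_id for article_id, _ in ranked[:k]]


# Hypothetical feature space: [politics, sport, technology].
articles = {
    "election-report": [0.9, 0.0, 0.1],
    "cup-final": [0.0, 1.0, 0.0],
    "chip-launch": [0.1, 0.0, 0.9],
}
user = [0.2, 0.0, 0.8]  # a user who mostly follows technology topics

top = recommend(user, articles, k=2)
print(top)
```

Brute-force ranking compares the user against every article, so it only suits small catalogs; an approximate nearest-neighbor index such as Annoy trades a little accuracy for much faster lookups.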

Document Type Classifier

A classifier that used information about the document structure and keywords inside it to classify documents into one of several types of documents available in the organization. The classification could then be used to make automatic retention or deletion decisions that saved the company millions of pounds.

Linked Documents Detector

A locality-sensitive hashing-based application that allowed for documents to be linked either because of perfect duplication or because they were being used as a template or were versions of another document. The application improved the organization's search systems by adding contextual search.
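
A minimal sketch of the MinHash technique that underlies this kind of detector, using only the Python standard library. The shingle size, number of hash functions, and sample texts are assumptions for illustration and do not reflect the deployed application.

```python
import hashlib


def shingles(text, size=3):
    """Split text into overlapping word n-grams (shingles)."""
    words = text.lower().split()
    return {" ".join(words[i:i + size]) for i in range(max(1, len(words) - size + 1))}


def minhash_signature(items, num_hashes=64):
    """For each seeded hash function, keep the smallest hash over all items."""
    signature = []
    for seed in range(num_hashes):
        salt = seed.to_bytes(8, "little")
        signature.append(min(
            int.from_bytes(
                hashlib.blake2b(item.encode("utf-8"), digest_size=8, salt=salt).digest(),
                "big",
            )
            for item in items
        ))
    return signature


def estimated_jaccard(sig_a, sig_b):
    """The fraction of matching positions estimates the Jaccard similarity."""
    return sum(1 for a, b in zip(sig_a, sig_b) if a == b) / len(sig_a)


# Hypothetical documents: a template, an edited version, and an unrelated text.
template = "this agreement is made between the supplier and the customer on the date below"
version = "this agreement is made between the supplier and the client on the date below"
unrelated = "quarterly sales figures for the northern region were higher than expected"

sig_template = minhash_signature(shingles(template))
sig_version = minhash_signature(shingles(version))
sig_unrelated = minhash_signature(shingles(unrelated))

print(estimated_jaccard(sig_template, sig_version))
print(estimated_jaccard(sig_template, sig_unrelated))
```

Exact duplicates produce identical signatures, while templates and edited versions share a large fraction of positions, so a similarity threshold can separate the two kinds of link.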

2010 - 2013

Bachelor of Engineering Degree with Honors in Electronic and Communications Engineering

London Metropolitan University - London, England

OCTOBER 2021 - PRESENT

Generative Adversarial Networks (GANs)

deeplearning.ai

DECEMBER 2020 - PRESENT

Natural Language Processing Specialization

Coursera - deeplearning.ai

FEBRUARY 2020 - PRESENT

Deep Learning Specialization

Coursera - deeplearning.ai

SEPTEMBER 2016 - PRESENT

Machine Learning

Coursera - Stanford Online

NOVEMBER 2011 - NOVEMBER 2014

Cisco Certified Network Associate - Security

Cisco

JULY 2009 - PRESENT

Auditor/Lead Auditor (ISO 27001:2005)

IQMS

DECEMBER 2008 - NOVEMBER 2014

Cisco Certified Network Associate

Cisco

Libraries/APIs

Pandas, Scikit-learn, PySpark, Keras, SpaCy, OpenCV, Natural Language Toolkit (NLTK), ZeroMQ, TensorFlow, PyTorch

Tools

Jupyter, Git, PyCharm, RabbitMQ, Gensim, Microsoft Power BI, Tree-Based Pipeline Optimization Tool (TPOT), Apache Tika

Languages

Python 3, Python, C++, C, RDF, Bash

Paradigms

Concurrent Programming, Data Science, Agile, MapReduce, ETL

Platforms

Jupyter Notebook, Linux, NVIDIA CUDA, Databricks, Amazon Web Services (AWS)

Storage

Neo4j, HBase

Frameworks

Flask, Spark, Hadoop, Apache Spark

Other

Agile Data Science, Machine Learning, Data Visualization, Imbalanced-learn, Data Engineering, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Teamwork, Data Scraping, MLflow, Web Scraping, Transformers, Data Modeling, Hugging Face, GAN, Deep Neural Networks, Recommendation Systems, Delta Lake, Deep Learning, Image Processing, Finance, Computer Vision
