Radu Nedelcu
Verified Expert in Engineering
Data Scientist and AI Developer
London, United Kingdom
Toptal member since April 9, 2020
Radu has been writing code since the age of 14. He has since built a career in data science and worked on projects in computer vision, text analysis, and financial data algorithms. His engineering background, paired with his algorithm expertise, enables him to work on the full pipeline, from idea generation and proof of concept through product development to production. He's excited about his next challenge and can't wait to get started.
Preferred Environment
Transformers, Jupyter Notebook, Keras, Pandas, Bash, PyCharm, PySpark, Scikit-learn, PyTorch, Image Processing
The most amazing...
...thing I've built was a vectorization method of text and entities for a news app such that our users could get news recommendations based on multiple topics.
Work Experience
Senior Data Scientist
SONY
- Experimented with various technologies to build a better user experience as part of the PlayStation team.
- Gathered requirements from various stakeholders, then collected and aggregated data from multiple sources using technologies such as Alation, Snowflake, SageMaker, AWS EMR, and Databricks.
- Worked on player-to-player clustering and recommenders based on their game activity.
- Changed avatar emotions based on people's faces using GANs.
- Dockerized projects that had to be shared or deployed.
- Worked on the delivery of a multi-million pound deep learning research infrastructure that involved various suppliers and stakeholders.
- Developed a computer vision-based neural network that classified whether images were of good quality based on what they contained.
University of London Tutor
University of London
- Provided online tutoring for the bachelor's degree in Computer Science and the master's degree in Data Science.
- Answered student questions about financial data modeling, Hadoop, Spark, Python, and cluster processing.
- Organized webinars for the students that covered a range of topics and prepared them for their mid-terms and finals.
- Graded coursework and exams for various modules such as Big Data and Software Development.
Senior Data Scientist
Future Anthem
- Aggregated and wrangled data using PySpark in Databricks on Azure.
- Set up a recommendation system with three subsystems that would recommend games to users.
- Built a user-item recommendation subsystem based on cosine similarity to make recommendations to new users.
- Created a sequence-based recommendation system that could be used to make recommendations to early-stage users.
- Constructed a collaborative filtering system based on implicit feedback using LightFM. The system was trained using the number of plays a user had in a game.
- Built dashboards and performed data analysis to understand how new Future Anthem customers are performing and to help them get better results.
- Delivered part of the work via other engineers from the Disruptive Engineering team who I managed.
Senior Data Scientist
ContractPod AI
- Worked on information extraction from legal documents.
- Built an API to understand whether contracts are signed or not based on computer vision and NLP.
- Researched methodologies for signature detection and obtained open-source, free data to train on.
- Fine-tuned YOLO to detect signatures to an accuracy of 80%.
- Built a dotted line detector to extract lines in documents using OpenCV.
- Created a graph that represented the document and all the extractions.
- Developed a signature requirement classifier that used an ensemble of mechanisms such as word density, dotted line presence, and neighboring words. The classifier had 90% accuracy on the test set.
- Built a matching algorithm that matched signature requirements to the signatures. The API was deployed on CUDA-enabled Docker containers.
- Created and conducted interviews to expand the team, and offered support and mentorship to team members.
- Built a contract clause comparison API to understand whether clauses in contracts match pre-approved clauses for multiple languages. Used a pre-trained BERT transformer that was fine-tuned with in-house data and deployed in Docker containers.
Senior Data Scientist
Sprout AI
- Led a small team of consultants to improve information extraction from claims.
- Performed error analysis to understand current system results and what subsystems needed to be improved.
- Annotated damaged items in insurance claims to build a custom model.
- Trained an NER model to detect damaged items in claims using Hugging Face Transformers, reaching an F1 score of 75%.
Senior Data Scientist
Foreign, Commonwealth & Development Office - UK Government
- Defined and explained a number of experiments that could improve information extraction from news worldwide.
- Scraped news from news websites and cleaned and deduplicated them.
- Built an MVP of an automated topic detection mechanism in the news using LDA and extracted topic names.
- Aggregated processed data into a Power BI visualization.
Senior Data Scientist
Fortress AI
- Consulted on the strategic direction to implement machine learning on network devices for home environments.
- Researched ad blocking with machine learning, scraped ads, and built an MVP of an ad-blocking mechanism using machine learning on JavaScript with TF-IDF and logistic regression.
- Researched quality of service (QoS) with machine learning and produced a report.
Technical Trainer
OpenClassrooms
- Developed a practical introductory course on deep learning.
- Wrote a three-part course that aimed to introduce students to deep learning, focusing on practicality and simple explanations. The course's main theme had students working for a pizza company that uses machine learning.
- Focused the first part on the differences between traditional machine learning and deep learning; the second on neurons, how they work, and fully connected networks; and the third part on convolutional neural networks and recurrent neural networks.
- Developed a number of practical examples that the students are encouraged to follow and develop in their Jupyter Notebooks to better understand and have a reference tool later on.
Senior Data Scientist
Cabinet Office
- Worked on the discovery and alpha phases aimed at understanding user problems and creating MVPs.
- Defined and explained a number of experiments that could improve knowledge management, such as faceted search and classifiers for different Tags.
- Participated in a number of user interviews to better understand their working methods.
- Wrote a number of small-scale experiments to test ideas.
- Built, cleaned, and labeled datasets for the tasks.
- Created a document type classifier that was able to distinguish between documents based on keywords and structure with an accuracy of 90%. The system used Tika and spaCy to extract features and scikit-learn to build the classifier.
- Created a duplicate document and near-duplicate document detector using MinHash to make it easy to avoid duplication and understand related documents.
- Built a 100,000-node knowledge graph using spaCy, DBpedia, Gensim, and Neo4j to better understand connections between people and important topics in the documents.
- Received a feature for the project in The Times: https://www.thetimes.co.uk/article/ai-trawls-20-000-miles-of-state-papers-j0l9k5gx9.
Data Scientist | Machine Learning Engineer
Ernst & Young
- Researched public and internal information on ML models for mergers and acquisitions and participated in workshops to generate ideas for potential use cases of ML in the M&A process.
- Performed data cleaning to ensure entities existed at different points in time and correct merging of entities from different datasets based on dates.
- Created the first proof of concept models for applications of ML for M&A using Pandas and random forests in scikit-learn.
- Set up the ML architecture to integrate with the engineering architecture in Azure and selected Databricks, which allows the use of Spark for cluster-based data processing and MLflow for experiment tracking and deployment into Kubernetes.
- Researched and experimented with a number of mechanisms for modeling imbalanced datasets: weight balancing, blagging (random forests where decision trees use undersampling), undersampling and oversampling, and transfer learning.
- Analyzed multiple data sources and selected complementary data sources such as CapIQ for financial data, Factiva for news, and Oxford Economics for forecasts.
- Managed the machine learning team and had duties such as planning the team's workload, providing guidance on priorities, planning the team structure and size, interviewing, and hiring.
- Participated in user interviews to help shape how we built the algorithms and the platform on which they would be run. A simple product and model explainability were key takeaways.
- Participated in a number of presentations to explain how machine learning works and how C-level stakeholders could use it.
- Implemented a number of best practices in the team, such as setting random seeds, to get reliable scores for our models.
Data Scientist and Machine Learning Engineer
Serendipity AI
- Helped put into practice a news classifier and created a topic/user-based news recommendation system using NLP.
- Used named entity detectors from spaCy and DBpedia, together with Jaccard similarity and Levenshtein distance, to detect and match named entities in news and other text data.
- Developed a new vectorization method for the detected named entities in text and worked on a mechanism to qualify their expertise to different topics.
- Deployed Spark, Hadoop, and HBase on a cluster of three computers to speed up the machine learning processing.
- Developed an ML processing pipeline that allowed information to flow to HBase and be processed in parallel using PySpark. Every stage in the pipeline was designed as a microservice with access to only an input and an output table.
- Implemented a recommendation system using a neural network set up as an autoencoder and cosine similarity from Spotify Annoy.
- Brought an article judging system to production level. The system had a classification service and a training application. I used Celery to train every night and restart the judging service's worker pool when new models were available.
- Improved the code quality and reduced repeated code across applications written in Flask and CherryPy by creating a shared library. Added a logging system based on Python logging that had handlers for local logging and Rollbar.
- Created a number of APIs using Flask that ran on AWS and connected to Neo4j.
- Set up a testing framework that would allow APIs to be tested before and after deployment using Jenkins and wrote integration tests for the APIs.
Data Scientist and Machine Learning Engineer
Cappfinity
- Researched and integrated an automatic machine learning algorithm picker in Python.
- Researched Auto-Sklearn (Bayesian optimization for algorithm selection), TPOT (genetic algorithms for feature processing and algorithm selection), and NEAT (genetic algorithms for neural network evolution).
- Developed the architecture for experimentation and result visualization for machine learning algorithms using services built with C# ASP.NET Core and Python-Flask, which communicate via REST and RabbitMQ.
- Built the system's presentation layer using Angular 4.
- Wrote a service that extracted text from speech using the Google Speech-to-Text API.
- Integrated MongoDB and connected all the services to it so that they could save processing results.
- Integrated all the applications in Docker with their own private network and Docker Compose to allow for continuous integration and faster deployment.
Research Engineer
Oxehealth
- Led the data engineering team and worked on big data microservices that would connect cameras installed on-site with Oxehealth’s data warehouse.
- Worked on Oxehealth’s TechCrunch London live demo that connected a room in Oxford, where a person was being monitored, to the stage in London.
- Designed and developed the microservices architecture for video data retrieval from customer sites using ZeroMQ, gRPC, and Boost Program Options and Property Tree for C++.
- Set up a VPN to connect customer deployments to a central data repository using pfSense.
- Built a breathing robot that could replicate different breathing patterns.
- Designed and developed an application that allowed for multiple room monitoring using Qt.
Computer Vision and Algorithms Engineer
Meta Vision Systems
- Designed the full stack from image capture and processing to point clouds sent over the network using multiple threads and a pipeline architecture to measure oil pipes with lasers and cameras.
- Wrote general-purpose GPU (GPGPU) code to accelerate image processing algorithms (convolution and point extraction) via new kernels or through OpenCV, reducing processing time from 40 s to 40 ms for some code paths.
- Implemented K-means and ordinary least squares algorithms through OpenCV for finding points of interest and then line fitting.
- Designed and set up the network communication channels to transmit data, commands, and replies using Type Length Value (TLV) messages via Boost ASIO.
- Designed and developed a logging system using Microsoft ETW.
- Set up the Point Cloud Library (PCL) for surface reconstruction and visualization of STL files and point clouds.
- Used Boost Property Tree to implement a configuration file parser that uses JSON files.
- Deployed Jenkins for automatic build verification and to run test cases.
Software Engineer
Qualcomm
- Wrote the first Windows driver for Qualcomm's NFC chip.
- Participated in a number of integration activities where I helped set up new platforms with our NFC chip.
- Worked on the launch of a Windows mobile phone that contained the chip I worked on.
- Advised other teams across the globe on Windows driver development.
- Developed a script in PowerShell for improving the team’s efficiency.
- Debugged customer and partner issues and those arising during testing.
- Trained new team members from different disciplines such as software engineering and testing.
Experience
M&A Predictor
News Recommendation System
A vector made out of the same features was extracted for all the different types above, and it found recommendations using locality-sensitive hashing from Spotify Annoy.
Document Type Classifier
Linked Documents Detector
Education
Bachelor of Engineering Degree with Honors in Electronic and Communications Engineering
London Metropolitan University - London, England
Certifications
Generative Adversarial Networks (GANs)
deeplearning.ai
Natural Language Processing Specialization
Coursera - deeplearning.ai
Deep Learning Specialization
Coursera - deeplearning.ai
Machine Learning
Coursera - Stanford Online
Cisco Certified Network Associate - Security
Cisco
Auditor/Lead Auditor (ISO 27001:2005)
IQMS
Cisco Certified Network Associate
Cisco
Skills
Libraries/APIs
Pandas, Scikit-learn, PySpark, Keras, SpaCy, OpenCV, Natural Language Toolkit (NLTK), ZeroMQ, TensorFlow, PyTorch
Tools
Jupyter, Git, PyCharm, RabbitMQ, Gensim, Microsoft Power BI, Tree-Based Pipeline Optimization Tool (TPOT), Apache Tika
Languages
Python 3, Python, C++, C, RDF, Bash
Paradigms
Concurrent Programming, Agile, MapReduce, ETL
Platforms
Jupyter Notebook, Linux, NVIDIA CUDA, Databricks, Amazon Web Services (AWS)
Frameworks
Flask, Spark, Hadoop, Apache Spark
Storage
Neo4j, HBase
Other
Agile Data Science, Machine Learning, Data Visualization, Imbalanced-learn, Data Engineering, Natural Language Processing (NLP), Data Science, Generative Pre-trained Transformers (GPT), Teamwork, Data Scraping, MLflow, Web Scraping, Transformers, Data Modeling, Hugging Face, GAN, Deep Neural Networks (DNNs), Recommendation Systems, Delta Lake, Deep Learning, Image Processing, Finance, Computer Vision