Verified Expert in Engineering
Data Scientist and Back-end Developer
Edoardo is a data scientist who has worked as a CTO and vice president of engineering and founded multiple projects and businesses. He specializes in R&D initiatives, having created MLJ.jl (Julia's largest machine learning framework) and worked on detection algorithms at Shift Technology. Edoardo has a master's degree in applied mathematics from the University of Warwick.
Linux, Python 3, Rust, Neo4j, PyCharm, Docker, Scikit-learn, PyTorch, Amazon Web Services (AWS), Apollo Server
The most amazing...
...project I've led, researched, and implemented is an ML-based solution capable of detecting shellcode cyber attacks in raw data at 5Gbit per second.
Cloud and Neo4j Developer
- Implemented multiple Solidity smart contracts to handle auction mechanisms, sponsoring, and general NFT exchanges.
- Built the containerized deployment process for Google Cloud Platform.
- Implemented a social recommendation system for the users using graph theory and machine learning.
- Implemented and maintained the Neo4j/GraphQL API, including writing multiple custom functions.
Chief Technology Officer
- Created an API-based automated agent that continuously extracts data from clients' sources, such as AWS, Azure, Google Cloud Platform, GitHub, and Google Workspace, into our storage, where it awaits processing.
- Designed and implemented the graph structure on Neo4j to load clients' data, including infrastructure, assets, users, and permissions. It allowed the team to analyze complex relationships and find security flaws.
- Outlined and developed a framework for the abovementioned graph, allowing developers and data scientists to easily extend the structure and add new analysis models for continuous improvement.
- Created an automated vulnerability scanner running on clients' AWS and GCP clouds to continuously analyze their instances and report any new issues.
VP of Engineering
- Researched and developed high-performance, machine-learning-based software in Rust capable of detecting shellcode cyber threats in raw network data.
- Containerized the solution using Docker to make it easily deployable on-premise.
- Built the company's entire cloud infrastructure on AWS.
- Managed tech roadmaps, assigned tasks, and mentored junior developers.
- Closed a contract with one of the largest French cybersecurity companies to develop specific cyber attack detection software.
- Set up Neo4j and PostgreSQL databases with automated backup and security rules.
- Set up isolated environments, firewall security rules, and a REST API with AWS Lambdas.
- Deployed several AWS services using CloudFormation, including AWS Lambda, SQS, SNS, Secrets Manager, S3, REST API, RDS, and AWS IoT.
- Implemented state-of-the-art models to detect and read car license plates in pictures.
- Researched and implemented new detection algorithms for specific types of fraudulent Italian claims.
- Provided technical explanations and support during sales meetings in Italy for potential clients.
- Created a pipeline to process, load, and analyze several gigabytes of raw data daily from our clients.
University of Warwick
- Learned CUDA and developed simulations with it to demonstrate the speed gain over a CPU.
- Taught students what a GPU is, what CUDA is, and when you should consider it.
- Participated in the development of CUDA.jl, the Julia CUDA wrapper.
Lead Back-end Developer
- Set up the infrastructure, consisting of a MySQL database and an Ubuntu server.
- Set up CI/CD for continuous integration and deployment.
- Designed and implemented the entire back-end logic.
- Created a REST API to interact with the front end and integrated the endpoints.
Research and Development of High-performance Shellcode Detector
This project consisted of two large parts: researching a model for very high-speed inference and implementing it. Due to an NDA, I can't go into detail about the research side. The main requirement I can comment on is that the model needed a high detection rate with a very low false positive rate, so that false positives wouldn't overwhelm the analysts.
One of the main project constraints was speed: the solution acts as a firewall and must therefore decide in real time whether to let packets through. It was estimated that the solution would need to run at 1Gbps, at least, on an average laptop. Security was also critical, which led me to use Rust, a language that combines speed and safety. To reach the expected speed, data had to be processed and kept in the lowest levels of cache, and the code had to use CPU-level vectorization.
The final solution reached 5Gbps on an average laptop with a detection rate of over 95% and a false positive rate under 0.000000001%, which means one false positive per terabyte of data.
Created MLJ.jl, Julia's Largest Machine Learning Framework
https://github.com/alan-turing-institute/MLJ.jl
I designed, architected, and implemented the first version of the framework. The difficulty was designing a well-balanced interface: generic enough to include all models, but strict enough that it wouldn't be verbose or too abstract to use.
By the end of my master's, a dozen of the most fundamental machine learning libraries had been unified, and the project had attracted the Alan Turing Institute's attention. I was invited to present it at the Julia Convention and ended up taking it over and continuing to develop it ever since.
Creator and Owner of Websek.co
I developed the whole project with the following stack and structure:
• A REST API used by the front end with AWS Lambda
• A scheduler to launch EC2 analysis instances when required using AWS Lambda and AWS EC2
• A monitoring agent to verify the instances are healthy with AWS Lambda
• An analysis tool based on OWASP ZAP
• A PostgreSQL database
• A simple Bootstrap and jQuery front end
Co-author of a Peer-reviewed Scientific Paper
The project consisted of a numerical model analysis to determine how various attributes change depending on environmental parameters. This environment exists on a 4D lattice (three space dimensions and one time dimension) and becomes more accurate as the lattice size increases. According to earlier research, the margin of error became acceptable starting from around a 100-lattice.
Before this paper, the only implementation of the model took 10 minutes to simulate a 10-lattice. Such a small lattice can't be used for accurate numerical analysis, and the implementation was too slow to scale to larger ones.
I redesigned and implemented the model in C, creating software that could simulate a 160-lattice in a few seconds. With this performance, we could run a grid search over multiple parameters, giving us a global view of how each parameter affects the results. The paper collects the most important results and explains their further implications.
Invester: Stock Price Prediction Bot
https://github.com/dominusmi/Invester
Backstory: I was finishing my degree in mathematics, and having been interested in stock trading for years, I decided to try writing my own trading agent. The idea was not to do real-time algo-trading but mostly to produce mid- to long-term suggestions.
Tech: I created a framework to allow for different trading strategies so that I could independently compare different ideas. I also implemented backtesting mechanisms and various strategies, some heavily reliant on machine learning while others followed simpler signals. The code was deployed to the cloud and would run daily, giving a list of the top 10 suggestions to buy.
Interesting observation: The more complex the algorithm was, the less well it handled the early COVID-19 period.
Twitter Sentiment Analysis
https://github.com/dominusmi/Twitter-Sentiment-Analysis-Project
Designed, researched, and developed several classifiers to process and study tweets and predict the sentiment as either positive, negative, or neutral.
Train Ticket Cost Optimizer
AWS CloudFormation, Jupyter, GitHub, PyCharm, Microsoft Excel, Amazon CloudFront CDN, NGINX
Data Science, B2B, Anomaly Detection, Kanban, REST
AWS Lambda, Amazon Web Services (AWS), Docker, Amazon EC2, Azure, Linux, NVIDIA CUDA, cPanel, Google Cloud Platform (GCP)
Neo4j, Cloud Deployment, PostgreSQL, Amazon DynamoDB, Amazon S3 (AWS S3)
APIs, API Integration, Machine Learning, Graph Theory, Algebra, Architecture, Web Scraping, Scraping, Linux Servers, Data Scraping, SaaS, Web Servers, OpenAI GPT-3 API, System Architecture, Artificial Intelligence (AI), Statistics, Applied Research, Physics, Mathematics, Applied Mathematics, Game Theory, Reinforcement Learning, Bayesian Inference & Modeling, Scientific Computing, Team Management, IT Project Management, Cloud Infrastructure, Product Strategy, Competitive Strategy, Product Management, Roadmaps, Software Architecture, eBPF, Optimization, Enterprise Systems, GPU Computing, Graphics Processing Unit (GPU), Numerical Analysis, Bots, Web Development, Trading, Stock Trading, Natural Language Processing (NLP), Airtable, Apollo Server, Video Streaming, Marketplaces, Payment APIs, Data Analysis, GPT, Generative Pre-trained Transformers (GPT)
Scikit-learn, jQuery, Pandas, PyTorch, REST APIs, Node.js, SQLAlchemy, Salesforce API, Office 365 API
Master's Degree in Informatics and Applied Mathematics
University of Warwick - Warwick, United Kingdom
Bachelor's Degree in Mathematics and Physics
University of Warwick - Warwick, United Kingdom