Verified Expert in Engineering
Big Data Engineer and Developer
Ruggiero is a real-world-data person with over five years of experience in data engineering, developing models for various use cases in the NLP and cyber security fields. With a background in software engineering and a master's in computer science from ETH Zurich and MIT, he has been coding for over 15 years. Ruggiero also excels in creating pipelines and ETL transforms based on big data technologies for different financial institutions.
Machine Learning, Data Engineering, Scikit-learn, Pandas, PySpark, TensorFlow, PyTorch, Docker, SQL, Python
The most amazing...
...thing I've developed is an end-to-end machine learning solution for cyber threat detection.
Lead Data Scientist
- Developed a neural search based on Jina and transformers embeddings.
- Deployed serverless containers on the cloud running on specific schedules.
- Oversaw the development of a web and mobile app in the fintech space.
Big Data Engineer
- Worked on a company-wide solution to have a unique view of customers with data from multiple sources.
- Built ETL pipelines that extract and ingest data from various database systems using big data technologies based on Palantir Foundry.
- Developed and tested data sources that feed data lakes, and deployed them to production.
- Designed pipeline specifications by integrating the business logic with consumers' requirements.
- Communicated with project managers and business analysts to optimize the efficiency of data pipelines.
- Contributed as a contractor to modeling and analyzing financial data to identify money laundering.
- Acted as a product owner in an agile workstream comprising up to 10 developers and business analysts. Identified and prioritized business requirements, then converted them into technical implementation tasks.
- Analyzed the machine learning models developed, with a focus on explainability.
- Ensured the model technical performance metrics reflected the business use case.
- Conducted ad-hoc analysis of clients' transactional behavior to detect money laundering patterns using state-of-the-art big data technologies based on a Spark cluster.
- Proposed and participated in a project-wide strategy for the implementation, productionization, and post-deployment monitoring of ML models.
- Represented the team in discussions about collaborations with external data providers.
BIS – Bank for International Settlements
- Developed an end-to-end system to identify various cyber threats and malicious behaviors.
- Built NLP-based detection models: a spam classifier built on top of BERT with a PyTorch implementation, a scikit-learn prioritization model for cyber alerts in the security incident response platform, and an anomaly detector for process command lines.
- Developed detection models based on network traffic, targeting DNS tunneling, admin access traffic, and malicious domains. Used PySpark for data processing and MLlib for ML models.
- Collaborated with the team to develop the BIS's big data platform based on Apache and Cloudera products. Gathered hardware requirements, selected software tools, and defined use cases.
SQL, Python, Snowflake
Scikit-learn, Pandas, PySpark, TensorFlow, PyTorch, MLlib
Spark SQL, Jira, Rundeck
Data Pipelines, Amazon S3 (AWS S3)
Machine Learning, Data Engineering, Language Models, Artificial Intelligence (AI), Deep Learning, Data Modeling, Text Generation, Natural Language Processing (NLP), Large Language Model (LLM), Amazon Machine Learning, GPT, Generative Pre-trained Transformers (GPT), Engineering, Software Engineering, Physics, Big Data, Data Mining, Foundry, Data Visualization, Serverless, Speech Recognition
Amazon Web Services (AWS), Docker, Kubernetes, Google Cloud Platform (GCP)
Master's Thesis in Computer Science
MIT – Massachusetts Institute of Technology - Cambridge, Boston, USA
Master's Degree in Computer Science
ETH Zurich - Zurich, Switzerland
Bachelor's Degree in Software Engineering
Polytechnic University of Milan - Milano, Italy