Ruggiero Dargenio
Verified Expert in Engineering
Big Data Engineer and Developer
Zürich, Switzerland
Toptal member since July 12, 2022
Ruggiero is a real-world-data person with over five years of experience in data engineering, developing models for various use cases in the NLP and cyber security fields. With a background in software engineering and a master's in computer science from ETH Zurich and MIT, he has been coding for over 15 years. Ruggiero also excels in creating pipelines and ETL transforms based on big data technologies for different financial institutions.
Experience
- Python - 7 years
- Pandas - 7 years
- SQL - 7 years
- Scikit-learn - 6 years
- Machine Learning - 6 years
- Data Engineering - 5 years
- PySpark - 5 years
- TensorFlow - 3 years
Preferred Environment
Machine Learning, Data Engineering, Scikit-learn, Pandas, PySpark, TensorFlow, PyTorch, Docker, SQL, Python
The most amazing...
...thing I've developed is an end-to-end machine learning solution for cyber threat detection.
Work Experience
Lead Data Scientist
Duenders LLC
- Developed a neural search based on Jina and transformers embeddings.
- Deployed serverless containers on the cloud running on specific schedules.
- Oversaw the development of a web and mobile app in the fintech space.
Big Data Engineer
Deloitte
- Worked on a company-wide solution to have a unique view of customers with data from multiple sources.
- Built ETL pipelines that extract and ingest data from various database systems using big data technologies based on Palantir Foundry.
- Developed and tested data sources that provide feeds to data lakes and their deployment in production.
- Designed pipeline specifications by integrating the business logic with consumers' requirements.
- Communicated with project managers and business analysts to optimize the efficiency of data pipelines.
Data Modeler
Credit Suisse
- Contributed as a contractor to modeling and analyzing financial data to identify money laundering.
- Acted as a product owner in an agile workstream comprising up to 10 developers and business analysts. Identified and prioritized business requirements, then converted them into technical implementation tasks.
- Analyzed the developed machine learning models with a focus on explainability.
- Ensured the model technical performance metrics reflected the business use case.
- Conducted ad-hoc analysis of clients' transactional behavior to detect money laundering patterns using state-of-the-art big data technologies based on a Spark cluster.
- Proposed and participated in a project-wide strategy for the implementation, productionization, and post-deployment monitoring of ML models.
- Represented the team in discussions about collaborations with external data providers.
Data Scientist
BIS – Bank for International Settlements
- Developed an end-to-end system to identify various cyber threats and malicious behaviors.
- Built NLP-based detection models: a spam classifier built on top of BERT with a PyTorch implementation, a scikit-learn prioritization model for cyber alerts in the security incident response platform, and an anomaly detector for process command lines.
- Developed detection models based on network traffic, targeting DNS tunneling, admin access traffic, and malicious domains. Used PySpark for data processing and MLlib for ML models.
- Collaborated with the team to develop the BIS's big data platform based on Apache and Cloudera products. Gathered hardware requirements, selected software tools, and defined use cases.
Experience
Purse
Education
Master's Thesis in Computer Science
MIT – Massachusetts Institute of Technology - Cambridge, MA, USA
Master's Degree in Computer Science
ETH Zurich - Zurich, Switzerland
Bachelor's Degree in Software Engineering
Polytechnic University of Milan - Milano, Italy
Skills
Libraries/APIs
Scikit-learn, Pandas, PySpark, TensorFlow, PyTorch, MLlib
Tools
Spark SQL, Jira, Rundeck
Languages
SQL, Python, Snowflake
Frameworks
Spark
Storage
Data Pipelines, Amazon S3 (AWS S3)
Platforms
Amazon Web Services (AWS), Docker, Kubernetes, Google Cloud Platform (GCP)
Industry Expertise
Telecommunications
Other
Machine Learning, Data Engineering, Data Science, Language Models, Artificial Intelligence (AI), Deep Learning, Data Modeling, Text Generation, Natural Language Processing (NLP), Large Language Models (LLMs), Amazon Machine Learning, Generative Pre-trained Transformers (GPT), Engineering, Software Engineering, Physics, Big Data, Data Mining, Foundry, Data Visualization, Serverless, Speech Recognition, Prompt Engineering