Ruggiero Dargenio, Developer in Zürich, Switzerland

Ruggiero Dargenio

Verified Expert in Engineering

Big Data Engineer and Developer

Location
Zürich, Switzerland
Toptal Member Since
July 12, 2022

Ruggiero is a real-world-data person with over five years of experience in data engineering, developing models for various use cases in the NLP and cyber security fields. With a background in software engineering and a master's in computer science from ETH Zurich and MIT, he has been coding for over 15 years. Ruggiero also excels in creating pipelines and ETL transforms based on big data technologies for different financial institutions.

Portfolio

Duenders LLC
Google Cloud Platform (GCP), Natural Language Processing (NLP)...
Deloitte
Python, PySpark, Foundry, Pandas, SQL, Rundeck, Jira, Data Visualization...
Credit Suisse
Python, PySpark, Scikit-learn, SQL, Pandas, Jira, Data Science, Data Pipelines...

Availability

Part-time

Preferred Environment

Machine Learning, Data Engineering, Scikit-learn, Pandas, PySpark, TensorFlow, PyTorch, Docker, SQL, Python

The most amazing...

...thing I've developed is an end-to-end machine learning solution for cyber threat detection.

Work Experience

Lead Data Scientist

2021 - PRESENT
Duenders LLC
  • Developed a neural search system based on Jina and transformer embeddings.
  • Deployed serverless containers on the cloud running on specific schedules.
  • Oversaw the development of a web and mobile app in the fintech space.
Technologies: Google Cloud Platform (GCP), Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Docker, Large Language Models (LLMs), Artificial Intelligence (AI), Amazon Web Services (AWS), Amazon Machine Learning, Amazon S3 (AWS S3), Snowflake

Big Data Engineer

2021 - PRESENT
Deloitte
  • Worked on a company-wide solution that provides a single view of customers by combining data from multiple sources.
  • Built ETL pipelines that extract and ingest data from various database systems using big data technologies based on Palantir Foundry.
  • Developed and tested data sources that feed data lakes, and deployed them to production.
  • Designed pipeline specifications by integrating the business logic with consumers' requirements.
  • Communicated with project managers and business analysts to optimize the efficiency of data pipelines.
Technologies: Python, PySpark, Foundry, Pandas, SQL, Rundeck, Jira, Data Visualization, Data Engineering, Data Science, Data Pipelines, Data Modeling, Spark SQL, Spark, Artificial Intelligence (AI)

Data Modeler

2020 - 2021
Credit Suisse
  • Contributed as a contractor to modeling and analyzing financial data to identify money laundering.
  • Acted as a product owner in an agile workstream comprising up to 10 developers and business analysts. Identified and prioritized business requirements, then converted them into technical implementation tasks.
  • Analyzed the developed machine learning models with a focus on explainability.
  • Ensured the models' technical performance metrics reflected the business use case.
  • Conducted ad-hoc analysis of clients' transactional behavior to detect money laundering patterns using state-of-the-art big data technologies based on a Spark cluster.
  • Proposed and participated in a project-wide strategy for the implementation, productionization, and post-deployment monitoring of ML models.
  • Represented the team in discussions about collaborations with external data providers.
Technologies: Python, PySpark, Scikit-learn, SQL, Pandas, Jira, Data Science, Data Pipelines, Spark SQL, Spark, Artificial Intelligence (AI)

Data Scientist

2018 - 2020
BIS – Bank for International Settlements
  • Developed an end-to-end system to identify various cyber threats and malicious behaviors.
  • Built NLP-based detection models: a spam classifier built on top of BERT with a PyTorch implementation, a scikit-learn prioritization model for cyber alerts in the security incident response platform, and an anomaly detector for process command lines.
  • Developed detection models based on network traffic, targeting DNS tunneling, admin access traffic, and malicious domains. Used PySpark for data processing and MLlib for ML models.
  • Collaborated with the team to develop the BIS's big data platform based on Apache and Cloudera products. Gathered hardware requirements, selected software tools, and defined use cases.
Technologies: Python, PySpark, Scikit-learn, Pandas, MLlib, TensorFlow, PyTorch, SQL, Data Science, Spark SQL, Spark, Language Models, Text Generation, Large Language Models (LLMs), Artificial Intelligence (AI)

Purse

An innovative mobile coupon app that uses open banking to tailor discounts to a customer's specific interests. The system would recommend coupons based on users' expenses to provide targeted loyalty programs that work.

Languages

SQL, Python, Snowflake

Frameworks

Spark

Libraries/APIs

Scikit-learn, Pandas, PySpark, TensorFlow, PyTorch, MLlib

Tools

Spark SQL, Jira, Rundeck

Paradigms

Data Science

Storage

Data Pipelines, Amazon S3 (AWS S3)

Other

Machine Learning, Data Engineering, Language Models, Artificial Intelligence (AI), Deep Learning, Data Modeling, Text Generation, Natural Language Processing (NLP), Large Language Models (LLMs), Amazon Machine Learning, Generative Pre-trained Transformers (GPT), Engineering, Software Engineering, Physics, Big Data, Data Mining, Foundry, Data Visualization, Serverless, Speech Recognition, Prompt Engineering

Platforms

Amazon Web Services (AWS), Docker, Kubernetes, Google Cloud Platform (GCP)

Industry Expertise

Telecommunications

2017 - 2018

Master's Thesis in Computer Science

MIT – Massachusetts Institute of Technology - Cambridge, MA, USA

2015 - 2018

Master's Degree in Computer Science

ETH Zurich - Zurich, Switzerland

2012 - 2015

Bachelor's Degree in Software Engineering

Polytechnic University of Milan - Milan, Italy
