Riccardo Vitale, Developer in London, United Kingdom

Riccardo Vitale

Verified Expert in Engineering

Software Developer

Location
London, United Kingdom
Toptal Member Since
March 8, 2018

Riccardo has worked for almost seven years at startups and large corporations, including Google, Skimlinks, Thought Machine, and Facebook. He focuses mainly on Python and Java development. He works comfortably across many technologies, including Go and, to a lesser extent, front-end languages and tools.

Portfolio

Facebook
C++, Hack, Java, Python
Thought Machine
Go, Java, Python
Skimlinks
Go, Python

Experience

Availability

Part-time

Preferred Environment

Java, Python, Git, PyCharm, IntelliJ IDEA

The most amazing...

...projects I've worked on were Google's Text-to-Speech system for Italian and building the ML infrastructure for a personal finance service at Thought Machine.

Work Experience

Software Engineer

2017 - PRESENT
Facebook
  • Improved News Feed relevance and the identification of the closest connections in the work social graph.
  • Created an experimentation framework for testing various machine learning configurations through A/B tests and checking the results in real time.
  • Contributed to the development of the Buck build system for the company's build infrastructure, with the goal of reducing build times and improving the reliability and correctness of builds.
Technologies: C++, Hack, Java, Python

Machine Learning Software Engineer

2014 - 2016
Thought Machine
  • Supported the Machine Learning team with software engineering best practices in order to create the infrastructure for the training of models and guarantee repeatable experiments and maintainable code.
  • Developed and maintained several Machine Learning components from research and model training to deployment. These included a transaction classification engine, data aggregation to produce useful insights, customer clustering, and cash flow prediction.
  • Contributed to the development of the core platform and other microservices for user onboarding and storage of bank accounts and transactions.
Technologies: Go, Java, Python

Software Engineer

2012 - 2014
Skimlinks
  • Designed, developed, and deployed new projects to generate new revenue streams, such as price comparison tools.
  • Automated and improved the entire product import process that stores merchant products in the database for a content monetization product.
  • Contributed to the NLP infrastructure for product matching from raw text.
Technologies: Go, Python

Software Engineer

2011 - 2012
Google
  • Supported the development of Italian Text-to-Speech by embedding the unique linguistic features of the Italian language into the system.
  • Developed a system that automatically prepares the data needed to train the Text-to-Speech voice.
  • Improved internal tools for collecting the samples used to train both Text-to-Speech and Automatic Speech Recognition systems.
Technologies: C++, Java, Python

Italian Text-to-Speech

While at Google, I worked on a large project involving several people and teams in London, New York, and Mountain View. The goal was to adapt the Text-to-Speech system, then under development, to other European languages. We started by building a system to collect the data required for training: a pipeline that automatically pulled the collected samples from Bigtable into an aggregation step written in Python, which cleaned the data and produced a training set for the voice.

From a linguistic point of view, we worked hard on adapting the engine to support each language's unique features around text normalization, syllabification, and verbalization. These efforts produced noticeable improvements in voice quality and in the overall rating metrics.

Transactions Classification

At Thought Machine, we wanted to provide a next-level banking experience by offering an insightful look at the user's personal finances. One of the features we wanted to launch was a breakdown of one's expenses by category. To support this, we developed a classification engine in Python with the scientific stack (NumPy, SciPy, scikit-learn, pandas). We started with a simple Naive Bayes classifier (bag-of-words model), and the results were already encouraging, but we kept exploring other algorithms too: SVM, logistic regression, and random forest.
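The bag-of-words Naive Bayes starting point can be sketched in a few lines. This is only an illustration: the production engine used scikit-learn, while this version is written in plain Python so it runs without dependencies, and the transaction descriptions and category names are made up for the example.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesClassifier:
    """Multinomial Naive Bayes over a bag-of-words representation."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing strength
        self.word_counts = defaultdict(Counter)  # category -> word -> count
        self.doc_counts = Counter()  # category -> number of training examples
        self.vocabulary = set()

    @staticmethod
    def tokenize(text):
        # Bag of words: lowercase, split on whitespace, ignore word order.
        return text.lower().split()

    def fit(self, descriptions, categories):
        for text, category in zip(descriptions, categories):
            tokens = self.tokenize(text)
            self.word_counts[category].update(tokens)
            self.doc_counts[category] += 1
            self.vocabulary.update(tokens)
        return self

    def log_scores(self, text):
        """Return log P(category) + sum of log P(word | category) per category."""
        total_docs = sum(self.doc_counts.values())
        vocab_size = len(self.vocabulary)
        tokens = [t for t in self.tokenize(text) if t in self.vocabulary]
        scores = {}
        for category, n_docs in self.doc_counts.items():
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            score = math.log(n_docs / total_docs)
            for token in tokens:
                smoothed = counts[token] + self.alpha
                score += math.log(smoothed / (total_words + self.alpha * vocab_size))
            scores[category] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Hypothetical training data, invented for this sketch.
train_texts = [
    "tesco superstore london",
    "sainsburys local groceries",
    "tfl travel charge",
    "national rail ticket",
    "netflix subscription",
    "spotify premium subscription",
]
train_labels = [
    "groceries", "groceries",
    "transport", "transport",
    "entertainment", "entertainment",
]

model = NaiveBayesClassifier().fit(train_texts, train_labels)
print(model.predict("tesco groceries"))
```

Words never seen in training are ignored at prediction time, and Laplace smoothing keeps a single unseen word from zeroing out a category.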

Eventually, we used an ensemble of these models to benefit from each classifier's better accuracy in particular categories (e.g., the Naive Bayes classifier would be better at classifying transactions of category X, while logistic regression would be better at category Y).
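One way to exploit per-category strengths is a weighted vote, where each model's vote counts in proportion to how well that model handled the predicted category on validation data. The sketch below assumes this scheme purely for illustration; the model names, categories, and weights are hypothetical, and the actual combination rule used in the project may have differed.

```python
from collections import defaultdict


def ensemble_predict(predictions, category_weights, default_weight=0.5):
    """Combine the predictions of several models for one transaction.

    predictions: model name -> category predicted by that model
    category_weights: model name -> {category: weight}, for example the
        validation precision of that model on that category
    """
    votes = defaultdict(float)
    for model_name, category in predictions.items():
        weight = category_weights.get(model_name, {}).get(category, default_weight)
        votes[category] += weight
    # Highest total weight wins; ties are broken alphabetically for determinism.
    return max(sorted(votes), key=votes.get)


# Hypothetical per-category weights: Naive Bayes is trusted on groceries,
# logistic regression on transport.
weights = {
    "naive_bayes": {"groceries": 0.95, "transport": 0.60},
    "logistic_regression": {"groceries": 0.70, "transport": 0.92},
}

print(ensemble_predict(
    {"naive_bayes": "transport", "logistic_regression": "groceries"},
    weights,
))
```

Here the two models disagree, and the ensemble sides with whichever model has the stronger track record on the category it is proposing.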

This component was the core of the other ML insights we provided, such as intelligent alerts on spending, cash flow prediction, and user clustering.

Automated Data Collection for Products Database

One of my responsibilities at Skimlinks was to launch the data collection for one of our main products, Skimwords, which used NLP to recognize products in a page and drive content monetization. This process was error-prone, slow, and largely manual. Ideally, it would have run every day, but it took ~32 hours to process ~30 million items.

I reworked the existing process, written in Python, into a more structured and tested component. Once the overall execution was more reliable, I introduced parallelization in the data collection component, using multiprocessing to work around the limits that Python's global interpreter lock places on multithreading. The gain in execution speed brought down the running time and allowed us to add more product feeds to the system: the total running time dropped to ~12 hours for ~40 million items.
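The multiprocessing approach can be sketched as follows. The feed format (`name;price` lines) and the function names are assumptions made for this example, not the real Skimlinks feed schema; the point is only that each feed is parsed in a separate process, so CPU-bound work is not serialized by a single interpreter lock.

```python
from multiprocessing import Pool


def parse_feed(feed_lines):
    """CPU-bound work for one merchant feed: parse 'name;price' lines."""
    products = []
    for line in feed_lines:
        name, _, price = line.partition(";")
        name, price = name.strip(), price.strip()
        if not name or not price:
            continue  # skip malformed rows instead of failing the whole feed
        products.append({"name": name.lower(), "price": float(price)})
    return products


def import_feeds(feeds, workers=4):
    """Parse every feed in a pool of worker processes and merge the results."""
    with Pool(processes=workers) as pool:
        # imap_unordered yields each feed's result as soon as a worker finishes it.
        results = pool.imap_unordered(parse_feed, feeds)
        return [product for feed_products in results for product in feed_products]
```

When run as a script, `import_feeds` should be called from under an `if __name__ == "__main__":` guard, because on platforms that spawn worker processes the main module is imported again in each worker.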

This was the hard limit of the component's design in terms of tech stack. So, for the next iteration, I started working on a complete rewrite of the tool in Go to exploit the excellent concurrency model it offers. Rewriting the component brought the execution time down to ~8 hours for ~60 million items.
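Because the item count grew at each stage, the running times alone understate the improvement. Converting the approximate figures above into throughput makes the comparison direct: roughly 0.94, 3.3, and 7.5 million items per hour. The stage labels in this small calculation are mine.

```python
def throughput(items, hours):
    """Items processed per hour."""
    return items / hours


# Approximate figures quoted in the text: (stage, items, hours).
runs = [
    ("original Python process", 30_000_000, 32),
    ("Python with multiprocessing", 40_000_000, 12),
    ("Go rewrite", 60_000_000, 8),
]

baseline = throughput(runs[0][1], runs[0][2])
for label, items, hours in runs:
    rate = throughput(items, hours)
    print(f"{label}: {rate:,.0f} items/hour ({rate / baseline:.1f}x)")
```

On these numbers, the multiprocessing version is about 3.6 times faster per item than the original process, and the Go rewrite about 8 times faster.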

Languages

Python, Java, Go, C++, Hack

Libraries/APIs

Pandas, Protobuf, Scikit-learn, NumPy

Tools

Git, Atom, PyCharm, IntelliJ IDEA, Phabricator, GitHub, GitLab, Mercurial, Bitbucket

Platforms

Linux

Storage

MySQL, PostgreSQL, Redis, Memcached, MongoDB, Cassandra

Frameworks

Flask, Django

Paradigms

Concurrent Programming

Other

Tornado

2008 - 2012

Master's Degree in Computer Engineering

Università degli Studi di Roma Tre - Rome, Italy

2005 - 2008

Bachelor's Degree in Computer Engineering

Università degli Studi di Roma Tre - Rome, Italy
