Riccardo Vitale, Java Developer in London, United Kingdom

Member since January 26, 2018
Riccardo has worked for almost seven years at startups and big corporations. His past employers include Google, Skimlinks, Thought Machine, and Facebook. Riccardo focuses mainly on Python and Java development. He is comfortable with many technologies, including Go and some front-end languages and tools.
Riccardo is now available for hire

Experience

  • Java, 8 years
  • Python, 8 years
  • Go, 5 years

Availability

Part-time

Preferred Environment

IntelliJ, PyCharm, Git, Python, Java

The most amazing...

...projects I've worked on were Google's Text-to-Speech system for Italian and building the ML infrastructure for a personal finance service at Thought Machine.

Employment

  • Software Engineer

    2017 - PRESENT
    Facebook
    • Improved News Feed relevance and the identification of a user's closest connections in the work social graph.
    • Created an experimentation framework for testing various Machine Learning configurations through A/B testing and checking the results in real time.
    • Contributed to the development of the Buck build system for the company's build infrastructure. The goal was to reduce build times and improve the reliability and correctness of builds.
    Technologies: Python, Java, Hack, C++
  • Machine Learning Software Engineer

    2014 - 2016
    Thought Machine
    • Supported the Machine Learning team with software engineering best practices in order to create the infrastructure for the training of models and guarantee repeatable experiments and maintainable code.
    • Developed and maintained several Machine Learning components from research and model training to deployment. These included a transaction classification engine, data aggregation to produce useful insights, customer clustering, and cash flow prediction.
    • Contributed to the development of the core platform and other microservices for user onboarding and storage of bank accounts and transactions.
    Technologies: Python, Java, Go
  • Software Engineer

    2012 - 2014
    Skimlinks
    • Designed, developed, and deployed new projects to generate new revenue streams, such as price comparison tools.
    • Automated and improved the entire product import process, which stores merchant products in the database, for a content monetization product.
    • Contributed to the NLP infrastructure for product matching from raw text.
    Technologies: Python, Go
  • Software Engineer

    2011 - 2012
    Google
    • Supported the development of Text-to-Speech in Italian by developing and embedding the unique linguistic features of the Italian language into the system.
    • Developed a system for the automatic preparation of the data necessary for the training of the Text-to-Speech voice system.
    • Improved internal tools for collecting the samples used to train both Text-to-Speech and Automatic Speech Recognition systems.
    Technologies: Python, Java, C++

Experience

  • Italian Text-to-Speech (Development)

    While at Google, I worked on a large project involving several people and teams in London, New York, and Mountain View. The goal was to adapt the Text-to-Speech system, then under development, to other European languages. We started by building a system for the initial collection of the data required for training. We did so by creating a data pipeline that automatically pulled all the collected samples from Bigtable into an aggregation pipeline written in Python. This system cleaned the data and produced a training set for the voice training.

    From a linguistic point of view, we worked hard on adapting the engine to support each language's unique features around text normalization, syllabification, and verbalization. These efforts produced noticeable improvements in the quality of the voice and in the overall rating metrics.
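As a toy illustration of what a language-specific verbalization rule looks like, the sketch below spells out the integers 0 to 99 in Italian. It is not the Google system; the function names `verbalize_number` and `verbalize_text` are hypothetical, and only the standard Italian spelling rules are assumed: the tens word drops its final vowel before "uno" and "otto", and a final "tre" takes an accent.

```python
import re

# Italian words for 0-19, which are irregular and best stored in a table.
UNITS = [
    "zero", "uno", "due", "tre", "quattro", "cinque", "sei", "sette",
    "otto", "nove", "dieci", "undici", "dodici", "tredici", "quattordici",
    "quindici", "sedici", "diciassette", "diciotto", "diciannove",
]
# Italian words for the tens, keyed by the tens digit.
TENS = {
    2: "venti", 3: "trenta", 4: "quaranta", 5: "cinquanta",
    6: "sessanta", 7: "settanta", 8: "ottanta", 9: "novanta",
}


def verbalize_number(n: int) -> str:
    """Spell out an integer between 0 and 99 in Italian."""
    if not 0 <= n <= 99:
        raise ValueError("this sketch only handles 0-99")
    if n < 20:
        return UNITS[n]
    tens, unit = divmod(n, 10)
    word = TENS[tens]
    if unit == 0:
        return word
    if unit in (1, 8):
        # Elision: venti + uno -> ventuno, trenta + otto -> trentotto.
        word = word[:-1]
    if unit == 3:
        # A final "tre" is written with an accent: ventitré.
        return word + "tré"
    return word + UNITS[unit]


def verbalize_text(text: str) -> str:
    """Replace standalone one- or two-digit numbers with Italian words."""
    return re.sub(
        r"\b\d{1,2}\b",
        lambda match: verbalize_number(int(match.group())),
        text,
    )
```

For example, `verbalize_text("Ho 21 anni e 3 gatti")` yields "Ho ventuno anni e tre gatti". A production system would also need to handle larger numbers, dates, currencies, and abbreviations.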

  • Transactions Classification (Development)

    At Thought Machine, we wanted to offer a next-level banking experience by giving users an insightful look at their personal finances. One of the features we wanted to launch was the ability to view one's personal expense categories. To support this, we developed a classification engine with Python and the scientific stack (NumPy, SciPy, scikit-learn, pandas). We started with a simple Naive Bayes classifier (bag-of-words model), and the results were already encouraging, but we kept exploring other algorithms too: SVM, Logistic Regression, and Random Forest.
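To make the starting point concrete, here is a minimal sketch of a multinomial Naive Bayes classifier over a bag-of-words model, written in plain Python so it is self-contained. It is not the Thought Machine engine, which used the scientific stack; the class name, the whitespace tokenizer, the Laplace smoothing parameter `alpha`, and the sample transactions are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesClassifier:
    """Multinomial Naive Bayes over bag-of-words counts (illustrative)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                       # Laplace smoothing strength
        self.class_counts = Counter()            # documents seen per category
        self.word_counts = defaultdict(Counter)  # word counts per category
        self.vocabulary = set()

    @staticmethod
    def tokenize(text):
        # Assumption: lowercase whitespace tokens are a good enough bag of words.
        return text.lower().split()

    def fit(self, descriptions, categories):
        for text, category in zip(descriptions, categories):
            self.class_counts[category] += 1
            for word in self.tokenize(text):
                self.word_counts[category][word] += 1
                self.vocabulary.add(word)
        return self

    def log_scores(self, text):
        """Return the unnormalized log-probability of each category."""
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        scores = {}
        for category, doc_count in self.class_counts.items():
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            score = math.log(doc_count / total_docs)  # class prior
            for word in self.tokenize(text):
                if word not in self.vocabulary:
                    continue  # words never seen in training carry no signal
                score += math.log(
                    (counts[word] + self.alpha)
                    / (total_words + self.alpha * vocab_size)
                )
            scores[category] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Made-up transaction descriptions, for illustration only.
TRAINING_DATA = [
    ("tesco superstore london", "groceries"),
    ("sainsburys local", "groceries"),
    ("tfl travel charge", "transport"),
    ("uber trip london", "transport"),
    ("netflix subscription", "entertainment"),
    ("odeon cinema tickets", "entertainment"),
]

classifier = NaiveBayesClassifier().fit(
    [text for text, _ in TRAINING_DATA],
    [category for _, category in TRAINING_DATA],
)
```

With this toy training set, `classifier.predict("tesco express")` returns "groceries" and `classifier.predict("uber trip")` returns "transport".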

    Eventually, we used an ensemble of these models to take advantage of each classifier's higher accuracy in certain categories (e.g., the Naive Bayes classifier would be better at classifying transactions of category X, while Logistic Regression would be better at category Y).
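One simple way to combine models that are strong in different categories is weighted voting, where each model's vote counts in proportion to how reliable it has been for the category it predicts. The sketch below assumes that scheme; the profile does not say how the ensemble was actually built. The function `ensemble_predict`, the stub models, and the weight values are hypothetical, and in practice the weights would come from per-category accuracy measured on a validation set.

```python
from collections import defaultdict


def ensemble_predict(models, category_weights, transaction, default_weight=0.5):
    """Combine model predictions by per-category weighted voting.

    models: dict mapping a model name to a callable returning a category.
    category_weights: dict mapping model name -> {category: weight}, e.g.
        each model's validation accuracy on that category (assumed given).
    """
    votes = defaultdict(float)
    for name, predict in models.items():
        category = predict(transaction)
        weight = category_weights.get(name, {}).get(category, default_weight)
        votes[category] += weight
    return max(votes, key=votes.get)


# Stub models standing in for trained classifiers (illustration only).
def naive_bayes_stub(text):
    return "groceries" if "market" in text else "transport"


def logistic_regression_stub(text):
    return "transport" if "rail" in text else "groceries"


MODELS = {
    "naive_bayes": naive_bayes_stub,
    "logistic_regression": logistic_regression_stub,
}

# Made-up weights: Naive Bayes is trusted on groceries,
# Logistic Regression on transport.
WEIGHTS = {
    "naive_bayes": {"groceries": 0.9, "transport": 0.6},
    "logistic_regression": {"groceries": 0.7, "transport": 0.95},
}
```

When the two stubs disagree on "market rail", Naive Bayes votes groceries with weight 0.9 and Logistic Regression votes transport with weight 0.95, so `ensemble_predict(MODELS, WEIGHTS, "market rail")` returns "transport".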

    This component was the core of the other ML insights we provided, such as intelligent spending alerts, cash flow prediction, user clustering, and so on.

  • Automated Data Collection for Products Database (Development)

    One of my responsibilities at Skimlinks was to run the data collection for one of our main products, Skimwords, which used NLP to recognize products in a page and drive content monetization. This process was error-prone, slow, and largely manual. Ideally it would have run every day, but a single run took ~32 hours to process ~30 million items.

    I refactored the existing process, written in Python, into a more structured and tested component. Once the overall execution was more reliable, I introduced parallelization in the data collection component, using multiprocessing to work around Python's limitations with multithreading. This reduced the running time and, thanks to the gain in speed, allowed us to add more product feeds to the system. The total running time came down to ~12 hours for ~40 million items.
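The general pattern can be sketched with the standard library's `multiprocessing.Pool`, which spreads CPU-bound work across processes and so sidesteps the Global Interpreter Lock that limits threads. This is not the Skimlinks code: the feed format ("sku,name,price"), the function names, and the default worker and chunk sizes are assumptions for illustration.

```python
import multiprocessing


def parse_product(raw_line):
    """Parse one 'sku,name,price' feed line into a product dict.

    The line format is hypothetical; real merchant feeds vary widely.
    """
    sku, name, price = raw_line.strip().split(",")
    return {"sku": sku, "name": name.strip().title(), "price": float(price)}


def import_feed(lines, workers=4, chunksize=1000, start_method=None):
    """Parse feed lines in parallel across worker processes.

    start_method=None uses the platform default ('fork', 'spawn', ...).
    Results come back in the same order as the input lines.
    """
    context = multiprocessing.get_context(start_method)
    with context.Pool(processes=workers) as pool:
        # chunksize batches lines per task to reduce inter-process overhead.
        return pool.map(parse_product, lines, chunksize=chunksize)
```

When run as a script, the call to `import_feed` should sit under an `if __name__ == "__main__":` guard so that worker processes started with the "spawn" method do not re-execute it. Processes cost more than threads to start and to exchange data with, so the approach pays off mainly when each item needs meaningful CPU work.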

    This was the hard limit of the component's design and tech stack. So, for the next iteration, I started working on a complete rewrite of the tool in Go, to take advantage of its excellent concurrency model. Rewriting the component brought the execution time down to ~8 hours for ~60 million items.

Skills

  • Languages

    Python, Java, Go
  • Libraries/APIs

    Pandas, Protobuf, Scikit-learn, NumPy
  • Tools

    Git, Atom, PyCharm, IntelliJ, Phabricator, GitHub, GitLab, Mercurial, Bitbucket
  • Platforms

    Linux
  • Storage

    MySQL, PostgreSQL, Redis, Memcached, MongoDB, Cassandra
  • Frameworks

    Flask, Django
  • Paradigms

    Concurrent Programming
  • Other

    Tornado

Education

  • Master's degree in Computer Engineering
    2008 - 2012
    Università degli Studi di Roma Tre - Rome, Italy
  • Bachelor's degree in Computer Engineering
    2005 - 2008
    Università degli Studi di Roma Tre - Rome, Italy