Alexey Rodriguez Yakushev, Data Scientist and Developer in Berlin, Germany

Member since July 1, 2022
Alexey is a professional with 13+ years of experience in machine learning, data science, and software engineering. He was a key contributor to large-scale systems in recommendations, music and audio analysis, click-through rate prediction, RTB auction price prediction, computer vision, compilers, and tools for software parallelization. Alexey enjoys working with teams to build impactful products.

Portfolio

  • Freelance
    Spark, SQL, Google Cloud Platform (GCP), Python 3, Machine Learning, Scala...
  • SoundCloud
    Spark, Machine Learning, Python 3, Scala, SQL, C, C++, TensorFlow...

Location

Berlin, Germany

Availability

Part-time

Preferred Environment

Linux, Python 3

The most amazing...

...thing I've done is build a weekly recommendation system for over 100 million users, together with the supporting distributed computing infrastructure and low-latency serving.

Employment

  • Senior Data Scientist and Machine Learning Engineer

    2019 - 2022
    Freelance
    • Contributed end to end to recommendation and other machine learning systems: designed and built prototypes with product managers for early de-risking, built large-scale production systems, and designed A/B tests for evaluation.
    • Coached team members, contributed to teamwork practices, and provided hands-on workshops on recommendations and machine learning.
    • Advised the product managers and engineers on the technology strategy of recommendation systems.
    Technologies: Spark, SQL, Google Cloud Platform (GCP), Python 3, Machine Learning, Scala, Data Science, Machine Learning Operations (MLOps)
  • Senior Data Scientist and Machine Learning Engineer

    2015 - 2019
    SoundCloud
    • Contributed to the core recommendation algorithms, including matrix factorization, word2vec, factorization machines, locality-sensitive hashing, learning to rank, and counterfactual evaluation.
    • Performed data engineering work on ETLs for recommendations using Airflow, SQL, Spark, and Cassandra, as well as on APIs for online serving and A/B testing infrastructure.
    • Conducted data analysis tasks such as user behavior analysis and experiment design and evaluation.
    Technologies: Spark, Machine Learning, Python 3, Scala, SQL, C, C++, TensorFlow, Data Science, Machine Learning Operations (MLOps)

Experience

  • Multi-objective Recommender: Balance Interests of Consumers and Content Providers

    In this project, an existing recommendation system was extended to trade off user experience against content costs.

    I brainstormed the project together with a product organization leader. We obtained positive evidence within one month. About three months into the project, an A/B test of a solid prototype running in production showed positive results.

    After that, I worked with the recommendations team to make the prototype production-ready, run a new A/B test, and confirm a positive business impact.

  • Weekly Music Recommendation System for over 100 Million Users

    I led a music recommendation project through conception, prototyping, implementation, and production deployment.

    The project had a tight deadline of three months, a high quality threshold, and large-scale delivery requirements: weekly delivery to over 100 million users. The quality threshold mattered because the new recommendations would be compared with the existing recommenders on the platform, and at the same time, the content had to be personalized.

    The system was deployed to production. The reception was highly positive among users and company leadership. Many platform users claimed to have bought a subscription following the launch of this recommendation product.

  • Audio-based Music Recommender

    I supervised an intern creating a music recommendation system based on audio data.

    The motivation for this project was that content consumption on online platforms is quite uneven: most user consumption happens on a small proportion of items. As a result, many items in the catalog did not have enough consumption data for collaborative filtering techniques.

    We built a content-based recommendation system to cold-start content with little or no consumption data. We framed this as a regression task from audio content to the recommendation embeddings generated by an existing collaborative filtering system. For this purpose, we worked with a frequency-domain representation of the music data: the input was a mel-scaled short-time Fourier transform (STFT) of the audio signal. We used a convolutional neural network built in TensorFlow to generate a recommendation embedding. Recommendation retrieval was based on an approximate nearest neighbors index over the audio-based embeddings.
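
    The retrieval step described above can be illustrated with a minimal sketch. It is not the project's actual implementation: the profile does not say which index or library was used, so random-hyperplane hashing stands in here as one simple approximate nearest neighbors technique. The class name, track IDs, and 4-dimensional vectors are all hypothetical.

```python
import math
import random
from collections import defaultdict


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class HyperplaneIndex:
    """Toy approximate nearest-neighbor index (random-hyperplane hashing)."""

    def __init__(self, dim, n_planes=4, seed=0):
        rng = random.Random(seed)
        # Each random hyperplane contributes one bit to an item's bucket key.
        self.planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)]
                       for _ in range(n_planes)]
        self.buckets = defaultdict(list)
        self.vectors = {}

    def _key(self, vec):
        # Record on which side of every hyperplane the vector falls.
        return tuple(sum(p * v for p, v in zip(plane, vec)) >= 0.0
                     for plane in self.planes)

    def add(self, item_id, vec):
        self.vectors[item_id] = vec
        self.buckets[self._key(vec)].append(item_id)

    def query(self, vec, k=3):
        # Only items in the query's bucket are scored; this is what makes
        # the search approximate (true neighbors in other buckets are missed).
        candidates = self.buckets.get(self._key(vec), [])
        ranked = sorted(candidates,
                        key=lambda item_id: cosine(vec, self.vectors[item_id]),
                        reverse=True)
        return ranked[:k]


# Hypothetical embeddings, standing in for the output of the audio model.
index = HyperplaneIndex(dim=4)
index.add("track_a", [1.0, 0.0, 0.0, 0.0])
index.add("track_b", [0.0, 1.0, 0.0, 0.0])
index.add("track_c", [0.9, 0.1, 0.0, 0.0])

# Embedding of a new track with no consumption history (also hypothetical).
similar = index.query([2.0, 0.0, 0.0, 0.0], k=2)
```

    A production index would typically use several hash tables or a dedicated library to trade recall against latency. The single table here only shows the bucketing idea.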

  • Chatbot System for Questions and Answers and Domain-specific Information Retrieval

    The chatbot system was developed for an NGO serving the needs of minority groups. The goal was to help improve the work of the staff advising the NGO's target groups. It was developed on a very tight schedule of two months.

    I worked together with a product manager to outline the requirements, establish milestones, and develop the whole solution. We used the Rasa framework for the chatbot and built an easy-to-maintain questions and answers database. The chatbot was also integrated with the NGO's search engine to facilitate the retrieval of relevant information, and it was successfully deployed within the required project timeline.

Skills

  • Languages

    Python 3, Scala, C, SQL, OCaml, C++
  • Frameworks

    Spark
  • Libraries/APIs

    TensorFlow, PyTorch, Rasa NLU, Scikit-learn, Pandas
  • Paradigms

    Data Science
  • Other

    Machine Learning, Machine Learning Operations (MLOps), Data Analysis, Natural Language Processing (NLP)
  • Tools

    Rasa.ai
  • Platforms

    Linux, Google Cloud Platform (GCP)
