Rudolf Eremyan

Verified Expert in Engineering

Data Science Developer

Location

Tbilisi, Georgia

Toptal Member Since

August 2, 2018

Rudolf is a data scientist with eight years of experience in the field. He developed the first chatbot framework for the Georgian language, which the largest bank in Georgia adopted. Rudolf has designed big data processing pipelines and analytics solutions based on cloud technologies for Fortune 500 companies. He has been invited to speak and judge at international hackathons and conferences such as PyData, Google DevFest, and NASA's International Space Apps Challenge.

Portfolio

Amgreat North America

Python, Data Science, Plotly, Data Engineering, Amazon Web Services (AWS)...

Midea - Main

Python, Data Science, Data Scraping, Sentiment Analysis, Agile Data Science...

Staude Capital

Data Engineering, Excel VBA, SQL, Data Science, Amazon Web Services (AWS)...

Experience

Python - 8 years
Pandas - 6 years
SQL - 6 years
Amazon Web Services (AWS) - 5 years
Data Science - 5 years
Data Engineering - 4 years
Statistics - 4 years
PySpark - 1 year

Availability

Part-time

Preferred Environment

Amazon Web Services (AWS), Python, Big Data, PostgreSQL, SQL, PySpark, Data Modeling, Data Pipelines, Pandas, Data Scraping

The most amazing...

...thing I've developed is a chatbot framework for the Georgian language.

Work Experience

Data Engineer

2023 - 2023

Amgreat North America

Developed scripts for parsing data from social media platforms, contributing to streamlined data analysis and information retrieval processes.
Implemented a topic modeling solution to extract valuable insights from complex datasets, enhancing the depth and efficiency of data analysis processes.
Designed interactive dashboard prototypes using Streamlit and Plotly libraries, elevating data visualization capabilities for enhanced user engagement and comprehension.
Implemented and deployed automated data pipelines on AWS, optimizing data workflows for increased efficiency and scalability.

Technologies: Python, Data Science, Plotly, Data Engineering, Amazon Web Services (AWS), GraphQL, Selenium, JavaScript, Machine Learning, Natural Language Processing (NLP), Docker, Web Scraping, ETL

Data Scientist

2023 - 2023

Midea - Main

Developed scripts for collecting data from eCommerce platforms.
Used cloud service providers for computation and AI-based data analysis.
Designed an advanced insights analytics dashboard with AWS QuickSight.

Technologies: Python, Data Science, Data Scraping, Sentiment Analysis, Agile Data Science, Web Scraping, ETL, Machine Learning

Data Engineer

2021 - 2023

Staude Capital

Designed a data model based on customer-provided requirements and business needs.
Developed an investor CRM system for managing hedge fund trades, orders, and other operations.
Created automated reporting tools and deployed them on Amazon Web Services.
Developed an internal communication and notification system.

Technologies: Data Engineering, Excel VBA, SQL, Data Science, Amazon Web Services (AWS), Hedge Funds, Python, Pandas, Data Modeling, Docker, ETL

Data Scientist

2020 - 2022

ATH Digital LLC

Created data ingestion scripts for pulling data from ad platforms like Google Ads and Facebook Ads.
Automated the upload of CSV and Excel file data into the database using AWS services.
Set up the cloud infrastructure for the streaming marketing data processing pipeline.
Designed a database model based on the data science team's requirements.
Created a model for forecasting and visualizing the balance burn rate metric.

Technologies: Docker, Plotly, PostgreSQL, Jupyter Notebook, Pandas, AdWords API, Facebook API, Cron, Python, Amazon Kinesis, Amazon EC2, Docker Compose, Jupyter, Google Analytics API, Apache Airflow, Big Data, Amazon Web Services (AWS), ETL

Senior Data Scientist

2019 - 2020

Zelos.AI

Processed and analyzed over 100 million athletic performance data points with PySpark running on AWS EMR.
Designed a data model based on the company's business requirements.
Built a batch data processing pipeline orchestrated by Airflow.
Created a data scraping tool for parsing dynamic and static web pages using Scrapy, Selenium, and lxml.
Developed athletics competition simulations based on the Monte Carlo approach.

Technologies: Amazon Elastic MapReduce (EMR), PySpark, Jupyter, Amazon Web Services (AWS), Statistics, Data Science, Amazon DynamoDB, Amazon EC2, lxml, Data Modeling, Database Modeling, Code Architecture, Markov Model, Markov Chain Monte Carlo (MCMC) Algorithms, Scrapy, DB, Data Scraping, Selenium, Data Engineering, Machine Learning, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), GPT, ETL, Docker, Python, Apache Airflow, Pandas, Big Data, Web Scraping

Data Scientist

2018 - 2019

Windsor.AI

Optimized existing SQL queries, reducing their complexity and improving performance.
Used SQL to gain insights and detect anomalies and problems in the collected data.
Created a workflow for the data migration between different database management systems.
Developed scripts for ingesting data from different online advertising platforms.
Designed new database tables according to the analytics team's requirements.

Technologies: Jupyter, DB, Marketing, Google Analytics, PostgreSQL, SQL, Statistics, R, Pandas, Python, Docker, Facebook API, AdWords API, Big Data, Amazon Web Services (AWS), ETL

Data Scientist

2018 - 2019

Frontier Data Corporation

Developed models for trend detection in the Twitter stream.
Developed the architecture of an AI-based application.
Integrated in-house ML models with cloud services such as IBM BlueMix and Google Cloud NLP.
Worked with big datasets using Google BigQuery.
Created custom modules for evaluating new ML models.
Trained machine learning models for text classification.
Created tests for existing applications.

Technologies: Jupyter, DB, Time Series Analysis, R, GPT, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Big Data, Python, Pandas, Docker, PostgreSQL, Amazon Web Services (AWS)

Data Scientist

2016 - 2018

Pulsar AI

Developed a chatbot framework for the Georgian language applying machine learning and natural language processing (NLP) techniques.
Trained and deployed a machine learning model for automated grouping of news and articles from Georgian media websites.
Designed a tool for sentiment classification on texts from social networks.
Analyzed a large volume of user conversation data using NLP and statistics and presented precise results.
Worked with time series for analyzing and predicting cryptocurrency prices.
Managed a team of linguists who worked on the data collection and labeling.

Technologies: Jupyter, DB, MongoDB, Git, Docker, NumPy, Pandas, SpaCy, fastText, Natural Language Toolkit (NLTK), Gensim, Scikit-learn, Python, PostgreSQL, Amazon Web Services (AWS), Web Scraping, ETL, Machine Learning

Software Developer Internship

2016 - 2016

Virtuace Inc.

Fixed bugs.
Expanded functionality of the existing application.
Tested new modules.

Technologies: XML, Java, Git, Linux, Docker

Full-stack Software Engineer

2014 - 2016

Georgian Technical University

Developed the front-end for managing and working with linguistic corpora.
Created web services for operating with linguistic corpus data.
Organized the database structure for storing and manipulating the linguistic corpora.
Analyzed documents using NLP tools and presented results in a clear manner.

Technologies: DB, Python, Natural Language Toolkit (NLTK), Linguistics, MySQL, REST, JavaScript, CSS, HTML, PostgreSQL

Experience

Consumer Insights Analysis

Created interactive customer insights dashboards by developing data collection tools, conducting sentiment analysis on the gathered datasets, and constructing an engaging and user-friendly dashboard using AWS QuickSight.

Social Media Monitoring

Designed and implemented automated data pipelines on AWS for gathering information from diverse social media platforms as part of an internal social media monitoring service. Developed analytics metrics for extracting insights and presented them through an interactive dashboard tailored for the product team.

Multi-asset Hedge Fund Management System

As a data engineer at a hedge fund, I created a data model by translating financial Excel sheets and business requirements. I implemented a multi-user interface on a widely used cloud service to manage assets and data within the database efficiently. I also established data pipelines to collect financial data from diverse banks and financial services. Additionally, I developed reporting mechanisms and internal communication services to enhance data accessibility and communication within the organization.

Trend Detection in Twitter Stream

I employed natural language processing algorithms and time series analysis techniques to create a model for early trend detection within the Twitter stream. I also crafted scripts utilizing the Twitter API to extract and analyze data from the Twitter stream. Then, I enhanced the interpretability of the results by visualizing the analysis outcomes through various plots.

Attribution Modeling for Marketing Optimization

I implemented attribution modeling, a technique to assess the financial influence of communication on key business objectives such as sales, customer retention, revenue, and profit. I also utilized SQL extensively for data manipulation and analysis, along with Python and R libraries.

I developed scripts for data migration and client notifications and implemented data integrity tests to ensure the completeness and accuracy of existing data. During this project, I collaborated effectively with an international team spread across different geographical locations.

Advanced News Filter

Analyzed a large news dataset using Google BigQuery.

Trained machine learning models for text classification that were used in a text filtering mechanism. Integrated cloud ML services such as IBM BlueMix and Google Cloud NLP with an existing application.

Chatbot Framework for Georgian Language

https://www.facebook.com/TBCTIbot/

Ti-Bot, the first-ever chatbot to speak Georgian.

Automated News Article Grouping Tool

The news article grouping tool combines word vectorization techniques with clustering algorithms to automatically group similar articles parsed from news websites.
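As a rough illustration of this vectorize-then-cluster idea (not the tool's actual code), the sketch below uses plain term-frequency vectors, cosine similarity, and a simple single-pass threshold clustering. The function names, the 0.5 threshold, and the sample headlines are all assumptions made for the example; the original tool's vectorizer and clustering algorithm are not specified here.

```python
import math
import re
from collections import Counter

def vectorize(text):
    """Turn a text into a bag-of-words term-frequency vector."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def group_articles(articles, threshold=0.5):
    """Single-pass clustering: an article joins the first group whose seed is similar enough."""
    vectors = [vectorize(a) for a in articles]
    groups = []  # each group is a list of article indices; the first index is the seed
    for i, vec in enumerate(vectors):
        for group in groups:
            if cosine(vec, vectors[group[0]]) >= threshold:
                group.append(i)
                break
        else:
            # No existing group was similar enough, so this article seeds a new one.
            groups.append([i])
    return groups

# Hypothetical headlines, for illustration only.
articles = [
    "Central bank raises interest rates again",
    "Interest rates raised by the central bank",
    "National football team wins the final match",
]
groups = group_articles(articles)
```

With these sample headlines, the two banking stories end up in one group and the football story in another. A production system would more likely use learned word embeddings and a standard clustering algorithm rather than this threshold rule.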

Social Media Sentiment Analysis Tool

The social media sentiment analysis tool combines natural language processing technologies and machine learning algorithms to predict the sentiment of comments and posts collected from social networks such as Facebook and Instagram.

Spell Checker for Georgian Language

The spell checker combines classical algorithms with machine learning and natural language processing methods to detect and correct mistakes in sentences. This product is used by the largest companies in Georgia for detecting and correcting mistakes in documents.

NLP Tool for Automatic Identification of Georgian Dialects

A tool for automatic identification of Georgian dialects in documents from different sources such as forums and social networks. It's based on machine learning classification methods and NLP approaches. During development, I worked with a group of linguists who prepared training and evaluation data for a classification model.

This project was awarded the "Best Scientific Research of the Tbilisi State University 76th Student Conference."

Cryptocurrency Prices Monitoring Tool

The cryptocurrency price monitoring tool uses time series analysis algorithms and the Twitter API, combined with NLP tools such as sentiment analysis, to monitor and predict price movements of Bitcoin and other cryptocurrencies.

Linguistic Corpus Management System

Developed a web application for storing, manipulating, and analyzing linguistic data.

ETL Pipeline for Pharmaceutical Industry Data

Worked with the client's team to build a new database for the pharmaceutical industry by collecting, cleaning, and managing data from different sources. Used AWS services for implementing ETL, storing logs, etc.

Simulation of the Tokyo 2020 Olympic Games

Parsed and analyzed a large volume of athletes' performance data. Applied the Monte Carlo statistical approach to athletes' performance data to simulate track and field competitions. Used AWS cloud services for running computations and storing generated results.
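To show what a Monte Carlo simulation of a track event can look like in its simplest form, here is a minimal sketch. It assumes each athlete's finishing time is normally distributed around a personal mean; that modeling choice, the function name, and the figures are hypothetical and are not taken from the project itself.

```python
import random
from collections import Counter

def simulate_race(athletes, n_runs=10_000, seed=42):
    """Estimate each athlete's win probability by repeated random sampling.

    `athletes` maps a name to (mean_time, std_dev) in seconds. Both the
    inputs and the normal-distribution assumption are hypothetical.
    """
    rng = random.Random(seed)
    wins = Counter()
    for _ in range(n_runs):
        # Draw one finishing time per athlete for this simulated race.
        times = {name: rng.gauss(mean, std) for name, (mean, std) in athletes.items()}
        # In a track event, the lowest time wins.
        wins[min(times, key=times.get)] += 1
    return {name: wins[name] / n_runs for name in athletes}

# Made-up 100 m figures, for illustration only.
field = {
    "Athlete A": (9.90, 0.08),
    "Athlete B": (9.95, 0.08),
    "Athlete C": (10.05, 0.10),
}
probabilities = simulate_race(field)
```

The returned dictionary gives the share of simulated races each athlete won, which serves as an estimate of their win probability under the stated assumptions.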

Publication

Four Pitfalls of Sentiment Analysis Accuracy

https://www.toptal.com/deep-learning/4-sentiment-analysis-accuracy-traps

Publication

Efficiency at Scale: A Tale of AWS Cost Optimization

https://www.toptal.com/aws/aws-cost-optimization-at-scale

Skills

Languages

Python, SQL, XML, JavaScript, Java, HTML, CSS, R, Bash, Excel VBA, GraphQL

Frameworks

Selenium, Flask, Scrapy, Spark

Libraries/APIs

Pandas, Beautiful Soup, REST APIs, XGBoost, SciPy, NumPy, SpaCy, Scikit-learn, Natural Language Toolkit (NLTK), Twitter API, PySpark, Google AdWords, Matplotlib, Google Cloud API, AdWords API, Facebook API, Google Analytics API, Node.js

Tools

Trello, Jupyter, GitHub, Gensim, Apache Airflow, pgAdmin, Bitbucket, Git, Cron, Plotly, Amazon Elastic MapReduce (EMR), Google Analytics, Docker Compose, Spark SQL

Paradigms

Data Science, ETL, Scrum, REST, Database Design, Anomaly Detection

Platforms

Jupyter Notebook, Docker, Amazon Web Services (AWS), Linux, Amazon EC2, Appsmith

Storage

PostgreSQL, MySQL, DB, MongoDB, Database Modeling, Amazon DynamoDB, Redshift, Data Lakes, Data Pipelines, Elasticsearch

Other

Data Scraping, Big Data, Data Engineering, Text Classification, Text Mining, Data Analysis, Data Analytics, Batch File Processing, Predictive Analytics, Apache Superset, Regular Expressions, Web Scraping, Clustering Algorithms, Topic Modeling, Web Services, Data Mining, Attribution Modeling, Data Visualization, Reporting, Trading, Natural Language Processing (NLP), Markov Chain Monte Carlo (MCMC) Algorithms, Markov Model, Code Architecture, Data Modeling, lxml, fastText, Linguistics, Time Series Analysis, SSH, Machine Learning, Computational Linguistics, Statistics, Data Structures, Algorithms, IBM Cloud, Amazon Kinesis, Hedge Funds, GPT, Generative Pre-trained Transformers (GPT), Sentiment Analysis, Agile Data Science, OpenAI, HubSpot CRM, Dash, Financial Data

Industry Expertise

Marketing, Healthcare

Education

2013 - 2017

Bachelor's Degree in Computer Science

Tbilisi State University of Ivane Javakhishvili - Tbilisi, Georgia

Certifications

JUNE 2022 - PRESENT

Data Analysis Nanodegree

Udacity

MAY 2020 - PRESENT

AWS Certified Solutions Architect Associate 2020

CloudGuru

AUGUST 2019 - PRESENT

Marketing Analytics with R

Datacamp.com

DECEMBER 2018 - DECEMBER 2019

Google Analytics Individual Qualification

Digital Academy for Ads

JULY 2017 - PRESENT

Deep Learning Summer School

University of Deusto

JANUARY 2017 - PRESENT

Deep Learning Nanodegree

Udacity

FEBRUARY 2016 - PRESENT

Machine Learning Online Course

Stanford University

FEBRUARY 2016 - PRESENT

Language and Modern Technologies

Goethe University Frankfurt/Main
