Rudolf Eremyan, Data Science Developer in Tbilisi, Georgia

Member since July 3, 2018
Rudolf is a data scientist with five years of experience in natural language processing and machine learning. He developed the first chatbot framework for the Georgian language, which was adopted by the largest bank in Georgia, and created AI-based tools for companies in the USA and Europe. His most recent project was a marketing campaign optimization tool used by Fortune 500 companies.

Portfolio

  • Zelos.AI
    Statistics, Data Science, AWS DynamoDB, AWS Lambda, AWS EC2, AWS S3, lxml...
  • Windsor.AI
    Marketing, Google Analytics, PostgreSQL, SQL, Statistics, R, Pandas, Python
  • Frontier Data Corporation
    Time Series Analysis, R, Natural Language Processing (NLP), Big Data, Python

Location

Tbilisi, Georgia

Availability

Full-time

Preferred Environment

Scikit-learn, Git, Linux, AWS, Python

The most amazing...

...framework I've developed is a chatbot framework for the Georgian language.

Employment

  • Senior Data Scientist

    2019 - 2020
    Zelos.AI
    • Developed a data scraping tool for parsing dynamic and static web pages using Scrapy, Selenium, lxml, and other Python libraries.
    • Created a batch data processing pipeline using AWS services such as Batch, ECR, S3, and DynamoDB.
    • Applied machine learning techniques to build a tool for extracting data from raw text and malformed web pages.
    • Used Docker and Docker Compose to containerize the entire project.
    • Developed simulations of athletics competitions based on the Monte Carlo approach.
    • Designed the architecture of the platform and the data model for the database.
    Technologies: Statistics, Data Science, AWS DynamoDB, AWS Lambda, AWS EC2, AWS S3, lxml, Data Modeling, Database Modeling, Code Architecture, Markov Model, Markov Chain Monte Carlo (MCMC) Algorithms, Batch, Scrapy, DB, Data Scraping, Selenium, Data Engineering, Machine Learning, Natural Language Processing (NLP), ETL, Docker, AWS, Python
  • Data Scientist

    2018 - 2019
    Windsor.AI
    • Developed scripts for data migration between different database management systems.
    • Expanded the existing data preprocessing flow using Python and R libraries.
    • Improved the attribution modeling pipeline by integrating new features and fixing bugs.
    • Used SQL extensively to analyze data, find anomalies, and uncover valuable insights.
    • Developed and modified scripts for pulling data from different online advertising platforms.
    Technologies: Marketing, Google Analytics, PostgreSQL, SQL, Statistics, R, Pandas, Python
  • Data Scientist

    2018 - 2019
    Frontier Data Corporation
    • Developed models for trend detection in the Twitter stream.
    • Developed the architecture of an AI-based application.
    • Integrated in-house ML models with cloud services such as IBM BlueMix and Google Cloud NLP.
    • Worked with big datasets using Google BigQuery.
    • Created custom modules for evaluating new ML models.
    • Trained machine learning models for text classification.
    • Created tests for existing applications.
    Technologies: Time Series Analysis, R, Natural Language Processing (NLP), Big Data, Python
  • Data Scientist

    2016 - 2018
    Pulsar AI
    • Developed a chatbot framework for the Georgian language.
    • Created an automated news article grouping tool.
    • Designed a tool for sentiment classification on texts from social networks.
    • Worked with time series to analyze and predict cryptocurrency prices.
    • Analyzed data and presented results in a clear manner.
    Technologies: MongoDB, Git, Docker, NumPy, Pandas, SpaCy, fastText, Keras, NLTK, Gensim, Scikit-learn, Python
  • Software Developer Internship

    2016 - 2016
    Virtuace Inc.
    • Fixed bugs.
    • Expanded functionality of the existing application.
    • Tested new modules.
    Technologies: XML, Apache Tomcat, Java, Git, Linux
  • Full Stack Software Engineer

    2014 - 2016
    Georgian Technical University
    • Developed the front-end for managing and working with linguistic corpora.
    • Created web services for operating with linguistic corpus data.
    • Organized database structure for storing and manipulating the linguistic corpora.
    • Analyzed documents using NLP tools and presented results in a clear manner.
    Technologies: Python, NLTK, Linguistics, MySQL, REST, JavaScript, CSS, HTML

Experience

  • Trend Detection in Twitter Stream (Development)

    Developed a model for early trend detection in the Twitter stream by combining natural language processing algorithms with time series analysis approaches.
    Developed scripts for pulling and analyzing the Twitter stream using the Twitter API.

    Visualized the results of the analysis with various plots for easier interpretation.

  • Attribution Modeling for Marketing Optimization (Development)

    Attribution modeling is the method used to measure the monetary impact a piece of communication has on real business goals, for example, sales, customer retention, revenue, and profit.

    While working on this project, I used SQL extensively for data manipulation and analysis, along with Python and R libraries. I developed data migration and client notification scripts, and implemented data integrity tests to check the completeness and correctness of existing data. I worked with an international team distributed around the world.

  • Advanced News Filter (Development)

    Analyzed a large news dataset using Google BigQuery.

    Trained machine learning models for text classification, which were used in the text filtering mechanism. Integrated cloud ML services such as IBM BlueMix and Google Cloud NLP with an existing application.

  • Chatbot Framework for Georgian Language (Development)
    https://www.facebook.com/TBCTIbot/

    Ti-Bot, the first-ever chatbot to speak Georgian.

  • Automated News Article Grouping Tool (Development)

    The news article grouping tool combines word vectorization techniques with clustering algorithms to automatically group similar articles parsed from news websites.

  • Social Media Sentiment Analysis Tool (Development)

    The social media sentiment analysis tool combines natural language processing technologies and machine learning algorithms to predict the sentiment of comments and posts collected from social networks such as Facebook and Instagram.

  • Spell Checker for Georgian Language (Development)

    The spell checker combines classical algorithms with machine learning and natural language processing methods to detect and correct mistakes in sentences. This product is used by the largest companies in Georgia to detect and correct mistakes in documents.

  • Cryptocurrency Prices Monitoring Tool (Development)

    The cryptocurrency price monitoring tool uses time series analysis algorithms and the Twitter API, combined with NLP tools such as sentiment analysis, to monitor and predict price movements of Bitcoin and other cryptocurrencies.

  • NLP Tool for Automatic Identification of Georgian Dialects (Other amazing things)

    A tool for automatic identification of Georgian dialects in documents from different sources such as forums and social networks. It is based on machine learning classification methods and NLP approaches. During development, I worked with a group of linguists who prepared the training and evaluation data for the classification model.

    This project was awarded the "Best Scientific Research of the Tbilisi State University 76th Student Conference".

  • Linguistic Corpus Management System (Development)

    Developed a web application for storing, manipulating, and analyzing linguistic data.

  • ETL pipeline for pharmaceutical industry data (Development)

    Worked with the client's team to build a new database for the pharmaceutical industry by collecting, cleaning, and managing data from different sources. Used AWS services to implement ETL, store logs, etc.

  • Simulation of the Tokyo 2020 Olympic Games (Development)

    Parsed and analyzed a large volume of athletes' performance data. Applied the Monte Carlo statistical approach to this data to simulate track and field competitions. Used AWS cloud services for running computations and storing generated results.

  • Four Pitfalls of Sentiment Analysis Accuracy (Publication)
    Manually gathering information about user-generated data is time-consuming, to say the least. That's why more organizations are turning to automatic sentiment analysis methods—but basic models don't always cut it. In this article, Toptal Freelance Data Scientist Rudolf Eremyan gives an overview of some sentiment analysis gotchas and what can be done to address them.

Skills

  • Languages

    Python, SQL, Batch, XML, JavaScript, Java, HTML, CSS, R, Bash
  • Libraries/APIs

    Pandas, Scikit-learn, NLTK, Beautiful Soup, REST APIs, XGBoost, SciPy, NumPy, SpaCy, Twitter API, Google AdWords, Keras, Matplotlib, Google Cloud API, AdWords API, Facebook API, Google Analytics API
  • Tools

    Trello, Jupyter, GitHub, Gensim, pgAdmin, Bitbucket, Git, Apache Tomcat, Google Analytics
  • Paradigms

    Data Science, ETL, Scrum, REST
  • Platforms

    Amazon Web Services (AWS), Mac OS, Docker, Linux, AWS Lambda, AWS EC2
  • Storage

    AWS S3, PostgreSQL, MongoDB, MySQL, DB, Database Modeling, AWS DynamoDB, IBM BlueMix
  • Other

    Data Scraping, Machine Learning, Web Scraping, Text Classification, Text Mining, Data Analysis, Data Analytics, Data Analyst, Batch File Processing, AWS, Predictive Analytics, Data Engineering, Apache Superset, Regular Expressions, Clustering Algorithms, Topic Modeling, Web Services, Data Mining, Attribution Modeling, Natural Language Processing (NLP), Markov Chain Monte Carlo (MCMC) Algorithms, Markov Model, Code Architecture, Data Modeling, lxml, fastText, Linguistics, Big Data, Time Series Analysis, SSH, Computational Linguistics, Statistics, Data Structures, Algorithms, IBM Cloud
  • Frameworks

    Selenium, Flask, Scrapy
  • Industry Expertise

    Trading, Marketing, Healthcare

Education

  • Master's degree in Computer Science
    2017 - 2019
    Tbilisi State University of Ivane Javakhishvili - Tbilisi, Georgia
  • Bachelor's degree in Computer Science
    2013 - 2017
    Tbilisi State University of Ivane Javakhishvili - Tbilisi, Georgia

Certifications

  • AWS Certified Solutions Architect Associate 2020
    MAY 2020 - PRESENT
    CloudGuru
  • Marketing Analytics with R
    AUGUST 2019 - PRESENT
    Datacamp.com
  • Google Analytics Individual Qualification
    DECEMBER 2018 - DECEMBER 2019
    Digital Academy for Ads
  • Deep Learning Summer School
    JULY 2017 - PRESENT
    University of Deusto
  • Deep Learning Nanodegree
    JANUARY 2017 - PRESENT
    Udacity
  • Machine Learning Online Course
    FEBRUARY 2016 - PRESENT
    Stanford University
  • Language and Modern Technologies
    FEBRUARY 2016 - PRESENT
    Goethe University Frankfurt/Main
