Data Science Developer
Rudolf is a data scientist with six years of experience in the field. He developed the first chatbot framework for the Georgian language, which was adopted by the largest bank in Georgia. Rudolf has designed cloud-based big data processing pipelines for Fortune 500 companies. He has been invited to speak and judge at international hackathons and conferences such as PyData, Google DevFest, and NASA's International Space Apps Challenge.
Experience
Python - 6 years
Data Science - 5 years
SQL - 5 years
Pandas - 5 years
Statistics - 4 years
Amazon Web Services (AWS) - 4 years
Data Engineering - 4 years
PySpark - 1 year
Amazon Web Services (AWS), Python, Big Data, Apache Airflow, PostgreSQL, SQL, PySpark, Data Modeling, Data Pipelines, Pandas
The most amazing framework I've developed is a chatbot framework for the Georgian language.
Data Engineer for a Cloud Solution
- Designed a data model based on customer-provided requirements and business needs.
- Developed an investor CRM system for managing hedge fund trades, orders, and other operations.
- Created automated reporting tools and deployed them on Amazon cloud services.
ATH Digital LLC
- Created data ingestion scripts for pulling data from ad platforms like AdWords and Facebook Ads.
- Automated the upload of CSV and Excel file data into the database using AWS services.
- Set up the cloud infrastructure for the marketing data streaming pipeline.
- Designed a database model based on the data science team's requirements.
- Created a model for forecasting and visualizing the balance burn rate metric.
Senior Data Scientist
- Processed and analyzed over 100 million athletic performance data points with PySpark running on AWS EMR.
- Designed a data model based on the company's business requirements.
- Built a batch data processing pipeline orchestrated by Airflow.
- Created a data scraping tool for parsing dynamic and static web pages using Scrapy, Selenium, and lxml.
- Developed simulations of athletics competitions based on the Monte Carlo method.
- Optimized existing SQL queries, reducing their complexity and improving performance.
- Used SQL to gain insights and detect anomalies and problems in the collected data.
- Created a workflow for data migration between different database management systems.
- Developed scripts for ingesting data from different online advertising platforms.
- Designed new database tables according to the analytics team's requirements.
Frontier Data Corporation
- Developed models for trend detection in the Twitter stream.
- Developed the architecture of an AI-based application.
- Integrated in-house ML models with cloud services such as IBM BlueMix and Google Cloud NLP.
- Worked with big datasets using Google BigQuery.
- Created custom modules for evaluating new ML models.
- Trained machine learning models for text classification.
- Created tests for existing applications.
- Developed a chatbot framework for the Georgian language using machine learning and natural language processing (NLP) techniques.
- Trained and deployed a machine learning model for automated grouping of news and articles from Georgian media websites.
- Designed a tool for sentiment classification on texts from social networks.
- Analyzed a large volume of user conversation data using NLP and statistics, and presented precise results.
- Worked with time series to analyze and predict cryptocurrency prices.
- Managed a team of linguists who worked on data collection and labeling.
Software Developer Internship
- Fixed bugs.
- Expanded the functionality of the existing application.
- Tested new modules.
Full-stack Software Engineer
Georgian Technical University
- Developed the front-end for managing and working with linguistic corpora.
- Created web services for working with linguistic corpus data.
- Organized the database structure for storing and manipulating linguistic corpora.
- Analyzed documents using NLP tools and presented results in a clear manner.
Trend Detection in Twitter Stream
Developed scripts for pulling and analyzing the Twitter stream using the Twitter API.
Visualized the results of the analysis with a variety of plots for easier interpretation.
Attribution Modeling for Marketing Optimization
While working on this project, I used SQL extensively for data manipulation and analysis, along with Python and R libraries. I developed data migration and client notification scripts and implemented data integrity tests to check the completeness and correctness of existing data. I worked with an internationally distributed team.
Advanced News Filter
Trained machine learning models for text classification, which were used in a text filtering mechanism. Integrated cloud ML services such as IBM BlueMix and Google Cloud NLP with an existing application.
Chatbot Framework for Georgian Language
https://www.facebook.com/TBCTIbot/
Automated News Article Grouping Tool
Social Media Sentiment Analysis Tool
Spell Checker for Georgian Language
Cryptocurrency Prices Monitoring Tool
NLP Tool for Automatic Identification of Georgian Dialects
This project was awarded "Best Scientific Research of the Tbilisi State University 76th Student Conference."
Linguistic Corpus Management System
ETL Pipeline for Pharmaceutical Industry Data
Simulation of the Tokyo 2020 Olympic Games
Four Pitfalls of Sentiment Analysis Accuracy
Efficiency at Scale: A Tale of AWS Cost Optimization
Pandas, Beautiful Soup, REST APIs, XGBoost, SciPy, NumPy, SpaCy, Scikit-learn, Natural Language Toolkit (NLTK), Twitter API, PySpark, Google AdWords, Matplotlib, Google Cloud API, AdWords API, Facebook API, Google Analytics API
Trello, Jupyter, GitHub, Gensim, Apache Airflow, pgAdmin, Bitbucket, Git, Cron, Plotly, Google Analytics, Docker Compose, Spark SQL
Data Science, ETL, Scrum, REST, Database Design
Jupyter Notebook, Docker, Amazon Web Services (AWS), Linux, Amazon EC2
PostgreSQL, MySQL, DB, MongoDB, Database Modeling, Amazon DynamoDB, Redshift, Data Lakes, Data Pipelines
Data Scraping, Big Data, Data Engineering, Machine Learning, Text Classification, Text Mining, Data Analysis, Data Analytics, Batch File Processing, Predictive Analytics, Apache Superset, Regular Expressions, Web Scraping, Clustering Algorithms, Topic Modeling, Web Services, Data Mining, Attribution Modeling, Data Visualization, Reporting, Trading, Natural Language Processing (NLP), Markov Chain Monte Carlo (MCMC) Algorithms, Markov Model, Code Architecture, Data Modeling, lxml, fastText, Linguistics, Time Series Analysis, SSH, Computational Linguistics, Statistics, Data Structures, Algorithms, IBM Cloud, Amazon Kinesis, Hedge Funds, Generative Pre-trained Transformers (GPT)
Selenium, Flask, Scrapy, AWS EMR, Spark
Bachelor's Degree in Computer Science
Tbilisi State University of Ivane Javakhishvili - Tbilisi, Georgia
Data Analysis Nanodegree
AWS Certified Solutions Architect Associate 2020
Marketing Analytics with R
Google Analytics Individual Qualification
Digital Academy for Ads
Deep Learning Summer School
University of Deusto
Deep Learning Nanodegree
Machine Learning Online Course
Language and Modern Technologies
Goethe University Frankfurt/Main