Senior Data Engineer
2022 | The Story Market
- Deployed the entire ETL job infrastructure on AWS using containerized Apache Airflow.
- Refactored the existing data-handling code, reducing it to one-eighth of its original size.
- Developed a framework on top of Apache Airflow to reduce the cost of adding new publishers to The Story Market network.
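A framework like this typically turns publisher onboarding into declaring configuration rather than writing new pipeline code. Below is a minimal pure-Python sketch of that idea; all names and fields are hypothetical, and in the actual framework each plan would materialize as an Airflow DAG rather than a list of strings:

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class PublisherConfig:
    """Hypothetical per-publisher configuration."""
    name: str
    feed_url: str
    parser: Callable[[str], List[dict]]  # raw payload -> records

def build_pipeline(config: PublisherConfig) -> List[str]:
    """Return the ordered ETL steps for one publisher.

    In an Airflow-based framework each step would become a task in a
    generated DAG; here the plan is materialized as plain strings so
    the idea stays self-contained.
    """
    return [
        f"fetch:{config.feed_url}",
        f"parse:{config.name}",
        f"validate:{config.name}",
        f"load:{config.name}",
    ]

# Onboarding a new publisher is one config entry, not new pipeline code.
acme = PublisherConfig("acme", "https://example.com/feed.xml",
                       lambda raw: [{"line": ln} for ln in raw.splitlines()])
print(build_pipeline(acme))
```
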
Technologies: Python 3, Apache Airflow, Docker, Docker Compose, Splash, Scrapy, MongoDB, Amazon Web Services (AWS), Amazon S3 (AWS S3), Amazon EC2, AWS Lambda, AWS RDS, PostgreSQL, Pandas, Web Scraping, Data Scraping, Big Data, SQL, Python, Databases, GitHub, Jupyter, ETL, Data Analysis, Statistical Analysis, Distributed Systems, Software Engineering, CI/CD Pipelines, Data-informed Recommendations, Microservices, Data Pipelines, Scraping, XML Parsing, HTML Parsing, REST APIs, Data Engineering, Redis, AWS Kinesis, NumPy, Datasets, Email Parsing, Document Parsing, PDF, NoSQL, Data Modeling, Database Modeling, Software Architecture, Product Development, Team Leadership, Product Roadmaps, Text Classification, Web Crawlers, Data Collection, Matplotlib, Information Extraction, SQLAlchemy, DataFrames, Mypy

Software Developer
2021 - 2022 | Bar-All
- Led the Odoo application database migration for the upgrade from version 13 to version 14.
- Refactored and upgraded legacy Odoo applications to make them work in a more modern Odoo 14 environment.
- Implemented Odoo application customizations to make day-to-day business operations seamless and less error-prone.
Technologies: Python 3, XML, MVC Design, Web, Odoo, Odoo.sh, Apps, PostgreSQL, APIs, SQL, Python, Databases, GitHub, Software Engineering, Back-end Development, CI/CD Pipelines, Data-informed Recommendations, Scraping, REST APIs, Document Parsing, PDF, Data Modeling, Database Modeling, Software Architecture, Product Development, Team Leadership, Product Roadmaps, Information Extraction, Pytest

Data Engineer
2020 - 2022 | Marketsonics LLC
- Refactored existing ETL Python code, achieving a 10x reduction in total lines of code and vastly improving readability and maintainability.
- Designed and implemented a custom cloud architecture to handle specific ETL workloads efficiently and cheaply using Apache Airflow, AWS Batch, and Docker.
- Set up automatic alerts for any failure during periodic jobs.
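The failure-alert pattern behind such a setup can be sketched in plain Python. This is a hedged illustration only: the actual setup used Airflow and a Slack app, and `send_alert` here is a hypothetical stand-in for whatever notification channel is wired in:

```python
import functools
import traceback

def alert_on_failure(send_alert):
    """Decorator: run the job; if it raises, ship a short alert
    (e.g. to Slack) before re-raising so the scheduler still
    records the failure."""
    def decorator(job):
        @functools.wraps(job)
        def wrapper(*args, **kwargs):
            try:
                return job(*args, **kwargs)
            except Exception as exc:
                send_alert(f"Job {job.__name__} failed: {exc!r}\n"
                           f"{traceback.format_exc()}")
                raise
        return wrapper
    return decorator

alerts = []  # stand-in sink; a real sink would post to a webhook

@alert_on_failure(alerts.append)
def nightly_job():
    raise RuntimeError("source file missing")

try:
    nightly_job()
except RuntimeError:
    pass
print(alerts[0].splitlines()[0])
```

Re-raising after alerting is the important design choice: the orchestrator still marks the run as failed and can retry, while the team gets notified immediately.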
Technologies: Amazon Web Services (AWS), Bash, PostgreSQL, Terraform, ETL, CSV, Slack App, Bitbucket, Apache Airflow, Docker, Python 3, Web Scraping, Data Scraping, Big Data, SQL, Python, Databases, Jupyter, Data Analysis, Statistical Analysis, Distributed Systems, Software Engineering, Quantitative Analysis, Back-end Development, CI/CD Pipelines, Data-informed Recommendations, Microservices, Data Pipelines, Scraping, XML Parsing, HTML Parsing, Data Engineering, Redis, AWS Kinesis, WebSockets, NumPy, Datasets, Back-end, Email Parsing, Document Parsing, TypeScript, NoSQL, Data Modeling, Database Modeling, Software Architecture, Product Development, Team Leadership, Product Roadmaps, Text Classification, Web Crawlers, MySQL, Text Analytics, Snowflake, DevOps, Data Collection, Matplotlib, Information Extraction, NLP Pipeline, R, SQLAlchemy, DataFrames, Mypy

Full-stack Developer | Machine Learning Engineer
2019 - 2020 | inovex GmbH
- Developed a fully functional image-captioning service with a team of four, including a web page and a highly available REST API.
- Implemented and trained an image captioning model from scratch based on the latest research papers in the field.
- Designed the microservice architecture for the image-captioning service and its deployment on Google Cloud Platform (GCP).
- Implemented continuous integration/deployment (CI/CD) pipelines for all the microservices using GitLab enterprise tools.
- Wrote a blog post about the project, covering its technologies and management methodologies.
- Established Google Cloud Budget alerts to automatically monitor and notify team members about possible budget overruns.
Technologies: gRPC, Terraform, Google Cloud Platform (GCP), Helm, Kubernetes, Docker, Vue, JavaScript, PyTorch, Flask, Python, Deep Learning, Databases, GitLab, Jupyter, Google Cloud Storage, Model Development, Classification Algorithms, Natural Language Processing (NLP), Mathematics, Distributed Systems, Software Engineering, Quantitative Analysis, Back-end Development, Google Cloud, Google Kubernetes Engine (GKE), CI/CD Pipelines, Data-informed Recommendations, Microservices, Data Pipelines, Node.js, Redis, NumPy, Machine Learning, Datasets, Computer Vision, Back-end, PDF, TypeScript, NoSQL, Data Modeling, Database Modeling, Software Architecture, Product Development, Team Leadership, Product Roadmaps, Artificial Intelligence (AI), Artificial Neural Networks (ANN), Neural Networks, Text Classification, PostgreSQL, Text Analytics, Machine Learning Operations (MLOps), DevOps, Data Collection, Matplotlib, Natural Language Understanding (NLU), Information Extraction, NLP Pipeline, PyTorch Lightning, Hugging Face, Amazon SageMaker, R, BERT, SQLAlchemy, DataFrames

Tutor, Master's Course in Foundations of Data Engineering
2019 - 2020 | Technical University of Munich
- Conducted tutorial sessions with students, explaining the most important aspects of the lecture and answering questions.
- Held study sessions for students who needed individual help with the lectures and assignments.
- Took part in grading assignments and final exams.
Technologies: Spark, Scala, C++, Bash, Big Data, SQL, Databases, Data Science, GitHub, GitLab, ETL, Statistical Analysis, Classification Algorithms, Natural Language Processing (NLP), Mathematics, Distributed Systems, Quantitative Analysis, CI/CD Pipelines, Data Pipelines, Scraping, XML Parsing, HTML Parsing, PySpark, Redis, WebSockets, NumPy, Machine Learning, Datasets, Computer Vision, Email Parsing, Document Parsing, PDF, NoSQL, Data Modeling, Database Modeling, Software Architecture, Artificial Intelligence (AI), Artificial Neural Networks (ANN), Text Classification, Web Crawlers, MySQL, PostgreSQL, Snowflake, Data Collection, Natural Language Understanding (NLU), Information Extraction, PyTorch Lightning, BERT, SQLAlchemy, DataFrames

Senior Software Developer | Data Scientist
2016 - 2020 | Leavingstone
- Developed a framework for creating and deploying dialog systems (chatbots) on Facebook Messenger.
- Implemented numerous chatbots, the best of which had more than a million interactions and a second-day retention rate of 20%.
- Created a graph-based web interface for assembling custom chatbots by entering dialog texts, without writing a single line of code.
- Wrote a chatbot for helping citizens wrongly fined by the police in Georgia. It would query users about the circumstances of the offense and automatically generate a personalized appeal PDF for submission to the court.
- Developed a custom real-time data analytics platform for a retail chain with more than 250 stores throughout Tbilisi.
- Implemented a recommendation system for a large retail business. It would automatically generate special offers and gifts for loyal customers, using machine learning tools.
- Created a web scraper to continuously gather publicly available data about all the parking tickets written in Tbilisi.
- Designed and implemented a data analytics web page covering all the parking tickets in Tbilisi, including a heatmap, other chart types, and textual commentary with analysis.
- Built an API for guessing a person's nationality based on their first and last name, using deep natural language processing (Fast.ai).
- Built a REST API for transliterating from English to Georgian, using Flask and SQL.
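The core of such a transliteration API can be sketched as a greedy longest-match lookup. This is an illustrative sketch only: the mapping below is a partial Latin-to-Georgian (Mkhedruli) table, and the Flask routing and SQL-backed storage of the real service are omitted:

```python
# Partial Latin -> Georgian mapping; two-letter digraphs listed first.
MAPPING = {
    "ch": "ჩ", "sh": "შ", "dz": "ძ", "ts": "ც", "kh": "ხ", "gh": "ღ", "zh": "ჟ",
    "a": "ა", "b": "ბ", "g": "გ", "d": "დ", "e": "ე", "v": "ვ", "z": "ზ",
    "t": "თ", "i": "ი", "k": "კ", "l": "ლ", "m": "მ", "n": "ნ", "o": "ო",
    "p": "პ", "r": "რ", "s": "ს", "u": "უ", "q": "ქ", "j": "ჯ", "h": "ჰ",
}

def transliterate(text: str) -> str:
    """Greedy longest-match transliteration: try two-letter digraphs
    before single letters; pass unknown characters through unchanged."""
    out, i = [], 0
    while i < len(text):
        for length in (2, 1):
            chunk = text[i:i + length]
            if chunk in MAPPING:
                out.append(MAPPING[chunk])
                i += length
                break
        else:
            out.append(text[i])  # not mappable, keep as-is
            i += 1
    return "".join(out)

print(transliterate("gamarjoba"))  # the Georgian greeting
```

Trying digraphs before single letters matters because pairs like "sh" and "ch" denote one Georgian letter, not two.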
Technologies: Amazon Web Services (AWS), BigQuery, Google Cloud Platform (GCP), Flask, Django, Spark, Pandas, Kotlin, Scala, JavaScript, Java, SQL, Python, Algorithms, Web Scraping, Data Scraping, Deep Learning, Big Data, Databases, Data Science, GitHub, GitLab, Jupyter, ETL, Google Cloud Storage, Data Analysis, Statistical Analysis, Model Development, Classification Algorithms, Natural Language Processing (NLP), Mathematics, Distributed Systems, Software Engineering, Quantitative Analysis, Back-end Development, Google Cloud, Google Kubernetes Engine (GKE), CI/CD Pipelines, Recommendation Systems, Data-informed Recommendations, Microservices, Azure, Data Pipelines, Scraping, XML Parsing, HTML Parsing, Node.js, REST APIs, PySpark, Data Engineering, Redis, WebSockets, NumPy, Machine Learning, Datasets, Simulations, Computer Vision, Back-end, Email Parsing, Document Parsing, PDF, TypeScript, NoSQL, Data Modeling, Database Modeling, Software Architecture, Product Development, Team Leadership, Product Roadmaps, Artificial Intelligence (AI), Google BigQuery, Artificial Neural Networks (ANN), Neural Networks, Text Classification, Web Crawlers, MySQL, PostgreSQL, Text Analytics, Machine Learning Operations (MLOps), DevOps, Data Collection, Matplotlib, Natural Language Understanding (NLU), Information Extraction, NLP Pipeline, PyTorch Lightning, Hugging Face, Amazon SageMaker, R, Pytest, DataFrames, Mypy

Python Developer
2019 | Elasticiti Inc
- Refactored the existing Apache Airflow project to remove code duplication and make it easily extensible.
- Examined existing ETL pipelines and tracked and fixed bugs.
- Implemented, tested, and deployed a new ETL pipeline for handling daily updated raw client data.
- Created detailed Markdown wiki documentation about my work and other parts of the existing project.
Technologies: Snowflake, Markdown, Amazon Web Services (AWS), JSON, Apache Airflow, SQL, Requests, Pandas, Python, Deep Learning, Big Data, Databases, Data Science, GitHub, ETL, Data Analysis, Software Engineering, Google Cloud, CI/CD Pipelines, Data Pipelines, XML Parsing, REST APIs, Data Engineering, Redis, NumPy, Machine Learning, Datasets, Computer Vision, Email Parsing, PDF, TypeScript, NoSQL, Data Modeling, Database Modeling, Software Architecture, Web Crawlers, MySQL, Information Extraction, DataFrames

Senior Software Developer | Web Automation Engineer
2018 | The Rundown
- Developed a scraping tool for automatically gathering live data from various sports-betting websites.
- Implemented a REST API using Flask to make the sports-betting data easily accessible for other services.
- Built automated testing/integration tools for easy and painless software updates.
- Conducted daily stand-up meetings for all the developers to catch up with each other and plan the following day.
- Developed a REST API to automatically and instantly place bets on various sports betting websites.
- Implemented an algorithm to gather all the available sports-betting data and find and place profitable bets.
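The core of a profitable-bet (arbitrage) check is simple: if the implied probabilities from the best available odds across bookmakers sum to less than 1, a guaranteed profit exists regardless of outcome. A minimal sketch under that assumption; the data-gathering and bet-placement layers are omitted, and the odds below are illustrative:

```python
def find_arbitrage(best_odds, bankroll=100.0):
    """best_odds: best decimal odds per outcome across all bookmakers.
    Return the stake per outcome if a sure bet exists, else None."""
    implied = [1.0 / o for o in best_odds]  # implied probability per outcome
    total = sum(implied)
    if total >= 1.0:
        return None  # no arbitrage: the margin favors the bookmakers
    # Stake each outcome in proportion to its implied probability so
    # that every outcome pays out the same amount, bankroll / total.
    return [bankroll * p / total for p in implied]

# Two-outcome market where each side is 2.1 at a different bookmaker:
stakes = find_arbitrage([2.1, 2.1])
print(stakes)  # equal stakes; either outcome pays more than the 100.0 outlay
```

In practice the hard part is the one the bullets describe: continuously scraping fresh odds fast enough that the window has not closed by the time the bets are placed.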
Technologies: Amazon Web Services (AWS), Docker Compose, Docker, MySQL, Spring, Flask, Scrapy, Requests, HTTP, Selenium, Java, Python, Algorithms, Web Scraping, Data Scraping, Big Data, Databases, GitHub, Jupyter, ETL, Data Analysis, Model Development, Distributed Systems, Quantitative Analysis, Microservices, Data Pipelines, Scraping, XML Parsing, HTML Parsing, Node.js, REST APIs, Data Engineering, NumPy, Datasets, Computer Vision, Back-end, Document Parsing, NoSQL, Data Modeling, Database Modeling, Software Architecture, Web Crawlers, Text Analytics, DevOps, Matplotlib, Natural Language Understanding (NLU), Information Extraction, NLP Pipeline, Pytest, DataFrames, Mypy

Software/Web Scraping Engineer
2016 - 2018 | Freelance
- Implemented a data aggregation framework that collected, processed, and extracted daily data about various online game stats to help the client (Jabre Capital Partners) with stock market trading decisions.
- Developed a desktop GUI program to monitor rare items in eCommerce shops and automatically purchase them when they came back in stock; the target websites were Amazon, Walmart, Best Buy, and The Source.
- Built a microframework in Python for writing auto-trader robots for cryptocurrencies, using the Coinigy API.
- Wrote a Python program to use the AWS API and automatically manage tags for all the resources.
- Developed a web crawler for collecting and storing discount coupon codes.
- Created a tool for exporting slides as JPEG images from Microsoft PowerPoint using Python, Unoconv, and Convert.
- Rewrote a large numerical analysis project from Visual Basic to modern Python 3 with careful testing to guarantee the same output.
- Implemented and deployed a custom trading strategy on Python 3 and the Coinigy platform.
- Wrote more than 20 small web-scraping programs using Python.
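The extraction step of a coupon-code crawler like the one above can be sketched with the standard library alone. This is a hedged illustration: the fetch and storage layers are omitted, and the pattern (6-12 uppercase alphanumerics near the word "code") is an assumption, not the rule the real crawler used:

```python
import re

# Assumed convention: a coupon code is 6-12 alphanumerics that
# follow the word "code" in the page text.
COUPON_RE = re.compile(r"\bcode[:\s]+([A-Z0-9]{6,12})\b", re.IGNORECASE)

def extract_coupons(page_text: str) -> list:
    """Return de-duplicated coupon codes found in a page, in order."""
    seen, codes = set(), []
    for match in COUPON_RE.finditer(page_text):
        code = match.group(1).upper()
        if code not in seen:
            seen.add(code)
            codes.append(code)
    return codes

sample = ("Use code SAVE20NOW at checkout. Expired: code SAVE20NOW. "
          "New code: SPRING2024.")
print(extract_coupons(sample))
```
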
Technologies: Amazon Web Services (AWS), PyQt 5, Spark, Scala, Selenium, SciPy, Scikit-learn, Pandas, Django, Scrapy, Python, Algorithms, Web Scraping, Data Scraping, Deep Learning, Big Data, SQL, Databases, GitHub, ETL, Data Analysis, Recommendation Systems, Scraping, XML Parsing, HTML Parsing, Node.js, PySpark, Data Engineering, NumPy, Simulations, Email Parsing, PDF, TypeScript, NoSQL, Data Modeling, Database Modeling, Software Architecture, Web Crawlers, Matplotlib, Information Extraction, Pytest, DataFrames