Nika Dogonadze, Data Scientist and Developer in Tbilisi, Georgia

Member since January 22, 2018
Nika has over five years of experience working in tech, specializing in Python, data engineering, web-scraping, and machine learning. He has a master's degree in data engineering and analytics and a great deal of experience working with various technologies. Nika is personable, communicates exceptionally well, and stands out with his work ethic.

Portfolio

  • The Story Market
    Python 3, Apache Airflow, Docker, Docker Compose, Splash, Scrapy, MongoDB...
  • Bar-All
    Python 3, XML, MVC Design, Web, Odoo, Odoo.sh, Apps, PostgreSQL, APIs, SQL...
  • Marketsonics LLC
    Amazon Web Services (AWS), Bash, PostgreSQL, Terraform, ETL, CSV, Slack App...

Location

Tbilisi, Georgia

Availability

Full-time

Preferred Environment

Jupyter Notebook, Visual Studio Code, PyCharm, IntelliJ IDEA, Windows, Linux, Computer Vision, Google BigQuery, Artificial Neural Networks (ANN), Text Classification, Web Crawlers, R, SQLAlchemy, Pytest, DataFrames, Mypy

The most amazing...

...project I've created is a state-of-the-art face forgery detection model found here: https://bit.ly/3gWRbIz

Employment

  • Senior Data Engineer

    2022 - 2022
    The Story Market
    • Successfully deployed the entire ETL job infrastructure on AWS using containerized Apache Airflow.
    • Refactored the existing data handling code, making it eight times smaller.
    • Developed a framework on top of Apache Airflow to reduce the cost of adding new publishers to The Story Market network.
    Technologies: Python 3, Apache Airflow, Docker, Docker Compose, Splash, Scrapy, MongoDB, Amazon Web Services (AWS), Amazon S3 (AWS S3), Amazon EC2, AWS Lambda, AWS RDS, PostgreSQL, Pandas, Web Scraping, Data Scraping, Big Data, SQL, Python, Databases, GitHub, Jupyter, ETL, Data Analysis, Statistical Analysis, Distributed Systems, Software Engineering, CI/CD Pipelines, Data-informed Recommendations, Microservices, Data Pipelines, Scraping, XML Parsing, HTML Parsing, REST APIs, Data Engineering, Redis, AWS Kinesis, NumPy, Datasets, Email Parsing, Document Parsing, PDF, NoSQL, Data Modeling, Database Modeling, Software Architecture, Product Development, Team Leadership, Product Roadmaps, Text Classification, Web Crawlers, Data Collection, Matplotlib, Information Extraction, SQLAlchemy, DataFrames, Mypy
  • Software Developer

    2021 - 2022
    Bar-All
    • Successfully led the Odoo application database migration for the upgrade process from version 13 to version 14.
    • Refactored and upgraded legacy Odoo applications to make them work in a more modern Odoo 14 environment.
    • Implemented the Odoo application customization to make day-to-day business operations seamless and less error-prone.
    Technologies: Python 3, XML, MVC Design, Web, Odoo, Odoo.sh, Apps, PostgreSQL, APIs, SQL, Python, Databases, GitHub, Software Engineering, Back-end Development, CI/CD Pipelines, Data-informed Recommendations, Scraping, REST APIs, Document Parsing, PDF, Data Modeling, Database Modeling, Software Architecture, Product Development, Team Leadership, Product Roadmaps, Information Extraction, Pytest
  • Data Engineer

    2020 - 2022
    Marketsonics LLC
    • Refactored existing ETL Python code, achieving a 10x reduction in the total lines of code, vastly improving readability and maintainability.
    • Designed and implemented a custom cloud architecture for handling specific ETL workloads efficiently and cheaply, using Apache Airflow, AWS Batch, and Docker.
    • Set up automatic alerts in case of any failure during periodic jobs.
    Technologies: Amazon Web Services (AWS), Bash, PostgreSQL, Terraform, ETL, CSV, Slack App, Bitbucket, Apache Airflow, Docker, Python 3, Web Scraping, Data Scraping, Big Data, SQL, Python, Databases, Jupyter, Data Analysis, Statistical Analysis, Distributed Systems, Software Engineering, Quantitative Analysis, Back-end Development, CI/CD Pipelines, Data-informed Recommendations, Microservices, Data Pipelines, Scraping, XML Parsing, HTML Parsing, Data Engineering, Redis, AWS Kinesis, WebSockets, NumPy, Datasets, Back-end, Email Parsing, Document Parsing, TypeScript, NoSQL, Data Modeling, Database Modeling, Software Architecture, Product Development, Team Leadership, Product Roadmaps, Text Classification, Web Crawlers, MySQL, Text Analytics, Snowflake, DevOps, Data Collection, Matplotlib, Information Extraction, NLP Pipeline, R, SQLAlchemy, DataFrames, Mypy
  • Full-stack Developer | Machine Learning Engineer

    2019 - 2020
    inovex GmbH
    • Developed a fully functional image-captioning service with a team of four, including a web page and a highly available REST API.
    • Implemented and trained an image captioning model from scratch based on the latest research papers in the field.
    • Designed the microservice architecture for a usable image-captioning service and its deployment on the Google Cloud Platform (GCP).
    • Implemented continuous integration/deployment (CI/CD) pipelines for all the microservices, using GitLab enterprise tools.
    • Wrote a blog post about project details, including information on technologies and management methodologies.
    • Established Google Cloud Budget alerts to automatically monitor and notify team members about possible budget overruns.
    Technologies: gRPC, Terraform, Google Cloud Platform (GCP), Helm, Kubernetes, Docker, Vue, JavaScript, PyTorch, Flask, Python, Deep Learning, Databases, GitLab, Jupyter, Google Cloud Storage, Model Development, Classification Algorithms, Natural Language Processing (NLP), Mathematics, Distributed Systems, Software Engineering, Quantitative Analysis, Back-end Development, Google Cloud, Google Kubernetes Engine (GKE), CI/CD Pipelines, Data-informed Recommendations, Microservices, Data Pipelines, Node.js, Redis, NumPy, Machine Learning, Datasets, Computer Vision, Back-end, PDF, TypeScript, NoSQL, Data Modeling, Database Modeling, Software Architecture, Product Development, Team Leadership, Product Roadmaps, Artificial Intelligence (AI), Artificial Neural Networks (ANN), Neural Networks, Text Classification, PostgreSQL, Text Analytics, Machine Learning Operations (MLOps), DevOps, Data Collection, Matplotlib, Natural Language Understanding (NLU), Information Extraction, NLP Pipeline, PyTorch Lightning, Hugging Face, Amazon SageMaker, R, BERT, SQLAlchemy, DataFrames
  • Master Course in Foundations of Data Engineering Tutor

    2019 - 2020
    Technical University of Munich
    • Conducted tutorial sessions with students explaining the most important aspects of the lecture and answering questions.
    • Held study sessions for students who needed individual help with the lectures and assignments.
    • Took part in grading assignments and final exams.
    Technologies: Spark, Scala, C++, Bash, Big Data, SQL, Databases, Data Science, GitHub, GitLab, ETL, Statistical Analysis, Classification Algorithms, Natural Language Processing (NLP), Mathematics, Distributed Systems, Quantitative Analysis, CI/CD Pipelines, Data Pipelines, Scraping, XML Parsing, HTML Parsing, PySpark, Redis, WebSockets, NumPy, Machine Learning, Datasets, Computer Vision, Email Parsing, Document Parsing, PDF, NoSQL, Data Modeling, Database Modeling, Software Architecture, Artificial Intelligence (AI), Artificial Neural Networks (ANN), Text Classification, Web Crawlers, MySQL, PostgreSQL, Snowflake, Data Collection, Natural Language Understanding (NLU), Information Extraction, PyTorch Lightning, BERT, SQLAlchemy, DataFrames
  • Senior Software Developer | Data Scientist

    2016 - 2020
    Leavingstone
    • Developed a framework for creating and deploying dialog systems (chatbots) on Facebook Messenger.
    • Implemented numerous chatbots, the best of which had more than a million interactions and a second-day retention rate of 20%.
    • Created a graph-based web interface for easy assembly of custom chatbots by entering dialog texts and not a single line of code.
    • Wrote a chatbot for helping citizens wrongly fined by the police in Georgia. It would query users about the circumstances of the offense and automatically generate a personalized appeal PDF for submission to the court.
    • Developed a platform of custom real-time data analytics for a retail chain with more than 250 stores throughout Tbilisi.
    • Implemented a recommendation system for a large retail business. It would automatically generate special offers and gifts for loyal customers, using machine learning tools.
    • Created a web scraper to continuously gather publicly available data about all the parking tickets written in Tbilisi.
    • Designed and implemented a data analytics web page covering all the parking tickets in Tbilisi. It included a heatmap, other types of charts, and textual comments with analysis.
    • Built an API for guessing a person's nationality based on their first and last name, using deep natural language processing (Fast.ai).
    • Built a REST API for transliterating from English to Georgian, using Flask and SQL.
    Technologies: Amazon Web Services (AWS), BigQuery, Google Cloud Platform (GCP), Flask, Django, Spark, Pandas, Kotlin, Scala, JavaScript, Java, SQL, Python, Algorithms, Web Scraping, Data Scraping, Deep Learning, Big Data, Databases, Data Science, GitHub, GitLab, Jupyter, ETL, Google Cloud Storage, Data Analysis, Statistical Analysis, Model Development, Classification Algorithms, Natural Language Processing (NLP), Mathematics, Distributed Systems, Software Engineering, Quantitative Analysis, Back-end Development, Google Cloud, Google Kubernetes Engine (GKE), CI/CD Pipelines, Recommendation Systems, Data-informed Recommendations, Microservices, Azure, Data Pipelines, Scraping, XML Parsing, HTML Parsing, Node.js, REST APIs, PySpark, Data Engineering, Redis, WebSockets, NumPy, Machine Learning, Datasets, Simulations, Computer Vision, Back-end, Email Parsing, Document Parsing, PDF, TypeScript, NoSQL, Data Modeling, Database Modeling, Software Architecture, Product Development, Team Leadership, Product Roadmaps, Artificial Intelligence (AI), Google BigQuery, Artificial Neural Networks (ANN), Neural Networks, Text Classification, Web Crawlers, MySQL, PostgreSQL, Text Analytics, Machine Learning Operations (MLOps), DevOps, Data Collection, Matplotlib, Natural Language Understanding (NLU), Information Extraction, NLP Pipeline, PyTorch Lightning, Hugging Face, Amazon SageMaker, R, Pytest, DataFrames, Mypy
  • Python Developer

    2019 - 2019
    Elasticiti Inc
    • Refactored the existing Apache Airflow project to remove code duplication and make it easily extensible.
    • Examined existing ETL pipelines and tracked and fixed bugs.
    • Implemented, tested, and deployed a new ETL pipeline for handling daily updated raw client data.
    • Wrote detailed Markdown wiki documentation about my work and other parts of the existing project.
    Technologies: Snowflake, Markdown, Amazon Web Services (AWS), JSON, Apache Airflow, SQL, Requests, Pandas, Python, Deep Learning, Big Data, Databases, Data Science, GitHub, ETL, Data Analysis, Software Engineering, Google Cloud, CI/CD Pipelines, Data Pipelines, XML Parsing, REST APIs, Data Engineering, Redis, NumPy, Machine Learning, Datasets, Computer Vision, Email Parsing, PDF, TypeScript, NoSQL, Data Modeling, Database Modeling, Software Architecture, Web Crawlers, MySQL, Information Extraction, DataFrames
  • Senior Software Developer | Web Automation Engineer

    2018 - 2018
    The Rundown
    • Developed a scraping tool for automatically gathering live data from various sports-betting websites.
    • Implemented a REST API using Flask to make the sports-betting data easily accessible for other services.
    • Built automated testing/integration tools for easy and painless software updates.
    • Conducted daily stand-up meetings for all the developers to catch up with each other and plan the following day.
    • Developed a REST API to automatically and instantly place bets on various sports betting websites.
    • Implemented an algorithm to gather all the available sports-betting data and find and place profitable bets.
    Technologies: Amazon Web Services (AWS), Docker Compose, Docker, MySQL, Spring, Flask, Scrapy, Requests, HTTP, Selenium, Java, Python, Algorithms, Web Scraping, Data Scraping, Big Data, Databases, GitHub, Jupyter, ETL, Data Analysis, Model Development, Distributed Systems, Quantitative Analysis, Microservices, Data Pipelines, Scraping, XML Parsing, HTML Parsing, Node.js, REST APIs, Data Engineering, NumPy, Datasets, Computer Vision, Back-end, Document Parsing, NoSQL, Data Modeling, Database Modeling, Software Architecture, Web Crawlers, Text Analytics, DevOps, Matplotlib, Natural Language Understanding (NLU), Information Extraction, NLP Pipeline, Pytest, DataFrames, Mypy
  • Software/Web Scraping Engineer

    2016 - 2018
    Freelance
    • Implemented a data aggregation framework that would collect, process, and extract daily data about various online game stats to help the client (Jabre Capital Partners) with stock market trading decisions.
    • Developed a desktop GUI program to monitor and automatically purchase rare items in eCommerce shops when these items come in stock. The websites were Amazon, Walmart, Best Buy, and The Source.
    • Built a microframework in Python for writing auto-trader robots for cryptocurrencies, using the Coinigy API.
    • Wrote a Python program to use the AWS API and automatically manage tags for all the resources.
    • Developed a web crawler for collecting and storing discount coupon codes.
    • Created a tool for exporting slides as JPEG images from Microsoft PowerPoint using Python, Unoconv, and Convert.
    • Rewrote a large numerical analysis project from Visual Basic to modern Python 3 with careful testing to guarantee the same output.
    • Implemented and deployed a custom trading strategy using Python 3 and the Coinigy platform.
    • Wrote more than 20 small web-scraping programs using Python.
    Technologies: Amazon Web Services (AWS), PyQt 5, Spark, Scala, Selenium, SciPy, Scikit-learn, Pandas, Django, Scrapy, Python, Algorithms, Web Scraping, Data Scraping, Deep Learning, Big Data, SQL, Databases, GitHub, ETL, Data Analysis, Recommendation Systems, Scraping, XML Parsing, HTML Parsing, Node.js, PySpark, Data Engineering, NumPy, Simulations, Email Parsing, PDF, TypeScript, NoSQL, Data Modeling, Database Modeling, Software Architecture, Web Crawlers, Matplotlib, Information Extraction, Pytest, DataFrames

Experience

  • Deep Face Forgery Detection
    https://github.com/Megatvini/DeepFaceForgeryDetection

    I developed a deep learning-based tool for automatically detecting human face forgeries in videos and single frames. I built and trained the model from scratch based on theoretical insights from the latest research papers in the field. I tried multiple approaches, from regular neural networks to LSTMs with 3D convolutional networks as the encoder.

    A detailed description is available in the following paper:
    • https://arxiv.org/abs/2004.11804

  • Image Captioning Service (Web and API)
    https://www.inovex.de/blog/end-to-end-image-captioning/

    I developed a microservice-based image captioning service with a team of four. I was involved in all parts of the project, from the initial design of what the services should be to implementing each of them separately.

    The project was developed entirely with Agile methodologies, and my role was to oversee the whole development process and take part in the actual coding and implementation.

Skills

  • Languages

    SQL, Python, Bash, JavaScript, R, Java, Snowflake, TypeScript, Kotlin, Python 3, C, Markdown, Scala, C++, XML, Go, PHP
  • Frameworks

    Selenium, Spark, Flask, Scrapy, Django REST Framework, Django, Apache Spark, gRPC, Spring
  • Libraries/APIs

    Beautiful Soup, PyMongo, Matplotlib, Pandas, NumPy, Requests, PyTorch, REST APIs, Python API, SQLAlchemy, Node.js, Setuptools, Scikit-learn, TensorFlow, PyQt 5, Spark ML, Apiary API, Google Cloud API, PySpark, PyTorch Lightning, Mypy, Vue, PhantomJS, Django ORM, Twitch API, PyQt, SciPy, Pillow
  • Tools

    Scraping Hub, GitHub, GitLab, Jupyter, JetBrains, GitLab CI/CD, Amazon SageMaker, Pytest, Spark SQL, Tableau, Git, Apache Airflow, Google Kubernetes Engine (GKE), Terraform, BigQuery, IntelliJ IDEA, VS Code, Bitbucket, Helm, Docker Compose, Unoconv, Boto 3, Seaborn, PyCharm, MATLAB, Odoo
  • Paradigms

    Data Science, ETL, REST, Unit Testing, Microservices, DevOps, Agile, Testing, MVC Design
  • Platforms

    Jupyter Notebook, Docker, Google Cloud Platform (GCP), Spark Core, Amazon EC2, AWS Lambda, Linux, Amazon Web Services (AWS), Kubernetes, AWS Kinesis, Azure, ConvertKit, Linux CentOS 7, Android, Windows, Google Cloud SDK, Web, Splash, Databricks, Visual Studio Code
  • Storage

    Amazon S3 (AWS S3), Databases, Google Cloud Storage, XML Parsing, Database Modeling, NoSQL, PostgreSQL, PostgreSQL 10.1, Redis, MySQL, MongoDB, Google Cloud, Data Pipelines, JSON
  • Other

    Scraping, Data Scraping, Screen Scraping, Store Scraping, Natural Language Processing (NLP), Site Bots, Bots, Machine Learning, Networks, HTTP, Supervised Learning, Text Classification, Classification, Predictive Learning, Algorithms, Clustering Algorithms, Statistics, Software Development, Big Data, Machine Vision, Web Scraping, Artificial Intelligence (AI), APIs, Deep Learning, Software Engineering, HTML Parsing, WebSockets, Datasets, Computer Vision, Back-end, Email Parsing, Document Parsing, PDF, Data Modeling, Software Architecture, Artificial Neural Networks (ANN), Neural Networks, Web Crawlers, Text Analytics, Data Collection, Natural Language Understanding (NLU), Information Extraction, NLP Pipeline, DataFrames, Kubernetes Operations (Kops), Google BigQuery, Architecture, Data Engineering, Serverless, API Applications, Parquet, HTTPS, Unsupervised Learning, Regression, Classification Algorithms, Heuristic & Exact Algorithms, Optimization Algorithms, Genetic Algorithms, Mathematics, Computer Science, Machine Learning Automation, ETL Tools, ETL Development, AWS RDS, Data Analysis, Statistical Analysis, Model Development, Distributed Systems, Quantitative Analysis, Numerical Analysis, Back-end Development, CI/CD Pipelines, Recommendation Systems, Data-informed Recommendations, Simulations, Product Development, Team Leadership, Product Roadmaps, Machine Learning Operations (MLOps), Hugging Face, BERT, FastAPI, Slack App, CSV, Google Cloud Functions, Information Theory, Data Compression Algorithms, Cloud, Image Compression, Video Compression, Research, MLflow, Image Captioning, Odoo.sh, Apps, Forecasting, OpenAI

Education

  • Master's Degree in Data Engineering and Analytics
    2018 - 2020
    Technical University of Munich - Munich, Germany
  • Nanodegree in Data Analytics
    2017 - 2017
    Udacity - Udacity.com
  • Bachelor's Degree in Computer Science
    2013 - 2017
    Free University of Tbilisi - Tbilisi, Georgia

Certifications

  • Deep Learning Specialization
    JUNE 2018 - PRESENT
    Coursera
  • Data Analyst
    DECEMBER 2017 - PRESENT
    Udacity
  • IELTS
    JANUARY 2017 - PRESENT
    British Council
