Nika Dogonadze, Developer in London, United Kingdom

Nika Dogonadze

Verified Expert in Engineering

Data Scientist and Developer

Location
London, United Kingdom
Toptal Member Since
January 22, 2018

Nika has over five years of experience working in tech, specializing in Python, data engineering, web scraping, and machine learning. He has a master's degree in data engineering and analytics and a great deal of experience working with various technologies. Nika is personable, communicates exceptionally well, and stands out with his work ethic.

Portfolio

Touch Inflight Solutions
Django, Python, ETL, Data Pipelines, Azure, SQL, PostgreSQL, API Integration...
implicit diagnostics & solutions
Python, Scraping, Web Scraping, Amazon S3 (AWS S3), Windows, Linux, Virtualenv...
The Story Market
Python 3, Apache Airflow, Docker, Docker Compose, Splash, Scrapy, MongoDB...

Availability

Part-time

Preferred Environment

Python, Unix, Data Engineering, Software Engineering, Machine Learning, Artificial Intelligence (AI), Data Analytics, Amazon Web Services (AWS), Databases, Deep Neural Networks, Test-driven Development (TDD), Data Warehousing, HTML, CSS, Prefect

The most amazing project I've created is a state-of-the-art face forgery detection model (https://bit.ly/3gWRbIz).

Work Experience

Python and Django Developer

2022 - 2022
Touch Inflight Solutions
  • Consulted with the client to improve the project description and requirements.
  • Deployed Apache Airflow on the client's Azure cloud environment using Docker containers and Microsoft Azure SQL Database for PostgreSQL.
  • Developed a seamless integration with a uniform interaction interface to FTP servers, Azure Blob Storage, Microsoft OneDrive, and SharePoint.
  • Refactored the existing Python data handling script to work within the Airflow structure and scheduler.
  • Prepared full documentation on how the project works and how it can be extended for future work.
Technologies: Django, Python, ETL, Data Pipelines, Azure, SQL, PostgreSQL, API Integration, Data Aggregation, Containerization, Test-driven Development (TDD), Data Integration, Automation

Senior Python Developer

2022 - 2022
implicit diagnostics & solutions
  • Created software that enables the retrieval of TikTok videos based on search phrases from any global location via a suitable proxy and captures all related statistics, like views and publication date, along with the user's profile details.
  • Built a customizable storage back end for optionally storing the collected data on Amazon S3 or a local disk.
  • Developed a data exporting tool to take the raw collected data and export it in an easily digestible table format.
  • Wrote extensive documentation about how the project works and how it can be used and extended for future development.
Technologies: Python, Scraping, Web Scraping, Amazon S3 (AWS S3), Windows, Linux, Virtualenv, Documentation, API Integration, Data Aggregation, Data Analytics, Containerization, Test-driven Development (TDD), Postman, Data Integration, Automation, CSV File Processing
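
A customizable storage back end like the one described above can be sketched as a small interface with interchangeable implementations. This is a minimal illustration, not the project's actual code: the names `Storage`, `LocalDiskStorage`, `InMemoryStorage`, and `make_storage` are hypothetical, and the in-memory class only stands in for a remote back end. An Amazon S3 implementation would presumably expose the same two methods.

```python
from abc import ABC, abstractmethod
from pathlib import Path


class Storage(ABC):
    """Hypothetical uniform interface that every storage back end implements."""

    @abstractmethod
    def save(self, key: str, data: bytes) -> None:
        """Store data under the given key."""

    @abstractmethod
    def load(self, key: str) -> bytes:
        """Return the data previously stored under the given key."""


class LocalDiskStorage(Storage):
    """Stores each key as a file below a root directory."""

    def __init__(self, root: str) -> None:
        self.root = Path(root)

    def save(self, key: str, data: bytes) -> None:
        path = self.root / key
        # Create intermediate directories so keys may contain slashes.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)

    def load(self, key: str) -> bytes:
        return (self.root / key).read_bytes()


class InMemoryStorage(Storage):
    """Stand-in for a remote back end; keeps everything in a dictionary."""

    def __init__(self) -> None:
        self._items: dict[str, bytes] = {}

    def save(self, key: str, data: bytes) -> None:
        self._items[key] = data

    def load(self, key: str) -> bytes:
        return self._items[key]


def make_storage(kind: str, **options) -> Storage:
    """Pick a back end by name, e.g. from a configuration file."""
    if kind == "local":
        return LocalDiskStorage(options["root"])
    if kind == "memory":
        return InMemoryStorage()
    raise ValueError(f"unknown storage kind: {kind}")
```

Because callers depend only on `save` and `load`, switching from local disk to another back end becomes a configuration change rather than a code change.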

Senior Data Engineer

2022 - 2022
The Story Market
  • Deployed the entire ETL job infrastructure on AWS using containerized Apache Airflow.
  • Refactored the existing data handling code, making it eight times smaller.
  • Developed a framework on top of Apache Airflow to reduce the cost of adding new publishers to The Story Market network.
Technologies: Python 3, Apache Airflow, Docker, Docker Compose, Splash, Scrapy, MongoDB, Amazon Web Services (AWS), Amazon S3 (AWS S3), Amazon EC2, AWS Lambda, Amazon RDS, PostgreSQL, Pandas, Web Scraping, Data Scraping, Big Data, SQL, Python, Databases, GitHub, Jupyter, ETL, Data Analysis, Statistical Analysis, Distributed Systems, Software Engineering, CI/CD Pipelines, Data-informed Recommendations, Microservices, Data Pipelines, Scraping, XML Parsing, HTML Parsing, REST APIs, Data Engineering, Redis, Amazon Kinesis, NumPy, Datasets, Email Parsing, Document Parsing, PDF, NoSQL, Data Modeling, Database Modeling, Software Architecture, Product Development, Team Leadership, Product Roadmaps, Text Classification, Web Crawlers, Data Collection, Matplotlib, Information Extraction, SQLAlchemy, DataFrames, Mypy, Image Processing, API Integration, Data Visualization, Data Aggregation, Data Analytics, Containerization, Test-driven Development (TDD), Data Warehousing, Data Integration, Automation, CSV File Processing

Software Developer

2021 - 2022
Bar-All
  • Led the Odoo application database migration for the upgrade process from version 13 to version 14.
  • Refactored and upgraded legacy Odoo applications to make them work in a more modern Odoo 14 environment.
  • Implemented the Odoo application customization to make day-to-day business operations seamless and less error-prone.
Technologies: Python 3, XML, MVC Design, Web, Odoo, Apps, PostgreSQL, APIs, SQL, Python, Databases, GitHub, Software Engineering, Back-end Development, CI/CD Pipelines, Data-informed Recommendations, Scraping, REST APIs, Document Parsing, PDF, Data Modeling, Database Modeling, Software Architecture, Product Development, Team Leadership, Product Roadmaps, Information Extraction, Pytest, API Integration, Data Aggregation, Containerization, Test-driven Development (TDD), Data Warehousing, Data Integration

Data Engineer

2020 - 2022
MarketSonics
  • Refactored the existing ETL Python code, achieving a 10x reduction in the total lines of code, vastly improving readability and maintainability.
  • Designed and implemented custom cloud architecture for handling specific ETL workloads very efficiently and inexpensively using Apache Airflow, AWS Batch, and Docker.
  • Set up automatic alerts in case of any failure during periodic jobs.
Technologies: Amazon Web Services (AWS), Bash, PostgreSQL, Terraform, ETL, CSV, Slack App, Bitbucket, Apache Airflow, Docker, Python 3, Web Scraping, Data Scraping, Big Data, SQL, Python, Databases, Jupyter, Data Analysis, Statistical Analysis, Distributed Systems, Software Engineering, Quantitative Analysis, Back-end Development, CI/CD Pipelines, Data-informed Recommendations, Microservices, Data Pipelines, Scraping, XML Parsing, HTML Parsing, Data Engineering, Redis, Amazon Kinesis, WebSockets, NumPy, Datasets, Back-end, Email Parsing, Document Parsing, TypeScript, NoSQL, Data Modeling, Database Modeling, Software Architecture, Product Development, Team Leadership, Product Roadmaps, Text Classification, Web Crawlers, MySQL, Text Analytics, Snowflake, DevOps, Data Collection, Matplotlib, Information Extraction, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), GPT, R, SQLAlchemy, DataFrames, Mypy, Image Processing, API Integration, Data Visualization, Databricks, Data Aggregation, Exploratory Data Analysis, Data Analytics, Containerization, Risk Analysis, NLU, Deep Neural Networks, Test-driven Development (TDD), Data Warehousing, Postman, Data Integration, Swagger, Automation, CSV File Processing

Full-stack Developer | Machine Learning Engineer

2019 - 2020
inovex GmbH
  • Developed a fully functional image-captioning service with a team of four, including a web page and a highly available REST API.
  • Implemented and trained an image captioning model from scratch based on the latest research papers in the field.
  • Designed the microservice architecture for a usable image captioning service and its deployment on the Google Cloud Platform.
  • Implemented continuous integration and deployment (CI/CD) pipelines for all the microservices using GitLab enterprise tools.
  • Wrote a blog post about project details, including information on technologies and management methodologies.
  • Established Google Cloud budget alerts to automatically monitor and notify team members about possible budget overruns.
Technologies: gRPC, Terraform, Google Cloud Platform (GCP), Helm, Kubernetes, Docker, Vue, JavaScript, PyTorch, Flask, Python, Deep Learning, Databases, GitLab, Jupyter, Google Cloud Storage, Model Development, Classification Algorithms, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), GPT, Mathematics, Distributed Systems, Software Engineering, Quantitative Analysis, Back-end Development, Google Cloud, Google Kubernetes Engine (GKE), CI/CD Pipelines, Data-informed Recommendations, Microservices, Data Pipelines, Node.js, Redis, NumPy, Machine Learning, Datasets, Computer Vision, Back-end, PDF, TypeScript, NoSQL, Data Modeling, Database Modeling, Software Architecture, Product Development, Team Leadership, Product Roadmaps, Artificial Intelligence (AI), Artificial Neural Networks (ANN), Neural Networks, Text Classification, PostgreSQL, Text Analytics, Machine Learning Operations (MLOps), DevOps, Data Collection, Matplotlib, Natural Language Understanding (NLU), Information Extraction, PyTorch Lightning, Hugging Face, Amazon SageMaker, R, BERT, SQLAlchemy, DataFrames, Multi-task Cascaded Convolutional Neural Networks (MTCNN), Generative Adversarial Networks (GANs), Generative Artificial Intelligence (GenAI), Image Processing, API Integration, Data Visualization, Seaborn, Databricks, Data Aggregation, Exploratory Data Analysis, Data Analytics, Containerization, NLU, Deep Neural Networks, Test-driven Development (TDD), Postman, Data Integration, Swagger, Automation

Master Course in Foundations of Data Engineering Tutor

2019 - 2020
Technical University of Munich
  • Conducted tutorial sessions with students, explaining the most important aspects of the lecture and answering questions.
  • Held study sessions for students who needed individual help with the lectures and assignments.
  • Took part in grading assignments and final exams.
Technologies: Spark, Scala, C++, Bash, Big Data, SQL, Databases, Data Science, GitHub, GitLab, ETL, Statistical Analysis, Classification Algorithms, Generative Pre-trained Transformers (GPT), GPT, Natural Language Processing (NLP), Mathematics, Distributed Systems, Quantitative Analysis, CI/CD Pipelines, Data Pipelines, Scraping, XML Parsing, HTML Parsing, PySpark, Redis, WebSockets, NumPy, Machine Learning, Datasets, Computer Vision, Email Parsing, Document Parsing, PDF, NoSQL, Data Modeling, Database Modeling, Software Architecture, Artificial Intelligence (AI), Artificial Neural Networks (ANN), Text Classification, Web Crawlers, MySQL, PostgreSQL, Snowflake, Data Collection, Natural Language Understanding (NLU), Information Extraction, PyTorch Lightning, BERT, SQLAlchemy, DataFrames, Multi-task Cascaded Convolutional Neural Networks (MTCNN), API Integration, Data Aggregation, Exploratory Data Analysis, Containerization, Apache Spark, Data Integration, Automation

Senior Software Developer | Data Scientist

2016 - 2020
Leavingstone
  • Developed a framework for creating and deploying dialog systems (chatbots) on Facebook Messenger.
  • Implemented numerous chatbots, the best of which had more than a million interactions and a second-day retention rate of 20%.
  • Created a graph-based web interface for easy assembly of custom chatbots by entering dialog texts and not a single line of code.
  • Wrote a chatbot for helping citizens wrongly fined by the police in Georgia. It queried users about the circumstances of the offense and automatically generated a personalized appeal PDF for submission to the court.
  • Developed a custom real-time data analytics platform for a retail chain with over 250 stores throughout Tbilisi.
  • Implemented a recommendation system for a large retail business. It automatically generated special offers and gifts for loyal customers using machine learning tools.
  • Created a web scraper to continuously gather publicly available data about all the parking tickets written in Tbilisi.
  • Designed and implemented a data analytics web page on all the parking tickets in Tbilisi. It included a heatmap, other types of charts, and textual comments with analysis.
  • Built an API for guessing a person's nationality based on their first and last name, using deep natural language processing (Fast.ai).
  • Made a REST API for transliterating from English to Georgian, using Flask and SQL.
Technologies: Amazon Web Services (AWS), BigQuery, Google Cloud Platform (GCP), Flask, Django, Spark, Pandas, Kotlin, Scala, JavaScript, Java, SQL, Python, Algorithms, Web Scraping, Data Scraping, Deep Learning, Big Data, Databases, Data Science, GitHub, GitLab, Jupyter, ETL, Google Cloud Storage, Data Analysis, Statistical Analysis, Model Development, Classification Algorithms, GPT, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Mathematics, Distributed Systems, Software Engineering, Quantitative Analysis, Back-end Development, Google Cloud, Google Kubernetes Engine (GKE), CI/CD Pipelines, Recommendation Systems, Data-informed Recommendations, Microservices, Azure, Data Pipelines, Scraping, XML Parsing, HTML Parsing, Node.js, REST APIs, PySpark, Data Engineering, Redis, WebSockets, NumPy, Machine Learning, Datasets, Simulations, Computer Vision, Back-end, Email Parsing, Document Parsing, PDF, TypeScript, NoSQL, Data Modeling, Database Modeling, Software Architecture, Product Development, Team Leadership, Product Roadmaps, Artificial Intelligence (AI), Google BigQuery, Artificial Neural Networks (ANN), Neural Networks, Text Classification, Web Crawlers, MySQL, PostgreSQL, Text Analytics, Machine Learning Operations (MLOps), DevOps, Data Collection, Matplotlib, Natural Language Understanding (NLU), Information Extraction, PyTorch Lightning, Hugging Face, Amazon SageMaker, R, Pytest, DataFrames, Mypy, Generative Adversarial Networks (GANs), Image Processing, API Integration, Speech Recognition, Data Visualization, Seaborn, Databricks, Data Aggregation, Exploratory Data Analysis, Data Analytics, Containerization, Cassandra, Risk Analysis, Predictive Modeling, NLU, Deep Neural Networks, Test-driven Development (TDD), Data Warehousing, HTML, CSS, Apache Spark, Postman, Data Integration, Swagger, Automation

Python Developer

2019 - 2019
Elasticiti
  • Refactored the existing Apache Airflow project to remove code duplication and make it easily extensible.
  • Examined existing ETL pipelines and tracked and fixed bugs.
  • Implemented, tested, and deployed a new ETL pipeline for handling daily updated raw client data.
  • Created detailed wiki documentation in Markdown about my work and other parts of the existing project.
Technologies: Snowflake, Markdown, Amazon Web Services (AWS), JSON, Apache Airflow, SQL, Requests, Pandas, Python, Deep Learning, Big Data, Databases, Data Science, GitHub, ETL, Data Analysis, Software Engineering, Google Cloud, CI/CD Pipelines, Data Pipelines, XML Parsing, REST APIs, Data Engineering, Redis, NumPy, Machine Learning, Datasets, Computer Vision, Email Parsing, PDF, TypeScript, NoSQL, Data Modeling, Database Modeling, Software Architecture, Web Crawlers, MySQL, Information Extraction, DataFrames, Data Aggregation, Containerization, Postman, Data Integration

Senior Software Developer | Web Automation Engineer

2018 - 2018
TheRundown
  • Developed a scraping tool for automatically gathering live data from various sports betting websites.
  • Implemented a REST API using Flask to make the sports betting data easily accessible for other services.
  • Built automated testing and integration tools for easy and painless software updates.
  • Conducted daily stand-up meetings for all the developers to catch up with each other and plan the following day.
  • Developed a REST API to automatically and instantly place bets on various sports betting websites.
  • Implemented an algorithm to gather all the available sports betting data and find and place profitable bets.
Technologies: Amazon Web Services (AWS), Docker Compose, Docker, MySQL, Spring, Flask, Scrapy, Requests, HTTP, Selenium, Java, Python, Algorithms, Web Scraping, Data Scraping, Big Data, Databases, GitHub, Jupyter, ETL, Data Analysis, Model Development, Distributed Systems, Quantitative Analysis, Microservices, Data Pipelines, Scraping, XML Parsing, HTML Parsing, Node.js, REST APIs, Data Engineering, NumPy, Datasets, Computer Vision, Back-end, Document Parsing, NoSQL, Data Modeling, Database Modeling, Software Architecture, Web Crawlers, Text Analytics, DevOps, Matplotlib, Natural Language Understanding (NLU), Information Extraction, GPT, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Pytest, DataFrames, Mypy, Data Aggregation, Exploratory Data Analysis, Containerization, Predictive Modeling, NLU, Deep Neural Networks, HTML, CSS, Postman, Data Integration, Swagger

Software and Web-scraping Engineer

2016 - 2018
Freelance
  • Implemented a data aggregation framework to collect, process, and extract daily data about various online game stats to help the client, Jabre Capital Partners, with stock market trading decisions.
  • Developed a desktop GUI program to monitor and automatically purchase rare items in eCommerce shops when these items come in stock. The websites were Amazon, Walmart, Best Buy, and The Source.
  • Built a microframework in Python for writing auto-trader robots for cryptocurrencies using the Coinigy API.
  • Wrote a Python program to use the AWS API and automatically manage tags for all the resources.
  • Developed a web crawler for collecting and storing discount coupon codes.
  • Created a tool for exporting slides as JPEG images from Microsoft PowerPoint using Python, Unoconv, and Convert.
  • Rewrote a large numerical analysis project from Visual Basic to modern Python 3 with careful testing to guarantee the same output.
  • Implemented a custom trading strategy in Python 3 and deployed it on the Coinigy platform.
  • Wrote more than 20 small web-scraping programs using Python.
Technologies: Amazon Web Services (AWS), PyQt 5, Spark, Scala, Selenium, SciPy, Scikit-learn, Pandas, Django, Scrapy, Python, Algorithms, Web Scraping, Data Scraping, Deep Learning, Big Data, SQL, Databases, GitHub, ETL, Data Analysis, Recommendation Systems, Scraping, XML Parsing, HTML Parsing, Node.js, PySpark, Data Engineering, NumPy, Simulations, Email Parsing, PDF, TypeScript, NoSQL, Data Modeling, Database Modeling, Software Architecture, Web Crawlers, Matplotlib, Information Extraction, Pytest, DataFrames, Data Visualization, Seaborn, Data Aggregation, Exploratory Data Analysis, Data Analytics, NLU, Deep Neural Networks, HTML, CSS, Postman, Data Integration

Lempel Ziv Compression

This native Python 3 project implements a Lempel-Ziv algorithm for compression and decompression and comes with tests. It demonstrates a modern Python approach to implementing classic algorithms with clean and efficient code.
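
To give a flavor of the technique, here is a minimal sketch of one member of the Lempel-Ziv family, LZ78, chosen for brevity. It is an illustration only: the project itself may implement a different variant, and the function names `compress` and `decompress` are assumptions rather than the project's API.

```python
def compress(data: bytes) -> list[tuple[int, int]]:
    """LZ78: emit (index of longest known phrase, next byte) pairs."""
    dictionary: dict[bytes, int] = {b"": 0}  # phrase -> index; 0 is the empty phrase
    output: list[tuple[int, int]] = []
    phrase = b""
    for byte in data:
        candidate = phrase + bytes([byte])
        if candidate in dictionary:
            # Keep extending while the phrase is already known.
            phrase = candidate
        else:
            # Emit the known prefix plus the new byte, then learn the new phrase.
            output.append((dictionary[phrase], byte))
            dictionary[candidate] = len(dictionary)
            phrase = b""
    if phrase:
        # Input ended inside a known phrase: emit it as prefix + last byte.
        # The prefix is always in the dictionary because entries are built
        # one byte at a time from existing entries.
        output.append((dictionary[phrase[:-1]], phrase[-1]))
    return output


def decompress(pairs: list[tuple[int, int]]) -> bytes:
    """Rebuild the original bytes by replaying the dictionary construction."""
    phrases: list[bytes] = [b""]
    result = bytearray()
    for index, byte in pairs:
        phrase = phrases[index] + bytes([byte])
        phrases.append(phrase)
        result += phrase
    return bytes(result)
```

A round trip such as `decompress(compress(b"abracadabra")) == b"abracadabra"` is the kind of property the accompanying tests would be expected to check.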

Deep Face Forgery Detection

https://github.com/Megatvini/DeepFaceForgeryDetection
I developed a deep learning-based tool for automatically detecting human face forgeries in videos and single frames. I built and trained the model from scratch based on theoretical insights from the latest research papers in the field. I tried multiple approaches, from regular neural networks to LSTMs with 3D convolutional networks as the encoder.

A detailed description is available in the following paper:
• https://arxiv.org/abs/2004.11804

Image Captioning Service (Web and API)

https://www.inovex.de/blog/end-to-end-image-captioning/
I developed a microservice-based image captioning service with a team of four. I was involved in all parts of the project, from the initial design of what the services should be to implementing each of them separately.

The project was developed entirely with Agile methodologies, and my role was to oversee the whole development process and take part in the actual coding and implementation.

Languages

SQL, Python, Bash, JavaScript, R, Java, Snowflake, TypeScript, HTML, CSS, Kotlin, Python 3, C, Markdown, Scala, C++, XML, Go, PHP

Frameworks

Selenium, Spark, Flask, Scrapy, Django REST Framework, Django, Apache Spark, Swagger, gRPC, Spring

Libraries/APIs

Beautiful Soup, PyMongo, Matplotlib, Pandas, NumPy, Requests, PyTorch, REST APIs, Python API, SQLAlchemy, Node.js, Setuptools, Scikit-learn, TensorFlow, PyQt 5, Spark ML, Apiary API, Google Cloud API, PySpark, PyTorch Lightning, Mypy, Vue, PhantomJS, Django ORM, Twitch API, PyQt, SciPy, Pillow

Tools

Scraping Hub, GitHub, GitLab, Jupyter, JetBrains, Apache Airflow, GitLab CI/CD, Amazon SageMaker, Pytest, Postman, Seaborn, Spark SQL, Tableau, Git, Google Kubernetes Engine (GKE), Terraform, BigQuery, IntelliJ IDEA, Bitbucket, Helm, Docker Compose, Unoconv, Boto 3, PyCharm, MATLAB, Odoo, Virtualenv

Paradigms

Data Science, ETL, Test-driven Development (TDD), Automation, REST, Unit Testing, Microservices, DevOps, Agile, Testing, MVC Design

Platforms

Jupyter Notebook, Docker, Google Cloud Platform (GCP), Azure, Spark Core, Amazon EC2, AWS Lambda, Linux, Amazon Web Services (AWS), Kubernetes, Databricks, ConvertKit, Linux CentOS 7, Android, Windows, Google Cloud SDK, Web, Splash, Visual Studio Code (VS Code), Unix

Storage

Amazon S3 (AWS S3), Databases, Google Cloud Storage, Data Pipelines, XML Parsing, Database Modeling, Data Integration, NoSQL, PostgreSQL, PostgreSQL 10, Redis, MySQL, MongoDB, Google Cloud, JSON, Cassandra

Other

Scraping, Data Scraping, Screen Scraping, Store Scraping, Natural Language Processing (NLP), Site Bots, Bots, Machine Learning, Networks, HTTP, Supervised Learning, Text Classification, Classification, Predictive Learning, Algorithms, Clustering Algorithms, Statistics, Software Development, Big Data, Machine Vision, Web Scraping, Artificial Intelligence (AI), APIs, Deep Learning, Software Engineering, HTML Parsing, WebSockets, Datasets, Computer Vision, Back-end, Email Parsing, Document Parsing, PDF, Data Modeling, Software Architecture, Artificial Neural Networks (ANN), Neural Networks, Web Crawlers, Text Analytics, Data Collection, Natural Language Understanding (NLU), Information Extraction, DataFrames, Multi-task Cascaded Convolutional Neural Networks (MTCNN), Generative Adversarial Networks (GANs), Generative Artificial Intelligence (GenAI), Image Processing, API Integration, Data Visualization, Data Aggregation, Exploratory Data Analysis, Data Analytics, Containerization, NLU, Deep Neural Networks, Data Warehousing, GPT, Generative Pre-trained Transformers (GPT), CSV File Processing, Kubernetes Operations (kOps), Google BigQuery, Architecture, Data Engineering, Serverless, API Applications, Parquet, HTTPS, Unsupervised Learning, Regression, Classification Algorithms, Heuristic & Exact Algorithms, Optimization Algorithms, Genetic Algorithms, Mathematics, Computer Science, Machine Learning Automation, ETL Tools, ETL Development, Amazon RDS, Data Analysis, Statistical Analysis, Model Development, Distributed Systems, Quantitative Analysis, Numerical Analysis, Back-end Development, CI/CD Pipelines, Recommendation Systems, Data-informed Recommendations, Amazon Kinesis, Simulations, Product Development, Team Leadership, Product Roadmaps, Machine Learning Operations (MLOps), Hugging Face, BERT, FastAPI, Speech Recognition, Risk Analysis, Predictive Modeling, Statistical Modeling, Prefect, Slack App, CSV, Google Cloud Functions, Information Theory, Data Compression Algorithms, Cloud, Image Compression, Video Compression, Research, MLflow, Images, Apps, Forecasting, OpenAI, Stable Diffusion, DALL-E, DreamBooth, Midjourney, Documentation

Education

2018 - 2020

Master's Degree in Data Engineering and Analytics

Technical University of Munich - Munich, Germany

2017 - 2017

Nanodegree in Data Analytics

Udacity - Udacity.com

2013 - 2017

Bachelor's Degree in Computer Science

Free University of Tbilisi - Tbilisi, Georgia

Certifications

JUNE 2018 - PRESENT

Deep Learning Specialization

Coursera

DECEMBER 2017 - PRESENT

Data Analyst

Udacity

JANUARY 2017 - PRESENT

IELTS

British Council
