
Aleksei Ildiakov
Verified Expert in Engineering
Database Developer
Cancún, Mexico
Toptal member since May 20, 2024
Aleksei is a skilled software engineer with expertise in back-end development, including APIs, SQL, business intelligence, and big data. Committed to delivering high-performance data solutions that drive business results, he brings proficiency in Python, Scala, Go, and other leading-edge technologies. Aleksei is interested in pursuing a challenging role where he can harness his development and architecture skills to deliver outstanding outcomes and boost business value.
Portfolio
Experience
- Python - 7 years
- CI/CD Pipelines - 7 years
- Apache Kafka - 5 years
- Apache Airflow - 5 years
- Spark - 5 years
- Data Warehousing - 5 years
- Google Cloud Storage - 4 years
- Snowflake - 4 years
Availability
Preferred Environment
Google Cloud Storage, Snowflake, Apache Airflow, Spark, AWS IoT
The most amazing...
...thing I've worked on is an ETL pipeline factory that allows non-engineering professionals to build and maintain high-performance data pipelines.
Work Experience
Senior Data Engineer
Workiz.com
- Led the design and implementation of a cutting-edge Airflow ETL pipeline factory for a startup project comprising over 50 directed acyclic graphs (DAGs) with auto-scheduling and auto-building capabilities using Kafka, GCP, Python, and PySpark.
- Designed a scalable and efficient Data Vault 2.0 data warehousing solution for the project, incorporating the various Data Vault table types and their relations, built primarily on historical, insert-only tables.
- Achieved a 99.99% data accuracy and match rate by implementing rigorous data validation and data governance practices using Python, BigQuery, and Spark.
- Implemented low-resource-utilization calculation solutions for batch and stream flows using BigQuery, Snowflake, and MySQL.
- Developed Go microservices to improve data transformation and processing, resulting in a 30% increase in performance and throughput.
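The pipeline factory above expands declarative configs into Airflow DAGs so non-engineers can add pipelines by editing specs rather than code. A minimal, framework-agnostic sketch of that factory pattern is below; the pipeline names, specs, and task bodies are hypothetical stand-ins for the real Airflow DAG-building logic.

```python
# Sketch of a config-driven pipeline factory: declarative specs are expanded
# into pipeline objects programmatically, the same way an Airflow DAG factory
# expands configs into DAGs. All names and specs here are hypothetical.
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Pipeline:
    name: str
    schedule: str
    tasks: list = field(default_factory=list)

    def add_task(self, fn: Callable) -> None:
        self.tasks.append(fn)

    def run(self) -> list:
        # Execute tasks in order, collecting their results.
        return [fn() for fn in self.tasks]

# Declarative specs a non-engineer could maintain (e.g., in YAML).
SPECS = [
    {"name": "orders_daily", "schedule": "@daily", "source": "orders"},
    {"name": "users_hourly", "schedule": "@hourly", "source": "users"},
]

def build_pipeline(spec: dict) -> Pipeline:
    p = Pipeline(name=f"etl_{spec['name']}", schedule=spec["schedule"])
    # Bind the source at definition time via a default argument.
    p.add_task(lambda s=spec["source"]: f"extracted:{s}")
    p.add_task(lambda s=spec["source"]: f"loaded:{s}")
    return p

# One pipeline per spec, auto-built -- adding a spec adds a pipeline.
registry = {p.name: p for p in map(build_pipeline, SPECS)}
```

In the real Airflow version, `build_pipeline` would return a `DAG` registered in the module's global namespace so the scheduler discovers it automatically.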
Senior Big Data Engineer
Mediascope
- Championed the design and implementation of an innovative Airflow auto DAG-building solution, utilizing Python, PostgreSQL, AWS, Hadoop, and Git, reducing release time by up to 40% for a high-volume data processing system.
- Implemented and optimized data processing pipelines handling over ten terabytes per day using Spark and PySpark, resulting in a 40% reduction in hardware utilization.
- Boosted team efficiency by 20% by creating bespoke automation and CI/CD tools, empowering colleagues to streamline their workflows and focus on higher-value tasks.
Software Engineer
RT-labs
- Designed and implemented a custom BI service with a multifunctional interface for analyzing geographic, graph, historical, and relational data sources, utilizing Python, PostgreSQL, ClickHouse, Neo4j, and MongoDB.
- Led the migration from a self-hosted cluster to the AWS cloud.
- Designed and implemented a geographic information system (GIS) platform for flexible data sources and purposes, such as emergency reachability or waste pollution, including computer vision and graph-mathematics components.
- Improved the performance of pre-processing and post-processing functions by 20-50% for an analytical web service and back end, utilizing Lua, Python, and SQL.
- Implemented complex data warehouses and data lakes for analytical purposes and automatic ETL pipelines for parsing, sorting, and verifying data using Python, SQL, and Airflow.
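The parsing, sorting, and verifying steps mentioned above can be sketched as a small Python ETL stage. The record schema and validation rules here are hypothetical; in production, these steps would run as Airflow tasks against SQL storage.

```python
# Hedged sketch of a parse/sort/verify ETL stage. Fields and rules are
# illustrative, not the actual production schema.
from datetime import date

REQUIRED_FIELDS = {"id", "event_date", "amount"}

def parse(raw: dict) -> dict:
    """Coerce raw string fields into typed values."""
    return {
        "id": int(raw["id"]),
        "event_date": date.fromisoformat(raw["event_date"]),
        "amount": float(raw["amount"]),
    }

def verify(record: dict) -> bool:
    """Reject records with missing fields or non-positive amounts."""
    return REQUIRED_FIELDS <= record.keys() and record["amount"] > 0

def run_etl(rows: list) -> list:
    parsed = [parse(r) for r in rows]
    valid = [r for r in parsed if verify(r)]
    # Sort verified records chronologically before loading.
    return sorted(valid, key=lambda r: r["event_date"])

rows = [
    {"id": "2", "event_date": "2024-02-01", "amount": "10.5"},
    {"id": "1", "event_date": "2024-01-15", "amount": "-3"},  # fails verify
    {"id": "3", "event_date": "2024-01-01", "amount": "7"},
]
clean = run_etl(rows)  # two valid records, sorted by event_date
```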
Software Engineer
Onduline
- Conducted statistical analysis and data manipulation using SQL and Python.
- Developed analytical functions and procedures in T-SQL.
- Deployed and managed applications using Azure Kubernetes Service (AKS) for efficient and scalable application hosting.
- Created dashboards and reports for data visualizations using Tableau and SQL Server Data Tools (SSDT).
Experience
Pipeline Factory
I built capabilities for connecting to and extracting data from external APIs and storage repositories.
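A paginated extractor of the kind this project's connectors provide can be sketched as follows. The fetch function is injected so the extractor can be exercised without a live API; the page shape and fake source are hypothetical.

```python
# Minimal sketch of a paginated API extractor. The page-numbered endpoint
# shape is an assumption; real connectors would handle auth, retries,
# and cursor-based pagination as well.
from typing import Callable, Iterator

def extract(fetch_page: Callable) -> Iterator:
    """Pull pages until the source returns an empty page."""
    page = 0
    while True:
        rows = fetch_page(page)
        if not rows:
            break
        yield from rows
        page += 1

# Fake source standing in for an external API or storage repository.
_pages = {0: [{"id": 1}, {"id": 2}], 1: [{"id": 3}]}

def fake_fetch(page: int) -> list:
    return _pages.get(page, [])

records = list(extract(fake_fetch))  # [{'id': 1}, {'id': 2}, {'id': 3}]
```

Injecting the fetcher keeps the extraction loop testable and lets the same loop serve REST APIs, object storage listings, or database cursors.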
Education
Master's Degree in Statistics
Siberian State University of Telecommunications and Information Sciences - Novosibirsk, Russia
Skills
Libraries/APIs
PySpark
Tools
Apache Airflow, Git, GitLab CI/CD, Apache NiFi, BigQuery, Stitch Data, Microsoft Power BI, Tableau
Languages
Snowflake, Python, SQL, Scala, T-SQL (Transact-SQL), GraphQL, Go
Frameworks
Apache Spark, Hadoop, Django
Paradigms
ETL, REST, Automation
Platforms
Apache Kafka, Amazon Web Services (AWS), AWS IoT, Google Cloud Platform (GCP), Azure
Storage
Google Cloud Storage, PostgreSQL, MongoDB, API Databases, Data Pipelines, ClickHouse, Google Cloud, Redshift, Apache Hive, HBase, Greenplum, CockroachDB, Neo4j, Microsoft SQL Server, SQL Server Data Tools (SSDT), SQL Server Integration Services (SSIS)
Other
Cloud, CI/CD Pipelines, Data Warehousing, Data Engineering, Data Modeling, Data Analysis, Big Data, Data, Distributed Systems, Scalability, Data Flows, Git Repo, Orchestration, Data Science, Startups, Static Analysis, Data Build Tool (dbt), Large Language Models (LLMs), FastAPI