Selahattin Gungormus
Verified Expert in Engineering
Data Engineer and Developer
Istanbul, Turkey
Toptal member since May 4, 2021
Selahattin is a data engineer with several years of hands-on experience building scalable data integration solutions using open-source technologies. He excels at developing data applications using distributed processing platforms such as Hadoop, Spark, and Kafka. Selahattin also has practical experience with cloud platforms such as AWS and Azure, as well as developing microservices using Python and JavaScript frameworks.
Preferred Environment
Apache Airflow, Visual Studio Code (VS Code), Apache Spark, Amazon Web Services (AWS), Azure, Jupyter Notebook
The most amazing...
...thing I've done is to build a product that leverages Apache Spark for data processing and can be operated through drag-and-drop visual interfaces.
Work Experience
Lead Data and Back-end Engineer
Afiniti
- Built a highly scalable, containerized data integration platform using Spark, Docker/Kubernetes, Python, and Greenplum database.
- Wrapped entire data pipeline procedures in an easy-to-deploy templating system capable of running at scale with good performance. That effort made the data pipeline process 70% faster.
- Created data models and pipelines for the application, powering dashboard reports built on over 10 million events.
- Established and standardized CI/CD pipeline processes across the team using Jenkins, Bitbucket, and Kubernetes.
- Built and maintained an app's back-end service using Node.js, JavaScript, and GraphQL.
Senior Data Engineer
Iyzico
- Reengineered and optimized the existing data pipeline processes by creating a new technology stack using Airflow, Python, Spark, and Exasol database.
- Migrated over 300 data pipeline jobs from Talend to the new data platform, which improved daily ETL performance by 60% (from eight hours to three hours).
- Created a real-time data feed from transactional systems to dashboards using Spark Streaming and Kafka. That new functionality boosted operational efficiency for performance monitoring during peak hours.
- Integrated the platform with AWS and delivered daily data marts to Amazon Redshift, making daily reports available to the global board.
Owner | Big Data Engineer | Instructor
Majestech
- Provided consultancy and training services to transform data architectures of SMEs with cloud-based alternatives such as Amazon Web Services and Azure.
- Delivered over ten data integration projects for businesses in the retail, banking, and telecommunications sectors. Transformed data integration processes to utilize cloud platforms such as AWS and Azure.
- Built a clickstream data application to collect web traces of app users and store them in a data lake with minimal latency. Used Kafka and Spark Streaming on AWS as the technology base.
- Launched Integer8, a cloud-based data integration product, on the AWS platform.
- Built a visual interface for non-developer data professionals who wanted to leverage Hadoop and Spark distributed processing capabilities.
- Provided big data engineering training with Cloudera partnership (over 20 training sessions).
- Created data integration pipelines on a Snowflake cloud database hosted on AWS, using Apache Airflow and S3 connectors.
Data Engineer
i2i Systems
- Automated data quality testing with Python, using Oracle metadata to generate daily tasks that flag potential issues in the daily pipelines.
- Created daily integration pipelines to feed the enterprise data warehouse on the ODS and RDS layers.
- Built the data preparation layer of a market optimization project for a telecommunications operator. Data from 35+ million subscribers was collected from five different source systems into a denormalized data structure with Oracle Data Integrator.
Experience
Integer8 Data Integrator
https://www.f6s.com/integer8
I created my startup with two developers in 2015 to launch the Integer8 product on both local and international marketplaces. I designed and led the development effort to make the product feasible for local SMEs. At the end of the first year, we deployed our platform to two different retail companies.
I became a cloud partner for Microsoft Azure in Turkey and spent one more year making Integer8 eligible for Azure Marketplace. At the end of this effort, Integer8 successfully became an official Azure Marketplace product.
Data Warehouse Transformation for a Mobile Payment Company
I designed and implemented the entire data pipeline as the responsible data engineer for the new data platform. I built a CDC mechanism from a MySQL database into Kafka to provide a pub/sub event system for near-real-time integration. I then prepared live Spark Streaming jobs that consume the Kafka topics and refresh the target data stores. That helped the marketing and operations teams monitor the workload on the system and detect anomalies.
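The CDC flow described above can be illustrated with a minimal sketch of how a consumer might replay change events against a target store. This is not the project's code: the event shape (`op`, `before`, `after`) is an assumption loosely modeled on the envelope used by CDC tools such as Debezium, the `apply_change_events` helper and the sample rows are hypothetical, and a plain dictionary stands in for both the Kafka topic contents and the target data store.

```python
from typing import Any, Dict, Iterable

Row = Dict[str, Any]


def apply_change_events(events: Iterable[Dict[str, Any]],
                        target: Dict[Any, Row],
                        key: str = "id") -> Dict[Any, Row]:
    """Replay CDC events, in order, against an in-memory target store."""
    for event in events:
        op = event["op"]
        if op in ("c", "u"):
            # Create or update: upsert the new row image by primary key.
            row = event["after"]
            target[row[key]] = row
        elif op == "d":
            # Delete: remove the row identified by the old row image.
            target.pop(event["before"][key], None)
        else:
            raise ValueError(f"unknown op: {op!r}")
    return target


# Hypothetical change events, in the order a topic consumer might deliver them.
events = [
    {"op": "c", "before": None,
     "after": {"id": 1, "status": "pending", "amount": 120}},
    {"op": "c", "before": None,
     "after": {"id": 2, "status": "pending", "amount": 75}},
    {"op": "u", "before": {"id": 1, "status": "pending", "amount": 120},
     "after": {"id": 1, "status": "paid", "amount": 120}},
    {"op": "d", "before": {"id": 2, "status": "pending", "amount": 75},
     "after": None},
]

store = apply_change_events(events, {})
print(store)  # {1: {'id': 1, 'status': 'paid', 'amount': 120}}
```

Applying events in order and keying on the primary key keeps the target consistent with the source as long as each key's events arrive in sequence, which is why CDC topics are commonly partitioned by key.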
All data sources were consolidated into two main data marts for the Tableau reporting layer. Daily pre-aggregated tables helped live reports perform 400% faster than the previous implementation. That also encouraged power users across the organization to make greater use of the reporting tools.
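As a rough illustration of the daily pre-aggregation idea, the sketch below rebuilds a small summary table from raw rows so that reports read a few aggregated rows instead of scanning every transaction. It assumes an in-memory SQLite database as a stand-in for the real data marts; the table names, columns, and sample values are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (created_at TEXT, merchant TEXT, amount REAL)"
)
# Hypothetical raw rows standing in for consolidated source data.
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?)",
    [
        ("2020-01-01 09:15:00", "m1", 10.0),
        ("2020-01-01 17:40:00", "m1", 15.0),
        ("2020-01-01 11:05:00", "m2", 7.5),
        ("2020-01-02 08:00:00", "m1", 20.0),
    ],
)


def rebuild_daily_summary(connection: sqlite3.Connection) -> None:
    """Recompute the pre-aggregated table that reports would query."""
    connection.execute("DROP TABLE IF EXISTS daily_summary")
    connection.execute(
        """
        CREATE TABLE daily_summary AS
        SELECT date(created_at) AS day,
               merchant,
               COUNT(*)         AS tx_count,
               SUM(amount)      AS total_amount
        FROM transactions
        GROUP BY date(created_at), merchant
        """
    )
    connection.commit()


rebuild_daily_summary(conn)
rows = conn.execute(
    "SELECT day, merchant, tx_count, total_amount "
    "FROM daily_summary ORDER BY day, merchant"
).fetchall()
for row in rows:
    print(row)
```

In a scheduled setup, a function like `rebuild_daily_summary` would run once per day after the loads finish; a full rebuild is the simplest variant, while an incremental refresh of only the latest day is a common refinement.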
Cloud ETL Automation on AWS
I used Amazon Redshift as the target database. Individual events are emitted from Amazon EventBridge into Lambda functions and accumulated in the Redshift database for further analysis.
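The event path described above might look roughly like the following sketch of a Lambda-style handler that flattens an EventBridge event into a row. The envelope fields (`id`, `source`, `detail-type`, `time`, `detail`) follow the standard EventBridge event structure, but everything else is an assumption: `event_to_row`, `make_handler`, and the sample event are hypothetical, and the Redshift write is replaced by an injected `write_rows` callable so the example runs locally without AWS.

```python
import json
from typing import Any, Callable, Dict, List, Tuple

EventRow = Tuple[str, str, str, str, str]


def event_to_row(event: Dict[str, Any]) -> EventRow:
    """Flatten an EventBridge event envelope into one row for an events table."""
    return (
        event["id"],
        event["source"],
        event["detail-type"],
        event["time"],
        # Keep the free-form payload as a JSON string column.
        json.dumps(event.get("detail", {}), sort_keys=True),
    )


def make_handler(write_rows: Callable[[List[EventRow]], None]):
    """Build a Lambda-style handler around an injected writer (hypothetical)."""
    def handler(event: Dict[str, Any], context: Any = None) -> Dict[str, int]:
        write_rows([event_to_row(event)])
        return {"written": 1}
    return handler


# Local stand-in for the Redshift writer: collect rows in a list.
buffer: List[EventRow] = []
handler = make_handler(buffer.extend)

# Hypothetical event in the EventBridge envelope format.
sample_event = {
    "version": "0",
    "id": "evt-0001",
    "detail-type": "OrderCreated",
    "source": "example.orders",
    "account": "123456789012",
    "time": "2021-01-01T00:00:00Z",
    "region": "eu-west-1",
    "resources": [],
    "detail": {"order_id": 42, "amount": 19.9},
}

result = handler(sample_event)
print(result, buffer)
```

Injecting the writer keeps the transformation testable on its own; in a deployed function, `write_rows` would wrap whatever load path into Redshift is actually used, which the description here does not specify.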
Education
Bachelor's Degree in Computer Engineering
Istanbul Technical University - Istanbul, Turkey
Certifications
Cloudera Certified Developer for Apache Hadoop
Cloudera
Skills
Libraries/APIs
Spark Streaming, Node.js, Pandas
Tools
Apache Airflow, Amazon CloudWatch
Languages
Python, SQL, JavaScript, Scala, Snowflake, TypeScript
Frameworks
Apache Spark, Hadoop
Paradigms
ETL, MapReduce, Database Design
Storage
PL/SQL, Databases, Data Pipelines, Redis, Greenplum, HDFS, HBase, Apache Hive, Amazon S3 (AWS S3)
Platforms
Azure, Apache Kafka, Oracle, Amazon Web Services (AWS), Docker, Visual Studio Code (VS Code), Kubernetes, Jupyter Notebook, Oracle Data Integrator 11g, Google Cloud Platform (GCP), AWS Lambda, Amazon EC2
Other
Data Modeling, Data Warehousing, Data Warehouse Design, ETL Development, Data Engineering, Data Architecture, Big Data Architecture, OOP Designs, Data Structures, Algorithms