Karip Kaya, Developer in Buckhurst Hill, United Kingdom

Karip Kaya

Verified Expert in Engineering

Data Developer

Buckhurst Hill, United Kingdom
Toptal Member Since
March 29, 2022

Karip is a data engineer and developer with 18 years of experience in the IT industry and six years in the freelance community. He specializes in Java, Scala, and Python and is skilled with Hadoop, Spark, and PL/SQL. Karip enjoys working on back-end, database, and big data projects.






Preferred Environment

Java, Scala, Python 3, SQL, Apache Hive, Spark, Databricks

The most amazing...

...thing I've developed is a messaging and file-sharing program for desktop environments.

Work Experience

Senior Data Engineer

2021 - 2022
  • Reduced the runtime of a Pandas-based data transformation from 14 hours to five minutes by rewriting it as a PySpark DataFrame job.
  • Solved weekly pipeline data issues and backloaded required datasets.
  • Contributed to maintaining the weekly data flow based on Azure Data Factory.
Technologies: Apache Spark, Apache Hive, PySpark, SQL, Databricks, Big Data, Azure Data Factory, HDFS, Java, Python 3, Spark, Spark SQL, Pandas, Python, Git, Databases, Data Engineering, ETL, Data Pipelines, Azure, Hadoop, GitHub, RDBMS

Senior Data Engineer

2021 - 2022
  • Worked on the cloud migration of an in-house Java API using Docker.
  • Maintained the in-house data flow and its Linux crontab schedules.
  • Solved a cloud API timeout problem by modifying existing Java code.
  • Improved the performance of the CDC (change data capture) data flow.
Technologies: Big Data, Apache Spark, Java 8, Apache Maven, Git, SQL, Spark, Spark SQL, HDFS, Python, Jenkins, PySpark, Databases, Data Engineering, ETL, Data Pipelines, Linux, Kubernetes, Hadoop, GitHub, RDBMS, Java Development Kit (JDK)

Data Engineer

2020 - 2021
Deutsche Bank
  • Created an initial data pipeline integration with Hive, Spark, Scala, and PySpark from scratch.
  • Solved file count limit problems by partitioning and bucketing Hive tables.
  • Contributed to designing the DWH database model established on Hive.
Technologies: Apache Hive, Spark SQL, Pandas, PySpark, Apache Spark, HDFS, SQL, Scala, Python 3, Spark, Python, Big Data, Git, Databases, Data Engineering, ETL, Data Pipelines, Hadoop, GitHub, Bitbucket, RDBMS
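The partitioning-and-bucketing fix above follows a standard Hive pattern; a hypothetical DDL sketch (table, column, and bucket-count choices are invented, not from the project) could look like this:

```python
# Hypothetical Hive DDL illustrating the file-count fix: partitioning prunes
# the directories a query scans, while bucketing fixes the number of files
# written per partition, keeping the total file count bounded. In practice
# this would be submitted via spark.sql(ddl) on a Hive-enabled SparkSession.
ddl = """
CREATE TABLE IF NOT EXISTS dwh.trades (
    trade_id BIGINT,
    amount   DOUBLE
)
PARTITIONED BY (trade_date STRING)        -- one directory per day
CLUSTERED BY (trade_id) INTO 32 BUCKETS   -- at most 32 files per partition
STORED AS ORC
"""
```

The bucket count is a tuning choice: too few buckets creates large files and skew, too many recreates the small-files problem the design is meant to avoid.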

Senior Data Engineer

2015 - 2019
  • Created and maintained DWH tables and ETLs for existing data marts, including all staging level tables.
  • Developed and maintained ETLs in Informatica and improved ETL performance by tuning SQL and applying pushdown optimization.
  • Offloaded ETL workloads to a big data cluster.
  • Built a near-real-time sales and goals data mart to support daily sales reporting.
  • Analyzed and reported data lineage using Informatica repository tables.
Technologies: Informatica ETL, Oracle Exadata, Big Data, Cloudera, Spark, SQL, Apache Hive, Impala, Apache Sqoop, Apache Spark, Oracle, Databases, Data Migration, Data Engineering, ETL, Data Pipelines, ETL Development, Informatica, Hadoop, GitHub, RDBMS

Senior Developer

2003 - 2015
Yapi Kredi
  • Developed and maintained customer-related applications and database tables.
  • Migrated C-based client-side code to Java with Hibernate.
  • Standardized and cleansed customer domain-specific data.
  • Transformed a domain-specific API from PL/SQL to Java with Spring.
Technologies: Oracle 11g, Oracle 12c, PL/SQL, Java, C, COBOL Batch, JCL, Git, Hibernate, Spring, SQL, Oracle, Databases, Data Migration, GitHub, RDBMS

Sudoku Validator

A simple validator that checks the given sudoku solutions.

I wrote it in Java, generated tests, and used Maven for dependency management, building, and running. The goal was to demonstrate basic Java, Maven, and testing knowledge.
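The validator's core check can be sketched compactly; the original project was in Java, so this Python version only re-expresses the same idea (a completed grid is valid when every row, column, and 3x3 box contains the digits 1-9 exactly once).

```python
# Minimal sketch of sudoku-solution validation (the original project was Java;
# this Python version illustrates the same constraint checks).
def is_valid_sudoku(grid):
    """Return True if a completed 9x9 grid satisfies all sudoku constraints."""
    def ok(cells):
        # A unit is valid when it is a permutation of 1..9.
        return sorted(cells) == list(range(1, 10))

    rows = grid
    cols = [[grid[r][c] for r in range(9)] for c in range(9)]
    boxes = [
        [grid[r][c] for r in range(br, br + 3) for c in range(bc, bc + 3)]
        for br in range(0, 9, 3)
        for bc in range(0, 9, 3)
    ]
    return all(ok(unit) for unit in rows + cols + boxes)
```

Checking all 27 units against the same permutation predicate keeps the validator to a few lines, which mirrors the small scope of the original demo project.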

Python Basic



Problem Solving (Basic)



SQL (Advanced)



Oracle Certified Associate



PySpark, Pandas


Spark SQL, Git, Informatica ETL, GitHub, Apache Maven, Oracle Exadata, Impala, CircleCI, Jenkins, JCL, Cloudera, Apache Sqoop, Bitbucket, Java Development Kit (JDK)


Spark, Apache Spark, Hadoop, Hibernate, Spring




Java, SQL, Java 8, Scala, Python 3, Python, C


Oracle, Databricks, Azure, Linux, Kubernetes


Apache Hive, HDFS, PL/SQL, Databases, Data Pipelines, RDBMS, Oracle 11g, Oracle 12c


Big Data, Data Migration, Data Engineering, ETL Development, Informatica, Azure Data Factory, COBOL Batch, Algorithms
