Karip Kaya

Verified Expert in Engineering

Data Developer

Location

Buckhurst Hill, United Kingdom

Toptal Member Since

March 29, 2022

Karip is a data engineer and developer with 18 years of experience in the IT industry and six years in the freelance community. He specializes in Java, Scala, and Python and is skilled in Hadoop, Spark, and PL/SQL. Karip enjoys working on back-end, database, and big data projects.

Data Migration, Data Engineering, Big Data, Java, SQL, Java 8, GitHub, RDBMS, Oracle PL/SQL, Databases, Git, ETL, Data Pipelines, Apache Hive, JCL, Groupon, SUN, Sqoop, Informatica

Portfolio

Atos

Apache Spark, Apache Hive, PySpark, SQL, Databricks, Big Data...

Groupon

Big Data, Apache Spark, Java 8, Apache Maven, Git, SQL, Spark, Spark SQL, HDFS...

Deutsche Bank

Apache Hive, Spark SQL, Pandas, PySpark, Apache Spark, HDFS, SQL, Scala...

Experience

Java - 15 years, SQL - 15 years, Apache Hive - 5 years, HDFS - 5 years, Spark SQL - 3 years, Spark - 3 years, Scala - 3 years, Python 3 - 2 years

Availability

Part-time

Preferred Environment

Java, Scala, Python 3, SQL, Apache Hive, Spark, Databricks

The most amazing...

...thing I've developed is a messaging and file-sharing program for desktop environments.

Work Experience

Senior Data Engineer

2021 - 2022

Atos

Reduced the runtime of a Pandas-based data transformation from 14 hours to five minutes by rewriting it with PySpark DataFrames.
Solved weekly pipeline data problems and backloaded the required datasets.
Contributed to maintaining the weekly data flow based on Azure Data Factory.

Technologies: Apache Spark, Apache Hive, PySpark, SQL, Databricks, Big Data, Azure Data Factory, HDFS, Java, Python 3, Spark, Spark SQL, Pandas, Python, Git, Databases, Data Engineering, ETL, Data Pipelines, Azure, Hadoop, GitHub, RDBMS

Senior Data Engineer

2021 - 2022

Groupon

Worked on cloud migration of an in-house Java API by using Docker.
Maintained the in-house data flow and its Linux crontab schedule.
Solved the cloud API timeout problem by modifying existing Java code.
Applied a performance fix to the change data capture (CDC) data flow function.

Technologies: Big Data, Apache Spark, Java 8, Apache Maven, Git, SQL, Spark, Spark SQL, HDFS, Python, Jenkins, PySpark, Databases, Data Engineering, ETL, Data Pipelines, Linux, Kubernetes, Hadoop, GitHub, RDBMS, Java Development Kit (JDK)

Data Engineer

2020 - 2021

Deutsche Bank

Created an initial data pipeline integration with Hive, Spark, Scala, and PySpark from scratch.
Solved file count limit problems by partitioning and bucketing Hive tables.
Contributed to designing the DWH database model built on Hive.

Technologies: Apache Hive, Spark SQL, Pandas, PySpark, Apache Spark, HDFS, SQL, Scala, Python 3, Spark, Python, Big Data, Git, Databases, Data Engineering, ETL, Data Pipelines, Hadoop, GitHub, Bitbucket, RDBMS

Senior Data Engineer

2015 - 2019

Akbank

Created and maintained DWH tables and ETLs for existing data marts, including all staging-level tables.
Developed and maintained ETLs on Informatica and improved ETL performance by updating SQL and applying pushdown optimization.
Adapted ETL offloading to run on a big data cluster.
Built a near-online sales/goals data mart to support daily sales.
Analyzed and reported data lineage using Informatica repository tables.

Technologies: Informatica ETL, Oracle Exadata, Big Data, Cloudera, Spark, SQL, Apache Hive, Impala, Apache Sqoop, Apache Spark, Oracle, Databases, Data Migration, Data Engineering, ETL, Data Pipelines, ETL Development, Informatica, Hadoop, GitHub, RDBMS

Senior Developer

2003 - 2015

Yapi Kredi

Developed and maintained customer-related applications and database tables.
Migrated C-based client-side code to Java with Hibernate.
Standardized and cleansed customer domain-specific data.
Ported a domain-specific API from PL/SQL to Java with Spring.

Technologies: Oracle 11g, Oracle 12c, PL/SQL, Java, C, COBOL Batch, JCL, Git, Hibernate, Spring, SQL, Oracle, Databases, Data Migration, GitHub, RDBMS

Experience

Sudoku Validator

https://github.com/karipkaya/sudokuvalidator

A simple validator that checks the given sudoku solutions.

I wrote it in Java, generated tests, and used Maven for dependency management, building, and running. The purpose was to demonstrate basic Java, Maven, and testing knowledge.
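
To illustrate the kind of check such a validator performs, here is a minimal sketch in Java. It is not taken from the repository: the class name `SudokuValidatorSketch`, the method `isValidSolution`, and the choice of a 9x9 `int[][]` grid as input are all assumptions made for this example. It accepts a grid only if every row, column, and 3x3 box contains each digit from 1 to 9 exactly once.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical sketch; names and input format are assumptions, not the repository's code.
final class SudokuValidatorSketch {

    /** Returns true if the grid is a complete, valid 9x9 sudoku solution. */
    static boolean isValidSolution(int[][] grid) {
        // Reject anything that is not a 9x9 grid.
        if (grid == null || grid.length != 9) {
            return false;
        }
        for (int[] row : grid) {
            if (row == null || row.length != 9) {
                return false;
            }
        }
        // For each index i, check row i, column i, and box i in one pass.
        for (int i = 0; i < 9; i++) {
            Set<Integer> row = new HashSet<>();
            Set<Integer> col = new HashSet<>();
            Set<Integer> box = new HashSet<>();
            for (int j = 0; j < 9; j++) {
                // Coordinates of the j-th cell inside the i-th 3x3 box.
                int boxRow = 3 * (i / 3) + j / 3;
                int boxCol = 3 * (i % 3) + j % 3;
                if (!addIfValid(row, grid[i][j])
                        || !addIfValid(col, grid[j][i])
                        || !addIfValid(box, grid[boxRow][boxCol])) {
                    return false;
                }
            }
        }
        return true;
    }

    /** Adds the value to the set; fails if it is out of range or already present. */
    private static boolean addIfValid(Set<Integer> seen, int value) {
        return value >= 1 && value <= 9 && seen.add(value);
    }
}
```

Using one set per row, column, and box keeps the check to a single pass over the grid per unit; a bitmask would work equally well if allocation matters.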

Skills

Languages

Java, SQL, Java 8, Scala, Python 3, Python, C

Frameworks

Spark, Apache Spark, Hadoop, Hibernate, Spring

Tools

Spark SQL, Git, Informatica ETL, GitHub, Apache Maven, Oracle Exadata, Impala, CircleCI, Jenkins, JCL, Cloudera, Apache Sqoop, Bitbucket, Java Development Kit (JDK)

Paradigms

ETL

Platforms

Oracle, Databricks, Azure, Linux, Kubernetes

Storage

Apache Hive, HDFS, PL/SQL, Databases, Data Pipelines, RDBMS, Oracle 11g, Oracle 12c

Other

Big Data, Data Migration, Data Engineering, ETL Development, Informatica, Azure Data Factory, COBOL Batch, Algorithms

Libraries/APIs

PySpark, Pandas

Certifications

JUNE 2023 - PRESENT

Python Basic

HackerRank

FEBRUARY 2023 - PRESENT

Problem Solving (Basic)

HackerRank

FEBRUARY 2023 - PRESENT

SQL (Advanced)

HackerRank

OCTOBER 2019 - PRESENT

Oracle Certified Associate

Oracle
