Harun Zafer, Developer in Toronto, ON, Canada

Harun Zafer

Verified Expert in Engineering

Bio

Harun is a software and AI engineer with 15+ years of experience, specializing in back-end development and AI-driven solutions. Recently, he has focused on building large language model (LLM) applications and NLP systems, including entity extraction, content categorization, and document summarization. Skilled in scalable cloud architectures and fluent in Java, Python, and C#, Harun excels in creating innovative, real-world AI solutions while staying adaptable to new technologies.

Portfolio

Freelance Clients
AI Agents, AI Chatbots, Python, TypeScript, OpenAI API, Azure, Azure Functions...
Zyfera
Java, Kubernetes, JavaScript, JavaFX, OAuth 2, PostgreSQL, REST...
Amazon.com
Amazon API Gateway, Amazon S3 (AWS S3), Amazon Alexa, Amazon CloudWatch...

Experience

  • Natural Language Processing (NLP) - 11 years
  • Python - 5 years
  • AI Chatbots - 2 years
  • Vector Search - 2 years
  • Large Language Models (LLMs) - 2 years
  • Retrieval-augmented Generation (RAG) - 2 years
  • OpenAI Assistants API - 1 year
  • AI Agents - 1 year

Availability

Part-time

Preferred Environment

Linux, Windows, Java, Python, Spring, Amazon Web Services (AWS), Docker, Kubernetes, Cloud Services

The most amazing...

...solution I've built is a document analysis system. I've built its NLP pipeline, trained ML models, and implemented the entire back-end logic.

Work Experience

Staff AI Engineer

2024 - PRESENT
Freelance Clients
  • Designed and implemented agentic actions for a QA agent, significantly enhancing its retrieval-augmented generation (RAG) capabilities.
  • Designed and implemented an AI-driven content processing system capable of functioning as scheduled tasks for periodic NLP processing and on-demand HTTP APIs for real-time interactions, ensuring flexibility and scalability.
  • Developed and deployed an end-to-end custom categorization workflow enabling users to define and manage categorization tasks tailored to their specific needs, incorporating both front-end and back-end (web and AI) components.
  • Designed and implemented a scalable content processing system using Azure Functions to periodically process user content through advanced NLP tasks.
  • Developed and deployed NLP services powered by large language models (LLMs), including entity extraction, deduplication, content categorization, document summarization, and capability detection.
  • Enhanced the PostgreSQL database schema design of the software to efficiently support diverse system operations and workflows.
  • Developed robust data validation and retry mechanisms to ensure the reliability of LLM-generated responses.
  • Developed analytics dashboards to deliver actionable insights derived from NLP-processed data.
Technologies: AI Agents, AI Chatbots, Python, TypeScript, OpenAI API, Azure, Azure Functions, Azure Durable Functions, PostgreSQL, Large Language Models (LLMs), Retrieval-augmented Generation (RAG), REST APIs, Neo4j, OpenAI GPT-4 API, Artificial Intelligence (AI), Front-end Development, Back-end Development, ChatGPT, Vector Search, Embeddings from Language Models (ELMo), Software Architecture, Chatbots

Founder

2021 - 2024
Zyfera
  • Developed an AI chatbot capable of taking automated actions, such as making API calls, based on the context of user queries.
  • Implemented a serverless AI back end on AWS to power the LLM and RAG chatbot/agent.
  • Designed and implemented an NLP system for accent character restoration.
  • Applied and scaled a REST API for accent character restoration and customer operations.
  • Implemented a Google Docs plugin and a desktop application that use the accent restoration API.
  • Set up Prometheus and Grafana to monitor key business metrics.
  • Designed and developed WebDroid (Webdroid.ai), a chatbot engine that creates LLM and RAG-powered AI chatbots from websites.
  • Developed a serverless, event-driven, fault-tolerant web scraping service on AWS.
Technologies: Java, Kubernetes, JavaScript, JavaFX, OAuth 2, PostgreSQL, REST, Google Apps Script, Docker, Spring, Algorithms, Git, Apache Maven, JUnit, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Machine Learning, Data Structures, Databases, Programming, IntelliJ IDEA, Visual Studio Code (VS Code), Linux, APIs, API Gateways, JSON, Spring Boot, Prometheus, Grafana, Cloud Services, REST APIs, Web Scraping, CI/CD Pipelines, ChatGPT, OpenAI GPT-3 API, Artificial Intelligence (AI), Front-end Development, Back-end Development, Web Crawlers, Vector Search, Embeddings from Language Models (ELMo), FAISS, Software Architecture, Scraping

Software Development Engineer

2021 - 2023
Amazon.com
  • Created the infrastructure on AWS using CDK, with Route 53, CloudFront, S3, Lambda, DynamoDB, and API Gateway. Developed microservices with Python Lambda and set up AWS ECS services with Java and Docker.
  • Migrated infrastructure code from CDK v1 to v2, built and maintained multiple software deployment pipelines, extensively utilized AWS services, and established CloudWatch metrics and alarms for comprehensive monitoring.
  • Strategically planned, led, and executed the expansion of the Smart Home Appliance Resolution service, enhancing the experience for Alexa users across Europe.
Technologies: Amazon API Gateway, Amazon S3 (AWS S3), Amazon Alexa, Amazon CloudWatch, AWS IAM, SDKs, TypeScript, Docker, React, Python, AWS Lambda, Programming, Amazon Elastic Container Service (ECS), AWS Fargate, Mockito, Amazon DynamoDB, AWS CloudFormation, Amazon Simple Notification Service (SNS), Amazon Simple Queue Service (SQS), Amazon CloudFront CDN, AWS Key Management Service (KMS), Cloud Services, REST APIs, CI/CD Pipelines, Back-end Development, Microservices, Software Architecture

Software Development Engineer

2020 - 2021
Amazon Web Services (AWS)
  • Wrote new web integration tests and rewrote all the existing ones using page-object design patterns to achieve complete CI/CD for all pre-production and production stages.
  • Fixed many bugs and maintained and improved the codebase.
  • Implemented the internationalization of the web application with the new React stack.
Technologies: Java, Amazon Web Services (AWS), Selenium, React, Linux, Amazon CloudWatch, Git, TypeScript, JUnit, Algorithms, Data Structures, Programming, Spring, AWS Lambda, IntelliJ IDEA, Visual Studio Code (VS Code), APIs, API Gateways, JSON, Spring Boot, Amazon Elastic Container Service (ECS), Mockito, Amazon DynamoDB, AWS CloudFormation, Amazon Simple Notification Service (SNS), Amazon Simple Queue Service (SQS), Amazon CloudFront CDN, AWS Key Management Service (KMS), Cloud Services, REST APIs, CI/CD Pipelines, Front-end Development, Back-end Development, Microservices, Software Architecture

Machine Learning Software Engineer

2019 - 2020
Borealis AI | RBC Research Institute
  • Worked closely with researchers to create a real-world application on RBC infrastructure and containerized the applications' components for migration to Kubernetes OpenShift.
  • Prepared Conda recipes to build Conda packages for internal and third-party Python libraries. Implemented CI/CD pipelines for components using Jenkins and UrbanCode Deploy.
  • Developed a standard build tool for building and deploying Conda packages and Docker images, which supports auto-versioning and can be used both locally and from Jenkins.
  • Implemented a resilient data sync app that copies data from network-attached storage (NAS) to S3 object storage and vice versa. Set up Prometheus and Grafana to monitor the app and get notifications if any job fails.
Technologies: Python, OpenShift, Kubernetes, Docker, Red Hat Linux, Conda, Jenkins, Amazon S3 (AWS S3), Git, Apache Maven, Machine Learning, Algorithms, Data Structures, Programming, Visual Studio Code (VS Code), Linux, JSON, Prometheus, Grafana, Cloud Services, REST APIs, CI/CD Pipelines, Artificial Intelligence (AI), Back-end Development, Software Architecture

Machine Learning Engineer

2016 - 2018
Diligen
  • Architected and implemented the entire document analysis software's back end, which processes hundreds of contracts daily.
  • Implemented machine learning (ML) and natural language processing (NLP) pipelines to extract information such as legal clauses, parties, dates, and names from legal contracts.
  • Developed a feature engineering framework for faster experimentation with ML models and implemented an ML system where users can create custom ML models by labeling their data without coding.
  • Improved all ML modules in terms of accuracy, memory consumption (around 400%), and processing time (10- to 100-fold).
  • Designed a REST API that makes the document analysis back-end available for other systems.
  • Trained lawyers, as domain experts, on ML basics, data labeling, and model evaluation.
  • Migrated the document analysis back end to AWS Lambda for better scaling.
Technologies: Java, Stanford NLP, LIBLINEAR, OpenNLP, PDFBox, AWS Lambda, Amazon S3 (AWS S3), PostgreSQL, Git, Apache Maven, JUnit, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Machine Learning, Algorithms, Data Structures, Programming, Amazon Web Services (AWS), IntelliJ IDEA, Linux, APIs, API Gateways, JSON, Cloud Services, REST APIs, CI/CD Pipelines, Artificial Intelligence (AI), Back-end Development, Software Architecture

Software Engineer

2015 - 2016
Freelance Contractor
  • Architected, designed, and developed the back end of an online flight booking system.
  • Unified multiple SOAP APIs under a simple mobile-friendly REST-like JSON API.
  • Designed and created the system's database for all details of the flight booking process.
  • Implemented user notification services such as email and SMS.
  • Integrated a virtual POS API for the payment service.
Technologies: C#, ASP.NET, Entity Framework, SQL, Quartz, Git, Azure, Algorithms, Data Structures, Databases, Programming, Linux, .NET, APIs, JSON, REST APIs, Front-end Development, Back-end Development, Software Architecture

Senior Engineer and Researcher

2013 - 2015
Tubitak
  • Built a text classifier for web page categorization using machine learning techniques. This tool was used to categorize 100 million web pages.
  • Developed a word stemmer to improve both the quality of search and the accuracy of the text classifier.
  • Developed a query suggestion module that tolerates the spelling errors made by users.
  • Integrated the developed modules with Apache Solr by implementing a plugin.
  • Designed and developed a morphologic analyzer and a part-of-speech tagger for Turkish.
  • Investigated the literature for related academic studies, such as morphologic analysis, disambiguation, part-of-speech tagging, stemming, and lemmatization.
Technologies: Java, Apache Solr, Hadoop, Weka, JUnit, Git, Apache Maven, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Machine Learning, JavaFX, Algorithms, Data Structures, Programming, IntelliJ IDEA, Linux, APIs, JSON, REST APIs, Artificial Intelligence (AI), Back-end Development, Software Architecture

Experience

Document Analysis System with NLP

This project targeted legal contract analysis with natural language processing (NLP) and machine learning (ML). I architected and implemented the entire document analysis back end, which processes hundreds of contracts daily. I designed and implemented the ML pipelines to extract various information such as legal clauses, parties, effective dates, and names from contracts.

MagicAccents

https://magicaccents.com/
Accented characters, sometimes referred to as accents, are essential elements in written language. They frequently occur in many languages, including Spanish, French, Italian, German, and Portuguese. However, accents don't exist on the US English keyboard layout, and as a result, users tend to use the closest English version of these letters. For example, à becomes a, and ö becomes o. MagicAccents can automatically restore these letters using machine learning and natural language processing techniques with high accuracy. It currently supports 27 languages.

I've implemented this product's machine learning back end, REST API, Google Docs plugin, and desktop application using JavaFX.
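
The accent restoration problem described above can be illustrated with a small Python sketch. This is not MagicAccents' actual model: it only shows how accents get lost (à becomes a, ö becomes o) and how a naive lookup could put them back. The `LEXICON` word list and both function names are hypothetical and exist only for this example.

```python
import unicodedata

def strip_accents(text: str) -> str:
    """Map each accented letter to its closest unaccented form (à -> a, ö -> o)."""
    # NFD splits a letter such as "ö" into "o" plus a combining diaeresis.
    decomposed = unicodedata.normalize("NFD", text)
    # Dropping the combining marks leaves only the base letters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

# Hypothetical toy lexicon of correctly accented words (assumption for this sketch).
LEXICON = ["café", "señor", "über"]

# Lookup table from the unaccented spelling back to the accented one.
RESTORE = {strip_accents(word): word for word in LEXICON}

def restore_accents(text: str) -> str:
    """Naive restoration: replace each whitespace-separated token found in the table."""
    return " ".join(RESTORE.get(token, token) for token in text.split())
```

A lookup table like this breaks down as soon as one unaccented spelling maps to several valid words (French "a" and "à", for example), because the right choice depends on the surrounding words. That ambiguity is why the task calls for statistical or machine learning methods rather than a dictionary alone.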

Natural Language Processing Library for Turkish

https://github.com/hrzafer/nuve
Nuve is a natural language processing library for Turkish, which currently supports morphologic analysis of 35,000 words per second on an i5 2.8GHz 64-bit machine; morphologic generation; stemming; sentence segmentation and boundary detection; and N-gram extraction.

WebDroid – AI Chatbot Assistant for Websites

I led the design and development of WebDroid, an AI chatbot assistant that empowers websites with intelligent, context-aware interactions. Utilizing large language models (LLMs) and retrieval-augmented generation (RAG), WebDroid automates complex user queries by performing API calls and delivering precise responses.

The project’s back end leverages a serverless architecture on AWS, ensuring scalability, fault tolerance, and high availability. I implemented advanced features, including an event-driven web scraping service, and optimized LLM parameters to improve response quality. I developed a user-friendly panel on the front end with SvelteKit and DaisyUI, incorporating secure authentication with email verification and OAuth.

I also led a team of four engineers, driving efficient workflows with CI/CD pipelines deployed via GitHub Actions to Vercel. Additionally, I built the company and product websites with Svelte and Tailwind CSS, ensuring seamless user experiences. WebDroid exemplifies cutting-edge AI integration with robust engineering to transform customer engagement for businesses.

AI-powered Agent for Sales Teams

I designed and implemented a next-generation AI agent to empower sales teams with advanced content processing and retrieval capabilities. The project focused on enhancing retrieval-augmented generation (RAG) systems to deliver precise and actionable insights from vast datasets.

Key contributions included building a scalable content processing pipeline using Azure Functions to handle periodic and on-demand NLP tasks. I developed a range of AI services, such as entity extraction, deduplication, document categorization, summarization, and capability detection for client content. The back-end infrastructure was optimized for performance and scalability with Azure Functions and PostgreSQL supporting custom user workflows and analytics.

To ensure robust quality, I designed retry mechanisms for AI-generated responses and improved schema designs for efficient database operations. Additionally, I researched and proposed knowledge graph integrations to extend the system’s capabilities.
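
The validate-and-retry idea mentioned above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the `REQUIRED_KEYS` schema, the function names, and the `generate` callable (a stand-in for whatever function sends a prompt to an LLM and returns its raw text reply) are all hypothetical.

```python
import json
from typing import Callable, Optional

# Hypothetical schema: the keys every valid reply must contain (assumption).
REQUIRED_KEYS = {"category", "summary"}

def parse_and_validate(raw: str) -> dict:
    """Parse a raw reply as JSON and check that it has the expected shape."""
    data = json.loads(raw)  # raises json.JSONDecodeError, a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = REQUIRED_KEYS - set(data)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return data

def call_with_retry(generate: Callable[[str], str], prompt: str,
                    max_attempts: int = 3) -> dict:
    """Call `generate` until it returns a valid reply or the attempts run out."""
    last_error: Optional[Exception] = None
    current_prompt = prompt
    for _ in range(max_attempts):
        raw = generate(current_prompt)
        try:
            return parse_and_validate(raw)
        except ValueError as err:
            last_error = err
            # Tell the model what was wrong before asking again.
            current_prompt = (
                f"{prompt}\n\nThe previous reply was invalid ({err}). "
                "Reply with a valid JSON object only."
            )
    raise RuntimeError(f"no valid reply after {max_attempts} attempts") from last_error
```

Passing `generate` in as a parameter keeps the retry logic independent of any particular LLM client and makes it easy to test with a stub that returns canned replies.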

This project demonstrates expertise in delivering scalable AI-driven solutions, combining cutting-edge NLP techniques with seamless integration into cloud platforms.

Education

2009 - 2011

Master's Degree in Computer Engineering

Fatih University - Istanbul, Turkey

2000 - 2006

Bachelor's Degree in Computer Engineering

Hacettepe University - Ankara, Turkey

Certifications

OCTOBER 2017 - PRESENT

Neural Networks and Deep Learning

DeepLearning.AI | via Coursera

Skills

Libraries/APIs

Stanford NLP, LIBLINEAR, OpenNLP, REST APIs, PDFBox, Quartz, OpenAI Assistants API, React, TensorFlow, Entity Framework, Beautiful Soup, OpenAI API, Pydantic

Tools

IntelliJ IDEA, Git, Apache Maven, AWS IAM, ChatGPT, Amazon CloudWatch, Grafana, Amazon Elastic Container Service (ECS), AWS Fargate, AWS CloudFormation, Amazon Simple Notification Service (SNS), Amazon Simple Queue Service (SQS), Amazon CloudFront CDN, GitLab CI/CD, Jenkins, Apache Solr, Weka, AWS Key Management Service (KMS), AWS Cloud Development Kit (CDK)

Languages

Java, SQL, Python, JavaScript, Google Apps Script, C#, TypeScript

Frameworks

Spring, JUnit, Spring Boot, Selenium, .NET, Mockito, Svelte, OAuth 2, Hadoop, ASP.NET, Tailwind CSS

Paradigms

REST, Microservices

Platforms

Linux, Windows, Visual Studio Code (VS Code), Docker, AWS Lambda, Amazon Web Services (AWS), Kubernetes, JavaFX, Amazon Alexa, OpenShift, Red Hat Linux, Azure, Vercel, Azure Functions

Storage

Amazon S3 (AWS S3), Databases, JSON, Amazon DynamoDB, PostgreSQL, Neo4j

Other

Programming, Data Structures, Algorithms, Natural Language Processing (NLP), APIs, API Gateways, Generative Pre-trained Transformers (GPT), SDKs, Cloud Services, Software Architecture, Scraping, Web Scraping, Chatbots, AI Chatbots, CI/CD Pipelines, Back-end Development, Artificial Intelligence (AI), Conda, Machine Learning, Amazon API Gateway, Prometheus, Large Language Models (LLMs), OpenAI, OpenAI GPT-3 API, OpenAI GPT-4 API, Web Crawlers, Retrieval-augmented Generation (RAG), Vector Search, Embeddings from Language Models (ELMo), Front-end Development, Deep Learning, FAISS, Supabase, Google Drive, LangChain, Pgvector, Amazon RDS, Cloudflare, Azure Durable Functions, AI Agents
