
Harun Zafer
Verified Expert in Engineering
Back-end Developer
Toronto, ON, Canada
Toptal member since January 28, 2022
Harun is a software and AI engineer with 15+ years of experience, specializing in back-end development and AI-driven solutions. Recently, he has focused on building large language model (LLM) applications and NLP systems, including entity extraction, content categorization, and document summarization. Skilled in scalable cloud architectures and fluent in Java, Python, and C#, Harun excels in creating innovative, real-world AI solutions while staying adaptable to new technologies.
Experience
- Natural Language Processing (NLP) - 11 years
- Python - 5 years
- AI Chatbots - 2 years
- Vector Search - 2 years
- Large Language Models (LLMs) - 2 years
- Retrieval-augmented Generation (RAG) - 2 years
- OpenAI Assistants API - 1 year
- AI Agents - 1 year
Preferred Environment
Linux, Windows, Java, Python, Spring, Amazon Web Services (AWS), Docker, Kubernetes, Cloud Services
The most amazing...
...solution I've built is a document analysis system. I've built its NLP pipeline, trained ML models, and implemented the entire back-end logic.
Work Experience
Staff AI Engineer
Freelance Clients
- Designed and implemented agentic actions for a QA agent, significantly enhancing its retrieval-augmented generation (RAG) capabilities.
- Designed and implemented an AI-driven content processing system that runs both as scheduled tasks for periodic NLP processing and as on-demand HTTP APIs for real-time interactions, ensuring flexibility and scalability.
- Developed and deployed an end-to-end custom categorization workflow enabling users to define and manage categorization tasks tailored to their specific needs, incorporating both front-end and back-end (web and AI) components.
- Designed and implemented a scalable content processing system using Azure Functions to periodically process user content through advanced NLP tasks.
- Developed and deployed NLP services powered by large language models (LLMs), including entity extraction, deduplication, content categorization, document summarization, and capability detection.
- Improved the system's PostgreSQL schema design to efficiently support diverse operations and workflows.
- Developed robust data validation and retry mechanisms to ensure the reliability of LLM-generated responses.
- Developed analytics dashboards to deliver actionable insights derived from NLP-processed data.
Founder
Zyfera
- Developed an AI chatbot capable of taking automated actions, such as making API calls, based on the context of user queries.
- Implemented a serverless AI back end on AWS to power the LLM and RAG chatbot/agent.
- Designed and implemented an NLP system for accent character restoration.
- Built and scaled a REST API for accent character restoration and customer operations.
- Implemented a Google Docs plugin and a desktop application that use the accent restoration API.
- Set up Prometheus and Grafana to monitor key business metrics.
- Designed and developed WebDroid (Webdroid.ai), a chatbot engine that creates LLM and RAG-powered AI chatbots from websites.
- Developed a serverless, event-driven, fault-tolerant web scraping service on AWS.
Software Development Engineer
Amazon.com
- Created the infrastructure on AWS using CDK, with Route 53, CloudFront, S3, Lambda, DynamoDB, and API Gateway. Developed microservices with Python Lambda functions and set up AWS ECS services with Java and Docker.
- Migrated infrastructure code from CDK v1 to v2, built and maintained multiple software deployment pipelines, extensively utilized AWS services, and established CloudWatch metrics and alarms for comprehensive monitoring.
- Strategically planned, led, and executed the expansion of the Smart Home Appliance Resolution service, enhancing the experience for Alexa users across Europe.
Software Development Engineer
Amazon Web Services (AWS)
- Wrote new web integration tests and rewrote all the existing ones using page-object design patterns to achieve complete CI/CD for all pre-production and production stages.
- Fixed many bugs and maintained and improved the codebase.
- Implemented the internationalization of the web application with the new React stack.
Machine Learning Software Engineer
Borealis AI | RBC Research Institute
- Worked closely with researchers to create a real-world application on RBC infrastructure and containerized the application's components for migration to Kubernetes OpenShift.
- Prepared Conda recipes to build Conda packages for internal and third-party Python libraries. Implemented CI/CD pipelines for components using Jenkins and UrbanCode Deploy.
- Developed a standard build tool for building and deploying Conda packages and Docker images, which supports auto-versioning and can be used both locally and from Jenkins.
- Implemented a resilient data sync app that copies data from network-attached storage (NAS) to S3 object storage and vice versa. Set up Prometheus and Grafana to monitor the app and get notifications if any job fails.
Machine Learning Engineer
Diligen
- Architected and implemented the entire document analysis software's back end, which processes hundreds of contracts daily.
- Implemented machine learning (ML) and natural language processing (NLP) pipelines to extract information such as legal clauses, parties, dates, and names from legal contracts.
- Developed a feature engineering framework for faster experimentation with ML models. Also implemented an ML system where users can create custom ML models by labeling their data, without coding.
- Improved accuracy, memory consumption (by around 400%), and processing time (10- to 100-fold) across all ML modules.
- Designed a REST API that makes the document analysis back end available to other systems.
- Trained lawyers, as domain experts, on ML basics, data labeling, and model evaluation.
- Migrated the document analysis back end to AWS Lambda for better scaling.
Software Engineer
Freelance Contractor
- Architected, designed, and developed the back end of an online flight booking system.
- Unified multiple SOAP APIs under a simple mobile-friendly REST-like JSON API.
- Designed and created the system's database for all details of the flight booking process.
- Implemented user notification services such as email and SMS.
- Integrated a virtual POS API for the payment service.
Senior Engineer and Researcher
Tubitak
- Built a machine learning text classifier for web page categorization, which was used to categorize 100 million web pages.
- Developed a word stemmer to improve both the quality of search and the accuracy of the text classifier.
- Developed a query suggestion module that tolerates the spelling errors made by users.
- Integrated the developed modules with Apache Solr by implementing a plugin.
- Designed and developed a morphological analyzer and a part-of-speech tagger for Turkish.
- Surveyed the academic literature on related topics, such as morphological analysis, disambiguation, part-of-speech tagging, stemming, and lemmatization.
Experience
Document Analysis System with NLP
MagicAccents
https://magicaccents.com/
I've implemented this product's machine learning back end, REST API, Google Docs plugin, and desktop application using JavaFX.
Natural Language Processing Library for Turkish
https://github.com/hrzafer/nuve
WebDroid – AI Chatbot Assistant for Websites
The project’s back end leverages a serverless architecture on AWS, ensuring scalability, fault tolerance, and high availability. I implemented advanced features, including an event-driven web scraping service, and optimized LLM parameters to improve response quality. I developed a user-friendly panel on the front end with SvelteKit and DaisyUI, incorporating secure authentication with email verification and OAuth.
I also led a team of four engineers, driving efficient workflows with CI/CD pipelines deployed via GitHub Actions to Vercel. Additionally, I built the company and product websites with Svelte and Tailwind CSS, ensuring seamless user experiences. WebDroid exemplifies cutting-edge AI integration with robust engineering to transform customer engagement for businesses.
AI-powered Agent for Sales Teams
Key contributions included building a scalable content processing pipeline using Azure Functions to handle periodic and on-demand NLP tasks. I developed a range of AI services for client content, such as entity extraction, deduplication, document categorization, summarization, and capability detection. The back-end infrastructure was optimized for performance and scalability with Azure Functions and PostgreSQL, supporting custom user workflows and analytics.
To ensure robust quality, I designed retry mechanisms for AI-generated responses and improved schema designs for efficient database operations. Additionally, I researched and proposed knowledge graph integrations to extend the system’s capabilities.
This project demonstrates expertise in delivering scalable AI-driven solutions, combining cutting-edge NLP techniques with seamless integration into cloud platforms.
Education
Master's Degree in Computer Engineering
Fatih University - Istanbul, Turkey
Bachelor's Degree in Computer Engineering
Hacettepe University - Ankara, Turkey
Certifications
Neural Networks and Deep Learning
DeepLearning.AI | via Coursera
Skills
Libraries/APIs
Stanford NLP, LIBLINEAR, OpenNLP, REST APIs, PDFBox, Quartz, OpenAI Assistants API, React, TensorFlow, Entity Framework, Beautiful Soup, OpenAI API, Pydantic
Tools
IntelliJ IDEA, Git, Apache Maven, AWS IAM, ChatGPT, Amazon CloudWatch, Grafana, Amazon Elastic Container Service (ECS), AWS Fargate, AWS CloudFormation, Amazon Simple Notification Service (SNS), Amazon Simple Queue Service (SQS), Amazon CloudFront CDN, GitLab CI/CD, Jenkins, Apache Solr, Weka, AWS Key Management Service (KMS), AWS Cloud Development Kit (CDK)
Languages
Java, SQL, Python, JavaScript, Google Apps Script, C#, TypeScript
Frameworks
Spring, JUnit, Spring Boot, Selenium, .NET, Mockito, Svelte, OAuth 2, Hadoop, ASP.NET, Tailwind CSS
Paradigms
REST, Microservices
Platforms
Linux, Windows, Visual Studio Code (VS Code), Docker, AWS Lambda, Amazon Web Services (AWS), Kubernetes, JavaFX, Amazon Alexa, OpenShift, Red Hat Linux, Azure, Vercel, Azure Functions
Storage
Amazon S3 (AWS S3), Databases, JSON, Amazon DynamoDB, PostgreSQL, Neo4j
Other
Programming, Data Structures, Algorithms, Natural Language Processing (NLP), APIs, API Gateways, Generative Pre-trained Transformers (GPT), SDKs, Cloud Services, Software Architecture, Scraping, Web Scraping, Chatbots, AI Chatbots, CI/CD Pipelines, Back-end Development, Artificial Intelligence (AI), Conda, Machine Learning, Amazon API Gateway, Prometheus, Large Language Models (LLMs), OpenAI, OpenAI GPT-3 API, OpenAI GPT-4 API, Web Crawlers, Retrieval-augmented Generation (RAG), Vector Search, Embeddings from Language Models (ELMo), Front-end Development, Deep Learning, FAISS, Supabase, Google Drive, LangChain, Pgvector, Amazon RDS, Cloudflare, Azure Durable Functions, AI Agents