Burcin Sarac, Developer in Istanbul, Turkey

Burcin Sarac

Verified Expert in Engineering

Data Scientist and Software Developer

Location
Istanbul, Turkey
Toptal Member Since
August 13, 2021

Burcin is a seasoned data scientist and AI developer with a master's degree in the field and certifications in ML and AI. With a strong command of Python and its ecosystem, he has extensive hands-on experience across various AI and ML technologies. Burcin currently focuses on large language models (LLMs), specializing in task automation and in developing and deploying AI products to cloud environments, particularly Google Cloud Platform (GCP).

Portfolio

Onyx Relations Corp
Artificial Intelligence (AI), Python, Natural Language Processing (NLP)...
n11.com
Google Cloud ML, Google Cloud Platform (GCP)...
Onyx Relations Corp
Artificial Intelligence (AI), Twitter API, Reddit API...

Experience

Availability

Part-time

Preferred Environment

Python 3, Jupyter Notebook, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Google Cloud Platform (GCP), Amazon Web Services (AWS), Visual Studio Code (VS Code), Ubuntu, Large Language Models (LLMs)

The most amazing...

...thing I've built is a versatile bot using LLMs for automated, context-aware interactions on social media, adaptable to any input and deployed on GCP.

Work Experience

AI Developer (via Toptal)

2023 - 2024
Onyx Relations Corp
  • Developed a versatile bot capable of posting about specific topics and press releases and engaging with users on social media platforms, including Twitter and Reddit.
  • Integrated and leveraged state-of-the-art LLM/GPT technologies, including OpenAI API and Gemini Pro, to enable organic and contextually relevant responses to user interactions.
  • Implemented functionalities to detect and respond to relevant threads, discussions, and trends across multiple platforms.
  • Enhanced the bot's adaptability to any input stock symbol, fetching news data from at least 50 news sources using APIs, RSS feeds, and webpage parsing techniques.
  • Summarized news data using the latest LLM models to provide concise and informative content.
  • Dockerized the entire app as a service and deployed all processes to the Google Cloud Platform using various technologies, such as Cloud Run, Cloud Functions, BigQuery, and Cloud Scheduler, ensuring efficient and scalable operations.
Technologies: Artificial Intelligence (AI), Python, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Large Language Models (LLMs), OpenAI, OpenAI GPT-4 API, Open-source LLMs, HTML Parsing, APIs, Google Cloud, Google Cloud Functions, Google BigQuery, Job Schedulers, Prompt Engineering, LangChain, NumPy, Retrieval-augmented Generation (RAG), Generative Pre-trained Transformer 3 (GPT-3), OpenAI GPT-3 API, Gemini, Anthropic, Claude, BERT, Generative AI, Llama 2, Yahoo! Finance, Docker, Cloud Run, Vertex AI, REST, API Integration
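As an illustration of the news-ingestion step described above, here is a minimal sketch of parsing an RSS feed and assembling a summarization prompt for an LLM. The feed content, stock symbol, and function names are invented for illustration; the actual bot pulled from many more sources and called the OpenAI and Gemini APIs for the summarization itself.

```python
import xml.etree.ElementTree as ET

def parse_rss_items(rss_xml: str, limit: int = 5) -> list[dict]:
    """Extract title/link pairs from a raw RSS 2.0 feed string."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in list(root.iter("item"))[:limit]:
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
        })
    return items

def build_summary_prompt(symbol: str, items: list[dict]) -> str:
    """Assemble an LLM prompt asking for a concise news summary."""
    headlines = "\n".join(f"- {it['title']}" for it in items)
    return (
        f"Summarize the following news headlines about {symbol} "
        f"in two sentences:\n{headlines}"
    )

# Hypothetical feed content standing in for a live RSS source.
sample_feed = """<rss version="2.0"><channel>
<item><title>ACME beats earnings estimates</title><link>https://example.com/1</link></item>
<item><title>ACME announces buyback</title><link>https://example.com/2</link></item>
</channel></rss>"""

items = parse_rss_items(sample_feed)
prompt = build_summary_prompt("ACME", items)
```

In production, the prompt would be sent to an LLM endpoint and the reply posted by the bot.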

Senior Data Scientist

2022 - 2024
n11.com
  • Constructed customer data pipelines for daily, weekly, and monthly generated features based on customer transactions. Scheduled jobs to generate tables in BigQuery using Python.
  • Redesigned and improved a churn model to detect churners and calculate customer lifetime values using customer transactions as raw data, deployed as a Kubeflow service; fetched and processed data from BigQuery, all orchestrated with Cloud Scheduler.
  • Segmented customers based on behaviors using platform logs and transactions, deployed as a Kubeflow service; fetched and processed data from BigQuery, created segments, and wrote to another table, all orchestrated with Cloud Scheduler.
  • Developed and deployed a custom chatbot using customer interaction data, with the model deployed as a custom prediction routine endpoint in Vertex AI. The pipeline was Dockerized and deployed as an API on Cloud Run.
  • Designed an HTML page for in-office screens to track real-time order amounts, with animations that celebrate whenever a target is hit, built with HTML, CSS, and JavaScript and a FastAPI back end.
  • Worked as part of a team on a custom in-house recommender system project and contributed to the design of the whole project lifecycle, including the API design. Integrated Gradio to create web interfaces that testers could use on the model.
  • Designed and developed fraud and counterfeit product detection approaches, including image recognition, TF-IDF, lemmatization, stemming, and text embedding generation.
  • Developed and deployed an advanced image processing model on the Google Cloud Platform using Vertex AI to provide real-time predictions. The model was integrated with a Dataflow pipeline that generated and stored image embeddings in a BigQuery table.
  • Designed and developed a Kubernetes-managed service to retrieve image embeddings from BigQuery, standardize and rotate images for data augmentation, and optimize image data for further processing and analysis.
Technologies: Google Cloud ML, Google Cloud Platform (GCP), Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Python 3, Python, Google BigQuery, BigQuery, Apache Airflow, Cron, Cloud Dataflow, Machine Learning, Deep Learning, Unsupervised Learning, Customer Segmentation, Classification, Data Analysis, Data Science, Data Engineering, Data Pipelines, Natural Language Toolkit (NLTK), SpaCy, Artificial Intelligence (AI), Google Cloud Functions, Google Cloud, Kubeflow, APIs, Flask, Chatbots, HTML, CSS, JavaScript, OpenAI, ChatGPT, TensorFlow, Time Series, Beautiful Soup, Clustering, Supervised Machine Learning, Scikit-learn, Apache Beam, Large Language Models (LLMs), LangChain, NumPy, Generative Pre-trained Transformer 3 (GPT-3), OpenAI GPT-3 API, BERT, Gemini API, OpenCV, Kubernetes, Machine Learning Operations (MLOps), Vertex AI, REST, API Integration
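The TF-IDF component of the counterfeit-detection work above can be sketched in plain Python. The production system used library implementations and additional signals such as image embeddings; the product listings below are invented examples.

```python
import math
from collections import Counter

def tfidf(docs: list[str]) -> list[dict]:
    """Compute TF-IDF weights per document over whitespace tokens."""
    tokenized = [d.lower().split() for d in docs]
    n = len(docs)
    df = Counter()  # document frequency per token
    for toks in tokenized:
        df.update(set(toks))
    weights = []
    for toks in tokenized:
        tf = Counter(toks)
        total = len(toks)
        # term frequency scaled by inverse document frequency
        weights.append({
            t: (c / total) * math.log(n / df[t])
            for t, c in tf.items()
        })
    return weights

docs = [
    "genuine leather wallet brand new",
    "replica designer wallet cheap cheap",
    "genuine designer handbag",
]
w = tfidf(docs)
```

Tokens concentrated in one listing (e.g., "cheap") receive higher weights than tokens shared across listings, which is what makes them useful features for a downstream classifier.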

AI Developer

2023 - 2023
Onyx Relations Corp
  • Developed a bot capable of posting about specific topics and press releases and engaging with users on social media platforms.
  • Integrated and leveraged LLM/GPT technologies to enable organic and contextually relevant responses to user interactions.
  • Implemented functionalities to detect and respond to relevant threads, discussions, and trends across Twitter and Reddit.
  • Deployed all the processes to Google Cloud Platform using various technologies, such as Cloud Run, Cloud Functions, BigQuery, and Cloud Scheduler, among others.
Technologies: Artificial Intelligence (AI), Twitter API, Reddit API, Generative Pre-trained Transformers (GPT), OpenAI, OpenAI GPT-4 API, Web Scraping, Natural Language Processing (NLP), Automation, Google Cloud Platform (GCP), Google Cloud Functions, Google Cloud, Docker, BigQuery, Machine Learning Operations (MLOps), ChatGPT, TensorFlow, Beautiful Soup, Scikit-learn, Large Language Models (LLMs), Prompt Engineering, LangChain, NumPy, Generative Pre-trained Transformer 3 (GPT-3), OpenAI GPT-3 API, Generative AI, Llama 2, Yahoo! Finance, Vertex AI, API Integration, APIs

Data Scientist | AI Developer

2023 - 2023
Sole Entrepreneurship in US
  • Developed and backtested trend-following strategies for the US stock market using price-related data.
  • Automated successful trading strategies based on backtesting results using Python by connecting to stock market APIs.
  • Deployed all fully automated trading bots to the cloud, allowing the user to change parameters and start or stop them through a clean front-end screen.
  • Created separate BigQuery tables to record closed trades of each trading bot and visualized the trading results with filtering options to let the user analyze the bot performance using Looker Studio.
Technologies: Trading, Artificial Intelligence (AI), Data Science, Data Analysis, Algorithmic Trading, Trend Analysis, Google Cloud, Google Cloud Platform (GCP), Google BigQuery, Looker, API Integration, Finance APIs, Finance, Time Series, Scikit-learn, NumPy, Yahoo! Finance, Vertex AI
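A trend-following backtest like the ones described can be sketched as a simple moving-average crossover. The prices and window sizes below are invented, and the production bots traded through live market APIs; this only illustrates the backtesting idea.

```python
def sma(prices: list[float], window: int) -> list:
    """Simple moving average; None until enough history accumulates."""
    return [
        sum(prices[i - window + 1 : i + 1]) / window if i >= window - 1 else None
        for i in range(len(prices))
    ]

def backtest_crossover(prices: list[float], fast: int = 3, slow: int = 5) -> float:
    """Hold the asset while the fast SMA is above the slow SMA.

    Returns the final equity multiple starting from 1.0.
    """
    f, s = sma(prices, fast), sma(prices, slow)
    equity = 1.0
    for i in range(1, len(prices)):
        # Decide using yesterday's signal to avoid look-ahead bias.
        in_market = (
            f[i - 1] is not None and s[i - 1] is not None and f[i - 1] > s[i - 1]
        )
        if in_market:
            equity *= prices[i] / prices[i - 1]
    return equity

prices = [100, 101, 103, 102, 105, 107, 110, 108, 112, 115]
result = backtest_crossover(prices)
```

Using the previous day's signal before applying the current day's return is the key design choice; evaluating the crossover on same-day data would leak future information into the backtest.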

Senior Applied Scientist

2022 - 2022
Magnify
  • Acted as an ML model developer in a post-sales automation and orchestration platform development project. Segmented customers based on Salesforce platform usage attributes.
  • Gathered, transformed, and summarized features to define a rule-based churn algorithm to detect possible churners among customers.
  • Connected to an AWS VM instance over SSH from the local machine, set up MLflow experiment tracking with records stored in an AWS S3 bucket, and generated experiment tracking reports using Prefect.
Technologies: Python 3, Machine Learning Operations (MLOps), Clustering, Unsupervised Learning, Amazon SageMaker, Amazon Web Services (AWS), Artificial Intelligence (AI), Data Engineering, Python, Statistics, Data Science, Scikit-learn, Docker, Time Series, NumPy, PostgreSQL, Prefect, MLflow
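A rule-based churn flag of the kind described might look like the sketch below. The thresholds, inputs, and risk tiers are illustrative stand-ins, not the actual rules used at Magnify.

```python
from datetime import date

def churn_risk(last_login: date, today: date,
               weekly_active_users: int, seats: int) -> str:
    """Flag churn risk from inactivity and seat utilization.

    Thresholds here are hypothetical examples of a rule-based approach.
    """
    days_inactive = (today - last_login).days
    utilization = weekly_active_users / seats if seats else 0.0
    if days_inactive > 30 or utilization < 0.2:
        return "high"
    if days_inactive > 14 or utilization < 0.5:
        return "medium"
    return "low"

# Hypothetical account: last login two months ago, 5 of 100 seats active.
risk = churn_risk(date(2022, 1, 1), date(2022, 3, 1), 5, 100)
```

Rule-based scoring like this is easy to explain to customer-success teams, which is often why it is preferred over an opaque model for a first churn-detection pass.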

Senior Data Scientist

2021 - 2022
Intertech (Emirates NBD Bank)
  • Developed an NLP model that summarized claim documents to classify customer requests and forward them to the relevant department.
  • Collected and summarized employee effort logs as time series data, then estimated future effort to plan employee capacity requirements.
  • Built an anomaly detection model to detect anomalies in invoice payments and then implemented an email alert system for immediate intervention by the relevant teams.
  • Constructed pipelines for gathering data from various sources such as relational databases and HTML or Excel files to generate reports; these were published via Power BI.
Technologies: Python 3, Microsoft SQL Server, Microsoft Power BI, Financial Modeling, Trend Forecasting, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Natural Language Understanding (NLU), Data Analysis, Microsoft Azure, Data Visualization, Artificial Intelligence (AI), Data Engineering, Python, ETL, SQL, Data Pipelines, Data Analytics, Data Science, Statistics, Natural Language Toolkit (NLTK), SpaCy, Time Series, Clustering, Supervised Machine Learning, Scikit-learn, NumPy
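The invoice anomaly detection above can be illustrated with a simple z-score check. The real model and thresholds differed; the payment amounts below are invented, and the threshold is deliberately loose because a single large outlier inflates the standard deviation in small samples.

```python
import statistics

def flag_anomalies(amounts: list[float], threshold: float = 2.0) -> list[int]:
    """Return indices of payments whose z-score exceeds the threshold."""
    mean = statistics.mean(amounts)
    stdev = statistics.pstdev(amounts)  # population stdev over the batch
    if stdev == 0:
        return []
    return [
        i for i, a in enumerate(amounts)
        if abs(a - mean) / stdev > threshold
    ]

# Hypothetical invoice payments; one is clearly out of pattern.
payments = [120.0, 118.5, 121.0, 119.8, 120.3, 950.0, 120.1]
anomalies = flag_anomalies(payments)
```

In the deployed system, flagged indices would trigger the email alert described above so the relevant team could intervene immediately.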

Senior Data Scientist

2020 - 2021
Sekerbank (Samruk — Kazyna Invest LLP)
  • Built and presented propensity models for retail loan products and loan accounts to determine the tendency of customers to purchase these products.
  • Developed and implemented a clustering algorithm to segment retail customers based on their assets, liabilities, and product ownership.
  • Cleaned and classified texts from customer complaints about products and services to generate weekly reports.
  • Developed a market-basket analysis project based on customer product ownership to improve marketing activities.
  • Constructed pipelines to parse and analyze customer data for daily, weekly, and monthly executive reports, automating report preparation.
Technologies: Python 3, Oracle SQL, Predictive Modeling, Classification, Trend Forecasting, Machine Learning, Supervised Machine Learning, Machine Learning Operations (MLOps), Data Engineering, SQL, Python, Data Science, Data Analysis, Data Analytics, Data Pipelines, ETL, Scikit-learn, Pandas, Forecasting, Natural Language Toolkit (NLTK), Artificial Intelligence (AI), Time Series, Clustering, NumPy, PostgreSQL

Data Scientist

2019 - 2020
Vakifbank
  • Developed and deployed product propensity models for retail and SME customers to detect whether a customer was likely to buy and to improve the customer targeting of marketing initiatives.
  • Constructed a customer segmentation model based on the customer's balance account, transactions, credit cards, and loan usage behaviors.
  • Investigated and updated the prediction models in production to improve their performance and simplify their outputs.
  • Improved report generation pipelines to automate preparation processes based on customer data.
Technologies: Python 3, Oracle SQL, Classification, Machine Learning Operations (MLOps), Clustering, Unsupervised Learning, Supervised Learning, Python, Statistics, Data Pipelines, Data Science, Data Analysis, Data Analytics, SQL, ETL, Machine Learning, Financial Modeling, Trading, Algorithmic Trading, Artificial Intelligence (AI), Finance, Finance APIs, Time Series, Supervised Machine Learning, Scikit-learn, NumPy

Lyrics Generator | A Web Scraping and Lyric Generation Project

https://github.com/burcins/LyricsGenerator
In this personal project, I generated lyrics from the entire discography of a given performer. I trained the model on Bob Dylan lyrics, but it can be retrained on any artist.

In the first step, I parsed lyrics from a web page using the Beautiful Soup package, then cleaned and prepared them for model development. After that, I created a bidirectional LSTM model with a couple of layers and trained it for 100 iterations. Finally, I fed seed words to the trained model, and it predicted an additional 100 words.
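The sequence-preparation step for next-word prediction can be sketched as follows. The actual model was a bidirectional LSTM; this shows only how cleaned lyrics become (context, target) training pairs. The sample line is a well-known Bob Dylan lyric used purely as input.

```python
def build_training_sequences(text: str, seq_len: int = 4):
    """Turn cleaned lyrics into (context indices, next-word index) pairs."""
    words = text.lower().split()
    vocab = sorted(set(words))
    index = {w: i for i, w in enumerate(vocab)}
    pairs = []
    # Slide a window over the text: seq_len words in, the next word out.
    for i in range(len(words) - seq_len):
        context = [index[w] for w in words[i : i + seq_len]]
        target = index[words[i + seq_len]]
        pairs.append((context, target))
    return pairs, vocab

lyrics = "how many roads must a man walk down before you call him a man"
pairs, vocab = build_training_sequences(lyrics)
```

Each pair becomes one training example for the LSTM: the context indices are embedded and fed through the recurrent layers, and the target index is the class to predict.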

Twitter Sentiment Analysis

https://github.com/burcins/Twitter-Sentiment-Analysis
In this project, my aim was to fetch the most recent batch of tweets and clean the text. I then ran sentiment analysis on each tweet and assigned it a score indicating how positive or negative it was.
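A minimal lexicon-based scorer illustrates the per-tweet scoring idea. The word lists here are tiny stand-ins for a real sentiment lexicon, and the tweets are invented.

```python
import re

POSITIVE = {"great", "love", "happy", "excellent", "good"}
NEGATIVE = {"bad", "hate", "terrible", "awful", "sad"}

def sentiment_score(tweet: str) -> int:
    """Score a tweet: +1 per positive word, -1 per negative word."""
    tokens = re.findall(r"[a-z']+", tweet.lower())
    return sum((t in POSITIVE) - (t in NEGATIVE) for t in tokens)

scores = [sentiment_score(t) for t in [
    "I love this great product",
    "terrible service, really bad",
    "it arrived on time",
]]
```

A positive total marks the tweet as positive, a negative total as negative, and zero as neutral; a production pipeline would swap in a full lexicon or a trained classifier.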

ATM Cash Demand Forecasting

https://github.com/burcins/Time-Series-Forecasting
The main aim of this project was to forecast the daily cash demands of ATMs for the following month using multiyear daily logs of deposited and withdrawn cash.

The dataset included three features: Cash In, Cash Out, and Date. It contained 1,186 observations in total, corresponding to 1,186 days from 01/01/2016 to 03/31/2019. The task was to forecast the Cash In and Cash Out values separately for 04/01/2019 through 04/30/2019.
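A seasonal-naive baseline for this kind of daily forecast simply reuses the most recent same-weekday observation, which captures the strong weekday/weekend pattern in cash demand. The synthetic history below stands in for the real ATM logs; the actual project used richer time series models.

```python
from datetime import date, timedelta

def seasonal_naive_forecast(history: dict, start: date, days: int) -> dict:
    """Forecast each day as the most recent same-weekday value in history."""
    last = max(history)
    forecast = {}
    for offset in range(days):
        day = start + timedelta(days=offset)
        past = day - timedelta(weeks=1)
        # Step back a week at a time until we land inside the history.
        while past > last:
            past -= timedelta(weeks=1)
        forecast[day] = history[past]
    return forecast

# Synthetic daily history: weekdays see 100 units of demand, weekends 200.
history = {}
d = date(2019, 1, 1)
while d <= date(2019, 3, 31):
    history[d] = 200.0 if d.weekday() >= 5 else 100.0
    d += timedelta(days=1)

april = seasonal_naive_forecast(history, date(2019, 4, 1), 30)
```

Baselines like this are useful as the benchmark a proper forecasting model must beat before it earns a place in production.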

Term Deposit Propensity Prediction

https://github.com/burcins/Term-Deposit-Propensity-Prediction
The main project goal was to build an end-to-end ML pipeline that predicts a customer's propensity to buy a term deposit using call center data, i.e., the probability of the customer purchasing a term deposit. The final part covered customer clustering to identify the customers most likely to buy investment products.

The dataset contained 40,000 customer records with 14 features, including term deposit ownership.

Text Summarizer

https://huggingface.co/spaces/Burcin/ExtractiveSummarizer
For this project, my primary aim was to summarize texts based on their content. I developed a model and deployed it to Hugging Face with an interface that lets users summarize Wikipedia content: the user only enters a topic, and the interface fetches the corresponding content from Wikipedia. The model combines two different extractive summarization methods, and the number of sentences in the output depends on the length of the original text.
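Frequency-based sentence scoring is one common extractive approach; the sketch below illustrates the idea with a toy text. The deployed model's two methods may differ from this simplified version.

```python
import re
from collections import Counter

def extractive_summary(text: str, ratio: float = 0.4) -> str:
    """Keep the highest-scoring sentences, in their original order.

    Each sentence is scored by the corpus frequency of its words, so
    sentences full of recurring terms rank as most representative.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"[a-z]+", text.lower()))
    scores = [
        sum(freq[w] for w in re.findall(r"[a-z]+", s.lower()))
        for s in sentences
    ]
    keep = max(1, round(len(sentences) * ratio))
    top = sorted(sorted(range(len(sentences)), key=lambda i: -scores[i])[:keep])
    return " ".join(sentences[i] for i in top)

text = (
    "Python is a popular language. Python emphasizes readability. "
    "Readability makes Python popular. Some people prefer tabs."
)
summary = extractive_summary(text)
```

Tying output length to a ratio of the input is what makes the summary scale with the original text, matching the behavior described above.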

Multiclass Classification Development and Deployment (MLOps)

https://github.com/burcins/mlops-zoomcamp-main-project
This intensive end-to-end MLOps project includes data exploration, experimentation, and model development, as well as experiment tracking with MLflow and orchestration with Prefect as a workflow tool to deploy the model as a web service.

For this project, publicly available wine data was used, and a simple multiclass classification model was developed to predict wine quality and assign a quality rate between 3 and 9, based on the product's ingredients as predictors.
2018 - 2020

Master's Degree in Business Analytics

Athens University of Economics and Business - Athens, Greece

2011 - 2013

Master's Degree in Capital Markets

Marmara University - Istanbul, Turkey

SEPTEMBER 2022 - PRESENT

MLOps Zoomcamp

DataTalks.Club

NOVEMBER 2020 - PRESENT

Natural Language Processing Specialization

Coursera

Libraries/APIs

Pandas, Scikit-learn, Twitter API, NumPy, XGBoost, TensorFlow, Beautiful Soup, Natural Language Toolkit (NLTK), SpaCy, Reddit API, OpenCV

Tools

BigQuery, ChatGPT, PyCharm, Microsoft Power BI, Amazon SageMaker, Yahoo! Finance, Apache Airflow, Cron, Cloud Dataflow, Grafana, Looker, Apache Beam

Languages

Python 3, Python, SQL, SAS, R, HTML, CSS, JavaScript

Paradigms

Data Science, ETL, REST, Automation

Platforms

Jupyter Notebook, Vertex AI, Ubuntu 20.04, Google Cloud Platform (GCP), Docker, Kubeflow, Cloud Run, Amazon Web Services (AWS), Visual Studio Code (VS Code), Ubuntu, Kubernetes

Storage

Google Cloud, Microsoft SQL Server, Oracle SQL, MySQL, PostgreSQL, Data Pipelines, MongoDB, Cassandra, Redis, NoSQL

Frameworks

Flask, Streamlit

Other

Machine Learning, Natural Language Processing (NLP), Time Series, Classification, Clustering, Unsupervised Learning, Supervised Machine Learning, Data Analysis, Supervised Learning, Artificial Intelligence (AI), Data Analytics, Regression, Google BigQuery, Data Processing Automation, API Integration, Google Cloud Functions, Finance, APIs, Deep Learning, Statistics, Text Classification, Web Scraping, Machine Learning Operations (MLOps), Time Series Analysis, Financial Modeling, Trend Forecasting, Microsoft Azure, Data Visualization, Data Engineering, Trading, Algorithmic Trading, Financial Markets, Capital Markets, Stock Market, Stock Trading, Stock Exchange, Recurrent Neural Networks (RNNs), Convolutional Neural Networks (CNN), Long Short-term Memory (LSTM), Text Categorization, OpenAI, OpenAI GPT-4 API, Finance APIs, Chatbots, Large Language Models (LLMs), Prompt Engineering, LangChain, Retrieval-augmented Generation (RAG), Generative Pre-trained Transformer 3 (GPT-3), OpenAI GPT-3 API, Gemini, Anthropic, Claude, Gemini API, Generative AI, Llama 2, Gunicorn, Predictive Modeling, Natural Language Understanding (NLU), Forecasting, Stock Price Analysis, Stock Market Technical Analysis, Financial Marketing, Big Data, Social Media Analytics, Sequence Models, Data Cleaning, Google Cloud ML, Customer Segmentation, MLflow, Prefect, Trend Analysis, Generative Pre-trained Transformers (GPT), Open-source LLMs, HTML Parsing, Job Schedulers, BERT
