Burcin Sarac, Developer in Istanbul, Turkey

Burcin Sarac

Verified Expert in Engineering

Data Scientist and Software Developer

Location
Istanbul, Turkey
Toptal Member Since
August 13, 2021

Burcin is a seasoned data scientist and AI developer with a master's degree in the field and certifications in ML and AI. With a strong command of Python and its ecosystem, he has extensive hands-on experience across various AI and ML technologies. Burcin currently focuses on large language models (LLMs), specializing in task automation and in developing and deploying AI products to cloud environments, particularly Google Cloud Platform (GCP).

Portfolio

Onyx Relations Corp
Artificial Intelligence (AI), Python, Natural Language Processing (NLP)...
n11.com
Google Cloud ML, Google Cloud Platform (GCP)...
Onyx Relations Corp
Artificial Intelligence (AI), Twitter API, Reddit API...

Experience

Availability

Part-time

Preferred Environment

Python 3, Jupyter Notebook, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Google Cloud Platform (GCP), Amazon Web Services (AWS), Visual Studio Code (VS Code), Ubuntu, Large Language Models (LLMs)

The most amazing...

...thing I've built is a versatile bot using LLMs for automated, context-aware interactions on social media, adaptable to any input and deployed on GCP.

Work Experience

AI Developer (via Toptal)

2023 - 2024
Onyx Relations Corp
  • Developed a versatile bot capable of posting about specific topics and press releases and engaging with users on social media platforms, including Twitter and Reddit.
  • Integrated and leveraged state-of-the-art LLM/GPT technologies, including OpenAI API and Gemini Pro, to enable organic and contextually relevant responses to user interactions.
  • Implemented functionalities to detect and respond to relevant threads, discussions, and trends across multiple platforms.
  • Enhanced the bot's adaptability to any input stock symbol, fetching news data from at least 50 news sources using APIs, RSS feeds, and webpage parsing techniques.
  • Summarized news data using the latest LLM models to provide concise and informative content.
  • Dockerized the entire app as a service and deployed all processes to the Google Cloud Platform using various technologies, such as Cloud Run, Cloud Functions, BigQuery, and Cloud Scheduler, ensuring efficient and scalable operations.
Technologies: Artificial Intelligence (AI), Python, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Large Language Models (LLMs), OpenAI, OpenAI GPT-4 API, Open-source LLMs, HTML Parsing, APIs, Google Cloud, Google Cloud Functions, Google BigQuery, Job Schedulers, Prompt Engineering, LangChain, NumPy, Retrieval-augmented Generation (RAG), Generative Pre-trained Transformer 3 (GPT-3), OpenAI GPT-3 API, Gemini, Anthropic, Claude, BERT, Generative AI, Llama 2, Yahoo! Finance, Docker, Cloud Run, Vertex AI, REST, API Integration
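As an illustration of the news-ingestion step described above, here is a minimal sketch of parsing an RSS feed and assembling a summarization prompt for an LLM. The feed content, stock symbol, and function names are invented for illustration; the actual bot pulled from many more sources and called the OpenAI and Gemini APIs for the summarization itself.

```python
import xml.etree.ElementTree as ET

def parse_rss_items(rss_xml: str, limit: int = 5) -> list[dict]:
    """Extract title/link pairs from a raw RSS 2.0 feed string."""
    root = ET.fromstring(rss_xml)
    items = []
    for item in list(root.iter("item"))[:limit]:
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
        })
    return items

def build_summary_prompt(symbol: str, items: list[dict]) -> str:
    """Assemble an LLM prompt asking for a concise news summary."""
    headlines = "\n".join(f"- {it['title']}" for it in items)
    return (
        f"Summarize the following news headlines about {symbol} "
        f"in two sentences:\n{headlines}"
    )

# Hypothetical feed content standing in for a live RSS source.
sample_feed = """<rss version="2.0"><channel>
<item><title>ACME beats earnings estimates</title><link>https://example.com/1</link></item>
<item><title>ACME announces buyback</title><link>https://example.com/2</link></item>
</channel></rss>"""

items = parse_rss_items(sample_feed)
prompt = build_summary_prompt("ACME", items)
```

In production, the prompt would be sent to an LLM endpoint and the reply posted by the bot.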

Senior Data Scientist

2022 - 2024
n11.com
  • Constructed customer data pipelines for daily, weekly, and monthly generated features based on customer transactions. Scheduled jobs to generate tables in BigQuery using Python.
  • Redesigned and improved a churn model to detect churners and calculate customer lifetime values using customer transactions as raw data, deployed as a Kubeflow service; fetched and processed data from BigQuery, all orchestrated with Cloud Scheduler.
  • Segmented customers based on behaviors using platform logs and transactions, deployed as a Kubeflow service; fetched and processed data from BigQuery, created segments, and wrote to another table, all orchestrated with Cloud Scheduler.
  • Developed and deployed a custom chatbot using customer interaction data, with the model deployed as a custom prediction routine endpoint in Vertex AI. The pipeline was Dockerized and deployed as an API on Cloud Run.
  • Designed an HTML page for in-office screens to track real-time order amounts, with animations that celebrate whenever a target is hit, built with HTML, CSS, and JavaScript and a FastAPI back end.
  • Worked as part of a team on a custom in-house recommender system project and contributed to the design of the whole project lifecycle, including the API design. Integrated Gradio to create web interfaces that testers could use on the model.
  • Designed and developed fraud and counterfeit product detection approaches, including image recognition, TF-IDF, lemmatization, stemming, and text embedding generation.
  • Developed and deployed an advanced image processing model on the Google Cloud Platform using Vertex AI to provide real-time predictions. The model was integrated with a Dataflow pipeline that generated and stored image embeddings in a BigQuery table.
  • Designed and developed a Kubernetes-managed service to retrieve image embeddings from BigQuery, standardize and rotate images for data augmentation, and optimize image data for further processing and analysis.
Technologies: Google Cloud ML, Google Cloud Platform (GCP), Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Python 3, Python, Google BigQuery, BigQuery, Apache Airflow, Cron, Cloud Dataflow, Machine Learning, Deep Learning, Unsupervised Learning, Customer Segmentation, Classification, Data Analysis, Data Science, Data Engineering, Data Pipelines, Natural Language Toolkit (NLTK), SpaCy, Artificial Intelligence (AI), Google Cloud Functions, Google Cloud, Kubeflow, APIs, Flask, Chatbots, HTML, CSS, JavaScript, OpenAI, ChatGPT, TensorFlow, Time Series, Beautiful Soup, Clustering, Supervised Machine Learning, Scikit-learn, Apache Beam, Large Language Models (LLMs), LangChain, NumPy, Generative Pre-trained Transformer 3 (GPT-3), OpenAI GPT-3 API, BERT, Gemini API, OpenCV, Kubernetes, Machine Learning Operations (MLOps), Vertex AI, REST, API Integration
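The TF-IDF component of the counterfeit-detection work above can be sketched in plain Python. The production system used library implementations and additional signals such as image embeddings; the product listings below are invented examples.

```python
import math
from collections import Counter

def tfidf(docs: list[str]) -> list[dict]:
    """Compute TF-IDF weights per document over whitespace tokens."""
    tokenized = [d.lower().split() for d in docs]
    n = len(docs)
    df = Counter()  # document frequency per token
    for toks in tokenized:
        df.update(set(toks))
    weights = []
    for toks in tokenized:
        tf = Counter(toks)
        total = len(toks)
        # term frequency scaled by inverse document frequency
        weights.append({
            t: (c / total) * math.log(n / df[t])
            for t, c in tf.items()
        })
    return weights

docs = [
    "genuine leather wallet brand new",
    "replica designer wallet cheap cheap",
    "genuine designer handbag",
]
w = tfidf(docs)
```

Tokens concentrated in one listing (e.g., "cheap") receive higher weights than tokens shared across listings, which is what makes them useful features for a downstream classifier.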

AI Developer

2023 - 2023
Onyx Relations Corp
  • Developed a bot capable of posting about specific topics and press releases and engaging with users on social media platforms.
  • Integrated and leveraged LLM/GPT technologies to enable organic and contextually relevant responses to user interactions.
  • Implemented functionalities to detect and respond to relevant threads, discussions, and trends across Twitter and Reddit.
  • Deployed all the processes to Google Cloud Platform using various technologies, such as Cloud Run, Cloud Functions, BigQuery, and Cloud Scheduler, among others.
Technologies: Artificial Intelligence (AI), Twitter API, Reddit API, Generative Pre-trained Transformers (GPT), OpenAI, OpenAI GPT-4 API, Web Scraping, Natural Language Processing (NLP), Automation, Google Cloud Platform (GCP), Google Cloud Functions, Google Cloud, Docker, BigQuery, Machine Learning Operations (MLOps), ChatGPT, TensorFlow, Beautiful Soup, Scikit-learn, Large Language Models (LLMs), Prompt Engineering, LangChain, NumPy, Generative Pre-trained Transformer 3 (GPT-3), OpenAI GPT-3 API, Generative AI, Llama 2, Yahoo! Finance, Vertex AI, API Integration, APIs

Data Scientist | AI Developer

2023 - 2023
Sole Entrepreneurship in US
  • Developed and backtested trend-following strategies for the US stock market using price-related data.
  • Automated successful trading strategies based on backtesting results using Python by connecting to stock market APIs.
  • Deployed all fully automated trading bots to the cloud, allowing the user to change parameters and start or stop them through a clean front-end screen.
  • Created separate BigQuery tables to record closed trades of each trading bot and visualized the trading results with filtering options to let the user analyze the bot performance using Looker Studio.
Technologies: Trading, Artificial Intelligence (AI), Data Science, Data Analysis, Algorithmic Trading, Trend Analysis, Google Cloud, Google Cloud Platform (GCP), Google BigQuery, Looker, API Integration, Finance APIs, Finance, Time Series, Scikit-learn, NumPy, Yahoo! Finance, Vertex AI
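A trend-following backtest like the ones described can be sketched as a simple moving-average crossover. The prices and window sizes below are invented, and the production bots traded through live market APIs; this only illustrates the backtesting idea.

```python
def sma(prices: list[float], window: int) -> list:
    """Simple moving average; None until enough history accumulates."""
    return [
        sum(prices[i - window + 1 : i + 1]) / window if i >= window - 1 else None
        for i in range(len(prices))
    ]

def backtest_crossover(prices: list[float], fast: int = 3, slow: int = 5) -> float:
    """Hold the asset while the fast SMA is above the slow SMA.

    Returns the final equity multiple starting from 1.0.
    """
    f, s = sma(prices, fast), sma(prices, slow)
    equity = 1.0
    for i in range(1, len(prices)):
        # Decide using yesterday's signal to avoid look-ahead bias.
        in_market = (
            f[i - 1] is not None and s[i - 1] is not None and f[i - 1] > s[i - 1]
        )
        if in_market:
            equity *= prices[i] / prices[i - 1]
    return equity

prices = [100, 101, 103, 102, 105, 107, 110, 108, 112, 115]
result = backtest_crossover(prices)
```

Using the previous day's signal before applying the current day's return is the key design choice; evaluating the crossover on same-day data would leak future information into the backtest.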

Senior Applied Scientist

2022 - 2022
Magnify
  • Acted as an ML model developer in a post-sales automation and orchestration platform development project. Segmented customers based on Salesforce platform usage attributes.
  • Gathered, transformed, and summarized features to define a rule-based churn algorithm to detect possible churners among customers.
  • Connected to an AWS VM instance over SSH from the local machine, set up MLflow experiment tracking with records stored in an AWS S3 bucket, and generated experiment tracking reports using Prefect.
Technologies: Python 3, Machine Learning Operations (MLOps), Clustering, Unsupervised Learning, Amazon SageMaker, Amazon Web Services (AWS), Artificial Intelligence (AI), Data Engineering, Python, Statistics, Data Science, Scikit-learn, Docker, Time Series, NumPy, PostgreSQL, Prefect, MLflow
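A rule-based churn flag of the kind described might look like the sketch below. The thresholds, inputs, and risk tiers are illustrative stand-ins, not the actual rules used at Magnify.

```python
from datetime import date

def churn_risk(last_login: date, today: date,
               weekly_active_users: int, seats: int) -> str:
    """Flag churn risk from inactivity and seat utilization.

    Thresholds here are hypothetical examples of a rule-based approach.
    """
    days_inactive = (today - last_login).days
    utilization = weekly_active_users / seats if seats else 0.0
    if days_inactive > 30 or utilization < 0.2:
        return "high"
    if days_inactive > 14 or utilization < 0.5:
        return "medium"
    return "low"

# Hypothetical account: last login two months ago, 5 of 100 seats active.
risk = churn_risk(date(2022, 1, 1), date(2022, 3, 1), 5, 100)
```

Rule-based scoring like this is easy to explain to customer-success teams, which is often why it is preferred over an opaque model for a first churn-detection pass.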

Senior Data Scientist

2021 - 2022
Intertech (Emirates NBD Bank)
  • Developed an NLP model that summarized claim documents to classify customer requests and forward them to the relevant department.
  • Collected and summarized employee effort logs as time series data, then estimated future effort to plan employee capacity requirements.
  • Built an anomaly detection model to detect anomalies in invoice payments and then implemented an email alert system for immediate intervention by the relevant teams.
  • Constructed pipelines for gathering data from various sources such as relational databases and HTML or Excel files to generate reports; these were published via Power BI.
Technologies: Python 3, Microsoft SQL Server, Microsoft Power BI, Financial Modeling, Trend Forecasting, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), Natural Language Understanding (NLU), Data Analysis, Microsoft Azure, Data Visualization, Artificial Intelligence (AI), Data Engineering, Python, ETL, SQL, Data Pipelines, Data Analytics, Data Science, Statistics, Natural Language Toolkit (NLTK), SpaCy, Time Series, Clustering, Supervised Machine Learning, Scikit-learn, NumPy
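The invoice anomaly detection above can be illustrated with a simple z-score check. The real model and thresholds differed; the payment amounts below are invented, and the threshold is deliberately loose because a single large outlier inflates the standard deviation in small samples.

```python
import statistics

def flag_anomalies(amounts: list[float], threshold: float = 2.0) -> list[int]:
    """Return indices of payments whose z-score exceeds the threshold."""
    mean = statistics.mean(amounts)
    stdev = statistics.pstdev(amounts)  # population stdev over the batch
    if stdev == 0:
        return []
    return [
        i for i, a in enumerate(amounts)
        if abs(a - mean) / stdev > threshold
    ]

# Hypothetical invoice payments; one is clearly out of pattern.
payments = [120.0, 118.5, 121.0, 119.8, 120.3, 950.0, 120.1]
anomalies = flag_anomalies(payments)
```

In the deployed system, flagged indices would trigger the email alert described above so the relevant team could intervene immediately.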

Senior Data Scientist

2020 - 2021
Sekerbank (Samruk — Kazyna Invest LLP)
  • Built and presented propensity models for retail loan products and loan accounts to determine the tendency of customers to purchase these products.
  • Developed and implemented a clustering algorithm to segment retail customers based on their assets, liabilities, and product ownership.
  • Cleaned and classified texts from customer complaints about products and services to generate weekly reports.
  • Developed a market-basket analysis project based on customer product ownership to improve marketing activities.
  • Constructed pipelines to parse and analyze customer data for daily, weekly, and monthly executive reports, automating report preparation.
Technologies: Python 3, Oracle SQL, Predictive Modeling, Classification, Trend Forecasting, Machine Learning, Supervised Machine Learning, Machine Learning Operations (MLOps), Data Engineering, SQL, Python, Data Science, Data Analysis, Data Analytics, Data Pipelines, ETL, Scikit-learn, Pandas, Forecasting, Natural Language Toolkit (NLTK), Artificial Intelligence (AI), Time Series, Clustering, NumPy, PostgreSQL

Data Scientist

2019 - 2020
Vakifbank
  • Developed and deployed product propensity models for retail and SME customers to detect whether a customer was likely to buy and to improve the customer targeting of marketing initiatives.
  • Constructed a customer segmentation model based on the customer's balance account, transactions, credit cards, and loan usage behaviors.
  • Investigated and updated the prediction models in production to improve their performance and simplify their outputs.
  • Improved report generation pipelines to automate preparation processes based on customer data.
Technologies: Python 3, Oracle SQL, Classification, Machine Learning Operations (MLOps), Clustering, Unsupervised Learning, Supervised Learning, Python, Statistics, Data Pipelines, Data Science, Data Analysis, Data Analytics, SQL, ETL, Machine Learning, Financial Modeling, Trading, Algorithmic Trading, Artificial Intelligence (AI), Finance, Finance APIs, Time Series, Supervised Machine Learning, Scikit-learn, NumPy

Lyrics Generator | A Web Scraping and Lyric Generation Project

https://github.com/burcins/LyricsGenerator
In this personal project, I generated lyrics from the entire discography of a given performer. I trained the model on Bob Dylan lyrics, but it can be retrained on any artist.

In the first step, I parsed lyrics from a web page using the Beautiful Soup package, then cleaned and prepared them for model development. After that, I created a bidirectional LSTM model with a couple of layers and trained it for 100 iterations. Finally, I fed seed words to the trained model, and it predicted an additional 100 words.
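The sequence-preparation step for next-word prediction can be sketched as follows. The actual model was a bidirectional LSTM; this shows only how cleaned lyrics become (context, target) training pairs. The sample line is a well-known Bob Dylan lyric used purely as input.

```python
def build_training_sequences(text: str, seq_len: int = 4):
    """Turn cleaned lyrics into (context indices, next-word index) pairs."""
    words = text.lower().split()
    vocab = sorted(set(words))
    index = {w: i for i, w in enumerate(vocab)}
    pairs = []
    # Slide a window over the text: seq_len words in, the next word out.
    for i in range(len(words) - seq_len):
        context = [index[w] for w in words[i : i + seq_len]]
        target = index[words[i + seq_len]]
        pairs.append((context, target))
    return pairs, vocab

lyrics = "how many roads must a man walk down before you call him a man"
pairs, vocab = build_training_sequences(lyrics)
```

Each pair becomes one training example for the LSTM: the context indices are embedded and fed through the recurrent layers, and the target index is the class to predict.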

Twitter Sentiment Analysis

https://github.com/burcins/Twitter-Sentiment-Analysis
In this project, my aim was to fetch the most recent batch of tweets and clean the text. I then ran sentiment analysis on each tweet and assigned it a score indicating how positive or negative it was.
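A minimal lexicon-based scorer illustrates the per-tweet scoring idea. The word lists here are tiny stand-ins for a real sentiment lexicon, and the tweets are invented.

```python
import re

POSITIVE = {"great", "love", "happy", "excellent", "good"}
NEGATIVE = {"bad", "hate", "terrible", "awful", "sad"}

def sentiment_score(tweet: str) -> int:
    """Score a tweet: +1 per positive word, -1 per negative word."""
    tokens = re.findall(r"[a-z']+", tweet.lower())
    return sum((t in POSITIVE) - (t in NEGATIVE) for t in tokens)

scores = [sentiment_score(t) for t in [
    "I love this great product",
    "terrible service, really bad",
    "it arrived on time",
]]
```

A positive total marks the tweet as positive, a negative total as negative, and zero as neutral; a production pipeline would swap in a full lexicon or a trained classifier.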

ATM Cash Demand Forecasting

https://github.com/burcins/Time-Series-Forecasting
The main aim of this project was to forecast the daily cash demands of ATMs for the following month using multiyear daily logs of deposited and withdrawn cash.

The dataset included three features: Cash In, Cash Out, and Date. It contained 1,186 observations in total, corresponding to 1,186 days from 01/01/2016 to 03/31/2019. The task was to forecast the Cash In and Cash Out values separately for 04/01/2019 through 04/30/2019.
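A seasonal-naive baseline for this kind of daily forecast simply reuses the most recent same-weekday observation, which captures the strong weekday/weekend pattern in cash demand. The synthetic history below stands in for the real ATM logs; the actual project used richer time series models.

```python
from datetime import date, timedelta

def seasonal_naive_forecast(history: dict, start: date, days: int) -> dict:
    """Forecast each day as the most recent same-weekday value in history."""
    last = max(history)
    forecast = {}
    for offset in range(days):
        day = start + timedelta(days=offset)
        past = day - timedelta(weeks=1)
        # Step back a week at a time until we land inside the history.
        while past > last:
            past -= timedelta(weeks=1)
        forecast[day] = history[past]
    return forecast

# Synthetic daily history: weekdays see 100 units of demand, weekends 200.
history = {}
d = date(2019, 1, 1)
while d <= date(2019, 3, 31):
    history[d] = 200.0 if d.weekday() >= 5 else 100.0
    d += timedelta(days=1)

april = seasonal_naive_forecast(history, date(2019, 4, 1), 30)
```

Baselines like this are useful as the benchmark a proper forecasting model must beat before it earns a place in production.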

Term Deposit Propensity Prediction

https://github.com/burcins/Term-Deposit-Propensity-Prediction
The main project goal was to build an end-to-end ML pipeline that predicts a customer's propensity to buy a term deposit using call center data, i.e., the probability of the customer purchasing a term deposit. The final part covered customer clustering to identify the customers most likely to buy investment products.

The dataset contained 40,000 customer records with 14 features, including term deposit ownership.

Text Summarizer

https://huggingface.co/spaces/Burcin/ExtractiveSummarizer
For this project, my primary aim was to summarize texts based on their content. I developed a model and deployed it to Hugging Face with an interface that lets users summarize Wikipedia content: the user only enters a topic, and the interface fetches the corresponding content from Wikipedia. The model combines two different extractive summarization methods, and the number of sentences in the output depends on the length of the original text.
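Frequency-based sentence scoring is one common extractive approach; the sketch below illustrates the idea with a toy text. The deployed model's two methods may differ from this simplified version.

```python
import re
from collections import Counter

def extractive_summary(text: str, ratio: float = 0.4) -> str:
    """Keep the highest-scoring sentences, in their original order.

    Each sentence is scored by the corpus frequency of its words, so
    sentences full of recurring terms rank as most representative.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    freq = Counter(re.findall(r"[a-z]+", text.lower()))
    scores = [
        sum(freq[w] for w in re.findall(r"[a-z]+", s.lower()))
        for s in sentences
    ]
    keep = max(1, round(len(sentences) * ratio))
    top = sorted(sorted(range(len(sentences)), key=lambda i: -scores[i])[:keep])
    return " ".join(sentences[i] for i in top)

text = (
    "Python is a popular language. Python emphasizes readability. "
    "Readability makes Python popular. Some people prefer tabs."
)
summary = extractive_summary(text)
```

Tying output length to a ratio of the input is what makes the summary scale with the original text, matching the behavior described above.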

Multiclass Classification Development and Deployment (MLOps)

https://github.com/burcins/mlops-zoomcamp-main-project
This intensive end-to-end MLOps project includes data exploration, experimentation, and model development, as well as experiment tracking with MLflow and orchestration with Prefect as a workflow tool to deploy the model as a web service.

For this project, publicly available wine data was used, and a simple multiclass classification model was developed to predict wine quality and assign a quality rate between 3 and 9, based on the product's ingredients as predictors.
2018 - 2020

Master's Degree in Business Analytics

Athens University of Economics and Business - Athens, Greece

2011 - 2013

Master's Degree in Capital Markets

Marmara University - Istanbul, Turkey

SEPTEMBER 2022 - PRESENT

MLOps Zoomcamp

DataTalks.Club

NOVEMBER 2020 - PRESENT

Natural Language Processing Specialization

Coursera

Libraries/APIs

Pandas, Scikit-learn, Twitter API, NumPy, XGBoost, TensorFlow, Beautiful Soup, Natural Language Toolkit (NLTK), SpaCy, Reddit API, OpenCV

Tools

BigQuery, ChatGPT, PyCharm, Microsoft Power BI, Amazon SageMaker, Yahoo! Finance, Apache Airflow, Cron, Cloud Dataflow, Grafana, Looker, Apache Beam

Languages

Python 3, Python, SQL, SAS, R, HTML, CSS, JavaScript

Paradigms

Data Science, ETL, REST, Automation

Platforms

Jupyter Notebook, Vertex AI, Ubuntu 20.04, Google Cloud Platform (GCP), Docker, Kubeflow, Cloud Run, Amazon Web Services (AWS), Visual Studio Code (VS Code), Ubuntu, Kubernetes

Storage

Google Cloud, Microsoft SQL Server, Oracle SQL, MySQL, PostgreSQL, Data Pipelines, MongoDB, Cassandra, Redis, NoSQL

Frameworks

Flask, Streamlit

Other

Machine Learning, Natural Language Processing (NLP), Time Series, Classification, Clustering, Unsupervised Learning, Supervised Machine Learning, Data Analysis, Supervised Learning, Artificial Intelligence (AI), Data Analytics, Regression, Google BigQuery, Data Processing Automation, API Integration, Google Cloud Functions, Finance, APIs, Deep Learning, Statistics, Text Classification, Web Scraping, Machine Learning Operations (MLOps), Time Series Analysis, Financial Modeling, Trend Forecasting, Microsoft Azure, Data Visualization, Data Engineering, Trading, Algorithmic Trading, Financial Markets, Capital Markets, Stock Market, Stock Trading, Stock Exchange, Recurrent Neural Networks (RNNs), Convolutional Neural Networks (CNN), Long Short-term Memory (LSTM), Text Categorization, OpenAI, OpenAI GPT-4 API, Finance APIs, Chatbots, Large Language Models (LLMs), Prompt Engineering, LangChain, Retrieval-augmented Generation (RAG), Generative Pre-trained Transformer 3 (GPT-3), OpenAI GPT-3 API, Gemini, Anthropic, Claude, Gemini API, Generative AI, Llama 2, Gunicorn, Predictive Modeling, Natural Language Understanding (NLU), Forecasting, Stock Price Analysis, Stock Market Technical Analysis, Financial Marketing, Big Data, Social Media Analytics, Sequence Models, Data Cleaning, Google Cloud ML, Customer Segmentation, MLflow, Prefect, Trend Analysis, Generative Pre-trained Transformers (GPT), Open-source LLMs, HTML Parsing, Job Schedulers, BERT
