Daniel Bredun, Developer in Rzeszow, Poland

Daniel Bredun

Verified Expert in Engineering

Bio

Daniel is a data scientist and engineer with a strong command of the entire data lifecycle. He excels at crafting efficient data pipelines, designing databases, conducting advanced analyses, and harnessing machine learning. Combined with his proficiency in cloud storage systems, these skills have consistently driven business success. Even in the face of challenging constraints, his passion for problem-solving ensures top-tier, long-term solutions.

Portfolio

StubHub
SQL, T-SQL, SQL Server, Snowflake, Data Build Tool (dbt), Data Migration...
New Columbia Solar
Salesforce Design, Salesforce API, SOQL...
Movement of Mothers
Data Analysis, SQL, Data Visualization, Data Science, Data Classification...

Experience

Availability

Full-time

Preferred Environment

PyCharm, macOS

The most amazing...

...data collection I've done was from an ancient public API, boosting it from 10 to 60,000 data points per minute by reverse-engineering their web portal requests.
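A minimal sketch of that approach, assuming a hypothetical batch endpoint and page size (the actual portal's URL, headers, and parameters are not reproduced here):

```python
import json
import urllib.parse
import urllib.request

# Hypothetical batch endpoint discovered in the portal's browser devtools;
# the public API returned one data point per call, this returns up to 1,000.
BATCH_URL = "https://example.com/portal/api/datapoints"
PAGE_SIZE = 1000

def batch_params(start: int, n_points: int, page_size: int = PAGE_SIZE) -> list:
    """Split a range of points into the paged requests the portal itself makes."""
    return [
        {"offset": off, "limit": min(page_size, start + n_points - off)}
        for off in range(start, start + n_points, page_size)
    ]

def collect(n_points: int) -> list:
    """Replay the portal's own paginated requests instead of the 1-point public API."""
    points = []
    for params in batch_params(0, n_points):
        url = BATCH_URL + "?" + urllib.parse.urlencode(params)
        req = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
        with urllib.request.urlopen(req) as resp:
            points.extend(json.load(resp)["data"])
    return points
```

Replaying the portal's paginated requests turns one call per point into one call per thousand points, which is where throughput gains of this magnitude come from.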

Work Experience

SQL Data Engineer

2024 - 2024
StubHub
  • Co-led the migration of an ERP system from SQL Server to Snowflake and dbt, speeding up journal generation 10-fold while significantly improving the development experience.
  • Led a major internal SQL Server database refactoring project, reducing system issues by 50% and saving 20 hours of employee time per month.
  • Instituted documentation of critical processes that had previously been shared informally, significantly shortening new joiners' time to autonomy.
Technologies: SQL, T-SQL, SQL Server, Snowflake, Data Build Tool (dbt), Data Migration, Data Classification, Azure Design, B2C, Big Data Architecture

Senior Integration Engineer

2022 - 2024
New Columbia Solar
  • Led a comprehensive integration project to connect five internal software tools (Salesforce, AWS RDS database, Excel, Contract Logix, Intacct), saving 500+ hours of manual work monthly.
  • Worked closely with the COO to implement nuanced business logic within Salesforce, including finances, inventory management, budget forecasting, asset management, and sales. This resulted in 70% of employees moving to Salesforce from ad-hoc spreadsheets.
  • Migrated gigabytes of different company data from disorganized spreadsheets into Salesforce, significantly speeding up employee adoption of Salesforce.
Technologies: Salesforce Design, Salesforce API, Salesforce Object Query Language (SOQL), HubSpot Development, Microsoft Development, Google APIs, REST API, CRM APIs, Apex, Apex Classes, Apex Triggers, B2B Design, Cloud Engineering, Cloud Platforms
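The spreadsheet-to-Salesforce migration pattern can be sketched as follows; the object name (`Solar_Asset__c`), field names, and external-ID field are hypothetical placeholders, and the REST upsert-by-external-ID call shown is Salesforce's standard API, not necessarily the exact mechanism the project used:

```python
import json
import urllib.request

# Hypothetical mapping from spreadsheet columns to Salesforce custom fields.
FIELD_MAP = {
    "Site Name": "Name",
    "Capacity (kW)": "Capacity_kW__c",
    "Commissioned": "Commissioned_Date__c",
}

def to_sobject(row: dict) -> dict:
    """Normalize one spreadsheet row into a Salesforce sObject payload,
    dropping empty cells so they don't overwrite existing values."""
    return {sf: row[col] for col, sf in FIELD_MAP.items()
            if row.get(col) not in (None, "")}

def upsert(instance_url: str, token: str, external_id: str, row: dict) -> None:
    """Upsert a record by external ID via the Salesforce REST API (PATCH)."""
    url = (f"{instance_url}/services/data/v58.0/sobjects/"
           f"Solar_Asset__c/Legacy_Id__c/{external_id}")
    req = urllib.request.Request(
        url,
        data=json.dumps(to_sobject(row)).encode(),
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="PATCH",
    )
    urllib.request.urlopen(req)  # raises on non-2xx responses

if __name__ == "__main__":
    upsert("https://example.my.salesforce.com", "TOKEN", "SITE-001",
           {"Site Name": "Depot Roof A", "Capacity (kW)": 412.5})
```

Upserting by external ID makes the sync idempotent: re-running a migration batch updates rather than duplicates records.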

Data Analyst

2023 - 2023
Movement of Mothers
  • Reconciled and analyzed court case data from multiple sources, producing findings that informed the California legislature.
  • Designed and executed a systematic, unbiased survey to gather critical data, facilitating insightful analysis and decision-making.
  • Worked with stakeholders across multiple nonprofit organizations to gather and understand the data in question.
Technologies: Data Analysis, SQL, Data Visualization, Data Science, Data Classification, Data Cleaning, B2C

Data Science Research Assistant

2022 - 2023
The University of Chicago
  • Deployed machine learning (ML) models using free and proprietary tools, such as Kubernetes and funcX, for scalable use by the scientific community.
  • Collaborated on developing a platform for publishing and sharing AI models for research purposes.
  • Authored ML models predicting the physical properties of new compounds based on their chemical composition.
Technologies: Data Science, Kubernetes, Neural Network, PyTorch, Docker, PyCharm, Statistics, Machine Learning Operations (MLOps), Git, Ubuntu, Data Modeling, Machine Learning, Python, Scikit-Learn, Jupyter, Microservices Development, Leadership, Data Classification, Data Cleaning, Artificial Intelligence, Analytics Development, Cloud Engineering, Cloud Platforms
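A toy illustration of the composition-to-property idea using ordinary least squares (the research models were PyTorch neural networks with richer descriptors; the elements and training data here are illustrative only):

```python
import numpy as np

# Featurize a compound as the fraction of each element it contains.
ELEMENTS = ["Fe", "Ni", "Cr"]

def featurize(composition: dict) -> np.ndarray:
    total = sum(composition.values())
    return np.array([composition.get(e, 0.0) / total for e in ELEMENTS])

# Tiny illustrative training set: alloy composition -> density (g/cm^3).
train = [({"Fe": 1.0}, 7.87), ({"Ni": 1.0}, 8.91),
         ({"Cr": 1.0}, 7.19), ({"Fe": 0.5, "Ni": 0.5}, 8.4)]
X = np.array([featurize(c) for c, _ in train])
y = np.array([d for _, d in train])

# Fit least-squares coefficients: one learned weight per element.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

def predict(composition: dict) -> float:
    """Predict the property of an unseen compound from its composition."""
    return float(featurize(composition) @ coef)
```

The same pipeline shape (featurize composition, fit, predict) carries over when the linear model is swapped for a neural network.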

Senior Data Science and Engineering Consultant

2019 - 2023
New Columbia Solar
  • Designed and deployed a relational data warehouse and object-oriented data pipeline for asset management data on AWS.
  • Saved over $40,000 monthly in lost profits through an automated predictive model for prompt anomaly detection.
  • Achieved a 9% revenue increase from new assets by identifying performance factors in existing ones.
  • Reduced maintenance time from nine to three days by building a custom web application for asset monitoring, contributing to a 10% efficiency increase.
  • Led a team of three to automate investor reporting, saving over 100 hours of manual work monthly and reducing costs by 12%.
Technologies: Apache Airflow, PostgreSQL, AWS, Python, Statistics, Data Warehouse, Time Series Analysis, Pandas, Google Sheets API, RESTful Services, Cloud Engineering, Google Sheets Development, Dashboard Design, Dashboard, Data Modeling, REST API, Database, PL/SQL, Business Intelligence Development, Machine Learning, Data Engineering, Data Science, PyCharm, AWS RDS, Amazon EC2, Redshift, AWS IAM, Amazon S3, ECharts, Vue.js, DevOps, APIs, NumPy, Django, Jupyter, Database Administration (DBA), SQL, Excel Development, JavaScript, GitHub, ETL, Data Build Tool (dbt), CI/CD Pipelines, Node.js, Microservices Development, Proof of Concept (POC), Jira, Performance Optimization, Data Architecture, Leadership, Data Quality Analysis, Database Migration, Firebase, Amazon Aurora, Database Optimization, Terraform, Data Mapping, AWS Glue, AWS Lambda, Linux, Salesforce Object Query Language (SOQL), Salesforce API, Data Migration, Salesforce Design, Data Classification, Excel 365, Data Cleaning, Artificial Intelligence, Analytics Development, B2B Design, Cloud Platforms
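A hedged sketch of the anomaly-detection idea behind the savings above: flag production readings that deviate sharply from their recent history (the deployed model was more involved; the window and threshold here are illustrative):

```python
from statistics import mean, stdev

def rolling_anomalies(series, window=24, threshold=3.0):
    """Flag readings more than `threshold` std devs from the trailing window.

    `series` is a list of e.g. hourly power readings; the first `window`
    points have no history yet, so they are never flagged.
    """
    flags = []
    for i, x in enumerate(series):
        if i < window:
            flags.append(False)
            continue
        hist = series[i - window:i]
        mu, sigma = mean(hist), stdev(hist)
        # Guard against a flat window (sigma == 0) to avoid division noise.
        flags.append(sigma > 0 and abs(x - mu) > threshold * sigma)
    return flags
```

Running this over each asset's feed surfaces sudden production drops promptly, instead of waiting for them to show up in monthly reports.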

Data Analytics and Engineering

2022 - 2022
Tesla
  • Reduced data storage costs by migrating from Vertica to a Parquet-based data lake on Amazon S3, using Apache Hudi on Spark.
  • Diagnosed and resolved inefficiency in data replication by automating table schema synchronization.
  • Sped up PostgreSQL data replication by 300% by migrating it from ETL to Apache Kafka data streaming.
Technologies: Spark, PySpark, MySQL, Apache Kafka, Amazon S3, Apache Hudi, Data Lakes, Apache, Database Replication, Kubernetes, Docker, Vertica, InfluxDB, Presto, Pandas, PyCharm, Git, Bash, Data Engineering, Ubuntu, REST API, Database, PL/SQL, Oracle Development, Data Warehouse, Python, Test-driven Deployment, Protobuf, NumPy, SQL, GitHub, ETL, Message Queues, CI/CD Pipelines, Microservices Development, RabbitMQ, Jira, Big Data Architecture, Performance Optimization, BigQuery, Snowflake, Data Science, Databricks, Database Migration, NoSQL, Firestore, Database Optimization, Scala, Data Mapping, Data Cleaning, Hadoop, Cloud Platforms
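The schema-synchronization fix can be sketched as a diff between source and target column maps; a production version might read `information_schema.columns` on both sides, but the column dicts and DDL below are a simplified illustration:

```python
def schema_diff(table: str, source: dict, target: dict) -> list:
    """Generate DDL to bring `target`'s schema in line with `source`'s.

    Schemas are {column_name: sql_type} dicts. Drops and renames are
    deliberately excluded so the sync stays additive and replication-safe.
    """
    stmts = []
    for col, sql_type in source.items():
        if col not in target:
            stmts.append(f"ALTER TABLE {table} ADD COLUMN {col} {sql_type}")
        elif target[col].lower() != sql_type.lower():
            stmts.append(f"ALTER TABLE {table} ALTER COLUMN {col} TYPE {sql_type}")
    return stmts
```

Automating this diff removes the manual step where replication silently broke whenever an upstream table gained a column.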

Junior Data Analyst

2019 - 2019
Prodigal Sun Solar
  • Increased client's revenue by 5% through a hierarchical statistical hypothesis test to compare solar panel manufacturers.
  • Devised a creative optimization of an API-calling procedure, cutting its runtime from 3.65 days to 53 seconds.
  • Built an automated ETL system in Python for processing XML, JSON, and CSV data from solar APIs.
Technologies: Data Analysis, R, Pandas, NumPy, Scikit-Learn, Hypothesis Testing, Git, PostgreSQL, Data Visualization, Matplotlib, RESTful Services, Dashboard Design, Tableau Development, Dashboard, Data Modeling, REST API, Database, Data Science, Business Intelligence Development, PyCharm, Python, APIs, GitHub, MongoDB, Leadership, Data Quality Analysis, Data Mapping, Data Cleaning, Artificial Intelligence, Analytics Development, B2B Design, Cloud Engineering, Cloud Platforms
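A minimal stdlib sketch of the multi-format ETL idea: normalize XML, JSON, and CSV payloads into one record shape (the field names `timestamp` and `power_kw` are hypothetical, not the solar APIs' actual schema):

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Every parser emits the same record shape: {"ts": str, "kw": float},
# so downstream loading code never cares which API a reading came from.

def from_json(text: str) -> list:
    return [{"ts": r["timestamp"], "kw": float(r["power_kw"])}
            for r in json.loads(text)]

def from_csv(text: str) -> list:
    return [{"ts": row["timestamp"], "kw": float(row["power_kw"])}
            for row in csv.DictReader(io.StringIO(text))]

def from_xml(text: str) -> list:
    root = ET.fromstring(text)
    return [{"ts": el.findtext("timestamp"), "kw": float(el.findtext("power_kw"))}
            for el in root.iter("reading")]
```

With one canonical record shape, adding a new source API means writing one parser, not touching the load stage.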

HEAReader: Sync-reading Books Voiced by Real Humans

https://github.com/Breedoon/BookSync
I developed HEAReader, an app for reading books in sync with their human-narrated audiobooks. It uses a TensorFlow-based algorithm to match each word of the audiobook to the corresponding word in the text, enabling synchronous reading. I also learned Swift and built an iOS app as a proof of concept (POC) for the algorithm.
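A rough sketch of word-to-word matching using stdlib sequence alignment (HEAReader's actual algorithm is TensorFlow-based and works from audio; this only illustrates the alignment step on an already-transcribed text):

```python
import difflib
import re

def tokenize(text: str) -> list:
    """Lowercase words only, so punctuation differences don't break alignment."""
    return re.findall(r"[a-z']+", text.lower())

def align(transcript_text: str, book_text: str) -> dict:
    """Map transcript word indices to book word indices.

    Mismatched words (e.g. transcription errors) are simply skipped;
    the surrounding matching runs still anchor the sync position.
    """
    spoken, book = tokenize(transcript_text), tokenize(book_text)
    sm = difflib.SequenceMatcher(a=spoken, b=book, autojunk=False)
    mapping = {}
    for block in sm.get_matching_blocks():
        for k in range(block.size):
            mapping[block.a + k] = block.b + k
    return mapping
```

Given word-level timestamps for the transcript, this mapping is enough to highlight the book word currently being narrated.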

MDtoLongPDF: Converting Markdown to Pageless PDFs

https://github.com/Breedoon/MDtoLongPDF
Pagination in PDF has become irrelevant as most documents are not intended for printing. However, page breaks still disrupt the content flow, splitting sections, breaking tables, and moving figures around, which leads to wasted space, all to serve a function that is no longer needed.

MDtoLongPDF is a tool intended to solve this issue by converting unpaginated formats like Markdown and HTML into a single, extensive PDF page. This tool eliminates unnecessary page breaks, enabling seamless content rendering. I personally rely on it for creating documents and resumes.

AdmitMe

I worked on AdmitMe, an app that helped 300+ high school graduates in Ukraine find the colleges they were most likely to get into, based on their exam scores and historical admissions data scraped from the government website. It achieved 89% accuracy.
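A toy version of the ranking idea (the real model and the scraped data aren't shown here; the empirical-proxy formula and the score lists below are illustrative only):

```python
from bisect import bisect_right

def admit_chance(score: float, past_admit_scores: list) -> float:
    """Crude empirical proxy: the share of past admits this score outranks."""
    ranked = sorted(past_admit_scores)
    return bisect_right(ranked, score) / len(ranked)

def rank_colleges(score: float, colleges: list) -> list:
    """Order colleges from most to least likely admission for this applicant.

    Each college is {"name": str, "scores": [historical admitted scores]}.
    """
    return sorted(colleges,
                  key=lambda c: admit_chance(score, c["scores"]),
                  reverse=True)
```

In practice the historical score distributions come from the scraped admissions records, one list per college and program.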

Certifications

DECEMBER 2018 - PRESENT

MTA: SQL Development

Microsoft

DECEMBER 2018 - PRESENT

MTA: Python Development

Microsoft

Libraries/APIs

Pandas, NumPy, Matplotlib, PySpark, PyTorch, TensorFlow, Scikit-Learn, Google Sheets API, REST API, DeepSpeech, Vue.js, Protobuf, Node.js, Salesforce API, Google APIs

Tools

PyCharm, Git, GitHub, Apache Airflow, Jupyter, Google Sheets Development, Tableau Development, Jira, AWS IAM, Prince XML, Pandoc, Excel Development, AWS, RabbitMQ, BigQuery, Terraform, AWS Glue, Apex

Languages

Python, SQL, R, Bash, JavaScript, Java, Markdown, HTML, Swift 5, C++, GraphQL, Snowflake, Scala, T-SQL, Salesforce Object Query Language (SOQL), Apex

Paradigms

ETL, Test-driven Deployment, DevOps, Business Intelligence Development, Microservices Development, B2B Design, B2C

Platforms

MacOS, AWS, Salesforce Design, Amazon EC2, Docker, Ubuntu, Apache Kafka, Apache Hudi, Kubernetes, Data Science, Cloud Engineering, Oracle Development, Databricks, Firebase, AWS Lambda, Linux, HubSpot Development, Azure Design

Storage

PostgreSQL, Amazon S3, Database Administration (DBA), Database Migration, Database, PL/SQL, NoSQL, Amazon Aurora, Redshift, MySQL, Data Lakes, Database Replication, Vertica, InfluxDB, MongoDB, Firestore, SQL Server

Frameworks

Big Data Architecture, Hadoop, Spark, Presto, Django

Other

Data Engineering, Data Analysis, Data Science, Data Visualization, Data Warehouse, Database Optimization, Data Mapping, Cloud Engineering, Machine Learning, AWS RDS, Neural Network, Time Series Analysis, APIs, Hypothesis Testing, RESTful Services, Dashboard Design, Dashboard, Data Modeling, Message Queues, CI/CD Pipelines, Proof of Concept (POC), Performance Optimization, Data Architecture, Leadership, Data Migration, Data Classification, Excel 365, Data Cleaning, Artificial Intelligence, LLM, Analytics Development, Big Data Architecture, Cloud Platforms, Apache, Deep Learning, Web Scraping, Modeling, Statistics, ECharts, Machine Learning Operations (MLOps), Data Build Tool (dbt), Data Quality Analysis, Geotechnical Engineering, Microsoft Development, CRM APIs, Apex Classes, Apex Triggers
