Daniel Bredun
Verified Expert in Engineering
Data Scientist and Developer
Rzeszow, Poland
Toptal member since August 29, 2023
Daniel is a data scientist and engineer skilled across the entire data lifecycle. He excels at crafting efficient data pipelines, designing databases, conducting advanced analyses, and harnessing machine learning. Combined with his proficiency in cloud storage systems, these skills have consistently driven business success. Even in the face of challenging constraints, his passion for problem-solving ensures top-tier, long-term solutions.
Preferred Environment
PyCharm, MacOS
The most amazing...
...data collection I've done was from an ancient public API, where I boosted throughput from 10 to 60,000 data points per minute by reverse-engineering its web portal requests.
Work Experience
SQL Data Engineer
StubHub
- Co-led the migration of an ERP system from SQL Server to Snowflake and dbt, speeding up journal generation 10-fold while significantly improving the development experience.
- Led a major internal SQL Server database refactoring project, reducing system issues by 50% and saving 20 hours of employee time per month.
- Instituted documentation of critical processes that had previously been shared informally, significantly shortening new joiners' time to autonomy.
Senior Integration Engineer
New Columbia Solar
- Led a comprehensive integration project to connect five internal software tools (Salesforce, AWS RDS database, Excel, Contract Logix, Intacct), saving 500+ hours of manual work monthly.
- Worked closely with the COO to implement nuanced business logic within Salesforce, including finances, inventory management, budget forecasting, asset management, and sales. This resulted in 70% of employees moving to Salesforce from ad-hoc spreadsheets.
- Migrated gigabytes of company data from disorganized spreadsheets into Salesforce, significantly speeding up employee adoption of the platform.
Data Analyst
Movement of Mothers
- Reconciled court case data from various sources and analyzed it, informing the legislature in California.
- Designed and executed a systematic, unbiased survey to gather critical data, facilitating insightful analysis and decision-making.
- Worked with stakeholders across multiple nonprofit organizations to gather and understand the data in question.
Data Science Research Assistant
The University of Chicago
- Deployed machine learning (ML) models using free and proprietary tools, such as Kubernetes and funcX, for scalable use by the scientific community.
- Collaborated on developing a platform for publishing and sharing AI models for research purposes.
- Authored ML models predicting the physical properties of new compounds based on their chemical composition.
Senior Data Science and Engineering Consultant
New Columbia Solar
- Designed and deployed a relational data warehouse and object-oriented data pipeline for asset management data on AWS.
- Saved over $40,000 monthly in lost profits through an automated predictive model for prompt anomaly detection.
- Achieved a 9% revenue increase from new assets by identifying performance factors in existing ones.
- Reduced maintenance time from nine to three days by building a custom web application for asset monitoring, contributing to a 10% efficiency increase.
- Led a team of three to automate investor reporting, saving over 100 hours of manual work monthly and reducing costs by 12%.
Data Analytics and Engineering
Tesla
- Reduced data storage costs by migrating from Vertica to a data lake using Parquet on Amazon S3. The migration was accomplished via Hudi on Apache Spark.
- Diagnosed and resolved inefficiency in data replication by automating table schema synchronization.
- Sped up PostgreSQL data replication by 300% by migrating it from ETL to Apache Kafka data streaming.
Junior Data Analyst
Prodigal Sun Solar
- Increased a client's revenue by 5% through a hierarchical statistical hypothesis test comparing solar panel manufacturers.
- Devised a creative optimization for an API calling procedure, reducing its run time from 3.65 days to 53 seconds.
- Built an automated ETL system in Python for processing XML, JSON, and CSV data from solar APIs.
Experience
HEAReader: Sync-reading Books Voiced by Real Humans
https://github.com/Breedoon/BookSync
MDtoLongPDF: Converting Markdown to Pageless PDFs
https://github.com/Breedoon/MDtoLongPDF
MDtoLongPDF is a tool that converts unpaginated formats like Markdown and HTML into a single, extensive PDF page. It eliminates unnecessary page breaks, enabling seamless content rendering. I personally rely on it for creating documents and resumes.
AdmitMe
Certifications
MTA: SQL Development
Microsoft
MTA: Python Development
Microsoft
Skills
Libraries/APIs
Pandas, NumPy, Matplotlib, PySpark, PyTorch, TensorFlow, Scikit-Learn, Google Sheets API, REST API, DeepSpeech, Vue.js, Protobuf, Node.js, Salesforce API, Google APIs
Tools
PyCharm, Git, GitHub, Apache Airflow, Jupyter, Google Sheets Development, Tableau Development, Jira, AWS IAM, Prince XML, Pandoc, Excel Development, AWS, RabbitMQ, BigQuery, Terraform, AWS Glue, Apex
Languages
Python, SQL, R, Bash, JavaScript, Java, Markdown, HTML, Swift 5, C++, GraphQL, Snowflake, Scala, T-SQL, Salesforce Object Query Language (SOQL), Apex
Paradigms
ETL, Test-driven Deployment, DevOps, Business Intelligence Development, Microservices Development, B2B Design, B2C
Platforms
MacOS, AWS, Salesforce Design, Amazon EC2, Docker, Ubuntu, Apache Kafka, Apache Hudi, Kubernetes, Data Science, Cloud Engineering, Oracle Development, Databricks, Firebase, AWS Lambda, Linux, HubSpot Development, Azure Design
Storage
PostgreSQL, Amazon S3, Database Administration (DBA), Database Migration, Database, PL/SQL, NoSQL, Amazon Aurora, Redshift, MySQL, Data Lakes, Database Replication, Vertica, InfluxDB, MongoDB, Firestore, SQL Server
Frameworks
Big Data Architecture, Hadoop, Spark, Presto, Django
Other
Data Engineering, Data Analysis, Data Science, Data Visualization, Data Warehouse, Database Optimization, Data Mapping, Cloud Engineering, Machine Learning, AWS RDS, Neural Network, Time Series Analysis, APIs, Hypothesis Testing, RESTful Services, Dashboard Design, Dashboard, Data Modeling, Message Queues, CI/CD Pipelines, Proof of Concept (POC), Performance Optimization, Data Architecture, Leadership, Data Migration, Data Classification, Excel 365, Data Cleaning, Artificial Intelligence, LLM, Analytics Development, Big Data Architecture, Cloud Platforms, Apache, Deep Learning, Web Scraping, Modeling, Statistics, ECharts, Machine Learning Operations (MLOps), Data Build Tool (dbt), Data Quality Analysis, Geotechnical Engineering, Microsoft Development, CRM APIs, Apex Classes, Apex Triggers