Daniel Bredun
Verified Expert in Engineering
Data Scientist and Developer
Daniel is a data scientist and engineer who is a whiz at the data lifecycle. He excels at crafting efficient data pipelines, designing databases, conducting advanced analyses, and harnessing machine learning. Coupled with his proficiency in cloud storage systems, Daniel has consistently driven business success. Even in the face of challenging constraints, his passion for problem-solving ensures top-tier, long-term solutions.
Preferred Environment
PyCharm, MacOS
The most amazing...
...data collection I've done was from an ancient public API, boosting it from 10 to 60,000 data points per minute by reverse-engineering their web portal requests.
Work Experience
Senior Data Science and Engineering Consultant
New Columbia Solar
- Designed and deployed a relational data warehouse and object-oriented data pipeline for asset management data on AWS.
- Prevented over $40,000 in monthly profit losses with an automated predictive model for prompt anomaly detection.
- Achieved a 9% revenue increase from new assets by identifying performance factors in existing ones.
- Reduced maintenance time from nine to three days by building a custom web application for asset monitoring, contributing to a 10% efficiency increase.
- Led a team of three to automate investor reporting, saving over 100 hours of manual work monthly and reducing costs by 12%.
Data Analyst
Movement of Mothers
- Reconciled court case data from various sources and analyzed it, informing the legislature in California.
- Designed and executed a systematic, unbiased survey to gather critical data, facilitating insightful analysis and decision-making.
- Worked with stakeholders across multiple nonprofit organizations to gather and understand the data in question.
Data Science Research Assistant
The University of Chicago
- Deployed machine learning (ML) models using free and proprietary tools, such as Kubernetes and funcX, for scalable use by the scientific community.
- Collaborated on developing a platform for publishing and sharing AI models for research purposes.
- Authored ML models predicting the physical properties of new compounds based on their chemical composition.
Data Analytics and Engineering
Tesla
- Reduced data storage costs by migrating from Vertica to a data lake using Parquet on Amazon S3. The migration was accomplished via Hudi on Apache Spark.
- Diagnosed and resolved inefficiency in data replication by automating table schema synchronization.
- Sped up PostgreSQL data replication by 300% by migrating it from ETL to Apache Kafka data streaming.
Junior Data Analyst
Prodigal Sun Solar
- Increased a client's revenue by 5% through a hierarchical statistical hypothesis test to compare solar panel manufacturers.
- Devised a creative optimization for the API calling procedure, reducing its time from 3.65 days to 53 seconds.
- Built an automated ETL system in Python for processing XML, JSON, and CSV data from solar APIs.
Experience
HEAReader: Sync-reading Books Voiced by Real Humans
https://github.com/Breedoon/BookSync
MDtoLongPDF: Converting Markdown to Pageless PDFs
https://github.com/Breedoon/MDtoLongPDF
MDtoLongPDF is a tool intended to solve this issue by converting unpaginated formats like Markdown and HTML into a single, extensive PDF page. This tool eliminates unnecessary page breaks, enabling seamless content rendering. I personally rely on it for creating documents and resumes.
AdmitMe
Certifications
MTA: SQL Development
Microsoft
MTA: Python Development
Microsoft
Skills
Libraries/APIs
Pandas, NumPy, Matplotlib, PySpark, PyTorch, TensorFlow, Scikit-learn, Google Sheets API, REST APIs, DeepSpeech, Vue, Protobuf, Node.js
Tools
PyCharm, Git, GitHub, Apache Airflow, Jupyter, Google Sheets, Tableau, Jira, AWS IAM, Prince XML, Microsoft Excel, Amazon Athena, RabbitMQ, BigQuery, Terraform
Languages
Python, SQL, R, Bash, JavaScript, Java, Markdown, HTML, Swift 5, C++, GraphQL, Snowflake, Scala
Paradigms
Data Science, ETL, Test-driven Deployment, DevOps, Business Intelligence (BI), Microservices
Platforms
MacOS, Amazon Web Services (AWS), Amazon EC2, Docker, Ubuntu, Apache Kafka, Apache Hudi, Kubernetes, Anaconda, Google Cloud Platform (GCP), Oracle, Databricks, Firebase
Storage
PostgreSQL, Amazon S3 (AWS S3), Database Administration (DBA), Database Migration, Data Pipelines, Databases, PL/SQL, NoSQL, Amazon Aurora, Redshift, MySQL, Data Lakes, Database Replication, Vertica, InfluxDB, MongoDB, Cloud Firestore
Frameworks
Apache Spark, Spark, Presto, Django
Other
Data Engineering, Data Analysis, Data Visualization, Data Warehousing, Data Reporting, Database Optimization, Data Mapping, Machine Learning, Amazon RDS, Data Warehouse Design, Neural Networks, Time Series Analysis, APIs, Hypothesis Testing, RESTful Services, Dashboard Design, Dashboards, Data Modeling, Data Analytics, Message Queues, Data Scientist, CI/CD Pipelines, Proof of Concept (POC), Performance Optimization, Data Architecture, Leadership, Data Cleansing, Parquet, Deep Learning, Pandoc, Web Scraping, Modeling, Statistics, ECharts, Machine Learning Operations (MLOps), Data Build Tool (dbt), Data Quality Analysis