
Adrian Dominiczak
Verified Expert in Engineering
Data Engineer and Developer
Warsaw, Poland
Toptal member since July 21, 2020
Adrian is a senior big data engineer with nearly a decade of professional experience. Adrian started his career as a software engineer at Samsung's R&D and has worked on a range of projects, from machine learning and big data engineering in the banking and pharmaceutical industries to big data and cloud architecture at Santander and Lingaro. Adrian's expertise lies mainly in Hadoop and Spark.
Experience
- Java - 8 years
- Python - 8 years
- Data Engineering - 6 years
- Big Data - 5 years
- Spark - 5 years
- Yarn - 5 years
- Amazon Web Services (AWS) - 4 years
- Big Data Architecture - 3 years
Preferred Environment
IntelliJ IDEA, PyCharm, Linux
The most amazing...
...thing I've done was optimizing a Spark app that measures the accuracy of ML models monitoring the health statuses of the client's machines.
Work Experience
Big Data and ML Engineer
Roche
- Designed, implemented, and productized Spark software that monitors the accuracy of statistical models tracking medical machines' health statuses.
- Designed and improved the project structure by refactoring existing projects before deployments in the areas of automatic medical document generation and knowledge retrieval from medical documents.
- Designed and developed solutions for processing, auto-generation, and knowledge extraction from medical-origin documents.
Big Data Architect | Technical Leader
Lingaro
- Represented a software house and prepared an offer containing the architecture design, scope, and pricing for a project connecting several independent data platforms, with batch and near-real-time (NRT) generated data, to a data mart and dashboarding developed in MS Azure.
- Provided architecture and team lead support in an acquired project.
- Analyzed the business needs of clients and translated them into technical requirements.
- Coordinated the project’s development and delivery using the agile methodology.
- Took part in improving and refactoring code along with mentoring younger developers.
- Took part in sales activity.
Big Data Architect
Santander Consumer Technology Services GmbH
- Monitored and improved production Hadoop clusters, ETL processes, and resource utilization.
- Coordinated projects by serving as a single point of contact for stakeholders from the business domain and a team of developers; also monitored, planned, and reported on projects before going live.
- Mentored and managed a small team of junior developers along with leading the development of a PySpark reporting application using the agile methodology.
- Set up development environments and test deployments of software from external providers; also created reports, documentation, and tutorials.
- Analyzed the architectures, functionalities, and performance of solutions from external providers.
- Attended meetings with external software providers including managers and architects.
Big Data and ML Engineer
Roche
- Served as a machine learning and big data expert during the acquisition of external software (implemented in AWS) for extracting data from medical-origin documents; also prepared the internal knowledge transfer to a support team.
- Designed and improved project structures by refactoring existing projects on demand before deployments.
- Designed and developed solutions for medical-origin document analysis, processing, and auto-generation.
Big Data Engineer
mBank S.A.
- Implemented algorithmic trading software (with an ML approach) that traded S&P 500 stocks live.
- Designed and implemented ML-based credit-scoring models.
- Implemented a web service for custom visualizations of business data hosted on a Hadoop cluster.
Software Engineer
Samsung Electronics Poland, R&D Center, Artificial Intelligence Group
- Designed, implemented, and supported a module in an NLP user utterance recognition engine.
- Implemented a web service platform used internally by linguists as a tool for gathering, cleaning, and tagging data sets used for training machine learning models for NLP (natural language processing).
- Implemented a closed-domain knowledge database and web scrapers used as sourcing tools.
- Implemented connectors from Prolog to Java in order to utilize knowledge databases stored in Prolog format in internal Java libraries building statistical models in the NLP domain.
Programmer
Polish Academy of Sciences
- Found a method to accurately recognize and distinguish bone internal structure based on scattered ultrasound signals using machine learning and time series analysis methods.
- Proposed a new method to recognize skin cancer changes based on ultrasound signals using advanced time series analysis and a complex networks mathematical framework.
- Utilized a new way of researching medical origin time series using mathematical frameworks for mapping between time series and complex networks.
Experience
Algorithmic Trading
I implemented modules for training and using built models for daily predictions and took part in the discussion about mathematical approaches for portfolio handling and rebalancing. I also integrated data from a range of data sources: internet, data providers, and so on.
Data Mart in MS Azure with a Dashboard
I architected, designed, and supported the development of an MS Azure cloud solution that synced independent data platforms with various frequencies of generated data, from one-day batches to NRT. I also designed the ETL pipelines, data storage, data mart, and a fast, efficient dashboarding solution.
Statistical Models Validation Software
I optimized and implemented the advanced PoC algorithm and prepared it for production deployment.
User Utterance Recognition
I designed and implemented the framework in plain Java; it was intended to be used as an internal library. The framework performed sentence recognition using a hybrid engine, which was fed by both machine-learning-based and rule-based predictors.
Education
Master of Science (MSc) Degree in Applied Physics
Warsaw University of Technology, Faculty of Physics - Warsaw, Poland
Bachelor of Science (BSc) Degree in Physics
Warsaw University of Technology, Faculty of Physics - Warsaw, Poland
Certifications
Essential Google Cloud Infrastructure: Foundation
Coursera
Google Cloud Platform Fundamentals: Core Infrastructure
Coursera
Essential Google Cloud Infrastructure: Core Services
Coursera
Skills
Libraries/APIs
Pandas
Tools
PyCharm, IntelliJ IDEA, MATLAB, Mathematica, Weka, Apache Sqoop, GitLab CI/CD, Bamboo, Cloudera, Kudu, Microsoft Power BI, Apache Airflow
Languages
Python, Java, SQL, JavaScript, Prolog, Scala, R, Bash
Frameworks
Spark, Hadoop, Yarn, Play
Paradigms
ETL Implementation & Design, ETL
Platforms
Amazon Web Services (AWS), Google Cloud Platform (GCP), Linux, Docker, Kubernetes
Storage
Databases, H2, Elasticsearch, HDFS, Apache Hive, Redis
Other
Big Data, Data Analytics, Data Engineering, Big Data Architecture, Applied Mathematics, Machine Learning, Data Science, Statistics, Computational Physics, Conda, RHEL, Microsoft Azure