Milos Grubjesic
Verified Expert in Engineering
Data Engineer and Developer
Milos is a data scientist and engineer with 15+ years of experience in big data and machine learning with Python, Scala, Spark, and other data engineering technologies. He has delivered data solutions for Christie's Auction House, Syndigo (retail), car pricing predictions, fraud detection, commodities trading, and more. Milos's industry experience is backed by a master's degree in computer science.
Preferred Environment
Databricks, Linux, Python, Spark, PyCharm, Docker, MacOS, Machine Learning Operations (MLOps)
The most amazing...
...complete system I've implemented fights scams and fake users on an online dating site, and catching some really bad guys felt really good.
Work Experience
Machine Learning Operations Engineer
PepsiCo
- Migrated a machine learning project to Kubernetes and Kubeflow.
- Set up a machine learning operations platform used by the eCommerce team at PepsiCo.
- Worked on the migration of the machine learning product from AWS to Azure.
Python Engineer
Databricks
- Created a system for retrieving Qualys vulnerability scan reports for containers and DBS. The process involved extracting data from the API, triaging the data, and manipulating Jira tickets.
- Created a system for extracting GitHub alerts through GraphQL API calls, merging them with internal data, and triaging and cleaning the data.
- Implemented migration and updates to various Spark jobs.
- Used the Qualys API to run on-demand scans for container bundles, retrieved vulnerability data, and augmented it with internal data sources.
Python Back-end Engineer
GameAnalytics
- Developed the API integrations for an analytics platform, including Unity Ads, AppsFlyer, and Adjust.
- Created Docker images and integrated the services into a client's infrastructure.
- Analyzed the retrieved data and extracted meaningful knowledge.
- Created ETL data ingestion pipelines with Dagster.
Scala and Spark Developer
Syndigo
- Developed the data pipelines for an enterprise client to process large amounts of data daily in Scala and Spark.
- Created hundreds of notebooks on Azure Databricks and set up the ETL process to clean, deduplicate, and aggregate data using Scala and Spark.
- Built a custom Python framework to quickly update notebook batches, saving the client a lot of time and money.
Machine Learning and Machine Learning Operations Engineer
AlgoDriven
- Created various ML models for a used car dealership application used throughout the GCC countries, including Saudi Arabia, Kuwait, the United Arab Emirates, Qatar, Bahrain, and Oman.
- Deployed various ML models to production, ensuring services were scalable and not interrupted by updates and fixes.
- Created pipelines for automated data retrieval, processing, cleaning, deduplication, and augmentation, as well as model building, validation, and deployment to production.
Data Scientist (Freelance)
Jaumo
- Organized an administration team and created an automated system for Jaumo, a popular online dating system, to quickly identify and deal with threats to the platform, such as fake users and scams.
- Created and defined a machine learning operations process.
- Analyzed large amounts of data and gained insights from business owners.
- Implemented metrics for estimating a fake user ratio, developed an artificial user classifier, and introduced local interpretability modeling. All this was executed in the Python ecosystem.
Data Scientist (Freelance)
Christie's
- Analyzed fine art data and gained insights to develop algorithms in the Python ecosystem for this famous auction house.
- Developed algorithms, such as an artist's index, popularity index, demand index, and fine art comparables.
- Enabled matchmaking for artists, customer analyses, recommendations, artwork collection value estimation, and more.
- Leveraged data from multiple sources to help the marketing team find new customers.
Java Software Developer
Custom Software and IT Services Companies
- Implemented Java web applications and conducted white hat penetration testing on one of them.
- Handled a complex Java application related to diseases for Danish customers and tested it under a high load of requests.
- Maintained Linux machines as a junior administrator.
- Sniffed network traffic, used various tools to collect passwords, and immediately informed IT support about the findings to strengthen security.
Experience
Data Scientist | Predictive Analytics
Data Scientist | Linear TV Viewership Forecasts
https://videoamp.com/
Press release: Globenewswire.com/en/news-release/2018/03/05/1415186/0/en/VideoAmp-s-Oscar-Prediction-Algorithm-Proves-Accurate.html
Python Library for Kubeflow Pipelines
Skills
Languages
Python, Python 3, Scala, SQL, R, YAML, GraphQL, Snowflake
Frameworks
Spark, Flask, ASM, Apache Spark, gRPC, Hydra
Libraries/APIs
Pandas, NumPy, SciPy, Scikit-learn, Caret, REST APIs, Spark ML, Protobuf, PySpark, Jira REST API, CatBoost
Tools
PyCharm, Jupyter, Spark SQL, IntelliJ IDEA, Git, GitHub, Apache Avro, Wireshark, Code Climate, Codecov, BigQuery, Pytest, Sentry, Amazon SageMaker, AWS Glue, Amazon Athena, Azure Machine Learning
Other
Machine Learning, Back-end, APIs, Regression, Classification, Artificial Intelligence (AI), Time Series Analysis, Predictive Analytics, Data Analysis, Data Engineering, Algorithms, Statistical Analysis, Data, Data Visualization, Predictive Learning, Predictive Modeling, Data Modeling, Data Mining, Time Series, Forecasting, Neural Networks, User-defined Functions (UDF), Apache Cassandra, Computer Science, Big Data, Web Applications, Azure Data Lake, GitFlow, API Integration, Adjust, CI/CD Pipelines, Poetry, Dagster, Vulnerability Management, Containerization, Containers, Analytics, Machine Learning Operations (MLOps), Kubernetes Operations (kOps), Amazon Machine Learning, Azure Databricks, KServe, MLMD, OpenAI GPT-4 API
Paradigms
ETL, Data Science, Functional Programming, Aspect-oriented Programming, Penetration Testing, Test-driven Development (TDD), Business Intelligence (BI)
Platforms
Jupyter Notebook, Amazon EC2, Linux, Docker, Amazon Web Services (AWS), Azure, Databricks, RStudio, AppsFlyer, QualysGuard, Kubernetes, Kubeflow, Nexus, MacOS
Storage
Databases, Data Pipelines, NoSQL, PostgreSQL, MySQL, MongoDB, Datadog
Education
Master's Degree in Computer Science
Faculty of Technical Sciences - Novi Sad, Serbia
Certifications
Deep Learning Nanodegree
Udacity
Scalable Machine Learning
EdX
Machine Learning
Coursera