Filip Boltuzic
Verified Expert in Engineering
Machine Learning Engineer and Developer
Zagreb, Croatia
Toptal member since April 30, 2020
Filip is a machine learning engineer with several years of professional experience. He's worked on large-scale problems at Amazon Web Services as a software developer and built natural language processing models as a research associate at the University of Zagreb. Filip's main interests are machine learning and natural language processing, with an emphasis on building text classification models.
Preferred Environment
Java, Git, Linux, Docker, Apache Solr, Django, PyTorch, Pandas, NumPy, Scikit-learn, Python
The most amazing...
...machine learning model I've developed was an LSTM and CRF model to segment text into argumentative claims as part of my Ph.D. thesis.
Work Experience
Research Advisor
Online freelance agency
- Investigated, researched, and documented caching methods in software.
- Reproduced the most popular caching methods for predicting time-to-live (TTL) from research papers.
- Built a simulator and a reinforcement learning model that attempts to solve TTL prediction for object caching.
RAG/GPT4 Expert
Inflexion
- Improved an existing RAG-based tool to help the team search internal documentation more efficiently.
- Built a document processing system for different types of content (emails, DOCX attachments, Excel spreadsheets, etc.).
- Utilized RAG to implement the first version of multiple-question answering.
Technical Blog Writer
Agnostiq
- Wrote several technical blogs on various topics such as machine learning, quantum computing, cloud computing, and large language models.
- Implemented reproducible workflows across three cloud providers: AWS, Google Cloud, and Azure.
- Contributed to Covalent, the open-source workflow library.
AI and ML Developer
Aggieland Software
- Developed a large language model (LLM) LangChain bot to generate software requirements.
- Built and deployed to the cloud a multi-process application, exposed via an API, that chats with a user to generate software requirements.
- Collaborated with two teams to integrate the LLM app via APIs to provide both web and mobile application access to the LLM app.
AI Expert
PD4 Solutions LLC
- Developed an LLM-based solution to determine which scientific articles are related to user-inputted free-text criteria.
- Evaluated the LLM solution performance and demonstrated metrics proving considerable improvement over the previously implemented solution.
- Worked with ML engineers to deploy solutions and define an optimal architecture for applying the LLM solution.
Senior Data Scientist
Freelance for Lionbridge (via Newfire Global Partners)
- Developed a machine learning sequence labeling model on text data that achieved an F1 score above 0.9.
- Decreased inference time on a previously developed machine learning model without sacrificing its F1 score.
- Used PySpark and Databricks to perform a large-scale data analysis that the company employed to drive future business decisions.
- Developed multiple highly scalable Python web services that are currently serving production traffic.
Data Science Engineer
BJS
- Developed prototype product recommenders which showed customer purchasing patterns.
- Built simple AWS Lambda functions to conduct an ETL workflow.
- Worked with PySpark on large sets of data (>100GB of historical purchases).
Machine Learning Engineer
Alchemy V Ltd (via Toptal)
- Created a marketing slogan text generator using Hugging Face transformers/text generation pipelines and customer-provided data.
- Created a data ingestion and reporting process via multiple Google Cloud services: BigQuery, Cloud Functions, Cloud Endpoints, and Dataproc.
- Ported existing R reporting code to a Python web service.
Natural Language Processing (NLP) Consultant
Granville Knowledge Management (via Toptal)
- Developed a scraper to download a large (around 20,000) and diverse set of legal documents (from 1990 to the present) from a European public repository.
- Used machine learning to build a text classification model to automatically classify categories based on document content.
- Created a dataset of legal documents and used it to train and evaluate the machine learning text classification model. Shared results via Google Colab so that customers could interactively test the model's performance on their held-out data.
Research Associate
TakeLab at the University of Zagreb
- Developed a search engine for Croatian legal documents.
- Built a named entity recognition model in PyTorch by combining LSTM with a CRF.
- Mentored several students doing intern projects and wrote my master's thesis on natural language processing.
Software Development Engineer
Amazon Web Services (AWS)
- Contributed to developing a scalable time-series database solution in Java and C++, which served around 1 million requests/second.
- Served as the team scrum master and product owner.
- Designed and implemented a network correlation engine microservice to handle networking events from the entire Amazon network (patent award https://patents.justia.com/inventor/filip-boltuzic).
Business Intelligence Analyst
Zagrebacka banka Unicredit Group
- Developed SQL reports on data warehouse data to identify promising retail strategies.
- Built an interactive tool in Java to speed up the processes in Oracle Data Integrator.
- Developed small web applications for the accounting department, using PL/SQL and Oracle Apex.
Experience
Search Engine for Croatian Legal Documents
I was the lead developer on this project and proposed the system's architecture as a set of microservices. The documents were stored and indexed in Solr, whereas the Django front end served requests and communicated with Solr.
Retail Sale Forecasting
Ulpian
http://ulpian.eu
The tool is currently under development as part of a startup named Ulpian.
Education
Ph.D. Degree in Natural Language Processing
University of Zagreb - Zagreb, Croatia
Master's Degree in Computer Science
University of Zagreb - Zagreb, Croatia
Erasmus Exchange Study in Computer Science
KTH Royal Institute of Technology - Stockholm, Sweden
Bachelor's Degree in Computer Science
University of Zagreb - Zagreb, Croatia
Certifications
Convolutional Neural Networks
Coursera
Skills
Libraries/APIs
Scikit-learn, NumPy, Pandas, PyTorch, Google Cloud API, SpaCy, Natural Language Toolkit (NLTK), PySpark, LSTM, Spark ML
Tools
Vim Text Editor, Solr, Apache Solr, Git, Oh My Zsh, Boto, Jupyter, Open Neural Network Exchange (ONNX), LaTeX, ARIMA, Azure Machine Learning, ChatGPT, Haystack
Languages
Python, SQL, Haskell, Java, C++, R
Platforms
Amazon Web Services (AWS), Linux, Docker, Databricks, SolrCloud, Azure, Google Cloud Platform (GCP)
Frameworks
Django, Scrapy, Streamlit
Paradigms
Anomaly Detection, Agile, Scrum, Business Intelligence (BI), DevOps
Storage
Elasticsearch, Google Cloud, JSON, PostgreSQL
Other
Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Artificial Intelligence (AI), Machine Learning, Back-end, Data Science, OpenAI GPT-3 API, Data Analysis, Azure Databricks, Retrieval-augmented Generation (RAG), Clustering Algorithms, Clustering, Classification Algorithms, Text Classification, Torch, Web Scraping, Google Colaboratory (Colab), Google BigQuery, Text Generation, Web Services, Neural Networks, Research, Student Engagement, Supervised Machine Learning, Time Series, LangChain, OpenAI, Reinforcement Learning, Deep Reinforcement Learning, Algorithms, Programming, Heuristics, Optimization, Evolutionary Computation, Genetic Algorithms, Convolutional Neural Networks (CNNs), Sorting Algorithms, Pattern Recognition, Language Models, Unsupervised Learning, Big Data, Unstructured Data Analysis, Large Language Models (LLMs), Llama 2, FastAPI, Prompt Engineering, OpenAI GPT-4 API, Pinecone, FAISS