Filip Boltuzic, Machine Learning Engineer and Developer in Zagreb, Croatia

Member since April 28, 2020
Filip is a machine learning engineer with several years of professional experience. He's worked on large-scale problems at Amazon Web Services as a software developer and built natural language processing models as a research associate at the University of Zagreb. His main interests are machine learning and natural language processing with an emphasis on building text classification models.

Location

Zagreb, Croatia

Availability

Part-time

Preferred Environment

Java, Git, Linux, Docker, Apache Solr, Django, PyTorch, Pandas, NumPy, Scikit-learn, Python

The most amazing...

...machine learning model I've developed was an LSTM and CRF model to segment text into argumentative claims as part of my Ph.D. thesis.

Employment

  • Senior Data Scientist

    2021 - PRESENT
    Freelance for Lionbridge (via Newfire Global Partners)
    • Developed a machine learning sequence labeling model on text data that achieved above 0.9 F1 score.
    • Decreased inference time on a previously developed machine learning model without sacrificing its F1 score.
    • Used PySpark and Databricks to perform a large-scale data analysis that the company used to drive future business decisions.
    • Developed multiple highly scalable Python web services that are currently serving production traffic.
    Technologies: Python, Agile, Scrum, Web Services, JSON, PyTorch, SpaCy, NLTK, PySpark, Jupyter, Databricks, Open Neural Network Exchange (ONNX), Neural Networks, LSTM, Pandas
  • Machine Learning Engineer

    2020 - 2021
    Alchemy V Ltd (via Toptal)
    • Created a marketing slogan text generator using Hugging Face transformers/text generation pipelines and customer-provided data.
    • Created a data ingestion and reporting process via multiple Google Cloud services: BigQuery, Cloud Functions, Cloud Endpoints, and Dataproc.
    • Ported existing R reporting code to a Python web service.
    Technologies: Google Cloud, Google Cloud API, Google BigQuery, R, Python, Text Generation, SQL
  • Natural Language Processing (NLP) Consultant

    2020 - 2021
    Granville Knowledge Management (via Toptal)
    • Developed a scraper to download a large (around 20,000) and diverse collection of legal documents (dating from 1990 to the present) from a European public repository.
    • Used machine learning to build a text classification model that automatically assigns categories based on document content.
    • Created a dataset of legal documents and used it to train and evaluate the text classification model. Shared results via Google Colab so that customers could interactively try the model on their own held-out data.
    Technologies: Python, Scrapy, Web Scraping, PyTorch, Jupyter, Google Colaboratory (Colab), Text Classification
  • Research Associate

    2018 - 2020
    TakeLab at the University of Zagreb
    • Developed a search engine for Croatian legal documents.
    • Built a named entity recognition model in PyTorch by combining LSTM with a CRF.
    • Mentored several students on internship projects and wrote my master's thesis on natural language processing.
    Technologies: Scikit-learn, PyTorch, Apache Solr, Django, Python, Torch, Pandas
  • Software Development Engineer

    2014 - 2017
    Amazon Web Services (AWS)
    • Contributed to developing a scalable time-series database solution in Java and C++, which served around 1 million requests/second.
    • Served as the team scrum master and product owner.
    • Designed and implemented a network correlation engine microservice to handle networking events from the entire Amazon network (patent award https://patents.justia.com/inventor/filip-boltuzic).
    Technologies: Amazon Web Services (AWS), C++, Python, Java
  • Business Intelligence Analyst

    2012 - 2014
    Zagrebacka banka Unicredit Group
    • Developed SQL reports on data warehouse data to identify promising retail strategies.
    • Built an interactive tool in Java to speed up the processes in Oracle Data Integrator.
    • Developed small web applications for the accounting department, using PL/SQL and Oracle Apex.
    Technologies: Java, SQL

Experience

  • Search Engine for Croatian Legal Documents

    A Django and Apache Solr web application.

    I was the lead developer on this project and proposed the system's architecture as a set of microservices. The documents were stored and indexed in Solr, whereas the Django front end served requests and communicated with Solr.

  • Retail Sale Forecasting

    The project was to design a model to predict sales amounts based on historical data on orders, previous sales, and regions. The forecasting was done at both the regional and global levels and was framed as a time-series prediction problem. I experimented with several time-series prediction techniques, such as ARIMA and SARIMA models.

Skills

  • Other

    Natural Language Processing (NLP), Machine Learning, Back-end, Artificial Intelligence (AI), Clustering Algorithms, Clustering, Classification Algorithms, Text Classification, Torch, Web Scraping, Google Colaboratory (Colab), Google BigQuery, Text Generation, Web Services, Open Neural Network Exchange (ONNX), Neural Networks, Research, Student Engagement, Supervised Machine Learning, Time Series, Autoregressive Integrated Moving Average (ARIMA)
  • Languages

    Python, SQL, Haskell, Java, C++, R
  • Libraries/APIs

    Scikit-learn, NumPy, Pandas, PyTorch, Google Cloud API, SpaCy, NLTK, PySpark, LSTM
  • Tools

    Vim Text Editor, Solr, Apache Solr, Git, Oh My Zsh, Boto, Jupyter, LaTeX
  • Paradigms

    Data Science, Anomaly Detection, Agile, Scrum
  • Platforms

    Amazon Web Services (AWS), Linux, Docker, Databricks, SolrCloud
  • Frameworks

    Django, Scrapy
  • Storage

    Elasticsearch, Google Cloud, JSON

Education

  • Ph.D. Degree in Natural Language Processing
    2012 - 2020
    University of Zagreb - Zagreb, Croatia
  • Master's Degree in Computer Science
    2010 - 2012
    University of Zagreb - Zagreb, Croatia
  • Erasmus Exchange Study in Computer Science
    2010 - 2011
    KTH Royal Institute of Technology - Stockholm, Sweden
  • Bachelor's Degree in Computer Science
    2007 - 2010
    University of Zagreb - Zagreb, Croatia

Certifications

  • Convolutional Neural Networks
    NOVEMBER 2017 - PRESENT
    Coursera
