Senior Data Scientist | 2021 - PRESENT | Freelance for Lionbridge (via Newfire Global Partners)
Technologies: Python, Agile, Scrum, Web Services, JSON, PyTorch, SpaCy, NLTK, PySpark, Jupyter, Databricks, Open Neural Network Exchange (ONNX), Neural Networks, LSTM, Pandas
- Developed a machine learning sequence labeling model for text data that achieved an F1 score above 0.9.
- Decreased the inference time of a previously developed machine learning model without sacrificing its F1 score.
- Used PySpark and Databricks to perform large-scale data analysis that the company used to drive future business decisions.
- Developed multiple highly scalable Python web services that are currently serving production traffic.
Machine Learning Engineer | 2020 - 2021 | Alchemy V Ltd (via Toptal)
Technologies: Google Cloud, Google Cloud API, Google BigQuery, R, Python, Text Generation, SQL
- Created a marketing slogan text generator using Hugging Face transformers/text generation pipelines and customer-provided data.
- Created a data ingestion and reporting process via multiple Google Cloud services: BigQuery, Cloud Functions, Cloud Endpoints, and Dataproc.
- Ported existing R reporting code to a Python web service.
Natural Language Processing (NLP) Consultant | 2020 - 2021 | Granville Knowledge Management (via Toptal)
Technologies: Python, Scrapy, Web Scraping, PyTorch, Jupyter, Google Colaboratory (Colab), Text Classification
- Developed a scraper to download a large, diverse collection of around 20,000 legal documents (dating from 1990 to the present) from a European public repository.
- Built a machine learning text classification model that automatically assigns documents to categories based on their content.
- Created a dataset of legal documents and used it to train and evaluate the text classification model. Shared the results via Google Colab so that customers could interactively test the model's performance on their own held-out data.
Research Associate | 2018 - 2020 | TakeLab at the University of Zagreb
Technologies: Scikit-learn, PyTorch, Apache Solr, Django, Python, Torch, Pandas
- Developed a search engine for Croatian legal documents.
- Built a named entity recognition model in PyTorch by combining LSTM with a CRF.
- Mentored several students on internship projects and wrote my master's thesis on natural language processing.
Software Development Engineer | 2014 - 2017 | Amazon Web Services (AWS)
Technologies: Amazon Web Services (AWS), C++, Python, Java
- Contributed to developing a scalable time-series database solution in Java and C++, which served around 1 million requests/second.
- Served as the team scrum master and product owner.
- Designed and implemented a network correlation engine microservice to handle networking events from the entire Amazon network (patent awarded; see https://patents.justia.com/inventor/filip-boltuzic).
Business Intelligence Analyst | 2012 - 2014 | Zagrebacka banka Unicredit Group
Technologies: Java, SQL
- Developed SQL reports on data warehouse data to identify promising retail strategies.
- Built an interactive tool in Java to speed up processes in Oracle Data Integrator.
- Developed small web applications for the accounting department using PL/SQL and Oracle APEX.