
Faruk Pasalic
Verified Expert in Engineering
Software Developer
Sarajevo, Federation of Bosnia and Herzegovina, Bosnia and Herzegovina
Toptal member since March 11, 2021
Faruk is a software developer with over 20 years of experience, specializing in Python back-end development. His expertise spans big data technologies like Hadoop and Spark, PostgreSQL, C, and C++. Faruk also has years of experience in machine learning and a hobby-level interest in firmware and hardware design, bridging software and hardware systems.
Portfolio
Experience
- Computer Science - 15 years
- Linux - 10 years
- GitHub - 10 years
- REST APIs - 8 years
- PyCharm - 5 years
- Data Science - 5 years
- Python - 4 years
- Pandas - 4 years
Preferred Environment
Linux, PyCharm, Git, TensorFlow, Keras, Python, Embedded Systems, Embedded Hardware
The most amazing...
...work I've done was designing and implementing a robust system for PDF parsing, text extraction, and layout understanding using machine learning techniques.
Work Experience
Full-stack Developer and AI Developer
Freelance Job
- Implemented text extraction from PDF documents related to laws, using ML clustering algorithms to extract text paragraphs, titles, and subtitles.
- Used the OpenAI API to convert paragraphs, titles, and sections into vector embeddings and stored them in the Milvus vector database.
- Implemented a search over the Milvus database to retrieve paragraphs and references similar to those in the PDF files.
- Implemented a simple web app to retrieve search results and download documents.
Python Developer and Data Engineer (via Toptal)
RiskFinTech Ltd
- Updated the core library for new features and fixed current issues.
- Maintained application updates, deployments, and environments.
- Wrote user documentation and developer documentation, including architecture diagrams and process charts, among others.
- Discussed new features with owners based on clients' feedback.
Senior Developer
Freelance Clients
- Developed a calendar application with specific requirements. Used Django for a production-ready project. Learned Django quickly using my prior knowledge of web services with other platforms, specifically Spring Boot and Java Play.
- Created a calendar application from scratch using React and Django.
- Increased my knowledge of JavaScript and Python. Used Python previously on projects regarding ML and AI.
Embedded Developer
Fox Montgomery Limited
- Investigated livestock scales and how to connect them to the cloud.
- Researched RS232 to TTL converters and RFID readers suitable for the project.
- Wrote simple Python code to connect sensors to the Raspberry Pi and send data to the Azure cloud.
Python/Data Engineer Developer
RiskFinTech Ltd
- Implemented new features for the Jupyter Notebook application.
- Fixed bugs in Jupyter Notebook and Java applications.
- Worked on the design and refactoring of client applications.
Python Developer and Data Engineer
RiskFinTech Ltd
- Developed a transformation engine for processing financial portfolios and creating different kinds of reports.
- Wrote or fixed forecasting models based on documentation.
- Conducted demonstrations of the application for the clients and members of the company.
Python Developer and Data Engineer
RiskFinTech Ltd
- Created a transformation engine for processing financial portfolios and writing various reports.
- Converted financial rules to code or rules to be run on an internal transformation engine.
- Wrote or fixed forecasting models based on documentation.
- Conducted application demonstrations for the clients and members of the company.
Lead Software Engineer
Atlantbh
- Designed and developed a system for ingesting large amounts of data from XML files into Spark DataFrames.
- Optimized Spark jobs for better performance.
- Orchestrated the execution order of Spark jobs to reduce total run time.
- Exported data from Spark to big XML files in a user-defined format.
- Mentored new members of the team and organized tasks for new members.
Machine Learning Engineer
Atlantbh
- Designed new features for the in-house product with machine learning algorithms using Python, TensorFlow 1.x, TensorFlow 2, and TensorFlow Serving.
- Implemented record deduplication and matching using an unsupervised learning algorithm. Written in Python, with distributed processing and auto-scaling on AWS.
- Extracted structured data from unstructured text, such as addresses from HTML or plain text. Implemented as an NLP algorithm with an LSTM neural network in TensorFlow and deployed using TensorFlow Serving.
- Created text classification using NLP and supervised learning algorithms. Implemented text classification with LSTM recurrent network in Python and TensorFlow.
- Categorized websites using NLP and supervised learning algorithms. Categorization is done using Python and DNNs with word embeddings in the background.
- Mentored company interns in machine learning and data science. Used Python, TensorFlow, and Keras for the project. Some projects included expression detection using CNN, a spell checker for the Bosnian language, and driver-level prediction.
Senior Software Engineer
Atlantbh
- Developed an ingestion system for a big data platform.
- Created ETL tools for data preprocessing in HDFS.
- Built ORM tools for mapping between HDFS and Java.
- Developed message-based communication between different systems on the same platform.
- Configured a REST service to store and retrieve configurations for different subsystems.
- Processed logs of real-time data collected using Flume and Scribe collectors. Also worked on MapReduce graph algorithms to connect logs from different stages.
Senior Software Engineer
Atlantbh
- Developed location-based services for a client and implemented communication between different subsystems, message storage, and processing facilities.
- Processed POIs from the supplier's input files and normalized them into the client-specific format stored in HDFS. Implemented a tool for transforming different file formats (XML and JSON files) into a single file format.
- Created tools for verifying input data based on the JBoss Drools engine.
Software Engineer
Freelance
- Developed a desktop application for processing JPEG images created by a police radar system.
- Extracted JPEG metadata and used OCR to detect car license plates captured in the image.
- Created a report of the traffic offense in DOC and PDF formats.
Software Engineer
Atlantbh
- Implemented geocoding and reverse geocoding algorithms based on GIS data provided by the client.
- Developed map rendering from GIS data provided by the client. Maps were partitioned into tiles. Created tiling algorithms and ran and supervised tile rendering. Implemented on-demand tile rendering using the Decarta server.
- Implemented routing algorithms and drew routes on maps based on GIS data provided by the client.
Software Engineer
Atlantbh
- Worked as a junior software developer on a system for delivering venue maps to the client. The maps contained POI data stored in a database. Charged with maintaining the codebase and adding new features.
- Contributed to the library for importing different graphical and nongraphical formats such as DXF and PDF into the system and used imported data to create venue maps. Worked on detecting map parts based on object labels from the input file.
- Created different output files (PDF, SWF) out of venue maps stored in the system.
Experience
PDF Text Extraction Library
https://github.com/farukpasalic/pdfmage
PDFMage includes debugging options that create visualizations of the extraction process, aiding in understanding and troubleshooting. The library also provides a Config class for customizing various aspects of the text extraction process, such as the path for output storage and the colors used in debug images. Overall, PDFMage is a comprehensive tool for PDF text extraction, offering high accuracy and customization options to cater to various user requirements.
Web Scraping Library - Python
https://github.com/farukpasalic/skrap
Data Validation, Forecasting and Reporting Engine
Livestock Scales
Machine Learning and Data Science Project
I also utilized CNN, RNN models, and multilayer LSTM on synthetic datasets from crawled websites and integrated features into the system.
I then implemented business name extraction using the Smith-Waterman algorithm, combining URL, title, and website content. I also created a website classification module using TensorFlow, in which an LSTM-based classifier performed best among the models tested. Finally, I designed a profanity filter for the English language using an LSTM network with character-level and word-level embeddings; it generalized even to distorted or misspelled words.
Java and Spark Project
My responsibilities included implementing Spark jobs, maintaining REST API, and documentation.
Django and React Project
My responsibilities encompassed creating custom functions, implementing a unique color scheme, and integrating other personalized features into the application.
Web Scraping Project
Education
Bachelor's Degree in Mathematics and Computer Science
University of Sarajevo - Sarajevo, Bosnia and Herzegovina
High School Diploma in Mathematics and Computer Science
Gymnasium Bosanska Krupa - Bosanska Krupa, Bosnia and Herzegovina
Skills
Libraries/APIs
TensorFlow, Pandas, Matplotlib, NumPy, REST APIs, Scikit-learn, Beautiful Soup, Keras, Natural Language Toolkit (NLTK), SciPy, OpenCV, React, PyQt 5, WebDriver, OpenAI API
Tools
GitHub, PyCharm, RabbitMQ, Jupyter, GIS, Esri, IntelliJ IDEA, Bitbucket, Git, Jenkins, Eclipse IDE, Apache Tomcat, TensorBoard
Languages
Java, XML, Python 3, Python, SQL, Embedded C, HTML, JavaScript, C, C++, MicroPython, Scala
Paradigms
Agile, Scrum, REST, MapReduce, ETL
Storage
JSON, PostGIS, PostgreSQL, HDFS, Amazon S3 (AWS S3), MySQL, Redis
Frameworks
Spring, Spring Boot, Hibernate, Selenium, Django REST Framework, Hadoop, Spark, gRPC, Django
Platforms
Linux, Docker, Amazon Web Services (AWS), Arduino, Jupyter Notebook, OpenShift, JBoss, Amazon EC2, Apache Kafka, Kubernetes, Raspberry Pi 3 GPIO
Other
Data Science, Machine Learning, Neural Networks, Natural Language Processing (NLP), Data Structures, Generative Pre-trained Transformers (GPT), Mathematics, Computer Science, Clustering Algorithms, Artificial Intelligence (AI), APIs, Back-end, Deep Learning, MinIO, Arduino IDE, Image Processing, Big Data, Multithreading, Jupiter, Algorithms, Embedded Systems, Embedded Hardware, Embedded Development, Embedded Software, Microcontroller Programming, AI Programming, Large Data Sets, Data Scientist, Transmission Control Protocol (TCP), UDP, Scraping, Website Data Scraping, OpenAI, Prompt Engineering, OpenAI GPT-3 API, Computer Vision, Applied Mathematics, Messaging, Web Scraping, Convolutional Neural Networks (CNNs), Generative Adversarial Networks (GANs), ESP32, Scripting, Internet of Things (IoT), Hardware Design, Technical Consulting, System Service & Hardware Control, I2C, Networking, PDF, JPEG, Optical Character Recognition (OCR), GitHub Actions, Hugging Face, Large Language Models (LLMs), Data Analysis, Data Analytics, Milvus, Retrieval-augmented Generation (RAG)