
Masum Billal
Verified Expert in Engineering
Software Developer
Dhaka, Dhaka Division, Bangladesh
Toptal member since July 14, 2022
Masum is a versatile and performance-oriented engineer with 7+ years of experience building scalable applications and AI-powered solutions across cloud platforms like AWS and GCP. He is an expert in Python frameworks, databases, automation, asynchronous programming, and architectures in microservices, cloud, and serverless. Masum drives company growth and client success by mentoring teams in adopting best practices for higher code quality and faster development.
Portfolio
Experience
- Microservices - 7 years
- Django - 7 years
- Machine Learning - 7 years
- Python - 7 years
- Data Engineering - 5 years
- Amazon Web Services (AWS) - 5 years
- Google Cloud Platform (GCP) - 4 years
- FastAPI - 4 years
Preferred Environment
Linux, Python, System Architecture, Cloud Computing, Solution Architecture, Back-end Development, Data Engineering, Machine Learning, Amazon Web Services (AWS), Google Cloud Platform (GCP)
The most amazing thing I've done is architect a universal LLM interface and increase API performance by over 400%.
Work Experience
Python Engineer (via Toptal)
Freelance Client
- Leveraged asynchronous FastAPI and SQLAlchemy, improving LLM API performance by over 400% and reducing latency.
- Designed a universal API framework for LLM integration, cutting vendor onboarding time by over 80%.
- Collaborated on 8+ cross-functional AI projects, ensuring scalable architecture and seamless deployment.
- Mentored team members on best practices and improved CI/CD pipelines, leading to fewer bugs and faster development.
Senior Python Developer | Computer Vision Engineer
Advanced Mobility Analytics Group
- Built real-time data pipelines for traffic analytics using YOLO, PyTorch, ONNX, and TensorRT, cutting resource consumption by 20%.
- Developed high-performance near-real-time data pipelines for traffic monitoring using YOLO v7 and OpenCV.
- Refactored legacy codebase into a well-maintained modern repository with high code coverage and best practices.
Expert Python Developer (via Toptal)
Woven
- Developed and optimized automated testing applications for mapping platforms, improving CI/CD performance and reducing testing costs.
- Improved data processing and validation performance and CI execution time by around 90% using optimization techniques such as vectorization.
- Defined best practices and standards to improve code quality and maintainability.
Senior Data Scientist | ML Engineer
iXora Solution
- Engineered event-driven, real-time, and scheduled data pipelines in GCP, Django, Flask, MongoDB, and AWS.
- Optimized ETL memory usage by over 95%, enabling large-scale data handling.
- Developed AI solutions using Django, NLP, and machine learning.
- Saved a client thousands of dollars monthly by automating data analysis, cleaning, and reporting tools.
- Developed an in-house facial recognition-based attendance system with incredibly low latency and 100% accuracy.
Senior Data Scientist | ML Engineer
SHOHOZ
- Implemented user segmentation with clustering, reducing marketing spend.
- Built fraud detection data pipelines, reducing fraudulent activities by over 50%.
- Architected data solutions for the government-backed Corona Tracer BD.
Machine Learning Engineer
Auleek
- Developed an application to detect architectural components of a floor plan using deep learning.
- Created an application to determine whether a component can be placed in a floor plan.
- Automated training and prediction of floor plan images.
RND Software Engineer
REVE Systems
- Engaged in full-stack development for a government e-vet platform.
- Improved the packet analysis tool for high-throughput networks.
- Developed the application with technologies like Java, Hibernate, JPA ORM, and Thymeleaf.
Data Scientist | ML Engineer
Thread Equation PTE Ltd.
- Developed an application powered by Django and Django REST framework.
- Created an NLP-powered application to identify attacks on applications.
- Integrated the machine learning application into the Django application.
Experience
New York Vehicles Crash Data Interactive Visualization
Xin-ORM
https://github.com/proafxin/xin
• Execute queries on a database.
• Read a database table as a data frame.
• Write a data frame to a database table (still under development).
• Flatten and normalize a data frame with a nested structure.
• Serialize a data frame as a list of Pydantic models.
• Deserialize a list of Pydantic models as a data frame.
OCR Scorer
https://github.com/proafxin/tdm-tracker
Corona Tracer
I was the data lead, building data pipelines, analytics, and visualizations. Due to the nature of the pandemic, one challenge was delivering the project within a very tight timeline. Ultimately, we created automatic real-time data pipelines and provided government officials with analytics, reports, and visualizations.
ShiftSmart
https://shiftsmart.com/
Additionally, I used a Flask-based back-end application to facilitate intercommunication between the microservices in the pipeline.
DeepCortex
I worked as a data science team lead. We designed and developed the data science-related back end of DeepCortex.
Profiling Users Through Clustering
Clustering was one step of the user-profiling pipeline. K-means was used to cluster users, with k-means++ seeding to improve the clustering quality.
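The k-means++ seeding step can be sketched in plain Python as follows. This is a minimal illustration only, not the project's actual code: the function names `squared_distance` and `kmeans_pp_seeds` and the sample points are hypothetical, and a production pipeline would more likely rely on a library implementation.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_seeds(points, k, rng=None):
    """Pick k initial centers with k-means++ (hypothetical helper).

    The first center is chosen uniformly at random. Each later center is
    sampled with probability proportional to its squared distance from
    the nearest center chosen so far.
    """
    rng = rng or random.Random()
    centers = [rng.choice(points)]
    # nearest[i] is the squared distance from points[i] to its closest center.
    nearest = [squared_distance(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(nearest) == 0:
            # Every point coincides with a chosen center; fall back to uniform.
            new_center = rng.choice(points)
        else:
            new_center = rng.choices(points, weights=nearest, k=1)[0]
        centers.append(new_center)
        nearest = [
            min(d, squared_distance(p, new_center))
            for d, p in zip(nearest, points)
        ]
    return centers


# Illustrative data: two well-separated groups of users in a 2-D feature space.
users = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
seeds = kmeans_pp_seeds(users, k=2, rng=random.Random(0))
```

Because already-chosen points carry zero weight, the seeds tend to spread across distant groups, which gives the subsequent k-means iterations a better starting point than uniform random initialization.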
Facial Recognition API
One of the challenges was to persist the data involved in this application.
Food Recommendation & Trend Analysis
Before feature engineering and tuning, matrix factorization was used as the primary choice model. I also utilized data mining and machine learning techniques for trend analysis.
Fraud Detection
Fifa Simulation in APIs
https://github.com/proafxin/football_manager
In testing, the code has 100% coverage. Continuous integration is used to make sure no bad code is being pushed or merged.
I used Tox and GitHub Actions to ensure the project can be successfully deployed on various platforms and environments.
Dashboard for Cyberattacks
Pandas Extras
https://github.com/proafxin/pd-extras
Data frame-to-database is meant to remove all the extra steps from this writing process. Currently, the goal is to support SQL and NoSQL databases, including data warehouses such as Google BigQuery or Apache Cassandra. For SQL databases, SQLAlchemy is used internally to generalize all SQL database connections.
ETL Job Scheduling with Apache Airflow
https://github.com/proafxin/airflow
Internally, SQLAlchemy and PyMySQL are used to connect to the database and communicate with it for reading and writing data. Pandera validates a Pandas data frame created from data retrieved in the SQL query.
Bug Tracker
https://github.com/proafxin/bug-tracker
The app is currently under development to reach 100% test coverage, with the minimum functionalities of creating and editing stories or bugs. The application can also be used as a Docker container. Poetry and Tox are used internally to maintain dependencies.
Seeding Methods in K-means Clustering
https://github.com/proafxin/seeding-kmeans
Reffer: An Open-source Bibliography Management Solution
Certifications
AWS Cloud Technology Consultant
Amazon Web Services
AWS Cloud Solutions Architect
Amazon Web Services
AWS Fundamentals Specialization
Amazon Web Services
IBM Data Science Professional Certificate
IBM | via Coursera
Skills
Libraries/APIs
Pandas, SciPy, NumPy, TensorFlow, PyTorch, OpenCV, Asyncio, Python Asyncio, API Development, Matplotlib, Scikit-learn, REST APIs, PySpark, SQLAlchemy, PyMongo, Pyodbc, PyMySQL, Shapely, OpenAI API, Pydantic, Folium, Back-end APIs
Tools
Pytest, Git, Seaborn, Jira, Coverage.py, Apache Airflow, GIS, Plotly, Jupyter, AWS CLI, Open Neural Network Exchange (ONNX), BigQuery, ChatGPT, Uvicorn, Grafana, Amazon Simple Queue Service (SQS), Amazon Simple Notification Service (SNS), Amazon CloudWatch, AWS SDK, Amazon Virtual Private Cloud (VPC)
Languages
C++, Python, Java, SQL, Python 3, JavaScript
Frameworks
Django, Django REST Framework, Alembic, Django Ninja, Flask, Spark, Streamlit
Paradigms
Scrum, Object-oriented Programming (OOP), REST, Microservices, Testing, Asynchronous Programming, Code Refactoring, Continuous Integration (CI), Business Intelligence (BI), Socket Programming, ETL, RESTful Development, Test-driven Development (TDD), Unit Testing, Continuous Delivery (CD), Real-time Systems, Data-driven Methodology, Distributed Computing, Automation, Scalable Application, Back-end Architecture, Templating
Platforms
Jupyter Notebook, Google Cloud Platform (GCP), Amazon Web Services (AWS), AWS Lambda, Azure, Apache Kafka, Docker, NVIDIA CUDA, Kubernetes, Ollama, Amazon EC2
Storage
MongoDB, Amazon S3 (AWS S3), Databases, Redis, Amazon DynamoDB, NoSQL, MySQL, JSON, PostgreSQL, Microsoft SQL Server, Data Validation, Data Pipelines, Google Cloud, ScyllaDB, SQLite
Other
Clustering, Code Coverage, Machine Learning, Data Science, Deep Learning, Data Engineering, Recommendation Systems, Google BigQuery, Back-end, QA Automation, Architecture, Software Design, SaaS, API Integration, FastAPI, Containerization, Code Review, Supervised Machine Learning, Async/Await, Generative Artificial Intelligence (GenAI), Web Scraping, Software Architecture, Amazon RDS, Recurrent Neural Networks (RNNs), Data Visualization, Data Analytics, Data Mining, Agile Sprints, Multithreading, APIs, CI/CD Pipelines, Natural Language Processing (NLP), Document Parsing, Document Processing, ETL Tools, Data Warehousing, Pandera, DataFrames, Generative Pre-trained Transformers (GPT), GeoPandas, RESTful Microservices, Workflow, RESTful Services, Spatial Analysis, Visualization, Integration, Artificial Intelligence (AI), RESTful Web Services, Poetry, Tox, Business Services, Clustering Algorithms, K-means Clustering, NVIDIA TensorRT, Tensorrt, ONNX Runtime, Optical Character Recognition (OCR), EasyOCR, Tesseract, Image Processing, Convolutional Neural Networks (CNNs), Neural Networks, Retrieval-augmented Generation (RAG), Software Engineering, OpenAI, Gunicorn, Large Language Models (LLMs), GitHub Actions, Prometheus, Middleware, OpenAI SDK, OpenAI GPT-4 API, OpenAI GPT-3 API, Open-source LLMs, Llama, Dify, Information Retrieval, Atlas, Asyncpg, Polars, UV, Speech to Text AI, Text to Speech (TTS), Speech to Text, Image Generation, Moderation, AI Agents, Servers, Dagster, DuckDB, Bokeh, Algorithms, Real-time Data, Real-time Computing, Real-time Vision Systems, Data Analysis, CycleGAN, Generative Adversarial Networks (GANs), Analytical Dashboards, Cloud Computing, Big Data, Collaboration, System Architecture, Security Engineering, AWS Big Data, Solution Architecture, AWS Cloud Security, Cloud Infrastructure, Back-end Development, Scalable Architecture, Back-end Performance, Scalable Web Services, Full-stack, Startups, Scalability, Full-stack Development, Data Cleaning, Front-end