Syed Mohsin Ali, Developer in Lahore, Punjab, Pakistan

Syed Mohsin Ali

Verified Expert in Engineering

Bio

Mohsin is a skilled generative AI and machine learning professional with a proven track record of driving innovation through cutting-edge NLP, GenAI, and ML solutions. With over five years of experience fine-tuning large language models and deploying scalable models, he empowers businesses to unlock new possibilities and stay ahead of the curve. Mohsin excels in cross-functional collaboration, working with teams to implement AI-driven solutions that foster technological advancement.

Portfolio

SlashNext
Python, PyTorch, Large Language Models (LLMs)...

Experience

  • Python - 6 years
  • Natural Language Processing (NLP) - 6 years
  • BERT - 4 years
  • Feature Engineering - 4 years
  • Retrieval-augmented Generation (RAG) - 3 years
  • Prompt Engineering - 3 years
  • Fine-tuning - 3 years
  • Large Language Models (LLMs) - 3 years

Availability

Full-time

Preferred Environment

Visual Studio Code (VS Code), Jupyter, PyCharm

The most amazing...

...achievement has been leading the development of an AI-driven BEC detection system, achieving 99% precision using BERT, RAG, and GenAI.

Work Experience

Senior Machine Learning Engineer

2022 - PRESENT
SlashNext
  • Spearheaded the design and implementation of advanced email security algorithms leveraging BERT and large language models (LLMs), reaching 99.2% precision in identifying business email compromise (BEC), phishing, and social engineering attacks.
  • Built machine learning and statistical models for NLP tasks, including sentiment analysis, named entity recognition (NER), text classification, and feature extraction, and utilized LLMs to enhance training data through augmentation and generation.
  • Collaborated closely with DevOps, QA, front-end, and UI/UX teams to deliver the BEC solution as a client-facing product, driving business growth and showcasing expertise in addressing domain-specific challenges.
  • Drove adoption of the BEC solution by leading multinational corporations, including NVIDIA, P&G, Kingston, and Forbes, earning industry-wide recognition; implemented a robust Python-based production pipeline to support deployment at scale.
Technologies: Python, PyTorch, Large Language Models (LLMs), Generative Artificial Intelligence (GenAI), TensorFlow, Visual Studio Code (VS Code)

Experience

RAG-based System for Advanced Malicious Email Analysis

I designed and implemented a retrieval-augmented generation (RAG) pipeline for advanced analysis of malicious emails. The system processes a dataset of over 10,000 known malicious emails, which are split into semantic chunks and stored in a FAISS vector index for efficient similarity search. Semantic chunking keeps long emails manageable while preserving contextual relevance during vectorization.
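The chunking step described above can be illustrated with a minimal sketch. It is not the production code: the helper names (`split_sentences`, `jaccard`, `semantic_chunks`), the threshold, and the use of word overlap in place of embedding similarity are all assumptions made so the example runs with the standard library alone.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def jaccard(a: str, b: str) -> float:
    """Word overlap between two sentences (a stand-in for embedding similarity)."""
    words_a = set(re.findall(r"[a-z0-9]+", a.lower()))
    words_b = set(re.findall(r"[a-z0-9]+", b.lower()))
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def semantic_chunks(text: str, threshold: float = 0.1, max_sentences: int = 5) -> list[str]:
    """Group adjacent sentences; start a new chunk when similarity to the
    previous sentence drops below `threshold` or the chunk is full."""
    sentences = split_sentences(text)
    if not sentences:
        return []
    chunks = [[sentences[0]]]
    for prev, cur in zip(sentences, sentences[1:]):
        if jaccard(prev, cur) < threshold or len(chunks[-1]) >= max_sentences:
            chunks.append([cur])
        else:
            chunks[-1].append(cur)
    return [" ".join(chunk) for chunk in chunks]


email_body = (
    "Your invoice is overdue. Please pay the invoice today. "
    "Our office is closed on Friday."
)
for chunk in semantic_chunks(email_body):
    print(chunk)
```

With a learned embedding model, `jaccard` would be replaced by cosine similarity between sentence vectors; the grouping logic stays the same.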

Each email is indexed with its sender address and metadata related to the recipient mailbox, allowing context-rich retrieval. When our custom classifier flags a new malicious email, the system performs a vector-based search to retrieve semantically similar historical samples.

An LLM is then prompted with these retrieved samples to generate a comprehensive analytical summary. The response provides insights such as the recurrence of similar scams, behavioral patterns of the sender (e.g., repeated use of the same email address, tone, or scam structure), and intent detection. This enables the identification of scams that follow a similar template with only minor lexical changes, offering deeper threat intelligence and helping to preempt future attacks.
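The retrieve-then-prompt flow described above can be sketched as follows. This is an illustration under stated assumptions, not the deployed system: `EmailIndex` is a hypothetical in-memory stand-in for the FAISS index, `embed` is a toy bag-of-words vector rather than a learned embedding, and the prompt wording is invented for the example. The LLM call itself is left out.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a real system would use a learned embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class EmailIndex:
    """In-memory stand-in for a vector store (hypothetical helper)."""

    def __init__(self) -> None:
        self._rows: list[tuple[Counter, dict]] = []

    def add(self, text: str, sender: str, mailbox: str) -> None:
        # Store sender and recipient-mailbox metadata next to the vector.
        record = {"text": text, "sender": sender, "mailbox": mailbox}
        self._rows.append((embed(text), record))

    def search(self, query: str, k: int = 3) -> list[tuple[float, dict]]:
        # Brute-force scan, returning the k most similar records.
        q = embed(query)
        scored = [(cosine(q, vec), rec) for vec, rec in self._rows]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return scored[:k]


def build_prompt(flagged: str, hits: list[tuple[float, dict]]) -> str:
    """Assemble an analysis prompt from the flagged email and retrieved samples."""
    lines = ["Flagged email:", flagged, "", "Similar known-malicious samples:"]
    for score, rec in hits:
        lines.append(
            f"- sender={rec['sender']} mailbox={rec['mailbox']} "
            f"similarity={score:.2f}: {rec['text']}"
        )
    lines += ["", "Summarize recurring scam patterns, sender behavior, and likely intent."]
    return "\n".join(lines)


index = EmailIndex()
index.add("Please wire the payment for the attached invoice today", "ceo@examp1e.test", "finance")
index.add("Your mailbox is full, click here to verify your password", "it@support.test", "sales")
flagged = "Kindly wire the invoice payment today"
print(build_prompt(flagged, index.search(flagged, k=2)))
```

Keeping the metadata in the same record as the vector is what lets the prompt mention repeated senders or targeted mailboxes without a second lookup.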

AI-driven Credential Theft Detection

I designed and developed an advanced credential-stealing classifier to detect sophisticated phishing attempts targeting the theft of login credentials. The solution analyzes email content and attachments—including EML, ZIP, HTML, and PDF formats—extracting over 100 categorical features from email headers, subjects, bodies, and file structures.
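As a rough illustration of header and attachment feature extraction, the sketch below parses a raw EML string with Python's standard `email` package and derives a handful of categorical features. The feature names and the urgency keyword list are hypothetical examples chosen for this sketch; they are not the 100+ features used in the actual classifier.

```python
import os
from email import message_from_string
from email.utils import parseaddr


def _domain(header_value: str) -> str:
    """Return the lowercased domain of an address header, or '' if absent."""
    address = parseaddr(header_value)[1]
    return address.rpartition("@")[2].lower() if "@" in address else ""


def extract_features(raw_eml: str) -> dict[str, str]:
    """Derive a few illustrative categorical features from one email."""
    msg = message_from_string(raw_eml)
    parts = list(msg.walk())  # the message itself plus all nested MIME parts
    from_domain = _domain(msg.get("From", ""))
    reply_domain = _domain(msg.get("Reply-To", ""))
    extensions = sorted(
        {
            os.path.splitext(part.get_filename())[1].lower()
            for part in parts
            if part.get_filename()
        }
    )
    subject = msg.get("Subject", "").lower()
    return {
        "from_domain": from_domain or "unknown",
        "reply_to_mismatch": "yes" if reply_domain and reply_domain != from_domain else "no",
        "attachment_types": ",".join(extensions) or "none",
        "has_html_part": "yes" if any(p.get_content_type() == "text/html" for p in parts) else "no",
        "subject_has_urgency": "yes" if any(w in subject for w in ("urgent", "verify", "suspended")) else "no",
    }


sample = "\n".join([
    "From: IT Support <it@example.test>",
    "Reply-To: helpdesk@other.test",
    "Subject: Urgent: verify your account",
    "MIME-Version: 1.0",
    'Content-Type: multipart/mixed; boundary="b1"',
    "",
    "--b1",
    "Content-Type: text/html",
    "",
    "<p>Sign in here</p>",
    "--b1",
    "Content-Type: application/pdf",
    'Content-Disposition: attachment; filename="Invoice.PDF"',
    "",
    "JVBERi0=",
    "--b1--",
    "",
])
print(extract_features(sample))
```

Features like these would typically be encoded and combined with the model's text representation; ZIP and PDF content inspection would need additional parsers not shown here.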

To ensure high precision, I fine-tuned a BERT model using parameter-efficient fine-tuning (PEFT) techniques such as low-rank adaptation (LoRA). These approaches enabled efficient model adaptation on resource-constrained hardware without sacrificing performance.
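To show why LoRA suits resource-constrained hardware, here is a minimal pure-Python sketch of the idea: the pretrained weight matrix W stays frozen, and only two small matrices A and B are trained, so the layer computes W·x + (alpha / r)·B·A·x. The class and function names are hypothetical, and the rank and alpha values are example choices; real fine-tuning would normally rely on a library such as Hugging Face PEFT rather than hand-written layers.

```python
import random


def lora_param_counts(d_in: int, d_out: int, rank: int) -> tuple[int, int]:
    """Trainable parameters: full fine-tuning vs. a LoRA adapter on one matrix."""
    return d_in * d_out, rank * (d_in + d_out)


def matvec(matrix: list[list[float]], vector: list[float]) -> list[float]:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class LoRALinear:
    """Frozen linear layer plus a trainable low-rank update (illustrative only)."""

    def __init__(self, weight: list[list[float]], rank: int, alpha: float, seed: int = 0) -> None:
        rng = random.Random(seed)
        d_out, d_in = len(weight), len(weight[0])
        self.weight = weight          # frozen pretrained weights, shape d_out x d_in
        self.scale = alpha / rank     # scaling applied to the low-rank update
        # A starts with small random values, shape rank x d_in.
        self.A = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
        # B starts at zero, shape d_out x rank, so the adapter is initially a no-op.
        self.B = [[0.0] * rank for _ in range(d_out)]

    def forward(self, x: list[float]) -> list[float]:
        base = matvec(self.weight, x)
        update = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * u for b, u in zip(base, update)]


# One 768 x 768 projection, the hidden size of BERT-base, with rank 8.
full, lora = lora_param_counts(768, 768, rank=8)
print(full, lora)  # 589824 12288

layer = LoRALinear([[1.0, 2.0], [3.0, 4.0]], rank=1, alpha=2.0)
print(layer.forward([1.0, 1.0]))  # [3.0, 7.0] before any training, since B is zero
```

For this single matrix, the adapter trains about 2% of the parameters that full fine-tuning would update, which is where the memory savings come from.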

I also implemented custom data augmentation strategies and conducted iterative A/B testing to maximize detection accuracy. The resulting classifier significantly strengthened the email security pipeline, enhancing defenses against credential theft and phishing-based attacks.

Text Normalization and Data Standardization Pipeline

I designed and implemented a comprehensive data normalization pipeline to standardize diverse text data elements, including email addresses, dates, times, URLs, greetings, concluding remarks, alphanumeric text, and digits. This pipeline ensured consistency across various data sources, improving the overall quality and reliability of the data.

To address specific challenges, I developed customized algorithms to identify and normalize complex data patterns accurately. These included handling special cases, such as copyright symbols, which are often difficult to standardize. By incorporating these tailored solutions, I maintained the integrity of the data across different formats and sources.
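A normalization pass of this kind can be sketched with ordered regular-expression rules that replace each pattern with a placeholder token. The patterns, token names, and rule order below are assumptions made for illustration and cover only a subset of the element types listed above; greetings and concluding remarks would need further rules.

```python
import re

# Rules run in order: more specific patterns (URLs, emails, dates, times)
# come before the generic number rule so they are not broken apart first.
_RULES = [
    # URL: stop before trailing punctuation that is followed by whitespace or end.
    ("<URL>", re.compile(r"(?:https?://|www\.)\S+?(?=[.,;:!?)]?(?:\s|$))", re.IGNORECASE)),
    ("<EMAIL>", re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")),
    # Dates: ISO (2024-05-01) or numeric with / or - separators (1/5/2024).
    ("<DATE>", re.compile(r"\b\d{4}-\d{2}-\d{2}\b|\b\d{1,2}[/-]\d{1,2}[/-]\d{2,4}\b")),
    # Times: 15:45, 15:45:10, 3:30 PM.
    ("<TIME>", re.compile(r"\b\d{1,2}:\d{2}(?::\d{2})?(?:\s?[ap]m)?\b", re.IGNORECASE)),
    # Copyright marks appear in several encodings.
    ("<COPYRIGHT>", re.compile(r"©|\(c\)|&copy;", re.IGNORECASE)),
    ("<NUM>", re.compile(r"\b\d+(?:[.,]\d+)*\b")),
]


def normalize(text: str) -> str:
    """Replace variable elements with placeholder tokens and collapse whitespace."""
    for token, pattern in _RULES:
        text = pattern.sub(token, text)
    return re.sub(r"\s+", " ", text).strip()


raw = (
    "Contact billing@example.test by 3:30 PM on 2024-05-01 "
    "via https://pay.example.test/inv?id=42. (c) 2024 Example"
)
print(normalize(raw))
# Contact <EMAIL> by <TIME> on <DATE> via <URL>. <COPYRIGHT> <NUM> Example
```

Rule order matters: running the number rule first would split dates and times into separate `<NUM>` tokens before the more specific patterns could match them.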

This initiative significantly optimized data processing workflows, improving data quality and enabling more efficient analysis. It streamlined data handling and facilitated faster generation of actionable insights, driving improvements in decision-making processes.

Education

2011 - 2013

Master's Degree in Computer Science

University of Management and Technology - Lahore, Pakistan

2007 - 2011

Bachelor's Degree in Electrical Engineering

University of Engineering and Technology - Lahore, Pakistan

Skills

Libraries/APIs

PyTorch, TensorFlow, LSTM, Natural Language Toolkit (NLTK), SpaCy

Tools

Jupyter, PyCharm

Languages

Python, C++, Verilog, Regex

Platforms

Visual Studio Code (VS Code)

Other

Large Language Models (LLMs), Generative Artificial Intelligence (GenAI), Machine Language, Deep Learning, Retrieval-augmented Generation (RAG), LangChain, Prompt Engineering, Fine-tuning, Natural Language Processing (NLP), Feature Engineering, Neural Networks, Recurrent Neural Networks (RNNs), BERT, LoRA, Data Augmentation, A/B Testing
