Suleman Khan, Developer in Lahore City, Punjab, Pakistan

Suleman Khan

Verified Expert in Engineering

Artificial Intelligence Engineer and Developer

Location
Lahore City, Punjab, Pakistan
Toptal Member Since
October 20, 2022

Suleman has over five years of experience in data engineering, machine learning, cloud computing, and back-end software development. He has a master's degree in data science and has published three research articles on interpretable machine learning. Suleman currently works with technologies including Python, Docker, Kubernetes, AWS, Redis, Ray.io, FastAPI, GraphQL, Boto3, RabbitMQ, Celery, SQL, PostgreSQL, Prefect, TensorFlow, and most major Python frameworks.

Portfolio

ICS Collections
Amazon EC2, Azure Blobs, Azure Blob Storage API, Azure Machine Learning...
Altosphere
Amazon Web Services (AWS), Artificial Intelligence (AI), Docker, FastAPI...
Freelance Clients
Amazon Web Services (AWS), Artificial Intelligence (AI), Docker, FastAPI, GPT...

Experience

Availability

Full-time

Preferred Environment

Python, Back-end, Artificial Intelligence (AI), Machine Learning, TensorFlow, Docker, PostgreSQL, Amazon Web Services (AWS), Cloud Computing, Kubernetes

The most amazing...

...project I've developed is a no-code data infrastructure platform to store, manage, process, and analyze data from wind and solar plants.

Work Experience

Applied Data Scientist | Machine Learning Engineer

2022 - PRESENT
ICS Collections
  • Designed and developed a multi-tenant SaaS platform back end utilized by numerous medical debt collection clients.
  • Integrated back-end APIs with ChatGPT to create intelligent reporting agents, optimizing data-driven decision-making.
  • Integrated voice-based machine learning models and NLP semantic segmentation models to facilitate better decision-making processes.
  • Developed an automated process for loading data from over 50 clients, streamlining data management workflows.
  • Implemented a microservices architecture using FastAPI with Postgres, contributing to revenue growth exceeding $200,000.
  • Designed and implemented a PostgreSQL database utilized by multiple agencies, scaling to over 50 million rows.
Technologies: Amazon EC2, Azure Blobs, Azure Blob Storage API, Azure Machine Learning, PostgreSQL, Data Engineering, Data Science, Selenium, REST, GraphQL, SQL, Redis, Tableau, TensorFlow, Keras, AWS Lambda, Prefect, Apache Airflow, Azure Databricks, ChatGPT, Natural Language Processing (NLP), Celery, Docker, Python, FastAPI, Twilio, Teamwork

Data Scientist | Machine Learning Engineer

2020 - 2022
Altosphere
  • Developed the back end of a visual data infrastructure platform to store, manage, and process data.
  • Created unsupervised machine learning models and rule-based anomaly detectors for time series data.
  • Built an information extraction and verification pipeline for PDF field reports using Amazon Textract and Boto3.
  • Utilized Ray.io to streamline the processing of PDF field reports, integrating with Amazon Textract and Boto3 for efficient data extraction and verification.
  • Implemented a dynamic data pipeline orchestration module using Prefect.
  • Developed asynchronous task execution using Celery, RabbitMQ, and Redis. Monitored tasks with Flower.
  • Integrated multiple microservices using GraphQL and gRPC. Containerized the FastAPI back end using Docker and deployed it on AWS.
Technologies: Amazon Web Services (AWS), Artificial Intelligence (AI), Docker, FastAPI, Back-end, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), GPT, Machine Learning, PostgreSQL, TensorFlow, Python, Web Development, Celery, Leadership, Architecture, Pandas, NumPy, Scikit-learn, BigQuery, Google BigQuery, Data Pipelines, Ray.io, Test-driven Development (TDD), Supervised Machine Learning, Supervised Learning, Classifier Development

Data Scientist / Machine Learning Engineer

2018 - 2021
Freelance Clients
  • Developed an application using TensorFlow to count how many times an advertisement is played during a radio transmission.
  • Detected anomalies in time-series datasets using CNNs, RNNs, and other anomaly detection algorithms.
  • Created an IVR system for a call center using bidirectional Twilio streaming and Google speech-to-text and text-to-speech services.
  • Built an Android application using TensorFlow Lite to detect COVID-19 from voice recordings.
  • Implemented a machine learning algorithm to detect a person falling in indoor CCTV recordings.
  • Consolidated data from multiple sources in Google BigQuery.
Technologies: Amazon Web Services (AWS), Artificial Intelligence (AI), Docker, FastAPI, GPT, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), PostgreSQL, Python, Machine Learning, TensorFlow, Back-end, Data Transcription, AI Design, Audio, Speech to Text, Speech Recognition, Text to Speech (TTS), Pandas, NumPy, Scikit-learn

Software Engineer

2020 - 2020
Luminogics
  • Built a website for an automobile company using HTML, CSS, JavaScript, React, and Node.js.
  • Developed simple web games using the Phaser library.
  • Created the back end of a website using Python, SQL, PostgreSQL, and Docker.
Technologies: JavaScript, HTML, CSS, React, Angular, Amazon Web Services (AWS), Amazon S3 (AWS S3), Amazon EC2, AWS IAM

Software Engineer | Internship

2018 - 2018
Nextbridge
  • Built a social application using Android networking libraries.
  • Created a painting application using Canvas and similar technologies.
  • Developed a bookstore application using Android SDK.
Technologies: Java, Volley, Volley Android Library, Retrofit, Retrofit 2, Dagger, Android SDK

Software Engineer | Internship

2017 - 2017
Minimax Technologies
  • Built a photo frame mobile application with multiple layouts, frames, templates, stickers, backgrounds, and text fonts to create incredible photos.
  • Developed a voice recording mobile application with custom audio signal visualization.
  • Created a flashlight mobile application with features such as automatic on/off and a timer.
Technologies: Android SDK, Java

Software Engineer | Internship

2016 - 2016
Havanour Technologies
  • Developed tested, idiomatic, and documented websites using HTML, CSS, and JavaScript.
  • Created multiple UI components of the website using HTML and CSS.
  • Wrote and debugged code that would work across different browsers.
Technologies: HTML, CSS, JavaScript, Bootstrap, jQuery, SCSS

Con-Detect | Detecting Adversarially Perturbed Natural Language Inputs to Deep Classifiers

https://www.techrxiv.org/articles/preprint/Con-Detect_Detecting_Adversarially_Perturbed_Natural_Language_Inputs_to_Deep_Classifiers_Through_Holistic_Analysis/19295534
Deep learning (DL) algorithms perform remarkably well on many NLP tasks but are vulnerable to adversarial attacks. Most mitigation techniques proposed to date are supervised, relying on adversarial retraining to improve robustness.

We introduce an unsupervised methodology for detecting adversarial inputs to NLP classifiers. We note that minimally perturbing an input to change a model's output, a significant strength of adversarial attacks, is also a weakness that leaves unique statistical marks reflected in the cumulative contribution scores of the input. In particular, we show that the cumulative contribution score, called the CF-score, of adversarial inputs is generally greater than that of clean inputs. We thus propose Con-Detect, a contribution-based method for detecting adversarial attacks against NLP classifiers. Con-Detect can be deployed with any classifier without having to retrain it. We show that it can reduce the attack success rate (ASR) of different attacks from 100% to as low as 0% in the best cases and ≈70% in the worst case. Even in the worst case, we note a 100% increase in the required number of queries and a 50% increase in the number of words perturbed, suggesting that Con-Detect is hard to evade.
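
The detection idea above can be illustrated with a minimal sketch. It is not the published Con-Detect implementation: the way the cumulative score is computed here (a sum of absolute per-token contributions), the percentile-based threshold, and all function names are assumptions made for illustration. The only property taken from the description is that adversarial inputs tend to have a higher cumulative contribution score than clean ones, so a threshold calibrated on clean inputs alone can flag them without retraining the classifier.

```python
import math


def cumulative_score(contributions):
    """Hypothetical stand-in for the CF-score: sum of absolute per-token contributions."""
    return sum(abs(c) for c in contributions)


def calibrate_threshold(clean_contributions, percentile=95):
    """Choose a threshold from clean inputs only (nearest-rank percentile).

    No adversarial examples are needed, which keeps the detector unsupervised.
    """
    scores = sorted(cumulative_score(c) for c in clean_contributions)
    rank = max(1, math.ceil(percentile / 100 * len(scores)))
    return scores[rank - 1]


def is_adversarial(contributions, threshold):
    """Flag an input whose cumulative score exceeds the clean-data threshold."""
    return cumulative_score(contributions) > threshold


# Toy per-token contribution scores (made-up numbers, not real model output).
clean = [[0.1, 0.2, 0.1], [0.3, 0.1, 0.0], [0.2, 0.2, 0.2], [0.1, 0.1, 0.1]]
threshold = calibrate_threshold(clean)

suspicious_input = [0.9, 0.8, 0.7]
ordinary_input = [0.1, 0.1, 0.1]
flag_suspicious = is_adversarial(suspicious_input, threshold)
flag_ordinary = is_adversarial(ordinary_input, threshold)
```

In practice the contribution scores would come from an attribution method applied to the deployed classifier, and the percentile would be tuned to trade false positives against missed attacks.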

Tamp-X | Attacking Explainable Natural Language Classifiers Through Tampered Activations

https://www.sciencedirect.com/science/article/pii/S0167404822001857
While deep neural networks (DNNs) have been instrumental in achieving state-of-the-art results for various natural language processing (NLP) tasks, recent works have shown that the decisions made by DNNs cannot always be trusted. Recently proposed explainable artificial intelligence (XAI) methods are open to attack and can be manipulated in both white-box gradient-based and black-box perturbation-based scenarios.

We propose Tamp-X, a first-of-its-kind attack that tampers with the activations of robust NLP classifiers, forcing state-of-the-art white-box and black-box XAI methods to generate misrepresented explanations. Through extensive experimentation, we show that the explanations generated for the tampered classifiers are unreliable and significantly disagree with those generated for the untampered classifiers, even though the output decisions of tampered and untampered classifiers are almost always the same. Additionally, we study the adversarial robustness of the tampered NLP classifiers and find that the tampered classifiers, which are harder for XAI methods to explain, are also harder for adversarial attackers to attack.

All Your Fake Detector Belong to Us | Evaluating Adversarial Robustness of Fake-news Detectors

https://ieeexplore.ieee.org/abstract/document/9446139
With the hyperconnectivity and ubiquity of the Internet, the fake news problem now presents a greater threat than ever before. One promising solution for countering this threat is to leverage deep learning (DL)-based text classification methods for fake-news detection. However, since such methods are vulnerable to adversarial attacks, the integrity and security of DL-based fake-news classifiers are in question. We evaluate the performance of fake-news detectors in various configurations under black-box settings. In particular, we investigate the robustness of four DL architectures (MLP, CNN, RNN, and a recently proposed hybrid CNN-RNN), trained on three different state-of-the-art datasets, against adversarial attacks such as TextBugger, TextFooler, PWWS, and DeepWordBug, implemented using TextAttack.

Additionally, we explore how changing the detector complexity, the input sequence length, and the training loss affects the robustness of the learned model. Our experiments suggest that RNNs are more robust than the other architectures, and our evaluations provide vital insights for making fake-news detectors more robust against adversarial attacks.

The Art in Our Worlds

https://github.com/msulemannkhan/nasaspaceapp
NASA is moving its data to the cloud, and machine learning (ML) and artificial intelligence (AI) can offer an innovative means to analyze and use this massive archive of free and open data.

We developed a mobile application using ML/AI techniques that allows users to input short text phrases, match that input to NASA science data or imagery, and display the results for the user creatively and artistically.

Domain Adaptation for Emotion Detection from Face Expressions

https://github.com/msulemannkhan/msds19011_Project_DLSpring2020
Humans have seven distinct facial emotions. Facial expression recognition algorithms have applications in healthcare, entertainment, criminal justice, and more.

Deep learning algorithms are effective for facial expression classification, but they demand large amounts of data. Domain adaptation can be used to address the lack of sufficient data. Currently, little data is available on Pakistani facial expressions. In this project, we created a dataset of Pakistani facial expressions and used domain adaptation to develop an effective facial expression classifier for Pakistani faces. We achieved 58% accuracy against a baseline of 32%.

AI Article Writer

https://github.com/msulemannkhan/text-generation
OpenAI's GPT-3 is a powerful language model with 175 billion parameters, trained on 45TB of text from various datasets.

I developed this project to demonstrate the capabilities of state-of-the-art NLP models, showing how GPT-Neo and GPT-3 can be used to write blog posts.

Bookstore Application

The Bookstore Application is an eCommerce platform built with the Django web framework and PostgreSQL as the database engine. The platform allows users to browse and purchase books from a local bookstore.

The application features a user-friendly interface that allows users to easily search for books by title, author, or genre. Users can add books to their shopping cart and proceed to checkout, where they can enter their payment information and complete their purchases.

The back end of the application is built with Django, which handles data management and user authentication. The application stores and manages user and product data in PostgreSQL.

Overall, the Bookstore Application is a versatile and user-friendly platform that enables the local bookstore to sell their books online, providing customers with a convenient way to shop and support their favorite local bookstore.

Education

2019 - 2021

Master's Degree in Data Science

Information Technology University - Lahore, Pakistan

2015 - 2019

Bachelor's Degree in Computer Science

Government College University - Lahore, Pakistan

Certifications

SEPTEMBER 2022 - PRESENT

Build Text Classification Model with AWS Glue and Amazon SageMaker

AWS

SEPTEMBER 2022 - PRESENT

Introduction to Amazon Kinesis Analytics

LinkedIn

SEPTEMBER 2022 - PRESENT

Learning Amazon Web Services Lambda

LinkedIn

SEPTEMBER 2022 - PRESENT

Learning Amazon SageMaker

LinkedIn

SEPTEMBER 2022 - PRESENT

Learning AWS CloudFormation

LinkedIn

SEPTEMBER 2022 - PRESENT

Efficient Python Production Workflows

LinkedIn

SEPTEMBER 2022 - PRESENT

Data Science Foundations | Python Scientific Stack

LinkedIn

SEPTEMBER 2022 - PRESENT

Artificial Intelligence Foundations | Machine Learning

LinkedIn

SEPTEMBER 2022 - PRESENT

Apache PySpark by Example

LinkedIn

SEPTEMBER 2022 - PRESENT

Amazon Web Services | Data Services

LinkedIn

SEPTEMBER 2022 - PRESENT

Advanced Pandas

LinkedIn

SEPTEMBER 2022 - PRESENT

AWS for Developers | Data-driven Serverless Applications with Kinesis

LinkedIn

SEPTEMBER 2022 - PRESENT

AWS Machine Learning | Building an Expense Tracker Using Amazon Textract

LinkedIn

Languages

Python, SQL, Python 3, JavaScript, HTML, CSS, Java, SCSS, GraphQL

Libraries/APIs

TensorFlow, Pandas, NumPy, Matplotlib, Scikit-learn, PySpark, Folium, Shapely, PyTorch, React, Volley, Volley Android Library, Retrofit, Retrofit 2, jQuery, OpenCV, Flask-RESTful, Django ORM, Azure Blob Storage API, Keras

Tools

Seaborn, PyPI, Amazon SageMaker, AWS IAM, BigQuery, Amazon Athena, AWS CloudFormation, AWS Glue, Jupyter, Celery, Azure Machine Learning, Tableau, Apache Airflow, ChatGPT

Paradigms

Data Science, REST, Test-driven Development (TDD)

Platforms

Amazon EC2, Amazon Web Services (AWS), Docker, AWS Lambda, Jupyter Notebook, Twilio, Kubernetes

Storage

PostgreSQL, Amazon S3 (AWS S3), Databases, Data Pipelines, NoSQL, Data Lakes, Amazon Aurora, Azure Blobs, Redis

Other

Back-end, Artificial Intelligence (AI), Machine Learning, Natural Language Processing (NLP), FastAPI, Data Visualization, Regression, Linear Regression, Data Analytics, Deep Learning, GPT, Generative Pre-trained Transformers (GPT), Data Wrangling, API Gateways, Google BigQuery, Machine Learning Operations (MLOps), Amazon Kinesis, Serverless, Data Analysis, Data Warehousing, Gunicorn, CI/CD Pipelines, Identity & Access Management (IAM), Infrastructure as Code (IaC), Big Data, Neural Networks, Deep Neural Networks, Recurrent Neural Networks (RNNs), Convolutional Neural Networks (CNN), Adversarial Attacks, Explainable Artificial Intelligence (XAI), BERT, Computer Vision, APIs, Google Colaboratory (Colab), OpenAI, Generative Pre-trained Transformer 3 (GPT-3), Text Generation, Web Development, Data Transcription, AI Design, Audio, Speech to Text, Speech Recognition, Text to Speech (TTS), Leadership, Architecture, Security, Data Engineering, Prefect, Azure Databricks, Ray.io, Cloud Computing, Containerization, Ray Train, Ray Tune, Ray Serve, Ray Core, Teamwork, Supervised Machine Learning, Supervised Learning, Classifier Development

Frameworks

Hadoop, Serverless Framework, Android SDK, Angular, Dagger, Bootstrap, Flask, Django, Selenium
