Kumar Vishal, Developer in Bengaluru, Karnataka, India

Kumar Vishal

Verified Expert in Engineering

Bio

Kumar has a strong background in computer vision and image processing, using both classical techniques and deep learning approaches. He has worked on problems including object segmentation, face recognition, scene classification, stereo vision, and motion estimation. He also has experience in full-stack development. Kumar is currently the CTO for a product that uses facial recognition and scene detection to improve the editing experience for video editors.

Portfolio

AgShift Inc.
Deep Learning, TensorFlow, Keras, PyTorch, Fast.ai, TensorFlow Serving, React
Yobi.ai
JavaScript, C++, Python, Computer Vision, Deep Learning
LensBricks Inc. (Kiba)
Computer Vision, Image Processing, Machine Learning, Python, C++

Experience

  • Python - 6 years
  • Computer Vision - 5 years
  • Image Processing - 5 years
  • C++ - 5 years
  • OpenCV - 5 years
  • Architecture - 4 years
  • Full-stack - 4 years
  • AWS Lambda - 2 years

Availability

Part-time

Preferred Environment

Git, PyCharm, Sublime Text, Windows, macOS, Linux

The most amazing...

...thing I've created is software to bring facial recognition and scene classification within the framework of professional video editing (https://www.jump.video).

Work Experience

Principal AI Engineer

2019 - 2021
AgShift Inc.
  • Owned the full AI stack, including training, data collection, pre- and post-processing, data cleanup, and data visualization. I was the primary developer for the related codebase.
  • Performed segmentation and classification of 5 to 20 classes, depending on the commodity.
  • Built a back-end framework for automated and parallelized training with varying datasets, learning rates, and other key hyperparameters.
  • Implemented a React-based result visualization UI and its back end for comprehensive navigation of a large dataset.
  • Improved responsiveness through immediate loading of metadata and lazy loading and dynamic generation of images.
  • Added logging of additional training statistics to TensorBoard to enable analysis.
Technologies: Deep Learning, TensorFlow, Keras, PyTorch, Fast.ai, TensorFlow Serving, React

CTO

2016 - 2020
Yobi.ai
  • Implemented the software architecture and back end for Jump software (https://jump.video/).
  • Developed an algorithm to detect scene and shot changes.
  • Designed unsupervised clustering of faces in a video using deep learning techniques.
  • Created a YouTube Chrome plugin to mark interesting moments in a video (http://archive.jump.video/).
  • Interacted with various video editors to tune the output of the algorithm as per their needs.
Technologies: JavaScript, C++, Python, Computer Vision, Deep Learning

Senior Researcher and System Architect

2013 - 2016
LensBricks Inc. (Kiba)
  • Acted as product architect for a GStreamer-based solution for capturing and processing real-time feeds, supporting the following platforms: Intel NUC, Raspberry Pi, and a custom ARM-based board.
  • Built a processing pipeline consisting of audio keyword detection, scene highlights, heat map generation, depth estimation using a dual camera, live streaming to the iOS app, and timer-based video recording.
  • Generated scene depth using video captured with a stereo camera setup.
  • Created an algorithm for eye blink detection in C++. Used an SVM classifier for the classification of eyes.
  • Achieved real-time performance for eye blink detection by replacing redundant face detection with face tracking.
Technologies: Computer Vision, Image Processing, Machine Learning, Python, C++

Researcher

2012 - 2013
Nokia Research Center
  • Filed a patent for the optimization of a stereo algorithm using superpixels as part of a team of four researchers.
  • Developed Lightspeak, a technology for transmitting information using light.
  • Created CUDA-based optimization for a stereo algorithm.
Technologies: NVIDIA CUDA, C#, C++

Design Engineer

2010 - 2012
Texas Instruments
  • Worked on low-level camera software systems deployed on Android phones, which also interacted with hardware accelerators for real-time performance. This was a highly multi-threaded environment and used pipelining to meet performance requirements.
  • Worked at Google Android HQ and Huawei HQ to port camera software onto a Nexus phone.
  • Served as the primary developer for a feature to use front and back cameras simultaneously in 2012. Was part of the new chip bring-up team for OMAP5 processors.
  • Developed a system integrator for an algorithm for global and local brightness and contrast enhancement in real time during image and video capture.
Technologies: C++, C

Experience

Quality Analysis of Agricultural Products Using AI

https://agshift.com
Owner of the full AI stack for a commercial product, including training, data collection, pre- and post-processing, data cleanup, and data visualization.

Responsibilities:
• Segmentation and classification of 5 to 20 classes, depending on the commodity.
• Back-end framework for automated and parallelized training with varying datasets, learning rates, and key hyperparameters.
• React-based result visualization UI and its back end for comprehensive navigation of a large dataset.

AI Assistance Tool for Video Editors

https://yobi.ai
Software for video editors that enables them to find anything in a video within 30 seconds.

It utilizes facial recognition and scene and shot detection to accomplish the task. I designed it to work as a Premiere Pro extension.

Education

2005 - 2010

Master of Technology Degree in Communications and Signal Processing

IIT Bombay - Mumbai, India

2005 - 2010

Bachelor of Technology Degree in Electrical Engineering

IIT Bombay - Mumbai, India

Certifications

JUNE 2018 - PRESENT

Structuring Machine Learning Projects

Coursera

JUNE 2018 - PRESENT

Neural Networks and Deep Learning

Coursera

JUNE 2018 - PRESENT

Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and Optimization

Coursera

JUNE 2018 - PRESENT

Convolutional Neural Networks

Coursera

Skills

Libraries/APIs

OpenCV, Node.js, Keras, TensorFlow, PyTorch, Fast.ai, React

Tools

PyCharm, Eclipse IDE, Amazon Simple Queue Service (SQS), Sublime Text, Git, WebStorm, TensorFlow Serving

Languages

C++, Python, C, C#, JavaScript, Python 3

Platforms

Linux, AWS Lambda, Amazon EC2, macOS, Windows, NVIDIA CUDA, Windows Phone, iOS, Android, Amazon Web Services (AWS)

Frameworks

Electron

Paradigms

Back-end Architecture

Storage

Amazon S3 (AWS S3), MongoDB, MySQL

Other

Computer Vision, Image Processing, Multiprocessing, Full-stack, Architecture, Stereoscopic Video, Neural Networks, Deep Learning, Multithreading, Machine Learning
