
Kumar Vishal
Verified Expert in Engineering
Computer Vision Developer
Bengaluru, Karnataka, India
Toptal member since March 4, 2019
Kumar has a strong background in computer vision and image processing using both classical techniques and deep learning approaches. Some of the problems he has worked on are object segmentation, face recognition, scene classification, stereo vision, and motion estimation. He also has experience in full-stack development. Kumar is currently the CTO for a product using facial recognition and scene detection to improve the editing experience for video editors.
Experience
- Python - 6 years
- Computer Vision - 5 years
- Image Processing - 5 years
- C++ - 5 years
- OpenCV - 5 years
- Architecture - 4 years
- Full-stack - 4 years
- AWS Lambda - 2 years
Preferred Environment
Git, PyCharm, Sublime Text, Windows, MacOS, Linux
The most amazing...
...thing I've created is software to bring facial recognition and scene classification within the framework of professional video editing (https://www.jump.video).
Work Experience
Principal AI Engineer
AgShift Inc.
- Owned the full AI stack, including training, data collection, pre- and post-processing, data cleanup, and data visualization. I was the primary developer for the related codebase.
- Performed segmentation and classification of 5-20 classes depending upon the commodity.
- Built a back-end framework for automated and parallelized training with varying datasets, learning rates, and other key hyperparameters.
- Implemented a result visualization UI back end to navigate a large dataset comprehensively using React.
- Improved responsiveness by immediate loading of metadata, lazy loading, and dynamic generation of images.
- Added the logging of additional training statistics within the TensorBoard framework to enable analysis.
CTO
Yobi.ai
- Implemented the software architecture and back end for Jump software—https://jump.video/.
- Developed an algorithm to detect scene and shot changes.
- Designed unsupervised clustering of faces in a video using deep learning techniques.
- Created a YouTube Chrome plugin to mark interesting moments in a video—http://archive.jump.video/.
- Interacted with various video editors to tune the output of the algorithm as per their needs.
Senior Researcher and System Architect
LensBricks Inc. (Kiba)
- Acted as a product architect creating a solution for capturing and processing real-time feeds using GStreamer, supporting the following platforms: Intel NUC, Raspberry Pi, and a custom ARM-based board.
- Built a processing pipeline consisting of audio keyword detection, scene highlights, heat map generation, depth estimation using a dual camera, live streaming to the iOS app, and timer-based video recording.
- Generated scene depth using video captured with a stereo camera setup.
- Created an algorithm for eye blink detection in C++. Used an SVM classifier for the classification of eyes.
- Achieved real-time performance for eye blink detection by replacing redundant face detection with face tracking.
Researcher
Nokia Research Center
- Filed a patent for the optimization of a stereo algorithm using superpixels as part of a team of four researchers.
- Developed Lightspeak, a technology for transmitting information using light.
- Created CUDA-based optimization for a stereo algorithm.
Design Engineer
Texas Instruments
- Worked on low-level camera software systems deployed on Android phones, which also interacted with hardware accelerators for real-time performance. This was a highly multi-threaded environment and used pipelining to meet performance requirements.
- Worked at Google Android HQ and Huawei HQ to port camera software onto a Nexus phone.
- Served as the primary developer for a feature to use front and back cameras simultaneously in 2012. Was part of the new chip bring-up team for OMAP5 processors.
- Developed a system integrator for an algorithm for global and local brightness and contrast enhancement in real time during image and video capture.
Experience
Quality Analysis of Agricultural Products Using AI
https://agshift.com
Responsibilities:
• Segmentation and classification of 5-20 classes depending upon the commodity.
• Back-end framework for automated and parallelized training with a varying dataset, learning rates, and key hyperparameters.
• Implemented a result visualization UI back end to navigate a large dataset comprehensively using React.
AI Assistance Tool for Video Editors
https://yobi.ai
It utilizes facial recognition and scene and shot detection to accomplish the task. I designed it to work as a Premiere Pro extension.
Education
Master of Technology Degree in Communications and Signal Processing
IIT Bombay - Mumbai, India
Bachelor of Technology Degree in Electrical Engineering
IIT Bombay - Mumbai, India
Certifications
Structuring Machine Learning Projects
Coursera
Neural Networks and Deep Learning
Coursera
Improving Deep Neural Networks: Hyperparameter tuning, Regularization and Optimization
Coursera
Convolutional Neural Networks
Coursera
Skills
Libraries/APIs
OpenCV, Node.js, Keras, TensorFlow, PyTorch, Fast.ai, React
Tools
PyCharm, Eclipse IDE, Amazon Simple Queue Service (SQS), Sublime Text, Git, WebStorm, TensorFlow Serving
Languages
C++, Python, C, C#, JavaScript, Python 3
Platforms
Linux, AWS Lambda, Amazon EC2, MacOS, Windows, NVIDIA CUDA, Windows Phone, iOS, Android, Amazon Web Services (AWS)
Frameworks
Electron
Paradigms
Back-end Architecture
Storage
Amazon S3 (AWS S3), MongoDB, MySQL
Other
Computer Vision, Image Processing, Multiprocessing, Full-stack, Architecture, Stereoscopic Video, Neural Networks, Deep Learning, Multithreading, Machine Learning