Harsha H S, Software Developer in Cambridge, United Kingdom
Member since July 1, 2022
Harsha has nearly two decades of experience in software design, architecture, development, and testing across a broad spectrum, from low-level silicon validation, board bring-ups, and device drivers to scalable distributed databases, route planning for autonomous vehicles, and AI applications. He has worked at startups alongside co-founders to bring ideas to life and at MNCs with geographically distributed cross-functional teams. Harsha believes in "simplicity well-tested and -documented."
Harsha is now available for hire

Location

Cambridge, United Kingdom

Availability

Part-time

Preferred Environment

Linux, C++, C, Python, Distributed Systems, Hypervisors, Algorithms, PyTorch, Cryptography, Assembly

The most amazing...

...thing I've developed is a distributed and hybrid routing service for a fleet of autonomous robotaxis under various weather, traffic, and road constraints.

Employment

  • Designated Partner/Director

    2015 - PRESENT
    Order In Chaos Technology LLP
    • Implemented a customized key exchange protocol on Gemalto's hardware security module (HSM) using elliptic-curve cryptography.
    • Contributed to carving out the required compute and memory resources for workloads on a distributed PaaS for AI/machine learning applications, specifically in the middleware, similar to how Mesos abstracts compute, storage, and memory resources.
    • Integrated Transport Layer Security (TLS) on top of an HTTP server written using a proprietary eventing framework and asyncio.
    Technologies: Python, Asyncio, C++, C, Transport Layer Security (TLS), Mesos, Distributed Systems, HSM, PyTorch, Keras, TensorFlow, NumPy, Linux, Algorithms, Cryptography, Assembly, Functional Programming, SQLite, Agile Software Development, Complexity Theory, OpenMPI, CUDA, Open Source, Object-oriented Programming (OOP), Amazon EC2 (Amazon Elastic Compute Cloud), GPU Computing, SciPy, Pandas, Java, GIS, CMake, JavaScript, ARM, Graphics Processing Unit (GPU), BIOS, x64 Assembly, MIPS, AWS IoT, C#.NET, Amazon Web Services (AWS), MPI, Multiprocessing, JSON, Databases, Deep Neural Networks, Machine Learning, Linux Servers, NGINX, Cloud, Architecture, Embedded C++, GDB, Valgrind, Elliptic Curve Cryptography, Neural Networks, Statistics
  • Senior Kernel/Hypervisor and Robotics Engineer

    2019 - 2022
    Amazon UK
    • Developed a proof of concept for an NX network card emulation in QEMU to remove hardware dependency for the Xen on Nitro program.
    • Supervised a small team that extended the emulation to include TCP checksum and segmentation offloading.
    • Maintained the health of accelerated compute infrastructure on Amazon EC2.
    • Managed cross-functional stakeholders in hardware, kernel, hypervisor, and GPU compute teams.
    • Contributed to the overall quality improvement and upgrade of Xen fleets and, eventually, the migration of the Xen fleet to Xen on Nitro architecture.
    • Collected and stored metrics to understand robot behavior, improve the operational robustness of the Scout robots' Robot Operating System (ROS) middleware, and extend battery life.
    Technologies: Python, Robot Operating System (ROS), Xen, Linux Kernel, Quick EMUlator (QEMU), Amazon EC2 (Amazon Elastic Compute Cloud), Ruby, C++, C, GPU Computing, Linux, Hypervisors, Algorithms, Distributed Systems, Agile Software Development, Open Source, ARM, Graphics Processing Unit (GPU), AWS, Amazon Web Services (AWS), Multiprocessing, JSON, Amazon S3 (AWS S3), Linux Servers, Cloud, Architecture, Embedded C++, GDB, Valgrind
  • Senior Software Engineer

    2019 - 2019
    Untangle AI
    • Delivered products as a core founding team member, aimed at explainable AI for convolutional neural networks using signal estimation, uncertainty modeling, and concept extraction; packaged them into an SDK with Cython and integrated keygen licensing.
    • Designed an active learning process using uncertainty modeling to feed the right, limited data sets to a deep neural network, reducing training time without compromising accuracy.
    • Developed and delivered a back-end service using Tornado, MySQL, and the layer-wise relevance propagation algorithm to explain manufacturing failures, with the model trained using a long short-term memory (LSTM) recurrent neural network architecture.
    Technologies: PyTorch, Python, Cython, Kdb+, Q, MySQL, NumPy, SciPy, Pandas, Linux, Algorithms, Agile Software Development, CUDA, Object-oriented Programming (OOP), Amazon EC2 (Amazon Elastic Compute Cloud), GPU Computing, Graphics Processing Unit (GPU), AWS, JSON, Data Science, Deep Neural Networks, Machine Learning, Linux Servers, NGINX, Cloud, Architecture, GDB, Neural Networks
  • Senior Software/Research Engineer

    2017 - 2019
    Nutonomy
    • Designed, developed, and tested a fleet management system for thousands of autonomous taxis operating at city scale in Singapore, using dynamic shortest-path algorithms to route the taxis under various constraints.
    • Researched and created constrained assignment algorithms to reduce customer wait time and maximize robotaxi utilization, exploring topics such as ride-sharing and a hybrid mode of transport.
    • Developed a grid matching algorithm and worked on GeoJSON data to match customers to available robotaxis in constant time.
    • Scaled the service with a hybrid mode of motion planning that uses coarse graphs to reduce graph complexity.
    Technologies: C++, Java, Go, Python, Neo4j, GIS, Linux, Distributed Systems, SQLite, Agile Software Development, Complexity Theory, Operations Research, CUDA, Object-oriented Programming (OOP), Asyncio, Amazon EC2 (Amazon Elastic Compute Cloud), NumPy, SciPy, Pandas, CMake, JavaScript, AWS IoT, C#.NET, Conan, AWS, Multiprocessing, JSON, Data Science, Deep Neural Networks, Machine Learning, Linux Servers, C++17, Firmware over the Air (FOTA), Architecture, Embedded C++, GDB, Valgrind
  • Senior Software Engineer

    2014 - 2017
    Couchbase
    • Reduced network traffic overhead and improved indexing throughput with a solution designed and developed to optimize MapReduce indexes when document fields are unused.
    • Simplified horizontal scaling of eventing nodes by architecting a consensus-free sharding mechanism and mentoring the team that developed and implemented it.
    • Developed and open-sourced v8-inspector to debug embedded JavaScript applications with WebSocket server on the back end, integrating it with Chrome DevTools on the front end to provide debugging functionality for user-written events.
    • Applied various improvements and bug fixes in B+ tree implementation and MapReduce indexing using async networking primitives in Erlang.
    Technologies: Erlang, Go, Python, C, CMake, Distributed Systems, JavaScript, Google V8, Storage, Open Source, Linux, Algorithms, Functional Programming, Agile Software Development, Object-oriented Programming (OOP), Asyncio, Amazon EC2 (Amazon Elastic Compute Cloud), AWS, JSON, Databases, Linux Servers, Cloud, Architecture, GDB
  • Senior ASIC Engineer

    2012 - 2014
    NVIDIA
    • Developed microcode for the ARMv8 instruction set architecture (ISA) to gain out-of-order benefits with a custom in-order VLIW engine, fusing operations and optimizing hot code blocks using branch prediction performance metrics.
    • Created and maintained the code coverage infrastructure for a management translation software on the microcode engine and a software interpreter handling hypervisor exceptions and slow interrupt paths.
    • Improved the random instruction generator to uncover and fix bugs in the simulator and register-transfer level for ARMv8 ISA.
    Technologies: C++, C, CUDA, Assembly, ARM, Graphics Processing Unit (GPU), Linux, Algorithms, CMake, Firmware, Microcode, Bootloaders, MPI, Hardware Drivers, Device Drivers, Linux Servers, Architecture, Embedded C++, GDB
  • Graphics Software Engineer

    2009 - 2012
    Intel
    • Contributed to the board bring-up process of Ivy Bridge and a next-gen Intel processor as part of the Legacy Video BIOS team, catering to display interfaces such as HDMI, DisplayPort, VGA, and LCD.
    • Managed a small team of system admins, video basic input/output system (BIOS) experts, and testing officers to deliver an extended desktop feature in video BIOS for Acer's Iconia line of laptops, which had a dual display in place of a keyboard.
    • Collaborated with the system administration team to move our codebase from Rational ClearCase to Git and evangelized embracing Agile practices within the system and video BIOS teams.
    Technologies: BIOS, Firmware, x64 Assembly, C, Assembly, Bootloaders, MPI, Hardware Drivers, Device Drivers, Architecture, Embedded C++, GDB
  • System Technologist

    2008 - 2009
    Tandberg (Acquired by Cisco)
    • Developed a test framework for qualifying a hardware board that supports a 720p camera over USB and worked towards obtaining hardware certifications.
    • Created and implemented firmware upgrade mechanisms over USB and universal asynchronous receiver-transmitter (UART) interfaces.
    • Designed and developed direct memory access (DMA) drivers for a pulse-width modulation module to drive the bipolar stepper motor used for autofocus and exposure with a sinusoidal rather than trapezoidal waveform, reducing motor noise.
    Technologies: C, C++, ARM, USB, Linux, Assembly, Linux Kernel, Open Source, BIOS, Firmware, Bootloaders, Hardware Drivers, Device Drivers, Linux Servers, Embedded C++, GDB
  • Design Engineer 2

    2007 - 2008
    Montalvo Computer Systems India Pvt. Ltd.
    • Implemented microcode for an in-order VLIW machine to mimic an x86_64 architecture's hardware task switching mechanism by saving and restoring register files, the system management mode, and various interrupt paths based on control register entries.
    • Added diagnostics for various FP, SSE, and MMX instructions crossing with control register sensitivities and x86 modes, implementing identical behavior in microcode.
    • Contributed to enhancing various simulator features.
    Technologies: Assembler x86, Microcode, C++, C, Linux, Algorithms, Assembly, Quick EMUlator (QEMU), BIOS, Firmware, x64 Assembly, Bootloaders, MPI, Device Drivers, Embedded C++, GDB
  • Software Engineer

    2003 - 2007
    RMI Corporation (Formerly Raza Microelectronics, Inc.)
    • Developed a flash mode for an XLR MIPS simulator to simulate the boot process from flash.
    • Collaborated with the team to develop a comprehensive benchmark and stress test suite for testing XLR processors.
    • Created a GDB stub and core dump utility in the bootloader to debug crashes of multithreaded applications.
    • Ported OpenSSL to bypass the software cryptographic algorithms and use hardware accelerators.
    • Complied with Federal Information Processing Standards (FIPS) for the HSM that was part of the XLR system on a chip.
    Technologies: C++, C, Firmware, MIPS, OpenSSH, OpenSSL, HSM, Linux, Algorithms, Cryptography, Assembly, Linux Kernel, Open Source, BIOS, Bootloaders, Hardware Drivers, Device Drivers, Linux Servers, Embedded C++, GDB, Elliptic Curve Cryptography

Experience

  • Routing and Assignment Microservices for Robotaxis

    A distributed hybrid routing engine to route thousands of robotaxis across Singapore at city scale.

    The project involved graph pruning for scalability and developing a parallel dynamic shortest-path graph algorithm for the coarse-grained routing of taxis from source to destination. A fine-grained map was also downloaded onto the taxi on demand, enabling the robotaxi to perform motion planning, obstruction avoidance, and lane switching. The final goal was to provide customers with an optimal assignment service under various constraints.

    I was involved with:

    • Surveying state-of-the-art shortest-path algorithms, selecting the Ramalingam-Reps algorithm, and implementing the back end in C++.
    • Developing a simple HTTP server in C++ to receive various goal positions, traffic conditions, and other constraints to be applied to the road network graph.
    • Creating an extract, transform, and load (ETL) pipeline to scrape GeoJSON data from OpenStreetMap APIs and remove the fine-grained Uni node to get a scalable graph for the cities in which the robotaxis operated.
    • Implementing assignment microservices using operations research techniques to perform constrained optimization.
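
    The constant-time matching of customers to nearby robotaxis can be illustrated with a uniform grid index. The sketch below is an illustration only, not the production code: it assumes planar coordinates in metres (not raw latitude/longitude), a bounded number of taxis per cell, and a search limited to the customer's cell and its eight neighbours. The class name `GridMatcher`, its methods, and the taxi IDs are hypothetical.

```python
import math
from collections import defaultdict


class GridMatcher:
    """Uniform grid index: each taxi is stored in the bucket of the cell it occupies."""

    def __init__(self, cell_size: float):
        self.cell_size = cell_size
        self._cells = defaultdict(dict)  # (col, row) -> {taxi_id: (x, y)}
        self._where = {}                 # taxi_id -> (col, row)

    def _cell(self, x: float, y: float):
        return (math.floor(x / self.cell_size), math.floor(y / self.cell_size))

    def update(self, taxi_id: str, x: float, y: float) -> None:
        """Insert a taxi or move it to a new position."""
        self.remove(taxi_id)
        cell = self._cell(x, y)
        self._cells[cell][taxi_id] = (x, y)
        self._where[taxi_id] = cell

    def remove(self, taxi_id: str) -> None:
        """Drop a taxi from the index (no-op if it is unknown)."""
        cell = self._where.pop(taxi_id, None)
        if cell is not None:
            del self._cells[cell][taxi_id]

    def nearest(self, x: float, y: float):
        """Return the closest taxi in the 3x3 block of cells around (x, y), or None."""
        col, row = self._cell(x, y)
        best_id, best_dist = None, math.inf
        for d_col in (-1, 0, 1):
            for d_row in (-1, 0, 1):
                bucket = self._cells.get((col + d_col, row + d_row), {})
                for taxi_id, (tx, ty) in bucket.items():
                    dist = math.hypot(tx - x, ty - y)
                    if dist < best_dist:
                        best_id, best_dist = taxi_id, dist
        return best_id
```

    Each lookup inspects at most nine buckets, so its cost does not grow with fleet size as long as cell occupancy stays bounded. A taxi found within one cell width is the true nearest; beyond that, a caller would need to widen the search.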

  • Distributed Eventing Framework for Couchbase Events

    An eventing framework for Couchbase, a distributed key-value JSON document store supporting 100,000 operations per second.

    The eventing framework was a post-trigger mechanism to hook in user-specified functions or operations on every database event. It also let users attach functionality to non-database events such as timers.

    I was primarily involved with architecting a consensus-free sharding mechanism that lets each eventing node claim ownership of a set of shards without leader election, simplifying the design to a large extent.

    I was also involved with various integrations, including:

    • The Google V8 engine to parse and compile user-supplied functionality using JavaScript functions.
    • The v8-inspector on the back end and Chrome DevTools on the front end to debug user functions, later open-sourced so that it can debug any embedded JavaScript application.
    • The eventing functionality with the Couchbase multi-dimension scaling paradigm, scaling each service independently depending on its workload.
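
    One way nodes can agree on shard ownership without a leader is for each node to compute the same deterministic assignment from a shared membership list. The sketch below uses rendezvous (highest-random-weight) hashing as a stand-in technique; the profile does not state which scheme the eventing service actually used. The function names, node IDs, and shard count are hypothetical, and the sketch assumes every node already sees the same list of members.

```python
import hashlib


def _score(node: str, shard: int) -> int:
    """Deterministic pseudo-random weight for a (node, shard) pair."""
    digest = hashlib.sha256(f"{node}:{shard}".encode()).digest()
    return int.from_bytes(digest[:8], "big")


def owner(shard: int, nodes: list[str]) -> str:
    """The node with the highest weight owns the shard; node name breaks ties."""
    return max(nodes, key=lambda node: (_score(node, shard), node))


def shards_owned_by(node: str, nodes: list[str], num_shards: int) -> list[int]:
    """Shards that `node` claims, computed locally without messaging peers."""
    return [shard for shard in range(num_shards) if owner(shard, nodes) == node]


if __name__ == "__main__":
    members = ["node-a", "node-b", "node-c"]
    for member in members:
        print(member, len(shards_owned_by(member, members, 64)))
```

    Because ownership depends only on the membership list, removing a node moves only the shards that node owned; the remaining assignments stay where they were.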

  • Hardware Security Modules and Cryptography for RMI

    I was involved with various cryptographic algorithms in hardware and software for various HSMs at RMI Corporation.

    I worked towards getting a FIPS certification for HSMs. I also developed OpenSSL and OpenSSH integrations and drivers and implemented an elliptic-curve key exchange algorithm for the Gemalto HSM.

Skills

  • Languages

    C++, C, Python, x64 Assembly, Microcode, Embedded C++, Assembly, Erlang, Go, Java, JavaScript, MIPS, C#.NET, Ruby, Q, Lua, C++17
  • Libraries/APIs

    OpenSSL, MPI, PyTorch, NumPy, SciPy, Pandas, Keras, Asyncio, TensorFlow
  • Tools

    CMake, OpenSSH, GDB, GIS, NGINX, Valgrind, Mesos, Conan
  • Platforms

    Linux, Amazon Web Services (AWS), CUDA, Xen, Quick EMUlator (QEMU), Amazon EC2 (Amazon Elastic Compute Cloud), AWS IoT
  • Other

    Distributed Systems, Hypervisors, Algorithms, Linux Kernel, Open Source, HSM, ARM, BIOS, Firmware, Bootloaders, AWS, Multithreading, Multiprocessing, Deep Neural Networks, Machine Learning, Linux Servers, Cloud, Architecture, Cryptography, Complexity Theory, Operations Research, OpenMPI, GPU Computing, Cython, Google V8, Graphics Processing Unit (GPU), Boost.Asio, Internet of Things (IoT), Compilers, Hardware Drivers, Device Drivers, Elliptic Curve Cryptography, Neural Networks, Transport Layer Security (TLS), Robot Operating System (ROS), Storage, USB, Groovy Scripting, Firmware over the Air (FOTA), Statistics
  • Paradigms

    Agile Software Development, Object-oriented Programming (OOP), Data Science, Functional Programming
  • Storage

    JSON, Databases, Amazon S3 (AWS S3), MySQL, SQLite, Kdb+, Neo4j

Education

  • Bachelor's Degree in Computer Science and Engineering
    1999 - 2003
    Bapuji Institute of Engineering and Technology - Davanagere, Karnataka, India
