Software Engineer and Architect
2014 - PRESENT
- Built a web-crawler that scans more than 3 billion pages daily, a 100x improvement over the previous system.
- Implemented a highly efficient HTML parser.
- Built a robust custom distributed data store for the inverse link graph, supporting very high write/read throughput. It currently stores 200 billion unique links found on 30 billion crawled pages across 400 million websites.
- Built an HTTP API to access the link graph database.
- Built a service that serves real-time ranked suggestions as users type their website URL, handling up to 100K QPS. Uses compact prefix arrays with pre-built result sets, multi-threaded reading, and lock-free data structures for live updates.
- Built a rating of websites based on the number of linking IP addresses, with filtering by user input. Uses compact suffix arrays for fast filtering, multi-threaded reading, and lock-free data structures for live updates.
- Re-built an API to serve reports on website search engine rankings; implemented an asynchronous MySQL client library to issue concurrent requests to many servers.
- Built a new storage system for website search engine rankings, scaling up to 10 billion records with monthly history.
- Built a service to aggregate various counters from web-crawler nodes, storing data on 10 billion domain names.
- Built an HTTP API to access the search engine rankings database.
- All of the above were implemented with a focus on cost-efficiency: the hardware costs of running these services were negligible.
Technologies: C++11, Golang, MySQL, ZeroMQ, protobuf, libcurl, c-ares
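The suggestion service above can be illustrated with a minimal sketch: ranked prefix lookup over a sorted compact array. The struct, field names, and ranking scheme here are illustrative assumptions, not the production design (which adds pre-built result sets, multi-threaded readers, and lock-free live updates).

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Illustrative entry: a domain plus a precomputed rank score (assumed names).
struct Entry { std::string domain; int rank; };

// Return up to k suggestions matching `prefix`, best rank first.
// `sorted` must be ordered lexicographically by domain.
std::vector<std::string> suggest(const std::vector<Entry>& sorted,
                                 const std::string& prefix, size_t k) {
    // Binary-search the first entry >= prefix, then scan while it matches.
    auto lo = std::lower_bound(sorted.begin(), sorted.end(), prefix,
        [](const Entry& e, const std::string& p) { return e.domain < p; });
    std::vector<Entry> hits;
    for (auto it = lo; it != sorted.end() &&
         it->domain.compare(0, prefix.size(), prefix) == 0; ++it)
        hits.push_back(*it);
    // Rank by descending score and keep the top k.
    std::sort(hits.begin(), hits.end(),
              [](const Entry& a, const Entry& b) { return a.rank > b.rank; });
    if (hits.size() > k) hits.resize(k);
    std::vector<std::string> out;
    for (const auto& h : hits) out.push_back(h.domain);
    return out;
}
```

The sorted-array layout keeps the index compact and cache-friendly; concurrent readers need no locks because lookups never mutate it.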
2013 - 2014
- Created a high-quality rendering back-end for the 3D layer of a world map, using OpenGL, NVIDIA Iray, and NVIDIA OptiX.
- Created a tool to preview 3D models and position them on a world map, using OpenGL and Qt.
- Made supplementary tools to work with the OpenStreetMap data format (C++, XML).
- Set up a 2D world map rendering back-end using Mapnik, PostgreSQL, mod_tile, Apache, and MapQuest styles.
- Extended the PROJ.4 library with support for an isometric projection, integrating it into Mapnik and other tools.
Technologies: C, C++, Qt, OpenGL, NVIDIA Iray, NVIDIA OptiX, SQLite, MySQL, PostgreSQL, Windows, Linux, OpenStreetMap, Mapnik, osm2pgsql, PROJ.4, XML, Apache
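The core planar math behind an isometric (axonometric) projection like the one mentioned above can be sketched in a few lines; the 30-degree axis angle and function name are assumptions for illustration, not the actual PROJ.4 patch.

```cpp
#include <cmath>

// Illustrative sketch: project 2D map coordinates (u, v) onto screen axes
// sheared at 30 degrees, the classic isometric layout. The angle constant
// is an assumption, not taken from the real implementation.
struct XY { double x, y; };

XY isometric(double u, double v) {
    const double a = 3.14159265358979323846 / 6.0;  // 30 degrees in radians
    return { (u - v) * std::cos(a),   // horizontal screen coordinate
             (u + v) * std::sin(a) }; // vertical screen coordinate
}
```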
2007 - 2013
- Created a distributed cache for a proprietary NoSQL database to deliver data closer to processing nodes on big clusters (2013, ZeroMQ, C, C++).
- Developed a distributed textual search engine for a proprietary NoSQL database, including highly scalable distributed sorting (2012-2013, ZeroMQ, ICU, C, C++).
- Converted the national fingerprint database of Turkey from five different data formats. Set up a month-long project for automated processing of the data on a 300-node cluster (2013, C, Oracle, MS SQL, XML, PHP).
- Created a decision engine for the criminal division of the biometric passport system of Uzbekistan, integrating three biometric systems into a single solution (2011-2012, Oracle, C, distributed transactions).
- Refactored a back-end component that aggregated fingerprint search results coming from cluster nodes. Reduced processing time by 10x. Documented the algorithms and wrote the code to be maintainable (2012, C).
- Created a GUI application to manage data distribution between cluster nodes (2012, C++, Qt).
- Created a GUI application to control AFIS search processes distributed between cluster nodes (2012, C++, Qt).
- Built a library to perform multi-criteria evaluation of AFIS search results (2010, C).
- Designed a database to perform multi-dimensional analysis of daily logs coming from hundreds of AFIS installations. Built an automatic process to import logs into the database (2009, MySQL, C).
- Added support for customizable forms into a textual CRUD application. Refactored the code. Used static analysis to find and remove obsolete code paths (2008, C).
- Created a viewer for various formats of biometric data, including NIST, EFTS, Papillon, and Interpol (2007, C++, Qt).