Software Engineer
2015 - PRESENT
Facebook
- Created an E2E machine-learning pipeline for object clustering at scale (more than 1 billion items): created the training sets, calculated and analyzed the features, built and produced the classifiers, implemented 72 input features, and set up a boosted decision tree classifier to detect object similarity.
- Designed and implemented a pipeline for hierarchical object classification, including text and image input features, that handled more than 100 million training samples (more than 40 billion items classified in total) and more than 1,000 hierarchical classes. Achieved the desired precision and recall scores.
- Developed Scribe, a large-scale open-source logging system that delivers more than 1 TB/s of logs; also improved the reliability of the system by implementing failover and an E2E testing framework.
- Performed more than 50 coding and system design interviews.
- Supervised five engineers and three projects.
Technologies: Linux, Machine Learning, Artificial Intelligence (AI), MySQL, Apache Hive, Presto DB, Python, C++, PHP

Software Engineer
2013 - 2015
Mail.ru Group
- Implemented the core part of revisioned metadata storage (C/C++) for a distributed cloud-based file system that supports shared folders. This storage serves 1.2 billion hits per day using 28 machines.
- Created a WebDAV interface for a cloud-based file system.
Technologies: Linux, C++, C

Software Engineer
2012 - 2013
CocCoc Search Engine
- Created a service (Java, C++) that analyzes more than 100 GB of users' HTTP requests daily and extracts statistics that are useful for ranking documents in the searcher.
- Implemented document-ranking factors based on user behavior in the searcher: anchor popularity, SERP click popularity, and site popularity. These factors showed high importance based on the PFound metric.
- Reduced the Java heap size of the searcher by 60% by using unmanaged memory.
Technologies: Linux, Java

Software Engineer
2008 - 2011
Yandex Corporation
- Built a searcher (C++) with low latency between indexing a document and making it searchable. It was optimized for personalized data using compound keys and sharding, and it maintained high performance with 1 TB indices.
- Deployed the Yandex search engine in an email service.
- Interviewed more than ten C++ developers and formed and supervised a team of two new developers.
Technologies: Python, Linux, C++