C++ Developer in New York, NY, United States
Data Scientist, 2018 - 2018, 180 by Two (via Toptal)
Technologies: Spark, Hadoop, Python, Azure
- Built a geo-attribution system for a big location dataset.
- Developed algorithms for geo-attribution cleansing and verification.
- Provided guidelines for geographical data specification using the OpenStreetMap interface.
Data Scientist, 2016 - 2018, SteppeChange
Technologies: Python, TensorFlow, Hadoop, C++
- Developed customer churn models using historical data with Hadoop, Python, and TensorFlow.
- Improved the churn model performance by 25% using mobile network social data.
- Built a user-segmentation pipeline based on mobile network historical records using the Spark infrastructure.
- Created a chatbot ecosystem designed for easy customization and straightforward integration of customer data.
- Built a 95% accurate gesture recognition pipeline for wearable electronics with TensorFlow.
Data Scientist, 2013 - 2016, Radiumone
Technologies: Hadoop, Hive, C++, Python, CUDA, jQuery
- Measured the effectiveness of mobile ad campaigns using geolocation data from hundreds of millions of mobile devices over the campaign's duration (Hadoop, Hive, and Python).
- Built competitor advertising segments for a major U.S. airline using the terminals' geolocation data.
- Reduced media expenses by 5% by developing a high-cost media filtering system using deep learning techniques.
- Designed and implemented a distributed, real-time, GPU-powered time series database.
- Designed and implemented a set of tools for processing and visualizing large geographical datasets (C++, CUDA, PHP, and jQuery).
- Reduced content classification costs by 90% by developing a classification pipeline for identifying content likely to become popular.
- Developed a model for social data sharing, increasing performance by over 100% for selected audiences.
Software Architect, 2010 - 2012, Doctorsoft
- Gathered the initial requirements and created the application architecture by taking into account the existing restrictions.
- Estimated the costs for running the application in Amazon Cloud and for the scaling process.
- Worked on the HIPAA certification, demonstrating that the use of the Amazon technology stack would meet the requirements.
- Implemented an integration with an electronic prescribing service provider (eRx).
- Mobile Customer Segmentation Process (Development)
I built a customer segmentation process on historical mobile communications data. For this, I used Spark, Hadoop, manual feature engineering, self-organizing maps, and k-means clustering.
- Mobile Ad Campaign Effectiveness (Development)
I implemented a framework for measuring the effectiveness of a mobile ad campaign based on geographical data gathered from mobile devices. Here, I mainly used Hadoop, Hive, and OpenStreetMap data.
- Advertisement Targeting for the Customers of Rival Major Airlines (Development)
I developed a process for the identification of passengers loyal to a major US airline's competitors and facilitated the advertisement delivery to such people. For this project, I used a variety of technologies: Hadoop, Hive, advertisement historical data, US airport geographical locations, flight schedules, and more.
- Customer Journey Analytics (Development)
I created a set of tools for customer journey analytics on behalf of an online retailer with a customer base of approximately 20 million. The goal was to provide analysts with a convenient and painless visualization of individual customer history as well as an aggregate view of a subset of customers. Here, I mainly used Hadoop, Hive, MySQL, Python, and jQuery.
- Conversion Funnel Steps Prediction (Development)
I built a process for predicting the future conversion funnel steps of an online retailer's customers. The funnel consisted of the conversion sequence starting with a product page view and ending with a product purchase. I chiefly used Python and TensorFlow.
- Chatbot Development Suite (Development)
I built a chatbot infrastructure for SteppeChange which consisted of a dialog definition module, a chatbot runtime, and a number of back-end adapters.
• The dialog definition module gave the end user the means to define a conversation as a flow diagram.
• The chatbot runtime extended the flow functionality by means of Python callbacks.
• The back-end adapters allowed selection among different NLP providers, such as IBM Watson, AWS Lex, and Microsoft's Text Analytics API.
• The system was also capable of ingesting proprietary data, such as CRM records or a product catalogue, and augmenting the NLP accordingly.
Frameworks: AWS EMR, Spark, Hadoop
Libraries/APIs: Keras, jQuery, Stanford NLP, TensorFlow, AWS EC2 API, Pandas, NumPy, Microsoft Cognitive Services, SciPy, Node.js
Tools: Git, IBM Watson, Amazon Lex
Platforms: Jupyter Notebook, Linux, AWS EC2, CUDA, AWS Lambda
Storage: AWS RDS, AWS S3, Apache Hive, Redis, NoSQL, AWS DynamoDB
Other: Convolutional Neural Networks, Azure Data Lake, Big Data, Recurrent Neural Networks, Neural Networks, Analytics, Natural Language Processing (NLP), Deep Neural Networks, Data Visualization, Deep Reinforcement Learning, Reinforcement Learning, Big Data Architecture, R-trees, Geospatial Data, Chatbots
- Master of Science degree in Computer Science, 1991 - 1996, Peter the Great St. Petersburg Polytechnic University - Saint Petersburg, Russia
- Private Pilot, August 2016 - Present, FAA | Federal Aviation Administration