Dmitri Ivanovich Arkhipov, Developer in Irvine, CA, United States

Dmitri Ivanovich Arkhipov

Verified Expert in Engineering

Python Developer

Location
Irvine, CA, United States
Toptal Member Since
January 23, 2017

Dmitri holds a PhD in computer science from UC Irvine and has been involved in tech as a student, freelancer, intern, or employee for over 15 years. He works primarily in Unix/Linux ecosystems, where he has developed programs in Python, Java, Scala, C, C++, Perl, JavaScript, and several other languages. His most recent experience has been with Python and JavaScript, but he's willing to adapt.

Portfolio

Anthem
Django, Kubernetes, Python, NGINX, REST
Ticketmaster, Inc.
JDBC, Machine-to-Machine (M2M), Apache Commons, Bash Script, Java 8, Databases...
ChromaCode, Inc.
Amazon Web Services (AWS), AWS Lambda, Tcl, PyDev, Functional Programming...

Availability

Full-time

Preferred Environment

Ubuntu Linux, Apache Maven, Git, Eclipse, Vim Text Editor, CentOS, Red Hat Linux

The most amazing...

...project I've worked on was my dissertation on optimally choosing ad request orders to maximize expected return given a time threshold (bounded sub-optimal).

Work Experience

Full-stack Software Engineer | Solutions Engineer Executive Advisor

2020 - PRESENT
Anthem
  • Developed, tested, and integrated back-end API components for TeleHealth OS, an Anthem project.
  • Developed code to connect inbound and outbound phone calls to online video chat rooms.
  • Expanded the back-end data model, extended back-end functionality, and applied DB migrations.
  • Integrated ICS calendar file event tracking interoperable with all major mail/calendar software.
  • Implemented real-time event notifications via WebSockets and server-sent events (HTTP/2).
Technologies: Django, Kubernetes, Python, NGINX, REST

Senior Optimization Scientist

2019 - 2020
Ticketmaster, Inc.
  • Developed and deployed machine learning models for the PriceMaster dynamic pricing optimization engine. The engine returns on-demand dynamic ticket price recommendations for live event ticket pricing.
  • Contributed to a B2B application used by event and venue managers.
  • Built and used an ETL data pipeline along with model-building, result-evaluation, and storage pipelines.
Technologies: JDBC, Machine-to-Machine (M2M), Apache Commons, Bash Script, Java 8, Databases, Unix, Linux, Artificial Intelligence (AI), Java 7, Scrapy, Domo, Docker, MySQL, StatsModels, Scikit-learn, Pandas, Amazon EC2, Amazon SageMaker, JUnit, Java, MATLAB, Bash, Python 3, GitLab CI/CD

Research Engineer

2017 - 2019
ChromaCode, Inc.
  • Designed, developed, tested, validated, and deployed classification and regression algorithms for qPCR and ddPCR medical diagnostic tests.
  • Developed, packaged, versioned, continuously integrated, and released internal- and external-facing research and publication tools.
  • Wrote code from scratch and transformed it into cohesive, verifiably tested, and carefully documented modules, packaged and stored in a private repository.
  • Guided the deployment and release of these modules as tested Docker container constellations running on AWS ECS.
  • Debugged and expanded features of existing modules, taking each change through the same phases described above for new modules.
  • Developed the front end in React and Node.js to display the machine learning data to the user.
  • Tested the features in Cucumber and implemented unit tests in Mocha.
  • Provisioned (using Ansible) a malware scanning-and-alerting system for file uploads to Amazon S3; file upload events trigger a malware-scanning AWS Lambda execution and, if a virus is detected, an AWS SNS notification.
Technologies: Amazon Web Services (AWS), AWS Lambda, Tcl, PyDev, Functional Programming, Flask-RESTful, REST, Apache Commons, Bash Script, PostgreSQL, Databases, Linux, React, Node.js, JavaScript, Ansible, Docker Compose, Docker, PyPI, Nexus, Flask, Matplotlib, Scikit-learn, Pandas, PIP, Conda, Tkinter, Tcl/Tk, Bash, Python, Git

Data Engineer

2017 - 2018
Formation, Inc.
  • Developed periodic database unloads and ETL transforms for a client's data science ingestion. The software launched multiterabyte-sized data dumps daily to unload data from Amazon Redshift to S3.
  • Implemented PySpark jobs and ran them on AWS EMR with Hive and HDFS (running on AWS S3) to perform complex ETL queries on data in Redshift.
  • Packaged ETL code into Docker containers stored in AWS ECR, executed in ECS, and triggered by cron events. Docker containers were orchestrated with Docker Compose.
  • Enabled continuous integration (CI) support with CircleCI.
  • Ensured that ETL delivery consistently took no more than 1.5 hours; previously, it took too long due to forced unloads.
Technologies: AWS Lambda, Functional Programming, Docker Compose, REST, Event-driven Programming, Bash Script, Apache Hive, Apache Spark, Pandas, Databases, Linux, Bash, Python, PySpark, Spark, ECS, EMR, Amazon EC2, Amazon S3 (AWS S3)

Postdoctoral Researcher

2016 - 2016
Donald Bren School of Information and Computer Sciences | University of California, Irvine
  • Contributed to ongoing research on combinatorial optimization in online advertising.
  • Created novel concurrency and interprocess synchronization methods for parallel and distributed computing.
  • Developed batch-size selections for stochastic gradient descent.
  • Designed spot market clearing mechanisms for transportation spot markets.
  • Built an agent-based framework for general distributed computation.
Technologies: Bash Script, Linux, Algorithms, Mathematical Modeling, Bash, Java, Python

Graduate Student Researcher | Teaching Assistant

2010 - 2016
Donald Bren School of Information and Computer Sciences | University of California, Irvine
  • Worked on several optimization and parallel execution-themed papers.
  • Participated in several smart city projects with researchers in China and the UK.
  • Developed a mathematical model and analysis of sequential ad polling. Implemented an algorithm using the model that guaranteed bounded sub-optimality (with respect to the expected value given a time budget).
  • Assisted in introductory computer programming and discrete mathematics courses. In particular, I have worked on introductory and intermediate Java, C, and Python courses.
  • Performed significant research on thread and process synchronization and low-level inter-thread message passing constructs.
  • Wrote conference papers, journal papers, and a dissertation with research findings.
Technologies: Bash Script, Linux, C, Bash, Android, Java, Python, LaTeX

Advanced Engineering Intern

2015 - 2015
CalAmp Wireless, Inc.
  • Designed, planned, implemented, and tested a new release of the company’s (M2M/MRM) cloud infrastructure back end.
  • Ensured that the release was fault-tolerant, recoverable, and scalable through integration with Amazon Web Services (AWS), and that message-triggered execution was dynamically reconfigurable.
  • Implemented and A/B tested solutions based on AWS Kinesis and AWS Simple Queue Service (SQS) to persist and load-balance incoming traffic.
  • Worked on a dynamic message router implemented for direct integration into a Spring Integration environment.
  • Developed the router component, a meta-construction that lets each Spring Integration component determine the next procedural step to execute on the current step's output and then routes the output to that step.
Technologies: Distributed Computing, Amazon Web Services (AWS), JDBC, Parallel Computing, Eclipse IDE, Machine-to-Machine (M2M), Eclipse CDT, Apache Commons, Spring JDBC, Spring Data JPA, Apache Tomcat, Jakarta Server Pages (JSP), Linux, Distributed Systems, Spring Boot, Amazon Simple Queue Service (SQS), Amazon Kinesis, Spring Integration, Spring, JUnit, Java

Software Engineering Intern

2014 - 2014
Adaptive Medias, Inc.
  • Developed a heuristic optimization solution for the ad-ordering problem written in Java. The solution was deployed into production for determining advertising waterfalls for users.
  • Wrote a RESTful web application in Python Flask to accept web-domain URLs and classify (using MALLET) the semantic content into IAB tier 1 categories. This was deployed into production as a web service and was the central goal of a sprint.
  • Worked in Scala with the Cloudera package of Hadoop to write Spark code to perform log aggregation. Terabytes of event logs relating to customer interactions were condensed into lifecycle objects and written to disk.
  • Ensured that each of these three projects was released into production as a web microservice, and that each was developed as a Git branch and successfully incorporated into the development process and, subsequently, the operating branch.
  • As part of the classification work, developed and compared alternative mathematical models for classification with a team.
Technologies: Hortonworks Data Platform (HDP), Apache Solr, Flask-RESTful, Apache Lucene, Event-driven Programming, Apache Commons, Bash, Bash Script, Scala IDE, Cassandra, Linux, Apache Cassandra, Artificial Intelligence (AI), MALLET, Machine Learning, RESTful Web Services, Cloudera, Apache Hive, Hadoop, Scala, Java, Flask, Python

Web Application Developer

2010 - 2010
Intellisurvey, Inc.
  • Designed and implemented new features and enhancements to the Intellisurvey software (in Perl and mod_perl).
  • Found and resolved software defects, handling debugging and feature creation through an in-house ticketing system.
  • Assisted in the development of the Intellisurvey infrastructure.
  • Created tools to aid in development, testing, and systems administration.
  • Provided technical support to internal software users and clients who use licensed Intellisurvey software tools.
  • Added UI features in JavaScript to components of the web system's front end.
Technologies: Bash Script, Bash, CSS, HTML, JavaScript, Mod_perl, Perl

Research Assistant

2009 - 2010
University of California, Irvine
  • Installed, configured, and populated a PostgreSQL database with heterogeneous data sources; also developed a JSP front end.
  • Linked data sources on the basis of common fields and meta information.
  • Optimized MATLAB algorithms for dynamic network flow optimization.
  • Wrote C functions called from the MEX interface of MATLAB.
  • Wrote a highly optimized implementation of Dijkstra's shortest-path algorithm in C and bound it in MATLAB; it was over 100x faster than the MATLAB equivalent.
Technologies: Bash Script, Linux, Python, MATLAB, Jakarta Server Pages (JSP), Unix, C, Bash, JavaScript, PostgreSQL

Spring Dynamic Message Processor

https://github.com/darkhipo/SpringDynamicMessageProcessor
A Maven-built Spring Boot and Spring Integration service that executes dynamically determined processing-stage paths.

Spring Integration is a framework for developing message-passing systems. The framework does not include a dynamic router that chooses the next hop based on the output of the current function; this project implements that component.

Evaluating the Value of Social Media Sentiment

https://github.com/darkhipo/unclean_sentiments
I built this command-line interface (CLI) to access data from a data vendor. The vendor supplies a daily file that reports, for each stock ticker, how many people tweeted about that ticker and how many of those tweets were positive.

I made sure that the data was conveniently accessible for evaluation. The data is cleaned, ingested into SQLite using Pandas, and made available via a CLI.

Buffered Sort in Kinesis

https://github.com/darkhipo/kinesis-two-phase-sort
This tool pulls records from a Kinesis stream every N seconds, sorts the buffered batch, and then pushes it to a new stream. Amazon Kinesis Data Streams collects and processes large streams of data records in real time.

Kinesis Data Streams is used to create data-processing applications, known as Kinesis Data Streams applications. A typical Kinesis Data Streams application reads data from a data stream as data records.
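
The buffering idea can be sketched without AWS at all. The snippet below is a minimal illustration, not the project's actual code: it assumes a hypothetical record shape of (arrival_time, event_time, payload), groups records into arrival windows of N seconds, and emits each window sorted by event time. In a real deployment, the input would come from polling a Kinesis stream and each batch would be written to the output stream.

```python
from typing import Iterable, Iterator, List, Tuple

# (arrival_time, event_time, payload): a hypothetical record shape for this sketch.
Record = Tuple[float, float, str]


def buffered_sort(records: Iterable[Record], window_seconds: float) -> Iterator[List[Record]]:
    """Yield one batch per arrival window, each batch sorted by event time."""
    buffer: List[Record] = []
    window_end = None
    for record in records:  # records are assumed to be in arrival order
        arrival_time = record[0]
        if window_end is None:
            window_end = arrival_time + window_seconds
        # Flush every window that closed before this record arrived.
        while arrival_time >= window_end:
            if buffer:
                yield sorted(buffer, key=lambda r: r[1])
                buffer = []
            window_end += window_seconds
        buffer.append(record)
    if buffer:  # flush whatever is left when the input ends
        yield sorted(buffer, key=lambda r: r[1])


incoming = [
    (0.0, 5.0, "a"),
    (1.0, 3.0, "b"),
    (2.5, 4.0, "c"),
    (3.0, 1.0, "d"),
    (7.0, 2.0, "e"),
]
batches = [[payload for _, _, payload in batch] for batch in buffered_sort(incoming, 2.0)]
print(batches)  # [['b', 'a'], ['d', 'c'], ['e']]
```

Records are only ordered within a window, so a larger N gives better ordering at the cost of higher latency.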

Python Flask Document Categorizer Microservice

Online advertisers are concerned with the Interactive Advertising Bureau (IAB) categorizations. These categorizations represent themes or topics and can apply broadly to websites, advertisements, or other web data objects.

I developed and deployed a RESTful microservice using Flask that trains and updates classifiers from web data and classifies any document into a normalized vector representing the document's thematic breakdown across IAB categories. The service used Flask as the REST service layer, the multiprocessing module for process control, MALLET (Mallet.cs.umass.edu/) for classification, Beautiful Soup for data preprocessing, and Nutch/Solr for crawling and indexing web data.

Ad Ordering with Python

Determining the next ad producer to solicit for a particular ad impression is not trivial. A good algorithm will consider many factors; the two most immediately noticeable factors are revenue for the publisher (website hosting the advertisement) and the time to fill the ad slot.

In my work, I've developed a suite of algorithms for solving this problem under different assumptions. A version of this work deployed in production used data aggregation over terabytes of log data to build distributions representing how long a solicitation will take. The deployment used Flask as a RESTful microservice layer, with Python and SciPy for the main algorithms. More advanced research solutions use real-time dynamic programming to achieve solutions with bounded suboptimality.
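
To make the trade-off between revenue and time concrete, here is a toy version of the problem. It is a sketch under simplifying assumptions, not the production or research algorithm: each ad producer is modeled with a hypothetical fill probability, payout, and fixed latency (the deployed system used latency distributions), producers are solicited one at a time, and the best order is found by exhaustive search rather than dynamic programming.

```python
from itertools import permutations
from typing import List, NamedTuple, Sequence, Tuple


class AdSource(NamedTuple):
    """Hypothetical model of one ad producer."""
    name: str
    fill_probability: float  # chance the producer returns an ad
    payout: float            # revenue if it fills
    latency: float           # seconds spent waiting on the response


def expected_revenue(order: Sequence[AdSource], time_budget: float) -> float:
    """Expected revenue of soliciting producers one at a time, in order."""
    revenue = 0.0
    p_unfilled = 1.0  # probability that no earlier producer filled the slot
    elapsed = 0.0
    for source in order:
        elapsed += source.latency
        if elapsed > time_budget:
            break  # this response would arrive after the deadline
        revenue += p_unfilled * source.fill_probability * source.payout
        p_unfilled *= 1.0 - source.fill_probability
    return revenue


def best_order(sources: Sequence[AdSource], time_budget: float) -> Tuple[float, List[str]]:
    """Exhaustively search all orders; only practical for a handful of producers."""
    best = max(permutations(sources), key=lambda order: expected_revenue(order, time_budget))
    return expected_revenue(best, time_budget), [s.name for s in best]


sources = [
    AdSource("A", fill_probability=0.5, payout=4.0, latency=1.0),
    AdSource("B", fill_probability=0.9, payout=2.0, latency=1.0),
    AdSource("C", fill_probability=0.2, payout=10.0, latency=2.0),
]
value, order = best_order(sources, time_budget=3.0)
print(round(value, 2), order)  # 3.6 ['C', 'A', 'B']
```

With a 3-second budget, the slow but lucrative producer C goes first and B is never reached, which shows why greedy ordering by fill rate or by payout alone can be suboptimal.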

Remote Object SDN with Python

http://ieeexplore.ieee.org/abstract/document/7218384
As part of my SDN research, I developed a basic software-defined networking (SDN) solution based on Python, Berkeley sockets, and Python remote objects (Pythonhosted.org/Pyro4). The SDN is built over TCP/IP and allows dynamically reconfigurable networking paths. In the paper on this topic, reconfiguration is driven by periodic flow estimates to ensure that devices are given uncongested networking paths.

To our knowledge, we are the first group to build an SDN over the remote object paradigm. With this approach, the client requests a connection, rather than an address, from the gateway. The link returned is a remote object shared by the client and the network: the client uses it for reads and writes, and the network administration reconfigures it as network conditions change.
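
The shared-link idea can be illustrated with an in-process stand-in. This is a sketch only: the `Link` and `Gateway` classes, their methods, and the switch names are hypothetical, and plain Python objects take the place of Pyro4 remote objects so that the example runs without a network.

```python
from typing import Dict, List, Tuple


class Link:
    """In-process stand-in for the shared link object.

    In a remote-object design this would be exposed over the network; here it
    is a plain class so the sketch runs locally.
    """

    def __init__(self, path: List[str]) -> None:
        self.path = path  # hops the network currently assigns to this link
        self.log: List[Tuple[Tuple[str, ...], bytes]] = []

    def write(self, data: bytes) -> None:
        # Record which path carried the data; a real link would forward it
        # over sockets along that path.
        self.log.append((tuple(self.path), data))


class Gateway:
    """Hands out links and lets the network side reroute them later."""

    def __init__(self) -> None:
        self._links: Dict[str, Link] = {}

    def connect(self, client_id: str, destination: str) -> Link:
        # The client asks for a connection, not an address.
        link = Link([client_id, "switch-a", destination])
        self._links[client_id] = link
        return link

    def reroute(self, client_id: str, new_path: List[str]) -> None:
        # Network-side reconfiguration; the client's handle stays the same.
        self._links[client_id].path = list(new_path)


gateway = Gateway()
link = gateway.connect("client-1", "server-1")
link.write(b"first")
gateway.reroute("client-1", ["client-1", "switch-b", "server-1"])
link.write(b"second")
```

The client keeps writing to the same `link` object before and after the reroute; only the path recorded in `link.log` changes.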

Languages

Python, Bash, Python 2, Java, Bash Script, Java 7, Python 3, PHP, Pascal, Scheme, CSS, HTML, JavaScript, Perl, C#, C++, C, Scala, Java 8, Assembler x86, VHDL, Tcl, Tcl/Tk

Frameworks

Spring Integration, Spring, Spring Boot, Spark, Scrapy, Hadoop, Jakarta Server Pages (JSP), Apache Spark, Spring JDBC, JUnit, Flask, Django

Libraries/APIs

MALLET, PySpark, Node.js, React, Flask-RESTful, Mod_perl, Apache Lucene, Facebook Open Graph API, JDBC, OpenGL, Pandas, Scikit-learn, Matplotlib

Tools

Apache Maven, GitLab CI/CD, Amazon SageMaker, StatsModels, Domo, MATLAB, Apache Solr, LaTeX, Apache Tomcat, Vim Text Editor, Eclipse IDE, Amazon Simple Queue Service (SQS), Cloudera, ModelSim, PyDev, Scala IDE, Git, PyPI, Docker Compose, Ansible, NGINX

Paradigms

Distributed Computing, Distributed Programming, Parallel Computing, Constraint Programming, Compiler Design, Software-defined Networking (SDN), Linear Programming, Dynamic Programming, Concurrent Programming, Event-driven Programming, Functional Programming, REST

Platforms

Eclipse, Amazon EC2, Nexus, Amazon Web Services (AWS), AWS Lambda, Hortonworks Data Platform (HDP), Red Hat Linux, Ubuntu, Linux, CentOS, Unix, Android, Ubuntu Linux, Docker, Kubernetes

Storage

Databases, Cassandra, MongoDB, Amazon S3 (AWS S3), Spring Data JPA, PostgreSQL, MySQL, Apache Hive

Other

RESTful Web Services, EMR, ECS, Conda, Data Structures, Algorithms, Distributed Systems, Machine Learning, Evolutionary Algorithms, Genetic Algorithms, Operating Systems, Interpreter Design, Computer Graphics, Computer Science, Mixed-integer Linear Programming, Convex Optimization, Optimization, Combinatorics, Mathematical Programming, Transportation & Shipping, Networks, Machine-to-Machine (M2M), Apache Commons, Metaheuristics, Artificial Intelligence (AI), Binary Search Trees, Decision Trees, Mathematical Modeling, Apache Cassandra, Research, Amazon Kinesis, Eclipse CDT, Tkinter, PIP, Combinatorial Optimization, Clustering

2010 - 2016

PhD Degree in Computer Science (Computational Models for Scheduling in Online Advertising)

University of California, Irvine - Irvine, CA, USA

2010 - 2011

Master's Degree in Computer Science

University of California, Irvine - Irvine, CA, USA

2004 - 2009

Bachelor's Degree in Information and Computer Science: Specializations in Computer Systems, Distributed Systems (Minor: Mathematics)

University of California, Irvine - Irvine, CA, USA
