Dmitri Ivanovich Arkhipov

Python Developer in Irvine, CA, United States

Member since December 12, 2016
Dmitri has a PhD in computer science from UC Irvine and has been involved in tech as a student, freelancer, intern, or employee for over 15 years. He works primarily in Unix/Linux ecosystems, where he has developed programs in Python, Java, Scala, C, C++, Perl, JavaScript, and several other languages. His most recent experience has been with Python and JavaScript, but he's willing to adapt.

Experience

  • Python, 12 years
  • Java, 10 years
  • Ubuntu Linux, 8 years
  • Eclipse IDE, 6 years
  • Vim Text Editor, 5 years
  • Pandas, 4 years
  • Git, 3 years
  • JavaScript, 2 years

Availability

Part-time

Preferred Environment

Ubuntu, Red Hat, CentOS, Vim, Eclipse, Git, Maven

The most amazing...

...project I've worked on was my dissertation on optimally choosing ad request orders to maximize expected return under a time threshold (with bounded suboptimality).

Employment

  • Research Engineer

    2017 - 2019
    ChromaCode, Inc.
    • Designed, developed, tested, validated, and deployed classification and regression algorithms for qPCR and ddPCR medical diagnostic tests.
    • Developed, packaged, versioned, continuously integrated, and released internal- and external-facing research and publication tools.
    • Wrote code from scratch and transformed it into cohesive, verifiably tested, and carefully documented modules, packaged and stored in a private repository.
    • Guided the deployment and release of these modules to tested Docker container constellations executed in AWS ECS.
    • Debugged and expanded features of existing modules, taking each change through the same phases described above for new modules.
    • Developed the front end in React and Node.js to display the machine learning data to the user.
    • Tested the features in Cucumber and implemented unit tests in Mocha.
    • Provisioned (using Ansible) a malware scanning-and-alerting system for file uploads to Amazon S3; each file upload event triggers an AWS Lambda malware scan and, if a virus is detected, an AWS SNS notification.
    Technologies: Git, Python, Bash, Tcl/Tk, Tkinter, Conda, PIP, Pandas, Scikit-learn, Matplotlib, Flask, Nexus Repository, PyPI, AWS, Docker, Docker Compose, Ansible, JavaScript, Node.js, React
  • Data Engineer

    2017 - 2018
    Formation, Inc.
    • Developed periodic database unloads and ETL transforms for a client's data science ingestion. The software launched multiterabyte-sized data dumps daily to unload data from Amazon Redshift to S3.
    • Implemented PySpark jobs and ran them on AWS EMR with Hive and HDFS (running on AWS S3) to perform complex ETL queries on data in Redshift.
    • Packaged ETL code into Docker containers stored in AWS ECR, executed in ECS, and triggered by cron events. Docker containers were orchestrated with Docker Compose.
    • Enabled CI support with CircleCI.
    • Ensured that ETL delivery consistently took no more than 1.5 hours; previously, it took too long due to forced unloads.
    Technologies: AWS S3, EC2, EMR, ECS, Spark, PySpark, Python, Bash
  • Postdoctoral Researcher

    2016 - 2016
    Donald Bren School of Information and Computer Sciences | University of California, Irvine
    • Contributed to ongoing research on combinatorial optimization in online advertising.
    • Created novel concurrency and interprocess synchronization methods for parallel and distributed computing.
    • Developed batch-size selections for stochastic gradient descent.
    • Designed spot market clearing mechanisms for transportation spot markets.
    • Built an agent-based framework for general distributed computation.
    Technologies: Python, Java, Bash
  • Graduate Student Researcher | Teaching Assistant

    2010 - 2016
    Donald Bren School of Information and Computer Sciences | University of California, Irvine
    • Worked on several papers on optimization and parallel execution.
    • Participated in several smart city projects with researchers in China and the UK.
    • Developed a mathematical model and analysis of sequential ad polling. Implemented an algorithm making use of the model that guaranteed bounded sub-optimality (with respect to expected value given a time budget). Wrote a conference paper, journal paper, and a dissertation with these findings.
    • Assisted in introductory computer programming and discrete mathematics courses, in particular introductory and intermediate Java, C, and Python courses.
    • Performed significant research on thread and process synchronization and low-level inter-thread message passing constructs.
    Technologies: LaTeX, Python, Java, Android, Bash, C
  • Advanced Engineering Intern

    2015 - 2015
    CalAmp Wireless, Inc.
    • Designed, planned, implemented, and tested a new release of the company’s (M2M/MRM) cloud infrastructure back end.
    • Ensured that the release was fault-tolerant, recoverable, and scalable through integration with Amazon Web Services (AWS), and that message-triggered execution was dynamically reconfigurable.
    • Implemented and A/B tested the AWS Kinesis and AWS Simple Queue Service (SQS) based solutions to persist and load-balance incoming traffic.
    • Worked on a dynamic message router implemented for direct integration into a Spring Integration environment.
    • Developed the router component, a meta-construction that lets each Spring Integration component determine the next procedural step to execute on the output of the current step and then routes the output to that step.
    Technologies: Java, JUnit, Spring, Spring Boot, Spring Integration, AWS, AWS Kinesis, AWS SQS
  • Software Engineering Intern

    2014 - 2014
    Adaptive Medias, Inc.
    • Developed a heuristic optimization solution for the ad-ordering problem written in Java. The solution was deployed into production for determining advertising waterfalls for users.
    • Wrote a RESTful web application in Python Flask to accept web-domain URLs and classify (using MALLET) the semantic content into IAB tier 1 categories. This was deployed into production as a web service and was central to the main goal of a sprint.
    • Worked in Scala with the Cloudera package of Hadoop to write Spark code to perform log aggregation. Terabytes of event logs relating to customer interactions were condensed into lifecycle objects and written to disk.
    • Ensured that each of these three projects was released into production as a web microservice, developed on its own Git branch, and successfully merged into the development branch and subsequently the operating branch.
    • Developed alternative mathematical models for classification with a team and compared them as part of the classification work.
    Technologies: Python, Flask, Java, Scala, Hadoop, Hive, Cloudera, RESTful Web Services, MALLET Classification (Machine Learning)
  • Web Application Developer

    2010 - 2010
    Intellisurvey, Inc.
    • Designed and implemented new features and enhancements to the Intellisurvey software (in Perl and mod_perl).
    • Found and resolved software defects; handled debugging and feature creation through an in-house ticketing system.
    • Assisted in the development of the Intellisurvey infrastructure.
    • Created tools to aid in development, testing, and systems administration.
    • Provided technical support to internal software users and to clients who use licensed Intellisurvey software tools.
    • Added UI features in JavaScript to components of the web system's front end.
    Technologies: Perl, mod_perl, JavaScript, HTML, CSS, Bash
  • Research Assistant

    2009 - 2010
    University of California, Irvine
    • Installed, configured, and populated a PostgreSQL database with heterogeneous data sources; also developed a JSP front end.
    • Linked data sources on the basis of common fields and meta information.
    • Optimized MATLAB algorithms for dynamic network flow optimization.
    • Wrote C functions called from the MEX interface of MATLAB.
    • Wrote a highly optimized implementation of Dijkstra's shortest-path algorithm in C and bound it to MATLAB; it was over 100x faster than the MATLAB equivalent.
    Technologies: PostgreSQL, JavaScript, Bash, C, UNIX, JSP, MATLAB, Python

Experience

  • Evaluating the Value of Social Media Sentiment (Development)
    https://github.com/darkhipo/unclean_sentiments

    I built this command-line interface (CLI) to access data from a data vendor. The vendor provides a daily file that reports, for each stock ticker, how many people were tweeting about that ticker and how many of those tweets were positive.

    I made sure that the data was conveniently accessible for evaluation. The data is cleaned, ingested into SQLite using Pandas, and made available via the CLI.
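
    As a rough illustration of that clean-ingest-query flow (not the project's actual code), the sketch below loads one daily file into SQLite and answers a simple sentiment question. It swaps in the standard-library csv module for Pandas to stay dependency-free; the column names, table layout, sample rows, and date are all assumptions made for the example.

```python
import csv
import io
import sqlite3

# Hypothetical daily vendor file; column names and values are assumptions.
SAMPLE = """ticker,tweet_count,positive_count
AAPL,120,90
msft ,80,20
,15,3
TSLA,0,0
"""


def clean_rows(text):
    """Yield (ticker, tweets, positive) tuples, skipping unusable rows."""
    for row in csv.DictReader(io.StringIO(text)):
        ticker = row["ticker"].strip().upper()
        if not ticker:
            continue  # drop rows with no ticker symbol
        tweets = int(row["tweet_count"])
        positive = int(row["positive_count"])
        if positive > tweets:
            continue  # drop rows that are internally inconsistent
        yield ticker, tweets, positive


def ingest(conn, text, day):
    """Store one day's cleaned rows, replacing any earlier load of that day."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sentiment ("
        "day TEXT, ticker TEXT, tweets INTEGER, positive INTEGER, "
        "PRIMARY KEY (day, ticker))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO sentiment VALUES (?, ?, ?, ?)",
        [(day, t, n, p) for t, n, p in clean_rows(text)],
    )
    conn.commit()


def positive_share(conn, ticker):
    """Return the fraction of positive tweets for a ticker, or None."""
    row = conn.execute(
        "SELECT SUM(positive), SUM(tweets) FROM sentiment WHERE ticker = ?",
        (ticker,),
    ).fetchone()
    if not row or not row[1]:
        return None  # unknown ticker or no tweets recorded
    return row[0] / row[1]


conn = sqlite3.connect(":memory:")
ingest(conn, SAMPLE, "2020-01-02")
print(positive_share(conn, "AAPL"))  # 0.75
```

    A real CLI would wrap ingest and positive_share in argparse subcommands; that layer is left out to keep the sketch short.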

  • Python Flask Document Categorizer Microservice (Development)
    http://mallet.cs.umass.edu/

    Online advertisers are concerned with the Interactive Advertising Bureau (IAB) categorizations. These categorizations represent themes or topics and can apply broadly to websites, advertisements, or other web data objects.

    I developed and deployed a RESTful microservice using Flask to train and update classifiers learning from web data and to classify any document into a normalized vector representing the theme breakdown of the document into IAB categories. This service used Flask as the REST service layer, the multiprocessing module for process control, MALLET software (Mallet.cs.umass.edu/) for classification, Beautiful Soup for data preprocessing, and Nutch/Solr for crawling and indexing web data.

  • Ad Ordering with Python (Development)

    Determining the next ad-producer to solicit for a particular ad impression is not trivial. A good algorithm will take many factors into account; the two most immediately noticeable factors are revenue for the publisher (website hosting the advertisement) and the time to fill the ad slot.

    In my work, I've developed a suite of algorithms for solving this problem under different assumptions. A version of this work deployed in production uses data aggregation over terabytes of log data to build distributions representing how much time a solicitation will take. The deployment used Flask as a RESTful microservice layer and Python with SciPy for the main algorithms. More advanced research solutions make use of real-time dynamic programming to achieve solutions with bounded suboptimality.
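
    To make the revenue-versus-time trade-off concrete, here is a toy sketch under strong simplifying assumptions: each network has a fixed payout, a fixed fill probability, and a fixed response latency (the work above uses latency distributions instead), and the best order is found by brute force rather than by the dynamic programming mentioned above. All names and numbers are hypothetical.

```python
from itertools import permutations


def expected_revenue(order, budget_ms):
    """Expected payout when networks are solicited one at a time, in order.

    Each network is a (payout, fill_prob, latency_ms) tuple. Solicitation
    stops at the first fill or when the next request would exceed the budget.
    """
    elapsed = 0
    p_unfilled = 1.0  # probability that no earlier network filled the slot
    total = 0.0
    for payout, fill_prob, latency_ms in order:
        elapsed += latency_ms
        if elapsed > budget_ms:
            break  # this request could not finish in time
        total += p_unfilled * fill_prob * payout
        p_unfilled *= 1.0 - fill_prob
    return total


def best_order(networks, budget_ms):
    """Brute-force search over all orders; only viable for a few networks."""
    return max(
        permutations(networks),
        key=lambda order: expected_revenue(order, budget_ms),
    )


# Hypothetical networks: (payout, fill probability, latency in ms).
networks = [
    (5.0, 0.2, 300),  # high payout, rarely fills, slow
    (2.0, 0.6, 200),
    (1.0, 0.9, 100),  # low payout, almost always fills, fast
]

for budget in (600, 300):
    order = best_order(networks, budget)
    print(budget, order, round(expected_revenue(order, budget), 3))
```

    With a 600 ms budget the slow, high-payout network is worth asking first; with 300 ms the best order starts with the two faster networks, which is the kind of shift a time budget introduces.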

  • Remote Object SDN with Python (Development)
    http://ieeexplore.ieee.org/abstract/document/7218384

    As part of SDN research work, I developed a basic software-defined-networking (SDN) solution based on Python, Berkeley sockets, and Python remote objects (Pythonhosted.org/Pyro4). The SDN is built over TCP/IP and allows dynamically reconfigurable networking paths. In fact, in the paper on this topic, the reconfiguration is done based on periodic flow estimates to ensure that devices are given uncongested networking paths.

    To our knowledge, we are the first group to build an SDN over the remote object paradigm. With this approach, the client requests a connection rather than an address from the gateway. The link returned is a remote object shared by the client and the network: the client uses it for reads and writes, and the network administration reconfigures it as network conditions change.

  • Maximum Flow and the Linear Assignment Problem (Publication)
    The Hungarian graph algorithm solves the linear assignment problem in polynomial time. By modeling resources (e.g., contractors and available contracts) as a graph, the Hungarian algorithm can be used to efficiently determine an optimum way of allocating resources.
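
    As a small illustration of the assignment step, the sketch below is a compact O(n^3) potentials-based variant of the Hungarian algorithm written for this example; it is not taken from the publication, and the cost matrix (contractors as rows, contracts as columns) is made up.

```python
def hungarian(cost):
    """Minimum-cost assignment of each row to a distinct column.

    cost is a list of n rows with m >= n columns. Returns a pair
    (assignment, total) where assignment[i] is the column given to row i.
    """
    n, m = len(cost), len(cost[0])
    inf = float("inf")
    u = [0] * (n + 1)    # row potentials (1-indexed)
    v = [0] * (m + 1)    # column potentials (1-indexed)
    p = [0] * (m + 1)    # p[j] = row currently matched to column j, 0 if free
    way = [0] * (m + 1)  # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            # Shift potentials so one more column becomes reachable.
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break  # reached a free column: augmenting path found
        # Flip the matching along the augmenting path.
        while j0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [None] * n
    for j in range(1, m + 1):
        if p[j]:
            assignment[p[j] - 1] = j - 1
    total = sum(cost[i][assignment[i]] for i in range(n))
    return assignment, total


# Hypothetical costs: rows are contractors, columns are contracts.
cost = [
    [4, 1, 3],
    [2, 0, 5],
    [3, 2, 2],
]
print(hungarian(cost))  # ([1, 0, 2], 5)
```

    For production use, a tested library routine such as scipy.optimize.linear_sum_assignment is usually a better choice than a hand-written implementation.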

Skills

  • Languages

    Python 2, Java, Bash Script, Java 7, Python 3, Python, PHP, Pascal, Scheme, CSS, HTML, JavaScript, Perl, C#, C++, C, Scala, Java 8, Assembler x86, VHDL, Tcl, Tcl/Tk
  • Frameworks

    Spring Integration, Spring, Spring Boot, Hadoop, JavaServer Pages (JSP), Apache Spark, Spring JDBC, JUnit, Flask
  • Libraries/APIs

    Flask-RESTful, mod_perl, Apache Lucene, Facebook Open Graph API, JDBC, OpenGL, Pandas, Scikit-learn, Matplotlib
  • Tools

    MATLAB, Apache Solr, LaTeX, Apache Tomcat, Vim Text Editor, Eclipse IDE, Amazon SQS, Cloudera, ModelSim, PyDev, Scala IDE, Git, PyPI, Docker Compose, Ansible
  • Paradigms

    Distributed Programming, Parallel & Distributed Computing, Constraint Programming, Compiler Design, Software-defined Networking (SDN), Linear Programming, Dynamic Programming, Concurrent Programming, Event-driven Programming, Functional Programming, REST
  • Platforms

    AWS Lambda, Hortonworks Data Platform (HDP), Red Hat Linux, AWS Kinesis, Ubuntu, Linux, CentOS, Unix, Android, Ubuntu Linux, Docker
  • Storage

    Databases, Cassandra, MongoDB, AWS S3, Spring Data JPA, PostgreSQL, MySQL, Apache Hive
  • Other

    Bash Scripting, Data Structures, Algorithms, Distributed Systems, Machine Learning, Evolutionary Algorithms, Genetic Algorithms, Operating Systems, Interpreter Design, Computer Graphics, Computer Science, Mixed Integer Linear Programming, Convex Optimization, Optimization, Combinatorics, Mathematical Programming, Transportation & Shipping, Networks, Machine-to-Machine (M2M), Apache Commons, Metaheuristics, Artificial Intelligence (AI), Binary Search Trees, Decision Trees, Mathematical Modeling, Apache Cassandra, Research, Eclipse CDT, TkInter, PIP

Education

  • PhD degree in Computer Science (Computational Models for Scheduling in Online Advertising)
    2010 - 2016
    University of California, Irvine - Irvine, CA, USA
  • Master's degree in Computer Science
    2010 - 2011
    University of California, Irvine - Irvine, CA, USA
  • Bachelor's degree in Information and Computer Science: Specializations in Computer Systems, Distributed Systems (Minor: Mathematics)
    2004 - 2009
    University of California, Irvine - Irvine, CA, USA