David Smith, Python 3 Developer in Santa Barbara, CA, United States

Member since January 3, 2019
David is a developer specializing in big data, back-end services, and full-stack SaaS products, with experience across engineering, product, and management roles. Whatever the position, he is a hands-on coder who enjoys building products and delivering value to others. David always seeks to understand and build what the client actually needs to take the next step, whether that is a large, scalable architecture or a quick MVP to test.
David is now available for hire

Portfolio

  • Evidation Health
    Spark, Python (Flask, Jupyter, Pandas), Airflow, React/JavaScript, AWS (S3...
  • AppFolio
    Ruby on Rails, JavaScript (jQuery, React), MySQL, Redis, CSS
  • cielo24
    Python (Django), Celery, PostgreSQL, Salt, JavaScript (jQuery), CSS...

Experience

  • REST APIs, 10 years
  • Git, 7 years
  • Python 3, 6 years
  • AWS EMR, 2 years
  • Flask-RESTful, 2 years
  • Apache Airflow, 2 years
  • Spark, 2 years

Availability

Part-time

Preferred Environment

Linux or macOS

The most amazing...

...product I've built is a robust and scalable big data platform for healthcare to analyze terabytes of sensor data in clinical studies.

Employment

  • Director of Technology

    2017 - PRESENT
    Evidation Health
    • Designed and implemented the data platform (a distributed and scalable system that runs securely in an AWS VPC using Airflow, RabbitMQ, and Celery to execute Python ETL scripts to continuously and reliably process hundreds of gigabytes of raw data from third-party sensors, surveys, media, and studies into an S3 data lake daily). This system has been sold to large pharmaceutical and tech companies to allow their data scientists to focus on analysis.
    • Created a method for idempotently merging large, partitioned data sets into a data lake using Amazon EMR, Apache Spark, and additional tools for processing into an S3 back end, allowing schema changes and backfilling/reprocessing to occur without system downtime. As a result, we were able to quickly and continuously deliver tens of terabytes of data while preserving data integrity.
    • Performed major structural changes to the product architecture and release process to isolate customer environment-specific code from core services with minimal downtime. This reduced the risk of new changes unexpectedly impacting all projects and allows further customization of customer libraries and dependencies.
    • Additionally served as the team’s product manager and quality assurance engineer as the team succeeded in delivering our MVP with the most complex digital biomarker study protocol ever designed as its first use case.
    • Built additional web services on top of our platform for monitoring, delivering data, and quarantining problematic data for root cause analysis and repair.
    Technologies: Spark, Python (Flask, Jupyter, Pandas), Airflow, React/JavaScript, AWS (S3, EMR, IAM, VPC, Lambda, EC2), Hadoop, PostgreSQL, Graphite
  • Tech Lead | Senior Engineer

    2015 - 2017
    AppFolio
    • Acted as the product owner and technical lead for the value-added service that integrates credit, eviction, and criminal data web services from partner systems to provide background check reports to our customers. This service is a critical competitive advantage for the company and makes up 21% of annual revenue.
    • Prototyped our first in-house solution for a criminal and eviction background search service, including a data comparison analysis with our previous solution and evaluating big data tools to perform thousands of queries over a billion records with multiple indexes for instant results.
    • Led the development of the property manager product's HOA support initiative which involved several refactors to import features such as recurring invoice scheduling and reporting.
    • Analyzed customer segments and rapidly prototyped an MVP to perform market validation for a new business offering that involved automatic scanned bill scraping. These efforts saved the company an estimated $1.5 million by surfacing the high investment, low scan quality, and low adoption the offering would have faced.
    Technologies: Ruby on Rails, JavaScript (jQuery, React), MySQL, Redis, CSS
  • Director of Product/Engineering

    2014 - 2015
    cielo24
    • Led the development on several high-priority initiatives such as building our own task management system to delegate inbound media transcription jobs from customer systems to external partner APIs and workforce clouds (such as Amazon Mechanical Turk) at a large scale with strict SLA requirements.
    • Gathered stakeholder requirements and provided hands-on technical design and level-of-effort estimates for complex software features, allowing solution discovery to iterate rapidly without extensive redesign by engineers.
    • Defined product roadmaps, served as the scrum master, and managed weekly releases in Jira.
    • Reviewed pull requests to ensure that the requirements of the feature were met, that coding best practices were applied, and that test cases were covered.
    • Managed onsite software engineers, offsite contractors, and partnerships.
    • Analyzed historical Amazon Mechanical Turk worker quality scores to recommend a new scoring structure to promote alignment across different phases of the media transcription and proofreading process using linear regression.
    Technologies: Python (Django), Celery, PostgreSQL, Salt, JavaScript (jQuery), CSS, Amazon Mechanical Turk
  • Product Manager

    2012 - 2014
    Maker Studios (acquired by Disney)
    • Built the company’s first automated data retrieval service to pull daily analytics (revenue, viewership, demographics, and engagement) from YouTube, Facebook, and Twitter for over 60,000 channels into our application data as well as Amazon Redshift to identify better campaign targets for affiliate and brand advertisers. At the time, I was the YouTube API's "power user" and met several times with their product and engineering team onsite and at Google's Venice office.
    • Served as the team’s database lead by designing new tables, optimizing query performance, and advising other team members with design and troubleshooting.
    • Designed and automated the customer onboarding workflow to record and sign required tax forms using web form validation, Adobe EchoSign, and Salesforce. Resolving this critical bottleneck reduced the onboarding time from two weeks to just minutes and allowed the company to drastically increase the number of acquired customers.
    • Led product roadmaps for many high-profile products and authored several major patents.
    Technologies: Node.js, MySQL (Vertica), Redis, Redshift, Salesforce, Adobe EchoSign, Google APIs
  • Programmer/Analyst III

    2010 - 2012
    USC Information Sciences Institute
    • Created a Django-based content management system for the National Institutes of Health's Non-Human Primate Research Centers to allow their pathologists to classify specimens and annotate large, progressively rendered virtual microscopy images.
    • Built a REST web service using Java Spring and used it for researching data transfer performance in a computing cluster.
    • Co-authored three publications.
    Technologies: Python (Django), Java 6 (Spring), JavaScript (jQuery), MySQL, Hadoop
  • Senior Software Engineer

    2005 - 2009
    Computer Associates
    • Implemented the product integrations between the Spectrum Network Management product and several other Computer Associates products, such as single sign-on, service desk, and CMDB, as well as SAP BusinessObjects reports.
    • Acted as the key engineer on the Spectrum Network Reporting product team.
    • Aggregated data in MySQL from a distributed network of SNMP devices and built a Java-based web application on top of it for better visibility into overall network trends.
    Technologies: Java, JSP, JavaScript

Experience

  • Evidation Data Platform (Other amazing things)
    https://evidation.com/product#data-platform

    I built this big data platform for data scientists who were using sensors, media, and other metadata to gain insights into health outcomes during clinical trials and retrospective data analysis.

Skills

  • Languages

    Python 3, HTML, JavaScript, Java 6, CSS, Bash, Scala, ECMAScript (ES6)
  • Frameworks

    AWS EMR, Spark, Ruby on Rails 4, Flask, Django, Hadoop
  • Libraries/APIs

    Flask-RESTful, REST APIs, YouTube API, jQuery, React, Node.js, Pandas
  • Tools

    Git, Apache Airflow, Nginx, uWSGI, Google Analytics, RabbitMQ, Celery, Jenkins, TeamCity, CircleCI, SaltStack, Terraform, Tableau, Superset, Amazon SQS, Amazon Virtual Private Cloud (VPC)
  • Paradigms

    Agile, Kanban
  • Platforms

    Linux, New Relic, Jupyter Notebook, AWS Lambda, Salesforce, AWS EC2, AWS Kinesis, Heroku
  • Storage

    MySQL, PostgreSQL, Redshift, AWS RDS, AWS S3, Redis, Apache Hive
  • Other

    Graphite

Education

  • Master of Business Administration (MBA) degree in Business Administration
    2011 - 2014
    University of Southern California - Los Angeles, CA, USA
  • Bachelor's degree in Computer Science
    2000 - 2005
    University of New Hampshire - Durham, NH, USA