Derek Owens-Oas, Developer in Ashland, OR, United States

Derek Owens-Oas

Verified Expert in Engineering

Bio

Derek has a PhD in statistical science from Duke and has worked as a data scientist and software developer at Xylem. He's the founder of Tech Smart Magic. Published in the Journal of Classification and a TA of the year, he's experienced in both research and teaching. He builds interactive apps, visualizations, and reports. Derek's Python and R programming, knowledge of current AI and ML methods, and ability to communicate make him an asset to any company. His specialties include text data and online social networks.

Portfolio

Varsity Tutors
Algorithms, Computer Science, Data Science, Excel Development, SAS, R, Python...
Toptal Client
Shopify, Excel Development, Design Consulting
Shopper Media Group
SQL, Data Science, Amazon Redshift, Google Chrome, Zoom Development...

Availability

Part-time

Preferred Environment

GitHub, Python, R, GoDaddy, Social Networks, ChatGPT, Facebook, Google, Microsoft, SQL

The most amazing...

...contribution I made at Xylem was an interactive app to help city utilities visualize water-pipe-network quality in Dallas, DC, and Howard County.

Work Experience

Tutor | Consultant

2018 - PRESENT
Varsity Tutors
  • Developed a web application to visualize cost distribution with health insurance claims data.
  • Used machine learning and labeled data to estimate the sentiment of tweets on Twitter.
  • Quantified wound volume reduction for treated and control groups of patients.
  • Predicted the username associated with internet session activity data.
  • Reviewed and edited code for programming and statistics homework assignments with high school, college, and graduate students.
Technologies: Algorithms, Computer Science, Data Science, Excel Development, SAS, R, Python, NLP, Generative Pre-trained Transformers (GPT), Mathematics, Statistics, Data Analysis, A/B Testing

Data Scientist

2021 - 2021
Toptal Client
  • Consulted with the company chairman and CEO about the sale of health test kits.
  • Analyzed a spreadsheet of customer communications for patterns.
  • Discussed a plan to provide an automated solution using a chatbot.
Technologies: Shopify, Excel Development, Design Consulting

Data Scientist

2020 - 2020
Shopper Media Group
  • Developed code to estimate the number of visitors at shopping centers with WiFi data.
  • Implemented methods for predicting shopper visits using a proxy center.
  • Imported a table with visitation frequency charts into the Redshift warehouse.
  • Gave daily status reports by video and audio.
  • Wrote documentation covering the process, from surveying to presentation in the web application.
Technologies: SQL, Data Science, Amazon Redshift, Google Chrome, Zoom Development, Excel Development, Microsoft Word, K-nearest Neighbors (KNN), ARIMA, Python, Big Data Architecture, Pandas, Mathematics, Data Analysis

Data Scientist | Software Developer

2020 - 2020
SureTint Technologies
  • Integrated customer relationship management software for a beauty salon application.
  • Continued the development of a Python package for color combination.
  • Reorganized the data and code file folder structure.
  • Gathered and added new data into the existing pipeline.
  • Tested the program to ensure good performance.
  • Deployed a basic Django app and experimented with an alternate methodology.
  • Wrote code in the Amazon SageMaker computing environment.
  • Trained multiple linear models to estimate hair color from the products used.
  • Applied a nearest-neighbor method to convert hair formulas from one product line to another.
Technologies: SQL, SaaS, Data Science, AWS, Statistical Modeling, Django, Git, Jupyter, Python, Amazon SageMaker, Pandas, Mathematics

Data Scientist

2018 - 2019
Xylem, Inc.
  • Developed a predictive model and application to efficiently prioritize water pipe inspection for major US city utilities.
  • Recruited talent to Xylem at an American Statistical Association event.
  • Wrote technical reports with data graphics and statistical language to inform management and a company executive.
  • Composed blog posts to emphasize and clarify company impacts.
  • Created and presented an interactive visualization of water quality and algae levels in Lake Erie.
Technologies: SaaS, Data Science, AWS, Amazon EC2, Amazon S3, Confluence, Jira, GitHub, Python, R, Mathematics, Bayesian Inference & Modeling, Bayesian Statistics, Finance, Regression Modeling, Data Analysis

Online Social Network Report and Application

https://github.com/dmo11/political_blog_posts/blob/master/link_block_lda_results.pdf
I developed features, a learning algorithm, and a web app visualization for topics and connections in an online social network. Both R and Python implementations are available. The communication modes analyzed include blog posts, Facebook comments and messages, tweets, and courtroom transcriptions.

Here is a link to the video showing this application:

https://drive.google.com/file/d/1-Goo7OjKdGs9cvYxDfAu58GUuzDNSQg3/view?usp=sharing

Water Pipe Inspection Prioritizing Application

A statistical report and web application to evaluate water pipe quality in DC, Dallas, and Howard County. I wrote code that applies machine learning algorithms to estimate the probability of each pipe breaking in the next three years and visualizes the results on an interactive map.
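
As an illustration only, the idea of scoring pipes by break probability and ranking them for inspection can be sketched in Python. The profile does not say which algorithm or features were used, so this sketch assumes a plain logistic regression fitted by gradient descent, two hypothetical features (pipe age in decades and number of prior breaks), and made-up training data.

```python
import math


def sigmoid(z):
    """Logistic function mapping a real-valued score to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit logistic regression by batch gradient descent.

    Returns a weight list [bias, w_1, ..., w_k].
    """
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w


def break_probability(w, x):
    """Predicted probability that a pipe with features x breaks."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))


# Made-up data: (age in decades, prior breaks) -> 1 if the pipe broke
# within three years, else 0. Not real utility data.
X_train = [(1, 0), (2, 0), (3, 1), (5, 1), (6, 2), (8, 3), (2, 1), (7, 0)]
y_train = [0, 0, 0, 1, 1, 1, 0, 1]
weights = fit_logistic(X_train, y_train)

# Hypothetical pipes to prioritize, highest estimated risk first.
pipes = {"pipe_A": (1.5, 0), "pipe_B": (6.5, 2), "pipe_C": (4.0, 1)}
ranked = sorted(
    pipes, key=lambda name: break_probability(weights, pipes[name]), reverse=True
)
for name in ranked:
    print(name, round(break_probability(weights, pipes[name]), 3))
```

The ranked list is what an inspection planner would consume; the same probabilities could color pipe segments on a map.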

Lake Erie Water Quality Assessment

I developed an interactive map giving estimates of water quality between sensor locations. I used machine learning, optimized linear predictive modeling, and spatial statistics, and wrote a technical report detailing the patterns of algae blooms.
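
To make "estimates between sensor locations" concrete, here is a minimal sketch using inverse distance weighting. This is a simple stand-in technique chosen for illustration, not necessarily the spatial method used in the project, and the sensor coordinates and readings are made up.

```python
import math


def idw_estimate(sensors, target, power=2.0):
    """Inverse-distance-weighted estimate at a target location.

    sensors: list of ((x, y), value) pairs
    target:  (x, y) location to estimate
    power:   how quickly a sensor's influence decays with distance
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (sx, sy), value in sensors:
        d = math.hypot(sx - target[0], sy - target[1])
        if d == 0.0:
            # The target sits exactly on a sensor: use its reading.
            return value
        w = 1.0 / d ** power
        weighted_sum += w * value
        weight_total += w
    return weighted_sum / weight_total


# Hypothetical sensors: planar coordinates in km, readings in arbitrary units.
sensors = [((0.0, 0.0), 10.0), ((10.0, 0.0), 30.0), ((0.0, 10.0), 20.0)]
print(idw_estimate(sensors, (3.0, 4.0)))
```

Evaluating the function on a grid of target points gives the values that an interactive map layer would display.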

Health Procedure Cost Explorer | Web App

https://drive.google.com/file/d/1IwtWOAObd1aBcfm2IukvtzqNQaR_PjiP/view
I set up a free, simple, full-stack server for hosting a web app. The link is to an online, interactive box-plot visualization that enables exploring health procedure costs. Insurance claims data are used to show how expenses are distributed to the insurer, provider, and patient.

A second bar-graph version allows the user to mouse-over various procedural choices for treating osteoarthritis. Here is the link:

https://drive.google.com/file/d/10gVQWka51w0RA5wmO4_BPIeEt3nt-ZRr/view?usp=sharing

A healthcare provider can view the patient outcomes to guide the choice going forward.

Learning Topics and Communities in Political Blog Posts

https://arxiv.org/pdf/1610.05756.pdf
I designed, implemented, and authored a publication that applies a statistical learning algorithm to political blog post data. The analysis identifies a latent group that provides commentary on sensational crime. The results are published in the Journal of Classification.

Learning Original Poster in Group Conversation Data

https://arxiv.org/pdf/1809.03648.pdf
I contributed to a dynamic programming algorithm and applied it to an election day mega-thread on Reddit and to courtroom transcriptions. The method is a form of credit attribution, like those used in web advertising.

Statistics Web Blog

I created a WordPress web blog where I've written posts sharing my experience during my Ph.D. program in statistics and as an early-career data scientist and quantitative consultant. These include graphics I've created and discussion of industries of interest.

Learning to Make a Tableau Dashboard

https://drive.google.com/file/d/1ygKMZlXeIxfsyl8YjEJPGQGrVphbpYUg/view?usp=sharing
I followed a tutorial to visualize CO2 emissions data by country and year. One graphic shows the amounts on a world map, and the other is a time series plot. It's possible to select geographic subsets and to mouse over points to get specific observed values.

Salon Customer Brand Converter

https://drive.google.com/file/d/1uVhkJSdCEioSStJNuitvSPb9NVxnSdJ7/view?usp=sharing
SureTint Technologies' software LaRu enables beauty salons to record customer hair formulas.

I continued developing an application that converts formulas from one product line to another. The data are stored on AWS, the code is written in Python, and the conversion uses a statistical model.

Features developed include a filter to ensure products conform to manufacturer recommendations.
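
A minimal sketch of what a nearest-neighbor formula conversion with such a filter could look like is shown below. Everything specific here is an assumption for illustration: products are represented by hypothetical three-number color coordinates, the product names are invented, and the filter is a simple predicate standing in for manufacturer recommendations.

```python
import math


def convert_formula(formula, source_line, target_line, is_allowed=lambda name: True):
    """Convert a formula by nearest-neighbor matching of products.

    formula:     list of (product_name, amount) in the source line
    source_line: dict of product_name -> color coordinates
    target_line: dict of product_name -> color coordinates
    is_allowed:  predicate that filters target products (hypothetical
                 stand-in for manufacturer recommendations)
    """
    candidates = {n: v for n, v in target_line.items() if is_allowed(n)}
    if not candidates:
        raise ValueError("no target products pass the filter")
    converted = []
    for product, amount in formula:
        src = source_line[product]
        nearest = min(candidates, key=lambda n: math.dist(src, candidates[n]))
        converted.append((nearest, amount))
    return converted


# Invented product lines with made-up color coordinates.
source_line = {"A-6N": (30.0, 5.0, 10.0), "A-8G": (60.0, 8.0, 30.0)}
target_line = {
    "B-Dark": (28.0, 4.0, 9.0),
    "B-Light": (62.0, 7.0, 33.0),
    "B-Discontinued": (30.0, 5.0, 10.0),
}

formula = [("A-6N", 40.0), ("A-8G", 20.0)]
converted = convert_formula(
    formula, source_line, target_line,
    is_allowed=lambda name: name != "B-Discontinued",
)
print(converted)
```

Applying the filter before the nearest-neighbor search, rather than after it, guarantees that every product in the converted formula is an allowed one.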

Education

2013 - 2018

Master of Science Degree and PhD in Statistical Science

Duke University - Durham, NC, USA

2009 - 2013

Bachelor of Arts Degree in Mathematics

Pomona College - Claremont, CA, USA

Libraries/APIs

Pandas, Scikit-learn, Caret, Facebook API, Matplotlib, NumPy, PySpark, PyTorch, Node.js, TensorFlow Deep Learning Library (TFLearn), X (formerly Twitter) API, TensorFlow, Keras, Python

Tools

Jira, Confluence, Jupyter, Excel Development, Microsoft Word, Git, GitHub, Tableau Development, Data Science, Amazon SageMaker, ARIMA, Zoom Development, ChatGPT

Languages

R, Python, SQL, JavaScript, HTML, SAS, CSS, Java

Frameworks

RStudio Shiny, Django, Spark

Paradigms

ETL, Automation, App Development, Microservices Development, Quantitative Research, Business Intelligence Development

Industry Expertise

Virtual Coaching, Healthcare App Design, Marketing Design

Storage

Database, SQL, JSON, Amazon S3, Redshift, PostgreSQL, AWS, MySQL

Platforms

RStudio, Amazon EC2, WordPress Development, Docker, AWS Lambda, Google Chrome, AWS, Shopify, Microsoft Development

Other

Data Science, Data Visualization, Data Cleaning, Analytics Development, Algorithms, NLP, Data Architecture, Data Modeling, Data Engineering, Analysis, Statistical Modeling, Excel Development, Artificial Intelligence, Quantitative Development, A/B Testing, Topic Modeling, Classification, Visualization, Predictive Analytics, SaaS, Big Data Architecture, Machine Learning, Technical Reports, Applied Mathematics, Statistics, Data Analysis, Mathematics, Bayesian Inference & Modeling, Bayesian Statistics, Regression Modeling, Generative Pre-trained Transformers (GPT), Design Consulting, Time Series, Data Matching, Higher Education, E-commerce marketing, Scraping, Web Scraping, Video Production, Predictive Modeling, Text Mining, Kalman Filtering, Time Series Analysis, Financial Modeling, UI Development, Web Development, Dashboard, APIs, Scheduling, Custom Audio Embedding, Deep Learning, Advertising Management, Serverless, K-nearest Neighbors (KNN), Computer Science, Amazon Redshift, Data Handling, Software Development, Publishing, Blogging, Neural Network, Finance, Consumer Products, Surveying, Compliance, Documentation, OCR, GoDaddy, Social Networks, Facebook, Google Software
