Derek Owens-Oas, Developer in Ashland, OR, United States

Derek Owens-Oas

Verified Expert in Engineering

Data Scientist and Developer

Location
Ashland, OR, United States
Toptal Member Since
January 21, 2020

Derek has a PhD in statistical science from Duke and has worked as a data scientist and software developer at Xylem. He's the founder of Tech Smart Magic. Published in the Journal of Classification and named a TA of the year, he has a strong background in research and teaching. He delivers interactive apps, visualizations, and reports, working in Python and R with current AI and ML methods, and he communicates results clearly. His specialties include text data and online social networks.


Experience

Availability

Part-time

Preferred Environment

GitHub, Python, R, GoDaddy, Social Networks, ChatGPT, Facebook, Google, Microsoft, SQL

The most amazing...

...contribution I made at Xylem was an interactive app to help city utilities visualize water-pipe-network quality in Dallas, DC, and Howard County.

Work Experience

Tutor | Consultant

2018 - PRESENT
Varsity Tutors
  • Developed a web application to visualize cost distribution with health insurance claims data.
  • Used machine learning and labeled data to estimate the sentiment of tweets on Twitter.
  • Quantified wound volume reduction for treated and control groups of patients.
  • Estimated username from internet session activity data.
  • Reviewed and edited code for programming and statistics homework assignments with high school, college, and graduate students.
Technologies: Algorithms, Computer Science, SPSS, Microsoft Excel, SAS, R, Python, GPT, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Mathematics, Statistics, Statistical Analysis, Data Analysis, A/B Testing

Data Scientist

2021 - 2021
Toptal Client
  • Consulted with the company chairman and CEO about the sale of health test kits.
  • Analyzed a spreadsheet of customer communications for patterns.
  • Discussed a plan to provide an automated solution using a chatbot.
Technologies: Shopify, Microsoft Excel, Consulting

Data Scientist

2020 - 2020
Shopper Media Group
  • Developed code to estimate the number of visitors at shopping centers with WiFi data.
  • Implemented methods for predicting shopper visits using a proxy center.
  • Imported a table with visitation frequency charts into the Redshift warehouse.
  • Gave daily status reports by video and audio.
  • Wrote documentation covering the process from surveying through presentation on the web application.
Technologies: SQL Functions, Data Science, Amazon Redshift, Google Chrome, Zoom, Microsoft Excel, Microsoft Word, K-nearest Neighbors (KNN), ARIMA, Redshift, SQL, Python, Big Data, Pandas, Mathematics, Data Analytics, Data Analysis

Data Scientist | Software Developer

2020 - 2020
SureTint Technologies
  • Integrated customer relationship management software for a beauty salon application.
  • Continued the development of a Python package for color combination.
  • Reorganized the data and code file folder structure.
  • Gathered and added new data into the existing pipeline.
  • Tested the program to ensure good performance.
  • Deployed a basic Django app and experimented with an alternate methodology.
  • Wrote code in the AWS SageMaker computing environment.
  • Trained multiple linear models to estimate hair color from the products used.
  • Applied a nearest neighbor method to convert hair formulas from one product line to another.
Technologies: SQL Functions, SaaS, Data Science, Amazon Web Services (AWS), Statistical Modeling, Django, Git, Jupyter, Python, Amazon SageMaker, Pandas, Mathematics

Data Scientist

2018 - 2019
Xylem, Inc.
  • Developed a predictive model and application to efficiently prioritize water pipe inspection for major US city utilities.
  • Recruited talent to Xylem at an American Statistical Association event.
  • Wrote technical reports with data graphics and statistical language to inform management and a company executive.
  • Composed blog posts to emphasize and clarify company impacts.
  • Created and presented an interactive visualization of water quality and algae levels in Lake Erie.
Technologies: SaaS, Data Science, Amazon Web Services (AWS), Amazon EC2, Amazon S3 (AWS S3), Confluence, Jira, GitHub, Python, R, Mathematics, Bayesian Inference & Modeling, Bayesian Statistics, Finance, Regression Modeling, Data Analytics, Data Analysis

Online Social Network Report and Application

https://github.com/dmo11/political_blog_posts/blob/master/link_block_lda_results.pdf
I developed features, a learning algorithm, and web app visualization for topics and connections in an online social network. The R and Python implementations are available. Blog posts, Facebook comments and messages, Twitter tweets, and courtroom transcriptions are among the communication modes analyzed.

Here is a link to the video showing this application:

https://drive.google.com/file/d/1-Goo7OjKdGs9cvYxDfAu58GUuzDNSQg3/view?usp=sharing

Water Pipe Inspection Prioritizing Application

This project is a statistical report and web application for evaluating water pipe quality in DC, Dallas, and Howard County. I wrote code that applies machine learning algorithms to estimate the probability of each pipe breaking in the next three years and visualizes the results on an interactive map.

Lake Erie Water Quality Assessment

I developed an interactive map giving estimates of water quality between sensor locations. I used machine learning, optimized linear predictive modeling, and spatial statistics, and I wrote a technical report detailing the patterns of algae blooms.

Health Procedure Cost Explorer | Web App

https://drive.google.com/file/d/1IwtWOAObd1aBcfm2IukvtzqNQaR_PjiP/view
I set up a free, simple, full-stack server for hosting a web app. The link is to an online, interactive box-plot visualization for exploring health procedure costs. Insurance claims data are used to show how expenses are distributed among the insurer, provider, and patient.

A second, bar-graph version lets the user mouse over various procedure choices for treating osteoarthritis. Here is the link:

https://drive.google.com/file/d/10gVQWka51w0RA5wmO4_BPIeEt3nt-ZRr/view?usp=sharing

A healthcare provider can view the patient outcomes to guide the choice going forward.

Learning Topics and Communities in Political Blog Posts

https://arxiv.org/pdf/1610.05756.pdf
I designed and implemented a statistical learning algorithm for political blog post data and authored the resulting publication. The analysis identifies a latent group that provides commentary on sensational crime. The results are published in the Journal of Classification.

Learning Original Poster in Group Conversation Data

https://arxiv.org/pdf/1809.03648.pdf
I contributed to a dynamic programming algorithm and applied it to an election day mega-thread on Reddit and to courtroom transcriptions. The method is a form of credit attribution, like those used in web advertising.

Statistics Web Blog

I created a WordPress blog where I've written posts about my experience during my PhD program in statistics and as an early-career data scientist and quantitative consultant. The posts include graphics I've created and discussions of industries of interest.

Learning to Make a Tableau Dashboard

https://drive.google.com/file/d/1ygKMZlXeIxfsyl8YjEJPGQGrVphbpYUg/view?usp=sharing
I followed a tutorial to visualize CO2 emissions data by country and year. One graphic shows the amounts on a world map, and the other is a time series plot. Users can subset the data geographically and mouse over points to see specific observed values.

Salon Customer Brand Converter

https://drive.google.com/file/d/1uVhkJSdCEioSStJNuitvSPb9NVxnSdJ7/view?usp=sharing
SureTint Technologies' software, LaRu, enables beauty salons to record customer hair formulas.

I continued developing an application that converts formulas from one product line to another. The data are stored on AWS, the code is written in Python, and the conversion uses a statistical model.

Features developed include a filter to ensure products conform to manufacturer recommendations.

Languages

R, Python, SQL, JavaScript, HTML, SAS, CSS, Java

Frameworks

RStudio Shiny, Django, Spark

Libraries/APIs

Pandas, Scikit-learn, Caret, Facebook API, Matplotlib, NumPy, PySpark, PyTorch, Node.js, TensorFlow Deep Learning Library (TFLearn), Facebook Ads API, Twitter API, TensorFlow, Keras, Natural Language Toolkit (NLTK)

Paradigms

ETL, Automation, Data Science, App Development, Microservices, Quantitative Research, Business Intelligence (BI)

Industry Expertise

Project Management, Healthcare, Marketing

Storage

Data Pipelines, Databases, SQL Functions, JSON, Amazon S3 (AWS S3), Redshift, PostgreSQL, Amazon DynamoDB, MySQL

Other

Data Analytics, Data Reporting, Data Visualization, Data Cleaning, Analytics, Algorithms, Natural Language Processing (NLP), Data Architecture, Data Modeling, Data Engineering, Analysis, Statistical Modeling, Excel Reporting, Artificial Intelligence (AI), Quantitative Modeling, A/B Testing, Topic Modeling, Classification, Visualization, Predictive Analytics, SaaS, Big Data, Machine Learning, Technical Reports, Applied Mathematics, Statistics, Statistical Analysis, Data Analysis, Mathematics, Bayesian Inference & Modeling, Bayesian Statistics, Regression Modeling, GPT, Generative Pre-trained Transformers (GPT), Consulting, Time Series, Data Matching, Higher Education, eCommerce, Scraping, Web Scraping, Video Production, Predictive Modeling, Text Mining, Kalman Filtering, Time Series Analysis, Financial Modeling, UI Development, Web Development, Dashboards, APIs, Scheduling, Custom Audio Embedding, Deep Learning, Advertising, Serverless, ARIMA, K-nearest Neighbors (KNN), Computer Science, Amazon Redshift, Quantitative Finance, Data Handling, Software Development, Publishing, Blogging, Neural Networks, Finance, Consumer Products, Surveying, Compliance, Documentation, OCR, GoDaddy, Social Networks, Facebook, Google

Tools

Jira, Confluence, Jupyter, Microsoft Excel, Microsoft Word, Git, GitHub, Tableau, SPSS, Amazon SageMaker, Zoom, ChatGPT

Platforms

RStudio, Amazon EC2, WordPress, Docker, AWS Lambda, Google Chrome, Amazon Web Services (AWS), Shopify, Microsoft

Education

2013 - 2018

Master of Science Degree and PhD in Statistical Science

Duke University - Durham, NC, USA

2009 - 2013

Bachelor of Arts Degree in Mathematics

Pomona College - Claremont, CA, USA
