
Derek Owens-Oas
Verified Expert in Engineering
Data Scientist and Developer
Derek has a Ph.D. in statistical science and has worked as a data scientist and software developer at Xylem. A published author in the Journal of Classification, his expertise is in providing technical reports and insights with interactive visualizations. Derek's extensive knowledge of Python and R libraries, state-of-the-art methods, and ability to communicate make him an asset to any company. His specialties include text and online social network analysis.
Preferred Environment
GitHub, Microsoft Excel, Python, WordPress, Microsoft Word, R
The most amazing...
...contribution I've made at Xylem was an interactive app to help city utilities assess water-pipe-network quality in Dallas, DC, and Howard County.
Work Experience
Tutor | Consultant
Varsity Tutors
- Developed a web application to visualize cost distribution with health insurance claims data.
- Used machine learning and labeled data to estimate the sentiment of tweets on Twitter.
- Quantified wound volume reduction for treated and control groups of patients.
- Estimated username from internet session activity data.
- Edited code on programming and statistics homework assignments with high school, college, and graduate students.
Data Scientist
Toptal Client
- Consulted with the company chairman and CEO about the sale of health test kits.
- Analyzed a spreadsheet of customer communications for patterns.
- Discussed a plan to provide an automated solution using a chatbot.
Monitor
CDR Maguire
- Attended safety group to get tablets and documents from a supervisor.
- Went to the assigned location to ensure there was adequate water and that vehicles were covered and placarded in compliance with requirements.
- Observed driver piling and loading to monitor debris removal.
- Submitted load entry to system and gave ticket to haul truck driver.
Data Scientist
Shopper Media Group
- Developed code to estimate the number of visitors at shopping centers with WiFi data.
- Implemented methods for predicting shopper visits using a proxy center.
- Imported a table with visitation frequency charts into a Redshift warehouse.
- Gave video and audio reports with a daily status.
- Typed up documentation about the process from surveying to a presentation on the web application.
Data Scientist | Software Developer
SureTint Technologies
- Integrated customer relationship management software for a beauty salon application.
- Continued the development of a Python package about color combination.
- Reorganized the data and code file folder structure.
- Gathered and added new data into the existing pipeline.
- Tested the program to ensure good performance.
- Deployed a basic Django app and experimented with an alternate methodology.
- Wrote code in the AWS SageMaker computing environment.
- Trained multiple linear models to estimate hair color with products.
- Applied nearest neighbor method to convert a hair formula product line.
Data Scientist
Xylem, Inc.
- Developed a predictive model and application to efficiently prioritize water pipe inspection for major US city utilities.
- Recruited talent to Xylem at an American Statistical Association event.
- Wrote technical reports with data graphics and statistical language to inform management and a company executive.
- Composed blog posts to emphasize and clarify company impacts.
- Created and presented an interactive visualization of water quality and algae levels in Lake Erie.
Experience
Online Social Network Report and Application
https://github.com/dmo11/political_blog_posts/blob/master/link_block_lda_results.pdf
Here is a link to the video showing this application:
https://drive.google.com/file/d/1-Goo7OjKdGs9cvYxDfAu58GUuzDNSQg3/view?usp=sharing
Water Pipe Inspection Prioritizing Application
Lake Erie Water Quality Assessment
Health Procedure Cost Explorer | Web App
https://drive.google.com/file/d/1IwtWOAObd1aBcfm2IukvtzqNQaR_PjiP/view
A second bar-graph version allows the user to mouse over various procedural choices for treating osteoarthritis. Here is the link:
https://drive.google.com/file/d/10gVQWka51w0RA5wmO4_BPIeEt3nt-ZRr/view?usp=sharing
A healthcare provider can view the patient outcomes to guide the choice going forward.
Learning Topics and Communities in Political Blog Posts
https://arxiv.org/pdf/1610.05756.pdf
Learning Original Poster in Group Conversation Data
https://arxiv.org/pdf/1809.03648.pdf
Statistics Web Blog
Learning to Make a Tableau Dashboard
https://drive.google.com/file/d/1ygKMZlXeIxfsyl8YjEJPGQGrVphbpYUg/view?usp=sharing
Salon Customer Brand Converter
https://drive.google.com/file/d/1uVhkJSdCEioSStJNuitvSPb9NVxnSdJ7/view?usp=sharing
I continued developing an application that converts formulas from one product line to another. The data are stored on AWS, the code is written in Python, and a statistical model was used.
Features developed include a filter to ensure products conform to manufacturer recommendations.
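The following is a minimal sketch of how a nearest-neighbor conversion with a product filter might look, assuming each product can be represented as a point in a color space. The catalog, product names, color coordinates, and the `convert` function are all hypothetical and are not taken from the actual application.

```python
import math

# Hypothetical target product line: product name -> (L, a, b) color coordinates.
# All names and numbers are made up for illustration only.
TARGET_LINE = {
    "T-5N": (30.0, 4.0, 8.0),
    "T-6G": (40.0, 6.0, 18.0),
    "T-7R": (45.0, 25.0, 15.0),
}


def convert(source_color, target_line, allowed=None):
    """Return the target-line product whose color is closest to source_color.

    `allowed` is an optional set of product names. Products outside it are
    dropped before the search, standing in for a recommendation filter.
    """
    candidates = {
        name: color
        for name, color in target_line.items()
        if allowed is None or name in allowed
    }
    if not candidates:
        raise ValueError("no candidate products left after filtering")
    # Nearest neighbor by Euclidean distance in the assumed color space.
    return min(candidates, key=lambda name: math.dist(source_color, candidates[name]))


if __name__ == "__main__":
    source = (41.0, 7.0, 17.0)  # hypothetical color of a source-line product
    print(convert(source, TARGET_LINE))
    print(convert(source, TARGET_LINE, allowed={"T-5N", "T-7R"}))
```

Filtering before the distance search means a product that fails the filter can never be returned, even when it is the closest color match.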
Skills
Languages
R, Python, SQL, JavaScript, HTML, SAS, CSS, Java
Frameworks
RStudio Shiny, Django, Spark
Libraries/APIs
Pandas, Scikit-learn, Caret, Facebook API, Matplotlib, NumPy, PySpark, PyTorch, Node.js, TensorFlow Deep Learning Library (TFLearn), Facebook Ads API, Twitter API, TensorFlow, Keras, Natural Language Toolkit (NLTK)
Paradigms
ETL, Automation, Data Science, App Development, Microservices, Quantitative Research, Business Intelligence (BI)
Industry Expertise
Project Management, Healthcare, Marketing
Storage
Data Pipelines, Databases, SQL Functions, JSON, Amazon S3 (AWS S3), Redshift, PostgreSQL, Amazon DynamoDB, MySQL
Other
Data Analytics, Data Reporting, Data Visualization, Data Cleaning, Analytics, Algorithms, Natural Language Processing (NLP), Data Architecture, Data Modeling, Data Engineering, Analysis, Statistical Modeling, Excel Reporting, Artificial Intelligence (AI), Quantitative Modeling, A/B Testing, Topic Modeling, Classification, Visualization, Predictive Analytics, SaaS, Big Data, Machine Learning, Technical Reports, Applied Mathematics, Statistics, Statistical Analysis, Data Analysis, Mathematics, Bayesian Inference & Modeling, Bayesian Statistics, Regression Modeling, GPT, Generative Pre-trained Transformers (GPT), Consulting, Time Series, Data Matching, Higher Education, eCommerce, Scraping, Web Scraping, Video Production, Predictive Modeling, Text Mining, Kalman Filtering, Time Series Analysis, Financial Modeling, UI Development, Web Development, Dashboards, APIs, Scheduling, Custom Audio Embedding, Deep Learning, Advertising, Serverless, ARIMA, K-nearest Neighbors, Computer Science, Amazon Redshift, Quantitative Finance, Data Handling, Software Development, Publishing, Blogging, Neural Networks, Finance, Consumer Products, Surveying, Compliance, Documentation, OCR
Tools
Jira, Confluence, Jupyter, Microsoft Excel, Microsoft Word, RStudio, Git, GitHub, Tableau, SPSS, Amazon SageMaker, Zoom
Platforms
Amazon EC2, WordPress, Docker, AWS Lambda, Google Chrome, Amazon Web Services (AWS), Shopify
Education
Doctor of Philosophy & Master of Science Degree in Statistical Science
Duke University - Durham, NC, USA
Bachelor of Arts Degree in Mathematics
Pomona College - Claremont, CA, USA