Derek Owens-Oas, Data Scientist and Developer in Ashland, OR, United States

Member since December 20, 2019
Derek has a Ph.D. in statistical science and has worked as a data scientist and software developer at Xylem. A published author in the Journal of Classification, his expertise is in providing technical reports and insights with interactive visualizations. Derek's extensive knowledge of Python and R libraries, state-of-the-art methods, and ability to communicate make him an asset to any company. His specialties include text and online social network analysis.

Location

Ashland, OR, United States

Availability

Part-time

Preferred Environment

Microsoft Excel, GitHub, Python, WordPress, Microsoft Word, R

The most amazing...

...contribution I've made at Xylem was an interactive app to help city utilities assess water-pipe-network quality in Dallas, DC, and Howard County.

Employment

  • Data Scientist | Software Developer

    2020 - PRESENT
    SureTint Technologies
    • Integrated a beauty salon application with customer relationship management (CRM) software.
    • Continued the development of an add-on Python package for converting a hair formula product line and estimating color.
    • Reorganized the data and code file structure successfully.
    • Gathered and added new data into the existing pipeline.
    • Performed quality assurance testing of the program's performance.
    • Deployed a basic Django app and experimented with an alternate methodology.
    Technologies: Amazon Web Services (AWS), Statistical Modeling, Django, Git, Jupyter, Python, Amazon SageMaker
  • Data Scientist

    2018 - 2019
    Xylem, Inc.
    • Developed a predictive model and application to efficiently prioritize water pipe inspection for major US city utilities.
    • Recruited talent to Xylem at an American Statistical Association event.
    • Wrote technical reports with data graphics and statistical language to inform management and a company executive.
    • Composed blog posts to emphasize and clarify company impacts.
    • Created and presented an interactive visualization of water quality and algae level in Lake Erie.
    Technologies: Amazon Web Services (AWS), AWS EC2, AWS S3, Atlassian Confluence, Jira, GitHub, Python, R

Experience

  • Online Social Network Report and Application (Development)
    https://github.com/dmo11/political_blog_posts/blob/master/link_block_lda_results.pdf

    I developed features, a learning algorithm, and a web app visualization for topics and connections in an online social network. R and Python implementations are available. The communication modes analyzed include blog posts, Facebook comments and messages, tweets, and courtroom transcriptions.

  • Water Pipe Inspection Prioritizing Application (Development)

    A statistical report and web application for evaluating water pipe quality in DC, Dallas, and Howard County. I wrote code that applies machine learning algorithms to estimate the probability of each pipe breaking within the next three years and visualizes the results on an interactive map.
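
The prioritization idea described above can be sketched in a few lines. This is a minimal illustration, not the code used at Xylem: the logistic form, the coefficient values, and the field names `age_years` and `prior_breaks` are all assumptions made for the example.

```python
import math

# Hypothetical coefficients for illustration only; a real model would be
# fit to historical pipe-break records.
COEFFS = {"intercept": -4.0, "age_years": 0.03, "prior_breaks": 0.8}


def break_probability(age_years, prior_breaks):
    """Logistic score: estimated probability that a pipe breaks in the horizon."""
    z = (COEFFS["intercept"]
         + COEFFS["age_years"] * age_years
         + COEFFS["prior_breaks"] * prior_breaks)
    return 1.0 / (1.0 + math.exp(-z))


def prioritize(pipes):
    """Return pipes sorted from highest to lowest estimated break probability."""
    scored = [dict(p, prob=break_probability(p["age_years"], p["prior_breaks"]))
              for p in pipes]
    return sorted(scored, key=lambda p: p["prob"], reverse=True)


# Made-up pipe records, not real utility data.
pipes = [
    {"id": "A", "age_years": 20, "prior_breaks": 0},
    {"id": "B", "age_years": 80, "prior_breaks": 2},
    {"id": "C", "age_years": 50, "prior_breaks": 1},
]
ranking = prioritize(pipes)
print([p["id"] for p in ranking])
```

Sorting by estimated probability puts the pipes most likely to break at the top of the inspection list, which is the ordering an interactive map could then color-code.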

  • Lake Erie Water Quality Assessment (Development)

    I developed an interactive map giving estimates of water quality between sensor locations. I used machine learning, optimized linear predictive modeling, and spatial statistics, and wrote a technical report detailing the patterns of algae blooms.
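
Estimating a value between sensor locations can be illustrated with inverse distance weighting, a simple spatial interpolation technique. It stands in here for the methods named above and is not necessarily what the project used; the function name `idw_estimate`, the `power` parameter, and the sensor readings are assumptions for the example.

```python
import math


def idw_estimate(sensors, x, y, power=2.0):
    """Inverse-distance-weighted estimate at (x, y) from (sx, sy, value) sensors."""
    if not sensors:
        raise ValueError("at least one sensor reading is required")
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in sensors:
        dist = math.hypot(x - sx, y - sy)
        if dist == 0.0:
            return value  # exactly at a sensor: use its reading directly
        weight = 1.0 / dist ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Made-up sensor readings (x, y, algae index), for illustration only.
sensors = [(0.0, 0.0, 10.0), (4.0, 0.0, 30.0)]
midpoint = idw_estimate(sensors, 2.0, 0.0)   # equidistant from both sensors
at_sensor = idw_estimate(sensors, 4.0, 0.0)  # coincides with the second sensor
```

Nearer sensors get larger weights, so the estimate at a point equidistant from two sensors is the average of their readings, and the estimate at a sensor is that sensor's own reading.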

  • Health Procedure Cost Explorer | Web App (Development)
    https://drive.google.com/file/d/1IwtWOAObd1aBcfm2IukvtzqNQaR_PjiP/view

    I set up a free, simple, full-stack server for hosting a web app. The link is to an online, interactive box-plot visualization that enables exploring health procedure costs. Insurance claims data are used to show how expenses are distributed to the insurer, provider, and patient.

    A second bar-graph version allows the user to mouse-over various procedural choices for treating osteoarthritis. Here is the link:

    https://drive.google.com/file/d/10gVQWka51w0RA5wmO4_BPIeEt3nt-ZRr/view?usp=sharing

    A healthcare provider can view the patient outcomes to guide the choice going forward.

  • Learning Topics and Communities in Political Blog Posts (Development)
    https://arxiv.org/pdf/1610.05756.pdf

    I designed and implemented a statistical learning algorithm, applied it to political blog post data, and authored the resulting publication. The analysis identifies a latent group that provides commentary on sensational crime. The results are published in the Journal of Classification.

  • Learning Original Poster in Group Conversation Data (Development)
    https://arxiv.org/pdf/1809.03648.pdf

    I contributed to a dynamic programming algorithm and applied it to an election day mega-thread on Reddit and to courtroom transcriptions. The method performs credit attribution, similar to methods used in web advertising.

  • Statistics Web Blog (Development)
    https://statsaddict.com/

    I created a WordPress web blog where I've written posts sharing my experience during my Ph.D. program in statistics and as an early-career data scientist and quantitative consultant. These include graphics I've created and discussion of industries of interest.

  • Salon Customer Brand Converter (Development)
    https://drive.google.com/file/d/1uVhkJSdCEioSStJNuitvSPb9NVxnSdJ7/view?usp=sharing

    SureTint Technologies software LaRu enables beauty salons to record customer hair formulas.

    I continued developing an application that converts formulas from one product line to another. The data are stored on AWS, the code is written in Python, and the conversion uses a statistical model.

    Features developed include a filter to ensure products conform to manufacturer recommendations.

  • Learning to Make a Tableau Dashboard (Development)
    https://drive.google.com/file/d/1ygKMZlXeIxfsyl8YjEJPGQGrVphbpYUg/view?usp=sharing

    I followed a tutorial to visualize CO2 emissions data by country and year. One graphic shows the amounts on a world map, and the other is a time series plot. It's possible to select geographic subsets and to mouse over points to see specific observed values.

Skills

  • Languages

    R, Python, HTML, CSS, JavaScript, SAS, Java, SQL
  • Frameworks

    RStudio Shiny, Django, Spark, Express.js
  • Paradigms

    ETL, Data Science, App Development, Microservices, Business Intelligence (BI)
  • Industry Expertise

    Project Management, Financial Modeling, Web Development, Healthcare, Marketing
  • Storage

    Data Pipelines, Databases, JSON, AWS S3, Redshift, AWS DynamoDB, MySQL
  • Other

    Data Analytics, Data Reporting, Data Visualization, Data Cleaning, Analytics, Algorithms, Natural Language Processing (NLP), Data Architecture, Data Modeling, Analysis, Regression Models, Statistical Modeling, Technical Reports, Applied Mathematics, Statistics, Statistical Analysis, Data Analysis, Consulting, Time Series, Data Matching, Machine Learning, Web Scraping, AWS, Tableau Configuration, UI Development, Dashboards, APIs, Scheduling, Data Engineering, Custom Audio Embedding, Deep Learning, Advertising, Serverless, Software Development, Publishing, Blogging, Neural Networks
  • Libraries/APIs

    Pandas, PySpark, PyTorch, Node.js, Facebook API, Twitter API, TensorFlow, Keras
  • Tools

    Jira, Atlassian Confluence, Jupyter, Microsoft Word, RStudio, Git, GitHub, Tableau, SPSS, Amazon SageMaker, AWS QuickSight, Microsoft Excel
  • Platforms

    AWS EC2, WordPress, Docker, AWS Lambda, Amazon Web Services (AWS)

Education

  • Doctor of Philosophy & Master of Science degrees in Statistical Science
    2013 - 2018
    Duke University - Durham, NC, USA
  • Bachelor of Arts degree in Mathematics
    2009 - 2013
    Pomona College - Claremont, CA, USA
