Derek Owens-Oas, Data Analysis Developer in Richmond, VA, United States

Member since December 20, 2019
Derek has a Ph.D. in statistical science and worked as a data scientist and software developer at Xylem. A published author in the Journal of Classification, his expertise is in providing technical reports and insights with interactive visualizations. His extensive knowledge of Python and R libraries, state-of-the-art methods, and ability to communicate make him an asset to any company. Specialties include text and online social network analysis.


  • Xylem, Inc.
    R, Python, GitHub, Jira, Confluence, AWS S3/EC2




Preferred Environment

R, Word, WordPress, Python, Excel, GitHub

The most amazing contribution I've made at Xylem was an interactive application to help city utilities assess water-pipe-network quality in Dallas, DC, and Howard County.


  • Data Scientist

    2018 - 2019
    Xylem, Inc.
    • Developed a predictive model and application to efficiently prioritize water pipe inspection for major US city utilities.
    • Recruited talent to Xylem at an American Statistical Association event.
    • Wrote technical reports with data graphics and statistical language to inform management and a company executive.
    • Composed blog posts to emphasize and clarify company impacts.
    • Created and presented an interactive visualization of water quality and algae level in Lake Erie.
    Technologies: R, Python, GitHub, Jira, Confluence, AWS S3/EC2


  • Online Social Network Report and Application (Development)

    I developed features, a learning algorithm, and a web app visualization of topics and connections in an online social network. R and Python implementations are available. The communication modes analyzed include blog posts, Facebook comments and messages, tweets, and courtroom transcriptions.

  • Water Pipe Inspection Prioritizing Application (Development)

    A statistical report and web application to evaluate water pipe quality in DC, Dallas, and Howard County. I wrote code that applies machine learning algorithms to estimate the probability of each pipe breaking in the next three years and visualizes the results on an interactive map.

  • Lake Erie Water Quality Assessment (Development)

    I developed an interactive map giving estimates of water quality between sensor locations. I used machine learning, optimized linear predictive modeling, and spatial statistics, and wrote a technical report detailing the patterns of algae blooms.

  • Health Procedure Cost Explorer | Web App (Development)

    I set up a free, simple, full-stack server for hosting a web app. The app is an online, interactive box-plot visualization for exploring health procedure costs. Insurance claims data show how expenses are distributed among the insurer, provider, and patient.

    A second, bar-graph version lets the user mouse over the procedural choices for treating osteoarthritis. A healthcare provider can view the patient outcomes to guide the choice going forward.

  • Learning Topics and Communities in Political Blog Posts (Development)

    I designed and implemented a statistical learning algorithm, applied it to political blog post data, and authored the resulting publication. The analysis identifies a latent group that provides commentary on sensational crime. The results are published in the Journal of Classification.

  • Learning Original Poster in Group Conversation Data (Development)

    I contributed to a dynamic programming algorithm and applied it to an election day mega-thread on Reddit and to courtroom transcriptions. The method performs credit attribution, like those used in web advertising.

  • Statistics Web Blog (Development)

    I created a WordPress web blog where I've written posts sharing my experience during my Ph.D. program in statistics and as an early-career data scientist and quantitative consultant. These include graphics I've created and discussion of industries of interest.


  • Languages

    R, Python, Java, SQL
  • Paradigms

    Data Science
  • Other

    Technical Reports, Applied Mathematics, Statistics, Statistical Analysis, Data Analysis, Machine Learning, Web Scraping, Software Development, Excel, Publishing, Blogging, Neural Networks
  • Tools

    Microsoft Word
  • Storage

    JSON, AWS DynamoDB
  • Libraries/APIs

    Facebook API, Twitter API
  • Platforms

    Amazon Web Services (AWS)


  • Doctor of Philosophy and Master of Science degrees in Statistical Science
    2013 - 2018
    Duke University - Durham, NC, USA
  • Bachelor of Arts degree in Mathematics
    2009 - 2013
    Pomona College - Claremont, CA, USA
