Joseph Miano, Developer in New York City, United States

Joseph Miano

Verified Expert in Engineering

Data Scientist and Developer

Location
New York City, United States
Toptal Member Since
December 1, 2022

Joseph is a data scientist with 3+ years of experience in computer vision, NLP, and tabular datasets. He's worked on analytics for large-scale medication adherence outreach programs, multi-task neural networks for brain microscopy image segmentation, and NLP models to detect COVID-19 outbreaks from news articles. He's also familiar with model explainability for credit risk assessment and fraud detection models. Joseph holds an MS in Computer Science with a specialization in machine learning.

Experience

Availability

Part-time

Preferred Environment

Windows, Linux, Python, PyTorch, PySpark, SQL, Amazon Web Services (AWS), Git, Pandas, Scikit-learn

The most amazing...

...thing I've developed is a BERT-based NLP model, described in a published first-author paper, that detects COVID-19 outbreaks in food establishments via news articles.

Work Experience

AI and Machine Learning Senior Associate

2022 - PRESENT
JPMorgan Chase
  • Engineered 100+ features using PySpark for customer authentication risk assessment models.
  • Trained machine learning models over millions of records to predict fraudulent customer authentication events.
  • Collaborated with business stakeholders to define model goals, scope, and KPIs.
Technologies: XGBoost, PySpark, SQL, Teradata, Oracle, Amazon Web Services (AWS), Jira, Bitbucket, Confluence, Dask, Linux

Teaching Assistant

2022 - 2023
Correlation One
  • Assisted participants of the Walmart Data Science Bootcamp with their learning.
  • Answered questions and supported team project groups via weekly office hours.
  • Led additional teaching sessions covering Natural Language Processing.
Technologies: Data Science, Python, Data Analytics, Machine Learning, Product Development, Mentorship

AI and Machine Learning Summer Associate

2021 - 2021
JPMorgan Chase
  • Developed object-oriented Python code to enable the explainability and interpretability of credit risk assessment models.
  • Presented results and conclusions to the broader intern group and organization consisting of more than 20 colleagues.
  • Visualized XGBoost prediction explanations by using partial dependence plots and Shapley value plots.
Technologies: Jupyter Notebook, Python, Jupyter, Data Visualization, Linux, XGBoost

Graduate Research Assistant

2020 - 2021
Georgia Tech Research Institute
  • Implemented the BERT and RoBERTa neural natural language processing models to automate COVID-19 outbreak detection using the content of web-scraped news articles.
  • Published a paper as the first author in the Springer Lecture Notes in Artificial Intelligence as part of the 2021 Artificial Intelligence in Medicine Conference.
  • Created a tutorial presentation for Microsoft Azure Machine Learning Studio to be used as a learning tool by internal and external collaborators.
Technologies: PyTorch, Beautiful Soup, Python, SQL, DataGrip, Generative Pre-trained Transformers (GPT), Natural Language Processing (NLP), LaTeX, Linux, Azure

Research Assistant

2018 - 2020
Neural Data Science Lab @ Georgia Tech
  • Developed a multi-task convolutional neural network for microstructure segmentation and brain area classification of mouse brain X-ray microtomography data.
  • Instructed students during coding workshops by answering technical and conceptual questions for the 2019 Deep Learning for Microscopy Image Analysis Workshop at the Marine Biological Laboratory in Woods Hole, MA.
  • Completed my thesis, titled "Multi-task Learning for Neural Image Classification and Segmentation," and graduated with the Georgia Tech Research Option.
Technologies: PyTorch, Neural Networks, LaTeX, Computer Vision

Software Engineering Summer Intern

2019 - 2019
American Express
  • Trained natural language processing machine learning models using scikit-learn to automate incident ticket routing.
  • Explained the summer project and results to a VP-level organization (40+ colleagues) during an end-of-internship presentation.
  • Validated various data sources to ensure consistency across systems.
Technologies: Splunk, Jira, Natural Language Processing (NLP), Generative Pre-trained Transformers (GPT), Random Forests, AIOps, Python, Scikit-learn

Senior Consultant

2016 - 2018
CVS Health
  • Identified patients at risk of medication non-adherence in outcomes-based contracts and executed adherence outreach programs.
  • Quality-tested 50+ features for an enterprise-level predictive modeling project in collaboration with stakeholders from several departments.
  • Delivered a recurring SQL onboarding training course to new hires and members of the product development department.
  • Coordinated onboarding for eight new hires and guided curriculum development for the onboarding program, including adding a new SQL training course and standardizing several existing classes.
Technologies: SQL, Microsoft Excel, Teradata, Oracle, Tableau, Microsoft PowerPoint, Trello, Product Analytics, Product Development, Data Analysis

COVID-19 Outbreak Detection in Food Establishments Using Web Scraping and RoBERTa

A proof-of-concept, human-in-the-loop pipeline to scrape news articles from the web using keywords, classify them as relevant or irrelevant to COVID-19 outbreaks in food establishments using a fine-tuned RoBERTa model, and output the most relevant articles for human review. Published a first-author paper at the Artificial Intelligence in Medicine conference in collaboration with internal GTRI partners and members of the Centers for Disease Control and Prevention (CDC).

Diabetes Readmission Dashboard

https://github.com/jmiano/ReaDash
ReaDash is an interactive dashboard displaying information about hospital readmissions of patients with diabetes. The user can visualize plots of features predictive of hospital readmission, interact with a machine learning model (logistic regression) to download predictions for their own dataset, and visualize the performance of various machine learning models in predicting hospital readmission for patients with diabetes.

Medication Review Modeling

https://github.com/jmiano/Med-Review-NLP
A group project studying the relationship between medication review text, metadata, and review usefulness. My focus in this project was exploratory data analysis and the training of text-only DistilBERT models to process the text and hybrid DistilBERT models for joint text and metadata processing. Overall, we were able to predict review usefulness from the text alone and from the metadata alone, but the hybrid model performed best.

2020 - 2021

Master's Degree in Computer Science

Georgia Institute of Technology - Atlanta, GA, USA

2018 - 2020

Bachelor's Degree in Computer Science

Georgia Institute of Technology - Atlanta, GA, USA

2012 - 2016

Bachelor's Degree in Neuroscience

University of Miami - Miami, FL, USA

Libraries/APIs

PyTorch, Scikit-learn, Pandas, PySpark, Beautiful Soup, XGBoost, Dask, SpaCy

Tools

Microsoft Excel, Microsoft PowerPoint, Git, Tableau, Jira, Bitbucket, Confluence, DataGrip, LaTeX, Jupyter, Plotly, MATLAB, Splunk, Trello

Languages

Python, SQL, C, Java

Paradigms

Data Science, Object-oriented Design (OOD)

Platforms

Windows, Linux, Amazon Web Services (AWS), Jupyter Notebook, Oracle Database, Oracle, Azure

Storage

Teradata, MySQL

Other

Programming, Deep Learning, Computer Vision, Natural Language Processing (NLP), Data Visualization, Neural Networks, Random Forests, Data Analytics, Artificial Intelligence (AI), Machine Learning, Data Analysis, Generative Pre-trained Transformers (GPT), Hypothesis Testing, Statistics, Data Structures, Web Scraping, Product Development, Product Analytics, Experimental Design, AIOps, Time Series, Time Series Analysis, Mentorship
