Alex Risman, Software Developer in Chicago, IL, United States
Member since March 15, 2018
In his current role, Alex uses artificial intelligence to automatically detect diseases in 2D and 3D medical images, with some algorithms achieving superhuman performance. Previously, he worked as a data scientist at an eCommerce company, where he built and deployed a deep-learning-based product search engine.
Alex is now available for hire

Location

Chicago, IL, United States

Availability

Part-time

Preferred Environment

Unix, Git, Jupyter Notebook

The most amazing...

...software I've developed is a tool for detecting 14 different diseases in chest X-rays.

Employment

  • CTO

    2016 - PRESENT
    Realize
    • Earned multiple US patents for combining convolutional and recurrent neural networks to automatically detect diseases in CT scans and MRIs, an approach that is the current state of the art.
    • Developed an algorithm that detects tuberculosis in chest X-rays with world-class accuracy (>.9 AUC), as determined by multiple third-party evaluations.
    • Assembled and led the founding team, including a marketer and an MD/PhD oncologist, as the CEO until our 2018 merger with a leading African radiology IT firm. This merger occurred at a valuation of >30x our paid-in capital.
    • Developed an AI system for the world’s largest radiology group, deployed as a containerized RESTful API.
    • Advised governmental and NGO officials on healthcare applications of AI.
    Technologies: Amazon Web Services (AWS), Spark, DICOM, Python, Docker, Kubernetes, Keras, PyTorch, Matplotlib, Seaborn, Image Recognition, TensorFlow, APIs, RESTful Development, RESTful APIs, Twisted, Open Data, OpenCV, Architecture, Integration, DevOps, Neural Networks
  • Python Developer for Machine Learning Tools

    2020 - 2020
    Confidential (MBB Consulting Firm via Toptal)
    • Productionized a machine learning prototype that my client had built for their own client (a Fortune 500 pharmaceutical firm), reducing the codebase by thousands of lines, adding modularity, and vastly simplifying the logic while preserving the original output.
    • Enabled the deployment of new marketing campaigns by configuration rather than a code change.
    • Wrote unit tests for all refactored modules and an automatic end-to-end test for the entire system.
    Technologies: Python, Pytest, Unit Testing, Code Refactoring, NumPy, Pandas, Azure, Tableau
  • Data Engineering Architect

    2018 - 2020
    Confidential (Major US Pharmacy Chain, via Toptal)
    • Created systems, including accurate ML models and deep chains of complex Spark SQL queries, to identify gaps in 100M+ patients' vaccination histories based on CDC guidelines and generate personalized vaccine recommendations daily.
    • Developed a PySpark method for adding a unique 18-digit ID to a DataFrame without coalescing to a single partition, removing a department-wide bottleneck.
    • Scaled the existing system for notifying patients that their prescriptions were ready from single-node, on-premises SQL to distributed Spark SQL in Azure.
    • Conducted hiring of data scientists and data engineers.
    Technologies: Databricks, Spark, PySpark, Spark SQL, Spark ML, Apache Airflow, SQL, Jira, Agile, Python, Azure, NumPy, Pandas, Scikit-learn, Unit Testing, Big Data, Big Data Architecture, Data Pipelines, Architecture, Integration
  • Spark Consultant

    2018 - 2018
    FLYR
    • Optimized existing YARN-managed PySpark jobs running on GCP, cutting runtimes and costs by over 80%.
    • Trained client's staff in best practices for Spark and data engineering.
    • Used Agile methodology to manage my work, including daily scrums and sprint planning with Jira.
    Technologies: Google Cloud Platform (GCP), Google Cloud Dataproc, Spark, PySpark, Spark ML, BigQuery, Kubernetes, YARN, Agile, Jira
  • Data Scientist

    2013 - 2017
    McMaster-Carr Supply
    • Conceived, developed, and deployed a deep-learning-based eCommerce search engine that trained recurrent neural networks on millions of customer searches, increasing the probability a given search would end with an "add to order" by 1.07%.
    • Estimated and visualized the causal effect of “punch-out” purchasing software on sales with R/ggplot2, using a panel dataset of monthly sales figures from 30 customers (two years before and after activation).
    • Built systems for tracking and analyzing A/B tests using a Neo4J graph database and R with methods for verifying assumptions and estimating treatment effects in superiority and non-inferiority trials.
    • Developed a machine learning model to decide if non-catalog products sourced for customers required hazards handling based on supplier/description, achieving .99 AUC, 98% accuracy, and no false negatives in testing.
    • Prototyped the above machine learning model in Python using Scikit-learn and Pandas.
    • Implemented a Random Forest algorithm in C# on top of Accord, the most popular .NET ML framework, for production; Random Forest pull request to Accord accepted to master branch.
    • Prototyped the above machine learning model in R using Random Forest; the production implementation is pending.
    Technologies: Theano, Keras, Scikit-learn, NumPy, Pandas, Python, C#.NET, Neo4j, Splunk, Time Series, Time Series Analysis
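
The Data Engineering Architect entry above mentions adding a unique 18-digit ID to a DataFrame without coalescing to a single partition, but does not describe the method. As a rough illustration only, one common approach is to count the rows in each partition, turn those counts into a starting offset per partition, and then number rows locally; Spark's RDD.zipWithIndex works on a similar idea. The sketch below simulates partitions as plain Python lists. The function names and sample data are hypothetical and are not taken from the original work.

```python
from itertools import accumulate


def partition_offsets(partition_sizes):
    """Starting index of each partition: an exclusive running sum of the sizes."""
    return [0] + list(accumulate(partition_sizes))[:-1]


def assign_ids(partitions, width=18):
    """Attach a unique, zero-padded ID string to every row, one partition at a time."""
    offsets = partition_offsets([len(rows) for rows in partitions])
    result = []
    for offset, rows in zip(offsets, partitions):
        # Each partition needs only its own offset, so in a real cluster
        # this loop body could run independently on every worker.
        result.append(
            [(f"{offset + i:0{width}d}", row) for i, row in enumerate(rows)]
        )
    return result


if __name__ == "__main__":
    # Hypothetical data: four partitions of uneven size, one of them empty.
    demo = [["a", "b"], ["c"], [], ["d", "e"]]
    for part in assign_ids(demo):
        print(part)
```

Only the per-partition counts have to be gathered centrally, which is a small amount of data compared with moving every row onto one node.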

Experience

  • Anomaly Detection in Volumetric Images Using Sequential Convolutional and Recurrent Neural Networks (Development)
    https://patents.google.com/patent/US10347010B2/en

    I created what is now the state-of-the-art deep learning architecture for analyzing CT scans. Computer-implemented methods and apparatuses for anomaly detection in volumetric images are provided. A two-dimensional convolutional neural network (CNN) is used to encode slices within a volumetric image, such as a CT scan. The CNN may be trained using an output layer that is subsequently omitted during the use of the CNN as an encoder. The CNN encoder output is applied to a recurrent neural network (RNN), such as a long short-term memory network. The RNN may output various indications of the presence, probability, and/or location of anomalies within the volumetric image.
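
The architecture described above can be sketched in PyTorch as follows. This is a minimal illustration under assumptions, not the patented implementation: the layer sizes, the class names SliceEncoder and VolumeAnomalyDetector, and the single-probability output are all hypothetical choices made for the example. It shows a 2D CNN with a training-only output layer that is skipped at encoding time, and an LSTM that reads the per-slice encodings in order.

```python
import torch
from torch import nn


class SliceEncoder(nn.Module):
    """2D CNN that maps one slice (1 x H x W) to a feature vector."""

    def __init__(self, feature_dim: int = 32, num_classes: int = 2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(8, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(16, feature_dim),
            nn.ReLU(),
        )
        # Output layer used only while training the CNN on single slices.
        self.classifier = nn.Linear(feature_dim, num_classes)

    def forward(self, x, encode_only: bool = False):
        feats = self.features(x)
        # When used as an encoder, the output layer is omitted.
        return feats if encode_only else self.classifier(feats)


class VolumeAnomalyDetector(nn.Module):
    """Encodes every slice with the CNN, then runs an LSTM over the sequence."""

    def __init__(self, encoder: SliceEncoder, feature_dim: int = 32, hidden_dim: int = 64):
        super().__init__()
        self.encoder = encoder
        self.lstm = nn.LSTM(feature_dim, hidden_dim, batch_first=True)
        self.head = nn.Linear(hidden_dim, 1)

    def forward(self, volume):
        # volume has shape (batch, slices, H, W)
        b, s, h, w = volume.shape
        slices = volume.reshape(b * s, 1, h, w)
        feats = self.encoder(slices, encode_only=True).reshape(b, s, -1)
        _, (h_n, _) = self.lstm(feats)
        # One anomaly probability per volume, from the final hidden state.
        return torch.sigmoid(self.head(h_n[-1])).squeeze(-1)


if __name__ == "__main__":
    model = VolumeAnomalyDetector(SliceEncoder())
    fake_scan = torch.randn(2, 5, 16, 16)  # random stand-in for two small volumes
    print(model(fake_scan).shape)
```

Emitting a prediction per slice instead of per volume (using the LSTM's full output sequence rather than its final state) would be one way to indicate location as well as presence, which the description also mentions.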

  • CT Lung Nodule Detection (Development)
    https://www.youtube.com/watch?v=X_8bpuL0G3Q

    I developed AI software that automatically detects lung nodules in CT scans. These nodules are often missed by radiologists and can portend cancer. The linked video demonstrates the AI results and their integration into the radiology workflow.

  • Scaling Up Music | Master's Project at UC Berkeley (Development)

    I deployed and used a Spark cluster to predict the genre of songs in The Echo Nest’s Million Song Dataset using data on volume, tempo, pitch, and “danceability”. I also wrote the code to train models using Spark’s MLlib.

Skills

  • Languages

    Python, R, SQL, C#.NET, JavaScript, Scala
  • Frameworks

    Spark, YARN, Twisted
  • Libraries/APIs

    PySpark, Pandas, Scikit-learn, TensorFlow, Keras, NumPy, PyTorch, SciPy, OpenCV, Theano, MLlib, Spark ML, Matplotlib
  • Tools

    Spark SQL, Jira, Jupyter, Apache Airflow, Git, Google Cloud Dataproc, BigQuery, Pytest, Tableau, Seaborn, Splunk, Collada, CAD
  • Paradigms

    Data Science, ETL, DevOps, RESTful Development, Distributed Computing, Agile, Parallel Computing, Unit Testing
  • Platforms

    Databricks, Amazon Web Services (AWS), AWS EC2, Kubernetes, Docker, Ubuntu, Linux, Azure, Jupyter Notebook, Unix, Google Cloud Platform (GCP), Apache Kafka
  • Storage

    AWS S3, Neo4j, Data Pipelines, Cassandra, Google Cloud
  • Other

    APIs, Image Analysis, 3D Image Processing, Image Processing, Machine Learning, Data Engineering, Big Data, Big Data Architecture, RESTful APIs, Architecture, Integration, Neural Networks, Time Series, Time Series Analysis, Artificial Intelligence (AI), Deep Learning, Random Forests, Computer Vision, Natural Language Processing (NLP), Convolutional Neural Networks, Recurrent Neural Networks, Data Modeling, Object Tracking, OCR, DICOM, AWS, Economics, Code Refactoring, Image Recognition, Apache Cassandra, Open Data, Forecasting, 3D CAD

Education

  • Master's Degree in Information and Data Science
    2014 - 2015
    University of California, Berkeley - Berkeley, CA, USA
  • Bachelor's Degree in Mathematical Methods in the Social Sciences, Economics
    2009 - 2013
    Northwestern University - Evanston, IL, USA
