Alex Risman

Chicago, IL, United States
Member since March 15, 2018
In his current role, Alex uses artificial intelligence to automatically detect diseases in 2D and 3D medical images, with some of his algorithms achieving superhuman performance. Previously, he worked as a data scientist at an eCommerce company, where he built and deployed a deep-learning-based product search engine.
Alex is now available for hire
Portfolio
  • PeriData
    Python, NumPy, Pandas, Scikit-learn, Keras, Docker, Kubernetes, AWS, Jupyter...
  • Realize
    Keras, Kubernetes, Docker, Python, DICOM, Spark, AWS
  • McMaster-Carr
    C#.NET, Python, Pandas, NumPy, Scikit-learn, Keras, Theano
Experience
  • Python, 5 years
  • R, 5 years
  • AWS EC2, 4 years
  • Artificial Intelligence (AI), 4 years
  • Pandas, 4 years
  • Scikit-learn, 4 years
  • AWS S3, 4 years
  • Amazon Web Services (AWS), 3 years
Availability
Part-time
Preferred Environment
Jupyter Notebook, Git, Unix Command Line
The most amazing...
...software I've developed is a tool for detecting 14 different diseases in chest X-rays.
Employment
  • Founder | Managing Principal
    2018 - PRESENT
    PeriData
    • Successfully completed AI and data science development and advisory engagements for dozens of clients.
    • Led the development of a deep learning algorithm that converted 2D images into 3D CAD models using Theano, Docker, and Pycollada.
    • Developed an educational math website using JupyterHub, Docker, and AWS.
    • Built a platform that enables users to, with a single command line prompt, spin up a new server on AWS and conduct multi-GPU training of deep learning models using Keras, Docker, and Kubernetes.
    • Acted as a Spark consultant to FLYR, an airline revenue management firm. Improved the performance of the existing Spark processes running on Google Cloud Dataproc, cutting job runtimes by up to 80% and computing costs by up to 90%, saving the company $10,000/year.
    Technologies: Python, NumPy, Pandas, Scikit-learn, Keras, Docker, Kubernetes, AWS, Jupyter, Google Cloud
  • CTO
    2016 - PRESENT
    Realize
    • Company acquired by IntriHEALTH, Africa's leading radiology IT provider, to bring AI-powered diagnostic solutions to the developing world. Developed an algorithm that detects 14 diseases in chest X-rays, currently undergoing a multisite clinical trial in South Africa.
    • Spearheaded a strategic partnership with vRad, America’s largest radiology practice, to create and deploy algorithms for prioritizing CT scans in emergency work queues by the likelihood of a pulmonary embolism (a national first).
    • Implemented massively parallel extraction and preprocessing jobs for a 10+ TB database of CT scans using PySpark. Since CT scans are typically several hundred MB each, I developed an extremely memory-efficient architecture, which eventually allowed us to run jobs arbitrarily quickly by scaling up the cluster.
    • Co-invented a deep learning architecture combining CNNs and RNNs to analyze 3D images: Risman, Alexander; Chen, Sea. 2017. Anomaly Detection in Volumetric Images. US20180033144A1, filed September 26, 2017. Patent pending.
    • Developed an algorithm that can detect lung nodules in CT, with third-party testing finding a <20% miss rate at a clinically acceptable false positive level. For comparison, radiologist miss rates of over 50% have been documented.
    Technologies: Keras, Kubernetes, Docker, Python, DICOM, Spark, AWS
  • Data Scientist
    2013 - 2017
    McMaster-Carr
    • Conceived of, developed, and deployed a deep-learning-based eCommerce search engine that trained recurrent neural networks on millions of customer searches, increasing the probability that a given search would end with an "add to order" by 1.07%, as shown by A/B testing.
    • Estimated and visualized the causal effect of “punch-out” purchasing software on sales with R/ggplot2, using a panel dataset of monthly sales figures from 30 customers (two years before and after activation).
    • Built systems for tracking and analyzing A/B tests using a Neo4J graph database and R with methods for verifying assumptions and estimating treatment effects in superiority and non-inferiority trials.
    • Developed a machine learning model to decide whether non-catalog products sourced for customers required hazards handling based on supplier/description, achieving 0.99 AUC, 98% accuracy, and no false negatives in testing.
    • Prototyped the above machine learning model in Python using Scikit-learn and Pandas.
    • Implemented a Random Forest algorithm in C# on top of Accord, the most popular .NET ML framework, for production; Random Forest pull request to Accord accepted to master branch.
    • Developed a machine learning model to sort product attributes in new faceted search panes by predicted popularity rank, correctly predicting the most popular attribute in 59% more existing panes than those in which our merchandising department had placed it at the top.
    • Prototyped the above machine learning model in R using Random Forest; the production implementation is pending.
    Technologies: C#.NET, Python, Pandas, NumPy, Scikit-learn, Keras, Theano
Experience
  • CT Lung Nodule Detection (Development)
    https://www.youtube.com/watch?v=X_8bpuL0G3Q

    I developed artificial intelligence software to automatically detect lung nodules in CT scans. These nodules are often missed by radiologists and can portend cancer. The linked video demonstrates the AI results and the integration into the radiology workflow.

  • Scaling Up Music | Master's Project at UC Berkeley (Development)

    I deployed and used a Spark cluster to predict the genre of songs in The Echo Nest’s Million Song Dataset using data on volume, tempo, pitch, and “danceability”. I also wrote the code to train models using Spark’s MLlib.

Skills
  • Languages
    Python, R, SQL
  • Libraries/APIs
    Keras, NumPy, Pandas, Scikit-learn, PySpark, TensorFlow
  • Platforms
    Amazon Web Services (AWS), AWS EC2, Kubernetes, Docker
  • Storage
    AWS S3, Google Cloud
  • Other
    Artificial Intelligence (AI), Deep Learning, Random forests, Computer Vision, Natural Language Processing (NLP), Convolutional Neural Networks, Recurrent Neural Networks, 3D CAD
  • Frameworks
    Spark
  • Tools
    Jupyter, CAD, Collada
  • Paradigms
    Parallel & Distributed Computing
Education
  • Master's degree in Information and Data Science
    2014 - 2015
    University of California, Berkeley - Berkeley, CA, USA
  • Bachelor's degree in Mathematical Methods in the Social Sciences, Economics
    2009 - 2013
    Northwestern University - Evanston, IL, USA