James Arnemann

Statistics Developer in New York, NY, United States

Member since May 25, 2019
James is an experienced data scientist and machine learning engineer with publications in leading journals. He has held positions researching and deploying deep learning models at UC Berkeley, Intel, and national laboratories.


  • Statistics 12 years
  • Bayesian Statistics 10 years
  • Data Visualization 10 years
  • Python 10 years
  • Convolutional Neural Networks 7 years





Preferred Environment

Jupyter Notebook, Git

The most amazing...

...thing I've implemented was a deep learning algorithm looking at simulated dark matter distributions to predict cosmological parameters that govern our universe.


  • Machine Learning Engineer

    2019 - PRESENT
    • Built models for behavioral biometrics and continuous multi-factor authentication in the field of cyber security.
    Technologies: Python
  • Program Director of Research Science

    2018 - 2019
    New York-Presbyterian Hospital
    • Built models from historical data to predict the number of patients in the emergency departments of the different NYP hospitals.
    • Cleaned and parsed millions of electronic health records to determine hospital-acquired venous thromboembolism (VTE) rates and metrics of how different hospitals and departments address it.
    • Developed analytics on oncology rates by department and cancer type throughout NYP's ambulatory care network.
    • Taught Python programming and data analysis courses to over 50 NYP employees.
    Technologies: Python, SQL
  • Deep Learning Research Scientist

    2017 - 2018
    National Energy Research Scientific Computing Center (NERSC)
    • Implemented deep learning architectures on cosmology simulations to understand and predict the parameters that govern the evolution of the universe.
    • Collaborated with a diverse team from Lawrence Berkeley National Lab, Intel, and Cray to run these models at state-of-the-art performance on the world's eighth-largest supercomputer.
    • Published in SC18 (The International Conference for High Performance Computing, Networking, Storage, and Analysis).
    Technologies: Python, TensorFlow
  • Graduate Student Researcher

    2013 - 2018
    UC Berkeley
    • Led multiple computational projects and developed novel algorithms in machine learning.
    • Developed a novel exploration algorithm using Bayesian non-parametric statistical analysis and information theory (accepted to NIPS 2014).
    • Trained an autoencoder neural network to learn temporal dynamics of cellular automata evolution.
    • Classified handwritten digits with an unsupervised neural network algorithm using only local learning rules.
    • Mentored research assistants to take on original research projects.
    Technologies: Python, TensorFlow
  • Data Science Intern (Artificial Intelligence Group)

    2017 - 2017
    • Implemented Neural Style Transfer with VGG-19 (Convolutional Neural Network).
    • Reconstructed audio spectrograms from hidden-layer activations of Deep Speech 2 (a many-layered bidirectional recurrent neural network model for speech-to-text).
    • Developed a novel approach for style transfer applied to audio signals.
    Technologies: Python, Neon, TensorFlow


  • You've Got Meal (Development)

    As a data science fellow at Insight, I deployed a recipe recommendation web app that applies collaborative filtering to implicit feedback drawn from purchasing and online recipe data.


  • Languages

    Python, SQL, C++
  • Libraries/APIs

    Scikit-learn, Pandas, NumPy, Matplotlib, TensorFlow, SciPy
  • Platforms

    Jupyter Notebook, Linux, Unix
  • Other

    Data, Linear Regression, Logistic Regression, Random Forests, Principal Component Analysis (PCA), Deep Learning, K-means Clustering, Statistics, Data Cleaning, Data Visualization, Naive Bayes, Convolutional Neural Networks, Bayesian Statistics, Support Vector Machines (SVM), Agile Data Science, Gradient Boosting
  • Tools

    MATLAB, Git
  • Paradigms

    Agile Software Development, Data Science


  • Master's degree in Physics
    2010 - 2013
    University of California Berkeley - California, USA
  • Bachelor's degree in Mathematics
    2006 - 2009
    University of Illinois Urbana-Champaign - Illinois, USA
