Ben Summers, Machine Learning Developer in Uppsala, Sweden

Member since April 26, 2019
With a PhD in pure maths, Ben would describe himself as an academic at heart, which means he’s deeply passionate about his work. Since finishing his PhD in 2012, he’s worked professionally as a back-end and data engineer for both a large global company and a small startup. For the past four years, he's been obsessed with machine learning, especially neural networks, and enjoys applying these techniques to solve real-world problems.


  • USC ISI (via Toptal)
    Doccano, Jupyter, PyCharm, ZeroMQ, Flask, Gensim, NLTK, Python
  • Instabridge
    Data Flows, Keras, TensorFlow, Scikit-learn, Pandas, PyTorch, Spark, BigQuery...
  • Instabridge
    Amazon Web Services (AWS), Spark, MongoDB, RabbitMQ...




Preferred Environment

Linux, Git, PyCharm, Jupyter

The most amazing...

...project I've worked on is my PhD thesis because it was the first time I was really challenged. Writing did not come naturally at all. I always want to explore.


  • Research Programmer

    2019 - 2020
    USC ISI (via Toptal)
    • Improved a cross-lingual query summarization system, resulting in the team winning the evaluation period despite having been in second place before the summarization stage.
    • Sped up experiment runs by using an approximate k-nearest neighbors algorithm (the Annoy library) for embedding lookups, after identifying the bottleneck with py-spy.
    • Increased iteration speed and reliability by structuring the code and enforcing design decisions with tests.
    Technologies: Doccano, Jupyter, PyCharm, ZeroMQ, Flask, Gensim, NLTK, Python
  • Data Scientist

    2018 - 2019
    Instabridge
    • Migrated data system from AWS to Google Cloud.
    • Developed models to identify moving WiFi hotspots, e.g., hotspots on trains or mobile devices.
    • Built models to estimate locations of WiFi hotspots from scans and connections by Android devices.
    • Wrote and deployed data models with dbt (data build tool).
    • Produced various ad-hoc analyses for stakeholders.
    • Deployed Snowplow event pipelines on the Google Cloud Platform (GCP) with Cloud Pub/Sub, Dataflow, BigQuery, and Google Compute Engine.
    Technologies: Data Flows, Keras, TensorFlow, Scikit-learn, Pandas, PyTorch, Spark, BigQuery, EMR
  • Back-end Developer

    2015 - 2018
    Instabridge
    • Designed and implemented the back-end architecture utilizing Heroku, AWS, and GCP.
    • Implemented data pipelines in Spark running on EMR scheduled with Airflow.
    • Applied machine learning to solve core data problems such as estimating the location and quality of WiFi hotspots, classifying hotspots as moving or stationary and as public or private, and matching hotspots and venues.
    • Implemented near real-time data pipelines using AWS Kinesis, Lambda functions, and DynamoDB.
    Technologies: Amazon Web Services (AWS), Spark, MongoDB, RabbitMQ, Google Cloud Platform (GCP), Heroku, Ruby on Rails (RoR)
  • Solutions Engineer

    2013 - 2014
    Cadence Design Systems
    • Developed internal productivity/process web applications for one of the two leading electronic design automation companies.
    • Improved my ability to work effectively in teams.
    • Developed communication skills.
    • Continuously evaluated and ranked priorities based on business value.
    Technologies: Microsoft 365, Linux, Oracle, Perforce, MySQL, PHP
  • Associate Tutor

    2008 - 2012
    University of East Anglia
    • Successfully communicated difficult concepts to a range of students.
    • Marked coursework.
    Technologies: Blackboard, Pen & Paper


  • Web-based Server Monitor and Admin Tools for Medal of Honor (Other amazing things)

    The web tool was written in PHP and involves extensive socket programming, session handling, and user authentication. The client tool was built with C# and .NET.


  • Languages

    Python, SQL, Python 3, JavaScript, PHP, Haskell, Scala
  • Libraries/APIs

    LSTM, PyTorch, TensorFlow, Spark ML, FFmpeg, Keras, PySpark, Scikit-learn, NLTK, ZeroMQ, Sklearn, Pandas, NumPy, OpenCV
  • Tools

    BigQuery, Amazon Elastic MapReduce (EMR), Spark SQL, Apache Airflow, AWS Athena, Jupyter, PyCharm, Git, Perforce, Gensim, Doccano, RabbitMQ, Google Compute Engine (GCE)
  • Platforms

    Linux, Google Cloud Platform (GCP), Amazon Web Services (AWS), Heroku, AWS Kinesis, AWS Lambda, Oracle, Blackboard, Arduino, Anaconda
  • Other

    EMR, Convolutional Neural Networks, Linear Algebra, Google BigQuery, Neural Networks, Deep Learning, Artificial Intelligence (AI), Machine Learning, Deep Neural Networks, AWS, Natural Language Processing (NLP), Probability Theory, Stream Processing, IP Networks, Image Recognition, Statistics, Deep Reinforcement Learning, Data Engineering, Computer Vision, Audio, Audio Processing, Digital Signal Processing, Serverless, Big Data, AWS API Gateway, Reinforcement Learning, Data Flows, Microsoft 365, Pen & Paper, Generative Adversarial Networks (GANs)
  • Frameworks

    Apache Spark, AWS EMR, Flask, Django, Ruby on Rails (RoR)
  • Paradigms

    Functional Programming, Object-oriented Programming (OOP), ETL, Data Science, Serverless Architecture, Agile
  • Storage

    PostgreSQL, Redshift, MySQL, MongoDB


  • B2 CEFR in Greek Language and Culture
    2014 - 2015
    University of Ioannina - Ioannina, Greece
  • PhD in Mathematics
    2008 - 2012
    University of East Anglia - Norwich, UK
  • Master's degree in Mathematics
    2004 - 2008
    University of East Anglia - Norwich, UK
