Eriksson Monteiro, Natural Language Processing (NLP) Developer in Aveiro, Portugal

Member since June 4, 2018
Eriksson possesses a PhD in computer science and is an experienced full-stack developer and machine learning engineer. For the past five years, he's built web and mobile applications and has participated in several machine learning tasks in diverse data science fields. He's particularly good at working with application frameworks such as Django, Play Framework, and Node.js.

Location

Aveiro, Portugal

Availability

Part-time

Preferred Environment

Docker, Git, IntelliJ IDEA

The most amazing...

...thing I've built was a system that predicted whether a loan would default and the likelihood of loss incurred if it did.

Employment

  • Lead Developer

    2018 - PRESENT
    PinTecnologia
    • Built a system to process administrative documents over the Ethereum blockchain.
    • Worked on the front-end and built the UI using React.
    • Designed and developed the back-end using Django and the Django REST framework.
    Technologies: Blockchain, Ethereum, React, Django
  • Product Owner

    2017 - PRESENT
    BMD Software
    • Managed the PACScenter product, an all-in-one medical imaging platform for patient studies (storage, visualization, and sharing) that enables simple and efficient workflows.
    • Developed core features for both the back-end and front-end.
    • Defined the strategy for new version releases.
    Technologies: MySQL, JavaScript, HTML, Hibernate, JPA, Scala, Java, Play Framework
  • Kaggle Competitions Expert

    2018 - 2018
    TalkingData AdTracking Fraud Detection
    • Worked with a large dataset using big data tools like Apache Spark.
    • Built an algorithm that predicts whether a user will download an app after clicking a mobile app ad, which helps combat click fraud.
    Technologies: Feature-driven Development (FDD), Gradient Boosting, MLlib, Apache Spark
  • Kaggle Competitions Expert

    2018 - 2018
    Toxic Comment Classification
    • Studied negative online behaviors such as toxic comments.
    • Built a multi-label model that detects different types of toxicity, such as threats, obscenity, insults, and identity-based hate.
    • Used a labeled dataset of comments from Wikipedia’s talk page edits.
    Technologies: Support Vector Machines (SVM), Naive Bayes, Ensemble Methods, Gated Recurrent Unit (GRU), Embedded Development, GloVe, LSTM, Deep Learning
  • Kaggle Competitions Expert

    2018 - 2018
    Recruit Restaurant Visitor Forecasting
    • Predicted how many customers a restaurant should expect each day so that it can purchase ingredients and schedule staff effectively.
    • Developed a prediction model for this task, which is difficult because unpredictable factors such as the weather and local competition affect restaurant attendance. It is even harder for newer restaurants with little historical data.
    • Worked with heterogeneous datasets.
    Technologies: Gradient Boosting, Time Series, ARIMA, LSTM
  • Kaggle Competitions Expert

    2017 - 2017
    Sberbank Russian Housing Market
    • Created a model that predicts realty prices so that renters, developers, and lenders can be more confident when they sign a lease or purchase a building.
    • Developed algorithms that use a broad spectrum of features to predict realty prices, using a rich dataset that includes housing data and macroeconomic patterns.
    Technologies: Validation, Feature-driven Development (FDD), Ridge Regression, Gradient Boosting
  • Kaggle Competitions Expert

    2017 - 2017
    Two Sigma Financial Modeling
    • Applied technology and systematic strategies to financial trading in order to forecast economic outcomes that can never be entirely predictable.
    • Back-tested to validate regression models that predict financial time series.
    Technologies: Ridge Regression, ExtraTreesRegressor, TensorFlow, Reinforcement Learning
  • Kaggle Competitions Expert

    2016 - 2016
    Santander Customer Satisfaction
    • Created a model that identifies dissatisfied customers.
    • Worked with hundreds of anonymized features to predict if a customer is satisfied or dissatisfied with their banking experience.
    Technologies: Principal Component Analysis (PCA), Neural Networks, Ensemble Methods, Decision Trees, Random Forests
  • Kaggle Competitions Expert

    2014 - 2014
    Loan Default Prediction
    • Determined whether a loan will default, as well as the loss incurred if it does default.
    • Developed methods that differ from traditional finance-based approaches to this problem, which distinguish between good and bad counterparties in a binary way; we sought to anticipate and incorporate both the default and the severity of the resulting losses.
    • Built, as a team, a bridge between traditional banking, where the aim is to reduce the consumption of economic capital, and an asset-management perspective, where the aim is to minimize the risk to the financial investor.
    Technologies: Gradient Boosting, Random Forests
  • Kaggle Competitions Expert

    2013 - 2013
    Job Salary Prediction
    • Built a prediction engine for the salary of any UK job advertisement, to improve the experience of users searching for jobs and to help employers and job seekers figure out the market worth of different positions.
    • Worked with a large dataset (hundreds of thousands of records) that was mostly unstructured text with few structured data fields. The records came in a number of different formats because they were drawn from hundreds of different sources.
    Technologies: Dimensionality Reduction, Tf-idf, Natural Language Processing (NLP), Ridge Regression, Random Forest Regression

Experience

  • Dicom Anonymizer
    https://hub.docker.com/r/bioinformaticsua/us-image-anonymizer/

    A tool that uses machine learning to automatically anonymize medical images' pixel data. The main objective of this project was to provide a suitable alternative to a manual process in medical image anonymization. It uses a neural network to find sensitive information in the images and removes regions containing this information.

  • Scaleus
    http://bioinformatics-ua.github.io/scaleus/

    Scaleus is a semantic web data migration tool.

  • Dicoogle
    http://dicoogle.com

    Dicoogle is an open-source PACS that's well suited to small and medium-sized institutions that want to install a PACS without a huge investment. It is based on a document-indexing model that can be shared over a LAN or a WAN, which allows the creation of federated views and extends querying and retrieval over a large number of archives. It supports searching through free-text queries as well as any DICOM tag. More information can be extracted from medical imaging repositories, offering increased flexibility compared to current DICOM query and retrieval services.

Skills

  • Languages

    Java, SQL, Python, Scala, C#, JavaScript, HTML
  • Frameworks

    Apache Spark, Play Framework, Django, JPA, Hibernate
  • Libraries/APIs

    Matplotlib, Pandas, NumPy, Scikit-learn, Keras, TensorFlow, React, LSTM, MLlib, Node.js, SciPy
  • Tools

    Jupyter, Docker Compose, PredictionIO, Git, IntelliJ IDEA, Weka
  • Paradigms

    Data Science, Scrum, Clean Code, DevOps, Concurrent Programming, Object-oriented Programming (OOP), Distributed Computing, Continuous Integration (CI), Parallel Computing, Agile Software Development, Unit Testing
  • Platforms

    Linux, Docker, MacOS, Windows, Ethereum, Blockchain
  • Other

    Deep Neural Networks, Data Mining, Machine Learning, Data Engineering, Predictive Analytics, RESTful Web Services, Regression Modeling, Classification Algorithms, Recommendation Systems, Recurrent Neural Networks, Convolutional Neural Networks, Random Forests, Gradient Boosted Trees, Statistics, Big Data, Natural Language Processing (NLP), Text Mining, GitFlow, Gradient Boosting, ARIMA, Time Series, Reinforcement Learning, ExtraTreesRegressor, Ridge Regression, Random Forest Regression, Tf-idf, Dimensionality Reduction, Feature-driven Development (FDD), Validation, Deep Learning, Embedded Development, Gated Recurrent Unit (GRU), Ensemble Methods, Support Vector Machines (SVM), Decision Trees, Neural Networks, Principal Component Analysis (PCA), Naive Bayes, GloVe
  • Storage

    Elasticsearch, MySQL, Memcached, Microsoft SQL Server, MongoDB, PostgreSQL

Education

  • Doctor of Philosophy (PhD) Degree in Computer Science
    2012 - 2017
    MAP-i (University of Aveiro, Porto and Minho) - Portugal
  • Master's Degree in Computer Science
    2007 - 2012
    University of Aveiro - Portugal
