Lalu Prasad Lenka, Data Scientist and Software Developer in Dublin, Ireland

Member since July 12, 2022
Lalu is a seasoned data scientist with a master's degree in data science from Trinity College Dublin and 3+ years of experience formulating research-driven approaches to solve challenging business problems using data and state-of-the-art machine learning algorithms. He has developed and deployed scalable ML solutions on cloud platforms like AWS and Azure and on on-premise Kubernetes clusters. He has also presented compelling insights to many stakeholders, helping them make data-driven decisions.

Portfolio

  • AAYS Analytics
    Python 3, Azure, PySpark, Statistical Methods, Machine Learning...
  • Aptus Data Labs
    Python 3, Machine Learning, Deep Learning, Agile, CI/CD Pipelines, Docker...
  • Aptus Data Labs
    Python 3, Python, Data Science, Agile Data Science, AWS...

Location

Dublin, Ireland

Availability

Part-time

Preferred Environment

Linux, VS Code, Databricks, Jupyter Notebook, MacOS, Slack, Agile, Agile Sprints, Jira, Windows

The most amazing...

...project I've developed is an ML solution for a fast-fashion client that recommended favorable styles, forecasted demand, and increased profitability by 15%.

Employment

  • Data Scientist

    2020 - 2021
    AAYS Analytics
    • Extracted, aggregated, and analyzed large data sets to provide actionable insights; also created intuitive visualizations to convey those results to a broader audience.
    • Analyzed profit erosion for a finance client and discovered adverse cost components, which helped optimize existing revenue streams.
    • Developed and deployed an intelligent supply chain solution for a fast-fashion client that helped the client maintain optimal stock levels for favorable clothing styles and increased earnings.
    • Contributed to building the data infrastructure for client organizations on Azure, including setting up a data lake, ETL (data engineering) pipelines, and machine learning pipelines.
    • Built and operationalized reliable, scalable machine learning pipelines for data preparation, model training, and prediction, and deployed the data pipelines on the Azure cloud platform.
    • Led client meetings and presented compelling findings, along with the reasoning behind them, to a wide range of stakeholders using insightful visualizations in Power BI reports.
    Technologies: Python 3, Azure, PySpark, Statistical Methods, Machine Learning, Agile Sprints, Agile, Python, Deep Learning, DeepAR, Demand Planning, Git, Agile Data Science, Data Science, Time Series, Time Series Analysis, Time Series Forecasting
  • Data Scientist

    2018 - 2020
    Aptus Data Labs
    • Partnered with clients to understand their business pain points and design analytical solutions to address them; also helped clients use their organization's data to drive strategic business decisions.
    • Focused on data preprocessing, machine learning modeling, and the operationalization of ML models.
    • Developed and deployed an LSTM-based named entity recognition (NER) model for a pharma client that helped reduce manual effort by 90%.
    • Developed and deployed an inventory optimization platform that used hybrid time series models for long-term forecasting and demand sensing. This helped the client maintain optimal inventory for products and plan demand fulfillment.
    • Developed and deployed a deep learning pipeline for a manufacturing client that performs text localization and recognition, helping reduce human error and operations costs by 40%.
    Technologies: Python 3, Machine Learning, Deep Learning, Agile, CI/CD Pipelines, Docker, Kubernetes, AWS, Data Science, Agile Data Science, Time Series, Time Series Analysis, Time Series Forecasting
  • Data Science Intern

    2018 - 2018
    Aptus Data Labs
    • Worked as a data science intern on time series analysis and text analytics projects.
    • Implemented a proof of concept for a supply chain optimization project for a Fortune Global 500 oil-and-gas company by creating a time series model to forecast the load (oil, gas) requirement at different ports based on historical data.
    • Developed text-analysis software for a multinational pharmaceutical company to migrate thousands of documents into a different format. It helped them reduce the operational cost of the merger by 5%.
    • Created a sanity-check document comparison tool to visually analyze the differences between two nearly identical documents. Automated the whole process and reduced manual effort to 1-2% of the initial effort.
    Technologies: Python 3, Python, Data Science, Agile Data Science, AWS, Amazon Web Services (AWS), Docker, Kubernetes, Time Series Analysis, Time Series, Time Series Forecasting
  • Machine Learning Intern

    2017 - 2017
    Tata Consultancy Services
    • Worked on a project called "Image Attribute Extraction," which involved extracting text from product images and populating specific attributes with the extracted text.
    • Developed a Keras model for text recognition using connectionist temporal classification (CTC) loss.
    • Developed a CNN-RNN-based neural network to detect text in product images, which helped the team make text extraction more robust.
    Technologies: Machine Learning, Python 3, Python, Deep Learning, OCR, Text Recognition, Text Detection, Computer Vision

Experience

  • Profit Erosion Analysis

    I performed profit erosion analysis using multivariate descriptive and regression analysis to find the root cause of profit erosion across five major sub-brands of a confectionery giant.

    Afterward, I discovered loss-making consumer pack types and adverse cost components. I then performed a prescriptive analysis and suggested business actions that optimized existing revenue streams by 18%.

    As part of the prescriptive analytics, I used time series forecasting to project sales and major cost components and to predict the potential losses if no action was taken. Finally, I created a "What if Tool" to prescribe the next best business action.

  • Fast Fashion Intelligent Supply Chain Solution

    I developed and deployed an intelligent supply chain solution for a fast-fashion client that recommends more favorable SKUs and forecasts their 14-week sales and required inventory.

    The solution helped the client maintain optimal stock levels for all SKUs and increased profitability by 15%.

  • Chemical Named Entity Recognition

    I developed and deployed an LSTM-based named entity recognition model to detect complex chemical names and medicines in drug formulation documents, which helped reduce manual effort by 90%.

    The model was developed on the IUPAC dataset and achieved an F1 score of 0.85.
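
    For context, an F1 score such as the one above is the harmonic mean of precision and recall. The following is a minimal sketch of how it can be computed at the entity level, assuming exact matching of (start, end, label) spans. The function name, spans, and labels are hypothetical and are not taken from the project.

```python
def entity_f1(gold, predicted):
    """Entity-level precision, recall, and F1 with exact span matching."""
    gold_set, pred_set = set(gold), set(predicted)
    true_positives = len(gold_set & pred_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical (start_token, end_token, label) spans for one document.
gold = [(0, 2, "CHEMICAL"), (5, 6, "DRUG"), (9, 12, "CHEMICAL")]
predicted = [(0, 2, "CHEMICAL"), (5, 6, "CHEMICAL"), (9, 12, "CHEMICAL")]

precision, recall, f1 = entity_f1(gold, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

    In this toy case, the second predicted span has the right boundaries but the wrong label, so it counts as both a false positive and a missed entity.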

  • Machine Screen Text Recognition

    I developed and deployed a deep learning pipeline to locate and recognize text on machine screens, such as temperature and pressure readings, and log it directly to the database, which helped reduce human error and operations costs by 40%.

    Text extraction from an image was done in two independent steps: detection (region proposal network) and recognition using CNNs. I achieved an mAP (mean average precision) of 0.56.

  • Demand Forecasting and Demand Sensing
    https://www.aptplan.ai/

    I developed an inventory optimization platform that used time-series models for long-term forecasting and demand sensing for short-term forecasts. The combined forecast was more accurate and more robust to real-world events such as weather changes and shifts in customer behavior. The platform supported selecting the best demand forecasting model from many candidates, including LSTM, ARIMA, Holt-Winters, and hybrid models such as an AdaBoost-LSTM ensemble.

    This helped the client maintain optimal inventory for all products and plan demand fulfillment.
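
    Model selection of this kind is commonly done by backtesting each candidate on a held-out window and keeping the one with the lowest error. The following is a minimal sketch under that assumption, using mean absolute error as the criterion. The simple baseline forecasters stand in for the LSTM, ARIMA, and Holt-Winters models, and all names and sales figures are made up for illustration; this is not the platform's actual implementation.

```python
def naive(history, horizon):
    """Repeat the last observed value."""
    return [history[-1]] * horizon


def seasonal_naive(history, horizon, season=4):
    """Repeat the last full season of observations."""
    last_season = history[-season:]
    return [last_season[i % season] for i in range(horizon)]


def moving_average(history, horizon, window=4):
    """Forecast the mean of the last `window` observations."""
    mean = sum(history[-window:]) / window
    return [mean] * horizon


def mean_absolute_error(actual, forecast):
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def select_model(series, candidates, holdout):
    """Return (best_name, scores) using MAE on the last `holdout` points."""
    train, test = series[:-holdout], series[-holdout:]
    scores = {
        name: mean_absolute_error(test, model(train, holdout))
        for name, model in candidates.items()
    }
    return min(scores, key=scores.get), scores


# Made-up quarterly demand with a repeating seasonal pattern.
sales = [10, 20, 30, 40, 12, 22, 32, 42, 14, 24, 34, 44]
candidates = {
    "naive": naive,
    "seasonal_naive": seasonal_naive,
    "moving_average": moving_average,
}
best, scores = select_model(sales, candidates, holdout=4)
print(best, scores)
```

    Because every candidate shares the same `(history, horizon)` signature, a heavier model can be added to `candidates` without changing the selection logic.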

  • IoT Streaming Analytics Platform

    I built a proof-of-concept IoT streaming analytics platform on an 8-node Spark cluster. The stream of real-time sensor data was collected using Kafka and processed using ML models on Spark, and the output was fed to a live Power BI dashboard.

  • Autonomous Car Parking
    https://github.com/Lplenka/Autonomous-Car-Parking

    A project comparing the performance of three different families of AI algorithms on autonomous car parking tasks:

    • Reinforcement learning – soft actor-critic (SAC)
    • Imitation learning – behavior cloning
    • Neuroevolution – genetic algorithm

    The above algorithms were used to build three different AI agents that tried to control the steering and acceleration to park the vehicle correctly. The performance of these agents was compared across multiple simulations.

  • Car Damage Detection Using Semantic Segmentation
    https://github.com/Lplenka/Car-Damage-Detection

    I developed a deep learning model to detect damage in a picture of a car, implementing Facebook's Detectron model using PyTorch to build an image segmentation model.

    Given a picture of a damaged car, the model identifies which part is damaged: the rear bumper, front bumper, headlamp, door, or hood.

Skills

  • Languages

    Python 3, Python, SQL, R
  • Libraries/APIs

    Scikit-learn, TensorFlow, PySpark, Spark ML, LSTM, PyTorch, SpaCy, NLTK, Keras, TensorFlow Deep Learning Library (TFLearn)
  • Tools

    Slack, AWS CodeDeploy, Kafka Streams, Microsoft Power BI, Amazon ECS (Amazon Elastic Container Service), VS Code, Jira, Microsoft Excel, Git, GitHub, Spark SQL
  • Paradigms

    Agile, Azure DevOps, Data Science
  • Platforms

    Jupyter Notebook, Linux, Azure, Docker, Kubernetes, Databricks, AWS Lambda, Apache Kafka, Amazon Web Services (AWS), MacOS, Windows
  • Other

    Machine Learning, Time Series, AWS, Deep Reinforcement Learning, Deep Learning, Data Analysis, Computer Vision, Natural Language Processing (NLP), Statistical Methods, CI/CD Pipelines, Demand Planning, Time-series, Time-series-forecasting, Serverless, ML-Flow, Azure Data Factory, Azure Data Lake, Genetic Algorithms, Artificial Intelligence (AI), Statistics, Agile Sprints, OCR, Text Mining, Predictive Modeling, Predictive Analytics, Data Queries, Web Scraping, DeepAR, Agile Data Science, Text Recognition, Text Detection, Time Series Analysis, Kalman Filtering, Time Series Forecasting, Prescriptive Analytics, Prescriptive Modeling
  • Storage

    MySQL, Azure SQL Databases

Education

  • Master's Degree in Data Science
    2021 - 2022
    Trinity College Dublin - Dublin, Ireland
  • Bachelor's Degree in Computer Science
    2014 - 2018
    Odisha University of Technology and Research - Bhubaneswar, India

Certifications

  • Specialization in Statistics
    AUGUST 2019 - PRESENT
    Coursera
  • Deep Learning Specialization
    NOVEMBER 2018 - PRESENT
    Coursera
