Ayoub Akennaf, Developer in Casablanca, Grand Casablanca, Morocco

Ayoub Akennaf

Data Engineer and AI Developer

Casablanca, Grand Casablanca, Morocco

Toptal member since December 15, 2025

Bio

Ayoub is an experienced AI and data engineer with over four years of expertise in building scalable data pipelines, machine learning models, and AI-driven solutions. Skilled in Python, SQL, BigQuery, and cloud platforms, he delivers clean, efficient, and high-impact data solutions. With a strong focus on scalability and reliability, Ayoub consistently translates complex data challenges into practical, production-ready systems.

Portfolio

Carrefour
Python, SQL, Data Build Tool (dbt), Google Cloud Platform (GCP), BigQuery, ETL...
Veolia
Data Engineering, Robotic Process Automation (RPA), Google Cloud Platform (GCP)...
SCOR – Paris
Data Engineering, Cloud Computing, Data Science, Python...

Experience

  • Python - 7 years
  • Information Science - 6 years
  • Artificial Intelligence (AI) - 5 years
  • Data Science - 4 years
  • Programming - 4 years
  • SQL - 4 years
  • Data Engineering - 4 years
  • Cloud Computing - 4 years

Preferred Environment

Linux, PyCharm, Anaconda, Cloud, Python, SQL

The most amazing...

...achievement has been combining deep technical expertise with a passion for AI to turn raw data into measurable and sustainable business impact.

Work Experience

Data Engineer, Cloud and Enterprise Data Platforms

2025 - 2025
Carrefour
  • Designed and maintained end-to-end cloud-native data pipelines across multiple business units, handling ingestion, transformation, validation, and provisioning, ensuring reliable enterprise-wide data availability at scale.
  • Implemented automated refresh monitoring, data quality checks, and reconciliation logic to guarantee platform reliability, SLA compliance, and consistent, high-integrity data for analytics and business decision-making.
  • Collaborated with cross-functional teams to architect reusable data assets, enforce governance, and optimize workflows, delivering scalable, standardized, and fully governed data solutions across the organization.
Technologies: Python, SQL, Data Build Tool (dbt), Google Cloud Platform (GCP), BigQuery, ETL, Pipelines, Data Quality, Data Governance, Apache Airflow, Web Scraping, OpenAI, Amazon Web Services (AWS), FastAPI, Data Scraping, Browser Automation, Scraping, Playwright, Large-scale Web Crawlers, PostgreSQL, Redis, Data Enrichment, Instagram, Lead Generation, Algorithms, APIs, Natural Language Processing (NLP), Node.js, Dashboards, RSS Feeds, AI Tools, AI Agents, Agentic AI, Claude, Cursor AI, REST APIs, JavaScript, OAuth, Webhooks, React, TypeScript, Application Performance Optimization, Performance, NestJS, AI Integration, Vercel, Selenium, Software Architecture, System Architecture, Code Review, Project Review, Agentic Frameworks, Agentic RAG Systems, Data Migration, ETL Tools, Data Transformation, Data Extraction, Website Data Scraping, ETL Pipelines, Claude API, Claude Code, Web Development, RAG Systems, Benchmarking, Vertex AI, Hyperparameter Tuning, Rule-based Programming, Vector Databases, Optical Character Recognition (OCR), AI Automation, AI Design, Cloud Infrastructure, Best Practices, AI Assistants

Data Engineer and Robotic Process Automation (RPA) Specialist

2024 - 2025
Veolia
  • Directed the entire automation and reporting project solo, managing client interactions, designing data architecture, and delivering dashboards and robotic process automation (RPA) solutions from concept to production.
  • Designed and developed full-stack Python scripts and RPA workflows to extract, process, and store complex data, fully automating reporting, ensuring data integrity, reducing manual effort, and delivering reliable insights across all operations.
  • Developed and implemented the entire data flow end-to-end, from identifying sources to building dynamic dashboards, ensuring real-time updates, seamless integration, and reliable decision-making, managing the project from design to delivery.
Technologies: Data Engineering, Robotic Process Automation (RPA), Google Cloud Platform (GCP), Data Scraping, ETL, JavaScript, Web Scraping, Amazon Web Services (AWS), FastAPI, Browser Automation, Scraping, Playwright, Large-scale Web Crawlers, PostgreSQL, Redis, Data Enrichment, Instagram, Lead Generation, Algorithms, Python, APIs, Natural Language Processing (NLP), Node.js, Dashboards, RSS Feeds, AI Tools, AI Agents, Agentic AI, Claude, Cursor AI, REST APIs, OAuth, Webhooks, React, TypeScript, Application Performance Optimization, Performance, NestJS, AI Integration, Vercel, Selenium, Software Architecture, System Architecture, Code Review, Project Review, Agentic Frameworks, Agentic RAG Systems, Data Migration, ETL Tools, Data Transformation, Data Extraction, Data Processing Automation, Website Data Scraping, ETL Pipelines, Claude API, Claude Code, Web Development, RAG Systems, Benchmarking, Vertex AI, Hyperparameter Tuning, Rule-based Programming, Vector Databases, Optical Character Recognition (OCR), AI Automation, AI Programming, AI Design, Cloud Infrastructure, Best Practices, AI Assistants

AI and Data Engineering Consultant

2021 - 2025
SCOR – Paris
  • Constructed and deployed an end-to-end NLP and LLM-based system to extract structured business entities from emails, combining NER, domain-specific embeddings, custom matching, and production deployment via Python, Flask, SQL, Docker, and CI/CD.
  • Built a production-grade portfolio risk management platform covering contracts, policies, and claims. Developed Python/Flask microservices, optimized the back end, delivered secure REST APIs, and implemented Dockerized CI/CD with SQL and MongoDB in an Agile setting.
  • Migrated data from Oracle to SQL Server, managing mapping, processing, cleaning, and granularity differences with high accuracy. Improved data management and system performance using Python, SQLAlchemy, and data processing libraries.
Technologies: Data Engineering, Cloud Computing, Data Science, Python, Large Language Models (LLMs), ETL, Data Manipulation, Databases, APIs, Microservices, Big Data, LangChain, LangGraph, Web Scraping, Amazon Web Services (AWS), FastAPI, Data Scraping, Browser Automation, Scraping, Playwright, Large-scale Web Crawlers, PostgreSQL, Redis, Data Enrichment, Lead Generation, Algorithms, Natural Language Processing (NLP), Node.js, Dashboards, RSS Feeds, AI Tools, AI Agents, Agentic AI, Claude, Cursor AI, REST APIs, JavaScript, OAuth, Webhooks, React, TypeScript, Google Cloud Platform (GCP), Application Performance Optimization, Performance, NestJS, AI Integration, Vercel, Selenium, Software Architecture, System Architecture, Code Review, Project Review, Agentic Frameworks, Agentic RAG Systems, Data Migration, ETL Tools, Data Transformation, Data Extraction, Data Processing Automation, Website Data Scraping, ETL Pipelines, Web Development, Benchmarking, Vertex AI, Hyperparameter Tuning, Rule-based Programming, Vector Databases, Optical Character Recognition (OCR), AI Automation, AI Programming, AI Design, Cloud Infrastructure, Best Practices, AI Assistants

Experience

DMS: Automated Feature Extraction from Company Email Data Using NER and Custom AI Algorithms

I built an end-to-end AI system to extract business-critical information from large volumes of company emails in the insurance sector. I designed and optimized the entire pipeline, from data preparation to model improvement.

The project combined classical NLP, transfer learning, and custom matching algorithms to reliably map extracted entities to the company's internal records. I implemented a spaCy-based NER model with transfer learning, engineered a feature-matching engine based on distance calculations, and iteratively fine-tuned the system to improve accuracy.
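
A minimal sketch of the distance-based matching idea, for illustration only. It is not the production engine: the function names, the 0.8 threshold, and the use of the standard library's `difflib.SequenceMatcher` as the similarity measure are all assumptions made for this example.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity ratio between two normalized strings."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def match_entity(extracted: str, records: list[str], threshold: float = 0.8):
    """Return the internal record closest to the extracted entity.

    Returns None when no record reaches the (hypothetical) threshold,
    so that low-confidence matches can be routed to manual review.
    """
    best, best_score = None, 0.0
    for record in records:
        score = similarity(extracted, record)
        if score > best_score:
            best, best_score = record, score
    return best if best_score >= threshold else None


# Hypothetical internal records and a misspelled entity from an email.
records = ["Acme Insurance Ltd", "Globex Re"]
print(match_entity("Acme Insurence Ltd", records))
```

Returning None below the threshold keeps uncertain matches out of the automated path, which fits a workflow where manual review is reduced rather than removed.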

The final solution enabled automated extraction of structured business features, reducing manual review time and improving operational efficiency.

DCM: Location-based Insurance Pricing Platform with Geocoding Intelligence

I led the development of a full insurance pricing web application that leverages geospatial intelligence to deliver precise, location-aware risk assessments.

I integrated advanced geocoding and reverse-geocoding services to accurately map incident and peril locations, enabling the system to automatically determine geographical coordinates and associated risks.

Using these insights, I designed and implemented a location-based pricing algorithm that enables insurers to dynamically adjust premiums based on spatial risk factors.
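
As a rough illustration of location-based pricing, the sketch below applies a premium loading when a geocoded point falls inside a peril zone. The zone structure, the loading values, and the haversine distance check are assumptions for this example and do not describe the platform's actual algorithm.

```python
import math


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometers between two coordinates."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def adjusted_premium(base: float, lat: float, lon: float, zones: list[dict]) -> float:
    """Apply the highest loading among the peril zones containing the point."""
    loading = 0.0
    for zone in zones:
        distance = haversine_km(lat, lon, zone["lat"], zone["lon"])
        if distance <= zone["radius_km"]:
            loading = max(loading, zone["loading"])
    return round(base * (1 + loading), 2)


# Hypothetical peril zone: 10 km radius with a 25% loading.
zones = [{"lat": 33.5731, "lon": -7.5898, "radius_km": 10.0, "loading": 0.25}]
print(adjusted_premium(1000.0, 33.5731, -7.5898, zones))
```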

Additionally, I ensured data security, integrity, and compliance across the platform while collaborating closely with cross-functional teams (data scientists, underwriters, back-end engineers) to align technical decisions with business needs.

The result was a robust, scalable underwriting tool that improved pricing accuracy and operational efficiency for insurance workflows.

FW Data Migration: Oracle-to-DCM Data Migration and System Integration

I executed a full data migration from the FW Oracle database into the DCM application database, ensuring a smooth transition with zero data loss and minimal disruption to ongoing operations.

I managed the entire implementation lifecycle, encompassing data mapping, processing, transformation, and data cleaning. A key challenge involved aligning different levels of data granularity between systems; I engineered efficient mapping and processing logic to ensure accurate, consistent records after migration.
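
One common way to align granularity is to roll fine-grained source rows up to the key used by the target system. The sketch below shows that idea under assumed field names (`claim_id`, `amount`); the real schemas and mapping rules are not described in this profile.

```python
from collections import defaultdict


def roll_up(rows: list[dict], key: str, amount_field: str) -> dict:
    """Aggregate fine-grained rows to one total per target-level key."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key]] += row[amount_field]
    return dict(totals)


# Hypothetical source rows at claim-line level, target at claim level.
source_rows = [
    {"claim_id": "C1", "amount": 100.0},
    {"claim_id": "C1", "amount": 50.0},
    {"claim_id": "C2", "amount": 75.0},
]
print(roll_up(source_rows, key="claim_id", amount_field="amount"))
```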

Following the migration, I validated data integrity end-to-end and optimized internal data management workflows, resulting in improved system performance and reduced manual overhead.

This project served as a critical foundation for enabling modernized insurance workflows on the DCM platform.

Automated Reporting and Dashboard System

I developed an end-to-end automated reporting ecosystem for Veolia Morocco subsidiaries (REDAL, AMENDIS, AMANOR), covering water, electricity, and water reuse operations.

I designed a complete data flow pipeline starting from source identification and process mapping across all client dimensions. Using Python, I automated data extraction, transformation, and loading into structured worksheets, ensuring accuracy and consistency.

On top of this foundation, I built interactive dashboards in Google Looker Studio, supporting real-time updates and scheduled monthly refreshes. This solution dramatically improved reporting speed, visibility, and decision-making across operational departments.

RPA-driven Automation for Internal Data Flow and Workflow Optimization

I designed and implemented a robust RPA system for Veolia Morocco to automate complex data sheet generation workflows within internal tools.

I built automation pipelines using Python, Selenium, BeautifulSoup, Pandas, and NumPy, which enabled the efficient extraction, processing, and storage of multidimensional operational data. The RPA flows also automated intricate UI navigation tasks, eliminating manual steps and ensuring consistent, error-free data handling.

Additionally, I created a unified data structure to simplify the retrieval and integration of generated sheets across departments, improving data accessibility and significantly reducing manual workload. This solution enhanced data reliability and accelerated insight generation for decision-makers.

Cloud-native Transversal Data Platform for Large-scale Retail Operations

I contributed to the development and maintenance of a large-scale, cloud-native data platform powering multiple business units within a major retail organization.

I architected and implemented end-to-end data ingestion, transformation, validation, and provisioning workflows, ensuring that enterprise-wide shared data assets were consistently available, reliable, and optimized for downstream analytics and applications.

To ensure platform stability and compliance, I implemented automated refresh monitoring and developed data quality and validation controls. I created robust reconciliation logic to detect anomalies and ensure SLA adherence at scale.
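
A reconciliation check of this kind can be sketched as a comparison of per-table metrics between source and target. The table names, the metric (row counts), and the tolerance parameter below are hypothetical and only illustrate the pattern.

```python
def reconcile(source: dict, target: dict, tolerance: float = 0.0) -> list[tuple]:
    """Compare per-table metrics and return a list of discrepancies.

    `source` and `target` map table names to a numeric metric such as
    a row count or a column sum. An empty list means the two sides agree.
    """
    issues = []
    for table, expected in source.items():
        actual = target.get(table)
        if actual is None:
            issues.append((table, "missing in target"))
        elif abs(actual - expected) > tolerance:
            issues.append((table, f"expected {expected}, got {actual}"))
    return issues


# Hypothetical row counts collected after a refresh.
source_counts = {"orders": 100, "stores": 10}
target_counts = {"orders": 98, "stores": 10}
print(reconcile(source_counts, target_counts))
```

In a scheduled pipeline, a non-empty result would typically fail the run or raise an alert before downstream consumers read the data.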

I also collaborated with cross-functional teams, including data engineers, analysts, and governance teams, to define reusable data assets and enforce enterprise data standards across the organization.

Education

2022 - 2025

Progress Toward a PhD in Computer Science and Data Science

School of Information Sciences - Rabat, Morocco

2018 - 2021

Master's Degree in Knowledge Engineering

School of Information Sciences - Rabat, Morocco

Certifications

AUGUST 2025 - PRESENT

Advanced Deployment

dbt Labs

AUGUST 2025 - PRESENT

dbt Fundamentals (dbt Studio)

dbt Labs

APRIL 2024 - PRESENT

Security Operations Center (SOC)

Cisco Learning and Certifications | via Coursera

FEBRUARY 2021 - PRESENT

Introduction to Deep Learning & Neural Networks with Keras

IBM | via Coursera

JUNE 2020 - PRESENT

Neural Networks and Deep Learning

DeepLearning.AI | via Coursera

APRIL 2020 - PRESENT

Machine Learning with Python

IBM | via Coursera

MARCH 2020 - PRESENT

Machine Learning

Stanford University | via Coursera

Skills

Libraries/APIs

REST APIs, Node.js, React, Shopify API, Playwright, Claude API, Puppeteer, Google Geocoding API, SQLAlchemy, Beautiful Soup, Pandas, NumPy, WhatsApp API

Tools

ChatGPT, n8n, Claude, Claude Code, PyCharm, BigQuery, Apache Airflow, Odoo

Languages

Python, SQL, JavaScript, TypeScript, Python Script, Java

Frameworks

Selenium, Shopify Hydrogen, NestJS, Agentic Frameworks, LangGraph

Paradigms

ETL, Rule-based Programming, Best Practices, Microservices, Testing, Anomaly Detection, Automation, Agile, Scrum

Platforms

Google Cloud Platform (GCP), Amazon Web Services (AWS), Shopify, Vercel, DigitalOcean, Vertex AI, Linux, Anaconda

Storage

PostgreSQL, Redis, Databases, Database Management Systems (DBMS), Data Pipelines, Data Validation, JSON, MySQL

Industry Expertise

Cybersecurity

Other

Data Engineering, Cloud Computing, Programming, Algorithms, Machine Learning, Information Science, Large Language Models (LLMs), APIs, Robotic Process Automation (RPA), Data Scraping, Natural Language Processing (NLP), Artificial Intelligence (AI), Generative Artificial Intelligence (GenAI), Data Migration, Data Transformation, Data Extraction, API Integration, Optical Character Recognition (OCR), Web Scraping, OpenAI, Web Crawlers, Dropshipping, Custom Shopify Apps, Document Parsing, Email Parsing, FastAPI, Full-stack Development, Browser Automation, Scraping, Large-scale Web Crawlers, Data Enrichment, Lead Generation, Dashboards, RSS Feeds, Shopify SEO, AI Tools, AI Agents, Agentic AI, Cursor AI, OAuth, Webhooks, Supabase, Application Performance Optimization, Messaging Patterns, Performance, AI Integration, Third-party APIs, Third-party Integration, Software Architecture, System Architecture, Code Review, Project Review, Technical Writing, Agentic RAG Systems, ETL Tools, Data Processing Automation, Website Data Scraping, ETL Pipelines, Web Development, Monday.com, Prompt Engineering, RAG Systems, Anthropic, Solution Architecture, System Design, Benchmarking, Hyperparameter Tuning, Vector Databases, OpenAI GPT-4 API, AI Automation, AI Programming, AI Design, Data Management, Cloud Infrastructure, AI Assistants, Bots, Data Science, LangChain, Instagram, Finance, Big Data, Computer Science, Deep Learning, Research, Data Manipulation, Cloud, Data Build Tool (dbt), Pipelines, Data Quality, Data Governance, CI/CD Pipelines, Deployment, Analysis, Materialization, Modularity, Supervised Learning, Regression, Statistical Analysis, Feature Engineering, Machine Learning Algorithms, Artificial Neural Networks (ANN), Computer Vision, Network Architecture, Calculus, Linear Algebra, Data Mining, Scalability, Dimensionality Reduction, Predictive Modeling, Security Management, Network Monitoring, Threat Modeling, Security Information and Event Management (SIEM), Threat Detection and Response (TDR), Data Cleaning, Data Preprocessing, Fine-tuning, Transfer Learning, Back-end, Cross-functional Collaboration, Web Applications, Pricing Models, Data Mapping, Handling Granularity Differences, Automated ETL, Reporting Dashboard Development, Looker Studio, Operational Analytics, Cross-Department Data Analysis, Workflow Automation, Web App Automation, Process Optimization, Cloud-Native Data Pipelines, ETL/ELT Workflow Architecture, Automated Monitoring & Reconciliation, Scalable Data Engineering, Cloud Services, Fintech, RESTful APIs, Architecture, QA Testing, MVP Design, Knowledge Graphs