
Ayoub Akennaf
Verified Expert in Engineering
Data Engineer and AI Developer
Casablanca, Grand Casablanca, Morocco
Toptal member since December 15, 2025
Ayoub is an AI and data engineer with over four years of experience building scalable data pipelines, machine learning models, and AI-driven solutions. Skilled in Python, SQL, BigQuery, and cloud platforms, he delivers clean, efficient data solutions. With a strong focus on scalability and reliability, he translates complex data challenges into practical, production-ready systems.
Experience
- Python - 7 years
- Information Science - 6 years
- Artificial Intelligence (AI) - 5 years
- Data Science - 4 years
- Programming - 4 years
- SQL - 4 years
- Data Engineering - 4 years
- Cloud Computing - 4 years
Preferred Environment
Linux, PyCharm, Anaconda, Cloud, Python, SQL
The most amazing...
...achievement has been combining deep technical expertise with a passion for AI to turn raw data into measurable and sustainable business impact.
Work Experience
Data Engineer, Cloud and Enterprise Data Platforms
Carrefour
- Designed and maintained end-to-end cloud-native data pipelines across multiple business units, handling ingestion, transformation, validation, and provisioning, ensuring reliable enterprise-wide data availability at scale.
- Implemented automated refresh monitoring, data quality checks, and reconciliation logic to guarantee platform reliability, SLA compliance, and consistent, high-integrity data for analytics and business decision-making.
- Collaborated with cross-functional teams to architect reusable data assets, enforce governance, and optimize workflows, delivering scalable, standardized, and fully governed data solutions across the organization.
Data Engineer and Robotic Process Automation (RPA) Specialist
Veolia
- Directed the entire automation and reporting project solo, managing client interactions, designing data architecture, and delivering dashboards and robotic process automation (RPA) solutions from concept to production.
- Designed and developed full-stack Python scripts and RPA workflows to extract, process, and store complex data, fully automating reporting, ensuring data integrity, reducing manual effort, and delivering reliable insights across all operations.
- Developed and implemented the entire data flow, from identifying sources to building dynamic dashboards, ensuring real-time updates, seamless integration, and reliable decision-making.
AI and Data Engineering Consultant
SCOR – Paris
- Constructed and deployed an end-to-end NLP and LLM-based system to extract structured business entities from emails, combining NER, domain-specific embeddings, custom matching, and production deployment via Python, Flask, SQL, Docker, and CI/CD.
- Built a production-grade portfolio risk management platform covering contracts, policies, and claims. Developed Python/Flask microservices, optimized the back end, delivered secure REST APIs, and implemented Dockerized CI/CD with SQL and MongoDB in an Agile environment.
- Migrated data from Oracle to SQL Server, managing mapping, processing, cleaning, and granularity differences with high accuracy. Improved data management and system performance using Python, SQLAlchemy, and data processing libraries.
Experience
DMS: Automated Feature Extraction from Company Email Data Using NER and Custom AI Algorithms
The project combined classical NLP, transfer learning, and custom matching algorithms to reliably map extracted entities to the company's internal records. I implemented a spaCy-based NER model with transfer learning, engineered a robust feature-matching engine utilizing distance calculation techniques, and iteratively fine-tuned the system to significantly enhance accuracy.
The final solution enabled automated extraction of structured business features, reducing manual review time and improving operational efficiency.
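As an illustration of the distance-based matching step described above, here is a minimal sketch. It assumes Levenshtein edit distance with a normalized cutoff; the function names, the `max_ratio` value, and the sample record names are hypothetical and are not taken from the project itself.

```python
from typing import Iterable, Optional


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (dynamic programming, two rows)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string on the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution
            ))
        previous = current
    return previous[-1]


def best_match(entity: str, records: Iterable[str],
               max_ratio: float = 0.3) -> Optional[str]:
    """Return the closest record, or None if nothing is close enough.

    max_ratio is a hypothetical cutoff on distance / longest length.
    """
    normalized = entity.strip().lower()
    best, best_score = None, None
    for record in records:
        candidate = record.strip().lower()
        longest = max(len(normalized), len(candidate)) or 1
        score = levenshtein(normalized, candidate) / longest
        if best_score is None or score < best_score:
            best, best_score = record, score
    if best_score is not None and best_score <= max_ratio:
        return best
    return None


# Hypothetical internal records and an extracted entity with a typo.
internal_records = ["Acme Reinsurance", "Globex Holdings"]
matched = best_match("Acme Reinsurnce", internal_records)
```

Normalizing by the longer string's length keeps a single cutoff usable for both short and long names; a production matcher would likely combine several signals rather than rely on one distance.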
DCM: Location-based Insurance Pricing Platform with Geocoding Intelligence
I integrated advanced geocoding and reverse-geocoding services to accurately map incident and peril locations, enabling the system to automatically determine geographical coordinates and associated risks.
Using these insights, I designed and implemented a location-based pricing algorithm that enables insurers to dynamically adjust premiums based on spatial risk factors.
Additionally, I ensured data security, integrity, and compliance across the platform while collaborating closely with cross-functional teams (data scientists, underwriters, back-end engineers) to align technical decisions with business needs.
The result was a robust, scalable underwriting tool that improved pricing accuracy and operational efficiency for insurance workflows.
FW Data Migration: Oracle-to-DCM Data Migration and System Integration
I managed the entire implementation lifecycle, encompassing data mapping, processing, transformation, and data cleaning. A key challenge involved aligning different levels of data granularity between systems; I engineered efficient mapping and processing logic to ensure accurate, consistent records after migration.
Following the migration, I validated data integrity end-to-end and optimized internal data management workflows, resulting in improved system performance and reduced manual overhead.
This project served as a critical foundation for enabling modernized insurance workflows on the DCM platform.
Automated Reporting and Dashboard System
I designed a complete data flow pipeline starting from source identification and process mapping across all client dimensions. Using Python, I automated data extraction, transformation, and loading into structured worksheets, ensuring accuracy and consistency.
On top of this foundation, I built interactive dashboards in Google Looker Studio, supporting real-time updates and scheduled monthly refreshes. This solution dramatically improved reporting speed, visibility, and decision-making across operational departments.
RPA-driven Automation for Internal Data Flow and Workflow Optimization
I built automation pipelines using Python, Selenium, Beautiful Soup, Pandas, and NumPy, which enabled the efficient extraction, processing, and storage of multidimensional operational data. The RPA flows also automated intricate UI navigation tasks, eliminating manual steps and ensuring consistent, error-free data handling.
Additionally, I created a unified data structure to simplify the retrieval and integration of generated sheets across departments, improving data accessibility and significantly reducing manual workload. This solution enhanced data reliability and accelerated insight generation for decision-makers.
Cloud-native Transversal Data Platform for Large-scale Retail Operations
I architected and implemented end-to-end data ingestion, transformation, validation, and provisioning workflows, ensuring that enterprise-wide shared data assets were consistently available, reliable, and optimized for downstream analytics and applications.
To ensure platform stability and compliance, I implemented automated refresh monitoring and developed data quality and validation controls. I created robust reconciliation logic to detect anomalies and ensure SLA adherence at scale.
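A reconciliation check of the kind mentioned above could look roughly like the following sketch. It assumes per-key totals have already been computed for a source and a target dataset; the `reconcile` function, the `Discrepancy` type, the tolerance, and the store keys are all hypothetical and only illustrate the idea.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Discrepancy:
    key: str
    source_total: Optional[float]  # None when the key is missing in source
    target_total: Optional[float]  # None when the key is missing in target


def reconcile(source: Dict[str, float], target: Dict[str, float],
              tolerance: float = 0.01) -> List[Discrepancy]:
    """Flag keys missing on either side or whose totals differ by more than tolerance."""
    issues = []
    for key in sorted(source.keys() | target.keys()):
        s = source.get(key)
        t = target.get(key)
        if s is None or t is None or abs(s - t) > tolerance:
            issues.append(Discrepancy(key, s, t))
    return issues


# Hypothetical per-store totals before and after a pipeline run.
source_totals = {"store_1": 100.0, "store_2": 250.5, "store_3": 75.0}
target_totals = {"store_1": 100.0, "store_2": 249.0, "store_4": 10.0}

issues = reconcile(source_totals, target_totals)
flagged_keys = [d.key for d in issues]
```

Returning the discrepancies as data, rather than raising on the first mismatch, lets a monitoring job report every anomaly in one pass.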
I also collaborated with cross-functional teams, including data engineers, analysts, and governance teams, to define reusable data assets and enforce enterprise data standards across the organization.
Education
Progress Toward a PhD in Computer Science and Data Science
School of Information Sciences - Rabat, Morocco
Master's Degree in Knowledge Engineering
School of Information Sciences - Rabat, Morocco
Certifications
Advanced Deployment
dbt Labs
dbt Fundamentals (dbt Studio)
dbt Labs
Security Operations Center (SOC)
Cisco Learning and Certifications | via Coursera
Introduction to Deep Learning & Neural Networks with Keras
IBM | via Coursera
Neural Networks and Deep Learning
DeepLearning.AI | via Coursera
Machine Learning with Python
IBM | via Coursera
Machine Learning
Stanford University | via Coursera
Skills
Libraries/APIs
REST APIs, Node.js, React, Shopify API, Playwright, Claude API, Puppeteer, Google Geocoding API, SQLAlchemy, Beautiful Soup, Pandas, NumPy, WhatsApp API
Tools
ChatGPT, n8n, Claude, Claude Code, PyCharm, BigQuery, Apache Airflow, Odoo
Languages
Python, SQL, JavaScript, TypeScript, Java
Frameworks
Selenium, Shopify Hydrogen, NestJS, Agentic Frameworks, LangGraph
Paradigms
ETL, Rule-based Programming, Best Practices, Microservices, Testing, Anomaly Detection, Automation, Agile, Scrum
Platforms
Google Cloud Platform (GCP), Amazon Web Services (AWS), Shopify, Vercel, DigitalOcean, Vertex AI, Linux, Anaconda
Storage
PostgreSQL, Redis, Databases, Database Management Systems (DBMS), Data Pipelines, Data Validation, JSON, MySQL
Industry Expertise
Cybersecurity
Other
Data Engineering, Cloud Computing, Programming, Algorithms, Machine Learning, Information Science, Large Language Models (LLMs), APIs, Robotic Process Automation (RPA), Data Scraping, Natural Language Processing (NLP), Artificial Intelligence (AI), Generative Artificial Intelligence (GenAI), Data Migration, Data Transformation, Data Extraction, API Integration, Optical Character Recognition (OCR), Web Scraping, OpenAI, Web Crawlers, Dropshipping, Custom Shopify Apps, Document Parsing, Email Parsing, FastAPI, Full-stack Development, Browser Automation, Scraping, Large-scale Web Crawlers, Data Enrichment, Lead Generation, Dashboards, RSS Feeds, Shopify SEO, AI Tools, AI Agents, Agentic AI, Cursor AI, OAuth, Webhooks, Supabase, Application Performance Optimization, Messaging Patterns, Performance, AI Integration, Third-party APIs, Third-party Integration, Software Architecture, System Architecture, Code Review, Project Review, Technical Writing, Agentic RAG Systems, ETL Tools, Data Processing Automation, Website Data Scraping, ETL Pipelines, Web Development, Monday.com, Prompt Engineering, RAG Systems, Anthropic, Solution Architecture, System Design, Benchmarking, Hyperparameter Tuning, Vector Databases, OpenAI GPT-4 API, AI Automation, AI Programming, AI Design, Data Management, Cloud Infrastructure, AI Assistants, Bots, Data Science, LangChain, Instagram, Finance, Big Data, Computer Science, Deep Learning, Research, Data Manipulation, Cloud, Data Build Tool (dbt), Pipelines, Data Quality, Data Governance, CI/CD Pipelines, Deployment, Analysis, Materialization, Modularity, Supervised Learning, Regression, Statistical Analysis, Feature Engineering, Machine Learning Algorithms, Artificial Neural Networks (ANN), Computer Vision, Network Architecture, Calculus, Linear Algebra, Data Mining, Scalability, Dimensionality Reduction, Predictive Modeling, Security Management, Network Monitoring, Threat Modeling, Security Information and Event Management (SIEM), Threat Detection and Response (TDR), Data Cleaning, Data Preprocessing, Fine-tuning, Transfer Learning, Back-end, Cross-functional Collaboration, Web Applications, Pricing Models, Data Mapping, Handling Granularity Differences, Automated ETL, Reporting Dashboard Development, Looker Studio, Operational Analytics, Cross-Department Data Analysis, Workflow Automation, Web App Automation, Process Optimization, Cloud-Native Data Pipelines, ETL/ELT Workflow Architecture, Automated Monitoring & Reconciliation, Scalable Data Engineering, Cloud Services, Fintech, RESTful APIs, Architecture, QA Testing, MVP Design, Knowledge Graphs