Data Engineer | 2021 - 2021 | JLL
Technologies: Google Cloud Platform (GCP), Labelbox, Python, Data Pipelines, Google Cloud Storage, Google Cloud Dataproc, Google Cloud Dataflow, Google Cloud Functions
- Developed data pipelines that take AI-generated image labels from Labelbox and export them to Google Cloud Platform.
- Evaluated several options, including Google Cloud Dataflow and Cloud Functions, for end-to-end production deployment.
- Reduced pipeline run time from more than 10 minutes to a couple of minutes by integrating Labelbox and Google Cloud Storage.
Lead Data Engineer | 2019 - 2021 | Vezeeta.com
Technologies: Data Engineering, Databases, AWS, AWS Glue, AWS S3, AWS Athena, Python, PySpark, Redshift, Docker, Kubernetes, Apache Airflow, Prometheus, Tableau, SQL, Shell
- Led the design, implementation, and evolution, from concept to production, of a raw data lake, data catalogs, a DWH, data science use case integration, ETL pipelines for batch and streaming data from more than 20 data sources, and a set of dashboards.
- Designed logical and physical data models for the DWH to power self-service BI.
- Provided engineering leadership to design, implement, and scale batch and streaming data ingestion from many internal and external data sources.
Head of Data Science | 2018 - 2019 | Surface Mobility Consultants
Technologies: Adobe Spark, Apache Hive, Impala, Python, SQL, Geospatial Data, Geospatial Analytics, MicroStrategy, Data Engineering, Data Science, Informatica
- Started and led a team of data scientists, data engineers, and business analysts on a big data and data science project for transportation and traffic.
- Led the team to deliver 17 data science use cases that involved substantial data engineering, especially geospatial data processing.
- Developed a custom MicroStrategy visualization component to display advanced geospatial data.
Lead Data Engineer | 2017 - 2018 | PegB Tech
Technologies: Couchbase, Elasticsearch, Apache Kafka, HDFS, HP Vertica, SQL, Scala, Docker
- Developed the data platform architecture for an enterprise data repository and for supporting data science.
- Developed a Kafka-based streaming pipeline that processed 1,000 transactions per second.
- Migrated large volumes of legacy data from a MySQL database into HDFS and Cloudera to kick-start Spark-based data analytics.
Data Warehouse Engineer | 2017 - 2017 | QExpress
Technologies: AWS Glue, AWS Redshift, MicroStrategy, SQL
- Designed the logical and physical data models of a data warehouse optimized for AWS Redshift.
- Redesigned existing ETL packages to make ETL jobs more fault-tolerant and better optimized.
- Developed a set of MicroStrategy dashboards and reports for management and operation teams.
Data Warehouse Engineer | 2011 - 2016 | DesigNET
Technologies: SQL, PostgreSQL, Business Intelligence (BI), BIRT, Java
- Redesigned data export and load steps in the ETL packages.
- Developed a data warehouse model and ETL package to source data from around seven operational data sources.
- Worked with a multi-agency team to improve the customer onboarding program, reducing onboarding time by about 30%.
Freelance DWH and BI Consultant | 2010 - 2011 | Self-Employed
Technologies: SQL, MySQL, Pentaho Data Integration (Kettle), Oracle, PostgreSQL, Java, MicroStrategy, BIRT
- Handled business development for my freelance consulting practice, generating three customer engagements, one of which turned into a long-term job.
- Developed a MicroStrategy-based dashboard for the office of the CFO of a major bank in the UAE.
- Developed a reporting database and a set of reports for a warehouse based in Wisconsin, USA.
Professional Services Consultant | 2006 - 2010 | Teradata
Technologies: Teradata, SQL, MicroStrategy, Data Warehouse Design, Business Intelligence (BI)
- Led a team of BI developers to implement BI schema, reports, and dashboards for a leading telecom operator in the country.
- Developed a dashboard for the office of the CEO to re-engage customers on a DWH project.
- Trained internal staff on BI and DWH, and participated in logical and physical data modeling for the enterprise DWH.