Data Engineer
2021 - 2021
JLL
- Developed data pipelines that take AI-generated labels of images from Labelbox and export them into the Google Cloud Platform.
- Evaluated several options, including Google Cloud Dataflow and Cloud Functions, for end-to-end production deployment.
- Reduced data pipeline time from more than 10 minutes to a couple of minutes by integrating Labelbox and Google Cloud Storage.
Technologies: Google Cloud Platform (GCP), Labelbox, Python, Data Pipelines, Google Cloud Storage, Google Cloud Dataproc, Google Cloud Dataflow, Google Cloud Functions

Lead Data Engineer
2019 - 2021
Vezeeta.com
- Provided leadership from concept to production to design, implement, and evolve a raw data lake, data catalogs, a DWH, integration of data science use cases, ETL pipelines for batch and streaming data from more than 20 data sources, and a set of dashboards.
- Designed logical and physical data models for the DWH to power self-service BI.
- Provided engineering leadership to design, implement, and scale batch and streaming data ingestion from many internal and external data sources.
Technologies: Data Engineering, Databases, AWS, AWS Glue, AWS S3, AWS Athena, Python, PySpark, Redshift, Docker, Kubernetes, Apache Airflow, Prometheus, Tableau, SQL, Shell

Head of Data Science
2018 - 2019
Surface Mobility Consultants
- Started and led a team of data scientists, data engineers, and business analysts to work on a transportation and traffic big data and data science project.
- Led the team to deliver 17 data science use cases that involved substantial data engineering, especially in geospatial data processing.
- Developed a custom MicroStrategy visualization component to display advanced geospatial data.
Technologies: Adobe Spark, Apache Hive, Impala, Python, SQL, Geospatial Data, Geospatial Analytics, MicroStrategy, Data Engineering, Data Science, Informatica

Lead Data Engineer
2017 - 2018
PegB Tech
- Developed the data platform architecture for an enterprise data repository and for supporting data science.
- Developed a Kafka-based streaming pipeline that supported processing 1,000 transactions per second.
- Migrated large volumes of legacy data from a MySQL database into HDFS and Cloudera to kickstart Spark-based data analytics.
Technologies: Couchbase, Elasticsearch, Apache Kafka, HDFS, HP Vertica, SQL, Scala, Docker

Data Warehouse Engineer
2017 - 2017
QExpress
- Designed a logical and physical data model of a data warehouse optimized for AWS Redshift.
- Redesigned existing ETL packages to make the ETL jobs more fault-tolerant and better optimized.
- Developed a set of MicroStrategy dashboards and reports for management and operation teams.
Technologies: AWS Glue, AWS Redshift, MicroStrategy, SQL

Data Warehouse Engineer
2011 - 2016
DesigNET
- Re-designed data export and load as part of ETL packages.
- Developed a data warehouse model and ETL package to source data from around seven operational data sources.
- Worked with a multi-agency team to improve the customer onboarding program, reducing onboarding time by about 30%.
Technologies: SQL, PostgreSQL, Business Intelligence (BI), BIRT, Java

Freelance DWH and BI Consultant
2010 - 2011
Self Employed
- Worked on business development for my freelance consulting, generating three customer engagements, one of which turned into a long-term job.
- Developed a MicroStrategy-based dashboard for the office of the CFO of a major bank in the UAE.
- Developed a reporting DB and set of reports for a warehouse based out of Wisconsin, USA.
Technologies: SQL, MySQL, Pentaho Data Integration (Kettle), Oracle, PostgreSQL, Java, MicroStrategy, BIRT

Professional Services Consultant
2006 - 2010
Teradata
- Led a team of BI developers to implement BI schema, reports, and dashboards for a leading telecom operator in the country.
- Developed a dashboard for the office of the CEO to re-engage customers on a DWH project.
- Trained internal resources on BI and DWH. Participated in logical and physical data modeling for the enterprise DWH.
Technologies: Teradata, SQL, MicroStrategy, Data Warehouse Design, Business Intelligence (BI)