Sagar Sharma

Software Developer in Toronto, ON, Canada

Member since July 11, 2016
Sagar is a seasoned data professional with more than ten years of work experience with relational databases and three years with big data—specializing in designing and scaling data systems and processes. He is a hardworking individual with a constant desire to learn new things and to make a positive impact on the organization. Sagar possesses excellent communication skills and is a motivated team player with the ability to work independently.








Preferred Environment

IntelliJ IDEA, Sublime Text, Linux, macOS

The most amazing...

...project was building a data lake with Hadoop—this included a Hadoop ecosystem installation from scratch and building data pipelines to move data into the lake.

Experience

  • Senior Data Engineer

    2019 - 2021
    Curvo Labs
    • Implemented an Amazon Redshift data warehouse from scratch to serve as a reporting database.
    • Designed ETL jobs in AWS Data Pipeline to move data from the production database to Redshift.
    • Built data pipelines using Python to do data transformations and web scraping using BeautifulSoup and Selenium WebDriver.
    • Implemented an Airflow instance from scratch to orchestrate data pipelines and later moved this to AWS MWAA when it became available.
    • Implemented AWS QuickSight as the primary reporting tool and designed reports which were later embedded into a web application.
    • Built data pipelines using Spark and Scala for distributed data processing and transformation and deployed them in AWS Glue.
    • Implemented a portion of a web application that embedded AWS QuickSight reports. The tech stack used Node.js, React, TypeScript, GraphQL, Ant Design, and Jest for testing. Features included user authentication, data access, and dashboard embedding.
    Technologies: Apache Airflow, Amazon Web Services (AWS), Python 3, Web Scraping, Scala, Spark, Spark SQL, Pandas, Amazon Virtual Private Cloud (VPC), TypeScript, GraphQL, Elasticsearch, NestJS, React, Beautiful Soup, Selenium WebDriver, Jest
  • Senior Data Engineer

    2017 - 2020
    Colorescience
    • Designed, developed, and maintained a reporting data warehouse built using PostgreSQL (AWS RDS).
    • Built data pipelines to move data from a production system and third-party systems to a centralized data warehouse.
    • Connected to third-party APIs, e.g., Salesforce, Sailthru, and CrowdTwist, to import data on an incremental basis.
    • Managed resources on AWS.
    • Created reports and dashboards in Looker to provide insights on the data.
    Technologies: Salesforce, MySQL, PostgreSQL, Python, AWS, Data Pipelines
  • Senior Data Engineer

    2019 - 2019
    Curvo Labs
    • Built a new data pipeline framework orchestrated in Airflow, with Airflow itself set up in Docker.
    • Wrote new data pipelines in Python and scheduled them in Airflow.
    • Performed a variety of on-demand tasks using Apache Spark and Scala and deployed them in AWS Glue.
    • Established Redshift as a centralized data warehouse and moved the data to Redshift from S3, production systems, and third-party applications.
    • Set up Mode to create enterprise reports from the data moved to Redshift.
    Technologies: Web Scraping, AWS Glue, Docker, AWS S3, Redshift, Apache Airflow, Spark ML, Scala, Apache Spark, Python, AWS, Data Pipelines
  • Senior Business Intelligence Engineer

    2016 - 2018
    Altus Group Limited
    • Built a reporting data warehouse using Pentaho, PostgreSQL, and Informatica.
    • Designed a database schema in PostgreSQL to represent the reporting use case.
    • Created ETL tasks in Informatica to move data from the production systems into PostgreSQL.
    • Built reports and dashboards using Pentaho Report Designer and deployed them to the BI server.
    Technologies: Reporting, Pentaho, Informatica, Microsoft SQL Server, PostgreSQL, Data Pipelines
  • Data Engineer

    2014 - 2016
    Wave Accounting, Inc.
    • Designed, developed, and maintained big data and business intelligence solutions at Wave.
    • Designed and scheduled complex ETL workflows and jobs using Pentaho Data Integration (Kettle) to load data into the data systems.
    • Wrote custom Python scripts to access third-party APIs and download data into the data systems.
    • Developed complex SQL queries, including joins, subqueries, and common table expressions, to address ad hoc business analytics and other requirements.
    • Coordinated with the product and executive teams to gather and understand business requirements.
    • Built an end-to-end relational data warehouse—including infrastructure, schema design, optimization, and administration.
    • Designed and developed a Hadoop cluster using Hortonworks HDP 2.0. Tasks included installing and configuring the Hadoop ecosystem and designing the HDFS.
    • Designed and scheduled Sqoop jobs to load data into the HDFS from the production systems.
    Technologies: Pentaho, Ansible, Sqoop, Apache Hive, MySQL, Hadoop, PostgreSQL, Sisense, Microsoft SQL Server, Python, Data Pipelines
  • Business Intelligence Developer

    2011 - 2014
    Eyereturn Marketing, Inc.
    • Designed real-time reporting solutions using SQL Server (SSIS, SSAS, and SSRS) and Pentaho business intelligence tools (MySQL, Mondrian, and Pentaho).
    • Created custom automated/scheduled reports using Eclipse BIRT and Pentaho Report Designer.
    • Built custom ETL tasks to transform data for custom reports using Kettle (Pentaho Data Integration).
    • Designed and optimized database schemas to make reporting faster and more efficient.
    • Created, maintained, and scheduled custom data processors to pull and manipulate data from HDFS using Pig, Sqoop, and Oozie (Cloudera Hadoop).
    Technologies: MySQL, Sqoop, Apache Pig, Hadoop, Apache Hive, Pentaho, SSAS, SQL Server Integration Services (SSIS), Microsoft SQL Server
  • Database Analyst

    2010 - 2011
    George Brown College
    • Administered the organization's database using Blackbaud’s Raiser’s Edge.
    • Updated and maintained the alumni database using Microsoft SQL Server.
    • Conducted data validation and verification to ensure the accuracy and quality of the data.
    • Wrote complex queries to produce reports and provide information for divisional and marketing purposes.
    • Provided support to the project managers.
    Technologies: Raiser's Edge, Microsoft SQL Server
  • Software Engineer

    2007 - 2009
    Tata Consultancy Services
    • Provided post-implementation support and training for an enterprise-level banking application (TCS B@ncs) to 25,000+ corporate end-users.
    • Handled different modules of the banking operations such as routine banking, loans and mortgages, capital markets, and foreign exchange.
    • Analyzed client business needs and translated them into functional/operational requirements.
    • Communicated with a variety of stakeholders, including subject matter experts, business units, development teams, and support teams, to establish a technical vision.
    Technologies: Oracle, SQL, HTML, Java
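Several of the roles above mention importing third-party data on an incremental basis. A minimal sketch of that watermark pattern in plain Python, where the record shape and the `fake_api` helper are illustrative stand-ins rather than anything from the actual systems:

```python
def incremental_import(fetch_page, watermark):
    """Pull only records updated after `watermark`; return them plus the new watermark.

    `fetch_page` stands in for a third-party API call that accepts a `since`
    timestamp and returns dicts with an `updated_at` field (ISO 8601 strings,
    which compare correctly as plain strings).
    """
    records = fetch_page(since=watermark)
    if not records:
        return [], watermark
    # Advance the watermark to the newest record seen, so the next run
    # fetches only what changed after this one.
    new_watermark = max(r["updated_at"] for r in records)
    return records, new_watermark


# Usage with a fake API: only rows newer than the stored watermark come back.
data = [
    {"id": 1, "updated_at": "2021-01-01T00:00:00Z"},
    {"id": 2, "updated_at": "2021-02-01T00:00:00Z"},
]

def fake_api(since):
    return [r for r in data if r["updated_at"] > since]

rows, wm = incremental_import(fake_api, "2021-01-15T00:00:00Z")
```

Persisting the returned watermark between runs (for example, in a small state table) is what keeps each import incremental instead of a full reload.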

Projects

  • Data Lake Using Hadoop (Hortonworks HDP 2.0)

    I built a data lake using Hadoop.

    My Tasks:
    • Installed and configured Hadoop ecosystem components on the Rackspace Cloud Big Data platform.
    • Automated the above process using Ansible.
    • Designed and scheduled Sqoop jobs to import data from MySQL and PostgreSQL production tables into HDFS.
    • Set up Hive tables from HDFS files to enable SQL-like querying on HDFS data.

  • Sisense Rebuild Project

    I built a data warehouse for a global animal health company. The goal was to help business users by giving them a single source of truth about their mobile application.

    My Tasks:
    • Redesigned the data model in Sisense ElastiCube. I used a snowflake schema to establish database relationships.
    • Created a build schedule for the data model.
    • Created dashboards, based on the client's requirements.
    • Set up email schedules for the dashboards to relevant stakeholders.

  • Redshift | Snowflake Migration

    I helped a client migrate from Redshift to Snowflake; Redshift was becoming expensive and its cluster resources were limited. We set up Snowflake with S3 as the storage layer and Snowflake as the compute engine.
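The extract-transform-load pattern running through these projects can be sketched in miniature with Python's built-in sqlite3 standing in for both the production database and the warehouse; the table and column names here are hypothetical, not taken from any of the actual systems:

```python
import sqlite3

# Stand-ins for the production database and the reporting warehouse.
prod = sqlite3.connect(":memory:")
warehouse = sqlite3.connect(":memory:")

prod.execute("CREATE TABLE orders (id INTEGER, amount_cents INTEGER)")
prod.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, 1250), (2, 399), (3, 10000)])

warehouse.execute("CREATE TABLE fact_orders (id INTEGER, amount_dollars REAL)")

# Extract from production, transform (cents -> dollars), load into the warehouse.
rows = prod.execute("SELECT id, amount_cents FROM orders").fetchall()
transformed = [(order_id, cents / 100.0) for order_id, cents in rows]
warehouse.executemany("INSERT INTO fact_orders VALUES (?, ?)", transformed)

total = warehouse.execute(
    "SELECT SUM(amount_dollars) FROM fact_orders").fetchone()[0]
```

In the real pipelines, the extract and load steps would target Redshift, Snowflake, or PostgreSQL connections and run under an orchestrator such as Airflow, but the shape of the job is the same.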

Skills

  • Languages

    SQL, Python, CSS, HTML, Scala, Transact-SQL, Snowflake, JavaScript, Java, SAS, Python 3, TypeScript, GraphQL
  • Tools

    Looker, Pentaho Data Integration (Kettle), Informatica ETL, Pentaho Mondrian OLAP Engine, Sisense, Apache Sqoop, AWS Glue, Apache Airflow, Sublime Text, IntelliJ IDEA, Sqoop, Ansible, SSAS, Spark SQL, Amazon Virtual Private Cloud (VPC)
  • Paradigms

    ETL Implementation & Design, ETL, RESTful Development, MapReduce, Business Intelligence (BI)
  • Storage

    SQL Server 2008, Databases, SQL Server 2012, SQL Server 2014, SQL Server Integration Services (SSIS), HDFS, SQL Server Analysis Services (SSAS), Apache Hive, PostgreSQL, MySQL, Redshift, Data Pipelines, Microsoft SQL Server, AWS S3, MongoDB, Elasticsearch
  • Other

    ETL Tools, ETL Development, Pentaho Reports, Data Build Tool (DBT), Data Warehouse Design, RESTful APIs, Semantic UI, Data Analysis, Pentaho Dashboard, Informatica, Data Engineering, AWS, Data Warehousing, Reporting, Raiser's Edge, Web Scraping, NestJS
  • Frameworks

    Express.js, Bootstrap 3+, Materialize CSS, Foundation CSS, Flask, Hadoop, Spark, Apache Spark, AWS EMR, Jest
  • Libraries/APIs

    REST APIs, NumPy, Pandas, DreamFactory, React, Passport.js, Node.js, PySpark, Spark ML, Beautiful Soup, Selenium WebDriver
  • Platforms

    AWS EC2, Windows, macOS, Linux, Apache Pig, Apache Kafka, Amazon Web Services (AWS), AWS Lambda, Pentaho, Oracle, Salesforce, Docker

Education

  • Certificate in SAS Certified Base Programmer
    2011 - 2011
    SAS Institute Canada - Toronto, Canada
  • Postgraduate certificate in Strategic Relationship Marketing
    2010 - 2010
    George Brown College - Toronto, Canada
  • Bachelor's degree in Mechanical Engineering
    2003 - 2007
    YMCA University of Science and Technology - Faridabad, India
