Yuriy Margulis, Data Warehousing Developer in Los Angeles, CA, United States

Member since June 19, 2016
Yuriy is a data specialist with over 15 years of experience in data warehousing, data engineering, big data, and business intelligence. He has worked on five large data warehouses for leading internet, media, and entertainment companies. He has also worked as a hands-on data engineer and architect, ETL developer, and database administrator, and has provided operational support and SLA compliance.

Experience

  • Oracle, 20 years
  • Data Warehousing, 18 years
  • Leadership, 18 years
  • ETL, 18 years
  • Business Intelligence (BI), 16 years
  • Technology Strategy & Architecture, 15 years
  • Data Architecture, 15 years
  • Big Data, 4 years

Location

Los Angeles, CA, United States

Availability

Part-time

Preferred Environment

Oracle, Redshift, Spark, Hadoop, PostgreSQL, MSSQL

The most amazing...

...I've done was grow the PriceGrabber data warehouse from 5 to 17 subjects through many platform changes, writing a large amount of SQL and other scripting code along the way.

Employment

  • Freelance Data Engineer

    2018 - PRESENT
    BCG GAMMA (via Toptal, Three Contracts)
    • Provided engineering support for data scientists.
    • Designed and built a feature engineering data mart and a customer 360-degree data lake in AWS S3.
    • Designed and developed a dynamic S3-to-S3 ETL system in Spark and Hive.
    • Completed various DevOps tasks, including an Airflow installation, development of Ansible playbooks, and history backloads.
    • Worked on a feature engineering project which involved Hortonworks, Spark, Python, Hive, and Airflow.
    • Built a one-on-one marketing feature engineering pipeline in PySpark on Microsoft Azure and Databricks (used ADF, ADL, Databricks Delta Lake, and ADW as a source).
    Technologies: Python, Spark, Hive, Presto, Athena, Glue, RDS, PostgreSQL, Airflow, Boto 3 API, Ansible
  • Consultant | Co-founder | CEO

    2016 - PRESENT
    Crowd Consulting
    • Worked on full data warehouse implementations for multiple clients.
    • Provided big data training and support.
    • Engineered and built an ETL pipeline, written in Python, for an AWS S3 data warehouse using AWS Kinesis, Lambda, Hive, Presto, and Spark.
    Technologies: AWS EMR, Hadoop, Spark, Presto, Hive, AWS Lambda, AWS Redshift, AWS RDS: Postgres, MySQL, DynamoDB, AWS S3, Python, Scala, Luigi, Tableau
  • Vice President of Data

    2017 - 2018
    Enervee
    • Managed the data engineering, BI reporting, and data science teams.
    • Worked as a hands-on data engineer.
    • Built a data lake on AWS.
    • Developed a reporting system with Redash/Presto.
    Technologies: AWS EMR: Hadoop, Spark, Presto, Hive, AWS Redshift, AWS RDS: PostgreSQL, MySQL, AuroraDB, AWS S3, Python, Airflow, Redash
  • Big Data Architect

    2016 - 2017
    ITG
    • Worked in a full-time position as a data architect for a transaction cost analysis system.
    • Installed a four-node Apache Hadoop/Spark cluster on ITG's private cloud.
    • Conducted a platform POC embedding Apache Spark technology into ITG's data platform.
    • Supported the development of a platform POC for Kx Kdb+ and converted Sybase IQ queries to the Kdb+ Q language.
    Technologies: Apache Hadoop, Hive, Spark, Python, Sybase ASE, Sybase IQ, Informatica, Kdb+, Q
  • Freelance Data Engineer and DBA

    2016 - 2017
    American Taekwondo Association (via Toptal)
    • Converted data from a legacy Oracle database to a newly designed SQL Server database.
    • Wrote SQL scripts, stored procedures, and Kettle transformations.
    • Administered two databases.
    • Performed extensive data cleansing and validation.
    Technologies: MS SQL Server, Oracle, Pentaho (Kettle)
  • Director, Data Warehouse

    2015 - 2016
    Connexity
    • Managed two data warehouses and BI teams for both PriceGrabber and Shopzilla. Connexity is also known as PriceGrabber, Shopzilla, and BizRate.
    • Handled operational support for the PriceGrabber data warehouse. Recovered the data warehouse after the data center migration.
    • Merged one data warehouse into the other and retired the source warehouse. Designed the business and data integration architecture hands-on; developed data validation scripts and ETL integration code. Managed the transfer of a BI reporting system from Cognos to OBIEE and Tableau.
    • Defined the technology platform change strategy for the combined data warehouse.
    • Created SQL and PL/SQL stored procedures, packages, and anonymous scripts for ETL and data validation.
    • Completed an Amazon Redshift project.
    • Worked on and completed a Cloudera Impala project.
    Technologies: Oracle, PL/SQL, AWS Redshift, Hadoop, Impala, Cognos, OBIEE, Tableau, Perl, Python, Linux
  • Director, Data Warehouse

    2008 - 2015
    PriceGrabber
    • Oversaw the company's data services and defined the overall and technical strategy for data warehousing, business intelligence, and big data environments.
    • Hired and managed a mixed on-shore (US)/off-shore (India) engineering team.
    • Replatformed a data warehouse to an Oracle Exadata X3/Oracle ZFS combination and added big data and machine learning components to the data warehousing environment.
    • Supported 24x7x365 operations in compliance with the company's top-level production SLA.
    • Wrote thousands of lines of PL/SQL, PL/pgSQL, MySQL, and HiveQL code.
    • Wrote ETL scripts in Perl and Python, and in JavaScript inside Kettle.
    • Worked with big data on multiple types of projects (Hadoop, Pig, Hive, and Mahout).
    • Developed a tool-based ETL for a Pentaho (Kettle) CE ETL redesign project.
    • Worked on machine learning for various types of projects (Python, SciPy, NumPy, and Pandas).
    Technologies: Oracle, Hadoop, Pig, Hive, PostgreSQL, MySQL, Perl, Python, Pentaho (Kettle), Linux
  • Director, Data Warehouse

    2007 - 2008
    Edmunds
    • Managed a data warehouse team and project pipeline; supported operations.
    • Created PL/SQL stored procedures, packages, and anonymous scripts for ETL and data validation.
    • Worked on a tool-based ETL for multiple Informatica projects.
    Technologies: Oracle, Informatica, Perl, Linux
  • Manager, Data Warehouse

    2003 - 2007
    Universal Music Group
    • Managed, developed, and operated a CRM data warehouse.
    • Wrote PL/SQL, MySQL, and Perl code.
    • Administered a Cognos reporting system.
    • Wrote C# for multiple projects supporting the OLAP reporting system.
    • Designed and developed an MSAS OLAP cube system.
    Technologies: Oracle, SQL Server, MySQL, Cognos, C#, Perl, Linux
  • Director, Decision Support and Financial Systems

    2001 - 2003
    MediaLive International
    • Managed a data warehouse, BI, and CRM systems.
    • Assumed responsibility for an Oracle EBS application team.
    • Wrote the PL/SQL code for data warehouse ETL and Oracle Application integration.
    • Worked with SQL Server on multiple Transact-SQL and Analysis Services projects.
    • Worked on a tool-based ETL for multiple Epiphany EPI*Channel projects.
    Technologies: Oracle, Oracle EBS, SQL Server, VB, Epiphany, Unix
  • Senior Principal Consultant (Professional Services, Essbase Practice)

    1999 - 2001
    Hyperion (Currently: Oracle)
    • Led a consulting practice covering multiple clients.
    • Developed Essbase satellite systems: relational data warehouses and data marts, reporting systems, ETL systems, CRMs, and EPPs, plus ETL into, out of, and within Essbase.
    • Worked on multiple PL/SQL projects, providing full support for the team's Oracle project pipeline.
    • Helped develop on SQL Server for multiple Transact-SQL and Analysis Services projects.
    • Developed a tool-based ETL for an Informatica project.
    • Worked on Hyperion Essbase, Enterprise, Pillar, Planning, financial analyzer, and VBA projects.
    Technologies: Oracle, SQL Server, Hyperion Essbase, VBA, Informatica

Skills

  • Languages

    Python, SQL, PL/pgSQL, Transact-SQL, C#, Perl
  • Frameworks

    Apache Spark, Hadoop
  • Tools

    Pentaho Data Integration (Kettle), Apache Airflow, Informatica PowerCenter, Talend ETL
  • Paradigms

    ETL, Business Intelligence (BI), Management, Database Design
  • Platforms

    Oracle, Apache Pig
  • Storage

    PostgreSQL, Apache Hive, Databases, Oracle PL/SQL, Microsoft SQL Server, Cassandra, Essbase, MySQL
  • Other

    Data Warehousing, Data Architecture, Leadership, Team Mentoring, Technology Strategy & Architecture, Mixed Reality, Software Development, Big Data, perlpod, Unix Shell Scripting, MSAS, Cognos 10

Education

  • Certificate of Completion in Data Science and Engineering with Apache Spark
    2016 - 2016
    UC BerkeleyX (Online Courses from Berkeley) - Berkeley, California (USA)
  • Certificate of Completion in Cloudera Developer Training for Apache Hadoop
    2012 - 2012
    Cloudera University - New York, New York (USA)
  • Certificate of Completion in Oracle Database Administration
    1995 - 1995
    UCI Extension - Irvine, California (USA)
  • Diploma (Master of Science equivalent) degree in Applied Mathematics
    1975 - 1980
    Odessa I.I. Mechnikov University - Odessa, Ukraine
