Yuriy Margulis, Software Developer in Los Angeles, CA, United States
Member since June 19, 2016
Yuriy is a data specialist with over 15 years of experience in data warehousing, data engineering, big data, and business intelligence. Over the years, he has worked on 5 large data warehouses for prime internet, media, and entertainment companies. He has also worked as a hands-on data engineer and architect, ETL developer, and database administrator, and has provided operational support and SLA compliance.

Portfolio

  • Crowd Consulting
    AWS EMR, Hadoop, Spark, Presto, Hive, AWS Lambda, AWS Redshift...
  • BCG GAMMA (via Toptal)
    Python, Spark, Hive, Presto, Athena, Glue, RDS, PostgreSQL, Airflow...
  • Enervee
    AWS EMR: Hadoop, Spark, Presto, Hive, AWS Redshift, AWS RDS: PostgreSQL...

Experience

  • Oracle, 20 years
  • Data Warehouse, 18 years
  • ETL, 18 years
  • Leadership, 18 years
  • Business Intelligence (BI), 16 years
  • Data Architecture, 15 years
  • Technology Strategy & Architecture, 15 years
  • Big Data, 4 years
Availability

Part-time

Preferred Environment

Oracle, Redshift, Spark, Hadoop, PostgreSQL, MSSQL

The most amazing...

...thing I've done was growing the PriceGrabber data warehouse from 5 to 17 subjects through many platform changes, writing many lines of SQL and other scripting code along the way.

Employment

  • Consultant | Co-founder | CEO

    2016 - PRESENT
    Crowd Consulting
    • Worked on full data warehouse implementations for multiple clients.
    • Provided big data training and support.
    • Engineered and built an ETL pipeline for AWS S3 data warehouse using AWS Kinesis, Lambda, Hive, Presto, and Spark. The pipeline was written in Python.
    Technologies: AWS EMR, Hadoop, Spark, Presto, Hive, AWS Lambda, AWS Kinesis, AWS Redshift, AWS RDS: Postgres, MySQL, DynamoDB, AWS S3, Python, Scala, Luigi, Tableau
  • Freelance Data Engineer

    2018 - 2018
    BCG GAMMA (via Toptal)
    • Provided engineering support for data scientists.
    • Designed and built a feature engineering data mart and a customer 360-degree data lake in AWS S3.
    • Designed and developed a dynamic S3-to-S3 ETL system in Spark and Hive.
    • Completed various DevOps tasks, including an Airflow installation, development of Ansible playbooks, and history backloads.
    Technologies: Python, Spark, Hive, Presto, Athena, Glue, RDS, PostgreSQL, Airflow, Boto 3 API, Ansible
  • Vice President of Data

    2017 - 2018
    Enervee
    • Managed the data engineering, BI reporting, and data science teams.
    • Worked as a hands-on data engineer.
    • Built a data lake on AWS.
    • Developed a reporting system with Redash/Presto.
    Technologies: AWS EMR: Hadoop, Spark, Presto, Hive, AWS Redshift, AWS RDS: PostgreSQL, MySQL, AuroraDB, AWS S3, Python, Airflow, Redash
  • Big Data Architect

    2016 - 2017
    ITG
    • Worked full-time as a data architect for a transaction cost analysis system.
    • Installed a four-node Apache Hadoop/Spark cluster on ITG's private cloud.
    • Conducted a platform POC embedding Apache Spark technology into ITG's data platform.
    • Supported the development of a platform POC for Kx Kdb+; also converted Sybase IQ queries to Kdb+ Q language.
    Technologies: Apache Hadoop, Hive, Spark, Python, Sybase ASE, Sybase IQ, Informatica, Kdb+, Q
  • Freelance Data Engineer and DBA

    2016 - 2017
    American Taekwondo Association (via Toptal)
    • Converted data from a legacy Oracle database to a newly designed SQL Server database.
    • Wrote SQL scripts, stored procedures, and Kettle transformations.
    • Administered two databases.
    • Performed extensive data cleansing and validation.
    Technologies: MS SQL Server, Oracle, Pentaho (Kettle)
  • Director, Data Warehouses

    2015 - 2016
    Connexity
    • Managed two data warehouses and BI teams for both PriceGrabber and Shopzilla. Connexity is also known as PriceGrabber, Shopzilla, and BizRate.
    • Handled operational support for the PriceGrabber data warehouse. Recovered the data warehouse after the data center migration.
    • Merged one data warehouse into the other and retired the redundant one. Designed the business and data integration architecture hands-on; developed data validation scripts and ETL integration code. Managed the transfer of a BI reporting system from Cognos to OBIEE and Tableau.
    • Defined the technology platform change strategy for the combined data warehouse.
    • Created SQL and PL/SQL stored procedures, packages, and anonymous scripts for ETL and data validation.
    • Completed an Amazon Redshift project.
    • Completed a Cloudera Impala project.
    Technologies: Oracle, PL/SQL, AWS Redshift, Hadoop, Impala, Cognos, OBIEE, Tableau, Perl, Python, Linux
  • Director, Data Warehouses

    2008 - 2015
    PriceGrabber
    • Oversaw the company's data services and defined the overall and technical strategy for data warehousing, business intelligence, and big data environments.
    • Hired and managed a mixed on-shore (US)/off-shore (India) engineering team.
    • Replatformed a data warehouse to an Oracle Exadata X3/Oracle ZFS combination and added big data and machine learning components to the data warehousing environment.
    • Supported 24x7x365 operations in compliance with the company's top-level production SLA.
    • Wrote thousands of lines of PL/SQL, PL/pgSQL, MySQL, and HiveQL code.
    • Wrote ETL scripts in Perl and Python, as well as JavaScript embedded in Kettle.
    • Worked with big data on multiple types of projects (Hadoop, Pig, Hive, and Mahout).
    • Developed a tool-based ETL for a Pentaho (Kettle) CE ETL redesign project.
    • Worked on machine learning for various types of projects (Python, SciPy, NumPy, and Pandas).
    Technologies: Oracle, Hadoop, Pig, Hive, PostgreSQL, MySQL, Perl, Python, Pentaho (Kettle), Linux
  • Director, Data Warehouses

    2007 - 2008
    Edmunds
    • Managed a data warehouse team and project pipeline; supported operations.
    • Created PL/SQL stored procedures, packages, and anonymous scripts for ETL and data validation.
    • Worked on a tool-based ETL for multiple Informatica projects.
    Technologies: Oracle, Informatica, Perl, Linux
  • Manager, Data Warehouses

    2003 - 2007
    Universal Music Group
    • Managed, developed, and operated a CRM data warehouse.
    • Wrote PL/SQL, MySQL, and Perl code.
    • Administered a Cognos reporting system.
    • Worked on C# for multiple supporting projects for the OLAP reporting system.
    • Designed and developed an MSAS OLAP cube system.
    Technologies: Oracle, SQL Server, MySQL, Cognos, C#, Perl, Linux
  • Director, Decision Support and Financial Systems

    2001 - 2003
    MediaLive International
    • Managed a data warehouse, BI, and CRM systems.
    • Assumed responsibility for an Oracle EBS application team.
    • Wrote the PL/SQL code for data warehouse ETL and Oracle Applications integration.
    • Worked with SQL Server on multiple Transact-SQL and Analysis Services projects.
    • Worked on a tool-based ETL for multiple Epiphany EPI*Channel projects.
    Technologies: Oracle, Oracle EBS, SQL Server, VB, Epiphany, Unix
  • Senior Principal Consultant (Professional Services, Essbase Practice)

    1999 - 2001
    Hyperion (Currently: Oracle)
    • Led a practice for a consulting company covering multiple clients.
    • Developed Essbase satellite systems: relational data warehouses and data marts, reporting systems, ETL systems, CRMs, EPPs, and ETL into, out of, and within Essbase.
    • Worked on multiple PL/SQL projects, providing full support for the team's Oracle project pipeline.
    • Worked with SQL Server on multiple Transact-SQL and Analysis Services projects.
    • Developed a tool-based ETL for an Informatica project.
    • Worked on Hyperion Essbase, Enterprise, Pillar, Planning, Financial Analyzer, and VBA projects.
    Technologies: Oracle, SQL Server, Hyperion Essbase, VBA, Informatica

Skills

  • Languages

    Python, SQL, PL/pgSQL, Transact-SQL, C#, Perl
  • Frameworks

    Apache Spark, Hadoop
  • Tools

    Pentaho Data Integration (Kettle), Apache Airflow, Informatica PowerCenter, Talend ETL
  • Paradigms

    ETL, Management, Database Design
  • Platforms

    Oracle, Apache Pig
  • Storage

    PostgreSQL, Apache Hive, Databases, Oracle PL/SQL, Microsoft SQL Server, Cassandra, Essbase, MySQL
  • Other

    Data Warehouse, Data Architecture, Leadership, Business Intelligence (BI), Team Mentoring, Technology Strategy & Architecture, Big Data, perlpod, Unix Shell Scripting, MSAS, Cognos 10

Education

  • Certificate of Completion in Data Science and Engineering with Apache Spark
    2016 - 2016
    UC BerkeleyX (Online Courses from Berkeley) - Berkeley, California (USA)
  • Certificate of Completion in Cloudera Developer Training for Apache Hadoop
    2012 - 2012
    Cloudera University - New York, New York (USA)
  • Certificate of Completion in Oracle Database Administration
    1995 - 1995
    UCI Extension - Irvine, California (USA)
  • Diploma (Master of Science equivalent) degree in Applied Mathematics
    1975 - 1980
    Odessa I.I. Mechnikov University - Odessa, Ukraine