Yuriy Margulis

Los Angeles, CA, United States
Member since June 19, 2016
Yuriy is a data specialist with over 15 years of experience in data warehousing, data engineering, big data, and business intelligence. Over the years, he has worked on 5 large data warehouses for prime internet, media, and entertainment companies. He has also acted as a hands-on data engineer and architect, ETL developer, and database administrator, and has provided operational support and SLA compliance.
Yuriy is now available for hire
Portfolio
  • Crowd Consulting
    AWS EMR, Hadoop, Spark, Presto, Hive, AWS Lambda, AWS Redshift...
  • BCG GAMMA (via Toptal)
    Python, Spark, Hive, Presto, Athena, Glue, RDS, PostgreSQL, Airflow...
  • Enervee
    AWS EMR: Hadoop, Spark, Presto, Hive, AWS Redshift, AWS RDS: PostgreSQL...
Experience
  • Oracle, 20 years
  • Leadership, 18 years
  • Data Warehouse, 18 years
  • ETL, 18 years
  • Business Intelligence (BI), 16 years
  • Data Architecture, 15 years
  • Technology Strategy & Architecture, 15 years
  • Big Data, 4 years
Availability
Full-time
Preferred Environment
Oracle, Redshift, Spark, Hadoop, PostgreSQL, MSSQL
The most amazing...
...thing I've done was growing the PriceGrabber data warehouse from 5 to 17 subjects through many platform changes, writing many lines of SQL and other scripting code along the way.
Employment
  • Consultant | Co-founder | CEO
    2016 - PRESENT
    Crowd Consulting
    • Worked on full data warehouse implementations for multiple clients.
    • Provided big data training and support.
    • Engineered and built an ETL pipeline for AWS S3 data warehouse using AWS Kinesis, Lambda, Hive, Presto, and Spark. The pipeline was written in Python.
    Technologies: AWS EMR, Hadoop, Spark, Presto, Hive, AWS Lambda, AWS Redshift, AWS RDS: Postgres, MySQL, DynamoDB, AWS S3, Python, Scala, Luigi, Tableau
  • Freelance Data Engineer
    2018 - 2018
    BCG GAMMA (via Toptal)
    • Provided engineering support for data scientists.
    • Designed and built a feature engineering data mart and a customer 360-degree data lake in AWS S3.
    • Designed and developed a dynamic S3-to-S3 ETL system in Spark and Hive.
    • Completed various DevOps tasks, including an Airflow installation, development of Ansible playbooks, and history backloads.
    Technologies: Python, Spark, Hive, Presto, Athena, Glue, RDS, PostgreSQL, Airflow, Boto 3 API, Ansible
  • Vice President of Data
    2017 - 2018
    Enervee
    • Managed the data engineering, BI reporting, and data science teams.
    • Worked as a hands-on data engineer.
    • Built a data lake on AWS.
    • Developed a reporting system with Redash/Presto.
    Technologies: AWS EMR: Hadoop, Spark, Presto, Hive, AWS Redshift, AWS RDS: PostgreSQL, MySQL, AuroraDB, AWS S3, Python, Airflow, Redash
  • Big Data Architect
    2016 - 2017
    ITG
    • Worked in a full-time position as a data architect for a transaction cost analysis system.
    • Installed a four-node Apache Hadoop/Spark cluster on ITG's private cloud.
    • Conducted a platform POC embedding Apache Spark technology into ITG's data platform.
    • Supported the development of a platform POC for Kx Kdb+; also converted Sybase IQ queries to Kdb+ Q language.
    Technologies: Apache Hadoop, Hive, Spark, Python, Sybase ASE, Sybase IQ, Informatica, Kdb+, Q
  • Freelance Data Engineer and DBA
    2016 - 2017
    American Taekwondo Association (via Toptal)
    • Converted data from a legacy Oracle database to a newly designed SQL Server database.
    • Wrote SQL scripts, stored procedures, and Kettle transformations.
    • Administered two databases.
    • Performed extensive data cleansing and validation.
    Technologies: MS SQL Server, Oracle, Pentaho (Kettle)
  • Director, Data Warehouses
    2015 - 2016
    Connexity
    • Managed two data warehouses and BI teams for both PriceGrabber and Shopzilla. Connexity is also known as PriceGrabber, Shopzilla, and BizRate.
    • Handled operational support for the PriceGrabber data warehouse. Recovered the data warehouse after the data center migration.
    • Merged one data warehouse into another and retired one of them. Designed the business and data integration architecture hands-on; developed data validation scripts and ETL integration code. Managed the transfer of a BI reporting system from Cognos to OBIEE and Tableau.
    • Defined the technology platform change strategy for the combined data warehouse.
    • Created PL/SQL stored procedures, packages, and anonymous scripts for ETL and data validation.
    • Completed an Amazon Redshift project.
    • Completed a Cloudera Impala project.
    Technologies: Oracle, PL/SQL, AWS Redshift, Hadoop, Impala, Cognos, OBIEE, Tableau, Perl, Python, Linux
  • Director, Data Warehouses
    2008 - 2015
    PriceGrabber
    • Oversaw the company's data services and defined the overall and technical strategy for data warehousing, business intelligence, and big data environments.
    • Hired and managed a mixed on-shore (US)/off-shore (India) engineering team.
    • Replatformed a data warehouse to an Oracle Exadata X3/Oracle ZFS combination; added big data and machine learning components to the data warehousing environment.
    • Supported 24x7x365 operations in compliance with the company's top-level production SLA.
    • Wrote thousands of lines of PL/SQL, PL/pgSQL, MySQL, and HiveQL code.
    • Wrote ETL scripting in Perl, Python, and JavaScript internally in Kettle.
    • Worked with big data on multiple types of projects (Hadoop, Pig, Hive, and Mahout).
    • Developed a tool-based ETL for a Pentaho (Kettle) CE ETL redesign project.
    • Worked on machine learning for various types of projects (Python, SciPy, NumPy, and Pandas).
    Technologies: Oracle, Hadoop, Pig, Hive, PostgreSQL, MySQL, Perl, Python, Pentaho (Kettle), Linux
  • Director, Data Warehouses
    2007 - 2008
    Edmunds
    • Managed a data warehouse team and project pipeline; supported operations.
    • Created PL/SQL stored procedures, packages, and anonymous scripts for ETL and data validation.
    • Worked on a tool-based ETL for multiple Informatica projects.
    Technologies: Oracle, Informatica, Perl, Linux
  • Manager, Data Warehouses
    2003 - 2007
    Universal Music Group
    • Managed, developed, and operated a CRM data warehouse.
    • Wrote PL/SQL, MySQL, and Perl code.
    • Administered a Cognos reporting system.
    • Worked on C# for multiple supporting projects for the OLAP reporting system.
    • Designed and developed an MSAS OLAP cube system.
    Technologies: Oracle, SQL Server, MySQL, Cognos, C#, Perl, Linux
  • Director, Decision Support and Financial Systems
    2001 - 2003
    MediaLive International
    • Managed a data warehouse, BI, and CRM systems.
    • Assumed responsibility for an Oracle EBS application team.
    • Did the PL/SQL coding for data warehouse ETL and Oracle Applications integration.
    • Worked with SQL Server on multiple Transact-SQL and Analysis Services projects.
    • Worked on a tool-based ETL for multiple Epiphany EPI*Channel projects.
    Technologies: Oracle, Oracle EBS, SQL Server, VB, Epiphany, Unix
  • Senior Principal Consultant (Professional Services, Essbase Practice)
    1999 - 2001
    Hyperion (Currently: Oracle)
    • Led a practice for a consulting company covering multiple clients.
    • Developed Essbase satellite systems: relational data warehouses and data marts, reporting systems, ETL systems, CRMs, EPPs, and ETL in and out of Essbase and with Essbase itself.
    • Worked on multiple PL/SQL projects, providing full support of the team's Oracle project pipeline.
    • Worked with SQL Server on multiple Transact-SQL and Analysis Services projects.
    • Developed a tool-based ETL for an Informatica project.
    • Worked on Hyperion Essbase, Enterprise, Pillar, Planning, Financial Analyzer, and VBA projects.
    Technologies: Oracle, SQL Server, Hyperion Essbase, VBA, Informatica
Skills
  • Languages
    SQL, Transact-SQL, Python, PL/pgSQL, C#, Perl
  • Tools
    Pentaho Data Integration (Kettle), Informatica PowerCenter, Talend ETL
  • Paradigms
    ETL, Database Design, Management
  • Platforms
    Oracle, Apache Pig
  • Storage
    Oracle PL/SQL, Microsoft SQL Server, MySQL, Apache Hive, Databases, Essbase, PostgreSQL
  • Other
    Business Intelligence (BI), Data Warehouse, Technology Strategy & Architecture, Team Mentoring, Leadership, Data Architecture, Big Data, MSAS, perlpod, Cognos 10, Unix Shell Scripting
  • Frameworks
    Apache Spark, Hadoop
Education
  • Certificate of Completion in Data Science and Engineering with Apache Spark
    2016 - 2016
    UC BerkeleyX (Online Courses from Berkeley) - Berkeley, California (USA)
  • Certificate of Completion in Cloudera Developer Training for Apache Hadoop
    2012 - 2012
    Cloudera University - New York, New York (USA)
  • Certificate of Completion in Oracle Database Administration
    1995 - 1995
    UCI Extension - Irvine, California (USA)
  • Diploma (Master of Science equivalent) degree in Applied Mathematics
    1975 - 1980
    Odessa I.I. Mechnikov University - Odessa, Ukraine