Senior Data Engineer
2021 - PRESENT
Emplifi (Acquired Socialbakers)
- Built a data lake that integrates data across the whole company and enables any internal client to ask any data question.
- Decommissioned an obsolete AWS Redshift DWH for cost efficiency and established an alternative solution using relational databases, column stores, and S3 storage.
- Established and maintained an internal framework for real-time and near-real-time streaming of social-media data and the application of machine-learning models.
Technologies: Spark, Redshift, Presto DB, EMR, RabbitMQ, AWS Kinesis, PostgreSQL, MongoDB, Amazon DynamoDB, Amazon S3 (AWS S3), Python, Java, Scala, Machine Learning, Streaming, Databricks, Apache Airflow, Data Engineering, Big Data, NoSQL
Data Engineer
2018 - 2021
Socialbakers
- Established an internal data-engineering educational group.
- Led the development of an ETL-DWH reporting application for internal customers.
- Led an integration of Mixpanel and Salesforce, which significantly increased the effectiveness of sales personnel.
- Built an internal CLI tool for integrating a Databricks workspace with local development and a Git versioning system.
Technologies: Spark, Redshift, Presto DB, EMR, RabbitMQ, AWS Kinesis, PostgreSQL, MongoDB, Amazon DynamoDB, Amazon S3 (AWS S3), Python, Java, Scala, Machine Learning, Streaming, Databricks, Apache Airflow, Data Engineering, Big Data, NoSQL
Full-stack Developer
2017 - 2018
Edvisor
- Built a React widget that was then deployed at the Kaplan International Languages school.
- Delivered a new internal back-end GraphQL platform for the interaction of the front end and databases, which then replaced the old REST platform.
- Experimented with the integration of AngularJS and React and introduced a way to incrementally migrate the main product to React.
Technologies: JavaScript, MariaDB, Node.js, Sentry, AngularJS, React, Jira, Keen.io, SQL, GraphQL, REST
Systems and Data Integrator
2013 - 2017
Socialbakers
- Designed the architecture of the internal DWH solution according to Kimball's best practices, using open-source technologies including a heavily optimized PostgreSQL as the ROLAP DWH and Pentaho for ETL/BI.
- Built an internal Salesforce widget that showed health metrics per Salesforce account.
- Maintained old PHP processes for product integration and replaced them with a more stable ETL solution.
Technologies: JavaScript, ETL, Pentaho, PostgreSQL, PHP, RabbitMQ, MongoDB, Data Engineering, Big Data, NoSQL
Junior Big Data Specialist (Internship)
2013 - 2013
IBM
- Entered the world of big data and dug through many big-data technologies (both closed-source and open-source).
- Prepared an extensive comparison of the efficiency and performance of data-transformation processes when using GPFS rather than HDFS.
- Completed a wide range of time-management and soft-skills training courses.
Technologies: HDFS, GPFS, SQL, IBM BigInsights, Apache Hive, Apache Pig, Big Data
Java Developer and Integrator
2010 - 2012
Zitec
- Designed and implemented a complex integration platform, which integrated a large volume of point-of-sale (POS) terminals, a central ERP system (ADempiere), an open-source CRM system (SugarCRM), and an open-source eCommerce platform (Magento).
- Implemented and customized an open-source ERP system (ADempiere) according to client demands. This solution was implemented using a standard open-source stack, including PostgreSQL, JBoss, and Java (J2EE on the back end, Swing and ZK on the front end).
- Built an effective reporting platform using a set of ETL processes and the Palo multidimensional database (CUDA-accelerated), which was used as a base for an internal BI solution.
Technologies: Java, PostgreSQL, MySQL, ADempiere, iDempiere ERP, VMware ESXi, VirtualBox, CentOS, Linux, JBoss, GlassFish, Apache Tomcat