Gonzalo Andres Diaz, Data Engineering Developer in Córdoba, Cordoba, Argentina

Member since July 9, 2012
Gonzalo has around 10 years of experience learning and developing solutions with multiple languages and technologies. Within data science and big data, he specialized as a data engineer, which is his strongest suit. He often studies machine learning topics alongside the data engineering ecosystem and state-of-the-art big data platforms.
Gonzalo is now available for hire

Portfolio

  • Rappi
    AWS, Airflow, Snowflake, Periscope, Rakam, Python, SQL
  • Olapic
    Redshift, S3, EC2, Kinesis Firehose, Kinesis Streaming, Lambda, PHP, Python...
  • Bytelion
    Node.js, Express, MongoDB, Python, Django, PostgreSQL, AWS, AngularJS, Heroku...

Experience

  • Agile Software Development, 10 years
  • Python, 6 years
  • Redshift, 3 years
  • Data Engineering, 3 years
  • SQL, 3 years
  • Apache Airflow, 3 years
  • Data Modeling, 2 years
  • Distributed Computing, 2 years

Availability

Part-time

Preferred Environment

PyCharm, macOS, GitHub, AWS

The most amazing...

...thing I've evolved is the data infrastructure to handle real-world big data and customer needs.

Employment

  • Data Engineer

    2019 - PRESENT
    Rappi
    • Embedded in a data science team that had recently begun adopting data engineering practices. Took charge of communication with the platform and infrastructure teams. Set up the tooling, documentation, and data modeling, and planned the shape of the data pipelines moving forward.
    • Set up a machine learning and data engineering orchestration tool for the company, using Airflow in collaboration with DevOps.
    Technologies: AWS, Airflow, Snowflake, Periscope, Rakam, Python, SQL
  • Data Engineer

    2016 - 2019
    Olapic
    • Served as back-end analytics engineer, supporting the maintenance of the data infrastructure and ClickStream API.
    • Authored an anomaly detector tool to identify problems in customer data to enable early detection and fixing of the underlying data and tooling.
    • Collaborated on the design and implementation of the new version of the data pipeline, from Jenkins and SQL scripts to Apache Airflow.
    • Collaborated on the design and implementation of the ETL to sync data from operational databases in RDS/Aurora (MySQL) to the data warehouse in Redshift.
    • Supported the business intelligence team in writing and maintaining the funnels and reducers that feed the company's reporting.
    Technologies: Redshift, S3, EC2, Kinesis Firehose, Kinesis Streaming, Lambda, PHP, Python, Pandas, SQL, Jenkins
  • Full-stack Developer

    2014 - 2015
    Bytelion
    • Operated as a back-end Node.js developer using ES6. Collaborated on writing the data ingestion framework for a third-party news provider.
    • Wrote and maintained the data pipelines that ingested the LawIQ proprietary information.
    • Set up the codebase of the project. Delivered the MVP from scratch to the initial group of clients.
    • Collaborated on improving the API and the performance of the machine learning algorithm at SameGrain (an iOS social network).
    Technologies: Node.js, Express, MongoDB, Python, Django, PostgreSQL, AWS, AngularJS, Heroku, Celery
  • Software Developer

    2012 - 2014
    Santex America
    • Developed and maintained several tools used to convert from DOCX and HTML to a proprietary XML variation and the other way around.
    • Developed an HTML importer for the proprietary XML using JTidy, CssToXslfo, XSLFO, and XSLT. After months of improving this module, took charge of the migration from XSLT to Java, using Jsoup to parse HTML and JAXB to unmarshal the proprietary XML.
    • Used Agile methodologies with a remote team in Argentina and the customer HQ in Ames, Iowa.
    • Built an in-house CI tool integrated with GitHub.
    Technologies: Python, Java, Scrum, AWS, GAE, Ant, Maven, XML, XSD, XSLT
  • Software Developer

    2009 - 2011
    Globant
    • Developed and enhanced the insurance module of Orbitz Worldwide and its entire platform.
    • Built an internal tool to manage configuration files of the entire platform (XML files). Wrote an interface to AccuRev (SCM) and Atlassian Jira.
    • Coordinated with European product owners on the products to include in each release.
    • Developed the configuration change files (XML) requested by the product owners and coordinated with the release management team to include these changes in the next deployment. The goal was to deliver a configuration bundle every two weeks without production issues.
    • Led a successful migration from Ant2 to Gradle.
    • Prepared and maintained VMware virtual machines.
    Technologies: Java, Ruby on Rails, Agile, Scrum, Ant2, Gradle

Experience

  • Insurance Vertical for Orbitz Worldwide (Other amazing things)

    I built (with my team) the entire stack for the insurance modules, going from the front-end to the back-end, connecting to a payment gateway, and using several technologies like Drools, Spring, ProtoBuf, etc.

  • MVP of a Social Media (Development)
    http://www.daocloud.com/

    As a full-stack JavaScript developer, I wrote the first version of the product, which was presented to clients to sell it.

  • Next Gen Data Infrastructure (Development)

    At Olapic, we designed and implemented a world-class data infrastructure with enhanced service tracking using Flask and Apache Spark, data pipeline orchestration with Airflow, and a data warehouse in Redshift.

  • LawIQ Ingestion Pipeline (Development)
    https://www.lawiq.com/

    Using AWS EC2 and Python scripting, I built the data pipeline to ingest the assets coming from the third party.

Skills

  • Languages

    Python, SQL, Snowflake, PHP
  • Tools

    GitHub, PyCharm, Apache Airflow, Amazon SQS, Apache, Celery, Jenkins, Vagrant, Periscope, Chartio, Amazon Simple Notification Service (Amazon SNS)
  • Paradigms

    Continuous Integration (CI), Object-oriented Programming (OOP), Functional Programming, Agile Software Development, Dimensional Modeling
  • Platforms

    macOS, Linux, AWS Kinesis, AWS EC2, AWS Lambda, Heroku, Docker
  • Storage

    JSON, Amazon Kinesis Data Firehose, Redshift, SQLite, MySQL, PostgreSQL, Distributed Databases, AWS S3
  • Other

    Data Engineering, MVP Design, Data Modeling, Distributed Computing
  • Libraries/APIs

    Pandas, PySpark

Education

  • Incomplete degree in Information Systems Engineering
    2002 - 2013
    UTN National University of Technology - Cordoba

Certifications

  • Reproducible Research
    MARCH 2016 - PRESENT
    Coursera Course Certificates
  • Hadoop Starter Kit
    DECEMBER 2015 - PRESENT
    Udemy
  • Data Analysis and Statistical Inference
    NOVEMBER 2015 - PRESENT
    Coursera
  • Exploratory Data Analysis
    AUGUST 2015 - PRESENT
    Coursera Verified Certificates
  • Getting and Cleaning Data
    MARCH 2015 - PRESENT
    Coursera Verified Certificates
  • R Programming
    FEBRUARY 2015 - PRESENT
    Coursera Verified Certificates
  • The Data Scientist's Toolbox
    JANUARY 2015 - PRESENT
    Coursera Verified Certificates
  • Machine Learning
    DECEMBER 2014 - PRESENT
    Coursera