Gonzalo Andres Diaz, Data Engineering Developer in Córdoba, Cordoba, Argentina
Member since July 26, 2012
Gonzalo has around 10 years of experience building solutions with multiple languages and technologies. Within the data science and big data space, he became a data engineer, as that is his strong suit. He is often found studying machine learning topics alongside the data engineering ecosystem and state-of-the-art big data platforms.


  • Rappi
    Amazon Web Services (AWS), SQL, Python, Rakam, Periscope, Snowflake...
  • Olapic
    Jenkins, SQL, Pandas, Python, PHP, AWS Lambda, Streaming...
  • Bytelion
    Amazon Web Services (AWS), Celery, Heroku, AngularJS, PostgreSQL, Django...






Preferred Environment

Amazon Web Services (AWS), GitHub, MacOS, PyCharm

The most amazing...

...thing I've built is a data infrastructure that handles real-world big data and customer needs.


  • Data Engineer

    2019 - PRESENT
    • Embedded in a data science team that has recently begun adopting data engineering practices. Managed communication with the platform and infrastructure teams. Set up the tooling, documentation, and data modeling, and planned the shape of the data pipelines moving forward.
    • Set up a machine learning and data engineering orchestration tool for the company, using Airflow in collaboration with DevOps.
    Technologies: Amazon Web Services (AWS), SQL, Python, Rakam, Periscope, Snowflake, Apache Airflow
  • Data Engineer

    2016 - 2019
    • Served as a back-end analytics engineer, supporting the maintenance of the data infrastructure and ClickStream API.
    • Authored an anomaly detection tool that identifies problems in customer data, enabling early detection and fixes to the underlying data and tooling.
    • Collaborated on the design and implementation of the new version of the data pipeline, moving from Jenkins and SQL scripts to Apache Airflow.
    • Collaborated on the design and implementation of the ETL to sync data from operational databases in RDS/Aurora (MySQL) to the data warehouse in Redshift.
    • Helped the business intelligence team write and maintain the funnels and reducers that fuel the company's reporting.
    Technologies: Jenkins, SQL, Pandas, Python, PHP, AWS Lambda, Streaming, Amazon Kinesis Data Firehose, Amazon EC2, Amazon S3 (AWS S3), Redshift
  • Full-stack Developer

    2014 - 2015
    • Worked as a back-end Node.js developer using ES6. Collaborated on the data ingestion framework for a third-party news provider.
    • Wrote and maintained the data pipelines that ingested the LawIQ proprietary information.
    • Set up the codebase of the project. Delivered the MVP from scratch to the initial group of clients.
    • Helped improve the API and the performance of the machine learning algorithm at SameGrain (iOS social network).
    Technologies: Amazon Web Services (AWS), Celery, Heroku, AngularJS, PostgreSQL, Django, Python, MongoDB, Express.js, Node.js
  • Software Developer

    2012 - 2014
    Santex America
    • Developed and maintained several tools used to convert DOCX and HTML to a proprietary XML variation and back.
    • Developed an HTML importer for the proprietary XML using JTidy, CssToXslfo, XSLFO, and XSLT. After months of improving this module, I took charge of the migration from XSLT to Java, using Jsoup to marshal HTML and JAXB to unmarshal the proprietary XML.
    • Used Agile methodologies with a remote team in Argentina and the customer HQ in Ames, Iowa.
    • Built an in-house CI tool integrated with GitHub.
    Technologies: Amazon Web Services (AWS), XSLT, XSD, XML, Maven, Ant, GAE, Scrum, Java, Python
  • Software Development

    2009 - 2011
    • Developed and enhanced the insurance module of Orbitz Worldwide and its entire platform.
    • Built an internal tool to manage the configuration files (XML) of the entire platform. Wrote an interface to AccuRev (SCM) and Atlassian Jira.
    • Coordinated with European product owners on which products to include in each release.
    • Developed the configuration change files (XML) requested by the product owners and coordinated with the release management team to include these changes in the next deployment. The goal was to deliver a configuration bundle every two weeks without production issues.
    • Led a successful migration from Ant2 to Gradle.
    • Prepared and maintained VMware virtual machines.
    Technologies: Gradle, Ant, Scrum, Agile, Ruby on Rails (RoR), Java


  • Insurance Vertical for Orbitz Worldwide

    With my team, I built the entire stack for the insurance modules, from the front end to the back end, connecting to a payment gateway and using several technologies like Drools, Spring, ProtoBuf, etc.

  • MVP of a Social Media

    As a full-stack JavaScript developer, I wrote the first version of the product, which was used to pitch it to clients.

  • Next Gen Data Infrastructure

    At Olapic, we designed and implemented a world-class data infrastructure with enhanced service tracking using Flask and Apache Spark, data pipeline orchestration with Airflow, and a data warehouse in Redshift.

  • LawIQ Ingestion Pipeline

    Using AWS EC2 and Python scripting, I built the data pipeline to ingest the assets coming from the third party.


  • Languages

    Python, SQL, Snowflake, Java, XML, XSD, XSLT, PHP
  • Tools

    GitHub, PyCharm, Apache Airflow, Amazon Simple Queue Service (SQS), Apache, Celery, Jenkins, Vagrant, Periscope, Chartio, AWS Simple Notification Service (AWS SNS), Gradle, Maven
  • Paradigms

    Continuous Integration (CI), Object-oriented Programming (OOP), Functional Programming, Agile Software Development, Dimensional Modeling, Distributed Computing, Agile, Scrum
  • Platforms

    MacOS, Linux, AWS Kinesis, Amazon EC2, AWS Lambda, Heroku, Rakam, Amazon Web Services (AWS), Docker
  • Storage

    JSON, Redshift, SQLite, MySQL, PostgreSQL, Distributed Databases, MongoDB, Amazon S3 (AWS S3)
  • Other

    Amazon Kinesis Data Firehose, Data Engineering, MVP Design, Data Modeling, Ant, Streaming
  • Libraries/APIs

    Pandas, Node.js, PySpark
  • Frameworks

    Ruby on Rails (RoR), GAE, Express.js, Django, AngularJS


  • Incomplete Degree in Information Systems Engineering
    2002 - 2013
    UTN National University of Technology - Cordoba


  • Reproducible Research
    MARCH 2016 - PRESENT
    Coursera Course Certificates
  • Hadoop Starter Kit
  • Data Analysis and Statistical Inference
  • Exploratory Data Analysis
    Coursera Verified Certificates
  • Getting and Cleaning Data
    MARCH 2015 - PRESENT
    Coursera Verified Certificates
  • R Programming
    Coursera Verified Certificates
  • The Data Scientist's Toolbox
    Coursera Verified Certificates
  • Machine Learning
