Emilio Almansi

Buenos Aires, Argentina
Member since October 29, 2017
Emilio has a broad skill set ranging from full-stack development and distributed computing to blockchain technologies. He has had extensive experience with the Java and Node.js ecosystems as an intern at Google and the Max Planck Institute for Informatics, as well as working remotely with a big data startup in Berlin. He has strong analytical skills and takes great care in communicating with clients to find the right solution for their needs.
  • Object-oriented Programming (OOP), 5 years
  • Java, 5 years
  • C++, 5 years
  • Git, 4 years
  • JavaScript, 3 years
  • Python, 3 years
  • Node.js, 2 years
  • Functional Programming, 2 years
Preferred Environment
Linux, Visual Studio Code, IntelliJ, Git
The most amazing...
...lines of code I've written reduced the largest performance bottleneck in a distributed profiling algorithm by a factor of 20 in both time and memory use.
  • Associate Software Engineer
    2016 - 2017
    Trifacta Inc.
    • Wrote and optimized algorithms for computing data transformation primitives on GCP’s Dataflow engine for parallel data processing.
    • Developed a time scheduling microservice based on Java Quartz, designed for high availability and resilience.
    • Integrated Google’s BigQuery large-scale data warehouse into the product, spanning multiple back-end services (Node.js, Java, Python) and the platform’s web application interface (front-end and back-end).
    Technologies: Node.js, Java, Python, C++, Docker, Google Cloud Storage, Google Dataflow, BigQuery
  • Research Intern
    2015 - 2016
    Max Planck Institute for Informatics
    • Built a Java tool for exporting Wikipedia’s full edit history XML dumps (10+ TB uncompressed) into Avro format.
    • Extracted the full link structure of all 37M+ pages and 640M+ revisions in Wikipedia’s edit history.
    • Wrote a data processing pipeline for the Apache Spark SQL engine to compute Jaccard-type semantic relatedness scores between pages, along with various page popularity metrics.
    Technologies: Java, Scala, Apache: HDFS, MapReduce, Spark SQL, Pig, Avro, Parquet.
  • Software Engineering Intern
    2015 - 2015
    Google
    • Wrote a FlumeJava distributed processing pipeline for detecting book series from messy or incomplete book metadata.
    • Set up automatic deployment for the developed pipeline using Borg for daily extraction.
    • Ran the extraction on data provided by major book partners, yielding 1,500+ book series.
    Technologies: Blaze, Piper, Java, Guice, FlumeJava, Borg
  • Freelance Software Engineer
    2013 - 2014
    Data Extraction Freelance Projects
    • Created a stand-alone tool for continuous, high-performance web data extraction jobs. Written in PHP and using multi-cURL to issue many asynchronous requests concurrently, the tool harvested millions of entries per day and wrote them to a MySQL database.
    • Developed multiple customized web crawlers using Python's Scrapy Framework, later deployed to the cloud for autonomous periodic execution.
    Technologies: PHP, MySQL, Python, Scrapy Framework
  • Web Developer
    2012 - 2012
    Artfos SA
    • Developed and maintained CRUD applications with a standardized development process.
    • Developed the back end with PHP, the Yii Framework, and MySQL, and the front end with JavaScript, HTML, and LESS.
    • Launched a PHP continuous integration server based on JenkinsCI.
    • Wrote automated end-to-end tests with Selenium IDE.
    Technologies: PHP, Yii Framework, MySQL, HTML, LESS, JavaScript, JenkinsCI
  • Google Cloud Dataprep (Development)

    Google Cloud Dataprep, born from a collaboration between Trifacta and Google, is an intelligent data service for visually exploring, cleaning, and preparing structured and unstructured data for analysis.

    At Trifacta, I was part of the team rearchitecting Trifacta's data preparation product into a microservice-based architecture fit for integration into the Google Cloud Platform.

    My contributions to this project included implementing and optimizing data transformation operations as data-parallel primitives on Dataflow, Google Cloud's distributed computing engine. I was also responsible for integrating BigQuery (Google's serverless, highly scalable, low-cost enterprise data warehouse) as a data source on Dataprep.

  • ErcyBot (Development)

    ErcyBot is a Slack bot which provides a real-time feed of Ethereum ERC20 token transfers in your Slack workspace.

    The bot listens to the Ethereum Blockchain for incoming events occurring in one of the contracts of interest and, upon detecting a token transfer, sends a message to the configured Slack channel.
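
The detection step described above can be illustrated with a minimal sketch. This is not ErcyBot's actual code: the function names, the log dictionary layout (modeled on a JSON-RPC log entry), and the default of 18 decimals are assumptions made for illustration. The only fixed fact used is the standard topic hash of the ERC20 `Transfer(address,address,uint256)` event.

```python
# Well-known topic hash of the ERC20 event Transfer(address,address,uint256).
TRANSFER_TOPIC = "0xddf252ad1be2c89b69c2b068fc378daa952ba7f163c4a11628f55a4df523b3ef"


def topic_to_address(topic):
    """Indexed addresses are left-padded to 32 bytes; keep the last 20 bytes."""
    return "0x" + topic[-40:]


def format_transfer(log, decimals=18):
    """Return a chat message for a Transfer log, or None for any other event.

    `log` is a hypothetical dict with "topics" (list of hex strings) and
    "data" (hex string holding the uint256 amount).
    """
    topics = log.get("topics", [])
    if len(topics) != 3 or topics[0].lower() != TRANSFER_TOPIC:
        return None
    sender = topic_to_address(topics[1])
    recipient = topic_to_address(topics[2])
    amount = int(log["data"], 16) / 10 ** decimals
    return f"{amount:g} tokens: {sender} -> {recipient}"
```

A bot along these lines would presumably pass each non-None message to whatever Slack posting mechanism is configured; that part is left out here because the profile does not say how it was done.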

  • Robotics for High School Students (Other amazing things)

    As part of the Computer Science popularization team at my university, I developed a web-based IDE à la MIT Scratch for programming and simulating an Arduino-based, two-wheel robot.

    The application has been used to introduce programming and robotics concepts to over a thousand high-school students, and is still in use today. It also has a pretty slick user interface.

  • Scraple: High-Performance Web Data Extraction (Development)

    Highly configurable web data extraction tool, built in PHP and using multi-cURL for leveraging concurrent asynchronous requests. Harvesting millions of entries per day, the tool produced a MySQL database as output, yielding an easily browsable representation of freely available data on the web.
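
The idea of issuing many requests concurrently can be sketched as follows. Scraple itself was written in PHP with multi-cURL; this is only a rough Python analogue that uses a thread pool instead, and `fetch_all`, its parameters, and the injected `fetch` callable are hypothetical names chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(urls, fetch, max_workers=8):
    """Run fetch(url) for every URL concurrently and return {url: result}.

    `fetch` is any callable taking a URL; it is injected so the sketch
    stays independent of a particular HTTP client.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs each URL with its result.
        return dict(zip(urls, pool.map(fetch, urls)))
```

In a real crawler, `fetch` would wrap an HTTP call and the results would be written to a database rather than returned in memory.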

  • Languages
    Java, C++, JavaScript, Python, C, Sass, HTML, Bash, Haskell, SQL, CSS, PHP
  • Tools
    Git, Gulp.js, Bower, NPM, Apache Avro, LaTeX, Mocha, MATLAB, Spark SQL
  • Other
    Algorithms, API Design, Integration Testing, Parquet, Agile Sprints, Ethereum Smart Contracts, Google BigQuery, Bitcoin, Data Extraction, Slackbot
  • Libraries/APIs
    Node.js, Standard Template Library (STL), React
  • Paradigms
    Object-oriented Programming (OOP), Functional Programming, E2E Testing, Prototype-based OOP, Unit Testing, Scrum, MapReduce, Testing
  • Frameworks
    Spring, Redux, Truffle, Express.js, Bootstrap, Scrapy
  • Platforms
    Docker, Heroku
  • Storage
    HDFS, PostgreSQL
  • Master's degree in Computer Science
    2012 - 2018
    University of Buenos Aires - Buenos Aires, Argentina
  • Bachelor's degree in Computer Science
    2012 - 2017
    University of Buenos Aires - Buenos Aires, Argentina