Hadoop has become the cornerstone of big data systems. Hadoop developers not only know how to write applications that interact with Hadoop, but also how to build, operate, and troubleshoot large Hadoop clusters.
Hadoop is a large software stack: on one hand, it deals with low-level hardware resources, and on the other, it provides a high-level API for building software. As a result, a Hadoop developer not only develops software but also operates it.
Furthermore, a good grasp of algorithms and their runtime characteristics is essential to developing for Hadoop efficiently.
Hadoop Developer - Job Description and Ad Template
Copy this template and modify it to make it your own:
Company Introduction
{{ Write a short and catchy paragraph about your company. Make sure to provide information about the company’s culture, perks, and benefits. Mention office hours, remote working possibilities, and everything else that you think makes your company interesting. }}
Job Description
We are looking for a Hadoop developer to help us build large-scale data storage and processing software and infrastructure. Knowledge of existing tools is essential, as is the ability to write software using the Hadoop API.
Responsibilities
Write software to interact with HDFS and MapReduce (see the sketch after this list).
Assess requirements and evaluate existing solutions.
Build, operate, monitor, and troubleshoot Hadoop infrastructure.
Develop tools and libraries, and maintain processes for other engineers to access data and write MapReduce programs.
Develop documentation and playbooks to operate Hadoop infrastructure.
Evaluate and use hosted solutions on AWS / Google Cloud / Azure. {{If you’d like to use hosted solutions}}
Write scalable and maintainable ETLs. {{If you need to run ETLs}}
Understand and implement Hadoop’s security mechanisms. {{If you need fine-grained security within your organization}}
Write software to ingest data into Hadoop.
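To give candidates (and hiring managers) a concrete sense of the first responsibility above, here is a minimal word-count job written against Hadoop’s Java MapReduce API. It’s a sketch rather than production code; the class name is illustrative, and input/output paths are taken from the command line.

```java
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class WordCount {

  // Mapper: emits (word, 1) for every token in its input split.
  public static class TokenizerMapper
      extends Mapper<Object, Text, Text, IntWritable> {
    private static final IntWritable ONE = new IntWritable(1);
    private final Text word = new Text();

    @Override
    public void map(Object key, Text value, Context context)
        throws IOException, InterruptedException {
      StringTokenizer itr = new StringTokenizer(value.toString());
      while (itr.hasMoreTokens()) {
        word.set(itr.nextToken());
        context.write(word, ONE);
      }
    }
  }

  // Reducer: sums the counts emitted for each word.
  public static class IntSumReducer
      extends Reducer<Text, IntWritable, Text, IntWritable> {
    private final IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context)
        throws IOException, InterruptedException {
      int sum = 0;
      for (IntWritable val : values) {
        sum += val.get();
      }
      result.set(sum);
      context.write(key, result);
    }
  }

  public static void main(String[] args) throws Exception {
    Job job = Job.getInstance(new Configuration(), "word count");
    job.setJarByClass(WordCount.class);
    job.setMapperClass(TokenizerMapper.class);
    job.setCombinerClass(IntSumReducer.class); // local pre-aggregation
    job.setReducerClass(IntSumReducer.class);
    job.setOutputKeyClass(Text.class);
    job.setOutputValueClass(IntWritable.class);
    FileInputFormat.addInputPath(job, new Path(args[0]));
    FileOutputFormat.setOutputPath(job, new Path(args[1]));
    System.exit(job.waitForCompletion(true) ? 0 : 1);
  }
}
```

Note the combiner: reusing the reducer as a combiner is safe here because summation is associative and commutative, and it cuts shuffle traffic between the map and reduce phases. A strong candidate should be able to explain when that reuse is and isn’t valid.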
Skills
You know the JVM runtime, the Java language, and ideally another JVM-based programming language. {{Mention Java version if it matters to your existing code}}
You know computer science fundamentals, particularly algorithmic complexity.
You know trade-offs in distributed systems.
You understand the software engineering principles that produce maintainable software, and you apply them in practice.
You have worked with a Hadoop distribution.
You have worked with one or more computation frameworks, such as Spark.
You’re familiar with HBase, Kafka, ZooKeeper, or other Apache software. {{Add or remove Apache software based on need}}
You know Linux and its operation, networking, and security.
You know how to efficiently move large data around.
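As a concrete illustration of the last point, here is a minimal sketch that copies a local file into HDFS through the org.apache.hadoop.fs.FileSystem API. The paths are hypothetical, and the configuration is assumed to be picked up from the cluster’s core-site.xml and hdfs-site.xml on the classpath.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsIngest {
  public static void main(String[] args) throws Exception {
    // Reads core-site.xml / hdfs-site.xml from the classpath,
    // so fs.defaultFS points at the cluster's NameNode.
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);

    // Hypothetical paths, for illustration only.
    Path local = new Path("/data/exports/events.log");
    Path remote = new Path("/ingest/raw/events.log");

    // Streams the local file into HDFS; the client handles
    // block placement and replication.
    fs.copyFromLocalFile(local, remote);
    fs.close();
  }
}
```

For bulk transfers between clusters, the same API underpins tools such as DistCp, which parallelizes copies as a MapReduce job.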