Big Data Engineer Job Description Template

A Big Data Engineer creates and manages a company’s Big Data infrastructure and tools, and knows how to get results from vast amounts of data quickly.

The definition of this role varies and often overlaps with the Data Scientist role. Here, we assume a role focused on engineering, with no requirement for statistics or strong machine learning skills.

The world of Big Data has grown significantly over the last decade, and the skills it requires have become more specialized. While most Big Data infrastructure is built around Hadoop, many tools have become significant in their own right. The following sample description covers some common cases.

Big Data Engineer - Job Description and Ad Template

Copy this template and adapt it to your needs:

Company Introduction

{{Write a short and catchy paragraph about your company. Make sure to provide information about the company culture, perks, and benefits. Mention office hours, remote working possibilities, and everything else you think makes your company interesting. Big Data Engineers like to work on huge problems - mentioning the scale (or the potential) can help gain the attention of top talent.}}

Job Description

We are looking for a Big Data Engineer who will work on collecting, storing, processing, and analyzing huge sets of data. The primary focus will be on choosing optimal solutions for these purposes, then implementing, maintaining, and monitoring them. You will also be responsible for integrating them with the architecture used across the company.

Responsibilities

  • Selecting and integrating any Big Data tools and frameworks required to provide requested capabilities
  • Implementing ETL processes {{if importing data from existing data sources is relevant}}
  • Monitoring performance and advising on any necessary infrastructure changes
  • Defining data retention policies
  • {{Add any other responsibility that is relevant}}

Skills and Qualifications

  • Proficient understanding of distributed computing principles
  • Management of a Hadoop cluster, with all included services {{unless you are going to have specific Big Data DevOps roles for this}}
  • Ability to solve any ongoing issues with operating the cluster {{unless you are going to have specific Big Data DevOps roles for this}}
  • Proficiency with Hadoop v2, MapReduce, HDFS
  • Experience with building stream-processing systems, using solutions such as Storm or Spark Streaming {{if stream-processing is relevant for the role}}
  • Good knowledge of Big Data querying tools, such as Pig, Hive, and Impala
  • Experience with Spark {{if you are including or planning to include it}}
  • Experience with integration of data from multiple data sources
  • Experience with NoSQL databases, such as HBase, Cassandra, MongoDB
  • Knowledge of various ETL techniques and frameworks, such as Flume
  • Experience with various messaging systems, such as Kafka or RabbitMQ
  • Experience with Big Data ML toolkits, such as Mahout, Spark MLlib, or H2O {{if you are going to integrate Machine Learning in your Big Data infrastructure}}
  • Good understanding of Lambda Architecture, along with its advantages and drawbacks
  • Experience with Cloudera/MapR/Hortonworks {{you can specify the distribution you are currently using or planning to use here}}
  • {{List any other technologies you are using or planning to use. Most Big Data Engineers will know some of the ones listed here: The Hadoop Ecosystem Table}}
  • {{List education level or certification you require}}
