The Vital Guide to Big Data Interviewing

Big data is an extremely broad domain, typically addressed by a hybrid team of data scientists, software engineers, and statisticians. Real expertise in big data therefore requires far more than learning the ins and outs of a particular technology. This guide offers a sampling of effective questions to help evaluate the breadth and depth of a candidate's mastery of this complex domain.

A Big Data Engineer creates and manages a company’s Big Data infrastructure and tools, and knows how to get results from vast amounts of data quickly.

The actual definition of this role varies, and it often overlaps with the Data Scientist role. Here, we assume a role focused on engineering, one that does not require statistics or strong machine learning skills.

The world of Big Data has grown significantly during the last decade, and the skills it requires have become more specific as a result. While most Big Data infrastructure is built around Hadoop, many tools have become significant in their own right. The following sample description covers some common cases.

Big Data Engineer - Job Description and Ad Template

Company Introduction

{{Write a short and catchy paragraph about your company. Make sure to provide information about the company culture, perks, and benefits. Mention office hours, remote working possibilities, and everything else you think makes your company interesting. Big Data Engineers like to work on huge problems - mentioning the scale (or the potential) can help gain the attention of top talent.}}

Job Description

We are looking for a Big Data Engineer who will work on collecting, storing, processing, and analyzing huge sets of data. The primary focus will be on choosing optimal solutions for these purposes, then implementing, maintaining, and monitoring them. You will also be responsible for integrating them with the architecture used across the company.

Responsibilities

  • Selecting and integrating any Big Data tools and frameworks required to provide requested capabilities
  • Implementing ETL processes {{if importing data from existing data sources is relevant}}
  • Monitoring performance and advising on any necessary infrastructure changes
  • Defining data retention policies
  • {{Add any other responsibility that is relevant}}

Skills and Qualifications

  • Proficient understanding of distributed computing principles
  • Management of a Hadoop cluster, with all included services {{unless you are going to have specific Big Data DevOps roles for this}}
  • Ability to solve any ongoing issues with operating the cluster {{unless you are going to have specific Big Data DevOps roles for this}}
  • Proficiency with Hadoop v2, MapReduce, HDFS
  • Experience with building stream-processing systems, using solutions such as Storm or Spark-Streaming {{if stream-processing is relevant for the role}}
  • Good knowledge of Big Data querying tools, such as Pig, Hive, and Impala
  • Experience with Spark {{if you are including or planning to include it}}
  • Experience with integration of data from multiple data sources
  • Experience with NoSQL databases, such as HBase, Cassandra, MongoDB
  • Knowledge of various ETL techniques and frameworks, such as Flume
  • Experience with various messaging systems, such as Kafka or RabbitMQ
  • Experience with Big Data ML toolkits, such as Mahout, SparkML, or H2O {{if you are going to integrate Machine Learning in your Big Data infrastructure}}
  • Good understanding of Lambda Architecture, along with its advantages and drawbacks
  • Experience with Cloudera/MapR/Hortonworks {{you can specify the distribution you are currently using or planning to use here}}
  • {{List any other technologies you are using or planning to use. Most Big Data Engineers will know some of the ones listed here: The Hadoop Ecosystem Table}}
  • {{List education level or certification you require}}