Spark Developer Job Description Template
Apache Spark has become one of the most used frameworks for distributed data processing. Its mature codebase, horizontal scalability, and resilience make it a great tool to process huge amounts of data.
Spark’s power and flexibility call for a developer who not only knows the Spark API well but also understands the pitfalls of distributed storage, knows how to structure a data processing pipeline that handles the five Vs of big data (volume, velocity, variety, veracity, and value), and can turn all of that into maintainable code.
Spark Developer - Job Description and Ad Template
Copy this template, and modify it as your own:
Company Introduction
{{ Write a short and catchy paragraph about your company. Make sure to provide information about the company’s culture, perks, and benefits. Mention office hours, remote working possibilities, and everything else that you think makes your company interesting. }}
Job Description
We are looking for a Spark developer who knows how to fully exploit the potential of our Spark cluster.
You will clean, transform, and analyze vast amounts of raw data from various systems using Spark to provide ready-to-use data to our feature developers and business analysts.
This involves both ad-hoc requests and data pipelines that are embedded in our production environment.
Responsibilities
- Create Scala/Spark jobs for data transformation and aggregation
- Produce unit tests for Spark transformations and helper methods
- Write Scaladoc-style documentation for all code
- Design data processing pipelines
Skills
- Scala (with a focus on the functional programming paradigm)
- ScalaTest, JUnit, Mockito {{ , Embedded Cassandra }}
- Apache Spark 2.x
- {{ Apache Spark RDD API }}
- {{ Apache Spark SQL DataFrame API }}
- {{ Apache Spark MLlib API }}
- {{ Apache Spark GraphX API }}
- {{ Apache Spark Streaming API }}
- Spark query tuning and performance optimization
- SQL database integration {{ Microsoft SQL Server, Oracle, Postgres, and/or MySQL }}
- Experience working with {{ HDFS, S3, Cassandra, and/or DynamoDB }}
- Deep understanding of distributed systems (e.g. CAP theorem, partitioning, replication, consistency, and consensus)
Find the right Spark interview questions
Read a list of great community-driven Spark interview questions.
Read them, comment on them, or even contribute your own.
Hire a Top Spark Developer Now
Toptal is a marketplace for top Spark developers, engineers, programmers, coders, architects, and consultants. Top companies and start-ups choose Toptal Spark freelancers for their mission-critical software projects.
Luigi Crispo
Freelance Spark Developer
Luigi is a seasoned cloud and leadership specialist with over two decades of professional experience in a variety of environments. He is passionate about technology and value-driven projects, and he is highly adaptable. He has taken part in significant industry transformations alongside some of the leaders driving the digital era.
Steve Fox
Freelance Spark Developer
Steve is a certified AWS solution architect professional with big data and machine learning specialty certifications. He has a diverse background and experience architecting, building, and operating big data machine learning applications in AWS. Steve has held roles from technical contributor to CTO and CEO.
Khushali Patel
Freelance Spark Developer
Khushali is a detail-oriented data engineer with a get-it-done, on-time, and high-quality product delivery attitude. He has over three years of experience in the design and development of scalable, robust, and reusable big data products and frameworks for many startups and well-known financial firms. Khushali excels in programming (Scala, Java, Python), big data (Hadoop, Spark, Hive, Impala, Druid), and streaming technology (Kafka, KSQL).
Tadej Slamic
Freelance Spark Developer
With over a decade in the software industry, Tadej has helped startups launch their first product, assisted FTSE100 enterprises with digital transformation, been a part of the fintech boom, and helped particle accelerators cool down. He loves creating scalable back ends and is an expert in crafting modern and performant mobile, web, and desktop apps.
Andreas Bollig
Freelance Spark Developer
With a Ph.D. in electrical engineering and extensive experience in building machine learning applications, Andreas spans the entire AI value chain, from use case identification and feasibility analysis to implementation of custom-made statistical models and applications. Throughout projects, he stays focused on solving the business problem at hand and creating value from data.
Mohammad Amin Khashkhashi Moghaddam
Freelance Spark Developer
Currently earning his master’s degree in computer science at ETH Zürich, Mohammad’s professional experience includes the technical management of a mobile advertisement product and working on products with tens of millions of users. He also has over two years of experience in data science and engineering—developing ETL pipelines, training, tuning big data infrastructures, and more.
Oleksii Sliusarenko
Freelance Spark Developer
Oleksii is a senior research engineer specializing in machine learning with several years of hands-on, in-depth experience. In his free time, he competes in international programming and math competitions—and often wins. At Deloitte and Grammarly, he developed their core deep learning and AI algorithms. Oleksii has worked at all stages of R&D from problem formulation with clients to product deployment.
Yuxiang Bao
Freelance Spark Developer
Yuxiang has studied advanced machine learning (ML) theory for the past three years and has delivered multiple projects using cutting-edge ML algorithms and tools. While at school, he also spent two years researching NLP. With a solid knowledge base in ML and NLP, hands-on experience, and exemplary written and verbal communication skills, Yuxiang will add value to your project.
Ivan Nikolaev
Freelance Spark Developer
Ivan has experience working as a data scientist and a data engineer in the network security and finance industries. This includes processing and cleaning data, formalizing business problems, and solving them by designing features and applying machine learning techniques. He works with big data using Spark and MapReduce and can visualize and present results to stakeholders in an easy-to-understand format.
Sung Jun Kim
Freelance Spark Developer
As a highly effective technical leader with over 25 years of experience, Andrew specializes in data integration, data conversion, data engineering, ETL, big data architecture, data analytics, data visualization, data science, analytics platforms, and cloud architecture. He has an array of skills in building data platforms, analytic consulting, trend monitoring, data modeling, data governance, and machine learning.
Diego Ariel Bendersky
Freelance Spark Developer
Diego is a computer science licentiate with more than 15 years of experience. He's worked for companies of all sizes, both on-site and remotely, mainly as a senior developer/architect (programming in C/C++, Python, and recently Go) and as a technical leader for small teams of programmers. He has a problem-solving attitude and likes to use the most suitable tool for each task. He's a co-author of two patents and a few research publications.