Spark

Engineering > Data Science and Databases

Apache Spark Optimization Techniques for High-performance Data Processing

By Necati Demir, PhD

Apache Spark is an analytics engine that can handle very large data sets. This guide reveals strategies to optimize its performance using PySpark.

11 minute read
Engineering > Back-end

How I Used Apache Spark and Docker in a Hackathon to Build a Weather App

By Radek Ostrowski

Hackathons often inspire engineers to create amazing software. By blending various technologies, engineers can build useful and often fun projects in a short period of time. In this article, Toptal engineer Radek Ostrowski shares his experience participating in the IBM Sparkathon and walks through how he combined Apache Spark and Docker in IBM Bluemix to build a weather app.

8 minute read
Engineering > Data Science and Databases

Introduction to Apache Spark With Examples and Use Cases

By Radek Ostrowski

In this post, Toptal engineer Radek Ostrowski introduces Apache Spark, a fast, easy-to-use, and flexible engine for big data processing. Billed as offering "lightning fast cluster computing," the Spark technology stack includes Spark SQL, Spark Streaming, MLlib (for machine learning), and GraphX. Spark may very well be the "child prodigy of big data," rapidly gaining a dominant position in the complex world of big data processing.

8 minute read
