Apache Spark Optimization Techniques for High-performance Data Processing

Apache Spark is an analytics engine that can handle very large data sets. This guide reveals strategies to optimize its performance using PySpark.

11 minute read
Necati Demir, PhD


Introduction to Apache Spark With Examples and Use Cases

In this post, Toptal engineer Radek Ostrowski introduces Apache Spark: fast, easy-to-use, and flexible big data processing. Billed as offering “lightning fast cluster computing,” the Spark technology stack incorporates a comprehensive set of capabilities, including SparkSQL, Spark Streaming, MLlib (for machine learning), and GraphX. Spark may very well be the “child prodigy of big data,” rapidly gaining a dominant position in the complex world of big data processing.

8 minute read
Radek Ostrowski

