Spark

A distributed engine for big-data batch (and stream) processing.

Definition

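Apache Spark is an open-source distributed engine for large-scale data processing. It splits a dataset into partitions, spreads them across a cluster of machines, and runs batch jobs (and, through its streaming APIs, stream jobs) over them in parallel.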

How it works

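Spark keeps intermediate results in memory where it can, instead of writing them to disk between processing steps. It is fault-tolerant via lineage: each dataset records the chain of transformations that produced it, so a partition lost to a machine failure can be recomputed from its inputs rather than restored from a replica. This combination is why Spark is regarded as the modern successor to MapReduce.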
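
The lineage idea can be illustrated with a small toy model in plain Python. This is a sketch under stated assumptions, not Spark's API: `ToyDataset`, `lose_partition`, and the `computations` counter are hypothetical names invented for the illustration, and caching every partition automatically is a simplification (Spark only persists a dataset when asked to).

```python
from typing import Callable, Dict, List, Optional


class ToyDataset:
    """Toy stand-in for a partitioned Spark dataset (hypothetical, not Spark's API)."""

    def __init__(
        self,
        source: Optional[List[List[int]]] = None,
        parent: Optional["ToyDataset"] = None,
        fn: Optional[Callable[[int], int]] = None,
    ) -> None:
        self.source = source    # durable input partitions (root dataset only)
        self.parent = parent    # lineage: the dataset this one was derived from
        self.fn = fn            # lineage: the per-element transformation
        self.cache: Dict[int, List[int]] = {}  # partitions currently held in memory
        self.computations = 0   # number of partition computations performed

    def map(self, fn: Callable[[int], int]) -> "ToyDataset":
        # Nothing is computed here; only the lineage step is recorded.
        return ToyDataset(parent=self, fn=fn)

    def num_partitions(self) -> int:
        if self.parent is None:
            return len(self.source)
        return self.parent.num_partitions()

    def partition(self, i: int) -> List[int]:
        # Serve from memory when possible.
        if i in self.cache:
            return self.cache[i]
        # Otherwise rebuild the partition by replaying lineage.
        if self.parent is None:
            data = list(self.source[i])
        else:
            data = [self.fn(x) for x in self.parent.partition(i)]
        self.computations += 1
        self.cache[i] = data
        return data

    def collect(self) -> List[int]:
        return [x for i in range(self.num_partitions()) for x in self.partition(i)]

    def lose_partition(self, i: int) -> None:
        # Simulate a machine failure that drops one in-memory partition.
        self.cache.pop(i, None)


numbers = ToyDataset(source=[[1, 2], [3, 4], [5, 6]])
squares = numbers.map(lambda x: x * x)
shifted = squares.map(lambda x: x + 1)

first = shifted.collect()    # computes all three partitions
shifted.lose_partition(1)    # one partition is lost
second = shifted.collect()   # only the lost partition is recomputed
```

After the simulated failure, `second` equals `first`, and only one extra partition computation runs on `shifted`. The other two partitions are served from memory, and the parent `squares` is not recomputed because its copy of partition 1 is still cached.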

Common questions

What is Spark used for in system design?

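Spark is the component to reach for when a batch (or streaming) workload is too large for one machine. It processes data in memory across a cluster and recovers lost work through lineage, which is why designs that once relied on MapReduce now typically use Spark instead.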

Part of Glossary on SystemLore.