Jul 7, 2021
What’s Spark?

The definition says:
> Spark is a fast and general processing engine compatible with Hadoop data. It can run in Hadoop clusters through YARN or Spark’s standalone mode, and it can process data in HDFS, HBase, Cassandra, Hive, and any Hadoop InputFormat. It is designed to perform both batch processing (similar to MapReduce) and new workloads like streaming, interactive queries, and machine learning.
Basically, it is a framework for working with large amounts of data stored across distributed systems rather than on a single machine.
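
To make the "distributed instead of one machine" idea more concrete, here is a minimal sketch in plain Python. It does not use Spark's API at all: the function names (`split_into_partitions`, `count_words`, `merge_counts`, `distributed_word_count`) are hypothetical, and the "partitions" are just in-memory lists standing in for chunks of data that would live on different nodes. It only illustrates the general pattern of processing each chunk independently and then merging the partial results.

```python
from collections import Counter
from functools import reduce


def split_into_partitions(records, num_partitions):
    """Deal records round-robin into chunks (stand-ins for data on separate nodes)."""
    return [records[i::num_partitions] for i in range(num_partitions)]


def count_words(partition):
    """Work done independently on one partition (the 'map' side)."""
    counts = Counter()
    for line in partition:
        counts.update(line.split())
    return counts


def merge_counts(left, right):
    """Combine two partial results into one (the 'reduce' side)."""
    return left + right


def distributed_word_count(lines, num_partitions=3):
    """Count words by processing each partition separately, then merging."""
    partitions = split_into_partitions(lines, num_partitions)
    # In a real cluster each partition would be processed on a different machine;
    # here they are simply processed one after another in the same process.
    partial_results = [count_words(p) for p in partitions]
    return reduce(merge_counts, partial_results, Counter())


if __name__ == "__main__":
    sample = ["spark is fast", "spark runs on yarn", "hdfs stores data"]
    print(distributed_word_count(sample))
```

The point of the sketch is that no single step needs to see the whole dataset: each partition produces its own partial count, and only those small partial results have to be brought together at the end.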