Building Spark: build Spark using the Maven system.
Migration Guide: migration guides for Spark components.
Integration with other storage systems:
Hardware Provisioning: recommendations for cluster hardware.
Job Scheduling: scheduling resources across and within Spark applications.
Tuning Guide: best practices to optimize performance and memory use.
Monitoring: track the behavior of your applications.
Configuration: customize Spark via its configuration system.
Kubernetes: deploy Spark on top of Kubernetes.
YARN: deploy Spark on top of Hadoop NextGen (YARN).
Standalone Deploy Mode: launch a standalone cluster quickly without a third-party cluster manager.
Amazon EC2: scripts that let you launch a cluster on EC2 in about 5 minutes.
Submitting Applications: packaging and deploying applications.
Cluster Overview: overview of concepts and components when running on a cluster.
Spark SQL CLI: processing data with SQL on the command line.
PySpark: processing data with Spark in Python.
SparkR: processing data with Spark in R.
MLlib: applying machine learning algorithms.
Spark Streaming: processing data streams using DStreams (old API).
Structured Streaming: processing structured data streams with relational queries (using Datasets and DataFrames, newer API than DStreams).
Spark SQL, Datasets, and DataFrames: processing structured data with relational queries (newer API than RDDs).
RDD Programming Guide: overview of Spark basics: RDDs (core but old API), accumulators, and broadcast variables.
Quick Start: a quick introduction to the Spark API; start here!