Managed Hadoop cluster on EC2, including Spark, HBase, Presto, Flink, Hive, and more.
- Master node
- Core node: stores HDFS data and also runs tasks
- Task node: runs tasks only, stores no HDFS data
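The three node roles above map onto instance groups when a cluster is created. Below is a minimal sketch of the `InstanceGroups` list in the shape that boto3's EMR `run_job_flow` call accepts. The helper name `build_instance_groups`, the instance type, the counts, and the choice of Spot pricing for task nodes are illustrative assumptions, not something these notes specify.

```python
def build_instance_groups(core_count=2, task_count=2, instance_type="m5.xlarge"):
    """Build an InstanceGroups list (hypothetical helper, placeholder values)."""
    groups = [
        # Exactly one master node manages the cluster.
        {"Name": "Master", "InstanceRole": "MASTER", "Market": "ON_DEMAND",
         "InstanceType": instance_type, "InstanceCount": 1},
        # Core nodes hold HDFS data, so they are kept on-demand here.
        {"Name": "Core", "InstanceRole": "CORE", "Market": "ON_DEMAND",
         "InstanceType": instance_type, "InstanceCount": core_count},
    ]
    if task_count > 0:
        # Task nodes only process data, so losing one does not lose HDFS blocks.
        # Using Spot for them is an assumption made for this sketch.
        groups.append(
            {"Name": "Task", "InstanceRole": "TASK", "Market": "SPOT",
             "InstanceType": instance_type, "InstanceCount": task_count}
        )
    return groups


if __name__ == "__main__":
    for group in build_instance_groups():
        print(group["InstanceRole"], group["InstanceCount"], group["Market"])
```

The returned list would be passed as `Instances={"InstanceGroups": ...}` in a real `run_job_flow` request, alongside the other required parameters.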
Storage
- HDFS
- EMRFS (S3 as HDFS)
- Local FS
- EBS
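In practice, jobs usually pick one of these storage options through the URI scheme of a path. The sketch below classifies a path by scheme; the function name `storage_backend` and the example paths (including the bucket name) are hypothetical. EBS has no scheme of its own because EBS volumes are block devices attached to the instances, sitting underneath HDFS or the local file system.

```python
from urllib.parse import urlparse

# Mapping from URI scheme to the storage options listed above.
STORAGE_BY_SCHEME = {
    "hdfs": "HDFS",
    "s3": "EMRFS",
    "file": "Local FS",
}


def storage_backend(path):
    """Return the storage option a path refers to, based on its URI scheme."""
    scheme = urlparse(path).scheme
    # Paths without a scheme resolve to the cluster's configured default,
    # which this sketch does not try to guess.
    return STORAGE_BY_SCHEME.get(scheme, "unknown")


if __name__ == "__main__":
    for p in ("hdfs:///data/events", "s3://example-bucket/logs/", "file:///mnt/tmp"):
        print(p, "->", storage_backend(p))
```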
Spark
- Spark Streaming (w/ Kinesis)
- Spark SQL
- MLlib
- GraphX
- Spark Core
Notebooks
- Zeppelin
- EMR Notebooks (Jupyter-based)