Amazon S3 is an object storage service that stores data as objects within buckets. An object is a file and any metadata that describes the file. A bucket is a container for objects.
Requirements
- A bucket name must be globally unique
- An object has a key being the full path to that object
- The maximum size of an object is 5TB
- The maximum number of tags of an object is 10
Features
- S3 is the backbone of many ML services
- It is an ideal candidate for creating a data lake
- It has a centralized architecture
-
It supports partitioning for query speedup. For example:
-
by data
s3://bucket/year/month/day/file -
by product
s3://bucket/product
-
by data
Tiers
- S3 Standard General Purpose
- S3 Standard Infrequent Access(IA)
- S3 One Zone IA
- S3 Intelligent
- Glacier
Life Cycle
-
Transition actions
- For example: Standard -> IA
-
Expiration actions
- For example: IA -> Glacier
Security
^8a6482
Four methods of encryption:
- SSE-S3: managed by AWS/S3
- SSE-KMS: managed by AWS Key Management Service
- SSE-C: managed by Client
- Client Side Encryption
Access Control
- User based or resource based Access Control(Bucket Policies)
- VPC Endpoint Gateway
- Logging and Audit
- Tags