S3

#aws #s3 #storage

Amazon S3 is an object storage service that stores data as objects within buckets. An object is a file and any metadata that describes the file. A bucket is a container for objects.

Requirements

  • A bucket name must be globally unique
  • Each object has a key, which is the full path to that object within its bucket
  • The maximum size of an object is 5 TB
  • An object can have at most 10 tags

Features

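  • S3 is the backbone of many AWS ML services, which read their input data from it and write their output to it
  • It is an ideal candidate for creating a data lake
  • It acts as a centralized storage layer: all data lives in one place, decoupled from the compute that processes it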
  • It supports partitioning for query speedup. For example:
    • by date s3://bucket/year/month/day/file
    • by product s3://bucket/product
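
As a minimal sketch of the date-based layout above, the key can be built from a date so that every file for a given day shares one prefix. The helper names `partitioned_key` and `s3_uri` and the file name `events.csv` are made up for illustration; only the year/month/day/file layout comes from the note.

```python
from datetime import date


def partitioned_key(day: date, filename: str) -> str:
    """Build an object key laid out as year/month/day/file (zero-padded)."""
    return f"{day.year:04d}/{day.month:02d}/{day.day:02d}/{filename}"


def s3_uri(bucket: str, key: str) -> str:
    """Join a bucket name and an object key into an s3:// URI."""
    return f"s3://{bucket}/{key}"


# Hypothetical bucket and file names, for illustration only.
key = partitioned_key(date(2024, 3, 7), "events.csv")
print(s3_uri("bucket", key))  # s3://bucket/2024/03/07/events.csv
```

Zero-padding the month and day keeps keys in chronological order when listed lexicographically, and a query engine that filters on a date only has to read the matching prefix.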

Tiers

  • S3 Standard General Purpose
  • S3 Standard-Infrequent Access (IA)
  • S3 One Zone-IA
  • S3 Intelligent-Tiering
  • S3 Glacier

Life Cycle

  • Transition actions
    • For example: Standard -> IA
  • Expiration actions
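    • For example: delete objects 365 days after creation (moving objects from IA to Glacier is a transition action, not an expiration)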
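
A lifecycle rule can be written as a plain dictionary in the shape that boto3's `put_bucket_lifecycle_configuration` accepts. The sketch below only builds that dictionary; the prefix `logs/`, the day thresholds, and the helper name `lifecycle_rule` are assumptions chosen for illustration, and the boto3 call itself is left as a comment because it needs credentials and a real bucket.

```python
def lifecycle_rule(prefix: str, ia_after_days: int = 30,
                   glacier_after_days: int = 90,
                   expire_after_days: int = 365) -> dict:
    """Build one lifecycle rule: two transition actions and one expiration action."""
    return {
        "ID": f"tiering-{prefix.strip('/') or 'all'}",
        "Filter": {"Prefix": prefix},
        "Status": "Enabled",
        # Transition actions: move objects to cheaper storage classes.
        "Transitions": [
            {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
            {"Days": glacier_after_days, "StorageClass": "GLACIER"},
        ],
        # Expiration action: delete objects once they reach this age.
        "Expiration": {"Days": expire_after_days},
    }


config = {"Rules": [lifecycle_rule("logs/")]}

# Applying it would look roughly like this (bucket name is hypothetical):
# import boto3
# boto3.client("s3").put_bucket_lifecycle_configuration(
#     Bucket="my-bucket", LifecycleConfiguration=config)
```

The default of 30 days before moving to IA reflects the minimum time S3 requires an object to stay in Standard before that transition.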

Security

^8a6482

Four methods of encryption:

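  • SSE-S3: server-side encryption with keys managed by AWS/S3
  • SSE-KMS: server-side encryption with keys managed by AWS Key Management Service
  • SSE-C: server-side encryption with keys supplied and managed by the client
  • Client-side encryption: data is encrypted by the client before it is uploaded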
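
To make the difference between the methods concrete, the sketch below maps each one to the extra arguments a boto3 `put_object` call would take. The function name `encryption_args` and the mode strings are made up for this note. The parameter names (`ServerSideEncryption`, `SSEKMSKeyId`, `SSECustomerAlgorithm`, `SSECustomerKey`) are boto3's; the assumption that boto3 encodes the SSE-C key and computes its MD5 is worth checking against the boto3 documentation.

```python
from typing import Optional


def encryption_args(mode: str, kms_key_id: Optional[str] = None,
                    customer_key: Optional[bytes] = None) -> dict:
    """Return extra put_object keyword arguments for the chosen encryption mode."""
    if mode == "SSE-S3":
        return {"ServerSideEncryption": "AES256"}
    if mode == "SSE-KMS":
        args = {"ServerSideEncryption": "aws:kms"}
        if kms_key_id is not None:
            # Without a key ID, S3 falls back to the AWS managed key for S3.
            args["SSEKMSKeyId"] = kms_key_id
        return args
    if mode == "SSE-C":
        if customer_key is None or len(customer_key) != 32:
            raise ValueError("SSE-C needs a 256-bit (32-byte) key")
        # Assumption: boto3 handles key encoding and its MD5 on the caller's behalf.
        return {"SSECustomerAlgorithm": "AES256", "SSECustomerKey": customer_key}
    if mode == "CLIENT-SIDE":
        # The data is already encrypted before upload, so S3 needs no extra arguments.
        return {}
    raise ValueError(f"unknown encryption mode: {mode}")


# Usage (bucket and key names are hypothetical):
# import boto3
# boto3.client("s3").put_object(Bucket="my-bucket", Key="report.csv",
#                               Body=b"...", **encryption_args("SSE-S3"))
```

With SSE-C the same key has to be supplied again on every read, since S3 does not store it.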

Access Control

  • User-based (IAM policies) or resource-based (bucket policies) access control
  • VPC gateway endpoint, which keeps traffic to S3 inside the AWS network
  • Logging and Audit
  • Tags
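
Bucket policies and VPC endpoints can be combined: a resource-based policy can deny any request that does not arrive through a given endpoint. The sketch below builds such a policy document using the `aws:SourceVpce` condition key. The bucket name, the endpoint ID `vpce-1a2b3c4d`, and the helper name `vpce_only_policy` are placeholders, not values from this note.

```python
import json


def vpce_only_policy(bucket: str, vpce_id: str) -> dict:
    """Build a bucket policy that denies requests not coming through one VPC endpoint."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyAccessOutsideVpce",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # Cover both the bucket itself and every object in it.
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {"StringNotEquals": {"aws:SourceVpce": vpce_id}},
            }
        ],
    }


# Placeholder names; the JSON string is what would be attached to the bucket.
policy_json = json.dumps(vpce_only_policy("my-bucket", "vpce-1a2b3c4d"), indent=2)
print(policy_json)
```

A deny this broad also blocks access from the AWS console and from anywhere else outside the endpoint, so it is best tried on a test bucket first.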