In this topic, we are going to compare S3, RDS, DynamoDB, and SimpleDB in AWS.
- Many options exist
- Each has trade-offs
- Ideal use cases
- Know and embrace the limitations
- Keep up to date on developments
- S3 (Object Storage)
- RDS (managed MySQL, Oracle, SQL Server, ...)
- DynamoDB (NoSQL database)
S3 value props
- S3 (Simple Storage Service)
- RDS (Relational Database Service)
- Hands-off RDBMS (MySQL, Oracle, SQL Server)
- Manually scalable (like EC2)
- DynamoDB (“column family” oriented NoSQL)
- It deals very well with the “3 Vs” (variety, velocity, and volume) of big data
- It’s very hands-off and very simple to scale
- The value proposition of S3
- Static Storage
- It’s typically the cheapest option both to store (per GB-month) and to serve data; serving from S3 is cheaper than serving from an EC2 instance
- There is no limit on the number of objects you can create (it requires no “pre-provisioning” of size)
- It’s extremely durable (11 9’s of durability). It achieves that durability by leveraging multiple nodes in multiple availability zones, so when you put a file into S3 you are actually copying it many times into many availability zones.
- It’s also very scalable.
- All objects are natively web-accessible
- It can host static websites
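To get a feel for what “11 9’s” means, here is a back-of-the-envelope calculation. It is only a sketch: it assumes the durability figure is an annual, per-object probability and that objects are lost independently of each other. The function names are made up for this illustration.

```python
from fractions import Fraction


def annual_loss_probability(nines: int) -> Fraction:
    """Chance of losing one object in a year, given N nines of durability (assumed annual)."""
    return Fraction(1, 10 ** nines)


def expected_years_per_lost_object(num_objects: int, nines: int = 11) -> Fraction:
    """Average number of years between single-object losses (assumes independent losses)."""
    expected_losses_per_year = num_objects * annual_loss_probability(nines)
    return 1 / expected_losses_per_year


# With ten million objects you would expect to lose one object every 10,000 years.
print(expected_years_per_lost_object(10_000_000))  # 10000
```

Exact fractions are used instead of floats because `1 - 0.99999999999` loses precision in floating point.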
There are some limitations on S3 you need to be aware of.
AWS S3 Limitations:
- It comes with higher latency, especially compared with disks attached to EC2. An ephemeral or EBS disk will always be faster to access than S3.
- It is a write once, read many (WORM) file store: you put an entire object in and you take the entire object out. You can’t edit an object in place.
- It also cannot serve dynamic (PHP, Ruby) content
- You are limited to 100 buckets per account (a default quota that AWS has since made adjustable) and to objects no larger than 5 TB, which is a hard limit
- It provides only basic operations: read, write, list, delete, and so on.
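The whole-object behaviour described above can be sketched with a tiny in-memory model. This is not the S3 API; `TinyObjectStore` and its method names are hypothetical and exist only to show that changing an object means reading all of it and writing all of it back.

```python
class TinyObjectStore:
    """In-memory stand-in for a bucket; NOT the real S3 API."""

    def __init__(self):
        self._objects = {}  # key -> bytes

    def put(self, key: str, body: bytes) -> None:
        # A put always replaces the whole object stored under the key.
        self._objects[key] = bytes(body)

    def get(self, key: str) -> bytes:
        # A get always returns the whole object.
        return self._objects[key]

    def list_keys(self, prefix: str = "") -> list:
        return sorted(k for k in self._objects if k.startswith(prefix))

    def delete(self, key: str) -> None:
        self._objects.pop(key, None)


store = TinyObjectStore()
store.put("logs/app.log", b"line 1\n")

# "Appending" a line means: read everything, change it locally, write everything back.
current = store.get("logs/app.log")
store.put("logs/app.log", current + b"line 2\n")

print(store.get("logs/app.log"))  # b'line 1\nline 2\n'
print(store.list_keys("logs/"))   # ['logs/app.log']
```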
RDS Value Props:
- It is essentially managed MySQL, Oracle, SQL Server
- Disaster recovery (DR) is included
- Automated backups
- Real-time snapshots
- It is very cost effective
- Lower cost than “roll your own” EC2 + database
- You can also bring your own license with Oracle, SQL Server
- It is very scalable
- Provides scaling with “Instance Types” like EC2 and provisioned IOPS like EBS
There are some limitations on RDS you need to be aware of.
- It does have an upper limit on vertical scale
- Access is only through the SQL client interface
- No direct SSH or hardware control
- If you need direct control, you must use EC2 and install your own RDBMS on top of it.
| Resource | Default limit |
|---|---|
| Cluster parameter groups | 50 |
| Cross-region snapshot copy requests | 5 |
| Manual cluster snapshots | 100 |
| Read replicas per master | 5 |
| Rules per DB security group | 20 |
| Rules per VPC security group | 50 inbound, 50 outbound |
| DB security groups | 25 |
| VPC security groups | 5 |
| Subnets per subnet group | 20 |
| Tags per resource | 50 |
| Total storage for all DB instances | 100 TiB |
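One way to use the limits in the table above is to check a planned deployment against them before you build it. The following is a minimal sketch: the dictionary keys, the `over_limit` helper, and the sample plan are all made up for illustration, and only a few rows of the table are copied in.

```python
# A few limits copied from the table above (counts, except storage in TiB).
RDS_LIMITS = {
    "read_replicas_per_master": 5,
    "manual_cluster_snapshots": 100,
    "subnets_per_subnet_group": 20,
    "tags_per_resource": 50,
    "total_storage_tib": 100,
}


def over_limit(plan: dict) -> dict:
    """Return {name: (planned, limit)} for every planned value above its limit."""
    return {
        name: (planned, RDS_LIMITS[name])
        for name, planned in plan.items()
        if name in RDS_LIMITS and planned > RDS_LIMITS[name]
    }


# Hypothetical plan: eight read replicas is more than the table allows.
plan = {"read_replicas_per_master": 8, "tags_per_resource": 12, "total_storage_tib": 40}
print(over_limit(plan))  # {'read_replicas_per_master': (8, 5)}
```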
DynamoDB value props
- Great for “Three Vs” web scale:
- It is a “column family” key-value store
- It also supports strongly consistent reads (reads are eventually consistent by default)
- When you deploy DynamoDB in a region, your data is actually spread across multiple availability zones, and every change to the data is copied to the other AZs in the cluster
- DynamoDB value props (there’s more)
- Tuned via (independent) read and write throughput
- Extremely cost effective
- Very Easy to Administer
- Automatically re-partitions data in back-end
- It supports two lookup key types
- Hash key (unique ID)
- Hash + range key (unique ID + range, typically a date)
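The two key types can be pictured with a small in-memory model. This is a sketch only, not the DynamoDB API: `TinyTable`, its methods, and the sample user IDs and dates are hypothetical. Items are grouped by hash key and kept sorted by range key, which is what makes a range query within one hash key cheap.

```python
import bisect
from collections import defaultdict


class TinyTable:
    """In-memory model of a hash + range keyed table; NOT the DynamoDB API."""

    def __init__(self):
        # hash key -> list of (range_key, item), kept sorted by range key
        self._partitions = defaultdict(list)

    def put(self, hash_key, range_key, item):
        rows = self._partitions[hash_key]
        keys = [r for r, _ in rows]
        i = bisect.bisect_left(keys, range_key)
        if i < len(rows) and rows[i][0] == range_key:
            rows[i] = (range_key, item)  # same primary key: overwrite
        else:
            rows.insert(i, (range_key, item))

    def get(self, hash_key, range_key):
        """Exact lookup on the full primary key."""
        for r, item in self._partitions.get(hash_key, []):
            if r == range_key:
                return item
        return None

    def query(self, hash_key, start, end):
        """All items for one hash key whose range key is in [start, end]."""
        rows = self._partitions.get(hash_key, [])
        keys = [r for r, _ in rows]
        lo = bisect.bisect_left(keys, start)
        hi = bisect.bisect_right(keys, end)
        return [item for _, item in rows[lo:hi]]


likes = TinyTable()
likes.put("user-42", "2024-01-03", {"page": "/a"})
likes.put("user-42", "2024-01-10", {"page": "/b"})
likes.put("user-42", "2024-02-01", {"page": "/c"})
likes.put("user-7", "2024-01-05", {"page": "/a"})

print(likes.get("user-42", "2024-01-10"))                  # {'page': '/b'}
print(likes.query("user-42", "2024-01-01", "2024-01-31"))  # [{'page': '/a'}, {'page': '/b'}]
```

ISO-formatted date strings are used as the range key because they sort in chronological order.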
There are some limitations on DynamoDB you need to be aware of.
- Limited querying
- Lookup only on the primary key (hash or hash + range)
- The only operations you can perform are Get, Put, Update, BatchWrite, Query (on the range), and Scan (which is slow and not a good idea)
- There is no limit on table size (volume), so you can scale to very large volumes
- Each row (item) is limited in size: 64 KB originally, since raised by AWS to 400 KB
- Complicated (but reasonable) pricing model
- No limit on throughput
- It is provisioned by the user
- You obviously pay for higher limits
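Since throughput is provisioned by the user, it helps to estimate how much to provision. The sketch below follows the capacity-unit rules as currently documented by AWS, to the best of my understanding: one read unit covers one strongly consistent read per second of an item up to 4 KB (an eventually consistent read costs half), and one write unit covers one write per second of an item up to 1 KB. The function names and the sample workload are made up for illustration.

```python
import math


def read_capacity_units(item_kb: float, reads_per_sec: int,
                        strongly_consistent: bool = True) -> int:
    """Read units needed; each read is rounded up to the next 4 KB."""
    units_per_read = math.ceil(item_kb / 4)
    total = units_per_read * reads_per_sec
    # Eventually consistent reads cost half as much.
    return total if strongly_consistent else math.ceil(total / 2)


def write_capacity_units(item_kb: float, writes_per_sec: int) -> int:
    """Write units needed; each write is rounded up to the next 1 KB."""
    return math.ceil(item_kb) * writes_per_sec


# Hypothetical workload: 6 KB items read 100 times/sec, 2.5 KB items written 50 times/sec.
print(read_capacity_units(6, 100))                             # 200
print(read_capacity_units(6, 100, strongly_consistent=False))  # 100
print(write_capacity_units(2.5, 50))                           # 150
```

Because reads and writes are sized and billed independently, the two numbers can be tuned separately.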
Comparing the systems with each other:
- Static files/backups? Use S3
- Also, consider setting up a CDN through CloudFront!
- RDBMS interactions that need ACID compliance and/or ad-hoc query capability? Use RDS
- “Three Vs” issue? Use DynamoDB
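Following the rest of this topic, the questions above map to S3 for static files and backups, RDS for ACID or ad-hoc query needs, and DynamoDB for “Three Vs” workloads. As a rough sketch, that rule of thumb can be written as a small helper; `pick_storage` and its parameters are hypothetical names, and a real decision would weigh cost and limits too.

```python
def pick_storage(static_files: bool = False,
                 needs_acid_or_ad_hoc_queries: bool = False,
                 three_vs: bool = False) -> list:
    """Return the services suggested by the three questions (rule of thumb only)."""
    choices = []
    if static_files:
        choices.append("S3")  # optionally with CloudFront in front as a CDN
    if needs_acid_or_ad_hoc_queries:
        choices.append("RDS")
    if three_vs:
        choices.append("DynamoDB")
    return choices


# A single web app can easily answer "yes" to more than one question.
print(pick_storage(static_files=True, three_vs=True))  # ['S3', 'DynamoDB']
```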
Given the use cases for each service, you can see that inside a single web app you may want to use all three together.
For example, in a website you may:
- Store orders, customers, “ad hoc” & reporting type data in RDS
- Pull “hot” tables (site likes, social interactions) out into DynamoDB
- Use a unique key like “user-id” for the primary key in Dynamo
Let’s cover some things you probably never want to do with these different storage options:
- Serving static content out of anything besides S3
- Slower, more expensive, less redundant
- Trying to make RDS do “web scale”; you will run into a lot of problems
- Pull hot tables (at least) out into DynamoDB
- Trying to use DynamoDB for anything but “web scale”
- You are going to find SQL (RDBMS) is better understood and more flexible
- “Rolling your own” without good reason is also not a good idea
- The AWS services will be easier to use and manage
- SimpleDB is a NoSQL datastore that preceded DynamoDB
- It is still available inside AWS
- It is just not very performant
- The reason for the lack of performance: indexes on multiple fields
- That indexing meant wide variations in performance
- And every time it indexes, it slows down a little
- DynamoDB is now the preferred way to do NoSQL
- SimpleDB still has (a dwindling number of) use cases
- If you aren’t worried about the varying response time (internal apps)
Storage in AWS: Roll your own
- Some customers decide to install their own RDBMS
- An RDBMS they really need to control
- Some folks also decide to install their own NoSQL DB such as MongoDB, Cassandra, HBase (these are popular)
- If you need a large distributed file system.
- Tie together many EC2/EBS Volumes via GlusterFS
- Each storage type has tradeoffs and limitations
- Make sure you’re using the right type for the right reason
- Make sure you understand what you’re losing if you “roll your own”
- Native integration with other AWS services
This topic covered how to compare S3, RDS, DynamoDB, and SimpleDB in AWS.