Notes on Michael Stonebraker’s webinar for the ACM: The Fast Data Challenge and Picking the Right Database: Why One Size Doesn’t Fit All.
The Expert Guide to Fast Data
Characteristics of Fast Data
High message rate:
- 1,000 transactions per second – use RDBMS
- 100,000 transactions per second – interesting
Transactions = messages
According to the TPC, these are the top results (as of 2015/05/02):
| Benchmark | Top Result | Database Manager | Performance | TPS | Report Date |
|---|---|---|---|---|---|
| TPC-C | SPARC T5-8 Server | Oracle 11g Release 2 Enterprise Edition with Oracle Partitioning | 8,552,523 tpmC | 142,542 | 2013/03/26 |
| TPC-E | System x3950 X6 | Microsoft SQL Server 2014 Enterprise Edition | 9,145.01 | 9,145 | 2014/11/25 |
| TPC-H | Dell PowerEdge R720xd using EXASolution 5.0 | EXASOL EXASolution 5.0 | 11,612,395 QphH@100000GB | 3,225 | 2014/09/23 |
| TPC-VMS | HP Proliant DL380 Gen8 Intel® Xeon® E5-2697v2 C/S with 1 Proliant DL360 G7 | Microsoft SQL Server 2014 Enterprise Edition | 718.12 VMStpsE | 718 | 2014/04/14 |
A rate of 100,000 transactions per second (tps) is the same as 6,000,000 transactions per minute (tpm), which the Oracle TPC-C result easily surpassed in 2013. That configuration was not clustered, so it had no protection against a server failure.
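The conversions behind these figures can be checked with a short sketch. The function and constant names below are my own, not from the webinar or the TPC. Note that QphH is a composite queries-per-hour metric, so dividing it by 3,600 gives only a rough per-second figure, not a transaction rate comparable to tpmC.

```python
# Throughput unit conversions for the benchmark figures quoted in these notes.
# All names here are illustrative assumptions.

def per_minute_to_per_second(rate_per_minute: float) -> float:
    return rate_per_minute / 60

def per_hour_to_per_second(rate_per_hour: float) -> float:
    return rate_per_hour / 3600

def per_second_to_per_minute(rate_per_second: float) -> float:
    return rate_per_second * 60

ORACLE_TPMC = 8_552_523    # TPC-C top result, per minute
EXASOL_QPHH = 11_612_395   # TPC-H top result, per hour (composite metric)
FAST_DATA_TPS = 100_000    # the "interesting" message rate from the webinar

oracle_tps = per_minute_to_per_second(ORACLE_TPMC)
exasol_qps = per_hour_to_per_second(EXASOL_QPHH)
threshold_tpm = per_second_to_per_minute(FAST_DATA_TPS)

print(f"TPC-C result: about {int(oracle_tps):,} per second")
print(f"TPC-H result: about {int(exasol_qps):,} per second")
print(f"100,000 tps equals {threshold_tpm:,} per minute")
print(f"TPC-C result exceeds the threshold: {ORACLE_TPMC > threshold_tpm}")
```

The per-second values are truncated rather than rounded, which matches how the TPS column in the table was derived.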
Requirements for Fast Data Applications
- Scale-out is preferred over scale-up.
- A high-level language such as SQL, with windowing aggregates.
- High availability, as on Tandem systems.
- Zero-data loss.
- Data consistency — requires ACID transactions.
- Eventual consistency does not work!
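To make the "windowing aggregates" requirement concrete, here is a minimal sketch of a time-based sliding-window aggregate in plain Python. It is not taken from the webinar or from any particular product; the class name and its methods are hypothetical, and it assumes events arrive in non-decreasing timestamp order. A SQL engine with window support would let you state the same thing declaratively instead of hand-coding it.

```python
from collections import deque

class SlidingWindowSum:
    """Count and sum of the values seen in the last `window_seconds` seconds.

    Hypothetical helper for illustration; assumes timestamps never decrease.
    """

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        self.events = deque()  # (timestamp, value) pairs, oldest first
        self.total = 0

    def add(self, timestamp, value):
        """Record a new event, then drop events that fell out of the window."""
        self.events.append((timestamp, value))
        self.total += value
        cutoff = timestamp - self.window_seconds
        while self.events and self.events[0][0] <= cutoff:
            _, old_value = self.events.popleft()
            self.total -= old_value

    def snapshot(self):
        """Return (count, sum) for the events currently inside the window."""
        return len(self.events), self.total


window = SlidingWindowSum(window_seconds=10)
window.add(0, 5)
window.add(4, 7)
window.add(12, 1)  # the event at t=0 is now older than 10 seconds
count, total = window.snapshot()
print(count, total)
```

Each message costs amortized constant time, because every event is appended once and evicted at most once.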
Non-Solutions for Fast Data
RDBMSs are not a solution because 90% of their work is overhead, caused by:
- Buffer pool overhead
- Locking overhead
- Write-ahead log overhead
- Threading overhead
Disk-based systems are slow.
NoSQL systems are not a solution because:
- Low-level language
- No ACID
- Buffer pool and threading overhead still present
- Low performance and low function
Solutions for Fast Data
- High-performance main-memory SQL-ACID DBMS (VoltDB, Hekaton, HANA, …)
- Complex event processing (CEP) engine (Storm, StreamBase, …)
I wonder how the in-memory database option for Oracle RDBMS 12c would stack up under these conditions for fast data.
CEP is natural for “little state, big patterns”.
A main-memory SQL-ACID DBMS is natural for “big state, little patterns”.
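One way to picture the contrast is the toy sketch below. It is my own illustration, not an example from the webinar, and every name in it is hypothetical. The first function keeps almost no state (a single counter) but looks for a pattern spanning several events, which is the CEP shape. The second keeps a large keyed state and applies a simple check-and-update per message, which is the shape a main-memory DBMS would run as an ACID transaction. The sketch does not implement ACID guarantees itself.

```python
# Toy illustration only; none of these names come from the webinar.

def detect_runs(events, kind="fail", run_length=3):
    """CEP-style: tiny state (one counter), pattern spans several events.

    Yields the index of the event that completes a run of `run_length`
    consecutive events equal to `kind`.
    """
    streak = 0
    for position, event in enumerate(events):
        streak = streak + 1 if event == kind else 0
        if streak == run_length:
            yield position


def apply_transfers(balances, transfers):
    """DBMS-style: large keyed state, trivial per-message logic.

    Each transfer is applied only if the source has enough funds. A real
    DBMS would run each one as an ACID transaction; this sketch does not.
    """
    for source, target, amount in transfers:
        if balances.get(source, 0) >= amount:
            balances[source] -= amount
            balances[target] = balances.get(target, 0) + amount
    return balances


alerts = list(detect_runs(["ok", "fail", "fail", "fail", "fail", "ok"]))
accounts = apply_transfers(
    {"a": 100, "b": 0},
    [("a", "b", 30), ("b", "a", 50), ("a", "b", 80)],
)
print(alerts, accounts)
```

In the first case the hard part is the pattern logic. In the second it is keeping a large, shared state correct under a high update rate.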