James Anthony
Technology Director
Introduction
I'm pretty sure a LOT is going to be written about the InMemory option for 12c, released in July this year. We at Red Stack Tech have been lucky enough to be part of the beta programme and have therefore been using it for a few months now, and I've got to say it's pretty awesome! In my mind this is the biggest thing to happen to the database since RAC arrived. I was asked to put this article together to give an intro to the InMemory option, the concepts and some of the performance gains. If you're interested to know more, or even want to try it out, drop me a line at james.anthony@redstk.com.
Before I start, a quick word of warning: there is a lot left unsaid in this article, and it's definitely not a deep dive. I was asked to keep this article short, and failed, but even so a lot of pruning has had to go on!
I remember a few years back a lot of fuss was being made about column store databases in the warehousing space, but much of that came to nothing when people started getting impacted by the "column cliff". Oracle themselves introduced HCC to provide some of the benefits of columnar storage (namely the fact that compression works better in column storage than in traditional row storage). In the last couple of years in-memory has become an increasing trend, driven by ever-falling RAM prices and the ability of modern CPUs to address increased amounts of main memory.
The 12c InMemory option merges these two concepts, because at heart it's an in-memory column store. Simply put, the RDBMS engine maintains a separate pivoted view of your data in memory, held in column format. And don't worry: through a journaling process the row cache (your current buffer cache in the SGA) and the column store are kept in sync.
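Before anything can be populated, the store itself needs memory carved out of the SGA. As a quick sketch (the 20G figure is purely illustrative), this is sized with the INMEMORY_SIZE initialisation parameter and takes effect at the next restart:
SQL> ALTER SYSTEM SET inmemory_size = 20G SCOPE=SPFILE;  -- size illustrative; requires a restart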
The (incredibly simplified) figure 1 shows this, with an amount of data held in a block shown with the dotted boxes. On the left you can see the traditional row storage format that would be (and still is) held in the buffer cache; each row is then pivoted and the data held within the column store.
Figure 1
When a predicate is applied (for example order_value > x) the column store can then be queried, with the optimizer only required to scan the values for a single column; in the row store, by contrast, the other columns must also be superfluously read to filter on that predicate. Enhancements such as SIMD processing, compression and min/max pruning (covered later) provide significant speed-up to this processing.
At this stage you're possibly thinking that if I have xGB of data then I need xGB for the column store, but that's not the end of the story. For starters (and crucially) the InMemory option does NOT require all of a table to be within the column store! The optimizer can seamlessly work with a query where part of the data is within the column store and part of it still resides on disk (indeed, on initial querying the data may not yet be in the column store, and this is exactly what will happen whilst the store is populated in the background). It's worth noting as well that you can choose to put just given partitions into the column store.
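As a sketch (table and partition names are illustrative), putting a single partition in memory looks like this:
SQL> ALTER TABLE orders MODIFY PARTITION orders_2014 INMEMORY;  -- names illustrative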
Multiple Predicates
Anyone who has worked on Exadata will know just how powerful the storage
indexes maintained at the cell level are. InMemory brings a similar capability.
For InMemory, the min and max values are stored for each InMemory Compression Unit (IMCU); these IMCUs are the storage format (similar to an Oracle block, but much larger) within the column store.
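The pruning work done against these min/max values shows up in session statistics, and you'll see exactly these counters in the results later on. A quick sketch of pulling them out for your own session, using the standard v$mystat/v$statname join:
SQL> SELECT n.display_name, s.value
  2  FROM   v$mystat s
  3  JOIN   v$statname n ON n.statistic# = s.statistic#
  4  WHERE  n.display_name IN ('IM scan CUs pruned',
  5                            'IM scan segments minmax eligible');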
Dropping indexes/Removing reporting databases/Operational efficiencies
Whilst a lot of the headlines around InMemory will clearly be around what are going to be some extraordinary performance gains for reporting/analytical workloads, it's worth noting the impact on OLTP and general efficiencies.
Within a typical database a large portion of the space used will be for indexing (go on, just do a quick query on dba_extents and group it by segment type to figure out your value). These indexes not only increase the size of the database but also slow down OLTP operations, as they need maintaining (especially where we are inserting new rows). 12c InMemory gives us the opportunity to totally remove the indexes needed to service reporting and querying workloads, allowing our databases to be smaller (backing up, recovering and cloning faster, and aggregating gains across non-production environments), but also accelerating OLTP by reducing index maintenance operations.
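As a sketch, that space check looks something like this:
SQL> SELECT segment_type,
  2         ROUND(SUM(bytes)/1024/1024/1024, 1) AS gb
  3  FROM   dba_extents
  4  GROUP  BY segment_type
  5  ORDER  BY gb DESC;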
By the same token, we see a lot of organisations who run separate reporting databases or ODS systems to offload reporting from production. I firmly believe that InMemory is going to change the game here, allowing organisations to report from the real-time data (eliminating lag), shifting the compute power and Oracle licencing from these reporting databases to the production system, and reducing the amount of operational work the DBAs and administrators must do to manage these ancillary datastores.
InMemory also takes advantage of modern CPU features, in particular SIMD (Single Instruction, Multiple Data) vector processing, which builds on instruction pipelining. The following diagram illustrates this more clearly, showing the stages an instruction goes through in the CPU (fetch, decode, execute, store result) and how pipelining ensures that no idle time is encountered. In the first example (non-pipelined) you can see how each phase completes before the next begins, with each phase consuming a clock cycle. In the pipelined example below you can see how the different parts of the CPU are used in parallel to process more operations in a given space of time.
Figure 2: Traditional vs. Pipelined instruction execution
SIMD processing is particularly good for the type of columnar scans performed in the 12c InMemory Database, allowing the repetitive task of evaluating a predicate against several rows' worth of data to be completed in a single-pass operation, as opposed to having each tuple evaluated separately in a scalar operation (one instruction to process one data value). One of the drawbacks of SIMD is that differing operations cannot be applied to the data values, but that doesn't matter here, as the same operation is being applied to each value. By using the Intel (and other vendors') optimisations for SIMD vector processing, the Oracle 12c InMemory code is able to scan a greater number of data values with each CPU operation, significantly improving throughput.
Figure 3
Simplicity
Putting stuff into the InMemory column store couldn't be easier; we just alter the table using the inmemory clause as follows:
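(table name illustrative)
SQL> ALTER TABLE lineorder INMEMORY;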
We can also specify a subset of columns; for example, in the following we put all columns of a table into memory except one:
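(table and column names illustrative)
SQL> ALTER TABLE lineorder INMEMORY NO INMEMORY (lo_commentary);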
I was asked to keep this article short, so I won't expand too much, but once this has been issued the first query against the table will begin the loading into memory. It's also possible to use the INMEMORY PRIORITY attribute of a table to specify that it should be loaded into memory preferentially at database startup, based on its priority level (LOW, MEDIUM, HIGH or CRITICAL):
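(table name illustrative)
SQL> ALTER TABLE lineorder INMEMORY PRIORITY CRITICAL;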
Compression Levels
I'm not going to labour too much on compression within Oracle, as it has been done to death in many articles. Suffice to say one of the key advantages of column storage is that compression levels through de-duplication are higher than those of row storage. The InMemory option allows for differing levels of compression to be applied (using the MEMCOMPRESS keyword):
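(table name illustrative; FOR QUERY LOW is the default level)
SQL> ALTER TABLE lineorder INMEMORY MEMCOMPRESS FOR QUERY LOW;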
To give you an idea on compression rates, we applied these to some data tables we use for demonstration purposes and got a 9.5x compression rate on a 23m row orders table, with some of the dimension tables getting 30x compression. As always your mileage will vary depending on the data (and often on the ordering of the data), but running the following query will yield your compression ratios:
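(a sketch using the v$im_segments view, once population has completed; the ratio is on-disk bytes over in-memory size)
SQL> SELECT segment_name,
  2         ROUND(bytes / inmemory_size, 1) AS comp_ratio
  3  FROM   v$im_segments
  4  WHERE  bytes_not_populated = 0;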
NOTE: The results above used the default compression of FOR QUERY LOW, enabling maximum throughput.
The Results!
So you're probably keen to see just how fast this makes things, right? Well, let's go back to that 23m row table of orders and contrast query performance against data held entirely in the SGA buffer cache (so no physical IO; it's all logical IO).
First, an example of scanning an entire column. Remember, in this case we will be able to just read the compressed lo_ordtotalprice column from the column store, but we won't be able to use any of the column index optimisations.
SQL> select max(l_extendedprice) from h_lineitem
  2  /

MAX(L_EXTENDEDPRICE)
--------------------
104948.5

Elapsed: 00:00:01.89
Now running the same query against the InMemory column store:
SQL> select max(l_extendedprice) from h_lineitem
2 /
MAX(L_EXTENDEDPRICE)
--------------------
104948.5
Elapsed: 00:00:00.02
Execution Plan
----------------------------------------------------------
Plan hash value: 192022634

-------------------------------------------------------------------
| Id | Operation          | Name | Rows | Bytes |    TQ  |IN-OUT|
-------------------------------------------------------------------
|  1 |  SORT AGGREGATE    |      |    1 |     6 |        |      |
|  2 |   PX COORDINATOR   |      |      |       |        |      |
|  4 |    SORT AGGREGATE  |      |    1 |     6 |  Q1,00 | PCWP |
-------------------------------------------------------------------
Statistics
----------------------------------------------------------
9 recursive calls
0 db block gets
75 consistent gets
0 physical reads
0 redo size
That's a reduction in query time from 1.89 seconds to a meagre 0.02 seconds (to scan 23m records)... and that's memory vs. memory! Pretty impressive!
The observant amongst you are probably right now suggesting it's not a fair test, as we'd have an index on that field. Well, yep, I agree, but a) this is a very simple test with no predicate, and b) a quick test showed the results vs. an index on that column are almost identical. Meanwhile the index consumes another 16% of the space already taken up by the table, so dropping it (and any other indexes) has a big impact on database size (backup, recovery, cloning etc.) and OLTP insert/update performance.
Let's go with another example, this time adding a predicate to let us accelerate the query by filtering against the min/max values stored at IMCU level. Again, please note this is memory vs. memory and all logical IO, no physical, running against just over 375m rows in this case.
Elapsed: 00:00:55.94
SQL> ALTER SESSION set inmemory_query = enable;
Session altered.
Elapsed: 00:00:00.00
Elapsed: 00:00:00.63
DISPLAY_NAME                                                     VALUE
---------------------------------------------------------------- ----------
IM scan CUs predicates optimized 7
IM scan CUs optimized read 0
IM scan CUs pruned 7
IM scan segments minmax eligible 372
Let's run another example, this time using the order value (against 96m rows in this case):
DISPLAY_NAME VALUE
---------------------------------------------------------------- ----------
IM scan CUs predicates optimized 72
IM scan CUs optimized read 0
IM scan CUs pruned 72
IM scan segments minmax eligible 91
From the stats above we can see that for my very simple query, on a single run, we only evaluated 19 of the 91 storage chunks in memory (the InMemory Compression Units, or IMCUs, introduced earlier), and eliminated 72 of them! So even though we know from the previous example that we can scan columns quickly, not reading almost 80% of the data is always going to help!
Conclusion
This article barely scratches the surface of our testing on InMemory, and whilst I was asked to keep it short, I've failed to do that! So if you've got any questions, drop me a line at james.anthony@redstk.com.
Contact Red Stack Tech for more information
UK Headquarters:
3rd Floor
Farr House
27-30 Railway Street
Chelmsford
Essex
England
CM1 1QS
Australia Headquarters:
Suite 3
Level 19
141 Queen Street
Brisbane
QLD 4000
Email: contactus@redstk.com
Web: www.redstk.com
Media Enquiries:
Elizabeth Spencer
elizabeth.spencer@redstk.com
01245 200 532