Data Model

Cassandra: Non-relational data model. Typically a row-name, column-name, and timestamp are sufficient to uniquely map to a value in the database.

ParAccel: Relational data model. It provides a declarative method for specifying data and queries: users directly state what information the database contains and what information they want from it, and let the database management system software take care of describing data structures for storing the data and retrieval procedures for answering queries.

Independence of Columns

Cassandra: It stores parts of a data entity or row in separate column-families, and has the ability to access these column-families separately. This means that not all parts of a row are picked up in a single I/O operation from storage, which is considered a good thing if only a subset of a row is relevant for a particular query. However, column-families may consist of many columns, and the columns within a column-family are not independently accessible.

ParAccel: It stores columns from a traditional relational database table separately so that they can be accessed independently. As with Cassandra, this is useful for queries that only access a subset of table attributes. However, the main difference is that every column is stored separately, instead of families of columns as in Cassandra (this statement ignores fine-grained hybrid options within ParAccel).

Interface

Cassandra: It is distinguished by being part of the NoSQL movement and does not typically have a traditional SQL interface.

ParAccel: It supports standard SQL interfaces.

Optimized workload

Cassandra: It can handle a more diverse set of application requirements, such as a much higher rate of updates. It generally does better for individual row queries, and does not perform well on aggregation-heavy workloads. It can put attributes that tend to be co-accessed in the same column-family; this saves the seek cost that results from column-stores needing to find different attributes from the same row in many different places.

ParAccel: It is optimized for read-mostly analytical workloads. These systems support reasonably fast load times, but high update rates tend to be problematic. Hence, data warehouses are an ideal market for ParAccel, since they are typically bulk-loaded, require many complex read queries, and are updated rarely. It tends to struggle on workloads that get or put individual rows in the data set, but thrives on big aggregations and summarizations that require scanning many rows as part of a single query.
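
The storage trade-off described above can be illustrated with a toy in-memory model. This is only a sketch and not how either system is implemented: the sample rows, the grouping of attributes into the "profile" and "stats" column-families, and every function name are hypothetical, and the "reads" counter simply counts storage units touched as a rough stand-in for I/O operations.

```python
from collections import defaultdict

# Hypothetical sample table: three rows, four attributes.
ROWS = {
    "u1": {"name": "Ann", "email": "ann@example.com", "age": 31, "visits": 10},
    "u2": {"name": "Bob", "email": "bob@example.com", "age": 45, "visits": 3},
    "u3": {"name": "Cy", "email": "cy@example.com", "age": 27, "visits": 8},
}

# Hypothetical grouping of co-accessed attributes into column-families.
FAMILIES = {"profile": ["name", "email"], "stats": ["age", "visits"]}


def build_family_layout(rows, families):
    """Column-family style: one storage unit per (row key, column-family)."""
    layout = {}
    for key, attrs in rows.items():
        for fam, cols in families.items():
            layout[(key, fam)] = {c: attrs[c] for c in cols}
    return layout


def build_column_layout(rows):
    """Column-store style: one storage unit per column, covering all rows."""
    layout = defaultdict(dict)
    for key, attrs in rows.items():
        for col, value in attrs.items():
            layout[col][key] = value
    return dict(layout)


def family_get_row(layout, families, key):
    """Fetch a whole row: one unit read per column-family."""
    row, reads = {}, 0
    for fam in families:
        row.update(layout[(key, fam)])
        reads += 1
    return row, reads


def column_get_row(layout, key):
    """Fetch a whole row: one unit read per column."""
    row, reads = {}, 0
    for col, values in layout.items():
        row[col] = values[key]
        reads += 1
    return row, reads


def family_sum(layout, families, col):
    """Aggregate one column: must touch that column's family unit in every row."""
    fam = next(f for f, cols in families.items() if col in cols)
    total, reads = 0, 0
    for (_key, unit_fam), unit in layout.items():
        if unit_fam == fam:
            total += unit[col]  # the unit also carries the family's other columns
            reads += 1
    return total, reads


def column_sum(layout, col):
    """Aggregate one column: a single unit holds every row's value."""
    return sum(layout[col].values()), 1


if __name__ == "__main__":
    fam_layout = build_family_layout(ROWS, FAMILIES)
    col_layout = build_column_layout(ROWS)
    print(family_get_row(fam_layout, FAMILIES, "u1")[1])  # units read for one row
    print(column_get_row(col_layout, "u1")[1])
    print(family_sum(fam_layout, FAMILIES, "visits"))     # (total, units read)
    print(column_sum(col_layout, "visits"))
```

In this toy model, fetching row "u1" touches 2 units in the column-family layout and 4 in the per-column layout, while summing "visits" touches 3 units (one per row) in the column-family layout and 1 in the per-column layout. That mirrors the qualitative claim above: grouping co-accessed attributes favors individual row lookups, while storing each column separately favors scans and aggregations.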