## Partitioning Edition

The icCube server partitioning edition is dedicated to large schemas, i.e. facts with more than
1 billion rows, so that you can get insights from very large amounts of data. To that end, this edition
mainly improves both the time needed to load such cubes and the speed of MDX requests against a large number of facts.

### Table Partitioning

Table partitioning is currently supported for relational database tables, multi-files tables and MongoDB tables;
please contact icCube support if you'd like another type of datasource to support this feature.

The following assumes DB tables, but the concepts are similar for all datasources supporting partitions.

When loading large schemas, the bottleneck is most likely the speed of the underlying DB server.
One way to improve the loading time is to take advantage of DB table partitioning. icCube generates
one SQL loading request per actual partition and processes these requests in parallel (the actual
number of parallel requests is a configuration property). Note that while other icCube editions are
able to load several tables in parallel (i.e. the asynchronous and parallel table processing feature),
only this edition is able to process a given table using several parallel SQL load requests.

Even if the underlying DB table is not partitioned, it might be worth checking whether parallel
requests improve the loading time of the cube. Indeed, the DB server might still be able to
process those SQL requests efficiently.

Activating table partitioning is achieved by defining a `Partitioning Column` as shown on the following
picture. How does it work? icCube uses a `SELECT DISTINCT partitioning-column FROM data-table`
SQL statement to determine the actual partition keys and generates one SQL request per partition key value.
Note that there is a configuration property (`maxTablePartitionCount`) defining the maximum number of rows
returned by this `SELECT DISTINCT` request to prevent creating too many partitions. Its default value is 1024;
creating too many partitions makes no real sense and might be inefficient.
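The mechanism described above can be sketched as follows. This is only an illustration in Python against a temporary SQLite file, not icCube's actual implementation: the `sales` table, its `year` partitioning column, the helper names and the `parallelism` parameter are all assumptions made for the example. Only the `SELECT DISTINCT` step and the 1024 default come from the text above.

```python
import os
import sqlite3
import tempfile
from concurrent.futures import ThreadPoolExecutor
from contextlib import closing

# Mirrors the documented default of maxTablePartitionCount.
MAX_TABLE_PARTITION_COUNT = 1024


def discover_partition_keys(db_path, table, column, limit=MAX_TABLE_PARTITION_COUNT):
    """Return the distinct partition keys, refusing to exceed the limit."""
    # Table and column names are trusted identifiers in this sketch.
    sql = f"SELECT DISTINCT {column} FROM {table} ORDER BY {column}"
    with closing(sqlite3.connect(db_path)) as conn:
        keys = [row[0] for row in conn.execute(sql)]
    if len(keys) > limit:
        raise ValueError(f"{len(keys)} partitions exceed the limit of {limit}")
    return keys


def load_partition(db_path, table, column, key):
    """Load one partition using its own connection (one SQL request per key)."""
    sql = f"SELECT * FROM {table} WHERE {column} = ?"
    with closing(sqlite3.connect(db_path)) as conn:
        return key, conn.execute(sql, (key,)).fetchall()


def load_partitioned(db_path, table, column, parallelism=4):
    """Discover the partition keys, then load all partitions in parallel."""
    keys = discover_partition_keys(db_path, table, column)
    with ThreadPoolExecutor(max_workers=parallelism) as pool:
        results = pool.map(lambda k: load_partition(db_path, table, column, k), keys)
        return dict(results)


def build_sample_db():
    """Create a small hypothetical fact table in a temporary SQLite file."""
    fd, path = tempfile.mkstemp(suffix=".db")
    os.close(fd)
    with closing(sqlite3.connect(path)) as conn:
        conn.execute("CREATE TABLE sales (year INTEGER, amount REAL)")
        conn.executemany(
            "INSERT INTO sales VALUES (?, ?)",
            [(2021, 10.0), (2021, 5.0), (2022, 7.5), (2023, 1.0)],
        )
        conn.commit()
    return path


db_path = build_sample_db()
try:
    partitions = load_partitioned(db_path, "sales", "year", parallelism=2)
    for key, rows in sorted(partitions.items()):
        print(key, len(rows))
finally:
    os.remove(db_path)
```

Each worker opens its own connection, so the DB server sees independent requests. Whether this actually speeds up loading depends on how well the server handles concurrent scans, which is why it is worth measuring even on a table that is not partitioned.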

### Facts Partitioning

Typically, fact tables are several orders of magnitude larger than dimension tables. Therefore, for now,
partitioning in the context of icCube is about facts partitioning.

This edition uses an improved columnar data storage that improves performance for 'partitioned' MDX queries
and removes the ~1 billion rows constraint of the other enterprise editions.

Without revealing too many secrets about the guts of icCube's in-memory engine, we can say that fast
MDX processing requires efficient data indexing and memory access. Splitting facts into several separate
and smaller physical memory areas opens the door to more efficient implementations. For example, when it is
possible to detect that several partitions are not used within an MDX request (e.g., when filtering by a
given year, with `[Year]` being a partition), processing that request requires fewer costly memory accesses
and less CPU.
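The idea of skipping unused partitions can be illustrated with a toy example. The following is a minimal sketch in Python, not a description of icCube's engine: the `PartitionedFacts` class, its methods and the sample data are hypothetical. Facts are stored in one list per year, and a query filtered by year only scans the matching lists.

```python
from collections import defaultdict


class PartitionedFacts:
    """Toy fact store keeping one list of amounts per partition key (here, a year)."""

    def __init__(self):
        self._partitions = defaultdict(list)

    def add(self, year, amount):
        self._partitions[year].append(amount)

    def total(self, years=None):
        """Return (sum of amounts, number of rows scanned).

        When a year filter is given, partitions for other years are never touched.
        """
        if years is None:
            selected = list(self._partitions.values())
        else:
            selected = [self._partitions[y] for y in years if y in self._partitions]
        scanned = sum(len(part) for part in selected)
        amount = sum(sum(part) for part in selected)
        return amount, scanned


facts = PartitionedFacts()
for year, amount in [(2021, 10.0), (2021, 5.0), (2022, 7.5), (2023, 1.0)]:
    facts.add(year, amount)

print(facts.total())              # scans every partition
print(facts.total(years=[2022]))  # scans the 2022 partition only
```

The second call reads a single row instead of four. With billions of rows, the same pruning avoids touching most of the memory holding the facts.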

Activating the facts partitioning is done via the `Use Fact Partitioning` property at schema level as shown in the
following picture:

![table partitioning](img/table_partitioning_large.png)

Then, for each facts definition, you can specify whether partitioning applies and how the partitioning
is defined: data-table based or MDX-level based.

#### Facts Partitioning Mode : Data-Table Based

In this mode, each data-table partition maps to a facts partition.

Defining a data-table based partitioning is shown in the following picture:

![define table partitioning](img/facts_partition_table_defined_large.png)

#### Facts Partitioning Mode : MDX-Level Based

DB table partitioning is not required to benefit from the facts partitioning feature. Indeed, you can define
facts partitions using any MDX level. For example, as shown in the following picture, a date dimension is a
good candidate for partitioning your facts, as MDX requests are quite likely to be filtered
by year.

Defining an MDX-level based partitioning is shown in the following picture:

![define level partitioning](img/facts_partition_level_defined_large.png)

Advanced: note that even if you have defined the table-based partitioning mode, you can still specify a level.
It is then used as a performance hint during tuple evaluations. This level must be consistent with the
table-based partitioning; e.g. combining a table partitioned by geography with a date level makes no sense.

### Memory Mapped Files

This edition unlocks the [memory mapped file](../configuring_iccube/ram.md) feature. Indeed, a very large heap
in Java is likely to cause long GC pauses ( > 1 minute ) that might freeze MDX request processing. One way
to avoid that is to take advantage of a well-known OS feature: memory-mapped files. With this feature, facts are
stored in files instead of in memory. The OS caches these files in RAM (therefore the more RAM the better) and
you should expect the same level of performance as with the in-memory storage engine. We strongly
encourage you to use this feature with Linux rather than Windows, as I/O is faster, steadier and does not suffer
from files being locked.
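For readers unfamiliar with the OS feature itself, here is a minimal sketch of reading facts through a memory-mapped file. It uses Python's standard `mmap` module purely for illustration; icCube is a Java server and its actual file format is not described here. The record layout (a 32-bit year followed by a 64-bit amount) and the function names are assumptions.

```python
import mmap
import os
import struct
import tempfile

# Hypothetical fixed-size record: little-endian int32 year + float64 amount.
RECORD = struct.Struct("<id")


def write_facts(path, facts):
    """Write (year, amount) pairs to a binary file, one fixed-size record each."""
    with open(path, "wb") as f:
        for year, amount in facts:
            f.write(RECORD.pack(year, amount))


def sum_amounts(path, year):
    """Sum the amounts for one year by reading the file through a memory map.

    Pages are loaded on demand and cached by the OS rather than held on the
    application's heap.
    """
    total = 0.0
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            for offset in range(0, len(mm), RECORD.size):
                y, amount = RECORD.unpack_from(mm, offset)
                if y == year:
                    total += amount
    return total


fd, facts_path = tempfile.mkstemp(suffix=".facts")
os.close(fd)
try:
    write_facts(facts_path, [(2021, 10.0), (2021, 5.0), (2022, 7.5)])
    print(sum_amounts(facts_path, 2021))
finally:
    os.remove(facts_path)
```

Because the data lives outside the managed heap, the garbage collector never has to traverse it. Repeated reads are served from the OS page cache as long as enough RAM is available.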

### Hardware / OS Requirements

Last but not least, the partitioning edition runs on any commodity machine able to run
a Java virtual machine, making it ideal to deploy on private or public cloud platforms.
