Development notes

Thoughts, notes and ideas about development

Partitioning in Relational databases

2016-12-21 3 min read Development Alexey Bogdanov

Introduction

Partitioning is a process of dividing a table into multiple smaller pieces, where each piece of such a database object is called a partition. This tutorial describes what is Partitioning and provided an example of Range partitioning in PostgreSQL database.

Benefits of partitioning

  • Query performance could be increased when:
  • most of the heavily accessed rows of the table are in a single partition or a small number of partitions.
  • queries or updates access a large percentage of a single partition because of sequential scan of that partition instead of using an index and random access reads scattered across the whole table
  • Seldom-used data can be migrated to cheaper and slower storage media
  • Performing maintenance operations on one or more partitions can be made more quickly.
  • Data availability
  • when one or some partitions are not available the other partitions still could be available.

Partitioning Types

There are following partitioning methods which control how data is placed into individual partitions:

  • Range
  • List
  • Hash

Range partitioning

Range partitioning is the most common type which is mostly used with dates. It divides table into ranges defined by single or multiple columns. It’s important to note that ranges should not be overlapped with each other. In that case an equality operators such as >, < are mostly used for verifications.
Supported by Oracle DB, PostgreSQL

List partitioning

List partitioning is very similar to Range partitioning. But List partition stores only the data which is present in a set of value lists, instead of one set of ranges of values. In verifications the values IN (list_of_coma_separated_values) is used. Supported by Oracle DB, PostgreSQL

Hash partitioning

In Hash partitioning data is stored into partitions based on hash of partitioning key. In that case data is distributed to partitions approximately in the same size. Supported by Oracle DB

Partitioning in PostgreSQL

Currently PostgreSQL supports basic table partitioning via table inheritance. It means that each partition must be created as a child table of a single parent table. The parent table itself is normally empty and exists just to represent the entire data set.

Partitioning example in PostgreSQL

To create partitioned table in PostgreSQL the following steps should be performed:

  1. Create the master table, from which all of the partitions will inherit
  2. Create several child tables that each inherit from the master table. Normally, these tables will not add any columns to the set inherited from the master.
  3. Add table constraints to the partition tables to define the allowed key values in each partition.
  4. For each partition, create an index on the key column(s), as well as any other indexes you might want. (The key index is not strictly necessary, but in most scenarios it is helpful. If you intend the key values to be unique then you should always create a unique or primary-key constraint for each partition.)
  5. Optionally, define a trigger or rule to redirect data inserted into the master table to the appropriate partition.
  6. Ensure that the constraint_exclusion configuration parameter is not disabled in postgresql.conf. If it is, queries will not be optimized as desired.

The example of Range partitioning in PostgreSQL could be found into the following repository on Github: PostgreSQL partitioning example SQL script for creating and filling partitions could be found here.

Additional Reading:

  1. Partitioning in PostgreSQL
  2. Partitioning Concepts in Oracle DB
  3. Scaling Database via Sharding and Partitioning [language: Russian]
comments powered by Disqus