Manage Partition
This documentation is for an unreleased version of Apache Paimon. We recommend you use the latest stable version.

Expiring Partitions

You can set partition.expiration-time when creating a partitioned table. Paimon then periodically checks the partitions (every partition.expiration-check-interval, 1 h by default) and deletes those that have expired.

How expiration is determined: Paimon compares the time extracted from the partition value with the current time. If the partition's survival time, that is, the difference between the two, exceeds partition.expiration-time, the partition has expired.
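The comparison described above can be sketched in Java. This is only an illustration under stated assumptions, not Paimon's implementation: the class PartitionExpirySketch and its methods are hypothetical names, the partition value is assumed to be date-only (as with the 'yyyyMMdd' formatter), and treating a partition as expired only when it is strictly older than the threshold is an assumption.

```java
import java.time.Duration;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, for illustration only.
final class PartitionExpirySketch {

    // Parse a date-only partition value such as "20240101" into midnight of that day.
    static LocalDateTime partitionTime(String value, String formatterPattern) {
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern(formatterPattern);
        return LocalDate.parse(value, formatter).atStartOfDay();
    }

    // A partition counts as expired when its time lies strictly before (now - expirationTime).
    // The strictness of the comparison is an assumption of this sketch.
    static boolean isExpired(String value, String formatterPattern,
                             Duration expirationTime, LocalDateTime now) {
        LocalDateTime threshold = now.minus(expirationTime);
        return threshold.isAfter(partitionTime(value, formatterPattern));
    }

    public static void main(String[] args) {
        LocalDateTime now = LocalDateTime.of(2024, 1, 10, 12, 0);
        Duration sevenDays = Duration.ofDays(7); // corresponds to '7 d'
        System.out.println(isExpired("20240101", "yyyyMMdd", sevenDays, now)); // prints true
        System.out.println(isExpired("20240105", "yyyyMMdd", sevenDays, now)); // prints false
    }
}
```

Passing the current time in as a parameter, rather than calling LocalDateTime.now() inside the method, keeps the check deterministic and easy to test.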

Note: After a partition expires, it is deleted logically, and its data can no longer be queried from the latest snapshot. The files in the file system are not physically deleted immediately; when they are removed depends on when the corresponding snapshot expires. See Expire Snapshots.

An example with a single partition field:

CREATE TABLE t (...) PARTITIONED BY (dt) WITH (
    'partition.expiration-time' = '7 d',
    'partition.expiration-check-interval' = '1 d',
    'partition.timestamp-formatter' = 'yyyyMMdd'
);

An example with multiple partition fields:

CREATE TABLE t (...) PARTITIONED BY (other_key, dt) WITH (
    'partition.expiration-time' = '7 d',
    'partition.expiration-check-interval' = '1 d',
    'partition.timestamp-formatter' = 'yyyyMMdd',
    'partition.timestamp-pattern' = '$dt'
);

More options:

Option | Default | Type | Description
partition.expiration-check-interval | 1 h | Duration | The check interval of partition expiration.
partition.expiration-time | (none) | Duration | The expiration interval of a partition. A partition expires when its lifetime exceeds this value. The partition time is extracted from the partition value.
partition.timestamp-formatter | (none) | String | The formatter used to parse a timestamp from a string. It can be used with 'partition.timestamp-pattern' to create a formatter using the specified value.
  • The default formatters are 'yyyy-MM-dd HH:mm:ss' and 'yyyy-MM-dd'.
  • Supports multiple partition fields like '$year-$month-$day $hour:00:00'.
  • The timestamp-formatter is compatible with Java's DateTimeFormatter.
partition.timestamp-pattern | (none) | String | A pattern for getting a timestamp from partitions. The formatter pattern is defined by 'partition.timestamp-formatter'.
  • By default, the timestamp is read from the first field.
  • If the timestamp in the partition is a single field called 'dt', you can use '$dt'.
  • If it is spread across multiple fields for year, month, day, and hour, you can use '$year-$month-$day $hour:00:00'.
  • If the timestamp is in fields dt and hour, you can use '$dt $hour:00:00'.
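
To illustrate how a timestamp pattern and a formatter can work together, the following Java sketch substitutes partition field values into a pattern and parses the result. It is a simplified illustration, not Paimon's code: the class TimestampPatternSketch and its methods are hypothetical names, and plain string replacement of each '$field' placeholder is an assumption about how the substitution behaves.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only.
final class TimestampPatternSketch {

    // Replace each "$field" placeholder with that partition field's value.
    // Simplification: field names that are prefixes of one another
    // (for example "d" and "dt") are not handled specially here.
    static String fillPattern(String pattern, Map<String, String> partition) {
        String result = pattern;
        for (Map.Entry<String, String> field : partition.entrySet()) {
            result = result.replace("$" + field.getKey(), field.getValue());
        }
        return result;
    }

    // Fill the pattern, then parse the resulting text with a DateTimeFormatter pattern.
    static LocalDateTime extract(String pattern, String formatterPattern,
                                 Map<String, String> partition) {
        String text = fillPattern(pattern, partition);
        return LocalDateTime.parse(text, DateTimeFormatter.ofPattern(formatterPattern));
    }

    public static void main(String[] args) {
        Map<String, String> partition = new LinkedHashMap<>();
        partition.put("dt", "2024-01-01");
        partition.put("hour", "08");

        // '$dt $hour:00:00' becomes "2024-01-01 08:00:00"
        System.out.println(fillPattern("$dt $hour:00:00", partition));
        System.out.println(extract("$dt $hour:00:00", "yyyy-MM-dd HH:mm:ss", partition));
    }
}
```

Fields that are not referenced by the pattern, such as other_key in the multi-field example, simply do not take part in the substitution.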