
Sequence and Rowkind

When creating a table, you can set 'sequence.field' to name the field that determines the order of updates, and 'rowkind.field' to name the field that determines the changelog kind of each record.

Sequence Field

By default, a primary key table merges records in input order (the last record to arrive is merged last). In distributed computing, however, records can arrive out of order. In that case, you can use a time field as sequence.field, for example:

CREATE TABLE my_table (
    pk BIGINT PRIMARY KEY NOT ENFORCED,
    v1 DOUBLE,
    v2 BIGINT,
    dt TIMESTAMP
) WITH (
    'sequence.field' = 'dt'
);

The record with the largest sequence.field value will be the last to merge, regardless of the input order.
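As a rough illustration of this rule, the sketch below writes two records for the same key into my_table in the "wrong" order. It assumes Flink SQL and the default merge behavior for a primary key table; the values are made up for the example.

```sql
-- First write: the newer change (larger dt) arrives first.
INSERT INTO my_table
VALUES (1, CAST(2.0 AS DOUBLE), 2, TIMESTAMP '2023-01-01 00:00:02');

-- Second write: the older change (smaller dt) arrives late.
INSERT INTO my_table
VALUES (1, CAST(1.0 AS DOUBLE), 1, TIMESTAMP '2023-01-01 00:00:01');

-- Because 'sequence.field' = 'dt', the record with dt = 00:00:02 is merged last,
-- so this query is expected to show the values from the first write for pk = 1.
SELECT * FROM my_table WHERE pk = 1;
```

Without 'sequence.field', the second write would be merged last simply because it arrived last.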

Sequence Auto Padding:

When a record is updated or deleted, its sequence.field value must increase; it cannot stay the same. For -U and +U, the sequence field values must differ. If you cannot meet this requirement, Paimon provides an option to pad the sequence field automatically.

  1. 'sequence.auto-padding' = 'row-kind-flag': Use this if -U and +U carry the same value, as with “op_ts” (the time that the change was made in the database) in the MySQL binlog. Automatic padding with the row kind flag is recommended here because it distinguishes between -U (-D) and +U (+I).

  2. Insufficient precision: If the provided sequence.field is not precise enough, for example it is only accurate to the second or millisecond, you can set sequence.auto-padding to second-to-micro or millis-to-micro. The sequence number is then extended to microsecond precision using an incremental id (calculated within a single bucket).

  3. Composite pattern: for example, “second-to-micro,row-kind-flag” first extends the seconds to microseconds and then pads the row kind flag.
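The padding options above are set as table options alongside 'sequence.field'. The following is a minimal sketch, assuming Flink SQL; the table names (cdc_orders, coarse_events) and column names (op_ts, update_time) are hypothetical, and the choice of a BIGINT epoch-seconds column for the second example is an assumption made for illustration.

```sql
-- Case 1: -U and +U share the same op_ts (for example, from a MySQL binlog).
CREATE TABLE cdc_orders (
    order_id BIGINT PRIMARY KEY NOT ENFORCED,
    amount DOUBLE,
    op_ts TIMESTAMP   -- hypothetical column: time the change was made in the database
) WITH (
    'sequence.field' = 'op_ts',
    'sequence.auto-padding' = 'row-kind-flag'
);

-- Cases 2 and 3: the sequence value is only accurate to the second,
-- so extend it to microseconds first and then pad the row kind flag.
CREATE TABLE coarse_events (
    event_id BIGINT PRIMARY KEY NOT ENFORCED,
    payload STRING,
    update_time BIGINT   -- hypothetical column: epoch seconds (assumed)
) WITH (
    'sequence.field' = 'update_time',
    'sequence.auto-padding' = 'second-to-micro,row-kind-flag'
);
```

For a millisecond-precision field, 'millis-to-micro' would take the place of 'second-to-micro' in the option value.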

Row Kind Field

By default, a primary key table determines the row kind from the input row. You can also set 'rowkind.field' to extract the row kind from a field instead.

The valid row kind strings are '+I', '-U', '+U' and '-D'.
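As an illustration, the sketch below declares a table whose row kind is read from a string column. It assumes Flink SQL; the table name user_changes and the column names, including op, are hypothetical.

```sql
CREATE TABLE user_changes (
    user_id BIGINT PRIMARY KEY NOT ENFORCED,
    name STRING,
    op STRING   -- hypothetical column holding '+I', '-U', '+U' or '-D'
) WITH (
    'rowkind.field' = 'op'
);

-- The row kind of each record is taken from the op column
-- rather than from the row kind of the input row.
INSERT INTO user_changes VALUES (1, 'alice', '+I');
INSERT INTO user_changes VALUES (1, 'alice', '-D');
```

Here the second record is intended to be treated as a delete for user_id = 1, even though it is written with a plain INSERT statement.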