Overview

Spec Overview #

This is the specification for the Paimon table format, this document standardizes the underlying file structure and design of Paimon.

Terms #

  • Schema: fields, primary keys definition, partition keys definition and options.
  • Snapshot: the entrance to all data committed at some specific time point.
  • Manifest list: includes several manifest files.
  • Manifest: includes several data files or changelog files.
  • Data File: contains incremental records.
  • Changelog File: contains records produced by changelog-producer.
  • Global Index: index for a bucket or partition.
  • Data File Index: index for a data file.

Run Flink SQL with Paimon:

CREATE CATALOG my_catalog WITH (
    'type' = 'paimon',
    'warehouse' = '/your/path'
);       
USE CATALOG my_catalog;

CREATE TABLE my_table (
    k INT PRIMARY KEY NOT ENFORCED,
    f0 INT,
    f1 STRING
);

INSERT INTO my_table VALUES (1, 11, '111');

Take a look to the disk:

warehouse
└── default.db
    └── my_table
        ├── bucket-0
        │   └── data-59f60cb9-44af-48cc-b5ad-59e85c663c8f-0.orc
        ├── index
        │   └── index-5625e6d9-dd44-403b-a738-2b6ea92e20f1-0
        ├── manifest
        │   ├── index-manifest-5d670043-da25-4265-9a26-e31affc98039-0
        │   ├── manifest-6758823b-2010-4d06-aef0-3b1b597723d6-0
        │   ├── manifest-list-9f856d52-5b33-4c10-8933-a0eddfaa25bf-0
        │   └── manifest-list-9f856d52-5b33-4c10-8933-a0eddfaa25bf-1
        ├── schema
        │   └── schema-0
        └── snapshot
            ├── EARLIEST
            ├── LATEST
            └── snapshot-1
Edit This Page
Copyright © 2024 The Apache Software Foundation. Apache Paimon, Paimon, and its feather logo are trademarks of The Apache Software Foundation.