Command Line Interface
PyPaimon provides a command-line interface (CLI) for interacting with Paimon catalogs and tables. With the CLI you can read table data, manage tables, databases, tags, and branches, and run SQL queries directly from the command line.
Installation
The CLI is installed automatically when you install PyPaimon:
pip install pypaimon
After installation, the paimon command will be available in your terminal.
Basic Usage
Before using the CLI, you need to create a catalog configuration file.
By default, the CLI looks for a paimon.yaml file in the current directory.
Create a paimon.yaml file with your catalog settings:
Filesystem Catalog:
metastore: filesystem
warehouse: /path/to/warehouse
REST Catalog:
metastore: rest
uri: http://localhost:8080
warehouse: catalog_name
Usage:
paimon [OPTIONS] COMMAND [ARGS]...
- -c, --config PATH: Path to catalog configuration file (default: paimon.yaml)
- --help: Show help message and exit
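As a small illustration of the configuration step, the following Python sketch writes a filesystem-catalog config to a non-default location and builds an argument list that points the CLI at it with -c. The helper names and the temporary path are hypothetical; only the YAML keys and the -c option come from the text above.

```python
import tempfile
from pathlib import Path


def write_filesystem_config(config_path: Path, warehouse: str) -> Path:
    """Write a minimal filesystem-catalog config using the keys shown above."""
    config_path.write_text(
        "metastore: filesystem\n"
        f"warehouse: {warehouse}\n",
        encoding="utf-8",
    )
    return config_path


def paimon_args(config_path: Path, *command: str) -> list:
    """Build an argument list; -c precedes the command, per the usage line."""
    return ["paimon", "-c", str(config_path), *command]


# Hypothetical location, used only for this example.
workdir = Path(tempfile.mkdtemp())
config = write_filesystem_config(workdir / "dev-catalog.yaml", "/path/to/warehouse")
args = paimon_args(config, "table", "read", "mydb.users")
# Pass `args` to subprocess.run(...) on a machine where PyPaimon is installed.
```

Keeping one config file per environment and selecting it with -c avoids depending on the current directory.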
Table Commands
Table Read
Read data from a Paimon table and display it in a tabular format.
paimon table read mydb.users
Options:
- --select, -s: Select specific columns to read (comma-separated)
- --where, -w: Filter condition in SQL-like syntax
- --limit, -l: Maximum number of results to display (default: 100)
- --format, -f: Output format: table (default) or json
Examples:
# Read with limit
paimon table read mydb.users -l 50
# Read specific columns
paimon table read mydb.users -s id,name,age
# Filter with WHERE clause
paimon table read mydb.users --where "age > 18"
# Combine select, where, and limit
paimon table read mydb.users -s id,name -w "age >= 20 AND city = 'Beijing'" -l 50
# Output as JSON (for programmatic use)
paimon table read mydb.users --format json
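For programmatic use, the JSON output can be consumed from a script. The sketch below builds the argument list from the options documented above and parses the result. It assumes that --format json prints a JSON array with one object per row on standard output; that shape is not specified here, so treat it as an assumption. The function names are hypothetical.

```python
import json
import subprocess


def read_args(table, select=None, where=None, limit=None):
    """Build `paimon table read` arguments from the documented options."""
    args = ["paimon", "table", "read", table, "--format", "json"]
    if select:
        args += ["--select", ",".join(select)]
    if where:
        args += ["--where", where]
    if limit is not None:
        args += ["--limit", str(limit)]
    return args


def parse_rows(stdout_text):
    """ASSUMPTION: the JSON output is an array with one object per row."""
    rows = json.loads(stdout_text)
    if not isinstance(rows, list):
        raise ValueError("expected a JSON array of rows")
    return rows


def read_rows(table, **options):
    """Run the CLI (needs PyPaimon and a paimon.yaml) and return parsed rows."""
    result = subprocess.run(
        read_args(table, **options), capture_output=True, text=True, check=True
    )
    return parse_rows(result.stdout)


example_args = read_args("mydb.users", select=["id", "name"], where="age > 18", limit=50)
```

Passing the arguments as a list means the WHERE expression needs no extra shell quoting.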
WHERE Operators
The --where option supports SQL-like filter expressions:
| Operator | Example |
|---|---|
| =, !=, <> | name = 'Alice' |
| <, <=, >, >= | age > 18 |
| IS NULL, IS NOT NULL | deleted_at IS NULL |
| IN (...), NOT IN (...) | status IN ('active', 'pending') |
| BETWEEN ... AND ... | age BETWEEN 20 AND 30 |
| LIKE | name LIKE 'A%' |
Multiple conditions can be combined with AND and OR (AND has higher precedence). Parentheses are supported for grouping:
# AND condition
paimon table read mydb.users -w "age >= 20 AND age <= 30"
# OR condition
paimon table read mydb.users -w "city = 'Beijing' OR city = 'Shanghai'"
# Parenthesized grouping
paimon table read mydb.users -w "(age > 18 OR name = 'Bob') AND city = 'Beijing'"
# IN list
paimon table read mydb.users -w "city IN ('Beijing', 'Shanghai', 'Hangzhou')"
# BETWEEN
paimon table read mydb.users -w "age BETWEEN 25 AND 35"
# LIKE pattern
paimon table read mydb.users -w "name LIKE 'A%'"
# IS NULL / IS NOT NULL
paimon table read mydb.users -w "email IS NOT NULL"
Literal values are automatically cast to the appropriate Python type based on the table schema (e.g., INT fields cast to int, DOUBLE to float).
Output:
id name age city
1 Alice 25 Beijing
2 Bob 30 Shanghai
3 Charlie 35 Guangzhou
4 David 28 Shenzhen
5 Eve 32 Hangzhou
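The precedence rule for --where (AND binds tighter than OR) matters when both operators are mixed without parentheses. Python's and/or have the same relative precedence, so the difference can be checked locally on a few in-memory rows. The rows and function names below are made up for illustration, and this is not how the CLI evaluates filters.

```python
rows = [
    {"name": "Bob", "age": 16, "city": "Shanghai"},
    {"name": "Ann", "age": 30, "city": "Beijing"},
    {"name": "Cid", "age": 30, "city": "Shanghai"},
]


def unparenthesized(r):
    # age > 18 OR name = 'Bob' AND city = 'Beijing'
    # is read as: age > 18 OR (name = 'Bob' AND city = 'Beijing')
    return r["age"] > 18 or r["name"] == "Bob" and r["city"] == "Beijing"


def grouped(r):
    # (age > 18 OR name = 'Bob') AND city = 'Beijing'
    return (r["age"] > 18 or r["name"] == "Bob") and r["city"] == "Beijing"


names_unparenthesized = [r["name"] for r in rows if unparenthesized(r)]
names_grouped = [r["name"] for r in rows if grouped(r)]
```

Here the unparenthesized form keeps Ann and Cid, while the grouped form keeps only Ann, so adding parentheses is worthwhile whenever the intent could be ambiguous.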
Table Explain
Show the scan plan of a query without reading any data: the target snapshot, the pushed-down predicate / projection / limit, the partition / bucket / file-stats pruning funnel, and split-level signals (raw-convertible ratio, deletion-vector ratio, level histogram, files-per-split and split-size distribution). Useful for previewing the pruning effect of a predicate before actually running the read.
paimon table explain mydb.events
Options:
- --select, -s: Project specific columns (comma-separated)
- --where, -w: Filter condition in SQL-like syntax (same operators as table read)
- --limit, -l: Row limit to push down
- --verbose, -v: List every split with its files
- --format, -f: Output format: table (default) or json
Examples:
# Whole-table scan plan
paimon table explain mydb.events
# Push filter and projection through the planner
paimon table explain mydb.events --where "dt = '2026-05-16' AND id = 7" -s dt,id,val
# List every split (and its files) instead of just the aggregates
paimon table explain mydb.events -w "dt = '2026-05-16'" --verbose
# Machine-readable output for scripting (level_histogram keys are JSON strings)
paimon table explain mydb.events --format json
Output:
== PyPaimon Scan Plan ==
Table: mydb.events (PK, HASH_FIXED)
Snapshot: 5 (schema 0)
Predicate: (dt = '2026-05-16') AND (id = 7)
Projection: [dt, id, val]
Limit: <none>
Partition pruning: 20 -> 4 (pruned 16)
Bucket pruning: 4 -> 1 (pruned 3)
File skipping: 1 -> 1 (pruned 0)
Splits: 1
raw-convertible: 1 / 1
with DV: 0 / 1
all-above-L0: 0 / 1
files/split: min=1 max=1 avg=1.00
size/split: min=2.6 KiB p50=2.6 KiB p95=2.6 KiB max=2.6 KiB
Files: 1
Total size: 2.6 KiB
Estimated rows: 10 (merged: 10)
Level histogram: L0=1
Deletion files: 0
explain reads the manifest list and manifest files but never opens any data files, so it is dramatically cheaper than a real read on large tables.
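To compare predicates by how much they prune, the funnel lines of the text output can be turned into numbers. The following sketch parses lines shaped like the sample above ("Partition pruning: 20 -> 4 (pruned 16)") with a regular expression. The exact text layout is taken from that sample and may differ between versions; the JSON format is likely a sturdier input for scripts, but its keys are not documented here, so this example stays with the text form. The function name is hypothetical.

```python
import re

FUNNEL_LINE = re.compile(
    r"^[ \t]*(?P<stage>[A-Za-z ]+?):[ \t]+(?P<before>\d+)[ \t]*->[ \t]*(?P<after>\d+)"
    r"[ \t]+\(pruned (?P<pruned>\d+)\)",
    re.MULTILINE,
)


def pruning_funnel(plan_text):
    """Return {stage: (before, after, kept_fraction)} for each funnel line."""
    funnel = {}
    for m in FUNNEL_LINE.finditer(plan_text):
        before, after = int(m["before"]), int(m["after"])
        kept = after / before if before else 1.0
        funnel[m["stage"]] = (before, after, kept)
    return funnel


sample = """\
Partition pruning: 20 -> 4 (pruned 16)
Bucket pruning: 4 -> 1 (pruned 3)
File skipping: 1 -> 1 (pruned 0)
"""
funnel = pruning_funnel(sample)
```

For the sample plan, partition pruning keeps 20% of partitions and bucket pruning keeps 25% of the remaining buckets.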
Table Get
Get and display table schema information in JSON format. The output format is the same as the schema JSON format used in table create, making it easy to export and reuse table schemas.
paimon table get mydb.users
Output:
{
"fields": [
{"id": 0, "name": "user_id", "type": "BIGINT"},
{"id": 1, "name": "username", "type": "STRING"},
{"id": 2, "name": "email", "type": "STRING"},
{"id": 3, "name": "age", "type": "INT"},
{"id": 4, "name": "city", "type": "STRING"},
{"id": 5, "name": "created_at", "type": "TIMESTAMP"},
{"id": 6, "name": "is_active", "type": "BOOLEAN"}
],
"partitionKeys": ["city"],
"primaryKeys": ["user_id"],
"options": {
"bucket": "4",
"changelog-producer": "input"
},
"comment": "User information table"
}
Note: The output JSON can be saved to a file and used directly with the table create command to recreate the table structure.
Table Snapshot
Get and display the latest snapshot information of a Paimon table in JSON format. The snapshot contains metadata about the current state of the table.
paimon table snapshot mydb.users
Output:
{
"version": 3,
"id": 5,
"schemaId": 1,
"baseManifestList": "manifest-list-5-base-...",
"deltaManifestList": "manifest-list-5-delta-...",
"changelogManifestList": null,
"totalRecordCount": 1000,
"deltaRecordCount": 100,
"changelogRecordCount": null,
"commitUser": "user-123",
"commitIdentifier": 1709123456789,
"commitKind": "APPEND",
"timeMillis": 1709123456789,
"watermark": null,
"statistics": null,
"nextRowId": null
}
Table Create
Create a new Paimon table with a schema defined in a JSON file. The schema JSON format is the same as the output of table get, which keeps schemas consistent and easy to reuse.
Options:
- --schema, -s: Path to schema JSON file - Required
- --ignore-if-exists, -i: Do not raise error if table already exists
The schema JSON file follows the same format as the output of table get.
Field Properties:
- id: Field ID (integer, typically starts from 0) - Required
- name: Field name - Required
- type: Field data type (e.g., INT, BIGINT, STRING, TIMESTAMP, DECIMAL(10,2)) - Required
- description: Optional field description
Schema Properties:
- fields: List of field definitions - Required
- partitionKeys: List of partition key column names
- primaryKeys: List of primary key column names
- options: Table options as key-value pairs
- comment: Table comment
Example Workflow:
- Export schema from an existing table:
  paimon table get mydb.users > users_schema.json
- Create a new table with the same schema:
  paimon table create mydb.users_copy --schema users_schema.json
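Between the export and create steps, the schema file can be adjusted. A minimal Python sketch, assuming the schema JSON has the properties listed above; the helper name and the changed values are hypothetical:

```python
import json
from copy import deepcopy


def derive_schema(schema, comment=None, extra_options=None):
    """Return a copy of an exported schema with an updated comment and/or options."""
    derived = deepcopy(schema)
    if comment is not None:
        derived["comment"] = comment
    if extra_options:
        derived.setdefault("options", {}).update(extra_options)
    return derived


# A trimmed schema in the format printed by `table get`.
exported = {
    "fields": [
        {"id": 0, "name": "user_id", "type": "BIGINT"},
        {"id": 1, "name": "username", "type": "STRING"},
    ],
    "partitionKeys": [],
    "primaryKeys": ["user_id"],
    "options": {"bucket": "4"},
    "comment": "User information table",
}

copy_schema = derive_schema(exported, comment="Copy of users", extra_options={"bucket": "8"})
schema_json = json.dumps(copy_schema, indent=2)
# Write schema_json to a file and pass it to `paimon table create ... --schema FILE`.
```

Option values are kept as strings, matching the table get sample output.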
Table Import
Import data from CSV or JSON files into an existing Paimon table. This is useful for bulk loading data from external sources.
Options:
--input, -i: Path to input file (CSV or JSON format) - Required
Supported Formats:
- CSV (.csv): Comma-separated values file
- JSON (.json): JSON file with array of objects format
Import from CSV
The CSV file should have:
- A header row with column names matching the table schema
- Data types compatible with the table columns
id,name,age,city
1,Alice,25,Beijing
2,Bob,30,Shanghai
3,Charlie,35,Guangzhou
Output:
Successfully imported 3 rows into 'mydb.users'.
Import from JSON
The JSON file should be an array of objects with keys matching the table column names.
[
{"id": 1, "name": "Alice", "age": 25, "city": "Beijing"},
{"id": 2, "name": "Bob", "age": 30, "city": "Shanghai"},
{"id": 3, "name": "Charlie", "age": 35, "city": "Guangzhou"}
]
Output:
Successfully imported 3 rows into 'mydb.users'.
Important Notes
- The target table must exist before importing data
- Column names in the file must match the table schema
- Data types should be compatible with the table schema
- The import operation appends data to the existing table
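Because column names must match the table schema, it can help to check a CSV header before importing. The sketch below compares the header row with the field names from a schema in the table get format. Whether the import requires every column to be present, or accepts a different column order, is not stated here, so the check only reports differences and leaves the decision to the caller. The function name is hypothetical.

```python
import csv
import io


def check_csv_header(csv_text, schema):
    """Compare the CSV header row with the schema's field names.

    Returns (missing, unexpected): columns in the schema but not in the file,
    and columns in the file but not in the schema.
    """
    header = next(csv.reader(io.StringIO(csv_text)))
    schema_names = [field["name"] for field in schema["fields"]]
    missing = [name for name in schema_names if name not in header]
    unexpected = [name for name in header if name not in schema_names]
    return missing, unexpected


schema = {
    "fields": [
        {"id": 0, "name": "id", "type": "INT"},
        {"id": 1, "name": "name", "type": "STRING"},
        {"id": 2, "name": "age", "type": "INT"},
        {"id": 3, "name": "city", "type": "STRING"},
    ]
}

ok = check_csv_header("id,name,age,city\n1,Alice,25,Beijing\n", schema)
bad = check_csv_header("id,full_name,age\n1,Alice,25\n", schema)
```

Running the check before the import surfaces a naming mismatch before any rows are appended.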
Table List Partitions
List partitions of a Paimon table. Supports optional pattern filtering to match specific partitions.
paimon table list-partitions mydb.orders
Options:
- --pattern, -p: Partition name pattern to filter partitions
- --format, -f: Output format: table (default) or json
Examples:
# List all partitions
paimon table list-partitions mydb.orders
# List partitions matching a pattern
paimon table list-partitions mydb.orders --pattern "dt=2024*"
# Output as JSON (for programmatic use)
paimon table list-partitions mydb.orders --format json
Output:
Partition RecordCount FileSizeInBytes FileCount LastFileCreationTime UpdatedAt UpdatedBy
dt=2024-01-01,region=us 500 1048576 10 1704067200000 1704153600000 admin
dt=2024-01-02,region=eu 300 524288 5 1704153600000 1704240000000 user1
dt=2024-01-03,region=us 200 262144 3 1704240000000 1704326400000 admin
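The LastFileCreationTime and UpdatedAt values in the sample look like milliseconds since the Unix epoch. Under that assumption, which is not stated explicitly here, they can be converted to readable timestamps as follows:

```python
from datetime import datetime, timezone


def millis_to_utc(millis):
    """ASSUMPTION: the value is milliseconds since the Unix epoch."""
    return datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


# Values copied from the first row of the sample output.
last_file_created = millis_to_utc(1704067200000)
updated_at = millis_to_utc(1704153600000)
```

With the sample values, the first partition's last file was created at 2024-01-01 00:00 UTC and the partition was updated one day later.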
Table Rename
Rename a table in the catalog. Both source and target must be specified in database.table format.
paimon table rename mydb.old_name mydb.new_name
Output:
Table 'mydb.old_name' renamed to 'mydb.new_name' successfully.
Note: Both filesystem and REST catalogs support table rename. For filesystem catalogs, the rename is performed by renaming the underlying table directory.
Table Full-Text Search
Perform full-text search on a Paimon table with a Tantivy full-text index and display matching rows.
paimon table full-text-search mydb.articles --column content --query "paimon lake"
Options:
- --column, -c: Text column to search on - Required
- --query, -q: Query text to search for - Required
- --limit, -l: Maximum number of results to return (default: 10)
- --select, -s: Select specific columns to display (comma-separated)
- --format, -f: Output format: table (default) or json
Examples:
# Basic full-text search
paimon table full-text-search mydb.articles -c content -q "paimon lake"
# Search with limit
paimon table full-text-search mydb.articles -c content -q "streaming data" -l 20
# Search with column projection
paimon table full-text-search mydb.articles -c content -q "paimon" -s "id,title,content"
# Output as JSON
paimon table full-text-search mydb.articles -c content -q "paimon" -f json
Output:
id content
0 Apache Paimon is a streaming data lake platform
2 Paimon supports real-time data ingestion and...
4 Data lake platforms like Paimon handle large-...
Note: The table must have a Tantivy full-text index built on the target column. PyPaimon uses
the tokenizer settings stored in the index metadata; ngram full-text indexes require a tantivy-py
package with custom tokenizer support, and jieba full-text indexes require the Python jieba
package. See Global Index for how to create full-text indexes.
Table Drop
Drop a table from the catalog. This will permanently delete the table and all its data.
Options:
--ignore-if-not-exists, -i: Do not raise error if table does not exist
paimon table drop mydb.old_table
Output:
Table 'mydb.old_table' dropped successfully.
Warning: This operation cannot be undone. All data in the table will be permanently deleted.
Table Alter
Alter a table's schema or options. This command supports multiple sub-commands for different types of schema changes.
Basic Syntax
paimon table alter DATABASE.TABLE [--ignore-if-not-exists] SUBCOMMAND [OPTIONS]
Global Options:
--ignore-if-not-exists, -i: Do not raise error if table does not exist
Set Option
Set a table option (key-value pair):
paimon table alter mydb.users set-option -k snapshot.num-retained-max -v 10
Remove Option
Remove a table option:
paimon table alter mydb.users remove-option -k snapshot.num-retained-max
Add Column
Add a new column to the table:
Example:
paimon table alter mydb.users add-column -n email -t STRING -c "User email address"
Example with position (first):
paimon table alter mydb.users add-column -n row_id -t BIGINT --first
Example with position (after):
paimon table alter mydb.users add-column -n email -t STRING --after name
Drop Column
Drop a column from the table:
paimon table alter mydb.users drop-column -n email
Rename Column
Rename an existing column:
paimon table alter mydb.users rename-column -n username -m user_name
Alter Column
Alter an existing column's type, comment, or position. Multiple changes can be specified in a single command.
Change Column Type:
paimon table alter mydb.users alter-column -n age -t BIGINT
Change Column Comment:
paimon table alter mydb.users alter-column -n age -c 'User age in years'
Change Column Position:
paimon table alter mydb.users alter-column -n age --first
paimon table alter mydb.users alter-column -n age --after name
Multiple changes in one command:
paimon table alter mydb.users alter-column -n age -t BIGINT -c 'User age in years'
Update Comment
paimon table alter mydb.users update-comment -c "Updated user information table"
Tag Commands
Manage tags (named snapshots) on a table. Tags are useful for time travel and pinning a snapshot for later access.
paimon tag <create|list|get|delete> mydb.users ...
Tag Create
# Tag the latest snapshot
paimon tag create mydb.users v1
# Tag a specific snapshot
paimon tag create mydb.users v1 --snapshot-id 3
# Do not error if the tag already exists
paimon tag create mydb.users v1 --ignore-if-exists
Options:
- --snapshot-id, -s: Snapshot id to tag (default: the latest snapshot)
- --ignore-if-exists, -i: Do not raise an error if the tag already exists
Tag List
# List all tags
paimon tag list mydb.users
# Only tags with a name prefix
paimon tag list mydb.users --prefix prod_
# JSON output
paimon tag list mydb.users --format json
Options:
- --prefix, -p: Only list tags whose name starts with this prefix
- --format, -f: Output format: table (default) or json
Tag Get
paimon tag get mydb.users v1
# JSON output
paimon tag get mydb.users v1 --format json
Options:
--format, -f: Output format: table (default) or json
Tag Delete
paimon tag delete mydb.users v1
Database Commands
DB Get
Get and display database information in JSON format.
paimon db get mydb
Output:
{
"name": "mydb",
"options": {}
}
DB Create
Create a new database.
# Create a simple database
paimon db create mydb
# Create with properties
paimon db create mydb -p '{"key1": "value1", "key2": "value2"}'
# Create and ignore if already exists
paimon db create mydb -i
DB Drop
Drop an existing database.
# Drop a database
paimon db drop mydb
# Drop and ignore if not exists
paimon db drop mydb -i
# Drop with all tables (cascade)
paimon db drop mydb --cascade
DB Alter
Alter database properties by setting or removing properties.
# Set properties
paimon db alter mydb --set '{"key1": "value1", "key2": "value2"}'
# Remove properties
paimon db alter mydb --remove key1 key2
# Set and remove properties in one command
paimon db alter mydb --set '{"key1": "new_value"}' --remove key2
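The --set value is a JSON object passed as a single argument, so quoting matters when the command is assembled by a script. A minimal sketch using Python's standard json and shlex modules; the helper name is hypothetical and only the flags shown above are used:

```python
import json
import shlex


def db_alter_command(database, set_props=None, remove_keys=None):
    """Build `paimon db alter` arguments with the JSON encoded as one argument."""
    args = ["paimon", "db", "alter", database]
    if set_props:
        args += ["--set", json.dumps(set_props)]
    if remove_keys:
        args += ["--remove", *remove_keys]
    return args


args = db_alter_command("mydb", set_props={"key1": "new_value"}, remove_keys=["key2"])
# shlex.join adds the single quotes a POSIX shell needs around the JSON.
printable = shlex.join(args)
```

The list form can be handed to subprocess.run as-is, while printable is suitable for logging or pasting into a POSIX shell.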
DB List Tables
List all tables in a database.
paimon db list-tables mydb
Output:
orders
products
users
Catalog Commands
Catalog List DBs
List all databases in the catalog.
paimon catalog list-dbs
Output:
default
mydb
analytics
SQL Command
Execute SQL queries on Paimon tables directly from the command line. This feature is powered by pypaimon-rust and DataFusion.
Prerequisites:
pip install pypaimon[sql]
One-Shot Query
Execute a single SQL query and display the result:
paimon sql "SELECT * FROM users LIMIT 10"
Output:
id name age city
1 Alice 25 Beijing
2 Bob 30 Shanghai
3 Charlie 35 Guangzhou
Options:
--format, -f: Output format: table (default) or json
Examples:
# Direct table name (uses default catalog and database)
paimon sql "SELECT * FROM users"
# Two-part: database.table
paimon sql "SELECT * FROM mydb.users"
# Query with filter and aggregation
paimon sql "SELECT city, COUNT(*) AS cnt FROM users GROUP BY city ORDER BY cnt DESC"
# Output as JSON
paimon sql "SELECT * FROM users LIMIT 5" --format json
Interactive REPL
Start an interactive SQL session by running paimon sql without a query argument. The REPL supports arrow keys for line editing, and command history is persisted across sessions in ~/.paimon_history.
paimon sql
Output:
____ _
/ __ \____ _(_)___ ___ ____ ____
/ /_/ / __ `/ / __ `__ \/ __ \/ __ \
/ ____/ /_/ / / / / / / / /_/ / / / /
/_/ \__,_/_/_/ /_/ /_/\____/_/ /_/
Powered by pypaimon-rust + DataFusion
Type 'help' for usage, 'exit' to quit.
paimon> SHOW DATABASES;
default
mydb
paimon> USE mydb;
Using database 'mydb'.
paimon> SHOW TABLES;
orders
users
paimon> SELECT count(*) AS cnt
> FROM users
> WHERE age > 18;
cnt
42
(1 row in 0.05s)
paimon> exit
Bye!
SQL statements end with ; and can span multiple lines. The continuation prompt > indicates that more input is expected.
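The multi-line behaviour can be pictured as buffering input lines until one ends with a semicolon. The sketch below is a simplified illustration of that idea, not PyPaimon's actual implementation: it ignores semicolons inside string literals and does not handle bare commands such as help or exit.

```python
def split_statements(lines):
    """Group input lines into statements, each ending at a trailing ';'.

    Returns (statements, pending), where pending holds lines of a statement
    that has not been terminated yet (the continuation-prompt case).
    """
    statements, pending = [], []
    for line in lines:
        stripped = line.strip()
        if not stripped:
            continue
        pending.append(stripped)
        if stripped.endswith(";"):
            statements.append(" ".join(pending).rstrip(";").strip())
            pending = []
    return statements, pending


complete, leftover = split_statements(
    ["SHOW TABLES;", "SELECT count(*) AS cnt", "FROM users", "WHERE age > 18;"]
)
unfinished = split_statements(["SELECT count(*) AS cnt", "FROM users"])
```

In the second call nothing is ready to execute yet, which corresponds to the REPL showing the continuation prompt.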
REPL Commands:
| Command | Description |
|---|---|
| USE <database>; | Switch the default database |
| SHOW DATABASES; | List all databases |
| SHOW TABLES; | List tables in the current database |
| SELECT ...; | Execute a SQL query |
| help | Show usage information |
| exit / quit | Exit the REPL |
For more details on SQL syntax and the Python API, see SQL Query.
Branch Commands
Manage branches on a table. Branches are independent lines of a table that can be created from the current state or from a tag, and later fast-forwarded back into main.
paimon branch <create|list|delete|rename|fast-forward> mydb.users ...
Branch Create
# Create a branch from the current state
paimon branch create mydb.users b1
# Create a branch from an existing tag
paimon branch create mydb.users b1 --tag v1
Options:
--tag, -t: Create the branch from this tag (default: current state)
Branch List
# List all branches
paimon branch list mydb.users
# JSON output
paimon branch list mydb.users --format json
Options:
--format, -f: Output format: table (default) or json
Branch Delete
paimon branch delete mydb.users b1
Branch Rename
paimon branch rename mydb.users b1 b2
Branch Fast-Forward
Fast-forward the main branch to the given branch (main adopts the branch's snapshots).
paimon branch fast-forward mydb.users b1
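Tags and branches can be combined into one workflow: pin a snapshot with a tag, create a branch from that tag, and later fast-forward main to the branch. The sketch below only assembles the corresponding argument lists from the commands documented above. The function name is hypothetical, and writing data to the branch between the second and third command happens outside these CLI calls.

```python
def branch_from_tag_commands(table, tag, branch, snapshot_id=None):
    """Return the CLI invocations for a tag -> branch -> fast-forward workflow."""
    tag_create = ["paimon", "tag", "create", table, tag]
    if snapshot_id is not None:
        tag_create += ["--snapshot-id", str(snapshot_id)]
    return [
        tag_create,
        ["paimon", "branch", "create", table, branch, "--tag", tag],
        # Run this last step only after the branch holds the desired changes.
        ["paimon", "branch", "fast-forward", table, branch],
    ]


commands = branch_from_tag_commands("mydb.users", "v1", "b1", snapshot_id=3)
```

Each inner list can be passed to subprocess.run once the previous step has succeeded.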