Roadmap #
Native Format IO #
Integrate native Parquet and ORC readers and writers.
Deletion Vectors (Merge On Write) #
- Primary Key Table Deletion Vectors Mode supports async compaction.
- Append Table supports DELETE & UPDATE with Deletion Vectors Mode (currently Spark SQL only).
- Optimize lookup performance on HDDs.
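As a rough illustration of the idea behind deletion vectors, the sketch below keeps a data file immutable and records deleted row positions on the side; readers skip those positions, and compaction later rewrites the file without them. This is a conceptual toy model only: the `DataFile` class and its methods are hypothetical names and do not correspond to Paimon's actual classes or on-disk format.

```python
from dataclasses import dataclass, field


@dataclass
class DataFile:
    """Toy model of an immutable data file plus its deletion vector (hypothetical)."""
    rows: list
    # Deletion vector: positions of rows that are logically deleted.
    deleted: set = field(default_factory=set)

    def delete_where(self, predicate):
        # Mark matching rows as deleted without rewriting the file.
        for pos, row in enumerate(self.rows):
            if predicate(row):
                self.deleted.add(pos)

    def scan(self):
        # Readers skip positions present in the deletion vector.
        return [row for pos, row in enumerate(self.rows) if pos not in self.deleted]

    def compact(self):
        # Compaction rewrites the file without deleted rows and drops the vector.
        return DataFile(self.scan())
```

In this model a delete only touches the small side structure, which is why compaction can be deferred or run asynchronously without affecting what readers see.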
Flink Lookup Join #
Support Flink Custom Data Distribution Lookup Join to enable lookup joins over large-scale data.
Produce Iceberg snapshots #
Introduce a mode to produce Iceberg snapshots.
Branch #
Make branches production-ready.
Changelog life cycle decouple #
Changelog life cycle decoupling supports tables whose changelog-producer is none.
Partition Mark Done #
Support partition mark done.
Default File Format #
- Default compression is ZSTD with level 1.
- Parquet supports filter push down.
- Parquet supports Arrow with row-type elements.
- Parquet becomes default file format.
Variant Type #
Support Variant Type with Spark 4.0 and Flink 2.0, unlocking support for semi-structured data.
Bucketed Join #
Support Bucketed Join with Spark SQL to reduce shuffling in joins.
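To show why bucketing can remove the shuffle, here is a minimal sketch under the assumption that both tables are already bucketed on the join key with the same bucket count and the same hash function. Matching keys then always land in the same bucket number, so each pair of buckets can be joined locally. The function names are hypothetical and this is not how Spark or Paimon implement the join.

```python
from collections import defaultdict


def bucket_of(key, num_buckets):
    # Same hash function and bucket count on both sides (assumed).
    return hash(key) % num_buckets


def bucketize(rows, key, num_buckets):
    # Stand-in for data that was written into buckets ahead of time.
    buckets = [[] for _ in range(num_buckets)]
    for row in rows:
        buckets[bucket_of(row[key], num_buckets)].append(row)
    return buckets


def bucketed_join(left_buckets, right_buckets, key):
    # Bucket i on the left can only match bucket i on the right,
    # so no rows need to be redistributed between buckets.
    if len(left_buckets) != len(right_buckets):
        raise ValueError("both sides must use the same number of buckets")
    joined = []
    for left, right in zip(left_buckets, right_buckets):
        lookup = defaultdict(list)
        for r in right:
            lookup[r[key]].append(r)
        for l in left:
            for r in lookup.get(l[key], []):
                joined.append({**l, **r})
    return joined
```

If the bucket counts or hash functions differ, the per-bucket pairing no longer holds and a regular shuffle join is needed.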
File Index #
Add more index types:
- Bitmap
- Inverted
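For readers unfamiliar with bitmap indexes, the following sketch shows the general technique: each distinct column value maps to a bitmap of the row positions that hold it, so equality and IN filters become bitwise operations. It uses plain Python integers as bitmaps and hypothetical function names; it says nothing about Paimon's actual index layout.

```python
def build_bitmap_index(values):
    """Map each distinct value to an integer whose set bits are row positions."""
    index = {}
    for pos, value in enumerate(values):
        index[value] = index.get(value, 0) | (1 << pos)
    return index


def positions(bitmap):
    """Decode a bitmap back into a sorted list of row positions."""
    result = []
    pos = 0
    while bitmap:
        if bitmap & 1:
            result.append(pos)
        bitmap >>= 1
        pos += 1
    return result


def lookup_in(index, wanted):
    """Rows matching `column IN wanted`: OR the per-value bitmaps together."""
    combined = 0
    for value in wanted:
        combined |= index.get(value, 0)
    return positions(combined)
```

A value that is absent from the index yields an empty bitmap, which lets a reader skip the file entirely for that filter.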
Column Family #
Support Column Family for very wide tables.
View & Function support #
Paimon Catalog supports views and functions.
Files Schema Evolution Ingestion #
Introduce file ingestion with schema evolution.
Foreign Key Join #
Explore a Foreign Key Join solution.