Apache Paimon

Apache Paimon (incubating) is a streaming data lake platform that supports high-speed data ingestion, change data tracking, and efficient real-time analytics.

Paimon offers the following core capabilities:

  • Unified Batch & Streaming: Paimon supports batch writes and batch reads, as well as streaming writes of changes and streaming reads of table changelogs.
  • Data Lake: As data lake storage, Paimon offers low cost, high reliability, and scalable metadata.
  • Merge Engines: Paimon supports a rich set of merge engines. By default, the last entry for each primary key is retained. You can also use the “partial-update” or “aggregation” engine.
  • Changelog Producers: Paimon supports a rich set of changelog producers, such as “lookup” and “full-compaction”. A correct changelog can simplify the construction of a streaming pipeline.
  • Append Only Tables: Paimon supports append-only tables, automatically compacts small files, and provides ordered stream reading. You can use such tables to replace message queues.
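To make the difference between the merge engines above concrete, here is a minimal conceptual sketch in Python. It does not use Paimon's API: the `merge` function, the column names, and the sample rows are hypothetical, and the sketch assumes that "partial-update" skips null fields and that "aggregation" sums numeric fields (the aggregate function is an assumption, not something this page specifies).

```python
from typing import Any, Dict, Iterable

Row = Dict[str, Any]

# Engine labels used by this toy model only.
ENGINES = ("deduplicate", "partial-update", "aggregation")


def merge(rows: Iterable[Row], key: str, engine: str = "deduplicate") -> Dict[Any, Row]:
    """Toy model: merge rows that share a primary key, in arrival order."""
    if engine not in ENGINES:
        raise ValueError(f"unknown engine: {engine}")
    table: Dict[Any, Row] = {}
    for row in rows:
        k = row[key]
        if k not in table or engine == "deduplicate":
            # First row for this key, or default behaviour:
            # the latest row replaces the previous one entirely.
            table[k] = dict(row)
        elif engine == "partial-update":
            # Only fields carrying a non-null value overwrite the stored row.
            for col, val in row.items():
                if val is not None:
                    table[k][col] = val
        else:  # "aggregation"
            # Assumed aggregate: sum every non-key, non-null field.
            for col, val in row.items():
                if col != key and val is not None:
                    table[k][col] = (table[k].get(col) or 0) + val
    return table


# Three hypothetical updates to the same primary key.
rows = [
    {"id": 1, "clicks": 2, "price": None},
    {"id": 1, "clicks": None, "price": 5},
    {"id": 1, "clicks": 3, "price": None},
]

latest = merge(rows, "id")                     # keeps only the last row
partial = merge(rows, "id", "partial-update")  # fills in fields over time
summed = merge(rows, "id", "aggregation")      # accumulates values
```

With these rows, the default engine keeps only the last update, "partial-update" assembles one complete row from several sparse ones, and "aggregation" accumulates values per field.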

Try Paimon

If you’re interested in trying Paimon, check out our quick start guide with Flink, Spark, or Hive. It provides a step-by-step introduction to the APIs and guides you through real applications.

Get Help with Paimon

If you get stuck, you can subscribe to the User Mailing List (user-subscribe@paimon.apache.org). Paimon tracks issues in GitHub and prefers to receive contributions as pull requests. You can also create an issue.