Read Performance

Primary Key Table

Primary key tables use a 'MergeOnRead' approach. When data is read, records from multiple LSM levels are merged, and read parallelism is limited by the number of buckets. Although Paimon's merge is efficient, reading a primary key table is still slower than reading an ordinary append-only table.
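To make the cost of merging concrete, here is a minimal sketch of a merge-on-read over sorted runs. It is a toy model, not Paimon's implementation: the record layout `(key, sequence_number, value)`, the sample data, and the function name `merge_on_read` are all assumptions made for illustration.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def merge_on_read(sorted_runs):
    """Merge runs sorted by (key, sequence_number); keep the newest record per key.

    Hypothetical helper for illustration only.
    """
    # k-way merge of the runs, ordered by key and then by sequence number
    merged = heapq.merge(*sorted_runs, key=itemgetter(0, 1))
    result = []
    for key, group in groupby(merged, key=itemgetter(0)):
        newest = None
        for newest in group:  # the last record in the group has the highest sequence number
            pass
        result.append((key, newest[2]))
    return result


# Hypothetical data: a higher sequence number means a newer write.
runs = [
    [(1, 1, "a-old"), (3, 2, "c")],  # older sorted run
    [(1, 5, "a-new"), (2, 4, "b")],  # newer sorted run
]
rows = merge_on_read(runs)
```

Every read has to walk all runs and resolve duplicate keys, which is work an append-only table never does.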

If you need faster queries in certain scenarios and can accept reading somewhat older data, you can:

  1. Configure 'compaction.optimization-interval' when writing data. For streaming jobs, optimized compaction is then performed periodically; for batch jobs, optimized compaction is carried out when the job ends.
  2. Query the read-optimized system table. Reading only the optimized files avoids merging records with the same key, which improves read performance.

This lets you flexibly balance query performance against data latency when reading.
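The trade-off can be illustrated with a small toy model. This is a sketch under stated assumptions, not Paimon's actual file layout or API: the `levels` dictionary, the sample records, and the functions `read_optimized` and `read_latest` are hypothetical. It assumes level 0 holds recent, uncompacted writes and the highest level holds the output of the last optimized compaction, with one record per key.

```python
# Hypothetical layout: level number -> list of (key, value) records.
levels = {
    0: [(1, "a-new"), (4, "d")],             # recent writes, not yet compacted
    2: [(1, "a-old"), (2, "b"), (3, "c")],   # top level, already merged
}


def read_optimized(levels):
    """Read only the top level: no merging needed, but recent writes are missing."""
    top = max(levels)
    return list(levels[top])


def read_latest(levels):
    """Merge all levels: up to date, but every read pays the merge cost."""
    merged = {}
    for level in sorted(levels, reverse=True):  # oldest level first
        for key, value in levels[level]:
            merged[key] = value                 # newer levels overwrite older ones
    return sorted(merged.items())


fast_but_stale = read_optimized(levels)
fresh_but_slower = read_latest(levels)
```

In this model the optimized read returns the old value for key 1 and does not see key 4 until the next optimized compaction runs, which is the data latency that the interval setting controls.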
