Overview #
Compatibility Matrix #
Engine | Version | Batch Read | Batch Write | Create Table | Alter Table | Streaming Write | Streaming Read | Batch Overwrite | DELETE & UPDATE | MERGE INTO | Time Travel |
---|---|---|---|---|---|---|---|---|---|---|---|
Flink | 1.15 - 1.20 | ✅ | ✅ | ✅ | ✅(1.17+) | ✅ | ✅ | ✅ | ✅(1.17+) | ❌ | ✅ |
Spark | 3.1 - 3.5 | ✅ | ✅(3.2+) | ✅ | ✅ | ✅(3.3+) | ✅(3.3+) | ✅(3.2+) | ✅(3.2+) | ✅(3.2+) | ✅(3.3+) |
Hive | 2.1 - 3.1 | ✅ | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ |
Trino | 420 - 439 | ✅ | ✅(427+) | ✅(427+) | ✅(427+) | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ |
Presto | 0.236 - 0.280 | ✅ | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ |
StarRocks | 3.1+ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ |
Doris | 2.0.6+ | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ |
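
An annotation such as ✅(3.2+) in the matrix means the feature is available only from that engine version onward, within the engine's overall supported range. As a minimal sketch of how to read one row, the Python snippet below transcribes the Spark row into a lookup table. The names `SPARK_RANGE`, `SPARK_MIN_VERSION`, `parse_version`, and `spark_supports` are hypothetical helpers made up for illustration; they are not part of Paimon or Spark.

```python
# Supported Spark range and per-feature minimum versions, transcribed by hand
# from the Spark row of the matrix above (illustrative only, not a Paimon API).
SPARK_RANGE = ((3, 1), (3, 5))
SPARK_MIN_VERSION = {
    "batch_read": (3, 1),
    "batch_write": (3, 2),
    "create_table": (3, 1),
    "alter_table": (3, 1),
    "streaming_write": (3, 3),
    "streaming_read": (3, 3),
    "batch_overwrite": (3, 2),
    "delete_update": (3, 2),
    "merge_into": (3, 2),
    "time_travel": (3, 3),
}


def parse_version(text):
    """Turn a version string such as '3.4.3' into (major, minor)."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


def spark_supports(feature, version):
    """Return True if the matrix lists `feature` as supported on this Spark version."""
    v = parse_version(version)
    low, high = SPARK_RANGE
    if not (low <= v <= high):
        return False  # outside the range covered by the matrix
    return v >= SPARK_MIN_VERSION[feature]


if __name__ == "__main__":
    print(spark_supports("merge_into", "3.1.2"))
    print(spark_supports("merge_into", "3.4.3"))
```

The same approach would apply to the other rows, with an unsupported feature (❌) represented as an entry that always returns False.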
Streaming Engines #
Flink Streaming #
Flink is the most comprehensive streaming compute engine and is widely used for CDC data ingestion and for building streaming pipelines.
The recommended version is Flink 1.17.2.
Spark Streaming #
You can also use Spark Streaming to build a streaming pipeline. Schema evolution is better supported in Spark, but you must accept its mini-batch execution model.
Batch Engines #
Spark Batch #
Spark Batch is the most widely used batch computing engine.
The recommended version is Spark 3.4.3.
Flink Batch #
Flink Batch is also available. Using it lets you unify streaming and batch processing in a single, more integrated pipeline.
OLAP Engines #
StarRocks #
StarRocks is the recommended OLAP engine and has the most advanced integration.
The recommended version is StarRocks 3.2.6.
Other OLAP #
You can also use Doris, Trino, or Presto. Alternatively, you can query Paimon tables directly with Spark, Flink, or Hive.