Go Integration
The Go integration is a binding built on top of Apache Paimon Rust, allowing you to access Paimon tables from Go programs. It uses the Arrow C Data Interface for zero-copy data transfer.
Prerequisites
- Go 1.22.4 or later
- Supported platforms: Linux (amd64, arm64), macOS (amd64, arm64)
Installation
Add the package to your module with go get github.com/apache/paimon-rust/bindings/go. The pre-built native library is embedded in the package and loaded automatically at runtime, so no manual build step is needed.
Creating a Catalog
Use NewCatalog with a map of options to create a catalog. The catalog type is determined by the metastore option (default: filesystem).
import (
"log"
paimon "github.com/apache/paimon-rust/bindings/go"
)
// Local filesystem
catalog, err := paimon.NewCatalog(map[string]string{
"warehouse": "/path/to/warehouse",
})
if err != nil {
log.Fatal(err)
}
defer catalog.Close()
Alibaba Cloud OSS
catalog, err := paimon.NewCatalog(map[string]string{
"warehouse": "oss://bucket/warehouse",
"fs.oss.accessKeyId": "your-access-key-id",
"fs.oss.accessKeySecret": "your-access-key-secret",
"fs.oss.endpoint": "oss-cn-hangzhou.aliyuncs.com",
})
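Hard-coding access keys in source is easy to leak. As a minimal sketch, the option map can be assembled from the environment instead. The helper name ossOptions and the environment variable names OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET are assumptions made for this illustration; only the option keys come from the example above.

```go
package main

import (
	"fmt"
	"os"
)

// ossOptions builds the catalog option map for an OSS warehouse.
// getenv is injected so the lookup can be replaced in tests; the
// environment variable names are hypothetical, not defined by Paimon.
func ossOptions(getenv func(string) string, warehouse, endpoint string) (map[string]string, error) {
	id := getenv("OSS_ACCESS_KEY_ID")
	secret := getenv("OSS_ACCESS_KEY_SECRET")
	if id == "" || secret == "" {
		return nil, fmt.Errorf("OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET must be set")
	}
	return map[string]string{
		"warehouse":              warehouse,
		"fs.oss.accessKeyId":     id,
		"fs.oss.accessKeySecret": secret,
		"fs.oss.endpoint":        endpoint,
	}, nil
}

func main() {
	opts, err := ossOptions(os.Getenv, "oss://bucket/warehouse", "oss-cn-hangzhou.aliyuncs.com")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// Print only the non-secret values.
	fmt.Println("warehouse:", opts["warehouse"])
	fmt.Println("endpoint:", opts["fs.oss.endpoint"])
}
```

The resulting map can be passed to paimon.NewCatalog in place of the literal map.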
REST Catalog
catalog, err := paimon.NewCatalog(map[string]string{
"metastore": "rest",
"uri": "http://localhost:8080",
"warehouse": "my_warehouse",
})
Reading a Table
Paimon Go uses a scan-then-read pattern: first scan the table to produce splits, then read data from those splits as Arrow RecordBatches.
import (
"errors"
"fmt"
"io"
"log"
paimon "github.com/apache/paimon-rust/bindings/go"
)
// Get a table from the catalog
table, err := catalog.GetTable(paimon.NewIdentifier("default", "my_table"))
if err != nil {
log.Fatal(err)
}
defer table.Close()
// Create a read builder
rb, err := table.NewReadBuilder()
if err != nil {
log.Fatal(err)
}
defer rb.Close()
// Step 1: Scan — produces a Plan containing DataSplits
scan, err := rb.NewScan()
if err != nil {
log.Fatal(err)
}
defer scan.Close()
plan, err := scan.Plan()
if err != nil {
log.Fatal(err)
}
defer plan.Close()
splits := plan.Splits()
// Step 2: Read — consumes splits and returns Arrow RecordBatches
read, err := rb.NewRead()
if err != nil {
log.Fatal(err)
}
defer read.Close()
reader, err := read.NewRecordBatchReader(splits)
if err != nil {
log.Fatal(err)
}
defer reader.Close()
for {
record, err := reader.NextRecord()
if errors.Is(err, io.EOF) {
break
}
if err != nil {
log.Fatal(err)
}
fmt.Println(record)
record.Release()
}
Column Projection
Use WithProjection to select specific columns. Only the requested columns are read, reducing I/O.
rb, err := table.NewReadBuilder()
if err != nil {
log.Fatal(err)
}
defer rb.Close()
// Only read the "id" and "name" columns
if err := rb.WithProjection([]string{"id", "name"}); err != nil {
log.Fatal(err)
}
// Continue with scan-then-read as above...
Filter Push-Down
Filter push-down prunes data at two levels:
- Scan planning — skips partitions, buckets, and data files based on file-level statistics (min/max).
- Read-side — applies row-level filtering via Parquet native row filters for leaf predicates.
Warning
Filter push-down is a best-effort optimization. The returned results may still contain rows that do not satisfy the filter condition. Callers should always apply residual filtering on the returned records to ensure correctness.
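To see why residual filtering is needed, consider pruning that works only on file-level statistics. The following is a simplified model in plain Go, not Paimon's implementation: the fileStats type and the helper functions are hypothetical, and the row-level Parquet filter is left out on purpose. A file is kept whenever its maximum could satisfy id > v, so every row of that file comes back, including rows that fail the predicate.

```go
package main

import "fmt"

// fileStats is a hypothetical stand-in for per-file min/max statistics.
type fileStats struct {
	name         string
	minID, maxID int64
	rows         []int64 // the "id" values stored in the file
}

// mayContainGreaterThan reports whether the file can hold any row with id > v.
// It consults only maxID, so a kept file may also hold non-matching rows.
func mayContainGreaterThan(f fileStats, v int64) bool {
	return f.maxID > v
}

// scanGreaterThan skips files by statistics and returns every row of the
// surviving files, mimicking a best-effort push-down.
func scanGreaterThan(files []fileStats, v int64) []int64 {
	var out []int64
	for _, f := range files {
		if !mayContainGreaterThan(f, v) {
			continue // the whole file is skipped
		}
		out = append(out, f.rows...)
	}
	return out
}

// residualGreaterThan re-applies the predicate to each returned row.
func residualGreaterThan(rows []int64, v int64) []int64 {
	var out []int64
	for _, id := range rows {
		if id > v {
			out = append(out, id)
		}
	}
	return out
}

func main() {
	files := []fileStats{
		{name: "f1", minID: 1, maxID: 3, rows: []int64{1, 2, 3}},
		{name: "f2", minID: 2, maxID: 9, rows: []int64{2, 5, 9}},
	}
	pushed := scanGreaterThan(files, 4)
	fmt.Println(pushed)                         // [2 5 9]: f1 is skipped, f2 is returned whole
	fmt.Println(residualGreaterThan(pushed, 4)) // [5 9]
}
```

In a real reader the same re-check would run over the Arrow columns of each returned record.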
Building Predicates
Create predicates through the PredicateBuilder obtained from a table:
pb := table.PredicateBuilder()
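// Each line below is a standalone example; in real code, declare pred once and check err before using it.
// Comparison predicates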
pred, err := pb.Eq("id", 1) // id = 1
pred, err := pb.NotEq("name", "bob") // name != "bob"
pred, err := pb.Lt("id", 3) // id < 3
pred, err := pb.Le("id", 2) // id <= 2
pred, err := pb.Gt("id", 1) // id > 1
pred, err := pb.Ge("id", 2) // id >= 2
// Null checks
pred, err := pb.IsNull("name") // name IS NULL
pred, err := pb.IsNotNull("name") // name IS NOT NULL
// IN / NOT IN
pred, err := pb.In("id", 1, 2, 3) // id IN (1, 2, 3)
pred, err := pb.NotIn("name", "x", "y") // name NOT IN ("x", "y")
Applying Filters
Pass a predicate to WithFilter on the ReadBuilder:
rb, err := table.NewReadBuilder()
if err != nil {
log.Fatal(err)
}
defer rb.Close()
pb := table.PredicateBuilder()
pred, err := pb.Eq("id", 1)
if err != nil {
log.Fatal(err)
}
// Ownership of pred is transferred — do NOT close it after this call
if err := rb.WithFilter(pred); err != nil {
log.Fatal(err)
}
// Continue with scan-then-read...
Compound Predicates
Combine predicates with And, Or, and Not. The predicate sub-package provides variadic helpers:
import (
paimon "github.com/apache/paimon-rust/bindings/go"
"github.com/apache/paimon-rust/bindings/go/predicate"
)
pb := table.PredicateBuilder()
// Error handling is omitted for brevity; check every err in real code.
p1, _ := pb.Ge("id", 1)
p2, _ := pb.Le("id", 3)
p3, _ := pb.Eq("name", "alice")
// id >= 1 AND id <= 3
combined, err := predicate.And(p1, p2)
// (id >= 1 AND id <= 3) OR name = "alice"
combined, err = predicate.Or(combined, p3)
// NOT (...)
negated, err := predicate.Not(combined)
Predicate Ownership
Predicates follow a move ownership model. After passing a predicate to WithFilter, And, Or, or Not, the predicate is consumed and must NOT be closed or reused by the caller.
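The move rule can be pictured with a small model in plain Go. This is a hypothetical illustration, not the binding's code: the pred type, the and function, and the choice to return an error on reuse are all assumptions. It shows only the intended discipline, namely that a predicate handed to a combinator is spent and a fresh one must be built for any further use.

```go
package main

import (
	"errors"
	"fmt"
)

// pred is a hypothetical model of a predicate handle, not the binding's type.
type pred struct {
	expr     string
	consumed bool
}

var errConsumed = errors.New("predicate already consumed")

// and models a consuming combinator: on success both inputs are marked
// consumed and must not be used or closed again by the caller.
func and(a, b *pred) (*pred, error) {
	if a.consumed || b.consumed {
		return nil, errConsumed
	}
	a.consumed, b.consumed = true, true
	return &pred{expr: "(" + a.expr + " AND " + b.expr + ")"}, nil
}

func main() {
	p1 := &pred{expr: "id >= 1"}
	p2 := &pred{expr: "id <= 3"}

	both, err := and(p1, p2)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(both.expr) // (id >= 1 AND id <= 3)

	// Reusing p1 violates the move rule; build a fresh predicate instead.
	_, err = and(p1, &pred{expr: "name = 'alice'"})
	fmt.Println(err) // predicate already consumed
}
```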
Supported Datum Types
Predicate values are automatically converted from Go types:
| Go Type | Paimon Type |
|---|---|
| bool | Bool |
| int8 | TinyInt |
| int16 | SmallInt |
| int32 | Int |
| int / int64 | Int or Long |
| float32 | Float |
| float64 | Double |
| string | String |
| paimon.Date | Date (epoch days) |
| paimon.Time | Time (millis) |
| paimon.Timestamp | Timestamp |
| paimon.LocalZonedTimestamp | LocalZonedTimestamp |
| paimon.Decimal | Decimal |
| paimon.Bytes | Binary |
For special types, use the dedicated constructors:
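// Each line is a standalone example; errors are ignored for brevity.
// Date as epoch days since 1970-01-01 (19000 is 2022-01-08)
pred, _ := pb.Eq("dt", paimon.Date(19000))
// Decimal 123.45 as DECIMAL(10,2): unscaled value 12345, precision 10, scale 2
pred, _ := pb.Eq("amount", paimon.NewDecimal(12345, 10, 2))
// Timestamp as milliseconds since the epoch plus a nanosecond remainder
pred, _ := pb.Eq("ts", paimon.Timestamp{Millis: 1700000000000, Nanos: 0})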
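The raw numbers in these constructors are easier to get right when derived from readable values. The helpers below are a standard-library sketch: epochDays, unscaledDecimal, and epochMillis are hypothetical names and not part of the binding. They compute the epoch-day count, the unscaled decimal integer, and the epoch milliseconds used in the examples above.

```go
package main

import (
	"fmt"
	"math/big"
	"time"
)

// epochDays returns the number of days from 1970-01-01 to the given
// calendar date, with both dates taken at midnight UTC.
func epochDays(year int, month time.Month, day int) int32 {
	t := time.Date(year, month, day, 0, 0, 0, 0, time.UTC)
	return int32(t.Unix() / 86400)
}

// unscaledDecimal parses a decimal string such as "123.45" and returns the
// unscaled integer for the given scale (12345 for scale 2).
func unscaledDecimal(s string, scale int) (int64, error) {
	r, ok := new(big.Rat).SetString(s)
	if !ok {
		return 0, fmt.Errorf("invalid decimal %q", s)
	}
	pow := new(big.Int).Exp(big.NewInt(10), big.NewInt(int64(scale)), nil)
	r.Mul(r, new(big.Rat).SetInt(pow))
	if !r.IsInt() {
		return 0, fmt.Errorf("%q has more than %d fractional digits", s, scale)
	}
	if !r.Num().IsInt64() {
		return 0, fmt.Errorf("%q does not fit in int64 at scale %d", s, scale)
	}
	return r.Num().Int64(), nil
}

// epochMillis parses an RFC 3339 timestamp and returns the milliseconds
// elapsed since the Unix epoch.
func epochMillis(s string) (int64, error) {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		return 0, err
	}
	return t.UnixMilli(), nil
}

func main() {
	fmt.Println(epochDays(2022, time.January, 8)) // 19000

	u, err := unscaledDecimal("123.45", 2)
	fmt.Println(u, err) // 12345 <nil>

	ms, err := epochMillis("2023-11-14T22:13:20Z")
	fmt.Println(ms, err) // 1700000000000 <nil>
}
```

The results would then be passed to paimon.Date, paimon.NewDecimal, and the Millis field of paimon.Timestamp. The exact parameter types of those constructors are not stated here, so an explicit conversion may be required.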
Resource Management
All Paimon objects (Catalog, Table, ReadBuilder, TableScan, Plan, TableRead, RecordBatchReader) hold native resources and must be closed when no longer needed. Use defer to ensure cleanup:
catalog, err := paimon.NewCatalog(opts)
if err != nil { log.Fatal(err) }
defer catalog.Close()
table, err := catalog.GetTable(id)
if err != nil { log.Fatal(err) }
defer table.Close()
// ... and so on for ReadBuilder, TableScan, Plan, TableRead, RecordBatchReader
All Close() methods are safe to call multiple times.
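Because repeated Close calls are safe, a resource can be released early while the deferred Close stays in place as a safety net for early returns. The sketch below models that guarantee with a hypothetical resource type built on sync.Once. It illustrates the calling pattern only and says nothing about how the binding implements it.

```go
package main

import (
	"fmt"
	"sync"
)

// resource is a hypothetical stand-in for a native-backed object.
type resource struct {
	name  string
	once  sync.Once
	freed int // counts how many times the release actually ran
}

// Close releases the resource at most once; later calls are no-ops.
func (r *resource) Close() {
	r.once.Do(func() { r.freed++ })
}

func main() {
	r := &resource{name: "reader"}
	defer r.Close() // safety net for early returns

	// ... use r ...

	r.Close() // release early, before the function ends
	fmt.Println("freed count:", r.freed) // freed count: 1
}
```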
Complete Example
package main
import (
"errors"
"fmt"
"io"
"log"
"github.com/apache/arrow-go/v18/arrow/array"
paimon "github.com/apache/paimon-rust/bindings/go"
)
func main() {
// 1. Open catalog and table
catalog, err := paimon.NewCatalog(map[string]string{
"warehouse": "/tmp/paimon-warehouse",
})
if err != nil {
log.Fatal(err)
}
defer catalog.Close()
table, err := catalog.GetTable(paimon.NewIdentifier("default", "my_table"))
if err != nil {
log.Fatal(err)
}
defer table.Close()
// 2. Configure read: projection + filter
rb, err := table.NewReadBuilder()
if err != nil {
log.Fatal(err)
}
defer rb.Close()
if err := rb.WithProjection([]string{"id", "name"}); err != nil {
log.Fatal(err)
}
pb := table.PredicateBuilder()
pred, err := pb.Gt("id", 0)
if err != nil {
log.Fatal(err)
}
if err := rb.WithFilter(pred); err != nil {
log.Fatal(err)
}
// 3. Scan
scan, err := rb.NewScan()
if err != nil {
log.Fatal(err)
}
defer scan.Close()
plan, err := scan.Plan()
if err != nil {
log.Fatal(err)
}
defer plan.Close()
// 4. Read
read, err := rb.NewRead()
if err != nil {
log.Fatal(err)
}
defer read.Close()
reader, err := read.NewRecordBatchReader(plan.Splits())
if err != nil {
log.Fatal(err)
}
defer reader.Close()
for {
record, err := reader.NextRecord()
if errors.Is(err, io.EOF) {
break
}
if err != nil {
log.Fatal(err)
}
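// Assumes id is an INT column and name is a STRING column.
idCol := record.Column(0).(*array.Int32)
nameCol := record.Column(1).(*array.String)
for i := 0; i < int(record.NumRows()); i++ {
// Push-down is best-effort, so re-check the filter (id > 0) on every row.
if idCol.IsNull(i) || idCol.Value(i) <= 0 {
continue
}
fmt.Printf("id=%d name=%s\n", idCol.Value(i), nameCol.Value(i))
}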
record.Release()
}
}