public class NestedColumnReader extends Object implements ColumnReader<WritableColumnVector>
Brief explanation of reading repetition and definition levels:

A repetition level of 0 marks the beginning of a new row. Any other value means that the data is added to the current row.

For example, given the following data:
repetition levels: 0,1,1,0,0,1,[0] (the last 0 is implicit; normally it is the end of the page)
values: a,b,c,d,e,f
the values form the rows (a, b, c), (d), (e, f).
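
The row-splitting rule for repetition levels can be sketched as follows. This is a minimal illustration, not the code used by NestedColumnReader: the class and method names are hypothetical, strings stand in for column values, and the first repetition level is assumed to be 0.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the documented API.
class RepetitionLevelSketch {

    /** Groups values into rows: a repetition level of 0 starts a new row. */
    static List<List<String>> groupByRepetitionLevel(int[] repetitionLevels, String[] values) {
        List<List<String>> rows = new ArrayList<>();
        for (int i = 0; i < values.length; i++) {
            if (repetitionLevels[i] == 0) {
                // Level 0: beginning of a new row.
                rows.add(new ArrayList<>());
            }
            // Any level (including 0): the value belongs to the current row.
            // Assumes the first level is 0, so a current row always exists.
            rows.get(rows.size() - 1).add(values[i]);
        }
        return rows;
    }

    public static void main(String[] args) {
        int[] levels = {0, 1, 1, 0, 0, 1};
        String[] values = {"a", "b", "c", "d", "e", "f"};
        // prints [[a, b, c], [d], [e, f]]
        System.out.println(groupByRepetitionLevel(levels, values));
    }
}
```

The implicit trailing 0 is not needed here because the end of the input closes the last row.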
Definition levels cover three situations:
- level = maxDefLevel means the value exists and is not null
- level = maxDefLevel - 1 means the value is null
- level < maxDefLevel - 1 means the value does not exist

For non-nullable (REQUIRED) fields, level = maxDefLevel - 1 also means that the value does not exist.
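
The three cases can be written as a small classification function. This is a sketch under stated assumptions: the `Slot` enum, the `classify` method, and the `required` flag are hypothetical names introduced for illustration and do not appear in the documented class.

```java
// Hypothetical helper for illustration only; not part of the documented API.
class DefinitionLevelSketch {

    enum Slot { VALUE, NULL, MISSING }

    /**
     * Maps a definition level to one of the three situations described above.
     * For REQUIRED (non-nullable) fields, maxDefLevel - 1 is treated as MISSING.
     */
    static Slot classify(int level, int maxDefLevel, boolean required) {
        if (level == maxDefLevel) {
            return Slot.VALUE;   // value exists and is not null
        }
        if (level == maxDefLevel - 1 && !required) {
            return Slot.NULL;    // value is null
        }
        return Slot.MISSING;     // value does not exist
    }

    public static void main(String[] args) {
        System.out.println(classify(2, 2, false)); // VALUE
        System.out.println(classify(1, 2, false)); // NULL
        System.out.println(classify(0, 2, false)); // MISSING
        System.out.println(classify(1, 2, true));  // MISSING
    }
}
```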
Quick example (maxDefLevel is 2): read 3 rows out of:
repetition levels: 0,1,0,1,1,0,0,...
definition levels: 2,1,0,2,1,2,...
values: a,b,c,d,e,f,...

Resulting buffer: a,n, ,d,n,f

The result is (a,n),(d,n),(f), where n means null.
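
The quick example can be reproduced by combining both kinds of levels. The sketch below is an assumption-laden illustration rather than the reader's actual implementation: the names are hypothetical, the field is assumed to be nullable, and the values are aligned positionally with the levels exactly as the example lists them (so `c` is present in the input but skipped because its definition level is 0).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the documented API.
class QuickExampleSketch {

    static List<List<String>> readRows(
            int rowsToRead, int maxDefLevel, int[] repLevels, int[] defLevels, String[] values) {
        List<List<String>> rows = new ArrayList<>();
        for (int i = 0; i < defLevels.length; i++) {
            if (repLevels[i] == 0) {
                if (rows.size() == rowsToRead) {
                    break; // the requested number of rows is already complete
                }
                rows.add(new ArrayList<>()); // level 0 starts a new row
            }
            List<String> current = rows.get(rows.size() - 1);
            if (defLevels[i] == maxDefLevel) {
                current.add(values[i]);      // value exists and is not null
            } else if (defLevels[i] == maxDefLevel - 1) {
                current.add(null);           // value is null
            }
            // Lower definition levels: the value does not exist, nothing is added.
        }
        return rows;
    }

    public static void main(String[] args) {
        int[] repLevels = {0, 1, 0, 1, 1, 0, 0};
        int[] defLevels = {2, 1, 0, 2, 1, 2};
        String[] values = {"a", "b", "c", "d", "e", "f"};
        // prints [[a, null], [d, null], [f]]
        System.out.println(readRows(3, 2, repLevels, defLevels, values));
    }
}
```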
Constructor and Description |
---|
NestedColumnReader(boolean isUtcTimestamp, org.apache.parquet.column.page.PageReadStore pages, ParquetField field) |
Modifier and Type | Method and Description |
---|---|
void | readToVector(int readNumber, WritableColumnVector vector) |
public NestedColumnReader(boolean isUtcTimestamp, org.apache.parquet.column.page.PageReadStore pages, ParquetField field)
public void readToVector(int readNumber, WritableColumnVector vector) throws IOException
Specified by:
readToVector in interface ColumnReader<WritableColumnVector>
Parameters:
readNumber - number to read.
vector - vector to write.
Throws:
IOException
Copyright © 2023–2024 The Apache Software Foundation. All rights reserved.