Modifier | Constructor and Description |
---|---|
protected | WriterOptions(Properties tableProperties, org.apache.hadoop.conf.Configuration conf) |
Modifier and Type | Method and Description |
---|---|
OrcFile.WriterOptions | blockPadding(boolean value) - Sets whether the HDFS blocks are padded to prevent stripes from straddling blocks. |
OrcFile.WriterOptions | blockSize(long value) - Sets the file system block size for the file. |
OrcFile.WriterOptions | bloomFilterColumns(String columns) - Sets the comma-separated list of column names for which bloom filters are created. |
OrcFile.WriterOptions | bloomFilterFpp(double fpp) - Sets the false positive probability for the bloom filters. |
OrcFile.WriterOptions | bloomFilterVersion(OrcFile.BloomFilterVersion version) - Sets the version of the bloom filters to write. |
OrcFile.WriterOptions | bufferSize(int value) - Sets the size of the memory buffers used for compressing and storing the stripe in memory. |
OrcFile.WriterOptions | callback(OrcFile.WriterCallback callback) - Adds a listener for when the stripe and file are about to be closed. |
OrcFile.WriterOptions | clone() |
OrcFile.WriterOptions | compress(CompressionKind value) - Sets the generic compression that is used to compress the data. |
OrcFile.WriterOptions | directEncodingColumns(String value) - Sets the comma-separated list of columns that should be direct encoded. |
OrcFile.WriterOptions | encodingStrategy(OrcFile.EncodingStrategy strategy) - Sets the encoding strategy that is used to encode the data. |
OrcFile.WriterOptions | encrypt(String value) - Encrypts a set of columns with a key. |
OrcFile.WriterOptions | enforceBufferSize() - Forces the writer to use the requested buffer size instead of estimating the buffer size from the stripe size and number of columns. |
OrcFile.WriterOptions | fileSystem(org.apache.hadoop.fs.FileSystem value) - Provides the file system for the path, if the client has it available. |
boolean | getBlockPadding() |
long | getBlockSize() |
String | getBloomFilterColumns() |
double | getBloomFilterFpp() |
OrcFile.BloomFilterVersion | getBloomFilterVersion() |
int | getBufferSize() |
OrcFile.WriterCallback | getCallback() |
CompressionKind | getCompress() |
OrcFile.CompressionStrategy | getCompressionStrategy() |
org.apache.hadoop.conf.Configuration | getConfiguration() |
String | getDirectEncodingColumns() |
OrcFile.EncodingStrategy | getEncodingStrategy() |
String | getEncryption() |
org.apache.hadoop.fs.FileSystem | getFileSystem() |
org.apache.orc.impl.HadoopShims | getHadoopShims() |
Map<String,org.apache.orc.impl.HadoopShims.KeyMetadata> | getKeyOverrides() |
org.apache.orc.impl.KeyProvider | getKeyProvider() |
String | getMasks() |
org.apache.orc.MemoryManager | getMemoryManager() |
boolean | getOverwrite() |
double | getPaddingTolerance() |
org.apache.orc.PhysicalWriter | getPhysicalWriter() |
boolean | getProlepticGregorian() |
int | getRowIndexStride() |
org.apache.orc.TypeDescription | getSchema() |
long | getStripeRowCountValue() |
long | getStripeSize() |
boolean | getUseUTCTimestamp() |
OrcFile.Version | getVersion() |
OrcFile.WriterVersion | getWriterVersion() |
boolean | getWriteVariableLengthBlocks() |
OrcFile.ZstdCompressOptions | getZstdCompressOptions() |
boolean | isBuildIndex() |
boolean | isEnforceBufferSize() |
OrcFile.WriterOptions | masks(String value) - Sets the masks for the unencrypted data. |
OrcFile.WriterOptions | memory(org.apache.orc.MemoryManager value) - A public option to set the memory manager. |
OrcFile.WriterOptions | overwrite(boolean value) - Sets whether an existing output file is overwritten. If this option is not provided, the write operation fails when the file already exists. |
OrcFile.WriterOptions | paddingTolerance(double value) - Sets the tolerance for block padding as a decimal fraction of the stripe size (for example, 0.05 means 5%). |
OrcFile.WriterOptions | physicalWriter(org.apache.orc.PhysicalWriter writer) - Changes the physical writer of the ORC file. |
OrcFile.WriterOptions | rowIndexStride(int value) - Sets the distance between entries in the row index. |
OrcFile.WriterOptions | setKeyProvider(org.apache.orc.impl.KeyProvider provider) - Sets the key provider for column encryption. |
OrcFile.WriterOptions | setKeyVersion(String keyName, int version, org.apache.orc.EncryptionAlgorithm algorithm) - For users that need to override the current version of a key, this method allows them to define the version and algorithm for a given key. |
OrcFile.WriterOptions | setProlepticGregorian(boolean newValue) - Sets whether the writer uses the proleptic Gregorian calendar for times and dates. |
OrcFile.WriterOptions | setSchema(org.apache.orc.TypeDescription schema) - Sets the schema for the file. |
OrcFile.WriterOptions | setShims(org.apache.orc.impl.HadoopShims value) - Sets the HadoopShims to use. |
OrcFile.WriterOptions | stripeSize(long value) - Sets the stripe size for the file. |
OrcFile.WriterOptions | useUTCTimestamp(boolean value) - Manually sets the time zone for the writer to UTC. |
OrcFile.WriterOptions | version(OrcFile.Version value) - Sets the version of the file that will be written. |
protected OrcFile.WriterOptions | writerVersion(OrcFile.WriterVersion version) - Manually sets the writer version. |
OrcFile.WriterOptions | writeVariableLengthBlocks(boolean value) - Sets whether the ORC file writer uses HDFS variable-length blocks, if they are available. |
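
The false positive probability passed to bloomFilterFpp trades filter size against accuracy. As a rough illustration, the sketch below applies the textbook Bloom filter sizing formulas to show how the bit count and hash-function count grow as the probability shrinks. The class and method names are hypothetical, the entry count of 10,000 is an arbitrary example value, and this is not necessarily how ORC sizes its filters internally.

```java
/** Textbook Bloom filter sizing; a hypothetical helper, not part of the ORC API. */
final class BloomSizingSketch {

    /** Optimal bit count m = -n * ln(p) / (ln 2)^2 for n entries and false positive probability p. */
    static long optimalNumBits(long expectedEntries, double fpp) {
        if (expectedEntries <= 0 || fpp <= 0.0 || fpp >= 1.0) {
            throw new IllegalArgumentException("need entries > 0 and 0 < fpp < 1");
        }
        double ln2 = Math.log(2.0);
        return (long) Math.ceil(-expectedEntries * Math.log(fpp) / (ln2 * ln2));
    }

    /** Optimal hash-function count k = (m / n) * ln 2, rounded and never below 1. */
    static int optimalNumHashFunctions(long expectedEntries, long numBits) {
        return Math.max(1, (int) Math.round((double) numBits / expectedEntries * Math.log(2.0)));
    }

    public static void main(String[] args) {
        long entries = 10_000; // arbitrary example value
        for (double fpp : new double[] {0.05, 0.01}) {
            long bits = optimalNumBits(entries, fpp);
            int hashes = optimalNumHashFunctions(entries, bits);
            System.out.printf("fpp=%.2f -> %d bits, %d hash functions%n", fpp, bits, hashes);
        }
    }
}
```

A lower probability costs more bits per entry, so it is worth enabling bloom filters only on the columns that are actually used in equality predicates.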
protected WriterOptions(Properties tableProperties, org.apache.hadoop.conf.Configuration conf)
public OrcFile.WriterOptions clone()
public OrcFile.WriterOptions fileSystem(org.apache.hadoop.fs.FileSystem value)
public OrcFile.WriterOptions overwrite(boolean value)
public OrcFile.WriterOptions stripeSize(long value)
public OrcFile.WriterOptions blockSize(long value)
public OrcFile.WriterOptions rowIndexStride(int value)
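
As a small worked example of what the stride means: if one row index entry covers rowIndexStride rows, the number of entries for a stripe is its row count divided by the stride, rounded up. The helper name and the figures below are hypothetical and chosen only for illustration.

```java
/** Hypothetical helper showing the arithmetic behind a row index stride. */
final class RowIndexStrideSketch {

    /** Number of index entries needed to cover rowsInStripe rows, one entry per stride rows. */
    static long indexEntries(long rowsInStripe, int stride) {
        if (rowsInStripe < 0 || stride <= 0) {
            throw new IllegalArgumentException("rows must be >= 0 and stride > 0");
        }
        return (rowsInStripe + stride - 1) / stride; // ceiling division
    }

    public static void main(String[] args) {
        // 25,000 rows with a stride of 10,000 need three entries: 10,000 + 10,000 + 5,000.
        System.out.println(indexEntries(25_000, 10_000)); // prints 3
    }
}
```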
public OrcFile.WriterOptions bufferSize(int value)
public OrcFile.WriterOptions enforceBufferSize()
public OrcFile.WriterOptions blockPadding(boolean value)
public OrcFile.WriterOptions encodingStrategy(OrcFile.EncodingStrategy strategy)
public OrcFile.WriterOptions paddingTolerance(double value)
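
To make the tolerance concrete: it is expressed relative to the stripe size, so multiplying the two gives a byte budget for padding. The sketch below computes that budget and applies a deliberately simplified decision rule, padding to the next block boundary only when the stripe would straddle it and the gap fits the budget. The class, the method names, and the sizes are hypothetical, and the rule is an illustration rather than the writer's actual algorithm.

```java
/** Simplified illustration of block padding; not the ORC writer's actual algorithm. */
final class BlockPaddingSketch {

    /** Largest padding allowed in bytes: tolerance (a fraction such as 0.05) times stripe size. */
    static long maxPaddingBytes(long stripeSize, double paddingTolerance) {
        return (long) (stripeSize * paddingTolerance);
    }

    /** Bytes left in the current block when the next stripe would start at offset. */
    static long remainingInBlock(long offset, long blockSize) {
        return blockSize - (offset % blockSize);
    }

    /**
     * Hypothetical rule: pad to the block boundary only if the stripe would straddle it
     * and the padding stays within the tolerance budget.
     */
    static boolean shouldPad(long offset, long blockSize, long stripeSize, double paddingTolerance) {
        long remaining = remainingInBlock(offset, blockSize);
        boolean straddles = stripeSize > remaining;
        return straddles && remaining <= maxPaddingBytes(stripeSize, paddingTolerance);
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        long stripe = 64 * mb;  // example stripe size
        long block = 256 * mb;  // example block size
        System.out.println(maxPaddingBytes(stripe, 0.05));            // padding budget in bytes
        System.out.println(shouldPad(254 * mb, block, stripe, 0.05)); // 2 MB gap, within the budget
        System.out.println(shouldPad(200 * mb, block, stripe, 0.05)); // 56 MB gap, too large to pad
    }
}
```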
public OrcFile.WriterOptions bloomFilterColumns(String columns)
public OrcFile.WriterOptions bloomFilterFpp(double fpp)
fpp - false positive probability
public OrcFile.WriterOptions compress(CompressionKind value)
public OrcFile.WriterOptions setSchema(org.apache.orc.TypeDescription schema)
schema - the schema for the file.
public OrcFile.WriterOptions version(OrcFile.Version value)
public OrcFile.WriterOptions callback(OrcFile.WriterCallback callback)
callback - the object to be called when the stripe is closed
public OrcFile.WriterOptions bloomFilterVersion(OrcFile.BloomFilterVersion version)
public OrcFile.WriterOptions physicalWriter(org.apache.orc.PhysicalWriter writer)
SHOULD ONLY BE USED BY LLAP.
writer - the writer to control the layout and persistence
public OrcFile.WriterOptions memory(org.apache.orc.MemoryManager value)
public OrcFile.WriterOptions writeVariableLengthBlocks(boolean value)
value - the new value
public OrcFile.WriterOptions setShims(org.apache.orc.impl.HadoopShims value)
value - the new value
protected OrcFile.WriterOptions writerVersion(OrcFile.WriterVersion version)
version - the version to write
public OrcFile.WriterOptions useUTCTimestamp(boolean value)
public OrcFile.WriterOptions directEncodingColumns(String value)
value - the value to set
public OrcFile.WriterOptions encrypt(String value)
Format of the string is a key-list.
value - a key-list of which columns to encrypt
public OrcFile.WriterOptions masks(String value)
Format of the string is a mask-list.
value - a list of the masks and column names
public OrcFile.WriterOptions setKeyVersion(String keyName, int version, org.apache.orc.EncryptionAlgorithm algorithm)
This will mostly be used for ORC file merging where the writer has to use the same version of the key that the original files used.
keyName - the key name
version - the version of the key to use
algorithm - the algorithm for the given key version
public OrcFile.WriterOptions setKeyProvider(org.apache.orc.impl.KeyProvider provider)
provider - the object that holds the master secrets
public OrcFile.WriterOptions setProlepticGregorian(boolean newValue)
newValue - true if we should use the proleptic calendar
public org.apache.orc.impl.KeyProvider getKeyProvider()
public boolean getBlockPadding()
public long getBlockSize()
public String getBloomFilterColumns()
public boolean getOverwrite()
public org.apache.hadoop.fs.FileSystem getFileSystem()
public org.apache.hadoop.conf.Configuration getConfiguration()
public org.apache.orc.TypeDescription getSchema()
public long getStripeSize()
public long getStripeRowCountValue()
public CompressionKind getCompress()
public OrcFile.WriterCallback getCallback()
public OrcFile.Version getVersion()
public org.apache.orc.MemoryManager getMemoryManager()
public int getBufferSize()
public boolean isEnforceBufferSize()
public int getRowIndexStride()
public boolean isBuildIndex()
public OrcFile.CompressionStrategy getCompressionStrategy()
public OrcFile.EncodingStrategy getEncodingStrategy()
public OrcFile.ZstdCompressOptions getZstdCompressOptions()
public double getPaddingTolerance()
public double getBloomFilterFpp()
public OrcFile.BloomFilterVersion getBloomFilterVersion()
public org.apache.orc.PhysicalWriter getPhysicalWriter()
public OrcFile.WriterVersion getWriterVersion()
public boolean getWriteVariableLengthBlocks()
public org.apache.orc.impl.HadoopShims getHadoopShims()
public boolean getUseUTCTimestamp()
public String getDirectEncodingColumns()
public String getEncryption()
public String getMasks()
public boolean getProlepticGregorian()
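
Every setter listed above returns OrcFile.WriterOptions, which allows calls to be chained, and the getters read the configured values back. The stand-alone sketch below mimics that shape with a hypothetical SketchOptions class (only three of the options, with made-up defaults) so the pattern can be run without ORC or Hadoop on the classpath. It illustrates the fluent style and copying via clone(); it is not the library's implementation.

```java
/** Hypothetical miniature of a fluent options object; not the real OrcFile.WriterOptions. */
final class SketchOptions implements Cloneable {
    private long stripeSize = 64L * 1024 * 1024; // made-up default for the sketch
    private boolean blockPadding = true;         // made-up default for the sketch
    private String bloomFilterColumns;           // comma-separated column names, or null

    // Each setter returns this, so calls can be chained.
    SketchOptions stripeSize(long value) { this.stripeSize = value; return this; }
    SketchOptions blockPadding(boolean value) { this.blockPadding = value; return this; }
    SketchOptions bloomFilterColumns(String columns) { this.bloomFilterColumns = columns; return this; }

    long getStripeSize() { return stripeSize; }
    boolean getBlockPadding() { return blockPadding; }
    String getBloomFilterColumns() { return bloomFilterColumns; }

    /** Shallow copy; sufficient here because all fields are primitives or immutable strings. */
    @Override
    public SketchOptions clone() {
        try {
            return (SketchOptions) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError("cannot happen: class implements Cloneable", e);
        }
    }

    public static void main(String[] args) {
        SketchOptions base = new SketchOptions()
            .stripeSize(128L * 1024 * 1024)
            .bloomFilterColumns("id,name");
        // Changing the copy leaves the original untouched.
        SketchOptions copy = base.clone().blockPadding(false);
        System.out.println(base.getBlockPadding() + " " + copy.getBlockPadding()); // prints "true false"
    }
}
```

Cloning a shared base configuration before adjusting it per file keeps one writer's settings from leaking into another's.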
Copyright © 2023–2024 The Apache Software Foundation. All rights reserved.