java.lang.Object
org.apache.arrow.vector.ipc.ArrowReader
- All Implemented Interfaces:
AutoCloseable, DictionaryProvider
- Direct Known Subclasses:
ArrowFileReader, ArrowStreamReader
Abstract class to read Schema and ArrowRecordBatches.
-
Nested Class Summary
Nested classes/interfaces inherited from interface org.apache.arrow.vector.dictionary.DictionaryProvider
DictionaryProvider.MapDictionaryProvider
-
Field Summary
-
Constructor Summary
protected ArrowReader(BufferAllocator allocator)
protected ArrowReader(BufferAllocator allocator, CompressionCodec.Factory compressionFactory)
-
Method Summary
abstract long bytesRead()
    Return the number of bytes read from the ReadChannel.
void close()
    Close resources, including vector schema root and dictionary vectors, and the underlying read source.
void close(boolean closeReadSource)
    Close resources, including vector schema root and dictionary vectors.
protected abstract void closeReadSource()
    Close the underlying read source.
protected void ensureInitialized()
    Initialize if not done previously.
Set<Long> getDictionaryIds()
    Get all dictionary IDs.
Map<Long, Dictionary> getDictionaryVectors()
    Returns any dictionaries that were loaded along with ArrowRecordBatches.
VectorSchemaRoot getVectorSchemaRoot()
    Returns the vector schema root.
protected void initialize()
    Reads the schema and initializes the vectors.
protected void loadDictionary(ArrowDictionaryBatch dictionaryBatch)
    Load an ArrowDictionaryBatch into the reader's dictionary vectors.
abstract boolean loadNextBatch()
    Load the next ArrowRecordBatch into the vector schema root if available.
protected void loadRecordBatch(ArrowRecordBatch batch)
    Load an ArrowRecordBatch into the reader's VectorSchemaRoot.
Dictionary lookup(long id)
    Look up a loaded dictionary by its dictionary ID.
protected void prepareLoadNextBatch()
    Ensure the reader has been initialized and reset the VectorSchemaRoot row count to 0.
protected abstract Schema readSchema()
    Read the Schema from the source. Invoked at the beginning of initialization.
-
Field Details
-
allocator
-
dictionaries
-
-
Constructor Details
-
ArrowReader
-
ArrowReader
-
-
Method Details
-
getVectorSchemaRoot
Returns the vector schema root. This will be loaded with new values on every call to loadNextBatch.
- Returns:
the vector schema root
- Throws:
IOException - if reading the schema fails
-
getDictionaryVectors
Returns any dictionaries that were loaded along with ArrowRecordBatches.
- Returns:
Map of dictionary ID to dictionary, empty if no dictionaries were loaded
- Throws:
IOException - if reading the schema fails
-
lookup
Look up a loaded dictionary by its dictionary ID.
- Specified by:
lookup in interface DictionaryProvider
- Parameters:
id - Unique identifier for a dictionary
- Returns:
the requested dictionary, or null if not found
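To illustrate what a lookup by dictionary ID is used for, here is a minimal sketch of decoding a dictionary-encoded column. It uses only the JDK: `SketchProvider` and `decode` are hypothetical stand-ins, and plain lists of strings take the place of Arrow's dictionary vectors. It is not the Arrow implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DictionaryLookupSketch {

  /** Hypothetical stand-in for a DictionaryProvider: dictionary ID to values. */
  static class SketchProvider {
    private final Map<Long, List<String>> dictionaries = new HashMap<>();

    void put(long id, List<String> values) {
      dictionaries.put(id, values);
    }

    /** Returns the dictionary registered under the ID, or null if not found. */
    List<String> lookup(long id) {
      return dictionaries.get(id);
    }
  }

  /** Replaces each index with the value it refers to in the dictionary. */
  static List<String> decode(SketchProvider provider, long id, int[] indices) {
    List<String> dictionary = provider.lookup(id);
    if (dictionary == null) {
      throw new IllegalArgumentException("No dictionary loaded for id " + id);
    }
    List<String> decoded = new ArrayList<>();
    for (int index : indices) {
      decoded.add(dictionary.get(index));
    }
    return decoded;
  }

  public static void main(String[] args) {
    SketchProvider provider = new SketchProvider();
    provider.put(1L, Arrays.asList("red", "green", "blue"));
    System.out.println(decode(provider, 1L, new int[] {2, 0, 0, 1}));
  }
}
```

Because a missing ID yields null rather than an exception, callers should check the result before using it, as `decode` does here.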
-
getDictionaryIds
Description copied from interface: DictionaryProvider
Get all dictionary IDs.
- Specified by:
getDictionaryIds in interface DictionaryProvider
-
loadNextBatch
Load the next ArrowRecordBatch into the vector schema root if available.
- Returns:
true if a batch was read, false at end of stream
- Throws:
IOException - on error
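A typical read loop might look like the following sketch. It assumes the Arrow Java `arrow-vector` module and an `arrow-memory` implementation are on the classpath, and it uses the concrete subclass ArrowStreamReader. The class name `ReadLoopSketch`, the helper methods, and the column name `values` are illustrative only. To stay self-contained, the sketch first writes a small in-memory stream with ArrowStreamWriter and then reads it back.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.IntVector;
import org.apache.arrow.vector.VectorSchemaRoot;
import org.apache.arrow.vector.ipc.ArrowStreamReader;
import org.apache.arrow.vector.ipc.ArrowStreamWriter;

public class ReadLoopSketch {

  /** Writes two batches of three ints each (0..5) to an in-memory Arrow stream. */
  static byte[] writeSampleStream(BufferAllocator allocator) throws IOException {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    IntVector values = new IntVector("values", allocator);
    try (VectorSchemaRoot root = VectorSchemaRoot.of(values);
         // No dictionary-encoded fields here, so no DictionaryProvider is passed.
         ArrowStreamWriter writer = new ArrowStreamWriter(root, null, out)) {
      writer.start();
      for (int batch = 0; batch < 2; batch++) {
        values.allocateNew(3);
        for (int i = 0; i < 3; i++) {
          values.setSafe(i, batch * 3 + i);
        }
        root.setRowCount(3);
        writer.writeBatch();
      }
      writer.end();
    }
    return out.toByteArray();
  }

  /** Reads every batch from the stream and sums the "values" column. */
  static int sumAllValues(byte[] stream, BufferAllocator allocator) throws IOException {
    int sum = 0;
    try (ArrowStreamReader reader =
             new ArrowStreamReader(new ByteArrayInputStream(stream), allocator)) {
      // The same root is refilled by each loadNextBatch() call.
      VectorSchemaRoot root = reader.getVectorSchemaRoot();
      while (reader.loadNextBatch()) {
        IntVector values = (IntVector) root.getVector("values");
        for (int i = 0; i < root.getRowCount(); i++) {
          sum += values.get(i);
        }
      }
    }
    return sum;
  }

  public static void main(String[] args) throws IOException {
    try (BufferAllocator allocator = new RootAllocator()) {
      byte[] stream = writeSampleStream(allocator);
      System.out.println(sumAllValues(stream, allocator));
    }
  }
}
```

Since the vector schema root is reused between batches, any values needed after the next `loadNextBatch()` call have to be copied out first.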
-
bytesRead
public abstract long bytesRead()
Return the number of bytes read from the ReadChannel.
- Returns:
number of bytes read
-
close
Close resources, including vector schema root and dictionary vectors, and the underlying read source.
- Specified by:
close in interface AutoCloseable
- Throws:
IOException - on error
-
close
Close resources, including vector schema root and dictionary vectors. If the flag closeReadSource is true, the underlying read source is closed as well; otherwise it is left open.
- Parameters:
closeReadSource - Flag controlling whether the underlying read source is closed
- Throws:
IOException - on error
-
closeReadSource
Close the underlying read source.
- Throws:
IOException - on error
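The relationship between close(), close(boolean closeReadSource), and closeReadSource() can be sketched with a simplified, JDK-only stand-in. `SketchReader` and its boolean fields are hypothetical: the `buffersReleased` flag stands in for closing the vector schema root and dictionary vectors. This is an illustration of the contract described above, not Arrow's code.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class CloseSketch {

  /** Hypothetical stand-in for a reader that owns buffers and wraps a source. */
  static class SketchReader implements AutoCloseable {
    private final InputStream source;
    boolean buffersReleased = false;
    boolean sourceClosed = false;

    SketchReader(InputStream source) {
      this.source = source;
    }

    /** Releases the reader's own resources and the underlying source. */
    @Override
    public void close() throws IOException {
      close(true);
    }

    /** Releases the reader's own resources; closes the source only if asked. */
    public void close(boolean closeReadSource) throws IOException {
      buffersReleased = true; // stand-in for closing vectors and dictionaries
      if (closeReadSource) {
        closeReadSource();
      }
    }

    /** Closes the underlying source. */
    protected void closeReadSource() throws IOException {
      source.close();
      sourceClosed = true;
    }
  }

  /** Closes a reader with close(false); returns {buffersReleased, sourceClosed}. */
  static boolean[] closeKeepingSource() throws IOException {
    SketchReader reader = new SketchReader(new ByteArrayInputStream(new byte[] {1, 2, 3}));
    reader.close(false);
    return new boolean[] {reader.buffersReleased, reader.sourceClosed};
  }

  public static void main(String[] args) throws IOException {
    boolean[] state = closeKeepingSource();
    System.out.println("buffers released: " + state[0] + ", source closed: " + state[1]);
  }
}
```

Passing false is useful when the caller owns the source and wants to keep using it after the reader's buffers have been released.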
-
readSchema
Read the Schema from the source. Invoked at the beginning of initialization.
- Returns:
the Schema that was read
- Throws:
IOException - on error
-
ensureInitialized
Initialize if not done previously.
- Throws:
IOException - on error
-
initialize
Reads the schema and initializes the vectors.
- Throws:
IOException
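The interplay of readSchema, initialize, and ensureInitialized follows a lazy-initialization pattern, which can be sketched as below. This is a simplified, JDK-only stand-in: `LazyReader` is hypothetical, a String takes the place of Arrow's Schema type, and the sample schema text is made up for illustration.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

public class LazyInitSketch {

  /** Hypothetical stand-in for a reader that initializes itself on first use. */
  abstract static class LazyReader {
    private boolean initialized = false;
    private String schema; // stand-in for Arrow's Schema type

    /** Subclasses read the schema from their own source. */
    protected abstract String readSchema() throws IOException;

    /** Reads the schema; a real reader would also set up its vectors here. */
    protected void initialize() throws IOException {
      schema = readSchema();
    }

    /** Runs initialize() on the first call only. */
    protected void ensureInitialized() throws IOException {
      if (!initialized) {
        initialize();
        initialized = true;
      }
    }

    /** Public accessor that triggers initialization when needed. */
    public String getSchema() throws IOException {
      ensureInitialized();
      return schema;
    }
  }

  /** Counts how often readSchema() runs across three getSchema() calls. */
  static int schemaReadsAfterThreeCalls() throws IOException {
    AtomicInteger reads = new AtomicInteger();
    LazyReader reader = new LazyReader() {
      @Override
      protected String readSchema() {
        reads.incrementAndGet();
        return "values: int32"; // made-up schema text
      }
    };
    reader.getSchema();
    reader.getSchema();
    reader.getSchema();
    return reads.get();
  }

  public static void main(String[] args) throws IOException {
    System.out.println(schemaReadsAfterThreeCalls());
  }
}
```

With this structure, constructing a reader does no I/O; the schema is read the first time a public accessor needs it, which is consistent with getVectorSchemaRoot declaring an IOException for schema read failures.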
-
prepareLoadNextBatch
Ensure the reader has been initialized and reset the VectorSchemaRoot row count to 0.
- Throws:
IOException - on error
-
loadRecordBatch
Load an ArrowRecordBatch into the reader's VectorSchemaRoot.
- Parameters:
batch - the record batch to load
-
loadDictionary
Load an ArrowDictionaryBatch into the reader's dictionary vectors.
- Parameters:
dictionaryBatch - the dictionary batch to load
-