Module org.apache.arrow.algorithm
Class LinearDictionaryEncoder<E extends BaseIntVector,D extends ValueVector>
java.lang.Object
org.apache.arrow.algorithm.dictionary.LinearDictionaryEncoder<E,D>
- Type Parameters:
E
- encoded vector type.
D
- decoded vector type, which is also the dictionary type.
- All Implemented Interfaces:
DictionaryEncoder<E,D>
public class LinearDictionaryEncoder<E extends BaseIntVector,D extends ValueVector>
extends Object
implements DictionaryEncoder<E,D>
Dictionary encoder based on linear search.
-
Constructor Summary
Constructor
Description
LinearDictionaryEncoder(D dictionary)
Constructs a dictionary encoder, with the encode null flag set to false.
LinearDictionaryEncoder(D dictionary, boolean encodeNull)
Constructs a dictionary encoder.
-
Method Summary
-
Constructor Details
-
LinearDictionaryEncoder
Constructs a dictionary encoder, with the encode null flag set to false.
- Parameters:
dictionary
- the dictionary. Its entries should be sorted in non-increasing order of frequency. Otherwise, the encoder still produces correct results, but more slowly.
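One way to obtain the recommended ordering is to count how often each value occurs in a representative sample and sort the distinct values by that count. The sketch below is a minimal illustration in plain Java collections, not Arrow code: the class name `FrequencyOrderSketch` and the method `byDescendingFrequency` are hypothetical, and copying the ordered values into an actual dictionary vector is left out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Arrow: orders distinct values so that
// the most frequent value comes first (non-increasing frequency).
final class FrequencyOrderSketch {

  /** Returns the distinct values of {@code sample}, most frequent first. */
  static List<String> byDescendingFrequency(List<String> sample) {
    // LinkedHashMap keeps first-seen order, which makes tie-breaking predictable.
    Map<String, Integer> counts = new LinkedHashMap<>();
    for (String value : sample) {
      counts.merge(value, 1, Integer::sum);
    }
    List<String> ordered = new ArrayList<>(counts.keySet());
    // List.sort is stable, so values with equal counts keep first-seen order.
    ordered.sort(Comparator.comparing((String v) -> counts.get(v)).reversed());
    return ordered;
  }
}
```

For the sample `b, a, a, c, a, b` this yields the order `a, b, c`, which would then be written to the dictionary vector in that order.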
-
LinearDictionaryEncoder
Constructs a dictionary encoder.
- Parameters:
dictionary
- the dictionary. Its entries should be sorted in non-increasing order of frequency. Otherwise, the encoder still produces correct results, but more slowly.
encodeNull
- a flag indicating whether null values should be encoded. It determines how null values in the input are processed during encoding. When a null is encountered in the input: 1) if the flag is set to true, the encoder searches for the null value in the dictionary and outputs its index in the dictionary; 2) if the flag is set to false, the encoder simply produces a null in the output.
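The two null-handling behaviors can be illustrated with a small model that uses arrays in place of Arrow vectors. This is a sketch under stated assumptions, not the library's implementation: the class `EncodeNullSketch` and its methods are hypothetical, a `null` slot in the returned `Integer[]` stands for a null in the output vector, and throwing when a value is missing from the dictionary is an assumption of this sketch.

```java
import java.util.Objects;

// Hypothetical model of the encodeNull flag, using arrays instead of vectors.
final class EncodeNullSketch {

  /**
   * Encodes {@code input} against {@code dictionary} by linear search.
   * A null element in the result represents a null in the output vector.
   */
  static Integer[] encode(String[] dictionary, String[] input, boolean encodeNull) {
    Integer[] output = new Integer[input.length];
    for (int i = 0; i < input.length; i++) {
      if (input[i] == null && !encodeNull) {
        continue; // flag is false: leave the output slot null
      }
      // flag is true, or the value is non-null: look it up like any other value
      output[i] = indexOf(dictionary, input[i]);
    }
    return output;
  }

  /** Returns the index of the first dictionary entry equal to {@code value}. */
  static int indexOf(String[] dictionary, String value) {
    for (int j = 0; j < dictionary.length; j++) {
      if (Objects.equals(dictionary[j], value)) {
        return j;
      }
    }
    // Assumption of this sketch: a value absent from the dictionary is an error.
    throw new IllegalArgumentException("Value not in dictionary: " + value);
  }
}
```

With the dictionary `{"a", null, "b"}` and the input `{"b", null, "a"}`, the flag set to true gives the indices `2, 1, 0`, while the flag set to false gives `2, null, 0`. Note that encoding nulls only works in this model if the dictionary itself contains a null entry.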
-
-
Method Details
-
encode
Encodes an input vector by linear search. When the dictionary is sorted in non-increasing order of entry frequency, each lookup is expected to take roughly constant time, and no extra memory is required.
- Specified by:
encode in interface DictionaryEncoder<E extends BaseIntVector,D extends ValueVector>
- Parameters:
input
- the input vector.
output
- the output vector. It must be in a fresh state; at a minimum, all of its validity bits should be clear.
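To see why the dictionary ordering matters for a linear search, the following sketch counts how many dictionary entries are compared while encoding. It is an illustrative model over `int` arrays rather than Arrow's code: the class `LinearSearchCostSketch` and its `encode` method are hypothetical names, and rejecting values that are not in the dictionary is an assumption of this sketch.

```java
// Hypothetical model, not Arrow's implementation: linear-search encoding
// that also reports how many dictionary comparisons were performed.
final class LinearSearchCostSketch {

  /**
   * Writes the dictionary index of each input value into {@code output}
   * and returns the total number of comparisons made.
   */
  static long encode(int[] dictionary, int[] input, int[] output) {
    long comparisons = 0;
    for (int i = 0; i < input.length; i++) {
      int index = -1;
      // Scan from the front; frequent values placed early are found quickly.
      for (int j = 0; j < dictionary.length; j++) {
        comparisons++;
        if (dictionary[j] == input[i]) {
          index = j;
          break;
        }
      }
      if (index < 0) {
        // Assumption of this sketch: values outside the dictionary are rejected.
        throw new IllegalArgumentException("Value not in dictionary: " + input[i]);
      }
      output[i] = index;
    }
    return comparisons;
  }
}
```

For the input `{7, 7, 7, 7, 9, 7, 3, 7}`, the frequency-sorted dictionary `{7, 9, 3}` needs 11 comparisons, whereas the reverse ordering `{3, 9, 7}` needs 21. Both orderings produce correct indices; only the search cost differs, and no auxiliary structure such as a hash table is allocated.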
-