pyarrow.dataset.ParquetFileFormat#
- class pyarrow.dataset.ParquetFileFormat(read_options=None, default_fragment_scan_options=None, **kwargs)#
Bases: FileFormat
FileFormat for Parquet.
- Parameters:
- read_options
ParquetReadOptions
Read options for the file.
- default_fragment_scan_options
ParquetFragmentScanOptions
Scan Options for the file.
- **kwargs
dict
Additional options, forwarded to the read options or the scan options.
- __init__(*args, **kwargs)#
Methods
- __init__(*args, **kwargs)
- equals(self, ParquetFileFormat other)
- inspect(self, file[, filesystem]): Infer the schema of a file.
- make_fragment(self, file[, filesystem, ...]): Make a FileFragment from a given file.
- make_write_options(self, **kwargs)
Attributes
- default_extname#
- default_fragment_scan_options#
- equals(self, ParquetFileFormat other)#
- Parameters:
- other
pyarrow.dataset.ParquetFileFormat
- Returns:
bool
- inspect(self, file, filesystem=None)#
Infer the schema of a file.
- make_fragment(self, file, filesystem=None, Expression partition_expression=None, row_groups=None, *, file_size=None)#
Make a FileFragment from a given file.
- Parameters:
- file
file-like object, path-like or str
The file or file path to make a fragment from.
- filesystem
Filesystem, optional
If filesystem is given, file must be a string and specifies the path of the file to read from the filesystem.
- partition_expression
Expression, optional
An expression that is guaranteed true for all rows in the fragment. Allows fragment to be potentially skipped while scanning with a filter.
- row_groups
Iterable, optional
The indices of the row groups to include.
- file_size
int, optional
The size of the file in bytes. Can improve performance with high-latency filesystems when file size needs to be known before reading.
- Returns:
- fragment
Fragment
The file fragment
- make_write_options(self, **kwargs)#
- Parameters:
- **kwargs
dict
- Returns:
pyarrow.dataset.FileWriteOptions
- read_options#