pyarrow.StructType

class pyarrow.StructType

Bases: DataType

Concrete class for struct data types.

StructType supports direct indexing with [...] (implemented via __getitem__) to access its fields. Indexing returns the struct field with the given index or name.

Examples

>>> import pyarrow as pa

Accessing fields using direct indexing:

>>> struct_type = pa.struct({'x': pa.int32(), 'y': pa.string()})
>>> struct_type[0]
pyarrow.Field<x: int32>
>>> struct_type['y']
pyarrow.Field<y: string>

Accessing fields using field():

>>> struct_type.field(1)
pyarrow.Field<y: string>
>>> struct_type.field('x')
pyarrow.Field<x: int32>

Creating a schema from the struct type’s fields:

>>> pa.schema(list(struct_type))
x: int32
y: string

__init__(*args, **kwargs)

Methods

__init__(*args, **kwargs)

equals(self, other, *[, check_metadata])

Return True if this type is equivalent to the passed value.

field(self, i)

Select a field by its column name or numeric index.

get_all_field_indices(self, name)

Return sorted list of indices for the fields with the given name.

get_field_index(self, name)

Return index of the unique field with the given name.

to_pandas_dtype(self)

Return the equivalent NumPy / Pandas dtype.

Attributes

bit_width

Bit width for fixed width type.

byte_width

Byte width for fixed width type.

fields

Lists all fields within the StructType.

id

names

Lists the field names.

num_buffers

Number of data buffers required to construct Array type excluding children.

num_fields

The number of child fields.

bit_width

Bit width for fixed width type.

Examples

>>> import pyarrow as pa
>>> pa.int64()
DataType(int64)
>>> pa.int64().bit_width
64

byte_width

Byte width for fixed width type.

Examples

>>> import pyarrow as pa
>>> pa.int64()
DataType(int64)
>>> pa.int64().byte_width
8

equals(self, other, *, check_metadata=False)

Return True if this type is equivalent to the passed value.

Parameters:
other : DataType or str convertible to DataType
check_metadata : bool, default False
    Whether nested Field metadata equality should be checked as well.

Returns:
is_equal : bool

Examples

>>> import pyarrow as pa
>>> pa.int64().equals(pa.string())
False
>>> pa.int64().equals(pa.int64())
True

field(self, i) → Field

Select a field by its column name or numeric index.

Parameters:
i : int or str

Returns:
pyarrow.Field

Examples

>>> import pyarrow as pa
>>> struct_type = pa.struct({'x': pa.int32(), 'y': pa.string()})

Select the second field:

>>> struct_type.field(1)
pyarrow.Field<y: string>

Select the field named ‘x’:

>>> struct_type.field('x')
pyarrow.Field<x: int32>

fields

Lists all fields within the StructType.

Examples

>>> import pyarrow as pa
>>> struct_type = pa.struct([('a', pa.int64()), ('b', pa.float64()), ('c', pa.string())])
>>> struct_type.fields
[pyarrow.Field<a: int64>, pyarrow.Field<b: double>, pyarrow.Field<c: string>]

get_all_field_indices(self, name)

Return sorted list of indices for the fields with the given name.

Parameters:
name : str
    The name of the field to look up.

Returns:
indices : List[int]

Examples

>>> import pyarrow as pa
>>> struct_type = pa.struct({'x': pa.int32(), 'y': pa.string()})
>>> struct_type.get_all_field_indices('x')
[0]

get_field_index(self, name)

Return index of the unique field with the given name.

Parameters:
name : str
    The name of the field to look up.

Returns:
index : int
    The index of the field with the given name; -1 if the name isn’t found or there are several fields with the given name.

Examples

>>> import pyarrow as pa
>>> struct_type = pa.struct({'x': pa.int32(), 'y': pa.string()})

Index of the field named ‘y’:

>>> struct_type.get_field_index('y')
1

Index of a field that does not exist:

>>> struct_type.get_field_index('z')
-1

id

names

Lists the field names.

Examples

>>> import pyarrow as pa
>>> struct_type = pa.struct([('a', pa.int64()), ('b', pa.float64()), ('c', pa.string())])
>>> struct_type.names
['a', 'b', 'c']

num_buffers

Number of data buffers required to construct Array type excluding children.

Examples

>>> import pyarrow as pa
>>> pa.int64().num_buffers
2
>>> pa.string().num_buffers
3

num_fields

The number of child fields.

Examples

>>> import pyarrow as pa
>>> pa.int64()
DataType(int64)
>>> pa.int64().num_fields
0
>>> pa.list_(pa.string())
ListType(list<item: string>)
>>> pa.list_(pa.string()).num_fields
1
>>> struct = pa.struct({'x': pa.int32(), 'y': pa.string()})
>>> struct.num_fields
2

to_pandas_dtype(self)

Return the equivalent NumPy / Pandas dtype.

Examples

>>> import pyarrow as pa
>>> pa.int64().to_pandas_dtype()
<class 'numpy.int64'>