ArrayV2Metadata.to_dict() fails when passed numpy type directly #3253

@TomNicholas

Zarr version

v3.1.0

Numcodecs version

n/a

Python Version

3.12

Operating System

linux

Installation

uv pip

Description

ArrayV2Metadata can raise an AttributeError on .to_dict(). I think the underlying problem is that it accepts a np.dtype for the dtype attribute without any validation, so the invalid value sits there unnoticed until .to_dict() surfaces the problem.

I assume parse_data_type is meant to be called on the input, because when I call it myself before constructing the metadata, .to_dict() behaves as expected.
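A minimal sketch of the validate-on-construction idea described above, using toy stand-ins rather than zarr's actual classes. ToyDType, toy_parse_dtype, and ToyMetadata are hypothetical names invented for illustration; this is not how zarr implements ArrayV2Metadata, only one way the input could be normalized so that a bad dtype fails at construction instead of at .to_dict().

```python
from dataclasses import dataclass
from typing import Any


class ToyDType:
    """Hypothetical stand-in for a wrapped dtype that can serialize itself."""

    def __init__(self, name: str) -> None:
        self.name = name

    def to_json(self) -> dict[str, str]:
        return {"name": self.name}


def toy_parse_dtype(value: Any) -> ToyDType:
    """Hypothetical normalizer: accept a ToyDType or a string, reject the rest."""
    if isinstance(value, ToyDType):
        return value
    if isinstance(value, str):
        return ToyDType(value)
    raise TypeError(f"cannot interpret {value!r} as a dtype")


@dataclass(frozen=True)
class ToyMetadata:
    """Toy metadata container (assumption: not zarr's real class)."""

    shape: tuple[int, ...]
    dtype: Any

    def __post_init__(self) -> None:
        # Normalize at construction so an unsupported value fails here,
        # not later inside to_dict().
        # object.__setattr__ is required because the dataclass is frozen.
        object.__setattr__(self, "dtype", toy_parse_dtype(self.dtype))

    def to_dict(self) -> dict[str, Any]:
        # Safe: dtype is guaranteed to be a ToyDType by __post_init__.
        return {"shape": self.shape, "dtype": self.dtype.to_json()["name"]}
```

With this pattern, passing a raw string is converted up front, and passing an unsupported object raises a TypeError immediately in the constructor rather than an AttributeError at serialization time.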

Steps to reproduce

In [1]: import numpy as np

In [2]: from zarr.core.metadata import ArrayV2Metadata

In [3]: ArrayV2Metadata(
   ...:             chunks=(10,),
   ...:             shape=(5,),
   ...:             dtype=np.dtype("int32"),
   ...:             order="C",
   ...:             fill_value=None,
   ...:         ).to_dict()
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Cell In[3], line 7
      1 ArrayV2Metadata(
      2             chunks=(10,),
      3             shape=(5,),
      4             dtype=np.dtype("int32"),
      5             order="C",
      6             fill_value=None,
----> 7         ).to_dict()

File ~/Documents/Work/Code/VirtualiZarr/.venv/lib/python3.13/site-packages/zarr/core/metadata/v2.py:227, in ArrayV2Metadata.to_dict(self)
    224     zarray_dict["fill_value"] = fill_value
    226 # pull the "name" attribute out of the dtype spec returned by self.dtype.to_json
--> 227 zarray_dict["dtype"] = self.dtype.to_json(zarr_format=2)["name"]
    229 return zarray_dict

AttributeError: 'numpy.dtypes.Int32DType' object has no attribute 'to_json'

In [4]: from zarr.dtype import ZDType, parse_data_type

In [5]: ArrayV2Metadata(
   ...:             chunks=(10,),
   ...:             shape=(5,),
   ...:             dtype=parse_data_type(np.dtype("int32"), zarr_format=2),
   ...:             order="C",
   ...:             fill_value=None,
   ...:         ).to_dict()
Out[5]: 
{'shape': (5,),
 'chunks': (10,),
 'dtype': '<i4',
 'fill_value': None,
 'order': 'C',
 'filters': None,
 'dimension_separator': '.',
 'compressor': None,
 'attributes': {},
 'zarr_format': 2}

Additional output

No response

Labels

bug (potential issues with the zarr-python library)
