unhelpful error message for unknown codecs when opening a group #3384

@keewis

Zarr version

3.1.0 (from main)

Numcodecs version

0.16.1

Python Version

3.12.3

Operating System

Ubuntu 24.04

Installation

using uv

Description

When opening a zarr store that contains a variable with an unknown codec (in my case, because I forgot to import the package defining the codec), zarr emits a rather unhelpful warning:

ZarrUserWarning: Object at <codec-name> is not recognized as a component of a Zarr hierarchy.

This is because the following loop in src/zarr/core/group.py

for fetched_node_coro in asyncio.as_completed(node_tasks):
    try:
        fetched_node = await fetched_node_coro
    except KeyError as e:
        # keyerror is raised when `key` names an object (in the object storage sense),
        # as opposed to a prefix, in the store under the prefix associated with this group
        # in which case `key` cannot be the name of a sub-array or sub-group.
        warnings.warn(
            f"Object at {e.args[0]} is not recognized as a component of a Zarr hierarchy.",
            ZarrUserWarning,
            stacklevel=1,
        )
        continue

assumes that any KeyError must have been raised by the store.

However, if the codec is unknown, the codec registry also raises a KeyError, which is caught by the same handler and hidden behind the warning.
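The masking effect can be reproduced without zarr. The following is a minimal sketch only: `STORE`, `CODEC_REGISTRY`, `fetch_node` and `list_members` are hypothetical stand-ins for illustration, not zarr's actual objects or functions. It shows how a broad `except KeyError` around the awaited task cannot tell a missing store key from a failed registry lookup.

```python
import asyncio
import warnings

# Hypothetical stand-ins for illustration; these are NOT zarr's real objects.
CODEC_REGISTRY = {"bytes": object}
STORE = {
    "baz": {"codecs": ["b"]},      # metadata names a codec that is not registered
    "qux": {"codecs": ["bytes"]},  # metadata names a registered codec
}


async def fetch_node(key):
    metadata = STORE[key]  # a KeyError here would mean "not in the store"
    for name in metadata["codecs"]:
        CODEC_REGISTRY[name]  # a KeyError here means "unknown codec"
    return key


async def list_members(keys):
    found = []
    for coro in asyncio.as_completed([fetch_node(k) for k in keys]):
        try:
            found.append(await coro)
        except KeyError as e:
            # Both failure modes end up here and get the same message.
            warnings.warn(
                f"Object at {e.args[0]} is not recognized as a component of a Zarr hierarchy.",
                UserWarning,
                stacklevel=1,
            )
    return found


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    members = asyncio.run(list_members(["baz", "qux"]))

messages = [str(w.message) for w in caught if w.category is UserWarning]
print(members)
print(messages)
```

In this sketch "baz" exists in the store, yet it is dropped from the result and the warning names the codec ("b") rather than the node, which mirrors the behaviour reported here.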

Steps to reproduce

# /// script
# requires-python = ">=3.11"
# dependencies = [
#   "zarr@git+https://github.com/zarr-developers/zarr-python.git@main",
# ]
# ///

import zarr
import json
from zarr.buffer.cpu import Buffer

store = zarr.storage.MemoryStore()

root = zarr.create_group(store=store)
z1 = root.create_array(
    name="baz", shape=(10000, 10000), chunks=(1000, 1000), dtype="int32"
)
print(z1)

metadata = json.loads(store._store_dict["baz/zarr.json"].to_bytes())
metadata["codecs"][0]["name"] = "b"

store._store_dict["baz/zarr.json"] = Buffer.from_bytes(json.dumps(metadata).encode())

g = zarr.open_group(store)
print(list(g.keys()))

This prints:

<Array memory://128459357805184/baz shape=(10000, 10000) dtype=int32>
.../zarr-python/src/zarr/core/group.py:3391: ZarrUserWarning: Object at b is not recognized as a component of a Zarr hierarchy.
  warnings.warn(
[]

The warning is very misleading: the problem has nothing to do with the hierarchy, and the array "baz" is silently dropped from the listing.

To fix that, we'd probably have to catch the KeyError somewhere up the stack (in parse_codecs?) and convert it to a different error with a bit more information.
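One possible shape for such a conversion is sketched below. This is an assumption-laden illustration, not a patch against zarr: `UnknownCodecError`, `CODEC_REGISTRY` and `resolve_codec` are hypothetical names, and the real registry lookup may look quite different. The point is only that re-raising as an exception type that is not a `KeyError` keeps the failure from being swallowed by a handler meant for store lookups.

```python
class UnknownCodecError(ValueError):
    """Hypothetical error type for illustration; not part of zarr's API."""


# Hypothetical registry for illustration only.
CODEC_REGISTRY = {"bytes": "BytesCodec", "zstd": "ZstdCodec"}


def resolve_codec(name):
    """Look up a codec by name, converting a bare KeyError into a descriptive error."""
    try:
        return CODEC_REGISTRY[name]
    except KeyError:
        known = ", ".join(sorted(CODEC_REGISTRY))
        # Deliberately not a KeyError subclass, so `except KeyError` handlers
        # further up the stack do not mistake it for a missing store key.
        raise UnknownCodecError(
            f"Unknown codec {name!r}. Known codecs: {known}. "
            "Did you forget to import the package that registers it?"
        ) from None


try:
    resolve_codec("b")
except UnknownCodecError as err:
    error_message = str(err)
    print(error_message)
```

Whether an unknown codec should abort opening the group or only produce a more accurate warning is a separate design question that this sketch does not settle.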

Additional output

No response
