Avoid calls to np.full #3368

@TomAugspurger

Zarr version

main

Numcodecs version

n/a

Python Version

n/a

Operating System

Linux/Mac

Installation

source

Description

In

    # np.zeros is much faster than np.full, and therefore using it when possible is better.
    if fill_value is None or (isinstance(fill_value, int) and fill_value == 0):
        return cls(np.zeros(shape=tuple(shape), dtype=dtype, order=order))
    else:
        return cls(np.full(shape=tuple(shape), fill_value=fill_value, dtype=dtype, order=order))

we create a new buffer that is pre-filled, either with zeros or with the fill value. This is hit on the read path (IIRC, to allocate the buffer the data is copied into).

np.full is surprisingly(?) slow compared to np.empty:

In [18]: %timeit np.full(shape=125000000, fill_value=0)
74.6 ms ± 848 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [19]: %timeit np.empty(shape=125000000)
972 ns ± 36.2 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)

It's worth confirming whether we need the Array's fill value here. My understanding is that it's just there in case a chunk happens to be missing. If so, we should be able to limit the memset to just those regions of the output array that are actually missing. In pseudocode:

out = NDBuffer.create(...)  # uses empty
for key, chunk_projection in zip(keys, chunk_projections):
    maybe_buffer = await store.get(key)
    if maybe_buffer is None:
        out[chunk_projection] = array.metadata.fill_value
    else:
        out[chunk_projection] = maybe_buffer
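
To make the idea concrete outside of Zarr, here is a minimal NumPy-only sketch. It does not use any Zarr API: a plain dict stands in for the store, and the function name `read_into_empty`, the 1-D layout, and the fixed `chunk_len` are all assumptions made for illustration. The point is only that the output is allocated with np.empty and the fill value is written solely into regions whose chunk is missing.

```python
import numpy as np


def read_into_empty(chunks, shape, chunk_len, fill_value, dtype="float64"):
    """Assemble a 1-D array from fixed-size chunks, filling only missing ones.

    `chunks` is a hypothetical stand-in for a store: a dict mapping
    chunk index -> np.ndarray. An absent key represents a missing chunk.
    """
    # np.empty skips the memset, so the contents are arbitrary until written.
    out = np.empty(shape, dtype=dtype)
    n_chunks = -(-shape // chunk_len)  # ceiling division
    for i in range(n_chunks):
        region = slice(i * chunk_len, min((i + 1) * chunk_len, shape))
        chunk = chunks.get(i)
        if chunk is None:
            # Missing chunk: write the fill value into this region only.
            out[region] = fill_value
        else:
            # Present chunk: copy its data, trimming a partial final chunk.
            out[region] = chunk[: region.stop - region.start]
    return out


stored = {0: np.arange(4.0), 2: np.arange(8.0, 12.0)}  # chunk 1 is missing
result = read_into_empty(stored, shape=10, chunk_len=4, fill_value=-1.0)
print(result)
```

One caveat with this approach: because np.empty returns uninitialized memory, it is only safe if every region of the output is covered by some chunk projection. Any region that is never written would expose arbitrary data instead of the fill value.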

Steps to reproduce

n/a

Additional output

xref #2904.


Labels: performance (Potential issues with Zarr performance (I/O, memory, etc.))
