Closed
Labels
performance: Potential issues with Zarr performance (I/O, memory, etc.)
Description
Zarr version
main
Numcodecs version
n/a
Python Version
n/a
Operating System
Linux/Mac
Installation
source
Description
In zarr-python/src/zarr/core/buffer/cpu.py (lines 157 to 161 in a26926c):
# np.zeros is much faster than np.full, and therefore using it when possible is better.
if fill_value is None or (isinstance(fill_value, int) and fill_value == 0):
    return cls(np.zeros(shape=tuple(shape), dtype=dtype, order=order))
else:
    return cls(np.full(shape=tuple(shape), fill_value=fill_value, dtype=dtype, order=order))
np.full is surprisingly(?) slow:
In [18]: %timeit np.full(shape=125000000, fill_value=0)
74.6 ms ± 848 μs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [19]: %timeit np.empty(shape=125000000)
972 ns ± 36.2 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
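The comparison above can be reproduced outside IPython with the standard-library timeit module. This is only a sketch: the helper name time_allocators and the smaller default size are assumptions made for illustration, and the absolute numbers will vary with the machine and NumPy version. np.zeros is included because the quoted code comment compares it against np.full.

```python
import timeit

import numpy as np


def time_allocators(n=1_000_000, repeat=5):
    """Return the best-of-`repeat` wall time (seconds) for each allocator.

    `time_allocators` is a hypothetical helper for this sketch; it is not
    part of Zarr or NumPy.
    """
    candidates = {
        "np.full": lambda: np.full(shape=n, fill_value=0),
        "np.zeros": lambda: np.zeros(shape=n),
        "np.empty": lambda: np.empty(shape=n),
    }
    # One call per measurement; keep the fastest run to reduce noise.
    return {
        name: min(timeit.repeat(fn, number=1, repeat=repeat))
        for name, fn in candidates.items()
    }


if __name__ == "__main__":
    for name, seconds in time_allocators().items():
        print(f"{name:>8}: {seconds * 1e6:10.1f} µs")
```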
It's worth confirming whether we need the Array's fill value here. My understanding is that it's only there in case a chunk happens to be missing. However, we should be able to limit the memset to just those regions of the output array whose chunks are actually missing. In pseudocode:
out = NDBuffer.create(...)  # uses np.empty, so no up-front memset
for key, chunk_projection in zip(keys, chunk_projections):
    maybe_buffer = await store.get(key)
    if maybe_buffer is None:
        # chunk is missing: write the fill value into this region only
        out[chunk_projection] = array.metadata.fill_value
    else:
        out[chunk_projection] = maybe_buffer
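To make the idea concrete, here is a runnable NumPy-only sketch of the same pattern. It is not Zarr's read path: read_region is a hypothetical function, the store is assumed to be a plain dict mapping chunk grid indices to arrays, and the loop is synchronous rather than awaiting store.get. The point is only that every region of the np.empty output is written exactly once, either from a chunk or from the fill value.

```python
import numpy as np


def read_region(store, chunk_shape, out_shape, fill_value, dtype="float64"):
    """Assemble an array from a dict-like `store` of chunks.

    Hypothetical stand-in for a chunked read: `store` maps chunk grid
    indices (tuples of ints) to NumPy arrays of shape `chunk_shape`;
    a missing key means the chunk is missing.
    """
    out = np.empty(out_shape, dtype=dtype)  # uninitialized, no memset
    # Number of chunks along each axis (ceiling division).
    counts = [-(-n // c) for n, c in zip(out_shape, chunk_shape)]
    for idx in np.ndindex(*counts):
        # Region of `out` covered by this chunk, clipped at the array edge.
        selection = tuple(
            slice(i * c, min((i + 1) * c, n))
            for i, c, n in zip(idx, chunk_shape, out_shape)
        )
        chunk = store.get(idx)
        if chunk is None:
            # Missing chunk: fill only this region.
            out[selection] = fill_value
        else:
            # Trim edge chunks that extend past the array bounds.
            trim = tuple(slice(0, s.stop - s.start) for s in selection)
            out[selection] = chunk[trim]
    return out
```

For example, with a 4x4 array split into 2x2 chunks and only chunks (0, 0) and (1, 1) present in the store, the two off-diagonal blocks are the only regions filled with fill_value.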
Steps to reproduce
n/a
Additional output
xref #2904.