MAINT,ENH: Possible improvements related to buffered iteration #28018

@seberg

gh-27883 moved the buffering setup to the beginning of iterator construction, which avoids duplicate work later on and simplifies things a lot.

There are a few related improvements that could be made now:

  • The FixedStridesArray is now always identical to the actual "inner strides". There are two things we can do here:
    1. Internally, stop using the fixed strides and use the inner strides directly, as a slight simplification/optimization.
    2. We could advertise this fact and promise not to change it back. (I don't think there is much reason to consider changing back, but I am not 100% sure.)
  • We now know clearly at setup time if we need any buffering at all.
    • We could skip all buffer setup (overhead cost) and use a faster iternext() function in principle. There are two things to keep in mind:
      1. The BUFFERED flag is used to reject some API calls, so we must keep rejecting those calls even if we don't need to buffer internally.
      2. The iterator struct has a different layout when buffering, so simply unsetting the flag would break offsets (i.e. we may need a new flag that just skips the buffering steps).
  • The buffered iternext() always uses goto_iterindex. This function is heavyweight. During normal iteration, advancing the iterator is much simpler and faster (mainly because there is no need to look at all dimensions).
  • The buffer setup currently cannot recognize that a "reduce style" iteration (a double loop) may be beneficial even when it is not required, because it can mean that fewer operands need to be buffered (see the linked discussion).
  • The code tries to guess the best buffer size, but the method is crude and can probably be improved. A small constraint is that we should err on the side of larger buffers (or else we have to deal with floating point precision changes with float16 in the einsum tests).
    The code optimizes "overheads" (very crudely). For small buffers those dominate, but for largish buffers the buffer copy itself might also make a difference.
    I am sure this can be better, but it is OK if it isn't ideal.
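
To make the "we know at setup time whether buffering is needed" point a bit more concrete, here is a minimal Python sketch using the public `np.nditer` wrapper. The helper `probe_buffering` is hypothetical (not part of NumPy), and the overlap check is only a heuristic that I am assuming is good enough for 1-D inputs: a chunk that does not overlap the operand's memory is taken to live in an iterator buffer. It also returns the chunk's inner strides, which relates to the fixed-strides point in the first bullet.

```python
import numpy as np


def probe_buffering(arr, dtype):
    """Return (is_buffered, inner_strides) for the first chunk of a buffered nditer.

    Hypothetical helper for illustration only, not part of NumPy.
    """
    it = np.nditer(
        arr,
        flags=["buffered", "external_loop"],
        op_flags=["readonly"],
        op_dtypes=[dtype],
        casting="same_kind",
    )
    chunk = next(it)  # first 1-D chunk handed to the inner loop
    # Heuristic: no overlap with the operand's memory means the data was
    # copied into an iterator buffer.
    is_buffered = not np.may_share_memory(chunk, arr)
    return is_buffered, chunk.strides


a = np.arange(10, dtype=np.float64)

# No cast requested, so the iterator should not need to buffer `a`.
print(probe_buffering(a, np.float64))
# A cast to float32 is requested, so the chunk has to come from a buffer.
print(probe_buffering(a, np.float32))
```

This only observes the behaviour through the Python wrapper; it says nothing about which internal fast path the C iterator takes, which is what the bullet above is really about.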

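For the goto_iterindex point, the difference in cost can be sketched in pure Python. This is not NumPy's C implementation: `goto_iterindex_sketch` and `advance_sketch` are hypothetical names, and plain C-order coordinates are assumed. The first function recomputes every coordinate from a flat index, while the second is an odometer-style step that usually touches only the innermost dimension.

```python
def goto_iterindex_sketch(shape, iterindex):
    """Recompute all coordinates from a flat index (visits every dimension)."""
    coords = [0] * len(shape)
    for axis in range(len(shape) - 1, -1, -1):
        iterindex, coords[axis] = divmod(iterindex, shape[axis])
    return coords


def advance_sketch(shape, coords):
    """Odometer step in place; returns False once the iteration has wrapped around."""
    for axis in range(len(shape) - 1, -1, -1):
        coords[axis] += 1
        if coords[axis] < shape[axis]:
            return True  # common case: stops at the innermost dimension
        coords[axis] = 0  # carry into the next outer dimension
    return False


shape = (2, 3, 4)
coords = [0, 0, 0]
steps = 0
while advance_sketch(shape, coords):
    steps += 1
    # Both approaches must agree on the position after every step.
    assert coords == goto_iterindex_sketch(shape, steps)
```

Both give the same positions, but the incremental step does no division and rarely looks past the last axis, which is why a dedicated advance in the buffered iternext() seems worthwhile.
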