Skip to content

Tracking Issue for feature(iter_advance_by) #77404

Open
@timvermeulen

Description

@timvermeulen
Contributor

This is a tracking issue for the methods Iterator::advance_by and DoubleEndedIterator::advance_back_by.
The feature gate for the issue is #![feature(iter_advance_by)].


Previously the recommendation was to implement nth and nth_back on your iterators to efficiently advance them by multiple elements at once (useful for .skip(n) and .step_by(n)). After this feature is stabilized the recommendation will/should be to implement advance_by and advance_back_by instead, because they compose better and are often simpler to implement.

Iterators in libcore that wrap another iterator (possibly from elsewhere than libcore) will need to keep their nth and nth_back implementations for the foreseeable future and perhaps indefinitely, because the inner iterator may have an efficient nth implementation without implementing advance_by as well.

About tracking issues

Tracking issues are used to record the overall progress of implementation.
They are also used as hubs connecting to other relevant issues, e.g., bugs or open design questions.
A tracking issue is however not meant for large scale discussion, questions, or bug reports about a feature.
Instead, open a dedicated issue for the specific matter and add the relevant feature gate label.

Steps

Implementation history

Activity

added
C-tracking-issueCategory: An issue tracking the progress of sth. like the implementation of an RFC
on Oct 1, 2020
timvermeulen

timvermeulen commented on Oct 1, 2020

@timvermeulen
ContributorAuthor

I’ve already written implementations of these methods for most iterators in core::iter, I’ll submit PRs for them once I have enough tests.

added
T-libs-apiRelevant to the library API team, which will review and decide on the PR/issue.
on Oct 1, 2020
added a commit that references this issue on Oct 6, 2020

Auto merge of rust-lang#77594 - timvermeulen:chain_advance_by, r=scot…

GuillaumeGomez

GuillaumeGomez commented on Oct 13, 2020

@GuillaumeGomez
Member

The fact that advance_by returns a Result isn't great in my opinion. In case it moves after the end of the iterator, it's not an error, just that it ran after the end. It would make more sense to return Some(how_much_after_the_end) instead of Err(how_much_after_the_end).

m-ou-se

m-ou-se commented on Oct 23, 2020

@m-ou-se
Member

Previously the recommendation was to implement nth and nth_back on your iterators to efficiently advance them by multiple elements at once (useful for .skip(n) and .step_by(n)). After this feature is stabilized the recommendation will/should be to implement advance_by and advance_back_by instead, because they compose better and are often simpler to implement.

That means that every exising Iterator implementation out there that already has an efficient nth() will now automatically get a default slow advance_by() added, implemented as a next() loop, right? I don't think it can be expected that all these implementations will be updated overnight the moment this hits stable, or even get updated at all. Generic code will then have to make a choice between 1) being efficient/simple with std's Iterators but potentially very slow with other Iterators (advance_by) and 2) being efficient with all iterators but with a slightly more confusing function (nth). :(

I like advance_by, but I feel like this problem is not something to ignore.

m-ou-se

m-ou-se commented on Oct 23, 2020

@m-ou-se
Member

The fact that advance_by returns a Result isn't great in my opinion. In case it moves after the end of the iterator, it's not an error, just that it ran after the end. It would make more sense to return Some(how_much_after_the_end) instead of Err(how_much_after_the_end).

Returning an Option also isn't great. None in the context of an Iterator usually means the end is reached, so using None here for not reaching the end can also get confusing.

I'm wondering about the downside of having to return the 'missing length' from advance_by. I can imagine there are data structures that do not store their length, but do have an upper bound on their length. (E.g. a fixed size buffer containing a C-style 0-terminated string.) Those can just return None directly from nth(n) if n is >= the maximum length. advance_by(n) would still require counting the elements to return the right Err(i), even if the caller is probably just going to use .is_ok(). Having these two similar functions (nth and advance_by) where you can't know in general which one is the more efficient one would be a shame.

What are the advantages of returning Err(missing_length) as opposed to None or false?

kevincox

kevincox commented on Oct 23, 2020

@kevincox
Contributor

Since advance_by returns the number of items you can't implement in terms of .nth() which was the previous recommended way to skip elements. (one exception is that you could use nth for cases where size_hint().1 > 0 but this doesn't solve the recursion problem discussed below) I think that returning the number of elements is a useful feature so I wouldn't want to avoid it. This is also something that .nth() lacks today.

However even if it didn't return the length it isn't clear how the default would be implemented. You can't have the defaults of advance_by and nth refer to each other because then if neither was implemented you would have infinite recursion. And if we implement advance_by(n) as if n > 0 { self.nth(n-1) } then for maximum performance you would need to implement both advance_by and nth. I think it is best to avoid this legacy.

So while I admit that it is unfortunate that switching callers from nth to advance_by can bypass an optimized nth until the iterator implementation is updated I think that this is the best solution available.

timvermeulen

timvermeulen commented on Oct 23, 2020

@timvermeulen
ContributorAuthor

@m-ou-se

That means that every exising Iterator implementation out there that already has an efficient nth() will now automatically get a default slow advance_by() added, implemented as a next() loop, right? I don't think it can be expected that all these implementations will be updated overnight the moment this hits stable, or even get updated at all.

You're absolutely right, which is why we're not going to change any uses of nth in libcore to advance_by (on potentially user-defined iterator types), even if it would clean up the code. It's an unfortunate side-effect of the fact that nth was here first.

What are the advantages of returning Err(missing_length) as opposed to None or false?

The main problem that advance_by solves is that it composes well in ways that nth doesn't for iterators like Chain. For example, in order to efficiently compute (0..10).chain(10..20).chain(20..30).nth(25), it's crucial that the first inner iterator of Chain doesn't just indicate that it doesn't have an nth element (or that it isn't able to advance by n), but that it also somehow indicates how much progress was made in order to know which value of n to pass to the second iterator. nth isn't able to solve this problem on its own.

Having these two similar functions (nth and advance_by) where you can't know in general which one is the more efficient one would be a shame.

I don't really have a satisfying answer to this... There are indeed iterators out there for which nth can be faster than advance_by for certain inputs, although I do believe that these are relatively rare. The best way to handle this might just be to only use advance_by if either the iterator type is known to have an equally efficient advance_by implementation, or you need the Err(n) value for something. An example of that is how the current implementation of Chain::nth calls advance_by on the first iterator (because it needs the error value) and nth on the second.

32 remaining items

the8472

the8472 commented on Oct 13, 2021

@the8472
Member

I don't think ExhaustedExact is a valuable concept. If people want to know if the iterator is done they should call size_hint.

It would be mostly useful for minor optimization to skip a followup operation, such as the next() nth or to drop the front iterator in Flatten. We can obviously already do that in the Err case because we know the iterator is exhausted. But in the Ok case we don't know that. (usize, bool) could convey those pieces orthogonally.

Granted, the value of that distinction is minor, all it costs to notice the exhaustion is another call to next or advance_by.

Just returning usize doesn't really make that easy to do as you either have to store the expected amount to compare

Many of the implementations need to do that anyway.

https://github.com/rust-lang/rust/pull/87091/files#diff-c0d520d60171cd367fdbe3ad1387b13cd2be83597f623d4209f8ca4fc1394f56R398

the8472

the8472 commented on Oct 15, 2021

@the8472
Member

#89916 removes the inconsistency from the docs and updates the implementations to always return Ok(()) in that case.

the8472

the8472 commented on Dec 26, 2021

@the8472
Member

For discussion: #92284 which changes the return type to usize. The change does simplify a few things.

kkharji

kkharji commented on Apr 28, 2022

@kkharji

What is blocking this feature? advance_by is super useful.

jonas-schievink

jonas-schievink commented on Jun 21, 2022

@jonas-schievink
Contributor

This method is one of only 4 in the vtable of dyn Iterator (alongside next, size_hint and nth). Are you sure that incurring this codegen bloat for every Rust program that uses dyn Iterator is worth it, or should there be a where Self: Sized bound on this method?

added a commit that references this issue on Jul 25, 2022

Rollup merge of rust-lang#99703 - dtolnay:tokenstreamsizehint, r=petr…

5bbdf65
the8472

the8472 commented on Mar 13, 2023

@the8472
Member

#92284 has been updated if anyone wants to take a look, it now returns a Result<(), NonZeroUsize> where the error case represents the amount of items that couldn't be drained, rather than the number that could.

added a commit that references this issue on Jul 4, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Metadata

Assignees

No one assigned

    Labels

    A-iteratorsArea: IteratorsC-tracking-issueCategory: An issue tracking the progress of sth. like the implementation of an RFCLibs-TrackedLibs issues that are tracked on the team's project board.T-libs-apiRelevant to the library API team, which will review and decide on the PR/issue.

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

      Development

      No branches or pull requests

        Participants

        @kevincox@m-ou-se@the8472@jonas-schievink@timvermeulen

        Issue actions

          Tracking Issue for feature(iter_advance_by) · Issue #77404 · rust-lang/rust