Exhausted iterators evaluate as False

I just think there should be a simple syntax to tell whether an iterator is finished without actually pulling an item from it. I’m guessing there’s a problem with generators, though, because maybe you can’t properly tell one is finished without running the generator to the end.

I actually assumed it would behave that way already, so although there wouldn’t be a good reason to do this, you could write:

while myiterator:
    print(next(myiterator))

and have it list out the values the same way a “for” loop would, without raising StopIteration.

From experiment it appears as if an iterator is considered “True” if it was created with items in, and remains true even when finished. In contrast, if I had a list and progressively popped items, it would eventually become empty and evaluate as False.

It’s a problem for a lot of iterators, as they can’t know they’re exhausted without pulling something from them. For example, suppose you prompt the user to enter a series of numbers, ending with a blank line; the only way to know that the user is done entering numbers is to get the next number. So this will never work in general - there’ll never be something that, for any arbitrary iterator, tells you whether it’s still got stuff to give you.
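As a sketch of that prompt example (the `readline` parameter is an assumption here, injected so the generator isn’t tied to the console), the point is that the generator cannot know the user is done until it attempts the next read:

```python
def numbers(readline=input):
    """Yield numbers entered one per line, stopping at a blank line.

    The generator can't know the user is finished until it actually
    attempts the next read -- there's no way to "peek" at future input.
    """
    while True:
        line = readline()
        if not line.strip():
            return
        yield int(line)
```

Driving it with a canned source of lines instead of `input()` makes the behavior easy to see: `list(numbers(...))` only ends once the blank line is actually read.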

There are a few options available to you. You could make a “preview” wrapper that pumps the iterator one step early, retaining the value so it can tell you whether there’s one coming or not. For specific iterator types, where that info is available, you could code it to report this availability. Or you could just assume that the iterator always has something, and then break out when you get signalled that it doesn’t (so, you have a “while True:” and then catch StopIteration at a much higher level). What’s right for you will depend on your code and what you need to do with these iterators.
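A minimal sketch of that last option (the `drain` function is a made-up name for illustration): assume the iterator always has something, and treat StopIteration itself as the termination signal.

```python
def drain(it):
    """Pull items one at a time, assuming more are always coming,
    and let StopIteration be the signal that there aren't."""
    results = []
    while True:
        try:
            item = next(it)
        except StopIteration:
            break
        results.append(item)  # stand-in for real per-item processing
    return results
```

In real code the `try`/`except` could sit at a much higher level than the loop, as described above; this just shows the shape of the pattern.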

6 Likes

This sounds like a feature that should be implementable (and indeed desirable) for more_itertools.peekable

But indeed it cannot be implemented for iterators in general

1 Like

It doesn’t work for generators and it’s horrible, but it works:

from collections.abc import Iterator

class MyIter(Iterator):
    def __init__(self, it):
        new_it = tuple(it)
        self._it = iter(new_it)
        self._alive = bool(new_it)
    
    def __next__(self):
        try:
            return next(self._it)
        except StopIteration:
            self._alive = False
            raise
    
    def __bool__(self):
        return self._alive

I tried this on itertools.count(), and then the lights went out :wink:

On a serious note: tuple(it) consumes the entire iterable at once, upfront. Not all iterables are finite. Even when they are, consuming everything may be undesirable, e.g. if the items are user inputs. And there are space and time considerations.

You may want to look ahead one instead of looking ahead to the end.

2 Likes

From experiment it appears as if an iterator is considered “True” if it was created with items in, and remains true even when finished.

I don’t think the language makes any guarantees about the truth values
of iterators, but most of them are probably just true regardless:

>>> i = iter([])
>>> bool(i)
True

Indeed I said it’s horrible and it doesn’t work with generators :smiley:

Maybe this is a little better:

from collections.abc import Iterator

_sentinel = object()

class MyIter(Iterator):
    def __init__(self, it):
        itr = self._it = iter(it)
        
        try:
            self._first = next(itr)
        except StopIteration:
            self._alive = False
            self._first = _sentinel
        else:
            self._alive = True
    
    def __next__(self):
        first = self._first
        
        if first is not _sentinel:
            self._first = _sentinel
            return first
        
        try:
            return next(self._it)
        except StopIteration:
            self._alive = False
            raise
    
    def __bool__(self):
        return self._alive

3 Likes

Mostly because objects in general are true unless they choose to not be.

They are truthy unless they implement __bool__() (or __len__()) to override that default.

Python 2 had a __nonzero__() dunder for this purpose; Python 3 renamed it to __bool__().
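For instance, a minimal pair of classes (names made up here) showing the default versus an override:

```python
class Plain:
    pass  # no __bool__ or __len__, so instances get the default: truthy

class Flag:
    """Opts out of the default truthiness via __bool__."""
    def __init__(self, up):
        self.up = up

    def __bool__(self):
        return self.up
```

`bool(Plain())` is always True, while `bool(Flag(False))` is False: exactly the opt-out mechanism an iterator would need to use for the proposed behaviour.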

I don’t expect this will change:

>>> myit = None
>>> bool(myit)
False
>>> myit = iter([])
>>> bool(myit)
True

That is, code can rely on most objects (iterator or not, exhausted iterator or not) being truthy. The exceptions are objects that advertise having their own notion of what truthiness is.

Which iterators don’t do. They’re implemented by many different objects in many different ways, and as has already been spelled out, there is in general no way to know when one is exhausted short of consuming “the next” (if any) value.

The docs do show ways to use itertools.tee in clever ways to package things so that a peek() function can “push back” the next value, leaving things acting as if peek() hadn’t been called.
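As a rough sketch of the push-back idea (this particular `peek` helper is made up here, using itertools.chain rather than tee, and is not the recipe from the docs):

```python
import itertools

_MISSING = object()

def peek(it, default=_MISSING):
    """Look at the next value of an iterator without losing it.

    Returns (value, iterator); the returned iterator still yields the
    peeked value first, so callers can't tell a peek ever happened.
    """
    try:
        first = next(it)
    except StopIteration:
        if default is _MISSING:
            raise
        return default, it
    return first, itertools.chain([first], it)
```

Callers must keep using the *returned* iterator; the original has already been advanced past the peeked value.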

1 Like

Yeah, it already does that. I used it years ago.

From a fun old discussion:

We don’t really want to start writing code
like this:

while it:
    x = it.next()
    ...process x...

when we can already write it like this:

for x in it:
    ...process x...

do we?

5 Likes

I see where you’re coming from: it’d be a feature that only worked for certain categories of iterators, and obviously that’s undesirable. If a behaviour is specified, it should work all the time.

It wouldn’t be acceptable to have a “halting problem” leave a boolean in a strange superposition of true and false that doesn’t resolve itself until the generator either yields or ends cleanly.

I’ve previously played with a different generator/coroutine implementation (in rather janky C) that returned a value when it finished and so couldn’t get into an uncertain state, the last returned value would flip it to its “ended” state.

2 Likes

I don’t actually want to put it in a while loop when for is perfectly good, that was just a trivial example. What I actually had in mind was using next() to pull results one at a time as part of some other process, interleaved with other activity.

Possibly as part of a game’s logic.

Anyway, I can see it’s too easily broken: whatever the idea’s merits, it would break when faced with generators with non-trivial terminating conditions.

1 Like

Yeah, if it’s specified by the language, it should work all the time. However, if it’s only specified by you, for your specific class, then it’s allowed to be true only for that class. Personally, I think it’d be confusing to have an iterator define __bool__ in this way (I’d prefer to have an explicit method), but you are absolutely welcome to define your own iterator type with this sort of feature. The ONLY rules that Python has for iterators are: 1) The __iter__ method returns self; and 2) The __next__ method returns the next value, or raises StopIteration. That’s it. Any other behaviour is entirely up to you.
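For example, a wrapper along those lines with an explicit method rather than __bool__ (a sketch; the name `is_exhausted` is made up here):

```python
class TrackedIter:
    """Iterator wrapper that records, after the fact, whether it has
    been exhausted.  It can only report what has already happened; it
    cannot predict whether another value is still coming."""

    def __init__(self, it):
        self._it = iter(it)
        self._done = False

    def __iter__(self):
        return self

    def __next__(self):
        try:
            return next(self._it)
        except StopIteration:
            self._done = True
            raise

    def is_exhausted(self):
        return self._done
```

Note that `is_exhausted()` only turns True after a `next()` call has actually failed, which is all the iterator protocol makes possible.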

2 Likes

Yes, as @Rosuav just said, the “iterator protocol” in Python is as minimal as can be. That was intentional, to make it as easy as possible for new types to participate.

But it doesn’t cater to some pretty common desires. You have to “roll your own” if you want more for a specific application.

Here’s a way that exploits the lesser-known 2-argument forms of next() and iter(). A WrappedIter instance has a bool .active attribute that starts life as True and flips to False upon exhaustion. 2-arg iter() is then used to suppress a special ITERDONE object; the calling code never sees it returned from iter(). And once a wrapped iterator is exhausted, .active remains False, and further attempts to get more values from it just end at once.

ITERDONE = object()

class WrappedIter:
    def __init__(self, it):
        self.it = iter(it)
        self.active = True

    def __bool__(self):
        return self.active

    def __call__(self):
        val = next(self.it, ITERDONE)
        if val is ITERDONE:
            self.active = False
        return val

xs = list(range(5))
wit = WrappedIter(xs)
for val in iter(wit, ITERDONE):
    print(wit.active, val)
print(wit.active)
print(list(iter(wit, ITERDONE)))
print(list(iter(wit, ITERDONE)))

which displays:

True 0
True 1
True 2
True 3
True 4
False
[]
[]

A bit more typing to use, but doesn’t rely on anything that the iterator protocol doesn’t supply.

2 Likes

If you need to check whether the last item has already been pulled, you can pull one item in advance and buffer it.

class BufferedIterator:
    def __init__(self, it):
        self.sentinel = object()
        self.it = iter(it)
        self.advance()

    def advance(self):
        self.buffered = next(self.it, self.sentinel)
        self.active = self.buffered is not self.sentinel

    def __iter__(self):
        return self

    def __next__(self):
        if not self.active:
            raise StopIteration
        current = self.buffered
        self.advance()
        return current

    def __bool__(self):
        return self.active

    def peek(self):  # bonus method; returns the sentinel once exhausted
        return self.buffered

1 Like

I didn’t know about the two-arg iter() and next(). Good to know! But I don’t understand how your code can work with empty iterators:

>>> xs = []
>>> wit = WrappedIter(xs)
>>> a = iter(wit, ITERDONE)
>>> bool(a)
True
>>> bool(wit)
True

Am I missing something? ?_?

1 Like

Yup! I routinely forget about 2-arg iter() for months at a time, but next(it, default) is as useful at times as dict.get(key, default).

Same way as it works for all iterators: the wrapper starts truthy, and remains so until an attempt to actually get the next item raises StopIteration. There is no other way in Python to know whether a general iterator “is empty”.

Other code tries to buffer “the next” result internally, and those can say “for sure” whether “the next” attempt will succeed. But that brings its own subtle failure modes.

Python’s iterator protocol is very lightweight, requiring very little of objects. In particular, there is no requirement that an object’s state at time an iterator is created be the same as when the iterator is later used.

>>> xs = []
>>> it = iter(xs)
>>> xs.extend([4, 2]) # doesn't matter that the state changed before _use_
>>> next(it, "nope")
4
>>> next(it, "nope")
2
>>> next(it, "nope")
'nope'

Any attempt to “peek ahead” breaks that:

>>> xs = []
>>> it = iter(xs)
>>> next(it, "nope") # "peek ahead"
'nope'
>>> xs.extend([4, 2]) # but the iterator is already dead
>>> next(it, "nope")
'nope'
>>> next(it, "nope")
'nope'

My wrapper preserves the original behavior, because it never tries to advance the underlying iterator until the user forces that.

In return, sure, it will remain truthy until trying to advance the underlying iterator actually fails.

3 Likes

Ah, I see. So it’s as Chris said previously: you can implement it in two ways, and both have problems. Even if the “buffered” version can be improved by checking whether bool(iterable) is False, it will still fail if the iterable is a generator and we send a value to it.

1 Like

I pretty frequently need to check whether iterators are empty (database access), and use the rarely seen for … else pattern to attempt a next call and immediately break. If the iterator is empty, the break is never hit and the else block is entered.

for val in my_iter:
    print('has values')
    break
else:
    print('exhausted')
    val = None

You will need to check val afterwards if you need to process the iterable.

This is probably a terrible idea though, since it’s almost never used, and it’s basically just a way to abuse the for loop’s built-in StopIteration handling.

1 Like

Have you considered the two-arg next call?

val = next(my_iter, None)

1 Like