In computer programming, an iterator is an object that progressively provides access to each item of a collection, in order.
A collection may provide multiple iterators via its interface that provide items in different orders, such as forwards and backwards.
An iterator is often implemented in terms of the structure underlying a collection implementation and is often tightly coupled to the collection to enable the operational semantics of the iterator.
An iterator is behaviorally similar to a database cursor.
Iterators date to the CLU programming language in 1974.
Pattern
An iterator provides access to an element of a collection (element access) and can change its internal state to provide access to the next element (element traversal). It also provides for creation and initialization to a first element and indicates whether all elements have been traversed. In some programming contexts, an iterator provides additional functionality.
An iterator allows a consumer to process each element of a collection while isolating the consumer from the internal structure of the collection. In Python, a generator is an iterator constructor: a function that returns an iterator. An example of a Python generator returning an iterator for the Fibonacci numbers using Python's <code>yield</code> statement follows:
<syntaxhighlight lang="python">
from typing import Generator
def fibonacci(limit: int) -> Generator[int, None, None]:
a, b = 0, 1
for _ in range(limit):
yield a
a, b = b, a + b
for number in fibonacci(100): # The generator constructs an iterator
print(number)
</syntaxhighlight>
Internal iterator
An internal iterator is a higher-order function (often taking anonymous functions) that traverses a collection while applying a function to each element. For example, Python's <code>map</code> function applies a caller-defined function to each element:
<syntaxhighlight lang="python">
from typing import Iterator
digits: list[int] = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
squared_digits: Iterator[int] = map(lambda x: x**2, digits)
- Iterating over this iterator would result in 0, 1, 4, 9, 16, ..., 81.
</syntaxhighlight>
Implicit iterator
Some object-oriented languages such as C#, C++ (later versions), Delphi (later versions), Go, Java (later versions), Lua, Perl, Python, Ruby provide an intrinsic way of iterating through the elements of a collection without an explicit iterator. An iterator object may exist, but is not represented in the source code.
An implicit iterator is often manifest in language syntax as <code>foreach</code>.
In Python, a collection object can be iterated directly:
<syntaxhighlight lang="python">
for value in iterable:
print(value)
</syntaxhighlight>
In Ruby, iteration requires accessing an iterator property:
<syntaxhighlight lang="ruby">
iterable.each do |value|
puts value
end
</syntaxhighlight>
This iteration style is sometimes called "internal iteration" because its code fully executes within the context of the iterable object (that controls all aspects of iteration), and the programmer only provides the operation to execute at each step (using an anonymous function).
Languages that support list comprehensions or similar constructs may also make use of implicit iterators during the construction of the result list, as in Python:
<syntaxhighlight lang="python">
names: list[str] = [person.name for person in roster if person.male]
</syntaxhighlight>
Sometimes the implicit hidden nature is only partial. The C++ language has a few function templates for implicit iteration, such as <code>for_each()</code>. These functions still require explicit iterator objects as their initial input, but the subsequent iteration does not expose an iterator object to the user.
Stream
Iterators are a useful abstraction of input streams – they provide a potentially infinite iterable (but not necessarily indexable) object. Several languages, such as Perl and Python, implement streams as iterators. In Python, iterators are objects representing streams of data. Alternative implementations of stream include data-driven languages, such as AWK and sed.
Contrast with indexing
Instead of using an iterator, many languages allow the use of a subscript operator and a loop counter to access each element. Although indexing may be used with collections, the use of iterators may have advantages such as:
- Counting loops are not suitable to all data structures, in particular to data structures with no or slow random access, like lists or trees.
- Iterators can provide a consistent way to iterate on data structures of all kinds, and therefore make the code more readable, reusable, and less sensitive to a change in the data structure.
- An iterator can enforce additional restrictions on access, such as ensuring that elements cannot be skipped or that a previously visited element cannot be accessed a second time.
- An iterator may allow the collection object to be modified without invalidating the iterator. For instance, once an iterator has advanced beyond the first element it may be possible to insert additional elements into the beginning of the collection with predictable results. With indexing this is problematic since the index numbers must change.
The ability of a collection to be modified while iterating through its elements has become necessary in modern object-oriented programming, where the interrelationships between objects and the effects of operations may not be obvious. By using an iterator one is isolated from these sorts of consequences. This assertion must however be taken with a grain of salt, because more often than not, for efficiency reasons, the iterator implementation is so tightly bound to the collection that it does preclude modification of the underlying collection without invalidating itself.
For collections that may move around their data in memory, the only way to not invalidate the iterator is, for the collection, to somehow keep track of all the currently alive iterators and update them on the fly. Since the number of iterators at a given time may be arbitrarily large in comparison to the size of the tied collection, updating them all will drastically impair the complexity guarantee on the collection's operations.
An alternative way to keep the number of updates bound relatively to the collection size would be to use a kind of handle mechanism, that is a collection of indirect pointers to the collection's elements that must be updated with the collection, and let the iterators point to these handles instead of directly to the data elements. But this approach will negatively impact the iterator performance, since it must effectuate a double pointer following to access the actual data element. This is usually not desirable, because many algorithms using the iterators invoke the iterators data access operation more often than the advance method. It is therefore especially important to have iterators with very efficient data access.
All in all, this is always a trade-off between security (iterators remain always valid) and efficiency. Most of the time, the added security is not worth the efficiency price to pay for it. Using an alternative collection (for example a singly linked list instead of a vector) would be a better choice (globally more efficient) if the stability of the iterators is needed.
Classification
Categories
Iterators can be categorised according to their functionality. Here is a (non-exhaustive) list of iterator categories:
{| class="wikitable sortable"
|-
! Category !! Languages
|-
| Bidirectional iterator || C++, Rust
|-
| Forward iterator || C++
|-
| Input iterator || C++
|-
| Output iterator || C++
|-
| Random access iterator || C++
|-
| Trivial iterator || C++ (old STL)
|}
Types
Different languages or libraries used with these languages define iterator types. Some of them are
{| class="wikitable sortable"
|-
! Type !! Languages
|-
| Array iterator || PHP, R
|-
| Caching iterator || PHP
|-
| Constant iterator || C++, PHP
|-
| Directory iterator || PHP, Python
|-
| Filter iterator || PHP, R
|-
| Limit iterator || PHP
|-
| List iterator || C#, Java,<code>IEnumerator</code> provides a <code>MoveNext()</code> method, which advances to the next element and indicates whether the end of the collection has been reached; a <code>Current</code> property, to obtain the value of the element currently being pointed at; and may optionally support a However, support for iterators was added in PHP 5 through the introduction of the internal <code>Traversable</code> interface. The two main interfaces for implementation in PHP scripts that enable objects to be iterated via the <code>foreach</code> loop are <code>Iterator</code> and <code>IteratorAggregate</code>. The latter does not require the implementing class to declare all required methods, instead it implements an accessor method (<code>getIterator</code>) that returns an instance of <code>Traversable</code>. The Standard PHP Library provides several classes to work with special iterators. PHP also supports Generators since 5.5.
The simplest implementation is by wrapping an array, this can be useful for type hinting and information hiding.
<syntaxhighlight lang="php">
namespace Wikipedia\Iterator;
final class ArrayIterator extends \Iterator
{
private array $array;
public function __construct(array $array)
{
$this->array = $array;
}
public function rewind(): void
{
echo 'rewinding' , PHP_EOL;
reset($this->array);
}
public function current()
{
$value = current($this->array);
echo "current: {$value}", PHP_EOL;
return $value;
}
public function key()
{
$key = key($this->array);
echo "key: {$key}", PHP_EOL;
return $key;
}
public function next()
{
$value = next($this->array);
echo "next: {$value}", PHP_EOL;
return $value;
}
public function valid(): bool
{
$valid = $this->current() !== false;
echo 'valid: ', ($valid ? 'true' : 'false'), PHP_EOL;
return $valid;
}
}
</syntaxhighlight>
All methods of the example class are used during the execution of a complete foreach loop (<code>foreach ($iterator as $key => $current) {}</code>). The iterator's methods are executed in the following order:
- <code>$iterator->rewind()</code> ensures that the internal structure starts from the beginning.
- <code>$iterator->valid()</code> returns true in this example.
- <code>$iterator->current()</code> returned value is stored in <code>$value</code>.
- <code>$iterator->key()</code> returned value is stored in <code>$key</code>.
- <code>$iterator->next()</code> advances to the next element in the internal structure.
- <code>$iterator->valid()</code> returns false and the loop is aborted.
The next example illustrates a PHP class that implements the <code>Traversable</code> interface, which could be wrapped in an <code>IteratorIterator</code> class to act upon the data before it is returned to the <code>foreach</code> loop. The usage together with the <code>MYSQLI_USE_RESULT</code> constant allows PHP scripts to iterate result sets with billions of rows with very little memory usage. These features are not exclusive to PHP nor to its MySQL class implementations (e.g. the <code>PDOStatement</code> class implements the <code>Traversable</code> interface as well).
<syntaxhighlight lang="php">
mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);
$mysqli = new \mysqli('host.example.com', 'username', 'password', 'database_name');
// The \mysqli_result class that is returned by the method call implements the internal Traversable interface.
foreach ($mysqli->query('SELECT `a`, `b`, `c` FROM `table`', MYSQLI_USE_RESULT) as $row) {
// Act on the returned row, which is an associative array.
}
</syntaxhighlight>
===Python===<!-- This section is linked from Associative array -->
Iterators in Python are a fundamental part of the language and in many cases go unseen as they are implicitly used in the <code>for</code> (foreach) statement, in list comprehensions, and in generator expressions. All of Python's standard built-in collection types support iteration, as well as many classes that are part of the standard library. The following example shows typical implicit iteration over a sequence:
<syntaxhighlight lang="python">
for value in sequence:
print(value)
</syntaxhighlight>
Python dictionaries (a form of associative array) can also be directly iterated over, when the dictionary keys are returned; or the <code>items()</code> method of a dictionary can be iterated over where it yields corresponding key,value pairs as a tuple:
<syntaxhighlight lang="python">
for key in dictionary:
value = dictionary[key]
print(key, value)
</syntaxhighlight>
<syntaxhighlight lang="python">
for key, value in dictionary.items():
print(key, value)
</syntaxhighlight>
Iterators however can be used and defined explicitly. For any iterable sequence type or class, the built-in function <code>iter()</code> is used to create an iterator object. The iterator object can then be iterated with the <code>next()</code> function, which uses the <code>__next__()</code> method internally, which returns the next element in the container. (The previous statement applies to Python 3.x. In Python 2.x, the <code>next()</code> method is equivalent.) A <code>StopIteration</code> exception will be raised when no more elements are left. The following example shows an equivalent iteration over a sequence using explicit iterators:
<syntaxhighlight lang="python">
from typing import Iterator
sequence: list[int] = [1, 2, 3, 4]
it: Iterator[int] = iter(sequence)
while True:
try:
value = it.next() # in Python 2.x
value = next(it) # in Python 3.x
except StopIteration:
break
print(value)
</syntaxhighlight>
Any user-defined class can support standard iteration (either implicit or explicit) by defining an <code>__iter__()</code> method that returns an iterator object. The iterator object then needs to define a <code>__next__()</code> method that returns the next element.
Python's generators implement this iteration protocol.
Raku
Iterators in Raku are a fundamental part of the language, although usually users do not have to care about iterators. Their usage is hidden behind iteration APIs such as the <code>for</code> statement, <code>map</code>, <code>grep</code>, list indexing with <code>.[$idx]</code>, etc.
The following example shows typical implicit iteration over a collection of values:
<syntaxhighlight lang="raku">
my @values = 1, 2, 3;
for @values -> $value {
say $value
}
- OUTPUT:
- 1
- 2
- 3
</syntaxhighlight>
Raku hashes can also be directly iterated over; this yields key-value <code>Pair</code> objects. The <code>kv</code> method can be invoked on the hash to iterate over the key and values; the <code>keys</code> method to iterate over the hash's keys; and the <code>values</code> method to iterate over the hash's values.
<syntaxhighlight lang="raku">
my %word-to-number = 'one' => 1, 'two' => 2, 'three' => 3;
for %word-to-number -> $pair {
say $pair;
}
- OUTPUT:
- three => 3
- one => 1
- two => 2
for %word-to-number.kv -> $key, $value {
say "$key: $value"
}
- OUTPUT:
- three: 3
- one: 1
- two: 2
for %word-to-number.keys -> $key {
say "$key => " ~ %word-to-number{$key};
}
- OUTPUT:
- three => 3
- one => 1
- two => 2
</syntaxhighlight>
Iterators however can be used and defined explicitly. For any iterable type, there are several methods that control different aspects of the iteration process. For example, the <code>iterator</code> method is supposed to return an <code>Iterator</code> object, and the <code>pull-one</code> method is supposed to produce and return the next value if possible, or return the sentinel value <code>IterationEnd</code> if no more values could be produced. The following example shows an equivalent iteration over a collection using explicit iterators:
<syntaxhighlight lang="raku">
my @values = 1, 2, 3;
my $it := @values.iterator; # grab iterator for @values
loop {
my $value := $it.pull-one; # grab iteration's next value
last if $value =:= IterationEnd; # stop if we reached iteration's end
say $value;
}
- OUTPUT:
- 1
- 2
- 3
</syntaxhighlight>
All iterable types in Raku compose the <code>Iterable</code> role, <code>Iterator</code> role, or both. The <code>Iterable</code> is quite simple and only requires the <code>iterator</code> to be implemented by the composing class. The <code>Iterator</code> is more complex and provides a series of methods such as <code>pull-one</code>, which allows for a finer operation of iteration in several contexts such as adding or eliminating items, or skipping over them to access other items. Thus, any user-defined class can support standard iteration by composing these roles and implementing the <code>iterator</code> and/or <code>pull-one</code> methods.
The <code>DNA</code> class represents a DNA strand and implements the <code>iterator</code> by composing the <code>Iterable</code> role. The DNA strand is split into a group of trinucleotides when iterated over:
<syntaxhighlight lang="raku">
subset Strand of Str where { .match(/^^ <[ACGT]>+ $$/) and .chars %% 3 };
class DNA does Iterable {
has $.chain;
method new(Strand:D $chain) {
self.bless: :$chain
}
method iterator(DNA:D:){ $.chain.comb.rotor(3).iterator }
};
for DNA.new('GATTACATA') {
.say
}
- OUTPUT:
- (G A T)
- (T A C)
- (A T A)
say DNA.new('GATTACATA').map(*.join).join('-');
- OUTPUT:
- GAT-TAC-ATA
</syntaxhighlight>
The <code>Repeater</code> class composes both the <code>Iterable</code> and <code>Iterator</code> roles:
<syntaxhighlight lang="raku">
class Repeater does Iterable does Iterator {
has Any $.item is required;
has Int $.times is required;
has Int $!count = 1;
multi method new($item, $times) {
self.bless: :$item, :$times;
}
method iterator { self }
method pull-one(--> Mu){
if $!count <= $!times {
$!count += 1;
return $!item
}
else {
return IterationEnd
}
}
}
for Repeater.new("Hello", 3) {
.say
}
- OUTPUT:
- Hello
- Hello
- Hello
</syntaxhighlight>
Ruby
Ruby implements iterators quite differently; all iterations are done by means of passing callback closures to container methods - this way Ruby not only implements basic iteration but also several patterns of iteration like function mapping, filters and reducing. Ruby also supports an alternative syntax for the basic iterating method <code>each</code>, the following three examples are equivalent:
<syntaxhighlight lang="ruby">
(0...42).each do |n|
puts n
end
</syntaxhighlight>
...and...
<syntaxhighlight lang="ruby">
for n in 0...42
puts n
end
</syntaxhighlight>
or even shorter
<syntaxhighlight lang="ruby">
42.times do |n|
puts n
end
</syntaxhighlight>
Ruby can also iterate over fixed lists by using <code>Enumerator</code>s and either calling their <code>#next</code> method or doing a for each on them, as above.
Rust
Rust makes use of external iterators throughout the standard library, including in its <code>for</code> loop, which implicitly calls the <code>next()</code> method of an iterator until it is consumed. The most basic <code>for</code> loop for example iterates over a <code>Range</code> type:
<syntaxhighlight lang="rust">
for i in 0..42 {
println!("{}", i);
}
// Prints the numbers 0 to 41
</syntaxhighlight>
Specifically, the <code>for</code> loop will call a value's <code>into_iter()</code> method, which returns an iterator that in turn yields the elements to the loop. The <code>for</code> loop (or indeed, any method that consumes the iterator), proceeds until the <code>next()</code> method returns a <code>None</code> value (iterations yielding elements return a <code>Some(T)</code> value, where <code>T</code> is the element type).
All collections provided by the standard library implement the <code>IntoIterator</code> trait (meaning they define the <code>into_iter()</code> method). Iterators themselves implement the <code>Iterator</code> trait, which requires defining the <code>next()</code> method. Furthermore, any type implementing <code>Iterator</code> is automatically provided an implementation for <code>IntoIterator</code> that returns itself.
Iterators support various adapters (<code>map()</code>, <code>filter()</code>, <code>skip()</code>, <code>take()</code>, etc.) as methods provided automatically by the <code>Iterator</code> trait.
Users can create custom iterators by creating a type implementing the <code>Iterator</code> trait. Custom collections can implement the <code>IntoIterator</code> trait and return an associated iterator type for their elements, enabling their use directly in <code>for</code> loops. Below, the <code>Fibonacci</code> type implements a custom, unbounded iterator:
<syntaxhighlight lang="rust">
struct Fibonacci(u64, u64);
impl Fibonacci {
pub fn new() -> Self {
Self(0, 1)
}
}
impl Iterator for Fibonacci {
type Item = u64;
fn next(&mut self) -> Option<Self::Item> {
let next = self.0;
self.0 = self.1;
self.1 = self.0 + next;
Some(next)
}
}
fn main() {
let fib = Fibonacci::new();
for n in fib.skip(1).step_by(2).take(4) {
println!("{n}");
}
// Prints 1, 2, 5, and 13
}
</syntaxhighlight>
UML class diagram depecting the iterator-related structs and traits in Rust
See also
References
External links
- Java's Iterator, Iterable and ListIterator Explained
- .NET interface
- Article "Understanding and Using Iterators" by Joshua Gatcomb
- Article "A Technique for Generic Iteration and Its Optimization " (217 KB) by Stephen M. Watt
- Iterators
- Boost C++ Iterator Library
- Java interface
- PHP: Object Iteration
- STL Iterators
<!-- Death link * Template reference -->
- What are iterators? - Reference description
<!--Categories-->
