# Iterators
# Overview
# Iterators are Positions
Iterators are a means of navigating and operating on a sequence of elements and are a generalized extension of pointers. Conceptually it is important to remember that iterators are positions, not elements. For example, take the following sequence:
A B C
The sequence contains three elements and four positions
+---+---+---+---+
| A | B | C | |
+---+---+---+---+
Elements are things within a sequence. Positions are places where meaningful operations can happen to the sequence. For example, one inserts into a position, before or after element A, not into an element. Even deletion of an element (erase(A)
) is done by first finding its position, then deleting it.
# From Iterators to Values
To convert from a position to a value, an iterator is dereferenced:
auto my_iterator = my_vector.begin(); // position
auto my_value = *my_iterator; // value
One can think of an iterator as dereferencing to the value it refers to in the sequence. This is especially useful in understanding why you should never dereference the end()
iterator in a sequence:
+---+---+---+---+
| A | B | C | |
+---+---+---+---+
↑ ↑
| +-- An iterator here has no value. Do not dereference it!
+-------------- An iterator here dereferences to the value A.
In all the sequences and containers found in the C++ standard library, begin()
will return an iterator to the first position, and end()
will return an iterator to one past the last position (not the last position!). Consequently, the names of these iterators in algorithms are oftentimes labelled first
and last
:
+---+---+---+---+
| A | B | C | |
+---+---+---+---+
↑ ↑
| |
+- first +- last
It is also possible to obtain an iterator to any sequence, because even an empty sequence contains at least one position:
+---+
| |
+---+
In an empty sequence, begin()
and end()
will be the same position, and neither can be dereferenced:
+---+
| |
+---+
↑
|
+- empty_sequence.begin()
|
+- empty_sequence.end()
The alternative visualization of iterators is that they mark the positions between elements:
+---+---+---+
| A | B | C |
+---+---+---+
↑ ^ ^ ↑
| |
+- first +- last
and dereferencing an iterator returns a reference to the element coming after the iterator. Some situations where this view is particularly useful are:
insert
operations will insert elements into the position indicated by the iterator,erase
operations will return an iterator corresponding to the same position as the one passed in,- an iterator and its corresponding reverse iterator (opens new window) are located in the same .position between elements
# Invalid Iterators
An iterator becomes invalidated if (say, in the course of an operation) its position is no longer a part of a sequence. An invalidated iterator cannot be dereferenced until it has been reassigned to a valid position. For example:
std::vector<int>::iterator first;
{
std::vector<int> foo;
first = foo.begin(); // first is now valid
} // foo falls out of scope and is destroyed
// At this point first is now invalid
The many algorithms and sequence member functions in the C++ standard library have rules governing when iterators are invalidated. Each algorithm is different in the way they treat (and invalidate) iterators.
# Navigating with Iterators
As we know, iterators are for navigating sequences. In order to do that an iterator must migrate its position throughout the sequence. Iterators can advance forward in the sequence and some can advance backwards:
auto first = my_vector.begin();
++first; // advance the iterator 1 position
std::advance(first, 1); // advance the iterator 1 position
first = std::next(first); // returns iterator to the next element
std::advance(first, -1); // advance the iterator 1 position backwards
first = std::next(first, 20); // returns iterator to the element 20 position forward
first = std::prev(first, 5); // returns iterator to the element 5 position backward
auto dist = std::distance(my_vector.begin(), first); // returns distance between two iterators.
Note, second argument of std::distance should be reachable from the first one(or, in other words first
should be less or equal than second
).
Even though you can perform arithmetic operators with iterators, not all operations are defined for all types of iterators. a = b + 3;
would work for Random Access Iterators, but wouldn't work for Forward or Bidirectional Iterators, which still can be advanced by 3 position with something like b = a; ++b; ++b; ++b;
. So it is recommended to use special functions in case you are not sure what is iterator type (for example, in a template function accepting iterator).
# Iterator Concepts
The C++ standard describes several different iterator concepts. These are grouped according to how they behave in the sequences they refer to. If you know the concept an iterator models (behaves like), you can be assured of the behavior of that iterator regardless of the sequence to which it belongs. They are often described in order from the most to least restrictive (because the next iterator concept is a step better than its predecessor):
- Input Iterators : Can be dereferenced only once per position. Can only advance, and only one position at a time.
- Forward Iterators : An input iterator that can be dereferenced any number of times.
- Bidirectional Iterators : A forward iterator that can also advance backwards one position at a time.
- Random Access Iterators : A bidirectional iterator that can advance forwards or backwards any number of positions at a time.
- Contiguous Iterators (since C++17) : A random access iterator that guaranties that underlying data is contiguous in memory.
Algorithms can vary depending on the concept modeled by the iterators they are given. For example, although random_shuffle
can be implemented for forward iterators, a more efficient variant that requires random access iterators could be provided.
# Iterator traits
Iterator traits provide uniform interface to the properties of iterators. They allow you to retrieve value, difference, pointer, reference types and also category of iterator:
template<class Iter>
Iter find(Iter first, Iter last, typename std::iterator_traits<Iter>::value_type val) {
while (first != last) {
if (*first == val)
return first;
++first;
}
return last;
}
Category of iterator can be used to specialize algorithms:
template<class BidirIt>
void test(BidirIt a, std::bidirectional_iterator_tag) {
std::cout << "Bidirectional iterator is used" << std::endl;
}
template<class ForwIt>
void test(ForwIt a, std::forward_iterator_tag) {
std::cout << "Forward iterator is used" << std::endl;
}
template<class Iter>
void test(Iter a) {
test(a, typename std::iterator_traits<Iter>::iterator_category());
}
Categories of iterators are basically iterators concepts, except Contiguous Iterators don't have their own tag, since it was found to break code.
# Vector Iterator
begin
returns an iterator
to the first element in the sequence container.
end
returns an iterator
to the first element past the end.
If the vector object is const
, both begin
and end
return a const_iterator
. If you want a const_iterator
to be returned even if your vector is not const
, you can use cbegin
and cend
.
Example:
#include <vector>
#include <iostream>
int main() {
std::vector<int> v = { 1, 2, 3, 4, 5 }; //intialize vector using an initializer_list
for (std::vector<int>::iterator it = v.begin(); it != v.end(); ++it) {
std::cout << *it << " ";
}
return 0;
}
Output:
1 2 3 4 5
# Map Iterator
An iterator to the first element in the container.
If a map object is const-qualified, the function returns a const_iterator
. Otherwise, it returns an iterator
.
// Create a map and insert some values
std::map<char,int> mymap;
mymap['b'] = 100;
mymap['a'] = 200;
mymap['c'] = 300;
// Iterate over all tuples
for (std::map<char,int>::iterator it = mymap.begin(); it != mymap.end(); ++it)
std::cout << it->first << " => " << it->second << '\n';
Output:
a => 200
b => 100
c => 300
# Reverse Iterators
If we want to iterate backwards through a list or vector we can use a reverse_iterator
. A reverse iterator is made from a bidirectional, or random access iterator which it keeps as a member which can be accessed through base()
.
To iterate backwards use rbegin()
and rend()
as the iterators for the end of the collection, and the start of the collection respectively.
For instance, to iterate backwards use:
std::vector<int> v{1, 2, 3, 4, 5};
for (std::vector<int>::reverse_iterator it = v.rbegin(); it != v.rend(); ++it)
{
cout << *it;
} // prints 54321
A reverse iterator can be converted to a forward iterator via the base()
member function. The relationship is that the reverse iterator references one element past the base()
iterator:
std::vector<int>::reverse_iterator r = v.rbegin();
std::vector<int>::iterator i = r.base();
assert(&*r == &*(i-1)); // always true if r, (i-1) are dereferenceable
// and are not proxy iterators
+---+---+---+---+---+---+---+
| | 1 | 2 | 3 | 4 | 5 | |
+---+---+---+---+---+---+---+
↑ ↑ ↑ ↑
| | | |
rend() | rbegin() end()
| rbegin().base()
begin()
rend().base()
In the visualization where iterators mark positions between elements, the relationship is simpler:
+---+---+---+---+---+
| 1 | 2 | 3 | 4 | 5 |
+---+---+---+---+---+
↑ ↑
| |
| end()
| rbegin()
begin() rbegin().base()
rend()
rend().base()
# Stream Iterators
Stream iterators are useful when we need to read a sequence or print formatted data from a container:
// Data stream. Any number of various whitespace characters will be OK.
std::istringstream istr("1\t 2 3 4");
std::vector<int> v;
// Constructing stream iterators and copying data from stream into vector.
std::copy(
// Iterator which will read stream data as integers.
std::istream_iterator<int>(istr),
// Default constructor produces end-of-stream iterator.
std::istream_iterator<int>(),
std::back_inserter(v));
// Print vector contents.
std::copy(v.begin(), v.end(),
//Will print values to standard output as integers delimeted by " -- ".
std::ostream_iterator<int>(std::cout, " -- "));
The example program will print 1 -- 2 -- 3 -- 4 --
to standard output.
# C Iterators (Pointers)
// This creates an array with 5 values.
const int array[] = { 1, 2, 3, 4, 5 };
#ifdef BEFORE_CPP11
// You can use `sizeof` to determine how many elements are in an array.
const int* first = array;
const int* afterLast = first + sizeof(array) / sizeof(array[0]);
// Then you can iterate over the array by incrementing a pointer until
// it reaches past the end of our array.
for (const int* i = first; i < afterLast; ++i) {
std::cout << *i << std::endl;
}
#else
// With C++11, you can let the STL compute the start and end iterators:
for (auto i = std::begin(array); i != std::end(array); ++i) {
std::cout << *i << std::endl;
}
#endif
This code would output the numbers 1 through 5, one on each line like this:
1
2
3
4
5
# Breaking It Down
const int array[] = { 1, 2, 3, 4, 5 };
This line creates a new integer array with 5 values. C arrays are just pointers to memory where each value is stored together in a contiguous block.
const int* first = array;
const int* afterLast = first + sizeof(array) / sizeof(array[0]);
These lines create two pointers. The first pointer is given the value of the array pointer, which is the address of the first element in the array. The sizeof
operator when used on a C array returns the size of the array in bytes. Divided by the size of an element this gives the number of elements in the array. We can use this to find the address of the block after the array.
for (const int* i = first; i < afterLast; ++i) {
Here we create a pointer which we will use as an iterator. It is initialized with the address of the first element we want to iterate over, and we'll continue to iterate as long as i
is less than afterLast
, which means as long as i
is pointing to an address within array
.
std::cout << *i << std::endl;
Finally, within the loop we can access the value our iterator i
is pointing to by dereferencing it. Here the dereference operator *
returns the value at the address in i
.
# Write your own generator-backed iterator
A common pattern in other languages is having a function that produces a "stream" of objects, and being able to use loop-code to loop over it.
We can model this in C++ as
template<class T>
struct generator_iterator {
using difference_type=std::ptrdiff_t;
using value_type=T;
using pointer=T*;
using reference=T;
using iterator_category=std::input_iterator_tag;
std::optional<T> state;
std::function< std::optional<T>() > operation;
// we store the current element in "state" if we have one:
T operator*() const {
return *state;
}
// to advance, we invoke our operation. If it returns a nullopt
// we have reached the end:
generator_iterator& operator++() {
state = operation();
return *this;
}
generator_iterator operator++(int) {
auto r = *this;
++(*this);
return r;
}
// generator iterators are only equal if they are both in the "end" state:
friend bool operator==( generator_iterator const& lhs, generator_iterator const& rhs ) {
if (!lhs.state && !rhs.state) return true;
return false;
}
friend bool operator!=( generator_iterator const& lhs, generator_iterator const& rhs ) {
return !(lhs==rhs);
}
// We implicitly construct from a std::function with the right signature:
generator_iterator( std::function< std::optional<T>() > f ):operation(std::move(f))
{
if (operation)
state = operation();
}
// default all special member functions:
generator_iterator( generator_iterator && ) =default;
generator_iterator( generator_iterator const& ) =default;
generator_iterator& operator=( generator_iterator && ) =default;
generator_iterator& operator=( generator_iterator const& ) =default;
generator_iterator() =default;
};
live example (opens new window).
We store the generated element early so we can more easily detect if we are already at the end.
As the function of an end generator iterator is never used, we can create a range of generator iterators by only copying the std::function
once. A default constructed generator iterator compares equal to itself, and to all other end-generator-iterators.