Iterators are an abstraction of pointers. They provide a means to traverse containers and access objects within containers.
The generic algorithms are a collection of routines that can be used to perform common operations on container classes, such as sorting, merging and finding. The are called generic because they are independent of the container class used and of the datatype of the objects held in those container classes. The iterators are used to link a container with an algorithm. A container class has methods to return iterators that provide access to its contents. These iterators are provided, as arguments, to an algorithm. The algorithm can then traverse, access and manipulate the contents of the container.
There are a few benefits that the container classes offer over arrays.
- Container classes handle all sizing and memory allocation issues. You can't write past the end of a container class like you can with an array
- Container classes are optimized for their intended use. Some allow constant time insertions at their end, some allow this anywhere. Some are better to maintain sorted lists, some better for random access retrievals. We'll learn more on this as we proceed.
- They contain utility methods such as empty(), size() and clear() that can simplify programming.
- They contain other methods that all insertion, deletion and copying.
- They have equality, inequality and assignment operators. You can directly compare two containers or assign on to another.
- They provide iterators. Iterators are an abstraction of pointers. They allow similar syntax and use, but also have some built-in methods and checks that increase their utility.
The Big-Oh
Before we begin to discuss containers further and begin to discuss the other important components of the STL, such as the generic algorithms, it is necessary to understand a little about quantifying computational costs. We need a measure to understand how an algorithm behaves. Suppose, we run an algorithm on a container containing 5 objects during testing and now we intend to run this algorithm on a container containing 50, 500 or 5 million objects. How long will it take? How do we quantify this? Suppose, we retrieve the first element in a container. How long does it take to retrieve the 79th? The same amount of time? Seventy-nine times as much time? The answers to these questions depend on the particular container or particular algorithm, but there are standard ways of expressing and quantifying these relationships. The computational cost of an algorithm or operation is generally expressed using big-Oh notation. I'm sorry to disappoint any of my readers about the sub-title. The "Oh" indicates the order of the operation. It may be constant, or increase in some other way as the size of the container or number of elements being manipulated increase. The "big" in "big-Oh" indications that we are interest in approximate or order of magnitude estimates. If it takes twice as long to retreive element 79 as element 1, this is still considered a constant time. That's close enough. We're concerned that the retrieval time does not increase quadratically (as a square or power of 2) or cubically (power of 3) for instance. If that were the case the operation would require on order of 79^2 = 6241 or 79^3 = 493039 times the number of operations as the retrieval of element 1.
- Constant time, O(1). The cost of the operation is independent of the size of the container or number of elements. As noted above, it may not really be constant, but to a resonable approreasonable it's constant.
- Linear time, O(N). The cost of the algorithm or operation increases linearly as the number of elements increases. For example, the cost to retrieve element 79 is roughly 79 times the cost to get element 1.
- Quadratic time, O(N2). The cost of the algorithm or operation increases as the square of the number of elements. This is the case for some crude sorting algorithms.
- Logarithmic time, O(log N). The cost of the algorithm or operation increases as the log of the number of elements. As a review, logbasenumber = exponent. For example, log10100 = 2 and log1000 = 3. For decimal numbers, logs are often written without explicitly showing the base. For instance, log10X is usually written as log X. You should note that for many algorithms that work by successively dividing the number of elements in half, such as many tree based algorithms, the cost actually increases as log2 N. The cost, however, is usually expressed as O(log N) = O(log10N). Why, because logarithms in different bases of the same number are related by a constant. For now, just remember that operations that increase logarithmically increase more slowly than those that increase linearly.
- O(N log N). The cost increases on the order of N log N. This is true for some algorithms. For instance, the Fast Fourier Transform, FFT, beloved by electrical engineers world wide, has this property.
How do all these orders relate to each other? Let's look at some examples
Number of elements | O(1) | O(log N) | O(N) | O(N log N) | O(N2) |
10 | constant | 1 | 10 | 10 | 100 |
100 | constant | 2 | 100 | 200 | 10000 |
1000 | constant | 3 | 1000 | 3000 | 1000000 |
10000 | constant | 4 | 10000 | 40000 | 100000000 |
As you can see, as the number of elements increases, the order of an algorithm becomes very significant.
Sequence Containers
The STL provides both sequence and associative containers. The sequence containers are meant to hold a collection of objects of one type in a linear sequence. This is similar to an array. The STL provides three sequence containers.
- vector - The vector class provides random access retrievals of objects at any position in constant time. This is useful if your code needs to access elements in an unordered way. You need element 1, then element 5, then element 8752 and so on. The cost (cpu clocks or time needed) of getting element 8752 can be considered to be the same constant as the cost to retreive element 1, for instance. Vector is optimized for insertions and deletions at its end. These are also done in constant time. Insertions or deletions at positions other than the end require linear time.
- deque - The deque (pronounces deck) class also provides random access retrievals of objects at any position in constant time. Insertions and deletions at both the front and end of a deque are optimized and occur in constant time. Insertions and deletions in the middle require linear time.
- list - The list class does not support random access retrievals. Retrievals require linear time. This is because lists are implement with an underlying doubly linked-list structure. A linked-list is a series of nodes connected by previous and next pointers. The "previous" pointer gets you to the previous element; the next pointer to the next. Insertions and deletions at any point are efficient and occur in constant time. Insertion or deletion requires onlreassignmentnt of the next and previous pointers. No elements must be moved. Insertions and deletions in the middle of vectors and lists and at the beginning of vectors require other elements to be moved. This is why they require linear time.
In addition to the containers list above, ordinary arrays and the string class can be considered sequence containers. As will be shown later, they can also be used with the generic algorithms that provide much of the usefulness of the STL. In addition, most STL implementations also provide stack, queue and priority queue container classes. These classes are created using container adapters and one of the core STL sequence containers. We explore this in a later lesson.
Generic Algorithms
The generic algorithms provide functions that implement many common programming tasks in efficient manners. We'll explore these algorithms in detail later in the tutorial. For now, I'll just list them so that you can see the wide range of algorithms available. Many of the algorithms have overloaded function signatures. That is, they have versions that take different number and/or types of arguments. Here is a list of the algorithms. You can surmise their use from their names. The suffix "if" on some of the method names indicates that the method takes a function object as an argument. We'll see more on this shortly. The suffix "copy" indicates that the method copies its results into a destination container.
adjacent_find |
binary_search |
copy |
copy_backward |
count |
count_if |
equal |
equal_range |
fill |
fill_n |
find |
find_end |
find_first_of |
find_if |
for_each |
generate |
generate_n |
includes |
inplace_merge |
iter_swap |
lexicographical_compare |
lower_bound |
make_heap |
max |
max_element |
merge |
min |
min_element |
mismatch |
next_permutation |
nth_element |
partial_sort |
partition |
pop_heap |
prev_permutation |
push_heap |
random_shuffle |
remove |
remove_copy |
remove_copy_if |
replace |
replace_copy |
replace_copy_if |
replace_if |
reverse |
reverse_copy |
rotate |
rotate_copy |
search |
search_n |
set_difference |
set_intersection |
set_symmetric_difference |
set_union |
sort |
sort_heap |
stable_partition |
stable_sort |
swap |
swap_ranges |
transform |
unique |
unique_copy |
upper_bound |
The point to take away is that there are many algorithms available and that before writting code, check to see if an existing algorithm can suffice.
Function Objects
Many of the generic algorithms on the previous page perform some type of comparison or test as they execute. For instance, the count function returns the number of elements equal to some value. The find function returns an iterator containing the location of the first element equal to some value. If this were the limit to the flexibility of these algorithms they would still be useful, but the STL was designed to make the algorithms significantly more flexible and more useful. Many of the algorithms have versions that take a function or function object as an parameter. Function objects are classes that have the function operator, (), overloaded. These function or function objects usually take one or two parameters and return a boolean, or true/false value. A function that returns a boolean value is also called a predicate. Let's see how a predicate can be used to extend the functionality of the count algorithm.
#include |
OUTPUT:
2 elements equal to 25
1 elements less than 25
2 elements greater than 23
Notice how the behavior of count is modified when a predicate, such as the functions less or more, is provided as a parameter. Instead of a simple function as shown in this example, a class that overloads the function operator, (), could also be used. We will see the benefits of this in a later lesson. Also, please note that the STL provides built-in function objects via the functional include file. This will be covered in a later lesson
#include |
OUTPUT:
2 elements equal to 25
1 elements less than 25
2 elements greater than 23
0 comments;Click here for request info on this topic:
Post a Comment