Thursday, August 30, 2007

STL Algorithms


Special function objects allow us to override the default operator semantics of the generic algorithms

The range of elements over which the algorithm is to traverse is marked by a pair of iterators: a first iterator that addresses the first element to operate upon and a last iterator that marks 1 past the last element to operate upon. The element addressed by last is not itself operated upon; it serves as a sentinel to terminate the traversal. It also serves as the return value indicating no position. If the value is found, the iterator marking that position is returned.

Two versions of each algorithm: one that uses the equality operator of the underlying type of the element, and one that uses a function object or pointer to a function to implement the comparison.

Five categories of iterators predefined by the standard library.

A ForwardIterator supports both reading and writing of the element it addresses.

int *presult = find( &ia[0], &ia[6], search_value );

cout "The value " search_value( presult == &ia[6] ? " is not present" : " is present" )endl;

If the pointer returned is equal to the address of &ia[6] (that is, 1 past the elements of ia), then the search is unsuccessful; otherwise, the value is found.

The copy() algorithm takes as its first two arguments a pair of iterators marking the range of elements to be copied. The third argument is an iterator marking where to begin placing the elements copied. back_inserter is spoken of as an iterator adaptor; it causes elements to be inserted at the back of the vector passed to it as an argument.

unique() removes duplicate values from the container, but only those duplicate values that are adjacent. Given the sequence 01123211, the result is 012321 and not 0123. To achieve the latter result, we must first sort() the vector; that is, the sequence 01111223 becomes 0123. Well, almost. Actually, the result is 01231223.

unique() behaves somewhat unintuitively: the size of the container it acts upon is not changed. Rather, each unique element is assigned in turn to the next free slot beginning with the first. In our example, the physical result is 01231223; the sequence 1223 represents, so to speak, the refuse of the algorithm. unique() returns an iterator marking the beginning of this refuse. Typically, this iterator is then passed to the associated erase() container operation to delete the invalid elements. (Because the built-in array does not support an erase() operation, the family of unique() algorithms is less suited to the built-in array type.) Here is this portion of our function:

void process_vocab( vector<>*pvec )

{

// ...

// sort the elements of texts

sort( texts.begin(), texts.end() );

// delete all duplicate elements

vector::iterator it;

it = unique( texts.begin(), texts.end() );

texts.erase( it, texts.end() );

// ...

}



stable_sort() preserves the relative position of equivalent elements; within elements of the same length, the current alphabetical ordering is maintained.

remove() behaves the same as unique() in that it does not actually alter the size of the container but rather separates the elements into those to be retained (copying them in turn to the front of the container) and those to be removed (which remain at the back). It returns an iterator marking the first element of those to be removed. Here is how we might use it to erase a collection of common words we do not wish to retain within the vector:

void process_vocab( vector<>*pvec )

{

// ...

static string rw[] = { "and", "if", "or", "but", "the" };

vector<> remove_words( rw, rw+5 );

vector::iterator it2 = remove_words.begin();

for ( ; it2 != remove_words.end(); ++it2 ) {

// just to show another form of count()

int cnt = count( texts.begin(), texts.end(), *it2 );

cout cnt " instances removed: " (*it2) endl;

texts.erase( remove(texts.begin(),texts.end(),*it2 ) texts.end()

);

}

}

IMP

bool less_than( const string & s1, const string & s2 )

{

return s1.size() <>

}

void process_vocab( vector<>*pvec )

{

// ...

// sort the elements of texts by length

// preserving the previous order of elements

stable_sort( texts.begin(), texts.end(), less_than );

// ...

}


Although this gets the job done, it is considerably less efficient than we might wish.
less_than() is implemented as a single statement. Normally, it would be invoked as an inline function. By passing it in as a pointer to function, however, we prevent it from being inlined.

An alternative strategy to preserve the inlineability of the operation is the use of a function object.


The benefits of a function object over a pointer to function.

First, if the overloaded call operator is an inline function, the compiler is able to perform the inlining, providing a possibly significant efficiency gain.

Second, the function object can hold an arbitrary amount of additional data, either cached results or data to help in the current operation.

Three sources of function objects.

􀁺 A set of arithmetic, relational, and logical function objects is predefined by the standard library.

􀁺 A set of predefined function adaptors allows us to specialize or extend the predefined function objects (or, for that matter, any function object).

􀁺 We can define our own function objects, to be passed to the generic algorithms and possibly against which to apply the function adaptors.

Function Adaptors for Function Objects

The standard library also provides a set of function adaptors with which to specialize and extend both unary and binary function objects. The adaptors are special classes divided into the following two categories.

  1. Binders: a binder is a function adaptor that converts a binary function object into a unary object by binding one of the arguments to a particular value. For example, to count all the elements within a container that are less than or equal to 10, we would pass count_if() a less_equal function object with one of its arguments bound to 10. In the following section we'll look at how we do that.

2. Negators: a negator is a function adaptor that reverses the truth value of a function object. For example, to count all the elements within a container that are greater than 10, we could pass count_if() the negator of our less_equal function object with one of its arguments bound to 10. Of course, in this case, it is considerably more straightforward to simply pass a binder of the greater function object with one of its arguments bound to 10.

There are two predefined binder adaptors provided by the standard library: bind1st and bind2nd.

bind1st binds the value to the first argument of the binary function object, and

bind2nd binds the value to the second

0 comments;Click here for request info on this topic:

Post a Comment