Tuesday, November 5, 2024

C++:Beyond The Standard Library

Once you have mastered the C++ language and the standard library, what do you tackle next? The C++ community has plenty to keep you busy for years to come. To start with, you might take a look at the many C++ libraries that extend the standard. Sure, the C++ library has strings, I/O streams, containers, iterators, algorithms, and all that fun stuff. But it lacks much that is commonplace in modern computing: networks, GUIs, concurrency, and so on.

As a result, numerous C++ library projects have arisen to complete, correct, or extend the standard C++ library. Some were started even before the C++ standard was complete; others arose later. This article highlights a number of open source C++ libraries, including ACE, Loki, and Boost. (Many other projects exist, of course, but it is not possible to present them all in a single article.) If you have time to learn only one extra library, I recommend Boost.

Numerics

Two projects, Blitz++ and Matrix Template Library (MTL), provide high-performance numeric computing. In a way, they represent what valarray should have been.

The Blitz++ project was one of the early projects to take advantage of templates to shift part of the computing burden from runtime to compile time. It has powerful array and matrix classes, operators, and functions, with strides, subarrays, and so on.

One of the key optimizations is that arithmetic operators and mathematical functions involving Blitz++ arrays do not compute values immediately. Instead, they return expression objects. When an expression object is assigned to an array, the expression is computed, storing the results directly in the target of the assignment, without the need for large temporary arrays.

The MTL also uses templates to optimize performance. Its approach is more reminiscent of the STL, using containers (such as Matrix and Vector), iterators, and generic algorithms (such as transpose, dot (dot project), and so on).

ACE

ACE (ADAPTIVE Communication Environment) has been around for a long time, and provides a framework for networks, communication, and concurrency. It is well documented in two books: C++ Network Programming,: Mastering Complexity with ACE and Patterns, and C++ Network Programming: Systematic Reuse with ACE and Frameworks, both by Douglas Schmidt and Stephen Huston.

ACE is big. It does a lot, and it does a lot for you. It has concurrency, synchronization primitives, sockets, and related network classes. ACE also comes with TAO (The ACE Orb), for CORBA support. Most important, ACE is an integrated framework, where the networking and concurrency work together.

At the bottommost layer, ACE has a portable API to interface with the operating system (Windows, Linux, and many other UNIX variants). On top of this API is a set of C++ wrapper classes. The wrappers include synchronization primitives (such as Mutex and Semaphore), interprocess communication with sockets and pipes (such as ACE_SOCK_Connector, ACE_SOCK_FIFO, and ACE_SOCK_Addr), and concurrency (such as Future and Servant thread classes).

The next layer up is the main framework. The Reactor and Proactor classes dispatch messages in an event-oriented fashion, integrated with all the ACE synchronization classes. At this level, the ACE Orb is also integrated for CORBA.

You can download the source code from the main Distributed Object Computing web site at Washington University. (Follow the ACE link.)

Loki

Loki was invented by Andrei Alexandrescu and is documented in his book, Modern C++ Design. It pushes templates to their limits (and beyond for many compilers), showing the power of templates for policy-based programming and metaprogramming.

Unlike many other projects, Loki was developed on Windows, not Unix, and the files are DOS text files.

To see the benefits of policy-based programming, consider the problem of implementing a singleton class. A singleton class is a class for which no more than one instance can exist at runtime. Different needs for singletons give rise to different implementations. For example, sometimes you might want a single, static instance. Other times, you might want lazy instantiation, so the singleton object is not created until it is needed, but then is not destroyed until the program exits. Other times, you might want a dynamic instance that is destroyed when it is no longer needed, only to be recreated when it is needed again. A single class cannot easily meet these different demands, but a policy-based template can.

Loki declares the SingletonHolder template, which can be used to manage a singleton. You provide a type as a template parameter, and the SingletonHolder creates the singleton instance for you, which it returns from its Instance() member function. Other template parameters specify the policies that govern the singleton.

template
<
    typename T,
    template <class> class CreationPolicy = CreateUsingNew,
    template <class> class LifetimePolicy = DefaultLifetime,
    template <class> class ThreadingModel = SingleThreaded
>
class SingletonHolder
{
public:
    static T& Instance();
        
private:
    // Helpers
    static void MakeInstance();
    static void DestroySingleton();
        
    // Protection
    SingletonHolder();
        
    // Data
    typedef typename ThreadingModel<T*>::VolatileType PtrInstanceType;
    static PtrInstanceType pInstance_;
    static bool destroyed_;
};

The CreationPolicy determines how the singleton instance is created. Loki provides CreateUsingNew, CreateUsingMalloc, and CreateStatic. The latter constructs the singleton object in a static buffer. You can supply a custom creation policy by writing your own class template that provides static Create() and Destroy() member functions. The SingletonHolder takes care of the details of when Create() and Destroy() are called. Following is the CreateUsingNew policy:

template <class T> struct CreateUsingNew
{
    static T* Create()
    { return new T; }
        
    static void Destroy(T* p)
    { delete p; }
};

The LifetimePolicy determines the singleton’s lifetime. The DefaultLifetime destroys the singleton object when the program exits (using an atexit handler). If the program attempts to refer to the singleton object after it has been destroyed, the LifetimePolicy object throws an exception. The PhoenixLifetime policy is similar, except it allows the singleton to be recreated after it has been destroyed.

The SingletonWithLongevity policy lets you control the relative lifetimes of multiple singletons. A longevity, in this case, is an unsigned integer. Objects with higher valued longevity are destroyed after objects with lower valued longevity. You must provide a free function, GetLongevity, that returns the longevity of an object.

Finally, the NoDestroy policy never destroys the singleton object. Use of this policy can result in memory or other resource leaks. To implement your own lifetime policy, write a class template that has static ScheduleDestruction and OnDeadReference functions. For example, the following code is the DefaultLifetime policy:

template <class T>
struct DefaultLifetime
{
    static void ScheduleDestruction(T*, void (*pFun)())
    { std::atexit(pFun); }

    static void OnDeadReference()
    { throw std::logic_error(std::string("Dead Reference Detected")); }
};

The final template parameter is the ThreadingModel, which can be SingleThreaded or ClassLevelLockable. The first does not perform any locking, and is suitable for single-threaded applications. ClassLevelLockable creates a lock for each class. Loki also has ObjectLevelLockable, but SingletonHolder is never instantiated because you use only the static Instance() function. Thus, you have no reason to use ObjectLevelLockable in this case.

To use SingletonHolder, start by declaring your class. Be sure to make the constructor private, with the creation policy as a friend, or otherwise ensure that the public cannot circumvent SingletonHolder and create arbitrary instances of your class. Then instantiate SingletonHolder with your type as the first template argument. You can use the default values for the remaining template arguments, or choose one of Loki’s policy implementations, or write your own. The following is a simple example:

#include <cassert>
#include "Loki/Singleton.h"

class thingy {
    thingy() {}
    template <typename T> friend class Loki::CreateUsingNew;
public:
};

typedef Loki::SingletonHolder<thingy> single_thingy;

void foo(thingy& t)
{
    assert(&t == &single_thingy::Instance());
}

int main()
{
    foo(single_thingy::Instance());
}

Boost

I saved the best for last. Some members of the C++ standard committee started Boost as a platform for exploring future directions for the C++ standard library. Already, parts of Boost are being added to the standard library (such as type traits, regular expressions, smart pointers, enhanced binders, and adapters). You can expect to see other parts of Boost added in the future.

Boost is big and eclectic. It contains everything from type traits to quaternions (a generalization of imaginary numbers to four dimensions), from parsing to linear algebra, from simple memory utilities to threads and synchronization.

This section presents only a few of the packages in Boost that are part of the proposed library extension: tuples, smart pointers, lambda expressions, and the Spirit parser generator.

Building and Installing Boost

The web site has instructions for building and installing Boost. You need to install Boost Jam, which is like make, but is designed for greater portability and ease of use. Follow the directions on the web site to build and install Boost.

Tuples

A tuple is a generalization of the standard pair class template. Instead of being limited to two elements, a tuple can contain up to 10 elements (and the maximum can be extended if necessary). The following example shows one way to use boost::tuple:

#include <numeric>
#include <iostream>
#include <ostream>
#include <vector>
#include "boost/tuple/tuple.hpp"

// Store count, sum, and sum of squares.
typedef boost::tuple<std::size_t, double, double> Stats;

// Accumulate statistics.
Stats stats(Stats s, double x)
{
  ++s.get<0>();
  s.get<1>() += x;
  s.get<2>() += x * x;
  return s;
}

int main()
{
  std::vector<double> v;
  ... fill v with data ...

  Stats s = std::accumulate(v.begin(), v.end(),
                  boost::make_tuple(0U, 0.0, 0.0), stats);
  std::cout << "count = " << s.get<0>() << 'n';
  std::cout << "mean  = " << s.get<1>() / s.get<0>() << 'n';
}

Smart Pointers

Boost has several smart pointer class templates. They solve a number of problems that the standard auto_ptr<> class template does not. For example, you cannot store an auto_ptr<> object in a standard container, but you can store a boost::shared_ptr<> object. Boost has several other smart pointer templates; for the sake of brevity, the following example only shows shared_ptr<>:

#include <algorithm>
#include <iostream>
#include <iterator>
#include <ostream>
#include <string>
#include <vector>
#include "boost/smart_ptr.hpp"

// A company has employee. Each employee can be Exempt
// or NonExempt. Certain Exempt employees are Managers,
// which are distinguished by having a group of Employees.
// All the memory for said employees is managed
// automatically by Boost shared_ptr<> templates.

class Employee {
public:
  Employee(const std::string& name) : name_(name) {}
  virtual ~Employee() {}
  const std::string name() const { return name_; }
  virtual void print(std::ostream&);
private:
  const std::string name_;
};
typedef boost::shared_ptr<Employee> employee;

void Employee::print(std::ostream& out)
{
  out << name() << 'n';
}

class Exempt : public Employee {
public:
  Exempt(const std::string& name) : Employee(name) {}
};

class NonExempt : public Employee {
public:
  NonExempt(const std::string& name) : Employee(name) {}
};

class Manager : public Exempt {
public:
  Manager(const std::string& name) : Exempt(name) {}
  void add(Employee* e) { group_.push_back(employee(e)); }
  void add(employee e) { group_.push_back(e); }
  virtual void print(std::ostream&);
private:
  std::vector<employee> group_;
};
typedef boost::shared_ptr<Manager> manager;

void Manager::print(std::ostream& out)
{
  out << name() << " { ";
  std::copy(group_.begin(), group_.end(),
    std::ostream_iterator<employee>(out, ""));
  out << "}n";
}

// Make it easier to print any kind of employee.
template<typename charT, typename traits>
std::basic_ostream<charT,traits>& operator<<
     (std::basic_ostream<charT,traits>& out, employee e)
{
  e->print(out);
  return out;
}

int main()
{
  manager ceo(new Manager("I. M. Portant"));
  manager muddle(new Manager("Muddled manager"));
  ceo->add(muddle);
  muddle->add(new Exempt("J. Doe"));
  muddle->add(new NonExempt("J. Dough"));
  ceo->print(std::cout);

Lambda Expressions

C++ supports multiple programming paradigms, including procedural, object-oriented, and generic programming. You probably don’t think of C++ when it comes to functional programming, though. Nonetheless, Boost supports functional programming in C++, albeit with some limitations.

Central to most functional programming languages is the ability to write a function on the fly. These unnamed functions are typically called Lambda Functions. (The name comes from the lambda-calculus. If you don’t know about it, don’t worry.)

For example, suppose you have a vector of numbers and you want to print the numbers, one per line. You can use an ostream_iterator as follows:

vector<int> data;
copy(data.begin(), data.end(), ostream_iterator<int>(cout, "n"));

Or you can use a lambda function:

for_each(data.begin(), data.end(), 
cout << _1 << 'n');

The last argument to for_each must be a function pointer or a functor—an object that has an overloaded operator(), which can be called with a single argument, namely, each element of the range. In this case, the Boost lambda library overloads the operator<< (and all the other overloadable operators) to create a functor. The _1 is a placeholder for the functor’s argument, that is, the values from the range.

As the output formatting grows more complex, the value of the lambda function increases. For example, a common problem is printing the contents of a map. You cannot use ostream_iterator because it requires operator<<, which is not defined for std::pair. Instead, you can write your own print function, and call it from for_each, but that moves the print logic away from where it is needed. Instead, you can use a lambda function:

map<string, unsigned long> counts;
typedef pair<const string, unsigned long> mypair;
for_each(counts.begin(), counts.end(),
  cout << bind(&mypair::first, _1) << 't' 
       << bind(&mypair::second, _1) << 'n');

The nature of the lambda function requires the bind function, which delays binding the member pointer with an object until runtime. Again _1 refers to each element of the range. Every time for_each iterates, bind binds the element of the range to the member pointer, and evaluates the expression.

Lambda expressions are used heavily in the Spirit parser generator, as described in the next section.

Spirit Parser Generator

Perhaps the most interesting and exciting part of the latest Boost release (1.30.0) is the Spirit parser generator. Spirit takes advantage of C++ operator overloading and expression templates to let you define a grammar for a recursive-descent parser in C++. You don’t need an external parser-generating tool such as YACC or ANTLR. The drawback is that you cannot define grammars that are as complex as what parser-generators can handle, but Spirit is robust and can handle many complicated tasks.

You can use Spirit without Boost. Spirit is hosted at SourceForge.

Spirit makes heavy use of lambda functions and expression templates. Following is a simple grammar in BNF (Backus-Naur Form):

start ::= expression
factor ::= '(' expression ')' | integer
term ::= factor '*' factor | factor '/' factor
expression ::= term '+' term | term '-' term

Following is the same grammar in YACC syntax:

start: expression;
factor: '(' expression ')' | integer;
term: factor '*' factor | factor '/' factor;
expression: term '+' term | term '-' term;

Following is the same grammar in Spirit syntax:

start = expression;
factor = '(' >> expression >> ')' | int_p;
term = factor >> '*' >> factor | factor >> '/' >> factor;
expression = term >> '+' >> term | term >> '-' >> term;

Notice how little difference there is. The most significant difference is the use of the >> operator to separate symbols in each production. All the operators are normal, albeit overloaded, C++ operators. The production names are ordinary objects of type rule<>.

As grammars grow more complex, the C++ syntax forces more differences. For example, optional items use the ! operator. The Kleene star, which is postfix in BNF, uses the unary * operator, which is prefix in C++. For example, suppose the grammar is extended to allow a series of terms in an expression:

expression = term >> *( ('+' >> term) | ('-' >> term) ) ;

Spirit generates a recursive descent grammar, so you cannot use left-recursive rules, such as the following:

// illegal: causes infinite recursion in Spirit
expression = term | expression >> '+' >> term | expression 
             >> '-' >> term ;

All left-recursive rules are trivially rewritten to avoid left-recursion. (See any textbook on parsing for details. For example, consult the “Dragon Book”: Compilers: Principles, Techniques, and Tools, by Alfred V. Aho, et al.)

In a real parser, you need more than just a grammar. You also need semantic actions: the parser must do something when it matches a production. Specify semantic actions as lambda expressions.

Spirit was added to Boost recently, and the author of Spirit wrote his own Lambda library, called Phoenix. Spirit was built to use Phoenix (or Phoenix was written to implement Spirit, depending on your point of view). In time, as Spirit and Boost continue to merge, the Boost lambda library will incorporate the best features of Phoenix into a single lambda library. Until then, you need to use Phoenix, although you can mix in some Boost if you really want to confuse yourself.

You can define a context for each production. The context can store information, and you can assign to the context in a semantic action, or obtain the context from another production. Semantic actions are specified in square brackets. For example, suppose the earlier grammar takes a double value that is computed on the fly. In Phoenix, you use arg1 instead of Boost’s _1.

start = expression [start.value = arg1];
factor = '(' >> expression[factor.value=arg1] >> ')'
       | int_p[factor.value = arg1];
term = factor[term.value=arg1] >> '*' >> factor[term.value *= arg1]
     | factor[term.value=arg1] >> '/' >> factor[term.value /= arg1] ;
expression = term[expression.value=arg1] >> '+' >> term[expression.value += arg1]
           | term[expression.value=arg1] >> '-' >> term[expression.value -= arg1];

You can also call functions or functors, use binders to call arbitrary member functions, and so on. Numerous built-in parsers come with Spirit, such as int_p to parse an integer, real_p to parse a real number, and so on.

A parser gets its input from a scanner. You can scan a null-terminated character string, a range of characters specified by two iterators, or write a custom scanner. Spirit lets you control every aspect of the scanning and parsing, or you can choose the defaults, which work well for traditional text-oriented languages.

Click here for a complete example of a simple calculator.

First published by the O’Reilly Network.


C++ in a Nutshell

packs an enormous amount of information on C++ (and the many libraries used with it) in an indispensable quick reference for those who live in a deadline-driven world and need the facts but not the frills. Cross-references link related methods, classes, and other key features. This is an ideal resource for students as well as professional programmers.

cover

Related Articles

1 COMMENT

LEAVE A REPLY

Please enter your comment!
Please enter your name here

Latest Articles