SI221 (Fall AY2021)

Programming with Templates

Introduction

One of the guiding principles in dealing with computers, whether it be as a programmer or as a user, is never keep multiple copies of the same data (backing up data is a different issue, by the way). Why? Because the possiblity for error is too great. First, you may make an error entering one or more copies, so they may not agree even from the get go. Second, as time passes you're bound to want to modify the data, and you'll almost certainly forget to keep all of the several copies up-to-date. Then you've got different versions running around, and life gets pretty ugly.

The holy grail of software engineering - a fancy term for programming - is a concept we've discussed before: code reuse. Obviously, you don't want to keep reinventing (or reimplementing as the case may be) the wheel. Moreover, if you're not reusing code, if instead you're writing the same stuff over and over again, you're in exactly the situation from the previous paragraph - multiple copies of the same thing - and you're subject to all the pitfalls that implies. So to whatever extent we can, we want to write code that not only does the job at hand, but is likely to be useful in the future.

One key to writing reusable code is generality. If your program needs to compute the cube of a value, don't write a "cube" function, write a "power" function and just pass 3 as an argument. The "power" function is certainly more likely to be useful in other programs than the less general "cube" function. If your program needs to read data from a file input.txt, don't write your "read" function so that it can only read from input.txt, write it so that it takes an istream as an argument. That way in the future you can reuse that function to read from different files, or from cin.

Templates are an extremely powerful tool for writing code that is as general as possible, and thus for writing reusable code. Why implement a class for stacks of ints when you can implement a class for stacks of any kind of object? Which is likely to be more reusable? We're going to learn the basics of writing template classes and functions, and then concentrate on being users of the Standard Template Library, which is a large collection of templated classes and functions that is part of the C++ language.

The basic template mechanism

We saw the basics of templates last class. If you have a program like this:

template <class DataType>
DataType maximum(DataType a, DataType b) {
  if (a < b)
    return b;
  else
    return a;
}

int main() {
  int a = 5, b = -2;
  string s = "hello", t = "heart";
  double x = 0.5, y = 0.502;

  cout << maximum(a,b) << endl;
  cout << maximum(s,t) << endl;
  cout << maximum(x,y) << endl;
}

The compiler literally looks at the calls to the maximum function and determines that we have ints, strings and doubles used in place of DataType, and therefore replaces the templated definition of maximum with three different non-templated functions (called instantiations of the template function). So when your compiler starts really compiling it sees your program like this:

int maximum(int a, int b) {
  if (a < b)
    return b;
  else
    return a;
}

string maximum(string a, string b) {
  if (a < b)
    return b;
  else
    return a;
}

double maximum(double a, double b) {
  if (a < b)
    return b;
  else
    return a;
}

int main() {
  int a = 5, b = -2;
  string s = "hello", t = "heart";
  double x = 0.5, y = 0.502;

  cout << maximum(a,b) << endl;
  cout << maximum(s,t) << endl;
  cout << maximum(x,y) << endl;
}

That's really what's going on, and keeping that in mind can help you make sense of templates.

Problems with templates and the compile-link process

Our usual model of separately compiling files with function definitions and then linking the compiled files into an executable doesn't work well with templates. An example should illustrate why. In the following program we have a module defining a templated "maximum" function. maximum.h has the prototype for the function and maximum.cpp has the implementation. The file main.cpp uses this module to do a little work. Below we show all three files, as well as what happens when we try to compile them.

Maximum module

maximum.h

template <class DataType>
DataType maximum(DataType a, DataType b);

maximum.cpp

template <class DataType>
DataType maximum(DataType a, DataType b) {
  if (a < b)
    return b;
  else
    return a;
}

main.cpp

#include "maximum.h"
#include <iostream>
using namespace std;

int main() {
  int a = 2, b = 1;
  cout << maximum(a,b);
  return 0;
}

Compilation

$ g++ -o go main.cpp maximum.cpp
/tmp/ccqsgehJ.o:main.cpp:function main: error: undefined reference to 'int maximum(int, int)'
collect2: ld returned 1 exit status
$

What went wrong? To help us understand, we need to remember how g++ does its job when attempting to "link" the source .cpp files together. First, it compiles each individual file into a .o file (it does this as a temporary file if you are using the '-o' option with multiple files, or as an actual object file with the same name if using the '-c' option with only one file). Then, if the files successfully compile individually it links those files together into a binary file, which in the case above would be go.

Well, when the templated definition of maximum in maximum.cpp was processed by the compiler, it didn't see any calls using it, so it didn't generate any concrete instantiations of "maximum". In particular, it didn't generate the instantiation of "maximum" with int in place of DataType. Now, when the compiler processed main.cpp, it saw a call to maximum with int in place of DataType, but because the definition of the templated "maximum" function was not a part of that compilation unit, it couldn't actually instantiate and compile the templated definition with int in place of DataType. Thus, both main.cpp and maximum.cpp are left pointing their fingers at each other saying "I thought you were going to take care of instantiating and compiling "maximum" with int in place of DataType!"

So, separate compilation doesn't work well with templates. However, if you place templated function definitions in header files, the errors will go away.

Bottom line: put defintions of template functions in your header files along with the prototypes; don't make a separate .cpp file for them.

Writing templated functions

Let's take another look at writing templated functions. Lots of times when you're writing a program you want to print out the contents of an array. Sometimes this is part of the actual program and sometimes it's just a temporary thing for debugging. None-the-less, it's something that you want to do. Here is a function that nicely prints out arrays of doubles with length n.

void print(double *A, int n) {
  cout << '[' << A[0];
  for(int i = 1; i < n; i++)
    cout << ',' << A[i];
  cout << ']';
}

Hopefully after the pep-talk at the beginning of the notes you're asking yourself: "How can I make this as general, as widely applicable as possible?" Well, obviously we may have arrays with other types - like strings or chars or objects of class Point, etc. Can we make this a templated function so that no matter what type is stored in the array, we can print it out? Yes! First, let's see what changes when we define this for strings rather than doubles:

void print(string *A, int n) {
  cout << '[' << A[0];
  for(int i = 1; i < n; i++)
    cout << ',' << A[i];
  cout << ']';
}

What changed was ... almost nothing. The type of the array simply changed from double to string, that's all. So a templated version of this function would use a template parameter in place of double/string from the previous definitions.

template<class DataType>
void print(DataType *A, int n) {
  cout << '[' << A[0];
  for(int i = 1; i < n; i++)
    cout << ',' << A[i];
  cout << ']';
}

Now this "print" function is much more general. We can use this no matter what type of array we have. Or can we? What assumptions does this program make about the type of object stored in the array? It assumes that operator<< has been overloaded for output of that type. So, we can use this no matter what type of array we have, so long as the elements of the array can be printed out with operator<<.

Of course, we could make this function more general still. Sometimes you'll want to print arrays not to standard output with cout, but to a file. Our function doesn't allow this! We could make the ostream to which output is written an argument to the function. If you were really slick, you could make cout the default value of that ostream parameter, so that if you call "print" without the third parameter it automatically prints to standard out using cout.

template<class DataType>
void print(DataType *A, int n, ostream& OUT = cout) {
  OUT << '[' << A[0];
  for(int i = 1; i < n; i++)
    OUT << ',' << A[i];
  OUT << ']';
}

Now, if we put this nice templated array "print" function in a header file, you need only include the header to effortlessly print arrays!

Writing templated classes

Lists, arrays, queues, stacks, and the like are examples of what are called containers. Obviously they contain objects, but the real point is that what they do doesn't depend on the type of object they store. That makes them perfect candidates for templates. Let's make a container calledTop3, which will store the 3 smallest values added to the container, allowing the user to access them by index 1, 2 and 3. This might be useful for keeping track of gold, silver and bronze medal times in a race, for example, and we'll start off by implementing this for doubles. The Top3 class should have an add function to add a new value to the container, and a get function to get a value by index 0, 1 or 2.

Top3.h Top3.cpp

#include <iostream>
#include <algorithm> // "swap" is defined here!
using namespace std;

class Top3 {
private:
  double a, b, c;
  int n;

  void sort() {
    if (n <= 1) return;
    if (b < a)  swap(a,b);
    if (n == 2) return;
    if (c < b)  swap(b,c);
    if (b < a)  swap(a,b);
  }

public:
  Top3() {
    n = 0;
  }

  void add(double x) {
    if (n == 0)      a = x;
    else if (n == 1) b = x;
    else if (n == 2) c = x;
    else if (x < c)
      c = x;
    n++;
    sort();
  }

  double get(int i) {
    if (i == 0) return a;
    if (i == 1) return b;
    return c;
  }
};

#include "Top3.h"

int main() {
  // Read and store data
  Top3 T;
  double x;
  while(cin >> x)
    T.add(x);

  // Print out 1st, 2nd and 3rd
  cout << "Gold is   " << T.get(0) << endl;
  cout << "Silver is " << T.get(1) << endl;
  cout << "Bronze is " << T.get(2) << endl;
  return 0;
}

Now, if we were to change this implementation so that it worked for char objects, for example, what would we do differently? Well, the type of a, b and c would be char instead of double, as would the argument to add and the return type of get. Same thing for any other type, just change those three occurences of double to the type you want to use and you're good to go. So to generalize, let's make a template parameter for the kind of object stored in Top3, and change those three occurences of double to the template parameter name.

Top3Template.h Top3Template.cpp

#include <iostream>
#include <algorithm> // "swap" is defined here!
using namespace std;

template <class DataType>
class Top3 {
private:
  DataType a, b, c;
  int n;
  void sort() {
    if (n <= 1) return;
    if (b < a)  swap(a,b);
    if (n == 2) return;
    if (c < b)  swap(b,c);
    if (b < a)  swap(a,b);
  }

public:
  Top3() {
    n = 0;
  }

  void add(DataType x) {
    if (n == 0)      a = x;
    else if (n == 1) b = x;
    else if (n == 2) c = x;
    else if (x < c)
      c = x;
    n++;
    sort();
  }

  DataType get(int i) {
    if (i == 0) return a;
    if (i == 1) return b;
    return c;
  }
};

#include "Top3Template.h"

int main() {
  // Read and store data
  Top3<double> T;
  double x;
  while(cin >> x)
    T.add(x);

  // Print out 1st, 2nd and 3rd
  cout << "Gold is   " << T.get(0) << endl;
  cout << "Silver is " << T.get(1) << endl;
  cout << "Bronze is " << T.get(2) << endl;
  return 0;
}

Question: What implicit requirements are there for the type we use for DataType in Top3?