StuBS
C++ Introduction for OS Developers

This is by no means a complete introduction to the C++ programming language, but rather a glimpse into certain topics which are relevant for this lecture's exercises. The examples below try to give you a deeper grasp of C++ internals by taking a look at C++ code examples, terminal output and assembly generated by the compiler.

For a more complete introduction to C++ refer to cplusplus.com.

Object vs. Pointer vs. Reference

Every variable definition reserves a piece of memory and associates a name as well as a data type with it. We can access this memory via its name. The memory's data type, on the other hand, determines how the region of memory is to be interpreted by the compiler. In order to clarify these rather abstract notions, let us take a look at the compiler's (g++ -m32 -fno-PIE test.c -o test) output generated from the following piece of C++ code:

int integer;
double floatingpoint;
int main(void) {
integer += 1;
floatingpoint += 1;
return 0;
}
int main()
Definition test.cc:7

First of all, we use nm to get a list of global objects, their addresses in memory and their sizes. Since we are only interested in the variables defined above, we indicate missing parts of nm's output by using ....

$ nm --numeric-sort -S test
...
0000052d 0000001b T main
...
00002020 00000008 B floatingpoint
00002028 00000004 B integer
...

The B in the program's output above indicates that both variables are part of the BSS segment. They are 4 and 8 bytes large. All variables that are part of the BSS segment are initialized to their zero-value on program start. This is one of the fundamental properties of the BSS segment. We see that the variable integer is a memory region of 4 bytes and starts at address 0x2028. However, we do not have any information regarding its data type. The data type is merely used by the compiler to generate the correct instructions. From this example, we see that C and C++ do not ensure type-safety like other languages. Let us take a look at the instructions generated from our example:

$ objdump -d test
...
000052d <main>:
52d: 83 05 28 20 00 00 01 addl $0x1,0x2028 # Add a constant value to the memory location
534: d9 e8 fld1 # Push 1 on the float-stack
536: dc 05 20 20 00 00 faddl 0x2020 # Add memory location to the float-stack
53c: dd 1d 20 20 00 00 fstpl 0x2020 # Store top-of(-float-stack) in memory location
542: b8 00 00 00 00 mov $0x0,%eax # Copy a constant to a register
547: c3 ret

We can see that the function main starts at address 0x52d. This is not surprising, since we have already seen that address in nm's output above. The first instruction adds 1 to the value stored at address 0x2028. This corresponds to the C++ line integer += 1;. The next three instructions increment the foatingpoint's value by 1. The code is more complex as it uses the processor's floating-point unit. Afterwards, we move the value 0 into eax. This is the function's return value.

Calling objects by value can be slow and may consume a lot of resources. It is a rather inflexible approach. We can achieve better results by introducing a small indirection and forward pointers to objects, instead of the objects themselves to functions.

void increment(int *number) {
*number = *number + 1;
}

This function receives a pointer to a memory location that holds a variable of type integer. Everytime we want to access the actual value stored at that memory location, we have to dereference the pointer (*number) first. Let us take a look at how this function is translated to assembly:

00000548 <_Z9incrementPi>:
548: 8b 44 24 04 mov 0x4(%esp),%eax # Load the first argument into %eax
54c: 83 00 01 addl $0x1,(%eax) # Add 1 to the memory location pointed to by %eax
54f: c3 ret

The first thing you might have noticed is the peculiar function name. This is caused by a process that is called "C++ Name Mangling". The C++ compiler encodes each function's parameter types (void and int * in our case) into the function's name. The function's first instruction loads the pointer number from its location on the stack into the register %eax. The next line uses the register indirect addressing mode ((%eax)) to access the value stored at the memory location that %eax points to. That means if we would call the function like this: increment(&integer), then %eax would contain the value 0x2020.

One large problem when working with pointers is the fact that we do not get any guarantee that the pointer actually points to a valid location in memory. The common way for functions to indicate an error state when they normally would return a pointer is to return the so-called null pointer (i.e. NULL in C or nullptr in modern C++). The C Standard specifies that this special value is actually the value 0, hence its name.

In most environments, dereferencing a null pointer raises an exception. You might have experienced this in previous lectures' homework assignments. While programming your operating system, however, this is not the case, due to the fact that you are not able to configure the processor in such a way that it would treat the address 0 as invalid at first. You might be able to do so during one of our later assignments.

In order to prevent these problems of pointers, C++ allows us to address objects in a different, more secure way, namely by references. A reference is an alias for a certain object, meaning that a reference always points to a valid object. Normally, compilers implement this high level language feature via pointers, but the C++ language specification does not restrict them to this implementation.

void increment_ref(int &number) {
number = number + 1;
}

As you can see, a reference is denoted by the & character. Unlike pointers, we do not have to derefence references, as they are an alias of their respective object, that is, the name number behaves just as if it was the actual object that the reference points to. If we take a closer look at the assembly generated by the compiler, we can see that the compiler chose to implement the reference via a pointer. It produces the exact same output as in our previous pointer-based example:

00000550 <_Z13inkrement_refRi>:
550: 8b 44 24 04 mov 0x4(%esp),%eax
554: 83 00 01 addl $0x1,(%eax)
557: c3 ret

Structs, Classes and Methods

As you might already know, C++ is an object-oriented programming language. First steps into that direction have already been taken in C. It allows us to group multiple variables of varying types into composite data types. The following C++ data structure definition allows us to create and work with objects of type foobar.

struct foobar {
int foo;
double bar;
};
FileOut foo("foo.txt")

Additionally, C++ allows us define methods of a structure. Methods are functions that work on the constituents of the data structure. When accessing variables by their name inside a method, the compiler first checks the surrounding structure for a matching variable, before interpreting the name as a global variable.

struct foobar {
int foo;
double bar;
int calc(int x);
};
int foobar::calc(int x) {
return (foo * 23) + x;
}

In this example we can see that the method calc is declared inside the structure's declaration. Afterwards, it is defined (i.e. filled with life) outside of the structure. It is possible to define the method directly in the structure declaration (similar to classes in Java), however, separating a class's interface and implementation is more common in the C++ ecosystem.

The method calc() accesses the field foo, meaning the function seems to have some sort of implicit parameter which points to a specific object and calc() is executed in this object's context. The function does in fact have such an implicit parameter and we can access it using the name this. The parameter is a pointer to the respective object, so in the case of the method calc() it is of type foobar *. We can also explicitly tell the compiler to access the respective object's field foo, by writing this->foo. Now let us take a look at the assembly generated by the compiler:

0000055a <_ZN6foobar4calcEi>:
55a: 8b 44 24 04 mov 0x4(%esp),%eax # Implicit paramter: this
55e: 6b 00 17 imul $0x17,(%eax),%eax # Multiply by 23
561: 03 44 24 08 add 0x8(%esp),%eax # Explicit paramter: x
565: c3 ret

You might have noticed that the object-oriented abstraction is realized simply by providing the this-pointer as the routine's first argument. All other parameters are moved by the size of the this pointer, meaning that the method's actual argument x == 0x8(%esp). This is equivalent to a regular function with the signature int calc(foobar *this, int x). The data structure's member foo is simply accessed by dereferencing the this pointer (%eax).

However, coupling of code and data is not the only characteristic of object- oriented programming. Another feature that is often associated with object- oriented languages is inheritance. This allows a class to access methods and members of the class it inherits from and allows it to extend that base class's functionality by adding new methods or member variables.

struct foobar_premium : public foobar {
bool is_premium() {
foo -= 1;
return true;
};
};

This example shows a class foobar_premium, which extends our previously defined class foobar. It inherits all of foobar's members and methods, the keyword public ensures that all of foobar's members and methods that are declared public are public in foobar_premium as well. This is similar to Java, with a few important exceptions:

  • Methods are not virtual by default, meaning they cannot be overriden by derived classes (this is equivalent to Java's final keyword).
  • Inheriting from multiple classes is possible.
  • Objects may be allocated on both, the stack and the heap.
  • The visibility of inherited members and methods is changeable, that is, we can use the private keyword to make all inherited members and methods private in our derived class.

You might wonder why we are talking about classes, when the example code uses the keyword struct. There also exists the keyword class in C++, but the meaning of both keywords is actually almost identical. The only difference being that all members of structs have public visibility by default, while the default visibility of classes is private.

Memory Management in Operating Systems

C/C++ programs generally have three areas, where data may be located:

  1. Global variables. These are either in the data or the BSS segment. They are initialized, before a program's main()-function is called. They are filled with zeros, if they have not been initialized with a certain value in the program's source.
  2. Local variables are pushed onto the stack. They are valid only in the context of the function they belong to. With that in mind it makes sense that they are no longer valid once the function returns, since the function's stack frame is removed from the stack.
  3. Objects on the heap. In an environment with a complete libc you may allocate typeless (void *) memory using the library function malloc(3). The library itself utilizes the system calls brk(2) & mmap(2). When programming in C++ we usually do not call malloc(3) directly, but use the language's new-operator to create new objects.

However, we cannot rely on Linux's libc implementation when building an operating system and, thus, do not have dynamic memory management at our disposal. We have to rely on memory types 1 and 2 during this lecture. We will revisit this topic and implement dynamic memory management in BST.

When working with local variables we have to take care when it comes to the size of objects. The stack of our operating system is only 4096 bytes large and one common mistake is to forget about that fact and allocate an object that is too large, leading to a stack overflow:

struct huge_object {
char buffer[5000];
};
void foo() {
huge_object barfoo;
}

This will not cause our operating system to crash, but it will corrupt any data structure that is located directly below the stack in memory. This in turn may lead to nondeterministic behaviour of your operating system, as it fills other global variables with arbitrary data.

Another common mistake would be to return the address of a local variable. As we have discussed earlier, the lifetime of a local variable allocated on the stack ends once its surrounding function returns. That means that once the function returns the returned address is no longer valid, which also may result in memory corruption bugs. Normally, the compiler will find these errors, but there exist cases where it does not, so be careful when working with local variables.

object_t * foo() {
object_t ret;
ret.foo = 80;
return &ret;
}

Bitwise Operations and Bitfields

When implementing an operating system, we often have to manipulate the hardware's state directly by writing to its control registers. The hardware's designers often try to implement their hardware as efficient as possible, which may lead to unrelated control bits being joined into a single byte. One example from our first exercise is the CGA graphics card. Each character on the screen (80x25 characters) is programmed via 2 bytes. The first byte contains the ASCII character that should be displayed, while the second byte contains the character's fore- and background color.

Bit76543210
MeaningBlinkBackground ColorForeground Color

We could assemble these configuration bytes using C or C++'s bitwise operations. These are the bitwise AND (&), OR (|), NOT (~), XOR (^) as well as both shift operations (<<, >>). Using these we could write a wrapper function that does the job of assembling the two bytes for us like in the example below:

char make_attribute(char foreground, char background, char blink) {
foreground &= 0xf; // 0000 1111
background &= 0x7; // 0000 0111
blink &= 1; // 0000 0001
background <<= 4; // 0000 0XXX -> 0XXX 0000
blink <<= 7; // 0000 000X -> X000 0000
return foreground | background | blink; // Bbbb ffff
}

We first use the bitwise AND with a bitmask on all three arguments to clear all bits outside of the variable's valid range. Afterwards, we shift the bits of background and blink to their correct position and combine all three arguments into a single two-byte value using bitwise OR.

There exist common combinations of these bitwise operators for adding, removing or comparing certain bits. They often make use of a bitmask which has all bits set to 1 that one is interested in:

int MASK = 0x11011;
int value;
// Remove all of the mask's bits from value
value &= ~MASK;
// Set all of mask's bits in value
value |= MASK;
// Set value's 13th bit
value |= (1 << 13);
// Check whether value has at least one of mask's bits set
if ((value & MASKE) != 0) {...}
// Check, whether value1 and value2 are the same
// with the respect to the bitmask's bits.
if ((value1 & MASKE) == (value2 & MASKE)) {...}

Manipulating bits like is fine for small applications. If we use these operations on larger projects, however, they tend to make the code impossible to read since it is full with magic constant values. In order to solve this issue, C and C++ offer bit fields which allow us to access single bits in a data structure by a name, just like regular structure members.

struct CGA_Attr {
char foreground : 4;
char background : 3;
char blink : 1;
} __attribute__((packed));

On most compilers, the structure's first member (foreground) will be stored in the least significant bits (i.e. starting at bit 0). Furthermore, the additional attribute packed will prevent the compiler from adding padding bits. Padding bits are additional bits added by the compiler between the structure's actual member variables so that the latter are aligned in memory in a way that the processor may access them faster, yielding better performance. If we use our newly created data structure, the code from the previous example becomes much more legible:

struct CGA_Attr {
char foreground : 4;
char background : 3;
char blink : 1;
CGA_Attr(char fg, char bg, char B)
: foreground(fg), background(bg), blink(B) {}
} __attribute__((packed));

What does volatile mean?

The keyword volatile is not a magic wand that miraculously lets the compiler fix all race conditions in a program. Its meaning becomes clearer, if we understand that the compiler does not read a variable's value from memory each time it is accessed. Accesses to memory are slow when compared to working with the processor's registers. Therefore, the compiler normally generates code that will read the variable's value from memory into a register, perform all relevant modifications to it using the value from the register and only then write the resulting value back to memory. Here is a short example:

mov foo, %eax
add $1, %eax
shl $3, %eax
orl $17, %eax
mov %eax, foo

Here, the variable foo's value is loaded from memory. During the next three instructions its actual value is stored in register eax and modifications done by these operations are not visible in memory. Only after the final mov instructions is the new value visible in memory.

The keyword volatile forbids the compiler to do these optimizations. If a variable is declared volatile the compiler has to read the value from memory, perform one operation and write the updated value back to memory.

Operator Overloading

One of C++'s most controversial features, apart from C++-Templates, is operator overloading. This feature allows you to define the semantics of standard operators, like a + b for example, when used with objects of your classes. In order to modify the behaviour of comparing two objects using the == operator for example, you simply have to define a method in your class named operator==:

bool operator==(const &foo a, const &foo b) {
return a.id == b.id;
}

Since there exist other operator implementations, C++'s regular rules for function overloading will decide which implementation is going to be used for a certain operation. We may also overload an operator by only specifying one instead of two arguments:

class foo {
int id;
public:
bool operator==(const &foo other) {
return this->id == other.id;
}
};

In that case the operator's left hand side will be the implicit this argument. Since the operator is defined as a method of class foo, it may access all of its private fields, which would not be allowed in the previous implementation.

When it comes to I/O, the C++ standard library uses this feature rather curiously. It overloads the shift operator << to send numbers and strings to an output stream:

std::cout << 12 << "Hello, World" << &main << std::endl;

Every operator<<-call returns a reference to the output stream.

Stream & operator<<(Stream& out, int x) {...; return out;}
Stream & operator<<(Stream& out, std::string x) {...; return out;}
Stream & operator<<(Stream& out, void *x) {...; return out;}

How the Virtual Keyword Works

In Java, all of a class's methods are virtual which means that every call to a method is dispatched dynamically. Another way of dispatching method calls is static dispatch. In this chapter, we take a closer look at virtual and static dispatches in C++.

Object *obj = AbstractObjectFactory.make_object();
obj->method();

The code above first creates an object which is assignable to a variable of type Object *. So it is either an object of class Object or its class inherits from the class Object. Whether obj points to an Object or to DerivedObject depends on its dynamic type. The variable that holds a reference to the object determines obj's static type. This means that while *obj will have the static type Object, we cannot directly discern its dynamic type from the code above.

The difference between dynamic and static dispatch is, whether the static or dynamic type of an object is used by the compiler to determine the method being called. The compiler will generate a call instruction, if the method ->method() is dispatched statically:

mov %eax,(%esp)
call 8048926 <_ZN6Object6methodEv_>

First, the pointer to the object is pushed onto stack and then the method is called. Notice that the method name has been subject to C++'s name mangling.

If the program should use a dynamic dispatch instead, it first has to figure out the referenced object's type. This also means that the object's type information has to be stored somewhere. If there is not a single virtual method defined in a class, classes and structs are mapped directly to memory. Here is an example of how our class Object might be defined:

class Object {
int a;
int b;
int c;
public:
void method();
};

Here, method() will be dispatched statically and there is no need to know type information during runtime of the program. Therefore, C++ will use exactly 12 bytes (3 * 4 bytes for each integer) for an object of this class. The class's attributes (a, b and c) are stored in memory in the order that they appear in the class definition. Therefore, the following holds:

Object * obj = ...;
char *memory = (char*) obj;
obj->a == *(int*)(memory + 0);
obj->b == *(int*)(memory + 4);
obj->c == *(int*)(memory + 8);

If we understand what the compiler does we could even cast a fitting piece of memory to an Object pointer. We assume that the code is compiled for a little-endian machine in the following example

char memory[] = {12, 0, 0, 0, // a == 12
23, 0, 0, 0, // b == 23
45, 0, 0, 0, // c == 45
};
Object *obj = (Object *)memory;

In order to retrieve the dynamic type information, the compiler has to store this information at the object's memory location. Most compilers will add a pointer to the class's Virtual Function Table at the beginning of the object's memory. If two objects have the same dynamic type, then their respective pointers will point to the same table.

class Object {
int a;
int b;
int c;
public:
virtual void method();
};
int main() {
Object *obj = new Object();
// Breakpoint
obj->method();
}

Let us take a look at this example using GDB at the specified breakpoint.

Breakpoint 1, main () at test.c:22
22 obj->method();
# Check how obj is layed out in memory:
(gdb) p obj
$1 = (Object *) 0x80f1ce0
(gdb) p *obj
$2 = {_vptr.Object = 0x80b7f2c <vtable for Object+8>, a = 0, b = 0, c = 0}
(gdb) x /4x obj
0x80f1ce0: 0x080b7f2c 0x00000000 0x00000000 0x00000000
# This is its VTable:
(gdb) info vtbl obj
vtable for 'Object' @ 0x80b7f2c (subobject @ 0x80f1ce0):
[0]: 0x8048926 <Object::method()>
(gdb) x/a 0x80b7f2c
0x80b7f2c <_ZTV6Object+8>: 0x8048926 <Object::method()>

The compiler did in fact insert a pointer to Object's Vtable (<vtable for Object+8) at the beginning of its memory. The VTable itself contains a pointer to Object::method(). The pointer to the Vtable directly depends on the object's dynamic type and, therefore, this function table may be used to implement the dynamic dispatch. Here is the compiler generated assembly for the dynamic method call (%eax hold the Object *):

mov (%eax),%edx # Move VTable pointer to %edx
push %eax # Push this pointer onto the Stack
call *(%edx) # Dereference the function pointer at index 0
# and call it indirectly.

Every virtual method is assigned a slot in the virtual function table. We can compile our files using g++ -fdump-class-hierarchy in order to see what information the compiler adds to these tables. Here is the compiler's output:

Vtable for Object
Object::_ZTV6Object: 3 entries
0 (int (*)(...))0
4 (int (*)(...))(& _ZTI6Object)
8 (int (*)(...))Object::method
Class Object
size=16 align=4
base size=16 base align=4
Object (0x0x7f6b48574060) 0
vptr=((& Object::_ZTV6Object) + 8)

As you may have realized already, the vtable actually starts 8 bytes before the the address pointed to by the virtual pointer stored in our objects. The additional space is used for a pointer to the class's runtime type information (RTTI). Using our newly aquired knowledge, we might be able to create a "fake" VTable:

void my_method(Object *obj) {
printf("X: %d %d %d\n", obj->a, obj->b, obj->c);
}
int main() {
int vtable[] = {
0, 0, // No RTTI
(int) &my_method
};
int memory[] = {
(int) vtable + 8,
1, 2, 3
};
Object *obj = (Object *)memory;
}

Compiling the code as a 32-bit binary (using the compiler flag -m32), we get the expected output: X: 1 2 3. Finally, here is a quick summary of what we have learnt so far:

  • Static dispatches come at no additional runtime costs.
  • Dynamic dispatches are an indirect call via a pointer in the object's virtual function table.
  • C++ will only pay the additional runtime costs of dynamic dispatches, if we explicitly declare method virtual.

Links:

Inheritance in C++

In our previous chapters we have already considered how compund data types (e.g. struct, union and class) are mapped into memory. We also saw that members of simple data structures or "Plain Old Data Structures" (PODs) as they are sometimes called, are simply aranged into memory in their order of appearance (from top to bottom) in ascending order, i.e. from low to high addresses. Furthermore, we have learnt that pointers to a class's virtual methods are stored in the class's virtual function table and that each object contains a pointer to this table (vptr). What we did not cover so far is inheritance. Therefore, let us compile the example below using clang -Xclang -fdump-record-layouts:

class Base {
int a;
int b;
};
class Derived : public Base {
int cc;
};
int main() {
Base x;
Derived yy;
}

From the programme's output, we can see that the class Base is mapped into memory just the way we would expect it. The left side in the text below shows the offsets from the base address of a pointer to an object of the class:

0 | class Base
0 | int a
4 | int b
| [sizeof=8, dsize=8, align=4,
| nvsize=8, nvalign=4]

Both of its integer members (a and b) are simply mapped into memory one after the other. If we take a look at the class Derived, on the other hand, we notice that Base's members have been prepended to the attribute cc defined in the derived class itself. Therefore, cc lies at offset 8:

0 | class Derived
0 | class Base (base)
0 | int a
4 | int b
8 | int cc
| [sizeof=12, dsize=12, align=4,
| nvsize=12, nvalign=4]

Storing objects in that way has a huge advantage. Since all member variables inherited by Base are at the exact same offset as they would appear in an object of the class Base, we can treat any object of type Derived just like object of type Base. Additionally, if we add a virtual method to Base, each Derived object will be able to use the vptr to point to its own virtual function table.

0 | class Derived
0 | class Base (primary base)
0 | (Base vtable pointer)
8 | int a
12 | int b
16 | int cc
| [sizeof=24, dsize=20, align=8,
| nvsize=20, nvalign=8]

As long as we inherit from a single base class only, this is all you need to know when it comes to the memory layout of classes. If we inherit from multiple classes, however, the situation is a bit more complicated. Here is an example without the use of virtual methods:

class Base {
int a;
int b;
};
class Base2 {
int c;
int d;
};
class Derived : public Base, public Base2 {
int xx;
};

This is the memory layout of the the derived class, when it inherits from Base and Base2:

0 | class Derived
0 | class Base (base)
0 | int a
4 | int b
8 | class Base2 (base)
8 | int c
12 | int d
16 | int xx
| [sizeof=20, dsize=20, align=4, nvsize=20, nvalign=4]

You might have noticed that the memory layout of its base classes is mapped one-to-one into the derived class's memory. However, we cannot use a pointer to an object of type Derived as a pointer of type Base directly. We have to adjust the address to which the this-pointer points to fix this. This is exactly what the compiler does when calling methods of the base class. Let us take a look at the function Base2 *foo(Derived *x) {return x;} to illustrate this:

00000000 <_Z3fooP7Derived>:
mov 0x4(%esp),%ecx # Load first argument into %ecx
lea 0x8(%ecx),%eax # %eax = %ecx + 8 (Base2's offset)
test %ecx,%ecx # Test, whether %ecx == 0
cmove %ecx,%eax # If yes, then set %eax to 0
ret # Return %eax

To convert the Derived * x to a Base2 *, the compiler increses the pointer to x by 8, which is exactly the offset to the Base2-part of Derived. The instructions test; cmove treat the special case that x is a null pointer. In that case the value returned should still be a null pointer and not a pointer to address 8.

These implicit offset changes are the reason why there exist multiple cast operations in C++. If classes inherit from multiple base classes and use virtual methods, the situation becomes even more complex. We will not cover this here, but if you are interested you can take a look at the links provided below.