Your program crashes at some point, triggers an unexpected interrupt, reboots the computer, or produces other unwanted results. Possibly the error disappears if you insert seemingly trivial statements somewhere, create one application process more or fewer, etc.
You create local variables that are too large in main()
or in one
of your application processes.
The main program main()
of StuBS is assigned a stack of
4 KB during system initialization (see startup code in
boot/startup.asm
and reserved memory in
machine/core.cc
). This is sufficient to create small local
variables, but not enough for an application process (including its
own stack).
Hence the following piece of code is wrong:
int main() {
    Application appl;
    // ...
}
Here an Application
object (which inherits from
Thread
and contains a 4 KB char array member) is created on the
initial stack, even though the object alone is already larger than the
available 4 KB. However, since StuBS neither checks stack boundaries nor
implements any other protection mechanisms, the local object overwrites
memory that
main()
does not have available at all. As a result,
either the stacks of other cores or global variables such as the
interrupt vector table are overwritten. This can go unnoticed, e.g. if
only interrupt vectors are destroyed that are never needed, but it can
also lead to crashes or other errors. This happens especially if you
either overwrite your own code or the error causes random values to
be interpreted as addresses of functions.
Always create large variables/objects like the
Application
globally, because the compiler then
ensures that memory space is reserved for them. If you want, you
can use the keyword static
to indicate that the
corresponding variable should only be referenced in the file in which it
was declared.
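For illustration, a minimal sketch of the recommended placement (the surrounding setup code is an assumption, only the placement matters):
// Placed in global (static) memory: the compiler/linker reserves the space,
// so the 4 KB boot stack of main() is not touched.
static Application appl;    // 'static' also limits visibility to this file

int main() {
    // ... work with appl here, e.g. hand it over to the scheduler ...
}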
Don’t make any assumptions about the order in which global constructors are executed.
For example, a constructor that uses a (module-)global
IOPort
object might fail:
static IOPort foo(0x23);

Bar::Bar() {
    foo.outb(0x42);
}

Bar bar;
There is no guarantee that foo
is initialized before
bar
.
Try to avoid such dependencies by not accessing other statically allocated objects in constructors.
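One way to avoid such dependencies (a sketch; the init() method is a hypothetical addition, not part of the code above) is to keep constructors trivial and perform the hardware access in an explicit method that main() calls after all global constructors have run:
static IOPort foo(0x23);

Bar::Bar() {
    // intentionally empty -- must not touch other global objects
}

void Bar::init() {      // hypothetical helper, called explicitly from main()
    foo.outb(0x42);     // safe: all global constructors have already run
}

Bar bar;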
In MPStuBS you have to explicitly enable interrupts on each core
using Core::Interrupt::enable();
(in main()
and main_ap()
).
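A sketch of where the calls belong (the exact signatures and the surrounding initialization are assumptions):
int main() {        // bootstrap processor
    // ... global initialization, device and (IO)APIC setup ...
    Core::Interrupt::enable();  // enables interrupts on this core only
    // ...
}

int main_ap() {     // every application processor
    // ... per-core initialization ...
    Core::Interrupt::enable();  // must be repeated on each core
    // ...
}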
Make sure that you confirm the processing of an interrupt with an
End Of Interrupt (EOI) using
LAPIC::endOfInterrupt()
in the
interrupt_handler
– otherwise, the IOAPIC will not send
further interrupts from this device!
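A sketch of the pattern (the handler signature is an assumption; LAPIC::endOfInterrupt() is the call named above):
extern "C" void interrupt_handler(/* ... */) {
    // ... service the device, e.g. read the keyboard data port ...
    LAPIC::endOfInterrupt();    // without the EOI, no further interrupts
                                // from this device will be delivered
}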
Make sure the destination
mask in the IOAPIC
Redirection Table only references cores that are actually available.
The Intel manual states:
For both configurations of logical destination mode, when combined with lowest priority delivery mode, software is responsible for ensuring that all of the local APICs included in or addressed by the IPI or I/O subsystem interrupt are present and enabled to receive the interrupt.
More recent processors assume that the value written to
destination
is correct and will therefore try to redirect the
interrupt to a (non-existent) LAPIC if misconfigured – and wait
(forever) for a response.
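For example, a mask covering only the cores that were actually booted could be built like this (a sketch; the variable names are assumptions, and it presumes that each core's logical APIC ID was set to 1 << core_id):
uint8_t destination = 0;
for (unsigned cpu = 0; cpu < online_cpus; cpu++) {  // online_cpus: number of booted cores
    destination |= 1u << cpu;   // one bit per present LAPIC (flat logical mode)
}
// write 'destination' into the upper byte (bits 56-63) of the redirection table entry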
Since Linux kernel v4.6, the KVM kernel module uses by default a
technique called Vector Hashing for the I/O APIC
mode we use (Logical Destination Mode with Lowest Priority
Delivery Mode). With this technique, the target core is
calculated by a hash function (interrupt vector number modulo number of
virtual cores). In our case, the result is
33 % 4 == 1
, so the keyboard interrupt is always sent
to CPU 1. For virtualizing real operating systems this may be
worthwhile, since it allows better use of caching effects – but for our
exercise it is rather obstructive.
This behavior can be disabled via module parameters, so that interrupts are again distributed to all virtual CPUs. To change this for the currently running kernel, the kernel module must be unloaded and reloaded with the correct parameter, which can be done with the following commands (using Ubuntu 18.04 and an Intel CPU in the host as an example):
sudo modprobe -r kvm_intel kvm
sudo modprobe kvm vector_hashing=N
sudo modprobe kvm_intel
After a restart, however, the changed settings are lost again. To set
these parameters automatically on boot, an entry in the file
/etc/modprobe.d/kvm_options.conf
is necessary. The easiest
way to do this is to execute the following command:
echo "options kvm vector_hashing=N" | sudo tee /etc/modprobe.d/kvm_options.conf
In our CIP, KVM is already configured with the correct setting, so
that the interrupts should be distributed to all cores. You can check
the current setting by reading the file
/sys/module/kvm/parameters/vector_hashing
, e.g. using
cat /sys/module/kvm/parameters/vector_hashing
If it says N
, vector hashing is deactivated.
According to the standard, you should actually wait for the
ACK
from the keyboard. But this is error-prone – you have the
following options:
Wait until a character is available and then loop until it is an
ACK
.
do {
    while (!(ctrl_port.inb() & HAS_OUTPUT)) {}
} while (data_port.inb() != ACK);
This can cause the driver to go into an endless loop.
Only check whether the next character received is an
ACK:
while (!(ctrl_port.inb() & HAS_OUTPUT)) {}
unsigned char ack = data_port.inb();
if (ack != ACK) { // wait for ack
    DBG << "keyboard: " << hex << (int) ack << dec << " instead of ACK" << endl;
}
That’s better – it cannot deadlock; however, a regular keystroke could accidentally be read (and lost) instead of the ACK.
We simply do nothing and leave the response where it is. In
fetch()
we read and ignore the ACK
bytes.
Since we only send trivial commands (such as setting the active LEDs) to the
keyboard, and since we don’t have any reasonable way of handling a
keyboard that won’t ACK
our requests, we take the pragmatic
solution 3 – the other two are more complicated and (in the worst case)
might cause problems in the following exercises.
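A sketch of solution 3 (ctrl_port, data_port, HAS_OUTPUT and ACK as in the snippets above; the surrounding fetch() logic is only hinted at):
while (ctrl_port.inb() & HAS_OUTPUT) {
    unsigned char code = data_port.inb();
    if (code == ACK) {
        continue;   // reply to one of our commands -- silently ignore it
    }
    // ... otherwise decode 'code' as a regular scancode ...
}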
Some documentation states that the keyboard should be edge-triggered. Moreover, executing
cat /proc/interrupts
in Linux will probably show something like
IR-IO-APIC 1-edge i8042
So Linux configures the keyboard in edge-triggered mode. But the StuBS documentation insists on level-triggered mode! Which one is correct?
In fact, both trigger modes will work: if a key is in the buffer, the level is pulled up (polarity high / active high), and this level change also produces an edge. So both configurations will trigger an interrupt.
If we want to be fully compliant with the standards, we have to
examine the ACPI MADT (field flags_trigger_mode
in
struct Interrupt_Source_Override
). However, this is
unnecessarily fiddly, which is why we ignore it – our systems will work
with both trigger modes.
So we have to rephrase the question: what is better?
With edge-triggered interrupts, a keystroke will only cause a single interrupt request. If, for any reason, this interrupt is not handled correctly (or the keyboard buffer was not drained during configuration), we won’t receive any further interrupts from the keyboard.
With level-triggered interrupts, however, we would periodically receive interrupts from the keyboard device, until all characters are fetched from its buffer.
Hence, the level-triggered mode is the more “forgiving” variant, so we recommend it. But if you coded everything correctly, both will do.
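For reference, trigger mode and polarity live in the lower 32 bits of the corresponding redirection table entry; a sketch of that layout (the struct name is made up, the bit positions follow the 82093AA I/O APIC datasheet):
#include <stdint.h>

struct RedirectionEntryLow {
    uint32_t vector           : 8;   // interrupt vector, e.g. 33 for the keyboard
    uint32_t delivery_mode    : 3;   // e.g. lowest priority
    uint32_t destination_mode : 1;   // 0 = physical, 1 = logical
    uint32_t delivery_status  : 1;   // read only
    uint32_t polarity         : 1;   // 0 = active high, 1 = active low
    uint32_t remote_irr       : 1;   // read only (level-triggered only)
    uint32_t trigger_mode     : 1;   // 0 = edge, 1 = level
    uint32_t mask             : 1;   // 1 = interrupt masked
    uint32_t reserved         : 15;
};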