Also available in pdf, slides. beamer. latex.
Hello! I’m your lab-instructor for this course.
In this series, you will join forces with me, and together, we will build a kernel from the scratch. We both will be working on this kernel.
I’ll do some coding in a branch, and ask you to implement some functionality. You can get my code by merging the branch with yours, and implement the functionality I asked. Once you implement it and commit the changes in your repository, I’ll again work on the kernel on some other branch..
So far, I have managed to write: See osdev barebones
set the stack pointer,
movl $tmpstack_bottom, %esp
clear flags,
pushl $0
popf
call the C function
call core_boot
enter infinite loop
cli
loop:
hlt
jmp loop
x86/main.cc : a C function which does nothing
extern "C" void core_boot(){
}
Syntax:
bash$ make <target> B=<release/debug>
where target =
iso : create boot cd
exe : build kernel (default)
qemu : run qemu
qemu-gdb : qemu with gdb
Try ‘make qemu-direct’ and ‘make qemu-gdb-direct B=debug’ if you face any issues.
I’ve enabled qemu’s instruction tracing. So after executing ‘make qemu’, a trace file created named qemu.log in the current working directory.
When looking at the tracefile(qemu.log), please skip the initial bios instructions
----------------
IN:
0xfffffff0: ljmp $0xf000,$0xe05b
and also skip the bootloader code,
----------------
IN:
0x00007c00: call 0x7c03
Towards the end you can see our kernel’s instruction trace. For example:
----------------
IN:
0x00100050: mov $0x104080,%esp
0x00100055: push $0x0
0x00100057: popf
0x00100058: call 0x1040a0
----------------
IN: core_boot
0x001040a0: repz ret
0x0010005d: cli
0x0010005e: hlt
Optional: Multiboot specification specifies the interface between boot loader(eg: grub) and the kernel. You can also boot our kernel from your laptop, by using any multiboot combatible boot loader.
For example: On grub2, I press ‘c’ to enter command prompt, and type:
(grub2) multiboot (hd0,msdos5)/home/alice/hohlabs/_tmp/hoh.exe
(grub2) boot
So here’s what you should do:
Please ensure you have latest version of:
In debian/ubuntu, do:
bash$ sudo apt-get install qemu qemu-system g++-multilib git-all grub2 grub-pc-bin libboost-all-dev xorriso
Since we both will work on this kernel, we need to have a version control system. We’ll use git as our version control system. Please clone the repository to your local directory
user@host:~$ git clone ssh://<user>@palasi.cse.iitd.ac.in/misc/research/teaching/sbansal/os/hohlabs.git
user@host:~$ cd hohlabs
user@host:~/hohlabs$
For each parts, do
Please get the changes done by lab-instructor by merging the corresponding branch to your master branch
user@host:~/hohlabs$ git pull
user@host:~/hohlabs$ git merge origin/<branch_name>
For example, to get first part, do:
user@host:~/hohlabs$ git pull
user@host:~/hohlabs$ git merge origin/vgatext
Modify the files under the directory “labs” only to add the missing functionality. For example, for the first part, you should modify the function writechar in labs/vgatext.h
user@host:~/hohlabs$ git pull
user@host:~/hohlabs$ vim labs/vgatext.h
Test your code by:
user@host:~/hohlabs$ make qemu
(Optional) To debug:
From first terminal:
user@host:~/hohlabs$ make qemu B=debug
From another terminal:
user@host:~/hohlabs$ gdb
In gdb, you can set break point for example ’_start’
(gdb) break _start
(gdb) ni
(gdb) continue
Commit your changes in your local repository
user@host:~/hohlabs$ git add -p labs/
user@host:~/hohlabs$ git commit -m "your log message"
#Advanced: git add labs/ ; git commit -m "commit message" ; git stash ; .... now do pull/merge .... ; git stash pop;
Do submit your code so far. (resubmissions are allowed)
make sure your changes are available in palasi. Skip this step, if you’re working in GCL.
user@host:~/os$ scp -r hohlabs user@palasi.cse.iitd.ac.in:os
user@host:~/os$ ssh user@palasi.cse.iitd.ac.in
user@palasi:~$ cd os/hohlabs
user@palasi:~/os/hohlabs$ os-submit-lab <labid>
If you are working in GCL, just submit your changes in palasi using:
user@palasi:~/hohlabs$ os-submit-lab <labid>
os-get-submission
In this first part, we’ll look into basic primitives required for writing an OS.
Marks for each part is computed by following equation:
Marks = (Wd * D + Wv * V)
For 1.14: Wd = 2.00 and Wv = 0.50
During viva: If you’re not able to explain the code that you wrote yourself(what is the code doing) we will report you as a major copy case and demo won’t be taken for any of the parts.
I’ve added few more code in origin/vgatext branch. Please merge it with your master branch
user@host:~/hohlabs$ git pull
user@host:~/hohlabs$ git merge origin/vgatext
In this part, we’ll program a memory mapped device while enhancing our kernel by adding the functionality to display “Hello, world!”.
In VGA text mode, 16 bit (2 bytes) of information is stored for each screen character and is stored in row-major order. First byte(MSB) is the ASCII code of the screen character and the next byte(LSB) encodes background(4 bit: msb) and foreground color(4 bit: lsb). Color: 0x0 corresponds to black pallete, 0x7 corresponds to white pallete, 0x1 corresponds to blue pallete.
I’ve added few lines of C code in x86/main.cc:
for(i=0;i<sizeof mesg;i++){
vgatext::writechar(i, mesg[i], bg_color, fg_color, vgatext_base_address);
}
You need to define the following functions in labs/vgatext.h
void writechar(int loc, uint8_t c, uint8_t bg, uint8_t fg, addr_t base);
Arguments of vgatext::writechar:
To help you with mmio, I also added util/io.h which has following functions:
mmio::read8(base,byte_offset)
mmio::write8(base,byte_offset,8 bit value)
mmio::read16(base,byte_offset)
mmio::write16(base,byte_offset,16 bit value)
mmio::read32(base,byte_offset)
mmio::write32(base,byte_offset,32 bit value)
You’re required to implement vgatext::writechar() in labs/vgatext.h
The kernel shall print ‘Hello, world!’ in the top left corner of the screen.
Endianness is a property of CPU - it’s about what should be the memory contents “when a CPU executes Write instruction to memory” or what is the value of register if we execute read instruction from memory.
When we say: MSB: char(8 bits) and LSB: bgfg (8 bits) - it’s independent of endianness.
It means: first byte should be char. and next byte is bgfg.
It specifies what should be the memory contents after you execute the CPU instruction. And depending on the target CPU’s (in which your OS is written for) endianness, you need to figure out what value you should write.
Be prepared to answer following viva questions:
Now it’s my turn. I’ve added few more code in origin/serial branch. Please merge it with your master branch
user@host:~/hohlabs$ git pull
user@host:~/hohlabs$ git merge origin/serial
In this part, we’ll program an I/O mapped device while enhancing our kernel by adding debugging routines which will print debug messages to serial port.
Serial port aka pc16550d uart(universal asynchronous receiver transmitter). In pc16550d uart,
Registers:
The line status register has several fields (in lsb order):
name="dr", size="1 bit", description="Data ready"
name="oe", size="1 bit", description="Overrun error"
name="pe", size="1 bit", description="Parity error"
name="fe", size="1 bit", description="Framing error"
name="bi", size="1 bit", description="Break interrupt"
name="thre", size="1 bit", description="Transmitter holding register"
name="temt", size="1 bit", description="Transmitter empty"
name="erfifo", size="1 bit", description="Error in RCVR FIFO"
Before one writes a character(data) to transmitter holding register, one need to ensure that “thre” bit ([5:5] from lsb: fifth bit indexed from zero) in the line status register is set.
I’ve added hoh_debug macro in util/debug.h, which will convert the arguments into string and call serial::print for each character in the string. Usage:
In x86/main.cc: I’ve added the following line.
hoh_debug("Hello, serial!");
hoh_debug macro will expand to a call to serial::print()
I also added serial::print function in util/debug.cc:
void serial::print(char c){
wait until serial::is_transmitter_ready(serial_portbase) is true
call serial::writechar(c,serial_portbase)
}
So, once you implement the required two functions, you’ll be able to see “Hello, serial!” in your terminal.
You need to define the following functions in labs/serial.h
bool is_transmitter_ready(io_t baseport);
void writechar(uint8_t c, io_t baseport);
To help you with I/O(in and out asm), I had added following functions in util/io.h:
io::write8(baseport, offset, 8 bit value)
io::write16(baseport, offset, 16 bit value)
io::read8(baseport,offset)
io::read16(baseport,offset)
You’re required to implement serial::is_transmitter_ready() and serial::writechar() in labs/serial.h
The kernel shall print ‘Hello, serial!’ in your terminal.
Be prepared to answer following viva questions:
I’ve added few more code in origin/keyboard branch. Please merge it with your master branch
user@host:~/hohlabs$ git pull
user@host:~/hohlabs$ git merge origin/keyboard
In this part, we’ll look at one way of abstracting out details of mmio::read8 vs io::read8 while enhance our kernel by adding a simple keyboard driver.
In Keyboard(8042, name=lpc_kbd), there are two main registers
status register: size=“8 bits” The status register has several fields
name="perr", size="1 bit", description="Parity error"
name="timeout", size="1 bit", description="General timeout"
name="aobf", size="1 bit", description="Auxiliary device output buffer full"
name="is", size="1 bit", description="Inhibit switch"
name="cd", size="1 bit", description="Command/data"
name="sf", size="1 bit", description="System flag"
name="ibf", size="1 bit", description="Input buffer full"
name="obf", size="1 bit", description="Output buffer full"
input register: size=“8 bits”
Before reading “input” register value, we need to make sure that the input buffer(of size 1) has data. Data availability in input buffer is indicated by the “Output Buffer full” bit in “status” register(Keyboard’s output buffer to CPU). So, we need to make sure that “Output Buffer full” bit is set in the “status” register.
To read value of register, use:
regiser_value = <devicename>_<registername>_rd(address of device info structure);
To extract value of a field from register value, use:
field_value = <devicename>_<registername>_<fieldname>_extract(register_value);
For example, generated/lpc_kbd.h contains following functions:
core_loop_step():
if(!has_key(dev)){
return;
}
input=get_key(dev);
hoh_debug("Got key: "<<input);
core_loop():
repeat core_loop_step
You need to define the following functions in labs/keyboard.h
bool has_key(lpc_kbd_t& dev);
uint8_t get_key(lpc_kbd_t& dev);
Following functions are defined in generated/lpc_kbd.h(generated from spec/lpc_kbd.spec using modified mackerel):
lpc_kbd_status_rd() : return the value of "status" register of "lpc_kbd" device
lpc_kbd_status_obf_extract() : extract "obf" field from "status" register of "lpc_kbd" device
lpc_kbd_input_rd() : return the value of "input" register of "lpc_kbd" device
Trivial.
You’re required to implement the required functions in labs/keyboard.h
Kernel shall print scancode of each key pressed in your terminal(hoh_debug).
Be prepared to answer following viva questions:
Device interface functions in generated/lpc_kbd.h are generated by a modified version of mackerel.
I’ve added few more code in origin/shell branch. Please merge it with your master branch
user@host:~/hohlabs$ git pull
user@host:~/hohlabs$ git merge origin/shell
In this part, we’ll look at one design approach while implementing a toy shell supporting builtin functions only.
Reuses previous parts of this series to create a shell.
core_loop_step():
if user has pressed key, get the key and do:
shell_update(ro: key, rw: shell_state);
// execute shell for one time slot to do some computation, if required.
shell_step(rw: shell_state);
// shellstate -> renderstate: compute render state from shell state
shell_render(ro: shell_state, wo: render_state);
if not render_eq(last renderstate and new renderstate):
render(ro: render_state, wo: vga text buffer);
You need to define the following structures in labs/shell.h
// state for shell
struct shellstate_t{
};
// state required to render( for ex: intermediate results shouldnt be in render)
struct renderstate_t{
};
You also need to define the following functions in labs/shell.cc
void shell_init(shellstate_t& state);
// input: handle keyboard event
void shell_update(uint8_t scankey, shellstate_t& stateinout);
// computation: do one step of computation, if required
void shell_step(shellstate_t& stateinout);
// copy necessary information required to render the UI to renderstate
void shell_render(const shellstate_t& shell, renderstate_t& render);
// output: how to render
bool render_eq(const renderstate_t& a, const renderstate_t& b);
void render(const renderstate_t& state, int w, int h, addr_t display_base);
NA.
There’re several helper functions given in the labs/shell.cc. When you execute, you’ll be seeing a simple menu based interface. You may or may not use those functions. Please feel free to create your own interface.
See the comments inside labs/shell.cc
You’re required to define the structures in labs/shell.h and implement the required functions in shell.cc
A simple shell with several builtin commands including a “long computation task” and a status bar showing the “number of key presses” so far.
Have you noticed that:
ie. System latency to keyboard events is high - we’ll improve this in next part.
Be prepared to answer following viva questions:
I’ve added few more code in origin/coroutine branch. Please merge it with your master branch
user@host:~/hohlabs$ git pull
user@host:~/hohlabs$ git merge origin/coroutine
In this part, we’ll learn about “asymmetric-stackless coroutines” while enhancing our kernel to make it responsive to key presses while long computation task is running.
Coroutines are a generalization of coroutines which allows explicit suspend and resume operations(yield and call). Coroutines can be used for nonpremptive multitasking(fibers), event loop, and light weight pipes(producer consumer problem).
Definition of coroutine from Coroutines: A Programming Methodology, a Language Design and an Implementation(1980):
For the purposes of this thesis, the following will be regarded as
the fundamental characteristics of a coroutine:
(1) the values of data local to a coroutine persist between
successive occasions on which controls enters it (that is, between
successive calls), and
(2) the execution of a coroutine is suspended as control leaves it,
only to carry on where it left off when control re-enters the
coroutine at some later stage.
Classification of coroutines from Revisiting coroutines(2009):
There is a proposal to support coroutines in C++. (Several languages like: C#, Perl, Python, Haskell, Erlang, Scheme, Factor supports coroutines.)
See Simon Thatham’s coroutine implementation or boost coroutine’s Introduction & Motivation or Protothreads for more details.
Slides: Coroutines and Fibers
Since we don’t have language support yet, Let’s first build a coroutine library first.
You’ll help me in implementing the long computation as a coroutine.
You need to define the following structures in labs/coroutine.h
// state for your coroutine implementation:
struct f_t{
};
You also need to define the following functions in labs/coroutine.cc
shell_step_coroutine(shellstate_t&, coroutine_t&, f_t&);
You also need to enhance your shell implementation in labs/shell.h
update shellstate_t and renderstate_t structure: i
for handling coroutine state, and
new menu item for long computation task in coroutine form
You also need to enhance your shell implementation in labs/shell.cc
new menu item for long computation task
core_loop_step():
if user has pressed key, get the key and do:
shell_update(ro: key, rw: shell_state);
// execute shell for one time slot to do some computation, if required.
shell_step(rw: shell_state);
// execute shell for one time slot to do some computation based on coroutine, if required.
shell_step_coroutine(rw: shell_state, f_coro, f_locals);
// shellstate -> renderstate: compute render state from shell state
shell_render(ro: shell_state, wo: render_state);
if not render_eq(last renderstate and new renderstate):
render(ro: render_state, wo: vga text buffer);
Following functions are defined in util/coroutine.h:
coroutine_t : internal data structure to save the state of coroutine (where to continue)
coroutine_reset() : initialize/reset coroutine_t
h_begin() : begin coroutine ( jump to saved state )
h_yield() : yield ( save the state, and return)
h_end() : end ( infinitely call yield )
//
// state of function f to be preserved across multiple calls.
//
struct f_t{
int i;
int j;
};
//
// first time you call f(), it'll
// execute h_yield with value 1. (i=1 and j=1 at this point)
//
// next time you resume/call it, it'll continue execution from this point,
// and calls h_yield with value 2 (i=1 and j=2 at this point)
//
// In short, each time you resume/call f(), it'll return
//
// 1*1, 1*2, 1*3
// 2*1, 2*2, 2*3
// 3*1, 3*2, 3*3
//
//
void f(coroutine_t* pf_coro,f_t* pf_locals,int* pret,bool* pdone){
coroutine_t& f_coro = *pf_coro; // boilerplate: to ease the transition from existing code
int& ret = *pret;
bool& done = *pdone;
int& i = pf_locals->i;
int& j = pf_locals->j;
h_begin(f_coro);
for(i=1;i<=3;i++){
for(j=1;j<=3;j++){
ret=i*j; done=false; h_yield(f_coro); // yield (i*j, false)
}
}
ret=0; done=true; h_end(f_coro); // yield (0,true)
}
// How to use use f()?
coroutine_t f_coro;
coroutine_reset(f_coro);
f_t f_locals;
f(f_coro,f_locals,shell.f_ret,shell.f_done); //post cond: f_ret=1*1 f_done=false
f(f_coro,f_locals,shell.f_ret,shell.f_done); //post cond: f_ret=1*2 f_done=false
f(f_coro,f_locals,shell.f_ret,shell.f_done); //post cond: f_ret=1*3 f_done=false
f(f_coro,f_locals,shell.f_ret,shell.f_done); //post cond: f_ret=2*1 f_done=false
f(f_coro,f_locals,shell.f_ret,shell.f_done); //post cond: f_ret=2*2 f_done=false
...
f(f_coro,f_locals,shell.f_ret,shell.f_done); //post cond: f_ret=0 f_done=true
I’ve added few more code in origin/fiber branch. Please merge it with your master branch
user@host:~/hohlabs$ git pull
user@host:~/hohlabs$ git merge origin/fiber
In this part, we’ll learn about “fibers” while enhancing our kernel to make it responsive to key presses while long computation task is running.
core_loop_step():
if user has pressed key, get the key and do:
shell_update(ro: key, rw: shell_state);
// execute shell for one time slot to do some computation, if required.
shell_step(rw: shell_state);
// execute shell for one time slot to do some computation, if required.
shell_step_coroutine(rw: shell_state, rw: f_coro, rw: f_locals);
// execute shell for one time slot to do some computation based on fiber, if required.
shell_step_fiber(rw: shell_state, rw: main_stack, rw: f_stack, rw: f_array, ro: f_arraysize);
// shellstate -> renderstate: compute render state from shell state
shell_render(ro: shell_state, wo: render_state);
if not render_eq(last renderstate and new renderstate):
render(ro: render_state, wo: vga text buffer);
You need to define the following functions in labs/fiber.cc
shell_step_fiber(shellstate_t&, addr_t& main_stack, addr_t& f_stack, addr_t f_array, uint32_t f_arraysize);
You also need to enhance your shell implementation in labs/shell.h
update shellstate_t and renderstate_t structure:
for handling fiber state, and
new menu item for long computation task as fibers
You also need to enhance your shell implementation in labs/shell.cc
new menu item for long computation task
stack_reset(f_stack,f_array,f_arraysize,f_start,f_args...) : resets the stack. use std::ref() from functional to pass references
stack_resetN(f_stack,f_array,f_arraysize,f_start,f_args...): resets the stack. for C/ older C++ compilers.
stack_saverestore(from_stack,to_stack) : saves the context to from_stack, restore the context from to_stack.
void f(addr_t* pmain_stack, addr_t* pf_stack, int* pret, bool* pdone){
addr_t& main_stack = *pmain_stack; // boilerplate: to ease the transition from existing code
addr_t& f_stack = *pf_stack;
int& ret = *pret;
bool& done = *pdone;
int i;
int j;
for(i=1;i<=3;i++){
for(j=1;j<=3;j++){
ret=i*j;done=false; stack_saverestore(f_stack,main_stack);
}
}
for(;;){
ret=0;done=true; stack_saverestore(f_stack,main_stack);
}
}
// How to use use f()?
uint8_t f_array[F_STACKSIZE];
const size_t f_arraysize=F_STACKSIZE;
addr_t main_stack;
addr_t f_stack;
stack_reset4(f_stack, &f_array, f_arraysize, &f, &main_stack, &f_stack, &shell.f_ret, &shell.f_done);
stack_saverestore(main_stack,f_stack); //post cond: f_ret=1*1 f_done=false
stack_saverestore(main_stack,f_stack); //post cond: f_ret=1*2 f_done=false
stack_saverestore(main_stack,f_stack); //post cond: f_ret=1*3 f_done=false
stack_saverestore(main_stack,f_stack); //post cond: f_ret=2*1 f_done=false
stack_saverestore(main_stack,f_stack); //post cond: f_ret=2*2 f_done=false
...
stack_saverestore(main_stack,f_stack); //post cond: f_ret=0 f_done=true
//
// Switch stacks.
//
// Algo:
// 1. Save _c's context to stack,
// 2. push ip of _c's restore handler
// 3. switch stacks
// 4. execute ip of _n's restore handler to restore _n's context from stack.
//
//
// stack layout:
// teip[-1:-32]: continuation to restore,
// Stack layout expected by teip:
// ebp[ -33: -64],
// ebx[ -65: -96],
// eax[ -97:-128],
// Stack layout expected by eip+4:
// Preserved.
#define stack_saverestore(from_stack,to_stack) do { \
asm volatile( \
" pushl %%eax \n\t" \
" pushl %%ecx \n\t" \
" pushl %%ebp \n\t" \
" pushl $1f \n\t" \
" \n\t" \
" movl %%esp,(%0) \n\t" \
" movl (%1),%%esp \n\t" \
" \n\t" \
" ret \n\t" \
"1: \n\t" \
" popl %%ebp \n\t" \
" popl %%ecx \n\t" \
" popl %%eax \n\t" \
: \
:"a" (&from_stack), "c" (&to_stack) \
:_ALL_REGISTERS, "memory" \
); \
} while(false)
//
// Initializes stack.
//
// Algo:
// 1. Push Ip of reset handler
// (which will reset ebp and jmp to actual eip etc)
//
// stack layout:
// teip[-1:-32]: continuation to restore(1f),
// Stack layout expected by teip:
// args passed in registers when calling eip (NONE),
// eip[-33:-64],
// args passed in stack when calling eip (NONE),
//
// initial values: teip=t_start; eip=f_start;
//
#define stack_inithelper(_teip) do{ \
asm volatile( \
" movl $1f,%0 \n\t" \
" jmp 2f \n\t" \
"1: \n\t" \
" movl $0, %%ebp \n\t" \
" jmp *(%%esp) \n\t" \
"2: \n\t" \
:"=m" (_teip) \
: \
); \
}while(false)
NA
Before we do so, let’s first implement support for multiple non-premptive threads.
Syntax: GCC Extended Asm
asm [volatile] ( AssemblerTemplate
: OutputOperands
[ : InputOperands
[ : Clobbers ] ])
Label 1f means: the immediate label 1 in the forward direction.. and label 1b means the immediate label 1 in the backward direction.. And $1f means address of label 1 in the forward direction.
In stack_inithelper macro, the _teip gets the address of label 1f.
:“a”(value) inside input operands means : gcc will make sure %eax is not live at that point, and Move the value into “%eax” register
:“c”(value) inside input operands means : gcc will make sure %eax is not live at that point, and Move the value into “%ecx” register
if a register is mentioned in clobbered list - gcc will ensure that register is not live before calling asm statement. (all the integer registers which are not pushed in the macro are mentioned in _ALL_REGISTERS as clobbered. stack_saverestore is a macro - not a function so no calling convention is applied)
I’ve added few more code in origin/fiber_scheduler branch. Please merge it with your master branch
user@host:~/hohlabs$ git pull
user@host:~/hohlabs$ git merge origin/fiber_scheduler
In this part, we’ll learn about non-preemptive sheduling while enhancing our shell to support multiple pending long computation task.
NA
core_loop_step():
if user has pressed key, get the key and do:
shell_update(ro: key, rw: shell_state);
// execute shell for one time slot to do some computation, if required.
shell_step(rw: shell_state);
// execute shell for one time slot to do the computation based on coroutine, if required.
shell_step_coroutine(rw: shell_state, rw: f_coro, rw: f_locals);
// execute shell for one time slot to do the computation based on fiber, if required.
shell_step_fiber(rw: shell_state, rw: main_stack, rw: f_stack, rw: f_array, ro: f_arraysize);
// execute shell for one time slot for additional long computation tasks.
shell_step_fiber_scheduler(rw: shell_state, rw: stackptrs, ro: stackptrs_size, rw: arrays, ro: arrays_size);
// shellstate -> renderstate: compute render state from shell state
shell_render(ro: shell_state, wo: render_state);
if not render_eq(last renderstate and new renderstate):
render(ro: render_state, wo: vga text buffer);
You need to define the following functions in labs/fiber_scheduler.cc
shell_step_fiber_scheduler(shellstate_t&, addr_t stacks[], uint32_t stacks_size, addr_t arrays, uint32_t arrays_size);
You also need to enhance your shell implementation in labs/shell.h
update shellstate_t and renderstate_t structure:
for handling scheduler state, etc
You also need to enhance your shell implementation in labs/shell.cc
atleast two long computation tasks.
and ui changes.etc
NA
This is the goal: So far, we have the capability to run only one fiber. We need to support multiple fibers - Let's say:G and H with types:
1. G:: GArg -> GResult
2. H.:: HArg -> HResult
We also want to support multiple invocations of these fibers. (atmax 3). Question also states about one more constraint - total number of instances for G and H should be <= 5.
Now, we have to store 3*(GArg,GResult) and 3*(HArg,HResult) in shellstate_t.. just like we did it for f (we'd stored args and result in shellstate for 1.5 and 1.6).
What should be a good data structure for storing these? Two common approaches are:
1. 3*(GArg,GResult) and 3*(HArg,HResult)
2. 5* Union of (GArg,GResult) and (HArg,HResult)
How to do scheduling?
Let's say, we have a circular buffer/linked list on top of array.
When someone wanted to start an instance(press enter), just check the resource limitations. and change state, add into the queue.
and in each invocation of fiber_scheduler... just pick one fiber(round robin), and execute.
ie. in next invocation - pick the next fiber and execute it.. so on.
This is just one way to implement.. You don't need to implement this way
- mentioned at the last day to help those students who're running out of
time.
To test how good is your design:
etc.
I’ve added few more code in origin/preemption branch. Please merge it with your master branch
user@host:~/hohlabs$ git pull
user@host:~/hohlabs$ git merge origin/preemption
In this part, we’ll learn about “preemption” while enhancing our kernel to make it responsive to key presses while long computation task is running.
You also have to make following changes in the existing implementation:
FXSAVE and FXRSTOR assembly instructions: To save and restore FPU/SIMD registers To save/restore all these registers, Intel provided a single instruction FXSAVE/FXRSTOR.
To know more about fxsave and fxrstor instruction, please read: Vol-1, Chapter 10, Section 5. You can use the example on Page 404 (Section E.2, Example E-1) for the code required to save and restore these registers. Notice that you need to create a space for 512 bytes, aligned at a 16-byte boundary to be able to execute the FXSAVE instruction.
Note that memory address passed to fxsave and fxrstor must be 16 byte aligned. ie. must be a multiple of 16.
Possible Control flow
Make sure your ring0_prempt will be able to work with below scenario
Main thread (fiber scheduler) : The main thread needs to distinguish between the two cases: one where the control reached it due to voluntary switching from a fiber thread (through calling stack_saverestore directly) [step 11 in figure], and the other where the control reached it due to preemption [step 31 in figure]. For the first case, the location of the stack pointer of the switched-out fiber is available in the parameter “from_stack” of stack_saverestore. For the second case, the pointer would be available in the core_t.preeempt.foo field (implemented by you).
FPU: eax,ecx, edx, ebx, esp, ebp, esi, edi are all integer registers.
Let’s try to write a simple C functions which add two floats:
float add(float a, float b){
return a+b;
}
Which registers are they going to use, and which instructions? integers registers? addl instruction? No! What’s the format of floats? number is represented as (sign,mantisa,exponent). To know about it, please read about IEEE754 floating point representation/basic computer architecture course. That’s where legacy 8087 FPU comes into picture.
It has 8 80-bit FPU registers: st(0),st(1), st(2)…st(7). ( 1 bit sign, 64 bit mantissa, 15 bit exponent) sizeof(double)=8. Floating point loads and stores will convert this 80-bit representation to 64 bit representation when it store to memory.. and viceversa. To know about FPU registers, please read: Vol-1, Chapter 8
MMX/SIMD2: With one instruction, we want to add N pairs in parallel, which means we want registers than hold N ints (or N floats).
x86 has mm0, mm1, mm2.. mm7 (which are SIMD2, ie. N=2 - it holds two floats). To know more about MMx registers, please read: Vol-1 Chapter 9
SSE/SIMD4: It also introduced SIMD4(128 bit registers) xmm registers. xmm0, xmm1 … xmm7. To know more about xmm registers, please read: Vol-1 Chapter 10
AVX/SIMD8 Intel also introduced (SIMD8) ymm registers in architectures like Sandybridge, Haswell etc. but since our gcl machines doesn’t support these - we won’t discuss it here.
Overview of preemption handler’s control flow:
Read the code - to understand where ring0_prempt is getting called
You need to define the following structures in labs/preempt.h
// preempt_t : State for your timer/preemption handler
struct preempt_t{
};
You also need to define the following functions in labs/preempt.h
//
// _name: label name
// _f : C function to be called
//
# define _ring0_preempt(_name,_f) \
You also need to modify labs/fiber.cc and labs/fiber_scheduler.cc to set the timer and reset the timer
Both the shell_step_fiber and shell_step_fiber_sched are passed an dev_lapic_t object. which has a member function:
reset_timer_count(int count).
LAPIC Timer unit will decrement this count every tick, and when it reaches zero, will fire a timer interrupt.
To know more about LAPIC Timer: please read: Vol 3A, 10.5.4.
%gs: See x86/main.h and x86/except.* on usage of %gs
Outline of ring0_preempt:
#define _ring0_preempt(_name,_f)
_name:
call C function: _f
// begin
if thread is already inside yield,
jmp iret_toring0
save the CPU state to current stack (fiber's stack)
save the current stack pointer to core_t.preempt.foo
switch context and stack using stack_saverestore()
... (control will not reach here immediately, it will reach here only on the next context switch back to this fiber)
restore CPU state from the current stack
jmp iret_toring0
NA
I’ve added few more code in origin/multicore branch. Please merge it with your master branch
user@host:~/hohlabs$ git pull
user@host:~/hohlabs$ git merge origin/multicore
In this part, we’ll learn about multicore programming by implementing a SPSC queue and use it to send messages between two cores.
struct channel_t{
};
struct writeport_t{
//
// Writer
//
// no of entries available to write
size_t write_reservesize();
// Can write 'n' entries?
bool write_canreserve(size_t n);
// Reserve 'n' entries for write
size_t write_reserve(size_t n);
//
// Deleter
//
// No of entires available to delete
size_t delete_reservesize();
// Can delete 'n' entires?
bool delete_canreserve(size_t n);
// Reserve 'n' entires for deletion
size_t delete_reserve(size_t n);
//
// Synchronized operations
//
// Note: Feel free to implement these functions the way you want.
// You're not allowed to change the function prototype
// PS: Don't go by the function names.
//
// Read/Write shared memory data structure
void write_sync(channel_t& ch);
// Read/Write shared memory data structure
void read_sync(channel_t& ch);
// Update the state, if any.
void delete_sync();
};
struct readport_t{
//
// Reader
//
// no of entries available to read
size_t read_reservesize();
// Can Read 'n' entires?
bool read_canreserve(size_t n);
// Reserve 'n' entires to be read
size_t read_reserve(size_t n);
//
// Synchronization operation
//
// Note: Feel free to implement these functions the way you want.
// You're not allowed to change the function prototype
// PS: Don't go by the function names.
// Read/write shared memory data structure
void read_sync(channel_t& ch);
// Read/Write shared memory data structure
void write_sync(channel_t& ch);
};
NA
std::atomic<T>
PS: shell_update() is your keyboard handler, on every key press it will be called. it’s independent of shell_step()
I’ve added few more code in origin/ring3 branch. Please merge it with your master branch
user@host:~/hohlabs$ git pull
user@host:~/hohlabs$ git merge origin/ring3
In this part, we’ll learn about ELF headers, page table handling and user mode switching while enhancing our kernel to load arbitary user program and execute it.
Please see lecture videos:
static inline void elf_load(addr_t from, size_t fromsize, process_t& proc, bitpool_t& pool4M);
static inline void ring3_step(preempt_t& preempt, process_t& proc, dev_lapic_t& lapic);
static inline void ring3_step_done(process_t& proc, dev_lapic_t& lapic);
I’ve added few more code in origin/ring3 branch. Please merge it with your master branch
user@host:~/hohlabs$ git pull
user@host:~/hohlabs$ git merge origin/ring3
In this part, we’ll learn about preempting user program while enhancing our kernel to make it responsive to key presses while long computation task is running in ring3/user mode.
Please see following lecture videos: - Process context switch
In labs/ring3_preempt.h:
#define _ring3_preempt(_name, _f)
NA
NA
user@host:~/hohlabs$ git pull
user@host:~/hohlabs$ git merge origin/ring3
In this part, we’ll learn about upcalls (passing information to the exception handler running in user mode) by letting the user process manage the exceptions(like INT3, page faults etc).
type of (_start+4) is:
void user_exception(uint32_t rank, uint32_t masterro, uint32_t masterrw, uint32_t sharedrw, uint32_t num, uint32_t errorcode, uint32_t oldesp, uint32_t old_eip, uint32_t cr2)
In labs/ring3_upcall.h:
#define _ring3_upcall(_name, _f)
NA
NA
Generate an int3 or a page fault yourself. and see if it is getting reported correctly. ie. Match the values in qemu.log and the ones printed by user_exception handler.
user@host:~/hohlabs$ git pull
user@host:~/hohlabs$ git merge origin/ring3
In this part, we’ll learn about downcalls/system calls by implementing following system calls: - You need to define the following function: system_call();
System call should NOT modify/write to the system call memory. (See Tip)
nop()
mark the process as done(proc->state=PROC_DONE). process shouldn’t be scheduled after this. So make sure, in your ring3_step, you ignore the process if proc->state==PROC_DONE.
done/exit()
call appropriate mmio::read
mmio_read(size, addr_t) -> value
call appropriate mmio::write
mmio_write(size, addr_t, value)
call appropriate io::read
pmio_read(size, io_t) -> value
call appropriate io::write
pmio_write(size, io_t, value)
Make sure you swap the flags as well. For example: if va1 is not mapped into user page, and va2 is mapped, After swap: va1 is mapped into user page and va1 is not.
mmu_swapva(va1,va2)
Note: VA_RANGE is defined as 2GB-3GB.
mmu_mapmmio(nva)
for time being, set iopl flags to 3. ie. proc->iopl=3. and always make sure eflags = (eflags & ~(3u<<12)) | (proc->iopl<<12);
pmu_mappmio(io_t)
Note: This system call returns either 0 or a va within the range VA_RANGE.
pool_alloc() -> va
Note: No need to implement authorization. We haven’t implemented support for capabilities in this kernel yet. We’ll implement capabilities in IPC part only.
User shall pass arguments through begin of page shared between user and kernel. Memory layout:
Kernel may execute system call asynchronously by reading the shared page. User can alternatively force the use of system call execution, by using INT 0x48.
Note: Make sure In elf_load() you clears first 64 byte of proc.masterrw. esp. initialize proc.masterrw[0] as zero.
NA
static inline void xsyscall(uint32_t* systemcallmmio, uint32_t fnum, uint32_t arg1, uint32_t arg2, uint32_t arg3, uint32_t& ret1, uint32_t& ret2, uint32_t& ret3){
systemcallmmio[2]=arg1;
systemcallmmio[3]=arg2;
systemcallmmio[4]=arg3;
systemcallmmio[1]=fnum; //write this field at the end.
hoh_debug("Shell Before making system call");
asm volatile("int $0x48":::"memory");
hoh_debug("Shell After making system call");
hoh_assert(systemcallmmio[1]==0,"XXX");
ret1=systemcallmmio[2];
ret2=systemcallmmio[3];
ret3=systemcallmmio[4];
hoh_debug("Syscall ret: "<<ret1<<","<<ret2<<","<<ret3);
}
// call test_systemcall by:
//swapva
uint32_t ret1;
uint32_t ret2;
uint32_t ret3;
xsyscall(core.syscallmmio, 0x6, xxx, yyy, 0, ret1, ret2, ret3);
//pool_alloc
uint32_t ret1;
uint32_t ret2;
uint32_t ret3;
xsyscall(systemcallmmio, 0x9, 0,0,0, ret1,ret2,ret3);
hoh_debug("Allocated at: "<<ret1);
NA
uint32_t* systemcall_mmio = cast<uint32_t*>(proc.masterrw);
uint32_t fnum =systemcall_mmio[1]; //read fnum first.
if(fnum==0){ //make sure you check fnum.
return;
}
uint32_t farg1=systemcall_mmio[2];
uint32_t farg2=systemcall_mmio[3];
uint32_t farg3=systemcall_mmio[4];
uint32_t fret1=0;
uint32_t fret2=0;
uint32_t fret3=0;
switch(fnum){
case 0: {
}break;
case 1: {
}break;
case 2: {
}break;
}
if(fnum!=0){
// do not modify the arguments if fnum is zero.
systemcall_mmio[2]=fret1;
systemcall_mmio[3]=fret2;
systemcall_mmio[4]=fret3;
systemcall_mmio[1]=0; //modify this last.
}
user@host:~/hohlabs$ git pull
user@host:~/hohlabs$ git merge origin/ring3
In this part, we’ll learn about virtual memory by emulating an array of Nv virtual pages using Np physical pages.
Note: There is a change : Nv = 16 and Np = 4 instead of Nv = 16 and Np = 8. Ie. you need to emulate 16 page array using 4 physical pages. not 8
Please read lecture videos on demand paging and page replacement policy.
Nv = 16 and Np = 4
You need to define a function to_index which will map this 3D array into varray. ie
//access x,y,z of this 3D array by:
varray[to_index(x,y,z)]=f(x,y,z);
//or:
hoh_assert(varray[to_index(x,y,z)] == f(x,y,z)," Bug");
for(uint32_t x = 0; x<256; x+=32){ for(uint32_t y = 0; y<256; y+=32){ for(uint32_t z = 0; z<256; z+=32){ sum_neighbours(x,y,z,f_lut); } } }
``` 3. You need to implement page replacement policy. You need to find a page replacement policy that will minimize the number of page faults and cache misses.
Motivation for the application:
Let’s say we’ve a long computation function
uint32_t f(uint8_t x, uint8_t y, uint8_t z);
To reduce invocation of this function each time we need it. We precompute ‘f’ for all the possible inputs and store it in a lookup table/array. ie.
for(x=0;x<=255;x++){
for(y=0;y<=255;y++){
for(z=0;z<=255;z++){
varray[to_index(x,y,z)] = f(x,y,z);
}
}
}
//then, we can replace f with f2 where
uint32_t f2(uint8_t x,uint8_t y,uint8_t z){
return varray[to_index(x,y,z)];
}
To get f2 working, without any modifications: We will emulate the array ‘varray’ of 16 larges pages within VA_RANGE.
addr_t varray = addr_t(2<<30); //2GB
Demo tip: We need to call f2(x,y,z) for all possible x,y and z. What should be the order in which we should call f(x,y,z) so that it will minimize number of cache-line misses and tlb misses
Note: Please don’t publish the code even after your demo is done(Code is part of my PhD work).
//
// Call sum_neighbours for each x<-[0..255], y<-[0..255], z<-[0..255]
//
// You're allowed to change this implementation:
// Note: d is defined to be 2^6. and cannot be changed
uint32_t sum_neighbours(uint8_t x, uint8_t y, uint8_t z){
//computes sum of all the elements in the list defined by
// [f(x+i,y+j,z+k) | i<-[-d, -(d-1),...,-1,0,1,...,(d-1),d],
// j<-[-d, -(d-1),...,-1,0,1,...,(d-1),d],
// k<-[-d, -(d-1),...,-1,0,1,...,(d-1),d]]
//ie. Note d=2^6
size_t sum=0;
for(int i=-d; i<d; i++){
for(int j=-d; j<d; j++){
for(int k=-d; k<d; k++){
sum += f2(x+i, y+j, z+k);
}
}
}
}
//
// Call weightedsum_neighbours for each x<-[0..255], y<-[0..255], z<-[0..255]
//
// You're allowed to change this implementation:
// Note: d is defined to be 2^6. and cannot be changed
uint32_t weightedsum_neighbours(uint8_t x, uint8_t y, uint8_t z){
//computes sum of all the elements in the list defined by
// [f(x+i,y+j,z+k) | i<-[-d, -(d-1),...,-1,0,1,...,(d-1),d],
// j<-[-d, -(d-1),...,-1,0,1,...,(d-1),d],
// k<-[-d, -(d-1),...,-1,0,1,...,(d-1),d]]
//ie. Note d=2^6
size_t sum=0;
for(int i=-d; i<d; i++){
for(int j=-d; j<d; j++){
for(int k=-d; k<d; k++){
sum += w(i,j,k,d) * f2(x+i, y+j, z+k);
}
}
}
}
//
// Call another traversal for each x<-[0..255], y<-[0..255], z<-[0..255]
//
void for_aach(){
for each element in 3D array:
print sum_neighbours(element);
}
Notation: [f(x) | x<-[1..10]] as: list of all f(x) such that x ∈ [1..10]
Notation: Read ‘|’ as such that. Read ‘<-’ as ∈.
Congrats on making it so far! It’s been a pleasure working with you.
Hope you enjoyed it as well!
Let’s try to summarize the plot:
In short: - You’ve implemented kernel coroutines, kernel threads and a kernel thread scheduler. And implmented a kernel application - shell. - You also have implemented user coroutines, user threads and a user thread scheduler. And implemented a user application - shell.
Shell is already done for you! Your shell which you implemented in part 1.4 is already moved to user mode as an application. So the role has been revereed - whatever you’ve done till parts 1.9 are now in user mode. And parts 1.10 - parts 1.13 are in kernel.
user@host:~/hohlabs$ git pull
user@host:~/hohlabs$ git merge origin/ring3_shell
Get User shell working
Please don’t make the source code public even after you finish this course - The code you been working on is part of Deepak Ravi’s PhD. We hope to release the code under AGPL3 licence (Current LICENSE doesn’t allow the code). Till then please don’t publish.
End of lab
Please make sure you submit the feedback form
--
Regards,
DR H
Hoh labs