Frequently asked questions, and answers by TAs, from previous years. Unless otherwise mentioned, the answers (and in some cases, the questions) are by Deepak Ravi.
Question: Why does output written via MMIO show up in the QEMU window, while output written via the PMIO functions shows up in the terminal (see qemu.log)? And why are the IO functions written in assembly while the MMIO functions are plain C?
Answer: Because you're programming two different devices.
- Using MMIO, you were programming the VGA device, which is connected to the monitor. (In QEMU, the monitor is emulated as the QEMU window.)
- Using PMIO, you were programming the UART/serial console device, which is supposed to connect to a serial device. (In QEMU, this serial port is connected to the terminal.)

How did we connect the serial port of the machine emulated by QEMU to the Linux terminal? Look at the Makefile:

    qemu:: iso
        $(QEMU) $(QEMUFLAGS) $(QEMULOG) -serial stdio -serial null -cdrom $O/$(NAME).iso

To understand PMIO, look at the 8086 design: a 16-bit processor where address spaces are small. The 8086 has a MEM/IO pin. If MEM (pin is set), the address request goes to memory; if IO (pin is unset), the address request goes to the 16-bit IO bus. In short, the 8086 has two address spaces: one for memory and another for IO. Memory addresses are actually 20 bits wide (remember the 8086's segmentation, CS:IP?), and IO addresses are 16 bits wide. So, how do you specify which address space to use? Simply provide two different sets of instructions, and let the user say explicitly whether an access is memory or IO. So, to use the IO bus, you use the inX/outX instructions; they select the IO space by unsetting the MEM/IO pin.

Times have changed. We have 32/64-bit registers, and the memory address space is larger than physical RAM, so why not combine MEM and IO into one single address space and use the same instructions for both? Yes, but we need to change the behaviour slightly:
- Caching: introduce the MTRRs/PAT. Consult the MTRRs to decide whether a request should be cached or not.
- Simulating inb vs inw: on a cache miss, the processor usually does a burst read to fetch an entire cache line from RAM. So, to get inb/inw behaviour, we need two modes. In cached mode, behave normally (do burst reads). In uncached mode, behave like inb/inw: if the instruction loads a single byte, read only 8 bits, i.e.
do not do burst reads. So the CPU does not need to know whether an access is IO or memory; it just supports the behaviours above. But at some point we need to demultiplex it to memory or IO: have a north bridge/memory controller do the demux. (Most modern devices use MMIO; legacy devices like serial ports and PS/2 keyboards do use PMIO.)

Coming back to your question: the MMIO (memory-mapped input/output) functions are used to write to the regions of devices that have memory-mapped buffers (such as the VGA display buffer), while the IO functions just read/write the input/output ports of different devices. The serial output is written using the IO functions because the QEMU emulator and the terminal you start QEMU from are connected via a UART port, so anything you write to the UART's port is displayed on the terminal.

The IO functions are written in assembly because on x86 the IO ports are accessed with the inX and outX instructions (inb, outb, inw, outw), and inline assembly tells the compiler to emit exactly these instructions for the IO functions. The MMIO functions only require writing memory, so each function just assigns a value through a pointer; note that here the compiler will emit an x86 mov instruction. The mfence instruction ensures that the memory writes are synchronized up to the point where it executes, so that when the function exits, the buffer has surely been updated. You can read more about mfence here: http://x86.renejeschke.de/html/file_module_x86_id_170.html

Also: you don't need to hand-write the mfence assembly instruction. C++11 provides std::atomic_thread_fence(...) and std::atomic_signal_fence(...), defined in the <atomic> header file. This is much better than writing the assembly yourself. Why? Because it is portable, and depending on the memory order requested, the compiler may be able to optimize the fences away.
So, in util/io.h, it is equivalent to use std::atomic_thread_fence(std::memory_order_seq_cst) instead of the mfence instruction. (Try the disassembly: you should see the compiler generating mfence.) Thanks, Dushyant Behl (TA 2015)
Question (about the clobber list at the end of the stack-switch inline assembly):

    : \
    : "a" (&from_stack), "c" (&to_stack) \
    : _ALL_REGISTERS, "memory" \
    );
Don't we need to pass main_stack as an argument to this handler in that case?
Answer: No. main_stack is a field of core_t, and the core_t instance for the current core is mapped at %gs. There are no global variables, and the trap handlers are stateless.
What is the minimum number of GDT entries we need:
1. for kernel mode only
   a. with only read-only data
   b. with read/write data
2. for kernel + user mode
   a. with only read-only data
   b. with read/write data
3. for kernel + user mode + multicore
and why? Why did we have a per_core entry in the GDT? What are the alternative approaches?
Answer 1: If we have paging, we can actually implement this with just 1 entry that maps all locations as valid; paging then takes care of the user/kernel/read/write permissions. If we want to use segmentation itself to control permissions, below would be my answer:
1. Kernel mode only:
   a. with read-only data: 1 entry (code and data can share a single entry, since both are read-only)
   b. with read/write data: 2 entries, one for code (ro) and one for data (rw). (If we have ro data as well, we can club it with code.)
2. Kernel + user mode:
   a. with only read-only data: 2 entries, one with kernel permissions and one with user permissions, each mapping its own sections.
   b. with read/write data: 4 entries, two with kernel permissions (ro code, rw data) and two with user permissions (ro code, rw data).
Answer 2:
Following could be the minimum number of GDT entries for the given scenarios (ro = read-only, rw = read/write):
1. Kernel mode only, with ro and rw data: 2 (+1 null descriptor); or 1 if paging is used to implement r/w permissions (+1 null descriptor).
2. Kernel + user mode: 4 (kernel rw, kernel ro, user rw, user ro) (+1 null descriptor); or 2 if paging is used to implement r/w permissions (+1 null descriptor). (We can't have just one entry for both user and kernel, for obvious reasons.)
3. Kernel + user + multicore: [number of entries for kernel + user] + 1 for multicore (+1 null descriptor).
A useful discussion: http://stackoverflow.com/questions/3029064/segmentation-in-linux-segmentation-paging-are-redundant
Question (observed debug output inside the ISR):

    inside isr user ring0: 00000006 esp=00c03f50
struct preempt_t { /* insert your code here */ }; You have to define the preempt_t structure. %gs:core_offset_preempt will give you the first 4 bytes of this structure.
In general, interrupts are disabled when you enter an ISR. In this case, since we jump back to the main function from within the handler itself, you need to re-enable interrupts in your main code once you return from the preempt handler, in order to see interrupts again. With the above hint, you can also try to understand why it starts working when one of the fibers ends. Consider the situation with 2 fibers, neither of which ever yields:

    start fiber 1
    ...
    timer interrupt (interrupts disabled)
    ...
    start fiber 2 (before calling iret)
    ... fiber 2 keeps running (as interrupts remain disabled)
I preempt myself with an apology if I am nitpicking (:
Answer (from a student): I had the same concern. The idea is that yield should do the job of blocking interrupts; however, the given implementation of yield, i.e., stack_save_restore, does not provide that. So consider it as modifying the yield implementation.
1. What is the utility of the functions size_t delete_reservesize(); and size_t read_reservesize();? These functions are not called anywhere in the code.
2. What is the utility of render_flag? Why is it set to false? In the following snippet:

    if (apps.render_flag && render_eq(apps.render_state, apps.rendertmp)) {
        apps.render_state = apps.rendertmp;
        apps.render_flag = true;
        goto norender;
    }

render_flag will never get set to true, as control never enters the block.
I'll fix the render flag, thanks. The reserve-size functions were there to extract parallelism; they are not part of this lab.