6.828
https://pdos.csail.mit.edu/6.828/2018/labguide.html
lab tools guide
Kernel
GDB:
make qemu-gdb (or the qemu-nox-gdb variant)
: make QEMU wait for GDB to attach
-d
: output a detailed log
info mem
: debug virtual memory issues at a high level
info pg
: the same, in more detail
thread and info threads
: debug multiple CPUs (Lab 4+)
User environment (Lab 3+)
GDB cannot tell one user environment from another, or user code from kernel code, so you have to tell it which symbols to use.
make run-name
: start JOS running the user program name
make run-name-gdb
: the same, but make QEMU wait for GDB to attach
symbol-file
: tell GDB which symbol table to use
- .gdbinit loads the kernel symbol table, obj/kern/kernel
symbol-file obj/user/name
: load the symbol table for a user environment from its ELF binary
- Note: don't load symbols from .o files; libraries are statically linked into JOS user binaries, so each user binary already contains their symbols.
Reference
JOS makefile
-
make qemu
Build everything and start QEMU with the VGA console in a new window and the serial console in your terminal. To exit, either close the VGA window or press Ctrl-c or Ctrl-a x in your terminal.
-
make qemu-nox
Like make qemu, but run with only the serial console. To exit, press Ctrl-a x. This is particularly useful over SSH connections to Athena dialups because the VGA window consumes a lot of bandwidth.
-
make qemu-gdb
Like make qemu, but rather than passively accepting GDB connections at any time, this pauses at the first machine instruction and waits for a GDB connection.
-
make qemu-nox-gdb
A combination of the qemu-nox and qemu-gdb targets.
-
make run-name
(Lab 3+) Run user program name. For example, make run-hello runs user/hello.c.
-
make run-name-nox
run-name-gdb
run-name-gdb-nox
(Lab 3+) Variants of run-name that correspond to the variants of the qemu target.
The makefile also accepts a few useful variables
-
make V=1 ...
Verbose mode. Print out every command being executed including arguments.
-
make V=1 grade
Stop after any failed grade test and leave the QEMU output in jos.out for inspection.
-
make QEMUEXTRA='args' ...
Specify additional arguments to pass to QEMU.
JOS obj/
Files useful while debugging:
-
obj/boot/boot.asm, obj/kern/kernel.asm, obj/user/hello.asm
Assembly code listings for the bootloader, kernel, and user programs.
-
obj/kern/kernel.sym, obj/user/hello.sym, etc
Symbol tables for the kernel and user programs.
-
obj/boot/boot.out, obj/kern/kernel, obj/user/hello, etc
Linked ELF images of the kernel and user programs. These contain symbol information that can be used by GDB.
GDB
-
Ctrl-c
Halt the machine and break in to GDB at the current instruction. If QEMU has multiple virtual CPUs, this halts all of them.
-
c or continue
Continue execution until the next breakpoint or Ctrl-c.
-
si or stepi
Execute one machine instruction.
-
b function or b file:line (or breakpoint)
Set a breakpoint at the given function or line.
-
b *addr (or breakpoint)
Set a breakpoint at the EIP addr.
-
set print pretty
Enable pretty-printing of arrays and structs.
-
info registers
Print the general purpose registers, eip, eflags, and the segment selectors. For a much more thorough dump of the machine register state, see QEMU's own info registers command.
-
x/Nx addr
Display a hex dump of N words starting at virtual address addr. If N is omitted, it defaults to 1. addr can be any expression.
-
x/Ni addr
Display the N assembly instructions starting at addr. Using $eip as addr will display the instructions at the current instruction pointer.
-
symbol-file file
(Lab 3+) Switch to symbol file file. When GDB attaches to QEMU, it has no notion of the process boundaries within the virtual machine, so we have to tell it which symbols to use. By default, we configure GDB to use the kernel symbol file, obj/kern/kernel. If the machine is running user code, say hello.c, you can switch to the hello symbol file using symbol-file obj/user/hello.
QEMU represents each virtual CPU as a thread in GDB, so you can use all of GDB's thread-related commands to view or manipulate QEMU's virtual CPUs.
-
thread n
GDB focuses on one thread (i.e., CPU) at a time. This command switches that focus to thread n, numbered from zero.
-
info threads
List all threads (i.e., CPUs), including their state (active or halted) and what function they're in.
QEMU
Lab 1
- Get familiar with x86 assembly language, the QEMU x86 emulator, and the PC's power-on bootstrap procedure.
- Examine the boot loader for 6.828, which lives in the boot directory of the lab tree.
- Delve into the initial template for the 6.828 kernel, JOS, which lives in the kernel directory.
Step 1: get familiar with x86 assembly language
Credit: Brennan's Guide to Inline Assembly
http://www.delorie.com/djgpp/doc/brennan/brennan_att_inline_djgpp.html
Syntax
-
Register names
In AT&T syntax, register names are prefixed with "%". To reference eax:
AT&T:
%eax
Intel:
eax
-
Source/Destination order
In AT&T syntax (the UNIX standard, according to Brennan's guide), the source is always on the left and the destination is always on the right.
e.g. load ebx with the value in eax:
AT&T:
movl %eax, %ebx
Intel:
mov ebx, eax
-
Constant value/immediate value format
In AT&T syntax, all constant/immediate values must be prefixed with $.
e.g. load eax with the address of the static C variable booga:
AT&T:
movl $_booga, %eax
Intel:
mov eax, _booga
eg. load ebx with 0xd00d:
AT&T:
movl $0xd00d, %ebx
Intel:
mov ebx, d00dh
-
Operator size specification:
In AT&T syntax, the instruction must carry a suffix of b, w, or l to specify the width of the destination: byte, word, or longword.
eg.
AT&T:
movw %ax, %bx
Intel:
mov bx, ax
-
Referencing memory:
DJGPP uses 386 protected mode, so you can forget all the real-mode addressing junk, including the restrictions on which register has what default segment and which registers can be base or index pointers. Now we just get 6 general-purpose registers (7 if you use ebp, but be sure to restore it yourself or compile with -fomit-frame-pointer).
AT&T: immed32(basepointer,indexpointer,indexscale)
Intel: [basepointer + indexpointer*indexscale + immed32]
You could think of the formula to calculate the address as:
immed32 + basepointer + indexpointer * indexscale
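The address formula can be checked with a small Python sketch. The function name effective_address and the sample register values are illustrative assumptions, not taken from Brennan's guide; the result is wrapped to 32 bits, as on a 386.

```python
def effective_address(immed32=0, base=0, index=0, scale=1):
    """Compute immed32 + base + index * scale, wrapped to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("x86 index scale must be 1, 2, 4, or 8")
    return (immed32 + base + index * scale) & 0xFFFFFFFF

# AT&T operand 8(%ebx,%ecx,4), assuming ebx = 0x1000 and ecx = 3
addr = effective_address(immed32=8, base=0x1000, index=3, scale=4)
print(hex(addr))  # 0x1014
```

The Intel form of the same operand would be [ebx + ecx*4 + 8]; both describe the same address.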
Assembly language
-
8088, 8086: 16-bit registers AX, BX, CX, DX, SI, DI, BP, SP, CS, DS, SS, ES, IP, FLAGS
-
80286: adds 16-bit protected mode, in which it can access up to 16 megabytes and protect programs from accessing each other’s memory.
-
80386: extends many of the registers to hold 32 bits (EAX, EBX, ECX, EDX, ESI, EDI, EBP, ESP, EIP) and adds two new 16-bit registers, FS and GS.
32-bit protected mode: in this mode, it can access up to 4 gigabytes.
-
80486 / Pentium / Pentium Pro: these mainly speed up the execution of the instructions.
-
Pentium MMX: adds the MMX (MultiMedia eXtension) instructions.
-
Pentium II: This is the Pentium Pro processor with the MMX instructions added. (The Pentium III is essentially just a faster Pentium II.)
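The 16 megabyte and 4 gigabyte limits follow from the width of a physical address. A rough Python check, assuming 24 address bits for the 80286 and 32 for the 80386 (these widths are not stated in the notes above), with addressable_bytes as an illustrative helper name:

```python
def addressable_bytes(address_bits):
    """Number of distinct byte addresses for a given address width."""
    return 1 << address_bits

MIB = 1 << 20  # one megabyte in the binary sense
GIB = 1 << 30  # one gigabyte in the binary sense

print(addressable_bytes(24) // MIB)  # 16 (80286, assumed 24-bit addresses)
print(addressable_bytes(32) // GIB)  # 4  (80386, assumed 32-bit addresses)
```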
8086:
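General registers: AX, BX, CX, DX. Each is 16 bits wide and splits into two 8-bit halves, e.g. AX into AH (high byte) and AL (low byte).
Index registers: SI, DI. Often used as pointers; they cannot be decomposed into 8-bit registers.
BP and SP: point to data in the machine language stack.
Segment registers: CS, DS, SS, and ES (CS = Code Segment, DS = Data Segment, SS = Stack Segment, ES = Extra Segment).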
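The relationship between AX and its two halves is plain bit arithmetic. A minimal Python sketch, with split_ax and join_ax as illustrative names:

```python
def split_ax(ax):
    """Return (AH, AL): the high and low bytes of a 16-bit AX value."""
    ax &= 0xFFFF
    return ax >> 8, ax & 0xFF

def join_ax(ah, al):
    """Rebuild AX from its two 8-bit halves."""
    return ((ah & 0xFF) << 8) | (al & 0xFF)

ah, al = split_ax(0x1234)
print(hex(ah), hex(al))       # 0x12 0x34
print(hex(join_ax(ah, al)))   # 0x1234
```

Writing to AL or AH therefore changes part of AX, which is why the three names cannot be treated as independent registers.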
NASM: Netwide Assembler
Each instruction takes 0 to 3 operands (the slots after the mnemonic). An operand can be one of:
- register
- memory
- immediate
- implied
Basic instructions:
-
mov: mov dest, src (data moves from right to left)
mov eax, 3 ; store 3 into EAX, 3 is immediate operand
-
add eax, 4 ; eax = eax + 4
add al, ah; al = al+ah
-
sub bx, 10; bx = bx - 10
-
inc ecx ; ecx++
-
dec dl ; dl--
some directives
-
Define a symbol:
symbol equ value
-
%define works much like #define in C:
%define SIZE 100
mov eax, SIZE
-
resX and dX: reserving and initializing data
L1 db 0 ; byte labeled L1 with initial value 0
L2 dw 1000 ; word labeled L2 with initial value 1000
L3 db 110101b ; byte initialized to binary 110101 (53)
L4 db 12h ; byte initialized to hex 12 (18 in decimal)
L5 db 17o ; byte initialized to octal 17 (15 in decimal)
L6 dd 1A92h ; double word initialized to hex 1A92
L7 resb 1 ; 1 uninitialized byte
L8 db "A" ; byte initialized to ASCII code for A (65)
-
times
L9 db 0, 1, 2, 3 ; defines 4 bytes
L10 db "w", "o", "r", "d", 0 ; defines a C string = "word"
L11 db "word", 0 ; same as L10
L12 times 100 db 0 ; equivalent to 100 (db 0)'s
L13 resw 100 ; reserves room for 100 words
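To see which bytes these directives would place in memory, here is a rough Python model. The helpers db, dw, and dd are illustrative stand-ins for the NASM directives, not NASM itself, and they assume x86's little-endian byte order.

```python
import struct

def db(*values):
    """Like NASM db: one byte per integer, or the ASCII bytes of a string."""
    out = b""
    for v in values:
        out += v.encode("ascii") if isinstance(v, str) else struct.pack("<B", v)
    return out

def dw(value):
    """Like NASM dw: one 16-bit word, little-endian."""
    return struct.pack("<H", value)

def dd(value):
    """Like NASM dd: one 32-bit double word, little-endian."""
    return struct.pack("<I", value)

print(dw(1000).hex())      # e803
print(dd(0x1A92).hex())    # 921a0000
print(db("word", 0))       # b'word\x00'
print(len(db(0) * 100))    # 100, like: times 100 db 0
```

Note that db "w", "o", "r", "d", 0 and db "word", 0 yield the same five bytes in this model, matching the comment on L11.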
-
address and data (like pointers):
mov al, [L1] ; copy byte at L1 into AL
mov eax, L1 ; EAX = address of byte at L1
mov [L1], ah ; copy AH into byte at L1
mov eax, [L6] ; copy double word at L6 into EAX
add eax, [L6] ; EAX = EAX + double word at L6
add [L6], eax ; double word at L6 += EAX
mov al, [L6] ; copy first byte of double word at L6 into AL
mov dword [L6], 1 ; store a 1 at L6 as a double word; other size specifiers are BYTE, WORD, QWORD, and TWORD
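The difference between a bare label (an address) and a bracketed label (the data at that address) can be modeled in Python. This is only a sketch: the flat 16-byte memory, the label addresses, and the value stored at L1 are all made up for illustration.

```python
import struct

memory = bytearray(16)            # toy flat memory
labels = {"L1": 0, "L6": 4}       # hypothetical addresses for the labels

memory[labels["L1"]] = 0x7F                           # example value for the byte at L1
struct.pack_into("<I", memory, labels["L6"], 0x1A92)  # L6 dd 1A92h

def address_of(label):
    """mov eax, L1 : the label alone is an address."""
    return labels[label]

def load_byte(label):
    """mov al, [L1] : brackets read the byte stored at the address."""
    return memory[labels[label]]

def load_dword(label):
    """mov eax, [L6] : read four bytes, little-endian."""
    return struct.unpack_from("<I", memory, labels[label])[0]

print(address_of("L6"))        # 4
print(hex(load_dword("L6")))   # 0x1a92
print(hex(load_byte("L6")))    # 0x92, the first (lowest) byte of the double word
```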
Input and Output
- print_int: prints out to the screen the value of the integer stored in EAX
- print_char: prints out to the screen the character whose ASCII value is stored in AL
- print_string: prints out to the screen the contents of the string at the address stored in EAX. The string must be a C-type string (i.e. null terminated).
- print_nl
- read_int
- read_char
Lab1: Bootstrap
+------------------+ <- 0xFFFFFFFF (4GB)
| 32-bit |
| memory mapped |
| devices |
| |
/\/\/\/\/\/\/\/\/\/\
/\/\/\/\/\/\/\/\/\/\
| |
| Unused |
| |
+------------------+ <- depends on amount of RAM
| |
| |
| Extended Memory |
| |
| |
+------------------+ <- 0x00100000 (1MB)
| BIOS ROM |
+------------------+ <- 0x000F0000 (960KB)
| 16-bit devices, |
| expansion ROMs |
+------------------+ <- 0x000C0000 (768KB)
| VGA Display |
+------------------+ <- 0x000A0000 (640KB)
| |
| Low Memory |
| |
+------------------+ <- 0x00000000
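The boundaries in the diagram can be turned into a small lookup, which helps when reading addresses in GDB. A Python sketch, assuming each region runs from its lower boundary up to (but not including) the next one; region_of is an illustrative name, and the sample addresses are well-known PC addresses that do not appear in the notes above.

```python
# (start, end, name) for the regions below 1MB, taken from the diagram
REGIONS = [
    (0x00000000, 0x000A0000, "Low Memory"),
    (0x000A0000, 0x000C0000, "VGA Display"),
    (0x000C0000, 0x000F0000, "16-bit devices, expansion ROMs"),
    (0x000F0000, 0x00100000, "BIOS ROM"),
]

def region_of(paddr):
    """Name the region of the physical address space that holds paddr."""
    for start, end, name in REGIONS:
        if start <= paddr < end:
            return name
    # Above 1MB the split depends on how much RAM is installed.
    return "Extended Memory or above"

print(region_of(0x7C00))     # Low Memory
print(region_of(0xB8000))    # VGA Display
print(region_of(0xFFFF0))    # BIOS ROM
print(region_of(0x100000))   # Extended Memory or above
```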
Spring Boot Note here
JSR 303