From C to Machine Instructions

The C32 compiler produces object files, which are then linked into an executable file. The executable contains debug information that allows the user to associate the binary machine instructions with the original source code. Translating machine instructions back into assembly language is called disassembly. You may examine the disassembled source by selecting View -- Disassembly Listing from the top menu. You may also wish to select View -- CPU Registers and View -- Locals.

If you have set the simulator as described in PC32 Simulator, pressing the Reset button on the toolbar (or selecting Debugger -- Reset -- Processor Reset (F6) from the top menu) will position the cursor at the start of the main function, as shown below. Note that the CPU Registers are arranged alphabetically rather than numerically. The association between register numbers and register names is described in registers.

The information in each of the windows above may be saved by right-clicking on the window and selecting the Output to File menu item. The disassembly listing, for example, is reproduced below.

You may wish to step through the program and observe the changes in the values of local variables and CPU registers. Press the Step Into button on the Debugger toolbar or select Debugger -- Step Into (F7) from the top menu. The window below shows the result after several steps.

You can choose the format in which information is displayed by right-clicking in a window to bring up the context-sensitive menu for that window. We changed the display format for na and nb in the Locals window from hex to decimal.

The motivation for this example is to explore how a C program allocates variables, assigns values to those variables, and performs simple arithmetic and logic operations. We show how to decode the disassembly information and interpret the operations being performed.

Contents

C Source
Disassembly Listing
Hand Disassembly
Interpretation

C Source

Download C source from: calc.zip


/* calc.c
*
* look at assembly language behind simple C statements
*
*/
#include <p32xxxx.h>

int main()
{
        int na, nb, nc;

        na = 14;
        nb = na + 21;
        nc = na & 0x3c;
        return 0;
}


Disassembly Listing

---  C:\pic32\timer\calc\calc.c  -----------------------------------------------------------------
1:                   /* calc.c
2:                   *
3:                   * look at assembly language behind simple C statements
4:                   *
5:                   */
6:                   #include <p32xxxx.h>
7:                   
8:                   int main()
9:                   {
9D000018  27BDFFE8   addiu       sp,sp,-24
9D00001C  AFBE0010   sw          s8,16(sp)
9D000020  03A0F021   addu        s8,sp,zero
10:                  	int na, nb, nc;
11:                  
12:                  	na = 14;
9D000024  2402000E   addiu       v0,zero,14
9D000028  AFC20000   sw          v0,0(s8)
13:                  	nb = na + 21;
9D00002C  8FC20000   lw          v0,0(s8)
9D000030  24420015   addiu       v0,v0,21
9D000034  AFC20004   sw          v0,4(s8)
14:                  	nc = na & 0x3c;
9D000038  8FC20000   lw          v0,0(s8)
9D00003C  3042003C   andi        v0,v0,0x3c
9D000040  AFC20008   sw          v0,8(s8)
15:                  	return 0;
9D000044  00001021   addu        v0,zero,zero
16:                  }
9D000048  03C0E821   addu        sp,s8,zero
9D00004C  8FBE0010   lw          s8,16(sp)
9D000050  27BD0018   addiu       sp,sp,24
9D000054  03E00008   jr          ra
9D000058  00000000   nop         

Hand Disassembly

The disassembly window decodes the machine language (binary instructions) into the instruction mnemonic and the associated arguments (registers or immediate operands).

For example the first three instructions are:

address    machine instruction    mnemonic    arguments
9D000018   27BDFFE8               addiu       sp,sp,-24
9D00001C   AFBE0010               sw          s8,16(sp)
9D000020   03A0F021               addu        s8,sp,zero

The first two instructions are immediate (I-Format) instructions which have the following fields:

op        rs               rt                    operand/offset
31:26     25:21            20:16                 15:0
6 bits    5 bits           5 bits                16 bits
Opcode    Source or base   Destination or data   Immediate operand or address offset

Using this information, we can decode the first two instructions:

instruction   op        rs         rt         operand
0x27BDFFE8    0010 0111 1011 1101             0xFFE8
              001001    11101      11101
              0x09      r29 (sp)   r29 (sp)   -24 (decimal)
              addiu sp,sp,-24

0xAFBE0010    1010 1111 1011 1110             0x0010
              101011    11101      11110
              0x2B      r29 (sp)   r30 (s8)   16 (decimal)
              sw s8,16(sp)
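
The hand decoding above amounts to a few shifts and masks. The following C sketch is not part of the original example: the names IFormat, decode_i, and reg_name are hypothetical helpers, and the register-name table assumes the conventional MIPS names used by the disassembler (r29 = sp, r30 = s8). It also assumes the 16-bit operand should be sign-extended, which holds for addiu and for load/store offsets.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a MIPS32 I-format instruction (hypothetical helper type). */
typedef struct {
    unsigned op;   /* bits 31:26, opcode */
    unsigned rs;   /* bits 25:21, source or base register */
    unsigned rt;   /* bits 20:16, destination or data register */
    int      imm;  /* bits 15:0, sign-extended operand or offset */
} IFormat;

/* Split a 32-bit instruction word into its I-format fields. */
IFormat decode_i(uint32_t word)
{
    IFormat f;
    f.op  = (unsigned)((word >> 26) & 0x3Fu);
    f.rs  = (unsigned)((word >> 21) & 0x1Fu);
    f.rt  = (unsigned)((word >> 16) & 0x1Fu);
    f.imm = (int)(word & 0xFFFFu);
    if (f.imm & 0x8000)        /* bit 15 set: the value is negative */
        f.imm -= 0x10000;
    return f;
}

/* Map a register number (0..31) to its conventional MIPS name. */
const char *reg_name(unsigned n)
{
    static const char *const names[32] = {
        "zero", "at", "v0", "v1", "a0", "a1", "a2", "a3",
        "t0",   "t1", "t2", "t3", "t4", "t5", "t6", "t7",
        "s0",   "s1", "s2", "s3", "s4", "s5", "s6", "s7",
        "t8",   "t9", "k0", "k1", "gp", "sp", "s8", "ra"
    };
    assert(n < 32);
    return names[n];
}
```

Calling decode_i(0x27BDFFE8) gives op 0x09, rs 29, rt 29, and imm -24, which agrees with the table. Logical instructions such as andi treat the 16-bit field as unsigned, so a fuller decoder would choose the extension according to the opcode.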

The third instruction is a register (R-Format) instruction, which has the following fields:

op        rs         rt         rd                     sh             fn
31:26     25:21      20:16      15:11                  10:6           5:0
6 bits    5 bits     5 bits     5 bits                 5 bits         6 bits
Opcode    Source 1   Source 2   Destination register   Shift amount   Opcode extension

We can decode the third instruction as follows:

instruction   op       rs         rt          rd         sh      fn
0x03A0F021    0000 0011 1010 0000 1111 0000 0010 0001
              000000   11101      00000       11110      00000   100001
              0x00     r29 (sp)   r0 (zero)   r30 (s8)   0x00    0x21
              addu s8,sp,zero
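
The same shift-and-mask approach works for the R-Format fields. This is a minimal sketch rather than part of the original example: RFormat, decode_r, and encode_r are hypothetical names, and encode_r is included only as a round-trip check on the field boundaries.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a MIPS32 R-format instruction (hypothetical helper type). */
typedef struct {
    unsigned op;  /* bits 31:26, opcode */
    unsigned rs;  /* bits 25:21, source 1 */
    unsigned rt;  /* bits 20:16, source 2 */
    unsigned rd;  /* bits 15:11, destination register */
    unsigned sh;  /* bits 10:6,  shift amount */
    unsigned fn;  /* bits 5:0,   opcode extension */
} RFormat;

/* Split a 32-bit instruction word into its R-format fields. */
RFormat decode_r(uint32_t word)
{
    RFormat f;
    f.op = (unsigned)((word >> 26) & 0x3Fu);
    f.rs = (unsigned)((word >> 21) & 0x1Fu);
    f.rt = (unsigned)((word >> 16) & 0x1Fu);
    f.rd = (unsigned)((word >> 11) & 0x1Fu);
    f.sh = (unsigned)((word >>  6) & 0x1Fu);
    f.fn = (unsigned)( word        & 0x3Fu);
    return f;
}

/* Reassemble the fields into an instruction word. */
uint32_t encode_r(RFormat f)
{
    /* Each field must fit its width: 6 bits for op/fn, 5 bits otherwise. */
    assert(f.op < 64 && f.fn < 64);
    assert(f.rs < 32 && f.rt < 32 && f.rd < 32 && f.sh < 32);
    return ((uint32_t)f.op << 26) | ((uint32_t)f.rs << 21) |
           ((uint32_t)f.rt << 16) | ((uint32_t)f.rd << 11) |
           ((uint32_t)f.sh <<  6) |  (uint32_t)f.fn;
}
```

For 0x03A0F021, decode_r returns op 0x00, rs 29, rt 0, rd 30, sh 0, and fn 0x21, and encode_r applied to that result reproduces the original word.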

Interpretation

Once we have the assembly language equivalent to the C source we can infer the register transfer operations that are being performed and make some observations about what is happening.

8:                   int main()
9:                   {
   addiu       sp,sp,-24      sp <= sp - 24        allocate 24 bytes on the stack
   sw          s8,16(sp)      M[sp+16] <= s8       save s8
   addu        s8,sp,zero     s8 <= sp             this is how MIPS does a simple register transfer
10:                  	int na, nb, nc;
11:                  
12:                  	na = 14;
   addiu       v0,zero,14     v0 <= 14             this is how MIPS initializes a register
   sw          v0,0(s8)       M[s8+0] <= v0
13:                  	nb = na + 21;
   lw          v0,0(s8)       v0 <= na
   addiu       v0,v0,21       v0 <= na + 21
   sw          v0,4(s8)       M[s8+4] <= v0
14:                  	nc = na & 0x3c;
   lw          v0,0(s8)       v0 <= na
   andi        v0,v0,0x3c     v0 <= v0 & 0x3c
   sw          v0,8(s8)       M[s8+8] <= v0
15:                  	return 0;
   addu        v0,zero,zero   v0 <= zero           set the return value
16:                  }
   addu        sp,s8,zero     sp <= s8             restore stack pointer
   lw          s8,16(sp)      s8 <= M[sp+16]       restore s8
   addiu       sp,sp,24       sp <= sp + 24        deallocate memory on the stack
   jr          ra                                  return from function
   nop                                             fills the branch delay slot after jr

We observe that the compiler assigned the local variables to the following stack locations (s8 holds the same value as sp throughout the function body):

address   variable
sp+0      na
sp+4      nb
sp+8      nc
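
The register transfers listed above can be replayed in C to check the values that end up in the stack frame. This is an illustrative model only, not how the simulator works: replay_calc and the NA/NB/NC indices are hypothetical names, the array frame stands in for the 24 bytes starting at s8, and the variable v0 stands in for register v0.

```c
#include <assert.h>
#include <stdint.h>

/* Word indices of the locals in the frame (byte offsets 0, 4, 8).
 * The frame is 24 bytes, i.e. 6 words. */
enum { NA = 0, NB = 1, NC = 2, FRAME_WORDS = 6 };

/* Replay the body of main() one register transfer at a time.
 * Returns the final value of v0, which is main's return value. */
int32_t replay_calc(int32_t frame[FRAME_WORDS])
{
    int32_t v0;
    assert(frame != 0);

    /* na = 14; */
    v0 = 14;            /* addiu v0,zero,14 */
    frame[NA] = v0;     /* sw    v0,0(s8)   */

    /* nb = na + 21; */
    v0 = frame[NA];     /* lw    v0,0(s8)   */
    v0 = v0 + 21;       /* addiu v0,v0,21   */
    frame[NB] = v0;     /* sw    v0,4(s8)   */

    /* nc = na & 0x3c; */
    v0 = frame[NA];     /* lw    v0,0(s8)   */
    v0 = v0 & 0x3c;     /* andi  v0,v0,0x3c */
    frame[NC] = v0;     /* sw    v0,8(s8)   */

    /* return 0; */
    v0 = 0;             /* addu  v0,zero,zero */
    return v0;
}
```

After the call, the frame holds na = 14, nb = 35, and nc = 12 (0x0E & 0x3C = 0x0C). Note that the generated code reloads na from memory before each use instead of reusing v0, which suggests the program was built without optimization.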


Maintained by John Loomis, updated Sun Aug 03 21:38:03 2008