CA: MIPS Architecture and Assembly Programming
AI-Generated Content
MIPS provides the foundation for understanding how software translates into hardware action. By learning its clean, reduced instruction set, you move beyond abstract programming to see how a CPU physically executes commands, manages data, and controls program flow. This knowledge is critical for computer architecture, compiler design, and low-level systems programming, making MIPS an essential educational tool.
The RISC Philosophy and MIPS Design
At its core, MIPS is a RISC (Reduced Instruction Set Computer) architecture. This design philosophy prioritizes a small, simple, and highly regular set of instructions, each doing little enough work that a pipelined processor can aim to complete about one instruction per clock cycle. This contrasts with CISC (Complex Instruction Set Computer) architectures like x86, which have a large number of complex, variable-length instructions that often take multiple cycles. The simplicity of RISC makes the hardware easier to design, pipeline efficiently, and optimize for speed.
MIPS embodies this simplicity with a load-store architecture. This means only specific load and store instructions (like lw and sw) can access memory. All arithmetic and logical operations, such as add or and, must be performed on values already held in the CPU's registers. This separation clarifies the instruction set and enforces a structured programming model. As a student, you benefit from this regularity, as it creates a clear mental model of data movement from memory to registers, between registers, and back to memory.
Instruction Formats: R, I, and J
Every MIPS instruction is exactly 32 bits wide and belongs to one of three core formats, which dictates how the CPU interprets the binary pattern. Mastering these formats is the key to reading assembly code and understanding machine language.
- R-Format (Register): Used for arithmetic and logical operations that involve only registers. The 32 bits are divided into fields for the operation code (opcode, 6 bits), source registers (rs and rt, 5 bits each), destination register (rd, 5 bits), shift amount (shamt, 5 bits), and function code (funct, 6 bits). For example, in the instruction add $t0, $s1, $s2, $s1 is rs, $s2 is rt, $t0 is rd, and funct tells the ALU to perform addition.
- I-Format (Immediate): Used for instructions that involve a constant value (immediate), load/store operations, and conditional branches. The fields are opcode, two register operands (rs, rt), and a 16-bit immediate value. For instance, addi $t0, $t1, 5 adds the value 5 to $t1 and stores the result in $t0. Similarly, lw $t0, 4($s1) uses $s1 as the base address and 4 as the offset, and loads the addressed word into $t0.
- J-Format (Jump): Used for unconditional jump instructions that need a large target address, such as j label or jal procedure_name. It contains a 6-bit opcode and a 26-bit address field, which is shifted left by two bits and combined with the upper four bits of PC + 4 to form the final jump target.
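The three formats can be made concrete by packing fields into 32-bit words. The following is a minimal Python sketch, assuming the standard MIPS32 register numbers ($t0 = 8, $t1 = 9, $s1 = 17, $s2 = 18) and the standard opcode and funct values for add, addi, lw, and j. The helper names (encode_r, encode_i, encode_j) and the jump target 0x00400018 are illustrative, not part of any toolchain.

```python
# Field layouts assumed here follow the standard MIPS32 encoding.
REG = {"$t0": 8, "$t1": 9, "$s1": 17, "$s2": 18}  # subset of register numbers


def encode_r(rs, rt, rd, shamt, funct, opcode=0):
    """Pack an R-format word: opcode|rs|rt|rd|shamt|funct (6/5/5/5/5/6 bits)."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


def encode_i(opcode, rs, rt, imm):
    """Pack an I-format word: opcode|rs|rt|immediate (6/5/5/16 bits)."""
    return (opcode << 26) | (rs << 21) | (rt << 16) | (imm & 0xFFFF)


def encode_j(opcode, target_addr):
    """Pack a J-format word: the 26-bit field stores the byte address >> 2."""
    return (opcode << 26) | ((target_addr >> 2) & 0x03FFFFFF)


add_word = encode_r(REG["$s1"], REG["$s2"], REG["$t0"], 0, 0x20)  # add $t0, $s1, $s2
addi_word = encode_i(0x08, REG["$t1"], REG["$t0"], 5)             # addi $t0, $t1, 5
lw_word = encode_i(0x23, REG["$s1"], REG["$t0"], 4)               # lw $t0, 4($s1)
j_word = encode_j(0x02, 0x00400018)                               # j to a hypothetical address

for name, word in [("add", add_word), ("addi", addi_word), ("lw", lw_word), ("j", j_word)]:
    print(f"{name:5s} 0x{word:08X}")
```

Note that the destination register sits in the rd field for R-format but in the rt field for I-format, which is why $t0 appears in a different argument position in the two helper calls.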
Register Conventions and Usage
MIPS has 32 general-purpose registers, each 32 bits wide, numbered $0 to $31. For clarity, assemblers use symbolic names that encode standard software conventions. The hardware does not enforce most of these conventions, but following them is required for writing interoperable, correct programs.
- $zero ($0): Always holds the constant value 0. Writing to it has no effect.
- $v0-$v1 ($2-$3): Used to return values from functions.
- $a0-$a3 ($4-$7): Used to pass the first four arguments to a function.
- $t0-$t9 ($8-$15, $24-$25): Temporary registers that are not preserved across function calls. The calling function assumes these may be overwritten.
- $s0-$s7 ($16-$23): Saved registers that must be preserved across function calls. If a called function uses an $s register, it must save the original value to the stack and restore it before returning.
- $sp ($29): The stack pointer, which holds the address of the top of the runtime stack.
- $ra ($31): The return address register. The instruction jal (jump and link) automatically stores the address of the next instruction here, so the function can return with jr $ra.
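The name-to-number mapping is regular enough to generate. The sketch below builds it in Python; the function name build_register_map is illustrative. It also includes the standard names the list above does not cover ($at, $k0, $k1, $gp, $fp) so that all 32 numbers are accounted for.

```python
def build_register_map():
    """Return a dict from symbolic MIPS register names to register numbers."""
    regs = {"$zero": 0, "$at": 1, "$k0": 26, "$k1": 27,
            "$gp": 28, "$sp": 29, "$fp": 30, "$ra": 31}
    regs.update({f"$v{i}": 2 + i for i in range(2)})    # $v0-$v1 -> 2-3
    regs.update({f"$a{i}": 4 + i for i in range(4)})    # $a0-$a3 -> 4-7
    regs.update({f"$t{i}": 8 + i for i in range(8)})    # $t0-$t7 -> 8-15
    regs.update({f"$s{i}": 16 + i for i in range(8)})   # $s0-$s7 -> 16-23
    regs.update({f"$t{i}": 16 + i for i in (8, 9)})     # $t8-$t9 -> 24-25
    return regs


REGISTERS = build_register_map()

# Registers a callee must restore before returning, per the convention above.
CALLEE_SAVED = {f"$s{i}" for i in range(8)}

print(REGISTERS["$t0"], REGISTERS["$t8"], REGISTERS["$s7"], REGISTERS["$ra"])
```

The split of the temporaries into two numeric ranges (8-15 and 24-25) is the one irregularity worth remembering when reading raw machine code.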
Function Calls and Stack Frames
Implementing function calls requires careful management of state. The stack, a last-in-first-out (LIFO) data structure in memory, is used for this purpose. A stack frame (or activation record) is a block of memory on the stack dedicated to a single function call.
The procedure for a non-leaf function (one that calls another function) typically follows these steps:
- Prologue: The caller places arguments in registers $a0-$a3 (and on the stack if more than four).
- Call: The caller executes jal function, which jumps to the function's code and stores the return address in $ra.
- Callee Setup: The called function allocates its stack frame by decrementing the stack pointer: addi $sp, $sp, -12. It then saves any $s registers it will use, and the $ra register (if it will make a call itself), onto the stack with sw instructions at offsets from $sp (for example, sw $ra, 0($sp)).
- Function Body: The function executes its logic, using the stack for local variables and spilled registers.
- Callee Teardown: The function restores saved registers and $ra, releases its frame by adding the same amount back to $sp, and returns with jr $ra.
- Cleanup: The caller is responsible for cleaning up any arguments it placed on the stack.
This disciplined use of the stack and registers ensures that control flow can nest and return correctly.
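The save-and-restore discipline can be modeled in a few lines of Python. This is only a sketch: registers and memory are plain dictionaries, the 12-byte frame matches the example above, and the slot layout ($ra at offset 0, $s0 at 4, $s1 at 8), the addresses, and the function names are assumptions chosen for illustration.

```python
def call(regs, mem, return_addr, body):
    """Model jal followed by a non-leaf callee's setup and teardown."""
    regs["$ra"] = return_addr            # jal: link register gets the return address

    # Callee setup: allocate a 12-byte frame, then save $ra and the $s registers used.
    regs["$sp"] -= 12
    mem[regs["$sp"] + 0] = regs["$ra"]
    mem[regs["$sp"] + 4] = regs["$s0"]
    mem[regs["$sp"] + 8] = regs["$s1"]

    body(regs, mem)                      # the body may overwrite $ra, $s0, and $s1

    # Callee teardown: restore saved values, release the frame, then "jr $ra".
    regs["$ra"] = mem[regs["$sp"] + 0]
    regs["$s0"] = mem[regs["$sp"] + 4]
    regs["$s1"] = mem[regs["$sp"] + 8]
    regs["$sp"] += 12
    return regs["$ra"]                   # address where execution resumes


def inner_body(regs, mem):
    regs["$s0"] = 111                    # freely uses the saved registers
    regs["$s1"] = 222


def outer_body(regs, mem):
    regs["$s0"] = 7
    call(regs, mem, return_addr=0x00400100, body=inner_body)  # nested jal overwrites $ra


regs = {"$sp": 0x7FFFFFFC, "$ra": 0, "$s0": 1, "$s1": 2}
mem = {}
resume = call(regs, mem, return_addr=0x00400020, body=outer_body)
print(hex(resume), regs["$s0"], regs["$s1"], hex(regs["$sp"]))
```

The nested call overwrites $ra inside outer_body, yet the outer call still returns to its own caller because the original value was saved in the frame. This is exactly why a non-leaf function must save $ra.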
Connecting Software to Hardware: The Datapath
Writing assembly is one side of the coin; understanding how the CPU executes it is the other. The datapath is the collection of hardware components—registers, ALUs, multiplexers, sign-extenders, and memory units—that work together to process instructions and data. Tracing an instruction through the single-cycle datapath reveals the hardware-software connection.
Let's trace a load word instruction, lw $t0, 4($s1):
- Instruction Fetch: The Program Counter (PC) sends an address to the Instruction Memory, which outputs the 32-bit lw instruction.
- Instruction Decode: The instruction is split into its I-format fields. Register $s1 is read from the Register File. The 16-bit offset (4) is sign-extended to 32 bits.
- Execution: The ALU adds the contents of $s1 and the sign-extended offset to compute the memory address ($s1 + 4).
- Memory Access: This computed address is sent to the Data Memory unit, which reads and outputs the 32-bit word at that location.
- Write Back: The data word read from memory is written into the destination register, $t0, in the Register File.
Each instruction type (R, I, J) uses a specific path through this datapath, controlled by the opcode and funct fields. Seeing this flow makes concrete the abstract steps of "fetch, decode, execute, memory, writeback" and shows why the load-store architecture and regular instruction formats simplify hardware design.
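The five steps of the lw trace map onto a few lines of Python. The sketch below is a behavioral model, not a hardware description: memories are dictionaries, the register file is a list, and the addresses and the stored word are hypothetical values. It assumes the standard encoding of lw $t0, 4($s1) as 0x8E280004 (opcode 0x23, rs = 17, rt = 8, immediate = 4).

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def step_lw(pc, instr_mem, reg_file, data_mem):
    """Run one lw instruction through the five stages and return the next PC."""
    # 1. Instruction fetch: the PC addresses instruction memory.
    instr = instr_mem[pc]

    # 2. Decode: split the I-format fields, read the base register, sign-extend.
    opcode = (instr >> 26) & 0x3F
    rs = (instr >> 21) & 0x1F
    rt = (instr >> 16) & 0x1F
    offset = sign_extend16(instr)
    if opcode != 0x23:
        raise ValueError("this sketch only handles lw")
    base = reg_file[rs]

    # 3. Execute: the ALU adds base and offset to form the address.
    address = (base + offset) & 0xFFFFFFFF

    # 4. Memory access: data memory is read at the computed address.
    word = data_mem[address]

    # 5. Write back: the loaded word goes to rt (register 0 stays zero).
    if rt != 0:
        reg_file[rt] = word
    return pc + 4


instr_mem = {0x00400000: 0x8E280004}   # lw $t0, 4($s1)
reg_file = [0] * 32
reg_file[17] = 0x10010000              # $s1 holds a hypothetical base address
data_mem = {0x10010004: 0xDEADBEEF}    # hypothetical word stored at base + 4

next_pc = step_lw(0x00400000, instr_mem, reg_file, data_mem)
print(hex(reg_file[8]), hex(next_pc))
```

For an I-format load, the destination is the rt field, so the write-back step targets register 8 ($t0) rather than an rd field.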
Common Pitfalls
- Ignoring Register Conventions: Using $s registers without saving them, or relying on values in $t registers across a function call, will corrupt data. Always treat $s0-$s7 as "owned" by the caller and preserve their values if you modify them.
- Incorrect Stack Pointer Management: Forgetting to decrement $sp before storing to the stack, or failing to restore it before returning, corrupts the stack. The stack pointer must always be kept word-aligned (a multiple of 4).
- Misunderstanding Branch Offset Calculation: The immediate field in a branch instruction (e.g., beq) is a word offset, not a byte offset. The hardware calculates the target as PC + 4 + (offset × 4), where the offset is the sign-extended immediate. Manually calculating this offset in assembly is error-prone; always use labels and let the assembler handle it.
- Confusing la (Load Address) with lw (Load Word): la $t0, label loads the address of label into $t0. lw $t0, label loads the word stored at label into $t0. Using one where the other is intended will cause a runtime error or logical bug.
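The branch offset rule is easy to check numerically. In standard MIPS, a taken branch at address PC goes to PC + 4 + (sign-extended immediate × 4). The Python sketch below computes the target from an immediate, and the immediate an assembler would emit for a given label address. The function names and the addresses are illustrative.

```python
def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc, imm16):
    """Address reached by a taken branch located at pc with immediate imm16."""
    return (pc + 4 + (sign_extend16(imm16) << 2)) & 0xFFFFFFFF


def branch_offset(pc, target):
    """16-bit immediate needed for a branch at pc to reach target."""
    delta = target - (pc + 4)            # measured from the instruction after the branch
    if delta % 4 != 0:
        raise ValueError("branch target must be word-aligned")
    words = delta >> 2                   # convert the byte distance to a word count
    if not -32768 <= words <= 32767:
        raise ValueError("target out of range for a 16-bit word offset")
    return words & 0xFFFF


print(hex(branch_target(0x00400000, 3)))           # forward branch
print(hex(branch_offset(0x00400010, 0x00400000)))  # backward branch
```

An immediate of 0 therefore targets the instruction right after the branch, and an immediate of -1 (0xFFFF) targets the branch itself. Both cases are common sources of off-by-one errors when offsets are computed by hand.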
Summary
- MIPS is a prime example of a RISC, load-store architecture whose simplicity and regularity make it an ideal educational tool for connecting software and hardware.
- All instructions conform to one of three 32-bit formats: R-format for register arithmetic, I-format for immediate operations and data transfer, and J-format for long jumps.
- Adherence to register conventions (using $a for arguments, $v for return values, $t for temporaries, and $s for saved variables) is essential for writing correct and interoperable functions.
- Function calls require meticulous management of the stack frame to preserve register state and return addresses, following a strict sequence of prologue, call, setup, teardown, and return.
- Tracing instructions through the single-cycle datapath provides a concrete model of the fetch-decode-execute cycle, revealing how software instructions directly control hardware components like the ALU, register file, and memory units.