**HCPU-16 Mark II — Instruction Set Architecture Specification**  
**Version:** 0.3-DRAFT  
**Date:** 2026-03-17  
**Status:** Pre-release (community review)

---

## Table of Contents

1. [Design Philosophy](#1-design-philosophy)  
2. [Registers](#2-registers)  
3. [Memory Model](#3-memory-model)  
4. [Instruction Encoding](#4-instruction-encoding)  
5. [Operand Encoding](#5-operand-encoding)  
6. [Instruction Reference — Basic Opcodes](#6-instruction-reference--basic-opcodes)  
7. [Instruction Reference — Special Opcodes](#7-instruction-reference--special-opcodes)  
8. [Flags Register Behavior](#8-flags-register-behavior)  
9. [Conditional Skip Instructions](#9-conditional-skip-instructions)  
10. [Conditional Jump Instructions](#10-conditional-jump-instructions)  
11. [Byte Addressing](#11-byte-addressing)  
12. [8.8 Signed Fixed-Point Arithmetic](#12-88-signed-fixed-point-arithmetic)  
13. [Block Copy](#13-block-copy)  
14. [Interrupt Handling](#14-interrupt-handling)  
15. [Memory-Mapped I/O](#15-memory-mapped-io)  
16. [Memory Protection Unit](#16-memory-protection-unit)  
17. [Cycle Costs](#17-cycle-costs)  
18. [Initial State & Boot Sequence](#18-initial-state--boot-sequence)  
19. [Undefined Behavior & Edge Cases](#19-undefined-behavior--edge-cases)  
20. [Assembler Conventions](#20-assembler-conventions)  
21. [Hardware Device Slots](#21-hardware-device-slots)  
22. [Revision History](#22-revision-history)

Appendix A: [Opcode Map (Quick Reference)](#appendix-a-opcode-map-quick-reference)  
Appendix B: [Instruction Encoding Examples](#appendix-b-instruction-encoding-examples)  
Appendix C: [Quick Reference Card](#appendix-c-quick-reference-card)

---

## 1. Design Philosophy

The HCPU-16 Mark II is a 16-bit processor designed for use as an in-game computer. It is a spiritual successor to Notch's DCPU-16 from the cancelled game 0x10c, redesigned to fix real usability problems while preserving the core identity: a constrained machine that rewards mastery.

### Core Principles

- **16-bit words, always.** The address space is 64K words (128KB). There is no 32-bit mode, no extended addressing. The constraint is the game.
- **Assembly-first.** The ISA is designed to be written by hand in assembly. It should feel satisfying, not frustrating. A lightweight C-like compiler should be possible but not required.
- **Real embedded systems flavor.** Memory-mapped I/O, proper flags, byte addressing — the things that make real low-level programming work, without breaking the retro aesthetic.
- **No magic.** No built-in operating system, no standard library, no garbage collection. The programmer builds everything from opcodes up.

### Key Features

The HCPU-16 Mk II retains the original DCPU-16's instruction encoding format, register names, and general character. Key features:

- **Flags register (FL)** with Zero, Carry, Sign, and Overflow flags
- **CMP instruction** and **conditional jumps** (Jcc) alongside the original skip-style conditionals
- **Byte addressing** via LDB/STB instructions for the lower 32K words
- **8.8 fixed-point math** instructions (FXMUL, FXDIV) instead of floating-point
- **Block copy** instruction (BCOPY) for fast memory transfers and DMA
- **Memory-mapped I/O** region (0xE000–0xFFFF) as the sole hardware interface
- **Memory Protection Unit** for sandboxing untrusted code
- **ADC/SBB** for multi-word arithmetic
- **Utility instructions**: NEG, NOT, SXB, SWP, BRK, HLT

### What Was Deliberately Omitted

- 32-bit extensions or wide registers
- IEEE 754 floating-point
- Virtual memory or paging
- Hardware multiply-accumulate or SIMD
- String instructions beyond BCOPY
- Any form of privilege levels or user/kernel mode (the MPU is opt-in)
- Register-passing hardware interrupt commands (all hardware interaction is MMIO)

---

## 2. Registers

### General-Purpose Registers (8)

| Name | Index | Description |
|------|-------|-------------|
| A    | 0x0   | General purpose / STB implicit source |
| B    | 0x1   | General purpose |
| C    | 0x2   | General purpose / BCOPY word count |
| X    | 0x3   | General purpose / index |
| Y    | 0x4   | General purpose / index |
| Z    | 0x5   | General purpose |
| I    | 0x6   | General purpose / loop counter |
| J    | 0x7   | General purpose |

All general-purpose registers are 16 bits wide and interchangeable as instruction operands. The descriptions above are conventions, with two hardware-defined exceptions: STB always takes its data byte from A (§7), and BCOPY takes its word count from C (§13).

### Special Registers

| Name | Description |
|------|-------------|
| PC   | Program counter. Points to the next instruction word to fetch. |
| SP   | Stack pointer. Points to the top of the stack. Pre-decrements on push, post-increments on pop. Fully general — supports `[SP]` and `[SP + offset]` addressing like any GP register. |
| EX   | Excess register. Receives overflow/underflow from arithmetic operations. |
| FL   | Flags register. Updated by arithmetic and logic instructions. See §8. |
| IA   | Interrupt address. When non-zero, points to the interrupt handler. See §14. |

**FL** and **IA** are not accessible through the standard operand encoding positions used by GP registers, SP, PC, and EX. Instead:

- **FL** is accessible at operand value `0x20` in the `a` field and `0x1F` in the `b` field (see §5).
- **IA** is accessible through the `IAG` and `IAS` special instructions (see §7).

### Register Width

All registers are 16 bits. There are no 8-bit sub-registers or 32-bit register pairs. Instructions that produce results wider than 16 bits (MUL, DIV, FXMUL, etc.) use EX to hold the extra bits.

---

## 3. Memory Model

### Word-Addressed Space

The primary address space is **65,536 words** (0x0000–0xFFFF), where each word is 16 bits. This is the space used by SET, ADD, and all standard instructions.

```
0x0000 ┌─────────────────────────┐
       │                         │
       │     RAM (usable)        │
       │     Up to 56K words     │
       │     (112KB)             │
       │                         │
0xDFFF ├─────────────────────────┤
0xE000 │                         │
       │  Memory-Mapped I/O      │
       │  8K words (16KB)        │
       │  32 device slots        │
       │                         │
0xFFFF └─────────────────────────┘
```

### Byte-Addressed Space

The lower portion of memory is also addressable at byte granularity via the LDB and STB instructions. A 16-bit byte address covers **65,536 bytes** (byte addresses 0x0000–0xFFFF), which maps to the first **32,768 words** (word addresses 0x0000–0x7FFF).

Memory above word address 0x7FFF (including the upper RAM region and the MMIO window) is **word-addressed only**. Programs that need individual bytes in upper memory should use the standard word-level pattern: read the word, mask and shift, then write the word back. Real 16-bit microcontrollers use the same pattern; the restriction is a deliberate architectural constraint, not an oversight.

See §11 for the complete byte-addressing scheme.

### Tiered RAM (Game-Layer Concept)

Installed RAM is an in-game upgrade. Uninstalled regions read as 0x0000 and silently discard writes.

| Tier     | RAM Size   | Word Range    | Fully Byte-Addressable? |
|----------|------------|---------------|-------------------------|
| Basic    | 16K words  | 0x0000–0x3FFF | Yes (all RAM within byte range) |
| Standard | 32K words  | 0x0000–0x7FFF | Yes (all RAM within byte range) |
| Advanced | 48K words  | 0x0000–0xBFFF | First 32K words; upper 16K word-only |
| Full     | 56K words  | 0x0000–0xDFFF | First 32K words; upper 24K word-only |

The MMIO region (0xE000–0xFFFF) is always accessible regardless of installed RAM tier.
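
The tier rules can be read as a bounds check on every memory access. The following is a minimal Python sketch of that reading, not a reference implementation: the class name `TieredMemory` and its methods are hypothetical, and the MMIO window is stubbed as plain storage rather than real devices.

```python
MMIO_BASE = 0xE000  # start of the memory-mapped I/O window

class TieredMemory:
    """Hypothetical model of tiered RAM; names are illustrative only."""

    def __init__(self, ram_words):
        # ram_words: installed RAM in words, e.g. 0x4000 for the Basic tier
        self.ram_words = ram_words
        self.cells = [0] * 0x10000

    def _backed(self, addr):
        # Installed RAM and the MMIO window are backed; the gap between is not.
        return addr < self.ram_words or addr >= MMIO_BASE

    def read(self, addr):
        addr &= 0xFFFF
        return self.cells[addr] if self._backed(addr) else 0x0000

    def write(self, addr, value):
        addr &= 0xFFFF
        if self._backed(addr):
            self.cells[addr] = value & 0xFFFF
        # Writes to uninstalled regions are silently discarded.
```

With `TieredMemory(0x4000)` (Basic tier), a write to 0x4000 is dropped and reads back as 0x0000, while 0xE000 remains reachable.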

**Design note:** The Basic and Standard tiers have the property that all installed RAM is byte-addressable. This makes these tiers particularly clean for C-like compilers and string-heavy applications. Advanced and Full tiers add word-only storage ideal for code, lookup tables, and word-aligned data structures. Good memory layout — byte-oriented data low, word-oriented data high — is a skill that rewards the programmer.

### Endianness

The HCPU-16 Mk II is **big-endian** at the byte level:

- Word `0xABCD` at word address W is stored as:
  - Byte address W×2: `0xAB` (high byte)
  - Byte address W×2+1: `0xCD` (low byte)

This matters only for LDB/STB and for hardware devices that expose byte-granularity data.

### Stack

The stack grows downward. SP starts at 0x0000 and wraps to 0xFFFF on the first push. Standard stack behavior:

- **PUSH (write):** SP decrements by 1, then value is written to [SP]
- **POP (read):** Value is read from [SP], then SP increments by 1
- **PEEK:** Value is read from [SP] without modifying SP
- **PICK n:** Value is read from [SP + n] without modifying SP

There is no hardware stack overflow detection. SP wrapping around will silently corrupt memory. The MPU (§16) can be configured to catch this.
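
The four stack operations above can be sketched in Python as follows. This is an illustrative model only: the `Stack` class is hypothetical, and memory is treated as a flat 64K-word array that ignores RAM tiers and MMIO.

```python
MASK = 0xFFFF

class Stack:
    """Hypothetical model of SP-relative operand behavior (illustrative only)."""

    def __init__(self):
        self.sp = 0x0000                # initial SP as stated above
        self.mem = [0] * 0x10000

    def push(self, value):
        self.sp = (self.sp - 1) & MASK  # pre-decrement, wrapping at 16 bits
        self.mem[self.sp] = value & MASK

    def pop(self):
        value = self.mem[self.sp]
        self.sp = (self.sp + 1) & MASK  # post-increment
        return value

    def peek(self):
        return self.mem[self.sp]        # SP is not modified

    def pick(self, n):
        return self.mem[(self.sp + n) & MASK]
```

After two pushes from the initial state, SP is 0xFFFE, `peek()` returns the second value, and `pick(1)` returns the first.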

---

## 4. Instruction Encoding

All instructions are encoded in one to three 16-bit words:

```
Word 1 (required):  aaaaaabbbbbooooo
Word 2 (optional):  [next word for operand a or b, if required by addressing mode]
Word 3 (optional):  [next word for the other operand, if both require one]
```

| Field | Bits  | Width | Description |
|-------|-------|-------|-------------|
| o     | 4–0   | 5     | Opcode |
| b     | 9–5   | 5     | First operand (usually destination) |
| a     | 15–10 | 6     | Second operand (usually source) |

When `o` is non-zero, the instruction is a **basic instruction** with two operands.

When `o` is zero, the instruction is a **special instruction**: the `b` field becomes a sub-opcode, and only operand `a` is used.

When `o` is zero AND `b` is zero, the instruction is **reserved**. The specific combination `o=0, b=0, a=0` (the word 0x0000) is treated as a **NOP** by convention.
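
As an illustration of the field layout, here is a small Python sketch that packs and unpacks the first instruction word. The helper names `encode`, `decode`, and `classify` are hypothetical and exist only for this example.

```python
def encode(o, b, a):
    """Pack the opcode, b, and a fields into the first instruction word."""
    assert 0 <= o <= 0x1F and 0 <= b <= 0x1F and 0 <= a <= 0x3F
    return (a << 10) | (b << 5) | o

def decode(word):
    """Split an instruction word into (o, b, a)."""
    return word & 0x1F, (word >> 5) & 0x1F, (word >> 10) & 0x3F

def classify(word):
    """Return 'basic', 'special', or 'reserved' per the rules above."""
    o, b, _ = decode(word)
    if o != 0:
        return "basic"
    return "special" if b != 0 else "reserved"
```

For example, `SET B, 1` uses opcode 0x01, `b` = 0x01 (register B), and `a` = 0x23 (inline literal 1, see §5), so `encode(0x01, 0x01, 0x23)` gives 0x8C21. The word 0x0000 classifies as reserved, which is the encoding treated as NOP by convention.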

### Next Word Ordering

If both operands require a "next word," operand `a`'s next word appears first, followed by operand `b`'s next word. The CPU reads next words in the order they are needed during operand evaluation, and `a` is always evaluated before `b`.

---

## 5. Operand Encoding

### Operand `a` (6 bits: values 0x00–0x3F)

The `a` operand appears in bits 15–10 of the instruction word. It is evaluated first and is typically the **source**.

| Value     | Description            | Cycles | Next Word? |
|-----------|------------------------|--------|------------|
| 0x00–0x07 | Register (A–J)         | 0      | No         |
| 0x08–0x0F | [register] (A–J)       | 1      | No         |
| 0x10–0x17 | [register + nw] (A–J)  | 2      | Yes        |
| 0x18      | POP / [SP++]           | 1      | No         |
| 0x19      | PEEK / [SP]            | 1      | No         |
| 0x1A      | PICK / [SP + nw]       | 2      | Yes        |
| 0x1B      | SP                     | 0      | No         |
| 0x1C      | PC                     | 0      | No         |
| 0x1D      | EX                     | 0      | No         |
| 0x1E      | [nw] (memory at nw)    | 2      | Yes        |
| 0x1F      | nw (literal value)     | 1      | Yes        |
| 0x20      | FL                     | 0      | No         |
| 0x21–0x3F | literal: value − 0x22  | 0      | No         |

The inline literal range 0x21–0x3F encodes the values **-1 to 29** (i.e., the encoded literal's value is `operand_code - 0x22`). These are free (0 extra cycles, no next word).
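
An assembler might choose between the inline and next-word literal forms as sketched below. This is a hedged Python illustration: `encode_literal` is a hypothetical helper, and treating 0xFFFF as the inline literal -1 is an assumption about assembler behavior rather than something this section mandates.

```python
def encode_literal(value):
    """Return (operand_code, next_word) for a literal in the `a` position.

    next_word is None when the literal fits the inline range.
    """
    value &= 0xFFFF
    # Interpret the 16-bit pattern as signed so that 0xFFFF counts as -1.
    signed = value - 0x10000 if value & 0x8000 else value
    if -1 <= signed <= 29:
        return signed + 0x22, None   # inline form: codes 0x21..0x3F
    return 0x1F, value               # long form: literal in the next word
```

Under these assumptions, `encode_literal(29)` yields the inline code 0x3F, while `encode_literal(30)` falls back to code 0x1F with a next word of 30.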

When `a` is used as a source, POP reads [SP] and then increments SP.

### Operand `b` (5 bits: values 0x00–0x1F)

The `b` operand appears in bits 9–5 of the instruction word. It is evaluated second and is typically the **destination**.

| Value     | Description            | Cycles | Next Word? |
|-----------|------------------------|--------|------------|
| 0x00–0x07 | Register (A–J)         | 0      | No         |
| 0x08–0x0F | [register] (A–J)       | 1      | No         |
| 0x10–0x17 | [register + nw] (A–J)  | 2      | Yes        |
| 0x18      | PUSH / [--SP]          | 1      | No         |
| 0x19      | PEEK / [SP]            | 1      | No         |
| 0x1A      | PICK / [SP + nw]       | 2      | Yes        |
| 0x1B      | SP                     | 0      | No         |
| 0x1C      | PC                     | 0      | No         |
| 0x1D      | EX                     | 0      | No         |
| 0x1E      | [nw] (memory at nw)    | 2      | Yes        |
| 0x1F      | FL                     | 0      | No         |

When `b` is used as a destination, PUSH decrements SP and then writes to [SP].

### Notes

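- Writing to a literal (`a` = 0x21–0x3F or 0x1F) silently fails. The instruction executes but the result is discarded. Cycle cost is still paid. This can only arise in special instructions that write to `a` (for example IAG or NEG), because `b` has no literal encodings.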
- Reading from PUSH or writing to POP is undefined; programs should not rely on either.
- `nw` means "next word" — the 16-bit word immediately following the instruction word in memory.
- Operand `a` is always evaluated before operand `b`. This matters when both operands modify SP (e.g., POP as source, PUSH as destination).
- In basic instructions that produce a result, `b` is the write target (or the result is silently discarded). CMP, TST, and the IFx skip instructions read `b` but never write it.
- **Among the special instructions, STB (§7) departs from the usual operand roles: operand `a` supplies the byte address, and the byte itself comes from register A (implicit).**

---

## 6. Instruction Reference — Basic Opcodes

Basic opcodes occupy the `o` field (bits 4–0) when non-zero. All basic instructions have two operands: `b` (destination) and `a` (source). Instructions that produce a result write it to `b`; CMP, TST, and the IFx skip instructions only read `b`.

**Notation:** In cost expressions, `B` and `A` refer to the operand cycle costs from §5.

### Arithmetic

| Opcode | Mnemonic | Description | Cost |
|--------|----------|-------------|------|
| 0x01 | `SET b, a` | b = a | 1+A+B |
| 0x02 | `ADD b, a` | b = b + a. Sets EX to 0x0001 on carry, 0x0000 otherwise. Updates FL. | 2+A+B |
| 0x03 | `SUB b, a` | b = b - a. Sets EX to 0xFFFF on borrow, 0x0000 otherwise. Updates FL. | 2+A+B |
| 0x04 | `MUL b, a` | b = (b × a) & 0xFFFF (unsigned). EX = ((b × a) >> 16) & 0xFFFF. Updates FL. | 3+A+B |
| 0x05 | `MLI b, a` | Signed multiply. b and a are treated as signed. EX = ((b × a) >> 16) & 0xFFFF. Updates FL. | 3+A+B |
| 0x06 | `DIV b, a` | b = b / a (unsigned). EX = ((b << 16) / a) & 0xFFFF. If a = 0, b = 0 and EX = 0. Updates FL. | 4+A+B |
| 0x07 | `DVI b, a` | Signed divide, rounds toward zero. EX gets fractional bits as with DIV. If a = 0, b = 0 and EX = 0. Updates FL. | 4+A+B |
| 0x08 | `MOD b, a` | b = b % a (unsigned). If a = 0, b = 0. Updates FL. | 4+A+B |
| 0x09 | `MDI b, a` | Signed modulo. Result sign matches dividend (b). If a = 0, b = 0. Updates FL. | 4+A+B |
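
The EX behavior of the unsigned arithmetic instructions can be sketched in Python as below. This is an illustrative model, not a reference implementation: the function names are hypothetical, operands are assumed to be already in the range 0..0xFFFF, each function returns `(new_b, EX)`, and the signed variants and FL updates are omitted.

```python
MASK = 0xFFFF

def add(b, a):
    total = b + a
    return total & MASK, (0x0001 if total > MASK else 0x0000)

def sub(b, a):
    diff = b - a
    return diff & MASK, (0xFFFF if diff < 0 else 0x0000)

def mul(b, a):
    product = b * a
    return product & MASK, (product >> 16) & MASK

def div(b, a):
    if a == 0:
        return 0, 0                      # division by zero: b = 0, EX = 0
    return (b // a) & MASK, ((b << 16) // a) & MASK
```

For instance, `div(7, 2)` returns a quotient of 3 with EX = 0x8000, which is the fractional part 0.5 expressed in sixteenths of a bit-weight (0x8000 / 0x10000).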

### Bitwise

| Opcode | Mnemonic | Description | Cost |
|--------|----------|-------------|------|
| 0x0A | `AND b, a` | b = b & a. Updates FL (Z, S set; C, O cleared). | 1+A+B |
| 0x0B | `BOR b, a` | b = b \| a. Updates FL (Z, S set; C, O cleared). | 1+A+B |
| 0x0C | `XOR b, a` | b = b ^ a. Updates FL (Z, S set; C, O cleared). | 1+A+B |

### Shifts

| Opcode | Mnemonic | Description | Cost |
|--------|----------|-------------|------|
| 0x0D | `SHR b, a` | Logical shift right. b = b >>> a. EX = ((b << 16) >> a) & 0xFFFF. Updates FL (C = last bit shifted out). | 1+A+B |
| 0x0E | `ASR b, a` | Arithmetic shift right (sign-extending). b = b >> a (signed). EX = ((b << 16) >>> a) & 0xFFFF. Updates FL (C = last bit shifted out). | 1+A+B |
| 0x0F | `SHL b, a` | Shift left. b = b << a. EX = ((b << a) >> 16) & 0xFFFF. Updates FL (C = last bit shifted out). | 1+A+B |
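
The shift formulas in the table can be sketched in Python as follows. The function names are hypothetical, `b` is assumed to be in 0..0xFFFF, and the sketch assumes shift amounts from 0 to 16; behavior for larger amounts is not described in this table and is not modeled. Each function returns `(new_b, EX, carry)`.

```python
MASK = 0xFFFF

def shr(b, n):
    """Logical shift right."""
    carry = (b >> (n - 1)) & 1 if n else 0      # last bit shifted out
    return (b >> n) & MASK, ((b << 16) >> n) & MASK, carry

def asr(b, n):
    """Arithmetic (sign-extending) shift right."""
    signed = b - 0x10000 if b & 0x8000 else b   # reinterpret as signed
    carry = (signed >> (n - 1)) & 1 if n else 0
    return (signed >> n) & MASK, ((b << 16) >> n) & MASK, carry

def shl(b, n):
    """Shift left."""
    carry = (b >> (16 - n)) & 1 if n else 0     # last bit shifted out
    return (b << n) & MASK, ((b << n) >> 16) & MASK, carry
```

Shifting 0x8001 right by one shows the difference between the two right shifts: `shr` produces 0x4000 and `asr` produces 0xC000, and both report EX = 0x8000 with the carry set.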

### Conditional Skip

| Opcode | Mnemonic | Description | Cost |
|--------|----------|-------------|------|
| 0x10 | `IFB b, a` | Skip next instruction unless (b & a) ≠ 0. | 2+A+B (+1 if skipping) |
| 0x11 | `IFC b, a` | Skip next instruction unless (b & a) = 0. | 2+A+B (+1 if skipping) |
| 0x12 | `IFE b, a` | Skip next instruction unless b = a. | 2+A+B (+1 if skipping) |
| 0x13 | `IFN b, a` | Skip next instruction unless b ≠ a. | 2+A+B (+1 if skipping) |
| 0x14 | `IFG b, a` | Skip next instruction unless b > a (unsigned). | 2+A+B (+1 if skipping) |
| 0x15 | `IFA b, a` | Skip next instruction unless b > a (signed). | 2+A+B (+1 if skipping) |
| 0x16 | `IFL b, a` | Skip next instruction unless b < a (unsigned). | 2+A+B (+1 if skipping) |
| 0x17 | `IFU b, a` | Skip next instruction unless b < a (signed). | 2+A+B (+1 if skipping) |

See §9 for details on skip behavior, chaining, and interaction with multi-word instructions.

### Extended Arithmetic (Mk II)

| Opcode | Mnemonic | Description | Cost |
|--------|----------|-------------|------|
| 0x18 | `ADC b, a` | b = b + a + (FL.C ? 1 : 0). Add with carry. Sets EX on carry. Updates FL. | 2+A+B |
| 0x19 | `SBB b, a` | b = b - a - (FL.C ? 1 : 0). Subtract with borrow. Sets EX on borrow. Updates FL. | 2+A+B |

### Compare & Test (Mk II)

| Opcode | Mnemonic | Description | Cost |
|--------|----------|-------------|------|
| 0x1A | `CMP b, a` | Compute b - a. Update FL. Discard result. b is unchanged. EX is unchanged. | 2+A+B |
| 0x1B | `TST b, a` | Compute b & a. Update FL (Z, S; C and O cleared). Discard result. b is unchanged. | 1+A+B |

### Fixed-Point Math (Mk II)

| Opcode | Mnemonic | Description | Cost |
|--------|----------|-------------|------|
| 0x1C | `FXMUL b, a` | 8.8 signed fixed-point multiply. See §12. Updates FL. | 4+A+B |
| 0x1D | `FXDIV b, a` | 8.8 signed fixed-point divide. See §12. Updates FL. | 8+A+B |

### Byte Load (Mk II)

| Opcode | Mnemonic | Description | Cost |
|--------|----------|-------------|------|
| 0x1E | `LDB b, a` | Load byte from byte address `a` into `b` (zero-extended). See §11. Updates FL (Z, S). | 2+A+B |

### Reserved

| Opcode | Mnemonic | Description | Cost |
|--------|----------|-------------|------|
| 0x1F | *(reserved)* | Reserved for future use. Treated as NOP. | 1+A+B |

---

## 7. Instruction Reference — Special Opcodes

Special instructions are encoded when the basic opcode `o` = 0x00. The `b` field (bits 9–5) becomes the sub-opcode, and only operand `a` is used.

```
Encoding: aaaaaa[sub]00000
```

### Subroutine Control

| Sub | Mnemonic | Description | Cost |
|-----|----------|-------------|------|
| 0x01 | `JSR a` | Push PC to stack, then set PC = a. | 3+A |
| 0x02 | `BSR a` | Push PC to stack, then set PC = PC + (signed)a. Relative call. | 3+A |

### Conditional Jumps (Mk II)

All conditional jumps test the FL register. If the condition is true, PC is set to the value of `a`. If false, the instruction has no effect (PC advances normally, any next word for `a` is still consumed).

| Sub | Mnemonic | Condition | Description | Cost |
|-----|----------|-----------|-------------|------|
| 0x03 | `JZ a`  | Z = 1     | Jump if zero (equal) | 2+A (+1 taken) |
| 0x04 | `JNZ a` | Z = 0     | Jump if not zero (not equal) | 2+A (+1 taken) |
| 0x05 | `JC a`  | C = 1     | Jump if carry (unsigned below) | 2+A (+1 taken) |
| 0x06 | `JNC a` | C = 0     | Jump if no carry (unsigned above or equal) | 2+A (+1 taken) |
| 0x07 | `JS a`  | S = 1     | Jump if sign (negative) | 2+A (+1 taken) |
| 0x08 | `JNS a` | S = 0     | Jump if no sign (positive or zero) | 2+A (+1 taken) |
| 0x09 | `JO a`  | O = 1     | Jump if overflow | 2+A (+1 taken) |
| 0x0A | `JA a`  | C=0, Z=0  | Jump if above (unsigned strict) | 2+A (+1 taken) |
| 0x0B | `JBE a` | C=1 or Z=1| Jump if below or equal (unsigned) | 2+A (+1 taken) |
| 0x0C | `JGE a` | S = O     | Jump if greater or equal (signed) | 2+A (+1 taken) |
| 0x0D | `JL a`  | S ≠ O     | Jump if less (signed) | 2+A (+1 taken) |
| 0x0E | `JG a`  | Z=0, S=O  | Jump if greater (signed strict) | 2+A (+1 taken) |
| 0x0F | `JLE a` | Z=1 or S≠O| Jump if less or equal (signed) | 2+A (+1 taken) |

See §10 for usage patterns with CMP.

### Interrupt Management

| Sub | Mnemonic | Description | Cost |
|-----|----------|-------------|------|
| 0x10 | `INT a` | Trigger a software interrupt with message `a`. | 4+A |
| 0x11 | `IAG a` | Set `a` = IA (get interrupt handler address). | 1+A |
| 0x12 | `IAS a` | Set IA = `a` (set interrupt handler address). | 1+A |
| 0x13 | `RFI a` | Return from interrupt. Pops A from stack, then pops PC. Re-enables interrupt dispatch. `a` is ignored (use 0). | 3 |
| 0x14 | `IAQ a` | If a ≠ 0, interrupts are added to queue instead of dispatched. If a = 0, queued interrupts resume dispatching. | 1+A |

### Reserved (0x15–0x17)

| Sub | Mnemonic | Description | Cost |
|-----|----------|-------------|------|
| 0x15 | *(reserved)* | Reserved for future use. Treated as NOP. | 1 |
| 0x16 | *(reserved)* | Reserved for future use. Treated as NOP. | 1 |
| 0x17 | *(reserved)* | Reserved for future use. Treated as NOP. | 1 |

### Unary Operations (Mk II)

| Sub | Mnemonic | Description | Cost |
|-----|----------|-------------|------|
| 0x18 | `NEG a` | a = −a (two's complement negate). Updates FL. | 1+A |
| 0x19 | `NOT a` | a = ~a (bitwise complement). Updates FL (Z, S; C, O cleared). | 1+A |
| 0x1A | `SXB a` | Sign-extend byte: bits 7–0 of a are sign-extended to 16 bits. Updates FL (Z, S). | 1+A |
| 0x1B | `SWP a` | Swap high and low bytes of a. a = ((a & 0xFF) << 8) \| ((a >> 8) & 0xFF). | 1+A |
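
The four unary operations reduce to short bit manipulations. The Python sketch below is illustrative only (the function names are hypothetical, FL updates are omitted, and inputs are assumed to be in 0..0xFFFF).

```python
MASK = 0xFFFF

def neg(a):
    return (-a) & MASK                   # two's complement negate

def not_(a):
    return ~a & MASK                     # bitwise complement

def sxb(a):
    low = a & 0xFF                       # only bits 7-0 are considered
    return low | 0xFF00 if low & 0x80 else low

def swp(a):
    return ((a & 0xFF) << 8) | ((a >> 8) & 0xFF)
```

Note that `neg(0x8000)` returns 0x8000, since the most negative 16-bit value has no positive counterpart.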

### Block & System Operations (Mk II)

| Sub | Mnemonic | Description | Cost |
|-----|----------|-------------|------|
| 0x1C | `BCOPY` | Block copy. See §13. `a` is ignored (use 0). | 2 + C (value of register C at invocation) |
| 0x1D | `BRK` | Breakpoint. If a debugger device is attached, halts execution for inspection. Otherwise, treated as NOP. `a` is ignored (use 0). | 1 |
| 0x1E | `HLT` | Halt. CPU stops executing until an interrupt is received. If IA = 0 (interrupts disabled), halts permanently until external reset. `a` is ignored (use 0). | 1 |

### Byte Store (Mk II)

| Sub | Mnemonic | Description | Cost |
|-----|----------|-------------|------|
| 0x1F | `STB a` | Store byte. Writes the low byte of register A (bits 7–0) to the byte address given by operand a. See §11. Does not update FL. | 2+A |

**Note on STB:** Unlike most special opcodes, STB uses `a` as a destination address rather than a source value. Register A is the implicit source — only its low 8 bits are written. This keeps STB consistent with the HCPU's convention that register A is the default data register (used by interrupt dispatch, and conventionally used for return values and primary operands).

### Sub-Opcode 0x00

When both `o = 0x00` and `b = 0x00`, the instruction is reserved. The all-zeros word `0x0000` is defined as NOP by convention:

| Sub | Mnemonic | Description | Cost |
|-----|----------|-------------|------|
| 0x00 | `NOP` | No operation. `a` is evaluated (including any side effects like POP) but the result is discarded. | 1+A |

---

## 8. Flags Register Behavior

The FL register is a 16-bit register with four defined flag bits:

```
Bit:  15 14 13 12 11 10  9  8  7  6  5  4  3  2  1  0
       ─  ─  ─  ─  ─  ─  ─  ─  ─  ─  ─  ─  O  S  C  Z
```

| Bit | Name | Description |
|-----|------|-------------|
| 0   | Z (Zero)     | Set if the result is zero. |
| 1   | C (Carry)    | Set on unsigned carry (addition) or borrow (subtraction). For shifts, set to the last bit shifted out. |
| 2   | S (Sign)     | Set if bit 15 of the result is 1 (result is negative in signed interpretation). |
| 3   | O (Overflow) | Set if signed overflow occurred (result has the wrong sign). |
| 15–4 | *(reserved)* | Read as zero. Writes are ignored. |
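
The bit layout can be expressed as masks. The Python sketch below is illustrative: the constant and function names are hypothetical, and `write_fl` models only the rule that reserved bits read as zero and ignore writes.

```python
FLAG_Z, FLAG_C, FLAG_S, FLAG_O = 0x1, 0x2, 0x4, 0x8
FLAG_MASK = FLAG_Z | FLAG_C | FLAG_S | FLAG_O

def write_fl(value):
    """Value FL holds after a write: reserved bits 15-4 are dropped."""
    return value & FLAG_MASK

def unpack_fl(fl):
    """Decode an FL value into named boolean flags."""
    return {"Z": bool(fl & FLAG_Z), "C": bool(fl & FLAG_C),
            "S": bool(fl & FLAG_S), "O": bool(fl & FLAG_O)}
```

Writing 0xFFFF to FL therefore leaves 0x000F in the register.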

### Which Instructions Update FL

**Always update FL:** ADD, SUB, MUL, MLI, DIV, DVI, MOD, MDI, AND, BOR, XOR, SHR, ASR, SHL, ADC, SBB, CMP, TST, FXMUL, FXDIV, LDB, NEG, NOT, SXB.

**Never update FL:** SET, IFx (all skip instructions), STB, JSR, BSR, Jcc (all conditional jumps), INT, IAG, IAS, RFI, IAQ, SWP, BCOPY, BRK, HLT, NOP.

### FL Update Rules by Instruction Type

**Arithmetic (ADD, SUB, ADC, SBB, CMP, NEG):**
- Z = 1 if result = 0
- C = 1 if unsigned carry (ADD/ADC) or unsigned borrow (SUB/SBB/CMP/NEG)
- S = bit 15 of result
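- O = 1 if signed overflow occurred. For addition (ADD/ADC): both operands have the same sign and the result's sign differs from it. For subtraction (SUB/SBB/CMP/NEG): the operands have different signs and the result's sign differs from that of the minuend `b` (for NEG, the minuend is 0).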
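
The arithmetic flag rules can be sketched in Python as follows. This is a hedged illustration with hypothetical function names; it uses the common sign-bit formulation of signed overflow, which is an assumption about how the rule above is meant to be computed. Operands are assumed to be in 0..0xFFFF.

```python
MASK = 0xFFFF

def flags_add(b, a, carry_in=0):
    """Flags for ADD (carry_in=0) or ADC. Returns (result, flags)."""
    total = b + a + carry_in
    result = total & MASK
    # Overflow: operands share a sign and the result's sign differs.
    overflow = (~(b ^ a) & (b ^ result) & 0x8000) != 0
    return result, {"Z": result == 0, "C": total > MASK,
                    "S": bool(result & 0x8000), "O": overflow}

def flags_sub(b, a, borrow_in=0):
    """Flags for SUB/CMP (borrow_in=0) or SBB. Returns (result, flags)."""
    diff = b - a - borrow_in
    result = diff & MASK
    # Overflow: operands differ in sign and the result's sign differs from b's.
    overflow = ((b ^ a) & (b ^ result) & 0x8000) != 0
    return result, {"Z": result == 0, "C": diff < 0,
                    "S": bool(result & 0x8000), "O": overflow}
```

For example, `flags_add(0x7FFF, 1)` sets S and O but not C, while `flags_add(0xFFFF, 1)` sets Z and C but not O.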

**Multiply (MUL, MLI, FXMUL):**
- Z = 1 if result = 0
- C = 1 if EX ≠ 0 (result didn't fit in 16 bits)
- S = bit 15 of result
- O = 1 if signed overflow (MLI/FXMUL only)

**Divide (DIV, DVI, MOD, MDI, FXDIV):**
- Z = 1 if result = 0
- C = 0
- S = bit 15 of result
- O = 0

**Bitwise (AND, BOR, XOR, NOT):**
- Z = 1 if result = 0
- C = 0 (cleared)
- S = bit 15 of result
- O = 0 (cleared)

**Shifts (SHR, ASR, SHL):**
- Z = 1 if result = 0
- C = last bit shifted out (or 0 if shift amount is 0)
- S = bit 15 of result
- O = 0 (cleared)

**Test (TST):**
- Z = 1 if (b & a) = 0
- C = 0 (cleared)
- S = bit 15 of (b & a)
- O = 0 (cleared)

**Load byte (LDB):**
- Z = 1 if loaded byte = 0
- C = unchanged
- S = 0 (byte values are zero-extended, so bit 15 is always 0)
- O = unchanged

**Sign-extend byte (SXB):**
- Z = 1 if result = 0
- C = unchanged
- S = bit 7 of the original byte (which becomes bit 15 after sign extension)
- O = unchanged

---

## 9. Conditional Skip Instructions

The IFx skip-style conditionals are a signature feature of the HCPU-16 architecture. They evaluate a condition and either continue normally or skip the next instruction entirely.

### Basic Skip Behavior

An IFx instruction evaluates its condition. If the condition is **true**, execution continues normally to the next instruction. If the condition is **false**, the next instruction is skipped entirely (it is not executed, and any next words it would consume are skipped over).

```asm
IFE A, 5          ; if A == 5...
  SET B, 1        ;   ...then B = 1 (executed only if A == 5)
SET C, 2          ; always executed
```

### Chaining

If a skipped instruction is itself an IFx, the skip continues — both the second IFx and the instruction after it are skipped. This allows multi-condition logic:

```asm
IFG A, 0          ; if A > 0 (unsigned)
  IFL A, 100      ;   AND A < 100 (unsigned)
    SET B, 1      ;     then B = 1
SET C, 2          ; always executed
```

When chaining, every skipped instruction costs 1 cycle: one for each skipped IFx and one for the final non-IFx instruction.

### Skipping Multi-Word Instructions

When an instruction is skipped, its next words are also skipped. The CPU reads enough words to determine the full instruction length, then advances PC past all of them.

```asm
IFE A, 0
  SET [0x1000], [0x2000]   ; 3-word instruction: skipped entirely
ADD X, 1                    ; always executed; the SET is skipped when A ≠ 0
```
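
To skip correctly, the CPU has to know each instruction's length, which follows from the "Next Word?" columns in §5. The Python sketch below shows one way an emulator might compute it and follow an IFx chain; the function names are hypothetical, and `mem` is assumed to be a list of 16-bit words.

```python
def needs_next_word(code, is_a):
    """True if an operand code from the §5 tables consumes a next word."""
    if 0x10 <= code <= 0x17 or code in (0x1A, 0x1E):
        return True
    return is_a and code == 0x1F         # 0x1F is a literal only in `a`

def instruction_length(word):
    o, b, a = word & 0x1F, (word >> 5) & 0x1F, (word >> 10) & 0x3F
    length = 1 + needs_next_word(a, True)
    if o != 0:                           # basic: `b` is an operand too
        length += needs_next_word(b, False)
    return length

def is_skip(word):
    return 0x10 <= (word & 0x1F) <= 0x17  # IFB..IFU

def skip_from(mem, pc):
    """PC after skipping the instruction at pc, following IFx chains."""
    while True:
        word = mem[pc]
        pc = (pc + instruction_length(word)) & 0xFFFF
        if not is_skip(word):
            return pc
```

With `SET [0x1000], [0x2000]` encoded as 0x7BC1 followed by its two next words, `skip_from` advances PC by three words.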

### IFx vs. CMP+Jcc

Both styles coexist. Guidelines:

- **IFx** is concise for single-condition inline checks. It reads naturally: "if equal, do this one thing."
- **CMP + Jcc** is better for branching logic, loops, and compiler output. It separates comparison from control flow and handles complex branching patterns cleanly.

Neither style is deprecated. Use whichever is clearer.

---

## 10. Conditional Jump Instructions

Conditional jumps test the FL register (set by a prior CMP, TST, or arithmetic instruction) and branch to the target address if the condition is met.

### Typical Pattern

```asm
CMP A, 10         ; compare A to 10 (sets FL)
JGE done          ; if A >= 10 (signed), jump to 'done'
; ... code for A < 10 ...
done:
; ... continues here ...
```

### Unsigned Comparison After CMP

| Condition | Jump | Meaning |
|-----------|------|---------|
| b = a     | JZ   | Equal |
| b ≠ a     | JNZ  | Not equal |
| b < a     | JC   | Below (unsigned) |
| b ≥ a     | JNC  | Above or equal (unsigned) |
| b > a     | JA   | Above (unsigned) |
| b ≤ a     | JBE  | Below or equal (unsigned) |

### Signed Comparison After CMP

| Condition | Jump | Meaning |
|-----------|------|---------|
| b = a     | JZ   | Equal |
| b ≠ a     | JNZ  | Not equal |
| b < a     | JL   | Less (signed) |
| b ≥ a     | JGE  | Greater or equal (signed) |
| b > a     | JG   | Greater (signed) |
| b ≤ a     | JLE  | Less or equal (signed) |
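
The two tables can be checked against the flag conditions from §7 with a small Python model. This is a sketch under stated assumptions: `cmp_flags`, `CONDITIONS`, and `taken` are hypothetical names, and CMP is modeled with the same flag rules as SUB.

```python
def cmp_flags(b, a):
    """Flags produced by CMP b, a (SUB rules, result discarded)."""
    diff = b - a
    result = diff & 0xFFFF
    return {"Z": result == 0, "C": diff < 0,
            "S": bool(result & 0x8000),
            "O": bool((b ^ a) & (b ^ result) & 0x8000)}

# Jump conditions as listed in the conditional jump table in §7.
CONDITIONS = {
    "JZ":  lambda f: f["Z"],
    "JNZ": lambda f: not f["Z"],
    "JC":  lambda f: f["C"],
    "JNC": lambda f: not f["C"],
    "JS":  lambda f: f["S"],
    "JNS": lambda f: not f["S"],
    "JO":  lambda f: f["O"],
    "JA":  lambda f: not f["C"] and not f["Z"],
    "JBE": lambda f: f["C"] or f["Z"],
    "JGE": lambda f: f["S"] == f["O"],
    "JL":  lambda f: f["S"] != f["O"],
    "JG":  lambda f: not f["Z"] and f["S"] == f["O"],
    "JLE": lambda f: f["Z"] or f["S"] != f["O"],
}

def taken(mnemonic, b, a):
    """True if `mnemonic` would jump after CMP b, a."""
    return CONDITIONS[mnemonic](cmp_flags(b, a))
```

Comparing 0xFFFF with 1 illustrates the signed/unsigned split: `JA` is taken (65535 is above 1) while `JG` is not (−1 is not greater than 1).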

### Assembler Aliases

Conforming assemblers SHOULD support these aliases for readability:

| Alias   | Expands To | After CMP |
|---------|------------|-----------|
| `JE a`  | `JZ a`     | Jump if equal |
| `JNE a` | `JNZ a`    | Jump if not equal |
| `JB a`  | `JC a`     | Jump if below (unsigned) |
| `JAE a` | `JNC a`    | Jump if above or equal (unsigned) |

---

## 11. Byte Addressing

The HCPU-16 Mk II provides byte-granularity access to memory through two dedicated instructions: LDB (basic opcode) and STB (special opcode).

### Byte Address Mapping

A 16-bit byte address maps to a specific byte within the word-addressed memory:

```
Byte Address BA → Word Address: BA >> 1
                  Byte Select:  BA & 1
                    0 = high byte (bits 15–8)
                    1 = low byte (bits 7–0)
```

With 16-bit byte addresses, the addressable byte range is **0x0000–0xFFFF** (65,536 bytes), covering word addresses **0x0000–0x7FFF** (the first 32,768 words).

Memory above word address 0x7FFF is word-addressed only. There is no extended byte addressing mode. Programs that need byte access to upper memory should use the standard word-level pattern:

```asm
; Read high byte of word at address 0x8000
SET A, [0x8000]    ; read the full word
SHR A, 8           ; shift high byte to low position
AND A, 0xFF        ; mask to byte (optional, SHR already zeroed upper bits)
```

### LDB — Load Byte (Basic Opcode 0x1E)

```
LDB b, a
```

Loads a single byte from memory, zero-extends it to 16 bits, and stores it in `b`.

1. Evaluate `a` to get a 16-bit value `BA` (the byte address).
2. Compute word address: `WA = BA >> 1`.
3. Read the word at `MEM[WA]`.
4. If `BA` is even (bit 0 = 0): extract high byte → `(MEM[WA] >> 8) & 0xFF`.
5. If `BA` is odd (bit 0 = 1): extract low byte → `MEM[WA] & 0xFF`.
6. Zero-extend the 8-bit value to 16 bits and store in `b`.
7. Update FL: Z = 1 if byte is 0, S = 0 (always, since zero-extended), C and O unchanged.
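
The LDB steps above can be sketched in Python as follows. The function name `ldb` is hypothetical, and `mem` is assumed to be a list holding at least the first 32,768 words of memory.

```python
def ldb(mem, byte_addr):
    """Return (value, z_flag) for LDB; S is always 0, C and O are untouched."""
    byte_addr &= 0xFFFF
    word = mem[byte_addr >> 1]           # word address = byte address / 2
    if byte_addr & 1:
        value = word & 0xFF              # odd address: low byte
    else:
        value = (word >> 8) & 0xFF       # even address: high byte
    return value, value == 0
```

If word 0x80 holds 0x4869, then byte address 0x100 reads 0x48 and byte address 0x101 reads 0x69.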

### STB — Store Byte (Special Opcode 0x1F)

```
STB a
```

Writes the low byte of register A (bits 7–0) to the byte address given by `a`.

1. Evaluate `a` to get a 16-bit value `BA` (the byte address).
2. The source value is register A (implicit). Only bits 7–0 of A are used.
3. Compute word address: `WA = BA >> 1`.
4. Read the existing word at `MEM[WA]`.
5. If `BA` is even: replace high byte → `MEM[WA] = ((A & 0xFF) << 8) | (MEM[WA] & 0x00FF)`.
6. If `BA` is odd: replace low byte → `MEM[WA] = (MEM[WA] & 0xFF00) | (A & 0xFF)`.
7. FL is not updated.
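
The STB steps above amount to a read-modify-write of one word. A minimal Python sketch, with the hypothetical name `stb` and `mem` assumed to be a list of 16-bit words:

```python
def stb(mem, byte_addr, reg_a):
    """Store the low byte of register A at byte_addr (read-modify-write)."""
    byte_addr &= 0xFFFF
    wa = byte_addr >> 1                  # word address
    data = reg_a & 0xFF                  # only bits 7-0 of A are used
    if byte_addr & 1:
        mem[wa] = (mem[wa] & 0xFF00) | data           # odd: replace low byte
    else:
        mem[wa] = (data << 8) | (mem[wa] & 0x00FF)    # even: replace high byte
```

Masking A to 8 bits before shifting keeps any high bits of A from leaking into the stored word.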

### Byte Address Range Limits

Byte addresses 0x0000–0xFFFF map to word addresses 0x0000–0x7FFF. Byte addresses do not wrap: there is no way to byte-address word 0x8000 or above using LDB/STB alone.

Because the highest byte-addressable word (0x7FFF) lies below the MMIO window (0xE000–0xFFFF), LDB and STB can never reach an MMIO register. Every byte-addressable word lies in the RAM region.

### Examples

```asm
; Store the ASCII string "Hi" starting at byte address 0x100
SET A, 0x48        ; 'H' (ASCII)
STB 0x100          ; store low byte of A at byte 0x100

SET A, 0x69        ; 'i' (ASCII)
STB 0x101          ; store low byte of A at byte 0x101

; Read it back
LDB B, 0x100       ; B = 0x0048 ('H', zero-extended)
LDB C, 0x101       ; C = 0x0069 ('i', zero-extended)

; Relationship to word access:
; The word at address 0x80 now contains 0x4869
; Because byte 0x100 is word 0x80 high byte,
;     and byte 0x101 is word 0x80 low byte.

; Using a register as byte address
SET X, 0x200       ; byte address in X
SET A, 0x41        ; 'A'
STB X              ; store 'A' at byte address 0x200
ADD X, 1           ; advance to next byte
SET A, 0x42        ; 'B'
STB X              ; store 'B' at byte address 0x201
```

---

## 12. 8.8 Signed Fixed-Point Arithmetic

The HCPU-16 Mk II provides hardware-accelerated fixed-point math using **8.8 signed fixed-point** representation. In this format, a 16-bit register holds a signed fixed-point number where the upper 8 bits are the integer part and the lower 8 bits are the fractional part.

### Representation

An 8.8 fixed-point number is a signed 16-bit integer with an implicit division by 256. The value represented is `register_value / 256`.

| Decimal | 8.8 Hex  | Register Bits        | Calculation        |
|---------|----------|----------------------|--------------------|
| 1.0     | 0x0100   | 00000001.00000000    | 256 / 256 = 1.0    |
| 0.5     | 0x0080   | 00000000.10000000    | 128 / 256 = 0.5    |
| 2.5     | 0x0280   | 00000010.10000000    | 640 / 256 = 2.5    |
| -1.0    | 0xFF00   | 11111111.00000000    | −256 / 256 = −1.0  |
| -0.5    | 0xFF80   | 11111111.10000000    | −128 / 256 = −0.5  |
| 3.14    | 0x0324   | 00000011.00100100    | 804 / 256 ≈ 3.141  |
| 0.00390625 | 0x0001 | 00000000.00000001 | 1 / 256 (smallest positive) |

**Range:** −128.0 to +127.99609375 (−128.0 to approximately +128.0)

**Precision:** 1/256 = 0.00390625

This is sufficient precision for physics (velocity, rotation, acceleration), bullet trajectories, sensor bearings, and most in-game math. Players who need higher-precision trig can use the math coprocessor peripheral (§21), which provides lookup tables accessible via MMIO.
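
The representation can be illustrated with a pair of conversion helpers. This is a minimal sketch for tooling such as an assembler or test harness; `to_fx88` and `from_fx88` are hypothetical names, and rounding to the nearest step is an assumption chosen to match the 3.14 row of the table.

```python
# Hypothetical helpers for converting between Python floats and 8.8 register values.
def to_fx88(x):
    """Encode a real number as a 16-bit 8.8 register value (nearest step)."""
    raw = round(x * 256)
    if not -32768 <= raw <= 32767:
        raise ValueError("value outside the 8.8 range")
    return raw & 0xFFFF          # two's-complement register image


def from_fx88(reg):
    """Decode a 16-bit 8.8 register value into a Python float."""
    raw = reg & 0xFFFF
    if raw & 0x8000:             # sign bit set: negative value
        raw -= 0x10000
    return raw / 256


print(hex(to_fx88(2.5)))     # 0x280
print(from_fx88(0xFF80))     # -0.5
```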

### FXMUL — Fixed-Point Multiply (Basic Opcode 0x1C)

```
FXMUL b, a
```

Multiplies two 8.8 fixed-point values. The key insight: multiplying two 8.8 values (each is really an integer ×256) produces an intermediate result that is 16.16 (integer ×65536). Shifting right by 8 converts back to 8.8.

**Pseudocode:**
```
result32 = (int32_t)(int16_t)b × (int32_t)(int16_t)a    // signed 16×16→32
b        = (result32 >> 8) & 0xFFFF                       // middle 16 bits = 8.8 result
EX       = result32 & 0x00FF                               // low 8 fractional bits discarded by the shift (truncation, not rounding)
```

**FL update:** Z = 1 if b = 0; C = 1 if EX ≠ 0 (precision was lost); S = bit 15 of b; O = 1 if signed overflow (result outside 8.8 range).
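
The pseudocode and flag rules can be sketched in Python as follows. This is an illustrative model, not a normative one: `to_signed16` and `fxmul` are hypothetical names, the right shift is assumed to be arithmetic (sign-preserving), and O is assumed to mean that the shifted product does not fit in a signed 16-bit value.

```python
def to_signed16(value):
    """Interpret the low 16 bits of value as a two's-complement integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def fxmul(b, a):
    """Model FXMUL b, a. Returns (new_b, ex, flags)."""
    product = to_signed16(b) * to_signed16(a)   # signed 16x16 -> 32
    shifted = product >> 8                      # arithmetic shift back to 8.8
    result = shifted & 0xFFFF
    ex = product & 0x00FF                       # fractional bits dropped by the shift
    flags = {
        "Z": int(result == 0),
        "C": int(ex != 0),                      # precision was lost
        "S": (result >> 15) & 1,
        "O": int(not -32768 <= shifted <= 32767),
    }
    return result, ex, flags


result, ex, flags = fxmul(0x0280, 0x0180)       # 2.5 * 1.5
print(hex(result), ex, flags["O"])              # 0x3c0 0 0
```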

### FXDIV — Fixed-Point Divide (Basic Opcode 0x1D)

```
FXDIV b, a
```

Divides an 8.8 fixed-point value by another. The numerator is shifted left by 8 before division to maintain precision in the 8.8 result.

**Pseudocode:**
```
numerator32 = (int32_t)(int16_t)b << 8                     // shift up to create 16.16 from 8.8
result32    = numerator32 / (int32_t)(int16_t)a             // signed division, rounds toward zero
b           = result32 & 0xFFFF                              // 8.8 result
EX          = numerator32 % (int32_t)(int16_t)a             // remainder
```

If `a = 0`: `b = 0`, `EX = 0`. No fault is generated.

**FL update:** Z = 1 if b = 0; C = 0; S = bit 15 of b; O = 0.
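
A Python sketch of the divide, including the divide-by-zero case, might look like this. The names `to_signed16` and `fxdiv` are hypothetical, and storing the remainder in EX as its low 16 bits is an assumption, since the pseudocode does not state a width.

```python
def to_signed16(value):
    """Interpret the low 16 bits of value as a two's-complement integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def fxdiv(b, a):
    """Model FXDIV b, a. Returns (new_b, ex, flags)."""
    divisor = to_signed16(a)
    if divisor == 0:
        # Division by zero: b = 0, EX = 0, no fault.
        return 0, 0, {"Z": 1, "C": 0, "S": 0, "O": 0}
    numerator = to_signed16(b) * 256            # same as shifting left by 8
    quotient = abs(numerator) // abs(divisor)   # divide magnitudes first ...
    if (numerator < 0) != (divisor < 0):
        quotient = -quotient                    # ... so the result rounds toward zero
    remainder = numerator - quotient * divisor  # sign follows the numerator
    result = quotient & 0xFFFF
    flags = {"Z": int(result == 0), "C": 0, "S": (result >> 15) & 1, "O": 0}
    return result, remainder & 0xFFFF, flags


result, ex, flags = fxdiv(0x0700, 0x0200)       # 7.0 / 2.0
print(hex(result), ex)                          # 0x380 0
```

Python's `//` rounds toward negative infinity, so the sketch divides magnitudes and reapplies the sign to get the round-toward-zero behavior the pseudocode specifies.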

### Usage Examples

```asm
; Multiply 2.5 × 1.5 in 8.8 fixed-point
; 2.5 in 8.8 = 2 * 256 + 128 = 0x0280
; 1.5 in 8.8 = 1 * 256 + 128 = 0x0180

SET A, 0x0280       ; A = 2.5
SET B, 0x0180       ; B = 1.5
FXMUL A, B          ; A = 3.75 = 0x03C0 (3 * 256 + 192)

; Divide 7.0 / 2.0 in 8.8 fixed-point
SET A, 0x0700       ; A = 7.0
SET B, 0x0200       ; B = 2.0
FXDIV A, B          ; A = 3.5 = 0x0380

; --- Practical example: velocity integration ---
; Physics: position += velocity * delta_time
;
; Suppose:
;   X = x_position (8.8), e.g. 50.0 = 0x3200
;   Y = x_velocity (8.8), e.g.  1.5 = 0x0180
;   Z = delta_time  (8.8), e.g.  0.25 = 0x0040 (1/4 second)

SET X, 0x3200       ; x_pos = 50.0
SET Y, 0x0180       ; x_vel = 1.5
SET Z, 0x0040       ; dt = 0.25

; Compute velocity * dt
SET A, Y            ; A = velocity (1.5)
FXMUL A, Z          ; A = 1.5 * 0.25 = 0.375 = 0x0060
ADD X, A            ; x_pos += 0.375 → 50.375 = 0x3260

; --- Practical example: trig with lookup table ---
; Convert a bearing angle (0–255 in register) to a velocity component
; using a sin table stored in memory at :sin_table
; Each table entry is an 8.8 value, indexed by angle (0 = 0°, 64 = 90°, etc.)

SET I, bearing      ; I = angle index (0–255, word-sized table)
SET A, [I + sin_table]  ; A = sin(bearing) in 8.8 format
FXMUL A, speed      ; A = speed * sin(bearing), in 8.8
; Result is the velocity X-component
```

---

## 13. Block Copy

```
BCOPY
```

BCOPY copies a region of memory. It uses three registers as implicit operands:

| Register | Role |
|----------|------|
| A        | Source start address (word address) |
| B        | Destination start address (word address) |
| C        | Number of words to copy |

### Behavior

1. If `C = 0`, BCOPY is a no-op (costs 2 cycles).
2. If `A < B` (destination is after source, potential overlap), copy proceeds **backwards** from the last word to the first (like `memmove`).
3. If `A ≥ B`, copy proceeds **forwards** from the first word to the last.
4. After completion: `A = A + C` (original C value), `B = B + C` (original C value), `C = 0`.

This overlap-safe behavior means BCOPY always produces correct results regardless of source/destination relationship.
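
The direction rule can be sketched in Python. This is a minimal model under stated assumptions: `bcopy` is a hypothetical name, `mem` is a plain list of words, and addresses are assumed to wrap at 0xFFFF (the text above does not say what happens at the top of the address space).

```python
def bcopy(mem, a, b, c):
    """Model BCOPY. Mutates mem and returns the final (A, B, C) register values."""
    if c == 0:
        return a, b, 0                       # no-op
    if a < b:
        order = range(c - 1, -1, -1)         # destination after source: copy backwards
    else:
        order = range(c)                     # otherwise copy forwards
    for i in order:
        mem[(b + i) & 0xFFFF] = mem[(a + i) & 0xFFFF]
    return (a + c) & 0xFFFF, (b + c) & 0xFFFF, 0


mem = [0] * 0x10000
mem[0x10:0x14] = [1, 2, 3, 4]
print(bcopy(mem, 0x10, 0x12, 4))   # (20, 22, 0)
print(mem[0x12:0x16])              # [1, 2, 3, 4]
```

The usage example overlaps source and destination by two words; a forward copy would have overwritten words 3 and 4 before reading them.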

### Cycle Cost

`2 + C` cycles, where C is the value of register C at invocation. This is faster than a manual copy loop (which would cost approximately 5–6 cycles per word).

### Example

```asm
; Copy 128 words from 0x1000 to 0x2000
SET A, 0x1000      ; source
SET B, 0x2000      ; destination
SET C, 128         ; count
BCOPY              ; copies 128 words, costs 130 cycles
; After: A = 0x1080, B = 0x2080, C = 0
```

### Interaction with MMIO

BCOPY can read from and write to the MMIO region (0xE000–0xFFFF). This enables DMA-like transfers to and from hardware devices — for example, blitting a tile map to the LEM3200 display's video RAM, or bulk-reading a packet buffer from the NIC.

Device burst support is device-specific. The following standard devices support burst reads and writes via BCOPY:

- **LEM3200 Display** (slot 1): Full burst support for video RAM, tile map, and sprite table regions.
- **Floppy Drive** (slot 4): Burst read/write of sector data buffer.

Devices that do NOT support burst access (BCOPY reads or writes one word at a time, and each access is independent):

- **System Control Block** (slot 0): Each register read is independent.
- **Sensor Array** (slot 6): Polled data; each read returns the value at the moment of access.

Consult individual device specifications for details.

### Interaction with MPU

If the MPU is enabled, each word access during BCOPY is bounds-checked. A violation on any word aborts the remaining copy and triggers a fault (if fault mode is enabled). On abort, registers reflect the state at the point of the fault: C contains the remaining word count, A and B point to the word that faulted.

---

## 14. Interrupt Handling

### Interrupt Dispatch

When an interrupt arrives with message `M`:

1. If `IA = 0`, the interrupt is silently discarded.
2. If interrupt queueing is active (see IAQ), the interrupt (with message `M`) is added to the back of the queue.
3. Otherwise:
   a. Interrupt queueing is automatically activated (prevents nested interrupts by default).
   b. PUSH PC (current program counter pushed to stack).
   c. PUSH A (current value of register A pushed to stack).
   d. `PC = IA` (jump to interrupt handler).
   e. `A = M` (message is placed in register A).

### Returning from Interrupts

The `RFI` instruction:

1. Pops A from the stack (restoring its pre-interrupt value).
2. Pops PC from the stack (returning to the interrupted code).
3. Deactivates interrupt queueing (re-enables interrupt dispatch).
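
Dispatch and return can be sketched as a small Python class. This is an illustrative model only: the `Cpu` class and its attribute names are hypothetical, the queue depth limit is ignored here, and PUSH is assumed to pre-decrement SP while POP post-increments it, as listed in Appendix A.

```python
from collections import deque


class Cpu:
    """Hypothetical minimal CPU state for modelling interrupt dispatch."""

    def __init__(self):
        self.a = self.pc = self.sp = self.ia = 0
        self.queueing = False
        self.queue = deque()
        self.mem = [0] * 0x10000

    def push(self, value):
        self.sp = (self.sp - 1) & 0xFFFF      # PUSH is [--SP]
        self.mem[self.sp] = value & 0xFFFF

    def pop(self):
        value = self.mem[self.sp]             # POP is [SP++]
        self.sp = (self.sp + 1) & 0xFFFF
        return value

    def interrupt(self, message):
        if self.ia == 0:
            return                            # silently discarded
        if self.queueing:
            self.queue.append(message & 0xFFFF)
            return
        self.queueing = True                  # block nested interrupts
        self.push(self.pc)
        self.push(self.a)
        self.pc = self.ia
        self.a = message & 0xFFFF

    def rfi(self):
        self.a = self.pop()                   # reverse order of the pushes
        self.pc = self.pop()
        self.queueing = False


cpu = Cpu()
cpu.sp, cpu.pc, cpu.a, cpu.ia = 0xE000, 0x1234, 0x0055, 0x0400
cpu.interrupt(7)
print(hex(cpu.pc), cpu.a)   # 0x400 7
cpu.rfi()
print(hex(cpu.pc), cpu.a)   # 0x1234 85
```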

### Interrupt Queue

Interrupts that arrive while queueing is active are stored in a FIFO queue.

- **Maximum queue depth:** 256 entries.
- When queueing is deactivated (by RFI or `IAQ 0`), queued interrupts are dispatched one per cycle.

### Queue Overflow

If the queue reaches 256 entries and another interrupt arrives, the behavior depends on the **Interrupt Queue Mode** (IQM), a configuration value in the system control MMIO block (§15):

| IQM Value | Behavior |
|-----------|----------|
| 0 (default) | The new interrupt is dropped. |
| 1 | The oldest queued interrupt is dropped to make room. |
| 2 | A non-maskable fault is triggered. The CPU pushes PC, pushes A, and jumps to IA with A = 0xFFFF (fault code). This dispatch ignores the queueing flag. |
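
The three modes can be sketched as one enqueue routine. This is a minimal sketch: `enqueue_interrupt` and `MAX_QUEUE_DEPTH` are hypothetical names, and leaving the queue untouched in mode 2 is an assumption, since the table does not say whether the overflowing interrupt is kept.

```python
from collections import deque

MAX_QUEUE_DEPTH = 256


def enqueue_interrupt(queue, message, iqm):
    """Queue a message; return True if the caller must raise the IQM = 2 fault."""
    if len(queue) < MAX_QUEUE_DEPTH:
        queue.append(message)
        return False
    if iqm == 0:
        return False                 # drop the new interrupt
    if iqm == 1:
        queue.popleft()              # drop the oldest to make room
        queue.append(message)
        return False
    if iqm == 2:
        return True                  # caller dispatches the fault with A = 0xFFFF
    raise ValueError("IQM must be 0, 1, or 2")


q = deque(range(256))                # a full queue
print(enqueue_interrupt(q, 999, 1), q[0], q[-1])   # False 1 999
```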

### Catastrophic Failure

If a CPU accumulates 1,024 or more unprocessed interrupts during a single game tick (through any combination of queue overflow mode, EMP damage, or hardware malfunction), this is a hardware failure event in the game layer:

- The CPU immediately halts.
- The host loses all CPU-dependent systems (communication, shields, navigation, weapons, sensors).
- The ROM chip may be damaged (game-layer probability roll).
- Repair requires a replacement CPU component.

This limit serves both as a catastrophic-failure game mechanic and as a safety valve against runaway interrupt storms. It is enforced by the game runtime, not by the CPU emulator itself.

---

## 15. Memory-Mapped I/O

The top 8K words of the address space (0xE000–0xFFFF) are reserved for memory-mapped hardware I/O. This is the **sole mechanism** for interacting with hardware devices on the HCPU-16 Mk II. There are no special hardware-interaction instructions — all device communication is through standard memory reads and writes to the MMIO region.

### Device Slot Layout

Each hardware device is assigned a **256-word slot** in the MMIO region. With 8K words, there are **32 device slots** (0–31).

```
Slot  0: 0xE000–0xE0FF   (System Control — always present)
Slot  1: 0xE100–0xE1FF
Slot  2: 0xE200–0xE2FF
  ...
Slot 31: 0xFF00–0xFFFF
```
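
The slot arithmetic can be written out as two small helpers, for example in an emulator or assembler macro library. The names `slot_base` and `slot_of` are hypothetical; the constants come directly from the layout above.

```python
MMIO_BASE = 0xE000
SLOT_SIZE = 0x100      # 256 words per slot
SLOT_COUNT = 32


def slot_base(slot):
    """Return the first MMIO address of a device slot (0-31)."""
    if not 0 <= slot < SLOT_COUNT:
        raise ValueError("slot must be in 0..31")
    return MMIO_BASE + slot * SLOT_SIZE


def slot_of(address):
    """Return (slot, offset) for an MMIO address, or None for a RAM address."""
    if not MMIO_BASE <= address <= 0xFFFF:
        return None
    return divmod(address - MMIO_BASE, SLOT_SIZE)


print(hex(slot_base(31)))   # 0xff00
print(slot_of(0xE109))      # (1, 9)
```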

### Slot 0 — System Control Block

Slot 0 is always present and provides system-level configuration and status:

| Offset | Name          | R/W | Description |
|--------|---------------|-----|-------------|
| 0x00   | SYS_ID        | R   | System identifier: 0x4802 (HCPU Mk II) |
| 0x01   | SYS_VER       | R   | Firmware version (high byte = major, low = minor) |
| 0x02   | SYS_RAM       | R   | Installed RAM in words (e.g., 0x8000 = 32K words) |
| 0x03   | SYS_CLK       | R   | Cycle budget per game tick |
| 0x04   | SYS_TICKS     | R   | Game tick counter (wraps at 0xFFFF) |
| 0x05   | SYS_IQM       | R/W | Interrupt Queue Mode (0, 1, or 2; see §14) |
| 0x06   | SYS_MPU_BASE  | R/W | MPU base address (see §16) |
| 0x07   | SYS_MPU_LIMIT | R/W | MPU limit address (see §16) |
| 0x08   | SYS_MPU_CTRL  | R/W | MPU control register (see §16) |
| 0x09   | SYS_RNG       | R   | Hardware random number (new value each read) |
| 0x0A   | SYS_HWCOUNT   | R   | Number of connected hardware devices (including system control) |
| 0x0B–0xFF | *(reserved)* | — | Reserved for future system use |

### Device Enumeration via MMIO

Since all hardware interaction is through MMIO, device enumeration is performed by reading the standard identification registers at the beginning of each slot. All devices MUST implement the first 4 words of their slot as follows:

| Offset | Name      | R/W | Description |
|--------|-----------|-----|-------------|
| 0x00   | DEV_ID    | R   | Device type identifier (low word) |
| 0x01   | DEV_ID_HI | R   | Device type identifier (high word) |
| 0x02   | DEV_VER   | R   | Device version |
| 0x03   | DEV_CTRL  | R/W | Primary control/status register |

An empty slot reads 0x0000 for all registers. To enumerate devices, scan slots 1–31 and check whether DEV_ID is non-zero:

```asm
; Enumerate all hardware devices
SET I, 1               ; start at slot 1 (slot 0 is system control)
SET J, 0xE100          ; MMIO base of slot 1

:enum_loop
    SET A, [J]         ; read DEV_ID of this slot
    IFN A, 0           ; if device present...
        JSR handle_device  ; ... process it (slot base in J, ID in A)
    ADD J, 0x100       ; advance to next slot (256-word stride)
    ADD I, 1
    CMP I, 32
    JL enum_loop       ; loop until all 32 slots scanned
```

The remaining 252 words of each slot are device-specific.

### Reading/Writing MMIO

MMIO locations are accessed using standard word-addressed instructions:

```asm
; Read the tick counter
SET A, [0xE004]    ; A = current tick count

; Set interrupt queue mode
SET [0xE005], 2    ; IQM = fault mode

; Read a random number
SET A, [0xE009]    ; A = random 16-bit value

; Read display device ID
SET A, [0xE100]    ; A = LEM3200 DEV_ID low word
```

---

## 16. Memory Protection Unit

The MPU provides simple base-and-bounds memory protection. It is configured through the System Control Block in MMIO (§15).

### Configuration Registers

| MMIO Address | Name | Description |
|-------------|------|-------------|
| 0xE006 | SYS_MPU_BASE  | Base address of the allowed memory region (inclusive). |
| 0xE007 | SYS_MPU_LIMIT | Limit address of the allowed memory region (exclusive). |
| 0xE008 | SYS_MPU_CTRL  | Control register (see below). |

### MPU Control Register (SYS_MPU_CTRL)

| Bit | Name | Description |
|-----|------|-------------|
| 0   | EN   | MPU enabled (1 = active, 0 = disabled). |
| 1   | FAULT| Fault mode (1 = trigger interrupt on violation, 0 = silent: reads return 0, writes are discarded). |
| 2   | WP   | Write-protect mode (1 = writes outside region are blocked, reads are allowed everywhere). |
| 15–3 | *(reserved)* | Must be 0. |

### Behavior

When `EN = 1`, every memory access is checked:

- **Normal mode (WP = 0):** Any read or write to an address outside `[MPU_BASE, MPU_LIMIT)` is a violation.
- **Write-protect mode (WP = 1):** Only writes outside the region are violations; reads are unrestricted.

On a violation:

- **FAULT = 0 (silent):** Reads return 0x0000. Writes are discarded. Execution continues.
- **FAULT = 1 (interrupt):** A fault interrupt is triggered with message `0xFFFE`. The handler receives the faulting PC in the normal interrupt frame.

### Scope

- The MPU applies to **RAM accesses only** (0x0000–0xDFFF).
- MMIO accesses (0xE000–0xFFFF) are **never** blocked by the MPU. (A sandboxed program that shouldn't access hardware should have its allowed region within RAM — MMIO is always above the RAM ceiling.)
- Instruction fetches are subject to MPU checks. Code executing outside the allowed region will fault.
- Stack operations (PUSH/POP) are subject to MPU checks.
- BCOPY operations are subject to per-word MPU checks (see §13).
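
The checks and exemptions above can be sketched as a single predicate. This is an illustrative model: `mpu_check` and its string results are hypothetical, and the sketch assumes that a WP violation follows the same FAULT/silent choice as any other violation.

```python
EN, FAULT, WP = 0x1, 0x2, 0x4      # SYS_MPU_CTRL bits 0, 1, 2
MMIO_BASE = 0xE000


def mpu_check(addr, is_write, base, limit, ctrl):
    """Classify one memory access as 'ok', 'silent', or 'fault'."""
    if not ctrl & EN:
        return "ok"                          # MPU disabled
    if addr >= MMIO_BASE:
        return "ok"                          # MMIO is never blocked
    if base <= addr < limit:
        return "ok"                          # inside [base, limit)
    if ctrl & WP and not is_write:
        return "ok"                          # write-protect mode allows reads
    return "fault" if ctrl & FAULT else "silent"


# Sandbox 0x4000-0x5FFF with EN=1, FAULT=1
print(mpu_check(0x3FFF, False, 0x4000, 0x6000, 0x3))   # fault
print(mpu_check(0x4000, True, 0x4000, 0x6000, 0x3))    # ok
```

On a `"silent"` result the caller would return 0x0000 for a read or discard a write; on `"fault"` it would raise the interrupt with message 0xFFFE.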

### Typical Use Case: Sandboxing Untrusted Code

```asm
; Load untrusted code from floppy into 0x4000–0x5FFF, then sandbox it
; (Code loading from floppy omitted — see floppy device spec)

SET [0xE006], 0x4000    ; MPU_BASE = 0x4000
SET [0xE007], 0x6000    ; MPU_LIMIT = 0x6000 (exclusive)
SET [0xE008], 0x0003    ; EN=1, FAULT=1
SET PC, 0x4000          ; jump into sandboxed region

; If the untrusted code tries to access memory outside 0x4000–0x5FFF,
; a fault interrupt fires with message 0xFFFE.
; The handler can terminate the sandboxed code and reclaim control.
```

---

## 17. Cycle Costs

Every instruction has a deterministic cycle cost. The total cost is the instruction's base cost plus the operand cost(s) for its addressing modes.

### Operand Costs (from §5, summarized)

| Addressing Mode         | Cost |
|-------------------------|------|
| Register direct         | 0    |
| [register] indirect     | 1    |
| [register + nw] indexed | 2    |
| POP / PUSH              | 1    |
| PEEK [SP]               | 1    |
| PICK [SP + nw]          | 2    |
| SP / PC / EX / FL       | 0    |
| [nw] memory direct      | 2    |
| nw (literal next word)  | 1    |
| Small literal (inline)  | 0    |

### Basic Instruction Costs

| Opcode | Mnemonic | Base Cost |
|--------|----------|-----------|
| 0x01   | SET      | 1         |
| 0x02   | ADD      | 2         |
| 0x03   | SUB      | 2         |
| 0x04   | MUL      | 3         |
| 0x05   | MLI      | 3         |
| 0x06   | DIV      | 4         |
| 0x07   | DVI      | 4         |
| 0x08   | MOD      | 4         |
| 0x09   | MDI      | 4         |
| 0x0A   | AND      | 1         |
| 0x0B   | BOR      | 1         |
| 0x0C   | XOR      | 1         |
| 0x0D   | SHR      | 1         |
| 0x0E   | ASR      | 1         |
| 0x0F   | SHL      | 1         |
| 0x10–0x17 | IFx   | 2 (+1 if skipping) |
| 0x18   | ADC      | 2         |
| 0x19   | SBB      | 2         |
| 0x1A   | CMP      | 2         |
| 0x1B   | TST      | 1         |
| 0x1C   | FXMUL    | 4         |
| 0x1D   | FXDIV    | 8         |
| 0x1E   | LDB      | 2         |
| 0x1F   | *(reserved)* | 1     |

**Total cost** = Base + A_cost + B_cost, where A_cost and B_cost are the operand costs from §5.
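
As a worked illustration of the formula, the two tables can be turned into lookups. This sketch covers only a subset of the mnemonics, and the mode keys (`"reg"`, `"[nw]"`, and so on) are hypothetical labels invented for the example; it also ignores the skip penalty of the IFx instructions.

```python
# Operand costs keyed by a hypothetical label for each addressing mode.
OPERAND_COST = {
    "reg": 0,          # register direct
    "[reg]": 1,        # register indirect
    "[reg+nw]": 2,     # indexed
    "push/pop": 1,
    "peek": 1,
    "pick": 2,
    "special": 0,      # SP / PC / EX / FL
    "[nw]": 2,         # memory direct
    "nw": 1,           # literal next word
    "lit": 0,          # small inline literal
}

# Base costs for a few basic opcodes, copied from the table above.
BASE_COST = {"SET": 1, "ADD": 2, "SUB": 2, "MUL": 3, "DIV": 4,
             "CMP": 2, "FXMUL": 4, "FXDIV": 8, "LDB": 2}


def instruction_cost(mnemonic, b_mode, a_mode):
    """Total cost = Base + A_cost + B_cost."""
    return BASE_COST[mnemonic] + OPERAND_COST[a_mode] + OPERAND_COST[b_mode]


print(instruction_cost("SET", "[nw]", "reg"))           # 3
print(instruction_cost("ADD", "[reg+nw]", "[reg+nw]"))  # 6
```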

### Special Instruction Costs

| Sub  | Mnemonic | Cost |
|------|----------|------|
| 0x00 | NOP      | 1 (+A_cost, but conventionally a=0 so cost=1) |
| 0x01 | JSR      | 3 + A_cost |
| 0x02 | BSR      | 3 + A_cost |
| 0x03–0x0F | Jcc | 2 + A_cost (+1 if branch taken) |
| 0x10 | INT      | 4 + A_cost |
| 0x11 | IAG      | 1 + A_cost |
| 0x12 | IAS      | 1 + A_cost |
| 0x13 | RFI      | 3 |
| 0x14 | IAQ      | 1 + A_cost |
| 0x15–0x17 | *(reserved)* | 1 |
| 0x18 | NEG      | 1 + A_cost |
| 0x19 | NOT      | 1 + A_cost |
| 0x1A | SXB      | 1 + A_cost |
| 0x1B | SWP      | 1 + A_cost |
| 0x1C | BCOPY    | 2 + C (value of register C at invocation) |
| 0x1D | BRK      | 1 |
| 0x1E | HLT      | 1 |
| 0x1F | STB      | 2 + A_cost |

### Cycle Budget (Game-Layer Concept)

In the game, each CPU executes a fixed number of cycles per game tick. The base budget is **10,000 cycles per tick**. This can be increased by installing clock crystal upgrades:

| Component | Budget |
|-----------|--------|
| Standard CPU | 10,000 cycles/tick |
| +Clock Crystal Mk I | 15,000 cycles/tick |
| +Clock Crystal Mk II | 25,000 cycles/tick |
| +Overclocking Module | +50% (generates heat) |

Unspent cycles within a tick are lost (no banking). If the CPU exhausts its budget mid-instruction, the instruction completes but the next tick begins with 0 remaining.

---

## 18. Initial State & Boot Sequence

### CPU State at Reset

| Register | Value |
|----------|-------|
| A–J      | 0x0000 |
| PC       | 0x0000 |
| SP       | 0x0000 |
| EX       | 0x0000 |
| FL       | 0x0000 |
| IA       | 0x0000 |

- Interrupt queueing: disabled.
- Interrupt queue: empty.
- MPU: disabled (SYS_MPU_CTRL = 0).
- IQM: 0 (drop new interrupts on overflow).

### Memory State at Boot

- **Program ROM** is loaded into RAM starting at word address 0x0000. The ROM image is an array of 16-bit words copied verbatim.
- RAM not covered by the ROM image is initialized to 0x0000.
- MMIO registers are initialized to device-specific defaults.

### Boot Sequence

1. CPU state is set to reset values (above).
2. ROM image is copied into RAM.
3. Execution begins at PC = 0x0000.
4. The first instruction of the program runs.

There is no boot delay by default. In the game layer, a "boot time" may be imposed as the number of ticks the ROM copy takes (proportional to ROM size), during which the CPU-dependent systems are offline.

### First Instructions Convention

Programs SHOULD begin with:

```asm
; Set up stack pointer
SET SP, 0xDFFF     ; top of RAM (below MMIO region)

; Set up interrupt handler (if using interrupts)
IAS interrupt_handler

; ... initialization code ...
```

This is a convention, not a hardware requirement. The CPU will execute whatever is at 0x0000.

---

## 19. Undefined Behavior & Edge Cases

### Division by Zero

All divide instructions (DIV, DVI, MOD, MDI, FXDIV): if divisor = 0, result = 0, EX = 0. No fault is generated. FL is updated with Z=1, C=0, S=0, O=0.

### Shift by Zero

If the shift amount is 0, the value is unchanged. C flag is set to 0 (no bit was shifted out). Z and S reflect the unchanged value. O is cleared.

### Shift by 16 or More

If the shift amount is ≥ 16:
- **SHL:** result = 0. EX = (depends on amount; typically 0 for amounts ≥ 32). C = 0 (no bit to shift "last").
- **SHR:** result = 0. EX = (depends on amount; typically 0 for amounts ≥ 32). C = 0.
- **ASR:** result = 0x0000 (if original value was positive) or 0xFFFF (if negative). C = sign bit of original value.

### FXMUL/FXDIV Overflow

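If the result of FXMUL exceeds the 8.8 signed range (−128.0 to +127.996), the result wraps (truncated to 16 bits). The O flag is set to indicate overflow. FXDIV can also exceed the range: dividing by a value whose magnitude is below 1.0 enlarges the quotient (for example, 100.0 / 0.5 = 200.0), and dividing the minimum value by −1.0 wraps. In both cases the quotient is truncated to 16 bits, and because FXDIV always clears O (§12), software must check for out-of-range quotients itself.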

### Self-Modifying Code

Fully supported. Writes to RAM in the region being executed take effect immediately for subsequent instruction fetches. There is no instruction cache.

### Stack Overflow/Underflow

SP wraps around (0x0000 − 1 = 0xFFFF). There is no hardware detection. If SP enters the MMIO region (above 0xDFFF), stack operations will read/write device registers, which is almost certainly a bug. The MPU (§16) can be configured to detect stack operations outside a defined region.

### PC Overflow

If PC increments past 0xFFFF, it wraps to 0x0000. If PC enters the MMIO region, the CPU will attempt to execute MMIO register values as instructions, which produces unpredictable results.

### Writing to Read-Only MMIO

Writes to read-only MMIO registers are silently discarded.

### Accessing Uninstalled RAM

Reads from addresses beyond installed RAM return 0x0000. Writes are silently discarded.

### Concurrent MMIO Access

If BCOPY reads from or writes to MMIO during the same tick that a hardware device is updating those registers, behavior is device-specific. Most devices guarantee coherent word-level access but not multi-word atomicity.

### Byte Addressing Above Word 0x7FFF

LDB and STB only address words 0x0000–0x7FFF via 16-bit byte addresses. There is no mechanism to byte-address higher memory via these instructions. The byte address space does not wrap: byte address 0xFFFF maps to word 0x7FFF (low byte), and there is no byte address that maps to word 0x8000.

### NEG of 0x8000

Negating the minimum signed value (0x8000 = −32768) produces 0x8000 (the result wraps). The O (overflow) flag is set. Z = 0, C = 1 (borrow from zero), S = 1.

---

## 20. Assembler Conventions

Conforming assemblers for the HCPU-16 Mk II SHOULD support the following:

### Labels

```asm
:label_name
loop:               ; alternate syntax
```

### Numeric Literals

```asm
SET A, 42           ; decimal
SET A, 0x2A         ; hexadecimal
SET A, 0b101010     ; binary
SET A, 0o52         ; octal
SET A, 'A'          ; character literal (0x0041)
```

### Directives

```asm
.dat 0x1234, 0x5678         ; embed raw data words
.dat "Hello", 0             ; embed string as words (one char per word, null-terminated)
.datb "Hello", 0            ; embed string as packed bytes (two chars per word)
.org 0x1000                 ; set assembly origin
.fill 128, 0                ; fill 128 words with 0
.equ SCREEN, 0xE100         ; define constant
.include "lib.asm"          ; include source file
.reserve 64                 ; reserve 64 words of uninitialized space
```

### Register Aliases

Assemblers SHOULD allow defining register aliases:

```asm
.alias frame_ptr, J         ; J is used as frame pointer
SET frame_ptr, SP
```

### Instruction Aliases

| Alias | Expansion | Notes |
|-------|-----------|-------|
| `JMP a` | `SET PC, a` | Unconditional jump |
| `RET` | `SET PC, POP` | Return from subroutine |
| `PSH a` | `SET PUSH, a` | Push value |
| `POP a` | `SET a, POP` | Pop value |
| `NOP` | word 0x0000 | No operation |

### Conditional Jump Aliases

| Alias | Expansion | After CMP |
|-------|-----------|-----------|
| `JE a`  | `JZ a`  | Jump if equal |
| `JNE a` | `JNZ a` | Jump if not equal |
| `JB a`  | `JC a`  | Jump if below (unsigned) |
| `JAE a` | `JNC a` | Jump if above or equal (unsigned) |

### Notes on STB Syntax

`STB` is a special opcode with one explicit operand (the byte address). Register A is the implicit source. Assemblers SHOULD accept:

```asm
STB 0x100          ; store low byte of A at byte address 0x100
STB X              ; store low byte of A at byte address held in X
STB [X + 5]        ; store low byte of A at the byte address read from MEM[X + 5]
```

There is no two-operand form of STB. The value to store must be placed in register A before calling STB.

### Calling Convention (Recommended)

This is a suggested calling convention, not a hardware requirement:

- **Arguments:** First 4 in A, B, C, X. Additional arguments pushed right-to-left.
- **Return value:** A (or A:B for 32-bit).
- **Caller-saved:** A, B, C, X, Y, Z (caller preserves if needed).
- **Callee-saved:** I, J, SP.
- **Frame pointer:** J (optional; can be omitted for leaf functions).
- **Stack frame:**

```
[J + 2..]   ; arguments beyond the 4th (pushed first, highest addresses)
[J + 1]     ; return address (pushed by JSR)
[J + 0]     ; saved J (caller's frame pointer)
[J - 1..]   ; local variables (the stack grows downward)
```

---

## 21. Hardware Device Slots

This section lists the **standard hardware devices** defined alongside the HCPU-16 Mk II architecture. Each device is described in its own companion specification. This section provides only an overview and slot assignment conventions.

### Standard Devices

| Slot | Device | Device ID (DEV_ID:DEV_ID_HI) | Description |
|------|--------|-------------------------------|-------------|
| 0    | System Control | 0x4802:0x0000 | Always present. See §15. |
| 1    | LEM3200 Display | 0xF615:0x734D | 256×192 pixel display with tile/sprite layers. |
| 2    | Keyboard Controller | 0x7406:0x30CF | Keyboard input buffer and key-state map. |
| 3    | Clock | 0xB402:0x12D0 | Configurable timer that fires interrupts. |
| 4    | Floppy Drive | 0x24C5:0x4FD5 | 1440-word floppy disk read/write. |
| 5    | NIC | 0xA465:0x7EC3 | Network interface card. Raw packet send/receive. |
| 6    | Sensor Array | 0xC0DE:0x53E4 | Environmental sensor data. |
| 7    | Navigation Unit | 0x5600:0x4E41 | Position, velocity, orientation. |
| 8–15 | *(expansion)* | — | Available for additional devices. |
| 16   | Math Coprocessor | 0x5448:0x4D41 | Trig lookup tables and fast sqrt. |
| 17   | Crypto Coprocessor | 0x5950:0x4352 | Hash and cipher acceleration (rare loot). |
| 18   | Debug Probe | 0x4700:0x4442 | Breakpoint and single-step support. |
| 19–31 | *(expansion)* | — | Available for game-specific or mod devices. |

### Device Specification Format

Each device specification (published separately) documents:

1. **Device ID** (DEV_ID and DEV_ID_HI register values).
2. **MMIO register map** (offsets within the 256-word slot).
3. **Interrupt messages** the device may generate.
4. **Cycle costs** for device operations.
5. **Burst support** (whether BCOPY can efficiently read/write the device's registers).
6. **Behavioral notes** (timing, buffering, error conditions).

---

## 22. Revision History

| Version | Date | Changes |
|---------|------|---------|
| 0.1-DRAFT | 2026-03-17 | Initial specification (as DCPU-16 Mk II). |
| 0.2-DRAFT | 2026-03-17 | Renamed to HCPU-16 Mk II. Removed HWN/HWQ/HWI (MMIO is sole hardware interface). STB moved from basic opcode to special opcode 0x1F (implicit A source). Byte addressing simplified: 16-bit addresses covering words 0x0000–0x7FFF only, no extended mode. Fixed-point section rewritten as clean 8.8 format. Added SYS_HWCOUNT to system control block. Added device enumeration MMIO example. Added hello world in Appendix B. Added Quick Reference Card as Appendix C. |
| 0.3-DRAFT | 2026-03-17 | Minor polish pass: added future-proofing note (§1), STB operand clarification (§5), wording fix in STB description (§7), enhanced STB assembler examples (§20), new encoding example in Appendix B. |

---

## Appendix A: Opcode Map (Quick Reference)

### Basic Opcodes (o field, 5 bits)

```
0x00  (special)    0x08  MOD     0x10  IFB     0x18  ADC
0x01  SET          0x09  MDI     0x11  IFC     0x19  SBB
0x02  ADD          0x0A  AND     0x12  IFE     0x1A  CMP
0x03  SUB          0x0B  BOR     0x13  IFN     0x1B  TST
0x04  MUL          0x0C  XOR     0x14  IFG     0x1C  FXMUL
0x05  MLI          0x0D  SHR     0x15  IFA     0x1D  FXDIV
0x06  DIV          0x0E  ASR     0x16  IFL     0x1E  LDB
0x07  DVI          0x0F  SHL     0x17  IFU     0x1F  (reserved)
```

### Special Opcodes (b field when o=0x00, 5 bits)

```
0x00  NOP          0x08  JNS     0x10  INT     0x18  NEG
0x01  JSR          0x09  JO      0x11  IAG     0x19  NOT
0x02  BSR          0x0A  JA      0x12  IAS     0x1A  SXB
0x03  JZ           0x0B  JBE     0x13  RFI     0x1B  SWP
0x04  JNZ          0x0C  JGE     0x14  IAQ     0x1C  BCOPY
0x05  JC           0x0D  JL      0x15  (reserved)  0x1D  BRK
0x06  JNC          0x0E  JG      0x16  (reserved)  0x1E  HLT
0x07  JS           0x0F  JLE     0x17  (reserved)  0x1F  STB
```

### Operand Value Quick Reference

```
a field (6 bits):                    b field (5 bits):
0x00–0x07  register A–J             0x00–0x07  register A–J
0x08–0x0F  [register]               0x08–0x0F  [register]
0x10–0x17  [register + nw]          0x10–0x17  [register + nw]
0x18       POP [SP++]               0x18       PUSH [--SP]
0x19       PEEK [SP]                0x19       PEEK [SP]
0x1A       PICK [SP + nw]           0x1A       PICK [SP + nw]
0x1B       SP                       0x1B       SP
0x1C       PC                       0x1C       PC
0x1D       EX                       0x1D       EX
0x1E       [nw]                     0x1E       [nw]
0x1F       nw (literal)             0x1F       FL
0x20       FL
0x21–0x3F  literal -1 to 29
```
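
The field values above combine into an instruction word as sketched below. The helper names are hypothetical; the field positions (a in bits 15–10, b in bits 9–5, o in bits 4–0) are taken from the shifts used in the Appendix B examples, and the literal bias of 0x22 follows from the `0x21–0x3F` row.

```python
def encode_basic(o, b, a):
    """Pack a basic instruction word from its o (5-bit), b (5-bit), a (6-bit) fields."""
    return ((a & 0x3F) << 10) | ((b & 0x1F) << 5) | (o & 0x1F)


def encode_special(sub, a):
    """Pack a special instruction: o = 0x00, with the sub-opcode in the b field."""
    return encode_basic(0x00, sub, a)


def inline_literal(value):
    """Return the a-field code for a small literal in the range -1..29."""
    if not -1 <= value <= 29:
        raise ValueError("inline literals cover -1..29 only")
    return value + 0x22


print(hex(encode_basic(0x01, 0x00, inline_literal(10))))   # 0xb001  (SET A, 10)
print(hex(encode_special(0x1F, 0x03)))                     # 0xfe0   (STB X)
```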

---

## Appendix B: Instruction Encoding Examples

### Example 1: `SET A, 10`

```
a = 10 (decimal) → inline literal: 10 + 0x22 = 0x2C → a field = 0x2C
b = A → b field = 0x00
o = SET → o field = 0x01

Instruction word: 0x2C << 10 | 0x00 << 5 | 0x01 = 0xB001
Binary: 1011000000000001
Size: 1 word. Cost: 1 cycle.
```

### Example 2: `SET [0x1000], A`

```
a = A → a field = 0x00
b = [next word] → b field = 0x1E, next word = 0x1000
o = SET → o field = 0x01

Instruction word: 0x00 << 10 | 0x1E << 5 | 0x01 = 0x03C1
Next word: 0x1000
Binary: 0000001111000001  0001000000000000
Size: 2 words. Cost: 1 + 0 + 2 = 3 cycles.
```

### Example 3: `ADD [X+5], [Y+3]`

```
a = [Y + nw] → a field = 0x14, next word a = 3
b = [X + nw] → b field = 0x13, next word b = 5
o = ADD → o field = 0x02

Instruction word: 0x14 << 10 | 0x13 << 5 | 0x02 = 0x5262
Next word (a): 0x0003
Next word (b): 0x0005
Size: 3 words. Cost: 2 + 2 + 2 = 6 cycles.
```

### Example 4: `JSR my_function` (where my_function = 0x0200)

```
This is a special instruction: o = 0x00, sub (b field) = 0x01
a = next word literal → a field = 0x1F, next word = 0x0200

Instruction word: 0x1F << 10 | 0x01 << 5 | 0x00 = 0x7C20
Next word: 0x0200
Size: 2 words. Cost: 3 + 1 = 4 cycles.
```

### Example 5: `CMP A, 0` followed by `JZ label` (where label = 0x0300)

```
CMP A, 0:
  a = literal 0 → inline: 0 + 0x22 = 0x22 → a field = 0x22
  b = A → b field = 0x00
  o = CMP → o field = 0x1A
  Instruction word: 0x22 << 10 | 0x00 << 5 | 0x1A = 0x881A
  Size: 1 word. Cost: 2 cycles.

JZ 0x0300:
  o = 0x00, sub = 0x03
  a = next word literal → a field = 0x1F, next word = 0x0300
  Instruction word: 0x1F << 10 | 0x03 << 5 | 0x00 = 0x7C60
  Next word: 0x0300
  Size: 2 words. Cost: 2 + 1 = 3 cycles if not taken, 4 cycles if taken.
```
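
Example 5 works because CMP sets the flags from b - a without storing the result, and JZ then tests Z. A rough Python model of that interaction is sketched below. It assumes the flag bit positions from the Appendix C card (Z = bit 0, C = bit 1, S = bit 2, O = bit 3), with C acting as the borrow on subtraction. `cmp_flags` and `jz_taken` are hypothetical names, and Section 8 remains the authority on flag behavior.

```python
Z, C, S, O = 0x1, 0x2, 0x4, 0x8  # flag bit masks (assumed from Appendix C)


def cmp_flags(b: int, a: int) -> int:
    """Model FL after CMP b, a for 16-bit operands (b itself is unchanged)."""
    b &= 0xFFFF
    a &= 0xFFFF
    result = (b - a) & 0xFFFF
    fl = 0
    if result == 0:
        fl |= Z
    if b < a:  # unsigned borrow
        fl |= C
    if result & 0x8000:  # bit 15 of the result
        fl |= S
    # Signed overflow: operands differ in sign and the result's sign differs from b's.
    if (b ^ a) & (b ^ result) & 0x8000:
        fl |= O
    return fl


def jz_taken(fl: int) -> bool:
    """JZ jumps when Z = 1."""
    return bool(fl & Z)


print(jz_taken(cmp_flags(0, 0)))  # A = 0: Z is set, so JZ label is taken
print(jz_taken(cmp_flags(5, 0)))  # A = 5: Z is clear, so execution falls through
```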

### Example 6: `STB X` (store low byte of A at byte address in X)

```
This is a special instruction: o = 0x00, sub (b field) = 0x1F
a = X → a field = 0x03

Instruction word: 0x03 << 10 | 0x1F << 5 | 0x00 = 0x0FE0
Size: 1 word. Cost: 2 + 0 = 2 cycles.
```

### Example 7: `STB 0x0100` (store low byte of A at literal byte address)

```
This is a special instruction: o = 0x00, sub (b field) = 0x1F
a = next word literal → a field = 0x1F, next word = 0x0100

Instruction word: 0x1F << 10 | 0x1F << 5 | 0x00 = 0x7FE0
Next word: 0x0100
Size: 2 words. Cost: 2 + 1 = 3 cycles.
```

### Example 8: `BCOPY` (with registers pre-loaded)

```
SET A, 0x1000      ; source
SET B, 0x2000      ; destination
SET C, 64          ; count
BCOPY              ; o=0, b=0x1C, a=0x00 (ignored)

Instruction word (BCOPY): 0x00 << 10 | 0x1C << 5 | 0x00 = 0x0380
Size: 1 word. Cost: 2 + 64 = 66 cycles (BCOPY only, with C = 64).
```
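
Examples 4 through 8 all use the special-instruction form, where the o field is 0x00 and the b field carries the sub-opcode. The Python sketch below packs that form and reproduces the instruction words shown in those examples. `encode_special` is a hypothetical helper name, and the sub-opcode values are the ones used in the examples.

```python
def encode_special(sub: int, a: int = 0x00) -> int:
    """Pack a special instruction: a (6 bits) | sub-opcode (5 bits) | o = 0x00."""
    if not (0 <= a <= 0x3F and 0 <= sub <= 0x1F):
        raise ValueError("field out of range")
    return (a << 10) | (sub << 5)


NEXT_WORD_LITERAL = 0x1F  # a field: literal value follows in the next word
REG_X = 0x03              # a field: register X

examples = {
    "JSR 0x0200": encode_special(0x01, NEXT_WORD_LITERAL),
    "JZ 0x0300": encode_special(0x03, NEXT_WORD_LITERAL),
    "STB X": encode_special(0x1F, REG_X),
    "STB 0x0100": encode_special(0x1F, NEXT_WORD_LITERAL),
    "BCOPY": encode_special(0x1C),  # a field is ignored, encoded as 0x00
}
for text, word in examples.items():
    print(f"{text:<12} 0x{word:04X}")
```

The literal target or address itself is not part of the instruction word; it occupies the following word whenever the a field is 0x1F.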

### Example 9: `STB` + `BCOPY` combined pattern (common in string-to-display code)

```asm
SET A, 'X'         ; character to store
STB 0x100          ; store at byte address 0x100 (special: o=0, b=0x1F, a=0x1F + next word)

; ... later ...
SET A, 0x1000      ; source
SET B, 0xE180      ; destination (MMIO display buffer)
SET C, 32          ; 32 words
BCOPY              ; block-copy the string to the display (2 + 32 = 34 cycles)
```
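
When reading a memory dump, the encodings in this appendix can be reversed by masking out the three fields. The following Python sketch shows one way to do this, assuming the bit layout implied by the shifts in the examples above. `decode_word` and `Fields` are hypothetical names, and the function does not fetch or interpret any next words.

```python
from typing import NamedTuple


class Fields(NamedTuple):
    a: int         # bits 15-10: source operand
    b: int         # bits 9-5: destination operand, or sub-opcode when o == 0
    o: int         # bits 4-0: basic opcode, 0x00 for special instructions
    special: bool  # True when the word uses the special-instruction form


def decode_word(word: int) -> Fields:
    """Split a 16-bit instruction word into its a, b, and o fields."""
    if not 0 <= word <= 0xFFFF:
        raise ValueError("instruction words are 16 bits")
    o = word & 0x1F
    return Fields(a=(word >> 10) & 0x3F, b=(word >> 5) & 0x1F, o=o, special=(o == 0))


f = decode_word(0x5262)  # Example 3: ADD [X+5], [Y+3]
print(hex(f.a), hex(f.b), hex(f.o), f.special)  # prints 0x14 0x13 0x2 False
```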

---

## Appendix C: Quick Reference Card

*Intended for printing on a single sheet. Covers all opcodes, operand encoding, and flag behavior.*

### Basic Opcodes (two operands: b, a)

```
Hex  Mnem   Description                          Base  FL
───  ─────  ─────────────────────────────────────  ────  ────
01   SET    b = a                                 1     -
02   ADD    b = b + a           (EX = carry)      2     ZCSO
03   SUB    b = b - a           (EX = borrow)     2     ZCSO
04   MUL    b = b × a unsigned  (EX = high)       3     ZCS-
05   MLI    b = b × a signed    (EX = high)       3     ZCSO
06   DIV    b = b / a unsigned  (EX = frac)       4     Z-S-
07   DVI    b = b / a signed    (EX = frac)       4     Z-S-
08   MOD    b = b % a unsigned                    4     Z-S-
09   MDI    b = b % a signed                      4     Z-S-
0A   AND    b = b & a                             1     Z-S-
0B   BOR    b = b | a                             1     Z-S-
0C   XOR    b = b ^ a                             1     Z-S-
0D   SHR    b = b >>> a         (EX = shifted)    1     ZCS-
0E   ASR    b = b >> a signed   (EX = shifted)    1     ZCS-
0F   SHL    b = b << a          (EX = shifted)    1     ZCS-
10   IFB    skip unless (b & a) ≠ 0               2+1   -
11   IFC    skip unless (b & a) = 0               2+1   -
12   IFE    skip unless b = a                     2+1   -
13   IFN    skip unless b ≠ a                     2+1   -
14   IFG    skip unless b > a unsigned            2+1   -
15   IFA    skip unless b > a signed              2+1   -
16   IFL    skip unless b < a unsigned            2+1   -
17   IFU    skip unless b < a signed              2+1   -
18   ADC    b = b + a + C       (EX = carry)      2     ZCSO
19   SBB    b = b - a - C       (EX = borrow)     2     ZCSO
1A   CMP    FL ← b - a          (b unchanged)     2     ZCSO
1B   TST    FL ← b & a          (b unchanged)     1     Z-S-
1C   FXMUL  b = (b×a)>>8 8.8fp  (EX = low bits)   4     ZCSO
1D   FXDIV  b = (b<<8)/a 8.8fp  (EX = remainder)  8     Z-S-
1E   LDB    b = byte at addr a  (zero-extended)   2     Z-S-
1F   ---    (reserved)                            1     -
```

### Special Opcodes (one operand: a)

```
Hex  Mnem   Description                          Base  FL
───  ─────  ─────────────────────────────────────  ────  ────
00   NOP    no operation                          1     -
01   JSR    push PC, PC = a                       3     -
02   BSR    push PC, PC += signed(a)              3     -
03   JZ     jump if Z=1                           2+1   -
04   JNZ    jump if Z=0                           2+1   -
05   JC     jump if C=1                           2+1   -
06   JNC    jump if C=0                           2+1   -
07   JS     jump if S=1                           2+1   -
08   JNS    jump if S=0                           2+1   -
09   JO     jump if O=1                           2+1   -
0A   JA     jump if C=0 and Z=0                   2+1   -
0B   JBE    jump if C=1 or Z=1                    2+1   -
0C   JGE    jump if S=O                           2+1   -
0D   JL     jump if S≠O                           2+1   -
0E   JG     jump if Z=0 and S=O                   2+1   -
0F   JLE    jump if Z=1 or S≠O                    2+1   -
10   INT    software interrupt, message = a       4     -
11   IAG    a = IA                                1     -
12   IAS    IA = a                                1     -
13   RFI    pop A, pop PC, re-enable interrupts   3     -
14   IAQ    if a≠0 queue interrupts               1     -
15   ---    (reserved)                            1     -
16   ---    (reserved)                            1     -
17   ---    (reserved)                            1     -
18   NEG    a = -a                                1     ZCSO
19   NOT    a = ~a                                1     Z-S-
1A   SXB    a = sign_extend(a & 0xFF)             1     Z-S-
1B   SWP    a = swap_bytes(a)                     1     -
1C   BCOPY  copy C words from [A] to [B]          2+C   -
1D   BRK    breakpoint (NOP if no debugger)       1     -
1E   HLT    halt until interrupt                  1     -
1F   STB    byte at addr a = low byte of A        2     -
```

### FL Flags (bits 3–0)

```
Bit 0: Z (Zero)      result = 0
Bit 1: C (Carry)     unsigned carry/borrow, or last bit shifted out
Bit 2: S (Sign)      bit 15 of result
Bit 3: O (Overflow)  signed overflow

FL column key:  Z = updates Z    C = updates C
                S = updates S    O = updates O
                - = not updated from the result (cleared or left unchanged; see Section 8)
```

### Operand Encoding

```
       a (6-bit, source)              b (5-bit, destination)
Value  Meaning          Cost  NW?    Value  Meaning          Cost  NW?
─────  ───────────────  ────  ───    ─────  ───────────────  ────  ───
00-07  register A–J      0    no     00-07  register A–J      0    no
08-0F  [register]        1    no     08-0F  [register]        1    no
10-17  [register + nw]   2    yes    10-17  [register + nw]   2    yes
  18   POP [SP++]        1    no       18   PUSH [--SP]       1    no
  19   PEEK [SP]         1    no       19   PEEK [SP]         1    no
  1A   PICK [SP+nw]      2    yes      1A   PICK [SP+nw]      2    yes
  1B   SP                0    no       1B   SP                0    no
  1C   PC                0    no       1C   PC                0    no
  1D   EX                0    no       1D   EX                0    no
  1E   [nw]              2    yes      1E   [nw]              2    yes
  1F   nw (literal)      1    yes      1F   FL                0    no
  20   FL                0    no
21-3F  literal -1..29    0    no
```

### CMP + Jcc Quick Reference

```
After CMP b, a:
  Unsigned:  b=a → JZ    b≠a → JNZ   b<a → JC    b≥a → JNC   b>a → JA    b≤a → JBE
  Signed:    b=a → JZ    b≠a → JNZ   b<a → JL    b≥a → JGE   b>a → JG    b≤a → JLE
```

### Memory Map

```
0x0000–0x7FFF   RAM (byte-addressable via LDB/STB)
0x8000–0xDFFF   RAM (word-addressed only, if installed)
0xE000–0xFFFF   MMIO (32 slots × 256 words)
```

---