Changes to Recovery Flow

Comparing version 2.1 to 2.0
+215 additions -36 deletions
@@ -1,5 +1,5 @@
11 <div style="font-size: 0.85em; color: #656d76; margin-bottom: 1em; padding: 0.5em; background: #f6f8fa; border-radius: 4px;">
2-📄 Source: <a href="https://github.com/chipsalliance/i3c-core/blob/aae3424a8ecbd4edb7a60e23f76421de2d891712/doc/source/recovery_flow.md" target="_blank">chipsalliance/i3c-core/doc/source/recovery_flow.md</a> @ <code>aae3424</code>
2+📄 Source: <a href="https://github.com/chipsalliance/i3c-core/blob/bb79ebd9b487c61cd1bea1aec2574ae4740f104d/doc/source/recovery_flow.md" target="_blank">chipsalliance/i3c-core/doc/source/recovery_flow.md</a> @ <code>bb79ebd</code>
33 </div>
44
55 # Recovery flow
@@ -10,26 +10,171 @@
1010 In order to facilitate this process, the I3C Core implements CSRs as specified in the [Secure Firmware Recovery Interface](ext_cap.md#secure-firmware-recovery-interface---0xc0).
1111 The firmware is responsible for implementing the recovery flow and transferring firmware data to the program memory.
1212
13-The recovery flow adheres to the following steps:
14-
15-1. Upon reset, the hardware sets the FIFO size and region type in the `INDIRECT_FIFO_STATUS` CSR
16-1. The device's firmware configures the I3C core and sets the appropriate bits in the `PROT_CAP` CSR to indicate its recovery capabilities
17- These must include the mandatory ones:
18- - bit 0 (`DEVICE_ID`)
19- - bit 4 (`DEVICE_STATUS`)
20- - bit 6 (`Local C-image support`) or bit 7 (`Push C-image support`)
21- - bit 5 (`INDIRECT_CTRL`), only if bit 7 is set
22-1. Upon request for recovery mode entry, the firmware writes `0x3` (Recovery mode) to `DEVICE_STATUS`
23-1. The Recovery Initiator writes to `INDIRECT_FIFO_CTRL` to inform the Recovery handler about the image size
24- - Component Memory Space (`CMS`) field is set to `0`
25-1. The Recovery Initiator writes a data chunk to the receive FIFO via the `INDIRECT_FIFO_DATA` CSR.
26- The I3C core responds with a NACK when the FIFO is full.
27-1. The Recovery Handler updates FIFO pointers (Read Index and Write Index) presented in the `INDIRECT_FIFO_STATUS` CSR
28-1. The device's firmware reacts to a signal that a data chunk has been written to the FIFO by polling the `INDIRECT_FIFO_STATUS` CSR
29-1. The device's firmware reads the data chunk from the FIFO and stores it in an appropriate location in the memory
30-1. Steps 5 to 8 are repeated until the Recovery handler detects that the whole firmware image has been transmitted
31-1. The device's firmware polls the `RECOVERY_CONTROL` register until it receives the "Activate image" command from the Recovery Initiator
32-1. The device's firmware updates the `RECOVERY_STATUS` CSR to indicate that the uploaded firmware is being booted
13+```{note}
14+For the alternative AXI-based recovery flow, see [axi_recovery_flow](axi_recovery_flow.md).
15+```
16+
17+## Recovery Flow Actors
18+
19+| Actor | Interface | Role |
20+| ------- | ----------- | ------ |
21+| **Recovery Initiator** | I3C bus (virtual target addr) | BMC or Controller that sends the firmware image |
22+| **I3C Core** | Recovery Handler HW | Translates I3C commands to recovery CSR accesses |
23+| **Device Firmware** | AXI bus | Reads FIFO data, manages recovery state, boots image |
24+
25+
26+## Recovery Flow Timeline
27+
28+The diagram below shows the interleaved operations between the three actors.
29+All CSR accesses from the Recovery Initiator use I3C private transfers to the
30+virtual target address. Device Firmware accesses CSRs over the AXI bus.
31+
32+```
33+ Recovery Initiator I3C Core (Recovery CSRs) Device Firmware
34+ (I3C Bus) ------------------------ ---------------
35+ | | |
36+ | | 1. Configure I3C core
37+ | | Set PROT_CAP bits:
38+ | | bit 0 (DEVICE_ID)
39+ | | bit 4 (DEVICE_STATUS)
40+ | | bit 5 (INDIRECT_CTRL)
41+ | | bit 7 (Push C-image)
42+ | | |
43+ | | 2. Write DEVICE_STATUS
44+ | | .DEV_STATUS = 0x3
45+ | | (enter recovery mode)
46+ | | |
47+ | | 3. Write RECOVERY_STATUS
48+ | | .DEV_REC_STATUS = 0x1
49+ | | .REC_IMG_INDEX = <stage>
50+ | | (awaiting image)
51+ | | |
52+ | | 3a. Wait for payload_available_o
53+ | | to deassert (no-op on first
54+ | | stage; on subsequent stages
55+ | | waits for Initiator to complete
56+ | | prior image handshake; in AXI
57+ | | bypass mode waits for Image
58+ | | Provider to clear
59+ | | REC_PAYLOAD_DONE at step 17)
60+ | | |
61+ 4. Read PROT_CAP | |
62+ (verify recovery caps) | |
63+ | | |
64+ 5. Read DEVICE_ID | |
65+ (identify device) | |
66+ | | |
67+ 6. Poll DEVICE_STATUS | |
68+ until DEV_STATUS == 0x3 | |
69+ | | |
70+ 7. Read RECOVERY_STATUS | |
71+ confirm DEV_REC_STATUS == 0x1 | |
72+ | | |
73+ 8. Write RECOVERY_CTRL | |
74+ .CMS = 0, .REC_IMG_SEL = 0x1 | |
75+ | | |
76+ 9. Write INDIRECT_FIFO_CTRL | |
77+ .RESET = 1 (reset FIFO) | |
78+ .IMAGE_SIZE = size in 4B units | |
79+ | | |
80+ v | |
81+ ,----------------------------. | |
82+ | 10. DATA TRANSFER LOOP | | |
83+ | (Initiator writes, | | |
84+ | FW reads) | | |
85+ `----------------------------' | |
86+ | | |
87+ | I3C Write INDIRECT_FIFO ----->| INDIRECT_FIFO |
88+ | _DATA (chunk + PEC) | | |
89+ | | `-----> Read by |
90+ | | Device FW |
91+ | | |
92+ | I3C NACKs when FIFO full | DMA triggered by |
93+ | (flow control via NACK) | payload_available_o |
94+ | | (reads FIFO_SIZE or |
95+ | | remaining bytes) |
96+ | | |
97+ | ... repeat until all data transferred ... |
98+ | | |
99+ | | 11. Write DEVICE_STATUS
100+ | | .DEV_STATUS = 0x4
101+ | | (recovery pending)
102+ | | |
103+ | | 12. Poll RECOVERY_CTRL
104+ | | .ACTIVATE_REC_IMG
105+ | | until == 0xF
106+ | | |
107+ 13. Write RECOVERY_CTRL | |
108+ .ACTIVATE_REC_IMG = 0xF | |
109+ | | |
110+ | | 14. Write DEV_REC_STATUS = 0x2
111+ | | (processing image)
112+ | | |
113+ 15. Poll DEVICE_STATUS.DEV_STATUS | |
114+ Wait while DEV_STATUS == 0x4 | |
115+ (Device FW is validating the image) | |
116+ | | |
117+ | | 16. Validate image
118+ | | (crypto verification)
119+ | | |
120+ | | [Intermediate, on success]
121+ | | Reset INDIRECT_FIFO_CTRL
122+ | | -> loop back to step 2
123+ | | (with next REC_IMG_INDEX)
124+ | | |
125+ | | [On error (any stage)]
126+ | | Write DEV_REC_STATUS >= 0xC
127+ | | Write DEV_STATUS = 0xF
128+ | | |
129+ | | [Final stage, on success]
130+ | | Write DEV_REC_STATUS = 0x3
131+ | | Write DEV_STATUS = 0x1
132+ | | (recovery successful,
133+ | | device healthy)
134+ | | |
135+ 17. DEV_STATUS changed from 0x4: | |
136+ 0x3 = next stage ready | |
137+ -> go back to step 6 | |
138+ 0x1 = recovery complete | |
139+ -> exit | |
140+ 0xF = error | |
141+ -> abort recovery | |
142+ | | |
143+```
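
The Initiator's decision at step 17 can be sketched as a small lookup on `DEV_STATUS`. This is a minimal Python illustration, not code from the I3C Core repository: the `DEV_STATUS` values come from the timeline, while the constant names, `NextAction`, and `initiator_next_action` are hypothetical. Treating any undocumented value as an abort is also an assumption of the sketch.

```python
from enum import Enum

# DEV_STATUS values as used in the timeline; the constant names are illustrative.
DEV_STATUS_HEALTHY = 0x1           # recovery complete, device healthy
DEV_STATUS_RECOVERY_MODE = 0x3     # device is ready for the next stage
DEV_STATUS_RECOVERY_PENDING = 0x4  # Device FW is still validating the image
DEV_STATUS_ERROR = 0xF             # error reported by Device FW


class NextAction(Enum):
    """Hypothetical names for the Initiator's options at steps 15 and 17."""
    KEEP_POLLING = "keep polling"
    NEXT_STAGE = "go back to step 6"
    DONE = "exit"
    ABORT = "abort recovery"


def initiator_next_action(dev_status: int) -> NextAction:
    """Map a DEV_STATUS reading to the Initiator's next action."""
    if dev_status == DEV_STATUS_RECOVERY_PENDING:
        return NextAction.KEEP_POLLING
    if dev_status == DEV_STATUS_RECOVERY_MODE:
        return NextAction.NEXT_STAGE
    if dev_status == DEV_STATUS_HEALTHY:
        return NextAction.DONE
    # 0xF is the documented error value. Aborting on any other
    # unexpected value is an assumption made by this sketch.
    return NextAction.ABORT
```

An Initiator would call this after every `DEVICE_STATUS` read in step 15 and leave the polling loop as soon as the result is not `KEEP_POLLING`.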
144+
145+### Key Differences from AXI Bypass Flow
146+
147+The Device Firmware side of the flow is identical regardless of transport.
148+The only differences are on the data provider (Initiator / Image Provider) side:
149+
150+| Aspect | I3C Recovery Flow | AXI Bypass Flow |
151+| -------- | ------------------- | ----------------- |
152+| Data path | I3C bus to virtual target | AXI bus direct to FIFO |
153+| Initiator | External BMC/Controller | Internal Image Provider |
154+| PEC checksum | Required on every I3C transfer | Not used |
155+| FIFO full handling | I3C core NACKs the write | Image Provider polls FIFO status |
156+| CSR access | I3C private write/read commands | Direct AXI register access |
157+| Activation | Initiator writes RECOVERY_CTRL directly | Image Provider writes via W1C register |
158+| `REC_PAYLOAD_DONE` | Not used (HW manages `payload_available_o`) | Image Provider sets/clears to signal last chunk |
159+
160+
161+```{important}
162+**Inter-stage synchronization**: In multi-image recovery flows, Device
163+Firmware must wait for `payload_available_o` to deassert after writing
164+`RECOVERY_STATUS` (step 3) and before polling for new payload. In AXI
165+bypass mode, `payload_available_o` can remain asserted between stages
166+because the Image Provider's `REC_PAYLOAD_DONE` from the prior stage is
167+still set. The Image Provider clears `REC_PAYLOAD_DONE` when it enters
168+the next stage (step 17 in the AXI bypass flow). Without this wait,
169+Device Firmware may read stale `IMAGE_SIZE` from the prior stage and
170+set up DMA transfers with the wrong byte count.
171+```
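
The wait described in the note above amounts to a bounded polling loop. The following Python sketch only illustrates the control flow: real Device Firmware would not be written in Python, `wait_for_payload_deassert` is a hypothetical name, and the `payload_available` callable stands in for whatever mechanism the firmware uses to sample `payload_available_o`. The timeout is an addition of the sketch, not a documented requirement.

```python
import time
from typing import Callable


def wait_for_payload_deassert(
    payload_available: Callable[[], bool],
    timeout_s: float = 1.0,
    poll_interval_s: float = 0.001,
) -> bool:
    """Poll until payload_available() returns False.

    Returns True once the signal is deasserted, or False if it is
    still asserted when the timeout expires.
    """
    deadline = time.monotonic() + timeout_s
    while payload_available():
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_s)
    return True
```

Only after this returns `True` should the firmware start waiting for the next payload and read `IMAGE_SIZE` for the new stage.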
172+
173+### Register Reference
174+
175+Recovery registers are part of the Secure Firmware Recovery Interface
176+extended capability. See {ref}`tab-secure-firmware-recovery-interface` in
177+[ext_cap](ext_cap.md) for the full register list and field descriptions.
33178
34179 ## Recovery handler
35180
@@ -102,21 +247,55 @@
102247
103248 ### CSR write
104249
105-When the Recovery Initiator tries to write to a CSR through I3C, it first sends the command byte followed by two length bytes which are received by the handler logic.
106-If the length is non-zero, the handler resets the `PEC` block and sets `R2MUX` to pass the remaining data to the TTI RX data queue.
107-`R2MUX` disconnects the queue just before the last byte, which is the PEC checksum.
108-Finally, the last byte is compared with the checksum computed by the `PEC` block.
109-
110-If the checksum matches the command, the handling part of the handler logic reads data from the TTI RX data queue and updates relevant CSR fields.
111-
112-If the checksum does not match, then the handler discards all the data in the queue.
113-
114-In case of an `INDIRECT_FIFO_DATA` write command the handler does not process the data at all. Instead, it passes the data to the RX indirect FIFO queue to make it available to the software.
115-When the software reads data from the `INDIRECT_FIFO_DATA` CSR, the handler updates the queue pointer in the `INDIRECT_FIFO_STATUS` CSR.
250+When the Recovery Initiator writes to a CSR through I3C, the handler receives
251+the command byte, two length bytes, payload data, and a trailing PEC byte. The
252+PEC is computed over CMD + LEN + DATA using CRC-8. The handler validates the
253+transaction in `CmdDispatch` after all bytes are received.
254+
255+The following errors are detected:
256+
257+1. **CRC/PEC error**: computed PEC does not match received PEC byte
258+2. **Length error**: payload length does not match expected CSR size,
259+ premature Stop/Restart, or FIFO overflow
260+3. **Unsupported command error**: invalid command code, or access to a
261+ recovery-only CSR when recovery mode is not active
262+4. **Read-only error**: write attempt to a read-only CSR (reported as
263+ unsupported command per OCP spec)
264+5. **Bus-level parity error**: T-bit parity error detected on the I3C bus
265+ during PEC or length bytes (reported as CRC error)
266+
267+On any error, the handler enters an Error state that drains remaining bus
268+bytes, discards all received data, and writes the
269+error code to `DEVICE_STATUS_0.PROTOCOL_ERROR`. The CSR is not updated.
270+
271+PEC errors are also reported through the TTI error interrupt path.
272+All recovery handler errors have dedicated interrupt status bits, enable
273+bits, detection enables, and saturating counters:
274+
275+| Error | Interrupt Status | Detection Enable | Counter |
276+| ------- | ----------------- | ------------------ | --------- |
277+| CRC/PEC | `RI_PEC_ERR_STAT` | `RI_PEC_ERR_DET_EN` | `TARGET_ERR_CNT_RI_PEC` |
278+| Length | `RI_LENGTH_ERR_STAT` | `RI_LENGTH_ERR_DET_EN` | `TARGET_ERR_CNT_RI_LENGTH` |
279+| Read-only | `RI_READONLY_ERR_STAT` | `RI_READONLY_ERR_DET_EN` | `TARGET_ERR_CNT_RI_READONLY` |
280+| Unsupported | `RI_UNSUPPORTED_ERR_STAT` | `RI_UNSUPPORTED_ERR_DET_EN` | `TARGET_ERR_CNT_RI_UNSUPPORTED` |
281+| RX FIFO overflow | `RI_RX_FIFO_OVERFLOW_ERR_STAT` | `RI_RX_FIFO_OVERFLOW_ERR_DET_EN` | `TARGET_ERR_CNT_RI_RX_FIFO_OVERFLOW` |
282+| Indirect FIFO overflow | `RI_INDIRECT_FIFO_OVERFLOW_ERR_STAT` | `RI_INDIRECT_FIFO_OVERFLOW_ERR_DET_EN` | `TARGET_ERR_CNT_RI_INDIRECT_FIFO_OVERFLOW` |
283+
284+
285+All of these registers reside in the `TARGET_ERR_INTR_STATUS`, `TARGET_ERR_CTRL`, and
286+`TARGET_ERR_CNT_*` register groups within the TTI extended capability.
287+See [error_handling](error_handling.md) for details on the TTI error interrupt mechanism.
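
The register names in the table follow a regular pattern, which a driver or test bench could exploit. The sketch below derives the three names for each error from a short key. The keys and the `error_registers` helper are hypothetical, and the naming pattern is inferred from the table rather than from the register description itself.

```python
from typing import Dict

# Keys as they appear inside the register names in the table.
ERROR_KEYS = (
    "PEC",
    "LENGTH",
    "READONLY",
    "UNSUPPORTED",
    "RX_FIFO_OVERFLOW",
    "INDIRECT_FIFO_OVERFLOW",
)


def error_registers(key: str) -> Dict[str, str]:
    """Return the status, detection-enable, and counter names for one error."""
    if key not in ERROR_KEYS:
        raise KeyError(key)
    return {
        "status": f"RI_{key}_ERR_STAT",
        "detect_enable": f"RI_{key}_ERR_DET_EN",
        "counter": f"TARGET_ERR_CNT_RI_{key}",
    }
```

For example, `error_registers("PEC")["counter"]` yields `TARGET_ERR_CNT_RI_PEC`, matching the first row of the table.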
288+
289+If no errors are detected, the handler reads data from the RX queue and
290+updates the target CSR fields.
291+
292+For `INDIRECT_FIFO_DATA` write commands, the handler passes data directly
293+to the Indirect FIFO without processing. The handler updates the FIFO
294+write pointer in `INDIRECT_FIFO_STATUS` as data is written.
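
The PEC check over CMD + LEN + DATA can be sketched in a few lines. The text above only says "CRC-8", so the sketch assumes the SMBus-style polynomial `0x07` with a zero initial value, and it assumes the two length bytes are sent least significant byte first. The command code in the usage note is an arbitrary placeholder, and `crc8`, `build_write_frame`, and `pec_ok` are hypothetical helper names.

```python
def crc8(data: bytes, poly: int = 0x07, init: int = 0x00) -> int:
    """Bitwise CRC-8, MSB first (assumed SMBus-style parameters)."""
    crc = init
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def build_write_frame(cmd: int, payload: bytes) -> bytes:
    """Assemble CMD + LEN (2 bytes) + DATA + PEC as the Initiator would send it."""
    length = len(payload).to_bytes(2, "little")  # byte order is an assumption
    body = bytes([cmd]) + length + payload
    return body + bytes([crc8(body)])


def pec_ok(frame: bytes) -> bool:
    """Recompute the PEC over everything except the last byte and compare."""
    # The shortest valid frame is CMD + 2 length bytes + PEC.
    return len(frame) >= 4 and crc8(frame[:-1]) == frame[-1]
```

A frame built with `build_write_frame(0x2F, b"\x01\x02\x03\x04")` passes `pec_ok`, and flipping any single bit in it makes the check fail, which corresponds to the CRC/PEC error case listed above.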
116295
117296 ### CSR read
118297
119-A CSR read begins similarly as write, by receiving the command byte.
120-Following that, the command handling part reads data from the CSR being read and passes it to the transmit part, which formats the response I3C packet.
121-
122-The transmit logic injects the CSR length, resets the TX `PEC` module, sends the CSR content to the I3C core and finally injects the calculated PEC checksum.
298+A CSR read begins with a write phase that carries the command byte and its
299+PEC checksum. After validating the PEC, the handler queues a TX descriptor
300+before the Repeated Start. The transmit logic then sends the CSR length, the
301+CSR data, and a computed TX PEC checksum.