Changes to Hardware Specification

Comparing version 2.1 to 2.0
+476 additions -131 deletions
@@ -1,12 +1,12 @@
11 <div style="font-size: 0.85em; color: #656d76; margin-bottom: 1em; padding: 0.5em; background: #f6f8fa; border-radius: 4px;">
2-📄 Source: <a href="https://github.com/chipsalliance/caliptra-rtl/blob/35b0bc5691b2bd0fc180403914cfabe207379089/docs/CaliptraHardwareSpecification.md" target="_blank">chipsalliance/caliptra-rtl/docs/CaliptraHardwareSpecification.md</a> @ <code>35b0bc5</code>
2+📄 Source: <a href="https://github.com/chipsalliance/caliptra-rtl/blob/76a7d85c90bd88840834513979cddc90c60fe238/docs/CaliptraHardwareSpecification.md" target="_blank">chipsalliance/caliptra-rtl/docs/CaliptraHardwareSpecification.md</a> @ <code>76a7d85</code>
33 </div>
44
55 ![OCP Logo](../images/caliptra-rtl/docs/images/OCP_logo.png)
66
77 <p style="text-align: center;">Caliptra Hardware Specification</p>
88
9-<p style="text-align: center;">Revision 2.0.3</p>
9+<p style="text-align: center;">Revision 2.1</p>
1010
1111 <div style="page-break-after: always"></div>
1212
@@ -28,10 +28,10 @@
2828 * Caliptra uC may use internally in mailbox mode or via the Caliptra AXI DMA assist engine in streaming mode
2929 * SHA Accelerator adds new SHA save/restore functionality
3030 * Adams Bridge Dilithium/ML-DSA (refer to [Adams bridge spec](https://github.com/chipsalliance/adams-bridge/blob/main/docs/AdamsBridgeHardwareSpecification.md))
31-* Subsystem mode support (refer to [Subsystem Specification](https://github.com/chipsalliance/caliptra-ss/blob/main/docs/Caliptra%202.0%20Subsystem%20Specification%201.pdf) for details)
31+* Subsystem mode support (refer to [Subsystem Specification](https://github.com/chipsalliance/caliptra-ss/blob/main/docs/CaliptraSSIntegrationSpecification.md) for details)
3232 * ECDH hardware support
3333 * HMAC512 hardware support
34- * AXI Manager with DMA support (refer to [DMA Specification](https://github.com/chipsalliance/caliptra-ss/blob/main/docs/CaliptraSSHardwareSpecification.md#caliptra-axi-manager--dma-assist))
34+ * AXI Manager with DMA support (refer to [DMA Specification](https://github.com/chipsalliance/caliptra-ss/blob/main/docs/CaliptraSSHardwareSpecification.md#caliptra-core-axi-manager--dma-assist))
3535 * Manufacturing and Debug Unlock
3636 * UDS programming
3737 * Read logic for Secret Fuses
@@ -39,29 +39,38 @@
3939 * RISC-V core PMP support
4040 * CSR HMAC key for manufacturing flow
4141
42+## Key Caliptra 2.1 Changes
43+* AXI Manager DMA AES feature for OCP L.O.C.K. support (refer to [DMA Specification](https://github.com/chipsalliance/caliptra-ss/blob/main/docs/CaliptraSSHardwareSpecification.md#caliptra-core-axi-manager--dma-assist))
44+* [AES Big Endian mode](#aes-endian)
45+* [External Staging Area](./CaliptraIntegrationSpecification.md#external-staging-area)
46+* [OCP LOCK Support](#ocp-lock-hardware-architecture)
47+* [SHA3](#sha3)
48+* [ML-KEM](#adams-bridge-kyber-ml-kem)
49+
50+
4251 ## Boot FSM
4352
4453 The Boot FSM detects that the SoC is bringing Caliptra out of reset. Part of this flow involves signaling to the SoC that Caliptra is awake and ready for fuses. After fuses are populated and the SoC indicates that it is done downloading fuses, Caliptra can wake up the rest of the IP by de-asserting the internal reset.
4554
46-The following figure shows the initial power-on arc of the Mailbox Boot FSM.
47-
48-*Figure 1: Mailbox Boot FSM state diagram*
49-
50-![](../images/caliptra-rtl/docs/images/HW_mbox_boot_fsm.png)
55+The following figure shows the state transitions and associated actions in Caliptra's boot state machine.
56+
57+*Figure: Caliptra Boot FSM state diagram*
58+
59+![](../images/caliptra-rtl/docs/images/Caliptra_boot_fsm.png)
5160
5261 The Boot FSM first waits for the SoC to assert cptra\_pwrgood and de-assert cptra\_rst\_b. In the BOOT\_FUSE state, Caliptra signals to the SoC that it is ready for fuses. After the SoC is done writing fuses, it sets the fuse done register and the FSM advances to BOOT\_DONE.
5362
54-BOOT\_DONE enables Caliptra reset de-assertion through a two flip-flop synchronizer.
55-
56-## FW update reset (Impactless FW update)
57-
58-When a firmware update is initiated, Runtime FW writes to fw\_update\_reset register to trigger the FW update reset. When this register is written, only the RISC-V core is reset using cptra\_uc\_fw\_rst\_b pin and all AHB targets are still active. All registers within the targets and ICCM/DCCM memories are intact after the reset. Reset is deasserted synchronously after a programmable number of cycles; the minimum allowed number of wait cycles is 5, which is also the default configured value. Reset de-assertion is done through a two flip-flop synchronizer. Since ICCM is locked during runtime, the boot FSM unlocks it when the RISC-V reset is asserted. Following FW update reset deassertion, normal boot flow updates the ICCM with the new FW from the mailbox SRAM. The boot flow is modified as shown in the following figure.
59-
60-*Figure 2: Mailbox Boot FSM state diagram for FW update reset*
61-
62-![](../images/caliptra-rtl/docs/images/mbox_boot_fsm_FW_update_reset.png)
63+Once in the BOOT\_DONE state, Caliptra de-asserts resets through a two flip-flop synchronizer.
64+
65+### FW update reset (Impactless FW update)
66+
67+When a firmware update is initiated, Runtime FW writes to fw\_update\_reset register to trigger the FW update reset. When this register is written, only the RISC-V core is reset using cptra\_uc\_rst\_b pin and all AHB targets are still active. All registers within the targets and ICCM/DCCM memories are intact after the reset. Reset is deasserted synchronously after a programmable number of cycles; the minimum allowed number of wait cycles is 5, which is also the default configured value. Reset de-assertion is done through a two flip-flop synchronizer. Since ICCM is locked during runtime, the boot FSM unlocks it when the RISC-V reset is asserted. Following FW update reset deassertion, normal boot flow updates the ICCM with the new FW from the mailbox SRAM.
6368
6469 Impactless firmware updates may be initiated by writing to the fw\_update\_reset register after Caliptra comes out of global reset and enters the BOOT\_DONE state. In the BOOT\_FWRST state, only the reset to the RISC-V core is asserted and the wait timer is initialized. After the timer expires, the FSM advances from the BOOT\_WAIT to BOOT\_DONE state where the reset is deasserted and ICCM is unlocked.
70+
71+### Breakpoints for Debug
72+
73+Integrators may connect a breakpoint input to Caliptra, which is intended to connect to a chip GPIO pin. When asserted, this pin causes the Caliptra boot FSM to follow a modified arc. Instead of transitioning immediately to the BOOT_DONE state upon completion of fuse programming, the state machine transitions from BOOT_FUSE to BOOT_WAIT. Here, the state machine halts until the Caliptra register [CPTRA_BOOTFSM_GO](https://chipsalliance.github.io/caliptra-rtl/main/internal-regs/?p=clp.soc_ifc_reg.CPTRA_BOOTFSM_GO) is set, either by AXI or TAP access.
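
The arcs described in this section can be sketched as a small software model. This is only an illustration: the `BOOT_IDLE` state name, the `bp_pin` argument, and the `wait_for_go` flag are assumptions made for the sketch, and the way the RTL combines the timer and `CPTRA_BOOTFSM_GO` conditions in BOOT\_WAIT is not described here.

```python
from enum import Enum, auto


class BootState(Enum):
    BOOT_IDLE = auto()   # assumed name for the state held during reset
    BOOT_FUSE = auto()
    BOOT_FWRST = auto()
    BOOT_WAIT = auto()
    BOOT_DONE = auto()


class BootFsm:
    """Illustrative model of the boot FSM arcs; not derived from the RTL."""

    def __init__(self):
        self.state = BootState.BOOT_IDLE
        self.wait_for_go = False  # True when BOOT_WAIT was entered via the breakpoint arc

    def step(self, *, pwrgood=True, rst_b=True, fuse_done=False, bp_pin=False,
             bootfsm_go=False, fw_update_reset=False, timer_expired=False):
        s = self.state
        if not (pwrgood and rst_b):
            # Power not good or reset asserted (rst_b is active low)
            self.state = BootState.BOOT_IDLE
        elif s is BootState.BOOT_IDLE:
            self.state = BootState.BOOT_FUSE
        elif s is BootState.BOOT_FUSE and fuse_done:
            # Breakpoint pin diverts the FSM to BOOT_WAIT instead of BOOT_DONE
            self.wait_for_go = bp_pin
            self.state = BootState.BOOT_WAIT if bp_pin else BootState.BOOT_DONE
        elif s is BootState.BOOT_DONE and fw_update_reset:
            self.state = BootState.BOOT_FWRST
        elif s is BootState.BOOT_FWRST:
            # Wait timer is initialized here; only the RISC-V core is in reset
            self.wait_for_go = False
            self.state = BootState.BOOT_WAIT
        elif s is BootState.BOOT_WAIT:
            ready = bootfsm_go if self.wait_for_go else timer_expired
            if ready:
                self.state = BootState.BOOT_DONE
        return self.state
```

Calling `step()` once per clock with the current inputs walks the model through the same sequence of states as the prose: fuses, optional breakpoint wait, done, and the FW update reset loop.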
6574
6675 ## RISC-V core
6776
@@ -116,8 +125,9 @@
116125 | Data Vault | 5 | 8 KiB | 0x1001_C000 | 0x1001_DFFF |
117126 | SHA512 | 6 | 32 KiB | 0x1002_0000 | 0x1002_7FFF |
118127 | SHA256 | 10 | 32 KiB | 0x1002_8000 | 0x1002_FFFF |
119-| ML-DSA | 14 | 64 KiB | 0x1003_0000 | 0x1003_FFFF |
128+| ABR (MLDSA/MLKEM) | 14 | 64 KiB | 0x1003_0000 | 0x1003_FFFF |
120129 | AES | 15 | 4 KiB | 0x1001_1000 | 0x1001_1FFF |
130+| SHA3 | 16 | 4 KiB | 0x1004_0000 | 0x1004_0FFF |
121131
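
As an illustration of how the address ranges in the rows above can be used, the following sketch decodes an address to a block name. The table contents are copied from the rows shown in this hunk only; the `decode` and `size_kib` helpers are hypothetical names introduced for the example.

```python
# (name, start address, end address) copied from the table rows shown above
ADDRESS_MAP = [
    ("Data Vault", 0x1001_C000, 0x1001_DFFF),
    ("SHA512", 0x1002_0000, 0x1002_7FFF),
    ("SHA256", 0x1002_8000, 0x1002_FFFF),
    ("ABR (MLDSA/MLKEM)", 0x1003_0000, 0x1003_FFFF),
    ("AES", 0x1001_1000, 0x1001_1FFF),
    ("SHA3", 0x1004_0000, 0x1004_0FFF),
]


def decode(addr):
    """Return the name of the block that owns addr, or None if unmapped here."""
    for name, start, end in ADDRESS_MAP:
        if start <= addr <= end:
            return name
    return None


def size_kib(name):
    """Return the size of a block's address window in KiB."""
    for entry, start, end in ADDRESS_MAP:
        if entry == name:
            return (end - start + 1) // 1024
    raise KeyError(name)
```

The computed window sizes agree with the size column of the table, for example 4 KiB for SHA3 and 64 KiB for ABR.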
122132
123133 #### Peripherals subsystem
@@ -196,8 +206,8 @@
196206 | Mailbox (Notifications) | 20 | 7 |
197207 | SHA512 Accelerator (Errors) | 23 | 8 |
198208 | SHA512 Accelerator (Notifications) | 24 | 7 |
199-| MLDSA (Errors) | 23 | 8 |
200-| MLDSA (Notifications) | 24 | 7 |
209+| ABR (MLDSA/MLKEM) (Errors) | 23 | 8 |
210+| ABR (MLDSA/MLKEM) (Notifications) | 24 | 7 |
201211 | AXI DMA (Errors) | 25 | 8 |
202212 | AXI DMA (Notifications) | 26 | 7 |
203213
@@ -220,7 +230,7 @@
220230
221231 The following figure shows the two timers.
222232
223-*Figure 3: Caliptra Watchdog Timer*
233+*Figure: Caliptra Watchdog Timer*
224234
225235 ![](../images/caliptra-rtl/docs/images/WDT.png)
226236
@@ -358,7 +368,7 @@
358368
359369 The following figure shows the timing information for clock gating.
360370
361-*Figure 10: Clock gating timing*
371+*Figure: Clock gating timing*
362372
363373 ![](../images/caliptra-rtl/docs/images/clock_gating_timing.png)
364374
@@ -372,19 +382,19 @@
372382
373383 The following figure shows the integrated TRNG block.
374384
375-*Figure 11: Integrated TRNG block*
385+*Figure: Integrated TRNG block*
376386
377387 ![](../images/caliptra-rtl/docs/images/integrated_TRNG.png)
378388
379389 The following figure shows the CSRNG block.
380390
381-*Figure 12: CSRNG block*
391+*Figure: CSRNG block*
382392
383393 ![](../images/caliptra-rtl/docs/images/CSRNG_block.png)
384394
385395 The following figure shows the entropy source block.
386396
387-*Figure 13: Entropy source block*
397+*Figure: Entropy source block*
388398
389399 ![](../images/caliptra-rtl/docs/images/entropy_source_block.png)
390400
@@ -450,7 +460,7 @@
450460
451461 The following figure shows the top level signals defined in caliptra\_top.
452462
453-*Figure 14: caliptra\_top signals*
463+*Figure: caliptra\_top signals*
454464
455465 ![](../images/caliptra-rtl/docs/images/caliptra_top_signals.png)
456466
@@ -472,7 +482,7 @@
472482
473483 The following figure shows the entropy source signals.
474484
475-*Figure 15: Entropy source signals*
485+*Figure: Entropy source signals*
476486
477487 ![](../images/caliptra-rtl/docs/images/entropy_source_signals.png)
478488
@@ -634,7 +644,54 @@
634644
635645 Note: If the debug security state switches to debug mode anytime, the security assets and keys are still flushed even though JTAG is not open.
636646
637-*Figure 16: JTAG implementation*
647+The following table details the alias addresses for registers in soc ifc that are accessible through JTAG.
648+Debug Locked registers are a subset of registers accessible when debug intent is set, when debug is unlocked, or the lifecycle state is DEVICE_MANUFACTURING.
649+Debug Unlocked registers are accessible when debug is unlocked, or the lifecycle state is DEVICE_MANUFACTURING.
650+
651+| Register Name | JTAG Address | Accessibility | Debug Locked | Debug Unlocked |
652+| ------------------------------------------- | -------------- | --------------- | -------------- | ---------------- |
653+| mbox_lock | 7’h75 | RO | YES | YES |
654+| mbox_cmd | 7’h76 | RW | YES | YES |
655+| mbox_dlen | 7’h50 | RW | YES | YES |
656+| mbox_dataout | 7’h51 | RO | YES | YES |
657+| mbox_datain | 7’h62 | WO | YES | YES |
658+| mbox_status | 7’h52 | RW | YES | YES |
659+| mbox_execute | 7’h77 | WO | YES | YES |
660+| CPTRA_BOOT_STATUS | 7’h53 | RO | YES | YES |
661+| CPTRA_HW_ERROR_ENC | 7’h54 | RO | YES | YES |
662+| CPTRA_FW_ERROR_ENC | 7’h55 | RO | YES | YES |
663+| SS_UDS_SEED_BASE_ADDR_L | 7’h56 | RO || YES |
664+| SS_UDS_SEED_BASE_ADDR_H | 7’h57 | RO || YES |
665+| CPTRA_HW_ERROR_FATAL | 7’h58 | RO | YES | YES |
666+| CPTRA_FW_ERROR_FATAL | 7’h59 | RO | YES | YES |
667+| CPTRA_HW_ERROR_NON_FATAL | 7’h5a | RO | YES | YES |
668+| CPTRA_FW_ERROR_NON_FATAL | 7’h5b | RO | YES | YES |
669+| CPTRA_DBG_MANUF_SERVICE_REG | 7’h60 | RW | YES | YES |
670+| CPTRA_BOOTFSM_GO | 7’h61 | RW | YES | YES |
671+| SS_DEBUG_INTENT | 7’h63 | RW || YES |
672+| SS_CALIPTRA_BASE_ADDR_L | 7’h64 | RW || YES |
673+| SS_CALIPTRA_BASE_ADDR_H | 7’h65 | RW || YES |
674+| SS_MCI_BASE_ADDR_L | 7’h66 | RW || YES |
675+| SS_MCI_BASE_ADDR_H | 7’h67 | RW || YES |
676+| SS_RECOVERY_IFC_BASE_ADDR_L | 7’h68 | RW || YES |
677+| SS_RECOVERY_IFC_BASE_ADDR_H | 7’h69 | RW || YES |
678+| SS_OTP_FC_BASE_ADDR_L | 7’h6A | RW || YES |
679+| SS_OTP_FC_BASE_ADDR_H | 7’h6B | RW || YES |
680+| SS_STRAP_GENERIC_0 | 7’h6C | RW || YES |
681+| SS_STRAP_GENERIC_1 | 7’h6D | RW || YES |
682+| SS_STRAP_GENERIC_2 | 7’h6E | RW || YES |
683+| SS_STRAP_GENERIC_3 | 7’h6F | RW || YES |
684+| SS_DBG_SERVICE_REG_REQ | 7’h70 | RW | YES | YES |
685+| SS_DBG_SERVICE_REG_RSP | 7’h71 | RO | YES | YES |
686+| SS_DBG_UNLOCK_LEVEL0 | 7’h72 | RW || YES |
687+| SS_DBG_UNLOCK_LEVEL1 | 7’h73 | RW || YES |
688+| SS_STRAP_CALIPTRA_DMA_AXI_USER | 7’h74 | RW || YES |
689+| SS_EXTERNAL_STAGING_AREA_BASE_ADDR_L | 7’h78 | RW || YES |
690+| SS_EXTERNAL_STAGING_AREA_BASE_ADDR_H | 7’h79 | RW || YES |
691+
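
The accessibility rule stated before the table can be sketched as follows. Only a few rows are reproduced, and the `JTAG_REGS` dictionary and `jtag_accessible` function are hypothetical names for this example, not part of any Caliptra software interface.

```python
# name: (JTAG address, access type, listed in the "Debug Locked" column)
# A small subset of the table above, for illustration only.
JTAG_REGS = {
    "mbox_lock": (0x75, "RO", True),
    "mbox_cmd": (0x76, "RW", True),
    "CPTRA_BOOTFSM_GO": (0x61, "RW", True),
    "SS_DEBUG_INTENT": (0x63, "RW", False),
    "SS_DBG_UNLOCK_LEVEL0": (0x72, "RW", False),
}


def jtag_accessible(name, *, debug_intent=False, debug_unlocked=False,
                    manufacturing=False):
    """Apply the access rule from the text to one register."""
    _addr, _access, debug_locked_ok = JTAG_REGS[name]
    # Every listed register is reachable when debug is unlocked or the
    # lifecycle state is DEVICE_MANUFACTURING.
    if debug_unlocked or manufacturing:
        return True
    # Otherwise only the "Debug Locked" subset is reachable, and only
    # when debug intent is set.
    return debug_locked_ok and debug_intent
```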
692+
693+
694+*Figure: JTAG implementation*
638695
639696 ![](../images/caliptra-rtl/docs/images/JTAG_implementation.png)
640697
@@ -644,18 +701,19 @@
644701
645702 * Symmetric cryptographic primitives
646703 * De-obfuscation engine
647- * SHA512/384 (based on NIST FIPS 180-4 [2])
648- * SHA256 (based on NIST FIPS 180-4 [2])
649- * HMAC512 (based on [NIST FIPS 198-1](https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.198-1.pdf) [5] and [RFC 4868](https://tools.ietf.org/html/rfc4868) [6])
704+ * SHA512/384 (based on NIST FIPS 180-4 [2])
705+ * SHA256 (based on NIST FIPS 180-4 [2])
706+ * HMAC512 (based on [NIST FIPS 198-1](https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.198-1.pdf) [5] and [RFC 4868](https://tools.ietf.org/html/rfc4868) [6])
707+ * SHA3 (based on [NIST FIPS 202](https://doi.org/10.6028/NIST.FIPS.202) [17])
650708 * Public-key cryptography
651- * NIST Secp384r1 Deterministic Digital Signature Algorithm (based on FIPS-186-4 [11] and RFC 6979 [7])
709+ * NIST Secp384r1 Deterministic Digital Signature Algorithm (based on FIPS-186-4 [11] and RFC 6979 [7])
652710 * Key vault
653- * Key slots
654- * Key slot management
711+ * Key slots
712+ * Key slot management
655713
656714 The high-level architecture of Caliptra cryptographic subsystem is shown in the following figure.
657715
658-*Figure 17: Caliptra cryptographic subsystem*
716+*Figure: Caliptra cryptographic subsystem*
659717
660718 ![](../images/caliptra-rtl/docs/images/Crypto-2p0.png)
661719
@@ -680,7 +738,7 @@
680738
681739 The total size should be equal to 128 bits short of a multiple of 1024 since the goal is to have the formatted message size as a multiple of 1024 bits (N x 1024). The following figure shows the SHA512 input formatting.
682740
683-*Figure 18: SHA512 input formatting*
741+*Figure: SHA512 input formatting*
684742
685743 ![](../images/caliptra-rtl/docs/images/SHA512_input.png)
686744
@@ -692,7 +750,7 @@
692750
693751 The SHA512 architecture has the finite-state machine as shown in the following figure.
694752
695-*Figure 19: SHA512 FSM*
753+*Figure: SHA512 FSM*
696754
697755 ![](../images/caliptra-rtl/docs/images/SHA512_fsm.png)
698756
@@ -722,7 +780,7 @@
722780
723781 The following pseudocode demonstrates how the SHA512 interface can be implemented.
724782
725-*Figure 20: SHA512 pseudocode*
783+*Figure: SHA512 pseudocode*
726784
727785 ![](../images/caliptra-rtl/docs/images/SHA512_pseudo.png)
728786
@@ -803,7 +861,7 @@
803861
804862 The following figure shows SHA256 input formatting.
805863
806-*Figure 21: SHA256 input formatting*
864+*Figure: SHA256 input formatting*
807865
808866 ![](../images/caliptra-rtl/docs/images/SHA256_input.png)
809867
@@ -815,7 +873,7 @@
815873
816874 The SHA256 architecture has the finite-state machine as shown in the following figure.
817875
818-*Figure 22: SHA256 FSM*
876+*Figure: SHA256 FSM*
819877
820878 ![](../images/caliptra-rtl/docs/images/SHA256_fsm.png)
821879
@@ -850,7 +908,7 @@
850908
851909 The following pseudocode demonstrates how the SHA256 interface can be implemented.
852910
853-*Figure 23: SHA256 pseudocode*
911+*Figure: SHA256 pseudocode*
854912
855913 ![](../images/caliptra-rtl/docs/images/SHA256_pseudo.png)
856914
@@ -890,6 +948,164 @@
890948 | 1 KiB message | 8761 | 21.90 | 45,657 |
891949
892950
951+## SHA3
952+
953+The SHA3 HWIP performs the hash functions, whose purpose is to check the integrity of the received message.
954+It supports various SHA3 hashing functions, including the SHA3 extendable-output functions (XOFs) known as the SHAKE functions.
955+The underlying operation, known as the _sponge construction_, is described in the [SHA3 specification, FIPS 202](https://csrc.nist.gov/publications/detail/fips/202/final).
956+The block has been adapted from OpenTitan; documentation describing the functionality of the KMAC block it was derived from is available [here](https://opentitan.org/book/hw/ip/kmac/index.html).
957+In the current use cases of the SHA3 HW IP, either (a) messages are not considered secret (External Mu), or (b) SCA hardening would not be meaningful (HPKE in OCP L.O.C.K.), hence there are no SCA requirements.
958+
959+### Features
960+- Support for SHA3-224, 256, 384, 512, SHAKE[128, 256] and cSHAKE[128, 256]
961+- Support byte-granularity on input message
962+- Support arbitrary output length for SHAKE, cSHAKE
963+- Support customization input string S, and function-name N up to 36 bytes total
964+- 64b x 10 depth Message FIFO
965+- Performance (at 100 MHz):
966+ - SHA3-224: 2.93 B/cycle, 2.34 Gbit/s - 1.19 B/cycle, 952 Mbit/s (DOM)
967+ - SHA3-512: 1.47 B/cycle, 1.18 Gbit/s - 0.59 B/cycle, 472 Mbit/s (DOM)
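
The bit rates in the performance bullets follow directly from the bytes-per-cycle figures and the 100 MHz clock. A minimal check of that arithmetic, with `mbit_per_s` as a hypothetical helper name:

```python
def mbit_per_s(bytes_per_cycle, clock_hz=100e6):
    """Convert a bytes-per-cycle throughput into Mbit/s at the given clock."""
    return bytes_per_cycle * 8 * clock_hz / 1e6


# Figures quoted in the list above: (unmasked, DOM) bytes per cycle
for name, plain, dom in (("SHA3-224", 2.93, 1.19), ("SHA3-512", 1.47, 0.59)):
    print(f"{name}: {mbit_per_s(plain):.0f} Mbit/s, {mbit_per_s(dom):.0f} Mbit/s (DOM)")
```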
968+
969+### Design Details
970+
971+#### Keccak Round
972+
973+A Keccak round implements the Keccak_f function described in the SHA3 specification.
974+Keccak round logic in SHA3 HWIP not only supports 1600 bit internal states but also all possible values {25, 50, 100, 200, 400, 800, 1600} based on a parameter `Width`.
975+Keccak permutations in the specification allow an arbitrary number of rounds.
976+This module, however, supports Keccak_f, which always runs `12 + 2*L` rounds, where \\[ L = \log_2 {\left( {Width \over 25} \right)} \\].
977+For instance, 200 bits of internal state run 18 rounds.
978+SHA3 instantiates the Keccak round module with 1600 bit.
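
The round-count formula above can be checked with a few lines of code. The function name `keccak_f_rounds` is introduced only for this example.

```python
VALID_WIDTHS = (25, 50, 100, 200, 400, 800, 1600)


def keccak_f_rounds(width):
    """Number of Keccak_f rounds for a given state width: 12 + 2*L."""
    if width not in VALID_WIDTHS:
        raise ValueError(f"unsupported state width: {width}")
    # width / 25 is a power of two, so L is its base-2 logarithm
    l = (width // 25).bit_length() - 1
    return 12 + 2 * l
```

For the 1600-bit state instantiated by SHA3, this gives `L = 6` and therefore 24 rounds; the 200-bit case gives the 18 rounds mentioned above.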
979+
980+![](../images/caliptra-rtl/docs/images/sha3-keccak-round.svg)
981+
982+Keccak round logic has two phases inside.
983+Theta, Rho, Pi functions are executed at the 1st phase.
984+Chi and Iota functions run at the 2nd phase.
985+The first phase and the second phase run in the same cycle.
986+
987+To save circuit area, the Chi function uses 800 instead of 1600 DOM multipliers, but the multipliers are fully pipelined.
988+The Chi and Iota functions are thus separately applied to the two halves of the state and the 2nd phase takes in total three clock cycles to complete.
989+In the first clock cycle of the 2nd phase, the first stage of Chi is computed for the first lane halves of the state.
990+In the second clock cycle, the new first lane halves are output and written to state register.
991+At the same time, the first stage of Chi is computed for the second lane halves.
992+In the third clock cycle, the new second lane halves are output and written to the state register.
993+
994+#### Padding for Keccak
995+
996+Padding logic supports SHA3/SHAKE/cSHAKE algorithms.
997+cSHAKE needs the extra inputs for the Function-name `N` and the Customization string `S`.
998+Other than that, SHA3, SHAKE, and cSHAKE share a similar datapath inside the padding module, except for the last part appended to the end of the message.
999+SHA3 adds `2'b 10`, SHAKE adds `4'b 1111`, cSHAKE adds `2'b00` then `pad10*1()` follows.
1000+All are little-endian values.
1001+
1002+Interface between this padding logic and the MSG_FIFO follows the conventional FIFO interface.
1003+So `caliptra_prim_fifo_*` can talk to the padding logic directly.
1004+This module talks to Keccak round logic with a more memory-like interface.
1005+The interface has an additional address signal on top of the valid, ready, and data signals.
1006+
1007+![](../images/caliptra-rtl/docs/images/sha3-padding.svg)
1008+
1009+The hashing process begins when the software issues the start command to `CMD`.
1010+If cSHAKE is enabled, the padding logic expands the prefix value (`N || S` above) into a block size.
1011+The block size is determined by the `CFG_SHADOWED.kstrength`.
1012+If the value is 128, the block size will be 168 bytes.
1013+If it is 256, the block size will be 136 bytes.
1014+The expanded prefix value is transmitted to the Keccak round logic.
1015+After sending the expanded prefix block, the padding logic triggers the Keccak round logic to run a full 24 rounds.
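
The two block sizes quoted above follow from the sponge construction: the capacity is twice the security strength, and the block size (rate) is what remains of the 1600-bit state. A short check, with `block_size_bytes` as a hypothetical helper name:

```python
STATE_BITS = 1600


def block_size_bytes(strength_bits):
    """Block size (rate) in bytes for a given security strength."""
    capacity_bits = 2 * strength_bits
    return (STATE_BITS - capacity_bits) // 8
```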
1016+
1017+If the mode is not cSHAKE, or if the mode is cSHAKE and the prefix block has been processed, the padding logic accepts the incoming message bitstream and forwards the data to the Keccak round logic at block granularity.
1018+The padding logic controls the data flow and makes the Keccak logic run after each full block has been sent.
1019+
1020+After the software writes the message bitstream, it should issue the Process command into `CMD` register.
1021+The padding logic, after receiving the Process command, appends proper ending bits with respect to the `CFG_SHADOWED.mode` value.
1022+The logic writes 0 up to the block size to the Keccak round logic then ends with 1 at the end of the block.
1023+
1024+![](../images/caliptra-rtl/docs/images/sha3-padding-fsm.svg)
1025+
1026+After the Keccak round completes the last block, the padding logic asserts an `absorbed` signal to notify the software.
1027+At this point, the software is able to read the digest in `STATE` memory region.
1028+If the output length is greater than the Keccak block rate in SHAKE and cSHAKE mode, the software may run the Keccak round manually by issuing Run command to `CMD` register.
1029+
1030+The software completes the operation by issuing Done command after reading the digest.
1031+The padding logic clears internal variables and goes back to Idle state.
1032+
1033+#### Message FIFO
1034+
1035+The SHA3 HWIP has a compile-time configurable depth message FIFO inside.
1036+The message FIFO receives incoming message bitstream regardless of its byte position in a word.
1037+Then it packs the partial message bytes into the internal 64 bit data width.
1038+After packing the data, the logic stores the data into the FIFO until the internal SHA3 engine consumes the data.
1039+
1040+##### FIFO Depth calculation
1041+
1042+The depth of the message FIFO is chosen to cover the throughput of the software or other producers such as DMA engine.
1043+The size of the message FIFO is enough to hold the incoming data while the SHA3 engine is processing the previous block.
1044+Default design parameters assume the system characteristics as below:
1045+
1046+- `kmac_pkg::RegLatency`: The register write takes 5 cycles.
1047+- `kmac_pkg::Sha3Latency`: Keccak round latency takes 24 cycles.
1048+
1049+##### Empty and Full status
1050+
1051+Under normal operating conditions, the SHA3 engine will process data a lot faster than software can push it to the Message FIFO.
1052+The Message FIFO depth observable from `STATUS.fifo_depth` will remain **0** while the `STATUS.fifo_empty` status bit is lowered for one clock cycle whenever software provides new data.
1053+
1054+After the SHA3 engine starts popping the data again, the Message FIFO will eventually run empty again and the `fifo_empty` status interrupt will fire.
1055+Note that the `fifo_empty` status interrupt will not fire if i) one of the hardware application interfaces is using the SHA3 block, ii) the SHA3 core is not in the `Absorb` state, or iii) after software has written the `Process` command.
1056+
1057+If software pushes data to the Message FIFO while it is full, the write operation is blocked until there is again space in the FIFO.
1058+This means the processor is effectively stalled.
1059+If the SHA3 engine is currently running and software fills up the Message FIFO, the resulting stall won't take more than 100 clock cycles.
1060+The stall mechanism prevents data loss and the upper bound on the wait time avoids software needing to poll the `STATUS.fifo_depth` field before writing data.
1061+
1062+### Programmer's guide
1063+
1064+The software can update the SHA3 configurations only when the IP is in the idle state.
1065+The software should check `STATUS.sha3_idle` before updating the configurations.
1066+The software must first program `CFG_SHADOWED.msg_endianness` and `CFG_SHADOWED.state_endianness` at the initialization stage.
1067+These determine the byte order of incoming messages (msg_endianness) and the Keccak state output (state_endianness).
1068+
1069+#### Software Initiated SHA3 process
1070+
1071+This section describes the expected software process to run the SHA3 HWIP.
1072+At first, the software configures `CFG_SHADOWED.kmac_en` for the operation.
1073+If SHA3 is enabled, the software should configure `CFG_SHADOWED.mode` to cSHAKE and `CFG_SHADOWED.kstrength` to 128 or 256 bit security strength.
1074+The software also updates `PREFIX` registers if cSHAKE mode is used.
1075+The current design does not convert cSHAKE mode to SHAKE even if `PREFIX` is an empty string.
1076+It is the software's responsibility to change the `CFG_SHADOWED.mode` to SHAKE in case of empty `PREFIX`.
1077+The SHA3 HWIP uses the `PREFIX` registers as they are.
1078+This means that the software should write already-encoded values to `PREFIX`.
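
As a sketch of what "encoded values" can look like, the following implements the `left_encode` and `encode_string` functions from NIST SP 800-185 and concatenates the encoded function name and customization string. That the `PREFIX` registers expect exactly `encode_string(N) || encode_string(S)` is an assumption here, as is the `cshake_prefix` helper name; the register layout should be confirmed against the register documentation.

```python
def left_encode(value: int) -> bytes:
    """SP 800-185 left_encode: byte count followed by the big-endian value."""
    n = max(1, (value.bit_length() + 7) // 8)
    return bytes([n]) + value.to_bytes(n, "big")


def encode_string(s: bytes) -> bytes:
    """SP 800-185 encode_string: left_encode of the bit length, then the string."""
    return left_encode(len(s) * 8) + s


def cshake_prefix(function_name: bytes, customization: bytes) -> bytes:
    """Assumed PREFIX contents: encode_string(N) || encode_string(S)."""
    return encode_string(function_name) + encode_string(customization)
```

With both strings empty, the encoded prefix is four bytes long, which is the case where the text above asks software to select SHAKE mode instead.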
1079+
1080+After configuring, the software notifies the SHA3 engine to accept incoming messages by issuing Start command into `CMD`.
1081+If Start command is not issued, the incoming message is discarded.
1082+
1083+After the software pushes all messages, it issues Process command to `CMD` for SHA3 engine to complete the sponge absorbing process.
1084+SHA3 hashing engine pads the incoming message as defined in the SHA3 specification.
1085+
1086+After the SHA3 engine completes the sponge absorbing step, it generates the `kmac_done` interrupt.
1087+Alternatively, the software can poll the `STATUS.squeeze` bit until it becomes 1.
1088+In this stage, the software may run the Keccak round manually.
1089+
1090+If the desired digest length is greater than the Keccak rate, the software issues Run command for the Keccak round logic to run one full round after the software reads the current available Keccak state.
1091+At this stage, SHA3 does not raise an interrupt when the Keccak round completes the software initiated manual run.
1092+The software should check `STATUS.squeeze` register field for the readiness of `STATE` value.
1093+
1094+After the software reads all the digest values, it issues Done command to `CMD` register to clear the internal states.
1095+Done command clears the Keccak state, FSM in SHA3, and a few internal variables.
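
The command sequence in this section (Start, message writes, Process, poll `STATUS.squeeze`, read `STATE`, Done) can be sketched against a purely hypothetical software mock. The `Sha3Mock` class below is not the register interface; it uses Python's `hashlib` to stand in for the engine and fixes the mode to SHA3-256 for brevity.

```python
import hashlib


class Sha3Mock:
    """Hypothetical stand-in for the SHA3 HWIP command flow (SHA3-256 only)."""

    def __init__(self):
        self.absorbing = False
        self.squeeze = False      # mirrors the role of STATUS.squeeze
        self._msg = bytearray()
        self._digest = b""

    def cmd(self, name):
        if name == "start":
            self._msg.clear()
            self.absorbing = True
        elif name == "process":
            self._digest = hashlib.sha3_256(bytes(self._msg)).digest()
            self.absorbing, self.squeeze = False, True
        elif name == "done":
            self.__init__()       # clear state and return to idle
        else:
            raise ValueError(f"unknown command: {name}")

    def write_msg(self, data):
        # Data written before Start is discarded, as described above
        if self.absorbing:
            self._msg += data

    def read_state(self):
        if not self.squeeze:
            raise RuntimeError("digest is not ready")
        return self._digest


def sha3_256_flow(dev, chunks):
    """Run the software-initiated sequence and return the digest."""
    dev.cmd("start")
    for chunk in chunks:
        dev.write_msg(chunk)
    dev.cmd("process")
    while not dev.squeeze:        # poll instead of waiting for the interrupt
        pass
    digest = dev.read_state()
    dev.cmd("done")
    return digest
```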
1096+
1097+#### Endianness
1098+
1099+This SHA3 HWIP operates in little-endian.
1100+The internal SHA3 hashing engine receives data in 64-bit granularity.
1101+The data written to SHA3 is assumed to be little endian.
1102+
1103+The software may write/read the data in big-endian order if `CFG_SHADOWED.msg_endianness` or `CFG_SHADOWED.state_endianness` is set.
1104+If the endianness bit is 1, the data is assumed to be big-endian.
1105+The internal logic therefore byte-swaps the data.
1106+For example, when the software writes `0xDEADBEEF` with endianness as 1, the logic converts it to `0xEFBEADDE` then writes into MSG_FIFO.
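
The byte swap in this example can be reproduced in software. The helper name `byteswap32` is introduced only for illustration.

```python
def byteswap32(word: int) -> int:
    """Reverse the byte order of a 32-bit word."""
    return int.from_bytes(word.to_bytes(4, "big"), "little")


print(hex(byteswap32(0xDEADBEEF)))  # 0xefbeadde
```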
1107+
1108+
8931109 ## HMAC512/HMAC384
8941110
8951111 Hash-based message authentication code (HMAC) is a cryptographic authentication technique that uses a hash function and a secret key. HMAC involves a cryptographic hash function and a secret cryptographic key. This implementation supports the HMAC512 variants HMAC-SHA-512-256 and HMAC-SHA-384-192 as specified in [NIST FIPS 198-1](https://nvlpubs.nist.gov/nistpubs/FIPS/NIST.FIPS.198-1.pdf) [5]. The implementation is compatible with the HMAC-SHA-512-256 and HMAC-SHA-384-192 authentication and integrity functions defined in [RFC 4868](https://tools.ietf.org/html/rfc4868) [6].
@@ -916,25 +1132,25 @@
9161132
9171133 The total size should be equal to 128 bits short of a multiple of 1024 because the goal is to have the formatted message size as a multiple of 1024 bits (N x 1024).
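
The sizing rule can be made concrete with a short calculation: after the message and the mandatory ‘1’ bit, enough zero bits are added to reach 896 modulo 1024, and the 128-bit length field completes the block. The function name `sha512_padding` is an assumption for this sketch.

```python
def sha512_padding(msg_bits):
    """Return (number of zero pad bits, number of 1024-bit blocks)."""
    zero_bits = (896 - (msg_bits + 1)) % 1024
    total_bits = msg_bits + 1 + zero_bits + 128
    return zero_bits, total_bits // 1024
```

A 1023-bit message therefore spills into a second block, while an 895-bit message fits in one block with the ‘1’ bit as its only padding.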
9181134
919-*Figure 24: HMAC input formatting*
1135+*Figure: HMAC input formatting*
9201136
9211137 ![](../images/caliptra-rtl/docs/images/HMAC_input.png)
9221138
9231139 The following figures show examples of input formatting for different message lengths.
9241140
925-*Figure 25: Message length of 1023 bits*
1141+*Figure: Message length of 1023 bits*
9261142
9271143 ![](../images/caliptra-rtl/docs/images/msg_1023.png)
9281144
9291145 When the message is 1023 bits long, padding is given in the next block along with message size.
9301146
931-*Figure 26: 1 bit padding*
1147+*Figure: 1 bit padding*
9321148
9331149 ![](../images/caliptra-rtl/docs/images/1_bit.png)
9341150
9351151 When the message size is 895 bits, a padding of ‘1’ is also considered valid, followed by the message size.
9361152
937-*Figure 27: Multi block message*
1153+*Figure: Multi block message*
9381154
9391155 ![](../images/caliptra-rtl/docs/images/msg_multi_block.png)
9401156
@@ -945,13 +1161,13 @@
9451161
9461162 The HMAC512 core performs the sha2-512 function to process the hash value of the given message. The algorithm processes each block of the 1024 bits from the message, using the result from the previous block. This data flow is shown in the following figure.
9471163
948-*Figure 28: HMAC-SHA-512-256 data flow*
1164+*Figure: HMAC-SHA-512-256 data flow*
9491165
9501166 ![](../images/caliptra-rtl/docs/images/HMAC_SHA_512_256.png)
9511167
9521168 The HMAC384 core performs the sha2-384 function to process the hash value of the given message. The algorithm processes each block of the 1024 bits from the message, using the result from the previous block. This data flow is shown in the following figure.
9531169
954-*Figure 29: HMAC-SHA-384-192 data flow*
1170+*Figure: HMAC-SHA-384-192 data flow*
9551171
9561172 ![](../images/caliptra-rtl/docs/images/HMAC_SHA_384_192.png)
9571173
@@ -959,7 +1175,7 @@
9591175
9601176 The HMAC architecture has the finite-state machine as shown in the following figure.
9611177
962-*Figure 30: HMAC FSM*
1178+*Figure: HMAC FSM*
9631179
9641180 ![](../images/caliptra-rtl/docs/images/HMAC_FSM.png)
9651181
@@ -997,7 +1213,7 @@
9971213
9981214 The following pseudocode demonstrates how the HMAC interface can be implemented.
9991215
1000-*Figure 31: HMAC pseudocode*
1216+*Figure: HMAC pseudocode*
10011217
10021218 ![](../images/caliptra-rtl/docs/images/HMAC_pseudo.png)
10031219
@@ -1122,7 +1338,7 @@
11221338
11231339 Secp384r1 parameters are shown in the following figure.
11241340
1125-*Figure 32: Secp384r1 parameters*
1341+*Figure: Secp384r1 parameters*
11261342
11271343 ![](../images/caliptra-rtl/docs/images/secp384r1_params.png)
11281344
@@ -1130,7 +1346,7 @@
11301346
11311347 The ECDSA consists of three operations, shown in the following figure.
11321348
1133-*Figure 33: ECDSA operations*
1349+*Figure: ECDSA operations*
11341350
11351351 ![](../images/caliptra-rtl/docs/images/ECDSA_ops.png)
11361352
@@ -1175,7 +1391,7 @@
11751391
11761392 The ECC top-level architecture is shown in the following figure.
11771393
1178-*Figure 34: ECC architecture*
1394+*Figure: ECC architecture*
11791395
11801396 ![](../images/caliptra-rtl/docs/images/ECC_arch.png)
11811397
@@ -1215,25 +1431,25 @@
12151431
12161432 #### KeyGen
12171433
1218-*Figure 35: KeyGen pseudocode*
1434+*Figure: KeyGen pseudocode*
12191435
12201436 ![](../images/caliptra-rtl/docs/images/keygen_pseudo.png)
12211437
12221438 #### Signing
12231439
1224-*Figure 36: Signing pseudocode*
1440+*Figure: Signing pseudocode*
12251441
12261442 ![](../images/caliptra-rtl/docs/images/signing_pseudo.png)
12271443
12281444 #### Verifying
12291445
1230-*Figure 37: Verifying pseudocode*
1446+*Figure: Verifying pseudocode*
12311447
12321448 ![](../images/caliptra-rtl/docs/images/verify_pseudo.png)
12331449
12341450 #### ECDH sharedkey
12351451
1236-*Figure 38: ECDH sharedkey pseudocode*
1452+*Figure: ECDH sharedkey pseudocode*
12371453
12381454 ![](../images/caliptra-rtl/docs/images/sharedkey_pseudo.png)
12391455
@@ -1299,7 +1515,7 @@
12991515 2. KEYGEN PRIVKEY: Running HMAC\_DRBG with seed and nonce to generate the privkey in KEYGEN operation.
13001516 3. SIGNING NONCE: Running HMAC\_DRBG based on RFC6979 in SIGNING operation with privkey and hashed\_msg.
13011517
1302-*Figure 39: HMAC\_DRBG utilization*
1518+*Figure: HMAC\_DRBG utilization*
13031519
13041520 ![](../images/caliptra-rtl/docs/images/HMAC_DRBG_util.png)
13051521
@@ -1315,7 +1531,7 @@
13151531
13161532 The data flow of the HMAC\_DRBG operation in keygen operation mode is shown in the following figure.
13171533
1318-*Figure 40: HMAC\_DRBG data flow*
1534+*Figure: HMAC\_DRBG data flow*
13191535
13201536 ![](../images/caliptra-rtl/docs/images/HMAC_DRBG_data.png)
13211537
@@ -1325,7 +1541,7 @@
13251541
13261542 In practice, observing a t-value greater than a specific threshold (typically 4.5) indicates the presence of leakage. However, in ECC, due to its latency, around 5 million samples need to be captured. This latency leads to many false positives, so the TVLA threshold can be set higher than 4.5. Based on the following figure from “Side-Channel Analysis and Countermeasure Design for Implementation of Curve448 on Cortex-M4” by Bisheh-Niasar et al., the threshold can be considered equal to 7 in our case.
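
The threshold comparison described above can be sketched in Python as follows. This is a minimal illustration, assuming the t-value is Welch's t-statistic computed between a fixed and a random trace set at a single sample point; the function names `welch_t` and `leaks` are hypothetical and are not part of the Caliptra tooling.

```python
import math
import statistics
from typing import Sequence


def welch_t(fixed: Sequence[float], rand: Sequence[float]) -> float:
    """Welch's t-statistic between two sets of measurements at one sample point."""
    mean_f = statistics.fmean(fixed)
    mean_r = statistics.fmean(rand)
    # Sample variances (n - 1 in the denominator).
    var_f = statistics.variance(fixed)
    var_r = statistics.variance(rand)
    return (mean_f - mean_r) / math.sqrt(var_f / len(fixed) + var_r / len(rand))


def leaks(fixed: Sequence[float], rand: Sequence[float], threshold: float = 4.5) -> bool:
    """Flag leakage when |t| exceeds the threshold (4.5 by default, 7 for ECC here)."""
    return abs(welch_t(fixed, rand)) > threshold
```

Calling `leaks(fixed, rand, threshold=7.0)` applies the relaxed threshold that the text adopts for ECC.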
13271543
1328-*Figure 41: TVLA threshold as a function of the number of samples per trace*
1544+*Figure: TVLA threshold as a function of the number of samples per trace*
13291545
13301546 ![](../images/caliptra-rtl/docs/images/TVLA_threshold.png)
13311547
@@ -1335,7 +1551,7 @@
13351551 The TVLA results for performing seed/nonce-dependent leakage detection using 200,000 traces are shown in the following figure. Based on this figure, there is no leakage in ECC keygen by changing the seed/nonce after 200,000 operations.
13361552
13371553
1338-*Figure 42: seed/nonce-dependent leakage detection using TVLA for ECC keygen after 200,000 traces*
1554+*Figure: seed/nonce-dependent leakage detection using TVLA for ECC keygen after 200,000 traces*
13391555
13401556 ![](../images/caliptra-rtl/docs/images/tvla_keygen.png)
13411557
@@ -1343,13 +1559,13 @@
13431559
13441560 The TVLA results for performing privkey-dependent leakage detection using 20,000 traces are shown in the following figure. Based on this figure, there is no leakage in ECC signing by changing the privkey after 20,000 operations.
13451561
1346-*Figure 43: privkey-dependent leakage detection using TVLA for ECC signing after 20,000 traces*
1562+*Figure: privkey-dependent leakage detection using TVLA for ECC signing after 20,000 traces*
13471563
13481564 ![](../images/caliptra-rtl/docs/images/TVLA_privekey.png)
13491565
13501566 The TVLA results for performing message-dependent leakage detection using 64,000 traces are shown in the following figure. Based on this figure, there is no leakage in ECC signing by changing the message after 64,000 operations.
13511567
1352-*Figure 44: Message-dependent leakage detection using TVLA for ECC signing after 64,000 traces*
1568+*Figure: Message-dependent leakage detection using TVLA for ECC signing after 64,000 traces*
13531569
13541570 ![](../images/caliptra-rtl/docs/images/TVLA_msg_dependent.png)
13551571
@@ -1388,15 +1604,15 @@
13881604
13891605 LMS cryptography is a type of hash-based digital signature scheme that was standardized by NIST in 2020. It is based on the Leighton-Micali Signature (LMS) system, which uses a Merkle tree structure to combine many one-time signature (OTS) keys into a single public key. LMS cryptography is resistant to quantum attacks and can achieve a high level of security without relying on large integer mathematics.
13901606
1391-Caliptra supports only LMS verification using a software/hardware co-design approach. Hence, the LMS accelerator reuses the SHA256 engine to speedup the Winternitz chain by removing software-hardware interface overhead. The LMS-OTS verification algorithm is shown in follwoing figure:
1392-
1393-*Figure 45: LMS-OTS Verification algorithm*
1607+Caliptra supports only LMS verification using a software/hardware co-design approach. Hence, the LMS accelerator reuses the SHA256 engine to speed up the Winternitz chain by removing software-hardware interface overhead. The LMS-OTS verification algorithm is shown in the following figure:
1608+
1609+*Figure: LMS-OTS Verification algorithm*
13941610
13951611 ![](../images/caliptra-rtl/docs/images/LMS_verifying_alg.png)
13961612
13971613 The high-level architecture of LMS is shown in the following figure.
13981614
1399-*Figure 46: LMS high-level architecture*
1615+*Figure: LMS high-level architecture*
14001616
14011617 ![](../images/caliptra-rtl/docs/images/LMS_high_level.png)
14021618
@@ -1421,7 +1637,7 @@
14211637
14221638 The Winternitz hash chain can be accelerated in hardware to enhance the performance of the design. For that, a configurable architecture is proposed that can reuse the SHA256 engine. The LMS accelerator architecture is shown in the following figure, where H is the SHA256 engine.
14231639
1424-*Figure 47: Winternitz chain architecture*
1640+*Figure: Winternitz chain architecture*
14251641
14261642 ![](../images/caliptra-rtl/docs/images/LMS_wntz_arch.png)
14271643
@@ -1456,7 +1672,14 @@
14561672 Please refer to the [Adams-bridge specification](https://github.com/chipsalliance/adams-bridge/blob/main/docs/AdamsBridgeHardwareSpecification.md)
14571673
14581674 ### Address map
1459-Address map of ML-DSA accelerator is shown here: [ML-DSA\_reg — clp Reference (chipsalliance.github.io)](https://chipsalliance.github.io/caliptra-rtl/main/internal-regs/?p=clp.mldsa_reg)
1675+Address map of ML-DSA accelerator is shown here: [ML-DSA\_reg — clp Reference (chipsalliance.github.io)](https://chipsalliance.github.io/caliptra-rtl/main/internal-regs/?p=clp.abr_reg)
1676+
1677+## Adams Bridge Kyber ML-KEM
1678+
1679+Please refer to the [Adams-bridge specification](https://github.com/chipsalliance/adams-bridge/blob/main/docs/AdamsBridgeHardwareSpecification.md)
1680+
1681+### Address map
1682+Address map of ML-KEM accelerator is shown here: [ML-KEM\_reg — clp Reference (chipsalliance.github.io)](https://chipsalliance.github.io/caliptra-rtl/main/internal-regs/?p=clp.abr_reg)
14601683
14611684 ## AES
14621685
@@ -1469,6 +1692,12 @@
14691692 ### Operation
14701693
14711694 For more information, see the [AES Programmer's Guide](https://github.com/vogelpi/opentitan/blob/aes-gcm-review/hw/ip/aes/doc/programmers_guide.md).
1695+
1696+## AES Endian
1697+
1698+The AES Core uses little endian for the DATA_IN and DATA_OUT registers. Caliptra allows a user to stream the data into and out of AES in big endian when AES_CLP.CTRL0.ENDIAN_SWAP is set to 1. This is done by swizzling the write and read data when a write targets DATA_IN or a read targets DATA_OUT.
1699+
1700+By default, little endian is selected.
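
As a rough illustration of the swizzle described above, the following Python sketch reverses the byte order within each 32-bit word. The 32-bit swap granularity and the names `swap_word_endianness` and `write_data_in` are assumptions made for this example; the text only states that write and read data are swizzled when AES_CLP.CTRL0.ENDIAN_SWAP is set to 1.

```python
import struct


def swap_word_endianness(data: bytes) -> bytes:
    """Reverse the byte order inside each 32-bit word (assumed swap granularity)."""
    if len(data) % 4 != 0:
        raise ValueError("data length must be a multiple of 4 bytes")
    count = len(data) // 4
    # Read the words as little endian, then write them back as big endian.
    words = struct.unpack(f"<{count}I", data)
    return struct.pack(f">{count}I", *words)


def write_data_in(block: bytes, endian_swap: bool) -> bytes:
    """Hypothetical model of a DATA_IN write: swizzle only when ENDIAN_SWAP is 1."""
    return swap_word_endianness(block) if endian_swap else block
```

Because the swap is its own inverse, the same helper models both the DATA_IN write path and the DATA_OUT read path in this sketch.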
14721701
14731702 ### Signal descriptions
14741703
@@ -1797,19 +2026,19 @@
17972026 To underpin the results of the formal verification flow, the hardening of the GHASH module has been analyzed on the ChipWhisperer [CW310](https://rtfm.newae.com/Targets/CW310%20Bergen%20Board/) FPGA board.
17982027 For this analysis, power traces with the ChipWhisperer [Husky](https://rtfm.newae.com/Capture/ChipWhisperer-Husky/) scope were captured during GCM operations.
17992028 Afterwards a Test Vector Leakage Assessment (TVLA) with the [ot-sca toolset](https://github.com/lowRISC/ot-sca) has been performed.
1800-The setup is illustrated in Figure 1.
2029+The setup is illustrated in the following Figure.
18012030
18022031 ![](../images/caliptra-rtl/docs/images/cw310_cwhusky.jpeg)
18032032 :--:
1804-**Figure 1**: Target CW310 FPGA board (left) and the CW Husky scope (right).
2033+**Figure**: Target CW310 FPGA board (left) and the CW Husky scope (right).
18052034
18062035 ##### Setup
18072036
18082037 ![](../images/caliptra-rtl/docs/images/GHASH_TVLA_Figure2.png)
18092038 :--:
1810-**Figure 2**: Measurement setup. The main components are the target board, the scope, and the SCA framework.
1811-
1812-Figure 2 gives a detailed overview of the measurement setup that has been utilized to capture the power traces.
2039+**Figure**: Measurement setup. The main components are the target board, the scope, and the SCA framework.
2040+
2041+The prior Figure gives a detailed overview of the measurement setup that has been utilized to capture the power traces.
18132042 The SCA evaluation framework ot-sca is the central component of the measurement setup.
18142043 It is responsible for communicating with the penetration testing framework that runs on the target FPGA board and with the scope.
18152044 Initially, ot-sca configures the scope (sample rate, number of samples) and the pentest framework (which input, how many encryptions, where to trigger).
@@ -1821,9 +2050,9 @@
18212050
18222051 ![](../images/caliptra-rtl/docs/images/GHASH_TVLA_Figure3.png)
18232052 :--:
1824-**Figure 3**: Power trace with AES encryption rounds visible (*left*). Aligned traces when zooming in (*right*).
1825-
1826-Figure 3 depicts power traces captured during AES-GCM encryptions with the setup above.
2053+**Figure**: Power trace with AES encryption rounds visible (*left*). Aligned traces when zooming in (*right*).
2054+
2055+The prior Figure depicts power traces captured during AES-GCM encryptions with the setup above.
18272056 As shown in the figure, the traces are nicely aligned, allowing a sound evaluation to be performed.
18282057
18292058 ##### Methodology
@@ -1835,9 +2064,9 @@
18352064
18362065 ![](../images/caliptra-rtl/docs/images/GHASH_TVLA_Figure4.png)
18372066 :--
1838-**Figure 4:** TVLA plot showing leakage at around sample 1000. When increasing the number of traces (from 1000 to 10000), the leakage becomes more present. Note that the traces shown in this plot are taken from an arbitrary cryptographic hardware block and not AES.
1839-
1840-Figure 4 shows a TVLA plot that will be used throughout this document. The red lines mark the ± *t*-test border.
2067+**Figure:** TVLA plot showing leakage at around sample 1000. When increasing the number of traces (from 1000 to 10000), the leakage becomes more present. Note that the traces shown in this plot are taken from an arbitrary cryptographic hardware block and not AES.
2068+
2069+The prior Figure shows a TVLA plot that will be used throughout this document. The red lines mark the ± *t*-test border.
18412070
18422071 ###### Dataset Generation for FvsR IV & Key
18432072
@@ -1878,26 +2107,26 @@
18782107
18792108 ![](../images/caliptra-rtl/docs/images/GHASH_TVLA_Figure5.png)
18802109 :--:
1881-**Figure 5:** AES-GCM block diagram. Red lines mark the trigger windows for each analysis step.
1882-
1883-As shown in Figure 5, we focus on analyzing (*i*) the generation of the hash subkey H, (*ii*) the encryption of the initial counter block S, (*iii*) the processing of the AAD blocks, (*iv*) the plaintext blocks, and (*v*) the tag generation. Each measurement is conducted with (*a*) masks off and (*b*) masks on to analyze the effectiveness of the masking countermeasure.
2110+**Figure:** AES-GCM block diagram. Red lines mark the trigger windows for each analysis step.
2111+
2112+As shown in the prior Figure, we focus on analyzing (*i*) the generation of the hash subkey H, (*ii*) the encryption of the initial counter block S, (*iii*) the processing of the AAD blocks, (*iv*) the plaintext blocks, and (*v*) the tag generation. Each measurement is conducted with (*a*) masks off and (*b*) masks on to analyze the effectiveness of the masking countermeasure.
18842113
18852114 ###### i) SCA Evaluation of Generating the Hash Subkey H
18862115
18872116 ![](../images/caliptra-rtl/docs/images/GHASH_TVLA_Figure6ab.png)
18882117 :--:
18892118
1890-| **Figure 6a:** Masking Off - 100k traces - **Figure 6b:** Masking On - 1M traces |
2119+| **Figure:** Masking Off - 100k traces - **Figure:** Masking On - 1M traces |
18912120
18922121
18932122 ###### Interpretation
18942123
1895-The AES encryption is clearly visible in the form of 12 distinct peaks in the power traces shown Figures 6a and 6b.
2124+The AES encryption is clearly visible in the form of 12 distinct peaks in the power traces shown in the prior set of Figures.
18962125 The 12 peaks correspond to first the loading of the key and the all-zero block into the AES cipher core, followed by the initial round and the 10 full AES rounds (AES-128).
18972126 They spread over approximately 470 samples which corresponds to the 56 target clock cycles a full AES-128 encryption takes.
18982127
1899-If the masking is turned off (Figure 6a), first and second-order leakage is clearly visible throughout the operation.
1900-If the masking is on (Figure 6b), there is first-order leakage 1) at the beginning as well as 2) at the end of the operation.
2128+If the masking is turned off (set of graphs), first and second-order leakage is clearly visible throughout the operation.
2129+If the masking is on (set of graphs), there is first-order leakage 1) at the beginning as well as 2) at the end of the operation.
19012130
19022131 1. The leakage at the beginning of the operation is due to incrementing the IV/CTR value (inc32 function in GCM spec) which spreads across the first two AES rounds.
19032132 This produces first-order leakage as the inc32 function implementation isn’t masked.
@@ -1907,26 +2136,26 @@
19072136 The leakage is most likely due to how the FPGA implementation tool maps the flip flops of the hash subkey register shares to the available FPGA logic slices: if flip flops of the different shares get mapped to the same logic slice, the carry-chain and other muxing logic present in the logic slice can combine the various inputs thereby causing SCA leakage despite these logic outputs not being used.
19082137 We’ve observed similar effects in the past and there is [research giving more insight into this and other FPGA-specific issues](https://ieeexplore.ieee.org/document/10545383).
19092138
1910-To summarize, the observed first-order leakage if masking is on (Figure 6b) is not of concern for ASIC implementations.
2139+To summarize, the observed first-order leakage if masking is on is not of concern for ASIC implementations.
19112140
19122141 ###### ii) SCA Evaluation of Encrypting the Initial Counter Block
19132142
19142143 ![](../images/caliptra-rtl/docs/images/GHASH_TVLA_Figure7ab.png)
19152144 :--:
19162145
1917-| **Figure 7a:** Masking Off - 100k traces - **Figure 7b:** Masking On - 1M traces |
2146+| **Figure:** Masking Off - 100k traces - **Figure:** Masking On - 1M traces |
19182147
19192148
19202149 ###### Interpretation
19212150
1922-Again, the AES encryption is clearly visible in the form of 12 peaks in the power traces shown Figures 7a and 7b.
2151+Again, the AES encryption is clearly visible in the form of 12 peaks in the power traces shown in the prior set of Figures.
19232152 This AES encryption corresponds to the generation of the encrypted initial counter block S.
19242153 The AES encryption is followed by another operation visible in the power trace: the computation of repeatedly used correction terms using the Galois-field multipliers inside GHASH.
19252154 This operation takes 33 target clock cycles (approximately 275 samples).
19262155
1927-If the masking is turned off (Figure 7a), first and second-order leakage is clearly visible throughout both operations while being more pronounced during the GHASH operation.
2156+If the masking is turned off (set of graphs), first and second-order leakage is clearly visible throughout both operations while being more pronounced during the GHASH operation.
19282157 This is because the GHASH block is smaller and thus produces less noise.
1929-If the masking is on (Figure 7b), there is first-order leakage 1) at the beginning as well as 2) between the two operations.
2158+If the masking is on (set of graphs), there is first-order leakage 1) at the beginning as well as 2) between the two operations.
19302159
19312160 1. As before, the leakage at the beginning of the operation is due to incrementing the IV/CTR value (inc32 function in GCM spec) which spreads across the first two AES rounds.
19322161 This produces first-order leakage as the inc32 function implementation isn’t masked.
@@ -1936,7 +2165,7 @@
19362165 As before, the leakage is most likely due to how the FPGA implementation tool maps the multiplexers in front of the GHASH state registers to the available FPGA logic slices: Since the multiplexers for both shares use the same control signals, the multiplexing logic can be combined even into the same look-up tables (LUTs) thereby causing SCA leakage.
19372166 We’ve observed similar effects in the past and there is [research giving more insight into this and other FPGA-specific issues](https://ieeexplore.ieee.org/document/10545383).
19382167
1939-To summarize, the observed first-order leakage if masking is on (FIgure 7b) is not of concern for ASIC implementations.
2168+To summarize, the observed first-order leakage if masking is on is not of concern for ASIC implementations.
19402169
19412170 ###### iii) SCA Evaluation of Processing the AAD Blocks
19422171
@@ -1945,31 +2174,31 @@
19452174 ![](../images/caliptra-rtl/docs/images/GHASH_TVLA_Figure8ab.png)
19462175 :--:
19472176
1948-| **Figure 8a:** Masking Off - 50k traces - **Figure 8b:** Masking On - 10M traces |
2177+| **Figure:** Masking Off - 50k traces - **Figure:** Masking On - 10M traces |
19492178
19502179
19512180 ###### Interpretation
19522181
19532182 For AAD blocks, the AES cipher core is not involved.
19542183 However, during the computation of the first AAD block, the GHASH block needs to compute an additional correction term which is used for the very first block only.
1955-If the masking is turned off (Figure 8a), first- and second-order leakage is clearly visible but only for the first activity block.
2184+If the masking is turned off (first set of graphs), first- and second-order leakage is clearly visible but only for the first activity block.
19562185 The second activity block involves computing the additional correction terms which requires Share 1 of the encrypted initial counter block to be multiplied by Share 1 of the hash subkey.
19572186 But since the masking is off, both these values are zero for both the fixed and the random set and hence there is no SCA leakage.
1958-If the masking is turned on (Figure 8b), no SCA leakage is observable which is desirable.
2187+If the masking is turned on (second set of graphs), no SCA leakage is observable which is desirable.
19592188
19602189 ###### Processing AAD Block 1
19612190
19622191 ![](../images/caliptra-rtl/docs/images/GHASH_TVLA_Figure9ab.png)
19632192 :--:
19642193
1965-| **Figure 9a:** Masking Off - 50k traces - **Figure 9b:** Masking On - 10M traces |
2194+| **Figure:** Masking Off - 50k traces - **Figure:** Masking On - 10M traces |
19662195
19672196
19682197 ###### Interpretation
19692198
19702199 For the second AAD block (and any subsequent AAD blocks) there is only one activity block corresponding to the Galois-field multiplication.
1971-If masking is turned off (Figure 9a), there is both first- and second-order leakage observable.
1972-If the masking is turned on (Figure 9b), no SCA leakage is observable which is desirable.
2200+If masking is turned off (first set of graphs), there is both first- and second-order leakage observable.
2201+If the masking is turned on (second set of graphs), no SCA leakage is observable which is desirable.
19732202
19742203 ###### iv) SCA Evaluation of Processing the PTX Blocks
19752204
@@ -1978,12 +2207,12 @@
19782207 ![](../images/caliptra-rtl/docs/images/GHASH_TVLA_Figure10ab.png)
19792208 :--:
19802209
1981-| **Figure 10a:** Masking Off - 50k traces - **Figure 10b:** Masking On - 1M traces |
2210+| **Figure:** Masking Off - 50k traces - **Figure:** Masking On - 1M traces |
19822211
19832212
19842213 ###### Interpretation
19852214
1986-Like in [ii) SCA Evaluation of Encrypting the Initial Counter Block](#ii-sca-evaluation-of-encrypting-the-initial-counter-block) there is first-order leakage 1) at the beginning and 2) between the two operations if the masking is turned on (Figure 10b).
2215+Like in [ii) SCA Evaluation of Encrypting the Initial Counter Block](#ii-sca-evaluation-of-encrypting-the-initial-counter-block) there is first-order leakage 1) at the beginning and 2) between the two operations if the masking is turned on (second set of graphs).
19872216
19882217 1. As before, the leakage at the beginning of the operation is due to incrementing the IV/CTR value (inc32 function in GCM spec) which spreads across the first two AES rounds.
19892218 This produces first-order leakage as the inc32 function implementation isn’t masked.
@@ -1993,14 +2222,14 @@
19932222 But since the AAD and the plaintext have been chosen to be the same for all traces in the fixed and the random sets, the traces of the fixed set only produce all the same ciphertext and thus are expected to exhibit a static power signature for this step, whereas the ciphertext of the random set is randomized through the random key and IV.
19942223 However, since the ciphertext is not secret in the context of GCM, this leakage is of no concern.
19952224
1996-To summarize, the observed first-order leakage if masking is on (FIgure 10b) is not of concern.
2225+To summarize, the observed first-order leakage if masking is on (second set of graphs) is not of concern.
19972226
19982227 ###### Processing PTX Block 1
19992228
20002229 ![](../images/caliptra-rtl/docs/images/GHASH_TVLA_Figure11ab.png)
20012230 :--:
20022231
2003-| **Figure 11a:** Masking Off - 50k traces - **Figure 11b:** Masking On - 1M traces |
2232+| **Figure:** Masking Off - 50k traces - **Figure:** Masking On - 1M traces |
20042233
20052234
20062235 ###### Interpretation
@@ -2013,7 +2242,7 @@
20132242 ![](../images/caliptra-rtl/docs/images/GHASH_TVLA_Figure12ab.png)
20142243 :--:
20152244
2016-| **Figure 12a:** Masking Off - 50k traces - **Figure 12b:** Masking On - 1M traces |
2245+| **Figure:** Masking Off - 50k traces - **Figure:** Masking On - 1M traces |
20172246
20182247
20192248 ###### Interpretation
@@ -2023,12 +2252,12 @@
20232252 The GHASH state is unmasked (still masked with the encrypted initial counter block S) and Share 1 of S is added to write the final authentication tag to the data output registers readable by software.
20242253 2) In parallel to writing the final authentication tag to the data output registers, the internal state is all cleared to random values and an additional multiplication is triggered to clear the internal state of the Galois-field multipliers and the correction term registers.
20252254
2026-If masking is turned off (Figure 12a), there is both first- and second-order leakage observable during the first activity block (tag generation) but not during the clearing operation.
2027-If the masking is turned on (Figure 12b), some SCA leakage is observable between the two operations, i.e., when the final authentication tag is written to the output data registers.
2255+If masking is turned off (first set of graphs), there is both first- and second-order leakage observable during the first activity block (tag generation) but not during the clearing operation.
2256+If the masking is turned on (second set of graphs), some SCA leakage is observable between the two operations, i.e., when the final authentication tag is written to the output data registers.
20282257 This leakage is expected as both the fixed and the random data sets use a static AAD and plaintext.
20292258 This means, the tag for the fixed data set is fixed whereas the tags for the random set get randomized through the ciphertext (random due to the random key and IV).
20302259
2031-To summarize, the observed first-order leakage if masking is on (FIgure 12b) is not of concern.
2260+To summarize, the observed first-order leakage if masking is on (second set of graphs) is not of concern.
20322261
20332262 ##### Results – FvsR PTX & AAD
20342263
@@ -2040,16 +2269,16 @@
20402269 ![](../images/caliptra-rtl/docs/images/GHASH_TVLA_Figure13ab.png)
20412270 :--:
20422271
2043-| **Figure 13a:** Masking Off - 50k traces - **Figure 13b:** Masking On - 1M traces |
2272+| **Figure:** Masking Off - 50k traces - **Figure:** Masking On - 1M traces |
20442273
20452274
20462275 ###### Interpretation
20472276
2048-There is no SCA leakage visible in both cases without masking (Figure 13a) and with masking turned on (Figure 13b).
2277+There is no SCA leakage visible in both cases without masking (first set of graphs) and with masking turned on (second set of graphs).
20492278 This is expected as the hash subkey generation doesn’t involve the plaintext and the AAD but only the key and IV.
20502279 Both the fixed and random set use the same static key and IV.
20512280
2052-This experiment was specifically done to check whether the leakage identified in Figure 6b and attributed to how the FPGA implementation tool maps the flip flops of the hash subkey register shares to the available FPGA logic slices.
2281+This experiment was specifically done to check whether the leakage identified in [i) SCA Evaluation of Generating the Hash Subkey H](#i-SCA-Evaluation-of-Generating-the-Hash-Subkey-H) is indeed attributable to how the FPGA implementation tool maps the flip flops of the hash subkey register shares to the available FPGA logic slices.
20532282 As expected, the leakage peak is now gone.
20542283
20552284 ###### ii) SCA Evaluation of Encrypting the Initial Counter Block
@@ -2057,16 +2286,16 @@
20572286 ![](../images/caliptra-rtl/docs/images/GHASH_TVLA_Figure14ab.png)
20582287 :--:
20592288
2060-| **Figure 14a:** Masking Off - 50k traces - **Figure 14b:** Masking On - 1M traces |
2289+| **Figure:** Masking Off - 50k traces - **Figure:** Masking On - 1M traces |
20612290
20622291
20632292 ###### Interpretation
20642293
2065-There is no SCA leakage visible in both cases without masking (Figure 14a) and with masking turned on (Figure 14b).
2294+There is no SCA leakage visible in both cases without masking (first set of graphs) and with masking turned on (second set of graphs).
20662295 This is expected as the encryption of the initial counter block and the subsequent computation of repeatedly used correction terms doesn’t involve the plaintext and the AAD but only the key and IV.
20672296 Both the fixed and random set use the same static key and IV.
20682297
2069-This experiment was specifically done to check whether the leakage identified in Figure 7b and attributed to how the FPGA implementation tool maps the multiplexers in front of the GHASH state registers to the available FPGA logic slices.
2298+This experiment was specifically done to check whether the leakage identified in [ii) SCA Evaluation of Encrypting the Initial Counter Block](#ii-SCA-Evaluation-of-Encrypting-the-Initial-Counter-Block) is indeed attributable to how the FPGA implementation tool maps the multiplexers in front of the GHASH state registers to the available FPGA logic slices.
20702299 As expected, the leakage peak is now gone.
20712300
20722301 ###### iv) SCA Evaluation of Processing the PTX Block 0
@@ -2074,12 +2303,12 @@
20742303 ![](../images/caliptra-rtl/docs/images/GHASH_TVLA_Figure15ab.png)
20752304 :--:
20762305
2077-| **Figure 15a:** Masking Off - 100k traces - **Figure 15b:** Masking On - 1M traces |
2306+| **Figure:** Masking Off - 100k traces - **Figure:** Masking On - 1M traces |
20782307
20792308
20802309 ###### Interpretation
20812310
2082-With the masking turned off (Figure 15a), there is first-order leakage 1) at the beginning of the operation and 2) throughout the entire GHASH operation.
2311+With the masking turned off (first set of graphs), there is first-order leakage 1) at the beginning of the operation and 2) throughout the entire GHASH operation.
20832312
20842313 1. The leakage at the beginning of the operation is due to the input data (the plaintext) being written to an internal buffer register.
20852314 The AES cipher is operated in counter mode, meaning it doesn’t encrypt the input data but the counter value (incremented IV).
@@ -2088,7 +2317,7 @@
20882317 2. The GHASH operation then processes this ciphertext.
20892318 The observed leakage when the masking is off is expected.
20902319
2091-With the masking turned on (Figure 15b), the first-order leakage at the beginning of the operation remains visible. The reason for this is that the internal register buffering the previous input data is not masked.
2320+With the masking turned on (second set of graphs), the first-order leakage at the beginning of the operation remains visible. The reason for this is that the internal register buffering the previous input data is not masked.
20922321 This is of no concern as the leakage is not related to key or IV.
20932322
20942323 Another first-order leakage peak is visible between the AES encryption and the GHASH operation.
@@ -2292,9 +2521,9 @@
22922521 | Lock wr\[0\] | core_only_rst_b | Setting the lock wr field prevents the entry from being written by the microcontroller. Keys are always locked. After a lock is set, it cannot be reset until cptra_rst_b is de-asserted. |
22932522 | Lock use\[1\] | core_only_rst_b | Setting the lock use field prevents the entry from being used in any cryptographic blocks. After the lock is set, it cannot be reset until cptra_rst_b is de-asserted. |
22942523 | Clear\[2\] | cptra_rst_b | If unlocked, setting the clear bit causes KV to clear the associated entry. The clear bit is reset after entry is cleared. |
2295-| Copy\[3\] | cptra_rst_b | ENHANCEMENT: Setting the copy bit causes KV to copy the key to the entry written to Copy Dest field. |
2296-| Copy Dest\[8:4\] | cptra_rst_b | ENHANCEMENT: Destination entry for the copy function. |
2297-| Dest_valid\[16:9\] | hard_reset_b | KV entry can be used with the associated cryptographic block if the appropriate index is set. <br>\[0\] - HMAC KEY <br>\[1\] - HMAC BLOCK <br>\[2\] - MLDSA SEED <br>\[3\] - ECC PRIVKEY <br>\[4\] - ECC SEED <br>\[5\] - AES KEY <br>\[7:6\] - RSVD |
2524+| rsvd0\[3\] |||
2525+| rsvd1\[8:4\] |||
2526+| Dest_valid\[16:9\] | hard_reset_b | KV entry can be used with the associated cryptographic block if the appropriate index is set. <br>\[0\] - HMAC KEY <br>\[1\] - HMAC BLOCK <br>\[2\] - MLDSA SEED <br>\[3\] - ECC PRIVKEY <br>\[4\] - ECC SEED <br>\[5\] - AES KEY <br>\[6\] - MLKEM SEED <br>\[7\] - MLKEM MSG <br>\[8\] - AXI DMA DATA |
22982527 | last_dword\[20:19\] | hard_reset_b | Store the offset of the last valid dword, used to indicate the last cycle for read operations. |
22992528
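
The Dest_valid encoding listed in the table above can be decoded with a small helper. The sketch below is illustrative only: it assumes the Dest_valid field has already been extracted from the control register and shifted down so that index 0 is the least significant bit, and the names `DEST_VALID_NAMES` and `decode_dest_valid` are hypothetical.

```python
# Index-to-client mapping copied from the Dest_valid row of the table.
DEST_VALID_NAMES = [
    "HMAC KEY",      # [0]
    "HMAC BLOCK",    # [1]
    "MLDSA SEED",    # [2]
    "ECC PRIVKEY",   # [3]
    "ECC SEED",      # [4]
    "AES KEY",       # [5]
    "MLKEM SEED",    # [6]
    "MLKEM MSG",     # [7]
    "AXI DMA DATA",  # [8]
]


def decode_dest_valid(field: int) -> list:
    """Return the clients that may use a KV entry, given its Dest_valid field value."""
    return [name for bit, name in enumerate(DEST_VALID_NAMES) if field & (1 << bit)]
```

For example, a field value with indices 0 and 5 set marks the entry as usable as an HMAC key and as an AES key.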
23002529
@@ -2334,7 +2563,10 @@
23342563 | ecc_pkey_dest_valid\[9\] | ECC PKEY is a valid destination. |
23352564 | ecc_seed_dest_valid\[10\] | ECC SEED is a valid destination. |
23362565 | aes_key_dest_valid\[11\] | AES KEY is a valid destination. |
2337-| rsvd\[31:12\] | Reserved field |
2566+| mlkem_seed_dest_valid\[12\] | MLKEM SEED is a valid destination. |
2567+| mlkem_msg_dest_valid\[13\] | MLKEM MSG is a valid destination. |
2568+| dma_data_dest_valid\[14\] | DMA DATA is a valid destination. |
2569+| rsvd\[31:15\] | Reserved field |
23382570
23392571
23402572 | KV Status Reg | Description |
@@ -2363,12 +2595,12 @@
23632595
23642596 ### Key vault de-obfuscation block operation
23652597
2366-A de-obfuscation engine (DOE) is used in conjunction with AES cryptography to de-obfuscate the UDS and field entropy.  
2367-
2368-1. The obfuscation key is driven to the AES key. The data to be decrypted (either obfuscated UDS or obfuscated field entropy) is fed into the AES data. 
2369-2. An FSM manually drives the AES engine and writes the decrypted data back to the key vault. 
2370-3. FW programs the DOE with the requested function (UDS or field entropy de-obfuscation), and the destination for the result. 
2371-4. After de-obfuscation is complete, FW can clear out the UDS and field entropy values from any flops until cptra\_pwrgood de-assertion.  
2598+A de-obfuscation engine (DOE) is used in conjunction with AES cryptography to de-obfuscate the UDS, field entropy, and HEK seed.
2599+
2600+1. The obfuscation key is wired to the DOE engine. The data to be decrypted (either obfuscated UDS, obfuscated field entropy, or obfuscated HEK seed) is fed into the DOE data.
2601+2. An FSM manually drives the DOE engine and writes the decrypted data back to the key vault. 
2602+3. FW programs the DOE with the requested function (UDS, field entropy, or HEK seed de-obfuscation), and the destination for the result. 
2603+4. After de-obfuscation is complete, FW can clear out the UDS, field entropy, and HEK seed values from any flops until cptra\_pwrgood de-assertion.  
23722604
23732605 The following tables describe DOE register and control fields.
23742606
@@ -2381,13 +2613,14 @@
23812613
23822614 | DOE Ctrl Fields | Reset | Description |
23832615 | :--------------- | :----------- | :------------------------------------------------------------------------------------------------------------------------------------------- |
2384-| COMMAND\[1:0\] | Cptra_rst_b | 2’b00 Idle <br>2’b01 Run UDS flow <br>2’b10 Run FE flow <br>2’b11 Clear Obf Secrets |
2385-| DEST\[4:2\] | Cptra_rst_b | Destination register for the result of the de-obfuscation flow. Field entropy writes into DEST and DEST+1 <br>Key entry only, can’t go to PCR . |
2616+| CMD\[1:0\] | Cptra_rst_b | 2’b00 Idle <br>2’b01 Run UDS flow <br>2’b10 Run FE flow <br>2’b11 Clear Obf Secrets |
2617+| DEST\[6:2\] | Cptra_rst_b | Destination register for the result of the de-obfuscation flow. Field entropy writes into DEST and DEST+1. <br>Key entry only; can’t go to PCR. |
2618+| CMD_EXT\[8:7\] | Cptra_rst_b | 2’b00 Idle (or running a standard, non-extended command) <br>2’b01 Run OCP LOCK HEK seed flow <br>2’b10 RESERVED <br>2’b11 RESERVED |
23862619
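The field layout in the DOE control table (CMD in bits 1:0, DEST in bits 6:2, CMD_EXT in bits 8:7) can be illustrated by packing the fields into one control word. This is a sketch only: the function name `pack_doe_ctrl` and the constant names are hypothetical, and the choice to reject the RESERVED CMD_EXT encodings in software is an assumption rather than documented hardware behavior.

```python
# Encodings taken from the DOE Ctrl Fields table; names are illustrative.
CMD_IDLE, CMD_RUN_UDS, CMD_RUN_FE, CMD_CLEAR_OBF = 0b00, 0b01, 0b10, 0b11
CMD_EXT_NONE, CMD_EXT_RUN_HEK_SEED = 0b00, 0b01


def pack_doe_ctrl(cmd: int, dest: int, cmd_ext: int = CMD_EXT_NONE) -> int:
    """Pack CMD[1:0], DEST[6:2], and CMD_EXT[8:7] into one control word."""
    if not 0 <= cmd <= 0b11:
        raise ValueError("CMD is a 2-bit field")
    if not 0 <= dest <= 0b11111:
        raise ValueError("DEST is a 5-bit field")
    if cmd_ext not in (CMD_EXT_NONE, CMD_EXT_RUN_HEK_SEED):
        # Assumption: this sketch refuses the encodings marked RESERVED.
        raise ValueError("CMD_EXT encodings 2'b10 and 2'b11 are reserved")
    return cmd | (dest << 2) | (cmd_ext << 7)
```

A UDS flow targeting entry 0 packs to `pack_doe_ctrl(CMD_RUN_UDS, 0)`, while an HEK seed flow targeting entry 22 leaves CMD idle and sets CMD_EXT: `pack_doe_ctrl(CMD_IDLE, 22, CMD_EXT_RUN_HEK_SEED)`.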
23872620
23882621 ### Key vault de-obfuscation flow 
23892622
2390-1. ROM loads IV into DOE. ROM writes to the DOE control register the destination for the de-obfuscated result and sets the appropriate bit to run UDS and/or the field entropy flow. 
2623+1. ROM loads IV into DOE. ROM writes to the DOE control register the destination for the de-obfuscated result and sets the appropriate bit to run UDS, field entropy, and/or HEK seed flow. 
23912624 2. DOE state machine takes over and loads the Caliptra obfuscation key into the key register. 
23922625 3. Next, either the obfuscated UDS or field entropy are loaded into the block register 4 DWORDS at a time. 
23932626 4. Results are written to the KV entry specified in the DEST field of the DOE control register. 
@@ -2403,6 +2636,117 @@
24032636 * 4B scratchpad registers that are lockable but cleared on cold reset (8 registers)
24042637 * 4B scratchpad registers that are lockable but cleared on warm reset (10 registers)
24052638 * 4B scratchpad registers that are cleared on warm reset (8 registers)
2639+
2640+
2641+## OCP LOCK Hardware Architecture
2642+
2643+### Overview
2644+The following hardware and ROM/FW enhancements support the OCP L.O.C.K. (a.k.a. **OCP LOCK**) flows defined for SSD applications. The specification is available here:
2645+[OCP LOCK Spec](https://chipsalliance.github.io/Caliptra/ocp-lock/specification/HEAD)
2646+
2647+---
2648+
2649+### Additional Registers, Straps, and Macros for OCP LOCK
2650+
2651+- **`SS_OCP_LOCK_CTRL.LOCK_IN_PROGRESS`**
2652+ A status/control bit used to enforce the new KeyVault (KV) rules required by OCP LOCK. The bit is write-1-to-set: once set, OCP LOCK functionality persists until the register is cleared by a cold reset. See the dedicated section below for details on the behaviors this register enables.
2653+
2654+- **`ss_ocp_lock_en`** (constant-value input strap) with a corresponding bit in **`CPTRA_HW_CONFIG`** register named **`OCP_LOCK_MODE_en`**:
2655+ - Enables Caliptra ROM to perform OCP LOCK operations (e.g., using DOE for HEK seed de-obfuscation, Key Release via AXI DMA).
2656+ - Allows the ROM to set `SS_OCP_LOCK_CTRL.LOCK_IN_PROGRESS`.
2657+ - `ss_ocp_lock_en` is a strap pin and **must be driven with a constant value by the integrator**.
2658+ - The `CPTRA_HW_CONFIG` register samples this strap and stores its value in the `OCP_LOCK_MODE_en` bit.
2659+ - This bit is only reflected in `CPTRA_HW_CONFIG` if `CALIPTRA_MODE_SUBSYSTEM` is defined.
2660+
2661+- **HEK seed fuse register**
2662+ Holds the **obfuscated HEK seed**. ROM is responsible for performing the operation to de-obfuscate the HEK seed.
2663+
2664+- **Key release address and size straps**
2665+ Writable until `FUSE_WR_DONE`, then locked (same as fuses and other subsystem-mode straps).
2666+ - **Address strap** (`strap_ss_key_release_base_addr`): full destination address for key release; in OCP LOCK this is the destination for the MEK to be written. Firmware can derive the SFR base from this value as needed.
2667+ - **Size strap** (`strap_ss_key_release_key_size`): byte count of the key that the key release operation writes to the destination address. Hardware requires a dword-aligned count, so strap input values are forced to a dword-aligned value by hardware. If control firmware updates this value (prior to `FUSE_WR_DONE` being set), it must use a dword-aligned value.
2668+
2669+Refer to the [Caliptra Integration Spec](https://github.com/chipsalliance/caliptra-rtl/blob/main/docs/CaliptraIntegrationSpecification.md) for more details about macros and strap pins.
2670+
2671+---
2672+
2673+### `SS_OCP_LOCK_CTRL.LOCK_IN_PROGRESS` Register Bit
2674+
2675+**When/How it is set**
2676+- Set by **Caliptra ROM** after performing OCP LOCK-related derivations (HEK, MDK, etc.).
2677+- Can be set **iff** (`ss_ocp_lock_en` is set to 1 **AND** `CALIPTRA_MODE_SUBSYSTEM` is defined).
2678+ - Once set, a value of 1 persists until the register is cleared by cold reset.
2679+
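The set conditions above describe a sticky write-1-to-set bit gated by the strap and the build-time mode. A minimal behavioral sketch follows; the class and method names are hypothetical, and the model covers only the behaviors stated in the text (set gating, persistence, cold-reset clear).

```python
class LockInProgressModel:
    """Behavioral model of a sticky W1S bit gated by two enables."""

    def __init__(self, ss_ocp_lock_en: bool, subsystem_mode: bool):
        self.ss_ocp_lock_en = ss_ocp_lock_en      # constant-value strap
        self.subsystem_mode = subsystem_mode      # CALIPTRA_MODE_SUBSYSTEM defined
        self.value = 0

    def write(self, data: int) -> None:
        # Writing 1 sets the bit only when both enables are true.
        # Writing 0 never clears it (write-1-to-set semantics).
        if (data & 1) and self.ss_ocp_lock_en and self.subsystem_mode:
            self.value = 1

    def warm_reset(self) -> None:
        # The text says the value persists until cold reset, so a warm
        # reset leaves the bit unchanged in this model.
        pass

    def cold_reset(self) -> None:
        self.value = 0
```

In this model, a write of 1 with the strap deasserted is silently ignored; whether hardware reports such a write in any other way is not described here.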
2680+**Enforcements/Effects**
2681+- Reserves **KeyVault slots 0–15** for *standard* use-cases.
2682+- Reserves **KeyVault slots 16–23** for *OCP LOCK* use-cases.
2683+ - Key Vault slot 16 (KV16) is reserved for holding the MDK
2684+ - Key Vault slot 23 (KV23) is reserved for holding the MEK
2685+- Blocks interactions between *standard* slots and *LOCK* slots. This means that any crypto operation that uses a Key Vault input value (e.g. for Key, Block, Seed inputs) may not write the output to a Key Vault from a different region. For example, when `SS_OCP_LOCK_CTRL.LOCK_IN_PROGRESS` is set, HMAC may not perform an operation that uses Key Vault slot 8 as BLOCK input and writes the output TAG to Key Vault slot 17.
2686+- Enables **Key Release via AXI DMA**.
2687+- Enables **AES engine to write output to Key Vault, which must use KV23**.
2688+
2689+> **Note:** If `SS_OCP_LOCK_CTRL.LOCK_IN_PROGRESS` is `1`, it implies that `ss_ocp_lock_en` and `CALIPTRA_MODE_SUBSYSTEM` are also `1`.
2690+
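The region split and the cross-region restriction described above can be expressed as a small check. This is a sketch under stated assumptions: the function names are hypothetical, and treating the rule as inactive while `LOCK_IN_PROGRESS` is clear follows from the text describing the bit as what enforces these rules.

```python
STANDARD_SLOTS = range(0, 16)   # KV0..KV15, standard use-cases
LOCK_SLOTS = range(16, 24)      # KV16..KV23, OCP LOCK use-cases


def kv_region(slot: int) -> str:
    """Classify a Key Vault slot as 'standard' or 'lock'."""
    if slot in STANDARD_SLOTS:
        return "standard"
    if slot in LOCK_SLOTS:
        return "lock"
    raise ValueError(f"slot {slot} is outside KV0..KV23")


def kv_flow_allowed(src_slot: int, dst_slot: int, lock_in_progress: bool) -> bool:
    """True if a KV-sourced operation may write its output to dst_slot."""
    if not lock_in_progress:
        # Assumption: the region rule is only enforced once the bit is set.
        return True
    return kv_region(src_slot) == kv_region(dst_slot)
```

The HMAC example from the list maps to `kv_flow_allowed(8, 17, True)`, which this sketch rejects because slot 8 is a standard slot and slot 17 is a LOCK slot.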
2691+---
2692+
2693+### AES Write Path
2694+
2695+- **MEK** is the final OCP LOCK key. It is **decrypted and stored in KV23**. After decryption, MEK may be transferred to its destination (as specified by the input strap) **via AXI DMA**.
2696+- OCP LOCK requires both the **AES write path** and a **DMA path** to the MEK destination.
2697+- **Hardware enforcement:** MEK is written to **KV23**. Hardware recognizes the MEK generation request if there is an **AES-ECB decrypt** operation with **KV16 (MDK)** as the AES-ECB key and routes the result accordingly. In this case, output of the decrypted plaintext via the AES dataout register API is blocked. Any Key Vault write operation requested for the AES output that does not meet these requirements results in a Key Vault write failure status.
2698+
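The hardware enforcement rule for the AES write path reduces to a single predicate over the request parameters. The sketch below is illustrative only: the function name, the string encoding of the AES mode, and the boolean return value standing in for the Key Vault write status are all assumptions.

```python
MDK_SLOT = 16   # KV16 holds the MDK
MEK_SLOT = 23   # KV23 holds the MEK


def aes_kv_write_ok(lock_in_progress: bool, mode: str, is_decrypt: bool,
                    key_slot: int, dest_slot: int) -> bool:
    """True if an AES output may be written to the Key Vault.

    Models the stated rule: the request must be an AES-ECB decrypt keyed
    from KV16, its destination must be KV23, and the AES to KV write path
    is only enabled while LOCK_IN_PROGRESS is set. Any other request would
    yield a Key Vault write failure status (False in this model).
    """
    return (
        lock_in_progress
        and mode == "ECB"
        and is_decrypt
        and key_slot == MDK_SLOT
        and dest_slot == MEK_SLOT
    )
```

For example, an ECB decrypt keyed from KV16 that targets KV17 fails the check, as does an otherwise valid request issued while `LOCK_IN_PROGRESS` is clear.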
2699+---
2700+
2701+### KeyVault Access Rules & Filtering (when `LOCK_IN_PROGRESS` is set)
2702+
2703+- **KV23 (MEK destination)**: **write-restricted to AES only**.
2704+- **KV22 (HEK)**: **locked for writes until warm reset** (ROM requirement).
2705+- **KV16 (MDK)**: **locked for writes until warm reset** (ROM requirement).
2706+- If OCP LOCK mode is enabled:
2707+ - **KV23 must not be used** as input to other crypto operations—**only** as a **Key Release** source.
2708+ - **AES-ECB decrypt** with **key = KV16** **must** have **dest = KV23**; otherwise the destination is **FW**.
2709+ *Rationale:* Prevents malicious FW from writing known values into other KV slots via AES.
2710+- **Additional KV behaviors**
2711+ - On write, hardware validates that the **destination** is legal for the **source/read**. If not valid, the Key Vault write operation returns a failing status.
2712+ - **No parallel crypto operations** permitted for cryptographic blocks with access to Key Vault. KV does not track this; Caliptra enforces this rule by evaluating each block's busy status indicator and signaling violations through the [CPTRA_HW_ERROR_FATAL](https://chipsalliance.github.io/caliptra-rtl/main/internal-regs/?p=clp.soc_ifc_reg.CPTRA_HW_ERROR_FATAL) register and corresponding interrupt at Caliptra top level design.
2713+
2714+---
2715+
2716+### HEK Seed De‑obfuscation
2717+
2718+- Executed by **Caliptra ROM**. The DOE supports a HEK deobfuscation command that may be executed only once during a boot cycle. If Caliptra ROM does not run this flow to produce the HEK seed, it should run the flow with a dummy Key Vault slot to lock against future erroneous uses.
2719+- **Hardware-supported HEK seed Deobfuscation Path:** Ratchet Fuse Register (**obfuscated HEK seed**) → **DOE** (with `OBF_KEY`) → **KV slot 22** (de-obfuscated seed).
2720+- Caliptra ROM shall lock **KV22** for writes immediately after it has derived the **HEK** into that slot.
2721+
2722+---
2723+
2724+### Key Release
2725+
2726+Caliptra's AXI DMA supports a hardware path to write **KV23 (MEK)** to the SoC via the AXI manager interface. The following rules constrain this operation:
2727+- Allowed **only** when `SS_OCP_LOCK_CTRL.LOCK_IN_PROGRESS` (sticky **W1SET**) is set by Caliptra ROM.
2728+- Destination and size must match the values from the straps:
2729+ - `strap_ss_key_release_base_addr`
2730+ - `strap_ss_key_release_key_size`
2731+
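The key release constraints above amount to three conditions that must all hold. A minimal sketch follows; the function and parameter names are hypothetical, and the model says nothing about how hardware reports a rejected request.

```python
def key_release_allowed(lock_in_progress: bool,
                        dest_addr: int, size_bytes: int,
                        strap_base_addr: int, strap_key_size: int) -> bool:
    """True if a KV23 key release request satisfies the stated rules.

    strap_base_addr and strap_key_size stand for the values of
    strap_ss_key_release_base_addr and strap_ss_key_release_key_size.
    """
    return (
        bool(lock_in_progress)
        and dest_addr == strap_base_addr
        and size_bytes == strap_key_size
    )
```

A request whose destination or size differs from the strap values, or one issued before ROM sets `LOCK_IN_PROGRESS`, is rejected by this check.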
2732+---
2733+
2734+### Additional Security Hardening Specific to OCP LOCK Enhancements
2735+
2736+**Scan/Debug Protections**
2737+- Flush **DMA FIFOs** to prevent leakage of secrets via scan chain.
2738+- Flush **AES ↔ KV** interface.
2739+
2740+**AES/KV/DMA Robustness**
2741+- **AES → KV write path:** The key can't be written to KeyVault unless key_size bytes are decrypted by AES.
2742+- Validate **DMA `key_size`**; **error** if `key_size > 512b`.
2743+- Avoid hangs when **`key_size` != KV read DWORD count**:
2744+ - On KV reads, if `key_size` is **smaller** than the KV entry, **drop extra data** (do not push to FIFO).
2745+- **DMA KV read error**: Raised on the **first transfer cycle** from KV to DMA; DMA transitions immediately to **`DMA_ERROR`** without issuing an AXI transfer.
2746+- **KV write enable sourced from AES** (during OCP LOCK) so it **cannot** be modified mid-transfer.
2747+- **Enable AES ↔ KV write path** only if `SS_OCP_LOCK_CTRL.LOCK_IN_PROGRESS` is set.
2748+
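The `key_size` checks in the robustness list can be sketched as a filter between a Key Vault read and the DMA FIFO. Several points here are assumptions rather than documented behavior: `key_size` is taken to be a byte count (matching the size strap), `512b` is read as 512 bits (64 bytes), and the case where `key_size` exceeds the entry length is simply rejected because the text does not describe it. The function name is hypothetical.

```python
MAX_KEY_BITS = 512  # assumption: "512b" in the text means 512 bits


def dwords_for_fifo(kv_dwords: list, key_size_bytes: int) -> list:
    """Return the KV dwords that would be pushed to the DMA FIFO."""
    if key_size_bytes * 8 > MAX_KEY_BITS:
        raise ValueError("key_size exceeds 512 bits")
    if key_size_bytes % 4 != 0:
        raise ValueError("key_size must be dword-aligned")
    wanted = key_size_bytes // 4
    if wanted > len(kv_dwords):
        # Not described in the text; rejected here to keep the sketch simple.
        raise ValueError("key_size is larger than the KV entry")
    # When key_size is smaller than the entry, the extra dwords are dropped.
    return kv_dwords[:wanted]
```

With a 16-dword entry and a 32-byte `key_size`, only the first eight dwords reach the FIFO in this model.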
2749+
24062750
24072751 ## Cryptographic blocks fatal and non-fatal errors
24082752
@@ -2485,10 +2829,11 @@
24852829 9. Coron, J.-S.: Resistance against differential power analysis for elliptic curve cryptosystems. In: Koç, Ç.K., Paar, C. (eds.) CHES 1999. LNCS, vol. 1717, pp. 292–302.
24862830 10. Schindler, W., Wiemers, A.: Efficient side-channel attacks on scalar blinding on elliptic curves with special structure. In: NIST Workshop on ECC Standards (2015).
24872831 11. National Institute of Standards and Technology, "Digital Signature Standard (DSS)", Federal Information Processing Standards Publication (FIPS PUB) 186-4, July 2013.
2488-12. NIST SP 800-90A, Rev 1: "Recommendation for Random Number Generation Using Deterministic Random Bit Generators", 2012. |
2832+12. NIST SP 800-90A, Rev 1: "Recommendation for Random Number Generation Using Deterministic Random Bit Generators", 2012.
24892833 13. CHIPS Alliance, “RISC-V VeeR EL2 Programmer’s Reference Manual” \[Online\] Available at https://github.com/chipsalliance/Cores-VeeR-EL2/blob/main/docs/RISC-V_VeeR_EL2_PRM.pdf.
24902834 14. “The RISC-V Instruction Set Manual, Volume I: User-Level ISA, Document Version 20191213”, Editors Andrew Waterman and Krste Asanović, RISC-V Foundation, December 2019. Available at https://riscv.org/technical/specifications/.
24912835 15. “The RISC-V Instruction Set Manual, Volume II: Privileged Architecture, Document Version 20211203”, Editors Andrew Waterman, Krste Asanović, and John Hauser, RISC-V International, December 2021. Available at https://riscv.org/technical/specifications/.
2492-16. NIST SP 800-56A, Rev 3: "Recommendation for Pair-Wise Key-Establishment Schemes Using Discrete Logarithm Cryptography", 2018, |
2836+16. NIST SP 800-56A, Rev 3: "Recommendation for Pair-Wise Key-Establishment Schemes Using Discrete Logarithm Cryptography", 2018.
2837+17. NIST FIPS 202: "SHA-3 Standard: Permutation-Based Hash and Extendable-Output Functions", 2015. Available at: [https://csrc.nist.gov/pubs/fips/202/final](https://doi.org/10.6028/NIST.FIPS.202).
24932838
24942839 <sup>[1]</sup> _Caliptra.** **Spanish for “root cap” and describes the deepest part of the root_

Image Changes

- Caliptra_boot_fsm.png: added in v2.1 (not present in v2.0)
- Crypto-2p0.png: removed in v2.1 (present in v2.0)
- HW_mbox_boot_fsm.png: removed in v2.1 (present in v2.0)
- mbox_boot_fsm_FW_update_reset.png: removed in v2.1 (present in v2.0)