2. Core Overview
This chapter provides a high-level overview of the VeeR EL2 core and core complex. VeeR EL2 is a small 32-bit CPU core that supports machine mode (M-mode) and user mode (U-mode). It implements RISC-V’s integer (I), compressed instruction (C), and multiplication and division (M) extensions, as well as the instruction-fetch fence, CSR, and a subset of the bit manipulation (Z) extensions. The core contains a 4-stage, scalar, in-order pipeline.
2.1. Features
The VeeR EL2 core complex’s feature set includes:
RV32IMC-compliant RISC-V core with branch predictor
Optional instruction and data closely-coupled memories with ECC protection (load-to-use latency of 1 cycle for smaller and 2 cycles for larger memories)
Optional 2- or 4-way set-associative instruction cache with parity or ECC protection (32- or 64-byte line size)
Optional programmable interrupt controller supporting up to 255 external interrupts
Four system bus interfaces for instruction fetch, data accesses, debug accesses, and external DMA accesses to closely-coupled memories (configurable as 64-bit AXI4 or AHB-Lite)
Core debug unit compliant with the RISC-V Debug specification [3]
600MHz target frequency (for 16nm technology node)
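
The optional features listed above imply a small set of legal build-time choices. The following is a minimal sketch of how those constraints could be checked. All field names and string values are hypothetical and chosen for illustration only; they are not the parameter names used by the actual VeeR EL2 configuration flow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CoreComplexConfig:
    """Hypothetical container for the options named in the feature list."""
    icache_enabled: bool = True
    icache_ways: int = 2              # 2- or 4-way set-associative
    icache_line_bytes: int = 64       # 32- or 64-byte line size
    icache_protection: str = "ecc"    # "parity" or "ecc"
    pic_enabled: bool = True
    external_interrupts: int = 255    # up to 255 external interrupts
    bus: str = "axi4"                 # "axi4" (64-bit AXI4) or "ahb-lite"


def validate(cfg: CoreComplexConfig) -> list:
    """Return a list of violations; an empty list means cfg is consistent."""
    errors = []
    if cfg.icache_enabled:
        if cfg.icache_ways not in (2, 4):
            errors.append("I-cache must be 2- or 4-way set-associative")
        if cfg.icache_line_bytes not in (32, 64):
            errors.append("I-cache line size must be 32 or 64 bytes")
        if cfg.icache_protection not in ("parity", "ecc"):
            errors.append("I-cache protection must be parity or ECC")
    # The lower bound of 1 is an assumption; the text only gives the maximum.
    if cfg.pic_enabled and not 1 <= cfg.external_interrupts <= 255:
        errors.append("PIC supports at most 255 external interrupts")
    if cfg.bus not in ("axi4", "ahb-lite"):
        errors.append("bus interfaces must be AXI4 or AHB-Lite")
    return errors
```

For example, `validate(CoreComplexConfig(icache_ways=3))` reports a single violation, while the default configuration passes.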
2.2. Core Complex
Fig. 2.1 depicts the core complex and its functional blocks, which are described further in section Functional Blocks.
2.3. Functional Blocks
The VeeR EL2 core complex’s functional blocks are described in more detail in the following sections.
2.3.1. Core
Fig. 2.2 depicts the scalar 4-stage core with one execution pipeline, one load/store pipeline, one multiplier pipeline, and one out-of-pipeline divider. There are two stall points in the pipeline: ‘Fetch’ and ‘Decode’. The diagram also shows how VeeR EH1’s logic stages have been shifted up and merged into 4 stages named Fetch (F), Decode (D), Execute/Memory (X/M), and Retire (R). Also shown is additional logic such as a new branch adder in the D stage. The branch mispredict penalty is either 1 or 2 cycles in VeeR EL2.
The merged F stage performs the program counter calculation and the I-cache/ICCM memory access in parallel. The load pipeline has been moved up so that the DC1 memory address generation (AGU) logic is now combined with align and decode logic to enable a DCCM memory access to start at the beginning of the M stage. The design supports a load-to-use of 1 cycle for smaller memories and a load-to-use of 2 cycles for larger memories. For 1-cycle load-to-use, the memory is accessed and the load data aligned and formatted for the register file and forwarding paths, all in the single-cycle M stage. For 2-cycle load-to-use, almost the entire M stage is allocated to the memory access, and the DC3/DC4 logic combined into the R stage is used to perform the load align and formatting for the register file and forwarding paths. EX3 and EX4/WB are combined into the R stage and primarily used for commit and writeback to update the architectural registers.
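
The two load-to-use cases described above can be summarized in a small timing sketch. It assumes the conventional reading of load-to-use latency, in which a latency of 1 lets a dependent instruction follow the load back to back and each additional cycle costs one bubble. The function names are hypothetical, and the model is illustrative rather than cycle-accurate.

```python
# The four merged pipeline stages named in the text.
STAGES = ("F", "D", "X/M", "R")


def load_data_stage(load_to_use: int) -> str:
    """Stage in which load data is aligned and formatted, per the text above."""
    if load_to_use == 1:
        return "X/M"   # access, align, and format all fit in the M stage
    if load_to_use == 2:
        return "R"     # M stage is spent on the access; R does the align/format
    raise ValueError("load-to-use must be 1 or 2 cycles")


def stall_cycles(load_to_use: int, independent_between: int = 0) -> int:
    """Bubbles seen by the first consumer of a load (assumed model).

    independent_between is the number of unrelated instructions placed
    between the load and the instruction that uses its result.
    """
    if load_to_use not in (1, 2):
        raise ValueError("load-to-use must be 1 or 2 cycles")
    return max(0, load_to_use - 1 - independent_between)
```

Under this model, a consumer immediately after a load sees no bubble with a 1-cycle memory and one bubble with a 2-cycle memory; scheduling one independent instruction in between hides the extra cycle.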
2.4. Standard Extensions
The VeeR EL2 core implements the following RISC-V standard extensions:
| Extension | Description | References |
|---|---|---|
| M | Integer multiplication and division | Chapter 7 in [1] |
| C | Compressed instructions | Chapter 16 in [1] |
| Zicsr | Control and status register (CSR) instructions | Chapter 9 in [1] |
| Zifencei | Instruction-fetch fence | Chapter 3 in [1] |
| Zba [1] (address calculation) (frozen) | Bit manipulation instructions | Chapter 2 in [4] |
| Zbb [2] (base) (frozen) | Bit manipulation instructions | Chapter 2 in [4] |
| Zbc [3] (carry-less multiply) (frozen) | Bit manipulation instructions | Chapter 2 in [4] |
| Zbs [4] (single-bit) (frozen) | Bit manipulation instructions | Chapter 2 in [4] |
| Zbe [5] (bit compress/decompress) (stable) | Bit manipulation instructions | Chapter 2 in [4] |
| Zbf [6] (bit-field place) (stable) | Bit manipulation instructions | Chapter 2 in [4] |
| Zbp [7] (bit permutation) (stable) | Bit manipulation instructions | Chapter 2 in [4] |
| Zbr [8] (CRC) (stable) | Bit manipulation instructions | Chapter 2 in [4] |

frozen
means that the marked extension is not expected to change.
stable
means that the marked extension may still change.
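
To give a feel for what the frozen bit manipulation groups provide, the sketch below models one instruction from each of Zba, Zbb, Zbc, and Zbs as a plain function on 32-bit values. The mnemonics and semantics follow the ratified RISC-V bit manipulation extensions; the draft cited as [4] may differ in detail, so treat this as an illustration rather than a description of the VeeR EL2 implementation.

```python
XLEN = 32
MASK = (1 << XLEN) - 1  # keep results within a 32-bit register


def sh1add(rs1: int, rs2: int) -> int:
    """Zba: shift rs1 left by 1 and add rs2 (halfword array indexing)."""
    return (rs2 + (rs1 << 1)) & MASK


def andn(rs1: int, rs2: int) -> int:
    """Zbb: bitwise AND of rs1 with the inverse of rs2."""
    return rs1 & ~rs2 & MASK


def clmul(rs1: int, rs2: int) -> int:
    """Zbc: low XLEN bits of the carry-less (XOR-based) product."""
    out = 0
    for i in range(XLEN):
        if (rs2 >> i) & 1:
            out ^= rs1 << i
    return out & MASK


def bset(rs1: int, rs2: int) -> int:
    """Zbs: set the single bit of rs1 selected by the low 5 bits of rs2."""
    return (rs1 | (1 << (rs2 & (XLEN - 1)))) & MASK
```

For instance, `sh1add(3, 10)` yields 16, and `clmul(0b11, 0b11)` yields `0b101` because partial products are combined with XOR instead of addition.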