[dv] Fix how we catch iside_err in cosim #1909

ctopal · 2022-11-07T19:23:49Z

In the case of a load/store access fault trap, rvfi_order number stays same but pc_id changes. And if the older instruction is faulty we still end up pushing it to the queue. This commit fixes that by checking for a matching rvfi_order number first.

Signed-off-by: Canberk Topal ctopal@lowrisc.org

In the case of a load/store access fault trap, rvfi_order number stays same but pc_id changes. And if the older instruction is faulty we still end up pushing it to the queue. This commit fixes that by checking for a matching rvfi_order number first. Signed-off-by: Canberk Topal <ctopal@lowrisc.org>

andreaskurth

With this change, riscv_mem_error_test passes for 14/15 seeds (starting at seed 16776). The one failure is due to an illegal cross bin:

riscv_mem_error_test.16781
--------------------------
binary:          test.bin
rtl_log:         rtl_sim.log
rtl_trace:       trace_core_00000000.log
iss_cosim_trace: spike_cosim_trace_core_00000000.log

[FAILED]: error seen in 'rtl_sim.log'
---------------*LOG-EXTRACT*----------------
    137: UVM_INFO /home/dev/src/dv/uvm/core_ibex/tb/core_ibex_tb_top.sv(163) @ 1247380: reporter [core_ibex_tb_top.unmblk1] Enabling assertions: core_ibex_tb_top.NoAlertsTriggered
    138: UVM_INFO /home/dev/src/dv/uvm/core_ibex/tb/core_ibex_tb_top.sv(163) @ 1315760: reporter [core_ibex_tb_top.unmblk1] Enabling assertions: core_ibex_tb_top.NoAlertsTriggered
    139: UVM_INFO /home/dev/src/dv/uvm/core_ibex/tb/core_ibex_tb_top.sv(163) @ 1404020: reporter [core_ibex_tb_top.unmblk1] Enabling assertions: core_ibex_tb_top.NoAlertsTriggered
    140: UVM_INFO /home/dev/src/dv/uvm/core_ibex/tb/core_ibex_tb_top.sv(163) @ 1440280: reporter [core_ibex_tb_top.unmblk1] Enabling assertions: core_ibex_tb_top.NoAlertsTriggered
    141: xmsim: *SE,EILLCT: (File: ./fcov/core_ibex_fcov_if.sv, Line: 691):(Time: 1440940 NS + 4) Illegal cross tuple (<0, 1, InstrCategoryFetchError, IdStallTypeMem, 0, 0, 0>) occurred corresponding to illegal cross bin (mem_stall_illegal) of cross (core_ibex_tb_top.dut.u_ibex_top.u_ibex_core.u_fcov_bind.uarch_cg_inst@11436_628.exception_stall_instr_cross).
[E] 142: xmsim: *SE,EILLCT: (File: ./fcov/core_ibex_fcov_if.sv, Line: 691):(Time: 1440980 NS + 4) Illegal cross tuple (<0, 1, InstrCategoryFetchError, IdStallTypeMem, 0, 0, 0>) occurred corresponding to illegal cross bin (mem_stall_illegal) of cross (core_ibex_tb_top.dut.u_ibex_top.gen_lockstep.u_ibex_lockstep.u_shadow_core.u_fcov_bind.uarch_cg_inst@11451_1.exception_stall_instr_cross).
    143: UVM_INFO /home/dev/src/dv/uvm/core_ibex/tb/core_ibex_tb_top.sv(163) @ 1480360: reporter [core_ibex_tb_top.unmblk1] Enabling assertions: core_ibex_tb_top.NoAlertsTriggered
    144: UVM_INFO /home/dev/src/dv/uvm/core_ibex/tb/core_ibex_tb_top.sv(163) @ 1569100: reporter [core_ibex_tb_top.unmblk1] Enabling assertions: core_ibex_tb_top.NoAlertsTriggered
    145: UVM_INFO /home/dev/src/dv/uvm/core_ibex/tb/core_ibex_tb_top.sv(163) @ 1639980: reporter [core_ibex_tb_top.unmblk1] Enabling assertions: core_ibex_tb_top.NoAlertsTriggered
    146: UVM_INFO /home/dev/src/dv/uvm/core_ibex/tb/core_ibex_tb_top.sv(163) @ 1717520: reporter [core_ibex_tb_top.unmblk1] Enabling assertions: core_ibex_tb_top.NoAlertsTriggered
--------------------------------------------

Summary of the full regression:

Test	Passing	Failed	Timed-out	Total	Pass Rate
riscv_arithmetic_basic_test	10	0	0	10	100.00 %
riscv_assorted_traps_interrupts_debug_test	3	7	0	10	30.00 %
riscv_bitmanip_balanced_test	10	0	0	10	100.00 %
riscv_bitmanip_otearlgrey_test	10	0	0	10	100.00 %
riscv_csr_test	5	0	0	5	100.00 %
riscv_debug_basic_test	8	2	0	10	80.00 %
riscv_debug_branch_jump_test	9	0	1	10	90.00 %
riscv_debug_csr_entry_test	10	0	0	10	100.00 %
riscv_debug_ebreak_test	14	0	1	15	93.33 %
riscv_debug_ebreakmu_test	13	1	1	15	86.67 %
riscv_debug_in_irq_test	9	0	1	10	90.00 %
riscv_debug_instr_test	24	0	1	25	96.00 %
riscv_debug_single_step_test	14	1	0	15	93.33 %
riscv_debug_stress_test	15	0	0	15	100.00 %
riscv_debug_triggers_test	4	0	1	5	80.00 %
riscv_debug_wfi_test	8	1	1	10	80.00 %
riscv_dret_test	4	1	0	5	80.00 %
riscv_ebreak_test	9	1	0	10	90.00 %
riscv_epmp_mml_execute_only_test	20	0	0	20	100.00 %
riscv_epmp_mml_read_only_test	20	0	0	20	100.00 %
riscv_epmp_mml_test	0	0	20	20	0.00 %
riscv_epmp_mmwp_test	0	0	20	20	0.00 %
riscv_epmp_rlb_test	19	0	1	20	95.00 %
riscv_hint_instr_test	10	0	0	10	100.00 %
riscv_icache_intg_test	15	0	0	15	100.00 %
riscv_illegal_instr_test	14	1	0	15	93.33 %
riscv_interrupt_csr_test	10	0	0	10	100.00 %
riscv_interrupt_instr_test	25	0	0	25	100.00 %
riscv_interrupt_wfi_test	15	0	0	15	100.00 %
riscv_invalid_csr_test	10	0	0	10	100.00 %
riscv_irq_in_debug_mode_test	9	0	1	10	90.00 %
riscv_jump_stress_test	10	0	0	10	100.00 %
riscv_loop_test	10	0	0	10	100.00 %
riscv_machine_mode_rand_test	10	0	0	10	100.00 %
riscv_mem_error_test	14	1	0	15	93.33 %
riscv_mem_intg_error_test	13	2	0	15	86.67 %
riscv_mmu_stress_test	10	0	0	10	100.00 %
riscv_multiple_interrupt_test	9	0	1	10	90.00 %
riscv_nested_interrupt_test	10	0	0	10	100.00 %
riscv_pc_intg_test	15	0	0	15	100.00 %
riscv_pmp_basic_test	49	0	1	50	98.00 %
riscv_pmp_disable_all_regions_test	50	0	0	50	100.00 %
riscv_pmp_full_random_test	283	2	315	600	47.17 %
riscv_pmp_out_of_bounds_test	48	0	2	50	96.00 %
riscv_rand_instr_test	10	0	0	10	100.00 %
riscv_rand_jump_test	8	0	2	10	80.00 %
riscv_reset_test	15	0	0	15	100.00 %
riscv_rf_intg_test	15	0	0	15	100.00 %
riscv_single_interrupt_test	14	0	1	15	93.33 %
riscv_umode_tw_test	9	1	0	10	90.00 %
riscv_unaligned_load_store_test	5	0	0	5	100.00 %
riscv_user_mode_rand_test	10	0	0	10	100.00 %
TOTAL	963	21	371	1355	71.07 %

andreaskurth · 2022-11-08T08:00:43Z

This is a follow-up on #1895 to improve the pass rate of riscv_mem_error_test.

ctopal · 2022-11-08T12:32:00Z

dv/uvm/core_ibex/common/ibex_cosim_agent/ibex_cosim_scoreboard.sv

+      // Determine if we were here before. If so, that means this rvfi_order number was actually used
+      // twice (a load/store fault caused jumping to the handler). Delete the old iside_error_queue
+      // element.
+      if (latest_order == instr_vif.instr_cb.rvfi_order_id) begin
+        iside_error_queue.pop_back();
+      end


After a bit of thought, I am not sure if this would work considering we actually don't check for failed_iside_accesses.exists(aligned_addr) && !iside_pmp_failure.exists(aligned_addr) in here. Concern being, we might end up deleting a valid iside_error. I guess the solution would be to check for the condition again before calculating the new aligned_addr, WDYT @GregAC ? Regression numbers show no such problem but just wanted to be sure before merging this.

GregAC

Thanks for looking into this @ctopal it's a tricky scenario!

After some thought I think I prefer a different fix, your fix uses the latest_order matching the just seen instruction order to detect the scenario and whilst this currently seems effective (i.e. there aren't other places where this is the case) I worry it'll generated other bugs down the line.

My fix is here: #1922 it aims to detect the specific scenario and remove the flushed errors from the iside error queue.

andreaskurth · 2022-11-14T09:37:38Z

Should we close this in favor of #1922?

GregAC · 2022-11-14T10:12:23Z

Should we close this in favor of #1922?

That's the idea but I wanted to have #1922 reviewed first so we can agree (or not) that's the preferred solution.

ctopal · 2022-11-14T10:35:09Z

Yes I believe #1922 is preferred as it's more of a general solution.

ctopal added Type:Enhancement Feature requests, enhancements Component:DV Design verification (DV) or testing issue labels Nov 7, 2022

ctopal requested review from GregAC and andreaskurth November 7, 2022 19:23

andreaskurth approved these changes Nov 8, 2022

View reviewed changes

ctopal commented Nov 8, 2022

View reviewed changes

GregAC suggested changes Nov 11, 2022

View reviewed changes

ctopal closed this Nov 14, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[dv] Fix how we catch iside_err in cosim #1909

[dv] Fix how we catch iside_err in cosim #1909

Uh oh!

ctopal commented Nov 7, 2022

Uh oh!

andreaskurth left a comment

Uh oh!

andreaskurth commented Nov 8, 2022 •

edited

Loading

Uh oh!

ctopal Nov 8, 2022

Uh oh!

GregAC left a comment •

edited

Loading

Uh oh!

andreaskurth commented Nov 14, 2022

Uh oh!

GregAC commented Nov 14, 2022

Uh oh!

ctopal commented Nov 14, 2022

Uh oh!

Uh oh!

[dv] Fix how we catch iside_err in cosim #1909

[dv] Fix how we catch iside_err in cosim #1909

Uh oh!

Conversation

ctopal commented Nov 7, 2022

Uh oh!

andreaskurth left a comment

Choose a reason for hiding this comment

Uh oh!

andreaskurth commented Nov 8, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ctopal Nov 8, 2022

Choose a reason for hiding this comment

Uh oh!

GregAC left a comment • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

andreaskurth commented Nov 14, 2022

Uh oh!

GregAC commented Nov 14, 2022

Uh oh!

ctopal commented Nov 14, 2022

Uh oh!

Uh oh!

andreaskurth commented Nov 8, 2022 •

edited

Loading

GregAC left a comment •

edited

Loading