Clock unstable during boot, tnCart stuck in a boot loop #14

@herraa1

After many users reported that the latest tnCartWonder bitstreams were not booting on their MSX machines with WonderTANG! V2.0b boards (strangely, my Panasonic FS-A1WSX and Toshiba HX-10 were not affected, or only rarely), I started troubleshooting the issue.

Users observed that their WonderTANG! V2.0b boards got stuck while booting, showing a continuous fast-blinking board LED and never reaching the MSX boot screen (the video output stayed black).

I made a minimal tnCart configuration for my WonderTANG! V2.0b with only the changes needed to make it WonderTANG-compatible, i.e.:

  • use the current tnCart code, without tweaks (and without adding the i2s transmitter for the MAX98357A)
  • swap MSEL0 and MSEL1
  • add the required pin mappings for WonderTANG! V2.0b
  • leave the external and internal sound outputs unconnected

User pakoto from msx.org (thanks!) helped me run tests remotely with his WonderTANG! V2.0b on his Panasonic FS-A1 and a Sanyo MSX1, both of which he reported as failing 100% of the time.
With that bare-minimum setup we did observe the same boot loop with the fast-blinking board LED.

I started suspecting that something was going wrong during the boot process, before the MSX is released to boot.

I changed the setup slightly so that I could use the LED output to debug some signals, connecting it to a logic analyzer.
I routed the CLK_MEM_READY signal to the LED pin (IO 75) and saw that during all those failed boots CLK_MEM_READY was deasserted before the bootloader had time to complete, causing a RESET and producing the boot loop.
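
For anyone who wants to repeat that measurement, here is a minimal sketch of such a debug tap. It is not the code I actually used: the module name, the port names and the sticky `drop_seen` flag are hypothetical, and it assumes the free-running 27MHz board clock is available so that the monitor keeps working even if the PLL-derived clock stops.

```verilog
// Hypothetical debug helper, not part of tnCart.
// Exposes CLK_MEM_READY on a pin for a logic analyzer and latches a flag
// the first time the signal falls after having been high.
module ready_drop_monitor (
    input  wire clk_27m,        // free-running board oscillator (assumed available)
    input  wire clk_mem_ready,  // signal under observation, asynchronous to clk_27m
    output wire led_raw,        // direct copy, to be wired to the LED pin
    output reg  drop_seen       // sticky flag: ready went high and then low again
);
    reg [1:0] sync;             // two-flop synchronizer into the clk_27m domain
    reg       prev;             // synchronized value from the previous cycle

    initial begin
        sync      = 2'b00;
        prev      = 1'b0;
        drop_seen = 1'b0;
    end

    assign led_raw = clk_mem_ready;

    always @(posedge clk_27m) begin
        sync <= {sync[0], clk_mem_ready};
        prev <= sync[1];
        // A falling edge on the synchronized signal means ready was lost.
        if (prev && !sync[1])
            drop_seen <= 1'b1;
    end
endmodule
```

The raw copy is what goes to the logic analyzer. The sticky flag is only a convenience for setups without an analyzer, since it stays set after the first loss of CLK_MEM_READY.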

I wondered which change could have caused the problem, and remembered that in commit 62a343a "Fix #7 Synchronize the operating clock with the bus clock" you switched the source of the FPGA base clock from the 27MHz clock of the Tang Nano 20k to the 3.58MHz cartridge clock.

To confirm whether that was the issue, and without further analyzing why the clock is not stable for the whole bootloader phase, I did a quick test switching the base clock back to the old 108MHz derived from the 27MHz Tang Nano 20k clock (only CLK_MEM, without touching anything else).

So basically I commented out this:

/*
    wire CLK_MEM_LOCK;
    assign CLK_MEM_READY = RESET_n && CLK_MEM_LOCK;
    rPLL u_pll_base (
        .CLKOUT(CLK_MEM),
        .LOCK(CLK_MEM_LOCK),
        .CLKOUTP(CLK_MEM_P),
        .CLKOUTD(),
        .CLKOUTD3(),
        .RESET(!RESET_n),
        .RESET_P(1'b0),
        .CLKIN(CLK_IN),
        .CLKFB(1'b0),
        .FBDSEL({1'b0,1'b0,1'b0,1'b0,1'b0,1'b0}),
        .IDSEL({1'b0,1'b0,1'b0,1'b0,1'b0,1'b0}),
        .ODSEL({1'b0,1'b0,1'b0,1'b0,1'b0,1'b0}),
        .PSDA({1'b0,1'b0,1'b0,1'b0}),
        .DUTYDA({1'b0,1'b0,1'b0,1'b0}),
        .FDLY({1'b1,1'b1,1'b1,1'b1})
    );

    defparam u_pll_base.FCLKIN = "3.58";
    defparam u_pll_base.DYN_IDIV_SEL = "false";
    defparam u_pll_base.IDIV_SEL = 0;
    defparam u_pll_base.DYN_FBDIV_SEL = "false";
    defparam u_pll_base.FBDIV_SEL = 29;
    defparam u_pll_base.DYN_ODIV_SEL = "false";
    defparam u_pll_base.ODIV_SEL = 8;
    defparam u_pll_base.PSDA_SEL = "1000";
    defparam u_pll_base.DYN_DA_EN = "false";
    defparam u_pll_base.DUTYDA_SEL = "1000";
    defparam u_pll_base.CLKOUT_FT_DIR = 1'b1;
    defparam u_pll_base.CLKOUTP_FT_DIR = 1'b1;
    defparam u_pll_base.CLKOUT_DLY_STEP = 0;
    defparam u_pll_base.CLKOUTP_DLY_STEP = 0;
    defparam u_pll_base.CLKFB_SEL = "internal";
    defparam u_pll_base.CLKOUT_BYPASS = "false";
    defparam u_pll_base.CLKOUTP_BYPASS = "false";
    defparam u_pll_base.CLKOUTD_BYPASS = "false";
    defparam u_pll_base.DYN_SDIV_SEL = 2;
    defparam u_pll_base.CLKOUTD_SRC = "CLKOUT";
    defparam u_pll_base.CLKOUTD3_SRC = "CLKOUT";
    defparam u_pll_base.DEVICE = "GW2AR-18C";
*/

and added this:

    wire CLK_MEM_LOCK;
    assign CLK_MEM_READY = RESET_n && CLK_MEM_LOCK;
    localparam FREQ=108_000;
    rPLL u_pll_base (
        .CLKOUT(CLK_MEM),
        .LOCK(CLK_MEM_LOCK),
        .CLKOUTP(CLK_MEM_P),
        .CLKOUTD(),
        .CLKOUTD3(),
        .RESET(!RESET_n),
        .RESET_P(1'b0),
        .CLKIN(CLK_27M),
        .CLKFB(1'b0),
        .FBDSEL(6'b000000),
        .IDSEL(6'b000000),
        .ODSEL(6'b000000),
        .PSDA(4'b0000),
        .DUTYDA(4'b0000),
        .FDLY(4'b1111)
    );
    defparam u_pll_base.FCLKIN = "27";
    defparam u_pll_base.DYN_IDIV_SEL = "false";
    defparam u_pll_base.IDIV_SEL = 0;
    defparam u_pll_base.DYN_FBDIV_SEL = "false";
    defparam u_pll_base.FBDIV_SEL = 3;
    defparam u_pll_base.DYN_ODIV_SEL = "false";
    defparam u_pll_base.ODIV_SEL = 8;
    defparam u_pll_base.PSDA_SEL = "1000";
    defparam u_pll_base.DYN_DA_EN = "false";
    defparam u_pll_base.DUTYDA_SEL = "1000";
    defparam u_pll_base.CLKOUT_FT_DIR = 1'b1;
    defparam u_pll_base.CLKOUTP_FT_DIR = 1'b1;
    defparam u_pll_base.CLKOUT_DLY_STEP = 0;
    defparam u_pll_base.CLKOUTP_DLY_STEP = 0;
    defparam u_pll_base.CLKFB_SEL = "internal";
    defparam u_pll_base.CLKOUT_BYPASS = "false";
    defparam u_pll_base.CLKOUTP_BYPASS = "false";
    defparam u_pll_base.CLKOUTD_BYPASS = "false";
    defparam u_pll_base.DYN_SDIV_SEL = 2;
    defparam u_pll_base.CLKOUTD_SRC = "CLKOUT";
    defparam u_pll_base.CLKOUTD3_SRC = "CLKOUT";
    defparam u_pll_base.DEVICE = "GW2AR-18C";
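
As a sanity check on the two parameter sets above, the output frequencies can be computed from the divider settings. This is a simulation-only sketch with a hypothetical module name; it assumes the usual Gowin rPLL relations, f_CLKOUT = f_CLKIN * (FBDIV_SEL + 1) / (IDIV_SEL + 1) and f_VCO = f_CLKOUT * ODIV_SEL. With those relations the 27MHz configuration gives 27 * 4 = 108MHz, and the 3.58MHz configuration gives 3.58 * 30, about 107.4MHz, so both variants aim at roughly the same CLK_MEM frequency and differ mainly in the clock source.

```verilog
// Simulation-only sketch (hypothetical module), e.g. for Icarus Verilog.
// Recomputes CLKOUT and VCO frequencies from the divider settings above,
// assuming:
//   f_CLKOUT = f_CLKIN * (FBDIV_SEL + 1) / (IDIV_SEL + 1)
//   f_VCO    = f_CLKOUT * ODIV_SEL
module pll_freq_check;
    localparam real    F_BOARD = 27.0;  // MHz, Tang Nano 20k oscillator
    localparam real    F_CART  = 3.58;  // MHz, cartridge clock as given in FCLKIN
    localparam integer ODIV    = 8;     // ODIV_SEL in both configurations

    // New test configuration: FBDIV_SEL = 3, IDIV_SEL = 0
    localparam real OUT_BOARD = F_BOARD * (3 + 1) / (0 + 1);
    // Commented-out configuration: FBDIV_SEL = 29, IDIV_SEL = 0
    localparam real OUT_CART  = F_CART * (29 + 1) / (0 + 1);

    initial begin
        $display("27 MHz source:   CLKOUT = %0.2f MHz, VCO = %0.2f MHz",
                 OUT_BOARD, OUT_BOARD * ODIV);
        $display("3.58 MHz source: CLKOUT = %0.2f MHz, VCO = %0.2f MHz",
                 OUT_CART, OUT_CART * ODIV);
        $finish;
    end
endmodule
```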


After that change, pakoto's WonderTANG! V2.0b started to boot 100% reliably on his Panasonic FS-A1 and Sanyo MSX1 without getting into the boot loop anymore (something that had never happened before).

So there is clearly a problem when using the CART_CLK: at least on some MSX machines, the clock is not stable (and stops being ready) in the middle of the bootloader process.

As I said, I didn't analyze the root cause further, but it could be related to the following: since the CART_CLK comes from the CPU clock and the CPU clock comes from the VDP clock, maybe holding SYSRESET during the bootloader causes the VDP to stop providing a clock. On some MSX machines that happens earlier than on others, letting some of them complete the bootloader (and boot the MSX system) while others get stuck in the dreaded boot loop.
That would also explain why the boot loop never happens when the 27MHz clock is used to derive the base clock.

Of course, changing the clock that way was just to confirm the issue; it is NOT a proposed solution, as that change has other implications and you made the original change to fix other issues.

So some thought is needed to fix the problem the right way.

Looking forward to your comments.
