Commit 7de2dcf

Add a mechanism for fully arbitrary ALUs.
If gowin_pack encounters an ALU with the RAW_ALU_LUT parameter set, the value of that parameter is used as-is to set the ALU LUT fuses. This allows nextpnr, for example, to disconnect constant inputs from their nets and modify the ALU LUT so that those inputs do not affect the result.

Signed-off-by: YRabbit <rabbit@yrabbit.cyou>
1 parent c18d795 commit 7de2dcf
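
The commit message describes modifying the ALU LUT so that a constant input no longer affects the result. The following is a minimal sketch of that idea, not nextpnr's actual code: `fold_constant_input` is a hypothetical helper, and it assumes the bit address inside the 16-bit init is `A + 2*B + 4*C + 8*D`, which is consistent with the patterns listed in doc/alu.md.

```python
def fold_constant_input(init: str, input_index: int, value: int) -> str:
    """Return a 16-bit LUT init in which one input is treated as a constant.

    `init` is written MSB first (bit 15 ... bit 0). The address of a bit is
    assumed to be A + 2*B + 4*C + 8*D, so input_index 0..3 means A..D.
    This helper is hypothetical and only illustrates the idea.
    """
    if len(init) != 16 or set(init) - {"0", "1"}:
        raise ValueError("init must be a string of 16 '0'/'1' characters")
    if input_index not in range(4) or value not in (0, 1):
        raise ValueError("input_index must be 0..3 and value must be 0 or 1")

    by_address = init[::-1]  # by_address[n] is the LUT output for address n
    mask = 1 << input_index
    folded = []
    for address in range(16):
        # Read the entry the original LUT would return with the input forced.
        source = (address | mask) if value else (address & ~mask)
        folded.append(by_address[source])
    return "".join(folded)[::-1]


# The "add" pattern from doc/alu.md with input B tied to constant 1.
raw_alu_lut = fold_constant_input("0110000001101010", input_index=1, value=1)
print(raw_alu_lut)
```

After folding, flipping the chosen input selects an identical LUT entry, so the net that used to drive it can be left unconnected.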

3 files changed

+24
-10
lines changed

apycula/gowin_pack.py

Lines changed: 18 additions & 4 deletions
@@ -2355,16 +2355,30 @@ def place_lut(db, tiledata, tile, parms, num, row, col, slice_attrvals):
 def place_alu(db, tiledata, tile, parms, num, row, col, slice_attrvals):
     lutmap = tiledata.bels[f'LUT{num}'].flags
     alu_bel = tiledata.bels[f"ALU{num}"]
-    mode = str(parms['ALU_MODE'])
     for r_c in lutmap.values():
         for r, c in r_c:
             tile[r][c] = 0
-    if mode in alu_bel.modes:
-        bits = alu_bel.modes[mode]
+    # RAW_ALU_LUT - bits for ALU LUT init value, which are formed in nextpnr as
+    # a result of optimization.
+    if 'RAW_ALU_LUT' in parms:
+        alu_init = parms['RAW_ALU_LUT']
+        if len(alu_init) > 16:
+            alu_init = alu_init[-16:]
+        else:
+            alu_init = alu_init*(16 // len(alu_init))
+        bits = set()
+        for bitnum, bit in enumerate(alu_init[::-1]):
+            if bit == '0':
+                bits.update(lutmap[bitnum])
     else:
-        bits = alu_bel.modes[str(int(mode, 2))]
+        mode = str(parms['ALU_MODE'])
+        if mode in alu_bel.modes:
+            bits = alu_bel.modes[mode]
+        else:
+            bits = alu_bel.modes[str(int(mode, 2))]
     for r, c in bits:
         tile[r][c] = 1
+    #print(row, col, num, bits)
 
     # enable ALU
     alu_mode_attrs = slice_attrvals.setdefault((row, col, int(num) // 2), {})
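
The RAW_ALU_LUT branch can be tried outside gowin_pack. The sketch below copies its normalisation and fuse-selection logic into a standalone function. The name `raw_alu_lut_to_fuses` and the `toy_lutmap` table are assumptions made for illustration; the real `lutmap` comes from the chip database and holds actual fuse coordinates.

```python
def raw_alu_lut_to_fuses(alu_init: str, lutmap: dict) -> set:
    """Mirror the RAW_ALU_LUT branch of place_alu on plain data.

    alu_init is a bit string written MSB first; lutmap maps a LUT bit number
    to the collection of (row, col) fuses belonging to that bit.
    """
    if len(alu_init) > 16:
        alu_init = alu_init[-16:]                      # keep the low 16 bits
    else:
        alu_init = alu_init * (16 // len(alu_init))    # repeat short values
    bits = set()
    # Reverse the string so that bitnum 0 is the least significant bit.
    for bitnum, bit in enumerate(alu_init[::-1]):
        if bit == '0':                                 # fuses are set for 0 bits
            bits.update(lutmap[bitnum])
    return bits


# Hypothetical fuse table: LUT bit n is controlled by one fuse at (0, n).
toy_lutmap = {n: {(0, n)} for n in range(16)}

fuses = raw_alu_lut_to_fuses("0110000001101010", toy_lutmap)
print(sorted(fuses))
```

As in the commit, an empty parameter value is not handled and would raise `ZeroDivisionError` in the repetition step.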

doc/alu.md

Lines changed: 5 additions & 5 deletions
@@ -18,8 +18,8 @@ The ALU hard logic takes the shape of a full adder, where the carry chain is ful
 On the synthesis side, the ALU primitive supports 9 modes, which correspond to a bit pattern stored in the LUT, as well as which ports are used, and which are set to constant values.
 
 ```
-add(0) 0011000011001100 A:- B:I0 C:1 D:I1 CIN:0
-sub(1) 1010000001011010 A:I0 B:- C:1 D:I1 CIN:1
+add(0) 0110000001101010 A:I0 B:I1 C:1 D:- CIN:0
+sub(1) 1001000010011010 A:I0 B:I1 C:1 D:- CIN:1
 addsub(2) 0110000010011010 A:I0 B:I1 C:1 D:I3 CIN:??
 ne(3) 1001000010011111 A:I0 B:I1 C:1 D:- CIN:??
 ge(4) 1001000010011010 A:I0 B:I1 C:1 D:- CIN:??
@@ -30,12 +30,12 @@ cupcdn(8) 1010000001011010 A:I0 B:I1 C:1 D:I3 CIN:??
 mul(9) 0111100010001000 A:I0 B:I1 C:0 D:1 CIN:??
 ```
 
-These values should be understood as follows: The lowest 4 bits are shared between the `LUT4` and the carry "selector" `LUT2`, so in the case of `ADD` `1100`, selecting `B`. In almost all cases `C:1` which means the output of the `LUT4` is controlled by `AAAA0000AAAA0000` avoiding the lower bits and explaining the zeros in most modes. In the case of `ADD` the `LUT4` function is therefore `00111100`, which is `B XOR D`. In the case of `MUL` `C:0` and `D:1` so indeed only `0000AAAA00000000` is used for the `LUT4`, having the function of `AND`, like the lower `LUT2`. I have confirmed the functionality is identical with the other clusters set to `0000`. The full list of implemented logic functions:
+These values should be understood as follows: The lowest 4 bits are shared between the `LUT4` and the carry "selector" `LUT2`, so in the case of `ADD` `1010`, selecting `A`. In almost all cases `C:1` which means the output of the `LUT4` is controlled by `AAAA0000AAAA0000` avoiding the lower bits and explaining the zeros in most modes. In the case of `ADD` the `LUT4` function is therefore `01100110`, which is `A XOR B`. In the case of `MUL` `C:0` and `D:1` so indeed only `0000AAAA00000000` is used for the `LUT4`, having the function of `AND`, like the lower `LUT2`. I have confirmed the functionality is identical with the other clusters set to `0000`. The full list of implemented logic functions:
 
 ```
 FN          LUT4             LUT2
-ADD(0)      B XOR D          B
-SUB(1)      !A XOR D         A
+ADD(0)      A XOR B          A
+SUB(1)      !A XOR B         A
 ADDSUB(2)   A XOR B XOR !D   A
 NE(3)       A XOR B          1
 GE/LE(4-5)  A XOR B          A
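
The reading of the 16-bit patterns given in doc/alu.md can be checked mechanically. The sketch below is only an illustration under the same bit-address assumption (`A + 2*B + 4*C + 8*D`); `split_alu_init` and `is_a_xor_b` are hypothetical helper names, not part of apycula.

```python
def split_alu_init(init: str):
    """Split a 16-bit ALU init (MSB first) into the parts discussed in the doc.

    Returns (lut2, lut4_c1, lut4_c0_d1):
      lut2       - bits 3..0, shared with the carry "selector" LUT2
      lut4_c1    - bits 15..12 and 7..4, the entries reachable when C is 1
      lut4_c0_d1 - bits 11..8, the entries reachable when C is 0 and D is 1
    """
    if len(init) != 16:
        raise ValueError("init must contain exactly 16 bits")
    lut2 = init[12:16]
    lut4_c1 = init[0:4] + init[8:12]
    lut4_c0_d1 = init[4:8]
    return lut2, lut4_c1, lut4_c0_d1


def is_a_xor_b(table8: str) -> bool:
    """Check whether an 8-entry table (MSB first) computes A XOR B for any D."""
    by_index = table8[::-1]  # index = A + 2*B + 4*D
    return all(
        int(by_index[a + 2 * b + 4 * d]) == a ^ b
        for a in (0, 1) for b in (0, 1) for d in (0, 1)
    )


add_parts = split_alu_init("0110000001101010")  # new add(0) pattern
mul_parts = split_alu_init("0111100010001000")  # mul(9) pattern
print(add_parts, is_a_xor_b(add_parts[1]))
print(mul_parts)
```

For `add(0)` this yields the `1010` selector and the `01100110` function quoted in the text; for `mul(9)` both the `C:0`/`D:1` part and the selector come out as `1000`, which is `AND`.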

examples/gw5a/alu-simple.v

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 `default_nettype none
 
 module top(input wire [7:0]a, input wire [7:0]b, output wire [7:0]led);
-    assign led[3:0] = a[3:0] + a[7:4];
+    assign led[3:0] = a[3:0] + {a[7:5], 1'b1};
 endmodule
 
