Chapters are checked off as they are (a) implemented and passes the supplied tests.
âś… Chapter 1 âś… Chapter 2 âś… Chapter 3 âś… Chapter 4 âś… Chapter 5 âś… Chapter 6 âś… Chapter 7 âś… Chapter 8
We do not use the RAM modules built in the project, opting to use single port RAM built into the iCE40 FPGA. This is far more efficient use of resources.
Due to the characteristics of the CPU a value loaded using an A-instruction
doesn't appear in register A (aka addressM
) until the next clock edge. For
arithematics this is fine - the ALU will see the value previous loaded and
compute the right values. For memory access however the single port RAM IP core
being used will not update its on the same clock edge in which addressM
is
updated. It will only do so on the next edge. In other words:
- edge-1:
addressM
takes on new value, RAM output is undefined because on edge transition of the value at the address present immediately prior is clocked into the RAM. - edge-2: the RAM output is now valid and reflects the value at
addressM
As such every memory read needs to be preceeded by a nop (0). This is the pipeline equivalent of inserting a bubble into a pipelined CPU but we are doing it in software.
Writes have a similar issue. When outM
, addressM
and writeM
are
asserted on the same clock edge no write happens b/c their value immediately
prior to transition is what matters. So it takes another nop for the write to
commit.
I "optimised" memory access by shifting the RAM clock 1/2 period
(i.e. ram_clk = ~clk
). This allows addressM
to be present at the rise edge
of the RAM clock. This means a sequence like
@R0
D=M
No longer needs a nop. However there is no getting around the fact that each
write takes an additional cycle. Additionally when M is on the RHS and the LHS,
e.g. M=M+1
a nop before and after is still required.
Inserting these nops is time consuming and prone to error. tools/assembler.py
will, by default, insert nops for you so programs supplied by the course can be
used as-is without modification provided they are assembled using our assembler.
Note that I wrote the assembler before I realised it was project 6.
Project 05x extends the HACK platform to implement an additional W
register.
This extended platform is called HACKx and it extends the C-instruction from:
1 1 1 a c1 c2 c3 c4 c5 c6 d1 d2 d3 j1 j2 j3
⬇️
1 w d4 a c1 c2 c3 c4 c5 c6 d1 d2 d3 j1 j2 j3
Where
/w a
forms a 2-bit vector that selects thex
input of the ALU fromA
,inM
orW
corresponding to00
,01
,10
.11
is undefined.d4
works liked1..d3
but inverted. When it is UNSET theW
register takes on the ALU output on the next clock.
The interpretation of w
and d4
has been chosen so that programs intended for
the HACK platform can run on the HACKx platform without modification. Yay
backward compatibility!
I named the register W as a homage to the W register found in PIC microcontrollers.
Macros are implemented in two places:
asm.py
used byvm2asm.py
assembler.py
These macros are intended to help with implementing the stack virtual machine.
$inc_sp
,$dec_sp
: increments and decrements stack pointer (SP)$load_sp
,$save_sp
: loads SP into A and saves A into SP$_
: replaced with a unique identifier, e.g.$_LABEL
can be used multiple times and each time will result in unique label, e.g.001_LABEL
,002_LABEL
.
These macros are intended to help with direct assembly programming
$const <name> <value>
: inserts the symbol<name>
into the symbole table with the specified integer<value>
$call <label>
: pushes the return address at the top of the stack then performs a jump to the specified label$return
: pop the return address at the top of the stack and jumps to it$this
: replaced with the most recentfunc_
label, e.g.func_FOO
$copy_mm <dst> <src>
: copies content of memory atsrc
intodest
$copy_mv <dst> <value>
: copiesvalue
into memory addressdest
$if_var_goto <var> <dest>
: jumps todest
if*var
contains non-zero value$if_M_goto <dest>
: jumps todest
if M contains non-zero value$if_A_goto <dest>
: jumps todest
if A contains non-zero value$if_D_goto <dest>
: jumps todest
if D contains non-zero value
The $call/$return
macros allows for nested function calls, e.g.
(SUB_INC_D)
D=D+1
$return
(SUB_INC_M_AND_D)
M=M+1
$call SUB_INC_D
(SUB_MAIN)
@SCREEN
$call SUB_INC_M_AND_D
The course calls for $ to be a valid character, however I accidentally used it for the macro system so it cannot be a valid identifier any more.
Our implementation of the assembler (tools/assembler.py
) accepts hex and binary constants in the
form of:
- 0xNNNN for hexadecimal constants
- 0bNNNN for binary constants
Take a page from verilog's book hexadecimal and binary constants can be spaced out using underscore(_) to improve readability.
The assembler fully supports the HACKx platform. To ensure HACK compatible
machine code is emitted specify -C
on the commandline. This will cause
the assembler to error when it encounters use of the W
register.
The assembler can optionally annotate the machine code output with
the corresponding source block and the PC
value if the -A
option is given. This is useful
during debugging.
Some of the R0-R15 registers serve dual purpose and only R13-15 is actually available for general use. To avoid having to remember which registers can be used freely R13-15 can also be addressed as T0-T3.
When -O<opt>
is specified the assembler will perform some simple optimisations.
<opt>
can be one of:
all
: perform all optimisationsloads
: remove redundant loads where two (or more) A-instructions loading the same value will be reduced to one iff the A register is not modified in betweenconsec_nops
: consecutive NOP (0) instructions will be collapsed into oneunneeded_nops
: unneeded NOP (0) instructions will be removed. A NOP is unneeded if the next instructions following memory write doesn't access memory.
Our implementation of the vm-to-asm translator (tools/vm2asm.py
) is capable of
generating assembly for the HACKx and HACK platform with the HACKx platform
being the default target. To run tests VM programs from the course specify -C
when invoking vm2asm.py
. It is not necessary to also specify -C
to the
assembler b/c vm2asm.py
will not use the W
register.
Like the assembler vm2asm.py
will produce annotated assembly if -A
is given.
Following the stack model religiously means incrementing a value looks like this:
push local 0
push constant 1
add
pop local 0
This sequence ultimately results in no change in the stack pointer value (2 push
and 2 pops) and since incrementing by 1 doesn't require another operand we could
have manipulated the value using M=M+1
. This is true for other 1-operand
operations our ALU is capable of, e.g. !
, M-1
etc.
The VM translator implemented here supports the following direct segment manipulation commands:
s_inc <segment> <index>
: increments the segment value directlys_dec <segment> <index>
: decrements the segment value directlys_neg <segment> <index>
: negates (-x) the segment value directlys_not <segment> <index>
: binary not's (~x) the segment value directlys_set <segment> <index>
: sets all bits of the segment value directlys_clear <segment> <inex>
: clears all bits of the segment value directly
When targetting the HACKx platform the VM translator will use the W register as a dedicated stack pointer register. This significantly reduces the number of memory access commands.
Chapters 1-5 are implemented in their respective projects/
folder. Chapters 6
onwards live in tools/
b/c that makes the mose sense to me.
I have issue using the supplied software as-is, often getting opaque errors about "Expression
expected on line 0". So I would often I edit the .tst
files to remove the load commands and only
preserve the RAM setup commands. I would then mostly manually verify the content of RAM is as
expected against the .cmp
files.
This project uses apio: https://github.com/FPGAwars/apio . I have a custom fork which enables nicer testbenches: https://github.com/freespace/apio
apio clean
: removes build artefactsapio sim -t Top_tb.v
: run simulation usingTop_tb.v
apio build
: builds and generates the necessary binary files for uploading
Generally I also use apio clean
with sim
or build
, e.g.
apio clean; apio sim -t Top_tb.v
This ensures a clean build which is helpful in avoiding staleness or strange behaviours when switching between git branches etc.
To generate firmware.hack
for use with Top_tb.v
from one of the test .asm
programs run
something like the following in the same directory as Top_tb.v
:
python ../../../tools/assembler.py -A -O all < memory_access_test.asm > firmware.hack
- Install using brew cask install gtkwave
- Install Switch perl module into system dir:
sudo cpan install Switch
- If required fix up permissions in
/Library/Perl
, e.g.
sudo find /Library/Perl -type d -exec chmod a+rx {} \;
sudo find /Library/Perl -type f -exec chmod a+r {} \;
- Replace
/usr/local/bin/gtkwave
with symlink to/Applications/gtkwave.app/Contents/Resources/bin/gtkwave
When you type gtkwave
in the commandline it should launch the app.
- Define the DUMPSTR macro:
`define DUMPSTR(x) `"x.vcd`"
- Define the simulation output files:
$dumpfile(`DUMPSTR(`VCD_OUTPUT));
$dumpvars(0, <testbench_name>);
Where <testbench_name>
is something like Nand_tb
.
apio sim
will run the first test bench it finds (alphabetic sort, ends in
_tb.v).
In my branch of apio (https://github.com/freespace/apio) you can use
apio sim -t <testbench.v>
which will run the specified testbench file. This allows us to have more than 1 testbench per module per project.
-
When using if-statements with wires a delay is required for the wire value to update otherwise nothing seems to happen
a = 1; b = 1; #10; if (y != 1) begin $display("FAILED for input 11"); end
-
Assigning constants to a wire will mean it can only take on the assigned value or x, e.g.
wire a = 0; a = 1; // value of a is now 'x' a = 0; // value of a is now 0
On *nix use the following script to pull in all verilog files created in previous projects
find .. -type f -name '*.v' ! -name '*_tb.v' -maxdepth 2 -exec ln -s {} . \;
This confuses git b/c git doesn't know about symlinks and thinks there are new files for it to track. Fix it with:
find . -type l | sed -e s'/^\.\///g' >> .gitignore
The iCE40 UltraPlus comes primitives, like the block RAM, which can be instantiated directly. They are documented in:
Title of the document is "LATTICE ICE Technology Library" should the URL become invalid in the future.
Yosys implements some of the primitives, e.g. SB_RAM40_4K
, which are defined
in
https://github.com/YosysHQ/yosys/blob/master/techlibs/ice40/cells_sim.v
Not all primitives are implemented b/c @cliffordwolf believes that it is better to let the tooling figure things out. (See YosysHQ/yosys#423).
N.B. I am using the word primitive in the same way lattice is using it.
@cliffordwolf would call them macros in so far as all RAM primitives other than
SB_RAM40_4K is constructed using SB_RAM40_4K
.
- Memory Usage Guide for iCE40 Devices (TN1250): https://www.latticesemi.com/-/media/LatticeSemi/Documents/ApplicationNotes/MO/MemoryUsageGuideforiCE40Devices.ashx?document_id=47775
- SPRAM Usage Guide (TN1314): https://www.latticesemi.com/-/media/LatticeSemi/Documents/ApplicationNotes/IK/iCE40-SPRAM-Usage-Guide.ashx?document_id=51966
- You have an unused module which defines input/outputs. Remove the module should remove the error
- The ROM size is too large. The icebreaker has 30 EBR units of 256x16 size.
This allows a maximum of 256*30 = 7680 instructions in the ROM. Use
apio build --verbose-pnr
to get usage statistics.