Teensy Z80 – Part 4 – VRAM explained, display modes, simple shell.

This is the fourth part of a series of posts detailing steps required to get a simple Z80 based computer running, facilitated by a Teensy microcontroller. It’s a bit of fun, fusing old and new hobbyist technologies. See Part 1, Part 2 and Part 3, if you’ve missed them.

I mentioned ‘VRAM’ in the last post, which really was just an area of RAM which I specified to the Teensy through a port. I’ve now got something a bit more serious set up, which is completely separate from main RAM. It’s accessed via the I/O ports, after a flag has been set.

At the moment, I have all but one of the address bus pins connected on the Z80. This means I can address 32KB of RAM. The screen which is connected to the Teensy via SPI is a 320×240, 16-bit colour unit. Sadly, this means a full size framebuffer for this screen would be an eye-watering 150KB! Even at half the resolution in each dimension, 160×120 in full colour is 37.5KB. I cannot add the additional address bus pin for a 16-bit address space due to running out of I/Os on the Teensy. I have a single one left, and it’s needed for something I hope to explain in the next few posts. I can use a 256 colour palette, which brings the memory requirements for 160×120 down to 18.75KB, but it’s still a large chunk of memory which can no longer be used for programs.
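Those sizes are just width × height × bytes per pixel. As a quick sanity check, here is a minimal sketch of the arithmetic in C; the function name is illustrative and not part of the Teensy sketch.

```c
#include <stdint.h>

/* Bytes needed for a linear framebuffer.
   bits_per_pixel is assumed to be a multiple of 8 (8 or 16 in this post). */
uint32_t framebuffer_bytes(uint32_t width, uint32_t height, uint32_t bits_per_pixel)
{
    return width * height * (bits_per_pixel / 8u);
}
```

framebuffer_bytes(320, 240, 16) gives 153,600 bytes, 160×120 at 16 bits gives 38,400 bytes, and 160×120 at 8 bits gives 19,200 bytes.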

VRAM as a second address space

So I decided to use 16-bit (15 in my case) I/O addressing to enable a secondary 32KB address space – to use for VRAM. The Z80 in/out instructions which take the port from the C register actually place register B onto the top half of the address bus, allowing access to the full address space. We have a specific entry in the standard 256 port I/O space which is used to set a flag which the Teensy interprets as an instruction to treat all further I/O writes as writes into a special VRAM memory. I then have the highest port possible (0x7FFF) as the disable VRAM port. Reading from this port clears the flag on the Teensy and I/O operations return to their standard state. This allows a completely separate memory space for VRAM, which leaves all of main RAM free for programs and data.

There is a downside to this – I/O writes have an additional wait cycle automatically inserted, so they are slower than normal RAM writes. Additionally, things such as loading images from SD cards into VRAM would need to go via RAM, unless additional flags are inserted into the file system requests to specify what memory spaces the buffers refer to. However, I do think that those downsides will be insignificant when I try to make the Z80 clock asynchronously with the Teensy operations, as there are likely to be many wait states for RAM as well as I/O operations forced by the use of the WAIT input to the Z80.

On the Teensy, the code for this is very simple. We have a second global array to use as the VRAM storage, and then have an ioVramBankSet flag which we check on I/O operations.

#define PORT_VRAM_BANK_SET       0xC8
#define PORT_VRAM_BANK_RESET     0x7FFF 

byte Z80_VRAM[Z80_VRAM_LENGTH] = {0};

void loop() {
...
  unsigned short portAddress = addressBus & 0x00FF;
  if (RD_val) {
      if (ioVramBankSet && ( addressBus == PORT_VRAM_BANK_RESET))
      {
        // PORT_VRAM_BANK_RESET is a special case 16-bit port
        ioVramBankSet = 0;
      }
...
  } else if (WR_val) {
      readDataBus();

      if (ioVramBankSet)
      {
        Z80_VRAM[addressBus] = dataBus;
      }
      else if (portAddress == PORT_VRAM_BANK_SET)
      {
        ioVramBankSet = 1;
      }
...

The above code is really all we need for this. The upside of using this I/O style system instead of say, RAM banking, is that the instruction stream and source data can remain in standard RAM and we do not need to do any mapping of the address space which would restrict us significantly with only 15 address bits. Now we need to make use of the data which is stored in that space for graphics!
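To make the flag behaviour easier to follow away from the hardware, here is a small host-side model of the same logic in C. It is a sketch rather than the real loop(): the io_write and io_read function names, the 20KB array length and the bounds check are assumptions added for illustration.

```c
#include <stdint.h>

#define PORT_VRAM_BANK_SET   0xC8
#define PORT_VRAM_BANK_RESET 0x7FFF
#define Z80_VRAM_LENGTH      (20 * 1024)   /* assumed size for this sketch */

uint8_t Z80_VRAM[Z80_VRAM_LENGTH];
int ioVramBankSet = 0;

/* Model of one I/O write cycle. Returns 1 if the byte landed in VRAM. */
int io_write(uint16_t addressBus, uint8_t dataBus)
{
    if (ioVramBankSet) {
        /* Guard added for this sketch: ignore writes beyond the array. */
        if (addressBus < Z80_VRAM_LENGTH) {
            Z80_VRAM[addressBus] = dataBus;
            return 1;
        }
        return 0;
    }
    /* Normal I/O space: only the low 8 bits select the port. */
    if ((addressBus & 0x00FF) == PORT_VRAM_BANK_SET) {
        ioVramBankSet = 1;
    }
    return 0;
}

/* Model of one I/O read cycle. Only the bank-reset port is handled here. */
void io_read(uint16_t addressBus)
{
    if (ioVramBankSet && addressBus == PORT_VRAM_BANK_RESET) {
        ioVramBankSet = 0;
    }
}
```

While the flag is set, every write goes to VRAM using the full address bus value, so the only way back to normal I/O is the read from 0x7FFF.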

Display Modes

I mentioned earlier the amount of memory needed for various resolutions and colour depths. The simple fact is that the Teensy 3.1 microcontroller I’m using only has 64KB ram. Within that, we need the Z80 RAM, VRAM, and then working memory for the teensy itself – for driving the display, and working with the SD card and handling the FAT filesystem. This pretty much means 160×120 8bpp is really the maximum we can achieve. When combined with a 256 entry palette, we can get a very generous range of colours, and come in at less than 20KB. So we’ll have the VRAM set to 20KB.

The first and most generous mode is as above, 160×120, with a 256-entry 16-bit colour palette. This is laid out in VRAM with the first 512 bytes as the palette, and after that the pixel data. This remains true for all display modes to simplify implementation. There are modes additionally for whether the display is stretched or not. If it is not stretched, the offset in the TFT will be configurable so you can move it around the screen and combine it with console text. As I write this, the following modes are supported:

  • 40×30, 16bpp
  • 48×48, 16bpp
  • 48×48, 8bpp, 16-bit palette
  • 80×60, 8bpp, 16-bit palette
  • 160×120, 8bpp, 16-bit palette

The mode is set by an index value which is written to an IO port. A draw port exists, and a write to it initiates a full screen redraw. The data bus value is ignored. I may implement a sub-screen redraw which acts on a set area of the screen later as an optimization.

An additional feature is that the palette has an offset associated with it, which wraps around the 256 entries. This makes palette-shifting effects such as plasma an incredibly easy hack. It also means that when I implement modes with smaller bits per pixel indices, there can be multiple palettes stored that can be switched with a single I/O write.
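A rough sketch of how a pixel could be resolved to a colour with such an offset is below. It assumes the offset is simply added to the pixel's palette index with wrap-around at 256, and that each 16-bit palette entry is stored low byte first; neither detail is confirmed by the code shown in this post, and the function name is illustrative.

```c
#include <stdint.h>

#define VRAM_PALETTE_BYTES 512u   /* 256 entries x 2 bytes at the start of VRAM */

/* Resolve an 8bpp pixel value to a 16-bit colour.
   Assumptions: offset is added to the index and wraps at 256;
   palette entries are little-endian. */
uint16_t palette_lookup(const uint8_t *vram, uint8_t pixel, uint8_t palette_offset)
{
    uint8_t entry = (uint8_t)(pixel + palette_offset);   /* wraps modulo 256 */
    uint16_t lo = vram[entry * 2u];
    uint16_t hi = vram[entry * 2u + 1u];
    return (uint16_t)(lo | (hi << 8));
}
```

Incrementing palette_offset once per frame then shifts every pixel to the next palette entry without touching any pixel data.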

That is the trick used in the plasma example shown in the middle of this video. The Z80 is running slow at around 2kHz (remember, everything is still synchronous).

The Z80 code

The code is very simple to load pixel data into the VRAM space from RAM, and to do plasma palette cycling.

  ld a, 6
  out (PORT_VRAM_SETMODE), a    ; 'vram' displaymode 6: 80x60

  ; we can put out BC now to write VRAM
  ld bc, 0200h                  ; pixel mem offset, after palette
  ld hl, 012c0h                 ; size  of pixel data (80x60)
  ld de, image_pixels_80x60     ; pixel data in binary section
  call ram_2_vram

  ld bc, 0000h                  ; vram offset of the palette
  ld hl, 0200h                  ; size  of palette data (256 2-byte entries)
  ld de, palette_defn           ; palette data in binary section
  call ram_2_vram

  ld hl, 0
cycle_palette_idx:
  inc hl
  ld a, l
  out (PORT_VRAM_PALETTE_IDX), a     ; inc palette idx
  ld a, 0
  out (PORT_VRAM_DRAW), a            ; draw vram
  jr cycle_palette_idx

This will load the pixel and palette data which are stored in the binary already, into VRAM. The PORT_VRAM_PALETTE_IDX I/O port sets the ‘palette offset’ so it can be rotated incredibly easily, and PORT_VRAM_DRAW draws the contents of VRAM to the display given the current display mode set via the PORT_VRAM_SETMODE port.

  ; de = src in ram, bc = vram offset, hl = size
ram_2_vram:
  di
  push de
  push bc
  push hl
  push af
  ld a, 1
  out (PORT_VRAM_BANK_SET), a

ram_2_vram_loop:
  ld a, (de)
  out (c), a
  dec hl
  inc de
  inc bc

  ld a, h
  or l
  jr nz,ram_2_vram_loop  

  ; return to non-vram
  ld bc, PORT_VRAM_BANK_RESET
  in a, (c)
  pop af
  pop hl
  pop bc
  pop de
  ei
  ret

The ram_2_vram function shows how the VRAM memory space is enabled with the PORT_VRAM_BANK_SET port, and disabled with a PORT_VRAM_BANK_RESET read. Also note how I disable interrupts within this function – as interrupts may use i/o ports themselves, for instance the serial receive ‘data available’ interrupt, it’s important to disable interrupts whenever the VRAM is enabled for writing. Other than that, it works a treat!

These video modes also allow for some fun error screens. For example, if the SD card is not mounted correctly (and, on boot the teensy tries to locate a kernel.bin on the SD, so it needs to be there) we get a nice error graphic. This one uses the 40x30x16bpp mode, but I’ll make it a lot smaller soon with a 16-colour palette mode.

A simple shell

You can also see from the video above that I have a very basic shell working. It simply takes input characters from serial and runs programs matching those names from SD. It does no argument passing; all it checks is if a file exists with that name, and if it does, loads it into offset 0x1000 of memory, before calling into it. The ls, cls and plasma binaries are all simply made in Z80 assembly and have no dependencies on any features that a kernel may need to provide.

At the moment, I do have the sdcc C compiler running with a compiled ‘kernel’ which allows for some real operating system services and true program loading with arguments to main, system calls, etc. Watch this space! I’ll talk more about that soon :) Code as always is on my github. A full schematic is incoming, but it’s not difficult to decipher if you want to make your own.

I hope you’ve been enjoying this Teensy Z80 project. If you have, let me know on twitter @domipheus!

 

Teensy Z80 – Part 3 – File System, SD Card, VRAM?

This is the third part of a series of posts detailing steps required to get a simple Z80 based computer running, facilitated by a Teensy microcontroller. It’s a bit of fun, fusing old and new hobbyist technologies. See Part 1 and Part 2, if you’ve missed them.

Now we have the base Z80 working, interrupts and a display connected which can be manipulated in a console/terminal fashion using the Z80 I/O ports. The next step? File storage!

The obvious choice for file storage here is an SD card. It uses ~3v3 logic, which is what we are running everything with, and also uses the SPI bus, which we already have set up for our LCD screen. We’d need another pin on the Teensy for the SD chip select, and also another for the MISO line reading data from the SD – the LCD only ever used MOSI for input.

A peek of what is covered in this post:

Interfacing for SD cards

Now, as I keep stressing, this is just a little fun exercise. So to make things (a lot) easier, the Teensy will actually handle all of the FAT file system behind the scenes work. I’m sure I could get it all ported, or find a Z80 FAT16 implementation already, but I like the pace this project is moving at – so we will cheat more!

I exposed 3 I/O ports to the Z80. I could probably combine them, but for now, I will stick with three:

  1. Opening/closing files
  2. Read/writing files
  3. Performing ‘nextfile’ operations on directories

I won’t go into too much detail on the Teensy side of things. The code is all available on my github project page if you want to look. The Teensy code uses SdFat, which means I need to license the Teensy Z80 code as GPL (for those who don’t know, my stance on GPL is “ugh“, but I will abide by its demands).

I will, however, detail the I/O ports, commands and the data structures used to communicate the operations. The major file system functions – open, read/write, next – are implemented in such a way that you place required information in a section of memory, give that memory address to a port, and then tell it to execute the operation. So, for the ‘Open’ command, we set aside an area of memory with the following structure:

  openfile_cmd_data {
 0:    uint8_t error;    // operation writes
 1:    uint32_t size;    // operation writes
 5:    uint8_t type;     // operation writes
 6:    uint8_t flags;    // operation reads
 7:    char    name[13]; // operation reads, 8.3, null-terminated
  }

For open, we provide the name and flags in the structure before initiating the open command. Flags are whether we are opening for reading, writing, appending, etc. The open command itself will write the error, size and type fields. Error and size are self explanatory, the type field is for extra information, such as if this file is actually a directory.
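For anyone who prefers to see the layout as code, here is a minimal C sketch that works on the structure as a raw 20-byte buffer using the offsets listed above. The helper names are made up for illustration, and the 0xFF initial error value simply mirrors the Z80 data definition in the next listing.

```c
#include <stdint.h>
#include <string.h>

/* Byte offsets taken from the openfile_cmd_data layout. */
enum {
    OPEN_OFF_ERROR = 0,
    OPEN_OFF_SIZE  = 1,
    OPEN_OFF_TYPE  = 5,
    OPEN_OFF_FLAGS = 6,
    OPEN_OFF_NAME  = 7,
    OPEN_NAME_LEN  = 13,
    OPEN_CMD_BYTES = 20
};

/* Fill in the fields the open operation reads (flags and name). */
void open_cmd_prepare(uint8_t *cmd, const char *name, uint8_t flags)
{
    int i;
    memset(cmd, 0, OPEN_CMD_BYTES);
    cmd[OPEN_OFF_ERROR] = 0xFF;     /* overwritten by the operation */
    cmd[OPEN_OFF_FLAGS] = flags;
    /* Copy at most 12 characters so the name stays null-terminated. */
    for (i = 0; i < OPEN_NAME_LEN - 1 && name[i] != '\0'; i++) {
        cmd[OPEN_OFF_NAME + i] = (uint8_t)name[i];
    }
}

/* Read back the 32-bit size the operation wrote, least significant byte first. */
uint32_t open_cmd_size(const uint8_t *cmd)
{
    return (uint32_t)cmd[OPEN_OFF_SIZE]
         | ((uint32_t)cmd[OPEN_OFF_SIZE + 1] << 8)
         | ((uint32_t)cmd[OPEN_OFF_SIZE + 2] << 16)
         | ((uint32_t)cmd[OPEN_OFF_SIZE + 3] << 24);
}
```

Working with explicit offsets avoids relying on compiler-specific struct packing, since the uint32_t at offset 1 is not naturally aligned.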

Performing this in the Z80 assembly looks like the following:

      ; definitions of ports, commands, flags required
PORT_FILESYS_OPEN_CLOSE equ 11
FILESYS_OPEN_OPENFILE   equ 5
FILESYS_OPEN_SETMEMPTR  equ 6
OPEN_READ               equ 0

...

      ; area of memory for openfile_cmd_data
filesys_readme_open_read:
  defb 0ffh, 0,0,0,0,  0,  OPEN_READ,  'README.TXT',0,0,0

...
loadreadme:
  ld a, FILESYS_OPEN_SETMEMPTR       ; the 'Set Memory Pointer' command
  out (PORT_FILESYS_OPEN_CLOSE), a   ; tell the port we are giving it memory
  ld de, filesys_readme_open_read
  ld a, e
  out (PORT_FILESYS_OPEN_CLOSE), a   ; give the port 8 bits of address
  ld a, d
  out (PORT_FILESYS_OPEN_CLOSE), a   ; give the port the other 8 bits
  ld a, FILESYS_OPEN_OPENFILE
  out (PORT_FILESYS_OPEN_CLOSE), a   ; initiate the OPENFILE command
                                     ; - operation is immediate.
  ld a, (de)                         ; load the error byte
  or a
  jr nz, Lfile_fail                  ; if non-zero jump to fail handler

...

I’ve left out the previous step to this, which is actually opening (and closing) the root directory. It’s implemented as a special case of open where “/” is the filename. But you can see the code is very straightforward.
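The Teensy side of this port isn't listed here, but the command, low byte, high byte sequence suggests a small state machine. The following C sketch is one way it could be decoded; the state names, variables and function are hypothetical and are not taken from the actual sketch.

```c
#include <stdint.h>

#define FILESYS_OPEN_OPENFILE  5
#define FILESYS_OPEN_SETMEMPTR 6

enum ptr_state { PTR_IDLE, PTR_WANT_LO, PTR_WANT_HI };

enum ptr_state open_state = PTR_IDLE;
uint16_t open_mem_ptr = 0;   /* Z80 address of openfile_cmd_data */
int open_requests = 0;       /* counts OPENFILE commands seen */

/* Called for every byte the Z80 writes to PORT_FILESYS_OPEN_CLOSE. */
void open_port_write(uint8_t value)
{
    switch (open_state) {
    case PTR_WANT_LO:
        open_mem_ptr = value;                      /* low 8 bits of the address */
        open_state = PTR_WANT_HI;
        break;
    case PTR_WANT_HI:
        open_mem_ptr |= (uint16_t)(value << 8);    /* high 8 bits */
        open_state = PTR_IDLE;
        break;
    default:
        if (value == FILESYS_OPEN_SETMEMPTR) {
            open_state = PTR_WANT_LO;
        } else if (value == FILESYS_OPEN_OPENFILE) {
            open_requests++;   /* a real implementation would open the file here */
        }
        break;
    }
}
```

Tracking the state matters because an address byte can have the same value as a command (5 or 6), and must not be mistaken for one.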

We can now use the information written at filesys_readme_open_read to help read the file contents. The file size is written to the 4 bytes after the error byte, so we can read that in to see how much memory we need to hold the whole file.

...
  ; assume < 256b file for now
  ld de, filesys_readme_open_read + 1
  ld a, (de)
...

If we read just the first byte then, as the system is little-endian, we will get the whole file size provided the file is less than 256 bytes. We’ll assume README.TXT is (well, we know it is) and so for simplicity will just read this once.

With this information, we can set some memory aside to store the file contents. We could use the stack by removing the bytes from there, or in my case, just have a fixed address in memory where program RAM can begin. We now need to fill out a memory read/write request structure:

   read_write_command {
 0:    uint8_t error_code;      // writes
 1:    uint8_t op_type;         // reads, CMD_READ(0) or CMD_WRITE(1)
 2:    uint16_t file_offset;    // reads, ignored if OPEN_APPEND
 4:    uint16_t block_size;     // reads,
 6:    uint16_t mem_buffer_ptr; // reads,
   }

The file_offset is where you’d typically seek() to before a read. The block size is the size of the read/write request, and the mem_buffer_ptr should point to a further area of memory that is at least of size block_size for the file system to write into (or read from given a write instruction).
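To show how the fields fit together, here is a host-side C sketch that decodes a read request from a model of Z80 RAM and copies the requested block into the buffer. The function, the in-memory stand-in for the file and the non-zero error values are all assumptions for illustration; the real Teensy code goes through SdFat and may report errors differently.

```c
#include <stdint.h>
#include <string.h>

#define CMD_READ  0
#define CMD_WRITE 1

/* Little-endian 16-bit load from Z80 memory. */
static uint16_t load16(const uint8_t *ram, uint16_t addr)
{
    return (uint16_t)(ram[addr] | (ram[(uint16_t)(addr + 1)] << 8));
}

/* Execute a CMD_READ request located at cmd_addr in Z80 RAM.
   'file' and 'file_len' stand in for the currently open file.
   Returns the error code, which is also stored in the request (0 = ok).
   The non-zero codes are placeholders for this sketch. */
uint8_t rw_exec_read(uint8_t *ram, uint32_t ram_len, uint16_t cmd_addr,
                     const uint8_t *file, uint32_t file_len)
{
    uint16_t offset = load16(ram, (uint16_t)(cmd_addr + 2));  /* file_offset */
    uint16_t size   = load16(ram, (uint16_t)(cmd_addr + 4));  /* block_size */
    uint16_t dest   = load16(ram, (uint16_t)(cmd_addr + 6));  /* mem_buffer_ptr */
    uint8_t error = 0;

    if (ram[cmd_addr + 1] != CMD_READ) {
        error = 1;                                   /* not a read request */
    } else if ((uint32_t)offset + size > file_len) {
        error = 2;                                   /* read past end of file */
    } else if ((uint32_t)dest + size > ram_len) {
        error = 3;                                   /* buffer outside RAM */
    } else {
        memcpy(&ram[dest], &file[offset], size);
    }
    ram[cmd_addr] = error;                           /* error_code field */
    return error;
}
```

Note that a 16-bit file_offset can only address the first 64KB of a file.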

filesys_read_request:
  ; error_code, op_type, file_offset_lo,file_offset_hi,block_size_lo,block_size_hi
  defb 0ffh, 0,  0,0,  0,0
  defw scratch_mem

scratch_mem:
  dc 128,0 ; 128 bytes of zero for scratch memory

After opening the file, we can insert the size of the file into the block_size_lo part of the data, to read the whole file starting from offset 0.

  ; assume < 256b file for now
  ld de, filesys_readme_open_read + 1
  ld a, (de)

  ld hl, filesys_read_request+4
  ld (hl), a

Now we use the same style of port i/o to provide the read/write port with the location of this structure in memory, and fire off the EXEC command to perform the operation.

  ld a, FILESYS_RW_SETCMDMEMPTR
  out (PORT_FILESYS_READ_WRITE), a   ; tell the i/o port a command ptr follows
  ld de, filesys_read_request
  ld a, e
  out (PORT_FILESYS_READ_WRITE), a   ; give the i/o port the low 8 bits
  ld a, d
  out (PORT_FILESYS_READ_WRITE), a   ; give the i/o port the high 8 bits
  ld a, FILESYS_RW_EXEC
  out (PORT_FILESYS_READ_WRITE), a   ; execute the command
                                     ; - operation is instant
  ld a, (de)                         ; load the error byte
  or a
  jr nz, Lfile_fail                  ; if non-zero jump to fail 

        ; scratch_mem now contains file content, ascii text
  ld de, scratch_mem
  call print_string                  ; print that content to the screen
  call newline

  ld a, FILESYS_CLOSE_FILE
  out (PORT_FILESYS_OPEN_CLOSE), a   ; close README.txt

This setup allows for quite a decent amount of functionality. I can read files much larger than the available RAM (16KB as I write this) due to providing the seek location in the file as a file_offset in the read command. Writing acts exactly the same, except that scratch_mem would contain what I wanted to write to the file, the file would be opened for writing, and the command op_type would be set to CMD_WRITE.

Directory Traversal

I skipped the fact that before opening README.TXT I had to open the root directory. In this system, directories are simply files. You need to be in the correct working directory to open a file; the directory does not form part of the open request. In this way, internally a directory tree can be kept on the Teensy. I’ve not implemented this fully yet, as for now, a flat filesystem really is enough for me.

The NEXT command allows the discovery of files in a directory, like most other file systems. Unlike the other file operations, this operation has no arguments, and simply operates on the current open file – if it’s a directory!

  getnext_output {
 0:    uint8_t  error;
 1:    uint32_t filesize;
 5:    uint8_t  flags;
 6:    char     name[13]; // null-terminated
  }

For this request, since this operation only writes memory, we can use the scratch_mem location from earlier, and initiate the GETNEXT command. We initiate GETNEXT operations until the error byte becomes non-zero.

  ld a, FILESYS_NEXT_SETMEMPTR
  out (PORT_FILESYS_NEXT), a    ; set the memory area to scratch_mem
  ld de, scratch_mem
  ld a, e
  out (PORT_FILESYS_NEXT), a
  ld a, d
  out (PORT_FILESYS_NEXT), a
  ld hl, scratch_mem
  ld de, scratch_mem + 6        ; scratch_mem+6 is char name[13]
getnextfile:
  call newline                  ; new line on the console
  ld a, FILESYS_NEXT_GETNEXT
  out (PORT_FILESYS_NEXT), a    ; initiate the GETNEXT command
  ld a, (hl)
  or a
  jr nz, nomorefiles            ; non-zero error? no more files.
  call print_string             ; prints string at de (scratch_mem+6)
  jr getnextfile
nomorefiles:
  ld a, FILESYS_CLOSE_FILE      ; close the directory file
  out (PORT_FILESYS_OPEN_CLOSE), a


That really is all there is to it! At present, only one file can be open at any one time. This is quite limiting, but for now, it will do. I can always add some sort of file descriptor system later.

Let’s display an image!

The filesystem test I wrote lists the root directory, reads README.TXT, writes ‘0123456’ to TEST.TXT, and then reads a file speccy.565. Speccy.565 is an image file!

This was done fairly quickly, for some visual eye candy. speccy.565 is a tiny 48×48, 16-bit colour image of a ZX Spectrum keyboard. It’s in the 565 format, in that there are 5 bits for red, 6 bits for green, and 5 bits for blue. The file has no header, and the Teensy is currently hard coded to have a 48x48x16bit ‘vram’. All I have working at the moment is a port for setting the start of this vram in memory, aligned to 256 bytes so we only need to set the top 8 bits of the address. I also have a port which ignores what is on the dataBus; it just initiates a draw of the vram to the LCD screen. It’s pretty primitive just now, but I hope to add more display and colour modes – as at the moment this 48×48 image in vram takes up more than 25% of the total RAM available to me!
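As an aside, packing a colour into 565 format looks like the C sketch below. It assumes the usual arrangement with red in the most significant bits and blue in the least significant; the function names are illustrative.

```c
#include <stdint.h>

/* Pack 8-bit-per-channel RGB into a 16-bit 565 pixel
   (red in the top 5 bits, green in the middle 6, blue in the low 5). */
uint16_t rgb565_pack(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

/* Extract the raw channel values (0..31, 0..63, 0..31). */
void rgb565_unpack(uint16_t pixel, uint8_t *r5, uint8_t *g6, uint8_t *b5)
{
    *r5 = (uint8_t)((pixel >> 11) & 0x1F);
    *g6 = (uint8_t)((pixel >> 5) & 0x3F);
    *b5 = (uint8_t)(pixel & 0x1F);
}
```

At two bytes per pixel, a headerless 48×48 image in this format is 4,608 bytes.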

The code looks like this:

  ld a, VRAM_HIGH_BEGIN
  out (PORT_VRAM_BUFFER_LOC), a 

  ... read speccy.565 into ram at VRAM_BEGIN ...

  ld a, 0
  out (PORT_VRAM_DRAW), a           ; Draw VRAM to screen

On the teensy, all we need is:

if (portAddress == PORT_VRAM_BUFFER_LOC)
{
  ioVramBuffer = ((unsigned short)dataBus) << 8U;
}
else if (portAddress == PORT_VRAM_DRAW)
{
  uint16_t* vram = (uint16_t*)&Z80_RAM[ioVramBuffer];

  // 48 X 48 test
  for (int y = 0; y < 48; y++)
  {
    for (int x = 0; x < 48; x++)
    {
      tft.drawPixel(VRAM_START_X+x, VRAM_START_Y+y, vram[y*48+x]);
    }
  }

}

The result:

This video is showing the real time execution of this test program on the Z80. It’s running probably around the 50kHz mark – I’m now beginning to think about an asynchronous clock – but that’s for yet another post :)

Wrapping up

That’s it for this post. We have a decent amount of filesystem functionality available, and I’ll be using that significantly in the future posts! I hope you’ve been enjoying this Teensy Z80 project. If you have, let me know on twitter @domipheus!

Teensy Z80 – Part 2 – Mode 2 Interrupts, Timer

Interrupts. Lovely interrupts. The Z80 has a maskable interrupt, and a non-maskable interrupt. The maskable one can be disabled and enabled from within code. For me, I wanted to implement maskable Mode 2 Interrupts.

Mode 2 interrupts are very powerful. They allow an external device to make the Z80 jump to one of 128 possible locations, by putting the lower half of a 16-bit address on the data bus. This is combined with the contents of the I register to form a location in memory, which should contain the location of the exception handler routine.

The steps are as follows:

  1. The Z80 is put into interrupt mode 2.
  2. Set the I register to the hi-byte start of the Interrupt Vector Table
  3. Interrupts are enabled
  4. The z80 runs program code
  5. A device puts the INT pin low (it’s active low, so this is the active state)
  6. The Z80 will acknowledge the interrupt request at a time in the future, making pins M1 and IOREQ active.
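  7. The device should place its interrupt vector on the data bus.
  8. The CPU will take the data bus and combine it with the I register to form the address of an entry in the Interrupt Vector Table, vector_entry = ((I<<8U)|DATA_BUS). The 16-bit address stored at that entry is interrupt_handler_start.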
  9. The current PC is placed on the stack and execution resumes at interrupt_handler_start
  10. The handler should disable interrupts, perform an action, enable interrupts and perform a reti instruction to return to user code. Registers should be manually preserved in the interrupt handler.
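The address calculation can be sketched in C as follows. In mode 2 the value formed from I and the data bus points at a table entry, and the handler address is the 16-bit little-endian word stored there. The function names are illustrative.

```c
#include <stdint.h>

/* Address of the vector table entry: I supplies the high byte,
   the byte the device put on the data bus supplies the low byte. */
uint16_t im2_entry_address(uint8_t i_reg, uint8_t vector)
{
    return (uint16_t)((i_reg << 8) | vector);
}

/* The handler address is the 16-bit little-endian word stored at that entry. */
uint16_t im2_handler_address(const uint8_t *ram, uint8_t i_reg, uint8_t vector)
{
    uint16_t entry = im2_entry_address(i_reg, vector);
    return (uint16_t)(ram[entry] | (ram[(uint16_t)(entry + 1)] << 8));
}
```

With I set to 01h and a vector of 2, the CPU reads the handler address from 0102h and 0103h. Because each entry is two bytes, vectors are normally even numbers, which is where the figure of 128 possible handlers comes from.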

I want TeensyZ80 to have some timer interrupts so in the first instance I have an interrupt which fires every second.

Looking at the timing diagram for an interrupt, we see INT goes active and is only sampled at specific times. It is safe to then serve the interrupt vector when IOREQ is active in an M1 cycle. We then place the interrupt vector on the data bus, and continue. It’s important to note that we must add a check to the existing I/O servicing code on the Teensy to ensure we do not enter that if M1 is active.

In the Teensy sketch, we have some new variables.

elapsedMillis timer_seconds;
byte timer_seconds_intvector = 2;
byte currentInterruptVector = 0; // The vector that is to be signalled
int currentInterrupt = 0; // We are in an interrupt state

In loop() we have more code. Before we set the clock high, signal the interrupt if we need to.

if ((timer_seconds > 1000) && !currentInterrupt)
{
  timer_seconds = 0;
  digitalWrite(INT, LOW);
  currentInterruptVector = timer_seconds_intvector;
  currentInterrupt = 1;
}

After the clock goes high, we should check for the Interrupt Acknowledge. You can see here that I keep INT active until I see an ACK, which doesn’t seem to cause issues at present.

if (currentInterrupt>0) {
  if (IOREQ_val && M1_val) // interrupt ack
  {
    digitalWrite(INT, HIGH);
    currentInterrupt = 0;
    dataBus = currentInterruptVector;
    writeDataBus();
  }
}

...snip... 

// the IO code shouldn't be executed in an M1 state
if (IOREQ_val && (ioDebounce == 0) && !M1_val)
{
...snip...

And then on the Z80, we need to write several bits – the init for mode 2 interrupts and the I register, the interrupt handler, and the pointer to that handler in a specified table location, aligned to 256 bytes.

The Z80 boot code

  .org 0000h
start:
  di                        ;disable interrupts
  ld sp, 3fffh
  im 2                      ;enable mode 2 interrupts
  ld a, 01h
  ld I, a                   ;set the high byte of the interrupt table
  ld de, str_console_prompt
  call print_string
  ei                        ;enable interrupts
iloop:
  halt
  jr iloop

We then have the interrupt handler:

ihdlr_second_timer:
  di                ; disable interrupts
  push af           ; preserve registers
  ld a, 02eh        ; print period to console
  out ($03), a
  pop af
  ei                ; enable interrupts
  reti              ; return from interrupt

Port 3 is our console putChar port. So this simply prints a ‘.’ to the console (ASCII 0x2E)

The interrupt vector table:

; interrupt vector table
  .ORG 0100h
int_vector_table:
  dw ihdlr_unknown       ; vector 0 - will print 'unknown vector' to console if fired
  dw ihdlr_second_timer  ; vector 2 - fired every second
  dw ihdlr_unknown
  dw ihdlr_unknown
  dw ihdlr_unknown
...snip...

And this works well. Here is the Z80 running the test.

In addition to this, I also ran the TeensyZ80 in a debug mode with a slow clock so you could see the process of running the z80 code. This prints a log of each cycle, with a descriptive name and where appropriate some further information. A log line of ? prints the signal lines within square brackets with M1_val = F, RFSH_val = H, MEMREQ_val=M, and IOREQ_val=I. These cycles are for when the CPU is executing long instructions, etc.

This post comes to quite an abrupt end, but really once those fundamental bits of functionality are on the Teensy sketch the z80 code just falls into place. I did have some issues when the interrupt vector table was higher up in memory, but I’ll have to look at that issue again sometime. This timer works well, and I’ve got another interrupt which fires when serial data is available for consumption by the z80. Setting that up is fairly simple; it’s a serial command where I tell the emulated serial device what interrupt vector to use.

INTVECTOR_SERIAL_DEVICE equ 8

.. skip forward to the serial device init ...
ld a, SERIAL_CMD_SET_INTVECTOR
out ($01), a                      ; put the serial device in SET_INTVECTOR mode
ld a, INTVECTOR_SERIAL_DEVICE
out ($01), a                      ; set the interrupt vector for the serial device to use

When the Teensy sees there is data available through serial, this interrupt will be fired, which is useful!

Again, I hope this was interesting! Let me know your thoughts via twitter @domipheus.

The next part in this series is available here.

Teensy Z80 – Part 1 – Intro, Memory, Serial I/O and Display

My Teensy Z80 Homebrew Computer

A few months ago, I bid on several ‘box of surplus electronic components’ listings on ebay. My lab needed some more components and I saw some of the things I needed in the listing pictures, so thought I’d go for it. I won all of them, at pretty much my lowest bid price, and when I got the boxes was really happy (I paid ~£20 for >£200 of components, most sealed new). At the bottom of one box was a Zilog Z80 CPU, in 40-pin DIP. It’s a Z84C0008PEC, designed to run at 8MHz. It looked pristine, but was not sealed, and it sat in my junk box for quite some time.

Last month I won a contest at Pimoroni (check them out, they are awesome), where I received £100 in gift vouchers. In the box of delight which followed, were two Teensy 3.1 boards. I didn’t know what I wanted to do with them, I just knew they packed some punch and had a plugin for the Arduino IDE. Soon after unpacking them, and seeing just how many I/O pins it had, I wondered if it was enough to connect a Z80 for a ‘working’ computer.

It was!

So my project over the holiday season was decided: TeensyZ80. I wanted to have a usable Z80 running its own code, with the teensy supporting it providing the RAM, I/O peripherals, and clock.

Now, some homebrew/single board computer enthusiasts may be groaning at this point “another Z80 board, and it’s not even using real ICs” but I was more interested in the timing, what speed could you actually achieve, and the Z80 in general. My first computer was a ZX Spectrum, and I’m feeling quite sentimental. I never programmed in Z80 assembly and thought that should change.

It seemed after reading the datasheets that it would be quite simple to achieve, but the first thing I needed to do was check if the Z80 actually worked!

A Z80 test circuit

The Z80 pinout shows just how simple this is to hook up and at least test PC increment on NOP instructions. Hooking up some LEDs (with resistors!) to some low address bus lines, powering it with USB 5v and also fixing some of the other status inputs to 5v – they are active low inputs – means we should see the address bus change as the CPU requests from memory. The Z80 NOP instruction is represented by the 1-byte sequence 0x00, so if we pull all the data lines low to ground we should see the address bus increment as it executes NOPs at every data location.

The Z80 uses static registers, so this means there is no minimum clock frequency to keep the CPU running. Because of this, technically you can run this with a clock sourced from a push switch and some other circuitry to clean up the signal, but I just used an Arduino and got it to output a slow square wave. With this all put together on a breadboard, the Z80 booted, but was quite erratic – I was failing to reset the chip. To reset, you pull the reset line low, for a minimum of 4 complete clock cycles. Once I did this, the address bus started at 0 and worked up, counting. One thing to note is that it takes several cycles to fetch and execute instructions on the z80, so the bus does not increment every cycle, but usually every 4 cycles. More information about Z80 test circuits can be found here.

A bonus video below, showing the need for clean clocks. This is what happens when I just touch the clock pin with my fingers!

On to the Teensy

At this point I was confident the z80 worked, and so then started soldering headers to the teensy I got from Pimoroni. I wanted this for breadboard use, so put the headers on the opposite side from what most would expect, and then soldered 90 degree female headers to the underside pad I/O pins. This meant I had all the Teensy 3.1 pins available to me, on the breadboard. One issue with this is that you will need to use the extended ‘stackable’ headers in addition to the ones soldered to raise the teensy high enough that you can press the program button on the unit to flash it.

The first thing to do was attempt to run the test circuit and the z80 at 3.3 volts. The Teensy is 5v tolerant, but some of the analog pins are 3v3 input only, so I wanted to make sure. Once this was confirmed working, I got on to making the test circuit do additional things. I added an SPI 2.2″ TFT to easily display debugging information and started connecting the data bus to the Teensy.

The first job was to create a clock signal for the Z80 to work with. To make things easier, this was going to be designed in such a way that the clock is completely synchronous with all other events, in that the loop() function of the Teensy simply sets the clock high, samples inputs, provides outputs, sets the clock low, and returns. Doing this allows for every operation to stall the CPU by way of delaying its lower clock edge, meaning we do not need to use the WAIT line to stall the Z80. It makes implementation easier for this project, but isn’t how things are done normally – although clock manipulation is a useful tool. One thing to note is that if you stall the clock like this, it should be done while it is high. Extended periods of the clock being low may lead to unexpected behavior.

Memory Requests

The next thing to implement is memory reading and writing. To do that, you need to know how all of the signals from the z80 cooperate to form a memory request. The Z80 manual has timing diagrams and you can see the one regarding memory requests below.

As I mentioned before, many clock cycles are required before a single operation is completed. Each clock cycle is a ‘T-state’, and the T-states are grouped into machine cycles. We are interested first in reading from RAM, so we look at the MREQ signal and the RD signal: when both are active, a memory read is requested. From the diagram above we can see that simply checking for those two signals, then sampling the address bus, and then writing the data bus, should be enough. So the Teensy sketch now consists of a very large number of pin definitions, with the address lines as inputs and the data lines as outputs, plus the following loop:

void loop() {
    digitalWrite(Z_CLK, HIGH);
    delay(300);
    updateZ80Control();
    if (MEMREQ && RD) {
        readAddressBus();
        if (addressBus < ROM_LENGTH) {
            dataBus = TEST_ROM[addressBus];
        } else {
            dataBus = 0x0;
        }
        writeDataBus();
    }
    digitalWrite(Z_CLK, LOW);
}

This should be enough to satisfy reads made by the Z80 from the Teensy. The readAddressBus() function simply builds a short value from the address i/o pins, and writeDataBus() writes a byte value out to the data i/o pins.

void readAddressBus() {
    addressBus = 0;
    addressBus |= ((digitalRead(AD0)==HIGH)?1:0)<<0;
    addressBus |= ((digitalRead(AD1)==HIGH)?1:0)<<1;
    addressBus |= ((digitalRead(AD2)==HIGH)?1:0)<<2;
    addressBus |= ((digitalRead(AD3)==HIGH)?1:0)<<3;
snip

… and so on
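Since every line of readAddressBus() follows the same pattern, the function can also be written as a loop over a pin table. The following is only a sketch of that idea, not the code from the actual sketch: the pin numbers in ADDRESS_PINS are placeholders, the function returns the value instead of setting the addressBus global, and digitalRead() is replaced by a small desktop stand-in so the example can be compiled away from the Teensy.

```cpp
#include <cassert>
#include <cstdint>

// Desktop stand-ins for the Arduino API (assumption: on the Teensy you
// would delete these and use the real digitalRead() and HIGH).
const int HIGH = 1;
int fakePinLevel[64] = {0};                  // simulated pin states
int digitalRead(int pin) { return fakePinLevel[pin]; }

// Hypothetical pin numbers for A0..A15, lowest address bit first.
const int ADDRESS_PINS[] = {0, 1, 2, 3, 4, 5, 6, 7,
                            8, 9, 10, 11, 12, 13, 14, 15};
const int ADDRESS_PIN_COUNT = sizeof(ADDRESS_PINS) / sizeof(ADDRESS_PINS[0]);

// Builds the address from individual pin reads, one bit per pin.
uint16_t readAddressBus() {
    assert(ADDRESS_PIN_COUNT <= 16);         // result must fit in 16 bits
    uint16_t value = 0;
    for (int bit = 0; bit < ADDRESS_PIN_COUNT; ++bit) {
        if (digitalRead(ADDRESS_PINS[bit]) == HIGH) {
            value |= static_cast<uint16_t>(1u << bit);
        }
    }
    return value;
}
```

Keeping the pin order in one table also means a wiring change only needs the table edited, not every line of the function.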

Now the Z80 should be able to read from memory, but I had to now write some Z80 code to test that the CPU correctly interpreted this data. So I downloaded ZMAC.

Despite the ZX Spectrum being my first computer, I never did any coding on it. Nothing. So this is my first look at Z80 assembly. Really, though, it’s pretty basic stuff as far as an ISA goes. I coded a small example which is assembled at location 0 – all it does is execute several nops, before entering an infinite loop. I assembled with zmac, and then used bin2h on the output .cim file so I could simply paste the instruction stream into my Teensy Sketch source as RAM. I did this, built it all, and….

The address bus counted up as expected, and then became incredibly erratic.

;  To assemble: zmac asm.z
;  To get data for the sketch source:
;     bin2h zout/asm.cim > asm_binary.h

    .org 0000h
start:
    nop
    nop
    nop
infloop:
    jr infloop

I couldn’t figure out what was wrong. I wired up an LED to the HALT output of the CPU and assembled a program that should just immediately halt the CPU, thus make the LED go out, but it didn’t.

    .org 0000h
start:
    halt

After a length of time I’m far too embarrassed to state, I realised the data bus was wired to the Teensy in the wrong order. The address bus pins on the Z80 are all in order from A0 to A15, but the data bus pins are not. Oops. This meant the data was being fed to the Z80 from the Teensy wrongly, which is why the address bus was erratic: it was executing completely different opcodes to what was intended. Here is the pinout of the Z80, again.

[Figure: Z80 pinout]

After fixing that little mishap, we had a Z80 which counted up to the address where the infinite loop was, and it stayed there. Note, you may still see the address bus count for some cycles. Those are the refresh cycles at work, which are used to refresh dynamic RAM. We can just ignore these cycles. If you connect an LED to the RFSH line you can tell which cycles to ignore. I attached this line to the Teensy and if it’s active on a clock I simply skip everything. I also connected an LED to the HALT line: the LED is on when the Z80 is active, and off when in a halt state.
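Going back to the miswired data bus for a moment, it is easy to model what a shuffled bus does to an instruction stream. The sketch below is purely illustrative: the REVERSED wiring is a made-up example (the real wrong order isn’t recorded here), and asSeenByZ80() is a hypothetical helper, not part of the sketch. The opcode values themselves are standard Z80: NOP is 0x00, HALT is 0x76 and JR is 0x18.

```cpp
#include <cstdint>

// Returns the byte the Z80 would see if data line i on the Teensy side
// were wired to data line wiring[i] on the Z80 side.
uint8_t asSeenByZ80(uint8_t sent, const int wiring[8]) {
    uint8_t seen = 0;
    for (int bit = 0; bit < 8; ++bit) {
        if (sent & (1u << bit)) {
            seen |= static_cast<uint8_t>(1u << wiring[bit]);
        }
    }
    return seen;
}

// Hypothetical miswiring: the whole bus reversed (D0 <-> D7, D1 <-> D6, ...).
const int REVERSED[8] = {7, 6, 5, 4, 3, 2, 1, 0};
```

A NOP (0x00) has no bits set, so it survives any reordering, which fits with the address bus counting normally through the nops at the start. With the reversed wiring assumed here, HALT (0x76) would arrive as 0x6E, a different instruction entirely, and the 0xFE offset of the `jr infloop` would arrive as 0x7F, sending the jump somewhere else.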

Next, implement writing to RAM, of course! (Yes, I know the RAM is in a variable called TEST_ROM, stupidly :) ).

if (MEMREQ_val) {
    if (RD_val) {
        if (addressBus < ROM_LENGTH) {
            dataBus = TEST_ROM[addressBus];
        } else {
            dataBus = 0x0;
        }
        writeDataBus();
    } else if (WR_val) {
        readDataBus();
        if (addressBus < ROM_LENGTH) {
            TEST_ROM[addressBus] = dataBus;
        }
    }
}

The implementations of readDataBus() and writeDataBus() must remember to set the input/output mode of the data pins, because the same pins are used in both directions. This can be done at the start of the function. For example:

void writeDataBus() {
    pinMode(D0, OUTPUT);
    pinMode(D1, OUTPUT);
    pinMode(D2, OUTPUT);
    pinMode(D3, OUTPUT);
    pinMode(D4, OUTPUT);
    pinMode(D5, OUTPUT);
    pinMode(D6, OUTPUT);
    pinMode(D7, OUTPUT);

    digitalWrite(D0, (dataBus&(1<<0))?HIGH:LOW);
    digitalWrite(D1, (dataBus&(1<<1))?HIGH:LOW);
    digitalWrite(D2, (dataBus&(1<<2))?HIGH:LOW);
    digitalWrite(D3, (dataBus&(1<<3))?HIGH:LOW);
    digitalWrite(D4, (dataBus&(1<<4))?HIGH:LOW);
    digitalWrite(D5, (dataBus&(1<<5))?HIGH:LOW);
    digitalWrite(D6, (dataBus&(1<<6))?HIGH:LOW);
    digitalWrite(D7, (dataBus&(1<<7))?HIGH:LOW);
}
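For completeness, the read direction mirrors the function above. This is a sketch under assumptions rather than the sketch’s real readDataBus(): the pin numbers in DATA_PINS are placeholders, the loop replaces the unrolled lines, and pinMode() and digitalRead() are desktop stand-ins so the example compiles off the Teensy.

```cpp
#include <cstdint>

// Desktop stand-ins for the Arduino API (assumption: remove on the Teensy).
const int INPUT = 0;
const int OUTPUT = 1;
const int HIGH = 1;
int fakePinMode[64] = {0};                   // simulated pin directions
int fakePinLevel[64] = {0};                  // simulated pin states
void pinMode(int pin, int mode) { fakePinMode[pin] = mode; }
int digitalRead(int pin) { return fakePinLevel[pin]; }

// Hypothetical pin numbers for D0..D7, lowest data bit first.
const int DATA_PINS[8] = {16, 17, 18, 19, 20, 21, 22, 23};

uint8_t dataBus = 0;

// Switches the data pins to inputs, then samples them into dataBus.
void readDataBus() {
    for (int bit = 0; bit < 8; ++bit) {
        pinMode(DATA_PINS[bit], INPUT);
    }
    dataBus = 0;
    for (int bit = 0; bit < 8; ++bit) {
        if (digitalRead(DATA_PINS[bit]) == HIGH) {
            dataBus |= static_cast<uint8_t>(1u << bit);
        }
    }
}
```

All eight pins are switched to inputs before any of them is sampled, so the Teensy is no longer driving the bus by the time the first bit is read.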

That’s it. This is actually good enough, now, to get the z80 doing some real computation. I picked a section of RAM as a ‘frame buffer’ and just wrote ASCII characters to there from another area of Z80 memory, and got the Teensy every loop() to draw the contents of that frame buffer to the SPI TFT screen. It all worked. Slowly, mind – redrawing the TFT is very slow given how quickly we want the clock to tick.

At the moment, you cannot get data into the z80 that isn’t already flashed to the Teensy. We can solve this by using the Serial feature of the Teensy, and exposing this functionality to the z80 via its I/O ports. It’s what I did next.

I/O Requests

The Z80 has another status pin, IORQ, which when active signifies that the low half of the address bus holds a port number, and that a read or write should be applied to it. I would implement a serial status/command port, and a data port, as follows:

  • A read from the Status Port will return with the number of bytes available in the buffer for reading.
  • A write to the Status port will be interpreted as a command to be carried out by the ‘serial device’.
  • A read from the Data port will pop a byte from the buffer and put it on the data bus.
  • A write to the Data port will write the byte on the data bus to the serial device.

With this functionality, I could use my computer as input keyboard or output terminal. I would allow the Z80 to configure the serial rate and other options, and then initialize the connection. I could then implement getchar and putchar with stdin/out defaulting to the serial connection.

The serial data port will be very easy to implement with the teensy Serial object. If a write is made to the port, we do Serial.write(dataBus), and if a read is made we do dataBus = Serial.read().
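The read and write halves of the two ports could be sketched roughly like this. Several things here are assumptions for illustration: FakeSerial only mimics the available(), read() and write() calls of the Teensy Serial object so the example runs on a desktop, PORT_SERIAL_DATA’s number is made up (only port $01 for commands appears in the assembly in this post), and the clamping of the status value to 255 is my own addition because the data bus is only 8 bits wide.

```cpp
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Desktop stand-in mimicking the subset of the Arduino Serial API used here.
struct FakeSerial {
    std::deque<uint8_t> incoming;            // bytes waiting to be read
    std::vector<uint8_t> outgoing;           // bytes that were written
    int available() const { return static_cast<int>(incoming.size()); }
    int read() {                             // returns -1 when nothing is waiting
        if (incoming.empty()) return -1;
        uint8_t b = incoming.front();
        incoming.pop_front();
        return b;
    }
    std::size_t write(uint8_t b) { outgoing.push_back(b); return 1; }
};
FakeSerial Serial;

const uint8_t PORT_SERIAL_CMD = 0x01;        // status/command port
const uint8_t PORT_SERIAL_DATA = 0x02;       // hypothetical port number

// Value to place on the data bus for an I/O read of the given port.
uint8_t handleSerialRead(uint8_t port) {
    if (port == PORT_SERIAL_CMD) {
        int waiting = Serial.available();
        return static_cast<uint8_t>(waiting > 255 ? 255 : waiting);
    }
    if (port == PORT_SERIAL_DATA) {
        int b = Serial.read();
        return static_cast<uint8_t>(b < 0 ? 0 : b);  // empty buffer reads as 0
    }
    return 0;
}

// Handles an I/O write of dataBus to the data port.
void handleSerialWrite(uint8_t port, uint8_t dataBus) {
    if (port == PORT_SERIAL_DATA) {
        Serial.write(dataBus);
    }
}
```

Serial.read() returns -1 when the buffer is empty, so a Z80 program would be expected to check the status port before reading the data port.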

We want the z80 to configure the connection via a command port. We may as well have the Z80 do something, given we’ve really cheated here by making the Teensy do all heavy lifting. The first thing we want is the serial rate, which will be a 16 bit value. For this, we want an 8-bit ‘command’ followed by the two halves of the 16-bit rate, low byte first. We’ll call this a command packet. As this uses multiple i/o writes to achieve a full packet, we need to store some state information on the Teensy over multiple cycles. It’s done very simply: the serial device has a current command state, and we work out from that what any I/O status write should do. So for setting the rate, we have the following:

if (portAddress == PORT_SERIAL_CMD)
{
    if (ioSerialCurrentMode == SERIAL_CMD_READY)
    {
        //ready for commands, so put us in the right mode
        ioSerialCurrentMode = dataBus;
    }
    else
    {
        //take data for the given mode
        if (ioSerialCurrentMode == SERIAL_CMD_SET_RATE)
        {
            ioSerialCurrentMode = SERIAL_CMD_SET_RATE_2;
            ioSerialRate = dataBus;
        }
        else if (ioSerialCurrentMode == SERIAL_CMD_SET_RATE_2)
        {
            ioSerialCurrentMode = SERIAL_CMD_READY;
            ioSerialRate |= (((unsigned short)dataBus)<<8U);
        }
    }
}

When we run this with the following Z80 ASM:

    ld a, SERIAL_CMD_SET_RATE
    out ($01), a       ; set the serial rate (9600)
    ld a, 80h
    out ($01), a
    ld a, 25h
    out ($01), a

we discover things don’t really work. This is because a single I/O operation spans several clock cycles, while our code assumes each I/O request lasts exactly one cycle, that is, one execution of loop(). The same write is therefore seen more than once, so we need to debounce the I/O. Looking at the timing diagram from the Z80 guide we see that we must wait 4 cycles.
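To see the problem and one possible cure in isolation, here is a small desktop model of the command port. Note this is not the fixed 4-cycle wait described above but a different technique, edge detection: the write is acted on only on the first loop() pass where the I/O write becomes active. The numeric values of the SERIAL_CMD_* constants, the ioWriteWasActive flag and the onClockTick() function are all assumptions made for the example.

```cpp
#include <cstdint>

// Hypothetical values; the real constants are defined in the sketch.
const uint8_t SERIAL_CMD_READY = 0;
const uint8_t SERIAL_CMD_SET_RATE = 1;
const uint8_t SERIAL_CMD_SET_RATE_2 = 2;

uint8_t ioSerialCurrentMode = SERIAL_CMD_READY;
uint16_t ioSerialRate = 0;
bool ioWriteWasActive = false;               // I/O write state on the previous tick

// Same state machine as the command port code above.
void serialCommandWrite(uint8_t dataBus) {
    if (ioSerialCurrentMode == SERIAL_CMD_READY) {
        ioSerialCurrentMode = dataBus;
    } else if (ioSerialCurrentMode == SERIAL_CMD_SET_RATE) {
        ioSerialCurrentMode = SERIAL_CMD_SET_RATE_2;
        ioSerialRate = dataBus;              // low byte arrives first
    } else if (ioSerialCurrentMode == SERIAL_CMD_SET_RATE_2) {
        ioSerialCurrentMode = SERIAL_CMD_READY;
        ioSerialRate |= static_cast<uint16_t>(dataBus << 8);
    }
}

// Called once per loop() pass. The command is processed only when the
// I/O write goes from inactive to active, however long it stays active.
void onClockTick(bool ioWriteActive, uint8_t dataBus) {
    if (ioWriteActive && !ioWriteWasActive) {
        serialCommandWrite(dataBus);
    }
    ioWriteWasActive = ioWriteActive;
}
```

Feeding this the three writes from the assembly (the command, then 80h, then 25h), each held active for several ticks, leaves ioSerialRate at 0x2580, which is 9600. Without the edge check, the repeated 80h would be taken as both halves of the rate.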

[Figure: Z80 I/O request timing diagram]

Once this is changed so we always wait 4 cycles, we can implement the INIT process. I’ve got it set up to wait for a connection from the teensy before returning from the INIT command, but this could easily be put into the status ‘bytes available’ read port.

if (dataBus == SERIAL_CMD_INIT)
{
    Serial.begin(ioSerialRate);
    while (!Serial); // wait for a connection
    ioSerialInitialized = 1;
}

So, we can read and write serial data, cool. But I really want the z80 to have its ‘own’ screen. Not the hack we did before. We can do it two ways – set up an area of memory as some video ram and populate it, getting the teensy every frame to draw what is in that memory, be it characters or pixels, or we can create a set of ports to manipulate a virtual console. I’ve done both, but I’ll only look into the virtual console using ports here.

We need several functions to get a console:

  • put character
  • get/set column
  • get/set row
  • optionally, set colour.

The console will be 32 columns by 24 rows. To get things up and running quickly, I made the decision to simply have the teensy deal with the set column/row edge cases, and have put character advance along the console each time it’s used. For set colour, I used simple state to allow 16-bit 5:6:5 colour input via two port writes, low byte first. The code looks as follows:

else if (portAddress == PORT_DISP_SETCOLOUR)
{
    if ((console_current_color_state&0x1)==0x1)
    {
        console_current_color |= dataBus<<8U;
    }
    else
    {
        console_current_color = dataBus;
    }
    console_current_color_state++;
}
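As a worked example of the 5:6:5 format, the sketch below packs an 8-bit-per-channel colour into 16 bits and then feeds the two bytes through the same low-then-high logic as the port code above. packRgb565() and setColourPortWrite() are hypothetical helper names, and the type of console_current_color_state is assumed.

```cpp
#include <cstdint>

// Packs 8-bit red, green and blue into 16-bit 5:6:5 (red in the top bits).
uint16_t packRgb565(uint8_t r, uint8_t g, uint8_t b) {
    return static_cast<uint16_t>(((r & 0xF8) << 8) | ((g & 0xFC) << 3) | (b >> 3));
}

uint16_t console_current_color = 0;
uint8_t console_current_color_state = 0;     // assumed type; even = low byte next

// Mirrors the PORT_DISP_SETCOLOUR handling: first write is the low byte,
// second write is the high byte.
void setColourPortWrite(uint8_t dataBus) {
    if ((console_current_color_state & 0x1) == 0x1) {
        console_current_color |= static_cast<uint16_t>(dataBus << 8);
    } else {
        console_current_color = dataBus;
    }
    console_current_color_state++;
}
```

Pure red (255, 0, 0) packs to 0xF800, so the Z80 would write 0x00 followed by 0xF8 to the colour port. Green (0, 255, 0) is 0x07E0 and blue (0, 0, 255) is 0x001F.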

The putchar port code is very simple again:

else if (portAddress == PORT_DISP_PUTCHAR)
{
    char c = dataBus;
    tft.setTextColor( console_current_color, ILI9341_BLACK);
    tft.setCursor (CONSOLE_START_X + (console_current_col*CONSOLE_FONTX), CONSOLE_START_Y + (console_current_row*CONSOLE_FONTY));
    tft.print(fmtstring("%c", c));

    console_current_col++;
    if (console_current_col >= CONSOLE_COLUMNS) {
        console_current_col = 0;
        console_current_row++;
    }
    if (console_current_row >= CONSOLE_ROWS) {
        console_current_row = 0;
    }
}

An optimization to the above is to only set the text colour on a second textcolour I/O write, but at the time of writing I had some debug draw stuff going on in the sketch, so wanted to ensure the console always used the correct colours. Hence the setTextColor call each putchar.

Wrapping up

With this, we have a display, serial in/out, and can now try writing some more z80 ASM! But for now I think this is enough for this part. I’ve already got Mode 2 interrupts working, and I’m working on interfacing an SD card. I’ll be cheating heavily with that, getting the Teensy to do all of the FAT heavy lifting. But it’s a fun exercise.

All of the code for this is up on my github https://github.com/Domipheus. Note it may not line up exactly with this post, as it’s being edited fairly often.

If you enjoyed this, please let me know via twitter @domipheus.

The next part in this series of posts is available here.