This blog has relocated to https://coolbutuseless.github.io and associated packages are now hosted at https://github.com/coolbutuseless.

29 April 2018

mikefc

r64 - a c64/6502 assembler in R

Rationale

  • Who doesn’t want a compiler for a 1 MHz 8-bit computer with 16 colours and a max resolution of 320x200?
  • A 6502 assembler in R will allow for preparing/calculating data in R and then incorporating directly into the assembly code. e.g.
    • creating character sets
    • computing animation paths

Features

General features

  • Basic syntax only. Similar to TASS64 syntax.
  • Settable program counter
    • e.g. * = $0801
  • Defined variables
    • e.g. border = $d020
  • Low/High byte extraction from symbol (similar to TASS)
    • e.g. lda #<routine will store the low byte of the address of symbol routine in the A register
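As a rough illustration of what low/high byte extraction means, the sketch below splits a 16-bit address into its two bytes in base R. The helper names `lo_byte` and `hi_byte` and the address chosen for `routine` are made up for this example; they are not part of r64.

```r
# Split a 16-bit address into its low and high bytes (base R only).
# `lo_byte` and `hi_byte` are hypothetical helpers, not r64 functions.
lo_byte <- function(addr) as.integer(addr %% 256)   # what `#<symbol` selects
hi_byte <- function(addr) as.integer(addr %/% 256)  # what `#>symbol` selects

routine <- 0x0820  # assumed address of the symbol `routine`

sprintf("low = $%02x, high = $%02x", lo_byte(routine), hi_byte(routine))
```

The 6502 stores addresses low byte first, which is why the two halves are usually handled separately.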

For integration with R

  • .rtext directive to include an R string as text data
  • .rbyte directive to include an R integer vector as bytes
  • {...} to delimit code to be evaluated at run time to manipulate labels and variables e.g. lda {border + 1}
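To give a feel for the kind of data preparation this enables, here is a minimal sketch that computes a sine-wave path in R and formats it as byte data. The exact syntax of `.rbyte` is not shown in this post, so the sketch falls back to plain `.byte` lines, which use the syntax seen elsewhere here. The function name `to_byte_lines` and the choice of 16 samples are assumptions for illustration only.

```r
# One period of a sine wave scaled to 0..255, e.g. a sprite's vertical path.
n    <- 16
path <- as.integer(round(127.5 + 127.5 * sin(2 * pi * (0:(n - 1)) / n)))

# Hypothetical helper: format an integer vector as `.byte` lines.
to_byte_lines <- function(values, per_line = 8) {
  stopifnot(all(values >= 0 & values <= 255))
  hex    <- sprintf("$%02x", values)
  groups <- split(hex, ceiling(seq_along(hex) / per_line))
  vapply(groups,
         function(g) paste0("  .byte ", paste(g, collapse = ", ")),
         character(1))
}

cat(to_byte_lines(path), sep = "\n")
```

With `.rbyte`, the integer vector would presumably be passed in directly rather than pasted in as text.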

Installation

r64 depends on a lot of tidyverse packages as well as the minilexer package (for splitting the c64 asm code into tokens).

devtools::install_github('coolbutuseless/minilexer') # for lexing the 6502 assembly into tokens
devtools::install_github('coolbutuseless/r64')

A simple 6502 program

The following c64/6502 ASM code will clear the screen and then write Hello #rstats! at the top of the screen:

asm <- '
*=$0801
  .byte $0c, $08, $0a, $00, $9e, $20  ; 10 SYS 2080
  .byte $32, $30, $38, $30, $00, $00
  .byte $00

*=$0820
      lda #$93        ; Clear the screen
      jsr $ffd2

      ldx #$00        ; initialise the offset pointer into our message
loop  lda message,x   ; load a character and write it to screen 
      and #$3f        ; Manually place chars on screen
      sta $0400,x
      inx
      cpx #$0e
      bne loop

      rts

message
    .text "Hello #rstats!"
'
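The three `.byte` lines at `$0801` are a one-line BASIC program that jumps to the machine code. The sketch below decodes those bytes in base R to show how they spell out `10 SYS 2080`; the helper `word` is hypothetical and only used here.

```r
# The BASIC stub bytes from the listing above.
stub <- as.integer(c(0x0c, 0x08, 0x0a, 0x00, 0x9e, 0x20,
                     0x32, 0x30, 0x38, 0x30, 0x00, 0x00, 0x00))

# Hypothetical helper: combine two bytes into a little-endian 16-bit value.
word <- function(lo, hi) lo + 256L * hi

next_line <- word(stub[1], stub[2])        # pointer to the next BASIC line
line_no   <- word(stub[3], stub[4])        # BASIC line number
token     <- stub[5]                       # 0x9e is the BASIC token for SYS
arg       <- rawToChar(as.raw(stub[6:10])) # the text after SYS

sprintf("%d SYS%s (next line at $%04x)", line_no, arg, next_line)
```

The argument 2080 is `$0820` in decimal, which is where the second `*=` places the machine code. The remaining zero bytes end the line and the BASIC program.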

Compile the program and run it using VICE

prg_df       <- r64::compile(asm)
prg_filename <- tempfile()
r64::save_prg(prg_df, prg_filename)
# system(paste("/usr/local/opt/vice/x64.app/Contents/MacOS/x64", prg_filename), wait=FALSE)

[Image: helloworld output]

Output from animation using a custom character set

[Image: custom charset]

Breakdown of assembly process

The compiler makes a few passes through the data to resolve symbol values.

The r64::compile() function is just a wrapper that calls the following four functions in order:

  1. line_tokens <- r64::create_line_tokens(asm)
    • For each line in the input break it into tokens.
    • Filter out any rows that contain no instructions
  2. prg_df <- r64::create_prg_df(line_tokens)
    • Create a data.frame from line_tokens
    • This is the key data structure for the compilation process
    • The compilation process is just a matter of manipulating this data.frame and merging with information about the instructions
  3. prg_df <- r64::process_symbols(prg_df)
    • Resolve labels to their actual addresses
    • Replace any defined variables with their values
  4. prg_df <- r64::process_zero_padding(prg_df)
    • If there are gaps between blocks of code, insert zero bytes
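The steps above can be illustrated with a toy data.frame. This is only a sketch of the general idea (assign addresses from instruction sizes, then look up labels); it does not reproduce r64's internals, and the column names are simply borrowed from this post.

```r
# Toy version of the loop from the example: one row per instruction.
prg <- data.frame(
  label  = c(NA, "loop", NA, NA, NA, NA, NA),
  line   = c("ldx #$00", "lda message,x", "and #$3f", "sta $0400,x",
             "inx", "cpx #$0e", "bne loop"),
  nbytes = c(2L, 3L, 2L, 3L, 1L, 2L, 2L),
  stringsAsFactors = FALSE
)

# Pass 1: each row starts where the previous one ended.
start    <- 0x0825  # 2085, the address of `ldx #$00` in the example
prg$addr <- start + cumsum(c(0L, head(prg$nbytes, -1)))

# Pass 2: build a symbol table from the labelled rows.
has_label <- !is.na(prg$label)
symbols   <- setNames(prg$addr[has_label], prg$label[has_label])

# A relative branch stores (target - address of the next instruction)
# as a signed byte, so a backwards jump becomes a two's-complement value.
bne    <- prg[prg$line == "bne loop", ]
offset <- symbols[["loop"]] - (bne$addr + bne$nbytes)
sprintf("%02x", as.integer(offset %% 256))
```

Labels can be referenced before they are defined (as `message` is), which is why addresses have to be assigned for every row before any symbol is resolved.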

An example of the final form of the prg_df data.frame is shown below. The actual contents of the c64 prg file are just the sequence of values in the hexbytes column.

library(dplyr)  # provides %>% and select()

prg_df %>%
  select(addr, label, line, opcommand, op, ophex, nbytes, hexbytes) %>%
  knitr::kable()
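
| addr | label   | line                          | opcommand     | op  | ophex | nbytes | hexbytes                                               |
|-----:|:--------|:------------------------------|:--------------|:----|:------|-------:|:-------------------------------------------------------|
| 2049 | NA      | * = $0801                     | NA            | NA  | NA    |      0 |                                                        |
| 2049 | NA      | .byte $0c $08 $0a $00 $9e $20 | NA            | NA  | NA    |      6 | 0c, 08, 0a, 00, 9e, 20                                 |
| 2055 | NA      | .byte $32 $30 $38 $30 $00 $00 | NA            | NA  | NA    |      6 | 32, 30, 38, 30, 00, 00                                 |
| 2061 | NA      | .byte $00                     | NA            | NA  | NA    |      1 | 0                                                      |
| 2062 | NA      | (zero padding)                | NA            | NA  | NA    |     18 |                                                        |
| 2080 | NA      | * = $0820                     | NA            | NA  | NA    |      0 |                                                        |
| 2080 | NA      | lda #$93                      | lda #$93      | lda | a9    |      2 | a9, 93                                                 |
| 2082 | NA      | jsr $ffd2                     | jsr $ffd2     | jsr | 20    |      3 | 20, d2, ff                                             |
| 2085 | NA      | ldx #$00                      | ldx #$00      | ldx | a2    |      2 | a2, 00                                                 |
| 2087 | loop    | loop lda message x            | lda message x | lda | bd    |      3 | bd, 35, 08                                             |
| 2090 | NA      | and #$3f                      | and #$3f      | and | 29    |      2 | 29, 3f                                                 |
| 2092 | NA      | sta $0400 x                   | sta $0400 x   | sta | 9d    |      3 | 9d, 00, 04                                             |
| 2095 | NA      | inx                           | inx           | inx | e8    |      1 | e8                                                     |
| 2096 | NA      | cpx #$0e                      | cpx #$0e      | cpx | e0    |      2 | e0, 0e                                                 |
| 2098 | NA      | bne loop                      | bne loop      | bne | d0    |      2 | d0, f3                                                 |
| 2100 | NA      | rts                           | rts           | rts | 60    |      1 | 60                                                     |
| 2101 | message | message                       | NA            | NA  | NA    |      0 |                                                        |
| 2101 | NA      | .text "Hello #rstats!"        | NA            | NA  | NA    |     14 | c8, 45, 4c, 4c, 4f, 20, 23, 52, 53, 54, 41, 54, 53, 21 |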
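To make the link between the hexbytes column and a file on disk concrete, here is a small sketch that turns hex strings into raw bytes and writes them out. A C64 .prg file conventionally begins with a two-byte, low-byte-first load address; whether r64::save_prg() adds this header is not shown in this post, so treat that part as an assumption. The helper `hex_to_raw` is hypothetical, and the three instructions are a cut-down stand-in for the full program.

```r
# Hypothetical helper: comma-separated hex strings -> raw vector.
hex_to_raw <- function(hexbytes) {
  parts <- trimws(unlist(strsplit(hexbytes, ",")))
  as.raw(strtoi(parts[nzchar(parts)], base = 16L))
}

# hexbytes for: lda #$93 / jsr $ffd2 / rts
code <- hex_to_raw(c("a9, 93", "20, d2, ff", "60"))

# Assumed header: the load address, low byte first.
load_addr <- 0x0820
header    <- as.raw(c(load_addr %% 256, load_addr %/% 256))

prg_file <- tempfile(fileext = ".prg")
writeBin(c(header, code), prg_file)
file.size(prg_file)
```

Since this cut-down version has no BASIC stub, it would need to be started by hand with SYS 2080 after loading.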