Saturday, 26 December 2020

2nd SID chip auto detect

I've started working on something that can use a 2nd SID chip if it exists.  While I was going to put a manual setup in I thought I'd have a go at detecting it automatically as well.   I haven't looked at other people's implementations as I thought it'd be better to try and understand the problem from scratch.
 
So far this has only been tested on an emulator, but I think the premise at least is sound if the memory mapping holds up.

SID chips in memory:

SID chips, at least as far as I'm aware, can only be in the $d000-$dfff block of memory, and within that can only occupy two areas:

  • $d400-$d7ff
  • $de00-$dfff

This is because the VIC-II registers, CIA timers and colour RAM also occupy that same 4KB of memory in various places.

The first SID is always at $d400, with the second SID able to occupy any 32-byte block in those two memory spaces. 

Unoccupied blocks behave differently depending on which of the two areas you are trying to detect.   In the first area ($d400-$d7ff) empty blocks are mirrors of the first chip.  So, for example, writing to $d500 acts the same as writing to $d400.

In the second area they don't appear to be mirrored at all, though empty blocks seem to get some crosstalk. (at least relying on the emulator output to be accurate)  So, for example, writing to $de00 won't affect the first chip at $d400.

This means we need two similar but separate routines to look for a second SID across the available areas.
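
To give a feel for the size of the search, here's a rough sketch of the candidate addresses in the two areas.  It's Python purely for illustration (it isn't part of the detection code), and the function name and the "mirrored"/"unmirrored" labels are made up for this example.

```python
def candidate_sid_addresses():
    """Yield (area, address) for every 32-byte block a second SID might use."""
    # First area: $d420-$d7e0.  $d400 is skipped as that's the first SID.
    for addr in range(0xD420, 0xD800, 0x20):
        yield ("mirrored", addr)
    # Second area: $de00-$dfe0.
    for addr in range(0xDE00, 0xE000, 0x20):
        yield ("unmirrored", addr)


blocks = list(candidate_sid_addresses())
print(len(blocks), "blocks to check")
print([hex(addr) for _, addr in blocks[:3]])
```

Each area gets its own label because, as described above, an empty block behaves differently in each one.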

Detection:

To detect the chip I used the 'random number generator' read register on the SID ($1b).  This gives you the output value of the 3rd channel's oscillator.  The oscillator can be read just by setting its pitch and waveform registers; it doesn't need any ADSR setup or output volume, so it can work silently.   By setting oscillator 3 to the noise waveform and putting the frequency at max, we have a constantly changing value we can compare against to figure out where the other SID could be.

As mentioned, because the two areas behave differently we need two detection routines:

1) $d400-$d7ff area:

  1. Set the first SID's pitch and waveform register to $00.
  2. Read the first SID's RNG (random number generator) and store that as the 'last played' value.
  3. Set the second SID's pitch to max ($ff) and waveform to noise ($81)
  4. Read the first SID's RNG value and compare to last played value.
  5. If the RNG value doesn't match the last played value we haven't found the 2nd SID.  This is because the block is unmapped, so our writes were mirrored onto the first SID and its $d41b is now changing.
  6. If the RNG value is the same as the last played value it means the memory isn't being mirrored and there is probably another SID there.

So that's the first memory block taken care of.  Of course there is a possibility the RNG value will match the last played value by chance, but the chances are fairly low.
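
The steps above can be sketched with a toy model.  This is Python for illustration only, and everything in it (ToySid, ToyBus, the fake noise values) is an assumption made for the example, not how a real SID or VICE works internally.  The only behaviour it copies from the description is that empty blocks in $d400-$d7ff mirror the first chip.

```python
class ToySid:
    """Very rough stand-in for a SID: only voice 3's pitch/waveform matter."""

    def __init__(self):
        self.regs = [0] * 0x20
        self.ticks = 0

    def write(self, reg, value):
        self.regs[reg] = value & 0xFF

    def read_osc3(self):
        # Register $1b.  Toy behaviour: 0 while silent, otherwise a changing
        # value that is never 0.  A real chip can return 0 as well, which is
        # the rare false positive mentioned above.
        noise_on = self.regs[0x12] & 0x80
        if not noise_on or self.regs[0x0F] == 0:
            return 0
        self.ticks += 1
        return 1 + (self.ticks * 37) % 255


class ToyBus:
    """$d400-$d7ff with an optional second SID; empty blocks mirror SID 1."""

    def __init__(self, second_sid_at=None):
        self.sid1 = ToySid()
        self.sid2 = ToySid() if second_sid_at is not None else None
        self.second_sid_at = second_sid_at

    def chip_for(self, addr):
        if self.sid2 is not None and (addr & 0xFFE0) == self.second_sid_at:
            return self.sid2
        return self.sid1

    def write(self, addr, value):
        self.chip_for(addr).write(addr & 0x1F, value)

    def read(self, addr):
        chip = self.chip_for(addr)
        return chip.read_osc3() if (addr & 0x1F) == 0x1B else 0


def detect_in_mirrored_area(bus):
    """Return the address of the second SID in $d420-$d7e0, or None."""
    for base in range(0xD420, 0xD800, 0x20):
        bus.write(0xD40F, 0x00)        # 1. first SID pitch (high byte) to 0
        bus.write(0xD412, 0x00)        #    ...and waveform to 0
        last = bus.read(0xD41B)        # 2. remember the first SID's RNG value
        bus.write(base + 0x0F, 0xFF)   # 3. candidate: max pitch
        bus.write(base + 0x12, 0x81)   #    ...and noise waveform
        if bus.read(0xD41B) == last:   # 4-6. unchanged means no mirroring
            return base
    return None


found = detect_in_mirrored_area(ToyBus(second_sid_at=0xD500))
print(hex(found) if found is not None else "no second SID")
```

With no second chip in the model every write lands on the first SID, its RNG value changes on every pass, and the function returns None.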

Now, if the second block doesn't have mirroring, surely we can just check the RNG register on the second chip and see if there's any activity?   Well, yes and no.    This is what we're going to use, but because of the apparent crosstalk we'll need to sample a group of values and try to detect it from the overall picture instead.

The second detection routine starts out the same as the first, but unmapped areas don't give a constant 0 output when reading their RNG register.   If there is a SID there you'll get the stream of random values, but otherwise you'll get zeroes interspersed with $ff and sometimes other values.   The one consistent thing is that there are many more zeroes on unmapped areas than the other values.   I decided to read 16 values from the RNG (one per frame) then use this sample to decide if a SID exists.   If there are more than an arbitrary number of zeroes in the sample then most likely there isn't a SID there.

2) $de00-$dfff area:

  1. Set the first SID's waveform and pitch to $00.  We probably don't need to use this at all but might as well.
  2. Set the second SID's pitch to max ($ff) and waveform to noise ($81)
  3. Read 16 values from the second SID's RNG register, one per frame.
  4. Scan through the sampled values for any instances of zero, and add them to a tally.
  5. If the tally is under a certain threshold (I used 3 or less) we probably have a SID there.
  6. If the tally is over that threshold there's a lot more zeros in the crosstalk and we don't have a SID there.  
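
The decision in steps 4 to 6 boils down to counting zeroes.  Here's a small Python sketch of just that part.  The two sample lists are made up to show the idea, they aren't captured from VICE or real hardware.

```python
def looks_like_sid(samples, max_zeros=3):
    """True if the sample has few enough zeroes to suggest a real SID."""
    zero_tally = sum(1 for value in samples if value == 0)
    return zero_tally <= max_zeros


# Made-up example data: 16 reads of the candidate chip's RNG register.
noisy_chip = [0x3A, 0xC1, 0x07, 0x00, 0x9E, 0x55, 0xF0, 0x12,
              0x88, 0x4B, 0xE3, 0x00, 0x6D, 0x21, 0xB9, 0x7C]
crosstalk = [0x00, 0x00, 0xFF, 0x00, 0x00, 0x00, 0x41, 0x00,
             0x00, 0xFF, 0x00, 0x00, 0x00, 0x00, 0xFF, 0x00]

print(looks_like_sid(noisy_chip))
print(looks_like_sid(crosstalk))
```

The threshold is a parameter here because, as mentioned, 3 is an arbitrary choice and may well need tuning on real machines.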

I'm sure there are plenty of flaws in the above, but it'll be interesting to see what does and doesn't work on real machines when I have time.   This was tested with VICE 3.1

Source:

setup_siddetect

    ; We're using indirect addressing to read the second SID.
    ; Set the read address to the first possible one. ($d420)
    
    lda #$20
    sta $02
    lda #$d4
    sta $03
    
    ; Detection loop for $d400-$d7ff area.
    
setup_siddetectloop
    ; Wait for a new frame before looking for the SID just for safety.
    lda $d012
    cmp #$fe
    bne setup_siddetectloop

    ; These subroutines clear the first and second SID registers.
    jsr sid_sidchip_clear   
    jsr setup_sid2ndclear

    ; Set first SID high pitch/waveform on channel 3 to zero.
    lda #$00
    sta $d401+$0e
    lda #$00
    sta $d404+$0e

    ; Read first SID's RNG register and store it.
    lda $d41b
    sta setup_prevvalue

    ; Set second SID high pitch/waveform on channel 3 to a noise waveform at max pitch.
    ldy #$0f
    lda #$ff
    sta ($02),y
    ldy #$12
    lda #$81
    sta ($02),y
    
    ; Compare the first SID's RNG register against the previously recorded value.
    ; If it's the same the second SID must be mapped in this area, otherwise we'd be
    ; getting the second SID's random output mirrored in.
    
    lda $d41b
    cmp setup_prevvalue
    bne setup_siddetect_notfoundyet
    jmp setup_siddetect_found

    ; Move the second SID address up to the next 32-byte block.
    ; If we've reached $d800 (where the colour RAM is) we need to skip to the second
    ; detection routine instead.
    
setup_siddetect_notfoundyet   
    clc
    lda $02
    adc #$20
    sta $02
    cmp #$00
    bne setup_siddetectloop
    inc $03
    lda $03
    cmp #$d8
    beq setup_siddetectskip
    jmp setup_siddetectloop
    
setup_siddetectskip

    ; Start of second detection routine, set it to start reading from $de00.
    lda #$de
    sta $03
    lda #$00
    sta $02
    
setup_sidlastloop
    ; This is pretty much the same as the first detection routine.....
    lda $d012
    cmp #$fe
    bne setup_sidlastloop
    
    jsr sid_sidchip_clear
    jsr setup_sid2ndclear

    lda #$00
    sta $d401+$0e
    lda #$00
    sta $d404+$0e
    ldy #$0f
    lda #$ff
    sta ($02),y
    ldy #$12
    lda #$81
    sta ($02),y

    ; ...until here.  This time we take a sample of 16-bytes from the 2nd SID's
    ; RNG register.  We take one per frame to try and avoid duplicates.
    
    ldy #$1b
    ldx #$0f
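setup_siddetect_sampleinput
    lda $d012
    cmp #$fe
    bne setup_siddetect_sampleinput
    lda ($02),y
    sta setup_samplecache,x
    ; Wait until the raster has left line $fe before looping, otherwise the
    ; next pass could sample again on the same line rather than a frame later.
setup_siddetect_samplewait
    lda $d012
    cmp #$fe
    beq setup_siddetect_samplewait
    dex
    bpl setup_siddetect_sampleinput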

    ; Now we scan through the sample looking for zeros, if we find any they're
    ; added to a tally.
    
    ldx #$00
    stx setup_prevvalue
setup_siddetect_sampleanalyze
    lda setup_samplecache,x
    cmp #$00
    bne setup_siddetect_samplenozero
    inc setup_prevvalue
setup_siddetect_samplenozero   
    inx
    cpx #$10
    bne setup_siddetect_sampleanalyze

    ; If the tally is 3 or less we've probably found a SID chip.  With the RNG
    ; register going it's unlikely it'd find any, plus its default is 0 anyway.
    ; (bcc branches when the tally is below the compared value, so compare with 4.)
    
    lda setup_prevvalue
    cmp #$04
    bcc setup_siddetect_found

    ; If we haven't found a SID continue moving through the memory blocks.
    ; If we reach $e000 we're at the end of the available memory
    ; space, so set the address to $ffff which we can use as a check for
    ; no second SID existing.
    
    clc
    lda $02
    adc #$20
    sta $02
    cmp #$00
    bne setup_sidlastloop
    inc $03
    lda $03
    cmp #$e0
    bne setup_sidlastloop
    
setup_siddetect_notfound
    lda #$ff
    sta $02
    sta $03

    ; When the SID address is found write its low and high bytes to the top-left
    ; of the screen.  (They'll show up as screen code characters.)
    
setup_siddetect_found

    lda $02
    sta $0400
    lda $03
    sta $0401
    rts

    ; Clear first SID's registers.

sid_sidchip_clear
    ldx #$18
    lda #$00
sid_sidchip_clearloop
    sta $d400,x
    dex
    bpl sid_sidchip_clearloop
    rts

    ; Clear potential second SID's registers.
    
setup_sid2ndclear

    ldy #$00
    tya
setup_sid2ndclearloop   
    sta ($02),y
    iny
    cpy #$1d
    bne setup_sid2ndclearloop
    rts

    ; Variables:
    ; setup_prevvalue is used to store $d41b value in first detect and as the tally in second detect.
    
setup_prevvalue

    .byte $00
    
    ; Array for second detect sample.
    
setup_samplecache
    .byte $00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00,$00
   



Saturday, 7 March 2020

c64 303

A few days back I uploaded a Twitter video showing a little 303-ish sound simulator on the Commodore 64.

This came about because someone mentioned 3rd of March as 303 day, and that got me remembering I had a simple player from a few years back that never had quite enough 'bite' to the sound.  So I dug that out, played around with the source a bit and added some quick visuals, making sure it was at least 303 bytes or less as a tagline. (it's 301 btw)

I've put the source to the version from the video at the end of this post.  As you can guess this is code I wrote without doing much optimization as I only imagined it would be used in that video. :)  But anyway, enjoy.  Maybe down the line I'll do a much leaner version when I need to use it for something, or post the older version which at least autoruns.

For anyone interested the video was recorded using the VICE emulator, with the 8580 SID and the reSID filter bias set to -500.


Background on the SID chip:

So how do we get a 303-like sound on the Commodore 64?  Well, we have an advantage with this machine because of the unique SID chip, certainly very different to the AY, SN76489 and beepers of other hardware of the time.

It has 3 versatile oscillators (channels) that can produce triangle, saw, pulse and noise waveforms, plus some combinations of these.  The oscillators can also have sync and ring modulation applied (the latter only on the triangle waveform), with the previous channel in the set providing the pitch modulator input value. (this is in a round-robin fashion so Channel 1 gets affected by Channel 3, Channel 2 gets affected by Channel 1 and finally Channel 3 is affected by Channel 2)
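
The round-robin order is easy to get backwards, so here's a tiny Python sketch of it.  The function name is made up for the example, it only encodes the pairing described above.

```python
def modulator_source(channel):
    """Channel (1-3) whose oscillator drives sync/ring-mod on `channel`."""
    if channel not in (1, 2, 3):
        raise ValueError("the SID has channels 1, 2 and 3")
    # Step back one channel, wrapping from 1 round to 3.
    return (channel - 2) % 3 + 1


for ch in (1, 2, 3):
    print(f"channel {ch} is modulated by channel {modulator_source(ch)}")
```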

Each oscillator has its own ADSR controls for instrument dynamics, and there's a global filter with low, band and high-pass modes that can be applied to any of the oscillator channels.

If this sounds like quite a powerful feature set for a home computer, you might be interested to know the chip's designer was Bob Yannes who went on to co-found Ensoniq.

As well as these features there are a few more registers at the end of the list that can be used in your code. Firstly there are two Analog to Digital converter registers for the Paddle Controllers, and then there is a Random Number Generator.  This uses the oscillator output from Channel 3 to provide an 8-bit number. To get what we think of as random numbers you set the third oscillator to noise, and use the pitch of that channel to control the speed of changes to the random value.  You can also use this as a low-quality SID sampler, by playing an instrument on channel 3 and then recording the values that come in to memory.  With a bit of manipulation you can play them back through the 4-bit volume register.  I did this in the Type Mismatch demo.

How the player works:

To update the player independently of the main loop we point the IRQ vector ($0314-$0315) at our music player start point.  With the default system setup this interrupt is driven by a CIA timer, which fires roughly 60 times a second, so we have a steady tempo.  As we're using the interrupt in freewheeling mode rather than hitting a specific raster line, it will be triggered wherever the raster beam happens to be when the timer fires.  This usually isn't a good idea for things you want to happen at a specific time, however for us this is actually useful later in the player as we'll use the raster beam position to pick values.

The player doesn't have a traditional song/pattern setup, only a single pattern with some hard-coded changes to make it get tweaked a little bit over time.

Every pass a counter (located at $60) counts down until it rolls around to #$ff.  When this happens it's time to move on to the next note in the pattern.  If this hasn't happened the player jumps to the update portion, which handles things that need to happen continually such as the filter sweep being added or checking if the drums have reached their note-off time.

If the counter is #$ff it's reset to the song's tempo (#$06) and the next note is setup, as well as resetting the filter cut-off value and checking if a drum sound needs to play at this new position.
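
As a rough check on how that counter value turns into a tempo, here's the arithmetic as a Python sketch.  It assumes one pattern step is a 16th note (four steps to a beat), and the player call rate is passed in as a parameter.  Both are assumptions for the example rather than values read from the player.

```python
def approx_bpm(reset_value, calls_per_second, steps_per_beat=4):
    """Approximate tempo for a given counter reset value and call rate."""
    # The counter runs reset_value, ..., 1, 0 and then rolls to #$ff, so one
    # pattern step lasts reset_value + 1 calls of the player.
    calls_per_step = reset_value + 1
    steps_per_second = calls_per_second / calls_per_step
    return steps_per_second * 60 / steps_per_beat


print(round(approx_bpm(0x06, 60), 1))   # player called 60 times a second
print(round(approx_bpm(0x06, 50), 1))   # player called 50 times a second
```

Because the step length is a whole number of calls, the available tempos are fairly coarse: changing the reset value by one moves the tempo quite a lot.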

A pattern is 16 notes long, when all notes are played it checks if the amount of cycles for that pattern has happened.  If not the cycle count is decreased and we continue.  If we've reached the end of the cycle a few things are changed:

* The waveform used for the main melody.  (a selection between saw, pulse and triangle)
* The length of the next pattern cycle from a table.
* The melody and ring-mod source notes.
* The drum check value, to skip some types of drum to add some dynamics to the 'song'.

Some of these changes use 'naughty' self-modifying code to directly alter the code as it's running. (so it changes from what was compiled to something else)  For small-form stuff like this it can really save you some space, though obviously resetting to initial state needs additional code.

Melody

The music data comes from the Kernal and BASIC ROMs in the machine.  This is essentially 'free' data as the Kernal, BASIC and character gfx ROMs are in memory when you switch the machine on.  As we leave the machine in this state we're free to use them.   This data can potentially be any value between 0-255, but as we're looking only for low bass notes we do some bit masking to get something usable.

The C64 oscillators have a full 16-bit range so can be very accurate indeed, however we are only using the high-byte register as the slightly discordant sound seems to suit these sorts of acid lines.

The main melody uses some of the Kernal, I picked $f704 as it had a good string of melodies. (the kernal is at $e000-$ffff by default in memory)

lda $f704,y ; Read byte from Kernal area.
and #$0f ; Remove the top 4 bits of the value so we get a note between 0 and 15.
eor #$03 ; Mess around with the bottom two bits, this is more personal taste for the data I'd chosen.
sta 54273 ; store in channel 1's high frequency value register.

And the ring/sync modulation channel reads from the BASIC ROM.  I picked $a340 as a starting point there. (BASIC is at $a000-$bfff)  This also masks off bits from the data but leaves a range of 0-63, which means you occasionally get harmonics above the bass note that add a bit of flavour to the sound.
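
To see what the masking does to the data, here's the same bit math as a Python sketch.  The example bytes are made up, they aren't the actual contents of $f704 or $a340.

```python
def melody_note(rom_byte):
    """Main melody: keep the low 4 bits, then flip the bottom two (eor #$03)."""
    return (rom_byte & 0x0F) ^ 0x03


def ring_note(rom_byte):
    """Ring/sync channel: keep the low 6 bits, giving a value from 0 to 63."""
    return rom_byte & 0x3F


example_bytes = [0xA9, 0x20, 0x4C, 0xFF]   # made-up values, not real ROM data
print([melody_note(b) for b in example_bytes])
print([ring_note(b) for b in example_bytes])
```

Whatever the ROM byte is, the melody value stays in the 0-15 range, which is what keeps the line down in bass territory.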

At the end of a pattern cycle the upper byte of the songdata reads are increased by one, so $f704 becomes $f804 and $a340 becomes $a440 to give a new set of notes.   Because this was for a video I don't do any checking for wrapping round to the start of memory, so eventually you'll end up just with the same low note for a while as it cycles through (what is probably) empty bytes.
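
Here's a quick Python sketch of that address stepping, including the wrap-around that the video version doesn't check for.  The helper names are made up for the example.

```python
def next_base(addr):
    """Bump the high byte of a 16-bit address by one, wrapping at $ff."""
    hi = ((addr >> 8) + 1) & 0xFF
    return (hi << 8) | (addr & 0xFF)


def bases_after(start, cycles):
    """List the read addresses used over a number of pattern cycles."""
    out = [start]
    for _ in range(cycles):
        out.append(next_base(out[-1]))
    return out


print([f"${a:04x}" for a in bases_after(0xF704, 9)])
```

Starting from $f704 the ninth bump wraps the high byte round to $00, which is where the melody stops reading from the Kernal.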

The earlier driver resets the machine after a few cycles so never reaches a blank space in memory.

You could also use the program itself as source data but as you continue coding the melodies will keep changing.  Best to leave that bit until the end.

Drums

The drum setup isn't particularly efficient and I suspect I was going to work on it some more in the old player at some point.  Anyway, it cycles through a loop of BIT values looking for a match in the drum pattern data.  These being:

* $40 - Bass Drum
* $08 - Snare
* $02 - Hi-hat
* $01 - Rest (or indeed anything else, again I think this was going to be extended originally)

If it finds a match relevant values in a few small tables are set for Waveform, Starting Pitch, Pitch Sweep (signed) and the amount of frames to wait before the note-off is played.  This doesn't use the modern way of producing drums by cycling through an instrument table to set the waveform and pitch values per frame.  That is the setup I used in In A Loop a few years back.
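
Here's the matching idea as a Python sketch, using the beat table from the source listing at the end of this post.  The `enabled` parameter is a made-up stand-in for the way the player drops some of the checks over time.  The real code walks a table by index rather than taking a list.

```python
DRUM_BITS = {0x40: "bass drum", 0x08: "snare", 0x02: "hi-hat"}

# Beat table from the source listing at the end of this post.
BEAT = [0x40, 0x01, 0x02, 0x01, 0x08, 0x01, 0x02, 0x01,
        0x40, 0x01, 0x02, 0x01, 0x08, 0x01, 0x02, 0x40]


def drum_for(step_value, enabled=(0x40, 0x08, 0x02)):
    """Return the drum triggered by a pattern value, or 'rest' if none match."""
    for bit in enabled:
        if step_value & bit:
            return DRUM_BITS[bit]
    return "rest"


print([drum_for(value) for value in BEAT])
print([drum_for(value, enabled=(0x40,)) for value in BEAT[:8]])
```

Every value in this table has a single bit set, so the order the bits are checked in doesn't change the result.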


The pattern data is stored with the demo, it could use ROM data but finding a coherent beat is actually a lot more difficult than you'd imagine compared to looping melodies.  I could write a program to check for exact matches in the ROM I guess.  I've done something similar trying to cram all of State of the Art's visuals into a 4kb demo.

Changes between the old player and new one:

As I mentioned I didn't think the old player had enough 'bite' to the sound, there's a sort of aggressive overtone to a 303 that is difficult to do with just one channel of the SID.

Originally I was using the second channel as a fake delay for the melody.  This was done by reading 3 steps behind the main melody's pattern, with the second channel playing at a lower volume than the first.   I switched this over to a triangle waveform with sync and ring-mod enabled, which follows its own note sequence.  This gives some extra harmonics to the sound when above the melody pitch, but also affects the timbre of the main note when below it.

The other thing I did was change how the filter sweep works.  In the original player the sweep starts at one point and drops down the same amount for every note.   I made two changes.  Firstly the sweep speed has different values per note, by reading a value from the ROM decided by the position of the raster beam.  So it's quite varied and will actually be different every time the driver is run.   Secondly, every cycle through the pattern the filter start value is reset to a variable value between 0 and 127, again using a similar setup to the sweep.

For playback I went for the 8580 version of the SID above the older 6581.  The 6581 has a much 'dirtier' sound with filter enabled because the filter adds some distortion to the affected channels, but I found a nice range with the 8580 filter that I preferred this time.

Things that could be added:


Apart from a general refactor and optimization pass (starting with that init code) some possible ideas are:

* Adding Accent and Pitch slide functions to really give that 'acid line' sound.
* Change drums to use instrument tables for more punchy control.
* Autostart when loading.

The old player has a proper BASIC header as it was still way under 303 bytes.  But there's also a workaround that skips a BASIC header and uses only 4 bytes, but means your code starts just before the tape buffer area.  The only problem with this one is it's right next to the default screen location, so you either have to move your screen or hide it.  I used this in mus1k by setting the screen colour to the same as the text. :)

On the subject, if you have really small code (in the 20 bytes range) you can set the start address to $7c and have it autostart for you with no header at all.   I used this in Glitchshifter.

303 source for C64:

    ; 303 style by 4mat 2020
    ; Type sys 49152 to play.
   
    ; Written using Dasm assembler. Should be mostly compatible with other assemblers except:   
    ; + The processor line is probably Dasm only unless your assembler handles multiple processors.
    ; + org might be replaced by * or something else.
    ; + Some assemblers use !byte or something else instead of .byte, see your docs.
   
    ; Memory map at bottom of source.


    ; Set start address to $c000. (49152)   
    org $c000
    processor 6502
   
    ; Init Intro.
    ; Disable interrupts while we do our setup.

   
    sei

    ; Copy song data to zero page.
    ; This saves a bit of memory as some assembler commands will only use 2 bytes instead
    ; of 3 if they were in 'normal' memory.

   
    ldy #$00
    ldx #$30
argha
    tya
    sta $60,x
    lda datas-1,x
    sta $0f,x
    dex
    bne argha

    ; Setup soundchip. 
    ; By default I've set all channels to use triangle and sync/ring-mod for the first cycle.


setsound
    lda #$17
    sta 54276,y
    lda #$08
    sta 54275,y
    lda #$00
    sta 54277,y
    lda $1d,x
    sta 54278,y
    lda $10,x
    sta $6b,x
    lda $12,x
    sta 54295,x
    inx
    clc
    tya
    adc #$07
    tay
    cpy #$15
    bne setsound

    ; Point hardware interrupt at the music player section.
   
    lda #<musicloop
    sta $0314
    lda #>musicloop
    sta $0315

    ; Enable interrupts so we're good to start.
    cli

    ; Main loop. 
    ; Only the visuals are updated here. This part runs as fast as the spare CPU
    ; time will allow.

   
loop
    ; Take the current filter cut-off value and add 65 to it (so it's in the Petscii area
    ; of the character set)

    lda $62
    adc #$41
    ; Use the random number generator to read the oscillator value from channel 3 and use
    ; this as the X offset position.  Because channel 3 is where the drums are, for snares
    ; and hihats this will be using the noise waveform.

    ldx $d41b
    ; Store the petscii char into the screen area.  This is repeated 4 times to fill the 4
    ; pages of the screen.  I used a slight offset from the usual $0400 start position to
    ; move the visuals around a bit.

    sta $03e8,x
    sta $04e8,x
    sta $05e8,x
    sta $06e8,x
    jmp loop

    ; Music player.
musicloop
    ; Decrease tempo tick, if it goes minus (#$ff) we need a new note, otherwise we can jump
    ; to the instrument update part.

    dec $60
    bmi musicloop2
    bpl updatedrums
    ; Get new note in the pattern.
musicloop2
    ; Reset tempo tick to full. (in our case #$06 which is about 125 bpm)
    lda #$06
    sta $60

    ; Make new value to add to filter cut-off for this note.
    ; Using the current raster beam value ($d012) as an offset, read a value from the
    ; Kernal ROM and use only the lower 4-bit value.  Then add the current number of
    ; loop cycles ($6b) to make each value slightly different.

    ldy $d012
    lda $e144,y
    and #$0f
    adc $6b
    sta filtsweep+$01
   
    ; Get new note values, the current pattern position stored in $61.   
    ldy $61
   
    ; First the main melody, using the lower 4-bits from some of the Kernal ROM to only
    ; use bass values between 0 (silence) and 15.  The extra 'eor #$03' is more for
    ; personal taste with the different melodies.

chan1
    lda $f704,y
    and #$0f
    eor #$03
    sta 54273

    ; Now the sync/ring-mod channel, using data from the BASIC ROM, but only taking
    ; the low 6-bits.

chan2
    lda $a340,y
    and #$3f   
    sta 54273+7

    ; Set new Drum
    ; This checks against 4 possible BIT values to see if a drum needs to be played.
    ; ($40 = Bass Drum, $08 = Snare , $02 = Hihat , $01 = Rest)  
 
noresetsq
    ldx #$00
drumcheck
    ; Get next position value from drum rhythm table.
    lda $30,y
    ; Check if value matches the current BIT value indexed.
    and $20,x
    beq nobit
    ; If it does this means we have a new drum to play.
    ; Firstly set the waveform from that table.  Also store it at $66 so we can apply
    ; note off later.

    lda $24,x
    sta 54276+14
    sta $66
    ; Set the drum pitch, this is placed directly in a variable as we do work on this
    ; value when adding the pitch sweep.

    lda $27,x
    sta $67
    ; Set the drum pitch sweep.
    lda $2a,x
    sta $68
    ; Finally set the timer value before applying the note-off on the drum.
    lda $2d,x
    sta $69
nobit
    dex
    bpl drumcheck

    ; Increase the pattern position by 1 and AND the value by #$0f so it's always
    ; between 00-15.

    iny
    tya
    and #$0f
    sta $61
    ; Check if the value is 00, if not we don't need to decrease the amount of pattern
    ; cycles yet.

    bne noupdate

    ; Decrease the amount of pattern cycles and check if this is #$ff yet.  If not we
    ; don't need to create a new pattern yet.

    dec $6b
    bpl noupdate   

    ; When the pattern cycles are complete it's time to create a new pattern.
   
    ; Firstly we do some self-modifying code to the initial value of the drum check loop.
    ; This means we don't always get the same drum beat by dropping out checks for the
    ; snare and hi-hats.

    dec noresetsq+$01
    lda noresetsq+$01
    and #$03
    sta noresetsq+$01
    ; We also use this value to change the melody line's waveform, from the table at $15-$18.
    tax
    lda $15,x
    sta 54276
    ; We also change the amount of pattern cycles for the next pattern from the
    ; table at $19-$1c.

    lda $19,x
    sta $6b

    ; Now we do some more self-modifying code to change the memory position to read the
    ; note data from.  This increases the high memory value by one for the main melody and
    ; the ring-mod channel.  This does mean that eventually both values will reach past the
    ; end of memory and reset back to $0000.  As mentioned in the docs, as this was for a video
    ; I didn't add any checking for this occurrence, however the older version of the player resets
    ; the machine to avoid this happening.

    inc chan1+$02
    inc chan2+$02

    ; This resets the starting value of the filter cut-off.  It works very similarly to
    ; the filter sweep setup though we start with a 7-bit value, halve it and then add
    ; the current pattern position value to it.   

noupdate
    ldy $d012
    adc $e948,y
    and #$7f
    lsr
    adc $61
    sta $62   
   
    ; This is where the player falls through to on every frame.  This updates the filter and
    ; drums and then sends an acknowledgement to the timer system that this routine has finished.
       
    ; Check the tempo tick against the current drum's note-off value.  If it's the same switch off
    ; the waveform's gate (bit 0) so the release part of the ADSR gets activated.
updatedrums
    lda $60
    cmp $69
    bne nodrumgate
    dec $66
    lda $66
    sta 54276+14
nodrumgate
    ; Add the drum's pitch sweep value to the current pitch and store it in channel 3's high pitch
    ; register.  Note that we don't do the math directly on the register because the SID's
    ; sound registers are write-only.

    clc
    lda $67
    adc $68
    sta $67
    sta 54273+14
   
    ; Set current filter cut-off value to the cut-off register.
    lda $62
    sta 54294
    ; Decrease the cut-off value by the current filter sweep value. (note this was set in
    ; self-modifying code earlier)  If the value is already below zero don't store it in the
    ; variable so it only stays at this value.

filtsweep
    sbc #$00
    bmi filtnot
    sta $62

    ; End of music driver, call to IO system that we've ended our routine.   
    ; As we have the full default system enabled we need to use $ea31 rather than the
    ; less cpu-heavy $ea81

filtnot
    jmp $ea31

datas
    .byte $05,$07                  
siddata   
    .byte $f3,$1f,$00            
basswaves
    .byte $11,$21,$41,$21      
length   
    .byte $01,$03,$03,$03      
vols
    .byte $a9,$3c,$79               
btt
    .byte $40,$02,$08,$00            
wav
    .byte $41,$81,$81                
not
    .byte $0a,$ff,$20                
plu
    .byte $ff,$fe,$fc                
del
    .byte $03,$02,$01                
beat
    .byte $40,$01,$02,$01,$08,$01,$02,$01,$40,$01,$02,$01,$08,$01,$02,$40

; Memory map:

; $10 = Initial pattern cycles value for first pattern. (datas)
; $12 = SID Filter values for filter type/volume and resonance/channel allocation. (siddata)
; $15 = Melody line waveform table. (basswaves)
; $19 = Melody loop cycle length table. (length)
; $1d = SID Channel sustain and release values. (Attack/Decay are always zero) (vols)
; $20 = Drum BIT value check table. (btt)
; $24 = Drum Waveform table. (silence isn't stored in drum values) (wav)
; $27 = Drum starting Pitch value table. (not)
; $2a = Drum Pitch addition signed value table. (plu)
; $2d = Drum ticks before note off value table. (del)
; $30 = Drum beat pattern table. (beat)

; $60 = Current note timing tick.
; $61 = Current note position in pattern.
; $62 = Filter cut-off value.
; $66 = Drum Waveform setting for use with note-off.
; $67 = Current drum pitch.
; $68 = Pitch value to add to drum pitch every frame. (signed value)
; $69 = Drum note-off timer.
; $6b = number of cycles to loop the current pattern

Monday, 16 December 2019

Microtan65 plasma fx


I found some old test fx I made for the Microtan 65 computer and posted a quick video on Twitter.  Someone asked for the source so here it is.   This was compiled with a DOS assembler called AS6502 for some reason, so some of the formatting is odd.  (eg: dfb rather than the usual .byte)  I haven't looked at the Microtan machine since 2016 but I think I ported this from my older 'Plasma 190 bytes' version on the c64.  That would explain why I'm generating the sinetable when there's plenty of room.


Sunday, 6 May 2018

How to convert SID files to FM (with FMX)

A couple of people have asked me how to play existing SID files in FMX. (my FM cart music driver)  I've written a tutorial below to get basic conversions going, it looks like a lot of steps but once you've set it up once you hardly need to change it afterwards.  Maybe in the future I'll dig into the advanced features which can help with better conversions and making instruments, though in the meantime I suggest reading the docs that come with it.

While you need to download an assembler to use it, FMX is designed so you only need to change tables to get results out of it.  You don't need to do any coding!

Tools:

You will need:
  1. The FMX driver 
  2. A copy of the DASM assembler.  Any version should do, here's a link to a Windows version from Lasse's page.
  3. A SID player so we can get some info.  Personally I use VSID that comes with the VICE emulator, but most of them have an info page in there somewhere.  Another example is Sidplay/w for Windows.
  4. Some way to hear the music (obviously), if you don't have a cart for your c64 the Vice emulator has it built-in.  (Settings / Cartridge/IO settings/SFX Sound Expander settings and tick the enable box)

Step 1 - Installation:

OK let's set up the basics:
  1. Unpack FMX (with its sub-folders) into a folder on your computer. 
  2. Move into the src sub-folder and unpack DASM into here.  We'll be doing our main work in this src folder from now on.
  3. To test DASM is working, run fmx.bat to build the player.  A file called fmx.prg should appear which you can test on your machine, it will play a small test loop on the fm chip as in this screenshot:

Step 2 : Setting up FMX to play single SID songs:

Now we want to set up FMX so it's ready to play a single SID tune with an optimal setup.
  1. Move into the configs sub-folder and then into the one called config-1song_only_fm, select & copy all the files in here.
  2. Move back into the src folder and paste the files we just copied here.  It'll overwrite some existing files but this is normal.
  3. Now run fmx.bat again and the built fmx.prg file will play a different tune and the screen text should have changed:

Step 3 : Getting info on our chosen song

Now those first two steps are out of the way we don't need to revisit them again in future.  

Unless you have some sid tunes already the best place to find them is in the High Voltage SIDS collection.   If you're looking for particular tracks there's a search engine for it at the Exotica game music site.

So, we have some songs and now we need to load them into the Sidplayer to find some info on them.  What we're looking for is the player's info window.  I've put up screenshots below showing the info we need in Sidplay/w and VSID.   VSID has it on the main display whereas Sidplay/w shows it when you go to File/Properties.

 Sidplay/w's info window with the info we need in red
VSID's main window showing the info we need in red

  • The Player Type has to be a 'vblank' type to work with FMX.  This is listed variously as VICII, VBI or simply VBLANK. 
  • The Load Address is where in memory the music is stored.  By default FMX has the space between $1000-$6fff or $a100-$cfff free, so make sure the song fits within one of those two areas.
  • The Init Address is the piece of code called when the song starts to play.   This usually sets up the music player and resets it to the beginning.
  • The Play Address is called every time the screen refreshes to keep the music playing.
Optionally we can also jot down which sub-song we want to play if there's more than one track in there.  As we're only playing single sid songs we want to avoid any with _2SID or _3SID on the end of the filename.  (that's for another article, or read the docs to see how it's done)

Step 4 : Playing our song (the last step!)

Yes, now it's time to try playing back the SID file on your FM chip.
  1. Copy your SID file into the src folder where we've been working, I'm using the track 'Modern Loves Classics Intro' from the screenshots in this example.
  2. Now we need to open up the file fmx-song.asm to put in the info we've collected.  Any text editor will do, even notepad.  This may look daunting if you don't code but don't worry, we're only changing some names and numbers around.
Firstly we'll change it to load our song instead of the default one.   Scroll down until you find the org $1000-2 line as in this screenshot:

Change the org address to the Load address we jotted down earlier, and put -$7e on the end to skip the SID file header.  Change the filename after the .incbin label to the name of your sid file.   In my case the loading address ($1000) was the same as the demo track but you may have something different.  (eg: if your load address is $4000 change it to org $4000-$7e)





Now find the player_patchsid label as in this screenshot, we want to change the first number in this table to the first two digits of the load address.  (so if the load address is $4000 change it to $40)
We're in the home stretch now, just two more things to change : the Init and Play addresses.  Find the player_inithi label as in this screenshot, the play label is just below it:
As you can see the Init and Play are split into two tables each, the first two digits of the address go in the top table and the last two go in the one below.  So, if your init address is $4000, you put $40 in the top one and $00 in the bottom one. (likewise with play, if it's $4003 you put $40 in the top one and $03 in the bottom)  We can ignore all the extra columns in these tables as they're for multi-sid songs.

Finally, if you want to play a different sub-tune to the default, look for the player_song label as in the screenshot below, and change the first value to your new song number.  FMX counts songs from zero, while most Sidplayer displays start from one, so you'll need to subtract one from your value. 
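If you'd rather not copy the numbers out of a Sidplayer window by hand, the same values can be read from the SID file's header.  The following is a minimal Python sketch, not part of FMX: the function name `fmx_values` and the dictionary keys are made up for illustration, and it assumes a standard PSID/RSID header (big-endian 16-bit fields, with the load address stored in the first two data bytes when the header field is zero).  It doesn't check the player type, so you still need to confirm the tune is a vblank one.

```python
import struct

def fmx_values(sid_bytes, sidplayer_song=None):
    """Collect the numbers the steps above ask for from a PSID/RSID header.

    Hypothetical helper for illustration; the key names mirror the labels
    mentioned in the text but are not read from or written to fmx-song.asm.
    """
    if sid_bytes[0:4] not in (b"PSID", b"RSID"):
        raise ValueError("not a SID file")
    # Seven big-endian 16-bit words follow the 4-byte magic.
    version, data_offset, load, init, play, songs, start_song = struct.unpack(
        ">7H", sid_bytes[4:18])
    skip = data_offset
    if load == 0:
        # Load address lives in the first two data bytes, low byte first,
        # so those two bytes have to be skipped as well.
        load = sid_bytes[data_offset] | (sid_bytes[data_offset + 1] << 8)
        skip += 2
    if init == 0:
        init = load  # a zero init address means "same as the load address"
    song = start_song if sidplayer_song is None else sidplayer_song
    return {
        "org": "org ${:04x}-${:02x}".format(load, skip),
        "player_patchsid": load >> 8,      # first two hex digits of the load address
        "init_hi": init >> 8, "init_lo": init & 0xFF,
        "play_hi": play >> 8, "play_lo": play & 0xFF,
        "player_song": song - 1,           # FMX counts songs from zero
    }
```

For a typical version 2 file the header is $7c bytes long and the load address is stored in the data, which gives the -$7e used above.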





Now run fmx.bat and it will hopefully build a new fmx.prg with your song in it.  (I've already changed the text in this screenshot so ignore that)

Extra features, changing text etc.

I'll leave the more advanced options (like changing the default instruments) for another article, but one other thing you can do easily is have the song also play on the SID chip at the same time.  This can create some interesting effects.  To do this look for the player_output label in fmx-song.asm and change the hi value to point at $d4 and the low value to $00 as in this screenshot:

The final thing we can do now is tidy up the display. Firstly you can change the displayed text by finding the label songname and changing the text lines below. (the space between the quotes is how much text can be displayed, it'll automatically crop any remainder)



And we can switch from the default Debug mode to one where we can change the colours of the screen.   This involves loading up two other .asm files, firstly fmx-build.asm to change the line DEBUGMODE = 1 to DEBUGMODE = 0 , and then fmx-globals.asm to change the colours as in this screenshot:

So now if we run fmx.bat again the built executable should look a bit nicer:

Saturday, 10 March 2018

$11 times the charm




My entry for the competition. (short version, before I realised there was a 2 minute minimum limit)
I don't write a lot of SID music these days, the only times I do it is when I've got something new on the code side. (or at least new to me)  I can't really see myself writing SID music now just for the sake of writing it, I guess the motivation isn't there.  Weirdly my favourite release on the SID is Mini Melodies.  This is the least tech-driven thing I've done but it reminds me of my favourite type of SID tracks: the super basic players, pure noise drums, usually by people who only write a couple of tracks and move onto something else.  I love this kind of stuff above all else on there.

Currently (as I type) there's a competition going on to write music using only the triangle wave and filter.  This is turning into quite a tech-heavy competition and you can hear the entries at this link.   (I've dotted some videos through the article)

LMan - "Mellowhouse"
So I thought it'd be nice to contribute something, but if I just load up GoatTracker and start making a track I know I won't come up with anything.  Plus as it's on CSDB you know the standard is going to be super high and my instrument design skills aren't on any kind of parity with those other guys. :)   So then, what to do.  I need some tech idea within the limits of the competition, but without having to suddenly learn a great deal of sid craft in a couple of weeks.

A 4th channel

The 4th channel idea actually came out of writing the previous article on here.   While writing up Patchwork's fx list I was thinking that the filter allocation ($d417) is an on/off state on each channel.  On/Off states are good in sound because they can be used as a basic oscillator for audio.  The question then becomes, does allocating a channel to the filter create enough of an audible change that it's useful?

As with playing samples, sound generation is a cpu-intensive process because you have to send rapid updates to a sound buffer.  To test out the extra channel I first made an infinite loop that would rapidly toggle the channel allocation between full and zero without waiting for a frame to pass. (so it's running as fast as the CPU can manage) This produced a high-pitched whirr, which showed there was maybe something useful.  By adding some delay into the loop the pitch lowered, as the toggle is being updated at a slower rate.   So it was kind of like generating a square wave sample but playing it through the filter.

My next task was to get the sound under control by setting up a timer system.  This is nearly the same as running it in the infinite loop but the update speed of the loop can be controlled from a timer, and hence its pitch can be controlled.  You're also able to run the loop alongside a normal frame-based music engine so it can be controlled from a constant beat source, like any other music channel.



Progress of the player. The border colours show which bits in $d417 are being set.
So now we have the timer system in place, we need a friendly way to control it while writing our music.   I opted to use an existing music editor (because there's no point re-inventing the wheel on the SID channel side) and look for a way to put some control data in that I could read from my code.

Well most music editors only use 3 channels so that's not good, but what about the ones that can use multiple SIDs?  I opted to use GoatTracker Stereo as the editor, with the '4th' channel in there being used to control our generated sound.   I wrote some code to take the exported song and patch it at runtime, so all writes to the sidchips are sent to a different part of memory instead.  From there I can write back the first 3 channels to the SID and use the information being sent to the 4th channel to control our generated channel instead. Any filter writes in GoatTracker's driver are ignored too as we're controlling the filter directly from our code.  

The first three channels write the SID data directly, but the 4th one reads the currently played note number from inside GoatTracker's driver.   I have a table of timer speeds matched to the frequencies that would be sent to the SID pitch registers, and set our timer to the relevant one.  I could get 4 octaves of useful pitch directly, and then this could be doubled by stepping through the $d417 changes at double the speed.  (though in my song this is only used in the fast bass sections)  I also do a check to see if the note is a 'rest' command, and disable the 4th channel in that case.
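To give a rough idea of what a timer table like this could contain, here's a small Python sketch.  It isn't the table from the player: it assumes a PAL machine clock of 985248 Hz, equal temperament with A4 at 440 Hz, MIDI-style note numbers (GoatTracker's own numbering is different), and a plain on/off toggle, so two timer ticks per cycle.

```python
PAL_CLOCK_HZ = 985248  # assumed PAL C64 CPU clock; NTSC machines run faster

def timer_for_note(midi_note, steps_per_cycle=2):
    """CPU cycles between $d417 updates so the toggling lands on the note's pitch."""
    freq = 440.0 * 2 ** ((midi_note - 69) / 12)
    return round(PAL_CLOCK_HZ / (freq * steps_per_cycle))

# Four octaves of timer values, C2 up to B5 in this (assumed) numbering.
timer_table = [timer_for_note(n) for n in range(36, 84)]
```

Halving any entry is the same as stepping through the changes at double speed, so it plays an octave up.  All the values here fit in a 16-bit timer.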

Working this way there are going to be some compromises on the composing side though.  Without rewriting a bunch of the editor we won't be able to hear our music in exactly the same way it sounds on the machine.  But we'll be able to write the music and use dummy instruments for the 4th channel to create the structure.   That way only the sound design of the 4th channel needs to be done in code, that seems a good enough trade-off.

Wiklund - "Club Eleven"

More advanced sounds

So now we have a 1-bit oscillator we can control.  But thinking about it the filter allocation isn't really a 1-bit value, it's a 3-bit value: a bit-wise toggle for each SID channel:
  • Channel 1 : $01 toggle
  • Channel 2 : $02 toggle
  • Channel 3 : $04 toggle
Do we get any differences by toggling with different values?  The answer is: yes!   Combined channels = larger volume.  This means we have some kind of stepped range for our generated sound.

The other thing with a generated sound is you can shape the timbre of it. If you've used the Gameboy you'll remember the 3rd pitch channel allowed you to make custom 4-bit waveforms with multiple steps.  The hardware then cycles through this waveform on a loop with the different permutations in the waveform creating an individual timbre to the instrument.   So we can do something similar here, I opted for an 8 step waveform which gives enough variety in the sounds.   By dropping 'on' toggles at fixed steps you can also add harmonics to the sound in a similar way to a drawbar organ.   Some examples:
  • 01,00,00,00,00,00,00,00 - Straight tone
  • 01,00,00,00,01,00,00,00 - Octaver effect
  • 01,00,00,00,01,00,01,00 - Double octaver effect
The symmetry between the on and off states will change the 'duty' of the sound, if you've used a NES you'll know this kind of setup:
  • 01,00,00,00,00,00,00,00 - 12.5%
  • 01,01,00,00,00,00,00,00 - 25%
  • 01,01,01,01,00,00,00,00 - 50% (full 'square wave' sound)
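As a quick illustration of the step tables above, this Python sketch works out the duty of a waveform and the byte each step would put into $d417.  The function names are made up for the example; the only hardware facts it relies on are that the low three bits of $d417 route voices 1-3 through the filter and the top four bits hold the resonance.

```python
def duty_percent(steps):
    """Share of the waveform's steps that have any routing bit switched on."""
    return 100.0 * sum(1 for s in steps if s) / len(steps)

def d417_values(steps, resonance=0):
    """Byte to write to $d417 for each step: resonance nibble plus routing bits."""
    return [((resonance & 0x0F) << 4) | (s & 0x07) for s in steps]

straight = [0x01, 0, 0, 0, 0, 0, 0, 0]
square = [0x01, 0x01, 0x01, 0x01, 0, 0, 0, 0]
```

Using $07 instead of $01 in a step toggles all three channels at once, which is where the louder steps come from.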

JT & LMan - "Skypeople"
So now we can shape our sounds enough that they'll sound different to each other.  But hang on, as we're using the filter to do this the 4th channel is also affected by the filter's settings.  Is there anything more we can add?  

Well for a start the $d417 register has two jobs: channel allocation and setting the resonance.  So we can have different resonance values in each instrument, or change it per cycle.   We also have the cutoff frequency which we can apply a sweep value to, and finally the type of filter being used. (low, band or high-pass, or combinations of them)   So yes, there is more!

But there's also one more software thing we can do:  switch the waveform over time.   I added the option to cycle through a set of waveforms with a time delay while playing a note, kind of like using a wavetable in a traditional sid player.  This let me produce the echo effect and octaver bass sounds used in the song.

Generated sound and the sid channels

So we now have 4 channels playing, but if we're setting the sid channels to use the filter too what does it do to the SID channels?   Well it kind of gives the set channels a ring-modulation effect.  To quote the very knowledgeable lft on my original upload:


"The reason for the apparent ring modulation is that the filter is inverting. So when flipping the filter routing at some frequency, that's similar to applying ring modulation at that frequency to the waveform."
The timbre of the 4th channel is also affected by the SID waveforms used, I haven't dug into how useful this is but amusingly using waveforms other than $11 gives better resolution at least. :)

A funny thing is this modulation effect would have been super useful when making the ST musicdisk.  It's a nearly spot-on simulation of the Buzz bass sound and would have freed up a channel on the SID in the process.

Sunday, 25 February 2018

How Patchwork (patch)works



As you may know, the Commodore 64 has 64kb of RAM.   On boot-up, however, that 64kb isn't all accessible from the machine directly. Certain blocks of the RAM space are patched out with various ROMs and the I/O space for the custom chips. (The VIC-II graphics chip, timing/port control and our beloved SID chip)

Here's the RAM setup when your Commodore 64 boots up:

$0000-$9fff - Mostly free (some kernal things are patched into zeropage and there's also the stack)
$a000-$bfff - Basic ROM
$c000-$cfff - Free
$d000-$dfff - VIC-II,SID,I/O,Character Set address space
$e000-$ffff - Kernal OS ROM


This isn't a fixed setup, however.  By changing the bit values in the $01 register you can swap out the ROMs and the chip address space however you like, either on a permanent or temporary basis.   This mostly applies to assembly language programs, however some BASIC games would make copies of the Character Set into RAM using the above method.   (if you ever saw a "Please Wait" message in a BASIC game and then a minute of nothing it was probably doing that)

LDA #$35 , STA $01

A large number of standalone video games swap out the Basic and Kernal ROMs because they don't use them.   This is achieved by setting the register $01 to #$35 and gives you 60kb accessible at any time.   While the CPU can see all the RAM without a problem, the VIC-II chip can only see 16kb blocks of RAM, so from a graphical standpoint you'll usually set one area of RAM as 'the graphics ram' and copy anything else there as needed.

Patchwork relies on the fact you can also swap out the SID chip address space into RAM.  This needs a bit of background first to explain.

How music players work:

A music player is usually split into two parts, the music driver and the music data.  The music driver will have a specific set of addresses the game/demo can call to initialize, play and stop the music.  For 8-bit machines these are usually 'per frame', meaning every time the raster beam passes over the screen the music driver is called to keep a constant tempo.  Your typical music driver when called will do some housekeeping (updating where it is in the song, changing instruments etc.) and then write to the set of registers where the soundchip is.  In our case the standard SID chip sits at the $d400-$d41c address space.  For custom machines with extra SIDs their addresses can be at any place in the $d400-$dfff range but that first SID is always at $d400.

LDA #$34 , STA $01

So, with that knowledge if we call the music player normally it'll write to the SID registers and you'll hear the notes for that frame of music.  But what if we swap out the $d000-$dfff register space to RAM first, by setting $01 to value #$34.   Now what happens?

Well the music player runs fine, but the register data gets written into RAM instead of to the SID chip so we don't hear it.  If we copy that data somewhere and then re-enable the SID chip (with LDA #$35 , STA $01 again) we still don't hear the music, because as far as the SID chip is concerned nothing new has been written to it.  Switching out the SID chip also doesn't stop the SID chip playing whatever is already in the registers, it just stops it being able to receive new data until it's enabled again.

If we then write the data we copied back to the SID chip we'll hear the new frame of music.  With this simple setup we can play a SID tune, but we have the option to manipulate the SID data first before it goes out to the chip.  You may have seen some of my other videos where I get the SID chip to pretend to be other systems. (like the Spectrum or NES)  This is using the above method, by manipulating the data we get from the music driver and then writing that manipulated data back to the SID in realtime.
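The capture-and-replay idea is easier to see with a toy model.  The Python sketch below is not Patchwork's code: `Machine`, `play_frame_patched` and the `transform` callback are names invented for the example, and the model only covers the one behaviour that matters here, namely that writes to $d400-$d418 reach the SID when the I/O area is switched in and land in the RAM underneath when it isn't.

```python
SID_BASE = 0xD400
SID_WRITE_REGS = 25  # $d400-$d418, the registers a music driver writes

class Machine:
    """Toy model: only tracks RAM, the SID's write registers and the $01 port."""
    def __init__(self):
        self.ram = bytearray(0x10000)
        self.sid = bytearray(SID_WRITE_REGS)
        self.port01 = 0x35  # I/O visible, Basic and Kernal switched out

    def write(self, addr, value):
        # I/O is mapped at $d000-$dfff when the low three bits of $01 are 5, 6 or 7.
        io_visible = (self.port01 & 0x07) in (5, 6, 7)
        if io_visible and SID_BASE <= addr < SID_BASE + SID_WRITE_REGS:
            self.sid[addr - SID_BASE] = value   # reaches the chip
        else:
            self.ram[addr] = value              # lands in the RAM underneath

def play_frame_patched(machine, driver, transform):
    machine.port01 = 0x34                       # LDA #$34 , STA $01
    driver(machine)                             # the driver's "SID" writes go to RAM
    captured = bytes(machine.ram[SID_BASE:SID_BASE + SID_WRITE_REGS])
    machine.port01 = 0x35                       # LDA #$35 , STA $01
    for offset, value in enumerate(transform(captured)):
        machine.write(SID_BASE + offset, value) # now the chip hears the frame
```

The `transform` step is where any manipulation would go; passing the bytes through unchanged just plays the tune as normal.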

What Patchwork does with the data:

Patchwork is a bit different in that it doesn't use that data in real-time, instead it stores it into an incremental buffer.  Because it's mostly a proof-of-concept I decided to store the data uncompressed, meaning every frame 25 bytes are written to memory.  (the registers after the volume control aren't used in normal music playback)  This means Patchwork has a limit of 27 seconds of music, but if I come back to this project in the future there are plenty of opportunities for improving that.
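To put the uncompressed format into numbers, here's the arithmetic as a tiny Python sketch.  The 25 bytes per frame comes from the text above; the 50 frames per second is an assumption (a PAL machine calling the driver once per frame), and the function name is just for the example.

```python
FRAME_BYTES = 25  # $d400-$d418, one byte per register

def buffer_bytes(seconds, frames_per_second=50):
    """Memory needed to store that many seconds of raw register frames."""
    return seconds * frames_per_second * FRAME_BYTES
```

Under those assumptions 27 seconds works out to 33,750 bytes, which is a big chunk of a 64kb machine.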

So, now we've captured some sid data into a buffer we don't need the original song to play it back.  If we send that buffered data to the SID chip every frame it'll play that instead.  

This method does have some drawbacks, however.  Because the SID chip has some timing inconsistencies with the way the ADSR envelopes and the waveform Gate works, sometimes the CPU cycle count in a music driver is the reason the music sounds how it does.  This is especially true of older game music drivers, most modern drivers have some form of "Hard-Restart" enabled which is a way to get SID chip playback pretty much 100% consistent on every frame.

You can read more on the subject in this thread, by people far more knowledgeable than myself.  One day I'll write up how the AY music player works, which had its own unique challenges. :)

Anyway, so for some game music drivers the time between SID register writes is a factor in the reliability of their playback.  This isn't a complaint on my part.  What this means (to me) is the way the driver has been developed and the sound of it have a symbiotic relationship, and are so heavily fused together that changes to one directly affect the other.

On Patchwork's end it doesn't try to emulate the playback of different music drivers, it just wants to write the register data back to the SID chip with one method.   There are a couple of ways to try and fix the 'problem' though they don't cover all cases:

  1. It's more consistent to write SID registers backwards.  I don't know the exact reasons for this but I assume hitting the ADSR registers first before the Waveform has more of a success rate.
  2. Putting a cycle delay between register writes.  For the SID player I've put in a few NOPs between writes, in Patchwork's own driver there's enough CPU manipulation between each section that the register writes are spaced out a bit.
  3. Manipulating the data before it gets to the SID.  I've only used this on Patchwork's driver side.  Because instruments can be made at any point in a recording the SID registers can be in any state.  This means that while you may have lined up one instrument to play at its beginning with the Gate enabled, other channels could be in a 'note-off' (gate off) state, or something else.   To get around this I artificially force the Gate back on for two frames at the start of each channel.   This helps with consistency when playing back because two frames is usually enough time for the ADSR to have reset to a constant state, even if it's not completed its cycle before being switched off again.  This third option can change the sound of the instrument for those first two frames, but as we're using 'cut up' parts of an existing song it's going to sound a bit different anyway.
Manipulating the SID data:

So now we have the data we've recorded, why not change it around a bit?  The pattern editor has a few different pitch effects, and the filter and channel assignments can be changed around.

Loops and speed changes:

Because each frame is basically a contained moment in time (like a sample) we don't have to play them in order.  This means Patchwork can play an instrument in reverse and loop forwards or backwards if required.  We can also speed up or slow down the playback, by delaying the frame increment with a timer (slower) or skipping through the data frames at a faster rate. (faster)  I think slower has some possibilities but in hindsight faster isn't all that useful.

What is different to a sample, however, is that the SID chip is expecting a note to go through a particular cycle of events.  The creator of the SID chip ( Robert Yannes ) went on to co-found Ensoniq, and the chip follows a traditional synthesis setup. That being:
  1. At the start of a note the Gate (equivalent to a note-on in midi) is enabled.
  2. While the Gate is active the Attack, Decay and Sustain values of the ADSR are cycled through by the SID chip.
  3. When the Gate is disabled (a note-off) the Release cycle of the ADSR starts. If the first three stages haven't been completed in time it stops (afaik) wherever it is and goes straight to the Release timing.
Played notes on the SID don't exist in a bubble, wherever they have got to in that cycle can affect how the next note plays.   This is the important bit, the next note that plays may not play at the right volume, or at all if the ADSR isn't in a ready state to begin again.   As far as I know this is why techniques like 'Hard-Restart' (linked earlier on) were developed.  It's also why I manipulate the first couple of frames of an instrument directly.  Even if the song data is playing backwards it should at least START with a solid note setup. :)

(as a sidenote, thanks to the SID chip I got a hands-on education on how synths work in the '80s)

Pitchshift and Octaver:

Making an Octaver fx on the C64 is quite simple.  Sound pitch is stored as a frequency value, which means an octave higher is achieved by doubling the value, and dropping an octave by halving it.  As it's an 8-bit machine and the pitch is a 16-bit value, you need to do some carry checking between the two bytes as you move through the octaves, but it's quite do-able and the loops are fast.
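Here's a Python sketch of the byte-level work involved, written to mirror what a shift/rotate pair would do on the 6502.  It's an illustration rather than Patchwork's routine, and the function names are made up.  The SID pitch is a 16-bit value split across a low and a high register, and because it's a frequency, shifting it left one bit (doubling) goes up an octave and shifting right (halving) goes down one.

```python
def octave_up(lo, hi):
    """Double a 16-bit pitch, like ASL on the low byte then ROL on the high byte.

    Returns (lo, hi, overflowed); overflowed means the note left the 16-bit range.
    """
    carry = lo >> 7                      # bit pushed out of the low byte
    lo = (lo << 1) & 0xFF
    overflowed = bool(hi >> 7)           # bit pushed out of the high byte
    hi = ((hi << 1) & 0xFF) | carry
    return lo, hi, overflowed

def octave_down(lo, hi):
    """Halve a 16-bit pitch, like LSR on the high byte then ROR on the low byte."""
    carry = hi & 1                       # bit falling from the high byte
    hi >>= 1
    lo = (lo >> 1) | (carry << 7)
    return lo, hi
```

The overflow flag is the case to watch for when going up: a note already in the top octave has nowhere to go.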

This is the first time I've tried making a Pitchshift on anything, and surprisingly it kinda works for the most part.  It's unfortunately the most CPU intensive effect in the driver though, right at the end of development I had to re-work the GUI code to get it working again on NTSC machines. 

One problem with making this effect is because the C64 pitch is stored as a frequency value, music drivers aren't locked into a particular tuning scale.  This means that, for example, middle C in one driver can have different values to another depending on what frequency table they use.  A pitchshifter fx has to go through two stages to get the effect we want:
  • Detect which note we are closest to already.
To do this the fastest method I found was to take the source pitch down to the lowest octave it can possibly reach.  At this point I can compare it against a baseline pitchtable (in this case the one at codebase64 made by mr.SID) to find the closest possible match.  During this process I store a few values:
  1. The pitch offset between the source note and known 'good' note so we can apply this back later to make it as close to the original frequency table as possible.
  2. How many octaves we had to move to get to the lowest octave so we can shift it back later.
  • Add/subtract the pitchshift value
Now we have an idea where the note is we can add/subtract the required amount of semi-tones to our frequency and take that value from the baseline pitchtable.  After which we can shift it back up to the octave it was in, or if we're subtracting drop it an octave lower.  We also re-apply the pitch offset we stored beforehand so it's shifted out a few cents to match the original note's pitch.
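The two stages above can be sketched in Python like this.  It's a simplified model and not the code in Patchwork: the baseline table here is calculated for a PAL clock with equal temperament rather than taken from the codebase64 table, the names are invented, and the stored offset is re-applied in the lowest octave before shifting back up.

```python
PAL_CLOCK_HZ = 985248  # assumed PAL clock; the table is a stand-in for a real pitch table

# SID frequency values for C0..B0: value = Hz * 2^24 / clock, A4 = 440 Hz.
BASE = [round(440.0 * 2 ** ((n - 57) / 12) * 16777216 / PAL_CLOCK_HZ)
        for n in range(12)]

def pitch_shift(value, semitones):
    """Shift a 16-bit SID frequency value by a number of semitones."""
    if value == 0:
        return 0
    # Stage 1: drop to the lowest octave, counting how far we moved.
    low, octaves = value, 0
    while low >= BASE[0] * 2:
        low >>= 1
        octaves += 1
    # Closest note in the baseline table (including the C above it),
    # plus how far the source was from that 'good' note.
    candidates = BASE + [BASE[0] * 2]
    idx = min(range(13), key=lambda i: abs(candidates[i] - low))
    offset = low - candidates[idx]
    # Stage 2: move by semitones, wrapping into the neighbouring octave if needed.
    target = idx + semitones
    octaves += target // 12
    shifted = BASE[target % 12] + offset
    if octaves >= 0:
        return min(shifted << octaves, 0xFFFF)
    return shifted >> -octaves
```

Halving on the way down throws away the low bits, so a shift of zero doesn't return exactly the value that went in.  That's one source of the small errors mentioned below.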

This probably isn't the most efficient way of doing it, and I thought there'd be more errors than I was getting but for the most part it seems to work ok.  Only wildly shifting notes/pitch tables seem to have a large problem.   The range is an octave above or below so maybe if I'd increased that the problem would be more apparent.

Both the octaver and pitchshift can be applied per channel so individual layers can be manipulated.  One thing I forgot to mention in the docs is that a noise waveform note isn't affected by any pitch changes, I decided to do this because it's mostly used for drums and you usually want those to be consistent in a song.

Sweeps and filters:

These effects are relatively simple, changes to the filter setup directly alter the values in $d417 and $d418, which are where the filter assign and type are stored.  I do a few basic checks to see if the filter is enabled at all and then clear the assigns so you don't hear silence.  (on the SID assigning a channel to the filter without having a filter enabled silences the channel)
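The filter check can be sketched as a couple of bit masks.  This is a Python illustration with made-up names, relying only on the register layout: bits 4-6 of $d418 select the filter type and bits 0-2 of $d417 are the channel assigns.

```python
FILTER_TYPE_MASK = 0x70  # $d418 bits 4-6: low-pass, band-pass, high-pass
ROUTING_MASK = 0x07      # $d417 bits 0-2: voices 1-3 sent through the filter

def safe_routing(d417, d418):
    """Clear the channel assigns when no filter type is switched on."""
    if (d418 & FILTER_TYPE_MASK) == 0:
        return d417 & ~ROUTING_MASK & 0xFF
    return d417
```

The resonance nibble in $d417 is left alone either way.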

The pitch sweeps take the first pitch value of the playing instrument as the base, and then apply the addition/subtraction value stored in the instrument each frame.  When the high value of the pitch register rolls over to zero I stop applying the pitch sweep and silence the channel.  

Likewise the filter sweep takes the first filter cut-off value and applies the value stored in the instrument, but at the start of each beat rather than per frame.  This allows you to get much more rapid filter sweep effects (which is mostly what this is used for) and also means that if you change the tempo the sweep time remains relatively consistent.

Future ideas:

As this was written mostly as a 'proof-of-concept' I stuck to a list of items I thought I could comfortably finish.   If I return to the project in future there are a few things I'd like to investigate:

Independent channel playback:

While there are pattern commands for leaving channels running, essentially Patchwork is playing one instrument at a time on all channels.  What would be a more flexible approach is having each channel able to run its own instrument.  This would mean instruments would need to have their own channel assigns (saying if it's a 1-3 channel instrument) but it also means song parts can play on any channels rather than being fixed to their original positions.   The benefit of the latter is when using SID features like ring mod or sync, as they are controlled by their playing position in the channel order.

Recording compression:

I'll admit when using Patchwork I haven't really come up against the 27 second recording limit yet.  However it would be quite possible to pack the data down from its 25 byte frame size.  One thing to take into consideration is that the sample stream isn't used in a linear fashion, instruments can play backwards and also when recording you can be in the middle of an existing stream.  Even though you've overwritten what's there any data left after you stop recording has to be maintained.

The easiest idea is to try a smaller frame format with a fixed size, while all registers can hold a full 8-bit value realistically not all of them do so.  If it's 10 bytes smaller that adds up and we don't have to do much data management.

Another idea is to split out each register as an RLE stream.  This may sound crazy with 25 registers but if you've done music data compression before you'll know that the individual elements of a stream don't rapidly change all that often.   There'll be some (like the waveform register) that won't work as well but notes, filter and ADSR should yield quite a saving.   The other benefit with this idea is that potentially the data management isn't as horrible as first imagined.  If you start recording in the middle of existing data you only need to change where the previous RLE streams have ended, and you can check forward where the next ones are and change their values as you record.
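As a rough picture of the per-register RLE idea, here's a Python sketch that packs a list of 25-byte frames into one run-length stream per register and unpacks it again.  It's only a forward-playing illustration under my own assumptions (value/length pairs, runs capped at 255 so a length fits in a byte); it doesn't tackle reverse playback or recording over the middle of existing data, which are the awkward parts mentioned above.

```python
FRAME_SIZE = 25

def rle_encode_registers(frames):
    """One stream per register, each a list of [value, run_length] pairs."""
    streams = [[] for _ in range(FRAME_SIZE)]
    for frame in frames:
        for reg, value in enumerate(frame):
            stream = streams[reg]
            if stream and stream[-1][0] == value and stream[-1][1] < 255:
                stream[-1][1] += 1          # same value again: extend the run
            else:
                stream.append([value, 1])   # value changed: start a new run
    return streams

def rle_decode_registers(streams):
    """Rebuild the original frames from the per-register streams."""
    columns = []
    for stream in streams:
        column = []
        for value, run in stream:
            column.extend([value] * run)
        columns.append(column)
    return [bytes(col[i] for col in columns) for i in range(len(columns[0]))]

def packed_size(streams):
    """Bytes used if every run is stored as a value byte plus a length byte."""
    return sum(2 * len(stream) for stream in streams)
```

Registers that barely change pack down to a handful of bytes, while one that changes every frame ends up twice its original size, which matches the expectation that some registers won't work as well.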

Midi sync:

The final thing, especially as I have liveplay options, is to put in some midi sync.   I had a very basic keyboard input working as a test but I'll need to implement a timed midi cache to allow it to work with sequencers.  They send a lot of data!


Thursday, 18 August 2016

Making "Ad Astra" (1/3)


This is a short behind the scenes about "Ad Astra", a demo for the Pico-8 virtual console that I worked on recently with ilkke.  Here's a video of it, or you can watch it online.



I got into Pico-8 coding after seeing some short-form gif tutorials that the author, Lexaloffle, uploaded to his Twitter account.  At the time I was on the tail end of a large commercial game project, so it was a nice distraction writing in this self-contained development environment after work.  Being an occasional demo coder the first things I worked on were replicating a few popular demoscene effects: the Plasma, Vertical Rasters and then Vector Bobs.

After doing a few of these independently of each other I thought about some way to release them, and that's when I decided to try making a little demo 'engine' to run these scenes in sequence, with maybe some sort of design to tie them together.

Demo Engines

In the modern demoscene a demo engine is a broad term.  In its basic form an engine runs behind the scenes, managing the sequence of demo parts, assets, handling their set up and the overall timing.  On the other end of the scale you have fully integrated tool suites like Werkkzeug, which was the production tool behind a lot of Farbrausch's content like the famous fr-08 from a few years back.

What I wanted to do was something that handled the back-end work and left me free to create scenes from the various effects I'd made easily.  This boiled down to three requirements:
  • Each part of the demo can be timed accurately.
  • Each part can be made from modular components so they can be re-used, and the 'engine' will handle initialisation and constructing the draw and update loops for it.
  • The content of scenes can be mostly constructed outside Pico-8 for easy iteration using a very basic script language, making it a "data-driven" demo rather than hard-coding each screen to fit.
So let's see how each point was tackled:

1) Timing

The first thing I needed to do was work out a way to do timing.  Not timing for effects, we have _draw() and _update() for that, but general timing of the demo flow.  Modern demos usually sync to the music as it's a constant and there is likely an audio system in place providing rock-solid timing for you.  In Pico-8 there are a couple of stat() variables you can use to find out which pattern a channel is playing (16-19), and what position it's got to in that pattern. (20-23)  There aren't any for song position so the way I did it was to check if we'd hit pattern position 0 and used a flag to check if we were still in the same pattern.  I decided that each scene in the demo could be X song patterns in length, so when the timing flag is set it decreases this counter by one and if it reaches zero we know it's time to move on to the next screen.
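The pattern counting can be modelled in a few lines.  This is a Python sketch of the logic rather than the Pico-8 Lua from the demo: `SceneTimer` is a made-up name, and `pattern_row` stands in for whatever one of the position stat() values returns on that frame.

```python
class SceneTimer:
    """Counts song patterns and moves to the next scene when a count runs out."""
    def __init__(self, scene_lengths):
        self.scene_lengths = scene_lengths   # patterns per scene
        self.scene = 0
        self.remaining = scene_lengths[0]
        self.at_row_zero = True              # the song starts on row 0

    def update(self, pattern_row):
        """Call once per frame; returns the index of the scene to show."""
        if pattern_row != 0:
            self.at_row_zero = False         # we've moved on inside the pattern
        elif not self.at_row_zero:
            self.at_row_zero = True          # first frame of a new pattern
            self.remaining -= 1
            if self.remaining == 0 and self.scene + 1 < len(self.scene_lengths):
                self.scene += 1
                self.remaining = self.scene_lengths[self.scene]
        return self.scene
```

The flag is needed because the music stays on the same row for several frames, so without it a single pattern start would be counted more than once.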

2) Modular Components

As with games, modern demos (as in PC) are usually coded out of modular components.  This is so scenes can be constructed from various re-usable parts to make a greater whole and saves on code duplication.

Let's take an example, here's a scroller screen I coded beforehand.  It has a starfield, vector bobs, the scroll and a mirror effect.  They're all hardcoded to that one screen.




Instead you split each element out into functions and give them as many input variables as you can, without impacting on performance too much.  This instantly makes each thing more flexible and gives the designer a lot more scope to construct scenes from these smaller parts to make a larger whole.  It also applies to effects you're only going to use once, like the landscape part in Ad Astra, because you can enhance these scenes with other modules you've already written.

So now each effect had three functions: Init, Draw and Update.  Actually some of them didn't have an Update because I did that work in the Draw function instead (naughty) but never mind.  That's another rule of making demos: if it seems to be faster, do it that way instead.

When a new scene starts the engine checks which modules are used and feeds their Init function with the relevant data for that scene.   It then builds array lists for the _draw() and _update() functions.  These are basically just loops checking if a value is true in the array and calling the relevant effect function at that point.

The z-order of things in a scene is decided by their order in the _draw() array.  This was good for backgrounds and overlays where needed.
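As a rough illustration of the Init / Update / Draw arrangement and the z-order rule above, here's a minimal Python sketch.  The names (Effect, start_scene, run_frame) are hypothetical, and it's simplified: it stores the effects themselves in the lists rather than true/false flags, and it records calls in a log instead of drawing anything.

```python
class Effect:
    """Hypothetical effect module with the three hooks described above."""

    def __init__(self, name, log):
        self.name = name
        self.log = log          # shared list, standing in for real drawing
        self.params = None

    def init(self, params):
        self.params = params    # per-scene data fed in when the scene starts

    def update(self):
        self.log.append("update " + self.name)

    def draw(self):
        self.log.append("draw " + self.name)


def start_scene(modules, scene):
    """Init each module the scene uses and build the draw/update lists.

    `scene` is a list of (module name, params) pairs in back-to-front order.
    """
    draw_list, update_list = [], []
    for name, params in scene:
        effect = modules[name]
        effect.init(params)
        draw_list.append(effect)
        update_list.append(effect)
    return draw_list, update_list


def run_frame(draw_list, update_list):
    """One frame: update everything, then draw in list order."""
    for effect in update_list:
        effect.update()
    for effect in draw_list:    # list order is the z-order
        effect.draw()
```

Because the draw list is walked front to back, putting a background first and an overlay last in the scene data is all that's needed to layer them.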

Effects aren't all truly modular, as some of them only have one instance of their variables available (the Bobs & Vectors for a start).  However, the important bits that get re-used often (like the Map module) are set up properly and can be used multiple times in a scene.

3) Building scenes outside Pico-8

This was mostly about making things more comfortable for the creative process.  The easier it is to make your demos, the faster you can iterate and improve them, and you're more likely to do that in a workflow that isn't awkward.

To that end I made a few tools to help out.  These were all made in the old Blitz Basic which, for me at least, is still the fastest way to make little graphics tools.  As only the two of us were ever going to use them they didn't need to look nice, they just had to work.

The main tool was both a script compiler and .p8 file builder.  You feed it a script in the right format, which it turns into an array the demo engine can read.  It then splices this with a .p8 file of the demo engine and exports a new compiled version.  Finally it boots Pico-8 and runs the file automatically for convenience.

Here's a quick example of the really simple script format.  The first two lines are the p8 filename to splice into (useful for checking earlier versions if something starts breaking) and the timing values for each scene in array format.  The 255 at the end is a special command to tell it to loop on the last scene.

demoengine_v46.p8
{1,10,6,5,4,4,1,3,7,7,9,5,1,255}

Then each part follows.  This is a list of the modules used, in the Z order we want them to be displayed.  Each module starts with a # followed by the name, with the data for that part afterwards.  For example:

#map
80 ; layerxmap
4 ; layerymap
0 ; layerxpos
47 ; layerypos
16 ; layerxsize
12 ; layerysize
0 ; layerxposreset
0 ; layeryposreset
0 ; layerxspeed
7 ; layeryspeed
0 ; layerxposadd
-1 ; layeryposadd
0 ; layerxmapadd
0 ; layerymapadd
0 ; layerxposmax
0 ; layeryposmax
#vector
0 ; skip mesh reset?
50 ; start zoom
148 ; xpos offset
48 ; ypos offset
0.5 ; x rotate start
0.0 ; y rotate start
0.0 ; z rotate start
0.000 ; x rotate move
-0.001 ; y rotate move
0.0003 ; z rotate move
700 ; target zoom
4 ; zoom speed
-1.1 ; xpos add
0.15 ; ypos add
1 ; mesh object

#end

There are a few extra commands to handle adding data (such as the 3d meshes) and using Pico-8 commands within a scene, like the palette controls.   This might seem odd, but because the engine is data-driven we want to avoid hard coding things per scene if possible.  So while changing the palette through an extra function rather than directly may lose a tiny amount of CPU time, it means it's available in the script with the other parts of the scene.  There were several times where I had to change masking colours for particular map layers, and being able to do that in one continuous process made things much easier.

Each scene ends with an #end command so the engine knows we're done.
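To show how little is needed to read this format, here's a minimal Python sketch of a parser.  The original tool was written in Blitz Basic and this isn't a port of it: parse_script and the shape of its output are assumptions made for illustration, and the extra commands for mesh data and palette control aren't handled.

```python
def parse_script(text):
    """Parse the script format into (p8 filename, timings, scenes).

    Each scene is a list of (module name, [values]) pairs in Z order.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    filename = lines[0]                                   # .p8 file to splice into
    timings = [int(v) for v in lines[1].strip("{}").split(",")]
    scenes, scene, values = [], [], None
    for ln in lines[2:]:
        if ln.startswith("#"):
            name = ln[1:].strip()
            if name == "end":                             # scene finished
                scenes.append(scene)
                scene, values = [], None
            else:                                         # new module in this scene
                values = []
                scene.append((name, values))
        elif values is not None:
            number = ln.split(";", 1)[0].strip()          # drop the "; comment" part
            values.append(float(number))
    return filename, timings, scenes
```

Blank lines are skipped and everything after the ; on a data line is treated as a comment, so the labels are only there for the person editing the script.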

This may not look like a much faster way of working, but being able to run this in Notepad++, tweak a setting here and there, cut/paste entire parts around to change the flow and so on certainly helped with the production workflow.

Another useful tool I made was a .PNG to _map & _gfx converter for the artwork.  This was like any old tile converter in that it optimized the tiles used and then reconstructed the .png in the new order as a _map & _gfx set that could be copy/pasted into the .p8 file.  So ilkke could send me his latest version of the artwork as a .PNG file, made in whatever he wanted to use:


And I then had these in the engine ready to view straight away.
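The core of a tile converter like this can be sketched in a few lines.  This is a minimal Python sketch that assumes "optimizing" means dropping duplicate tiles, which may not be everything the original tool did.  The build_tileset name is made up, reading the .PNG itself is left out (the input is just a 2D grid of colour indices), and the image size is assumed to be a multiple of the tile size.

```python
def build_tileset(pixels, tile=8):
    """Split a 2D grid of colour indices into unique tiles plus a map.

    Returns (tiles, tile_map): tiles is the list of unique tile x tile blocks,
    tile_map holds the index of the tile used at each map cell.
    """
    height, width = len(pixels), len(pixels[0])
    tiles, index_of, tile_map = [], {}, []
    for ty in range(0, height, tile):
        row = []
        for tx in range(0, width, tile):
            block = tuple(
                tuple(pixels[ty + y][tx:tx + tile]) for y in range(tile)
            )
            if block not in index_of:       # first time this tile is seen
                index_of[block] = len(tiles)
                tiles.append(block)
            row.append(index_of[block])
        tile_map.append(row)
    return tiles, tile_map
```

The tiles list would then be written out as the _gfx data and tile_map as the _map data.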

The final tool I made was a converter for .OBJ (Wavefront) files, turning them into an array for the vector routine.  The only easy way to store vertex colours was in materials, so we did that.  Here's a screenshot of ilkke's original Blender spaceship:


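For anyone curious what the .OBJ conversion involves, here's a minimal Python sketch.  It isn't the original Blitz Basic tool: obj_to_arrays, the material_colours lookup and the output layout are all assumptions, and it only handles the v, f and usemtl lines with positive indices.

```python
def obj_to_arrays(text, material_colours):
    """Turn Wavefront .OBJ text into (vertices, faces).

    faces are (colour, [vertex indices]) with 0-based indices; the colour
    comes from the most recent usemtl line via material_colours.
    """
    vertices, faces, colour = [], [], 0
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue
        if parts[0] == "v":                       # vertex position: v x y z
            vertices.append(tuple(float(p) for p in parts[1:4]))
        elif parts[0] == "usemtl":                # material switch sets the colour
            colour = material_colours.get(parts[1], 0)
        elif parts[0] == "f":
            # "f 1/2/3 4/5/6 ..." -> keep only the vertex index before the first "/"
            idx = [int(p.split("/")[0]) - 1 for p in parts[1:]]
            faces.append((colour, idx))
    return vertices, faces
```

Naming each material after a palette entry is one way to carry the colours across, since every face picks up whichever material was set last.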
Hindsight

In hindsight my use of an array meant I wasted a LOT of tokens, and I didn't thoroughly check that until we started running out of memory, though luckily that was only in the last days of production.  Next time I'll use a pure bitstream approach and also standardize the input format for parts so we don't have any fixed variable names.

In the next two posts I'll go through each part in turn with a bit more detail.