Sunday, 6 May 2018

How to convert SID files to FM (with FMX)

A couple of people have asked me how to play existing SID files in FMX (my FM cart music driver).  I've written a tutorial below to get basic conversions going; it looks like a lot of steps, but once you've set it up you hardly need to change it afterwards.  Maybe in the future I'll dig into the advanced features which can help with better conversions and making instruments, though in the meantime I suggest reading the docs that come with it.

While you need to download an assembler to use it, FMX is designed so you only need to change tables to get results out of it.  You don't need to do any coding!


You will need:
  1. The FMX driver 
  2. A copy of the DASM assembler.  Any version should do, here's a link to a Windows version from Lasse's page.
  3. A SID player so we can get some info.  Personally I use VSID, which comes with the Vice emulator, but most players have an info page in there somewhere.  Another example is Sidplay/w for Windows.
  4. Some way to hear the music (obviously), if you don't have a cart for your c64 the Vice emulator has it built-in.  (Settings / Cartridge/IO settings/SFX Sound Expander settings and tick the enable box)
Step 1 - Installation:

OK let's set up the basics:
  1. Unpack FMX (with its sub-folders) into a folder on your computer. 
  2. Move into the src sub-folder and unpack DASM into here.  We'll be doing our main work in this src folder from now on.
  3. To test DASM is working, run fmx.bat to build the player.  A file called fmx.prg should appear which you can test on your machine; it will play a small test loop on the FM chip as in this screenshot:
Step 2 : Setting up FMX to play single SID songs:

Now we want to set up FMX so it's ready to play a single SID tune with an optimal setup.
  1. Move into the configs sub-folder and then into the one called config-1song_only_fm, select & copy all the files in here.
  2. Move back into the src folder and paste the files we just copied here.  It'll overwrite some existing files but this is normal.
  3. Now run fmx.bat again and the built fmx.prg file will play a different tune and the screen text should have changed:

Step 3 : Getting info on our chosen song

Now those first two steps are out of the way we won't need to revisit them in future.

Unless you have some SID tunes already, the best place to find them is the High Voltage SID Collection.   If you're looking for particular tracks there's a search engine for it at the Exotica game music site.

So, we have some songs and now we need to load them into a SID player to find some info on them.  What we're looking for is the player's info window.  I've put up screenshots below showing the info we need in Sidplay/w and VSID.   VSID has it on the main display whereas Sidplay/w shows it when you go to File/Properties.

 Sidplay/w's info window with the info we need in red
VSID's main window showing the info we need in red

  • The Player Type has to be a 'vblank' type to work with FMX.  This is listed variously as VICII, VBI or simply VBLANK. 
  • The Load Address is where in memory the music is stored.  By default FMX has the space between $1000-$6fff or $a100-$cfff free, so make sure it fits between one of those two areas.
  • The Init Address is the piece of code called when the song starts to play.   This usually sets up the music player and resets it to the beginning.
  • The Play Address is called every time the screen refreshes to keep the music playing.
Optionally we can also jot down which sub-song we want to play if there's more than one track in there.  As we're only playing single-SID songs we want to avoid any with _2SID or _3SID on the end of the filename.  (that's for another article, or read the docs to see how it's done)
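As an aside, the three addresses can also be pulled straight out of the SID file itself rather than a player's info window. A rough sketch of reading a PSID/RSID header (big-endian fields; a load address of zero means the real address sits in the first two data bytes, little-endian — the function name and the fake file below are mine, not part of FMX):

```python
import struct

def sid_info(data: bytes):
    """Parse the fields we need from a PSID/RSID header."""
    magic = data[0:4].decode("ascii")              # 'PSID' or 'RSID'
    version, data_offset = struct.unpack(">HH", data[4:8])
    load, init, play = struct.unpack(">HHH", data[8:14])
    songs, start_song = struct.unpack(">HH", data[14:18])
    if load == 0:                                  # address stored in front of the data
        load = data[data_offset] | (data[data_offset + 1] << 8)
    return {"magic": magic, "load": load, "init": init,
            "play": play, "songs": songs, "start": start_song}
```

The $7e we subtract from the org line later is exactly this layout: the $7c-byte header plus the two embedded load-address bytes.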

Step 4 : Playing our song (the last step!)

Yes, now it's time to try playing back the SID file on your FM chip.
  1. Copy your SID file into the src folder where we've been working, I'm using the track 'Modern Loves Classics Intro' from the screenshots in this example.
  2. Now we need to open up the file fmx-song.asm to put in the info we've collected.  Any text editor will do, even notepad.  This may look daunting if you don't code but don't worry, we're only changing some names and numbers around.
Firstly we'll change it to load our song instead of the default one.   Scroll down until you find the org $1000-2 line as in this screenshot:

Change the org address to the Load address we jotted down earlier, and put -$7e on the end to skip the SID file header.  Change the filename after the .incbin label to the name of your sid file.   In my case the loading address ($1000) was the same as the demo track but you may have something different.  (eg: if your load address is $4000 change it to org $4000-$7e)

Now find the player_patchsid label as in this screenshot, we want to change the first number in this table to the first two digits of the load address.  (so if the load address is $4000 change it to $40)
We're in the home stretch now, just two more things to change : the Init and Play addresses.  Find the player_inithi label as in this screenshot, the play label is just below it:
As you can see the Init and Play are split into two tables each, the first two digits of the address go in the top table and the last two go in the one below.  So, if your init address is $4000, you put $40 in the top one and $00 in the bottom one. (likewise with play, if it's $4003 you put $40 in the top one and $03 in the bottom)  We can ignore all the extra columns in these tables as they're for multi-sid songs.
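The splitting is just taking the high and low bytes of the address. A quick sanity check of the rule (the function name here is only for illustration):

```python
def split_address(addr):
    """Return (hi, lo) bytes as they go into the top and bottom tables."""
    return (addr >> 8) & 0xFF, addr & 0xFF

hi, lo = split_address(0x4003)   # play address $4003 -> $40 in the top, $03 below
```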

Finally, if you want to play a different sub-tune to the default, look for the player_song label as in the screenshot below and change the first value to your new song number.  FMX counts songs from zero; most SID player displays start from one, so you'll need to subtract one from your value. 

Now run fmx.bat and it will hopefully build a new fmx.prg with your song in it.  (I've already changed the text in this screenshot so ignore that)

Extra features, changing text etc.

I'll leave the more advanced options (like changing the default instruments) for another article, but one other thing you can do easily is have the song also play on the SID chip at the same time.  This can create some interesting effects.  To do this look for the player_output label in fmx-song.asm and change the hi value to point at $d4 and the low value to $00 as in this screenshot:

The final thing we can do now is tidy up the display. Firstly you can change the displayed text by finding the label songname and changing the text lines below. (the space between the quotes is how much text can be displayed, it'll automatically crop any remainder)

And we can switch from the default Debug mode to one where we can change the colours of the screen.   This involves loading up two other .asm files, firstly fmx-build.asm to change the line DEBUGMODE = 1 to DEBUGMODE = 0 , and then fmx-globals.asm to change the colours as in this screenshot:

So now if we run fmx.bat again the built executable should look a bit nicer:

Saturday, 10 March 2018

$11 times the charm

My entry for the competition. (short version, before I realised there was a 2 minute minimum limit)
I don't write a lot of SID music these days; the only time I do is when I've got something new on the code side. (or at least new to me)  I can't really see myself writing SID music now just for the sake of writing it, I guess the motivation isn't there.  Weirdly my favourite release on the SID is Mini Melodies.  This is the least tech-driven thing I've done but it reminds me of my favourite type of SID tracks: the super basic players, pure noise drums, usually by people who only write a couple of tracks and move onto something else.  I love this kind of stuff above all else on there.

Currently (as I type) there's a competition going on to write music using only the triangle wave and filter.  This is turning into quite a tech-heavy competition and you can hear the entries at this link.   (I've dotted some videos through the article)

LMan - "Mellowhouse"
So I thought it'd be nice to contribute something, but if I just load up GoatTracker and start making a track I know I won't come up with anything.  Plus as it's on CSDB you know the standard is going to be super high, and my instrument design skills are nowhere near parity with those other guys. :)   So then, what to do.  I need some tech idea within the limits of the competition, but without having to suddenly learn a great deal of SID craft in a couple of weeks.

A 4th channel

The 4th channel idea actually came out of writing the previous article on here.   While writing up Patchwork's fx list I was thinking that the filter allocation ($d417) is an on/off state on each channel.  On/Off states are good in sound because they can be used as a basic oscillator for audio.  The question then becomes, does allocating a channel to the filter create enough of an audible change that it's useful?

As with playing samples, sound generation is a cpu-intensive process because you have to send rapid updates to a sound buffer.  To test out the extra channel I first made an infinite loop that would rapidly toggle the channel allocation between full and zero without waiting for a frame to pass. (so it's running as fast as the CPU can manage) This produced a high-pitched whirr, which showed there was maybe something useful.  By adding some delay into the loop the pitch lowered, as the toggle is being updated at a slower rate.   So it was kind of like generating a square wave sample but playing it through the filter.

My next task was to get the sound under control by setting up a timer system.  This is nearly the same as running it in the infinite loop, but the update speed of the loop can be controlled from a timer, and hence its pitch can be controlled.  You're also able to run the loop alongside a normal frame-based music engine so it can be controlled from a constant beat source, like any other music channel.

Progress of the player. The border colours show which bits in $d417 are being set.
So now we have the timer system in place, we need a friendly way to control it while writing our music.   I opted to use an existing music editor (because there's no point re-inventing the wheel on the SID channel side) and look for a way to put some control data in that I could read from my code.

Well most music editors only use 3 channels so that's not good, but what about the ones that can use multiple SIDs?  I opted to use GoatTracker Stereo as the editor, with the '4th' channel in there being used to control our generated sound.   I wrote some code to take the exported song and patch it at runtime, so all writes to the SID chips are sent to a different part of memory instead.  From there I can write back the first 3 channels to the SID and use the information being sent to the 4th channel to control our generated channel instead.  Any filter writes in GoatTracker's driver are ignored too, as we're controlling the filter directly from our code.

The first three channels write the SID data directly, but the 4th one reads the currently played note number from inside Goat Tracker's driver.   I have a table of timer speeds matched to the frequencies that would be sent to the SID pitches and set our timer to the relevant one.  I could get 4 octaves of useful pitch directly, and then this could be doubled by stepping through the $d417 changes at double the speed.  (though in my song this is only used in the fast bass sections)  I also do a check to see if  the note is a 'rest' command, and disable the 4th channel in that case.
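As a sketch of how such a pitch-to-timer table could be built (the clock value, tuning and table range here are my assumptions, not the actual values from my player): each note's frequency comes from equal temperament, and the timer period is the CPU clock divided by twice the frequency, because the $d417 toggle has to flip twice per waveform cycle.

```python
PAL_CLOCK = 985248  # C64 PAL CPU clock in Hz (assumed for this sketch)

def note_freq(midi_note):
    """Equal-tempered frequency for a MIDI note number (A4 = 69 = 440 Hz)."""
    return 440.0 * 2 ** ((midi_note - 69) / 12)

def timer_period(midi_note):
    """Timer period: the $d417 toggle must flip twice per waveform cycle."""
    return round(PAL_CLOCK / (2 * note_freq(midi_note)))

# 4 octaves of periods, from C2 (MIDI note 36) upwards
timer_table = [timer_period(n) for n in range(36, 36 + 48)]
```

Doubling the step speed (as in the fast bass sections) effectively halves the period again, which is where the extra octave comes from.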

Working this way there are going to be some compromises on the composing side though.  Without rewriting a bunch of the editor we won't be able to hear our music in exactly the same way it sounds on the machine.  But we'll be able to write the music and use dummy instruments for the 4th channel to create the structure.   That way only the sound design of the 4th channel needs to be done in code, that seems a good enough trade-off.

Wiklund - "Club Eleven"

More advanced sounds

So now we have a 1-bit oscillator we can control.  But thinking about it the filter allocation isn't really a 1-bit value, it's a 3-bit value: a bit-wise toggle for each SID channel:
  • Channel 1 : $01 toggle
  • Channel 2 : $02 toggle
  • Channel 3 : $04 toggle
Do we get any differences by toggling with different values?  The answer is: yes!   Combined channels = larger volume.  This means we have some kind of stepped range for our generated sound.

The other thing with a generated sound is you can shape the timbre of it. If you've used the Gameboy you'll remember the 3rd pitch channel allowed you to make custom 4-bit waveforms with multiple steps.  The hardware then cycles through this waveform on a loop with the different permutations in the waveform creating an individual timbre to the instrument.   So we can do something similar here, I opted for an 8 step waveform which gives enough variety in the sounds.   By dropping 'on' toggles at fixed steps you can also add harmonics to the sound in a similar way to a drawbar organ.   Some examples:
  • 01,00,00,00,00,00,00,00 - Straight tone
  • 01,00,00,00,01,00,00,00 - Octaver effect
  • 01,00,00,00,01,00,01,00 - Double octaver effect
The symmetry between the on and off states will change the 'duty' of the sound, if you've used a NES you'll know this kind of setup:
  • 01,00,00,00,00,00,00,00 - 12.5%
  • 01,01,00,00,00,00,00,00 - 25%
  • 01,01,01,01,00,00,00,00 - 50% (full 'square wave' sound)
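The waveform tables above can be modelled in a few lines: expand the 8 steps into the toggle stream they produce, and measure the duty as the ratio of 'on' steps. (an illustrative sketch, not code from the player)

```python
def render(waveform, cycles=1):
    """Expand an 8-step waveform into the stream of toggle values it produces."""
    return [step for _ in range(cycles) for step in waveform]

def duty(waveform):
    """Fraction of steps that are 'on' (non-zero)."""
    return sum(1 for s in waveform if s) / len(waveform)

square = [0x01, 0x01, 0x01, 0x01, 0x00, 0x00, 0x00, 0x00]  # full 'square wave'
thin   = [0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00]  # 12.5% duty
```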

JT & LMan - "Skypeople"
So now we can shape our sounds enough that they'll sound different to each other.  But hang on, as we're using the filter to do this the 4th channel is also affected by the filter's settings.  Is there anything more we can add?  

Well for a start the $d417 register has two jobs: channel allocation and setting the resonance.  So we can have different resonance values in each instrument, or change it per cycle.   We also have the cutoff frequency which we can apply a sweep value to, and finally the type of filter being used. (between low, band and high-pass or the combinations)   So yes, there is more!

But there's also one more software thing we can do:  switch the waveform over time.   I added the option to cycle through a set of waveforms with a time delay while playing a note, kind of like using a wavetable in a traditional sid player.  This let me produce the echo effect and octaver bass sounds used in the song.

Generated sound and the sid channels

So we now have 4 channels playing, but if we're setting the SID channels to use the filter too, what does that do to them?   Well, it gives the set channels a kind of ring-modulation effect.  To quote the very knowledgeable lft on my original upload:

"The reason for the apparent ring modulation is that the filter is inverting. So when flipping the filter routing at some frequency, that's similar to applying ring modulation at that frequency to the waveform."
The timbre of the 4th channel is also affected by the SID waveforms used, I haven't dug into how useful this is but amusingly using waveforms other than $11 gives better resolution at least. :)

A funny thing is this modulation effect would have been super useful when making the ST musicdisk.  It's a nearly spot-on simulation of the Buzz bass sound and would have freed up a channel on the SID in the process.

Sunday, 25 February 2018

How Patchwork (patch)works

As you may know, the Commodore 64 has 64kb of RAM.   On boot-up, however, that 64kb isn't all accessible from the machine directly. Certain blocks of the RAM space are patched out with various ROMs and the I/O space for the custom chips. (The VIC-II graphics chip, timing/port control and our beloved SID chip)

Here's the RAM setup when your Commodore 64 boots up:

$0000-$9fff - Mostly free (some kernal things are patched into zeropage and there's also the stack)
$a000-$bfff - Basic ROM
$c000-$cfff - Free
$d000-$dfff - VIC-II,SID,I/O,Character Set address space
$e000-$ffff - Kernal OS ROM

This isn't a fixed setup, however.  By changing the bit values in the $01 register you can swap out the ROMs and the chip address space however you like, either on a permanent or temporary basis.   This mostly applies to assembly language programs, however some BASIC games would make copies of the Character Set into RAM using the above method.   (if you ever saw a "Please Wait" message in a BASIC game and then a minute of nothing it was probably doing that)

LDA #$35 , STA $01

A large number of standalone video games swap out the Basic and Kernal because they don't use them.   This is achieved by setting the register $01 to #$35 and gives you 60kb accessible at any time.   While the CPU can see all the RAM without a problem, the VIC-II chip can only see 16kb RAM blocks, so from a graphical standpoint you'll usually set one area of RAM as 'the graphics RAM' and copy anything else there as needed.

Patchwork relies on the fact you can also swap out the SID chip address space into RAM.  This needs a bit of background first to explain.

How music players work:

A music player is usually split into two parts, the music driver and the music data.  The music driver will have a specific set of addresses the game/demo can call to initialize, play and stop the music.  For 8-bit machines these are usually 'per frame', meaning every time the raster beam passes over the screen the music driver is called to keep a constant tempo.  Your typical music driver, when called, will do some housekeeping (updating where it is in the song, changing instruments etc.) and then write to the set of registers where the soundchip is.  In our case the standard SID chip sits at the $d400-$d41c address space.  For custom machines with extra SIDs their addresses can be at any place in the $d400-$dfff range, but that first SID is always at $d400.

LDA #$34 , STA $01

So, with that knowledge if we call the music player normally it'll write to the SID registers and you'll hear the notes for that frame of music.  But what if we swap out the $d000-$dfff register space to RAM first, by setting $01 to value #$34.   Now what happens?

Well the music player runs fine, but the register data gets written into RAM instead of to the SID chip so we don't hear it.  If we copy that data somewhere and then re-enable the SID chip (with LDA #$35 , STA $01 again) we still don't hear the music, because as far as the SID chip is concerned nothing new has been written to it.  Switching out the SID chip also doesn't stop the SID chip playing whatever is already in the registers, it just stops it being able to receive new data until it's enabled again.

If we then write the data we copied back to the SID chip we'll hear the new frame of music.  With this simple setup we can play a SID tune, but we have the option to manipulate the SID data first before it goes out to the chip.  You may have seen some of my other videos where I get the SID chip to pretend to be other systems. (like the Spectrum or NES)  This is using the above method, by manipulating the data we get from the music driver and then writing that manipulated data back to the SID in realtime.
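The whole capture-and-replay idea can be modelled in miniature like this (a Python sketch with the 6502 banking replaced by a plain byte array; the real thing flips $01 between #$34 and #$35):

```python
SID_REGS = 0x19  # $d400-$d418: the 25 registers used in normal playback

def play_frame(driver, sid_write):
    """One frame: 'bank out' I/O so the driver writes into RAM, then copy to the chip."""
    mirror = bytearray(SID_REGS)              # stands in for the RAM under $d400
    driver(lambda reg, val: mirror.__setitem__(reg, val))
    for reg in range(SID_REGS):               # 'I/O banked back in': forward the frame
        sid_write(reg, mirror[reg])
    return bytes(mirror)
```

Between the capture and the write-back is exactly where the manipulation (Spectrum/NES impressions, or Patchwork's buffering) slots in.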

What Patchwork does with the data:

Patchwork is a bit different in that it doesn't use that data in real-time, instead it stores it into an incremental buffer.  Because it's mostly a proof-of-concept I decided to store the data uncompressed, meaning every frame 25 bytes are written to memory.  (the registers after the volume control aren't used in normal music playback)  This means Patchwork has a limit of 27 seconds of music, but if I come back to this project in the future there are plenty of opportunities for improving that.
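That 27-second limit falls straight out of the frame size. A quick check, assuming a PAL 50 frames-per-second update and guessing at a buffer of around 33 KB (the exact buffer size is my assumption):

```python
FRAME_BYTES = 25          # uncompressed SID registers stored per frame
FPS = 50                  # PAL frame rate (assumed)

def seconds_of_recording(buffer_bytes):
    """How many seconds of music fit in a given capture buffer."""
    return buffer_bytes // (FRAME_BYTES * FPS)

limit = seconds_of_recording(33750)   # a 33750-byte buffer holds 27 seconds
```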

So, now we've captured some sid data into a buffer we don't need the original song to play it back.  If we send that buffered data to the SID chip every frame it'll play that instead.  

This method does have some drawbacks however.  Because the SID chip has some timing inconsistencies with the way the ADSR envelopes and the waveform Gate work, sometimes the CPU cycle count in a music driver is the reason the music sounds how it does.  This is especially true of older game music drivers; most modern drivers have some form of "Hard-Restart" enabled, which is a way to get SID chip playback pretty much 100% consistent on every frame.

You can read more on the subject in this thread, by people far more knowledgeable than myself.  One day I'll write up how the AY music player works, which had its own unique challenges. :)

Anyway, so for some game music drivers the time between SID register writes is a factor in the reliability of their playback.  This isn't a complaint on my part.  What this means (to me) is the way the driver has been developed and the sound of it have a symbiotic relationship, and are so heavily fused together that changes to one directly affect the other.

On Patchwork's end it doesn't try to emulate the playback of different music drivers, it just wants to write the register data back to the SID chip with one method.   There are a couple of ways to try and fix the 'problem' though they don't cover all cases:

  1. It's more consistent to write SID registers backwards.  I don't know the exact reasons for this but I assume hitting the ADSR registers before the Waveform has a higher success rate.
  2. Putting a cycle delay between register writes.  For the SID player I've put in a few NOPs between writes, in Patchwork's own driver there's enough CPU manipulation between each section that the register writes are spaced out a bit.
  3. Manipulating the data before it gets to the SID.  I've only used this on Patchwork's driver side.  Because instruments can be made at any point in a recording the SID registers can be in any state.  This means that while you may have lined up one instrument to play at its beginning with the Gate enabled, other channels could be in a 'note-off' (gate off) state, or something else.   To get around this I artificially force the Gate back on for two frames at the start of each channel.   This helps with consistency when playing back because two frames is usually enough time for the ADSR to have reset to a constant state, even if it hasn't completed its cycle before being switched off again.  This third option can change the sound of the instrument for those first two frames, but as we're using 'cut up' parts of an existing song it's going to sound a bit different anyway.
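Workarounds 1 and 3 can be sketched together: write the frame back highest register first, and force the gate bit on for the first two frames of an instrument. (the register offsets are the standard SID control registers; the function itself is an illustration, not Patchwork's actual driver)

```python
GATE_REGS = (0x04, 0x0B, 0x12)   # waveform/control register for each SID channel

def write_frame(frame, sid_write, frames_into_note):
    """Write one 25-byte register frame back to the SID, highest register first."""
    data = bytearray(frame)
    if frames_into_note < 2:                   # force the gate on for the first two frames
        for reg in GATE_REGS:
            data[reg] |= 0x01
    for reg in range(len(data) - 1, -1, -1):   # ADSR hit before waveform: write backwards
        sid_write(reg, data[reg])
```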
Manipulating the SID data:

So now we have the data we've recorded, why not change it around a bit?  The pattern editor has a few different pitch effects, and the filter and channel assignments can be changed around.

Loops and speed changes:

Because each frame is basically a contained moment in time (like a sample) we don't have to play them in order.  This means Patchwork can play an instrument in reverse and loop forwards or backwards if required.  We can also speed up or slow down the playback, by delaying the frame increment with a timer (slower) or skipping through the data frames at a faster rate. (faster)  I think slower has some possibilities but in hindsight faster isn't all that useful.
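The frame stepping can be sketched as a signed step with wrap-around, so one routine covers forwards, backwards, looping and the 'skip frames' fast mode. (an illustrative sketch, not the actual driver code)

```python
def advance(pos, start, end, step=1, reverse=False, loop=True):
    """Move through recorded frames; step > 1 skips frames (faster playback)."""
    pos += -step if reverse else step
    if loop:
        span = end - start
        pos = start + (pos - start) % span   # wraps in either direction
    return pos
```

Slower playback is the other half: only call `advance` when a delay timer expires, rather than every frame.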

What is different to a sample, however, is that the SID chip is expecting a note to go through a particular cycle of events.  The creator of the SID chip (Robert Yannes) went on to co-found Ensoniq, and the chip follows a traditional synthesis setup. That being:
  1. At the start of a note the Gate (equivalent to a note-on in MIDI) is enabled.
  2. While the Gate is active the Attack, Decay and Sustain values of the ADSR are cycled through by the SID chip.
  3. When the Gate is disabled (a note-off) the Release cycle of the ADSR starts. If the first three stages haven't been completed in time it stops (afaik) wherever it is and goes straight to the Release timing.
Played notes on the SID don't exist in a bubble, wherever they have got to in that cycle can affect how the next note plays.   This is the important bit, the next note that plays may not play at the right volume, or at all if the ADSR isn't in a ready state to begin again.   As far as I know this is why techniques like 'Hard-Restart' (linked earlier on) were developed.  It's also why I manipulate the first couple of frames of an instrument directly.  Even if the song data is playing backwards it should at least START with a solid note setup. :)

(as a sidenote, thanks to the SID chip I got a hands-on education on how synths work in the '80s)

Pitchshift and Octaver:

Making an Octaver fx on the c64 is quite simple: sound pitch is stored as a frequency value, which means an octave higher is achieved by doubling the value and an octave lower by halving it.  As it's an 8-bit machine you need to do some carry checking as you move through the octaves, but it's quite do-able and the loops are fast.
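As a sketch of the octaver on the 16-bit lo/hi frequency pair (illustrative, with overflow clamped rather than carry-checked the way the 6502 loop would do it):

```python
def octave_up(lo, hi):
    """Double a 16-bit SID frequency (lo, hi byte pair), clamping on overflow."""
    value = min(((hi << 8) | lo) << 1, 0xFFFF)
    return value & 0xFF, value >> 8

def octave_down(lo, hi):
    """Halve a 16-bit SID frequency to drop the note an octave."""
    value = ((hi << 8) | lo) >> 1
    return value & 0xFF, value >> 8
```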

This is the first time I've tried making a Pitchshift on anything, and surprisingly it kinda works for the most part.  It's unfortunately the most CPU intensive effect in the driver though, right at the end of development I had to re-work the GUI code to get it working again on NTSC machines. 

One problem with making this effect is because the C64 pitch is stored as a frequency value, music drivers aren't locked into a particular tuning scale.  This means that, for example, middle C in one driver can have different values to another depending on what frequency table they use.  A pitchshifter fx has to go through two stages to get the effect we want:
  • Detect which note we are closest to already.
To do this the fastest method I found was to take the source pitch down to the lowest octave it can possibly reach.  At this point I can compare it against a baseline pitchtable (in this case the one at codebase64 made by mr.SID) to find the closest possible match.  During this process I store a few values:
  1. The pitch offset between the source note and known 'good' note so we can apply this back later to make it as close to the original frequency table as possible.
  2. How many octaves we had to move to get to the lowest octave so we can shift it back later.
  • Add/subtract the pitchshift value
Now we have an idea where the note is we can add/subtract the required amount of semi-tones to our frequency and take that value from the baseline pitchtable.  After which we can shift it back up to the octave it was in, or if we're subtracting drop it an octave lower.  We also re-apply the pitch offset we stored beforehand so it's shifted out a few cents to match the original note's pitch.
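Putting the two stages together, a simplified model of the pitchshifter might look like this (the baseline table here is computed rather than the real codebase64 one, and the base value 274 is arbitrary for the sketch):

```python
# Hypothetical baseline: 12 bottom-octave frequency values.
BASELINE = [round(274 * 2 ** (i / 12)) for i in range(12)]

def pitchshift(freq, semitones):
    """Shift a frequency value by semitones against the baseline table."""
    octaves = 0
    while freq >= BASELINE[0] * 2:            # stage 1: drop to the lowest octave
        freq //= 2
        octaves += 1
    nearest = min(range(12), key=lambda i: abs(BASELINE[i] - freq))
    offset = freq - BASELINE[nearest]         # distance from the known 'good' note
    index = nearest + semitones               # stage 2: move by semitones
    octaves += index // 12                    # carry any octave moves
    shifted = BASELINE[index % 12] + offset   # re-apply the stored offset
    return shifted << octaves if octaves >= 0 else shifted >> -octaves
```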

This probably isn't the most efficient way of doing it, and I thought there'd be more errors than I was getting but for the most part it seems to work ok.  Only wildly shifting notes/pitch tables seem to have a large problem.   The range is an octave above or below so maybe if I'd increased that the problem would be more apparent.

Both the octaver and pitchshift can be applied per channel so individual layers can be manipulated.  One thing I forgot to mention in the docs is that a noise waveform note isn't affected by any pitch changes, I decided to do this because it's mostly used for drums and you usually want those to be consistent in a song.

Sweeps and filters:

These effects are relatively simple, changes to the filter setup directly alter the values in $d417 and $d418, which are where the filter assign and type are stored.  I do a few basic checks to see if the filter is enabled at all and then clear the assigns so you don't hear silence.  (on the SID assigning a channel to the filter without having a filter enabled silences the channel)

The pitch sweeps take the first pitch value of the playing instrument as the base, and then apply the addition/subtraction value stored in the instrument each frame.  When the high value of the pitch register rolls over to zero I stop applying the pitch sweep and silence the channel.  

Likewise the filter sweep takes the first filter cut-off value and applies the value stored in the instrument, but at the start of each beat rather than per frame.  This allows you to get much more rapid filter sweep effects (which is mostly what this is used for) and also means that if you change the tempo the sweep time remains relatively consistent.
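Both sweeps boil down to 'base value plus a per-instrument delta'. A sketch of the pitch sweep with the rollover stop (illustrative; the real driver works on the split lo/hi register bytes):

```python
def pitch_sweep(start, delta, frames):
    """Apply a signed sweep per frame; stop when the 16-bit value rolls over."""
    value, out = start, []
    for _ in range(frames):
        value += delta
        if not 0 <= value <= 0xFFFF:      # hi byte rolled past $ff (or below $00)
            return out, True              # True: the channel gets silenced here
        out.append(value)
    return out, False
```

The filter sweep is the same loop with the delta applied per beat instead of per frame.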

Future ideas:

As this was written mostly as a 'proof-of-concept' I stuck to a list of items I thought I could comfortably finish.   If I return to the project in future there are a few things I'd like to investigate:

Independent channel playback:

While there are pattern commands for leaving channels running, essentially Patchwork is playing one instrument at a time on all channels.  What would be a more flexible approach is having each channel able to run its own instrument.  This would mean instruments would need to have their own channel assigns (saying if it's a 1-3 channel instrument) but it also means song parts can play on any channels rather than being fixed to their original positions.   The benefit of the latter is when using SID features like ring mod or sync, as they are controlled by their playing position in the channel order.

Recording compression:

I'll admit when using Patchwork I haven't really come up against the 27 second recording limit yet.  However it would be quite possible to pack the data down from its 25-byte frame size.  One thing to take into consideration is that the sample stream isn't used in a linear fashion; instruments can play backwards, and when recording you can be in the middle of an existing stream.  Even though you've overwritten what's there, any data left after you stop recording has to be maintained.

The easiest idea is to try a smaller frame format with a fixed size, while all registers can hold a full 8-bit value realistically not all of them do so.  If it's 10 bytes smaller that adds up and we don't have to do much data management.

Another idea is to split out each register as an RLE stream.  This may sound crazy with 25 registers but if you've done music data compression before you'll know that the individual elements of a stream don't rapidly change all that often.   There'll be some (like the waveform register) that won't work as well but notes, filter and ADSR should yield quite a saving.   The other benefit with this idea is that potentially the data management isn't as horrible as first imagined.  If you start recording in the middle of existing data you only need to change where the previous RLE streams have ended, and you can check forward where the next ones are and change their values as you record.
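The per-register RLE idea is easy to prototype and measure (an illustrative sketch; the pair format and the 255-run cap are my choices, not a finished file format):

```python
def rle_encode(stream):
    """Encode one register's stream as [count, value] pairs, counts capped at 255."""
    out = []
    for value in stream:
        if out and out[-1][1] == value and out[-1][0] < 255:
            out[-1][0] += 1
        else:
            out.append([1, value])
    return out

def packed_size(frames):
    """Total bytes if each of the 25 registers is stored as its own RLE stream."""
    streams = [[frame[r] for frame in frames] for r in range(25)]
    return sum(2 * len(rle_encode(s)) for s in streams)
```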

Midi sync:

The final thing, especially as I have liveplay options, is to put in some midi sync.   I had a very basic keyboard input working as a test but I'll need to implement a timed midi cache to allow it to work with sequencers.  They send a lot of data!

Thursday, 18 August 2016

Making "Ad Astra" (1/3)

This is a short behind the scenes about "Ad Astra", a demo for the Pico-8 virtual console that I worked on recently with ilkke.  Here's a video of it, or you can watch it online.

I got into Pico-8 coding after seeing some short-form gif tutorials that the author, Lexaloffle, uploaded to his Twitter account.  At the time I was on the tail end of a large commercial game project, so it was a nice distraction writing in this self-contained development environment after work.  Being an occasional demo coder the first things I worked on were replicating a few popular demoscene effects: the Plasma, Vertical Rasters and then Vector Bobs.

After doing a few of these independently of each other I thought about some way to release them, and that's when I decided to try making a little demo 'engine' to run these scenes in sequence, with maybe some sort of design to tie them together.

Demo Engines

In the modern demoscene a demo engine is a broad term: in its basic form an engine runs behind the scenes, managing the sequence of demo parts and their assets, handling their set up and the overall timing.  On the other end of the scale you have fully integrated tool suites like Werkkzeug, which was the production tool behind a lot of Farbrausch's content, like the famous fr-08 from a few years back.

What I wanted to do was something that handled the back-end work and left me free to easily create scenes from the various effects I'd made.  This boiled down to three requirements:
  • Each part of the demo can be timed accurately.
  • Each part can be made from modular components so they can be re-used, and the 'engine' will handle initialisation and constructing the draw and update loops for it.
  • The content of scenes can be mostly constructed outside Pico-8 for easy iteration using a very basic script language, making it a "data-driven" demo rather than hard-coding each screen to fit.
So let's see how each point was tackled:

1) Timing

The first thing I needed to do was work out a way to do timing.  Not timing for effects, we have _draw() and _update() for that, but general timing of the demo flow.  Modern demos usually sync to the music as it's a constant, and there is likely an audio system in place providing rock-solid timing for you.  In Pico-8 there are a couple of stat() variables you can use to find out which pattern a channel is playing (16-19), and what position it's got to in that pattern (20-23).  There aren't any for song position, so the way I did it was to check if we'd hit pattern position 0, with a flag to check if we were still in the same pattern.  I decided that each scene in the demo could be X song patterns in length, so when the timing flag is set it decreases this counter by one, and if it reaches zero we know it's time to move on to the next screen.
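A minimal sketch of that counter logic, with all the names being mine rather than from the real engine:

```lua
-- assumed: scene lengths in song patterns, read from the script
scene_len={4,2,4,8}
scene,patterns_left=1,4
seen_zero=false

function update_timing()
 -- stat(20) = position within the pattern on channel 0
 if stat(20)==0 then
  if not seen_zero then
   seen_zero=true          -- only count each pattern once
   patterns_left-=1
   if patterns_left<=0 then
    scene+=1               -- time for the next screen
    patterns_left=scene_len[scene]
   end
  end
 else
  seen_zero=false
 end
end
```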

2) Modular Components

As with games, modern demos (as in PC) are usually coded out of modular components.  This is so scenes can be constructed from various re-usable parts to make a greater whole and saves on code duplication.

Let's take an example, here's a scroller screen I coded beforehand.  It has a starfield, vector bobs, the scroll and a mirror effect.  They're all hardcoded to that one screen.

Instead you split each element out into functions and give them as many input variables as you can without impacting performance too much.  This instantly makes each thing more flexible and gives the designer a lot more scope to construct scenes from these smaller parts to make a larger whole.  It also applies to effects you're only going to use once, like the landscape part in Ad Astra, because you can enhance these scenes with other modules you've already written.

So now each effect had three functions: Init, Draw and Update.  Actually some of them didn't have an Update because I did that in the Display bit instead (naughty) but never mind.  That's another rule of making demos: if it seems to be faster, do it that way instead.

When a new scene starts, the engine checks which modules are used and feeds their Init function with the relevant data for that scene.   It then builds array lists for the _draw() and _update() functions.  These are basically just loops checking if a value is true in the array and calling the relevant effect function at that point.

The z-order of things in a scene is decided by their order in the _draw() array.  This was good for backgrounds and overlays where needed.
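Sketched out (with invented names, not the actual engine code), the registry and the two loops might look like this:

```lua
-- each module registers an init/draw/update triple
modules={}
function register(name,init,draw,update)
 modules[name]={init=init,draw=draw,update=update}
end

drawlist,updatelist={},{}
function start_scene(scene)
 drawlist,updatelist={},{}
 for name in all(scene.used) do
  local m=modules[name]
  m.init(scene.data[name])  -- feed the per-scene data
  add(drawlist,m.draw)      -- order added = z-order
  if m.update then add(updatelist,m.update) end
 end
end

function _update() for f in all(updatelist) do f() end end
function _draw() cls() for f in all(drawlist) do f() end end
```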

Effects aren't all truly modular, as some of them only have one instance of their variables available (the Bobs & Vectors for a start), however the important bits that get re-used often (like the Map module) are set up properly and can be re-used multiple times in a scene.

3) Building scenes outside Pico-8

This was mostly about making things more comfortable for the creative process.  The easier it is to make your demos the faster you can iterate and improve them, and you're more likely to do that in a workflow that isn't awkward.

To that end I made a few tools to help out.  These were all made in the old Blitz Basic which, for me at least, is still the fastest way to make little graphics tools.  As only the two of us were ever going to use them they didn't need to look nice, they just had to work.

The main tool was both a script compiler and .p8 file builder.  You feed it a script in the right format, which it turns into an array the demo engine can read.  It then splices this with a .p8 file of the demo engine and exports a new compiled version.  Finally it boots Pico-8 and runs the file automatically for convenience.

Here's a quick example of the really simple script format.  The first two lines are the .p8 filename to splice into (useful for checking earlier versions if something starts breaking) and the timing values for each scene in array format.  The 255 at the end is a special command to tell it to loop on the last scene.


Then each part follows: this is a list of the modules used, in the Z order we want them to be displayed.  Each module starts with a # followed by the name, and then the data for that part afterwards.  For example:

80 ; layerxmap
4 ; layerymap
0 ; layerxpos
47 ; layerypos
16 ; layerxsize
12 ; layerysize
0 ; layerxposreset
0 ; layeryposreset
0 ; layerxspeed
7 ; layeryspeed
0 ; layerxposadd
-1 ; layeryposadd
0 ; layerxmapadd
0 ; layerymapadd
0 ; layerxposmax
0 ; layeryposmax
0 ; skip mesh reset?
50 ; start zoom
148 ; xpos offset
48 ; ypos offset
0.5 ; x rotate start
0.0 ; y rotate start
0.0 ; z rotate start
0.000 ; x rotate move
-0.001 ; y rotate move
0.0003 ; z rotate move
700 ; target zoom
4 ; zoom speed
-1.1 ; xpos add
0.15 ; ypos add
1 ; mesh object


There are a few extra commands to handle adding data (such as the 3d meshes) and using Pico-8 commands within a scene, like the palette controls.   This might seem odd, but because it's data-driven we want to avoid hard-coding things per scene if possible.  So while changing the palette through an extra function rather than directly may lose a tiny amount of cpu time, it means it's available in the script with the other parts of the scene.  There were several times when I had to change masking colours for particular map layers, and being able to do that in one continuous process made things much easier.

Each scene ends with an #end command so the engine knows we're done.

This may not look like a much faster way of working, but being able to run this in Notepad++, tweak a setting here and there, cut/paste entire parts around to change the flow etc. certainly helped with the production workflow.

Another useful tool I made was a .PNG to _map & _gfx converter for the artwork.  This was like any old tile converter in that it optimized the tiles used and then reconstructed the .png in the new order as a _map & _gfx set that could be copy/pasted into the .p8 file.  So ilkke could send me his latest version of the artwork as a .PNG file, made in whatever he wanted to use:

And I then had these in the engine ready to view straight away.

The final tool I did was a converter for .OBJ (Wavefront) files to turn them into an array for the vector routine.  The only easy way to store vertex colours was in materials, so we did that.  Here's a screenshot of ilkke's original Blender spaceship:


In hindsight my use of an array meant I wasted a LOT of tokens, and I didn't thoroughly check that out until we started running out of memory.  Luckily it was only in the last days of production.  Next time I'll use a pure bitstream approach and also standardize the input format for parts so we don't have any fixed variable names.

In the next two posts I'll go through each part in turn with a bit more detail.

Making "Ad Astra" (2/3)

Scene 1 : Flyby

The opening scene makes the most use of the Map module.  This is a little wrapper around the Map function that adds variables for movement over time, and checks boundaries for resetting the map's position.  By stacking these up I could have some parallax layers to set the scene.  The stars use the Vector Bob module on a mesh of random X,Y,Z points, with a very slight rotation on one axis so their direction gradually moves from head-on to side-view.
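The wrapper is roughly this shape, using the same field names as the script dump in part 1 (my guess at the structure, not the shipped code):

```lua
-- one parallax layer: position moves over time and wraps
function layer_update(l)
 l.xpos+=l.xspeed
 l.ypos+=l.yspeed
 -- reset when a boundary is crossed, for seamless looping
 if l.xposmax>0 and l.xpos>=l.xposmax then l.xpos=l.xposreset end
 if l.yposmax>0 and l.ypos>=l.yposmax then l.ypos=l.yposreset end
end

function layer_draw(l)
 map(l.xmap,l.ymap,l.xpos,l.ypos,l.xsize,l.ysize)
end
```

Stacking several of these, each with different speeds, gives the parallax.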

ilkke supplied some great assets for this, a full-size space background across several maps, the detailed space freighter and finally our Ate Bit logo.  This is also the first time you see the triangular design style ilkke went with on the backgrounds to each scene.

If you're unfamiliar with the term Vector Bobs, it's basically a mesh of points sorted in Z order and rendered with sprites.  To add more depth you can use different sized sprites based on the Z ordering.  I think the first time a lot of us saw this effect was the Afterburner arcade game's title screen.

In the early days of the Amiga demoscene it was quite a popular effect, probably the most famous is in the Red Sector Megademo.

Scene 2: Space Flight

So now we use the Vector Bobs again with classic bubble sprites, and bring in a filled spaceship to fly over it while a planet hovers into view behind.

Amusingly this is the first software triangle routine I've ever done, so it was fun looking up old tutorials to find out how best to approach it.   Ones I used were this for the triangle splitting algorithm and this old one from Cubic & $eeN on backface culling.  I did end up with a couple of bugs in the draw, which were temporarily fixed with Rectfill to cover them up.  I meant to go back and fix it properly but I was running out of time, so that stayed in; it didn't seem to impact the speed much.  Anyway, something to work on next time.

For sorting both bobs and tris I used a Comb Sort; I think I did a literal translation of the pseudocode from here in the end, short and fast enough for what I needed.  I did have a faster version in the demo which shared the Z position and index in one token.  It had a hard limit of 511 items (not really a problem for the scenes I wanted to do) but had some problems with really close Z positions that I didn't notice until late in the day.  As I didn't have time to fix that up, and the scenes were still in a frame without it, I swapped back to the older two-token version instead.   The one in the demo puts out about 150 bobs in a frame, the faster one was hitting about 190.   Here's a gif of the fast one at full capacity, just under 3 frames sorting 511 bobs:
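For reference, a straight Lua translation of the usual comb sort pseudocode, sorting back-to-front on z (names assumed):

```lua
-- comb sort: a bubble sort with a shrinking gap (factor 1.3)
function combsort(pts)
 local gap,swapped=#pts,true
 while gap>1 or swapped do
  gap=flr(gap/1.3)
  if gap<1 then gap=1 end
  swapped=false
  for i=1,#pts-gap do
   if pts[i].z<pts[i+gap].z then   -- draw far bobs first
    pts[i],pts[i+gap]=pts[i+gap],pts[i]
    swapped=true
   end
  end
 end
end
```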

Scene 3 : Scroller

This keeps the previous elements and adds a sine scrolling logo over the top.

The logo is one long map, when displayed the visible area is split into vertical strips of 1x16 in size, and each one of these has an independent Y sine position which is stored in an array.  Strips are pixel scrolled left, and once 8 pixels have been covered the pixel position is reset and the X position of the map draw increases by one so that the map scrolls across the screen.  

To do the sine effect the last element in the Y array has a new sine position calculated, then the whole array is shifted to the left every frame so it's constantly Y-distorting across the whole visible map.
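Putting the two paragraphs above together, a sketch of the strip logic (all constants invented):

```lua
strips=17              -- enough 8px-wide columns to cover the screen
ys={} for i=1,strips do ys[i]=56 end
ang,px,mapx=0,0,0

function scroll_update()
 -- shift the sine array left; a new value enters on the right
 for i=1,strips-1 do ys[i]=ys[i+1] end
 ang+=0.02
 ys[strips]=56+sin(ang)*16
 -- pixel scroll left; every 8 pixels step the map draw along
 px+=1
 if px==8 then px=0 mapx+=1 end
end

function scroll_draw()
 for i=1,strips do
  -- draw one 1-cel-wide, 16px-high slice of the logo map
  map(mapx+i-1,0,(i-1)*8-px,ys[i],1,2)
 end
end
```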

Scene 4 : Planetfall

This uses the map and vector modules again (actually all screens use the Map module in some way), the only new thing here is the Plasma module which is used for the cloud effect.

The first effect I coded on Pico-8 was a standard Plasma, to make things a little smoother I did it as a 4x4 pixel effect in sprites, plotting the sprites for the map every 4 pixels in X + Y direction:

The Plasma module has a few extra inputs for max sprites to use, area and also what resolution to use.  Amusingly while it can still do the 4x4 mode I only use it in normal 8x8 sprite resolution in the demo.
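A sketch of the 4x4 variant, assuming a pre-drawn strip of shade graphics on the sprite sheet (all constants mine):

```lua
-- classic plasma: a sum of sines picks a shade per 4x4 cell
function plasma_draw(t)
 for cy=0,31 do
  for cx=0,31 do
   local v=sin(cx/16+t)+sin(cy/16-t)+sin((cx+cy)/32+t/2)
   local s=flr((v+3)/6*7.99)      -- map -3..3 onto 8 shades
   -- blit a 4x4 chunk from an assumed shade strip at row 32
   sspr(s*4,32,4,4,cx*4,cy*4)
  end
 end
end
```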

Scene 5 : Landscape

The landscape is in the style of classic 'voxel map' effects like the famous Mars demo from 1993.  Obviously it's a lot less capable than any of those; it does use a height-map (displayed in sprites so I could have a bit of a gradient effect with the lighting) but can only move in map shifts, no rotation or camera movement is possible.   The Z distance calculation is far more like one you'd do for a classic racing game effect than proper 3d too.

The lighting is calculated from the sun sprite's X + Y position relative to the displayed sprite position, but it's one of those 'demoscene calculations' where you tweak it to look alright rather than basing it on any real-world formulas.  As we're not shifting the map position left or right, it's actually mirrored in the middle to save a bit of cpu time, but with slightly different lighting calculations for each half so it's not immediately obvious.

To finish the scene off I asked ilkke to do a cockpit overlay (which looks great), this is shifted up and down on the same sine calculation as the sun sprite.  In the background of the scene a couple of Rectfills draw the sky and horizon, which also moves on the sun's sine wave.

This is the only effect that goes over a 'frame', or at least goes over the stat(1) output, which is what I was using to test each screen during development.

I put an earlier version of this scene up on my twitter (below), but I think the new version looks a bit better with the gradient sprites.  Still lots of room for improvement though.

Making "Ad Astra" (3/3)

Scene 6 : Ship flyover

This re-uses most of what we've already seen, with the only new effect being the Water module, which is only 5 lines of code.  It uses memcpy to copy part of the screen line by line to another part of the screen, but in reverse order.  To get the ripple effect I run a sine calculation in the copy loop and add the result to the line to copy; the start positions of the sine calculation are stored outside the loop and updated on each new call.  This way you get the distortion of lines being copied out of order, but because the sine is starting from a moving constant you get a gradual movement of the distortion vertically up the area, making it seem much more like waves.   After this point I use it in pretty much all the scenes because it was fast and a nice way to fill the spare space with action.
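With the Pico-8 screen living at 0x6000 (64 bytes per line), the whole module is roughly this (my reconstruction, not the original five lines):

```lua
wt=0  -- moving phase, carried between calls

function water(src_y,dst_y,h)
 local a=wt
 for i=0,h-1 do
  -- read upwards from src_y, with a sine wobble per line
  local ry=src_y-i+flr(sin(a)*2+0.5)
  memcpy(0x6000+(dst_y+i)*64, 0x6000+ry*64, 64)
  a+=0.04
 end
 wt+=0.01  -- phase drift makes the ripples travel like waves
end
```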

The Plasma module is used again here to add some subtle ripples on the water surface. I'm quite happy with how that turned out because with the small window size it creates some quite diverse shapes in the water.  With ilkke's great map layer work behind and in front of the water I think this was quite effective in the end.

Scene 7 : Shaded vectorbobs

Here we use the Vector Bobs again on a 5x5 cube.  The lighting is again calculated from the bob's X+Y position against the Sun sprite's X+Y position.  It averages this out so that the X + Y results are always positive.  If both results are under a certain threshold (a simple way of saying the bob is close enough to the sun to be lit) then we do some division on both values to end up with a final result that lands somewhere in the range of bob sprites in the bank.  I think we had 8 lighting values in the end, and as long as they're in the right sprite order, bobs closest to the sun get the brightest colours.

To give the calculation a bit more of a 3d appearance the X + Y are divided by different amounts (half for Y) and I use the bob's Z position as a divider to give it more of a spread across the cube surface.   These are, of course, 'demoscene calculations' in that you keep tweaking until you get what you want for the effect.
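The shape of the calculation was something like this, though every constant below is invented to illustrate the idea rather than taken from the demo:

```lua
-- returns 0 (dark) to 7 (bright): an index into the shade bank
function bob_shade(bx,by,bz,sunx,suny)
 local dx=abs(bx-sunx)
 local dy=abs(by-suny)/2       -- y counts for half
 local d=(dx+dy)/(1+bz/32)     -- z divides, spreading the light
 if d>=64 then return 0 end    -- past the threshold: unlit
 return 7-flr(d/64*7.99)
end
```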

Anyway I was quite happy with this one, that lighting spread was my main aim and it gives the right appearance even if the maths are very far from a proper lighting model.

Scene 8 : Vertical Rasters

Another oldskool effect, I was quite happy to see that this can be done in nearly the same way I've done it on 8-bit machines before.  Basically you have a 1-pixel high horizontal strip which is plotted onto each vertical line of the screen.  For each vertical line you plot into it your vertical bar graphic at a sine calculation.  You then update the sine calculation, wait for the next vertical line and plot your bar graphic into it again.  You continue doing this for the whole screen and because you're plotting into the same pixel line of graphics without clearing it you end up with the 3d snakey effect above.  

In the pico-8 version I have a one-pixel high strip of sprites which is plotted into as above and then I use the map command to redraw it one pixel down each time.  Simpler than doing it on a Vic-20 obviously. :) 
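A sketch of the Pico-8 version, though using sspr for the one-pixel strip where the real thing used the map command (constants invented):

```lua
ang=0
function rasters_draw()
 local a=ang
 for y=0,127 do
  -- plot the bar into row 0 of the sprite sheet, no clearing,
  -- so old bars stay behind and build up the snake effect
  local x=60+sin(a)*44
  for i=0,7 do sset(x+i,0,7) end
  -- stamp the strip onto this screen line, one pixel down each time
  sspr(0,0,128,1,0,y)
  a+=0.005
 end
 ang+=0.013
end
```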

To enhance it a bit I use the Map X position to give me some wobbly sine offsets, and I do the loop in reverse order so it comes out upside down.   That's very easy on here where you're not chasing the raster beam down the screen, but it's also a homage to the one in LDA STAin by Clarence/Chorus which was a complete headscratcher for me when it came out.   He then went one better in Rocketry by drawing them in reverse Z order, amazing stuff.

Finally as I'm plotting into a map I can make use of transparency and have some bubble sprites behind the rasters to add depth, the horizon also bobs up and down using Rectfill and the raster sine position calculation.  Everything else in this scene is handled by the Map module and Water so I can hide the logo behind it when I need to.

Scene 9 : Glenz vectors

Classic glenz vectors; I think the first of these I saw was in Enigma by Phenomena, though those were centrally lit.  This uses no new modules but adds a moving sun sprite.  The lighting calculation is much simpler here and only uses the sun's X position relative to each triangle's X position.  There is no threshold check, it just divides down to a nicely usable number for an array of the lighting colours: darkest to brightest.

So that's all the vector routines in the demo.  One I didn't get to use was an extra mode which combined wireframe with filled to give a pseudo-drawn effect.  The wireframe has a random offset from the origin each frame to make it look more sketched.  As it didn't fit into the design I didn't remove the triangle joins, which would have made it look neater.

Scene 10 : Mirror bobs

So here is the final screen.  This almost didn't get in because we'd started running out of memory in the glenz vector part.  To claw it back I turned all the init functions into one big unwieldy if..else list, generalised some variables, and finally hard-coded some movement routines into the parts they're used in (like the shade bobs and glenz vectors).

In this part the vector bobs are plotted to the same origin position on an empty screen, which is then copied to a 4x4 block of sprites in one of the banks.  The screen is then cleared and 4 map strips are drawn onto the screen next to each other, these strips contain the same sprites we just plotted into, so we get a mirror copy of the vector bobs all over the screen.  To add variety the map strips are offset by 1 y map position each and are longer than the display so they can be shifted vertically and then reset every 4 map blocks for a seamless moving background.  I put an earlier version of this on my twitter which got quite a few likes, so thanks for that:
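The screen-to-sprite grab is just a memcpy per line; something like this, with the sizes assumed:

```lua
-- copy a 32x32 screen block into the top-left of the sprite
-- sheet (a 4x4 block of 8x8 sprites); both are 4 bits/pixel
-- with 64-byte lines, so 32 pixels = 16 bytes
function grab_block(sx,sy)
 for y=0,31 do
  memcpy(0x0000+y*64, 0x6000+(sy+y)*64+sx/2, 16)
 end
end
```

After the grab, the map strips pointing at those sprites repeat the image across the screen.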

As there was some cpu time left I put the logo scroller over it for the finale.  I think this part is still running in a frame, stat(1) says yes but it does look like it staggers occasionally, maybe that's just the draw update.


In most scenes there's a transition at the start and end, this just uses Rectfill and each of the X+Y Start and End points can have a number added or subtracted to give different fills and reveals. (with different colours)  I hard-coded the timing for when these start and end (start is literally at the start of a new screen, end is a few beats into the last pattern of the part) which worked ok here but isn't that flexible obviously.


One rule of thumb for doing music for demos (and trailers too) is to have the music first and time your scenes to it.  But personally, if I'm coding the demo as well, the last thing I'm thinking about is writing the music.  One obvious flaw with writing it last is that you are stuck with the timing of the demo, which usually won't be as bar-friendly as you'd like.   You'll notice the music in the demo doesn't really follow much of a song structure because the scenes aren't very even in length.

One thing I will say though is make use of soundfx in your music.  Having the ship engine whooshes and odd portamento effects really helped with dynamics. (One thing I should have put in was a 'Shake' module for shifting the camera around, but I got around some of that using sound fx)  This is still something that doesn't happen very often even in modern PC demos and I don't really understand why.


Thanks for reading, hope you enjoyed watching the demo.  Massive thanks to ilkke for all his support working on this.