zahlman

Figuring out sound - the nuts and bolts


Okay, so approximately forever ago there was this big buzz on SoS about some leaked dump of FE sound dev stuff. It was all totally disorganized and mostly didn't say anything new, but one thing I did get out of it was an object file for the FE7 sound code (IIRC it's really the same in 6 and 8, but at a different offset in the ROM) - no actually useful source, but standard tools were able to extract some labels... which meant (a) knowing where in the ROM to look for the code that handles sound; (b) having some names for functions and stuff.

So recently I set to disassembling that chunk, figuring out which pieces are ARM and such. It's a big mess, but here are my findings so far...

- The code is full of references to a magic value 0x68736D53. This seems to be just a random value that's chosen to be unique and to indicate the start of blocks of IRAM reserved for sound processing. But there might be some subtle reason for the exact value that I haven't figured out yet.
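Actually, there's a simple explanation for the exact value: read as four little-endian bytes, it's plain ASCII. A quick check (Python):

```python
import struct

# Interpret the 32-bit magic value as four little-endian bytes and
# decode them as ASCII text.
magic = 0x68736D53
text = struct.pack("<I", magic).decode("ascii")
print(text)  # -> "Smsh"
```

So it's almost certainly a deliberate signature string rather than a random cookie (what "Smsh" abbreviates, I couldn't say).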

- There's one main routine, "SoundMainRAM", which, as the name suggests, gets copied to IRAM at startup. It is a badass mofo; the startup code block-copies an entire kilobyte (just to be safe, I guess), and the routine is actually almost that long (932 bytes, so the copy runs up until the middle of some other function, lol). Complicating matters further, it switches between ARM and Thumb code a couple of times. The routine is copied from the ROM starting at 0x0BD8E4, and goes to offset 0x2450 in IRAM. (There are a couple of functions before this in the object dump: a short utility that switches to ARM to use its 32-bit long multiplication functionality, and something called "SoundMain" that does some basic setup work, checks sync, calls a couple of other functions I haven't really looked at yet, and then launches into the IRAM copy of SoundMainRAM.)

- There's another huge chunk of IRAM that's dedicated to sound stuff, starting at 0x4AE0 in IRAM. This block starts with the magic value, but it periodically switches to 1 higher and then back; I think this has something to do with sync. This region looks like this:

Header: the cookie, 24 bytes of... something, then 11 pointers to various places: asm routines in the ROM, and more sound-related stuff in the IRAM.

16 bytes padding (?).

12 (?) of what I am pretty sure are structures representing "sound channels": each seems to be 64 bytes, the last 16 of which are usually zero. There's room for 12, anyway, but I don't think more than 8 ever get used.

This bears an eerie resemblance to the SoundArea structure described at http://nocash.emubase.de/gbatek.htm#biossoundfunctions, except that certain things just don't seem to quite line up. (In fact, it seems like a LOT of this code is doing stuff that should be done, or at least be doable, by built-in BIOS routines that don't actually get called.)
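One quick sanity check on those guessed sizes is just to sum them (my arithmetic, nothing more):

```python
# Sum the guessed pieces of the IRAM sound block (base offset 0x4AE0).
BASE     = 0x4AE0
cookie   = 4        # the 0x68736D53 magic value
unknown  = 24       # the 24 bytes of... something
pointers = 11 * 4   # 11 pointers to ROM routines / IRAM data
padding  = 16       # the apparent padding
channels = 12 * 64  # 12 presumed 64-byte channel structures

end = BASE + cookie + unknown + pointers + padding + channels
print(hex(end))  # -> 0x4e38
```

That lands 8 bytes past the PCM_DMA_BUF at 0x4E30 described below, so at least one of the guessed sizes (the padding, maybe?) is probably 8 bytes too big — more evidence that things don't quite line up yet.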

(attached image: c4ybv.png)

I suspect that within each structure, the blue box is the actual "WaveData*", i.e. a pointer to a ROM location which is the start of the sample data (header for the sample data, really), and the black box is a pointer to the current position in that sample.
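If the blue box really is a WaveData*, the 16 bytes it points at should follow the sample-header layout that community docs give for this engine elsewhere. A parsing sketch (field names and meanings are from those docs, not confirmed from this disassembly):

```python
import struct

def parse_wavedata_header(raw):
    """Parse the 16-byte sample header the blue-box pointer should target.

    Layout per community docs for the Sappy engine (unconfirmed here):
      u16 type, u16 status, u32 freq (scaled rate), u32 loopStart, u32 size
    The 8-bit sample data follows immediately after this header in ROM.
    """
    wtype, status, freq, loop_start, size = struct.unpack_from("<HHIII", raw, 0)
    return {
        "type": wtype,           # format/flags word
        "status": status,        # reportedly holds the loop flag
        "pitch": freq,           # playback rate, scaled fixed-point
        "loop_start": loop_start,  # sample index where the loop restarts
        "length": size,          # number of samples following the header
    }
```

If that layout holds, the black box would then stay between WaveData+16 and WaveData+16+size as the sample plays.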

Anyway, right after the space taken up by the supposed 12th sound channel, starting at 0x4E30 in the IRAM, is the "PCM_DMA_BUF": 2 regions of 0x630 bytes each, the last 0x10 of each of which seem to be always zero. It seems that sound is mixed and copied into these buffers, and then hardware transfers are used to dump it from there directly into the GBA's custom sound hardware. I have no idea why there seems to be that 0x10 bytes of padding, nor why the size is what it is; but there's clearly a constant built into the code for it.
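One hedged guess at where 0x630 comes from: community docs for this driver family report a default mixing rate of 13379 Hz and a DMA refill every 7 frames. If those numbers apply here, the useful part of each buffer works out to exactly 0x620 bytes, which would explain the 0x10 always-zero bytes as slack:

```python
# Speculative derivation of the 0x630 buffer size. The rate and refill
# period are from community docs for this driver, not from this disassembly.
MIXING_RATE = 13379        # Hz, reported default driver sample rate
VBLANK_RATE = 59.7275      # GBA display frames per second
FRAMES_PER_DMA = 7         # reported refill period, in frames

samples_per_frame = round(MIXING_RATE / VBLANK_RATE)  # 224 8-bit samples
useful_bytes = samples_per_frame * FRAMES_PER_DMA     # 1568 = 0x620
print(hex(useful_bytes))  # -> 0x620
```

That would make the buffer size (0x630 minus 0x10 slack) a straight consequence of rate × refill period, but take it as a guess until the constant in the code is traced.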

There's a bunch more sound stuff after that, too, and then a bunch of empty space, and then something that I think is mostly to do with graphics, and then right at the end of IRAM, at 0x7FF0, is... a pointer to the SoundArea. :/

ANYWAY. Basically I personally probably don't care much, if at all, about the stuff just after the PCM_DMA_BUF (I'm guessing it has to do with the Sappy tracking stuff). What I'd really like to figure out is where the code is manipulating that black box, and moving data from the sound sample in ROM into the PCM_DMA_BUF (or perhaps into some other temporary work area?). What I'm hoping is that I can hook into this process to implement some kind of compression. It would have to be extremely simple (even if you didn't have to do anything other than sound, you'd only get something like 500 CPU cycles per sound cycle), but I do have something in mind. If I'm right, we can free up a few megs of space (the samples in the FE7 ROM take up just over 4MB, and I'm optimistic I can get close to 4:1 based on some earlier experiments).
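For flavor, the sort of scheme that fits a budget of a few hundred cycles per sample is something like delta coding: store first differences, then let a bit-packing stage exploit the fact that most deltas are tiny. A toy sketch of the idea (my illustration only, not necessarily the scheme being planned here):

```python
def delta_encode(samples):
    """First-difference coding of signed 8-bit PCM (values -128..127).

    Adjacent samples are usually close, so most deltas are small; a second
    bit-packing stage could then store the common small deltas in fewer bits.
    """
    out, prev = [], 0
    for s in samples:
        out.append((s - prev) & 0xFF)  # delta, wrapped into one byte
        prev = s
    return out

def delta_decode(deltas):
    """Inverse transform: one add + wrap per sample, cheap in real time."""
    out, prev = [], 0
    for d in deltas:
        prev = ((prev + d + 128) & 0xFF) - 128  # wrap back into -128..127
        out.append(prev)
    return out
```

Decoding is a single add and mask per sample, which is the kind of cost that could plausibly hide inside the existing mixing loop.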

So. Cam? SoC? PM me or something? Attached: my disassembly.

dump.txt


wrong section dude.

If you think there is a topic in the wrong section, you can report the post so that we see it, just for future reference! It's like a double post; no one gets hurt, it's just brought to staff attention so it can be fixed.:P

And with that, moved to hacking questions.


It's more of a resource/question combination; there's some good documentation in here.

Wrong section, ma'am.


oh dear that is a nasty block of code

judging from the specs in the link, it actually matches up to SoundArea pretty perfectly (although you already knew that) except that IS is dumb and recoded a lot of stuff that should be handled by BIOS

i'll look at it more in-depth later

EDIT

yeah i'd move it into resources too

Edited by CT075


Okie! I'll trust you guys and move it to resources then!

Thanks. :)


Thanks. I couldn't really decide where it should go, because I'm not asking anything clear yet, and I don't have anything close to usable by the community either. Anyway :) Yeah, it seems to be reinventing a lot of stuff, but I'm not sure it can be blamed on IS; I suspect that a lot of this comes right from Nintendo. (At any rate, I'm pretty sure that something that switches between ARM and Thumb in the middle of a function would have to have been written originally in ASM, while most of the code looks, honestly, like a debug C build.) And certainly the Sappy standard (as we're calling it) appears in a LOT of other games (FFS, the tool Sappy comes from the Pokemon hacking community). Maybe Nintendo was thinking ahead; or rather, maybe the GBA BIOS sound routines were a test that didn't really get off the ground? If I'm reading the docs right, they won't actually work right on a DS. No idea why they would change the SWI numbers around, though. o_O


It's my thread and it's been a whole week and there's something important to say so imma go ahead and bump this.

Bad news.

I tried implementing the compression I had in mind and... it didn't work... like the output was correct and everything, but it wasn't actually shorter. So I looked at the data and ran a bunch of tests and statistics and discovered that

- I'm going to need a slightly more sophisticated approach to get any compression at all

- It's still going to be really bad compression. That 4:1 I was hoping for is looking more like 4:3 at the moment. :(

Still, that would be over a megabyte saved, so I'm not giving up on this yet. Also it's interesting and cool and what-not. Although I won't be surprised if it can't be done in real-time on the GBA anyway. At least if this doesn't work, it will totally wtfpwn anyone who still imagines that something like MP3 is possible.

Edited by zahlman


can you figure out how to get expression controls working in MIDIs first

oh well :(


can you figure out how to get expression controls working in MIDIs first

First I'd have to understand what they do. Feel free to link me some docs.

The problem is, MIDI is really a terrible format... fundamentally it's designed for logging, not for tracking

oh well :(

Well I coded up the compressor anyway and made a compressed binary dump... the savings are about 22%, which is still almost a megabyte for FE7. Probably I could do significantly better with a lossy scheme but making one that doesn't completely ruin everything (I mean they're already 8-bit samples) would require even more genius than I can summon at the moment. And yeah no guarantees there's enough CPU time for it but it's still worth trying, if only to prove understanding of the code.

Actually, if you're interested I could show you how to dump the samples as AIFF, and you could take a shot at cleaning them up (less static = better for compression probably, although you really can't do anything about buzzy percussion) and/or shortening the ones that loop (I don't think the loop-point information gets reflected in formats like AIFF though).

BTW I'm planning a ZSE re-release soon... to migrate to Python 3, and to take advantage of some stuff I developed while working on NM3 for making the command-line interface.

Edited by zahlman


Alright, well I'm not sure how well-versed in MIDIs you are so I'll assume you know next to nothing and explain the three main volume controllers for MIDI.

[spoiler=MIDI shit]You've got volume, velocity (note on), velocity (note off), and expression. Volume is just the overall volume for the whole track (until there's a change, at least) and maxes out at 127. Simple enough. Velocity (note off) is almost always 0 unless you're really weird and there's something wrong with you, so we'll ignore it.

Velocity is the relative level of volume at which the start of the note plays. It allows for variation between notes, AND in Fire Emblem songs (I haven't seen this used anywhere else) it's used to create a sort of "reverb". For instance, a staccato 1/16th note with native instruments sounds crappy. By placing another note of a short length directly afterwards, BUT at approximately half the velocity, you can make things sound like they have reverb and thus take away the harshness of the note. Examples:

With velocity all the same: https://dl.dropbox.com/u/51331815/samplevelocity2.mp3

With varied velocity between notes: https://dl.dropbox.com/u/51331815/samplevelocity1.mp3

(continued in pt 2)

Edited by Agro


pt 2 since I can't post any more than a certain amount of media files, apparently:

Normal drums: https://dl.dropbox.com/u/51331815/samplevelocity4.mp3

Drums with the "reverb" velocity trick: https://dl.dropbox.com/u/51331815/samplevelocity3.mp3

Finally, there's expression. Whereas volume controls the volume of the entire track, and velocity controls the volume of the start of a note, expression controls what happens during a note. Essentially, it allows for crescendos and decrescendos, as well as fadeouts, etc. etc. The lack of it is most notable during songs that are supposed to have notes that fade out. Example:

Flute without expression controls: https://dl.dropbox.com/u/51331815/without%20expression.mid

Flute with expression controls: https://dl.dropbox.com/u/51331815/with%20expression.mid

The whole point of the two additional controls, velocity and expression, is to allow for easier volume adjustment later down the track. All of the above are possible with volume alone, but if you suddenly decided that the track was too loud, you'd have a hell of a time trying to readjust every single note, and so the other two options exist for that purpose (since you can leave expression and velocity intact while only changing 1-2 values for volume).
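In raw MIDI bytes, those three levels look like this — Control Change #7 for volume, #11 for expression, and the velocity byte inside Note On itself (this much is standard MIDI, so it's safe to state; the note values below are just examples):

```python
def control_change(channel, controller, value):
    """Build a raw 3-byte MIDI Control Change message (status 0xB0 | channel)."""
    return bytes([0xB0 | channel, controller, value])

def note_on(channel, key, velocity):
    """Build a raw Note On message; the velocity byte sets the per-note level."""
    return bytes([0x90 | channel, key, velocity])

VOLUME_CC     = 7    # whole-track level (until the next change)
EXPRESSION_CC = 11   # within-track shaping: crescendos, fadeouts, etc.

track_volume = control_change(0, VOLUME_CC, 100)     # set channel 0's overall volume
fade_step    = control_change(0, EXPRESSION_CC, 64)  # one step of a fadeout
loud_note    = note_on(0, 60, 112)                   # middle C, strong attack
echo_note    = note_on(0, 60, 56)                    # the "reverb" trick: ~half velocity
```

A fadeout is just a run of expression messages with decreasing values over the duration of the note, which is exactly the part a tracker-style converter has to translate somehow.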

The current version of Zahlman's Song Editor recognises velocity (thank god) but doesn't recognise "expression". Those are the 1456 "unimportant control change(s)" that it ignores (though I'm sure you know this seeing as you made it). It wouldn't be a problem for me because I can do fadeouts and such with the volume controls, EXCEPT that in ZSE if you have a note that changes volume halfway through, instead of the note changing volume when it's inserted, it will be split into two notes. This is why reinserting MIDIs from FE7 with ZSE doesn't work so well.

Was that what you meant when you said you didn't know what expression controls did?

The AIFF shit is kind of out of my league, though. Everything I've just explained to you I learned through dissecting heaps of DS and GBA MIDIs and doing a lot of reading and trial and error etc.

(pitch wheel control would also be nice if possible)

(oh and panning)

Edited by Agro


I realise this is a really bad necropost, but I just wanted to provide a little bit of closure to this thread since I don't think we reached an awful lot of conclusions and I needed to make a correction anyway. The past year I've learned a lot more about music hacking so I guess I can come back to this a little more learned.

Actually, if you're interested I could show you how to dump the samples as AIFF, and you could take a shot at cleaning them up (less static = better for compression probably, although you really can't do anything about buzzy percussion) and/or shortening the ones that loop (I don't think the loop-point information gets reflected in formats like AIFF though).

I had a crack at this and, to be honest, it's a pointless exercise. Samples are designed to loop at a specific point, and changing the looping point makes them sound really weird and not very good overall. The "static" actually has little to do with the samples--it's due to the GBA playing them back at a very low sampling rate, which causes noise. VBA actually has an option to reduce this noise (the "Low pass filter"), but you could also make the game play them at higher sample rates by changing a byte at around $BE578, which is where the Sound Driver Operation Mode is. However, this reduces game performance.

I seem to recall a discussion elsewhere about trimming them so that there's less noise at the end of notes, but this is actually caused by the envelopes of the samples. We tend to hear noise more when there's less music playing, and most of the samples have a release value of around 165 so you're bound to hear noise at the end of any given note.
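For context on why a release of ~165 leaves an audible tail: community docs for this driver family describe release as a per-frame multiplier of release/256 on the envelope level. If that applies here, the tail after a note-off lasts roughly a dozen frames (my estimate under that assumption):

```python
import math

# Hedged estimate: assumes the envelope decays by a factor of release/256
# each frame during release, as community docs describe for this driver.
RELEASE = 165
FRAME_RATE = 59.7275  # GBA frames per second

# Frames until the envelope falls below 1/256 of full scale,
# i.e. below the bottom bit of an 8-bit sample.
frames = math.ceil(math.log(1 / 256) / math.log(RELEASE / 256))
print(frames, round(frames / FRAME_RATE, 2))  # -> 13 frames, ~0.22 s
```

A fifth of a second of decaying sample per note is plenty of time to hear hiss once the rest of the mix has gone quiet.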

Speaking of envelopes, the reason FE7 has separate voicegroups for each of its songs is because Yuka Tsujiyoko decided she wanted to have different envelopes for the same samples across different songs. Thus, a separate table is needed for each song.

Edited by Agro


I... guess I shouldn't be surprised by any of that, really. :/ There are still optimizations that are possible on the instrument tables, I'm sure, but whatever. It wouldn't add up to much.

Edited by zahlman

