Hacking Homebrew game Extracting sprites - file system scrambled?

Horizon_

New Member
OP
Newbie
Joined
Sep 27, 2022
Messages
4
Trophies
0
Age
31
Location
London
XP
49
Country
United Kingdom
Hey all,

I posted this in the noob thread to no response so just hoping it can get some more visibility here.

Has anyone seen this situation before, where the file system appears to be scrambled?

Here's how Age of Empires: Age of Kings DS looks in Tinke:
[Attached screenshot: Capture.PNG]


Extracting sounds in bulk with ndssndext was easy so it's clearly not an unsolvable problem. I'm hoping there's some step I'm missing or tool I have to use to get the sprites out of the above mess. Doing this the hard way with screenshots would obviously not be ideal! 😅

Thanks in advance for any advice you can give.
 

FAST6191

Techromancer
Editorial Team
Joined
Nov 21, 2005
Messages
36,798
Trophies
3
XP
28,348
Country
United Kingdom
Do we have a noob thread for ROM hacking?

Anyway, file names are just a courtesy on the DS (internally the game can happily call files by number). Occasionally you do get a ROM where you are not given such luxuries, by one means or another.
I can't rule out some internal scrambling/obfuscation, but it would be a first outside of a ROM hacker many years ago trying the protectionist mindset.
I don't recall it being the case in Age of Empires, but that might have been the other game I looked at. Do check other regions as well in case it was a one-off.

ndssndext (by the way, vgmtrans and vgmtoolbox are both far more recommended programs these days; ndssndext's MIDI conversion is better than the even older sseq2mid, but not by a lot compared to the actually accurate approaches) would likely have done some kind of header/magic stamp inference, which is also what I would suggest next.

In most DS formats (not all by any means, but most) the file will tend to start with a given value ( https://www.romhacking.net/documents/[469]nds_formats.htm ) or have one a few bytes in. Knowing this, you can infer a lot of things where extensions might fail you. The major exception is compressed data, though that is also useful: compression usually means something interesting, and most of the time a compressed file starts with 10h and the file size, or 11h and the file size (both of these being flavours of LZ/BIOS compression), or 40h for some later games (Golden Sun being the first it was seen in). There are some others for Huffman, but eh. https://www.romhacking.net/utilities/826/ and dsdecmp are what I would point at for compression tools.
Indeed, these first bytes (after everything is decompressed, of course), i.e. the magic stamp, will often be what I use as an extension if I am pulling apart such a game (or, more likely, an archive file within a more conventional game) in the absence of something better.
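
As a rough illustration of that header check, here is a minimal Python sketch. The function name `sniff_header` and the label strings are made up for this example, and it assumes the usual BIOS-style layout where the marker byte is followed by a 3-byte little-endian size; that layout is an assumption for the 40h variant in particular.

```python
# Marker bytes named above. The label strings are just for display.
COMPRESSION_MARKERS = {0x10: "LZ10", 0x11: "LZ11", 0x40: "LZ40"}

def sniff_header(data: bytes) -> str:
    """Guess a label for a file from its first four bytes (hypothetical helper)."""
    if len(data) < 4:
        return "too short"
    magic = data[:4]
    # Four ASCII letters/digits usually indicate a magic stamp such as SDAT.
    if magic.isalnum():
        return "magic:" + magic.decode("ascii")
    kind = COMPRESSION_MARKERS.get(data[0])
    if kind is not None:
        # Assumption: the next 3 bytes are the unpacked size, little-endian.
        size = int.from_bytes(data[1:4], "little")
        return f"{kind} (claims {size} bytes unpacked)"
    # Nothing recognised: show the raw bytes as hex for manual inspection.
    return "unknown:" + magic.hex()
```

A claimed size that is wildly larger than the file itself is a hint that the first byte matched a marker by coincidence rather than because the file is compressed.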

You can also carry on with brute force (open in a tile editor and press down a lot; alternatively swap files around, corrupt them, tweak data in them and see what changes in the game), relative search ( https://www.romhacking.net/utilities/513/ ), maths-based search (all sorts of mathematical features of language mean finding text is easier than many other things), pointer inference (harder on the DS with individual files but still able to yield useful info), back searching (find data in RAM when loaded by the game*, then search the ROM for it)...

*Do remember you have cheats, saves you can download off the internet, and turbo to speed that process up, and again when we get to the method below.
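
To show what the relative search mentioned above is doing, here is a minimal Python sketch. It is not how the linked tool is implemented; `relative_search` is a name invented for this example, and it assumes a single-byte text encoding where letters keep their alphabetical spacing even if the table is shifted.

```python
def relative_search(data: bytes, word: str) -> list[int]:
    """Return offsets where byte-to-byte differences match those of `word`."""
    if len(word) < 2:
        raise ValueError("need at least two characters to compare differences")
    # Differences between consecutive characters, e.g. "abc" -> [1, 1].
    wanted = [ord(b) - ord(a) for a, b in zip(word, word[1:])]
    hits = []
    for start in range(len(data) - len(word) + 1):
        window = data[start:start + len(word)]
        diffs = [window[i + 1] - window[i] for i in range(len(window) - 1)]
        if diffs == wanted:
            hits.append(start)
    return hits
```

Short words give many false hits, so a longer word known to appear in the game (in one case, all lower or all upper) works better.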

After this we get to tracing. It is more annoying on the DS than on some earlier systems but still within reason.
On earlier systems the cartridge was at some level visible in memory. On anything that is not a cartridge (floppy disc, CD, DVD...), and on the DS or newer, chances are there is a means by which the system speaks to the data storage device and gets things from it.
The DS is within that paradigm. http://problemkaputt.de/gbatek.htm#dscartridgeprotocol
-- Main Data Load --
B7aaaaaaaa000000h Encrypted Data Read
The parameter digits contained in above commands are:
aaaaaaaa 32bit ROM address (command B7 can access only 8000h and up)
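
For illustration, pulling the address out of such a command could look like the sketch below. The helper name `parse_b7_command` is hypothetical, and it assumes you have the 8 command bytes in the order shown in the excerpt (B7 first, then the address most significant byte first).

```python
def parse_b7_command(command: bytes) -> int:
    """Return the 32-bit ROM address from an 8-byte B7 read command."""
    if len(command) != 8 or command[0] != 0xB7:
        raise ValueError("not a B7 data read command")
    # Bytes 1..4 hold the address, most significant byte first.
    return int.from_bytes(command[1:5], "big")
```

For example, `parse_b7_command(bytes.fromhex("B700123400000000"))` gives the ROM address 0x00123400.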

This leads to a method called tracing.
Find a point in the game where the data you want is loaded (again, you have cheats and all manner of other things to speed this up, along with the ability to move a bit laterally: you don't need to get that end-game sword via a 50-hour sidequest if the graphics are a few hundred bytes on from the starter dagger). Note where it is in memory and set a break on write to that area. Then restore a savestate from earlier, exit the level, exit the room, or do whatever is needed to clear it from memory, and go back in so that it loads again. The breakpoint will hopefully ping up, and you then work back to the B7 command and thus know what file it corresponds to.
https://www.romhacking.net/documents/361/ is for the GBA but the principles apply in general for most things.
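
Once you have a ROM address from the B7 command, the last step (working out which file it belongs to) can be sketched in Python as below. This relies on the header layout documented in GBATEK (file allocation table offset at 48h, size at 4Ch, 8 bytes per entry holding a start and end address); the function name `file_id_for_address` is made up for this example.

```python
import struct

def file_id_for_address(rom: bytes, address: int):
    """Return the FAT index (file ID) whose data range contains `address`, or None."""
    # FAT offset and size sit at 0x48 and 0x4C in the ROM header (per GBATEK).
    fat_offset, fat_size = struct.unpack_from("<II", rom, 0x48)
    for file_id in range(fat_size // 8):
        # Each entry is a start address and an end address, both little-endian.
        start, end = struct.unpack_from("<II", rom, fat_offset + file_id * 8)
        if start <= address < end:
            return file_id
    return None
```

You would read the whole .nds file into `rom` first. The returned number is the file's position in the allocation table, which is how the game refers to files when names are absent.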
 

Horizon_

Thank you, FAST6191!

Your posts here show up a lot when Googling these issues... you are definitely seen and appreciated!

I'll dig into this info and come back with anything further.
 

Horizon_

Thanks again for the detailed reply. I've finally found some time to dig into this further.

I have the full file system extracted in both
compressed (Tinke: select root folder -> extract)
and uncompressed (Tinke: select root folder -> unpack -> extract)
formats. I've tried to read the first few bytes of the files (in both sets) in various encodings but came up with nothing useful: no compression format indicators and no magic IDs, except for the one SDAT file. I have no idea where to go next.

Here's what I ended up doing, trying to find something useful in the first few bytes (or after the first few bytes, as you said).
Python:
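import os

mypath = "extracted"  # placeholder: folder holding the files extracted with Tinke

for dirpath, dirnames, filenames in os.walk(mypath):
    for filename in filenames:
        with open(os.path.join(dirpath, filename), 'rb') as f:
            # Read the first 12 bytes, where a magic ID would normally sit.
            my_bytes = f.read(12)
            # Decode as text, silently dropping bytes invalid in this encoding.
            print(str(my_bytes, 'u8', 'ignore'))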
Again, I tested several encodings (u8, u16, u32, ascii) but never got anything useful.

Any tips for next steps? Thanks in advance!
 
