Homebrew TWLbf - a tool to brute force DSi Console ID or EMMC CID

enderghast13 · Aug 24, 2017

JimmyZ said:
Thanks to all of you for sharing!

Thank you for the detailed report and our first EMMC CID report! I assume they're all US region based on your location?

I don't know much about that, aren't 3DS like totally hacked already?
And this tool actually doesn't support 3DS TWL FIRM, they're encrypted differently according to GBATEK and TWLTool.
I could add support for it if the need arises, though.

Yes, these are all US region, sorry for forgetting to specify.
About my 3DS question, I was just wondering if they would be useful for archival or research purposes as there isn't much documentation on 3DS Console IDs, though I do understand that bruteforcing 3DS Console IDs is beyond the scope of this tool and practically useless to do anyways.

wicksand420 · Aug 24, 2017

Is there a way to dump the nand without hardmod yet?

JimmyZ · Aug 24, 2017

nocash123 said:
I've had a look at polarssl and openssl some years ago when trying to "understand how AES works"... openssl looked very confusing, and polarssl looked a bit straighter (but still very confusing and overcomplicated)... anyways, as far as I remember polarssl did support AES hardware acceleration, too. So both might be same as long as you have a PC with AES-NI support (which seems to have been invented in 2010).

Yes, PolarSSL supported AES-NI, but the version included in TWLTool doesn't; otherwise TWLbf with mbed TLS (PolarSSL's new name) wouldn't be 3 times faster than the previous TWLTool brute force mod.

nocash123 said:
For multi-core CPUs, I wonder if each core is having its own AES hardware? If not, then multi-threading won't actually speedup the calculations.

AES-NI is a new set of instructions. I believe each core has its own AES-NI, like MMX/SSE/AVX; it's not a separate AES engine device like on the DSi.

nocash123 said:
Is that using an "optimized" SHA1 function? One older optimization mentioned here https://software.intel.com/en-us/articles/improving-the-performance-of-the-secure-hash-algorithm-1 uses general-purpose SSSE3 instructions (this should be also implemented in openssl).

I really don't know, I just call SHA1 from OpenSSL and/or mbed TLS, I suppose I couldn't do much better than them, at least in a short amount of time.

OpenSSL forces us to use a high-level interface called EVP to utilize hardware-specific optimizations like AES-NI. The result is that on large blocks (like 8K bytes) EVP AES can be 5 times faster with AES-NI, and 1.8 times faster than mbed TLS with AES-NI (supposedly because mbed TLS lacks pipeline optimization), but on 16-byte blocks mbed TLS is 2.5 times faster than OpenSSL EVP. I talked about this in the GitHub readme.

And for SHA1, the EVP interface and the low-level interface are about the same speed (you can test this with "openssl speed -evp sha1" and "openssl speed sha1"); the latter is a tiny bit faster than EVP on 16-byte blocks (which is our use case). So it's either not hardware-specifically optimized or optimized to the same level in both. ~~I suspect the 1st guess, I suppose in crypto libs targeting TLS, SHA1 performance is not that important.~~ Apparently SHA1 in OpenSSL uses lots of exotic SIMD optimizations including SSE/AVX(2)/SHAEXT, but when working on 16-byte blocks it's only about 1?% faster than mbed TLS's C code.

nocash123 said:
And newer intel processors should have extra opcodes SHA1RNDS4, SHA1NEXTE, SHA1MSG1/2 (not sure if/when/where that's supported, intel announced that stuff in 2013, but some other webpage mentioned it not being implemented until 2016, or so).

My ignorance! I'd never heard of those SHA1 instructions. A quick search shows they're used in OpenSSL but not in mbed TLS:
https://github.com/openssl/openssl/search?utf8=✓&q=sha1rnds4&type=
https://github.com/ARMmbed/mbedtls/search?utf8=✓&q=sha1rnds&type=

nocash123 said:
And, another (small) optimization would be appending the sha1-end-byte and sha1-padding-bytes to the CID, and then passing that directly to the 64-byte-sha1-core function (ie. avoiding the same padding to be repeated on each calculation).

Yeah, I thought about that too, but my assumption is that it won't impact performance much?
It might benefit more from preparing a series of CIDs as 512-bit blocks and feeding them to a pipeline-optimized sha1-core function, but that would only benefit the EMMC CID brute force, which in my opinion is less useful, and it would require me to write/modify crypto lib code; as of now I just call the libraries.
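
For readers following along, the padding idea quoted above can be sketched in Python. This is only an illustration under stated assumptions, not TWLbf's actual code: `sha1_compress`, `PAD_TAIL` and `sha1_of_16_bytes` are hypothetical names, and the 16-byte input length is assumed from the CID size discussed in this thread. For a 16-byte message the padding is always the same 48 bytes (0x80, zeros, then the 64-bit big-endian bit length 128), so it can be built once and the 64-byte block passed straight to the compression function.

```python
import hashlib
import struct

# Standard SHA-1 initial state (FIPS 180-4).
H0 = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)

# Fixed padding for any 16-byte message: 0x80, 39 zero bytes,
# then the message length in bits (128) as a 64-bit big-endian integer.
PAD_TAIL = b"\x80" + b"\x00" * 39 + struct.pack(">Q", 16 * 8)


def rol(x, n):
    """Rotate a 32-bit value left by n bits."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def sha1_compress(state, block):
    """Run the SHA-1 compression function on one 64-byte block."""
    assert len(block) == 64
    w = list(struct.unpack(">16I", block))
    for i in range(16, 80):
        w.append(rol(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1))
    a, b, c, d, e = state
    for i in range(80):
        if i < 20:
            f, k = (b & c) | (~b & d), 0x5A827999
        elif i < 40:
            f, k = b ^ c ^ d, 0x6ED9EBA1
        elif i < 60:
            f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
        else:
            f, k = b ^ c ^ d, 0xCA62C1D6
        temp = (rol(a, 5) + f + e + k + w[i]) & 0xFFFFFFFF
        a, b, c, d, e = temp, a, rol(b, 30), c, d
    return tuple((s + v) & 0xFFFFFFFF for s, v in zip(state, (a, b, c, d, e)))


def sha1_of_16_bytes(msg):
    """SHA-1 of a 16-byte message using the precomputed padding tail."""
    assert len(msg) == 16
    return struct.pack(">5I", *sha1_compress(H0, msg + PAD_TAIL))


# Sanity check against the standard library (test input is arbitrary).
sample = bytes(range(16))
assert sha1_of_16_bytes(sample) == hashlib.sha1(sample).digest()
```

In a real brute-force loop only the first 16 bytes of the block would change between candidates; how much that saves depends on the library, as the timings later in the thread suggest.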

nocash123 said:
I thought the MBR and DSi partitions are using the same encryption on 3DS? That should be somewhat required to be so for DSi backwards compatibility. The MBR may contain different/extra data on 3DS (so brute forcing may fail when searching for certain "fixed" values in the MBR).
For the ConsoleID, I think the 3DS does have its own "3DS ID" (for whatever 3DS things), and separate/crippled "DSi ID" (for DSi-style eMMC encryption). The latter one being reported to be 6B27D20002000000h on one n3DS console.

I admit I don't know most of this, but TWLTool had a 3DS TWL FIRM brute routine which, IIRC, doesn't seem to be the same. On the other hand, it's not documented anywhere, so it might be test code that doesn't work yet.

BTW, I saw that TWLTool's 3DS brute expects the 16 bytes at offset 0x1e0 to be all zero, but I found out that's not true on DSi because it has a 3rd partition, so I moved to verifying offset 0x1f0 instead, which by coincidence is similar to your efforts as documented in that NGemu thread. Does 3DS TWL FIRM actually have only two partitions?

enderghast13 said:
About my 3DS question, I was just wondering if they would be useful for archival or research purposes as there isn't much documentation on 3DS Console IDs, though I do understand that bruteforcing 3DS Console IDs is beyond the scope of this tool and practically useless to do anyways.

I'm really not the right person to answer this, but I've never heard anybody in the 3DS hacking community talking about this, it might be a sign?

wicksand420 said:
Is there a way to dump the nand without hardmod yet?

I believe that requires you to have a working dsiwarehax to run fwtool.

JimmyZ · Aug 24, 2017

nocash123 said:
For multi-core CPUs, I wonder if each core is having its own AES hardware? If not, then multi-threading won't actually speedup the calculations.

I tried to run 4 processes on a Windows 10 i5-3450(4C4T 3.1GHz) like this:

Code:

@echo off
start /b /belownormal /affinity 1 twlbf_mbedtls console_id_bcd 0820100000000000 ab6778e02d034d303046504100001500 001f 1ced45c75e810bb6b51a5318e0fc5eee 000000000000000000000000000055aa
start /b /belownormal /affinity 2 twlbf_mbedtls console_id_bcd 0820200000000000 ab6778e02d034d303046504100001500 001f 1ced45c75e810bb6b51a5318e0fc5eee 000000000000000000000000000055aa
start /b /belownormal /affinity 4 twlbf_mbedtls console_id_bcd 0820300000000000 ab6778e02d034d303046504100001500 001f 1ced45c75e810bb6b51a5318e0fc5eee 000000000000000000000000000055aa
start /b /belownormal /affinity 8 twlbf_mbedtls console_id_bcd 0820400000000000 ab6778e02d034d303046504100001500 001f 1ced45c75e810bb6b51a5318e0fc5eee 000000000000000000000000000055aa
@echo on

They finished in 775, 781, 797 and 808 seconds respectively, compared with 764 seconds for a single process running alone, so it scales quite well, as expected.
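
To put a number on "scales quite well", here is a small Python sketch of the arithmetic. It assumes each of the four processes covers a range the same size as the single-process run (the post does not state this explicitly), and `parallel_speedup` is just an illustrative helper name.

```python
def parallel_speedup(single_seconds, parallel_seconds):
    """Throughput gain of N equal-sized jobs run side by side
    compared with running them one after another."""
    sequential = single_seconds * len(parallel_seconds)
    wall_clock = max(parallel_seconds)  # all jobs are done when the slowest finishes
    return sequential / wall_clock


times = [775, 781, 797, 808]          # timings reported above, in seconds
speedup = parallel_speedup(764, times)
efficiency = speedup / len(times)     # fraction of ideal 4x scaling
print(f"speedup {speedup:.2f}x, efficiency {efficiency:.1%}")
```

Under that assumption the four cores deliver roughly 3.8 times the throughput of one.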

I'll do a hyper-threaded test later.

FR0ZN · Aug 24, 2017

damn so you can bruteforce all you need in 15 minutes? f*** the software exploits man, this is awesome !

JimmyZ · Aug 24, 2017

Hyper-Threading test on Linux, Xeon E3-1230v2 (4C8T 3.3GHz):
1 process took 680 seconds.
4 processes with -c 1, 3, 5, 7 took 917, 917, 918, 918 seconds respectively. hmm, interesting.
8 processes with -c 0, 1, 2, 3, 4, 5, 6, 7 took 968, 968, 968, 968, 968, 968, 968, 968 seconds respectively. what?

Code:

cat /proc/cpuinfo |egrep "processor|core id"
processor       : 0
core id         : 0
processor       : 1
core id         : 1
processor       : 2
core id         : 2
processor       : 3
core id         : 3
processor       : 4
core id         : 0
processor       : 5
core id         : 1
processor       : 6
core id         : 2
processor       : 7
core id         : 3

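Silly me, I should have used -c 4, 5, 6, 7 instead: logical CPUs 1 and 5 share a core, as do 3 and 7, so the first 4-process test only used two physical cores. This also means Hyper-Threading rocks for TWLbf.
A re-run of 4 processes with -c 4, 5, 6, 7 took 721, 721, 721, 721 seconds respectively; that looks about right.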
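
The core mapping can also be read programmatically. The sketch below parses the `processor` / `core id` pairs exactly as printed above and shows which physical cores a given `-c` list lands on. It assumes a single-socket machine (core ids repeat across sockets), and the helper names are made up for this example.

```python
from collections import defaultdict

# The egrep output from the post above, reproduced as test input.
SAMPLE = "\n".join(
    f"processor       : {cpu}\ncore id         : {cpu % 4}" for cpu in range(8)
)


def siblings_by_core(cpuinfo_text):
    """Map each core id to the logical processors that share it."""
    cores = defaultdict(list)
    processor = None
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key = key.strip()
        if key == "processor":
            processor = int(value)
        elif key == "core id" and processor is not None:
            cores[int(value)].append(processor)
    return dict(cores)


def physical_cores_used(cores, cpus):
    """Core ids touched by a list of logical CPU numbers."""
    wanted = set(cpus)
    return sorted(core for core, members in cores.items() if wanted & set(members))


cores = siblings_by_core(SAMPLE)
print(cores)
print(physical_cores_used(cores, [1, 3, 5, 7]))  # only two physical cores
print(physical_cores_used(cores, [4, 5, 6, 7]))  # all four physical cores
```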

BTW, the Linux test commands were like this:

Code:

taskset -c 4 ./twlbf_mbedtls console_id_bcd 0820100000000000 ab6778e02d034d303046504100001500 001f 1ced45c75e810bb6b51a5318e0fc5eee 000000000000000000000000000055aa &
taskset -c 5 ./twlbf_mbedtls console_id_bcd 0820200000000000 ab6778e02d034d303046504100001500 001f 1ced45c75e810bb6b51a5318e0fc5eee 000000000000000000000000000055aa &
taskset -c 6 ./twlbf_mbedtls console_id_bcd 0820300000000000 ab6778e02d034d303046504100001500 001f 1ced45c75e810bb6b51a5318e0fc5eee 000000000000000000000000000055aa &
taskset -c 7 ./twlbf_mbedtls console_id_bcd 0820400000000000 ab6778e02d034d303046504100001500 001f 1ced45c75e810bb6b51a5318e0fc5eee 000000000000000000000000000055aa &

--------------------- MERGED ---------------------------

iCEQB said:
damn so you can bruteforce all you need in 15 minutes? f*** the software exploits man, this is awesome !

If the collected first-5-digit prefixes and the BCD assumption are right, and if you're testing an XL, then yeah, like that.

FR0ZN · Aug 24, 2017

Let's say the 14th digit is indeed always 1, would your application also accept this?:

Code:

@echo off
start /b /belownormal /affinity 1 twlbf_mbedtls console_id_bcd 0820100000000100 ab6778e02d034d303046504100001500 1ced45c75e810bb6b51a5318e0fc5eee 000000000000000000000000000055aa 001f
start /b /belownormal /affinity 2 twlbf_mbedtls console_id_bcd 0820200000000100 ab6778e02d034d303046504100001500 1ced45c75e810bb6b51a5318e0fc5eee 000000000000000000000000000055aa 001f
start /b /belownormal /affinity 4 twlbf_mbedtls console_id_bcd 0820300000000100 ab6778e02d034d303046504100001500 1ced45c75e810bb6b51a5318e0fc5eee 000000000000000000000000000055aa 001f
start /b /belownormal /affinity 8 twlbf_mbedtls console_id_bcd 0820400000000100 ab6778e02d034d303046504100001500 1ced45c75e810bb6b51a5318e0fc5eee 000000000000000000000000000055aa 001f
@echo on

Because that would narrow down the search area even further to 10^10 possibilities (maximum of 10000000000 tries per core / thread instead of 25937424601) if I'm not mistaken.

JimmyZ · Aug 24, 2017

iCEQB said:

Let's say the 14th digit is indeed always 1, would your application also accept this?:

Code:

@echo off
start /b /belownormal /affinity 1 twlbf_mbedtls console_id_bcd 0820100000000100 ab6778e02d034d303046504100001500 1ced45c75e810bb6b51a5318e0fc5eee 000000000000000000000000000055aa 001f
start /b /belownormal /affinity 2 twlbf_mbedtls console_id_bcd 0820200000000100 ab6778e02d034d303046504100001500 1ced45c75e810bb6b51a5318e0fc5eee 000000000000000000000000000055aa 001f
start /b /belownormal /affinity 4 twlbf_mbedtls console_id_bcd 0820300000000100 ab6778e02d034d303046504100001500 1ced45c75e810bb6b51a5318e0fc5eee 000000000000000000000000000055aa 001f
start /b /belownormal /affinity 8 twlbf_mbedtls console_id_bcd 0820400000000100 ab6778e02d034d303046504100001500 1ced45c75e810bb6b51a5318e0fc5eee 000000000000000000000000000055aa 001f
@echo on

Because that would narrow down the search area even further to 10^10 possibilities (maximum of 10000000000 tries per core / thread instead of 25937424601) if I'm not mistaken.

No need, it's already been taken care of.

FR0ZN · Aug 24, 2017

Sweet

wsquan171 · Aug 25, 2017

I'm lucky enough to have both Biggest Loser and Field Runners on my DSi so I'm able to get both Console ID and eMMC CID without doing a hardmod. To help out as much as I can, and considering DSi family life cycle has terminated, I'll share all the info here.

The console is a regular sized USA region Cyan DSi

Code:

Serial No. TW408006543
eMMC CID: 2CDF570409034D303046504100001500
Console ID: 7da02160-08a2144507087128

Yes, I know the first 4 bytes of the Console ID are not needed, but I'd like to post them here anyways. Since I didn't open up my console, I can't provide the eMMC label here.

Good work OP. Hope this will help a little bit. Good luck!

JimmyZ · Aug 25, 2017

nocash123 said:
Is that using an "optimized" SHA1 function? One older optimization mentioned here https://software.intel.com/en-us/articles/improving-the-performance-of-the-secure-hash-algorithm-1 uses general-purpose SSSE3 instructions (this should be also implemented in openssl). And newer intel processors should have extra opcodes SHA1RNDS4, SHA1NEXTE, SHA1MSG1/2 (not sure if/when/where that's supported, intel announced that stuff in 2013, but some other webpage mentioned it not being implemented until 2016, or so).

I looked into it a bit, and it's a really interesting subject. The only Intel architecture supporting SHAEXT is Goldmont (Atom), released April 2016; the later Kaby Lake (Oct 2016) and the just-released Coffee Lake still don't support it. For Intel desktop we'd have to wait until Cannonlake, expected H1 2018. On the other hand, AMD has supported SHAEXT since Ryzen, released Feb 2017.
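
On Linux, one quick way to see whether a given machine has these extensions is to look at the `flags` line of /proc/cpuinfo, where AES-NI shows up as `aes` and the SHA extensions as `sha_ni`. A minimal Python sketch follows; the function names are illustrative and it only covers x86 Linux.

```python
def cpu_flags(cpuinfo_text):
    """Return the feature flags of the first CPU listed in /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return set(value.split())
    return set()


def crypto_extensions(cpuinfo_text):
    """Report whether AES-NI and the SHA extensions are advertised."""
    flags = cpu_flags(cpuinfo_text)
    return {"AES-NI": "aes" in flags, "SHA extensions": "sha_ni" in flags}


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as f:
            print(crypto_extensions(f.read()))
    except OSError:
        print("no /proc/cpuinfo on this system")
```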

nocash123 said:
And, another (small) optimization would be appending the sha1-end-byte and sha1-padding-bytes to the CID, and then passing that directly to the 64-byte-sha1-core function (ie. avoiding the same padding to be repeated on each calculation).

I was considering an OpenCL port, so I dug the SHA1 code out of mbed TLS and thought I might as well give your suggestion a go. With the same brute-force range it went from 122 seconds to 105 seconds; not that impressive, but a substantial improvement.

wsquan171 said:
I'm lucky enough to have both Biggest Loser and Field Runners on my DSi so I'm able to get both Console ID and eMMC CID without doing a hardmod. To help out as much as I can, and considering DSi family life cycle has terminated, I'll share all the info here.

The console is a regular sized USA region Cyan DSi

Code:

Serial No. TW408006543
eMMC CID: 2CDF570409034D303046504100001500
Console ID: 7da02160-08a2144507087128

Yes, I know the first 4 bytes of the Console ID are not needed, but I'd like to post them here anyways. Since I didn't open up my console, I can't provide the eMMC label here.

Good work OP. Hope this will help a little bit. Good luck!

Thank you! 08A21, that's new!

Billy Acuña · Aug 25, 2017

Is it possible to create a good NAND dump from the CID, Console ID and a decrypted NAND?

Lord_Friky · Aug 25, 2017

Billy Acuña said:
Is it possible to create a good NAND dump from the CID, Console ID and a decrypted NAND?

Well... I was able to run someone else's dump on my DSi XL by copying some stuff from my NAND to the other one, but I don't know if it's possible to use only the donor dump. I will try it.

JimmyZ · Aug 26, 2017

Billy Acuña said:
Is it possible to create a good NAND dump from the CID, Console ID and a decrypted NAND?

I suppose the TWLTool thread is more suitable for this question?

Friendsxix · Aug 26, 2017

I've got another two submissions to make:

White DSi (Non XL, USA): 0820116402XXXXXX
Burgundy DSi (XL, USA): 0820187622XXXXXX

On both console IDs, the third character from the right is a 1. All digits fall within the range of 0 ~ 9.

JimmyZ · Aug 27, 2017

Played a little with OpenCL and ported SHA1 over. Tests show it's about 100x faster on an AMD HD7970 than on a (single-threaded) Xeon E3-1230v2, so we can probably brute 3~4 more bits in the same time frame. Not very impressive, but not bad for a fun project.

code is here:
https://github.com/Jimmy-Z/bfCL

update: about 200x faster on friend's Fury X, and 70x faster on another friend's GTX980

FR0ZN · Aug 27, 2017

JimmyZ said:
Played a little with OpenCL and ported SHA1 over. Tests show it's about 100x faster on an AMD HD7970 than on a (single-threaded) Xeon E3-1230v2, so we can probably brute 3~4 more bits in the same time frame. Not very impressive, but not bad for a fun project.

code is here:
https://github.com/Jimmy-Z/bfCL

update: about 200x faster on friend's Fury X, and 70x faster on another friend's GTX980

So with that we should be able to walk the entire 08***********1** range in the same time?
(13^10 = 137858491849 - a lot of possibilities)

JimmyZ · Aug 27, 2017

iCEQB said:
So with that we should be able to walk the entire 08***********1** range in the same time?
(13^10 = 137858491849 - a lot of possibilities)

Easy man, binary bits, not decimal digits. BTW that's 10^13 not 13^10.
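
The bits-versus-digits point is easy to check with a few lines of Python. This is only back-of-the-envelope arithmetic: the 13 wildcard digits come from the `08***********1**` pattern in the question, the 10-digit baseline is the 10^10 figure mentioned earlier in the thread, and the helper names are invented for the example.

```python
import math


def factor_from_bits(bits):
    """How many times larger a search space gets when it grows by `bits` bits."""
    return 2 ** bits


def bits_from_factor(factor):
    """How many binary bits a given growth factor corresponds to."""
    return math.log2(factor)


wildcard_digits = 13   # "08***********1**" leaves 13 decimal digits free
baseline_digits = 10   # the 10^10 search discussed earlier in the thread

# 10^13 is the size of the requested range; 13^10 is the typo from the question.
assert 10 ** wildcard_digits == 10_000_000_000_000
assert 13 ** 10 == 137858491849

growth = 10 ** (wildcard_digits - baseline_digits)
print("3 to 4 more bits:", factor_from_bits(3), "to", factor_from_bits(4), "times larger")
print("3 more decimal digits:", growth, "times larger,",
      round(bits_from_factor(growth), 1), "bits")
```

So three or four extra bits buy a search space 8 to 16 times larger, while three extra decimal digits would need about 10 bits.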

FR0ZN · Aug 27, 2017

JimmyZ said:
Easy man, binary bits, not decimal digits. BTW that's 10^13 not 13^10.

True, got confused here again lol

Oleboy555 · Aug 27, 2017

guys how do I get my ConsoleID? I have a metallic blue dsi 1.4.5E with flipnote and some other useless dsiware
