I had a look at polarssl and openssl some years ago when trying to "understand how AES works"... openssl looked very confusing, and polarssl looked a bit straighter (but still very confusing and overcomplicated)... anyways, as far as I remember polarssl did support AES hardware acceleration, too. So both might be the same as long as you have a PC with AES-NI support (which seems to have first shipped in CPUs around 2010).
Yes, PolarSSL supported AES-NI, but the version included in TWLTool doesn't; otherwise TWLbf with mbed TLS (PolarSSL's new name) wouldn't be 3 times faster than the previous TWLTool brute force mod.
For multi-core CPUs, I wonder if each core has its own AES hardware? If not, then multi-threading won't actually speed up the calculations.
AES-NI is a set of CPU instructions. I believe each core has its own AES-NI, just like MMX/SSE/AVX; it's not a separate AES engine device like the one on the DSi.
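If each core really can run the AES instructions on its own, a brute force can simply cut the candidate range into one contiguous slice per worker. A minimal Python sketch of that split, for illustration only; `split_range` is a hypothetical helper and is not taken from TWLbf or TWLTool:

```python
def split_range(start, stop, workers):
    """Split [start, stop) into `workers` contiguous, near-equal slices.

    Hypothetical helper: each slice would be handed to one thread/process,
    which then tests its own candidates independently of the others.
    """
    total = stop - start
    base, extra = divmod(total, workers)
    chunks = []
    lo = start
    for i in range(workers):
        # The first `extra` workers take one additional candidate each.
        hi = lo + base + (1 if i < extra else 0)
        chunks.append((lo, hi))
        lo = hi
    return chunks


if __name__ == "__main__":
    for lo, hi in split_range(0, 1 << 32, 4):
        print(f"worker range: {lo:#010x} .. {hi:#010x}")
```

Since the slices don't overlap and no state is shared, the speedup should scale with the number of cores only as long as each core has its own execution units, which is the assumption discussed above.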
Is that using an "optimized" SHA1 function? One older optimization mentioned here
https://software.intel.com/en-us/articles/improving-the-performance-of-the-secure-hash-algorithm-1 uses general-purpose SSSE3 instructions (this should also be implemented in openssl).
I really don't know. I just call SHA1 from OpenSSL and/or mbed TLS; I suppose I couldn't do much better than them, at least not in a short amount of time.
OpenSSL forces us to use a high-level interface called EVP to get hardware-specific optimizations like AES-NI. The result: on large blocks (like 8K bytes), EVP AES can be 5 times faster with AES-NI, and 1.8 times faster than mbed TLS with AES-NI (supposedly because mbed TLS lacks pipeline optimization). On 16-byte blocks, however, mbed TLS is 2.5 times faster than OpenSSL EVP. I talked about this in the GitHub readme.
And for SHA1, the EVP interface and the low-level interface are about the same speed (you can test this with "openssl speed -evp sha1" and "openssl speed sha1"); the latter is a tiny bit faster than EVP on 16-byte blocks (which is our use case). So either it's not hardware-specifically optimized, or both interfaces are on the same level of optimization.
I suspected the first, since I suppose SHA1 performance is not that important in crypto libs targeting TLS. Apparently, though, SHA1 in OpenSSL uses lots of exotic SIMD optimizations including SSE/AVX(2)/SHAEXT, but when working on 16-byte blocks it's only about 1?% faster than mbed TLS's C code.
And newer Intel processors should have extra opcodes SHA1RNDS4, SHA1NEXTE, SHA1MSG1/2 (not sure if/when/where that's supported; Intel announced that stuff in 2013, but some other webpage mentioned it not being implemented until 2016 or so).
My ignorance! I'd never heard of those SHA1 instructions. A quick search shows they're used in OpenSSL but not in mbed TLS:
https://github.com/openssl/openssl/search?utf8=✓&q=sha1rnds4&type=
https://github.com/ARMmbed/mbedtls/search?utf8=✓&q=sha1rnds&type=
And another (small) optimization would be appending the sha1-end-byte and sha1-padding-bytes to the CID, and then passing that directly to the 64-byte-sha1-core function (i.e. avoiding repeating the same padding on each calculation).
Yeah, I thought about that too, but my assumption is that it won't impact performance much.
It might benefit more from preparing a series of CIDs as 512-bit blocks and feeding them to a pipeline-optimized sha1-core function, but this would only benefit the EMMC CID brute, which in my opinion is less useful, and it would require me to write/modify crypto lib code. As of now I just call the libs.
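For reference, the pre-padding idea quoted above can be sketched in Python. This is only an illustration under the assumption that the input is always 16 bytes (the block size discussed in this thread); `sha1_compress`, `PAD_TAIL` and `sha1_16` are hypothetical names, and the pure-Python core is far too slow for real brute forcing. It just shows that one fixed padding tail plus a single call to the 64-byte core gives the same digest as the regular SHA1 API:

```python
import hashlib
import struct

# Standard SHA-1 initial state.
H0 = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)


def rotl(x, n):
    """Rotate a 32-bit value left by n bits."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF


def sha1_compress(state, block):
    """Run the SHA-1 compression function on one 64-byte block."""
    assert len(block) == 64
    w = list(struct.unpack(">16I", block))
    for t in range(16, 80):
        w.append(rotl(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))
    a, b, c, d, e = state
    for t in range(80):
        if t < 20:
            f, k = (b & c) | (~b & d), 0x5A827999
        elif t < 40:
            f, k = b ^ c ^ d, 0x6ED9EBA1
        elif t < 60:
            f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
        else:
            f, k = b ^ c ^ d, 0xCA62C1D6
        tmp = (rotl(a, 5) + f + e + k + w[t]) & 0xFFFFFFFF
        a, b, c, d, e = tmp, a, rotl(b, 30), c, d
    return tuple((s + v) & 0xFFFFFFFF for s, v in zip(state, (a, b, c, d, e)))


# Padding for a 16-byte message, built once: the 0x80 end byte, zero fill,
# then the message length in bits (128) as a 64-bit big-endian integer.
PAD_TAIL = b"\x80" + b"\x00" * 39 + struct.pack(">Q", 16 * 8)


def sha1_16(msg16):
    """SHA-1 of exactly 16 bytes via a single compression call."""
    assert len(msg16) == 16
    return struct.pack(">5I", *sha1_compress(H0, msg16 + PAD_TAIL))


if __name__ == "__main__":
    cid = bytes(range(16))  # placeholder input, not a real CID
    print(sha1_16(cid) == hashlib.sha1(cid).digest())
```

In a C implementation the same trick would mean keeping the 48 tail bytes in a fixed buffer and only overwriting the first 16 bytes per candidate; whether that is measurable next to the 80 rounds of the core is exactly the open question above.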
I thought the MBR and DSi partitions are using the same encryption on 3DS? That should more or less be required for DSi backwards compatibility. The MBR may contain different/extra data on 3DS (so brute forcing may fail when searching for certain "fixed" values in the MBR).
For the ConsoleID, I think the 3DS does have its own "3DS ID" (for whatever 3DS things), and a separate/crippled "DSi ID" (for DSi-style eMMC encryption). The latter one was reported to be 6B27D20002000000h on one n3DS console.
I'll admit I don't know most of this, but TWLTool had a 3DS TWL FIRM brute routine which, IIRC, doesn't seem to be the same. On the other hand, it's not documented anywhere, so it might be testing code that doesn't work yet.
BTW, I saw TWLTool's 3DS brute expecting the 16 bytes at offset 0x1e0 to be all zero, but I found out that's not true on DSi because it has a 3rd partition, so I moved to verifying offset 0x1f0 instead, which by coincidence is similar to your efforts as documented in that NGemu thread. Does 3DS TWL FIRM actually only have two partitions?
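To show why the block at 0x1e0 stops being all zero once a 3rd partition exists, here is a small Python sketch of the classic MBR layout (partition table at 0x1BE, four 16-byte entries, 55 AA signature at 0x1FE). It assumes the decrypted DSi MBR follows that standard layout and that the check at 0x1f0 looks for the signature bytes; the function names are hypothetical and this is not the actual TWLbf or TWLTool code:

```python
PART_TABLE_OFFSET = 0x1BE    # start of the classic MBR partition table
PART_ENTRY_SIZE = 16         # four entries of 16 bytes each
BOOT_SIGNATURE = b"\x55\xAA"  # stored at 0x1FE..0x1FF


def entry_range(index):
    """Return the [start, end) byte range of partition entry `index` (0..3)."""
    start = PART_TABLE_OFFSET + index * PART_ENTRY_SIZE
    return start, start + PART_ENTRY_SIZE


def block_overlaps_entry(block_offset, index, block_size=16):
    """True if the 16-byte AES block at `block_offset` touches that entry."""
    start, end = entry_range(index)
    return block_offset < end and start < block_offset + block_size


def looks_like_mbr_tail(block_1f0):
    """Assumed check: the decrypted block at 0x1f0 ends with 55 AA."""
    return len(block_1f0) == 16 and block_1f0[14:16] == BOOT_SIGNATURE


if __name__ == "__main__":
    # The 3rd entry (index 2) spans 0x1DE..0x1ED, so it overlaps block 0x1E0.
    print([hex(v) for v in entry_range(2)], block_overlaps_entry(0x1E0, 2))
```

With only two partitions the 3rd entry is empty and the block at 0x1e0 would decrypt to zeros, which would be consistent with what the TWLTool routine expects.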
About my 3DS question, I was just wondering if they would be useful for archival or research purposes as there isn't much documentation on 3DS Console IDs, though I do understand that bruteforcing 3DS Console IDs is beyond the scope of this tool and practically useless to do anyways.
I'm really not the right person to answer this, but I've never heard anybody in the 3DS hacking community talking about this; that might be a sign?
Is there a way to dump the NAND without a hardmod yet?
I believe that requires you to have a working dsiwarehax to run fwtool.