WARNING! DO NOT POWER UP THE MODIFIED LAMP DIRECTLY FROM THE MAINS! YOU NEED AN ISOLATED DC VOLTAGE, BETWEEN 5 AND 30V!
INTRODUCTION – INSPIRATION FROM A DISTORTED REALITY
Last year, I was impressed by the news about someone being able to run Doom on a pregnancy test.
Sadly, the story was a little bit exaggerated.
In fact, all the news websites failed to understand (or to report correctly) what Foone (the author) actually did, even though Foone had clearly documented all the steps that finally led to the viral video.
What actually happened is that Foone was playing Doom on a PC, where custom code resized and dithered the Doom window and streamed it to a small OLED display through the USB port of a Teensy 3.2 board. A small Bluetooth keyboard was connected to the PC to allow Foone to actually play Doom. The OLED display was fitted inside the plastic case of an electronic pregnancy tester, which the author had analyzed earlier to see how it works.
Although all of this was very far from porting Doom to a microcontroller, it was very inspiring, and it prompted me to find something I could port Doom to.
STATE OF THE ART AND RAM REQUIREMENTS
Original Doom’s memory requirement was 4MB of RAM, i.e. much more than what you might typically find in the microcontroller of a cheap consumer device. However, this relatively large RAM requirement stemmed from the need to copy all the data (in particular graphics) from the hard disk to RAM, so that it could be accessed quickly. In an embedded system or in a videogame console, instead, constant data can stay in flash/ROM (provided you can access it at a good speed), saving a lot of RAM.
There are already ports to low-end devices (SNES Doom, Vicdoom - Doom on a Vic 20 - or Doom on the Ti83 calculator), but these were stripped down a lot to fit memory and computing power limitations. Instead, I wanted to keep the engine close to the original.
How far can you optimize Doom, without sacrificing its main features, while being able to play at least some maps of the shareware version? (spoiler alert: all shareware maps work with this port!)
Or, put the opposite way, what is the minimum RAM requirement to play at least the shareware Doom? (given, of course, that you have reasonable computing power)
So I began researching the current state of the art in terms of RAM usage, and I found this forum thread, which pointed me to doomhack’s excellent Doom port to the GameBoy Advance (GBA), based on PrBoom. Doomhack also documents the RAM usage, which can go as high as about 197000 bytes (192 kB) for level E1M6 in the shareware version.

Later, however, I found that this spreadsheet does not take into account:
- the usage of the GBA’s IWRAM (which is a very fast on-chip 32kB RAM on the GBA main CPU).
- the screen frame buffer (38400 bytes for a 240x160 pixel screen @8bpp. The GBA has a separate 96kB video memory)
- (unsure, but most likely true, by looking at the memory allocation code) the memory overhead required to store the information of each memory block (20+8 bytes per block. Hundreds of blocks are allocated during game play. On levels as complex as E1M6, this overhead might well exceed 15kB. More information about this here).
Therefore, if we also include the frame buffer and the other memory regions used in the GBA port, the total RAM required to play all the shareware levels could be somewhat larger than 256kB.
In the meantime, for other reasons, I stumbled across the IKEA TRÅDFRI lamp series. These use a Silicon Labs RF microcontroller, which is already powerful enough for Doom (a 40 MHz Cortex M4). However, it features only 32 kB of RAM: too little to run a vanilla Doom without stripping it of too many features. So, I had to drop the idea of porting Doom there.
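The estimate above can be checked with a few lines of arithmetic. This is only a sketch using the figures quoted in the text; the 15 kB block overhead is the rough lower bound mentioned above, not a measured value, and the constant names are made up for this example.

```python
# Figures quoted in the text for the GBA port, level E1M6.
SPREADSHEET_PEAK = 197_000      # bytes reported in doomhack's spreadsheet
IWRAM = 32 * 1024               # fast on-chip RAM, not counted in the spreadsheet
FRAME_BUFFER = 240 * 160 * 1    # 240x160 pixels at 8 bpp (1 byte per pixel)
BLOCK_OVERHEAD = 15 * 1024      # assumption: "might well exceed 15kB"


def total_ram_bytes() -> int:
    """Sum every contribution listed above."""
    return SPREADSHEET_PEAK + IWRAM + FRAME_BUFFER + BLOCK_OVERHEAD


total = total_ram_bytes()
print(f"{total} bytes = {total / 1024:.1f} kB")
print("exceeds 256 kB:", total > 256 * 1024)
```

Even if the block overhead is left out entirely, the sum of the other three terms is already above 256 kB.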
Later, I saw on a github repository that a newer IKEA TRÅDFRI lamp (the RGB GU10 version) mounts the MGM210L module from Silicon Labs (you can find the same module on several distributors like Mouser, Farnell, Digikey, Newark, RS-Components, etc.), which is based on a more powerful (and memory-rich) microcontroller: the Silicon Labs EFR32MG21AxxxF1024, featuring more RAM and a very powerful 80MHz Cortex M33.

In particular, the microcontroller of the MGM210L has 96+12 kB of RAM and 1MB of internal flash. By comparison, the GBA has 384kB of RAM and direct memory mapped access to up to 32 MB ROM on the cartridge.
Although these specifications suggested that there was not enough memory to play all the shareware levels, there was still hope of at least creating a proof-of-concept project. In fact, by looking at the memory requirements shown above, I noticed that E1M1 could still be played, as it required just 90kB of RAM (this, of course, was before I realized that the spreadsheet did not count all the memory contributions mentioned above…). That was enough to convince me to start porting Doom to that IKEA TRÅDFRI lamp.
The goal was to play at least E1M1, with as many features of the original shareware Doom engine as possible, and without any requirements about audio.
THE PORT TO THE MGM210L
My port is based on doomhack's GBA Doom port, as it already features many memory optimizations. No need to reinvent the wheel.
No doubt, the MGM210L packs enough computing power to run Doom at very high speed. The issue is the small RAM and Flash amounts available to run DOOM. Indeed, the real challenge of this project is memory optimization, both in terms of amount and access speed.
Since the shareware Doom WAD is around 4MB, I needed to add an external memory.
MMC-SD cards are ruled out here: Doom is quite memory-access intensive, and high memory bandwidth and low latency are required. The filesystem overhead and high latency of SD cards make them unsuitable for this job. The ideal solution would be a parallel flash memory, but 1) the MG21 does not support it and 2) even if it did, the MGM210L module only has 10 I/O pins. After all, the MG21 is designed as an RF MCU for IoT applications, not to run Doom.
Therefore, the only acceptable solution was an external SPI memory. The size used on the prototype is 8MB and was determined by two factors:
- WAD files need to be converted to a format which is more convenient to handle in this project. This will slightly increase the WAD size. In particular the shareware DOOM1.WAD will increase to 4.2 MB.
- Due to the current IC stock crisis, the largest size I was able to find was 8MB. This means that I cannot test full commercial WADs.
Although the MG21 microcontroller series does not support dual or quad SPI memory, with a hardware trick (and software optimization, see here for details) I was able to use a QSPI flash and read it in dual-SPI mode, doubling the bandwidth with respect to a single-line SPI flash. Furthermore, I overclocked the CPU peripheral bus frequency to 80MHz. According to the datasheet, the maximum bus speed is 50MHz. However, I found that this figure is very conservative, and the system runs flawlessly at 80MHz too, at least with the peripherals used for this project and at room temperature. This gives a 40MHz SPI clock which, thanks to the dual-SPI trick, allows for a peak bandwidth of 10MB/s, i.e. 80Mbit/s.
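The peak figure quoted above follows directly from the clock and the number of data lines. A minimal sketch of that arithmetic (the function name is made up for this example, and command/address overhead is ignored, so real throughput is lower):

```python
def spi_peak_bytes_per_second(clock_hz: int, data_lines: int) -> int:
    """Peak sequential read rate: one bit per data line per clock cycle."""
    bits_per_second = clock_hz * data_lines
    return bits_per_second // 8


SPI_CLOCK_HZ = 40_000_000  # 80 MHz peripheral bus divided by two

single = spi_peak_bytes_per_second(SPI_CLOCK_HZ, 1)
dual = spi_peak_bytes_per_second(SPI_CLOCK_HZ, 2)
print(f"single SPI: {single / 1e6:.0f} MB/s, dual SPI: {dual / 1e6:.0f} MB/s")
```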
This is just one part of the job. GBA Doom code expects data in a memory mapped region, whereas in my case most of the data is stored in the external flash. Therefore, all the data accesses had to be wrapped, to fetch data directly from the SPI.
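To illustrate what "wrapping" means here, the following is a toy model, not the code of the port: instead of dereferencing a pointer into a memory-mapped region, every access goes through an explicit read call. The `ExternalFlash` class and `read_wad_header` function are hypothetical names, and the 12-byte header layout is that of a standard WAD file; the converted WAD format used by this port may differ.

```python
import struct


class ExternalFlash:
    """Stand-in for the SPI flash: data is reachable only through read()."""

    def __init__(self, image: bytes):
        self._image = image
        self.transfers = 0  # counts simulated SPI transactions

    def read(self, address: int, length: int) -> bytes:
        self.transfers += 1
        return self._image[address:address + length]


def read_wad_header(flash: ExternalFlash):
    """Fetch the 12-byte header: 4-char tag, lump count, directory offset."""
    raw = flash.read(0, 12)
    tag, num_lumps, directory_offset = struct.unpack("<4sii", raw)
    return tag.decode("ascii"), num_lumps, directory_offset


# Tiny fake image, just enough to exercise the accessor.
image = b"IWAD" + struct.pack("<ii", 3, 12)
flash = ExternalFlash(image)
print(read_wad_header(flash), "transfers:", flash.transfers)
```

Reading the whole header in one call rather than field by field matters on real hardware, because each transaction pays a fixed command and address cost.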
Floor and ceiling texture data are not accessed sequentially, so leaving them in the external flash would lead to a painfully slow process (probably a single-digit frame rate). Therefore, these textures are copied to the internal flash. In the shareware version, these textures, together with the code, amount to about 550kB, so there are still about 450kB of internal flash, which can be used to store as many wall textures as possible. This improves the frame rate a lot, because the data can be fetched at very high speed (peaking at 320 MB/s).
Now that I have discussed how I solved the issue of reading data at a decent speed, a major issue remains: RAM. This is where the hardest part was.
Cutting down as many bytes as possible has been quite an effort. The list of optimizations is rather long and can be found here. Almost every single data structure has been modified to save up to the last drop.
By far, the major improvement was to use 16-bit pointers instead of 32-bit ones. In fact, in the GBA Doom port, the majority of the pointers point to structures which are stored in RAM and are 4-byte aligned. Since there is less than 128kB of RAM, the useful address width is limited to 17 bits. Moreover, since the structures are 4-byte aligned, the 2 LSBs are always zero and do not need to be stored. This leaves 15 significant bits, so a 16-bit integer can contain all the information needed to find something stored in RAM.
This trick, on one hand saves a lot of RAM, on the other adds some overhead to convert a long pointer to short, and vice versa. This is not a big issue: we have an 80MHz Cortex M33.
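The conversion between long and short pointers can be modelled with plain integers. This is a sketch under stated assumptions, not the code from the repository: `RAM_BASE` is the usual Cortex-M SRAM base address and is assumed here, and reserving the value 0 for a null pointer is a choice made for this example.

```python
RAM_BASE = 0x2000_0000   # assumption: typical Cortex-M SRAM base address
RAM_SIZE = 108 * 1024    # 96 + 12 kB, as stated in the text


def to_short(address: int) -> int:
    """Pack a 4-byte-aligned RAM address into 16 bits (0 is kept for NULL)."""
    if address == 0:
        return 0
    offset = address - RAM_BASE
    if offset % 4 != 0 or not 0 <= offset < RAM_SIZE:
        raise ValueError("address must be 4-byte aligned and inside RAM")
    return offset >> 2   # drop the two always-zero LSBs


def to_long(short_pointer: int) -> int:
    """Rebuild the full 32-bit address from its 16-bit form."""
    if short_pointer == 0:
        return 0
    return RAM_BASE + (short_pointer << 2)


last_word = RAM_BASE + RAM_SIZE - 4
print(hex(to_short(last_word)), hex(to_long(to_short(last_word))))
```

With this choice, an object placed exactly at `RAM_BASE` cannot be told apart from a null pointer, so a real implementation would have to make sure nothing referenced this way lives there. Every stored pointer shrinks from 4 to 2 bytes, at the cost of a shift and an add on each access.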
After a lot of memory optimizations, I was able to run the full shareware episode, including E1M6, using less than 108kB of RAM!
Furthermore, using only 3kB of flash, and few bytes of RAM, I was able to bring back the Z-Depth lighting effect!
LET’S ADD AUDIO
After almost everything was ported and fixed (menu, intermission screens, savegame, etc), something was still missing… Audio!
Luckily the MG21 features DMA, and timers that can work in PWM mode. Doom audio samples are 8-bit, 11025 Hz. Since I expect a frame rate well above 10Hz, I used a 1024-sample buffer, which is refilled each frame. At 11025 Hz, 1024 samples last about 93 ms, so the minimum frame rate we need is 10.8 fps. Going below that threshold will trigger audio glitches. Here I explain why interrupts cannot be used for this purpose.
Since there are 8 channels at 8 bits, each sample of the audio buffer (which is the mix of all the 8 channels) is larger than one byte, therefore the audio buffer is 2048 bytes. To recover these 2 kB, further RAM optimizations had to be done.
Note that only sound FX are implemented. Music is not.
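The buffer sizing can be worked through as follows. This is a sketch of the arithmetic only: the samples are treated as unsigned 8-bit values and the mix is a plain sum, which is an assumption and not necessarily how the port mixes its channels.

```python
SAMPLE_RATE_HZ = 11025
BUFFER_SAMPLES = 1024
CHANNELS = 8

# The buffer is refilled once per frame, so a frame may last at most
# as long as the buffer takes to play.
min_fps = SAMPLE_RATE_HZ / BUFFER_SAMPLES


def mix(channels):
    """Sum the channels sample by sample (assumed unsigned 8-bit inputs)."""
    return [sum(column) for column in zip(*channels)]


# Worst case: every channel at full scale at the same time.
loudest = mix([[255] * 4 for _ in range(CHANNELS)])
bits_needed = max(loudest).bit_length()
bytes_per_sample = 2  # smallest standard integer width holding bits_needed
buffer_bytes = BUFFER_SAMPLES * bytes_per_sample

print(f"minimum frame rate: {min_fps:.1f} fps")
print(f"mixed sample needs {bits_needed} bits, buffer is {buffer_bytes} bytes")
```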
THE HARDWARE

It is extremely simple: the Ikea module is soldered to a DC-DC converter, which already generates 3.3V and 5V. All you need is the display itself (I used a quite standard 160x128 pixel TFT), the 8 Mbyte SPI flash, a widespread shift register, 8 pushbuttons and a few passives. That’s it!

The module accepts input from 5 to 30V, however you need to remove one resistor if you are planning to use it at low voltages (e.g. 12V). That resistor prevents the DC-DC converter IC from turning on, in case the input voltage is too low (see picture below).

The 74HC165 parallel-input, serial-output shift register is used because the MGM210L module has very few I/O pins. This cheap IC allows 8 pushbuttons to be connected. Using two ICs would allow for a bigger keyboard, but for the sake of simplicity and size I stuck to an 8-key layout.
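How eight buttons reach the microcontroller over one data line can be shown with a small behavioural model of the shift register. This is a simulation, not driver code: the class and function names are made up, the serial input is assumed to be tied low, and a level of 1 is taken to mean "pressed", which may not match the actual wiring.

```python
class ShiftRegister74HC165:
    """Behavioural model: parallel load, then shift out on QH, input H first."""

    def __init__(self):
        self._stages = [0] * 8  # index 0 = input A ... index 7 = input H

    def load(self, inputs):
        """Latch the eight parallel inputs (SH/LD pulsed low on the real chip)."""
        if len(inputs) != 8:
            raise ValueError("exactly 8 inputs expected")
        self._stages = list(inputs)

    @property
    def qh(self) -> int:
        return self._stages[7]

    def clock(self):
        """One clock pulse: every stage moves one step toward QH."""
        self._stages = [0] + self._stages[:7]  # serial input assumed low


def read_buttons(register, levels) -> int:
    """Return one byte: bit 7 = input H ... bit 0 = input A."""
    register.load(levels)
    value = 0
    for _ in range(8):
        value = (value << 1) | register.qh
        register.clock()
    return value


register = ShiftRegister74HC165()
print(bin(read_buttons(register, [1, 0, 0, 0, 0, 0, 0, 1])))
```

On the real hardware the eight clock pulses would come from the same SPI peripheral used for the other devices, which is why only a few pins are needed.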

Note that the external flash is clocked at 40MHz, the display at 30MHz (note: overclock! According to its datasheet, its maximum clock frequency would be 16MHz), and the shift register only at 5MHz. This is not a problem: the MG21 makes it very easy to switch clock (and peripheral I/O).
Finally, audio is connected using just a capacitor. Due to lack of space, I did not even mount a low-pass filter, as the PWM frequency is very high, and I rely on the internal filter of the amplified speaker. Still, a low-pass filter is recommended, as it improves the quality. When I make the PCB, there will surely be enough space for a simple RC low-pass filter.
BUILDING THE PROTOTYPE
You need to get the MGM210L from the IKEA lamp. The module is not write-locked, so it can be easily programmed using any SWD programmer. You might also find the MGM210L at conventional distributors. In the IKEA lamp it is soldered on a small board which, as said, provides a 30V to 5V (and 3.3V) converter. Since the voltage conversion circuit is very useful, and separating the two boards is quite difficult, I suggest keeping them joined and using the DC-DC converter.
To do this, you need to do some wiring work, and you also need to remove the resistor marked R25, as seen in the picture shown earlier.


I used prototyping boards, but as you can see, you’ll end up with a mess of wires. In the future I plan to release a KiCAD PCB layout in the github repository, so the design will be much more compact.


The RF module board goes on the top of the previous one. The connector on the left is the debug header, which is also used to upload the WAD file.

WARNING: the whole system (except the display) will fit the original Ikea Lamp, but you need to power it at low voltage! Do not power it from mains, if you don’t want to start a fire.
PROGRAMMING AND FINALIZING THE LAMP
The device can be programmed using any SWD programmer attached to the debug header. The program can be compiled using Silicon Labs’ Simplicity Studio V5. NOTE: at the end of programming, you might get an error. Ignore it, it will work.
Then you need a WAD file converted to a particular format, compatible with this port. The already converted shareware WAD file is in the repository. If you want to try other WADs (note I have not tested any other WAD, it might not work!) you need to use the mg21wadutil (derived from doomhack's GBA wad util), present in the same repository.
The converted WAD must be sent to the external flash via the YMODEM protocol (XMODEM is supported too). For this operation you need a USB to TTL UART converter; see the debug header signals shown above (note: TX and RX are relative to the MGM210L. You need to connect TX with RX and vice versa!).
To upload the WAD, power up the device while holding down the “use”, “change weapon” and “alt” buttons to initiate YMODEM reception. Then use Teraterm to send the file via the YMODEM protocol.
Be aware that the upload process is very slow, and it will take more than 10 minutes for the shareware WAD. In the meantime go and have some coffee or tea…
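For readers unfamiliar with YMODEM, the sketch below shows how a single data packet is framed in the CRC variant of the protocol: a start byte (SOH for 128-byte blocks, STX for 1024-byte blocks), the block number and its complement, the padded payload, and a CRC-16 with polynomial 0x1021. Whether the receiver in this project expects 128-byte or 1024-byte blocks is not stated here, so both are handled; the helper names are made up for this example, and Teraterm does all of this for you.

```python
SOH = 0x01  # start of a 128-byte block
STX = 0x02  # start of a 1024-byte block
PAD = 0x1A  # conventional padding byte for the last block


def crc16_xmodem(data: bytes) -> int:
    """CRC-16 with polynomial 0x1021 and initial value 0, MSB first."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def make_packet(block_number: int, payload: bytes) -> bytes:
    """Frame one data packet: start byte, number, complement, data, CRC."""
    if len(payload) > 1024:
        raise ValueError("payload too large for one packet")
    size = 128 if len(payload) <= 128 else 1024
    start = SOH if size == 128 else STX
    data = payload.ljust(size, bytes([PAD]))
    number = block_number & 0xFF
    crc = crc16_xmodem(data)
    return bytes([start, number, 0xFF - number]) + data + crc.to_bytes(2, "big")


packet = make_packet(1, b"\x00" * 1024)
print(len(packet), hex(crc16_xmodem(b"123456789")))
```

Each packet must be acknowledged before the next one is sent, so a file of about 4.2 MB means several thousand round trips over the UART, which helps explain the long upload time.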
After the upload is complete, reset the device, and you should see DOOM running!

Once you verified that it works, you might want to put everything inside the lamp...

PERFORMANCE
From a Cortex M33 you might expect to run Doom even with a 320x200 resolution at 35 fps. After all, Doom required a 486 DX 33MHz to run at a good frame rate.
You are right, but the issue is the data access speed. Flats (ceilings and floors), as well as some cached textures are accessed at high speed, but sprites and uncached data are accessed at most at 10MB/s. And this value is the peak sequential read speed. Actual figures are much, much lower.
I verified, using an STM32 microcontroller featuring memory-mapped QSPI support, that the QSPI speed has a dramatic effect on the frame rate. When the QSPI clock is 40 MHz or above (160 Mbit/s), the frame rate rarely drops below 30 fps, and frequently caps at 35 fps. For smaller bandwidths, the frame rate drops, especially on those levels featuring many different uncached textures in the same scene.
That said, we got a frame rate which still sometimes caps at 35 fps, but on complex scenes with many sprites (in terms of data to be fetched externally, not in terms of textures) it can drop down to 16 fps: still playable.
COMPARISON WITH GBA DOOM PORT
The GBA Doom port also features stereo audio, while our prototype only has a mono channel (8 channels are software-mixed into a single one). Furthermore, music is not implemented in our port.
As in the GBA port, demo support is broken, so I have disabled it. This is also because every time you change level, you perform a write operation in the flash. This means that after you have changed level 10000 times, your flash will be out of specification (this does not mean it won’t work anymore. It means that the flash will probably not keep the stored data for 10 years. However, this affects only the upper region of the flash – the code part is never overwritten – so it will not be a huge issue).
However, unlike the GBA port, my version features the Z-Depth lighting effect. In the GBA port, this was removed to save some 11kB of RAM. I managed to bring back this feature using less than 3kB of flash and 8 bytes (yes, bytes!) of additional RAM.
Furthermore, doomhack implemented a mip-mapping-like algorithm on composite textures. This was probably done to increase the software cache hit ratio, at the expense of lower detail on some textures. We cannot afford a software cache (doomhack used 16kB for it), so there is no point in implementing the mip-mapping: it would not improve the speed. Therefore, I have restored full detail rendering on composite textures. Note that I restored high detail rendering of composite textures AFTER taking the pictures and shooting the video, where you can clearly see the mip-map effect on some walls.
In the IKEA lamp port, Doom runs faster than on the GBA, but this is expected: an 80 MHz Cortex M33 is about 8 times faster than a 16.7 MHz ARM7TDMI.
In terms of memory usage, the port on the Ikea lamp requires much less RAM. For instance, level E1M1 requires less than 60kB (including frame buffer and stack) vs about 87.4kB on the GBA, a figure that excludes the 32kB of IWRAM, the stack, the memory block overhead and the 38400 bytes of frame buffer; including them brings the total to about 157kB. Level E1M6 occupies less than 108kB on the Ikea Tradfri port, vs the already mentioned amount exceeding 256kB.
"LAMP" OR "LIGHT BULB"?
The correct technical word for "light bulb" is "lamp"! IKEA itself refers to this particular device as "self ballasted LED lamp".
SOME MORE SCREENSHOTS



VIDEO
Here is a short clip, showing the performance on level E1M6. Note that in this video mip-mapping on composite textures was not disabled yet, so the actual graphics are much better. Also, the camera does not capture the exact colors, which look much better in reality.
I WANT MORE DETAILS!
I have written a more detailed article, especially about optimization, here.
CONCLUSIONS
I made it!
Several features are still missing:
- Support for different WAD files, other than the shareware one. I have not tested them yet, as I have only mounted an 8 MB chip. I need at least a 16MB chip to test.
- Memory optimization (yes, some more kB can be still saved!)
- Performance optimization.
- Music (but I don't think I will implement it, at least in the short term)
- Aspect ratio corrected face in the status bar, and in general better status bar (small text is not readable)
- General bug and rendering fixes.
- Using a circular display and put everything inside the actual IKEA TRÅDFRI lamp case. But that’s another story.
Finally, two last remarks.
- GBA Doom port to the MG21 device was a two-step port. I initially ported the code to an STM32 device, featuring memory mapped QSPI flash support, and 1MB of RAM. After I cut down RAM usage to 108kB, I ported from STM32 to the MG21. This allowed me to initially focus on RAM optimization and later focus on the dual-SPI trick and implementing the flash caching (emulated in RAM on the STM32 device).
- This project can be used also as a starting point to port DOOM to almost any microcontroller featuring enough Flash and at least 108 kB of RAM, and of course, decent processing power.
NOTES
- Do not power up the lamp directly from the mains. Maximum voltage is 30V (if you use the DC-DC converter board).
- In all the pictures, the AMMO count shows the current framerate multiplied by 10. E.g. 300 = 30 fps. In the current github repository, the frame rate counter is disabled!
- Yes, you might see that the lamp case is broken. Unluckily, the lamp fell on the ground and broke in 3 pieces (it is made of glass). A bit of superglue helped :)