Introducing Lightrec, a MIPS-to-everything dynarec

Emulation is what got me into computer science to begin with, as I always thought that emulators are impressive pieces of software. The fact that we simulate a real-world electronic device is just amazing. What astounds me even more is that some emulators break the boundaries of what we thought was possible. Who remembers UltraHLE? Who ever tried Bleem!cast? As my knowledge of computer science grew over the last 12 years of learning C, working on Linux and doing low-level programming on embedded systems, emulators slowly ceased to be a mystery to me. That only made me more respectful, now that I can grasp the genius that went into these pieces of craftsmanship.

The biggest praise I have for emulator creators is that they don't follow the common premise that the solution is always better hardware. In a world where consumption is the key to our doom, I like to believe that we can always do more with less. Under constraints, people get creative. Writing software on infinitely powerful machines would be boring.

Introducing Lightrec

Since 2014, I've been working on and off on a project called Lightrec. It started as an experiment to test my skills and improve my knowledge, and later became a fully working dynamic recompiler (a.k.a. dynarec) for the PCSX Playstation emulator, targeting a wide range of host CPUs thanks to the use of GNU Lightning as the code emitter.

Succeeding where others failed

The big disadvantage of traditional dynamic recompilers is that they only target one architecture. PCSX has one dynamic recompiler for x86 PCs, another one for ARM-based smartphones, and yet another one for MIPS. Each new dynarec means a different code base, different performance, and different compatibility.

Ever since projects like LLVM or libjit came out, several unrelated attempts have been made by different people to create a dynamic recompiler that would use these technologies to support a lot of different CPUs. Unfortunately, they all failed, as they soon discovered that these technologies were really not well suited to dynamic recompilers. The reason is that while they can generate well-optimized code at runtime, they were not designed to do so on a tight schedule. A game's frame time is generally about 16 ms (one frame at roughly 60 frames per second), and the recompiler sometimes needs to execute thousands of pieces of code in that time frame, something that LLVM or libjit just cannot do.

GNU Lightning is different from the two aforementioned projects, as it has a different scope. LLVM and libjit were designed for creating programming language compilers or fast interpreters, and as such have the concept of variables, which is a construct that all programming languages share, but not something that machine code has. Machine code manipulates registers.

GNU Lightning is better described as a code emitter. It offers you a finite number of virtual registers (the actual number depends on the architecture), and a programming API that closely resembles the instruction set of MIPS processors. All it does is translate each virtual instruction and virtual register to the corresponding CPU instruction (or instructions) with the corresponding hardware registers. It doesn't perform any optimization (except very obvious and easy ones), and does not provide register allocation facilities either. Thanks to being that simple, it is extremely fast at generating code, and is well suited for a portable dynamic recompiler project, as it supports almost every CPU on which you'd ever want to run a Playstation emulator.
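
To make the "code emitter" idea concrete, here is a toy sketch in Python. It is not GNU Lightning's API: the function name, the register mapping and the two target names are all made up for illustration. It only shows how one virtual three-operand instruction can become one host instruction on one CPU and two on another, with no optimization or register allocation involved.

```python
# Toy model of a code emitter: one virtual three-operand instruction in,
# one or more host instructions out. All names here are hypothetical and
# are NOT part of GNU Lightning's real API.

# Hypothetical mapping of virtual registers 0..2 to hardware registers.
HOST_REGS = {
    "x86_64": ["rax", "rcx", "rdx"],
    "arm": ["r4", "r5", "r6"],
}


def emit_add(arch, dst, src1, src2):
    """Translate the virtual instruction 'add dst, src1, src2' for `arch`."""
    regs = HOST_REGS[arch]
    d, a, b = regs[dst], regs[src1], regs[src2]
    if arch == "arm":
        # ARM has three-operand arithmetic: one host instruction is enough.
        return [f"add {d}, {a}, {b}"]
    # x86 arithmetic is two-operand (dst op= src), so a move may be needed.
    if d == a:
        return [f"add {d}, {b}"]
    if d == b:
        # Addition is commutative, so add the other source in place.
        return [f"add {d}, {a}"]
    return [f"mov {d}, {a}", f"add {d}, {b}"]
```

Calling `emit_add("arm", 0, 1, 2)` yields a single instruction, while `emit_add("x86_64", 0, 1, 2)` yields a `mov` followed by an `add`. The translation is a fixed, local rewrite, which is why this kind of emitter can be so fast.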

Implementation details

As you may have guessed by now, the Lightrec name is a fusion of GNU Lightning and recompiler, as it's what it really is. It could also be read as Light Recompiler and that wouldn't be wrong either.

From a compatibility standpoint, Lightrec is very compatible, with only a handful of games showing glitches or bugs. Regarding performance, it was truly abysmal a couple of years ago, being slower than PCSX's interpreter. It is now a few times faster, thanks to a few tricks:

  • High-level optimizations.
    The MIPS code is first pre-compiled into a form of Intermediate Representation (IR): basically, just a singly linked list of structures representing the instructions. On that list, several optimization steps are performed: instructions are modified, reordered, tagged; new meta-instructions can be added, for instance to tell the code generator that a certain register won't be used anymore.

  • Run-time profiling with a built-in interpreter.
    The first time the MIPS code jumps to a new address, Lightrec emulates it with its built-in interpreter. The interpreter then gathers run-time information, for instance whether a load/store will hit the BIOS area, the RAM, or a hardware register. The code generator then uses this information to generate direct reads/writes to the emulated memories, instead of calling into C for every access.

  • Lazy compilation.
    If the interpreter detects a block of code that would be very hard to compile properly (e.g. a branch with a branch in its delay slot), the block is marked as not compilable, and will always be emulated with the interpreter. This keeps the code emitter simple and easy to understand.

  • Threaded compilation.
    The code generator can optionally run in a different thread of execution. Instead of compiling a block of code right when we jump to it, Lightrec can add it to the work queue of the threaded compiler, and emulate the block of code using the interpreter in the meantime. This greatly reduces stutter in the games when a lot of code is being recompiled, as the main execution thread no longer waits for the compilation process to finish.

  • Fast code LUT.
    Coming from psx4all's mipsrec dynarec, the function block Look-Up Table (LUT) is now a huge array of the size of the Playstation's RAM, 2 MiB. It makes it extremely fast to obtain a pointer to generated code from its MIPS address, and extremely easy to mark a block of code as outdated - the generated code just writes NULL to the corresponding offset.
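
The LUT and the interpreter fallback described in the list above can be sketched roughly as follows. This is a simplified Python model, not Lightrec's actual code: the class, the function names and the slot granularity (one slot per 4-byte MIPS instruction word) are assumptions made for the example, and `compile_block` and `interpret_block` are hypothetical callbacks supplied by the caller.

```python
RAM_SIZE = 2 * 1024 * 1024  # 2 MiB of Playstation RAM


class CodeLut:
    """Direct-mapped table from MIPS address to compiled block."""

    def __init__(self):
        # Assumption: one slot per 4-byte MIPS instruction word.
        self.slots = [None] * (RAM_SIZE // 4)

    @staticmethod
    def index(address):
        # Masking folds the RAM mirrors (0x00000000, 0x80000000, 0xA0000000)
        # onto the same slot; only RAM addresses are handled in this sketch.
        return (address & (RAM_SIZE - 1)) >> 2

    def lookup(self, pc):
        return self.slots[self.index(pc)]

    def register(self, pc, block):
        self.slots[self.index(pc)] = block

    def invalidate(self, address):
        # Simplification: only the slot of the exact address written is
        # cleared, standing in for the generated code writing NULL.
        self.slots[self.index(address)] = None


def run_block(lut, pc, compile_block, interpret_block):
    """Run native code if present; otherwise try to compile, else interpret."""
    block = lut.lookup(pc)
    if block is None:
        block = compile_block(pc)  # hypothetical: returns None if not compilable
        if block is None:
            return interpret_block(pc)
        lut.register(pc, block)
    return block()
```

A lookup is a mask, a shift and an array read, and invalidation is a single store. The price is memory: the table has as many slots as there are instruction words in RAM, whether or not they ever hold code.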

Big-Ass Debugger

The tool I developed that helped build this dynarec from the ground up is called the Big-Ass Debugger. The name comes from the fact that it doesn't try to do anything smart: it runs the interpreter and the dynarec in parallel, and every time a block of code is executed (thousands of times per frame), it calculates a hash of all the registers and the whole RAM in the two instances of the emulator, and compares the results. It is a slow process, but if a difference is found, emulation stops and the debugger reports exactly what went wrong, and where. I started from a state where the code emitted for every MIPS instruction was a call to PCSX's interpreter. This tool is what allowed me to write the dynarec progressively from there, instruction after instruction, while still making sure that my code was fully working and compliant with the expected behaviour shown by the interpreter. To this day, I still use it to verify each optimization and improvement made to the dynarec.
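
The lockstep comparison can be illustrated with a small Python sketch. It is not the Big-Ass Debugger's code: `ToyCpu` is a made-up stand-in for an emulator instance, the choice of SHA-1 is arbitrary, and the "bug" is simulated by corrupting one store.

```python
import hashlib
import struct


def state_hash(regs, ram):
    """Hash all registers plus the whole RAM into one digest."""
    h = hashlib.sha1()
    h.update(struct.pack(f"<{len(regs)}I", *regs))
    h.update(ram)
    return h.digest()


class ToyCpu:
    """Stand-in for an emulator instance; each 'block' bumps r1 and stores it."""

    def __init__(self, buggy_at=None):
        self.pc = 0
        self.regs = [0] * 32
        self.ram = bytearray(64)
        self.buggy_at = buggy_at  # block number at which to misbehave

    def run_block(self):
        self.regs[1] = (self.regs[1] + 1) & 0xFFFFFFFF
        value = self.regs[1]
        if self.buggy_at == self.pc // 4:
            value ^= 0x80  # simulated miscompiled store
        self.ram[self.pc % len(self.ram)] = value & 0xFF
        self.pc += 4


def run_lockstep(reference, candidate, max_blocks):
    """Return the PC of the first diverging block, or None if none diverged."""
    for _ in range(max_blocks):
        pc = reference.pc
        reference.run_block()
        candidate.run_block()
        if state_hash(reference.regs, reference.ram) != state_hash(
            candidate.regs, candidate.ram
        ):
            return pc
    return None
```

With `run_lockstep(ToyCpu(), ToyCpu(buggy_at=3), 10)`, the two instances agree for three blocks and then diverge on the block starting at PC 12, which is the address the function reports. Hashing the whole state after every block is brute force, but it pinpoints the first faulty block rather than a symptom that shows up much later.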

Projects using Lightrec

So far Lightrec has been plugged into a few different emulators:

  • PCSX-ReArmed, which is the emulator I've been using for developing Lightrec. Not the fastest, since the dynarec exits after each piece of recompiled code; but it supports the Big-Ass Debugger.
  • pcsx4all, which is the fastest for various reasons: the dynarec doesn't return as often to the main loop, and the BIOS/scratchpad/RAM and RAM mirror memories are memory-mapped to locations that are a much better fit for the generated code.
  • Beetle, which is a libretro core based on Mednafen. The Lightrec integration is much more recent and still incomplete, but it already is a strong contender to replace the slow interpreter that Beetle has been using since the beginning.

Future

As it is now, the dynarec is already working really well and is ready for prime time. Of course, it still has a way to go; I already have ideas about advanced optimizations (or should I say optimizations senquack suggested), but all the "easy" optimizations have already been done, and the benefit-over-work-needed ratio is getting smaller and smaller. Also, the fact that it's been plugged into Beetle means that we may start seeing it running on all libretro-supported platforms, which is something I definitely look forward to.

Overall, it's been a challenging project and I'm glad that I could take it to a state where it is usable.

Till next time!

OpenDingux release 2019.06.01

Another month, another update.

Changelog

  • Added USB mass storage mode (MTP). Finally, you can transfer your apps and other files without any specific software! Use the 'USB Mode' app in the settings tab to revert to the Ethernet-over-USB mode that was the default in the previous versions of the firmware.
  • Added 20 MiB of in-RAM compressed swap (zram). This will permit some RAM-hungry apps to start, although with a performance hit vs. those that don't require swap.
  • Switched from mdev to udev, which fixes some issues, like the automounting of SD cards.
  • The brightness setting is now preserved across reboots.
  • And most importantly, the cow is back. Those who used to develop for OpenDingux on other devices will understand.

Download links

The update OPK can be downloaded here: OpenDingux update OPK.
Note that you must have at least 25 MiB of free internal storage before running the update.

Enjoy!

OpenDingux release 2019.05.17

Just a short note to tell you I made a small update to the OpenDingux firmware for the RetroMini.

Changelog

  • V3.0 boards should be supported now. Run flash_v30.bat on Windows or flash_v30.sh on Linux to flash the device.
  • It is now possible to change the brightness level from within the settings panel.
  • The DMA is used for SD card transfers now, so these should be a bit faster.
  • The battery level should be a bit more accurate now.

Download links

The update OPK can be downloaded here: OpenDingux update OPK

For those who did not flash already, an updated flasher can be downloaded here: Flasher tool download

Enjoy!

OpenDingux release 2019.04.30

Hi folks,

Here is the first release of the official OpenDingux firmware for the RS-90 (RetroMini). The root FS is based on Buildroot 2019.02.1, the Linux kernel is based on 5.1-rc5.

Download links

The flasher can be downloaded here: Flasher tool download

The toolchain (for developers) can be downloaded here: Toolchain download

Additionally, for Windows users the drivers required for the flasher tool can be obtained here: Windows drivers download

All the sources can be found on the GitHub page of the OpenDingux project.

DISCLAIMER

Flashing OpenDingux to your RetroMini will permanently remove the native OS, all the games it contains, as well as your savegames. I cannot be held responsible for any damage that this software causes to your RetroMini device or the software it contains. Use it at your own risk.

How to flash

Attention: This flasher has been mostly tested under Linux, and it seems to be hit-or-miss on Windows. If you cannot flash from Windows, try it from a Linux live-CD.
  1. With the device powered OFF, connect it to your PC through USB.
  2. Power it ON while pressing A. If done properly, the device will now be in USB Boot mode. On Windows, to continue to the following step you will need to have the driver properly installed.
  3. Extract the ZIP of the flasher tool to a directory. On Linux, run flash.sh from the terminal. On Windows, double-click on flash.bat.
  4. The RetroMini should power up, and show the welcome notice of the flasher tool. To continue, just follow the instructions. The flash operation should take a few minutes.
  5. Once the installation has completed and the device has rebooted, OpenDingux is installed.

Usage

Those who already used a handheld running the official OpenDingux firmware (Dingoo A320 or GCW Zero) will feel at home. The firmware makes use of the GMenu2X user interface, and OPKs as the format for the applications.

  • The firmware is standalone, and does not come with games nor emulators. These will need to be installed separately.
  • For file transfers, you can either use an FTP client and connect to 10.1.0.2 (anonymous login), or use an SSH client and connect as the od user. The default password is odrocks.
  • The games or ROMs should be transferred somewhere inside the home directory or a subfolder. The OPK files must be placed inside the data/apps subfolder in order to appear in the menu.
  • The firmware supports overclocking the CPU. A list of frequencies is provided, from 216 MHz to 456 MHz, but I cannot guarantee that your RetroMini will be able to overclock that much. Mine does, though. To choose the frequency at which an application should run, select it in the menu, press SELECT then Edit; the frequency can be selected through the "Clock frequency" setting. Of course, overclocking will reduce the battery life. On the other hand, underclocking will increase the battery life. The default frequency is 360 MHz.

Final words

This project represents more than four hundred hours of work just on my side, all for free. If you value my work, please consider sending me a tip via PayPal using the "Donate" button on the top-right corner of my blog (or if you're on mobile, at the bottom of the page).