Legacy AMD APU Llano Laptop for Emulation tests - Part 3

A6-3420m CPU Emulation performance


I wanted to test my laptop’s APU for emulation performance. To recap, it is AMD’s first-generation APU, with a CPU based on the Phenom-era K10 architecture plus boost clocks. It is unlocked, so you can overclock it in software. By default, the A6-3420m is a quad-core 1.5GHz CPU that boosts to 2.4GHz on one core. Boost was new at the time, so it only helps a little. Overclocking gives some programs a significant jump, and going from a weak CPU to a decent one for emulation makes for an interesting story. The first-gen Llano APUs are all unlocked, which makes them one of the few cases where you can overclock a laptop without much real risk. The chip came out in 2011, and seeing a first-gen APU in action may be surprising. These APUs are weak on the CPU side from the start but offer decent GPU performance, so I’m providing both stock and overclocked benchmarks for each emulator.

Benchmarks:         A6-3420m 1.5GHz-2.4GHz, with OC A6-3420m 2.3GHz-2.8GHz in parentheses
All tests record the lowest non-stutter FPS held on the same scene for a while, to see how each emulator performs and what it takes to avoid sound stuttering for a smooth experience. Retroarch is used for some of the benchmarks, with DX11 as the main video driver on Windows, and OpenGL on Linux and for hardware rendering. For hardware-rendered emulators, the standalone version is preferred (e.g. standalone Flycast over Libretro’s Flycast, standalone Mupen64Plus over Libretro’s Mupen64Plus). Testing a 3D emulator is usually best with DirectX on Windows, and with OpenGL elsewhere or on Linux. On Linux, the Mesa drivers are the fastest and offer better compatibility. The GPU is not a bottleneck here, since everything runs at native resolution without shaders or anti-aliasing. The lowest FPS of a heavy game is a good indicator of which emulator you can use in general. Note that if an emulator can’t run the single most demanding game for a system at fullspeed, that doesn’t mean you can’t use it for good performance in general. For example, the most CPU-demanding games on BSNES are three rare ST018 games. You may never play that one demanding game, so what matters is how the many other popular titles perform. Some emulators may not run a demanding game at all due to compatibility or development issues. Staying above 60fps means a smooth experience and lets you throw most or all games at the emulator without any problem.
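To make the numbers below easier to read, here is a minimal Python sketch of the arithmetic behind them. It assumes a 60 FPS fullspeed target (PAL titles would use 50), and the function names are my own, not part of any emulator. The sample values are the Bsnes Super Mario RPG figures from the SNES section (50.0 at stock, 63.0 overclocked).

```python
FULLSPEED_FPS = 60.0  # assumed target; NTSC systems run at roughly 60 FPS


def percent_of_fullspeed(fps, target=FULLSPEED_FPS):
    """Return the measured lowest FPS as a percentage of the target rate."""
    return 100.0 * fps / target


def overclock_gain(stock_fps, oc_fps):
    """Return the relative FPS gain from overclocking, in percent."""
    return 100.0 * (oc_fps - stock_fps) / stock_fps


# Bsnes, Super Mario RPG: lowest FPS at stock and with the host CPU overclocked
stock_fps, oc_fps = 50.0, 63.0

print(f"stock:       {percent_of_fullspeed(stock_fps):.1f}% of fullspeed")
print(f"overclocked: {percent_of_fullspeed(oc_fps):.1f}% of fullspeed")
print(f"gain:        {overclock_gain(stock_fps, oc_fps):.1f}%")
```

Anything at or above 100% of fullspeed leaves headroom for extras like fast forward or Runahead; anything below it means audio stutter on that scene.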

NES:
Mesen-Stock:       Megaman 2 Intro
                       82.0 (100.0)
Mesen-Very-High-OC:Megaman 2 Intro
                       48.0 (59.5)
Nestopia UE works very well and is very light. Mesen performs fine at stock with its default settings. With Mesen’s virtual overclocking set to Very High, only the overclocked CPU barely keeps up. For those features, it’s best to use Nestopia UE, which also leaves room for the Runahead feature for lower input latency.

SMS/GG/Genesis/CD/32x:
Genesis-GX-Nuked:  Virtua Racing Demos (MAME OPN2 / Nuked OPN2)
                       118.0 (154.0) / 75.0 (93.0)
The Genesis Plus GX core is too efficient to show any issues. It is currently the most accurate, and it was originally made for the GC and Wii. Virtua Racing is the only demanding title, since it uses the SVP chip for 3D rendering. It performs well, even with the Nuked OPN2 audio that was added for more accurate sound. Although Nuked performs great, I suggest using MAME OPN2 for fast forwarding, and especially for the Runahead feature. 32x Virtua Racing runs at around four times fullspeed on Picodrive. I haven’t tested it on Fusion yet, but I assume it will run at fullspeed.

SNES:
SNES9x:            Super Mario RPG
                       116.0 (163.0)
Bsnes-v110 fast:   Super Mario RPG
                       50.0 (63.0)
                  ST018 Game
                       36.0 (47.0)
Higan:             ST018 Game
                       21.0 (29.0)
Bsnes-HD-Mode7:    Super Mario Kart
                       1x (2x)
The new Bsnes and Bsnes-HD cores perform really well. Games without enhancement chips run at fullspeed out of the box. Games with the Super FX2 or SA-1 chip are a bit demanding, and they fall below fullspeed with the CPU at stock. With overclocking, they are barely above 60fps. Super Mario RPG uses the SA-1 chip; overclocked, it stays smooth and should rarely hit even small slowdowns. The most demanding games are the ones that use the ST018 chip. Only three Japanese games use it, so they aren’t common. However, they won’t play at fullspeed regardless.
Higan’s behavior can be reproduced in standalone Bsnes if you turn off all of the special fast features. Generally, it’s best to use Bsnes, since this CPU doesn’t have the performance for Higan at all. I also suggest the newest standalone Bsnes or the HD core over any of the Bsnes forks you find in Retroarch.
I haven’t tested the Super FX overclocking feature.
I recommend mainline Snes9x if you want to fast forward and use Retroarch’s Runahead for less latency, especially paired with overclocking for SA-1 games.
The HD side of Bsnes was also tested, using the Super Mario Kart demos; the game has a DSP1 chip. At stock CPU clocks, no Mode7 game runs at fullspeed at 2x. With overclocking, most Mode7 games generally run smoothly. Super Mario Kart is slightly more demanding because of its extra chip, so it drops to the edge of fullspeed. Over a long test, the lowest I got was 59fps, but it generally plays at fullspeed. 2x with the overclocked APU should be fine, as long as you don’t use 2x on games with chips more demanding than the DSP ones.

Virtual Boy: Simple, perfect performance, regardless of hard sync.

Sega Saturn:
Yaba Sanshiro is the best emulator you can use on the APU. You can enable frameskip to get as much performance as possible. Some parts of any game may drop a bit below fullspeed, but the audio is async, so it may not be very noticeable as long as the CPU is overclocked.

PlayStation:
Beetle-PSX Core:   Crash Team Racing (Interpreter / Max Performance 1024 DMA)
                       36.0 (45.0) / 47.0 (54.0)
Mednafen:          Crash Team Racing
                       41.0 (57.0)
PCSX-Rearmed:      Crash Team Racing (Interpreter / Dynamic)
                       57.0 (71.0) / 61.0 (81.0)
PCSX-R PGXP:       Crash Team Racing (Vanilla / PGXP MEMORY + CPU 1.5x)
                       ~85.0 (~115.0) / ~60.0 (~85.0)
These are the four emulators tested on the laptop, and each has its own story.
The Beetle PSX core from Retroarch is based on Mednafen. I tested it with the new dynamic recompiler in max performance mode, which most games should work with. While it is noticeably faster than the standard interpreter, it only becomes playable with barely any lag once the CPU is overclocked, at least in software rendering. Hardware rendering is quite a bit slower on this laptop. I don’t know exactly why it’s slower than software, even on Linux with Mesa drivers, and it hits very similar speeds whether I use the interpreter or the dynamic recompiler. If you want hardware rendering with higher resolution and PGXP, use the PCSX-R fork. I ran the Crash Team Racing intro and measured the ice bear scene, since that’s the slowest point I found. Even there, the dynamic recompiler at max performance with software rendering and the host CPU overclocked gives the best results. Oddly, the interpreter on Beetle is somewhat slower than on Mednafen, even though Beetle is a fork of it.
Mednafen is a multi-system emulator, and I used its PSX emulation, which is the most accurate. Measured without frameskip, Mednafen is faster than the Beetle core. Somehow, overclocking the CPU brings the performance up dramatically. It sits pretty close to 60fps in a few spots of the CTR demos, and is fullspeed in many areas. It’s hard to believe that standalone Mednafen is faster than the Retroarch core, which makes it a good choice for faithful emulation. You can turn on frameskip to get full emulation speed, but I recommend leaving it off for good responsiveness. Somehow, Mednafen doesn’t use the CPU boost clock for me, yet it is still faster than the Beetle core.
Another Retroarch core is PCSX-Rearmed, which has been available for x86 and x64 PCs for the last few years. It uses a less accurate interpreter and Pete’s Software renderer for performance. On the stock CPU, it reaches fullspeed most of the time; you can encounter minor slowdowns, but it doesn’t drop far below. With the overclock, it reaches fullspeed in all tested areas. Like Mednafen, it renders at 1x. Recently, it gained a dynamic recompiler for x86, x64, and Arm64, which lets PCSX-Rearmed run at fullspeed without overclocking the CPU. At 1x resolution, this emulator is preferred over the other two for performance.
PCSX-R PGXP is a really good emulator and performs excellently. You can use Pete’s OpenGL on Linux and the OpenGL2 2.9 Tweak version on Windows. Pete’s OpenGL 1.78 on Linux is more reliable than the Windows version and just as fast as OpenGL2 2.9 Tweak when using full framebuffer settings. The only difference is that OpenGL2 2.9 Tweak allows shaders and xBR upscaling on textures. Both Pete’s OpenGL 1.78 and OpenGL2 2.9 Tweak offer PGXP capabilities, so you should see very good polygon rendering. Only the PGXP Memory mode is usable at fullspeed on this CPU. Combining PGXP Memory with the internal CPU overclock at 1.5x gets you slightly above fullspeed. Overclocking the host CPU should give more headroom for fullspeed in any game. Although the Linux drivers usually perform better than AMD’s official drivers for OpenGL, here they perform the same. The only downside with the r600g drivers at the moment, on any video plugin, is the lighting in some areas of Spyro, but the glitches are minor, not severe. Regardless, you should have a great experience with PCSX-R PGXP, although neither build supports .CHD disc images. I did test the Windows build of PCSX-R PGXP on Wine, and while I was able to use OGL2 Tweak and get the same performance as on Windows, I had problems with the audio plugins and the Xaudio2 driver. I recommend finding the PGXP Linux build for easier setup. It’s available as a PPA and as an AUR build.
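The Crash Team Racing results above can be summarized with a small Python sketch that sorts each emulator by whether it reaches fullspeed at stock, only with the overclock, or not at all. The 60 FPS threshold and the helper name are assumptions of mine, and the PCSX-R PGXP figures are the approximate ones from the table.

```python
FULLSPEED_FPS = 60.0  # assumed fullspeed target for an NTSC game

# (emulator / mode, stock FPS, overclocked FPS), copied from the results above.
# The PCSX-R PGXP values are approximate in the original measurements.
CTR_RESULTS = [
    ("Beetle-PSX interpreter", 36.0, 45.0),
    ("Beetle-PSX dynamic (max performance)", 47.0, 54.0),
    ("Mednafen", 41.0, 57.0),
    ("PCSX-Rearmed interpreter", 57.0, 71.0),
    ("PCSX-Rearmed dynamic", 61.0, 81.0),
    ("PCSX-R PGXP vanilla", 85.0, 115.0),
    ("PCSX-R PGXP memory + CPU 1.5x", 60.0, 85.0),
]


def fullspeed_at(stock_fps, oc_fps, target=FULLSPEED_FPS):
    """Say which host clock setting, if any, reaches the fullspeed target."""
    if stock_fps >= target:
        return "stock"
    if oc_fps >= target:
        return "overclock only"
    return "neither"


for name, stock_fps, oc_fps in CTR_RESULTS:
    print(f"{name:38s} {fullspeed_at(stock_fps, oc_fps)}")
```

This only looks at the lowest FPS of one heavy scene, so an emulator listed as "neither" can still be fullspeed in lighter areas, as noted for Mednafen.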

N64:
Angrylion Plus with Project64 using the internal LLE mode mostly runs at half speed or lower.
This is going to be a long explanation about this laptop’s hardware and drivers. In short, you can play many N64 games with pretty great accuracy without using Angrylion. However, the Windows side is a mess. I’ve tested many video plugins, and Windows 10 updates seem to make things a bit slower. The Rice plugins are all over the place, and many of them have problems. GLN64 is not as good. Jabo’s D3D8 1.6.1 is the fastest you will get. Glide64 and GlideN64 are bottlenecked by AMD’s OpenGL drivers, which makes them slower. Glide64’s performance is mediocre. I tried using nGlide, and while it helps a bit, it still doesn’t solve the lag in some games, mainly the Quake 2 demos that I use to check whether the lag is present. Jabo’s is the fastest and only has minor lag because of the Windows 10 updates. GlideN64 is really slow, even with the framebuffer turned off at 240p. It’s a driver issue, and overclocking the CPU didn’t help much. The lag in the Quake 2 demos amounted to a few frames per second. I would have tested Windows 7, since the laptop was made for it, but I haven’t had it since 2016. Mupen64plus is slightly slower, since all of its plugins use OpenGL.
Let’s jump into Linux. This is unbelievable! I used the Mesa drivers, downloaded Mupen64Plus, and got GlideN64 4.0. I tested the Quake 2 demos, and by default it’s much faster than almost every plugin I tried on Windows. I overclocked the CPU and turned off Depth Buffer to RDRAM, with no noticeable regression, and it went from minor lag to none! I bumped up the resolution and there was still no lag at all. I did, however, set the Framebuffer mode to VI origin to reduce GPU usage at high resolutions. GlideN64 is really fast on Linux on this laptop. Even 3-point filtering finally works. I recommend using standalone Mupen64Plus on Linux, since it’s faster. With Mupen64Plus-Next on Retroarch, I still get minor lag with the same settings. The easiest way to get mupen64plus with GlideN64 bundled is to search for M64p.

Dreamcast: Redream is the fastest emulator you can use on this CPU. It works fine with the CPU at stock. Reicast’s fork, Flycast, is more compatible with games but is more demanding. Even with the CPU overclocked and a few accuracy settings turned off, it runs a bit below fullspeed. With my drivers, I do get a sprite glitch in Marvel Vs Capcom on Redream. This was tested on Linux; on Windows, the performance may be worse due to the dated and poor OpenGL drivers.

GBA:
mGBA:              Mermaid Melody PPPP Menu
                       141.0
VBA-Next:          Mermaid Melody PPPP Menu
                       126.0
VBA-M:             Mermaid Melody PPPP Menu
                       127.0
GBA plays very well. mGBA is newer and more accurate than the VBA emulators. VBA-M is generally the slowest. VBA-Next is sometimes close to mGBA’s speed and sometimes down at VBA-M’s speed. Even with the BIOS in use and idle loop removal disabled, as tested here, mGBA offers better performance.

NDS:
Desmume 0.9.11+:   Pokemon Black2/White2 Title Screen (No Frameskip / Frameskip 9)
                       33.0 (40.0) / 60.0+ (80.0+)
MelonDS 0.83:      Pokemon Black2/White2 Title Screen (OpenGL 1x / Jit Recompiler)
                       20.0 (29.0) / 00.0 (42.0)
I’m testing two emulators for these measurements. I’m using the jit command on the Desmume Linux build for full performance. On Windows, Desmume has an OpenGL renderer, but software is the fastest, so that’s why I’m using software rendering on Desmume. I’m testing a demanding area in Pokemon B2W2. Without frameskip, you drop to almost half of fullspeed. With overclocking, you get a bit more performance. With frameskip at max, I get fullspeed, although I suggest using a lower frameskip, like one or two. A lot of games generally don’t need that much frameskip, and performance is fine in other games that have less demanding scenes. Desmume is probably a better fit for the overclocked CPU, since you can lower the frameskip by one.
Since MelonDS has an OpenGL renderer, I decided to test it myself. The result is below half speed at stock clocks. With the overclock, I get about half speed, so the interpreter CPU core is the bottleneck. With the beta Jit recompiler at default settings, from a pre-0.9 build, I do see some increase, and it slightly passes Desmume without frameskip. However, some games will run near fullspeed and others at fullspeed. I haven’t tested much at high internal resolutions or with other games.
Your last resort for better performance, in games from the first 2/3 of the DS life cycle, is No$GBA. It is fast, and you can use either the Nocash or the OpenGL renderer, although the OpenGL renderer has problems under Wine and crashes it. The Nocash renderer is faster. No$GBA is the fastest option while being by far the least accurate: you can hear noisy audio in a couple of games, and it has problems playing the Pokemon Gen 5 games.
(Note!) I heard DraStic DS is going to go open source after its AARCH64 ARMv8 dynamic recompiler is implemented. It is faster than Desmume, to the point that you can run it in an Android emulator at fullspeed. It may not be that easy to set up, since it’s payware and needs an Android emulator, but it does perform well. It reportedly has slight input lag, but that is still acceptable for DraStic running under emulation. I haven’t tested it yet. The dev is working on x86-64 and x86 builds, which will be out once the emulator goes free.

GameCube/Wii:
Dolphin x64:       Soulcalibur II
                       36.0 (45.0)
Soulcalibur II runs fine for the most part. In some parts, you can encounter a little slowdown. During the big effects in fights, I see about ¾ of fullspeed with the overclock. Some games may play fine though, at least with overclocking. Make sure you run at 1x with async shaders, not ubershaders. You won’t be playing any heavier titles though. You can experiment with the virtual overclock option, and for some games you may set it to half or a quarter of the speed.

PS2: It does run, but even with overclocking, so many games run slowly that this laptop is not recommended for PS2 emulation. Your best bet is to stick with DX11 on Windows or OpenGL on Linux for PCSX2. Pushing the speed preset to very aggressive may be appropriate for certain games that can run decently or at almost fullspeed, but those are lighter titles.

PSP:
PPSSPP:            God of War
                       37.0 (48.6)
PPSSPP runs games completely fine. The only demanding game is God of War, where you can encounter slowdown in certain parts. You can solve it by setting the emulated CPU clock option to 222MHz, specifically for GOW. The game isn’t slow constantly or even most of the time; it just has occasional slowdowns and runs at fullspeed otherwise. If God of War only slows down with many enemies on screen, at the performance given above, you at least won’t encounter slowdowns in the rest of the PSP library.

3DS: Even with overclocking, I generally couldn’t get the Pokemon games to play at fullspeed often enough. They run below fullspeed in battles, somewhat below in the overworld, and a bit under half speed in double battles or battle royale. Many 3DS games run slow in general and barely reach fullspeed, even with the CPU overclocked. Citra won’t run fast enough on this system.

Dosbox: With any of the Dosbox builds I use, as explained on the previous page, the dynamic recompiler runs fine. At stock it commonly reaches around late-486 performance, about 24000 cycles. With the overclock, it goes up to around 36000, equivalent to a 486DX4-100MHz. Some 486- and Pentium-era games are able to use more cycles without slowing down the emulator, though. On the interpreter, it runs at around 12000, equivalent to a 486DX-33MHz. With the overclock, you get around 18000, equivalent to a 486DX2-50MHz. I recommend Dosbox ECE, or any Dosbox build that has patches and is a 32-bit build, since the 32-bit dynamic recompiler is robust.
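As a quick sanity check on those Dosbox figures, this small Python sketch compares how much the cycle counts grow with the overclock against how much the host clocks grow. The cycle counts are the ones quoted above and the clocks come from the benchmark header; the dictionary and function names are my own.

```python
# Dosbox cycle counts quoted above, at stock and with the host CPU overclocked
DYNAMIC = {"stock": 24000, "oc": 36000}
INTERPRETER = {"stock": 12000, "oc": 18000}

# Host clocks in GHz from the benchmark header: base and single-core boost
BASE_CLOCK = {"stock": 1.5, "oc": 2.3}
BOOST_CLOCK = {"stock": 2.4, "oc": 2.8}


def scaling(pair):
    """Return how many times larger the overclocked value is than stock."""
    return pair["oc"] / pair["stock"]


for label, pair in [
    ("dynamic recompiler cycles", DYNAMIC),
    ("interpreter cycles", INTERPRETER),
    ("base clock", BASE_CLOCK),
    ("boost clock", BOOST_CLOCK),
]:
    print(f"{label:26s} x{scaling(pair):.2f}")
```

Both cores gain about the same factor, and that factor sits much closer to the base clock increase than to the boost clock increase. That might suggest Dosbox mostly runs at the base clock on this laptop, but that is only a guess from these four numbers.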

PCEM: It can emulate any 386 processor. For the 486, any of the SX models run pretty well. The DX models need more explanation. A 486DX-25MHz runs fine at stock on the interpreter, which seems to hold a more constant speed than the dynamic recompiler. With the overclock, the interpreter handles a DX-33MHz pretty well. The dynamic recompiler is the way to get good performance and emulate faster CPUs, but in places like Windows 95, Windows idling or loading things can bring the performance down a bit more than expected. It can go above what the interpreter manages, but on this laptop the dynamic recompiler is better used in DOS mode. At stock, it can go up to a 486DX-40, and with the overclock, up to a 486DX2-50. I use DBOPL in the Sound Blaster setting to get a little more performance for the emulated CPUs. The laptop can’t go any higher to emulate Pentium CPUs, and 3DFX Voodoo hasn’t been tested, but I recommend setting its threads to 2, since the host CPU has four cores.

Recommended Emulators:
NES: Nestopia UE
SMS/GG/Genesis: Genesis Plus GX
SNES: Snes9x
PSX: PCSX-R PGXP, PCSX-Rearmed
N64: Mupen64Plus (GlideN64, Linux), Project64 (Jabo’s, Windows)
Saturn: Yaba Sanshiro
Dreamcast: Redream
GC/Wii: Dolphin
GB/GBC: SameBoy
GBA: mGBA
NDS: Desmume 0.9.11+
PSP: PPSSPP
PCEM: 486DX 25MHz/40MHz
DOS: Dosbox ECE

The recommended emulators are the ones I consider usable. If a system or emulator is not listed, it is either too slow to be playable, not developed far enough to be playable yet, or so light that speed is never a concern (Stella, Atari 2600). The emulators on the list are recommended for general use, with stock settings on most of them. Also, lighter games will run faster, and you can enable more settings for those games, like Runahead.

If any of you know which games are the most demanding for GBA, Saturn, Dreamcast, or DOS, let me know in the comments.

Using AMD cards on OpenGL Emulators:
On Windows, you can only use the official AMD drivers. They run pretty well for DirectX, but a lot of OpenGL programs run slower and are sometimes broken. The OpenGL drivers are not really optimized, and since Terascale GPUs haven’t been supported for at least four years as of this writing, you may not get to use newer OpenGL emulators or updates, even though the hardware feels like it should be more capable than how it performs. Even worse, the first few generations of AMD APUs had a short lifespan for graphics drivers from AMD, and current Windows 10 can make things a bit slower than its first version or Windows 7. Also, Terascale GPUs will not have Vulkan support on any drivers.

Using Linux with Mesa Drivers, r600g:
I tested OpenGL emulators on a few distros with the Mesa drivers. They perform almost as well as Nvidia’s OpenGL drivers.
On GlideN64, all the slowdowns in Quake 2 are gone. I don’t have that problem on Linux. The Mesa drivers are much more reliable; even with the few errors I explained above, they’re still very stable and efficient. Trust me, it’s far better than Windows.

Now that we’ve covered CPU performance for emulators, we’ll test the GPU performance of the Radeon HD 6520G on the next page.

Next Page on GPU emulation performance.

Previous Page on software and emulators use.