Yuki Story MMD Chapter 9 - Yuki’s Defense - Page 8
Yuki talks to Ron after school about not picking on Minoru and doubts
that he has really changed. She also offers him a competition tomorrow.
Yuki Story MMD Chapter 9 - Yuki’s Defense - Page 7
Minoru’s former teacher decides to talk to Yuki when her class begins.
She warns Yuki about fighting with Ron, while Yuki tells her that she wants
to protect Minoru.
Yuki Story MMD Chapter 9 - Yuki’s Defense - Page 6
Yuki and Ron get caught fighting in the hallway. Yuki decides to take Minoru to get his hurt knee checked.
[MMD Tutorial] SGSSAA with Nvidia Inspector
Edit 3/12: Added LOD quality and examples.
This is my second post about SGSSAA on MMD.
I know this post is a bit more overdue than I said it would be. I’ve been using this method for the best anti-aliasing, and SGSSAA is the best method I’ve found. To use SGSSAA with Nvidia cards in MMD, use Nvidia Inspector to make a profile for MMD and add MikuMikuDance.exe, PMXEditor.exe (32 or 64 bit), and VMDViewer.exe as its applications. Then copy the settings from this photo exactly, paying attention to the black text rather than the grey. Also, you don’t need AA compatibility bits to use SGSSAA for MMD.
AA Fix must be turned on to avoid artifacts when using custom AA.
Vertical Sync must be turned off so that you get very fast rendering times when rendering to AVI. Tearing doesn’t appear, since the windows all share the desktop’s V-Sync.
AA-Mode can be either Enhanced or Override, but I prefer Enhanced since I can toggle Antialiasing in MMD’s options to turn it off if I need to. Override will always force it. Enhanced is helpful if your GPU is very low-end and you want to turn AA off during editing.
AA-Setting must be 8xQ and AA-Transparency Settings must be 8x Sparse Grid Supersampling to use SGSSAA, and 8x is the best quality on offer. You will want that maximum quality when rendering to AVI.
Anisotropic Filtering must be at 16x to have better mipmapping details.
Texture Filtering Quality must be put at High Quality to have better mipmapping along with SGSSAA.
Driver Controlled LOD should be off and LOD bias (DX) should be at max -3.0. More about LOD below.
Hit apply and start using MMD, and you will notice the very high quality. In the photo at the top, check the yellow box to see the default MMD anti-aliasing, and you will notice that it doesn’t cover everything. Shader aliasing, temporal aliasing, and transparency aliasing are all present. SGSSAA corrects this by doing full supersampling with a sparse grid pattern, which removes temporal aliasing greatly, even better than standard supersampling. Temporal aliasing is easier to see in motion, since temporal means across several frames. Standard SSAA doesn’t help much with temporal aliasing and only reduces it a little. MSAA doesn’t cover shader aliasing or texture aliasing. Alpha aliasing is removed with MSAA if the settings have an option for alpha coverage. SGSSAA also helps on polygon edges, and even at 8x it looks as if many more samples were used to smooth out aliasing in the image. 4x barely showed temporal aliasing when I used it in some games, but 8x is the more professional option and gets rid of more of the small aliasing. It is compatible with any post-processing effects.
About LOD: the difference is subtle, and you can notice it at lower resolutions, even with 16x Anisotropic Filtering. It’s not as blurry as going without Anisotropic Filtering. However, if you are rendering pictures or video at SD resolution, you may notice a bit of blur on textures. I’ve used a -3.0 bias to match how it looks with AMD’s 8x RGSSAA method or when downsampling the image. I would go down by one whole negative number per quality level: -1.0 for 2x, -2.0 for 4x, and -3.0 for 8x. The LOD blur isn’t really noticeable at higher resolutions, and comparing images at high resolution shows only a small difference, so I made this step a little more optional. I still prefer doing it to match AMD’s method and how downsampling looks.
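The rule of thumb above (one whole step of negative LOD bias per doubling of the sample count) can be written as a small Python sketch. The function name is made up for illustration and is not part of Nvidia Inspector; it only reproduces the three values quoted in this post and assumes the pattern holds for power-of-two sample counts.

```python
import math


def lod_bias_for_samples(samples: int) -> float:
    """Negative LOD bias suggested in this post for an SGSSAA sample count.

    Hypothetical helper: it encodes the post's rule of thumb
    (-1.0 for 2x, -2.0 for 4x, -3.0 for 8x), nothing more.
    """
    # Only power-of-two sample counts make sense for this rule.
    if samples < 1 or samples & (samples - 1):
        raise ValueError("sample count must be a power of two")
    return -float(math.log2(samples))


for n in (2, 4, 8):
    print(f"{n}x SGSSAA -> LOD bias {lod_bias_for_samples(n):.1f}")
```

You would still type the resulting value into the LOD bias (DX) field by hand; the sketch is only a way to remember which bias goes with which sample count.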
NOTE: Since the MMD-Ray shader only displays its own framebuffer, anti-aliasing has no effect at all, not even SGSSAA. FXAA and SMAA are our only options, but they still don’t help with temporal or specular lighting aliasing. If you want the best AA method with MMD-Ray, follow the Downsampling post.
Yuki Koriyama in Swimsuit
Yuki decides to show off her swimsuit at the beach. She also wants you
to play games like volleyball with her because she’s really good at it.
Her bikini was made from some bases and models, and the poses and beach are from DeviantArt.
Yuki Story MMD Chapter 9 - Yuki’s Defense - Page 5
Yuki is very aware that Ron wants to get Minoru, and Ron ends up getting caught by her.
Yuki Story MMD Chapter 9 - Yuki’s Defense - Page 4
Yuki, Rin, and Retasu have lunch together with Minoru while watching over him.
Yuki Story MMD Chapter 9 - Yuki’s Defense - Page 3
Yuki tries to plan out how to protect Minoru when she’s away.
Yuki Story MMD Chapter 9 - Yuki’s Defense - Page 1
Ron, Aki, and Ren decide to speak to Minoru to tell him off for Yuki winning the race.
As of December 24, 2019:
I wanted to test the performance of the FX 8350 processor, since I am concerned about single-thread performance, which most emulators rely on. Several Intel architectures have better single-thread performance than AMD FX processors even at lower clock speeds. That goes back as far as Sandy Bridge, which has the better IPC that high-end emulators can take advantage of. I wanted to record whether each emulator runs at full speed with certain settings on my desktop. I will also post my laptop benchmark after the emulators listed for the FX 8350. The FX 8350 is running at 4.0 GHz without turbo, since turbo gives very little boost in these programs. Pretty much any recent i7 processor from the last few years performs at nearly double the speed of this CPU.
All Retroarch tests use audio sync off and speed limit off to get the highest fps, with vsync off, hard sync off, runahead off, and frame delay off.
Benchmarks: FX-8350-4ghz
All tests record the lowest FPS held on the exact scene for a while, to see how the emulator performs and how to avoid sound stuttering for a smooth experience. Retroarch uses DX11 as the main driver, Vulkan second for hardware rendering, and OpenGL last for cores that need hardware rendering. Standalone builds are used for some emulators, either for reliability or because no libretro core exists. When testing a 3D emulator that only has OpenGL rendering, Nvidia cards give the fastest OpenGL performance; in that case using DX11 or Vulkan wouldn’t matter. GPU bottlenecks are not an issue, since I use native resolution without any shaders or anti-aliasing. The lowest FPS of a heavy game is a way to see which emulator you could generally use. Note that if the single most demanding game on a system doesn’t run at fullspeed, that doesn’t mean you can’t use the emulator with good performance in general. Dolphin’s most demanding game is Rogue Squadron 3: Rebel Strike, and it’s not playable. The point is not whether you can play the one most demanding game, but how popular titles perform. Some emulators may not run a demanding game at all due to compatibility. It’s a good way to see how much performance you would get in general use. Staying over 60fps is a great way to get a smooth experience and to run any or most games without problems. While I list the lowest FPS that you can encounter (constant lows, not counting stutter), the average is a few fps higher on somewhat or really demanding emulators, and several more on less demanding ones.
Overclocking:
The FX-8350 is overclockable, but only a good quality motherboard, especially a 990FX board, can overclock the CPU fairly high. Many users can reach 4.8 GHz, and a certain number can reach 5 GHz. Not every user can go that far, either because of the silicon quality of their 8350 or, as in my situation, because of a lower quality motherboard. I am using an ASRock 970 Extreme4, and I couldn’t overclock my CPU without it throttling. That means that during stress tests on all cores, it throttles down to the CPU’s lowest clock, 1400 MHz, for a while. It’s due to the VRM design and the motherboard not having 8+2 power input for overclocking an eight-core CPU.
I did a little workaround. I set my CPU to disable core parking, that is, to use one core per module, which leaves me with four cores. The FX-8350 has four modules. Each module has two integer cores that share one floating-point unit. With core parking disabled, each module uses only one integer core and one floating-point unit, and you gain up to 5% in emulator speed. Since I am only using four cores in this setup, I can apply a decent overclock. I can go up to 4.4 GHz at 1.3 V, from 4.0 GHz at 1.275 V. It throttles mildly, and only during stress tests like Intel Burn Test or Prime95. Keep in mind that emulators like RPCS3, Xenia, or the Angrylion-Plus plugin do need more cores. Cemu is still fine on a four-core CPU; switching from the dual-core to the triple-core recompiler gives a small improvement.
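To put the overclock in numbers, here is a quick Python sketch. The clock and voltage figures are the ones from this post; the power estimate uses the common rule of thumb that dynamic power scales roughly with frequency times voltage squared, which is my assumption here and not something I measured on this board.

```python
def relative_change(old: float, new: float) -> float:
    """Fractional change going from old to new (0.10 means +10%)."""
    return (new - old) / old


# Figures quoted in the post.
STOCK_GHZ, STOCK_VOLTS = 4.0, 1.275
OC_GHZ, OC_VOLTS = 4.4, 1.3

clock_gain = relative_change(STOCK_GHZ, OC_GHZ)

# Assumption: dynamic power scales roughly with frequency * voltage^2.
# This ignores leakage and is only a ballpark, not a measurement.
power_ratio = (OC_GHZ / STOCK_GHZ) * (OC_VOLTS / STOCK_VOLTS) ** 2

print(f"Clock increase: {clock_gain:.1%}")
print(f"Estimated per-core dynamic power ratio: {power_ratio:.2f}x")
```

The estimate is per active core, so running four cores instead of eight is what leaves the VRM enough room for the higher clock.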
What about the default boost? The boost goes to 4.2 GHz on one core, but since my motherboard is not very strong, the boost doesn’t stay consistent and barely helps performance. I only see an increase of 1FPS.
NES:
Mesen-Stock: Megaman 2 Intro
300.0+
Mesen-Very-High-OC: Megaman 2 Intro
137.0 (151.0)
Nestopia UE works very well and is very light. Mesen performs great with the overclocking setting.
SMS/GG/Genesis/CD/32x:
Genesis-GX-Nuked: Virtua Racing Demos
150.0
The Genesis GX Plus core is too efficient for me to find any issue. It is currently the most accurate, and it was originally made for GC and Wii. Recently a new OPN2 audio core was added to it, and the Nuked FM sound doesn’t reduce speed much on this processor. The test above played Virtua Racing on Genesis with Nuked YM2612 and the low pass filter. Virtua Racing is the performance test since it uses the SVP chip.
SNES:
Bsnes-v110: Yoshi’s Island Title Screen
122.0 (137.0)
ST018 ARM Game
84.0 (93.0)
Bsnes-v110(Higan): Yoshi’s Island Title Screen
46.0 (50.0)
ST018 ARM Game
38.0 (41.0)
The new Bsnes from Byuu seems pretty useful for more accuracy and is based on Higan, the most accurate emulator. It has fast options turned on by default, has per-game checks that disable some fast options for accuracy fixes, and can render Mode 7 games at high resolutions.
Performance on the FX-8350 is pretty solid. The most demanding games are the ones that use the ST018 chip; only three Japanese games have it, and one of them is used for testing. It should perform pretty well right out of the box. For HD Mode 7, I can go up to 1440p. However, that drops close to 60fps, and 1200p is probably the best option for smooth performance while playing games in high resolution.
You can bring Bsnes to Higan-level accuracy by unchecking all enhancement options.
I suggest using the current Bsnes core or the new standalone, since it does many things better, and not picking the other bsnes versions in Retroarch.
I haven’t tested the Super FX overclocking feature.
I recommend the main SNES9X if you want to fast forward and use Retroarch’s Runahead for less latency.
Virtual Boy: Simple, perfect performance, regardless of hard sync.
Sega Saturn:
Mednafen Saturn: Daytona USA CCE
44.0 (51.0)
Yaba Sanshiro: Daytona USA CCE
60.0+
Kronos 1.4.5: Daytona USA CCE
65.0 (72.0)
I haven’t tested many games, as I haven’t looked around the Saturn library much, but I found a few and had tested them before. In Daytona, the title screen after fading from the demo seems to be the scene I could find that performs lowest in Mednafen.
SSF, the best Saturn emulator exclusive to Windows, seems to have nearly identical speed to Mednafen, only slightly faster, but I haven’t used it for a long time.
Yaba Sanshiro is tested and I was getting full speed on every game I used. The performance and OpenGL rendering are pretty great. I haven’t forced vsync off yet, but it should run much faster. Certain graphics are less compatible and not all games work on its dynamic recompiler, but it performs really well.
As of Kronos 1.7.0, I can run the cached interpreter pretty well. I get full speed on pretty much any game, and since it uses the cached interpreter, it is compatible with more games than Yaba Sanshiro. Performance is plenty for the FX CPU. Remember that AMD’s Windows drivers make Yaba and Kronos perform lower since they use OpenGL, but Linux will run them at full potential. I used standalone builds since they perform the best and get more updates.
Mednafen-Saturn can drop to ¾ of fullspeed sometimes, so I would just use the other two emulators.
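Here is a small Python sketch of how the lowest-FPS figures above translate into a percentage of fullspeed. It assumes a 60 fps target (NTSC), which is my assumption for these tests, and it only uses the first figure listed for each emulator. The Yaba Sanshiro entry was "60.0+", so treat its number as a lower bound.

```python
FULL_SPEED_FPS = 60.0  # assumption: NTSC games targeting 60 fps

# Lowest FPS on the Daytona USA CCE test scene, as listed in this post.
saturn_lowest_fps = {
    "Mednafen Saturn": 44.0,
    "Yaba Sanshiro": 60.0,  # listed as "60.0+", so this is a lower bound
    "Kronos 1.4.5": 65.0,
}


def percent_of_full_speed(fps: float, target: float = FULL_SPEED_FPS) -> float:
    """Convert an FPS reading into a percentage of the target speed."""
    return 100.0 * fps / target


for name, fps in saturn_lowest_fps.items():
    pct = percent_of_full_speed(fps)
    verdict = "fullspeed" if fps >= FULL_SPEED_FPS else "below fullspeed"
    print(f"{name}: {fps:.1f} fps = {pct:.0f}% ({verdict})")
```

With those inputs, the Mednafen reading works out to roughly three quarters of the target, which lines up with the ¾ figure mentioned above.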
PlayStation:
Mednafen-PSX SW: Crash Team Racing
71.0 (75.0)
I tested this game during the entire intro, from the start of the race to Crash waking up on the grass. I’ve only tested this in software.
Mednafen-PSX runs at full speed on every game I used with default settings. I can play any game with PGXP at any resolution. Vulkan is the fastest. PGXP + CPU plays games at near fullspeed, so you can notice stutter in the sound.
On PCSX-R PGXP, any game should run at full speed with PGXP Memory. I used the fastest and most compatible plugin, Pete’s OpenGL2 2.9 Tweak. I can overclock the emulated CPU to 2x-4x without a speed penalty. Since the recommended plugins run audio asynchronously, some games, like Spyro, can get away with near fullspeed when using PGXP + CPU. GPUblade runs decently, but for software rendering use Pete’s Software or, better, Mednafen PSX for better performance.
N64:
ParaLLEl-Plus: Pokemon Snap Intro
17.0
Project64-Plus: Pokemon Snap Intro
23.5
All of them are tested with Angrylion-Plus and the available RSP set to LLE.
GlideN64 works very well on my system in any emulator. It performs great and had no performance penalty in any demanding game with full framebuffer settings. The new plugin, Angrylion-Plus, is a good fit for the FX-8350 since it can use all 8 cores. On the newest Project64, you get better performance with the default RSP in LLE mode than with the Hatcat or CXD4 RSP in LLE. Many games run well with Angrylion-Plus, but not every game is at fullspeed, like the test scene in Pokemon Snap where I get 25fps. That scene is the heaviest stress test I could find for an N64 game when using LLE, along with being close to an explosion in Goldeneye 007. The performance is only bottlenecked by the RSP LLE core. It does render more frames in those scenes than the console would, because the GPU bottleneck isn’t emulated. You do get better performance than Angrylion without multi-threading, but LLE is not as fast yet. M64P has on par or slightly better compatibility than Project64.
Dreamcast: Forget NullDC and Demul, because Flycast, a fork of Reicast, is a much better emulator, generally speaking. Flycast runs pretty well on Libretro, but is more stable as a standalone app. Games run pretty well, including WinCE games. Vulkan is available, so you can use it with AMD GPUs on Windows with no performance penalty.
https://flyinghead.github.io/flycast-builds/
GBA: Plays very well. Generally very high FPS on any game with fast forwarding.
NDS:
Desmume 0.9.11+: Pokemon Black2/White2 Title Screen
55.0 (61.0) / 77.0 (83.0) Interpreter/100 JIT
MelonDS 0.9.0: Pokemon Black2/White2 Title Screen
32.0 (38.0) / 57.0 / 81.0 Software/OpenGL/OpenGL(Dynamic-8)
I’m testing two emulators on the Pokemon B2W2 title screen. Let’s go to MelonDS first. Since 0.8, it has OpenGL rendering and can use resolutions higher than 1x. It is faster than software, and a bit faster than threaded software. Even at higher resolutions it is faster than Desmume. On my FX CPU, it almost reaches fullspeed with OpenGL on a known demanding scene. Generally, it plays at fullspeed in many games, and only the interpreter is the bottleneck. MelonDS 0.9 will have a dynamic recompiler, and it performs really well. On the same scene it reaches 81FPS, so it is really smooth on the FX CPU. Threaded rendering can almost reach OpenGL performance. However, it can break compatibility slightly more than OpenGL.
Desmume can use the dynamic recompiler and OpenGL, but I tested the interpreter with any accuracy options ticked and left it at 1x resolution for a fair comparison. Video rendering is not the bottleneck for the interpreter. Both Software and OpenGL perform really fast, especially Software. OpenGL seems more demanding at higher resolutions than in MelonDS. Recent builds seem to match the old x432 build or run a bit faster, and are still reliable.
Generally, go with MelonDS for the CPU.
GameCube/Wii:
Dolphin x64 2019-11-28: Super Mario Galaxy
65.0 (72.0)
Rogue Squadron 2
31.2 ()
Performs very well in many games, but heavy games like Rogue Leader or Wind Waker won’t hold a stable 60fps as far as I know, regardless of graphics resolution and API. The most demanding game that is compatible is Rogue Squadron 2. Games like it can be slow in some scenes, especially one scene on Hoth with many entities onscreen from the startup cutscene. Generally, games like those will drop below fullspeed a couple of times, and that includes games like Twilight Princess or Metroid Prime 3.
PS2:
Shadow of the Colossus - PCSX2 1.5.0-dev
Safest: 24.3 (26.8)
Safe: 25.7 (28.7)
Balanced: 28.3 (30.8)
Balanced SW: 14.7
Aggressive: 49.8 (56.7)
Very Aggressive: 52.3 (58.8)
I barely have any PS2 games, but I found at least a few. The FX-8350 helps more with software rendering in games that are less CPU demanding. GPU-demanding games may not run at fullspeed in software, but in hardware mode those games should play fine if they are not demanding on the CPU. Two cores are used by default, with the second handling graphics. MTVU threading helps with some games that use it and is highly compatible, but it can hang or slow some games. The graphics plugins used are the SSE4 ones. Shadow of the Colossus is commonly known to be CPU demanding. The FX-8350 won’t run it at fullspeed, even in hardware mode with default settings. I tested the speed hack presets above. I suggest either Safe or Balanced; Balanced additionally enables MTVU threading, which can improve performance in some games.
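To compare the presets at a glance, this Python sketch divides each hardware-mode result by the Safest result. It only uses the first figure listed for each preset above, and the dictionary and function names are just for illustration.

```python
# First FPS figure listed for each PCSX2 speed hack preset in this post
# (Shadow of the Colossus, hardware mode).
preset_fps = {
    "Safest": 24.3,
    "Safe": 25.7,
    "Balanced": 28.3,
    "Aggressive": 49.8,
    "Very Aggressive": 52.3,
}


def speedup_vs_baseline(results: dict, baseline: str = "Safest") -> dict:
    """Return each preset's FPS divided by the baseline preset's FPS."""
    base = results[baseline]
    return {name: fps / base for name, fps in results.items()}


speedup = speedup_vs_baseline(preset_fps)
for name, ratio in speedup.items():
    print(f"{name:<16} {preset_fps[name]:5.1f} fps  {ratio:.2f}x vs Safest")
```

The jump from Balanced to Aggressive is by far the largest step in the table, which is worth keeping in mind when weighing speed against the compatibility of the safer presets.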
PSP: I haven’t had any performance issue on cpu side of PPSSPP.
Wii U: On Cemu, I can play Mario Kart 8 very well. It’s best to set the cache buffer to low, as going high doesn’t show any benefit. As of 1.14.0, I can play with a shared shader cache to avoid shader compilation stutter. In game, it is usually borderline fullspeed when using the triple-core recompiler. It usually drops to 56fps at the lowest, but that is not really noticeable thanks to async audio. That’s the only game I tested. In the future I will stick with any Cemu version that doesn’t remove async audio or hurt performance, so that it runs decently on my hardware or weaker hardware without sound stutter. As for Breath of the Wild, I’ve seen it run at around 30fps, so using FPS++ Dynamic or Static 30fps gives you the best gameplay. As of 1.16.0, it has a Vulkan renderer, so it runs really well on AMD GPUs.
3DS: DrWhojan’s or the Canary builds are the best to use with the games that are playable on them, like Pokemon Sun and Moon, ORAS, or Metroid, thanks to their speed improvements. More games are playable since the GPU shaders and ignore implementation. On my processor with this build, Pokemon Sun reaches fullspeed aside from shader compilation, so it does stutter when new objects are loaded. Also, there is no shader cache yet. Mario Kart 7 plays at fullspeed most of the time. When looking directly at the lens flare, it runs at almost fullspeed at native resolution. Higher resolutions perform lower, but that relates to CPU usage. I would benchmark the Monster Hunter games since they are really heavy, but they are not playable, and not fast even on high-end Intel CPUs. I have sound stretching unchecked so that audio stutter won’t last too long or act weird when a stutter happens.
Dosbox: I used to use the Dosbox Daum build and Dosbox-X, and they’re buggy, especially the first one. Use Dosbox ECE instead, since it has many features that the main build doesn’t have and it aims for accuracy and performance. In normal mode instead of dynamic, I can reach up to 35000 cycles in the intro of Jazz Jackrabbit CD, which I use as a test since it is the most stressful scene I found. I don’t know which DOS game is the heaviest on the host CPU when aiming for the highest cycles in normal mode. I used Nuked OPL3 and Gravis Ultrasound. Using the max settings is recommended, as it gives you a smooth experience without sound crackling, even in dynamic mode.
PCEM: I can only use the Pentium 75 or Cyrix PR90. These processors are the only ones I can use without slowdowns on a Socket 7 emulated PC. Any Voodoo card runs well, at least at 480p. On the 486 platform, I can use the AMD 5x86/P75 at most with the dynamic recompiler. In interpreter mode, I can use a 486 DX2 at 66 MHz at most without encountering any stutter.
PS3: I only used Kingdom Hearts on RPCS3, and it runs pretty well. I know a lot of popular titles are more demanding and won’t be playable on an FX CPU. RPCS3 is one of those emulators where you do need all threads available.
Recommended Emulators:
NES: Mesen
SMS/GG/Genesis: Genesis GX Plus
SNES: BSNES v111, Snes9x
PSX: Mednafen PSX HW
N64: Mupen64Plus (M64P Gliden64)
Saturn: Kronos
Dreamcast: Flycast
PS2: PCSX2
GC/Wii: Dolphin
Xbox: CXBX-Reloaded
Wii U: Cemu
PS3: RPCS3
X360: Xenia
Switch: Yuzu
GB/GBC: Sameboy
GBA: mGBA
NDS: Desmume Dev
3DS: Citra
PSP: PPSSPP
PCEM: Pentium 75
DOS: Dosbox ECE
Recommended emulators are listed as usable. If a system or emulator is not listed, it is either not playable due to speed, not yet at a playable state, or more than fast enough to play (Stella, Atari 2600). The emulators on the list are recommended for general use. This is with stock settings on most of the emulators listed.
If any of you know what are the most demanding games for GBA, Saturn, Dreamcast, or DOS, let me know and comment.
Using AMD cards on OpenGL Emulators:
Recently, I switched from the GTX 950 to an RX 570 8GB for its large, more affordable VRAM to produce videos at 4K. I tested some emulators that only use OpenGL. Some were slower than expected, at least on Windows.
Citra doesn’t seem to reach the full potential of hardware shaders. It is still faster than without them, but in Battle Royal in Pokemon Sun and Moon, the lowest fps I got, not counting shader stuttering, is 30fps, which is the normal speed, even with the speed limit unchecked. It goes up to 50fps.
On GlideN64, Quake 2 stutters in a few areas of the demo. That didn’t happen before. Using the 2D rendering options makes the entire thing slower. It supports Depth Compare fine, but 2D rendering is the problem, along with an additional minor slowdown regardless of resolution. Glide64 stutters too, but GlideN64 is the better one to use.
Reicast and Redream play fine as far as I tested.
Cemu, as noted, performs slower. On Mario Kart 8, the lowest is 32fps and the average is usually 50fps. It rarely reaches fullspeed.
Using Linux with Mesa Drivers, RadeonSI and RADV:
I tested OpenGL emulators on Ubuntu with Mesa 18.2.2. It performs better than or as well as Nvidia’s OpenGL drivers.
On GlideN64, all the slowdowns in Quake 2 are gone. I don’t have that problem. 2D rendering hasn’t been tested, but it should perform fine. Depth Compare isn’t fully supported since a few extensions haven’t been implemented in the drivers yet. Depth Compare is a new feature added to GlideN64. Glide64 has minor stutters, just as before. GlideN64 is faster overall.
On Citra, it performs closer to Nvidia. Hardware shaders are used, and it performs a bit faster than the official drivers.
That’s all I’ve tested.
The end:
I plan to upgrade my FX-8350 to a 3rd Gen Ryzen CPU, and I expect to see great benefits in many of those emulators. This will be the last time I test the FX-8350. I put all the benchmarks here to show the power of a CPU that can run some demanding emulators. I have kept this post for two years, and I do plan to upload the Ryzen CPU benchmarks for emulators. Merry Christmas.