Hacker News | b0b_d0e's comments

South of Atlanta we had a tornado warning around 5am. Several downed trees and power outages but I haven't heard of any actual tornadoes that touched down in my area yet.


It seems unlikely that it'll ever get fully upstreamed, but according to a post on the Discord that I'm pasting in full below, there are certainly parts that may get upstreamed. I do a lot of NES-related development in my free time, and llvm-mos has been awesome for rapid development. I'd love to see some of this get upstreamed in the hopes that it could reduce the maintenance burden on the small team, but I'm not trying to speak for them or anything haha.

> So, I wanted to do a little blurb on the topic of upstreaming LLVM. My previous answer to this question was "yeah, we'd like to, but we have more work to do." This implied that we were working on it. More accurate to reality is that we were keeping it in the back of our heads and doing work to decrease the diff from upstream. The latter is also useful for making merges from upstream easier, and that's closer to the real reason I was doing it.

> Well, I've lost some rather high-profile fights upstream. In particular, upstream now strips out __attribute__((leaf)) from LTO builds, which is the whole thing that makes our static stack solution work. I personally think this decision was totally bogus, and I wasn't alone in this, but the conservative voice won out. My experience with the LLVM community so far has been one of deep conservatism; the stranger you are, the more you need to justify your strangeness with real-world impact. We're a very strange hobby project, which just doesn't bode well. We could make our backend a lot less strange by making it a lot less good, but then it becomes impossible to compete with incumbents like KickC and cc65.

> Accordingly, I'm not keeping the goal of maintaining llvm-mos entirely upstream in the back of my head anymore. I don't oppose work along those lines, unless it interferes with making llvm-mos the best 6502 compiler we can.

> That being said, LLVM may independently change to be more amenable to us, so this may become easier in the future. This has already happened prior to us, with GlobalISel and AVR making development of the backend far simpler. If that happens, I'll definitely reexamine my opinion on this.

> Alternatively, I'd definitely be open to upstreaming the unobjectionable parts of llvm-mos backend; we could then maintain the actual distribution as an increasingly thin fork from upstream. In fact, we could probably get started on that project today; I haven't yet spent much time considering the idea, but I'm starting to like it more and more, since it gives increased visibility, easier merges, and an excellent reference backend for upstream documentation. (We're really nice once you strip away all the crazy!)


It might be more accurate to say that OP "lost" that fight because the existing semantics of that LLVM attribute were half-baked and would have resulted in miscompilation bugs if preserved in LTO. That's something that can be fixed, at least in principle, but it requires adding some other extension to LLVM IR that's closer to what OP is looking for.


The story is a little more complex than that; the semantics were internally consistent, but foot-gun-ey, and they could technically be taken to match GCC's documentation of how they behaved, depending on an unfortunately ambiguous phrase in their docs.

The old semantics also matched GCC's actual behavior; when we brought this up in a GCC issue, those present decided that GCC's behavior was wrong, but the appropriate maintainer couldn't be reached for a final say. The issue is still stalled there.

There were also a few other folks trying to do the same kind of whole-program call graph analysis this enables, IIRC for GPU purposes. So there were a lot of conflicting opinions about how this should work and a lot of uncertainty: all the ingredients for a big, long, endless thread.

EDIT: This is of course my extremely biased take on the proceedings. This was also the first and only "open source kerfuffle" I've so far been a direct party to; I've seen these come and go on mailing lists before, but I was surprised how different it felt to actually be inside one.


Top of the homepage:

Platform for teams of OnlyFans girls to safely create in-person performances.


The fact that it's performant is tangential to the fact that it's running on ARM Android. IIRC DraStic still does a traditional JIT to recompile the guest ARM code into ARMv7-A for phones. (I might be misremembering, but I recall they had a whole lot of headaches trying to add 64-bit ARM support in time for the Android deadline for that. I also remember them talking about tons of headaches dealing with the new storage APIs on Android.)

It's really fast because of the tons of optimization that went into the application. It does many cool things to cut overhead on the CPU side, but it also does a lot on the graphics emulation side, including hand-rolled ARM NEON code for SIMD processing of polygons. I'm not very familiar with DS emulation, but one of the devs for DraStic contributed to another project I worked on, so I had some chats here and there about these sorts of things.


> Emulation involves a lot of large byte arrays and static structs that your code updates a lot to represent the hardware

Things work a bit differently for "modern" emulators, where the emulator recreates the kernel/OS at a high level. In these emulators, the games call into the system, and the kernel is expected to do everything necessary for the call. In the high-level approach, this means that if a call allocates, so does the emulator. (Edit: note that this is a simplified view, as both emulators map a 4GB region that they work in for the guest system memory, but there's still a ton of side allocations that happen "outside" of the guest kernel.)

There is a lot of work that goes on in this layer of emulation, and there are going to be objects that the emulator allocates and later destroys: process tables, thread lists, scheduler information, timing events, kernel synchronization primitives like mutexes, and so on, to name a few. I'm not familiar enough with Ryujinx to make any statements about how they handle GC, of course, but it's something they'll need to take into consideration. That said, there are plenty of other things, like JIT compilation, shader compilation, caches filling up, and on and on, that also cause micro-stuttering, so it's not uncommon for even C or C++ emulators to have annoying pauses too.


fail0verflow did this exact thing when they first got Linux working on the Switch, actually! https://twitter.com/fail0verflow/status/988543541403160576 It'd be a pain and a half to get a "proper" fast port with CPU emulation (either some lightweight JITing or natively running the code with hooks on SVC and such) and a GPU backend for the Switch graphics. Not nearly simple enough to attract someone to do it on a whim with no other reason than to say it happened haha


The other sibling comments were talking about general C++ vs C# performance, but I wanna get into some emulation-specific context here that's missing. I was one of the contributors to the yuzu project since it was first started, so I'm fairly familiar with the challenges of Switch emulation, albeit I'm not going to pretend I was one of the all-star devs responsible for making the magic happen. (I've recently stopped contributing to the project in order to pursue other interests, so I'm going to keep the information here general rather than specific, in case something's changed in the last month.)

Let's start off by breaking the core performance portions of emulation into a few broad categories: CPU emulation, for running the actual guest executable; kernel and OS emulation, for handling the system calls that games make; and GPU emulation, for translating the guest's GPU work into a modern graphics API that your PC can use. Now let's compare how language overhead affects each of these main scopes.

CPU Emulation - Both yuzu and Ryujinx use JIT compilation to recompile the guest ARMv8 instructions into x64 at runtime. The specifics of the two emulators' JITs are pretty different, and it'd be cool to go into more detail, but the mile-high view is that C# vs C++ isn't going to have much of an effect on the runtime difference; at least not nearly as much as the performance gap between the techniques and optimization levels that the JIT is capable of. The goal of JIT compilation for CPU code is to remove as much interpreter overhead as possible, so if your choice of programming language is slowing down the JIT, that suggests you have somewhere else to improve in the JIT ;)

Kernel/OS - This is the area with the largest performance difference between implementation languages, but with the major caveat that kernel and OS emulation requires the least processing power of the three categories mentioned here. The kernel and OS are responsible for managing and scheduling threads, handling network connections, and so on, but most of these things have fairly low impact on final application performance in comparison to CPU and GPU emulation. As a side note, emulators aren't the only groups interested in Switch OS/kernel work. The open source Atmosphère custom firmware for the Switch is working through recreating an open source kernel/OS for the Switch, and the two emulators benefit from their work too. (See the licensing exemptions here: https://github.com/Atmosphere-NX/Atmosphere#licensing)

GPU Emulation - This is probably the trickiest part of Switch emulation, and once again, it comes down to how you emulate it, not the language you use to write the emulator. The biggest performance differences between the GPU emulation of the two emulators boil down to technique differences, not the programming language. GPU emulation performance can be roughly broken into two parts: the "actual" GPU running time, and the state management/conversion. There's only so much an emulator can do about the actual GPU running time, since at some level you are going to need to run the game's GPU code, but the other half is a whole lotta code to avoid doing more work, and much of the GPU performance optimization goes here. Things like managing the game's GPU memory, avoiding changing or querying GPU state unless necessary, converting NVIDIA shader ASM into SPIR-V or GLSL, and so on, are generally not bottlenecked on the emulator's language of choice, but on the algorithms and designs that you use.

Also, a side note: the average comment about how "easy" Switch emulation is because of "off the shelf NVIDIA parts" really misunderstands just how much work goes into this part. Switch emulation benefits greatly from the open source NVIDIA reverse engineering efforts of teams like nouveau, and of others working on open source GPU acceleration on the Switch like https://github.com/devkitPro/deko3d but also from a great deal of effort by the Switch emu devs themselves, writing test cases to run on the Switch to find edge cases and document behavior. It definitely is not easy work.

At the end of the day, every drop of performance counts, but some drops are much, much larger than others. As such, the advantages of any language's performance characteristics will be heavily offset by the design choices the emulator uses. The creator of Ryujinx is very comfortable with C#, and with good development practices, there's no intrinsic reason one cannot achieve good performance in a C# emulator. And if someone decides it's worth the tradeoff to do some extra work for performance in exchange for a more comfortable development environment, then I say let them do what they want.

Shoutouts to both the yuzu and Ryujinx teams for all their hard work. I loved working on emulators, and I highly recommend that anyone who's interested in contributing give it a shot; it's a really challenging and rewarding kind of project where there's always something new to learn across a broad array of subjects.


redream is a Dreamcast emulator written entirely in C. libretro (the core API behind RetroArch) and RetroArch itself are primarily C. mupen64plus is in C, although the common third-party plugins and frontends are not always in C.


From what I can tell, Leadwerks has a "Game Launcher" application on Steam, and you can upload your game as a Workshop mod for the Game Launcher app.

http://www.leadwerks.com/werkspace/page/tutorials/_/publishi...

So it's not like it's actually a title on Steam, but it's available to be played through Steam, if that makes sense.


I don't think that was the case for me, at least. I felt like (to use your own analogy) the more I studied Rust, the better I learned how to use every other hammer. Learning a new programming language helped me learn C++ better, since I was constantly on the lookout for potential memory leaks and other common pitfalls that Rust prevents. Now when I code in C++, I always try to write the code with safety in mind.

