
Sure, we could use Burst to speed up some strategic parts, but that would not help with the core of the game.

To give some context, things are very complex in our game: we have fully dynamic terrain with terrain physics (land-slides), advanced path-finding of hundreds of vehicles (each entity has its own width and height clearance), trains, conveyors and pipes carrying tens or even hundreds of thousands of individual products, machines, rockets, ships, automated logistics, etc. There is no one thing that could be moved to Burst to get a 3x gain. At this point, we'd have to rewrite the entire game in C++.

So what's the reason we use C#? Productivity, ease of debugging and testing, and resilience to bugs (e.g. a null dereference won't kill the program). Messing with C++ or even Burst would cost us more time and, to be honest, the game would possibly not even exist at that point.

Could you share some details about your custom thread pool that got 3x speedup? What was the speedup from? It is highly unlikely that a custom thread pool would have any significant impact on the benchmark in our case. As you can see from Figure 3, threaded tasks run for about 25% of the total time and even with Mono, all tasks are reasonably well balanced between threads. Thread utilization is surely over 90% (there is always slight inefficiency towards the end as threads are finishing up, but that's 100s of ms). An "oracle" thread pool could speed things up by 10% of 25%, so that is not it.
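
The "10% of 25%" bound can be checked with a few lines of arithmetic. This is a minimal sketch in Python (chosen only for brevity; the arithmetic is language-independent). The function name max_speedup is made up for illustration, and the model assumes that the only thing a better thread pool could recover is the idle share of the threaded portion.

```python
def max_speedup(threaded_fraction, utilization):
    """Upper bound on whole-run speedup if thread idle time dropped to zero."""
    # Time a perfect ("oracle") scheduler could recover:
    # the idle share of the threaded part of the run.
    saved = threaded_fraction * (1.0 - utilization)
    return 1.0 / (1.0 - saved)


# Figures from the comment: threaded tasks are ~25% of total time,
# thread utilization is ~90%.
bound = max_speedup(threaded_fraction=0.25, utilization=0.90)
print(f"best-case speedup from a perfect thread pool: {bound:.3f}x")
```

Under those assumptions the recoverable time is about 2.5% of the run, nowhere near a 3x gain.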

Vectorization could help too but the majority of the code is not easily vectorizable. It's all kinds of workloads, loading data, deserialization, initialization of entities, map generation, precomputation of various things. I highly doubt that automatic vectorization from code generated by IL2CPP would bring more than 20% speedup here. The speedup from Burst would mostly come from elimination of inefficient code generated by Mono's JIT, not from vectorization.

For now, we are accepting the Mono tax to be more productive. But I am hoping that Unity will deliver on the CoreCLR dream. In the meantime, my post was meant to raise awareness and stir up some discussion, like this one, which is great. I've read lots of interesting thoughts in this comments section.



>Sure, we could use Burst to speed up some strategic parts... the game would possibly not even exist at that point.

Yeah, the thing with Burst is that it's a lot easier to work with if you start with it than having to replace/upgrade code later, especially if you're not familiar with it. A big issue is usually that you create structs with data and they're referencing other structs etc., and all those need to be untangled to really make use of Burst. I myself am also a big C# fan, it is a lot easier than using C. Unity has a lot of issues but there's a reason it's so widely adopted and used. (I myself am currently working on a Unity C# tool that I believe will speed up code development significantly).

Your game does sound as if it's a VERY ripe target for Burst usage based on the elements that you describe, but the real question should be whether you need it. For example, if you're already running at 60 fps on whatever your mid target hardware is, at whatever max + N% load/size for a game instance, then you don't need it. But if you're only hitting 40fps and design-wise want to increase e.g. your map size by 2x, then it might be something to look into. Also if you look at e.g. Factorio, they spend a LOT of time optimizing systems, but of course you first need to launch the game (which is and should be the priority).

If you have for example 25 systems (e.g. pathfinding, trains, pipes, etc.) and they're evenly balanced then, as you say, you won't increase your game speed by 2x by just converting one of those. BUT if, for example, your pipes take 4ms per frame to process, so you've instead adopted other strategies like only processing them every Nth frame or doing M pipes per frame, then using Burst to just get that 4ms down to 0.5ms might be a really worthwhile target to make your game play better. The same goes for all your systems, where the upgrades will have a cumulative effect.
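
To make the cumulative effect concrete, here is a rough frame-budget sketch in Python. Only the pipes figures (4ms down to 0.5ms) come from the comment above. The other systems, their timings, and the assumption that pathfinding would see the same 8x gain are hypothetical placeholders, not measurements from the game.

```python
def fps(frame_ms):
    """Frames per second for a given frame time in milliseconds."""
    return 1000.0 / frame_ms


# Per-frame cost of each system in ms. Only "pipes" is taken from the
# discussion; the rest are hypothetical numbers chosen to total 25 ms.
systems_ms = {"pipes": 4.0, "pathfinding": 6.0, "trains": 3.0, "everything_else": 12.0}


def frame_time(systems, converted):
    """Total frame time when some systems are replaced by faster versions."""
    return sum(converted.get(name, ms) for name, ms in systems.items())


before = frame_time(systems_ms, {})
pipes_only = frame_time(systems_ms, {"pipes": 0.5})
# Hypothetical: pathfinding gets the same 8x gain as pipes (6 ms -> 0.75 ms).
pipes_and_paths = frame_time(systems_ms, {"pipes": 0.5, "pathfinding": 0.75})

for label, ms in [("before", before),
                  ("pipes converted", pipes_only),
                  ("pipes + pathfinding converted", pipes_and_paths)]:
    print(f"{label}: {ms:.2f} ms/frame, {fps(ms):.1f} fps")
```

With these made-up numbers, converting pipes alone moves the frame from 25ms (40 fps) to 21.5ms (about 46.5 fps), and converting a second system brings it to 16.25ms, just over 60 fps. No single conversion doubles the frame rate, but the savings add up.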

I highly suggest learning just the basics of Burst in your spare time and trying it out on something basic to get the feel of it. As with all code/libraries it'll unfortunately take some time to figure out how to effectively use it. Roughly speaking:

- You don't have to have SOA data, but it helps. At the start just convert methods over 1 to 1.
- You have to convert most C# container types to Burst ones, for example in struct Vehicle { Wheel[] wheels } you need to change Wheel[] over to NativeArray<Wheel>, and the Wheel struct itself also needs to not use complex types etc. Other types such as NativeSpan are also very useful, instead of storing the wheels just use a ref Span to them instead.
- After you have the basics going you can try out SOA along with more math/less logic so that the code can be vectorized. Once you see that big speedup for certain types of code it's hard to go back.

>Could you share some details about your custom thread pool that got 3x speedup? What was the speedup from? It is highly unlikely that a custom thread pool would have any significant impact on the benchmark in our case. As you can see from Figure 3, threaded tasks run for about 25% of the total time and even with Mono, all tasks are reasonably well balanced between threads. Thread utilization is surely over 90% (there is always slight inefficiency towards the end as threads are finishing up, but that's 100s of ms). An "oracle" thread pool could speed things up by 10% of 25%, so that is not it.

My thread pool itself is pretty standard: it spins up some heavy threads and uses ManualResetEvent to trigger them. Its advantage lies in pre-registering simple Action calls (with or without parameters) for the methods that'll be called when the thread runs, plus more game-oriented options such as whether we wait on thread completion, interleave with other threads, etc. A big plus is that it has a self-optimization function: it self-adjusts the thread count against the total time runs take, the total number of items being processed for the given workload, etc., so as to automatically find very good sizes for all those elements on the target computer, vs just assuming e.g. 32, 64 or 128 inner elements and launching the max available threads on the PC (as thread pools usually do).
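
The general shape described here (long-lived threads, a callback registered once, an event to trigger each run) can be sketched as follows. This is not the commenter's implementation: it is a Python analogue in which threading.Event stands in for .NET's ManualResetEvent, the class name EventWorker and the workload are invented for illustration, and the self-tuning of thread count and batch size is left out.

```python
import threading


class EventWorker:
    """One long-lived thread that runs a pre-registered callback per trigger."""

    def __init__(self, action):
        self._action = action             # registered once, reused on every run
        self._start = threading.Event()   # stays set until cleared, like ManualResetEvent
        self._done = threading.Event()
        self._stop = False
        self._thread = threading.Thread(target=self._loop, daemon=True)
        self._thread.start()

    def _loop(self):
        while True:
            self._start.wait()            # sleep until triggered
            self._start.clear()
            if self._stop:
                return
            self._action()
            self._done.set()              # signal completion to the caller

    def trigger(self):
        self._done.clear()
        self._start.set()

    def wait(self):
        self._done.wait()

    def shutdown(self):
        self._stop = True
        self._start.set()                 # wake the thread so it can exit
        self._thread.join()


# Hypothetical workload: each worker sums its own slice of the data.
data = list(range(1000))
n_workers = 4
results = [0] * n_workers


def make_action(i):
    chunk = data[i::n_workers]

    def action():
        results[i] = sum(chunk)

    return action


workers = [EventWorker(make_action(i)) for i in range(n_workers)]
for w in workers:
    w.trigger()                           # start this frame's work
for w in workers:
    w.wait()                              # block until every worker is done
total = sum(results)
for w in workers:
    w.shutdown()
print(total)
```

Because the threads and callbacks are created up front, each run costs only an event set and an event wait. A caller that does not need the result immediately can skip wait() and pick it up later.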

>Vectorization could help too but the majority of the code is not easily vectorizable. It's all kinds of workloads, loading data, deserialization, initialization of entities, map generation, precomputation of various things. I highly doubt that automatic vectorization from code generated by IL2CPP would bring more than 20% speedup here. The speedup from Burst would mostly come from elimination of inefficient code generated by Mono's JIT, not from vectorization.

Yeah, if it's startup/generating code that's mostly bypassed by loading a game then it's not worth switching over. Do note that code compiled by Burst will in general be more optimized than Mono just due to better tooling, but in general it's not worth moving over just for that, given the amount of work involved. The real wins come in if some generating element that's done often is taking too long, or during gameplay where you can replace elements in the game that take e.g. N milliseconds to calculate every frame and drop those down to 1/10th - 1/100th of the time they used to take.

Good luck!



