The tedious task of (why not) reaching 100% CPU usage

As one of the many programmers caught up in the endless race for better performance, you were looking for a page that would teach you how to achieve the legendary 100% CPU usage… I tricked you.

Instead, I will point out why aiming for full CPU consumption could be a mistake…

The CPU is not the only one in the race: GPU, memory bandwidth, I/O…

People tend to blame CPU horsepower for a lack of performance. The same people would say: "If we need more performance, then just put in more CPU". It does not work like that.

A very good example is the cache miss: your game could be sitting idle because it incurs a lot of cache misses, which a) thrash the CPU cache, b) cause a lot of memory accesses and c) in the worst case, when the working set no longer fits in RAM, trigger swapping to disk.
Doing a lot of disk access is not good: disk accesses are typically blocking calls. Disk reads are usually fast because, the disk having a high latency, there are many caches in between. Still, there is a worst case, and while it waits for the DVD to stream the file, your application is not using 100% CPU…

So, before you start "optimizing for more CPU", make sure that other components are not "blocking" the CPU. Minimizing the amount of I/O is a good thing and, in the same vein, minimizing memory accesses is a good thing. BUT, I repeat: make sure these really are the blocking components, or else you will fall into the perilous den of premature optimization.
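One rough way to check whether the CPU is the blocking component is to compare the CPU time a task consumed with the wall-clock time it took. The sketch below is only an illustration: `cpu_utilization`, `blocked_task` and `busy_task` are hypothetical names, the sleep stands in for a disk or network wait, and a real profiler will tell you far more.

```python
import time

def cpu_utilization(task):
    """Run task once; return (wall_seconds, cpu_seconds, cpu_seconds / wall_seconds)."""
    wall_start = time.perf_counter()   # elapsed real time
    cpu_start = time.process_time()    # CPU time of this process (sleep excluded)
    task()
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    ratio = cpu_used / wall_used if wall_used > 0 else 0.0
    return wall_used, cpu_used, ratio

def blocked_task():
    # Stand-in for waiting on a disk or the network: the CPU does nothing.
    time.sleep(0.2)

def busy_task():
    # Stand-in for real computation.
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total

if __name__ == "__main__":
    for task in (blocked_task, busy_task):
        wall, cpu, ratio = cpu_utilization(task)
        print(f"{task.__name__}: wall={wall:.3f}s cpu={cpu:.3f}s ratio={ratio:.2f}")
```

A ratio close to 0 suggests the task spends its time waiting on something else, so a faster CPU would not help it. A ratio close to 1 suggests it is genuinely CPU-bound on a single thread.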

More performance: a better algorithm?

Think of it this way: it is better to have one thread running very well than two threads running quite badly.

Before parallelizing your application, make sure that it already works efficiently in its current form. It makes no sense to parallelize an O(n³) algorithm when you could use one in O(n log n).
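To make the point concrete, here is a small sketch with a deliberately simple problem: detecting duplicates in a list. The gap is smaller than in the O(n³) case above (O(n²) against O(n log n)), and both function names are made up for the illustration.

```python
def has_duplicates_quadratic(values):
    # Compare every pair of elements: O(n^2) comparisons.
    n = len(values)
    for i in range(n):
        for j in range(i + 1, n):
            if values[i] == values[j]:
                return True
    return False

def has_duplicates_sorted(values):
    # Sort once (O(n log n)); equal values then sit next to each other.
    ordered = sorted(values)
    for left, right in zip(ordered, ordered[1:]):
        if left == right:
            return True
    return False
```

For 10,000 distinct values, the first version performs about 50 million comparisons, while n log₂ n is only about 133,000. Splitting the quadratic version across two cores would at best halve those 50 million; changing the algorithm removes most of them.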

That said, it does make sense to consider the parallelized algorithm IF you are using massively parallel hardware, which means more than 4 cores, GPUs and so on…

Okay, let's assume your application runs at 100% CPU. Yippee?

A second point is that the OS may "work better" when the system is not totally overloaded. If your game runs at 100%, the system starts stealing time from your CPU-greedy application. It is fair to say that OSes are designed to deal with such situations. On the other hand, they are designed to remain stable, but they make no promise of staying very responsive in such situations.

You can see it this way: if you leave a rendering task running in the background and browse the web, you will feel that your system is under heavy load: it does not feel great. It is much the same for your application: if you have 3 tasks taking 100% of the resources, the fourth one (let's say input/UI) might lack the resources to be as "responsive" as you would like. This is true not only for CPU resources but also for the bandwidth used on your bridges: if they are saturated, your system may lose responsiveness.

The right response to this problem would be: yes, right, we need properly defined scheduling so that input and UI receive top priority and we can afford to delay AI tasks a little. The problem is that not every scheduler works well for the type of application you are targeting.
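If you end up doing part of that scheduling yourself, inside the application, the idea can be sketched as a priority queue with a per-frame budget. Everything here is an assumption made for the illustration: the `FrameScheduler` class, the three priority levels, and the notion that each task announces an estimated cost in arbitrary units. It is not how any particular OS or engine scheduler works.

```python
import heapq
import itertools

# Lower number = higher priority. These levels are illustrative only.
INPUT, UI, AI = 0, 1, 2

class FrameScheduler:
    def __init__(self):
        self._queue = []
        self._order = itertools.count()  # tie-breaker: FIFO within one priority

    def submit(self, priority, cost, name):
        # cost is the task's estimated cost, in the same units as the budget.
        heapq.heappush(self._queue, (priority, next(self._order), cost, name))

    def run_frame(self, budget):
        """Run tasks in priority order until the next one no longer fits.

        Returns the names of the tasks that ran; the rest wait for a later frame.
        """
        done = []
        while self._queue and self._queue[0][2] <= budget:
            _priority, _seq, cost, name = heapq.heappop(self._queue)
            budget -= cost
            done.append(name)
        return done

if __name__ == "__main__":
    scheduler = FrameScheduler()
    scheduler.submit(AI, 5, "pathfinding")
    scheduler.submit(INPUT, 1, "poll_input")
    scheduler.submit(UI, 2, "draw_hud")
    print(scheduler.run_frame(4))   # input and UI run first
    print(scheduler.run_frame(10))  # the delayed AI task runs in a later frame
```

Input and UI always go first, and the AI task is simply pushed to a later frame when the budget is tight. Note that this sketch stops at the first task that does not fit, so one expensive high-priority task can hold back cheaper ones behind it.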

Unless otherwise stated, the content of this page is licensed under Creative Commons Attribution-ShareAlike 3.0 License