The majority of a program's runtime is usually spent in only a tiny section of its code. That is where optimization pays off. If it helps to separate out that code and compile it with different compiler switches, the additional maintenance burden on the program structure and build system might be acceptable.
Go look at profiles for programs that have been written with performance in mind. Operating systems, databases, game engines, web servers, some compilers, and video/audio/3D editing packages come to mind. I 100% guarantee these programs do not spend the majority of their runtime in a tiny section of code. What you said is nearly universally untrue, at least for programs that care about real performance.
That's not a useful description of desktop "creative" software. Even though it might be true for audio that, in many cases, the majority of the runtime is spent handling the "process callback" from the audio subsystem, once the user starts actually working on things, the slow parts of the code (and the ones that impede the user or degrade their experience) are far removed from that core. This is a little less true of visual applications (video, drawing, image editing, etc.), but I would imagine that similar considerations apply there too.