Cache problem disappears then reappears
After sitting on it for a week, the cache problem "disappeared".
Starting operateAnnhilateNaive...
Stopped operateAnnhilateNaive 5289ms elapsed
Starting operateAnnhilate...
Stopped operateAnnhilate 3719ms elapsed
Starting operateAnnihilateFast...
Stopped operateAnnihilateFast 3494ms elapsed
Starting operateAnnihilateFastest...
Stopped operateAnnihilateFastest 2766ms elapsed
Groan. Talk about determinism and digital computers... The problem vanished at the exact moment some kind souls from the CHUD team clued me in on how to measure instruction cache performance with Shark on the 970.
Uh-oh, wait, now the problem is back. Hurrah.
Starting operateAnnhilateNaive...
Stopped operateAnnhilateNaive 5189ms elapsed
Starting operateAnnhilate...
Stopped operateAnnhilate 3641ms elapsed
Starting operateAnnihilateFast...
Stopped operateAnnihilateFast 4198ms elapsed
Starting operateAnnihilateFastest...
Stopped operateAnnihilateFastest 2617ms elapsed
The result of sharking is that operateAnnihilateFast runs slower because the data cache is less effective in this case, with the exact same data and the exact same access sequence, no less. The instruction cache is out of the equation: first because Shark says so, and second because reordering the methods didn't make any difference. My current favorite culprit is the context switch.
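
To put a number on how much the two runs quoted above differ, here is a small Python sketch that computes the percent change per method. The timings are copied from the logs in this post; the helper name `percent_change` and the 10% noise threshold are my own assumptions for illustration, not something Shark or the benchmark reports.

```python
# Timings in milliseconds, copied from the two runs quoted in this post.
first_run = {
    "operateAnnhilateNaive": 5289,
    "operateAnnhilate": 3719,
    "operateAnnihilateFast": 3494,
    "operateAnnihilateFastest": 2766,
}
second_run = {
    "operateAnnhilateNaive": 5189,
    "operateAnnhilate": 3641,
    "operateAnnihilateFast": 4198,
    "operateAnnihilateFastest": 2617,
}


def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return 100.0 * (after - before) / before


changes = {
    name: percent_change(first_run[name], second_run[name])
    for name in first_run
}

# Assumption: anything beyond 10% counts as more than run-to-run noise.
NOISE_THRESHOLD = 10.0
outliers = [
    name for name, delta in changes.items() if abs(delta) > NOISE_THRESHOLD
]

for name, delta in changes.items():
    print(f"{name}: {delta:+.1f}%")
print("outside noise threshold:", outliers)
```

Under that assumed threshold, only operateAnnihilateFast stands out, at roughly 20% slower in the second run; the other three methods move by a few percent and in the opposite direction.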