Hacker News
by swatcoder 983 days ago
OP — If this comp_nearest is still a hot path for you or if you want to generate more articles, consider testing:

1. using `restrict` to tell the compiler that src and dest are guaranteed not to overlap

2. converting your two increment-and-test blocks to add+mod, to allow for uninterrupted pipelining

Neither might make a difference, but either could.
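A rough sketch of what both suggestions might look like together. This is not the article's comp_nearest; `tile_copy`, its parameters, and the tiling behaviour are made up for illustration, assuming the two increment-and-test blocks are wrap-around counters for the source x and y:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for comp_nearest: tiles a small source image
   across a larger destination. `restrict` promises the compiler that
   dest and src never overlap. */
void tile_copy(uint32_t *restrict dest, size_t dw, size_t dh,
               const uint32_t *restrict src, size_t sw, size_t sh)
{
    size_t sy = 0;
    for (size_t y = 0; y < dh; y++) {
        size_t sx = 0;
        for (size_t x = 0; x < dw; x++) {
            dest[y * dw + x] = src[sy * sw + sx];
            sx = (sx + 1) % sw;   /* replaces: if (++sx == sw) sx = 0; */
        }
        sy = (sy + 1) % sh;       /* replaces: if (++sy == sh) sy = 0; */
    }
}
```

Whether the modulo actually beats a well-predicted branch depends on the target, so it is worth measuring both.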

1 comment

Addressing the aliasing concern would be the easiest improvement. I observed in the assembly that the source pixel is re-read from memory each of the four times it is used, which `restrict` should let the compiler avoid.
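For illustration, a minimal sketch of the aliasing fix. The `pixel` layout, `blend8`, and `comp_row` are assumptions, not the real code, and the blend is deliberately simplified. Copying the source pixel into a local means the four uses come from one load, and `restrict` tells the compiler that stores to dest cannot change src:

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed pixel layout: 8-bit non-premultiplied RGBA. */
typedef struct { uint8_t r, g, b, a; } pixel;

/* Blend one channel: s weighted by alpha a, d by (255 - a), rounded. */
static inline uint8_t blend8(uint8_t s, uint8_t d, uint8_t a)
{
    return (uint8_t)((s * a + d * (255 - a) + 127) / 255);
}

/* Hypothetical row compositor (simplified blend, for illustration only). */
void comp_row(pixel *restrict dest, const pixel *restrict src, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        const pixel s = src[i];   /* single load of the source pixel */
        pixel *d = &dest[i];
        d->r = blend8(s.r, d->r, s.a);
        d->g = blend8(s.g, d->g, s.a);
        d->b = blend8(s.b, d->b, s.a);
        d->a = blend8(255, d->a, s.a);
    }
}
```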

Writing an optimal composite function is of course not really the goal, nor would it have much educational or entertainment value. For any additional speed, I already have a function that slices compositing tasks into chunks and puts them on a thread pool.
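The slicing step could look roughly like the sketch below. Every name here is hypothetical (`row_chunk`, `submit_fn`, `slice_rows`, `run_inline`), and the pool itself is left out: `submit` stands in for whatever enqueues a job.

```c
#include <stddef.h>

/* A band of rows, half-open: [row_begin, row_end). */
typedef struct { size_t row_begin, row_end; } row_chunk;

/* Hypothetical hook that hands one chunk to a thread pool. */
typedef void (*submit_fn)(row_chunk chunk, void *ctx);

/* Split [0, rows) into at most nchunks nearly equal bands and submit
   each one. Returns the number of chunks submitted. */
size_t slice_rows(size_t rows, size_t nchunks, submit_fn submit, void *ctx)
{
    if (rows == 0 || nchunks == 0) return 0;
    if (nchunks > rows) nchunks = rows;      /* never emit empty chunks */
    size_t base = rows / nchunks, extra = rows % nchunks;
    size_t begin = 0;
    for (size_t i = 0; i < nchunks; i++) {
        size_t len = base + (i < extra ? 1 : 0);  /* spread the remainder */
        row_chunk c = { begin, begin + len };
        submit(c, ctx);
        begin += len;
    }
    return nchunks;
}

/* Example hook: handles the chunk inline and tallies its rows into the
   size_t that ctx points at. A real pool would enqueue the chunk. */
static void run_inline(row_chunk c, void *ctx)
{
    *(size_t *)ctx += c.row_end - c.row_begin;
}
```

Splitting by rows keeps each chunk's writes in a disjoint region of dest, so the workers need no locking between them.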