by DigitalTurk
4759 days ago
Very nice work! It seems like this is only fast on large files, though, because the text has to be copied from main memory to the GPU, which adds latency. I wonder what the latency would be like if this algorithm were run on the kind of unified memory architecture you see in e.g. the PS4 and Xbox One. Also, I don't quite follow why they're compiling the finite automata on the GPU. Their explanation, that they didn't want to copy the automaton node by node, sounds to me like there's a lot of room for optimization here. E.g. maybe the regular expressions could be compiled to OpenCL code. Then again, they also found that pattern matching is a memory-bound problem, so maybe emitting native code is pointless. Does anyone know of regular expression engines that emit native x86 code?
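To make the "compile the regex to code" idea a bit more concrete, here is a rough sketch of what I mean. It is not what the authors do, and it emits Python source rather than OpenCL, purely as a stand-in. The DFA for `ab*c` is written out by hand, and `dfa_to_source`, `TRANSITIONS` and `match` are names I made up for illustration. The point is only that the automaton ends up as branches in generated code instead of a node table that has to be copied and walked.

```python
def dfa_to_source(transitions, start, accepting, name="match"):
    """Emit Python source for a function that runs the given DFA.

    transitions: {state: {char: next_state}} (hypothetical layout)
    start: initial state, accepting: iterable of accepting states
    """
    lines = [
        f"def {name}(text):",
        f"    state = {start!r}",
        "    for ch in text:",
    ]
    for state, edges in sorted(transitions.items()):
        lines.append(f"        if state == {state!r}:")
        for ch, target in sorted(edges.items()):
            lines.append(f"            if ch == {ch!r}:")
            lines.append(f"                state = {target!r}")
            lines.append("                continue")
        # No edge for this character: the input is rejected.
        lines.append("            return False")
    # Any state without outgoing edges rejects further input.
    lines.append("        return False")
    lines.append(f"    return state in {tuple(sorted(accepting))!r}")
    return "\n".join(lines)


# Hand-built DFA for the anchored pattern ab*c (an assumption for the demo).
TRANSITIONS = {
    0: {"a": 1},
    1: {"b": 1, "c": 2},
}

source = dfa_to_source(TRANSITIONS, start=0, accepting={2})

# Turn the generated source into a callable.
namespace = {}
exec(source, namespace)
match = namespace["match"]
```

Calling `match("abbbc")` runs the generated branches directly. Whether this buys anything on a GPU is a separate question: if matching really is memory bound, replacing table lookups with branches may not help much.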