Hacker News new | ask | show | jobs
by lostmsu 590 days ago
What's the perf like?
1 comments

Sorry, I have not benchmarked against cuBLAS or Eigen or similar, I did that thing for ML inference.

I have implemented a profiler on top of D3D11_QUERY_TIMESTAMP and D3D11_QUERY_TIMESTAMP_DISJOINT queries, and tweaked the compute shader to minimize the time reported by these queries for my specific use case.