by gpderetta 80 days ago

Partial specialization specifically. Match some patterns and convert them to something else. For example:

  #include <cmath>  // std::fma

  struct F { double x; };
  enum Op { Add, Mul };
  auto eval(F x) { return x.x; }
  template<class L, class R, Op op> struct Expr;
  // Generic rules: evaluate both sides, then combine.
  template<class L, class R> struct Expr<L, R, Add> { L l; R r;
    friend auto eval(Expr self) { return eval(self.l) + eval(self.r); } };
  template<class L, class R> struct Expr<L, R, Mul> { L l; R r;
    friend auto eval(Expr self) { return eval(self.l) * eval(self.r); } };
  // More specialized rule: (a * b) + c becomes a single fma.
  template<class L, class R, class R2> struct Expr<Expr<L, R, Mul>, R2, Add> { Expr<L, R, Mul> l; R2 r;
    friend auto eval(Expr self) { return std::fma(eval(self.l.l), eval(self.l.r), eval(self.r)); } };
  // The operators only build the expression type; nothing is computed until eval().
  template<class L, class R>
  auto operator +(L l, R r) { return Expr<L, R, Add>{l, r}; }
  template<class L, class R>
  auto operator *(L l, R r) { return Expr<L, R, Mul>{l, r}; }

  double optimized(F x, F y, F z) { return eval(x * y + z); }
  double non_optimized(F x, F y, F z) { return eval(x + y * z); }

optimized() always generates a call to fma; non_optimized() does not. Use -O1 to see the difference (it will inline trivial functions, but will not do other optimizations). -O0 also generates the fma call, but it is lost in the noise.

The magic happens by specifically matching the pattern Expr<Expr<L, R, Mul>, R2, Add>; try to add a rule to optimize x+y*z as well.
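One way to see which expressions hit that pattern, without reading assembly, is to check the expression types at compile time. This is only a sketch: is_mul_add is a hypothetical helper trait that is not part of the code above, and Expr is reduced here to a plain aggregate with no eval() so the example stays short.

```cpp
#include <type_traits>

struct F { double x; };
enum Op { Add, Mul };

// Simplified stand-in for Expr: it only records the shape of the expression.
template<class L, class R, Op op> struct Expr { L l; R r; };

template<class L, class R>
auto operator+(L l, R r) { return Expr<L, R, Add>{l, r}; }
template<class L, class R>
auto operator*(L l, R r) { return Expr<L, R, Mul>{l, r}; }

// Hypothetical trait: true only for the "multiply on the left, then add" shape.
template<class T> struct is_mul_add : std::false_type {};
template<class L, class R, class R2>
struct is_mul_add<Expr<Expr<L, R, Mul>, R2, Add>> : std::true_type {};

// x * y + z builds exactly the type that the fma specialization matches...
static_assert(std::is_same<decltype(F{} * F{} + F{}),
                           Expr<Expr<F, F, Mul>, F, Add>>::value,
              "x * y + z has the (mul) + rhs shape");
static_assert(is_mul_add<decltype(F{} * F{} + F{})>::value,
              "x * y + z matches the pattern");

// ...while x + y * z puts the Mul node on the right, so the pattern does not match.
static_assert(!is_mul_add<decltype(F{} + F{} * F{})>::value,
              "x + y * z does not match the pattern");
```

The trait uses the same partial-specialization pattern as the third Expr definition, so it is true for exactly the types that would take the fma path.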

1 comment

aw1621107 79 days ago

Hrm, OK, that makes sense. Thanks for taking the time to explain! Guessing optimizing x+y*z would entail something similar to the third eval() definition but with Expr<L, Expr<L2, R2, Mul>, Add> instead.
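A sketch of that guess, assuming the definitions from the parent comment (repeated here so the block compiles on its own). One wrinkle: once both the left-hand and right-hand Mul rules exist, x*y + z*w matches both and neither is more specialized than the other, so the sketch adds a third rule as a tie-breaker. The tie-breaker and self_check() are additions for illustration, not something from the thread.

```cpp
#include <cassert>
#include <cmath>

struct F { double x; };
enum Op { Add, Mul };
inline double eval(F f) { return f.x; }

template<class L, class R, Op op> struct Expr;

// Generic rules.
template<class L, class R> struct Expr<L, R, Add> {
    L l; R r;
    friend auto eval(Expr self) { return eval(self.l) + eval(self.r); }
};
template<class L, class R> struct Expr<L, R, Mul> {
    L l; R r;
    friend auto eval(Expr self) { return eval(self.l) * eval(self.r); }
};

// Rule from the parent comment: (a * b) + c
template<class L, class R, class R2>
struct Expr<Expr<L, R, Mul>, R2, Add> {
    Expr<L, R, Mul> l; R2 r;
    friend auto eval(Expr self) {
        return std::fma(eval(self.l.l), eval(self.l.r), eval(self.r));
    }
};

// New rule: a + (b * c), with the Mul node on the right.
template<class L, class L2, class R2>
struct Expr<L, Expr<L2, R2, Mul>, Add> {
    L l; Expr<L2, R2, Mul> r;
    friend auto eval(Expr self) {
        return std::fma(eval(self.r.l), eval(self.r.r), eval(self.l));
    }
};

// Tie-breaker (an addition for this sketch): (a * b) + (c * d) matches both
// rules above, which would be ambiguous without a more specialized one.
template<class L, class R, class L2, class R2>
struct Expr<Expr<L, R, Mul>, Expr<L2, R2, Mul>, Add> {
    Expr<L, R, Mul> l; Expr<L2, R2, Mul> r;
    friend auto eval(Expr self) {
        return std::fma(eval(self.l.l), eval(self.l.r), eval(self.r));
    }
};

template<class L, class R>
auto operator+(L l, R r) { return Expr<L, R, Add>{l, r}; }
template<class L, class R>
auto operator*(L l, R r) { return Expr<L, R, Mul>{l, r}; }

// Small self-check; the inputs are small integers, so exact comparison is safe.
inline void self_check() {
    F x{2.0}, y{3.0}, z{4.0};
    assert(eval(x * y + z) == 10.0);      // (a * b) + c rule
    assert(eval(x + y * z) == 14.0);      // a + (b * c) rule
    assert(eval(x * y + z * x) == 14.0);  // tie-breaker rule
}
```

With these rules, both optimized() and non_optimized() from the parent comment should end up calling fma.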

I think at this point I can see how my initial assertion was wrong - specialization isn't fully orthogonal to expression templates, as the former is needed for some of the latter's use cases.

Does make me wonder how far one could get with rustc's internal specialization attributes...
