by janwas
676 days ago
I was thinking of a shorter avl producing partial results that get merged into another reg.
Something like a += b; a[0] += c[0]. Without avl we'd just have a write-after-write, but with it, we now have an additional input (the old destination), and whether this happens depends on global state (VL). Espasa discusses this around 6:45 of https://www.youtube.com/watch?v=WzID6kk8RNs. Agreed that agnostic would help, but the machine also has to handle SW asking for mask/tail undisturbed, right?
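To make the extra input concrete, here is a minimal Python sketch of a tail-undisturbed add. It is only a toy model, not how any real core works: `VLMAX`, `vadd_tu` and the register contents are made up for illustration, and it uses a slight variant of the example above in which the destination `d` is not one of the named sources, so the dependency can only come through the tail.

```python
VLMAX = 4  # hypothetical number of lanes per vector register


def vadd_tu(old_dest, x, y, vl):
    """Tail-undisturbed add: lanes below vl get x+y, lanes >= vl keep old_dest."""
    return [x[i] + y[i] if i < vl else old_dest[i] for i in range(VLMAX)]


a = [1, 2, 3, 4]
b = [10, 20, 30, 40]
c = [100, 200, 300, 400]
d0 = [0, 0, 0, 0]

# d = a + b over the full register: the old d is fully overwritten,
# so a renamer could treat this as a plain write-after-write.
d1 = vadd_tu(d0, a, b, vl=VLMAX)

# d[0] = a[0] + c[0] with a shorter vl: lanes 1..3 are copied from d1,
# so d1 is a real input even though it is not a named source operand.
d2 = vadd_tu(d1, a, c, vl=1)
```

In this model the second instruction reads `d1` only because `vl < VLMAX`, which is the "depends on global state" part.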
|
Yes, but it should rarely do so.
The problem is that because of the vl=0 case you always have a dependency on avl. I think the motivation for the vl=0 case was that any serious OoO implementation will need to predict vl/vtype anyway, so there might as well be this nice-to-have feature.
IMO they should've only supported ta,mu. I think the only use case for ma is when you need to avoid exceptions. And while tu is useful, e.g. for summing an array, it could be handled differently. E.g. once vl<vlmax, you write the sum to a different vector and do two reductions (or rather to two different vectors, given the avl-to-vl rules).
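A rough Python sketch of what I mean, under stated assumptions: `VLMAX`, `POISON`, `setvl`, `vadd`, `vredsum` and `load` are made-up helpers that only mimic the semantics, and `setvl` uses the simplest legal rule (vl = min(avl, VLMAX)), so there is at most one short iteration and one extra vector is enough. With the full avl-to-vl rules you could see two short iterations and would need two such vectors.

```python
VLMAX = 4    # hypothetical number of lanes per vector register
POISON = -1  # stands in for the all-ones pattern a tail-agnostic write may leave


def setvl(avl):
    # Assumption: simplest legal choice, vl = min(avl, VLMAX).
    return min(avl, VLMAX)


def vadd(dest, x, y, vl, tail_undisturbed):
    """Add the first vl lanes; the tail keeps dest (tu) or is poisoned (ta)."""
    out = []
    for i in range(VLMAX):
        if i < vl:
            out.append(x[i] + y[i])
        elif tail_undisturbed:
            out.append(dest[i])
        else:
            out.append(POISON)
    return out


def vredsum(v, vl):
    # Sum reduction over the first vl lanes only.
    return sum(v[:vl])


def load(data, pos, vl):
    # Load vl elements; lanes past vl are padding the code never relies on.
    return data[pos:pos + vl] + [0] * (VLMAX - vl)


def sum_tu(data):
    """Usual strip-mined sum relying on tail-undisturbed accumulation."""
    acc = [0] * VLMAX
    pos = 0
    while pos < len(data):
        vl = setvl(len(data) - pos)
        acc = vadd(acc, acc, load(data, pos, vl), vl, tail_undisturbed=True)
        pos += vl
    return vredsum(acc, VLMAX)


def sum_ta(data):
    """Same sum with tail-agnostic only: short iterations go to a second vector."""
    acc = [0] * VLMAX
    part = [0] * VLMAX
    part_vl = 0
    pos = 0
    while pos < len(data):
        vl = setvl(len(data) - pos)
        chunk = load(data, pos, vl)
        if vl == VLMAX:
            acc = vadd(acc, acc, chunk, vl, tail_undisturbed=False)
        else:
            part = vadd(part, part, chunk, vl, tail_undisturbed=False)
            part_vl = vl
        pos += vl
    # Two reductions: full accumulator over VLMAX, short vector over its own vl.
    return vredsum(acc, VLMAX) + vredsum(part, part_vl)
```

The cost is one extra vector register and one extra reduction at the end of the loop; in exchange no instruction in the loop needs its old destination as an input.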