(The inference costs are cheaper for them now as context grows because of the Sparse attention mechanism)