Simon, sorry I didn't get around to answering your question about post-T5 encoder-decoders from the Markdown Lethal Trifecta prompt injection post (https://news.ycombinator.com/item?id=45724941).
After plain decoder-only models stole the show, Google DeepMind demonstrated a way to adapt existing LLMs: adding a T5-style encoder to a normal Gemma model to get the benefits of more grounded text-to-text tasks WITHOUT instruction tuning (and the increased risk of prompt injection that comes with it).
They also shared a few different variants on Hugging Face. I haven't got around to fine-tuning one for summarisation yet, but it could well be a route to more reliable summarisation.
I did try out some of the models for inference, though, and put together a Gist here, which may be useful since I found the default HF code example a bit broken: