You mix the dialog louder for the entire piece and keep every other sound in the middle: no extreme highs, no extreme lows. General compression, with dialog-forward choices.
I don't know why compression isn't built in to consumer media devices; it's so often called for (closely followed by volume normalisation ... but I guess the advertisers veto that).
> I don't know why compression isn't built in to consumer media devices
As always, it depends on the device. Dynamic range compression seems to be a relatively common feature, usually offered as an option with an (inaccurate) label such as "Reduce Loud Sounds", as on the Apple TV.
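For anyone unsure what a dynamic range compression option does in principle, here is a minimal sketch of a static compressor curve. It is not how any particular device implements it: the function names, the -20 dB threshold, and the 4:1 ratio are all made up for illustration, and a real compressor would smooth the gain over time with attack and release rather than working sample by sample.

```python
import math

def compress_db(level_db, threshold_db=-20.0, ratio=4.0, makeup_db=0.0):
    """Static compressor curve in dB (illustrative parameters, not a real device's)."""
    if level_db > threshold_db:
        # Anything above the threshold only rises 1 dB per `ratio` dB of input.
        level_db = threshold_db + (level_db - threshold_db) / ratio
    # Makeup gain shifts everything up, which is what lifts the quiet parts.
    return level_db + makeup_db

def compress_samples(samples, threshold_db=-20.0, ratio=4.0, makeup_db=0.0):
    """Apply the curve to each sample (range -1.0..1.0), with no attack/release."""
    out = []
    for x in samples:
        if x == 0.0:
            out.append(0.0)  # silence has no defined dB level; pass it through
            continue
        level_db = 20.0 * math.log10(abs(x))
        target_db = compress_db(level_db, threshold_db, ratio, makeup_db)
        gain = 10.0 ** ((target_db - level_db) / 20.0)
        out.append(x * gain)
    return out
```

With these assumed settings, a full-scale peak at 0 dB comes out at -15 dB while a quiet sound at -40 dB is left alone. Add makeup gain and the quiet sound comes up too, so the range narrows from both ends.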
Because at other times, when it's not dialogue (or even when it is, as with a busy or crowded street effect), you may not want the center channel gain raised during mixdown.
Consumer hardware can only guess. A sound engineer can know.