If you have the capability to measure state, and you have control over audio/visual input, then you necessarily have the ability, probably after an alarmingly small number of experiments, to induce (some value of) state.
Lacking an understanding of the mechanism, I'm not sure how that necessarily follows regardless of the mechanism. Completely pulling something out of my ass: imagine they determine that inserting occasional white frames produces a larger increase in pupil dilation when someone is in a heightened emotional state. Sounds doable and vaguely plausible? That would let them infer a heightened emotional state from the measured reaction to a quickly flashed visual that most users likely wouldn't even perceive. But I'm missing how it would follow that you could use the same thing to cause that emotional state.