Since you mention VAEs: you might be interested in Active Inference, which is built on minimizing variational free energy. It has the same E−TS form (energy minus temperature-weighted entropy) as the Helmholtz free energy in thermodynamics, so it does fit your thermo/info/NN question. And the negative variational free energy is exactly the evidence lower bound (ELBO) that generative models like VAEs maximize. Lots of really fascinating connections between all of these topics :)
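If it helps to see the free energy / ELBO link concretely, here is a tiny numerical sketch. Everything in it is made up for illustration: a toy model with two hidden states and one fixed observation, hypothetical probabilities, and temperature set to 1 so that "E−TS" is just expected energy minus entropy. It is not taken from any Active Inference library.

```python
import numpy as np

# Hypothetical toy model (all numbers are illustrative assumptions):
# two hidden states s, and a single observation o that has already been seen.
prior = np.array([0.7, 0.3])        # p(s)
likelihood = np.array([0.2, 0.9])   # p(o | s) for the observed o

joint = prior * likelihood          # p(o, s)
evidence = joint.sum()              # p(o)
posterior = joint / evidence        # exact p(s | o)


def free_energy(q):
    """Variational free energy F[q] = E_q[-ln p(o, s)] - H[q] (with T = 1).

    q must have strictly positive entries so the logs are finite.
    """
    energy = -(q * np.log(joint)).sum()   # expected "energy" under q
    entropy = -(q * np.log(q)).sum()      # Shannon entropy of q
    return energy - entropy


def elbo(q):
    """Evidence lower bound in its usual VAE form:
    E_q[ln p(o | s)] - KL(q || p(s))."""
    reconstruction = (q * np.log(likelihood)).sum()
    kl_to_prior = (q * np.log(q / prior)).sum()
    return reconstruction - kl_to_prior


q_guess = np.array([0.5, 0.5])      # an arbitrary approximate posterior
F_guess = free_energy(q_guess)
F_exact = free_energy(posterior)
surprise = -np.log(evidence)        # -ln p(o)

print("F[q_guess]      =", F_guess)
print("-ELBO[q_guess]  =", -elbo(q_guess))
print("F[posterior]    =", F_exact)
print("-ln p(o)        =", surprise)
```

The first two printed values should agree (F is the negative ELBO), and the last two should agree as well: free energy is an upper bound on surprise, -ln p(o), and the bound is tight when q is the exact posterior. Any other q gives a strictly larger F, which is why minimizing it pushes q toward the posterior.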
Thank you! Parr, Pezzulo & Friston looks like the kind of book I'm after - love cross-disciplinary works. After all, knowledge and nature are continuous. It's humans who like to chop them up into subfields :)