by polygamous_bat 849 days ago
What is the video resolution, 64x64? And even then it becomes blurry. Seems like another Google flag-plant-y paper filled with hot air, one whose source code and model we will never see because releasing them would expose how poor its capabilities are relative to competitors.

The internal politics at these places must be exhausting. Industry research was supposed to be free from the publish-or-perish mindset, but it seems like that just got replaced by a different kind of need for posturing.


Hey, author here :) First, tough crowd, love it. It's always great to get feedback because we are actively working on improving the model. We are very happy to admit it is not perfect, but given that not many people thought this was possible a year ago, I am quite excited to see the next step of improvement. This is like the GPT-1 of foundation world models, and we have a fair few ideas in the works to speed up progress.

The resolution is 90p but we use an upsampler to make it 360p for examples on the website.

How can I get started with this kind of research? Is it even possible without a PhD? Thanks.

If we did a good job then the paper should be written in a way that is digestible. When you don't understand something, follow the references to learn more (and there are probably videos covering most of the components we use).

In the Appendix we have a case study that should be possible to re-implement and run with a single GPU/TPU. We are hoping the community can build on that and innovate. If you take these steps and get stuck, feel free to get in touch!