| Talked with a founder building something in this space that was inspired by Matt's post in October: lynq.ai (no affiliation, I just know Paul). I think when you first look at this problem, you say: that's a bad idea. Why would you take structured data, output it to a non-structured format, and then have ML parse that? Lots of wasted CPU cycles all around. However, when you think about the complex dynamics of standards around documents (decades of digital formats, hundreds of standards, lack of adherence to those standards, proprietary formats, hundreds of years of print and legal documents), the argument is akin to the one for self-driving cars. The state we have today around data & dashboards is a hugely emergent & dynamic system, just like our road transport infrastructure. We are closer to a machine being able to navigate the same way a human can than we are to one simple standard (or one set of standards) that works the way a machine would want to consume it. Screenshots as a universal API simply meets the world where it is vs. assuming the world is going to change towards something simpler and more elegant. I think part of the problem with how this comes across at first glance is how it's framed. "Screenshots" as an API evokes some dirty feelings for most of us in tech because the format is so unstructured. If you instead think about the idea of building something once that both a human and a machine consume from the same target (the UI), this makes a lot more sense in many ways, even if it feels like there's an expensive level of indirection in there. Excited to see what'll happen here over time. |