by JSavageOne
948 days ago
> "One way we explored approaching this was using puppeteer to automate opening websites in a web browser, taking a screenshot of the site, and traversing the HTML to find the img tags.

> We then used the location of the images as the output data and the screenshot of the webpage as the input data. And now we have exactly what we need: a source image and coordinates of where all the sub-images are to train this AI model."

I don't quite understand this part. How does this lead to a model that can generate code from a UI?
In this case, if you look two images up, you will see an e-commerce image with many images composited into one image/layer. How will their system automatically decide whether those should all be separate images/layers or one composited image? To do so, they trained a model that examines web pages and their <img> tags and sees where the images are located. Basically, they are assuming that their data reflects good decisions, so the model can learn in which cases people use multiple images versus one.
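To make the data setup concrete, here is a rough sketch of how one (screenshot, image locations) pair might be assembled. It assumes the bounding boxes were already collected in the browser (for example with something like `getBoundingClientRect` on each img element); the names `Box` and `make_training_example` are made up for illustration and are not from the article.

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Pixel rectangle of one <img> element on the rendered page (hypothetical type)."""
    x: float
    y: float
    width: float
    height: float


def make_training_example(screenshot_path, page_width, page_height, img_boxes):
    """Pair a screenshot (model input) with normalized image boxes (model target).

    Boxes are clipped to the screenshot bounds and scaled to the 0..1 range so
    that pages of different sizes produce comparable labels.
    """
    labels = []
    for box in img_boxes:
        # Clip the rectangle to the visible screenshot area.
        left = max(0.0, box.x)
        top = max(0.0, box.y)
        right = min(page_width, box.x + box.width)
        bottom = min(page_height, box.y + box.height)
        # Skip images that fall entirely outside the screenshot.
        if right <= left or bottom <= top:
            continue
        labels.append((
            left / page_width,
            top / page_height,
            right / page_width,
            bottom / page_height,
        ))
    return {"input": screenshot_path, "target": labels}


example = make_training_example(
    "shot.png",  # placeholder file name
    1000,
    2000,
    [Box(100, 200, 300, 400), Box(-50, 0, 20, 20), Box(900, 1900, 200, 200)],
)
```

A detector trained on many such pairs would only learn where people tend to draw image boundaries, which is the "multiple images vs one" decision described above; it would not generate code by itself.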
I could be misunderstanding :)