by weego 2818 days ago
This looks all wrong. Page scraping doesn't access a data source in a way that makes a query language sensible. The moment you need to interact with the page and admit there's a DOM under there, the idiom breaks.

I also don't know why it reinvents variable declaration, or why the for loop is so verbose. If you insist on a query language, go the whole way and remove the repetition and syntactic complexity, because that's the only thing that could actually add value.


The DOM is a representation of some data, which means you can extract that data and then manipulate it. The language itself has nothing DOM-specific in it. All DOM operations are implemented via functions from the standard library.

"Good artists copy, great artists steal." I'm trying to be a good artist and not invent a brand-new language (I'm not that smart), so I just picked (copied) an existing one that is well suited to dealing with complex structures like trees: AQL, the ArangoDB Query Language. https://docs.arangodb.com/3.3/Manual/

If you have any suggestions on how to improve the language, you are very welcome to share them.

How about using an existing language, like Python? You can make a really great DSL using Python, and then people have access to all the other Python language features that they already know, and the stdlib that they already know, and the 3rd-party modules they already know.
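To make that concrete, here's a rough sketch of the kind of thing I mean, using only the stdlib `html.parser`. The names `_Collector` and `select` are made up for illustration (they are not from any existing library), and the page is an inline string rather than a real fetch, so treat it as an idea and not a finished tool:

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not hold text.
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class _Collector(HTMLParser):
    """Hypothetical helper: records every element with its attrs and direct text."""

    def __init__(self):
        super().__init__()
        self.elements = []  # every element seen, in document order
        self._stack = []    # currently open elements

    def handle_starttag(self, tag, attrs):
        node = {"tag": tag, "attrs": dict(attrs), "text": ""}
        self.elements.append(node)
        if tag not in VOID_TAGS:
            self._stack.append(node)

    def handle_endtag(self, tag):
        # Close the nearest matching open element; ignore stray end tags.
        for i in range(len(self._stack) - 1, -1, -1):
            if self._stack[i]["tag"] == tag:
                del self._stack[i:]
                break

    def handle_data(self, data):
        # Attach text to the innermost open element.
        if self._stack:
            self._stack[-1]["text"] += data


def select(html, tag):
    """Return all elements with the given tag name (hypothetical DSL entry point)."""
    parser = _Collector()
    parser.feed(html)
    parser.close()
    return [e for e in parser.elements if e["tag"] == tag]


page = '<ul><li><a href="/a">First</a></li><li><a href="/b">Second</a></li></ul>'

# The "query" is just a plain list comprehension: loop, filter, project.
links = [
    {"title": a["text"].strip(), "url": a["attrs"].get("href")}
    for a in select(page, "a")
]
```

The loop, filter, and projection parts come for free from the language, so the only new thing to learn is `select`. It obviously doesn't cover dynamic pages or the isolated environment.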
I could, if I knew Python pretty well :) But I've done it in the way I needed it to be done. I wanted to have an isolated and safe environment that would allow me to easily scrape the web without dealing with infrastructural code.
Yup, I get it, I want that too, but I don't want to learn another language just to do that :/