by raw_anon_1111
63 days ago
Can’t speak for search engines specifically, but I recently did a project that required me to crawl a customer’s large site and index it into a vector search for RAG for a call center. My first attempt was to crawl it with plain GET requests (i.e., the same thing as using curl). That got me nowhere; I had to use headless Chrome and Playwright. Do any modern websites work with just curl, i.e., without being able to run JS, even when they don’t block it?
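
For anyone wondering what that fallback looks like in practice, here is a rough sketch, not the code from my project. It tries a plain GET first and only launches headless Chromium through Playwright when the static HTML has almost no visible text. The function names, the 200-character threshold, and the User-Agent string are all made up for illustration, and `fetch_rendered` assumes the `playwright` package and its Chromium build are installed.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen


class _TextExtractor(HTMLParser):
    """Collects visible text, skipping <script>, <style>, and <noscript> bodies."""

    SKIP = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def visible_text(html: str) -> str:
    """Return the text a reader would see without running any JavaScript."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def looks_js_rendered(html: str, min_chars: int = 200) -> bool:
    """Heuristic (an assumption, tune per site): very little static text
    suggests the page is an empty shell that JavaScript fills in later."""
    return len(visible_text(html)) < min_chars


def fetch_static(url: str, timeout: float = 10.0) -> str:
    """Plain GET, roughly what curl would return."""
    req = Request(url, headers={"User-Agent": "Mozilla/5.0 (compatible; example-crawler)"})
    with urlopen(req, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def fetch_rendered(url: str) -> str:
    """Load the page in headless Chromium and return the DOM after scripts run."""
    from playwright.sync_api import sync_playwright  # imported lazily; optional dependency

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.goto(url, wait_until="networkidle")
            return page.content()
        finally:
            browser.close()


def fetch_page(url: str) -> str:
    """Try the cheap path first; fall back to a real browser only when needed."""
    html = fetch_static(url)
    if looks_js_rendered(html):
        html = fetch_rendered(url)
    return html
```

Calling `fetch_page(url)` on each crawled URL keeps the browser out of the loop for pages that are already server-rendered, which matters on a large site because the Playwright path is far slower than a GET. The text-length check is crude; a site that ships a big static header and footer around a JS-rendered body would slip past it.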