Is there a simple way to severely impede web scraping and LLM data collection of my website?

@Maroon · edit-2 6 months ago

Is there a simple way to severely impede web scraping and LLM data collection of my website?

@corroded · 7 months ago

I probably should have specified that I’m using libcurl, but I did try the equivalent of what you suggested. I even tried setting up a list of user agents and cycling through them. None of them worked. A lot of anti-scraping methods use much more complex schemes than just validating the user agent. In some cases, even a headless browser will be blocked.
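For anyone following along, here is roughly what I mean by cycling user agents. This is a minimal sketch in Python rather than my actual libcurl code (in libcurl the same header is set with `CURLOPT_USERAGENT`). The user-agent strings, the URLs, and the `build_requests` helper are all made up for illustration, and nothing is actually sent over the network.

```python
import itertools
import urllib.request

# Hypothetical user-agent strings, for illustration only.
USER_AGENTS = [
    "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0",
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ExampleBrowser/2.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15) ExampleBrowser/3.0",
]


def build_requests(urls, user_agents=USER_AGENTS):
    """Pair each URL with the next user agent in a repeating cycle.

    Only builds Request objects; it does not open any connection.
    """
    agents = itertools.cycle(user_agents)
    return [
        urllib.request.Request(url, headers={"User-Agent": next(agents)})
        for url in urls
    ]


if __name__ == "__main__":
    # Placeholder URLs; swap in whatever you are testing against.
    for req in build_requests(["https://example.com/a", "https://example.com/b"]):
        # urllib stores header names capitalized as "User-agent".
        print(req.full_url, "->", req.get_header("User-agent"))
```

Rotating the header like this only changes one thing about the request, which fits with what I saw: it makes no difference if the site is checking anything beyond the user agent.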