Lemmy newb here, not sure if this is right for this /c.

I found an article by someone who hosts their own website and micro-social network. It covers their experience with web-scraping robots that refuse to respect robots.txt, and how they deal with them.

  • Stubb@lemmy.sdf.org
    2 days ago

    I’ve found that many of these solutions/hacks also block legitimate traffic, such as people using Tor Browser and the Internet Archive’s scrapers. That may be a dealbreaker for some, but it may be acceptable for most users and website owners.