Edit:
So it looks like people have taken this post to be me calling to harass the dev, which was completely and utterly uncalled for.
Don't harass people.
Looking further into the project and what it was doing and how it showed up in my system, I'd rephrase my opinion here as "problematic Fedi client" rather than scraper. So I'd like to apologise for that.
Original message:
Looking like we have a new Fedi scraper at contentnation.net, if you're interested in not being a part of that
@aurynn Admin/Owner of ContentNation.net here. It's not a scraper, just a feature-incomplete ActivityPub service. It only fetches data if someone uses the web viewer with a known profile name or AP id. If it can fetch "private" data, something is wrong on the server that provides it. It does not do HTML scraping. So just what ActivityPub was designed for. Robots are excluded by robots.txt and nofollow, but some just don't care. So that might be a reason why it seems scraped.
@sash it’s fetching unlisted and does not respect post delete. It’s a scraper.
@aurynn How can it fetch unlisted data if the server does not list it? If it does list it, it's a server bug or it is listed. And it should honor post deletes, if not then it's a bug.
@sash okay so your scraper takes anything that’s public and shoves it up on your domain to talk about monetisation, and you get defensive about admins like me correctly calling it a scraper and telling each other about it
Instead of going “hmm maybe I should talk to admins before I launch a scraper”
@aurynn it does federation, which is the intent of the fediverse, by means of ActivityPub. And no, there is no monetization on the site. Btw, every Mastodon instance does the same. Pick any one and you can see all public posts of any local and remote user, exactly like my site. So Mastodon is evil, too?
@sash my dude, it’s a bit late for this. Getting defensive about your scraper in my mentions isn’t going to have the result you want.
@aurynn Sorry, I don't get your point. It's not a scraper, just an ActivityPub client like Mastodon, Firefish, and many others. If there is a bug that delete requests are not honored (which I can't reproduce, works with my local Mastodon test instance), how about support instead of false accusations? You are the aggressive one here. So please be helpful and open-minded.
@sash I am being helpful - I’m informing the community of admins I’m in that there’s a scraper that needs to be blocked at the IP and user-agent level because it doesn’t respect federation boundaries.
I’m just not being helpful to you, nor will I be.
@sash nah I’m good. Information is disseminating. Try talking to people next time before launching a scraper.
@aurynn please stop distributing false information, just because you think so. Aren't you the one that breaks the rules? To quote your own server rules, paragraph 3: "Misinformation and content harmful to the public good (for example, “5G causes coronavirus” or “vaccines cause autism”) is not allowed and will be removed, and may result in an account suspension." False claims about scrapers are O.K. if made by you?
@sash look, mate, being aggro and super defensive in my mentions like this isn’t going to get you anywhere with me. I’m not interested in having this conversation with you.
@aurynn I'm just trying to stop you from spreading false information and harming others that use my service.
@sash that you don’t like how other people view your project does not make it false information.
@aurynn @lutoma there is a WeDistribute article that should answer your question:
https://wedistribute.org/2024/03/contentnation-mastodons-toxicity/
It also mentions this thread as an archetype of Mastodon's toxicity.
@ErikUden @lutoma Not really. From the article, and from reading through some of the docs on the ContentNation site linked from it again, it seems like it was intended to be a fediverse client? Though a very poorly behaved one, if it was directly hitting API endpoints on servers and not implementing a peer server node.