noc.social is part of the decentralized social network powered by Mastodon.
This instance is focused on technology, networking, linux, privacy, security, infosec, engineering, but open to anyone. Civil discourse, polite and open. Managed by the noc.org / trunc.org team.


Edit:

So it looks like people have taken this post to be me calling to harass the dev, which was completely and utterly uncalled for.

Don't harass people.

Looking further into the project and what it was doing and how it showed up in my system, I'd rephrase my opinion here as "problematic Fedi client" rather than scraper. So I'd like to apologise for that.

Original message:

Looking like we have a new Fedi scraper at contentnation.net, if you're interested in not being a part of that

sash

@aurynn Admin/Owner of ContentNation.net here. It's not a scraper, just a feature-incomplete ActivityPub service. It only fetches data if someone uses the web viewer with a known profile name or AP id. If it can fetch "private" data, something is wrong on the server to provide it. It does not do HTML scraping. So just what ActivityPub was designed for. Robots are excluded by robots.txt and nofollow, but some just don't care. So that might be a reason why it seems scraped.

@sash it’s fetching unlisted and does not respect post delete. It’s a scraper.

@aurynn How can it fetch unlisted data if the server does not list it? If it does list it, it's a server bug or it is listed. And it should honor post deletes, if not then it's a bug.

@sash okay so your scraper takes anything that’s public and shoves it up on your domain to talk about monetisation, and you get defensive about admins like me correctly calling it a scraper and telling each other about it

Instead of going “hmm maybe I should talk to admins before I launch a scraper”

@aurynn it does federation, which is the intent of the fediverse, by means of ActivityPub. And no, there is no monetization on the site. Btw, every Mastodon instance does the same. Pick any one and you can see all public posts of any local and remote user, exactly like my site. So Mastodon is evil, too?

@sash my dude, it’s a bit late for this. Getting defensive about your scraper in my mentions isn’t going to have the result you want.

@aurynn Sorry, I don't get your point. It's not a scraper, just an ActivityPub client like Mastodon, Firefish, and many others. If there is a bug that delete requests are not honored (which I can't reproduce, works with my local Mastodon test instance), how about support instead of false accusations? You are the aggressive one here. So please be helpful and open minded.

@sash I am being helpful - I’m informing the community of admins I’m in that there’s a scraper that needs to be blocked at the IP and user-agent level because it doesn’t respect federation boundaries.

I’m just not being helpful to you, nor will I be.

@aurynn So please tell me which boundaries I overstepped in your opinion. Also please explain, also to the other admins, why you think it is a scraper, if even one of the creators of ActivityPub (@evan) replied that it is not a scraper and is just using the official API calls.

@sash nah I’m good. Information is disseminating. Try talking to people next time before launching a scraper.

@aurynn please stop distributing false information, just because you think so. Aren't you the one that breaks the rules? To quote your own server rules, paragraph 3: "Misinformation and content harmful to the public good (for example, “5G causes coronavirus” or “vaccines cause autism”) is not allowed and will be removed, and may result in an account suspension." Falsely claiming something is a scraper is O.K. if done by you?

@sash look, mate, being aggro and super defensive in my mentions like this isn’t going to get you anywhere with me. I’m not interested in having this conversation with you.

@aurynn I'm just trying to stop you from spreading false information and harming others that use my service.

@sash that you don’t like how other people view your project does not make it false information.

@aurynn @sash It absolutely is false information though. All you did during the entire discussion is reiterate your (incorrect) initial assumption.

@lutoma @sash *sigh* okay please explain how this works and how it is intended to participate in the fediverse.

@aurynn @lutoma there is a WeDistribute article that should answer your question:

wedistribute.org/2024/03/conte

It also mentions this thread as an archetype of Mastodon's toxicity.

Content Nation Backlash Highlights Mastodon's Toxicity - We Distribute

@ErikUden @lutoma Not really. From it, and from reading again through some of the docs on the Content Nation site linked from it, it seems like it was intended to be a fediverse client? Though a very poorly behaved one, if it was directly hitting API endpoints on servers and not implementing a peer server node.

Man, you'd be better off just blocking cloudisland. Anyone she talks to, trust me you won't care for their content or users anyway.

Hell I even block mastodon.art on principle due to them weaponising their userbase to be the fedi police. It's all very childish and ridiculous.

They don't seem to understand everything they post is public, and that you aren't scraping, you are federating.

I used to share a similar opinion to Aurynn... years ago when this place was smaller - but I live in the real world and realise anything we post publicly is out there, probably forever.

As a photographer I struggle with this, but it is what it is. Anyone making money from my content I will go after, however.

You aren't lying about what it is and how it works so... good on you. ❤️

@aurynn @sash this performative bullying is trite.