Can AI-generated content get around existing filters that prevent spam and illegal content?

Pretty much the question. Is it different enough from regular crap to be missed or is the prevention working fine?

A thread about Grok allowing nonconsensual deepfakes for paying customers had me wondering whether we’ll soon be flooded once more by waves of spam bots meant to disgust and create trouble. So I thought I’d ask whether we can prevent that kind of thing now, or whether it’s something to be dealt with as it comes.
And no doubt it will come.


4 Comments

There are no existing filters for spam or illegal content. That’s all done manually when someone reports it.

Instances that use Cloudflare should opt in to their service which detects CSAM. It’s free and easy. I’ve done that for piefed.social.

I’ve recently added some LLM-generated text detection to PieFed, but it’s not scalable, as I don’t have the hardware to enable its widespread use. So I just use it in a very limited and targeted way at the moment.

I see. It was wishful thinking on my part I suppose. Means someone will still have to suffer through the process.

Can’t ask for more than what’s possible, thanks for even doing that much.

Just learned that Grok now requires a paid subscription to generate nudes. That won’t stop the average slop, but it may stop some of the weird stuff, because these people need to enter payment details. And they’d probably need to do some extra work to keep their personal credit card from being connected to illegal activities.

I think the only way for us to do something is to complain to our legislators and get them to pass laws mandating watermarking for all AI services. With proper watermarks, we could build AI filters.
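To make the filter idea a bit more concrete, here is a minimal sketch of what an instance could do if such watermarks existed. Everything in it is an assumption for illustration: the `ai_generated` metadata key, the `Upload` type, and the function names are hypothetical and do not correspond to any real standard, AI service, or PieFed/Lemmy feature.

```python
from dataclasses import dataclass, field

# Hypothetical metadata key; no real watermarking standard is implied.
AI_WATERMARK_KEY = "ai_generated"


@dataclass
class Upload:
    """Illustrative stand-in for an uploaded file plus its parsed metadata."""
    filename: str
    metadata: dict = field(default_factory=dict)


def is_marked_ai(upload: Upload) -> bool:
    """Return True only if the upload carries the assumed watermark flag."""
    return upload.metadata.get(AI_WATERMARK_KEY) is True


def filter_uploads(uploads, allow_ai=False):
    """Keep unmarked uploads; keep marked ones only if allow_ai is set."""
    return [u for u in uploads if allow_ai or not is_marked_ai(u)]


uploads = [
    Upload("cat.jpg"),
    Upload("render.png", {AI_WATERMARK_KEY: True}),
]
kept = [u.filename for u in filter_uploads(uploads)]
```

A filter like this only catches content whose generator actually applied the watermark, which is why it would depend on the mandate.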

And automatically detecting whether an AI image is a nonconsensual deepfake… I don’t think that’s possible with technology.

The future of communication is through whitelisting, sadly.
