Right now, robots.txt on lemmy.ca is configured this way:

User-Agent: *
  Disallow: /login
  Disallow: /login_reset
  Disallow: /settings
  Disallow: /create_community
  Disallow: /create_post
  Disallow: /create_private_message
  Disallow: /inbox
  Disallow: /setup
  Disallow: /admin
  Disallow: /password_change
  Disallow: /search/
  Disallow: /modlog

Would it be a good idea privacy-wise to deny GPTBot from scraping content from the server?

User-agent: GPTBot
Disallow: /
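
For anyone who wants to check what the proposed rule would do before deploying it, here is a minimal sketch using Python's standard-library `urllib.robotparser`. The robots.txt text is an abbreviated copy of the rules above plus the proposed GPTBot block; the path `/post/123` and the agent name `SomeOtherBot` are made up for illustration. It only shows how a well-behaved parser reads the file, not how any particular crawler actually behaves.

```python
from urllib.robotparser import RobotFileParser

# Abbreviated copy of the existing rules, plus the proposed GPTBot block.
ROBOTS_TXT = """\
User-Agent: *
  Disallow: /login
  Disallow: /settings
  Disallow: /search/

User-agent: GPTBot
Disallow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt text without fetching anything over the network."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# "/post/123" and "SomeOtherBot" are hypothetical, chosen only for the demo.
for agent in ("GPTBot", "SomeOtherBot"):
    for path in ("/post/123", "/login"):
        print(agent, path, parser.can_fetch(agent, path))
```

Under these rules GPTBot is refused everywhere, while other crawlers are still only kept out of the paths listed for `*`.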

Thanks!

-2 points

No, definitely not. Our work posted in the open is done so because we want it to be open!

It is understandable that not all work wants to be open, but access would already be appropriately locked down for all robots (and humans!) who are not a member of the secret club in those cases. There is no need for special treatment here.

8 points

Is this even possible without all federated instances also prohibiting them?

14 points

You take action where you can ;)

18 points

Yes. Ban them.

if ($http_user_agent = "GPTBot") {
  return 403;
}
4 points

Thanks for empowering my laziness =)

6 points

Probably want == instead, or else we will all be forbidden

3 points

I would have thought so too, but == failed the syntax check

2023/08/07 15:36:59 [emerg] 2315181#2315181: unexpected "==" in condition in /etc/nginx/sites-enabled/lemmy.ca.conf:50

You actually want ~ though, because GPTBot is only one part of the user agent; it’s not the full string.
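
To illustrate the difference, here is a rough Python sketch that mimics the two nginx operators: `=` compares the whole header to the string, while `~` does a case-sensitive regex search anywhere in it. The GPTBot user agent string below is the one I understand OpenAI to have published, so treat it as an assumption; the browser string is just a made-up example. This is not nginx itself, only the same matching logic.

```python
import re

# Assumed example of the crawler's full User-Agent header.
GPTBOT_UA = (
    "Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; "
    "compatible; GPTBot/1.0; +https://openai.com/gptbot)"
)
# Hypothetical ordinary browser header, for contrast.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:109.0) Gecko/20100101 Firefox/116.0"


def exact_match(user_agent: str) -> bool:
    """Mimics nginx `=`: true only if the whole header equals "GPTBot"."""
    return user_agent == "GPTBot"


def regex_match(user_agent: str) -> bool:
    """Mimics nginx `~`: case-sensitive regex search anywhere in the header."""
    return re.search(r"GPTBot", user_agent) is not None


for name, ua in (("gptbot", GPTBOT_UA), ("browser", BROWSER_UA)):
    print(name, "exact:", exact_match(ua), "regex:", regex_match(ua))
```

With the exact comparison the crawler's real header slips through, which is why the `=` version never returns 403 for it.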

2 points

Strangely, nginx uses = for equality comparison where most languages would use ==. It’s a very strange config format…

https://nginx.org/en/docs/http/ngx_http_rewrite_module.html#if

1 point

Look at me! I’m the GPTBot now!

11 points

1000% yes. Please block them.

5 points

Are they even respecting those files?

But yeah, sure, it’s worth trying!

1 point

Worth trying for what reason?

3 points

It’s from the official documentation.


Lemmy.ca Support / Questions

!lemmy_ca_support@lemmy.ca
