Right now, robots.txt on lemmy.ca is configured this way:
User-Agent: *
Disallow: /login
Disallow: /login_reset
Disallow: /settings
Disallow: /create_community
Disallow: /create_post
Disallow: /create_private_message
Disallow: /inbox
Disallow: /setup
Disallow: /admin
Disallow: /password_change
Disallow: /search/
Disallow: /modlog
Would it be a good idea, privacy-wise, to block GPTBot from scraping content from the server?
User-agent: GPTBot
Disallow: /
Thanks!
6 points
3 points
I would have thought so too, but == failed the syntax check:
2023/08/07 15:36:59 [emerg] 2315181#2315181: unexpected "==" in condition in /etc/nginx/sites-enabled/lemmy.ca.conf:50
You actually want ~ though, because GPTBot is only one part of the user agent, not the full string.
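For anyone following along, here is a rough sketch of what such a check might look like. It is not the actual lemmy.ca config (that isn't shown in this thread), and the 403 response is just an assumed choice for illustration.

```nginx
# Sketch only: this would sit inside the server { ... } block for the site.
# "~" is a case-sensitive regex match, so the condition is true for any
# User-Agent header that contains "GPTBot" anywhere in the string.
if ($http_user_agent ~ GPTBot) {
    return 403;  # assumed response code, not taken from the thread
}

# By contrast, "=" compares the whole string, so the line below would only
# match a client whose entire User-Agent header is exactly "GPTBot".
# if ($http_user_agent = GPTBot) { return 403; }
```

If case should not matter, ~* is the case-insensitive variant of the same operator.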
2 points
Strangely, = in nginx does the job that == does elsewhere. It's a very strange config format…
https://nginx.org/en/docs/http/ngx_http_rewrite_module.html#if
1 point