You might have guessed it already: We are struggling with excessive crawling today. We have - again - blocked several large IP ranges, but were not yet able to identify the new actor.
We are working on restoring service availability and fine-tuning our rate-limiting.
If someone is interested in implementing improved native rate limiting in #Forgejo that also protects other instances from abusive crawlers, please reach out 😉
Rev. Roger BW 😷
In reply to Codeberg.org • • •Felipe M.
In reply to Codeberg.org • • •Codeberg.org
In reply to Felipe M. • • •@fmartingr We're using haproxy and have a custom blacklist loaded here: codeberg.org/Codeberg-Infrastr…
It's not public (yet), but we should probably consider opening it. We would first need to check that it contains only publicly known IP addresses, though. I'm not fully up to date on how the law treats publishing IP ranges of bad actors. ~f
scripted-configuration/hosts/kampenwand/etc/haproxy/haproxy.cfg at bef038ca91cb928e0b865ada4bc6d579b2bc857e
Codeberg.org
Felipe M.
In reply to Codeberg.org • • •Codeberg.org
In reply to Codeberg.org • • •IAG
In reply to Codeberg.org • • •Codeberg.org
In reply to IAG • • •@iagondiscord The problem is that it's not targeting Codeberg. It's the #AIgoldrush. The web was completely crawled, just not by everyone yet. So startups start their #crawlers, carelessly and explicitly ignoring robots.txt to get the #biggestdata.
It does not matter if the web can no longer serve humanity due to this. Training the #AI is the only thing that matters.
Maybe a bit like a sacrifice for faith.
~f #goldrush
Codeberg.org
In reply to Codeberg.org • • •jon ⚝
In reply to Codeberg.org • • •@iagondiscord
Daniel Böhmer
In reply to Codeberg.org • • •Codeberg.org
In reply to Daniel Böhmer • • •@dboehmer One of the primary constraints of the current rate-limiting is that there is only a global counter that increases for each request.
So a user watching Forgejo Actions logs scroll through will fire a lot of small requests. And a botnet that is distributed over many, many IP addresses does not trigger the rate-limiting at all, because each server only fires a few requests.
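To illustrate the constraint described above, here is a minimal Python sketch. It is not Forgejo's or Codeberg's actual implementation. It assumes the counter is kept per client address within one time window and that every request counts the same regardless of cost. The class name, the limit of 100, and the IP addresses are all made up for the example.

```python
from collections import defaultdict


class FixedWindowLimiter:
    """Hypothetical per-client counter for a single time window."""

    def __init__(self, limit):
        self.limit = limit
        self.counts = defaultdict(int)

    def allow(self, client_ip):
        # Every request costs the same, whether it is a tiny log poll
        # or an expensive page render.
        self.counts[client_ip] += 1
        return self.counts[client_ip] <= self.limit


def count_blocked(limiter, requests):
    """Return how many requests in the sequence were rejected."""
    return sum(1 for ip in requests if not limiter.allow(ip))


LIMIT = 100  # assumed limit per window, for illustration only

# One human watching CI logs: 300 small polling requests from one address.
log_viewer = ["203.0.113.7"] * 300

# A distributed crawler: 1000 distinct addresses, 5 requests each.
botnet = [f"10.0.{i // 250}.{i % 250}" for i in range(1000) for _ in range(5)]

blocked_viewer = count_blocked(FixedWindowLimiter(LIMIT), log_viewer)
blocked_botnet = count_blocked(FixedWindowLimiter(LIMIT), botnet)
print(blocked_viewer, blocked_botnet)  # prints: 200 0
```

Under these assumptions the single log viewer loses 200 of 300 requests, while the crawler sends 5000 requests in total and none are blocked, because no single address comes close to the limit.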
Harald
In reply to Codeberg.org • • •Marcus Rohrmoser 🌻
In reply to Codeberg.org • • •Simon
In reply to Codeberg.org • • •I know how you feel... Same on gitnet.fr.
if ($http_user_agent ~* "facebookexternalhit|bytespider|Amazonbot|ClaudeBot|AhrefsBot") { return 429; }
Rachel Rawlings
In reply to Codeberg.org • • •the esoteric programmer
In reply to Codeberg.org • • •Codeberg.org
In reply to the esoteric programmer • • •Torsten Grote
In reply to Codeberg.org • • •Codeberg.org
In reply to Torsten Grote • • •@grote
This is interesting feedback. There have been no changes to the rate-limiting, and the last two changes over the past three months were always increases.
We have blocked several offending IP ranges. Is there information about which hosting providers F-Droid uses?
@fdroidorg
Hans-Christoph Steiner
In reply to Codeberg.org • • •