Defending Savannah from DDoS attacks
Savannah is under heavy attack, likely from one or more organizations using a massive botnet to build a dataset for training large language models (LLMs). A distributed denial-of-service (DDoS) attack has been underway since January 2025, and our IP blocklist reached five million addresses in February 2025. In this article, we will introduce Savannah and some tools and techniques that the Savannah hackers and FSF system administrators use to mitigate DDoS attacks against GNU resources and the FSF network. This series of attacks is not limited to Savannah: staff and volunteers have read about similar attacks against other software forges including Sourceware, Pagure, GitLab instances, SourceHut, and Codeberg, as well as Gitea and Forgejo instances. We hope this article can help others fight these attacks as well.
GNU Savannah is the software development forge operated by the GNU Project and hosted by the FSF. GNU Savannah was initially a fork of SourceForge installed by Loïc Dachary, distinguished by an express commitment to only host free software. While savannah.gnu.org is reserved for official GNU packages, savannah.nongnu.org hosts free software packages that are not officially GNU packages. Savannah is hosted by the FSF with a core infrastructure in Massachusetts, maintained and operated by the Savannah hackers team with the help of the FSF system administrators. Savannah continuously works to maintain an A-grade from the GNU Ethical Repository Criteria Evaluations.
Savannah's hosting is split between many different virtual machines which isolate different functionality, such as the front-end web user interface (UI), internal databases, and our supported version control systems (VCS): bzr, cvs, hg, git, and svn (view the design of Savannah's infrastructure). The hosts that serve source code for human reading over HTTP and HTTPS currently receive the majority of abuse. These hosts generate web pages with syntax-highlighted source code pertaining to a specific commit to a GNU package in a git or other VCS repository.
Defending systems like Savannah from DDoS begins with analysis. Teams must differentiate problematic requests to the system from acceptable ones. For Savannah, analyzing log files revealed a correlation between many of the IPs hitting our servers: it is not one user agent, but many different user agents overlapping simultaneously. This information helped, but it did not solve our problems and introduced a new one! The list of IPs was too large for many of the traditional firewalls. Enter: ipset.
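As an illustration of this kind of log analysis, here is a minimal sketch in Python. It assumes the widely used "combined" access-log layout; the article does not describe Savannah's actual log format or tooling, and the names `summarize`, `suspicious_ips`, and the request threshold are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Assumed "combined" access-log layout:
# ip ident user [time] "request" status bytes "referer" "user agent"
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]*\] "[^"]*" \d{3} \S+ "[^"]*" "(?P<agent>[^"]*)"'
)


def summarize(lines):
    """Count requests per IP and collect the user agents each IP presented."""
    hits = Counter()
    agents = defaultdict(set)
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that do not follow the assumed layout
        hits[match["ip"]] += 1
        agents[match["ip"]].add(match["agent"])
    return hits, agents


def suspicious_ips(hits, threshold):
    """Return the IPs with at least `threshold` requests (a hypothetical cutoff)."""
    return sorted(ip for ip, count in hits.items() if count >= threshold)
```

Request counts alone are a crude signal. In practice they would be combined with other observations, such as which URLs are requested and how many distinct user agents a single address presents.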
Ipset is a newer tool for Savannah hackers to help manage large collections of IP addresses. Jing, a Savannah hacker and GNU webmaster based in the Asia-Pacific region, had been experimenting with it on his own infrastructure successfully. Frustrated with the limits of iptables (a firewall management instrument) and excited by Jing's research, longtime Savannah hacker Bob Proulx immediately put ipset to work. It worked fantastically, handling the initial list of two million IP addresses without meaningful degradation of host performance, soon scaling to over five million unique IPv4 addresses.
[Image: "Hosts blocklisted ipset"] Ipset is a powerful tool for mitigating DDoS attacks.
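To give a rough idea of how a large address list can be loaded into ipset, the following Python sketch writes input for `ipset restore`. It is not the script Savannah uses: the set name `blocklist`, the `maxelem` value, and the function name are assumptions chosen for illustration. The default `maxelem` of a `hash:ip` set is 65536, so it has to be raised to hold millions of entries.

```python
import ipaddress


def ipset_restore_script(addresses, set_name="blocklist", maxelem=8_000_000):
    """Build text for `ipset restore` from an iterable of IPv4 address strings."""
    # Validate and de-duplicate; IPv4Address raises ValueError on malformed input.
    unique = sorted(
        {ipaddress.IPv4Address(a.strip()) for a in addresses if a.strip()}
    )
    lines = [f"create {set_name} hash:ip family inet maxelem {maxelem}"]
    lines += [f"add {set_name} {ip}" for ip in unique]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Documentation-range addresses stand in for real blocklist entries.
    print(ipset_restore_script(["198.51.100.7", "192.0.2.1"]), end="")
```

The output can be piped to `ipset restore`, and a single firewall rule such as `iptables -I INPUT -m set --match-set blocklist src -j DROP` then matches the whole set. Because the set is a hash, lookup cost stays roughly constant as the list grows, unlike one iptables rule per address.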
Ipset mitigated the attack of the moment but, once again, introduced new problems. Many of the addresses belonged to Internet service providers (ISPs) using Carrier-Grade NAT (CG-NAT). CG-NAT lets many individuals share one IP address and is used by many ISPs due to IPv4 exhaustion, so blocking a single address can lock out legitimate users; this commonly affects people in China, Brazil, and Peru, as well as users of mobile carrier networks. Bob added corresponding allowlists, tracking confirmed "real user" behaviors and exempting them from future bans. This isn't a perfect solution, but it is amazingly effective.
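The allowlist idea can be sketched as a filter applied to ban candidates before they reach the blocklist. This is an assumption about how such a filter might look, not a description of Bob's implementation; `apply_allowlist` and the example ranges are hypothetical.

```python
import ipaddress


def apply_allowlist(candidates, allowlist):
    """Drop ban candidates that fall inside any allowlisted address or network."""
    allowed = [ipaddress.ip_network(entry) for entry in allowlist]
    kept = []
    for text in candidates:
        ip = ipaddress.ip_address(text)
        if any(ip in net for net in allowed):
            continue  # confirmed "real user" address: exempt from the ban
        kept.append(text)
    return kept


if __name__ == "__main__":
    # Documentation-range addresses stand in for real data.
    to_ban = apply_allowlist(
        ["203.0.113.10", "198.51.100.7", "192.0.2.55"],
        ["203.0.113.0/24", "192.0.2.55"],
    )
    print(to_ban)
```

A linear scan over the allowlist is fine while it stays small. A much larger allowlist would call for a faster lookup structure, or for a second ipset matched by an accept rule placed ahead of the drop rule.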
Unfortunately, all of this processing adds up. Hosting software and documentation is a vital part of our work promoting software freedom. This months-long abuse, and our continuous work defending against it and adapting to the ever-changing situation, present an enormous drain on resources. Nevertheless, protecting our servers against degradation of service remains one of our highest priority tasks.
To all of the companies crawling the Internet: there is a better way! Do not scan code repositories over the web: clone them using version control tools such as git, cvs, svn, Mercurial, or bzr. Follow the rules set forth in the robots.txt files. Identify yourself with a user agent that includes a link describing your activity and a contact address. If your bot was blocked, do not attempt to circumvent the ban. If your program is unblocked after a ban, add more rate-limiting to it. Please contact us with questions by emailing sysadmin@fsf.org or visit us on IRC on libera.chat in the #savannah or #fsfsys channels.
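For crawler authors, the robots.txt and user-agent rules above might look like the following sketch, which uses Python's standard `urllib.robotparser`. The bot name, URLs, contact address, and default delay are placeholders, and the robots.txt text shown in the usage example is invented for illustration rather than copied from Savannah.

```python
import urllib.robotparser

# Hypothetical identity: a link describing the bot plus a contact address.
USER_AGENT = "ExampleBot/0.1 (+https://example.org/bot; bot-admin@example.org)"


def make_robots(robots_txt):
    """Parse robots.txt text that has already been downloaded."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def allowed_urls(parser, urls, agent=USER_AGENT):
    """Keep only the URLs that robots.txt permits for this agent."""
    return [url for url in urls if parser.can_fetch(agent, url)]


def crawl_delay(parser, agent=USER_AGENT, default=10.0):
    """Seconds to wait between requests; fall back to a conservative default."""
    delay = parser.crawl_delay(agent)
    return float(delay) if delay is not None else default


if __name__ == "__main__":
    rules = make_robots("User-agent: *\nCrawl-delay: 5\nDisallow: /cgit/\n")
    urls = ["https://example.org/projects/foo", "https://example.org/cgit/foo.git/"]
    print(allowed_urls(rules, urls), crawl_delay(rules))
```

A crawler would sleep for the returned delay between requests and send `USER_AGENT` in its request headers. For source code itself, cloning the repository once with the appropriate version control tool remains the better option.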
We will fight these attacks for as long as they continue.
"Hosts blocklisted ipset" © 2025 by Corwin Brust. This image is licensed under CC BY-SA 4.0.