Disk full? Package Materials downloads filling up the local disks

dcaltier · July 29, 2025, 3:48pm

I’m referencing these here as their a similar set of issues to describe the behavior No disk space left after big zip file into /opt/indico/tmp.

And midway down into Export large number of posters from indico - #10 where they get into similar problems.

Currently running 3.3.6

Basically, twice in the last two weeks we’ve filled up the available disk space.

The first time was two weeks ago and we assumed that it was a big conference that let out and hundreds of users rushing download the materials which included video. We cleared out our cache and tmp folders and added about 200gb of disk for what we thought was more than enough headroom. However…

This morning starting at ~4:00am indico got crushed by external traffic that triggered the “download material” function in many different events simultaneously, rapidly filling the storage on the server. We thought at first it might have been the same thing.

But my senior engineer pulled the web access logs, between 04:00 and 04:59 today indico was hit 26213 times, but only from 3 IP addresses. Looks like some kind of scanner or possibly a botnet/crawler hit a bunch of download material buttons. Not too certain about the first event anymore.

Seemed to be mostly concentrated in about a dozen different events. With only 4 events with huge numbers of *.zips from the download materials in the attachment-packages folder. We’re asking our cyber folks now. But whatever it is, it basically filled both the archives and the /opt (we don’t use disk quotas for indico) overnight.

We’re looking for ways to try to throttle (something like the LATEX_RATE_LIMIT config but for package materials?) or if we can’t run the cleanup tool process more proactively than every 24 hours. Right now, we’re going to throw a lot more head room at the disk. But, last time we had less than a hundred gigs of headroom left on archive before it filled it up. This time we had more than 230gb. We’re going to probably add a TB, but that still doesn’t quite solve the problem if it fills up aggressively like this in a 24 hour period.

Any ideas or features we missed looking into this problem? Any specific logs you’d want to look at for reference?

ThiefMaster · July 29, 2025, 4:38pm

Add CAPTCHA + rate limit to material package form by ThiefMaster · Pull Request #6996 · indico/indico · GitHub (the main functionality is already there, so if you want you can take the CI-built wheel from that PR and install it)

Guess why I opened this PR today…
The same idiot - running a poorly configured security scanner I think - also hit one of our instances and created around 380 GB of garbage:

$ grep 91.92.46.110 /var/log/haproxy/frontend.log | grep -P 'POST /event/\d+/attachments/package' | wc -l
35821

I then blocked it, deleted the existing garbage and ensured the tasks still queued did not build anything.

PS: Would you mind sharing the IPs that did this on your instance with me (privately if you prefer)? Curious if mine is one of them, and I might just block the others since they’re clearly no good netizens.