A Puppet server I'm managing was running out of disk space, and the culprit turned out to be Puppet's rather verbose report files. I had a whole bunch of reports which simply stated that the following umpteen files had not changed at all. This is both useless and wasteful, at 38 megs a report, per server, twice an hour. Even though the environment is small, I ended up with 22 gigs of reports...
After much googling and stackoverflowing, I came up with the following script:
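#!/bin/bash
# Collect the reports of unchanged runs into a list file next to this script.
list="$(dirname "$0")/unchanged-reports"
grep -Pzl "status: unchanged(\n)metrics" /opt/puppetlabs/server/data/puppetserver/reports/*/*.yaml > "$list"
# Cut each listed report off at its "metrics:" line, then swap the shortened copy in.
while IFS= read -r p; do
  sed '/metrics:/,$d' "$p" > "${p}.0"
  rm "$p"
  mv "${p}.0" "$p"
done < "$list"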
Run as root. Comment out the rm and mv bits if you're nervous or you just want to experiment.
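If you'd rather experiment away from the real reports first, here is a minimal sketch of what the sed step does. The sample file and its contents are made up for illustration (a real Puppet report is much longer), and everything happens in a scratch directory:

```shell
#!/bin/bash
# Hypothetical sample data in a scratch directory; real reports are not touched.
workdir="$(mktemp -d)"
sample="$workdir/sample.yaml"
printf '%s\n' 'host: node1.example.com' 'status: unchanged' 'metrics:' '  resources:' '    total: 120' > "$sample"

# Delete everything from the first line matching "metrics:" to the end of the file.
sed '/metrics:/,$d' "$sample" > "$sample.0"

# Compare the line counts of the original and the shortened copy.
before_lines=$(wc -l < "$sample")
after_lines=$(wc -l < "$sample.0")
echo "before: $before_lines lines, after: $after_lines lines"
cat "$sample.0"
```

The shortened copy keeps only the lines above "metrics:", which is the same thing the loop in the script does to each listed report.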
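The command-line switches for grep (these only work with GNU grep, i.e. on most Linux systems):
- P turns on Perl-compatible regexp mode, which the GNU grep documentation calls experimental when combined with -z, so it can potentially break things
- z makes grep treat the input as NUL-separated rather than newline-separated, so a whole file is read as one record and the pattern can span multiple lines
- l prints only the name of each file where the pattern was found, rather than the matching text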
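To see why the z switch matters, here is a small sketch that runs the same pattern with and without it. The two sample files are invented for the demonstration, and it assumes a GNU grep built with Perl regexp support:

```shell
#!/bin/bash
# Hypothetical sample files in a scratch directory (assumes GNU grep with -P support).
workdir="$(mktemp -d)"
printf '%s\n' 'status: unchanged' 'metrics:' > "$workdir/a.yaml"
printf '%s\n' 'status: changed' 'metrics:' > "$workdir/b.yaml"

# Without -z grep matches line by line, so a pattern containing \n finds nothing.
# "|| true" keeps grep's "no match" exit status from stopping a strict shell.
line_mode=$(grep -Pl "status: unchanged(\n)metrics" "$workdir"/*.yaml || true)

# With -z each file is read as one record, so the pattern can span the line break.
null_mode=$(grep -Pzl "status: unchanged(\n)metrics" "$workdir"/*.yaml || true)

echo "line mode matches: ${line_mode:-none}"
echo "null-data mode matches: ${null_mode:-none}"
```

Only the file with the unchanged status should be listed, and only in the run that uses z.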
And then you can automate this, say, with cron.
In addition to this script, I use logrotate to compress and eventually remove old report files.