Monthly Archives: January 2013

Streaming rm, a faster alternative to find … exec rm

Anyone who has done Linux or Unix administration will be very familiar with the “find … -exec rm …” idiom for cleaning up temporary files. For example, if an application creates temporary files, we might remove those which are more than seven days old with a command like this:

find /var/spool/thisapp -name thisapp\*.tmp -mtime +7 -exec rm "{}" ";"

Some years ago I came across a situation where an application was creating large numbers of temporary files in a huge directory structure and someone was using a find command like the one above to clear them down. It occurred to me that it would be easy to write a command which took the names of files on the standard input and removed them. I wrote a crude version and ran the command like this (using the above example):

find /var/spool/thisapp -name thisapp\*.tmp -mtime +7 | strrm

It was ten times faster than the original form!  It took thirty seconds versus five minutes for find … exec.  Admittedly there were 46,000 files to remove in that example.
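
A likely reason for the difference is that the “;” form of -exec starts a separate rm process for every file, whereas a program reading names from its standard input is started only once. Standard find can batch names too. The following is a minimal sketch, not the strrm approach itself: it assumes a POSIX find, uses a scratch directory and made-up file names in place of /var/spool/thisapp, and leaves out -mtime because the files are freshly created.

```shell
#!/bin/sh
# Scratch directory standing in for /var/spool/thisapp (hypothetical data).
spool=$(mktemp -d)
i=1
while [ "$i" -le 50 ]; do
    : > "$spool/thisapp$i.tmp"
    i=$((i + 1))
done
: > "$spool/keep.me"

# The "+" terminator lets find hand many names to each rm invocation,
# instead of starting one rm per file as the ";" form does.
find "$spool" -name 'thisapp*.tmp' -exec rm -f {} +

# Count what is left before tidying up the scratch directory.
tmp_left=$(find "$spool" -name 'thisapp*.tmp' | wc -l | tr -d ' ')
other_left=$(find "$spool" -name 'keep.me' | wc -l | tr -d ' ')
rm -rf "$spool"
echo "tmp files left: $tmp_left, other files left: $other_left"
```

How much this helps will depend on the number of files and on the file system, so it is worth timing on your own data.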

A more comprehensive strrm

Recently I decided it was worth re-writing my streaming rm and you can download the source code.  These are its options:

  • -n|--dryrun Don’t do the remove, but echo the file name(s) to the standard output.
  • -v|--verbose Normally strrm runs silently; with this option it echoes each file name as it’s removed.
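
To make the two options concrete, here is a rough shell stand-in. It is an illustration only: strrm_sketch is a hypothetical name, this is not the downloadable source, and because it starts one rm per name it will not reproduce the speed of the real program. It also assumes one file name per line on the standard input.

```shell
#!/bin/sh
# strrm_sketch: hypothetical stand-in for strrm, not the author's program.
# Reads one file name per line from standard input.
strrm_sketch() {
    dryrun=0
    verbose=0
    for arg in "$@"; do
        case $arg in
            -n|--dryrun)  dryrun=1 ;;
            -v|--verbose) verbose=1 ;;
            *) echo "strrm_sketch: unknown option: $arg" >&2; return 2 ;;
        esac
    done
    # IFS= and -r keep leading blanks and backslashes in each name intact.
    while IFS= read -r name; do
        if [ "$dryrun" -eq 1 ]; then
            printf '%s\n' "$name"      # report only, remove nothing
            continue
        fi
        rm -f -- "$name" || continue   # skip names that cannot be removed
        if [ "$verbose" -eq 1 ]; then
            printf '%s\n' "$name"
        fi
    done
}

# Try it on two made-up files in a scratch directory.
work=$(mktemp -d)
: > "$work/a.tmp"
: > "$work/b.tmp"
find "$work" -name '*.tmp' | strrm_sketch --dryrun
after_dry=$(find "$work" -name '*.tmp' | wc -l | tr -d ' ')
find "$work" -name '*.tmp' | strrm_sketch --verbose
after_run=$(find "$work" -name '*.tmp' | wc -l | tr -d ' ')
rm -rf "$work"
```

The dry run lists both names and leaves the files alone; the second call removes them.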

Removing files with awkward names

Apart from speed, strrm has another advantage: it doesn’t care about weird file names.  Consider this rather artificial example, a file name with three backspaces in it:

vger:~/tmp(99)+>- ls Annoy*
Annoy???ingFileName
vger:~/tmp(100)+>- ls Annoy* | strrm --dryrun
Annoy\010\010\010ingFileName
vger:~/tmp(101)+>- ls Annoy* | strrm --verbose
Annoy\010\010\010ingFileName
vger:~/tmp(102)+>- ls Annoy*
ls: No match.
vger:~/tmp(103)+>-
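
One caveat with any stream of names separated by newlines is a file name that itself contains a newline. The sketch below shows one way around that with standard tools, assuming a find and xargs that support -print0 and -0 (GNU and BSD versions do). The file names are made up, and this says nothing about how strrm itself reads its input.

```shell
#!/bin/sh
work=$(mktemp -d)
# One name with three backspaces, as in the session above, and one with a newline.
bs_name=$(printf 'Annoy\b\b\bingFileName')
nl_name=$(printf 'two\nlines')
: > "$work/$bs_name"
: > "$work/$nl_name"

# NUL-terminated names survive any byte a file name can contain.
find "$work" -type f -print0 | xargs -0 rm -f --

left=$(find "$work" -type f | wc -l | tr -d ' ')
rm -rf "$work"
echo "files left: $left"
```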

Removing empty directory paths

For various reasons it is possible to end up with empty directory paths, i.e. paths which contain only directories and no files.  This will delete all the empty directories working back from the longest path:

find testdir -type d | awk '{printf("%04d %s\n",length($0),$0)}' | sort -rn | sed 's/[0-9][0-9][0-9][0-9] //' | strrm --dryrun

Even though strrm cannot delete non-empty directories, I’ve shown it with “--dryrun” in case it doesn’t do exactly what you expect.  Once you’re happy, you can remove the “--dryrun”.
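
For comparison, the same deepest-first ordering can be had from find itself. This is a sketch under assumptions, not a replacement for the pipeline above: it relies on the POSIX -depth option and on rmdir refusing to remove non-empty directories, and the directory tree is made up for the demonstration.

```shell
#!/bin/sh
# Hypothetical tree: a/b/c holds no files, d holds one file.
testdir=$(mktemp -d)
mkdir -p "$testdir/a/b/c" "$testdir/d"
: > "$testdir/d/file.txt"

# -depth visits a directory's contents before the directory itself, so
# children are tried first. rmdir fails on non-empty directories; that is
# expected here, so its complaints are discarded.
find "$testdir" -depth -type d -exec rmdir {} \; 2>/dev/null || true

if [ -d "$testdir/a" ]; then a_left=yes; else a_left=no; fi
if [ -d "$testdir/d" ]; then d_left=yes; else d_left=no; fi
rm -rf "$testdir"
echo "empty chain still present: $a_left, directory with a file still present: $d_left"
```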

 

How Red Hat made Linux palatable for business

A recent blog on TechRepublic mentioned the importance of Red Hat for Linux; however, I don’t think the blogger, Jack Wallen, quite hit the nail on the head.

The first worthwhile Linux distribution was Debian and it continues to be the most important base for user-accessible Linux, most notably Ubuntu. Ubuntu is a superb desktop, but it is exasperating as a server OS. Debian is based on the principle of constant updates which just doesn’t work in a business environment, where configuration management is critical. Constant updating is particularly perilous with Open Source; I have seen point updates break things. Further, in a professional environment you want to stage updates. So for instance you might introduce Apache 2.2 in Test, but leave it on 2.0 for Staging and Production. Later you can introduce it in Staging and Production. This is just too difficult on Debian to bother even trying.

Enter Red Hat. RH understood the needs of business, and created a much more controllable Linux. It also introduced a new, extremely valuable, facility: the ability to stay on old versions of software without the associated risks. For instance, Apache 2.0 had a few known vulnerabilities and the only remedy from the Apache Software Foundation was to upgrade to the latest version of Apache. This left businesses in a quandary: upgrade and risk almost certainly breaking the corporate web site, or just hope no one notices it’s running a vulnerable web server. Red Hat had the solution: it back-ported the security fixes into Apache 2.0 and all was well. This is a service it provides for all its packages.

RHEL isn’t a flawless business OS—for instance, patch auditing is unsatisfactory—but it’s what made Linux acceptable to the business community.

LinkedIn streamlining its apps into oblivion

Just before Christmas, LinkedIn announced changes to its profile pages, adding that it would also be “streamlining” its app offerings; as a result, its link to WordPress® was streamlined into oblivion.  This was bad news for me because it is important to me that my blog is associated with my LinkedIn identity.  Anyone who was using LinkedIn’s apps was told in the same email that they would have new ways to “showcase rich content” on their profile; there was no follow-up email to suggest how or when this might work.  An Internet search for the term “linkedin showcase rich content on your profile” turned up a lot of leads, one of which was hosted on wordpress.com and described in more detail how the rich content feature would work.

That blog described the nature of LinkedIn’s new offerings (which is more than LinkedIn itself did), and I confirmed that none of those new features was any use to me. However, a subsequent link in the search results revealed something that really should have been in LinkedIn’s notification: it’s possible to get your WordPress blog to write to LinkedIn.