Simple Bash Scripts for Lazy People | Part 4: Cadaver Dissection

This is part 4 of a five-part series:

  • Part One has examples for common daily tasks in Git.
  • Part Two, similar examples for Rails.
  • Part Three, miscellaneous cases.
  • Part Four dissects an example of a failed attempt at a useful script.
  • Part Five concludes with a brief discussion of when to use Bash as opposed to some other scripting language.

TL;DR for Part 4:

  • Use comments (and proofread them!)
  • Be mindful of potential future growth of files or directories you might be searching or grepping
  • Get a substring: substr="${string:int_start_pos:int_length}" e.g. ${string:0:1}
  • awk's regex matching is super useful and can make one-liners simpler

Cadaver Dissection

The Anatomy Lesson of Dr. Nicolaes Tulp, Rembrandt, 1632

log-cleanup

This script is useless for a number of reasons:

  • it takes too long to run
  • it doesn’t show the correct output
  • it’s all or nothing so you’d almost never say “yes” when prompted to complete the task

However, it does show some ideas about how to incorporate functions and other nifty tools into your bash scripts, as well as some good examples of mistakes and bad ideas.

The “default no nothing” comment on line 8 makes no sense. It was probably meant to be “default do nothing”, but even that barely makes sense. If the first character of the string argument is not 'y' or 'Y', the function returns 1. That is not a boolean true but an exit status: 1 signals an error condition or, in this case, a negative response, while 0 signals no errors or, here, an affirmative response. The comment should probably read something like “exit status zero for affirmative response”.

Because yes_no exits with a standard 0 status code for “all good” and a non-zero status for the opposite, one can do if yes_no or if ! yes_no, which is nice.

The interesting part here is ${string:0:1} to get a substring (the 0 is start position and the 1 is substring length) — that’s handy to know about.
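Here is a minimal sketch of what such a function might look like. It is not the original script's code: the body and the variable names answer and first are my own guesses at the idea, namely taking the first character with ${string:0:1} and returning an exit status.

```bash
#!/usr/bin/env bash

# Hypothetical reconstruction of a yes_no helper (not the original code).
# Exit status 0 for an affirmative response, 1 for anything else.
yes_no() {
  local answer="$1"
  local first="${answer:0:1}"   # substring: start position 0, length 1
  if [[ "$first" == "y" || "$first" == "Y" ]]; then
    return 0
  fi
  return 1
}

# Because the function speaks in exit statuses, it drops straight into if.
if yes_no "Yes please"; then
  echo "affirmative"
fi

if ! yes_no "nope"; then
  echo "negative"
fi
```

Run as is, this prints affirmative and then negative. An empty answer also counts as a "no", because the substring of an empty string is empty.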

list_logs is a disaster – finding anything that looks like a log in my dev directory takes ages. This should probably be used on a per-project basis rather than everything.

The modification time on this suggests I wrote it very shortly after joining Beezwax a couple years ago, and at the time I was on very few projects and looking in all of ~/dev was no big deal. Obviously terrible future-proofing.
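A per-project version could look roughly like the sketch below. The '*.log' pattern and the optional directory argument are assumptions of mine; the original function's definition of "looks like a log" may well have been different.

```bash
#!/usr/bin/env bash

# Hypothetical per-project variant: search one directory (default: the
# current one) instead of everything under ~/dev.
list_logs() {
  local root="${1:-.}"
  find "$root" -type f -name '*.log'
}

# Demo against a throwaway directory so nothing real is touched.
demo_dir="$(mktemp -d)"
mkdir -p "$demo_dir/log"
touch "$demo_dir/log/development.log" "$demo_dir/README.md"
list_logs "$demo_dir"    # lists only the .log file
rm -rf "$demo_dir"
```

Scoping the search to a single project root keeps the run time tied to that project's size rather than to however much has piled up in the parent directory.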

Do you see the bug in how the total size is computed?

Line 18 calls list_logs (which wraps that crazy find of ~/dev) and reads its output in a while loop that runs du (disk usage) on each file with three flags: -c to include a grand total (not very useful when du is called once per file), -h for human-readable sizes, and -s to print a single summary per argument (which changes nothing when each argument is a single file).

The output from the loop gets piped to tail where -1 gives us only the last line, and the awk prints the first column, which is just the size. It looks like there’s a (wrong) assumption that du would magically know about its use in a loop and magically get a total of everything that happens inside the loop. Too bad there’s no such thing as magic.

Instead I should have written size="$(list_logs | xargs du -ch | tail -1 | awk '{print $1}')", which, in addition to being correct, is simpler and easier to read.

The line can get even simpler by using awk's regex matching: size="$(list_logs | xargs du -ch | awk '/total/{print $1}')", matching the line I care about from du (“82M total”) instead of using tail to extract it.
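To see the awk matching in isolation, here is a small sketch. The total_size helper is a hypothetical name of mine, and I have tightened the pattern to /\ttotal$/ so that a path that merely contains the word "total" is not mistaken for the summary line. It also assumes file names without whitespace, since plain xargs splits on it.

```bash
#!/usr/bin/env bash

# Hypothetical helper: reads file paths on stdin (one per line, no
# whitespace in names assumed) and prints the grand total du reports.
total_size() {
  # LC_ALL=C keeps the summary line labelled "total" regardless of locale.
  LC_ALL=C xargs du -ch | awk '/\ttotal$/ {print $1}'
}

# The awk pattern on its own, fed canned du-style output:
printf '4.0K\tapp.log\n82M\ttotal\n' | awk '/\ttotal$/ {print $1}'   # prints 82M

# Against real files in a throwaway directory:
demo_dir="$(mktemp -d)"
echo "one" > "$demo_dir/a.log"
echo "two" > "$demo_dir/b.log"
size="$(find "$demo_dir" -type f -name '*.log' | total_size)"
echo "total: $size"
rm -rf "$demo_dir"
```

One caveat: with a very long list of files, xargs may split the work across several du invocations, each printing its own total line, so this fits project-sized lists better than a sweep of an entire home directory.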

As above, line 42 could be list_logs | xargs rm.
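One thing to watch with piping file names into xargs rm is names containing spaces. A null-delimited variant sidesteps that. The sketch below uses a hypothetical list_logs0 function (again assuming a '*.log' pattern, which may not match the original) and runs against a throwaway directory.

```bash
#!/usr/bin/env bash

# Hypothetical null-delimited cousin of list_logs: each path ends with a
# NUL byte instead of a newline, so spaces in names are harmless.
list_logs0() {
  local root="${1:-.}"
  find "$root" -type f -name '*.log' -print0
}

demo_dir="$(mktemp -d)"
touch "$demo_dir/with space.log" "$demo_dir/plain.log" "$demo_dir/keep.txt"

# xargs -0 splits its input on NUL bytes to match find's -print0.
list_logs0 "$demo_dir" | xargs -0 rm -f

ls "$demo_dir"    # prints keep.txt
rm -rf "$demo_dir"
```

In a real cleanup script this deletion would sit behind the yes/no prompt, and ideally behind a listing of exactly what is about to be removed.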

On to Part 5: When to Choose Bash
