Estimate Time for Job Completion (With Progress Updates) When Tar'ing Huge Directories

For the sake of future me, I am recording here the coolest shell trick I’ve learned this year.

Linux:

    tar cf - /folder-with-big-files -P | pv -s $(du -sb /folder-with-big-files | awk '{print $1}') | gzip > big-files.tar.gz

OSX:

    tar cf - /folder-with-big-files -P | pv -s $(($(du -sk /folder-with-big-files | awk '{print $1}') * 1024)) | gzip > big-files.tar.gz

The output looks like:

    4.69GB 0:04:50 [16.3MB/s] [==========================> ] 78% ETA 0:01:21

Requires ‘pv’: https://github.
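The two variants differ only in how the byte count handed to `pv -s` is obtained: `du -sb` is a GNU option, so the OSX version asks for kilobytes with `du -sk` and multiplies by 1024. A minimal sketch of that calculation as a helper follows; the name `dir_size_bytes` is made up for illustration and is not part of the original trick.

```shell
# Hypothetical helper: approximate size of a directory tree in bytes.
# Uses 'du -sk' (kilobytes), which both GNU and BSD du accept,
# so the result is always a multiple of 1024.
dir_size_bytes() {
    local size_kb
    size_kb="$(du -sk "$1" | awk '{print $1}')"
    echo $(( size_kb * 1024 ))
}

# Possible use with the pipeline above (requires pv):
# tar cf - /folder-with-big-files -P | pv -s "$(dir_size_bytes /folder-with-big-files)" | gzip > big-files.tar.gz
```

Because the figure is rounded to whole kilobytes of disk usage, the percentage and ETA that `pv` reports are estimates rather than exact values.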


Setting up the Text Editor in My Computing Ecosystem

Basic Setup of Shell to Support My Text Editor Preferences

By “text editor”, I mean Vim, of course. There are pseudo-operating systems that include rudimentary text-editing capabilities (e.g. Emacs), and integrated development environments that allow for editing of text, but there really is only one text editor that deserves the title of “text editor”: Vim, that magical mind-reading mustang that carries out textual mogrifications with surgical precision and zen-like elegance.


'xargs' - Handling Filenames With Spaces or Other Special Characters

xargs is a great little utility to perform batch operations on a large set of files. Typically, the results of a find operation are piped to the xargs command:

    find . -iname "*.pdf" | xargs -I{} mv {} ~/collections/pdf/

The -I{} tells xargs to substitute “{}” in the statement to be executed with the entries being piped through. If these entries have spaces or other special characters, though, things will go awry.
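One common way around this, which may or may not be the approach the full post takes, is to have find separate entries with NUL bytes (`-print0`) and tell xargs to split on them (`-0`). A minimal sketch, where the function name `collect_pdfs` and its two arguments are invented for illustration:

```shell
# Hypothetical helper: move every PDF under a source directory into a
# destination directory, coping with spaces, quotes and other odd
# characters in file names.
# Assumes the destination is not inside the source directory.
collect_pdfs() {
    local src_dir="$1" dest_dir="$2"
    mkdir -p "$dest_dir"
    # -print0 ends each path with a NUL byte; -0 makes xargs split on NUL
    # only, so whitespace and quotes in names are passed through untouched.
    find "$src_dir" -iname "*.pdf" -print0 | xargs -0 -I{} mv {} "$dest_dir"/
}

# Example call:
# collect_pdfs . ~/collections/pdf
```

NUL is used as the separator because it is the one byte that cannot occur inside a file path.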


Useful diff Aliases

Add the following aliases to your ‘~/.bashrc’ for some diff goodness:

    alias diff-side-by-side='diff --side-by-side -W"`tput cols`"'
    alias diff-side-by-side-changes='diff --side-by-side --suppress-common-lines -W"`tput cols`"'

You can, of course, use shorter alias names in good old UNIX tradition, e.g. ‘ssdiff’ and ‘sscdiff’. You might be wondering (a) why I did not do so, and (b) what, conversely, is the point of having aliases that are almost as long as the commands that they are aliasing.


Supplementary Command-History Logging in Bash: Tracking Working Directory, Dates, Times, etc.

Introduction

Here is a way to create a secondary shell history log (i.e., one that supplements the primary “~/.bash_history”) that tracks a range of other information, such as the working directory, hostname, and time and date. Using the “HISTTIMEFORMAT” variable, it is in fact possible to store the time and date with the primary history, but storing the other information is not as readily doable. Here, I present an approach based on this excellent post on StackOverflow.
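To give a rough idea of what such a supplementary log can look like, here is a minimal sketch. It is not necessarily the approach from the StackOverflow post; the log file name, the function names and the tab-separated record layout are all assumptions made for illustration.

```shell
# Hypothetical log location; not taken from the original post.
SUPP_HISTORY_LOG="${SUPP_HISTORY_LOG:-$HOME/.bash_history_extended.log}"

# Build one tab-separated record: date and time, hostname, working
# directory, and the command text passed as the first argument.
format_history_record() {
    printf '%s\t%s\t%s\t%s\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$(uname -n)" "$PWD" "$1"
}

# Append the most recent interactive command to the supplementary log.
log_last_command() {
    local last
    # 'history 1' prints something like "  42  ls -l"; strip the number.
    last="$(HISTTIMEFORMAT='' history 1 | sed 's/^ *[0-9][0-9]*[* ] *//')"
    if [ -n "$last" ]; then
        format_history_record "$last" >> "$SUPP_HISTORY_LOG"
    fi
}

# In ~/.bashrc one could then have Bash call it before every prompt:
# PROMPT_COMMAND='log_last_command'
```

One limitation of this sketch is that pressing Enter on an empty prompt logs the previous command again, since `history 1` still returns it.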


Stripping Paths from Files in TAR Archives

There is no way to get tar to ignore directory paths of files that it is archiving. So, for example, if you have a large number of files scattered about in subdirectories, there is no way to tell tar to archive all the files while ignoring their subdirectories, such that when unpacking the archive you extract all the files to the same location. You can, however, tell tar to strip a fixed number of elements from the full (relative) path to the file when extracting using the “--strip-components” option.
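As a small illustration of “--strip-components”, the sketch below builds a throwaway archive and extracts it with the two leading path elements removed. The directory and file names are invented for the demonstration.

```shell
# Set up a scratch area with one file nested two directories deep.
workdir="$(mktemp -d)"
mkdir -p "$workdir/src/a/b" "$workdir/out"
echo "hello" > "$workdir/src/a/b/file.txt"

# Archive the file under its relative path a/b/file.txt.
tar -cf "$workdir/demo.tar" -C "$workdir/src" a/b/file.txt

# Extract, dropping the first two path elements ("a" and "b"),
# so the file lands directly in the output directory.
tar -xf "$workdir/demo.tar" -C "$workdir/out" --strip-components=2

ls "$workdir/out"
```

Since the count is fixed, files sitting at different depths end up at different levels after extraction, and entries with fewer path elements than the count are skipped.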


Piping Output Over a Secure Shell (SSH) Connection

We all know about using scp to transfer files over a secure shell connection. It works fine, but there are many cases where other modes of use are required, for example, when you want to transfer the output of one program directly to a remote machine and store it there. Here are some ways of going about doing this. Let “$PROG” be a program that writes data to the standard output stream.
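One widely used pattern, offered here as a sketch rather than as the post's own list of methods, is to pipe the program's output into ssh and let `cat` on the remote side write it to a file. The function name `save_to_remote` and the host and path in the usage line are placeholders.

```shell
# Hypothetical helper: stream standard input to a file on a remote machine.
# Usage: $PROG | save_to_remote user@remote.example.com /remote/path/output.dat
save_to_remote() {
    local remote_host="$1" remote_path="$2"
    # ssh forwards our standard input to the remote command; 'cat' then
    # writes whatever arrives to the named file on the remote machine.
    # The path is single-quoted for the remote shell, so this sketch
    # does not handle paths that themselves contain single quotes.
    ssh "$remote_host" "cat > '$remote_path'"
}
```

Nothing is written to the local disk along the way, which is the main attraction when the output is large.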


Locally Mounting a Remote Directory Through a Firewall Gateway on OS X

Download and install MacFUSE.

Download the sshfs binary, renaming/moving to, for example, “/usr/local/bin/sshfs”.

Create a wrapper tunneling script and save it to somewhere on your system path (e.g., “/usr/local/bin/ssh-tunnel-gateway.sh”), making sure to set the executable bit (“chmod a+x”):

    #! /bin/bash
    ssh -t GATEWAY.HOST.IP.ADDRESS ssh "$@"

Create the following script, and save it to somewhere on your system path (e.g., “/usr/local/bin/mount-remote.sh”), making sure to set the executable bit (“chmod a+x”):


`gcd` - A Git-aware `cd` Relative to the Repository Root with Auto-Completion

The following will enable you to have a Git-aware “cd” command with directory path expansion/auto-completion relative to the repository root. You will have to source it into your “~/.bashrc” file, after which invoking “gcd” from the shell will allow you to specify directory paths relative to the root of your Git repository no matter where you are within the working tree.

    gcd() _gcd() " prev="$$2" dirnames=$(cd $TARGET; compgen -o dirnames $2) opts=$(for i in $dirnames; do if [[ $i !
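For orientation, here is a minimal sketch of the “cd relative to the repository root” half of the idea. It is an independent illustration rather than a reconstruction of the original function, and it leaves out the auto-completion part entirely.

```shell
# Sketch only: change to a directory given relative to the Git repository
# root, from anywhere inside the working tree.
gcd() {
    local root
    # 'git rev-parse --show-toplevel' prints the absolute path of the
    # top-level directory of the current working tree.
    root="$(git rev-parse --show-toplevel)" || return 1
    cd "$root/${1:-}"
}

# Example, from anywhere inside a repository:
# gcd docs/api   # changes to <repository root>/docs/api
# gcd            # changes to the repository root itself
```

Outside a Git repository, `git rev-parse` fails and the function returns without changing directory.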
