Estimate Time for Job Completion (With Progress Updates) When Tar’ing Huge Directories

For the sake of future me, I am recording this here, the coolest shell trick I’ve learned this year:


tar cf - /folder-with-big-files -P | pv -s $(du -sb /folder-with-big-files | awk '{print $1}') | gzip > big-files.tar.gz

or, where “du” has no “-b” option (as with the BSD/macOS “du”), computing the size in bytes from kilobytes:


tar cf - /folder-with-big-files -P | pv -s $(($(du -sk /folder-with-big-files | awk '{print $1}') * 1024)) | gzip > big-files.tar.gz

with output looking like:

4.69GB 0:04:50 [16.3MB/s] [==========================>        ] 78% ETA 0:01:21

Requires ‘pv’:

Reproduced from this Superuser answer here:

The Traveler’s Restaurant Process — A Better Description of the Dirichlet Process for Partitioning Sets


I. “Have Any of These People Ever Been to a Chinese Restaurant?”

The Dirichlet process is a stochastic process that can be used to partition a set of elements into a set of subsets. In biological modeling, it is commonly used to assign elements into groups, such as molecular sequence sites into distinct rate categories. Very often, an intuitive explanation as to how it works invokes the “Chinese Restaurant Process” analogy. I have always found this analogy very jarring and confusing, as, while it is a good description of how the Dirichlet process works, it is a terrible description of how Chinese restaurants, or, indeed, any other type of restaurant run by (and patronized by) humans[1], works. As Dr. Emily Jane McTavish says, “Have any of these people ever been to a Chinese restaurant?” Indeed. In fact, one may wonder if any of them have been to any restaurant outside of a Kafkaesque performance-art themed one.

II. The “Traveler’s Restaurant Process” Analogy

I believe a far more intuitive analogy might be given by the “Traveler’s Restaurant [Selection] Process”. A common dictum in traveler lore, guides, advice, and general wisdom is, when picking a restaurant for a meal, to prefer establishments that appear to be more popular with or frequented by locals. And so, we can imagine instead describing how a group of N travelers distribute themselves among the various restaurants at a food court. As with the original analogy, we have the travelers entering the establishment one by one, with the food court initially empty. (Yes, this is still a weakness in the analogy that lends a surreal contrivance to the whole tale, but maybe this can be alleviated somewhat by imagining that (a) it is 1 am; (b) the travelers have just arrived and are suffering from jet lag; and (c) they wander down to the food court after checking in to their respective hotels/motels/pensions, with variance in processing times leading them to straggle in individually. As for the food court being open at 1 am: in a number of parts of the world, this is pretty commonplace, I assure you!) The first traveler finds an empty food court, and picks a restaurant at random. The next traveler enters the food court, and looking around, sees just one restaurant occupied. It is possible that she makes a beeline for that restaurant, but maybe she is feeling cranky or does not particularly like the other traveler (or other travelers in general), and heads off to a different restaurant.
And so on with the next traveler, and the next, and the next, until the last one, each of whom makes an individual decision whether to try a new, empty restaurant (if they are feeling particularly adventurous or misanthropic or introverted or have squabbled with or are otherwise disdainful of everyone else) or else, at the other extreme, head for the most crowded restaurant (if they are sticking to the tried and true traveler’s restaurant selection dictum or are feeling particularly sociable), or anything in between. At the end of it all we take the clusters of travelers as they have distributed themselves into individual restaurants as the partition of the original set of travelers.

Makes (better) sense?

Maybe not!

III. Working Through the Process

Emily also says that just walking through the model expressions and formulas often yields much better intuition than contrived and twisted analogies and metaphors. And maybe she is right!

So, let us take a set of $N$ elements: $t_1, t_2, \ldots, t_N$. We can imagine each element to be a traveler from the “Traveler’s Restaurant Process” analogy, or vice versa, or anything else you prefer. Each element is assigned to a group in turn, creating new groups as needed. With no existing groups, the first element is automatically assigned to a new group. Well, technically, the first element is assigned to a new group with probability given by:

$$\frac{\alpha}{\alpha + n - 1}, \tag{1}$$

where $\alpha$ is known as the concentration parameter, and $n$ is the 1-based index of the element (i.e., the first element has $n=1$, the second has $n=2$, the third has $n=3$, and so on). The first element has $n=1$, and the above expression evaluates to 1.0, so it is always assigned to a new group. The more general case, applicable to all subsequent elements up to and including the last one, is that the $n^{\text{th}}$ element is:

  • Either assigned to a new group, $g_*$, with probability according to expression (1) above,
  • Or assigned to one of the existing groups, $g_i$, with probability:
    $$\frac{|g_i|}{\alpha + n - 1}, \tag{2}$$

where $\alpha$ is the concentration parameter and $n$ is the 1-based index of the element, as previously described, and $|g_i|$ is the number of elements in group $g_i$.
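
The assignment rule above can be sketched in a few lines of Python. This is only an illustrative sketch, not the script from the Gist: the function name `sample_partition` and its arguments are made up here, and it assumes $\alpha > 0$. It first decides whether to open a new group with probability $\alpha / (\alpha + n - 1)$ and, failing that, picks an existing group with weight proportional to its size, which works out to the same $|g_i| / (\alpha + n - 1)$ given above.

```python
import random


def sample_partition(n_elements, alpha, rng=None):
    """Partition elements 1..n_elements by assigning them to groups in turn.

    Hypothetical helper for illustration; assumes alpha > 0.
    """
    rng = rng or random.Random()
    groups = []  # each group is a list of 1-based element indices
    for n in range(1, n_elements + 1):
        # Written as alpha + (n - 1) so that the denominator is exactly
        # alpha for the first element, even in floating point.
        denom = alpha + (n - 1)
        # New group with probability alpha / (alpha + n - 1); the first
        # element always opens a group because none exist yet.
        if not groups or rng.random() < alpha / denom:
            groups.append([n])
            continue
        # Otherwise choose an existing group with probability |g_i| / (n - 1),
        # so that overall it is chosen with probability |g_i| / (alpha + n - 1).
        sizes = [len(g) for g in groups]
        chosen = rng.choices(groups, weights=sizes, k=1)[0]
        chosen.append(n)
    return groups


if __name__ == "__main__":
    for alpha in (0.01, 1.0, 100.0):
        partition = sample_partition(10, alpha, random.Random(42))
        print(alpha, partition)
```

Passing a seeded `random.Random` just makes a run repeatable; any source of uniform random numbers would do.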

IV. How/Why Does it Work?

How do we know that each element will definitely be assigned to a group? We can check that the probabilities of either being assigned to a new group or existing group sum to one to reassure ourselves of this. Let us say that we are dealing with the $n^{\text{th}}$ element out of $N$ elements, where $1 \leq n \leq N$. As described above, the probability that this element will be assigned to a new group is given by expression (1). On the other hand, the probability of being assigned to any one of the existing groups is given by:

$$\sum_{i=1}^{M}\frac{|g_i|}{\alpha + n - 1} \tag{3}$$

where $M$ is the number of existing groups, $g_1, g_2, …, g_M$. Now, while we do not know in this general case exactly how the previous $n-1$ elements are distributed amongst the various $M$ existing groups, we do know that, regardless, there must be a total of $n-1$ elements across all groups. So, then, the above expression reduces to:

$$\frac{\sum_{i=1}^{M} |g_i|}{\alpha + n - 1} = \frac{n-1}{\alpha + n - 1}. \tag{4}$$

So the probability of either being assigned to a new group or being assigned to an existing group is given by the sum of Expressions (1) and (4), which is:

$$\begin{aligned}
\frac{\alpha}{\alpha + n - 1} + \frac{n-1}{\alpha + n - 1} &= \frac{\alpha + n - 1}{\alpha + n - 1} \\
&= 1.
\end{aligned}$$

And thus, we are assured of all elements being assigned to a group, whether a new one or an existing one, as well as being able to sleep at night knowing that the distribution of partitions under the Dirichlet process is a proper probability distribution, as it sums to 1.0.
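
For anyone who prefers to see the arithmetic on concrete numbers, here is a small check of the same sum, using exact fractions so that rounding does not muddy the result. The function name `assignment_probabilities` and the example group sizes are made up for illustration.

```python
from fractions import Fraction


def assignment_probabilities(group_sizes, alpha):
    """Probabilities for the next element, given the sizes of existing groups.

    Hypothetical helper for illustration. Returns (p_new, [p_g1, p_g2, ...]).
    """
    n = sum(group_sizes) + 1       # 1-based index of the element being assigned
    denom = alpha + (n - 1)        # shared denominator: alpha + n - 1
    p_new = alpha / denom          # probability of opening a new group
    p_existing = [size / denom for size in group_sizes]
    return p_new, p_existing


if __name__ == "__main__":
    # Six elements already sit in groups of sizes 3, 1, and 2, so n = 7.
    p_new, p_existing = assignment_probabilities([3, 1, 2], Fraction(1, 2))
    total = p_new + sum(p_existing)
    assert total == 1              # new-group and existing-group cases sum to one
    print(p_new, p_existing, total)
```

However the six earlier elements are split among groups, the sizes always add up to $n - 1$, which is why the total cannot come out as anything other than one.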

V. The (Anti-?)Concentration Parameter

We have mentioned and used the concentration parameter, $\alpha$, above, without actually explaining it. Basically, the concentration parameter, as its name implies, determines how “clumpy” the process is. Unfortunately, contrary to what its name implies, at low values the process is more “clumpy” — i.e., yielding partitions where elements tend to be grouped together more frequently, resulting in larger and correspondingly fewer subsets — while at high values the process is less “clumpy” — yielding partitions where elements tend to be more dispersed, resulting in smaller and correspondingly more subsets. Yeah, it really should have been called the “anti-concentration” or “dispersion” parameter. Or perhaps folks should just stick to using the more neutral and less evocative, but non-misleading, standard term: the “scaling” parameter.

(See updates here and here that clear up this issue! Basically, the “concentration” parameter gets its name not due to the way it concentrates elements, as naive and incorrect intuition led me to think, but rather due to how increasing it leads to the distribution of values across the subsets converging on the base distribution! What values? What base distribution? EXACTLY! Those concepts do not really enter into any of the analogies we have discussed so far, or, indeed, even into the way we have explained the process with reference to the equations and model. Only when considering a version of the DP where our elements are not just anonymous exchangeable atoms, but value-bearing elements that we want to cluster based on values, does the base distribution enter the picture, and only then does the “concentration” parameter do its concentrating the higher it gets!)

VI. In Action

This Gist wraps up all this logic in a script that you can play with to get a feel for how different concentration parameters work.

With low values of the concentration parameter, we pretty much get all the elements packed into a single set:

# python -n100 -v0 -a 0.01
Mean number of subsets per partition: 1.04
  Mean number of elements per subset: 9.8

On the other hand, with high values of the scaling parameter, the trend is toward each element being in its own set, with nearly as many subsets in the partition as there are elements in the full set:

# python -n100 -v0 -a 100
Mean number of subsets per partition: 9.65
  Mean number of elements per subset: 1.03944444444

And, of course, moderate values result in something in between:

# python -n100 -v0 -a 1
Mean number of subsets per partition: 3.03
  Mean number of elements per subset: 3.92166666667

# python -n100 -v0 -a 5
Mean number of subsets per partition: 5.76
  Mean number of elements per subset: 1.88055555556

# python -n100 -v0 -a 10
Mean number of subsets per partition: 7.01
  Mean number of elements per subset: 1.49297619048
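
The trend in these runs can be read straight off expression (1): element $n$ opens a new group with probability $\alpha / (\alpha + n - 1)$, so, by linearity of expectation, the expected number of subsets is just the sum of those probabilities over all the elements. A minimal sketch follows; the function name is made up for illustration, and the 10-element set size in the example is only a guess at what the runs above used.

```python
def expected_number_of_groups(n_elements, alpha):
    """Expected number of subsets when n_elements are assigned in turn.

    Hypothetical helper for illustration: sums the new-group probability
    alpha / (alpha + n - 1) over n = 1..n_elements. Assumes alpha > 0.
    """
    return sum(alpha / (alpha + (n - 1)) for n in range(1, n_elements + 1))


if __name__ == "__main__":
    # Assumed set size of 10; not confirmed by the runs shown above.
    for alpha in (0.01, 1.0, 5.0, 10.0, 100.0):
        print(alpha, expected_number_of_groups(10, alpha))
```

With $\alpha = 1$ and 10 elements the sum is the harmonic number $H_{10}$, about 2.93, which is in the same neighborhood as the 3.03 reported above, if those runs did indeed use 10 elements.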

VII. The Script

The script to run this is shown below. If you are interested in downloading and using it, please visit:


  1. Interestingly enough, I can imagine a restaurant that was patronized by cats working pretty much exactly like the traditional analogy, given the way some cats tend to be clumpers and others loners. So, if you don’t like the “Traveler’s Restaurant Process” analogy, please feel free to use the “Cat Restaurant Process” analogy, or, better yet, the “Cat Room/Furniture Occupation Process” analogy.
  2. Update 2017-07-25:
    So it seems that the concentration parameter gets its name not from the fact that it (inversely) controls the concentration or clustering of elements, but because of its relationship to the base distribution of the Dirichlet process. The base distribution is something that we have not talked about, and I will cover it in a future post once I understand it well enough to talk about it in a useful and interesting way! But for the time being, it is sufficient to say that it is the distribution over the elements themselves before they are assigned into groups. The concentration parameter is so called because, as it increases in value, it increasingly “concentrates” the values of elements on the base distribution (while at the same time increasingly dispersing the elements to distinct groups). So the reason for the counter-intuitive name for the parameter is wrong intuition: it is not referring to how “concentrated” the elements are in terms of the subsets of the partition, but how closely the distribution of elements resembles the base distribution. More details on this can be found here, where the concentration parameter is described as a prior belief in the base distribution.
  3. Update 2017-07-26:
    So, here is a GREAT explanation of what is going on: Summarizing: basically, if we understand the Dirichlet process using the Chinese Restaurant Process or the Traveler’s Restaurant Process or the Hair-Clog Process etc. etc., we are ignoring the base distribution as we do not care about the values of the elements that are clustered, or the distribution of those values within each subset. Only with reference to, e.g., the Polya Urn Model, where each element has a value associated with it and is clustered into a group based on how well this value fits the distribution of values associated with each group, while the parameters of the distribution of each “urn” are drawn from a base distribution, do things make sense. As the concentration parameter increases, the elements spread out across more and more subsets/urns/tables, and as the parameters of the distribution of each subset/urn/table’s “value” are sampled from the base distribution, in effect, the base distribution itself gets more and more (and better and better) sampled, as each subset/urn/table represents an independent sampling of the base distribution. Thus the distribution across all subsets/urns/tables converges on the base distribution. Conversely, as the concentration parameter decreases, the elements cluster into fewer groups, and, with fewer groups, we get more limited sampling of the base distribution.

“Joy Plots” — Great Plot Style for Visualizing Distributions on Discrete/Categorical or Multiple Continuous Variables

R doing what R does really, really, really, really, really, really, *R*eally well: visualization. Folks, this might be THE plot to use to visualize distributions of discrete/categorical variables or simultaneous distributions of multiple continuous variables, replacing or at least taking up a seat alongside the violin plots as the current best approach IMHO.

(EDIT: This plot style is named after the band Joy Division, due to a similar graphic on one of their album covers. Not being at all familiar with Joy Division or their work, the name that comes to mind when I see this plot is “Misty Mountain Plot”, after the maps in “The Hobbit”.)

“Pre-Columbian Mycobacterial Genomes Reveal Seals As A Source Of New World Human Tuberculosis”

When, in 1994, definitive evidence of tuberculosis in humans was reported from pre-Columbian America, it was startling. Conventional understanding had pegged tuberculosis as part of the new, exotic, and (to immunologically-naive populaces) deadly menagerie of pathogens brought by Europeans over to the Americas. While there were suggestions of pre-Columbian tuberculosis in the Americas, these were based on lesions on bones, which were ambiguous. Unlike previous cases, however, the Chiribaya mummy from 1000-1300 CE in Peru was shown beyond doubt to have been exposed to tuberculosis:

In the mummy’s right lung and a lymph node, the scientists found scars of disease. These were small, calcified lesions typical of tuberculosis. Extracting fragments from the tissue, molecular biologists isolated genetic material betraying the presence of Mycobacterium tuberculosis.

To find evidence of tuberculosis here some half a millennium before contact seems to suggest that tuberculosis had always been present in the Americas, and the increased incidences (to put it mildly) in post-contact native American populations were the result of contextual changes (such as high population densities, changes in diet or lifestyle, physiologies or immune systems compromised or changed by other diseases or factors, etc.) rather than the “virgin soil” phenomenon. It also led to other questions: given that tuberculosis was thought to have originated some 8000 years ago in the Middle East, as bovine tuberculosis jumped to humans with the domestication of cattle, how did it get to the Americas before not just Old World people, but cattle got there?

Two decades later, the mystery was solved. Leveraging advances not just in molecular sequencing technologies but also theoretical and computational analytic methods, it was found that the strain of tuberculosis found in pre-Columbian humans in the Americas was an entirely different one from the one that wreaked appalling havoc after the arrival of the Europeans. As can clearly be seen from the phylogenetic trees, the pre-Columbian tuberculosis was probably a completely independent zoonotic event, jumping to humans from seals. The strain of tuberculosis that was endemic to the Americas did not confer any immunity to the new strain brought over by the Europeans, and hence the “virgin soil” pandemic syndrome that proved so devastating. The seals themselves were thought to have picked it up from hosts in Africa, and brought it over to the Americas through oceanic migration/dispersal.

While seals may seem a strange source given the agricultural practices with which we are currently familiar, resulting in a lot of skepticism in comments in the popular press, many pre-Columbian populations on coastal sites had marine-based resource exploitation economies that included strong interaction with a range of marine animals including seals. In fact, a tantalizing clue to the marine origin of this (one that can only be seen as a clue with hindsight) was that much of the early evidence of pre-Columbian tuberculosis came from sites that were close to the coast or sea, as with the Chiribaya mummy.

All in all, a fascinating historical detective story, with many twists and turns, with the conclusion at the end a great example of the power of modern molecular phylogenetics to peer into the past at the micro- as well as the macro-scales.

    Bos, K. I. et al. (2014). Pre-Columbian mycobacterial genomes reveal seals as a source of New World human tuberculosis. Nature, 514(7523), 494-497. doi:10.1038/nature13591.

Multispecies Coalescent Species Delimitation: Conflating Populations with Species in the Grey Zone (Evolution 2017 Talk)


The always fantastic Evolution meetings were a blast. So many great talks, and, perhaps more importantly, great catching up with so many friends, collaborators, and colleagues!

I presented a talk on our PNAS paper showing how the Multispecies Coalescent model, when used for “species” delimitation, actually delimits Wright-Fisher populations.

Titled “Multispecies Coalescent Species Delimitation: Conflating Populations with Species in the Grey Zone“, the entire talk can be viewed here:

The slides are available here:

“Phylogenomics reveals rapid, simultaneous diversification of three major clades of Gondwanan frogs at the Cretaceous–Paleogene boundary”

Some nice work that ties the timing of the radiation of three independent lineages of frogs, constituting the majority of modern living frogs, to about the time the major groups of dinosaurs took a hit (literally and figuratively!). Compelling and interesting story, with lots of intriguing follow-up questions. A more general article covering the findings is available here.

Yan-Jie Feng, David C. Blackburn, Dan Liang, David M. Hillis, David B. Wake, David C. Cannatella, and Peng Zhang. 2017. Phylogenomics reveals rapid, simultaneous diversification of three major clades of Gondwanan frogs at the Cretaceous–Paleogene boundary. PNAS 2017; published ahead of print July 3, 2017, doi:10.1073/pnas.1704632114

Solving the “Could not find all biber source files” Error

Biblatex is a fantastic bibliography/citation manager for LaTeX. It trumps the older bibtex for its much easier customizability and configuration. It does, however, have one bug that can be very perplexing to figure out due to the misleading error message that results: “Could not find all biber source files”. At first glance this message seemed straightforward enough to send me poking about the project file structure and build system, checking paths and names. When all that seemed intact, I started trying to build the document from different locations. Then I checked out older version-controlled revisions of the project that I was sure I had built successfully, and when these, too, failed, I started to look at my TeX installation. And so on and so on, and before I knew it … poof! there went most of my morning.

This was all a wild goose chase, though, and luckily I came across this discussion before I got too far. (Well, too much further, at any rate.) Turns out that this is a known bug with the Biblatex backend, “biber”. The fix is to clear the “biber” cache. You can locate the “biber” cache by running:

$ biber --cache

and then you can “rm -rf” it with extreme glee or just do it all in one step with:

$ rm -rf "$(biber --cache)"

Invisible Photographer Meets Vanishing Fox


My absolute favorite wildlife encounter in the Pantanal was with a central South American endemic Crab-Eating Fox (Cerdocyon thous). I had left the group behind in the safari vehicle on the road to try and close in on some jabiru in the marsh on foot. I was creeping about, camera + lens in hand, when suddenly, I saw her staring at me through the marsh vegetation. I froze. After a few moments, I realized that it seemed that while she sensed something, she could not actually see me: she kept sniffing the air in my direction, ears pricked and alert, and staring intently (right at me!), but seemed like she was still assessing the situation. She seemed to relax a bit, and so I took a series of photos and then started slowly walking forward until she tensed up again. Then the cycle repeated: she stared at me intently, trying to smell or hear something, but apparently not being able to see me, and when she relaxed I took some photos and started creeping forward.

On the third cycle, however, I screwed up: I failed to note a branch or some other dead wood on the ground and stepped on it. SNAP!!!!!! And she was gone. Literally and immediately, she disappeared. I did not see her turn, move, or even shift posture: one second she was in front of me, and then the crack of the wood snapping, and then there was nothing but marsh. She vanished instantaneously and utterly, and absolutely no trace of her presence remained, just the quietness and the stillness of the marsh.

I felt like, through a glitch in the universe, I was briefly allowed a glimpse into an alternate world. It was one of those magical “contact” encounters that we are sometimes privileged to experience and that we cherish till we die (I am lucky enough to have had three of these so far).

A little while later, we came across a cub running frantically across the road into the same marsh. There was no doubt about our presence this time, as by this time we were in the vehicle with engine on and everything. The cub seemed less concerned with us though than getting to the other side, presumably to join his mother. After crossing the road and reaching the start of the marsh, he did grace us with a long look before disappearing through the vegetation, and that’s when I took this second shot.


Is that an archosaur in your Pantanal, or are you just happy to see me?

Finally getting around to finishing processing some of my Pantanal wetland photographs, and realized that, despite the Pantanal being famous for its endemic birds, I was drawn more to the other clade of archosaurs in the region, the Crocodylia, represented by the Yacare Caiman, Caiman yacare. The Pantanal population of caimans is the largest single crocodilian population on the planet.








More Sandhill Cranes!

Love these birds!

More Sandhill Cranes from the kayak. This time I deliberately avoided going too close to where I last saw the nesting pair, and instead hit a different area of the marsh. Here I stumbled upon a colony of 8-10 individuals.

(The one on the right seems to be complaining about its day to the others, who are all dutifully listening. The middle one has completely zoned out, though, and is daydreaming about cornfields while waiting for the rant to end.)

Sandhill Cranes Parents and Chick

AMAZING experience! So, I was using my kayak to stalk a Great Blue Heron in the marshes at the other end of the lake behind our house and the guy kept drifting deeper and deeper into the marsh, till I could not follow any more (less than a 10th of an inch of water, and chock full of vegetation; next time, I am getting a push pole!). So I turn around to head back, when I see these guys almost right next to me! A pair of Sandhill Cranes. AWESOME!!! I spent a while photographing them, watching them alternately preen and forage. But folks, this is just part of it. What happened next is EVEN more remarkable …

After a while, I realized that there was more to the picture, so to speak. I realized that there was a little fuzzy yellow blob between them …. it was A CHICK!! AMAZING, AMAZING, AMAZING!!!! The parents were moving around, foraging in the water and feeding the little guy. Can I say again: AMAZING, AMAZING, AMAZING!!! Absolutely fantastic experience.

(Note: I have, perhaps obviously, not done any noise reduction/removal on any of these. I am thinking I should? Masking out the foreground etc. is going to be some work. But I will say it is testimony to the 6D’s high-ISO performance that these were taken at ISO 600 and then have had +0.8 to +1.2 EV added in post)

Evolution of Bioluminescence in Millipedes

Walk deep into a rainforest at night. Switch off your headlamps. And wait with open eyes. At first, it is so pitch black that you cannot see your own hand if you wave it in front of your nose (as Bilbo might have said). As your eyes get accustomed to the darkness, you will realize one thing. Everything glows. Everything. There is a fine fuzzy layer of bioluminescent fungus covering dead leaves and the bark of trees, so you can almost make out the forest like someone has traced it out in ghostly yellow-green outline. Little ghostly yellow-green mushroom cap clusters are found here and there. And in certain places, much brighter spots are moving very slowly: millipedes crawling on vegetation. For years I’ve wondered about the functional utility of these different forms of bioluminescence. The mushroom caps? Surely it cannot be to attract insects? Or warn off predators (after all, why visually call attention to yourself in the first place when nobody can see you)? My (admittedly, in those days, naive) literature searches yielded nothing. I speculated that the only explanation that made sense was that the bioluminescence was a side-effect of some other metabolic process, and in a place where visual channels were not economical to exploit by predators, there was no cost to it. It seems that I was totally wrong! But that is the fantastically cool thing about science. It is precisely when you are wrong that you learn the most!

Vim: Insert Mode is Like the Passing Lane

Insert mode is not the mode for editing text.

It is a mode for editing text, because both normal and insert modes are modes for editing text.

Insert mode, however, is the mode for inserting new/raw text directly from the keyboard (as opposed to, e.g., from a register or a file).

Thus, you will only be in insert mode when you are actually typing in (raw) text directly. For almost every other editing operation, normal mode is where you will be. Once you grok this you will realize that the bulk of most editing sessions is not insert mode, and you actually spend most of your time in normal mode, just dipping into insert mode to add text and then dipping out again.

Insert mode is thus like the passing lane on the highway. Just as you should only be in the passing lane when you are actually passing other vehicles, you should only be in insert mode when you are inserting text.

Some snapshots from my own learning experiences here and here.

From Acolyte to Adept: The Next Step After NOP-ing Arrow Keys in Vim

René Descartes's illustration of dualism. Inputs are passed on by the sensory organs to the epiphysis in the brain and from there to the immaterial spirit. Public domain image. Sourced from: Wikimedia Commons

We all know about no-op’ing arrow keys in Vim to get us to break the habit of relying on them for inefficient movement.
But, as this post points out, it is not the location of the arrow keys that makes them inefficient, but the modality of the movement: single-stepping in insert mode is a horrible way to move around when normal mode provides so much better functionality.

But here is the thing: while normal mode provides for much better and more efficient ways to move around than insert mode, it also provides for ways to move that are just as inefficient as arrow keys. In fact, there is nothing that makes, e.g. “j” significantly better than “<DOWN>“, and so if we replace <DOWN><DOWN><DOWN><DOWN> with jjjj, we are just slapping on some fresh paint on a rusty bike and calling it “faster”. We have not even replaced one bad habit with another, we are indulging in the same bad habit, albeit with a different “it”. A bad habit that is not only inefficient, but, perhaps a much worse sin in the Vim-world, inelegant.

This customization will help break you of that habit by forcing you to enter a count for each of the basic moves (“h“, “j“, “k“, “l“, “gj“, “gk“).
This itself will make you more efficient for any move of three repeats or more: “3j” is more efficient than the uncounted “jjj” equivalent.
But, in addition, it will also have the side-effect of making you come up with more efficient moves yourself: as your eyes focus on the point where you want to go, instead of counting off the lines (or reading off the line count if you have “:set relativenumber“), you might find it more natural to, e.g. “/<pattern>” .

In fact, you might find that in many cases, you do not even need to actually move as such. For example, instead of moving to a line 8 lines down and deleting it, “8jdd“, you could just “:+8d“. Or instead of moving to a line four lines up, yanking it, moving back to where you were and pasting it, “4kyy4jp“, you can just “:-4y” and “p“. Once you get good enough at it, it will seem like magic the way you can manipulate lines far from your current position without moving! And what you will find is that, beyond the increased efficiency in number of keystrokes, there is an increase in mental efficiency: the microseconds of visually re-orienting yourself after each move is no longer a cost that you have to pay over and over and over again.

Naturally, you are going to find things less efficient and less elegant at first.
But that is just the pain of stressing out mental muscles that have not been exercised enough, like that first leg day after the holidays (or maybe even the first leg day ever after signing up at the gym 4 years ago).

Eventually, your efficiency will increase.

But more than efficiency, the elegance of your moves will eventually increase as well. Dramatically. As far as editing text goes, at any rate.

So, stick the following into your “~/.vimrc“, and be prepared for some pain and frustration and swearing and clumsiness as you retrain your muscle memory and your mind, before gaining a new level of enlightened “acting-without-doing”.

NOTE: One of the greatest impediments to me naturally working with counted-movements was the fact that counting the number of lines to go in each direction is disruptive: it completely breaks my “flow”, jarringly derailing my train of thought. See below for the solution to this, the implementation of which I consider a mandatory pre-requisite to working this way.


Displaying Relative Numbers vs. Absolute Numbers

I find the need to count line offsets before every move or operation as conducive to my “flow” as having an air horn stuffed down my throat while frozen mayonnaise is blasted into my ears. This was why I resisted count-based ergonomics in Vim for so long.
Vim has a feature, “`:set relativenumber`”, that shows relative numbers, and this makes things tremendously better, in that you can simply read off the line offset to your navigation target …. except that you must choose to show either relative numbers or absolute numbers. The fact is, the only time relative numbers are useful is for motions/operations in the current window or split, but when you have other splits open, relative numbers are worse than useless, as you need to have absolute numbers to make sense of what part of the buffer you are seeing in the non-focal splits. Showing both absolute and relative numbers at the same time would be ideal, but Vim does not support that natively (there is a plugin to help with that, but it exploits the “sign” feature, which can be a problem if you use signs to display something else, like marks, as I do). So the dilemma is that in most cases you want absolute numbers, but count-based motions/operations in the current window are annoying and mentally disruptive if you do not have relative numbers showing to save you from breaking your train of thought to count the lines to the target.

Luckily, a Vim plugin provides the answer: vim-numbers. This plugin automatically sets relative numbers on for the split/window in focus, and restores the previous numbering (absolute in my case) when focus moves to another split or window. It was this that made my move to strict count-required based motion possible.

EDIT: It was pointed out by /u/VanLaser that the following in your “~/.vimrc” is sufficient to achieve the relative-numbering-in-focal-window-and-absolute-everywhere-else dynamics without the need for a plugin:

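set number
if has('autocmd')
augroup vimrc_linenumbering
    " Clear this group first so that re-sourcing does not duplicate autocommands
    autocmd!
    " Leaving a window: drop relative numbering there
    autocmd WinLeave *
                \ if &number |
                \   set norelativenumber |
                \ endif
    " A buffer is displayed in a window: turn relative numbering on
    autocmd BufWinEnter *
                \ if &number |
                \   set relativenumber |
                \ endif
    " Startup has finished: turn relative numbering on
    autocmd VimEnter *
                \ if &number |
                \   set relativenumber |
                \ endif
augroup END
endif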

Laika: A Sad, Unnecessary Death


I think our space programme is one of our species’ greatest achievements. It does have a sordid past, though, with roots in war, conflict, aggression, violence, paranoia, and narrow-minded parochial/tribal brutality. Which makes it all the more remarkable that I think it unites us as a species now, when it was born of such acrimonious and savage division. But probably one of the greatest crimes of the early days of our species’ space programmes is the death of Laika.

A congenial, friendly, and very patient stray picked up from the streets of Moscow who cooperated enthusiastically with all the scientists and engineers and programme managers until she was shot out into space to die a horrible death by being roasted alive as the climate controls failed. Not that it was ever planned for her to make it back alive: it was a one-way trip from the get-go. And for no real gain in scientific or engineering knowledge: it was just a PR stunt in the game of one-upmanship that characterized the early days of the space programmes of both countries.

Oleg Gazenko, one of the scientists responsible for sending Laika into space: “Work with animals is a source of suffering to all of us. We treat them like babies who cannot speak. The more time passes, the more I’m sorry about it. We shouldn’t have done it… We did not learn enough from this mission to justify the death of the dog.”

Comet Landing Pride: Rosetta Mission to Churyumov-Gerasimenko

I am always resistant to imparting any exceptionalism to our species, considering it some sort of narrow-minded parochialism based on the accident of the restricted perspective we have when we try to place ourselves in context with the rest of nature. But every time I look at the achievements of our space programmes, I find it hard not to take pride in our species’ achievements in this domain (whatever the other horrible things we do): this is truly where our reach exceeds our grasp, but again and again and again we step up to the challenge and make our grasp meet our reach. On Wednesday, once again, we, as a species, will be reaching out from our cradle to grasp a speck of dust in the darkness of infinity, and in doing so will grow more as a species in a moment than we have in the past decade.