The following example shows how easy it can be to use the three interoperability modules provided by the DendroPy Phylogenetic Computing Library to download nucleotide sequences from GenBank, align them using MUSCLE, and estimate a maximum-likelihood tree using RAxML. The automatic label composition option of the DendroPy genbank module creates practical taxon labels out of the original data. We also pass in additional arguments to RAxML to request that the tree search be carried out 250 times (['-N', Read more [...]
Vim's regular expression dialect is distinct from many of the other more popular ones out there today (and actually predates them).
One of the dialect differences that always leaves me fumbling has to do with which special characters need to be escaped.
Vim does have a special "very magic" mode (activated by "\v" in the regular expression) that makes things very clean and simple in this regard: only letters, numbers and underscores are treated as literals without escaping.
But I have never Read more [...]
Merge conflicts suck. It is not uncommon, however, that you simply know you want to accept all the changes from the branch that you are merging in, which makes things a lot simpler conceptually. The Git documentation suggests that this can be procedurally simple as well, as it mentions the "-s theirs" merge strategy which does just that, i.e., unconditionally accept everything from the branch that you are merging in:
$ git merge -s theirs
Unfortunately, however, running Read more [...]
Vim's text objects are not only a powerful, flexible and precise way to specify a region of text, but also intuitive and efficient.
They can be used with any command that can be combined with a motion (e.g., "d", "y", "v", "r"), but in this post I will be using the "c" command ("change") to illustrate them.
Imagine you were on a line that looked like this, with the cursor on the letter "r" of the word "dry":
print "Enter run mode ('test', 'dry', or 'full')"
Then, after typing "c" to start Read more [...]
Here is a way to create a secondary shell history log (i.e., one that supplements the primary "~/.bash_history") that tracks a range of other information, such as the working directory, hostname, time and date etc. Using the "HISTTIMEFORMAT" variable, it is in fact possible to store the time and date with the primary history, but storing the other information is not as readily doable. Here, I present an approach based on this excellent post on StackOverflow.
The main differences Read more [...]
There is no way to get tar to ignore directory paths of files that it is archiving. So, for example, if you have a large number of files scattered about in subdirectories, there is no way to tell tar to archive all the files while ignoring their subdirectories, such that when unpacking the archive you extract all the files to the same location. You can, however, tell tar to strip a fixed number of elements from the full (relative) path to the file when extracting using the "--strip-components" option. Read more [...]
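As a rough illustration of the same idea outside of tar itself, the following Python sketch uses the standard-library "tarfile" module to drop a fixed number of leading path elements while extracting. The function name "extract_stripped" and its parameters are hypothetical, and the sketch is not claimed to reproduce the behaviour of "--strip-components" exactly.

```python
import os
import shutil
import tarfile


def extract_stripped(archive_path, dest_dir, strip_components):
    """Extract regular files, dropping the first `strip_components` path elements.

    Hypothetical helper for illustration; returns the list of paths written.
    """
    written = []
    with tarfile.open(archive_path) as archive:
        for member in archive.getmembers():
            if not member.isfile():
                continue  # skip directories, links, devices
            # Normalise the stored name into clean path elements.
            parts = [p for p in member.name.split("/") if p not in ("", ".")]
            if ".." in parts:
                continue  # refuse paths that could escape dest_dir
            parts = parts[strip_components:]
            if not parts:
                continue  # nothing left after stripping
            target = os.path.join(dest_dir, *parts)
            os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
            with archive.extractfile(member) as src, open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            written.append(target)
    return written
```

Stripping as many elements as the deepest directory level leaves only the file names, so every file at that depth lands directly in the destination directory; files at shallower depths are skipped, and name collisions overwrite silently.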
While Python comes with many "batteries included", many others are missing. Luckily, thanks to the generosity and hard work of various members of the Python community, there are a number of third-party implementations to fill in this gap. For example, Fisher's exact test is not part of the standard library.
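To give a sense of what such a missing battery involves, here is a minimal pure-Python sketch of a two-sided Fisher's exact test for a 2x2 table. It is not any particular third-party implementation; the function name "fisher_exact_two_sided" is made up for this example, and it assumes Python 3.8 or later for "math.comb".

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Probability of seeing x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)
    )


print(fisher_exact_two_sided(8, 2, 1, 5))
```

For the table [[8, 2], [1, 5]] this works out to 400/11440, or roughly 0.035.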
We all know about using scp to transfer files over a secure shell connection.
It works fine, but there are many cases where other modes of usage are required, for example, when you want to transfer the output of one program directly to a remote machine for storage.
Here are some ways of going about doing this.
Let "$PROG" be a program that writes data to the standard output stream.
Transferring without compression:
$PROG | ssh destination.ip.address Read more [...]
The pyPDF package provides really nice facilities for PDF document manipulation. Here is a simple application script to extract a specified subset of pages from a PDF file.
UPDATE Nov 7, 2009: Better parsing of traceback.
UPDATE Nov 4, 2009: Now passing a "-b" flag to the script opens the parsed stack frame references in a BBEdit results browser, inspired by an AppleScript script by Marc Liyanage.
When things go wrong in a Python script, the interpreter dumps a stack trace, which looks something like this:
$ python y.py
Calling f1 ...
Traceback (most recent call last):
File "y.py", line 6, in <module>
File "/Users/jeet/Scratch/snippets/x.py", line Read more [...]
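The core of such a tool is pulling the file and line references out of the trace. The following is a minimal sketch of that parsing step only, not the script from the post; the names "FRAME_RE" and "parse_frames" are hypothetical, and the sample trace is invented for illustration.

```python
import re

# Matches lines such as:   File "/path/to/x.py", line 12, in f1
FRAME_RE = re.compile(
    r'^\s*File "(?P<path>[^"]+)", line (?P<line>\d+)(?:, in (?P<name>.+))?\s*$'
)


def parse_frames(traceback_text):
    """Return a list of (path, line_number, function_name) tuples."""
    frames = []
    for line in traceback_text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append(
                (match.group("path"), int(match.group("line")), match.group("name"))
            )
    return frames


sample = '''Traceback (most recent call last):
  File "y.py", line 6, in <module>
    f1()
  File "/tmp/x.py", line 12, in f1
    raise ValueError("boom")
ValueError: boom
'''

for path, line_number, name in parse_frames(sample):
    print("%s:%d (%s)" % (path, line_number, name))
```

Each tuple carries enough information to hand the location to an editor that accepts a file name and a line number.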
This is pretty slick: enter “fc” in the shell and your last command opens up for editing in your default editor (as given by “$EDITOR”). Works perfectly with vi. The “$EDITOR” variable approach does not seem to work with BBEdit though, and you have to:
$ fc -e '/usr/bin/bbedit --wait'
With vi, “:cq” aborts execution of the command.
Given a list of strings, how would you interpolate a multi-character string in front of each element?
For example, given:
>>> k = ['the quick', 'brown fox', 'jumps over', 'the lazy', 'dog']
The objective is to get:
['-c', 'the quick', '-c', 'brown fox', '-c', 'jumps over', '-c', 'the lazy', '-c', 'dog']
Of course, the naive solution would be to compose a new list by iterating over the original list:
>>> result = []
>>> for i in k:
Read more [...]
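One compact alternative to an explicit loop, offered as a sketch rather than as the solution the full post arrives at, is to generate (flag, item) pairs and flatten them with "itertools.chain". The helper name "interleave_prefix" is hypothetical.

```python
from itertools import chain


def interleave_prefix(flag, items):
    """Return a new list with `flag` inserted in front of every element."""
    # Build (flag, item) pairs lazily, then flatten them into one list.
    return list(chain.from_iterable((flag, item) for item in items))


k = ['the quick', 'brown fox', 'jumps over', 'the lazy', 'dog']
print(interleave_prefix('-c', k))
# ['-c', 'the quick', '-c', 'brown fox', '-c', 'jumps over', '-c', 'the lazy', '-c', 'dog']
```

Because the pairs are tuples, a multi-character flag is kept as one element rather than being split into its characters.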
The DendroPy Phylogenetic Computing Library includes native infrastructure for phylogenetic sequence simulation on DendroPy trees under the HKY model. Being pure-Python, however, it is a little slow. If Seq-Gen is installed on your system, though, you can take advantage of a lightweight Seq-Gen wrapper added to the latest revision under the interop subpackage: dendropy.interop.seqgen. Documentation is lagging, but the following examples should be enough to get started, and the class is simple and Read more [...]
If you have opened a file and see a bunch of "^M" or "^J" characters in it, chances are that for some reason Vim is confused as to the line-ending type.
You can force it to interpret the file with a specific line-ending by using the "++ff" argument and asking Vim to re-read the file using the ":e" command:
This will not actually change any characters in the file, just the way the file is interpreted.
If you want to resave the file with the new line-ending Read more [...]
For a week now, opening a new tab or window in OS X's Terminal application has been a major palaver, sometimes taking up to a minute. CPU usage would shoot up (mostly/usually by WindowServer, but sometimes by kernel_task). It was driving me nuts. I practically live in the Terminal (or, to be more accurate, Terminal + Vim), and usually spawn a new Terminal window several times in an hour for everything from using R as a calculator to opening files for viewing to actual development work. With this slow Read more [...]
Download and install MacFUSE.
Download the sshfs binary, renaming/moving it to, for example, "/usr/local/bin/sshfs".
Create a wrapper tunneling script and save it to somewhere on your system path (e.g., "/usr/local/bin/ssh-tunnel-gateway.sh"), making sure to set the executable bit ("chmod a+x"):
ssh -t GATEWAY.HOST.IP.ADDRESS ssh $@
Create the following script, and save it to somewhere on your system path (e.g., "/usr/local/bin/mount-remote.sh"), making sure Read more [...]
The following is an example of how to use the "pkg_resources" module (provided by the setuptools project) to compose a list of all available modules in a Python package.
#! /usr/bin/env python
sys.stderr.write("'pkg_resources' could not be imported: setuptools installation required\n")
Returns list of module names for package `package_name`.
Read more [...]
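For comparison, the same task can be sketched with the standard library alone. Note that this swaps in "pkgutil" and "importlib" in place of "pkg_resources", so it is an alternative technique and not the approach of the post; the function name "list_package_modules" is hypothetical.

```python
import importlib
import pkgutil


def list_package_modules(package_name):
    """Return the sorted names of the modules directly inside `package_name`."""
    package = importlib.import_module(package_name)
    search_path = getattr(package, "__path__", None)
    if search_path is None:
        # A plain module, not a package: it contains no submodules.
        return []
    return sorted(info.name for info in pkgutil.iter_modules(search_path))


if __name__ == "__main__":
    print(list_package_modules("json"))
```

Only the immediate children are listed; "pkgutil.walk_packages" is the standard-library call to reach for when nested subpackages need to be included as well.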
When you pull and update your local repository, it would be nice to easily see all the commits that you have applied in the pull. Sure, you can figure it out by scanning through the git log carefully, but adding the following to your '~/.gitconfig' gives you an easy way to see it at a glance:
whatsnewlog = !"sh -c \"git log --graph --pretty=format:'%Creset%C(red bold)[%ad] %C(blue bold)%h%C(magenta bold)%d %Creset%s %C(green bold)(%an)%Creset' --abbrev-commit --date=short $(git symbolic-ref HEAD 2> /dev/null Read more [...]
Python descriptors allow for rather powerful and flexible attribute management with new-style classes. Combined with decorators, they make for some elegant programming.
One useful application of these mechanisms is lazy-loading properties, i.e., properties with values that are computed only when first accessed, returning cached values on subsequent accesses.
An implementation of this concept (based on this post) is:
Lazy-loading read-only property descriptor.
Read more [...]
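A minimal sketch of the idea, written independently of the implementation the post links to, might look like the following. The names "LazyProperty" and "Circle" and the "_lazy_" cache prefix are assumptions made for this example.

```python
class LazyProperty:
    """Read-only descriptor that computes its value once per instance."""

    def __init__(self, func):
        self.func = func
        self.cache_name = "_lazy_" + func.__name__
        self.__doc__ = func.__doc__

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self  # accessed on the class, not an instance
        try:
            return obj.__dict__[self.cache_name]
        except KeyError:
            # First access: compute the value and cache it on the instance.
            value = self.func(obj)
            obj.__dict__[self.cache_name] = value
            return value

    def __set__(self, obj, value):
        raise AttributeError("lazy property is read-only")


class Circle:
    def __init__(self, radius):
        self.radius = radius
        self.calls = 0  # counts how often the area is actually computed

    @LazyProperty
    def area(self):
        self.calls += 1
        return 3.14159 * self.radius ** 2
```

Defining "__set__" makes this a data descriptor, which is why the cached value is stored under a separate key in the instance dictionary instead of under the property's own name.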
To search content of all tracked files in the current working tree for a pattern:
git grep <pattern>
To search content of all commit messages for a pattern ('-E' for extended grep):
git log [-E] --grep <pattern>
To search content of all commit diffs for lines that add or remove a pattern:
git log -G <pattern>
To search content of entire working trees of previous revisions for a pattern:
git grep <pattern> $(git rev-list --all)
Note that Git supports POSIX Basic Regular Expressions. Read more [...]