Using DendroPy Interoperability Modules to Download, Align, and Estimate a Tree from GenBank Sequences

The following example shows how easy it can be to use the three interoperability modules provided by the DendroPy Phylogenetic Computing Library to download nucleotide sequences from GenBank, align them using MUSCLE, and estimate a maximum-likelihood tree using RAxML. The automatic label composition option of the DendroPy genbank module creates practical taxon labels out of the original data. We also pass in additional arguments to RAxML to request that the tree search be carried out 250 times (['-N', Read more [...]

Pure-Python Implementation of Fisher’s Exact Test for a 2×2 Contingency Table

While Python comes with many "batteries included", many others are not. Luckily, thanks to the generosity and hard work of various members of the Python community, there are a number of third-party implementations to fill in this gap. For example, Fisher's exact test is not part of the standard library. Read more [...]
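As a rough illustration of what a pure-Python version can look like, here is a minimal sketch of a two-sided Fisher's exact test for a 2×2 table. It is not the implementation from the post. The function name `fisher_exact_two_sided` and the cell order `a, b, c, d` (first row, then second row) are my own assumptions, and the two-sided p-value is taken as the sum over all tables that are no more probable than the observed one, which is one common convention.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided p-value for the 2x2 table [[a, b], [c, d]] (hypothetical helper)."""
    row1, row2, col1 = a + b, c + d, a + c
    # Number of ways to choose the first column from all observations.
    total = comb(row1 + row2, col1)

    def weight(x):
        # Unnormalized hypergeometric weight of a table whose top-left cell is x.
        return comb(row1, x) * comb(row2, col1 - x)

    observed = weight(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Integer comparison avoids floating-point ties between equally likely tables.
    extreme = sum(
        w for w in map(weight, range(lowest, highest + 1)) if w <= observed
    )
    return extreme / total


p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

Working with integer weights until the final division keeps the comparison between tables exact, which matters for symmetric tables where several outcomes are equally likely.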

Parse Python Stack Trace and Open Selected Source References for Editing in OS X

UPDATE Nov 7, 2009: Better parsing of traceback. UPDATE Nov 4, 2009: Now passing a "-b" flag to the script opens the parsed stack frame references in a BBEdit results browser, inspired by an AppleScript script by Marc Liyanage. When things go wrong in a Python script, the interpreter dumps a stack trace, which looks something like this: $ python Calling f1 ... Traceback (most recent call last): File "", line 6, in x.f3() File "/Users/jeet/Scratch/snippets/", line Read more [...]
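The parsing step can be sketched with a regular expression over the standard `File "...", line N, in name` frame lines that the interpreter prints. This is only an illustration, not the script from the post: the names `FRAME_RE` and `parse_traceback` are hypothetical, and the sample traceback with its `/tmp/example.py` path is made up for the demonstration.

```python
import re

# Matches the frame lines of a standard CPython traceback.
FRAME_RE = re.compile(
    r'^\s*File "(?P<path>[^"]+)", line (?P<lineno>\d+)(?:, in (?P<func>.+))?$',
    re.MULTILINE,
)


def parse_traceback(text):
    """Return a list of (path, line number, function name) tuples."""
    return [
        (m.group("path"), int(m.group("lineno")), m.group("func"))
        for m in FRAME_RE.finditer(text)
    ]


# Made-up sample input for illustration only.
sample = '''Traceback (most recent call last):
  File "/tmp/example.py", line 6, in <module>
    f3()
  File "/tmp/example.py", line 3, in f3
    raise ValueError("boom")
ValueError: boom
'''

frames = parse_traceback(sample)
```

Each tuple carries enough information to hand a file and line number to an editor.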

Most Pythonique, Efficient, Compact, and Elegant Way to Do This

Given a list of strings, how would you interpolate a multi-character string in front of each element? For example, given: >>> k = ['the quick', 'brown fox', 'jumps over', 'the lazy', 'dog'] The objective is to get: ['-c', 'the quick', '-c', 'brown fox', '-c', 'jumps over', '-c', 'the lazy', '-c', 'dog'] Of course, the naive solution would be to compose a new list by iterating over the original list: >>> result = [] >>> for i in k: ... result.append('-c') Read more [...]
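One compact candidate, offered as a sketch and not necessarily the answer the post settles on, is to flatten a stream of (flag, element) pairs with `itertools.chain.from_iterable`. The helper name `interleave_flag` is my own.

```python
from itertools import chain


def interleave_flag(flag, items):
    """Return a new list with `flag` placed in front of every element of `items`."""
    # Build (flag, item) pairs lazily and flatten them into one list.
    return list(chain.from_iterable((flag, item) for item in items))


k = ['the quick', 'brown fox', 'jumps over', 'the lazy', 'dog']
result = interleave_flag('-c', k)
```

Because the pairs are generated lazily, the original list is traversed only once and is left unmodified.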

Molecular Sequence Generation with DendroPy

The DendroPy Phylogenetic Computing Library includes native infrastructure for phylogenetic sequence simulation on DendroPy trees under the HKY model. Being pure-Python, however, it is a little slow. If Seq-Gen is installed on your system, though, you can take advantage of a lightweight Seq-Gen wrapper added to the latest revision under the interop subpackage: dendropy.interop.seqgen. Documentation is lagging, but the following examples should be enough to get started, and the class is simple and Read more [...]

List All Modules Provided By A Python Package

The following is an example of how to use the "pkg_resources" module (provided by the setuptools project) to compose a list of all available modules in a Python package. #! /usr/bin/env python import sys try: import pkg_resources except ImportError: sys.stderr.write("'pkg_resources' could not be imported: setuptools installation required\n") sys.exit(1) def list_package_modules(package_name): """ Returns list of module names for package `package_name`. """ Read more [...]
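For comparison, the same kind of listing can be sketched with the standard library's `pkgutil` module in place of `pkg_resources`. This is a swapped-in technique, not the approach from the post, and it only lists the direct submodules of a package that can be imported in the current environment.

```python
import importlib
import pkgutil


def list_package_modules(package_name):
    """Return sorted, fully qualified names of the direct submodules of a package."""
    package = importlib.import_module(package_name)
    # Only packages have a __path__; plain modules have no submodules to list.
    search_path = getattr(package, "__path__", None)
    if search_path is None:
        return []
    prefix = package.__name__ + "."
    return sorted(info.name for info in pkgutil.iter_modules(search_path, prefix))


json_modules = list_package_modules("json")
```

Nested subpackages could be covered by switching to `pkgutil.walk_packages`, at the cost of importing each subpackage along the way.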

Lazy-Loading Cached Properties Using Descriptors and Decorators

Python descriptors allow for rather powerful and flexible attribute management with new-style classes. Combined with decorators, they make for some elegant programming. One useful application of these mechanisms is lazy-loading properties, i.e., properties with values that are computed only when first called, returning cached values on subsequent calls. An implementation of this concept (based on this post) is: class lazy_property(object): """ Lazy-loading read-only property descriptor. Read more [...]
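The excerpt cuts off before the class body, so the following is only a minimal sketch of the general technique and not the post's implementation. It relies on a non-data descriptor that stores the computed value in the instance `__dict__`. Unlike the read-only descriptor the post describes, this version does not block assignment. The `Report` class is a made-up example.

```python
class lazy_property:
    """Compute the wrapped method once per instance, then serve the cached value."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__
        self.__doc__ = func.__doc__

    def __get__(self, instance, owner=None):
        if instance is None:
            # Accessed on the class itself: return the descriptor.
            return self
        value = self.func(instance)
        # Defining only __get__ makes this a non-data descriptor, so the
        # instance attribute stored here is found first on later lookups.
        instance.__dict__[self.name] = value
        return value


class Report:
    """Hypothetical class used only to exercise the descriptor."""

    def __init__(self):
        self.computations = 0

    @lazy_property
    def summary(self):
        self.computations += 1
        return "expensive result"


report = Report()
first = report.summary
second = report.summary
```

After the first access the descriptor is bypassed entirely, so repeated lookups cost no more than an ordinary attribute access.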

Execute Selected Lines of (Optionally) Marked-Up Python Code in a Vim Buffer

There are a number of solutions for executing Python code in your active buffer in Vim. All of these expect the buffer lines to be well-formatted Python code, with correct indentation. Many times, however, I am working on program or other documentation (in, for example, reStructuredText or Markdown format), and the code fragments that I want to execute have extra indentation or line leaders. For example, a reStructuredText buffer might look like: How to Wuzzle the Wookie ------------------------- Read more [...]
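The clean-up step can be sketched in plain Python, independent of Vim. This is an assumption about what such a step might involve, not the plugin code from the post: the helper `clean_fragment` is hypothetical, and it handles only common leading indentation plus the interactive-prompt leaders `>>>` and `...`.

```python
import re
import textwrap

# Interactive-prompt leaders, each optionally followed by a single space.
_LEADER = re.compile(r"^(>>>|\.\.\.) ?")


def clean_fragment(lines):
    """Strip shared indentation and prompt leaders from marked-up code lines."""
    dedented = textwrap.dedent("\n".join(lines)).splitlines()
    return "\n".join(_LEADER.sub("", line) for line in dedented)


# Lines as they might appear in an indented documentation block.
selected = [
    "    >>> x = 1",
    "    >>> for i in range(3):",
    "    ...     x += i",
]

source = clean_fragment(selected)
namespace = {}
exec(source, namespace)
```

Only one space after each leader is removed, so the indentation that belongs to the code itself survives.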

Download Sequences from GenBank, Keeping Only Codons

The following script takes a space-separated list of GenBank numbers as input, and then uses BioPython to download the corresponding sequences from GenBank, strips off all non-coding nucleotides, gives the sequences sensible names, and assembles them into a FASTA file. It is pretty basic, does not do a lot of fancy error checking, and is probably a little too specific to be useful for most people. I can imagine extending it in a number of ways that would make it much more useful in a number Read more [...]

An Idiosyncratic Analogical Overview of Some Programming Languages from an Evolutionary Biologist’s Perspective

R: R is like a microwave oven. It is capable of handling a wide range of pre-packaged tasks, but can be frustrating or inappropriate when trying to do even simple things that are outside of its (admittedly vast) library of functions. Ever tried to make toast in a microwave? There has been a push to start using R for simulations and phylogenetic analysis, and I am actually rather ambivalent about this. On the one hand, I would much rather an open source platform R be used than some Read more [...]

All About Your Python(s)

Here I present a script that provides diagnostics about the current Python execution context, or the Python environment of the interpreter passed as an argument. As a Python developer, I have multiple Python versions side-by-side for testing purposes, using scripts that munge my $PATH variable to "import" and "unimport" different versions of Python as I need them. While "which python" is always available, many times I want to know things like, "what is the version of the current default Python?" Read more [...]
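A stripped-down sketch of this kind of diagnostic, covering only the current interpreter, might look like the following. It is not the script from the post. The function name `python_diagnostics` and the choice of fields are my own, and everything is read from the standard `sys` and `platform` modules.

```python
import platform
import sys


def python_diagnostics():
    """Collect basic facts about the interpreter running this code."""
    return {
        "executable": sys.executable,
        "version": platform.python_version(),
        "implementation": platform.python_implementation(),
        "prefix": sys.prefix,
        "path": list(sys.path),
    }


if __name__ == "__main__":
    for key, value in python_diagnostics().items():
        print(f"{key}: {value}")
```

Running the file with a specific interpreter reports on that interpreter, which is one simple way to inspect a Python other than the default one.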

Adding Test Code Coverage Analysis to a Python Project’s Setup Command

I recently integrated unit test code coverage analysis (using coverage) as a setuptools command extension into the DendroPy phylogenetic computing library, and thought that I would share how this was done. Providing the Command Extension: The first step is to provide the command functionality in a class that derives from "setuptools.Command" in a separate module of the package, which, in my case, was called "", located in the "test/support" Read more [...]