Phylomatic & Phylocom overview

Phylomatic and Phylocom are commonly used programs in community phylogenetic analysis.

• Phylomatic takes a supertree and boils it down to a phylogeny containing only the species in
your sample.

• Phylocom performs the actual analyses (for instance, calculating phylogenetic diversity), using
the tree you made in Phylomatic.

These programs are easy to use once your data is formatted correctly, but formatting takes some effort.
The necessary steps are outlined here.

1. Cleaning your data

The first step is to make sure that the names in your data set are consistent with the current angiosperm
phylogeny (or with whichever supertree you will be using; see Supertrees). If you have a genus listed in
the wrong family, or a name spelled incorrectly, the program will not work. It is okay to have
morphospecies (e.g. Quercus_species1).

The Phylomatic website provides a tool that checks your families and genera against the default supertree
(www.phylodiversity.net/phylomatic/taxacheck.html). If you aren’t using the default supertree from
Phylomatic, you’re on your own to do this check. This is a tedious but crucial step.
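
If you have to do part of this check by hand, one cheap first pass is to look for genera that appear under more than one family in your own data. The sketch below is only an illustration: it assumes your records are available as (family, genus) pairs, and the function name and example records are made up. It does not check names against any supertree.

```python
from collections import defaultdict


def genera_with_conflicting_families(records):
    """Map each genus to its families, keeping only genera with 2+ families.

    `records` is an iterable of (family, genus) pairs; this input layout
    is an assumption made for the example.
    """
    families = defaultdict(set)
    for family, genus in records:
        families[genus.strip().lower()].add(family.strip().lower())
    return {g: sorted(f) for g, f in families.items() if len(f) > 1}


# Hypothetical records: Acer is listed under two different families.
records = [
    ("Sapindaceae", "Acer"),
    ("Aceraceae", "Acer"),
    ("Fagaceae", "Quercus"),
]
conflicts = genera_with_conflicting_families(records)
print(conflicts)
```

Any genus that shows up in the result needs to be reconciled with the supertree you plan to use before you go on.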

2. Creating input files

Once your data is clean, you need to format your data files for use in Phylomatic and Phylocom. For any
analysis you need to create two basic text documents:

A. The taxa file: Contains a list of all species found in any of your plots. You can think of it as the
regional species pool. This is a slash-delimited text file in the following format:
family/genus/genus_specificepithet

For example, you might have:

sapindaceae/acer/acer_platanoides

Each species must be on its own line and should appear in the taxa file only once. In the example
above, even if four of your plots contained Acer platanoides, the line for that species would
appear only once in the taxa file.

Also note that the names are in lower case. This isn’t strictly necessary, but for reasons
mysterious to me the program is less likely to crash when all of your input files are in lower case.
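
As a rough sketch of how the taxa file lines could be generated, assume your cleaned data is available as (family, genus, specific epithet) records. The function name and the example records are hypothetical; only the line format comes from the description above.

```python
def build_taxa_lines(records):
    """Return one 'family/genus/genus_epithet' line per unique species.

    `records` is an iterable of (family, genus, epithet) tuples; this
    input layout is an assumption made for the example.
    """
    seen = set()
    lines = []
    for family, genus, epithet in records:
        family, genus, epithet = (
            part.strip().lower() for part in (family, genus, epithet)
        )
        line = f"{family}/{genus}/{genus}_{epithet}"
        if line not in seen:  # each species is listed only once
            seen.add(line)
            lines.append(line)
    return lines


records = [
    ("Sapindaceae", "Acer", "platanoides"),
    ("Fagaceae", "Quercus", "species1"),     # morphospecies are fine
    ("Sapindaceae", "Acer", "platanoides"),  # same species, another plot
]
taxa_lines = build_taxa_lines(records)
print("\n".join(taxa_lines))
```

Writing these lines to a plain text file, one per line, gives you the taxa file.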

B. The sample file: Contains species names and abundances for each of your plots. Unlike the taxa
file, a species found in multiple plots will be listed multiple times. This is a tab-delimited text
file in the following format:

plot_name number_of_individuals genus_specificepithet

For example, let’s say you have two plots called “longisland1” and “longisland2.” In
“longisland1” there are 5 individuals of Acer platanoides, and in “longisland2” there are 3 individuals. In
your file, you would have two lines that look like this:

longisland1 5 acer_platanoides

longisland2 3 acer_platanoides
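
A minimal sketch of building these tab-delimited lines, assuming your abundances are stored as a mapping from (plot, species) to a count. The function name and the mapping layout are hypothetical. Sorting the rows is a design choice of this sketch that keeps each plot's lines together.

```python
def build_sample_lines(counts):
    """Return 'plot<TAB>abundance<TAB>genus_epithet' lines.

    `counts` maps (plot_name, species_name) to a number of individuals;
    this layout is an assumption made for the example.
    """
    lines = []
    for (plot, species), n in sorted(counts.items()):
        if n > 0:  # skip species that were not actually found in the plot
            lines.append(f"{plot.lower()}\t{n}\t{species.lower()}")
    return lines


counts = {
    ("longisland1", "acer_platanoides"): 5,
    ("longisland2", "acer_platanoides"): 3,
}
sample_lines = build_sample_lines(counts)
print("\n".join(sample_lines))
```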

3. Supertrees

You will derive the phylogeny of the species in your plots from a supertree. The Phylomatic website lists
available supertrees at svn.phylodiversity.net/tot/megatrees/ and describes those trees at
www.phylodiversity.net/phylomatic/phylomatic_old.html.

Most of these supertrees are based on Peter Stevens’ tree at APweb (which is itself based on the APG;
www.mobot.org/mobot/research/apweb/). The folks behind Phylomatic then add family-level
phylogenies to the APweb tree as they are published. These are informal supertrees, meaning they are
assembled by hand. The APweb tree is resolved largely to the family level; family-level phylogenies are
basically glued onto the tips of the APweb tree. A formal, but less current, supertree from Davies et al.
2004 is also available. (See Sanderson et al. 1998 for the nuances of informal and formal supertrees; in
short, the difference is akin to the difference between a traditional review and a meta-analysis.)
I am using “R20100701,” an informal supertree that is the most current one available.

Although Phylomatic provides trees, you are also free to use your own supertree if you have one. For
instance, if you are only interested in phylogenetic community patterns for a particular tribe or genus, you
could use a resolved phylogeny for that tribe or genus rather than using the entire angiosperm tree.

Whatever supertree you use, it should be in Newick format and saved as a text file. The supertrees from
Phylomatic are already formatted for you.
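
If you are supplying your own supertree, a quick sanity check can catch the most common Newick mistakes before you run anything. The sketch below is not a real Newick parser: it only checks that parentheses balance and that the text ends with a semicolon, and it ignores complications such as quoted labels. The function name and the toy tree are made up for illustration.

```python
def looks_like_newick(text):
    """Cheap check: balanced parentheses and a closing semicolon."""
    text = text.strip()
    if not text.endswith(";"):
        return False
    depth = 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:  # a ")" appeared before its matching "("
                return False
    return depth == 0


# Hypothetical toy tree with labeled internal nodes.
toy_tree = "((acer_platanoides,acer_rubrum)acer,quercus_alba)root;"
print(looks_like_newick(toy_tree))
```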

4. Set up the programs

Phylomatic is available in two forms – an online version and a desktop version. The online version is
easier to use, but the desktop version gives you a lot more flexibility. I’m only going to talk about the
desktop version here. You can download the program here (http://www.phylodiversity.net/phylocom/).
The package comes with both Phylomatic and Phylocom. These are programs that you will run in the
command line on your computer – as such, you don’t actually install them. Just place the two .exe files
(phylomatic.exe and phylocom.exe) into the same folder as the data you will be working with.

5. Make your community phylogeny

(This section contains the text commands for Phylomatic and Phylocom. Each command is shown on its own
line. Program names and options must be typed as-is; the names of files and folders depend on what you
have named yours.)

At this point, you should have a supertree, a taxa file, and a sample file. You can name these anything
you want, but for now let’s call them “masterphylo.txt,” “taxa.txt,” and “sample.txt” respectively. These
should all be in the same folder. You also need phylocom.exe and phylomatic.exe in this folder.

Open the command line on your computer (Command Prompt on Windows; Terminal on a Mac). Change
the directory to the folder where your files are located. For instance, I would type:

cd "C:\Users\Emily\Documents\Phylocom Demo"

(Note that any folder names with spaces in them need to be in quotes, or it won’t work.)

Type this command to make your community phylogeny:

phylomatic -l -f masterphylo.txt -t taxa.txt > myphylo.txt

What does this mean? First you are typing the name of the program you want to run (phylomatic). The
“-l” option tells it that everything is in lower case. (Again, theoretically you could leave this bit out
and not bother making your input files lower case, but for mysterious reasons things work better this
way.) “-f” designates the phylogeny file you want it to use; after “-f” you type the name of your
supertree. Similarly, “-t” designates the taxa file you want to use, so after “-t” you type the name of your
taxa file. “> myphylo.txt” tells it to save the results to a file of that name.

Now you have a community phylogeny!
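
If you would rather script this step than type it each time, the same command can be launched from Python. This is only a sketch: the helper name run_to_file is made up, the call itself is left commented out, and it assumes the program can be found from the working folder (on a Mac you may need "./phylomatic" instead). The options are the ones used in the command above.

```python
import subprocess


def run_to_file(argv, out_path):
    """Run a command and save its standard output to out_path (like '>')."""
    with open(out_path, "w") as out:
        subprocess.run(argv, stdout=out, check=True)


# Same options as the command typed above.
phylomatic_cmd = ["phylomatic", "-l", "-f", "masterphylo.txt", "-t", "taxa.txt"]

# Uncomment once the program and input files are in the working folder:
# run_to_file(phylomatic_cmd, "myphylo.txt")
print(" ".join(phylomatic_cmd))
```

Passing the arguments as a list avoids quoting problems with file names that contain spaces.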

6. Add branch lengths to your tree

You probably want your tree to have branch lengths. The easiest way to do this is to use the ages from
Wikström et al. 2001, provided on the Phylocom website on the same page as the megatrees
(svn.phylodiversity.net/tot/megatrees/). The file is called “ages.” (NOTE: This file doesn’t work properly
for me if I save it as ages.txt or ages.ages; saving it without any file extension works fine. Another quirk
of the program.)

In the command line, type the following:

phylocom bladj -t taxa.txt -f myphylo.txt > myphylo_um.txt

Here I’m telling it to use the bladj command in the program Phylocom, with my taxa file and the
phylogeny I just made. The bladj command uses the ages file to put branch lengths on your phylogeny;
here, I am saving it to another file called myphylo_um.txt.

7. Calculate phylogenetic diversity

To calculate the phylogenetic diversity (Faith’s index, PD) of each of your plots, use the following
command:

phylocom pd -s sample.txt -f myphylo_um.txt > pd.txt

Here I’m telling it to use the pd command in Phylocom. “-s” designates my sample file, and the other
pieces of the command work as before.
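
To make the metric itself concrete, here is a small worked sketch of Faith's PD on a toy tree. It assumes PD is the total length of every branch on the paths from the sampled taxa up to the root; conventions about the root differ, so treat Phylocom's own output as the authority. The tree, its branch lengths, and the function name are all made up for illustration.

```python
def faith_pd(parent, taxa):
    """Sum the lengths of all branches on the paths from `taxa` to the root.

    `parent` maps each node to (parent_node, branch_length); the root has
    no entry. Each branch is counted once, even if several taxa share it.
    """
    counted = set()
    total = 0.0
    for taxon in taxa:
        node = taxon
        while node in parent and node not in counted:
            counted.add(node)
            up, length = parent[node]
            total += length
            node = up
    return total


# Hypothetical toy tree: ((a:2,b:2)n1:3,c:5)root;
toy_tree = {
    "a": ("n1", 2.0),
    "b": ("n1", 2.0),
    "n1": ("root", 3.0),
    "c": ("root", 5.0),
}
print(faith_pd(toy_tree, ["a", "b"]))  # branches a, b, n1: 2 + 2 + 3 = 7.0
print(faith_pd(toy_tree, ["a", "c"]))  # branches a, n1, c: 2 + 3 + 5 = 10.0
```

A plot with two close relatives (a and b) scores lower than a plot with two distant ones (a and c), which is the behavior you expect from a phylogenetic diversity metric.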

8. Calculate other metrics

The Phylocom manual lists all of the metrics that can be calculated with the program. This can be found
on their website (http://www.phylodiversity.net/phylocom/). I’ll demonstrate just one more here. The
“comstruct” command calculates MPD (mean phylogenetic distance) and MNTD (mean nearest taxon
distance). Like PD, these are metrics of phylogenetic diversity. This takes a little more thought than
calculating PD; it uses a null model, and you need to specify which null model you wish to use and how
many runs should be used. An example of the command is as follows:

phylocom comstruct -a -m 1 -r 999 -s sample.txt -f myphylo_um.txt > comstruct.txt

“-a” weights the metrics by the abundance of each taxon. You can leave this part out if you don’t want
the metrics to be abundance-weighted. “-m” specifies your null model. Here I am using null model #1
(the null models available to use are described in detail in the Phylocom manual). “-r” designates the
number of runs you will use for your null model; I am using 999 runs in this example. As before, “-s”
refers to your sample file and “-f” refers to your phylogeny.
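
To show what the two metrics measure, here is a minimal sketch that computes unweighted MPD and MNTD for one plot from a table of pairwise distances. It leaves out abundance weighting and the null model entirely, so it is not a substitute for comstruct. The distances and function names are made up for illustration.

```python
from itertools import combinations


def mpd(dist, taxa):
    """Mean distance over all pairs of taxa in the plot (unweighted)."""
    pairs = list(combinations(sorted(taxa), 2))
    return sum(dist[frozenset(p)] for p in pairs) / len(pairs)


def mntd(dist, taxa):
    """Mean, over taxa, of the distance to each taxon's nearest relative."""
    nearest = [
        min(dist[frozenset((t, other))] for other in taxa if other != t)
        for t in taxa
    ]
    return sum(nearest) / len(nearest)


# Hypothetical pairwise distances among three taxa.
dist = {
    frozenset(("a", "b")): 4.0,
    frozenset(("a", "c")): 10.0,
    frozenset(("b", "c")): 10.0,
}
plot = ["a", "b", "c"]
print(mpd(dist, plot))   # (4 + 10 + 10) / 3 = 8.0
print(mntd(dist, plot))  # (4 + 4 + 10) / 3 = 6.0
```

MNTD is lower than MPD here because it only looks at each taxon's closest relative, so it is more sensitive to clustering near the tips of the tree.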

Citations

Davies T, Barraclough T, Chase M, Soltis P, Soltis D, Savolainen V (2004) Darwin's abominable
mystery: Insights from a supertree of the angiosperms. Proceedings of the National Academy of Sciences
of the United States of America 101:1904-1909.

Sanderson M, Purvis A, Henze C (1998) Phylogenetic supertrees: assembling the trees of life. Trends in
Ecology & Evolution 13:105-109.

Wikström N, Savolainen V, Chase M (2001) Evolution of the angiosperms: calibrating the family tree.
Proceedings of the Royal Society of London B: Biological Sciences 268:2211-2220.
