In this tutorial we’ll download OrthoFinder and check that it runs on the Example Dataset. After that we’ll be ready for the next tutorial, where we run it on a more interesting set of species. All the steps are done on the command line, so you can copy and paste the commands yourself. If you are not familiar with the command line, there are many online tutorials and reference pages. Here is a nice short one that covers the basics: https://www.techspot.com/guides/835-linux-command-line-basics/.
There are a number of ways of obtaining OrthoFinder.
For Linux: follow the instructions below.
For Macs: it’s easiest to install OrthoFinder with Bioconda and then follow the instructions below.
For Windows: it is best to install the Windows Subsystem for Linux and then continue as for Linux below.
For Mac and Windows users taking the Bioconda or Windows Subsystem for Linux route: follow the instructions in “Alternative ways of getting OrthoFinder” and then return to Step 1 of this tutorial.
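If you are unsure which of these routes applies to your machine, a small sketch like the following may help. It reads the kernel name from `uname -s` and maps it to the advice above. The function name `orthofinder_route` is made up for illustration, and treating `MINGW*`, `MSYS*` and `CYGWIN*` names as Windows is an assumption about common Windows shells.

```shell
# Hypothetical helper: suggest an install route from the kernel name.
# Pass a name explicitly, or let it default to the output of `uname -s`.
orthofinder_route() {
    kernel=${1:-$(uname -s)}
    case "$kernel" in
        Linux)  echo "Linux: follow the instructions below" ;;
        Darwin) echo "Mac: install with Bioconda, then follow the instructions below" ;;
        MINGW*|MSYS*|CYGWIN*)
                echo "Windows: install the Windows Subsystem for Linux, then continue as for Linux" ;;
        *)      echo "Unrecognised system: $kernel" ;;
    esac
}

# Report the route for the machine this is run on.
orthofinder_route
```

Inside the Windows Subsystem for Linux, `uname -s` reports `Linux`, so the Linux instructions apply there.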
mkdir ~/orthofinder_tutorial
cd ~/orthofinder_tutorial
wget https://github.com/davidemms/OrthoFinder/releases/latest/download/OrthoFinder.tar.gz
If you don’t have wget installed, you can try curl:
curl -L -O https://github.com/davidemms/OrthoFinder/releases/latest/download/OrthoFinder.tar.gz
Or go to the GitHub releases page and download OrthoFinder: https://github.com/davidemms/OrthoFinder/releases
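The wget-then-curl fallback above can be wrapped in a small function if you prefer. This is only a sketch: the function name `fetch` is hypothetical, and it uses exactly the two download commands already shown.

```shell
# Hypothetical helper: download a URL with wget if available, else curl.
fetch() {
    url=$1
    if command -v wget >/dev/null 2>&1; then
        wget "$url"
    elif command -v curl >/dev/null 2>&1; then
        # -L follows redirects, -O saves under the remote file name
        curl -L -O "$url"
    else
        echo "Neither wget nor curl found; download the file from the GitHub releases page instead." >&2
        return 1
    fi
}

# Usage (not run here, as it needs network access):
#   fetch https://github.com/davidemms/OrthoFinder/releases/latest/download/OrthoFinder.tar.gz
```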
tar xzvf OrthoFinder.tar.gz
cd OrthoFinder/
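If extraction fails or the `OrthoFinder/` directory is missing, the download may have been incomplete. One way to check, sketched below, is to list the archive with `tar tzf` without extracting it. The function name `check_archive` is hypothetical, and the expected path `OrthoFinder/orthofinder` is inferred from the `cd OrthoFinder/` and `./orthofinder` commands in this tutorial rather than from a documented archive layout.

```shell
# Hypothetical helper: check that a downloaded archive is readable and
# contains the orthofinder executable where this tutorial expects it.
check_archive() {
    archive=$1
    # `tar tzf` lists the contents without extracting; it fails on a
    # truncated or non-gzip file.
    if ! listing=$(tar tzf "$archive" 2>/dev/null); then
        echo "Cannot read $archive; try downloading it again." >&2
        return 1
    fi
    # Accept entries with or without a leading "./".
    if printf '%s\n' "$listing" | grep -Eq '^(\./)?OrthoFinder/orthofinder$'; then
        echo "Archive looks ok"
    else
        echo "Archive does not contain OrthoFinder/orthofinder." >&2
        return 1
    fi
}

# Usage: check_archive OrthoFinder.tar.gz
```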
Ask OrthoFinder to print its help text.
On Linux:
./orthofinder -h
Or, if you’ve installed OrthoFinder using Bioconda, run the version it installed on the system path rather than the local copy in this directory:
orthofinder -h
This will print all the OrthoFinder command line options.
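The choice between the two commands can be scripted. The sketch below prefers a copy on the system path (as for a Bioconda install) and falls back to the local `./orthofinder`. The function name `find_orthofinder` is hypothetical.

```shell
# Hypothetical helper: print the command to use for OrthoFinder.
# Prefers a copy on the system path, else the local copy in this directory.
find_orthofinder() {
    if command -v orthofinder >/dev/null 2>&1; then
        echo orthofinder
    elif [ -x ./orthofinder ]; then
        echo ./orthofinder
    else
        echo "orthofinder not found on the path or in the current directory." >&2
        return 1
    fi
}

# Usage, from inside the OrthoFinder/ directory:
#   OF=$(find_orthofinder) && "$OF" -h
```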
Run OrthoFinder on the Example Dataset. This is a very small dataset, so it should run in a few minutes; normal datasets will take longer.
Linux:
./orthofinder -f ExampleData/
Or, using Bioconda:
orthofinder -f ExampleData/
When you run OrthoFinder you should get something like this:
~/orthofinder_tutorial$ ./orthofinder -f ExampleData/
OrthoFinder version 2.3.7 Copyright (C) 2014 David Emms
2019-10-23 11:12:56 : Starting OrthoFinder
48 thread(s) for highly parallel tasks (BLAST searches etc.)
1 thread(s) for OrthoFinder algorithm
Checking required programs are installed
----------------------------------------
Test can run "mcl -h" - ok
Test can run "fastme -i /home/emms/orthofinder_tutorial/ExampleDataset/OrthoFinder/Results_Oct23/WorkingDirectory/SimpleTest.phy -o /home/emms/orthofinder_tutorial/ExampleDataset/OrthoFinder/Results_Oct23/WorkingDirectory/SimpleTest.tre" - ok
Dividing up work for BLAST for parallel processing
--------------------------------------------------
2019-10-23 11:12:56 : Creating diamond database 1 of 4
2019-10-23 11:12:56 : Creating diamond database 2 of 4
2019-10-23 11:12:56 : Creating diamond database 3 of 4
2019-10-23 11:12:56 : Creating diamond database 4 of 4
Running diamond all-versus-all
------------------------------
Using 48 thread(s)
2019-10-23 11:12:56 : This may take some time....
2019-10-23 11:13:05 : Done all-versus-all sequence search
Running OrthoFinder algorithm
-----------------------------
2019-10-23 11:13:05 : Initial processing of each species
2019-10-23 11:13:05 : Initial processing of species 0 complete
2019-10-23 11:13:05 : Initial processing of species 1 complete
2019-10-23 11:13:06 : Initial processing of species 2 complete
2019-10-23 11:13:06 : Initial processing of species 3 complete
2019-10-23 11:13:08 : Connected putative homologues
2019-10-23 11:13:08 : Written final scores for species 0 to graph file
2019-10-23 11:13:08 : Written final scores for species 1 to graph file
2019-10-23 11:13:08 : Written final scores for species 2 to graph file
2019-10-23 11:13:09 : Written final scores for species 3 to graph file
2019-10-23 11:13:09 : Ran MCL
Writing orthogroups to file
---------------------------
OrthoFinder assigned 2202 genes (80.6% of total) to 604 orthogroups. Fifty percent of all genes were in orthogroups with 4 or more genes (G50 was 4) and were contained in the largest 281 orthogroups (O50 was 281). There were 269 orthogroups with all species present and 246 of these consisted entirely of single-copy genes.
2019-10-23 11:13:15 : Done orthogroups
Analysing Orthogroups
=====================
Calculating gene distances
--------------------------
2019-10-23 11:13:19 : Done
Inferring gene and species trees
--------------------------------
2019-10-23 11:13:19 : Done 0 of 325
2019-10-23 11:13:19 : Done 100 of 325
2019-10-23 11:13:19 : Done 200 of 325
269 trees had all species present and will be used by STAG to infer the species tree
Best outgroup(s) for species tree
---------------------------------
2019-10-23 11:13:27 : Starting STRIDE
2019-10-23 11:13:28 : Done STRIDE
Observed 2 well-supported, non-terminal duplications. 2 support the best roots and 0 contradict them.
Best outgroups for species tree:
Mycoplasma_hyopneumoniae
Mycoplasma_agalactiae, Mycoplasma_hyopneumoniae
Mycoplasma_agalactiae
WARNING: Multiple potential species tree roots were identified, only one will be analyed.
Reconciling gene trees and species tree
---------------------------------------
Outgroup: Mycoplasma_hyopneumoniae
2019-10-23 11:13:28 : Starting Recon and orthologues
2019-10-23 11:13:28 : Starting OF Orthologues
2019-10-23 11:13:28 : Done 0 of 325
2019-10-23 11:13:29 : Done 100 of 325
2019-10-23 11:13:30 : Done 200 of 325
2019-10-23 11:13:32 : Done 300 of 325
2019-10-23 11:13:32 : Done OF Orthologues
2019-10-23 11:13:32 : Done Recon
Writing results files
=====================
2019-10-23 11:13:32 : Done orthologues
Results:
/home/emms/orthofinder_tutorial/ExampleDataset/OrthoFinder/Results_Oct23/
CITATION:
When publishing work that uses OrthoFinder please cite:
Emms D.M. & Kelly S. (2015), Genome Biology 16:157
If you use the species tree in your work then please also cite:
Emms D.M. & Kelly S. (2017), MBE 34(12): 3267-3278
Emms D.M. & Kelly S. (2018), bioRxiv https://doi.org/10.1101/267914
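If you save the run’s output to a file, you can check afterwards that it finished and pull out the results path. The sketch below assumes the log has the same layout as the output above: a "Done orthologues" line near the end, and the results path on the line after "Results:". The function name `check_run_log` and the file name `run.log` are hypothetical.

```shell
# Save the output while still seeing it on screen (run.log is a made-up name):
#   ./orthofinder -f ExampleData/ 2>&1 | tee run.log

# Hypothetical helper: confirm the run finished and print the results path.
check_run_log() {
    log=$1
    if ! grep -q 'Done orthologues' "$log"; then
        echo "The run does not appear to have finished." >&2
        return 1
    fi
    # Print the line after "Results:", with any leading whitespace removed.
    awk '/^Results:/ { getline; sub(/^[ \t]+/, ""); print; exit }' "$log"
}

# Usage: check_run_log run.log
```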
OrthoFinder creates a directory inside the directory containing your input files and puts all the results there, e.g. ExampleData/OrthoFinder/Results_Oct11.
This is what the results directory looks like:
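To look at your own results directory from the command line, a sketch like the following finds the most recently modified `Results_*` directory and lists it. The function name `latest_results` is hypothetical, and the default location assumes you ran the example from inside the `OrthoFinder/` directory as above.

```shell
# Hypothetical helper: print the newest Results_* directory under a base
# directory (default: ExampleData/OrthoFinder). Prints nothing if none exist.
latest_results() {
    base=${1:-ExampleData/OrthoFinder}
    # -t sorts newest first; -d lists the directories themselves
    ls -td "$base"/Results_*/ 2>/dev/null | head -n 1
}

dir=$(latest_results)
if [ -n "$dir" ]; then
    ls "$dir"
else
    echo "No results directory found yet."
fi
```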
If everything worked, you should have a similar-looking results directory. That’s it, we’re done!
In the next tutorial (Running an example OrthoFinder analysis) we will look at how to prepare and run our own analysis, and after that there’s a tutorial showing how to explore all the results.
Written on September 18th, 2019 by David Emms