Contents
- Tool sheds supported by the core Galaxy development team
- Hosting your own local tool shed
- Migrating the database schema of your local tool shed
- History of the tool shed
- Introduction
- Basic repository features: create repository, upload, browse and delete files
- Viewing or managing a repository
- Contacting the owner of a repository
- Uploading additional files
- The mercurial repository change log
- Repository revisions: uploading a new version of an existing tool
- Repository revisions: valid tool versions
- Repository rating and reviews
- Search repositories for valid tools by any combination of id, name or version
- Including proprietary data types that subclass from Galaxy data types in the distribution
- Including proprietary data types that use class modules contained in your repository
- Including datatype converters and display applications
- Automatic installation of Galaxy tool shed repository tools into a local Galaxy instance
- Automatic installation of Galaxy tool shed repository data types into a local Galaxy instance
- Getting updates for tool shed repositories installed in a local Galaxy instance
- Enabling workflow sharing: importing a workflow via a URL
- Enabling workflow sharing: finding workflows in tool shed repositories
- Enabling workflow sharing: importing a workflow from an installed tool shed repository
- Migrating tools from the Galaxy distribution to the Galaxy main tool shed
- Configuring your Galaxy server to automatically install tools eliminated from the Galaxy distribution
- Use case: automatically install the Emboss tools and datatypes into a local Galaxy instance
- Deactivating and uninstalling tool shed repositories installed into a local Galaxy instance
Tool sheds supported by the core Galaxy development team
The Galaxy Test Tool Shed is available as a sandbox environment allowing you to familiarize yourself with tool shed features. Feel free to mess around here as much as you want.
The Galaxy Main Tool Shed hosts production-ready Galaxy tools and tool components, and should not be used for testing or becoming familiar with tool shed features.
Hosting your own local tool shed
If you decide to host your own local Galaxy tool shed, it will be empty until you add your own locally developed tools to it. Starting up a local tool shed does not automatically make the mercurial repositories currently available in the main Galaxy tool shed available in your local tool shed.
All of the code for the tool shed is included in the Galaxy distribution - it is just a different web application from Galaxy itself. It uses a different database from Galaxy (this is CRITICAL), which you configure in the file community_wsgi.ini (the equivalent of universe_wsgi.ini for Galaxy). After you have the configuration settings as you want them, start up your local tool shed by typing the following on the command line within the Galaxy installation directory.
sh run_community.sh
If you use an apache proxy to your tool shed, you can use the same approach detailed in our Apache proxy to Galaxy wiki. For example, the following rules can be used to enable your apache server to serve static content (located in the directory /home/galaxy/static in this example) for your tool shed running on port 9009:
RewriteEngine on
RewriteRule ^/toolshed$ /toolshed/ [R]
RewriteRule ^/toolshed/static/style/(.*) /home/galaxy/static/june_2007_style/blue/$1 [L]
RewriteRule ^/toolshed/static/(.*) /home/galaxy/static/$1 [L]
RewriteRule ^/toolshed/images/(.*) /home/galaxy/static/images/$1 [L]
RewriteRule ^/toolshed/favicon.ico /home/galaxy/static/favicon.ico [L]
RewriteRule ^/toolshed/robots.txt /home/galaxy/static/robots.txt [L]
RewriteRule ^/toolshed(.*) http://localhost:9009$1 [P]
Of course, your tool shed must be aware that it is running with a prefix (for generating URLs in dynamic pages). This is accomplished by configuring a Paste proxy-prefix filter in the [app:main] section of community_wsgi.ini.
Note that cookie_path needs to be set only if you run more than one instance of the tool shed in different subdirectories on the same hostname. In that case it prevents the session cookies of the instances from clobbering each other.
Migrating the database schema of your local tool shed
When the schema for the Galaxy tool shed database changes, you'll see a message similar to the following when you attempt to start your tool shed application after updating the code base (you may have seen the same message when updating your Galaxy instance).
Exception: Your database has version 'X' but this code expects version 'Y'. Please backup your database and then migrate the schema by running 'sh manage_db.sh upgrade'.
Migrating your tool shed database schema requires an additional command line parameter that is not included in the above message. Here is the command for updating the database schema.
sh manage_db.sh upgrade community
Similarly, to downgrade your tool shed database schema to a previous version, the command requires the same additional parameter. For example, suppose you want to downgrade your tool shed database schema to version 9. The command to do so would be the following.
sh manage_db.sh downgrade 9 community
History of the tool shed
The Galaxy tool shed was originally introduced at the Galaxy Community Conference in 2010. At that time, the tool shed allowed uploading and downloading of tool archives (tar files) consisting of single tools or a suite of several tools. Uploading an archive required the contents of the archive to follow strict rules.
The tool shed was completely re-engineered for its new introduction at the 2011 Galaxy Developer Conference, where it was referred to as the "next-gen tool shed", and the foundation of many of its features became mercurial (hg). Rather than simply storing tool archives as files on disk, hg repositories became the container, allowing for automatic version control among other useful features. In addition to uploading and downloading using a browser, hg pulls and pushes via http became possible.
Introduction
The Galaxy Tool Shed enables sharing of Galaxy tools across the Galaxy community. The intent of the main Galaxy tool shed is to enable sharing of tools between the many local Galaxy instances around the world. The mercurial repositories that are available in the main Galaxy tool shed can be "hg cloned" individually or "automatically installed" individually as a means of making their contents (Galaxy tools, workflows, data, etc) available to your local Galaxy instance. This provides flexibility to those hosting their own local Galaxy instances in that they can install only those tools in which they have interest, and are not forced to get all of them in order to get any one of them.
If you are used to Galaxy, the tool shed will have a flavor of familiarity. Along with the top menu bar, the blue menu panel on the left provides access to the tool shed's features. Your initial visit to the tool shed will display the list of names by which tool shed repositories are categorized. You can browse the repositories in each category by clicking on the name link. You can browse all repositories by clicking the appropriate link in the left menu panel.
Although mercurial now provides the framework for the tool shed, a user can take advantage of virtually all of the tool shed's features without knowing anything about mercurial. This document provides information about using the various features of the tool shed. We'll start by creating a new mercurial repository to house a tool we've written.
The intent of the tool shed (whether it is a local, proprietary tool shed or the public Galaxy tool shed hosted by Penn State University) is to provide a vehicle for sharing tools (and tool-related objects like workflows, data types, etc) that are determined to be functional within a Galaxy instance. The tools themselves should be implemented within a development environment that includes a Galaxy instance and when a tool is deemed functional, it can then be uploaded to the tool shed for sharing.
You can, however, tweak the primary intent of a tool shed to meet your needs. For example, you may be interested in using a local tool shed instance to share tools between developers during the development process. If this is the case, then your approach can be one where a developer creates a repository on the local tool shed so that multiple developers can clone it. The mercurial command line process for committing, pushing, pulling and updating the tool shed repository can be used to share updates to the tool code by multiple developers throughout the process. If a single developer is implementing the tool however, it may make more sense to not use the tool shed as part of the development process - just upload the tool to the tool shed repository when it is functional.
Basic repository features: create repository, upload, browse and delete files
You have to log in to the tool shed in order to create a repository. The process for creating a tool shed account is the same as that in Galaxy, although the tool shed is a separate application from the main public Galaxy instance hosted at Penn State, so user accounts are not shared between the two. Selecting the Register option from the User menu on the top menu bar will display the following page allowing you to create a new account on the tool shed.
After registering (or logging in if you already have an account), the Login to create a repository option in the left menu panel will become Create new repository, and selecting that option will display the following page. We've filled in the form fields to create a new repository named filter.
After saving the repository information, the following page is displayed presenting the information you've entered, and enabling you to upload files to your repository.
Clicking the Upload files to repository button displays the following page. You can upload individual files or tar archives of files, and gzip and bzip2 compression is supported. Any change to the contents of the repository will result in the creation of a new mercurial changeset within the repository (e.g., uploading new files or editing or deleting existing files).
Repository files are not restricted to only the basic Galaxy tool wrapper combination (Galaxy tool config and executable), but can be anything useful to the intent of the repository. For example, if your tool config includes functional tests, your repository should include the input and output datasets used by the tests. If your tool refers to an index location file (a xxx.loc file usually stored in the ~/tool-data directory), your repository should include a xxx.loc.sample file so those that download the tool will have an example of the required .loc file for their local Galaxy instance. You may also decide to include one or more exported Galaxy workflows in your repository that provide examples of how your tool may be used.
After uploading files, the following page is displayed. Notice that you can now clone the repository (creating a new repository within your local environment) that enables you to maintain your own version of the repository files. You can also browse the files in the repository or select one or more files to delete from the repository from this page, and the Repository Actions pop-up menu in the upper right corner provides the list of other features that are available. Notice that, among other things, this menu enables you to download an archive of the repository files in either gzip, bzip2 or zip format. This feature differs from cloning the repository using mercurial because downloading in this way does not create a local mercurial repository.
Viewing or managing a repository
If you are the owner of a repository, the Repository Actions menu will include a Manage repository option, while for those repositories that you don't own, this menu will include a View repository option. Both of these options provide similar features, but the former provides additional features that are not available with the latter.
Selecting the Manage repository option from the pop-up menu shown in the page above will display a page that includes several sections, each of which is a form that enables you to manage something about the repository. The first section consists of the following form which contains the basic information about the repository. You can change the Name, Synopsis and Detailed description field contents on this form. The Revision field refers to the current mercurial revision of the repository (the repository tip), and clicking the revision link displays a page that contains the entire mercurial repository change log.
The Preview tools and inspect metadata by tool version section enables you to preview all tools in the repository (there may be more than one) by version, and inspect the metadata about each of them. You will be able to preview a version of the tool (and metadata for it will be generated) only if it successfully loads in a Galaxy instance. There are many reasons that a tool will not load properly, but most often the cause is a missing file upon which the tool depends for proper operation. If the tool does not load, clicking the Reset metadata button displays a message providing information about the cause of the problem.
Clicking the tool name link in the Preview tools and inspect metadata by tool version form will display the selected version of the tool as it will look when loaded into a Galaxy instance, providing you a preview of the tool and its inputs, syntax, etc.
Selecting the View tool metadata pop-up menu item for the tool name in the Preview tools and inspect metadata by tool version form displays metadata about the tool including information about functional tests, if they exist. A notable item on the following page is the Guid, which for our filter tool is "gvk.bx.psu.edu:9009/repos/greg/filter/Filter1/1.0.1". This is a unique identifier for the specific version of this tool. No other tool/version combination within this Galaxy tool shed or any other Galaxy tool shed will have this same identifier. This identifier plays an important role in enabling the tool to be automatically installed from the tool shed into a local Galaxy instance. The details for doing this are presented in a following section of this document.
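The structure of the Guid can be illustrated with a short Python sketch. It assumes, based only on the single example above, that a Guid has the layout tool shed host, the literal "repos", repository owner, repository name, tool id and tool version, separated by slashes. The function and field names are hypothetical and are not part of the tool shed code.

```python
from collections import namedtuple

# Field names are illustrative; they are inferred from the one example Guid.
ToolGuid = namedtuple("ToolGuid", "tool_shed owner repository tool_id version")


def parse_guid(guid):
    """Split a Guid, assuming <tool shed>/repos/<owner>/<repository>/<tool id>/<version>."""
    parts = guid.split("/")
    if len(parts) != 6 or parts[1] != "repos":
        raise ValueError("unexpected Guid layout: %r" % guid)
    tool_shed, _, owner, repository, tool_id, version = parts
    return ToolGuid(tool_shed, owner, repository, tool_id, version)


guid = parse_guid("gvk.bx.psu.edu:9009/repos/greg/filter/Filter1/1.0.1")
print(guid.repository, guid.tool_id, guid.version)
```

Because the tool shed host and the repository owner are part of the identifier, two tool sheds (or two owners) can host a tool with the same id and version without producing the same Guid.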
The Manage categories section enables you to associate tools with 1 or more categories, or disassociate from them.
The Notification on update section enables you to receive email notification whenever a change is made to the repository. The notification includes the message provided by the user when they made the changes, making it easy to know if a new version of the tool is available for download, or if less critical changes were made.
The Grant authority to make changes section enables you, as the owner of the repository, to grant permission to other tool shed users to make changes to your repository files.
Contacting the owner of a repository
You must be logged in to the tool shed in order to contact the owner of a repository. When logged in, the Repository Actions menu for a repository that you don't own will include a Contact repository owner option.
Selecting the Contact repository owner option from the Repository Actions menu will display a page like the following, enabling you to send an email message to the owner of the repository.
Uploading additional files
We previously discussed uploading files to a newly created, empty repository. Uploading files to a non-empty repository introduces additional features on the upload page, so we'll take another look now. When you select the Upload files to repository item from the upper right Repository Actions menu, the upload page will look similar to the one below. There are 2 important things to take note of here.
* The current repository file hierarchy, labeled Contents:, is included in the upload form near the bottom. This feature enables you to browse your repository, and upload a file at a specified point in the hierarchy. This specified point is called the upload point, and is selected by clicking the check box to the left of the item in the hierarchy (if you select a file, the upload point will be the folder in which the file is contained). You can upload a single file, or a tarball. If you do not select an upload point, the file will be uploaded to the root of the repository file hierarchy.
* If uploading a tarball, you have the option to automatically remove all files that exist in the repository (relative to the upload point), but do not exist in the uploaded archive. This feature is labeled Remove files in the repository (relative to the root or selected upload point) that are not in the uploaded archive?. It streamlines uploading a tarball to the repository root or a selected upload point, completely replacing the existing file hierarchy. Of course, this can have unwanted consequences, so be careful!
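To make the effect of the remove option concrete, here is a minimal Python sketch of the set difference it implies. It is not the tool shed's implementation: the function name is hypothetical, repository paths are assumed to be relative to the repository root, and archive member names are assumed to be relative to the upload point.

```python
import posixpath


def files_to_remove(repo_files, archive_files, upload_point=""):
    """List repository files under the upload point that the archive lacks."""
    prefix = upload_point.strip("/")
    # Where each archive member would land inside the repository.
    uploaded = set(
        posixpath.join(prefix, name) if prefix else name for name in archive_files
    )
    doomed = []
    for path in repo_files:
        # With no upload point, every repository file is in scope.
        inside = not prefix or path.startswith(prefix + "/")
        if inside and path not in uploaded:
            doomed.append(path)
    return sorted(doomed)


repo = ["README", "tools/filter.xml", "tools/filter.py", "tools/old.py"]
print(files_to_remove(repo, ["filter.xml", "filter.py"], upload_point="tools"))
```

Note how uploading the same archive to the repository root instead of the tools folder would mark every existing file for removal, which is the kind of unwanted consequence the warning above refers to.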
The mercurial repository change log
Selecting the View change log option from the Repository Actions pop-up menu will display a page similar to the following. Each section in the list of Changesets includes the user-entered change set message labeled Description, the mercurial change set revision and parent, the user that made the changes, and how long ago the changes were made.
Clicking on the first change set link in our list displays the last mercurial change set created when we uploaded the file named ANOTHER_README.
Repository revisions: uploading a new version of an existing tool
The version of a Galaxy tool is defined in the <tool> tag of the tool config (if the version attribute is missing from the tag, Galaxy assigns the default version string "1.0.0" to the tool). For example, the tool tag of the filter tool that we previously uploaded looks like this:
<tool id="Filter1" name="Filter" version="1.0.1">
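As a rough illustration, the id and version can be read from a tool config with Python's standard xml.etree.ElementTree module. This is a sketch only: the function name is made up for this example, and the "1.0.0" fallback simply mirrors the default described above rather than Galaxy's actual loading code.

```python
import xml.etree.ElementTree as ET

DEFAULT_VERSION = "1.0.0"  # default described in the text above


def tool_id_and_version(tool_config_xml):
    """Return (id, version) from the <tool> tag of a tool config string."""
    root = ET.fromstring(tool_config_xml)
    if root.tag != "tool":
        raise ValueError("root element is <%s>, expected <tool>" % root.tag)
    return root.get("id"), root.get("version", DEFAULT_VERSION)


print(tool_id_and_version('<tool id="Filter1" name="Filter" version="1.0.1"></tool>'))
print(tool_id_and_version('<tool id="Filter1" name="Filter"></tool>'))
```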
The Galaxy standard is to change the value of the version string for a tool if the tool has been modified such that inputs, parameters or behavior produce different output. Let's assume that we've changed the behavior of our filter tool in such a way that it requires a new version value, say 1.0.2. Uploading the new version of the tool to our filter repository will produce some interesting behavior.
When we select either View repository or Manage repository after uploading the new version of the filter tool, a Repository revision select list is displayed at the top of the page, assuming, in the case of our filter tool, the following criteria are all met:
- The value of the id attribute in the <tool> tag (i.e., Filter1) does not change.
- Version 1.0.1 of the tool successfully loads in a Galaxy instance.
- The value of the version attribute in the <tool> tag changes (e.g., 1.0.1 -> 1.0.2).
- Version 1.0.2 of the tool successfully loads in a Galaxy instance.
When the above list of criteria is met, a new set of metadata about the repository contents (in this case, the Filter tool) is created for version 1.0.2 of the filter tool. This "tool metadata" was discussed in the previous section. The new metadata record for our repository is the second record associated with the repository. The first metadata record was created when we uploaded version 1.0.1 of the tool. The Repository revision select list will include an option for each metadata record associated with the repository. In repositories with more than one tool, the criteria are slightly different. Upon initial upload, a new repository metadata record is created only if the repository includes at least 1 tool, and all tools in the repository properly load into Galaxy. A second repository metadata record will be created if any of the following is true:
- At least 1 new tool is uploaded to the repository that was not previously in the repository.
- At least 1 tool is removed from the repository.
- The relative file path to at least 1 tool is changed, even if the tool version is not changed (usually occurs when an uploaded tarball contains a different directory structure than a previously uploaded tarball). Doing this is not ideal since a new metadata record will be unintentionally created - you should always keep directory structures the same if the tool versions do not change.
- The version string of at least 1 tool in the repository is different in the upload.
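The four criteria above can be sketched as a comparison between the tools found before and after an upload. This is an illustrative model, not the tool shed's code: each snapshot is assumed to be a dictionary mapping a tool config's relative path to an (id, version) pair, the function name is hypothetical, and the sketch assumes all tools load properly.

```python
def metadata_record_reasons(previous, uploaded):
    """Return the reasons (if any) an upload would create a new metadata record.

    Both arguments map relative tool config path -> (tool id, version).
    """
    prev_by_id = {tid: (path, ver) for path, (tid, ver) in previous.items()}
    new_by_id = {tid: (path, ver) for path, (tid, ver) in uploaded.items()}
    reasons = set()
    if set(new_by_id) - set(prev_by_id):
        reasons.add("tool added")
    if set(prev_by_id) - set(new_by_id):
        reasons.add("tool removed")
    # For tools present in both snapshots, compare location and version.
    for tool_id in set(prev_by_id) & set(new_by_id):
        old_path, old_version = prev_by_id[tool_id]
        new_path, new_version = new_by_id[tool_id]
        if new_path != old_path:
            reasons.add("path changed")
        if new_version != old_version:
            reasons.add("version changed")
    return sorted(reasons)


before = {"filter.xml": ("Filter1", "1.0.1")}
print(metadata_record_reasons(before, {"filter.xml": ("Filter1", "1.0.2")}))
print(metadata_record_reasons(before, {"tools/filter.xml": ("Filter1", "1.0.1")}))
```

An empty result corresponds to an upload that only touches things like the tool help, where the existing metadata record is overwritten instead.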
If changes are made to the tool (or tools) that do not require a version change (e.g., the tool help is updated), the existing metadata for that version of the tool is overwritten rather than creating a new record. This is why some repositories have multiple change sets in the change log, but only a single metadata record (or none at all if the repository does not include a tool that loads properly into a Galaxy instance).
The selected option of the Repository revision select list will default to the repository tip. Selecting an option enables you to download the versions of the tools within that change set revision. You can also preview the tool versions (assuming they properly load into Galaxy) and you can inspect the metadata for each tool version available in the selected repository revision. For example, selecting revision 1:... (the repository tip) of our Filter tool displays version 1.0.2 of the tool when you click on the name of the tool in the Preview tools and inspect metadata by tool version form.
Selecting revision 0:... (the previous revision that properly loads into Galaxy) of our Filter tool displays version 1.0.1 of the tool when you click on the name of the tool in the Preview tools and inspect metadata by tool version form.
Repository revisions: valid tool versions
As discussed previously, a new repository revision is created every time a change is made to the repository by committing a new repository change set. This occurs whenever existing repository files are modified or deleted or new files are uploaded. The changes made to the repository for each change set are contained in the repository change log, which can be inspected by selecting the View change log option from the repository's Repository Actions menu. The latest repository revision is also called the repository tip.
While browsing repositories, you may notice that the value of some Revision columns is a select list, while others are simply text. Columns with a text value are displaying the repository tip, and the repository usually consists of a single change set (although not always) which resulted from the initial upload of the tool files. The barcode_splitter and clustalw repositories in the following page have textual Revision column values, while the clustalomega repository's value is a select list.
Notice that the Revision values are a number followed by a : and an alpha-numeric string (e.g., 6:98d05121d41e). The number automatically increments, and refers to the zero-based number of change sets committed to the repository (i.e., the first committed change set's number is zero). The alpha-numeric string is a unique identifier for each change set within the repository's change log, and the : is simply a separator.
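A Revision value can be split into its two parts with a few lines of Python. This is a small sketch with a hypothetical function name; it relies only on the number:identifier format described above.

```python
def parse_revision(label):
    """Split a Revision value such as '6:98d05121d41e' into (number, changeset id)."""
    number, separator, changeset_id = label.partition(":")
    if not separator or not number.isdigit() or not changeset_id:
        raise ValueError("not a <number>:<changeset id> value: %r" % label)
    return int(number), changeset_id


number, changeset_id = parse_revision("6:98d05121d41e")
# The number is zero-based, so revision 6 is the seventh committed change set.
print(number + 1, changeset_id)
```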
Repositories with textual Revision column values always refer to the repository tip, and will include at most one change set in which all contained tools properly load within a Galaxy instance. In this case, you'll know that a tool will properly load if it is included in the Preview tools and inspect metadata by tool version section of the View repository or Manage repository pages described previously in this document. Tools that will not successfully load can still be downloaded (although they cannot be automatically installed into a Galaxy instance as discussed in a following section of this document), but will require fixes in order for them to properly function within your Galaxy instance. You can contact the repository owner, asking them to commit fixes to their repository if you discover tools with problems.
The criteria by which new repository revisions are generated are described in the previous section of this document. To help clarify this process, let's take a closer look at the Revision column select list for the clustalomega tool. There are 2 options in the select list as of the writing of this document: 0:ff1768533a07 and 2:bb1847435ec1 (the repository tip). The first option, 0:ff1768533a07, refers to the repository change set that includes clustalomega tool version 0.2, which successfully loads within a Galaxy instance.
You can download version 0.2 of the clustalomega tool by visiting the above page and selecting the appropriate options from the Repository Actions menu, and you can preview version 0.2 of the tool and inspect the metadata for that version using the pop-up menu in the Preview tools and inspect metadata by tool version section.
Similarly, you can inspect metadata or download version 1.0.2 of the tool by selecting option 2:bb1847435ec1 in the Revision select list.
Repository rating and reviews
If you browse to a repository that you do not own, you can rate it using a 5 star rating, and add review comments.
Selecting the Rate repository option from the Repository Actions pop-up menu will display a page like the one below. You can select a 1 to 5 rating and add your review. Tool shed users can browse these reviews.
Search repositories for valid tools by any combination of id, name or version
The Search section of the left tool shed menu panel includes an option labeled Search for valid tools.
Clicking on the Search for valid tools menu item will display the following page. You can search repositories for valid tools by entering any combination of tool id, name and version. You can also specify if you want to restrict searches to exact matches. Here we are searching for all valid tools whose id contains the letter b and whose version contains the string 1.0.0 (we are not matching on the exact strings).
Clicking the Search repositories button will display a list of all tool shed repositories that matched our criteria (i.e., the repository contains at least 1 tool whose id contains the letter b and whose version contains the string 1.0.0). Clicking on the repository name link will display the information for that repository and associated change set revision.
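The matching behavior described above can be modeled with a short Python sketch. It is not the tool shed's search code: the function names and the sample repository data are hypothetical, empty search fields are assumed to match everything, and matching is assumed to be case-sensitive.

```python
def field_matches(value, term, exact):
    """An empty term matches anything; otherwise compare exactly or by substring."""
    if not term:
        return True
    return value == term if exact else term in value


def search_valid_tools(repositories, tool_id="", name="", version="", exact=False):
    """Return names of repositories holding at least 1 tool matching every term.

    repositories maps repository name -> list of dicts with id, name, version.
    """
    hits = []
    for repo_name in sorted(repositories):
        for tool in repositories[repo_name]:
            if (field_matches(tool["id"], tool_id, exact)
                    and field_matches(tool["name"], name, exact)
                    and field_matches(tool["version"], version, exact)):
                hits.append(repo_name)
                break  # one matching tool is enough for the repository
    return hits


# Made-up sample data for illustration only.
sample = {
    "filter": [{"id": "Filter1", "name": "Filter", "version": "1.0.1"}],
    "barcode_splitter": [{"id": "barcode_splitter", "name": "Barcode Splitter", "version": "1.0.0"}],
}
print(search_valid_tools(sample, tool_id="b", version="1.0.0"))
```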
Including proprietary data types that subclass from Galaxy data types in the distribution
If your repository includes tools that require data types that are not defined in the Galaxy distribution, you can include the required data types in the repository along with your tools, or you can create a separate repository to contain them. The repository must include a file named datatypes_conf.xml, which is modeled after the file named datatypes_conf.xml.sample in the Galaxy distribution. This section describes support for including data types that subclass from data types in the Galaxy distribution. Refer to the next section for details about data types that use your own proprietary class modules included in your repository.
An example of this is the datatypes_conf.xml file in the emboss_datatypes repository in the main Galaxy tool shed, shown below.
Tool shed repositories that include valid datatypes_conf.xml files will display the data types in the Preview tools and inspect metadata by tool version section of the view or manage repository page.
Including proprietary data types that use class modules contained in your repository
Including proprietary data types that use class modules included in your repository is a bit tricky. As part of your development process for tools that use data types that fall into this category, it is highly recommended that you host a local Galaxy tool shed. When your newly developed tools have proven to be functionally correct within your local Galaxy instance, you should upload them, along with all associated proprietary data types files and modules to your local tool shed to ensure that everything is handled properly within the tool shed. When your local tool shed repository is functionally correct, install your repository from your local tool shed to a local Galaxy instance to ensure that your tools and data types properly load both at the time of installation and when you stop and restart your Galaxy server. You should not upload your tools to the main Galaxy tool shed until you have confirmed that everything works by following these steps.
To illustrate how this works, we'll use the gmap repository in the main Galaxy tool shed as an example. The datatypes_conf.xml file included in this repository looks something like the following. You'll probably notice that this file is modeled after the datatypes_conf.xml.sample file in the Galaxy distribution, but with some slight differences.
Notice the <datatype_files> tag set. This tag set contains <datatype_file> tags, each of which refers to the name of a class module file within your repository (in this example, there is only one file named gmap.py), which contains the proprietary data type classes you've defined for your tools.
In addition, notice the value of each "type" attribute in the <datatype> tags. The : separates the class module included in the repository (in this example, the class module is "gmap") from the class name ("GmapDB", "IntervalAnnotation", etc). It is critical that you make sure your datatype tag definitions match the classes you've defined in your class modules or the data type will not properly load into a Galaxy instance when your repository is installed.
<?xml version="1.0"?>
<datatypes>
    <datatype_files>
        <datatype_file name="gmap.py"/>
    </datatype_files>
    <registration>
        <datatype extension="gmapdb" type="galaxy.datatypes.gmap:GmapDB" display_in_upload="False"/>
        <datatype extension="gmapsnpindex" type="galaxy.datatypes.gmap:GmapSnpIndex" display_in_upload="False"/>
        <datatype extension="iit" type="galaxy.datatypes.gmap:IntervalIndexTree" display_in_upload="True"/>
        <datatype extension="splicesites.iit" type="galaxy.datatypes.gmap:SpliceSitesIntervalIndexTree" display_in_upload="True"/>
        <datatype extension="introns.iit" type="galaxy.datatypes.gmap:IntronsIntervalIndexTree" display_in_upload="True"/>
        <datatype extension="snps.iit" type="galaxy.datatypes.gmap:SNPsIntervalIndexTree" display_in_upload="True"/>
        <datatype extension="gmap_annotation" type="galaxy.datatypes.gmap:IntervalAnnotation" display_in_upload="False"/>
        <datatype extension="gmap_splicesites" type="galaxy.datatypes.gmap:SpliceSiteAnnotation" display_in_upload="True"/>
        <datatype extension="gmap_introns" type="galaxy.datatypes.gmap:IntronAnnotation" display_in_upload="True"/>
        <datatype extension="gmap_snps" type="galaxy.datatypes.gmap:SNPAnnotation" display_in_upload="True"/>
    </registration>
    <sniffers>
        <sniffer type="galaxy.datatypes.gmap:IntervalAnnotation"/>
        <sniffer type="galaxy.datatypes.gmap:SpliceSiteAnnotation"/>
        <sniffer type="galaxy.datatypes.gmap:IntronAnnotation"/>
        <sniffer type="galaxy.datatypes.gmap:SNPAnnotation"/>
    </sniffers>
</datatypes>
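Since a mismatch between a type attribute and the classes in your module prevents the data type from loading, it can help to list the module and class names a datatypes_conf.xml file refers to before uploading. The following Python sketch does this with the standard library. The function name is hypothetical and the embedded sample is an abbreviated copy of the gmap file; comparing the result against the classes actually defined in gmap.py is left to the reader.

```python
import xml.etree.ElementTree as ET


def datatype_classes(datatypes_conf_xml):
    """Map each extension to the (module, class name) pair in its type attribute."""
    root = ET.fromstring(datatypes_conf_xml)
    result = {}
    for elem in root.findall("registration/datatype"):
        # The : separates the class module from the class name.
        module, separator, class_name = elem.get("type", "").partition(":")
        if not separator or not module or not class_name:
            raise ValueError("bad type for extension %r" % elem.get("extension"))
        result[elem.get("extension")] = (module, class_name)
    return result


SAMPLE = """<datatypes>
    <datatype_files>
        <datatype_file name="gmap.py"/>
    </datatype_files>
    <registration>
        <datatype extension="gmapdb" type="galaxy.datatypes.gmap:GmapDB" display_in_upload="False"/>
        <datatype extension="iit" type="galaxy.datatypes.gmap:IntervalIndexTree" display_in_upload="True"/>
    </registration>
</datatypes>"""

for extension, (module, class_name) in sorted(datatype_classes(SAMPLE).items()):
    print(extension, module, class_name)
```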
Modules that include proprietary datatype class definitions cannot use relative import references for imported modules. To function correctly when your repository is installed in a local Galaxy instance, your class module imports must be defined as absolute from the galaxy subdirectory inside the Galaxy root's lib subdirectory. For example, assume the following import statements are included in our example gmap.py file. They certainly worked within the Galaxy development environment while the gmap tools were being developed.
import data
from data import Text
from metadata import MetadataElement
However, the above relative imports will not work when the gmap.py class module is installed from the tool shed into a local Galaxy instance, because the imported modules will not be found. The developer must use the following approach instead. Notice that the imports are written such that they are absolute relative to the ~/lib/galaxy subdirectory.
import galaxy.datatypes.data
from galaxy.datatypes.data import Text
from galaxy.datatypes.metadata import MetadataElement
The use of <converter> tags contained within <datatype> tags is supported in the same way they are supported within the datatypes_conf.xml.sample file in the Galaxy distribution.
<datatype extension="ref.taxonomy" type="galaxy.datatypes.metagenomics:RefTaxonomy" display_in_upload="true">
    <converter file="ref_to_seq_taxonomy_converter.xml" target_datatype="seq.taxonomy"/>
</datatype>
Including datatype converters and display applications
To include your proprietary datatype converters or display applications, add the appropriate tag set to your repository's datatypes_conf.xml file in the same way that they are defined in the datatypes_conf.xml.sample file in the Galaxy distribution.
If you include datatype converter files in your repository, all files (the disk file referred to by the value of the "file" attribute) must be located in the same directory in your repository hierarchy. Similarly, your datatype display application files must all be in the same directory in your repository hierarchy (although the directory can be a different directory from the one containing your converter files). This is critical because the Galaxy components that load these proprietary items assume each of them is located in the same directory.
Automatic installation of Galaxy tool shed repository tools into a local Galaxy instance
Automatic installation of Galaxy tool shed repositories to local Galaxy instances provides several benefits:
- Virtual one-click tool installation for tools with no dependencies (or whose dependencies are already available in the Galaxy environment's path).
- Newly installed tools are loaded into a specified Galaxy tool panel section and can be used immediately without restarting the Galaxy server.
- Different versions of the same tool can be simultaneously used in the same Galaxy instance, streamlining the enablement of reproducible analyses over time.
With regard to tool dependencies, Galaxy tools fall into three categories: tools with no dependencies, tools that include dynamically generated select lists whose options depend upon entries in the tool_data_table_conf.xml file along with references to index files (i.e., tool-data/xxx.loc files), and tools that include 3rd party dependencies. At the current time, tools that fall into the first category can be automatically installed with no manual user intervention. However, index files must still be made available to tools that require them, and 3rd party tool dependencies must still be installed manually in such a way that Galaxy can find them on its environment path. In the future, index files and 3rd party dependencies will be automatically installed if functional Python fabric scripts are included in the tool shed repository along with the tools.
In providing this feature, multiple Galaxy tool panel configuration files are now supported rather than just one. In the past, the following 2 settings in the Galaxy config (universe_wsgi.ini) allowed for a single file (tool_conf.xml) to render the Galaxy tool panel and a single directory location (tools) to contain the tool files.
# Tool config file, defines what tools are available in Galaxy.
tool_config_file = tool_conf.xml
# Path to the directory containing the tools defined in the config.
tool_path = tools
The "tool_config_file" setting has now been enhanced to allow for any number of files by using a comma-separated list of file names. For backward compatibility the "tool_path" setting remains the same and still points to the tools whose links will be rendered in the Galaxy tool panel by parsing the tool_conf.xml file.
# Locally installed tools and tools installed from tool sheds
tool_config_file = tool_conf.xml,shed_tool_conf.xml
tool_path = tools
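To make the comma-separated form concrete, here is a minimal sketch of how such a value could be read and split. It is an illustration only, not Galaxy's own parsing code; the helper name is hypothetical, and the [app:main] section header is an assumption about how the setting is laid out in universe_wsgi.ini.

```python
import configparser

# The [app:main] header is assumed here; the settings are the ones shown above.
SAMPLE_INI = """\
[app:main]
# Locally installed tools and tools installed from tool sheds
tool_config_file = tool_conf.xml,shed_tool_conf.xml
tool_path = tools
"""


def tool_config_files(ini_text, section="app:main"):
    """Return the tool panel config file names listed in tool_config_file."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    raw_value = parser.get(section, "tool_config_file")
    # Tolerate spaces around the commas and ignore empty entries.
    return [name.strip() for name in raw_value.split(",") if name.strip()]


print(tool_config_files(SAMPLE_INI))
```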
Any "tool panel config" files in addition to the original "tool_conf.xml" should only be used to contain information about tools automatically installed from a Galaxy tool shed. Here is the "shed_tool_conf.xml.sample" file included in the Galaxy distribution. Notice that it includes a "tool_path" attribute in the <toolbox> tag. This attribute is similar to the "tool_path" setting in the Galaxy config described above, but its value should be a location different from your default Galaxy tool directory.
<?xml version="1.0"?>
<toolbox tool_path="../shed_tools">
</toolbox>
The directory to which the "tool_path" attribute in the <toolbox> tag above points must be outside of the main Galaxy installation root directory, or it must be in a subdirectory protected by a properly configured .hgignore file if the directory is within the Galaxy installation directory hierarchy. This is because tool shed repositories that are automatically installed will be placed within this directory using mercurial's repository clone feature which creates .hg directories and associated mercurial repository files. Not having .hgignore properly configured could result in undesired behavior when modifying or updating your local Galaxy instance or the tool shed repositories if they are in directories that pose conflicts. See mercurial's .hgignore documentation for details.
Another important point to convey before we proceed is that these new "shed_tool_conf.xml" files are modified in real time when you automatically install tools from a Galaxy tool shed. You can manually edit the files if you want, but doing so is not necessary, and may result in undesired behavior if incorrectly altered.
Tool shed repositories that contain tools that include dynamically generated select list parameters that refer to an entry in the tool_data_table_conf.xml file must contain a tool_data_table_conf.xml.sample file that contains the required entry for each dynamic parameter. Similarly, any index files (i.e., ~/tool-data/xxx.loc files) to which the tool_data_table_conf.xml file entries refer must be defined in xxx.loc.sample files included in the tool shed repository along with the tools. If any of these tool_data_table_conf.xml entries or any of the required xxx.loc.sample files are missing from the tool shed repository, the tools will not properly load and metadata will not be generated for the repository. This means that the tools cannot be automatically installed into a Galaxy instance.
For those tools that include dynamically generated select list parameters that require a missing entry in the tool_data_table_conf.xml file, this file will be modified in real time by adding the entry from a tool_data_table_conf.xml.sample file contained in the tool shed repository.
Let's assume we're logged in as an "admin" user to a Galaxy instance whose tool panel looks like the panel in the following screen shot. Notice that the Filter and Sort tool panel section includes 2 tools, "Sort data in ascending or descending order" and "Select lines that match an expression". We want to install the filter tool that we uploaded to the Galaxy tool shed in previous sections of this document.
Clicking the Admin link in the top Galaxy tool panel displays the Galaxy Administration interface. Notice the sections in the blue left menu panel. We'll be taking a look at the Tool sheds section where we have the option to Search and browse tool sheds.
Clicking the Search and browse tool sheds link displays the Accessible Galaxy tool sheds page. These links to the various tool sheds are defined in the tool_sheds_conf.xml file in the Galaxy installation directory.
The file that produces the links to the 3 tool sheds shown in the page above looks like this:
<?xml version="1.0"?>
<tool_sheds>
    <tool_shed name="Bx tool shed" url="http://someserver.bx.psu.edu:9009/"/>
    <tool_shed name="Galaxy main tool shed" url="http://toolshed.g2.bx.psu.edu/"/>
    <tool_shed name="Galaxy test tool shed" url="http://testtoolshed.g2.bx.psu.edu/"/>
</tool_sheds>
Each tool shed link includes a pop-up menu that allows you to browse valid repositories or search for tools or workflows. See the Search repositories for valid tools by any combination of id, name or version and the Search repositories for workflows by name topic sections above for details about each of these respective features.
Clicking the Galaxy main tool shed link (the same feature as the Browse valid repositories option in the tool shed pop-up menu) displays tool repositories from the main production tool shed hosted at Penn State University. The list of repositories is filtered so that only repositories considered "valid" are displayed. A repository is valid if it has at least 1 set of repository metadata. See the Repository revisions: uploading a new version of an existing tool topic section above for details about repository metadata.
In previous sections of this document we uploaded our filter tool to our Bx tool shed, so we'll click the link to that tool shed in the Accessible Galaxy tool sheds page to browse for the tool. Locating our filter tool is easy since our Bx tool shed currently contains only the single repository.
Clicking the pop-up menu next to the repository name enables us to preview and install our filter tool.
Clicking the Preview and install option from the pop-up menu in the screen shot above displays the following page. Notice that this page looks similar to the same-titled section of the View repository and Manage repository pages described in previous sections of this document. Here you can preview the tool and inspect its metadata in the same way that you can on those pages. The Revision field on this page is also similar in that it becomes a select field when more than 1 repository revision is associated with a repository metadata record.
Clicking the Install to local Galaxy button in the upper right corner of the above page displays the following page. Note the warnings on this page; both are very important. This page allows us to select the section of our Galaxy tool panel where we want the installed filter tool to be located. We'll select the Filter and Sort tool panel section and click the Install button.
After clicking the Install button and waiting for the tool installation to finish, we are presented with the following message.
Clicking the Analyze Data option in the top Galaxy menu and then checking the Filter and Sort tool panel section shows us that our tool is loaded and ready to use.
Now that the tool is installed, let's take a look at the shed_tool_conf.xml file. It now looks something like the following. Notice that the tool files were installed in the relative directory "../shed_tools/gvk.bx.psu.edu/repos/greg/filter/897bb218d0cf/filter" and that the tool's guid attribute is "gvk.bx.psu.edu:9009/repos/greg/filter/Filter1/1.0.1".
All tools that are installed from a Galaxy tool shed in the way just presented in this section will have a guid attribute and value like this. When this tool is executed in the Galaxy instance in which it was installed, the value of this guid becomes the tool id. This is how tools with the same tool id (e.g., "Filter1" in this case since the installed filtering.xml file's tool tag is <tool id="Filter1" name="Filter" version="1.0.1">) from different repositories or with different versions can be simultaneously loaded and executed in the same Galaxy instance.
<?xml version="1.0"?>
<toolbox tool_path="../shed_tools">
    <section name="Filter and Sort" id="filter">
        <tool file="localhost/repos/greg/filter/10456b4659aa/filter/filtering.xml" guid="localhost:9009/repos/greg/filter/Filter1/1.0.1">
            <tool_shed>localhost:9009</tool_shed>
            <repository_name>filter</repository_name>
            <repository_owner>greg</repository_owner>
            <changeset_revision>10456b4659aa</changeset_revision>
            <id>Filter1</id>
            <version>1.0.1</version>
        </tool>
    </section>
</toolbox>
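The guid in the listing above appears to be composed of the tool shed, the literal segment "repos", the repository owner, the repository name, the tool id and the tool version. The following sketch reads the same XML and rebuilds the guid from the child elements to show that relationship. The composition rule is inferred from this one example rather than taken from Galaxy's source, and rebuild_guid is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

SHED_TOOL_CONF = """<?xml version="1.0"?>
<toolbox tool_path="../shed_tools">
    <section name="Filter and Sort" id="filter">
        <tool file="localhost/repos/greg/filter/10456b4659aa/filter/filtering.xml" guid="localhost:9009/repos/greg/filter/Filter1/1.0.1">
            <tool_shed>localhost:9009</tool_shed>
            <repository_name>filter</repository_name>
            <repository_owner>greg</repository_owner>
            <changeset_revision>10456b4659aa</changeset_revision>
            <id>Filter1</id>
            <version>1.0.1</version>
        </tool>
    </section>
</toolbox>
"""


def rebuild_guid(tool_elem):
    """Join the child element values in the order seen in the guid attribute."""
    return "/".join([
        tool_elem.findtext("tool_shed"),
        "repos",
        tool_elem.findtext("repository_owner"),
        tool_elem.findtext("repository_name"),
        tool_elem.findtext("id"),
        tool_elem.findtext("version"),
    ])


root = ET.fromstring(SHED_TOOL_CONF)
tools = list(root.iter("tool"))
for tool in tools:
    # Prints the stored guid and whether the rebuilt value matches it.
    print(tool.get("guid"), rebuild_guid(tool) == tool.get("guid"))
```

Because the tool shed, owner and repository name are part of the guid, two tools that share the id "Filter1" but come from different repositories end up with distinct guids.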
Automatic installation of Galaxy tool shed repository data types into a local Galaxy instance
To demonstrate how data types included in installed tool shed repositories are handled by Galaxy, let's assume we're the administrator for a Galaxy instance where all of the Emboss datatypes have been removed from the datatypes_conf.xml file.
Let's take a look at what happens when we install the emboss_datatypes repository from the main Galaxy tool shed into our local Galaxy instance.
Since the emboss_datatypes repository does not include any tools, selecting a tool panel section is not necessary.
Inspecting the paster log during the installation confirms that after the repository was installed, all of the included Emboss data types were loaded into our local Galaxy instance's datatypes registry.
galaxy.util.shed_util DEBUG 2012-01-03 13:52:53,665 Installing repository 'emboss_datatypes'
...
galaxy.datatypes.registry DEBUG 2012-01-03 13:52:54,624 Loading datatypes from /Users/gvk/workspaces_2008/shed_tools/toolshed.g2.bx.psu.edu/repos/devteam/emboss_datatypes/a89163f31369/emboss_datatypes/datatypes_conf.xml
The paster log above shows that the data types installed with the emboss_datatypes repository are available in the Galaxy server session in which they were installed. But what happens when we stop and restart the server? Let's inspect some snippets of the paster log from our local Galaxy server when we stop and restart it.
galaxy.jobs INFO 2012-01-03 14:23:33,558 job stopper stopped
...
$ sh run.sh
...
galaxy.datatypes.registry DEBUG 2012-01-03 14:23:40,228 Loading datatypes from datatypes_conf.xml
...
galaxy.tools INFO 2012-01-03 14:23:40,798 parsing the tool configuration ./shed_tool_conf.xml
galaxy.datatypes.registry DEBUG 2012-01-03 14:23:41,106 Loading datatypes from /Users/gvk/workspaces_2008/shed_tools/toolshed.g2.bx.psu.edu/repos/devteam/emboss_datatypes/a89163f31369/emboss_datatypes/datatypes_conf.xml
You can see from the above paster log snippets that the data types installed with the emboss_datatypes repository are loaded whenever the Galaxy server is started. Galaxy data types correctly included in any installed repository will be similarly loaded, both when the repository is installed from the tool shed and when the Galaxy server is stopped and restarted at any point after the repository was installed.
What about conflicts? What happens if we have a data type defined in our local Galaxy instance's datatypes_conf.xml file, and we install a tool shed repository that includes the very same data type (but perhaps with a different class definition)? Precedence is given to the data type that was loaded first. The order in which data types are loaded is:
- Data types defined in the local datatypes_conf.xml file, which is parsed and loaded top to bottom by the Galaxy server.
- Data types defined in each installed tool shed repository where precedence is given to installation times oldest to newest.
To demonstrate this, let's add the acedb Emboss datatype back into our local Galaxy instance's datatypes_conf.xml file.
<datatype extension="acedb" type="galaxy.datatypes.data:Text" subclass="True"/>
Stopping and restarting our Galaxy server provides the following information in the paster log. You can see that the datatype contained within the installed emboss_datatypes tool shed repository with the extension acedb was ignored because it had previously been loaded due to its definition in the datatypes_conf.xml file.
galaxy.datatypes.registry DEBUG 2012-01-03 14:40:14,403 Loading datatypes from datatypes_conf.xml
...
galaxy.datatypes.registry DEBUG 2012-01-03 14:40:15,563 Loading datatypes from /Users/gvk/workspaces_2008/shed_tools/toolshed.g2.bx.psu.edu/repos/devteam/emboss_datatypes/a89163f31369/emboss_datatypes/datatypes_conf.xml
galaxy.datatypes.registry DEBUG 2012-01-03 14:40:15,564 Ignoring datatype with extension 'acedb' from '/Users/gvk/workspaces_2008/shed_tools/toolshed.g2.bx.psu.edu/repos/devteam/emboss_datatypes/a89163f31369/emboss_datatypes/datatypes_conf.xml' because the registry already includes a datatype with that extension.
Galaxy administrators should pay close attention to the potential conflicts that will arise when tool shed repositories that include data types are installed into local Galaxy instances. If conflicts result, the data types to which the local instance has access may not be what is expected by the users.
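The first-loaded-wins rule described in this section can be modeled in a few lines. The sketch below is a simplified illustration, not Galaxy's registry code: load_datatypes is a hypothetical function, and the type string attributed to the emboss_datatypes repository is a made-up placeholder, since the repository's actual class definition is not shown in this document.

```python
def load_datatypes(sources):
    """Load datatypes from an ordered list of (source_name, entries) pairs,
    where entries is a list of (extension, type_string) pairs.

    The first definition of an extension wins; later ones are recorded
    as ignored, mirroring the log message shown above.
    """
    registry = {}
    ignored = []
    for source_name, entries in sources:
        for extension, type_string in entries:
            if extension in registry:
                ignored.append((extension, source_name))
            else:
                registry[extension] = (type_string, source_name)
    return registry, ignored


# Order matters: the local datatypes_conf.xml first, then installed
# repositories from the oldest installation to the newest.
sources = [
    ("datatypes_conf.xml", [("acedb", "galaxy.datatypes.data:Text")]),
    # Placeholder type string, for illustration only.
    ("emboss_datatypes", [("acedb", "hypothetical.module:Acedb")]),
]

registry, ignored = load_datatypes(sources)
print(registry["acedb"])
print(ignored)
```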
Getting updates for tool shed repositories installed in a local Galaxy instance
Galaxy can be configured to automatically poll appropriate Galaxy tool sheds to find updates that are available for any of your installed tool shed repositories. To enable this feature, set the value of the following config settings in universe_wsgi.ini. Tool sheds will be polled when your Galaxy server is started, and again whenever the configured number of hours has passed since the server was started or since the last poll occurred.
# Enable automatic polling of relative tool sheds to see if any updates
# are available for installed repositories. Ideally only one Galaxy
# server process should be able to check for repository updates. The
# setting for hours_between_check should be an integer between 1 and 24.
enable_tool_shed_check = True
hours_between_check = 12
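The timing rule can be sketched as follows. This is only an illustration of the behavior described above, under the assumption that a poll happens at startup and then once every hours_between_check hours; both function names are hypothetical and do not come from Galaxy.

```python
from datetime import datetime, timedelta


def validate_hours_between_check(value):
    """Convert the setting to an int and enforce the documented 1 to 24 range."""
    hours = int(value)
    if not 1 <= hours <= 24:
        raise ValueError("hours_between_check must be an integer between 1 and 24")
    return hours


def poll_is_due(last_poll, now, hours_between_check):
    """Return True if the tool sheds should be polled now.

    last_poll is None when the server has just started, in which case
    a poll is always due.
    """
    if last_poll is None:
        return True
    return now - last_poll >= timedelta(hours=hours_between_check)


hours = validate_hours_between_check("12")
started = datetime(2012, 1, 3, 8, 0)
print(poll_is_due(None, started, hours))
print(poll_is_due(started, started + timedelta(hours=6), hours))
print(poll_is_due(started, started + timedelta(hours=12), hours))
```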
If you have one or more Galaxy tool shed repositories installed into your local Galaxy instance, you'll see a new item in the Server section of your Administration menu named Manage installed tool shed repositories.
Selecting the menu item will display a list of all of the repositories that have been installed from any Galaxy tool shed into your local Galaxy instance. The page below shows that we have installed three repositories: the bam_to_bigwig repository from the Galaxy test tool shed, the blast2go repository from the Galaxy main tool shed and our filter tool that we installed from our Bx tool shed in the previous section of this document.
Let's assume that after we've installed the filter repository from the Bx tool shed, changes were made to the repository in the tool shed. If the time defined by the hours_between_check config setting in our universe_wsgi.ini file has passed, our Galaxy server will poll the Bx tool shed and discover that the filter repository has been updated. Now when we click on the Manage installed tool shed repositories menu item, we see our installed filter repository highlighted in red.
If you don't want to configure your Galaxy instance to automatically poll tool sheds, your repositories that have available updates will not be highlighted in red. However, you can still manually get updates for each of your repositories. Clicking the repository name link will display a page like the following where you can view information about the installed repository and change the description.
The Repository Actions pop-up menu provides a way to get any new updates that are available from the relevant Galaxy tool shed.
A very important point to convey here is that updates retrieved from the relevant Galaxy tool shed will be restricted to the latest change set that includes those versions of tools that are currently in your installed tool shed repository. Remember that the tool shed repository revision values are a number followed by a : and an alpha-numeric string (e.g., 6:98d05121d41e). Let's assume that at some point you installed revision 0:sdj45ger5fr4 of a tool shed repository into your local Galaxy instance. Then at some later point the related repository in the Galaxy tool shed was updated with revision 1:si88rhjk8hfh. Then even later the same repository in the Galaxy tool shed was updated to a new revision number, say 2:srjls89ojf8e. Let's assume that this latest version resulted in a Revision select list for the repository in the Galaxy tool shed because the version of one or more tools within the repository changed. If you updated your locally installed tool shed repository after these changes to the repository within the Galaxy tool shed were made, your local repository would be updated to revision 1:si88rhjk8hfh, but would not be updated to include the change in revision 2:srjls89ojf8e. Since revision 2:srjls89ojf8e of the repository within the Galaxy tool shed includes tools that have different versions, you have to install that revision into your local Galaxy instance as a separate tool shed repository installation if you want to use the new versions of the tools.
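The scenario above can be expressed as a small selection rule. The sketch below is one possible reading of it, not the tool shed's implementation: starting from the installed revision, it walks forward through the change log and stops at the first revision whose tool versions differ. The function name, the tool id "example_tool" and its version numbers are hypothetical; only the revision strings come from the text.

```python
def latest_compatible_revision(changelog, installed_index):
    """Return the newest revision reachable by an update.

    changelog is an ordered list of (revision, tool_versions) pairs, where
    tool_versions maps tool ids to version strings. An update may advance
    only while the tool versions match those of the installed revision.
    """
    installed_versions = changelog[installed_index][1]
    target = installed_index
    for index in range(installed_index + 1, len(changelog)):
        if changelog[index][1] != installed_versions:
            break
        target = index
    return changelog[target][0]


# Tool id and versions are invented for illustration.
changelog = [
    ("0:sdj45ger5fr4", {"example_tool": "1.0.0"}),
    ("1:si88rhjk8hfh", {"example_tool": "1.0.0"}),
    ("2:srjls89ojf8e", {"example_tool": "1.1.0"}),
]

print(latest_compatible_revision(changelog, installed_index=0))
```

With revision 0 installed, the update target is revision 1; revision 2 changes a tool version and so would have to be installed as a separate repository installation.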
Selecting the Get updates option will check for updates in the Galaxy tool shed repository and pull them to your locally installed repository if any are available. If no updates are available, a message will be displayed letting you know that your installed repository is up-to-date. One approach for keeping track of when you should update your installed repositories is to check the Receive email alerts checkbox in the relevant Galaxy tool shed for each of your installed repositories so that you'll get an email message letting you know there may be updates you want to apply.
Enabling workflow sharing: importing a workflow via a URL
Galaxy tool sheds play a beneficial role in enabling sharing of exported Galaxy workflows between different Galaxy instances. In the past, importing a workflow was functional only if all of the tools required by the workflow were available in the Galaxy instance into which it was being imported. Now workflows can be imported into Galaxy instances that are missing required tools. Let's take a look at what happens if we try importing one of the workflows available in the published paper titled Dynamics of mitochondrial heteroplasmy in three families investigated via a repeatable re-sequencing study, which is available here. Clicking the Workflow link in the top Galaxy menu bar in our local Galaxy instance displays the following page which includes a feature labeled Upload or import workflow.
Selecting the Upload or import workflow option in the page above displays the Import an exported Galaxy workflow page where we can enter the URL where the workflow is located and import it.
Clicking the Import button in the page above imports the workflow into our local Galaxy instance. In doing this, the workflow import utility determines that our local Galaxy instance does not have all of the tools required by the workflow, and so the following message is displayed. Here we are presented with the list of tool sheds that are accessible from our local Galaxy instance.
When the link to any of the accessible tool sheds is clicked, a search of that tool shed is performed in an attempt to locate the list of tools required by the workflow, but not available in our local Galaxy instance. In our scenario, let's assume that we have repositories in our Bx tool shed that include all of the tools required by the workflow, but are not available in our local Galaxy instance. If this is the case, clicking the link to our Bx tool shed displays the following list of repositories that include at least one of the tools for which we are searching.
Selecting all of the repositories and clicking Install displays the following page where we can select a section in the tool panel in which to install the tools.
Selecting a tool panel section and clicking Install installs all of the repositories onto the server which hosts our local Galaxy instance, and displays the following message. Since the repositories have all been installed and all of the tools that they contain have been loaded into our Galaxy instance, the workflow we imported can now be executed with input data that we choose.
Enabling workflow sharing: finding workflows in tool shed repositories
The Search section of the left tool shed menu panel includes an option labeled Search for workflows. Clicking on the Search for workflows menu item will display the following page. You can search repositories for workflows by entering a workflow name. Here we are searching for all workflows whose name contains the string genome (we are not matching on the exact strings).
Clicking the Search repositories button will display a list of all tool shed repositories that matched our criteria (i.e., the repository contains at least 1 exported workflow whose name contains the string genome).
Clicking on the repository name link in the above page will display the information for that repository and associated change set revision, including a section labeled Preview tools and inspect metadata by tool version. Since the repository matching our search contains an exported Galaxy workflow, this section includes not only the list of tools included in the repository, but also the workflow. Information about the workflow, including the number of steps, the format-version and annotation is displayed.
Clicking the workflow's name link displays an SVG graphic of the workflow (your browser must support SVG display for the image to be rendered). Boxes in the graphic that represent tools required by the workflow which are available in the repository are displayed with a brown background, while boxes representing tools required by the workflow but not available in the repository have a red background.
The Repository Actions popup menu includes options for importing just the workflow into your local Galaxy instance or installing the complete repository.
Selecting the Import workflow to local Galaxy option in the Repository Actions pop-up menu in the page above will make the workflow available in the Workflow interface. If your local Galaxy instance is missing any of the tools that the workflow requires, a message is displayed that includes links to all accessible tool sheds enabling you to search the tool sheds for the missing tools. This behavior is the same as that described in the previous topic section.
Selecting the Install repository to local Galaxy option in the Repository Actions pop-up menu in the page above will install the repository to your local Galaxy server using the process described in the previous topic section titled Automatic installation of Galaxy tool shed repository tools into a local Galaxy instance.
Enabling workflow sharing: importing a workflow from an installed tool shed repository
Let's assume we installed the tool shed repository named nextgen_variant_identification that we found in the previous topic section when we searched the main Galaxy tool shed for workflows whose name included the string "genome". When we select the Manage installed tool shed repositories menu item from the Administration menu, this repository will be included in the list of installed tool shed repositories.
Clicking on the name link of the installed nextgen_variant_identification repository will display the following page where we can view information about the repository and make desired changes to the description.
The Repository Actions pop-up menu in the page above includes options to browse the repository and get updates for the repository from the tool shed from which it was installed. Selecting the Browse repository option from this menu displays the following page.
The pop-up menu associated with the workflow name in the page above provides the ability to import the workflow into your local Galaxy instance. If your Galaxy instance is missing any of the tools that the workflow requires, a message is displayed that includes links to all accessible tool sheds enabling you to search the tool sheds for the missing tools. This behavior is the same as that described in the previous topic section titled Enabling workflow sharing: importing a workflow via a URL.
Migrating tools from the Galaxy distribution to the Galaxy main tool shed
In 2012, the Galaxy development team will begin the process of migrating many of the tools that are currently available in the Galaxy distribution to the Galaxy Main Tool Shed. This will enable those that host local Galaxy instances much more flexibility in choosing to provide only those specific tools in which their users are interested.
A certain base set of "default tools" will continue to be included with the Galaxy distribution, but it will be much smaller than it is currently. The Emboss version 5.0.0 tools are the first set of tools that will be eliminated from the Galaxy distribution and hosted in the main Galaxy tool shed.
Any of the tools that were in the Galaxy distribution, but are now in the main Galaxy tool shed can very easily be installed to your local Galaxy instance from the tool shed. The following sections describe this process using the Emboss 5.0.0 tools as an example. As of the writing of this document, the Emboss 5.0.0 tools have not yet been removed from the Galaxy distribution. However, the tools are currently contained in the repository named emboss_5 and the Emboss data types are contained in the repository named emboss_datatypes in the main Galaxy tool shed. Both of these repositories will be referenced in the following discussion.
Configuring your Galaxy server to automatically install tools eliminated from the Galaxy distribution
The following new configuration settings were recently introduced within the universe_wsgi.ini.sample file.
enable_tool_shed_install - if set to True, enable automatic installation of tools that used to be in the Galaxy distribution but are now in the main Galaxy tool shed.
tool_shed_install_config_file - any tools that will be installed are configured in the config file named by the value of this setting (the default file name is tool_shed_install.xml). More information about this file is provided below.
install_tool_config_file - the contents of this file will be automatically appended for each tool that is installed into the local Galaxy instance (the default file name is shed_tool_conf.xml). More information about this file is provided below.
enable_tool_shed_check - if set to True, enable automatic polling of relative tool sheds to see if any updates are available for installed repositories. Ideally only one Galaxy server process should be able to check for repository updates.
hours_between_check - the number of hours to wait between each poll of the appropriate tool sheds to inquire about updates that may be available for installed repositories (the default setting is 12 hours). The setting for hours_between_check must be an integer between 1 and 24.
A file named tool_shed_install.xml.sample is included in the Galaxy distribution within the Galaxy install directory. The contents of this sample file will evolve over time as more tools are eliminated from the Galaxy distribution and moved to the tool shed. Galaxy administrators who wish to enable automatic installation of tools that moved from the Galaxy distribution to the tool shed should copy the sample file to a file named tool_shed_install.xml (or whatever value you've assigned to the tool_shed_install_config_file config setting in universe_wsgi.ini). You may choose to edit the contents of the file to restrict automatic installation to only those tools in which your users are interested. However, be very cautious in making changes to this file, as incorrect changes may result in problems with installed tools. Details about each of the tags in the sample file are described below to provide all of the information you'll need to manage your version of the tool_shed_install.xml file.
Here is a snippet of the contents of the tool_shed_install.xml.sample file that provides information about the Emboss 5.0.0 tools.
<?xml version="1.0"?>
<toolshed name="toolshed.g2.bx.psu.edu">
    <!-- The following repository includes no tools, so nothing will be loaded into the tool panel. -->
    <repository name="emboss_datatypes" description="Datatypes for Emboss tools" changeset_revision="a89163f31369" />
    <section name="EMBOSS" id="EMBOSSLite">
        <repository name="emboss_5" description="Galaxy wrappers for EMBOSS version 5.0.0 tools" changeset_revision="b94ca591877b">
            <tool id="EMBOSS: antigenic1" version="5.0.0" />
            <tool id="EMBOSS: backtranseq2" version="5.0.0" />
            <tool id="EMBOSS: banana3" version="5.0.0" />
            <tool id="EMBOSS: biosed4" version="5.0.0" />
            ...
        </repository>
    </section>
</toolshed>
The primary container tag set is the "toolshed" tag set. The value of the name attribute of the toolshed tag is the domain name of the tool shed hosting the tools that have been eliminated from the distribution. This value will very likely always be "toolshed.g2.bx.psu.edu" for those tools that were at one time included in the Galaxy distribution, but have been moved to the main Galaxy tool shed by the core Galaxy development team. The toolshed tag set contains "section" tag sets and "repository" tag sets.
Both "name" and "id" attributes are required for "section" tags. This tag is used to define the section within the Galaxy tool panel in which to load all of the tools that are included in each of the repositories that will be installed. In the example snippet above, the tools contained in the "emboss_5" repository will be loaded into the Galaxy tool panel section labeled "EMBOSS" since that is the value for the name attribute of the section tag that contains the "emboss_5" repository tag set.
All "repository" tag sets have "name", "description" and "changeset_revision" attributes. Tool shed repositories installed into a local Galaxy instance can be managed by a Galaxy administrator, and the name and description of each repository enable this ability. Maintenance of installed tool shed repositories will be described in later sections of this document. The value of the "changeset_revision" attribute defines the mercurial change set revision of the associated tool shed repository that will be installed. The value of this attribute is very important because it determines the tools (and tool versions), and possibly data types and workflows, that will be made available to the local Galaxy instance when the repository is installed.
Repository tag sets may optionally be contained within "section" tag sets. If a repository does not include any Galaxy tools, it will make no difference if it is included inside or outside a section tag set since no tools will be loaded into the Galaxy instance. However, if the repository does include tools and its tag set is not contained within a section tag set, the tools will be loaded into the top level of the Galaxy tool panel (outside of any sections).
If tools are included in the repository, the "repository" tag set must contain a "tool" tag for each tool contained in the defined mercurial change set revision of the repository. Each tool tag must include an "id" and a "version" attribute. This information is critical, and is stored in the Galaxy database to enable backward compatibility for workflows. More information about this is included in the next section.
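The tag structure described above can be walked with a few lines of Python. The following is only a minimal sketch using the standard library, not the code Galaxy's install manager actually runs; the function name summarize_install_config and the shortened SAMPLE string are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the sample snippet (only two of the Emboss tools are listed).
SAMPLE = """<?xml version="1.0"?>
<toolshed name="toolshed.g2.bx.psu.edu">
    <repository name="emboss_datatypes" description="Datatypes for Emboss tools" changeset_revision="a89163f31369" />
    <section name="EMBOSS" id="EMBOSSLite">
        <repository name="emboss_5" description="Galaxy wrappers for EMBOSS version 5.0.0 tools" changeset_revision="b94ca591877b">
            <tool id="EMBOSS: antigenic1" version="5.0.0" />
            <tool id="EMBOSS: backtranseq2" version="5.0.0" />
        </repository>
    </section>
</toolshed>
"""


def summarize_install_config(xml_text):
    """Hypothetical helper: return the tool shed name and one dict per repository."""
    root = ET.fromstring(xml_text)
    repositories = []

    def collect(parent, section_name):
        # Only direct children are inspected, so top-level repositories and
        # repositories inside a section are kept apart.
        for repo in parent.findall("repository"):
            repositories.append({
                "name": repo.get("name"),
                "changeset_revision": repo.get("changeset_revision"),
                "section": section_name,  # None means top level of the tool panel
                "tools": [(t.get("id"), t.get("version")) for t in repo.findall("tool")],
            })

    collect(root, None)
    for section in root.findall("section"):
        collect(section, section.get("name"))
    return root.get("name"), repositories


shed_name, repos = summarize_install_config(SAMPLE)
for repo in repos:
    print(shed_name, repo["name"], repo["changeset_revision"], repo["section"], len(repo["tools"]))
```

A repository whose "section" value is None would have its tools loaded outside of any tool panel section, which matches the behavior described for repository tag sets that are not contained within a section tag set.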
Use case: automatically install the Emboss tools and datatypes into a local Galaxy instance
Let's assume we're the administrator of a local Galaxy instance where the Emboss data types are not included in the datatypes_conf.xml file, and the available tools include only the upload tool.
We'll further assume that the tool_shed_install.xml.sample in the Galaxy distribution looks like that displayed in the previous section of this document (where it includes only the emboss_5 and emboss_datatypes repositories). We've copied this sample file to our local version of the file named tool_shed_install.xml.
We'll make the following configuration settings in our universe_wsgi.ini file.
# Enable automatic installation of tools that used to be in the Galaxy
# distribution but are now in the main Galaxy tool shed. The tools
# that will be installed are configured in the config file named
# tool_shed_install.xml, which is located in the Galaxy install directory.
# Tools already installed will not be re-installed even if they are
# referenced in the tool_shed_install.xml file.
enable_tool_shed_install = True
tool_shed_install_config_file = tool_shed_install.xml
# CRITICAL NOTE: the location in which the tools will be installed is the
# location pointed to by the "tool_path" attribute in the following file.
# The default location setting in shed_tool_conf.xml ("../shed_tools") may
# be problematic for some cluster environments, so make sure to change it
# if appropriate or use a different file name for the setting.
install_tool_config_file = shed_tool_conf.xml
Inspecting the paster log after stopping and restarting our Galaxy server provides the following information. We see that Galaxy's tool shed repository install manager parses our tool_shed_install.xml file, and automatically installs each defined tool shed repository up to the defined mercurial change set revision.
galaxy.jobs INFO 2012-01-03 15:24:41,149 job stopper stopped...
$ sh run.sh
...
galaxy.tool_shed.install_manager DEBUG 2012-01-03 15:25:05,493 Parsing tool shed install configuration ./tool_shed_install.xml
The following snippet from our paster log provides the information about the 2 tool shed repositories (emboss_datatypes and emboss_5) that are being automatically installed because they are defined in our tool_shed_install.xml file.
galaxy.tool_shed.install_manager DEBUG 2012-01-03 15:25:05,497 Repositories will be installed from tool shed 'toolshed.g2.bx.psu.edu' into configured tool_path location '../shed_tools'
galaxy.util.shed_util DEBUG 2012-01-03 15:25:05,497 Installing repository 'emboss_datatypes'
galaxy.util.shed_util DEBUG 2012-01-03 15:25:05,497 Cloning http://toolshed.g2.bx.psu.edu/repos/devteam/emboss_datatypes
destination directory: emboss_datatypes
requesting all changes
adding changesets
adding manifests
adding file changes
added 1 changesets with 1 changes to 1 files
updating to branch default
1 files updated, 0 files merged, 0 files removed, 0 files unresolved
galaxy.util.shed_util DEBUG 2012-01-03 15:25:06,181 Updating cloned repository to revision "a89163f31369"
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
galaxy.datatypes.registry DEBUG 2012-01-03 15:25:06,359 Loading datatypes from /Users/gvk/workspaces_2008/shed_tools/toolshed.g2.bx.psu.edu/repos/devteam/emboss_datatypes/a89163f31369/emboss_datatypes/datatypes_conf.xml
galaxy.util.shed_util DEBUG 2012-01-03 15:25:06,401 Adding new row (or updating an existing row) for repository 'emboss_datatypes' in the tool_shed_repository table.
galaxy.tool_shed.install_manager DEBUG 2012-01-03 15:25:06,970 Loading new tool panel section: EMBOSS
galaxy.util.shed_util DEBUG 2012-01-03 15:25:06,971 Installing repository 'emboss_5'
galaxy.util.shed_util DEBUG 2012-01-03 15:25:06,971 Cloning http://toolshed.g2.bx.psu.edu/repos/devteam/emboss_5
destination directory: emboss_5
requesting all changes
adding changesets
adding manifests
adding file changes
added 1 changesets with 113 changes to 113 files
updating to branch default
113 files updated, 0 files merged, 0 files removed, 0 files unresolved
galaxy.util.shed_util DEBUG 2012-01-03 15:25:10,134 Updating cloned repository to revision "b94ca591877b"
0 files updated, 0 files merged, 0 files removed, 0 files unresolved
The following snippet later in our paster log tells us that all of the tools contained in the installed emboss_5 tool shed repository will be loaded into our Galaxy instance's tool panel in the section labeled EMBOSS.
galaxy.tools DEBUG 2012-01-03 15:25:15,826 Appending to section: EMBOSS
The following snippet from our paster log shows us that each of the tools contained in the installed emboss_5 tool shed repository is being loaded.
galaxy.tools DEBUG 2012-01-03 15:25:15,839 Loaded tool: toolshed.g2.bx.psu.edu/repos/devteam/emboss_5/EMBOSS: antigenic1/5.0.0 5.0.0
galaxy.tools DEBUG 2012-01-03 15:25:15,862 Loaded tool: toolshed.g2.bx.psu.edu/repos/devteam/emboss_5/EMBOSS: backtranseq2/5.0.0 5.0.0
galaxy.tools DEBUG 2012-01-03 15:25:15,874 Loaded tool: toolshed.g2.bx.psu.edu/repos/devteam/emboss_5/EMBOSS: banana3/5.0.0 5.0.0
...
The following snippet from our paster log (after the emboss tools are loaded) tells us that we have a new row added to our Galaxy database's tool_shed_repository table for our installed emboss_5 repository.
galaxy.util.shed_util DEBUG 2012-01-03 15:25:17,964 Adding new row (or updating an existing row) for repository 'emboss_5' in the tool_shed_repository table.
When included in the Galaxy distribution, tools are defined by id and version, among other attributes. For example, the Emboss antigenic1 tool has id="EMBOSS: antigenic1" and version="5.0.0". When installed from a tool shed, the tool's id becomes its guid attribute from the tool shed. The Emboss antigenic1 tool's guid is toolshed.g2.bx.psu.edu/repos/devteam/emboss_5/EMBOSS: antigenic1/5.0.0. To provide backward compatibility for Galaxy workflows, a mapping between the tool's old id and version and its new id (guid) must be provided. Galaxy does this by adding a new row to the tool_id_guid_map table in the database for each tool installed using the process described in this section. For the Emboss antigenic1 tool this mapping looks like the following.
Tool id | Version | Guid
EMBOSS: antigenic1 | 5.0.0 | toolshed.g2.bx.psu.edu/repos/devteam/emboss_5/EMBOSS: antigenic1/5.0.0
If a Galaxy workflow was built using a tool from the Galaxy distribution, the workflow defines the tool by its id attribute (in the future the tool version may also be used by the workflow to further define the tool). If the Galaxy development team removed the tool from the Galaxy distribution after the workflow was developed and the tool is installed from the main Galaxy tool shed, the workflow will locate the correct tool using this mapping process.
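The mapping idea can be illustrated with a small in-memory stand-in for the tool_id_guid_map table. This is a sketch under stated assumptions, not Galaxy's implementation: the class ToolIdGuidMap and the helper build_guid are hypothetical names, and the guid layout is inferred only from the single Emboss example shown above.

```python
def build_guid(tool_shed, owner, repository, tool_id, version):
    """Hypothetical helper: compose a guid in the form seen in the example mapping."""
    return "/".join((tool_shed, "repos", owner, repository, tool_id, version))


class ToolIdGuidMap:
    """In-memory stand-in for rows of (tool id, version, guid)."""

    def __init__(self):
        self._rows = []

    def add(self, tool_id, version, guid):
        self._rows.append((tool_id, version, guid))

    def lookup(self, tool_id, version=None):
        # A workflow identifies a tool by its old id; the version is optional
        # here because workflows may only start using it in the future.
        for row_id, row_version, guid in self._rows:
            if row_id == tool_id and (version is None or row_version == version):
                return guid
        return None


guid = build_guid("toolshed.g2.bx.psu.edu", "devteam", "emboss_5", "EMBOSS: antigenic1", "5.0.0")
mapping = ToolIdGuidMap()
mapping.add("EMBOSS: antigenic1", "5.0.0", guid)

# A workflow step that still refers to the old distribution id resolves to the guid.
print(mapping.lookup("EMBOSS: antigenic1"))
```

Looking up an id that was never registered returns None in this sketch, which corresponds to a workflow step whose tool is not installed.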
The following snippet from our paster log tells us that our Galaxy instance has properly mapped the ids to the guids for all of the tools included in the installed emboss_5 tool shed repository.
galaxy.tool_shed.install_manager DEBUG 2012-01-03 15:25:18,743 Mapped tool ids to guids for 107 tools included in repository 'emboss_5'.
The following snippet displayed next in our paster log tells us that Galaxy is loading all of the data types that are contained in the installed emboss_datatypes tool shed repository.
galaxy.datatypes.registry DEBUG 2012-01-03 15:25:18,753 Loading datatypes from /Users/gvk/workspaces_2008/shed_tools/toolshed.g2.bx.psu.edu/repos/devteam/emboss_datatypes/a89163f31369/emboss_datatypes/datatypes_conf.xml
Refreshing our Galaxy tool panel shows us that all of the Emboss 5.0.0 tools have been properly installed and are ready for use.
When we select the Manage installed tool shed repositories option from the left menu panel in the Admin interface we see our 2 installed repositories.
Clicking the View tool id guid map button in the upper right corner displays our list of mappings for each installed tool.
Deactivating and uninstalling tool shed repositories installed into a local Galaxy instance
Those hosting their own Galaxy instances may find it useful to use tools contained in an installed tool shed repository for a period of time, and then remove them from the tool panel either temporarily or permanently. To enable this, Galaxy provides the ability to deactivate or uninstall an installed tool shed repository.
Deactivating an installed tool shed repository results in the following.
- The repository and all of its contents will remain on disk.
- Any contained tools will not be loaded into the Galaxy tool panel.
- Any contained proprietary datatypes, datatype converters and display applications will be eliminated from the datatypes registry.
- The repository record's deleted column in the tool_shed_repository database table will be set to True.
Uninstalling an installed tool shed repository results in the following.
- The repository and all of its contents will be removed from disk.
- If the repository contains tools, their tag sets will be removed from the tool config file in which they are defined.
- Any contained proprietary datatypes, datatype converters and display applications will be eliminated from the datatypes registry.
- The repository record's deleted column in the tool_shed_repository database table will be set to True.
- The repository record's uninstalled column in the tool_shed_repository database table will be set to True.
- If the repository was installed via the Galaxy install manager (this occurs when the repository contains tools that used to be available in the Galaxy distribution, but are now contained in a tool shed repository hosted on the main Galaxy tool shed), all records associated with the repository will be eliminated from the tool_id_guid_map database table.
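The difference between the two operations can be summarized as state changes on a repository record. The sketch below is illustrative only: the deleted and uninstalled fields mirror the columns named above, while on_disk, installed_by_install_manager, and the dictionary used in place of the tool_id_guid_map table are assumptions introduced for the example.

```python
from dataclasses import dataclass


@dataclass
class InstalledRepository:
    """Simplified stand-in for a row in the tool_shed_repository table."""
    name: str
    on_disk: bool = True                        # hypothetical: whether the files remain on disk
    deleted: bool = False                       # column named in the text
    uninstalled: bool = False                   # column named in the text
    installed_by_install_manager: bool = False  # hypothetical flag


def deactivate(repo):
    # Files stay on disk; the record is only flagged as deleted.
    repo.deleted = True


def uninstall(repo, guid_map):
    # Files are removed and both flags are set.
    repo.on_disk = False
    repo.deleted = True
    repo.uninstalled = True
    # Mapping records are dropped only for repositories that were
    # installed by the install manager.
    if repo.installed_by_install_manager:
        guid_map.pop(repo.name, None)


# Hypothetical mapping store keyed by repository name.
guid_map = {"emboss_5": ["toolshed.g2.bx.psu.edu/repos/devteam/emboss_5/EMBOSS: antigenic1/5.0.0"]}
add_value = InstalledRepository("add_value")
emboss_5 = InstalledRepository("emboss_5", installed_by_install_manager=True)

deactivate(add_value)
uninstall(emboss_5, guid_map)
print(add_value)
print(emboss_5)
```

In this model a deactivated repository can be reactivated by clearing a single flag, whereas an uninstalled one has to be fetched again, which is why the two cases are handled differently when restoring a repository.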
Let's take a look at how this works. Assume you are the administrator of a local Galaxy instance that includes the following tools and tool sections in the tool panel. The Get Data section includes the upload tool included in the Galaxy distribution. You manually installed the Add column tool from a local Galaxy tool shed, placing it outside of any sections in the tool panel. The Mothur tool section contains the Mothur tool suite and the two repos tool section contains tools from two repositories that you manually installed from a local Galaxy tool shed (the Grinder and Blast2GO tools). The EMBOSS tool section contains the Emboss version 5.0.0 tools that used to be included in the Galaxy distribution, but are now hosted on the main Galaxy tool shed. You configured your Galaxy instance to automatically install the repository containing these tools and the repository containing the Emboss datatypes using Galaxy's new install manager.
Since your Galaxy instance includes several installed tool shed repositories, the Administration menu will include a link labeled Manage installed tool shed repositories in the menu's Server section. Clicking on that link will display a page like the following.
Each of the installed tool shed repository links includes a pop-up menu just to the right of the repository name (the downward pointing triangle). The pop-up menu includes an option labeled Deactivate or uninstall.
Clicking the Deactivate or uninstall option for the installed repository named add_value will display the following page. Notice the check box allowing you to deactivate the repository if it's left blank or uninstall the repository if it's checked.
Since the add_value repository includes only tools and no proprietary datatypes, deactivating the repository will set the repository record's deleted column in the tool_shed_repository database table to True, and keep the tool from being displayed in the tool panel. Clicking the Deactivate or uninstall button will display the following page. Notice that the add_value repository is no longer displayed in the list of installed repositories.
And the Add column tool is no longer displayed in the Galaxy tool panel.
Let's try uninstalling the blast2go repository - notice we've checked the check box here.
After uninstalling the repository, the following page is displayed. The repository files have been removed from disk, and the XML tag set for the Blast2GO tool has been removed from the tool config file.
Inspecting the tool panel shows us that the Blast2GO tool is no longer included in the two repos tool section.
How can we reinstall a repository that we've uninstalled? From the Administration Manage installed tool shed repositories page, click the deleted option within the Advanced search feature...
...and the list of repositories that you have deactivated or uninstalled will be displayed.
The pop-up menus on this page allow you to activate or reinstall the repositories.
Reinstalling uninstalled repositories that were originally installed by the Galaxy install manager is handled in a special way. To show how this works, assume you've uninstalled the emboss_5 repository so that it is included in the list on the page displayed when you click the deleted option within the Advanced search feature. Attempting to reinstall the repository using the Activate or reinstall option on the repository's pop-up menu will display the following message. To reinstall this repository, you'll need to make sure the required settings in your universe_wsgi.ini file are correct, and then restart your Galaxy server. Review the three previous sections in this document for an explanation of how this works.

