Galaxy Tool XML File
The XML File for a Galaxy tool, generally referred to as the "tool config file", serves a number of purposes. First, it lays out the user interface for the tool ( e.g., form fields, text, help, etc. ). Second, it provides the glue that links your tool to Galaxy by telling Galaxy how to invoke it, what options to pass, and what files it will produce as output. It would be best to take some time to browse through the various tool configs ( files with a .xml extension ) in the /tools subdirectories of your local Galaxy instance as you read this document.
Pay attention to the following when creating a new tool:
- Make sure your XML is valid - Improper XML will most likely cause Galaxy to not load your tool. The easiest way to validate your XML is just to open the XML file itself in Firefox, which will either parse the file and display it, or show the error and its location in large letters.
- Don't forget to restart Galaxy - Galaxy loads and parses tool XML files at startup, which means you'll have to restart it after updating any XML files. The same does not apply if you only update an executable.
- Use the -file_strandCol options - Using interval files is more of a pain than using BED files, because the column locations are variable. But by including command line options in your executable and passing them the "automagic" column variables, you can easily handle interval formats. Plus, since BED formats are treated internally as intervals, you don't have to worry about figuring out which one your program is being passed. Everything will be provided as an interval file. Read more about this in the "command line" tag section below.
- Make sure your parameter names match your command-line variables - Galaxy will populate your command line options from the parameters selected by the user when the tool executes. Form field values are mapped to parameters in the command line by name, such that form field name <-> $parameter name in the command line.
- Provide tool tips and other help - useful for those that will use your tool. The help section provides information on how to use the tool.
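As a rough illustration of the points above, a minimal tool config might look like the following sketch. The tool id, the script name example_sort.py, and its -i/-o/-c flags are hypothetical; they are chosen only to show how each <param> and <data> name reappears as a $variable in the command line.

```xml
<!-- Hypothetical tool: names and flags are illustrative assumptions -->
<tool id="example_sort" name="Example Sort" version="1.0.0">
  <description>sorts a dataset on one column</description>
  <!-- $input, $column and $out_file1 match the name attributes below -->
  <command interpreter="python">example_sort.py -i $input -o $out_file1 -c $column</command>
  <inputs>
    <param name="input" type="data" format="tabular" label="Dataset to sort"/>
    <param name="column" type="data_column" data_ref="input" label="on column"/>
  </inputs>
  <outputs>
    <data name="out_file1" format="tabular"/>
  </outputs>
  <help>Sorts the selected dataset on the chosen column.</help>
</tool>
```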
A Galaxy tool's config file consists of a subset of the following XML tag sets - each of these is described in detail in the following sections.
Contents
- Galaxy Tool XML File
- Details of XML tag sets
- <tool> tag set
- <description> tag set
- <version_command> tag set
- <command> tag set
- <inputs> tag set
- <repeat> tag set
- <conditional> tag set
- <when> tag set
- <param> tag set
- <validator> tag set
- <option> tag set
- <options> tag set
- <column> tag set
- <filter> tag set
- <request_param_translation> tag set
- <request_param> tag set
- <append_param> tag set
- <value> tag set
- <value_translation> tag set
- <value> tag set
- <sanitizer> tag set
- <valid> tag set
- <add> and <remove> tag set
- <mapping> tag set
- <add> and <remove> tag set
- <configfiles> tag set
- <configfile> tag set
- <outputs> tag set
- <data> tag set
- <change_format> tag set
- <when> tag set ( change_format )
- <actions> tag set
- <tests> tag set
- <test> tag set
- <param> tag set (functional tests)
- <output> tag set (functional tests)
- <assert_contents> tag set (functional tests)
- <page> tag set
- <code> tag set
- <requirements> tag set
- <requirement> tag set
- <help> tag set
Details of XML tag sets
<tool> tag set
The outer-most tag set
| attribute | values | details | required | example |
|---|---|---|---|---|
| id | a string * | Must be unique across all tools; should be lowercase and contain only letters, numbers, and underscores. It allows for tool versioning and metrics of the number of times a tool is used, among other things. | yes | id="sort1" |
| name | a string | This string is what is displayed as a hyperlink in the tool menu | yes | name="Sort" |
| version | a string | This string defaults to "1.0.0" if it is not included in the tag. It allows for tool versioning and should be changed with each new version of the tool. | no | version="1.0.1" |
| hidden | true, false | Allows for tools to be loaded upon server startup, but not displayed in the tool menu | no | hidden="true" |
| tool_type | data_source | Allows for certain framework functionality to be performed on certain types of tools. This is currently only used in "data_source" tools, but will undoubtedly be used with other tools in the future. | no | tool_type="data_source" |
| URL_method | get, post | Only if "tool_type" attribute value is "data_source" - defines the HTTP request method to use when communicating with an external data source application ( the default is "get" ). | no | URL_method="post" |
Example
The following is an example that contains all of the attributes described above.
<tool id="ucsc_table_direct1" name="UCSC Main" version="1.0.0" hidden="false" tool_type="data_source" URL_method="post">
<description> tag set
The text content of this tag set is displayed in the tool menu immediately following the hyperlink for the tool ( based on the "name" attribute of the <tool> tag set described above ).
Example
<description>table browser</description>
<version_command> tag set
Specifies the command to be run in order to get the tool's version string. The resulting value will be found in the "Info" field of the history dataset. For example:
<version_command>tophat -version</version_command>
<command> tag set
This tag specifies how Galaxy should invoke your tool's executable, passing its required input parameter values ( the command line specification links the parameters supplied in the form with the actual tool executable ). Anything inside it preceded by a dollar sign ($) will be treated as a variable whose values can be acquired from one of three sources: parameters, metadata, or output files.
| attribute | values | details | required | example |
|---|---|---|---|---|
| interpreter | python, perl, bash, etc | This attribute defines the programming language in which the tool executable file is written. Any language can be used ( tools can be written in Python, C, Perl, Java, etc. ) This attribute can be eliminated for compiled (binary) executables. | no ( unless executable is interpreted ) | interpreter="python" |
Example
The following uses a compiled executable ( see the various tool config files in ~/tools/emboss_5 for examples of using compiled executables ).
<command>backtranseq -sequence $input1 -outfile $out_file1 -cfile $cfile -osformat2 $out_format1 -auto</command>
Example
The following uses an interpreted executable. The values of the $<variables> ( e.g., $input ) are acquired from form field parameters in the tool form ( see ~/tools/filters/sorter.xml for an example of using an interpreted executable ).
<command interpreter="python">sorter.py -i $input -o $out_file1 -cols $column -order $order -style $style</command>
Example
The values of the ${<variables>} ( e.g., ${input.metadata.chromCol} ) are acquired from the Metadata associated with the objects selected as the values of each of the relative form field parameters in the tool form. Accessing this information is generally enabled using the following feature components:
- A set of "metadata information" is defined for each supported data type ( see the _MetadataElement_ objects in the various data types classes in ~/lib/galaxy/datatypes ).
- The _DatasetFilenameWrapper_ class in the ~/lib/galaxy/tools/__init__.py code file wraps a Metadata Collection to return Metadata parameters wrapped according to the Metadata spec.
Galaxy automatically fills in a few reserved variables, which are listed under "Reserved Variables" below.
Also note the reserved parameter name GALAXY_DATA_INDEX_DIR - it points to the ~/tool-data directory.
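A command line that reads interval column metadata might look like the sketch below. The script name and its flags are hypothetical; chromCol, startCol, endCol, and strandCol are assumed to be the metadata names defined for interval datasets, and the last argument shows the reserved GALAXY_DATA_INDEX_DIR name.

```xml
<!-- Hypothetical script and flags; only the ${...} lookups are the point here -->
<command interpreter="python">
  my_interval_tool.py -i $input -o $out_file1
    --chrom-col ${input.metadata.chromCol}
    --start-col ${input.metadata.startCol}
    --end-col ${input.metadata.endCol}
    --strand-col ${input.metadata.strandCol}
    --index-dir ${GALAXY_DATA_INDEX_DIR}
</command>
```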
Reserved Variables
Galaxy provides a few pre-defined variables which can be used in your command line, even though they don't appear in your tool's parameters.
| name | description |
|---|---|
| $__new_file_path__ | universe_wsgi.ini new_file_path value |
| $__tool_data_path__ | universe_wsgi.ini tool_data_path value |
| $__root_dir__ | Top-level Galaxy source directory made absolute via os.path.abspath() |
| $__datatypes_config__ | universe_wsgi.ini datatypes_config value |
| $__user_id__ | User's numeric ID (id column of galaxy_user table in the database) |
| $__user_email__ | User's email address |
| $__app__ | The galaxy.app.UniverseApplication instance, gives access to all other configuration file variables (e.g. $__app__.config.output_size_limit). Should be used as a last resort, may go away in future releases. |
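The reserved variables in the table above are used like any other command line variable. A minimal sketch, assuming a hypothetical script my_tool.py with hypothetical --tmp-dir and --user flags:

```xml
<!-- my_tool.py and its flags are illustrative assumptions -->
<command interpreter="python">
  my_tool.py --input $input --output $out_file1
    --tmp-dir $__new_file_path__
    --user $__user_email__
</command>
```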
<inputs> tag set
Consists of all tag sets that define the tool's input parameters. Each <param> tag within the <inputs> tag set maps to a command line parameter within the <command> tag set described above.
<repeat> tag set
This is a container for any tag sets that can be contained within the <inputs> tag set. When this is used, the tool will allow the user to add any number of additional sets of the contained parameters ( an "Add new <title>" button will be displayed on the tool form ). See ~/tools/plotting/xy_plot.xml for an example of how to use this tag set.
| attribute | values | details | required | example |
|---|---|---|---|---|
| name | a string | The name of the repeat section | yes | name="series" |
| title | a string | The title of the repeat section, which will be displayed on the tool form | yes | title="Series" |
| min | an integer | The minimum number of repeat units | no | min="1" |
| max | an integer | The maximum number of repeat units | no | max="5" |
Example
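A repeat section could be sketched as follows. This is not copied from xy_plot.xml; the parameter names (input, xcol, ycol) are assumptions chosen to match the series variables used in the <configfile> example later in this document.

```xml
<!-- Each click on "Add new Series" adds another copy of these three parameters -->
<repeat name="series" title="Series" min="1">
  <param name="input" type="data" format="tabular" label="Dataset"/>
  <param name="xcol" type="data_column" data_ref="input" numerical="true" label="Column for x axis"/>
  <param name="ycol" type="data_column" data_ref="input" numerical="true" label="Column for y axis"/>
</repeat>
```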
<conditional> tag set
See ~/tools/maf/interval2maf.xml for an example of how to use this tag set. This is a container for conditional parameters in the tool ( must contain <when> tag sets ) - the command line is wrapped in an if-else statement.
| attribute | values | details | required | example |
|---|---|---|---|---|
| name | any string | The name of the conditional parameter | yes | name="maf_source_type" |
Example
Select the alignment target database ( a Galaxy cached genome build or a dataset in the history ). Note the different input variables in the command lines.
<command interpreter="python">
  #if $source.source_select=="database" #blat_wrapper.py 0 $source.dbkey $input_query $output1 $iden $tile_size $one_off
  #else #blat_wrapper.py 1 $source.input_target $input_query $output1 $iden $tile_size $one_off
  #end if
</command>

<conditional name="source">
  <param name="source_select" type="select" label="Target source">
    <option value="database">Genome Build</option>
    <option value="input_ref">Your Upload File</option>
  </param>
  <when value="database">
    <param name="dbkey" type="genomebuild" label="Genome" />
  </when>
  <when value="input_ref">
    <param name="input_target" type="data" format="fasta" label="Reference sequence" />
  </when>
</conditional>
<when> tag set
Contained within the <conditional> tag set - each <when> tag set contains a set of input parameters, and the conditional variables are usually defined within <option> tag sets.
| attribute | values | details | required | example |
|---|---|---|---|---|
| value | a possible conditional value | This tag set will be used when the value of the containing conditional parameter equals this attribute value | yes | value="user" |
Example
This example provides details for how to choose the MAF source file, either locally cached data or a history item ( there are two options: "cached" or "user" ). If a user selects "Alignments in Your History", a variable of type "data" will be generated. If the user selects "Locally Cached Alignments", a drop-down selection menu will be generated according to entries contained in the file "maf_index.loc", which is stored in the ~/tool-data directory.
<command>
  #if $maf_source_type.maf_source == "user"
    your_program $maf_source_type.maf_file
  #else
    your_program $maf_source_type.maf_identifier
  #end if
</command>

<inputs>
  <conditional name="maf_source_type">
    <param name="maf_source" type="select" label="MAF Source">
      <option value="cached" selected="true">Locally Cached Alignments</option>
      <option value="user">Alignments in Your History</option>
    </param>
    <when value="user">
      <param name="maf_file" type="data" format="maf" label="MAF File" />
    </when>
    <when value="cached">
      <param name="maf_identifier" type="select" label="MAF Type" >
        <options from_file="maf_index.loc">
          <column name="name" index="0"/>
          <column name="value" index="1"/>
        </options>
      </param>
    </when>
  </conditional>
</inputs>
<param> tag set
Contained within the <inputs> tag set - each of these specifies a field that will be displayed on the tool form. Ultimately, the values of these form fields will be passed as the command line parameters to the tool's executable.
| attribute | values | details | required | example |
|---|---|---|---|---|
| name | a string * | * Attribute values must map to each command line parameter name. "Reserved" names are: REDIRECT_URL, DATA_URL, GALAXY_URL. | yes | name="input" |
| type | text, integer, float, boolean, genomebuild, select, data_column, hidden, baseurl, file, data, drill_down | The list of supported parameter types is in the parameter_types dictionary in ~/lib/galaxy/tools/parameters/basic.py. To create a multi-line text box add an 'area=True' attribute to the param tag. For example: <param name="foo" type="text" area="True" size="5x25" /> | yes | type="data" |
| label | a string | The attribute value will be displayed on the tool page as the label of the form field | no | label="Sort Query" |
| help | a string | Rendered on the tool form just below the associated field to provide information about the field | no | help="No data? See tip below" |
| value | a string | Default value for the form field | no | value="0" |
| optional | true, false | If false, parameter must have a value | no | optional="false" |
| min | a number | minimum parameter value; only valid when type is "integer" or "float" | no | min="0" |
| max | a number | maximum parameter value; only valid when type is "integer" or "float" | no | max="600000" |
| format | a string * | * Only if "type" attribute value is "data" - the list of supported data formats is contained in the ~/datatypes_conf.xml.sample file | no * | format="tabular" |
| data_ref | attribute value of the input dataset * | * Only if "type" attribute value is "select" - used with select lists whose options are dynamically generated based on certain metadata attributes of the dataset upon which this parameter depends ( usually but not always the tool's input dataset ) | no | data_ref="input" |
| force_select | true, false | Only if "type" attribute value is "select" - force user to select an option in the list | no | force_select="true" |
| display | checkboxes, radio | Only if "type" attribute value is "select" - render a select list as a set of check boxes or radio buttons. Defaults to a drop-down menu select list. | no | display="checkboxes" |
| multiple | true, false | Only if "type" attribute value is "select" - render a multi-select list | no | multiple="true" |
| numerical | true, false | Only if "type" attribute value is "data_column" - builds a dynamically generated select list of numerical or string data | no | numerical="true" |
| hierarchy | exact, recurse | Only if "type" attribute value is "drill_down" | no | hierarchy="recurse" |
| checked | yes, true, on | Only if "type" attribute value is "boolean" | no | checked="true" |
| truevalue | a string | Only if "type" attribute value is "boolean" | no | truevalue="-p" |
| falsevalue | a string | Only if "type" attribute value is "boolean" | no | falsevalue="-q" |
| size | an appropriate number | Only if "type" attribute value is "text" | no | size="4" |
Example
The following will find all "coordinate interval files" contained within the current history and dynamically populate a select list with them. If they are selected, their destination and internal file name will be passed to the appropriate command line variable. Presto! Automatic temporary file management.
<param name="interval_file" type="data" format="interval">
<label>near intervals in</label>
</param>
Example
The following will create a select list containing the options "Downstream" and "Upstream". Depending on the selection, a "d" or "u" value will be passed to the $upstream_or_down variable on the command line.
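A select list matching this description could be sketched as below. The parameter name follows the $upstream_or_down variable mentioned above; the label text is an assumption.

```xml
<param name="upstream_or_down" type="select" label="Get">
  <!-- "u" or "d" is what reaches $upstream_or_down on the command line -->
  <option value="u">Upstream</option>
  <option value="d">Downstream</option>
</param>
```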
Example
Sometimes you need labels for data or graph axes, chart titles, etc. This can be done using a text field. The following will create a text box 30 characters wide with the default value of "V1".
<param name="xlab" size="30" type="text" value="V1" label="Label for x axis"/>
Example
The following will create a text box 4 characters wide with the default value of 1 and will restrict values entered to integers:
<param name="region_size" size="4" type="integer" value="1"><label>flanking regions of size</label></param>
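The boolean attributes from the table above (checked, truevalue, falsevalue) have no example here, so the following sketch may help. The parameter name and label are hypothetical; the -p and -q values are the ones shown in the attribute table.

```xml
<!-- When checked, $paired expands to "-p" on the command line; otherwise to "-q" -->
<param name="paired" type="boolean" checked="true" truevalue="-p" falsevalue="-q" label="Treat input as paired"/>
```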
<validator> tag set
See ~/tools/annotation_profiler/annotation_profiler.xml for an example of how to use this tag set. This tag set is contained within the <param> tag set - it applies a validator to the containing parameter.
| attribute | values | details | required | example |
|---|---|---|---|---|
| type | expression, regex, in_range, length, metadata, unspecified_build, no_options, empty_field, dataset_metadata_in_file, dataset_ok_validator | The list of supported validators is in the validator_types dictionary in ~/lib/galaxy/tools/parameters/validation.py | yes | type="dataset_metadata_in_file" |
| message | a string | The message displayed on the tool form if validation fails | no | message="Sequences are not currently available for the specified build" |
| filename | the name of a file stored locally | The file contains values for validation | no | filename="alignseq.loc" |
| metadata_name | a valid metadata attribute name * | * The metadata attribute name | no | metadata_name="dbkey" |
| metadata_column | a number * | * The column index in the file containing the values for validation | no | metadata_column="0" |
| message | any string * | * An error message when input data has values that do not pass validation | no | message="Error on dbkey" |
| line_startswith | a string * | * Lines in the file being used for validation start with this attribute value | no | line_startswith="seq" |
| min | a number * | * Only when the "type" attribute value is "in_range" - the minimum number allowed | no | min="0" |
| max | a number * | * Only when the "type" attribute value is "in_range" - the maximum number allowed | no | max="50" |
Example
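A range check is probably the simplest validator to illustrate. The sketch below uses the in_range type with the min and max attributes from the table above; the parameter name, label, and message text are assumptions.

```xml
<param name="min_score" type="integer" value="10" label="Minimum score">
  <!-- Rejects values outside 0..50 and shows the message on the tool form -->
  <validator type="in_range" min="0" max="50" message="Score must be between 0 and 50"/>
</param>
```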
Example
The genome build of the dataset must be stored in Galaxy clusters, and the name of the genome ("dbkey") must be one of the values in the column of alignseq.loc selected by metadata_column, considering only lines that start with "seq".
<validator type="dataset_metadata_in_file" filename="alignseq.loc" metadata_name="dbkey" metadata_column="1" message="Sequences are not currently available for the specified build." split=" " line_startswith="seq" />
<option> tag set
See ~/tools/filters/sorter.xml for an example of how to use this tag set. This tag set is optionally contained within the <param> tag when the "type" attribute value is "select" ( used for statically generated select lists ).
| attribute | values | details | required | example |
|---|---|---|---|---|
| value | a string | The value to be passed in the command line | yes | value="0" |
| selected | true | The option selected as the default when the form is initially refreshed | no | selected="true" |
Example
<param name="col" type="select" label="From">
  <option value="0" selected="true">Column 1 / Sequence name</option>
  <option value="1">Column 2 / Source</option>
  <option value="2">Column 3 / Feature</option>
  <option value="6">Column 7 / Strand</option>
  <option value="7">Column 8 / Frame</option>
</param>
<options> tag set
See ~/tools/extract/liftOver_wrapper.xml for an example of how to use this tag set. This tag set is optionally contained within the <param> tag when the "type" attribute value is "select" ( used for dynamically generated select lists ). This tag set dynamically creates a list of options whose values can be obtained from a predefined file stored locally or a dataset selected from the current history.
| attribute | values | details | required | example |
|---|---|---|---|---|
| from_dataset | the attribute name of the input dataset in the tool config | The options for the select list are dynamically obtained from input dataset selected for the tool from the current history | no | from_dataset="input1" |
| from_file | the name of a file contained in the ~/tool-data directory | The options for the select list are dynamically obtained from a file | no | from_file="alignseq.loc" |
| from_data_table | the name of a table named in tool_data_table_conf.xml | The options for the select list are dynamically obtained from a file specified in tool_data_table_conf.xml | no | from_data_table="bowtie_indexes" |
| from_parameter | a valid parameter name | The options for the select list are dynamically obtained from a parameter | no | from_parameter="tool.app.datatypes_registry.upload_file_formats" |
Example
Select a database that is pre-formatted and cached in Galaxy clusters. When a new dataset is available, it will be added to the local file named "blastdb.loc" and included in the options of the select list. For a local instance, the file ("blastdb.loc" or "alignseq.loc") must be stored in the ~/tool-data directory. In this example, the option names and values are taken from column 0 of the file.
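A select list of this kind might be written as in the following sketch. The parameter name and label are assumptions; both columns point at index 0 because the names and values are described as coming from column 0 of blastdb.loc.

```xml
<param name="database" type="select" label="Database">
  <options from_file="blastdb.loc">
    <!-- Displayed name and submitted value both come from column 0 -->
    <column name="name" index="0"/>
    <column name="value" index="0"/>
  </options>
</param>
```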
Example
Show all of the species that are available in the dataset selected for the parameter named "input1".
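One way to express this is with a data_meta filter that reads the species metadata of the dataset chosen for input1. This is a sketch: the parameter name and label are hypothetical, and it assumes the selected dataset's datatype defines a "species" metadata attribute.

```xml
<param name="species" type="select" label="Species" multiple="true">
  <options>
    <!-- Options are built from the "species" metadata of the dataset selected as input1 -->
    <filter type="data_meta" ref="input1" key="species"/>
  </options>
</param>
```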
Example
Select datasets that are available in both the dataset selected for the parameter named "input1" and the binned_scores.loc file locally stored in the ~/tool-data directory.
<param name="datasets" type="select" label="Available datasets" display="radio">
  <options from_file="binned_scores.loc">
    <column name="name" index="1"/>
    <column name="value" index="2"/>
    <column name="dbkey" index="0"/>
    <filter type="data_meta" ref="input1" key="dbkey" column="0" />
  </options>
</param>
from_data_table
Basically, there are 3 steps to using from_data_table:
1. Modify tool_data_table_conf.xml to specify:
- The bowtie data table
- The column types in the loc file
- When defining column names in data tables, avoid special characters (e.g., hyphens). This allows simplified access to additional fields of a parameter value, for example when building the command line: if a 'path' value needs to be accessed, this can be done with the syntax ${param.fields.path}.
2. Create or modify the loc file so that it corresponds to the column types specified in tool_data_table_conf.xml. In this example the loc file itself doesn't have to be changed, although the column specification is changed to <columns>value, dbkey, name, path</columns>.
3. Modify the Bowtie wrapper to use the data table instead of the loc file directly, replacing the <options> tag set that reads the loc file with this:
<options from_data_table="bowtie_indexes"/>
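For step 1, the entry in tool_data_table_conf.xml might look roughly like the sketch below. The table name and column list come from the steps above; the loc file path is an assumption and should be adjusted to wherever the file lives in your instance.

```xml
<!-- Sketch of a tool_data_table_conf.xml entry; the file path is an assumption -->
<tables>
  <table name="bowtie_indexes">
    <!-- Column names in loc-file order; avoid hyphens so ${param.fields.path} works -->
    <columns>value, dbkey, name, path</columns>
    <file path="tool-data/bowtie_indices.loc"/>
  </table>
</tables>
```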
Example
Select a reference genome that is indexed for bowtie. (see ~/tools/sr_mapping/bowtie_wrapper.xml)
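The following is a sketch rather than an excerpt from bowtie_wrapper.xml. It assumes a parameter named index and a hypothetical wrapper script, and shows how a data table column such as path can be reached through the fields syntax described above.

```xml
<!-- my_mapper.py and its flags are hypothetical -->
<command interpreter="python">my_mapper.py --index ${index.fields.path} --reads $input --out $output</command>

<param name="index" type="select" label="Select a reference genome">
  <options from_data_table="bowtie_indexes"/>
</param>
```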
<column> tag set
Optionally contained within an <options> tag set - displays columns of values from a file stored locally or a dataset in the current history.
| attribute | values | details | required | example |
|---|---|---|---|---|
| name | a string * | The valid name of the desired column | yes | name="value" |
| index | a number * | The index of the column in the referenced file or history item | yes | index="0" |
Example
Show options from the dataset in the current history that has been selected as the value of the parameter named "input1".
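A sketch of such an options block follows. The parameter name and label are hypothetical, and taking both the name and the value from column 0 is an assumption about the layout of the input1 dataset.

```xml
<param name="col_value" type="select" label="Value">
  <options from_dataset="input1">
    <!-- Read the option name and value from the first column of the selected dataset -->
    <column name="name" index="0"/>
    <column name="value" index="0"/>
  </options>
</param>
```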
<filter> tag set
Optionally contained within an <options> tag set - filter out values obtained from a locally stored file or a dataset in the current history.
| attribute | values | details | required | example |
|---|---|---|---|---|
| type | data_meta, param_value, static_value, unique_value, multiple_splitter, add_value, sort_by | The list of valid filter types is contained in the "filter_types" dictionary in the ~/lib/galaxy/tools/parameters/dynamic_options.py file | yes | type="data_meta" |
| name | a string | The name of the filter | yes | name="dbkey" |
| column | a number | The column index within the file that contains the values to be filtered | yes | column="1" |
| ref | a string * | * the attribute name of the reference file or input dataset | no | ref="input1" |
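| key | a string * | * The name of the metadata attribute to read from the referenced dataset ( e.g., for "data_meta" filters ) | no | key="species" |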
| separator | a string * | * The column separator of the reference file or dataset | no | separator=";" |
Example
Filter values in dataset "input".
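As one possible illustration, the sketch below reads options from the dataset selected for the parameter named "input" and removes duplicate values with a unique_value filter. The filter name and the choice of column 0 are assumptions.

```xml
<options from_dataset="input">
  <column name="name" index="0"/>
  <column name="value" index="0"/>
  <!-- Keep only one option per distinct value found in column 0 -->
  <filter type="unique_value" name="unique" column="0"/>
</options>
```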
Example
Show the options contained in the file "encode_datasets.loc", which is locally stored in the ~/tool-data directory, restricted to rows whose encode_group is "ALD" and whose dbkey is "hg17".
<options from_file="encode_datasets.loc">
  <column name="name" index="2"/>
  <column name="value" index="3"/>
  <column name="dbkey" index="1"/>
  <column name="encode_group" index="0"/>
  <column name="uid" index="3"/>
  <filter type="static_value" name="encode_group" value="ALD" column="0"/>
  <filter type="static_value" name="dbkey" value="hg17" column="1"/>
</options>
<request_param_translation> tag set
See ~/tools/data_source/ucsc_tablebrowser.xml for an example of how to use this tag set. This tag set is used only in "data_source" tools ( the "tool_type" attribute value is "data_source" ). This tag set is contained within the <param> tag set - it contains a set of <request_param> tags.
<request_param> tag set
Contained within the <request_param_translation> tag set ( used only in "data_source" tools ) - the external data source application may send back parameter names like "GENOME" which must be translated to "dbkey" in Galaxy.
| attribute | values | details | required | example |
|---|---|---|---|---|
| galaxy_name | URL, dbkey, organism, table, description, name, info, data_type | Each of these maps directly to a remote_name value | yes | galaxy_name="URL" |
| remote_name | a string * | * The string representing the name of the parameter in the remote data source | yes | remote_name="URL" |
| missing | a string | The default value to use for galaxy_name if the remote_name parameter is not included in the request | yes | missing="" |
Example
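A translation block could be sketched as follows. The remote name "GENOME" is the illustrative name used in the description above, not the name used by any particular data source, and the missing values are assumptions.

```xml
<request_param_translation>
  <request_param galaxy_name="URL" remote_name="URL" missing=""/>
  <!-- The remote application calls it GENOME; Galaxy stores it as dbkey -->
  <request_param galaxy_name="dbkey" remote_name="GENOME" missing="?"/>
</request_param_translation>
```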
<append_param> tag set
Optionally contained within the <request_param> tag set if galaxy_name="URL" - some remote data sources ( e.g., Gbrowse, Biomart ) send parameters back to Galaxy in the initial response that must be added to the value of "URL" prior to Galaxy sending the secondary request to the remote data source via URL.
| attribute | values | details | required | example |
|---|---|---|---|---|
| separator | a string * | * The text to use to join the requested parameters together | yes | separator="&" |
| first_separator | a string * | * The text to use to join the request_param parameters to the first requested parameter | no | first_separator="?" |
| join | a string * | * The text to use to join the param name to its value | yes | join="=" |
<value> tag set
Contained within the <append_param> tag set - allows for appending a param name / value pair to the value of URL.
| attribute | values | details | required | example |
|---|---|---|---|---|
| name | a string * | * Any valid HTTP request parameter name. The name / value pair must be received from the remote data source and will be appended to the value of URL as something like "&_export=1" | yes | name="_export" |
| missing | a string * | * Must be a valid HTTP request parameter value | yes | missing="1" |
Example
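Putting the <append_param> and <value> attributes together might look like the sketch below. The _export name and its missing value are taken from the tables above; note that a literal ampersand has to be written as &amp;amp; inside an XML attribute.

```xml
<request_param galaxy_name="URL" remote_name="URL" missing="">
  <!-- Appends something like "?_export=1" (or "&_export=1") to the URL value -->
  <append_param separator="&amp;" first_separator="?" join="=">
    <value name="_export" missing="1"/>
  </append_param>
</request_param>
```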
<value_translation> tag set
Optionally contained within the <request_param> tag set - the parameter value received from a remote data source may be named differently in Galaxy, and this tag set allows for the value to be appropriately translated.
<value> tag set
Contained within the <value_translation> tag set - allows for changing the data type value to something supported by Galaxy.
| attribute | values | details | required | example |
|---|---|---|---|---|
| galaxy_value | a supported data type * | * The target value. e.g. For setting data format: the list of supported data formats is contained in the ~/datatypes_conf.xml.sample file | yes | galaxy_value="tabular" |
| remote_value | a string * | * The value supplied by the remote data source application | yes | remote_value="selectedFields" |
Example
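A value translation could be sketched like this. The remote parameter name outputType is hypothetical; the galaxy_value and remote_value pair is the one shown in the attribute table above.

```xml
<!-- "outputType" is a hypothetical remote parameter name -->
<request_param galaxy_name="data_type" remote_name="outputType" missing="tabular">
  <value_translation>
    <!-- The remote application sends "selectedFields"; Galaxy records the format as tabular -->
    <value galaxy_value="tabular" remote_value="selectedFields"/>
  </value_translation>
</request_param>
```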
<sanitizer> tag set
See ~/tools/filters/grep.xml for an example of how to use this tag set. This tag set is used to replace the basic parameter sanitization with custom directives. This tag set is contained within the <param> tag set - it contains a set of <valid> and <mapping> tags.
| property | values | details | required | example |
|---|---|---|---|---|
| sanitize | True or False | is this parameter sanitized | no, default is True | <sanitizer sanitize="True"/> |
| invalid_char | string | character to replace invalid characters with | no, default is X | <sanitizer invalid_char="~"/> |
<valid> tag set
Contained within the <sanitizer> tag set. Used to specify a list of allowed characters. Contains <add> and <remove> tags.
| property | values | details | required | example |
|---|---|---|---|---|
| initial | string | initial characters to allow | no, default is string.letters + string.digits +" -=_.()/+*^,:?!" | <valid initial="none"> |
<add> and <remove> tag set
Contained within the <valid> tag set. Used to add or remove individual characters or preset lists of characters. Character must not be allowed as a valid input for the mapping to occur. Preset lists include default and none as well as those available from string.* (e.g. string.printable).
| attribute | values | details | required | example |
|---|---|---|---|---|
| preset | none, default or available from string | Add or Remove these characters from the list of valid characters | no | <add preset="string.printable"/> or <remove preset="string.printable"/> |
| value | a character | A character to add or remove from the list of valid characters | no | <remove value="&quot;"/> or <add value="&quot;"/> |
<mapping> tag set
Contained within the <sanitizer> tag set. Used to specify a mapping of disallowed character to replacement string. Contains <add> and <remove> tags.
| property | values | details | required | example |
|---|---|---|---|---|
| initial | string | initial character mapping | no, default is galaxy.util.mapped_chars | <mapping initial="none"> |
<add> and <remove> tag set
Contained within the <mapping> tag set. Used to add or remove individual character mappings or preset lists of characters. A character must not be allowed as a valid input for its mapping to occur. Preset lists include default and none as well as those available from string.* (e.g. string.printable).
| attribute | values | details | required | example |
|---|---|---|---|---|
| source | a character | Replace all occurrences of this character with the string of target | no | <add source="&quot;" target="\&quot;"/> or <remove source="&quot;"/> |
| target | a string | Replace all occurrences of source with this string | no | <add source="&quot;" target="\&quot;"/> |
Example
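A sanitizer that combines <valid> and <mapping> might be sketched as follows. The parameter name, label, and replacement string are assumptions; the idea is to allow all printable characters except the single quote, which is replaced by a placeholder string instead.

```xml
<param name="pattern" type="text" size="40" value="^chr" label="Pattern to match">
  <sanitizer>
    <!-- Start from all printable characters, then disallow the single quote -->
    <valid initial="string.printable">
      <remove value="&apos;"/>
    </valid>
    <!-- Start with no mappings, then map the disallowed quote to a placeholder -->
    <mapping initial="none">
      <add source="&apos;" target="__sq__"/>
    </mapping>
  </sanitizer>
</param>
```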
<configfiles> tag set
See ~/tools/maf/maf_filter.xml for an example of how this tag set is used in a tool. This tag set is a container for <configfile> tag sets - it defines an additional configuration section.
<configfile> tag set
This tag set is contained within the <configfiles> tag set. It allows for the creation of a temporary file for file-based parameter transfer.
| attribute | values | details | required | example |
|---|---|---|---|---|
| name | a string * | * This value is the parameter name of the configuration file | yes | name="maf_filter_file" |
Example
The following is taken from the ~/tools/plotting/xy_plot.xml tool config.
<configfiles>
  <configfile name="script_file">
    ## Setup R error handling to go to stderr
    options( show.error.messages=F, error = function () { cat( geterrmessage(), file=stderr() ); q( "no", 1, F ) } )
    ## Determine range of all series in the plot
    xrange = c( NULL, NULL )
    yrange = c( NULL, NULL )
    #for $i, $s in enumerate( $series )
      s${i} = read.table( "${s.input.file_name}" )
      x${i} = s${i}[,${s.xcol}]
      y${i} = s${i}[,${s.ycol}]
      xrange = range( x${i}, xrange )
      yrange = range( y${i}, yrange )
    #end for
    ## Open output PDF file
    pdf( "${out_file1}" )
    ## Dummy plot for axis / labels
    plot( NULL, type="n", xlim=xrange, ylim=yrange, main="${main}", xlab="${xlab}", ylab="${ylab}" )
    ## Plot each series
    #for $i, $s in enumerate( $series )
      #if $s.series_type['type'] == "line"
        lines( x${i}, y${i}, lty=${s.series_type.lty}, lwd=${s.series_type.lwd}, col=${s.series_type.col} )
      #elif $s.series_type.type == "points"
        points( x${i}, y${i}, pch=${s.series_type.pch}, cex=${s.series_type.cex}, col=${s.series_type.col} )
      #end if
    #end for
    ## Close the PDF file
    devname = dev.off()
  </configfile>
</configfiles>
<outputs> tag set
Container tag set for the <data> tag set. The files created by tools as a result of their execution are named by Galaxy. You specify the number and type of your output files using the contained <data> tags. You must pass them to your tool executable using command line variables, just like the parameters described in the previous sections.
<data> tag set
This tag set is contained within the <outputs> tag set, and it defines the output data description for the files resulting from the tool's execution. The value of the attribute "label" can be acquired from input parameters or metadata in the same way that the command line parameters are ( discussed in the <command> tag set section above ).
| attribute | values | details | required | example |
|---|---|---|---|---|
| name | a string * | * The value of this attribute must match the name of the command line variable defined for the output | yes | name="output1" |
| format | a supported data type | This is the data type of the output file. It can be one of the supported data types ( e.g., "tabular" ) or the format of the tool's input dataset ( e.g., format="input" ) | yes | format="fasta" |
| format_source | name of an input data parameter | This sets the data type of the output file to be the same format as that of a tool input dataset. Useful when there are multiple inputs to match. | no | format_source="input2" |
| metadata_source | name of an input data parameter | This copies the metadata information from the named input dataset. This is particularly useful for interval data types where the order of the columns is not set. | no | metadata_source="input" |
| label | a string | This will be the label of the history item for the output data set. The string can include structure like ${<some param name>.<some attribute>}, as discussed for command line parameters in the <command> tag set section above. | no | label="Blat on ${database.value_label}" |
| from_work_dir | a string | Relative path to a file produced by the tool in its working directory. Output's contents are set to this file's contents. | no | from_work_dir="tool_output_file.txt" |
| hidden | True, true, False, false | Whether to hide dataset in the history view. | no | hidden="True" |
Example
The following will create a dataset in the history panel whose data type is the same as that of the input dataset selected for the tool.
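A minimal sketch of such an output, relying on the format="input" value described in the table above; the output name "output" is a hypothetical choice for illustration:

```xml
<outputs>
  <!-- format="input" gives the output the same data type as the tool's input dataset -->
  <data format="input" name="output" />
</outputs>
```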
Example
The following will create datasets in the history panel, setting the output data type to be the same as that of an input dataset named by the "format_source" attribute. Note that the conditional name is not included in the "format_source" value ( "qfile" rather than "qual.qfile" ), so two separate conditional blocks should not contain parameters with the same name.
<inputs>
  <!-- fasta may be an aligned fasta that subclasses Fasta -->
  <param name="fasta" type="data" format="fasta" label="fasta - Sequences"/>
  <conditional name="qual">
    <param name="add" type="select" label="Trim based on a quality file?" help="">
      <option value="no">no</option>
      <option value="yes">yes</option>
    </param>
    <when value="no"/>
    <when value="yes">
      <!-- qual454, qualsolid, qualillumina -->
      <param name="qfile" type="data" format="qual" label="qfile - a quality file"/>
    </when>
  </conditional>
</inputs>
<outputs>
  <data format_source="fasta" name="trim_fasta" label="${tool.name} on ${on_string}: trim.fasta"/>
  <data format_source="qfile" name="trim_qual" label="${tool.name} on ${on_string}: trim.qual">
    <filter>(qual['add'] == 'yes')</filter>
  </data>
</outputs>
Example
The following will create a variable called $out_file1 with data type "pdf".
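A minimal sketch of an output definition that would produce this variable, using only the name and format stated above:

```xml
<outputs>
  <!-- The name attribute becomes the $out_file1 command line variable -->
  <data format="pdf" name="out_file1" />
</outputs>
```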
Example
Assume that the tool includes an input parameter named "database" which is a select list ( e.g., assume the following inputs ):
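A sketch of such inputs. Only the parameter name "database" and the "Human (hg18)" option label come from the text; the dataset parameter, the second option, and the remaining labels are illustrative assumptions:

```xml
<inputs>
  <param name="input" type="data" format="tabular" label="Input dataset"/>
  <param name="database" type="select" label="Database">
    <option value="hg18">Human (hg18)</option>
    <option value="dm3">Fly (dm3)</option>
  </param>
</inputs>
```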
Assume that the user selects the first option in the $database select list. Then the following will ensure that the tool produces a tabular data set whose associated history item has the label "Blat on Human (hg18)".
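A minimal sketch of the corresponding output, using the label expression from the attribute table above; the output name "output" is a hypothetical choice:

```xml
<outputs>
  <!-- ${database.value_label} expands to the label of the selected option -->
  <data format="tabular" name="output" label="Blat on ${database.value_label}" />
</outputs>
```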
<change_format> tag set
See ~/tools/extract/extract_genomic_dna.xml for an example of how this tag set is used in a tool. This tag set is optionally contained within the <data> tag set and is the container tag set for the following <when> tag set.
<when> tag set ( change_format )
If the input parameter named by the "input" attribute has the specified value, the data type of the output dataset is changed to the specified format.
| attribute | values | details | required | example |
|---|---|---|---|---|
| input | a string * | * This value must be the name of the desired input parameter | yes | input="out_format" |
| value | a string * | * This value must be one of the possible values of that input parameter | yes | value="interval" |
| format | a string * | * This value must be a supported data type | yes | format="interval" |
Example
Assume that your tool config includes the following select list parameter structure:
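A sketch of such a select list. The parameter name "out_format" and the "interval" value come from the attribute table above; the labels and the "fasta" option are assumptions made for illustration:

```xml
<param name="out_format" type="select" label="Output data type">
  <option value="fasta">FASTA</option>
  <option value="interval">Interval</option>
</param>
```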
Then whenever the user selects the "interval" option from the select list, the following structure in your tool config will override the format="fasta" setting in the <data> tag set with format="interval".
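A minimal sketch of that structure, assuming a hypothetical output named out_file1:

```xml
<outputs>
  <data format="fasta" name="out_file1">
    <change_format>
      <!-- When out_format is "interval", the output format becomes interval -->
      <when input="out_format" value="interval" format="interval" />
    </change_format>
  </data>
</outputs>
```

For any other value of out_format, the format given on the <data> tag is kept.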
<actions> tag set
The <actions> tag set in the Bowtie wrapper is used in lieu of the deprecated <code> tag to set the dbkey of the output dataset. In bowtie_wrapper.xml (see below), according to the first action block, if refGenomeSource.genomeSource is "indexed" (not "history"), the dbkey of the output file is set to that of the reference file. It does this by looking through the loc file, ignoring comment lines (those starting with #), finding the entry whose column 1 holds the value selected in the index dropdown box, and using the dbkey in column 0 of that entry.
If refGenomeSource.genomeSource is "history", it resorts to default behavior for Galaxy, which is that the output is assigned the same value as the first input that has a dbkey specified.
The second block is not needed in most cases; it is required here to handle the specific case of a small reference file we use for functional testing. It says that if the dbkey has been set to "equCab2chrM" (that's what the <filter type="metadata_value"... column="1" /> tag checks), then it should be changed to "equCab2" (that's what the <option type="from_param" ... column="0" ...> tag does).
Example
<actions>
  <conditional name="refGenomeSource.genomeSource">
    <when value="indexed">
      <action type="metadata" name="dbkey">
        <option type="from_file" name="bowtie_indices.loc" column="0" offset="0">
          <filter type="param_value" column="0" value="#" compare="startswith" keep="False"/>
          <filter type="param_value" ref="refGenomeSource.index" column="1"/>
        </option>
      </action>
    </when>
  </conditional>
  <!-- Special casing equCab2chrM to equCab2 -->
  <action type="metadata" name="dbkey">
    <option type="from_param" name="refGenomeSource.genomeSource" column="0" offset="0">
      <filter type="insert_column" column="0" value="equCab2chrM"/>
      <filter type="insert_column" column="0" value="equCab2"/>
      <filter type="metadata_value" ref="output" name="dbkey" column="1" />
    </option>
  </action>
</actions>
<tests> tag set
Container tag set for the <test> tag sets. Functional tests are executed via the ~/run_functional_tests.sh shell script. Any number of tests can be included, and each test is wrapped within separate <test> tag sets.
<test> tag set
This tag set contains the necessary parameter values for executing the tool via the functional test framework.
Example
The following two tests will execute the ~/tools/filters/sorter.xml tool. Notice the way that the tool's inputs and outputs are defined.
<tests>
  <test>
    <param name="input" value="1.bed"/>
    <param name="column" value="1"/>
    <param name="order" value="ASC"/>
    <param name="style" value="num"/>
    <output name="out_file1" file="sort1_num.bed"/>
  </test>
  <test>
    <param name="input" value="7.bed"/>
    <param name="column" value="1"/>
    <param name="order" value="ASC"/>
    <param name="style" value="alpha"/>
    <output name="out_file1" file="sort1_alpha.bed"/>
  </test>
</tests>
Example
Test the execution of the MAF-to-FASTA converter ( ~/tools/maf/maf_to_fasta.xml ).
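A sketch of the elements such a test might contain, built from the attribute examples in the tables below ( name="input1", value="3.maf", ftype="maf", file="cf_maf2fasta_concat.dat" ). The output name "out_file1" is an assumption, and any further parameters the converter requires are omitted. These elements would be placed inside a single test tag set within the tests container:

```xml
<!-- Input MAF file, with its data type given explicitly rather than sniffed -->
<param name="input1" value="3.maf" ftype="maf"/>
<!-- Expected result, stored in the test-data directory -->
<output name="out_file1" file="cf_maf2fasta_concat.dat"/>
```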
Example
This test demonstrates verifying specific properties about a test output instead of directly comparing it to another file. Here the file attribute is not specified and instead a series of assertions is made about the output.
<test>
  <param name="input" value="maf_stats_interval_in.dat" />
  <param name="lineNum" value="99999"/>
  <output name="out_file1">
    <assert_contents>
      <has_text text="chr7" />
      <not_has_text text="chr8" />
      <has_text_matching expression="1274\d+53" />
      <has_line_matching expression=".*\s+127489808\s+127494553" />
      <!-- &#009; is XML escape code for tab -->
      <has_line line="chr7&#009;127471195&#009;127489808" />
      <has_n_columns n="3" />
    </assert_contents>
  </output>
</test>
<param> tag set (functional tests)
This tag set defines the tool's input parameters for executing the tool via the functional test framework.
| attribute | values | details | required | example |
|---|---|---|---|---|
| name | name of an input parameter | This value must match the name of the associated input parameter. | yes | name="input1" |
| value | a legal value of an input parameter | This value must be one of the legal values that can be assigned to an input parameter | yes | value="3.maf" |
| ftype | data type of the input file * | * This attribute name should be included only with the parameter that defines the input dataset for the tool. If this attribute name is not included, the functional test framework will attempt to determine the data type for the input dataset using the data type sniffers. | no | ftype="maf" |
Example
The following defines the four input values that are passed to the ~/tools/filters/sorter.xml tool via functional test framework.
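These are the four <param> tags from the first sorter test shown earlier in this section:

```xml
<param name="input" value="1.bed"/>
<param name="column" value="1"/>
<param name="order" value="ASC"/>
<param name="style" value="num"/>
```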
<output> tag set (functional tests)
This tag set defines the variable that names the output dataset for the functional test framework. The functional test framework will execute the tool using the parameters defined in the <param> tag sets and generate a temporary file, which will either be compared with the file named in the "file" attribute value or checked against assertions made by a child assert_contents tag to verify that the tool is functionally correct.
| attribute | values | details | required | example |
|---|---|---|---|---|
| name | parameter name of the output file | This value is the same as the value of the "name" attribute of the <data> tag set contained within the tool's <outputs> tag set. | yes | name="outfile_1" |
| file | file name | This value is the name of the output file stored in the /test-data directory, which is compared with the results of executing the tool via the functional test framework | yes, unless an <assert_contents> child tag is used | file="cf_maf2fasta_concat.dat" |
<assert_contents> tag set (functional tests)
This tag set defines a sequence of checks or assertions to run against an output dataset for the functional test framework. This tag requires no attributes, but child tags should be used to define the assertions to make about the output dataset. The functional test framework makes it easy to extend Galaxy with such tags. The following table summarizes many of the default assertion tags that come with Galaxy; examples of each can be found below.
| tag | description | example |
|---|---|---|
| has_text | Asserts the specified text appears in the output. | <has_text text="chr7" /> |
| not_has_text | Asserts the specified text does not appear in the output. | <not_has_text text="chr8" /> |
| has_text_matching | Asserts text matching the specified regular expression (expression) appears in the output. | <has_text_matching expression="1274\d+53" /> |
| has_line_matching | Asserts a line matching the specified regular expression (expression) appears in the output. | <has_line_matching expression=".*\s+127489808\s+127494553" /> |
| has_n_columns | Asserts tabular output contains the specified number (n) of columns. | <has_n_columns n="3" /> |
| is_valid_xml | Asserts the output is a valid XML file. | <is_valid_xml /> |
| has_element_with_path | Asserts the XML output contains at least one element (or tag) with the specified XPath-like path. | <has_element_with_path path="BlastOutput_param/Parameters/Parameters_matrix" /> |
| has_n_elements_with_path | Asserts the XML output contains the specified number (n) of elements (or tags) with the specified XPath-like path | <has_n_elements_with_path n="9" path="BlastOutput_iterations/Iteration/Iteration_hits/Hit/Hit_num" /> |
| element_text_is | Asserts the text of the XML element with the specified XPath-like path is the specified text. | <element_text_is path="BlastOutput_program" text="blastp" /> |
| element_text_matches | Asserts the text of the XML element with the specified XPath-like path matches the regular expression defined by expression. | <element_text_matches path="BlastOutput_version" expression="BLASTP\s+2\.2.*" /> |
| attribute_is | Asserts the XML attribute for the element (or tag) with the specified XPath-like path is the specified text. | <attribute_is path="outerElement/innerElement1" attribute="foo" text="bar" /> |
| attribute_matches | Asserts the XML attribute for the element (or tag) with the specified XPath-like path matches the regular expression specified by expression | <attribute_matches path="outerElement/innerElement2" attribute="foo2" expression="bar\d+" /> |
| element_text | This tag allows the developer to recursively specify additional assertions as child elements about just the text contained in the element specified by the XPath-like path. | <element_text path="BlastOutput_iterations/Iteration/Iteration_hits/Hit/Hit_def"><not_has_text text="EDK72998.1" /></element_text> |
Example
<output name="out_file1">
  <assert_contents>
    <has_text text="chr7" />
    <not_has_text text="chr8" />
    <has_text_matching expression="1274\d+53" />
    <has_line_matching expression=".*\s+127489808\s+127494553" />
    <!-- &#009; is XML escape code for tab -->
    <has_line line="chr7&#009;127471195&#009;127489808" />
    <has_n_columns n="3" />
  </assert_contents>
</output>
Example
<output name="out_file1">
  <assert_contents>
    <is_valid_xml />
    <has_element_with_path path="BlastOutput_param/Parameters/Parameters_matrix" />
    <has_n_elements_with_path n="9" path="BlastOutput_iterations/Iteration/Iteration_hits/Hit/Hit_num" />
    <element_text_matches path="BlastOutput_version" expression="BLASTP\s+2\.2.*" />
    <element_text_is path="BlastOutput_program" text="blastp" />
    <element_text path="BlastOutput_iterations/Iteration/Iteration_hits/Hit/Hit_def">
      <not_has_text text="EDK72998.1" />
      <has_text_matching expression="ABK[\d\.]+" />
    </element_text>
  </assert_contents>
</output>
Example
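A sketch covering the two attribute assertions from the table above, which the preceding examples do not exercise. The output name "out_file1" follows the earlier examples, and the element paths are the placeholder paths used in the table:

```xml
<output name="out_file1">
  <assert_contents>
    <!-- Attribute foo of innerElement1 must equal "bar" exactly -->
    <attribute_is path="outerElement/innerElement1" attribute="foo" text="bar" />
    <!-- Attribute foo2 of innerElement2 must match the regular expression -->
    <attribute_matches path="outerElement/innerElement2" attribute="foo2" expression="bar\d+" />
  </assert_contents>
</output>
```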
<page> tag set
This is deprecated since the introduction of the "refresh_on_change" features, so try to stay away from using it for new tools. In older tools, if you needed to split your interface over multiple pages, you could do so by wrapping each page with a <page></page> tag and putting them in order in the XML file.
To create a two-page interface:
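A minimal sketch, assuming the pages are placed inside the <inputs> tag set. The parameter names and labels are hypothetical; the second page holds a column selector that depends on the dataset chosen on the first page:

```xml
<inputs>
  <page>
    <!-- First page: choose the dataset -->
    <param name="input" type="data" format="tabular" label="Dataset"/>
  </page>
  <page>
    <!-- Second page: choose a column of the dataset selected on the first page -->
    <param name="column" type="data_column" data_ref="input" label="Column"/>
  </page>
</inputs>
```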
<code> tag set
Deprecated: do not use this unless absolutely necessary. This tag set provides detailed control of the way the tool is executed. This (optional) code can be deployed in a separate file in the same directory as the tool's config file. These hooks are being replaced by new tool config features and methods in the ~/lib/galaxy/tools/__init__.py code file.
| attribute | values | details | required | example |
|---|---|---|---|---|
| file | a string * | * This value is the name of the executable code file, which is called in the exec_before_process(), exec_before_job(), exec_after_process() and exec_after_job() methods. | yes | file="extract_genomic_dna_code.py" |
Example
The following is taken from the ~/tools/new_operations/coverage.xml tool config.
<code file="operation_filter.py"/>
<requirements> tag set
See ~/tools/extract/phastOdds/phastOdds_tool.xml for an example of how this tag set is used in a tool. This is a container tag set for the <requirement> tag set described below.
<requirement> tag set
This tag set is contained within the <requirements> tag set. Third party programs or modules that the tool depends upon ( and which are not distributed with Galaxy ) are included in this tag set. Before running the tool, Galaxy will check whether the required programs or modules are available on the local machine where Galaxy is set to run. If they are not available, this tool will not be loaded when the Galaxy server is started.
| attribute | values | details | required | example |
|---|---|---|---|---|
| type | python-module, binary | This value defines the type of the third party module required by this tool | yes | type="python-module" |
Example
This example shows a tool that requires a python-module named rpy which is not distributed with Galaxy.
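A minimal sketch of such a declaration, assuming the name of the required module is given as the text content of the <requirement> tag:

```xml
<requirements>
  <requirement type="python-module">rpy</requirement>
</requirements>
```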
Example
This example shows a tool that requires a python module named numpy, and a binary named taxBuilder, neither of which is distributed with Galaxy.
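A minimal sketch with one <requirement> tag per dependency, again assuming the dependency name is given as the tag's text content:

```xml
<requirements>
  <requirement type="python-module">numpy</requirement>
  <requirement type="binary">taxBuilder</requirement>
</requirements>
```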
<help> tag set
This tag set includes all of the necessary details of how to use the tool. This tag set should be included as the last tag set in the tool config. Tool help is written in reStructuredText. Included here is only an overview of a subset of features. For more information see http://docutils.sourceforge.net/docs/ref/rst/restructuredtext.html.
| tag | details |
|---|---|
| .. class:: warningmark | a yellow warning symbol |
| .. class:: infomark | a blue information symbol |
| .. image:: path-of-the-file.png :height: 500 :width: 600 | insert a png file of height 500 and width 600 at this position |
| **bold** | bold |
| *italic* | italic |
| * | list |
| - | list |
| :: | the indented paragraph that follows is shown as a literal ( preformatted ) block |
| ----- | a horizontal line |
Example
Show a warning sign to remind users that this tool accepts fasta format files only, followed by an example of the query sequence and a figure.
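A sketch of such a help section, using only the reStructuredText features listed in the table above. The sequence and the image path are placeholders, not taken from a real tool:

```xml
<help>

.. class:: warningmark

This tool accepts **fasta** format files only.

-----

**Example**

Query sequence::

    >seq1
    ATCGATCGATCG

.. image:: path-of-the-file.png
    :height: 500
    :width: 600

</help>
```

The blank lines matter: reStructuredText directives and literal blocks must be separated from the surrounding text by an empty line.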
Miscellaneous tips and tricks
If you need to label a dataset with its real name (as displayed in the history):
<outputs>
  <data name="output" format="tabular" label="SEQLEN of ${input1.name}" />
</outputs>
