Apache proxy to Galaxy

For various reasons (performance, authentication, etc.), it's recommended to run Galaxy behind a web server proxy in a production environment. Although any proxy could work, Apache is the most common. We use nginx for our public sites, and details are available for it, too.

Currently, the only recommended way to run Galaxy with Apache is using mod_rewrite and mod_proxy. FastCGI, AJP, or similar connectors may be supported in the future.

To support proxying, the mod_proxy, mod_proxy_http and mod_rewrite modules must be enabled in the Apache config. The main proxy directives, ProxyRequests and ProxyVia, do not need to be enabled.
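As a rough sketch, loading these modules in a source-style Apache layout might look like the following. The modules/ paths are an assumption and differ between distributions; Debian-based systems, for instance, typically enable modules with the a2enmod helper rather than by editing LoadModule lines.

```apache
# Assumed module paths; adjust to where your distribution installs Apache modules.
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so
LoadModule rewrite_module modules/mod_rewrite.so

# Galaxy is reached through a reverse proxy, so forward proxying stays off
# (this is also Apache's default).
ProxyRequests Off
```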

Please note that Galaxy should never be located on disk inside Apache's DocumentRoot. By default, this would expose all of Galaxy (including datasets) to anyone on the web.

Basic configuration

Serving Galaxy at the web server root (/)

For a default Galaxy configuration running on http://localhost:8080/, the following lines in the Apache configuration will proxy requests to the Galaxy application:

    RewriteEngine on
    RewriteRule ^(.*) http://localhost:8080$1 [P]

Thus, all requests to your server (for example, http://www.example.org/) are now proxied to Galaxy. Because this example uses the "root" of your web server, you may want to use a VirtualHost so that you can run other sites from the same server.
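A minimal sketch of such a VirtualHost follows. The hostname galaxy.example.org and the *:80 listener are placeholders chosen for illustration, not values taken from a real deployment.

```apache
# Hypothetical name-based virtual host dedicated to Galaxy.
<VirtualHost *:80>
    ServerName galaxy.example.org

    # Proxy every request for this host to the Galaxy application.
    RewriteEngine on
    RewriteRule ^(.*) http://localhost:8080$1 [P]
</VirtualHost>
```

Other sites on the same server would then live in their own VirtualHost blocks with different ServerName values.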

Since Apache is more efficient at serving static content, it is best to serve it directly, reducing the load on the Galaxy process and allowing for more effective compression (if enabled), caching, and pipelining. To do so, your configuration will now become:

    RewriteEngine on
    RewriteRule ^/static/style/(.*) /home/nate/galaxy-dist/static/june_2007_style/blue/$1 [L]
    RewriteRule ^/static/scripts/(.*) /home/nate/galaxy-dist/static/scripts/packed/$1 [L]
    RewriteRule ^/static/(.*) /home/nate/galaxy-dist/static/$1 [L]
    RewriteRule ^/favicon.ico /home/nate/galaxy-dist/static/favicon.ico [L]
    RewriteRule ^/robots.txt /home/nate/galaxy-dist/static/robots.txt [L]
    RewriteRule ^(.*) http://localhost:8080$1 [P]

You'll need to ensure that filesystem permissions are set such that the user running your Apache server has access to the Galaxy static/ directory.
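Beyond filesystem permissions, Apache's own access control may also need to permit the directory, since many default configurations deny access to paths outside the DocumentRoot. A minimal sketch, assuming the Apache 2.2 access-control syntax used elsewhere on this page and the example path above:

```apache
# Let Apache serve files from Galaxy's static directory only,
# not from the rest of the Galaxy installation.
<Directory "/home/nate/galaxy-dist/static">
    Order allow,deny
    Allow from all
    # On Apache 2.4 the equivalent of the two lines above is: Require all granted
</Directory>
```

Whether this block is needed depends on how restrictive your existing configuration is.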

Serving Galaxy at a subdirectory (such as /galaxy)

It may be necessary to house Galaxy at an address other than the web server root (http://www.example.org/galaxy, instead of http://www.example.org). Two changes are necessary. First, the rewrite rules must include the prefix:

    RewriteEngine on
    RewriteRule ^/galaxy$ /galaxy/ [R]
    RewriteRule ^/galaxy/static/style/(.*) /home/nate/galaxy-dist/static/june_2007_style/blue/$1 [L]
    RewriteRule ^/galaxy/static/scripts/(.*) /home/nate/galaxy-dist/static/scripts/packed/$1 [L]
    RewriteRule ^/galaxy/static/(.*) /home/nate/galaxy-dist/static/$1 [L]
    RewriteRule ^/galaxy/favicon.ico /home/nate/galaxy-dist/static/favicon.ico [L]
    RewriteRule ^/galaxy/robots.txt /home/nate/galaxy-dist/static/robots.txt [L]
    RewriteRule ^/galaxy(.*) http://localhost:8080$1 [P]

Note the first rewrite rule deals with the missing trailing slash problem. If left out, http://www.example.org/galaxy will result in a 404 error.

Additionally, the Galaxy application needs to be aware that it is running with a prefix (for generating URLs in dynamic pages). This is accomplished by defining a Paste proxy-prefix filter in universe_wsgi.ini, referencing it from the [app:main] section, and restarting Galaxy:

    [filter:proxy-prefix]
    use = egg:PasteDeploy#prefix
    prefix = /galaxy

    [app:main]

    filter-with = proxy-prefix
    cookie_path = /galaxy

cookie_path should be set to prevent Galaxy's session cookies from clobbering each other if you run more than one instance of Galaxy in different subdirectories on the same hostname.
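To illustrate the multi-instance case, here is a sketch of the Apache side for two Galaxy instances on one hostname. The second prefix (/galaxy-test) and the second port (8081) are hypothetical, and the static-content rules are left out for brevity.

```apache
RewriteEngine on

# Add the missing trailing slash for each prefix.
RewriteRule ^/galaxy$ /galaxy/ [R]
RewriteRule ^/galaxy-test$ /galaxy-test/ [R]

# Requiring a slash after the prefix keeps /galaxy from also matching /galaxy-test.
RewriteRule ^/galaxy-test(/.*) http://localhost:8081$1 [P]
RewriteRule ^/galaxy(/.*) http://localhost:8080$1 [P]
```

Each instance would then carry its own prefix and cookie_path values (/galaxy and /galaxy-test respectively) in its own universe_wsgi.ini.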

External user authentication

By default, Galaxy manages its own users. However, it may be more useful at your site to tie into a local authentication system. Galaxy does not do this itself - it delegates this responsibility to the upstream proxy server (in this case, Apache). The authentication module (basic authentication, mod_auth_foo, Cosign, etc.) is responsible for providing a username, which we will pass through the proxy to Galaxy as $REMOTE_USER.

In addition to the modules above, mod_headers must be enabled in the Apache config.

Basic authentication is configured as it is for any other protected portion of your site (other authentication modules are configured differently):

    # Define the authentication method
    AuthType Basic
    AuthName Galaxy
    AuthUserFile /home/nate/htpasswd
    Require valid-user
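For reference, the modules that basic authentication against a password file relies on in Apache 2.2, together with mod_headers, could be loaded as sketched here. The modules/ paths are assumptions, and many distributions already load the authentication modules by default.

```apache
# Assumed module paths; adjust for your installation.
LoadModule auth_basic_module modules/mod_auth_basic.so   # AuthType Basic
LoadModule authn_file_module modules/mod_authn_file.so   # AuthUserFile
LoadModule authz_user_module modules/mod_authz_user.so   # Require valid-user
LoadModule headers_module modules/mod_headers.so         # RequestHeader
```

The password file itself is typically created with Apache's htpasswd utility.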

The following options are used to take the $REMOTE_USER variable (set by basic authentication) and set it as a header in the proxied environment:

    # Define Galaxy as a valid Proxy
    <Proxy http://localhost:8080>
        Order deny,allow
        Allow from all
    </Proxy>
    # Take the $REMOTE_USER environment variable and set it as a header in the proxy request.
    RewriteEngine on
    RewriteCond %{IS_SUBREQ} ^false$
    RewriteCond %{LA-U:REMOTE_USER} (.+)
    RewriteRule . - [E=RU:%1]
    RequestHeader set REMOTE_USER %{RU}e

These new directives should be placed in a <Location> block that matches the directory from which you are serving Galaxy. Your entire configuration will now look something like this:

    # Define Galaxy as a valid Proxy
    <Proxy http://localhost:8080>
        Order deny,allow
        Allow from all
    </Proxy>
    RewriteEngine on
    <Location "/">
        # Define the authentication method
        AuthType Basic
        AuthName Galaxy
        AuthUserFile /home/nate/htpasswd
        Require valid-user
        # Take the $REMOTE_USER environment variable and set it as a header in the proxy request.
        RewriteCond %{IS_SUBREQ} ^false$
        RewriteCond %{LA-U:REMOTE_USER} (.+)
        RewriteRule . - [E=RU:%1]
        RequestHeader set REMOTE_USER %{RU}e
    </Location>
    RewriteRule ^/static/style/(.*) /home/nate/galaxy-dist/static/june_2007_style/blue/$1 [L]
    RewriteRule ^/static/(.*) /home/nate/galaxy-dist/static/$1 [L]
    RewriteRule ^/favicon.ico /home/nate/galaxy-dist/static/favicon.ico [L]
    RewriteRule ^/robots.txt /home/nate/galaxy-dist/static/robots.txt [L]
    RewriteRule ^(.*) http://localhost:8080$1 [P]

On the Galaxy side, set use_remote_user = True in universe_wsgi.ini. If your auth method doesn't provide a full email address in $REMOTE_USER, you'll also need to set remote_user_maildomain:

use_remote_user = True
remote_user_maildomain = example.org

For example, when using basic authentication, only bare usernames (e.g. "nate") will be passed to Galaxy. Since Galaxy usernames are full email addresses, remote_user_maildomain needs to be set (e.g. to "example.org"). On the other hand, auth methods such as mod_auth_kerb set the full nate@example.org address, so remote_user_maildomain should not be set. If you're not sure, Galaxy will tell you via an error message if remote_user_maildomain needs to be set.

Users are automatically created in the Galaxy database if the external auth method allows them through. Users created in this manner will not be able to log in if use_remote_user is later disabled, because Galaxy has no password stored for them (the password is managed by Apache).

mod_authnz_ldap

The Apache mod_authnz_ldap module does not set $REMOTE_USER like other auth modules. The following alternate configuration should allow you to use any LDAP attribute as the username to set in $REMOTE_USER:

    # Define Galaxy as a valid Proxy
    <Proxy http://localhost:8080>
        Order deny,allow
        Allow from all
    </Proxy>
    <Location "/">
        AuthType Basic
        AuthBasicProvider ldap
        AuthLDAPURL "ldap://server:389/ou=People,dc=example,dc=org?uid?sub?(objectClass=person)"
        AuthzLDAPAuthoritative off
        Require valid-user
        # Set the REMOTE_USER header to the contents of the LDAP query response's "uid" attribute
        RequestHeader set REMOTE_USER %{AUTHENTICATE_uid}e
    </Location>

The AuthLDAPURL and the variable in which the username is set will vary, depending entirely on the schema/design of your LDAP database. If your LDAP server is Windows (Active Directory), you may need to use the %{AUTHENTICATE_sAMAccountName}e variable.
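An Active Directory variant might look roughly like the sketch below. The server name, search base, filter and bind account are all hypothetical; the bind directives are included on the assumption that your directory does not allow anonymous searches.

```apache
<Location "/">
    AuthType Basic
    AuthName Galaxy
    AuthBasicProvider ldap
    # Hypothetical AD server and search base; request the sAMAccountName attribute.
    AuthLDAPURL "ldap://ad.example.org:389/dc=example,dc=org?sAMAccountName?sub?(objectClass=user)"
    # Hypothetical service account used to search the directory.
    AuthLDAPBindDN "galaxy-bind@example.org"
    AuthLDAPBindPassword "changeme"
    AuthzLDAPAuthoritative off
    Require valid-user
    # Pass the sAMAccountName attribute to Galaxy as the remote user.
    RequestHeader set REMOTE_USER %{AUTHENTICATE_sAMAccountName}e
</Location>
```

Since sAMAccountName is a bare username rather than an email address, remote_user_maildomain would most likely need to be set on the Galaxy side.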

Display Sites

Display sites such as UCSC do not receive data directly from Galaxy via the client's browser. Instead, Galaxy sends UCSC a URL to the data, and the UCSC server retrieves the data from that URL. Since enabling authentication places all of Galaxy behind authentication, such display sites will no longer be able to access data via that URL. Setting display_servers to a non-empty value in galaxy-dist/universe_wsgi.ini tells Galaxy to allow the named servers access to data in Galaxy. However, you still need to configure Apache to allow access to the datasets. An example config is provided here that allows the UCSC Main/Test backends:

    <Location "/root/display_as">
        Satisfy Any
        Order deny,allow
        Deny from all
        Allow from hgw1.cse.ucsc.edu
        Allow from hgw2.cse.ucsc.edu
        Allow from hgw3.cse.ucsc.edu
        Allow from hgw4.cse.ucsc.edu
        Allow from hgw5.cse.ucsc.edu
        Allow from hgw6.cse.ucsc.edu
        Allow from hgw7.cse.ucsc.edu
        Allow from hgw8.cse.ucsc.edu
    </Location>

PLEASE NOTE that this introduces a security hole, the impact of which depends on whether you have restricted access to the dataset via Galaxy's internal dataset permissions.

  • By default, data in Galaxy is public.  Normally with a Galaxy server behind authentication in a proxy server this is of little concern since only clients who've authenticated can access Galaxy.  However, if display site exceptions are made as shown above, anyone could use those public sites to bypass authentication and view any public dataset on your Galaxy server.  If you have not changed from the default and most of your datasets are public, you should consider running your own display sites that are also behind authentication rather than using the public ones.
  • For datasets for which access has been restricted to one or more roles (i.e. it is no longer "public"), access for reading via external browsers is only allowed for a brief period, when someone with access permission clicks the "display at..." link.  During this period, anyone who has the dataset ID would then be able to use the browser to view this dataset.  Although such a scenario is unlikely, it is technically possible.

SSL

If you place Galaxy behind a proxy address that uses SSL (e.g. https:// URLs), set the following in your Apache config:

    <Location "/">
        RequestHeader set X-URL-SCHEME https
    </Location>

Setting X-URL-SCHEME makes Galaxy aware of what type of URL it should generate for external sites like Biomart. This should be added to the existing <Location "/"> block if you already have one, and adjusted accordingly if you're serving Galaxy from a subdirectory.
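For context, a complete SSL virtual host combining this header with the basic proxy rules could be sketched as follows. It assumes mod_ssl is loaded; the hostname and certificate paths are placeholders.

```apache
<VirtualHost *:443>
    ServerName galaxy.example.org

    # Terminate SSL in Apache; certificate and key paths are hypothetical.
    SSLEngine on
    SSLCertificateFile /etc/ssl/certs/galaxy.example.org.crt
    SSLCertificateKeyFile /etc/ssl/private/galaxy.example.org.key

    <Location "/">
        # Tell Galaxy that clients reach it over https.
        RequestHeader set X-URL-SCHEME https
    </Location>

    # Traffic between Apache and Galaxy stays plain HTTP on localhost.
    RewriteEngine on
    RewriteRule ^(.*) http://localhost:8080$1 [P]
</VirtualHost>
```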

Compression and caching

All of Galaxy's static content can be cached on the client side, and everything (including dynamic content) can be compressed on the fly. This will decrease download and page load times for your clients, as well as decrease server load and bandwidth usage. To enable, you'll need to load mod_deflate and mod_expires in your Apache configuration, and then set:

    <Location "/">
        # Compress all uncompressed content.
        SetOutputFilter DEFLATE
        SetEnvIfNoCase Request_URI \.(?:gif|jpe?g|png)$ no-gzip dont-vary
        SetEnvIfNoCase Request_URI \.(?:t?gz|zip|bz2)$ no-gzip dont-vary
    </Location>
    <Location "/static">
        # Allow browsers to cache everything from /static for 6 hours
        ExpiresActive On
        ExpiresDefault "access plus 6 hours"
    </Location>

The contents of <Location "/"> should be added to the existing <Location "/"> block if you already have one, and adjusted accordingly if you're serving Galaxy from a subdirectory.
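As an illustration of the subdirectory case, the same settings for a Galaxy served at /galaxy might look like this sketch. The LoadModule paths are assumptions, and SetEnvIfNoCase additionally relies on mod_setenvif, which most installations load by default.

```apache
# Assumed module paths; adjust for your installation.
LoadModule deflate_module modules/mod_deflate.so
LoadModule expires_module modules/mod_expires.so

<Location "/galaxy">
    # Compress everything except content that is already compressed.
    SetOutputFilter DEFLATE
    SetEnvIfNoCase Request_URI \.(?:gif|jpe?g|png)$ no-gzip dont-vary
    SetEnvIfNoCase Request_URI \.(?:t?gz|zip|bz2)$ no-gzip dont-vary
</Location>
<Location "/galaxy/static">
    # Allow browsers to cache static content for 6 hours.
    ExpiresActive On
    ExpiresDefault "access plus 6 hours"
</Location>
```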

Sending files using Apache

Galaxy sends files (e.g. dataset downloads) by opening the file and streaming it in chunks through the proxy server. However, this ties up the Galaxy process, which can impact the performance of other operations (see Admin/Config/Performance/Production Server for a more in-depth explanation). Apache can assume this task instead and as an added benefit, speed up downloads. This is accomplished through the use of mod_xsendfile, a 3rd-party Apache module. Dataset security is maintained in this configuration because Apache will still check with Galaxy to ensure that the requesting user has permission to access the dataset before sending it.

To enable it, you must first download, compile and install mod_xsendfile. Once done, add the appropriate LoadModule directive to your Apache configuration to load the xsendfile module and the XSendFile directives to your proxy configuration:

    <Location "/">
        XSendFile on
        XSendFilePath /
    </Location>

This should be added to the existing <Location "/"> block if you already have one, and adjusted accordingly if you're serving Galaxy from a subdirectory.

Note: If you use a version of mod_xsendfile older than 0.10, use "XSendFileAllowAbove on" instead of "XSendFilePath /"
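Put together, the configuration for an older mod_xsendfile release might resemble the sketch below. The module path is an assumption and depends on where the compiled module was installed.

```apache
# Assumed install location of the compiled third-party module.
LoadModule xsendfile_module modules/mod_xsendfile.so

<Location "/">
    XSendFile on
    # Versions older than 0.10 use this directive in place of "XSendFilePath /".
    XSendFileAllowAbove on
</Location>
```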

Finally, set apache_xsendfile = True in the [app:main] section of universe_wsgi.ini and restart Galaxy.