Andy Jenkinson, EMBL-EBI, 7th April 2010
This tutorial is intended to help you understand how ProServer, a DAS server for Perl, is structured. It will also guide you in setting up ProServer on your machine, and exploring the examples that come with the distribution.
The tutorial assumes you are familiar with Perl and are operating on a Linux platform.
ProServer is a standalone server, meaning you do not need to run a separate web server such as Apache. It handles all of the communications, query parsing and XML output functions in a set of "core" modules, and uses plugins to run actual data sources. Plugins are responsible for adapting actual data to the DAS protocol. The server and its plugins are configured using an INI file.
Each data source is represented in ProServer by an instance of a plugin module, called a SourceAdaptor. Simple data sources, especially those based on files, can sometimes be set up without requiring any code at all by using a pre-existing SourceAdaptor. More often, running a custom data source will require you to write your own. This will be covered in Part 2.
The lifecycle of a typical DAS features request is as follows:
The best way to get ProServer is via the Subversion repository. The trunk typically contains the latest stable version, so includes the latest bugfixes. To download it to your home directory, open a terminal and type the following:
cd ~
svn checkout http://proserver.svn.sf.net/svnroot/proserver/trunk Bio-Das-ProServer
When the download is complete, enter the Bio-Das-ProServer directory that was created:
cd Bio-Das-ProServer
Take a moment to read the installation section of the README file. Proceed to build ProServer as per the instructions:
perl Build.PL
./Build
The distribution contains a Perl script called proserver that you should use to run ProServer. It is in the eg directory. During development, you should run this script with the -x option. This prevents the process from forking and directs log output to your terminal rather than to a file. Try running the script in your terminal:
eg/proserver -x
If all is well, the server will start and output some information about its (default) configuration. If not, you should be able to diagnose the problem. Commonly errors arise from:
Not running the proserver script from a location where it can find the modules. It looks in ./blib/lib (where the modules reside when ProServer is built), so make sure you run the script from the root Bio-Das-ProServer directory.
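A quick pre-flight check can rule out this cause before you start the server. The following is a minimal sketch, not part of the ProServer distribution: the function name check_proserver_root is hypothetical, and it only tests for the two paths this tutorial mentions, eg/proserver and blib/lib.

```shell
# Hypothetical helper, not shipped with ProServer.
# Succeeds only if DIR (default: current directory) looks like a
# ProServer checkout that has already been built.
check_proserver_root() {
    dir="${1:-.}"
    if [ ! -f "$dir/eg/proserver" ]; then
        echo "eg/proserver not found in $dir: wrong directory?" >&2
        return 1
    fi
    if [ ! -d "$dir/blib/lib" ]; then
        echo "blib/lib not found in $dir: has ./Build been run?" >&2
        return 1
    fi
    echo "ok: $dir looks like a built ProServer directory"
}
```

For example, running "check_proserver_root . && eg/proserver -x" starts the server only when both paths are present.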
With ProServer still running, point your browser to the URL and port where it is listening:
You should see the ProServer homepage. From this page you can click the "SOURCES" link. This will execute the DAS sources command, via the URL:
You should see a table listing the DAS sources being hosted from your server. There are two examples configured by default: "mysimple" and "mygff". The sources command provides some metadata about each source that is useful for client software. In particular, the "coordinates" and "capabilities" properties help a client to know whether a source contains data that is relevant to it.
The sources command, like all DAS commands, has an XML output. The browser converts this XML to the coloured, human-readable output you see via an XSL stylesheet. To see the DAS XML output, use the "view source" option of your browser. (NOTE: If you see no output at all at this point, make sure you are running the server from the Bio-Das-ProServer directory so that ProServer can find its XSL stylesheets.)
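The raw XML can also be fetched from the command line, which bypasses the stylesheet entirely. The sketch below assumes the standard DAS URL layout (server-wide commands under /das/, per-source commands under /das/SOURCE/). The function names das_url and das_get are hypothetical, and the default base URL is only a placeholder: substitute the host and port your server reports when it starts.

```shell
# Assumption: localhost:9000 is a placeholder. Replace it with the host
# and port printed by "eg/proserver -x" at start-up.
PROSERVER_BASE="${PROSERVER_BASE:-http://localhost:9000}"

# Build the URL for a DAS command. With one argument the command is
# server-wide (e.g. "sources"); with two, the first names a data source.
das_url() {
    if [ "$#" -eq 1 ]; then
        printf '%s/das/%s\n' "$PROSERVER_BASE" "$1"
    else
        printf '%s/das/%s/%s\n' "$PROSERVER_BASE" "$1" "$2"
    fi
}

# Fetch the raw XML for a DAS command (needs curl and a running server).
das_get() {
    curl -s "$(das_url "$@")"
}
```

With the server running, "das_get sources" should print the same XML that the browser's "view source" option shows, and "das_get mygff 'features?segment=Y:1,100'" requests features for the test range configured for the mygff example.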
Take some time to explore the capabilities of the example DAS sources:
ProServer uses an INI file to configure itself, which you can specify using the '-c' command-line option. This INI file defines settings such as the port number the server should listen on, the root directory in which to look for static content, and details of the DAS sources it is serving. In Part 2 you will write your own INI file, but for now take a quick look at the default one. It is located at eg/proserver.ini.
Each section of the INI file is denoted by square brackets. Server options such as port number are in the [general] section. All other sections are treated as DAS sources that the server hosts, each representing an individual source of data.
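To see at a glance which sections ProServer would treat as DAS sources, you can list every section header except [general]. This is a rough sketch using standard sed and grep; the function name list_das_sources is hypothetical, and it assumes section headers start at the beginning of a line.

```shell
# Print the name of every section in an INI file except [general],
# i.e. the sections the tutorial describes as DAS sources.
list_das_sources() {
    sed -n 's/^\[\([^]]*\)].*$/\1/p' "$1" | grep -v '^general$'
}
```

Running "list_das_sources eg/proserver.ini" from the Bio-Das-ProServer directory prints one source name per line.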
Find the [mygff] section in the INI file to see how the "mygff" DAS source is configured.
[mygff]
adaptor = file
state = on
description = An example source using a GFF file
doc_href = http://another.homepage.com
; Properties for the 'file' SourceAdaptor to allow it to read GFF2 files
filename = eg/data/example.gff
cols = segment,method,type,start,end,score,ori,phase,note,note
feature_query = field0 lceq %segment AND field4 >= %start AND field3 <= %end
comment = ^#
separator = \t|\s*;\s*
; Coordinate system and test range:
coordinates = NCBIM_37,Chromosome,Mus musculus -> Y:1,100
Do not worry about the specifics of each property, though hopefully you will have a vague idea what they do. This exercise is merely to familiarise you with where some of the metadata comes from. From here you can see that the actual data is coming from a file, eg/data/example.gff, so have a look inside it using a text editor.
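While looking at the file, it can help to see which rows the feature_query property would select for a given segment and range. The awk sketch below mimics that query under some assumptions: columns are counted from zero in the INI file (so field0, field3 and field4 are awk's $1, $4 and $5), "lceq" is taken to mean a case-insensitive comparison, and rows are split on tabs only rather than on the full separator pattern. It is an illustration of the overlap test, not ProServer's own implementation, and the function name gff_overlaps is hypothetical.

```shell
# Print the rows of a tab-separated GFF2 file that overlap a query range.
# Usage: gff_overlaps FILE SEGMENT START END
#
# The awk program skips comment lines (comment = ^#) and then applies,
# in order, the three conditions of the feature_query property:
#   field0 lceq %segment   (assumed case-insensitive match on column 1)
#   field4 >= %start       (feature end is not before the query start)
#   field3 <= %end         (feature start is not after the query end)
gff_overlaps() {
    awk -F '\t' -v seg="$2" -v qstart="$3" -v qend="$4" '
        /^#/ { next }
        tolower($1) == tolower(seg) &&
        ($5 + 0) >= (qstart + 0) &&
        ($4 + 0) <= (qend + 0)
    ' "$1"
}
```

For example, "gff_overlaps eg/data/example.gff Y 1 100" prints the rows on segment Y that overlap positions 1 to 100, the test range given in the coordinates property.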
Now download another GFF file and save it:
curl 'ftp://ftp.sanger.ac.uk/pub/wormbase/live_release/genomes/c_elegans/genome_feature_tables/GFF2/CHROMOSOME_MtDNA.gff' > CHROMOSOME_MtDNA.gff
[Alternative download: CHROMOSOME_MtDNA.gff]
Edit the mygff source section in proserver.ini to point to this new file and update the coordinates accordingly:
filename = CHROMOSOME_MtDNA.gff
coordinates = WS_200,Chromosome,Caenorhabditis elegans -> CHROMOSOME_MtDNA:1,100
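The segment named in the test range (the part after "->") should match a segment that actually appears in the file. A simple way to check is to list the distinct values in the first column. This sketch assumes the file is tab-separated with the segment in the first column, as in the cols property of the mygff source; the function name gff_segments is hypothetical.

```shell
# List the distinct segment names (first column) of a GFF file,
# ignoring comment lines that start with "#".
gff_segments() {
    grep -v '^#' "$1" | cut -f1 | sort -u
}
```

Running "gff_segments CHROMOSOME_MtDNA.gff" prints the segment names you can use in the coordinates property.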
Stop and re-start your server (Ctrl-C to stop your interactive session on the terminal) and take another look at your modified DAS source. If it doesn't work, keep an eye out for errors in the terminal output.