Manchester Centre for Integrative Systems Biology

Workflows


The analyses of the omics data generated by the study described in Castrillo et al. (2007) have been implemented as the workflows listed below. These workflows can be viewed and run using the Taverna workflow software.


boxplotWithstats_pdf1.xml | New 15/03/07

The workflow below uses R to draw a box plot and summarise selected microarray data from the big experiment, retrieved from MaxD. The results are combined into a PDF document, which is created by a Beanshell script using iText, a free Java PDF library. N.B. For this workflow to run successfully, the iText jar has to be downloaded and placed in the lib folder of your local .taverna directory. This is usually $HOME/.taverna/lib on Linux or c:/Documents and Settings/MyUser/Application Data/Taverna/lib on Windows. In addition, the workflow will only run with the latest update of Taverna, 1.5.1.6.
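As a quick check before enacting the workflow, the following Python sketch looks for an iText jar in the Linux-style lib folder named above. It is only an illustration: the helper names are hypothetical, and the assumption that the jar's file name contains "itext" is a guess rather than something Taverna requires.

```python
from pathlib import Path
from typing import List


def default_taverna_lib_dir(home: Path) -> Path:
    """Linux-style location named in the text: $HOME/.taverna/lib."""
    return home / ".taverna" / "lib"


def find_itext_jars(lib_dir: Path) -> List[Path]:
    """Return jar files in lib_dir whose name contains 'itext' (assumed naming)."""
    if not lib_dir.is_dir():
        return []
    return sorted(
        p for p in lib_dir.iterdir()
        if p.suffix.lower() == ".jar" and "itext" in p.name.lower()
    )


if __name__ == "__main__":
    lib_dir = default_taverna_lib_dir(Path.home())
    jars = find_itext_jars(lib_dir)
    if jars:
        print("Found iText jar(s):", ", ".join(p.name for p in jars))
    else:
        print("No iText jar found in", lib_dir)
```

On Windows the lib folder is the Application Data path given above, so a `Path` pointing there would be passed to `find_itext_jars` directly.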

The result of this workflow is a PDF document, which can be displayed using the PDF renderer plugin. The plugin can be installed as follows:

  • Click on Tools > Plugin Manager.
  • Click on Find New Plugins and add a New Plugin Site using "MCISB" as the site name and "http://dbk-ed.mib.man.ac.uk/taverna/1.5/plugins/" as the site URL.
  • Install the PDF renderer plugin.

Once the plugin has been installed, the workflow can be enacted, and it should produce results similar to those shown below. The box plots show the spread of gene expression values from chip measurements. The text below the box plots shows some miscellaneous statistics about the data.
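To make the contents of a box plot concrete, here is a minimal Python sketch of the five values it summarises (minimum, lower quartile, median, upper quartile, maximum). This is not the R script used by the workflow; the function name and sample values are illustrative, and R's box plots use Tukey hinges, which can differ slightly from the interpolated quartiles computed here.

```python
import statistics
from typing import Sequence, Tuple


def five_number_summary(
    values: Sequence[float],
) -> Tuple[float, float, float, float, float]:
    """Return (min, Q1, median, Q3, max) for one chip's expression values."""
    if len(values) < 2:
        raise ValueError("need at least two values to compute quartiles")
    ordered = sorted(values)
    # 'inclusive' interpolates between data points, treating the sample
    # minimum and maximum as the 0th and 100th percentiles.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


if __name__ == "__main__":
    # Made-up expression values for a single chip measurement.
    chip_values = [7.1, 8.4, 6.9, 9.2, 7.8]
    print(five_number_summary(chip_values))
```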

The PDF document can be viewed using the PDF renderer. It may also be saved onto your file system if required.

Addendum: Running this workflow requires a username and password for running the R processor and accessing the data in the MaxD database.


calcGeneExpFreq.xml | New 15/03/07

R can be used to analyse data from Taverna workflows through an Rserve server, which has been wrapped for use as a Taverna processor by Ingo Wassink. The workflow below shows how data retrieved from the MaxD database can be analysed by a simple R script, which calculates the frequencies of gene expression values from a given experiment.

The result of this workflow is shown below as a histogram of the gene expression value frequencies calculated by the R script.

Whilst this is a simple analysis, it shows what can be achieved once distributed microarray data is in a form which is accessible by Taverna with R.
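The kind of frequency calculation described here can be sketched in a few lines of Python. This is an illustration only, not the workflow's R script: the function name, the fixed bin width, and the sample values are all assumptions.

```python
import math
from collections import Counter
from typing import Iterable, List, Tuple


def expression_frequencies(
    values: Iterable[float], bin_width: float = 1.0
) -> List[Tuple[float, int]]:
    """Count values per bin; returns (bin lower edge, count) pairs in order."""
    if bin_width <= 0:
        raise ValueError("bin_width must be positive")
    # Each value falls in the bin whose index is floor(value / bin_width).
    counts = Counter(math.floor(v / bin_width) for v in values)
    return [(index * bin_width, counts[index]) for index in sorted(counts)]


if __name__ == "__main__":
    # Made-up expression values standing in for data retrieved from MaxD.
    sample = [0.2, 0.7, 1.5, 2.1, 2.9]
    for lower_edge, count in expression_frequencies(sample, bin_width=1.0):
        print(f"{lower_edge:>5.1f}  {'#' * count}")
```

Plotting the resulting counts against the bin edges gives a histogram of the kind the workflow produces.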

Addendum: Running this workflow requires a username and password for running the R processor and accessing the data in the MaxD database.

queryMaxD.xml | New 26/02/07

There are a number of projects developing distributed systems for analysing microarray data, e.g. GEMEPS and the Extensible MicroArray Analysis System. Such systems require a repository for storing the data and metadata generated from microarray experiments. Here in Manchester, Andy Brass's group has developed the MaxD database, a MIAME-compliant database for storing gene expression data generated from microarray experiments.

Work done by Giles in the DBK group has provided MaxD with a web services interface into the maxdBrowse client, which is accessible from Taverna. It's still easier to browse the data and metadata for an experiment in MaxD from a browser, but once you know the experiment and associated measurements you are interested in, you can retrieve the gene expression values from Taverna for further analysis.

The above workflow retrieves the gene expression values for a given measurement identifier from an experiment stored in MaxD and transforms the data into a comma-separated value (CSV) String.

It's often useful to transform the data into a CSV String, as it can then be analysed further by applications for mathematical analysis, such as R and Matlab.
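As a rough illustration of this transformation, the Python sketch below turns a list of expression values into a single CSV String and parses it back. The function names and sample values are hypothetical, and the actual layout of the String produced by the workflow may differ.

```python
from typing import Iterable, List


def to_csv_string(values: Iterable[float]) -> str:
    """Join numeric expression values into one comma-separated line."""
    return ",".join(str(float(v)) for v in values)


def from_csv_string(text: str) -> List[float]:
    """Parse a comma-separated line back into a list of floats."""
    if not text.strip():
        return []
    return [float(field) for field in text.split(",")]


if __name__ == "__main__":
    # Made-up expression values for one measurement identifier.
    values = [1.5, 2.0, -0.25]
    csv_line = to_csv_string(values)
    print(csv_line)
    print(from_csv_string(csv_line))
```

A String in this form can be handed to a downstream tool and split back into numbers there, which is what makes it convenient as an interchange format between workflow steps.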

Addendum: Running this workflow requires a username and password for accessing the data in the MaxD database.