TestingBioMoby – Hackathon

Context Navigation

by Kazuharu Arakawa and Hiroyuki Nakamura (G-language Project, Institute for Advanced Biosciences, Keio University)

MOBY Client

Current BioMoby release for Perl ( http://biomoby.org/PerlReleases/) distributes Client and Server codes together, with several required external CPAN/non-CPAN modules.

Majority of the users would just want the client code, so we have put together a Client-only package, using Module::Install so that it automatically installs required CPAN modules upon make.

Edward has already uploaded the module to CPAN ( http://search.cpan.org/~ekawas/MOBY-Client-1.0/lib/MOBY/Client/Central.pm) so users can easily install MOBY-Client by the following command:

sudo cpan MOBY::Client::Central

Testing MOBY services

Out of 39 services that can be used with NCBI gi:

* 5 returned 404 NOT FOUND
* 10 returned mobyException (mostly java.lang.exception, or Perl errors)
* only 16 returned some sort of response.

MOBY objectTypes

* 584 in total
* 123 has services (rest are output data types)

MOBY Client in G-language GAE

MOBY Client implemented in G-language GAE, developed during this hackathon, is available in release 1.8.2. Get the latest software package here.

Tutorial is available here.

Philosophy behind the G-language GAE MOBY client

We prefer to stick with the KISS principle (wikipedia) (Keep It Sweet & Simple), and we believe it is also preferable for web services.

Sophisticated semantics and ontology will probably work if ALL web services are annotated with that semantics, but we already know that it is not the case. Molecular Biology is obviously a rapidly advancing discipline of science, and even with the tremendous amount of efforts of the Bio* projects to make bioinformatics seamless throughout the numerous tools and databases, we still need many "glue codes" to connect the pipelines into usable workflows.

If there is even a minimal amount of documentation and if that documentation is accessible, it is relatively easy for a user to figure out if one output of certain web service can be used as the input of another. Likewise, scientists can easily decipher wether given string is an amino acid sequence, nucleotide sequence, or NCBI gi (If he/she is not certain, he/she should not be using the service anyway). If the user can access the output of certain web service in a scripting language, it should also be simple for him/her to modify the variable to "glue" it to different service that take it as input in a little different format.

In fact, majority of the BioMOBY ontology/namespace are defined for singletons, that are also mostly output formats. Input formats are rather limited, and therefore are simple.

In light of the above facts, we have made the services easily discoverable, with the "help" command that is used to search G-language/BioPerl documentations. We have simply added a "-w" option which allows searching of web services, and when a specific service name is given, it shows the service description, with input/output types. All services can then be run with ws() function. When all procedures are done, users can simply use the "makelog" command of G-language Shell to output the session as a Perl script, which can be used as a "workflow". See here for an example.

SOAP was originally designed for RPC, and to transfer objects in an interoperable way. However, when connecting numerous web services in the form of workflows, data objects are usually too complicated to understand and to handle, so simpler data structures such as simple strings, arrays, or dictionaries (hashes) should be preferable. If a user wants to use one service just because there is no alternative, REST services that can be accessed with "curl" or "wget" command line utilities and with very simple (even raw and not marked up) output are most preferable.

If a user wants to make use of MANY web services to create a workflow of services, WSDL file does not include service descriptions by nature, so it is very difficult for these services to be discoverable without some standards, such as BioMOBY. BioMOBY does a very good job for service discovery, so we hope that BioMOBY keeps its necessary description simple and short.

Therefore, we believe SOAP is not simple enough, and it is neither sufficient in terms of service discovery. To make "gluing" easier, we believe that web services should be extremely simple to use (perhaps with REST?), and the clients should take care of the "gluing" with user interface and ease of scripting.

Download in other formats:

Plain Text