5 | | Databases in lifescience area have enormous amount of data, still |
6 | | growing and becoming more multi-faceted. To provide integrated services |
7 | | , the way just centralising these databases, like by mirroring each |
8 | | databases, is insufficient but also the way using directly over |
9 | | distributed environments is strongly needed to be constructed at really |
10 | | serviceable state. |
| 5 | As life science databases keep growing and are updated daily, integrating them by mirroring everything at a single organization is not feasible.
| 6 | Instead, we need to develop a way to use a variety of databases directly across distributed environments.
12 | | This situation brought on SOAP/WSDL web services to be launched at |
13 | | institutes around bioinformatics like NCBI and EBI abroad, and DDBJ XML, |
14 | | KEGG API, PDBj, CBRI in Japan, providing respective services handling |
15 | | many tools and databases. |
| 8 | This situation has led major database centers around the world, including NCBI and EBI, to launch SOAP/WSDL web services.
| 9 | Likewise, DDBJ, KEGG, PDBj, and CBRC have released corresponding web services in Japan.
17 | | Also, by myGrid project and BioMOBY project, efforts has already been |
18 | | made to provide foundation to use those web services in an unified way, |
19 | | which brings situation very practical to implement integration by web services |
20 | | technology around this field. |
| 11 | Several projects, such as BioMOBY and myGrid/Taverna, have been started to use these services in a unified way,
| 12 | and it is strongly believed that this kind of integration in this field should be accomplished with SOAP/WSDL-based web service technology.
22 | | Main problems are, the variety of each specification and |
23 | | naming convention, and unstandardised data structure to be passed. |
24 | | Additionally, handling for cases like temporal service down or whatever error |
25 | | occured while executing a job just rely on each provided servers which |
26 | | tends to leave insufficient specification documentation. |
| 14 | However, inconsistent data structure specifications and naming conventions among these services prevent interoperability.
| 15 | In addition, the insufficient documentation of each service is one of the bottlenecks to popularizing these services.
| 16 | Moreover, web services were originally designed to be usable from every programming language that supports SOAP/WSDL,
| 17 | but several services cannot be used from some languages.
| 18 | It is also hard for end users to handle cases such as temporary service outages or errors that occur while a job is executing.
28 | | Existence of different usages among each services, or situation compelling |
29 | | user to exchange data types and to handle exceptions for their own |
30 | | is really inefficient. Furthermore, the number of serviced tools and |
31 | | databases providing web services are still few, which makes it quite |
32 | | severe to achieve constructions of workflows through these services for |
33 | | the moment. |
| 20 | To improve this situation, it is key to standardize the data structures passed among existing web services and
| 21 | to increase the number of tools and databases accessible through web services, so that practical bioinformatics workflows can be created.
37 | | Thus we at DBCLS have planned to search usage of existing web services and |
38 | | of those data structures to construct proxy-like server, which aim to provide |
39 | | unifed and consistent naming conventions and usages. |
40 | | To make this, we would first consider; |
| 25 | Thus, we at DBCLS are planning to construct a proxy-like server that aims to provide unified and consistent usage of the existing services.
| 26 | For this purpose, we will first consider a server that:
42 | | * Services be well documented. |
43 | | * Owe error handling at server side as much as possible. |
44 | | * Servers to be accessible by languages widely used like Perl, Ruby, Python, |
45 | | Java and more as many as possible. |
46 | | * Owe data exchange by server side to make communicate between many |
47 | | servers servicing web services. |
48 | | * Pipelines over several steps be done on server side to be effective enough. |
| 28 | * provides sufficient documentation for the usage of each service
| 29 | * ensures that every operation is accessible from widely used languages such as Perl, Ruby, Python, and Java
| 30 | * translates the data structures exchanged among servers to create seamless workflows
| 31 | * chains multi-step pipelines on the server side in typical cases, for efficiency
| 32 | * handles errors returned by external services on the server side as much as possible
50 | | By conducting our proposal, not just providing environments exeeded in |
51 | | usability to many researchers, it can act as an infrastructure for |
52 | | constructing workflows which accordingly provide each centers a dramatical |
53 | | increase of internet access counts. |
54 | | |
55 | | At the same time, consideration is needed over what categories of |
56 | | services to provide, those quantity and quality. Therefore it is |
57 | | desirable to parallelize working on providing service along with |
58 | | developments of integrated databases and tools at DBCLS. |
| 34 | By carrying out our proposal, the server will not only provide a user-friendly environment to many researchers,
| 35 | but can also act as an infrastructure for constructing practical workflows that make effective use of existing services.
| 36 | At the same time, we will also provide new web services for tools and databases to be developed at DBCLS. |
70 | | Therefore, we are going to held developer's meeting in January or |
71 | | February of 2008. There, core developers and key members at home and |
72 | | abroad related to each web service providers, including BioMOBY, |
73 | | Open Bio*s (ie. like BioPerl), would be offered to attend |
74 | | this meeting, staying for about a week in Japan. |
| 48 | Therefore, we are going to hold a developers' meeting in February 2008.
| 49 | Key members from web service providers around the world
| 50 | and core developers from BioMOBY and Open Bio* (BioPerl, BioRuby, BioPython, and BioJava)
| 51 | will be invited to attend this meeting and stay in Japan for about a week.
76 | | So far, data type (class design) for bioinformatics has been defined for |
77 | | each project of Open-bio (like BioPerl, BioPython, BioJava, BioRuby) |
78 | | respectively. |
79 | | Though to benefit interoperability, it seems to be nice to define |
80 | | standard specification of objects based on web services' class, to |
81 | | comply with it. |
| 53 | Historically, the class designs that represent bioinformatics data types
| 54 | have been defined independently by each Open Bio* project.
| 55 | On this occasion, defining a standard specification for biological objects
| 56 | through web services will also benefit interoperability among these libraries.
| 57 | Seamless integration of the remote (web services) and local (installed tools) environments is
| 58 | another challenge for those projects.