Version 5 (modified by hlapp, 10 years ago)


At present there is no standard web-service API for phylogenetic data that would allow data and service providers to be integrated into the programmable web. Hence, current approaches to integrating data and services into workflows are highly specific to the integration platform (CIPRES, BioPerl, Bio::Phylo, Kepler) and nearly unusable in other environments.

Here are several ideas for tasks we can work on at the hackathon:

  • Defining scope
    • Issue of identifiers and OTUs
  • Accumulating use-cases
    • OTU-oriented queries:
      • Need the ability to obtain the sequence(s) and taxon (or taxa) for a leaf (or, more generally, any node)
      • Finding trees by sequence (rather than the OTU name)
  • Formulating a task-oriented API requirements description
  • Proposing a concrete REST- or SOAP-based API
  • Proposing input/output formats
  • Starting a reference implementation, for example based on data in BioSQL
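
To make the OTU-oriented use-cases above more concrete, here is a minimal in-memory sketch of the two queries (annotations for a node, and trees by sequence). All class, method, and identifier names (`TreeStore`, `annotations`, `find_trees_by_sequence`, `SEQ_A`, and so on) are hypothetical and chosen for illustration only; they are not part of any existing API or of BioSQL.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str                                      # OTU label as stored in the tree
    sequences: list = field(default_factory=list)   # hypothetical sequence accessions
    taxa: list = field(default_factory=list)        # taxon names linked to the node


@dataclass
class Tree:
    tree_id: str
    nodes: dict = field(default_factory=dict)       # OTU label -> Node


class TreeStore:
    """Toy stand-in for a tree database (an assumption, not a real backend)."""

    def __init__(self):
        self.trees = {}
        self._by_sequence = {}                      # accession -> set of tree ids

    def add_tree(self, tree):
        self.trees[tree.tree_id] = tree
        for node in tree.nodes.values():
            for accession in node.sequences:
                self._by_sequence.setdefault(accession, set()).add(tree.tree_id)

    def annotations(self, tree_id, label):
        """Return (sequences, taxa) linked to one node of one tree."""
        node = self.trees[tree_id].nodes[label]
        return list(node.sequences), list(node.taxa)

    def find_trees_by_sequence(self, accession):
        """Return ids of all trees containing a node linked to the sequence."""
        return sorted(self._by_sequence.get(accession, set()))


store = TreeStore()
store.add_tree(Tree("T1", {"otu1": Node("otu1", ["SEQ_A"], ["Homo sapiens"])}))
store.add_tree(Tree("T2", {"x9": Node("x9", ["SEQ_A"], ["Homo sapiens"]),
                           "x7": Node("x7", ["SEQ_B"], ["Mus musculus"])}))

seqs, taxa = store.annotations("T2", "x7")
hits = store.find_trees_by_sequence("SEQ_A")
```

Note that the second query finds both trees even though the same sequence carries a different OTU label in each, which is the point of searching by sequence rather than by OTU name.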

Gathering of use-cases and task-oriented requirements has started at

The Open Space discussion centered on the following issues:

  • The OTU (Operational Taxonomic Unit) perspective is an important use-case.
    • Gene tree analysis: similar to the Zmasek et al. (2007) paper, one may want to build alignments and phylogenetic trees for all ortholog families of a gene family or a pathway. After loading the trees into a database, one could then query it for those gene trees that support a certain species phylogeny, for example the Ecdysozoa hypothesis.
      • Problem: the query topology will be given with either gene name labels, or species name labels, but the labels of the trees will be OTUs.
      • Hence, each OTU needs to be linked to its gene name(s) and taxon name(s), and a query needs to be able to specify that tree nodes are matched by their linked taxon or gene names rather than by their OTU labels.
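
The label-matching problem described above can be sketched as follows. This is only an illustration under stated assumptions: trees are rooted and written as nested tuples of leaf labels, each OTU maps to exactly one taxon name, each taxon occurs at most once per gene tree (no paralogs), and "supports" means that every clade of the query also appears in the gene tree once it is restricted to the query's taxa. The function names and OTU labels are hypothetical.

```python
def relabel(tree, mapping):
    """Replace each leaf's OTU label with its linked name (taxon or gene)."""
    if isinstance(tree, str):
        return mapping[tree]
    return tuple(relabel(child, mapping) for child in tree)


def clades(tree):
    """Return the set of clades (as frozensets of leaf labels) of a rooted tree."""
    found = set()

    def walk(node):
        if isinstance(node, str):
            leaves = frozenset([node])
        else:
            leaves = frozenset().union(*(walk(child) for child in node))
        found.add(leaves)
        return leaves

    walk(tree)
    return found


def supports(gene_tree, otu_to_name, query):
    """True if every clade of the query occurs in the relabeled gene tree,
    after restricting the gene tree's clades to the names used in the query."""
    query_clades = clades(query)
    names = frozenset().union(*query_clades)
    named_tree = relabel(gene_tree, otu_to_name)
    restricted = {clade & names for clade in clades(named_tree)} - {frozenset()}
    return query_clades <= restricted


# Gene tree labeled with OTUs, plus the OTU -> taxon links.
gene_tree = (("otu_dm", "otu_ce"), ("otu_hs", "otu_mm"))
otu_to_taxon = {
    "otu_dm": "Drosophila melanogaster",
    "otu_ce": "Caenorhabditis elegans",
    "otu_hs": "Homo sapiens",
    "otu_mm": "Mus musculus",
}

# Query topologies given with species names rather than OTU labels.
ecdysozoa = (("Drosophila melanogaster", "Caenorhabditis elegans"), "Homo sapiens")
coelomata = (("Drosophila melanogaster", "Homo sapiens"), "Caenorhabditis elegans")

supports_ecdysozoa = supports(gene_tree, otu_to_taxon, ecdysozoa)
supports_coelomata = supports(gene_tree, otu_to_taxon, coelomata)
```

Passing a mapping from OTUs to gene names instead of taxon names would let the same function match a query given with gene name labels. A real implementation would also have to handle unrooted trees and several sequences per species, which this sketch deliberately leaves out.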