
Workflow Indexing API

Soumya Brahma edited this page Feb 6, 2016 · 3 revisions


API function overview

The main goal of the workflow indexing work is to give scientists a way to search for workflows by functionality, properties, or other conceptualizations, making the workflows easier to access.

To make workflows more available to users, we have created an API comprising a set of services for indexing, searching, and recommending workflow processes.

The API is described in more detail in the following sections.

API usage

This API is used in much the same way as most of the other APIs that provide information to users in the Wf4ever project (Stability services, Checklist, etc.).

Suppose we have a process name (or a list of process names). This is the only input the API needs.

  • Process/Processes: name of the process, or list of process names, to use for the search or recommendation services.

The workflow abstraction services are then invoked in a sequence of two HTTP operations:

Case 1: Search

C: GET /stability/rest/search HTTP/1.1
C: Host: service.example.org
C: Accept: application/rdf+xml
S: HTTP/1.1 200 OK
S: Content-Type: application/rdf+xml
S:
S:<?xml version="1.0" encoding="UTF-8"?>
S:<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:roe="http://sandbox.wf4ever-project.org/wfabstraction">
S:<rdf:Description rdf:about="">
S: <roe:wfabstraction>/rest/search{?process[]}</roe:wfabstraction>
S:</rdf:Description>
S:</rdf:RDF>
C: GET /stability/rest/search?process=name_of_process HTTP/1.1
C: Host: service.example.org
C: Accept: application/json
S: HTTP/1.1 200 OK
S: Content-Type: application/json
S:
S: (result of the search in json)
S:

Case 2: Recommend

C: GET /stability/rest/recommend HTTP/1.1
C: Host: service.example.org
C: Accept: application/rdf+xml
S: HTTP/1.1 200 OK
S: Content-Type: application/rdf+xml
S:
S:<?xml version="1.0" encoding="UTF-8"?>
S:<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:roe="http://sandbox.wf4ever-project.org/wfabstraction">
S:<rdf:Description rdf:about="">
S: <roe:wfabstraction>/rest/recommend{?process[]}</roe:wfabstraction>
S:</rdf:Description>
S:</rdf:RDF>
C: GET /stability/rest/recommend?process=name_of_process HTTP/1.1
C: Host: service.example.org
C: Accept: application/json
S: HTTP/1.1 200 OK
S: Content-Type: application/json
S:
S: (result of the recommendation in json)
S:
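
The two-step exchange shown in both cases can be sketched in Python as follows. This is only an illustration: the host and the `/stability/rest` path are copied from the example exchanges above, `build_request` is a hypothetical helper name, and the requests are built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Host and path are taken from the example exchanges; adjust for a real deployment.
BASE_URL = "http://service.example.org/stability/rest"


def build_request(service, processes=(), accept="application/json"):
    """Build (but do not send) a GET request for the 'search' or 'recommend' service."""
    if service not in ("search", "recommend"):
        raise ValueError("unknown service: " + service)
    url = BASE_URL + "/" + service
    if processes:
        # Each process name becomes its own 'process' query parameter.
        url += "?" + urlencode([("process", name) for name in processes])
    return Request(url, headers={"Accept": accept}, method="GET")


# Step 1: ask for the service description (RDF/XML).
description_request = build_request("search", accept="application/rdf+xml")
# Step 2: run the query itself and ask for JSON.
query_request = build_request("search", ["Processor regex_value"])

print(description_request.full_url, description_request.get_header("Accept"))
print(query_request.full_url, query_request.get_header("Accept"))
```

Passing either request to `urllib.request.urlopen` would perform the call; that step is left out here because `service.example.org` is a placeholder host.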

Link relations

Links where the services are currently available:

General information about the workflow abstraction in wf4ever:

For other information see also:

HTTP methods

GET is the only HTTP method available for this API. Depending on content negotiation (explained in the next section), the response is either a) a description of the service or b) the results obtained by invoking the service.

Service description



<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:roe="http://sandbox.wf4ever-project.org/wfabstraction">
   <rdf:Description rdf:about="">
      <roe:wfabstraction>/rest/search{?process[]}</roe:wfabstraction>
   </rdf:Description>
</rdf:RDF>
OR
<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:roe="http://sandbox.wf4ever-project.org/wfabstraction">
   <rdf:Description rdf:about="">
      <roe:wfabstraction>/rest/recommend{?process[]}</roe:wfabstraction>
   </rdf:Description>
</rdf:RDF>
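
As a rough sketch, a client could read the URI template out of such a description and turn it into a query string. The function names `read_template` and `expand_template` are hypothetical, and the expansion rule is an assumption: `{?process[]}` is treated as "repeat the `process` parameter once per name", which matches the `?process=text1&process=text2` form used elsewhere on this page.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

ROE_NS = "http://sandbox.wf4ever-project.org/wfabstraction"

DESCRIPTION = """<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:roe="http://sandbox.wf4ever-project.org/wfabstraction">
   <rdf:Description rdf:about="">
      <roe:wfabstraction>/rest/search{?process[]}</roe:wfabstraction>
   </rdf:Description>
</rdf:RDF>"""


def read_template(description_xml):
    """Return the text of the roe:wfabstraction element."""
    root = ET.fromstring(description_xml.encode("utf-8"))
    element = root.find(".//{%s}wfabstraction" % ROE_NS)
    if element is None or not element.text:
        raise ValueError("no wfabstraction template found")
    return element.text.strip()


def expand_template(template, processes):
    """Replace the trailing '{?process[]}' with repeated 'process' parameters (assumed rule)."""
    path = re.sub(r"\{\?process\[\]\}$", "", template)
    if path == template:
        raise ValueError("template does not end with {?process[]}")
    query = urlencode([("process", name) for name in processes])
    return path + ("?" + query if query else "")


template = read_template(DESCRIPTION)
print(expand_template(template, ["text1", "text2"]))
```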

Searching inside workflows

The trie structure created for indexing has been encapsulated to provide the following two services:

  • Search: receives a sequence of processes and searches for workflows that contain that sequence.
  • Service_call: /wfabstraction/rest/search?process=Processor regex_value
  • Inputs: sequence of process names (?process=text1&process=text2)
  • Output: XML or JSON structure providing the following information (process_id, freq, URIs).
    • process_id: the name of the process used in the query.
    • freq: how many times this process appears in other workflows.
    • URIs: URIs of the workflows in which the process appears.
  • An example of output is:

XML

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<search>
    <process id="Processor regex_value" freq="26.0">
        <uri>http://ns.taverna.org.uk/2010/workflowBundle/1cd4c8a3-0ab9-42d4-b5c3-e3c91231b9a9/workflow/Find_orthologs_using_a_list_of_uniprot_accessions/</uri>
        <uri>http://ns.taverna.org.uk/2010/workflowBundle/1cd4c8a3-0ab9-42d4-b5c3-e3c91231b9a9/workflow/Find_orthologs_using_a_list_of_uniprot_accessions/</uri>
        <uri>http://ns.taverna.org.uk/2010/workflowBundle/b62064b9-4d4c-459b-bf16-dc62f06035a3/workflow/Get_homologous_from_NCBI_homoloGene/</uri>
        <uri>http://ns.taverna.org.uk/2010/workflowBundle/b62064b9-4d4c-459b-bf16-dc62f06035a3/workflow/Get_homologous_from_NCBI_homoloGene/</uri>
        <uri>http://ns.taverna.org.uk/2010/workflowBundle/da8428f6-e0ee-4b6c-90fd-1c1d9c2b994f/workflow/EBI_PICR__find_cross_references_for_protein_accessions/</uri>
        <uri>http://ns.taverna.org.uk/2010/workflowBundle/da8428f6-e0ee-4b6c-90fd-1c1d9c2b994f/workflow/EBI_PICR__find_cross_references_for_protein_accessions/</uri>
        <uri>http://ns.taverna.org.uk/2010/workflowBundle/74b80ee3-e864-439c-8706-f8c74097cab6/workflow/Workflow4/</uri>
        <uri>http://ns.taverna.org.uk/2010/workflowBundle/74b80ee3-e864-439c-8706-f8c74097cab6/workflow/Workflow4/</uri>
        <uri>http://ns.taverna.org.uk/2010/workflowBundle/fed272a3-24d2-4926-b8ce-b39e9fb83309/workflow/AMIGA_ConeSearch_from_a_file_of_targets_positions/</uri>
        <uri>http://ns.taverna.org.uk/2010/workflowBundle/fed272a3-24d2-4926-b8ce-b39e9fb83309/workflow/AMIGA_ConeSearch_from_a_file_of_targets_positions/</uri>
        <uri>http://ns.taverna.org.uk/2010/workflowBundle/1cd4c8a3-0ab9-42d4-b5c3-e3c91231b9a9/workflow/Find_orthologs_using_a_list_of_uniprot_accessions/</uri>
        <uri>http://ns.taverna.org.uk/2010/workflowBundle/1cd4c8a3-0ab9-42d4-b5c3-e3c91231b9a9/workflow/Find_orthologs_using_a_list_of_uniprot_accessions/</uri>
        <uri>http://ns.taverna.org.uk/2010/workflowBundle/5cc50154-e29b-4059-b08b-9c35cb69c46b/workflow/Workflow32/</uri>
    </process>
</search>

JSON

{
    "process": {
        "@freq": "30.0",
        "@id": "Processor regex_value",
        "uri": [
            "http://ns.taverna.org.uk/2010/workflowBundle/74b80ee3-e864-439c-8706-f8c74097cab6/workflow/Workflow4/",
            "http://ns.taverna.org.uk/2010/workflowBundle/74b80ee3-e864-439c-8706-f8c74097cab6/workflow/Workflow4/",
            "http://ns.taverna.org.uk/2010/workflowBundle/da8428f6-e0ee-4b6c-90fd-1c1d9c2b994f/workflow/EBI_PICR__find_cross_references_for_protein_accessions/",
            "http://ns.taverna.org.uk/2010/workflowBundle/da8428f6-e0ee-4b6c-90fd-1c1d9c2b994f/workflow/EBI_PICR__find_cross_references_for_protein_accessions/",
            "http://ns.taverna.org.uk/2010/workflowBundle/fed272a3-24d2-4926-b8ce-b39e9fb83309/workflow/AMIGA_ConeSearch_from_a_file_of_targets_positions/",
            "http://ns.taverna.org.uk/2010/workflowBundle/fed272a3-24d2-4926-b8ce-b39e9fb83309/workflow/AMIGA_ConeSearch_from_a_file_of_targets_positions/",
            "http://ns.taverna.org.uk/2010/workflowBundle/b62064b9-4d4c-459b-bf16-dc62f06035a3/workflow/Get_homologous_from_NCBI_homoloGene/",
            "http://ns.taverna.org.uk/2010/workflowBundle/b62064b9-4d4c-459b-bf16-dc62f06035a3/workflow/Get_homologous_from_NCBI_homoloGene/",
            "http://ns.taverna.org.uk/2010/workflowBundle/1cd4c8a3-0ab9-42d4-b5c3-e3c91231b9a9/workflow/Find_orthologs_using_a_list_of_uniprot_accessions/",
            "http://ns.taverna.org.uk/2010/workflowBundle/1cd4c8a3-0ab9-42d4-b5c3-e3c91231b9a9/workflow/Find_orthologs_using_a_list_of_uniprot_accessions/",
            "http://ns.taverna.org.uk/2010/workflowBundle/1cd4c8a3-0ab9-42d4-b5c3-e3c91231b9a9/workflow/Find_orthologs_using_a_list_of_uniprot_accessions/",
            "http://ns.taverna.org.uk/2010/workflowBundle/1cd4c8a3-0ab9-42d4-b5c3-e3c91231b9a9/workflow/Find_orthologs_using_a_list_of_uniprot_accessions/",
            "http://ns.taverna.org.uk/2010/workflowBundle/b62064b9-4d4c-459b-bf16-dc62f06035a3/workflow/Get_homologous_from_NCBI_homoloGene/",
            "http://ns.taverna.org.uk/2010/workflowBundle/b62064b9-4d4c-459b-bf16-dc62f06035a3/workflow/Get_homologous_from_NCBI_homoloGene/",
            "http://ns.taverna.org.uk/2010/workflowBundle/b62064b9-4d4c-459b-bf16-dc62f06035a3/workflow/Get_homologous_from_NCBI_homoloGene/",
            "http://ns.taverna.org.uk/2010/workflowBundle/b62064b9-4d4c-459b-bf16-dc62f06035a3/workflow/Get_homologous_from_NCBI_homoloGene/",
            "http://ns.taverna.org.uk/2010/workflowBundle/5cc50154-e29b-4059-b08b-9c35cb69c46b/workflow/Workflow32/",
            "http://ns.taverna.org.uk/2010/workflowBundle/5cc50154-e29b-4059-b08b-9c35cb69c46b/workflow/Workflow32/",
            "http://ns.taverna.org.uk/2010/workflowBundle/5cc50154-e29b-4059-b08b-9c35cb69c46b/workflow/Workflow32/",
            "http://ns.taverna.org.uk/2010/workflowBundle/5cc50154-e29b-4059-b08b-9c35cb69c46b/workflow/Workflow32/",
            "http://ns.taverna.org.uk/2010/workflowBundle/b62064b9-4d4c-459b-bf16-dc62f06035a3/workflow/Get_homologous_from_NCBI_homoloGene/",
            "http://ns.taverna.org.uk/2010/workflowBundle/b62064b9-4d4c-459b-bf16-dc62f06035a3/workflow/Get_homologous_from_NCBI_homoloGene/",
            "http://ns.taverna.org.uk/2010/workflowBundle/da8428f6-e0ee-4b6c-90fd-1c1d9c2b994f/workflow/EBI_PICR__find_cross_references_for_protein_accessions/",
            "http://ns.taverna.org.uk/2010/workflowBundle/da8428f6-e0ee-4b6c-90fd-1c1d9c2b994f/workflow/EBI_PICR__find_cross_references_for_protein_accessions/",
            "http://ns.taverna.org.uk/2010/workflowBundle/b62064b9-4d4c-459b-bf16-dc62f06035a3/workflow/Get_homologous_from_NCBI_homoloGene/",
            "http://ns.taverna.org.uk/2010/workflowBundle/b62064b9-4d4c-459b-bf16-dc62f06035a3/workflow/Get_homologous_from_NCBI_homoloGene/",
            "http://ns.taverna.org.uk/2010/workflowBundle/5cc50154-e29b-4059-b08b-9c35cb69c46b/workflow/Workflow32/",
            "http://ns.taverna.org.uk/2010/workflowBundle/5cc50154-e29b-4059-b08b-9c35cb69c46b/workflow/Workflow32/",
            "http://ns.taverna.org.uk/2010/workflowBundle/5cc50154-e29b-4059-b08b-9c35cb69c46b/workflow/Workflow32/",
            "http://ns.taverna.org.uk/2010/workflowBundle/5cc50154-e29b-4059-b08b-9c35cb69c46b/workflow/Workflow32/"
        ]
    }
}
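
A client consuming the JSON form might reduce it to the distinct workflows, since the same URI can appear several times in the list. The sketch below assumes the field names shown in the example above (`@id`, `@freq`, `uri`); the sample payload is a shortened copy of that example, and `summarize_search` is a hypothetical helper.

```python
import json

# Shortened copy of the JSON example above (only three of the URIs are kept).
SAMPLE = """
{
    "process": {
        "@freq": "30.0",
        "@id": "Processor regex_value",
        "uri": [
            "http://ns.taverna.org.uk/2010/workflowBundle/74b80ee3-e864-439c-8706-f8c74097cab6/workflow/Workflow4/",
            "http://ns.taverna.org.uk/2010/workflowBundle/74b80ee3-e864-439c-8706-f8c74097cab6/workflow/Workflow4/",
            "http://ns.taverna.org.uk/2010/workflowBundle/5cc50154-e29b-4059-b08b-9c35cb69c46b/workflow/Workflow32/"
        ]
    }
}
"""


def summarize_search(payload):
    """Return the process id, its frequency, and the distinct workflow URIs."""
    process = json.loads(payload)["process"]
    uris = process["uri"]
    if isinstance(uris, str):
        # Assumption: a single URI might be serialized as a bare string, not a list.
        uris = [uris]
    distinct = list(dict.fromkeys(uris))  # drop duplicates, keep first-seen order
    return {"id": process["@id"], "freq": float(process["@freq"]), "workflows": distinct}


summary = summarize_search(SAMPLE)
print(summary["id"], summary["freq"], len(summary["workflows"]))
```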

Recommendation service for processes inside workflows

  • Recommend: returns the most frequent next process given a sequence of previous ones.
  • Service_call: /wfabstraction/rest/recommend?process=Processor regex_value
  • Input: sequence of process names (?process=text1&process=text2)
  • Output: XML or JSON structure providing the following information (id, prob, freq).
    • id: id of the recommended process.
    • prob: the probability that the recommendation is correct.
    • freq: number of times the recommended process appears.
  • An example of output is:

XML

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<recommendation>
    <process prob="0.26923078298568726" id="Processor Split_string_into_string_list_by_regular_expression" freq="7.0"/>
</recommendation>

JSON

{
    "process": {
        "@freq": "7.0",
        "@id": "Processor Split_string_into_string_list_by_regular_expression",
        "@prob": "0.23333333432674408"
    }
}
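
To make the idea behind both services concrete, here is a minimal trie sketch. It is not the service's actual implementation: it assumes each workflow can be represented as a linear list of process names, indexes every suffix of that list, and computes `prob` as the frequency of the next process divided by the frequency of the queried sequence. The sample values on this page are at least consistent with that ratio (7/26 ≈ 0.2692 and 7/30 ≈ 0.2333). The class and method names are hypothetical.

```python
class TrieNode:
    def __init__(self):
        self.children = {}  # process name -> TrieNode
        self.freq = 0       # how many indexed sequences pass through this node
        self.uris = []      # workflows in which this sequence occurs


class WorkflowIndex:
    def __init__(self):
        self.root = TrieNode()

    def add_workflow(self, uri, processes):
        # Index every suffix so a query can match anywhere inside the workflow.
        for start in range(len(processes)):
            node = self.root
            for name in processes[start:]:
                node = node.children.setdefault(name, TrieNode())
                node.freq += 1
                node.uris.append(uri)

    def _find(self, sequence):
        if not sequence:
            return None
        node = self.root
        for name in sequence:
            node = node.children.get(name)
            if node is None:
                return None
        return node

    def search(self, sequence):
        """Return the frequency of the sequence and the workflows containing it."""
        node = self._find(sequence)
        if node is None:
            return {"freq": 0, "uris": []}
        return {"freq": node.freq, "uris": list(node.uris)}

    def recommend(self, sequence):
        """Return the most frequent next process after the sequence, or None."""
        node = self._find(sequence)
        if node is None or not node.children:
            return None
        name, child = max(node.children.items(), key=lambda item: item[1].freq)
        return {"id": name, "freq": child.freq, "prob": child.freq / node.freq}


index = WorkflowIndex()
index.add_workflow("wf1", ["A", "B", "C"])
index.add_workflow("wf2", ["A", "B", "D"])
index.add_workflow("wf3", ["X", "A", "B", "C"])
print(index.search(["A", "B"]))
print(index.recommend(["A", "B"]))
```

Indexing every suffix trades memory for simple lookups: a query is a single walk from the root, whatever its position inside the original workflow.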

Security considerations

The workflow abstraction service consists of read-only functions, so there should be no security risk of content being added or deleted. The main concern is that the service exposes the names of processes, and not everyone may want the names of their processes to be made available.

Cache considerations

No caching considerations have been taken into account: the service returns small amounts of information, so there is no need to cache it.