Workflow Indexing API
The main goal of the workflow indexing work is to give scientists a way to search for workflows by their functionality, properties, or other conceptualizations, so that the workflows are easily accessible.
To make workflows more available to users, we have created an API comprising a set of services that index, search, and recommend workflow processes.
The API is described in more detail in the following sections.
The usage of this API is similar to most of the APIs that provide information for the users in the Wf4ever project (Stability services, Checklist, etc.).
Let's suppose that we have a process name (or list of processes). This will be the only input needed for running the API correctly.
- Process/Processes: name of the process, or list of process names, to use for the search or recommendation services.

The workflow abstraction services would then be invoked in a sequence of two HTTP operations:
    C: GET /stability/rest/search HTTP/1.1
    C: Host: service.example.org
    C: Accept: application/rdf+xml

    S: HTTP/1.1 200 OK
    S: Content-Type: application/rdf+xml
    S:
    S: <?xml version="1.0" encoding="UTF-8"?>
    S: <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:roe="http://sandbox.wf4ever-project.org/wfabstraction">
    S: <rdf:Description rdf:about="">
    S:   <roe:wfabstraction>/rest/search{?process[]}</roe:wfabstraction>
    S: </rdf:Description>
    S: </rdf:RDF>

    C: GET /stability/rest/search?process=name_of_process HTTP/1.1
    C: Host: service.example.org
    C: Accept: application/json

    S: HTTP/1.1 200 OK
    S: Content-Type: application/json
    S:
    S: (result of the search in json)
    S:

    C: GET /stability/rest/recommend HTTP/1.1
    C: Host: service.example.org
    C: Accept: application/rdf+xml

    S: HTTP/1.1 200 OK
    S: Content-Type: application/rdf+xml
    S:
    S: <?xml version="1.0" encoding="UTF-8"?>
    S: <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:roe="http://sandbox.wf4ever-project.org/wfabstraction">
    S: <rdf:Description rdf:about="">
    S:   <roe:wfabstraction>/rest/recommend{?process[]}</roe:wfabstraction>
    S: </rdf:Description>
    S: </rdf:RDF>

    C: GET /stability/rest/recommend?process=name_of_process HTTP/1.1
    C: Host: service.example.org
    C: Accept: application/json

    S: HTTP/1.1 200 OK
    S: Content-Type: application/json
    S:
    S: (result of the recommendation in json)
    S:
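The two-step exchange above could be driven from a small client. The following is only a sketch under assumptions: the helper names `build_query_url` and `fetch` are hypothetical and not part of the API, the base URL is taken from the sandbox links given on this page, and each process name is assumed to be sent as a repeated `process` query parameter.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Assumption: base URL taken from the sandbox links on this page.
BASE_URL = "http://sandbox.wf4ever-project.org/wfabstraction/rest"


def build_query_url(service, processes, base_url=BASE_URL):
    """Return the GET URL for `service` ("search" or "recommend").

    Each process name becomes one repeated `process` query parameter,
    e.g. ?process=text1&process=text2.
    """
    if service not in ("search", "recommend"):
        raise ValueError("unknown service: %r" % (service,))
    if isinstance(processes, str):
        processes = [processes]
    query = urlencode([("process", name) for name in processes])
    return "%s/%s?%s" % (base_url, service, query)


def fetch(url, accept="application/json", timeout=30):
    """Send a GET request with the given Accept header; return the body as text."""
    request = Request(url, headers={"Accept": accept})
    with urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

With these helpers, the first operation would be `fetch(BASE_URL + "/search", accept="application/rdf+xml")` and the second `fetch(build_query_url("search", "Processor regex_value"))`. Whether the sandbox host is still reachable is not something this sketch can guarantee.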
Links where the services are currently available:
- http://sandbox.wf4ever-project.org/wfabstraction/rest/search
- http://sandbox.wf4ever-project.org/wfabstraction/rest/recommend
Related specifications:
- URI Template (RFC 6570): http://tools.ietf.org/html/rfc6570
- URI generic syntax (RFC 3986): http://tools.ietf.org/html/rfc3986
HTTP methods
The only HTTP method available for this API is GET. Depending on content negotiation (explained in the next section), the response is either a) a description of the service or b) the results of invoking the search or recommendation service.
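When the client asks for `application/rdf+xml`, the response is a service description carrying a template such as `/rest/search{?process[]}`. The sketch below shows one way a client might read that template and turn it into a concrete query. It is not an RFC 6570 implementation: it assumes that `{?process[]}` stands for a repeated `process` parameter, as the `?process=text1&process=text2` form used on this page suggests, and the function names `read_template` and `expand_template` are hypothetical.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace of the roe:wfabstraction element in the service description.
ROE_NS = "http://sandbox.wf4ever-project.org/wfabstraction"

# Service description as shown in the HTTP exchange on this page.
DESCRIPTION = """<?xml version="1.0" encoding="UTF-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:roe="http://sandbox.wf4ever-project.org/wfabstraction">
  <rdf:Description rdf:about="">
    <roe:wfabstraction>/rest/search{?process[]}</roe:wfabstraction>
  </rdf:Description>
</rdf:RDF>
"""


def read_template(description_xml):
    """Extract the template text from the roe:wfabstraction element."""
    root = ET.fromstring(description_xml.encode("utf-8"))
    element = root.find(".//{%s}wfabstraction" % ROE_NS)
    if element is None or element.text is None:
        raise ValueError("no roe:wfabstraction element in description")
    return element.text.strip()


def expand_template(template, processes):
    """Replace the trailing {?process[]} marker with repeated process parameters.

    Assumption: the marker means "zero or more `process` query parameters".
    """
    marker = "{?process[]}"
    if not template.endswith(marker):
        raise ValueError("unsupported template: %r" % (template,))
    path = template[: -len(marker)]
    if not processes:
        return path
    return path + "?" + urlencode([("process", name) for name in processes])
```

Keeping the template handling separate from the HTTP call means a client only has to hard-code the entry point, not the query syntax.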
Service description
    <?xml version="1.0" encoding="UTF-8"?>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:roe="http://sandbox.wf4ever-project.org/wfabstraction">
      <rdf:Description rdf:about="">
        <roe:wfabstraction>/rest/search{?process[]}</roe:wfabstraction>
      </rdf:Description>
    </rdf:RDF>

OR

    <?xml version="1.0" encoding="UTF-8"?>
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:roe="http://sandbox.wf4ever-project.org/wfabstraction">
      <rdf:Description rdf:about="">
        <roe:wfabstraction>/rest/recommend{?process[]}</roe:wfabstraction>
      </rdf:Description>
    </rdf:RDF>

Searching inside workflows
The trie structure created for indexing purposes has been encapsulated in order to provide the following two services:
Search: receives a sequence of processes and searches for workflows that contain that sequence.
- Service call: /wfabstraction/rest/search?process=Processor regex_value
- Inputs: sequence of process names (?process=text1&process=text2)
- Output: XML or JSON structure providing the following information (process_id, freq, URIs):
  - process_id: the name of the process used in the query
  - freq: how many times this process appears in other workflows
  - URIs: URIs of the workflows in which the process appears
An example of output is:
XML
    <search>
      <process id="Processor regex_value" freq="26.0">
        <uri>http://ns.taverna.org.uk/2010/workflowBundle/1cd4c8a3-0ab9-42d4-b5c3-e3c91231b9a9/workflow/Find_orthologs_using_a_list_of_uniprot_accessions/</uri>
        <uri>http://ns.taverna.org.uk/2010/workflowBundle/1cd4c8a3-0ab9-42d4-b5c3-e3c91231b9a9/workflow/Find_orthologs_using_a_list_of_uniprot_accessions/</uri>
        <uri>http://ns.taverna.org.uk/2010/workflowBundle/b62064b9-4d4c-459b-bf16-dc62f06035a3/workflow/Get_homologous_from_NCBI_homoloGene/</uri>
        <uri>http://ns.taverna.org.uk/2010/workflowBundle/b62064b9-4d4c-459b-bf16-dc62f06035a3/workflow/Get_homologous_from_NCBI_homoloGene/</uri>
        <uri>http://ns.taverna.org.uk/2010/workflowBundle/da8428f6-e0ee-4b6c-90fd-1c1d9c2b994f/workflow/EBI_PICR__find_cross_references_for_protein_accessions/</uri>
        <uri>http://ns.taverna.org.uk/2010/workflowBundle/da8428f6-e0ee-4b6c-90fd-1c1d9c2b994f/workflow/EBI_PICR__find_cross_references_for_protein_accessions/</uri>
        <uri>http://ns.taverna.org.uk/2010/workflowBundle/74b80ee3-e864-439c-8706-f8c74097cab6/workflow/Workflow4/</uri>
        <uri>http://ns.taverna.org.uk/2010/workflowBundle/74b80ee3-e864-439c-8706-f8c74097cab6/workflow/Workflow4/</uri>
        <uri>http://ns.taverna.org.uk/2010/workflowBundle/fed272a3-24d2-4926-b8ce-b39e9fb83309/workflow/AMIGA_ConeSearch_from_a_file_of_targets_positions/</uri>
        <uri>http://ns.taverna.org.uk/2010/workflowBundle/fed272a3-24d2-4926-b8ce-b39e9fb83309/workflow/AMIGA_ConeSearch_from_a_file_of_targets_positions/</uri>
        <uri>http://ns.taverna.org.uk/2010/workflowBundle/1cd4c8a3-0ab9-42d4-b5c3-e3c91231b9a9/workflow/Find_orthologs_using_a_list_of_uniprot_accessions/</uri>
        <uri>http://ns.taverna.org.uk/2010/workflowBundle/1cd4c8a3-0ab9-42d4-b5c3-e3c91231b9a9/workflow/Find_orthologs_using_a_list_of_uniprot_accessions/</uri>
        <uri>http://ns.taverna.org.uk/2010/workflowBundle/5cc50154-e29b-4059-b08b-9c35cb69c46b/workflow/Workflow32/</uri>
      </process>
    </search>
JSON
    {
      "process": {
        "@freq": "30.0",
        "@id": "Processor regex_value",
        "uri": [
          "http://ns.taverna.org.uk/2010/workflowBundle/74b80ee3-e864-439c-8706-f8c74097cab6/workflow/Workflow4/",
          "http://ns.taverna.org.uk/2010/workflowBundle/74b80ee3-e864-439c-8706-f8c74097cab6/workflow/Workflow4/",
          "http://ns.taverna.org.uk/2010/workflowBundle/da8428f6-e0ee-4b6c-90fd-1c1d9c2b994f/workflow/EBI_PICR__find_cross_references_for_protein_accessions/",
          "http://ns.taverna.org.uk/2010/workflowBundle/da8428f6-e0ee-4b6c-90fd-1c1d9c2b994f/workflow/EBI_PICR__find_cross_references_for_protein_accessions/",
          "http://ns.taverna.org.uk/2010/workflowBundle/fed272a3-24d2-4926-b8ce-b39e9fb83309/workflow/AMIGA_ConeSearch_from_a_file_of_targets_positions/",
          "http://ns.taverna.org.uk/2010/workflowBundle/fed272a3-24d2-4926-b8ce-b39e9fb83309/workflow/AMIGA_ConeSearch_from_a_file_of_targets_positions/",
          "http://ns.taverna.org.uk/2010/workflowBundle/b62064b9-4d4c-459b-bf16-dc62f06035a3/workflow/Get_homologous_from_NCBI_homoloGene/",
          "http://ns.taverna.org.uk/2010/workflowBundle/b62064b9-4d4c-459b-bf16-dc62f06035a3/workflow/Get_homologous_from_NCBI_homoloGene/",
          "http://ns.taverna.org.uk/2010/workflowBundle/1cd4c8a3-0ab9-42d4-b5c3-e3c91231b9a9/workflow/Find_orthologs_using_a_list_of_uniprot_accessions/",
          "http://ns.taverna.org.uk/2010/workflowBundle/1cd4c8a3-0ab9-42d4-b5c3-e3c91231b9a9/workflow/Find_orthologs_using_a_list_of_uniprot_accessions/",
          "http://ns.taverna.org.uk/2010/workflowBundle/1cd4c8a3-0ab9-42d4-b5c3-e3c91231b9a9/workflow/Find_orthologs_using_a_list_of_uniprot_accessions/",
          "http://ns.taverna.org.uk/2010/workflowBundle/1cd4c8a3-0ab9-42d4-b5c3-e3c91231b9a9/workflow/Find_orthologs_using_a_list_of_uniprot_accessions/",
          "http://ns.taverna.org.uk/2010/workflowBundle/b62064b9-4d4c-459b-bf16-dc62f06035a3/workflow/Get_homologous_from_NCBI_homoloGene/",
          "http://ns.taverna.org.uk/2010/workflowBundle/b62064b9-4d4c-459b-bf16-dc62f06035a3/workflow/Get_homologous_from_NCBI_homoloGene/",
          "http://ns.taverna.org.uk/2010/workflowBundle/b62064b9-4d4c-459b-bf16-dc62f06035a3/workflow/Get_homologous_from_NCBI_homoloGene/",
          "http://ns.taverna.org.uk/2010/workflowBundle/b62064b9-4d4c-459b-bf16-dc62f06035a3/workflow/Get_homologous_from_NCBI_homoloGene/",
          "http://ns.taverna.org.uk/2010/workflowBundle/5cc50154-e29b-4059-b08b-9c35cb69c46b/workflow/Workflow32/",
          "http://ns.taverna.org.uk/2010/workflowBundle/5cc50154-e29b-4059-b08b-9c35cb69c46b/workflow/Workflow32/",
          "http://ns.taverna.org.uk/2010/workflowBundle/5cc50154-e29b-4059-b08b-9c35cb69c46b/workflow/Workflow32/",
          "http://ns.taverna.org.uk/2010/workflowBundle/5cc50154-e29b-4059-b08b-9c35cb69c46b/workflow/Workflow32/",
          "http://ns.taverna.org.uk/2010/workflowBundle/b62064b9-4d4c-459b-bf16-dc62f06035a3/workflow/Get_homologous_from_NCBI_homoloGene/",
          "http://ns.taverna.org.uk/2010/workflowBundle/b62064b9-4d4c-459b-bf16-dc62f06035a3/workflow/Get_homologous_from_NCBI_homoloGene/",
          "http://ns.taverna.org.uk/2010/workflowBundle/da8428f6-e0ee-4b6c-90fd-1c1d9c2b994f/workflow/EBI_PICR__find_cross_references_for_protein_accessions/",
          "http://ns.taverna.org.uk/2010/workflowBundle/da8428f6-e0ee-4b6c-90fd-1c1d9c2b994f/workflow/EBI_PICR__find_cross_references_for_protein_accessions/",
          "http://ns.taverna.org.uk/2010/workflowBundle/b62064b9-4d4c-459b-bf16-dc62f06035a3/workflow/Get_homologous_from_NCBI_homoloGene/",
          "http://ns.taverna.org.uk/2010/workflowBundle/b62064b9-4d4c-459b-bf16-dc62f06035a3/workflow/Get_homologous_from_NCBI_homoloGene/",
          "http://ns.taverna.org.uk/2010/workflowBundle/5cc50154-e29b-4059-b08b-9c35cb69c46b/workflow/Workflow32/",
          "http://ns.taverna.org.uk/2010/workflowBundle/5cc50154-e29b-4059-b08b-9c35cb69c46b/workflow/Workflow32/",
          "http://ns.taverna.org.uk/2010/workflowBundle/5cc50154-e29b-4059-b08b-9c35cb69c46b/workflow/Workflow32/",
          "http://ns.taverna.org.uk/2010/workflowBundle/5cc50154-e29b-4059-b08b-9c35cb69c46b/workflow/Workflow32/"
        ]
      }
    }
Recommendation service for processes inside workflows
Recommend: returns the most frequent next process given a sequence of previous ones.
- Service call: /wfabstraction/rest/recommend?process=Processor regex_value
- Input: sequence of process names (?process=text1&process=text2)
- Output: XML or JSON structure providing the following information (id, prob, freq):
  - id: ID of the recommended process
  - prob: the probability that the recommendation is correct
  - freq: number of times the recommended process appears
An example of output is:
XML
    <recommendation>
      <process prob="0.26923078298568726" id="Processor Split_string_into_string_list_by_regular_expression" freq="7.0"></process>
    </recommendation>
JSON
    {
      "process": {
        "@freq": "7.0",
        "@id": "Processor Split_string_into_string_list_by_regular_expression",
        "@prob": "0.23333333432674408"
      }
    }
Security considerations
The workflow abstraction service consists of read-only functions, so there should be no security risk of content being added or deleted. The main concern is that the service exposes the names of processes, and not everyone may want the names of their processes to be made available.
Cache considerations
No cache considerations have been taken into account: the service returns small amounts of information, so there is no need for caching.
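As a closing illustration of the JSON output formats described in the search and recommendation sections, the sketch below reads both result shapes with the standard `json` module. The function names `parse_search` and `parse_recommendation` are hypothetical, the search sample is a shortened copy of the example on this page (so its URI list no longer matches `@freq`), and the handling of a single bare-string `uri` value is an assumption about how the service might serialize one-element lists.

```python
import json

# Shortened copy of the search example on this page (URI list truncated).
SEARCH_JSON = """{ "process": { "@freq": "30.0", "@id": "Processor regex_value", "uri": [
  "http://ns.taverna.org.uk/2010/workflowBundle/74b80ee3-e864-439c-8706-f8c74097cab6/workflow/Workflow4/",
  "http://ns.taverna.org.uk/2010/workflowBundle/74b80ee3-e864-439c-8706-f8c74097cab6/workflow/Workflow4/",
  "http://ns.taverna.org.uk/2010/workflowBundle/da8428f6-e0ee-4b6c-90fd-1c1d9c2b994f/workflow/EBI_PICR__find_cross_references_for_protein_accessions/"
] } }"""

# Recommendation example as shown on this page.
RECOMMEND_JSON = """{ "process": {
  "@freq": "7.0",
  "@id": "Processor Split_string_into_string_list_by_regular_expression",
  "@prob": "0.23333333432674408"
} }"""


def parse_search(text):
    """Return (process_id, freq, sorted list of distinct workflow URIs)."""
    process = json.loads(text)["process"]
    uris = process.get("uri", [])
    if isinstance(uris, str):
        # Assumption: a single URI might arrive as a bare string, not a list.
        uris = [uris]
    # The examples repeat URIs, so duplicates are dropped here.
    return process["@id"], float(process["@freq"]), sorted(set(uris))


def parse_recommendation(text):
    """Return (process_id, prob, freq) for the recommended process."""
    process = json.loads(text)["process"]
    return process["@id"], float(process["@prob"]), float(process["@freq"])
```

Numeric attributes arrive as strings in the examples, which is why both functions convert them with `float`.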