diff --git a/integrations/extensions/starter-kits/language-model-watsonx/README.md b/integrations/extensions/starter-kits/language-model-watsonx/README.md
index 31a2db05..6edd0abf 100644
--- a/integrations/extensions/starter-kits/language-model-watsonx/README.md
+++ b/integrations/extensions/starter-kits/language-model-watsonx/README.md
@@ -13,13 +13,15 @@ The watsonx specification in the starter kit describes one endpoint and a few of
 | Endpoint | Description |
 | ---------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
 | Generation | Used with watsonx text completion models such as `google/flan-ul2` and `google/flan-t5-xxl`. You provide text as a prompt, and it returns the text that follows that prompt. |
+| Generation from a deployed prompt | Used with a deployed watsonx prompt. You provide the prompt deployment ID, the query text, and optional passages for RAG. |
 
 ## Prerequisites
 
-### Create an API key and a project ID
+### Create an API key, project ID, and prompt deployment
 
 1. Log in to [watsonx](https://dataplatform.cloud.ibm.com/wx/home?context=wx&apps=cos&nocache=true&onboarding=true&quick_start_target=watsonx) and [generate an API key](https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/ml-authentication.html?context=cpdaas). Save this API key somewhere safe and accessible. You need this API key to set up the watsonx custom extension later.
 1. To find your watsonx project id, go to [watsonx.ai](https://dataplatform.test.cloud.ibm.com/wx) and find Projects/ (this could be your `sandbox`, which is created for you by default). Click on the link, then follow the Project's Manage tab (Project -> Manage -> General -> Details) to find the project id.
+1. [Create your prompt](https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/fm-prompt-lab.html?context=wx&audience=wdp#creating-and-running-a-prompt) in Prompt Lab. Make sure you add the `query_text` and `passages` [variables](https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/fm-prompt-variables.html?context=wx&audience=wdp#creating-prompt-variables). Once your prompt is saved, [deploy your prompt](https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/prompt-template-deploy.html?context=wx) and note your deployment ID.
 
 ### Create an assistant
 
@@ -52,6 +54,7 @@ If you need capabilities that are not in the watsonx specification provided, fee
 Use **Actions Global Settings** to upload the [`watsonx-actions.json`](./watsonx-actions.json) file in this kit to your assistant. For more information, see [Uploading](https://cloud.ibm.com/docs/watson-assistant?topic=watson-assistant-admin-backup-restore#backup-restore-import). You may also need to refresh the action Preview chat, after uploading, to get all the session variables initialized before these actions will work correctly.
 
 1. Under Variables/Created by you within the Actions page, set the `watsonx_project_id` session variable using [a project ID value from watsonx](https://dataplatform.cloud.ibm.com/docs/content/wsj/manage-data/manage-projects.html?context=wx&audience=wdp). See [this section](#create-an-api-key-and-a-project-id) for additional details on how to find your project ID.
+1. Set the `deployment_id` session variable to the deployment ID of your deployed prompt.
 
 **NOTE**: If you import the actions _before_ configuring the extensions, you will see errors on the actions because it could not find the extensions. Configure the extensions (as described [above](#prerequisites)), and re-import the action JSON file.
 
@@ -59,8 +62,10 @@ The starter kit includes [a JSON file with sample actions](./watsonx-actions.jso
 
 | Action | Description |
 | ----------------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| Invoke watsonx Generation API | Connects to watson with the selected model and the model input |
+| Invoke watsonx Generation API | Connects to watsonx with the selected model and the model input. |
 | Test model | Simple test action that asks what model, length, temperature and prompt you want and then calls "Invoke watsonx Generation API" so the model can generate a response to the specified prompt. |
+| Invoke watsonx deployed prompt API | Connects to the deployed prompt using the specified deployment ID and input. |
+| Test deployment | Simple test action that calls "Invoke watsonx deployed prompt API" so the model can generate a response to the specified query using a saved prompt template. |
 
 Note that the "Test model" action includes a step that invokes an extension and includes a parameter named `model_id`. You can set the `model_id` session variable to control which model is used by `Test model`. You can see which models work with the Generate API by viewing the supported foundation models in [the watsonx Prompt Lab](https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/fm-prompt-lab.html?context=wx).
 
@@ -84,11 +89,20 @@ These are the session variables used in this example.
 - `model_response`: The text generated by the model in response to the user input.
 - `watsonx_api_version` - watsonx api date version. It currently defaults to `2023-05-29`.
 - `watsonx_project_id`: You **MUST** set this value to be [a project ID value from watsonx](https://dataplatform.cloud.ibm.com/docs/content/wsj/manage-data/manage-projects.html). By default, this is a [sandbox project id](https://dataplatform.cloud.ibm.com/docs/content/wsj/manage-data/sandbox.html) that is automatically created for you when you sign up for watsonx.ai.
+- `deployment_id`: The deployment ID of your deployed prompt.
+- `parameters.prompt_variables.query_text`: The input for the deployed prompt.
+- `parameters.prompt_variables.passages`: Optional variable to include passages for RAG-related queries.
+
+Note: `passages` and `query_text` must be [added as prompt variables](https://dataplatform.cloud.ibm.com/docs/content/wsj/analyze-data/fm-prompt-variables.html?context=wx&audience=wdp#creating-prompt-variables) in Prompt Lab.
 
 Here is an example of how to use the `Test model` action:
 
+Here is an example of how to use the `Test deployment` action:
+
+
+
 ### Limitations
 
 Adding large values for max_new_tokens & min_new_tokens can result in timeouts, if you see any timeouts please adjust the values of those tokens.
diff --git a/integrations/extensions/starter-kits/language-model-watsonx/assets/sample2.png b/integrations/extensions/starter-kits/language-model-watsonx/assets/sample2.png
new file mode 100644
index 00000000..f23bc4fc
Binary files /dev/null and b/integrations/extensions/starter-kits/language-model-watsonx/assets/sample2.png differ
diff --git a/integrations/extensions/starter-kits/language-model-watsonx/watsonx-actions.json b/integrations/extensions/starter-kits/language-model-watsonx/watsonx-actions.json
index 0ddfe3d5..ed1f822a 100644
--- a/integrations/extensions/starter-kits/language-model-watsonx/watsonx-actions.json
+++ b/integrations/extensions/starter-kits/language-model-watsonx/watsonx-actions.json
@@ -3,13 +3,14 @@
   "type": "action",
   "valid": true,
   "status": "Available",
-  "created": "2023-07-26T00:29:57.573Z",
-  "updated": "2023-07-26T00:34:33.767Z",
+  "created": "2024-05-08T16:31:44.640Z",
+  "updated": "2024-05-08T16:35:10.141Z",
   "language": "en",
-  "skill_id": 
"d7b6684e-45f0-4fed-967e-c1091494f945", + "skill_id": "a901b7b6-c8aa-4bf1-a61e-e885ecffe9ea", "workspace": { "actions": [ { + "type": "standard", "steps": [ { "step": "step_285", @@ -612,6 +613,7 @@ "disambiguation_opt_out": false }, { + "type": "standard", "steps": [ { "step": "step_817", @@ -629,8 +631,8 @@ "type": "integration_interaction", "method": "POST", "internal": { - "spec_hash_id": "540081b16fe217626f2ff197e583c2f1c7ef19f0752c8b875ec454036cfdda7f", - "catalog_item_id": "4f170e0d-95b5-4f59-9d1d-e52f67ee9352" + "spec_hash_id": "16f2e2f8ca5a1910b7a1d34a1a499202bd2d5c85515447b9ac020c0ae516e299", + "catalog_item_id": "5d2e03cf-aff3-4618-b0bc-c9444b255363" }, "request_mapping": { "body": [ @@ -651,36 +653,156 @@ "skill_variable": "watsonx_project_id" }, "parameter": "project_id" - }, - { - "value": { - "skill_variable": "model_parameters_temperature" - }, - "parameter": "parameters.temperature" - }, + } + ], + "query": [ { "value": { - "skill_variable": "model_parameters_max_new_tokens" + "skill_variable": "watsonx_api_version" }, - "parameter": "parameters.max_new_tokens" - }, + "parameter": "version" + } + ] + }, + "result_variable": "step_817_result_2" + } + }, + "variable": "step_817", + "next_step": "step_606" + }, + { + "step": "step_606", + "output": { + "generic": [] + }, + "context": { + "variables": [ + { + "value": { + "expression": "${step_817_result_2.body.results}[0].generated_text" + }, + "skill_variable": "model_response" + } + ] + }, + "handlers": [], + "resolver": { + "type": "end_action" + }, + "variable": "step_606", + "condition": { + "and": [ + { + "eq": [ { - "value": { - "skill_variable": "model_parameters_min_new_tokens" - }, - "parameter": "parameters.min_new_tokens" + "variable": "step_817_result_2", + "variable_path": "success" }, { - "value": { - "skill_variable": "model_parameters_stop_sequences" - }, - "parameter": "parameters.stop_sequences" - }, + "scalar": true + } + ] + }, + { + "expression": 
"${step_817_result_2.body.results}.size() > 0" + } + ] + }, + "next_step": "step_212" + }, + { + "step": "step_212", + "output": { + "generic": [] + }, + "context": { + "variables": [ + { + "value": { + "expression": "null" + }, + "skill_variable": "model_response" + } + ] + }, + "handlers": [], + "resolver": { + "type": "end_action" + }, + "variable": "step_212" + } + ], + "title": "Invoke watsonx Generation API", + "action": "action_3200-2", + "boosts": [], + "handlers": [], + "condition": { + "intent": "action_3200_intent_45093-2" + }, + "variables": [ + { + "title": "", + "variable": "step_212", + "data_type": "any" + }, + { + "title": "", + "variable": "step_606", + "data_type": "any" + }, + { + "title": "No response", + "privacy": { + "enabled": false + }, + "variable": "step_817", + "data_type": "any" + }, + { + "privacy": { + "enabled": false + }, + "variable": "step_817_result_2", + "data_type": "any" + } + ], + "next_action": "action_3200-3", + "topic_switch": { + "allowed_from": true, + "allowed_into": true, + "never_return": false + }, + "disambiguation_opt_out": false + }, + { + "type": "standard", + "steps": [ + { + "step": "step_817", + "output": { + "generic": [] + }, + "context": { + "variables": [] + }, + "handlers": [], + "resolver": { + "type": "callout", + "callout": { + "path": "/ml/v1/deployments/{deployment-id}/generation/text", + "type": "integration_interaction", + "method": "POST", + "internal": { + "spec_hash_id": "16f2e2f8ca5a1910b7a1d34a1a499202bd2d5c85515447b9ac020c0ae516e299", + "catalog_item_id": "5d2e03cf-aff3-4618-b0bc-c9444b255363" + }, + "request_mapping": { + "path": [ { "value": { - "skill_variable": "model_parameters_repetition_penalty" + "skill_variable": "deployment_id" }, - "parameter": "parameters.repetition_penalty" + "parameter": "deployment-id" } ], "query": [ @@ -760,12 +882,12 @@ "variable": "step_212" } ], - "title": "Invoke watsonx Generation API", - "action": "action_3200-2", + "title": "Invoke watsonx deployed 
prompt API", + "action": "action_3200-3", "boosts": [], "handlers": [], "condition": { - "intent": "action_3200_intent_45093-2" + "intent": "action_3200_intent_45093-3" }, "variables": [ { @@ -780,14 +902,165 @@ }, { "title": "No response", + "privacy": { + "enabled": false + }, "variable": "step_817", "data_type": "any" }, { + "privacy": { + "enabled": false + }, "variable": "step_817_result_2", "data_type": "any" } ], + "next_action": "action_3200-4", + "topic_switch": { + "allowed_from": true, + "allowed_into": true, + "never_return": false + }, + "disambiguation_opt_out": false + }, + { + "type": "standard", + "steps": [ + { + "step": "step_129", + "output": { + "generic": [ + { + "values": [ + { + "text_expression": { + "concat": [ + { + "scalar": "What would you like to ask?" + } + ] + } + } + ], + "response_type": "text", + "selection_policy": "sequential" + } + ] + }, + "context": { + "variables": [] + }, + "handlers": [], + "question": { + "free_text": true, + "response_collection_behavior": "always_ask" + }, + "resolver": { + "type": "continue" + }, + "variable": "step_129", + "next_step": "step_817" + }, + { + "step": "step_817", + "output": { + "generic": [] + }, + "context": { + "variables": [ + { + "value": { + "expression": "${step_129}" + }, + "skill_variable": "model_input" + } + ] + }, + "handlers": [], + "resolver": { + "type": "invoke_another_action", + "invoke_action": { + "action": "action_3200-3", + "policy": "default", + "parameters": null, + "result_variable": "step_817_result_1" + } + }, + "variable": "step_817", + "next_step": "step_606" + }, + { + "step": "step_606", + "output": { + "generic": [ + { + "values": [ + { + "text_expression": { + "concat": [ + { + "scalar": "Response is: " + }, + { + "skill_variable": "model_response" + } + ] + } + } + ], + "response_type": "text", + "selection_policy": "sequential" + } + ] + }, + "context": { + "variables": [] + }, + "handlers": [], + "resolver": { + "type": "end_action" + }, + "variable": 
"step_606" + } + ], + "title": "Test deployment", + "action": "action_3200-4", + "boosts": [], + "handlers": [], + "condition": { + "intent": "action_3200_intent_45093-4" + }, + "variables": [ + { + "title": "What would you like to ask?", + "privacy": { + "enabled": false + }, + "variable": "step_129", + "data_type": "any" + }, + { + "title": "Response is: {variable}", + "variable": "step_606", + "data_type": "any" + }, + { + "title": "No response", + "privacy": { + "enabled": false + }, + "variable": "step_817", + "data_type": "any" + }, + { + "privacy": { + "enabled": false + }, + "variable": "step_817_result_1", + "data_type": "any" + } + ], "next_action": "fallback", "topic_switch": { "allowed_from": true, @@ -797,6 +1070,7 @@ "disambiguation_opt_out": false }, { + "type": "standard", "steps": [ { "step": "step_001", @@ -893,6 +1167,7 @@ "disambiguation_opt_out": true }, { + "type": "standard", "steps": [ { "step": "digression_failure", @@ -1201,6 +1476,7 @@ "disambiguation_opt_out": true }, { + "type": "standard", "steps": [ { "step": "danger_word_detected", @@ -1275,6 +1551,7 @@ "next_action": "anything_else" }, { + "type": "standard", "steps": [ { "step": "step_001", @@ -1328,6 +1605,18 @@ "intent": "action_3200_intent_45093-2", "examples": [] }, + { + "intent": "action_3200_intent_45093-3", + "examples": [] + }, + { + "intent": "action_3200_intent_45093-4", + "examples": [ + { + "text": "Test deployment" + } + ] + }, { "intent": "fallback_connect_to_agent", "examples": [ @@ -1362,12 +1651,19 @@ { "type": "synonyms", "value": "google/flan-t5-xxl", - "synonyms": ["flan-t5-xxl", "t5", "xxl"] + "synonyms": [ + "flan-t5-xxl", + "t5", + "xxl" + ] }, { "type": "synonyms", "value": "google/flan-ul2", - "synonyms": ["flan-ul2", "ul2"] + "synonyms": [ + "flan-ul2", + "ul2" + ] } ], "fuzzy_match": true @@ -1389,10 +1685,19 @@ "metadata": { "api_version": { "major_version": "v2", - "minor_version": "2021-11-27" + "minor_version": "2018-11-08" } }, "variables": [ + { + 
"title": "deployment_id", + "privacy": { + "enabled": false + }, + "variable": "deployment_id", + "data_type": "any", + "description": "" + }, { "title": "model_id", "variable": "model_id", @@ -1580,8 +1885,9 @@ "learning_opt_out": true }, "description": "created for assistant 05c10d7d-336f-4d33-8cb3-5c53520d61ce", - "assistant_id": "67614f79-ba73-4c10-a90f-9b647f797521", - "workspace_id": "d7b6684e-45f0-4fed-967e-c1091494f945", + "assistant_id": "bd63fead-e90b-43b4-9e57-e0fb90553861", + "workspace_id": "a901b7b6-c8aa-4bf1-a61e-e885ecffe9ea", "dialog_settings": {}, - "next_snapshot_version": "1" -} + "next_snapshot_version": "1", + "environment_id": "77419769-c935-4105-8da5-87a20d970abb" +} \ No newline at end of file diff --git a/integrations/extensions/starter-kits/language-model-watsonx/watsonx-openapi.json b/integrations/extensions/starter-kits/language-model-watsonx/watsonx-openapi.json index 2a468202..8780454c 100644 --- a/integrations/extensions/starter-kits/language-model-watsonx/watsonx-openapi.json +++ b/integrations/extensions/starter-kits/language-model-watsonx/watsonx-openapi.json @@ -187,6 +187,112 @@ } } } + }, + "/ml/v1/deployments/{deployment-id}/generation/text": { + "post": { + "description": "Generation from a deployed prompt", + "parameters": [ + { + "name": "version", + "in": "query", + "description": "Release date of the version of the API you want to use. 
Specify dates in YYYY-MM-DD format.", + "required": true, + "schema": { + "type": "string" + } + }, + { + "name": "deployment-id", + "in": "path", + "description": "Deployment ID of the prompt deployment you want to use.", + "required": true, + "schema": { + "type": "string" + } + } + ], + "requestBody": { + "content": { + "application/json": { + "schema": { + "type": "object", + "properties": { + "parameters": { + "type": "object", + "properties": { + "prompt_variables": { + "type": "object", + "properties": { + "query_text": { + "type": "string", + "description": "The original query text from the user", + "example": "Who is the president of the United States?" + }, + "passages": { + "type": "string", + "description": "Optional specific passages to pass to the deployed prompt", + "example": "Austin is the state capital of Texas, an inland city bordering the Hill Country region. Home to the University of Texas flagship campus, Austin is known for its eclectic live-music scene centered around country, blues and rock. Its many parks and lakes are popular for hiking, biking, swimming and boating. South of the city, Formula One's Circuit of the Americas raceway has hosted the United States Grand Prix." 
+ } + } + } + } + } + } + } + } + } + }, + "responses": { + "200": { + "description": "Default Response", + "content": { + "application/json": { + "schema": { + "type": "object", + "properties": { + "model_id": { + "description": "The ID of the model to be used for this request", + "type": "string" + }, + "created_at": { + "description": "The date and time of the response", + "type": "string" + }, + "results": { + "type": "array", + "items": { + "type": "object", + "properties": { + "generated_text": { + "description": "The generated text", + "type": "string" + }, + "generated_token_count": { + "description": "The number of tokens in the output", + "type": "integer" + }, + "input_token_count": { + "description": "The number of tokens in the input", + "type": "integer" + }, + "stop_reason": { + "description": "The reason for stopping the generation. Can be NOT_FINISHED - Possibly more tokens to be streamed, MAX_TOKENS - Maximum requested tokens reached, EOS_TOKEN - End of sequence token encountered, CANCELLED - Request canceled by the client, TIME_LIMIT - Time limit reached, STOP_SEQUENCE - Stop sequence encountered, TOKEN_LIMIT - Token limit reached, ERROR - Error encountered", + "type": "string" + } + } + }, + "description": "Outputs of the generation" + } + } + } + } + } + }, + "default": { + "description": "Unexpected error" + } + } + } } } }
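
Review note: the deployed-prompt endpoint added to `watsonx-openapi.json` can be sanity-checked outside the assistant. The following is a minimal Python sketch, not the extension's actual implementation. It builds the request URL and body in the shape the spec describes and reads the response the same way the "Invoke watsonx deployed prompt API" action does (first `generated_text` when `results` is non-empty, otherwise null). `BASE_URL`, the helper names, and the sample values are assumptions for illustration; the diff does not state the server host, and a real call would also need an IAM bearer token.

```python
import json
from typing import Optional
from urllib.parse import quote, urlencode

# Assumption: a regional watsonx.ai host. The diff does not name the server.
BASE_URL = "https://us-south.ml.cloud.ibm.com"


def build_url(deployment_id: str, version: str = "2023-05-29") -> str:
    """Fill the {deployment-id} path parameter and the required version query."""
    path = f"/ml/v1/deployments/{quote(deployment_id, safe='')}/generation/text"
    return f"{BASE_URL}{path}?{urlencode({'version': version})}"


def build_body(query_text: str, passages: Optional[str] = None) -> dict:
    """Build the request body; `passages` is optional, as in the spec."""
    prompt_variables = {"query_text": query_text}
    if passages is not None:
        prompt_variables["passages"] = passages
    return {"parameters": {"prompt_variables": prompt_variables}}


def extract_text(response_body: dict) -> Optional[str]:
    """Mirror the action: use results[0].generated_text, or None if no results."""
    results = response_body.get("results") or []
    if results:
        return results[0].get("generated_text")
    return None


if __name__ == "__main__":
    # Hypothetical deployment ID, used only to show the request shape.
    print(build_url("my-deployment-id"))
    print(json.dumps(build_body("Who is the president of the United States?"), indent=2))
```

The helpers only construct and parse data, so they can be checked without network access; sending the request with an HTTP client of your choice is left out on purpose.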