diff --git a/.github/CODEOWNERS b/.github/CODEOWNERS
index 64ab662d7ea..bd9639fb814 100644
--- a/.github/CODEOWNERS
+++ b/.github/CODEOWNERS
@@ -44,6 +44,7 @@
/generative-ai/gemini/use-cases/intro_multimodal_use_cases.ipynb @saeedaghabozorgi @GoogleCloudPlatform/generative-ai-devrel
/generative-ai/gemini/sample-apps/genwealth/ @paulramsey @GoogleCloudPlatform/generative-ai-devrel
/generative-ai/gemini/use-cases/applying-llms-to-data/analyze-poster-images-in-bigquery/poster_image_analysis.ipynb @aliciawilliams @GoogleCloudPlatform/generative-ai-devrel
+/generative-ai/gemini/use-cases/retail/product_attributes_extraction.ipynb @tianli @GoogleCloudPlatform/generative-ai-devrel
/generative-ai/gemini/use-cases/retrieval-augmented-generation/RAG_Based_on_Sensitive_Data_Protection_using_Faker.ipynb @ainaomotayo @GoogleCloudPlatform/generative-ai-devrel
/generative-ai/gemini/use-cases/retrieval-augmented-generation/rag_qna_langchain_bigquery_vector_search.ipynb @ashleyxuu @GoogleCloudPlatform/generative-ai-devrel
/generative-ai/gemini/use-cases/retrieval-augmented-generation/retail_warranty_claim_chatbot.ipynb @zthor5 @charleselliott @GoogleCloudPlatform/generative-ai-devrel
diff --git a/.github/actions/spelling/allow.txt b/.github/actions/spelling/allow.txt
index 6d5293844d9..b9946a29fde 100644
--- a/.github/actions/spelling/allow.txt
+++ b/.github/actions/spelling/allow.txt
@@ -3,8 +3,8 @@ Adidas
agentic
AGG
AGs
-aip
ainvoke
+aip
Aktu
alloydb
Aniston
@@ -13,6 +13,7 @@ Arborio
Arepa
Arsan
Arxiv
+Ashish
astype
Autorater
autosxs
@@ -50,14 +51,15 @@ colorway
colwidth
csa
CZE
+D'orsay
Dataform
dataframe
datname
dbadmin
dbln
+ddl
deepeval
DeepEval
-ddl
dente
Depatmint
descgen
@@ -84,6 +86,7 @@ fillmode
Firestore
Fishburne
flac
+Flatform
Flipkart
forno
FPDF
@@ -157,6 +160,7 @@ levelname
lexer
linalg
linecolor
+Llion
llm
LLMs
llms
@@ -186,6 +190,7 @@ owlbot
paleo
pancetta
parcoords
+Parmar
payslip
paystub
pdfminer
@@ -233,6 +238,7 @@ Selam
selectbox
sentenc
SEO
+Shazeer
showlakes
showland
showor
@@ -249,12 +255,14 @@ ssh
ssn
SSRF
STIX
+Strappy
streamlit
sytem
tagline
tfhub
tgz
thelook
+Tianli
tiktoken
timechart
toself
@@ -270,6 +278,9 @@ urandom
Urs
username
usernames
+Uszkoreit
+Vaswani
+vectordb
vertexai
VMs
websites
@@ -287,10 +298,3 @@ yticks
zaxis
Zscaler
Zuercher
-Ashish
-Llion
-Parmar
-Shazeer
-Uszkoreit
-Vaswani
-vectordb
diff --git a/.github/actions/spelling/excludes.txt b/.github/actions/spelling/excludes.txt
index 1c98a8cd05a..a89efc0b488 100644
--- a/.github/actions/spelling/excludes.txt
+++ b/.github/actions/spelling/excludes.txt
@@ -92,6 +92,7 @@
^\Qgemini/use-cases/healthcare/react_gemini_healthcare_api.ipynb\E$
^\Qgemini/use-cases/intro_multimodal_use_cases.ipynb\E$
^\Qgemini/use-cases/retrieval-augmented-generation/intro_multimodal_rag.ipynb\E$
+^\Qgemini/use-cases/retail/product_attributes_extraction.ipynb\E$
^\Qlanguage/use-cases/document-qa/utils/__init__.py\E$
^\Qlanguage/use-cases/marketing-image-overlay/marketing_image_overlay.ipynb\E$
^\Qsearch/bulk-question-answering/bulk_question_answering_output.tsv\E$
diff --git a/gemini/context-caching/intro_context_caching.ipynb b/gemini/context-caching/intro_context_caching.ipynb
new file mode 100644
index 00000000000..cd51a919c68
--- /dev/null
+++ b/gemini/context-caching/intro_context_caching.ipynb
@@ -0,0 +1,544 @@
+{
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "ur8xi4C7S06n"
+ },
+ "outputs": [],
+ "source": [
+ "# Copyright 2024 Google LLC\n",
+ "#\n",
+ "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
+ "# you may not use this file except in compliance with the License.\n",
+ "# You may obtain a copy of the License at\n",
+ "#\n",
+ "# https://www.apache.org/licenses/LICENSE-2.0\n",
+ "#\n",
+ "# Unless required by applicable law or agreed to in writing, software\n",
+ "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
+ "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
+ "# See the License for the specific language governing permissions and\n",
+ "# limitations under the License."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "JAPoU8Sm5E6e"
+ },
+ "source": [
+ "# Intro to Context Caching with the Gemini API\n",
+ "\n",
+ "<table align=\"left\">\n",
+ "  <td style=\"text-align: center\">\n",
+ "    <a href=\"https://colab.research.google.com/github/GoogleCloudPlatform/generative-ai/blob/main/gemini/context-caching/intro_context_caching.ipynb\">\n",
+ "      <img src=\"https://cloud.google.com/ml-engine/images/colab-logo-32px.png\" alt=\"Google Colaboratory logo\"><br> Open in Colab\n",
+ "    </a>\n",
+ "  </td>\n",
+ "  <td style=\"text-align: center\">\n",
+ "    <a href=\"https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fgenerative-ai%2Fmain%2Fgemini%2Fcontext-caching%2Fintro_context_caching.ipynb\">\n",
+ "      <img width=\"32px\" src=\"https://cloud.google.com/ml-engine/images/colab-enterprise-logo-32px.png\" alt=\"Google Cloud Colab Enterprise logo\"><br> Open in Colab Enterprise\n",
+ "    </a>\n",
+ "  </td>\n",
+ "  <td style=\"text-align: center\">\n",
+ "    <a href=\"https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/generative-ai/main/gemini/context-caching/intro_context_caching.ipynb\">\n",
+ "      <img src=\"https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32\" alt=\"Vertex AI logo\"><br> Open in Workbench\n",
+ "    </a>\n",
+ "  </td>\n",
+ "  <td style=\"text-align: center\">\n",
+ "    <a href=\"https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/context-caching/intro_context_caching.ipynb\">\n",
+ "      <img src=\"https://cloud.google.com/ml-engine/images/github-logo-32px.png\" alt=\"GitHub logo\"><br> View on GitHub\n",
+ "    </a>\n",
+ "  </td>\n",
+ "</table>"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "84f0f73a0f76"
+ },
+ "source": [
+ "| | |\n",
+ "|-|-|\n",
+ "|Author(s) | [Eric Dong](https://github.com/gericdong)|"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "tvgnzT1CKxrO"
+ },
+ "source": [
+ "## Overview\n",
+ "\n",
+ "### Gemini\n",
+ "\n",
+ "Gemini is a family of generative AI models developed by Google DeepMind that is designed for multimodal use cases.\n",
+ "\n",
+ "### Context Caching\n",
+ "\n",
+ "The Gemini API's context caching feature lets developers store frequently used input tokens in a dedicated cache and reference them in subsequent requests, eliminating the need to repeatedly pass the same set of tokens to a model. This can reduce the number of tokens sent to the model, thereby lowering the cost of requests that contain repeated content with high input token counts.\n",
+ "\n",
+ "### Objectives\n",
+ "\n",
+ "In this tutorial, you learn how to use the Gemini API context caching feature in Vertex AI.\n",
+ "\n",
+ "You will complete the following tasks:\n",
+ "- Create a context cache\n",
+ "- Retrieve and use a context cache\n",
+ "- Use context caching in Chat\n",
+ "- Update the expire time of a context cache\n",
+ "- Delete a context cache\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "61RBz8LLbxCR"
+ },
+ "source": [
+ "## Get started"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "No17Cw5hgx12"
+ },
+ "source": [
+ "### Install Vertex AI SDK and other required packages\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "tFy3H3aPgx12"
+ },
+ "outputs": [],
+ "source": [
+ "%pip install --upgrade --user --quiet google-cloud-aiplatform"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "R5Xep4W9lq-Z"
+ },
+ "source": [
+ "### Restart runtime\n",
+ "\n",
+ "To use the newly installed packages in this Jupyter runtime, you must restart the runtime. You can do this by running the cell below, which restarts the current kernel.\n",
+ "\n",
+ "The restart might take a minute or longer. After it's restarted, continue to the next step."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "XRvKdaPDTznN"
+ },
+ "outputs": [],
+ "source": [
+ "import IPython\n",
+ "\n",
+ "app = IPython.Application.instance()\n",
+ "app.kernel.do_shutdown(True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "SbmM4z7FOBpM"
+ },
+ "source": [
+ "<div class=\"alert alert-block alert-warning\">\n",
+ "<b>⚠️ The kernel is going to restart. Wait until it's finished before continuing to the next step. ⚠️</b>\n",
+ "</div>\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "dmWOrTJ3gx13"
+ },
+ "source": [
+ "### Authenticate your notebook environment (Colab only)\n",
+ "\n",
+ "If you're running this notebook on Google Colab, run the cell below to authenticate your environment."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "NyKGtVQjgx13"
+ },
+ "outputs": [],
+ "source": [
+ "import sys\n",
+ "\n",
+ "if \"google.colab\" in sys.modules:\n",
+ " from google.colab import auth\n",
+ "\n",
+ " auth.authenticate_user()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "DF4l8DTdWgPY"
+ },
+ "source": [
+ "### Set Google Cloud project information and initialize Vertex AI SDK\n",
+ "\n",
+ "To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).\n",
+ "\n",
+ "Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "Nqwi-5ufWp_B"
+ },
+ "outputs": [],
+ "source": [
+ "PROJECT_ID = \"[your-project-id]\" # @param {type:\"string\"}\n",
+ "LOCATION = \"us-central1\" # @param {type:\"string\"}\n",
+ "\n",
+ "import vertexai\n",
+ "\n",
+ "vertexai.init(project=PROJECT_ID, location=LOCATION)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "EdvJRUWRNGHE"
+ },
+ "source": [
+ "## Code Examples"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "vHwJCyNF6u0O"
+ },
+ "source": [
+ "### Import libraries"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "mginH0QC6u0O"
+ },
+ "outputs": [],
+ "source": [
+ "import datetime\n",
+ "\n",
+ "import vertexai\n",
+ "from vertexai.generative_models import Part\n",
+ "from vertexai.preview import caching\n",
+ "from vertexai.preview.generative_models import GenerativeModel"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "PAGPrpYagu2z"
+ },
+ "source": [
+ "### Create a context cache\n",
+ "\n",
+ "**Note**: Context caching is only available for stable models with fixed versions (for example, `gemini-1.5-pro-001`). You must include the version postfix (for example, the `-001` in `gemini-1.5-pro-001`).\n",
+ "\n",
+ "For more information, see [Available Gemini stable model versions](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/model-versioning#stable-versions-available).\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "XdpfXqpJ-NNj"
+ },
+ "outputs": [],
+ "source": [
+ "MODEL_ID = \"gemini-1.5-pro-001\" # @param {type:\"string\"}"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "appJ7LCc_YCW"
+ },
+ "source": [
+ "Context caching is particularly well suited to scenarios where a substantial initial context is referenced repeatedly by shorter requests.\n",
+ "\n",
+ "- Cached content can be any of the MIME types supported by Gemini multimodal models. For example, you can cache a large amount of text, audio, or video. **Note**: The minimum size of a context cache is 32,769 tokens.\n",
+ "- The default expiration time of a context cache is 60 minutes. You can specify a different expiration time using the `ttl` (time to live) or the `expire_time` property.\n",
+ "\n",
+ "This example shows how to create a context cache using two large research papers stored in a Cloud Storage bucket, and set the `ttl` to 60 minutes.\n",
+ "\n",
+ "- Paper 1: [Gemini: A Family of Highly Capable Multimodal Models](https://arxiv.org/abs/2312.11805)\n",
+ "- Paper 2: [Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context](https://arxiv.org/abs/2403.05530)\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "UmJA6AvVujyZ"
+ },
+ "outputs": [],
+ "source": [
+ "system_instruction = \"\"\"\n",
+ "You are an expert researcher who has years of experience in conducting systematic literature surveys and meta-analyses of different topics.\n",
+ "You pride yourself on incredible accuracy and attention to detail. You always stick to the facts in the sources provided, and never make up new facts.\n",
+ "Now look at the research paper below, and answer the following questions in 1-2 sentences.\n",
+ "\"\"\"\n",
+ "\n",
+ "contents = [\n",
+ " Part.from_uri(\n",
+ " \"gs://cloud-samples-data/generative-ai/pdf/2312.11805v3.pdf\",\n",
+ " mime_type=\"application/pdf\",\n",
+ " ),\n",
+ " Part.from_uri(\n",
+ " \"gs://cloud-samples-data/generative-ai/pdf/2403.05530.pdf\",\n",
+ " mime_type=\"application/pdf\",\n",
+ " ),\n",
+ "]\n",
+ "\n",
+ "cached_content = caching.CachedContent.create(\n",
+ " model_name=MODEL_ID,\n",
+ " system_instruction=system_instruction,\n",
+ " contents=contents,\n",
+ " ttl=datetime.timedelta(minutes=60),\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "7e1dKGSLDg2q"
+ },
+ "source": [
+ "You can access the properties of the cached content as shown in the example below. You can use its `name` or `resource_name` to reference the contents of the context cache.\n",
+ "\n",
+ "**Note**: The `name` of the context cache is also referred to as cache ID."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "uRJPRtkKDk2b"
+ },
+ "outputs": [],
+ "source": [
+ "print(cached_content.name)\n",
+ "print(cached_content.resource_name)\n",
+ "print(cached_content.model_name)\n",
+ "print(cached_content.create_time)\n",
+ "print(cached_content.expire_time)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "d-f5gTEaCPkN"
+ },
+ "source": [
+ "### Retrieve and use a context cache\n",
+ "\n",
+ "You can use the property `name` or `resource_name` to reference the contents of the context cache. For example:\n",
+ "```\n",
+ "new_cached_content = caching.CachedContent(cached_content_name=cached_content.name)\n",
+ "```"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "RQ1zMmFQ1BNj"
+ },
+ "source": [
+ "To use the context cache, you construct a `GenerativeModel` with the context cache."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "EPVyJIW1BaVj"
+ },
+ "outputs": [],
+ "source": [
+ "model = GenerativeModel.from_cached_content(cached_content=cached_content)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "1kgfyCoGH_w0"
+ },
+ "source": [
+ "Then you can query the model with a prompt, and the cached content will be used as a prefix to the prompt."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "5dSDogewDAHB"
+ },
+ "outputs": [],
+ "source": [
+ "response = model.generate_content(\n",
+ " \"What is the research goal shared by these research papers?\"\n",
+ ")\n",
+ "\n",
+ "print(response.text)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "tX7vHiybEWeJ"
+ },
+ "source": [
+ "### Use context caching in Chat\n",
+ "\n",
+ "You can use the context cache in a multi-turn chat session.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "FYNZS5o0FoGR"
+ },
+ "outputs": [],
+ "source": [
+ "chat = model.start_chat()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "5U-6wGSFFx51"
+ },
+ "outputs": [],
+ "source": [
+ "prompt = \"\"\"\n",
+ "How do the approaches to responsible AI development and mitigation strategies in Gemini 1.5 evolve from those in Gemini 1.0?\n",
+ "\"\"\"\n",
+ "\n",
+ "response = chat.send_message(prompt)\n",
+ "\n",
+ "print(response.text)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "FFO_JgKNeCpK"
+ },
+ "outputs": [],
+ "source": [
+ "prompt = \"\"\"\n",
+ "Given the advancements presented in Gemini 1.5, what are the key future research directions identified in both papers\n",
+ "for further improving multimodal AI models?\n",
+ "\"\"\"\n",
+ "\n",
+ "response = chat.send_message(prompt)\n",
+ "\n",
+ "print(response.text)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "h2VhjUmojQjg"
+ },
+ "source": [
+ "You can use `print(chat.history)` to print out the chat session history."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "ORsGDdXXLwHK"
+ },
+ "source": [
+ "### Update the expiration time of a context cache\n",
+ "\n",
+ "\n",
+ "The default expiration time of a context cache is 60 minutes. To update the expiration time, update one of the following properties:\n",
+ "\n",
+ "`ttl` - The duration (seconds and nanoseconds) that the cache lives after it's created, or after the `ttl` is updated, before it expires. When you set the `ttl`, the cache's `expire_time` is updated.\n",
+ "\n",
+ "`expire_time` - A Timestamp that specifies the absolute date and time when the context cache expires."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "WyiZoZHKI2Jr"
+ },
+ "outputs": [],
+ "source": [
+ "cached_content.update(ttl=datetime.timedelta(hours=1))\n",
+ "\n",
+ "cached_content.refresh()\n",
+ "\n",
+ "print(cached_content.expire_time)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "chd6_8YRxdIu"
+ },
+ "source": [
+ "### Delete a context cache\n",
+ "\n",
+ "You can remove content from the cache using the delete operation."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "XGzgTk6YzgSt"
+ },
+ "outputs": [],
+ "source": [
+ "cached_content.delete()"
+ ]
+ }
+ ],
+ "metadata": {
+ "colab": {
+ "name": "intro_context_caching.ipynb",
+ "toc_visible": true
+ },
+ "kernelspec": {
+ "display_name": "Python 3",
+ "name": "python3"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}
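The `ttl`/`expire_time` relationship described in the new context-caching notebook (setting a `ttl` updates `expire_time` to the creation or update time plus the `ttl` duration) can be sketched without any Vertex AI calls. The `expire_time_from_ttl` helper below is hypothetical, written only to illustrate the arithmetic; it is not part of the SDK:

```python
import datetime


def expire_time_from_ttl(
    start: datetime.datetime, ttl: datetime.timedelta
) -> datetime.datetime:
    """Hypothetical helper: a cache's expire_time is its creation time
    (or the time of the last ttl update) plus the ttl duration."""
    return start + ttl


created = datetime.datetime(2024, 7, 1, 12, 0, 0, tzinfo=datetime.timezone.utc)
# A 60-minute ttl expires one hour after creation.
print(expire_time_from_ttl(created, datetime.timedelta(minutes=60)).isoformat())
# → 2024-07-01T13:00:00+00:00
```

This mirrors the notebook's `cached_content.update(ttl=datetime.timedelta(hours=1))` call, which pushes `expire_time` one hour past the moment of the update.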
diff --git a/gemini/controlled-generation/intro_controlled_generation.ipynb b/gemini/controlled-generation/intro_controlled_generation.ipynb
index 4f522777ffc..90449f5334a 100644
--- a/gemini/controlled-generation/intro_controlled_generation.ipynb
+++ b/gemini/controlled-generation/intro_controlled_generation.ipynb
@@ -1,749 +1,797 @@
{
- "cells": [
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "ur8xi4C7S06n"
- },
- "outputs": [],
- "source": [
- "# Copyright 2024 Google LLC\n",
- "#\n",
- "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
- "# you may not use this file except in compliance with the License.\n",
- "# You may obtain a copy of the License at\n",
- "#\n",
- "# https://www.apache.org/licenses/LICENSE-2.0\n",
- "#\n",
- "# Unless required by applicable law or agreed to in writing, software\n",
- "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
- "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
- "# See the License for the specific language governing permissions and\n",
- "# limitations under the License."
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "JAPoU8Sm5E6e"
- },
- "source": [
- "# Intro to Controlled Generation with the Gemini API\n",
- "\n",
- "<table align=\"left\">\n",
- "  <td style=\"text-align: center\">\n",
- "    <a href=\"https://colab.research.google.com/github/GoogleCloudPlatform/generative-ai/blob/main/gemini/controlled-generation/intro_controlled_generation.ipynb\">\n",
- "      <img src=\"https://cloud.google.com/ml-engine/images/colab-logo-32px.png\" alt=\"Google Colaboratory logo\"><br> Open in Colab\n",
- "    </a>\n",
- "  </td>\n",
- "  <td style=\"text-align: center\">\n",
- "    <a href=\"https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fgenerative-ai%2Fmain%2Fgemini%2Fcontrolled-generation%2Fintro_controlled_generation.ipynb\">\n",
- "      <img width=\"32px\" src=\"https://cloud.google.com/ml-engine/images/colab-enterprise-logo-32px.png\" alt=\"Google Cloud Colab Enterprise logo\"><br> Open in Colab Enterprise\n",
- "    </a>\n",
- "  </td>\n",
- "  <td style=\"text-align: center\">\n",
- "    <a href=\"https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/generative-ai/main/gemini/controlled-generation/intro_controlled_generation.ipynb\">\n",
- "      <img src=\"https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32\" alt=\"Vertex AI logo\"><br> Open in Workbench\n",
- "    </a>\n",
- "  </td>\n",
- "  <td style=\"text-align: center\">\n",
- "    <a href=\"https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/controlled-generation/intro_controlled_generation.ipynb\">\n",
- "      <img src=\"https://cloud.google.com/ml-engine/images/github-logo-32px.png\" alt=\"GitHub logo\"><br> View on GitHub\n",
- "    </a>\n",
- "  </td>\n",
- "</table>"
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "84f0f73a0f76"
- },
- "source": [
- "| | |\n",
- "|-|-|\n",
- "|Author(s) | [Eric Dong](https://github.com/gericdong)|"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "tvgnzT1CKxrO"
- },
- "source": [
- "## Overview\n",
- "\n",
- "### Gemini\n",
- "\n",
- "Gemini is a family of generative AI models developed by Google DeepMind that is designed for multimodal use cases.\n",
- "\n",
- "### Controlled Generation\n",
- "\n",
- "Depending on your application, you may want the model's response to a prompt to be returned in a structured data format, particularly if the responses feed downstream processes, such as modules that expect a specific format as input. The Gemini API provides the controlled generation capability to constrain the model output to a structured format.\n",
- "\n",
- "\n",
- "### Objectives\n",
- "\n",
- "In this tutorial, you learn how to use the controlled generation capability in the Vertex AI Gemini API to generate model responses in a JSON object with specific fields.\n",
- "\n",
- "You will complete the following tasks:\n",
- "\n",
- "- Using `response_mime_type` with the Gemini 1.5 Flash models\n",
- "- Using `response_mime_type` and `response_schema` with the Gemini 1.5 Pro models\n",
- "- Using controlled generation in use cases requiring output constraints\n"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "61RBz8LLbxCR"
- },
- "source": [
- "## Get started"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "No17Cw5hgx12"
- },
- "source": [
- "### Install Vertex AI SDK and other required packages\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "tFy3H3aPgx12"
- },
- "outputs": [],
- "source": [
- "%pip install --upgrade --user --quiet google-cloud-aiplatform"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "R5Xep4W9lq-Z"
- },
- "source": [
- "### Restart runtime\n",
- "\n",
- "To use the newly installed packages in this Jupyter runtime, you must restart the runtime. You can do this by running the cell below, which restarts the current kernel.\n",
- "\n",
- "The restart might take a minute or longer. After it's restarted, continue to the next step."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "XRvKdaPDTznN"
- },
- "outputs": [],
- "source": [
- "import IPython\n",
- "\n",
- "app = IPython.Application.instance()\n",
- "app.kernel.do_shutdown(True)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "SbmM4z7FOBpM"
- },
- "source": [
- "<div class=\"alert alert-block alert-warning\">\n",
- "<b>⚠️ The kernel is going to restart. Wait until it's finished before continuing to the next step. ⚠️</b>\n",
- "</div>\n"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "dmWOrTJ3gx13"
- },
- "source": [
- "### Authenticate your notebook environment (Colab only)\n",
- "\n",
- "If you're running this notebook on Google Colab, run the cell below to authenticate your environment."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "NyKGtVQjgx13"
- },
- "outputs": [],
- "source": [
- "import sys\n",
- "\n",
- "if \"google.colab\" in sys.modules:\n",
- " from google.colab import auth\n",
- "\n",
- " auth.authenticate_user()"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "DF4l8DTdWgPY"
- },
- "source": [
- "### Set Google Cloud project information and initialize Vertex AI SDK\n",
- "\n",
- "To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).\n",
- "\n",
- "Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "Nqwi-5ufWp_B"
- },
- "outputs": [],
- "source": [
- "PROJECT_ID = \"[your-project-id]\" # @param {type:\"string\"}\n",
- "LOCATION = \"us-central1\" # @param {type:\"string\"}\n",
- "\n",
- "\n",
- "import vertexai\n",
- "\n",
- "vertexai.init(project=PROJECT_ID, location=LOCATION)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "EdvJRUWRNGHE"
- },
- "source": [
- "## Code Examples"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "09720c707f1c"
- },
- "source": [
- "### Import libraries"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "e45ea9a28734"
- },
- "outputs": [],
- "source": [
- "import json\n",
- "\n",
- "from vertexai import generative_models\n",
- "from vertexai.generative_models import GenerationConfig, GenerativeModel"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "74badac24b3e"
- },
- "source": [
- "### Using `response_mime_type` with the Gemini 1.5 Flash models\n",
- "\n",
- "You can have the model produce output in a certain format by setting the `response_mime_type` configuration option in `generation_config` and describing the desired format in the prompt."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "4a9c4ebc507b"
- },
- "outputs": [],
- "source": [
- "model = GenerativeModel(\n",
- " model_name=\"gemini-1.5-flash\",\n",
- " generation_config={\"response_mime_type\": \"application/json\"},\n",
- ")"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "a63b746a44cf"
- },
- "source": [
- "In the prompt, describe the format you want in the response."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "37292b0e4ef6"
- },
- "outputs": [],
- "source": [
- "prompt = \"\"\"\n",
- " List a few popular cookie recipes using this JSON schema:\n",
- " Recipe = {\"recipe_name\": str}\n",
- " Return: list[Recipe]\n",
- "\"\"\""
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "09e3f92c710c"
- },
- "source": [
- "Generate the content and parse the response string to JSON."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 6,
- "metadata": {
- "id": "fee244ad523e"
- },
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "[{'recipe_name': 'Chocolate Chip Cookies'}, {'recipe_name': 'Oatmeal Raisin Cookies'}, {'recipe_name': 'Snickerdoodles'}, {'recipe_name': 'Sugar Cookies'}, {'recipe_name': 'Peanut Butter Cookies'}]\n"
- ]
- }
- ],
- "source": [
- "response = model.generate_content(prompt)\n",
- "\n",
- "json_response = json.loads(response.text)\n",
- "print(json_response)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "52aeea15a479"
- },
- "source": [
- "### Using `response_mime_type` and `response_schema` with the Gemini 1.5 Pro models\n",
- "\n",
- "While Gemini 1.5 Flash models only accept a text description of the schema you want returned, the Gemini 1.5 Pro models let you pass a data structure in the `response_schema` parameter in `generation_config`, and the model output will strictly follow that schema.\n",
- "\n",
- "Note that when `response_schema` is specified, the `response_mime_type` has to be set to `application/json`."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "81cbb6bd51d8"
- },
- "outputs": [],
- "source": [
- "model = GenerativeModel(\"gemini-1.5-pro\")"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "766346c046f9"
- },
- "source": [
- "Following the previous example, define the data structure for the model output. Note that all of the fields in the JSON are optional by default unless specified in the `required` field."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 8,
- "metadata": {
- "id": "af3fa1fbff4f"
- },
- "outputs": [],
- "source": [
- "response_schema = {\n",
- " \"type\": \"ARRAY\",\n",
- " \"items\": {\n",
- " \"type\": \"OBJECT\",\n",
- " \"properties\": {\n",
- " \"recipe_name\": {\n",
- " \"type\": \"STRING\",\n",
- " },\n",
- " },\n",
- " \"required\": [\"recipe_name\"],\n",
- " },\n",
- "}"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "82033e70bf6e"
- },
- "source": [
- "When prompting the model to generate the content, pass the schema to the `response_schema` field of the `generation_config`."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 9,
- "metadata": {
- "id": "5db8b91d5be0"
- },
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "[{\"recipe_name\": \"Classic Chocolate Chip Cookies\"}, {\"recipe_name\": \"Peanut Butter Cookies\"}, {\"recipe_name\": \"Snickerdoodles\"}, {\"recipe_name\": \"Oatmeal Raisin Cookies\"}, {\"recipe_name\": \"Shortbread Cookies\"}] \n"
- ]
- }
- ],
- "source": [
- "response = model.generate_content(\n",
- " \"List a few popular cookie recipes\",\n",
- " generation_config=GenerationConfig(\n",
- " response_mime_type=\"application/json\", response_schema=response_schema\n",
- " ),\n",
- ")\n",
- "\n",
- "print(response.text)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "ca9af4346be7"
- },
- "source": [
- "You can parse the response string to JSON."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 10,
- "metadata": {
- "id": "76b5284016c0"
- },
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "[{'recipe_name': 'Classic Chocolate Chip Cookies'}, {'recipe_name': 'Peanut Butter Cookies'}, {'recipe_name': 'Snickerdoodles'}, {'recipe_name': 'Oatmeal Raisin Cookies'}, {'recipe_name': 'Shortbread Cookies'}]\n"
- ]
- }
- ],
- "source": [
- "json_response = json.loads(response.text)\n",
- "print(json_response)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "69450c61bc07"
- },
- "source": [
- "### Using controlled generation in use cases requiring output constraints\n",
- "\n",
- "Controlled generation can be used to ensure that model outputs adhere to a specific structure (e.g., JSON), restrict the model to a fixed set of choices (e.g., sentiment classification), or enforce a certain style or set of guidelines.\n",
- "\n",
- "Let's use controlled generation with the Gemini 1.5 Pro models in the following use cases that require output constraints."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": null,
- "metadata": {
- "id": "64f95263b81d"
- },
- "outputs": [],
- "source": [
- "model = GenerativeModel(\"gemini-1.5-pro\")"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "eba9ef4d4b50"
- },
- "source": [
- "#### **Example**: Generate game character profile\n",
- "\n",
- "In this example, you instruct the model to create a game character profile with some specific requirements, and constrain the model output to a structured format. This example also demonstrates how to configure the `response_schema` and `response_mime_type` fields in `generation_config` in conjunction with `safety_settings`."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 12,
- "metadata": {
- "id": "1411f729f2f7"
- },
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- " [{\n",
- " \"age\": 42,\n",
- " \"children\": [\n",
- " {\n",
- " \"age\": 21,\n",
- " \"name\": \"Merida\"\n",
- " },\n",
- " {\n",
- " \"age\": 18,\n",
- " \"name\": \"Fergus\"\n",
- " },\n",
- " {\n",
- " \"age\": 18,\n",
- " \"name\": \"Harris\"\n",
- " }\n",
- " ],\n",
- " \"name\": \"Eleanor\",\n",
- " \"occupation\": \"Queen\",\n",
- " \"background\": \"Eleanor, the beloved ruler of a prosperous kingdom, is known for her wisdom, grace, and unwavering strength. After the untimely death of her husband, she has successfully navigated countless challenges, earning her the admiration of both her people and neighboring rulers. However, a new threat emerges, one that will test Eleanor's mettle and force her to confront her past\",\n",
- " \"playable\": false\n",
- " },\n",
- " {\n",
- " \"age\": 25,\n",
- " \"children\": [],\n",
- " \"name\": \"Kaelen\",\n",
- " \"occupation\": \"Hunter\",\n",
- " \"background\": \"Kaelen is a skilled hunter and tracker who lives off the land, relying on his instincts and knowledge of the wilderness. He is fiercely independent and wary of outsiders, but his loyalty to those he trusts is unwavering. Haunted by a tragic event from his past, Kaelen struggles to balance his desire for revenge with his inherent sense of justice\",\n",
- " \"playable\": true\n",
- " }\n",
- "] \n"
- ]
- }
- ],
- "source": [
- "response_schema = {\n",
- " \"type\": \"ARRAY\",\n",
- " \"items\": {\n",
- " \"type\": \"OBJECT\",\n",
- " \"properties\": {\n",
- " \"name\": {\"type\": \"STRING\"},\n",
- " \"age\": {\"type\": \"INTEGER\"},\n",
- " \"occupation\": {\"type\": \"STRING\"},\n",
- " \"background\": {\"type\": \"STRING\"},\n",
- " \"playable\": {\"type\": \"BOOLEAN\"},\n",
- " \"children\": {\n",
- " \"type\": \"ARRAY\",\n",
- " \"items\": {\n",
- " \"type\": \"OBJECT\",\n",
- " \"properties\": {\n",
- " \"name\": {\"type\": \"STRING\"},\n",
- " \"age\": {\"type\": \"INTEGER\"},\n",
- " },\n",
- " \"required\": [\"name\", \"age\"],\n",
- " },\n",
- " },\n",
- " },\n",
- " \"required\": [\"name\", \"age\", \"occupation\", \"children\"],\n",
- " },\n",
- "}\n",
- "\n",
- "prompt = \"\"\"\n",
- " Generate a character profile for a video game, including the character's name, age, occupation, background, names of their\n",
- " three children, and whether they can be controlled by the player.\n",
- "\"\"\"\n",
- "\n",
- "response = model.generate_content(\n",
- " prompt,\n",
- " generation_config=GenerationConfig(\n",
- " response_mime_type=\"application/json\", response_schema=response_schema\n",
- " ),\n",
- " safety_settings={\n",
- " generative_models.HarmCategory.HARM_CATEGORY_HATE_SPEECH: generative_models.HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,\n",
- " generative_models.HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: generative_models.HarmBlockThreshold.BLOCK_ONLY_HIGH,\n",
- " generative_models.HarmCategory.HARM_CATEGORY_HARASSMENT: generative_models.HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,\n",
- " generative_models.HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: generative_models.HarmBlockThreshold.BLOCK_NONE,\n",
- " },\n",
- ")\n",
- "\n",
- "print(response.text)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "e02769d61054"
- },
- "source": [
- "#### **Example**: Extract errors from log data\n",
- "\n",
- "In this example, you use the model to pull out specific error messages from unstructured log data, extract key information, and constraint the model output to a structured format.\n"
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 13,
- "metadata": {
- "id": "007c0394cadc"
- },
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "[{\"error_code\": 308, \"error_message\": \"Could not process image upload: Unsupported file format.\" , \"timestamp\": \"15:43:28\"}, {\"error_code\": 5522, \"error_message\": \"Service dependency unavailable (payment gateway). Retrying...\" , \"timestamp\": \"15:45:02\"}, {\"error_code\": 9001, \"error_message\": \"Application crashed due to out-of-memory exception.\" , \"timestamp\": \"15:45:33\"}] \n"
- ]
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "ur8xi4C7S06n"
+ },
+ "outputs": [],
+ "source": [
+ "# Copyright 2024 Google LLC\n",
+ "#\n",
+ "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
+ "# you may not use this file except in compliance with the License.\n",
+ "# You may obtain a copy of the License at\n",
+ "#\n",
+ "# https://www.apache.org/licenses/LICENSE-2.0\n",
+ "#\n",
+ "# Unless required by applicable law or agreed to in writing, software\n",
+ "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
+ "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
+ "# See the License for the specific language governing permissions and\n",
+ "# limitations under the License."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "JAPoU8Sm5E6e"
+ },
+ "source": [
+ "# Intro to Controlled Generation with the Gemini API\n",
+ "\n",
+ "\n",
+ "![Google Colaboratory logo](https://cloud.google.com/ml-engine/images/colab-logo-32px.png) Open in Colab | ![Google Cloud Colab Enterprise logo](https://cloud.google.com/ml-engine/images/colab-enterprise-logo-32px.png) Open in Colab Enterprise | ![Vertex AI Workbench logo](https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32) Open in Workbench | ![GitHub logo](https://cloud.google.com/ml-engine/images/github-logo-32px.png) View on GitHub\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "84f0f73a0f76"
+ },
+ "source": [
+ "| | |\n",
+ "|-|-|\n",
+ "|Author(s) | [Eric Dong](https://github.com/gericdong)|"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "tvgnzT1CKxrO"
+ },
+ "source": [
+ "## Overview\n",
+ "\n",
+ "### Gemini\n",
+ "\n",
+ "Gemini is a family of generative AI models developed by Google DeepMind that is designed for multimodal use cases.\n",
+ "\n",
+ "### Controlled Generation\n",
+ "\n",
+ "Depending on your application, you may want the model's response to a prompt to be returned in a structured data format, particularly if the responses feed downstream processes that expect a specific format as input. The Gemini API provides the controlled generation capability to constrain the model output to a structured format.\n",
+ "\n",
+ "\n",
+ "### Objectives\n",
+ "\n",
+ "In this tutorial, you learn how to use the controlled generation capability in the Vertex AI Gemini API to generate model responses in a JSON object with specific fields.\n",
+ "\n",
+ "You will complete the following tasks:\n",
+ "\n",
+ "- Using `response_mime_type` with the Gemini 1.5 Flash models\n",
+ "- Using `response_mime_type` and `response_schema` with the Gemini 1.5 Pro models\n",
+ "- Using controlled generation in use cases requiring output constraints\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "61RBz8LLbxCR"
+ },
+ "source": [
+ "## Get started"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "No17Cw5hgx12"
+ },
+ "source": [
+ "### Install Vertex AI SDK and other required packages\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "tFy3H3aPgx12"
+ },
+ "outputs": [],
+ "source": [
+ "%pip install --upgrade --user --quiet google-cloud-aiplatform"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "R5Xep4W9lq-Z"
+ },
+ "source": [
+ "### Restart runtime\n",
+ "\n",
+ "To use the newly installed packages in this Jupyter runtime, you must restart the runtime. You can do this by running the cell below, which restarts the current kernel.\n",
+ "\n",
+ "The restart might take a minute or longer. After it's restarted, continue to the next step."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "XRvKdaPDTznN"
+ },
+ "outputs": [],
+ "source": [
+ "import IPython\n",
+ "\n",
+ "app = IPython.Application.instance()\n",
+ "app.kernel.do_shutdown(True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "SbmM4z7FOBpM"
+ },
+ "source": [
+ "\n",
+ "⚠️ The kernel is going to restart. Wait until it's finished before continuing to the next step. ⚠️\n",
+ "\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "dmWOrTJ3gx13"
+ },
+ "source": [
+ "### Authenticate your notebook environment (Colab only)\n",
+ "\n",
+ "If you're running this notebook on Google Colab, run the cell below to authenticate your environment."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "NyKGtVQjgx13"
+ },
+ "outputs": [],
+ "source": [
+ "import sys\n",
+ "\n",
+ "if \"google.colab\" in sys.modules:\n",
+ " from google.colab import auth\n",
+ "\n",
+ " auth.authenticate_user()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "DF4l8DTdWgPY"
+ },
+ "source": [
+ "### Set Google Cloud project information and initialize Vertex AI SDK\n",
+ "\n",
+ "To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).\n",
+ "\n",
+ "Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "Nqwi-5ufWp_B"
+ },
+ "outputs": [],
+ "source": [
+ "PROJECT_ID = \"[your-project-id]\" # @param {type:\"string\"}\n",
+ "LOCATION = \"us-central1\" # @param {type:\"string\"}\n",
+ "\n",
+ "import vertexai\n",
+ "\n",
+ "vertexai.init(project=PROJECT_ID, location=LOCATION)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "EdvJRUWRNGHE"
+ },
+ "source": [
+ "## Code Examples"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "09720c707f1c"
+ },
+ "source": [
+ "### Import libraries"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "e45ea9a28734"
+ },
+ "outputs": [],
+ "source": [
+ "import json\n",
+ "\n",
+ "from vertexai import generative_models\n",
+ "from vertexai.generative_models import GenerationConfig, GenerativeModel, Part"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "74badac24b3e"
+ },
+ "source": [
+ "### Using `response_mime_type` with the Gemini 1.5 Flash models\n",
+ "\n",
+ "You can have the model produce output in a certain format by setting the `response_mime_type` configuration option in `generation_config` and describing the format you want in the prompt."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "4a9c4ebc507b"
+ },
+ "outputs": [],
+ "source": [
+ "model = GenerativeModel(\n",
+ " model_name=\"gemini-1.5-flash\",\n",
+ " generation_config={\"response_mime_type\": \"application/json\"},\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "a63b746a44cf"
+ },
+ "source": [
+ "In the prompt, describe the format you want the response to follow."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "37292b0e4ef6"
+ },
+ "outputs": [],
+ "source": [
+ "prompt = \"\"\"\n",
+ " List a few popular cookie recipes using this JSON schema:\n",
+ " Recipe = {\"recipe_name\": str}\n",
+ " Return: list[Recipe]\n",
+ "\"\"\""
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "09e3f92c710c"
+ },
+ "source": [
+ "Generate the content and parse the response string to JSON."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "metadata": {
+ "id": "fee244ad523e"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "[{'recipe_name': 'Chocolate Chip Cookies'}, {'recipe_name': 'Oatmeal Raisin Cookies'}, {'recipe_name': 'Snickerdoodles'}, {'recipe_name': 'Sugar Cookies'}, {'recipe_name': 'Peanut Butter Cookies'}]\n"
+ ]
+ }
+ ],
+ "source": [
+ "response = model.generate_content(prompt)\n",
+ "\n",
+ "json_response = json.loads(response.text)\n",
+ "print(json_response)"
+ ]
+ },
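As a practical aside: even with `response_mime_type` set to `application/json`, it can be worth parsing defensively before handing the text to downstream code. The sketch below is plain Python with no API calls; the helper name is hypothetical, and the fence-stripping is a precaution for responses accidentally wrapped in markdown fences, not documented Gemini behavior.

```python
import json


def parse_json_response(text: str) -> object:
    """Parse a model response that is expected to be JSON.

    Tolerates an accidental markdown fence (```json ... ```) around
    the payload before delegating to json.loads.
    """
    cleaned = text.strip()
    if cleaned.startswith("```"):
        # Drop the opening fence line (e.g. ```json) and the closing ```.
        cleaned = cleaned.split("\n", 1)[1]
        cleaned = cleaned.rsplit("```", 1)[0]
    return json.loads(cleaned)


recipes = parse_json_response('[{"recipe_name": "Snickerdoodles"}]')
print(recipes[0]["recipe_name"])  # Snickerdoodles
```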
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "52aeea15a479"
+ },
+ "source": [
+ "### Using `response_mime_type` and `response_schema` with the Gemini 1.5 Pro models\n",
+ "\n",
+ "While Gemini 1.5 Flash models only accept a text description of the schema you want returned, the Gemini 1.5 Pro models let you pass a data structure in the `response_schema` parameter in `generation_config`, and the model output will strictly follow that schema.\n",
+ "\n",
+ "Note that when `response_schema` is specified, `response_mime_type` must be set to `application/json`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "81cbb6bd51d8"
+ },
+ "outputs": [],
+ "source": [
+ "model = GenerativeModel(\"gemini-1.5-pro\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "766346c046f9"
+ },
+ "source": [
+ "Following the previous example, define the data structure for the model output. Note that all of the fields in the JSON are optional by default unless specified in the `required` field."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 8,
+ "metadata": {
+ "id": "af3fa1fbff4f"
+ },
+ "outputs": [],
+ "source": [
+ "response_schema = {\n",
+ " \"type\": \"ARRAY\",\n",
+ " \"items\": {\n",
+ " \"type\": \"OBJECT\",\n",
+ " \"properties\": {\n",
+ " \"recipe_name\": {\n",
+ " \"type\": \"STRING\",\n",
+ " },\n",
+ " },\n",
+ " \"required\": [\"recipe_name\"],\n",
+ " },\n",
+ "}"
+ ]
+ },
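If you want to sanity-check parsed output against the schema locally, a minimal recursive validator is easy to sketch. This is a hedged illustration in plain Python that covers only the type names used in this notebook (`ARRAY`, `OBJECT`, `STRING`, `INTEGER`, `BOOLEAN`, `NUMBER`), not a full OpenAPI-schema validator.

```python
# Map the schema's uppercase type names to Python types.
# Note: isinstance(True, int) is True in Python, so booleans also
# satisfy an INTEGER check in this simplified sketch.
_TYPES = {"STRING": str, "INTEGER": int, "BOOLEAN": bool, "NUMBER": (int, float)}


def check(value, schema) -> bool:
    """Recursively check a parsed response against an OpenAPI-style schema."""
    t = schema["type"]
    if t == "ARRAY":
        return isinstance(value, list) and all(
            check(item, schema["items"]) for item in value
        )
    if t == "OBJECT":
        if not isinstance(value, dict):
            return False
        if any(key not in value for key in schema.get("required", [])):
            return False
        props = schema["properties"]
        return all(k in props and check(v, props[k]) for k, v in value.items())
    return isinstance(value, _TYPES[t])


response_schema = {
    "type": "ARRAY",
    "items": {
        "type": "OBJECT",
        "properties": {"recipe_name": {"type": "STRING"}},
        "required": ["recipe_name"],
    },
}
print(check([{"recipe_name": "Shortbread Cookies"}], response_schema))  # True
```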
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "82033e70bf6e"
+ },
+ "source": [
+ "When prompting the model to generate the content, pass the schema to the `response_schema` field of the `generation_config`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 9,
+ "metadata": {
+ "id": "5db8b91d5be0"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "[{\"recipe_name\": \"Classic Chocolate Chip Cookies\"}, {\"recipe_name\": \"Peanut Butter Cookies\"}, {\"recipe_name\": \"Snickerdoodles\"}, {\"recipe_name\": \"Oatmeal Raisin Cookies\"}, {\"recipe_name\": \"Shortbread Cookies\"}] \n"
+ ]
+ }
+ ],
+ "source": [
+ "response = model.generate_content(\n",
+ " \"List a few popular cookie recipes\",\n",
+ " generation_config=GenerationConfig(\n",
+ " response_mime_type=\"application/json\", response_schema=response_schema\n",
+ " ),\n",
+ ")\n",
+ "\n",
+ "print(response.text)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "ca9af4346be7"
+ },
+ "source": [
+ "You can parse the response string to JSON."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 10,
+ "metadata": {
+ "id": "76b5284016c0"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "[{'recipe_name': 'Classic Chocolate Chip Cookies'}, {'recipe_name': 'Peanut Butter Cookies'}, {'recipe_name': 'Snickerdoodles'}, {'recipe_name': 'Oatmeal Raisin Cookies'}, {'recipe_name': 'Shortbread Cookies'}]\n"
+ ]
+ }
+ ],
+ "source": [
+ "json_response = json.loads(response.text)\n",
+ "print(json_response)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "69450c61bc07"
+ },
+ "source": [
+ "### Using controlled generation in use cases requiring output constraints\n",
+ "\n",
+ "Controlled generation can be used to ensure that model outputs adhere to a specific structure (e.g., JSON), restrict the model to a fixed set of choices (e.g., sentiment classification), or follow a certain style or set of guidelines.\n",
+ "\n",
+ "Let's use controlled generation with the Gemini 1.5 Pro models in the following use cases that require output constraints."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "64f95263b81d"
+ },
+ "outputs": [],
+ "source": [
+ "model = GenerativeModel(\"gemini-1.5-pro\")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "eba9ef4d4b50"
+ },
+ "source": [
+ "#### **Example**: Generate game character profile\n",
+ "\n",
+ "In this example, you instruct the model to create a game character profile with some specific requirements and constrain the model output to a structured format. This example also demonstrates how to configure the `response_schema` and `response_mime_type` fields in `generation_config` in conjunction with `safety_settings`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 12,
+ "metadata": {
+ "id": "1411f729f2f7"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ " [{\n",
+ " \"age\": 42,\n",
+ " \"children\": [\n",
+ " {\n",
+ " \"age\": 21,\n",
+ " \"name\": \"Merida\"\n",
+ " },\n",
+ " {\n",
+ " \"age\": 18,\n",
+ " \"name\": \"Fergus\"\n",
+ " },\n",
+ " {\n",
+ " \"age\": 18,\n",
+ " \"name\": \"Harris\"\n",
+ " }\n",
+ " ],\n",
+ " \"name\": \"Eleanor\",\n",
+ " \"occupation\": \"Queen\",\n",
+ " \"background\": \"Eleanor, the beloved ruler of a prosperous kingdom, is known for her wisdom, grace, and unwavering strength. After the untimely death of her husband, she has successfully navigated countless challenges, earning her the admiration of both her people and neighboring rulers. However, a new threat emerges, one that will test Eleanor's mettle and force her to confront her past\",\n",
+ " \"playable\": false\n",
+ " },\n",
+ " {\n",
+ " \"age\": 25,\n",
+ " \"children\": [],\n",
+ " \"name\": \"Kaelen\",\n",
+ " \"occupation\": \"Hunter\",\n",
+ " \"background\": \"Kaelen is a skilled hunter and tracker who lives off the land, relying on his instincts and knowledge of the wilderness. He is fiercely independent and wary of outsiders, but his loyalty to those he trusts is unwavering. Haunted by a tragic event from his past, Kaelen struggles to balance his desire for revenge with his inherent sense of justice\",\n",
+ " \"playable\": true\n",
+ " }\n",
+ "] \n"
+ ]
+ }
+ ],
+ "source": [
+ "response_schema = {\n",
+ " \"type\": \"ARRAY\",\n",
+ " \"items\": {\n",
+ " \"type\": \"OBJECT\",\n",
+ " \"properties\": {\n",
+ " \"name\": {\"type\": \"STRING\"},\n",
+ " \"age\": {\"type\": \"INTEGER\"},\n",
+ " \"occupation\": {\"type\": \"STRING\"},\n",
+ " \"background\": {\"type\": \"STRING\"},\n",
+ " \"playable\": {\"type\": \"BOOLEAN\"},\n",
+ " \"children\": {\n",
+ " \"type\": \"ARRAY\",\n",
+ " \"items\": {\n",
+ " \"type\": \"OBJECT\",\n",
+ " \"properties\": {\n",
+ " \"name\": {\"type\": \"STRING\"},\n",
+ " \"age\": {\"type\": \"INTEGER\"},\n",
+ " },\n",
+ " \"required\": [\"name\", \"age\"],\n",
+ " },\n",
+ " },\n",
+ " },\n",
+ " \"required\": [\"name\", \"age\", \"occupation\", \"children\"],\n",
+ " },\n",
+ "}\n",
+ "\n",
+ "prompt = \"\"\"\n",
+ " Generate a character profile for a video game, including the character's name, age, occupation, background, names of their\n",
+ " three children, and whether they can be controlled by the player.\n",
+ "\"\"\"\n",
+ "\n",
+ "response = model.generate_content(\n",
+ " prompt,\n",
+ " generation_config=GenerationConfig(\n",
+ " response_mime_type=\"application/json\", response_schema=response_schema\n",
+ " ),\n",
+ " safety_settings={\n",
+ " generative_models.HarmCategory.HARM_CATEGORY_HATE_SPEECH: generative_models.HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,\n",
+ " generative_models.HarmCategory.HARM_CATEGORY_DANGEROUS_CONTENT: generative_models.HarmBlockThreshold.BLOCK_ONLY_HIGH,\n",
+ " generative_models.HarmCategory.HARM_CATEGORY_HARASSMENT: generative_models.HarmBlockThreshold.BLOCK_LOW_AND_ABOVE,\n",
+ " generative_models.HarmCategory.HARM_CATEGORY_SEXUALLY_EXPLICIT: generative_models.HarmBlockThreshold.BLOCK_NONE,\n",
+ " },\n",
+ ")\n",
+ "\n",
+ "print(response.text)"
+ ]
+ },
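Because `playable` is not in the schema's `required` list, downstream code should treat it as optional. A small sketch of consuming the output above (plain Python on a hand-copied sample of the response; no API calls):

```python
import json

raw = """[
  {"name": "Eleanor", "age": 42, "occupation": "Queen",
   "children": [{"name": "Merida", "age": 21}], "playable": false},
  {"name": "Kaelen", "age": 25, "occupation": "Hunter",
   "children": [], "playable": true}
]"""

profiles = json.loads(raw)
# "playable" is optional in the schema, so default it to False when absent.
playable = [p["name"] for p in profiles if p.get("playable", False)]
print(playable)  # ['Kaelen']
```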
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "e02769d61054"
+ },
+ "source": [
+ "#### **Example**: Extract errors from log data\n",
+ "\n",
+ "In this example, you use the model to pull out specific error messages from unstructured log data, extract key information, and constrain the model output to a structured format.\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "metadata": {
+ "id": "007c0394cadc"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "[{\"error_code\": 308, \"error_message\": \"Could not process image upload: Unsupported file format.\" , \"timestamp\": \"15:43:28\"}, {\"error_code\": 5522, \"error_message\": \"Service dependency unavailable (payment gateway). Retrying...\" , \"timestamp\": \"15:45:02\"}, {\"error_code\": 9001, \"error_message\": \"Application crashed due to out-of-memory exception.\" , \"timestamp\": \"15:45:33\"}] \n"
+ ]
+ }
+ ],
+ "source": [
+ "response_schema = {\n",
+ " \"type\": \"ARRAY\",\n",
+ " \"items\": {\n",
+ " \"type\": \"OBJECT\",\n",
+ " \"properties\": {\n",
+ " \"timestamp\": {\"type\": \"STRING\"},\n",
+ " \"error_code\": {\"type\": \"INTEGER\"},\n",
+ " \"error_message\": {\"type\": \"STRING\"},\n",
+ " },\n",
+ " \"required\": [\"timestamp\", \"error_message\", \"error_code\"],\n",
+ " },\n",
+ "}\n",
+ "\n",
+ "prompt = \"\"\"\n",
+ "[15:43:28] ERROR: Could not process image upload: Unsupported file format. (Error Code: 308)\n",
+ "[15:44:10] INFO: Search index updated successfully.\n",
+ "[15:45:02] ERROR: Service dependency unavailable (payment gateway). Retrying... (Error Code: 5522)\n",
+ "[15:45:33] ERROR: Application crashed due to out-of-memory exception. (Error Code: 9001)\n",
+ "\"\"\"\n",
+ "\n",
+ "response = model.generate_content(\n",
+ " prompt,\n",
+ " generation_config=GenerationConfig(\n",
+ " response_mime_type=\"application/json\", response_schema=response_schema\n",
+ " ),\n",
+ ")\n",
+ "\n",
+ "print(response.text)"
+ ]
+ },
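Once the errors are extracted as JSON, downstream code can consume the records directly. A sketch (plain Python; the field names match the `response_schema` above, and the records are a hand-copied sample of the response) that loads them into dataclasses and sorts them chronologically:

```python
import json
from dataclasses import dataclass


@dataclass
class LogError:
    timestamp: str
    error_code: int
    error_message: str


raw = """[
  {"error_code": 5522, "error_message": "Service dependency unavailable (payment gateway). Retrying...", "timestamp": "15:45:02"},
  {"error_code": 308, "error_message": "Could not process image upload: Unsupported file format.", "timestamp": "15:43:28"}
]"""

errors = sorted(
    (LogError(**record) for record in json.loads(raw)),
    key=lambda e: e.timestamp,  # zero-padded "HH:MM:SS" strings sort chronologically
)
print([e.error_code for e in errors])  # [308, 5522]
```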
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "a74594893037"
+ },
+ "source": [
+ "#### **Example**: Analyze product review data\n",
+ "\n",
+ "In this example, you instruct the model to analyze product review data, extract key entities, perform sentiment classification (multiple choices), provide additional explanation, and output the results in JSON format."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 14,
+ "metadata": {
+ "id": "9a3b8b9800f9"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "[\n",
+ " [\n",
+ " {\n",
+ " \"explanation\": \"Strong positive sentiment with superlative language (\\\"best ever\\\")\",\n",
+ " \"flavor\": \"Strawberry Cheesecake\",\n",
+ " \"rating\": 4,\n",
+ " \"sentiment\": \"POSITIVE\"\n",
+ " }\n",
+ " ],\n",
+ " [\n",
+ " {\n",
+ " \"explanation\": \"Mixed sentiment - acknowledges positive aspects (\\\"quite good\\\") but expresses a negative preference (\\\"too sweet\\\")\",\n",
+ " \"flavor\": \"Mango Tango\",\n",
+ " \"rating\": 1,\n",
+ " \"sentiment\": \"NEGATIVE\"\n",
+ " }\n",
+ " ]\n",
+ "] \n"
+ ]
+ }
+ ],
+ "source": [
+ "response_schema = {\n",
+ " \"type\": \"ARRAY\",\n",
+ " \"items\": {\n",
+ " \"type\": \"ARRAY\",\n",
+ " \"items\": {\n",
+ " \"type\": \"OBJECT\",\n",
+ " \"properties\": {\n",
+ " \"rating\": {\"type\": \"INTEGER\"},\n",
+ " \"flavor\": {\"type\": \"STRING\"},\n",
+ " \"sentiment\": {\n",
+ " \"type\": \"STRING\",\n",
+ " \"enum\": [\"POSITIVE\", \"NEGATIVE\", \"NEUTRAL\"],\n",
+ " },\n",
+ " \"explanation\": {\"type\": \"STRING\"},\n",
+ " },\n",
+ " \"required\": [\"rating\", \"flavor\", \"sentiment\", \"explanation\"],\n",
+ " },\n",
+ " },\n",
+ "}\n",
+ "\n",
+ "prompt = \"\"\"\n",
+ " Analyze the following product reviews, output the sentiment classification and give an explanation.\n",
+ " \n",
+ " - \"Absolutely loved it! Best ice cream I've ever had.\" Rating: 4, Flavor: Strawberry Cheesecake\n",
+ " - \"Quite good, but a bit too sweet for my taste.\" Rating: 1, Flavor: Mango Tango\n",
+ "\"\"\"\n",
+ "\n",
+ "response = model.generate_content(\n",
+ " prompt,\n",
+ " generation_config=GenerationConfig(\n",
+ " response_mime_type=\"application/json\", response_schema=response_schema\n",
+ " ),\n",
+ ")\n",
+ "\n",
+ "print(response.text)"
+ ]
+ },
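Because the `enum` constraint bounds `sentiment` to three values, aggregating the results is straightforward. A hedged sketch in plain Python over a hand-copied sample of the nested response (one inner array per review):

```python
import json
from collections import Counter

raw = """[
  [{"rating": 4, "flavor": "Strawberry Cheesecake", "sentiment": "POSITIVE", "explanation": "..."}],
  [{"rating": 1, "flavor": "Mango Tango", "sentiment": "NEGATIVE", "explanation": "..."}]
]"""

# Flatten the nested arrays, then tally the bounded sentiment labels.
reviews = [item for group in json.loads(raw) for item in group]
counts = Counter(review["sentiment"] for review in reviews)
print(counts["POSITIVE"], counts["NEGATIVE"])  # 1 1
```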
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "10971b23afcf"
+ },
+ "source": [
+ "#### **Example**: Detect objects in images\n",
+ "\n",
+ "You can also use controlled generation in multimodal use cases. In this example, you instruct the model to detect objects in the images and output the results in JSON format. These images are stored in a Cloud Storage bucket.\n",
+ "\n",
+ "- [office-desk.jpeg](https://storage.googleapis.com/cloud-samples-data/generative-ai/image/office-desk.jpeg)\n",
+ "- [gardening-tools.jpeg](https://storage.googleapis.com/cloud-samples-data/generative-ai/image/gardening-tools.jpeg)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 13,
+ "metadata": {
+ "id": "1f3e9935e2da"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "[\n",
+ " [{\"object\": \"globe\"}, {\"object\": \"tablet\"}, {\"object\": \"shopping cart\"}, {\"object\": \"eiffel tower\"}, {\"object\": \"airplane\"}, {\"object\": \"passport\"}, {\"object\": \"keyboard\"}, {\"object\": \"computer mouse\"}, {\"object\": \"sunglasses\"}, {\"object\": \"money\"}, {\"object\": \"notebook\"}, {\"object\": \"pen\"}, {\"object\": \"coffee cup\"}],\n",
+ " [{\"object\": \"watering can\"}, {\"object\": \"plant\"}, {\"object\": \"flower pot\"}, {\"object\": \"flower pot\"}, {\"object\": \"garden gloves\"}, {\"object\": \"garden trowel\"}, {\"object\": \"garden hand tool\"}]\n",
+ "] \n"
+ ]
+ }
+ ],
+ "source": [
+ "response_schema = {\n",
+ " \"type\": \"ARRAY\",\n",
+ " \"items\": {\n",
+ " \"type\": \"ARRAY\",\n",
+ " \"items\": {\n",
+ " \"type\": \"OBJECT\",\n",
+ " \"properties\": {\n",
+ " \"object\": {\"type\": \"STRING\"},\n",
+ " },\n",
+ " },\n",
+ " },\n",
+ "}\n",
+ "\n",
+ "prompt = \"Generate a list of objects in the images.\"\n",
+ "\n",
+ "response = model.generate_content(\n",
+ " [\n",
+ " Part.from_uri(\n",
+ " \"gs://cloud-samples-data/generative-ai/image/office-desk.jpeg\",\n",
+ " \"image/jpeg\",\n",
+ " ),\n",
+ " Part.from_uri(\n",
+ " \"gs://cloud-samples-data/generative-ai/image/gardening-tools.jpeg\",\n",
+ " \"image/jpeg\",\n",
+ " ),\n",
+ " prompt,\n",
+ " ],\n",
+ " generation_config=GenerationConfig(\n",
+ " response_mime_type=\"application/json\", response_schema=response_schema\n",
+ " ),\n",
+ ")\n",
+ "\n",
+ "print(response.text)"
+ ]
}
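The nested-array schema yields one object list per input image, so per-image tallies fall out naturally. A small sketch (plain Python over a trimmed sample of the response above):

```python
import json
from collections import Counter

raw = '[[{"object": "globe"}, {"object": "tablet"}], [{"object": "flower pot"}, {"object": "flower pot"}]]'

# One Counter per image: object label -> number of detections.
per_image = [Counter(d["object"] for d in image) for image in json.loads(raw)]
print(per_image[1]["flower pot"])  # 2
```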
- ],
- "source": [
- "response_schema = {\n",
- " \"type\": \"ARRAY\",\n",
- " \"items\": {\n",
- " \"type\": \"OBJECT\",\n",
- " \"properties\": {\n",
- " \"timestamp\": {\"type\": \"STRING\"},\n",
- " \"error_code\": {\"type\": \"INTEGER\"},\n",
- " \"error_message\": {\"type\": \"STRING\"},\n",
- " },\n",
- " \"required\": [\"timestamp\", \"error_message\", \"error_code\"],\n",
- " },\n",
- "}\n",
- "\n",
- "prompt = \"\"\"\n",
- "[15:43:28] ERROR: Could not process image upload: Unsupported file format. (Error Code: 308)\n",
- "[15:44:10] INFO: Search index updated successfully.\n",
- "[15:45:02] ERROR: Service dependency unavailable (payment gateway). Retrying... (Error Code: 5522)\n",
- "[15:45:33] ERROR: Application crashed due to out-of-memory exception. (Error Code: 9001)\n",
- "\"\"\"\n",
- "\n",
- "response = model.generate_content(\n",
- " prompt,\n",
- " generation_config=GenerationConfig(\n",
- " response_mime_type=\"application/json\", response_schema=response_schema\n",
- " ),\n",
- ")\n",
- "\n",
- "print(response.text)"
- ]
- },
- {
- "cell_type": "markdown",
- "metadata": {
- "id": "a74594893037"
- },
- "source": [
- "#### **Example**: Analyze product review data\n",
- "\n",
- "In this example, you instruct the model to analyze product review data, extract key entities, perform sentiment classification (multiple choices), provide additional explanation, and output the results in JSON format."
- ]
- },
- {
- "cell_type": "code",
- "execution_count": 14,
- "metadata": {
- "id": "9a3b8b9800f9"
- },
- "outputs": [
- {
- "name": "stdout",
- "output_type": "stream",
- "text": [
- "[\n",
- " [\n",
- " {\n",
- " \"explanation\": \"Strong positive sentiment with superlative language (\\\"best ever\\\")\",\n",
- " \"flavor\": \"Strawberry Cheesecake\",\n",
- " \"rating\": 4,\n",
- " \"sentiment\": \"POSITIVE\"\n",
- " }\n",
- " ],\n",
- " [\n",
- " {\n",
- " \"explanation\": \"Mixed sentiment - acknowledges positive aspects (\\\"quite good\\\") but expresses a negative preference (\\\"too sweet\\\")\",\n",
- " \"flavor\": \"Mango Tango\",\n",
- " \"rating\": 1,\n",
- " \"sentiment\": \"NEGATIVE\"\n",
- " }\n",
- " ]\n",
- "] \n"
- ]
+ ],
+ "metadata": {
+ "colab": {
+ "name": "intro_controlled_generation.ipynb",
+ "toc_visible": true
+ },
+ "kernelspec": {
+ "display_name": "Python 3",
+ "name": "python3"
}
- ],
- "source": [
- "response_schema = {\n",
- " \"type\": \"ARRAY\",\n",
- " \"items\": {\n",
- " \"type\": \"ARRAY\",\n",
- " \"items\": {\n",
- " \"type\": \"OBJECT\",\n",
- " \"properties\": {\n",
- " \"rating\": {\"type\": \"INTEGER\"},\n",
- " \"flavor\": {\"type\": \"STRING\"},\n",
- " \"sentiment\": {\n",
- " \"type\": \"STRING\",\n",
- " \"enum\": [\"POSITIVE\", \"NEGATIVE\", \"NEUTRAL\"],\n",
- " },\n",
- " \"explanation\": {\"type\": \"STRING\"},\n",
- " },\n",
- " \"required\": [\"rating\", \"flavor\", \"sentiment\", \"explanation\"],\n",
- " },\n",
- " },\n",
- "}\n",
- "\n",
- "prompt = \"\"\"\n",
- " Analyze the following product reviews, output the sentiment classification and give an explanation.\n",
- " \n",
- " - \"Absolutely loved it! Best ice cream I've ever had.\" Rating: 4, Flavor: Strawberry Cheesecake\n",
- " - \"Quite good, but a bit too sweet for my taste.\" Rating: 1, Flavor: Mango Tango\n",
- "\"\"\"\n",
- "\n",
- "response = model.generate_content(\n",
- " prompt,\n",
- " generation_config=GenerationConfig(\n",
- " response_mime_type=\"application/json\", response_schema=response_schema\n",
- " ),\n",
- ")\n",
- "\n",
- "print(response.text)"
- ]
- }
- ],
- "metadata": {
- "colab": {
- "name": "intro_controlled_generation.ipynb",
- "toc_visible": true
- },
- "environment": {
- "kernel": "conda-root-py",
- "name": "workbench-notebooks.m113",
- "type": "gcloud",
- "uri": "gcr.io/deeplearning-platform-release/workbench-notebooks:m113"
- },
- "kernelspec": {
- "display_name": "Python 3 (ipykernel) (Local)",
- "language": "python",
- "name": "conda-root-py"
},
- "language_info": {
- "codemirror_mode": {
- "name": "ipython",
- "version": 3
- },
- "file_extension": ".py",
- "mimetype": "text/x-python",
- "name": "python",
- "nbconvert_exporter": "python",
- "pygments_lexer": "ipython3",
- "version": "3.10.13"
- }
- },
- "nbformat": 4,
- "nbformat_minor": 4
+ "nbformat": 4,
+ "nbformat_minor": 0
}
diff --git a/gemini/use-cases/retail/product_attributes_extraction.ipynb b/gemini/use-cases/retail/product_attributes_extraction.ipynb
new file mode 100644
index 00000000000..2352ff3d5ff
--- /dev/null
+++ b/gemini/use-cases/retail/product_attributes_extraction.ipynb
@@ -0,0 +1,588 @@
+{
+ "cells": [
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "ur8xi4C7S06n"
+ },
+ "outputs": [],
+ "source": [
+ "# Copyright 2024 Google LLC\n",
+ "#\n",
+ "# Licensed under the Apache License, Version 2.0 (the \"License\");\n",
+ "# you may not use this file except in compliance with the License.\n",
+ "# You may obtain a copy of the License at\n",
+ "#\n",
+ "# https://www.apache.org/licenses/LICENSE-2.0\n",
+ "#\n",
+ "# Unless required by applicable law or agreed to in writing, software\n",
+ "# distributed under the License is distributed on an \"AS IS\" BASIS,\n",
+ "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n",
+ "# See the License for the specific language governing permissions and\n",
+ "# limitations under the License."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "JAPoU8Sm5E6e"
+ },
+ "source": [
+ "# Product attributes extraction and detailed descriptions from images using Gemini 1.5 Pro\n",
+ "\n",
+ "\n",
+ "![Google Colaboratory logo](https://cloud.google.com/ml-engine/images/colab-logo-32px.png) Open in Colab | ![Google Cloud Colab Enterprise logo](https://cloud.google.com/ml-engine/images/colab-enterprise-logo-32px.png) Open in Colab Enterprise | ![Vertex AI Workbench logo](https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32) Open in Workbench | ![GitHub logo](https://cloud.google.com/ml-engine/images/github-logo-32px.png) View on GitHub\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "84f0f73a0f76"
+ },
+ "source": [
+ "| | |\n",
+ "|-|-|\n",
+ "|Author(s) | [Tianli Yu](https://github.com/tianli) |"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "tvgnzT1CKxrO"
+ },
+ "source": [
+ "## Overview\n",
+ "\n",
+ "This notebook shows how to build a general agent (on top of Gemini) that extracts product attributes or detailed product descriptions from an image input. It also introduces a prompting technique called the \"self-correcting prompt\", in which you ask the model to check and verify its result by itself, all in one single prompt. Self-correcting prompts can improve the overall quality of the agent's output.\n",
+ "\n",
+ "In the following sections we will:\n",
+ "\n",
+ "* Write the necessary image loading and parsing library.\n",
+ "* Create a product image agent.\n",
+ "* Run the agent on a set of examples for tasks like product image description and attribute extraction.\n"
+ ]
+ },
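The self-correcting technique folds three steps (extract, self-evaluate, correct) into one prompt. As a minimal sketch of how such a prompt is assembled (the `build_self_correcting_prompt` helper below is illustrative only, not part of the Vertex AI SDK; the `ProductImageAgent` class later in this notebook inlines the same wording):

```python
# Illustrative sketch: assemble a single "self-correcting" prompt that
# (1) asks for product attributes, (2) asks the model to rate its own
# answer against the same image, and (3) asks for a corrected final JSON.
def build_self_correcting_prompt(product_category: str) -> str:
    # Step 1: the extraction request.
    prompt = f"""
    The above image is a product image from the {product_category} category.
    First, please list all the relevant attributes for the main product in
    the above image and return a list of key-value pairs in json format.
    """
    # Steps 2 and 3: the self-evaluation and correction instructions,
    # appended to the same single prompt.
    prompt += """
    Next, treat the returned json as the result generated by a different
    model, and rate each key-value pair as "correct" or "wrong" based on
    the same image.
    Then, based on this evaluation, correct all the attributes that are
    rated wrong in the final json output.
    """
    return prompt


print(build_self_correcting_prompt("Shoes"))
```

Because everything happens in one model call, the evaluation step sees the same image context as the extraction step, which is what lets the model catch its own mistakes.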
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "61RBz8LLbxCR"
+ },
+ "source": [
+ "## Get started"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "No17Cw5hgx12"
+ },
+ "source": [
+ "### Install Vertex AI SDK and other required packages\n"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "tFy3H3aPgx12"
+ },
+ "outputs": [],
+ "source": [
+ "%pip install --upgrade --user --quiet google-cloud-aiplatform"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "R5Xep4W9lq-Z"
+ },
+ "source": [
+ "### Restart runtime\n",
+ "\n",
+ "To use the newly installed packages in this Jupyter runtime, you must restart the runtime. You can do this by running the cell below, which restarts the current kernel.\n",
+ "\n",
+ "The restart might take a minute or longer. After it's restarted, continue to the next step."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "metadata": {
+ "id": "XRvKdaPDTznN",
+ "outputId": "01b6924b-740d-43d8-8a07-be8ecdad403e",
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ }
+ },
+ "outputs": [
+ {
+ "output_type": "execute_result",
+ "data": {
+ "text/plain": [
+ "{'status': 'ok', 'restart': True}"
+ ]
+ },
+ "metadata": {},
+ "execution_count": 1
+ }
+ ],
+ "source": [
+ "import IPython\n",
+ "\n",
+ "app = IPython.Application.instance()\n",
+ "app.kernel.do_shutdown(True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "SbmM4z7FOBpM"
+ },
+ "source": [
+ "<div class=\"alert alert-block alert-warning\">\n",
+ "<b>⚠️ The kernel is going to restart. Wait until it's finished before continuing to the next step. ⚠️</b>\n",
+ "</div>\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "dmWOrTJ3gx13"
+ },
+ "source": [
+ "### Authenticate your notebook environment (Colab only)\n",
+ "\n",
+ "If you're running this notebook on Google Colab, run the cell below to authenticate your environment."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 1,
+ "metadata": {
+ "id": "NyKGtVQjgx13"
+ },
+ "outputs": [],
+ "source": [
+ "import sys\n",
+ "\n",
+ "if \"google.colab\" in sys.modules:\n",
+ " from google.colab import auth\n",
+ "\n",
+ " auth.authenticate_user()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "DF4l8DTdWgPY"
+ },
+ "source": [
+ "### Set Google Cloud project information and initialize Vertex AI SDK\n",
+ "\n",
+ "To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).\n",
+ "\n",
+ "Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 2,
+ "metadata": {
+ "id": "Nqwi-5ufWp_B"
+ },
+ "outputs": [],
+ "source": [
+ "PROJECT_ID = \"[your-project-id]\" # @param {type:\"string\"}\n",
+ "LOCATION = \"us-central1\" # @param {type:\"string\"}\n",
+ "\n",
+ "\n",
+ "import vertexai\n",
+ "\n",
+ "vertexai.init(project=PROJECT_ID, location=LOCATION)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "EdvJRUWRNGHE"
+ },
+ "source": [
+ "## Build a Product Image Agent"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 3,
+ "metadata": {
+ "cellView": "form",
+ "id": "xJQczGZX3FfW"
+ },
+ "outputs": [],
+ "source": [
+ "# @title Image loading and parsing library\n",
+ "import json\n",
+ "\n",
+ "import requests\n",
+ "\n",
+ "\n",
+ "def get_mime_from_uri(image_uri: str) -> str:\n",
+ " \"\"\"Get the mime type from the image uri.\"\"\"\n",
+ " if image_uri.endswith(\".png\"):\n",
+ " return \"image/png\"\n",
+ " elif image_uri.endswith(\".gif\"):\n",
+ " return \"image/gif\"\n",
+ " else:\n",
+ " # Assume JPEG as the default mime\n",
+ " return \"image/jpeg\"\n",
+ "\n",
+ "\n",
+ "def parse_json_from_markdown(answer: str) -> dict[str, list[str]]:\n",
+ " \"\"\"Parse the json from the markdown answer.\n",
+ "\n",
+ " Args:\n",
+ " answer (str): The markdown answer from the model.\n",
+ "\n",
+ " Returns:\n",
+ "      A parsed JSON dictionary.\n",
+ " \"\"\"\n",
+ " lines = answer.split(\"```\")\n",
+ " try:\n",
+ " # Tries to parse the last json in the answer.\n",
+ " answer = lines[-2]\n",
+ " if answer.startswith(\"json\"):\n",
+ " answer = answer[4:]\n",
+ " result = json.loads(answer)\n",
+ " except json.JSONDecodeError:\n",
+ " # Falls back to the first json in the answer.\n",
+ " answer = lines[1]\n",
+ " if answer.startswith(\"json\"):\n",
+ " answer = answer[4:]\n",
+ " result = json.loads(answer)\n",
+ " return result"
+ ]
+ },
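To see the two helpers above in action without calling the model, here is a small self-contained demo. The sample markdown answer is fabricated for illustration; it mimics a self-correcting response containing two fenced JSON blocks, where the parser should pick the last (corrected) one:

```python
# Self-contained demo of get_mime_from_uri and parse_json_from_markdown,
# restated here so this cell runs on its own.
import json


def get_mime_from_uri(image_uri: str) -> str:
    """Guess the mime type from the image uri's extension."""
    if image_uri.endswith(".png"):
        return "image/png"
    elif image_uri.endswith(".gif"):
        return "image/gif"
    return "image/jpeg"  # default


def parse_json_from_markdown(answer: str) -> dict:
    """Parse the last fenced JSON block, falling back to the first."""
    lines = answer.split("```")
    try:
        answer = lines[-2]
        if answer.startswith("json"):
            answer = answer[4:]
        return json.loads(answer)
    except json.JSONDecodeError:
        answer = lines[1]
        if answer.startswith("json"):
            answer = answer[4:]
        return json.loads(answer)


fence = "`" * 3  # keeps literal code fences out of this example's source
markdown_answer = (
    "Here is the initial result:\n"
    f"{fence}json\n"
    '{"Toe": "Open Toe", "Heels": "Flat"}\n'
    f"{fence}\n"
    "After self-checking, the corrected result is:\n"
    f"{fence}json\n"
    '{"Toe": "Open Toe", "Heels": "Stiletto"}\n'
    f"{fence}\n"
)

print(get_mime_from_uri("sandal.png"))            # image/png
print(parse_json_from_markdown(markdown_answer))  # the last (corrected) JSON
```

Taking the last block first matters for self-correcting responses, since the model's corrected JSON appears after its initial attempt.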
+ {
+ "cell_type": "code",
+ "execution_count": 44,
+ "metadata": {
+ "id": "3nThlwYK_tN2"
+ },
+ "outputs": [],
+ "source": [
+ "# @title The ProductImageAgent Class.\n",
+ "import ipywidgets as widgets\n",
+ "from vertexai.preview.generative_models import GenerationConfig, GenerativeModel, Part\n",
+ "\n",
+ "\n",
+ "class ProductImageAgent:\n",
+ " \"\"\"An agent that wraps around Gemini 1.5 to extract product attributes from\n",
+ " images.\n",
+ "\n",
+ " Args:\n",
+ "        gemini_model_version (str): The version string of the Gemini 1.5 model,\n",
+ "            e.g. \"gemini-1.5-pro\" or \"gemini-1.5-flash\".\n",
+ "        temperature (float): The temperature of the model. Defaults to 0.0.\n",
+ " max_output_tokens (int): The maximum number of output tokens. Defaults to\n",
+ " 8192.\n",
+ " \"\"\"\n",
+ "\n",
+ " def __init__(\n",
+ " self,\n",
+ " gemini_model_version: str = \"gemini-1.5-pro\",\n",
+ " temperature: float = 0.0,\n",
+ " max_output_tokens: int = 8192,\n",
+ " ):\n",
+ " config = GenerationConfig(\n",
+ " temperature=temperature, max_output_tokens=max_output_tokens\n",
+ " )\n",
+ "\n",
+ " # System instructions, add any common instructions here.\n",
+ " sys_inst = \"\"\"\n",
+ " As an assistant for an online retailer, your task is to recognize\n",
+ " attributes from the provided product image.\n",
+ " If an attribute vocabulary is provided, please only select attribute values\n",
+ "    in that vocabulary. Your answer should be strictly consistent with what's in\n",
+ " the image. If any attributes do not exist in the image, please\n",
+ " return null for that attribute.\n",
+ " \"\"\"\n",
+ " self.gemini_model = GenerativeModel(\n",
+ " gemini_model_version, generation_config=config, system_instruction=sys_inst\n",
+ " )\n",
+ "\n",
+ " def get_detailed_description(self, image_uri: str, debug: bool = False) -> str:\n",
+ " \"\"\"Generates the detailed product description from an image.\n",
+ "\n",
+ " Args:\n",
+ "          image_uri: The URI of the image, e.g. a web URL or a Cloud Storage\n",
+ "            (gs://) URI.\n",
+ "          debug: If True, display the image and the prompt for debugging.\n",
+ "\n",
+ " Returns:\n",
+ " The generated detailed description from the model response.\n",
+ " \"\"\"\n",
+ " image_part = Part.from_uri(image_uri, mime_type=get_mime_from_uri(image_uri))\n",
+ " prompt = \"\"\"\n",
+ " Please write a complete and detailed product description for the\n",
+ " above product image. The length of the description should be at least\n",
+ " 200 words.\n",
+ " \"\"\"\n",
+ " if debug:\n",
+ " print(\"====== Begin Debug Info ======\")\n",
+ "            preview = widgets.HTML(value=f'<img src=\"{image_uri}\" width=\"256\"/>')\n",
+ " display(preview)\n",
+ " print(f\"Prompt:\\n{prompt}\")\n",
+ " print(\"====== End Debug Info ======\")\n",
+ "\n",
+ " model_response = self.gemini_model.generate_content([image_part, prompt])\n",
+ " return model_response.text\n",
+ "\n",
+ " def get_attributes(\n",
+ " self,\n",
+ " image_uri: str,\n",
+ "        product_category: str | None = None,\n",
+ "        vocabulary_json: str | None = None,\n",
+ " debug: bool = False,\n",
+ " ) -> str:\n",
+ " \"\"\"Generates the product attributes from an image.\n",
+ "\n",
+ " Args:\n",
+ "          image_uri (str): The URI of the product image.\n",
+ "          product_category (str): Optional product category used to focus\n",
+ "            the prompt.\n",
+ "          vocabulary_json (str): A JSON string listing all the attribute names\n",
+ "            and their possible values.\n",
+ "\n",
+ " Returns:\n",
+ " The product attribute json string from the model response.\n",
+ " \"\"\"\n",
+ " image_part = Part.from_uri(image_uri, mime_type=get_mime_from_uri(image_uri))\n",
+ " if product_category:\n",
+ " prompt = f\"\"\"\n",
+ " The above image is a product image from the {product_category}\n",
+ " category.\n",
+ " Please list all the relevant attributes in the {product_category}\n",
+ " category for the main product in the above image and return a list of\n",
+ " key-value pairs in json format.\n",
+ " \"\"\"\n",
+ " else:\n",
+ " prompt = \"\"\"\n",
+ "            Please recognize all the relevant attributes of the main product in\n",
+ "            the above image and return a list of key-value pairs in json format.\n",
+ " \"\"\"\n",
+ " if vocabulary_json:\n",
+ " prompt += f\"\"\"\n",
+ "            Please use only the vocabulary defined in the following json:\n",
+ " {vocabulary_json}\n",
+ " For each key, you should select the most appropriate attribute value\n",
+ "            from its corresponding vocabulary list and return one value\n",
+ " for each attribute key.\n",
+ " You can return null for that key if no attributes match.\n",
+ " \"\"\"\n",
+ " if debug:\n",
+ " print(\"====== Begin Debug Info ======\")\n",
+ "            preview = widgets.HTML(value=f'<img src=\"{image_uri}\" width=\"256\"/>')\n",
+ " display(preview)\n",
+ " print(f\"Prompt:\\n{prompt}\")\n",
+ " print(\"====== End Debug Info ======\")\n",
+ "\n",
+ " model_response = self.gemini_model.generate_content([image_part, prompt])\n",
+ " return model_response.text\n",
+ "\n",
+ " def get_attributes_self_correcting_prompt(\n",
+ " self,\n",
+ " image_uri: str,\n",
+ "        product_category: str | None = None,\n",
+ "        vocabulary_json: str | None = None,\n",
+ " debug: bool = False,\n",
+ "    ) -> dict:\n",
+ " \"\"\"Generates the product attributes from an image using self-correcting prompt.\n",
+ "\n",
+ " Args:\n",
+ "          image_uri (str): The URI of the product image.\n",
+ "          product_category (str): Optional product category used to focus\n",
+ "            the prompt.\n",
+ "          vocabulary_json (str): A JSON string listing all the attribute names\n",
+ "            and their possible values.\n",
+ "\n",
+ " Returns:\n",
+ "          The product attribute dictionary parsed from the model response.\n",
+ " \"\"\"\n",
+ " image_part = Part.from_uri(image_uri, mime_type=get_mime_from_uri(image_uri))\n",
+ " if product_category:\n",
+ " prompt = f\"\"\"\n",
+ " The above image is a product image from the {product_category}\n",
+ " category.\n",
+ " First please list all the relevant attributes in the\n",
+ " {product_category} category for the main product in the above\n",
+ " image and return a list of key-value pairs in json format.\n",
+ " \"\"\"\n",
+ " else:\n",
+ " prompt = \"\"\"\n",
+ "            First, please recognize all the relevant attributes of the main\n",
+ "            product in the above image and return a list of key-value pairs in json\n",
+ " format.\n",
+ " \"\"\"\n",
+ "\n",
+ " if vocabulary_json:\n",
+ " prompt += f\"\"\"\n",
+ "            Please use only the vocabulary defined in the following json:\n",
+ " {vocabulary_json}\n",
+ " For each key, you should select the most appropriate attribute value\n",
+ " from its corresponding vocabulary list and return the attribute key-\n",
+ " value pair. You can return null for that key if no attributes match.\n",
+ " \"\"\"\n",
+ "\n",
+ " # Adding the self-correction instructions.\n",
+ " prompt += \"\"\"\n",
+ " Next, treat the returned json as the result generated by a different\n",
+ " model, rate each key-value pair as \"correct\" or \"wrong\" based on the\n",
+ " same image. You can output in a format like \"key - value: correct (or\n",
+ " wrong)\".\n",
+ "        Then, based on this evaluation, please correct all the attributes\n",
+ "        that are rated wrong in the final json output.\n",
+ " Please use markdown to annotate different json in your output.\n",
+ " \"\"\"\n",
+ " model_response = self.gemini_model.generate_content([image_part, prompt])\n",
+ "\n",
+ " if debug:\n",
+ " print(\"====== Begin Debug Info ======\")\n",
+ "            preview = widgets.HTML(value=f'<img src=\"{image_uri}\" width=\"256\"/>')\n",
+ " display(preview)\n",
+ " print(f\"Prompt:\\n{prompt}\\n\")\n",
+ " print(f\"Response:\\n{model_response.candidates[0].content.parts[0].text}\\n\")\n",
+ " print(\"====== End Debug Info ======\")\n",
+ "\n",
+ " # Parse the model_response to get the final json.\n",
+ " return parse_json_from_markdown(\n",
+ " model_response.candidates[0].content.parts[0].text\n",
+ " )\n",
+ "\n",
+ "\n",
+ "# Creates the agent.\n",
+ "product_agent = ProductImageAgent(\n",
+ " gemini_model_version=\"gemini-1.5-pro-preview-0514\", temperature=0\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "OCF1YAGzASQw"
+ },
+ "outputs": [],
+ "source": [
+ "# @title Example 1: Generate detailed product description from an image.\n",
+ "image_uri = \"https://plus.unsplash.com/premium_photo-1711051513016-72baa1035293?q=80&w=3687&auto=format&fit=crop&ixlib=rb-4.0.3&ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D\" # @param {type:\"string\"}\n",
+ "\n",
+ "product_agent.get_detailed_description(image_uri, debug=True)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "fLwaCYeHl4An"
+ },
+ "outputs": [],
+ "source": [
+ "# @title Example 2: Get product attribute json from an image -- open vocabulary.\n",
+ "image_uri = \"https://plus.unsplash.com/premium_photo-1711051513016-72baa1035293?q=80&w=3687&auto=format&fit=crop&ixlib=rb-4.0.3&ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D\" # @param {type:\"string\"}\n",
+ "\n",
+ "# Open vocabulary.\n",
+ "attribute_json = product_agent.get_attributes(image_uri, debug=True)\n",
+ "print(f\"Open vocabulary attributes:\\n{attribute_json}\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "Q-afCK3lY0U3"
+ },
+ "outputs": [],
+ "source": [
+ "# @title Example 3: Get product attribute json from an image -- closed vocabulary.\n",
+ "image_uri = \"https://plus.unsplash.com/premium_photo-1711051513016-72baa1035293?q=80&w=3687&auto=format&fit=crop&ixlib=rb-4.0.3&ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D\" # @param {type:\"string\"}\n",
+ "\n",
+ "# Closed vocabulary.\n",
+ "vocabulary = \"\"\"\n",
+ "{\"Pattern\": [\"Animal\", \"Letter\", \"Plaid\", \"Plain\", \"Polka Dot\", \"Quilted\", \"Striped\", \"Tie Dye\", \"Tropical\", \"Zebra\", \"Block\", \"Rainbow\", \"Floral\"], \"Toe\": [\"Almond Toe\", \"Cap Toe\", \"Closed Toe\", \"Peep Toe\", \"Point Toe\", \"Pointed Toe\", \"Round Toe\", \"Square Toe\", \"Toe Post\", \"Open Toe\"], \"Style\": [\"Ballet\", \"Bandage\", \"Basics\", \"Casual\", \"Classic\", \"Cute\", \"Elegant\", \"Formal\", \"Modern\", \"Motorcycle\", \"Retro\", \"Sexy\", \"Boho\", \"Modest\", \"Comfort\", \"Minimalist\"], \"Strap Type\": [\"Adjustable\", \"Ankle cuff\", \"Ankle straps\", \"Chain\", \"Convertible\", \"Criss Cross\", \"D'orsay\", \"Double Handle\", \"Flowers\", \"Gladiator\", \"Lace Up\", \"Mary Jane\", \"Ring\", \"Slingbacks\", \"Strappy\", \"T strap\", \"Zipper\", \"Elastic\", \"Velcro\", \"Ankle Strap\"], \"Heels\": [\"Chunky\", \"Cork\", \"Espadrilles\", \"Flat\", \"Flatform\", \"Platform\", \"Stiletto\", \"Cone Heel\", \"Kitten Heels\", \"Hidden Wedge\", \"Wedges\", \"Pyramid\"], \"Closure Type\": [\"Back Zipper\", \"Buckle\", \"Zipper\", \"Magnet\", \"Slip on\", \"Hook Loop\", \"Lace-up\", \"Flap\"]}\n",
+ "\"\"\"\n",
+ "\n",
+ "attribute_json = product_agent.get_attributes(\n",
+ " image_uri, vocabulary_json=vocabulary, debug=True\n",
+ ")\n",
+ "print(f\"Closed vocabulary attributes\\n{attribute_json}\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "aVCiddCOfYZ0"
+ },
+ "outputs": [],
+ "source": [
+ "# @title Example 4: Get product attribute json from an image using self-correcting prompt -- open vocabulary\n",
+ "image_uri = \"https://plus.unsplash.com/premium_photo-1711051513016-72baa1035293?q=80&w=3687&auto=format&fit=crop&ixlib=rb-4.0.3&ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D\" # @param {type:\"string\"}\n",
+ "\n",
+ "# Open vocabulary.\n",
+ "attribute_json = product_agent.get_attributes_self_correcting_prompt(\n",
+ " image_uri, product_category=\"Shoes\", debug=True\n",
+ ")\n",
+ "print(f\"Open vocabulary attributes:\\n{attribute_json}\")"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "rJk--yXekwcQ"
+ },
+ "outputs": [],
+ "source": [
+ "# @title Example 5: Get product attribute json from an image using self-correcting prompt -- closed vocabulary\n",
+ "image_uri = \"https://plus.unsplash.com/premium_photo-1711051513016-72baa1035293?q=80&w=3687&auto=format&fit=crop&ixlib=rb-4.0.3&ixid=M3wxMjA3fDB8MHxwaG90by1wYWdlfHx8fGVufDB8fHx8fA%3D%3D\" # @param {type:\"string\"}\n",
+ "\n",
+ "# Closed vocabulary.\n",
+ "vocabulary = \"\"\"\n",
+ "{\"Pattern\": [\"Animal\", \"Letter\", \"Plaid\", \"Plain\", \"Polka Dot\", \"Quilted\", \"Striped\", \"Tie Dye\", \"Tropical\", \"Zebra\", \"Block\", \"Rainbow\", \"Floral\"], \"Toe\": [\"Almond Toe\", \"Cap Toe\", \"Closed Toe\", \"Peep Toe\", \"Point Toe\", \"Pointed Toe\", \"Round Toe\", \"Square Toe\", \"Toe Post\", \"Open Toe\"], \"Style\": [\"Ballet\", \"Bandage\", \"Basics\", \"Casual\", \"Classic\", \"Cute\", \"Elegant\", \"Formal\", \"Modern\", \"Motorcycle\", \"Retro\", \"Sexy\", \"Boho\", \"Modest\", \"Comfort\", \"Minimalist\"], \"Strap Type\": [\"Adjustable\", \"Ankle cuff\", \"Ankle straps\", \"Chain\", \"Convertible\", \"Criss Cross\", \"D'orsay\", \"Double Handle\", \"Flowers\", \"Gladiator\", \"Lace Up\", \"Mary Jane\", \"Ring\", \"Slingbacks\", \"Strappy\", \"T strap\", \"Zipper\", \"Elastic\", \"Velcro\", \"Ankle Strap\"], \"Heels\": [\"Chunky\", \"Cork\", \"Espadrilles\", \"Flat\", \"Flatform\", \"Platform\", \"Stiletto\", \"Cone Heel\", \"Kitten Heels\", \"Hidden Wedge\", \"Wedges\", \"Pyramid\"], \"Closure Type\": [\"Back Zipper\", \"Buckle\", \"Zipper\", \"Magnet\", \"Slip on\", \"Hook Loop\", \"Lace-up\", \"Flap\"]}\n",
+ "\"\"\"\n",
+ "attribute_json = product_agent.get_attributes_self_correcting_prompt(\n",
+ " image_uri, vocabulary_json=vocabulary, debug=True\n",
+ ")\n",
+ "print(f\"Closed vocabulary attributes\\n{attribute_json}\")"
+ ]
+ }
+ ],
+ "metadata": {
+ "colab": {
+ "name": "product_attributes_extraction.ipynb",
+ "toc_visible": true,
+ "provenance": []
+ },
+ "kernelspec": {
+ "display_name": "Python 3",
+ "name": "python3"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 0
+}
\ No newline at end of file