From 9edd66f8c0c6332cc808fc415790c371473c1f6c Mon Sep 17 00:00:00 2001 From: Mark Daoust Date: Mon, 4 Mar 2024 13:07:57 -0800 Subject: [PATCH 1/4] Add a structured data extraction tutorial --- .../structured_data_extraction.ipynb | 877 ++++++++++++++++++ 1 file changed, 877 insertions(+) create mode 100644 site/en/tutorials/structured_data_extraction.ipynb diff --git a/site/en/tutorials/structured_data_extraction.ipynb b/site/en/tutorials/structured_data_extraction.ipynb new file mode 100644 index 000000000..d88f4a01d --- /dev/null +++ b/site/en/tutorials/structured_data_extraction.ipynb @@ -0,0 +1,877 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "NtX45QCEdPaP" + }, + "source": [ + "# Structured data extraction using function calling" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "2tO4fP7FFg2V" + }, + "source": [ + "\n", + " \n", + " \n", + " \n", + "
\n", + " View on Google AI\n", + " \n", + " Run in Google Colab\n", + " \n", + " View source on GitHub\n", + "
" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "8Szkddw5NScW" + }, + "source": [ + "In this tutorial you'll work through a small structured data extraction" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "bvrwRlNPdYDr" + }, + "source": [ + "## Setup" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "QyW6x11UQHnx" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[?25l \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m0.0/137.4 kB\u001b[0m \u001b[31m?\u001b[0m eta \u001b[36m-:--:--\u001b[0m\r\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m137.4/137.4 kB\u001b[0m \u001b[31m4.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", + "\u001b[?25h" + ] + } + ], + "source": [ + "!pip install -U -q google-generativeai" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "TS9l5igubpHO" + }, + "outputs": [], + "source": [ + "import pathlib\n", + "import textwrap\n", + "\n", + "import google.generativeai as genai\n", + "import google.ai.generativelanguage as glm\n", + "\n", + "\n", + "from IPython.display import display\n", + "from IPython.display import Markdown\n", + "\n", + "from google.api_core import retry\n", + "\n", + "def to_markdown(text):\n", + " text = text.replace('•', ' *')\n", + " return Markdown(textwrap.indent(text, '> ', predicate=lambda _: True))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "VmSlTHXxb5pV" + }, + "source": [ + "Once you have the API key, pass it to the SDK. 
You can do this in two ways:\n", + "\n", + "* Put the key in the `GOOGLE_API_KEY` environment variable (the SDK will automatically pick it up from there).\n", + "* Pass the key to `genai.configure(api_key=...)`\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "ab9ASynfcIZn" + }, + "outputs": [], + "source": [ + "try:\n", + " # Used to securely store your API key\n", + " from google.colab import userdata\n", + "\n", + " # Or use `os.getenv('API_KEY')` to fetch an environment variable.\n", + " GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')\n", + "except ImportError:\n", + " import os\n", + " GOOGLE_API_KEY = os.environ['GOOGLE_API_KEY']\n", + "\n", + "genai.configure(api_key=GOOGLE_API_KEY)\n", + "\n", + "genai.configure(\n", + " api_key=GOOGLE_API_KEY,\n", + " client_options={'api_endpoint':'autopush-generativelanguage.sandbox.googleapis.com'})" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "K6SdtoJCL4pL" + }, + "source": [ + "## The example task\n", + "\n", + "For this tutorial you'll extract entities from natural language stories. As an\n", + " example, below is a story written by Gemini." + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "0THz95wOL4pL" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n" + ] + } + ], + "source": [ + "new_story = False\n", + "\n", + "if new_story:\n", + " model = genai.GenerativeModel(model_name='gemini-1.0-pro')\n", + "\n", + " response = model.generate_content(\"\"\"\n", + " Write a long story about a girl with magic backpack, her family, and at\n", + " least one other charater. Make sure everyone has names. 
Don't forget to\n", + " describe the contents of the backpack, and where everyone and everything\n", + " starts and ends up.\"\"\", request_options={'retry': retry.Retry()})\n", + " story = response.text\n", + " print(response.candidates[0].citation_metadata)\n", + "else:\n", + " story = \"\"\"In the quaint town of Willow Creek, nestled amidst rolling hills and whispering willows, resided a young girl named Anya. As she stepped out of the creaky wooden door of her modest cottage, her heart skipped a beat with excitement and anticipation. Today was her first day of school, and she couldn't wait to show off her prized possession - a magical backpack.\n", + "\n", + "Handed down to her from her grandmother, the backpack was no ordinary satchel. Its soft, emerald-green fabric shimmered with an ethereal glow, and its leather straps held secrets that only Anya knew. Within its capacious interior lay an enchanted world, filled with wonders that would ignite her imagination and change her life forever.\n", + "\n", + "Anya's parents, kind-hearted Elise and wise-bearded Edward, bid her farewell with warm embraces. \"Remember, my dear,\" whispered her mother, \"use your magic wisely and for good.\" Her father added, \"Always seek knowledge, and let the backpack be your trusted companion.\"\n", + "\n", + "With a skip in her step, Anya set off towards the town's only schoolhouse. On her way, she passed her best friend, Samuel, a curious and adventurous boy with a mischievous grin. \"Hey, Anya,\" he called out. \"Can I see your backpack?\"\n", + "\n", + "Anya hesitated for a moment before unzipping the flap and revealing its contents. Samuel's eyes widened in amazement as he peered inside. There, nestled amidst pencils and notebooks, were a shimmering sword, a book of ancient spells, a tiny compass that always pointed north, and a magical key that could open any lock.\n", + "\n", + "Together, they marveled at the backpack's wonders, promising to keep its secrets safe. 
As they approached the schoolhouse, Anya noticed a group of older children huddled together, their faces etched with fear. Curiosity getting the better of her, she cautiously approached.\n", + "\n", + "\"What's wrong?\" she asked.\n", + "\n", + "A tall, lanky boy stepped forward. \"There's a monster in the forest,\" he stammered. \"It's been terrorizing the town, attacking animals and even people.\"\n", + "\n", + "Anya's heart sank. The town of Willow Creek was small and peaceful, and the thought of a monster brought a shiver down her spine. She knew she had to do something to protect her family and friends.\n", + "\n", + "Without a moment's hesitation, Anya opened her backpack and retrieved the shimmering sword. With a determined gleam in her eye, she turned to her terrified peers. \"Don't worry,\" she said, her voice steady. \"I'll take care of it.\"\n", + "\n", + "With Samuel close behind her, Anya ventured into the shadowy depths of the forest. The trees seemed to whisper secrets as she passed, and the undergrowth rustled with unseen creatures. As they walked deeper into the forest, the air grew heavy and the ground beneath their feet trembled.\n", + "\n", + "Suddenly, they came to a clearing, and there before their eyes was the monster - a massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease. The creature roared, a thunderous sound that shook the forest to its core.\n", + "\n", + "Fear surged through Anya, but she refused to let it consume her. She drew the sword from its sheath and charged towards the monster. The blade shimmered in the sunlight, and as it struck the beast's hide, a blinding light erupted, enveloping everything in its radiance.\n", + "\n", + "When the light faded, the monster was gone, and in its place was a pile of shattered crystals. 
Anya had defeated the creature with the magic of her backpack, proving that even the smallest of objects could hold the greatest of powers.\n", + "\n", + "As she and Samuel returned to the town, they were greeted as heroes. The people of Willow Creek rejoiced, and the legend of Anya, the girl with the magic backpack, was passed down through generations. And so, Anya continued her adventures, using the backpack's wonders to make the world a better place, one magical step at a time.\"\"\"" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "yMnxJqubg759" + }, + "outputs": [ + { + "data": { + "text/markdown": "> In the quaint town of Willow Creek, nestled amidst rolling hills and whispering willows, resided a young girl named Anya. As she stepped out of the creaky wooden door of her modest cottage, her heart skipped a beat with excitement and anticipation. Today was her first day of school, and she couldn't wait to show off her prized possession - a magical backpack.\n> \n> Handed down to her from her grandmother, the backpack was no ordinary satchel. Its soft, emerald-green fabric shimmered with an ethereal glow, and its leather straps held secrets that only Anya knew. Within its capacious interior lay an enchanted world, filled with wonders that would ignite her imagination and change her life forever.\n> \n> Anya's parents, kind-hearted Elise and wise-bearded Edward, bid her farewell with warm embraces. \"Remember, my dear,\" whispered her mother, \"use your magic wisely and for good.\" Her father added, \"Always seek knowledge, and let the backpack be your trusted companion.\"\n> \n> With a skip in her step, Anya set off towards the town's only schoolhouse. On her way, she passed her best friend, Samuel, a curious and adventurous boy with a mischievous grin. \"Hey, Anya,\" he called out. \"Can I see your backpack?\"\n> \n> Anya hesitated for a moment before unzipping the flap and revealing its contents. 
Samuel's eyes widened in amazement as he peered inside. There, nestled amidst pencils and notebooks, were a shimmering sword, a book of ancient spells, a tiny compass that always pointed north, and a magical key that could open any lock.\n> \n> Together, they marveled at the backpack's wonders, promising to keep its secrets safe. As they approached the schoolhouse, Anya noticed a group of older children huddled together, their faces etched with fear. Curiosity getting the better of her, she cautiously approached.\n> \n> \"What's wrong?\" she asked.\n> \n> A tall, lanky boy stepped forward. \"There's a monster in the forest,\" he stammered. \"It's been terrorizing the town, attacking animals and even people.\"\n> \n> Anya's heart sank. The town of Willow Creek was small and peaceful, and the thought of a monster brought a shiver down her spine. She knew she had to do something to protect her family and friends.\n> \n> Without a moment's hesitation, Anya opened her backpack and retrieved the shimmering sword. With a determined gleam in her eye, she turned to her terrified peers. \"Don't worry,\" she said, her voice steady. \"I'll take care of it.\"\n> \n> With Samuel close behind her, Anya ventured into the shadowy depths of the forest. The trees seemed to whisper secrets as she passed, and the undergrowth rustled with unseen creatures. As they walked deeper into the forest, the air grew heavy and the ground beneath their feet trembled.\n> \n> Suddenly, they came to a clearing, and there before their eyes was the monster - a massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease. The creature roared, a thunderous sound that shook the forest to its core.\n> \n> Fear surged through Anya, but she refused to let it consume her. She drew the sword from its sheath and charged towards the monster. 
The blade shimmered in the sunlight, and as it struck the beast's hide, a blinding light erupted, enveloping everything in its radiance.\n> \n> When the light faded, the monster was gone, and in its place was a pile of shattered crystals. Anya had defeated the creature with the magic of her backpack, proving that even the smallest of objects could hold the greatest of powers.\n> \n> As she and Samuel returned to the town, they were greeted as heroes. The people of Willow Creek rejoiced, and the legend of Anya, the girl with the magic backpack, was passed down through generations. And so, Anya continued her adventures, using the backpack's wonders to make the world a better place, one magical step at a time.", + "text/plain": [ + "" + ] + }, + "execution_count": 10, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "to_markdown(story)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "zldoIzn-MuLE" + }, + "source": [ + "## Using Natural language\n", + "\n", + "Large language models are powerful multitask tools. Often you can just ask Gemini for what you want, and it will do okay. \n", + "\n", + "The Gemini API doesn't have a JSON mode, so there are a few things to watch for when generating data structures this way:\n", + "\n", + "- Sometimes parsing fails.\n", + "- The schema can't be strictly enforced.\n", + "\n", + "We'll solve those problems in the next section. First, try a simple natural language prompt with the schema written out as text. 
This prompt has not been optimized:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "eStTMD6VL4pM" + }, + "outputs": [], + "source": [ + "model = genai.GenerativeModel(\n", + " model_name='gemini-1.0-pro')\n", + "\n", + "response = model.generate_content(textwrap.dedent(\"\"\"\\\n", + " Please return JSON describing the people, places, things and relationships from this story using the following schema:\n", + "\n", + " {\"people\": list[PERSON], \"places\":list[PLACE], \"things\":list[THING], \"relationships\": list[RELATIONSHIP]}\n", + "\n", + " PERSON = {\"name\": str, \"description\": str, \"start_place_name\": str, \"end_place_name\": str}\n", + " PLACE = {\"name\": str, \"description\": str}\n", + " THING = {\"name\": str, \"description\": str, \"start_place_name\": str, \"end_place_name\": str}\n", + " RELATIONSHIP = {\"person_1_name\": str, \"person_2_name\": str, \"relationship\": str}\n", + "\n", + " All fields are required.\n", + "\n", + " Important: Only return a single piece of valid JSON text.\n", + "\n", + " Here is the story:\n", + "\n", + " \"\"\") + story)\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "B0b5zHI3uEBm" + }, + "outputs": [ + { + "data": { + "application/vnd.google.colaboratory.intrinsic+json": { + "type": "string" + }, + "text/plain": [ + "'{\"people\": [{\"name\": \"Anya\", \"description\": \"A young girl who possesses a magical backpack.\", \"start_place_name\": \"Willow Creek\", \"end_place_name\": null}, {\"name\": \"Elise\", \"description\": \"Anya\\'s kind-hearted mother.\", \"start_place_name\": \"Willow Creek\", \"end_place_name\": null}, {\"name\": \"Edward\", \"description\": \"Anya\\'s wise-bearded father.\", \"start_place_name\": \"Willow Creek\", \"end_place_name\": null}, {\"name\": \"Samuel\", \"description\": \"Anya\\'s curious and adventurous best friend.\", \"start_place_name\": \"Willow Creek\", \"end_place_name\": 
null}], \"places\": [{\"name\": \"Willow Creek\", \"description\": \"A quaint town nestled amidst rolling hills and whispering willows.\"}, {\"name\": \"The forest\", \"description\": \"A shadowy place with whispering trees and unseen creatures.\"}], \"things\": [{\"name\": \"Magical backpack\", \"description\": \"A magical backpack that holds an enchanted world filled with wonders.\", \"start_place_name\": \"Anya\\'s grandmother\\'s house\", \"end_place_name\": \"Willow Creek\"}, {\"name\": \"Shimmering sword\", \"description\": \"A sword that Anya uses to defeat the monster.\", \"start_place_name\": \"Anya\\'s backpack\", \"end_place_name\": null}, {\"name\": \"Book of ancient spells\", \"description\": \"A book that contains ancient spells.\", \"start_place_name\": \"Anya\\'s backpack\", \"end_place_name\": null}, {\"name\": \"Tiny compass\", \"description\": \"A compass that always points north.\", \"start_place_name\": \"Anya\\'s backpack\", \"end_place_name\": null}, {\"name\": \"Magical key\", \"description\": \"A key that can open any lock.\", \"start_place_name\": \"Anya\\'s backpack\", \"end_place_name\": null}, {\"name\": \"Monster\", \"description\": \"A massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease.\", \"start_place_name\": \"The forest\", \"end_place_name\": null}], \"relationships\": [{\"person_1_name\": \"Anya\", \"person_2_name\": \"Elise\", \"relationship\": \"Mother-daughter\"}, {\"person_1_name\": \"Anya\", \"person_2_name\": \"Edward\", \"relationship\": \"Father-daughter\"}, {\"person_1_name\": \"Anya\", \"person_2_name\": \"Samuel\", \"relationship\": \"Best friends\"}]}'" + ] + }, + "execution_count": 47, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "response.text" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ScEdqKq1lhmQ" + }, + "source": [ + "That returned a json string. 
Try parsing it:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "xSdj50czL4pM" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{\n", + " \"people\": [\n", + " {\n", + " \"name\": \"Anya\",\n", + " \"description\": \"A young girl who possesses a magical backpack.\",\n", + " \"start_place_name\": \"Willow Creek\",\n", + " \"end_place_name\": null\n", + " },\n", + " {\n", + " \"name\": \"Elise\",\n", + " \"description\": \"Anya's kind-hearted mother.\",\n", + " \"start_place_name\": \"Willow Creek\",\n", + " \"end_place_name\": null\n", + " },\n", + " {\n", + " \"name\": \"Edward\",\n", + " \"description\": \"Anya's wise-bearded father.\",\n", + " \"start_place_name\": \"Willow Creek\",\n", + " \"end_place_name\": null\n", + " },\n", + " {\n", + " \"name\": \"Samuel\",\n", + " \"description\": \"Anya's curious and adventurous best friend.\",\n", + " \"start_place_name\": \"Willow Creek\",\n", + " \"end_place_name\": null\n", + " }\n", + " ],\n", + " \"places\": [\n", + " {\n", + " \"name\": \"Willow Creek\",\n", + " \"description\": \"A quaint town nestled amidst rolling hills and whispering willows.\"\n", + " },\n", + " {\n", + " \"name\": \"The forest\",\n", + " \"description\": \"A shadowy place with whispering trees and unseen creatures.\"\n", + " }\n", + " ],\n", + " \"things\": [\n", + " {\n", + " \"name\": \"Magical backpack\",\n", + " \"description\": \"A magical backpack that holds an enchanted world filled with wonders.\",\n", + " \"start_place_name\": \"Anya's grandmother's house\",\n", + " \"end_place_name\": \"Willow Creek\"\n", + " },\n", + " {\n", + " \"name\": \"Shimmering sword\",\n", + " \"description\": \"A sword that Anya uses to defeat the monster.\",\n", + " \"start_place_name\": \"Anya's backpack\",\n", + " \"end_place_name\": null\n", + " },\n", + " {\n", + " \"name\": \"Book of ancient spells\",\n", + " \"description\": \"A book that contains ancient spells.\",\n", + 
" \"start_place_name\": \"Anya's backpack\",\n", + " \"end_place_name\": null\n", + " },\n", + " {\n", + " \"name\": \"Tiny compass\",\n", + " \"description\": \"A compass that always points north.\",\n", + " \"start_place_name\": \"Anya's backpack\",\n", + " \"end_place_name\": null\n", + " },\n", + " {\n", + " \"name\": \"Magical key\",\n", + " \"description\": \"A key that can open any lock.\",\n", + " \"start_place_name\": \"Anya's backpack\",\n", + " \"end_place_name\": null\n", + " },\n", + " {\n", + " \"name\": \"Monster\",\n", + " \"description\": \"A massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease.\",\n", + " \"start_place_name\": \"The forest\",\n", + " \"end_place_name\": null\n", + " }\n", + " ],\n", + " \"relationships\": [\n", + " {\n", + " \"person_1_name\": \"Anya\",\n", + " \"person_2_name\": \"Elise\",\n", + " \"relationship\": \"Mother-daughter\"\n", + " },\n", + " {\n", + " \"person_1_name\": \"Anya\",\n", + " \"person_2_name\": \"Edward\",\n", + " \"relationship\": \"Father-daughter\"\n", + " },\n", + " {\n", + " \"person_1_name\": \"Anya\",\n", + " \"person_2_name\": \"Samuel\",\n", + " \"relationship\": \"Best friends\"\n", + " }\n", + " ]\n", + "}\n" + ] + } + ], + "source": [ + "import json\n", + "\n", + "json_text = response.text.strip('`\\r\\n ').removeprefix('json')\n", + "print(json.dumps(json.loads(json_text), indent=4))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "TgC_wkHPmkHn" + }, + "source": [ + "That's relatively simple and often works, but you can potentially make this stricter and more robust by defining the schema using the API's Function Calling feature." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "CxMC28LAOfUf" + }, + "source": [ + "## Use Function Calling" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "x-V6PJn83Kh9" + }, + "source": [ + "If you haven't gone through the [Function Calling basics](https://ai.google.dev/tutorials/function_calling_python_quickstart) tutorial yet, make sure you do that first.\n", + "\n", + "With Function Calling, your function and its parameters are described to the API as a `glm.FunctionDeclaration`. In basic cases the SDK can build the `FunctionDeclaration` from the function and its annotations. The SDK doesn't currently handle the description of nested `OBJECT` (`dict`) parameters, so for now you'll need to define them explicitly." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "k83LZ5MCBfTJ" + }, + "source": [ + "### Define the schema\n", + "\n", + "Start by defining `person` as an object with the string fields `name`, `description`, `start_place_name`, and `end_place_name`." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 189, + "metadata": { + "id": "p2efqZA7BAzp" + }, + "outputs": [], + "source": [ + "person = glm.Schema(\n", + " type = glm.Type.OBJECT,\n", + " properties = {\n", + " 'name': glm.Schema(type=glm.Type.STRING),\n", + " 'description': glm.Schema(type=glm.Type.STRING),\n", + " 'start_place_name': glm.Schema(type=glm.Type.STRING),\n", + " 'end_place_name': glm.Schema(type=glm.Type.STRING)\n", + " },\n", + " required=['name', 'description', 'start_place_name', 'end_place_name']\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "HGV1wxx6BCJl" + }, + "source": [ + "Then define people as an `ARRAY` of `person` objects:" + ] + }, + { + "cell_type": "code", + "execution_count": 190, + "metadata": { + "id": "Ur7kzpiA_Dqw" + }, + "outputs": [], + "source": [ + "people = glm.Schema(\n", + " type=glm.Type.ARRAY,\n", + " items=person\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "N6uD63sBBJ3i" + }, + "source": [ + "Then do the same for each of the entities you're trying to extract:" + ] + }, + { + "cell_type": "code", + "execution_count": 191, + "metadata": { + "id": "7wd3jTqj_bVi" + }, + "outputs": [], + "source": [ + "place = glm.Schema(\n", + " type = glm.Type.OBJECT,\n", + " properties = {\n", + " 'name': glm.Schema(type=glm.Type.STRING),\n", + " 'description': glm.Schema(type=glm.Type.STRING),\n", + " }\n", + ")\n", + "\n", + "places = glm.Schema(\n", + " type=glm.Type.ARRAY,\n", + " items=place\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 192, + "metadata": { + "id": "45cLwvCd_vg_" + }, + "outputs": [], + "source": [ + "thing = glm.Schema(\n", + " type = glm.Type.OBJECT,\n", + " properties = {\n", + " 'name': glm.Schema(type=glm.Type.STRING),\n", + " 'description': glm.Schema(type=glm.Type.STRING),\n", + " }\n", + ")\n", + "\n", + "things = glm.Schema(\n", + " type=glm.Type.ARRAY,\n", + " items=thing\n", + ")" + ] + }, + { + "cell_type": "code", 
+ "execution_count": 193, + "metadata": { + "id": "8DdVSZJfADDY" + }, + "outputs": [], + "source": [ + "relationship = glm.Schema(\n", + " type = glm.Type.OBJECT,\n", + " properties = {\n", + " 'person_1_name': glm.Schema(type=glm.Type.STRING),\n", + " 'person_2_name': glm.Schema(type=glm.Type.STRING),\n", + " 'relationship': glm.Schema(type=glm.Type.STRING),\n", + " }\n", + ")\n", + "\n", + "relationships = glm.Schema(\n", + " type=glm.Type.ARRAY,\n", + " items=relationship\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "mJwqEUqjBToJ" + }, + "source": [ + "Now build the `FunctionDeclaration`:" + ] + }, + { + "cell_type": "code", + "execution_count": 194, + "metadata": { + "id": "YQkiVCtsPbUy" + }, + "outputs": [], + "source": [ + "add_to_database = glm.FunctionDeclaration(\n", + " name=\"add_to_database\",\n", + " description=textwrap.dedent(\"\"\"\\\n", + " Adds entities to the database.\n", + " \"\"\"),\n", + " parameters=glm.Schema(\n", + " type=glm.Type.OBJECT,\n", + " properties = {\n", + " 'people': people,\n", + " 'places': places,\n", + " 'things': things,\n", + " 'relationships': relationships\n", + " }\n", + " )\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "e1_QSwD9Bmy5" + }, + "source": [ + "### Call the API\n", + "\n", + "As you saw in [Function Calling basics](https://ai.google.dev/tutorials/function_calling_python_quickstart), you can now pass this `FunctionDeclaration` to the `tools` argument of the `genai.GenerativeModel` constructor (the constructor would also accept an equivalent JSON representation of the function declaration):" + ] + }, + { + "cell_type": "code", + "execution_count": 195, + "metadata": { + "id": "5PGAPRDJP4Qx" + }, + "outputs": [], + "source": [ + "model = genai.GenerativeModel(\n", + " model_name='gemini-1.0-pro',\n", + " tools = [add_to_database])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "1uTYW5cVCDST" + }, + "source": [ + "Each time you 
call the API, the SDK will send the tools along with your prompt, and the model should call the function you defined:" + ] + }, + { + "cell_type": "code", + "execution_count": 196, + "metadata": { + "id": "bAPA7fNtSUwN" + }, + "outputs": [], + "source": [ + "result = model.generate_content(f\"\"\"\n", + "Please add the people, places, things, and relationships from this story to the database:\n", + "\n", + "{story}\n", + "\"\"\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "oSG7r6IBCL7S" + }, + "source": [ + "Now there is no text to parse. The result _is_ a data structure." + ] + }, + { + "cell_type": "code", + "execution_count": 197, + "metadata": { + "id": "07n3wXzFOZ4x" + }, + "outputs": [ + { + "data": { + "text/plain": [ + "False" + ] + }, + "execution_count": 197, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "'text' in result.candidates[0].content.parts[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 198, + "metadata": { + "id": "i-8hm1HPI5Ce" + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 198, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "'function_call' in result.candidates[0].content.parts[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 199, + "metadata": { + "id": "n8BTs6ogDEkq" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n" + ] + } + ], + "source": [ + "fc = result.candidates[0].content.parts[0].function_call\n", + "print(type(fc))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "kILNHmG2IED3" + }, + "source": [ + "The `glm.FunctionCall` class is based on Google Protocol Buffers. Convert it to a more familiar JSON-compatible object:" + ] + }, + { + "cell_type": "code", + "execution_count": 200, + "metadata": { + "id": "5GKHtT4-F3qa" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{\n", + " \"name\": 
\"add_to_database\",\n", + " \"args\": {\n", + " \"relationships\": [\n", + " {\n", + " \"relationship\": \"mother-daughter\",\n", + " \"person_2_name\": \"Elise\",\n", + " \"person_1_name\": \"Anya\"\n", + " },\n", + " {\n", + " \"person_1_name\": \"Anya\",\n", + " \"relationship\": \"father-daughter\",\n", + " \"person_2_name\": \"Edward\"\n", + " },\n", + " {\n", + " \"relationship\": \"best friends\",\n", + " \"person_1_name\": \"Anya\",\n", + " \"person_2_name\": \"Samuel\"\n", + " }\n", + " ],\n", + " \"places\": [\n", + " {\n", + " \"name\": \"Willow Creek\",\n", + " \"description\": \"a quaint town nestled amidst rolling hills and whispering willows\"\n", + " },\n", + " {\n", + " \"name\": \"forest\",\n", + " \"description\": \"a shadowy place with rustling undergrowth\"\n", + " }\n", + " ],\n", + " \"things\": [\n", + " {\n", + " \"description\": \"a backpack with a shimmering emerald-green fabric and leather straps, containing a magical sword, a book of ancient spells, a tiny compass that always points north, and a magical key that could open any lock\",\n", + " \"name\": \"magical backpack\"\n", + " },\n", + " {\n", + " \"description\": \"a weapon that can defeat monsters\",\n", + " \"name\": \"shimmering sword\"\n", + " },\n", + " {\n", + " \"description\": \"a book containing magical spells\",\n", + " \"name\": \"book of ancient spells\"\n", + " },\n", + " {\n", + " \"name\": \"tiny compass\",\n", + " \"description\": \"a compass that always points north\"\n", + " },\n", + " {\n", + " \"name\": \"magical key\",\n", + " \"description\": \"a key that can open any lock\"\n", + " },\n", + " {\n", + " \"name\": \"monster\",\n", + " \"description\": \"a massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease\"\n", + " }\n", + " ],\n", + " \"people\": [\n", + " {\n", + " \"description\": \"a young girl\",\n", + " \"start_place_name\": \"Willow Creek\",\n", + " \"name\": \"Anya\",\n", + " \"end_place_name\": null\n", + " 
},\n", + " {\n", + " \"start_place_name\": \"Willow Creek\",\n", + " \"description\": \"Anya's mother\",\n", + " \"name\": \"Elise\",\n", + " \"end_place_name\": null\n", + " },\n", + " {\n", + " \"start_place_name\": \"Willow Creek\",\n", + " \"name\": \"Edward\",\n", + " \"end_place_name\": null,\n", + " \"description\": \"Anya's father\"\n", + " },\n", + " {\n", + " \"name\": \"Samuel\",\n", + " \"end_place_name\": null,\n", + " \"start_place_name\": \"Willow Creek\",\n", + " \"description\": \"Anya's best friend\"\n", + " },\n", + " {\n", + " \"name\": \"tall, lanky boy\",\n", + " \"description\": \"a boy who warned Anya about the monster\",\n", + " \"start_place_name\": \"Willow Creek\",\n", + " \"end_place_name\": null\n", + " }\n", + " ]\n", + " }\n", + "}\n" + ] + } + ], + "source": [ + "print(json.dumps(type(fc).to_dict(fc), indent=4))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "4m8FakjCIKmI" + }, + "source": [ + "## Conclusion\n", + "\n", + "While the API can handle structured data extraction problems with pure text input and text output, using Function Calling is likely more reliable since it lets you define a strict schema, and eliminates a potentially error-prone parsing step." 
+ ] + } + ], + "metadata": { + "colab": { + "name": "structured_data_extraction.ipynb", + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "name": "python3" + } + }, + "nbformat": 4, + "nbformat_minor": 0 +} From 22cda7bb3bb329c9e2a0c3db49708e8b827b8c9f Mon Sep 17 00:00:00 2001 From: Mark Daoust Date: Mon, 4 Mar 2024 18:17:45 -0800 Subject: [PATCH 2/4] lint --- .../structured_data_extraction.ipynb | 1773 +++++++++-------- 1 file changed, 903 insertions(+), 870 deletions(-) diff --git a/site/en/tutorials/structured_data_extraction.ipynb b/site/en/tutorials/structured_data_extraction.ipynb index d88f4a01d..ad101fed5 100644 --- a/site/en/tutorials/structured_data_extraction.ipynb +++ b/site/en/tutorials/structured_data_extraction.ipynb @@ -1,877 +1,910 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": { - "id": "NtX45QCEdPaP" - }, - "source": [ - "# Structured data extraction using function calling" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "2tO4fP7FFg2V" - }, - "source": [ - "\n", - " \n", - " \n", - " \n", - "
\n", - " View on Google AI\n", - " \n", - " Run in Google Colab\n", - " \n", - " View source on GitHub\n", - "
" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "8Szkddw5NScW" - }, - "source": [ - "In this tutorial you'll work through a small structured data extraction" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "bvrwRlNPdYDr" - }, - "source": [ - "## Setup" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "QyW6x11UQHnx" - }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\u001b[?25l \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m0.0/137.4 kB\u001b[0m \u001b[31m?\u001b[0m eta \u001b[36m-:--:--\u001b[0m\r\u001b[2K \u001b[90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━\u001b[0m \u001b[32m137.4/137.4 kB\u001b[0m \u001b[31m4.2 MB/s\u001b[0m eta \u001b[36m0:00:00\u001b[0m\n", - "\u001b[?25h" - ] - } - ], - "source": [ - "!pip install -U -q google-generativeai" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "TS9l5igubpHO" - }, - "outputs": [], - "source": [ - "import pathlib\n", - "import textwrap\n", - "\n", - "import google.generativeai as genai\n", - "import google.ai.generativelanguage as glm\n", - "\n", - "\n", - "from IPython.display import display\n", - "from IPython.display import Markdown\n", - "\n", - "from google.api_core import retry\n", - "\n", - "def to_markdown(text):\n", - " text = text.replace('•', ' *')\n", - " return Markdown(textwrap.indent(text, '> ', predicate=lambda _: True))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "VmSlTHXxb5pV" - }, - "source": [ - "Once you have the API key, pass it to the SDK. 
You can do this in two ways:\n", - "\n", - "* Put the key in the `GOOGLE_API_KEY` environment variable (the SDK will automatically pick it up from there).\n", - "* Pass the key to `genai.configure(api_key=...)`\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "ab9ASynfcIZn" - }, - "outputs": [], - "source": [ - "try:\n", - " # Used to securely store your API key\n", - " from google.colab import userdata\n", - "\n", - " # Or use `os.getenv('API_KEY')` to fetch an environment variable.\n", - " GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')\n", - "except ImportError:\n", - " import os\n", - " GOOGLE_API_KEY = os.environ['GOOGLE_API_KEY']\n", - "\n", - "genai.configure(api_key=GOOGLE_API_KEY)\n", - "\n", - "genai.configure(\n", - " api_key=GOOGLE_API_KEY,\n", - " client_options={'api_endpoint':'autopush-generativelanguage.sandbox.googleapis.com'})" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "K6SdtoJCL4pL" - }, - "source": [ - "## The example task\n", - "\n", - "For this tutorial you'll extract entities from natural language stories. As an\n", - " example, below is a story written by Gemini." - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "0THz95wOL4pL" - }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n" - ] - } - ], - "source": [ - "new_story = False\n", - "\n", - "if new_story:\n", - " model = genai.GenerativeModel(model_name='gemini-1.0-pro')\n", - "\n", - " response = model.generate_content(\"\"\"\n", - " Write a long story about a girl with magic backpack, her family, and at\n", - " least one other charater. Make sure everyone has names. 
Don't forget to\n", - " describe the contents of the backpack, and where everyone and everything\n", - " starts and ends up.\"\"\", request_options={'retry': retry.Retry()})\n", - " story = response.text\n", - " print(response.candidates[0].citation_metadata)\n", - "else:\n", - " story = \"\"\"In the quaint town of Willow Creek, nestled amidst rolling hills and whispering willows, resided a young girl named Anya. As she stepped out of the creaky wooden door of her modest cottage, her heart skipped a beat with excitement and anticipation. Today was her first day of school, and she couldn't wait to show off her prized possession - a magical backpack.\n", - "\n", - "Handed down to her from her grandmother, the backpack was no ordinary satchel. Its soft, emerald-green fabric shimmered with an ethereal glow, and its leather straps held secrets that only Anya knew. Within its capacious interior lay an enchanted world, filled with wonders that would ignite her imagination and change her life forever.\n", - "\n", - "Anya's parents, kind-hearted Elise and wise-bearded Edward, bid her farewell with warm embraces. \"Remember, my dear,\" whispered her mother, \"use your magic wisely and for good.\" Her father added, \"Always seek knowledge, and let the backpack be your trusted companion.\"\n", - "\n", - "With a skip in her step, Anya set off towards the town's only schoolhouse. On her way, she passed her best friend, Samuel, a curious and adventurous boy with a mischievous grin. \"Hey, Anya,\" he called out. \"Can I see your backpack?\"\n", - "\n", - "Anya hesitated for a moment before unzipping the flap and revealing its contents. Samuel's eyes widened in amazement as he peered inside. There, nestled amidst pencils and notebooks, were a shimmering sword, a book of ancient spells, a tiny compass that always pointed north, and a magical key that could open any lock.\n", - "\n", - "Together, they marveled at the backpack's wonders, promising to keep its secrets safe. 
As they approached the schoolhouse, Anya noticed a group of older children huddled together, their faces etched with fear. Curiosity getting the better of her, she cautiously approached.\n", - "\n", - "\"What's wrong?\" she asked.\n", - "\n", - "A tall, lanky boy stepped forward. \"There's a monster in the forest,\" he stammered. \"It's been terrorizing the town, attacking animals and even people.\"\n", - "\n", - "Anya's heart sank. The town of Willow Creek was small and peaceful, and the thought of a monster brought a shiver down her spine. She knew she had to do something to protect her family and friends.\n", - "\n", - "Without a moment's hesitation, Anya opened her backpack and retrieved the shimmering sword. With a determined gleam in her eye, she turned to her terrified peers. \"Don't worry,\" she said, her voice steady. \"I'll take care of it.\"\n", - "\n", - "With Samuel close behind her, Anya ventured into the shadowy depths of the forest. The trees seemed to whisper secrets as she passed, and the undergrowth rustled with unseen creatures. As they walked deeper into the forest, the air grew heavy and the ground beneath their feet trembled.\n", - "\n", - "Suddenly, they came to a clearing, and there before their eyes was the monster - a massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease. The creature roared, a thunderous sound that shook the forest to its core.\n", - "\n", - "Fear surged through Anya, but she refused to let it consume her. She drew the sword from its sheath and charged towards the monster. The blade shimmered in the sunlight, and as it struck the beast's hide, a blinding light erupted, enveloping everything in its radiance.\n", - "\n", - "When the light faded, the monster was gone, and in its place was a pile of shattered crystals. 
Anya had defeated the creature with the magic of her backpack, proving that even the smallest of objects could hold the greatest of powers.\n", - "\n", - "As she and Samuel returned to the town, they were greeted as heroes. The people of Willow Creek rejoiced, and the legend of Anya, the girl with the magic backpack, was passed down through generations. And so, Anya continued her adventures, using the backpack's wonders to make the world a better place, one magical step at a time.\"\"\"" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "yMnxJqubg759" - }, - "outputs": [ - { - "data": { - "text/markdown": "> In the quaint town of Willow Creek, nestled amidst rolling hills and whispering willows, resided a young girl named Anya. As she stepped out of the creaky wooden door of her modest cottage, her heart skipped a beat with excitement and anticipation. Today was her first day of school, and she couldn't wait to show off her prized possession - a magical backpack.\n> \n> Handed down to her from her grandmother, the backpack was no ordinary satchel. Its soft, emerald-green fabric shimmered with an ethereal glow, and its leather straps held secrets that only Anya knew. Within its capacious interior lay an enchanted world, filled with wonders that would ignite her imagination and change her life forever.\n> \n> Anya's parents, kind-hearted Elise and wise-bearded Edward, bid her farewell with warm embraces. \"Remember, my dear,\" whispered her mother, \"use your magic wisely and for good.\" Her father added, \"Always seek knowledge, and let the backpack be your trusted companion.\"\n> \n> With a skip in her step, Anya set off towards the town's only schoolhouse. On her way, she passed her best friend, Samuel, a curious and adventurous boy with a mischievous grin. \"Hey, Anya,\" he called out. \"Can I see your backpack?\"\n> \n> Anya hesitated for a moment before unzipping the flap and revealing its contents. 
Samuel's eyes widened in amazement as he peered inside. There, nestled amidst pencils and notebooks, were a shimmering sword, a book of ancient spells, a tiny compass that always pointed north, and a magical key that could open any lock.\n> \n> Together, they marveled at the backpack's wonders, promising to keep its secrets safe. As they approached the schoolhouse, Anya noticed a group of older children huddled together, their faces etched with fear. Curiosity getting the better of her, she cautiously approached.\n> \n> \"What's wrong?\" she asked.\n> \n> A tall, lanky boy stepped forward. \"There's a monster in the forest,\" he stammered. \"It's been terrorizing the town, attacking animals and even people.\"\n> \n> Anya's heart sank. The town of Willow Creek was small and peaceful, and the thought of a monster brought a shiver down her spine. She knew she had to do something to protect her family and friends.\n> \n> Without a moment's hesitation, Anya opened her backpack and retrieved the shimmering sword. With a determined gleam in her eye, she turned to her terrified peers. \"Don't worry,\" she said, her voice steady. \"I'll take care of it.\"\n> \n> With Samuel close behind her, Anya ventured into the shadowy depths of the forest. The trees seemed to whisper secrets as she passed, and the undergrowth rustled with unseen creatures. As they walked deeper into the forest, the air grew heavy and the ground beneath their feet trembled.\n> \n> Suddenly, they came to a clearing, and there before their eyes was the monster - a massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease. The creature roared, a thunderous sound that shook the forest to its core.\n> \n> Fear surged through Anya, but she refused to let it consume her. She drew the sword from its sheath and charged towards the monster. 
The blade shimmered in the sunlight, and as it struck the beast's hide, a blinding light erupted, enveloping everything in its radiance.\n> \n> When the light faded, the monster was gone, and in its place was a pile of shattered crystals. Anya had defeated the creature with the magic of her backpack, proving that even the smallest of objects could hold the greatest of powers.\n> \n> As she and Samuel returned to the town, they were greeted as heroes. The people of Willow Creek rejoiced, and the legend of Anya, the girl with the magic backpack, was passed down through generations. And so, Anya continued her adventures, using the backpack's wonders to make the world a better place, one magical step at a time.", - "text/plain": [ - "" - ] - }, - "execution_count": 10, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "to_markdown(story)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "zldoIzn-MuLE" - }, - "source": [ - "## Using Natural language\n", - "\n", - "Large language models are a powerfuls multitask tools. Often you can just ask Gemini for what you want, and it will do okay. \n", - "\n", - "The Gemini API doesn't have a JSON mode, so there are a few things to watch for when generating data structures this way:\n", - "\n", - "- Sometimes parsing fails.\n", - "- The schema can't be strictly enforced.\n", - "\n", - "We'll solve those problems in the next section. First, try a simple natural language prompt with the schema written out as text. 
This has not been optimized:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "eStTMD6VL4pM" - }, - "outputs": [], - "source": [ - "model = model = model = genai.GenerativeModel(\n", - " model_name='gemini-1.0-pro')\n", - "\n", - "response = model.generate_content(textwrap.dedent(\"\"\"\\\n", - " Please return JSON describing the the people, places, things and relationships from this story using the following schema:\n", - "\n", - " {\"people\": list[PERSON], \"places\":list[PLACE], \"things\":list[THING], \"relationships\": list[RELATIONSHIP]}\n", - "\n", - " PERSON = {\"name\": str, \"description\": str, \"start_place_name\": str, \"end_place_name\": str}\n", - " PLACE = {\"name\": str, \"description\": str}\n", - " THING = {\"name\": str, \"description\": str, \"start_place_name\": str, \"end_place_name\": str}\n", - " RELATIONSHIP = {\"person_1_name\": str, \"person_2_name\": str, \"relationship\": str}\n", - "\n", - " All fields are required.\n", - "\n", - " Important: Only return a single piece of valid JSON text.\n", - "\n", - " Here is the story:\n", - "\n", - " \"\"\") + story)\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "B0b5zHI3uEBm" - }, - "outputs": [ - { - "data": { - "application/vnd.google.colaboratory.intrinsic+json": { - "type": "string" - }, - "text/plain": [ - "'{\"people\": [{\"name\": \"Anya\", \"description\": \"A young girl who possesses a magical backpack.\", \"start_place_name\": \"Willow Creek\", \"end_place_name\": null}, {\"name\": \"Elise\", \"description\": \"Anya\\'s kind-hearted mother.\", \"start_place_name\": \"Willow Creek\", \"end_place_name\": null}, {\"name\": \"Edward\", \"description\": \"Anya\\'s wise-bearded father.\", \"start_place_name\": \"Willow Creek\", \"end_place_name\": null}, {\"name\": \"Samuel\", \"description\": \"Anya\\'s curious and adventurous best friend.\", \"start_place_name\": \"Willow Creek\", \"end_place_name\": 
null}], \"places\": [{\"name\": \"Willow Creek\", \"description\": \"A quaint town nestled amidst rolling hills and whispering willows.\"}, {\"name\": \"The forest\", \"description\": \"A shadowy place with whispering trees and unseen creatures.\"}], \"things\": [{\"name\": \"Magical backpack\", \"description\": \"A magical backpack that holds an enchanted world filled with wonders.\", \"start_place_name\": \"Anya\\'s grandmother\\'s house\", \"end_place_name\": \"Willow Creek\"}, {\"name\": \"Shimmering sword\", \"description\": \"A sword that Anya uses to defeat the monster.\", \"start_place_name\": \"Anya\\'s backpack\", \"end_place_name\": null}, {\"name\": \"Book of ancient spells\", \"description\": \"A book that contains ancient spells.\", \"start_place_name\": \"Anya\\'s backpack\", \"end_place_name\": null}, {\"name\": \"Tiny compass\", \"description\": \"A compass that always points north.\", \"start_place_name\": \"Anya\\'s backpack\", \"end_place_name\": null}, {\"name\": \"Magical key\", \"description\": \"A key that can open any lock.\", \"start_place_name\": \"Anya\\'s backpack\", \"end_place_name\": null}, {\"name\": \"Monster\", \"description\": \"A massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease.\", \"start_place_name\": \"The forest\", \"end_place_name\": null}], \"relationships\": [{\"person_1_name\": \"Anya\", \"person_2_name\": \"Elise\", \"relationship\": \"Mother-daughter\"}, {\"person_1_name\": \"Anya\", \"person_2_name\": \"Edward\", \"relationship\": \"Father-daughter\"}, {\"person_1_name\": \"Anya\", \"person_2_name\": \"Samuel\", \"relationship\": \"Best friends\"}]}'" - ] - }, - "execution_count": 47, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "response.text" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "ScEdqKq1lhmQ" - }, - "source": [ - "That returned a json string. 
Try parsing it:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "xSdj50czL4pM" - }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "{\n", - " \"people\": [\n", - " {\n", - " \"name\": \"Anya\",\n", - " \"description\": \"A young girl who possesses a magical backpack.\",\n", - " \"start_place_name\": \"Willow Creek\",\n", - " \"end_place_name\": null\n", - " },\n", - " {\n", - " \"name\": \"Elise\",\n", - " \"description\": \"Anya's kind-hearted mother.\",\n", - " \"start_place_name\": \"Willow Creek\",\n", - " \"end_place_name\": null\n", - " },\n", - " {\n", - " \"name\": \"Edward\",\n", - " \"description\": \"Anya's wise-bearded father.\",\n", - " \"start_place_name\": \"Willow Creek\",\n", - " \"end_place_name\": null\n", - " },\n", - " {\n", - " \"name\": \"Samuel\",\n", - " \"description\": \"Anya's curious and adventurous best friend.\",\n", - " \"start_place_name\": \"Willow Creek\",\n", - " \"end_place_name\": null\n", - " }\n", - " ],\n", - " \"places\": [\n", - " {\n", - " \"name\": \"Willow Creek\",\n", - " \"description\": \"A quaint town nestled amidst rolling hills and whispering willows.\"\n", - " },\n", - " {\n", - " \"name\": \"The forest\",\n", - " \"description\": \"A shadowy place with whispering trees and unseen creatures.\"\n", - " }\n", - " ],\n", - " \"things\": [\n", - " {\n", - " \"name\": \"Magical backpack\",\n", - " \"description\": \"A magical backpack that holds an enchanted world filled with wonders.\",\n", - " \"start_place_name\": \"Anya's grandmother's house\",\n", - " \"end_place_name\": \"Willow Creek\"\n", - " },\n", - " {\n", - " \"name\": \"Shimmering sword\",\n", - " \"description\": \"A sword that Anya uses to defeat the monster.\",\n", - " \"start_place_name\": \"Anya's backpack\",\n", - " \"end_place_name\": null\n", - " },\n", - " {\n", - " \"name\": \"Book of ancient spells\",\n", - " \"description\": \"A book that contains ancient spells.\",\n", - 
" \"start_place_name\": \"Anya's backpack\",\n", - " \"end_place_name\": null\n", - " },\n", - " {\n", - " \"name\": \"Tiny compass\",\n", - " \"description\": \"A compass that always points north.\",\n", - " \"start_place_name\": \"Anya's backpack\",\n", - " \"end_place_name\": null\n", - " },\n", - " {\n", - " \"name\": \"Magical key\",\n", - " \"description\": \"A key that can open any lock.\",\n", - " \"start_place_name\": \"Anya's backpack\",\n", - " \"end_place_name\": null\n", - " },\n", - " {\n", - " \"name\": \"Monster\",\n", - " \"description\": \"A massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease.\",\n", - " \"start_place_name\": \"The forest\",\n", - " \"end_place_name\": null\n", - " }\n", - " ],\n", - " \"relationships\": [\n", - " {\n", - " \"person_1_name\": \"Anya\",\n", - " \"person_2_name\": \"Elise\",\n", - " \"relationship\": \"Mother-daughter\"\n", - " },\n", - " {\n", - " \"person_1_name\": \"Anya\",\n", - " \"person_2_name\": \"Edward\",\n", - " \"relationship\": \"Father-daughter\"\n", - " },\n", - " {\n", - " \"person_1_name\": \"Anya\",\n", - " \"person_2_name\": \"Samuel\",\n", - " \"relationship\": \"Best friends\"\n", - " }\n", - " ]\n", - "}\n" - ] - } - ], - "source": [ - "import json\n", - "\n", - "json_text = response.text.strip('`\\r\\n ').removeprefix('json')\n", - "print(json.dumps(json.loads(json_text), indent=4))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "TgC_wkHPmkHn" - }, - "source": [ - "That's relatively simple and often works, but you can porentially make this more strict/robust by defining the schema using the API's Function Calling feature." 
- ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "CxMC28LAOfUf" - }, - "source": [ - "## Use Function Calling" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "x-V6PJn83Kh9" - }, - "source": [ - "If you haven't gone through the [Function Calling basics](https://ai.google.dev/tutorials/function_calling_python_quickstart) tutorial yet, make sure you do that first.\n", - "\n", - "With Function Calling your function and its parameters are described to the API as a `glm.FunctionDeclaration`. In basic cases the SDK can build the `FunctionDeclaration` from the function and its annotations. The SDK doesn't currently handle the description of nested `OBJECT` (`dict`) parameters. So you'll need to define them explicitly, for now." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "k83LZ5MCBfTJ" - }, - "source": [ - "### Define the schema\n", - "\n", - "Start by defining `person` as an object with strting-fields `name`, `description`, `start_place_name`, `end_place_name`." 
- ] - }, - { - "cell_type": "code", - "execution_count": 189, - "metadata": { - "id": "p2efqZA7BAzp" - }, - "outputs": [], - "source": [ - "person = glm.Schema(\n", - " type = glm.Type.OBJECT,\n", - " properties = {\n", - " 'name': glm.Schema(type=glm.Type.STRING),\n", - " 'description': glm.Schema(type=glm.Type.STRING),\n", - " 'start_place_name': glm.Schema(type=glm.Type.STRING),\n", - " 'end_place_name': glm.Schema(type=glm.Type.STRING)\n", - " },\n", - " required=['name', 'description', 'start_place_name', 'end_place_name']\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "HGV1wxx6BCJl" - }, - "source": [ - "Then define people as an `ARRAY` of `person` objects:" - ] - }, - { - "cell_type": "code", - "execution_count": 190, - "metadata": { - "id": "Ur7kzpiA_Dqw" - }, - "outputs": [], - "source": [ - "people = glm.Schema(\n", - " type=glm.Type.ARRAY,\n", - " items=person\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "N6uD63sBBJ3i" - }, - "source": [ - "Then do the same for each of the entities you're trying to extract:" - ] - }, - { - "cell_type": "code", - "execution_count": 191, - "metadata": { - "id": "7wd3jTqj_bVi" - }, - "outputs": [], - "source": [ - "place = glm.Schema(\n", - " type = glm.Type.OBJECT,\n", - " properties = {\n", - " 'name': glm.Schema(type=glm.Type.STRING),\n", - " 'description': glm.Schema(type=glm.Type.STRING),\n", - " }\n", - ")\n", - "\n", - "places = glm.Schema(\n", - " type=glm.Type.ARRAY,\n", - " items=place\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": 192, - "metadata": { - "id": "45cLwvCd_vg_" - }, - "outputs": [], - "source": [ - "thing = glm.Schema(\n", - " type = glm.Type.OBJECT,\n", - " properties = {\n", - " 'name': glm.Schema(type=glm.Type.STRING),\n", - " 'description': glm.Schema(type=glm.Type.STRING),\n", - " }\n", - ")\n", - "\n", - "things = glm.Schema(\n", - " type=glm.Type.ARRAY,\n", - " items=thing\n", - ")" - ] - }, - { - "cell_type": "code", 
- "execution_count": 193, - "metadata": { - "id": "8DdVSZJfADDY" - }, - "outputs": [], - "source": [ - "relationship = glm.Schema(\n", - " type = glm.Type.OBJECT,\n", - " properties = {\n", - " 'person_1_name': glm.Schema(type=glm.Type.STRING),\n", - " 'person_2_name': glm.Schema(type=glm.Type.STRING),\n", - " 'relationship': glm.Schema(type=glm.Type.STRING),\n", - " }\n", - ")\n", - "\n", - "relationships = glm.Schema(\n", - " type=glm.Type.ARRAY,\n", - " items=relationship\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "mJwqEUqjBToJ" - }, - "source": [ - "Now build the `FunctionDeclaration`:" - ] - }, - { - "cell_type": "code", - "execution_count": 194, - "metadata": { - "id": "YQkiVCtsPbUy" - }, - "outputs": [], - "source": [ - "add_to_database = glm.FunctionDeclaration(\n", - " name=\"add_to_database\",\n", - " description=textwrap.dedent(\"\"\"\\\n", - " Adds entities to the database.\n", - " \"\"\"),\n", - " parameters=glm.Schema(\n", - " type=glm.Type.OBJECT,\n", - " properties = {\n", - " 'people': people,\n", - " 'places': places,\n", - " 'things': things,\n", - " 'relationships': relationships\n", - " }\n", - " )\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "e1_QSwD9Bmy5" - }, - "source": [ - "### Call the API\n", - "\n", - "Like you saw in [Function Calling basics](https://ai.google.dev/tutorials/function_calling_python_quickstart) now you can pass this `FunctionDeclaration` to the `tools` argument of the `genai.GenerativeModel` constructor (the constructor would also accept an equivalent JSON representation of the function declaration):" - ] - }, - { - "cell_type": "code", - "execution_count": 195, - "metadata": { - "id": "5PGAPRDJP4Qx" - }, - "outputs": [], - "source": [ - "model = model = genai.GenerativeModel(\n", - " model_name='gemini-1.0-pro',\n", - " tools = [add_to_database])" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "1uTYW5cVCDST" - }, - "source": [ - "Each time you 
call the API, the SDK will send the tools along with your prompt, and the model should call that function we defined:" - ] - }, - { - "cell_type": "code", - "execution_count": 196, - "metadata": { - "id": "bAPA7fNtSUwN" - }, - "outputs": [], - "source": [ - "result = model.generate_content(f\"\"\"\n", - "Please add the people, places, things, and relationships from this story to the database:\n", - "\n", - "{story}\n", - "\"\"\")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "oSG7r6IBCL7S" - }, - "source": [ - "Now there is no text to parse. The result _is_ a datastructure." - ] - }, - { - "cell_type": "code", - "execution_count": 197, - "metadata": { - "id": "07n3wXzFOZ4x" - }, - "outputs": [ - { - "data": { - "text/plain": [ - "False" - ] - }, - "execution_count": 197, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "'text' in result.candidates[0].content.parts[0]" - ] - }, - { - "cell_type": "code", - "execution_count": 198, - "metadata": { - "id": "i-8hm1HPI5Ce" - }, - "outputs": [ - { - "data": { - "text/plain": [ - "True" - ] - }, - "execution_count": 198, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "'function_call' in result.candidates[0].content.parts[0]" - ] - }, - { - "cell_type": "code", - "execution_count": 199, - "metadata": { - "id": "n8BTs6ogDEkq" - }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n" - ] - } - ], - "source": [ - "fc = result.candidates[0].content.parts[0].function_call\n", - "print(type(fc))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "kILNHmG2IED3" - }, - "source": [ - "The `glm.FunctionCall` class is based on Google Protocol Buffers, convert it to a more familiar JSON compatible object:" - ] - }, - { - "cell_type": "code", - "execution_count": 200, - "metadata": { - "id": "5GKHtT4-F3qa" - }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "{\n", - " \"name\": 
\"add_to_database\",\n", - " \"args\": {\n", - " \"relationships\": [\n", - " {\n", - " \"relationship\": \"mother-daughter\",\n", - " \"person_2_name\": \"Elise\",\n", - " \"person_1_name\": \"Anya\"\n", - " },\n", - " {\n", - " \"person_1_name\": \"Anya\",\n", - " \"relationship\": \"father-daughter\",\n", - " \"person_2_name\": \"Edward\"\n", - " },\n", - " {\n", - " \"relationship\": \"best friends\",\n", - " \"person_1_name\": \"Anya\",\n", - " \"person_2_name\": \"Samuel\"\n", - " }\n", - " ],\n", - " \"places\": [\n", - " {\n", - " \"name\": \"Willow Creek\",\n", - " \"description\": \"a quaint town nestled amidst rolling hills and whispering willows\"\n", - " },\n", - " {\n", - " \"name\": \"forest\",\n", - " \"description\": \"a shadowy place with rustling undergrowth\"\n", - " }\n", - " ],\n", - " \"things\": [\n", - " {\n", - " \"description\": \"a backpack with a shimmering emerald-green fabric and leather straps, containing a magical sword, a book of ancient spells, a tiny compass that always points north, and a magical key that could open any lock\",\n", - " \"name\": \"magical backpack\"\n", - " },\n", - " {\n", - " \"description\": \"a weapon that can defeat monsters\",\n", - " \"name\": \"shimmering sword\"\n", - " },\n", - " {\n", - " \"description\": \"a book containing magical spells\",\n", - " \"name\": \"book of ancient spells\"\n", - " },\n", - " {\n", - " \"name\": \"tiny compass\",\n", - " \"description\": \"a compass that always points north\"\n", - " },\n", - " {\n", - " \"name\": \"magical key\",\n", - " \"description\": \"a key that can open any lock\"\n", - " },\n", - " {\n", - " \"name\": \"monster\",\n", - " \"description\": \"a massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease\"\n", - " }\n", - " ],\n", - " \"people\": [\n", - " {\n", - " \"description\": \"a young girl\",\n", - " \"start_place_name\": \"Willow Creek\",\n", - " \"name\": \"Anya\",\n", - " \"end_place_name\": null\n", - " 
},\n", - " {\n", - " \"start_place_name\": \"Willow Creek\",\n", - " \"description\": \"Anya's mother\",\n", - " \"name\": \"Elise\",\n", - " \"end_place_name\": null\n", - " },\n", - " {\n", - " \"start_place_name\": \"Willow Creek\",\n", - " \"name\": \"Edward\",\n", - " \"end_place_name\": null,\n", - " \"description\": \"Anya's father\"\n", - " },\n", - " {\n", - " \"name\": \"Samuel\",\n", - " \"end_place_name\": null,\n", - " \"start_place_name\": \"Willow Creek\",\n", - " \"description\": \"Anya's best friend\"\n", - " },\n", - " {\n", - " \"name\": \"tall, lanky boy\",\n", - " \"description\": \"a boy who warned Anya about the monster\",\n", - " \"start_place_name\": \"Willow Creek\",\n", - " \"end_place_name\": null\n", - " }\n", - " ]\n", - " }\n", - "}\n" - ] - } + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "##### Copyright 2024 Google LLC." + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": {}, + "outputs": [], + "source": [ + "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n", + "# you may not use this file except in compliance with the License.\n", + "# You may obtain a copy of the License at\n", + "#\n", + "# https://www.apache.org/licenses/LICENSE-2.0\n", + "#\n", + "# Unless required by applicable law or agreed to in writing, software\n", + "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", + "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", + "# See the License for the specific language governing permissions and\n", + "# limitations under the License." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "NtX45QCEdPaP" + }, + "source": [ + "# Structured data extraction using function calling" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "2tO4fP7FFg2V" + }, + "source": [ + "\n", + " \n", + " \n", + " \n", + "
\n", + " View on Google AI\n", + " \n", + " Run in Google Colab\n", + " \n", + " View source on GitHub\n", + "
" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "8Szkddw5NScW" + }, + "source": [ + "In this tutorial you'll work through a structured data extraction example, using the Gemini API to extract the lists of characters, relationships, things, and places from a story. " + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "bvrwRlNPdYDr" + }, + "source": [ + "## Setup" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": { + "id": "QyW6x11UQHnx" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[33mWARNING: There was an error checking the latest version of pip.\u001b[0m\u001b[33m\n", + "\u001b[0m" + ] + } + ], + "source": [ + "!pip install -U -q google-generativeai" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": { + "id": "TS9l5igubpHO" + }, + "outputs": [], + "source": [ + "import pathlib\n", + "import textwrap\n", + "\n", + "import google.generativeai as genai\n", + "import google.ai.generativelanguage as glm\n", + "\n", + "\n", + "from IPython.display import display\n", + "from IPython.display import Markdown\n", + "\n", + "from google.api_core import retry\n", + "\n", + "def to_markdown(text):\n", + " text = text.replace('•', ' *')\n", + " return Markdown(textwrap.indent(text, '> ', predicate=lambda _: True))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "VmSlTHXxb5pV" + }, + "source": [ + "Once you have the API key, pass it to the SDK. 
You can do this in two ways:\n", + "\n", + "* Put the key in the `GOOGLE_API_KEY` environment variable (the SDK will automatically pick it up from there).\n", + "* Pass the key to `genai.configure(api_key=...)`\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "ab9ASynfcIZn" + }, + "outputs": [], + "source": [ + "try:\n", + " # Used to securely store your API key\n", + " from google.colab import userdata\n", + "\n", + " # Or use `os.getenv('API_KEY')` to fetch an environment variable.\n", + " GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')\n", + "except ImportError:\n", + " import os\n", + " GOOGLE_API_KEY = os.environ['GOOGLE_API_KEY']\n", + "\n", + "genai.configure(api_key=GOOGLE_API_KEY)\n", + "\n", + "genai.configure(\n", + " api_key=GOOGLE_API_KEY,\n", + " client_options={'api_endpoint':'autopush-generativelanguage.sandbox.googleapis.com'})" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "K6SdtoJCL4pL" + }, + "source": [ + "## The example task\n", + "\n", + "For this tutorial you'll extract entities from natural language stories. As an\n", + " example, below is a story written by Gemini." + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": { + "id": "0THz95wOL4pL" + }, + "outputs": [], + "source": [ + "new_story = False\n", + "\n", + "if new_story:\n", + " model = genai.GenerativeModel(model_name='gemini-1.0-pro')\n", + "\n", + " response = model.generate_content(\"\"\"\n", + " Write a long story about a girl with magic backpack, her family, and at\n", + " least one other charater. Make sure everyone has names. 
Don't forget to\n", + " describe the contents of the backpack, and where everyone and everything\n", + " starts and ends up.\"\"\", request_options={'retry': retry.Retry()})\n", + " story = response.text\n", + " print(response.candidates[0].citation_metadata)\n", + "else:\n", + " story = \"\"\"In the quaint town of Willow Creek, nestled amidst rolling hills and whispering willows, resided a young girl named Anya. As she stepped out of the creaky wooden door of her modest cottage, her heart skipped a beat with excitement and anticipation. Today was her first day of school, and she couldn't wait to show off her prized possession - a magical backpack.\\n\\nHanded down to her from her grandmother, the backpack was no ordinary satchel. Its soft, emerald-green fabric shimmered with an ethereal glow, and its leather straps held secrets that only Anya knew. Within its capacious interior lay an enchanted world, filled with wonders that would ignite her imagination and change her life forever.\\n\\nAnya's parents, kind-hearted Elise and wise-bearded Edward, bid her farewell with warm embraces. \"Remember, my dear,\" whispered her mother, \"use your magic wisely and for good.\" Her father added, \"Always seek knowledge, and let the backpack be your trusted companion.\"\\n\\nWith a skip in her step, Anya set off towards the town's only schoolhouse. On her way, she passed her best friend, Samuel, a curious and adventurous boy with a mischievous grin. \"Hey, Anya,\" he called out. \"Can I see your backpack?\"\\n\\nAnya hesitated for a moment before unzipping the flap and revealing its contents. Samuel's eyes widened in amazement as he peered inside. There, nestled amidst pencils and notebooks, were a shimmering sword, a book of ancient spells, a tiny compass that always pointed north, and a magical key that could open any lock.\\n\\nTogether, they marveled at the backpack's wonders, promising to keep its secrets safe. 
As they approached the schoolhouse, Anya noticed a group of older children huddled together, their faces etched with fear. Curiosity getting the better of her, she cautiously approached.\\n\\n\"What's wrong?\" she asked.\\n\\nA tall, lanky boy stepped forward. \"There's a monster in the forest,\" he stammered. \"It's been terrorizing the town, attacking animals and even people.\"\\n\\nAnya's heart sank. The town of Willow Creek was small and peaceful, and the thought of a monster brought a shiver down her spine. She knew she had to do something to protect her family and friends.\\n\\nWithout a moment's hesitation, Anya opened her backpack and retrieved the shimmering sword. With a determined gleam in her eye, she turned to her terrified peers. \"Don't worry,\" she said, her voice steady. \"I'll take care of it.\"\\n\\nWith Samuel close behind her, Anya ventured into the shadowy depths of the forest. The trees seemed to whisper secrets as she passed, and the undergrowth rustled with unseen creatures. As they walked deeper into the forest, the air grew heavy and the ground beneath their feet trembled.\\n\\nSuddenly, they came to a clearing, and there before their eyes was the monster - a massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease. The creature roared, a thunderous sound that shook the forest to its core.\\n\\nFear surged through Anya, but she refused to let it consume her. She drew the sword from its sheath and charged towards the monster. The blade shimmered in the sunlight, and as it struck the beast's hide, a blinding light erupted, enveloping everything in its radiance.\\n\\nWhen the light faded, the monster was gone, and in its place was a pile of shattered crystals. Anya had defeated the creature with the magic of her backpack, proving that even the smallest of objects could hold the greatest of powers.\\n\\nAs she and Samuel returned to the town, they were greeted as heroes. 
The people of Willow Creek rejoiced, and the legend of Anya, the girl with the magic backpack, was passed down through generations. And so, Anya continued her adventures, using the backpack's wonders to make the world a better place, one magical step at a time.\"\"\"" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": { + "id": "yMnxJqubg759" + }, + "outputs": [ + { + "data": { + "text/markdown": [ + "> In the quaint town of Willow Creek, nestled amidst rolling hills and whispering willows, resided a young girl named Anya. As she stepped out of the creaky wooden door of her modest cottage, her heart skipped a beat with excitement and anticipation. Today was her first day of school, and she couldn't wait to show off her prized possession - a magical backpack.\n", + "> \n", + "> Handed down to her from her grandmother, the backpack was no ordinary satchel. Its soft, emerald-green fabric shimmered with an ethereal glow, and its leather straps held secrets that only Anya knew. Within its capacious interior lay an enchanted world, filled with wonders that would ignite her imagination and change her life forever.\n", + "> \n", + "> Anya's parents, kind-hearted Elise and wise-bearded Edward, bid her farewell with warm embraces. \"Remember, my dear,\" whispered her mother, \"use your magic wisely and for good.\" Her father added, \"Always seek knowledge, and let the backpack be your trusted companion.\"\n", + "> \n", + "> With a skip in her step, Anya set off towards the town's only schoolhouse. On her way, she passed her best friend, Samuel, a curious and adventurous boy with a mischievous grin. \"Hey, Anya,\" he called out. \"Can I see your backpack?\"\n", + "> \n", + "> Anya hesitated for a moment before unzipping the flap and revealing its contents. Samuel's eyes widened in amazement as he peered inside. 
There, nestled amidst pencils and notebooks, were a shimmering sword, a book of ancient spells, a tiny compass that always pointed north, and a magical key that could open any lock.\n", + "> \n", + "> Together, they marveled at the backpack's wonders, promising to keep its secrets safe. As they approached the schoolhouse, Anya noticed a group of older children huddled together, their faces etched with fear. Curiosity getting the better of her, she cautiously approached.\n", + "> \n", + "> \"What's wrong?\" she asked.\n", + "> \n", + "> A tall, lanky boy stepped forward. \"There's a monster in the forest,\" he stammered. \"It's been terrorizing the town, attacking animals and even people.\"\n", + "> \n", + "> Anya's heart sank. The town of Willow Creek was small and peaceful, and the thought of a monster brought a shiver down her spine. She knew she had to do something to protect her family and friends.\n", + "> \n", + "> Without a moment's hesitation, Anya opened her backpack and retrieved the shimmering sword. With a determined gleam in her eye, she turned to her terrified peers. \"Don't worry,\" she said, her voice steady. \"I'll take care of it.\"\n", + "> \n", + "> With Samuel close behind her, Anya ventured into the shadowy depths of the forest. The trees seemed to whisper secrets as she passed, and the undergrowth rustled with unseen creatures. As they walked deeper into the forest, the air grew heavy and the ground beneath their feet trembled.\n", + "> \n", + "> Suddenly, they came to a clearing, and there before their eyes was the monster - a massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease. The creature roared, a thunderous sound that shook the forest to its core.\n", + "> \n", + "> Fear surged through Anya, but she refused to let it consume her. She drew the sword from its sheath and charged towards the monster. 
The blade shimmered in the sunlight, and as it struck the beast's hide, a blinding light erupted, enveloping everything in its radiance.\n", + "> \n", + "> When the light faded, the monster was gone, and in its place was a pile of shattered crystals. Anya had defeated the creature with the magic of her backpack, proving that even the smallest of objects could hold the greatest of powers.\n", + "> \n", + "> As she and Samuel returned to the town, they were greeted as heroes. The people of Willow Creek rejoiced, and the legend of Anya, the girl with the magic backpack, was passed down through generations. And so, Anya continued her adventures, using the backpack's wonders to make the world a better place, one magical step at a time." ], - "source": [ - "print(json.dumps(type(fc).to_dict(fc), indent=4))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "4m8FakjCIKmI" - }, - "source": [ - "## Conclusion\n", - "\n", - "While the API can handle structured data extraction problems with pure text input and text output, using Function Calling is likely more reliable since it lets you define a strict schema, and eliminates a potentially error-prone parsing step." + "text/plain": [ + "" ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "to_markdown(story)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "zldoIzn-MuLE" + }, + "source": [ + "## Using Natural language\n", + "\n", + "Large language models are powerful multitask tools. Often you can just ask Gemini for what you want, and it will do okay. \n", + "\n", + "The Gemini API doesn't have a JSON mode, so there are a few things to watch for when generating data structures this way:\n", + "\n", + "- Sometimes parsing fails.\n", + "- The schema can't be strictly enforced.\n", + "\n", + "You'll solve those problems in the next section. First, try a simple natural language prompt with the schema written out as text.
This has not been optimized:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "eStTMD6VL4pM" + }, + "outputs": [], + "source": [ + "model = genai.GenerativeModel(\n", + " model_name='gemini-1.0-pro')\n", + "\n", + "response = model.generate_content(textwrap.dedent(\"\"\"\\\n", + " Please return JSON describing the people, places, things and relationships from this story using the following schema:\n", + "\n", + " {\"people\": list[PERSON], \"places\":list[PLACE], \"things\":list[THING], \"relationships\": list[RELATIONSHIP]}\n", + "\n", + " PERSON = {\"name\": str, \"description\": str, \"start_place_name\": str, \"end_place_name\": str}\n", + " PLACE = {\"name\": str, \"description\": str}\n", + " THING = {\"name\": str, \"description\": str, \"start_place_name\": str, \"end_place_name\": str}\n", + " RELATIONSHIP = {\"person_1_name\": str, \"person_2_name\": str, \"relationship\": str}\n", + "\n", + " All fields are required.\n", + "\n", + " Important: Only return a single piece of valid JSON text.\n", + "\n", + " Here is the story:\n", + "\n", + " \"\"\") + story)\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "B0b5zHI3uEBm" + }, + "outputs": [ + { + "data": { + "application/vnd.google.colaboratory.intrinsic+json": { + "type": "string" + }, + "text/plain": [ + "'{\"people\": [{\"name\": \"Anya\", \"description\": \"A young girl who possesses a magical backpack.\", \"start_place_name\": \"Willow Creek\", \"end_place_name\": null}, {\"name\": \"Elise\", \"description\": \"Anya\\'s kind-hearted mother.\", \"start_place_name\": \"Willow Creek\", \"end_place_name\": null}, {\"name\": \"Edward\", \"description\": \"Anya\\'s wise-bearded father.\", \"start_place_name\": \"Willow Creek\", \"end_place_name\": null}, {\"name\": \"Samuel\", \"description\": \"Anya\\'s curious and adventurous best friend.\", \"start_place_name\": \"Willow Creek\", \"end_place_name\":
null}], \"places\": [{\"name\": \"Willow Creek\", \"description\": \"A quaint town nestled amidst rolling hills and whispering willows.\"}, {\"name\": \"The forest\", \"description\": \"A shadowy place with whispering trees and unseen creatures.\"}], \"things\": [{\"name\": \"Magical backpack\", \"description\": \"A magical backpack that holds an enchanted world filled with wonders.\", \"start_place_name\": \"Anya\\'s grandmother\\'s house\", \"end_place_name\": \"Willow Creek\"}, {\"name\": \"Shimmering sword\", \"description\": \"A sword that Anya uses to defeat the monster.\", \"start_place_name\": \"Anya\\'s backpack\", \"end_place_name\": null}, {\"name\": \"Book of ancient spells\", \"description\": \"A book that contains ancient spells.\", \"start_place_name\": \"Anya\\'s backpack\", \"end_place_name\": null}, {\"name\": \"Tiny compass\", \"description\": \"A compass that always points north.\", \"start_place_name\": \"Anya\\'s backpack\", \"end_place_name\": null}, {\"name\": \"Magical key\", \"description\": \"A key that can open any lock.\", \"start_place_name\": \"Anya\\'s backpack\", \"end_place_name\": null}, {\"name\": \"Monster\", \"description\": \"A massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease.\", \"start_place_name\": \"The forest\", \"end_place_name\": null}], \"relationships\": [{\"person_1_name\": \"Anya\", \"person_2_name\": \"Elise\", \"relationship\": \"Mother-daughter\"}, {\"person_1_name\": \"Anya\", \"person_2_name\": \"Edward\", \"relationship\": \"Father-daughter\"}, {\"person_1_name\": \"Anya\", \"person_2_name\": \"Samuel\", \"relationship\": \"Best friends\"}]}'" + ] + }, + "execution_count": 47, + "metadata": {}, + "output_type": "execute_result" } - ], - "metadata": { - "colab": { - "name": "structured_data_extraction.ipynb", - "toc_visible": true - }, - "kernelspec": { - "display_name": "Python 3", - "name": "python3" + ], + "source": [ + "response.text" + ] + }, + { + "cell_type": 
"markdown", + "metadata": { + "id": "ScEdqKq1lhmQ" + }, + "source": [ + "That returned a json string. Try parsing it:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "xSdj50czL4pM" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{\n", + " \"people\": [\n", + " {\n", + " \"name\": \"Anya\",\n", + " \"description\": \"A young girl who possesses a magical backpack.\",\n", + " \"start_place_name\": \"Willow Creek\",\n", + " \"end_place_name\": null\n", + " },\n", + " {\n", + " \"name\": \"Elise\",\n", + " \"description\": \"Anya's kind-hearted mother.\",\n", + " \"start_place_name\": \"Willow Creek\",\n", + " \"end_place_name\": null\n", + " },\n", + " {\n", + " \"name\": \"Edward\",\n", + " \"description\": \"Anya's wise-bearded father.\",\n", + " \"start_place_name\": \"Willow Creek\",\n", + " \"end_place_name\": null\n", + " },\n", + " {\n", + " \"name\": \"Samuel\",\n", + " \"description\": \"Anya's curious and adventurous best friend.\",\n", + " \"start_place_name\": \"Willow Creek\",\n", + " \"end_place_name\": null\n", + " }\n", + " ],\n", + " \"places\": [\n", + " {\n", + " \"name\": \"Willow Creek\",\n", + " \"description\": \"A quaint town nestled amidst rolling hills and whispering willows.\"\n", + " },\n", + " {\n", + " \"name\": \"The forest\",\n", + " \"description\": \"A shadowy place with whispering trees and unseen creatures.\"\n", + " }\n", + " ],\n", + " \"things\": [\n", + " {\n", + " \"name\": \"Magical backpack\",\n", + " \"description\": \"A magical backpack that holds an enchanted world filled with wonders.\",\n", + " \"start_place_name\": \"Anya's grandmother's house\",\n", + " \"end_place_name\": \"Willow Creek\"\n", + " },\n", + " {\n", + " \"name\": \"Shimmering sword\",\n", + " \"description\": \"A sword that Anya uses to defeat the monster.\",\n", + " \"start_place_name\": \"Anya's backpack\",\n", + " \"end_place_name\": null\n", + " },\n", + " {\n", + " 
\"name\": \"Book of ancient spells\",\n", + " \"description\": \"A book that contains ancient spells.\",\n", + " \"start_place_name\": \"Anya's backpack\",\n", + " \"end_place_name\": null\n", + " },\n", + " {\n", + " \"name\": \"Tiny compass\",\n", + " \"description\": \"A compass that always points north.\",\n", + " \"start_place_name\": \"Anya's backpack\",\n", + " \"end_place_name\": null\n", + " },\n", + " {\n", + " \"name\": \"Magical key\",\n", + " \"description\": \"A key that can open any lock.\",\n", + " \"start_place_name\": \"Anya's backpack\",\n", + " \"end_place_name\": null\n", + " },\n", + " {\n", + " \"name\": \"Monster\",\n", + " \"description\": \"A massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease.\",\n", + " \"start_place_name\": \"The forest\",\n", + " \"end_place_name\": null\n", + " }\n", + " ],\n", + " \"relationships\": [\n", + " {\n", + " \"person_1_name\": \"Anya\",\n", + " \"person_2_name\": \"Elise\",\n", + " \"relationship\": \"Mother-daughter\"\n", + " },\n", + " {\n", + " \"person_1_name\": \"Anya\",\n", + " \"person_2_name\": \"Edward\",\n", + " \"relationship\": \"Father-daughter\"\n", + " },\n", + " {\n", + " \"person_1_name\": \"Anya\",\n", + " \"person_2_name\": \"Samuel\",\n", + " \"relationship\": \"Best friends\"\n", + " }\n", + " ]\n", + "}\n" + ] } + ], + "source": [ + "import json\n", + "\n", + "json_text = response.text.strip('`\\r\\n ').removeprefix('json')\n", + "print(json.dumps(json.loads(json_text), indent=4))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "TgC_wkHPmkHn" + }, + "source": [ + "That's relatively simple and often works, but you can porentially make this more strict/robust by defining the schema using the API's Function Calling feature." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "CxMC28LAOfUf" + }, + "source": [ + "## Use Function Calling" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "x-V6PJn83Kh9" + }, + "source": [ + "If you haven't gone through the [Function Calling basics](https://ai.google.dev/tutorials/function_calling_python_quickstart) tutorial yet, make sure you do that first.\n", + "\n", + "With Function Calling your function and its parameters are described to the API as a `glm.FunctionDeclaration`. In basic cases the SDK can build the `FunctionDeclaration` from the function and its annotations. The SDK doesn't currently handle the description of nested `OBJECT` (`dict`) parameters, so you'll need to define them explicitly for now." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "k83LZ5MCBfTJ" + }, + "source": [ + "### Define the schema\n", + "\n", + "Start by defining `person` as an object with the string fields `name`, `description`, `start_place_name`, and `end_place_name`."
+ ] + }, + { + "cell_type": "code", + "execution_count": 189, + "metadata": { + "id": "p2efqZA7BAzp" + }, + "outputs": [], + "source": [ + "person = glm.Schema(\n", + " type = glm.Type.OBJECT,\n", + " properties = {\n", + " 'name': glm.Schema(type=glm.Type.STRING),\n", + " 'description': glm.Schema(type=glm.Type.STRING),\n", + " 'start_place_name': glm.Schema(type=glm.Type.STRING),\n", + " 'end_place_name': glm.Schema(type=glm.Type.STRING)\n", + " },\n", + " required=['name', 'description', 'start_place_name', 'end_place_name']\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "HGV1wxx6BCJl" + }, + "source": [ + "Then define people as an `ARRAY` of `person` objects:" + ] + }, + { + "cell_type": "code", + "execution_count": 190, + "metadata": { + "id": "Ur7kzpiA_Dqw" + }, + "outputs": [], + "source": [ + "people = glm.Schema(\n", + " type=glm.Type.ARRAY,\n", + " items=person\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "N6uD63sBBJ3i" + }, + "source": [ + "Then do the same for each of the entities you're trying to extract:" + ] + }, + { + "cell_type": "code", + "execution_count": 191, + "metadata": { + "id": "7wd3jTqj_bVi" + }, + "outputs": [], + "source": [ + "place = glm.Schema(\n", + " type = glm.Type.OBJECT,\n", + " properties = {\n", + " 'name': glm.Schema(type=glm.Type.STRING),\n", + " 'description': glm.Schema(type=glm.Type.STRING),\n", + " }\n", + ")\n", + "\n", + "places = glm.Schema(\n", + " type=glm.Type.ARRAY,\n", + " items=place\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 192, + "metadata": { + "id": "45cLwvCd_vg_" + }, + "outputs": [], + "source": [ + "thing = glm.Schema(\n", + " type = glm.Type.OBJECT,\n", + " properties = {\n", + " 'name': glm.Schema(type=glm.Type.STRING),\n", + " 'description': glm.Schema(type=glm.Type.STRING),\n", + " }\n", + ")\n", + "\n", + "things = glm.Schema(\n", + " type=glm.Type.ARRAY,\n", + " items=thing\n", + ")" + ] + }, + { + "cell_type": "code", 
+ "execution_count": 193, + "metadata": { + "id": "8DdVSZJfADDY" + }, + "outputs": [], + "source": [ + "relationship = glm.Schema(\n", + " type = glm.Type.OBJECT,\n", + " properties = {\n", + " 'person_1_name': glm.Schema(type=glm.Type.STRING),\n", + " 'person_2_name': glm.Schema(type=glm.Type.STRING),\n", + " 'relationship': glm.Schema(type=glm.Type.STRING),\n", + " }\n", + ")\n", + "\n", + "relationships = glm.Schema(\n", + " type=glm.Type.ARRAY,\n", + " items=relationship\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "mJwqEUqjBToJ" + }, + "source": [ + "Now build the `FunctionDeclaration`:" + ] + }, + { + "cell_type": "code", + "execution_count": 194, + "metadata": { + "id": "YQkiVCtsPbUy" + }, + "outputs": [], + "source": [ + "add_to_database = glm.FunctionDeclaration(\n", + " name=\"add_to_database\",\n", + " description=textwrap.dedent(\"\"\"\\\n", + " Adds entities to the database.\n", + " \"\"\"),\n", + " parameters=glm.Schema(\n", + " type=glm.Type.OBJECT,\n", + " properties = {\n", + " 'people': people,\n", + " 'places': places,\n", + " 'things': things,\n", + " 'relationships': relationships\n", + " }\n", + " )\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "e1_QSwD9Bmy5" + }, + "source": [ + "### Call the API\n", + "\n", + "As you saw in [Function Calling basics](https://ai.google.dev/tutorials/function_calling_python_quickstart), you can now pass this `FunctionDeclaration` to the `tools` argument of the `genai.GenerativeModel` constructor (the constructor would also accept an equivalent JSON representation of the function declaration):" + ] + }, + { + "cell_type": "code", + "execution_count": 195, + "metadata": { + "id": "5PGAPRDJP4Qx" + }, + "outputs": [], + "source": [ + "model = genai.GenerativeModel(\n", + " model_name='gemini-1.0-pro',\n", + " tools = [add_to_database])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "1uTYW5cVCDST" + }, + "source": [ + "Each time you
call the API, the SDK will send the tools along with your prompt, and the model should call the function you defined:" + ] + }, + { + "cell_type": "code", + "execution_count": 196, + "metadata": { + "id": "bAPA7fNtSUwN" + }, + "outputs": [], + "source": [ + "result = model.generate_content(f\"\"\"\n", + "Please add the people, places, things, and relationships from this story to the database:\n", + "\n", + "{story}\n", + "\"\"\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "oSG7r6IBCL7S" + }, + "source": [ + "Now there is no text to parse. The result _is_ a data structure." + ] + }, + { + "cell_type": "code", + "execution_count": 197, + "metadata": { + "id": "07n3wXzFOZ4x" + }, + "outputs": [ + { + "data": { + "text/plain": [ + "False" + ] + }, + "execution_count": 197, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "'text' in result.candidates[0].content.parts[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 198, + "metadata": { + "id": "i-8hm1HPI5Ce" + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 198, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "'function_call' in result.candidates[0].content.parts[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 199, + "metadata": { + "id": "n8BTs6ogDEkq" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n" + ] + } + ], + "source": [ + "fc = result.candidates[0].content.parts[0].function_call\n", + "print(type(fc))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "kILNHmG2IED3" + }, + "source": [ + "The `glm.FunctionCall` class is based on Google Protocol Buffers; convert it to a more familiar JSON-compatible object:" + ] + }, + { + "cell_type": "code", + "execution_count": 200, + "metadata": { + "id": "5GKHtT4-F3qa" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{\n", + " \"name\":
\"add_to_database\",\n", + " \"args\": {\n", + " \"relationships\": [\n", + " {\n", + " \"relationship\": \"mother-daughter\",\n", + " \"person_2_name\": \"Elise\",\n", + " \"person_1_name\": \"Anya\"\n", + " },\n", + " {\n", + " \"person_1_name\": \"Anya\",\n", + " \"relationship\": \"father-daughter\",\n", + " \"person_2_name\": \"Edward\"\n", + " },\n", + " {\n", + " \"relationship\": \"best friends\",\n", + " \"person_1_name\": \"Anya\",\n", + " \"person_2_name\": \"Samuel\"\n", + " }\n", + " ],\n", + " \"places\": [\n", + " {\n", + " \"name\": \"Willow Creek\",\n", + " \"description\": \"a quaint town nestled amidst rolling hills and whispering willows\"\n", + " },\n", + " {\n", + " \"name\": \"forest\",\n", + " \"description\": \"a shadowy place with rustling undergrowth\"\n", + " }\n", + " ],\n", + " \"things\": [\n", + " {\n", + " \"description\": \"a backpack with a shimmering emerald-green fabric and leather straps, containing a magical sword, a book of ancient spells, a tiny compass that always points north, and a magical key that could open any lock\",\n", + " \"name\": \"magical backpack\"\n", + " },\n", + " {\n", + " \"description\": \"a weapon that can defeat monsters\",\n", + " \"name\": \"shimmering sword\"\n", + " },\n", + " {\n", + " \"description\": \"a book containing magical spells\",\n", + " \"name\": \"book of ancient spells\"\n", + " },\n", + " {\n", + " \"name\": \"tiny compass\",\n", + " \"description\": \"a compass that always points north\"\n", + " },\n", + " {\n", + " \"name\": \"magical key\",\n", + " \"description\": \"a key that can open any lock\"\n", + " },\n", + " {\n", + " \"name\": \"monster\",\n", + " \"description\": \"a massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease\"\n", + " }\n", + " ],\n", + " \"people\": [\n", + " {\n", + " \"description\": \"a young girl\",\n", + " \"start_place_name\": \"Willow Creek\",\n", + " \"name\": \"Anya\",\n", + " \"end_place_name\": null\n", + " 
},\n", + " {\n", + " \"start_place_name\": \"Willow Creek\",\n", + " \"description\": \"Anya's mother\",\n", + " \"name\": \"Elise\",\n", + " \"end_place_name\": null\n", + " },\n", + " {\n", + " \"start_place_name\": \"Willow Creek\",\n", + " \"name\": \"Edward\",\n", + " \"end_place_name\": null,\n", + " \"description\": \"Anya's father\"\n", + " },\n", + " {\n", + " \"name\": \"Samuel\",\n", + " \"end_place_name\": null,\n", + " \"start_place_name\": \"Willow Creek\",\n", + " \"description\": \"Anya's best friend\"\n", + " },\n", + " {\n", + " \"name\": \"tall, lanky boy\",\n", + " \"description\": \"a boy who warned Anya about the monster\",\n", + " \"start_place_name\": \"Willow Creek\",\n", + " \"end_place_name\": null\n", + " }\n", + " ]\n", + " }\n", + "}\n" + ] + } + ], + "source": [ + "print(json.dumps(type(fc).to_dict(fc), indent=4))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "4m8FakjCIKmI" + }, + "source": [ + "## Conclusion\n", + "\n", + "While the API can handle structured data extraction problems with pure text input and text output, using Function Calling is likely more reliable since it lets you define a strict schema, and eliminates a potentially error-prone parsing step." 
+ ] + } + ], + "metadata": { + "colab": { + "name": "structured_data_extraction.ipynb", + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3 (ipykernel)", + "language": "python", + "name": "python3" }, - "nbformat": 4, - "nbformat_minor": 0 + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.11.3" + } + }, + "nbformat": 4, + "nbformat_minor": 4 } From f55f376f2a08226ae413685fc7f6989be6c1c648 Mon Sep 17 00:00:00 2001 From: Mark Daoust Date: Mon, 4 Mar 2024 18:19:42 -0800 Subject: [PATCH 3/4] nbfmt --- .../structured_data_extraction.ipynb | 1798 ++++++++--------- 1 file changed, 895 insertions(+), 903 deletions(-) diff --git a/site/en/tutorials/structured_data_extraction.ipynb b/site/en/tutorials/structured_data_extraction.ipynb index ad101fed5..f0424e782 100644 --- a/site/en/tutorials/structured_data_extraction.ipynb +++ b/site/en/tutorials/structured_data_extraction.ipynb @@ -1,910 +1,902 @@ { - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "##### Copyright 2024 Google LLC." - ] - }, - { - "cell_type": "code", - "execution_count": 3, - "metadata": {}, - "outputs": [], - "source": [ - "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n", - "# you may not use this file except in compliance with the License.\n", - "# You may obtain a copy of the License at\n", - "#\n", - "# https://www.apache.org/licenses/LICENSE-2.0\n", - "#\n", - "# Unless required by applicable law or agreed to in writing, software\n", - "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", - "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", - "# See the License for the specific language governing permissions and\n", - "# limitations under the License." 
- ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "NtX45QCEdPaP" - }, - "source": [ - "# Structured data extraction using function calling" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "2tO4fP7FFg2V" - }, - "source": [ - "\n", - " \n", - " \n", - " \n", - "
\n", - " View on Google AI\n", - " \n", - " Run in Google Colab\n", - " \n", - " View source on GitHub\n", - "
" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "8Szkddw5NScW" - }, - "source": [ - "In this tutorial you'll work through a structured data extraction example, using the Gemini API to extract the lists of characters, relationships, things, and places from a story. " - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "bvrwRlNPdYDr" - }, - "source": [ - "## Setup" - ] - }, - { - "cell_type": "code", - "execution_count": 10, - "metadata": { - "id": "QyW6x11UQHnx" - }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\u001b[33mWARNING: There was an error checking the latest version of pip.\u001b[0m\u001b[33m\n", - "\u001b[0m" - ] - } - ], - "source": [ - "!pip install -U -q google-generativeai" - ] - }, - { - "cell_type": "code", - "execution_count": 11, - "metadata": { - "id": "TS9l5igubpHO" - }, - "outputs": [], - "source": [ - "import pathlib\n", - "import textwrap\n", - "\n", - "import google.generativeai as genai\n", - "import google.ai.generativelanguage as glm\n", - "\n", - "\n", - "from IPython.display import display\n", - "from IPython.display import Markdown\n", - "\n", - "from google.api_core import retry\n", - "\n", - "def to_markdown(text):\n", - " text = text.replace('•', ' *')\n", - " return Markdown(textwrap.indent(text, '> ', predicate=lambda _: True))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "VmSlTHXxb5pV" - }, - "source": [ - "Once you have the API key, pass it to the SDK. 
You can do this in two ways:\n", - "\n", - "* Put the key in the `GOOGLE_API_KEY` environment variable (the SDK will automatically pick it up from there).\n", - "* Pass the key to `genai.configure(api_key=...)`\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "ab9ASynfcIZn" - }, - "outputs": [], - "source": [ - "try:\n", - " # Used to securely store your API key\n", - " from google.colab import userdata\n", - "\n", - " # Or use `os.getenv('API_KEY')` to fetch an environment variable.\n", - " GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')\n", - "except ImportError:\n", - " import os\n", - " GOOGLE_API_KEY = os.environ['GOOGLE_API_KEY']\n", - "\n", - "genai.configure(api_key=GOOGLE_API_KEY)\n", - "\n", - "genai.configure(\n", - " api_key=GOOGLE_API_KEY,\n", - " client_options={'api_endpoint':'autopush-generativelanguage.sandbox.googleapis.com'})" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "K6SdtoJCL4pL" - }, - "source": [ - "## The example task\n", - "\n", - "For this tutorial you'll extract entities from natural language stories. As an\n", - " example, below is a story written by Gemini." - ] - }, - { - "cell_type": "code", - "execution_count": 6, - "metadata": { - "id": "0THz95wOL4pL" - }, - "outputs": [], - "source": [ - "new_story = False\n", - "\n", - "if new_story:\n", - " model = genai.GenerativeModel(model_name='gemini-1.0-pro')\n", - "\n", - " response = model.generate_content(\"\"\"\n", - " Write a long story about a girl with magic backpack, her family, and at\n", - " least one other charater. Make sure everyone has names. 
Don't forget to\n", - " describe the contents of the backpack, and where everyone and everything\n", - " starts and ends up.\"\"\", request_options={'retry': retry.Retry()})\n", - " story = response.text\n", - " print(response.candidates[0].citation_metadata)\n", - "else:\n", - " story = \"\"\"In the quaint town of Willow Creek, nestled amidst rolling hills and whispering willows, resided a young girl named Anya. As she stepped out of the creaky wooden door of her modest cottage, her heart skipped a beat with excitement and anticipation. Today was her first day of school, and she couldn't wait to show off her prized possession - a magical backpack.\\n\\nHanded down to her from her grandmother, the backpack was no ordinary satchel. Its soft, emerald-green fabric shimmered with an ethereal glow, and its leather straps held secrets that only Anya knew. Within its capacious interior lay an enchanted world, filled with wonders that would ignite her imagination and change her life forever.\\n\\nAnya's parents, kind-hearted Elise and wise-bearded Edward, bid her farewell with warm embraces. \"Remember, my dear,\" whispered her mother, \"use your magic wisely and for good.\" Her father added, \"Always seek knowledge, and let the backpack be your trusted companion.\"\\n\\nWith a skip in her step, Anya set off towards the town's only schoolhouse. On her way, she passed her best friend, Samuel, a curious and adventurous boy with a mischievous grin. \"Hey, Anya,\" he called out. \"Can I see your backpack?\"\\n\\nAnya hesitated for a moment before unzipping the flap and revealing its contents. Samuel's eyes widened in amazement as he peered inside. There, nestled amidst pencils and notebooks, were a shimmering sword, a book of ancient spells, a tiny compass that always pointed north, and a magical key that could open any lock.\\n\\nTogether, they marveled at the backpack's wonders, promising to keep its secrets safe. 
As they approached the schoolhouse, Anya noticed a group of older children huddled together, their faces etched with fear. Curiosity getting the better of her, she cautiously approached.\\n\\n\"What's wrong?\" she asked.\\n\\nA tall, lanky boy stepped forward. \"There's a monster in the forest,\" he stammered. \"It's been terrorizing the town, attacking animals and even people.\"\\n\\nAnya's heart sank. The town of Willow Creek was small and peaceful, and the thought of a monster brought a shiver down her spine. She knew she had to do something to protect her family and friends.\\n\\nWithout a moment's hesitation, Anya opened her backpack and retrieved the shimmering sword. With a determined gleam in her eye, she turned to her terrified peers. \"Don't worry,\" she said, her voice steady. \"I'll take care of it.\"\\n\\nWith Samuel close behind her, Anya ventured into the shadowy depths of the forest. The trees seemed to whisper secrets as she passed, and the undergrowth rustled with unseen creatures. As they walked deeper into the forest, the air grew heavy and the ground beneath their feet trembled.\\n\\nSuddenly, they came to a clearing, and there before their eyes was the monster - a massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease. The creature roared, a thunderous sound that shook the forest to its core.\\n\\nFear surged through Anya, but she refused to let it consume her. She drew the sword from its sheath and charged towards the monster. The blade shimmered in the sunlight, and as it struck the beast's hide, a blinding light erupted, enveloping everything in its radiance.\\n\\nWhen the light faded, the monster was gone, and in its place was a pile of shattered crystals. Anya had defeated the creature with the magic of her backpack, proving that even the smallest of objects could hold the greatest of powers.\\n\\nAs she and Samuel returned to the town, they were greeted as heroes. 
The people of Willow Creek rejoiced, and the legend of Anya, the girl with the magic backpack, was passed down through generations. And so, Anya continued her adventures, using the backpack's wonders to make the world a better place, one magical step at a time.\"\"\"" - ] - }, - { - "cell_type": "code", - "execution_count": 9, - "metadata": { - "id": "yMnxJqubg759" - }, - "outputs": [ - { - "data": { - "text/markdown": [ - "> In the quaint town of Willow Creek, nestled amidst rolling hills and whispering willows, resided a young girl named Anya. As she stepped out of the creaky wooden door of her modest cottage, her heart skipped a beat with excitement and anticipation. Today was her first day of school, and she couldn't wait to show off her prized possession - a magical backpack.\n", - "> \n", - "> Handed down to her from her grandmother, the backpack was no ordinary satchel. Its soft, emerald-green fabric shimmered with an ethereal glow, and its leather straps held secrets that only Anya knew. Within its capacious interior lay an enchanted world, filled with wonders that would ignite her imagination and change her life forever.\n", - "> \n", - "> Anya's parents, kind-hearted Elise and wise-bearded Edward, bid her farewell with warm embraces. \"Remember, my dear,\" whispered her mother, \"use your magic wisely and for good.\" Her father added, \"Always seek knowledge, and let the backpack be your trusted companion.\"\n", - "> \n", - "> With a skip in her step, Anya set off towards the town's only schoolhouse. On her way, she passed her best friend, Samuel, a curious and adventurous boy with a mischievous grin. \"Hey, Anya,\" he called out. \"Can I see your backpack?\"\n", - "> \n", - "> Anya hesitated for a moment before unzipping the flap and revealing its contents. Samuel's eyes widened in amazement as he peered inside. 
There, nestled amidst pencils and notebooks, were a shimmering sword, a book of ancient spells, a tiny compass that always pointed north, and a magical key that could open any lock.\n", - "> \n", - "> Together, they marveled at the backpack's wonders, promising to keep its secrets safe. As they approached the schoolhouse, Anya noticed a group of older children huddled together, their faces etched with fear. Curiosity getting the better of her, she cautiously approached.\n", - "> \n", - "> \"What's wrong?\" she asked.\n", - "> \n", - "> A tall, lanky boy stepped forward. \"There's a monster in the forest,\" he stammered. \"It's been terrorizing the town, attacking animals and even people.\"\n", - "> \n", - "> Anya's heart sank. The town of Willow Creek was small and peaceful, and the thought of a monster brought a shiver down her spine. She knew she had to do something to protect her family and friends.\n", - "> \n", - "> Without a moment's hesitation, Anya opened her backpack and retrieved the shimmering sword. With a determined gleam in her eye, she turned to her terrified peers. \"Don't worry,\" she said, her voice steady. \"I'll take care of it.\"\n", - "> \n", - "> With Samuel close behind her, Anya ventured into the shadowy depths of the forest. The trees seemed to whisper secrets as she passed, and the undergrowth rustled with unseen creatures. As they walked deeper into the forest, the air grew heavy and the ground beneath their feet trembled.\n", - "> \n", - "> Suddenly, they came to a clearing, and there before their eyes was the monster - a massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease. The creature roared, a thunderous sound that shook the forest to its core.\n", - "> \n", - "> Fear surged through Anya, but she refused to let it consume her. She drew the sword from its sheath and charged towards the monster. 
The blade shimmered in the sunlight, and as it struck the beast's hide, a blinding light erupted, enveloping everything in its radiance.\n", - "> \n", - "> When the light faded, the monster was gone, and in its place was a pile of shattered crystals. Anya had defeated the creature with the magic of her backpack, proving that even the smallest of objects could hold the greatest of powers.\n", - "> \n", - "> As she and Samuel returned to the town, they were greeted as heroes. The people of Willow Creek rejoiced, and the legend of Anya, the girl with the magic backpack, was passed down through generations. And so, Anya continued her adventures, using the backpack's wonders to make the world a better place, one magical step at a time." + "cells": [ + { + "cell_type": "markdown", + "metadata": { + "id": "8968a502d25e" + }, + "source": [ + "##### Copyright 2024 Google LLC." + ] + }, + { + "cell_type": "code", + "execution_count": 3, + "metadata": { + "cellView": "form", + "id": "906e07f6e562" + }, + "outputs": [], + "source": [ + "#@title Licensed under the Apache License, Version 2.0 (the \"License\");\n", + "# you may not use this file except in compliance with the License.\n", + "# You may obtain a copy of the License at\n", + "#\n", + "# https://www.apache.org/licenses/LICENSE-2.0\n", + "#\n", + "# Unless required by applicable law or agreed to in writing, software\n", + "# distributed under the License is distributed on an \"AS IS\" BASIS,\n", + "# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.\n", + "# See the License for the specific language governing permissions and\n", + "# limitations under the License." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "NtX45QCEdPaP" + }, + "source": [ + "# Structured data extraction using function calling" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "2tO4fP7FFg2V" + }, + "source": [ + "\n", + " \n", + " \n", + " \n", + "
\n", + " View on Google AI\n", + " \n", + " Run in Google Colab\n", + " \n", + " View source on GitHub\n", + "
" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "8Szkddw5NScW" + }, + "source": [ + "In this tutorial you'll work through a structured data extraction example, using the Gemini API to extract the lists of characters, relationships, things, and places from a story. " + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "bvrwRlNPdYDr" + }, + "source": [ + "## Setup" + ] + }, + { + "cell_type": "code", + "execution_count": 10, + "metadata": { + "id": "QyW6x11UQHnx" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\u001b[33mWARNING: There was an error checking the latest version of pip.\u001b[0m\u001b[33m\n", + "\u001b[0m" + ] + } ], - "text/plain": [ - "" + "source": [ + "!pip install -U -q google-generativeai" + ] + }, + { + "cell_type": "code", + "execution_count": 11, + "metadata": { + "id": "TS9l5igubpHO" + }, + "outputs": [], + "source": [ + "import pathlib\n", + "import textwrap\n", + "\n", + "import google.generativeai as genai\n", + "import google.ai.generativelanguage as glm\n", + "\n", + "\n", + "from IPython.display import display\n", + "from IPython.display import Markdown\n", + "\n", + "from google.api_core import retry\n", + "\n", + "def to_markdown(text):\n", + " text = text.replace('•', ' *')\n", + " return Markdown(textwrap.indent(text, '> ', predicate=lambda _: True))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "VmSlTHXxb5pV" + }, + "source": [ + "Once you have the API key, pass it to the SDK. 
You can do this in two ways:\n", + "\n", + "* Put the key in the `GOOGLE_API_KEY` environment variable (the SDK will automatically pick it up from there).\n", + "* Pass the key to `genai.configure(api_key=...)`\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "ab9ASynfcIZn" + }, + "outputs": [], + "source": [ + "try:\n", + " # Used to securely store your API key\n", + " from google.colab import userdata\n", + "\n", + " # Or use `os.getenv('GOOGLE_API_KEY')` to fetch an environment variable.\n", + " GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')\n", + "except ImportError:\n", + " import os\n", + " GOOGLE_API_KEY = os.environ['GOOGLE_API_KEY']\n", + "\n", + "genai.configure(api_key=GOOGLE_API_KEY)\n", + "\n", + "genai.configure(\n", + " api_key=GOOGLE_API_KEY,\n", + " client_options={'api_endpoint':'autopush-generativelanguage.sandbox.googleapis.com'})" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "K6SdtoJCL4pL" + }, + "source": [ + "## The example task\n", + "\n", + "For this tutorial you'll extract entities from natural language stories. As an\n", + " example, below is a story written by Gemini." + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "metadata": { + "id": "0THz95wOL4pL" + }, + "outputs": [], + "source": [ + "new_story = False\n", + "\n", + "if new_story:\n", + " model = genai.GenerativeModel(model_name='gemini-1.0-pro')\n", + "\n", + " response = model.generate_content(\"\"\"\n", + " Write a long story about a girl with a magic backpack, her family, and at\n", + " least one other character. Make sure everyone has names. 
Don't forget to\n", + " describe the contents of the backpack, and where everyone and everything\n", + " starts and ends up.\"\"\", request_options={'retry': retry.Retry()})\n", + " story = response.text\n", + " print(response.candidates[0].citation_metadata)\n", + "else:\n", + " story = \"\"\"In the quaint town of Willow Creek, nestled amidst rolling hills and whispering willows, resided a young girl named Anya. As she stepped out of the creaky wooden door of her modest cottage, her heart skipped a beat with excitement and anticipation. Today was her first day of school, and she couldn't wait to show off her prized possession - a magical backpack.\\n\\nHanded down to her from her grandmother, the backpack was no ordinary satchel. Its soft, emerald-green fabric shimmered with an ethereal glow, and its leather straps held secrets that only Anya knew. Within its capacious interior lay an enchanted world, filled with wonders that would ignite her imagination and change her life forever.\\n\\nAnya's parents, kind-hearted Elise and wise-bearded Edward, bid her farewell with warm embraces. \"Remember, my dear,\" whispered her mother, \"use your magic wisely and for good.\" Her father added, \"Always seek knowledge, and let the backpack be your trusted companion.\"\\n\\nWith a skip in her step, Anya set off towards the town's only schoolhouse. On her way, she passed her best friend, Samuel, a curious and adventurous boy with a mischievous grin. \"Hey, Anya,\" he called out. \"Can I see your backpack?\"\\n\\nAnya hesitated for a moment before unzipping the flap and revealing its contents. Samuel's eyes widened in amazement as he peered inside. There, nestled amidst pencils and notebooks, were a shimmering sword, a book of ancient spells, a tiny compass that always pointed north, and a magical key that could open any lock.\\n\\nTogether, they marveled at the backpack's wonders, promising to keep its secrets safe. 
As they approached the schoolhouse, Anya noticed a group of older children huddled together, their faces etched with fear. Curiosity getting the better of her, she cautiously approached.\\n\\n\"What's wrong?\" she asked.\\n\\nA tall, lanky boy stepped forward. \"There's a monster in the forest,\" he stammered. \"It's been terrorizing the town, attacking animals and even people.\"\\n\\nAnya's heart sank. The town of Willow Creek was small and peaceful, and the thought of a monster brought a shiver down her spine. She knew she had to do something to protect her family and friends.\\n\\nWithout a moment's hesitation, Anya opened her backpack and retrieved the shimmering sword. With a determined gleam in her eye, she turned to her terrified peers. \"Don't worry,\" she said, her voice steady. \"I'll take care of it.\"\\n\\nWith Samuel close behind her, Anya ventured into the shadowy depths of the forest. The trees seemed to whisper secrets as she passed, and the undergrowth rustled with unseen creatures. As they walked deeper into the forest, the air grew heavy and the ground beneath their feet trembled.\\n\\nSuddenly, they came to a clearing, and there before their eyes was the monster - a massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease. The creature roared, a thunderous sound that shook the forest to its core.\\n\\nFear surged through Anya, but she refused to let it consume her. She drew the sword from its sheath and charged towards the monster. The blade shimmered in the sunlight, and as it struck the beast's hide, a blinding light erupted, enveloping everything in its radiance.\\n\\nWhen the light faded, the monster was gone, and in its place was a pile of shattered crystals. Anya had defeated the creature with the magic of her backpack, proving that even the smallest of objects could hold the greatest of powers.\\n\\nAs she and Samuel returned to the town, they were greeted as heroes. 
The people of Willow Creek rejoiced, and the legend of Anya, the girl with the magic backpack, was passed down through generations. And so, Anya continued her adventures, using the backpack's wonders to make the world a better place, one magical step at a time.\"\"\"" + ] + }, + { + "cell_type": "code", + "execution_count": 9, + "metadata": { + "id": "yMnxJqubg759" + }, + "outputs": [ + { + "data": { + "text/markdown": [ + "> In the quaint town of Willow Creek, nestled amidst rolling hills and whispering willows, resided a young girl named Anya. As she stepped out of the creaky wooden door of her modest cottage, her heart skipped a beat with excitement and anticipation. Today was her first day of school, and she couldn't wait to show off her prized possession - a magical backpack.\n", + "> \n", + "> Handed down to her from her grandmother, the backpack was no ordinary satchel. Its soft, emerald-green fabric shimmered with an ethereal glow, and its leather straps held secrets that only Anya knew. Within its capacious interior lay an enchanted world, filled with wonders that would ignite her imagination and change her life forever.\n", + "> \n", + "> Anya's parents, kind-hearted Elise and wise-bearded Edward, bid her farewell with warm embraces. \"Remember, my dear,\" whispered her mother, \"use your magic wisely and for good.\" Her father added, \"Always seek knowledge, and let the backpack be your trusted companion.\"\n", + "> \n", + "> With a skip in her step, Anya set off towards the town's only schoolhouse. On her way, she passed her best friend, Samuel, a curious and adventurous boy with a mischievous grin. \"Hey, Anya,\" he called out. \"Can I see your backpack?\"\n", + "> \n", + "> Anya hesitated for a moment before unzipping the flap and revealing its contents. Samuel's eyes widened in amazement as he peered inside. 
There, nestled amidst pencils and notebooks, were a shimmering sword, a book of ancient spells, a tiny compass that always pointed north, and a magical key that could open any lock.\n", + "> \n", + "> Together, they marveled at the backpack's wonders, promising to keep its secrets safe. As they approached the schoolhouse, Anya noticed a group of older children huddled together, their faces etched with fear. Curiosity getting the better of her, she cautiously approached.\n", + "> \n", + "> \"What's wrong?\" she asked.\n", + "> \n", + "> A tall, lanky boy stepped forward. \"There's a monster in the forest,\" he stammered. \"It's been terrorizing the town, attacking animals and even people.\"\n", + "> \n", + "> Anya's heart sank. The town of Willow Creek was small and peaceful, and the thought of a monster brought a shiver down her spine. She knew she had to do something to protect her family and friends.\n", + "> \n", + "> Without a moment's hesitation, Anya opened her backpack and retrieved the shimmering sword. With a determined gleam in her eye, she turned to her terrified peers. \"Don't worry,\" she said, her voice steady. \"I'll take care of it.\"\n", + "> \n", + "> With Samuel close behind her, Anya ventured into the shadowy depths of the forest. The trees seemed to whisper secrets as she passed, and the undergrowth rustled with unseen creatures. As they walked deeper into the forest, the air grew heavy and the ground beneath their feet trembled.\n", + "> \n", + "> Suddenly, they came to a clearing, and there before their eyes was the monster - a massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease. The creature roared, a thunderous sound that shook the forest to its core.\n", + "> \n", + "> Fear surged through Anya, but she refused to let it consume her. She drew the sword from its sheath and charged towards the monster. 
The blade shimmered in the sunlight, and as it struck the beast's hide, a blinding light erupted, enveloping everything in its radiance.\n", + "> \n", + "> When the light faded, the monster was gone, and in its place was a pile of shattered crystals. Anya had defeated the creature with the magic of her backpack, proving that even the smallest of objects could hold the greatest of powers.\n", + "> \n", + "> As she and Samuel returned to the town, they were greeted as heroes. The people of Willow Creek rejoiced, and the legend of Anya, the girl with the magic backpack, was passed down through generations. And so, Anya continued her adventures, using the backpack's wonders to make the world a better place, one magical step at a time." + ], + "text/plain": [ + "" + ] + }, + "execution_count": 9, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "to_markdown(story)" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "zldoIzn-MuLE" + }, + "source": [ + "## Using Natural language\n", + "\n", + "Large language models are powerful multitask tools. Often you can just ask Gemini for what you want, and it will do okay. \n", + "\n", + "The Gemini API doesn't have a JSON mode, so there are a few things to watch for when generating data structures this way:\n", + "\n", + "- Sometimes parsing fails.\n", + "- The schema can't be strictly enforced.\n", + "\n", + "You'll solve those problems in the next section. First, try a simple natural language prompt with the schema written out as text. 
This has not been optimized:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "eStTMD6VL4pM" + }, + "outputs": [], + "source": [ + "model = genai.GenerativeModel(\n", + " model_name='gemini-1.0-pro')\n", + "\n", + "response = model.generate_content(textwrap.dedent(\"\"\"\\\n", + " Please return JSON describing the people, places, things and relationships from this story using the following schema:\n", + "\n", + " {\"people\": list[PERSON], \"places\":list[PLACE], \"things\":list[THING], \"relationships\": list[RELATIONSHIP]}\n", + "\n", + " PERSON = {\"name\": str, \"description\": str, \"start_place_name\": str, \"end_place_name\": str}\n", + " PLACE = {\"name\": str, \"description\": str}\n", + " THING = {\"name\": str, \"description\": str, \"start_place_name\": str, \"end_place_name\": str}\n", + " RELATIONSHIP = {\"person_1_name\": str, \"person_2_name\": str, \"relationship\": str}\n", + "\n", + " All fields are required.\n", + "\n", + " Important: Only return a single piece of valid JSON text.\n", + "\n", + " Here is the story:\n", + "\n", + " \"\"\") + story)\n" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "B0b5zHI3uEBm" + }, + "outputs": [ + { + "data": { + "application/vnd.google.colaboratory.intrinsic+json": { + "type": "string" + }, + "text/plain": [ + "'{\"people\": [{\"name\": \"Anya\", \"description\": \"A young girl who possesses a magical backpack.\", \"start_place_name\": \"Willow Creek\", \"end_place_name\": null}, {\"name\": \"Elise\", \"description\": \"Anya\\'s kind-hearted mother.\", \"start_place_name\": \"Willow Creek\", \"end_place_name\": null}, {\"name\": \"Edward\", \"description\": \"Anya\\'s wise-bearded father.\", \"start_place_name\": \"Willow Creek\", \"end_place_name\": null}, {\"name\": \"Samuel\", \"description\": \"Anya\\'s curious and adventurous best friend.\", \"start_place_name\": \"Willow Creek\", \"end_place_name\": 
null}], \"places\": [{\"name\": \"Willow Creek\", \"description\": \"A quaint town nestled amidst rolling hills and whispering willows.\"}, {\"name\": \"The forest\", \"description\": \"A shadowy place with whispering trees and unseen creatures.\"}], \"things\": [{\"name\": \"Magical backpack\", \"description\": \"A magical backpack that holds an enchanted world filled with wonders.\", \"start_place_name\": \"Anya\\'s grandmother\\'s house\", \"end_place_name\": \"Willow Creek\"}, {\"name\": \"Shimmering sword\", \"description\": \"A sword that Anya uses to defeat the monster.\", \"start_place_name\": \"Anya\\'s backpack\", \"end_place_name\": null}, {\"name\": \"Book of ancient spells\", \"description\": \"A book that contains ancient spells.\", \"start_place_name\": \"Anya\\'s backpack\", \"end_place_name\": null}, {\"name\": \"Tiny compass\", \"description\": \"A compass that always points north.\", \"start_place_name\": \"Anya\\'s backpack\", \"end_place_name\": null}, {\"name\": \"Magical key\", \"description\": \"A key that can open any lock.\", \"start_place_name\": \"Anya\\'s backpack\", \"end_place_name\": null}, {\"name\": \"Monster\", \"description\": \"A massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease.\", \"start_place_name\": \"The forest\", \"end_place_name\": null}], \"relationships\": [{\"person_1_name\": \"Anya\", \"person_2_name\": \"Elise\", \"relationship\": \"Mother-daughter\"}, {\"person_1_name\": \"Anya\", \"person_2_name\": \"Edward\", \"relationship\": \"Father-daughter\"}, {\"person_1_name\": \"Anya\", \"person_2_name\": \"Samuel\", \"relationship\": \"Best friends\"}]}'" + ] + }, + "execution_count": 47, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "response.text" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "ScEdqKq1lhmQ" + }, + "source": [ + "That returned a json string. 
Try parsing it:" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "metadata": { + "id": "xSdj50czL4pM" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{\n", + " \"people\": [\n", + " {\n", + " \"name\": \"Anya\",\n", + " \"description\": \"A young girl who possesses a magical backpack.\",\n", + " \"start_place_name\": \"Willow Creek\",\n", + " \"end_place_name\": null\n", + " },\n", + " {\n", + " \"name\": \"Elise\",\n", + " \"description\": \"Anya's kind-hearted mother.\",\n", + " \"start_place_name\": \"Willow Creek\",\n", + " \"end_place_name\": null\n", + " },\n", + " {\n", + " \"name\": \"Edward\",\n", + " \"description\": \"Anya's wise-bearded father.\",\n", + " \"start_place_name\": \"Willow Creek\",\n", + " \"end_place_name\": null\n", + " },\n", + " {\n", + " \"name\": \"Samuel\",\n", + " \"description\": \"Anya's curious and adventurous best friend.\",\n", + " \"start_place_name\": \"Willow Creek\",\n", + " \"end_place_name\": null\n", + " }\n", + " ],\n", + " \"places\": [\n", + " {\n", + " \"name\": \"Willow Creek\",\n", + " \"description\": \"A quaint town nestled amidst rolling hills and whispering willows.\"\n", + " },\n", + " {\n", + " \"name\": \"The forest\",\n", + " \"description\": \"A shadowy place with whispering trees and unseen creatures.\"\n", + " }\n", + " ],\n", + " \"things\": [\n", + " {\n", + " \"name\": \"Magical backpack\",\n", + " \"description\": \"A magical backpack that holds an enchanted world filled with wonders.\",\n", + " \"start_place_name\": \"Anya's grandmother's house\",\n", + " \"end_place_name\": \"Willow Creek\"\n", + " },\n", + " {\n", + " \"name\": \"Shimmering sword\",\n", + " \"description\": \"A sword that Anya uses to defeat the monster.\",\n", + " \"start_place_name\": \"Anya's backpack\",\n", + " \"end_place_name\": null\n", + " },\n", + " {\n", + " \"name\": \"Book of ancient spells\",\n", + " \"description\": \"A book that contains ancient spells.\",\n", + 
" \"start_place_name\": \"Anya's backpack\",\n", + " \"end_place_name\": null\n", + " },\n", + " {\n", + " \"name\": \"Tiny compass\",\n", + " \"description\": \"A compass that always points north.\",\n", + " \"start_place_name\": \"Anya's backpack\",\n", + " \"end_place_name\": null\n", + " },\n", + " {\n", + " \"name\": \"Magical key\",\n", + " \"description\": \"A key that can open any lock.\",\n", + " \"start_place_name\": \"Anya's backpack\",\n", + " \"end_place_name\": null\n", + " },\n", + " {\n", + " \"name\": \"Monster\",\n", + " \"description\": \"A massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease.\",\n", + " \"start_place_name\": \"The forest\",\n", + " \"end_place_name\": null\n", + " }\n", + " ],\n", + " \"relationships\": [\n", + " {\n", + " \"person_1_name\": \"Anya\",\n", + " \"person_2_name\": \"Elise\",\n", + " \"relationship\": \"Mother-daughter\"\n", + " },\n", + " {\n", + " \"person_1_name\": \"Anya\",\n", + " \"person_2_name\": \"Edward\",\n", + " \"relationship\": \"Father-daughter\"\n", + " },\n", + " {\n", + " \"person_1_name\": \"Anya\",\n", + " \"person_2_name\": \"Samuel\",\n", + " \"relationship\": \"Best friends\"\n", + " }\n", + " ]\n", + "}\n" + ] + } + ], + "source": [ + "import json\n", + "\n", + "json_text = response.text.strip('`\\r\\n ').removeprefix('json')\n", + "print(json.dumps(json.loads(json_text), indent=4))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "TgC_wkHPmkHn" + }, + "source": [ + "That's relatively simple and often works, but you can potentially make this more strict/robust by defining the schema using the API's Function Calling feature." 
+ ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "CxMC28LAOfUf" + }, + "source": [ + "## Use Function Calling" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "x-V6PJn83Kh9" + }, + "source": [ + "If you haven't gone through the [Function Calling basics](https://ai.google.dev/tutorials/function_calling_python_quickstart) tutorial yet, make sure you do that first.\n", + "\n", + "With Function Calling your function and its parameters are described to the API as a `glm.FunctionDeclaration`. In basic cases the SDK can build the `FunctionDeclaration` from the function and its annotations. The SDK doesn't currently handle the description of nested `OBJECT` (`dict`) parameters. So you'll need to define them explicitly, for now." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "k83LZ5MCBfTJ" + }, + "source": [ + "### Define the schema\n", + "\n", + "Start by defining `person` as an object with string fields `name`, `description`, `start_place_name`, `end_place_name`." 
+ ] + }, + { + "cell_type": "code", + "execution_count": 189, + "metadata": { + "id": "p2efqZA7BAzp" + }, + "outputs": [], + "source": [ + "person = glm.Schema(\n", + " type = glm.Type.OBJECT,\n", + " properties = {\n", + " 'name': glm.Schema(type=glm.Type.STRING),\n", + " 'description': glm.Schema(type=glm.Type.STRING),\n", + " 'start_place_name': glm.Schema(type=glm.Type.STRING),\n", + " 'end_place_name': glm.Schema(type=glm.Type.STRING)\n", + " },\n", + " required=['name', 'description', 'start_place_name', 'end_place_name']\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "HGV1wxx6BCJl" + }, + "source": [ + "Then define people as an `ARRAY` of `person` objects:" + ] + }, + { + "cell_type": "code", + "execution_count": 190, + "metadata": { + "id": "Ur7kzpiA_Dqw" + }, + "outputs": [], + "source": [ + "people = glm.Schema(\n", + " type=glm.Type.ARRAY,\n", + " items=person\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "N6uD63sBBJ3i" + }, + "source": [ + "Then do the same for each of the entities you're trying to extract:" + ] + }, + { + "cell_type": "code", + "execution_count": 191, + "metadata": { + "id": "7wd3jTqj_bVi" + }, + "outputs": [], + "source": [ + "place = glm.Schema(\n", + " type = glm.Type.OBJECT,\n", + " properties = {\n", + " 'name': glm.Schema(type=glm.Type.STRING),\n", + " 'description': glm.Schema(type=glm.Type.STRING),\n", + " }\n", + ")\n", + "\n", + "places = glm.Schema(\n", + " type=glm.Type.ARRAY,\n", + " items=place\n", + ")" + ] + }, + { + "cell_type": "code", + "execution_count": 192, + "metadata": { + "id": "45cLwvCd_vg_" + }, + "outputs": [], + "source": [ + "thing = glm.Schema(\n", + " type = glm.Type.OBJECT,\n", + " properties = {\n", + " 'name': glm.Schema(type=glm.Type.STRING),\n", + " 'description': glm.Schema(type=glm.Type.STRING),\n", + " }\n", + ")\n", + "\n", + "things = glm.Schema(\n", + " type=glm.Type.ARRAY,\n", + " items=thing\n", + ")" + ] + }, + { + "cell_type": "code", 
+ "execution_count": 193, + "metadata": { + "id": "8DdVSZJfADDY" + }, + "outputs": [], + "source": [ + "relationship = glm.Schema(\n", + " type = glm.Type.OBJECT,\n", + " properties = {\n", + " 'person_1_name': glm.Schema(type=glm.Type.STRING),\n", + " 'person_2_name': glm.Schema(type=glm.Type.STRING),\n", + " 'relationship': glm.Schema(type=glm.Type.STRING),\n", + " }\n", + ")\n", + "\n", + "relationships = glm.Schema(\n", + " type=glm.Type.ARRAY,\n", + " items=relationship\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "mJwqEUqjBToJ" + }, + "source": [ + "Now build the `FunctionDeclaration`:" + ] + }, + { + "cell_type": "code", + "execution_count": 194, + "metadata": { + "id": "YQkiVCtsPbUy" + }, + "outputs": [], + "source": [ + "add_to_database = glm.FunctionDeclaration(\n", + " name=\"add_to_database\",\n", + " description=textwrap.dedent(\"\"\"\\\n", + " Adds entities to the database.\n", + " \"\"\"),\n", + " parameters=glm.Schema(\n", + " type=glm.Type.OBJECT,\n", + " properties = {\n", + " 'people': people,\n", + " 'places': places,\n", + " 'things': things,\n", + " 'relationships': relationships\n", + " }\n", + " )\n", + ")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "e1_QSwD9Bmy5" + }, + "source": [ + "### Call the API\n", + "\n", + "Like you saw in [Function Calling basics](https://ai.google.dev/tutorials/function_calling_python_quickstart) now you can pass this `FunctionDeclaration` to the `tools` argument of the `genai.GenerativeModel` constructor (the constructor would also accept an equivalent JSON representation of the function declaration):" + ] + }, + { + "cell_type": "code", + "execution_count": 195, + "metadata": { + "id": "5PGAPRDJP4Qx" + }, + "outputs": [], + "source": [ + "model = model = genai.GenerativeModel(\n", + " model_name='gemini-1.0-pro',\n", + " tools = [add_to_database])" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "1uTYW5cVCDST" + }, + "source": [ + "Each time you 
call the API the SDK will send the tools along with your prompt, and the model should call that function you defined:" + ] + }, + { + "cell_type": "code", + "execution_count": 196, + "metadata": { + "id": "bAPA7fNtSUwN" + }, + "outputs": [], + "source": [ + "result = model.generate_content(f\"\"\"\n", + "Please add the people, places, things, and relationships from this story to the database:\n", + "\n", + "{story}\n", + "\"\"\")" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "oSG7r6IBCL7S" + }, + "source": [ + "Now there is no text to parse. The result _is_ a datastructure." + ] + }, + { + "cell_type": "code", + "execution_count": 197, + "metadata": { + "id": "07n3wXzFOZ4x" + }, + "outputs": [ + { + "data": { + "text/plain": [ + "False" + ] + }, + "execution_count": 197, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "'text' in result.candidates[0].content.parts[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 198, + "metadata": { + "id": "i-8hm1HPI5Ce" + }, + "outputs": [ + { + "data": { + "text/plain": [ + "True" + ] + }, + "execution_count": 198, + "metadata": {}, + "output_type": "execute_result" + } + ], + "source": [ + "'function_call' in result.candidates[0].content.parts[0]" + ] + }, + { + "cell_type": "code", + "execution_count": 199, + "metadata": { + "id": "n8BTs6ogDEkq" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "\n" + ] + } + ], + "source": [ + "fc = result.candidates[0].content.parts[0].function_call\n", + "print(type(fc))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "kILNHmG2IED3" + }, + "source": [ + "The `glm.FunctionCall` class is based on Google Protocol Buffers, convert it to a more familiar JSON compatible object:" + ] + }, + { + "cell_type": "code", + "execution_count": 200, + "metadata": { + "id": "5GKHtT4-F3qa" + }, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "{\n", + " \"name\": 
\"add_to_database\",\n", + " \"args\": {\n", + " \"relationships\": [\n", + " {\n", + " \"relationship\": \"mother-daughter\",\n", + " \"person_2_name\": \"Elise\",\n", + " \"person_1_name\": \"Anya\"\n", + " },\n", + " {\n", + " \"person_1_name\": \"Anya\",\n", + " \"relationship\": \"father-daughter\",\n", + " \"person_2_name\": \"Edward\"\n", + " },\n", + " {\n", + " \"relationship\": \"best friends\",\n", + " \"person_1_name\": \"Anya\",\n", + " \"person_2_name\": \"Samuel\"\n", + " }\n", + " ],\n", + " \"places\": [\n", + " {\n", + " \"name\": \"Willow Creek\",\n", + " \"description\": \"a quaint town nestled amidst rolling hills and whispering willows\"\n", + " },\n", + " {\n", + " \"name\": \"forest\",\n", + " \"description\": \"a shadowy place with rustling undergrowth\"\n", + " }\n", + " ],\n", + " \"things\": [\n", + " {\n", + " \"description\": \"a backpack with a shimmering emerald-green fabric and leather straps, containing a magical sword, a book of ancient spells, a tiny compass that always points north, and a magical key that could open any lock\",\n", + " \"name\": \"magical backpack\"\n", + " },\n", + " {\n", + " \"description\": \"a weapon that can defeat monsters\",\n", + " \"name\": \"shimmering sword\"\n", + " },\n", + " {\n", + " \"description\": \"a book containing magical spells\",\n", + " \"name\": \"book of ancient spells\"\n", + " },\n", + " {\n", + " \"name\": \"tiny compass\",\n", + " \"description\": \"a compass that always points north\"\n", + " },\n", + " {\n", + " \"name\": \"magical key\",\n", + " \"description\": \"a key that can open any lock\"\n", + " },\n", + " {\n", + " \"name\": \"monster\",\n", + " \"description\": \"a massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease\"\n", + " }\n", + " ],\n", + " \"people\": [\n", + " {\n", + " \"description\": \"a young girl\",\n", + " \"start_place_name\": \"Willow Creek\",\n", + " \"name\": \"Anya\",\n", + " \"end_place_name\": null\n", + " 
},\n", + " {\n", + " \"start_place_name\": \"Willow Creek\",\n", + " \"description\": \"Anya's mother\",\n", + " \"name\": \"Elise\",\n", + " \"end_place_name\": null\n", + " },\n", + " {\n", + " \"start_place_name\": \"Willow Creek\",\n", + " \"name\": \"Edward\",\n", + " \"end_place_name\": null,\n", + " \"description\": \"Anya's father\"\n", + " },\n", + " {\n", + " \"name\": \"Samuel\",\n", + " \"end_place_name\": null,\n", + " \"start_place_name\": \"Willow Creek\",\n", + " \"description\": \"Anya's best friend\"\n", + " },\n", + " {\n", + " \"name\": \"tall, lanky boy\",\n", + " \"description\": \"a boy who warned Anya about the monster\",\n", + " \"start_place_name\": \"Willow Creek\",\n", + " \"end_place_name\": null\n", + " }\n", + " ]\n", + " }\n", + "}\n" + ] + } + ], + "source": [ + "print(json.dumps(type(fc).to_dict(fc), indent=4))" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "4m8FakjCIKmI" + }, + "source": [ + "## Conclusion\n", + "\n", + "While the API can handle structured data extraction problems with pure text input and text output, using Function Calling is likely more reliable since it lets you define a strict schema, and eliminates a potentially error-prone parsing step." ] - }, - "execution_count": 9, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "to_markdown(story)" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "zldoIzn-MuLE" - }, - "source": [ - "## Using Natural language\n", - "\n", - "Large language models are a powerfuls multitask tools. Often you can just ask Gemini for what you want, and it will do okay. \n", - "\n", - "The Gemini API doesn't have a JSON mode, so there are a few things to watch for when generating data structures this way:\n", - "\n", - "- Sometimes parsing fails.\n", - "- The schema can't be strictly enforced.\n", - "\n", - "You'll solve those problems in the next section. 
First, try a simple natural language prompt with the schema written out as text. This has not been optimized:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "eStTMD6VL4pM" - }, - "outputs": [], - "source": [ - "model = model = model = genai.GenerativeModel(\n", - " model_name='gemini-1.0-pro')\n", - "\n", - "response = model.generate_content(textwrap.dedent(\"\"\"\\\n", - " Please return JSON describing the the people, places, things and relationships from this story using the following schema:\n", - "\n", - " {\"people\": list[PERSON], \"places\":list[PLACE], \"things\":list[THING], \"relationships\": list[RELATIONSHIP]}\n", - "\n", - " PERSON = {\"name\": str, \"description\": str, \"start_place_name\": str, \"end_place_name\": str}\n", - " PLACE = {\"name\": str, \"description\": str}\n", - " THING = {\"name\": str, \"description\": str, \"start_place_name\": str, \"end_place_name\": str}\n", - " RELATIONSHIP = {\"person_1_name\": str, \"person_2_name\": str, \"relationship\": str}\n", - "\n", - " All fields are required.\n", - "\n", - " Important: Only return a single piece of valid JSON text.\n", - "\n", - " Here is the story:\n", - "\n", - " \"\"\") + story)\n" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "B0b5zHI3uEBm" - }, - "outputs": [ - { - "data": { - "application/vnd.google.colaboratory.intrinsic+json": { - "type": "string" - }, - "text/plain": [ - "'{\"people\": [{\"name\": \"Anya\", \"description\": \"A young girl who possesses a magical backpack.\", \"start_place_name\": \"Willow Creek\", \"end_place_name\": null}, {\"name\": \"Elise\", \"description\": \"Anya\\'s kind-hearted mother.\", \"start_place_name\": \"Willow Creek\", \"end_place_name\": null}, {\"name\": \"Edward\", \"description\": \"Anya\\'s wise-bearded father.\", \"start_place_name\": \"Willow Creek\", \"end_place_name\": null}, {\"name\": \"Samuel\", \"description\": \"Anya\\'s curious and adventurous 
best friend.\", \"start_place_name\": \"Willow Creek\", \"end_place_name\": null}], \"places\": [{\"name\": \"Willow Creek\", \"description\": \"A quaint town nestled amidst rolling hills and whispering willows.\"}, {\"name\": \"The forest\", \"description\": \"A shadowy place with whispering trees and unseen creatures.\"}], \"things\": [{\"name\": \"Magical backpack\", \"description\": \"A magical backpack that holds an enchanted world filled with wonders.\", \"start_place_name\": \"Anya\\'s grandmother\\'s house\", \"end_place_name\": \"Willow Creek\"}, {\"name\": \"Shimmering sword\", \"description\": \"A sword that Anya uses to defeat the monster.\", \"start_place_name\": \"Anya\\'s backpack\", \"end_place_name\": null}, {\"name\": \"Book of ancient spells\", \"description\": \"A book that contains ancient spells.\", \"start_place_name\": \"Anya\\'s backpack\", \"end_place_name\": null}, {\"name\": \"Tiny compass\", \"description\": \"A compass that always points north.\", \"start_place_name\": \"Anya\\'s backpack\", \"end_place_name\": null}, {\"name\": \"Magical key\", \"description\": \"A key that can open any lock.\", \"start_place_name\": \"Anya\\'s backpack\", \"end_place_name\": null}, {\"name\": \"Monster\", \"description\": \"A massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease.\", \"start_place_name\": \"The forest\", \"end_place_name\": null}], \"relationships\": [{\"person_1_name\": \"Anya\", \"person_2_name\": \"Elise\", \"relationship\": \"Mother-daughter\"}, {\"person_1_name\": \"Anya\", \"person_2_name\": \"Edward\", \"relationship\": \"Father-daughter\"}, {\"person_1_name\": \"Anya\", \"person_2_name\": \"Samuel\", \"relationship\": \"Best friends\"}]}'" - ] - }, - "execution_count": 47, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "response.text" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "ScEdqKq1lhmQ" - }, - "source": [ - "That returned a json 
string. Try parsing it:" - ] - }, - { - "cell_type": "code", - "execution_count": null, - "metadata": { - "id": "xSdj50czL4pM" - }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "{\n", - " \"people\": [\n", - " {\n", - " \"name\": \"Anya\",\n", - " \"description\": \"A young girl who possesses a magical backpack.\",\n", - " \"start_place_name\": \"Willow Creek\",\n", - " \"end_place_name\": null\n", - " },\n", - " {\n", - " \"name\": \"Elise\",\n", - " \"description\": \"Anya's kind-hearted mother.\",\n", - " \"start_place_name\": \"Willow Creek\",\n", - " \"end_place_name\": null\n", - " },\n", - " {\n", - " \"name\": \"Edward\",\n", - " \"description\": \"Anya's wise-bearded father.\",\n", - " \"start_place_name\": \"Willow Creek\",\n", - " \"end_place_name\": null\n", - " },\n", - " {\n", - " \"name\": \"Samuel\",\n", - " \"description\": \"Anya's curious and adventurous best friend.\",\n", - " \"start_place_name\": \"Willow Creek\",\n", - " \"end_place_name\": null\n", - " }\n", - " ],\n", - " \"places\": [\n", - " {\n", - " \"name\": \"Willow Creek\",\n", - " \"description\": \"A quaint town nestled amidst rolling hills and whispering willows.\"\n", - " },\n", - " {\n", - " \"name\": \"The forest\",\n", - " \"description\": \"A shadowy place with whispering trees and unseen creatures.\"\n", - " }\n", - " ],\n", - " \"things\": [\n", - " {\n", - " \"name\": \"Magical backpack\",\n", - " \"description\": \"A magical backpack that holds an enchanted world filled with wonders.\",\n", - " \"start_place_name\": \"Anya's grandmother's house\",\n", - " \"end_place_name\": \"Willow Creek\"\n", - " },\n", - " {\n", - " \"name\": \"Shimmering sword\",\n", - " \"description\": \"A sword that Anya uses to defeat the monster.\",\n", - " \"start_place_name\": \"Anya's backpack\",\n", - " \"end_place_name\": null\n", - " },\n", - " {\n", - " \"name\": \"Book of ancient spells\",\n", - " \"description\": \"A book that contains ancient 
spells.\",\n", - " \"start_place_name\": \"Anya's backpack\",\n", - " \"end_place_name\": null\n", - " },\n", - " {\n", - " \"name\": \"Tiny compass\",\n", - " \"description\": \"A compass that always points north.\",\n", - " \"start_place_name\": \"Anya's backpack\",\n", - " \"end_place_name\": null\n", - " },\n", - " {\n", - " \"name\": \"Magical key\",\n", - " \"description\": \"A key that can open any lock.\",\n", - " \"start_place_name\": \"Anya's backpack\",\n", - " \"end_place_name\": null\n", - " },\n", - " {\n", - " \"name\": \"Monster\",\n", - " \"description\": \"A massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease.\",\n", - " \"start_place_name\": \"The forest\",\n", - " \"end_place_name\": null\n", - " }\n", - " ],\n", - " \"relationships\": [\n", - " {\n", - " \"person_1_name\": \"Anya\",\n", - " \"person_2_name\": \"Elise\",\n", - " \"relationship\": \"Mother-daughter\"\n", - " },\n", - " {\n", - " \"person_1_name\": \"Anya\",\n", - " \"person_2_name\": \"Edward\",\n", - " \"relationship\": \"Father-daughter\"\n", - " },\n", - " {\n", - " \"person_1_name\": \"Anya\",\n", - " \"person_2_name\": \"Samuel\",\n", - " \"relationship\": \"Best friends\"\n", - " }\n", - " ]\n", - "}\n" - ] - } - ], - "source": [ - "import json\n", - "\n", - "json_text = response.text.strip('`\\r\\n ').removeprefix('json')\n", - "print(json.dumps(json.loads(json_text), indent=4))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "TgC_wkHPmkHn" - }, - "source": [ - "That's relatively simple and often works, but you can porentially make this more strict/robust by defining the schema using the API's Function Calling feature." 
- ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "CxMC28LAOfUf" - }, - "source": [ - "## Use Function Calling" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "x-V6PJn83Kh9" - }, - "source": [ - "If you haven't gone through the [Function Calling basics](https://ai.google.dev/tutorials/function_calling_python_quickstart) tutorial yet, make sure you do that first.\n", - "\n", - "With Function Calling your function and its parameters are described to the API as a `glm.FunctionDeclaration`. In basic cases the SDK can build the `FunctionDeclaration` from the function and its annotations. The SDK doesn't currently handle the description of nested `OBJECT` (`dict`) parameters. So you'll need to define them explicitly, for now." - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "k83LZ5MCBfTJ" - }, - "source": [ - "### Define the schema\n", - "\n", - "Start by defining `person` as an object with strting-fields `name`, `description`, `start_place_name`, `end_place_name`." 
- ] - }, - { - "cell_type": "code", - "execution_count": 189, - "metadata": { - "id": "p2efqZA7BAzp" - }, - "outputs": [], - "source": [ - "person = glm.Schema(\n", - " type = glm.Type.OBJECT,\n", - " properties = {\n", - " 'name': glm.Schema(type=glm.Type.STRING),\n", - " 'description': glm.Schema(type=glm.Type.STRING),\n", - " 'start_place_name': glm.Schema(type=glm.Type.STRING),\n", - " 'end_place_name': glm.Schema(type=glm.Type.STRING)\n", - " },\n", - " required=['name', 'description', 'start_place_name', 'end_place_name']\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "HGV1wxx6BCJl" - }, - "source": [ - "Then define people as an `ARRAY` of `person` objects:" - ] - }, - { - "cell_type": "code", - "execution_count": 190, - "metadata": { - "id": "Ur7kzpiA_Dqw" - }, - "outputs": [], - "source": [ - "people = glm.Schema(\n", - " type=glm.Type.ARRAY,\n", - " items=person\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "N6uD63sBBJ3i" - }, - "source": [ - "Then do the same for each of the entities you're trying to extract:" - ] - }, - { - "cell_type": "code", - "execution_count": 191, - "metadata": { - "id": "7wd3jTqj_bVi" - }, - "outputs": [], - "source": [ - "place = glm.Schema(\n", - " type = glm.Type.OBJECT,\n", - " properties = {\n", - " 'name': glm.Schema(type=glm.Type.STRING),\n", - " 'description': glm.Schema(type=glm.Type.STRING),\n", - " }\n", - ")\n", - "\n", - "places = glm.Schema(\n", - " type=glm.Type.ARRAY,\n", - " items=place\n", - ")" - ] - }, - { - "cell_type": "code", - "execution_count": 192, - "metadata": { - "id": "45cLwvCd_vg_" - }, - "outputs": [], - "source": [ - "thing = glm.Schema(\n", - " type = glm.Type.OBJECT,\n", - " properties = {\n", - " 'name': glm.Schema(type=glm.Type.STRING),\n", - " 'description': glm.Schema(type=glm.Type.STRING),\n", - " }\n", - ")\n", - "\n", - "things = glm.Schema(\n", - " type=glm.Type.ARRAY,\n", - " items=thing\n", - ")" - ] - }, - { - "cell_type": "code", 
- "execution_count": 193, - "metadata": { - "id": "8DdVSZJfADDY" - }, - "outputs": [], - "source": [ - "relationship = glm.Schema(\n", - " type = glm.Type.OBJECT,\n", - " properties = {\n", - " 'person_1_name': glm.Schema(type=glm.Type.STRING),\n", - " 'person_2_name': glm.Schema(type=glm.Type.STRING),\n", - " 'relationship': glm.Schema(type=glm.Type.STRING),\n", - " }\n", - ")\n", - "\n", - "relationships = glm.Schema(\n", - " type=glm.Type.ARRAY,\n", - " items=relationship\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "mJwqEUqjBToJ" - }, - "source": [ - "Now build the `FunctionDeclaration`:" - ] - }, - { - "cell_type": "code", - "execution_count": 194, - "metadata": { - "id": "YQkiVCtsPbUy" - }, - "outputs": [], - "source": [ - "add_to_database = glm.FunctionDeclaration(\n", - " name=\"add_to_database\",\n", - " description=textwrap.dedent(\"\"\"\\\n", - " Adds entities to the database.\n", - " \"\"\"),\n", - " parameters=glm.Schema(\n", - " type=glm.Type.OBJECT,\n", - " properties = {\n", - " 'people': people,\n", - " 'places': places,\n", - " 'things': things,\n", - " 'relationships': relationships\n", - " }\n", - " )\n", - ")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "e1_QSwD9Bmy5" - }, - "source": [ - "### Call the API\n", - "\n", - "Like you saw in [Function Calling basics](https://ai.google.dev/tutorials/function_calling_python_quickstart) now you can pass this `FunctionDeclaration` to the `tools` argument of the `genai.GenerativeModel` constructor (the constructor would also accept an equivalent JSON representation of the function declaration):" - ] - }, - { - "cell_type": "code", - "execution_count": 195, - "metadata": { - "id": "5PGAPRDJP4Qx" - }, - "outputs": [], - "source": [ - "model = model = genai.GenerativeModel(\n", - " model_name='gemini-1.0-pro',\n", - " tools = [add_to_database])" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "1uTYW5cVCDST" - }, - "source": [ - "Each time you 
call the API the SDK will send the tools along with your prompt, and the model should call that function you defined:" - ] - }, - { - "cell_type": "code", - "execution_count": 196, - "metadata": { - "id": "bAPA7fNtSUwN" - }, - "outputs": [], - "source": [ - "result = model.generate_content(f\"\"\"\n", - "Please add the people, places, things, and relationships from this story to the database:\n", - "\n", - "{story}\n", - "\"\"\")" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "oSG7r6IBCL7S" - }, - "source": [ - "Now there is no text to parse. The result _is_ a datastructure." - ] - }, - { - "cell_type": "code", - "execution_count": 197, - "metadata": { - "id": "07n3wXzFOZ4x" - }, - "outputs": [ - { - "data": { - "text/plain": [ - "False" - ] - }, - "execution_count": 197, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "'text' in result.candidates[0].content.parts[0]" - ] - }, - { - "cell_type": "code", - "execution_count": 198, - "metadata": { - "id": "i-8hm1HPI5Ce" - }, - "outputs": [ - { - "data": { - "text/plain": [ - "True" - ] - }, - "execution_count": 198, - "metadata": {}, - "output_type": "execute_result" - } - ], - "source": [ - "'function_call' in result.candidates[0].content.parts[0]" - ] - }, - { - "cell_type": "code", - "execution_count": 199, - "metadata": { - "id": "n8BTs6ogDEkq" - }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "\n" - ] } - ], - "source": [ - "fc = result.candidates[0].content.parts[0].function_call\n", - "print(type(fc))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "kILNHmG2IED3" - }, - "source": [ - "The `glm.FunctionCall` class is based on Google Protocol Buffers, convert it to a more familiar JSON compatible object:" - ] - }, - { - "cell_type": "code", - "execution_count": 200, - "metadata": { - "id": "5GKHtT4-F3qa" - }, - "outputs": [ - { - "name": "stdout", - "output_type": "stream", - "text": [ - "{\n", - " \"name\": 
\"add_to_database\",\n", - " \"args\": {\n", - " \"relationships\": [\n", - " {\n", - " \"relationship\": \"mother-daughter\",\n", - " \"person_2_name\": \"Elise\",\n", - " \"person_1_name\": \"Anya\"\n", - " },\n", - " {\n", - " \"person_1_name\": \"Anya\",\n", - " \"relationship\": \"father-daughter\",\n", - " \"person_2_name\": \"Edward\"\n", - " },\n", - " {\n", - " \"relationship\": \"best friends\",\n", - " \"person_1_name\": \"Anya\",\n", - " \"person_2_name\": \"Samuel\"\n", - " }\n", - " ],\n", - " \"places\": [\n", - " {\n", - " \"name\": \"Willow Creek\",\n", - " \"description\": \"a quaint town nestled amidst rolling hills and whispering willows\"\n", - " },\n", - " {\n", - " \"name\": \"forest\",\n", - " \"description\": \"a shadowy place with rustling undergrowth\"\n", - " }\n", - " ],\n", - " \"things\": [\n", - " {\n", - " \"description\": \"a backpack with a shimmering emerald-green fabric and leather straps, containing a magical sword, a book of ancient spells, a tiny compass that always points north, and a magical key that could open any lock\",\n", - " \"name\": \"magical backpack\"\n", - " },\n", - " {\n", - " \"description\": \"a weapon that can defeat monsters\",\n", - " \"name\": \"shimmering sword\"\n", - " },\n", - " {\n", - " \"description\": \"a book containing magical spells\",\n", - " \"name\": \"book of ancient spells\"\n", - " },\n", - " {\n", - " \"name\": \"tiny compass\",\n", - " \"description\": \"a compass that always points north\"\n", - " },\n", - " {\n", - " \"name\": \"magical key\",\n", - " \"description\": \"a key that can open any lock\"\n", - " },\n", - " {\n", - " \"name\": \"monster\",\n", - " \"description\": \"a massive beast with sharp teeth, glowing red eyes, and claws that could crush a human with ease\"\n", - " }\n", - " ],\n", - " \"people\": [\n", - " {\n", - " \"description\": \"a young girl\",\n", - " \"start_place_name\": \"Willow Creek\",\n", - " \"name\": \"Anya\",\n", - " \"end_place_name\": null\n", - " 
},\n", - " {\n", - " \"start_place_name\": \"Willow Creek\",\n", - " \"description\": \"Anya's mother\",\n", - " \"name\": \"Elise\",\n", - " \"end_place_name\": null\n", - " },\n", - " {\n", - " \"start_place_name\": \"Willow Creek\",\n", - " \"name\": \"Edward\",\n", - " \"end_place_name\": null,\n", - " \"description\": \"Anya's father\"\n", - " },\n", - " {\n", - " \"name\": \"Samuel\",\n", - " \"end_place_name\": null,\n", - " \"start_place_name\": \"Willow Creek\",\n", - " \"description\": \"Anya's best friend\"\n", - " },\n", - " {\n", - " \"name\": \"tall, lanky boy\",\n", - " \"description\": \"a boy who warned Anya about the monster\",\n", - " \"start_place_name\": \"Willow Creek\",\n", - " \"end_place_name\": null\n", - " }\n", - " ]\n", - " }\n", - "}\n" - ] + ], + "metadata": { + "colab": { + "name": "structured_data_extraction.ipynb", + "toc_visible": true + }, + "kernelspec": { + "display_name": "Python 3", + "name": "python3" } - ], - "source": [ - "print(json.dumps(type(fc).to_dict(fc), indent=4))" - ] - }, - { - "cell_type": "markdown", - "metadata": { - "id": "4m8FakjCIKmI" - }, - "source": [ - "## Conclusion\n", - "\n", - "While the API can handle structured data extraction problems with pure text input and text output, using Function Calling is likely more reliable since it lets you define a strict schema, and eliminates a potentially error-prone parsing step." 
- ] - } - ], - "metadata": { - "colab": { - "name": "structured_data_extraction.ipynb", - "toc_visible": true - }, - "kernelspec": { - "display_name": "Python 3 (ipykernel)", - "language": "python", - "name": "python3" }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.11.3" - } - }, - "nbformat": 4, - "nbformat_minor": 4 + "nbformat": 4, + "nbformat_minor": 0 } From 4d3014ffff2cd222cd11ac9cacd0d94cc9438de3 Mon Sep 17 00:00:00 2001 From: Mark Daoust Date: Mon, 4 Mar 2024 18:28:00 -0800 Subject: [PATCH 4/4] typo --- site/en/tutorials/structured_data_extraction.ipynb | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/site/en/tutorials/structured_data_extraction.ipynb b/site/en/tutorials/structured_data_extraction.ipynb index f0424e782..e3f210ccf 100644 --- a/site/en/tutorials/structured_data_extraction.ipynb +++ b/site/en/tutorials/structured_data_extraction.ipynb @@ -450,7 +450,7 @@ "id": "TgC_wkHPmkHn" }, "source": [ - "That's relatively simple and often works, but you can porentially make this more strict/robust by defining the schema using the API's Function Calling feature." + "That's relatively simple and often works, but you can potentially make this more strict/robust by defining the schema using the API's function calling feature." 
] }, { @@ -459,7 +459,7 @@ "id": "CxMC28LAOfUf" }, "source": [ - "## Use Function Calling" + "## Use function calling" ] }, { @@ -468,9 +468,9 @@ "id": "x-V6PJn83Kh9" }, "source": [ - "If you haven't gone through the [Function Calling basics](https://ai.google.dev/tutorials/function_calling_python_quickstart) tutorial yet, make sure you do that first.\n", + "If you haven't gone through the [Function calling basics](https://ai.google.dev/tutorials/function_calling_python_quickstart) tutorial yet, make sure you do that first.\n", "\n", - "With Function Calling your function and its parameters are described to the API as a `glm.FunctionDeclaration`. In basic cases the SDK can build the `FunctionDeclaration` from the function and its annotations. The SDK doesn't currently handle the description of nested `OBJECT` (`dict`) parameters. So you'll need to define them explicitly, for now." + "With function calling your function and its parameters are described to the API as a `glm.FunctionDeclaration`. In basic cases the SDK can build the `FunctionDeclaration` from the function and its annotations. The SDK doesn't currently handle the description of nested `OBJECT` (`dict`) parameters. So you'll need to define them explicitly, for now." ] }, { @@ -481,7 +481,7 @@ "source": [ "### Define the schema\n", "\n", - "Start by defining `person` as an object with strting-fields `name`, `description`, `start_place_name`, `end_place_name`." + "Start by defining `person` as an object with string fields `name`, `description`, `start_place_name`, `end_place_name`." 
] }, { @@ -645,7 +645,7 @@ "source": [ "### Call the API\n", "\n", - "Like you saw in [Function Calling basics](https://ai.google.dev/tutorials/function_calling_python_quickstart) now you can pass this `FunctionDeclaration` to the `tools` argument of the `genai.GenerativeModel` constructor (the constructor would also accept an equivalent JSON representation of the function declaration):" + "Like you saw in [Function calling basics](https://ai.google.dev/tutorials/function_calling_python_quickstart) now you can pass this `FunctionDeclaration` to the `tools` argument of the `genai.GenerativeModel` constructor (the constructor would also accept an equivalent JSON representation of the function declaration):" ] }, { @@ -883,7 +883,7 @@ "source": [ "## Conclusion\n", "\n", - "While the API can handle structured data extraction problems with pure text input and text output, using Function Calling is likely more reliable since it lets you define a strict schema, and eliminates a potentially error-prone parsing step." + "While the API can handle structured data extraction problems with pure text input and text output, using function calling is likely more reliable since it lets you define a strict schema, and eliminates a potentially error-prone parsing step." ] } ],