
Commit

Update documentation
schorndorfer committed Nov 13, 2023
1 parent 243d013 commit c11d0c8
Showing 30 changed files with 548 additions and 513 deletions.
Binary file modified _images/chatgpt-settings.png
Binary file added _images/openai-api-create-api-key.mp4
Binary file not shown.
Binary file added _images/openai-api-limits.png
95 changes: 40 additions & 55 deletions _sources/augmented-generation.ipynb
@@ -9,7 +9,7 @@
"<font color='purple'>**Retrieval Augmented Generation (RAG)**</font> is a powerful paradigm in natural language processing that combines the strengths of information retrieval and language generation. In the context of the **OpenAI API**, this approach involves retrieving relevant information from a large dataset and using that information to enhance the generation of accurate text. It can be used as an alternative to fine-tuning your models. \n",
"\n",
"### _Definition_\n",
"<font color='purple'>**RAG**</font> is a method that leverages pre-existing knowledge by retrieving pertinent information from a knowledge base and using it to inform the generation of coherent and contextually relevant text. In the OpenAI API, <font color='purple'>**RAG**</font> is exemplified by models that integrate the retrieval of information to augment the output of the language generation process. The phrase <font color='purple'>**Retrieval Augmented Generation**</font> comes from a recent paper by Lewis et al. from Facebook AI (https://research.facebook.com/publications/retrieval-augmented-generation-for-knowledge-intensive-nlp-tasks/). The idea is to use a pre-trained language model (LM) to generate text, but to use a separate retrieval system to find relevant documents to condition the LM on.\n",
"<font color='purple'>**RAG**</font> is a method that leverages pre-existing knowledge by retrieving pertinent information from a knowledge base and using it to inform the generation of coherent and contextually relevant text. The phrase <font color='purple'>**Retrieval Augmented Generation**</font> comes from a recent paper by Lewis et al. from Facebook AI (https://research.facebook.com/publications/retrieval-augmented-generation-for-knowledge-intensive-nlp-tasks/). The idea is to use a pre-trained language model (LM) to generate text, but to use a separate retrieval system to find relevant documents to condition the LM on.\n",
"\n",
"### _How it Works_\n",
"\n",
@@ -35,11 +35,11 @@
"\n",
"- **Code Generation**: In software development, RAG can assist in generating code snippets by retrieving information from programming knowledge bases, ensuring the produced code is accurate and contextually fitting. (example today)\n",
"\n",
"- **Prevent Hallucinations**: Finally, RAG can be used to bring in external knowledge to check whether a GPT response is an hallucination. (example provided)\n",
"- **Prevent Hallucinations**: Finally, RAG can be used to bring in external knowledge to check whether a GPT response is a hallucination. (example provided)\n",
"\n",
"### _Getting Started_\n",
"\n",
"Please install and import libraries."
"Please install and import these libraries."
]
},
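The retrieve-then-generate loop described above can be sketched without calling any external service. The toy example below (all names and documents are hypothetical, and the bag-of-words similarity stands in for real embeddings) ranks documents against the query and prepends the best match to the prompt, which is the essence of the RAG pattern:

```python
import math
from collections import Counter

def cosine(a, b):
    # Bag-of-words cosine similarity between two strings.
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[t] * cb[t] for t in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, docs):
    # Retrieval step: pick the document most similar to the query.
    return max(docs, key=lambda d: cosine(query, d))

def build_rag_prompt(query, docs):
    # Generation step: condition the LM on the retrieved context.
    return f"Context: {retrieve(query, docs)}\nQuestion: {query}\nAnswer:"

docs = [
    "To find an element by class name in Selenium 4 use find_element with By.CLASS_NAME.",
    "FAISS is a library for efficient similarity search over dense vectors.",
]
prompt = build_rag_prompt("How do I find an element by class name?", docs)
```

A production system would swap `cosine` for dense embeddings and a vector index (as the FAISS example later in this section does), but the control flow is the same.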
{
@@ -95,46 +95,43 @@
},
{
"cell_type": "code",
"execution_count": 58,
"execution_count": 13,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Generated Text:\n",
"\n",
"Answer: \n",
"\n",
"The latest version of Python Selenium (3.141.0) uses the same method to find elements by class name as previous versions:\n",
"You can find an element by class name using the find_element_by_class_name() method in the latest version of Python Selenium. An example of this usage is as follows:\n",
"\n",
"driver.find_element_by_class_name(\"some-class\")\n"
"element = driver.find_element_by_class_name(\"class_name\")\n"
]
}
],
"source": [
"# Ask GPT-3 how to find an element by class name in Selenium\n",
"prompt = \"How do I find an element by class name in the latest version of python selenium?\"\n",
"\n",
"# Generate response using GPT-3\n",
"response = openai.Completion.create(\n",
" engine=\"text-davinci-002\", # Choose the appropriate engine\n",
" engine=\"text-davinci-003\", # Choose the appropriate engine\n",
" prompt=prompt,\n",
" max_tokens=100, # Adjust as needed\n",
" temperature=0.7, # Adjust as needed\n",
" max_tokens=100, \n",
" temperature=0.7, \n",
")\n",
"\n",
"# Display the generated text\n",
"generated_text = response[\"choices\"][0][\"text\"]\n",
"print(\"Generated Text:\")\n",
"print(generated_text)"
"print(f\"Answer: {generated_text}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This code for finding elements by class name no longer works in newer version of selenium found here: https://www.selenium.dev/documentation/webdriver/troubleshooting/upgrade_to_selenium_4/"
"This code for finding elements by class name no longer works in newer versions of Selenium: the `find_element_by_*` helpers were removed in Selenium 4 in favor of `driver.find_element(By.CLASS_NAME, ...)`. See the upgrade guide: https://www.selenium.dev/documentation/webdriver/troubleshooting/upgrade_to_selenium_4/"
]
},
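As the upgrade guide explains, Selenium 4 collapses the `find_element_by_*` helpers into a single `find_element(by, value)` call. The sketch below avoids needing a live browser: the locator strings are the actual values behind Selenium's `By` constants, while the `driver` call shown in comments is the pattern the upgrade guide prescribes.

```python
# Selenium 4 pattern (requires a running WebDriver):
#   from selenium.webdriver.common.by import By
#   element = driver.find_element(By.CLASS_NAME, "some-class")
#
# The By constants are plain strings, which these stubs mirror:
BY_CLASS_NAME = "class name"
BY_CSS_SELECTOR = "css selector"

def class_name_locator(name):
    # Build the (strategy, value) pair that find_element() expects.
    return (BY_CLASS_NAME, name)

locator = class_name_locator("some-class")
```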
{
@@ -171,7 +168,7 @@
},
{
"cell_type": "code",
"execution_count": 4,
"execution_count": 6,
"metadata": {},
"outputs": [
{
@@ -201,7 +198,7 @@
"embeddings = OpenAIEmbeddings(openai_api_key=api_key)\n",
"metadata = [{\"source\": url} for _ in range(len(chunks))] # Metadata for each chunk\n",
"\n",
"# Create a FAISS vector store and save it to disk\n",
"# Create a FAISS vector store and save it\n",
"store = FAISS.from_texts(chunks, embeddings, metadatas=metadata)\n",
"faiss.write_index(store.index, \"selenium_docs.index\")\n",
"store.index = None\n",
@@ -218,42 +215,33 @@
},
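The chunk-embed-index step above depends on how the scraped page is split before embedding. A simple fixed-size chunker with overlap (a common default; the sizes here are arbitrary, not taken from the notebook) looks like this:

```python
def chunk_text(text, size=500, overlap=50):
    # Split text into fixed-size chunks; overlapping windows help keep
    # sentences that straddle a boundary retrievable from either chunk.
    if size <= overlap:
        raise ValueError("size must exceed overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + size])
        start += size - overlap
    return chunks

chunks = chunk_text("a" * 1200, size=500, overlap=50)
```

Each resulting chunk is what gets embedded and stored in the FAISS index, with per-chunk metadata (such as the source URL) carried alongside it.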
{
"cell_type": "code",
"execution_count": 60,
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Users/ambreenchaudhri/anaconda3/lib/python3.11/site-packages/langchain/chains/qa_with_sources/vector_db.py:67: UserWarning: `VectorDBQAWithSourcesChain` is deprecated - please use `from langchain.chains import RetrievalQAWithSourcesChain`\n",
" warnings.warn(\n"
]
},
{
"name": "stdout",
"output_type": "stream",
"text": [
"Answer: In the latest version of python selenium, you can find an element by class name using the following methods: driver.findElement(By.className(\"className\")); driver.findElement(By.cssSelector(\".className\")); driver.findElementsByCssSelector(\".className\");\n",
"Answer: In the latest version of python selenium, you can find an element by class name using the following syntax: driver.findElement(By.className(\"className\")) or driver.findElement(By.cssSelector(\".className\")).\n",
"\n"
]
}
],
"source": [
"# Load the FAISS index from disk for Selenium.\n",
"# Load the FAISS index\n",
"index = faiss.read_index(\"selenium_docs.index\") # Assuming the name of the index file is 'selenium_docs.index'\n",
"\n",
"# Load the vector store from disk for Selenium.\n",
"# Load the vector store\n",
"with open(\"selenium_docs.pkl\", \"rb\") as f:\n",
" store = pickle.load(f)\n",
"\n",
"# Merge the index and store for Selenium.\n",
"# Merge the index and store\n",
"store.index = index\n",
"\n",
"# Build the question answering chain for Selenium.\n",
"# Build the question answering chain\n",
"chain = VectorDBQAWithSourcesChain.from_llm(llm=OpenAI(openai_api_key=api_key, temperature=0, max_tokens=1500, model_name='text-davinci-003'), vectorstore=store)\n",
"\n",
"# Ask GPT-3 about the latest version of Selenium.\n",
"#question = \"What is the latest version of Selenium?\"\n",
"# Ask GPT-3 a question\n",
"question = \"How do I find an element by class name in the latest version of python selenium? Show an example.\"\n",
"result = chain({\"question\": question})\n",
"\n",
@@ -267,40 +255,38 @@
"source": [
"### _Example B: Preventing Hallucinations_\n",
"\n",
"Another advantage of using RAG is to feed GPT an external knowledge source to check or prevent hallucinations. An **artificial hallucination** (also called confabulation or delusion) is a response generated by an AI which contains false or misleading information presented as factual. This could be something as innocuous as saying something exists in a file that doesn't. Or it could be an instance with GPT actually provides false information. Here is an example below of Luna, the elephant that walked on the moon. "
"Another advantage of using RAG is to feed GPT an external knowledge source to check or prevent hallucinations. An <font color='purple'>**artificial hallucination**</font> is a response that contains false or misleading information presented as factual. This could be something as innocuous as saying an item exists in a file that doesn't. Or, more seriously, GPT may fabricate facts outright. Here is an example of Ellie, the elephant that walked on the moon. "
]
},
{
"cell_type": "code",
"execution_count": 63,
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Generated Text:\n",
"\n",
"Answer: \n",
"\n",
"The first elephant that landed on the moon was a female elephant named Ellie. She was born in captivity in Africa and was brought to the United States when she was two years old. Ellie became the first elephant to walk on the moon when she was part of the Apollo 11 mission in 1969.\n"
"The first elephant to land on the moon was a female elephant named Ellie. She was born in captivity in Africa and was brought to the United States when she was two years old. Ellie spent the majority of her life performing in circuses and zoos. In 1962, she was sent to the National Zoo in Washington, D.C. where she lived for the rest of her life. Ellie died in 1988 at the age of 36.\n"
]
}
],
"source": [
"prompt = \"Can you tell me more about the first elephant that landed on the moon? \"\n",
"prompt = \"Can you tell me more about the first elephant that landed on the moon?\"\n",
"\n",
"# Generate response using GPT-3\n",
"response = openai.Completion.create(\n",
" engine=\"text-davinci-002\", # Choose the appropriate engine\n",
" engine=\"text-davinci-002\", \n",
" prompt=prompt,\n",
" max_tokens=100, # Adjust as needed\n",
" temperature=0.0, # Adjust as needed\n",
" max_tokens=100, \n",
" temperature=0.0, \n",
")\n",
"\n",
"# Display the generated text\n",
"generated_text = response[\"choices\"][0][\"text\"]\n",
"print(\"Generated Text:\")\n",
"print(generated_text)\n"
"print(f\"Answer: {generated_text}\")\n"
]
},
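Before reaching for a full RAG chain, one lightweight way to flag a possible hallucination is to check how much of the model's answer is actually supported by trusted retrieved text. This token-overlap heuristic (the threshold and example strings are illustrative, not part of the notebook) sketches the idea:

```python
def support_ratio(answer, context):
    # Fraction of answer tokens that also appear in the context.
    answer_tokens = set(answer.lower().split())
    context_tokens = set(context.lower().split())
    if not answer_tokens:
        return 0.0
    return len(answer_tokens & context_tokens) / len(answer_tokens)

def looks_grounded(answer, context, threshold=0.5):
    # Flag answers with little lexical support in the retrieved context.
    return support_ratio(answer, context) >= threshold

context = "No elephant has ever been to the moon. Apollo crews were human astronauts."
answer = "The first elephant to land on the moon was named Ellie."
grounded = looks_grounded(answer, context)
```

A real pipeline would use embedding similarity or an entailment model rather than raw token overlap, but the structure — compare the generation against retrieved evidence — is the same one the RAG chain below applies.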
{
@@ -331,8 +317,8 @@
" {\"role\": \"system\", \"content\": \"Answer the following question the best you can.\"},\n",
" {\"role\": \"user\", \"content\": \"Can you tell me more about the first elephant that landed on the moon?\"}\n",
" ],\n",
" max_tokens=100, # Adjust as needed\n",
" temperature=0.0, # Adjust as needed\n",
" max_tokens=100, \n",
" temperature=0.0, \n",
")\n",
"\n",
"# Display the generated text\n",
@@ -350,7 +336,7 @@
},
{
"cell_type": "code",
"execution_count": 72,
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
@@ -390,21 +376,20 @@
}
],
"source": [
"# Load the FAISS index from disk for Selenium.\n",
"index = faiss.read_index(\"elephant_docs.index\") # Assuming the name of the index file is 'selenium_docs.index'\n",
"# Load the FAISS index\n",
"index = faiss.read_index(\"elephant_docs.index\") \n",
"\n",
"# Load the vector store from disk for Selenium.\n",
"# Load the vector store\n",
"with open(\"elephant_docs.pkl\", \"rb\") as f:\n",
" store = pickle.load(f)\n",
"\n",
"# Merge the index and store for Selenium.\n",
"# Merge the index and store\n",
"store.index = index\n",
"\n",
"# Build the question answering chain for Selenium.\n",
"chain = VectorDBQAWithSourcesChain.from_llm(llm=OpenAI(openai_api_key=api_key, temperature=1.0, max_tokens=100, model_name='text-davinci-002'), vectorstore=store)\n",
"# Build the question answering chain\n",
"chain = VectorDBQAWithSourcesChain.from_llm(llm=OpenAI(openai_api_key=api_key, temperature=0, max_tokens=100, model_name='text-davinci-002'), vectorstore=store)\n",
"\n",
"# Ask GPT-3 about the latest version of Selenium.\n",
"#question = \"What is the latest version of Selenium?\"\n",
"# Ask GPT-3 a question\n",
"question = \"Can you tell me more about the first elephant that landed on the moon?\"\n",
"result = chain({\"question\": question})\n",
"\n",
27 changes: 15 additions & 12 deletions _sources/function-calling.ipynb
@@ -36,7 +36,7 @@
},
{
"cell_type": "code",
"execution_count": 10,
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
@@ -58,7 +58,7 @@
},
{
"cell_type": "code",
"execution_count": 11,
"execution_count": 14,
"metadata": {},
"outputs": [],
"source": [
@@ -68,6 +68,9 @@
"\n",
"# Python code for flight status is adapted from \n",
"# https://www.tutorialspoint.com/get-flight-status-using-python\n",
"#\n",
"# Note: Tracking is available for flights scheduled 3 days before or after today.\n",
"#\n",
"def get_flight_status(airline_code, flight_number, day, month, year):\n",
" def get_data(url):\n",
" response = requests.get(url)\n",
@@ -85,7 +88,7 @@
" item.get_text() for item in soup.find_all(\"div\", class_=\"text-helper__TextHelper-sc-8bko4a-0 kbHzdx\")\n",
" ]\n",
"\n",
" return statuses[0] + \"; Departing at \" + time_statuses[0] + \"; Arriving at \" + time_statuses[2]"
" return str(statuses[0] + \"; Departing at \" + time_statuses[0] + \"; Arriving at \" + time_statuses[2])"
]
},
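When the model decides to call `get_flight_status`, the API returns the function name plus a JSON-encoded `arguments` string; your code must parse that string and dispatch to the real function. This local sketch mimics that round trip with a stubbed flight lookup (the stub's return string is invented for illustration; the real notebook scrapes live flight data):

```python
import json

def get_flight_status(airline_code, flight_number, day, month, year):
    # Stand-in for the real scraper defined above.
    return f"{airline_code}{flight_number} on {year}-{month:02d}-{day:02d}: On time"

# Registry mapping function names the model may request to local callables.
FUNCTIONS = {"get_flight_status": get_flight_status}

def dispatch(function_call):
    # function_call mimics the API shape: a name plus JSON-encoded arguments.
    fn = FUNCTIONS[function_call["name"]]
    args = json.loads(function_call["arguments"])
    return fn(**args)

call = {
    "name": "get_flight_status",
    "arguments": '{"airline_code": "UA", "flight_number": 792, "day": 12, "month": 11, "year": 2023}',
}
result = dispatch(call)
```

The string returned by `dispatch` is what gets fed back to the model in a `"role": "function"` message, as the next cells show.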
{
@@ -97,14 +100,14 @@
},
{
"cell_type": "code",
"execution_count": 12,
"execution_count": 15,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"ChatCompletion(id='chatcmpl-8JSS4JgIRLXMFYAa5qUI4f5ejNd7L', choices=[Choice(finish_reason='function_call', index=0, message=ChatCompletionMessage(content=None, role='assistant', function_call=FunctionCall(arguments='{\\n \"airline_code\": \"UA\",\\n \"flight_number\": 792,\\n \"day\": 9,\\n \"month\": 11,\\n \"year\": 2023\\n}', name='get_flight_status'), tool_calls=None))], created=1699648292, model='gpt-3.5-turbo-0613', object='chat.completion', system_fingerprint=None, usage=CompletionUsage(completion_tokens=48, prompt_tokens=119, total_tokens=167))\n"
"ChatCompletion(id='chatcmpl-8KWcsocsczlpDfw3VRp7DEbrhm88W', choices=[Choice(finish_reason='function_call', index=0, message=ChatCompletionMessage(content=None, role='assistant', function_call=FunctionCall(arguments='{\\n \"airline_code\": \"UA\",\\n \"flight_number\": 792,\\n \"day\": 12,\\n \"month\": 11,\\n \"year\": 2023\\n}', name='get_flight_status'), tool_calls=None))], created=1699902666, model='gpt-3.5-turbo-0613', object='chat.completion', system_fingerprint=None, usage=CompletionUsage(completion_tokens=48, prompt_tokens=119, total_tokens=167))\n"
]
}
],
Expand All @@ -114,7 +117,7 @@
" messages = [\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": \"What is the flight status of UA 792 for Nov 9, 2023?\"\n",
" \"content\": \"What is the flight status of UA 792 for Nov 12, 2023?\"\n",
" }\n",
" ],\n",
" functions = [\n",
@@ -162,7 +165,7 @@
},
{
"cell_type": "code",
"execution_count": 13,
"execution_count": 16,
"metadata": {},
"outputs": [
{
@@ -193,14 +196,14 @@
},
{
"cell_type": "code",
"execution_count": 14,
"execution_count": 17,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"ChatCompletion(id='chatcmpl-8JSS6qstcU6tVKUB3JkZ2yIUbq5KW', choices=[Choice(finish_reason='stop', index=0, message=ChatCompletionMessage(content='The flight status of UA 792 for Nov 9, 2023 is on time. The flight is departing at 06:00 CST and arriving at 09:06 EST.', role='assistant', function_call=None, tool_calls=None))], created=1699648294, model='gpt-3.5-turbo-0613', object='chat.completion', system_fingerprint=None, usage=CompletionUsage(completion_tokens=39, prompt_tokens=150, total_tokens=189))\n"
"ChatCompletion(id='chatcmpl-8KWdNwNEKsRQDoXLJ6cnrYEBnnRau', choices=[Choice(finish_reason='stop', index=0, message=ChatCompletionMessage(content='The flight status of UA 792 for November 12, 2023, is on time. The flight is scheduled to depart at 06:00 CST and arrive at 09:06 EST.', role='assistant', function_call=None, tool_calls=None))], created=1699902697, model='gpt-3.5-turbo-0613', object='chat.completion', system_fingerprint=None, usage=CompletionUsage(completion_tokens=42, prompt_tokens=150, total_tokens=192))\n"
]
}
],
@@ -210,7 +213,7 @@
" messages = [\n",
" {\n",
" \"role\": \"user\",\n",
" \"content\": \"What is the flight status of UA 792 for Nov 9, 2023?\"\n",
" \"content\": \"What is the flight status of UA 792 for Nov 12, 2023?\"\n",
" },\n",
" {\n",
" \"role\": \"function\",\n",
@@ -263,14 +266,14 @@
},
{
"cell_type": "code",
"execution_count": 15,
"execution_count": 18,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The flight status of UA 792 for Nov 9, 2023 is on time. The flight is departing at 06:00 CST and arriving at 09:06 EST.\n"
"The flight status of UA 792 for November 12, 2023, is on time. The flight is scheduled to depart at 06:00 CST and arrive at 09:06 EST.\n"
]
}
],
54 changes: 7 additions & 47 deletions _sources/langchain.ipynb

Large diffs are not rendered by default.

