
Update document formatting for v2 tool use in code examples #229

Open
wants to merge 12 commits into
base: main

Conversation

mrmer1
Contributor

@mrmer1 mrmer1 commented Oct 9, 2024

This PR introduces a new seven-part tutorial for Cohere's API. Each part focuses on a different use case: installation and setup, text generation, chatbots, semantic search, reranking, retrieval-augmented generation (RAG), and agents with tool use. The tutorial is designed to be completed in around 15 minutes.

@ai-yann
Collaborator

ai-yann commented Oct 25, 2024

@mrmer1 please check the merge conflicts.

@ai-yann
Collaborator

ai-yann commented Nov 8, 2024

@mrmer1 This looks great! Could you please also make the following updates:

  1. Add pip install commands as the first code cell in each of the three notebooks
  2. Include specific version numbers for all packages (e.g., pip install pandas==2.1.0) to ensure reproducibility

This will help users run the notebooks successfully in the future without version compatibility issues.
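For workshop settings, a pinned install cell can be paired with a quick check that the running environment actually matches the pins. The following is only a minimal sketch, assuming pins are written as `name==version` strings; `check_pins` and its `get_version` parameter are hypothetical names for illustration and are not part of the notebooks.

```python
from importlib.metadata import PackageNotFoundError, version


def check_pins(pins, get_version=version):
    """Compare 'name==version' pins against installed versions.

    Returns a dict mapping package name to (expected, installed),
    containing only the packages that are missing or mismatched.
    """
    problems = {}
    for pin in pins:
        name, _, expected = pin.partition("==")
        try:
            installed = get_version(name)
        except PackageNotFoundError:
            installed = None  # package is not installed at all
        if installed != expected:
            problems[name] = (expected, installed)
    return problems


# Example pin taken from the comment above; an empty dict means every pin matches.
print(check_pins(["pandas==2.1.0"]))
```

Running this right after the install cell would surface a stale kernel or a silently upgraded dependency before attendees reach the later cells.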

@mrmer1
Contributor Author

mrmer1 commented Nov 14, 2024

@ai-yann updated the install versions

Collaborator

@ai-yann ai-yann left a comment

Thanks for the versioning edits. I was able to run through all the notebooks without issue. I just added a few more comments, putting myself in the shoes of a customer during a workshop. I hope it's not too much.

Collaborator

In addition to the line "You can find the [dataset here](https://github.com/cohere-ai/notebooks/blob/main/notebooks/guides/advanced_rag/spotify_dataset.csv)." in the markdown, could you please also add the following to the Python cell that follows it:

import pandas as pd
import shutil
from pathlib import Path

def setup_spotify_dataset():
    """
    Loads the Spotify dataset and ensures a local copy exists in the current directory.
    Returns the loaded DataFrame.
    """
    # First try to load from current directory
    local_path = Path('spotify_dataset.csv')
    
    if local_path.exists():
        print("Loading Spotify dataset from local directory...")
        return pd.read_csv(local_path)
    
    # If not found locally, try to find in notebooks directory structure
    try:
        current = Path.cwd()
        while current.name != 'notebooks' and current.parent != current:
            current = current.parent
        if current.name != 'notebooks':
            raise RuntimeError("Could not find notebooks directory")
        
        # Original file path
        original_path = current / 'guides' / 'advanced_rag' / 'spotify_dataset.csv'
        
        if not original_path.exists():
            raise FileNotFoundError(f"Dataset not found at {original_path}")
        
        # Copy file to current directory
        print(f"Copying Spotify dataset to local directory ({local_path})...")
        shutil.copy2(original_path, local_path)
        
        # Load and return the data
        return pd.read_csv(local_path)
        
    except (RuntimeError, FileNotFoundError) as e:
        print(f"Error: {e}")
        print("Please ensure the Spotify dataset is available either locally or in the expected directory structure.")
        raise

# Load the dataset
try:
    spotify_data = setup_spotify_dataset()
    print("\nFirst few rows of the dataset:")
    display(spotify_data.head(3))
except Exception as e:
    print(f"Failed to load dataset: {e}")

This just makes it a smoother experience for folks trying to follow along in a group training setting. This way folks don't need to leave the window, download, upload, etc.
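If the notebook is run outside a checkout of the repository (for example in a hosted environment), the local lookup above would find nothing. One possible fallback, sketched here under the assumption that the dataset link in the markdown stays a standard GitHub `blob` URL, is to convert that link to its raw-file form and pass it to `pd.read_csv`, which accepts URLs. `github_blob_to_raw` is a hypothetical helper name, not something in the notebook.

```python
def github_blob_to_raw(blob_url):
    """Convert a github.com 'blob' page URL into its raw-file URL."""
    prefix = "https://github.com/"
    if not blob_url.startswith(prefix):
        raise ValueError(f"Not a github.com URL: {blob_url}")
    # Expected shape after the prefix: owner/repo/blob/branch/path/to/file
    owner, repo, marker, rest = blob_url[len(prefix):].split("/", 3)
    if marker != "blob":
        raise ValueError(f"Not a blob URL: {blob_url}")
    return f"https://raw.githubusercontent.com/{owner}/{repo}/{rest}"


# Dataset link quoted from the markdown cell.
DATASET_URL = (
    "https://github.com/cohere-ai/notebooks/blob/main/"
    "notebooks/guides/advanced_rag/spotify_dataset.csv"
)
print(github_blob_to_raw(DATASET_URL))

# With network access, the result could be loaded directly:
# spotify_data = pd.read_csv(github_blob_to_raw(DATASET_URL))
```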

Collaborator

Are the final results consistent? Can we add one more cell that outputs the final answers in markdown with clickable citations, like we do in the insert_inline_citations method of the notebooks/agents/Vanilla_Tool_Use_v2.ipynb notebook? Right now there is a lot to scroll through, and much of the output looks like this:

Start: 531 | End: 568 | Text: 'Damián Pacheco (twelve-string guitar)'
Sources:
web_search_ra443ajyz6xj:0
web_search_ra443ajyz6xj:2
web_search_ta7g2cd67jrx:0

Something that looks like:

Spotify 2023 Top Songs Analysis

Top 3 Most Streamed Songs

  1. "Flowers" by Miley Cyrus
  2. "Ella Baila Sola" by Eslabon Armado and Peso Pluma
  3. "Shakira: Bzrp Music Sessions, Vol. 53" by Shakira and Bizarrap

Artist Details

Miley Cyrus

Eslabon Armado

Peso Pluma

Shakira

Bizarrap

Methodology

  • Data sourced from Spotify's 2023 streaming statistics
  • Artist information verified through web searches
  • Ages and citizenships confirmed through multiple sources
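To make the idea concrete, here is a rough sketch of turning citation spans like the `Start`/`End`/`Sources` output above into inline markers. It is not the `insert_inline_citations` implementation from the other notebook; `add_citation_markers` is a hypothetical name, and citations are assumed to be plain dicts with `start`, `end`, and `sources` keys rather than SDK objects.

```python
def add_citation_markers(text, citations):
    """Append [n] markers after each cited span.

    Returns (cited_text, source_numbers), where source_numbers maps each
    source id to the number used in the markers.
    """
    # Number sources in the order they first appear in the text.
    source_numbers = {}
    for citation in sorted(citations, key=lambda c: c["start"]):
        for source_id in citation["sources"]:
            source_numbers.setdefault(source_id, len(source_numbers) + 1)

    # Insert from the end of the text backwards so that earlier
    # character offsets remain valid after each insertion.
    cited_text = text
    for citation in sorted(citations, key=lambda c: c["end"], reverse=True):
        markers = "".join(f"[{source_numbers[s]}]" for s in citation["sources"])
        end = citation["end"]
        cited_text = cited_text[:end] + markers + cited_text[end:]
    return cited_text, source_numbers


# Made-up example data, shaped like the citation output shown above.
answer = "Flowers is by Miley Cyrus."
spans = [
    {"start": 0, "end": 7, "sources": ["web_search_a:0"]},
    {"start": 14, "end": 25, "sources": ["web_search_a:0", "web_search_b:1"]},
]
cited, numbers = add_citation_markers(answer, spans)
print(cited)
print(numbers)
```

The numbered sources could then be listed once at the bottom of the cell output instead of being repeated under every span.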

Collaborator

And you can use:

from IPython.display import Markdown
display(Markdown(markdown_response))

Collaborator

Since we go to the trouble of creating a markdown table with the results and inline citations, maybe we can display the table as markdown, by adding:

from IPython.display import Markdown
display(Markdown(cited_text))
print("\n" + list_sources(response.message.citations, source_index))

{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"\n",
"# pip install cohere\n",
"\n",
Collaborator

Can we remove the two newlines at the top of this cell? Just for presentation purposes :)
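If the same cleanup is needed across several notebooks, it could be scripted rather than done by hand. A minimal sketch, assuming the cell is the usual notebook JSON dict whose `source` is a list of lines (or a single string); `strip_leading_blank_lines` is a hypothetical helper, not an existing utility in the repo.

```python
def strip_leading_blank_lines(cell):
    """Return a copy of a notebook cell without blank lines at the top."""
    source = cell.get("source", [])
    if isinstance(source, str):
        # Notebook JSON may store source as one string; split it into lines.
        source = source.splitlines(keepends=True)
    first = 0
    while first < len(source) and source[first].strip() == "":
        first += 1
    return {**cell, "source": source[first:]}


# Cell fragment shaped like the one quoted above.
cell = {
    "cell_type": "code",
    "execution_count": 1,
    "metadata": {},
    "outputs": [],
    "source": ["\n", "# pip install cohere\n", "\n"],
}
print(strip_leading_blank_lines(cell)["source"])
```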

2 participants