Updated Cookbook: Example for Fetching Scores from Langfuse #857
base: main
Check out this pull request on ReviewNB. See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB
@Sohammhatre10 is attempting to deploy a commit to the langfuse Team on Vercel. A member of the Team first needs to authorize it.
Your Name seems not to be a GitHub user. You need a GitHub account to be able to sign the CLA. If you already have a GitHub account, please add the email address used for this commit to your account. You have signed the CLA already but the status is still pending? Let us recheck it.
👍 Looks good to me! Reviewed everything up to 02ebc24 in 42 seconds
More details
- Looked at 1680 lines of code in 26 files
- Skipped 3 files when reviewing.
- Skipped posting 1 drafted comment based on config settings.
1. pages/docs/integrations/dspy.md:242
- Draft comment: Remove trailing whitespace for cleaner code. This issue is present in multiple files, such as example-javascript.md, example-python-langgraph.md, example-python-instrumentation-module.md, example-python.md, example-vercel-ai.md, external-evaluation-pipelines.md, integration_dspy.md, integration_instructor.md, integration_langgraph.md, integration_llama-index_instrumentation.md, integration_llama_index_posthog_mistral.md, integration_mirascope.md, integration_mistral_sdk.md, integration_ollama.md, integration_openai_structured_output.md, js_integration_langchain.md, js_tracing_example_vercel_ai_sdk.md, prompt_management_langchain.md.
- Reason this comment was not posted: Confidence changes required: 50%
The PR introduces a new example for fetching scores from Langfuse, but there are several instances of trailing whitespace in the markdown files. These should be removed for cleaner code.
Workflow ID: wflow_ON6OiLFA8uvvhNsK
Disclaimer: Experimental PR review
PR Summary
This pull request adds a comprehensive example of using the fetch_scores() function from Langfuse to retrieve and analyze evaluation metrics, integrating UpTrain and Ragas for model evaluation.
- Added pages/docs/scores/example_usage_of_fetch_score.md with detailed code snippets for setting up, evaluating models, logging scores, and visualizing correlations
- Updated pages/guides/cookbook/example_external_evaluation_pipelines.md with a guide on creating external evaluation pipelines using Langfuse, including synthetic data creation and custom evaluations
- Made minor formatting and content improvements across multiple integration cookbooks (DSPy, Instructor, LangGraph, etc.) to enhance readability and consistency
- Updated various LangChain examples to demonstrate better integration with Langfuse for tracing and prompt management
26 file(s) reviewed, 7 comment(s)
Thanks for the contribution. It seems like you mostly want to showcase correlation analysis of different scores in Langfuse (which is a good notebook example). Are you sure that your example correlates the scores on a single-trace basis for the analysis at the bottom of this notebook?
@marcklingen Yupp, this was based on a single trace, and the scores were fetched accordingly. Haven't used any specifics for traces, but this was the first trace I created, so it defaulted to the first trace. Should I add more specificity for a single trace? Apologies for the late reply.
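For reference, a minimal sketch of what correlating scores on a per-trace basis could look like. It is not the notebook's code: the record shape (trace_id, name, value), the helper names, and the sample values are all hypothetical, and the metric names are only illustrative. The point it shows is that two score types need to be paired by trace before correlating, and that a single trace yields only one pair, which is not enough for a correlation.

```python
from collections import defaultdict
from math import sqrt


def pivot_by_trace(records):
    """Group score records into {trace_id: {score_name: value}}.

    If a trace has several scores with the same name, the last one wins.
    """
    table = defaultdict(dict)
    for record in records:
        table[record["trace_id"]][record["name"]] = record["value"]
    return dict(table)


def pearson(xs, ys):
    """Pearson correlation, or None when it is undefined."""
    n = len(xs)
    if n < 2:
        return None  # one pair (e.g. a single trace) cannot give a correlation
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return None  # a constant series has no defined correlation
    return cov / sqrt(var_x * var_y)


def correlate_scores(records, name_a, name_b):
    """Correlate two score names using only traces that carry both."""
    rows = pivot_by_trace(records).values()
    pairs = [(row[name_a], row[name_b]) for row in rows
             if name_a in row and name_b in row]
    xs = [a for a, _ in pairs]
    ys = [b for _, b in pairs]
    return pearson(xs, ys)


# Hypothetical, made-up records standing in for fetched scores.
sample = [
    {"trace_id": "t1", "name": "faithfulness", "value": 0.2},
    {"trace_id": "t1", "name": "context_relevance", "value": 0.3},
    {"trace_id": "t2", "name": "faithfulness", "value": 0.4},
    {"trace_id": "t2", "name": "context_relevance", "value": 0.5},
    {"trace_id": "t3", "name": "faithfulness", "value": 0.6},
    {"trace_id": "t3", "name": "context_relevance", "value": 0.7},
    {"trace_id": "t4", "name": "faithfulness", "value": 0.9},  # no partner score, skipped
]

print(correlate_scores(sample, "faithfulness", "context_relevance"))
```

With scores from only one trace, correlate_scores returns None, which is the situation the question above is probing.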
Description
This update provides an example of using the fetch_scores() function from Langfuse to retrieve evaluation metrics. The example integrates UpTrain and Ragas for model evaluation and demonstrates how to log and fetch scores within Langfuse, as mentioned in langfuse/langfuse#3505.
Key Features
- Evaluation with UpTrain and Ragas:
- Fetching Scores: fetch_scores_from_langfuse.
- Correlation Matrix Visualization:
Important
Adds an example for using Langfuse to fetch scores, evaluate models with UpTrain and Ragas, and visualize results using a correlation matrix.
- fetch_scores() from Langfuse to retrieve evaluation metrics.
- dspy.md, instructor.md, example-javascript.md, example-python-langgraph.md, example-python-instrumentation-module.md, example-python.md, example-vercel-ai.md, example_external_evaluation_pipelines.md, integration_dspy.md, integration_instructor.md, integration_langgraph.md, integration_llama-index_instrumentation.md, integration_llama_index_posthog_mistral.md, integration_mirascope.md, integration_mistral_sdk.md, integration_ollama.md, integration_openai_structured_output.md, example-langchain.md, js_integration_langchain.md, js_tracing_example_vercel_ai_sdk.md, prompt_management_langchain.md.
This description was created for 02ebc24. It will automatically update as commits are pushed.