diff --git a/cookbook/example_usage_of_fetch_scores.ipynb b/cookbook/example_usage_of_fetch_scores.ipynb
new file mode 100644
index 000000000..01834550a
--- /dev/null
+++ b/cookbook/example_usage_of_fetch_scores.ipynb
@@ -0,0 +1,1320 @@
+{
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "## description: This document shows how to retrieve evaluation results logged in Langfuse using the fetch_scores() function. category: Examples"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "---\n",
+ "\n",
+ "# Fetching Scores from Langfuse\n",
+ "\n",
+ "Example: Using UpTrain and Ragas for model evaluation and retrieving the metrics from Langfuse.\n",
+ "\n",
+ "Langfuse makes it easy to log and retrieve model evaluation metrics, so you can analyze and compare different performance measures. In this example, we evaluate model outputs with UpTrain and Ragas, retrieve the evaluation metrics logged to Langfuse using the `fetch_scores()` function, and verify the extracted metrics by comparing them in a correlation matrix.\n",
+ "\n",
+ "**fetch_scores()** accepts the following arguments:\n",
+ " \n",
+ "- `page` (*Optional[int]*): The page number of the scores to return. Defaults to None. \n",
+ "- `limit` (*Optional[int]*): The maximum number of scores to return. Defaults to None. \n",
+ "- `user_id` (*Optional[str]*): A user identifier. Defaults to None. \n",
+ "- `name` (*Optional[str]*): The name of the scores to return. Defaults to None. \n",
+ "- `from_timestamp` (*Optional[dt.datetime]*): Retrieve only scores with a timestamp on or after this datetime. Defaults to None. \n",
+ "- `to_timestamp` (*Optional[dt.datetime]*): Retrieve only scores with a timestamp before this datetime. Defaults to None. \n",
+ "- `source` (*Optional[ScoreSource]*): The source of the scores. Defaults to None. \n",
+ "- `operator` (*Optional[str]*): The comparison operator applied together with `value` to filter scores. Defaults to None. \n",
+ "- `value` (*Optional[float]*): The score value to compare against using `operator`. Defaults to None. \n",
+ "- `score_ids` (*Optional[str]*): The score identifier. Defaults to None. \n",
+ "- `config_id` (*Optional[str]*): The configuration identifier. Defaults to None. \n",
+ "- `data_type` (*Optional[ScoreDataType]*): The data type of the scores. Defaults to None. \n",
+ "- `request_options` (*Optional[RequestOptions]*): Additional request options. Defaults to None. \n",
+ "\n",
+ "The response contains a list of scores along with associated metadata, which is useful for evaluating the performance of different models or experiments. If the request fails, `fetch_scores()` raises an exception that describes what went wrong.\n",
+ "\n",
+ "---"
+ ]
+ },
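+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "As a rough illustration of how the arguments listed above combine, the sketch below filters scores by `name` and a time window and walks the result pages with `page` and `limit`. It is not meant to be run at this point in the notebook: it assumes the Langfuse Python SDK client (`from langfuse import Langfuse`), credentials already set as environment variables (see the next section), pages numbered from 1, and a response that exposes the scores as `.data`. The helper `fetch_all_scores` and the score name `faithfulness` are hypothetical.\n",
+ "\n",
+ "```python\n",
+ "import datetime as dt\n",
+ "\n",
+ "from langfuse import Langfuse\n",
+ "\n",
+ "langfuse = Langfuse()  # reads the LANGFUSE_* environment variables\n",
+ "\n",
+ "\n",
+ "def fetch_all_scores(client, page_size=50, **filters):\n",
+ "    # Hypothetical helper: collect scores page by page until a short page is returned.\n",
+ "    scores, page = [], 1  # assumption: pages are numbered from 1\n",
+ "    while True:\n",
+ "        response = client.fetch_scores(page=page, limit=page_size, **filters)\n",
+ "        scores.extend(response.data)  # assumption: scores are exposed as `.data`\n",
+ "        if len(response.data) < page_size:\n",
+ "            return scores\n",
+ "        page += 1\n",
+ "\n",
+ "\n",
+ "# Only scores named \"faithfulness\" (hypothetical name) from the last 24 hours.\n",
+ "since = dt.datetime.now(dt.timezone.utc) - dt.timedelta(days=1)\n",
+ "recent = fetch_all_scores(langfuse, name=\"faithfulness\", from_timestamp=since)\n",
+ "print(len(recent))\n",
+ "```\n",
+ "\n",
+ "Paging in small batches keeps each request light; wrapping the call in `try`/`except` is a reasonable addition when a failed request should not stop the whole run."
+ ]
+ },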
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### 1. Setting up the environment\n",
+ "\n",
+ "Install the necessary libraries and set up the environment variables."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "collapsed": true,
+ "id": "cY0ndxos4XIV"
+ },
+ "outputs": [],
+ "source": [
+ "!pip install ragas uptrain litellm datasets rouge_score langfuse"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 41,
+ "metadata": {
+ "id": "Hxfc8X0B-Rjd"
+ },
+ "outputs": [],
+ "source": [
+ "import os\n",
+ "# get keys for your project from https://cloud.langfuse.com\n",
+ "os.environ[\"LANGFUSE_PUBLIC_KEY\"] = \"\"\n",
+ "os.environ[\"LANGFUSE_SECRET_KEY\"] = \"\"\n",
+ "# your openai key\n",
+ "os.environ[\"OPENAI_API_KEY\"] = \"\"\n",
+ "\n",
+ "# Your host, defaults to https://cloud.langfuse.com\n",
+ "# For US data region, set to \"https://us.cloud.langfuse.com\"\n",
+ "os.environ[\"LANGFUSE_HOST\"] = \"https://us.cloud.langfuse.com\""
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### 2. Getting the data\n",
+ "\n",
+ "This section demonstrates how to load and prepare a dataset for evaluation. The \"amnesty_qa\" dataset is loaded using the `datasets` library, and a subset of 5 evaluation examples is selected for analysis. The selected data is then converted into a pandas DataFrame for convenient handling and processing."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "UP7-L9Bhdxyx"
+ },
+ "outputs": [],
+ "source": [
+ "from datasets import load_dataset\n",
+ "\n",
+ "amnesty_qa = load_dataset(\"explodinggradients/amnesty_qa\", \"english_v2\")\n",
+ "amnesty_qa_ragas = amnesty_qa[\"eval\"].select(range(5))\n",
+ "amnesty_qa_ragas.to_pandas()"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 6,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 206
+ },
+ "collapsed": true,
+ "id": "wXgbdsp_2d1j",
+ "outputId": "79c3888d-01ef-426a-9b40-d131986cd006"
+ },
+ "outputs": [
+ {
+ "data": {
+ "application/vnd.google.colaboratory.intrinsic+json": {
+ "summary": "{\n \"name\": \"amnesty_qa_df\",\n \"rows\": 5,\n \"fields\": [\n {\n \"column\": \"question\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 5,\n \"samples\": [\n \"Which companies are the main contributors to GHG emissions and their role in global warming according to the Carbon Majors database?\",\n \"What are the recommendations made by Amnesty International to the Special Rapporteur on Human Rights Defenders?\",\n \"Which private companies in the Americas are the largest GHG emitters according to the Carbon Majors database?\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"ground_truth\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 5,\n \"samples\": [\n \"According to the Carbon Majors database, the main contributors to GHG emissions and their role in global warming are fossil fuel companies. These companies, both state-owned and private, have produced almost a trillion tons of GHG emissions in 150 years. The database shows that 100 existing fossil fuel companies, along with eight that no longer exist, are responsible for 71% of all GHG emissions since 1988. In the Americas, the private companies that have contributed the most emissions are ExxonMobil, Chevron, and Peabody, all from the United States. Among state-owned companies in the Americas, the largest emitter is Mexican company Pemex, followed by Venezuelan company Petr\\u00f3leos de Venezuela, S.A. It is important to note that while people with fewer resources, particularly from countries in the global South, do not significantly contribute to climate change, they are the ones most affected by its impacts. Approximately half of the global population lives in areas that are \\\"very vulnerable\\\" to climate change, and it is people with limited development opportunities who face the greatest risks. 
This unequal impact disproportionately affects the human rights of those with fewer resources and greater vulnerability in the context of climate change in the global South. Additionally, between 2010 and 2020, human mortality due to climate disasters was 15 times higher in vulnerable regions and populations.\",\n \"The recommendations made by Amnesty International to the Special Rapporteur on Human Rights Defenders include embedding a focus on child and young HRDs in future work, raising awareness about the differences and challenges they face, incorporating age disaggregated data in reports, and creating safe spaces for engagement.\",\n \"The largest private companies in the Americas that are the largest GHG emitters according to the Carbon Majors database are ExxonMobil, Chevron, and Peabody.\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"answer\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 5,\n \"samples\": [\n \"According to the Carbon Majors database, the main contributors to greenhouse gas (GHG) emissions and their role in global warming are primarily fossil fuel companies. These companies extract, produce, and sell fossil fuels, which when burned release carbon dioxide (CO2) and other GHGs into the atmosphere. The top contributors include:\\n\\n1. Saudi Arabian Oil Company (Saudi Aramco): As the world's largest oil producer, Saudi Aramco is responsible for a significant portion of global GHG emissions. It is estimated to have contributed around 59.26 gigatons of CO2 equivalent (GtCO2e) between 1965 and 2017.\\n\\n2. Chevron Corporation: Chevron is one of the largest oil and gas companies globally and has contributed approximately 43.35 GtCO2e between 1965 and 2017.\\n\\n3. ExxonMobil Corporation: ExxonMobil is another major oil and gas company and has contributed around 41.90 GtCO2e between 1965 and 2017.\\n\\n4. 
BP plc: BP is a multinational oil and gas company and has contributed approximately 34.02 GtCO2e between 1965 and 2017.\\n\\n5. Royal Dutch Shell plc: Shell is one of the largest oil and gas companies globally and has contributed around 31.95 GtCO2e between 1965 and 2017.\\n\\n6. TotalEnergies SE: TotalEnergies is a French multinational oil and gas company and has contributed approximately 31.91 GtCO2e between 1965 and 2017.\\n\\n7. ConocoPhillips Company: ConocoPhillips is an American multinational energy corporation and has contributed around 20.67 GtCO2e between 1965 and 2017.\\n\\n8. PetroChina Company Limited: PetroChina is the largest oil and gas producer in China and has contributed approximately 17.26 GtCO2e between 1965 and 2017.\\n\\n9. Peabody Energy Corporation: Peabody Energy is the largest private-sector coal company globally and has contributed around 15.39 GtCO2e between 1965 and 2017.\\n\\n10. Glencore plc: Glencore is a multinational commodity trading and mining company and has contributed approximately 15.11 GtCO2e between 1965 and 2017.\\n\\nThese companies, along with others in the fossil\",\n \"Amnesty International made several recommendations to the Special Rapporteur on Human Rights Defenders. These recommendations include:\\n\\n1. Urging states to fully implement the UN Declaration on Human Rights Defenders and ensure that national laws and policies are in line with international human rights standards.\\n\\n2. Calling on states to create a safe and enabling environment for human rights defenders, including by adopting legislation that protects defenders from threats, attacks, and reprisals.\\n\\n3. Encouraging states to establish effective mechanisms for the protection of human rights defenders, such as national human rights institutions and specialized units within law enforcement agencies.\\n\\n4. 
Urging states to investigate and hold accountable those responsible for attacks and threats against human rights defenders, ensuring that they are brought to justice in fair trials.\\n\\n5. Calling on states to repeal or amend laws that are used to criminalize or restrict the work of human rights defenders, including laws on defamation, sedition, and counter-terrorism.\\n\\n6. Encouraging states to provide adequate resources and support to human rights defenders, including funding for their work, access to legal assistance, and psychosocial support.\\n\\n7. Urging states to consult and engage with human rights defenders in the development and implementation of laws, policies, and programs that affect their rights and work.\\n\\n8. Calling on states to ensure that human rights defenders are able to access and share information freely, including through the internet and other digital platforms.\\n\\n9. Encouraging states to protect the rights of specific groups of human rights defenders who face particular risks and challenges, such as women defenders, indigenous defenders, and defenders working on issues related to sexual orientation and gender identity.\\n\\n10. Urging states to cooperate with the Special Rapporteur and other UN mechanisms in the promotion and protection of human rights defenders, including by responding to communications and requests for visits.\",\n \"According to the Carbon Majors database, the largest private companies in the Americas that are the largest GHG emitters are:\\n\\n1. Chevron Corporation (United States)\\n2. ExxonMobil Corporation (United States)\\n3. ConocoPhillips Company (United States)\\n4. BP plc (United Kingdom, but with significant operations in the Americas)\\n5. Royal Dutch Shell plc (Netherlands, but with significant operations in the Americas)\\n6. Peabody Energy Corporation (United States)\\n7. Duke Energy Corporation (United States)\\n8. TotalEnergies SE (France, but with significant operations in the Americas)\\n9. 
BHP Group Limited (Australia, but with significant operations in the Americas)\\n10. Rio Tinto Group (United Kingdom/Australia, but with significant operations in the Americas)\\n\\nPlease note that the rankings may change over time as new data becomes available.\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"contexts\",\n \"properties\": {\n \"dtype\": \"object\",\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n }\n ]\n}",
+ "type": "dataframe",
+ "variable_name": "amnesty_qa_df"
+ },
+ "text/html": [
+ "\n",
+ " \n"
+ ],
+ "text/plain": [
+ " question \\\n",
+ "0 What are the global implications of the USA Su... \n",
+ "1 Which companies are the main contributors to G... \n",
+ "2 Which private companies in the Americas are th... \n",
+ "3 What action did Amnesty International urge its... \n",
+ "4 What are the recommendations made by Amnesty I... \n",
+ "\n",
+ " ground_truth \\\n",
+ "0 The global implications of the USA Supreme Cou... \n",
+ "1 According to the Carbon Majors database, the m... \n",
+ "2 The largest private companies in the Americas ... \n",
+ "3 Amnesty International urged its supporters to ... \n",
+ "4 The recommendations made by Amnesty Internatio... \n",
+ "\n",
+ " answer \\\n",
+ "0 The global implications of the USA Supreme Cou... \n",
+ "1 According to the Carbon Majors database, the m... \n",
+ "2 According to the Carbon Majors database, the l... \n",
+ "3 Amnesty International urged its supporters to ... \n",
+ "4 Amnesty International made several recommendat... \n",
+ "\n",
+ " contexts \n",
+ "0 [- In 2022, the USA Supreme Court handed down ... \n",
+ "1 [In recent years, there has been increasing pr... \n",
+ "2 [The issue of greenhouse gas emissions has bec... \n",
+ "3 [In the case of the Ogoni 9, Amnesty Internati... \n",
+ "4 [In recent years, Amnesty International has fo... "
+ ]
+ },
+ "execution_count": 6,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "import pandas as pd\n",
+ "amnesty_qa_df = pd.DataFrame(amnesty_qa[\"eval\"].select(range(5)))"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 7,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 293
+ },
+ "id": "ZwjDqG6l2xqd",
+ "outputId": "b3b42102-5869-4f7b-abfa-cc8b51d655f0"
+ },
+ "outputs": [
+ {
+ "data": {
+ "application/vnd.google.colaboratory.intrinsic+json": {
+ "summary": "{\n \"name\": \"amnesty_qa_df\",\n \"rows\": 5,\n \"fields\": [\n {\n \"column\": \"question\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 5,\n \"samples\": [\n \"Which companies are the main contributors to GHG emissions and their role in global warming according to the Carbon Majors database?\",\n \"What are the recommendations made by Amnesty International to the Special Rapporteur on Human Rights Defenders?\",\n \"Which private companies in the Americas are the largest GHG emitters according to the Carbon Majors database?\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"ground_truth\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 5,\n \"samples\": [\n \"According to the Carbon Majors database, the main contributors to GHG emissions and their role in global warming are fossil fuel companies. These companies, both state-owned and private, have produced almost a trillion tons of GHG emissions in 150 years. The database shows that 100 existing fossil fuel companies, along with eight that no longer exist, are responsible for 71% of all GHG emissions since 1988. In the Americas, the private companies that have contributed the most emissions are ExxonMobil, Chevron, and Peabody, all from the United States. Among state-owned companies in the Americas, the largest emitter is Mexican company Pemex, followed by Venezuelan company Petr\\u00f3leos de Venezuela, S.A. It is important to note that while people with fewer resources, particularly from countries in the global South, do not significantly contribute to climate change, they are the ones most affected by its impacts. Approximately half of the global population lives in areas that are \\\"very vulnerable\\\" to climate change, and it is people with limited development opportunities who face the greatest risks. 
This unequal impact disproportionately affects the human rights of those with fewer resources and greater vulnerability in the context of climate change in the global South. Additionally, between 2010 and 2020, human mortality due to climate disasters was 15 times higher in vulnerable regions and populations.\",\n \"The recommendations made by Amnesty International to the Special Rapporteur on Human Rights Defenders include embedding a focus on child and young HRDs in future work, raising awareness about the differences and challenges they face, incorporating age disaggregated data in reports, and creating safe spaces for engagement.\",\n \"The largest private companies in the Americas that are the largest GHG emitters according to the Carbon Majors database are ExxonMobil, Chevron, and Peabody.\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"answer\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 5,\n \"samples\": [\n \"According to the Carbon Majors database, the main contributors to greenhouse gas (GHG) emissions and their role in global warming are primarily fossil fuel companies. These companies extract, produce, and sell fossil fuels, which when burned release carbon dioxide (CO2) and other GHGs into the atmosphere. The top contributors include:\\n\\n1. Saudi Arabian Oil Company (Saudi Aramco): As the world's largest oil producer, Saudi Aramco is responsible for a significant portion of global GHG emissions. It is estimated to have contributed around 59.26 gigatons of CO2 equivalent (GtCO2e) between 1965 and 2017.\\n\\n2. Chevron Corporation: Chevron is one of the largest oil and gas companies globally and has contributed approximately 43.35 GtCO2e between 1965 and 2017.\\n\\n3. ExxonMobil Corporation: ExxonMobil is another major oil and gas company and has contributed around 41.90 GtCO2e between 1965 and 2017.\\n\\n4. 
BP plc: BP is a multinational oil and gas company and has contributed approximately 34.02 GtCO2e between 1965 and 2017.\\n\\n5. Royal Dutch Shell plc: Shell is one of the largest oil and gas companies globally and has contributed around 31.95 GtCO2e between 1965 and 2017.\\n\\n6. TotalEnergies SE: TotalEnergies is a French multinational oil and gas company and has contributed approximately 31.91 GtCO2e between 1965 and 2017.\\n\\n7. ConocoPhillips Company: ConocoPhillips is an American multinational energy corporation and has contributed around 20.67 GtCO2e between 1965 and 2017.\\n\\n8. PetroChina Company Limited: PetroChina is the largest oil and gas producer in China and has contributed approximately 17.26 GtCO2e between 1965 and 2017.\\n\\n9. Peabody Energy Corporation: Peabody Energy is the largest private-sector coal company globally and has contributed around 15.39 GtCO2e between 1965 and 2017.\\n\\n10. Glencore plc: Glencore is a multinational commodity trading and mining company and has contributed approximately 15.11 GtCO2e between 1965 and 2017.\\n\\nThese companies, along with others in the fossil\",\n \"Amnesty International made several recommendations to the Special Rapporteur on Human Rights Defenders. These recommendations include:\\n\\n1. Urging states to fully implement the UN Declaration on Human Rights Defenders and ensure that national laws and policies are in line with international human rights standards.\\n\\n2. Calling on states to create a safe and enabling environment for human rights defenders, including by adopting legislation that protects defenders from threats, attacks, and reprisals.\\n\\n3. Encouraging states to establish effective mechanisms for the protection of human rights defenders, such as national human rights institutions and specialized units within law enforcement agencies.\\n\\n4. 
Urging states to investigate and hold accountable those responsible for attacks and threats against human rights defenders, ensuring that they are brought to justice in fair trials.\\n\\n5. Calling on states to repeal or amend laws that are used to criminalize or restrict the work of human rights defenders, including laws on defamation, sedition, and counter-terrorism.\\n\\n6. Encouraging states to provide adequate resources and support to human rights defenders, including funding for their work, access to legal assistance, and psychosocial support.\\n\\n7. Urging states to consult and engage with human rights defenders in the development and implementation of laws, policies, and programs that affect their rights and work.\\n\\n8. Calling on states to ensure that human rights defenders are able to access and share information freely, including through the internet and other digital platforms.\\n\\n9. Encouraging states to protect the rights of specific groups of human rights defenders who face particular risks and challenges, such as women defenders, indigenous defenders, and defenders working on issues related to sexual orientation and gender identity.\\n\\n10. Urging states to cooperate with the Special Rapporteur and other UN mechanisms in the promotion and protection of human rights defenders, including by responding to communications and requests for visits.\",\n \"According to the Carbon Majors database, the largest private companies in the Americas that are the largest GHG emitters are:\\n\\n1. Chevron Corporation (United States)\\n2. ExxonMobil Corporation (United States)\\n3. ConocoPhillips Company (United States)\\n4. BP plc (United Kingdom, but with significant operations in the Americas)\\n5. Royal Dutch Shell plc (Netherlands, but with significant operations in the Americas)\\n6. Peabody Energy Corporation (United States)\\n7. Duke Energy Corporation (United States)\\n8. TotalEnergies SE (France, but with significant operations in the Americas)\\n9. 
BHP Group Limited (Australia, but with significant operations in the Americas)\\n10. Rio Tinto Group (United Kingdom/Australia, but with significant operations in the Americas)\\n\\nPlease note that the rankings may change over time as new data becomes available.\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"context\",\n \"properties\": {\n \"dtype\": \"object\",\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n },\n {\n \"column\": \"response\",\n \"properties\": {\n \"dtype\": \"string\",\n \"num_unique_values\": 5,\n \"samples\": [\n \"According to the Carbon Majors database, the main contributors to greenhouse gas (GHG) emissions and their role in global warming are primarily fossil fuel companies. These companies extract, produce, and sell fossil fuels, which when burned release carbon dioxide (CO2) and other GHGs into the atmosphere. The top contributors include:\\n\\n1. Saudi Arabian Oil Company (Saudi Aramco): As the world's largest oil producer, Saudi Aramco is responsible for a significant portion of global GHG emissions. It is estimated to have contributed around 59.26 gigatons of CO2 equivalent (GtCO2e) between 1965 and 2017.\\n\\n2. Chevron Corporation: Chevron is one of the largest oil and gas companies globally and has contributed approximately 43.35 GtCO2e between 1965 and 2017.\\n\\n3. ExxonMobil Corporation: ExxonMobil is another major oil and gas company and has contributed around 41.90 GtCO2e between 1965 and 2017.\\n\\n4. BP plc: BP is a multinational oil and gas company and has contributed approximately 34.02 GtCO2e between 1965 and 2017.\\n\\n5. Royal Dutch Shell plc: Shell is one of the largest oil and gas companies globally and has contributed around 31.95 GtCO2e between 1965 and 2017.\\n\\n6. TotalEnergies SE: TotalEnergies is a French multinational oil and gas company and has contributed approximately 31.91 GtCO2e between 1965 and 2017.\\n\\n7. 
ConocoPhillips Company: ConocoPhillips is an American multinational energy corporation and has contributed around 20.67 GtCO2e between 1965 and 2017.\\n\\n8. PetroChina Company Limited: PetroChina is the largest oil and gas producer in China and has contributed approximately 17.26 GtCO2e between 1965 and 2017.\\n\\n9. Peabody Energy Corporation: Peabody Energy is the largest private-sector coal company globally and has contributed around 15.39 GtCO2e between 1965 and 2017.\\n\\n10. Glencore plc: Glencore is a multinational commodity trading and mining company and has contributed approximately 15.11 GtCO2e between 1965 and 2017.\\n\\nThese companies, along with others in the fossil\",\n \"Amnesty International made several recommendations to the Special Rapporteur on Human Rights Defenders. These recommendations include:\\n\\n1. Urging states to fully implement the UN Declaration on Human Rights Defenders and ensure that national laws and policies are in line with international human rights standards.\\n\\n2. Calling on states to create a safe and enabling environment for human rights defenders, including by adopting legislation that protects defenders from threats, attacks, and reprisals.\\n\\n3. Encouraging states to establish effective mechanisms for the protection of human rights defenders, such as national human rights institutions and specialized units within law enforcement agencies.\\n\\n4. Urging states to investigate and hold accountable those responsible for attacks and threats against human rights defenders, ensuring that they are brought to justice in fair trials.\\n\\n5. Calling on states to repeal or amend laws that are used to criminalize or restrict the work of human rights defenders, including laws on defamation, sedition, and counter-terrorism.\\n\\n6. Encouraging states to provide adequate resources and support to human rights defenders, including funding for their work, access to legal assistance, and psychosocial support.\\n\\n7. 
Urging states to consult and engage with human rights defenders in the development and implementation of laws, policies, and programs that affect their rights and work.\\n\\n8. Calling on states to ensure that human rights defenders are able to access and share information freely, including through the internet and other digital platforms.\\n\\n9. Encouraging states to protect the rights of specific groups of human rights defenders who face particular risks and challenges, such as women defenders, indigenous defenders, and defenders working on issues related to sexual orientation and gender identity.\\n\\n10. Urging states to cooperate with the Special Rapporteur and other UN mechanisms in the promotion and protection of human rights defenders, including by responding to communications and requests for visits.\",\n \"According to the Carbon Majors database, the largest private companies in the Americas that are the largest GHG emitters are:\\n\\n1. Chevron Corporation (United States)\\n2. ExxonMobil Corporation (United States)\\n3. ConocoPhillips Company (United States)\\n4. BP plc (United Kingdom, but with significant operations in the Americas)\\n5. Royal Dutch Shell plc (Netherlands, but with significant operations in the Americas)\\n6. Peabody Energy Corporation (United States)\\n7. Duke Energy Corporation (United States)\\n8. TotalEnergies SE (France, but with significant operations in the Americas)\\n9. BHP Group Limited (Australia, but with significant operations in the Americas)\\n10. Rio Tinto Group (United Kingdom/Australia, but with significant operations in the Americas)\\n\\nPlease note that the rankings may change over time as new data becomes available.\"\n ],\n \"semantic_type\": \"\",\n \"description\": \"\"\n }\n }\n ]\n}",
+ "type": "dataframe",
+ "variable_name": "amnesty_qa_df"
+ },
+ "text/html": [
+ "\n",
+ " \n"
+ ],
+ "text/plain": [
+ " question \\\n",
+ "0 What are the global implications of the USA Su... \n",
+ "1 Which companies are the main contributors to G... \n",
+ "2 Which private companies in the Americas are th... \n",
+ "3 What action did Amnesty International urge its... \n",
+ "4 What are the recommendations made by Amnesty I... \n",
+ "\n",
+ " ground_truth \\\n",
+ "0 The global implications of the USA Supreme Cou... \n",
+ "1 According to the Carbon Majors database, the m... \n",
+ "2 The largest private companies in the Americas ... \n",
+ "3 Amnesty International urged its supporters to ... \n",
+ "4 The recommendations made by Amnesty Internatio... \n",
+ "\n",
+ " answer \\\n",
+ "0 The global implications of the USA Supreme Cou... \n",
+ "1 According to the Carbon Majors database, the m... \n",
+ "2 According to the Carbon Majors database, the l... \n",
+ "3 Amnesty International urged its supporters to ... \n",
+ "4 Amnesty International made several recommendat... \n",
+ "\n",
+ " context \\\n",
+ "0 [- In 2022, the USA Supreme Court handed down ... \n",
+ "1 [In recent years, there has been increasing pr... \n",
+ "2 [The issue of greenhouse gas emissions has bec... \n",
+ "3 [In the case of the Ogoni 9, Amnesty Internati... \n",
+ "4 [In recent years, Amnesty International has fo... \n",
+ "\n",
+ " response \n",
+ "0 The global implications of the USA Supreme Cou... \n",
+ "1 According to the Carbon Majors database, the m... \n",
+ "2 According to the Carbon Majors database, the l... \n",
+ "3 Amnesty International urged its supporters to ... \n",
+ "4 Amnesty International made several recommendat... "
+ ]
+ },
+ "execution_count": 7,
+ "metadata": {},
+ "output_type": "execute_result"
+ }
+ ],
+ "source": [
+ "amnesty_qa_df['response'] = amnesty_qa_df['answer']\n",
+ "amnesty_qa_df.rename(columns={'contexts':'context'}, inplace=True)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### 3. Evaluation with UpTrain\n",
+ "\n",
+ "This code demonstrates how to evaluate a dataset using UpTrain's `EvalLLM` class. An instance of `EvalLLM` is created using the OpenAI API key. The `evaluate` function assesses the `amnesty_qa_df` DataFrame against three evaluation criteria: context relevance, factual accuracy, and response completeness. The evaluation results are stored in a new DataFrame, which is then printed and optionally saved as a CSV file. Finally, the function is called in the main block to run the evaluation and store the results. For a more detailed walkthrough, see the [UpTrain evaluation cookbook](https://langfuse.com/guides/cookbook/evaluation_with_uptrain)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 28,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "gb0_o8jWIIoO",
+ "outputId": "8edba767-f4b6-4b01-ee2f-17e73a629ead"
+ },
+ "outputs": [
+ {
+ "name": "stderr",
+ "output_type": "stream",
+ "text": [
+ "100%|██████████| 5/5 [00:01<00:00, 3.19it/s]\n",
+ "100%|██████████| 5/5 [00:02<00:00, 2.01it/s]\n",
+ "100%|██████████| 5/5 [00:06<00:00, 1.30s/it]\n",
+ "100%|██████████| 5/5 [00:02<00:00, 2.25it/s]\n",
+ "\u001b[32m2024-10-13 16:50:32.097\u001b[0m | \u001b[1mINFO \u001b[0m | \u001b[36muptrain.framework.evalllm\u001b[0m:\u001b[36mevaluate\u001b[0m:\u001b[36m376\u001b[0m - \u001b[1mLocal server not running, start the server to log data and visualize in the dashboard!\u001b[0m\n"
+ ]
+ },
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ " question \\\n",
+ "0 What are the global implications of the USA Su... \n",
+ "1 Which companies are the main contributors to G... \n",
+ "2 Which private companies in the Americas are th... \n",
+ "3 What action did Amnesty International urge its... \n",
+ "4 What are the recommendations made by Amnesty I... \n",
+ "\n",
+ " ground_truth \\\n",
+ "0 The global implications of the USA Supreme Cou... \n",
+ "1 According to the Carbon Majors database, the m... \n",
+ "2 The largest private companies in the Americas ... \n",
+ "3 Amnesty International urged its supporters to ... \n",
+ "4 The recommendations made by Amnesty Internatio... \n",
+ "\n",
+ " answer \\\n",
+ "0 The global implications of the USA Supreme Cou... \n",
+ "1 According to the Carbon Majors database, the m... \n",
+ "2 According to the Carbon Majors database, the l... \n",
+ "3 Amnesty International urged its supporters to ... \n",
+ "4 Amnesty International made several recommendat... \n",
+ "\n",
+ " context \\\n",
+ "0 [- In 2022, the USA Supreme Court handed down ... \n",
+ "1 [In recent years, there has been increasing pr... \n",
+ "2 [The issue of greenhouse gas emissions has bec... \n",
+ "3 [In the case of the Ogoni 9, Amnesty Internati... \n",
+ "4 [In recent years, Amnesty International has fo... \n",
+ "\n",
+ " response score_context_relevance \\\n",
+ "0 The global implications of the USA Supreme Cou... 1.0 \n",
+ "1 According to the Carbon Majors database, the m... 1.0 \n",
+ "2 According to the Carbon Majors database, the l... 1.0 \n",
+ "3 Amnesty International urged its supporters to ... 1.0 \n",
+ "4 Amnesty International made several recommendat... 1.0 \n",
+ "\n",
+ " explanation_context_relevance score_factual_accuracy \\\n",
+ "0 {\\n \"Reasoning\": \"The extracted context con... 1.0 \n",
+ "1 {\\n \"Reasoning\": \"The given context provide... 0.6 \n",
+ "2 {\\n \"Reasoning\": \"The extracted context pro... 0.4 \n",
+ "3 {\\n \"Reasoning\": \"The given context contain... 0.8 \n",
+ "4 {\\n \"Reasoning\": \"The extracted context con... 0.6 \n",
+ "\n",
+ " explanation_factual_accuracy \\\n",
+ "0 {\\n \"Result\": [\\n {\\n \"Fa... \n",
+ "1 {\\n \"Result\": [\\n {\\n \"Fa... \n",
+ "2 {\\n \"Result\": [\\n {\\n \"Fa... \n",
+ "3 {\\n \"Result\": [\\n {\\n \"Fa... \n",
+ "4 {\\n \"Result\": [\\n {\\n \"Fa... \n",
+ "\n",
+ " score_response_completeness \\\n",
+ "0 1.0 \n",
+ "1 1.0 \n",
+ "2 1.0 \n",
+ "3 1.0 \n",
+ "4 1.0 \n",
+ "\n",
+ " explanation_response_completeness \n",
+ "0 {\\n \"Reasoning\": \"The given response is com... \n",
+ "1 {\\n \"Reasoning\": \"The given response is com... \n",
+ "2 {\\n \"Reasoning\": \"The given response is com... \n",
+ "3 {\\n \"Reasoning\": \"The given response is com... \n",
+ "4 {\\n \"Reasoning\": \"The given response is com... \n"
+ ]
+ }
+ ],
+ "source": [
+ "import os\n",
+ "import json\n",
+ "import pandas as pd\n",
+ "from uptrain import EvalLLM, Evals\n",
+ "\n",
+ "OPENAI_API_KEY = os.getenv('OPENAI_API_KEY')\n",
+ "eval_llm = EvalLLM(openai_api_key=OPENAI_API_KEY)\n",
+ "\n",
+ "def evaluate():\n",
+ "    # Evaluate the data with UpTrain on the three selected checks\n",
+ "    results = eval_llm.evaluate(\n",
+ "        data=amnesty_qa_df,\n",
+ "        checks=[Evals.CONTEXT_RELEVANCE, Evals.FACTUAL_ACCURACY, Evals.RESPONSE_COMPLETENESS]\n",
+ "    )\n",
+ "\n",
+ "    # Convert the results to a DataFrame\n",
+ "    results_df = pd.DataFrame(results)\n",
+ "\n",
+ "    # Print the DataFrame\n",
+ "    print(results_df)\n",
+ "\n",
+ "    # Save the DataFrame to a CSV file\n",
+ "    results_df.to_csv('evaluation_results.csv', index=False)\n",
+ "\n",
+ "    return results_df\n",
+ "\n",
+ "# Call the function and store results in a DataFrame\n",
+ "if __name__ == \"__main__\":\n",
+ "    uptrain_df = evaluate()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### 4. Evaluation with Ragas\n",
+ "\n",
+ "Ragas's `evaluate` function is called with the first five rows of the `eval` split and a list of metrics: context precision, faithfulness, and answer relevancy. The results are then converted into a pandas DataFrame for easier analysis. This lets users assess the quality of model responses against specific criteria. For more detail on evaluating RAG pipelines with Ragas, see the [Ragas evaluation cookbook](https://langfuse.com/guides/cookbook/evaluation_of_rag_with_ragas)."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": null,
+ "metadata": {
+ "id": "dfRCTHEauMcK"
+ },
+ "outputs": [],
+ "source": [
+ "import json\n",
+ "from ragas import evaluate\n",
+ "from ragas.metrics import (\n",
+ " answer_relevancy,\n",
+ " faithfulness,\n",
+ " context_precision,\n",
+ ")\n",
+ "\n",
+ "ragas_result = evaluate(\n",
+ " amnesty_qa[\"eval\"].select(range(5)),\n",
+ " metrics=[\n",
+ " context_precision,\n",
+ " faithfulness,\n",
+ " answer_relevancy,\n",
+ " ],\n",
+ ")\n",
+ "\n",
+ "ragas_df = ragas_result.to_pandas()"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### 5. Setting Up Langfuse Client\n",
+ "\n",
+ "This code snippet initializes a Langfuse client using the `Langfuse` class. The client is configured with a secret key, a public key, and a host URL, read from the environment variables `LANGFUSE_SECRET_KEY`, `LANGFUSE_PUBLIC_KEY`, and `LANGFUSE_HOST`. The client is used in the following sections to log evaluation scores to Langfuse and to fetch them back."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 42,
+ "metadata": {
+ "id": "NwExqBSEinBB"
+ },
+ "outputs": [],
+ "source": [
+ "from langfuse import Langfuse\n",
+ "langfuse_client = Langfuse(\n",
+ " secret_key=os.environ.get(\"LANGFUSE_SECRET_KEY\"),\n",
+ " public_key=os.environ.get(\"LANGFUSE_PUBLIC_KEY\"),\n",
+ " host = os.environ.get(\"LANGFUSE_HOST\")\n",
+ ")"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### 6. Logging Evaluation Scores to Langfuse\n",
+ "\n",
+ "The functions `log_uptrain_scores_to_langfuse` and `log_ragas_scores_to_langfuse` log evaluation scores from the UpTrain and Ragas frameworks into Langfuse. Each function iterates through its respective DataFrame, extracting relevant score columns and logging them with `langfuse_client.score`, using a unique ID for each entry.\n",
+ "\n",
+ "Scores in Langfuse are objects for storing evaluation metrics, linked to traces and, optionally, to observations. Each score can carry attributes such as a name, a value, a trace ID, and a configuration ID; the configuration ID ties the score to a schema that it must comply with. This structure enables consistent analysis of evaluation metrics within the Langfuse platform.\n",
+ "\n",
+ "#### Key Attributes of a Score Object:\n",
+ "- **name**: Name of the score (e.g., user_feedback).\n",
+ "- **value**: Numeric value of the score.\n",
+ "- **traceId**: ID of the related trace.\n",
+ "- **id**: Unique identifier for the score.\n",
+ "\n",
+ "Using scores effectively allows for quick overviews of evaluations, segmentation of traces by quality, and detailed reporting across use cases. Score schemas can be defined to standardize metrics for consistency and comparability in analysis."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 24,
+ "metadata": {
+ "id": "PSw4ocNHrOOk"
+ },
+ "outputs": [],
+ "source": [
+ "def log_uptrain_scores_to_langfuse(uptrain_df):\n",
+ "    \"\"\"Log evaluation scores to Langfuse.\"\"\"\n",
+ "    score_columns = ['score_factual_accuracy', 'score_context_relevance', 'score_response_completeness']\n",
+ "    for index, row in uptrain_df.iterrows():\n",
+ "        for score_name in score_columns:\n",
+ "            score_value = row[score_name]\n",
+ "            langfuse_client.score(id=f\"Uptrain_{index}_{score_name}\", value=score_value, name=score_name)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 35,
+ "metadata": {
+ "id": "dN1YgjgdwLFn"
+ },
+ "outputs": [],
+ "source": [
+ "def log_ragas_scores_to_langfuse(ragas_df):\n",
+ "    score_columns = ['context_precision', 'faithfulness', 'answer_relevancy']\n",
+ "\n",
+ "    for index, row in ragas_df.iterrows():\n",
+ "        for score_name in score_columns:\n",
+ "            score_value = row[score_name]\n",
+ "            langfuse_client.score(id=f\"Ragas_{index}_{score_name}\", value=score_value, name=score_name)"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 43,
+ "metadata": {
+ "id": "MK9w1bFgyG1F"
+ },
+ "outputs": [],
+ "source": [
+ "log_ragas_scores_to_langfuse(ragas_df)\n",
+ "log_uptrain_scores_to_langfuse(uptrain_df)"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### 7. Fetching Scores from Langfuse\n",
+ "\n",
+ "The `fetch_scores_from_langfuse` helper retrieves the scores logged in Langfuse under a given score name. It calls the Langfuse client's `fetch_scores` method with the `name` filter and returns the response unchanged.\n",
+ "\n",
+ "The `fetch_scores` method itself accepts further optional parameters. Pagination is controlled with the page number and the limit on the number of scores returned, which makes large result sets easier to handle.\n",
+ "\n",
+ "Beyond pagination, `fetch_scores` supports filtering by user identifier, timestamp range, and score source, so you can fetch scores recorded for specific users or during a certain time frame. Scores can also be filtered by value or by specific configurations. The helper defined here passes only `name`; extend it if you need the other filters.\n",
+ "\n",
+ "The call returns a `FetchScoresResponse`, which includes the list of scores (in its `data` attribute) along with metadata about the scores retrieved."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 45,
+ "metadata": {
+ "id": "Lgd0Xz2Bvo9V"
+ },
+ "outputs": [],
+ "source": [
+ "def fetch_scores_from_langfuse(score_name):\n",
+ "    \"\"\"Fetch scores from Langfuse based on score name.\"\"\"\n",
+ "    # Fetch scores for the specified name from Langfuse\n",
+ "    scores_fetched = langfuse_client.fetch_scores(name=score_name)\n",
+ "    return scores_fetched"
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 75,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/"
+ },
+ "id": "yIxYZ0vjPlkI",
+ "outputId": "1d115955-a31b-488e-9d25-b27f366b72eb"
+ },
+ "outputs": [
+ {
+ "name": "stdout",
+ "output_type": "stream",
+ "text": [
+ "[Score_Numeric(value=1.0, id='Uptrain_4_score_context_relevance', trace_id='95ad7bdd-b93b-4905-a865-938f346871bd', name='score_context_relevance', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 25, 177000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 25, 177000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 25, 177000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=1.0, id='Uptrain_3_score_context_relevance', trace_id='f9b43538-77b6-478f-a5d9-c2be3b4cdada', name='score_context_relevance', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 897000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 897000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 897000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=1.0, id='Uptrain_2_score_context_relevance', trace_id='02185905-be84-41d9-9b64-b02fb45704f3', name='score_context_relevance', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 614000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 614000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 614000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=1.0, id='Uptrain_1_score_context_relevance', trace_id='b68fc2e6-e6a0-489b-becc-5441d9f1dd4e', name='score_context_relevance', source=, 
observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 326000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 326000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 326000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=1.0, id='Uptrain_0_score_context_relevance', trace_id='75bd20ac-3a34-4fa0-b74a-0fb7a454bfa1', name='score_context_relevance', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 46000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 46000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 46000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8')]\n",
+ "[Score_Numeric(value=0.6, id='Uptrain_4_score_factual_accuracy', trace_id='e5ad0a8e-3c20-4dc8-ba19-1f11f224ebbf', name='score_factual_accuracy', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 25, 84000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 25, 84000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 25, 84000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=0.8, id='Uptrain_3_score_factual_accuracy', trace_id='2ed536e7-a583-401c-b3e9-1227985875c1', name='score_factual_accuracy', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 804000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 804000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 804000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=0.4, id='Uptrain_2_score_factual_accuracy', trace_id='8552536a-70ae-4678-a789-c0af61d3a436', name='score_factual_accuracy', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 517000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 517000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 517000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=0.6, id='Uptrain_1_score_factual_accuracy', trace_id='812d7ae7-f2bf-4251-9784-9ee248b469d7', name='score_factual_accuracy', source=, 
observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 231000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 231000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 231000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=1.0, id='Uptrain_0_score_factual_accuracy', trace_id='f4135b5b-d20a-4741-b777-186d37d1fa52', name='score_factual_accuracy', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 23, 954000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 23, 954000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 23, 954000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8')]\n",
+ "[Score_Numeric(value=1.0, id='Uptrain_4_score_response_completeness', trace_id='1a54b4e2-3e2c-4235-801b-b56153c8e293', name='score_response_completeness', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 25, 271000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 25, 271000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 25, 271000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=1.0, id='Uptrain_3_score_response_completeness', trace_id='ce78dce7-f4bd-45a4-b69c-f31fd6258565', name='score_response_completeness', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 990000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 990000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 990000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=1.0, id='Uptrain_2_score_response_completeness', trace_id='103927f0-dd9f-4d94-95d6-a4a6fce3898d', name='score_response_completeness', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 709000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 709000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 709000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=1.0, id='Uptrain_1_score_response_completeness', trace_id='6e7ae4f6-aca0-4152-b299-5b1ae06bd7e9', 
name='score_response_completeness', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 423000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 423000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 423000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=1.0, id='Uptrain_0_score_response_completeness', trace_id='3c100175-8e20-4d1f-ab1b-a7e4dc870cac', name='score_response_completeness', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 138000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 138000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 138000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8')]\n",
+ "[Score_Numeric(value=0.9999999999666667, id='Ragas_4_context_precision', trace_id='1441c394-fc54-42f3-a798-7ab1b338748c', name='context_precision', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 25, 207000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 25, 207000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 25, 207000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=0.99999999995, id='Ragas_3_context_precision', trace_id='a91146c0-09d4-4039-828d-adf308d09dd8', name='context_precision', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 927000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 927000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 927000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=0.8333333332916666, id='Ragas_2_context_precision', trace_id='16bf0af8-b988-44d0-a9c5-35a0ffa69ffd', name='context_precision', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 643000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 643000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 643000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=0.9999999999666667, id='Ragas_1_context_precision', trace_id='976e6974-f6d7-4ff0-b961-5653ae58e9ef', name='context_precision', source=, 
observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 310000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 310000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 310000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=0.9999999999666667, id='Ragas_0_context_precision', trace_id='4e0edb60-c6b1-452d-ae58-ce7449dc3f47', name='context_precision', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 23, 798000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 23, 798000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 23, 798000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8')]\n",
+ "[Score_Numeric(value=0.1428571428571428, id='Ragas_4_faithfulness', trace_id='8c3f995f-bc00-4935-90e5-069478987ce3', name='faithfulness', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 25, 300000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 25, 300000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 25, 300000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=0.2, id='Ragas_3_faithfulness', trace_id='424fddad-f617-491a-9816-d9642f33d0e6', name='faithfulness', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 25, 19000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 25, 19000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 25, 19000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=0.0, id='Ragas_2_faithfulness', trace_id='c7b7e4a1-ab80-4951-ae16-293265970dc3', name='faithfulness', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 740000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 740000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 740000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=0.12, id='Ragas_1_faithfulness', trace_id='77a2d6ae-b840-454f-b4e3-52edb8909bcb', name='faithfulness', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 456000, 
tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 456000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 456000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=1.0, id='Ragas_0_faithfulness', trace_id='8f61a293-836f-4cc9-84f9-996c19c42620', name='faithfulness', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 23, 894000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 23, 894000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 23, 894000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8')]\n",
+ "[Score_Numeric(value=0.9891308706741455, id='Ragas_4_answer_relevancy', trace_id='21a3c662-a494-4029-b95a-8fd25f90a8c6', name='answer_relevancy', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 25, 398000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 25, 398000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 25, 398000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=0.9795341682836177, id='Ragas_3_answer_relevancy', trace_id='f398dd78-ccdd-423c-9662-92ff548183e7', name='answer_relevancy', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 25, 114000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 25, 114000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 25, 114000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=0.9916994382653276, id='Ragas_2_answer_relevancy', trace_id='65d48c73-2fbd-4577-bec9-7a46858e0a6a', name='answer_relevancy', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 834000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 834000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 834000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=0.9652149513821247, id='Ragas_1_answer_relevancy', trace_id='116c5ac3-7931-471b-83eb-da6c91725621', name='answer_relevancy', source=, 
observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 550000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 550000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 550000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=1.0, id='Ragas_0_answer_relevancy', trace_id='e7642418-7f1f-4c4f-8480-06dd8c276fbd', name='answer_relevancy', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 59000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 59000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 59000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8')]\n"
+ ]
+ }
+ ],
+ "source": [
+ "score_columns = ['score_context_relevance', 'score_factual_accuracy', 'score_response_completeness', 'context_precision', 'faithfulness', 'answer_relevancy']\n",
+ "\n",
+ "scores_df = pd.DataFrame(columns=score_columns)\n",
+ "\n",
+ "for score_name in score_columns:\n",
+ "    # Use a distinct variable name so the response is not confused with the fetch_scores method\n",
+ "    fetched = fetch_scores_from_langfuse(score_name)\n",
+ "    print(fetched.data)\n",
+ "    # The scores are returned newest first here, so reverse them to restore the original row order\n",
+ "    scores_df[score_name] = [score.value for score in fetched.data[::-1]]"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### 8. Creating a Correlation Heatmap\n",
+ "\n",
+ "This section visualizes the correlation between the fetched evaluation scores as a heatmap. The correlation matrix covers two sets of scores: UpTrain scores (`'score_context_relevance'`, `'score_factual_accuracy'`, and `'score_response_completeness'`) and Ragas scores (`'context_precision'`, `'faithfulness'`, and `'answer_relevancy'`).\n",
+ "\n",
+ "1. **Calculate the Correlation Matrix**: The `corr()` function computes pairwise correlation coefficients between the score columns in the `scores_df` DataFrame, indicating the strength and direction of each relationship. Columns with a constant value (here `score_context_relevance` and `score_response_completeness`, which are 1.0 for every row) have zero variance, so their correlations are undefined and appear as NaN.\n",
+ "\n",
+ "2. **Create and Customize the Heatmap**: A heatmap is generated using Matplotlib and Seaborn and annotated with the correlation coefficients. The color scale is fixed to the range -1 to 1, centered at 0, and uses Seaborn's `crest` colormap. The layout is adjusted with `tight_layout()` for clarity.\n",
+ "\n",
+ "This visualization helps identify patterns across the evaluation metrics retrieved with `fetch_scores()`."
+ ]
+ },
+ {
+ "cell_type": "code",
+ "execution_count": 83,
+ "metadata": {
+ "colab": {
+ "base_uri": "https://localhost:8080/",
+ "height": 807
+ },
+ "id": "FqNgHsA-W0m8",
+ "outputId": "36e3014a-009e-4c76-ca98-c42beeff982c"
+ },
+ "outputs": [
+ {
+ "data": {
+ "image/png": "\n",
+ "text/plain": [
+ "
"
+ ]
+ },
+ "metadata": {},
+ "output_type": "display_data"
+ }
+ ],
+ "source": [
+ "import matplotlib.pyplot as plt\n",
+ "import seaborn as sns\n",
+ "\n",
+ "corr_matrix = scores_df.corr()\n",
+ "\n",
+ "# Create a heatmap of the correlation matrix\n",
+ "plt.figure(figsize=(10, 8))\n",
+ "sns.heatmap(corr_matrix, annot=True, vmin=-1, vmax=1, center=0, linewidths=.5, linecolor='white', cmap='crest')\n",
+ "plt.title('Correlation Matrix of Six Scores')\n",
+ "plt.tight_layout()"
+ ]
+ },
+ {
+ "attachments": {
+ "%7B283F9496-4034-464B-9F93-DEA587D37A5B%7D.png": {
+ "image/png": ""
+ }
+ },
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "![%7B283F9496-4034-464B-9F93-DEA587D37A5B%7D.png](attachment:%7B283F9496-4034-464B-9F93-DEA587D37A5B%7D.png)"
+ ]
+ }
+ ],
+ "metadata": {
+ "colab": {
+ "provenance": []
+ },
+ "kernelspec": {
+ "display_name": "Python 3 (ipykernel)",
+ "language": "python",
+ "name": "python3"
+ },
+ "language_info": {
+ "codemirror_mode": {
+ "name": "ipython",
+ "version": 3
+ },
+ "file_extension": ".py",
+ "mimetype": "text/x-python",
+ "name": "python",
+ "nbconvert_exporter": "python",
+ "pygments_lexer": "ipython3",
+ "version": "3.11.5"
+ }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 1
+}
diff --git a/pages/docs/integrations/dspy.md b/pages/docs/integrations/dspy.md
index e62d71148..1968672ed 100644
--- a/pages/docs/integrations/dspy.md
+++ b/pages/docs/integrations/dspy.md
@@ -239,6 +239,6 @@ print(f"Retrieved Contexts (truncated): {[c[:200] + '...' for c in pred.context]
Question: Who conducts the draft in which Marc-Andre Fleury was drafted to the Vegas Golden Knights for the 2017-18 season????????
Predicted Answer: National Hockey League
Retrieved Contexts (truncated): ['2017–18 Pittsburgh Penguins season | The 2017–18 Pittsburgh Penguins season will be the 51st season for the National Hockey League ice hockey team that was established on June 5, 1967. They will enter...', 'Marc-André Fleury | Marc-André Fleury (born November 28, 1984) is a French-Canadian professional ice hockey goaltender playing for the Vegas Golden Knights of the National Hockey League (NHL). Drafted...', "2017 NHL Expansion Draft | The 2017 NHL Expansion Draft was an expansion draft conducted by the National Hockey League on June 18–20, 2017 to fill the roster of the league's expansion team for the 201..."]
-
+
Example query trace in Langfuse: https://cloud.langfuse.com/project/cloramnkj0002jz088vzn1ja4/traces/baf30bf5-0741-493c-aba3-2a66290d4d1d
diff --git a/pages/docs/integrations/langchain/example-javascript.md b/pages/docs/integrations/langchain/example-javascript.md
index 543434e22..403bb6913 100644
--- a/pages/docs/integrations/langchain/example-javascript.md
+++ b/pages/docs/integrations/langchain/example-javascript.md
@@ -62,7 +62,7 @@ console.log(res.content)
Why did the bear wear a fur coat to the BBQ?
Because it was grizzly cold outside!
-
+
### `stream`
@@ -107,7 +107,7 @@ for await (const chunk of stream) {
light
!
-
+
## Explore the trace in Langfuse
diff --git a/pages/docs/integrations/langchain/example-python-langgraph.md b/pages/docs/integrations/langchain/example-python-langgraph.md
index 1de1dbe5d..f9212f5a2 100644
--- a/pages/docs/integrations/langchain/example-python-langgraph.md
+++ b/pages/docs/integrations/langchain/example-python-langgraph.md
@@ -117,7 +117,7 @@ for s in graph.stream({"messages": [HumanMessage(content = "What is Langfuse?")]
```
{'chatbot': {'messages': [AIMessage(content='Langfuse is a tool designed to help developers monitor and observe the performance of their Large Language Model (LLM) applications. It provides detailed insights into how these applications are functioning, allowing for better debugging, optimization, and overall management. Langfuse offers features such as tracking key metrics, visualizing data, and identifying potential issues in real-time, making it easier for developers to maintain and improve their LLM-based solutions.', response_metadata={'token_usage': {'completion_tokens': 86, 'prompt_tokens': 13, 'total_tokens': 99}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_400f27fa1f', 'finish_reason': 'stop', 'logprobs': None}, id='run-9a0c97cb-ccfe-463e-902c-5a5900b796b4-0', usage_metadata={'input_tokens': 13, 'output_tokens': 86, 'total_tokens': 99})]}}
-
+
### View traces in Langfuse
@@ -353,7 +353,7 @@ for s in graph_2.stream({"messages": [HumanMessage(content = "How does photosynt
----
{'supervisor': {'next': 'FINISH'}}
----
-
+
```python
@@ -370,7 +370,7 @@ for s in graph_2.stream({"messages": [HumanMessage(content = "What time is it?")
----
{'supervisor': {'next': 'FINISH'}}
----
-
+
### See traces in Langfuse
@@ -516,7 +516,7 @@ print(langchain_system_prompt)
```
You are a translator that translates every input text into Spanish.
-
+
Now we can use the new system prompt string to update our assistant.
@@ -566,7 +566,7 @@ for s in graph.stream({"messages": [HumanMessage(content = "What is Langfuse?")]
```
{'chatbot': {'messages': [AIMessage(content='¿Qué es Langfuse?', response_metadata={'token_usage': {'completion_tokens': 6, 'prompt_tokens': 30, 'total_tokens': 36}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_400f27fa1f', 'finish_reason': 'stop', 'logprobs': None}, id='run-1f419fe3-73e2-4413-aa6c-96560bbd09c8-0', usage_metadata={'input_tokens': 30, 'output_tokens': 6, 'total_tokens': 36})]}}
-
+
## Feedback
diff --git a/pages/docs/integrations/llama-index/example-python-instrumentation-module.md b/pages/docs/integrations/llama-index/example-python-instrumentation-module.md
index 6874a2273..663b344db 100644
--- a/pages/docs/integrations/llama-index/example-python-instrumentation-module.md
+++ b/pages/docs/integrations/llama-index/example-python-instrumentation-module.md
@@ -78,7 +78,7 @@ print(response)
```
He made home movies using a Super 8 camera.
-
+
Example trace: https://cloud.langfuse.com/project/cloramnkj0002jz088vzn1ja4/traces/d933c7cc-20bf-4db3-810d-bab1c8d9a2a1
@@ -90,7 +90,7 @@ print(response)
```
He made home movies using a Super 8 camera growing up.
-
+
Example trace: https://cloud.langfuse.com/project/cloramnkj0002jz088vzn1ja4/traces/4e285b8f-9789-4cf0-a8b4-45473ac420f1
diff --git a/pages/docs/integrations/mirascope/example-python.md b/pages/docs/integrations/mirascope/example-python.md
index ac9fb982d..a15324749 100644
--- a/pages/docs/integrations/mirascope/example-python.md
+++ b/pages/docs/integrations/mirascope/example-python.md
@@ -52,7 +52,7 @@ print(response.content)
```
I recommend **"The House in the Cerulean Sea" by TJ Klune**. It's a heartwarming fantasy that follows Linus Baker, a caseworker for magical children, who is sent on a special assignment to a mysterious orphanage. There, he discovers unique and lovable characters and confronts themes of acceptance, found family, and the importance of love and kindness. The book combines whimsy, humor, and poignant moments, making it a delightful read for fantasy lovers.
-
+
[**Example trace**](https://cloud.langfuse.com/project/cloramnkj0002jz088vzn1ja4/traces/84bbb50e-aebc-424a-ae8a-e1012914d46b)
@@ -87,7 +87,7 @@ generate_facts(3)
Sure! Frogs can breathe through their skin, allowing them to absorb oxygen and release carbon dioxide directly into and out of their bloodstream. This process is known as cutaneous respiration.
Some species of frogs can absorb water through their skin, meaning they don't need to drink water with their mouths.
Frogs can breathe through their skin! This adaptation allows them to absorb oxygen directly from water, which is especially useful when they're submerged.
-
+
Head over to the Langfuse Traces table [in Langfuse Cloud](https://cloud.langfuse.com ) to see the entire chat history, token counts, cost, model, latencies and more
diff --git a/pages/docs/integrations/mistral-sdk.md b/pages/docs/integrations/mistral-sdk.md
index 4ce943aa9..266b6de55 100644
--- a/pages/docs/integrations/mistral-sdk.md
+++ b/pages/docs/integrations/mistral-sdk.md
@@ -301,7 +301,7 @@ stream_find_best_five_painter_from("Spain")
ó
.
-
+
@@ -500,7 +500,7 @@ await async_stream_find_best_five_musician_from("Spain")
ía
.
-
+
diff --git a/pages/docs/integrations/ollama.md b/pages/docs/integrations/ollama.md
index 2fbaf09e5..202225dc0 100644
--- a/pages/docs/integrations/ollama.md
+++ b/pages/docs/integrations/ollama.md
@@ -113,7 +113,7 @@ print(response.choices[0].message.content)
```
A famous moment in history! When Neil Armstrong took his historic first steps on the moon, his first words were: "That's one small step for man, one giant leap for mankind." (Note: The word was actually "man", not "men" - it's often been reported as "one small step for men", but Armstrong himself said he meant to say "man")
-
+
### **Step 4:** See Traces in Langfuse
@@ -200,7 +200,7 @@ print(response.choices[0].message.content)
```
The most recently confirmed element is oganesson (Og), with symbol Og and atomic number 118. It was officially recognized by IUPAC (International Union of Pure and Applied Chemistry) in 2016, following the synthesis of several atoms at laboratories in Russia and Germany. The latest unofficially-recognized element is ununsextium (Uus), with atomic number 138. However, its synthesis is still under investigation, and IUPAC has yet to officially confirm its existence.
-
+
### Step 4: See Traces in Langfuse
diff --git a/pages/docs/integrations/openai/python/structured-outputs.md b/pages/docs/integrations/openai/python/structured-outputs.md
index 2175809b5..bf2192b9a 100644
--- a/pages/docs/integrations/openai/python/structured-outputs.md
+++ b/pages/docs/integrations/openai/python/structured-outputs.md
@@ -136,7 +136,7 @@ print(result.content)
```
{"steps":[{"explanation":"We need to isolate the term with the variable, 8x. So, we start by subtracting 7 from both sides to remove the constant term on the left side.","output":"8x + 7 - 7 = -23 - 7"},{"explanation":"The +7 and -7 on the left side cancel each other out, leaving us with 8x. The right side simplifies to -30.","output":"8x = -30"},{"explanation":"To solve for x, divide both sides of the equation by 8, which is the coefficient of x.","output":"x = -30 / 8"},{"explanation":"Simplify the fraction -30/8 by finding the greatest common divisor, which is 2.","output":"x = -15 / 4"}],"final_answer":"x = -15/4"}
-
+
```python
@@ -178,7 +178,7 @@ print(final_answer)
x = -15/4
-
+
## Step 3: See your trace in Langfuse
@@ -229,7 +229,7 @@ print(result.final_answer)
[Step(explanation='To isolate the term with the variable on one side of the equation, start by subtracting 7 from both sides.', output='8x = -23 - 7'), Step(explanation='Combine like terms on the right side to simplify the equation.', output='8x = -30'), Step(explanation='Divide both sides by 8 to solve for x.', output='x = -30 / 8'), Step(explanation='Simplify the fraction by dividing both the numerator and the denominator by their greatest common divisor, which is 2.', output='x = -15 / 4')]
Final answer:
x = -15/4
-
+
## See your trace in Langfuse
diff --git a/pages/docs/prompts/example-langchain.md b/pages/docs/prompts/example-langchain.md
index c526e4411..93c0660b6 100644
--- a/pages/docs/prompts/example-langchain.md
+++ b/pages/docs/prompts/example-langchain.md
@@ -123,7 +123,7 @@ print(f"Prompt model configurations\nModel: {model}\nTemperature: {temperature}"
Prompt model configurations
Model: gpt-3.5-turbo-1106
Temperature: 0
-
+
### Create Langchain chain based on prompt
@@ -191,7 +191,7 @@ print(response.content)
- Transportation: Eco-friendly shuttle service
Overall, the wedding will be a beautiful blend of art and nature, with a focus on sustainability and creativity. The event will showcase the couple's love for each other and their shared passions, creating a memorable and unique experience for all in attendance.
-
+
## View Trace in Langfuse
diff --git a/pages/docs/scores/example_usage_of_fetch_score.md b/pages/docs/scores/example_usage_of_fetch_score.md
new file mode 100644
index 000000000..61423d078
--- /dev/null
+++ b/pages/docs/scores/example_usage_of_fetch_score.md
@@ -0,0 +1,1020 @@
+## description: This document focuses on retrieving evaluation results logged in Langfuse using the `fetch_scores()` method. category: Examples
+
+---
+
+# Fetching Scores from Langfuse
+
+Example: Using UpTrain and Ragas for Model Evaluation and Retrieving Metrics from Langfuse
+
+Langfuse makes it easy to log and retrieve model evaluation metrics, helping users analyze and compare various performance measures. In this example, we'll evaluate model responses with UpTrain and Ragas, log the resulting metrics to Langfuse, retrieve them with the `fetch_scores()` function, and check the retrieved metrics by comparing them in a correlation matrix.
+
+**fetch_scores()** accepts the following arguments:
+
+- `page` (*Optional[int]*): The page number of the scores to return. Defaults to None.
+- `limit` (*Optional[int]*): The maximum number of scores to return. Defaults to None.
+- `user_id` (*Optional[str]*): A user identifier. Defaults to None.
+- `name` (*Optional[str]*): The name of the scores to return. Defaults to None.
+- `from_timestamp` (*Optional[dt.datetime]*): Retrieve only scores with a timestamp on or after this datetime. Defaults to None.
+- `to_timestamp` (*Optional[dt.datetime]*): Retrieve only scores with a timestamp before this datetime. Defaults to None.
+- `source` (*Optional[ScoreSource]*): The source of the scores. Defaults to None.
+- `operator` (*Optional[str]*): The comparison operator applied to `value` when filtering scores. Defaults to None.
+- `value` (*Optional[float]*): The score value to filter on, used together with `operator`. Defaults to None.
+- `score_ids` (*Optional[str]*): The score identifier(s) to restrict the results to. Defaults to None.
+- `config_id` (*Optional[str]*): The configuration identifier. Defaults to None.
+- `data_type` (*Optional[ScoreDataType]*): The data type of the scores. Defaults to None.
+- `request_options` (*Optional[RequestOptions]*): Additional request options. Defaults to None.
+
+The returned data contains a list of scores along with associated metadata, which can be useful for evaluating the performance of different models or experiments. If an error occurs during the request, it raises an exception, providing insight into what went wrong.
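
As a rough illustration of how these arguments combine, the sketch below wraps a filtered call in a small helper. The helper name `fetch_recent_scores` and the `FakeClient` stand-in are hypothetical and exist only so the example runs without credentials; with a real `langfuse.Langfuse` client, the same keyword arguments (`name`, `from_timestamp`, `limit`) would be passed to `fetch_scores()`.

```python
import datetime as dt
from types import SimpleNamespace


def fetch_recent_scores(client, name, days=7, limit=50):
    """Fetch scores called `name` that were logged in the last `days` days.

    `client` is assumed to expose fetch_scores() with the keyword
    arguments listed above (for example a langfuse.Langfuse instance).
    """
    since = dt.datetime.now(dt.timezone.utc) - dt.timedelta(days=days)
    response = client.fetch_scores(name=name, from_timestamp=since, limit=limit)
    return response.data


class FakeClient:
    """Hypothetical stand-in that records the filters it receives."""

    def __init__(self):
        self.calls = []

    def fetch_scores(self, **kwargs):
        self.calls.append(kwargs)
        # Return one made-up score so the caller has something to read
        return SimpleNamespace(data=[SimpleNamespace(name=kwargs["name"], value=0.8)])


client = FakeClient()
recent = fetch_recent_scores(client, "faithfulness", days=1, limit=10)
print([score.value for score in recent])
```

Passing a timezone-aware `datetime` avoids ambiguity about which clock the cutoff refers to; the timestamps in the outputs later in this document are in UTC.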
+
+---
+
+### 1. Setting up the environment
+
+Install the required libraries and set up the environment variables.
+
+
+```python
+!pip install ragas uptrain litellm datasets rouge_score langfuse
+```
+
+
+```python
+import os
+# get keys for your project from https://cloud.langfuse.com
+os.environ["LANGFUSE_PUBLIC_KEY"] = ""
+os.environ["LANGFUSE_SECRET_KEY"] = ""
+# your openai key
+os.environ["OPENAI_API_KEY"] = ""
+
+# Your host, defaults to https://cloud.langfuse.com
+# For US data region, set to "https://us.cloud.langfuse.com"
+os.environ["LANGFUSE_HOST"] = "https://us.cloud.langfuse.com"
+```
+
+### 2. Getting the data
+
+This section demonstrates how to load and prepare a dataset for evaluation. The "amnesty_qa" dataset is loaded using the `datasets` library, and a subset of 5 evaluation examples is selected for analysis. The selected data is then converted into a pandas DataFrame for convenient handling and processing.
+
+
+```python
+from datasets import load_dataset
+
+amnesty_qa = load_dataset("explodinggradients/amnesty_qa", "english_v2")
+amnesty_qa_ragas = amnesty_qa["eval"].select(range(5))
+amnesty_qa_ragas.to_pandas()
+```
+
+
+```python
+import pandas as pd
+amnesty_qa_df = pd.DataFrame(amnesty_qa["eval"].select(range(5)))
+```
+
+
+### 3. Evaluation with UpTrain
+
+This code demonstrates how to evaluate a dataset using UpTrain's `EvalLLM` class. An instance of `EvalLLM` is created using the OpenAI API key. The `evaluate` function assesses the `amnesty_qa_df` DataFrame against three evaluation criteria: context relevance, factual accuracy, and response completeness. The results are stored in a new DataFrame, which is printed and saved as a CSV file. Finally, the function is called in the main block and its result is stored in `uptrain_df`. For a more detailed walkthrough, see [this cookbook](https://langfuse.com/guides/cookbook/evaluation_with_uptrain).
+
+
+```python
+import os
+import json
+import pandas as pd
+from uptrain import EvalLLM, Evals
+
+OPENAI_API_KEY = os.getenv('OPENAI_API_KEY')
+eval_llm = EvalLLM(openai_api_key=OPENAI_API_KEY)
+
+def evaluate():
+    # Evaluate the data using UpTrain
+    results = eval_llm.evaluate(
+        data=amnesty_qa_df,
+        checks=[Evals.CONTEXT_RELEVANCE, Evals.FACTUAL_ACCURACY, Evals.RESPONSE_COMPLETENESS]
+    )
+
+    # Convert the results to a DataFrame
+    results_df = pd.DataFrame(results)
+
+    # Print the DataFrame
+    print(results_df)
+
+    # Save the DataFrame to a CSV file (optional; remove if not needed)
+    results_df.to_csv('evaluation_results.csv', index=False)
+
+    return results_df
+
+# Call the function and store results in a DataFrame
+if __name__ == "__main__":
+    uptrain_df = evaluate()
+```
+
+    100%|██████████| 5/5 [00:01<00:00,  3.19it/s]
+    100%|██████████| 5/5 [00:02<00:00,  2.01it/s]
+    100%|██████████| 5/5 [00:06<00:00,  1.30s/it]
+    100%|██████████| 5/5 [00:02<00:00,  2.25it/s]
+    2024-10-13 16:50:32.097 | INFO     | uptrain.framework.evalllm:evaluate:376 - Local server not running, start the server to log data and visualize in the dashboard!
+
+
+                                                question  \
+    0  What are the global implications of the USA Su...
+    1  Which companies are the main contributors to G...
+    2  Which private companies in the Americas are th...
+    3  What action did Amnesty International urge its...
+    4  What are the recommendations made by Amnesty I...
+
+                                            ground_truth  \
+    0  The global implications of the USA Supreme Cou...
+    1  According to the Carbon Majors database, the m...
+    2  The largest private companies in the Americas ...
+    3  Amnesty International urged its supporters to ...
+    4  The recommendations made by Amnesty Internatio...
+
+                                                  answer  \
+    0  The global implications of the USA Supreme Cou...
+    1  According to the Carbon Majors database, the m...
+    2  According to the Carbon Majors database, the l...
+    3  Amnesty International urged its supporters to ...
+    4  Amnesty International made several recommendat...
+
+                                                 context  \
+    0  [- In 2022, the USA Supreme Court handed down ...
+    1  [In recent years, there has been increasing pr...
+    2  [The issue of greenhouse gas emissions has bec...
+    3  [In the case of the Ogoni 9, Amnesty Internati...
+    4  [In recent years, Amnesty International has fo...
+
+                                                response  score_context_relevance  \
+    0  The global implications of the USA Supreme Cou...                      1.0
+    1  According to the Carbon Majors database, the m...                      1.0
+    2  According to the Carbon Majors database, the l...                      1.0
+    3  Amnesty International urged its supporters to ...                      1.0
+    4  Amnesty International made several recommendat...                      1.0
+
+                           explanation_context_relevance  score_factual_accuracy  \
+    0  {\n    "Reasoning": "The extracted context con...                     1.0
+    1  {\n    "Reasoning": "The given context provide...                     0.6
+    2  {\n    "Reasoning": "The extracted context pro...                     0.4
+    3  {\n    "Reasoning": "The given context contain...                     0.8
+    4  {\n    "Reasoning": "The extracted context con...                     0.6
+
+                            explanation_factual_accuracy  \
+    0  {\n    "Result": [\n        {\n            "Fa...
+    1  {\n    "Result": [\n        {\n            "Fa...
+    2  {\n    "Result": [\n        {\n            "Fa...
+    3  {\n    "Result": [\n        {\n            "Fa...
+    4  {\n    "Result": [\n        {\n            "Fa...
+
+       score_response_completeness  \
+    0                          1.0
+    1                          1.0
+    2                          1.0
+    3                          1.0
+    4                          1.0
+
+                       explanation_response_completeness
+    0  {\n    "Reasoning": "The given response is com...
+    1  {\n    "Reasoning": "The given response is com...
+    2  {\n    "Reasoning": "The given response is com...
+    3  {\n    "Reasoning": "The given response is com...
+    4  {\n    "Reasoning": "The given response is com...
+
+
+### 4. Evaluation with Ragas
+
+Ragas' `evaluate` function is called with the selected evaluation data and a list of metrics: context precision, faithfulness, and answer relevancy. The results are then converted into a pandas DataFrame for easier analysis. This makes it possible to assess the quality of model responses against specific criteria. For more detail on evaluating RAG pipelines with Ragas, see [this cookbook](https://langfuse.com/guides/cookbook/evaluation_of_rag_with_ragas).
+
+
+```python
+import json
+from ragas import evaluate
+from ragas.metrics import (
+    answer_relevancy,
+    faithfulness,
+    context_precision,
+)
+
+ragas_result = evaluate(
+    amnesty_qa["eval"].select(range(5)),
+    metrics=[
+        context_precision,
+        faithfulness,
+        answer_relevancy,
+    ],
+)
+
+ragas_df = ragas_result.to_pandas()
+```
+
+### 5. Setting Up Langfuse Client
+
+This code snippet initializes a Langfuse client using the `Langfuse` class. The client is configured with a secret key, public key, and host URL, which are retrieved from the environment variables. This setup allows users to interact with the Langfuse API for logging and analyzing model evaluation metrics seamlessly.
+
+
+```python
+from langfuse import Langfuse
+langfuse_client = Langfuse(
+    secret_key=os.environ.get("LANGFUSE_SECRET_KEY"),
+    public_key=os.environ.get("LANGFUSE_PUBLIC_KEY"),
+    host=os.environ.get("LANGFUSE_HOST")
+)
+```
+
+### 6. Logging Evaluation Scores to Langfuse
+
+The functions `log_uptrain_scores_to_langfuse` and `log_ragas_scores_to_langfuse` log evaluation scores from the UpTrain and Ragas frameworks into Langfuse. Each function iterates through its respective DataFrame, extracting relevant score columns and logging them with `langfuse_client.score`, using a unique ID for each entry.
+
+Scores in Langfuse are objects for storing evaluation metrics, linked to traces and, optionally, to observations. Each score has attributes such as a name, a value, and a trace ID. An optional configuration ID ties the score to a score config so that it complies with a predefined schema. This structure enables effective analysis of evaluation metrics within the Langfuse platform.
+
+#### Key Attributes of a Score Object:
+- **name**: Name of the score (e.g., user_feedback).
+- **value**: Numeric value of the score.
+- **traceId**: ID of the related trace.
+- **id**: Unique identifier for the score.
+
+Using scores effectively allows for quick overviews of evaluations, segmentation of traces by quality, and detailed reporting across use cases. Score schemas can be defined to standardize metrics for consistency and comparability in analysis.
+
+
+```python
+def log_uptrain_scores_to_langfuse(uptrain_df):
+    """Log evaluation scores to Langfuse."""
+    score_columns = ['score_factual_accuracy', 'score_context_relevance', 'score_response_completeness']
+    for index, row in uptrain_df.iterrows():
+        for score_name in score_columns:
+            score_value = row[score_name]
+            langfuse_client.score(id=f"Uptrain_{index}_{score_name}", value=score_value, name=score_name)
+```
+
+
+```python
+def log_ragas_scores_to_langfuse(ragas_df):
+    score_columns = ['context_precision', 'faithfulness', 'answer_relevancy']
+
+    for index, row in ragas_df.iterrows():
+        for score_name in score_columns:
+            score_value = row[score_name]
+            langfuse_client.score(id=f"Ragas_{index}_{score_name}", value=score_value, name=score_name)
+```
+
+
+```python
+log_ragas_scores_to_langfuse(ragas_df)
+log_uptrain_scores_to_langfuse(uptrain_df)
+```
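
Because each id above encodes the framework, the row index, and the metric name, scores read back later can be put into row order without depending on the order in which they are returned. The following is a minimal sketch of that idea, assuming ids shaped like `Uptrain_<index>_<metric>` or `Ragas_<index>_<metric>`. The helper `scores_by_row` and the sample records are hypothetical, and plain dictionaries stand in for Langfuse score objects.

```python
from collections import defaultdict


def scores_by_row(scores, prefixes=("Uptrain_", "Ragas_")):
    """Group score values by metric name, ordered by the row index in each id.

    Expects ids shaped like "<Framework>_<row index>_<metric name>".
    Records whose id matches none of the prefixes are ignored.
    """
    indexed = defaultdict(dict)
    for score in scores:
        for prefix in prefixes:
            if score["id"].startswith(prefix):
                # Split only once: metric names themselves contain underscores
                row_index, metric = score["id"][len(prefix):].split("_", 1)
                indexed[metric][int(row_index)] = score["value"]
    return {metric: [rows[i] for i in sorted(rows)] for metric, rows in indexed.items()}


# Hypothetical records, deliberately listed out of row order
sample = [
    {"id": "Uptrain_2_score_factual_accuracy", "value": 0.4},
    {"id": "Uptrain_1_score_factual_accuracy", "value": 0.6},
    {"id": "Uptrain_0_score_factual_accuracy", "value": 1.0},
    {"id": "Ragas_1_faithfulness", "value": 0.9},
    {"id": "Ragas_0_faithfulness", "value": 0.7},
]
columns = scores_by_row(sample)
print(columns)
```

The resulting dictionary maps each metric name to a list in row order, so it could be passed directly to `pd.DataFrame(...)` when every metric covers the same rows.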
+
+### 7. Fetching Scores from Langfuse
+
+The `fetch_scores_from_langfuse` function retrieves evaluation scores from Langfuse by score name. It calls the `fetch_scores` method of the Langfuse client and passes only the `name` filter, so it returns the scores logged under that name.
+
+`fetch_scores` itself accepts further optional parameters. Pagination is controlled with `page` and `limit`, which keeps individual responses small when many scores have been logged.
+
+Scores can also be filtered by user identifier, timestamp range, and source, for example to fetch only scores associated with a specific user or recorded during a certain time frame. Filters on score value (`operator` and `value`) and on configuration (`config_id`) narrow the results further.
+
+The result is a `FetchScoresResponse`, which contains the list of scores in its `data` attribute along with metadata about the retrieved scores.
+
+
+```python
+def fetch_scores_from_langfuse(score_name):
+    """Fetch scores from Langfuse based on score name."""
+    # Fetch scores for the specified name from Langfuse
+    scores_fetched = langfuse_client.fetch_scores(name=score_name)
+    return scores_fetched
+```
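
When more scores exist than fit in one response, the `page` and `limit` arguments can be combined in a loop. This is a sketch under stated assumptions: pages are taken to start at 1, a page shorter than `limit` is taken to be the last one, and `fetch_all_scores` and `FakePagedClient` are hypothetical names used so the example runs offline. Only the `data` attribute of the response is relied on.

```python
from types import SimpleNamespace


def fetch_all_scores(client, name, page_size=50, max_pages=100):
    """Collect every score called `name` by walking pages 1, 2, ...

    Stops at the first page that returns fewer than `page_size` items,
    or after `max_pages` requests as a safety limit.
    """
    scores = []
    for page in range(1, max_pages + 1):
        response = client.fetch_scores(name=name, page=page, limit=page_size)
        scores.extend(response.data)
        if len(response.data) < page_size:
            break
    return scores


class FakePagedClient:
    """Hypothetical stand-in that serves stored values page by page."""

    def __init__(self, values):
        self.values = values

    def fetch_scores(self, name, page, limit):
        start = (page - 1) * limit
        return SimpleNamespace(data=self.values[start:start + limit])


paged_client = FakePagedClient([1.0, 0.6, 0.4, 0.8, 0.6])
all_scores = fetch_all_scores(paged_client, "score_factual_accuracy", page_size=2)
print(all_scores)
```

The `max_pages` cap is a design choice to keep the loop bounded if the stopping condition is never met.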
+
+
+```python
+score_columns = ['score_context_relevance', 'score_factual_accuracy', 'score_response_completeness', 'context_precision', 'faithfulness', 'answer_relevancy']
+
+scores_df = pd.DataFrame(columns=score_columns)
+
+for score_name in score_columns:
+    response = fetch_scores_from_langfuse(score_name)
+    print(response.data)
+    # Scores are listed newest first in the output below, so reverse them to restore logging order
+    scores_df[score_name] = [score.value for score in response.data[::-1]]
+```
+
+ [Score_Numeric(value=1.0, id='Uptrain_4_score_context_relevance', trace_id='95ad7bdd-b93b-4905-a865-938f346871bd', name='score_context_relevance', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 25, 177000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 25, 177000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 25, 177000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=1.0, id='Uptrain_3_score_context_relevance', trace_id='f9b43538-77b6-478f-a5d9-c2be3b4cdada', name='score_context_relevance', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 897000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 897000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 897000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=1.0, id='Uptrain_2_score_context_relevance', trace_id='02185905-be84-41d9-9b64-b02fb45704f3', name='score_context_relevance', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 614000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 614000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 614000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=1.0, id='Uptrain_1_score_context_relevance', trace_id='b68fc2e6-e6a0-489b-becc-5441d9f1dd4e', name='score_context_relevance', source=, 
observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 326000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 326000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 326000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=1.0, id='Uptrain_0_score_context_relevance', trace_id='75bd20ac-3a34-4fa0-b74a-0fb7a454bfa1', name='score_context_relevance', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 46000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 46000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 46000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8')]
+ [Score_Numeric(value=0.6, id='Uptrain_4_score_factual_accuracy', trace_id='e5ad0a8e-3c20-4dc8-ba19-1f11f224ebbf', name='score_factual_accuracy', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 25, 84000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 25, 84000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 25, 84000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=0.8, id='Uptrain_3_score_factual_accuracy', trace_id='2ed536e7-a583-401c-b3e9-1227985875c1', name='score_factual_accuracy', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 804000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 804000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 804000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=0.4, id='Uptrain_2_score_factual_accuracy', trace_id='8552536a-70ae-4678-a789-c0af61d3a436', name='score_factual_accuracy', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 517000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 517000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 517000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=0.6, id='Uptrain_1_score_factual_accuracy', trace_id='812d7ae7-f2bf-4251-9784-9ee248b469d7', name='score_factual_accuracy', source=, 
observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 231000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 231000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 231000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=1.0, id='Uptrain_0_score_factual_accuracy', trace_id='f4135b5b-d20a-4741-b777-186d37d1fa52', name='score_factual_accuracy', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 23, 954000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 23, 954000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 23, 954000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8')]
+ [Score_Numeric(value=1.0, id='Uptrain_4_score_response_completeness', trace_id='1a54b4e2-3e2c-4235-801b-b56153c8e293', name='score_response_completeness', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 25, 271000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 25, 271000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 25, 271000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=1.0, id='Uptrain_3_score_response_completeness', trace_id='ce78dce7-f4bd-45a4-b69c-f31fd6258565', name='score_response_completeness', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 990000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 990000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 990000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=1.0, id='Uptrain_2_score_response_completeness', trace_id='103927f0-dd9f-4d94-95d6-a4a6fce3898d', name='score_response_completeness', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 709000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 709000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 709000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=1.0, id='Uptrain_1_score_response_completeness', trace_id='6e7ae4f6-aca0-4152-b299-5b1ae06bd7e9', 
name='score_response_completeness', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 423000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 423000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 423000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=1.0, id='Uptrain_0_score_response_completeness', trace_id='3c100175-8e20-4d1f-ab1b-a7e4dc870cac', name='score_response_completeness', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 138000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 138000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 138000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8')]
+ [Score_Numeric(value=0.9999999999666667, id='Ragas_4_context_precision', trace_id='1441c394-fc54-42f3-a798-7ab1b338748c', name='context_precision', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 25, 207000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 25, 207000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 25, 207000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=0.99999999995, id='Ragas_3_context_precision', trace_id='a91146c0-09d4-4039-828d-adf308d09dd8', name='context_precision', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 927000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 927000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 927000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=0.8333333332916666, id='Ragas_2_context_precision', trace_id='16bf0af8-b988-44d0-a9c5-35a0ffa69ffd', name='context_precision', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 643000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 643000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 643000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=0.9999999999666667, id='Ragas_1_context_precision', trace_id='976e6974-f6d7-4ff0-b961-5653ae58e9ef', name='context_precision', source=, 
observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 310000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 310000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 310000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=0.9999999999666667, id='Ragas_0_context_precision', trace_id='4e0edb60-c6b1-452d-ae58-ce7449dc3f47', name='context_precision', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 23, 798000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 23, 798000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 23, 798000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8')]
+ [Score_Numeric(value=0.1428571428571428, id='Ragas_4_faithfulness', trace_id='8c3f995f-bc00-4935-90e5-069478987ce3', name='faithfulness', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 25, 300000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 25, 300000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 25, 300000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=0.2, id='Ragas_3_faithfulness', trace_id='424fddad-f617-491a-9816-d9642f33d0e6', name='faithfulness', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 25, 19000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 25, 19000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 25, 19000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=0.0, id='Ragas_2_faithfulness', trace_id='c7b7e4a1-ab80-4951-ae16-293265970dc3', name='faithfulness', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 740000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 740000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 740000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=0.12, id='Ragas_1_faithfulness', trace_id='77a2d6ae-b840-454f-b4e3-52edb8909bcb', name='faithfulness', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 456000, 
tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 456000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 456000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=1.0, id='Ragas_0_faithfulness', trace_id='8f61a293-836f-4cc9-84f9-996c19c42620', name='faithfulness', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 23, 894000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 23, 894000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 23, 894000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8')]
+ [Score_Numeric(value=0.9891308706741455, id='Ragas_4_answer_relevancy', trace_id='21a3c662-a494-4029-b95a-8fd25f90a8c6', name='answer_relevancy', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 25, 398000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 25, 398000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 25, 398000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=0.9795341682836177, id='Ragas_3_answer_relevancy', trace_id='f398dd78-ccdd-423c-9662-92ff548183e7', name='answer_relevancy', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 25, 114000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 25, 114000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 25, 114000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=0.9916994382653276, id='Ragas_2_answer_relevancy', trace_id='65d48c73-2fbd-4577-bec9-7a46858e0a6a', name='answer_relevancy', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 834000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 834000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 834000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=0.9652149513821247, id='Ragas_1_answer_relevancy', trace_id='116c5ac3-7931-471b-83eb-da6c91725621', name='answer_relevancy', source=, 
observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 550000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 550000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 550000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8'), Score_Numeric(value=1.0, id='Ragas_0_answer_relevancy', trace_id='e7642418-7f1f-4c4f-8480-06dd8c276fbd', name='answer_relevancy', source=, observation_id=None, timestamp=datetime.datetime(2024, 10, 13, 16, 59, 24, 59000, tzinfo=datetime.timezone.utc), created_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 59000, tzinfo=datetime.timezone.utc), updated_at=datetime.datetime(2024, 10, 13, 16, 59, 24, 59000, tzinfo=datetime.timezone.utc), author_user_id=None, comment=None, config_id=None, data_type='NUMERIC', stringValue=None, trace={'userId': None}, projectId='cm1vkhmj40jxlhaue9mntmwk8')]
+
+
+### 8. Creating a Correlation Heatmap
+
+This section shows how to visualize the correlation between evaluation scores as a heatmap. The code computes a correlation matrix over two sets of scores: the UpTrain scores (`'score_context_relevance'`, `'score_factual_accuracy'`, and `'score_response_completeness'`) and the Ragas scores (`'context_precision'`, `'faithfulness'`, and `'answer_relevancy'`).
+
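+1. **Calculate the Correlation Matrix**: `scores_df.corr()` computes the pairwise Pearson correlation coefficient (the pandas default) between the score columns of the `scores_df` DataFrame, indicating the strength and direction of each linear relationship. A column whose values are all identical has zero variance, so its correlations come out as `NaN`.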
+
+2. **Create and Customize the Heatmap**: The heatmap is drawn with Seaborn on a Matplotlib figure. Each cell is annotated with its correlation coefficient, and the color scale is fixed to the range -1 to 1 and centered at 0, using the `crest` colormap. `tight_layout()` adjusts the spacing for clarity.
+
+This visualization helps identify patterns across the evaluation metrics retrieved with `fetch_scores()`.
+
+
+```python
+import matplotlib.pyplot as plt
+import seaborn as sns
+
+# Pairwise Pearson correlation (the pandas default) between the score columns
+corr_matrix = scores_df.corr()
+
+# Create a heatmap of the correlation matrix on a fixed -1..1 color scale
+plt.figure(figsize=(10, 8))
+sns.heatmap(corr_matrix, annot=True, vmin=-1, vmax=1, center=0, linewidths=0.5, linecolor='white', cmap='crest')
+plt.title('Correlation Matrix of Six Scores')
+plt.tight_layout()
+plt.show()
+```
+
+
+
+![Correlation heatmap of the six scores](/public/images/cookbook/example_usage_of_fetch_scores_files/example_usage_of_fetch_scores_23_0.png)
+
+
+
+![Example of fetched scores in Langfuse](/public/images/cookbook/example_usage_of_fetch_scores_files/example_fetch_scores_langfuse.png)
+
+
diff --git a/pages/docs/scores/external-evaluation-pipelines.md b/pages/docs/scores/external-evaluation-pipelines.md
index 51ab879de..895fbd40b 100644
--- a/pages/docs/scores/external-evaluation-pipelines.md
+++ b/pages/docs/scores/external-evaluation-pipelines.md
@@ -216,7 +216,7 @@ print(f"Traces in first batch: {len(traces_batch)}")
```
Traces in first batch: 10
-
+
## 2. Run your evaluations
diff --git a/pages/docs/sdk/typescript/example-vercel-ai.md b/pages/docs/sdk/typescript/example-vercel-ai.md
index 7f7c91e23..a9f27b8ed 100644
--- a/pages/docs/sdk/typescript/example-vercel-ai.md
+++ b/pages/docs/sdk/typescript/example-vercel-ai.md
@@ -245,7 +245,7 @@ console.log(data);
```
Love is a complex and deep emotion that can manifest in various forms such as romantic love, platonic love, familial love, and love for oneself. It often involves feelings of care, affection, empathy, and a strong bond with another person. Love can bring joy, happiness, and fulfillment to our lives, but it can also be challenging and require effort, communication, and understanding to maintain healthy relationships. Overall, love is a fundamental aspect of human experience that can bring meaning and purpose to our lives.
-
+
### Explore the trace in the UI
@@ -257,7 +257,7 @@ console.log(response.headers.get("X-Langfuse-Trace-Url"))
```
https://cloud.langfuse.com/trace/14cd44b6-1a56-46af-ba85-3fd91bbf9739
-
+
![Trace in Langfuse UI](https://langfuse.com/images/cookbook/js_tracing_example_vercel_ai_sdk_trace.png)
diff --git a/pages/guides/cookbook/example_external_evaluation_pipelines.md b/pages/guides/cookbook/example_external_evaluation_pipelines.md
index 51ab879de..895fbd40b 100644
--- a/pages/guides/cookbook/example_external_evaluation_pipelines.md
+++ b/pages/guides/cookbook/example_external_evaluation_pipelines.md
@@ -216,7 +216,7 @@ print(f"Traces in first batch: {len(traces_batch)}")
```
Traces in first batch: 10
-
+
## 2. Run your evaluations
diff --git a/pages/guides/cookbook/integration_dspy.md b/pages/guides/cookbook/integration_dspy.md
index e62d71148..1968672ed 100644
--- a/pages/guides/cookbook/integration_dspy.md
+++ b/pages/guides/cookbook/integration_dspy.md
@@ -239,6 +239,6 @@ print(f"Retrieved Contexts (truncated): {[c[:200] + '...' for c in pred.context]
Question: Who conducts the draft in which Marc-Andre Fleury was drafted to the Vegas Golden Knights for the 2017-18 season????????
Predicted Answer: National Hockey League
Retrieved Contexts (truncated): ['2017–18 Pittsburgh Penguins season | The 2017–18 Pittsburgh Penguins season will be the 51st season for the National Hockey League ice hockey team that was established on June 5, 1967. They will enter...', 'Marc-André Fleury | Marc-André Fleury (born November 28, 1984) is a French-Canadian professional ice hockey goaltender playing for the Vegas Golden Knights of the National Hockey League (NHL). Drafted...', "2017 NHL Expansion Draft | The 2017 NHL Expansion Draft was an expansion draft conducted by the National Hockey League on June 18–20, 2017 to fill the roster of the league's expansion team for the 201..."]
-
+
Example query trace in Langfuse: https://cloud.langfuse.com/project/cloramnkj0002jz088vzn1ja4/traces/baf30bf5-0741-493c-aba3-2a66290d4d1d
diff --git a/pages/guides/cookbook/integration_langgraph.md b/pages/guides/cookbook/integration_langgraph.md
index 1de1dbe5d..f9212f5a2 100644
--- a/pages/guides/cookbook/integration_langgraph.md
+++ b/pages/guides/cookbook/integration_langgraph.md
@@ -117,7 +117,7 @@ for s in graph.stream({"messages": [HumanMessage(content = "What is Langfuse?")]
```
{'chatbot': {'messages': [AIMessage(content='Langfuse is a tool designed to help developers monitor and observe the performance of their Large Language Model (LLM) applications. It provides detailed insights into how these applications are functioning, allowing for better debugging, optimization, and overall management. Langfuse offers features such as tracking key metrics, visualizing data, and identifying potential issues in real-time, making it easier for developers to maintain and improve their LLM-based solutions.', response_metadata={'token_usage': {'completion_tokens': 86, 'prompt_tokens': 13, 'total_tokens': 99}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_400f27fa1f', 'finish_reason': 'stop', 'logprobs': None}, id='run-9a0c97cb-ccfe-463e-902c-5a5900b796b4-0', usage_metadata={'input_tokens': 13, 'output_tokens': 86, 'total_tokens': 99})]}}
-
+
### View traces in Langfuse
@@ -353,7 +353,7 @@ for s in graph_2.stream({"messages": [HumanMessage(content = "How does photosynt
----
{'supervisor': {'next': 'FINISH'}}
----
-
+
```python
@@ -370,7 +370,7 @@ for s in graph_2.stream({"messages": [HumanMessage(content = "What time is it?")
----
{'supervisor': {'next': 'FINISH'}}
----
-
+
### See traces in Langfuse
@@ -516,7 +516,7 @@ print(langchain_system_prompt)
```
You are a translator that translates every input text into Spanish.
-
+
Now we can use the new system prompt string to update our assistant.
@@ -566,7 +566,7 @@ for s in graph.stream({"messages": [HumanMessage(content = "What is Langfuse?")]
```
{'chatbot': {'messages': [AIMessage(content='¿Qué es Langfuse?', response_metadata={'token_usage': {'completion_tokens': 6, 'prompt_tokens': 30, 'total_tokens': 36}, 'model_name': 'gpt-4o-2024-05-13', 'system_fingerprint': 'fp_400f27fa1f', 'finish_reason': 'stop', 'logprobs': None}, id='run-1f419fe3-73e2-4413-aa6c-96560bbd09c8-0', usage_metadata={'input_tokens': 30, 'output_tokens': 6, 'total_tokens': 36})]}}
-
+
## Feedback
diff --git a/pages/guides/cookbook/integration_llama-index_instrumentation.md b/pages/guides/cookbook/integration_llama-index_instrumentation.md
index 6874a2273..663b344db 100644
--- a/pages/guides/cookbook/integration_llama-index_instrumentation.md
+++ b/pages/guides/cookbook/integration_llama-index_instrumentation.md
@@ -78,7 +78,7 @@ print(response)
```
He made home movies using a Super 8 camera.
-
+
Example trace: https://cloud.langfuse.com/project/cloramnkj0002jz088vzn1ja4/traces/d933c7cc-20bf-4db3-810d-bab1c8d9a2a1
@@ -90,7 +90,7 @@ print(response)
```
He made home movies using a Super 8 camera growing up.
-
+
Example trace: https://cloud.langfuse.com/project/cloramnkj0002jz088vzn1ja4/traces/4e285b8f-9789-4cf0-a8b4-45473ac420f1
diff --git a/pages/guides/cookbook/integration_llama_index_posthog_mistral.md b/pages/guides/cookbook/integration_llama_index_posthog_mistral.md
index 6e12ec811..1d48a76e8 100644
--- a/pages/guides/cookbook/integration_llama_index_posthog_mistral.md
+++ b/pages/guides/cookbook/integration_llama_index_posthog_mistral.md
@@ -128,7 +128,7 @@ We download the file we want to use for RAG. In this example, we use a [hedgehog
2024-09-20 13:16:40 (2.03 MB/s) - ‘./hedgehog.pdf’ saved [1160174/1160174]
-
+
Next, we load the pdf using the LlamaIndex [`SimpleDirectoryReader`](https://docs.llamaindex.ai/en/stable/module_guides/loading/simpledirectoryreader/).
@@ -160,7 +160,7 @@ print(response)
```
Hedgehogs that require help are those that are sick, injured, and helpless, such as orphaned hoglets. These hedgehogs in need may be temporarily taken into human care and must be released into the wild as soon as they can survive there independently.
-
+
All steps of the LLM chain are now tracked in Langfuse.
@@ -204,7 +204,7 @@ langfuse.score(
```
Based on the provided context, it is not recommended to keep wild hedgehogs as pets. The Federal Nature Conservation Act protects hedgehogs as a native mammal species, making it illegal to chase, catch, injure, kill, or take their nesting and refuge places. Exceptions apply only to sick, injured, and helpless hedgehogs, which may be temporarily taken into human care and released into the wild as soon as they can survive independently. It is important to respect the natural habitats and behaviors of wild animals, including hedgehogs.
-
+
diff --git a/pages/guides/cookbook/integration_mirascope.md b/pages/guides/cookbook/integration_mirascope.md
index ac9fb982d..a15324749 100644
--- a/pages/guides/cookbook/integration_mirascope.md
+++ b/pages/guides/cookbook/integration_mirascope.md
@@ -52,7 +52,7 @@ print(response.content)
```
I recommend **"The House in the Cerulean Sea" by TJ Klune**. It's a heartwarming fantasy that follows Linus Baker, a caseworker for magical children, who is sent on a special assignment to a mysterious orphanage. There, he discovers unique and lovable characters and confronts themes of acceptance, found family, and the importance of love and kindness. The book combines whimsy, humor, and poignant moments, making it a delightful read for fantasy lovers.
-
+
[**Example trace**](https://cloud.langfuse.com/project/cloramnkj0002jz088vzn1ja4/traces/84bbb50e-aebc-424a-ae8a-e1012914d46b)
@@ -87,7 +87,7 @@ generate_facts(3)
Sure! Frogs can breathe through their skin, allowing them to absorb oxygen and release carbon dioxide directly into and out of their bloodstream. This process is known as cutaneous respiration.
Some species of frogs can absorb water through their skin, meaning they don't need to drink water with their mouths.
Frogs can breathe through their skin! This adaptation allows them to absorb oxygen directly from water, which is especially useful when they're submerged.
-
+
Head over to the Langfuse Traces table [in Langfuse Cloud](https://cloud.langfuse.com ) to see the entire chat history, token counts, cost, model, latencies and more
diff --git a/pages/guides/cookbook/integration_mistral_sdk.md b/pages/guides/cookbook/integration_mistral_sdk.md
index 4ce943aa9..266b6de55 100644
--- a/pages/guides/cookbook/integration_mistral_sdk.md
+++ b/pages/guides/cookbook/integration_mistral_sdk.md
@@ -301,7 +301,7 @@ stream_find_best_five_painter_from("Spain")
ó
.
-
+
@@ -500,7 +500,7 @@ await async_stream_find_best_five_musician_from("Spain")
ía
.
-
+
diff --git a/pages/guides/cookbook/integration_ollama.md b/pages/guides/cookbook/integration_ollama.md
index 2fbaf09e5..202225dc0 100644
--- a/pages/guides/cookbook/integration_ollama.md
+++ b/pages/guides/cookbook/integration_ollama.md
@@ -113,7 +113,7 @@ print(response.choices[0].message.content)
```
A famous moment in history! When Neil Armstrong took his historic first steps on the moon, his first words were: "That's one small step for man, one giant leap for mankind." (Note: The word was actually "man", not "men" - it's often been reported as "one small step for men", but Armstrong himself said he meant to say "man")
-
+
### **Step 4:** See Traces in Langfuse
@@ -200,7 +200,7 @@ print(response.choices[0].message.content)
```
The most recently confirmed element is oganesson (Og), with symbol Og and atomic number 118. It was officially recognized by IUPAC (International Union of Pure and Applied Chemistry) in 2016, following the synthesis of several atoms at laboratories in Russia and Germany. The latest unofficially-recognized element is ununsextium (Uus), with atomic number 138. However, its synthesis is still under investigation, and IUPAC has yet to officially confirm its existence.
-
+
### Step 4: See Traces in Langfuse
diff --git a/pages/guides/cookbook/integration_openai_structured_output.md b/pages/guides/cookbook/integration_openai_structured_output.md
index 2175809b5..bf2192b9a 100644
--- a/pages/guides/cookbook/integration_openai_structured_output.md
+++ b/pages/guides/cookbook/integration_openai_structured_output.md
@@ -136,7 +136,7 @@ print(result.content)
```
{"steps":[{"explanation":"We need to isolate the term with the variable, 8x. So, we start by subtracting 7 from both sides to remove the constant term on the left side.","output":"8x + 7 - 7 = -23 - 7"},{"explanation":"The +7 and -7 on the left side cancel each other out, leaving us with 8x. The right side simplifies to -30.","output":"8x = -30"},{"explanation":"To solve for x, divide both sides of the equation by 8, which is the coefficient of x.","output":"x = -30 / 8"},{"explanation":"Simplify the fraction -30/8 by finding the greatest common divisor, which is 2.","output":"x = -15 / 4"}],"final_answer":"x = -15/4"}
-
+
```python
@@ -178,7 +178,7 @@ print(final_answer)
x = -15/4
-
+
## Step 3: See your trace in Langfuse
@@ -229,7 +229,7 @@ print(result.final_answer)
[Step(explanation='To isolate the term with the variable on one side of the equation, start by subtracting 7 from both sides.', output='8x = -23 - 7'), Step(explanation='Combine like terms on the right side to simplify the equation.', output='8x = -30'), Step(explanation='Divide both sides by 8 to solve for x.', output='x = -30 / 8'), Step(explanation='Simplify the fraction by dividing both the numerator and the denominator by their greatest common divisor, which is 2.', output='x = -15 / 4')]
Final answer:
x = -15/4
-
+
## See your trace in Langfuse
diff --git a/pages/guides/cookbook/js_integration_langchain.md b/pages/guides/cookbook/js_integration_langchain.md
index 543434e22..403bb6913 100644
--- a/pages/guides/cookbook/js_integration_langchain.md
+++ b/pages/guides/cookbook/js_integration_langchain.md
@@ -62,7 +62,7 @@ console.log(res.content)
Why did the bear wear a fur coat to the BBQ?
Because it was grizzly cold outside!
-
+
### `stream`
@@ -107,7 +107,7 @@ for await (const chunk of stream) {
light
!
-
+
## Explore the trace in Langfuse
diff --git a/pages/guides/cookbook/js_tracing_example_vercel_ai_sdk.md b/pages/guides/cookbook/js_tracing_example_vercel_ai_sdk.md
index 7f7c91e23..a9f27b8ed 100644
--- a/pages/guides/cookbook/js_tracing_example_vercel_ai_sdk.md
+++ b/pages/guides/cookbook/js_tracing_example_vercel_ai_sdk.md
@@ -245,7 +245,7 @@ console.log(data);
```
Love is a complex and deep emotion that can manifest in various forms such as romantic love, platonic love, familial love, and love for oneself. It often involves feelings of care, affection, empathy, and a strong bond with another person. Love can bring joy, happiness, and fulfillment to our lives, but it can also be challenging and require effort, communication, and understanding to maintain healthy relationships. Overall, love is a fundamental aspect of human experience that can bring meaning and purpose to our lives.
-
+
### Explore the trace in the UI
@@ -257,7 +257,7 @@ console.log(response.headers.get("X-Langfuse-Trace-Url"))
```
https://cloud.langfuse.com/trace/14cd44b6-1a56-46af-ba85-3fd91bbf9739
-
+
![Trace in Langfuse UI](https://langfuse.com/images/cookbook/js_tracing_example_vercel_ai_sdk_trace.png)
diff --git a/pages/guides/cookbook/prompt_management_langchain.md b/pages/guides/cookbook/prompt_management_langchain.md
index c526e4411..93c0660b6 100644
--- a/pages/guides/cookbook/prompt_management_langchain.md
+++ b/pages/guides/cookbook/prompt_management_langchain.md
@@ -123,7 +123,7 @@ print(f"Prompt model configurations\nModel: {model}\nTemperature: {temperature}"
Prompt model configurations
Model: gpt-3.5-turbo-1106
Temperature: 0
-
+
### Create Langchain chain based on prompt
@@ -191,7 +191,7 @@ print(response.content)
- Transportation: Eco-friendly shuttle service
Overall, the wedding will be a beautiful blend of art and nature, with a focus on sustainability and creativity. The event will showcase the couple's love for each other and their shared passions, creating a memorable and unique experience for all in attendance.
-
+
## View Trace in Langfuse
diff --git a/public/images/cookbook/example_usage_of_fetch_scores_files/example_fetch_scores_langfuse.png b/public/images/cookbook/example_usage_of_fetch_scores_files/example_fetch_scores_langfuse.png
new file mode 100644
index 000000000..16bed1cc7
Binary files /dev/null and b/public/images/cookbook/example_usage_of_fetch_scores_files/example_fetch_scores_langfuse.png differ
diff --git a/public/images/cookbook/example_usage_of_fetch_scores_files/example_usage_of_fetch_scores_23_0.png b/public/images/cookbook/example_usage_of_fetch_scores_files/example_usage_of_fetch_scores_23_0.png
new file mode 100644
index 000000000..8ded42d98
Binary files /dev/null and b/public/images/cookbook/example_usage_of_fetch_scores_files/example_usage_of_fetch_scores_23_0.png differ