
Commit

fix links
baskaryan committed Nov 25, 2024
1 parent b2f1118 commit 57baca8
Showing 14 changed files with 43 additions and 43 deletions.
14 changes: 7 additions & 7 deletions docs/evaluation/how_to_guides/async.mdx
@@ -4,19 +4,19 @@ import { CodeTabs, python } from "@site/src/components/InstructionsWithCode";

:::info Key concepts

[Evaluations](../../concepts#applying-evaluations) | [Evaluators](../../concepts#evaluators) | [Datasets](../../concepts#datasets) | [Experiments](../../concepts#experiments)
[Evaluations](../concepts#applying-evaluations) | [Evaluators](../concepts#evaluators) | [Datasets](../concepts#datasets) | [Experiments](../concepts#experiments)

:::

We can run evaluations asynchronously via the SDK using [aevaluate()](https://langsmith-sdk.readthedocs.io/en/latest/evaluation/langsmith.evaluation._arunner.aevaluate.html),
which accepts all of the same arguments as [evaluate()](https://langsmith-sdk.readthedocs.io/en/latest/evaluation/langsmith.evaluation._runner.evaluate.html) but expects the application function to be asynchronous.
You can learn more about how to use the `evaluate()` function [here](../../how_to_guides/evaluate_llm_application).
We can run evaluations asynchronously via the SDK using [aevaluate()](https://langsmith-sdk.readthedocs.io/en/latest/evaluation/langsmith.evaluation._arunner.aevaluate.html),
which accepts all of the same arguments as [evaluate()](https://langsmith-sdk.readthedocs.io/en/latest/evaluation/langsmith.evaluation._runner.evaluate.html) but expects the application function to be asynchronous.
You can learn more about how to use the `evaluate()` function [here](./evaluate_llm_application).

:::info Python only

This guide is only relevant when using the Python SDK.
In JS/TS the `evaluate()` function is already async.
You can see how to use it [here](../../how_to_guides/evaluate_llm_application).
You can see how to use it [here](./evaluate_llm_application).

:::
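
As a rough sketch of what such an async evaluation can look like (the dataset name, target function, and evaluator below are illustrative placeholders, assuming a recent `langsmith` Python SDK):

```python
import asyncio

from langsmith.evaluation import aevaluate

async def my_app(inputs: dict) -> dict:
    # Placeholder async application; a real target would await an LLM call here.
    await asyncio.sleep(0)
    return {"answer": inputs["question"].upper()}

def correct(outputs: dict, reference_outputs: dict) -> bool:
    # Exact-match check of the actual output against the reference output.
    return outputs["answer"] == reference_outputs["answer"]

async def main() -> None:
    # Assumes a dataset named "my-dataset" already exists in LangSmith.
    await aevaluate(
        my_app,
        data="my-dataset",
        evaluators=[correct],
        experiment_prefix="async-sketch",
    )

asyncio.run(main())
```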

@@ -76,5 +76,5 @@ list 5 concrete questions that should be investigated to determine if the idea i

## Related

- [Run an evaluation (synchronously)](../../how_to_guides/evaluate_llm_application)
- [Handle model rate limits](../../how_to_guides/rate_limiting)
- [Run an evaluation (synchronously)](./evaluate_llm_application)
- [Handle model rate limits](./rate_limiting)
8 changes: 4 additions & 4 deletions docs/evaluation/how_to_guides/custom_evaluator.mdx
@@ -8,12 +8,12 @@ import {

:::info Key concepts

- [Evaluators](../../concepts#evaluators)
- [Evaluators](../concepts#evaluators)

:::

Custom evaluators are just functions that take a dataset example and the resulting application output, and return one or more metrics.
These functions can be passed directly into [evaluate()](https://langsmith-sdk.readthedocs.io/en/latest/evaluation/langsmith.evaluation._runner.evaluate.html) / [aevaluate()](https://langsmith-sdk.readthedocs.io/en/latest/evaluation/langsmith.evaluation._arunner.aevaluate.html).
These functions can be passed directly into [evaluate()](https://langsmith-sdk.readthedocs.io/en/latest/evaluation/langsmith.evaluation._runner.evaluate.html) / [aevaluate()](https://langsmith-sdk.readthedocs.io/en/latest/evaluation/langsmith.evaluation._arunner.aevaluate.html).
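
For instance, a bare-bones sketch of that shape (the metric name, keys, and threshold are invented for illustration; the documented version follows in the basic example below):

```python
def concision(outputs: dict, reference_outputs: dict) -> dict:
    # Hypothetical metric: is the answer at most twice as long as the reference answer?
    score = int(len(outputs["answer"]) <= 2 * len(reference_outputs["answer"]))
    return {"key": "concision", "score": score}
```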

## Basic example

@@ -138,5 +138,5 @@ answer is logically valid and consistent with question and the answer."""

## Related

- [Evaluate aggregate experiment results](../../how_to_guides/summary): Define summary evaluators, which compute metrics for an entire experiment.
- [Run an evaluation comparing two experiments](../../how_to_guides/evaluate_pairwise): Define pairwise evaluators, which compute metrics by comparing two (or more) experiments against each other.
- [Evaluate aggregate experiment results](./summary): Define summary evaluators, which compute metrics for an entire experiment.
- [Run an evaluation comparing two experiments](./evaluate_pairwise): Define pairwise evaluators, which compute metrics by comparing two (or more) experiments against each other.
28 changes: 14 additions & 14 deletions docs/evaluation/how_to_guides/evaluate_llm_application.mdx
@@ -12,17 +12,17 @@ import {

:::info Key concepts

[Evaluations](../../concepts#applying-evaluations) | [Evaluators](../../concepts#evaluators) | [Datasets](../../concepts#datasets)
[Evaluations](../concepts#applying-evaluations) | [Evaluators](../concepts#evaluators) | [Datasets](../concepts#datasets)

:::

In this guide we'll go over how to evaluate an application using the [evaluate()](https://langsmith-sdk.readthedocs.io/en/latest/evaluation/langsmith.evaluation._runner.evaluate.html) method in the LangSmith SDK.
In this guide we'll go over how to evaluate an application using the [evaluate()](https://langsmith-sdk.readthedocs.io/en/latest/evaluation/langsmith.evaluation._runner.evaluate.html) method in the LangSmith SDK.

:::tip

For larger evaluation jobs in Python we recommend using [aevaluate()](https://langsmith-sdk.readthedocs.io/en/latest/evaluation/langsmith.evaluation._arunner.aevaluate.html), the asynchronous version of `evaluate()`.
For larger evaluation jobs in Python we recommend using [aevaluate()](https://langsmith-sdk.readthedocs.io/en/latest/evaluation/langsmith.evaluation._arunner.aevaluate.html), the asynchronous version of `evaluate()`.
It is still worthwhile to read this guide first, as the two have nearly identical interfaces,
and then read the how-to guide on [running an evaluation asynchronously](../../how_to_guides/async).
and then read the how-to guide on [running an evaluation asynchronously](./async).

:::

@@ -92,7 +92,7 @@ To understand how to annotate your code for tracing, please refer to [this guide

## Create or select a dataset

We need a [Dataset](../../concepts#datasets) to evaluate our application on. Our dataset will contain labeled [examples](../../concepts#examples) of toxic and non-toxic text.
We need a [Dataset](../concepts#datasets) to evaluate our application on. Our dataset will contain labeled [examples](../concepts#examples) of toxic and non-toxic text.

<CodeTabs
groupId="client-language"
@@ -150,11 +150,11 @@ We need a [Dataset](../../concepts#datasets) to evaluate our application on. Our
]}
/>

See [here](../../how_to_guides#dataset-management) for more on dataset management.
See [here](.#dataset-management) for more on dataset management.
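
As a rough Python sketch of creating such a dataset programmatically (the dataset name and examples are invented for illustration):

```python
from langsmith import Client

client = Client()  # assumes LANGSMITH_API_KEY is set in the environment

dataset = client.create_dataset(dataset_name="toxic-queries-demo")
client.create_examples(
    inputs=[{"text": "Shut up, idiot"}, {"text": "You're a wonderful person"}],
    outputs=[{"label": "Toxic"}, {"label": "Not toxic"}],
    dataset_id=dataset.id,
)
```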

## Define an evaluator

[Evaluators](../../concepts#evaluators) are functions for scoring your application's outputs. They take in the example inputs, actual outputs, and, when present, the reference outputs.
[Evaluators](../concepts#evaluators) are functions for scoring your application's outputs. They take in the example inputs, actual outputs, and, when present, the reference outputs.
Since we have labels for this task, our evaluator can directly check if the actual outputs match the reference outputs.

<CodeTabs
@@ -176,11 +176,11 @@ Since we have labels for this task, our evaluator can directly check if the actu
]}
/>

See [here](../../how_to_guides#define-an-evaluator) for more on how to define evaluators.
See [here](.#define-an-evaluator) for more on how to define evaluators.

## Run the evaluation

We'll use the [evaluate()](https://langsmith-sdk.readthedocs.io/en/latest/evaluation/langsmith.evaluation._runner.evaluate.html) / [aevaluate()](https://langsmith-sdk.readthedocs.io/en/latest/evaluation/langsmith.evaluation._arunner.aevaluate.html) methods to run the evaluation.
We'll use the [evaluate()](https://langsmith-sdk.readthedocs.io/en/latest/evaluation/langsmith.evaluation._runner.evaluate.html) / [aevaluate()](https://langsmith-sdk.readthedocs.io/en/latest/evaluation/langsmith.evaluation._arunner.aevaluate.html) methods to run the evaluation.

The key arguments are:

@@ -214,11 +214,11 @@ The key arguments are:
]}
/>

See [here](../../how_to_guides#run-an-evaluation) for other ways to kick off evaluations and [here](../../how_to_guides#configure-an-evaluation-job) for how to configure evaluation jobs.
See [here](.#run-an-evaluation) for other ways to kick off evaluations and [here](.#configure-an-evaluation-job) for how to configure evaluation jobs.
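
Putting the pieces together, a minimal sketch of the call (the target function, dataset name, and evaluator here are simplified placeholders, not the guide's own implementation):

```python
from langsmith.evaluation import evaluate

def toxicity_classifier(inputs: dict) -> dict:
    # Placeholder application; a real target would call an LLM here.
    return {"label": "Toxic" if "idiot" in inputs["text"].lower() else "Not toxic"}

def correct(outputs: dict, reference_outputs: dict) -> bool:
    # Exact-match check against the labeled reference output.
    return outputs["label"] == reference_outputs["label"]

results = evaluate(
    toxicity_classifier,         # the application, or "target"
    data="toxic-queries-demo",   # assumed to be an existing dataset name
    evaluators=[correct],
    experiment_prefix="toxicity-baseline",
)
```

Each call like this creates a new experiment on the dataset, which is what the next section explores.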

## Explore the results

Each invocation of `evaluate()` creates an [Experiment](../../concepts#experiments) which can be viewed in the LangSmith UI or queried via the SDK.
Each invocation of `evaluate()` creates an [Experiment](../concepts#experiments) which can be viewed in the LangSmith UI or queried via the SDK.
Evaluation scores are stored against each actual output as feedback.

_If you've annotated your code for tracing, you can open the trace of each row in a side panel view._
@@ -364,6 +364,6 @@ _If you've annotated your code for tracing, you can open the trace of each row i

## Related

- [Run an evaluation asynchronously](../../how_to_guides/async)
- [Run an evaluation via the REST API](../../how_to_guides/run_evals_api_only)
- [Run an evaluation from the prompt playground](../../how_to_guides/run_evaluation_from_prompt_playground)
- [Run an evaluation asynchronously](./async)
- [Run an evaluation via the REST API](./run_evals_api_only)
- [Run an evaluation from the prompt playground](./run_evaluation_from_prompt_playground)
Original file line number Diff line number Diff line change
@@ -391,4 +391,4 @@ The experiment will contain the results of the evaluation, including the scores

## Related

- [Evaluate a `langgraph` graph](../evaluation/langgraph)
- [Evaluate a `langgraph` graph](./langgraph)
4 changes: 2 additions & 2 deletions docs/evaluation/how_to_guides/evaluate_pairwise.mdx
@@ -13,14 +13,14 @@ import {

:::info Key concepts

- [Pairwise evaluations](../../concepts#pairwise)
- [Pairwise evaluations](../concepts#pairwise)

:::

LangSmith supports evaluating **existing** experiments in a comparative manner.
This allows you to score the outputs from multiple experiments against each other, rather than being confined to evaluating outputs one at a time.
Think [LMSYS Chatbot Arena](https://chat.lmsys.org/) - this is the same concept!
To do this, use the [evaluate_comparative](https://langsmith-sdk.readthedocs.io/en/latest/evaluation/langsmith.evaluation._runner.evaluate_comparative.html) / `evaluateComparative` function with two existing experiments.
To do this, use the [evaluate_comparative](https://langsmith-sdk.readthedocs.io/en/latest/evaluation/langsmith.evaluation._runner.evaluate_comparative.html) / `evaluateComparative` function with two existing experiments.

If you haven't already created experiments to compare, check out our [quick start](https://docs.smith.langchain.com/#5-run-your-first-evaluation) or our [how-to guide](https://docs.smith.langchain.com/how_to_guides/evaluate_llm_application) to get started with evaluations.
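
As a very rough sketch of the shape of such a call (the experiment names and preference logic are placeholders, and the simplified `inputs`/`outputs` evaluator signature is an assumption to check against the SDK reference):

```python
from langsmith.evaluation import evaluate_comparative

def shorter_is_preferred(inputs: dict, outputs: list[dict]) -> list[int]:
    # Toy preference: rank the experiment with the shorter answer higher.
    lengths = [len(o.get("answer", "")) for o in outputs]
    return [1 if length == min(lengths) else 0 for length in lengths]

evaluate_comparative(
    ["experiment-1-name-or-id", "experiment-2-name-or-id"],  # two existing experiments
    evaluators=[shorter_is_preferred],
)
```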

2 changes: 1 addition & 1 deletion docs/evaluation/how_to_guides/langchain_runnable.mdx
@@ -136,4 +136,4 @@ The runnable is traced appropriately for each output.

## Related

- [How to evaluate a `langgraph` graph](../evaluation/langgraph)
- [How to evaluate a `langgraph` graph](./langgraph)
2 changes: 1 addition & 1 deletion docs/evaluation/how_to_guides/langgraph.mdx
@@ -239,7 +239,7 @@ If we need access to information about intermediate steps that isn't in state, w

:::tip Custom evaluators

See more about what arguments you can pass to custom evaluators in this [how-to guide](../evaluation/custom_evaluator).
See more about what arguments you can pass to custom evaluators in this [how-to guide](./custom_evaluator).

:::

6 changes: 3 additions & 3 deletions docs/evaluation/how_to_guides/llm_as_judge.mdx
@@ -8,7 +8,7 @@ import {

:::info Key concepts

- [LLM-as-a-judge evaluator](../../concepts#llm-as-judge)
- [LLM-as-a-judge evaluator](../concepts#llm-as-judge)

:::

@@ -72,8 +72,8 @@ for the answer is logically valid and consistent with question and the answer.\\
]}
/>

See [here](../../how_to_guides/custom_evaluator) for more on how to write a custom evaluator.
See [here](./custom_evaluator) for more on how to write a custom evaluator.
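
As an illustrative sketch of hand-rolling such a judge with the OpenAI client (the model name, prompt, metric, and input/output keys are assumptions, not this guide's own example):

```python
from openai import OpenAI

oai = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def conciseness_judge(inputs: dict, outputs: dict) -> bool:
    # Ask a model to grade the answer; return a boolean feedback score.
    response = oai.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Reply with only 'Y' or 'N'."},
            {
                "role": "user",
                "content": (
                    f"Question: {inputs['question']}\n"
                    f"Answer: {outputs['answer']}\n"
                    "Is the answer concise? (Y/N)"
                ),
            },
        ],
    )
    return response.choices[0].message.content.strip().upper().startswith("Y")
```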

## Prebuilt evaluator via `langchain`

See [here](../../how_to_guides/use_langchain_off_the_shelf_evaluators) for how to use prebuilt evaluators from `langchain`.
See [here](./use_langchain_off_the_shelf_evaluators) for how to use prebuilt evaluators from `langchain`.
Original file line number Diff line number Diff line change
@@ -7,7 +7,7 @@ sidebar_position: 1
:::tip Recommended Reading
Before diving into this content, it might be helpful to read the following:

- [Concepts guide on evaluation and datasets](../../concepts#datasets-and-examples)
- [Concepts guide on evaluation and datasets](../concepts#datasets-and-examples)

:::

@@ -36,14 +36,14 @@ Certain fields in your schema have a `+ Transformations` option.
Transformations are preprocessing steps that, if enabled, update your examples when you add them to the dataset.
For example, the `convert to OpenAI messages` transformation will convert message-like objects, such as LangChain messages, to the OpenAI message format.

For the full list of available transformations, see [our reference](/reference/evaluation/dataset_transformations).
For the full list of available transformations, see [our reference](/reference/evaluation/dataset_transformations).

:::note
If you plan to collect production traces in your dataset from LangChain
[ChatModels](https://python.langchain.com/docs/concepts/chat_models/)
or from OpenAI calls using the [LangSmith OpenAI wrapper](/observability/how_to_guides/tracing/annotate_code#wrap-the-openai-client), we offer a prebuilt Chat Model schema that converts messages and tools into industry-standard OpenAI formats that can be used downstream with any model for testing. You can also customize the template settings to match your use case.

Please see the [dataset transformations reference](/reference/evaluation/dataset_transformations) for more information.
Please see the [dataset transformations reference](/reference/evaluation/dataset_transformations) for more information.
:::

## Add runs to your dataset in the UI
4 changes: 2 additions & 2 deletions docs/evaluation/how_to_guides/metric_type.mdx
@@ -6,7 +6,7 @@ import {

# How to return categorical vs numerical metrics

LangSmith supports both categorical and numerical metrics, and you can return either when writing a [custom evaluator](../../how_to_guides/custom_evaluator).
LangSmith supports both categorical and numerical metrics, and you can return either when writing a [custom evaluator](./custom_evaluator).

For an evaluator result to be logged as a numerical metric, it must be returned as:

@@ -68,4 +68,4 @@ Here are some examples:

## Related

- [Return multiple metrics in one evaluator](../../how_to_guides/multiple_scores)
- [Return multiple metrics in one evaluator](./multiple_scores)
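
As a rough illustration of the distinction above (assuming the LangSmith convention that numerical feedback uses a `score` field and categorical feedback uses a `value` field; the metric names and values are placeholders):

```python
def similarity(outputs: dict, reference_outputs: dict) -> dict:
    # Numerical metric: report it under "score".
    return {"key": "similarity", "score": 0.87}

def tone(outputs: dict) -> dict:
    # Categorical metric: report it under "value".
    return {"key": "tone", "value": "formal"}
```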
4 changes: 2 additions & 2 deletions docs/evaluation/how_to_guides/multiple_scores.mdx
@@ -6,7 +6,7 @@ import {

# How to return multiple scores in one evaluator

Sometimes it is useful for a [custom evaluator function](../../how_to_guides/custom_evaluator) or [summary evaluator function](../../how_to_guides/summary) to return multiple metrics.
Sometimes it is useful for a [custom evaluator function](./custom_evaluator) or [summary evaluator function](./summary) to return multiple metrics.
For example, if you have multiple metrics being generated by an LLM judge, you can save time and money by making a single LLM call that generates multiple metrics instead of making multiple LLM calls.

To return multiple scores using the Python SDK, simply return a list of dictionaries/objects of the following form:
@@ -75,4 +75,4 @@ Rows from the resulting experiment will display each of the scores.

## Related

- [Return categorical vs numerical metrics](../../how_to_guides/metric_type)
- [Return categorical vs numerical metrics](./metric_type)
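
For instance, a sketch of a single evaluator that emits two metrics from one pass over the outputs (the metric names and the `entities` keys are illustrative assumptions, not the guide's own example):

```python
def precision_recall(outputs: dict, reference_outputs: dict) -> list[dict]:
    predicted = set(outputs.get("entities", []))
    expected = set(reference_outputs.get("entities", []))
    true_positives = len(predicted & expected)
    return [
        {"key": "precision", "score": true_positives / len(predicted) if predicted else 0.0},
        {"key": "recall", "score": true_positives / len(expected) if expected else 0.0},
    ]
```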
2 changes: 1 addition & 1 deletion docs/evaluation/how_to_guides/rate_limiting.mdx
@@ -76,7 +76,7 @@ See some examples of how to do this in the [OpenAI docs](https://platform.openai
## Limiting max_concurrency

Limiting the number of concurrent calls you're making to your application and evaluators is another way to decrease the frequency of model calls you're making, and in that way avoid rate limit errors.
`max_concurrency` can be set directly on the [evaluate()](https://langsmith-sdk.readthedocs.io/en/latest/evaluation/langsmith.evaluation._runner.evaluate.html) / [aevaluate()](https://langsmith-sdk.readthedocs.io/en/latest/evaluation/langsmith.evaluation._arunner.aevaluate.html) functions.
`max_concurrency` can be set directly on the [evaluate()](https://langsmith-sdk.readthedocs.io/en/latest/evaluation/langsmith.evaluation._runner.evaluate.html) / [aevaluate()](https://langsmith-sdk.readthedocs.io/en/latest/evaluation/langsmith.evaluation._arunner.aevaluate.html) functions.
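
For example, a minimal sketch (the target, dataset, and evaluator are placeholders):

```python
from langsmith.evaluation import evaluate

def my_app(inputs: dict) -> dict:       # placeholder target
    return {"answer": inputs["question"]}

def non_empty(outputs: dict) -> bool:   # placeholder evaluator
    return bool(outputs["answer"])

evaluate(
    my_app,
    data="my-dataset",       # assumed to be an existing dataset
    evaluators=[non_empty],
    max_concurrency=4,       # cap on concurrent target and evaluator calls
)
```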

<CodeTabs
groupId="client-language"
2 changes: 1 addition & 1 deletion docs/evaluation/how_to_guides/run_evals_api_only.mdx
@@ -191,7 +191,7 @@ for model_name in model_names:
## Run a pairwise experiment

Next, we'll demonstrate how to run a pairwise experiment. In a pairwise experiment, you compare two examples against each other.
For more information, check out [this guide](../evaluation/evaluate_pairwise).
For more information, check out [this guide](./evaluate_pairwise).

```python
# A comparative experiment allows you to provide a preferential ranking on the outputs of two or more experiments
2 changes: 1 addition & 1 deletion docs/evaluation/how_to_guides/version_datasets.mdx
@@ -46,4 +46,4 @@ client.update_dataset_tag(
)
```

To run an evaluation on a particular tagged version of a dataset, you can follow [this guide](../evaluation/dataset_version).
To run an evaluation on a particular tagged version of a dataset, you can follow [this guide](./dataset_version).
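
As a rough sketch of the idea (the dataset name, tag, and target are placeholders, and `list_examples(..., as_of=...)` is assumed from recent SDK versions):

```python
from langsmith import Client
from langsmith.evaluation import evaluate

client = Client()

# Pull the examples as they existed at the hypothetical "prod" tag.
prod_examples = client.list_examples(dataset_name="my-dataset", as_of="prod")

def echo(inputs: dict) -> dict:  # placeholder target
    return {"answer": inputs["question"]}

evaluate(echo, data=prod_examples)
```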
