Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add optional crash dumping on ConnectionError #1077

Closed
wants to merge 22 commits into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
38 changes: 38 additions & 0 deletions python/langsmith/client.py
Original file line number Diff line number Diff line change
@@ -1,4 +1,4 @@
"""Client for interacting with the LangSmith API.

Check notice on line 1 in python/langsmith/client.py

View workflow job for this annotation

GitHub Actions / benchmark

Benchmark results

......................................... create_5_000_run_trees: Mean +- std dev: 573 ms +- 44 ms ......................................... create_10_000_run_trees: Mean +- std dev: 1.13 sec +- 0.06 sec ......................................... create_20_000_run_trees: Mean +- std dev: 1.14 sec +- 0.06 sec ......................................... dumps_class_nested_py_branch_and_leaf_200x400: Mean +- std dev: 767 us +- 7 us ......................................... dumps_class_nested_py_leaf_50x100: Mean +- std dev: 27.0 ms +- 0.7 ms ......................................... dumps_class_nested_py_leaf_100x200: Mean +- std dev: 110 ms +- 2 ms ......................................... dumps_dataclass_nested_50x100: Mean +- std dev: 27.4 ms +- 0.5 ms ......................................... WARNING: the benchmark result may be unstable * the standard deviation (7.07 ms) is 12% of the mean (60.3 ms) Try to rerun the benchmark with more runs, values and/or loops. Run 'python -m pyperf system tune' command to reduce the system jitter. Use pyperf stats, pyperf dump and pyperf hist to analyze results. Use --quiet option to hide these warnings. dumps_pydantic_nested_50x100: Mean +- std dev: 60.3 ms +- 7.1 ms ......................................... WARNING: the benchmark result may be unstable * the standard deviation (37.6 ms) is 17% of the mean (221 ms) * the maximum (416 ms) is 88% greater than the mean (221 ms) Try to rerun the benchmark with more runs, values and/or loops. Run 'python -m pyperf system tune' command to reduce the system jitter. Use pyperf stats, pyperf dump and pyperf hist to analyze results. Use --quiet option to hide these warnings. dumps_pydanticv1_nested_50x100: Mean +- std dev: 221 ms +- 38 ms

Check notice on line 1 in python/langsmith/client.py

View workflow job for this annotation

GitHub Actions / benchmark

Comparison against main

+------------------------------------+----------+------------------------+ | Benchmark | main | changes | +====================================+==========+========================+ | dumps_dataclass_nested_50x100 | 27.8 ms | 27.4 ms: 1.02x faster | +------------------------------------+----------+------------------------+ | dumps_class_nested_py_leaf_50x100 | 27.4 ms | 27.0 ms: 1.02x faster | +------------------------------------+----------+------------------------+ | dumps_class_nested_py_leaf_100x200 | 112 ms | 110 ms: 1.01x faster | +------------------------------------+----------+------------------------+ | create_20_000_run_trees | 1.12 sec | 1.14 sec: 1.01x slower | +------------------------------------+----------+------------------------+ | Geometric mean | (ref) | 1.01x faster | +------------------------------------+----------+------------------------+ Benchmark hidden because not significant (5): dumps_pydantic_nested_50x100, create_5_000_run_trees, create_10_000_run_trees, dumps_class_nested_py_branch_and_leaf_200x400, dumps_pydanticv1_nested_50x100

Use the client to customize API keys / workspace ocnnections, SSl certs,
etc. for tracing.
Expand All @@ -20,6 +20,7 @@
import datetime
import decimal
import functools
import gzip
import importlib
import importlib.metadata
import io
Expand Down Expand Up @@ -931,6 +932,43 @@
filler = "*" * (max(0, len(api_key) - 7))
masked_api_key = f"{prefix}{filler}{suffix}"

debug_crash_dump_file = ls_utils.get_env_var(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we read env var once in global scope and then just use that here (reading env vars is expensive)

"LANGSMITH_DEBUG_CRASH_DUMP"
)
if debug_crash_dump_file is not None and e.request is not None:
try:
headers = e.request.headers.copy()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hinthornw Let me know if you agree with this approach.

Made a shallow copy of the headers so we could get rid of the x-api-key, but log the rest of the headers.

Manually added all of the fields which I think are relevant here. This is to prevent us accidentally logging other sensitive fields (if any).

headers.pop("x-api-key", None)
request_to_log = {
"method": e.request.method,
"url": e.request.url,
"headers": headers,
"body": (
e.request.body
if isinstance(e.request, requests.PreparedRequest)
else None
),
"data": (
e.request.data
if isinstance(e.request, requests.Request)
else None
),
"json": (
e.request.json
if isinstance(e.request, requests.Request)
else None
),
}
with gzip.open(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you easily open a gzip file that has been appended to multiple times?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just looked into it a bit, it is a bit of a pain to read from multiple streams.

Do you think an acceptable alternative is to write to a file with a certain timestamp appended to the file path?

We can probably turn the env variable back to a boolean, and then use a fixed static name like debug_crash_dump_{timestamp}. wdyt? @nfcampos

debug_crash_dump_file, "at", encoding="utf-8"
) as f:
json_data = json.dumps(request_to_log)
f.write(json_data + "\n")
recommendation += f" More info can be found in: {debug_crash_dump_file}."
except Exception as write_error:
recommendation += f" Encountered this error while trying to write to {debug_crash_dump_file}: {repr(write_error)}"
continue

raise ls_utils.LangSmithConnectionError(
f"Connection error caused failure to {method} {pathname}"
f" in LangSmith API. {recommendation}"
Expand Down
Loading