-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OpenTelemetry Support #101
OpenTelemetry Support #101
Conversation
258451c
to
b1ebc2b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went ahead and gave a self-review. The major gap I see is connecting the parent trace/state headers to the context - I'll make that change shortly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is looking really good! I think y'all have been talking through changing the process.env.HONEYCOMB_API_HOST
check; assuming the tests pass & a manual re-scaffold of echolibrium produces the expected spans, I'd say we're pretty much good to go!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I forgot to mention two things in my first review:
-
We'll need to export the OTel
context
andtrace
objects so that callers can use them to do things like add context to the automatically generated current span, start explicit spans if desired, etc
We might as well export thebeeline
object for the HC case, while we're at it.
This work could be in a followup PR if you want to test the current functionality first. -
Does the Node autoinstrumentation package that we're pulling in automatically add the trace* headers to outgoing HTTP calls made by the app?
second, does that work if the app author decides to use undici (my memeory is that it isn't autoinstrumented yet), if so we should document how to do manual traceheader propogation in an example or something for undici users.
Cheers!
Yeah, we can do that. In my head I expected people to import the otel abstractions directly, but I don't see a reason why we shouldn't export them. Similarly with the beelines object.
I honestly don't know. I'll dig into both what the node auto-instrumentation actually gets us and the
I actually knew nothing about undici until just now, and now I know what's on the npm page. 😆 My guess is that you're right about the auto-instrumentation, but I'll have to take a closer look at that and get back to you. |
templates/boltzmann/core/prelude.ts
Outdated
let sampler: Sampler = new AlwaysOnSampler() | ||
|
||
if (sampleRate === 0) { | ||
sampler = new AlwaysOffSampler() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, I'm confused. The sample rate would be # of traces recorded by honeycomb
/ total number of traces
right? So a sample rate of 1.0
would be always on and a sample rate of 0.0
would be always off. Is the convention backwards from how I've been reading it?
Josh, one question that I can't track down among all the code: are we exporting the OTel |
This is a good question, and there's actually a lot of As for the longer conversation: as far as I can tell, anything imported into boltzmann is also exported. This is partially a limitation of the build scripts (afaict) but has the added bonus of you being able to access any imported dependency, as-imported. In theory you can import 'beelines' from it today, and it's appropriate to do the same with otel. But there's a lot of otel stuff across a lot of namespaces and I don't know how much I want to pollute boltzmann's with otel noise. All this stuff is complicated by how I moved honeycomb stuff into a new file. Anyway: Right now honeycomb.ts simply imports what it needs and then exports it to stick with conventions and avoid collisions with other boltzmann objects. That's working reasonably well, minus a handful of things I have to button up. But the biggest of those is how to make the most use out of the namespace given to us. My best plan is to move towards importing * and giving the result a clearly module-related name: import * as otelAPI from '@opentelemetry/api'; The boltzmann namespace will still be a little busy and some of the exports will have clunky names. But in this example you would get you the whole API, including trace and context, in a predictable location. I think that's overall a win. But let me know if this makes sense to you! |
I think that makes sense!! But I'm so so far from knowledgeable in Node, it would probably be worth double checking with Chris to get the opinion of someone who knows what the hell they're talking about 😅 |
2e6477a
to
4cf40fe
Compare
11eb035
to
cc457bf
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is close. I found a few things that I want to patch - some of which I'd just forgotten to push - but otherwise this is damned near what I want to go with!
"githubci": false, | ||
"ping": true, | ||
"status": true | ||
"status": true, | ||
"debug": false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll regenerate all these examples prior to release. This debug
here should go away - it was a bug that I've since licked.
@@ -338,13 +450,13 @@ | |||
), | |||
DependencySpec( | |||
name: "@typescript-eslint/parser", | |||
version: "^4.11.0", | |||
version: "^5.12.1", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of our apps was extremely grouchy at me using the older versions. I naively upgraded them here and the app became happy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense!
@@ -0,0 +1,20 @@ | |||
declare module 'honeycomb-beeline' { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The example build was really grouchy about this type stub being missing. I added it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 I wonder if we should provide it automatically as part of the typescript scaffold?
edit: ah, I see, it is scaffolded in dirspec.ron
, but somehow it ended up missing from this example at some point in the past!
@@ -172,10 +172,11 @@ r#"declare module 'honeycomb-beeline' { | |||
finishTrace: (_: string) => void | |||
bindFunctionToTrace<T extends Fn>(...args: Parameters<T>): ReturnType<T> | |||
addContext: (opts?: Record<string, unknown>) => void | |||
addTraceContext: (opts?: Record<string, unknown>) => void |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps surprising, but this was missing from the type stubs despite being a very common method for our apps to be calling! In practice we're moving away from using this method, but it's here now.
declare const beeline: Beeline; | ||
export = beeline; | ||
declare const beeline: Beeline | ||
export = beeline |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was causing a linting error in one of the apps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Semicolons: the big bug scourge of the javascript
@@ -127,8 +130,20 @@ class Context { | |||
/**{{- tsdoc(page="02-handlers.md", section="traceURL") -}}*/ | |||
get traceURL () { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't actually figured out how to get boltzmann to show me this URL. <_<
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
console.log(context.traceURL)
in a handler will do it!
* trace with opentelemetry | ||
* ╱╱┏┳┓╭╮┏┳┓ ╲╲ | ||
* ▔▏┗┻┛┃┃┗┻┛▕▔ | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PLEASE let me keep this comment lol
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
of course!
void `{% if honeycomb %}` | ||
honeycomb.logger = bole('boltzmann:honeycomb') | ||
if (honeycomb.features.beeline) { | ||
honeycomb.logger.warn( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm split on this. On one hand I'd like us to get off beelines as soon as it makes sense, but on the other Honeycomb itself has not deprecated beeline, merely placing it in "maintenance mode". What do people think?
templates/github-action-test.yml
Outdated
@@ -25,10 +25,10 @@ jobs: | |||
{% endif %} | |||
steps: | |||
- uses: actions/checkout@v2 | |||
- name: Use Node.js 12.x | |||
uses: actions/setup-node@v1 | |||
- name: Use Node.js 16.x |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bunch of the apps had been rolled to v16 in their test actions, so I made the change here as well. v12 is old!
80fbaf6
to
1f8a01d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hey Josh, if we aren't setting spanKind: "Server" where appropriate, or "Internal" for the middleware appropriate, then we should add that. It's important for Honeycomb metrics and UI.
See here: https://opentelemetry.io/docs/instrumentation/js/api/tracing/#span-kind
We should be setting the span kind throughout, but if you spot something I missed definitely point it out! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
WHEW WHAT A DOOZY
great work, josh!
I have some comments which I will summarize after posting the review!
@@ -0,0 +1,20 @@ | |||
declare module 'honeycomb-beeline' { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 I wonder if we should provide it automatically as part of the typescript scaffold?
edit: ah, I see, it is scaffolded in dirspec.ron
, but somehow it ended up missing from this example at some point in the past!
* trace with opentelemetry | ||
* ╱╱┏┳┓╭╮┏┳┓ ╲╲ | ||
* ▔▏┗┻┛┃┃┗┻┛▕▔ | ||
*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
of course!
<string>handler.version | ||
) | ||
|
||
Object.entries(context.params).forEach(([key, value]) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💡 I think span
has a setAttributes
method that we can use to set the attributes in bulk.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It does, I wrote this line before noticing and it looks like I forgot to clean this up.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I looked closer at this, and yes that's true, however I'm actually doing a map across the keys on L238 (paramSpanAttribute(key)
). I could do an Object.fromEntries
and pass that to span.setAttributes
but I don't feel like it would make this more clear. But maybe I'm missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this is ready to resolve after my investigation ☝️ but I'll keep it open so we can ensure consensus on this one.
To recap my comments:
|
Co-authored-by: Chris Dickinson <[email protected]>
Background
This PR adds support for OpenTelemetry tracing with Honeycomb. For more background, see https://github.com/eaze/otel/issues/1 . Closes #100. Closes #95.
Overview
There are a lot of changes in this branch! Most of them are OpenTelemetry related, but a few of them were quality-of-life improvements I made along the way. In particular, it closes #95. I'll point out the others in individual comments.
I would start in
templates/boltzmann/core/honeycomb.ts
, and referencetemplates/boltzmann/core/prelude.ts
andtemplates/boltzmann/index.tera
to see how it fits together with them. Then, check outtemplates/boltzmann/core/entrypoint.ts
to see how it gets booted. It should read more or less in order, minus some subclassing you may want to check out on a second pass.That'll give you the basics of the startup procedure. The next big pieces are gonna be
templates/boltzmann/core/middleware.ts
andtemplates/boltzmann/middleware/honeycomb.ts
. Those files are the ones responsible for creating the automatic boltzmann spans. In general I just have two implementations of everything - this touches beeline code as little as possible, in return for a little dampness. I think this is better than a bespoke facade people will want to rip out later, and adding a beeline compatibility layer inside the otel abstractions themselves would be both obnoxious and more likely to break beeline.middleware.ts
in particular contains the major tracing test - it creates a honeycomb otel mock, sets up a test app, simulates a request with shot and a custom root span, and asserts that the expected traces get emitted.Logging: OpenTelemetry ships with a logger, accessible under the
otel.diag
API - and we use it. That said, there's two major complications:So logging looks like this:
LOG_LEVEL
is set todebug
, we configure an otel logger{% if not selftest %}
blocks nor{% else %}
statements. This is why it does the weird let trick.LOG_LEVEL
is set, we default otel logs toinfo
. Most logging we do is at theverbose
log level, which is even lower thandebug
. This separate log level may be configured with the standardOTEL_LOG_LEVEL
env variable.JSON.stringify
. Abole
logger is then attached intemplates/boltzmann/middleware/log.ts
.The most interesting set of logging hooks is going to be in the
HoneycombTraceExporter
class incore/honeycomb.ts
. Those log when spans are being sent/received.This should leave a couple of bin script and package.json tweaks, updates to dependencies, and some light edits to type stubs, github action templates and docs.
Current State
All the unit tests are passing and I've created both beeline and otel traces in my test cluster, using our refinery. All of the PRs for updating the services are up in draft mode, with their boltzmanns generated against this branch. Everything seems to behave as-expected when I run our CI tests! I think we're good to go.