Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restructuring log middleware #10

Merged
merged 6 commits into from
Nov 12, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -1,3 +1,4 @@
coverage
dist
node_modules
.eslintcache
19 changes: 11 additions & 8 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,21 +12,18 @@ Here is an example server with log tracing enabled.

```js
import {
fetchGcpProjectId,
getHttpTraceHeader,
middleware,
logger as BNLogger
logger as buildLogger
} from "@bonniernews/logger";
import express from "express";

const logger = BNLogger();
const logger = buildLogger();
const app = express();

app.use(middleware);

// Fetches the project ID from the GCP metadata server in the background on startup.
// This is only necessary if you don't set the `GCP_PROJECT` environment variable.
fetchGcpProjectId();
// This middleware will create a request context and
// automatically decorate all logs with tracing data:
app.use(middleware());

app.get("/", async (req, res) => {
logger.info("Hello, world!");
Expand All @@ -39,6 +36,12 @@ app.get("/", async (req, res) => {
});
```

The `middleware` should be put as early as possible, since only logs after this middleware will get the tracing data. The middleware will lookup the active project ID from GCP. Alternatively, you can set the `GCP_PROJECT` environment variable for this purpose.

Use the `getHttpTraceHeader` function to pass tracing headers to downstream services.

If you want to decorate logs with custom data, use the exported `decorateLogs` function. In order to use this, the middleware needs to be installed first.

### Logger Interface

#### Options
Expand Down
3 changes: 1 addition & 2 deletions index.ts
Original file line number Diff line number Diff line change
@@ -1,5 +1,4 @@
export { fetchGcpProjectId } from "./lib/gcp";
export { getHttpTraceHeader } from "./lib/http";
export { getLoggingTraceData, logger } from "./lib/logging";
export { decorateLogs, logger, Logger, LoggerOptions, DestinationStream } from "./lib/logging";
export { middleware } from "./lib/middleware";
export { createTraceparent } from "./lib/traceparent";
23 changes: 7 additions & 16 deletions lib/gcp.ts
Original file line number Diff line number Diff line change
@@ -1,26 +1,17 @@
import gcpMetaData from "gcp-metadata";

let gcpProjectId: string | undefined = undefined;

export function getGcpProjectId() {
return process.env.GCP_PROJECT || gcpProjectId;
}

/**
* Fetches the Google Cloud Platform (GCP) project ID from the GCP metadata server.
*
* You only need to call this function if you're not setting the `GCP_PROJECT` environment variable.
* You can alternatively set the `GCP_PROJECT` environment variable, which takes precedence.
*/
export async function fetchGcpProjectId() {
export async function getGcpProjectId(): Promise<string | undefined> {
if (process.env.GCP_PROJECT) {
return process.env.GCP_PROJECT;
}

const isAvailable = await gcpMetaData.isAvailable();
if (!isAvailable) return;

gcpProjectId = await gcpMetaData.project("project-id");
}

/**
* Resets the GCP project ID, for testing.
*/
export function reset() {
gcpProjectId = undefined;
return await gcpMetaData.project("project-id");
}
7 changes: 3 additions & 4 deletions lib/http.ts
Original file line number Diff line number Diff line change
@@ -1,7 +1,6 @@
import { getStore } from "./middleware";

export function getHttpTraceHeader() {
const { traceparent } = getStore();
if (traceparent) return { traceparent };
return {};
export function getHttpTraceHeader(): Record<string, string> {
const { traceparent } = getStore() || {};
return traceparent ? { traceparent } : {};
}
49 changes: 27 additions & 22 deletions lib/logging.ts
Original file line number Diff line number Diff line change
@@ -1,34 +1,39 @@
import pino, { DestinationStream, LoggerOptions } from "pino";
import { getGcpProjectId } from "./gcp";
import pino, {
DestinationStream as PinoDestinationStream,
LoggerOptions as PinoOptions,
Logger as PinoLogger,
} from "pino";
import { getStore } from "./middleware";
import { getTraceFromTraceparent } from "./traceparent";

export function getLoggingTraceData() {
const { traceparent, ...rest } = getStore();
if (!traceparent) return rest;

const trace = getTraceFromTraceparent(traceparent);
if (!trace) return rest;
function getLoggingData() {
const store = getStore();
return store ? store.logFields : {};
}

const logData = { traceId: trace.traceId, spanId: trace.parentId, ...rest };
/**
* Add any additional log data to the request context. To use this feature, you must enable the
* request storage by initializing the built-in middleware.
*/
export function decorateLogs(obj: Record<string, unknown>) {
const store = getStore();

const gcpProjectId = getGcpProjectId();
if (!gcpProjectId) return logData;
if (!store) throw new Error("@bonniernews/logger middleware has not been initialized");

return {
...logData,
"logging.googleapis.com/trace": `projects/${gcpProjectId}/traces/${trace.traceId}`,
"logging.googleapis.com/spanId": trace.parentId,
"logging.googleapis.com/trace_sampled": trace.isSampled,
};
for (const key in obj) {
store.logFields[key] = obj[key];
}
Comment on lines +17 to +24
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we need this as a separate function. Also, in this form, you can only add static data to your logs, which might not be that interesting. My thought was to add an optional argument to the middleware factory function, that can add logging data from the Request object. something like:

type MiddlewareConfig = {
  extractRequestData?: (request: Request) => Record<string, unknown>;
};

export const middleware: Middleware = ({ extractRequestData }: MiddlewareConfig = {}) => {
  return (req, _res, next) => {
    ...

    if (extractRequestData) {
      extra = extractRequestData(req);
      for (const key in obj) {
        logFields[key] = obj[key];
      }
    }
    ...

This way, you also don't have to keep track of whether the storage has been initalized or not, it's all done in the middleware.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My specific use case in greenfield is extracting the IAP headers to log which service account made a specific request.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that type of middleware option function is great! But I think its nice to have a separate decorateLogs type function that you can use anywhere in your application, such as if (user) { decorateLogs(...) }. Lets att that middleware option in the next PR 🙏

I can remove the extra function 👌

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, I misunderstood. I thought you were referring to another function. I think separating the argument on the middleware from decorateLogs() could be nice, since they cover different use cases.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, so the idea is to write my own middleware where decorateLogs is called? That is probably even more flexible actually.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you can either call it inside your own middleware, or anywhere in the request flow. I have used it on Di's site in some conditional logic before.

}

type BnLoggerOptions = Omit<LoggerOptions, "level" | "formatters"> & {
export type Logger = PinoLogger<never, boolean>;

export type LoggerOptions = Omit<PinoOptions, "level" | "formatters"> & {
logLevel?: "trace" | "debug" | "info" | "warn" | "error" | "fatal";
formatLog?: NonNullable<LoggerOptions["formatters"]>["log"];
formatLog?: NonNullable<PinoOptions["formatters"]>["log"];
};

export function logger(options: BnLoggerOptions = {}, stream?: DestinationStream | undefined) {
export type DestinationStream = PinoDestinationStream;

export function logger(options: LoggerOptions = {}, stream?: DestinationStream | undefined): Logger {
const env = process.env.NODE_ENV /* c8 ignore next */ || "development";
const shouldPrettyPrint = ["development", "test", "dev"].includes(env) && !stream;

Expand Down Expand Up @@ -68,7 +73,7 @@ export function logger(options: BnLoggerOptions = {}, stream?: DestinationStream
...(formatLog && { log: formatLog }),
},
transport,
mixin: (...args) => ({ ...getLoggingTraceData(), ...mixin?.(...args) }),
mixin: (...args) => ({ ...getLoggingData(), ...mixin?.(...args) }),
...rest,
},
stream
Expand Down
56 changes: 45 additions & 11 deletions lib/middleware.ts
Original file line number Diff line number Diff line change
@@ -1,23 +1,57 @@
import type { RequestHandler } from "express";
import { AsyncLocalStorage } from "node:async_hooks";
import { createTraceparent } from "./traceparent";
import { getGcpProjectId } from "./gcp";
import { getTraceFromTraceparent, createTraceparent } from "./traceparent";

type Store = {
traceparent?: string;
clientServiceAccount?: string;
type LogFields = {
traceId?: string;
spanId?: string;
"logging.googleapis.com/trace"?: string;
"logging.googleapis.com/spanId"?: string;
"logging.googleapis.com/trace_sampled"?: boolean;
[key: string]: unknown;
};

export type Store = {
traceparent?: string;
logFields: LogFields;
};

export type Middleware = () => RequestHandler;

const storage = new AsyncLocalStorage<Store>();

export const middleware: RequestHandler = (req, _res, next) => {
const traceparent = req.header("traceparent") || createTraceparent();
export const middleware: Middleware = () => {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps also rename this to something better too. Prefix with create? Or should it be called something less generic over all - app.use(logStoreMiddleware()) or app.use(logTracingMiddleware())?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think middleware is ok. We're only exporting 1 middleware in this lib and callers are free to rename it when they import it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perfect! I've always felt torn on this topic, since named exports affect implementors this way. But I'm happy with this approach.

let initialized = false;
let projectId: string | undefined;

return async (req, _res, next) => {
if (!initialized) {
initialized = true;
projectId = await getGcpProjectId();
}

const traceparent = req.header("traceparent") || createTraceparent();
const trace = getTraceFromTraceparent(traceparent);
Comment on lines +34 to +35
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should combine these rows, since we might get no trace if the header was invalid. Thoughts?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think both variants work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll leave it like this for now then 😄
Probably easier to notice somethings wrong if the trace doesn't exist vs some random other trace is there instead.

const logFields: LogFields = {};

if (trace) {
logFields.traceId = trace.traceId;
logFields.spanId = trace.parentId;

if (projectId) {
logFields["logging.googleapis.com/trace"] = `projects/${projectId}/traces/${trace.traceId}`;
logFields["logging.googleapis.com/spanId"] = trace.parentId;
logFields["logging.googleapis.com/trace_sampled"] = trace.isSampled;
}
}

storage.run({ traceparent }, () => {
next();
});
storage.run({ traceparent, logFields }, () => {
next();
});
};
};

export function getStore() {
return storage.getStore() || {};
export function getStore(): Store | undefined {
return storage.getStore();
}
Loading