Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Azure log exporter #668

Merged
merged 13 commits into from
May 31, 2019
Merged

Azure log exporter #668

merged 13 commits into from
May 31, 2019

Conversation

reyang
Copy link
Contributor

@reyang reyang commented May 30, 2019

  1. Implemented the actual log conversion and exporting logic.
  2. Moved the network/storage transmission logic from trace exporter to a shared mixin, applied to both traces and logs.
  3. Avoid common names (Object and prototype), use BaseObject and _default instead. (Addressing the comments left in Initial version of Azure extension #613 (comment)).

@reyang reyang requested review from c24t, songy23 and a team as code owners May 30, 2019 15:37
@@ -97,7 +106,7 @@ def stop(self, timeout=None):
return time.time() - start_time # time taken to stop


class AzureLogHandler(BaseLogHandler):
class AzureLogHandler(TransportMixin, BaseLogHandler):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's some duplicated behaviour in AzureLogHandler and the BaseExporter logic. Ideally we should have the logging behaviour (emit) inherited from logging.Handler, queue/Worker and export inherited from BaseExporter. Why are we not re-using the exporter logic if we are making the log "exporter"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, that's our goal. For more info please take a look at #642 #657.

Today the trace exporter has legacy issue that needs to be refactored (e.g. trace exporter has export method which sends data to the queue, and emit which actually export the data. what we want is to only have export method, move queue and worker to the core opencensus library).

In this logging effort, we're trying to have a cleaner design, and later we'll refactor traces to follow this design.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that the mixin relies on the exporter's options and storage makes this kind of brittle. I think this is fine for now (and as you mentioned we need to refactor the exporters), but we should revisit this before moving it out of the package.

@@ -97,7 +106,7 @@ def stop(self, timeout=None):
return time.time() - start_time # time taken to stop


class AzureLogHandler(BaseLogHandler):
class AzureLogHandler(TransportMixin, BaseLogHandler):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the exporters really be a child class of the Transporter? The two don't really have the "is-a" relationship used for inheritance. Maybe we should have transport be used as a service or utility instead?

Copy link
Contributor Author

@reyang reyang May 30, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I took the mixin approach, so the log handler owns all the resources, transport mixin just provides the helper methods on top of these resources.

@reyang
Copy link
Contributor Author

reyang commented May 31, 2019

@c24t @songy23 need your help to review, thanks!

Copy link
Contributor

@songy23 songy23 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass looks good, though I'm not too familiar with Azure data models.

except Exception:
pass
if response.status_code == 200:
logger.info('Transmission succeeded: %s.', text)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I feel there's a bit too much logs in this class. Maybe consider reducing logs or using traces with a low sampling rate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is at info level, which is intended for debugging only and off by default.
The default log level is WARNING.

This function should never throw exception.
"""
# TODO: prevent requests being tracked
blacklist_hostnames = execution_context.get_opencensus_attr(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if it's a good idea to put this in the trace execution_context? /cc @c24t WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Long term goal is to avoid exporters to access execution_context directly, for more info, refer to #573.
In this PR, the code was extracted from the existing trace exporter.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We started storing the blacklist in the context in #289, and probably should have caught it then. I think it's a problem for another PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll try to address it in a separate PR, there are similar scenarios which we need a consistent solution:

  1. How to prevent traces generated by trace exporters (e.g. during sending data via HTTP) being sent to trace exporter and cause infinite loop.
  2. How to prevent logs/exceptions generated by log exporters/handlers being emitted to the same log handlers and cause infinite loop.

My current thinking is to have the entire worker thread marked as "do not track", so all the spans/logs will have a special tag, and based on this tag we can filter these data out before sending them to exporters.

Copy link
Member

@c24t c24t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only minor comments, but let's review again before moving these into core.

contrib/opencensus-ext-azure/CHANGELOG.md Outdated Show resolved Hide resolved
This function should never throw exception.
"""
# TODO: prevent requests being tracked
blacklist_hostnames = execution_context.get_opencensus_attr(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We started storing the blacklist in the context in #289, and probably should have caught it then. I think it's a problem for another PR.

break
self = self.prototype
self = self._default
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand what's happening here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is trying to mimic the JavaScript prototype behavior, traveling through the prototype chain and get all the key-value pairs.
Do you think assigning to self is evil and we should have another variable?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe not evil, but certainly confusing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, will give a better name in the next PR.

@@ -97,7 +106,7 @@ def stop(self, timeout=None):
return time.time() - start_time # time taken to stop


class AzureLogHandler(BaseLogHandler):
class AzureLogHandler(TransportMixin, BaseLogHandler):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that the mixin relies on the exporter's options and storage makes this kind of brittle. I think this is fine for now (and as you mentioned we need to refactor the exporters), but we should revisit this before moving it out of the package.

)
execution_context.set_opencensus_attr(
'blacklist_hostnames',
['dc.services.visualstudio.com'],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

May as well give this the constant treatment too.

Copy link
Contributor Author

@reyang reyang May 31, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will leave it for now. This will go away when we don't have exporters explicitly maintain the blacklist.

@reyang reyang merged commit 6901d09 into master May 31, 2019
@reyang reyang deleted the azure branch May 31, 2019 23:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants