---
title: Monitoring PKS with Sinks
owner: PKS
---
<strong><%= modified_date %></strong>
This topic describes how to monitor Pivotal Container Service (PKS) deployments using sink resources.
## <a id="prerequisites"></a> Prerequisites
Using sink resources for monitoring requires that you have set up a log processing solution capable
of log ingress over TCP as described in [RFC 5424](https://tools.ietf.org/html/rfc5424).
In addition, you must configure sinks in PKS to send logs to that destination.
For information on how to create and manage sinks in PKS, see [Creating Sink Resources](./create-sinks.html).
<p class="note"><strong>Note</strong>: Sinks created in PKS only support TCP connections.
UDP connections are not currently supported.</p>
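The exact ingress configuration depends on your log processing solution. As a quick way to inspect raw sink traffic while you set up a destination, the following Python sketch listens on an arbitrary TCP port and prints whatever arrives. It assumes newline-separated messages; sinks may use octet-counted RFC 5424 framing, so treat this only as a debugging aid, not as a production log destination.

```
# Minimal sketch of a TCP listener for inspecting raw sink traffic during
# setup. It assumes newline-separated messages and does not implement full
# RFC 5424 framing; use a real syslog-capable log processing solution as the
# actual sink destination.
import socketserver

class RawSyslogHandler(socketserver.StreamRequestHandler):
    def handle(self):
        for line in self.rfile:  # assumes messages are newline-separated
            print(line.decode("utf-8", errors="replace").rstrip())

if __name__ == "__main__":
    # 5140 is an arbitrary unprivileged port chosen for this example.
    with socketserver.ThreadingTCPServer(("0.0.0.0", 5140), RawSyslogHandler) as server:
        server.serve_forever()
```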
## <a id="distinguish"></a> About Sink Log Entries
Sinks and ClusterSinks include both pod logs and events from the Kubernetes API.
These logs and events are combined in a shared format to provide operators with a
robust set of filtering and monitoring options.
Sink data has the following characteristics. All entries:
* Are timestamped.
* Contain the host ID of the BOSH-defined VM.
* Are annotated with a set of structured data, which includes the namespace, the object name or pod ID, and the container name.
### <a id="log-format"></a> Sink Log Entry Format
All sink log entries use the following format:
```
APP-NAME/NAMESPACE/POD-ID/CONTAINER-NAME
```
Where:
* `APP-NAME` is `pod.log` or `k8s.event`.
* `NAMESPACE` is the namespace associated with the pod log or Kubernetes event.
* `POD-ID` is the ID of the pod associated with the pod log or Kubernetes event.
* `CONTAINER-NAME` is the name of the container associated with the pod log or Kubernetes event.
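For illustration only, the following Python sketch splits this field into its parts. The helper name and the `SinkAppName` type are not part of PKS; they are assumptions made for the example.

```
# Sketch: split the APP-NAME field of a sink log entry into its components,
# following the APP-NAME/NAMESPACE/POD-ID/CONTAINER-NAME layout above.
from typing import NamedTuple

class SinkAppName(NamedTuple):
    app_name: str        # "pod.log" or "k8s.event"
    namespace: str       # Kubernetes namespace
    pod_id: str          # pod ID
    container_name: str  # container name (may be empty for some entries)

def parse_app_name(field: str) -> SinkAppName:
    # Split into at most four parts; pad missing trailing parts with "".
    parts = field.split("/", 3)
    parts += [""] * (4 - len(parts))
    return SinkAppName(*parts)

# Example using the pod log entry shown later in this topic:
entry = parse_app_name("pod.log/rocky-raccoon/logspewer-6b58b6689d-dhddj")
assert entry.app_name == "pod.log"
assert entry.namespace == "rocky-raccoon"
```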
## <a id="pod-logs"></a> Pod Logs
Pod log entries are distinguished by the string `pod.log` in the `APP-NAME` field.
### <a id="pod-log-example"></a> Pod Log Example
The following is a sample pod log entry:
<pre>
36 <14>1 2018-11-26T18:51:41.647825+00:00 vm-3ebfe45d-492d-4bfd-59c4-c45d91688c65
pod.log/rocky-raccoon/logspewer-6b58b6689d-dhddj - - [kubernetes@47450
app="logspewer" pod-template-hash="2614622458" namespace_name="rocky-raccoon"
object_name="logspewer-6b58b6689d-dhddj" container_name="logspewer"]
2018/11/26 18:51:41 Log Message 589910
</pre>
Where:
* `vm-3ebfe45d-492d-4bfd-59c4-c45d91688c65` is the host ID of the BOSH VM.
* `pod.log` is the `APP-NAME`.
* `rocky-raccoon` is the `NAMESPACE`.
* `logspewer-6b58b6689d-dhddj` is the `POD-ID`.
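The structured data element in the entry above (`[kubernetes@47450 ...]`) carries the namespace, object name, and container name as `key="value"` pairs. The following Python sketch extracts those pairs with simplified regular expressions; it is illustrative only and does not implement full RFC 5424 escaping rules.

```
# Sketch: pull the key="value" pairs out of the [kubernetes@...] structured
# data element of a pod log entry. The regular expressions are simplified
# for illustration.
import re

SD_ELEMENT = re.compile(r"\[kubernetes@\d+ (?P<params>[^\]]*)\]")
SD_PARAM = re.compile(r'(?P<key>[\w.-]+)="(?P<value>[^"]*)"')

def structured_data(entry: str) -> dict:
    """Return the key/value pairs from the [kubernetes@...] element, if any."""
    match = SD_ELEMENT.search(entry)
    if not match:
        return {}
    return {m["key"]: m["value"] for m in SD_PARAM.finditer(match["params"])}

sample = ('pod.log/rocky-raccoon/logspewer-6b58b6689d-dhddj - - [kubernetes@47450 '
          'app="logspewer" namespace_name="rocky-raccoon" container_name="logspewer"] '
          '2018/11/26 18:51:41 Log Message 589910')
print(structured_data(sample)["namespace_name"])  # rocky-raccoon
```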
## <a id="k8s-api"></a> Kubernetes API Events
Kubernetes API Event entries are distinguished by the string `k8s.event` in the `APP-NAME` field.
### <a id="k8s-api-example"></a> Kubernetes API Event Example
The following is an example Kubernetes API event log entry:
<pre>
Nov 14 16:01:49 vm-b409c60e-2517-47ac-7c5b-2cd302287c3a
k8s.event/rocky-raccoon/logspewer-6b58b6689d-j9n:
Successfully assigned rocky-raccoon/logspewer-6b58b6689d-j9nq7
to vm-38dfd896-bb21-43e4-67b0-9d2f339adaf1
</pre>
Where:
* `vm-b409c60e-2517-47ac-7c5b-2cd302287c3a` is the host ID of the BOSH VM.
* `k8s.event` is the `APP-NAME`.
* `rocky-raccoon` is the `NAMESPACE`.
* `logspewer-6b58b6689d-j9n` is the `POD-ID`.
### <a id="important-events"></a> Notable Kubernetes API Events
The following section lists Kubernetes API Events that can help assess any Kubernetes scheduling problems in PKS.
To monitor for these events, look for log entries that contain the <strong>Identifying String</strong> indicated below for each event.
#### <a id="imagepullbackoff"></a> Failure to Retrieve Containers from Registry
<table>
<tr><th colspan="2" style="text-align: center;"><br>ImagePullBackOff<br><br></th></tr>
<tr>
<th width="25%">Description</th>
<td>Image pull back offs occur when Kubernetes cannot reach a registry to retrieve a container image or the image does not exist in the registry. The cluster might be trying to access a registry that is not available on the network; for example, Docker Hub might be blocked by a firewall. Other possible causes include a registry outage or an image that has been deleted or was never uploaded.</td>
</tr>
<tr>
<th>Identifying String</th>
<td><code>Error:ErrImagePull</code></td>
</tr>
<tr>
<th>Example Sink Log Entry</th>
<td>
<code>Jan 25 10:18:58 gke-bf-test-default-pool-aa8027bc-rnf6 k8s.event/default/test-669d4d66b9-zd9h4/: Error: ErrImagePull</code>
</td>
</tr>
</table>
#### <a id="crashloopbackoff"></a> Malfunctioning Containers
<table>
<tr><th colspan="2" style="text-align: center;"><br>CrashLoopBackOff<br><br></th></tr>
<tr>
<th width="25%">Description</th>
<td>Crash loop back offs indicate that a container is not functioning as intended. The potential causes vary and depend on the related workload. To investigate further, examine the logs for that workload.</td>
</tr>
<tr>
<th>Identifying String</th>
<td><code>Back-off restarting failed container</code></td>
</tr>
<tr>
<th>Example Sink Log Entry</th>
<td>
<code>Jan 25 09:26:44 vm-bfdfedef-4a6a-4c36-49fc-8b290ad42623 k8s.event/monitoring/cost-analyzer-prometheus-se: Back-off restarting failed container</code>
</td>
</tr>
</table>
#### <a id="containercreated"></a> Successful Scheduling of Containers
<table>
<tr><th colspan="2" style="text-align: center;"><br>ContainerCreated<br><br></th></tr>
<tr>
<th width="25%">Description</th>
<td>Operators can monitor the creation and successful start of containers to keep track of platform usage at a high level. Cluster users can track this event to monitor the usage of their cluster.</td>
</tr>
<tr>
<th>Identifying String</th>
<td><code>Started container</code></td>
</tr>
<tr>
<th>Example Sink Log Entries</th>
<td>
<code>
Jan 25 09:14:55 35.239.18.250 k8s.event/rocky-raccoon/logspewer-6b58b6689d/: Created pod: logspewer-6b58b6689d-sr96t <br/>
Jan 25 09:14:55 35.239.18.250 k8s.event/rocky-raccoon/logspewer-6b58b6689d-sr9: Successfully assigned rocky-raccoon/ logspewer-6b58b6689d-sr96t to vm-efe48928-be8e-4db5-772c-426ee7aa52f2 <br/>
Jan 25 09:14:55 vm-efe48928-be8e-4db5-772c-426ee7aa52f2 k8s.event/rocky-raccoon/logspewer-6b58b6689d-mkd: Killing container with id docker://logspewer:Need to kill Pod <br/>
Jan 25 09:14:56 vm-efe48928-be8e-4db5-772c-426ee7aa52f2 k8s.event/rocky-raccoon/logspewer-6b58b6689d-sr9: Container image "oratos/logspewer:v0.1" already present on machine <br/>
Jan 25 09:14:56 vm-efe48928-be8e-4db5-772c-426ee7aa52f2 k8s.event/rocky-raccoon/logspewer-6b58b6689d-sr9: Created container <br/>
Jan 25 09:14:56 vm-efe48928-be8e-4db5-772c-426ee7aa52f2 k8s.event/rocky-raccoon/logspewer-6b58b6689d-sr9: Started container <br/>
</code>
</td>
</tr>
</table>
#### <a id="failedscheduling"></a> Failure to Schedule Containers
<table>
<tr><th colspan="2" style="text-align: center;"><br>FailedScheduling<br><br></th></tr>
<tr>
<th width="25%">Description</th>
<td>This event occurs when a container cannot be scheduled. For example, this can occur when node resources are insufficient.</td>
</tr>
<tr>
<th>Identifying String</th>
<td><code>Insufficient RESOURCE</code><br>where <code>RESOURCE</code> is a specific type of resource. For example, <code>cpu</code>.</td>
</tr>
<tr>
<th>Example Sink Log Entries</th>
<td>
<code>
Jan 25 10:51:48 gke-bf-test-default-pool-aa8027bc-rnf6 k8s.event/default/test2-5c87bf4b65-7fdtd/: 0/1 nodes are available: 1 Insufficient cpu.
</code>
</td>
</tr>
</table>
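As an illustration of the monitoring approach described above, the following Python sketch scans a stream of sink log entries for the identifying strings in the preceding tables and counts how often each notable event appears. The `ImagePullBackOff` pattern allows optional whitespace after `Error:` to match the example entry; how you obtain the stream (a file, a TCP listener, or a query against your log platform) depends on your log processing solution.

```
# Sketch: count occurrences of the notable Kubernetes API events listed in
# the tables above by matching their identifying strings in sink log entries
# read from standard input.
import re
import sys
from collections import Counter

NOTABLE_EVENTS = {
    "ImagePullBackOff": re.compile(r"Error:\s*ErrImagePull"),
    "CrashLoopBackOff": re.compile(r"Back-off restarting failed container"),
    "ContainerCreated": re.compile(r"Started container"),
    "FailedScheduling": re.compile(r"Insufficient \w+"),
}

def count_notable_events(lines) -> Counter:
    counts = Counter()
    for line in lines:
        # Only Kubernetes API events carry these identifying strings.
        if "k8s.event" not in line:
            continue
        for name, pattern in NOTABLE_EVENTS.items():
            if pattern.search(line):
                counts[name] += 1
    return counts

if __name__ == "__main__":
    print(count_notable_events(sys.stdin))
```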
## <a id="related-links"></a> Related Links
For more information on sinks in PKS, see the following topics:
* For information about creating sinks in PKS, see [Creating Sink Resources](./create-sinks.html).
* For information about sink architecture, see [Sink Architecture in PKS](./sink-architecture.html).