Commit 464ebe8

Merge pull request #156 from handellm/capture_time
Capture, receive, and RTP timestamp concept definitions & normative requirements for gUM/gDM
2 parents 14ff6a3 + fcafe1c commit 464ebe8

1 file changed

+110
-2
lines changed

index.html

@@ -9,7 +9,7 @@
99
// See https://github.com/w3c/respec/wiki/ for how to configure ReSpec
1010
var respecConfig = {
1111
group: "webrtc",
12-
xref: ["geometry-1", "html", "infra", "permissions", "dom", "image-capture", "mediacapture-streams", "webaudio", "webcodecs", "webidl"],
12+
xref: ["geometry-1", "html", "infra", "permissions", "dom", "hr-time", "image-capture", "mediacapture-streams", "screen-capture", "webaudio", "webcodecs", "webidl"],
1313
edDraftURI: "https://w3c.github.io/mediacapture-extensions/",
1414
editors: [
1515
{name: "Jan-Ivar Bruaroey", company: "Mozilla Corporation", w3cid: 79152},
@@ -58,6 +58,9 @@ <h2>Terminology</h2>
5858
<p>The terms [=permission state=], [=request permission to use=], and
5959
<a data-cite="permissions">prompt the user to choose</a> are defined in
6060
[[!permissions]].</p>
61+
<p>
62+
{{Performance.now()}} is defined in [[!hr-time]].
63+
</p>
6164
</section>
6265
<section id="conformance">
6366
</section>
@@ -1151,7 +1154,112 @@ <h2>Constrainable Properties</h2>
11511154
</tbody>
11521155
</table>
11531156
</section>
1154-
<section>
1157+
<section class="informative">
1158+
<h2>Video timestamp concepts</h2>
1159+
<p>
1160+
Video media flowing inside media stream tracks comprises a sequence of video frames, where
1161+
the frames are sampled from the media at instants spread out over time.
1162+
</p>
1163+
<p>
1164+
Each video frame must have a <dfn class="export">presentation timestamp</dfn>
1165+
which is relative to a source-specific origin.
1166+
A source of frames can define how this timestamp is set. A sink of frames
1167+
can define how this timestamp is used.
1168+
</p>
1169+
<p>
1170+
The timestamp allows sinks to define an absolute presentation timeline of the frames
1171+
relative to a clock reference, for example for playback.
1172+
</p>
1173+
<p>
1174+
Each frame may have an absolute <dfn class="export">capture timestamp</dfn> representing
1175+
the instant the frame capture process began, which is useful for example for
1176+
delay measurements and synchronization.
1177+
A source of frames can define how this timestamp is set; otherwise it is unset. A
1178+
sink of frames can define how this timestamp is used if set.
1179+
</p>
1180+
<p>
1181+
Each frame may have an absolute <dfn class="export">receive timestamp</dfn> representing
1182+
the instant at which the last of the packets used to produce this video frame was received in its entirety.
1183+
The timestamp is useful for example for network jitter measurements.
1184+
A source of frames can define how this timestamp is set; otherwise it is unset. A sink of
1185+
frames can define how this timestamp is used if set.
1186+
</p>
1187+
<p>
1188+
Each frame may have an <dfn class="export">RTP timestamp</dfn> representing the packet RTP
1189+
timestamp used to produce this video frame. The timestamp is useful for example for frame
1190+
identification and playback quality measurements. A source of frames can define how the
1191+
timestamp is set; otherwise it is unset. A sink of frames can define how this timestamp is
1192+
used if set.
1193+
The packet RTP timestamp concept is defined in [[?RFC3550]] Section 5.1.
1194+
</p>
1195+
<h3>Timestamp clock relations</h3>
1196+
<p>
1197+
The [=capture timestamp=] and [=receive timestamp=] use the same clock and offset.
1198+
The [=presentation timestamp=] and [=capture timestamp=] use the same clock and
1199+
have an offset that the user agent can choose arbitrarily, since it isn't
1200+
directly observable by script.
1201+
</p>
1202+
<h3>{{VideoFrameMetadata}}</h3>
1203+
<pre class="idl">
1204+
partial dictionary VideoFrameMetadata {
1205+
DOMHighResTimeStamp captureTime;
1206+
DOMHighResTimeStamp receiveTime;
1207+
unsigned long rtpTimestamp;
1208+
};</pre>
1209+
<section class="notoc">
1210+
<h5>Members</h5>
1211+
<dl class="dictionary-members" data-link-for="VideoFrameMetadata" data-dfn-for="VideoFrameMetadata">
1212+
<dt><dfn><code>captureTime</code></dfn> of type <span class="idlMemberType">DOMHighResTimeStamp</span></dt>
1213+
<dd>
1214+
<p>The capture timestamp of the frame relative to {{Performance}}.{{Performance/timeOrigin}}. It corresponds to
1215+
the [=capture timestamp=] of {{MediaStreamTrack}} video frames.
1216+
</p>
1217+
</dd>
1218+
<dt><dfn><code>receiveTime</code></dfn> of type <span class="idlMemberType">DOMHighResTimeStamp</span></dt>
1219+
<dd>
1220+
<p>The receive time of the corresponding encoded frame relative to {{Performance}}.{{Performance/timeOrigin}}.
1221+
It corresponds to the [=receive timestamp=] of {{MediaStreamTrack}} video frames.</p>
1222+
</dd>
1223+
<dt><dfn><code>rtpTimestamp</code></dfn> of type <span class="idlMemberType">unsigned long</span></dt>
1224+
<dd>
1225+
<p>The RTP timestamp of the corresponding encoded frame. It corresponds to the [=RTP timestamp=] of
1226+
{{MediaStreamTrack}} video frames.</p>
1227+
</dd>
1228+
</dl>
1229+
</section>
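As an illustration of the delay measurements mentioned above, the following is a minimal sketch and not part of the commit. It assumes `metadata` is the dictionary returned by `VideoFrame.metadata()` and `nowMs` is a `performance.now()` reading taken in the same global, so that both values share the same time origin. The helper name `captureDelayMs` is hypothetical.

```javascript
// Hypothetical helper: estimates how long ago a frame was captured.
// `metadata` is assumed to be the dictionary returned by VideoFrame.metadata();
// `nowMs` is assumed to be a performance.now() reading from the same global.
function captureDelayMs(metadata, nowMs) {
  // captureTime is optional: a source may leave the capture timestamp unset.
  if (metadata.captureTime === undefined) {
    return undefined;
  }
  // Both values are relative to Performance.timeOrigin, so they subtract directly.
  return nowMs - metadata.captureTime;
}

// Usage with made-up values (milliseconds relative to the time origin).
console.log(captureDelayMs({ captureTime: 1000.5 }, 1042.5)); // 42
console.log(captureDelayMs({}, 1042.5)); // undefined
```

In a page, the call would look like `captureDelayMs(frame.metadata(), performance.now())`, assuming the user agent populates `captureTime`.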
1230+
<h3>Algorithms</h3>
1231+
When the <dfn class="abstract-op">Initialize Video Frame Timestamps From Internal MediaStreamTrack Video Frame</dfn>
1232+
algorithm is invoked with |frame| and |offset| as input, run the following steps.
1233+
<ol class=algorithm>
1234+
<li>Set {{VideoFrame/timestamp}} from [=presentation timestamp=] minus |offset|.</li>
1235+
<li>Set {{VideoFrameMetadata/captureTime}} from [=capture timestamp=] if set.</li>
1236+
<li>Set {{VideoFrameMetadata/receiveTime}} from [=receive timestamp=] if set.</li>
1237+
<li>Set {{VideoFrameMetadata/rtpTimestamp}} from [=RTP timestamp=] if set.</li>
1238+
</ol>
1239+
When the <dfn class="abstract-op">Copy Video Frame Timestamps To Internal MediaStreamTrack Video Frame</dfn>
1240+
algorithm is invoked with |frame| as input, run the following steps.
1241+
<ol class=algorithm>
1242+
<li>Set [=presentation timestamp=] from {{VideoFrame/timestamp}}.</li>
1243+
<li>Set [=capture timestamp=] from {{VideoFrameMetadata/captureTime}} if [=map/exist|present=].</li>
1244+
<li>Set [=receive timestamp=] from {{VideoFrameMetadata/receiveTime}} if [=map/exist|present=].</li>
1245+
<li>Set [=RTP timestamp=] from {{VideoFrameMetadata/rtpTimestamp}} if [=map/exist|present=].</li>
1246+
</ol>
1247+
</section>
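The two algorithms above can be sketched with plain objects. This is an illustrative model only: the internal frame shape (`presentationTimestamp`, `captureTimestamp`, `receiveTimestamp`, `rtpTimestamp`) and both function names are assumptions made for this example, and the sketch treats all values as plain numbers without any unit conversion.

```javascript
// Pairs of [internal frame property, VideoFrameMetadata member].
// The internal property names are hypothetical; the spec only names the concepts.
const OPTIONAL_FIELDS = [
  ["captureTimestamp", "captureTime"],
  ["receiveTimestamp", "receiveTime"],
  ["rtpTimestamp", "rtpTimestamp"],
];

// Sketch of "Initialize Video Frame Timestamps From Internal MediaStreamTrack Video Frame".
function initializeVideoFrameTimestamps(internalFrame, offset) {
  const videoFrame = {
    // Step 1: presentation timestamp minus offset.
    timestamp: internalFrame.presentationTimestamp - offset,
    metadata: {},
  };
  // Steps 2-4: copy each optional timestamp only if it is set.
  for (const [internalName, metadataName] of OPTIONAL_FIELDS) {
    if (internalFrame[internalName] !== undefined) {
      videoFrame.metadata[metadataName] = internalFrame[internalName];
    }
  }
  return videoFrame;
}

// Sketch of "Copy Video Frame Timestamps To Internal MediaStreamTrack Video Frame".
function copyVideoFrameTimestamps(videoFrame) {
  // Step 1: the presentation timestamp always comes from the frame timestamp.
  const internalFrame = { presentationTimestamp: videoFrame.timestamp };
  // Steps 2-4: copy each metadata member only if it is present.
  for (const [internalName, metadataName] of OPTIONAL_FIELDS) {
    if (metadataName in videoFrame.metadata) {
      internalFrame[internalName] = videoFrame.metadata[metadataName];
    }
  }
  return internalFrame;
}
```

Members that are unset on one side stay absent on the other, which mirrors the "if set" and "if present" conditions in the steps.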
1248+
<section>
1249+
<h3>Local video capture timestamps</h3>
1250+
<p>
1251+
The user agent MUST set the [=capture timestamp=] of each video frame that is sourced from
1252+
{{MediaDevices/getUserMedia()}} and {{MediaDevices/getDisplayMedia()}} to its best estimate of the time that
1253+
the frame was captured.
1254+
This value MUST be monotonically increasing.
1255+
</p>
1256+
<div class="note">
1257+
Local capture tracks have a fixed offset between [=presentation timestamp=] and [=capture timestamp=]. The
1258+
user agent may let this be zero with the result that [=presentation timestamp=] is the same as [=capture timestamp=].
1259+
</div>
1260+
</section>
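A script could sanity-check the monotonicity requirement on observed `captureTime` values along the following lines. This is a sketch under one stated assumption: it reads "monotonically increasing" as "never decreasing", which the text above does not spell out. The function name is hypothetical.

```javascript
// Hypothetical check: returns true when no capture timestamp is smaller
// than the one before it. Equal consecutive values are accepted, which is
// an assumption about how "monotonically increasing" should be read.
function isNonDecreasing(captureTimestamps) {
  for (let i = 1; i < captureTimestamps.length; i++) {
    if (captureTimestamps[i] < captureTimestamps[i - 1]) {
      return false;
    }
  }
  return true;
}
```

An application would collect `frame.metadata().captureTime` for successive frames of a local capture track and pass the resulting array to this function.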
1261+
1262+
<section>
11551263
<h2>Exposing MediaStreamTrack source heuristic reactions support</h2>
11561264
<div>
11571265
<p>Some platforms or User Agents may provide built-in support for video effects triggered by user motion heuristics, in particular for camera video streams.
