
Commit 27b5db4

Add phrases instead of context
Remove SpeechRecognitionContext and add SpeechRecognitionPhraseList to SpeechRecognition directly
Remove updateContext and always update phrases instead
Rename context-not-supported error code to phrases-not-supported
Add removeItem to SpeechRecognitionPhraseList
1 parent a36a57a commit 27b5db4
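
For callers, the practical effect of this commit is that biasing phrases are assigned to `SpeechRecognition.phrases` directly, and a recognizer without contextual biasing support reports `phrases-not-supported`. A minimal TypeScript sketch of handling that on assignment follows. `BiasableRecognizer`, `ErrorLike`, and `trySetPhrases` are hypothetical names, the structural types stand in for the real browser objects, and the assumption that the thrown value exposes an `error` field mirrors `SpeechRecognitionErrorEvent` but is not confirmed by this diff.

```typescript
// Hypothetical shape of the thrown value; assumed to carry an `error` code
// in the way SpeechRecognitionErrorEvent does.
type ErrorLike = { error?: string };

// Hypothetical stand-in for a recognizer exposing the new `phrases` attribute.
interface BiasableRecognizer<L> {
  phrases: L;
}

// Assigns `list` to `recognizer.phrases`. Returns false when the recognizer
// rejects it with "phrases-not-supported"; any other failure is rethrown.
function trySetPhrases<L>(recognizer: BiasableRecognizer<L>, list: L): boolean {
  try {
    recognizer.phrases = list;
    return true;
  } catch (e) {
    if ((e as ErrorLike | null)?.error === "phrases-not-supported") {
      return false;
    }
    throw e;
  }
}
```

Returning a boolean instead of rethrowing lets a page fall back to unbiased recognition. That is a design choice of the sketch, not something the spec text prescribes.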

File tree

1 file changed

+80
-63
lines changed

index.bs

+80-63
@@ -162,14 +162,13 @@ interface SpeechRecognition : EventTarget {
162162
attribute boolean interimResults;
163163
attribute unsigned long maxAlternatives;
164164
attribute SpeechRecognitionMode mode;
165-
attribute SpeechRecognitionContext context;
165+
attribute SpeechRecognitionPhraseList phrases;
166166

167167
// methods to drive the speech interaction
168168
undefined start();
169169
undefined start(MediaStreamTrack audioTrack);
170170
undefined stop();
171171
undefined abort();
172-
undefined updateContext(SpeechRecognitionContext context);
173172
static Promise<boolean> availableOnDevice(DOMString lang);
174173
static Promise<boolean> installOnDevice(DOMString lang);
175174

@@ -195,7 +194,7 @@ enum SpeechRecognitionErrorCode {
195194
"not-allowed",
196195
"service-not-allowed",
197196
"language-not-supported",
198-
"context-not-supported"
197+
"phrases-not-supported"
199198
};
200199

201200
enum SpeechRecognitionMode {
@@ -259,27 +258,20 @@ interface SpeechRecognitionPhrase {
259258
readonly attribute float boost;
260259
};
261260

262-
// The object representing a list of biasing phrases.
261+
// The object representing a list of phrases for contextual biasing.
263262
[Exposed=Window]
264263
interface SpeechRecognitionPhraseList {
265-
constructor();
264+
constructor(sequence<SpeechRecognitionPhrase> phrases);
266265
readonly attribute unsigned long length;
267266
SpeechRecognitionPhrase item(unsigned long index);
268267
undefined addItem(SpeechRecognitionPhrase item);
269-
};
270-
271-
// The object representing a recognition context collection.
272-
[Exposed=Window]
273-
interface SpeechRecognitionContext {
274-
constructor(SpeechRecognitionPhraseList phrases);
275-
readonly attribute SpeechRecognitionPhraseList phrases;
268+
undefined removeItem(unsigned long index);
276269
};
277270
</xmp>
278271

279272
<h4 id="speechreco-attributes">SpeechRecognition Attributes</h4>
280273

281274
<dl>
282-
283275
<dt><dfn attribute for=SpeechRecognition>lang</dfn> attribute</dt>
284276
<dd>This attribute will set the language of the recognition for the request, using a valid BCP 47 language tag. [[!BCP47]]
285277
If unset it remains unset for getting in script, but will default to use the <a spec=html>language</a> of the html document root element and associated hierarchy.
@@ -305,8 +297,14 @@ interface SpeechRecognitionContext {
305297
<dt><dfn attribute for=SpeechRecognition>mode</dfn> attribute</dt>
306298
<dd>An enum to determine where speech recognition takes place. The default value is "ondevice-preferred".</dd>
307299

308-
<dt><dfn attribute for=SpeechRecognition>context</dfn> attribute</dt>
309-
<dd>This attribute will set the speech recognition context for the recognition session to start with.</dd>
300+
<dt><dfn attribute for=SpeechRecognition>phrases</dfn> attribute</dt>
301+
<dd>
302+
This attribute represents a list of phrases for contextual biasing.
303+
The setter steps are:
304+
1. If the {{SpeechRecognitionPhraseList/length}} of the given value is greater than 0 and the system does not support contextual biasing,
305+
throw a {{SpeechRecognitionErrorEvent}} with the {{phrases-not-supported}} error code and abort these steps.
306+
1. Set phrases to the given value.
307+
</dd>
310308
</dl>
311309

312310
<p class=issue>The group has discussed whether WebRTC might be used to specify selection of audio sources and remote recognizers.
@@ -352,17 +350,6 @@ See <a href="https://lists.w3.org/Archives/Public/public-speech-api/2012Sep/0072
352350
The user agent must raise an <a event for=SpeechRecognition>end</a> event once the speech service is no longer connected.
353351
If the abort method is called on an object which is already stopped or aborting (that is, start was never called on it, the <a event for=SpeechRecognition>end</a> or <a event for=SpeechRecognition>error</a> event has fired on it, or abort was previously called on it), the user agent must ignore the call.</dd>
354352

355-
<dt><dfn method for=SpeechRecognition>updateContext({{SpeechRecognitionContext}} |context|)</dfn> method</dt>
356-
<dd>
357-
The updateContext method updates the speech recognition context after the speech recognition session has started.
358-
If the session has not started yet, user should update {{SpeechRecognition/context}} instead of using this method.
359-
360-
When invoked, run the following steps:
361-
1. If {{[[started]]}} is <code>false</code>, throw an {{InvalidStateError}} and abort these steps.
362-
1. If the system does not support speech recognition context, throw a {{SpeechRecognitionErrorEvent}} with the {{context-not-supported}} error code and abort these steps.
363-
1. The system updates its speech recognition context to be |context|.
364-
</dd>
365-
366353
<dt><dfn method for=SpeechRecognition>availableOnDevice({{DOMString}} lang)</dfn> method</dt>
367354
<dd>The availableOnDevice method returns a Promise that resolves to a boolean indicating whether on-device speech recognition is available for a given BCP 47 language tag. [[!BCP47]]</dd>
368355

@@ -384,9 +371,10 @@ following steps:
384371
1. If |requestMicrophonePermission| is `true` and [=request
385372
permission to use=] "`microphone`" is [=permission/"denied"=], abort
386373
these steps.
387-
1. If {{SpeechRecognition/context}} is not null and the system does not support
388-
speech recognition context, throw a {{SpeechRecognitionErrorEvent}} with the
389-
{{context-not-supported}} error code and abort these steps.
374+
1. If the {{SpeechRecognitionPhraseList/length}} of {{SpeechRecognition/phrases}}
375+
is greater than 0 and the system does not support contextual biasing, throw a
376+
{{SpeechRecognitionErrorEvent}} with the {{phrases-not-supported}} error code
377+
and abort these steps.
390378
1. Once the system is successfully listening to the recognition, queue a task to
391379
[=fire an event=] named <a event for=SpeechRecognition>start</a> at [=this=].
392380

@@ -481,8 +469,8 @@ For example, some implementations may fire <a event for=SpeechRecognition>audioe
481469
<dt><dfn enum-value for=SpeechRecognitionErrorCode>"language-not-supported"</dfn></dt>
482470
<dd>The language was not supported.</dd>
483471

484-
<dt><dfn enum-value for=SpeechRecognitionErrorCode>"context-not-supported"</dfn></dt>
485-
<dd>The speech recognition model does not support speech recognition context.</dd>
472+
<dt><dfn enum-value for=SpeechRecognitionErrorCode>"phrases-not-supported"</dfn></dt>
473+
<dd>The speech recognition model does not support phrases for contextual biasing.</dd>
486474
</dl>
487475
</dd>
488476

@@ -563,62 +551,91 @@ For a non-continuous recognition it will hold only a single value.</p>
563551

564552
<h4 id="speechreco-phrase">SpeechRecognitionPhrase</h4>
565553

566-
<p>The SpeechRecognitionPhrase object represents a phrase for contextual biasing.</p>
554+
<p>The SpeechRecognitionPhrase object represents a phrase for contextual biasing and has the following internal slots:</p>
555+
556+
<dl dfn-type=attribute dfn-for="SpeechRecognitionPhrase">
557+
: <dfn>[[phrase]]</dfn>
558+
::
559+
A DOMString representing the text string to be boosted. The initial value is null.
560+
</dl>
561+
562+
<dl dfn-type=attribute dfn-for="SpeechRecognitionPhrase">
563+
: <dfn>[[boost]]</dfn>
564+
::
565+
A float representing approximately the natural log of the number of times more likely the website thinks this phrase is
566+
than what the speech recognition model knows.
567+
A valid boost must be a float value inside the range [0.0, 10.0], with a default value of 1.0 if not specified.
568+
A boost of 0.0 means the phrase is not boosted at all, and a higher boost means the phrase is more likely to appear.
569+
A boost of 10.0 means the phrase is extremely likely to appear and should be rarely set.
570+
</dl>
567571

568572
<dl>
569573
<dt><dfn constructor for=SpeechRecognitionPhrase>SpeechRecognitionPhrase(|phrase|, |boost|)</dfn> constructor</dt>
570574
<dd>
571-
When invoked, run the following steps:
572-
1. If the |phrase| is an empty string, throw a "{{SyntaxError}}" {{DOMException}}.
573-
1. If the |boost| is smaller than 0.0 or greater than 10.0, throw a "{{SyntaxError}}" {{DOMException}}.
574-
1. Construct a new SpeechRecognitionPhrase object with |phrase| and |boost|.
575-
1. Return the object.
575+
When this constructor is invoked, run the following steps:
576+
1. If |phrase| is an empty string, throw a {{SyntaxError}} and abort these steps.
577+
1. If |boost| is smaller than 0.0 or greater than 10.0, throw a {{SyntaxError}} and abort these steps.
578+
1. Let |phr| be a new object of type {{SpeechRecognitionPhrase}}.
579+
1. Set |phr|.{{[[phrase]]}} to be |phrase|.
580+
1. Set |phr|.{{[[boost]]}} to be |boost|.
581+
1. Return the object |phr|.
576582
</dd>
577583

578584
<dt><dfn attribute for=SpeechRecognitionPhrase>phrase</dfn> attribute</dt>
579-
<dd>This attribute is the text string to be boosted.</dd>
585+
<dd>This attribute returns the value of {{[[phrase]]}}.</dd>
580586

581587
<dt><dfn attribute for=SpeechRecognitionPhrase>boost</dfn> attribute</dt>
582-
<dd>This attribute is approximately the natural log of the number of times more likely the website thinks this phrase is than what the speech recognition model knows.
583-
A valid boost must be a float value inside the range [0.0, 10.0], with a default value of 1.0 if not specified.
584-
A boost of 0.0 means the phrase is not boosted at all, and a higher boost means the phrase is more likely to appear.
585-
A boost of 10.0 means the phrase is extremely likely to appear and should be rarely set.
586-
</dd>
588+
<dd>This attribute returns the value of {{[[boost]]}}.</dd>
587589
</dl>
588590

589591
<h4 id="speechreco-phraselist">SpeechRecognitionPhraseList</h4>
590592

591-
<p>The SpeechRecognitionPhraseList object holds a sequence of phrases for contextual biasing.</p>
593+
<p>The SpeechRecognitionPhraseList object holds a sequence of phrases for contextual biasing and has the following internal slot:</p>
594+
595+
<dl dfn-type=attribute dfn-for="SpeechRecognitionPhraseList">
596+
: <dfn>[[phrases]]</dfn>
597+
::
598+
A sequence of {{SpeechRecognitionPhrase}} representing the phrases to be boosted. The initial value is an empty list.
599+
</dl>
592600

593601
<dl>
594-
<dt><dfn constructor for=SpeechRecognitionPhraseList>SpeechRecognitionPhraseList()</dfn> constructor</dt>
595-
<dd>This constructor returns an empty list.</dd>
602+
<dt><dfn constructor for=SpeechRecognitionPhraseList>SpeechRecognitionPhraseList(|phrases|)</dfn> constructor</dt>
603+
<dd>
604+
When this constructor is invoked, run the following steps:
605+
1. Let |list| be a new object of type {{SpeechRecognitionPhraseList}}.
606+
1. Set |list|.{{[[phrases]]}} to be |phrases|.
607+
1. Return the object |list|.
608+
</dd>
596609

597610
<dt><dfn attribute for=SpeechRecognitionPhraseList>length</dfn> attribute</dt>
598-
<dd>This attribute indicates how many phrases are in the list. The user agent must ensure it is set to the number of phrases in the list.</dd>
611+
<dd>
612+
This attribute indicates the number of phrases in the list.
613+
When invoked, return the number of items in {{[[phrases]]}}.
614+
</dd>
599615

600-
<dt><dfn method for=SpeechRecognitionPhraseList>SpeechRecognitionPhrase(|index|)</dfn> method</dt>
616+
<dt><dfn method for=SpeechRecognitionPhraseList>item(|index|)</dfn> method</dt>
601617
<dd>
602-
This method gets the SpeechRecognitionPhrase object at the |index| of the list.
618+
This method gets the {{SpeechRecognitionPhrase}} object at the |index| of the list.
603619
When invoked, run the following steps:
604-
1. If the |index| is smaller than 0, or greater than or equal to {{SpeechRecognitionPhraseList/length}}, return null.
605-
1. Return the SpeechRecognitionPhrase from the |index| of the list.
620+
1. If |index| is smaller than 0, or greater than or equal to {{SpeechRecognitionPhraseList/length}},
621+
throw a {{RangeError}} and abort these steps.
622+
1. Return the {{SpeechRecognitionPhrase}} at the |index| of {{[[phrases]]}}.
606623
</dd>
607624

608625
<dt><dfn method for=SpeechRecognitionPhraseList>addItem(|item|)</dfn> method</dt>
609-
<dd>This method adds the SpeechRecognitionPhrase object |item| to the end of the list.</dd>
610-
</dl>
611-
612-
<h4 id="speechreco-context">SpeechRecognitionContext</h4>
613-
614-
<p>The SpeechRecognitionContext object holds contextual information to provide to the speech recognition models.</p>
615-
616-
<dl>
617-
<dt><dfn constructor for=SpeechRecognitionContext>SpeechRecognitionContext(|phrases|)</dfn> constructor</dt>
618-
<dd>This constructor returns a new SpeechRecognitionContext object with the SpeechRecognitionPhraseList object |phrases| in it.</dd>
626+
<dd>
627+
This method adds the {{SpeechRecognitionPhrase}} object |item| to the list.
628+
When invoked, add |item| to the end of {{[[phrases]]}}.
629+
</dd>
619630

620-
<dt><dfn attribute for=SpeechRecognitionContext>phrases</dfn> attribute</dt>
621-
<dd>This attribute represents the phrases to be boosted.</dd>
631+
<dt><dfn method for=SpeechRecognitionPhraseList>removeItem(|index|)</dfn> method</dt>
632+
<dd>
633+
This method removes the {{SpeechRecognitionPhrase}} object at the |index| of the list.
634+
When invoked, run the following steps:
635+
1. If |index| is smaller than 0, or greater than or equal to {{SpeechRecognitionPhraseList/length}},
636+
throw a {{RangeError}} and abort these steps.
637+
1. Remove the {{SpeechRecognitionPhrase}} object at the |index| of {{[[phrases]]}}.
638+
</dd>
622639
</dl>
623640

624641
<h3 id="tts-section">The SpeechSynthesis Interface</h3>
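
The constructor, `item`, `addItem`, and `removeItem` steps added in this diff can be modeled in plain TypeScript roughly as follows. This is an illustrative sketch only: `PhraseModel` and `PhraseListModel` are hypothetical names rather than the browser interfaces, and copying the input array in the list constructor is an assumption the spec text does not state.

```typescript
// Hypothetical model of SpeechRecognitionPhrase; not the browser interface.
class PhraseModel {
  readonly phrase: string; // models the [[phrase]] internal slot
  readonly boost: number;  // models the [[boost]] internal slot

  // The prose gives 1.0 as the default boost when none is specified.
  constructor(phrase: string, boost: number = 1.0) {
    if (phrase === "") {
      throw new SyntaxError("phrase must not be an empty string");
    }
    if (boost < 0.0 || boost > 10.0) {
      throw new SyntaxError("boost must be within [0.0, 10.0]");
    }
    this.phrase = phrase;
    this.boost = boost;
  }
}

// Hypothetical model of SpeechRecognitionPhraseList; not the browser interface.
class PhraseListModel {
  private phrases: PhraseModel[]; // models the [[phrases]] internal slot

  constructor(phrases: PhraseModel[]) {
    // Assumption: store a copy so later changes to the caller's array
    // do not alter the list.
    this.phrases = [...phrases];
  }

  get length(): number {
    return this.phrases.length;
  }

  item(index: number): PhraseModel {
    this.checkIndex(index);
    return this.phrases[index];
  }

  addItem(item: PhraseModel): void {
    this.phrases.push(item);
  }

  removeItem(index: number): void {
    this.checkIndex(index);
    this.phrases.splice(index, 1);
  }

  // Shared bounds check for item() and removeItem(): RangeError when out of range.
  private checkIndex(index: number): void {
    if (index < 0 || index >= this.phrases.length) {
      throw new RangeError("index " + index + " is out of range");
    }
  }
}
```

With this model, `item` and `removeItem` fail loudly on an out-of-range index. That matches the change in this diff from returning null to throwing a `RangeError`.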
