[precise] Wakeword improve activation speed #3
Comments
Speed is more related to hardware resources. What are you running on, and how loaded is it?
For testing I'm actually on my dev machine, so that really shouldn't be a bottleneck (Intel i7-5600U). Does this mean my suspicion that the engine requires the trailing 1 second before considering it an "activation" is wrong? I should add that I'm new to ML, so I'm still on a steep learning curve ':D Edit:
Should only be listening to 1.5 seconds, I think, to activate. My data usually cuts off pretty quickly. I haven't used Alexa in a while, but activation seems to be near-Google speed in my experience. I run on an i7-4770 with 8 GB doing a bunch of things (mycroft/wiki/tts/stt) and it's not noticeably slow.
Hm... So maybe that has an effect on the perceived speed, as "hey mycroft" is much closer to a 1.5-second cycle than "Kiana". That said, after playing with the batch size a bit I managed to get a model that at least activates (although val_acc could be much better), and it activates much faster about 50% of the time. I'll do some more testing tomorrow and probably compare against the official mycroft model in terms of speed. Do you have a clue why a data-set with 1 sec. of silence would perform so much better in training than the ones without?
Dunno. A lot depends on your data. Train with more data or more steps to improve val_acc. I have something like 300 wake word samples now, and about 4x that in not-wake-words (particularly things that triggered false activations). A good chunk of the noises in PCD are from that as well. If you're not using at least 50 wake word samples and 3x that in not-wake-words, you will probably want to add more. Also use wakeword saving to build more samples, particularly of the not-wake-word variety.
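For reference, a minimal sketch of how one might sanity-check those counts against the rule of thumb above. It assumes a Precise-style layout with wake-word/ and not-wake-word/ folders of .wav clips; the folder names and dataset root are illustrative assumptions, not anything from this thread.

```python
# Minimal sketch: count clips in an assumed Precise-style dataset layout and
# compare against the rough "at least 50 wake words, 3x that in not-wake-words"
# rule of thumb. Folder names and the dataset root are illustrative assumptions.
from pathlib import Path

DATA_DIR = Path("my-dataset")  # hypothetical dataset root


def count_wavs(subdir: str) -> int:
    """Count .wav files in DATA_DIR/subdir (0 if the folder is missing)."""
    folder = DATA_DIR / subdir
    return len(list(folder.glob("*.wav"))) if folder.is_dir() else 0


ww = count_wavs("wake-word")
nww = count_wavs("not-wake-word")
print(f"wake-word: {ww}, not-wake-word: {nww}")
if ww < 50 or nww < 3 * ww:
    print("Probably too little data: aim for >= 50 wake words and >= 3x that in not-wake-words.")
```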
Dammit, I knew I forgot to share important information... I'm currently at:
False activations I'm not yet worried about, as I can fix those later on through the methods you outlined in your write-up. Given my dataset I would expect val_acc to hit 1 all the time. Changing the sensitivity to a higher value (e.g. 0.8 rather than 0.2) seems to improve activation speed, which seems odd.
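To make the sensitivity effect concrete: runner-style triggers generally compare each per-chunk network output against a threshold derived from the sensitivity and fire after enough consecutive chunks cross it. The sketch below is a simplified, hypothetical detector in that spirit, not Precise's actual implementation; raising the sensitivity lowers the threshold, so the output can cross it earlier in the word.

```python
# Simplified, hypothetical chunk-level trigger -- NOT Precise's actual code.
# The network emits one probability per audio chunk; a higher sensitivity
# lowers the threshold, so the output crosses it earlier in the word.
class SimpleTrigger:
    def __init__(self, sensitivity: float = 0.5, trigger_level: int = 3):
        self.threshold = 1.0 - sensitivity  # sensitivity 0.8 -> threshold 0.2
        self.trigger_level = trigger_level  # consecutive "hot" chunks required
        self.hot_chunks = 0

    def update(self, prob: float) -> bool:
        """Feed one per-chunk network output; return True on activation."""
        self.hot_chunks = self.hot_chunks + 1 if prob > self.threshold else 0
        if self.hot_chunks >= self.trigger_level:
            self.hot_chunks = 0
            return True
        return False


# With sensitivity=0.8 this stream activates on the fourth chunk; with
# sensitivity=0.2 (threshold 0.8) it would never activate at all.
trigger = SimpleTrigger(sensitivity=0.8)
print([trigger.update(p) for p in (0.1, 0.3, 0.5, 0.6, 0.7)])
```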
Hmm. Yeah, I was more concerned with accuracy than anything; speed never was an issue.
So, I tried the "hey mycroft" model and damn, that thing activates fast. I really wish I knew how exactly this model was trained. I don't know if the activation speed would improve the more data one adds, or if they used different training techniques. I have a lot more reading/learning to do, it seems.
50k "hey mycroft" samples was what I heard. There's a lot of other data they have, including nww's, but not all of it is good/usable?
Interesting. I'd have one more question, if you'd be so kind. From what I read, Keras usually splits data into training and test data itself, while Precise doesn't do that. The question now: how do you handle the test set?
I randomly sample 10% and move it over. I use Google voice commands, psounds, and a few thousand nww's I recorded/saved. I end up running precise-test against the full wakeword dataset for fun to see where it's having issues as well. (I've run it against my nww's too, which generally isn't as useful.)
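A rough sketch of that "randomly sample 10% and move it over" step, assuming the usual wake-word/ and test/wake-word/ folder layout; the paths and the 10% fraction are illustrative assumptions.

```python
# Rough sketch of "randomly sample 10% and move it over": move a random ~10%
# of clips from wake-word/ into test/wake-word/ (repeat for not-wake-word/).
# Folder names follow the usual Precise layout but are assumptions here.
import random
import shutil
from pathlib import Path

src = Path("my-dataset/wake-word")       # hypothetical source folder
dst = Path("my-dataset/test/wake-word")  # hypothetical test folder
dst.mkdir(parents=True, exist_ok=True)

clips = sorted(src.glob("*.wav"))
random.shuffle(clips)
for clip in clips[: len(clips) // 10]:   # roughly 10%
    shutil.move(str(clip), str(dst / clip.name))
```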
I see. This means you have some samples inside "test" that might not exist in any other form in ww. This has all been quite helpful. Thanks a lot.
I still model words for others on occasion, so any new or better info is always welcome. But what I've gleaned has also come through a bunch of trial and lots of error, so better to share it so others can get where they need to go sooner.
Oh damn, that reminded me of one more question I wanted to ask.
Any news on that? P.S. With all the additional testing I've done so far, my model is still far from the activation speed of the "hey mycroft" one.
Doesn't hurt as far as I can tell. It's mostly the captured wake words and such; they tend to be fairly noisy, and I haven't noticed a decrease in activations.
While I started writing a rather chunky issue over on the mycroft-precise repo, I re-read your documentation and thought I'd better ask here/you directly.
What I'm currently struggling with is activation speed. I've been very careful with my training data and ensured every clip starts immediately with the wake-word, followed by 1 second of silence (silence meaning quiet room -> me not speaking).
Using my dataset combined with the following for not-wake-words: it reaches a val_acc of 1 in about 120 epochs (super quick). While it activates quite consistently, it does so rather slowly, as it requires the trailing 1 second to pass as well.
If I now duplicate the data-set and strip 500 ms from the end of every single wake-word clip, I'm suddenly unable to reach a val_acc higher than 0.5. Stripping 800-1000 ms has me sitting on a val_acc of 0. Training for more epochs (I tried up to 6000) did not help.
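For anyone wanting to reproduce that trimming step, a minimal sketch using only the standard library; it assumes plain PCM .wav clips, and the source/destination paths are hypothetical rather than the exact setup described above.

```python
# Minimal sketch: copy each wake-word clip with the last 500 ms removed.
# Assumes plain PCM .wav files; source/destination paths are hypothetical.
import wave
from pathlib import Path

SRC = Path("wake-word")          # original clips (hypothetical path)
DST = Path("wake-word-trimmed")  # trimmed copies (hypothetical path)
DST.mkdir(exist_ok=True)
TRIM_MS = 500

for path in SRC.glob("*.wav"):
    with wave.open(str(path), "rb") as wav_in:
        params = wav_in.getparams()
        keep = wav_in.getnframes() - int(params.framerate * TRIM_MS / 1000)
        frames = wav_in.readframes(max(keep, 0))
    with wave.open(str(DST / path.name), "wb") as wav_out:
        wav_out.setparams(params)  # frame count is patched on close
        wav_out.writeframes(frames)
```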
Is this to be expected? Is there a way to work around this?
Any help would be much appreciated, and thanks for your current write-up. It already helped a lot :)