Audio Crackling #180

Open · freesig opened this issue Jun 4, 2018 · 16 comments

freesig (Contributor) commented Jun 4, 2018

Spikes in CPU usage are happening every so often, causing glitches in the audio.
It's worse on startup.

freesig (Contributor Author) commented Jun 4, 2018

Here you can see two peaks where the flags are. They are around 90% CPU but only last about 18 ms.
The time period shown is 10 s.
[image]

freesig (Contributor Author) commented Jun 4, 2018

This is the responsible function (symbol demangled):

```
9.00 ms  100.0%  7.00 ms  core::ops::function::impls::<impl core::ops::function::Fn<A> for &'a F>::call::h1b01f83aac3f9eb9
```

It's just called by this:

```
9.00 ms  100.0%  0 s      <nannou::audio::requester::Requester<S>>::fill_buffer::h837708360a31ea69
```

So it's something in the audio thread that is occasionally peaking.

mitchmindtree (Member) commented

We spoke to Charlie earlier and he mentioned that he might have noticed glitching occurring more often for Continuous sounds (sounds that play to a continuous timeline) as opposed to Retrigger sounds (sounds that always restart from the beginning). The only difference between these two types of playback behind the scenes is that Continuous sounds seek to the correct position within the WAV file before starting playback. However, this happens on the WAV-reading threads (not on the audio thread), so I'm unsure how it could be affecting the audio-thread glitching. I'm just adding this comment for the record, and as a possible avenue of investigation if we struggle to find the culprit within the audio thread's fill_buffer function.

freesig (Contributor Author) commented Jun 25, 2018

Right now I'm slicing the function up into sections and measuring the variance between each section's average run time and its longest runs.
I'm also resetting the stats every 1000 calls.
The thinking is that a section running normally should show basically no variance, while a section that peaks every now and then should have a few runs with a large variance.

Some issues I've come across:

  • The smaller the section, the larger the variance. I guess this is expected because on smaller time scales you notice change more, but I'm not sure how to frame it better. Maybe I could weight it depending on how big the section is, or make the variance calculation relative to the whole function rather than the section.
  • Looking inside loops: the timer gets called a lot there. I was thinking of maybe averaging a timer that gets multiple hits.
  • The audio server with a large number of channels runs near 100% on my machine anyway, so the peaks are harder to see. I'll try lowering the channel count and the number of sounds until it runs faster and see if the peaks are still there. If that doesn't work, I might make a testing binary with the timing code and send it to Scienceworks to run overnight or something.
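For reference, a minimal sketch of the per-section timing described above (the struct and names here are hypothetical, not the actual profiling code in audio_server):

```rust
// Hypothetical sketch: wrap each section in a timer, track its mean and worst
// case, and reset the stats every 1,000 calls so one-off spikes stand out.
use std::time::{Duration, Instant};

#[derive(Default)]
struct SectionStats {
    total: Duration,
    max: Duration,
    calls: u32,
}

impl SectionStats {
    fn record(&mut self, elapsed: Duration) {
        self.total += elapsed;
        self.max = self.max.max(elapsed);
        self.calls += 1;
    }

    fn average(&self) -> Duration {
        if self.calls == 0 { Duration::ZERO } else { self.total / self.calls }
    }
}

fn main() {
    let mut stats = SectionStats::default();
    for call in 0..10_000u32 {
        // Reset every 1,000 calls so each window's spikes are visible.
        if call % 1_000 == 0 {
            stats = SectionStats::default();
        }
        let start = Instant::now();
        // ... the section being measured would run here ...
        stats.record(start.elapsed());
    }
    println!("avg: {:?}, max: {:?}", stats.average(), stats.max);
}
```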

mitchmindtree (Member) commented

Sweet, this looks like a good approach 👍

Not sure if you've come across it yet, but the `line!` macro might be useful for identifying more easily which part of the code the profiler is referring to.

Maybe a nice way of logging this data would be to implement `Serialize` for whatever collection stores the profiling results and write it to a JSON file, so that it's easier to share here? Otherwise I guess you could just take a look yourself and post the areas that seem like potential culprits.

And yep, I think the largest absolute time variance is likely what we really care about. This is probably how we will find the spikes that were showing up in the Time Profiler graph.

Also, seeing as we're working in such a high-performance domain it might also be worth timing how long it takes to measure the time between two markers with no other code between them. This will let us know if the I/O required to read from the system clock is significant enough to affect the profiling. Maybe then we could even subtract the average time taken by the markers themselves from the profiling results, but this probably isn't worth it unless it looks like the markers themselves are affecting the profiling.
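A minimal sketch of the marker-overhead check suggested above (assuming plain `std::time::Instant` markers; not project code):

```rust
// Time the gap between two back-to-back markers with nothing in between,
// averaged over many runs, to estimate the cost of the markers themselves.
use std::time::Instant;

fn main() {
    const RUNS: u32 = 1_000_000;
    let mut total_nanos: u128 = 0;
    for _ in 0..RUNS {
        let a = Instant::now();
        let b = Instant::now(); // no work between the two markers
        total_nanos += (b - a).as_nanos();
    }
    println!("average marker overhead: {} ns", total_nanos / RUNS as u128);
}
```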

freesig (Contributor Author) commented Jun 28, 2018

Got it working pretty well using the `line!` macro.

The markers seem to cost about 0.05 nanoseconds each, give or take, which shouldn't be an issue unless they're called a lot inside a loop.

The last piece of the puzzle is working out how to store the data for reading. I want to run a version of audio_server with this timing code overnight and then have the data logged to a file we can look at.
I'm thinking maybe a graph that shows:

  • the top duration for a section as a percentage of the function's total average run time,
  • contrasted against the average duration for that section as a percentage of the function's total average run time.

This should show pretty clearly which sections are peaking periodically.
It might be worth graphing the top ten.
The stats would reset every 1000 frames.
I should be able to serialize it and make the charts in Google Docs or plot.ly.
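For illustration, one possible shape for that serialized output, assuming `serde` (with the derive feature) and `serde_json` as dependencies; the field names are made up:

```rust
// One record per section per window, dumped to JSON for charting later.
use serde::Serialize;
use std::fs::File;

#[derive(Serialize)]
struct SectionReport {
    line: u32,                  // from the line!() marker
    top_percent_of_avg_fn: f64, // worst run as % of the function's average
    avg_percent_of_avg_fn: f64, // section average as % of the function's average
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let reports = vec![SectionReport {
        line: 123,
        top_percent_of_avg_fn: 110.0,
        avg_percent_of_avg_fn: 70.0,
    }];
    serde_json::to_writer_pretty(File::create("profile.json")?, &reports)?;
    Ok(())
}
```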

mitchmindtree (Member) commented

I think there's a `csv` crate for generating tables that you can read in Google Docs or Excel, but I haven't had a play with it yet.
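For illustration, a minimal sketch with the `csv` crate (plus `serde` for deriving `Serialize`); the column names are made up:

```rust
// Write one row per profiling window so the file opens directly in a spreadsheet.
use serde::Serialize;

#[derive(Serialize)]
struct Row {
    window: u32,
    section_line: u32,
    top_percent: f64,
    avg_percent: f64,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut wtr = csv::Writer::from_path("profile.csv")?;
    // Headers are written automatically from the struct's field names.
    wtr.serialize(Row { window: 0, section_line: 123, top_percent: 110.0, avg_percent: 70.0 })?;
    wtr.flush()?;
    Ok(())
}
```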

freesig (Contributor Author) commented Jul 2, 2018

OK, finally got something to show.

[image]

This is the top duration for each section of the function as a percentage of the average function run time.
You can see the top line is this range (line numbers are slightly off due to the added timing code).

This section often seems to run for over 100% of the average time for the entire function.
These readings are the top values over 500 frames of audio.
The chart covers readings 1000 to 2000. I skipped the start because there is some loading that peaks when the application starts.
I have over 5 GB of data, so there's a lot more I can pull from this, and I can make my timer display this info better.
The next challenge will be timing inside the loops. That will generate a lot more data and also call `Instant` a lot more, so the readings might become less reliable. I'll give it a go though.
Here's the chart SVG file if you want to have a look yourself:

chart.svg.zip

freesig (Contributor Author) commented Jul 2, 2018

Just did a timing run on my Mac with more timers, so it won't be the most accurate, but it shows this section as using the most time.

mitchmindtree (Member) commented

Some other things that might be worth tracking simultaneously are:

  • the number of sounds
  • the number of channels in each sound
  • the number of speakers to which each channel is outputting

as each of these contributes to the number of iterations in these loops. E.g. if the mixing loop that you highlight in your last comment is indeed the bottleneck, it may be because one or more sounds and their channels very occasionally come within range of many more speakers and require many more iterations. I'm not sure this reflects the behaviour though, which is more like a very occasional click/pop rather than longer periods of glitching as sounds travel through larger groups of speakers. It might be worth experimenting with reducing the PROXIMITY_LIMIT to shrink the range of speakers that the channels can reach, and in turn reduce the total number of iterations. That might at least help us identify whether or not this loop is the bottleneck.

freesig (Contributor Author) commented Jul 4, 2018

OK, got some more information:

[image]

The staggered thick brown line is the number of sound_channels. I couldn't get the number of channels per sound without logging inside the loops, and that would also create a lot of data.
Here is a more zoomed-in view:

[image]

All the points performing above 100% are this section, which shows a similar result to what was happening on my computer.
In the second chart, the yellow line running straight across at 70% is the average for this same section. The section occasionally runs way over that, but it doesn't seem to correlate exactly with the number of sound_channels.
I'm pretty confident that this section is causing the glitching, but I'll need to look closer to find something that correlates more closely in time with those peaks.

freesig (Contributor Author) commented Jul 4, 2018

I think the next step is to time inside the loop. This is a little challenging because, if you imagine the loop "unrolled", it's like a very long piece of code, so it will produce a lot of data and the calls to `Instant` may start to be significant.
The only thing that seems like it could make that block run slow is the length of the three for loops, right?
Is there anything else you can see in that block that wouldn't be dependent on the length of the loops?

mitchmindtree (Member) commented

To get further confirmation that this nested loop area is the culprit, perhaps we can track the total number of iterations that occur in that nested loop and graph them alongside the spikes that you're seeing (this is basically the same as tracking the number of sounds, channels per sound and target speakers per sound, as I mentioned in my previous comment). The most interesting way might be to add a `let mut count_loop_iterations = 0;` at the beginning of the function, increment it on each of the deepest nested iterations here, and then check what the resulting count is at the end of the function. My guess is that we will see a correlation between the number of loop iterations and the CPU spikes. If that is correct, then hopefully reducing the PROXIMITY_LIMIT slightly will give us enough speed back, as this should reduce the total number of target speakers for each sound channel and, in turn, the total number of iterations. Best to check for sure first though.
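Roughly what that counter could look like; the loop structure below is a stand-in for the real mixing loops, not the actual code:

```rust
// One counter incremented in the innermost loop, checked once per call and
// logged alongside the per-call timing so the two can be graphed together.
fn fill_buffer_sketch(sounds: &[Vec<Vec<f32>>]) -> u64 {
    let mut count_loop_iterations: u64 = 0;
    for sound in sounds {
        for channel in sound {
            for _speaker_gain in channel {
                // ... mixing work would happen here ...
                count_loop_iterations += 1;
            }
        }
    }
    count_loop_iterations
}
```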

freesig (Contributor Author) commented Jul 9, 2018

OK, so there was no correlation with the number of iterations.

[image]

The orange line running across the top is the number of iterations inside the loop. The redder line that jumps up shows the peaks in the function runs.

Regarding the PROXIMITY_LIMIT: is it possible that when a speaker comes into range it jumps from zero volume to some higher volume, because the gain isn't ramped up smoothly once the speaker is in range?
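If that kind of jump is what's happening, one generic way to avoid it is to smooth each speaker gain towards its target over a few milliseconds; this is an illustrative sketch, not the project's code:

```rust
// Smooth each speaker's gain towards its target instead of applying the new
// value immediately, so a speaker entering range fades in rather than clicking.
struct SmoothedGain {
    current: f32,
    step: f32, // maximum change allowed per sample
}

impl SmoothedGain {
    fn next(&mut self, target: f32) -> f32 {
        let diff = target - self.current;
        self.current += diff.clamp(-self.step, self.step);
        self.current
    }
}

fn main() {
    // e.g. a full-scale fade over ~10 ms at 44.1 kHz => step ≈ 1.0 / 441.0
    let mut gain = SmoothedGain { current: 0.0, step: 1.0 / 441.0 };
    for _ in 0..5 {
        println!("{}", gain.next(1.0));
    }
}
```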

Another thought: if it's not the number of loop iterations, maybe it's a cache-miss issue? That could explain why a loop with the same number of iterations is occasionally held up.

Todo:

  • Run a version of audio_server with a reduced PROXIMITY_LIMIT for a day and see if it glitches less.
  • Refactor the timing code so that it's clearer and more flexible in what it measures.
  • Run a cache-miss test with Instruments (doesn't seem to be possible anymore).

freesig (Contributor Author) commented Jul 9, 2018

Some good information on Rust and cache misses

freesig (Contributor Author) commented Jul 10, 2018

Could it be that in this line:

```rust
let channel_sample = sound.unmixed_samples[channel_sample_index];
```

`channel_sample_index` is jumping around a lot, so the CPU has to keep grabbing from different parts of memory, causing cache misses?
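For illustration, a miniature version of the access-pattern difference being described (not the real buffer layout): strided index reads can touch many cache lines, while a straight front-to-back walk stays prefetcher-friendly.

```rust
// Reading one channel out of an interleaved buffer strides by `channels`
// floats between reads, whereas a sequential walk reads contiguous memory.
fn sum_one_channel_strided(unmixed_samples: &[f32], channels: usize, channel: usize) -> f32 {
    // Assumes channels >= 1; consecutive reads are `channels` floats apart.
    unmixed_samples.iter().skip(channel).step_by(channels).sum()
}

fn sum_all_sequential(unmixed_samples: &[f32]) -> f32 {
    // Reads are contiguous.
    unmixed_samples.iter().sum()
}

fn main() {
    let samples: Vec<f32> = (0..16).map(|i| i as f32).collect();
    println!("{}", sum_one_channel_strided(&samples, 4, 1));
    println!("{}", sum_all_sequential(&samples));
}
```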
