Add multichannel audio mixer #1386

Merged
merged 6 commits on Apr 6, 2024

Conversation

@levs42 (Contributor) commented Mar 13, 2024

Description & motivation

This PR builds on #1380. It replaces the audio resampler with an audio mixer. Everything works as before when there is just one input source. Once there are more, the audio mixer creates an audio graph, adds a resampler for each input source, and starts the mixing process. It uses plain AudioUnit components rather than AVAudioNode (to avoid interacting with the audio session) and rather than AUAudioUnit (to avoid the deprecated AUGraph).
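
For reference, here is a minimal sketch of instantiating a multichannel mixer from plain AudioUnit components; it is illustrative rather than this PR's actual code, and the bus count is an assumed example value:

import AudioUnit

// Describe Apple's built-in multichannel mixer audio unit.
var mixerDescription = AudioComponentDescription(
    componentType: kAudioUnitType_Mixer,
    componentSubType: kAudioUnitSubType_MultiChannelMixer,
    componentManufacturer: kAudioUnitManufacturer_Apple,
    componentFlags: 0,
    componentFlagsMask: 0
)

// Find and instantiate the component directly, without AVAudioNode or AUGraph.
guard let component = AudioComponentFindNext(nil, &mixerDescription) else {
    fatalError("multichannel mixer component not found")
}
var mixerUnit: AudioUnit?
var status = AudioComponentInstanceNew(component, &mixerUnit)

if status == noErr, let mixerUnit = mixerUnit {
    // One input bus per source; 2 is an assumed example value.
    var busCount: UInt32 = 2
    status = AudioUnitSetProperty(
        mixerUnit,
        kAudioUnitProperty_ElementCount,
        kAudioUnitScope_Input,
        0,
        &busCount,
        UInt32(MemoryLayout<UInt32>.size)
    )
    status = AudioUnitInitialize(mixerUnit)
}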

Users can provide dedicated audio settings for every input source via sourceSettings. This allows channelMap and other parameters to be set independently for every resampler. However, the output sampleRate and channels are enforced by the IOMixer to match the default resampler, which ensures the mixer node receives the same audio format from every source. Changes to the default resampler's settings and audio format are monitored for this purpose.
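
As a usage sketch: sourceSettings and IOAudioResamplerSettings are the names from this PR, but the initializer and dictionary shape below are assumptions for illustration, not a confirmed API:

// Hypothetical per-source configuration; `mixer` stands in for the
// IOAudioMixer instance, and the exact shapes below are assumptions.
let micSettings = IOAudioResamplerSettings(channelMap: [0, 0]) // duplicate the mono mic into both output channels
let appSettings = IOAudioResamplerSettings(channelMap: [1, 0]) // swap left/right for the app audio
mixer.sourceSettings = [0: micSettings, 1: appSettings]
// Output sampleRate and channels still follow the default resampler, as described above.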

All interfaces and the default behavior are kept the same.

Type of change

Please delete options that are not relevant.

  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Screenshots:

@levs42 levs42 force-pushed the feature/basic-audio-mixer branch 2 times, most recently from ef87fd7 to a475e05 on March 13, 2024 at 01:52
@shogo4405 (Owner) commented:

The PR is large, so I'll need some time for the review.

@shogo4405 (Owner) left a comment:

I tried it with ReplayKit@Control Center. Only the audioApp could be heard. What device was it tested on? Just for reference, I'll leave the FormatDescription here.

iPhone 15 Pro Max + iOS 17.3, a475e05

audioMic

CMSampleBuffer 0x102506830 retainCount: 4 allocator: 0x1e8f806a8
	invalid = NO
	dataReady = YES
	makeDataReadyCallback = 0x0
	makeDataReadyRefcon = 0x0
	formatDescription = <CMAudioFormatDescription 0x283311900 [0x1e8f806a8]> {
	mediaType:'soun' 
	mediaSubType:'lpcm' 
	mediaSpecific: {
		ASBD: {
			mSampleRate: 48000.000000 
			mFormatID: 'lpcm' 
			mFormatFlags: 0xc 
			mBytesPerPacket: 2 
			mFramesPerPacket: 1 
			mBytesPerFrame: 2 
			mChannelsPerFrame: 1 
			mBitsPerChannel: 16 	} 
		cookie: {(null)} 
		ACL: {(null)}
		FormatList Array: {
			Index: 0 
			ChannelLayoutTag: 0x640001 
			ASBD: {
			mSampleRate: 48000.000000 
			mFormatID: 'lpcm' 
			mFormatFlags: 0xc 
			mBytesPerPacket: 2 
			mFramesPerPacket: 1 
			mBytesPerFrame: 2 
			mChannelsPerFrame: 1 
			mBitsPerChannel: 16 	}} 
	} 
	extensions: {(null)}
}

audioApp

CMSampleBuffer 0x1047130f0 retainCount: 4 allocator: 0x1e8f806a8
	invalid = NO
	dataReady = YES
	makeDataReadyCallback = 0x0
	makeDataReadyRefcon = 0x0
	formatDescription = <CMAudioFormatDescription 0x283ab08c0 [0x1e8f806a8]> {
	mediaType:'soun' 
	mediaSubType:'lpcm' 
	mediaSpecific: {
		ASBD: {
			mSampleRate: 44100.000000 
			mFormatID: 'lpcm' 
			mFormatFlags: 0xe 
			mBytesPerPacket: 4 
			mFramesPerPacket: 1 
			mBytesPerFrame: 4 
			mChannelsPerFrame: 2 
			mBitsPerChannel: 16 	} 
		cookie: {(null)} 
		ACL: {(null)}
		FormatList Array: {
			Index: 0 
			ChannelLayoutTag: 0x650002 
			ASBD: {
			mSampleRate: 44100.000000 
			mFormatID: 'lpcm' 
			mFormatFlags: 0xe 
			mBytesPerPacket: 4 
			mFramesPerPacket: 1 
			mBytesPerFrame: 4 
			mChannelsPerFrame: 2 
			mBitsPerChannel: 16 	}} 
	} 
	extensions: {(null)}
}

@@ -0,0 +1,154 @@
import AudioUnit

extension AudioNode: CustomStringConvertible {
shogo4405 (Owner) commented:

[Q]
It's convenient for debugging, but I'd prefer to avoid having it in the library to prevent changing the default behavior. Would moving it to an example not work as expected?

levs42 (Contributor, Author) commented:

AudioNode is a new class, so I implemented a custom description for it. I can rename it to something else. Or is this more about reimplementing a custom description for AudioStreamBasicDescription? It's very handy to print mFormatFlags as readable flags. For example,

mFormatFlags: 12 AudioFormatFlags(audio_IsPacked | audio_IsSignedInteger | ll_32BitSourceData | pcm_IsPacked | pcm_IsSignedInteger)

instead of just mFormatFlags: 12.
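
Something along these lines would do it (a sketch, not the actual implementation; the flag list is not exhaustive):

import AudioToolbox

extension AudioStreamBasicDescription {
    // Render mFormatFlags as readable flag names instead of a bare integer.
    var readableFormatFlags: String {
        let named: [(AudioFormatFlags, String)] = [
            (kAudioFormatFlagIsFloat, "IsFloat"),
            (kAudioFormatFlagIsBigEndian, "IsBigEndian"),
            (kAudioFormatFlagIsSignedInteger, "IsSignedInteger"),
            (kAudioFormatFlagIsPacked, "IsPacked"),
            (kAudioFormatFlagIsNonInterleaved, "IsNonInterleaved")
        ]
        let names = named.filter { mFormatFlags & $0.0 != 0 }.map { $0.1 }
        return "\(mFormatFlags) AudioFormatFlags(\(names.joined(separator: " | ")))"
    }
}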

shogo4405 (Owner) commented:

Regarding custom classes in HK, that's fine. However, please keep the Extension folder for Cocoa framework-only extensions, and move this to the bottom of the definition file.

As for AudioStreamBasicDescription, I haven't adopted it because it changes the default behavior. I acknowledge that it's convenient for debugging, but we often avoid it in application-level development due to the numerous problems it can cause.

levs42 (Contributor, Author) commented:

Sounds good, I'll move it tomorrow. Can I keep the AudioStreamBasicDescription extension, remove the description override, and rename it to something like verboseDescription for debugging purposes?

track(channel: Int(channel))?.resampler.append(audioBuffer, when: when)
}

private func createTrack(channel: Int, settings: IOAudioResamplerSettings) -> Track {
shogo4405 (Owner) commented:

[Must]

  • createTrack -> makeTrack
    Following Apple's Cocoa naming conventions, factory methods adopt the 'make' convention.

levs42 (Contributor, Author) commented:

Renamed to makeTrack.

@levs42 (Contributor, Author) commented Mar 19, 2024

For the original issue of the mic audio being silent, I've pushed 7f837a7 with the fix. I was testing with custom audio settings, so the resamplers were calling delegate methods and the ring buffers were initialized. However, the delegate method wasn't called for default settings, so the ring buffer was nil. Now the ring buffer always has a default value for the current output format.
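
The shape of that fix, as a sketch (type and property names here are illustrative, not the actual symbols from the commit):

import AVFoundation

// Illustrative stand-in for the mixer's ring buffer type.
final class RingBuffer {
    let format: AVAudioFormat
    init(format: AVAudioFormat) { self.format = format }
}

final class Resampler {
    private var ringBuffer: RingBuffer?

    // Previously the buffer was created only in a settings delegate callback,
    // which never fired for default settings, leaving the buffer nil.
    // Falling back to a buffer for the current output format avoids that.
    func ensureRingBuffer(for outputFormat: AVAudioFormat) -> RingBuffer {
        if let buffer = ringBuffer, buffer.format == outputFormat {
            return buffer
        }
        let buffer = RingBuffer(format: outputFormat)
        ringBuffer = buffer
        return buffer
    }
}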

@shogo4405 (Owner) commented:

I tried 44e506f with ReplayKit@Control Center, but this time the microphone audio was coming in while the app audio was not. Please check whether the same issue occurs on your end. The testing device is the same as mentioned above.

@levs42 (Contributor, Author) commented Mar 25, 2024

Please try the new version. I've tested the Screencast example with and without the mic enabled from the start.

@shogo4405 (Owner) commented Mar 27, 2024

Thank you very much. It's a great feature! The built-in microphone on the iPhone and external USB headphones seem to be working fine without any noticeable audio delay.

However, during the next round of testing, the following issues were discovered:

  1. When streaming with a Bluetooth microphone, no microphone audio is captured.
  2. During streaming, an incoming call was received. After dismissing the call without answering, the app's audio ceased to be captured.
    • In some cases, due to ReplayKit behavior, app audio may not be captured, which may be considered acceptable as per the specifications.

@levs42 (Contributor, Author) commented Mar 28, 2024

Awesome. Thank you for testing!

  1. Do you have a recommendation for a Bluetooth microphone model? Also, could you please provide the ASBD?
  2. Does this happen only with ReplayKit? I ran a test on 17.4 and see the same behavior: processSampleBuffer receives only video buffers after a phone call. I will submit a bug report to Apple next week.

@shogo4405 (Owner) commented:

The standard equipment I use is the AirPods Pro. The ASBD (AudioStreamBasicDescription) is as follows:
<AVAudioFormat 0x2828fc690: 1 ch, 24000 Hz, Int16> numSamples = 1024
Some inexpensive Bluetooth microphones may operate at 16 kHz, so I'd like to test at that frequency as well. However, I don't know a specific product name.

@levs42 (Contributor, Author) commented Apr 1, 2024

I've tested three headphones with my iPhone 11 Pro Max on iOS 17.4.1 and wasn't able to reproduce the issue. Here are my tests:


Test 1: starting with Apple AirPods Pro
channel:0, inputFormat:<AVAudioFormat 0x300fe54f0: 1 ch, 24000 Hz, Int16>
channel:1, inputFormat:<AVAudioFormat 0x300fe9c20: 2 ch, 44100 Hz, Int16, interleaved>
// Switched to built-in mic

Test 2: starting with built-in mic
channel:0, inputFormat:<AVAudioFormat 0x302435720: 1 ch, 48000 Hz, Int16>
channel:1, inputFormat:<AVAudioFormat 0x302435fe0: 2 ch, 44100 Hz, Int16, interleaved>
// Switched to AirPods

Test 3: starting with Beats Fit Pro (Apple)
channel:0, inputFormat:<AVAudioFormat 0x3008a9c70: 1 ch, 24000 Hz, Int16>
channel:1, inputFormat:<AVAudioFormat 0x3008a9ef0: 2 ch, 44100 Hz, Int16, interleaved>
// Switched to built-in mic

Test 4: starting with built-in mic
channel:0, inputFormat:<AVAudioFormat 0x303bd5b80: 1 ch, 48000 Hz, Int16>
channel:1, inputFormat:<AVAudioFormat 0x303bd6210: 2 ch, 44100 Hz, Int16, interleaved>
// Switched to Beats

Test 5: starting with non-Apple headphones (Bose QC 35 II)
channel:0, inputFormat:<AVAudioFormat 0x302765a90: 1 ch, 16000 Hz, Int16>
channel:1, inputFormat:<AVAudioFormat 0x303bd6210: 2 ch, 44100 Hz, Int16, interleaved>
// Switched to built-in mic

Test 6: starting with built-in mic
channel:0, inputFormat:<AVAudioFormat 0x303bd5b80: 1 ch, 48000 Hz, Int16>
channel:1, inputFormat:<AVAudioFormat 0x303bd6210: 2 ch, 44100 Hz, Int16, interleaved>
// Switched to Bose

I noticed that when the user changes the audio source, iOS keeps the initial audio format and performs the audio conversion before providing buffers to the ReplayKit sample handler. For example, if a user starts the broadcast with 16 kHz headphones connected, disconnecting them doesn't switch the format back to 48 kHz.

Could you please verify a couple of things when using AirPods Pro?

  1. The sample handler receives a buffer here
  2. The audio mixer appends the audio buffer here
  3. After appending, it calls the mix() function, and the mix function calls the delegate method here
  4. The audio buffer for channel 0 actually contains audio data (audioBuffer.mutableAudioBufferList.pointee.mBuffers.mData) in Xcode's Memory Browser

This covers the default mixing troubleshooting. Also, please let me know if the func audioMixer(_ audioMixer: IOAudioMixer, errorOccurred error: IOAudioUnitError) delegate method is called at any point. Thank you.
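
For item 4, a quick programmatic check is also possible (a sketch; the buffer is assumed to be the AVAudioPCMBuffer being appended):

import AVFoundation

// Verify the channel-0 buffer actually carries audio data before mixing.
func assertHasAudioData(_ audioBuffer: AVAudioPCMBuffer) {
    let abl = audioBuffer.mutableAudioBufferList.pointee
    assert(abl.mBuffers.mData != nil, "channel 0 buffer has no audio data")
}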

@shogo4405 (Owner) commented:

> I noticed that when the user changes the audio source, iOS keeps the initial audio format and performs the audio conversion before providing buffers to the ReplayKit sample handler. For example, if a user starts the broadcast with 16 kHz headphones connected, disconnecting them doesn't switch the format back to 48 kHz.

This can be switched with some actions. Here are the ones I know of:

  • Triggering an alarm using the clock function.
  • Tapping the notch area to trigger a stop-broadcast alert.
  • The AudioSession is reset when switching apps, etc., and may switch to 48 kHz at some point.

I don't think we need to go as far as supporting these actions.

@shogo4405 shogo4405 changed the base branch from main to feature/audio-mixer April 2, 2024 12:59
@shogo4405 (Owner) left a comment:

I tried it just now and it worked as expected. Thank you.

Since the audio part is fundamental, merging and releasing it as is may invite a lot of criticism. Therefore, I plan to proceed with the following steps:

  1. In version 1.7.x, make changes to method signatures and other related changes to accommodate this external API change.
  2. In version 1.8.0,
    • Switch between this mixing feature and the existing audio processing, with the default being the existing audio.
    • Make adjustments to allow switching the processing, following the example of rtmpConnection.requireNetworkFramework.
  3. In version 1.9.0,
    • After confirming stability, make this mixing process the default.

Please let me work on the branch shogo4405:feature/audio-mixer for now.

@levs42 (Contributor, Author) commented Apr 2, 2024

This plan looks great! Thank you.

@shogo4405 shogo4405 changed the base branch from feature/audio-mixer to main April 5, 2024 01:27
@shogo4405 shogo4405 added this to the 1.8.0 milestone Apr 5, 2024
@shogo4405 shogo4405 closed this Apr 6, 2024
@shogo4405 shogo4405 reopened this Apr 6, 2024
@shogo4405 shogo4405 merged commit 886e8ff into shogo4405:main Apr 6, 2024
0 of 2 checks passed
@levs42 levs42 deleted the feature/basic-audio-mixer branch May 7, 2024 21:23