Add MicrophoneFeed with direct access to the microphone input buffer #108773
Conversation
```cpp
	return buf.size() / 2;
}

PackedVector2Array MicrophoneFeed::get_frames(int p_frames) {
```
This is wrong since it mimics the audio capture API but does not use a ring buffer. You must use a ring buffer.
The AudioDriver input_buffer is already a ring buffer, so I'm not sure where another ring buffer would come into play.
The current audio capture APIs (capture effect and record effect) use separate ring buffers because they receive a fixed, transient batch of data from the AudioDriver input_buffer (pulled off it in AudioStreamMicrophone) as part of the audio server _mix_step, and need to store it until the user can request it from their own code, outside the audio server thread/loop. This implementation bypasses that, allowing users to pull from the audio driver ring buffer directly. Each feed can have its own 'buffer_ofs', so multiple places in the code can retrieve microphone data from the same device without stepping on each other's toes.
The issue I have with this is that it should need to lock the AudioDriver, since input_buffer is the single buffer being written to by the driver in the AudioDriver thread. As far as I can tell, not locking will lead to race conditions in multithreaded environments.
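For concreteness, here is a minimal sketch of how a per-feed cursor could consume the shared AudioDriver ring buffer; it is not the PR's exact code. Only the `get_frames(int p_frames)` signature is confirmed by the diff above — the `buffer_ofs` member, the AudioDriver accessors, and the stereo int32 packing (sample in the high 16 bits, two entries per frame, as in AudioStreamPlaybackMicrophone) are assumptions here.

```cpp
// A minimal sketch, assuming the shared-ring-buffer scheme described above;
// not the PR's exact code. `buffer_ofs` is the feed's private read cursor.
PackedVector2Array MicrophoneFeed::get_frames(int p_frames) {
	PackedVector2Array frames;
	AudioDriver *ad = AudioDriver::get_singleton();
	Vector<int32_t> buf = ad->get_input_buffer();
	uint32_t size = (uint32_t)buf.size();
	if (size == 0 || p_frames <= 0) {
		return frames;
	}
	// The writer's cursor may be slightly stale when read from another
	// thread; the feed tolerates that by simply seeing fewer frames.
	uint32_t write_pos = ad->get_input_position();
	uint32_t avail = ((write_pos + size - buffer_ofs) % size) / 2; // whole stereo frames
	uint32_t todo = MIN((uint32_t)p_frames, avail);
	for (uint32_t i = 0; i < todo; i++) {
		// Assumed packing: 16-bit sample in the high bits of each int32 slot.
		float l = (buf[buffer_ofs] >> 16) / 32768.0f;
		buffer_ofs = (buffer_ofs + 1) % size;
		float r = (buf[buffer_ofs] >> 16) / 32768.0f;
		buffer_ofs = (buffer_ofs + 1) % size;
		frames.push_back(Vector2(l, r));
	}
	return frames;
}
```

Because each feed advances only its own `buffer_ofs`, several consumers can read the same device independently, which is the design choice the comment above describes.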
So, the locking in the current implementation is inconsistent.
There is no locking applied when the samples are added to the buffer by AudioDriver::input_buffer_write(int32_t sample), but there is a lock applied when the samples are taken from the buffer by AudioStreamPlaybackMicrophone::_mix_internal(AudioFrame *p_buffer, int p_frames).
That means the locking isn't doing anything useful, since it is on only one half of the transaction. Assuming it was meant to do something, I added a corresponding lock into the input_buffer_write() function in my first PR. But there were complaints that I was adding a lock into the very time-sensitive audio thread, which could be potentially harmful.
The lack of a lock in input_buffer_write() evidently caused an index out-of-range crash that was mitigated by inserting an extra boundary test on the index, instead of finding the root cause, which could only be due to two threads executing input_position++ at the same time:
```cpp
void AudioDriver::input_buffer_write(int32_t sample) {
	if ((int)input_position < input_buffer.size()) {
		input_buffer.write[input_position++] = sample;
		// Wrap the ring buffer cursor back to the start.
		if ((int)input_position >= input_buffer.size()) {
			input_position = 0;
		}
	} else {
		// The extra boundary test mentioned above: it masks the symptom of
		// two threads incrementing input_position rather than fixing it.
		WARN_PRINT("input_buffer_write: Invalid input_position=" + itos(input_position) + " input_buffer.size()=" + itos(input_buffer.size()));
	}
}
```

In any case, there should never be two threads entering this function, since that would result in choppy, out-of-order audio chunks being pulled from the operating system and buffered. I've only seen it happen when AudioDriverPulseAudio::input_start() was called a second time. Some of the code in my current PR protects this from happening again on the various different platforms.
With regards to the race condition, I think it is safe, since the input_buffer is never reallocated and the MicrophoneFeed::get_frames() function doesn't write to any shared values and can tolerate an out-of-date input_position value.
Indeed, there is no point in adding a lock to this function without adding the corresponding lock to the input_buffer_write() function. Unfortunately, we don't know what the consequences of acquiring a lock at a rate of 88.2 kHz in the audio thread would be, and the code has persisted this long without that being a problem, so it would be a risk to change it.
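For illustration only (standalone C++, not Godot code and not what this PR does): one conventional way to sidestep that trade-off is to make the write cursor a single-producer atomic, so the writer publishes each sample without a mutex and readers take acquire-ordered snapshots.

```cpp
#include <atomic>
#include <cstdint>
#include <vector>

// Single-producer ring buffer sketch: the audio thread is the only writer,
// so atomically publishing the position replaces the mutex entirely.
struct InputRing {
	std::vector<int32_t> buffer = std::vector<int32_t>(4096);
	std::atomic<uint32_t> position{0};

	// Called only from the driver/audio thread (single producer).
	void write(int32_t sample) {
		uint32_t pos = position.load(std::memory_order_relaxed);
		buffer[pos] = sample;
		// Release ordering: the sample becomes visible before the new position.
		position.store((pos + 1) % (uint32_t)buffer.size(), std::memory_order_release);
	}

	// Safe from any thread; each reader keeps its own private cursor.
	uint32_t snapshot_position() const {
		return position.load(std::memory_order_acquire);
	}
};
```

A reader that polls `snapshot_position()` and consumes up to it sees at worst a slightly stale cursor, which matches the "tolerate an out-of-date input_position" argument above.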
The drivers lock themselves in their own threads before making changes to input_buffer (except maybe the Android driver).
`ad->lock();`

godot/drivers/wasapi/audio_driver_wasapi.cpp, Line 765 in 71a9948:
`ad->lock();`

godot/platform/web/audio_driver_web.cpp, Line 405 in 71a9948:
`driver->lock();`
You may be right that accessing input_buffer from multiple threads is fine (due to no reallocation), but it really doesn't feel correct.
I stand corrected. Note that the lock is per chunk of audio, not per sample.
Yes, it was the Android version which had all the bugs I have been most interested in fixing.
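For readers following along, the per-chunk shape in those drivers looks roughly like this; the chunk variable names and the high-16-bits packing are assumptions, but the lock-once-per-captured-chunk structure matches the `ad->lock()` call sites linked above.

```cpp
// Sketch of the driver capture thread, once per chunk delivered by the OS:
// the lock is taken per chunk, not per sample, so its cost is amortized.
ad->lock();
for (int i = 0; i < frames_read * channels; i++) {
	// Pack the 16-bit sample into the high bits of the int32 slot
	// (assumed packing, matching the >> 16 on the mixer's read side).
	ad->input_buffer_write((int32_t)samples[i] << 16);
}
ad->unlock();
```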
Many GH checks failed.

Work to do:
- Do the docs again using https://docs.godotengine.org/en/stable/contributing/documentation/updating_the_class_reference.html#updating-class-reference-when-working-on-the-engine
- Try: remove the vector& lock in the Android version and the locking in the get buffer.
Force-pushed from aff6747 to a807709.
The deeper proposal is to separate the Audio Input (microphone code) from Audio Output (speakers) and move this code into a default microphone feed class. Unfortunately the Godot codebase does not make this separation easy to do. Here is why. The Godot `AudioDriver` has a separate implementation for each platform. Each of these classes manages the single input and single output for that platform -- sometimes with the same function. For example, in …

Here are the options of what can be done.

**Option 1: Separate the Audio Input and Output at a deep level**

This requires me to write a new … The case of …

**Option 2 (what I have implemented): Touch the AudioDriver code as little as humanly possible so nothing breaks**

Leave all the code and its technical debt (in relation to the feature of multiple microphones) in place, including its single microphone input buffer on which the 7 different platform implementations depend, and simply access this single buffer from the one and only `MicrophoneFeed`.

**Option 3: Something in between**

Leave the … Since none of the platforms implement multiple microphones, the value of …

In my opinion the appropriate time to implement the code that can manage multiple MicrophoneFeeds is when at least one of the platforms' AudioDrivers has been extended to support it, because otherwise its implementation will be speculative and likely to be wrong.
I made a similar implementation of a … My implementation has the … However, since only one real microphone exists in the …
This PR is an alternative to #100508 and #105244 and is designed to fix the reliability issues with the microphone data discovered when implementing VoIP, as identified in godotengine/godot-proposals#11347.
A comprehensive demo is at:
https://github.com/goatchurchprime/godot-demo-projects/tree/gtch/micplotfeed/audio/mic_feed
As discussed at length in the Audio Group meetings, the API needed to be future-proofed for the case where there might be more than one microphone data stream, even though only a single microphone input has been hard-coded throughout the AudioDriver code on all platforms.
Accordingly, I have modeled a framework based on the `CameraServer` and `CameraFeed` objects and created the new object `MicrophoneServer` that has a single `MicrophoneFeed`.

This `MicrophoneFeed` contains the following functions: …

I have renamed the functions from their original names `get_frames_available()` and `get_buffer()` in `AudioEffectCapture`, because they were confusing: you are getting frames from a buffer, not the buffer. I have also added a function to access the size of the internal input buffer so its overflow condition is predictable.
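A hypothetical usage sketch follows (only `get_frames(int)` is confirmed by the diff in this conversation; the feed accessor and the availability query below are placeholder names, not necessarily the final API):

```cpp
// Hypothetical polling code; get_feed() and get_frames_available() are
// placeholders for whatever the final MicrophoneServer/MicrophoneFeed
// API exposes.
MicrophoneServer *ms = MicrophoneServer::get_singleton();
Ref<MicrophoneFeed> feed = ms->get_feed(0); // placeholder accessor
int avail = feed->get_frames_available(); // placeholder availability query
PackedVector2Array frames = feed->get_frames(avail);
for (int i = 0; i < frames.size(); i++) {
	// Each Vector2 is one stereo frame: x = left channel, y = right channel.
	handle_frame(frames[i].x, frames[i].y); // user-defined handler
}
```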
Known issues

- I cannot see why `MicrophoneFeed.xml` is not being picked up by the documentation system. Probably some stupid typo I can't find.
- I don't understand the remaining issues in the automatic integration checks.
- I haven't found a good way to detect and handle the lack of the `MicrophoneServer` object in older versions. The function `ClassDB.class_exists("MicrophoneServer")` works to discriminate whether it is there or not, but there is no equivalent of `ClassDB.instantiate("MicrophoneServer")` for singleton global objects.
- Need to check that the use of `AudioStreamMicrophone` in the same project doesn't break the system (not advised, but doesn't cause any unresolvable issues).

Future work
When this is stable and in use, we can go back through `AudioStreamMicrophone` and rewrite it to depend on this API, as well as mitigate the flaws in its design with regard to the assumption that the mix rates of the input and output streams always match to a high precision without any slippage.

It would also be useful, after all this complexity has been added, to demonstrate at least one example case of more than one microphone being implemented on any platform.