Add JavaScript async API for OfflineRecongizer decodeStream.#3049
Conversation
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. 📝 WalkthroughWalkthroughThis pull request adds asynchronous support for offline speech recognition decoding. A new async worker and wrapper are introduced in the C++ addon layer, a decodeAsync method is exposed in the Node.js API, and test scripts plus documentation demonstrate the new functionality with FunASR Nano models. Changes
Sequence DiagramsequenceDiagram
participant App as Node.js App
participant API as OfflineRecognizer<br/>(decodeAsync)
participant Addon as Native Addon<br/>(DecodeOfflineStreamAsyncWrapper)
participant Worker as AsyncWorker<br/>(DecodeOfflineStreamAsyncWorker)
participant Stream as Offline Stream
App->>API: decodeAsync(stream)
API->>Addon: addon.decodeOfflineStreamAsync(handle, stream.handle)
Addon->>Worker: Create & Queue AsyncWorker
Addon-->>API: Return Promise
API-->>App: Return Promise
par Async Background Processing
Worker->>Stream: SherpaOnnxDecodeOfflineStream
Stream-->>Worker: JSON Results
Worker->>Worker: OnOK - Parse & Wrap JSON
end
Worker-->>API: Resolve Promise with Result
API-->>App: Deliver Decoded Result
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Summary of ChangesHello @csukuangfj, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request enhances the Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request introduces an asynchronous API for the OfflineRecognizer's decodeStream functionality, which is a valuable addition for improving responsiveness in Node.js applications. The changes include the C++ N-API implementation for the async worker, updates to the JavaScript wrapper, a new example demonstrating its usage, and corresponding documentation updates in the README. The overall structure and intent are clear and well-executed. However, a critical issue regarding object lifetime management in the N-API async worker needs to be addressed to prevent potential use-after-free bugs.
| class DecodeOfflineStreamAsyncWorker : public Napi::AsyncWorker { | ||
| public: | ||
| DecodeOfflineStreamAsyncWorker(Napi::Env env, | ||
| const SherpaOnnxOfflineRecognizer *recognizer, | ||
| const SherpaOnnxOfflineStream *stream, | ||
| Napi::Promise::Deferred deferred) | ||
| : Napi::AsyncWorker(env), | ||
| recognizer_(recognizer), | ||
| stream_(stream), | ||
| deferred_(deferred) {} | ||
|
|
||
| void Execute() override { | ||
| try { | ||
| SherpaOnnxDecodeOfflineStream(recognizer_, stream_); | ||
| } catch (const std::exception &e) { | ||
| SetError(e.what()); | ||
| } | ||
| } | ||
|
|
||
| void OnOK() override { | ||
| const char *json = SherpaOnnxGetOfflineStreamResultAsJson(stream_); | ||
| Napi::String s = Napi::String::New(Env(), json); | ||
| SherpaOnnxDestroyOfflineStreamResultJson(json); | ||
| deferred_.Resolve(s); | ||
| } | ||
|
|
||
| void OnError(const Napi::Error &e) override { deferred_.Reject(e.Value()); } | ||
|
|
||
| private: | ||
| const SherpaOnnxOfflineRecognizer *recognizer_; | ||
| const SherpaOnnxOfflineStream *stream_; | ||
| Napi::Promise::Deferred deferred_; | ||
| }; |
There was a problem hiding this comment.
The DecodeOfflineStreamAsyncWorker stores raw pointers recognizer_ and stream_. These pointers are extracted from Napi::External objects, which are managed by the JavaScript garbage collector. If the JavaScript OfflineRecognizer or OfflineStream objects are garbage collected before the AsyncWorker completes its Execute() or OnOK() methods, these raw pointers will become dangling, leading to use-after-free bugs and application crashes. To prevent this, the AsyncWorker must hold persistent N-API references to the Napi::External objects (or the Napi::Objects containing them) to ensure they remain valid throughout the worker's lifecycle. These references should be released in the worker's destructor.
class DecodeOfflineStreamAsyncWorker : public Napi::AsyncWorker {
public:
DecodeOfflineStreamAsyncWorker(
Napi::Env env,
Napi::External<SherpaOnnxOfflineRecognizer> recognizer_external,
Napi::External<SherpaOnnxOfflineStream> stream_external,
Napi::Promise::Deferred deferred)
: Napi::AsyncWorker(env),
recognizer_ref_(Napi::Persistent(recognizer_external)),
stream_ref_(Napi::Persistent(stream_external)),
deferred_(deferred) {
recognizer_ = recognizer_external.Data();
stream_ = stream_external.Data();
}
~DecodeOfflineStreamAsyncWorker() {
recognizer_ref_.Unref();
stream_ref_.Unref();
}
void Execute() override {
try {
SherpaOnnxDecodeOfflineStream(recognizer_, stream_);
} catch (const std::exception &e) {
SetError(e.what());
}
}
void OnOK() override {
const char *json = SherpaOnnxGetOfflineStreamResultAsJson(stream_);
Napi::String s = Napi::String::New(Env(), json);
SherpaOnnxDestroyOfflineStreamResultJson(json);
deferred_.Resolve(s);
}
void OnError(const Napi::Error &e) override { deferred_.Reject(e.Value()); }
private:
Napi::Reference<Napi::External<SherpaOnnxOfflineRecognizer>> recognizer_ref_;
Napi::Reference<Napi::External<SherpaOnnxOfflineStream>> stream_ref_;
const SherpaOnnxOfflineRecognizer *recognizer_;
const SherpaOnnxOfflineStream *stream_;
Napi::Promise::Deferred deferred_;
};| const SherpaOnnxOfflineRecognizer *recognizer = | ||
| info[0].As<Napi::External<SherpaOnnxOfflineRecognizer>>().Data(); | ||
|
|
||
| const SherpaOnnxOfflineStream *stream = | ||
| info[1].As<Napi::External<SherpaOnnxOfflineStream>>().Data(); |
There was a problem hiding this comment.
To align with the suggested fix for DecodeOfflineStreamAsyncWorker, the Napi::External objects should be passed directly to the worker's constructor, allowing the worker to create persistent references. This ensures that the underlying native objects are not prematurely garbage collected.
Napi::External<SherpaOnnxOfflineRecognizer> recognizer_external =
info[0].As<Napi::External<SherpaOnnxOfflineRecognizer>>();
Napi::External<SherpaOnnxOfflineStream> stream_external =
info[1].As<Napi::External<SherpaOnnxOfflineStream>>();
Napi::Promise::Deferred deferred = Napi::Promise::Deferred::New(env);
auto *worker =
new DecodeOfflineStreamAsyncWorker(env, recognizer_external, stream_external, deferred);| |[./test_asr_non_streaming_omnilingual_asr_ctc.js](./test_asr_non_streaming_omnilingual_asr_ctc.js)|Non-streaming speech recognition from a file using a [Omnilingual-ASR](https://github.com/facebookresearch/omnilingual-asr) CTC model with greedy search| | ||
| |[./test_asr_non_streaming_medasr_ctc.js](./test_asr_non_streaming_medasr_ctc.js)|Non-streaming speech recognition from a file using a [Google MedASR](https://github.com/google-health/medasr) CTC model with greedy search| | ||
| |[./test_asr_non_streaming_funasr_nano.js](./test_asr_non_streaming_funasr_nano.js)|Non-streaming speech recognition from a file using a [FunASR Nano](https://modelscope.cn/models/FunAudioLLM/Fun-ASR-Nano-2512) model| | ||
| |[./test_asr_non_streaming_funasr_nano_async.js](./test_asr_non_streaming_funasr_nano_async.js)|Async non-streaming speech recognition from multiple files using a [FunASR Nano](https://modelscope.cn/models/FunAudioLLM/Fun-ASR-Nano-2512) model| |
| ### Asynchronous non-streaming speech recognition with FunASR Nano models | ||
|
|
||
| ```bash | ||
| wget https://github.com/k2-fsa/sherpa-onnx/releases/download/asr-models/sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2 | ||
| tar xvf sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2 | ||
| rm sherpa-onnx-funasr-nano-int8-2025-12-30.tar.bz2 | ||
|
|
||
| node ./test_asr_non_streaming_funasr_nano_async.js | ||
| ``` |
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (5)
.github/scripts/test-nodejs-npm.shharmony-os/SherpaOnnxHar/sherpa_onnx/src/main/cpp/non-streaming-asr.ccnodejs-addon-examples/README.mdnodejs-addon-examples/test_asr_non_streaming_funasr_nano_async.jsscripts/node-addon-api/lib/non-streaming-asr.js
🧰 Additional context used
🧬 Code graph analysis (1)
harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/cpp/non-streaming-asr.cc (1)
sherpa-onnx/c-api/c-api.cc (6)
SherpaOnnxDecodeOfflineStream(636-640)SherpaOnnxDecodeOfflineStream(636-638)SherpaOnnxGetOfflineStreamResultAsJson(769-778)SherpaOnnxGetOfflineStreamResultAsJson(769-770)SherpaOnnxDestroyOfflineStreamResultJson(780-780)SherpaOnnxDestroyOfflineStreamResultJson(780-780)
🔇 Additional comments (8)
.github/scripts/test-nodejs-npm.sh (1)
18-20: LGTM!The async test is correctly placed after the model download and before cleanup. The test ordering ensures the model files are available for both sync and async tests.
nodejs-addon-examples/README.md (2)
131-131: LGTM!The table entry clearly describes the async functionality and links to the correct test file.
434-442: LGTM!The setup instructions follow the existing documentation pattern and provide clear steps to run the async test.
scripts/node-addon-api/lib/non-streaming-asr.js (1)
70-79: LGTM!The async API implementation is clean and follows the existing patterns in the codebase. The method correctly awaits the native async call and parses the JSON result.
nodejs-addon-examples/test_asr_non_streaming_funasr_nano_async.js (2)
42-63: LGTM!The test effectively demonstrates the async API's value by decoding multiple files concurrently with
Promise.all. The error handling at the entry point is appropriate for a test script.
55-57: Verify thread safety when usingPromise.all()withdecodeAsync()on a shared recognizer.The code uses
Promise.all()with concurrentdecodeAsync()calls, which leverages Node.js's thread pool (viaNapi::AsyncWorker) to execute multipleSherpaOnnxDecodeOfflineStream()calls in parallel. While each stream is independent, they share the same recognizer instance (and its underlying ONNX model). Confirm that the ONNX Runtime session used by the recognizer is thread-safe for concurrent inference operations. If not, consider either serializing the decoding operations or implementing a mutex to synchronize access to the recognizer's model.harmony-os/SherpaOnnxHar/sherpa_onnx/src/main/cpp/non-streaming-asr.cc (2)
669-701: LGTM!The wrapper function correctly validates arguments, extracts external pointers, creates the deferred promise, and queues the async worker. The implementation follows standard N-API async patterns.
781-782: LGTM!The new async function is correctly exported alongside the existing sync version.
✏️ Tip: You can disable this entire section by setting review_details to false in your review settings.
| class DecodeOfflineStreamAsyncWorker : public Napi::AsyncWorker { | ||
| public: | ||
| DecodeOfflineStreamAsyncWorker(Napi::Env env, | ||
| const SherpaOnnxOfflineRecognizer *recognizer, | ||
| const SherpaOnnxOfflineStream *stream, | ||
| Napi::Promise::Deferred deferred) | ||
| : Napi::AsyncWorker(env), | ||
| recognizer_(recognizer), | ||
| stream_(stream), | ||
| deferred_(deferred) {} | ||
|
|
||
| void Execute() override { | ||
| try { | ||
| SherpaOnnxDecodeOfflineStream(recognizer_, stream_); | ||
| } catch (const std::exception &e) { | ||
| SetError(e.what()); | ||
| } | ||
| } | ||
|
|
||
| void OnOK() override { | ||
| const char *json = SherpaOnnxGetOfflineStreamResultAsJson(stream_); | ||
| Napi::String s = Napi::String::New(Env(), json); | ||
| SherpaOnnxDestroyOfflineStreamResultJson(json); | ||
| deferred_.Resolve(s); | ||
| } | ||
|
|
||
| void OnError(const Napi::Error &e) override { deferred_.Reject(e.Value()); } | ||
|
|
||
| private: | ||
| const SherpaOnnxOfflineRecognizer *recognizer_; | ||
| const SherpaOnnxOfflineStream *stream_; | ||
| Napi::Promise::Deferred deferred_; | ||
| }; |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cd harmony-os/SherpaOnnxHar && find . -name "*.cc" -o -name "*.js" | grep -E "(non-streaming|streaming)" | head -20Repository: k2-fsa/sherpa-onnx
Length of output: 325
🏁 Script executed:
cd harmony-os/SherpaOnnxHar && rg -n "AsyncWorker" --type cpp -A 5 | head -150Repository: k2-fsa/sherpa-onnx
Length of output: 3947
🏁 Script executed:
cd harmony-os/SherpaOnnxHar && find . -name "non-streaming-asr.js" -o -name "streaming-asr.js" | xargs ls -laRepository: k2-fsa/sherpa-onnx
Length of output: 961
🏁 Script executed:
cd harmony-os/SherpaOnnxHar && find . -name "*.js" -type f | head -20Repository: k2-fsa/sherpa-onnx
Length of output: 44
🏁 Script executed:
cd harmony-os/SherpaOnnxHar && rg -n "DecodeOfflineStreamAsync" --type js -B 3 -A 10Repository: k2-fsa/sherpa-onnx
Length of output: 210
🏁 Script executed:
cd harmony-os/SherpaOnnxHar && cat sherpa_onnx/src/main/cpp/non-streaming-asr.cc | sed -n '669,710p'Repository: k2-fsa/sherpa-onnx
Length of output: 1373
🏁 Script executed:
cd harmony-os/SherpaOnnxHar && find . -type f -name "*.js" -o -name "*.ts"Repository: k2-fsa/sherpa-onnx
Length of output: 186
🏁 Script executed:
cd harmony-os/SherpaOnnxHar && rg -n "decodeOfflineStreamAsync\|DecodeOfflineStream" -i -B 5 -A 5Repository: k2-fsa/sherpa-onnx
Length of output: 44
🏁 Script executed:
cd harmony-os/SherpaOnnxHar && cat sherpa_onnx/src/main/cpp/non-streaming-asr.cc | sed -n '635,667p'Repository: k2-fsa/sherpa-onnx
Length of output: 1154
🏁 Script executed:
cd harmony-os/SherpaOnnxHar && find . -type f \( -name "*.js" -o -name "*.jsx" -o -name "*.ts" -o -name "*.tsx" \) ! -path "*/node_modules/*" ! -path "*/.git/*"Repository: k2-fsa/sherpa-onnx
Length of output: 186
🏁 Script executed:
cd harmony-os/SherpaOnnxHar && cat sherpa_onnx/src/main/cpp/non-streaming-asr.cc | sed -n '1,50p'Repository: k2-fsa/sherpa-onnx
Length of output: 1446
🏁 Script executed:
cd harmony-os/SherpaOnnxHar && rg -n "class.*External\|Napi::External" --type cpp -B 2 -A 8 | head -80Repository: k2-fsa/sherpa-onnx
Length of output: 44
🏁 Script executed:
cd harmony-os/SherpaOnnxHar && rg -n "Napi::Reference\|ObjectReference" --type cpp | head -20Repository: k2-fsa/sherpa-onnx
Length of output: 44
🏁 Script executed:
cd harmony-os/SherpaOnnxHar && cat sherpa_onnx/src/main/cpp/non-streaming-asr.cc | sed -n '700,730p'Repository: k2-fsa/sherpa-onnx
Length of output: 892
🏁 Script executed:
cd harmony-os/SherpaOnnxHar && rg -n "Napi::External" --type cpp -B 3 -A 3 | head -100Repository: k2-fsa/sherpa-onnx
Length of output: 7156
🏁 Script executed:
cd harmony-os/SherpaOnnxHar && rg -n "Napi::Reference" --type cpp -B 2 -A 5Repository: k2-fsa/sherpa-onnx
Length of output: 6442
🏁 Script executed:
cd harmony-os/SherpaOnnxHar && cat sherpa_onnx/src/main/cpp/non-streaming-asr.cc | sed -n '485,530p'Repository: k2-fsa/sherpa-onnx
Length of output: 1430
🏁 Script executed:
cd harmony-os/SherpaOnnxHar && rg -n "CreateOfflineRecognizer\|CreateOfflineStream" --type cpp -B 3 -A 12 | head -60Repository: k2-fsa/sherpa-onnx
Length of output: 44
The async worker has a use-after-free vulnerability due to unmanaged pointer lifetime.
DecodeOfflineStreamAsyncWorker stores raw pointers to the recognizer and stream but does not hold references to the Napi::External JavaScript objects that own them. When DecodeOfflineStreamAsyncWrapper returns, if the JavaScript engine garbage collects those objects, their finalizers (SherpaOnnxDestroyOfflineRecognizer and SherpaOnnxDestroyOfflineStream) will free the underlying C++ pointers while the async worker still holds raw references to them.
Unlike SpeakerDiarizationProcessWorker and TtsGenerateWorker elsewhere in the codebase (which use Napi::Reference<Napi::Value> to keep JS objects alive), this worker relies entirely on external reference management.
Hold references to the External objects in the worker using Napi::Reference:
class DecodeOfflineStreamAsyncWorker : public Napi::AsyncWorker {
private:
Napi::Reference<Napi::External<SherpaOnnxOfflineRecognizer>> recognizer_ref_;
Napi::Reference<Napi::External<SherpaOnnxOfflineStream>> stream_ref_;
const SherpaOnnxOfflineRecognizer *recognizer_;
const SherpaOnnxOfflineStream *stream_;Then in the constructor, capture the references:
recognizer_ref_ = Napi::Reference<Napi::External<SherpaOnnxOfflineRecognizer>>::New(recognizer_external);
stream_ref_ = Napi::Reference<Napi::External<SherpaOnnxOfflineStream>>::New(stream_external);Pass the External objects from info[0] and info[1] directly to the worker constructor.
Summary by CodeRabbit
New Features
Documentation
✏️ Tip: You can customize this high-level summary in your review settings.