VAD
Overview
Voice Activity Detection (VAD) is an audio processing component that determines whether incoming audio contains speech. In this implementation, VAD helps control the noise cancellation process and optimize audio transmission.
KrispSDK constructor arguments
Property | Type | Default | Description |
---|---|---|---|
params.models.modelNC | String | undefined | Path to the NC model used for sampling rates above 8 kHz |
params.models.modelVAD | String | undefined | Path to the VAD model |
let krispSDK = new KrispSDK({
  params: {
    // .. other params
    models: {
      // ... other models
      modelNC: "/dist/models/c6.f.s.da1785.kef",
      modelVAD: "/dist/models/vad_2.0.0_1.0.kef",
    },
  },
});
Noise Filter Creation
// Disable the browser's built-in audio processing
const audioSettings = {
  audio: {
    echoCancellation: false,
    noiseSuppression: false,
    autoGainControl: false,
  },
};
const stream = await navigator.mediaDevices.getUserMedia(audioSettings);
// audioContext is assumed to be an AudioContext created earlier
const source = audioContext.createMediaStreamSource(stream);
const destination = audioContext.createMediaStreamDestination();
const filterParam = {
  audioContext,
  useVAD: true, // Toggle VAD with this parameter
  vad: {
    // The VAD threshold is a configurable value that dictates when an audio signal is classified as speech or silence.
    // It is compared against the output of the VAD processing function to decide whether speech is present.
    threshold: 0.5,
  },
};
const filterNode = await krispSDK.createNoiseFilter(filterParam, onReady);
source.connect(filterNode).connect(destination);
destination.stream; // destination.stream is the resulting stream, which you can use in your business logic
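The filter parameters shown above can also be assembled by a small helper that checks the threshold before it reaches the SDK. This is a minimal sketch: `buildFilterParams` is a hypothetical function, not part of the Krisp SDK, and the assumption that valid thresholds lie in the range 0 to 1 is inferred from the `0.5` example value rather than documented here.

```javascript
// Hypothetical helper, not part of the Krisp SDK.
// Assumption: a valid VAD threshold is a number between 0 and 1 inclusive.
function buildFilterParams(audioContext, { useVAD = true, threshold = 0.5 } = {}) {
  const isValid =
    typeof threshold === "number" &&
    !Number.isNaN(threshold) &&
    threshold >= 0 &&
    threshold <= 1;
  if (!isValid) {
    throw new RangeError(`VAD threshold must be a number in [0, 1], got ${threshold}`);
  }
  // Same shape as the filterParam object used with createNoiseFilter above
  return {
    audioContext,
    useVAD,
    vad: { threshold },
  };
}
```

The returned object could then be passed in place of `filterParam`, for example `buildFilterParams(audioContext, { threshold: 0.7 })`.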
The accuracy of VAD depends on the chosen threshold value. A lower threshold may classify more background noise as speech, while a higher threshold might cause speech to be ignored.
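The trade-off can be illustrated with a small standalone sketch. The per-frame scores below are made-up values, and the `>=` comparison is an assumption about how a score is checked against the threshold; the SDK performs this comparison internally and the code here does not call it.

```javascript
// Hypothetical per-frame VAD scores in [0, 1]; illustrative only,
// not produced by the Krisp SDK.
const scores = [0.05, 0.35, 0.55, 0.9, 0.45];

// Assumption: a frame counts as speech when its score meets or exceeds the threshold.
function classifyFrames(frameScores, threshold) {
  return frameScores.map((score) => score >= threshold);
}

// Count how many frames are classified as speech for a given threshold.
function countSpeech(frameScores, threshold) {
  return classifyFrames(frameScores, threshold).filter(Boolean).length;
}

console.log(countSpeech(scores, 0.3)); // 4: low threshold, more frames pass as speech
console.log(countSpeech(scores, 0.5)); // 2
console.log(countSpeech(scores, 0.8)); // 1: high threshold, quieter speech may be dropped
```

Running the same scores through several thresholds like this is one way to reason about a starting value before tuning it against real recordings.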