VAD
Overview
Voice Activity Detection (VAD) is an audio-processing component that determines whether incoming audio contains speech. In this implementation, VAD helps control the noise cancellation process and optimizes audio transmission.
KrispSDK constructor arguments
| Property | Type | Default | Description | 
|---|---|---|---|
| params.models.modelNC | String | undefined | Path to the NC model used for sampling rates above 8 kHz |
| params.models.modelVAD | String | undefined | Path to the VAD model |
const krispSDK = new KrispSDK({
  params: {
    // ... other params
    models: {
      // ... other models
      modelNC: "/dist/models/c6.f.s.da1785.kef",
      modelVAD: "/dist/models/vad_2.0.0_1.0.kef",
    },
  },
});
Noise Filter Creation
const audioSettings = {
  audio: {
    echoCancellation: false,
    noiseSuppression: false,
    autoGainControl: false,
  },
};
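const stream = await navigator.mediaDevices.getUserMedia(audioSettings);
// Create an AudioContext here if your application does not already have one.
const audioContext = new AudioContext();
const source = audioContext.createMediaStreamSource(stream);
const destination = audioContext.createMediaStreamDestination();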
const filterParam = {
  audioContext,
  useVAD: true, // Toggle VAD with this parameter
  vad: {
    // The VAD threshold determines whether an audio signal is classified as speech or silence.
    // It is compared against the output of the VAD processing function to decide whether speech is present.
    threshold: 0.5,
  },
};
// onReady is assumed to be a callback defined elsewhere in your application.
const filterNode = await krispSDK.createNoiseFilter(filterParam, onReady);
source.connect(filterNode).connect(destination);
destination.stream; // The destination stream is the resulting stream, which you can use in your business logic
The accuracy of VAD depends on the chosen threshold value. A lower threshold may classify more background noise as speech, while a higher threshold might cause speech to be ignored.
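The trade-off can be illustrated with a small standalone sketch. It is not part of the Krisp SDK: `classifyFrames` and `countSpeech` are hypothetical helpers, the per-frame scores are made up, and it assumes that the VAD output is a score between 0 and 1 and that a frame counts as speech when its score is at least the threshold. The SDK's actual comparison may differ.

```javascript
// Hypothetical helper, not a Krisp SDK API: marks a frame as speech when its
// VAD score is greater than or equal to the threshold (assumed comparison).
function classifyFrames(scores, threshold) {
  return scores.map((score) => score >= threshold);
}

// Counts how many frames were marked as speech.
function countSpeech(flags) {
  return flags.filter(Boolean).length;
}

// Made-up per-frame VAD scores in the range [0, 1], for illustration only.
const scores = [0.1, 0.35, 0.55, 0.8, 0.45];

console.log(countSpeech(classifyFrames(scores, 0.3))); // 4 frames treated as speech
console.log(countSpeech(classifyFrames(scores, 0.5))); // 2 frames treated as speech
console.log(countSpeech(classifyFrames(scores, 0.7))); // 1 frame treated as speech
```

With the same scores, lowering the threshold from 0.5 to 0.3 lets two borderline frames through as speech, while raising it to 0.7 drops a frame that the default setting would have kept.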
