All audio filters are real-time and language-independent.
The Noise Cancellation algorithm is designed to remove background noise during real-time communication. The Krisp SDK includes technologies for both Outbound (Microphone) and Inbound (Speaker) Noise Cancellation.
When the noise cancellation algorithm runs, it also automatically performs de-reverberation, removing room echo from the audio.
The technical specs and more details about the algorithms can be found here.
Background Voice Cancellation (BVC) technology is developed to cancel all background voices. It also removes all background noises and reverberation. The technology does not require user voice enrollment or training on user voice data. Krisp has deployed this technology in its Desktop applications, fixing the problem of cross-talk in call centers and offices.
BVC technology is designed to work with any headset or earbuds. It works best with wired USB headsets with a boom microphone and is also compatible with most Bluetooth headsets, including AirPods.
Read more for the specs and details about the algorithm and supported devices.
The Voice Activity Detection (VAD) algorithm is designed to predict whether an audio frame contains speech. It can identify speech even in high-noise conditions.
Specs and details here.
This real-time algorithm retrieves per-frame statistics about the levels of processed voice and removed noise. These statistics are represented as values within the range of 0 to 100, indicating the amount of voice and removed noise in each frame.
In addition to per-frame statistics, the algorithm includes an end-of-stream feature that enables users to retrieve information on the amount of removed noise classified into four categories: no noise, low, medium, and high. This feature also provides information on the total talk time accumulated from the start of the processing until the point at which the statistics are retrieved.
The NoiseDB algorithm estimates the noise energy (in dB) of a given audio fragment.
The algorithm takes an audio frame as input and returns the estimated noise dB integrated over time. For reasonably accurate results, we suggest calling it on consecutive frames of an audio buffer at least 1 second long and using the last frame's value as the overall NoiseDB estimate for that buffer.
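The calling pattern described above can be sketched as follows. This is only an illustration under stated assumptions: `estimate_frame` is a hypothetical stand-in for whatever per-frame NoiseDB call your SDK binding exposes, and the 10 ms frame duration is an assumed value, not something this page specifies.

```python
from typing import Callable, List, Sequence


def split_into_frames(samples: Sequence[float], sample_rate: int,
                      frame_ms: int = 10) -> List[Sequence[float]]:
    """Split a buffer into consecutive full frames (a trailing partial frame is dropped)."""
    frame_len = sample_rate * frame_ms // 1000
    n_full = len(samples) // frame_len
    return [samples[i * frame_len:(i + 1) * frame_len] for i in range(n_full)]


def estimate_buffer_noise_db(samples: Sequence[float], sample_rate: int,
                             estimate_frame: Callable[[Sequence[float]], float],
                             frame_ms: int = 10) -> float:
    """Feed consecutive frames to the estimator and return the last frame's value.

    `estimate_frame` is a hypothetical callable standing in for the real
    per-frame NoiseDB call; it is assumed to keep its own integration state.
    """
    if len(samples) < sample_rate:
        raise ValueError("buffer should be at least 1 second long")
    last_value = 0.0
    for frame in split_into_frames(samples, sample_rate, frame_ms):
        last_value = estimate_frame(frame)
    # The value from the final frame is used as the estimate for the whole buffer.
    return last_value
```

Passing the estimator in as a callable keeps the sketch independent of any particular SDK binding; swap in the real per-frame call when wiring it up.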
To identify whether the fragment contains noise, the SDK client needs to choose a threshold and classify segments as noisy or no-noise based on it. The higher the threshold, the more energy a noise must have before the segment is considered noisy. We suggest using 50 as the threshold for the NoiseDB value. If the last estimated NoiseDB is greater than 50, the given fragment contains noise with enough energy.
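A minimal sketch of the threshold classification, assuming you already hold the last NoiseDB estimate for each audio fragment. The function names and labels here are illustrative and not part of the SDK; only the suggested threshold of 50 comes from the text above.

```python
from typing import Iterable, List

# Suggested threshold from the text above; tune it for your use case.
SUGGESTED_THRESHOLD = 50.0


def is_noisy(last_noise_db: float, threshold: float = SUGGESTED_THRESHOLD) -> bool:
    """A fragment counts as noisy only if its estimate is strictly greater than the threshold."""
    return last_noise_db > threshold


def label_fragments(estimates: Iterable[float],
                    threshold: float = SUGGESTED_THRESHOLD) -> List[str]:
    """Label each fragment's final NoiseDB estimate as 'noisy' or 'no-noise'."""
    return ["noisy" if is_noisy(value, threshold) else "no-noise"
            for value in estimates]


labels = label_fragments([30.0, 50.0, 50.1, 72.0])
```

Raising the threshold makes the classifier stricter, so only higher-energy noise is flagged.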