MELP TSVCIS - MELP Vocoder

MELP using Tactical Secure Voice Cryptographic Interoperability Specification (TSVCIS)

The Tactical Secure Voice Cryptographic Interoperability Specification (TSVCIS) is a variable data rate system based on the STANAG-4591 MELPe vocoder. It also scales to enhanced coding at higher rates of 8000, 12000 and 16000 bit/s.

In 2009, TSVCIS was created by the National Security Agency (NSA) and the Naval Research Laboratory (NRL), based on STANAG-4591 MELPe. [7] It includes the following two main voice categories:

TSVCIS Narrowband (NB) Waveform which is based on STANAG-4591, the 600 bit/s, 1200 bit/s and 2400 bit/s NATO Interoperable Narrow Band Voice Coder, i.e. MELPe codec,
TSVCIS Wideband (WB) Waveform which operates at 8000, 12000, and 16000 bit/s and is based on both the STANAG-4591 codec and additional encoded voice wideband parameters that enable scalability of the 2400 bit/s MELPe to higher multi-rate speech coding.

Both categories may have different modes, including modes with Forward Error Correction (FEC) that use different BCH (Bose-Chaudhuri-Hocquenghem) code blocks. FEC protection is a major advantage on highly degraded channels, because speech intelligibility is maintained even at very high channel bit error rates.
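To make the BCH idea concrete, here is a minimal sketch of systematic encoding and syndrome checking. The BCH block sizes used by TSVCIS are not given in this text, so the sketch uses the textbook BCH(15,7) code with generator polynomial x^8 + x^7 + x^6 + x^4 + 1 purely as an illustration. The function names are hypothetical, and only error detection is shown, not a full error-correcting decoder.

```python
# Illustrative only: textbook BCH(15,7), NOT the block sizes defined by TSVCIS.
G = 0b111010001   # g(x) = x^8 + x^7 + x^6 + x^4 + 1
N, K = 15, 7      # codeword length and message length in bits


def poly_mod(value, gen=G):
    """Remainder of value(x) divided by gen(x) over GF(2)."""
    gen_deg = gen.bit_length() - 1
    while value.bit_length() - 1 >= gen_deg:
        value ^= gen << (value.bit_length() - 1 - gen_deg)
    return value


def bch_encode(msg):
    """Systematic encoding: 7 message bits followed by 8 parity bits."""
    if not 0 <= msg < (1 << K):
        raise ValueError("message must fit in 7 bits")
    shifted = msg << (N - K)
    return shifted | poly_mod(shifted)


def syndrome(word):
    """Zero for a valid codeword, non-zero when bit errors are detected."""
    return poly_mod(word)


codeword = bch_encode(0b1011001)
corrupted = codeword ^ (1 << 3)          # simulate one channel bit error
print(syndrome(codeword) == 0)           # valid codeword
print(syndrome(corrupted) != 0)          # error detected
```

This code has minimum distance 5, so a complete decoder could correct up to two bit errors in each 15-bit block. A real implementation would add that correction step and use the block sizes the specification defines.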

TSVCIS also includes a special WB Voice 16 Gateway mode for a 16000 bit/s channel, using band FEC, which is used when NB to WB crossbanding has occurred.

Improved Voice Quality, Goals and Modes

One of the goals in implementing TSVCIS was to improve voice quality compared with existing equipment. A key aspect of overall vocoder performance is how well a vocoder performs in two common military environments: a) harsh acoustic platform noise and b) harsh RF channel noise (bit errors). [8]

For the 2400 bit/s NB voice mode, MELPe is the DoD and NATO standard. It was designed to provide good overall performance in tactical environments and has been tested extensively as part of the standardization process.

The goals for the WB voice modes, besides improving voice quality, were to provide multiple voice rates that can be matched to varying channel conditions while still maintaining direct interoperability with the other voice modes, both wideband (WB) and narrowband (NB).

These goals were met with four WB voice modes that are all supersets of the MELPe 2400 bit/s mode. There is an 8000 bit/s mode, a 12000 bit/s mode, and two modes at 16000 bit/s. The rates of these modes were selected to conform to the voice data rates of existing radio equipment. The modes themselves were defined to accommodate the expected range of acoustic and channel noise conditions.

The 8000 bit/s and 12000 bit/s modes are both super-protected MELPe frames. They both have Forward Error Correction (FEC) layered on the NB bitstream to strongly protect it from channel noise and so provide higher voice quality over longer transmission distances.

The first 16000 bit/s mode contains both layered FEC to protect from channel noise and a layer of additional voice information (spectral coefficient parameters) to improve voice quality in harsh acoustic environments.

Finally, the second 16000 bit/s mode contains no FEC but even more voice information. This mode is designed to provide high voice quality when the channel has low bit errors or FEC is provided out of band.
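The four WB modes described above can be summarized as a small table with a per-frame bit budget. This is a rough sketch under stated assumptions: the mode names are hypothetical labels, and the calculation assumes that the 22.5 ms frame of 2400 bit/s MELPe (54 bits per frame) also applies to the WB modes, which this text does not state.

```python
from dataclasses import dataclass

FRAME_MS = 22.5   # 2400 bit/s MELPe frame duration; ASSUMED to apply to WB modes too
MELPE_BITS = 54   # 2400 bit/s * 22.5 ms


@dataclass(frozen=True)
class WbMode:
    name: str               # hypothetical label, not taken from the specification
    rate_bps: int
    has_fec: bool           # FEC layered on the bitstream
    extra_voice_info: bool  # additional spectral coefficient parameters


WB_MODES = (
    WbMode("WB-8000", 8000, True, False),
    WbMode("WB-12000", 12000, True, False),
    WbMode("WB-16000-FEC", 16000, True, True),
    WbMode("WB-16000-NOFEC", 16000, False, True),
)


def bits_per_frame(rate_bps):
    """Channel bits available in one frame at the given rate."""
    return round(rate_bps * FRAME_MS / 1000)


def bits_beyond_melpe(mode):
    """Bits per frame left for FEC, extra voice information, or anything else."""
    return bits_per_frame(mode.rate_bps) - MELPE_BITS


for mode in WB_MODES:
    print(mode.name, bits_per_frame(mode.rate_bps), bits_beyond_melpe(mode))
```

Under these assumptions, every WB frame has room for the 54 MELPe bits plus a much larger remainder, which is what allows the modes to trade FEC protection against additional voice information.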

TSVCIS VDR Coder Architecture

The TSVCIS VDR encoder is illustrated below. Given the input speech and the selected mode, the STANAG-4591 MELPe vocoder is used together with the NRL Variable Data Rate (VDR) encoder to encode Wideband Waveform frames at 8000, 12000, and 16000 bit/s. [7] Based on the selected TSVCIS VDR operation mode and the speech frames received from the MELPe encoder, the VDR encoder generates residual spectral code parameters and encodes them into additional bits. These bits, together with the standard 2400 bit/s STANAG-4591 bits, form the Wideband Waveform frames at 8000, 12000, and 16000 bit/s.

The TSVCIS VDR decoder is illustrated below. It extracts the STANAG-4591 and VDR frames and inputs them to the STANAG-4591 decoder and the VDR decoder, respectively. The STANAG-4591 MELPe decoder synthesizes its output speech, which is then input to the VDR decoder. The VDR decoder decodes its residual signal, which is then used, along with the MELPe output, to synthesize the TSVCIS VDR output speech.
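The layering described above, where each VDR frame is a superset of a MELPe frame, can be sketched as simple bit-string packing. The layout used here (MELPe bits first, residual bits appended) and the function names are assumptions made for illustration; the actual bit ordering is defined by the specification and is not given in this text.

```python
MELPE_BITS = 54  # bits in one 2400 bit/s MELPe frame (22.5 ms)


def pack_vdr_frame(melpe_bits, residual_bits):
    """Combine one MELPe frame with the additional VDR residual bits.

    Both arguments are strings of '0'/'1'. Layout is hypothetical.
    """
    if len(melpe_bits) != MELPE_BITS:
        raise ValueError("expected a 54-bit MELPe frame")
    if set(melpe_bits + residual_bits) - {"0", "1"}:
        raise ValueError("frames must contain only '0' and '1'")
    return melpe_bits + residual_bits


def split_vdr_frame(frame):
    """Recover the embedded MELPe frame and the VDR residual bits."""
    if len(frame) < MELPE_BITS:
        raise ValueError("frame is shorter than a MELPe frame")
    return frame[:MELPE_BITS], frame[MELPE_BITS:]


melpe = "10" * 27        # placeholder 54-bit MELPe frame
residual = "1100" * 10   # placeholder residual spectral bits
frame = pack_vdr_frame(melpe, residual)
nb_part, wb_part = split_vdr_frame(frame)
print(nb_part == melpe, wb_part == residual)
```

Because the MELPe bits are recoverable on their own, a receiver limited to narrowband can decode just that part, which is one way to picture the interoperability between NB and WB modes.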

The TSVCIS VDR encoder is used as part of the TSVCIS encoder, as illustrated below. The TSVCIS encoder encrypts the VDR-encoded frames and then performs Forward Error Correction (FEC) coding using BCH codes.

The TSVCIS VDR decoder is used as part of the TSVCIS decoder, as illustrated below. The TSVCIS decoder performs Forward Error Correction (FEC) decoding and then decrypts the frames to recover the received VDR-encoded frames, which are then input to the TSVCIS VDR decoder to generate the output speech.
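The ordering of these stages (encrypt, then FEC on transmit; FEC decode, then decrypt on receive) can be sketched with deliberately simple stand-ins. In the sketch below, an XOR keystream replaces the real encryption and a 3x repetition code replaces the BCH codes; neither is what TSVCIS uses, and all function names are hypothetical.

```python
def xor_bits(bits, keystream):
    """Stand-in cipher: XOR with a keystream of equal length (NOT the real encryption)."""
    if len(bits) != len(keystream):
        raise ValueError("keystream must match frame length")
    return [b ^ k for b, k in zip(bits, keystream)]


def fec_encode(bits):
    """Stand-in FEC: repeat every bit three times (NOT BCH)."""
    return [b for b in bits for _ in range(3)]


def fec_decode(coded):
    """Majority vote over each group of three bits."""
    return [1 if sum(coded[i:i + 3]) >= 2 else 0 for i in range(0, len(coded), 3)]


def tsvcis_encode(vdr_frame, keystream):
    """Transmit side: encrypt first, then apply FEC."""
    return fec_encode(xor_bits(vdr_frame, keystream))


def tsvcis_decode(channel_bits, keystream):
    """Receive side: FEC decode first, then decrypt."""
    return xor_bits(fec_decode(channel_bits), keystream)


frame = [1, 0, 1, 1, 0, 0, 1, 0]
key = [0, 1, 1, 0, 1, 0, 0, 1]
sent = tsvcis_encode(frame, key)
received = list(sent)
received[4] ^= 1                                # one channel bit error
print(tsvcis_decode(received, key) == frame)    # error removed before decryption
```

With FEC as the outermost layer, channel bit errors are corrected before decryption, so the decryption stage receives clean ciphertext.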

The TSVCIS NB and WB channels are illustrated below, where the NB uses the standard STANAG-4591 MELPe, and the WB uses the STANAG-4591 along with the VDR to transmit 8000, 12000, and 16000 bit/s waveforms.

The TSVCIS NB and WB Waveforms' bit layers are illustrated below, where the NB uses the standard STANAG-4591 MELPe, and the WB uses the STANAG-4591 along with the VDR, encryption, and FEC, to transmit 8000, 12000, and 16000 bit/s waveforms.

Compandent's STANAG-4591 MELPe suite can be used along with the VDR coder to generate the TSVCIS vocoder. Alternatively, Compandent's FLEXI-232 DTE may be used along with some additional processing to form a TSVCIS vocoder, as illustrated below.

Similarly, when a modem is used, Compandent's FLEXI-232 DTE may be used along with some additional processing to form a TSVCIS vocoder, as illustrated below.

TSVCIS MELP - Audio Samples

Coder / Condition	Original	2400 MELP	16 kbps TSVCIS with FEC	16 kbps TSVCIS without FEC
Clean	Original	2400 MELP	16 kbps TSVCIS with FEC	16 kbps TSVCIS without FEC
Noisy	Original	2400 MELP	16 kbps TSVCIS with FEC	16 kbps TSVCIS without FEC

Table 1. Audio samples of standard 2400 MELP, 16 kbps TSVCIS with FEC, and 16 kbps TSVCIS without FEC (including Noise Pre-Processor) in error-free channel.

Note: Compandent's MELPe suite achieves better quality than the above MELP sample.

References

[1] "A Mixed Excitation LPC Vocoder Model for Low Bit Rate Speech Coding," Alan V. McCree, Thomas P. Barnwell, 1995, in IEEE Trans. Speech and Audio Processing (original MELP).
[2] "Analog-to-Digital Conversion of Voice by 2,400 Bit/Second Mixed Excitation Linear Prediction (MELP)," US DoD (MIL-STD-3005, original MELP).
[3] "The 1200 and 2400 bit/s NATO Interoperable Narrow Band Voice Coder," STANAG-4591, NATO.
[4] "MELPe Variation for 600 bit/s NATO Narrow Band Voice Coder, STANAG-4591," NATO.
[5] Alan McCree, "A scalable phonetic vocoder framework using joint predictive vector quantization of MELP parameters," in Proc. IEEE Int. Conf. Acoust., Speech, Signal Processing, 2006, pp. I 705–708, Toulouse, France.
[6] "SCIP Signaling Plan," Revision 3.6, January 8, 2013.
[7] "Tactical Secure Voice Cryptographic Interoperability Specification (TSVCIS) Version 2.1," July 2, 2012.
[8] Thomas Moran, David Heide, Swati Shah, U.S. Naval Research Laboratory, "An Overview of the Tactical Secure Voice Cryptographic Interoperability Specification," in Proceedings of MILCOM 2010, pp. 213-218.
