Encoding method, encoder, method of determining periodic feature value, device for determining periodic feature value, programme and recording medium

FIELD: physics, acoustics.

SUBSTANCE: the invention relates to encoding an audio signal. An encoding method for encoding a sample sequence in a frequency domain which is derived from an audio signal in frames comprises: an interval determination step of determining an interval T between samples from a set S of candidates for the interval T, the interval T corresponding to a periodicity of the audio signal or to an integer multiple of the fundamental frequency of the audio signal; an additional information generating step of encoding the interval T determined at the interval determination step to obtain additional information; and a sample sequence encoding step of encoding a rearranged sample sequence to obtain a code sequence.

EFFECT: high-quality encoding of an audio signal at a low bit rate with less processing.

22 cl, 10 dwg

 

TECHNICAL FIELD TO WHICH THE INVENTION RELATES

The present invention relates to a technique for encoding an audio signal and, in particular, to encoding of frequency-domain sample sequences obtained by transforming an audio signal into the frequency domain, and to a technique for determining a periodic feature value (e.g., a fundamental frequency or a pitch period) that can be used as an indicator for rearranging a sample sequence in encoding.

BACKGROUND ART

Adaptive coding of orthogonal transform coefficients, such as discrete Fourier transform (DFT) coefficients and modified discrete cosine transform (MDCT) coefficients, is known as a method of encoding speech signals and audio signals at low bit rates (e.g., about 10-20 kbit/s). For example, the Extended Adaptive Multi-Rate Wideband codec (AMR-WB+), which is a standard method, has a transform coded excitation (TCX) encoding mode, in which DFT coefficients are normalized and vector-quantized every 8 samples.

In Transform-domain Weighted Interleave Vector Quantization (TwinVQ), all MDCT coefficients are rearranged according to a fixed rule, and the resulting sets of samples are combined into vectors and encoded. In some cases a variant of TwinVQ is used in which large components are extracted from the MDCT coefficients, for example at every pitch period, information corresponding to the pitch period is encoded, the MDCT coefficient sequence remaining after removal of the large components at every pitch period is rearranged, and the rearranged MDCT coefficient sequence is vector-quantized every predetermined number of samples. References on TwinVQ include Non-patent literature 1 and 2.

An example of a technique that extracts samples at regular intervals for encoding is described in Patent literature 1.

PRIOR ART LITERATURE

[Patent literature]

Patent literature 1: Japanese Patent Application Laid-Open No. 2009-156971.

[Non-patent literature]

Non-patent literature 1: T. Moriya, N. Iwakami, A. Jin, K. Ikeda, and S. Miki, “A Design of Transform Coder for Both Speech and Audio Signals at 1 bit/sample,” Proc. ICASSP '97, pp.1371-1384, 1997.

Non-patent literature 2: J. Herre, E. Allamanche, K. Brandenburg, M. Dietz, B. Teichmann, B. Grill, A. Jin, T. Moriya, N. Iwakami, T. Norimatsu, M. Tsushima, T. Ishikawa, “The Integrated Filterbank Based Scalable MPEG-4 Audio Coder,” 105th Convention, Audio Engineering Society, 4810, 1998.

SUMMARY OF THE INVENTION

[Problem to be solved by the invention]

Since coding based on TCX, such as AMR-WB+, does not take into account variations in the amplitudes of frequency-domain coefficients caused by periodicity, encoding efficiency decreases when amplitudes that vary greatly are encoded together. There are variations of quantization and coding based on TCX. Consider, for example, a case in which entropy coding is applied to a sequence of MDCT coefficients that are discrete values obtained by quantization and are arranged in ascending order of frequency, in order to achieve compression. In this case, a plurality of samples are treated as a single symbol (block coding), and the code assigned to a symbol is adaptively controlled depending on the immediately preceding symbol. As a rule, shorter codes are assigned to symbols with smaller amplitudes and longer codes are assigned to symbols with larger amplitudes. Because the assigned codes are adaptively controlled depending on the immediately preceding symbol, ever shorter codes are assigned while values with small amplitudes continue to appear in the sequence. When a sample with a far larger amplitude suddenly appears after a sample with a small amplitude, a very long code is assigned to that sample.

Conventional TwinVQ was designed on the assumption that fixed-length-code vector quantization is used, in which codes of the same length are assigned to every vector made up of given samples; it was not intended for encoding MDCT coefficients with variable-length coding.

In light of the technical background described above, an object of the present invention is to provide an encoding technique that improves the quality of discrete signals, especially speech/audio digital signals, encoded at a low bit rate with a small amount of computation, and to provide a technique for determining a periodic feature value that can be used as an indicator for rearranging a sample sequence in encoding.

[Means for solving problems]

According to the present invention, an encoding method for encoding a sample sequence in a frequency domain that is derived from an audio signal in frames includes: an interval determination step of determining an interval T between samples, which corresponds to a periodicity of the audio signal or to an integer multiple of the fundamental frequency of the audio signal, from a set S of candidates for the interval T; an additional information generating step of encoding the interval T determined at the interval determination step to obtain additional information; and a sample sequence encoding step of encoding a rearranged sample sequence to obtain a code sequence. The rearranged sample sequence (1) includes all samples of the sample sequence and (2) is a sample sequence in which at least some of the samples are rearranged so that all or some of one or a plurality of successive samples including the sample corresponding to the periodicity or the fundamental frequency of the audio signal, and one or a plurality of successive samples including a sample corresponding to an integer multiple of the periodicity or the fundamental frequency of the audio signal, are gathered together into a cluster on the basis of the interval T determined at the interval determination step. In the interval determination step, the interval T is determined from a set S made up of Y candidates (where Y<Z) among Z candidates for the interval T that can be represented by the additional information. The Y candidates include Z2 candidates (where Z2<Z) selected without depending on any candidate subjected to the interval determination step in a frame a predetermined number of frames before the current frame, and also include a candidate subjected to the interval determination step in the frame the predetermined number of frames before the current frame.
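The rearrangement described above can be sketched as follows. This is only a minimal illustration under stated assumptions, not the procedure of the embodiment: indices are zero-based, each gathered group consists of the sample at k·T plus `width` neighbours on each side, and the gathered groups are placed at the low-frequency end. The function name `rearrange` and the parameter `width` are hypothetical.

```python
def rearrange(samples, T, width=1):
    """Gather samples around T, 2T, 3T, ... into one cluster at the front.

    samples: frequency-domain samples (index 0 is assumed to be the lowest frequency)
    T:       interval between groups, in samples
    width:   neighbours taken on each side of k*T (an assumption for this sketch)
    Returns the rearranged list and the index order used, so that a decoder
    that knows T can undo the permutation.
    """
    n = len(samples)
    picked, seen = [], set()
    k = 1
    while k * T - width < n:
        for i in range(k * T - width, k * T + width + 1):
            if 0 <= i < n and i not in seen:
                picked.append(i)
                seen.add(i)
        k += 1
    # Every remaining sample is kept, in its original order.
    rest = [i for i in range(n) if i not in seen]
    order = picked + rest
    return [samples[i] for i in order], order


out, order = rearrange(list(range(12)), T=4)
print(out)
```

Because the result is a permutation, all samples of the original sequence are still present, which matches condition (1) in the text.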

The interval determination step may further include an adding step of adding to the set S a value adjacent to a candidate subjected to the interval determination step in the frame the predetermined number of frames before the current frame and/or a value having a predetermined difference from that candidate.
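One way to picture how the set S is assembled is the sketch below. It assumes that candidates are integers, that "adjacent" means the nearest integer values, and that only values representable by the additional information may enter S. The function and parameter names are hypothetical.

```python
def build_candidate_set(fixed_candidates, prev_T, all_candidates, neighbours=1):
    """Return the candidate set S for the current frame as a sorted list.

    fixed_candidates: the Z2 candidates chosen independently of earlier frames
    prev_T:           candidate examined in the earlier frame, or None if unavailable
    all_candidates:   the Z values that the additional information can represent
    neighbours:       adjacent values added on each side of prev_T (assumption)
    """
    allowed = set(all_candidates)
    S = set(fixed_candidates) & allowed
    if prev_T is not None:
        for d in range(-neighbours, neighbours + 1):
            if prev_T + d in allowed:
                S.add(prev_T + d)
    return sorted(S)


print(build_candidate_set([32, 64, 128], 50, range(20, 276)))
```

Searching only the Y values in S rather than all Z representable values is what keeps the amount of computation small.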

The interval determination step may further include a preliminary selection step of selecting, as the Z2 candidates, some of Z1 candidates among the Z candidates for the interval T that can be represented by the additional information, on the basis of an indicator obtained from the audio signal and/or the sample sequence in the current frame, where Z2<Z1.

The interval determination step may further include a preliminary selection step of selecting some of Z1 candidates among the Z candidates for the interval T that can be represented by the additional information, on the basis of an indicator obtained from the audio signal and/or the sample sequence in the current frame, and a second adding step of selecting, as the Z2 candidates, a set made up of the candidates selected at the preliminary selection step and a value adjacent to a candidate selected at the preliminary selection step and/or a value having a predetermined difference from a candidate selected at the preliminary selection step.

The interval determination step may include a second preliminary selection step of selecting some of the candidates for the interval T included in the set S on the basis of an indicator obtained from the audio signal and/or the sample sequence in the current frame, and a final selection step of determining the interval T from a set made up of the candidates selected at the second preliminary selection step.

A configuration is also possible in which the larger an indicator of the degree of stationarity of the audio signal in the current frame, the larger the proportion, in the set S, of candidates subjected to the interval determination step in the frame the predetermined number of frames before the current frame.

A configuration is also possible in which, when the indicator of the degree of stationarity of the audio signal in the current frame is smaller than a predetermined threshold, only the Z2 candidates are included in the set S.
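The two stationarity-dependent configurations can be combined in a small sketch. The assumptions here are that the indicator is scaled to the range 0 to 1, that the earlier-frame candidates are listed in order of preference, and that their number grows linearly with the indicator. None of these choices comes from the text, and the names are hypothetical.

```python
def select_set(fixed_candidates, prev_candidates, stationarity, threshold=0.3):
    """Build S so that earlier-frame candidates take a larger share as stationarity grows.

    fixed_candidates: the Z2 candidates
    prev_candidates:  candidates examined in the earlier frame, most preferred first
    stationarity:     indicator assumed to lie in [0, 1]
    threshold:        below this value only the Z2 candidates are used
    """
    if stationarity < threshold:
        return sorted(set(fixed_candidates))
    # Linear growth of the number of earlier-frame candidates (an assumption).
    n_prev = max(1, round(stationarity * len(prev_candidates)))
    return sorted(set(fixed_candidates) | set(prev_candidates[:n_prev]))


print(select_set([32, 64], [50, 51, 49, 52], stationarity=0.5))
```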

The indicator of the degree of stationarity of the audio signal in the current frame increases when at least one of the following conditions is satisfied:

(a-1) the "prediction gain of the audio signal in the current frame" increases,

(a-2) the "estimated prediction gain of the audio signal in the current frame" increases,

(b-1) the difference between the "prediction gain of the audio signal in the frame immediately preceding the current frame" and the "prediction gain of the audio signal in the current frame" decreases,

(b-2) the difference between the "estimated prediction gain in the immediately preceding frame" and the "estimated prediction gain in the current frame" decreases,

(c-1) the "sum of the amplitudes of the samples of the audio signal included in the current frame" increases,

(c-2) the "sum of the amplitudes of the samples included in a sample sequence obtained by transforming the sample sequence of the audio signal included in the current frame into the frequency domain" increases,

(d-1) the difference between the "sum of the amplitudes of the samples of the audio signal included in the immediately preceding frame" and the "sum of the amplitudes of the samples of the audio signal included in the current frame" decreases,

(d-2) the difference between the "sum of the amplitudes of the samples included in a sample sequence obtained by transforming the sample sequence of the audio signal included in the immediately preceding frame into the frequency domain" and the "sum of the amplitudes of the samples included in a sample sequence obtained by transforming the sample sequence of the audio signal included in the current frame into the frequency domain" decreases,

(e-1) the "power of the audio signal in the current frame" increases,

(e-2) the "power of a sample sequence obtained by transforming the sample sequence of the audio signal in the current frame into the frequency domain" increases,

(f-1) the difference between the "power of the audio signal in the immediately preceding frame" and the "power of the audio signal in the current frame" decreases, and

(f-2) the difference between the "power of a sample sequence obtained by transforming the sample sequence of the audio signal in the immediately preceding frame into the frequency domain" and the "power of a sample sequence obtained by transforming the sample sequence of the audio signal in the current frame into the frequency domain" decreases.
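As an illustration of a single condition from the list, the sketch below builds an indicator that grows as the power difference between consecutive frames shrinks, as in condition (f-1). The mapping 1/(1+d) is an arbitrary choice made for this example; the text only requires that the indicator increase as the difference decreases. The function names are hypothetical.

```python
def frame_power(frame):
    """Mean squared amplitude of the time-domain samples in one frame."""
    return sum(x * x for x in frame) / len(frame)


def stationarity_indicator(prev_frame, cur_frame):
    """Indicator in (0, 1] that increases as the inter-frame power difference decreases."""
    diff = abs(frame_power(cur_frame) - frame_power(prev_frame))
    return 1.0 / (1.0 + diff)


print(stationarity_indicator([1.0, -1.0, 1.0, -1.0], [2.0, -2.0, 2.0, -2.0]))
```

An indicator built from prediction gains or amplitude sums, as in conditions (a) to (d), could be combined with this one; how the conditions are weighted is left open by the text.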

The sample sequence encoding step may include a step of outputting whichever has the smaller code amount: the code sequence obtained by encoding the sample sequence before rearrangement, or the code sequence obtained by encoding the rearranged sample sequence together with the additional information.

The sample sequence encoding step may output the code sequence obtained by encoding the rearranged sample sequence, together with the additional information, when the sum of the code amount (or an estimate of the code amount) of the code sequence obtained by encoding the rearranged sample sequence and the code amount of the additional information is smaller than the code amount (or an estimate of the code amount) of the code sequence obtained by encoding the sample sequence before rearrangement, and may output the code sequence obtained by encoding the sample sequence before rearrangement when the latter code amount or estimate is smaller than that sum.
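The comparison of code amounts can be sketched as below, with code sequences represented as strings of '0' and '1' characters. Keeping the unrearranged code when the two totals are equal is an assumption, since the text does not specify the tie case. The returned boolean is a hypothetical marker used only to show which branch was taken.

```python
def choose_output(code_plain, code_rearranged, additional_info):
    """Return (rearranged_used, bits), preferring the smaller total code amount.

    code_plain:       code for the sample sequence before rearrangement
    code_rearranged:  code for the rearranged sample sequence
    additional_info:  code for the interval T
    """
    if len(code_rearranged) + len(additional_info) < len(code_plain):
        return True, code_rearranged + additional_info
    # Ties fall back to the unrearranged code (an assumption of this sketch).
    return False, code_plain


print(choose_output("1011001110", "10110", "011"))
```

When only estimates of the code amounts are available, the same comparison applies with the estimated lengths in place of `len(...)`.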

The proportion, in the set S, of candidates subjected to the interval determination step in the frame the predetermined number of frames before the current frame may be larger when the code sequence output in the immediately preceding frame is a code sequence obtained by encoding a rearranged sample sequence than when the code sequence output in the immediately preceding frame is a code sequence obtained by encoding a sample sequence before rearrangement.

A configuration is also possible in which, when the code sequence output in the immediately preceding frame is a code sequence obtained by encoding a sample sequence before rearrangement, the set S includes only the Z2 candidates.

A configuration is also possible in which, when the current frame is the temporally first frame, or when the immediately preceding frame was encoded by an encoding method different from the encoding method of the present invention, or when the code sequence output in the immediately preceding frame is a code sequence obtained by encoding a sample sequence before rearrangement, the set S includes only the Z2 candidates.

A method of determining a periodic feature value of an audio signal in frames according to the present invention includes a periodic feature value determination step of determining the periodic feature value of the audio signal from a set of candidates for the periodic feature value on a frame-by-frame basis, and an additional information generating step of encoding the periodic feature value obtained at the periodic feature value determination step to obtain additional information. In the periodic feature value determination step, the periodic feature value is determined from a set S made up of Y candidates (where Y<Z) among Z candidates for the periodic feature value that can be represented by the additional information. The Y candidates include Z2 candidates (where Z2<Z) selected without depending on any candidate subjected to the periodic feature value determination step in a frame a predetermined number of frames before the current frame, and also include a candidate subjected to the periodic feature value determination step in the frame the predetermined number of frames before the current frame.

The periodic feature value determination step may further include an adding step of adding to the set S a value adjacent to a candidate subjected to the periodic feature value determination step in the frame the predetermined number of frames before the current frame and/or a value having a predetermined difference from that candidate.

A configuration is also possible in which the larger an indicator of the degree of stationarity of the audio signal in the current frame, the larger the proportion, in the set S, of candidates subjected to the periodic feature value determination step in the frame the predetermined number of frames before the current frame.

A configuration is also possible in which, when the indicator of the degree of stationarity of the audio signal in the current frame is smaller than a predetermined threshold, only the Z2 candidates are included in the set S.

The indicator of the degree of stationarity of the audio signal in the current frame increases when at least one of the following conditions is satisfied:

(a-1) the "prediction gain of the audio signal in the current frame" increases,

(a-2) the "estimated prediction gain of the audio signal in the current frame" increases,

(b-1) the difference between the "prediction gain of the audio signal in the frame immediately preceding the current frame" and the "prediction gain of the audio signal in the current frame" decreases,

(b-2) the difference between the "estimated prediction gain in the immediately preceding frame" and the "estimated prediction gain in the current frame" decreases,

(c-1) the "sum of the amplitudes of the samples of the audio signal included in the current frame" increases,

(c-2) the "sum of the amplitudes of the samples included in a sample sequence obtained by transforming the sample sequence of the audio signal included in the current frame into the frequency domain" increases,

(d-1) the difference between the "sum of the amplitudes of the samples of the audio signal included in the immediately preceding frame" and the "sum of the amplitudes of the samples of the audio signal included in the current frame" decreases,

(d-2) the difference between the "sum of the amplitudes of the samples included in a sample sequence obtained by transforming the sample sequence of the audio signal included in the immediately preceding frame into the frequency domain" and the "sum of the amplitudes of the samples included in a sample sequence obtained by transforming the sample sequence of the audio signal included in the current frame into the frequency domain" decreases,

(e-1) the "power of the audio signal in the current frame" increases,

(e-2) the "power of a sample sequence obtained by transforming the sample sequence of the audio signal in the current frame into the frequency domain" increases,

(f-1) the difference between the "power of the audio signal in the immediately preceding frame" and the "power of the audio signal in the current frame" decreases, and

(f-2) the difference between the "power of a sample sequence obtained by transforming the sample sequence of the audio signal in the immediately preceding frame into the frequency domain" and the "power of a sample sequence obtained by transforming the sample sequence of the audio signal in the current frame into the frequency domain" decreases.

EFFECTS OF THE INVENTION

According to the present invention, at least some of the samples included in a frequency-domain sample sequence derived from an audio signal are rearranged so that, for example, one or a plurality of successive samples including the sample corresponding to the periodicity or the fundamental frequency of the audio signal, and one or a plurality of successive samples including a sample corresponding to an integer multiple of the periodicity or the fundamental frequency of the audio signal, are gathered together into a cluster. This processing can be carried out with a small amount of computation: rearranging the samples gathers samples having equal or nearly equal indicators reflecting the magnitude of the samples into a cluster, which improves encoding efficiency and reduces quantization distortion. In addition, because a candidate for the periodic feature value or the interval considered in an earlier frame is taken into account, based on the nature of an audio signal in a period in which it is in a stationary state, the periodic feature value or the interval of the current frame can be determined efficiently.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 is a diagram illustrating an exemplary functional configuration of an embodiment of an encoder;

Fig. 2 is a diagram illustrating a process procedure of an embodiment of an encoding method;

Fig. 3 is a conceptual diagram illustrating an example of rearrangement of the samples included in a sample sequence;

Fig. 4 is a conceptual diagram illustrating an example of rearrangement of the samples included in a sample sequence;

Fig. 5 is a diagram illustrating an exemplary functional configuration of an embodiment of a decoder;

Fig. 6 is a diagram illustrating a process procedure of an embodiment of a decoding method;

Fig. 7 is a diagram illustrating an example of the functions of the process for determining the interval T;

Fig. 8 is a diagram illustrating an example of the process procedure for determining the interval T;

Fig. 9 is a diagram illustrating a modification of the process procedure for determining the interval T; and

Fig. 10 is a diagram illustrating a modification of the embodiment of the encoder.

DETAILED DESCRIPTION OF EMBODIMENTS

Embodiments of the present invention are described with reference to the drawings. The same elements are given the same reference numerals, and repeated description of those elements is omitted.

One feature of the present invention is an improvement in encoding, within a framework that quantizes a frequency-domain sample sequence derived from an audio signal in a given time period, that reduces quantization distortion by rearranging samples on the basis of a feature of the frequency-domain samples and reduces the code amount by using variable-length coding. The given time period is hereinafter referred to as a frame. Encoding can be improved by rearranging the samples in a frame in which the fundamental periodicity, for example, is relatively clear, in accordance with the periodicity, so that samples with large amplitudes are gathered together into a cluster. Examples of frequency-domain sample sequences derived from an audio signal include a DFT coefficient sequence and an MDCT coefficient sequence obtained by transforming a speech/audio digital signal in frames from the time domain into the frequency domain, and coefficient sequences obtained by applying normalization, weighting and quantization to those coefficient sequences. Embodiments of the present invention are described below taking MDCT coefficient sequences as an example.

[Embodiments]

The encoding process

The encoding process is described first with reference to Figs. 1-4. The encoding process of the present invention is performed by the encoder 100 in Fig. 1, which includes a frequency-domain transform unit 1, a weighted envelope normalization unit 2, a normalized gain calculation unit 3, a quantization unit 4, a rearranging unit 5 and an encoding unit 6, or by the encoder 100a in Fig. 10, which includes the frequency-domain transform unit 1, the weighted envelope normalization unit 2, the normalized gain calculation unit 3, the quantization unit 4, the rearranging unit 5, the encoding unit 6, an interval determination unit 7 and an additional information generating unit 8. However, the encoder 100 or 100a does not necessarily need to include the frequency-domain transform unit 1, the weighted envelope normalization unit 2, the normalized gain calculation unit 3 and the quantization unit 4. For example, the encoder 100 may consist of the rearranging unit 5 and the encoding unit 6; the encoder 100a may consist of the rearranging unit 5, the encoding unit 6, the interval determination unit 7 and the additional information generating unit 8. Although in the encoder 100a shown in Fig. 10 the interval determination unit 7 includes the rearranging unit 5, the encoding unit 6 and the additional information generating unit 8, the encoder is not limited to this configuration.

Frequency-domain transform unit 1

First, the frequency-domain transform unit 1 transforms a speech/audio digital signal into an N-point MDCT coefficient sequence in the frequency domain on a frame-by-frame basis (step S1).

Typically, the encoding side quantizes an MDCT coefficient sequence, encodes the quantized MDCT coefficient sequence and transmits the resulting code sequence to the decoding side; the decoding side can reconstruct the quantized MDCT coefficient sequence from the code sequence and can further reconstruct a time-domain speech/audio digital signal by the inverse MDCT transform. The amplitudes of MDCT coefficients have approximately the same amplitude envelope (power spectrum envelope) as the power spectrum of the ordinary DFT. Therefore, information assignment that is proportional to the logarithm of the amplitude envelope can evenly distribute the quantization distortion (quantization error) of the MDCT coefficients over all frequency bands, reduce the overall quantization distortion and compress the information. Note that the power spectrum envelope can be estimated efficiently by using linear prediction coefficients obtained by linear prediction analysis. Methods of controlling the quantization error include a method that adaptively assigns quantization bits to the MDCT coefficients (flattening the amplitudes and then adjusting the quantization step size) and a method that adaptively assigns weights through weighted vector quantization to determine codes. Although one example of a quantization method performed in an embodiment of the present invention is described here, the present invention is not limited to the described quantization method.

Weighted envelope normalization unit 2

The weighted envelope normalization unit 2 normalizes the coefficients of the input MDCT coefficient sequence by using a power spectrum envelope coefficient sequence of the speech/audio digital signal, estimated using linear prediction coefficients obtained by linear prediction analysis of the speech/audio digital signal in each frame, and outputs a weighted normalized MDCT coefficient sequence (step S2). Here, in order to achieve quantization that perceptually minimizes distortion, the weighted envelope normalization unit 2 uses a weighted power spectrum envelope coefficient sequence, obtained by blunting the power spectrum envelope, to normalize the coefficients of the MDCT coefficient sequence on a frame-by-frame basis. As a result, the weighted normalized MDCT coefficient sequence does not have as steep an amplitude slope or as large amplitude variations as the input MDCT coefficient sequence, but it does have variations in magnitude similar to those of the power spectrum envelope coefficient sequence of the speech/audio digital signal; that is, the weighted normalized MDCT coefficient sequence has somewhat larger amplitudes in the region of coefficients corresponding to low frequencies and has a fine structure due to the pitch period.

[Example of the weighted envelope normalization process]

The coefficients W(1), ..., W(N) of the power spectrum envelope coefficient sequence, which correspond to the coefficients X(1), ..., X(N) of the N-point MDCT coefficient sequence, can be obtained by transforming linear prediction coefficients into the frequency domain. For example, according to a p-th order autoregressive process, which is an all-pole model, the time signal x(t) at time t can be expressed by equation (1) using its past values x(t-1), ..., x(t-p) at the p preceding time points, a prediction residual e(t) and linear prediction coefficients α1, ..., αp. Then the coefficients W(n) [1≤n≤N] of the power spectrum envelope coefficient sequence can be expressed by equation (2), where exp(·) is the exponential function with Napier's number as its base, j is the imaginary unit and σ² is the prediction residual energy.

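$$x(t) + \alpha_1 x(t-1) + \cdots + \alpha_p x(t-p) = e(t) \qquad (1)$$

$$W(n) = \frac{\sigma^2}{2\pi} \cdot \frac{1}{\left| 1 + \alpha_1 \exp(-jn) + \alpha_2 \exp(-2jn) + \cdots + \alpha_p \exp(-pjn) \right|^2} \qquad (2)$$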
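Equation (2) can be evaluated numerically as in the following sketch. The angular frequency assigned to each index n is passed in explicitly as `omegas`, because the mapping from the MDCT index to a frequency is not spelled out here and is therefore an assumption of the example. The function name is hypothetical.

```python
import cmath
import math


def power_spectrum_envelope(alpha, sigma2, omegas):
    """Evaluate sigma2/(2*pi) / |1 + sum_k alpha_k * exp(-j*k*omega)|^2 per frequency.

    alpha:  linear prediction coefficients [alpha_1, ..., alpha_p] (real values)
    sigma2: prediction residual energy
    omegas: angular frequencies, one per coefficient index (assumed mapping)
    """
    W = []
    for w in omegas:
        # For real alpha the magnitude is the same for exp(+j...) and exp(-j...).
        a = 1.0 + sum(ak * cmath.exp(-1j * (k + 1) * w) for k, ak in enumerate(alpha))
        W.append(sigma2 / (2.0 * math.pi) / abs(a) ** 2)
    return W


# First-order example: a pole near zero frequency gives a low-pass envelope.
print(power_spectrum_envelope([-0.9], 1.0, [0.0, math.pi]))
```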

The linear prediction coefficients may be obtained by the weighted envelope normalization unit 2 through linear prediction analysis of the speech/audio digital signal input to the frequency-domain transform unit 1, or may be obtained through linear prediction analysis of the speech/audio digital signal by other means, not shown, in the encoder 100 or 100a. In that case, the weighted envelope normalization unit 2 obtains the coefficients W(1), ..., W(N) of the power spectrum envelope coefficient sequence by using the linear prediction coefficients. If the coefficients W(1), ..., W(N) of the power spectrum envelope coefficient sequence have already been obtained by other means (a power spectrum envelope coefficient sequence calculation unit 9) in the encoder 100 or 100a, the weighted envelope normalization unit 2 can use those coefficients. Note that, since the decoder 200, described below, must obtain the same values as those obtained in the encoder 100 or 100a, quantized linear prediction coefficients and/or a quantized power spectrum envelope coefficient sequence are used. Hereinafter, the term "linear prediction coefficient" or "power spectrum envelope coefficient sequence" means a quantized linear prediction coefficient or a quantized power spectrum envelope coefficient sequence unless otherwise stated. The linear prediction coefficients are encoded using a conventional encoding method, and the resulting prediction coefficient codes are then transmitted to the decoding side. The conventional encoding method may be, for example, an encoding method that provides codes corresponding to the linear prediction coefficients themselves as the prediction coefficient codes, an encoding method that converts the linear prediction coefficients into LSP parameters and provides codes corresponding to the LSP parameters as the prediction coefficient codes, or an encoding method that converts the linear prediction coefficients into PARCOR coefficients and provides codes corresponding to the PARCOR coefficients as the prediction coefficient codes. If the power spectrum envelope coefficient sequence is obtained by other means provided in the encoder 100 or 100a, those other means encode the linear prediction coefficients by a conventional encoding method and transmit the prediction coefficient codes to the decoding side.

Although two examples of the weighted envelope normalization process are given below, the present invention is not limited to these examples.

<Example 1>

The weighted envelope normalization unit 2 divides the coefficients X(1), ..., X(N) of the MDCT coefficient sequence by modified values Wγ(1), ..., Wγ(N) of the coefficients of the power spectrum envelope coefficient sequence that correspond to those coefficients, to obtain the coefficients X(1)/Wγ(1), ..., X(N)/Wγ(N) of the weighted normalized MDCT coefficient sequence. The modified values Wγ(n) [1≤n≤N] are given by equation (3), where γ is a positive constant less than or equal to 1 that blunts the power spectrum coefficients.

<Example 2>

The weighted envelope normalization unit 2 divides the coefficients X(1), ..., X(N) of the MDCT coefficient sequence by the values W(1)^β, ..., W(N)^β, which are obtained by raising the corresponding coefficients of the power spectrum envelope coefficient sequence to the power β (0<β<1), to obtain the coefficients X(1)/W(1)^β, ..., X(N)/W(N)^β of the weighted normalized MDCT coefficient sequence.
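Example 2 can be sketched as follows. This is a minimal illustration that assumes the MDCT coefficients and the power spectrum envelope coefficients are available as plain lists; the function names and the default value of β are illustrative and do not come from the source.

```python
def normalize_weighted_envelope(X, W, beta=0.5):
    """Divide each MDCT coefficient X(n) by W(n)**beta, as in Example 2."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must satisfy 0 < beta < 1")
    if len(X) != len(W):
        raise ValueError("X and W must both have length N")
    return [x / (w ** beta) for x, w in zip(X, W)]


def denormalize_weighted_envelope(Xn, W, beta=0.5):
    """Inverse process for the decoding side: multiply back by W(n)**beta."""
    return [x * (w ** beta) for x, w in zip(Xn, W)]
```

The inverse function only recovers the input if the decoding side uses the same W(n) and the same β, which is why the setting has to be shared between the two sides.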

The result is a weighted normalized MDCT coefficient sequence for each frame. Compared with the input MDCT coefficient sequence, the weighted normalized MDCT coefficient sequence has neither a steep amplitude slope nor large amplitude variations, but its magnitudes still vary in a manner similar to the power spectrum envelope coefficient sequence of the input MDCT coefficient sequence; i.e. the weighted normalized MDCT coefficient sequence has somewhat larger amplitudes in the region of coefficients corresponding to low frequencies and has a fine structure due to the pitch period.

It should be noted that, because the inverse of the weighted envelope normalization process, i.e. the process of restoring the MDCT coefficient sequence from the weighted normalized MDCT coefficient sequence, is performed on the decoding side, the setting for calculating the weighted power spectrum envelope coefficient sequence from the power spectrum envelope coefficient sequence must be shared between the encoding and decoding sides.

Normalization gain calculation unit 3

Then the normalization gain calculation unit 3 determines the quantization step size by using the sum of the amplitude values or the energy values over all frequencies, so that the coefficients of the weighted normalized MDCT coefficient sequence in each frame can be quantized with a given total number of bits, and obtains a factor (referred to below in this document as the gain) by which the coefficients of the weighted normalized MDCT coefficient sequence are divided so that the determined quantization step size is achieved (step S3). Information representing the gain is transmitted to the decoding side as gain information. The normalization gain calculation unit 3 normalizes (divides) the coefficients of the weighted normalized MDCT coefficient sequence in each frame by the gain.

Quantization unit 4

Then the quantization unit 4 uses the quantization step size determined in the process at step S3 to quantize, on a frame-by-frame basis, the coefficients of the weighted normalized MDCT coefficient sequence normalized by the gain (step S4).

Reordering unit 5

The quantized MDCT coefficient sequence in each frame obtained through the process at step S4 is input to the reordering unit 5, which is the essential part of this embodiment. The input to the reordering unit 5 is not limited to coefficient sequences obtained through the processes at steps S1-S4. For example, the input may be a coefficient sequence that has not been normalized by the weighted envelope normalization unit 2, or a coefficient sequence that has not been quantized by the quantization unit 4. To make this explicit, the input to the reordering unit 5 is referred to below in this document as the "sample sequence in the frequency domain" or simply as the "sample sequence". In this embodiment, the quantized MDCT coefficient sequence obtained in the process at step S4 is equivalent to the sample sequence in the frequency domain, and the samples that make up the sample sequence in the frequency domain are equivalent to the coefficients of the quantized MDCT coefficient sequence.

The reordering unit 5 reorders, on a frame-by-frame basis, at least some of the samples included in the sample sequence in the frequency domain so that (1) the reordered sequence contains all samples of the sample sequence in the frequency domain, and (2) samples that have equal or nearly equal indicators reflecting sample magnitude are gathered together in a cluster, and outputs the reordered sample sequence (step S5). Examples of "indicators reflecting sample magnitude" include, but are not limited to, the absolute values of the amplitudes of the samples and the power (squared values) of the samples.

[Details of the reordering process]

An example of the reordering process is described below. For example, the reordering unit 5 reorders at least some of the samples included in the sample sequence so that (1) the reordered sequence contains all samples of the sample sequence, and (2) some or all of one or a plurality of consecutive samples in the sample sequence including the sample that corresponds to the periodicity or the fundamental frequency of the audio signal, and one or a plurality of consecutive samples in the sample sequence including a sample that corresponds to an integer multiple of the periodicity or the fundamental frequency of the audio signal, are gathered together in a cluster, and outputs the reordered sample sequence. I.e. at least some of the samples included in the input sample sequence are reordered so that the one or plurality of consecutive samples including the sample corresponding to the periodicity or the fundamental frequency of the audio signal, and the one or plurality of consecutive samples including a sample corresponding to an integer multiple of the periodicity or the fundamental frequency of the audio signal, are gathered together in a cluster.

This is based on a characteristic of audio signals, especially speech and music: the absolute values of the amplitudes and the power of the samples that correspond to the fundamental frequency and to the harmonics (integer multiples of the fundamental frequency), and of the samples near those samples, are greater than the absolute values of the amplitudes and the power of the samples that correspond to frequency bands other than the fundamental frequency and the harmonics. Audio signals also have the characteristic that, since a periodic feature value (e.g. the pitch period) extracted from an audio signal such as speech or music is equivalent to the fundamental frequency, the absolute values of the amplitudes and the power of the samples that correspond to the periodic feature value and its integer multiples, and of the samples near those samples, are greater than those of the samples that correspond to frequency bands other than the periodic feature value and its integer multiples.

The one or plurality of consecutive samples including the sample corresponding to the periodicity or the fundamental frequency of the audio signal, and the one or plurality of consecutive samples including a sample corresponding to an integer multiple of the periodicity or the fundamental frequency of the audio signal, are gathered together in one cluster on the low-frequency side. The interval between the sample corresponding to the periodicity or the fundamental frequency of the audio signal and the sample corresponding to an integer multiple of the periodicity or the fundamental frequency (referred to below in this document simply as the interval) is denoted as T.

In a specific example, the reordering unit 5 selects three samples from the input sample sequence for each n, namely the sample F(nT) corresponding to an integer multiple of the interval T, the sample preceding F(nT) and the sample following F(nT), i.e. F(nT-1), F(nT) and F(nT+1). F(j) is the sample corresponding to the identification number j, which is the index of the sample corresponding to a frequency. Here, n is an integer in the range from 1 to a value such that nT+1 does not exceed a predetermined upper limit N of the samples subject to reordering. n=1 corresponds to the fundamental frequency, and n>1 corresponds to the harmonics. The maximum value of the identification number j is denoted as jmax. The set of samples selected for a given n is referred to as a sample group. The upper limit N may be equal to jmax. However, since the indicators of samples in the high-frequency band of an audio signal such as speech or music are usually quite small, N may be less than jmax so that samples with large indicators are gathered together in a cluster on the low-frequency side to improve coding efficiency, as described below. For example, N may be about half of jmax. Let nmax denote the maximum value of n determined from the upper limit N; then, of the samples in the input sample sequence, the samples corresponding to the frequencies from the lowest frequency to a first predetermined frequency nmax*T+1 are subject to reordering. Here, the symbol * represents multiplication.

The reordering unit 5 places the selected samples F(j) in order from the beginning of the sample sequence, while maintaining the original order of the identification numbers j, to generate sample sequence A. For example, if n is an integer in the range from 1 to 5, the reordering unit 5 places the first sample group F(T-1), F(T) and F(T+1), the second sample group F(2T-1), F(2T) and F(2T+1), the third sample group F(3T-1), F(3T) and F(3T+1), the fourth sample group F(4T-1), F(4T) and F(4T+1) and the fifth sample group F(5T-1), F(5T) and F(5T+1) in order from the beginning of the sample sequence. I.e. the 15 samples F(T-1), F(T), F(T+1), F(2T-1), F(2T), F(2T+1), F(3T-1), F(3T), F(3T+1), F(4T-1), F(4T), F(4T+1), F(5T-1), F(5T) and F(5T+1) are placed in this order from the beginning of the sample sequence, and these 15 samples constitute sample sequence A.

The reordering unit 5 additionally places the samples F(j) that have not been selected in order after the end of sample sequence A, while maintaining the original order of the identification numbers j. The samples F(j) that were not selected are those located between the sample groups that make up sample sequence A. A cluster of such consecutive samples is referred to as a sample set. I.e. in the above example, the first sample set F(1), ..., F(T-2), the second sample set F(T+2), ..., F(2T-2), the third sample set F(2T+2), ..., F(3T-2), the fourth sample set F(3T+2), ..., F(4T-2), the fifth sample set F(4T+2), ..., F(5T-2) and the sixth sample set F(5T+2), ..., F(jmax) are placed in order after the end of sample sequence A, and these samples constitute sample sequence B.

In short, the input sample sequence F(j) (1≤j≤jmax) in this example is reordered as F(T-1), F(T), F(T+1), F(2T-1), F(2T), F(2T+1), F(3T-1), F(3T), F(3T+1), F(4T-1), F(4T), F(4T+1), F(5T-1), F(5T), F(5T+1), F(1), ..., F(T-2), F(T+2), ..., F(2T-2), F(2T+2), ..., F(3T-2), F(3T+2), ..., F(4T-2), F(4T+2), ..., F(5T-2), F(5T+2), ..., F(jmax) (see Fig. 3).
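The reordering rule of this example can be sketched as follows. This is a minimal illustration that assumes an integer interval T and three samples per sample group; the function names are illustrative, and the check that skips overlapping or out-of-range indices is an added safeguard for small T rather than something stated in the source.

```python
def reorder_indices(jmax, T, n_max):
    """Return the 1-based identification numbers j in reordered order.

    Sample sequence A: the groups (nT-1, nT, nT+1) for n = 1..n_max.
    Sample sequence B: all remaining indices, in their original order.
    """
    seq_a, seen = [], set()
    for n in range(1, n_max + 1):
        for j in (n * T - 1, n * T, n * T + 1):
            # Skip out-of-range indices and overlaps between groups.
            if 1 <= j <= jmax and j not in seen:
                seq_a.append(j)
                seen.add(j)
    seq_b = [j for j in range(1, jmax + 1) if j not in seen]
    return seq_a + seq_b


def reorder(samples, T, n_max):
    """Apply the permutation to a list in which samples[0] holds F(1)."""
    return [samples[j - 1] for j in reorder_indices(len(samples), T, n_max)]
```

For jmax=12, T=4 and two sample groups, `reorder_indices(12, 4, 2)` gives 3, 4, 5, 7, 8, 9 followed by 1, 2, 6, 10, 11, 12, i.e. the sample groups first and the sample sets afterwards. Because the result is a permutation of all indices, the decoding side can invert it once it knows T.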

It should be noted that, in the low-frequency band, samples other than those corresponding to the periodicity or the fundamental frequency of the audio signal and to its integer multiples often also have large amplitudes and power values. Therefore, the samples in the range from the lowest frequency to a predetermined frequency f may be excluded from reordering. For example, if the predetermined frequency f is nT+α, the original samples F(1), ..., F(nT+α) are not reordered, while the original samples F(nT+α+1) and onward are reordered, where α is set in advance to an integer that is greater than or equal to 0 and somewhat less than T (for example, an integer less than T/2). Here, n may be an integer greater than or equal to 2. Alternatively, the P consecutive original samples F(1), ..., F(P), starting from the sample corresponding to the lowest frequency, may be excluded from reordering, and the original samples F(P+1) and onward may be reordered. In this case, the predetermined frequency f is P. The set of samples subject to reordering is reordered in accordance with the rule described above. It should be noted that, if the first predetermined frequency has been set, the predetermined frequency f (the second predetermined frequency) is less than the first predetermined frequency.

If, for example, the original samples F(1), ..., F(T+1) are not reordered and the original samples F(T+2) and onward are subject to reordering, the input sample sequence F(j) (1≤j≤jmax) is reordered as F(1), ..., F(T+1), F(2T-1), F(2T), F(2T+1), F(3T-1), F(3T), F(3T+1), F(4T-1), F(4T), F(4T+1), F(5T-1), F(5T), F(5T+1), F(T+2), ..., F(2T-2), F(2T+2), ..., F(3T-2), F(3T+2), ..., F(4T-2), F(4T+2), ..., F(5T-2), F(5T+2), ..., F(jmax) in accordance with the reordering rule described above (see Fig. 4). It should be noted that, although all samples in the sample sequence in the frequency domain are depicted in Fig. 3 and 4 as having values greater than or equal to 0, they are depicted in this way only to show that samples with large amplitudes appear on the low-frequency side as a result of the reordering. The samples included in the sample sequence in the frequency domain can take positive, negative or zero values; the reordering described above, or the reordering described below, can be performed in any of these cases.
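The variant in which the lowest samples are excluded from reordering can be sketched in the same way. This is a minimal illustration that assumes an integer T; the parameter `keep` stands for the second predetermined frequency f (for example T+1 in the case of Fig. 4), and the function name is illustrative.

```python
def reorder_indices_keep_low(jmax, T, n_max, keep):
    """Indices 1..keep stay in place; the rest follow the grouping rule."""
    head = list(range(1, min(keep, jmax) + 1))
    seen = set(head)
    seq_a = []
    for n in range(1, n_max + 1):
        for j in (n * T - 1, n * T, n * T + 1):
            # Samples already kept in the low band are not moved again.
            if 1 <= j <= jmax and j not in seen:
                seq_a.append(j)
                seen.add(j)
    seq_b = [j for j in range(1, jmax + 1) if j not in seen]
    return head + seq_a + seq_b
```

With jmax=12, T=4, two sample groups and keep=T+1=5, the result is 1, ..., 5, then 7, 8, 9, then 6, 10, 11, 12, which mirrors the order F(1), ..., F(T+1), F(2T-1), F(2T), F(2T+1), F(T+2), ..., F(2T-2), F(2T+2), ... given in the text.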

A different upper limit N or a different first predetermined frequency, which determines the maximum value of the identification numbers j subject to reordering, may be set for each frame instead of an upper limit N or a first predetermined frequency common to all frames. In this case, information specifying the upper limit N or the first predetermined frequency for each frame may be transmitted to the decoding side. In addition, the number of sample groups subject to reordering may be set instead of the maximum value of the identification numbers j subject to reordering. In this case, the number of sample groups may be set for each frame, and information specifying the number of sample groups may be transmitted to the decoding side. Of course, the number of sample groups subject to reordering may be common to all frames. A different second predetermined frequency f may be set for each frame instead of a second predetermined frequency common to all frames. In this case, information specifying the second predetermined frequency for each frame may be transmitted to the decoding side.

The envelope of the indicators of the samples in the sample sequence reordered in this way decreases with increasing frequency when frequency and the sample indicators are plotted on the abscissa and the ordinate, respectively. The reason is that sample sequences of audio signals in the frequency domain, especially those of speech and music signals, generally contain few high-frequency components. In other words, the reordering unit 5 reorders at least some of the samples contained in the input sample sequence so that the envelope of the indicators of the samples decreases with increasing frequency.

Although the reordering in this embodiment gathers the one or plurality of consecutive samples including the sample corresponding to the periodicity or the fundamental frequency, and the one or plurality of consecutive samples including a sample corresponding to an integer multiple of the periodicity or the fundamental frequency, together in one cluster on the low-frequency side, reordering may instead be performed that gathers those samples together in one cluster on the high-frequency side. In this case, the sample groups in sample sequence A are placed in reverse order, the sample sets in sample sequence B are placed in reverse order, sample sequence B is placed on the low-frequency side, and sample sequence A follows sample sequence B. I.e., the samples in the example described above are rearranged in the following order from the low-frequency side: the sixth sample set F(5T+2), ..., F(jmax), the fifth sample set F(4T+2), ..., F(5T-2), the fourth sample set F(3T+2), ..., F(4T-2), the third sample set F(2T+2), ..., F(3T-2), the second sample set F(T+2), ..., F(2T-2), the first sample set F(1), ..., F(T-2), the fifth sample group F(5T-1), F(5T), F(5T+1), the fourth sample group F(4T-1), F(4T), F(4T+1), the third sample group F(3T-1), F(3T), F(3T+1), the second sample group F(2T-1), F(2T), F(2T+1) and the first sample group F(T-1), F(T), F(T+1). The envelope of the indicators of the samples in the sample sequence reordered in this way increases with increasing frequency when frequency and the sample indicators are plotted on the abscissa and the ordinate, respectively.
In other words, the reordering unit 5 reorders at least some of the samples included in the input sample sequence so that the envelope of the indicators of the samples increases with increasing frequency.

The interval T may be a fractional value (for example, 5.0, 5.25, 5.5 or 5.75) instead of an integer. In this case, F(R(nT-1)), F(R(nT)) and F(R(nT+1)) are selected, where R(nT) represents the value of nT rounded to an integer.
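The selection of sample groups for a fractional interval can be sketched as follows. The text only says that R(·) rounds to an integer, so rounding halves upward is an assumption made here for illustration, as are the function names.

```python
import math


def round_half_up(x):
    """R(.): round to the nearest integer, halves rounded up (an assumption)."""
    return math.floor(x + 0.5)


def sample_groups(T, n_max):
    """Indices (R(nT-1), R(nT), R(nT+1)) for n = 1..n_max; T may be fractional."""
    return [
        (round_half_up(n * T - 1), round_half_up(n * T), round_half_up(n * T + 1))
        for n in range(1, n_max + 1)
    ]
```

For T=5.25 the first three groups are (4, 5, 6), (10, 11, 12) and (15, 16, 17): the group centres follow nT = 5.25, 10.5, 15.75 more closely than an integer interval of 5 would.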

Encoding unit 6

The encoding unit 6 encodes the reordered input sample sequence and outputs the resulting code sequence (step S6). The encoding unit 6 switches the variable-length coding according to the localization of the amplitudes of the samples in the reordered sample sequence and encodes the sample sequence. I.e., since samples having large amplitudes are gathered together in a cluster on the low-frequency (or high-frequency) side of the frame by the reordering, the encoding unit 6 performs variable-length coding suited to that localization. If samples with equal or nearly equal amplitudes are gathered together in a cluster in each local region, as in the reordered sample sequence, the average code amount can be reduced, for example by Rice coding using different Rice parameters for different regions. An example is described below in which samples having large amplitudes are gathered together in a cluster on the low-frequency side of the frame (the side closer to the beginning of the frame).

[Coding example]

The encoding unit 6 applies Rice coding (also called Golomb-Rice coding) to each sample in the region where samples with indicators corresponding to large amplitudes are gathered together in a cluster.

In regions other than this region, the encoding unit 6 applies entropy coding (such as Huffman coding or arithmetic coding) to a plurality of samples as a unit. For Rice coding, the Rice parameter and the region to which Rice coding is applied may be fixed, or a plurality of different combinations of the region to which Rice coding is applied and the Rice parameter may be provided so that one combination can be selected from among them. When one of the combinations is selected, the following variable-length prefix codes (binary values enclosed in quotation marks " "), for example, can be used as selection information indicating the choice for Rice coding, and the encoding unit 6 outputs a code sequence that includes the selection information.

"1": Rice coding is not applied.

"01": Rice coding is applied to the first 1/32 region of the sequence with Rice parameter 1.

"001": Rice coding is applied to the first 1/32 region of the sequence with Rice parameter 2.

"0001": Rice coding is applied to the first 1/16 region of the sequence with Rice parameter 1.

"00001": Rice coding is applied to the first 1/16 region of the sequence with Rice parameter 2.

"00000": Rice coding is applied to the first 1/32 region of the sequence with Rice parameter 3.

One method of choosing among these alternatives is to compare the code amounts of the code sequences obtained by encoding under each Rice coding alternative and to select the alternative with the smallest code amount.
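The comparison of code amounts can be sketched as follows. This is a minimal illustration under stated assumptions: the samples are quantized integers, signed values are mapped to non-negative integers by a zigzag mapping (the source does not say how signs are handled), and `other_bits` is a hypothetical caller-supplied function that returns the code amount of the entropy coding used outside the Rice-coded region.

```python
# (selection code, denominator of the Rice-coded region, Rice parameter)
ALTERNATIVES = [
    ("1", None, None),   # Rice coding is not applied
    ("01", 32, 1),
    ("001", 32, 2),
    ("0001", 16, 1),
    ("00001", 16, 2),
    ("00000", 32, 3),
]


def rice_bits(value, r):
    """Length in bits of the Rice code of one sample with parameter r."""
    u = 2 * value if value >= 0 else -2 * value - 1  # zigzag map (assumption)
    return (u >> r) + 1 + r  # unary quotient, stop bit, r remainder bits


def choose_alternative(samples, other_bits):
    """Return (selection code, total bits) of the cheapest alternative."""
    best_prefix, best_total = None, None
    for prefix, denom, r in ALTERNATIVES:
        split = 0 if denom is None else len(samples) // denom
        total = len(prefix)
        total += sum(rice_bits(v, r) for v in samples[:split])
        total += other_bits(samples[split:])
        if best_total is None or total < best_total:
            best_prefix, best_total = prefix, total
    return best_prefix, best_total
```

The length of the selection code itself is counted in the total, since it is part of the code sequence sent to the decoding side.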

When a region in which samples with amplitude 0 occur in a long run appears in the reordered sample sequence, the average code amount can be reduced by run-length coding, for example by coding the number of consecutive samples that have amplitude 0. In this case, the encoding unit 6 (1) applies Rice coding to each sample in the region where samples having indicators corresponding to large amplitudes are gathered together in a cluster, and (2) in regions other than this region, (a) applies coding that outputs codes representing the number of consecutive samples having amplitude 0 in regions where samples with amplitude 0 appear in succession, and (b) applies entropy coding (such as Huffman coding or arithmetic coding) to a plurality of samples as a unit in the remaining regions. Again, a choice may be made among the alternatives for Rice coding described above. In this case, information indicating the region to which run-length coding was applied must be sent to the decoding side. This information may be included, for example, in the code sequence. Additionally, if a plurality of types of entropy coding methods are provided as alternatives, information identifying which type of coding was selected must be sent to the decoding side. This information may be included, for example, in the code sequence.

[Methods for determining the interval T]

A method for determining the interval T is described below. In a simple example, Z candidates T_1, T_2, ..., T_Z for the interval T are provided in advance, the reordering unit 5 reorders the samples included in the sample sequence by using each of the candidates T_i (i=1, 2, ..., Z), and the encoding unit 6 obtains the code amount of the code sequence corresponding to the sample sequence reordered on the basis of each candidate T_i and selects the candidate T_i that yields the smallest code amount as the interval T. The encoding unit 6 outputs additional information that identifies the reordering of the samples included in the sample sequence, for example a code obtained by encoding the interval T.
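This exhaustive search can be sketched as follows. The sketch is deliberately generic: `reorder` and `code_amount` are hypothetical caller-supplied functions standing in for the reordering unit 5 and for the code amount measured by the encoding unit 6, and the function name is illustrative.

```python
def choose_interval(samples, candidates, reorder, code_amount):
    """Return (T, bits) for the candidate whose reordered sequence codes smallest.

    reorder(samples, T) -> reordered sample sequence
    code_amount(sequence) -> code amount (in bits) of its code sequence
    """
    best_T, best_bits = None, None
    for T in candidates:
        bits = code_amount(reorder(samples, T))
        if best_bits is None or bits < best_bits:
            best_T, best_bits = T, bits
    return best_T, best_bits
```

Every candidate costs one full reordering and one full code amount calculation, which is why the amount of computation grows in proportion to Z.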

To determine a suitable interval T, it is desirable that Z be sufficiently large. However, if Z is large, calculating the actual code amounts for all candidates requires a considerably larger amount of computation, which may be problematic from the point of view of efficiency. To reduce the amount of computation, a preliminary selection process can therefore be applied to the Z candidates to reduce the number of candidates to Y. The preliminary selection process here is a process that selects candidates for the final selection process either by approximately calculating the code amount (calculating an estimated code amount) of the code sequence corresponding to the reordered sample sequence (or, depending on the conditions, the original sample sequence that has not been reordered) obtained on the basis of each candidate, or by obtaining an indicator that reflects the code amount of the code sequence or an indicator that relates to the code amount of the code sequence (in this case, the indicator is different from the "code amount" itself). The final selection process selects the interval T on the basis of the actual code amount of the code sequence corresponding to the sample sequence. Although there are various types of preliminary selection processes, the code amount of the code sequence corresponding to the sample sequence is actually calculated for each of the Y candidates obtained by any preliminary selection process, and the candidate T_j that yields the smallest code amount is selected as the interval T (T_j ∈ S_Y, where S_Y is the set of the Y candidates). Y must satisfy at least Y<Z. To reduce the amount of computation significantly, Y is preferably set to a value substantially smaller than Z, for example so that Y≤Z/2. Typically, the process of calculating the code amount requires a very large amount of computation. Let A denote this amount of computation per candidate.
The amount of computation required to calculate the code amounts for all Z candidates is then ZA. Assuming that the preliminary selection process requires about 1/10 of that amount per candidate, i.e. A/10, the amount of computation required to apply the preliminary selection process to all Z candidates and then to calculate the code amounts for the Y candidates selected by the preliminary selection process is (ZA/10+YA). Since ZA/10+YA<ZA is equivalent to Y<9Z/10, it is clear that, if Y<9Z/10, the method using the preliminary selection process requires less computation to determine the interval T.

The present invention also provides a method of determining the interval T with a smaller amount of computation. Before an embodiment of the method is described, the principle of determining the interval T with a small amount of computation is described.

The periodic feature of an audio signal such as speech or music generally changes gradually over multiple frames in a segment where the audio signal is in a stationary state. Therefore, by taking into account the interval T_{t-1} determined in the frame X_{t-1} immediately preceding a given frame X_t, the interval T_t in the frame X_t can be determined efficiently. However, the interval T_{t-1} determined in the frame X_{t-1} is not necessarily a suitable interval T_t for the frame X_t. Therefore, it is preferable that the candidates for the interval T used to determine the interval T_{t-1} in the frame X_{t-1} be included among the candidates for the interval T for determining the interval T_t in the frame X_t, instead of considering only the interval T_{t-1} determined in the frame X_{t-1}.

On the other hand, in a signal segment spanning multiple frames where the audio signal is in a non-stationary state, continuity of the periodic feature value of the audio signal across neighboring frames can hardly be expected. Therefore, unless a determination as to whether or not a signal segment is a stationary segment is made by other means, not shown, the strategy of "finding the interval T_t in the frame X_t from among the candidates for the interval T used to determine the interval T_{t-1} in the frame X_{t-1}" does not necessarily yield a preferable result. I.e., in such a situation, it is desirable that the interval T_t can be found from among candidates for the interval T in the frame X_t that do not depend on the candidates for the interval T used to determine the interval T_{t-1} in the frame X_{t-1}.

An embodiment based on this principle of the invention is described in detail below (see Fig. 7 and 8). In this embodiment, the interval determination unit 7 is provided in the encoder 100a, as shown in Fig. 10, and the reordering unit 5, the encoding unit 6 and the additional information generating unit 8 are provided in the interval determination unit 7.

(A) Preliminary selection process (step S71)

The candidates for the interval T that can be represented by the additional information identifying the reordering of the samples in the sample sequence are determined in advance in association with the method of encoding the additional information, which is described below, such as fixed-length coding or variable-length coding. The interval determination unit 7 stores Z_1 candidates selected in advance from the Z predetermined different candidates T_1, T_2, ..., T_Z for the interval T (Z_1<Z). The purpose of this is to reduce the number of candidates subjected to the preliminary selection process. It is desirable that the candidates subjected to the preliminary selection process include as many as possible of the intervals among T_1, T_2, ..., T_Z that are preferable as the interval T for the frame. In reality, however, which intervals are preferable is unknown before the preliminary selection process. Therefore, the Z_1 candidates are selected from the Z candidates T_1, T_2, ..., T_Z at equal intervals, for example, as the candidates subjected to the preliminary selection process. For example, the Z_1 candidates subjected to the preliminary selection process can be selected from the Z candidates T_1, T_2, ..., T_Z in accordance with the rule "select the candidates in odd positions among the Z candidates T_1, T_2, ..., T_Z as the candidates subjected to the preliminary selection process" (in which case Z_1=ceil(Z/2), where ceil(·) is the ceiling function, i.e. rounding up to the nearest integer). The set of Z candidates is denoted as S_Z (S_Z={T_1, T_2, ..., T_Z}), and the set of Z_1 candidates is denoted as S_Z1.

The interval determination unit 7 performs the selection process described above on the Z_1 candidates subjected to the preliminary selection process. The number of candidates remaining after this selection is denoted as Z_2. There are various types of preliminary selection processes, as stated above. One possible method is based on an indicator relating to the code amount of the code sequence corresponding to the reordered sample sequence: Z_2 candidates are selected on the basis of the degree of concentration of the sample indicators in the low-frequency region, or on the basis of the number of consecutive samples that have amplitude zero along the frequency axis from the highest frequency toward the low-frequency side.

Specifically, if the value of Z_2 is not preset, the following process is performed as the preliminary selection. The interval determination unit 7 reorders the sample sequence as described above on the basis of each of the Z_1 candidates, calculates the sum of the absolute values of the amplitudes of the samples included in, for example, the first 1/4 region from the low-frequency side of the reordered sample sequence as an indicator relating to the code amount of the code sequence corresponding to the sample sequence, and selects the candidate if the sum is greater than a predetermined threshold. Alternatively, the interval determination unit 7 reorders the sample sequence as described above on the basis of each candidate, obtains the number of consecutive samples that have amplitude zero in the reordered sample sequence, counted from the highest frequency toward the low-frequency side, as an indicator relating to the code amount of the code sequence corresponding to the sample sequence, and selects the candidate if the number of consecutive samples is greater than a predetermined threshold. The reordering is performed by the reordering unit 5. In this case, the number of candidates selected is Z_2, and the value of Z_2 can vary from frame to frame.

If the value of Z_2 is preset, the following process is performed as the preliminary selection. The interval determination unit 7 reorders the sample sequence as described above on the basis of each of the Z_1 candidates, calculates the sum of the absolute values of the amplitudes of the samples included in, for example, the first 1/4 region from the low-frequency side of the reordered sample sequence as an indicator relating to the code amount of the code sequence corresponding to the sample sequence, and selects the Z_2 candidates that yield the Z_2 largest sums. Alternatively, the interval determination unit 7 reorders the sample sequence as described above on the basis of each of the Z_1 candidates, obtains the number of consecutive samples having amplitude zero in the reordered sample sequence, counted from the highest frequency toward the low-frequency side, as an indicator relating to the code amount of the code sequence corresponding to the sample sequence, and selects the Z_2 candidates that yield the Z_2 largest numbers of consecutive samples. The reordering of the sample sequence is performed by the reordering unit 5. In this case, Z_2 is the same in every frame. Of course, at least the relation Z>Z_1>Z_2 holds. The set of Z_2 candidates is denoted as S_Z2.
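The preliminary selection based on the low-frequency sum can be sketched as follows, covering both the preset-Z_2 case and the threshold case. This is a minimal illustration: `reorder` is a hypothetical caller-supplied function standing in for the reordering unit 5, the candidate lists are assumed to be sorted in ascending order, and all function names are illustrative.

```python
def odd_positions(S_Z):
    """S_Z1: the candidates T_1, T_3, T_5, ... of S_Z, so Z_1 = ceil(Z/2)."""
    return S_Z[0::2]


def low_band_sum(reordered):
    """Indicator: sum of |amplitude| over the first 1/4 of the sequence."""
    quarter = len(reordered) // 4
    return sum(abs(v) for v in reordered[:quarter])


def preselect_fixed(samples, S_Z1, reorder, Z2):
    """Z_2 preset: keep the Z_2 candidates with the largest indicator."""
    ranked = sorted(
        S_Z1, key=lambda T: low_band_sum(reorder(samples, T)), reverse=True
    )
    return sorted(ranked[:Z2])


def preselect_threshold(samples, S_Z1, reorder, threshold):
    """Z_2 not preset: keep every candidate whose indicator exceeds the threshold."""
    return [T for T in S_Z1 if low_band_sum(reorder(samples, T)) > threshold]
```

The indicator needs only additions over a quarter of the frame, so it is much cheaper than producing an actual code sequence for each candidate.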

(B) Adding process (step S72)

Then the interval determination unit 7 performs a process of adding one or more candidates to the set SZ2 of candidates obtained by the preselection process in (A). The purpose of this adding process is to prevent the value of Z2 from becoming too small to find the interval T in the final selection described above when the value of Z2 can vary from frame to frame, or to increase as far as possible the probability of selecting a suitable interval T in the final selection even if Z2 is relatively large. Since the purpose of the method of determining the interval T in the present invention is to reduce the amount of computation compared with conventional methods, the number Q of added candidates must satisfy the inequality Z2+Q<Z, where the number |SZ2| of elements (candidates) of the set SZ2 is |SZ2|=Z2. A more preferable condition is that Q satisfies the inequality Z2+Q<Z1. The added candidates may be, for example, the candidates T_{k-1} and T_{k+1} preceding and succeeding a candidate T_k included in the set SZ2, where T_{k-1}, T_{k+1} ∈ SZ (here, the candidates "preceding and succeeding" the candidate T_k are the candidates preceding and succeeding T_k in the order T_1<T_2<...<T_Z, based on the magnitude of the value, introduced in the set SZ={T_1, T_2, ..., T_Z}). The reason is that the candidates T_{k-1} and T_{k+1} may not be included in the Z1 candidates subjected to the preselection process. However, if the candidates T_{k-1}, T_{k+1} ∈ SZ1 and the candidates T_{k-1} and T_{k+1} are not included in the set SZ2, the candidates T_{k-1} and T_{k+1} do not necessarily need to be added. It is only necessary to select the candidates to be added from the set SZ. For example, for a candidate T_k included in the set SZ2, T_k-α (where T_k-α ∈ SZ) and/or T_k+β (where T_k+β ∈ SZ) may be added as a new candidate. In this case, α and β are predetermined positive real numbers and, for example, α may be equal to β. If T_k-α and/or T_k+β coincides with another candidate included in the set SZ2, T_k-α and/or T_k+β is not added (because there is no point in adding it).
The set of the Z2+Q candidates is denoted by SZ3. Then the process in (D1) or (D2) is performed.
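The neighbour-adding variant of process (B) can be sketched as below, under the assumption that candidates are plain numbers and that "preceding and succeeding" refers to positions in the sorted set SZ. The function name `add_neighbours` and the sample values are hypothetical.

```python
from typing import Iterable, List


def add_neighbours(sz: Iterable[int], sz2: Iterable[int]) -> List[int]:
    """Build SZ3 by adding, for each T_k in SZ2, its neighbours T_{k-1} and T_{k+1} in SZ.

    Neighbours already present are not duplicated, so Q is the number of
    genuinely new candidates.
    """
    ordered = sorted(sz)                      # T_1 < T_2 < ... < T_Z
    position = {t: i for i, t in enumerate(ordered)}
    sz2 = list(sz2)
    sz3 = set(sz2)
    for t in sz2:
        i = position[t]
        for j in (i - 1, i + 1):
            if 0 <= j < len(ordered):         # T_1 and T_Z have only one neighbour
                sz3.add(ordered[j])
    return sorted(sz3)


sz = list(range(10, 30, 2))                   # Z = 10 illustrative candidates
sz3 = add_neighbours(sz, [12, 20])            # Z2 = 2
q = len(sz3) - 2                              # number of added candidates
```

With these illustrative values, Z2+Q=6 stays below Z=10, in line with the condition Z2+Q<Z stated above.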

(D) Preliminary selection process (step S73)

(D1 - step S731) If the frame for which the interval T is to be determined is the temporally first frame, the interval determination unit 7 performs the preliminary selection process described above on the Z2+Q candidates included in the set SZ3. The number of candidates remaining after the preliminary selection process is denoted by Y, which satisfies Y<Z2+Q.

There are various types of preliminary selection processes, as described earlier. For example, the same process as the preselection process in (A) can be performed (the number of output candidates is different, i.e., Y≠Z2). It should be noted that in this case the value of Y can vary from frame to frame. In a preliminary selection process different from the preselection process in (A) described above, the reordering described above is performed on the sequence of samples on the basis of each of the Z2+Q candidates included in the set SZ3, for example, and a predetermined approximation equation for approximating the code amount of the code sequence obtained by encoding the reordered sequence of samples is used to obtain an approximate code amount (estimated code amount). The reordering of the sequence of samples is performed by the reordering unit 5. For candidates for which a reordered sequence of samples was already obtained in the preselection process in (A), the reordered sequence of samples obtained in the preselection process in (A) can be reused. In this case, if the value of Y is not preset, the candidates that yield approximate code amounts less than or equal to a predetermined threshold can be selected as the candidates to be subjected to the final selection process in (E), which is described below (in this case, the number of selected candidates is Y); if the value of Y is preset, the Y candidates that yield the smallest approximate code amounts can be selected as the candidates to be subjected to the final selection process in (E), which is described below. The Y candidates are stored in memory and are used in the process in (C) or (D2), which is described below, for determining the interval T in the temporally second frame. After the process in (D1), the final selection process in (E) is performed.

If the same process as the preselection process in (A) is performed as the preliminary selection process in (D1), and candidates are selected by comparing a threshold with the indicator relating to the code amount of the code sequence obtained by encoding the reordered sequence of samples, the candidates selected in the preselection process in (A) are always selected in the preliminary selection process in (D1). Therefore, the comparison of the indicator with the threshold needs to be performed only for the candidates added in the adding process in (B), and the candidates selected in this comparison together with the candidates selected in the preselection process in (A) are subjected to the final selection process in (E). However, it is preferable that the value of Y be fixed at a preset value in the preliminary selection process in (D1) and that the Y candidates that yield the smallest approximate code amounts be selected as the candidates to be subjected to the final selection process in (E), because the amount of computation of the final selection process in (E) is large.

(D2 - step S732) If the frame for which the interval T is to be determined is not the temporally first frame, the interval determination unit 7 performs the preliminary selection process described above on at most Z2+Q+Y+W candidates included in the union SZ3∪SP (where |SP|=Y+W). The union SZ3∪SP is described here. The frame for which the interval T is to be determined is denoted by X_t, and the frame temporally immediately preceding the frame X_t is denoted by X_{t-1}. The set SZ3 is the set of candidates in the frame X_t obtained in the processes (A) to (B) described above, and the number of candidates included in the set SZ3 is Z2+Q. The set SP is the union of the set SY of candidates selected as the candidates subjected to the final selection process in (E), which is described below, when the interval T was determined in the frame X_{t-1}, and the set SW of candidates added to the set SY by the adding process in (C), which is described below. The set SY has been stored in memory. In this case, |SY|=Y and |SW|=W, and at least |SZ3∪SP|<Z must be satisfied. The preliminary selection process described above is performed on at most Z2+Q+Y+W candidates included in the union SZ3∪SP. The number of candidates remaining after the preliminary selection process is Y, and Y satisfies Y<|SZ3∪SP|≤Z2+Q+Y+W. There are various types of preliminary selection processes, as described earlier. For example, the same process as the preselection process in (A) described above can be performed (the number of output candidates is different, i.e., Y≠Z2). It should be noted that in this case the value of Y can vary from frame to frame.
In a preliminary selection process different from the preselection process in (A) described above, the reordering described above is performed on the sequence of samples on the basis of each of the |SZ3∪SP| candidates, for example, and a predetermined approximation equation for approximating the code amount of the code sequence obtained by encoding the reordered sequence of samples is used to obtain an approximate code amount (estimated code amount). The reordering of the sequence of samples is performed by the reordering unit 5. For candidates for which a reordered sequence of samples was already obtained in the preselection process in (A), the reordered sequence of samples obtained in the preselection process in (A) can be reused. In this case, if the value of Y is not preset, the candidates that yield approximate code amounts less than or equal to a predetermined threshold can be selected as the candidates to be subjected to the final selection process in (E), which is described below (in this case, the number of selected candidates is Y); if the value of Y is preset, the Y candidates that yield the smallest approximate code amounts can be selected as the candidates to be subjected to the final selection process in (E), which is described below. The Y candidates are stored in memory and are used in the process in (D2) performed when the interval T is determined in the temporally next frame. After the process in (D2), the final selection process in (E) is performed.

If the same process as the preselection process in (A) is performed as the preliminary selection process in (D2), and candidates are selected by comparing a threshold with the indicator relating to the code amount of the code sequence obtained by encoding the reordered sequence of samples, the candidates selected in the preselection process in (A) are always selected in the preliminary selection process in (D2). Therefore, the comparison of the indicator with the threshold needs to be performed only for the candidates added in the adding process in (B), the candidates subjected to the final selection process in (E), which is described below, when the interval T was determined in the frame X_{t-1}, and the candidates added in the adding process in (C); the candidates selected in this comparison together with the candidates selected in the preselection process in (A) are subjected to the final selection process in (E). However, it is preferable that the value of Y be fixed at a preset value in the preliminary selection process in (D2) and that the Y candidates that yield the smallest approximate code amounts be selected as the candidates to be subjected to the final selection process in (E), because the amount of computation of the final selection process in (E) is large.
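The preliminary selection in (D2) with a preset Y can be sketched as follows. The description does not give the approximation equation, so `approx_code_amount` is a hypothetical callable supplied by the caller, and the cost function used in the example is purely illustrative.

```python
from typing import Callable, Iterable, List


def preliminary_select(
    sz3: Iterable[int],
    sp: Iterable[int],
    approx_code_amount: Callable[[int], float],  # hypothetical estimator
    y: int,
) -> List[int]:
    """Keep the y candidates of SZ3 ∪ SP with the smallest estimated code amount."""
    pool = sorted(set(sz3) | set(sp))            # duplicates across the two sets collapse
    return sorted(pool, key=approx_code_amount)[:y]


# Illustrative only: pretend that the estimate is smallest near T = 13.
survivors = preliminary_select([10, 12, 14], [12, 13], lambda t: abs(t - 13), y=2)
```

Because the union removes duplicates, the pool may hold fewer than Z2+Q+Y+W candidates, which matches the "at most" wording above.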

(C) Adding process (step S74)

The interval determination unit 7 performs a process of adding one or more candidates to the set SY of candidates subjected to the final selection process in (E), which is described below, when the interval T was determined in the frame X_{t-1}. The candidates added to the set SY may be, for example, the candidates T_{m-1} and T_{m+1} preceding and succeeding a candidate T_m included in the set SY, where T_{m-1}, T_{m+1} ∈ SZ (in this case, the candidates "preceding and succeeding" the candidate T_m are the candidates preceding and succeeding T_m in the order T_1<T_2<...<T_Z, based on the magnitude of the value, introduced in the set SZ={T_1, T_2, ..., T_Z}). It is only necessary to select the candidates to be added from the set SZ. For example, for a candidate T_m included in the set SY, T_m-γ (where T_m-γ ∈ SZ) and/or T_m+η (where T_m+η ∈ SZ) may be added as new candidates. In this case, γ and η are predetermined positive real numbers and, for example, γ may be equal to η. If T_m-γ and/or T_m+η coincides with another candidate included in the set SY, T_m-γ and/or T_m+η is not added (because there is no point in adding it). Then the process in (D2) is performed.

(E) Final selection process (step S75)

The interval determination unit 7 reorders the sequence of samples on the basis of each of the Y candidates, as described above, encodes the reordered sequence of samples to obtain a code sequence, obtains the actual code amount of the code sequence, and selects the candidate that yields the smallest code amount as the interval T. The reordering is performed by the reordering unit 5, and the encoding of the reordered sequence of samples is performed by the encoding unit 6. For candidates for which a reordered sequence of samples was already obtained in the preselection process in (A) or the preliminary selection process in (D), the reordered sequence of samples obtained in that process may be input to the encoding unit 6 and encoded by the encoding unit 6.
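The final selection can be sketched as an exhaustive search over the Y surviving candidates. The callables `reorder` and `encode` are hypothetical stand-ins for the reordering unit 5 and the encoding unit 6, and the toy encoder below is an assumption made only so that the sketch runs.

```python
from typing import Callable, Optional, Sequence, Tuple


def final_select(
    candidates: Sequence[int],
    reorder: Callable[[int], Sequence[int]],     # stand-in for reordering unit 5
    encode: Callable[[Sequence[int]], str],      # stand-in for encoding unit 6
) -> Tuple[Optional[int], Optional[int]]:
    """Return (interval T, code amount in bits) for the smallest actual code amount."""
    best_t: Optional[int] = None
    best_bits: Optional[int] = None
    for t in candidates:
        bits = len(encode(reorder(t)))           # actual code amount, not an estimate
        if best_bits is None or bits < best_bits:
            best_t, best_bits = t, bits
    return best_t, best_bits


# Toy data: one reordered sequence per candidate and a toy "encoder" that
# spends one bit per non-zero sample (illustrative only).
table = {4: [1, 0, 2], 5: [0, 0, 3], 6: [1, 1, 1]}
result = final_select(
    [4, 5, 6],
    lambda t: table[t],
    lambda seq: "1" * sum(1 for x in seq if x != 0),
)
```

Each candidate costs one full encoding pass here, which is why the preceding stages try to keep Y small.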

It should be noted that the adding process in (B), the adding process in (C) and the preliminary selection process in (D) are not essential, and at least one of these processes may be omitted. If the adding process in (B) is omitted, the number |SZ3| of elements (candidates) of the set SZ3 is |SZ3|=Z2, since Q=0. If the preliminary selection process in (D) is omitted, at most Z2+Q candidates included in the set SZ3 (if the frame for which the interval T is to be determined is the temporally first frame), or at most Z2+Q+Y+W candidates included in the union SZ3∪SP (if the frame for which the interval T is to be determined is not the temporally first frame), are subjected to the final selection process in (E).

Although the "first frame" is the "temporally first frame" in the above description of the method of determining the interval T, the first frame is not limited to this. The "first frame" may be any frame other than a frame that satisfies conditions (1) to (3) listed under A below (see Fig. 9).

<A>

For the frame:

(1) the frame is not the temporally first frame,

(2) the preceding frame was encoded according to the encoding method of the present invention, and

(3) the preceding frame was subjected to the reordering process described above.

Although the set SY in the process in (D2) is the "set of candidates subjected to the final selection process in (E) when the interval T was determined in the preceding frame X_{t-1}" in the above description, the set SY may be the "union of the sets of candidates subjected to the final selection process in (E) when the interval T was determined in each of a plurality of frames temporally preceding the frame for which the interval T is to be determined". Specifically, the set SY is the union of the set S_{t-1} of candidates subjected to the final selection process in (E) when the interval T was determined in the frame X_{t-1}, the set S_{t-2} of candidates subjected to the final selection process in (E) when the interval T was determined in the frame X_{t-2}, ..., and the set S_{t-m} of candidates subjected to the final selection process in (E) when the interval T was determined in the frame X_{t-m}, i.e., SY=S_{t-1}∪S_{t-2}∪...∪S_{t-m}, where m is the number of preceding frames. In this case, m is preferably any one of 1, 2 and 3, because a larger value of m requires a larger amount of computation, depending on the values of Z, Z1, Z2 and Q.

Assuming that the amount of computation for the preliminary selection process is about 1/10 of the amount of computation A of the code amount calculation process, i.e., A/10, the amount of computation required to perform the processes (A), (B), (C) and (D2) is at most ((Z1+Z2+Q+Y+W)A/10+YA) if Z, Z1, Z2, Q, W and Y are preset to fixed values. In this case, letting Z2+Q≈3Z2 and Y+W≈3Y, the amount of computation is ((Z1+3Z2+3Y)A/10+YA). Comparison with the amount of computation (ZA/10+YA) described above shows that the amount of computation can be reduced by setting Z, Z1, Z2 and Y so that they satisfy Z>(Z1+3Z2+3Y). For example, the settings may be Z=256, Z1=64 and Z2=Y=8.
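The two cost expressions can be evaluated for the example settings as in the sketch below. The values Q=16 and W=16 are assumptions obtained by taking the approximations Z2+Q≈3Z2 and Y+W≈3Y as equalities, and A is normalised to 1; the function names are illustrative.

```python
def cost_single_stage(z: int, y: int, a: float = 1.0) -> float:
    """Cost (Z*A/10 + Y*A): preliminary selection over all Z candidates, then Y encodings."""
    return z * a / 10 + y * a


def cost_multi_stage(z1: int, z2: int, q: int, y: int, w: int, a: float = 1.0) -> float:
    """Upper bound ((Z1+Z2+Q+Y+W)*A/10 + Y*A) for processes (A), (B), (C) and (D2)."""
    return (z1 + z2 + q + y + w) * a / 10 + y * a


Z, Z1, Z2, Y = 256, 64, 8, 8
Q = 3 * Z2 - Z2          # assumed: Z2 + Q = 3 * Z2
W = 3 * Y - Y            # assumed: Y + W = 3 * Y

single = cost_single_stage(Z, Y)            # 25.6 A + 8 A
multi = cost_multi_stage(Z1, Z2, Q, Y, W)   # 11.2 A + 8 A
```

Under these assumptions the multi-stage bound is about 19.2A against about 33.6A for the single-stage search, consistent with the condition Z>(Z1+3Z2+3Y), which here reads 256>112.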

SZ={T_1, T_2, ..., T_Z} may be constant or may vary from frame to frame. The value of Z may be constant or may vary from frame to frame. However, the number of candidates subjected to the final selection process in (E) must be smaller than Z. Therefore, if |SY| is greater than or equal to Z in the process in (D2), preselection is performed on the set SY read from memory, for example by using an indicator similar to the indicator used in the preselection process in (A) described above, to reduce the number of candidates so that the number of candidates subjected to the final selection process in (E) is smaller than Z. If the preliminary selection process in (D) is omitted and |SZ3∪SP|≥Z, preselection is performed on SZ3∪SP by using an indicator similar to the indicator used in the preselection process in (A) described above, to reduce the number of candidates so that the number of candidates subjected to the final selection process in (E) is smaller than Z.

<Modification of the method of determining the interval T>

In audio signals such as speech and music signals, there is often a high correlation between the current frame and the preceding frames in a signal segment in which the audio signal is stationary over a plurality of frames. By taking advantage of this stationarity, the ratio between SZ3 and SP may be changed in the process in (D2) to further reduce the amount of computation while maintaining the compression performance. The ratio in this case may be specified as the ratio of SP to SZ3, as the ratio of SZ3 to SP, as the proportion of SP in SZ3∪SP, or as the proportion of SZ3 in SZ3∪SP.

Whether or not stationarity is high in a certain signal segment can be determined on the basis of whether or not an indicator of the degree of stationarity is greater than or equal to a threshold, or whether or not the indicator is greater than the threshold. The indicator of the degree of stationarity may be the indicator described below. The frame of interest for which the interval T is determined is hereinafter referred to as the current frame, and the frame temporally immediately preceding the current frame is referred to as the preceding frame. The indicator of the degree of stationarity is larger when:

(a-1) the "prediction gain of the audio signal in the current frame" is larger,

(a-2) the "estimated prediction gain of the audio signal in the current frame" is larger,

(b-1) the difference between the "prediction gain of the audio signal in the preceding frame" and the "prediction gain of the audio signal in the current frame" is smaller,

(b-2) the difference between the "estimated prediction gain of the audio signal in the preceding frame" and the "estimated prediction gain of the audio signal in the current frame" is smaller,

(c-1) the "sum of the amplitudes of the samples of the audio signal included in the current frame" is larger,

(c-2) the "sum of the amplitudes of the samples included in the sequence of samples obtained by transforming the sequence of samples of the audio signal included in the current frame into the frequency domain" is larger,

(d-1) the difference between the "sum of the amplitudes of the samples of the audio signal included in the preceding frame" and the "sum of the amplitudes of the samples of the audio signal included in the current frame" is smaller,

(d-2) the difference between the "sum of the amplitudes of the samples included in the sequence of samples obtained by transforming the sequence of samples of the audio signal included in the preceding frame into the frequency domain" and the "sum of the amplitudes of the samples included in the sequence of samples obtained by transforming the sequence of samples of the audio signal included in the current frame into the frequency domain" is smaller,

(e-1) the "power of the audio signal in the current frame" is larger,

(e-2) the "power of the sequence of samples obtained by transforming the sequence of samples of the audio signal in the current frame into the frequency domain" is larger,

(f-1) the difference between the "power of the audio signal in the preceding frame" and the "power of the audio signal in the current frame" is smaller, and/or

(f-2) the difference between the "power of the sequence of samples obtained by transforming the sequence of samples of the audio signal in the preceding frame into the frequency domain" and the "power of the sequence of samples obtained by transforming the sequence of samples of the audio signal in the current frame into the frequency domain" is smaller.

It should be noted that the prediction gain is the ratio of the energy of the original signal to the energy of the prediction error signal in predictive coding. The value of the prediction gain is substantially proportional to the ratio of the sum of the absolute values of the samples included in the sequence of MDCT coefficients in the frame output from the frequency-domain transform unit 1 to the sum of the absolute values of the samples included in the weighted normalized sequence of MDCT coefficients in the frame output from the weighted envelope normalization unit 2, or to the ratio of the sum of the squares of the values of the samples included in the sequence of MDCT coefficients in the frame to the sum of the squares of the values of the samples included in the weighted normalized sequence of MDCT coefficients in the frame. Therefore, any of these ratios can be used as a value whose magnitude is equivalent to that of the "prediction gain of the audio signal in the frame".

The "prediction gain of the audio signal in the frame" is E, defined as follows:

E = 1 / ∏_{m=1}^{P_O} (1 - k_m^2)

where k_m is the m-th order PARCOR coefficient corresponding to the linear prediction coefficients in the frame that are used by the weighted envelope normalization unit 2, and the product runs over the orders m = 1, ..., P_O. In this case, the PARCOR coefficients corresponding to the linear prediction coefficients are the unquantized PARCOR coefficients of all orders. If E is computed by using unquantized PARCOR coefficients of only some orders (e.g., the first to P_2-th orders, where P_2<P_O) or quantized PARCOR coefficients of some or all orders as the PARCOR coefficients corresponding to the linear prediction coefficients, the calculated E is the "estimated prediction gain of the audio signal in the frame".
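As a sketch, the prediction gain can be computed from PARCOR coefficients with the standard relation E = 1/∏(1 - k_m^2), which expresses the ratio of signal energy to prediction-error energy. The function name and the sample coefficient values are assumptions introduced here; truncating the coefficient list to the first P_2 orders gives the "estimated" gain.

```python
import math
from typing import Sequence


def prediction_gain(parcor: Sequence[float]) -> float:
    """Prediction gain E = 1 / prod(1 - k_m**2) over the given PARCOR coefficients.

    Each |k_m| is assumed to be strictly below 1 (a stable predictor).
    """
    if any(abs(k) >= 1.0 for k in parcor):
        raise ValueError("PARCOR coefficients must satisfy |k| < 1")
    return 1.0 / math.prod(1.0 - k * k for k in parcor)


ks = [0.5, -0.5, 0.1]                    # illustrative coefficients, P_O = 3
full_gain = prediction_gain(ks)          # uses all orders
estimated_gain = prediction_gain(ks[:2]) # uses only the first P_2 = 2 orders
```

Since every factor (1 - k_m^2) lies between 0 and 1, dropping higher orders can only lower the result, so the estimate never exceeds the full-order gain.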

The "sum of the amplitudes of the samples of the audio signal included in the frame" is the sum of the absolute values of the samples of the digital speech/audio signal included in the frame, or the sum of the absolute values of the samples included in the sequence of MDCT coefficients in the frame output from the frequency-domain transform unit 1.

The "power of the audio signal in the frame" is the sum of the squares of the values of the samples of the digital speech/audio signal included in the frame, or the sum of the squares of the values of the samples included in the sequence of MDCT coefficients in the frame output from the frequency-domain transform unit 1.

Any one of (a) to (f) above may be used to determine the degree of stationarity, or the logical OR or AND of two or more of (a) to (f) above may be used to determine the degree of stationarity. In the former case, the interval determination unit 7 uses, for example, only (a) the "prediction gain of the audio signal in the current frame" G and, if ε<G holds between G and a predetermined threshold ε, determines that stationarity is high; or the interval determination unit 7 uses, for example, only (b) the difference Goff between the "prediction gain of the audio signal in the preceding frame" and the "prediction gain of the audio signal in the current frame" and, if Goff<τ holds between Goff and a predetermined threshold τ, determines that stationarity is high. In the latter case, the interval determination unit 7 uses, for example, criteria (c) and (e) and, if ξ<Ac holds between the "sum of the amplitudes of the samples of the audio signal included in the current frame" Ac and a predetermined threshold ξ, and δ<Pc holds between the "power of the audio signal in the current frame" Pc and a predetermined threshold δ, determines that stationarity is high; or the interval determination unit 7 uses criteria (a), (c) and (f) and, if ε<G holds between the "prediction gain of the audio signal in the current frame" G and a predetermined threshold ε or ξ<Ac holds between the "sum of the amplitudes of the samples of the audio signal included in the current frame" Ac and a predetermined threshold ξ, and Poff<θ holds between the difference Poff between the "power of the audio signal in the preceding frame" and the "power of the audio signal in the current frame" and a predetermined threshold θ, determines that stationarity is high.
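The combined decisions described above can be sketched as simple Boolean tests. The function names and all numeric thresholds below are hypothetical, and the grouping "(ε<G or ξ<Ac) and Poff<θ" is one reading of the last example, which the text leaves ambiguous.

```python
def stationary_c_and_e(ac: float, pc: float, xi: float, delta: float) -> bool:
    """Criteria (c) AND (e): amplitude sum Ac and power Pc both exceed their thresholds."""
    return xi < ac and delta < pc


def stationary_a_c_f(
    g: float, ac: float, p_off: float, eps: float, xi: float, theta: float
) -> bool:
    """Criteria (a), (c), (f), read here as ((a) OR (c)) AND (f)."""
    return (eps < g or xi < ac) and p_off < theta


# Illustrative values only; real thresholds would be tuned in advance.
high_1 = stationary_c_and_e(ac=120.0, pc=900.0, xi=100.0, delta=500.0)
high_2 = stationary_a_c_f(g=3.0, ac=50.0, p_off=10.0, eps=2.0, xi=100.0, theta=20.0)
low_2 = stationary_a_c_f(g=3.0, ac=50.0, p_off=30.0, eps=2.0, xi=100.0, theta=20.0)
```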

The ratio between SZ3 and SP, which is changed according to the determination of the degree of stationarity, is set in advance in a lookup table, for example in the interval determination unit 7. In general, when stationarity is determined to be high, the proportion of SP in SZ3∪SP is set to a large value (the proportion of SZ3 is relatively small, or the proportion of SP in SZ3∪SP is greater than 50%); when stationarity is determined to be not high, the proportion of SP in SZ3∪SP is set to a small value (the proportion of SZ3 is relatively large, or the proportion of SP in SZ3∪SP does not exceed 50%), or the ratio is set to about 50:50. When stationarity is determined to be high, the lookup table is referenced in the process in (D2) to determine the proportion of SP (or the proportion of SZ3), and the number of candidates included in the set SZ3 is reduced, for example by selecting candidates with larger indicators in the same manner as in the preselection process in (A) described above, so that the numbers of candidates included in SP and SZ3 agree with the ratio. On the other hand, when stationarity is determined to be not high, the lookup table is referenced to determine the proportion of SP (or the proportion of SZ3), and the number of candidates included in the set SP is reduced, for example by selecting candidates with larger indicators in the same manner as in the preselection process in (A) described above, so that the numbers of candidates included in SP and SZ3 agree with the ratio. In this way, the number of candidates subjected to the process in (D2) can be reduced while the proportion of the set that is likely to include the interval T for the current frame as a candidate is increased. Thus, the interval T can be determined efficiently. It should be noted that if stationarity is determined to be not high, SP may be an empty set. That is, the candidates subjected to the final selection process in (E) in the preceding frame are excluded from the candidates subjected to the preliminary selection process in (D) in the current frame.

In an alternative configuration, different ratios between SZ3 and SP may be set depending on the degree of stationarity. For example, if whether or not stationarity is high is determined by using only criterion (a), the "prediction gain of the audio signal in the current frame", a plurality of thresholds ε_1, ε_2, ..., ε_{k-1}, ε_k (where ε_1<ε_2<...<ε_{k-1}<ε_k) are provided in advance for the prediction gain G of the audio signal in the current frame, and

G < ε_1 ⇒ proportion of SP in SZ3∪SP: 10%

ε_1 ≤ G < ε_2 ⇒ proportion of SP in SZ3∪SP: 20%

...

ε_{k-1} ≤ G < ε_k ⇒ proportion of SP in SZ3∪SP: 80%

ε_k ≤ G ⇒ proportion of SP in SZ3∪SP: 90%

are set in the lookup table in advance. Although an example that uses only criterion (a), the "prediction gain of the audio signal in the current frame", has been described, different ratios between SZ3 and SP depending on the degree of stationarity can likewise be set in the lookup table for the other criteria or for the logical OR or AND of two or more of criteria (a) to (f).
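Such a threshold table can be sketched with a sorted list and a binary search. The three thresholds and four proportions below are hypothetical placeholders (the intermediate rows of the table above are not specified and are not reproduced here), and the function name is illustrative.

```python
from bisect import bisect_right
from typing import Sequence


def sp_proportion(g: float, thresholds: Sequence[float], proportions: Sequence[float]) -> float:
    """Map prediction gain G to the proportion of SP in SZ3 ∪ SP.

    thresholds: eps_1 < eps_2 < ... < eps_k (ascending).
    proportions: k + 1 values; proportions[i] applies when eps_i <= G < eps_{i+1},
    with proportions[0] for G < eps_1 and proportions[k] for eps_k <= G.
    """
    if len(proportions) != len(thresholds) + 1:
        raise ValueError("need one more proportion than thresholds")
    return proportions[bisect_right(thresholds, g)]


# Hypothetical table with k = 3 thresholds (values chosen only for illustration).
EPS = [2.0, 4.0, 8.0]
SHARE = [0.10, 0.20, 0.80, 0.90]

share_low = sp_proportion(1.0, EPS, SHARE)
share_edge = sp_proportion(2.0, EPS, SHARE)
share_high = sp_proportion(8.0, EPS, SHARE)
```

Using `bisect_right` makes each interval closed on the left, matching the "ε_i ≤ G < ε_{i+1}" form of the rows above. The same lookup could return a pair such as (Z2, Q) instead of a proportion.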

Although an exemplary embodiment has been described in which the ratio between SZ3 and SP is changed according to the determination of the degree of stationarity after the sets SZ3 and SP have been determined in the process in (D2), in an alternative embodiment the determination of whether or not the degree of stationarity is high is made before the sets SZ3 and SP are determined. For example, the values of Z1, Z2, Q and W corresponding to the determination of whether or not the degree of stationarity is high can be set in advance in a lookup table in association with the value of Y. At least one of the values of Z1, Z2 and Q (preferably Z2 or Q) associated with the determination that stationarity is high is set small (or W is set large), so that |SZ3| is smaller than Y+W (where W may be 0). At least one of the values of Z1, Z2 and Q (preferably Z2 or Q) associated with the determination that stationarity is not high is set large (or W is set small), so that |SZ3| is greater than Y+W (where W may be 0).

In the embodiment in which the determination of whether or not stationarity is high is made before the sets SZ3 and SP are determined, the values of Z1, Z2 and Q according to the degree of stationarity can be set in a lookup table. For example, if whether or not stationarity is high is determined by using only criterion (a), the "prediction gain of the audio signal in the current frame", a plurality of thresholds ε_1, ε_2, ..., ε_{k-1}, ε_k (where ε_1<ε_2<...<ε_{k-1}<ε_k) are provided in advance for the prediction gain G of the audio signal in the current frame, and

G < ε_1 ⇒ Z2=16, Q=30

ε_1 ≤ G < ε_2 ⇒ Z2=12, Q=20

...

ε_{k-1} ≤ G < ε_k ⇒ Z2=4, Q=4

ε_k ≤ G ⇒ Z2=2, Q=0

are set in the lookup table in advance. Although an example that uses only criterion (a), the "prediction gain of the audio signal in the current frame", has been described, the values of Z1, Z2 and Q that change depending on the degree of stationarity can likewise be set in the lookup table for the other criteria or for the logical OR or AND of two or more of criteria (a) to (f).

[Method of determining a periodic feature value]

Although a method of determining the interval T with a small amount of computation has been described, the parameter determined by this method is not limited to the interval T. For example, the method can be used to determine a periodic feature value (e.g., the fundamental frequency or the pitch period) of the audio signal, which is information for identifying the groups of samples when the samples are reordered. Specifically, the interval determination unit 7 may be operated as a device for determining a periodic feature value, so as to determine the interval T as the periodic feature value without outputting the code sequence that could be obtained by encoding the reordered sequence of samples. In this case, the term "interval T" in the description of the method of determining the interval T can be replaced by the term "pitch period", or the sampling frequency of the sequence of samples divided by the interval T can be replaced by the "fundamental frequency". The method can determine the fundamental frequency or the pitch period used for reordering the samples with a small amount of computation.

[Additional information that identifies the reordering of the samples in the sequence of samples]

The encoding unit 6 or the additional information generating unit 8 outputs additional information identifying the reordering of the samples included in the sequence of samples, i.e., information indicating the periodicity of the audio signal, information indicating the fundamental frequency, or information indicating the interval T between a sample corresponding to the periodicity or the fundamental frequency of the audio signal and a sample corresponding to an integer multiple of the periodicity or the fundamental frequency of the audio signal. It should be noted that, if the encoding unit 6 outputs the additional information, the encoding unit 6 may perform the process of obtaining the additional information within the process of encoding the sequence of samples, or may perform it as a process separate from the encoding process. For example, if the interval T is determined for each frame, the additional information identifying the reordering of the samples included in the sequence of samples is also output for each frame. The additional information identifying the reordering of the samples in the sequence of samples may be obtained by encoding the periodicity, the fundamental frequency or the interval T on a frame-by-frame basis. The encoding may be fixed-length encoding, or may be variable-length encoding to reduce the average code amount. If fixed-length encoding is used, the additional information is stored in association with a code that uniquely identifies the additional information, for example, and the code associated with the input additional information is output. If variable-length encoding is used, the difference between the interval T in the current frame and the interval T in the preceding frame may be encoded by variable-length encoding, and the resulting information may be used as the information indicating the interval T. In this case, for example, each difference in the interval T is stored in association with a code uniquely identifying the difference, and the code associated with the input difference between the interval T in the current frame and the interval T in the preceding frame is output. Similarly, the difference between the fundamental frequency of the current frame and the fundamental frequency of the preceding frame may be encoded by variable-length encoding, and the encoded data may be used as the information indicating the fundamental frequency. In addition, if n can be selected from a plurality of alternatives, the upper limit of n or the upper limit number N described earlier may be included in the additional information.

[The number of collected samples]

In this embodiment, an example has been described in which the number of samples included in each group of samples is fixed at three, namely, the sample corresponding to the periodicity or the fundamental frequency, or to an integer multiple of the periodicity or fundamental frequency (hereinafter referred to as the central sample), the sample preceding the central sample, and the sample following the central sample. However, if the number of samples in a group of samples and the indices of the samples are variable, information indicating one alternative selected from a set of alternatives that differ in the combination of the number of samples in the group and the indices of the samples can be included in the additional information.

For example, if

(1) only the central sample, F(nT),

(2) three samples in total, namely the central sample, the sample preceding it, and the sample following it, F(nT-1), F(nT), F(nT+1),

(3) three samples in total, namely the central sample and the two preceding samples, F(nT-2), F(nT-1), F(nT),

(4) four samples in total, namely the central sample and the three preceding samples, F(nT-3), F(nT-2), F(nT-1), F(nT),

(5) three samples in total, namely the central sample and the two following samples, F(nT), F(nT+1), F(nT+2), and

(6) four samples in total, namely the central sample and the three following samples, F(nT), F(nT+1), F(nT+2), F(nT+3)

are set as alternatives and (4) is selected, information indicating that (4) has been selected is included in the additional information. Three bits are sufficient for the information indicating the selected alternative in this example.
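The six alternatives above can be written as offsets relative to the central sample F(nT), as in the following sketch. The table layout and the function names are assumptions introduced for illustration only.

```python
# Offsets relative to the central index nT for alternatives (1) to (6).
ALTERNATIVE_OFFSETS = {
    1: (0,),
    2: (-1, 0, 1),
    3: (-2, -1, 0),
    4: (-3, -2, -1, 0),
    5: (0, 1, 2),
    6: (0, 1, 2, 3),
}


def group_indices(alternative: int, n: int, T: int) -> list:
    """Indices of the samples gathered around the n-th multiple of the interval T."""
    return [n * T + off for off in ALTERNATIVE_OFFSETS[alternative]]


def bits_for_alternatives(count: int) -> int:
    """Number of bits of a fixed-length code that can distinguish `count` alternatives."""
    return max(1, (count - 1).bit_length())
```

For six alternatives, `bits_for_alternatives(6)` evaluates to 3, which agrees with the statement that three bits are sufficient.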

One way to select an alternative is the following. The reordering unit 5 can perform the reordering corresponding to each of these alternatives, and the encoding unit 6 can obtain the code amount of the code sequence corresponding to each of the alternatives. The alternative that yields the smallest code amount is then selected. In this case, the additional information identifying the reordering of the samples included in the sequence of samples is output from the encoding unit 6 instead of the reordering unit 5. This method is also applicable to the case where n can be selected from a set of alternatives.

However, there may be a very large number of combinations of alternatives, such as alternatives for the interval T, alternatives for the combination of the number of samples included in a group of samples and the sample indices, and alternatives for n. Calculating the final code amounts for all combinations of alternatives requires an enormous amount of processing, which can be a problem from the point of view of efficiency. From this point of view, it is preferable to use the following approximation process to reduce the amount of processing. The encoding unit 6 obtains approximate code amounts, i.e., estimated code amounts, by a simple approximation method for all combinations of alternatives; extracts a set of candidates that are likely to be preferable, for example by selecting a predetermined number of candidates that yield the smallest approximate code amounts; and selects, from among the extracted candidates, the alternative that yields the smallest code amount. In this way, a sufficiently small final code amount can be achieved with a small amount of processing.
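The two-stage selection described above can be sketched as follows. The cost functions are passed in as parameters because the text does not fix a particular approximation method; the function name and the default shortlist size are assumptions.

```python
def select_alternative(alternatives, approx_cost, exact_cost, shortlist_size=3):
    """Two-stage search: cheap estimate for every alternative, exact cost for a shortlist.

    approx_cost and exact_cost are callables returning a code amount (in bits)
    for one alternative; both are placeholders supplied by the caller.
    """
    # Stage 1: rank all alternatives by the approximate code amount.
    shortlist = sorted(alternatives, key=approx_cost)[:shortlist_size]
    # Stage 2: evaluate the exact code amount only for the shortlisted candidates.
    return min(shortlist, key=exact_cost)
```

The exact code amount is computed only `shortlist_size` times, regardless of how many combinations of alternatives exist.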

In one example, the number of samples included in a group of samples can first be fixed at three, the candidates for the interval T can then be narrowed down to a small number, each of those candidates can be combined with the alternatives for the number of samples included in a group of samples, and the preferred alternative can be selected.

Alternatively, an approximate sum of the magnitude indicators of the samples is measured, and the alternative can be selected based on the concentration of the indicators in the low-frequency region, or based on the number of consecutive samples that have an amplitude of zero, counted from the highest frequency toward the low-frequency side along the frequency axis. Specifically, the sum of the absolute values of the amplitudes of the reordered samples in the first quarter of the reordered sequence of samples, counted from the low-frequency side, can be obtained. If the sum is greater than a predetermined threshold, the reordering may be considered a preferred reordering. Selecting the alternative that yields the largest number of consecutive zero-amplitude samples, counted from the highest frequency toward the low-frequency side of the reordered sequence of samples, can also be regarded as yielding a preferred reordering, since the samples with large indicators are then concentrated in the low-frequency region.
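The two approximate measures described above can be sketched as follows. The function names are assumptions, and the sequence is taken to be ordered from the low-frequency side to the high-frequency side.

```python
def low_band_abs_sum(samples) -> float:
    """Sum of absolute amplitudes in the first quarter (low-frequency side) of the sequence."""
    quarter = len(samples) // 4
    return sum(abs(x) for x in samples[:quarter])


def trailing_zero_run(samples) -> int:
    """Number of consecutive zero-amplitude samples counted down from the highest frequency."""
    run = 0
    for x in reversed(samples):
        if x != 0:
            break
        run += 1
    return run
```

A reordering that moves the large samples toward the low-frequency side increases both measures, so either can serve as an inexpensive stand-in for the code amount when candidates are compared.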

When alternatives are selected by the approximation process described above, the amount of processing is small, but the reordering of the samples that yields the smallest final code amount is not necessarily selected. Therefore, a plurality of alternatives can be selected by the approximation process described above, and the code amounts for that small number of candidates can then be calculated accurately to select the preferred alternative (the one that yields the smallest code amount).

[Modification]

In some situations there may be no advantage in reordering the samples included in the sequence of samples. In such a case, the original sequence of samples needs to be encoded. The reordering unit 5 therefore also outputs the original sequence of samples (the sequence of samples that has not been reordered). The encoding unit 6 then encodes the original sequence of samples by variable-length coding. The code amount of the code sequence obtained by variable-length coding of the original sequence of samples is compared with the sum of the code amount of the code sequence obtained by variable-length coding of the reordered sequence of samples and the code amount of the additional information.

If the code amount of the code sequence obtained by variable-length coding of the original sequence of samples is smaller, the code sequence obtained by variable-length coding of the original sequence of samples is output.

If the sum of the code amount of the code sequence obtained by variable-length coding of the reordered sequence of samples and the code amount of the additional information is smaller, the code sequence obtained by variable-length coding of the reordered sequence of samples and the additional information are output.

If the code amount of the code sequence obtained by variable-length coding of the original sequence of samples is equal to the sum of the code amount of the code sequence obtained by variable-length coding of the reordered sequence of samples and the code amount of the additional information, either the code sequence obtained by variable-length coding of the original sequence of samples, or the code sequence obtained by variable-length coding of the reordered sequence of samples together with the additional information, is output. Which of the two is output is determined in advance.

Additionally, second additional information indicating whether or not the sequence of samples corresponding to the code sequence is the reordered sequence of samples is also output (see Fig. 10). One bit is sufficient for the second additional information.
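The comparison of code amounts and the resulting one-bit second additional information can be sketched as follows. The function and parameter names are assumptions; the tie-breaking rule is a parameter because the text only requires that it be determined in advance.

```python
def choose_output(bits_original: int, bits_reordered: int, bits_additional: int,
                  tie_prefers_reordered: bool = False) -> bool:
    """Return the one-bit flag: True if the reordered code sequence is output.

    bits_original   : code amount for the original (non-reordered) sequence
    bits_reordered  : code amount for the reordered sequence
    bits_additional : code amount of the additional information identifying the reordering
    """
    total_reordered = bits_reordered + bits_additional
    if total_reordered < bits_original:
        return True
    if total_reordered > bits_original:
        return False
    # Equal code amounts: use the rule fixed in advance.
    return tie_prefers_reordered
```

When the flag is False, only the code sequence of the original sequence of samples is output; when it is True, the code sequence of the reordered sequence of samples is output together with the additional information.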

It should be noted that, if an approximate code amount, i.e., an estimated code amount, of the code sequence obtained by variable-length coding of the reordered sequence of samples has been obtained as described above, that approximate code amount can be used instead of the code amount of the code sequence obtained by variable-length coding of the reordered sequence of samples. Similarly, an approximate code amount, i.e., an estimated code amount, of the code sequence obtained by variable-length coding of the original sequence of samples may be obtained and used instead of the code amount of the code sequence obtained by variable-length coding of the original sequence of samples.

In addition, whether or not to reorder the samples included in the sequence of samples can be determined in advance according to whether the prediction gain or the estimated prediction gain is greater than a predetermined threshold. This method uses the fact that, when the prediction gain of speech or music is high, the vibration of the vocal cords or of the musical instrument is strong and the periodicity is high. The prediction gain is the energy of the original sound divided by the energy of the prediction residual. In encoding that uses linear prediction coefficients or PARCOR coefficients as parameters, the quantized parameters can be used in common by the encoder and the decoder. Thus, for example, the encoding unit 6 may use the quantized i-th order PARCOR coefficients k(i), obtained by other means (not shown) provided in the encoder 100, to calculate the estimated prediction gain, which is given by the product over all orders of the reciprocal of (1-k(i)*k(i)). If the calculated estimate is greater than a predetermined threshold, the encoding unit 6 outputs the code sequence obtained by variable-length coding of the reordered sequence of samples; otherwise, the encoding unit 6 outputs the code sequence obtained by variable-length coding of the original sequence of samples. If quantized parameters can be used in common by the encoder and the decoder as in this example, there is no need to output the second additional information indicating whether or not the sequence of samples corresponding to the code sequence is the reordered sequence of samples. That is, reordering is likely to have little effect for an unpredictable noise-like signal or for silence, and the reordering is therefore omitted in order to avoid wasting additional information and computation.
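The estimated prediction gain can be sketched as follows, on the assumption that it is the product over all orders i of 1/(1 - k(i)^2) and that every quantized PARCOR coefficient satisfies |k(i)| < 1. The function names are illustrative.

```python
def estimated_prediction_gain(parcor) -> float:
    """Product over all orders of 1 / (1 - k(i)^2); assumes |k(i)| < 1 for every order."""
    gain = 1.0
    for k in parcor:
        gain /= (1.0 - k * k)
    return gain


def should_reorder(parcor, threshold: float) -> bool:
    """Decision shared by encoder and decoder: reorder only if the estimate exceeds the threshold."""
    return estimated_prediction_gain(parcor) > threshold
```

Because the decoder can evaluate the same function on the same quantized coefficients with the same threshold, the decision is reproduced on both sides without transmitting a flag.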

In an alternative configuration, the reordering unit 5 can calculate the prediction gain or the estimated prediction gain. If the prediction gain or the estimated prediction gain is greater than a predetermined threshold, the reordering unit 5 may reorder the sequence of samples and output the reordered sequence of samples to the encoding unit 6; otherwise, the reordering unit 5 may output the sequence of samples that was input to it to the encoding unit 6 without reordering. The encoding unit 6 can then encode the sequence of samples output from the reordering unit 5 by variable-length coding.

In this configuration, the threshold is preset as a value shared by the encoding side and decoding side.

It should be noted that Rice coding, entropy coding, and run-length coding, taken as examples in this document, are all well known, and a detailed description of these methods is therefore omitted.

The decoding process

The following describes the decoding process with reference to Fig. 5 and 6.

In the decoder 200, the MDCT coefficients are restored by performing the reverse of the encoding process performed by the encoder 100 or 100a. At least the gain information, the additional information, and the code sequence described above are input to the decoder 200. If the second additional information is output by the encoder 100a, the second additional information is also input to the decoder 200.

Decoding unit 11

First, the decoding unit 11 decodes the input code sequence in accordance with selection information and outputs a sequence of samples in the frequency domain on a frame-by-frame basis (step S11). Naturally, the decoding method corresponding to the encoding method used to obtain the code sequence is performed. The details of the decoding process by the decoding unit 11 correspond to the details of the encoding process by the encoding unit 6 of the encoder 100. The description of the encoding process is therefore incorporated here by stating that the decoding corresponding to the encoding performed by the encoder 100 is the decoding process performed by the decoding unit 11, and a detailed description of the decoding process is omitted. It should be noted that the type of encoding that was performed can be identified from the selection information. If the selection information includes, for example, information identifying the region where Rice coding was applied and the Rice parameters, information indicating the region where run-length coding was applied, and information identifying the type of entropy coding, the decoding methods corresponding to these coding methods are applied to the corresponding regions of the input code sequence. The decoding processes corresponding to Rice coding, entropy coding, and run-length coding are well known, and a description of these decoding processes is therefore omitted.

Recovery unit 12

Then the recovery unit 12 obtains the original sequence of samples from the sequence of samples in the frequency domain output from the decoding unit 11, on a frame-by-frame basis, in accordance with the input additional information (step S12). Here, the "original sequence of samples" is equivalent to the "sequence of samples in the frequency domain" input to the reordering unit 5 of the encoder 100. Although there are various reordering methods that can be performed by the reordering unit 5 of the encoder 100, and various possible reordering alternatives corresponding to those methods, as described above, only one type of reordering, if any, was performed on the sequence, and the information identifying that reordering is included in the additional information. Consequently, the recovery unit 12 can reorder the sequence of samples in the frequency domain output from the decoding unit 11 back into the original sequence of samples on the basis of the additional information.

It should be noted that an alternative configuration is also possible in which second additional information indicating whether or not reordering has been performed is input. In this configuration, if the second additional information indicates that reordering has been performed, the recovery unit 12 reorders the sequence of samples in the frequency domain output from the decoding unit 11 back into the original sequence of samples; if the second additional information indicates that reordering has not been performed, the recovery unit 12 outputs the sequence of samples in the frequency domain output from the decoding unit 11 without reordering.

Another alternative configuration is also possible, in which whether or not reordering has been performed is identified on the basis of the magnitude of the prediction gain or the estimated prediction gain. In this configuration, the recovery unit 12 uses the quantized i-th order PARCOR coefficients k(i), input from other means (not shown) provided in the decoder 200, to calculate the estimated prediction gain, which is given by the product over all orders of the reciprocal of (1-k(i)*k(i)). If the calculated estimate is greater than a predetermined threshold, the recovery unit 12 reorders the sequence of samples in the frequency domain output from the decoding unit 11 back into the original sequence of samples and outputs the resulting sequence of samples; otherwise, the recovery unit 12 outputs the sequence of samples output from the decoding unit 11 without reordering.

The details of the recovery process performed by the recovery unit 12 correspond to the details of the reordering process performed by the reordering unit 5 of the encoder 100. The description of the reordering process is therefore incorporated here by stating that the recovery process performed by the recovery unit 12 is the inverse of the reordering performed by the reordering unit 5 (reordering in reverse), and a detailed description of the recovery process is omitted. To facilitate understanding, the following describes one example of a recovery process corresponding to the specific example of the reordering process described earlier.

For example, in the previously described example in which the reordering unit 5 gathers the groups of samples together in a cluster on the low-frequency side and outputs F(T-1), F(T), F(T+1), F(2T-1), F(2T), F(2T+1), F(3T-1), F(3T), F(3T+1), F(4T-1), F(4T), F(4T+1), F(5T-1), F(5T), F(5T+1), F(1), ..., F(T-2), F(T+2), ..., F(2T-2), F(2T+2), ..., F(3T-2), F(3T+2), ..., F(4T-2), F(4T+2), ..., F(5T-2), F(5T+2), ..., F(jmax), that same sequence of samples in the frequency domain, output from the decoding unit 11, is input to the recovery unit 12.

The additional information includes, for example, information on the interval T, information indicating that n is an integer greater than or equal to 1 and less than or equal to 5, and information indicating that each group of samples contains three samples. Therefore, on the basis of the additional information, the recovery unit 12 can restore the input sequence of samples F(T-1), F(T), F(T+1), F(2T-1), F(2T), F(2T+1), F(3T-1), F(3T), F(3T+1), F(4T-1), F(4T), F(4T+1), F(5T-1), F(5T), F(5T+1), F(1), ..., F(T-2), F(T+2), ..., F(2T-2), F(2T+2), ..., F(3T-2), F(3T+2), ..., F(4T-2), F(4T+2), ..., F(5T-2), F(5T+2), ..., F(jmax) to the original sequence of samples F(j) (1≤j≤jmax).
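The reordering of this example and its inverse can be sketched as follows. The function names are assumptions; the list position j-1 holds F(j), and the default offsets (-1, 0, 1) correspond to groups of three samples centred on nT. The recovery side rebuilds the same permutation from T, the upper limit of n, and the group shape, which is exactly the information carried by the additional information.

```python
def reorder_permutation(jmax, T, n_max, offsets=(-1, 0, 1)):
    """1-based source indices in output order: clustered groups first, remaining samples after."""
    clustered, seen = [], set()
    for n in range(1, n_max + 1):
        for off in offsets:
            j = n * T + off
            if 1 <= j <= jmax and j not in seen:
                clustered.append(j)
                seen.add(j)
    rest = [j for j in range(1, jmax + 1) if j not in seen]
    return clustered + rest


def reorder(F, T, n_max, offsets=(-1, 0, 1)):
    """Encoder side: gather the groups around T, 2T, ..., n_max*T on the low-frequency side."""
    perm = reorder_permutation(len(F), T, n_max, offsets)
    return [F[j - 1] for j in perm]


def restore(G, T, n_max, offsets=(-1, 0, 1)):
    """Decoder side: put every sample back at its original index."""
    perm = reorder_permutation(len(G), T, n_max, offsets)
    F = [None] * len(G)
    for pos, j in enumerate(perm):
        F[j - 1] = G[pos]
    return F
```

For instance, with jmax = 32, T = 6 and n running from 1 to 5, the first fifteen output positions hold F(5), F(6), F(7), F(11), ..., F(31), and `restore(reorder(F, 6, 5), 6, 5)` returns the original list.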

Inverse quantization unit 13

Then, the inverse quantization unit 13 performs inverse quantization of the original sequence of samples F(j) (1≤j≤jmax) output from the recovery unit 12 on a frame-by-frame basis (step S13). In terms of the example described earlier, the "weighted normalized sequence of MDCT coefficients normalized by the gain" that was input to the quantization unit 4 of the encoder 100 is obtained by the inverse quantization.

Gain multiplication unit 14

Then the gain multiplication unit 14 multiplies, on a frame-by-frame basis, each coefficient of the "weighted normalized sequence of MDCT coefficients normalized by the gain" output from the inverse quantization unit 13 by the gain identified by the gain information described above, to obtain the "weighted normalized sequence of MDCT coefficients" (step S14).

Weighted envelope inverse normalization unit 15

Then the weighted envelope inverse normalization unit 15 divides, on a frame-by-frame basis, each coefficient of the "weighted normalized sequence of MDCT coefficients" output from the gain multiplication unit 14 by the weighted power spectrum envelope value to obtain a "sequence of MDCT coefficients" (step S15).

Time-domain conversion unit 16

Then, the time-domain conversion unit 16 converts, on a frame-by-frame basis, the "sequence of MDCT coefficients" output from the weighted envelope inverse normalization unit 15 into the time domain to obtain a digital speech/audio signal for each frame (step S16).

Since the processes at steps S13 to S16 are conventional, a detailed description of these processes is omitted. Such processes are described, for example, in the non-patent literature listed above.

As is evident from the embodiment, if, for example, the fundamental frequency is clear, efficient encoding can be performed by encoding the sequence of samples reordered in accordance with the fundamental frequency (that is, the average code length can be reduced). In addition, since samples that have equal or nearly equal indicators are gathered together in a cluster in a local region by reordering the samples included in the sequence of samples, the quantization distortion and the code amount can be reduced while efficient encoding is still possible.

<Exemplary hardware configuration of the encoder/decoder>

The encoder/decoder according to the embodiments described above includes an input unit to which a keyboard or the like can be connected, an output unit to which a liquid crystal display or the like can be connected, a central processing unit (CPU) (which may include memory such as cache memory), memories such as random access memory (RAM) and read-only memory (ROM), an external storage device such as a hard disk, and a bus that connects the input unit, output unit, CPU, RAM, ROM, and external storage device so that they can exchange data. A device (drive) capable of reading and writing data on a recording medium such as a CD-ROM may be provided in the encoder/decoder as needed. A physical entity that includes these hardware resources may be a general-purpose computer.

Programs for performing encoding/decoding and the data needed for the processing of the programs are stored in the external storage device of the encoder/decoder (the storage is not limited to an external storage device; for example, the programs may be stored in a read-only storage device such as ROM). Data obtained by the processing of the programs are stored in RAM or in the external storage device as needed. A storage device that stores data and the addresses of its storage locations is hereinafter simply referred to as a "storage device".

The storage device of the encoder stores a program for reordering the samples in each sequence of samples in the frequency domain that is derived from the speech/audio signal, and a program for encoding the reordered sequences of samples.

The storage device of the decoder stores a program for decoding an input code sequence and a program for restoring the decoded sequence of samples to the original sequence of samples as it was before the reordering by the encoder.

In the encoder, the programs stored in the storage device and the data needed for the processing of the programs are loaded into RAM as needed and are interpreted and executed or processed by the CPU. As a result, the CPU implements the given functions (the reordering unit and the encoding unit) to implement encoding.

In the decoder, the programs stored in the storage device and the data needed for the processing of the programs are loaded into RAM as needed and are interpreted and executed or processed by the CPU. As a result, the CPU implements the given functions (the decoding unit and the recovery unit) to implement decoding.

<Application>

The present invention is not limited to the embodiments described above and can be modified without departing from the spirit of the present invention. In addition, the processes described in the embodiments may be performed not only in time sequence, in the order described, but also in parallel or individually, depending on the throughput of the devices that perform the processes, or as needed.

If the processing functions of any of the hardware entities (the encoder/decoder) described in the embodiments are implemented by a computer, the processing that the hardware entity should perform is described in a program. The program is executed on the computer to implement the processing functions of the hardware entity on the computer.

The program describing the processing can be recorded on a computer-readable recording medium. The computer-readable recording medium may be any recording medium, such as a magnetic recording device, an optical disc, a magneto-optical recording medium, or a semiconductor memory. Specifically, for example, a hard disk device, a flexible disk, or a magnetic tape can be used as the magnetic recording device; a digital versatile disc (DVD), a DVD random access memory (DVD-RAM), a compact disc read-only memory (CD-ROM), or a recordable CD (CD-R)/rewritable CD (CD-RW) can be used as the optical disc; a magneto-optical disc (MO) can be used as the magneto-optical recording medium; and an electronically erasable and programmable read-only memory (EEPROM) can be used as the semiconductor memory.

The program is distributed, for example, by selling, transferring, or lending a removable recording medium, such as a DVD or CD-ROM, on which the program is recorded. Alternatively, the program may be stored on a storage device of a server computer and transferred from the server computer to other computers over a network, thereby distributing the program.

A computer that executes the program first stores the program recorded on the removable recording medium, or transferred from the server computer, in its own storage device. When the computer executes a process, it reads the program stored in its own recording medium and executes the process in accordance with the read program. In another mode of execution of the program, the computer may read the program directly from the removable recording medium and execute processes in accordance with the program, or may execute processes in accordance with the received program each time the program is transferred from the server computer to the computer. Alternatively, the processes may be performed using a so-called application service provider (ASP) service, in which the program is not transferred from the server computer to the computer and the processing functions are implemented only through execution instructions and acquisition of the results. It should be noted that the program in this mode encompasses information that is provided for processing by an electronic computer and is equivalent to a program (such as data that are not direct commands to the computer but have properties that define the processing of the computer).

Although the hardware entities are configured by causing a computer to execute a predetermined program in the embodiments described above, at least some of the processes may be implemented in hardware.

1. An encoding method for encoding a sequence of samples in the frequency domain that is derived from an audio signal in frames, the method comprising:
an interval determination step of determining an interval T between samples from a set S of candidates for the interval T, the interval T corresponding to a periodicity of the audio signal or to an integer multiple of the fundamental frequency of the audio signal;
an additional information generating step of encoding the interval T determined at the interval determination step to obtain additional information; and
a sample sequence encoding step of encoding a reordered sequence of samples to obtain a code sequence, wherein the reordered sequence of samples
(1) includes all of the samples in the sequence of samples, and
(2) is a sequence of samples in which at least some of the samples are reordered so that all or some of one or a plurality of successive samples including the sample corresponding to the periodicity or the fundamental frequency of the audio signal in the sequence of samples, and one or a plurality of successive samples including the sample corresponding to an integer multiple of the periodicity or fundamental frequency of the audio signal in the sequence of samples, are gathered together in a cluster on the basis of the interval T determined at the interval determination step;
wherein the interval determination step determines the interval T from the set S of candidates for the interval T, the set S consists of Y candidates among Z candidates for the interval T, the Y candidates include Z2 candidates selected without depending on the candidates subjected to the interval determination step in a frame that precedes the current frame by a predetermined number of frames, and include a candidate subjected to the interval determination step in the frame that precedes the current frame by the predetermined number of frames, and the Z candidates are representable by the additional information, where Z2<Z and Y<Z.

2. The encoding method according to claim 1,
wherein the interval determination step further comprises an adding step of adding to the set S a value adjacent to a candidate subjected to the interval determination step in the frame that precedes the current frame by the predetermined number of frames, and/or a value having a predetermined difference from that candidate.

3. The encoding method according to claim 1 or 2,
wherein the interval determination step further comprises a preliminary selection step of selecting, as the Z2 candidates, some of Z1 candidates among the Z candidates for the interval T representable by the additional information, on the basis of an indicator obtained from the audio signal and/or the sequence of samples in the current frame, where Z2<Z1.

4. The encoding method according to claim 1 or 2,
wherein the interval determination step further comprises:
a preliminary selection step of selecting some of Z1 candidates among the Z candidates for the interval T representable by the additional information, on the basis of an indicator obtained from the audio signal and/or the sequence of samples in the current frame; and
a second adding step of selecting, as the Z2 candidates, the set of candidates selected at the preliminary selection step together with a value adjacent to a candidate selected at the preliminary selection step and/or a value having a predetermined difference from a candidate selected at the preliminary selection step.

5. The encoding method according to claim 1 or 2,
wherein the interval determination step comprises:
a second preliminary selection step of selecting some of the candidates for the interval T that are included in the set S, on the basis of an indicator obtained from the audio signal and/or the sequence of samples in the current frame; and
a final selection step of determining the interval T from the set made up of the candidates selected at the second preliminary selection step.

6. The encoding method according to claim 1,
wherein the greater an indicator indicating the degree of stationarity of the audio signal in the current frame, the greater the proportion, in the set S, of candidates subjected to the interval determination step in the frame that precedes the current frame by the predetermined number of frames.

7. The encoding method according to claim 1,
wherein, when an indicator indicating the degree of stationarity of the audio signal in the current frame is smaller than a predetermined threshold, only the Z2 candidates are included in the set S.

8. Coding method according to claim 6 or 7,
wherein the indicator of the degree of stationarity of the audio signal in the current frame increases when at least one of the following conditions is satisfied:
(a-1) the "prediction gain of the audio signal in the current frame" increases,
(a-2) the "estimated prediction gain of the audio signal in the current frame" increases,
(b-1) the difference between the "prediction gain of the audio signal in the frame immediately preceding the current frame" and the "prediction gain of the audio signal in the current frame" decreases,
(b-2) the difference between the "estimated prediction gain in the immediately preceding frame" and the "estimated prediction gain in the current frame" decreases,
(c-1) the "sum of the amplitudes of the samples of the audio signal included in the current frame" increases,
(c-2) the "sum of the amplitudes of the samples included in the sample sequence obtained by converting the sample sequence of the audio signal included in the current frame into the frequency domain" increases,
(d-1) the difference between the "sum of the amplitudes of the samples of the audio signal included in the immediately preceding frame" and the "sum of the amplitudes of the samples of the audio signal included in the current frame" decreases,
(d-2) the difference between the "sum of the amplitudes of the samples included in the sample sequence obtained by converting the sample sequence of the audio signal included in the immediately preceding frame into the frequency domain" and the "sum of the amplitudes of the samples included in the sample sequence obtained by converting the sample sequence of the audio signal included in the current frame into the frequency domain" decreases,
(e-1) the "power of the audio signal in the current frame" increases,
(e-2) the "power of the sample sequence obtained by converting the sample sequence of the audio signal in the current frame into the frequency domain" increases,
(f-1) the difference between the "power of the audio signal in the immediately preceding frame" and the "power of the audio signal in the current frame" decreases, and
(f-2) the difference between the "power of the sample sequence obtained by converting the sample sequence of the audio signal in the immediately preceding frame into the frequency domain" and the "power of the sample sequence obtained by converting the sample sequence of the audio signal in the current frame into the frequency domain" decreases.
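
The relationship between the stationarity indicator and the make-up of the set S can be illustrated with a minimal sketch. The claims do not fix a formula, so everything here is an assumption: the indicator uses only the frame-to-frame power difference of condition (f-1), and the function names, the threshold of 0.5, and the maximum share of 0.75 are hypothetical.

```python
def frame_power(frame):
    """Mean squared amplitude of one frame of time-domain samples."""
    return sum(x * x for x in frame) / len(frame)


def stationarity_indicator(prev_frame, curr_frame):
    """Hypothetical indicator in (0, 1]: it grows as the power difference
    between consecutive frames shrinks, as in condition (f-1)."""
    diff = abs(frame_power(curr_frame) - frame_power(prev_frame))
    return 1.0 / (1.0 + diff)


def previous_candidate_share(indicator, threshold=0.5, max_share=0.75):
    """Share of the set S given to candidates taken from an earlier frame.

    Below the (assumed) threshold only the fixed Z2 candidates are used,
    so the share is 0; above it the share grows with the indicator.
    """
    if indicator < threshold:
        return 0.0
    return max_share * (indicator - threshold) / (1.0 - threshold)


steady = [1.0, -1.0, 1.0, -1.0]
louder = [3.0, -3.0, 3.0, -3.0]
print(previous_candidate_share(stationarity_indicator(steady, steady)))
print(previous_candidate_share(stationarity_indicator(steady, louder)))
```

A steady signal yields the largest share of carried-over candidates, while an abrupt power change drops the share to zero, which corresponds to falling back to the Z2 candidates alone.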

9. Coding method according to claim 1,
wherein the sample sequence encoding step includes a step of outputting whichever of the following has the smaller code amount: the code sequence obtained by encoding the sample sequence before reordering, or the code sequence obtained by encoding the reordered sample sequence together with the additional information.

10. Coding method according to claim 1,
wherein the sample sequence encoding step
outputs the code sequence obtained by encoding the reordered sample sequence, together with the additional information, when the sum of the code amount (or estimated code amount) of the code sequence obtained by encoding the reordered sample sequence and the code amount of the additional information is less than the code amount (or estimated code amount) of the code sequence obtained by encoding the sample sequence before reordering, and
outputs the code sequence obtained by encoding the sample sequence before reordering when the code amount (or estimated code amount) of the code sequence obtained by encoding the sample sequence before reordering is less than the sum of the code amount (or estimated code amount) of the code sequence obtained by encoding the reordered sample sequence and the code amount of the additional information.
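
The selection rule of this claim can be sketched as a comparison of code amounts. This is an illustration under assumptions, not the claimed implementation: code sequences are modelled as strings of '0' and '1' so that the code amount is the string length, the function name `choose_output` is hypothetical, and a tie (which the claim leaves open) is resolved in favour of the sequence encoded without reordering.

```python
def choose_output(code_plain, code_reordered, side_info):
    """Pick the cheaper alternative, measuring code amount in bits.

    code_plain     -- bits from encoding the sample sequence before reordering
    code_reordered -- bits from encoding the reordered sample sequence
    side_info      -- bits of the additional information (the encoded interval T)
    """
    cost_reordered = len(code_reordered) + len(side_info)
    cost_plain = len(code_plain)
    if cost_reordered < cost_plain:
        return "reordered", code_reordered + side_info
    # Assumption: on a tie, keep the sequence encoded before reordering.
    return "plain", code_plain


print(choose_output("110100", "101", "01"))
print(choose_output("1101", "101", "01"))
```

Because the additional information is counted on the reordered side, reordering is selected only when its saving exceeds the cost of transmitting the interval T.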

11. Coding method according to claim 9 or 10,
wherein the proportion, in the set S, of candidates subjected to the interval determination step in the frame a predetermined number of frames before the current frame is greater when the code sequence output in the immediately preceding frame is a code sequence obtained by encoding the reordered sample sequence than when the code sequence output in the immediately preceding frame is a code sequence obtained by encoding the sample sequence before reordering.

12. Coding method according to claim 9 or 10,
wherein, when the code sequence output in the immediately preceding frame is a code sequence obtained by encoding the sample sequence before reordering, the set S includes only the Z2 candidates.

13. Coding method according to claim 9 or 10,
wherein, when the current frame is the temporally first frame, or when the immediately preceding frame was encoded by an encoding method different from this coding method, or when the code sequence output in the immediately preceding frame is a code sequence obtained by encoding the sample sequence before reordering, the set S includes only the Z2 candidates.

14. A method of determining a periodic feature value of an audio signal in frames, the method comprising:
a periodic feature value determination step of determining the periodic feature value of the audio signal from a set of candidates for the periodic feature value on a frame-by-frame basis; and
an additional information generating step of encoding the periodic feature value obtained at the periodic feature value determination step to obtain additional information;
wherein the periodic feature value determination step determines the periodic feature value from a set S of candidates for the periodic feature value, the set S consists of Y candidates among Z candidates for the periodic feature value, and the Y candidates include Z2 candidates selected without depending on a candidate subjected to the periodic feature value determination step in the frame a predetermined number of frames before the current frame, and include a candidate subjected to the periodic feature value determination step in the frame a predetermined number of frames before the current frame, the Z candidates being representable with the additional information, where Z2<Z and Y<Z;
and wherein the periodic feature value is the fundamental frequency or the pitch period of the audio signal.
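
How the set S might be assembled from the Z2 fixed candidates and the candidates carried over from an earlier frame can be sketched as follows. This is only an illustration under assumptions: the function name `build_candidate_set`, the ordering (fixed candidates first), and the handling of duplicates are hypothetical, and the sketch returns at most Y candidates rather than exactly Y.

```python
def build_candidate_set(all_candidates, fixed_candidates, previous_candidates, y):
    """Assemble the set S for the current frame.

    all_candidates      -- the Z values representable by the additional information
    fixed_candidates    -- the Z2 values chosen independently of earlier frames
    previous_candidates -- values examined in the frame a predetermined
                           number of frames before the current frame
    y                   -- maximum size Y of the set S
    """
    z = len(all_candidates)
    if not (len(fixed_candidates) < z and y < z):
        raise ValueError("requires Z2 < Z and Y < Z")
    allowed = set(all_candidates)
    s = []
    for c in list(fixed_candidates) + list(previous_candidates):
        # Skip values that cannot be signalled, duplicates, and overflow.
        if c in allowed and c not in s and len(s) < y:
            s.append(c)
    return s


all_values = list(range(2, 34))      # Z = 32 representable values
fixed = [4, 8, 16, 24]               # Z2 = 4
previous = [10, 11, 8]               # examined in the earlier frame
print(build_candidate_set(all_values, fixed, previous, y=6))
```

Searching only this reduced set instead of all Z values is what keeps the per-frame search cost low, while the additional information can still represent any of the Z values.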

15. The method of determining a periodic feature value according to claim 14,
wherein the periodic feature value determination step further comprises an adding step of adding to the set S values adjacent to a candidate subjected to the periodic feature value determination step in the frame a predetermined number of frames before the current frame, and/or values having a predetermined difference from that candidate.

16. The method of determining a periodic feature value according to claim 14,
wherein the greater the indicator of the degree of stationarity of the audio signal in the current frame, the greater the proportion, in the set S, of candidates subjected to the periodic feature value determination step in the frame a predetermined number of frames before the current frame.

17. The method of determining a periodic feature value according to claim 16,
wherein, when the indicator of the degree of stationarity of the audio signal in the current frame is less than a predetermined threshold, only the Z2 candidates are included in the set S.

18. The method of determining a periodic feature value according to claim 16 or 17,
wherein the indicator of the degree of stationarity of the audio signal in the current frame increases when at least one of the following conditions is satisfied:
(a-1) the "prediction gain of the audio signal in the current frame" increases,
(a-2) the "estimated prediction gain of the audio signal in the current frame" increases,
(b-1) the difference between the "prediction gain of the audio signal in the frame immediately preceding the current frame" and the "prediction gain of the audio signal in the current frame" decreases,
(b-2) the difference between the "estimated prediction gain in the immediately preceding frame" and the "estimated prediction gain in the current frame" decreases,
(c-1) the "sum of the amplitudes of the samples of the audio signal included in the current frame" increases,
(c-2) the "sum of the amplitudes of the samples included in the sample sequence obtained by converting the sample sequence of the audio signal included in the current frame into the frequency domain" increases,
(d-1) the difference between the "sum of the amplitudes of the samples of the audio signal included in the immediately preceding frame" and the "sum of the amplitudes of the samples of the audio signal included in the current frame" decreases,
(d-2) the difference between the "sum of the amplitudes of the samples included in the sample sequence obtained by converting the sample sequence of the audio signal included in the immediately preceding frame into the frequency domain" and the "sum of the amplitudes of the samples included in the sample sequence obtained by converting the sample sequence of the audio signal included in the current frame into the frequency domain" decreases,
(e-1) the "power of the audio signal in the current frame" increases,
(e-2) the "power of the sample sequence obtained by converting the sample sequence of the audio signal in the current frame into the frequency domain" increases,
(f-1) the difference between the "power of the audio signal in the immediately preceding frame" and the "power of the audio signal in the current frame" decreases, and
(f-2) the difference between the "power of the sample sequence obtained by converting the sample sequence of the audio signal in the immediately preceding frame into the frequency domain" and the "power of the sample sequence obtained by converting the sample sequence of the audio signal in the current frame into the frequency domain" decreases.

19. An encoder for encoding a sample sequence in the frequency domain that is derived from an audio signal in frames, the encoder comprising:
an interval determination unit that determines an interval T between samples from a set S of candidates for the interval T, the interval T corresponding to a periodicity of the audio signal or to an integer multiple of the fundamental frequency of the audio signal;
an additional information generating unit that encodes the interval T determined by the interval determination unit to obtain additional information; and
a sample sequence encoding unit that encodes a reordered sample sequence to obtain a code sequence, wherein the reordered sample sequence
(1) includes all samples of the sample sequence, and
(2) is a sample sequence in which at least some of the samples are reordered so that all or some of one or a plurality of successive samples including the sample corresponding to the periodicity or the fundamental frequency of the audio signal in the sample sequence, and one or a plurality of successive samples including the sample corresponding to an integer multiple of the periodicity or the fundamental frequency of the audio signal in the sample sequence, are gathered together into a cluster on the basis of the interval T determined by the interval determination unit;
wherein the interval determination unit determines the interval T from the set S of candidates for the interval T, the set S consists of Y candidates among Z candidates for the interval T, and the Y candidates include Z2 candidates selected without depending on a candidate subjected to the processing of the interval determination unit in the frame a predetermined number of frames before the current frame, and include a candidate subjected to the processing of the interval determination unit in the frame a predetermined number of frames before the current frame, the Z candidates being representable with the additional information, where Z2<Z and Y<Z.
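
The reordering described in item (2) can be pictured with a small sketch. The claim does not fix the cluster width or where the cluster is placed, so this sketch assumes that the samples at indices nT-1, nT and nT+1 (n = 1, 2, ...) are gathered at the start of the sequence and that the remaining samples follow in their original order; the function name `reorder_samples` and the `half_width` parameter are hypothetical.

```python
def reorder_samples(samples, interval_t, half_width=1):
    """Gather samples around integer multiples of interval_t into one cluster.

    Returns the reordered samples and the index order used, so that the
    reordering can be undone. Every sample appears exactly once.
    """
    if interval_t < 1:
        raise ValueError("interval_t must be a positive integer")
    n = len(samples)
    picked, seen = [], set()
    multiple = interval_t
    while multiple - half_width < n:
        # Collect the neighbourhood of this multiple of T, skipping
        # indices outside the frame or already collected.
        for i in range(multiple - half_width, multiple + half_width + 1):
            if 0 <= i < n and i not in seen:
                seen.add(i)
                picked.append(i)
        multiple += interval_t
    rest = [i for i in range(n) if i not in seen]
    order = picked + rest
    return [samples[i] for i in order], order


reordered, order = reorder_samples(list(range(12)), interval_t=4)
print(order)
```

Returning the index order alongside the samples makes it explicit that the reordering is a permutation, which is what item (1) requires.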

20. The encoder according to claim 19,
wherein the sample sequence encoding unit
outputs the code sequence obtained by encoding the reordered sample sequence, together with the additional information, when the sum of the code amount (or estimated code amount) of the code sequence obtained by encoding the reordered sample sequence and the code amount of the additional information is less than the code amount (or estimated code amount) of the code sequence obtained by encoding the sample sequence before reordering, and
outputs the code sequence obtained by encoding the sample sequence before reordering when the code amount (or estimated code amount) of the code sequence obtained by encoding the sample sequence before reordering is less than the sum of the code amount (or estimated code amount) of the code sequence obtained by encoding the reordered sample sequence and the code amount of the additional information.

21. A device for determining a periodic feature value of an audio signal in frames, the device comprising:
a periodic feature value determination unit that determines the periodic feature value of the audio signal from a set of candidates for the periodic feature value on a frame-by-frame basis; and
an additional information generating unit that encodes the periodic feature value obtained by the periodic feature value determination unit to obtain additional information;
wherein the periodic feature value determination unit determines the periodic feature value from a set S of candidates for the periodic feature value, the set S consists of Y candidates among Z candidates for the periodic feature value, and the Y candidates include Z2 candidates selected without depending on a candidate subjected to the processing of the periodic feature value determination unit in the frame a predetermined number of frames before the current frame, and include a candidate subjected to the processing of the periodic feature value determination unit in the frame a predetermined number of frames before the current frame, the Z candidates being representable with the additional information, where Z2<Z and Y<Z;
and wherein the periodic feature value is the fundamental frequency or the pitch period of the audio signal.

22. A computer-readable recording medium having recorded thereon a computer program for causing a computer to execute the steps of the coding method according to claim 1 or of the method of determining a periodic feature value according to claim 14.



 

Similar patents:

FIELD: radio engineering, communication.

SUBSTANCE: invention relates to bandwidth expansion devices. An excitation signal based on an acoustic signal is generated; with that, the acoustic signal includes a variety of frequency components. A feature vector is distinguished out of the acoustic signal; with that, the feature vector includes at least one feature of a component in a frequency domain and at least one feature of a component in a time domain. At least one parameter of the spectrum shape is determined based on the feature vector; with that, at least one parameter of the spectrum shape corresponds to a sub-range signal containing frequency components that belong to an additional variety of frequency components. A signal of the sub-range is generated by the filtration of an excitation signal by means of a filter bank and weighing of a filtered excitation signal using at least one parameter of the spectrum shape.

EFFECT: technical result consists in the improvement of perception of an expanded acoustic signal.

21 cl, 10 dwg

FIELD: physics, acoustics.

SUBSTANCE: invention relates to systems for encoding audio signal sources. Provided is subband block-based harmonic transposition, where the time block of complex discrete values of subbands is processed by common phase modification. Superposition of multiple modified discrete values yields the resultant effect of limiting undesirable cross products, making it possible to use coarser frequency resolution and/or lesser degree of oversampling. In one embodiment, the invention further includes a window function suitable for use with cross product-enhanced, subband block-based HFR. A hardware embodiment may include an analysing filter unit (101), a control data-configurable subband processing module (102) and a synthesising filter unit (103).

EFFECT: efficient implementation of high-frequency reconstruction (HFR) through enhancement with cross products, where a new component with frequency QΩ+rΩ0 is generated based on existing components with frequencies Ω and Ω+Ω0.

63 cl, 9 dwg

FIELD: physics, computer engineering.

SUBSTANCE: present invention relates to signal processing means. An encoder sets an interval including 16 frames as interval section to be processed, outputs high-frequency band encoded data to obtain the high-frequency band component of an input signal and low-frequency band encoded data obtained by encoding the low-frequency band signal of the input signal for each section to be processed. In this case, for each frame, a coefficient used in estimation of the high-frequency band component is selected and the section to be processed is divided into continuous frame segments including continuous frames from which the coefficient with the same section to be processed is selected. In addition, high-frequency band encoded data are produced which include data including information indicating the length of each continuous frame segment, information indicating the number of continuous frame segments included in the section to be processed and a coefficient index indicating the coefficient selected in each continuous frame segment.

EFFECT: improved sound quality with frequency band expansion.

23 cl, 51 dwg

FIELD: radio engineering, communication.

SUBSTANCE: invention relates to signal processing means. The system receives an encoded low-frequency band signal and encoded energy information used for frequency shift of the encoded low-frequency band signal. The low-frequency band signal is decoded and energy suppression of the decoded signal is smoothed. The smoothed low-frequency band signal is frequency shifted to generate a high-frequency band signal. The low-frequency band signal and the high-frequency band signal are then merged and output.

EFFECT: high quality of the decoded signal.

20 cl, 14 dwg

FIELD: physics, computer engineering.

SUBSTANCE: present invention relates to means of encoding and decoding. An envelope predistortion link predistorts an envelope. A noise shaping link divides the predistorted envelope formed by envelope predistortion by a value greater than 1 and subtracts from the division result a noise shaping signal determined by information. A sampling link sets the subtraction result as a number of sampling bits and, based on said number of sampling bits, samples a normalised spectrum formed by spectrum normalisation. A multiplexing link multiplexes the information, sampled spectrum, formed by sampling the normalised spectrum, and the envelope. The present invention can be applied, for example, to an encoding device which encodes an audio signal.

EFFECT: improved audio quality due to encoding audio signals.

14 cl, 31 dwg

FIELD: physics, computer engineering.

SUBSTANCE: invention relates to radio engineering and is intended for controlling an audio signal, including a transient event. The device comprises a unit for replacing a transient signal, configured to replace the transient part of a signal, which includes a transient event of an audio signal, with part of a replacement signal adapted to energy characteristics of the signal of one or more transient parts of the audio signal, or to the energy characteristic of the signal of the transient part of the signal to obtain an audio signal with a shorter transient process. The device also includes a signal processor configured to process an audio signal with a shorter transient process to obtain a processed version of the audio signal with a shorter transient process. The device also includes a transient signal inserting unit configured to merge the processed version of the audio signal with a shorter transient process with the transient signal, representing in the original or processed form the transient content of the transient part of the signal.

EFFECT: high accuracy of reproducing the signal.

14 cl, 20 dwg

FIELD: radio engineering, communication.

SUBSTANCE: invention relates to communication engineering. An audio decoder for providing decoded audio information based on encoded audio information includes a window application-based signal converter formed to map a frequency-time presentation, which is described by the encoded audio information, to a time interval presentation. The window application-based signal converter is formed to select one of a plurality of windows, which include windows of different transition inclinations and windows of different conversion lengths based on window information. The audio decoder includes a window selector formed to evaluate window information of a variable-length code word for selecting a window for processing said part of the frequency-time presentation associated with said audio information frame.

EFFECT: eliminating artefacts arising when processing time-limited frames.

15 cl, 23 dwg

FIELD: physics, computer engineering.

SUBSTANCE: group of inventions relates to means of encoding and decoding a signal. The encoder comprises a first layer encoding section which encodes an input signal in a low-frequency range below a predetermined frequency. First encoded information is generated. The first encoded information is decoded to generate a decoded signal. The input signal is broken down in a high-frequency range above a predetermined frequency into a plurality of frequency subbands. A spectrum component is partially selected in each frequency subband. An amplitude adjustment parameter is calculated, which is used to adjust the amplitude of the selected spectrum component in order to generate second encoding information.

EFFECT: high efficiency of encoding spectral data of a high-frequency part and high quality of the decoded signal.

14 cl, 15 dwg

FIELD: physics, video.

SUBSTANCE: invention relates to a method and an apparatus for improving audio and video encoding. A signal is processed using DCTIV for each block of samples of said signal (x(k)), wherein integer transform is carried out using lifting steps which represent sub-steps of said DCTIV. Integer transform of said sample blocks using lifting steps and adaptive noise shaping is performed for at least some of said lifting steps, said transform providing corresponding blocks of transform coefficients and noise shaping being performed such that rounding noise from low-level magnitude transform coefficients in a current one of said transformed blocks is decreased whereas rounding noise from high-level magnitude transform coefficients in said current transformed block is increased, and wherein filter coefficients (h(k)) of a corresponding noise shaping filter are derived from said audio or video signal samples on a frame-by-frame basis.

EFFECT: optimising rounding error noise distribution in an integer-reversible transform (DCTIV).

26 cl, 13 dwg

FIELD: technologies for encoding audio signals.

SUBSTANCE: method for generating of high-frequency restored version of input signal of low-frequency range via high-frequency spectral restoration with use of digital system of filter banks is based on separation of input signal of low-frequency range via bank of filters for analysis to produce complex signals of sub-ranges in channels, receiving a row of serial complex signals of sub-ranges in channels of restoration range and correction of enveloping line for producing previously determined spectral enveloping line in restoration range, combining said row of signals via synthesis filter bank.

EFFECT: higher efficiency.

4 cl, 5 dwg

FIELD: analysis of sound signal quality, possible use for estimating quality of speech transferred through radio communication channels.

SUBSTANCE: according to the method for machine estimation of sound signal quality, the signal is divided into critical bands and spectral energy values are computed for the critical bands, spectral similarity values are determined for the active phase of the fragments, and the quality of the tested sound signal is determined as a weighted linear combination of the quality values for each phase. The method differs in that selected fragments of the active and inactive phases of both signals are synchronized, inactive-phase spectra are determined for each fragment, the resulting spectra of the active and inactive phases of the fragments are divided into additional sets of bands, for each of which spectral energy values are computed, the resulting spectral energies of the active and inactive fragment phases are compared in pairs to determine spectral similarity coefficients, and the resulting similarity coefficient for each phase is determined as the average of the similarity coefficients over all sets of bands, which is the quality estimate for that phase.

EFFECT: ensured universality and optimized quality of estimation process depending on purposes of estimation.

5 cl, 13 dwg, 6 tbl

FIELD: method for transmitting audio signals between transmitter and at least one receiver using priority pixel transmission method.

SUBSTANCE: according to the invention, an audio signal is separated into a number n of spectral components; the separated audio signal is stored in a two-dimensional matrix with a set of fields, with frequency and time as dimensions and amplitude as the value recorded in each field; groups are then formed from each individual field and at least two adjacent fields, and a priority is assigned to each group, the priority of a group being higher the higher the amplitudes of the group's values and/or the greater the amplitude differences between the values of the group and/or the closer the group is to the current time; the groups are transmitted to the receiver in order of priority.

EFFECT: ensured transmission of audio signals without losses even when the width of transmission band is low.

7 cl, 1 dwg

FIELD: physics.

SUBSTANCE: said utility invention relates to audio coders, in particular, to audio coders, in which time representation is converted into spectral representation. The essence of the invention is as follows: for the determination of the quantiser step for the quantisation of a signal containing audio or video information, the first quantiser step value is generated, along with the interference threshold. After that, the interference actually introduced due to the first quantiser step value is calculated and compared to the interference threshold. In spite of the fact that the comparison indicates that the actually introduced interference exceeds the threshold, the second, coarser quantiser step value is applied, which is then used for the quantisation if it is found that the interference introduced due to the coarser quantisation step value is less than the threshold or the interference introduced due to the first quantiser step value.

EFFECT: result of invention implementation is that quantisation interference decreases due to selection of coarser quantisation step value and resulting increased compression benefit.

10 cl, 5 dwg

FIELD: physics.

SUBSTANCE: device for multichannel signal processing includes comparator between the first and the second of the two channels. Additionally, a filter of spectral ratio prediction is provided to perform filtration of prediction only with one prediction filter for both channels in case of high similarity between the first and the second channel and filtration of prediction with two separate prediction filters in case of distinction between the first and the second channel.

EFFECT: increased efficiency of coding with technology of stereosignal coding.

12 cl, 3 dwg

Audio coding // 2335809

FIELD: physics.

SUBSTANCE: substance of invention implies exclusion of common procedure of interpolation relative to filter factors and gain value for interpolated intermediate audio values, and coding can be performed not by gain value interpolation, but by power limit calculated by masking threshold value rather as area lower than square of masking threshold value for each knot, i.e. for each transferred parameterisation followed by interpolation between these power limits in adjacent knots, e.g. by linear interpolation. Both on coder side, and on decoder side, gain value can be calculated then by intermediate power limit calculated so that quantising noise caused by fixed frequency quantisation prior to followed filtration on decoder side is lower than power limit or corresponded thereto after followed filtration.

EFFECT: provided coding of tapped audio noise reduction.

16 cl, 15 dwg

FIELD: physics.

SUBSTANCE: to determine an estimate of the information units required to encode a signal, in addition to the permissible interference for a frequency band and the power of the frequency band, a value (nl(b)) characterizing the power distribution within the frequency band is also taken into account.

EFFECT: improved precision of assessed value of information unit necessity, allowing more precise and efficient signal encoding.

11 cl, 10 dwg

FIELD: information technology.

SUBSTANCE: in the audio encoder, cue codes are generated for one or more audio channels, where an envelope cue code is generated by characterizing the temporal envelope in an audio channel. In the audio decoder, E transmitted audio channel(s) are decoded to generate C playback audio channels, where C>E≥1. The received cue codes include an envelope cue code corresponding to the characterized temporal envelope of the audio channel corresponding to the transmitted channel(s). One or more transmitted channels are upmixed to generate one or more upmixed channels. One or more playback channels are synthesized by applying the cue codes to the one or more upmixed channels, where the envelope cue code is applied to an upmixed channel or to a synthesized signal to adjust the temporal envelope of the synthesized signal on the basis of the characterized temporal envelope, so that the adjusted temporal envelope substantially matches the characterized temporal envelope.

EFFECT: widening the arsenal of resources for coding audio-channels.

42 cl, 27 dwg

FIELD: information technologies.

SUBSTANCE: invention is related to method of sound signal coding support, in which at least one segment of sound signal should be coded with the help of coding model, which makes it possible to use different durations of coding frame, according to which it is suggested to define at least one control parameter on the basis of sound signal characteristics. Then this control parameter is used for limitation of versions of possible frame durations selection in respect to at least one segment of signal. Group of inventions also comprises module (10, 11), in which this method is realised, device (1) and system, which comprise such module (10, 11), and also software product, which includes program code for realisation of suggested method.

EFFECT: presentation of possibility of simple selection of corresponding most suitable duration of coding frame.

34 cl, 4 dwg

FIELD: physics; computer engineering.

SUBSTANCE: invention relates to computer engineering and can be used in sound encoding devices. The method involves the following: an information amplitude signal is decomposed to spectral lines, where each spectral line contains a sequence of spectral values, presented in an x-bit presentation without taking the logarithm; each spectral value is squared for each spectral group, and the squared spectral values are summed up to obtain a sum of squares as the result of calculation in the presentation without taking the logarithm, wherein presentation of the calculation result without taking the logarithm is scaled by an effective scaling factor compared to the sum of squares; for each calculation result, a logarithmic function is applied to y bits of presenting the result without taking the logarithm in order to obtain a scaled presentation of the calculation result with taking the logarithm, where y is less than x multiplied by 2; and compensation factor is added or subtracted for each presentation of scaling with taking the logarithm to the scaled logarithmic presentation or from it, respectively, where the value which corresponds to the logarithmic function is applied to the effective scaling coefficient to obtain presentation of the calculation result with taking the logarithm of the energy of the signal of the corresponding spectral group and so that values of energy of the signal in spectral groups have the same degree of scaling.

EFFECT: easy calculation and/or possibility of calculating with low expenses on equipment.

22 cl, 8 dwg
