Encoder, decoder and method therefor

FIELD: physics, computer engineering.

SUBSTANCE: the group of inventions relates to means of encoding and decoding a signal. The encoder comprises a first layer encoding section which encodes the low-frequency part of an input signal, below a predetermined frequency, to generate first encoded information. The first encoded information is decoded to generate a decoded signal. The high-frequency part of the input signal, above the predetermined frequency, is divided into a plurality of frequency subbands. Spectral components are partially selected in each subband, and an amplitude adjustment parameter used to adjust the amplitude of the selected spectral components is calculated in order to generate second encoded information.

EFFECT: high efficiency of encoding spectral data of a high-frequency part and high quality of the decoded signal.

14 cl, 15 dwg

 

Technical field

The present group of inventions relates to an encoding device, a decoding device and methods therefor, which are used in a communication system that encodes and transmits a signal.

Background art

When speech or audio signals are transmitted over a packet communication system, as typified by Internet communication, or over a mobile communication system, compression and coding techniques are often used to increase the transmission efficiency of the speech and audio signals. In recent years, in addition to coding speech and audio signals at low bit rates, there has been a growing demand for techniques that encode speech and audio signals over a wider frequency band.

To meet this demand, various techniques have been developed to encode wideband speech and audio signals without significantly increasing the amount of encoded information. For example, according to the technique disclosed in Patent Document 1, the encoding device calculates a parameter for generating the spectrum of the high-frequency part of the spectral data obtained by transforming an input acoustic signal of a fixed duration, and outputs this parameter together with the encoded information of the low-frequency part. Specifically, the encoding device divides the spectral data of the high-frequency part into a plurality of subbands and, for each subband, calculates a parameter that specifies the portion of the low-frequency spectrum most similar to the spectrum of that subband. The encoding device then adjusts the most similar low-frequency spectrum using two types of scaling factors so that the peak amplitude or the energy of the subband (hereinafter, "subband energy") and the shape of the generated high-frequency spectrum become close to the peak amplitude, the subband energy and the shape of the target spectrum, namely the high-frequency part of the input signal.

Citation list

Patent documents

PTL 1: International Publication No. WO 2007/052088

Summary of the invention

Technical problem

However, according to the above-described Patent Document 1, when generating the high-frequency spectrum, the encoding device applies a logarithmic transform to all samples (MDCT coefficients) of the spectral data of the input signal and of the generated high-frequency spectral data. The encoding device then calculates the parameter so that the subband energy and shape become close to the peak amplitude, subband energy and shape of the target spectrum of the input signal. The amount of arithmetic operations in the encoding device is therefore very large. Additionally, the encoding device applies the calculated parameter to all samples in a subband and does not take the magnitudes of the amplitudes of individual samples into account. As a result, the amount of arithmetic operations in the encoding device when generating the high-frequency spectrum with the calculated parameter also becomes very large. Moreover, the quality of the generated decoded speech is insufficient, and abnormal sound may be generated in some situations.

Therefore, the object of the present invention is to provide an encoding device, a decoding device and methods therefor that can efficiently encode the spectral data of the high-frequency part of a wideband signal on the basis of the spectral data of its low-frequency part, and that can improve the quality of the decoded signal.

Solution to the problem

The encoding device of the present invention includes: first encoding means for generating first encoded information by encoding the low-frequency part of an input signal, at or below a predetermined frequency; decoding means for generating a decoded signal by decoding the first encoded information; and second encoding means for generating second encoded information by dividing the high-frequency part of the input signal, above the predetermined frequency, into a plurality of subbands, estimating each of the subbands from the input signal or the decoded signal, partially selecting spectral components in each subband, and calculating an amplitude adjustment parameter that adjusts the amplitude of the selected spectral components.

The decoding device of the present invention includes: receiving means for receiving first encoded information and second encoded information generated by an encoding device, the first encoded information being obtained by encoding the low-frequency part of an input signal, at or below a predetermined frequency, and the second encoded information being generated by dividing the high-frequency part of the input signal, above the predetermined frequency, into a plurality of subbands, estimating each of the subbands from the input signal or from a first decoded signal obtained by decoding the first encoded information, partially selecting spectral components in each subband, and calculating an amplitude adjustment parameter that adjusts the amplitude of the selected spectral components; first decoding means for generating a second decoded signal by decoding the first encoded information; and second decoding means for generating a third decoded signal by estimating the high-frequency part of the input signal from the second decoded signal.

The encoding method of the present invention includes: a step of generating first encoded information by encoding the low-frequency part of an input signal, at or below a predetermined frequency; a step of generating a decoded signal by decoding the first encoded information; and a step of generating second encoded information by dividing the high-frequency part of the input signal, above the predetermined frequency, into a plurality of subbands, estimating each of the subbands from the input signal or the decoded signal, partially selecting spectral components in each subband, and calculating an amplitude adjustment parameter that adjusts the amplitude of the selected spectral components.

The decoding method of the present invention includes: a step of receiving first encoded information and second encoded information generated by an encoding device, the first encoded information being obtained by encoding the low-frequency part of an input signal, at or below a predetermined frequency, and the second encoded information being generated by dividing the high-frequency part of the input signal, above the predetermined frequency, into a plurality of subbands, estimating each of the subbands from the input signal or from a first decoded signal obtained by decoding the first encoded information, partially selecting spectral components in each subband, and calculating an amplitude adjustment parameter that adjusts the amplitude of the selected spectral components; a step of generating a second decoded signal by decoding the first encoded information; and a step of generating a third decoded signal by estimating the high-frequency part of the input signal from the second decoded signal.

Advantageous effects of the invention

According to the present invention, the spectral data of the high-frequency part of a wideband signal can be encoded and decoded efficiently, the amount of arithmetic operations can be reduced drastically, and the quality of the decoded signal can also be improved.

Brief description of drawings

Fig. 1 is a block diagram showing the configuration of a communication system that has the encoding device and the decoding device according to Embodiment 1 of the present invention;

Fig. 2 is a block diagram showing the main internal configuration of the encoding device shown in Fig. 1 according to Embodiment 1 of the present invention;

Fig. 3 is a block diagram showing the main internal configuration of the second layer encoding section shown in Fig. 2 according to Embodiment 1 of the present invention;

Fig. 4 is a block diagram showing the main configuration of the gain encoding section shown in Fig. 3 according to Embodiment 1 of the present invention;

Fig. 5 is a block diagram showing the main configuration of the logarithmic gain encoding section shown in Fig. 4 according to Embodiment 1 of the present invention;

Fig. 6 is a diagram explaining the details of the filtering process in the filtering section according to Embodiment 1 of the present invention;

Fig. 7 is a flowchart showing the steps of the process by which the searching section searches for optimal pitch coefficient T_p' for subband SB_p according to Embodiment 1 of the present invention;

Fig. 8 is a block diagram showing the main internal configuration of the decoding device shown in Fig. 1 according to Embodiment 1 of the present invention;

Fig. 9 is a block diagram showing the main internal configuration of the second layer decoding section shown in Fig. 8 according to Embodiment 1 of the present invention;

Fig. 10 is a block diagram showing the main internal configuration of the spectrum adjusting section shown in Fig. 9 according to Embodiment 1 of the present invention;

Fig. 11 is a block diagram showing the main internal configuration of the logarithmic gain decoding section shown in Fig. 10 according to Embodiment 1 of the present invention;

Fig. 12 is a block diagram showing the main internal configuration of the second layer encoding section according to Embodiment 2 of the present invention;

Fig. 13 is a block diagram showing the main internal configuration of the gain encoding section shown in Fig. 12 according to Embodiment 2 of the present invention;

Fig. 14 is a block diagram showing the main internal configuration of the logarithmic gain encoding section shown in Fig. 13 according to Embodiment 2 of the present invention; and

Fig. 15 is a block diagram showing the main internal configuration of the logarithmic gain decoding section according to Embodiment 2 of the present invention.

Detailed description of embodiments

The main feature of the present invention is that, when the encoding device generates the spectral data of the high-frequency part of the signal to be encoded on the basis of the spectral data of the low-frequency part, it calculates the parameters for adjusting subband energy and shape from a group of samples extracted on the basis of the position of the sample with the maximum amplitude in each subband. Another main feature is that the decoding device applies the calculated parameters to a group of samples extracted on the basis of the position of the sample with the maximum amplitude in each subband. With these features, the spectral data of the high-frequency part of a wideband signal can be encoded and decoded efficiently, the amount of arithmetic operations can be reduced drastically, and the quality of the decoded signal can also be improved.

Embodiments of the present invention are explained in detail below with reference to the drawings. A speech encoding device and a speech decoding device are explained as examples of the encoding device and the decoding device according to the present invention.

Embodiment 1

Fig. 1 is a block diagram showing the configuration of a communication system that has the encoding device and the decoding device according to Embodiment 1 of the present invention. In Fig. 1, the communication system includes encoding device 101 and decoding device 103, which can communicate with each other through transmission channel 102. Encoding device 101 and decoding device 103 are typically mounted in a base station device, a communication terminal device, or the like.

Encoding device 101 divides the input signal into blocks of N samples (N is a natural number) and encodes each frame, with N samples forming one frame. The input signal to be encoded is expressed as x_n (n = 0, ..., N-1), where n denotes the (n+1)-th signal element of the input signal divided into blocks of N samples. Encoding device 101 transmits the encoded input information (encoded information) to decoding device 103 through transmission channel 102.

Decoding device 103 receives the encoded information transmitted from encoding device 101 through transmission channel 102.

Fig. 2 is a block diagram showing the main internal configuration of encoding device 101 shown in Fig. 1. When the sampling frequency of the input signal is SR_1, down-sampling processing section 201 down-samples the input signal from SR_1 to SR_2 (SR_2 < SR_1) and outputs the down-sampled input signal to first layer encoding section 202 as the input signal after down-sampling. The operation is explained below taking as an example the case where SR_2 is 1/2 of SR_1.

First layer encoding section 202 generates first layer encoded information by encoding the input signal after down-sampling, which is input from down-sampling processing section 201, using, for example, a CELP (Code Excited Linear Prediction) speech encoding method. Specifically, first layer encoding section 202 generates the first layer encoded information by encoding the low-frequency part of the input signal, at or below a predetermined frequency. First layer encoding section 202 outputs the generated first layer encoded information to first layer decoding section 203 and encoded information multiplexing section 207.

First layer decoding section 203 generates a first layer decoded signal by decoding the first layer encoded information input from first layer encoding section 202, using, for example, a CELP speech decoding method. First layer decoding section 203 outputs the generated first layer decoded signal to up-sampling processing section 204.

Up-sampling processing section 204 up-samples the first layer decoded signal input from first layer decoding section 203 from sampling frequency SR_2 to SR_1, and outputs the up-sampled first layer decoded signal to orthogonal transform processing section 205 as the first layer decoded signal after up-sampling.

Orthogonal transform processing section 205 has internal buffers buf1_n and buf2_n (n = 0, ..., N-1) and applies a modified discrete cosine transform (MDCT) to input signal x_n and to first layer decoded signal y_n after up-sampling, which is input from up-sampling processing section 204.

The orthogonal transform processing in orthogonal transform processing section 205 is explained below in terms of its calculation steps and the data output to the internal buffers.

First, orthogonal transform processing section 205 initializes buffers buf1_n and buf2_n to the initial value 0 according to the following equations 1 and 2.

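$$\mathrm{buf1}_n = 0 \quad (n = 0, \ldots, N-1) \tag{1}$$
$$\mathrm{buf2}_n = 0 \quad (n = 0, \ldots, N-1) \tag{2}$$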

Then, orthogonal transform processing section 205 applies the MDCT to input signal x_n and to first layer decoded signal y_n after up-sampling according to the following equations 3 and 4, and obtains the MDCT coefficients of the input signal (hereinafter, "input spectrum") S2(k) and the MDCT coefficients of first layer decoded signal y_n after up-sampling (hereinafter, "first layer decoded spectrum") S1(k).

(Equation 3)
(Equation 4)

In the above equations, k denotes the index of each sample in one frame. Orthogonal transform processing section 205 obtains x_n' as the vector that combines input signal x_n and buffer buf1_n according to the following equation 5. Orthogonal transform processing section 205 likewise obtains y_n' as the vector that combines first layer decoded signal y_n after up-sampling and buffer buf2_n according to the following equation 6.

(Equation 5)
(Equation 6)

Then, orthogonal transform processing section 205 updates buffers buf1_n and buf2_n according to equations 7 and 8.

(Equation 7)
(Equation 8)

Orthogonal transform processing section 205 outputs input spectrum S2(k) and first layer decoded spectrum S1(k) to second layer encoding section 206.

This completes the explanation of the orthogonal transform processing in orthogonal transform processing section 205.
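As a non-limiting illustration, the buffer handling and transform described above may be sketched in Python as follows. The sketch assumes the textbook MDCT definition without windowing or scaling, assumes that the combined vector places the buffer contents before the current frame, and assumes that the buffer is updated with the current frame. The names `mdct` and `OrthogonalTransformSketch` are hypothetical, and the exact forms of equations 3 to 8 may differ.

```python
import math


def mdct(frame_2n):
    """Textbook MDCT: 2N time samples -> N coefficients (no window, no scaling)."""
    n_half = len(frame_2n) // 2
    return [
        sum(
            frame_2n[n]
            * math.cos(math.pi / n_half * (n + 0.5 + n_half / 2) * (k + 0.5))
            for n in range(2 * n_half)
        )
        for k in range(n_half)
    ]


class OrthogonalTransformSketch:
    """One buffer per signal; the text uses buf1 for x_n and buf2 for y_n."""

    def __init__(self, n):
        self.n = n
        self.buf = [0.0] * n  # zero initial values, as in equations 1 and 2

    def process(self, x):
        if len(x) != self.n:
            raise ValueError("frame must contain exactly N samples")
        combined = self.buf + list(x)  # assumed order: buffer first, then frame
        spectrum = mdct(combined)      # N coefficients from 2N samples
        self.buf = list(x)             # assumed update: keep the current frame
        return spectrum
```

One instance would be kept for input signal x_n and a second instance for first layer decoded signal y_n, so that each signal has its own buffer.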

Second layer encoding section 206 generates second layer encoded information using input spectrum S2(k) and first layer decoded spectrum S1(k), which are input from orthogonal transform processing section 205, and outputs the generated second layer encoded information to encoded information multiplexing section 207. Details of second layer encoding section 206 are described below.

Encoded information multiplexing section 207 multiplexes the first layer encoded information input from first layer encoding section 202 and the second layer encoded information input from second layer encoding section 206, adds a transmission error code or the like to the multiplexed information source code if necessary, and outputs the result to transmission channel 102 as encoded information.

Next, the main internal configuration of second layer encoding section 206 shown in Fig. 2 is explained with reference to Fig. 3.

Second layer encoding section 206 includes band dividing section 260, filter state setting section 261, filtering section 262, searching section 263, pitch coefficient setting section 264, gain encoding section 265 and multiplexing section 266, which perform the following operations.

Band dividing section 260 divides the high-frequency part (FL ≤ k < FH), above the predetermined frequency, of input spectrum S2(k) input from orthogonal transform processing section 205 into P subbands SB_p (p = 0, 1, ..., P-1), where P is an integer greater than 1. Band dividing section 260 outputs bandwidth BW_p (p = 0, 1, ..., P-1) and leading index BS_p (i.e., the start position of the subband) (p = 0, 1, ..., P-1; FL ≤ BS_p < FH) of each divided subband, as band division information, to filtering section 262, searching section 263 and multiplexing section 266. Hereinafter, the part of input spectrum S2(k) corresponding to subband SB_p is referred to as subband spectrum S2_p(k) (BS_p ≤ k < BS_p + BW_p).
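As a non-limiting illustration, the band division information may be computed as in the following Python sketch. The text does not state how the bandwidths BW_p are chosen, so the sketch assumes subbands of nearly equal width; the function name `divide_band` is hypothetical.

```python
def divide_band(fl, fh, p):
    """Split the index range FL <= k < FH into P contiguous subbands.

    Returns a list of (BS_p, BW_p) pairs: leading index and bandwidth.
    Assumption: widths are as equal as possible, wider subbands first.
    """
    if p < 1 or fh - fl < p:
        raise ValueError("need at least one sample per subband")
    base, extra = divmod(fh - fl, p)
    bands = []
    start = fl
    for i in range(p):
        width = base + (1 if i < extra else 0)
        bands.append((start, width))
        start += width
    return bands
```

For example, under these assumptions `divide_band(320, 640, 4)` gives four subbands of 80 samples each, starting at indices 320, 400, 480 and 560.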

Filter state setting section 261 sets first layer decoded spectrum S1(k) (0 ≤ k < FL), input from orthogonal transform processing section 205, as the filter state used by filtering section 262. That is, first layer decoded spectrum S1(k) is stored as the internal state (filter state) in the band 0 ≤ k < FL of spectrum S(k) of the entire frequency band 0 ≤ k < FH in filtering section 262.

Filtering section 262 includes a multi-tap pitch filter and filters the first layer decoded spectrum on the basis of the filter state set by filter state setting section 261, the pitch coefficient input from pitch coefficient setting section 264, and the band division information input from band dividing section 260, to calculate estimated value S2_p'(k) (BS_p ≤ k < BS_p + BW_p; p = 0, 1, ..., P-1) of each subband SB_p (hereinafter, "estimated spectrum of subband SB_p"). Filtering section 262 outputs estimated spectrum S2_p'(k) of subband SB_p to searching section 263. Details of the filtering process in filtering section 262 are described below. The number of taps of the multi-tap filter may be any integer equal to or greater than 1.

Searching section 263 calculates the degree of similarity between estimated spectrum S2_p'(k) of subband SB_p, input from filtering section 262, and each subband spectrum S2_p(k) in the high-frequency part (FL ≤ k < FH) of input spectrum S2(k), input from orthogonal transform processing section 205, on the basis of the band division information input from band dividing section 260. The degree of similarity is calculated by, for example, a correlation calculation. The processes in filtering section 262, searching section 263 and pitch coefficient setting section 264 form a closed-loop search for each subband. In each closed loop, searching section 263 calculates the degree of similarity for each pitch coefficient while pitch coefficient T, input from pitch coefficient setting section 264 to filtering section 262, is varied. In the closed loop for each subband, for example the closed loop corresponding to subband SB_p, searching section 263 finds optimal pitch coefficient T_p' (in the range Tmin to Tmax) that maximizes the degree of similarity, and outputs the P optimal pitch coefficients to multiplexing section 266. Details of the method by which searching section 263 calculates the degree of similarity are described below.

Using each optimal pitch coefficient T_p', searching section 263 determines, for each subband SB_p, the portion of the first layer decoded spectrum that is most similar to the spectrum of that subband. Additionally, searching section 263 outputs to gain encoding section 265 estimated spectrum S2_p'(k) corresponding to each optimal pitch coefficient T_p' (p = 0, 1, ..., P-1) and ideal gain α1_p, an amplitude adjustment parameter obtained when optimal pitch coefficient T_p' (p = 0, 1, ..., P-1) is calculated, which is given by equation 9. In equation 9, M' denotes the number of samples used to calculate degree of similarity D and may be any value equal to or less than the bandwidth of each subband. M' may, of course, be subband width BW_i. Details of the process by which searching section 263 searches for optimal pitch coefficient T_p' (p = 0, 1, ..., P-1) are described below.

(Equation 9)
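As a non-limiting illustration, the closed-loop search for one subband may be sketched in Python as follows. The sketch assumes the copy-type filter (β_-1, β_0, β_1) = (0.0, 1.0, 0.0), assumes a similarity of the form (cross-correlation squared divided by the energy of the estimate), and assumes a least-squares form for the ideal gain; the exact forms of degree of similarity D and of equation 9 may differ. The function names are hypothetical.

```python
def estimate_subband(state, bs, bw, t):
    """Copy-type estimate S'(k) = S(k - T) for BS <= k < BS + BW (assumed form)."""
    s = list(state)
    for k in range(bs, bs + bw):
        s[k] = s[k - t]  # may reuse samples generated earlier in this subband
    return s[bs:bs + bw]


def search_pitch(state, target, bs, bw, t_min, t_max):
    """Return (optimal T, ideal gain, estimated spectrum) for one subband.

    state  : full-band spectrum S(k) with the low band already filled in
    target : subband spectrum S2_p(k) of the input spectrum
    Requires 1 <= t_min and t_max <= bs so that k - T never goes below 0.
    """
    best = None
    for t in range(t_min, t_max + 1):
        est = estimate_subband(state, bs, bw, t)
        cross = sum(a * b for a, b in zip(target, est))
        energy = sum(b * b for b in est)
        if energy == 0.0:
            continue  # an all-zero estimate cannot be scaled to the target
        similarity = cross * cross / energy  # assumed similarity measure
        if best is None or similarity > best[0]:
            best = (similarity, t, cross / energy, est)
    if best is None:
        raise ValueError("no usable pitch coefficient in the search range")
    _, t_opt, gain, est = best
    return t_opt, gain, est
```

With a low band of (1, 2, ..., 8) and a target equal to twice the low-band samples at indices 3 to 6, the search over T = 4 to 8 selects T = 5 with an ideal gain of 2.0 under these assumptions.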

Under the control of searching section 263, pitch coefficient setting section 264 sequentially outputs pitch coefficient T to filtering section 262 while changing it little by little within the predetermined search range Tmin to Tmax, operating together with filtering section 262 and searching section 263. Pitch coefficient setting section 264 may, for example, set pitch coefficient T by changing it little by little within the predetermined search range Tmin to Tmax when performing the closed-loop search for the first subband, and, when performing the closed-loop search for the m-th subband (m = 2, 3, ..., P), that is, for the second and subsequent subbands, set pitch coefficient T by changing it little by little on the basis of the optimal pitch coefficient obtained in the closed-loop search for the (m-1)-th subband.

Gain encoding section 265 calculates, for each subband, a logarithmic gain as a parameter for adjusting the energy ratio in the nonlinear domain, on the basis of input spectrum S2(k) and of estimated spectrum S2_p'(k) (p = 0, 1, ..., P-1) and ideal gain α1_p of each subband, which are input from searching section 263. Gain encoding section 265 quantizes the ideal gain and the logarithmic gain and outputs the quantized ideal gain and the quantized logarithmic gain to multiplexing section 266.

Fig. 4 shows the internal configuration of gain encoding section 265. Gain encoding section 265 mainly consists of ideal gain encoding section 271 and logarithmic gain encoding section 272.

Ideal gain encoding section 271 forms estimated spectrum S2'(k) of the high-frequency part of the input spectrum by arranging estimated spectra S2_p'(k) (p = 0, 1, ..., P-1) of the subbands, input from searching section 263, contiguously in the frequency domain. Then, ideal gain encoding section 271 calculates estimated spectrum S3'(k) by multiplying estimated spectrum S2'(k) by ideal gain α1_p of each subband, input from searching section 263, according to equation 10. In equation 10, BL_p denotes the leading index of each subband and BH_p denotes the end index of each subband. Ideal gain encoding section 271 outputs calculated estimated spectrum S3'(k) to logarithmic gain encoding section 272. Ideal gain encoding section 271 also quantizes ideal gain α1_p and outputs quantized ideal gain α1Q_p to multiplexing section 266 as ideal gain encoded information.

$$S3'(k) = \alpha1_p \cdot S2'(k) \quad (BL_p \le k \le BH_p,\ p = 0, 1, \ldots, P-1) \tag{10}$$

Logarithmic gain encoding section 272 calculates, for each subband, a logarithmic gain as a parameter (amplitude adjustment parameter) for adjusting the energy ratio in the nonlinear domain between the high-frequency part (FL ≤ k < FH) of input spectrum S2(k), input from orthogonal transform processing section 205, and estimated spectrum S3'(k), input from ideal gain encoding section 271. Logarithmic gain encoding section 272 outputs the calculated logarithmic gain to multiplexing section 266 as logarithmic gain encoded information.

Fig. 5 shows the internal configuration of logarithmic gain encoding section 272. Logarithmic gain encoding section 272 mainly consists of maximum amplitude value searching section 281, sample group extracting section 282 and logarithmic gain calculating section 283.

Maximum amplitude value searching section 281 searches estimated spectrum S3'(k), input from ideal gain encoding section 271, for maximum amplitude value MaxValue_p and for the index of the sample (spectral component) with the maximum amplitude, i.e., maximum amplitude index MaxIndex_p, for each subband, as expressed by equation 11.

(Equation 11)

Maximum amplitude value searching section 281 outputs estimated spectrum S3'(k), maximum amplitude value MaxValue_p and maximum amplitude index MaxIndex_p to sample group extracting section 282.
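As a non-limiting illustration, the search for the maximum amplitude value in one subband may be sketched in Python as follows. The sketch assumes that amplitude means the absolute value of the spectral coefficient and that end index BH_p is inclusive; the function name `find_max_amplitude` is hypothetical.

```python
def find_max_amplitude(s3, bl, bh):
    """Return (MaxValue_p, MaxIndex_p) of spectrum s3 over BL_p <= k <= BH_p.

    Assumptions: amplitude is |S3'(k)|, and BH_p is an inclusive end index.
    Ties resolve to the lowest index.
    """
    max_index = max(range(bl, bh + 1), key=lambda k: abs(s3[k]))
    return abs(s3[max_index]), max_index
```

Because only one pass over each subband is needed, the cost of this step grows linearly with the subband width.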

Sample group extracting section 282 determines extraction flag SelectFlag(k) for each sample according to the calculated maximum amplitude index MaxIndex_p of each subband, as expressed by equation 12. Sample group extracting section 282 outputs estimated spectrum S3'(k), maximum amplitude value MaxValue_p and extraction flag SelectFlag(k) to logarithmic gain calculating section 283. In equation 12, Near_p denotes the threshold on which the determination of extraction flag SelectFlag(k) is based.

(Equation 12)

That is, as expressed by equation 12, sample group extracting section 282 determines the value of extraction flag SelectFlag(k) by a criterion under which the flag is more likely to be 1 for a sample (spectral component) closer to the sample having maximum amplitude value MaxValue_p in each subband. In other words, sample group extracting section 282 partially selects samples using a weighting that makes a sample closer to the sample having maximum amplitude value MaxValue_p in each subband more likely to be selected. Specifically, sample group extracting section 282 selects the samples whose indices lie within distance Near_p of the sample having maximum amplitude value MaxValue_p, as expressed by equation 12. Additionally, as expressed by equation 12, sample group extracting section 282 sets extraction flag SelectFlag(k) to 1 for a sample with an even-numbered index even when that sample is not close to the sample having the maximum amplitude value. Accordingly, even when a sample with a large amplitude is present in a band far from the sample having the maximum amplitude value, that sample or a sample with an amplitude close to it can be extracted.
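As a non-limiting illustration, the selection rule described above may be sketched in Python as follows. The sketch assumes that a sample counts as close when its index differs from MaxIndex_p by at most Near_p (the boundary case is an assumption) and that end index BH_p is inclusive; the function name `select_flags` is hypothetical.

```python
def select_flags(bl, bh, max_index, near):
    """Return SelectFlag(k) for k = BL_p, ..., BH_p as a list of 0/1 values.

    A sample is selected when it lies within `near` indices of the
    maximum-amplitude sample, or when its index k is even.
    Assumption: the distance test is inclusive (|k - MaxIndex_p| <= Near_p).
    """
    return [
        1 if abs(k - max_index) <= near or k % 2 == 0 else 0
        for k in range(bl, bh + 1)
    ]
```

For a subband covering indices 10 to 17 with MaxIndex_p = 13 and Near_p = 1, the flags are 1, 0, 1, 1, 1, 0, 1, 0 under these assumptions, so roughly half of the distant samples are skipped in the subsequent gain calculation.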

Logarithmic gain calculating section 283 calculates energy ratio (logarithmic gain) α2_p in the logarithmic domain between the high-frequency part (FL ≤ k < FH) of estimated spectrum S3'(k) and input spectrum S2(k) according to equation 13, for the samples whose extraction flag SelectFlag(k), input from sample group extracting section 282, is 1. In equation 13, M' denotes the number of samples used to calculate the logarithmic gain and may be any value equal to or less than the bandwidth of each subband. M' may, of course, be subband width BW_i.

(Equation 13)

That is, logarithmic gain calculating section 283 calculates logarithmic gain α2_p only for the samples partially selected by sample group extracting section 282. Logarithmic gain calculating section 283 quantizes logarithmic gain α2_p and outputs quantized logarithmic gain α2Q_p to multiplexing section 266 as logarithmic gain encoded information.

This completes the explanation of the processing in gain encoding section 265.

Multiplexing section 266 multiplexes, as second layer encoded information, the band division information input from band dividing section 260, optimal pitch coefficient T_p' for each subband SB_p (p = 0, 1, ..., P-1) input from searching section 263, and the indices (ideal gain encoded information and logarithmic gain encoded information) corresponding respectively to quantized ideal gain α1Q_p and quantized logarithmic gain α2Q_p input from gain encoding section 265, and outputs the second layer encoded information to encoded information multiplexing section 207. Alternatively, T_p' and the indices of α1Q_p and α2Q_p may be input directly to encoded information multiplexing section 207 and multiplexed with the first layer encoded information there.

The filtering process performed by filtering section 262 shown in Fig. 3 is explained next in detail with reference to Fig. 6.

Filtering section 262 generates the estimated spectrum in the band BS_p≤k<BS_p+BW_p (p=0, 1,..., P-1) for subband SB_p (p=0, 1,..., P-1), using the filter state input from filter state setting section 261, the pitch coefficient T input from pitch coefficient setting section 264, and the band division information input from band dividing section 260. The transfer function F(z) of the filter used by filtering section 262 is expressed by the following equation 14.

The process of generating the estimated spectrum S2_p'(k) of the subband spectrum S2_p(k) is explained below, taking subband SB_p as an example.

$$F(z) = \frac{1}{1 - \sum_{i=-M}^{M} \beta_i z^{-T+i}} \tag{14}$$

In equation 14, T denotes the pitch coefficient given from pitch coefficient setting section 264, and β_i denotes a filter coefficient stored internally in advance. For example, when the number of taps is 3, a candidate set of filter coefficients is (β_-1, β_0, β_1)=(0.1, 0.8, 0.1). The values (β_-1, β_0, β_1)=(0.2, 0.6, 0.2) and (0.3, 0.4, 0.3) are also appropriate. The value (β_-1, β_0, β_1)=(0.0, 1.0, 0.0) is also suitable; in this case, part of the first layer decoded spectrum in the band 0≤k<FL is copied directly into the band BS_p≤k<BS_p+BW_p without changing its shape. The following explanation assumes (β_-1, β_0, β_1)=(0.0, 1.0, 0.0) as an example. In equation 14, M=1 is assumed, where M is an index related to the number of taps.

The first layer decoded spectrum S1(k) is stored in the band 0≤k<FL of the spectrum S(k) of the entire band in filtering section 262 as the internal state (filter state) of the filter.

The estimated spectrum S2_p'(k) of subband SB_p is stored in the band BS_p≤k<BS_p+BW_p of S(k) by the filtering process of the following procedure. That is, as shown in Fig. 6, the spectrum S(k-T), whose frequency is lower than k by T, is basically substituted into S2_p'(k). However, to make the spectrum smoother, what is actually substituted into S2_p'(k) is the sum over all i of the spectra β_i·S(k-T+i), each obtained by multiplying the nearby spectrum S(k-T+i), which is i away from the spectrum S(k-T), by the predetermined filter coefficient β_i. This processing is expressed by the following equation 15.

$$S2_p'(k) = \sum_{i=-1}^{1} \beta_i \cdot S(k-T+i) \tag{15}$$

The estimated spectrum S2_p'(k) in BS_p≤k<BS_p+BW_p is calculated by performing the above calculation while changing k over the range BS_p≤k<BS_p+BW_p in order from the lowest frequency, k=BS_p.

The above filtering process is performed after zero-clearing S(k) in the range BS_p≤k<BS_p+BW_p each time the pitch coefficient T is given from pitch coefficient setting section 264. That is, S(k) is calculated each time the pitch coefficient T changes, and the result is output to searching section 263.
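The filtering procedure described above (zero-clearing the target band, then filling it in ascending order of k from the samples located T below) may be sketched in Python as follows. This is only an illustration under stated assumptions: a 3-tap filter (M=1), a filter state held in a plain list, and hypothetical names `estimate_subband`, `state`, `start`, `width` and `pitch` that do not appear in the original text.

```python
def estimate_subband(state, start, width, pitch, beta=(0.0, 1.0, 0.0)):
    """Estimate the subband start <= k < start + width from the filter state.

    state : spectrum S(k) for 0 <= k < start (assumed to hold the decoded
            low-band spectrum); pitch : pitch coefficient T.
    """
    if len(state) < start:
        raise ValueError("state must cover 0 <= k < start")
    if pitch < 1 or start - pitch - 1 < 0:
        raise ValueError("pitch must satisfy 1 <= pitch <= start - 1")
    # Copy the state and append a zero-cleared target band.
    s = list(state[:start]) + [0.0] * width
    # Fill the band in ascending k; already-filled samples may be reused.
    for k in range(start, start + width):
        s[k] = sum(b * s[k - pitch + i] for i, b in zip((-1, 0, 1), beta))
    return s[start:start + width]


low_band = [1.0, 2.0, 3.0, 4.0]
copied = estimate_subband(low_band, start=4, width=3, pitch=3)
recursive = estimate_subband(low_band, start=4, width=3, pitch=2)
```

With the coefficients (0.0, 1.0, 0.0) the sketch reduces to a plain copy of S(k-T). With `pitch=2` the last output sample is taken from a position inside the band being filled, which is why the order of evaluation matters.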

Fig. 7 is a flowchart showing the steps of the process in which searching section 263 shown in Fig. 3 searches for the optimal pitch coefficient T_p' for subband SB_p. Searching section 263 searches for the optimal pitch coefficient T_p' (p=0, 1,..., P-1) corresponding to each subband SB_p (p=0, 1,..., P-1) by repeating the steps shown in Fig. 7.

First, searching section 263 initializes the minimum degree of similarity D_min, a variable for storing the minimum value of the degree of similarity, to "+∞" (ST2010). Next, searching section 263 calculates the degree of similarity D between the high-frequency part (FL≤k<FH) of the input spectrum S2(k) and the estimated spectrum S2_p'(k) for a given pitch coefficient, according to the following equation 16 (ST2020).

[Equation 16]

In equation 16, M' denotes the number of samples used to calculate the degree of similarity D and may be any value equal to or less than the bandwidth of each subband. Of course, M' may be the subband width BW_i. S2_p'(k) does not appear in equation 16 because it is represented there using BS_p and S2'(k).

Next, searching section 263 determines whether the calculated degree of similarity D is less than the minimum degree of similarity D_min (ST2030). When the degree of similarity D calculated in ST2020 is less than the minimum degree of similarity D_min ("YES" at ST2030), searching section 263 substitutes the degree of similarity D into the minimum degree of similarity D_min (ST2040). On the other hand, when the degree of similarity calculated in ST2020 is equal to or greater than the minimum degree of similarity D_min ("NO" at ST2030), searching section 263 determines whether the process over the search range is complete, that is, whether the degree of similarity has been calculated according to equation 16 in ST2020 for all pitch coefficients in the search range (ST2050). When the process over the search range is not complete ("NO" at ST2050), searching section 263 returns to ST2020 and calculates the degree of similarity according to equation 16 for a pitch coefficient different from the one used in the previous execution of ST2020. When the process over the search range is complete ("YES" at ST2050), searching section 263 outputs the pitch coefficient T corresponding to the minimum degree of similarity D_min to multiplexing section 266 as the optimal pitch coefficient T_p' (ST2060).
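The loop of steps ST2010 to ST2060 may be sketched as follows. Equation 16 is not reproduced in this text, so the sketch substitutes a plain sum of squared errors as a stand-in distance; that measure, the function names `search_pitch` and `squared_error`, and the callable `estimate_for` are assumptions for illustration only. Remembering the pitch coefficient alongside D_min is also an assumption, needed so that the sketch can return it at the end.

```python
import math


def squared_error(target, estimate):
    """Stand-in distance (NOT equation 16): smaller means more similar."""
    return sum((t - e) ** 2 for t, e in zip(target, estimate))


def search_pitch(target, candidates, estimate_for, distance=squared_error):
    """Return (best_pitch, d_min) over all candidate pitch coefficients."""
    d_min = math.inf                      # ST2010: initialise D_min to +inf
    best_pitch = None
    for pitch in candidates:              # ST2050: repeat over the search range
        d = distance(target, estimate_for(pitch))   # ST2020
        if d < d_min:                     # ST2030
            d_min = d                     # ST2040
            best_pitch = pitch
    return best_pitch, d_min              # ST2060


# Hypothetical estimated spectra for two candidate pitch coefficients.
estimates = {2: [3.0, 4.0, 3.0], 3: [2.0, 3.0, 4.0]}
best, d_min = search_pitch([3.0, 4.0, 3.0], [2, 3], estimates.get)
```

Because the comparison at ST2030 is strict, the first candidate wins when two pitch coefficients give the same distance.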

Decoding device 103 shown in Fig. 1 is explained next.

Fig. 8 is a block diagram showing the main internal configuration of decoding device 103.

In Fig. 8, encoded information demultiplexing section 131 separates the input encoded information (i.e., the encoded information received from encoding device 101) into first layer encoded information and second layer encoded information, outputs the first layer encoded information to first layer decoding section 132, and outputs the second layer encoded information to second layer decoding section 135.

First layer decoding section 132 decodes the first layer encoded information input from encoded information demultiplexing section 131 and outputs the resulting first layer decoded signal to up-sampling processing section 133. The operation of first layer decoding section 132 is the same as that of first layer decoding section 203 shown in Fig. 2, so a detailed explanation is omitted.

Up-sampling processing section 133 up-samples the first layer decoded signal input from first layer decoding section 132 from sampling rate SR_2 to SR_1 and outputs the up-sampled first layer decoded signal to orthogonal transform processing section 134.

Orthogonal transform processing section 134 applies an orthogonal transform (MDCT) to the up-sampled first layer decoded signal input from up-sampling processing section 133 and outputs the resulting MDCT coefficients of the up-sampled first layer decoded signal (hereinafter, "first layer decoded spectrum") S1(k) to second layer decoding section 135. The operation of orthogonal transform processing section 134 is the same as the processing that orthogonal transform processing section 205 shown in Fig. 2 applies to the up-sampled first layer decoded signal, so a detailed explanation is omitted.

Second layer decoding section 135 generates a second layer decoded signal containing a high-frequency component by using the first layer decoded spectrum S1(k) input from orthogonal transform processing section 134 and the second layer encoded information input from encoded information demultiplexing section 131, and outputs the generated signal as the output signal.

Fig. 9 is a block diagram showing the main internal configuration of second layer decoding section 135 shown in Fig. 8.

Demultiplexing section 351 separates the second layer encoded information input from encoded information demultiplexing section 131 into the band division information, which contains the bandwidth BW_p (p=0, 1,..., P-1) and the leading index BS_p (p=0, 1,..., P-1) (FL≤BS_p<FH) of each subband; the optimal pitch coefficient T_p' (p=0, 1,..., P-1), which is information on filtering; and the indices of the ideal gain encoded information (j=0, 1,..., J-1) and the logarithmic gain encoded information (j=0, 1,..., J-1), which are information on gain. Demultiplexing section 351 outputs the band division information and the optimal pitch coefficient T_p' (p=0, 1,..., P-1) to filtering section 353, and outputs the indices of the ideal gain encoded information and the logarithmic gain encoded information to gain decoding section 354. When encoded information demultiplexing section 131 has already separated the band division information, the optimal pitch coefficient T_p' (p=0, 1,..., P-1), and the indices of the ideal gain encoded information and the logarithmic gain encoded information, demultiplexing section 351 need not be provided.

Filter state setting section 352 sets the first layer decoded spectrum S1(k) (0≤k<FL) input from orthogonal transform processing section 134 as the filter state to be used by filtering section 353. When the spectrum over the entire band 0≤k<FH in filtering section 353 is called S(k) for convenience, the first layer decoded spectrum S1(k) is stored in the band 0≤k<FL of S(k) as the internal state (filter state) of the filter. The configuration and operation of filter state setting section 352 are the same as those of filter state setting section 261 shown in Fig. 3, so a detailed explanation is omitted.

Filtering section 353 includes a multi-tap pitch filter (with more than one tap). Filtering section 353 filters the first layer decoded spectrum S1(k) and calculates the estimated value S2_p'(k) (BS_p≤k<BS_p+BW_p) (p=0, 1,..., P-1) of each subband SB_p (p=0, 1,..., P-1) shown in equation 15 above, on the basis of the band division information input from demultiplexing section 351, the filter state set by filter state setting section 352, the pitch coefficient T_p' (p=0, 1,..., P-1), and the filter coefficients stored internally in advance. Filtering section 353 also uses the filter function shown in equation 14 above. The filtering process and the filter function differ here only in that T in equations 14 and 15 is replaced with T_p'. That is, filtering section 353 estimates the high-frequency part of the input spectrum of encoding device 101 from the first layer decoded spectrum.

Gain decoding section 354 decodes the indices of the ideal gain encoded information and the logarithmic gain encoded information input from demultiplexing section 351 and obtains the quantized ideal gain α1Q_p and the quantized logarithmic gain α2Q_p, which are the quantized values of the ideal gain α1_p and the logarithmic gain α2_p.

Spectrum adjusting section 355 calculates the decoded spectrum from the estimated value S2_p'(k) (BS_p≤k<BS_p+BW_p) (p=0, 1,..., P-1) of each subband SB_p (p=0, 1,..., P-1) input from filtering section 353 and the ideal gain α1Q_p of each subband input from gain decoding section 354. Spectrum adjusting section 355 outputs the calculated decoded spectrum to orthogonal transform processing section 356.

Fig. 10 shows the internal configuration of spectrum adjusting section 355. Spectrum adjusting section 355 mainly consists of ideal gain decoding section 361 and logarithmic gain decoding section 362.

Ideal gain decoding section 361 obtains the estimated spectrum S2'(k) of the input spectrum by connecting, in the frequency domain, the estimated values S2_p'(k) (BS_p≤k<BS_p+BW_p) (p=0, 1,..., P-1) of each subband input from filtering section 353. Ideal gain decoding section 361 then calculates the estimated spectrum S3'(k) by multiplying the estimated spectrum S2'(k) by the ideal gain α1Q_p of each subband input from gain decoding section 354, according to the following equation 17, and outputs the estimated spectrum S3'(k) to logarithmic gain decoding section 362.
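The per-subband scaling described above may be sketched as follows, assuming that each sample of a subband is simply multiplied by that subband's quantized ideal gain. The function name `apply_ideal_gain` and the list-based representation of the band division information are hypothetical and serve only as an illustration.

```python
def apply_ideal_gain(s2_est, band_starts, band_widths, gains):
    """Scale each subband of the estimated spectrum by its ideal gain.

    s2_est      : estimated spectrum S2'(k) over the whole band
    band_starts : leading index of each subband
    band_widths : width of each subband
    gains       : quantized ideal gain of each subband
    Samples outside every subband are returned unchanged.
    """
    s3_est = list(s2_est)
    for start, width, gain in zip(band_starts, band_widths, gains):
        for k in range(start, start + width):
            s3_est[k] = gain * s2_est[k]
    return s3_est


s3 = apply_ideal_gain([1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
                      band_starts=[2, 4], band_widths=[2, 2],
                      gains=[0.5, 2.0])
```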

Logarithmic gain decoding section 362 adjusts the energy of the estimated spectrum S3'(k) input from ideal gain decoding section 361 in the logarithmic domain, using the quantized logarithmic gain α2Q_p of each subband input from gain decoding section 354, and outputs the resulting spectrum to orthogonal transform processing section 356 as the decoded spectrum.

Fig. 11 shows the internal configuration of logarithmic gain decoding section 362. Logarithmic gain decoding section 362 mainly consists of maximum amplitude value searching section 371, sample group extracting section 372 and logarithmic gain applying section 373.

Maximum amplitude value searching section 371 searches, for each subband, for the maximum amplitude value MaxValue_p and the index of the sample (spectral component) having the maximum amplitude, i.e. the maximum amplitude index MaxIndex_p, in the estimated spectrum S3'(k) input from ideal gain decoding section 361, as expressed by equation 11. Maximum amplitude value searching section 371 outputs the estimated spectrum S3'(k), the maximum amplitude value MaxValue_p and the maximum amplitude index MaxIndex_p to sample group extracting section 372.

Sample group extracting section 372 determines extraction flag SelectFlag(k) for each sample according to the calculated maximum amplitude index MaxIndex_p of each subband, as expressed by equation 12. That is, sample group extracting section 372 partially selects samples on the basis of a weight that makes a sample (spectral component) easier to select the closer it is to the sample having the maximum amplitude value MaxValue_p in each subband. Sample group extracting section 372 outputs the estimated spectrum S3'(k), the maximum amplitude value MaxValue_p, the maximum amplitude index MaxIndex_p and extraction flag SelectFlag(k) of each sample to logarithmic gain applying section 373.

The processes performed by maximum amplitude value searching section 371 and sample group extracting section 372 are the same as those performed by maximum amplitude value searching section 281 and sample group extracting section 282 of encoding device 101.

Logarithmic gain applying section 373 calculates Sign_p(k), which indicates the sign (+, -) of each extracted sample, from the estimated spectrum S3'(k) and extraction flag SelectFlag(k) input from sample group extracting section 372, as expressed by equation 18. That is, as expressed by equation 18, logarithmic gain applying section 373 sets Sign_p(k)=1 when the sign of the extracted sample is "+" (when S3'(k)≥0) and sets Sign_p(k)=-1 otherwise (when the sign of the extracted sample is "-", i.e. when S3'(k)<0).

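$$S3'(k) = \alpha 1Q_p \cdot S2'(k) \quad (BS_p \le k < BS_p + BW_p,\ \text{for all } p) \tag{17}$$

$$\mathrm{Sign}_p(k) = \begin{cases} 1 & (S3'(k) \ge 0) \\ -1 & (\text{otherwise}) \end{cases} \tag{18}$$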

Logarithmic gain applying section 373 calculates the decoded spectrum S5'(k) according to equations 19 and 20 for the samples whose extraction flag SelectFlag(k) is 1, on the basis of the estimated spectrum S3'(k), the maximum amplitude value MaxValue_p and extraction flag SelectFlag(k) input from sample group extracting section 372, the quantized logarithmic gain α2Q_p input from gain decoding section 354, and the sign Sign_p(k) calculated according to equation 18.

[Equation 19]
[Equation 20]

That is, logarithmic gain applying section 373 applies the logarithmic gain α2_p only to the samples partially selected by sample group extracting section 372 (the samples with extraction flag SelectFlag(k)=1). Logarithmic gain applying section 373 outputs the decoded spectrum S5'(k) to orthogonal transform processing section 356. Here, the low-frequency part (0≤k<FL) of the decoded spectrum S5'(k) consists of the first layer decoded spectrum S1(k), and the high-frequency part (FL≤k<FH) consists of the spectrum obtained by adjusting the energy of the estimated spectrum S3'(k) in the logarithmic domain. For a sample in the high-frequency part (FL≤k<FH) of the decoded spectrum S5'(k) that is not selected by sample group extracting section 372 (a sample with extraction flag SelectFlag(k)=0), the value of the estimated spectrum S3'(k) is used as is.

Orthogonal transform processing section 356 orthogonally transforms the decoded spectrum S5'(k) input from spectrum adjusting section 355 into a time-domain signal and outputs the resulting second layer decoded signal as the output signal. Here, suitable processing such as windowing and overlap-add is performed as necessary to avoid discontinuities between frames.

The processing of orthogonal transform processing section 356 is explained in detail below.

Orthogonal transform processing section 356 has an internal buffer buf'(k) and initializes it as expressed by the following equation 21.

[Equation 21]

Orthogonal transform processing section 356 then obtains the second layer decoded signal y_n" according to the following equation 22, using the second layer decoded spectrum S5'(k) input from spectrum adjusting section 355.

[Equation 22]

In equation 22, Z4(k) is a vector that combines the decoded spectrum S5'(k) and the buffer buf'(k), as expressed by the following equation 23.

[Equation 23]

Orthogonal transform processing section 356 then updates the buffer buf'(k) according to the following equation 24.

[Equation 24]

Orthogonal transform processing section 356 outputs the decoded signal y_n" as the output signal.

As explained above, according to this embodiment, in encoding/decoding that performs band extension by using the low-frequency spectrum to estimate the high-frequency spectrum, the high-frequency spectrum is first estimated from the decoded low-frequency spectrum; samples are then selected (thinned out) with weight placed on the samples around the maximum amplitude value in each subband of the estimated spectrum, and gain adjustment in the logarithmic domain is performed only for the selected samples. This configuration can substantially reduce the amount of arithmetic operations required for gain adjustment in the logarithmic domain. In addition, because gain adjustment is performed only for the perceptually important samples near the maximum amplitude value, the abnormal sound that arises when low-amplitude samples are amplified can be suppressed, and the sound quality of the decoded signal can be improved.

This embodiment has been explained for the case where, in setting the extraction flag, the flag is set to 1 for a sample with an even index when the sample is not close to the sample having the maximum amplitude value in the subband. However, the present invention is not limited to this and may similarly be applied, for example, to the case where the extraction flag is set to 1 for a sample whose index leaves a remainder of 0 when divided by 3. That is, the present invention is not limited to the above method of setting the extraction flag and may similarly be applied to any method that extracts samples on the basis of a weight under which the extraction flag more readily becomes 1 for a sample closer to the sample having the maximum amplitude value, according to the position of the maximum amplitude value in the subband. For example, the extraction flag may be set in three stages: the encoding device and the decoding device extract all samples very close to the sample having the maximum amplitude value (i.e., set the extraction flag to 1), extract samples at a slightly greater distance only when the index is even, and extract samples at a still greater distance only when the remainder of the index divided by 3 is 0. Of course, the present invention can also be applied to setting methods with more than three stages.
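The three-stage variant mentioned above may be sketched as follows. The text gives no distance thresholds for the stages, so the parameters `near` and `far`, like the function name `three_stage_flags`, are hypothetical values introduced only for illustration.

```python
def three_stage_flags(indices, max_index, near, far):
    """Set the extraction flag in three stages by distance from the peak.

    Stage 1 (distance <= near)      : extract every sample.
    Stage 2 (near < distance <= far): extract only even indices.
    Stage 3 (distance > far)        : extract only indices divisible by 3.
    near and far are assumed thresholds, not values from the source.
    """
    flags = []
    for k in indices:
        distance = abs(k - max_index)
        if distance <= near:
            flag = 1
        elif distance <= far:
            flag = 1 if k % 2 == 0 else 0
        else:
            flag = 1 if k % 3 == 0 else 0
        flags.append(flag)
    return flags


flags3 = three_stage_flags(range(13), max_index=6, near=1, far=3)
```

The density of extracted samples falls off with distance from the peak: every sample in the first stage, every second sample in the second, and every third sample in the third.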

This embodiment has been explained, as an example, for the case where the extraction flag is set according to the distance from the sample having the maximum amplitude value after that sample has been searched for in the subband. However, the present invention is not limited to this and can also be applied to the case where the encoding device and the decoding device search for the sample having the minimum amplitude value, set the extraction flag of each sample according to the distance from that sample, and calculate and apply an amplitude adjustment parameter such as the logarithmic gain only to the extracted samples (the samples whose extraction flag is set to 1). This configuration is effective, for example, when the amplitude adjustment parameter has the effect of attenuating the estimated high-frequency spectrum. Although attenuating the high-frequency spectrum for samples with a large amplitude risks producing abnormal sound, sound quality may be improved by applying the attenuation only around the sample having the minimum amplitude value. There is also a configuration in which the encoding device and the decoding device search for the maximum amplitude value instead of the minimum amplitude value and extract samples on the basis of a weight that makes a sample easier to extract the farther it is from the sample having the maximum amplitude value. The present invention can similarly be applied to this configuration.

This embodiment has been explained, as an example, for the case where the extraction flag is set according to the distance from the sample having the maximum amplitude value after that sample has been searched for in the subband. However, the present invention is not limited to this and may similarly be applied to the case where several samples are selected in descending order of amplitude in each subband and the extraction flag is set according to the distance from each of those samples. With this configuration, samples can be extracted efficiently when a subband contains several samples of similar amplitude.

This embodiment has been explained for the case where samples are partially selected by determining, on the basis of a threshold value (Near_p in equation 12), whether a sample in each subband is close to the sample having the maximum amplitude value. In the present invention, the encoding device and the decoding device may select, as the samples close to the sample having the maximum amplitude value, samples over a wider range for a higher-frequency subband among the plurality of subbands. That is, Near_p in equation 12 may take a larger value for a higher-frequency subband. With this arrangement, even when band division makes the subband width larger at higher frequencies, as on the Bark scale, samples can be partially selected without bias between subbands, and deterioration of the sound quality of the decoded signal can be prevented. Experiments confirmed that, when the number of samples (MDCT coefficients) in one frame is about 320, a good result is obtained by setting Near_p in equation 12 to about 5 to 21 (for example, Near_p equal to 5 in the lowest subband and 21 in the highest subband).

This embodiment has illustrated a configuration of the encoding device and the decoding device in which the sample group extracting section partially selects samples on the basis of a weight that makes a sample easier to select the closer it is to the sample having the maximum amplitude value MaxValue_p in each subband, as expressed by equation 12. With the sample group extraction method expressed by equation 12, samples close to the maximum amplitude value can easily be selected regardless of subband boundaries, even when the sample having the maximum amplitude lies at the boundary of a subband. That is, because the configuration explained in this embodiment selects samples by taking into account the position of the sample having the maximum amplitude value, including in adjacent subbands, perceptually important samples can be selected efficiently.

In this embodiment, the maximum amplitude value searching section calculates the maximum amplitude value in the linear domain rather than in the logarithmic domain. When a logarithmic transform is applied to all samples (MDCT coefficients), as in patent document 1 for example, the amount of arithmetic operations does not increase whether the maximum amplitude value is calculated in the logarithmic domain or in the linear domain. However, when the logarithmic transform is applied only to the partially selected samples, as in this embodiment, calculating the maximum amplitude value in the linear domain as described above reduces the amount of arithmetic operations further compared with the method of patent document 1 and the like.

Embodiment 2

In Embodiment 2 of the present invention, the gain encoding section in the second layer encoding section uses a configuration different from that illustrated in Embodiment 1 and can thereby further reduce the amount of arithmetic operations.

The communication system (not shown) according to Embodiment 2 is basically the same as the communication system shown in Fig. 1 and differs from it only in part of the configuration and operation of the encoding device and the decoding device, which correspond to encoding device 101 and decoding device 103 of Fig. 1. In the following explanation of Embodiment 2, the encoding device and the decoding device of this embodiment are given reference numbers 111 and 113, respectively.

The inside of encoding device 111 (not shown) according to this embodiment mainly consists of down-sampling processing section 201, first layer encoding section 202, first layer decoding section 203, up-sampling processing section 204, orthogonal transform processing section 205, second layer encoding section 226 and encoded information multiplexing section 207. The constituent elements other than second layer encoding section 226 perform the same processes as in Embodiment 1 (Fig. 2), so their explanation is omitted.

Second layer encoding section 226 generates second layer encoded information using the input spectrum S2(k) and the first layer decoded spectrum S1(k) input from orthogonal transform processing section 205, and outputs the generated second layer encoded information to encoded information multiplexing section 207.

Next, the main internal configuration of second layer encoding section 226 is explained with reference to Fig. 12.

Second layer encoding section 226 includes band dividing section 260, filter state setting section 261, filtering section 262, searching section 263, pitch coefficient setting section 264, gain encoding section 235 and multiplexing section 266, and each section performs the following operations. The constituent elements other than gain encoding section 235 are identical to those explained in Embodiment 1 (Fig. 3), so their explanation is omitted.

Gain encoding section 235 calculates, for each subband, the logarithmic gain as a parameter (amplitude adjustment parameter) for adjusting the energy ratio in the nonlinear domain, on the basis of the input spectrum S2(k) and the estimated spectrum S2_p'(k) (p=0, 1,..., P-1) and ideal gain α1_p of each subband input from searching section 263. Gain encoding section 235 quantizes the ideal gain and the logarithmic gain and outputs the quantized ideal gain and the quantized logarithmic gain to multiplexing section 266.

Fig. 13 shows the internal configuration of gain encoding section 235. Gain encoding section 235 mainly consists of ideal gain encoding section 241 and logarithmic gain encoding section 242. Ideal gain encoding section 241 is identical to the constituent element explained in Embodiment 1, so its explanation is omitted.

Logarithmic gain encoding section 242 calculates, for each subband, the logarithmic gain as a parameter (amplitude adjustment parameter) for adjusting the energy ratio in the nonlinear domain between the high-frequency part (FL≤k<FH) of the input spectrum S2(k) input from orthogonal transform processing section 205 and the estimated spectrum S3'(k) input from ideal gain encoding section 241. Logarithmic gain encoding section 242 outputs the calculated logarithmic gain to multiplexing section 266 as logarithmic gain encoded information.

Fig. 14 shows the internal configuration of logarithmic gain encoding section 242. Logarithmic gain encoding section 242 mainly consists of maximum amplitude value searching section 253, sample group extracting section 251 and logarithmic gain calculating section 252.

Maximum amplitude value searching section 253 searches, for each subband, for maximum amplitude value MaxValue_p and for the index of the sample (spectrum component) having the maximum amplitude, i.e. maximum amplitude index MaxIndex_p, in estimated spectrum S3'(k), which is input from ideal gain encoding section 241, as expressed by equation 25.

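$$\mathrm{MaxValue}_p = \max_{k}\,\lvert S3'(k)\rvert, \qquad \mathrm{MaxIndex}_p = \arg\max_{k}\,\lvert S3'(k)\rvert \qquad (k \text{ even, within subband } p) \tag{25}$$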

That is, maximum amplitude value searching section 253 searches for the maximum amplitude value only among the samples with an even index. This effectively reduces the amount of arithmetic operations required to search for the maximum amplitude value.
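The even-index search described above might be sketched as follows. This is only an illustration under stated assumptions, not the implementation of the invention: the function name `search_max_amplitude_even`, the half-open subband bounds `start` and `stop`, and the use of the absolute value as the "amplitude" are all hypothetical choices made for the example.

```python
def search_max_amplitude_even(s3p, start, stop):
    """Search one subband [start, stop) of the estimated spectrum.

    Only even indices k are examined, so roughly half of the samples
    are visited. Returns (max_value, max_index); max_index is None
    when the subband contains no even index.
    """
    first_even = start if start % 2 == 0 else start + 1
    max_value, max_index = 0.0, None
    for k in range(first_even, stop, 2):
        amp = abs(s3p[k])  # assumption: amplitude is the absolute value
        if max_index is None or amp > max_value:
            max_value, max_index = amp, k
    return max_value, max_index


# The large sample at odd index 1 is deliberately not examined.
spectrum = [0.0, 9.0, -3.0, 1.0, 2.5, -7.0, 0.5, 4.0]
result = search_max_amplitude_even(spectrum, 0, 8)
```

Note that a peak located at an odd index is never found by this search; the text accepts this trade-off in exchange for the reduced operation count.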

Maximum amplitude value searching section 253 outputs estimated spectrum S3'(k), maximum amplitude value MaxValue_p and maximum amplitude index MaxIndex_p to sample group extracting section 251.

Sample group extracting section 251 determines the value of extraction flag SelectFlag(k) for each sample (spectrum component) of estimated spectrum S3'(k), which is input from maximum amplitude value searching section 253, on the basis of the following equation 26.

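$$\mathrm{SelectFlag}(k) = \begin{cases} 0 & (k \text{ odd}) \\ 1 & (k \text{ even}) \end{cases} \tag{26}$$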

That is, as expressed by equation 26, sample group extracting section 251 sets the value of extraction flag SelectFlag(k) to 0 for a sample with an odd index and to 1 for a sample with an even index. In other words, sample group extracting section 251 partially selects samples (spectrum components) of estimated spectrum S3'(k), namely only the samples with an even index. Sample group extracting section 251 outputs extraction flag SelectFlag(k), estimated spectrum S3'(k) and maximum amplitude value MaxValue_p to logarithmic gain calculating section 252.
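The encoder-side flag assignment can be illustrated with a few lines of code. The sketch below assumes that the flags are kept in a dictionary keyed by the sample index k and that the subband is given by hypothetical half-open bounds `start` and `stop`; neither detail comes from the description.

```python
def extraction_flags_even(start, stop):
    """Return {k: SelectFlag(k)} for k in [start, stop).

    The flag is 1 for an even index and 0 for an odd index, so the
    selection does not depend on where the maximum amplitude lies.
    """
    return {k: 1 if k % 2 == 0 else 0 for k in range(start, stop)}


flags = extraction_flags_even(4, 9)
selected = [k for k, flag in flags.items() if flag == 1]
```

Because the flags depend only on the index, they can be computed without inspecting the spectrum at all.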

Logarithmic gain calculating section 252 calculates energy ratio (logarithmic gain) α2_p in the logarithmic domain between estimated spectrum S3'(k) and the high-frequency part (FL≤k<FH) of input spectrum S2(k), based on equation 13, for the samples for which the value of extraction flag SelectFlag(k), input from sample group extracting section 251, is equal to 1. That is, logarithmic gain calculating section 252 calculates logarithmic gain α2_p only for the samples partially selected by sample group extracting section 251.

Logarithmic gain calculating section 252 quantizes logarithmic gain α2_p and outputs quantized logarithmic gain α2Q_p to multiplexing section 266 as logarithmic gain encoded information.

The processing of gain encoding section 235 has been explained above.

The processing of encoding device 111 according to the present embodiment has been explained above.

On the other hand, the inside of decoding device 113 (not shown) according to the present embodiment mainly consists of encoded information demultiplexing section 131, first layer decoding section 132, up-sampling processing section 133, orthogonal transform processing section 134 and second layer decoding section 295. The constituent elements other than second layer decoding section 295 perform processes identical to those in embodiment 1 (Fig. 8), and therefore their explanation is omitted.

Second layer decoding section 295 generates a second layer decoded signal containing a high-frequency component by using first layer decoded spectrum S1(k), which is input from orthogonal transform processing section 134, and the second layer encoded information, which is input from encoded information demultiplexing section 131, and outputs the generated signal as the output signal.

Second layer decoding section 295 mainly consists of demultiplexing section 351, filter state setting section 352, filtering section 353, gain decoding section 354, spectrum adjusting section 396 and orthogonal transform processing section 356. The constituent elements other than spectrum adjusting section 396 perform processes identical to those in embodiment 1 (Fig. 9), and therefore their explanation is omitted.

Spectrum adjusting section 396 mainly consists of ideal gain decoding section 361 and logarithmic gain decoding section 392 (not shown). Ideal gain decoding section 361 performs a process identical to that in embodiment 1 (Fig. 10), and therefore its explanation is omitted.

Fig. 15 shows the internal configuration of logarithmic gain decoding section 392. Logarithmic gain decoding section 392 mainly consists of maximum amplitude value searching section 381, sample group extracting section 382 and logarithmic gain applying section 383.

Maximum amplitude value searching section 381 searches, for each subband, for maximum amplitude value MaxValue_p and for the index of the sample (spectrum component) having the maximum amplitude, i.e. maximum amplitude index MaxIndex_p, in estimated spectrum S3'(k), which is input from ideal gain decoding section 361, as expressed by equation 25. That is, maximum amplitude value searching section 381 searches for the maximum amplitude value only among the samples with an even index, i.e. only among part of the samples (spectrum components) of estimated spectrum S3'(k). This effectively reduces the amount of arithmetic operations required to search for the maximum amplitude value. Maximum amplitude value searching section 381 outputs estimated spectrum S3'(k), maximum amplitude value MaxValue_p and maximum amplitude index MaxIndex_p to sample group extracting section 382.

Sample group extracting section 382 determines extraction flag SelectFlag(k) for each sample according to maximum amplitude index MaxIndex_p calculated for each subband, as expressed by equation 12. That is, sample group extracting section 382 partially selects samples based on a weight that makes a sample (spectrum component) closer to the sample having maximum amplitude value MaxValue_p in each subband easier to select. Specifically, sample group extracting section 382 selects the samples whose index lies within range Near_p of the sample having maximum amplitude value MaxValue_p, as expressed by equation 12. Additionally, sample group extracting section 382 sets the value of extraction flag SelectFlag(k) to 1 for a sample with an even index, even when that sample is not close to the sample having the maximum amplitude value, as expressed by equation 12. Accordingly, even when a sample with a large amplitude is present in a band far from the sample having the maximum amplitude value, that sample, or a sample with an amplitude close to it, can be extracted. Sample group extracting section 382 outputs estimated spectrum S3'(k), and maximum amplitude value MaxValue_p and extraction flag SelectFlag(k) for each subband, to logarithmic gain applying section 383.
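The decoder-side selection rule can be sketched as below. Equation 12 is not reproduced in this passage, so the code relies on an assumed reading of the prose: a sample is selected when its index is within `near` positions of the maximum amplitude index, or when its index is even. The function name, the dictionary representation and the inclusive distance test are hypothetical.

```python
def extraction_flags_decoder(start, stop, max_index, near):
    """Return {k: SelectFlag(k)} for one subband [start, stop).

    A sample is selected (flag 1) if it lies within `near` indices of
    the maximum-amplitude sample, or if its index is even.
    Assumption: "within the range" means abs(k - max_index) <= near.
    """
    flags = {}
    for k in range(start, stop):
        close_to_peak = abs(k - max_index) <= near
        flags[k] = 1 if (close_to_peak or k % 2 == 0) else 0
    return flags


# Subband covering k = 10..19 with the peak at k = 14 and a range of 1.
flags = extraction_flags_decoder(10, 20, max_index=14, near=1)
```

With these inputs the odd samples 13 and 15 are selected because they are adjacent to the peak, whereas the odd samples 11, 17 and 19 are skipped.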

The processes performed by maximum amplitude value searching section 381 and sample group extracting section 382 are similar to the processes performed by maximum amplitude value searching section 253 and by sample group extracting section 282 of encoding device 101, respectively.

Logarithmic gain applying section 383 calculates Sign_p(k), which indicates the sign (+, -) of each sample of the extracted sample group, from estimated spectrum S3'(k) and extraction flag SelectFlag(k), which are input from sample group extracting section 382, as expressed by equation 18. That is, as expressed by equation 18, logarithmic gain applying section 383 sets Sign_p(k)=1 when the sign of the extracted sample is "+" (when S3'(k)≥0), and sets Sign_p(k)=-1 otherwise (when the sign of the extracted sample is "-", i.e. when S3'(k)<0).
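A short sketch of the sign calculation follows. It assumes that the "-" case corresponds to S3'(k) < 0 and that zero counts as "+", in line with the condition S3'(k)≥0 stated for the "+" case; the function name and the list and dictionary representations are hypothetical.

```python
def sign_of_selected(s3p, select_flag):
    """Return {k: sign} for every sample whose extraction flag is 1.

    The sign is +1 when s3p[k] >= 0 and -1 otherwise. Samples with a
    flag of 0 are not extracted and therefore receive no sign.
    """
    return {
        k: (1 if s3p[k] >= 0 else -1)
        for k, flag in enumerate(select_flag)
        if flag == 1
    }


signs = sign_of_selected([0.5, -2.0, 0.0, -1.0], [1, 0, 1, 1])
```

The sign is stored separately so that it can be restored after the magnitude has been adjusted in the logarithmic domain, where the sign information is lost.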

Logarithmic gain applying section 383 calculates decoded spectrum S5'(k) according to equations 19 and 20 for the samples for which the value of extraction flag SelectFlag(k) is equal to 1, based on estimated spectrum S3'(k), maximum amplitude value MaxValue_p and extraction flag SelectFlag(k), which are input from sample group extracting section 382, on quantized logarithmic gain α2Q_p, which is input from gain decoding section 354, and on sign Sign_p(k), which is calculated according to equation 18.

That is, logarithmic gain applying section 383 applies logarithmic gain α2_p only to the samples partially selected by sample group extracting section 382 (the samples with extraction flag SelectFlag(k)=1). Logarithmic gain applying section 383 outputs decoded spectrum S5'(k) to orthogonal transform processing section 356. Here, the low-frequency part (0≤k<FL) of decoded spectrum S5'(k) consists of first layer decoded spectrum S1(k), and the high-frequency part (FL≤k<FH) of decoded spectrum S5'(k) consists of the spectrum obtained by adjusting the energy of estimated spectrum S3'(k) in the logarithmic domain. However, for a sample in the high-frequency part (FL≤k<FH) of decoded spectrum S5'(k) that is not selected by sample group extracting section 382 (a sample with extraction flag SelectFlag(k)=0), the value of estimated spectrum S3'(k) is used as is.
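The way the decoded spectrum is assembled from its three kinds of samples can be sketched as follows. Equations 19 and 20 are not reproduced in this passage, so the gain adjustment itself is passed in as a hypothetical callable `apply_gain`; the doubling used in the example is a placeholder with no relation to the actual formula. All names are assumptions made for illustration.

```python
def assemble_decoded_spectrum(s1, s3p, select_flag, fl, fh, apply_gain):
    """Build S5'(k) for 0 <= k < fh.

    - k < fl: copy the first layer decoded spectrum s1[k]
    - k >= fl and flag == 1: adjusted value apply_gain(k, s3p[k])
    - k >= fl and flag == 0: estimated spectrum s3p[k], unchanged
    """
    s5p = []
    for k in range(fh):
        if k < fl:
            s5p.append(s1[k])
        elif select_flag[k] == 1:
            s5p.append(apply_gain(k, s3p[k]))
        else:
            s5p.append(s3p[k])
    return s5p


s1 = [1.0, 2.0, 0.0, 0.0, 0.0, 0.0]
s3p = [0.0, 0.0, 3.0, 4.0, 5.0, 6.0]
flags = [0, 0, 1, 0, 1, 0]
# Placeholder adjustment only; NOT equations 19 and 20.
decoded = assemble_decoded_spectrum(s1, s3p, flags, 2, 6, lambda k, v: 2.0 * v)
```

Only the selected high-band samples pass through `apply_gain`, which is where the saving in arithmetic operations comes from.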

The processing of spectrum adjusting section 396 has been explained above.

The processing of decoding device 113 according to the present embodiment has been explained above.

As explained above, according to the present embodiment, in encoding/decoding that estimates the spectrum of the high-frequency part by extending the frequency band using the spectrum of the low-frequency part, the spectrum of the high-frequency part is estimated using the decoded low-frequency spectrum, after which samples are selected (thinned out) in each subband of the estimated spectrum, and gain adjustment in the logarithmic domain is performed only for the selected samples. Unlike embodiment 1, the encoding device calculates the gain adjustment parameter (logarithmic gain) without taking into account the distance from the maximum amplitude value, and the decoding device takes into account the distance from the maximum amplitude value within the subband only when applying the gain adjustment parameter (logarithmic gain). With this configuration, the amount of arithmetic operations can be reduced further than in embodiment 1.

It has been confirmed through experiments that, as explained in the present embodiment, sound quality does not deteriorate even when the encoding device calculates the gain adjustment parameter only from the samples with an even index, while the decoding device takes into account the distance from the sample having the maximum amplitude value within the subband and applies the gain adjustment parameter to the extracted samples. In other words, no problem arises even when the group of samples used to calculate the gain adjustment parameter does not necessarily coincide with the group of samples to which the gain adjustment parameter is applied. This indicates, for example, that the encoding device and the decoding device can calculate the gain adjustment parameter efficiently without extracting all samples, provided that samples are extracted evenly over the whole of each subband. It also indicates that the decoding device can effectively reduce the amount of arithmetic operations by applying the obtained gain adjustment parameter only to the samples extracted by taking into account the distance from the sample having the maximum amplitude value within the subband. With this configuration, the present embodiment reduces the amount of arithmetic operations further than embodiment 1 without compromising sound quality.

The present embodiment has been explained using an example in which the encoding/decoding of the low-frequency component of the input signal and the encoding/decoding of the high-frequency component of the input signal are performed separately, i.e. in which encoding/decoding is performed in a hierarchical structure of two layers. However, the present invention is not limited to this, and can similarly be applied to encoding/decoding in a hierarchical structure of three or more layers. When a hierarchical encoding section of three or more layers is considered, in the second layer decoding section that generates the local decoded signal of the second layer encoding section, the group of samples to which the gain adjustment parameter (logarithmic gain) is applied may be a group of samples extracted without taking into account the distance from the sample having the maximum amplitude value, as calculated in the encoding device according to the present embodiment, or may be a group of samples extracted by taking into account the distance from the sample having the maximum amplitude value, as calculated in the decoding device according to the present embodiment.

In the present embodiment, when the extraction flag is set, its value is set to 1 only when the index of the sample is even. However, the present invention is not limited to this, and may similarly be applied, for example, to a case where the flag is set to 1 when the remainder of the index divided by 3 is 0.
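The variation mentioned above amounts to replacing the fixed divisor 2 with another value. A minimal sketch follows, in which the parameter name `modulus` and the list representation are assumptions made for illustration.

```python
def extraction_flags_mod(start, stop, modulus=2):
    """Return the list of extraction flags for k in [start, stop).

    The flag is 1 when k % modulus == 0. modulus=2 gives the even-index
    rule; modulus=3 selects every third sample instead.
    """
    return [1 if k % modulus == 0 else 0 for k in range(start, stop)]


every_second = extraction_flags_mod(0, 7)
every_third = extraction_flags_mod(0, 7, modulus=3)
```

A larger modulus selects fewer samples and so reduces the operation count further, but the description reports experimental confirmation of sound quality only for the even-index case.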

Each embodiment of the present invention has been explained above.

The above embodiments have been explained using an example in which the number J of subbands obtained by dividing the high-frequency part of input spectrum S2(k) in gain encoding section 265 (or gain encoding section 235) differs from the number P of subbands obtained by dividing the high-frequency part of input spectrum S2(k) in searching section 263. However, the present invention is not limited to this, and the number of subbands obtained by dividing the high-frequency part of input spectrum S2(k) in gain encoding section 265 (or gain encoding section 235) may be set equal to P.

The above embodiments have been explained using a configuration that estimates the high-frequency part of the input spectrum by using the low-frequency part of the first layer decoded spectrum obtained from the first layer decoding section. However, the present invention is not limited to this configuration, and can similarly be applied to a configuration that estimates the high-frequency part of the input spectrum by using the low-frequency part of the input spectrum instead of the first layer decoded spectrum. In this configuration, the encoding device calculates encoded information (second layer encoded information) for generating the high-frequency component of the input spectrum from the low-frequency component of the input spectrum, and the decoding device applies this encoded information to the first layer decoded spectrum to generate the high-frequency component of the decoded spectrum.

The above embodiments have been explained using an example of a process that reduces the amount of arithmetic operations and improves sound quality in a configuration that calculates and applies a parameter for adjusting the energy ratio in the logarithmic domain, based on the process in patent document 1. However, the present invention is not limited to this, and may similarly be applied to a configuration that adjusts the energy ratio in a non-linear transform domain other than the logarithmic domain. The invention can also be applied to a linear transform domain as well as to a non-linear transform domain.

The above embodiments have been explained using an example of a process that reduces the amount of arithmetic operations and improves sound quality in a configuration that calculates and applies a parameter for adjusting the energy ratio in the logarithmic domain in a band extension process, based on the process in patent document 1. However, the present invention is not limited to this, and can similarly be applied to processes other than band extension.

The encoding device, the decoding device and the methods therefor are not limited to the above embodiments, and can be implemented with various modifications. For example, the embodiments may be combined as appropriate.

The above embodiments have been explained using an example in which the decoding device performs its processing using the encoded information transmitted from the encoding device of the respective embodiment. However, the present invention is not limited to this, and the decoding device can perform its processing on any encoded information that contains the necessary parameters and data, not necessarily encoded information from the encoding device of the above embodiments.

Although a speech signal has been described as the signal to be encoded in the above embodiments, an audio signal may also be encoded, and an acoustic signal that contains both of these signals may also be encoded.

The present invention can also be applied to a case where a signal processing program is recorded or written on a machine-readable recording medium, for example a memory, disk, tape, CD or DVD, and is then operated, and operations and effects similar to those of the present embodiments can be obtained.

Also, although the above embodiments have been described taking as examples cases where the present invention is implemented in hardware, the present invention may also be implemented in software.

Each functional block used in the explanation of the above embodiments can typically be implemented as an LSI, which is an integrated circuit. The functional blocks may be individual chips, or may be partially or entirely contained on a single chip. The term LSI is used here, but it may also be referred to as IC, system LSI, super LSI or ultra LSI, depending on the degree of integration.

Moreover, the method of circuit integration is not limited to LSI, and implementation using dedicated circuitry or a general-purpose processor is also possible. After LSI manufacture, an FPGA (field programmable gate array), which can be programmed, or a reconfigurable processor, in which the connections and settings of circuit cells within the LSI can be reconfigured, may also be used.

In addition, if integrated circuit technology that replaces LSI emerges as a result of advances in semiconductor technology or another derivative technology, the functional blocks may of course be integrated using that technology.

The disclosures of Japanese patent application No. 2009-044676, filed February 26, 2009, Japanese patent application No. 2009-089656, filed April 2, 2009, and Japanese patent application No. 2010-001654, filed January 7, 2010, including the descriptions, drawings and abstracts, are incorporated herein by reference in their entirety.

Industrial applicability

The encoding device, the decoding device and the methods therefor according to the present invention can improve the quality of the decoded signal when the spectrum of the high-frequency part is estimated by extending the frequency band using the spectrum of the low-frequency part, and can be applied, for example, to a packet communication system and a mobile communication system.

List of reference signs

101 - encoding device

102 - transmission channel

103 - decoding device

201 - down-sampling processing section

202 - first layer encoding section

132, 203 - first layer decoding section

133, 204 - up-sampling processing section

134, 205, 356 - orthogonal transform processing section

206, 226 - second layer encoding section

207 - encoded information multiplexing section

260 - band dividing section

261, 352 - filter state setting section

262, 353 - filtering section

263 - searching section

264 - pitch coefficient setting section

235, 265 - gain encoding section

266 - multiplexing section

241, 271 - ideal gain encoding section

242, 272 - logarithmic gain encoding section

253, 281, 371, 381 - maximum amplitude value searching section

251, 282, 372, 382 - sample group extracting section

252, 283 - logarithmic gain calculating section

131 - encoded information demultiplexing section

135 - second layer decoding section

351 - demultiplexing section

354 - gain decoding section

355 - spectrum adjusting section

361 - ideal gain decoding section

362 - logarithmic gain decoding section

373, 383 - logarithmic gain applying section

1. An encoding device, comprising:
a first encoding section that generates first encoded information by encoding a low-frequency part of an input signal that is equal to or lower than a predetermined frequency;
a decoding section that generates a decoded signal by decoding the first encoded information;
a dividing section that divides a high-frequency part of the input signal, which is higher than the predetermined frequency, into P (P is an integer greater than 1) subbands, and generates the respective start positions and bandwidths of the P subbands as band division information; and
a second encoding section that generates second encoded information by estimating the respective spectra of the plurality of subbands from a degree of similarity between the spectrum of the input signal and a spectrum obtained by filtering the decoded signal, based on the band division information, partially selecting spectrum components in each subband based on a weight that makes a spectrum component closer to the spectrum component having the maximum amplitude value in that subband easier to select, and calculating an amplitude adjustment parameter for the selected spectrum components.

2. The encoding device according to claim 1, in which the second encoding section comprises:
a filtering section that filters the decoded signal and generates P estimated signals, from a first estimated signal to a P-th estimated signal, the p-th (p=1, 2, ..., P) estimated signal corresponding to the p-th subband;
a setting section that sets the pitch coefficients to be used by the filtering section while changing the pitch coefficients;
a searching section that searches, among the pitch coefficients, for the pitch coefficient that gives the highest degree of similarity between the p-th estimated signal and the p-th subband, as a p-th optimal pitch coefficient; and
a multiplexing section that obtains the second encoded information by multiplexing the P optimal pitch coefficients, from the first optimal pitch coefficient to the P-th optimal pitch coefficient, with the band division information,
wherein the setting section sets the pitch coefficients to be used by the filtering section to estimate the first subband by changing the pitch coefficient within a predetermined range, and sets the pitch coefficients to be used by the filtering section to estimate the m-th (m=2, 3, ..., P) subband, i.e. the second and subsequent subbands, by changing the pitch coefficient within a range corresponding to the (m-1)-th optimal pitch coefficient, or within the predetermined range.

3. The encoding device according to claim 1, in which the second encoding section comprises:
a similar-part searching section that searches the spectrum of the input signal or the spectrum of the decoded signal for the frequency band most similar to the spectrum of each of the plurality of subbands, and for a first amplitude adjustment parameter;
an amplitude value searching section that searches, for each of the subbands, for the spectrum component having the maximum or minimum amplitude value in a high-frequency spectrum estimated from the most similar frequency band and the first amplitude adjustment parameter;
a spectrum component selecting section that partially selects spectrum components based on a weight that makes a spectrum component closer to the spectrum component having the maximum or minimum amplitude value easier to select; and
an amplitude adjustment parameter calculating section that calculates a second amplitude adjustment parameter for the partially selected spectrum components.

4. The encoding device according to claim 1, in which the second encoding section comprises:
a similar-part searching section that searches the spectrum of the input signal or the spectrum of the decoded signal for the frequency band most similar to the spectrum of each of the plurality of subbands, and for a first amplitude adjustment parameter;
a spectrum component selecting section that partially selects spectrum components of a high-frequency spectrum estimated from the most similar frequency band and the first amplitude adjustment parameter; and
an amplitude adjustment parameter calculating section that calculates a second amplitude adjustment parameter for the partially selected spectrum components.

5. The encoding device according to claim 3, in which, for a subband of higher frequency among the plurality of subbands, the spectrum component selecting section selects, over a wider range, spectrum components close to the spectrum component having the maximum or minimum amplitude value.

6. A communication terminal device comprising the encoding device according to claim 1.

7. A base station device comprising the encoding device according to claim 1.

8. A decoding device, comprising:
a receiving section that receives first encoded information and second encoded information generated by an encoding device, the first encoded information being obtained by encoding a low-frequency part of an input signal that is equal to or lower than a predetermined frequency, and the second encoded information being generated by dividing a high-frequency part of the input signal, which is higher than the predetermined frequency, into P (P is an integer greater than 1) subbands, obtaining the respective start positions and bandwidths of the P subbands as band division information, estimating the respective spectra of the plurality of subbands from a degree of similarity between the spectrum of the input signal and a spectrum obtained by filtering a first decoded signal obtained by decoding the first encoded information, based on the band division information, partially selecting spectrum components in each subband based on a weight that makes a spectrum component closer to the spectrum component having the maximum amplitude value in that subband easier to select, and calculating an amplitude adjustment parameter for the selected spectrum components;
a first decoding section that generates a second decoded signal by decoding the first encoded information; and
a second decoding section that generates a third decoded signal containing the high-frequency part of the input signal from a spectrum obtained by orthogonal transform processing of the second decoded signal, by using the second encoded information.

9. The decoding device according to claim 8, in which the second decoding section comprises:
an amplitude value searching section that searches, for each of the subbands, for the spectrum component having the maximum or minimum amplitude value in a high-frequency spectrum estimated from the frequency band most similar to the respective spectra of the plurality of subbands, obtained from the spectrum of the second decoded signal, and from a first amplitude adjustment parameter contained in the second encoded information;
a spectrum component selecting section that partially selects spectrum components based on a weight that makes a spectrum component closer to the spectrum component having the maximum or minimum amplitude value easier to select; and
an amplitude adjustment parameter applying section that applies a second amplitude adjustment parameter to the partially selected spectrum components.

10. The decoding device according to claim 9, in which the amplitude value searching section searches, for each of the subbands, for the spectrum component having the maximum or minimum amplitude value among a part of the spectrum components of the estimated high-frequency spectrum.

11. A communication terminal device comprising the decoding device according to claim 8.

12. A base station device comprising the decoding device according to claim 8.

13. An encoding method, comprising:
a step of generating first encoded information by encoding a low-frequency part of an input signal that is equal to or lower than a predetermined frequency;
a step of generating a decoded signal by decoding the first encoded information;
a step of dividing a high-frequency part of the input signal, which is higher than the predetermined frequency, into P (P is an integer greater than 1) subbands, and generating the respective start positions and bandwidths of the P subbands as band division information; and
a step of generating second encoded information by estimating the respective spectra of the plurality of subbands from a degree of similarity between the spectrum of the input signal and a spectrum obtained by filtering the decoded signal, based on the band division information, partially selecting spectrum components in each subband based on a weight that makes a spectrum component closer to the spectrum component having the maximum amplitude value in that subband easier to select, and calculating an amplitude adjustment parameter for the selected spectrum components.

14. A decoding method, comprising:
a step of receiving first encoded information and second encoded information generated by an encoder, the first encoded information being obtained by encoding a low-frequency part of an input signal at or below a predetermined frequency, and the second encoded information being generated by dividing a high-frequency part of the input signal above the predetermined frequency into P subbands (P is an integer greater than 1), obtaining the start position and bandwidth of each of the P subbands as band division information, estimating the spectrum of each of the subbands, based on the band division information, from the degree of similarity between the spectrum of the input signal and a spectrum obtained by filtering a first decoded signal obtained by decoding the first encoded information, partially selecting spectral components in each subband based on a weight that makes a spectral component closer to the spectral component having the maximum amplitude in that subband easier to select, and calculating an amplitude adjustment parameter that adjusts the amplitude of the selected spectral components;
a step of generating a second decoded signal by decoding the first encoded information; and
a step of generating a third decoded signal containing the high-frequency part of the input signal from a spectrum obtained by applying orthogonal transform processing to the second decoded signal, using the second encoded information.
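
The subband division, partial selection and amplitude adjustment steps recited in claims 13 and 14 can be illustrated with a minimal sketch. It is not the claimed implementation. The function names, the weight 1/(1 + distance from the peak), the keep_ratio parameter and the least-squares gain are all assumptions introduced here for illustration; the claims only require that components nearer the maximum-amplitude component be easier to select and that an amplitude adjustment parameter be computed for the selected components.

```python
def split_subbands(spectrum, starts, widths):
    """Cut a high-band spectrum into P subbands using the band division
    information (start position and bandwidth of each subband)."""
    return [spectrum[s:s + w] for s, w in zip(starts, widths)]


def select_near_peak(subband, keep_ratio=0.5):
    """Partially select components, favouring those close to the peak.

    Hypothetical weighting: each component's magnitude is scaled by
    1 / (1 + distance to the maximum-amplitude component), and the
    highest-scoring fraction `keep_ratio` of the subband is kept.
    Returns the selected indices in ascending order.
    """
    peak = max(range(len(subband)), key=lambda k: abs(subband[k]))

    def score(k):
        weight = 1.0 / (1.0 + abs(k - peak))  # assumed weight shape
        return weight * abs(subband[k])

    n_keep = max(1, int(len(subband) * keep_ratio))
    ranked = sorted(range(len(subband)), key=score, reverse=True)
    return sorted(ranked[:n_keep])


def amplitude_gain(target, estimate, selected):
    """Assumed amplitude adjustment parameter: the least-squares gain that
    matches the estimated subband to the target on the selected
    components only."""
    num = sum(target[k] * estimate[k] for k in selected)
    den = sum(estimate[k] ** 2 for k in selected)
    return num / den if den > 0.0 else 0.0
```

In this sketch the encoder side would call select_near_peak on each estimated subband and transmit the resulting gain, while the decoder side would repeat the same selection on its own estimated spectrum and multiply only the selected components by the received gain.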



 

Similar patents:

FIELD: physics, video.

SUBSTANCE: invention relates to a method and an apparatus for improving audio and video encoding. A signal is processed using DCTIV for each block of samples of said signal (x(k)), wherein integer transform is carried out using lifting steps which represent sub-steps of said DCTIV. Integer transform of said sample blocks using lifting steps and adaptive noise shaping is performed for at least some of said lifting steps, said transform providing corresponding blocks of transform coefficients and noise shaping being performed such that rounding noise from low-level magnitude transform coefficients in a current one of said transformed blocks is decreased whereas rounding noise from high-level magnitude transform coefficients in said current transformed block is increased, and wherein filter coefficients (h(k)) of a corresponding noise shaping filter are derived from said audio or video signal samples on a frame-by-frame basis.

EFFECT: optimising rounding error noise distribution in an integer-reversible transform (DCTIV).

26 cl, 13 dwg

FIELD: physics, acoustics.

SUBSTANCE: invention relates to means of generating an output spatial multichannel audio signal based on an input audio signal. The input audio signal is decomposed based on an input parameter to obtain a first signal component and a second signal component that are different from each other. The first signal component is rendered to obtain a first signal representation with a first semantic property and the second signal component is rendered to obtain a second signal representation with a second semantic property different from the first semantic property. The first and second signal representations are processed to obtain an output spatial multichannel audio signal.

EFFECT: low computational costs of the decoding/rendering process.

5 cl, 8 dwg

FIELD: physics, acoustics.

SUBSTANCE: invention relates to audio signal transmission and is intended for processing an audio signal by varying the phase of spectral values of the audio signal, realised in a bandwidth expansion scheme. The audio signal processing method and device comprise a window processing module for generating a plurality of successive sampling units, a plurality of successive units including at least one added audio sampling unit, an added unit having added values and audio signal values, a first converter for converting the added unit into a spectral representation having spectral values, a phase modifier for varying the phase of spectral values and obtaining a modified spectral representation and a second converter for converting the modified spectral representation into a time domain varying audio signal.

EFFECT: high sound quality.

20 cl, 15 dwg

FIELD: physics, acoustics.

SUBSTANCE: invention relates to audio encoding technologies. An audio encoder for encoding an audio signal has a first coding channel for encoding an audio signal using a first coding algorithm. The first coding channel has a first time/frequency converter for converting an input signal into a spectral domain. The audio encoder also has a second coding channel for encoding an audio signal using a second coding algorithm. The first coding algorithm differs from the second coding algorithm. The second coding channel has a domain converter for converting an input signal from an input domain into an output domain audio signal.

EFFECT: improved encoding/decoding of audio signals in low bitrate circuits.

21 cl, 43 dwg, 10 tbl

FIELD: physics, computation hardware.

SUBSTANCE: invention relates to audio signal processing. The proposed method comprises filtering an audio signal to divide it into two frequency bands and generating multiple subband signals for the signal of each frequency band. For the signal in one frequency band, the subband signals are generated by conversion from the time domain to the frequency domain. For the other frequency band, the subband signals are generated using a bank of subband filters. The proposed device comprises at least one processor and at least one memory device with computer program code, the memory device and the computer program code being configured to cause the at least one processor to control execution of the method.

EFFECT: higher accuracy of audio signals due to improved signal source SNR.

31 cl, 8 dwg

FIELD: physics, acoustics.

SUBSTANCE: audio encoder (100) for encoding audio samples includes a first encoder (110), which introduces time-domain aliasing, for encoding audio samples in a first encoding domain according to a first framing rule, with a start window and a stop window. The audio encoder (100) further includes a second encoder (120) for encoding samples in a second encoding domain, which processes a number of audio samples set by the frame format, including a series of audio samples of a coding warm-up interval, and applies a different, second framing rule, wherein a frame of the second encoder (120) is an encoded representation of time-consecutive audio samples, the number of which is set by the frame format. The audio encoder (100) also includes a controller (130) which switches from the first encoder (110) to the second encoder (120) according to the characteristics of the audio samples and either modifies the second framing rule when switching from the first encoder (110) to the second encoder (120), or modifies the start window or stop window of the first encoder (110) while keeping the second framing rule unchanged.

EFFECT: improved switching between multiple working regions when encoding sound in both the time and frequency domains.

34 cl, 28 dwg

FIELD: physics.

SUBSTANCE: input spectrum is broken into a plurality of subbands. A representative value is calculated for each subband using an arithmetic mean and a geometric mean. Nonlinear conversion is performed with respect to each representative value. The nonlinear conversion characteristic is amplified as the value increases. The representative value, which was subjected to nonlinear conversion for each subband, is smoothed in the frequency domain.

EFFECT: faster spectral smoothing and higher quality of the output audio signal.

11 cl, 15 dwg

FIELD: information technology.

SUBSTANCE: audio signal decoder designed to provide a decoded representation of an audio signal based on an encoded representation of the audio signal, which includes information on the evolution of a time warp contour, includes a time warp contour calculator, a time warp contour data rescaler and a warp decoder. The time warp contour calculator is designed to generate time warp contour data by repeatedly restarting from a predefined time warp contour start value, based on the time warp contour evolution information, which describes the temporal evolution of the time warp contour. The time warp contour data rescaler is designed to rescale at least part of the time warp contour data so that a discontinuity at a restart is avoided, reduced or eliminated in the rescaled version of the time warp contour. The warp decoder is designed to provide a decoded representation of the audio signal based on the encoded representation of the audio signal and using the rescaled version of the time warp contour.

EFFECT: supporting low bit rate with reliable reconstruction of the required time warp information at the decoder side.

14 cl, 40 dwg

FIELD: information technology.

SUBSTANCE: in the encoder, spectrum residue form vector candidates are stored in a spectrum residue form codebook (305), spectrum residue gain candidates are stored in a spectrum residue gain codebook (307), and the spectrum residue form vector and the spectrum residue gain are successively output from the candidates in accordance with an instruction from a search unit (306). A multiplier (308) multiplies the spectrum residue form vector candidate by the spectrum residue gain candidate and sends the result to a filtration unit (303). Using the internal status of the filter, the filtration unit (303) filters the fundamental tone given by the filter status setting unit (302), lag T which is output by a lag setting unit (304), and the spectrum residue form vector and the controlled gain.

EFFECT: obtaining a high quality decoded signal with scalable coding of the initial signal in first and second layers, even if the unit of the second or higher layer encodes at a low bit rate.

9 cl, 21 dwg

FIELD: information technology.

SUBSTANCE: described is a method and a system for generating a converted output signal from an input signal using a conversion coefficient T. The system includes an analysis window with length La, which extracts an input signal frame, and a unit which analyses transformation of the order M, which transforms discrete values into M complex coefficients. M depends on the conversion coefficient T. The system also includes a nonlinear processing unit, which changes the phase of complex coefficients using the conversion coefficient T, a unit which synthesises transformation of the order M, which transforms the changed coefficients into M changed discrete values, and a synthesis window with length Ls, which generates an output signal va(n) frame.

EFFECT: high reliability of the signal conversion system, and providing improved harmonic conversion with little additional complexity.

37 cl, 12 dwg

FIELD: technologies for encoding audio signals.

SUBSTANCE: method for generating a high-frequency reconstructed version of a low-band input signal by high-frequency spectral reconstruction using a digital filter bank system is based on splitting the low-band input signal with an analysis filter bank to produce complex subband signals in channels, obtaining a series of consecutive complex subband signals in channels of the reconstruction range, adjusting the envelope to produce a predetermined spectral envelope in the reconstruction range, and combining said series of signals with a synthesis filter bank.

EFFECT: higher efficiency.

4 cl, 5 dwg

FIELD: analysis of sound signal quality, possible use for estimating quality of speech transferred through radio communication channels.

SUBSTANCE: in accordance with the method for machine estimation of sound signal quality, the signal is divided into critical bands and spectral energy values are computed for the critical bands, spectral similarity values are determined for the active phase of fragments, and the quality of the tested sound signal is determined as a weighted linear combination of the aforementioned quality values for each phase. The method differs in that the selected fragments of the active and inactive phases of both signals are synchronised, inactive-phase spectra are determined for each fragment, the resulting spectra of the active and inactive phases of the fragments are divided into additional sets of bands, for each of which spectral energy values are computed, and the resulting spectral energies of the active and inactive phases of the fragments are compared in pairs to determine spectral similarity coefficients; the resulting similarity coefficient for each phase is determined as the average of the similarity coefficients over all sets of bands and serves as the quality estimate for that phase.

EFFECT: ensured universality and optimized quality of estimation process depending on purposes of estimation.

5 cl, 13 dwg, 6 tbl

FIELD: method for transmitting audio signals between transmitter and at least one receiver using priority pixel transmission method.

SUBSTANCE: in accordance with the invention, an audio signal is separated into a number n of spectral components, and the separated audio signal is stored in a two-dimensional matrix with a set of fields, with frequency and time as dimensions and amplitude as the value recorded in each field. Groups are then formed from each individual field and at least two adjacent fields, and priorities are assigned to the groups, where the priority of a group is higher the higher the amplitudes of the group values and/or the greater the amplitude differences between the values of the group and/or the closer the group is to the current time. The groups are transmitted to the receiver in order of their priority.

EFFECT: ensured transmission of audio signals without losses even when the width of transmission band is low.

7 cl, 1 dwg

FIELD: physics.

SUBSTANCE: said utility invention relates to audio coders, in particular, to audio coders, in which time representation is converted into spectral representation. The essence of the invention is as follows: for the determination of the quantiser step for the quantisation of a signal containing audio or video information, the first quantiser step value is generated, along with the interference threshold. After that, the interference actually introduced due to the first quantiser step value is calculated and compared to the interference threshold. In spite of the fact that the comparison indicates that the actually introduced interference exceeds the threshold, the second, coarser quantiser step value is applied, which is then used for the quantisation if it is found that the interference introduced due to the coarser quantisation step value is less than the threshold or the interference introduced due to the first quantiser step value.

EFFECT: result of invention implementation is that quantisation interference decreases due to selection of coarser quantisation step value and resulting increased compression benefit.

10 cl, 5 dwg

FIELD: physics.

SUBSTANCE: device for multichannel signal processing includes comparator between the first and the second of the two channels. Additionally, a filter of spectral ratio prediction is provided to perform filtration of prediction only with one prediction filter for both channels in case of high similarity between the first and the second channel and filtration of prediction with two separate prediction filters in case of distinction between the first and the second channel.

EFFECT: increased efficiency of coding with technology of stereosignal coding.

12 cl, 3 dwg

Audio coding // 2335809

FIELD: physics.

SUBSTANCE: the invention dispenses with the usual interpolation of filter coefficients and gain value to obtain interpolated values for intermediate audio values. Instead of interpolating the gain value, coding can use a power limit derived from the masking threshold, preferably as the area under the square of the masking threshold value, for each node, i.e. for each transmitted parameterisation, followed by interpolation between these power limits at adjacent nodes, e.g. by linear interpolation. On both the coder side and the decoder side, the gain value can then be calculated from the intermediate power limit so that the quantising noise caused by fixed frequency quantisation before the subsequent filtering on the decoder side is lower than, or corresponds to, the power limit after that filtering.

EFFECT: provided coding of tapped audio noise reduction.

16 cl, 15 dwg

FIELD: physics.

SUBSTANCE: to determine an estimate of the number of information units needed to encode a signal, a measure (nl(b)) of the power distribution within the frequency band is taken into account in addition to the permissible interference for the frequency band and the power of the frequency band.

EFFECT: improved precision of the estimate of the information units needed, allowing more precise and efficient signal encoding.

11 cl, 10 dwg

FIELD: information technology.

SUBSTANCE: in the audio encoder, cue codes are generated for one or more audio channels, where an envelope cue code is generated by characterising a temporal envelope in an audio channel. In the audio decoder, E transmitted audio channel(s) are decoded to generate C playback audio channels, where C > E ≥ 1. The received cue codes include an envelope cue code corresponding to the characterised temporal envelope of an audio channel corresponding to the transmitted channel(s). One or more transmitted channels are upmixed to generate one or more upmixed channels. One or more playback channels are synthesised by applying the cue codes to the one or more upmixed channels, where the envelope cue code is applied to an upmixed channel or to the synthesised signal to adjust the temporal envelope of the synthesised signal on the basis of the characterised temporal envelope, so that the adjusted temporal envelope substantially matches the characterised temporal envelope.

EFFECT: widening the arsenal of resources for coding audio-channels.

42 cl, 27 dwg

FIELD: information technologies.

SUBSTANCE: invention relates to a method of supporting the encoding of an audio signal, in which at least one segment of the audio signal is to be encoded using a coding model that allows different coding frame lengths. It is proposed to determine at least one control parameter based on characteristics of the audio signal. This control parameter is then used to restrict the options for selecting the frame length for the at least one segment of the signal. The group of inventions also comprises a module (10, 11) in which the method is implemented, a device (1) and a system comprising such a module (10, 11), and a software product that includes program code for implementing the proposed method.

EFFECT: presentation of possibility of simple selection of corresponding most suitable duration of coding frame.

34 cl, 4 dwg

FIELD: physics; computer engineering.

SUBSTANCE: invention relates to computer engineering and can be used in sound encoding devices. The method involves the following: an information amplitude signal is decomposed to spectral lines, where each spectral line contains a sequence of spectral values, presented in an x-bit presentation without taking the logarithm; each spectral value is squared for each spectral group, and the squared spectral values are summed up to obtain a sum of squares as the result of calculation in the presentation without taking the logarithm, wherein presentation of the calculation result without taking the logarithm is scaled by an effective scaling factor compared to the sum of squares; for each calculation result, a logarithmic function is applied to y bits of presenting the result without taking the logarithm in order to obtain a scaled presentation of the calculation result with taking the logarithm, where y is less than x multiplied by 2; and compensation factor is added or subtracted for each presentation of scaling with taking the logarithm to the scaled logarithmic presentation or from it, respectively, where the value which corresponds to the logarithmic function is applied to the effective scaling coefficient to obtain presentation of the calculation result with taking the logarithm of the energy of the signal of the corresponding spectral group and so that values of energy of the signal in spectral groups have the same degree of scaling.

EFFECT: easy calculation and/or possibility of calculating with low expenses on equipment.

22 cl, 8 dwg
