Forward time-domain aliasing cancellation with application in weighted or original signal domain

FIELD: physics, computer engineering.

SUBSTANCE: the present invention relates to methods and devices for forward time-domain aliasing cancellation in a coded signal transmitted from an encoder to a decoder. The technical result is to facilitate cancellation of aliasing effects at the switching point between coding modes. The decoder receives the bitstream and cancels the time-domain aliasing in the coded signal in response to the information contained in the bitstream. The information may be representative of a difference between a frame of the audio signal to be coded in a first coding mode and a decoded signal from that frame including time-domain aliasing effects.

EFFECT: information related to correction of the time-domain aliasing in the coded signal is calculated at the encoder and added to the bitstream sent from the encoder to the decoder.

34 cl, 17 dwg

 

TECHNICAL FIELD

The present invention relates to the field of encoding and decoding audio signals. In particular, the present invention relates to a device and method for time-domain aliasing cancellation using transmission of additional information.

BACKGROUND

In the prior art, audio coding uses time-frequency decomposition to represent the signal in a meaningful form suitable for data reduction. In particular, audio coders use transforms to map time-domain samples into frequency-domain coefficients. The discrete-time transforms used for this time-to-frequency mapping are usually based on kernels of sinusoidal functions, such as the discrete Fourier transform (DFT) and the discrete cosine transform (DCT). It can be shown that such transforms achieve energy compaction of the audio signal. This means that, in the transform (or frequency) domain, the energy distribution is localized on fewer significant coefficients than in the time-domain samples. Further coding gain may be achieved by applying adaptive bit allocation and suitable quantization to the frequency-domain coefficients. At the receiver, the bits representing the quantized and encoded parameters (e.g. the frequency-domain coefficients) are used to recover the quantized frequency-domain coefficients (or other quantized data, such as gains), and the inverse transform generates the time-domain audio signal. Such coding schemes are generally referred to as transform coding.

By definition, transform coding operates on consecutive blocks of samples of the input audio signal. Since quantization introduces some distortion in each synthesized block of the audio signal, using non-overlapping blocks may introduce discontinuities at the block boundaries, which may degrade audio quality. Hence, in transform coding, to avoid discontinuities, the coded blocks of the audio signal are overlapped before the discrete transform is applied and are appropriately windowed in the overlapping segment to allow a smooth transition from one decoded block to the next. Using a standard transform such as the DFT (or its fast equivalent, the fast Fourier transform (FFT)) or the DCT and applying it to overlapped blocks unfortunately results in so-called "non-critical sampling". For example, with a typical overlap of 50%, coding a block of N consecutive time-domain samples actually requires taking a transform over 2N consecutive samples: N samples from the current block and N samples from the overlapping part of the next block. Hence, for each block of N time-domain samples, 2N frequency-domain coefficients are encoded. Critical sampling in the frequency domain means that N input time-domain samples produce only N frequency-domain coefficients to be quantized and encoded.

Specialized transforms have been developed that allow the use of overlapping windows while still maintaining critical sampling in the transform domain, so that 2N time-domain samples at the input of the transform yield N frequency-domain coefficients at its output. To achieve this, the block of 2N time-domain samples is first reduced to a block of N time-domain samples through a special time inversion and summation of specific parts of the windowed signal of length 2N samples. This special time inversion and summation introduces so-called "time-domain aliasing", or TDA. Once this aliasing has been introduced in a block of the signal, it cannot be removed using that block alone. It is this time-domain aliased signal that is the input of a transform of size N (instead of 2N), producing the N frequency-domain coefficients of the transform. To recover N time-domain samples, the inverse transform must use the transform coefficients from two consecutive, overlapping windows to cancel the TDA, in a process called time-domain aliasing cancellation, or TDAC.

An example of such a transform applying TDAC, which is widely used in audio coding, is the modified discrete cosine transform (MDCT). Actually, the MDCT performs the above TDA without explicit folding in the time domain. Rather, time-domain aliasing is introduced when the direct MDCT and the inverse MDCT (IMDCT) of a single block are considered together. This results from the mathematical construction of the MDCT and is well known to those skilled in the art. It is also known that this implicit time-domain aliasing can be regarded as equivalent to first inverting parts of the time-domain samples and then adding these inverted parts to (or subtracting them from) other parts of the signal. This is known as "folding".
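The behaviour described above can be checked numerically. The following is a minimal sketch, not taken from the patent: it assumes a direct O(N^2) MDCT, an IMDCT scaled by 2/N, a sine window, and an arbitrary test signal, and all function names are illustrative. It shows that samples covered by two overlapping windows are recovered, while samples covered by a single window keep their aliasing.

```python
import math

def mdct(block):
    """Direct O(N^2) MDCT: 2N windowed samples in, N coefficients out."""
    n = len(block) // 2
    return [sum(block[t] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
                for t in range(2 * n))
            for k in range(n)]

def imdct(coeffs):
    """Inverse MDCT with 2/N scaling: N coefficients in, 2N aliased samples out."""
    n = len(coeffs)
    return [2.0 / n * sum(coeffs[k] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
                          for k in range(n))
            for t in range(2 * n)]

def tdac_errors(n=8):
    """Return (max error where two windows overlap, max error under one window only)."""
    x = [math.sin(0.3 * t) + 0.1 * t for t in range(3 * n)]              # arbitrary test signal
    w = [math.sin(math.pi * (t + 0.5) / (2 * n)) for t in range(2 * n)]  # sine window (assumed)
    out = [0.0] * (3 * n)
    for start in (0, n):                       # two windows of 2N samples, 50% overlap
        y = imdct(mdct([w[t] * x[start + t] for t in range(2 * n)]))
        for t in range(2 * n):
            out[start + t] += w[t] * y[t]      # second windowing, then overlap-add
    covered = max(abs(out[t] - x[t]) for t in range(n, 2 * n))
    uncovered = max(abs(out[t] - x[t]) for t in range(n))
    return covered, uncovered

covered, uncovered = tdac_errors()
print(covered, uncovered)   # first value is near zero, second is not
```

The large error in the singly covered region is the uncancelled windowing and aliasing that remains when a neighbouring block does not contribute an overlapping TDAC window.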

A problem occurs when an audio coder switches between two coding models, one using TDAC and the other not. Suppose, for example, that a codec switches from a TDAC coding model to a non-TDAC coding model. The side of the block of samples coded with the TDAC coding model that is shared with the block coded without TDAC contains aliasing that cannot be cancelled using the block of samples coded with the non-TDAC coding model.

A first solution is to discard the samples that contain aliasing that cannot be cancelled.

This solution results in inefficient use of transmission bandwidth, because the block of samples for which TDA cannot be cancelled is coded twice: once by the TDAC-based codec and a second time by the non-TDAC-based codec.

A second solution is to use specially designed windows that do not introduce TDA in at least one part of the window when the time inversion and summation process is applied. Fig. 1 is a diagram of an example window that introduces TDA on its left side but not on its right side. In particular, in Fig. 1 a window 100 of 2N samples introduces TDA 110 on its left side. The window 100 of Fig. 1 is suitable for transitions from a TDAC-based codec to a non-TDAC-based codec. The first half of this window is shaped so that it introduces TDA 110, which can be cancelled if the previous window also uses TDA with overlap. On its right side, however, the window of Fig. 1 has zero-valued samples 120 after the folding point at position 3N/2. This part of the window 100 therefore introduces no TDA when the time inversion and summation (or folding) process is performed around the folding point at position 3N/2.

In addition, the left side of the window 100 contains a flat region 130 preceded by a tapered region 140. The purpose of the tapered region 140 is to provide good spectral resolution when the transform is computed and to smooth the transition during overlap-and-add operations between adjacent blocks. Increasing the duration of the flat region 130 of the window reduces the information bandwidth and lowers the spectral performance of the window, because part of the window is sent without carrying any information.

In the multi-mode Moving Pictures Expert Group (MPEG) Unified Speech and Audio Codec (USAC), several special windows such as the one described in Fig. 1 are used to manage the various transitions from frames using non-overlapping rectangular windows to frames using non-rectangular, overlapping windows. These special windows were designed to achieve different trade-offs between spectral resolution, reduction of data overhead and smoothness of the transition between these different frame types.

SUMMARY OF THE INVENTION

Consequently, there is a need for an aliasing cancellation technique that supports switching between coding modes, wherein the technique cancels aliasing effects at the switching point between these modes.

In this regard, the present invention provides a method for forward cancellation of time-domain aliasing in a coded signal received in a bitstream at a decoder. The method comprises receiving in the bitstream at the decoder, from an encoder, additional information related to correction of the time-domain aliasing in the coded signal. In the decoder, the time-domain aliasing in the coded signal is cancelled in response to the additional information.

The present invention also provides a method for forward cancellation of time-domain aliasing in a coded signal for transmission from an encoder to a decoder. This method comprises calculating, in the encoder, additional information related to correction of the time-domain aliasing in the coded signal. The additional information related to correction of the time-domain aliasing in the coded signal is sent in the bitstream from the encoder to the decoder.

The present invention also provides a device for forward cancellation of time-domain aliasing in a coded signal received in a bitstream. The device comprises a receiver for receiving in the bitstream, from an encoder, additional information related to correction of the time-domain aliasing in the coded signal. The device also comprises a canceller of the time-domain aliasing in the coded signal, operating in response to the additional information.

In addition, the present invention relates to a device for forward cancellation of time-domain aliasing in a coded signal for transmission to a decoder. The device comprises a calculator of additional information related to correction of the time-domain aliasing in the coded signal. The device also comprises a transmitter for sending to the decoder, in the bitstream, the additional information related to correction of the time-domain aliasing in the coded signal.

The above and other features will become more apparent upon reading the following non-restrictive description of illustrative embodiments of the invention, given by way of example only with reference to the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

Embodiments of the present invention are described only by way of example with reference to the accompanying drawings, in which:

Fig. 1 is a diagram of an example window that introduces TDA on its left side but not on its right side;

Fig. 2 is a diagram of an example transition from a block using a non-overlapping rectangular window to a block using an overlapping window;

Fig. 3 is a diagram showing folding and TDA applied to the diagram of Fig. 2;

Fig. 4 is a diagram showing the forward aliasing correction applied to the diagram of Fig. 2;

Fig. 5 is a diagram showing an unfolded forward aliasing cancellation (FAC) correction (left) and a folded FAC correction (right);

Fig. 6 is a first illustration of the application of the FAC correction method using the MDCT;

Fig. 7 is a diagram of the FAC correction using information from the ACELP mode;

Fig. 8 is a diagram of the FAC correction applied at a transition from a block using an overlapping window to a block using a non-overlapping rectangular window;

Fig. 9 is a diagram showing an unfolded FAC correction (left) and a folded FAC correction (right);

Fig. 10 is a second illustration of the application of the FAC correction method using the MDCT;

Fig. 11 is a block diagram of FAC quantization including TCX error correction;

Fig. 12 is a diagram of various use cases of the FAC correction in a multi-mode coding system;

Fig. 13 is a diagram of another use case of the FAC correction in a multi-mode coding system;

Fig. 14 is a diagram of a first use case of the FAC correction when switching between short transform-based frames and ACELP frames;

Fig. 15 is a diagram of a second use case of the FAC correction when switching between short transform-based frames and ACELP frames;

Fig. 16 is a block diagram of an example device for forward cancellation of time-domain aliasing in a coded signal received in a bitstream; and

Fig. 17 is a block diagram of an example device for forward cancellation of time-domain aliasing in a coded signal for transmission to a decoder.

DETAILED DESCRIPTION

The following description addresses the problem of cancelling the effects of time-domain aliasing and non-rectangular windowing when an audio signal is coded using both overlapping and non-overlapping windows in adjacent frames. The technology described herein makes it possible to avoid the use of special, suboptimal windows while still properly managing frame transitions in a model that uses both non-overlapping rectangular windows and non-rectangular, overlapping windows.

An example of a frame using non-overlapping rectangular windowing is linear prediction (LP) coding, in particular algebraic code-excited linear prediction (ACELP) coding. An example of non-rectangular, overlapping windowing is transform coded excitation (TCX) coding as applied in the Unified Speech and Audio Codec (USAC), in which TCX frames use both overlapping windows and the modified discrete cosine transform (MDCT), which introduces time-domain aliasing (TDA). USAC is also a typical example in which adjacent frames can be coded using either non-overlapping rectangular windows, as in ACELP frames, or non-rectangular, overlapping windows, as in TCX frames and advanced audio coding (AAC) frames. Without loss of generality, the present description therefore considers the specific example of USAC to illustrate the benefits of the proposed system and method.

Two distinct cases are considered. The first case occurs when the transition is from a frame using a non-overlapping rectangular window to a frame using a non-rectangular, overlapping window. The second case occurs when the transition is from a frame using a non-rectangular, overlapping window to a frame using a non-overlapping rectangular window. For illustration purposes and without implying any limitation, frames using non-overlapping rectangular windows may be coded using the ACELP model, and frames using non-rectangular, overlapping windows may be coded using the TCX model. In addition, some frames are given a specific duration, for example 20 milliseconds for a TCX frame, denoted TCX20. It should be kept in mind, however, that these specific examples are for illustration only, and that other frame durations and coding types, different from ACELP and TCX, may be contemplated.

The case of a transition from a frame using a non-overlapping rectangular window to a frame using a non-rectangular, overlapping window is considered first, with reference to Fig. 2, which is a diagram of an example transition from a block using a non-overlapping rectangular window to a block using an overlapping window.

Referring to Fig. 2, an example non-overlapping rectangular window comprises an ACELP frame 202, and an example non-rectangular, overlapping window 204 comprises a TCX20 frame 206. TCX20 refers to the short TCX frames in USAC, which nominally have a duration of 20 ms, as do ACELP frames in many applications. Fig. 2 shows which samples are used in each frame and how they are windowed at the encoder. The same window 204 is applied at the decoder, so that the combined effect seen at the decoder is the square of the window shape shown in Fig. 2. This double windowing, once at the encoder and once at the decoder, is typical of transform coding. Where no window is drawn, as in the ACELP frame 202, a rectangular window is effectively used for that frame. The non-rectangular window 204 for the TCX20 frame 206 shown in Fig. 2 is chosen so that, if the previous and next frames also use overlapping, non-rectangular windows, the overlapping portions 204a and 204b of the windows are complementary after the second windowing at the decoder and allow recovery of the "unwindowed" signal in the overlapping region of the windows.

To code the TCX20 frame 206 of Fig. 2 efficiently, time-domain aliasing (TDA) is typically applied to the windowed samples of that TCX20 frame 206. In particular, the left portion 204a and the right portion 204b are folded and combined. Fig. 3 is a diagram showing folding and TDA applied to the diagram of Fig. 2. The non-rectangular window 204 introduced in the description of Fig. 2 is shown as four quarters. The 1st and 4th quarters 204a and 204d of the window 204 are shown in dashed lines, since they are combined with the 2nd and 3rd quarters 204b, 204c, shown in solid lines. Combining the 1st and 4th quarters 204a, 204d with the 2nd and 3rd quarters 204b, 204c follows a process similar to that used in MDCT coding, as follows. The 1st quarter 204a is time-reversed, then shifted so that it is aligned with the 2nd quarter 204b of the window, and finally the time-reversed and shifted 1st quarter 204e is subtracted from the 2nd quarter 204b of the window. Similarly, the 4th quarter 204d of the window is time-reversed and shifted (204f) so that it is aligned with the 3rd quarter 204c of the window, and is finally added to the 3rd quarter 204c of the window. If the TCX20 window 204 shown in Fig. 2 has 2N samples, this process yields N samples that extend exactly from the beginning to the end of the TCX20 frame 206 shown in Fig. 3. These N samples then form the input of an appropriate transform for efficient coding in the transform domain. With the specific time-domain aliasing described in Fig. 3, the MDCT may be the transform used for this purpose.
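The folding of the four window quarters can be sketched directly on a list of windowed samples. This is an illustrative sketch only; the function name `fold` and the toy input values are assumptions, and the window multiplication is taken to have been applied already.

```python
def fold(windowed):
    """Fold 2N windowed samples (quarters q1..q4) into N samples."""
    quarter = len(windowed) // 4
    q1, q2, q3, q4 = (windowed[i * quarter:(i + 1) * quarter] for i in range(4))
    # 1st quarter: time-reversed, aligned with the 2nd quarter, subtracted from it.
    first_half = [q2[k] - q1[quarter - 1 - k] for k in range(quarter)]
    # 4th quarter: time-reversed, aligned with the 3rd quarter, added to it.
    second_half = [q3[k] + q4[quarter - 1 - k] for k in range(quarter)]
    return first_half + second_half

folded = fold([1, 2, 3, 4, 5, 6, 7, 8])   # 2N = 8 samples in, N = 4 samples out
print(folded)                             # prints [1, 3, 13, 13]
```

The output has half as many samples as the input, which is what keeps the subsequent transform critically sampled.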

After the time-reversed and shifted portions of the window have been combined as described in Fig. 3, it is no longer possible to recover the original time-domain samples in the TCX20 frame, because they are mixed with time-reversed versions of samples outside the TCX20 frame. In an MDCT-based audio coder such as MPEG AAC, where all frames are coded using the same transform and overlapping windows, this time-domain aliasing can be cancelled and the audio samples recovered by using two consecutive overlapping frames. However, when adjacent frames do not use the same windowing and overlapping process, as in Fig. 2 where the TCX20 frame is preceded by an ACELP frame, the effect of the non-rectangular window and of the time-domain aliasing cannot be removed using only the information from the preceding ACELP frame and the following TCX20 frame.

Techniques for managing this type of transition were presented above. The present description proposes an alternative approach to managing such transitions. This approach does not use suboptimal, asymmetric windows in frames coded with an MDCT-based transform. Instead, the methods and devices introduced herein allow the use of a symmetric window centred on the middle of the coded frame, such as the TCX20 frame shown in Fig. 3, with 50% overlap with MDCT-coded frames that also use non-rectangular windows. The methods and devices introduced herein thus propose to send from the encoder to the decoder, as additional information in the bitstream, a correction that cancels the effect of windowing and time-domain aliasing when switching from frames coded with a non-overlapping rectangular window to frames coded with a non-rectangular, overlapping window, and vice versa. Several cases of such transitions are possible.

In Fig. 2, non-overlapping rectangular windowing is shown for the ACELP frame, and non-rectangular, overlapping windowing is shown for the TCX20 frame. With the TDA introduced in Fig. 3, a decoder that first receives the bits of the ACELP frame has enough information to fully decode that ACELP frame up to its last sample. But then, on receiving the bits of the TCX20 frame, proper decoding of all the samples in the TCX20 frame is impaired by the aliasing effect caused by the presence of the preceding ACELP frame. If the next frame also uses an overlapping window, the non-rectangular windowing and the TDA introduced at the encoder can be cancelled in the second half of the TCX20 frame shown, and those samples can be properly decoded. In the first half of the TCX20 frame, however, where the time-reversed and shifted 1st quarter 204e is subtracted from the 2nd quarter 204b in Fig. 3, the effect of the non-rectangular window and of the TDA introduced at the encoder cannot be cancelled, because the preceding ACELP frame uses a non-overlapping window. The methods and devices introduced herein therefore propose to transmit information, the forward time-domain aliasing cancellation (FAC) information, to cancel these effects and properly recover the first half of the TCX20 frame.

Fig. 4 is a diagram showing the forward aliasing cancellation (FAC) correction applied to the diagram of Fig. 2. Fig. 4 illustrates the situation at the decoder, where the windowing, for example the cosine window used in the MDCT, has already been applied a second time after the inverse transform. Only the transition from ACELP to TCX20 is considered, regardless of the frame that follows the TCX20 frame. Hence, in Fig. 4, the samples to which the FAC correction is applied correspond to the first half of the TCX20 frame. This is referred to as the FAC zone 402. Two effects are compensated by the FAC in this example. The first is the windowing effect, referred to as x_w 404 in Fig. 4. It corresponds to multiplying the samples in the first half of the TCX20 frame 206 by the 2nd quarter 204b of the non-rectangular window in Fig. 3. The first part of the FAC correction therefore consists of adding the complement of these windowed samples, which corresponds to the correction for the x_w segment 406 in Fig. 4. For example, if a given input sample x[n] was multiplied by the window sample w[n] at the encoder, the complement of this windowed sample is simply (1-w[n]) multiplied by x[n]. The sum of x_w 404 and the correction for x_w 406 has a weight of 1 for all samples in this segment. The second part of the FAC correction corresponds to the time-domain aliasing component that was added at the encoder in the TCX20 frame. To remove this aliasing component, referred to as the aliasing part x_a 408 in Fig. 4, the correction for x_a 406 in Fig. 4 is time-inverted, aligned with the first half of the TCX20 frame and added to that first half of the segment, shown as the aliasing part x_a 408. It is added rather than subtracted because, in Fig. 3, the left part of the folding that leads to time-domain aliasing involved subtracting this component, so it is now added back to remove it. The sum of these two parts, the windowing compensation for x_w 404 and the aliasing compensation for x_a 408, forms the complete FAC correction in the FAC zone 402.
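The two parts of the FAC correction can be verified on a toy signal. The sketch below rests on several assumptions not stated in the text: a sine window at both encoder and decoder, so that the effective window seen at the decoder is its square; no quantization; and illustrative names throughout. Under those assumptions, adding the window complement and the aliasing term restores the first half of the frame exactly.

```python
import math

def fac_restoration_error(n=8):
    """Max error after applying the FAC correction to the first half of a frame of N samples."""
    half = n // 2
    # Sine window over 2N samples (assumed shape); x holds the samples under that window.
    w = [math.sin(math.pi * (t + 0.5) / (2 * n)) for t in range(2 * n)]
    x = [math.cos(0.4 * t) + 0.05 * t for t in range(2 * n)]   # x[:half] lies in the previous frame
    v = [w[t] * x[t] for t in range(2 * n)]                    # encoder windowing

    # Encoder folding for the first half of the frame: 2nd quarter minus reversed 1st quarter.
    folded = [v[half + k] - v[half - 1 - k] for k in range(half)]
    # Decoder: window applied a second time after the inverse transform.
    decoded = [w[half + k] * folded[k] for k in range(half)]

    # Part 1: complement of the effective (squared) window times the input sample.
    window_part = [(1.0 - w[half + k] ** 2) * x[half + k] for k in range(half)]
    # Part 2: the aliasing term that folding subtracted, so it is added back.
    aliasing_part = [w[half + k] * w[half - 1 - k] * x[half - 1 - k] for k in range(half)]

    restored = [d + a + b for d, a, b in zip(decoded, window_part, aliasing_part)]
    return max(abs(restored[k] - x[half + k]) for k in range(half))

print(fac_restoration_error())   # near zero
```

In a real codec the correction would be quantized, so the restoration would be approximate rather than exact.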

There are several options for coding the FAC correction. Fig. 5 is a diagram showing an unfolded FAC correction (left) and a folded FAC correction (right). One option is to code the windowed FAC signal directly, as shown on the left side of Fig. 5. This signal, referred to as the FAC window 502 in Fig. 5, covers twice the length of the FAC zone. At the decoder, the decoded windowed FAC signal can then be folded (by time-inverting its left half and adding it to its right half), and this folded signal can be added as a correction 504 in the FAC zone 402, as shown on the right side of Fig. 5. With this approach, twice as many time-domain samples are coded as the length of the correction.

Another approach to coding the FAC correction signal shown on the left of Fig. 5 is to perform the folding at the encoder, before the signal is coded. This yields the folded signal on the right of Fig. 5, in which the left half of the windowed FAC signal is time-inverted and added to the right half of the windowed FAC signal. Transform coding, using the DCT for example, can then be applied to this folded signal. At the decoder, the decoded folded signal can simply be added in the FAC zone, since the folding has already been performed at the encoder. This approach codes the same number of time-domain samples as the length of the FAC zone, which results in critically sampled transform coding.
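The folding of the windowed FAC signal is the same operation whether it runs at the decoder (first option) or at the encoder (second option); only the number of coded samples differs. A minimal sketch, with a hypothetical helper name and toy values:

```python
def fold_fac(unfolded):
    """Time-invert the left half of the windowed FAC signal and add it to the right half."""
    half = len(unfolded) // 2
    return [unfolded[half + k] + unfolded[half - 1 - k] for k in range(half)]

unfolded = [1, 2, 3, 4, 10, 20, 30, 40]   # toy windowed FAC signal, twice the FAC zone length
correction = fold_fac(unfolded)
print(correction)                         # prints [14, 23, 32, 41]
print(len(unfolded), len(correction))     # prints 8 4
```

Coding `unfolded` transmits twice as many samples as the FAC zone contains, while coding `correction` transmits exactly as many.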

Yet another approach to coding the FAC correction signal shown on the left of Fig. 5 is to use the implicit folding of the MDCT. Fig. 6 is a first illustration of the application of the FAC correction method using the MDCT. The upper left quadrant shows the content of the FAC window 502, with a slight modification. In particular, the last quarter 502a of the FAC window is moved to the left of the FAC window 502 and inverted in sign (502b). In other words, the FAC window of Fig. 5 is cyclically rotated to the right by ¼ of its total length, and the sign of the first ¼ of the samples is then inverted. The MDCT is then applied to this windowed signal. By its mathematical construction, the MDCT implicitly applies a folding operation, which yields the folded signal 602 shown in the upper right quadrant of Fig. 6. This folding in the MDCT applies a sign inversion to the left part 502b, but not to the right part 502c, to which the folded segment is added. Comparing the resulting folded signal 602 with the complete FAC correction 504 of Fig. 5 shows that it is equivalent to the FAC correction 504 except for a time inversion. Thus, at the decoder, after the inverse MDCT (IMDCT), this signal 602, which is the time-inverted FAC correction signal, is inverted in time (or mirrored) and becomes the FAC correction signal 604, as shown in the lower right quadrant of Fig. 6. As described above, this FAC correction 604 can be added to the signal in the FAC zone of Fig. 4.
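The equivalence between the rotate-and-flip preparation followed by MDCT-style folding and the direct folding of the FAC signal can be checked on integers. The sketch below replaces the MDCT by its explicit time-domain folding (reversed 1st quarter subtracted from the 2nd, reversed 4th quarter added to the 3rd); the helper names and sample values are illustrative assumptions.

```python
def mdct_style_fold(block):
    """Explicit equivalent of the MDCT's implicit folding: 2M samples in, M out."""
    q = len(block) // 4
    q1, q2, q3, q4 = (block[i * q:(i + 1) * q] for i in range(4))
    return ([q2[k] - q1[q - 1 - k] for k in range(q)] +
            [q3[k] + q4[q - 1 - k] for k in range(q)])

def prepare_acelp_to_tcx(fac_signal):
    """Rotate right by 1/4 of the length, then invert the sign of the first quarter."""
    q = len(fac_signal) // 4
    rotated = fac_signal[-q:] + fac_signal[:-q]
    return [-s for s in rotated[:q]] + rotated[q:]

def fold_fac(unfolded):
    """Direct folding: time-inverted left half added to the right half."""
    half = len(unfolded) // 2
    return [unfolded[half + k] + unfolded[half - 1 - k] for k in range(half)]

fac = [1, 2, 3, 4, 10, 20, 30, 40]                    # toy windowed FAC signal
folded = mdct_style_fold(prepare_acelp_to_tcx(fac))   # what the decoder sees after the IMDCT stage
print(folded)                                         # prints [41, 32, 23, 14]
print(list(reversed(folded)) == fold_fac(fac))        # prints True
```

The final time inversion at the decoder is therefore all that is needed to obtain the correction.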

In the specific case of a transition from an ACELP frame to a TCX frame, additional efficiency can be achieved by exploiting information already available at the decoder. Fig. 7 is a diagram of the FAC correction using information from the ACELP mode. The ACELP synthesis signal 702 up to the end of the ACELP frame 202 is known at the decoder. Moreover, the zero-input response (ZIR) 704 of the synthesis filter correlates well with the signal at the beginning of the TCX20 frame 206. This property is already used in 3GPP AMR-WB+ to manage transitions from ACELP frames to TCX frames. Here this information is used for two purposes: 1) to reduce the amplitude of the signal to be coded as the FAC correction, and 2) to ensure continuity of the error signal so as to improve the efficiency of MDCT coding of this error signal. Referring to Fig. 7, the correction signal 706 that is coded for transmission of the FAC correction is computed as follows. The first half of this correction signal 706, which extends up to the end of the ACELP frame 202, is taken as the difference 708 between the weighted signal 710 in the original, uncoded domain and the weighted synthesis signal 702 in the ACELP frame 202. Provided that the ACELP coding module performs well enough, this first half of the correction signal 706 has reduced energy and amplitude compared with the original signal. Then, for the second half of the correction signal 706, the difference 708 is taken between the weighted signal 712 in the original, uncoded domain at the beginning of the TCX20 frame 206 and the zero-input response 704 of the ACELP weighted synthesis filter. Since the zero-input response 704 is correlated with the weighted signal 712, at least to some extent and especially at the beginning of the TCX20 frame, this difference has lower amplitude and energy than the weighted signal 712 at the beginning of the TCX20 frame. This effectiveness of the zero-input response 704 in modelling the original signal is usually higher at the beginning of the frame. Adding the effect of the FAC window 502, whose amplitude decreases over the second half of the FAC window, the shape of the second half of the correction signal 706 in Fig. 7 should tend towards zero at the beginning and at the end, with possibly more energy concentrated in the middle of the second half of the FAC window 502, depending on how accurately the ZIR matches the weighted signal. After these windowing and difference operations have been performed as described with reference to Fig. 7, the resulting correction signal 706 can be coded as described for Fig. 5 or Fig. 6, or by any other selected method of coding the FAC signal. At the decoder, the actual FAC correction signal is recomputed by first decoding the transmitted correction signal 706 described above, and then adding the ACELP synthesis signal 702 back to the signal 706 in the first half of the FAC window 502 and adding the ZIR 704 to the same signal 706 in the second half of the FAC window 502.
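The subtraction at the encoder and the re-addition at the decoder can be sketched as a round trip. This simplified sketch deliberately omits the FAC windowing and the quantization of the correction signal; the function names and the sample values are hypothetical.

```python
def fac_target(weighted_before, acelp_synthesis, weighted_after, zir):
    """Encoder side: subtract what the decoder already knows from the weighted original."""
    first_half = [o - s for o, s in zip(weighted_before, acelp_synthesis)]   # up to end of ACELP frame
    second_half = [o - z for o, z in zip(weighted_after, zir)]               # start of the TCX frame
    return first_half + second_half

def restore(decoded_target, acelp_synthesis, zir):
    """Decoder side: add the ACELP synthesis and the ZIR back to the decoded target."""
    half = len(acelp_synthesis)
    first_half = [e + s for e, s in zip(decoded_target[:half], acelp_synthesis)]
    second_half = [e + z for e, z in zip(decoded_target[half:], zir)]
    return first_half + second_half

# Toy values: the synthesis and the ZIR are close to the original, so the target is small.
before, synth = [10, 12, 9, 7], [9, 12, 10, 7]
after, zir = [6, 4, 1, -2], [5, 3, 2, 0]
target = fac_target(before, synth, after, zir)
print(target)                         # prints [1, 0, -1, 0, 1, 1, -1, -2]
print(restore(target, synth, zir))    # prints [10, 12, 9, 7, 6, 4, 1, -2]
```

The coded target has a much smaller amplitude than the weighted original, which is the efficiency gain the paragraph describes.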

Thus far, the present description has covered transitions from a frame using a non-overlapping rectangular window to a frame using a non-rectangular, overlapping window, using the example of a transition from an ACELP frame to a TCX frame. It should be understood that the opposite situation also exists, namely a transition from a TCX frame to an ACELP frame. Fig. 8 is a diagram of the FAC correction applied at a transition from a frame using a non-rectangular, overlapping window to a frame using a non-overlapping rectangular window. Fig. 8 shows a TCX20 frame 802 followed by an ACELP frame 804, with the folded TCX20 window 806, as seen at the decoder, in the TCX frame. Fig. 8 also shows a FAC zone 810 in which the FAC correction is applied to cancel the effect of windowing and time-domain aliasing at the end of the TCX20 frame 802. Note that the ACELP frame 804 carries no information for cancelling these effects. The FAC window 812 is symmetric to the FAC window 502 of Fig. 5.

The folding of the two parts of the FAC window 812, 812 left and 812 right, is shown here for the case of a transition from a TCX frame to an ACELP frame. Compared with Fig. 5, the differences are as follows: the FAC window 812 is now time-reversed, and the folding of the aliasing part uses a subtraction instead of the addition illustrated in Fig. 5, in order to be consistent with the sign of the MDCT folding in this part of the window.

Fig. 9 is a diagram of the unfolded FAC correction (left) and the folded FAC correction (right). The FAC window 812 is reproduced on the left side of Fig. 9. The folded FAC correction signal 902 may be encoded using a DCT or any other suitable method. Assuming a Hanning-type window in the transform used, for example the MDCT, equations 904 and 906 in Fig. 9 describe the FAC window 812 for the case of Fig. 9. Of course, when a window of another shape is used, other equations describe the FAC window. In addition, the use of a Hanning-type window in the MDCT means that a cosine window is applied at the encoder before the MDCT and a cosine window is applied again at the decoder after the IMDCT. It is the sample-by-sample combination of these two cosine windows that yields the required Hanning window shape, which has the complementary shape needed for overlap-add in the part of the window with 50% overlap.
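The relation between the two cosine windows and the Hanning-type shape can be checked numerically. The following sketch assumes a sine-shaped window of length 2N with half-sample offset; the overlap length N = 8 is an arbitrary illustrative value, and equations 904 and 906 of Fig. 9 are not reproduced here.

```python
import math

N = 8  # hypothetical overlap length in samples

# Window applied once at the encoder and once more at the decoder.
cosine_window = [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]

# Sample-by-sample combination of the two windows.
combined = [w * w for w in cosine_window]

# Hanning-type shape written directly as 0.5 * (1 - cos(...)).
hanning = [0.5 * (1.0 - math.cos(math.pi * (2 * n + 1) / (2 * N)))
           for n in range(2 * N)]

# Rising half of one window plus falling half of the previous one,
# as they meet in a 50% overlap-add.
overlap_sum = [combined[n] + combined[n + N] for n in range(N)]
```

The combined window equals the Hanning-type shape, and its two halves sum to one in the overlap region, which is the complementary property the text refers to.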

Again, the MDCT approach can also be used to encode the FAC window, as described for Fig. 6. Fig. 10 illustrates a second application of the FAC correction method using the MDCT. The upper left quadrant of Fig. 10 shows the FAC window 812 depicted in Fig. 8. The first quarter of the FAC window 812 is shifted to the right end of the FAC window and inverted in sign (812b). In other words, the FAC window 812 is cyclically rotated to the left by 1/4 of its full length, and the sign of the last 1/4 of the samples is then inverted. Then, in the upper right quadrant of Fig. 10, the MDCT is applied to this windowed signal. The MDCT includes a folding operation, which results in the folded signal 1002 shown in the upper right quadrant of Fig. 10. This folding in the MDCT applies a sign inversion in the left part, but not in the right part 812b, to which the folded segment is added. Comparing the resulting folded signal 1002 with the folded FAC correction signal 902 on the right side of Fig. 9 shows that the two are equivalent, except for a time reversal (mirroring) and a sign inversion. Thus, at the decoder, after the IMDCT, this signal 1002, which is the inverted FAC correction, is reversed in time (mirrored) and inverted in sign to become the FAC correction 1004 shown in the lower right quadrant of Fig. 10. As stated above, this FAC correction 1004 may be added to the signal in the FAC area of Fig. 8.
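The sample rearrangement described for Fig. 10 can be sketched as two small helpers. The function names are hypothetical, the input stands for the windowed samples in the FAC window, and the MDCT and IMDCT that sit between the two steps are left out.

```python
def rotate_and_flip(samples):
    """Rotate left by a quarter of the length, then negate the last quarter."""
    quarter = len(samples) // 4
    rotated = samples[quarter:] + samples[:quarter]
    return rotated[:-quarter] + [-v for v in rotated[-quarter:]]

def mirror_and_invert(decoded):
    """Decoder side: reverse in time and invert the sign."""
    return [-v for v in reversed(decoded)]

samples = [1, 2, 3, 4, 5, 6, 7, 8]
arranged = rotate_and_flip(samples)     # first quarter moved to the end, negated
mirrored = mirror_and_invert([1, 2, 3]) # time reversal plus sign inversion
```

With eight samples, the first two move to the end with inverted sign, which corresponds to the segment labelled 812b.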

Quantization of the signal corresponding to the FAC correction requires proper care. Indeed, the FAC correction is part of the transform-coded signal, including, for example, the TCX frames used in the examples of Fig. 2-10, since it is added to the frame to compensate for the windowing and aliasing effects. Since quantization of this FAC correction introduces distortion, the distortion is controlled so that it blends properly with, or matches, the distortion of the transform-coded signal and does not introduce audible artifacts in the transition corresponding to the FAC area. If the quantization noise level and the shape of the quantization noise in the time and frequency domains remain approximately the same in the FAC correction signal as in the transform-coded frame to which the FAC correction is applied, the FAC correction introduces no additional distortion.

There are several approaches to quantizing the FAC correction signal, including, as non-limiting examples, scalar quantization, vector quantization, stochastic codebooks, algebraic codebooks, etc. In each case, it should be understood that there is a strong correlation between the attributes of the FAC correction coefficients and those of the coefficients of the corresponding transform-coded frame, such as the TCX frame of the example. Indeed, the time-domain samples used in the FAC area should be the same as the time-domain samples at the beginning of the transform-coded frame. Thus, the scale factors used in the quantizer applied to the transform-coded frame are approximately the same as the scale factors used in the quantizer applied to the FAC correction. Of course, the number of samples, or frequency-domain coefficients, in the FAC correction is not the same as in the transform-coded frame: the transform-coded frame has more samples than the FAC correction, which covers only a portion of the transform-coded frame. What matters is maintaining the same level of quantization noise per frequency-domain coefficient in the FAC correction signal as in the corresponding transform-coded frame (e.g., a TCX frame).

Taking the specific example of Algebraic Vector Quantization (AVQ), used in the 3GPP AMR-WB+ audio coding standard to quantize spectral coefficients, and applying it to quantization of the FAC correction, the following result can be obtained. The global gain of the AVQ computed when quantizing the transform-coded frame, such as a TCX frame, which is used to scale the amplitudes of the frequency-domain coefficients in order to keep the bit consumption below a given bit budget, may serve as a reference gain for quantizing the FAC frame. The same applies to any other scale factors, for example the scale factors used in the Adaptive Low-Frequency Emphasis (ALFE) of the AMR-WB+ standard. Other examples include the scale factors used in AAC coding. Any other scale factors that control the noise level and the spectral shape may also be considered in this category.
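The idea of reusing the gain of the transform-coded frame for the FAC correction can be illustrated with a plain uniform quantizer. This is only a stand-in: it is not the AVQ of AMR-WB+, and the gain value, coefficient values and function name are assumptions made for the example.

```python
def quantize(coeffs, gain):
    """Uniform scalar quantizer (a stand-in for AVQ): scale, round, rescale."""
    return [gain * round(c / gain) for c in coeffs]

global_gain = 0.25  # hypothetical gain found while coding the TCX frame
tcx_coeffs = [1.30, -0.72, 0.41, 0.05, -0.93, 0.66, 0.18, -0.27]
fac_coeffs = [0.52, -0.11, 0.33, -0.49]

# Both signals are quantized with the same reference gain.
tcx_err = [abs(c - q) for c, q in zip(tcx_coeffs, quantize(tcx_coeffs, global_gain))]
fac_err = [abs(c - q) for c, q in zip(fac_coeffs, quantize(fac_coeffs, global_gain))]
```

Because the step size is shared, the per-coefficient quantization error of the FAC correction is bounded by the same half-step as that of the transform-coded frame, so the two noise levels stay comparable.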

Depending on the length of the transform-coded frame, an m-to-1 mapping of the scale-factor parameters is used between the transform-coded frame and the FAC correction. For example, with the three TCX frame lengths of 20 ms, 40 ms or 80 ms, as in the MPEG USAC audio codec, the scale factors, such as those used in the ALFE, that apply to m consecutive spectral coefficients of the transform-coded frame can be used for 1 spectral coefficient of the FAC correction.
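A possible reading of the m-to-1 mapping is sketched below. The text only states that the scale factors used for m consecutive coefficients of the transform-coded frame can serve one FAC coefficient; averaging each group, the function name and the values are assumptions of this example.

```python
def map_scale_factors(tcx_scale_factors, m):
    """Derive one FAC scale factor from each group of m TCX scale factors.

    Averaging the group is an assumption; when the m factors in a group
    are equal, the average is simply that shared factor.
    """
    return [sum(tcx_scale_factors[i:i + m]) / m
            for i in range(0, len(tcx_scale_factors) - m + 1, m)]

tcx_scale_factors = [4.0, 4.0, 2.0, 2.0, 1.0, 1.0, 0.5, 0.5]
fac_scale_factors = map_scale_factors(tcx_scale_factors, m=2)
```

With m = 2, eight scale factors of the transform-coded frame map to four scale factors for the FAC correction.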

To match the level of the quantization error of the FAC correction with the level of the quantization error of the transform-coded frame, it may be necessary to take into account, at the encoder, the coding error of the windowed transform-coded frame. Fig. 11 is a block diagram of FAC quantization that includes TCX error correction. The difference is computed between the windowed and folded signal 1102 in the TCX frame 1104 and the windowed and folded TCX synthesis 1106 of that frame. The TCX synthesis 1106 in this context is simply the inverse transform, including the windowing used at the decoder, of the quantized transform-domain coefficients of that TCX frame. This difference signal 1108, or TCX coding error, is then added 1110 to the FAC correction signal 1112, synchronized with the FAC area. The composite signal 1114, containing the FAC correction signal 1112 plus the coding error 1108 of the TCX frame, is then quantized by the quantizer 1116 for transmission to the decoder. The quantized FAC correction signal 1118 of Fig. 11 thus corrects, at the decoder, the windowing effect and the aliasing as well as the TCX coding error in the FAC area. The use of the scale factors 1120, as shown in Fig. 11, helps match the distortion of the FAC correction to the distortion in the TCX frame.
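The signal flow of Fig. 11 up to the quantizer can be written as a short sketch. The function and variable names are hypothetical, the values are illustrative, and the transform and quantizer 1116 that follow are omitted.

```python
def composite_fac_signal(fac_correction, windowed_folded_input, tcx_synthesis):
    """Add the TCX coding error (input minus synthesis) to the FAC correction."""
    coding_error = [x - y for x, y in zip(windowed_folded_input, tcx_synthesis)]
    return [f + e for f, e in zip(fac_correction, coding_error)]

fac_correction = [0.5, -0.25, 0.125, 0.0]         # plays the role of signal 1112
windowed_folded_input = [1.0, 0.5, -0.5, 0.25]    # plays the role of signal 1102
tcx_synthesis = [0.75, 0.5, -0.25, 0.25]          # plays the role of synthesis 1106

composite = composite_fac_signal(fac_correction, windowed_folded_input, tcx_synthesis)
```

Where the TCX synthesis already matches the input, the composite signal reduces to the FAC correction alone.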

Fig. 12 is a diagram of a use case of the FAC correction in a multi-mode coding system. Examples are presented that show switching between regular-shaped windows with an overlap of 50% or more and variable-shaped windows, including FAC windows. In Fig. 12, the lower part can be seen as a continuation of the upper part along the time axis. In Fig. 12, it is assumed that all frames are encoded after pre-processing of the input audio signal by a time-varying filtering process, which may be, for example, a weighting filter derived from LPC analysis of the input signal or some other processing that weights the input signal. In this example, the input signal is encoded up to "Switching Point A" using an approach from the family of modern audio coding methods, such as AAC, in which the analysis windows are optimized for frequency-domain coding. As a rule, this means using windows with 50% overlap and a regular shape, such as the cosine window used in MDCT coding, although windows of other shapes can also be used for this purpose. Then, between "Switching Point A" and "Switching Point B", the input signal is encoded using windows of variable length and shape, which are not necessarily optimized for transform coding but are rather intended to achieve a compromise between time and frequency resolution for the coding modes used in this segment. Fig. 12 shows the specific example of the ACELP and TCX coding modes used in this segment. It can be seen that the window shapes for these coding modes are highly heterogeneous and vary in shape and length. The ACELP window is rectangular and non-overlapping, while the TCX window is non-rectangular and overlapping. In this case, the FAC window is used to cancel the time-domain aliasing, as described above. The FAC window itself, shown in bold in Fig. 12, with its particular shape and length, is one of the variable-shaped windows in the segment between "Switching Point A" and "Switching Point B".

Fig. 13 is a diagram of another use case of the FAC correction in a multi-mode coding system. Fig. 13 shows how the FAC window can be used in a context in which the encoder switches locally from regular-shaped windows to variable-shaped windows in order to encode a transient signal. This is similar to the AAC coding context, in which start and stop windows are used so that shorter windows can be applied locally to encode a transient signal. In this case, instead, in accordance with Fig. 13, the signal between the first switching point and "Switching Point B", which is short, is encoded using multi-mode coding, including ACELP and TCX in the presented example, which requires the use of FAC windows to manage the transition properly when the ACELP coding mode is used.

Fig. 14 and 15 are diagrams of first and second use cases of the FAC correction when switching between short transform-based frames and ACELP frames. These are cases in which switching is performed between short transform-based frames in the LPC domain, for example short TCX frames, and ACELP frames. The example shown in Fig. 14 and 15 can be seen as a local situation in a longer signal, in which other coding modes may also be used in other frames (not shown). It should be noted that the window for the short TCX frames in Fig. 14 and 15 may have an overlap of more than 50%. This may be the case, for example, in the low-delay AAC codec, which uses a long asymmetric window. In that case, special start and stop windows are designed to ensure proper switching between the long asymmetric windows and the short TCX windows shown in Fig. 14 and 15.

Fig. 16 is a block diagram of a non-limiting example of a device 1600 for forward cancellation of time-domain aliasing in a coded signal received in a bitstream 1601. For the purpose of illustration, the device 1600 is presented with reference to the FAC correction of Fig. 7, using information from the ACELP mode. Those skilled in the art will appreciate that the device 1600 can be implemented for all the other examples of FAC correction presented in the present description.

The device 1600 includes a receiver 1610 for receiving the bitstream 1601 representing a coded audio signal that includes the FAC correction.

ACELP frames from the bitstream 1601 are fed to an ACELP decoder 1611, which includes an ACELP synthesis filter. The ACELP decoder 1611 generates the zero-input response (ZIR) 704 of the ACELP synthesis filter. In addition, the ACELP decoder 1611 generates the ACELP synthesis signal 702. The ACELP synthesis signal 702 and the ZIR 704 are concatenated to form the ACELP synthesis signal followed by the ZIR. The FAC window 502 is applied to the concatenated signals 702 and 704, which are then folded and added in a processor 1605 and applied to a positive input of an adder 1620 to provide a first (optional) part of the audio signal in the TCX frames.

Parameters (prm) for TCX 20 frames from the bitstream 1601 are fed to a TCX decoder 1606, followed by an IMDCT transform and a window 1613 for the IMDCT, to form the TCX 20 synthesis signal 1602, which is supplied to a positive input of an adder 1616 to provide a second part of the audio signal in the TCX 20 frames.

However, at a transition between coding modes (e.g., from an ACELP frame to a TCX 20 frame), part of the audio signal would not be decoded properly without the use of a FAC canceller 1615. In the example of Fig. 16, the FAC canceller 1615 contains a FAC decoder 1617 for decoding, from the received bitstream 1601, the correction signal 504 (Fig. 5), which corresponds to the correction signal 706 (Fig. 7) after folding, as in Fig. 5, and an inverse DCT (IDCT). The output signal 1618 of the IDCT is fed to a positive input of the adder 1620. The output signal of the adder 1620 is fed to a positive input of the adder 1616.

The overall output signal of the adder 1616 is the FAC-compensated synthesis signal for a TCX frame following an ACELP frame.
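The two adders of Fig. 16 can be summarized in a few lines. This is a sketch under assumptions: the function name is hypothetical, the ACELP decoder, FAC decoder and IMDCT are represented only by their output lists, and the FAC area is assumed to cover the first samples of the TCX synthesis.

```python
def fac_compensated_output(windowed_folded_acelp, idct_output, tcx_synthesis):
    """Adder 1620 sums the ACELP-derived part and the decoded FAC signal;
    adder 1616 adds that sum to the TCX synthesis."""
    adder_1620 = [a + f for a, f in zip(windowed_folded_acelp, idct_output)]
    out = list(tcx_synthesis)
    # Assumption: the FAC area covers the first len(adder_1620) samples.
    for i, v in enumerate(adder_1620):
        out[i] += v
    return out

output = fac_compensated_output(
    windowed_folded_acelp=[0.5, 0.25],  # from processor 1605
    idct_output=[0.25, -0.25],          # signal 1618
    tcx_synthesis=[1.0, 2.0, 3.0, 4.0], # signal 1602
)
```

Samples outside the assumed FAC area pass through unchanged.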

Fig. 17 is a block diagram of a non-limiting example of a device 1700 for forward cancellation of time-domain aliasing in a coded signal for transmission to a decoder. For the purpose of illustration, the device 1700 is presented with reference to the FAC correction of Fig. 7, using information from the ACELP mode. Those skilled in the art will appreciate that the device 1700 can be implemented for all the other examples of FAC correction presented in the present description.

An audio signal 1701 to be encoded is applied to the device 1700. Logic (not shown) delivers the ACELP frames of the audio signal 1701 to an ACELP encoder 1710. One output signal of the ACELP encoder 1710, the ACELP-coded parameters 1702, is fed to a first input of a multiplexer (MUX) 1711. Another output signal of the ACELP encoder is a synthesis signal 1760 followed by the zero-input response (ZIR) 1761 of the synthesis filter of the ACELP encoder 1710. The FAC window 502 is applied to the concatenation of the signals 1760 and 1761. The output signal of the FAC window processor 502 is fed to a negative input of an adder 1751.

Logic (not shown) also delivers the TCX 20 frames of the audio signal 1701 to an MDCT encoding module 1712 to form the TCX 20 coded parameters 1703, which are fed to a second input of the multiplexer 1711. The MDCT encoding module 1712 contains an MDCT window 1731, an MDCT transform 1732 and a quantizer 1733. The windowed input signal of the MDCT module 1732 is fed to a positive input of an adder 1750. The quantized MDCT coefficients 1704 are applied to an inverse MDCT (IMDCT) 1733, and the output signal of the IMDCT 1733 is fed to a negative input of the adder 1750. The output signal of the adder 1750 forms the TCX quantization error, which is windowed in a processor 1736. The output signal of the processor 1736 is fed to a positive input of the adder 1751. As shown in Fig. 17, the output signal of the processor 1736 can be used in the device as needed.

At a transition between coding modes (e.g., from an ACELP frame to a TCX frame), some of the audio frames encoded by the MDCT module 1712 may not be decoded properly without additional information. A calculator 1713 provides this additional information, in particular the correction signal 706 (Fig. 7). All components of the calculator 1713 can be regarded as a FAC correction signal generator. The FAC correction signal generator comprises applying the FAC window 502 to the audio signal 1701, feeding the output signal of the FAC window 502 to a positive input of the adder 1751, feeding the output signal of the adder 1751 to an MDCT 1734, and quantizing the output signal of the MDCT 1734 in a quantizer 1737 to form the FAC parameters 706, which are fed to an input of the multiplexer 1711.

The signal at the output of the multiplexer 1711 is the coded audio signal 1755, to be transmitted to a decoder (not shown) through a transmitter 1756 in the coded bitstream 1757.

Those skilled in the art will appreciate that the description of the devices and methods for forward cancellation of time-domain aliasing in a coded signal is merely illustrative and is in no way intended to be limiting. Such persons will readily be able to devise other embodiments having the advantages of the present invention. In addition, the described system can be modified to provide useful solutions tailored to the needs and problems of cancelling time-domain aliasing in a coded signal.

Those skilled in the art will also understand that numerous types of terminals and other devices may embody, in the same device, both aspects of coding for transmission of coded audio and aspects of decoding following reception of coded audio.

In the interest of clarity, not all of the routine features of the implementations of forward cancellation of time-domain aliasing in a coded signal are shown and described. It will, of course, be appreciated that in the development of any such actual implementation of audio coding, numerous implementation-specific decisions must be made in order to achieve the developer's specific goals, such as compliance with application-, system-, network- and business-related constraints, and that these specific goals will vary from one implementation to another and from one developer to another. In addition, it will be appreciated that a development effort might be complex and time-consuming, but would nevertheless be a routine engineering undertaking for those skilled in the field of audio coding having the benefit of this invention.

In accordance with this disclosure, the components, process steps and/or data structures described herein may be implemented using various types of operating systems, computing platforms, network devices, computer programs and/or general-purpose machines. In addition, those skilled in the art will recognize that devices of a less general-purpose nature, such as hardwired devices, field programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), etc., may also be used. Where a method comprising a series of process steps is implemented by a computer or a machine, and those process steps can be stored as a series of instructions readable by the machine, they may be stored on a tangible medium.

The systems and modules described herein may comprise software, firmware, hardware, or any combination of software, firmware or hardware. Software and other modules may reside on servers, workstations, personal computers, computerized tablets, personal digital assistants (PDA) and other devices suitable for the purposes described herein. Software and other modules may be accessible via local storage, via a network, via a browser or other application in an application service provider (ASP) context, or via other means suitable for the purposes described herein. The data structures described herein may comprise computer files, variables, programming arrays, programming structures, or any electronic information storage scheme or method, or any combination thereof, suitable for the purposes described herein.

Although the present invention has been described above by way of non-limiting illustrative embodiments, these embodiments may be modified at will within the scope of the appended claims and within the spirit of the present invention.

1. A method for forward cancellation of time-domain aliasing in a coded signal received in a bitstream at a decoder, comprising:
receiving in the bitstream at the decoder, from an encoder, additional information related to correction of the time-domain aliasing in the coded signal, wherein the additional information comprises a forward aliasing cancellation (FAC) correction signal related to a difference signal based on a difference between a signal to be coded at a transition from a first coding mode to a second coding mode and a synthesis signal obtained using the first coding mode; and
cancelling the time-domain aliasing in the coded signal at the decoder in response to the additional information.

2. The method according to claim 1, used at transitions between frames using rectangular, non-overlapping windows and frames using non-rectangular, overlapping windows.

3. The method according to claim 1, wherein the FAC correction signal is a windowed, or windowed and folded, FAC correction signal.

4. The method according to claim 1, wherein the FAC correction signal is transform-coded using the transform used to code the frame using a non-rectangular, overlapping window.

5. The method according to claim 1, wherein the first coding mode is a Code-Excited Linear Prediction (CELP) mode and the second coding mode is a transform coding mode.

6. The method according to claim 1, wherein the difference signal is based on a difference between the signal to be coded and the synthesis signal concatenated with a zero-input response of a synthesis filter of the first coding mode.

7. The method according to claim 1, wherein cancelling the time-domain aliasing comprises, at the decoder:
decoding the difference signal; and
recomputing the FAC correction signal using the synthesis signal and the decoded difference signal.

8. The method according to claim 1, wherein cancelling the time-domain aliasing comprises, at the decoder:
decoding the FAC correction signal; and
adding the decoded FAC correction signal to the coded signal.

9. The method according to claim 1, wherein the FAC correction signal is quantized using scale factors used in the non-rectangular, overlapping windows.

10. A method for forward cancellation of time-domain aliasing in a coded signal for transmission from an encoder to a decoder, comprising:
calculating, at the encoder, additional information related to correction of the time-domain aliasing in the coded signal, wherein calculating the additional information comprises producing a forward aliasing cancellation (FAC) correction signal related to a difference signal based on a difference between a signal to be coded at a transition from a first coding mode to a second coding mode and a synthesis signal obtained using the first coding mode; and
sending in the bitstream, from the encoder to the decoder, the additional information related to correction of the time-domain aliasing in the coded signal.

11. The method according to claim 10, used at transitions between frames using rectangular, non-overlapping windows and frames using non-rectangular, overlapping windows.

12. The method according to claim 10, wherein calculating the additional information comprises windowing, or windowing and folding, the FAC correction signal.

13. The method according to claim 10, wherein calculating the additional information comprises transform-coding the FAC correction signal using the transform used to code the frames using non-rectangular, overlapping windows.

14. The method according to claim 1, wherein the first coding mode is a Code-Excited Linear Prediction (CELP) mode and the second coding mode is a transform coding mode.

15. The method according to claim 10, wherein the difference signal is based on a difference between the signal to be coded and the synthesis signal concatenated with a zero-input response of a synthesis filter of the first coding mode.

16. The method according to claim 10, comprising quantizing the FAC correction signal using scale factors used in the non-rectangular, overlapping windows.

17. The method according to claim 16, comprising subtracting a quantization error of the transform-coded frame from the FAC correction signal before quantizing the FAC correction signal.

18. A device for forward cancellation of time-domain aliasing in a coded signal received in a bitstream, comprising:
a receiver for receiving in the bitstream, from an encoder, additional information related to correction of the time-domain aliasing in the coded signal, wherein the additional information comprises a forward aliasing cancellation (FAC) correction signal related to a difference signal based on a difference between a signal to be coded at a transition from a first coding mode to a second coding mode and a synthesis signal obtained using the first coding mode; and
a canceller of the time-domain aliasing in the coded signal, responsive to the additional information.

19. The device according to claim 18, used at transitions between frames using rectangular, non-overlapping windows and frames using non-rectangular, overlapping windows.

20. The device according to claim 18, wherein the FAC correction signal is a windowed, or windowed and folded, FAC correction signal.

21. The device according to claim 18, wherein the FAC correction signal is transform-coded using the transform used to code the frames using non-rectangular, overlapping windows.

22. The device according to claim 18, wherein the first coding mode is a Code-Excited Linear Prediction (CELP) mode and the second coding mode is a transform coding mode.

23. The device according to claim 18, wherein the difference signal is based on a difference between the signal to be coded and the synthesis signal concatenated with a zero-input response of a synthesis filter of the first coding mode.

24. The device according to claim 18, wherein the canceller, at the decoder:
decodes the difference signal; and
recomputes the FAC correction signal using the synthesis signal and the decoded difference signal.

25. The device according to claim 18, wherein the canceller, at the decoder:
decodes the FAC correction signal; and
adds the decoded FAC correction signal to the coded signal.

26. The device according to claim 18, wherein the FAC correction signal is quantized using scale factors used in the non-rectangular, overlapping windows.

27. A device for forward cancellation of time-domain aliasing in a coded signal for transmission to a decoder, comprising:
a calculator of additional information related to correction of the time-domain aliasing in the coded signal, wherein the calculator of additional information comprises a generator of a forward aliasing cancellation (FAC) correction signal related to a difference signal based on a difference between a signal to be coded at a transition from a first coding mode to a second coding mode and a synthesis signal obtained using the first coding mode; and
a transmitter for sending in the bitstream, to the decoder, said additional information related to correction of the time-domain aliasing in the coded signal.

28. The device according to claim 27, used at transitions between frames using rectangular, non-overlapping windows and frames using non-rectangular, overlapping windows.

29. The device according to claim 27, wherein the FAC correction signal generator windows, or windows and folds, the FAC correction signal.

30. The device according to claim 27, wherein the FAC correction signal generator transform-codes the FAC correction signal using the transform used to code the frames using non-rectangular, overlapping windows.

31. The device according to claim 27, wherein the first coding mode is a Code-Excited Linear Prediction (CELP) mode and the second coding mode is a transform coding mode.

32. The device according to claim 27, wherein the difference signal is based on a difference between the signal to be coded and the synthesis signal concatenated with a zero-input response of a synthesis filter of the first coding mode.

33. The device according to claim 27, comprising a quantizer of the FAC correction signal using scale factors used in the non-rectangular, overlapping windows.

34. The device according to claim 33, comprising a subtractor for subtracting an error of the synthesized TCX frame from the FAC correction signal before quantization of the FAC correction signal.



 

Same patents:

FIELD: physics, computer engineering.

SUBSTANCE: invention relates to data processing systems. Computer-implemented method of compressing video includes a step of receiving a request in a server from a client to reproduce a video game or run an application on the Internet. In response to said request, a video game or an application session with the client is set up. The method involves measuring channel characteristics for the Internet communication channel between the server and the client. Further, said video output is encoded using low-latency compression at the server to generate a low-latency video stream. Video output is encoded with a bit rate or compression factor based on the measured channel characteristics. Furthermore, the low-latency video stream is streamed from the server to the client over the Internet, said low-latency video stream being decoded in the client. All operations associated with receiving control signals transmitted from the client, running the video game or application, encoding and streaming the low-latency video stream to the client over the Internet and decoding the low-latency video stream in the client, are performed such that the user has the perception that the selected video game or application instantly responds to the control signals received from the client.

EFFECT: shorter latency time when running video games or applications.

41 cl, 40 dwg

FIELD: information technology.

SUBSTANCE: machine-implemented method of compressing video includes executing one or more applications, receiving packet streams from users and routing said packets to one or more applications. Said packets include user control signal input; the one or more applications are configured to calculate A/V data in response to the user control signal input. The method also includes a step of receiving A/V data from one or more applications and deriving therefrom streaming compressed A/V data with short waiting time. The method also includes routing the streaming compressed A/V data with short waiting time to each user over a corresponding link. If a portion of the routed streaming compressed A/V data with short waiting time results in marked virtual artifacts, then generation of forward error correction (FEC) data for protecting a portion of the routed streaming compressed A/V data with short waiting time is performed.

EFFECT: high protection of data transmitted over a communication channel.

25 cl, 40 dwg

FIELD: information technology.

SUBSTANCE: system for protecting interactive audio/video (A/V) stream with short waiting time includes a plurality of servers on which one or more applications are executed. The system also includes a input routing network which receives packet streams from users and routes them to one or more said servers, wherein said streams include user input commands, wherein one or more said servers is configured to calculate A/V data in response to user input commands. Furthermore, the system includes an output routing network which routes streaming compressed A/V data with short waiting time to each user over a corresponding communication channel through a second interface. The system also includes a compressing unit connected to receive A/V data from one or more servers and output therefrom streaming compressed A/V data with short waiting time, and is configured to generate forward error correction (FEC) data for protecting certain parts of the compressed video stream with short waiting time from channel errors.

EFFECT: high protection of data transmitted over a communication channel.

28 cl, 40 dwg

FIELD: physics, acoustics.

SUBSTANCE: invention relates to audio processing. Disclosed is an apparatus for generating at least one audio output signal based on an audio data stream comprising audio data relating to one or more sound sources. The apparatus comprises a receiver for receiving the audio data stream comprising the audio data. The audio data comprise one or more pressure values for each one of the sound sources. Furthermore, the audio data comprise, for each one of the sound sources, one or more position values indicating a position of that sound source. Moreover, the apparatus comprises a synthesis module for generating the at least one audio output signal based on at least one of the one or more pressure values and on at least one of the one or more position values of the audio data of the audio data stream.

EFFECT: improved spatial audio capturing.

25 cl, 34 dwg

FIELD: physics, acoustics.

SUBSTANCE: invention relates to audio processing and particularly to decomposing audio signals into different components. A device for decomposing an input signal, having at least three input channels, comprises a downmixer for downmixing the input signal to obtain a downmixed signal having fewer channels, an analyser for analysing the downmixed signal to obtain an analysis result which is forwarded to a signal processor for processing the input signal or the signal derived from the input signal in order to obtain a decomposed signal.

EFFECT: high accuracy of reproducing stereo sound.

15 cl, 16 dwg

FIELD: radio engineering, communication.

SUBSTANCE: invention relates to devices for complex-transform channel coding with extended-band frequency coding. Encoded multichannel audio data are received in a bitstream; the encoded data contain channel extension coding data and frequency extension coding data, and the channel extension coding data contain a combined channel for a plurality of audio channels and a set of parameters for representing individual channels of said plurality as modified versions of the combined channel. Based on information in the bitstream, it is determined whether said set of parameters is a set containing a normalised correlation matrix or a set containing a complex parameter that represents a ratio, comprising an imaginary component and a real component, for the cross-correlation between two of said audio channels. Said set of parameters is decoded on the basis of this determination. The plurality of audio channels is reconstructed using the channel extension coding data and the frequency extension coding data.

EFFECT: improved quality of multichannel sound.

20 cl, 42 dwg, 1 tbl

FIELD: radio engineering, communication.

SUBSTANCE: invention relates to means of stereo encoding and decoding using complex prediction in the frequency domain. In one of the versions of the invention, a decoding method for obtaining an output stereo signal from an input stereo signal encoded by complex prediction coding and comprising first frequency-domain representations of two input channels comprises upmixing steps of: computing a second frequency-domain representation of a first input channel; and computing an output channel based on the first and second frequency-domain representations of the first input channel, the first frequency-domain representation of the second input channel and a complex prediction coefficient. The method includes performing frequency-domain modifications selectively before or after upmixing.

EFFECT: providing high audio quality while reducing computational costs.

15 cl, 19 dwg

FIELD: physics, acoustics.

SUBSTANCE: invention relates to audio processing, particularly to decomposition of audio signals into different components, for example, differently detectable components. An apparatus for decomposing a signal having at least three channels comprises an analyser (16) for analysing a similarity between two channels of an analysed signal related to the signal having at least two analysed channels, wherein the analyser is configured to use a pre-calculated frequency-dependent similarity curve as a reference curve to determine the analysis result. The signal processor (20) processes the analysed signal or a signal derived from the analysed signal or a signal, from which the analysed signal is derived using the analysis result to obtain a decomposed signal.

EFFECT: decomposing a signal using a pre-calculated frequency-dependent similarity curve as a reference curve.

15 cl, 16 dwg

FIELD: physics, acoustics.

SUBSTANCE: invention relates to means for estimating the level of an audio signal. The apparatus includes a unit for determining a codebook from a plurality of codebooks as an identified codebook, the audio signal having been encoded using the identified codebook, and an estimation unit which is configured to obtain a level value associated with the identified codebook and to estimate the level of the audio signal using the obtained level value.

EFFECT: high efficiency of encoding an audio signal.

19 cl, 11 dwg

FIELD: radio engineering, communication.

SUBSTANCE: invention relates to bandwidth extension devices. An excitation signal is generated based on an acoustic signal, the acoustic signal including a plurality of frequency components. A feature vector is extracted from the acoustic signal, the feature vector including at least one frequency-domain feature and at least one time-domain feature. At least one spectral shape parameter is determined based on the feature vector, the at least one spectral shape parameter corresponding to a sub-band signal containing frequency components that belong to an additional plurality of frequency components. The sub-band signal is generated by filtering the excitation signal with a filter bank and weighting the filtered excitation signal using the at least one spectral shape parameter.

EFFECT: improved perception of the bandwidth-extended acoustic signal.

21 cl, 10 dwg

FIELD: physics, acoustics.

SUBSTANCE: invention relates to encoding and decoding an audio signal in which audio samples for each object audio signal may be localised in any required position. In the method and device for encoding an audio signal and in the method and device for decoding an audio signal, audio signals may be encoded or decoded such that audio samples may be localised in any required position for each object audio signal. The method of decoding an audio signal includes extracting from the audio signal a downmix signal and object-oriented additional information; generating channel-oriented additional information based on the object-oriented additional information and control information for reproducing the downmix signal; processing the downmix signal using a decorrelated channel signal; and generating a multichannel audio signal using the processed downmix signal and the channel-oriented additional information.

EFFECT: high accuracy of reproducing object audio signals.

7 cl, 20 dwg

FIELD: information technology.

SUBSTANCE: method and apparatus for generating comfort noise to enhance user experience are disclosed. The method involves the following: if a received data frame is a noise frame, calculating a corresponding energy attenuation parameter based on that noise frame and a data frame received before it; and attenuating the noise energy based on the energy attenuation parameter to obtain a comfort noise signal. An apparatus for generating comfort noise is also provided.

EFFECT: high quality of signal transmission owing to attenuation of noise energy.

14 cl, 5 dwg

FIELD: information technology.

SUBSTANCE: method of smoothing stationary background noise involves receiving and decoding a signal representing a speech session, said signal comprising both a speech component and a background noise component; providing an indicator of noise properties for said signal, said indicator indicating the predictability of the signal, said predictability being expressed in terms of prediction gain measures of a linear predictive coder (LPC) for said signal; and adaptively smoothing said background noise component depending on the provided indicator of noise properties. Said smoothing operation is controlled by said indicator through a smoothing control parameter which is varied gradually when an increase in said indicator is detected, and varied instantly when a decrease in said indicator is detected.

EFFECT: improved control of the operation of smoothing background noise in speech sessions in telecommunication systems.

22 cl, 7 dwg
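
The asymmetric update of the smoothing control parameter in the abstract above can be illustrated with a minimal sketch. The function names `update_smoothing_control` and `track`, the per-frame `step` size and the sample values are assumptions made for illustration only; the abstract does not state how the gradual change is computed.

```python
def update_smoothing_control(previous, indicator, step=0.25):
    """Return the next smoothing control value.

    Hypothetical rule: a decrease of the noise-property indicator is
    followed instantly, while an increase is followed by at most
    `step` per frame (the gradual change).
    """
    if indicator <= previous:
        return indicator                     # instant decrease
    return min(indicator, previous + step)   # gradual increase


def track(indicators, initial=0.0, step=0.25):
    """Run the update over a sequence of per-frame indicator values."""
    control = initial
    history = []
    for value in indicators:
        control = update_smoothing_control(control, value, step)
        history.append(control)
    return history
```

With an indicator that stays high, drops once and rises again, the control value ramps up over several frames, falls in a single frame, and then starts ramping again.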

FIELD: information technology.

SUBSTANCE: apparatus for encoding a multichannel audio signal has: a receiver for a multichannel audio signal comprising a first and a second audio signal from a first and a second microphone; a time difference module for determining the time difference between the first and second audio signals by combining successive observations of cross-correlations between them, wherein the cross-correlations are normalised to derive state probabilities that are accumulated using a Viterbi algorithm to obtain a time difference with built-in hysteresis, and the Viterbi algorithm calculates the state probability for each given state as the combined contribution of all paths entering that state; a delay module for compensating the multichannel audio signal by delaying the first or second audio signal in response to the time difference signal; a monophonic module for generating a monophonic signal by combining the channels of the compensated multichannel audio signal; and a monophonic signal encoder.

EFFECT: high quality and efficiency of encoding.

10 cl, 5 dwg

FIELD: information technology.

SUBSTANCE: audio signal processing method involves receiving an audio signal and predetermined information; deriving a predetermined matrix from the predetermined information, wherein the predetermined matrix indicates the degree of contribution of an object to the output channel; and adjusting the output level of the object using the predetermined matrix. Consequently, when there are no user settings for each object and predetermined metadata to be applied to the audio signal are selected, the level of the objects contained in the audio signal can be easily adjusted using the predetermined rendering data corresponding to the selected predetermined metadata.

EFFECT: high accuracy and efficiency of adjusting the output channel level and cutting volume of excessive coding.

15 cl, 16 dwg

FIELD: information technology.

SUBSTANCE: method of processing an audio signal comprises determining, using first type information, whether the audio signal encoding type is a musical signal encoding type. If the audio signal encoding type is not a musical signal encoding type, it is determined, using second type information, whether the audio signal encoding type is a speech signal encoding type or a mixed signal encoding type. If the audio signal encoding type is a mixed signal encoding type, spectral data and a linear prediction coefficient are extracted from the audio signal; a difference signal for linear prediction is generated by performing inverse frequency transformation on the spectral data; the audio signal is reconstructed by performing linear predictive coding on the linear prediction coefficient and the difference signal; and the high-frequency domain signal is reconstructed using a base extension signal corresponding to the frequency domain of the reconstructed audio signal and range extension information.

EFFECT: higher efficiency of coding/decoding audio signals.

15 cl, 14 dwg

FIELD: information technology.

SUBSTANCE: audio decoder decodes a multi-object audio signal having a first-type audio signal and a second-type audio signal encoded therein. The multi-object audio signal consists of a downmix signal and additional information; the additional information includes level information for the first-type audio signal and the second-type audio signal in a first predetermined time/frequency resolution, and a residual signal specifying residual level values in a second predetermined time/frequency resolution. The decoder includes apparatus for calculating prediction coefficients based on the level information, and apparatus for upmixing the downmix signal based on the prediction coefficients and the residual signal to obtain a first upmix audio signal approximating the first-type audio signal and/or a second upmix audio signal approximating the second-type audio signal.

EFFECT: efficient separation of specific objects in a multi-object audio signal.

25 cl, 24 dwg

FIELD: information technology.

SUBSTANCE: signal processing method involves receiving at least one of a first signal and a second signal, and receiving mode information indicating that the assigned mode corresponds to one of at least three modes including a first mode, a second mode and a third mode. If the mode information indicates the first mode, the first signal is decoded using a first encoding scheme. If the mode information indicates the second mode, the first signal is decoded using the first encoding scheme, the second signal is decoded using a second encoding scheme, and an output signal is generated using the decoded first signal and the decoded second signal. If the mode information indicates the third mode, the second signal is decoded using the second encoding scheme. The first encoding scheme corresponds to a voice encoding scheme, and the second encoding scheme corresponds to an audio encoding scheme.

EFFECT: high efficiency of processing different signals owing to selection of the optimum encoding scheme.

11 cl, 9 dwg
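
The three-mode selection in the abstract above can be sketched as a simple dispatch. The function `decode_by_mode`, the integer mode values 1 to 3, and the callables `decode_speech`, `decode_audio` and `combine` are hypothetical placeholders introduced for illustration; the abstract does not define the decoders or how the two decoded signals are combined.

```python
def decode_by_mode(mode, first_signal, second_signal,
                   decode_speech, decode_audio, combine):
    """Select the decoding path from the signalled mode.

    Assumed mode values: 1 = first mode (voice scheme only),
    2 = second mode (both schemes, outputs combined),
    3 = third mode (audio scheme only).
    """
    if mode == 1:
        return decode_speech(first_signal)
    if mode == 2:
        decoded_first = decode_speech(first_signal)
        decoded_second = decode_audio(second_signal)
        return combine(decoded_first, decoded_second)
    if mode == 3:
        return decode_audio(second_signal)
    raise ValueError(f"unknown mode: {mode}")
```

Passing the decoders in as arguments keeps the sketch independent of any particular speech or audio codec.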

FIELD: information technology.

SUBSTANCE: basic idea of the invention is to ascertain information on the course of the bit rate change during a speech phase. According to the invention, during the speech phase, information on the percentage proportion of broadband speech frames in comparison to narrowband speech frames is compiled on the side of the decoder. A high percentage proportion of broadband active speech frames indicates that broadband use is preferred on the side of the decoder and therefore a need exists for synthesising noise information in broadband form during a DTX phase.

EFFECT: improving quality of a signal synthesised in the decoder by changing bit rate of the SID frame during speech-off.

13 cl, 3 dwg
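
The decoder-side statistic in the abstract above can be sketched as follows. The labels `"wb"` and `"nb"`, the function names and the 0.5 threshold are assumptions for illustration; the abstract only says that a high proportion of broadband speech frames indicates a need for broadband noise synthesis during the DTX phase.

```python
def wideband_share(frame_bandwidths):
    """Fraction of active speech frames that were broadband ("wb")."""
    if not frame_bandwidths:
        return 0.0
    wide = sum(1 for bandwidth in frame_bandwidths if bandwidth == "wb")
    return wide / len(frame_bandwidths)


def choose_noise_bandwidth(frame_bandwidths, threshold=0.5):
    """Pick the bandwidth for noise synthesis during the DTX phase.

    Hypothetical decision rule: synthesise broadband noise when the
    broadband share of the preceding speech phase exceeds `threshold`.
    """
    if wideband_share(frame_bandwidths) > threshold:
        return "wb"
    return "nb"
```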

FIELD: information technology.

SUBSTANCE: method for decoding an audio signal includes extracting from the audio signal a downmix signal and object-oriented additional information; generating a modified downmix signal based on the downmix signal and extracted information retrieved from the object-oriented additional information; generating channel-oriented additional information based on the object-oriented additional information and control data for reproducing the downmix signal; and generating a multichannel audio signal based on the modified downmix signal and the channel-oriented additional information. The object-oriented additional information contains at least one of: information on object level differences, information on cross-correlation between objects, information on downmix gain, information on level differences between downmix channels, and information on the absolute energy of objects.

EFFECT: providing more realistic reproduction of object audio signals.

9 cl, 21 dwg

FIELD: information technology.

SUBSTANCE: in the audio signal processing method, it is identified, using first type information, whether the audio signal coding type is a music signal coding type. If not, it is identified, using second type information, whether the audio signal coding type is a speech signal coding type or a composite signal coding type. If the coding type of the audio signal is the coding type of a composite signal, then spectral data and a linear prediction coefficient are extracted from the audio signal, a residual signal for linear prediction is generated by performing inverse frequency-domain conversion of said spectral data, and the audio signal is reconstructed by performing linear predictive coding based on the linear prediction coefficient and said residual signal. If the coding type of the audio signal is the coding type of a music signal, then only the first type information is used; if it is the coding type of a speech signal or of a composite signal, then both the first type and second type information are used.

EFFECT: high efficiency of encoding and decoding audio signals of different types.

15 cl, 14 dwg
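
The two-stage identification of the coding type in the abstract above can be sketched as a small decision function. The function name `identify_coding_type` and the flag encodings (1 for music in the first type information, 0 for speech and 1 for composite in the second) are assumptions for illustration; the abstract does not give the actual values.

```python
def identify_coding_type(first_type_info, second_type_info=None):
    """Return "music", "speech" or "composite".

    Assumed encoding: first_type_info == 1 means music; otherwise
    second_type_info == 0 means speech and any other value means
    composite. The second flag is only consulted for non-music signals.
    """
    if first_type_info == 1:
        return "music"
    if second_type_info is None:
        raise ValueError("second type information is required for non-music signals")
    return "speech" if second_type_info == 0 else "composite"
```

Only the first flag is read for a music signal, which mirrors the statement that the second type information is used only for speech and composite signals.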
