Device, method and computer programme for controlling audio signal, including transient signal

FIELD: physics, computer engineering.

SUBSTANCE: invention relates to radio engineering and is intended for controlling an audio signal, including a transient event. The device comprises a unit for replacing a transient signal, configured to replace the transient part of a signal, which includes a transient event of an audio signal, with part of a replacement signal adapted to energy characteristics of the signal of one or more transient parts of the audio signal, or to the energy characteristic of the signal of the transient part of the signal to obtain an audio signal with a shorter transient process. The device also includes a signal processor configured to process an audio signal with a shorter transient process to obtain a processed version of the audio signal with a shorter transient process. The device also includes a transient signal inserting unit configured to merge the processed version of the audio signal with a shorter transient process with the transient signal, representing in the original or processed form the transient content of the transient part of the signal.

EFFECT: high accuracy of reproducing the signal.

14 cl, 20 dwg

 

Embodiments according to the invention relate to a device, a method and a computer program for manipulating an audio signal that includes a transient event.

A scenario in which embodiments of the invention can be applied is described below.

In modern audio processing systems, audio signals are often processed using digital techniques. Certain signal portions, for example transients, place special demands on digital signal processing.

Transient events (or "transients") are events during which the signal energy, across the whole band or in a particular frequency range, changes rapidly, that is, the signal energy rapidly increases or rapidly decreases. Characteristic features of specific transients (transient events) can be found in the distribution of the signal energy over the spectrum. Typically, the energy of the audio signal during a transient is distributed across the entire frequency range, whereas in non-transient signal portions the energy is usually concentrated in the low-frequency part of the audio signal or in one or more frequency bands. This means that a non-transient signal portion, which is also called a stationary or "tonal" signal portion, has a spectrum that is not flat. In other words, in a tonal portion the signal energy is contained in a relatively small number of spectral lines or bands that stand out strongly above the noise floor of the audio signal. In addition, the spectrum of a transient signal portion is typically chaotic and unpredictable (for example, even when the spectrum of the signal portion preceding the transient signal portion is known). In a transient portion, the energy of the audio signal is distributed over many different frequencies and, in particular, extends into the high-frequency range, so that the spectrum of the transient portion is relatively flat, generally flatter than the spectrum of a tonal portion of the audio signal. It should be noted, however, that there are other types of signals with a flat spectrum, for example noise-like signals, which are not transients. While the spectral components of a noise-like signal have uncorrelated or only weakly correlated phase values, there is often a very significant phase correlation between the spectral components in the presence of a transient.
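The difference in spectral flatness between a tonal portion and a transient portion can be illustrated numerically. The following is a minimal sketch and not part of the described device: the spectral flatness measure (geometric mean of the power spectrum divided by its arithmetic mean), the function names and the two toy signals are assumptions chosen purely for illustration.

```python
import cmath
import math

def dft_power(x):
    """Power spectrum |X[k]|^2 for bins 0..N/2, computed with a naive DFT."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) ** 2
            for k in range(n // 2 + 1)]

def spectral_flatness(x, floor=1e-12):
    """Geometric mean over arithmetic mean of the power spectrum (0 to 1)."""
    p = [v + floor for v in dft_power(x)]  # floor avoids log(0)
    geometric = math.exp(sum(math.log(v) for v in p) / len(p))
    arithmetic = sum(p) / len(p)
    return geometric / arithmetic

N = 64
# Hypothetical tonal portion: one sinusoid, energy in a single spectral line.
tone = [math.sin(2 * math.pi * 4 * t / N) for t in range(N)]
# Hypothetical idealised transient: a single click, energy in every bin.
click = [1.0 if t == N // 2 else 0.0 for t in range(N)]

flat_tone = spectral_flatness(tone)
flat_click = spectral_flatness(click)
```

In this sketch the click yields a flatness close to 1 and the sinusoid a flatness close to 0. A flat spectrum alone does not identify a transient, since noise-like signals are flat as well.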

Typically, a transient is a strong change in the time-domain representation of the audio signal, which means that the signal contains many more high-frequency components when a Fourier decomposition is performed. An important feature of these numerous higher harmonics is that their phases are in a very specific mutual relationship, so that the superposition of all harmonics results in a rapid change of the signal energy (when the signal is considered in the time domain). In other words, there is a strong correlation across the spectrum in the presence of a transient. This specific phase situation among all harmonics can also be called "vertical coherence". The term relates to a time/frequency spectrogram representation of the signal, in which the horizontal direction corresponds to the evolution of the signal over time, and the vertical direction describes the dependence over frequency of the spectral components within one short-time spectrum.
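The role of the phase relationship can be made concrete with a small numerical experiment. The sketch below is illustrative only: the number of harmonics, the block length and the use of the crest factor (peak value divided by RMS value) as a measure of "peakiness" are assumptions. Both signals have the same magnitude spectrum; only the phases differ.

```python
import math
import random

def synth(phases, n=256):
    """Sum of unit-amplitude harmonics k = 1..len(phases) over one period."""
    return [sum(math.cos(2 * math.pi * (k + 1) * t / n + ph)
                for k, ph in enumerate(phases))
            for t in range(n)]

def rms(x):
    return math.sqrt(sum(v * v for v in x) / len(x))

def crest_factor(x):
    """Peak magnitude divided by RMS value."""
    return max(abs(v) for v in x) / rms(x)

K = 16
rng = random.Random(0)  # fixed seed so the experiment is repeatable

# All phases aligned: the harmonics add up to a sharp peak at t = 0.
aligned = synth([0.0] * K)
# Same magnitudes, scrambled phases: the energy is spread over time.
scrambled = synth([rng.uniform(-math.pi, math.pi) for _ in range(K)])

crest_aligned = crest_factor(aligned)
crest_scrambled = crest_factor(scrambled)
```

The two signals carry the same energy, but only the phase-aligned one concentrates that energy into a short peak. This is the property that is lost when a processing stage modifies the phases of the individual spectral components independently.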

If, for example, modifications are applied over large time intervals, for example by quantization, these modifications affect the entire block. Since transients are characterized by a short-term increase in energy, this energy is likely to be spread, when the block is modified, over the whole time span represented by the block.

The problem becomes especially apparent when the playback speed of a signal is changed while the pitch is maintained, or when the signal is transposed while the original duration is maintained. Both can be achieved using a phase vocoder or a method such as (P)SOLA (see [A1]-[A4] with regard to this problem). The latter is achieved by playing back the time-stretched signal accelerated by the time-stretching factor. In a discrete-time signal representation, this corresponds to downsampling the signal by the stretching factor while the sampling frequency is maintained. Time-stretching methods such as the phase vocoder are, in fact, suitable only for stationary or quasi-stationary signals, since transients are smeared in time by dispersion. The phase vocoder impairs the so-called vertical coherence properties (in terms of the time/frequency spectrogram representation) of the signal.

Time stretching of audio signals plays an important role in entertainment and technology. Common algorithms, such as PV (Phase Vocoder), SOLA (Synchronous Overlap Add), PSOLA (Pitch Synchronous Overlap Add) and WSOLA (Waveform Similarity Overlap Add), are based on the overlap-and-add (OLA) technique. While these algorithms are able to change the playback speed of audio signals while keeping their original pitch, transients are poorly preserved in the process. Pitch-preserving time stretching of an audio signal using OLA requires separate processing of transient and sustained signal portions in order to avoid transient dispersion [B1] and time-domain aliasing, which often occur with WSOLA and SOLA. A challenging problem is the time stretching of a combination of a strongly tonal signal, such as a trumpet, and a percussive signal, such as castanets.

To shed light on material related to this invention, some common approaches to this problem are referenced below.

Some current methods stretch the time surrounding the transients more strongly, so that no time stretching, or only little time stretching, needs to be performed during the short duration of the transient (see, for example, references [5] - [8]).

The following articles and patents describe methods for manipulating pitch and/or time: [A1], [A2], [A3], [A4], [A5], [A6], [A7], [A8].

In [B2], a method is proposed that approximately preserves the envelope of the signal in the time-stretched version, as well as its spectral characteristics. This approach expects a time-stretched percussive event to decay more slowly than in the original.

Several well-known methods rely on the separate processing of transients and stationary signal components, for example by modeling the signal as a sum of sinusoids, transients and noise (S+T+N) [B4, B5]. To preserve transients after time scaling, all three parts are stretched separately. This technique is capable of preserving the transient components of audio signals very well. The resulting sound, however, is often perceived as unnatural.

Further approaches either change the amount of time stretching and set it to one during a transient, or apply phase locking in the case of a transient [B3, B6, B7].

Article [B8] demonstrates how transients can be preserved during time and frequency stretching using the PV. In this approach, the transients were cut out of the signal before stretching. Removing the transient portions left gaps in the signal, which were stretched by the PV procedure. After stretching, the transients were re-added to the signal with a surrounding that fitted the stretched gaps.

In view of the foregoing, there is a need for a concept for manipulating an audio signal that includes a transient event which provides an output signal with improved perceptual quality.

An embodiment according to the invention provides a device for manipulating an audio signal that includes a transient event. The device includes a transient signal replacement module configured to replace a transient signal portion, which comprises a transient event, with a replacement signal portion adapted to the signal energy characteristics of one or more non-transient signal portions of the audio signal, or to the signal energy characteristic of the transient signal portion, in order to obtain a transient-reduced audio signal. The device further includes a signal processor configured to process the transient-reduced audio signal to obtain a processed version of the transient-reduced audio signal. The device also includes a transient signal insertion module configured to combine the processed version of the transient-reduced audio signal with a transient signal representing, in original or processed form, the transient content of the transient signal portion.

The above embodiment is based on the finding that a signal processor provides an output signal of improved quality if the transient signal portion is replaced by a replacement signal portion whose energy is adapted to the energy characteristics of the original audio signal, with the transient event being reduced or eliminated. This concept avoids large step changes of the signal energy at the input of the signal processor, which would be caused by simply removing the transient signal portion from the audio signal, and also avoids, or at least reduces, the adverse impact of the transient on the signal processor.

Thus, by removing or reducing the transient event in the audio signal to obtain the transient-reduced audio signal, and by limiting the energy change of the transient-reduced audio signal compared with the input audio signal, the signal processor receives an input signal such that its output signal approximates the desired output signal in the absence of a transient event.

In a preferred embodiment, the transient signal replacement module is configured to provide the replacement signal portion (or transient-reduced signal portion) such that the replacement signal portion is a time signal having a smoothed temporal evolution compared to the transient signal portion, and such that the deviation between the energy of the replacement signal portion and the energy of a non-transient signal portion of the audio signal preceding or following the transient signal portion is smaller than a predefined threshold value. In this way, the replacement signal portion fulfils two conditions, namely a so-called "transient condition" and a so-called "energy condition". The transient condition indicates that a transient event, which is represented by a step or peak in the time domain, is limited in intensity (or step height, or peak height) within the replacement signal portion. The energy condition further indicates that the transient-reduced audio signal (including the replacement signal portion) should have a smooth temporal evolution of the power spectral density distribution. Discontinuities in the temporal evolution of the power spectral density typically lead to audible artifacts. Accordingly, limiting such temporal discontinuities of the spectral energy distribution avoids audible artifacts that could arise from a simple removal (without replacement) of the transient portions of the input audio signal.
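The two conditions can be expressed as simple checks on a candidate replacement signal portion. The following is a minimal sketch under stated assumptions: the text only requires the energy deviation to stay below a predefined threshold and the transient to be limited in intensity, so the relative energy threshold, the peak-ratio limit, the function names and the toy signals below are all hypothetical.

```python
import math

def energy(x):
    return sum(v * v for v in x)

def meets_energy_condition(replacement, neighbour, threshold=0.1):
    """True if the relative energy deviation from the neighbouring
    non-transient portion is at most `threshold` (assumed value)."""
    e_ref = energy(neighbour)
    return abs(energy(replacement) - e_ref) <= threshold * e_ref

def meets_transient_condition(replacement, neighbour, max_ratio=1.5):
    """True if the peak of the replacement is not much higher than the
    peak of the neighbouring portion (assumed peak-ratio limit)."""
    peak_ref = max(abs(v) for v in neighbour)
    return max(abs(v) for v in replacement) <= max_ratio * peak_ref

def tone(t):
    return 0.5 * math.sin(2 * math.pi * t / 16)

neighbour = [tone(t) for t in range(32)]          # portion before the transient
continuation = [tone(t) for t in range(32, 64)]   # smooth candidate replacement
zeros = [0.0] * 32                                # simple removal of the portion
spiky = list(continuation)
spiky[10] += 3.0                                  # candidate that still holds a peak
```

In this toy case the smooth continuation satisfies both conditions, setting the portion to zero violates the energy condition, and the candidate that still contains the peak violates the transient condition.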

In a preferred embodiment, the transient signal replacement module is configured to extrapolate the amplitude values of one or more signal portions preceding the transient signal portion in order to obtain the amplitude values of the replacement signal portion. The transient signal replacement module is also configured to extrapolate the phase values of one or more signal portions preceding the transient signal portion in order to obtain the phase values of the replacement signal portion. Using this approach, a smooth amplitude evolution of the transient-reduced audio signal can be obtained. Furthermore, the phases of the different spectral components of the transient-reduced audio signal are well controlled (by the extrapolation), such that a transient event, which is characterized by specific phase values during the transient signal portion (different from the phase values of the non-transient signal portions), is suppressed.

In other words, the phase values introduced by the extrapolation differ from the phase values that characterize the transient. Extrapolation also has the advantage that knowledge of the audio signal portions preceding the transient signal portion is sufficient to perform it. It is, of course, also possible to use additional side information, for example extrapolation parameters, to perform the extrapolation.
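One simple way to picture such an extrapolation is to operate on the complex spectral values of short-time frames. The sketch below is an assumption made for illustration and is not taken from the description: it uses two preceding frames, extrapolates the magnitude linearly and continues the phase with the last observed phase increment. The function names are hypothetical.

```python
import cmath

def extrapolate_bin(prev2, prev1):
    """Predict the next complex spectral value of one frequency bin from
    the two preceding frames (prev2 is older than prev1)."""
    # Linear magnitude extrapolation, clipped at zero.
    mag = max(0.0, 2.0 * abs(prev1) - abs(prev2))
    # Continue the phase with the increment seen between the two frames.
    dphi = cmath.phase(prev1) - cmath.phase(prev2)
    return cmath.rect(mag, cmath.phase(prev1) + dphi)

def extrapolate_frame(frame2, frame1):
    """Apply the per-bin prediction to a whole frame of spectral values."""
    return [extrapolate_bin(a, b) for a, b in zip(frame2, frame1)]

# A stationary partial: constant magnitude 2.0, phase advancing by 0.3 rad
# per frame. The prediction continues this pattern, whatever the transient
# frame that actually follows may contain.
predicted = extrapolate_bin(cmath.rect(2.0, 0.1), cmath.rect(2.0, 0.4))
```

Because the predicted phases follow the stationary pattern of the preceding frames, they do not reproduce the specific phase alignment across frequency that characterizes a transient.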

In another preferred embodiment, the transient signal insertion module (150) is configured to cross-fade the processed version of the transient-reduced audio signal with the transient signal representing, in original or processed form, the transient content of the transient signal portion. In this case, the processed version of the transient-reduced signal may be a time-stretched version of the input audio signal. Accordingly, the transient can be smoothly re-inserted into the stretched version of the input audio signal. In other words, after the time stretching of the transient-reduced audio signal, the transients (in processed or unprocessed form) are re-added to the signal with a surrounding that fits the stretched gap.

In another preferred embodiment, the transient signal replacement module is configured to interpolate between the amplitude value of a signal portion preceding the transient signal portion and the amplitude value of a signal portion following the transient signal portion in order to obtain one or more amplitude values of the replacement signal portion. The transient signal replacement module is, in addition, configured to interpolate between the phase value of the signal portion preceding the transient signal portion and the phase value of the signal portion following the transient signal portion in order to obtain one or more phase values of the replacement signal portion. A particularly smooth temporal evolution of the amplitude and phase values can be obtained by performing interpolation. The phase interpolation also typically reduces or cancels the transient event, since transients typically exhibit a specific phase distribution in the immediate vicinity of the transient, which usually differs from the phase distribution at some distance from the transient.
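The interpolation can be sketched per frequency bin in the same complex-valued frame representation. The following example is illustrative only and rests on assumptions: the magnitude is interpolated linearly and the phase along the shortest arc between the two boundary phases, which ignores the regular phase advance between frames that a real short-time transform would exhibit. The function names are hypothetical.

```python
import cmath

def interpolate_bin(before, after, frac):
    """Interpolate one spectral value between the frame before (frac = 0)
    and the frame after (frac = 1) the transient signal portion."""
    mag = (1.0 - frac) * abs(before) + frac * abs(after)
    # Phase difference wrapped to (-pi, pi], i.e. the shortest arc.
    dphi = cmath.phase(after * before.conjugate())
    return cmath.rect(mag, cmath.phase(before) + frac * dphi)

def replacement_bins(before, after, count):
    """Values for `count` replacement frames lying between the two frames."""
    return [interpolate_bin(before, after, (i + 1) / (count + 1))
            for i in range(count)]

middle = interpolate_bin(cmath.rect(1.0, 0.2), cmath.rect(3.0, 1.0), 0.5)
wrapped = interpolate_bin(cmath.rect(1.0, 3.0), cmath.rect(1.0, -3.0), 0.5)
```

Unlike extrapolation, this variant needs the signal portion following the transient, so it requires look-ahead. In return, the replacement joins both neighbouring portions smoothly.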

In a preferred embodiment, the transient signal replacement module is configured to apply a weighted noise (for example, a noise-like signal whose spectrum is adapted to the signal energy characteristics of one or more non-transient signal portions of the audio signal, or to the signal energy of the transient signal portion) in order to obtain the amplitude values of the replacement signal portion, and to apply a weighted noise in order to obtain the phase values of the replacement signal portion. By applying a weighted noise, it is possible to reduce the transient further while keeping the impact on the energy sufficiently small.

In a preferred embodiment, the transient signal replacement module is configured to combine non-transient components of the transient signal portion with the extrapolated or interpolated values in order to obtain the replacement signal portion. It has been found that an improved quality of the transient-reduced audio signal (and of its processed version obtained using the signal processor) can be achieved if the non-transient components of the transient signal portion are maintained. For example, tonal components of the transient signal portion may have only a limited impact on the transient (because a transient in the time domain is usually a broadband signal having a specific phase distribution over frequency). Thus, tonal non-transient components of the transient signal portion may carry important information that actually helps to obtain the desired output signal of the signal processor. Maintaining such signal components while reducing the transient may therefore contribute to improving the quality of the processed audio signal.

In a preferred embodiment, the transient signal replacement module is configured to provide a replacement signal portion of variable length depending on the length of the transient signal portion. It has been found that the audio quality can sometimes be improved by adapting the length of the replacement signal portions to the variable length of the transient signal portions. For example, in some signals the transient signal portion may be very short. In this case, an optimized processed audio signal may be obtained by replacing only a relatively short portion of the input audio signal. Thus, as much (non-transient) information of the original input audio signal as possible is maintained. Also, by keeping the replacement signal portion short (in accordance with the length of the transient signal portion), overlapping of subsequent replacement signal portions can be avoided in many situations. Therefore, in most cases, an original non-transient signal portion is available between two subsequent replacement signal portions. The processed audio signal is thus generated with sufficient accuracy while preserving as much (non-transient) information of the original input audio signal as possible.

In a preferred embodiment, the signal processor is configured to process the transient-reduced audio signal such that a given temporal signal portion of the processed version of the transient-reduced audio signal depends on a plurality of temporally non-overlapping signal portions of the transient-reduced audio signal. In other words, the signal processor preferably includes a temporal memory when forming a signal portion of the processed version of the transient-reduced audio signal. Signal processing with memory occurs in block-wise processing of the transient-reduced audio signal, or in temporal filtering (for example, FIR filtering or IIR filtering) of the transient-reduced audio signal. It has also been found that the inventive replacement of transient signal portions is very well suited for use with such a signal processor. While transients would typically have a significant adverse impact on a signal processor that performs block processing or has a temporal memory, the replacement signal portion proposed by the invention reduces the adverse effects of the transient. While a transient would typically affect many signal portions provided by the signal processor, extending beyond the temporal limits of the transient signal portion, the adverse impact of the transient is reduced or even eliminated with the inventive method. By maintaining a smooth temporal evolution of the energy of the transient-reduced signal, any reduction in tonal intensity can be made quite smooth. For example, a block (of a block-processing signal processor) that includes a replacement signal portion (for example, in addition to original non-transient signal portions) is not seriously degraded, since the replacement signal portion is adapted to the energy of the rest of the block. Thus, the content of the block is not significantly affected by the elimination or reduction of the transient. Furthermore, temporal filtering, which would be adversely affected both by the transient and by a removal of the transient signal portion (for example, by setting it to zero), remains almost unaffected when the transient is removed (or reduced) using the replacement signal portion.

In a preferred embodiment, the signal processor is configured to process time blocks of the transient-reduced audio signal to obtain the processed version of the transient-reduced audio signal. The transient signal replacement module is also configured to adjust the duration of the signal portion to be replaced by the replacement signal portion with a temporal resolution finer than the duration of a time block, or to replace a transient signal portion having a duration shorter than the duration of a time block with a replacement signal portion having a duration shorter than the duration of the time block. Thus, the proposed replacement allows audio signals to be processed with low distortion even if the length of the removed transient portions differs from the length of the time blocks.

In a preferred embodiment, the signal processor is configured to process the transient-reduced audio signal in a frequency-dependent manner, such that the processing introduces transient-degrading, frequency-dependent phase shifts into the transient-reduced audio signal. However, even such transient-degrading processing does not have a significant adverse effect on the processed audio signal, since the transients are typically processed separately from the transient-reduced audio signal. Accordingly, while transient-degrading signal processing may be applied in the signal processor, the quality of the transients can be maintained by separate processing of the transients and insertion of the transients at a later processing stage.

In a preferred embodiment, the transient signal replacement module includes a transient detector configured to provide a time-varying detection threshold for detecting a transient in the audio signal, such that the detection threshold follows the envelope of the audio signal with an adjustable smoothing time constant. The transient detector is configured to change the smoothing time constant in response to the detection of a transient and/or depending on the temporal evolution of the audio signal. With such a transient detector, it is possible to detect transients of different intensity even if the transients are closely spaced in time. For example, the invention allows a weak transient to be detected even if the weak transient closely follows a preceding, stronger transient. Accordingly, the detection of transients for their replacement can be performed reliably and accurately.
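A detector of this kind can be sketched in a few lines. The following is only one possible reading of the paragraph above and not the detector of the described device: the one-pole envelope follower, the rule of switching to a faster smoothing coefficient for a fixed number of samples after each detection, all parameter values and the toy signal are assumptions made for illustration.

```python
def detect_transients(x, ratio=3.0, slow=0.995, fast=0.8, fast_len=20):
    """Return the sample indices at which |x| exceeds `ratio` times a
    smoothed envelope. After each detection the envelope is smoothed with
    the `fast` coefficient for `fast_len` samples, so that the threshold
    quickly follows the decaying tail and then drops back again."""
    env = abs(x[0])
    fast_left = 0
    hits = []
    for n, v in enumerate(x):
        a = abs(v)
        if a > ratio * env:          # time-varying detection threshold
            hits.append(n)
            fast_left = fast_len     # change the smoothing time constant
        c = fast if fast_left > 0 else slow
        env = c * env + (1.0 - c) * a
        if fast_left > 0:
            fast_left -= 1
    return hits

# Toy magnitude signal: constant background, a strong transient with a
# decaying tail at sample 100, and a weaker transient at sample 130.
signal = [0.1] * 160
for k in range(8):
    signal[100 + k] = max(0.1, 0.7 ** k)
signal[130] = 0.5

adaptive_hits = detect_transients(signal)
fixed_hits = detect_transients(signal, fast=0.995)  # no change of constant
```

With these assumed settings the adaptive variant reports one detection per transient, including the weaker second one. With a fixed slow time constant the threshold does not follow the decaying tail, and the tail of the first transient is reported repeatedly.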

In a preferred embodiment, the device includes a transient processor configured to receive transient information representing the transient content of the transient signal portion. In this case, the transient processor may be configured to obtain, on the basis of the transient information, a processed transient signal in which tonal components are reduced. The transient signal insertion module may be configured to combine the processed version of the transient-reduced audio signal with the processed transient signal provided by the transient processor. Thus, the transient-reduced audio signal and the transient content of the input audio signal (represented by the transient information) can be processed separately in such a way that the subsequent combination of the different signal portions yields an appropriate overall output signal. Those signal components of the transient signal portion that have already been processed by the "main" signal processor (for example, tonal signal components) should not be included in the separate processing of the transient. Accordingly, an appropriate separation of the processing of the audio components of the transient signal portion can be achieved.

Further embodiments according to the invention provide a method and a computer program for manipulating an audio signal that includes a transient event.

Brief description of the drawings

Embodiments according to the invention are described below with reference to the drawings, in which:

Fig.1 shows a block diagram of a device for manipulating an audio signal that includes a transient event, in accordance with an embodiment of the present invention;

Fig.2 shows a block diagram of a transient signal replacement module in accordance with an embodiment of the present invention;

Fig.3a-3d show block diagrams of signal processors in accordance with embodiments of the present invention;

Fig.4 shows a block diagram of a transient signal processing device in accordance with the present invention;

Fig.5a shows an overview of a vocoder implementation for use in the signal processor of Fig.1;

Fig.5b shows an implementation of one part (analysis) of the signal processor of Fig.1;

Fig.5c illustrates another part (stretching) of the signal processor of Fig.1;

Fig.6 illustrates a transform-based implementation of a phase vocoder for use in the signal processor of Fig.1;

Fig.7 shows a schematic representation of a phase vocoder algorithm with a synthesis hop size that differs from the analysis hop size, for example by a factor of 2;

Fig.8 shows a graphic representation of the temporal evolution of the amplitude of the audio signal;

Fig.9 shows a graphical representation of the timing of the signal processing in the device of Fig.1;

Fig.10 shows a graphic representation of signals that can be generated in the device in accordance with Fig.1;

Fig.11 shows another graphical representation of signals that may be generated in the device in accordance with Fig.1;

Fig.12 shows a flowchart of a method of manipulating an audio signal in accordance with an embodiment of the present invention;

Fig.13 shows a graphical representation of the removal of transients and of the interpolation in accordance with an embodiment of the present invention;

Fig.14 shows a graphical representation of the time stretching and of the insertion of transients in accordance with an embodiment of the present invention;

Fig.15 shows a graphical representation of waveforms that occur at various steps of the proposed transient processing in a phase vocoder time-stretching method; and

Fig.16 shows a graphical representation of signals that are formed at different steps of the time stretching.

Some embodiments according to the invention are described below. A first embodiment of a device for manipulating an audio signal that includes a transient event is described with reference to Fig.1, which gives a brief overview of the first embodiment, and with reference to Fig.2, 3a-3c, 4, 5a, 5b, 5c, 6 and 7, which show details of the components of the first embodiment and the operation of the phase vocoder (Fig.7). A transient is shown in Fig.8, and the processing is illustrated in Fig.9 - 11. Fig.12 shows a flowchart of the corresponding method.

A second embodiment of a device for manipulating an audio signal that includes a transient event is then described with reference to Fig.13-17.

The embodiment according to Fig.1

Fig.1 shows a block diagram of a device, according to an embodiment of the invention, for manipulating an audio signal that includes a transient event. The device shown in Fig.1 is designated in its entirety by 100. The device 100 is configured to receive an audio signal 110 that includes a transient and to generate, on the basis thereof, a processed audio signal 120 with unprocessed "natural" or synthesized transients. The device 100 includes a transient signal replacement module 130 configured to replace a transient signal portion, which comprises a transient event in the audio signal 110, with a replacement signal portion adapted to the signal energy characteristics of one or more non-transient signal portions or to the signal energy of the transient signal portion, in order to obtain a transient-reduced audio signal 132. The phase characteristics of the replacement signal portion may also be adapted to the phase characteristics of one or more non-transient signal portions. The device 100 further includes a signal processor 140 configured to process the transient-reduced audio signal 132 to obtain a processed version 142 of the transient-reduced audio signal. The device 100 further includes a transient signal insertion module 150 configured to combine the processed version 142 of the transient-reduced audio signal with a transient signal 152 and thus to obtain the processed audio signal 120 with "natural" unprocessed or synthesized transients. The transient signal 152 may represent, in original or processed form, the transient signal portion that has been replaced by the replacement signal portion in the transient signal replacement module 130.

The transient signal replacement module 130 may, optionally, further provide transient information 134 representing the transient signal portion (which is replaced by the replacement signal portion in the transient-reduced audio signal 132). Accordingly, the transient information 134 may serve to preserve the transient content of the audio signal 110, which is reduced or even completely suppressed in the transient-reduced audio signal 132. The transient information 134 may be sent directly to the transient signal insertion module 150 to serve as the transient signal 152. However, the device 100 may further include an optional transient processor 160 configured to process the transient information 134 and to derive the transient signal 152 from it. For example, the transient processor 160 may be configured to perform a frequency transposition of the transient, a frequency shift of the transient, or a synthesis of the transient.

The device 100 may further include, optionally, a signal parameter determiner 170 configured to determine parameters of the signal 120 in order to obtain an audio signal with specific parameters for reproduction.

Regarding the functionality of the device 100, it can generally be said that the device 100 allows the non-transient content of the audio signal 110 (represented by the transient-reduced audio signal 132) and the transient content of the audio signal 110 (represented by the transient information 134) to be processed separately. Transient events are reduced or even suppressed in the transient-reduced audio signal 132, so that the signal processor 140 may perform signal processing that would degrade transient events and/or that would be adversely affected by transient events. At the same time, replacing the transient with energy-adapted replacement signal portions in the transient signal replacement module 130 avoids audible artifacts that would be introduced by the signal processor 140 if the transient signal portions were simply set to zero.
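The interplay of the three stages can be sketched with a deliberately simple toy chain. Everything in the sketch is an assumption made for illustration: the replacement is formed by copying the immediately preceding samples (which is only phase-continuous here because the toy tone is periodic with the length of the replaced portion), a causal three-tap moving average stands in for the signal processor 140, and the transient is re-inserted by simple addition at its original position. The function names are hypothetical.

```python
import math

def replace_transient(x, start, end):
    """Stand-in for module 130: swap x[start:end] for a copy of the
    preceding samples and return the removed transient content."""
    length = end - start
    replacement = x[start - length:start]
    reduced = x[:start] + replacement + x[end:]
    transient = [x[start + i] - replacement[i] for i in range(length)]
    return reduced, transient

def moving_average(x, taps=3):
    """Stand-in for signal processor 140: a causal FIR filter with memory,
    which smears any sharp peak over several samples."""
    return [sum(x[max(0, n - taps + 1):n + 1]) / taps for n in range(len(x))]

def insert_transient(processed, transient, start):
    """Stand-in for module 150: add the transient back at its position."""
    out = list(processed)
    for i, v in enumerate(transient):
        out[start + i] += v
    return out

clean = [0.5 * math.sin(2 * math.pi * n / 8) for n in range(32)]
audio = list(clean)
audio[17] += 2.0                                        # transient event

reduced, transient = replace_transient(audio, 16, 24)   # signal 132, info 134
processed = moving_average(reduced)                     # signal 142
output = insert_transient(processed, transient, 16)     # signal 120
direct = moving_average(audio)                          # processing without 130/150
```

In this toy chain the peak passes the processing stage untouched and appears with its full height in the output, whereas filtering the input directly spreads the same peak over three samples at one third of its height.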

A relatively good auditory impression is also obtained by inserting a transient signal in the transient signal insertion module 150. The auditory impression would, as a rule, be seriously degraded if transient events were simply eliminated. Therefore, transient signals are re-inserted into the processed audio signal 142. The re-inserted transient signals may be identical to the transient signals removed from the audio signal 110 in the transient signal replacement module 130. Alternatively, the removed (or replaced) transient signals may be processed, for example by a transposition or a frequency shift. In some embodiments, however, the re-inserted transient signals may even be generated synthetically, for example on the basis of transient parameters describing the time and intensity of the transient signals.

Details of the transient signal replacement module.

The functionality of the transient signal replacement module 130 is described next with reference to Fig. 2, which shows a block diagram of an implementation of the transient signal replacement module 130. The transient signal replacement module 130 receives the audio signal 110 and generates the transient-reduced audio signal 132 from it.

To this end, the transient signal replacement module 130 may, for example, include a transient detector 130a, which is configured to detect a transient and to provide information about the time of the transient. For example, the transient detector 130a may provide transient time information 130b describing the start time and the end time of the transient signal portion. Various methods of transient detection are known in the art, so a detailed description is omitted here. In some cases, however, the transient detector 130a may be configured to distinguish transients of different lengths, so that the length of a given transient signal portion may vary depending on the actual signal shape.

Alternatively, the transient signal replacement module 130 may include a side information extractor 130c, for example if side information describing the time intervals of the transients is associated with the audio signal 110. In this case, the transient detector 130a may of course be omitted. The side information extractor 130c may further, optionally, be configured to provide one or more interpolation, extrapolation and/or replacement parameters on the basis of the side information associated with the audio signal 110. The transient signal replacement module 130 further includes a transient portion replacer 130d, for example a transient portion interpolator or a transient portion extrapolator. The transient portion replacer 130d is configured to receive the audio signal 110 and the transient time information 130b (provided by the transient detector 130a or by the side information extractor 130c) and to replace the transient portion of the audio signal 110 with a replacement signal portion.

The details of the detection and replacement (or removal) of transients are described next. In particular, various methods of transient removal are discussed in detail.

Transients (for example, the onset of an instrument or percussive signals) can generally be described as short time intervals during which the signal evolves rapidly and unpredictably. For example, a transient can be detected (using the transient detector 130a) by evaluating the time-domain representation of the audio signal 110. If the time-domain representation of the audio signal 110 exceeds a threshold (which may vary over time), the presence of a transient event can be established. A time region that includes a transient event can be regarded as a transient signal portion, which can be described by the transient time information 130b.
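The threshold test described above can be illustrated by a minimal sketch. It is not the detector of the invention: the function name detect_transients, the frame length, the threshold ratio and the smoothing constant are all assumptions chosen for illustration, and the frame energy stands in for "the time-domain representation".

```python
import math

def detect_transients(samples, frame_len=64, ratio=4.0, smooth=0.9):
    """Return (start, end) sample ranges whose frame energy exceeds a
    time-varying threshold (hypothetical parameters, for illustration only)."""
    regions = []
    background = None   # slowly updated energy estimate of non-transient frames
    start = None        # start sample of the transient region currently open
    n_frames = len(samples) // frame_len
    for k in range(n_frames):
        frame = samples[k * frame_len:(k + 1) * frame_len]
        energy = sum(x * x for x in frame) / frame_len
        if background is None:
            background = energy
        # The threshold follows the background energy, so it changes over time.
        is_transient = energy > ratio * background + 1e-12
        if is_transient and start is None:
            start = k * frame_len
        elif not is_transient and start is not None:
            regions.append((start, k * frame_len))
            start = None
        if not is_transient:
            background = smooth * background + (1 - smooth) * energy
    if start is not None:
        regions.append((start, n_frames * frame_len))
    return regions

# A quiet tone with a short, loud burst inserted at samples 512..575.
quiet = [0.1 * math.sin(2 * math.pi * n / 16) for n in range(1024)]
signal = list(quiet)
for n in range(512, 576):
    signal[n] = 1.0 if n % 2 == 0 else -1.0
regions = detect_transients(signal)
print(regions)
```

The returned ranges play the role of the transient time information 130b (start time and end time of each transient signal portion).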

Because these signal portions (that is, transients, or the time intervals during which the signal evolves rapidly and unpredictably) should ideally not be stretched in time, it is advantageous to remove the "transient time" before the stretching (which may be performed by the signal processor 140). The removal may extend over the entire period of time that is considered "non-stationary". For percussive instruments, this period essentially covers the entire sound event (for example, a single hi-hat stroke). For other instruments, the so-called ADSR (Attack, Decay, Sustain, Release) envelope can serve to illustrate the transient period.

Fig. 8 shows a graphical representation 800 of the temporal evolution of a signal amplitude. The abscissa 810 describes the time, and the ordinate 812 describes the amplitude. Curve 814 describes the temporal evolution of the amplitude. As can be seen in Fig. 8, the temporal evolution of the amplitude includes an attack interval, a decay interval, a sustain interval and a release interval. The attack interval and the decay interval can, for example, be regarded as a "transient region" or transient signal portion.

However, it was found that for the subsequent signal processing (for example, by the signal processor 140), the gap in the audio signal caused by suppressing the transient must be filled in such a way that, when listening to the processed signal (the synthesis signal, processed for example by the signal processor 140), it is perceived as a continuous, transient-free signal without interruptions or amplitude modulation.

For the specific case described here, it is preferable to suppress the transient portions of the original signal (for example, signal 110) in the synthesized signal (e.g. in the signal 132 generated by the transient signal replacement module 130 and, consequently, in the signal 142 generated by the signal processor 140), while the tonal portions and the non-transient noise components are retained.

Various approaches already exist in this area, but their goal is not a high-quality transient-adjusted (or transient-cleaned) signal. On this subject, reference is made, for example, to the publication [Edier].

Regarding the effectiveness of methods for detecting transients and for decomposing a signal into various components, such as "transient + noise", the conclusions of the relevant publications [Bello] and [Daudet] can be cited, which give a good and comprehensive overview of the commonly accepted methods: none of the methods is clearly superior to the others; the choice should be governed by the application at hand and the available computational power.

It follows that the choice of the specific detection and decomposition method can significantly affect the result of the inventive method. A person skilled in the art may use any of the various known techniques in order to obtain the best possible result for the respective application scenario.

Concept of replacing the transient signal portion.

In some application scenarios, signal portions are generated that should not be judged as "correct" or "incorrect" by comparison with a reference signal, but only by how good they sound overall. This means that the solution according to the invention is not limited to separating out and excluding the transient components, but can generate synthesized signals having certain properties.

Therefore, the generation of the synthesized signal (for example, the generation of the transient-reduced audio signal 132 by the transient portion replacer 130d) can be a combination of signal decomposition and signal conditioning (in the sense of interpolation and/or extrapolation of the signal) over the duration of the transient. Non-transient components of the original signal can be mixed with the interpolated/extrapolated components or can be replaced by them.

In some embodiments of the invention, extrapolation may be equivalent to generating the synthesized signal from previous values only. Accordingly, extrapolation can be performed in real time. In contrast, in some embodiments interpolation may be equivalent to generating the synthesized signal from both previous and subsequent values. Thus, in some cases, interpolation requires look-ahead.

In summary, various concepts can be implemented in the transient portion replacer 130d in order to obtain the transient-reduced audio signal 132.

For example, the transient portion replacer 130d may be configured to remove the transient components from the audio signal 110 in order to obtain the transient-reduced audio signal. In this case, the transient portion replacer 130d may be configured to ensure that sufficient energy of the transient signal portion remains in the replacement signal portion. For example, frequency components that exhibit the phase characteristics of a transient may be removed from the audio signal 110, while the other frequency components, which do not exhibit the phase characteristics of a transient (for example, tonal frequency components), may be carried over from the transient signal portion into the replacement signal portion. It can thereby be ensured that the replacement signal portion contains sufficient signal energy, which does not deviate too strongly from the signal energy of the preceding and following signal portions.

Alternatively, the transient portion replacer 130d may be configured to obtain the replacement signal portion by destroying the transient-forming phase relations in the transient signal portion. For example, the transient portion replacer may be configured to randomize, or to adjust (deterministically), the phases of the different frequency components of the transient signal portion. A replacement signal portion obtained in this way has (at least approximately) the same energy as the transient signal portion, since modifying the phases of the frequency components does not change the energy. However, the temporal shape of the transient is lost in the replacement signal portion, because the evolution of a transient over time depends on a specific phase relation between the different frequency components, and this relation has been destroyed.
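The energy argument can be checked with a small sketch. It is only an illustration under assumptions: a plain discrete Fourier transform over a single block is used, and the names dft, idft_real and randomize_phases are hypothetical. The magnitudes of all frequency components are kept and only their phases are randomized, with conjugate symmetry maintained so that the result stays real-valued.

```python
import cmath
import math
import random

def dft(x):
    """Plain discrete Fourier transform of a real block."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def idft_real(spec):
    """Inverse DFT; returns the real part (the spectrum is conjugate-symmetric)."""
    n = len(spec)
    return [(sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                 for k in range(n)) / n).real for t in range(n)]

def randomize_phases(block, rng):
    """Keep every magnitude, replace the phases by random values."""
    n = len(block)
    spec = dft(block)
    out = list(spec)
    # Bin 0 (and bin n/2 for even n) must stay real, so they are left unchanged.
    for k in range(1, (n + 1) // 2):
        phase = rng.uniform(-math.pi, math.pi)
        out[k] = abs(spec[k]) * cmath.exp(1j * phase)
        out[n - k] = out[k].conjugate()   # keep the time signal real
    return idft_real(out)

def energy(x):
    return sum(v * v for v in x)

# A single click: all frequency components line up in phase at one instant.
click = [0.0] * 64
click[32] = 1.0
smeared = randomize_phases(click, random.Random(0))
print(energy(click), energy(smeared), max(abs(v) for v in smeared))
```

The energy of the block is preserved up to rounding, while the sharp peak of the click is spread over the whole block, which corresponds to the loss of the temporal shape described above.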

Alternatively, the transient portion replacer 130d may extrapolate, for example, the temporal evolution of the energy in different frequency bands on the basis of the non-transient signal portion preceding the transient signal portion. The content of the replacement signal portion may then be based solely on an extrapolation of the content of the non-transient signal portion preceding the transient signal portion. In this case, the content of the transient signal portion can be ignored completely.

Alternatively, the content of the replacement signal portion may be obtained by the transient portion replacer 130d by interpolating between the content of the non-transient signal portion preceding the transient signal portion and the content of the non-transient signal portion following the transient signal portion. The content of the transient signal portion can again be ignored completely. The interpolation may be performed, for example, in the time-frequency domain.
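As a rough sketch of such an interpolation, the following example fills the frames of a transient portion by interpolating linearly, per frequency band, between the last frame before and the first frame after the transient. The frame layout (one list of band magnitudes per time frame), the linear rule and the name interpolate_gap are assumptions made for illustration; phase handling is deliberately left out.

```python
def interpolate_gap(frames, start, end):
    """Replace frames[start:end] by linear interpolation between
    frames[start - 1] and frames[end], independently for each band."""
    before, after = frames[start - 1], frames[end]
    span = end - start + 1
    result = [list(f) for f in frames]
    for i in range(start, end):
        w = (i - start + 1) / span   # weight grows from the preceding to the following frame
        result[i] = [(1 - w) * b + w * a for b, a in zip(before, after)]
    return result

# Six frames with two bands each; frames 2 and 3 contain the transient.
frames = [[1.0, 4.0], [1.0, 4.0], [9.0, 9.0], [9.0, 9.0], [4.0, 1.0], [4.0, 1.0]]
filled = interpolate_gap(frames, 2, 4)
print(filled[2], filled[3])
```

The content of the transient frames is never read, which matches the statement that the transient signal portion can be ignored completely. Because frames[end] is needed, this variant requires look-ahead.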

Alternatively, a combination of the methods described above may be used to obtain the content of the replacement signal portion. For example, the non-transient content of the transient signal portion (obtained, for example, by removing the transient content or by destroying the transient-forming phase relations) may be combined with audio content obtained by interpolating or extrapolating one or more non-transient signal portions. In another example, the transient-forming phase relations in the transient signal portion may be destroyed, and the energy of the transient signal portion may be scaled so that it matches the energy of the adjacent non-transient signal portions.

In view of the above, the replacement signal portion may be synthesized on the basis of the non-transient signal portions only (for example, on the basis of the signal portions preceding and/or following the transient signal portion, without using the content of the transient signal portion), on the basis of the transient signal portion only, or on the basis of a combination of one or more non-transient signal portions and the transient signal portion.

Extension of the concept of generating a transient-reduced audio signal - basics

An extension of the concept of generating the transient-reduced audio signal 132 is described next; its aspects can be applied in any of the embodiments described here. The detection and replacement process is described in WO 2007/118533, which is incorporated herein by reference.

WO 2007/118533 A1 describes a device and a method for generating an ambient signal. That document describes a transient detector which is configured to detect the time interval of a transient. The transient detector described in WO 2007/118533 A1 may, for example, be used as an implementation of (or as a replacement for) the transient detector 130a described herein. The document further describes a synthesis signal generator which generates a synthesized signal that satisfies a transient condition and a continuity condition. The synthesis signal generator described in WO 2007/118533 A1 may, for example, be used to implement the transient portion replacer 130d, or may even be used instead of the transient portion replacer 130d. Thus, the concept of generating a synthesized signal described in WO 2007/118533 A1 can be used to generate the transient-reduced audio signal 132 in some embodiments of this invention.

Extension of the concept of generating a transient-reduced audio signal - additions

In the application described above (processing of signals containing transients while maintaining good perceptual quality), a high sound quality of the generated signal is considerably more important than in WO 2007/118533 (generation of an ambient signal). The method described in WO 2007/118533 is therefore extended by several steps in order to improve the audio quality.

For example, in addition to the extrapolation of the amplitude, the solution according to the invention may also include an extrapolation or interpolation of the phase values, in order to obtain a synthesized signal of improved quality that contains no transient portions.

The extrapolation or interpolation is performed, for example, by means of linear prediction or linear predictive coding (LPC), by linear and/or spline interpolation, or by similar methods, with or without added weighted noise.

In some embodiments, the above-described generation of the transient-reduced audio signal 132 may be particularly advantageous when it is used in combination with a phase vocoder, which may be part of the signal processor 140 or may constitute the signal processor 140. Some embodiments exploit a property of the phase vocoder that is commonly regarded as a major problem [8], namely that at a transient there is no predictable relation to the preceding frames. Some embodiments make use of this fact for transient suppression: the transient is erased by forcing a relation to the preceding frames. In other words, the phase values of the coefficients (for example, complex numbers) that describe the various time-frequency components of the replaced signal interval are set, for example, by extrapolation from the preceding time-frequency components (of the preceding non-transient signal portion), or by interpolation between the corresponding time-frequency components of the preceding and the following non-transient signal portions. A similar interpolation method is described in the publication [Maher]. The method presented in [Maher] does not work in real time, since it requires the signal portions that follow the gap. In addition, [Maher] only describes the processing of "peaks" in the audio signal (in contrast, some embodiments according to the invention process all frequency lines) and does not explicitly deal with noise components. In other words, in some embodiments the method described in [Maher] for filling gaps in an audio signal may be applied in order to obtain the transient-reduced audio signal 132 from the original input audio signal 110. Rather than filling in a "missing" portion of the audio signal, the signal portion identified as transient is replaced using the method described in [Maher].

However, the interpolation/extrapolation can be performed independently for each frequency component. The amplitude and the phase may each be interpolated (e.g., separately).

The transient detector 130a

Some details of the transient detector 130a are described below. It should be noted that many different implementations of the transient detector 130a can be used, so that these details are to be regarded merely as examples of a preferred implementation. In some embodiments, adaptive thresholds are preferred for determining the time intervals of the transients. Adaptive thresholds are usually smoothed versions of the detection function, which can lead to significant fluctuations and therefore to small peaks in the vicinity of large peaks being missed. Details are described in the publication [Bello]. This problem can be addressed, for example, by suitably adapting the smoothing constants depending on the currently detected state (transient region / non-transient region) and on the course of the detection function (e.g., attack, decay).

References concerning the aspects mentioned above include: [Edier], [Bello], [Goodwin], [Walther], [Maher], [Daudet].

Transient portion extractor 130e

In addition to the functionality described above, the transient signal replacement module 130 may further include a transient portion extractor 130e, which may be configured to receive the audio signal 110 (or at least the transient signal portion) and to provide the transient information 134. The transient portion extractor 130e may be configured to provide the transient information 134 in any suitable form, for example in the form of a "time-domain transient signal", in the form of a "time-frequency-domain representation of the transient", or in the form of transient parameters (for example, information about the time of the transient and/or information about the intensity of the transient and/or information about the steepness of the transient and/or any other appropriate information about the transient).

In particular, the transient portion extractor 130e may be configured to provide the transient information 134 only for those signal portions that were removed from the audio signal 110 to obtain the transient-reduced audio signal 132, in order to keep the data rate reasonably small.

Implementation alternatives of the signal processor 140 - overview

Various basic concepts for implementing the signal processor 140 are described next. Fig. 3a illustrates a preferred implementation of the signal processor 140 shown in Fig. 1. This implementation includes a frequency-selective analyzer 310 and, connected in series, a frequency-selective processing unit 312, which is implemented in such a way that it has a negative effect on the "vertical coherence" of the original audio signal. An example of such frequency-selective processing is the stretching or shortening of a signal in time, where the stretching or shortening is applied in a frequency-selective manner so that, for example, the processing introduces phase shifts into the processed audio signal that differ for different frequency bands. The phase shifts may, for example, be introduced in such a way that transients are suppressed. The signal processor 140 shown in Fig. 3a may further include a frequency combiner 314, which is configured to combine the different frequency components of the processed audio signal generated by the frequency-selective processing unit 312 into a single signal (e.g., a time-domain signal).

For block-wise processing, the frequency-selective analyzer 310 may split the transient-reduced audio signal 132 into a plurality of frequency components (for example, complex spectral coefficients), and the frequency combiner 314 may be configured to provide a time-domain representation of the processed audio signal 142 on the basis of a plurality of complex spectral coefficients for different frequency bands. For example, the frequency-selective analyzer 310 may process a block of samples of the audio signal 132 (e.g., after windowing) to obtain a set of complex spectral coefficients representing the audio content of that block of samples. Similarly, the frequency combiner 314 may receive a set of complex coefficients (e.g., one for each of the plurality of frequency bands) and provide, on that basis, a time-domain representation that covers a limited time interval and comprises a plurality of time-domain samples.

Another preferred signal processor implementation is illustrated in Fig. 3b in the context of processing by a phase vocoder. In general, the phase vocoder includes a subband/transform analyzer 320, a processor 322 connected in series with it, which performs frequency-selective processing of the plurality of output signals provided by the analyzer 320, and a subband/transform combiner 324 connected in series, which combines the signals processed by the processor 322 to finally obtain the processed signal 142 in the time domain at the output 326. The processed time-domain signal 142 is, again, a full-band signal or a low-pass-filtered signal, provided that the bandwidth of the processed signal 142 is greater than the bandwidth represented by a single branch between blocks 322 and 324, since the subband/transform combiner 324 combines frequency-selective signals.

Further details of the phase vocoder are discussed below with reference to Figs. 5a, 5b, 5c and 6.

Fig. 3c shows another possible implementation of the signal processor 140. As can be seen, the transient-reduced audio signal 132 may even be processed in the time domain, using time-domain processing 330. Typically, the time-domain processing 330 has memory, so that a transient in the signal 132 would have a lasting effect on the processed audio signal 142. In some cases, a transient in the audio signal 132 would cause a response in the processed audio signal 142 that is considerably longer (e.g., more than 2 times, or even 5 times, or even 10 times longer) than the duration of the transient (or the duration of the transient signal portion). In this case, transients in the audio signal 132 would undesirably and significantly degrade the processed audio signal 142, for example by producing an audible echo. Moreover, completely removing the transient signal portion would also have a lasting effect on the processed audio signal 142, because the complete removal of the transient signal portion itself produces a transient.

Implementation of the signal processor using a vocoder - filter bank

Figs. 5 and 6 illustrate preferred implementations of a vocoder, which may be used to implement the signal processor 140 or may be part of the signal processor 140. Fig. 5a shows a filter bank implementation of the phase vocoder, in which the input audio signal (for example, the transient-reduced audio signal 132) is applied at the input 500, and the processed audio signal (for example, the processed audio signal 142) is obtained at the output 510. In particular, each channel of the filter bank illustrated in Fig. 5a includes a band-pass filter 501 and a downstream oscillator 502. To produce the output signal at the output 510, the output signals of the oscillators of all channels are combined by a combiner, which is implemented, for example, as an adder and is indicated by block 503. Each filter 501 is configured to provide an amplitude signal on the one hand and a frequency signal on the other hand. The amplitude signal and the frequency signal are time signals: the amplitude signal illustrates the evolution of the amplitude in the filter 501 over time, while the frequency signal represents the evolution of the frequency of the signal filtered by the filter 501.

The operation of the filter 501 is illustrated schematically in Fig. 5b. Each filter 501 in Fig. 5a may be configured as shown in Fig. 5b, where only the frequency f_i supplied to the two input mixers 551 and to the adder 552 differs from channel to channel. The output signals of both mixers are low-pass filtered by low-pass filters 553; the low-pass signals differ because they were generated with local oscillator signals that are 90° out of phase. The upper low-pass filter 553 provides a quadrature signal 554, while the lower low-pass filter 553 provides an in-phase signal 555. These two signals, that is, I and Q, are fed to a coordinate transformer 556, which generates an amplitude/phase representation from the rectangular representation. The magnitude signal, i.e. the amplitude signal of Fig. 5a over time, is provided at the output 557. The phase signal is fed to a phase unwrapper 558. At the output of block 558 there is no longer a phase value that always lies between 0 and 360°, but a phase value that increases linearly. This "unwrapped" phase value is fed to a phase/frequency converter 559, which may, for example, be implemented as a simple phase difference former that subtracts the phase at the previous time instant from the phase at the current time instant in order to obtain the frequency value for the current time instant. This frequency value is added to the constant frequency value f_i of filter channel i to obtain a time-varying frequency value at the output 560. The frequency value at the output 560 has a constant component equal to f_i and a variable component equal to the frequency deviation by which the current frequency of the signal in the filter channel deviates from the center frequency f_i.
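The last stages of this chain (phase unwrapping, phase difference, addition of f_i) can be sketched as follows. The sketch assumes that the phase at the output of the coordinate transformer is available as a list of values in radians sampled at a known rate; the function names unwrap and channel_frequency and the numbers in the example are hypothetical.

```python
import math

def unwrap(phases):
    """Remove jumps of 2*pi so that the phase evolves continuously (cf. block 558)."""
    out = [phases[0]]
    for p in phases[1:]:
        delta = p - out[-1]
        # Bring the step into the range [-pi, pi] by removing whole turns.
        delta -= 2 * math.pi * round(delta / (2 * math.pi))
        out.append(out[-1] + delta)
    return out

def channel_frequency(wrapped_phases, f_i, sample_rate):
    """Phase difference -> frequency deviation, added to the channel
    frequency f_i (cf. blocks 559 and 552/560)."""
    unwrapped = unwrap(wrapped_phases)
    freqs = []
    for prev, cur in zip(unwrapped, unwrapped[1:]):
        deviation_hz = (cur - prev) * sample_rate / (2 * math.pi)
        freqs.append(f_i + deviation_hz)
    return freqs

# Channel centered at 1000 Hz containing a 1300 Hz tone: after mixing down,
# the phase advances by 0.3 turns per sample at a rate of 1000 samples/s.
wrapped = [(2 * math.pi * 0.3 * n) % (2 * math.pi) for n in range(20)]
freqs = channel_frequency(wrapped, f_i=1000.0, sample_rate=1000.0)
print(freqs[:3])
```

The wrapped phase jumps back whenever it passes a full turn, whereas the unwrapped phase increases linearly, so the phase differences give a constant deviation of 300 Hz from the channel frequency f_i.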

Thus, as illustrated in Figs. 5a and 5b, the phase vocoder separates spectral information from temporal information. The spectral information lies in the particular channel, i.e. in the frequency f_i that provides the constant frequency component for each channel, while the temporal information is contained in the frequency deviation or in the amplitude over time, respectively.

Fig. 5c shows a manipulation that can be performed in the vocoder at the point indicated by broken lines in Fig. 5a.

For time scaling, for example, the amplitude signal A(t) in each channel or the frequency signal f(t) in each channel can be decimated or interpolated, respectively. For the purpose of transposition, as used in the present invention, an interpolation is performed, i.e. a temporal extension or spreading of the signals A(t) and f(t), to obtain the spread signals A'(t) and f'(t), where the interpolation is controlled by a spread factor. Because the frequency deviation is interpolated, i.e. the value before the constant frequency is added in the adder 552, the frequency of each individual oscillator 502 in Fig. 5a does not change. The temporal evolution of the overall audio signal, however, is slowed down, namely by a factor of 2. The result is a tone that is spread in time but has the original pitch, that is, the original fundamental with its harmonics.

For transposition, the following approach can be used. By performing the signal processing illustrated in Fig. 5c in each filter bank channel of Fig. 5a, and then decimating the resulting time signal, the audio signal is shortened back to its original duration while all frequencies are doubled at the same time. This transposes the pitch by a factor of 2, while the resulting audio signal has the same length as the original audio signal, i.e. the same number of samples.

Implementation of the signal processor using a vocoder - transform

As an alternative to the filter bank implementation illustrated in Fig. 5a, a transform implementation of the phase vocoder may be used, as shown in Fig. 6. Here, the audio signal 132, in the form of a sequence of time samples, is fed to an FFT (Fast Fourier Transform) processor or, more generally, to a short-time Fourier transform processor 600. The FFT processor 600, shown schematically in Fig. 6, is configured to window the audio signal in time and then to compute the magnitude and phase spectra by means of an FFT, this computation being performed for successive spectra that are associated with strongly overlapping blocks of the audio signal.

In the extreme case, a new spectrum can be computed for every new sample of the audio signal; a new spectrum can also be computed, for example, only for every twentieth new sample. This distance a, in samples, between two spectra is preferably set by the controller 602. The controller 602 is further configured to feed an inverse FFT processor 604, which is configured to operate in overlap mode. In particular, the inverse FFT processor 604 is implemented such that it performs an inverse short-time Fourier transform by performing one inverse FFT per spectrum, on the basis of the magnitude and phase of a modified spectrum, and then carrying out an overlap-add operation, from which the resulting time signal is obtained. The overlap-add operation eliminates the effects of the analysis window.

The signal is spread in time by making the distance b between two spectra, as they are processed by the inverse FFT processor 604, larger than the distance a between the spectra when they were generated by the FFT. The basic idea is to spread the audio signal simply by spacing the inverse FFTs further apart than the analysis FFTs. As a result, temporal changes in the synthesized audio signal occur more slowly than in the original audio signal.

Without the phase rescaling in block 606, however, this would lead to artifacts. Consider, for example, a single frequency component for which successive phase values increase by 45°. This implies that the signal in this filter bank channel advances in phase at a rate of 1/8 of a cycle, i.e. by 45°, per time interval, where the time interval is the time between successive FFTs. If the inverse FFTs are now spaced further apart, the phase increase of 45° takes place over a longer time interval. This means that, because of the phase shift, a mismatch occurs in the subsequent overlap-add process, which leads to an undesirable output signal. To eliminate this artifact, the phase is rescaled by exactly the same factor by which the audio signal was spread in time. The phase of each FFT spectral value is thus increased by the factor b/a, so that the mismatch is eliminated.
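The 45° example can be followed numerically with a short sketch. It assumes that the phases of one frequency component are available in unwrapped form (in degrees), one value per analysis frame; the function name synthesis_phases and the hop sizes are hypothetical.

```python
def synthesis_phases(analysis_phases_deg, a, b):
    """Rescale the unwrapped phase of one frequency component by b/a,
    where a is the analysis hop and b the synthesis hop (in samples)."""
    return [p * b / a for p in analysis_phases_deg]

# One component advancing by 45 degrees per analysis hop of a = 256 samples.
analysis = [0.0, 45.0, 90.0, 135.0]
synthesis = synthesis_phases(analysis, a=256, b=512)
print(synthesis)

# A sinusoid that advances 45 degrees in 256 samples advances
# 45 * 512 / 256 = 90 degrees in 512 samples, so the rescaled frames
# line up again in the overlap-add stage.
steps = [q - p for p, q in zip(synthesis, synthesis[1:])]
print(steps)
```

With b = 2a the phase advance per synthesis frame becomes 90°, which is exactly what a sinusoid of unchanged frequency accumulates over the longer synthesis interval.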

While in the embodiment illustrated in Fig.5C the stretching was achieved by interpolating the amplitude/frequency control signals for one signal oscillator of the filter bank implementation shown in Fig.5A, the stretching in Fig.6 is achieved by making the distance between two spectra at the inverse FFT greater than the distance between two spectra at the FFT, i.e. b is greater than a. To prevent artifacts, however, a phase rescaling by the factor b/a is performed.

Detailed descriptions of phase vocoders can be found in the following publications:

"The Phase Vocoder: A Tutorial", Mark Dolson, Computer Music Journal, vol. 10, no. 4, pp. 14-27, 1986; "New phase-vocoder techniques for pitch-shifting, harmonizing and other exotic effects", J. Laroche and M. Dolson, Proceedings of the 1999 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, New Paltz, New York, October 17-20, 1999, pages 91 to 94; "New approach to transient processing in the phase vocoder", A. Röbel, Proceedings of the 6th International Conference on Digital Audio Effects (DAFx-03), London, UK, September 8-11, 2003, pages DAFx-1 to DAFx-6; "Phase-locked Vocoder", Miller Puckette, Proceedings 1995, IEEE ASSP Conference on Applications of Signal Processing to Audio and Acoustics; or US Patent Application Number 6,549,884.

In the following, the functionality of a transform-based phase vocoder, as depicted in Fig.7, is briefly described. Fig.7 shows a schematic view of the operations of the phase vocoder algorithm with a synthesis hop size that differs from the analysis hop size, for example by a factor of 2.

The algorithm used is a phase vocoder (PV), which changes the duration of a signal without changing its pitch [B9]. In this algorithm, the signal is divided into so-called grains, which are cut out of the signal to be processed by a window function with a length typically on the order of ten milliseconds. The grains are rearranged in an overlap-add (OLA) process with a synthesis hop size that differs from the analysis hop size. To stretch the signal by a factor of, for example, 2, the synthesis hop size is twice the analysis hop size. Fig.7 illustrates the algorithm.
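The rearrangement of grains with different analysis and synthesis hop sizes can be sketched as follows. This is only a plain overlap-add illustration under stated assumptions: it uses NumPy, the name `ola_stretch` and all parameters are hypothetical, and it omits the phase rescaling that a complete phase vocoder applies to each grain.

```python
import numpy as np

def ola_stretch(signal, grain_len, analysis_hop, synthesis_hop):
    """Cut windowed grains every analysis_hop samples and overlap-add them
    every synthesis_hop samples (no phase correction is applied)."""
    window = np.hanning(grain_len)
    n_grains = 1 + (len(signal) - grain_len) // analysis_hop
    out_len = (n_grains - 1) * synthesis_hop + grain_len
    out = np.zeros(out_len)
    norm = np.zeros(out_len)
    for k in range(n_grains):
        grain = signal[k * analysis_hop:k * analysis_hop + grain_len] * window
        start = k * synthesis_hop
        out[start:start + grain_len] += grain
        norm[start:start + grain_len] += window   # accumulated window weight
    norm[norm < 1e-8] = 1.0                       # avoid division by zero at the edges
    return out / norm

# 10 ms grains at 48 kHz, synthesis hop twice the analysis hop (stretch factor 2).
tone = np.ones(4800)
stretched = ola_stretch(tone, grain_len=480, analysis_hop=120, synthesis_hop=240)
```

Dividing by the accumulated window weight compensates for the amplitude modulation that overlapping windows would otherwise introduce; this normalization is a design choice of the sketch and is not stated in the description.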

Transient signal inserter

In the following, a preferred implementation of the transient signal inserter 150 shown in Fig.1 is described with reference to Fig.4.

The transient signal inserter 150 comprises, as a key component, a signal combiner 150A. The signal combiner 150A is configured to receive the processed audio signal 142 and the transient signal 152, and to provide the processed audio signal 120 on the basis of these signals. The signal combiner 150A may, for example, be configured to perform a hard replacement, using a switch, of a part of the processed audio signal 142 by a part of the transient signal 152. In a preferred embodiment, however, the signal combiner 150A may be configured to form a crossfade between the processed audio signal 142 and the transient signal 152, so that there is a smooth transition between these signals 142, 152 within the processed audio signal 120.

The transient signal inserter 150 may be configured to determine how the transient is best inserted. For example, the transient signal inserter 150 may comprise a calculator 150b for calculating the length of the transient portion to be inserted. Calculating this length may be important, for example, if the length of the replaced transient portion (as determined, for example, by the transient detector 130A) varies depending on the signal characteristics. To determine the length of the transient portion to be inserted when the processed audio signal 142 has a different length (or a different number of samples per second, or a different total number of samples) than the original input audio signal 110, the calculator 150b may take a stretching factor or compression factor into account. A detailed discussion of this change in length is given below with reference to Fig.10 and 11.

The transient signal inserter 150 may further comprise a calculator 150C for calculating the insertion position. In some cases, the calculation of the insertion position should take into account the stretching or compression of the processed audio signal 142. In some cases it is preferable that the relationship (for example, the temporal relationship) between the non-transient audio content and the transient audio content in the processed audio signal 120 be at least approximately identical to the temporal relationship between that non-transient audio content and that transient audio content in the original input audio signal 110. In addition to this pre-calculation of the insertion position, however, a fine adjustment of the insertion position can be performed. For example, the calculator 150C may be configured to read the processed audio signal 142 and the transient signal 152 and to determine the time of insertion on the basis of a comparison of the processed audio signal 142 and the transient signal 152.
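As a rough illustration of the coarse position calculation, the start of the transient can be mapped into the time-scaled signal by multiplying it by the stretch factor. This is an assumed reading of how the temporal relationship is preserved; the function name `insertion_position` is hypothetical, and the fine adjustment by signal comparison is not covered here.

```python
def insertion_position(original_start, stretch_factor):
    """Map a transient start (in samples) into the time-scaled signal.

    Assumption: the processed signal is a uniformly stretched or compressed
    version of the input, so positions scale linearly with the factor.
    """
    if stretch_factor <= 0:
        raise ValueError("stretch_factor must be positive")
    return int(round(original_start * stretch_factor))

coarse_position = insertion_position(original_start=12000, stretch_factor=2.0)
```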

Details regarding the calculation of the insertion position are described below with reference to the examples illustrated in Fig.10 and 11.

Possible timings

In the following, details of the possible timing relationships are described with reference to Fig.9. Fig.9 shows a graphical representation of the various processing stages applied to the original input audio signal 110. The first graphical representation 910 describes the temporal evolution of the original input audio signal 110, where the abscissa 912 represents time. The input audio signal 110 includes a transient signal portion 920, the length of which may be variable. The graphical representation 910 also shows the timing of the processing intervals, or processing blocks, 922a, 922b, 922c of the signal processor 140. As can be seen, the duration of the transient signal portion 920 may be shorter than the duration of the processing intervals 922a, 922b, 922c. In some cases, however, the duration of the transient signal portion may even be longer than the duration of a processing interval, or may extend over more than one processing block. In some cases, the processing intervals 922a, 922b, 922c may also overlap in time.

Graphical representation 930 shows the transient-reduced audio signal 132, which can be obtained by the transient replacement performed in the transient signal replacer 130. As can be seen, the transient signal portion 920 has been replaced by a replacement signal portion.

Graphical representation 950 describes the processed audio signal 142, which can be obtained, for example, by block-wise processing of the transient-reduced audio signal 132. The processing may, for example, be performed using a phase vocoder and a sampling rate reduction. In this processing the blocks may be windowed, and the blocks may also overlap.

A further graphical representation 970 shows the processed audio signal 120, into which the transient signal (or a modified version of it) has been re-inserted by the transient signal inserter 150.

It is important to note that the transient signal portion 920 would affect the entire block 1" if the transient signal portion 920 were included in the block-wise processing, because the energy of the transient would typically be spread over the whole block in such processing. Thus, if the transient signal portion were included in the block-wise processing, the total energy of the block would probably be distorted by the transient energy. Further, the transient would typically be smeared (i.e. spread out) if it were subjected to the block-wise processing. In contrast, separate processing of the transient makes it possible to limit the impact of the transient to the time interval 1" of the processed audio signal 120 that is associated with the transient. A spreading of the transient signal portion over a full processing block of the signal processor 140 can thus be avoided. The duration of the transient signal portion in the processed audio signal 120 may be determined by the transient processing performed by the transient processor 160. Alternatively, if desired, the transient signal portion 920 can be inserted into the processed audio signal 142 with its original duration. In this way, an unwanted spreading of the transient energy by the signal processor 140 is avoided.

Time stretching of audio signals

As can be seen from the above description, the proposed concept for manipulating an audio signal comprising a transient event can be applied in many different ways. For example, it can be applied in any audio processing in which transients would be degraded by the signal processing, but in which it is nevertheless desirable to preserve the transients. For example, many types of non-linear audio signal processing produce seriously degraded results in the presence of transients. Some types of time-domain filtering are likewise significantly affected by the presence of transients. Further, block-wise audio processing is typically degraded by the presence of transients, because the transient energy is smeared across the whole processing block, resulting in audible artifacts.

However, time stretching of audio signals is presumably the most important application of the concept of manipulating an audio signal comprising a transient event. Details of this concept are therefore described below.

Further, to facilitate understanding of the advantages of the invention, some shortcomings of conventional time stretching methods are described. Time stretching of audio signals by a phase vocoder smears the transient signal portions through dispersion, because the so-called vertical coherence of the signal (in the sense of defined phase relations between the components of different frequency bands) is reduced. Methods based on overlap-add (OLA) can generate disturbing pre-echoes and post-echoes of transient sound events. These problems can in principle be countered by adapting the amount of time stretching in the vicinity of transients. If a transposition is required, however, the transposition factor is then no longer constant in the vicinity of transients, i.e. the pitch of superimposed (possibly tonal) signal components changes and is perceived as a disturbance.

If transients are cut out and the resulting signal with gaps is stretched, a very long gap has to be filled afterwards. If transients follow each other closely, the large gaps may overlap.

The following describes a new method of signal conversion. The method presented here solves the aforementioned problems.

According to this method, a windowed section containing the transient is interpolated or extrapolated from the signal to be processed (for example, the original input audio signal 110). If the application is time-critical, i.e. if delays are to be avoided, extrapolation is preferably chosen. If the future is known by means of so-called look-ahead, and if delay does not play too important a role, interpolation is preferred.
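The difference between extrapolation and interpolation can be sketched with a simple linear-prediction model. The description does not specify the interpolation technique, so linear prediction fitted by least squares is an assumption made only for this example, and the names `lpc_extrapolate` and `fill_gap` are hypothetical. The forward extrapolation alone uses only past samples (no delay); the interpolation additionally uses look-ahead samples after the gap and cross-fades the two predictions.

```python
import numpy as np

def lpc_extrapolate(history, order, n_new):
    """Fit a linear predictor to `history` and continue it for n_new samples."""
    history = np.asarray(history, dtype=float)
    rows = np.array([history[i:i + order] for i in range(len(history) - order)])
    coeffs, *_ = np.linalg.lstsq(rows, history[order:], rcond=None)
    buf = list(history[-order:])
    for _ in range(n_new):
        buf.append(float(np.dot(coeffs, buf[-order:])))
    return np.array(buf[order:])

def fill_gap(signal, start, length, context=256, order=2):
    """Replace signal[start:start+length] by a cross-fade of a forward
    extrapolation (from the past) and a backward one (from the look-ahead)."""
    signal = np.asarray(signal, dtype=float)
    left = signal[start - context:start]
    right = signal[start + length:start + length + context]
    forward = lpc_extrapolate(left, order, length)
    backward = lpc_extrapolate(right[::-1], order, length)[::-1]
    fade = (np.arange(length) + 0.5) / length
    filled = signal.copy()
    filled[start:start + length] = (1.0 - fade) * forward + fade * backward
    return filled

fs = 8000
clean = np.sin(2 * np.pi * 440.0 * np.arange(2000) / fs)
damaged = clean.copy()
damaged[1000:1064] += 1.0            # crude stand-in for a transient region
repaired = fill_gap(damaged, start=1000, length=64)
```

For this idealized stationary tone the filled section continues the surrounding signal, so its energy matches that of the neighbouring non-transient portions. Real signals would need a higher predictor order and would only be matched approximately.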

In some embodiments, the method may essentially consist of the following steps, which are illustrated in Fig.10 and 11.

1. Detection of the transient;

2. Determination of the duration of the transient;

3. Storing of the transient;

4. Extrapolation and/or interpolation;

5. Application of the actual method, for example the phase vocoder;

6. Insertion of the stored transient; and

7. Optional resampling (to change the sampling frequency).

When this sequence is performed, the duration of the transient is shortened by the reduction of the sampling frequency. If this is not desired, the transient can be converted so that it lies in the desired frequency range before it is re-inserted after the resampling (steps 6 and 7 are swapped).

In the following, some details are described with reference to Fig.10. Fig.10 shows a graphical representation of various signals that may appear in an embodiment of the device 100 shown in Fig.1. The representation of Fig.10 is designated in its entirety by 1000. The signal representation 1010 describes the temporal evolution of the original input audio signal 110. As can be seen, the input audio signal 110 includes a transient signal portion 1012, the length (or duration) of which may be variable and can be determined in a signal-adaptive manner by the transient detector 130A. The transient signal portion 1012 may be removed and replaced by a replacement signal portion in the transient signal replacer 130. Accordingly, the transient-reduced audio signal 132 is obtained, which is shown in signal representation 1020. The replacement signal portion, which replaces the transient signal portion 1012, is designated by reference numeral 1022. The transient-reduced audio signal 132 may be processed block-wise, where the different processing windows (which determine the granularity of the block-wise processing, and are also referred to as "grains") are shown in signal representation 1030. For example, a set of spectral coefficients can be obtained for each block (or "grain") to form a time-frequency-domain representation of the transient-reduced audio signal 132. A phase vocoder may be applied to this time-frequency-domain representation, so that a signal of increased duration is obtained. For this purpose, interpolated time-frequency-domain coefficients can be derived. The time-frequency-domain coefficients can then be used to generate a time-domain signal which preserves the pitch and whose duration is extended compared to the original input audio signal. In other words, the number of signal periods is increased.
The signal resulting from the phase vocoder operation is shown in signal representation 1040. As can be seen in graphical representation 1040, the so-called "transient cut-out region", into which the replacement signal portion was inserted in place of the transient signal portion, is shifted in time relative to the temporal position of the transient signal portion in the original input audio signal 110 (when considered relative to the beginning of the input audio signal).

Subsequently, the transient signal portion that was previously replaced is re-inserted, for example by the transient signal inserter 150. For example, the transient signal portion, described by the transient signal 152, may be cross-faded into the processed version 142 of the transient-reduced audio signal. The result of the transient insertion is shown in graphical representation 1050.

In a subsequent sampling rate reduction (downsampling), the duration of the processed audio signal 120 can be reduced. The downsampling may, for example, be performed by the signal conditioner 170. The downsampling may, for example, comprise a change of the time scale. Alternatively, the number of samples may be reduced. As a consequence, the duration of the downsampled signal is reduced compared with the signal generated by the phase vocoder. At the same time, the downsampling preserves the number of periods compared with the signal generated by the phase vocoder. Accordingly, the pitch of the downsampled signal, which is shown in signal representation 1050, is increased compared with the signal provided by the phase vocoder (shown in signal representation 1040).
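The relationship between duration, number of periods and pitch can be checked numerically. The sketch below assumes NumPy and an idealized phase vocoder output (a pure 100 Hz tone stretched to twice its duration); all values are illustrative. It keeps every second sample without an anti-aliasing filter, which is acceptable only because the test tone lies far below the new Nyquist frequency.

```python
import numpy as np

fs = 8000                                   # sampling frequency in Hz (assumed)
f0 = 100.0                                  # pitch of the idealized vocoder output
stretch = 2                                 # time-stretch factor applied earlier

t = np.arange(stretch * fs) / fs            # 2 s of signal: 200 periods at 100 Hz
stretched = np.sin(2 * np.pi * f0 * t)

decimated = stretched[::stretch]            # keep every second sample
duration_s = len(decimated) / fs            # duration when played back at fs

# Locate the spectral peak of the downsampled signal.
spectrum = np.abs(np.fft.rfft(decimated))
peak_hz = float(np.argmax(spectrum) * fs / len(decimated))
```

The downsampled signal still contains the same 200 periods, but they now fit into half the time, so the spectral peak moves up by the stretch factor.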

Fig.11 shows another representation of signals appearing in another embodiment of the device 100 shown in Fig.1. The processing is similar to that shown in Fig.10, so only the differences are described here; identical signal representations and signal features are denoted by identical reference numerals in Fig.10 and 11.

In the signal processing shown in signal representation 1100, the downsampling is performed before the insertion of the transient signal. Thus, signal representation 1150 shows the downsampled signal without the inserted transient portion. However, the transient signal portion is shifted in frequency by a frequency shifting operation 1160, which can be executed by the transient processor 160. The frequency-shifted transient signal (shifted in frequency relative to the transient signal portion replaced by the transient signal replacer 130) can be re-inserted into the downsampled processed audio signal 142 by the transient signal inserter 150. The result of the transient insertion is shown in signal representation 1170.

Fine adjustment of the transient signal portion

In the following it is described how the transient signal 152 may be combined with the processed audio signal 142 by the transient signal inserter 150. For example, the transient signal inserter 150 may be configured to cut out of the processed audio signal 142 a transient region into which the transient signal 152 is to be inserted. It can be assumed here that the edge portions of the transient signal 152 overlap in time with the edge portions of the region that is cut out. A crossfade may be performed between the overlapping edge regions of the processed audio signal 142 and the transient signal 152. The transient signal 152 may also be shifted in time relative to the processed audio signal 142 so that the waveform in the edge regions of the cut-out portion is in good agreement with the waveform in the edge regions of the transient signal 152.

Fine adjustment can be performed by calculating the maximum of the cross-correlation between the edges of the resulting gap and the edges of the transient portion (where the gap may be produced by cutting the transient region out of the processed audio signal 142). In this way the subjective audio quality of the transient is no longer impaired by echo and dispersion effects.
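A minimal sketch of such a cross-correlation search is given below. It assumes NumPy; the function name `best_offset`, the search range `max_shift` and the test data are hypothetical. The edge of the gap is passed with `max_shift` extra samples on each side so that every candidate shift can be evaluated.

```python
import numpy as np

def best_offset(gap_edge, transient_edge, max_shift):
    """Return the shift (in samples) that maximizes the cross-correlation.

    gap_edge must contain len(transient_edge) + 2 * max_shift samples.
    """
    n = len(transient_edge)
    best, best_score = 0, -np.inf
    for shift in range(-max_shift, max_shift + 1):
        start = max_shift + shift
        score = float(np.dot(gap_edge[start:start + n], transient_edge))
        if score > best_score:
            best, best_score = shift, score
    return best

# Synthetic check: the same waveform is hidden 3 samples later in the gap edge.
n, max_shift = 32, 5
pattern = np.hanning(n) * np.sin(2 * np.pi * 3 * np.arange(n) / n)
gap_edge = np.zeros(n + 2 * max_shift)
gap_edge[max_shift + 3:max_shift + 3 + n] = pattern
shift = best_offset(gap_edge, pattern, max_shift)
```

The returned shift would then be added to the pre-calculated insertion position before the crossfade is performed.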

The position of the transient can be determined precisely, in order to select a suitable cut-out region, for example by a sliding calculation of the centre of gravity of the energy over a suitable time interval.
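The centre-of-gravity calculation can be sketched for a single analysis interval as follows. It assumes NumPy; the name `refine_transient_position` and the choice of a symmetric interval around a coarse estimate are assumptions. A sliding version would repeat the same calculation while moving the interval along the signal.

```python
import numpy as np

def refine_transient_position(signal, coarse_pos, half_width):
    """Energy centre of gravity (in samples) around a coarse transient estimate."""
    signal = np.asarray(signal, dtype=float)
    lo = max(0, coarse_pos - half_width)
    hi = min(len(signal), coarse_pos + half_width + 1)
    energy = signal[lo:hi] ** 2
    total = energy.sum()
    if total == 0.0:
        return float(coarse_pos)             # silent interval: keep the estimate
    return lo + float(np.dot(np.arange(hi - lo), energy) / total)

# Synthetic check: a single click at sample 520, coarse estimate at 500.
click = np.zeros(1000)
click[520] = 1.0
position = refine_transient_position(click, coarse_pos=500, half_width=50)
```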

Optimal fitting of the transient according to the maximum cross-correlation may require a small time offset relative to its original position. Because of temporal pre-masking and, in particular, post-masking effects, the position of the re-inserted transient does not have to match the original position exactly. Since post-masking acts over a longer period, a shift of the transient in the positive time direction is to be preferred. When the original signal portion is inserted, a change of the sampling frequency leads to a change in its timbre or pitch. In general, however, this is masked by the transient through psychoacoustic masking mechanisms.

Processing of the transient signal

If the transient is to be less tonal when it is inserted than when it was cut out, for example because it is simply added to the processed signal, the transient portion, windowed by a corresponding window function, has to be processed accordingly. In this context, inverse (LPC) filtering can be performed.

An alternative approach is briefly described in the following steps.

1. Computing a short-time Fourier transform (STFT) (e.g., of the transient signal portion described by the transient information 134) to obtain a spectrogram;

2. Computing the cepstrum (for example, of the spectrogram of the transient signal portion);

3. High-pass filtering the cepstrum (the first coefficients are set to 0) to obtain a high-pass-filtered spectrogram;

4. Dividing the spectrogram (e.g., of the transient signal portion) by the high-pass-filtered spectrogram to obtain a smoothed spectrogram; and

5. Inverse transformation (e.g., of the smoothed spectrogram) into the time domain (for example, to obtain the processed transient signal 152).

The resulting signal has (at least approximately) the same spectral envelope as the original signal, but has lost its tonal components.
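The cepstral approach can be sketched for a single frame as follows. The sketch assumes NumPy, processes one FFT frame instead of a full spectrogram, and interprets step 4 as dividing the spectrum by its cepstrally high-pass-filtered version; that interpretation, the name `detonalize_frame` and the parameter `n_zeroed` are assumptions. The high-pass-filtered cepstrum retains the fine (tonal) structure of the log-magnitude spectrum, so dividing by it leaves the spectral envelope.

```python
import numpy as np

def detonalize_frame(frame, n_zeroed):
    """Remove fine spectral structure from one frame, keeping its envelope."""
    spectrum = np.fft.fft(frame)
    log_mag = np.log(np.abs(spectrum) + 1e-12)
    cepstrum = np.fft.ifft(log_mag).real
    # High-pass in the cepstral domain: zero the first n_zeroed coefficients
    # and their mirror images, so that the filtered cepstrum stays symmetric.
    liftered = cepstrum.copy()
    liftered[:n_zeroed] = 0.0
    liftered[len(liftered) - n_zeroed + 1:] = 0.0
    fine_structure = np.exp(np.fft.fft(liftered).real)
    smoothed = spectrum / fine_structure      # envelope magnitude, original phase
    return np.fft.ifft(smoothed).real

rng = np.random.default_rng(0)
frame = rng.standard_normal(256)
whitened = detonalize_frame(frame, n_zeroed=0)            # nothing kept: flat magnitude
unchanged = detonalize_frame(frame, n_zeroed=len(frame))  # everything kept: no change
```

The two extreme settings bracket the behaviour: with no coefficients zeroed the whole magnitude spectrum is divided out, and with all coefficients zeroed the frame is returned unchanged. A practical setting would lie in between, so that only the slowly varying envelope remains.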

Method

An embodiment according to the invention includes a method of manipulating an audio signal comprising a transient event. Fig.12 shows a flowchart of such a method 1200.

The method 1200 includes a step 1210 of replacing a transient signal portion of the audio signal, comprising the transient event, with a replacement signal portion adapted to the signal energy characteristics of one or more non-transient signal portions, or to a signal energy characteristic of the transient signal portion, to obtain a transient-reduced audio signal.

The method 1200 further includes a step 1220 of processing the transient-reduced audio signal to obtain a processed version of the transient-reduced audio signal.

The method 1200 further includes a step 1230 of combining the processed version of the transient-reduced audio signal with a transient signal representing, in original or processed form, the transient content of the transient signal portion.

The method 1200 can be supplemented by any of the features or functionalities described above with respect to the proposed device.

In other words, although some aspects have been described in the context of a device, it is clear that these aspects also represent a description of the corresponding method, where a block or device corresponds to a method step or a feature of a method step. Analogously, aspects described in the context of a method step also represent a description of a corresponding block, item or feature of a corresponding device.

Computer program

Depending on the specific implementation requirements, embodiments of the invention may be implemented in hardware or in software. The implementation may be performed using a digital storage medium, for example a floppy disk, DVD, Blu-ray, CD, ROM, PROM, EPROM, EEPROM or flash memory, having electronically readable control signals stored on it, which cooperate (or are capable of cooperating) with a programmable computer system such that the respective method is performed. Therefore, the digital storage medium may be computer-readable.

Some embodiments according to the invention comprise a data carrier having electronically readable control signals which are capable of cooperating with a programmable computer system such that one of the methods described here is performed.

In general, embodiments of the invention can be implemented as a computer program with program code that is configured to perform one of the methods when the computer program runs on a computer. The program code may, for example, be stored on a machine-readable carrier.

Other embodiments comprise a computer program for performing one of the methods described here, stored on a machine-readable carrier.

In other words, an embodiment of the inventive method is therefore a computer program with program code for performing one of the methods described here when the computer program runs on a computer.

A further embodiment of the inventive method is therefore a data carrier (or digital storage medium, or computer-readable medium) having recorded on it a computer program for performing one of the methods described here.

A further embodiment of the inventive method is therefore a data stream or a sequence of signals representing a computer program for performing one of the methods described here. The data stream or sequence of signals may, for example, be configured to be transferred via a data communication connection, for example via the Internet.

A further embodiment comprises a processing means, for example a computer or a programmable logic device, configured to perform one of the methods described here.

A further embodiment includes a computer with a computer program to perform one of the methods described here.

In some embodiments, a programmable logic device (for example, a field-programmable gate array, FPGA) may be used to perform some or all of the functions of the methods described here. In some embodiments, a field-programmable gate array may cooperate with a microprocessor to perform one of the methods described here. In general, the methods are preferably performed by any hardware device.

Conclusions

To summarize the above, embodiments according to the invention provide a new way of handling sound events that should not, or cannot, be processed by an existing processing algorithm (for example, by the signal processor). In some embodiments, the method essentially consists of extrapolating or interpolating over the signal portion that contains the sound event to be processed separately. After the processing, the separately handled transient signal portions are added again. This approach is not limited to stretching in time or in frequency, but can generally be used in signal processing whenever the signal processing is harmful to the transient signal portion (or adversely affects transient signal portions).

The following describes some advantages of the new method that can be realized in some embodiments. The new method counteracts artifacts (such as dispersion, pre-echoes and post-echoes) that can arise when transients are processed by time stretching and transposition methods. A potential deterioration in the quality of superimposed (possibly tonal) signal portions is avoided.

Embodiments according to the invention can be applied in various fields. The method is, for example, suitable for any audio application in which the playback speed of audio signals, or their transposition, is to be changed.

In summary, means and a method for avoiding artifacts by separately processing sound events in an audio signal have been described.

Embodiment 2

A further embodiment of the invention is described with reference to Fig.13 to 16.

First, details of the transient detection are discussed. Subsequently, the transient handling is explained with reference to Fig.13 and 14. The results of the transient handling are discussed with reference to Fig.15. Additional improvements of the transient handling are explained with reference to Fig.16. In addition, the performance is evaluated and some conclusions are drawn.

Embodiment 2 - transient detection

In order to implement the proposed method for replacing transients and processing them separately, it is important to detect the presence of transients.

Besides time stretching, a wide range of signal processing methods requires knowledge of the transient content of the audio signal. Important examples are: block length selection (B. Edler, "Coding of audio signals with overlapping block transform and adaptive window functions (in German)", Frequenz, vol. 43, no. 9, pp. 252-256, Sept. 1989) or separate coding of transient and stationary signals (Oliver Niemeyer and Bernd Edler, "Detection and extraction of transients for audio coding," in AES 120th Convention, Paris, France, 2006) in transform audio codecs, modification of transient components (M. M. Goodwin and C. Avendano, "Frequency-domain algorithms for audio signal enhancement based on transient modification", Journal of the Audio Engineering Society, vol. 54, pp. 827-840, 2006), and segmentation of audio data (P. Brossier, J. P. Bello, and M. D. Plumbley, "Real-time temporal segmentation of note objects in music signals," in ICMC, Miami, USA, 2004). The methods for detecting transients are as numerous as their applications. Typically, detection is performed by computing a detection function (J. P. Bello, L. Daudet, S. Abdallah, C. Duxbury, M. Davies, and M. B. Sandler, "A tutorial on onset detection in music signals", IEEE Transactions on Speech and Audio Processing, vol. 13, no. 5, pp. 1035-1047, Sept. 2005), i.e. a function whose local maxima coincide with the occurrence of transients. The various proposed methods derive the detection function by examining the (weighted) magnitude or energy envelopes of subband signals or of the broadband signal, their derivatives, or their relative difference functions (see, for example, A. Klapuri, "Sound onset detection by applying psychoacoustic knowledge," in ICASSP, 1999, and P. Masri and A. Bateman, "Improved modelling of attack transients in music analysis-resynthesis," in ICMC, 1996).

Other methods compute the deviation between measured and predicted phase (see, for example, C. Duxbury, M. Davies, and M. Sandler, "Separation of transient information in musical audio using multiresolution analysis techniques", in DAFX, 2001), a combined analysis of the magnitudes and phases of subband signals (see, for example, C. Duxbury, M. Sandler, and M. Davies, "A hybrid approach to musical note onset detection," in DAFX, 2002), or the error of an adaptive linear prediction (see, for example, W.-C. Lee and C.-C. J. Kuo, "Musical onset detection based on adaptive linear prediction", in ICME, 2006). The presence of a transient and its localization in time are obtained by peak picking. The result is then either treated as a binary decision, or the continuous detection function is applied to control the behaviour of the modification module (see, for example, M. M. Goodwin and C. Avendano, "Frequency-domain algorithms for audio signal enhancement based on transient modification", Journal of the Audio Engineering Society, vol. 54, pp. 827-840, 2006).

With a binary decision, misclassifications caused by the loss of dimensionality at the detection stage can lead to severe degradation in some embodiments. For the present algorithm, a false negative (i.e. a missed transient) would be worse than a false positive (i.e. the detection of a non-existent transient). The former would lead to a smeared transient component, while the latter merely leads to an unnecessary interpolation, which is harmless if the interpolation is performed properly.

Here, weighted absolute values of the blocks transformed by a short-time Fourier transform are used to detect the transient regions. This detection function shows a marked increase during the attack of a transient and is also capable of indicating the decay of percussive signals and the associated reverberation. Peak picking on the smoothed detection function was carried out using an adaptive threshold based on a percentile calculation, as described, for example, in J. P. Bello, L. Daudet, S. Abdallah, C. Duxbury, M. Davies, and M. B. Sandler, "A tutorial on onset detection in music signals", IEEE Transactions on Speech and Audio Processing, vol. 13, no. 5, pp. 1035-1047, Sept. 2005.
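A detection function of this kind, followed by percentile-based peak picking, can be sketched as follows. The sketch assumes NumPy. The linear frequency weighting, the frame and hop sizes, the local percentile window and the threshold factor are all assumptions chosen for illustration, and the smoothing of the detection function is omitted; none of these values is taken from the description or from the cited tutorial.

```python
import numpy as np

def detection_function(signal, frame_len=256, hop=64):
    """Frequency-weighted sum of STFT magnitudes for each frame."""
    window = np.hanning(frame_len)
    weights = np.arange(frame_len // 2 + 1)       # assumed linear frequency weighting
    n_frames = 1 + (len(signal) - frame_len) // hop
    values = np.empty(n_frames)
    for k in range(n_frames):
        frame = signal[k * hop:k * hop + frame_len] * window
        values[k] = np.sum(weights * np.abs(np.fft.rfft(frame)))
    return values

def pick_transients(values, half_width=8, percentile=80.0, factor=2.0):
    """Local maxima that exceed `factor` times a local percentile."""
    peaks = []
    for i in range(1, len(values) - 1):
        lo, hi = max(0, i - half_width), min(len(values), i + half_width + 1)
        threshold = factor * np.percentile(values[lo:hi], percentile)
        is_local_max = values[i] > values[i - 1] and values[i] >= values[i + 1]
        if is_local_max and values[i] > threshold:
            peaks.append(i)
    return peaks

# Synthetic check: a quiet 200 Hz tone with one click at sample 4224.
fs = 8000
signal = 0.1 * np.sin(2 * np.pi * 200.0 * np.arange(fs) / fs)
signal[4224] += 1.0
frames = pick_transients(detection_function(signal))
```

In this synthetic case the only frame reported is the one centred on the click. Because a missed transient is more harmful than a spurious detection for the present algorithm, the threshold factor would in practice be chosen rather low.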

In summary, various transient detection methods are known in the art and can be used in the proposed device. For example, the transient detection method described above can be used in the transient detector 130A of the transient signal replacer 130.

Solution 2 - transient signal handling

The processing of transients will now be described with reference to Fig.13 and 14. Fig.13 shows a graphical representation of the transient removal and interpolation. Fig.14 shows a graphical representation of the time stretching and the re-insertion of the transient. Thus, the schematic diagrams of Fig.13 and 14 illustrate the sequence of processing steps of the presented algorithm.

The first row 1310 of Fig.13 shows the original signal (i.e., the audio signal 110) containing a transient event 1312. In response to (or during) the detection of this transient 1312, a transient region is defined (for example, by the transient detector 130A), extending, for example, from the start position 131 of the transient region to the end position 1316 of the transient region, and this region is subsequently subtracted from the signal. In other words, first, the transient is detected and windowed. Second, it is subtracted from the signal. The signal from which the transient has been subtracted is shown in [in20]. The transient itself is stored for later use. Up to this step, the algorithm is identical to the algorithm described in [B8], except that the window used here for cutting out the transient is rectangular (thick dashed line). For storing the transient, a guard interval of some milliseconds is prepended and appended, and the window is tapered (thin solid line) to define cross-fade regions for a smooth re-insertion of the stored transient into the time interval corresponding to the removed transient.

Subsequently, as the most important feature of the inventive algorithm according to the present solution, interpolation is used to fill the gap. In other words, the resulting gap is finally filled by interpolation. The interpolation can be seen in the bottom row of Fig.13, at reference numeral 1330. Since the signal is usually quasi-stationary after interpolation, it can now be stretched without creating distortion. The result of this stretching is illustrated in the first row of Fig.14, at reference numeral 1410. The transient region at its shifted position is identified and prepared for the re-insertion of the previously stored and windowed transient. To this end, the tapered window (which was applied for extracting and/or storing the transient, and which is shown as a thin solid line in the graphical representation at reference numeral 1310) is inverted and applied to the signal to prepare it for re-adding the transient. The result of this step is shown in the graph at reference numeral 1420. Finally, the stored transient is added to the stretched signal, as can be seen in the graphical representation at reference numeral 1430.
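One conceivable way of filling the gap in a short-time Fourier representation is sketched below, assuming that the amplitude of each frequency bin is interpolated linearly across the gap and that its phase is extrapolated from the phase advance observed in the two frames preceding the gap. The function name and its arguments are hypothetical, and the sketch is not meant to represent the exact interpolation used by the algorithm.

```python
import numpy as np

def fill_gap_frames(before_prev, before_last, after_first, n_gap):
    """Return n_gap complex spectra that bridge a removed transient.

    before_prev, before_last: the two spectra preceding the gap.
    after_first: the first spectrum following the gap.
    Amplitudes are interpolated linearly; phases are extrapolated
    with the per-bin phase advance of the preceding frames
    (both choices are assumptions of this sketch).
    """
    mag_a = np.abs(before_last)
    mag_b = np.abs(after_first)
    # Phase advance per frame, wrapped to (-pi, pi].
    dphi = np.angle(before_last * np.conj(before_prev))
    frames = []
    for k in range(1, n_gap + 1):
        t = k / (n_gap + 1)
        mag = (1.0 - t) * mag_a + t * mag_b
        phase = np.angle(before_last) + k * dphi
        frames.append(mag * np.exp(1j * phase))
    return np.array(frames)

# Usage with a single frequency bin.
prev = np.array([1.0 + 0.0j])
last = np.array([np.exp(0.3j)])
nxt = np.array([2.0 * np.exp(1.5j)])
gap = fill_gap_frames(prev, last, nxt, 3)
```

Because the phase is extrapolated from one side only, the filled frames generally do not meet the phase of the frame after the gap; a cross-fade with a backward extrapolation is one possible remedy.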

To summarize the above, the removal of the transient and the interpolation of the gap caused by its removal are shown in Fig.13. First, the transient is detected and windowed. Second, it is subtracted from the signal. Finally, the resulting gap is filled by interpolation. Fig.14 shows the time stretching and the re-insertion of the transient that follow the transient removal and interpolation. First, the quasi-stationary signal is stretched, for example using the vocoder described here. Subsequently, the position of the transient in the time-stretched signal is prepared by multiplying the signal by a window function that is the inverse of the one used for storing the transient (see Fig.14). Finally, the stored transient is re-added to the time-stretched signal.
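The storing and re-insertion steps summarized above can be sketched as follows. This is a minimal sketch assuming raised-cosine fades; the function names and the guard and fade lengths are hypothetical, and the gap filling and time stretching that take place between the two steps are omitted.

```python
import numpy as np

def tapered_window(n_transient, guard, fade):
    """Flat top over the transient plus guard intervals,
    with raised-cosine fades at both ends."""
    ramp = 0.5 - 0.5 * np.cos(np.pi * (np.arange(fade) + 0.5) / fade)
    flat = np.ones(n_transient + 2 * guard)
    return np.concatenate([ramp, flat, ramp[::-1]])

def store_transient(x, start, end, guard, fade):
    """Window the transient region [start, end) and return the stored
    segment, the window and the segment's start position."""
    w = tapered_window(end - start, guard, fade)
    seg_start = start - guard - fade
    segment = x[seg_start:seg_start + len(w)] * w
    return segment, w, seg_start

def reinsert_transient(y, segment, w, pos):
    """Attenuate y with the inverted window at pos, then add the
    stored transient."""
    out = y.copy()
    out[pos:pos + len(w)] = out[pos:pos + len(w)] * (1.0 - w) + segment
    return out

# Usage: without stretching, re-insertion restores the original signal.
rng = np.random.default_rng(0)
x = rng.standard_normal(1000)
segment, w, seg_start = store_transient(x, 400, 450, guard=16, fade=32)
restored = reinsert_transient(x, segment, w, seg_start)
```

Since the stored window and the inverted window sum to one at every sample, the cross-fade regions neither boost nor attenuate the signal that lies under them.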

Solution 2 - results of the transient processing

Some results of the proposed transient processing will now be discussed with reference to Fig.15. Fig.15 shows a graphical representation of the steps of the proposed transient processing applied to time stretching using the phase vocoder. The first row contains the unstretched plots, and the second row contains the stretched plots. Note that different time scales are used in the graphical representations of the first and second rows.

Fig.15 shows the results of the various algorithmic steps for a castanet sound mixed with the sound of a trumpet.

Fig.15a shows a plot of the original input signal with the transient regions marked. Fig.15b shows the transient regions intended to be cut out, which are interpolated (next step) to obtain the stationary signal shown in Fig.15c. Fig.15d contains the transient regions, including the smoothly faded guard intervals, while Fig.15e shows the interpolated (and, typically, time-stretched) signal, which is attenuated by the inverse cross-fade window at the positions where the transients were removed. Finally, Fig.15f shows the final result of the time-stretching algorithm.

Thus, Fig.15a represents the audio signal 110. Fig.15e represents the audio signal with a reduced transient 132. Fig.15d represents the transient signal 152. Fig.15f represents the processed audio signal 120.

Solution 2 - improvements to the transient signal processing

It was found that the choice of interpolation concept for the cut-out transient regions can be important in some cases. For example, interpolation across the transient region can be difficult if the signal before the transient differs significantly from the signal after the transient. In such cases, the evolution of the signal during the transient can hardly be predicted. Fig.16 illustrates such a situation in simplified form. The algorithm (for example, the algorithm that performs the interpolation to fill the gap) must choose one of the possible evolutions of the fundamental tone (for the interpolated signal that fills the gap). The same applies to more complex broadband signals. One possible way to overcome the problem is forward and backward prediction with a cross-fade between the two. Thus, such forward and backward prediction with a cross-fade between the two can be applied to compute the interpolated signal that fills the gap.

This problem and its solution according to the invention are illustrated in Fig.16. Fig.16 shows that the interpolation of the transient region (i.e., the interpolation of the gap caused by the removal of the transient) is difficult if the signal changes substantially during the transient. An infinite number of possible contours of the fundamental tone exist within the prediction section (i.e., the gap caused by the removal of the transient). Fig.16a shows a graphical representation of a signal containing a transient event in a time-frequency representation. The transient region, that is, the time interval identified as a transient interval, is designated by reference numeral 1610. Fig.16b shows a graphical representation of different possibilities for obtaining the portion of the input audio signal in the time interval where the transient was detected and removed. As can be seen, if a first fundamental tone is present before the time interval 1620, in which the transient is removed from the input audio signal, and a second fundamental tone is present after the time interval 1620, the evolution of the fundamental tone must be determined in order to fill the gap formed by the removal of the transient in the time interval 1620. For example, the fundamental tone present before the time interval 1620 can be extrapolated forward (in the direction of time) to obtain the fundamental tone during the time interval 1620 (see dotted line 1630). Alternatively, the fundamental tone present after the time interval 1620 can be extrapolated backward (against the direction of time) into the time interval 1620 (see dotted line 1632). As a further alternative, it is possible to interpolate, within the time interval 1620, between the fundamental tone present before the time interval 1620 and the fundamental tone present after the time interval 1620 (see dotted line 1634).
Naturally, various other schemes for obtaining the temporal evolution of the fundamental tone over the time interval 1620 (the gap caused by the removal of the transient) are possible.
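The three options discussed for Fig.16b, together with the cross-fade between a forward and a backward prediction, can be illustrated with a short Python sketch. The function names and the raised-cosine fade are assumptions made for illustration; the predictions themselves would come from whatever extrapolation method is used.

```python
import numpy as np

def pitch_contours(f_before, f_after, n):
    """Candidate fundamental-tone contours across a gap of n steps."""
    t = (np.arange(n) + 0.5) / n
    return {
        # Forward extrapolation of the preceding tone (cf. line 1630).
        "forward": np.full(n, float(f_before)),
        # Backward extrapolation of the following tone (cf. line 1632).
        "backward": np.full(n, float(f_after)),
        # Interpolation between both tones (cf. line 1634).
        "interpolated": (1.0 - t) * f_before + t * f_after,
    }

def crossfade_fill(forward_pred, backward_pred):
    """Cross-fade from a forward prediction into a backward prediction
    of equal length (raised-cosine fade assumed)."""
    n = len(forward_pred)
    fade = 0.5 + 0.5 * np.cos(np.pi * (np.arange(n) + 0.5) / n)
    return fade * forward_pred + (1.0 - fade) * backward_pred

# Usage: contours for a gap between a 440 Hz and a 660 Hz tone.
contours = pitch_contours(440.0, 660.0, 100)
mix = crossfade_fill(np.ones(100), np.zeros(100))
```

With the cross-fade, the filled gap follows the forward prediction near its start and the backward prediction near its end, so both gap boundaries remain continuous with the surrounding signal.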

The effect of the transient re-insertion on the final processed audio signal is shown in Fig.16c. As can be seen, the re-inserted transient portion (which represents, in original or processed form, the content of the transient portion of the signal) may be shorter in time than the processed (e.g., time-stretched) audio signal 142, which was processed without transients. Thus, the choice of concept for filling the gap caused by the removal of the transient in the audio signal 132 may actually have an audible effect on the processed audio signal 120 even after the transient has been re-inserted, for example, if the re-inserted transient portion (described by the transient signal 152) is shorter than the processed result of the gap filling in the processed audio signal 142. Reference is made to the time interval 140 preceding the re-inserted transient and the time interval 142 following the re-inserted transient.

To summarize the discussion of Fig.16, it was shown that the interpolation of the transient region requires some care if the signal changes significantly during the transient. An infinite number of possible contours of the fundamental tone exist within the interpolation interval. Fig.16a shows a signal containing a transient event. In Fig.16b, the dotted lines show various possibilities for interpolating the transient region. Fig.16c shows the stretched signal. Because the stretched interpolated region extends beyond the transient portion, the interpolated signal is audible and can lead to perceptual distortions.

Solution 2 - performance evaluation

To get an impression of the perceptual performance of the proposed method, an informal listening test was conducted. To evaluate the advantage of the new transient processing scheme while ensuring that stationary signals are not degraded, the selected audio signals included excerpts with both transient and stationary characteristics.

This informal test showed a significant advantage, for the above-mentioned combination of pipe and castanets, over the time-stretching algorithms used in existing software. The test also showed the advantage of time-stretching algorithms based on the phase vocoder (PV) over WSOLA once attention is paid to the transients.

Real-world signals stretched with the new method were also sometimes preferred over the results of other methods.

Conclusion

In summary, a new transient processing scheme has been described that can be used to advantage in time-stretching algorithms. Changing either the speed or the pitch of audio signals without affecting the other is often used in music production and creative reproduction, such as the creation of remixes. It is also used for other purposes, such as speed-up and bandwidth extension. While stationary signals can be stretched without loss of quality, transients are often poorly preserved after stretching with conventional algorithms. This invention demonstrates an approach for processing transients in time-stretching algorithms. Transient regions are replaced by stationary signals. The removed transients are stored and re-inserted into the time-stretched stationary audio signal after the time stretching.

The invention addresses the problem of stretching a combination of a tonal signal, such as a trumpet, and a percussive signal, such as castanets.

While some conventional methods approximately preserve the envelope of the signal in the time-stretched version, as well as its spectral characteristics, and thus expect a time-stretched percussive event to decay more slowly than the original, this invention takes the view that the aim in time scaling musical signals is to preserve the envelope of transient events. Therefore, some solutions according to the invention stretch only the sustained component, to achieve an effect that sounds like the same instrument being played at a different tempo (see, for example, [B3]). To achieve this, according to the invention, the transient and stationary components of the signal are treated separately.

The solution according to the invention builds on the concept described in publication [B8], which demonstrated how transients can be preserved in time and frequency stretching by the phase vocoder. In that approach, the transients are cut out of the signal before it is stretched. Removing the transient portions leaves gaps in the signal, which are stretched by the phase vocoder. After the stretching, the transients, with their surroundings adapted to the stretched gaps, are re-added to the signal. It was found that this solution offers benefits for many signals. It was also found, however, that cutting out the transients introduces new distortions, because the gaps introduce new non-stationary portions into the signal, particularly at the gap boundaries. Such non-stationarities can be seen, for example, in Fig.15b.

The inventive method described here has the advantage of the methods described, for example, in publications [OT], [B6], [B7], which allow the signal to be stretched in time without the need to change the stretch factor in the vicinity of the transient. The inventive method also shares some properties with the methods described, for example, in [B8] and [B5]. The inventive scheme splits the signal into a transient part and a quasi-stationary, non-transient part. Unlike the method described in [B8], the gaps that result from removing the transients are filled with a stationary signal. An interpolation method is used to determine the continuation of the signal within the gap from the signal surrounding the gap. The resulting quasi-stationary part is then suitable for processing by time-stretching algorithms. Because this signal now (i.e., after the interpolation or extrapolation) contains neither transients nor gaps, the distortions introduced by stretched transients and stretched gaps can be avoided. After the stretching, the transients replace parts of the interpolated signal. The technique rests on two prerequisites:

proper detection of the transients and perceptually correct interpolation of the stationary part. However, as described above, other filling methods can be used in addition to interpolation.

To summarize the above, the aim in some of the solutions described above was to stretch, without any perceptual distortion, a combination of a strictly tonal signal and a transient signal, such as pipe and castanets. It has been shown that this invention makes significant progress towards this goal. One important aspect of the present invention is the correct identification of the transient event, especially its exact onset and its decay, which is harder to determine, as well as the associated reverberation. Since the decay and reverberation of a transient event are superimposed on the stationary part of the signal, these parts require careful handling to avoid perceptual fluctuations after they are re-added to the stretched parts of the signal.

Some listeners tend to prefer the version in which the reverberation is stretched together with the sustained parts of the signal. This preference conflicts with the actual aim of treating the transient and its associated sounds as a single entity. Therefore, in some cases, a deeper understanding of the listeners' preferences is needed.

Nevertheless, the idea and the basic approach of this invention have proved their value and patentability. Moreover, the range of applications of the present invention is expected to extend even further. Owing to its structure, the inventive algorithm can easily be adapted to manipulate the transient part, for example by changing its level relative to the stationary parts of the signal.

A further possible application of the inventive method would be to arbitrarily attenuate or amplify transients for playback. This can be used to change the loudness of transient events, such as drums, or even to remove them completely, because the separation of the signal into a transient part and a stationary part is inherent in the algorithm.

The solutions described above merely illustrate the principles of the present invention. It is understood that modifications and variations of the arrangements and details described here will be apparent to a person skilled in the art. The intention is therefore to be limited only by the scope of the patent claims, and not by the specific details presented here by way of description and explanation of the solutions.

The list of references

[A1] J. L. Flanagan and R. M. Golden, "The Bell System Technical Journal, November 1966", pages 1394 to 1509;

[A2] United States Patent 6,549,884, Laroche, J. & Dolson, M.: "Phase-vocoder pitch-shifting";

[A3] Jean Laroche and Mark Dolson, "New Phase-Vocoder Techniques for Pitch-Shifting, Harmonizing and Other Exotic Effects", by Proc.

[A4] Zolzer, U: "DAFX: Digital Audio Effects", Wiley & Sons, Edition: 1 (26 February 2002), pages 201-298;

[A5] J. Laroche, M. Dolson: "Improved phase vocoder timescale modification of audio", IEEE Trans. Speech and Audio Processing, vol. 7, no. 3, pp.323-332;

[A6] Emmanuel Ravelli, Mark Sandler and Juan P. Bello: "Fast implementation for non-linear time-scaling of stereo audio", Proc. of the 8th Int. Conference on Digital Audio Effects (DAFx'05), Madrid, Spain, September 20-22, 2005;

[A7] Duxbury, C., M. Davies, and M. Sandler (2001, December): "Separation of transient information in musical audio using multiresolution analysis techniques". In: Proceedings of the COST G-6 Conference on Digital Audio Effects (DAFX-01), Limerick, Ireland;

[A8] A. Röbel: "A NEW APPROACH TO TRANSIENT PROCESSING IN THE PHASE VOCODER", Proc. of the 6th Int. Conference on Digital Audio Effects (DAFx-03), London, UK, September 8-11, 2003.

[B1] T. Karrer, E. Lee, and J. Borchers, "Phavorit: A phase vocoder for real-time interactive time-stretching", in Proceedings of the ICMC 2006 International Computer Music Conference, New Orleans, USA, November 2006, pp.708-715.

[B2] T. F. Quatieri, R. B. Dunn, R. J. McAulay and T. E. Hanna, "Time-scale modifications of complex acoustic signals in noise", Technical report, Massachusetts Institute of Technology, February 1994.

[B3] C. Duxbury, M. Davies, and M. B. Sandler, "Improved time-scaling of musical audio using phase locking at transients," in 112th AES Convention, Munich, 2002, the Audio Engineering Society.

[B4] S. Levine and Julius O. Smith III, "A sines+transients+noise audio representation for data compression and time/pitch scale modifications", 1998.

[B5] T. S. Verma and T. H. Y. Meng, "Time scale modification using a sines+transients+noise signal model", in DAFX98, Barcelona, Spain, 1998.

[B6] A. Röbel, "A new approach to transient processing in the phase vocoder," in 6th Digital Audio Effects (DAFx-03), London, 2003, pp.344-349.

[B7] A. Röbel, "Transient detection and preservation in the phase vocoder," in Int. Computer Music Conference (ICMC 03), Singapore, 2003, pp.247-250.

[B8] F. Nagel, S. Disch, and N. Rettelbach, "A phase vocoder driven bandwidth extension method with novel transient handling for audio codecs", in 126th AES Convention, Munich, 2009.

[B9] M. Dolson, "The phase vocoder: A tutorial", Computer Music Journal, vol. 10, no. 4, pp.14-27, 1986.

[10] B. Edler, "Coding of audio signals with overlapping block transform and adaptive window functions (in German)", vol. 43, no. 9, pp.252-256, Sept. 1989.

[11] Oliver Niemeyer and Bernd Edler, "Detection and extraction of transients for audio coding," in AES 120th Convention, Paris, France, 2006.

[12] M. M. Goodwin and C. Avendano, "Frequency-domain algorithms for audio signal enhancement based on transient modification", Journal of the Audio Engineering Society, vol. 54, pp.827-840, 2006.

[13] P. Brossier, J. P. Bello, and M. D. Plumbley, "Real-time temporal segmentation of note objects in music signals," in ICMC, Miami, USA, 2004.

[14] J. P. Bello, L. Daudet, S. Abdallah, C. Duxbury, M. Davies, and M. B. Sandler, "A tutorial on onset detection in music signals", IEEE Transactions on Speech and Audio Processing, vol. 13, no. 5, pp.1035-1047, Sept. 2005.

[15] A. Klapuri, "Sound onset detection by applying psychoacoustic knowledge," in ICASSP, 1999.

[16] P. Masri and A. Bateman, "Improved modelling of attack transients in music analysis-resynthesis," in ICMC, 1996.

[17] C. Duxbury, M. Davies, and M. Sandler, "Separation of transient information in musical audio using multiresolution analysis techniques", in DAFX, 2001.

[18] C. Duxbury, M. Sandler, and M. Davies, "A hybrid approach to musical note onset detection," in DAFX, 2002.

[19] W-C. Lee, and C-C. J. Kuo, "Musical onset detection based on adaptive linear prediction," in ICME, 2006.

[Edler] O. Niemeyer and B. Edler, "Detection and extraction of transients for audio coding", presented at the AES 120th Convention, Paris, France, 2006;

[Bello] J. P. Bello et al., "A Tutorial on Onset Detection in Music Signals", IEEE Transactions on Speech and Audio Processing, Vol.13, No. 5, September 2005;

[Goodwin] M. Goodwin, C. Avendano, "Enhancement of Audio Signals Using Transient Detection and Modification", presented at the AES 117th Convention, USA, October 2004;

[Walther] Walther et al., "Using Transient Suppression in Blind Multi-channel Upmix Algorithms", presented at the AES 122nd Convention, Austria, May 2007;

[Maher] R. C. Maher, "A Method for Extrapolation of Missing Digital Audio Data", JAES, Vol.42, No. 5, May 1994;

[Daudet] L. Daudet, "A review on techniques for the extraction of transients in musical signals", book series: Lecture Notes in Computer Science, Springer Berlin/Heidelberg, Volume 3902/2006, Book: Computer Music Modeling and Retrieval, pp.219-232.

1. A device (100) for controlling an audio signal (110) comprising a transient event, the device comprising: a transient signal replacement unit (130) configured to replace a transient portion of the audio signal, which comprises the transient event, with a replacement signal portion adapted to signal energy characteristics of one or more non-transient portions of the audio signal, or to a signal energy characteristic of the transient portion, to obtain an audio signal with a reduced transient (132); a signal processor (140) configured to process the audio signal with a reduced transient (132) to obtain a processed version (142) of the audio signal with a reduced transient; and a transient signal inserting unit (150) configured to combine the processed version (142) of the audio signal with a reduced transient (132) with a transient signal (152) representing, in original or processed form, the transient content of the transient portion of the signal; wherein the transient signal replacement unit (130) is configured to extrapolate amplitude values of one or more signal portions preceding the transient portion to obtain amplitude values of the replacement signal portion, and wherein the transient signal replacement unit (130) is configured to extrapolate phase values of one or more signal portions preceding the transient portion to obtain phase values of the replacement signal portion.

2. A device (100) for controlling an audio signal (110) comprising a transient event, the device comprising: a transient signal replacement unit (130) configured to replace a transient portion of the audio signal, which comprises the transient event, with a replacement signal portion adapted to signal energy characteristics of one or more non-transient portions of the audio signal, or to a signal energy characteristic of the transient portion, to obtain an audio signal with a reduced transient (132); a signal processor (140) configured to process the audio signal with a reduced transient (132) to obtain a processed version (142) of the audio signal with a reduced transient; and a transient signal inserting unit (150) configured to combine the processed version (142) of the audio signal with a reduced transient (132) with a transient signal (152) representing, in original or processed form, the transient content of the transient portion of the signal; wherein the transient signal replacement unit (130) is configured to interpolate between an amplitude value of a signal portion preceding the transient portion and an amplitude value of a signal portion following the transient portion to obtain one or more amplitude values of the replacement signal portion, and wherein the transient signal replacement unit (130) is configured to interpolate between a phase value of a signal portion preceding the transient portion and a phase value of a signal portion following the transient portion to obtain one or more phase values of the replacement signal portion.

3. A device (100) for controlling an audio signal (110) comprising a transient event, the device comprising: a transient signal replacement unit (130) configured to replace a transient portion of the audio signal, which comprises the transient event, with a replacement signal portion adapted to signal energy characteristics of one or more non-transient portions of the audio signal, or to a signal energy characteristic of the transient portion, to obtain an audio signal with a reduced transient (132); a signal processor (140) configured to process the audio signal with a reduced transient (132) to obtain a processed version (142) of the audio signal with a reduced transient; and a transient signal inserting unit (150) configured to combine the processed version (142) of the audio signal with a reduced transient (132) with a transient signal (152) representing, in original or processed form, the transient content of the transient portion of the signal; wherein the transient signal replacement unit (130) is configured to extrapolate, in the time-frequency domain, complex time-frequency-domain coefficients associated with a non-transient portion of the audio signal (110) preceding the transient portion to obtain time-frequency-domain coefficients of the replacement signal portion, or wherein the transient signal replacement unit (130) is configured to interpolate, in the time-frequency domain, between complex time-frequency-domain coefficients associated with a non-transient portion of the audio signal (110) preceding the transient portion and complex time-frequency-domain coefficients associated with a non-transient portion of the audio signal following the transient portion to obtain time-frequency-domain coefficients of the replacement signal portion.

4. The device (100) according to claim 3, wherein the transient signal replacement unit (130) comprises a transient detector (130A, 130C) configured to detect the transient portion of the audio signal (110) on the basis of an analysis of the audio signal (110), or on the basis of side information accompanying the audio signal, and to determine the length of the transient portion; wherein the transient signal replacement unit (130) is configured to take into account the length of the transient portion determined by the transient detector (130A, 130C); wherein the transient signal replacement unit (130) is configured to extrapolate, in the time-frequency domain, complex time-frequency-domain coefficients associated with a non-transient portion of the audio signal (110) preceding the transient portion to obtain time-frequency-domain coefficients of the replacement signal portion, or wherein the transient signal replacement unit (130) is configured to interpolate, in the time-frequency domain, between complex time-frequency-domain coefficients associated with a non-transient portion of the audio signal (110) preceding the transient portion and complex time-frequency-domain coefficients associated with a non-transient portion of the audio signal following the transient portion to obtain time-frequency-domain coefficients of the replacement signal portion; wherein the signal processor (140) is configured to process the audio signal with a reduced transient by time stretching or time compression, such that the processed signal (142) produced by the signal processor (140) has a duration greater than, or less than, the duration of the unprocessed signal (132) received by the signal processor; and wherein the device (100) is configured to adjust the time scaling or the sampling frequency of the signal obtained by the transient signal inserting unit (150) such that at least the non-transient signal components of the signal obtained by the transient signal inserting unit (150) are shifted in frequency compared with the audio signal (110) input to the transient signal replacement unit (130).

5. The device (100) according to claim 1, wherein the transient signal replacement unit (130) is configured to combine non-transient components of the transient portion of the signal with extrapolated or interpolated values to obtain the replacement signal portion.

6. The device (100) according to claim 1, wherein the transient signal replacement unit (130) is configured to obtain replacement signal portions of variable length depending on the length of the respective transient portion of the signal.

7. The device (100) according to claim 1, wherein the signal processor (140) is configured to process the audio signal with a reduced transient (132) such that a given temporal portion of the processed version (142) of the audio signal with a reduced transient depends on a plurality of temporally offset temporal portions of the audio signal with a reduced transient (132).

8. The device (100) according to claim 1, wherein the signal processor (140) is configured to process the audio signal with a reduced transient (132) in temporal blocks to obtain the processed version (142) of the audio signal with a reduced transient; and wherein the transient signal replacement unit (130) is configured to adjust the duration of the transient portion to be replaced by the replacement signal portion with a temporal resolution finer than the duration of a temporal block, or to replace a transient portion having a duration shorter than the duration of a temporal block with a replacement signal portion having a duration shorter than the duration of the temporal block.

9. The device (100) according to claim 1, wherein the signal processor (140) is configured to process the audio signal with a reduced transient (132) in a frequency-dependent manner, such that the processing introduces frequency-dependent phase shifts into the audio signal with a reduced transient (132).

10. The device (100) under item 1, where the module substitution transition signal (130) includes a sensor transition signal (130A), where the sensor transition signal (130A) is made with the ability to provide time-varying detection threshold to detect a transient in the audio signal (110) in such a way that the detection threshold follows the envelope of the audio signal with adjustable constant temporal smoothing, and where the sensor transition signal is configured to change the time constant of the smoothing in response to detection of the transition process and/or depending on the time development of the audio signal.

11. The device (100) according to claim 1, wherein the device (100) comprises a transient signal processor (160) configured to receive transient information (134) and to generate, on the basis of the transient information (134), a processed transient signal (152) in which tonal components are reduced, and wherein the transient signal insertion unit (150) is configured to combine the processed version (142) of the transient-reduced audio signal (132) with the processed transient signal (152) generated by the transient signal processor (160).

12. The device (100) according to claim 1, wherein the transient signal insertion unit (150) is configured to cross-fade the processed version (142) of the transient-reduced audio signal (132) with the transient signal (152), which represents, in original or processed form, the transient content of the transient signal portion.

13. A method (1200) for manipulating an audio signal comprising a transient event, the method comprising: replacing (1210) a transient signal portion, which comprises the transient event of the audio signal, with a replacement signal portion adapted to signal energy characteristics of one or more non-transient signal portions of the audio signal, or to a signal energy characteristic of the transient signal portion, to obtain a transient-reduced audio signal; processing (1220) the transient-reduced audio signal to obtain a processed version of the transient-reduced audio signal; and combining (1230) the processed version of the transient-reduced audio signal with a transient signal representing, in original or processed form, the transient content of the transient signal portion; wherein amplitude values of one or more signal portions preceding the transient signal portion are extrapolated to obtain amplitude values of the replacement signal portion, and wherein phase values of one or more signal portions preceding the transient signal portion are extrapolated to obtain phase values of the replacement signal portion; or wherein interpolation is performed between an amplitude value of a signal portion preceding the transient signal portion and an amplitude value of a signal portion following the transient signal portion to obtain one or more amplitude values of the replacement signal portion, and wherein interpolation is performed between a phase value of a signal portion preceding the transient signal portion and a phase value of a signal portion following the transient signal portion to obtain one or more phase values of the replacement signal portion; or wherein complex-valued time-frequency domain coefficients associated with a non-transient portion of the audio signal preceding the transient signal portion are extrapolated in the time-frequency domain to obtain complex-valued time-frequency domain coefficients of the replacement signal portion; or wherein interpolation is performed in the time-frequency domain between complex-valued time-frequency domain coefficients associated with a non-transient portion of the audio signal preceding the transient signal portion and complex-valued time-frequency domain coefficients associated with a non-transient portion of the audio signal following the transient signal portion to obtain time-frequency domain coefficients of the replacement signal portion.

14. A computer-readable storage medium storing a computer program for performing the method according to claim 13 when the computer program is executed on a computer.
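
The interpolation and extrapolation alternatives of claim 13 can be illustrated with a minimal Python sketch. It assumes the audio signal is already available as complex time-frequency coefficients (one row per frame, one column per frequency bin), for example from a short-time Fourier transform. The function names `interpolate_replacement` and `extrapolate_replacement`, the linear weighting, the shortest-arc phase handling and the held magnitude are illustrative assumptions and are not taken from the claims; in particular, the interpolation ignores the expected per-bin phase advance between frames.

```python
import numpy as np


def interpolate_replacement(prev_frame, next_frame, n_gap):
    """Build n_gap replacement frames between two non-transient frames.

    Magnitude and phase are interpolated separately (assumed: linearly).
    """
    prev_frame = np.asarray(prev_frame, dtype=complex)
    next_frame = np.asarray(next_frame, dtype=complex)
    # One weight per replaced frame, strictly between 0 and 1.
    w = (np.arange(1, n_gap + 1) / (n_gap + 1))[:, None]
    mag = (1.0 - w) * np.abs(prev_frame) + w * np.abs(next_frame)
    # Phase difference wrapped to (-pi, pi], so the shortest arc is used.
    dphi = np.angle(next_frame * np.conj(prev_frame))
    phase = np.angle(prev_frame) + w * dphi
    return mag * np.exp(1j * phase)


def extrapolate_replacement(frames_before, n_gap):
    """Build n_gap replacement frames from frames preceding the transient.

    Assumed scheme: hold the last magnitude and continue the phase with
    the phase step observed between the last two preceding frames.
    """
    frames_before = np.asarray(frames_before, dtype=complex)
    if frames_before.shape[0] < 2:
        raise ValueError("need at least two preceding frames")
    last, prev = frames_before[-1], frames_before[-2]
    step = np.angle(last * np.conj(prev))
    k = np.arange(1, n_gap + 1)[:, None]
    phase = np.angle(last) + k * step
    return np.abs(last) * np.exp(1j * phase)
```

Either function returns an array of shape `(n_gap, n_bins)` that could be written over the frames of the transient signal portion before further processing.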

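The time-varying detection threshold of claim 10 can be sketched as follows. This is a minimal illustration under stated assumptions, not the detector of the invention: the envelope is a one-pole smoothing of the rectified signal, the threshold is a fixed multiple of that envelope, and the smoothing time constant is shortened for a short hold period after each detection. The function name `detect_transients` and all parameters (`tau_slow`, `tau_fast`, `ratio`, `floor`) are hypothetical, and the direction in which the time constant changes is an assumption, since the claim only states that it changes.

```python
import numpy as np


def detect_transients(x, fs, tau_slow=0.05, tau_fast=0.005,
                      ratio=4.0, floor=1e-4):
    """Return sample indices whose magnitude exceeds an adaptive threshold."""
    mag = np.abs(np.asarray(x, dtype=float))
    if mag.size == 0:
        return []
    # Assumption: initialise the envelope from the first tau_slow seconds.
    warmup = max(1, int(tau_slow * fs))
    env = float(np.mean(mag[:warmup]))
    hold = max(1, int(tau_fast * fs))  # samples of fast tracking after a hit
    fast_left = 0
    hits = []
    for n, v in enumerate(mag):
        # The threshold follows the smoothed envelope of the signal.
        threshold = ratio * env + floor
        if v > threshold:
            hits.append(n)
            fast_left = hold
        # Change the smoothing time constant in response to a detection.
        tau = tau_fast if fast_left > 0 else tau_slow
        fast_left = max(0, fast_left - 1)
        alpha = np.exp(-1.0 / (tau * fs))
        env = alpha * env + (1.0 - alpha) * v
    return hits
```

For a steady tone with a single click added, the sketch is expected to report only the click position, because the threshold stays well above the tone's peaks.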

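The cross-fade of claim 12 can be pictured with a short Python sketch that blends a transient signal back into the processed transient-reduced signal. The linear fade shape, the function name `crossfade_insert` and the parameters `start` and `fade_len` are assumptions made for illustration; the claim does not prescribe a particular fade curve.

```python
import numpy as np


def crossfade_insert(processed, transient, start, fade_len):
    """Blend `transient` into `processed` at index `start`.

    The transient gain ramps up over `fade_len` samples, stays at one,
    and ramps down again; the processed signal gets the complementary gain.
    """
    out = np.array(processed, dtype=float)
    seg = np.asarray(transient, dtype=float)
    n = seg.size
    if start < 0 or start + n > out.size:
        raise ValueError("transient does not fit into the processed signal")
    if fade_len < 0 or 2 * fade_len > n:
        raise ValueError("fade_len must be between 0 and half the transient")
    gain = np.ones(n)
    if fade_len > 0:
        # Linear ramp without the end points 0 and 1.
        ramp = np.linspace(0.0, 1.0, fade_len + 2)[1:-1]
        gain[:fade_len] = ramp
        gain[-fade_len:] = ramp[::-1]
    # Complementary gains sum to one at every sample.
    out[start:start + n] = (1.0 - gain) * out[start:start + n] + gain * seg
    return out
```

Because the two gains always sum to one, inserting a segment that equals the underlying signal leaves that signal unchanged.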

 

Similar patents:

FIELD: physics, acoustics.

SUBSTANCE: group of inventions relates to means of analysing time variations of audio signals. Disclosed is an apparatus for obtaining a parameter describing variation of a signal characteristic of a signal based on actual transform-domain parameters describing an audio signal in transform-domain which includes a parameter determiner. The parameter determiner is configured to determine one or more model parameters of a transform-domain variation model describing evolution of the transform-domain parameters depending on one or more model parameters representing a signal characteristic, such that a model error, representing deviation between a modelled temporal evolution of the transform-domain parameters and evolution of the actual transform-domain parameters, is brought below a predetermined threshold value or minimised.

EFFECT: designing highly reliable means for obtaining a parameter describing time variation of a signal characteristic.

27 cl, 9 dwg

FIELD: physics, computer engineering.

SUBSTANCE: invention relates to computer engineering. A method of maintaining speech audibility in a multi-channel audio signal, comprising comparing a first characteristic and a second characteristic of the multi-channel audio signal to generate an attenuation factor, wherein the first characteristic corresponds to a first channel of the multi-channel audio signal that contains speech audio and non-speech audio, wherein the second characteristic corresponds to a second channel of the multi-channel audio signal that contains predominantly non-speech audio; adjusting a gain applied to a second power spectrum until the predicted speech intelligibility meets a criterion; and using the adjusted gain as the attenuation factor once the predicted speech intelligibility meets the criterion; adjusting the attenuation factor according to a speech likelihood value to generate an adjusted attenuation factor; and attenuating the second channel using the adjusted attenuation factor.

EFFECT: improved speech audibility in a multi-channel audio signal.

14 cl, 5 dwg

FIELD: radio engineering, communication.

SUBSTANCE: the method of picking up a speech signal in the presence of interference comprises converting an input mixture of an acoustic signal and interference into an electrical signal and filtering it with a band-pass filter to obtain a mixture of a speech signal and interference with a given bandwidth, which is amplified in a low-frequency amplifier; an analogue-to-digital converter (ADC) generates samples of the mixture of the signal and interference in digital form and transmits said samples to a computing device, which forms pairs of sums of amplitudes of the samples in a certain manner and calculates signal amplitudes for each moment in time using the obtained summation results by solving corresponding systems of linear equations.

EFFECT: high efficiency of picking up a speech signal in the presence of interference.

2 dwg, 1 tbl

FIELD: physics, acoustics.

SUBSTANCE: invention relates to HFR (High Frequency Reconstruction/Regeneration) of audio signals and is intended for performing HFR of audio signals having large variations in energy level across the low frequency range which is used to reconstruct the high frequencies of the audio signal. The system is configured to generate a plurality of high frequency subband signals covering a high frequency interval from a plurality of low frequency subband signals. The system comprises means of receiving a plurality of low frequency subband signals; means of receiving a set of target energies, each target energy covering a different target interval within the high frequency interval and being indicative of the required energy of one or more high frequency subband signals lying within the target interval; means of generating a plurality of high frequency subband signals from the plurality of low frequency subband signals and from a plurality of spectral gain coefficients associated with the plurality of low frequency subband signals, respectively; and means of adjusting the energy of the plurality of high frequency subband signals using the set of target energies.

EFFECT: preventing undesirable noise caused by discontinuities of the spectral envelope of the high frequency audio signal.

20 cl, 14 dwg

FIELD: physics, computer engineering.

SUBSTANCE: invention relates to means of generating a broadband signal using a low-bandwidth input signal. A processor performs controlled bandwidth expansion using a low-bandwidth input signal and a first set of parameters for generating first frequency content, which continues up to a first frequency, and performs blind bandwidth expansion using the first frequency content and a second set of parameters for generating second frequency content, which continues up to a second frequency which is higher than the first frequency. The first set of parameters and the input low-bandwidth signal are extracted from the bit stream. The processor comprises a parameter generator for generating the second set of parameters from the first frequency content, wherein the parameter generator is configured to obtain spectral envelope parameters for the second set of parameters for the second frequency content via extrapolation from lower to higher frequencies of information about the energy of the formed spectral envelope of the first frequency content.

EFFECT: wider bandwidth with low bit rate and maintaining high signal quality.

13 cl, 7 dwg

FIELD: physics, acoustics.

SUBSTANCE: invention relates to means of decoding and/or transcoding audio. A first and a second source set of spectral band replication (SBR) parameters are merged into a target set of SBR parameters. The first and second source set comprise a first and second frequency band partitioning, respectively, which are different from one another. The first source set comprises a first set of energy related values associated with frequency bands of the first frequency band partitioning. The second source set comprises a second set of energy related values associated with frequency bands of the second frequency band partitioning. The target set comprises a target set of energy related values associated with an elementary frequency band. The method comprises steps of breaking up the first and the second frequency band partitioning into a joint grid comprising the elementary frequency band; assigning a first value of the first set of energy related values to the elementary frequency band; assigning a second value of the second set of energy related values to the elementary frequency band; and combining the first and second value to yield the target energy related value for the elementary frequency band.

EFFECT: simplifying the process of reducing the number of channels while preserving the relevant high-frequency channel information.

32 cl, 9 dwg

FIELD: physics, acoustics.

SUBSTANCE: invention relates to means of generating an equalised multichannel audio signal. An audio encoder for obtaining an output signal using an input audio signal comprises a patch generator, a comparator and an output interface. The patch generator generates at least one bandwidth extension signal, having a high-frequency band. The high-frequency band of the bandwidth extension signal is based on a low frequency band of the input audio signal. A comparator calculates a plurality of comparison parameters. A comparison parameter is calculated based on a comparison of the input audio signal and a generated bandwidth extension signal. Each comparison parameter of the plurality of comparison parameters is calculated based on a different offset frequency between the input audio signal and a generated high bandwidth signal. Further, the comparator determines a comparison parameter from the plurality of comparison parameters, wherein the determined comparison parameter satisfies a predefined criterion.

EFFECT: improved signal encoding quality at high bit rate.

17 cl, 22 dwg

FIELD: radio engineering, communication.

SUBSTANCE: invention relates to means of filtering a multichannel audio signal, having a speech channel and at least one non-speech channel. The method includes determining at least one attenuation control value which serves as a feature of the extent of similarity between speech-related content which is defined by the speech channel and speech-related content which is defined by the non-speech channel; attenuating the non-speech channel in response to at least one attenuation control value; scaling the raw attenuation control signal (e.g. a gain control signal with suppression of a weak signal with a stronger signal) for the non-speech channel in response to at least one attenuation control value.

EFFECT: high speech intelligibility defined by a signal.

66 cl, 7 dwg

FIELD: physics, acoustics.

SUBSTANCE: invention relates to acoustic echo suppressing means. An acoustic echo suppressor includes an input interface (230) for extracting a downmix signal (310) from an input signal (300) which contains the downmix signal (310) and parametric side information (320), collectively representing a multichannel signal; a calculator (220) for calculating filter coefficients of an adaptive filter (240) based on the downmix signal (310) and a microphone signal (340) or a signal derived from the microphone signal; and an adaptive filter (240) which filters the microphone signal (340), or a signal derived from the microphone signal, using the calculated filter coefficients in order to suppress echo caused by the multichannel signal in the microphone signal (340).

EFFECT: reduced computational complexity and high efficiency of the acoustic echo suppression process.

15 cl, 10 dwg

FIELD: physics, acoustics.

SUBSTANCE: invention relates to means of encoding a source audio signal to form an equalised multichannel audio signal. A harmonic transposition method for high frequency reconstruction is employed. A signal of the analysed subband is generated from an input signal, where the signal of the analysed subband includes a series of complex-valued analysed discrete values, each having a phase and an amplitude. A signal of the synthesised subband is determined from the signal of the analysed subband using a subband transposition factor and a subband stretch factor. Block-based nonlinear processing is performed, where the amplitude of the discrete values of the signal of the synthesised subband is determined from the amplitude of corresponding discrete values of the signal of the analysed subband and a pre-determined discrete value of the signal of the analysed subband. A time-stretched and/or frequency-transposed signal is generated from the signal of the synthesised subband.

EFFECT: reduced computational complexity of the encoding process with high quality of the audio signal.

38 cl, 7 dwg

FIELD: radio engineering, communication.

SUBSTANCE: invention relates to communication engineering. An audio decoder for providing decoded audio information based on encoded audio information includes a window-based signal transformer configured to map a time-frequency representation, which is described by the encoded audio information, to a time-domain representation. The window-based signal transformer is configured to select, based on window information, one of a plurality of windows, which include windows with different transition slopes and windows with different transform lengths. The audio decoder includes a window selector configured to evaluate a variable-length codeword carrying the window information in order to select a window for processing the portion of the time-frequency representation associated with a given frame of the audio information.

EFFECT: eliminating artefacts arising when processing time-limited frames.

15 cl, 23 dwg

FIELD: physics, computer engineering.

SUBSTANCE: group of inventions relates to means of encoding and decoding a signal. The encoder comprises a first layer encoding section which encodes an input signal in a low-frequency range below a predetermined frequency. First encoded information is generated. The first encoded information is decoded to generate a decoded signal. The input signal is broken down in a high-frequency range above a predetermined frequency into a plurality of frequency subbands. A spectrum component is partially selected in each frequency subband. An amplitude adjustment parameter is calculated, which is used to adjust the amplitude of the selected spectrum component in order to generate second encoding information.

EFFECT: high efficiency of encoding spectral data of a high-frequency part and high quality of the decoded signal.

14 cl, 15 dwg

FIELD: physics, video.

SUBSTANCE: invention relates to a method and an apparatus for improving audio and video encoding. A signal is processed using DCTIV for each block of samples of said signal (x(k)), wherein integer transform is carried out using lifting steps which represent sub-steps of said DCTIV. Integer transform of said sample blocks using lifting steps and adaptive noise shaping is performed for at least some of said lifting steps, said transform providing corresponding blocks of transform coefficients and noise shaping being performed such that rounding noise from low-level magnitude transform coefficients in a current one of said transformed blocks is decreased whereas rounding noise from high-level magnitude transform coefficients in said current transformed block is increased, and wherein filter coefficients (h(k)) of a corresponding noise shaping filter are derived from said audio or video signal samples on a frame-by-frame basis.

EFFECT: optimising rounding error noise distribution in an integer-reversible transform (DCTIV).

26 cl, 13 dwg

FIELD: physics, acoustics.

SUBSTANCE: invention relates to means of generating an output spatial multichannel audio signal based on an input audio signal. The input audio signal is decomposed based on an input parameter to obtain a first signal component and a second signal component that are different from each other. The first signal component is rendered to obtain a first signal representation with a first semantic property and the second signal component is rendered to obtain a second signal representation with a second semantic property different from the first semantic property. The first and second signal representations are processed to obtain an output spatial multichannel audio signal.

EFFECT: low computational costs of the decoding/rendering process.

5 cl, 8 dwg

FIELD: physics, acoustics.

SUBSTANCE: invention relates to audio signal transmission and is intended for processing an audio signal by varying the phase of spectral values of the audio signal, realised in a bandwidth extension scheme. The audio signal processing method and device comprise a windowing module for generating a plurality of consecutive blocks of audio samples, the plurality of consecutive blocks including at least one padded block of audio samples, the padded block having padded values and audio signal values, a first converter for converting the padded block into a spectral representation having spectral values, a phase modifier for varying the phase of the spectral values to obtain a modified spectral representation, and a second converter for converting the modified spectral representation into a modified time-domain audio signal.

EFFECT: high sound quality.

20 cl, 15 dwg

FIELD: physics, acoustics.

SUBSTANCE: invention relates to audio encoding technologies. An audio encoder for encoding an audio signal has a first coding channel for encoding an audio signal using a first coding algorithm. The first coding channel has a first time/frequency converter for converting an input signal into a spectral domain. The audio encoder also has a second coding channel for encoding an audio signal using a second coding algorithm. The first coding algorithm differs from the second coding algorithm. The second coding channel has a domain converter for converting an input signal from an input domain into an output domain audio signal.

EFFECT: improved encoding/decoding of audio signals in low bitrate circuits.

21 cl, 43 dwg, 10 tbl

FIELD: physics, computation hardware.

SUBSTANCE: invention relates to audio signal processing. The proposed method comprises filtering an audio signal to divide it into two frequency bands and generating a plurality of subband signals for the signal of each frequency band. For the signal in one frequency band, the plurality of subband signals is generated by a transform from the time domain to the frequency domain. For the other frequency band, the plurality of subband signals is generated using a subband filter bank. The proposed device comprises at least one processor and at least one memory device containing computer program code, the memory device and the computer program code being configured to cause the processor to carry out the method.

EFFECT: higher accuracy of audio signals due to improved signal source SNR.

31 cl, 8 dwg

FIELD: physics, acoustics.

SUBSTANCE: audio encoder (100) for encoding audio samples includes a first, time-domain aliasing encoder (110) for encoding audio samples in a first encoding domain according to a first windowing rule, with a start window and a stop window. The audio encoder (100) further includes a second encoder (120) for encoding samples in a second encoding domain, which processes a number of audio samples set by the frame format, including a series of audio samples of an encoding mode stabilisation interval, and which applies a different, second encoding rule, wherein the frame of the second encoder (120) is an encoded representation of time-consecutive audio samples, the number of which is set by the frame format. The audio encoder (100) also includes a controller (130) which performs switching from the first encoder (110) to the second encoder (120) according to the characteristics of the audio samples and corrects the second windowing rule when switching from the first encoder (110) to the second encoder (120) or modifies the start window or stop window of the first encoder (110) while keeping the second windowing rule unchanged.

EFFECT: improved switching between multiple working regions when encoding sound in both the time and frequency domains.

34 cl, 28 dwg

FIELD: physics.

SUBSTANCE: input spectrum is broken into a plurality of subbands. A representative value is calculated for each subband using an arithmetic mean and a geometric mean. Nonlinear conversion is performed with respect to each representative value. The nonlinear conversion characteristic is amplified as the value increases. The representative value, which was subjected to nonlinear conversion for each subband, is smoothed in the frequency domain.

EFFECT: faster spectral smoothing and higher quality of the output audio signal.

11 cl, 15 dwg

FIELD: information technology.

SUBSTANCE: audio signal decoder configured to provide a decoded representation of an audio signal based on an encoded representation of the audio signal, which includes time warp contour evolution information, includes a time warp contour calculator, a time warp contour data rescaler and a warp decoder. The time warp contour calculator is configured to generate time warp contour data by repeatedly restarting from a predefined time warp contour start value, based on the time warp contour evolution information, which describes the temporal evolution of the time warp contour. The time warp contour data rescaler is configured to rescale at least part of the time warp contour data to avoid, reduce or eliminate a discontinuity at the restart in a rescaled version of the time warp contour. The warp decoder is configured to provide a decoded representation of the audio signal based on the encoded representation of the audio signal and using the rescaled version of the time warp contour.

EFFECT: supporting low bit rate with reliable reconstruction of the required time warp information at the decoder side.

14 cl, 40 dwg

FIELD: technologies for encoding audio signals.

SUBSTANCE: a method for generating a high-frequency reconstructed version of a low-band input signal by high-frequency spectral reconstruction using a digital filter bank system is based on splitting the low-band input signal with an analysis filter bank to obtain complex subband signals in channels, obtaining a series of consecutive complex subband signals in channels of the reconstruction range, adjusting the envelope to obtain a predetermined spectral envelope in the reconstruction range, and combining said series of signals by means of a synthesis filter bank.

EFFECT: higher efficiency.

4 cl, 5 dwg
