Generation of binaural signals

FIELD: radio engineering, communication.

SUBSTANCE: described is a device for generating a binaural signal based on a multi-channel signal representing a plurality of channels and intended for reproduction by a speaker system, wherein a virtual sound source position is associated with each channel. The device includes a correlation reducer for differently converting, and thereby reducing correlation between, at least one of a left and a right channel of the plurality of channels, a front and a rear channel of the plurality of channels, and a centre and a non-centre channel of the plurality of channels, in order to obtain an inter-similarity reduced combination of channels; a plurality of directional filters; a first mixer for mixing output signals of the directional filters modelling the acoustic transmission to the first ear canal of the listener; and a second mixer for mixing output signals of the directional filters modelling the acoustic transmission to the second ear canal of the listener. Also disclosed is an approach where the centre level is reduced to form a downmix signal, which is then passed to a room processor that models the acoustic space. Another approach involves generating a set of inter-similarity reduced transfer functions modelling the transmission of sound to the ear canals of the listener.

EFFECT: providing an algorithm for generating a binaural signal which provides stable and natural sound of a record in headphones.

33 cl, 14 dwg

 

The present invention relates to generating the room-reflection and/or reverberation-related contribution of a binaural signal, to generating the binaural signal itself, and to forming an inter-similarity reduced set of head-related transfer functions.

The human auditory system is able to determine the direction or directions from which perceived sounds arrive. To do so, it evaluates the differences between the sound received at the right ear and the sound received at the left ear. This information includes the so-called interaural cues, which describe the differences between the two ear signals. Interaural cues are the most important means of localization. The pressure level difference between the ears, namely the interaural level difference (ILD), is the most important single cue for localization. When sound arrives in the horizontal plane with a non-zero azimuth, it has a different level at each ear. The shadowed ear naturally receives an attenuated sound image compared with the unshadowed ear. Another very important cue for localization is the interaural time difference (ITD). The shadowed ear is farther from the sound source, so the wavefront reaches it later than the unshadowed ear. The ITD matters most at low frequencies, which are not attenuated much on reaching the shadowed ear compared with the unshadowed ear. The ITD is less important at higher frequencies, where the wavelength of the sound approaches the distance between the ears. In other words, localization exploits the fact that sound travelling from the source to the left and to the right ear interacts differently with the head, ears and shoulders of the listener.

Problems arise when a stereo signal intended for reproduction through loudspeakers is played back through headphones. The listener is then very likely to perceive the sound source as located inside the head, and to find the sound unnatural, awkward and irritating. This phenomenon is often referred to in the literature as "in-the-head" localization. Prolonged listening to "in-the-head" sound may cause listening fatigue. It occurs because the information on which the listener relies to position the sound sources, that is, the interaural cues, is missing or ambiguous.

To play back stereo signals, or multi-channel signals with more than two channels, through headphones, these interactions have to be modelled by filtering. In particular, the headphone output can be generated from a decoded multi-channel signal by passing each decoded signal through a pair of directional filters. Such filters usually model the acoustic transmission from a virtual sound source in a room to the ear canal of the listener, that is, they implement the so-called binaural room transfer function (BRTF). The BRTF applies time, level and spectral modifications and models room reflections and reverberation. Directional filters can be implemented in the time domain or in the frequency domain.

The number of filters required is large, namely N×2, where N is the number of decoded channels. The directional filters are also long, for example 20000 filter taps at 44.1 kHz, so the filtering is computationally demanding. As a consequence, the directional filters are sometimes reduced to a minimum. The so-called head-related transfer functions (HRTFs) contain the directional information, including the interaural cues. A common processing block is then used to model the room reflections and reverberation. This room processing module may be a reverberation algorithm working in the time or frequency domain, operating on a one- or two-channel input signal obtained from the multi-channel input signal by summing its channels. Such a structure is described, for example, in WO 99/14983 A1. As already mentioned, the room processing module models the room reflections and/or reverberation. Room reflections and reverberation are essential for localizing sounds, especially with regard to distance and externalization, meaning that the sounds are perceived outside the listener's head. The above publication also proposes implementing the directional filters as FIR (finite impulse response) filters that apply different delays to the different channels, thereby modelling the paths from the sound sources to the respective ear together with the corresponding secondary reflections. In addition, among several measures for achieving a more pleasant headphone listening experience, the publication proposes delaying a mix of the centre and front left channels, and a mix of the centre and front right channels, relative to the sum and the difference of the rear left and rear right channels, respectively.
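To give a feel for the computational load mentioned above, the following sketch estimates the cost of direct time-domain filtering. The channel count of 5 and the cost model of one multiply-accumulate (MAC) operation per filter tap per output sample are illustrative assumptions and are not taken from the cited publication; the function name is hypothetical.

```python
# Rough cost estimate for direct time-domain binaural filtering.
# Assumptions (illustrative only): 5 decoded channels and one
# multiply-accumulate (MAC) per filter tap per output sample.

def directional_filter_cost(num_channels, taps_per_filter, sample_rate_hz):
    """Return (number of filters, MACs per second) for N x 2 FIR filters."""
    num_filters = num_channels * 2  # one filter per channel and per ear
    macs_per_second = num_filters * taps_per_filter * sample_rate_hz
    return num_filters, macs_per_second

filters, macs = directional_filter_cost(5, 20000, 44100)
print(filters, macs)  # 10 filters, 8820000000 MACs per second
```

Under these assumptions, ten filters of 20000 taps each require about 8.8 billion MAC operations per second, which explains why shortening the directional filters is attractive.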

Nevertheless, the results achieved so far still suffer to a large extent from a reduced spatial width of the binaural output signal and a lack of externalization.

Moreover, it has become apparent that, despite the above measures for rendering multi-channel signals for headphones, dialogue in films and portions of music are often perceived as unnaturally reverberant and spectrally distorted.

Accordingly, the object of the invention is to provide a scheme for generating a binaural signal that yields stable and natural sound when a recording is played back through headphones.

This object is achieved by the devices according to any of claims 1, 3, 4 and 7, and by the methods according to any of claims 16 to 19.

The first idea underlying the present application is that a more stable and natural-sounding binaural signal for headphone reproduction can be obtained by processing differently, and thereby reducing the similarity between, at least one of the following pairs of input channels: left and right, front and rear, or centre and non-centre, so as to obtain an inter-similarity reduced set of channels. This set is then fed to a plurality of directional filters, whose outputs are passed to the respective mixers for the left and the right ear. Reducing the inter-similarity of the channels of the multi-channel input signal widens the spatial width of the binaural output signal and improves externalization.

A further idea underlying the present application is that a more stable and natural-sounding binaural signal for headphone reproduction can be achieved by modifying, in a spectrally varying manner, the phase and/or magnitude of at least two of the plurality of channels differently, thereby obtaining an inter-similarity reduced set of channels, which in turn can be fed to a plurality of directional filters followed by the respective mixers for the left and the right ear. Again, reducing the inter-similarity of the channels of the multi-channel input signal widens the spatial width of the binaural output signal and improves externalization.

The same gains can also be achieved by forming an inter-similarity reduced set of head-related transfer functions (HRTFs), either by delaying the impulse responses of an original set of HRTFs relative to each other, or by modifying, in the spectral domain, the phase and/or magnitude responses of the original set of HRTFs differently from each other. This formation can be carried out offline, as a design step, or online, during binaural signal generation, by using the HRTFs as directional filters, for example in response to an indication of the virtual sound source positions to be used.

Another idea underlying the present application is that some film and music content sounds more natural over headphones if the mono or stereo downmix of the channels of the multi-channel signal, which is to be processed by a room processor in order to generate the room-reflection/reverberation-related contribution of the binaural signal, is formed such that the channels contribute to the mono or stereo downmix at a level that differs for at least two channels of the multi-channel signal. In particular, the inventors found that film dialogue and music are typically mixed mainly into the centre channel of a multi-channel signal, and that the centre channel signal, when fed to the room processing module, often yields output that is perceived as unnaturally reverberant and spectrally distorted. The inventors also found that these shortcomings can be overcome by feeding the centre channel to the room processing module with a level reduction of, say, 3 to 12 dB, or, in particular, 6 dB.

Preferred embodiments are described in more detail below with reference to the figures, in which: figure 1 shows a block diagram of a device for generating a binaural signal according to the invention; figure 2 shows a block diagram of an embodiment of a device for forming an inter-similarity reduced set of head-related transfer functions according to the invention; figure 3 shows a block diagram of a device for generating the room-reflection and/or reverberation-related contribution of a binaural signal according to the invention; figures 4a and 4b show block diagrams of embodiments of the room processor shown in figure 3; figure 5 shows a block diagram of an embodiment of the downmix generator shown in figure 3; figure 6 schematically illustrates a representation of a multi-channel audio signal according to the invention; figure 7 shows a block diagram of a binaural output signal generator according to the invention; figure 8 shows a variant of the block diagram of the binaural output signal generator according to the invention; figure 9 shows another variant of the block diagram of the binaural output signal generator according to the invention; figure 10 shows a variant of the block diagram of the binaural output signal generator according to the invention; figure 11 shows a variant of the block diagram of the binaural output signal generator according to the invention; figure 12 shows a block diagram of the binaural spatial audio decoder shown in figure 11; and figure 13 shows a block diagram of a modified version of the spatial audio decoder shown in figure 11.

Figure 1 shows a device for generating a binaural signal, intended, for example, for headphone reproduction, based on a multi-channel signal that represents a plurality of channels and is intended for reproduction by a loudspeaker configuration in which a virtual sound source position is associated with each channel. The device, generally designated 10, comprises a similarity reducer 12, a bank 14 of directional filters 14a-14h, a first mixer 16a and a second mixer 16b.

The similarity reducer 12 converts the multi-channel signal 18, which represents a plurality of channels 18a-18d, into an inter-similarity reduced set 20 of channels 20a-20d. The number of channels 18a-18d represented by the multi-channel signal 18 may be two or more. Four channels 18a-18d are shown in figure 1 purely for illustration. The plurality of channels 18 may comprise, for example, a centre channel, a front left channel, a front right channel, a rear left channel and a rear right channel. The channels 18a-18d have been mixed, for example by a sound engineer, from a number of individual audio signals representing, for example, individual instruments, vocal parts or other individual sound sources, on the assumption that the channels 18a-18d are reproduced through loudspeakers (not shown in figure 1), each loudspeaker being placed at the predetermined virtual sound source position associated with the respective channel 18a-18d.

In the embodiment of figure 1, the channels 18a-18d include at least one pair of a left and a right channel, a front and a rear channel, or a centre and a non-centre channel. Of course, the plurality 18 of channels 18a-18d may include more than one of these pairs. The similarity reducer 12 processes the channels of the plurality differently, thereby reducing the similarity between them and yielding the inter-similarity reduced set 20 of channels 20a-20d. Thus, on the one hand, the similarity reducer 12 may reduce the similarity between at least one of a left and a right channel, a front and a rear channel, and a centre and a non-centre channel of the plurality 18, so as to obtain the inter-similarity reduced set 20 of channels 20a-20d. On the other hand, the similarity reducer 12 may, additionally or alternatively, modify the phase and/or magnitude of at least two of the plurality of channels differently in a spectrally varying manner, so as to obtain the inter-similarity reduced set 20 of channels.

As explained in more detail below, the similarity reducer 12 may achieve the different processing by, for example, delaying the channels of the respective pair relative to each other, or by delaying them by different amounts in, for example, each of a plurality of frequency bands, thereby reducing the inter-correlation within the set 20 of channels. Of course, there are other ways of reducing the correlation between the channels. In other words, the correlation reducer 12 may have a transfer function under which the spectral energy distribution of each channel stays the same, that is, a transfer function whose magnitude is one over the relevant audio spectrum range, while the phase response varies across the subbands or frequency components. For example, the correlation reducer 12 may modify the phase of all, or of one or several, of the channels 18 such that the signal of a first channel in a certain frequency band is delayed relative to another channel by at least one sample. Moreover, the correlation reducer 12 may be designed such that the phase modification causes the group delays of the first channel relative to another channel, taken over a plurality of frequency bands, to have a standard deviation of at least one eighth of a sample. The frequency bands considered may be Bark bands or subdivisions thereof, or any other partitioning of the frequency range.
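The effect of the delay-based processing described above can be illustrated with a minimal sketch. It is a broadband simplification: one channel of a fully correlated pair is delayed by a whole sample, whereas the text also allows different delays per frequency band. The test signal (white noise), the helper names and the example per-band delays are assumptions made for illustration, and the use of the population standard deviation is likewise an assumption, since the text does not specify the estimator.

```python
import math
import random
import statistics

def delay(signal, samples):
    """Delay by a whole number of samples, zero-padding and keeping the length."""
    return [0.0] * samples + list(signal[:len(signal) - samples])

def correlation(x, y):
    """Normalized cross-correlation at lag zero."""
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den if den else 0.0

random.seed(0)
left = [random.gauss(0.0, 1.0) for _ in range(4096)]  # white-noise test signal
right = list(left)                                    # fully correlated copy

before = correlation(left, right)            # close to 1
after = correlation(left, delay(right, 1))   # close to 0 for white noise

# Hypothetical relative group delays (in samples) in four frequency bands;
# the criterion asks for a standard deviation of at least 1/8 sample.
band_delays = [0.0, 0.25, 0.5, 0.25]
spread_ok = statistics.pstdev(band_delays) >= 1 / 8
```

A pure delay leaves the magnitude spectrum of the channel untouched, which matches the unity-magnitude transfer function mentioned in the text. For real programme material, which is not white, a one-sample broadband delay decorrelates far less than in this example, which is one reason for using band-dependent delays.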

Reducing the correlation is not the only way to prevent the human auditory system from localizing sound in the head. Rather, correlation is one of several measures by which the auditory system evaluates the similarity of the signals arriving at both ears, and thus the direction of the incoming sound. Accordingly, the similarity reducer 12 may also achieve the different processing by applying different level reductions to the channels of the respective pair, for example a different reduction in each of a plurality of frequency bands, thereby obtaining a spectrally shaped inter-similarity reduced set 20 of channels. The spectral shaping may, for example, exaggerate the additional attenuation that rear-channel sound undergoes relative to front-channel sound because of shadowing by the pinna. Accordingly, the similarity reducer 12 may subject the rear channel or channels to a spectrally varying level reduction relative to the other channels. For this spectral shaping, the similarity reducer 12 may keep the phase response constant while modifying the magnitude response differently across the subbands or frequency components of the relevant audio spectrum range.

In principle, the way in which the multi-channel signal 18 represents the plurality of channels 18a-18d is not restricted. In particular, the multi-channel signal 18 may represent the channels 18a-18d in compressed form using spatial audio coding. In spatial audio coding, the channels 18a-18d may be represented by a downmix signal, into which the channels 18a-18d are mixed with individual downmix ratios to form one or more downmix channels, together with a set of spatial parameters describing the spatial image of the multi-channel signal by means of, for example, level/intensity differences, phase differences, time differences and/or measures of correlation/coherence between the channels 18a-18d. The output of the correlation reducer 12 is divided into the individual channels 20a-20d. These channels may be output as time signals or as spectrograms, for example in a subband decomposition.

The directional filters 14a-14h model the acoustic transmission of a respective one of the channels 20a-20d from the virtual sound source position associated with that channel to an ear canal of the listener. In figure 1, directional filters 14a-14d model the acoustic transmission to, for example, the left ear canal, and directional filters 14e-14h model the acoustic transmission to the right ear canal. The directional filters model the acoustic transmission from a virtual sound source position in a room to an ear canal of the listener by applying time, level and spectral modifications, and optionally by modelling room reflections and reverberation. The directional filters may be implemented in the time or the frequency domain. That is, they may be time-domain filters such as FIR filters, or they may operate in the frequency domain by multiplying the spectral values of the channels 20a-20d by corresponding samples of the magnitude and phase transfer characteristics. In particular, the directional filters 14a-14h may model head-related transfer functions that describe how the head, ears and shoulders of a person affect the signals of the channels 20a-20d arriving from the associated virtual sound source positions. The first mixer 16a mixes the outputs of directional filters 14a-14d, which model the acoustic transmission to the left ear canal of the listener, to obtain a signal 22a that forms part or all of the left channel of the binaural output signal. The second mixer 16b mixes the outputs of directional filters 14e-14h, which model the acoustic transmission to the right ear canal of the listener, to obtain a signal 22b that forms part or all of the right channel of the binaural output signal.
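The signal flow through the directional filters and the two mixers can be sketched as follows for the time-domain (FIR) case. This is a minimal illustration and not the implementation of the device: the function names, the two-channel input and the very short filter coefficients are hypothetical and chosen only to keep the example readable.

```python
def fir(signal, taps):
    """Filter a signal with FIR taps (full linear convolution)."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(taps):
            out[n + k] += x * h
    return out

def mix(signals):
    """Sum signals of possibly different lengths sample by sample."""
    length = max(len(s) for s in signals)
    return [sum(s[i] for s in signals if i < len(s)) for i in range(length)]

def render_binaural(channels, left_filters, right_filters):
    """Filter every channel once per ear, then mix the results per ear."""
    left = mix([fir(c, h) for c, h in zip(channels, left_filters)])     # first mixer
    right = mix([fir(c, h) for c, h in zip(channels, right_filters)])   # second mixer
    return left, right

# Two hypothetical input channels and toy filter coefficients.
channels = [[1.0, 0.0], [0.0, 1.0]]
left_filters = [[1.0], [0.5]]
right_filters = [[0.5], [0.0, 1.0]]

left, right = render_binaural(channels, left_filters, right_filters)
# left  == [1.0, 0.5]
# right == [0.5, 0.0, 1.0]
```

Each channel passes through exactly two filters, one per ear, which gives the N×2 filter count; realistic filters would be far longer and would typically be applied with fast convolution in the frequency domain.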

As described in more detail below with respect to further embodiments, contributions that account for room reflections and/or reverberation may be added to signals 22a and 22b. This allows the directional filters 14a-14h to be simplified.

The similarity reducer 12 in the device of figure 1 counteracts the negative side effects of summing correlated input signals in mixers 16a and 16b, namely a considerably reduced spatial width and a lack of externalization of the binaural output signal 22a and 22b. The decorrelation performed by the similarity reducer 12 reduces these side effects.

Before turning to the next embodiment of the invention, the description of figure 1 may be summarized as follows: figure 1 shows the signal flow for generating a headphone output from a decoded multi-channel signal. Each signal is filtered by a pair of directional filters. Channel 18a, for example, is filtered by two of the directional filters 14a-14h, one for each ear. Unfortunately, the channels 18a-18d of typical multi-channel productions exhibit a considerable degree of similarity, such as correlation. This has a negative effect on the binaural output signal. In particular, after the multi-channel signals have been processed by the directional filters 14a-14h, the intermediate signals at their outputs are summed in mixers 16a and 16b to form the headphone output signals 22a and 22b. Summing similar/correlated signals leads to a severely reduced spatial width of the output signals 22a and 22b and to a lack of externalization. In practice this is particularly problematic for similarity/correlation between the left and right signals and the centre channel. The similarity reducer 12 should therefore reduce the similarity between these signals as far as possible.

Note that most of the measures by which the similarity reducer 12 reduces the inter-similarity of the channels 18a-18d of the plurality 18 can also be achieved without the similarity reducer 12, by extending the function of the directional filters so that they not only model the sound propagation but also introduce the dissimilarity, for example the decorrelation, discussed above. In that case, the directional filters do not model the head-related transfer functions (HRTFs) themselves but modified versions of them.

For example, figure 2 shows a device for forming an inter-similarity reduced set of head-related transfer functions for modelling the acoustic transmission of a set of channels from the virtual sound source positions associated with the individual channels to the ear canals of a listener. The device, generally designated 30, comprises an HRTF provider 32 and an HRTF processor 34.

The HRTF provider 32 provides an original set of HRTFs. To do so, the HRTF provider 32 may perform measurements on a standard dummy head in order to determine the head-related transfer functions from certain sound source positions to the ear canals of the dummy listener. Alternatively, the HRTF provider 32 may simply look up or load the original HRTFs from a memory. As a further alternative, the HRTF provider 32 may compute the HRTFs according to a predetermined formula, depending, for example, on the virtual sound source positions of interest. Thus, the HRTF provider 32 may operate in a design environment for designing a binaural output signal generator, or may be part of such a generator, providing the original HRTFs online, for example in response to a selection or change of the virtual sound source positions. In particular, the device 30 may be part of a binaural output signal generator able to handle multi-channel signals intended for different loudspeaker configurations, which differ in the virtual sound source positions associated with their channels. In that case, the HRTF provider 32 may provide the original HRTFs so that they match the virtual sound source positions currently in use.

The HRTF processor 34, in turn, delays the impulse responses of at least one pair of HRTFs relative to each other, or modifies their phase and/or magnitude responses in the spectral domain differently from each other. Such a pair of HRTFs may model the acoustic transmission of one of the following channel pairs: left and right, front and rear, or centre and non-centre. This may be achieved by applying one or a combination of the following techniques to the HRTFs of one or several channels of the multi-channel signal: delaying the HRTF of the respective channel; modifying the phase response of the respective HRTF and/or applying a decorrelating filter, for example an all-pass filter, to the respective HRTF, thereby obtaining an inter-correlation reduced set of HRTFs; and/or modifying the magnitude response of the respective HRTF in a spectrally varying manner, thereby obtaining a set of HRTFs that is at least inter-similarity reduced. In either case, the resulting decorrelation/dissimilarity between the respective channels can support the auditory system in localizing the sound source externally and thus prevent in-the-head localization. The HRTF processor 34 may, for example, modify the phase response of all, or of one or several, of the HRTFs such that a first HRTF is delayed in a certain frequency band, in terms of its group delay, relative to another HRTF by at least one sample. Further, the HRTF processor 34 may modify the phase response such that the group delays of the first HRTF relative to the other HRTF, taken over a plurality of frequency bands, have a standard deviation of at least one eighth of a sample. The frequency bands considered may be Bark bands or subdivisions thereof, or any other partitioning of the frequency range.
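The simplest of the techniques listed above, delaying the impulse responses of HRTFs relative to each other, can be sketched as follows. The data layout (a dictionary from a position name to a pair of left-ear and right-ear impulse responses), the position names, the toy coefficients and the chosen delay are all hypothetical. Delaying both ears of one position by the same amount, so that the interaural time difference of that source is preserved, is an assumption of this sketch and is not prescribed by the text.

```python
def delay_impulse_response(impulse_response, samples):
    """Delay an impulse response by prepending zero-valued taps."""
    return [0.0] * samples + list(impulse_response)

def reduce_inter_similarity(hrir_set, delays):
    """Delay the impulse responses of selected positions relative to the others.

    hrir_set: dict mapping a position name to (left_ear_ir, right_ear_ir).
    delays:   dict mapping a position name to a delay in samples.
    """
    result = {}
    for position, (left_ir, right_ir) in hrir_set.items():
        d = delays.get(position, 0)  # positions without an entry stay unchanged
        result[position] = (delay_impulse_response(left_ir, d),
                            delay_impulse_response(right_ir, d))
    return result

# Toy original set: two positions with identical impulse responses.
original = {
    "front_left": ([1.0, 0.5], [0.6, 0.3]),
    "rear_left": ([1.0, 0.5], [0.6, 0.3]),
}

modified = reduce_inter_similarity(original, {"rear_left": 2})
# modified["rear_left"] == ([0.0, 0.0, 1.0, 0.5], [0.0, 0.0, 0.6, 0.3])
```

The modified set can then be used directly as the coefficients of the directional filters, so that no separate decorrelation stage is needed in front of them.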

The inter-similarity reduced set of HRTFs output by the HRTF processor 34 may be used to set the HRTFs of the directional filters 14a-14h of the device of figure 1, which may or may not include the similarity reducer 12. Because the modified HRTFs are dissimilar, the above advantages regarding the spatial width of the binaural output signal and externalization are achieved even without the similarity reducer 12.

As described above, the device of figure 1 may be supplemented by a further path that generates the room-reflection and/or reverberation-related contribution of the binaural output signal from a downmix of at least some of the input channels 18a-18d. This relaxes the requirements on the directional filters 14a-14h. Figure 3 shows a device for generating this room-reflection and/or reverberation-related contribution of a binaural output signal. The device 40 comprises a downmix generator 42 and a room processor 44 connected in series. The device 40 may be connected between the input of the device of figure 1 that receives the multi-channel signal 18 and the output that delivers the binaural output signal, with the left channel contribution 46a of the room processor 44 added to output 22a and the right channel contribution 46b of the room processor 44 added to output 22b. The downmix generator 42 forms a mono or stereo downmix 48 from the multi-channel signal 18, and the room processor 44 generates the left channel 46a and the right channel 46b of the room-reflection and/or reverberation-related contribution of the binaural signal by modelling room reflections and reverberation based on the mono or stereo signal 48.

The idea underlying the room processor 44 is that the reflections/reverberation occurring in, for example, a room can be modelled in a way that sounds natural to the listener on the basis of a downmix, such as a simple sum of the channels of the multi-channel signal 18. Because reflections/reverberation reach the ears later than sound travelling from the source along the direct path or line of sight, the impulse response of the room processor represents and replaces the tail of the impulse responses of the directional filters shown in figure 1. The impulse responses of the directional filters can in turn be shortened, restricted to modelling the direct path and the reflections and attenuation at the head, ears and shoulders of the listener. Of course, the boundary between what is modelled by the directional filters and what is modelled by the room processor is somewhat arbitrary, and the directional filters may, for example, also model the first room reflections/reverberation.

Figures 4a and 4b show possible implementations of the room processor. In figure 4a, the room processor 44 is fed with a mono downmix signal 48 and comprises two reverberation filters 50a and 50b. Like the directional filters, the reverberation filters 50a and 50b may operate in the time or the frequency domain. The inputs of both filters receive the mono downmix signal 48. The output of reverberation filter 50a provides the left channel contribution 46a, while the output of reverberation filter 50b provides the right channel contribution 46b. Figure 4b shows an example of the room processor 44 for a stereo downmix signal 48. In this case, the room processor comprises four reverberation filters 50a-50d. The inputs of reverberation filters 50a and 50b are connected to the first channel 48a of the stereo downmix 48, and the inputs of reverberation filters 50c and 50d are connected to the second channel 48b of the stereo downmix 48. The outputs of reverberation filters 50a and 50c are connected to the inputs of an adder 52a, whose output provides the left channel contribution 46a. The outputs of reverberation filters 50b and 50d are connected to the inputs of a second adder 52b, whose output provides the right channel contribution 46b.
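The stereo variant of the room processor described above, with four reverberation filters feeding two adders, can be sketched as follows. The reverberation filters are represented here by short FIR impulse responses, which is an assumption made for brevity (the text leaves the filter type and domain open); the function names and coefficient values are hypothetical.

```python
def fir(signal, taps):
    """Filter a signal with FIR taps (full linear convolution)."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(taps):
            out[n + k] += x * h
    return out

def add(a, b):
    """Add two signals, treating the shorter one as zero-padded."""
    length = max(len(a), len(b))
    return [(a[i] if i < len(a) else 0.0) + (b[i] if i < len(b) else 0.0)
            for i in range(length)]

def room_processor_stereo(downmix_1, downmix_2, rev_a, rev_b, rev_c, rev_d):
    """Four reverberation filters and two adders for a stereo downmix.

    rev_a and rev_b filter the first downmix channel, rev_c and rev_d the
    second; the first adder sums the outputs of rev_a and rev_c (left
    contribution), the second those of rev_b and rev_d (right contribution).
    """
    left = add(fir(downmix_1, rev_a), fir(downmix_2, rev_c))
    right = add(fir(downmix_1, rev_b), fir(downmix_2, rev_d))
    return left, right

# Toy stereo downmix and toy impulse responses.
left_out, right_out = room_processor_stereo(
    [1.0], [2.0],
    rev_a=[0.5, 0.25], rev_b=[0.25], rev_c=[0.0, 1.0], rev_d=[0.5, 0.125],
)
# left_out  == [0.5, 2.25]
# right_out == [1.25, 0.25]
```

The two outputs would then be added to the left and right outputs of the directional filter path. The mono variant is the special case with a single input channel and only two filters.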

Although it was stated above that the downmix generator 42 may simply sum the channels of the multi-channel signal 18, this is not the case in the embodiment of figure 3. Rather, the downmix generator 42 of figure 3 forms the mono or stereo downmix 48 such that the channels contribute to the mono or stereo downmix at a level that differs for at least two channels of the multi-channel signal 18. In this way, certain content types, such as speech or background music that is mixed into a specific channel or specific channels of the multi-channel signal, can be prevented from, or encouraged to be, subjected to the room processing, so as to avoid unnatural sound.

For example, the downmix generator 42 of figure 3 may form the mono or stereo downmix 48 such that the centre channel of the multi-channel signal 18 contributes to the mono or stereo downmix 48 at a level that is reduced relative to the other channels of the multi-channel signal 18. The amount of level reduction may, for example, be between 3 dB and 12 dB. The level reduction may be spread evenly over the operating frequency range of the multi-channel signal 18, or it may be frequency dependent, for example concentrated on a specific part of the spectrum such as the part occupied by voice signals. The amount of level reduction relative to the other channels may be the same for all other channels; that is, the other channels may be mixed into the downmix signal 48 at the same level. Alternatively, the other channels may be mixed into the downmix signal 48 at different levels. In that case the amount of level reduction may be measured against the mean value of the other channels or against the mean value of all channels, including the reduced one. The standard deviation of the mixing weights of the other channels, or of the mixing weights of all channels, may then be smaller than 66% of the level reduction of the mixing weight of the level-reduced channel relative to the just-mentioned mean value.
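As a minimal sketch of such a level-reduced downmix, the following assumes a frequency-independent reduction, unit weights for all non-centre channels and a mono downmix. The function names, the channel labels and the default of 6 dB are illustrative assumptions within the 3 dB to 12 dB range mentioned above.

```python
def db_to_gain(level_db):
    """Convert a level in dB to a linear amplitude factor."""
    return 10.0 ** (level_db / 20.0)


def downmix_with_centre_reduction(channels, centre_name="C", reduction_db=6.0):
    """Hypothetical mono downmix: weighted sum with a reduced centre weight.

    channels: dict mapping a channel name to a list of samples (equal lengths).
    All non-centre channels get weight 1.0 (an assumption, not a requirement).
    """
    weights = {name: 1.0 for name in channels}
    weights[centre_name] = db_to_gain(-reduction_db)
    length = len(next(iter(channels.values())))
    mix = [
        sum(weights[name] * channels[name][i] for name in channels)
        for i in range(length)
    ]
    return mix, weights


# Usage: centre reduced by 6 dB, i.e. weighted by roughly 0.5
mix, weights = downmix_with_centre_reduction(
    {"L": [0.2, 0.1], "R": [0.1, 0.3], "C": [0.8, 0.6]}
)
```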

The effect of reducing the level of the centre channel is that the binaural output signal obtained by means of the contributions 46a and 46b is, at least under certain conditions discussed in more detail below, perceived by listeners as more natural than without the level reduction. In other words, the downmix generator 42 forms a weighted sum of the channels of the multi-channel signal 18 in which the weight associated with the centre channel is reduced relative to the weights of the other channels.

The level reduction of the centre channel is particularly advantageous during voice portions of film dialogue or music. The improvement in the listening impression during such voice portions more than compensates for the minor penalties that the level reduction causes in non-voice phases. According to an alternative embodiment, however, the level reduction is not constant. Rather, the downmix generator 42 may be configured to switch between a mode in which the level reduction is switched off and a mode in which it is switched on. In other words, the downmix generator 42 may vary the amount of level reduction over time. The variation may be binary or continuous, between zero and a maximum value. The downmix generator 42 may be configured to perform the mode switching or the variation of the amount of level reduction depending on information contained in the multi-channel signal 18. For example, the downmix generator 42 may be configured to detect voice phases, or to distinguish voice phases from non-voice phases, or it may assign a measure of voice content, for example on an ordinal scale, to consecutive frames of the centre channel. For example, the downmix generator 42 detects the presence of voice in the centre channel by means of a voice-frequency filter and determines whether the output level of this filter exceeds a threshold value. However, detection of voice phases in the centre channel by the downmix generator 42 is not the only way to make the above-mentioned mode switching or variation of the amount of level reduction time dependent. For example, the multi-channel signal 18 may carry side information specifically intended for distinguishing between voice and non-voice phases, or for measuring the voice content quantitatively. In this case the downmix generator 42 operates in response to this side information.
In another variant, the downmix generator 42 performs the above-mentioned mode switching, or adjusts the amount of level reduction, by comparing, for example, the current levels of the centre channel, the left channel and the right channel. If the level of the centre channel exceeds that of the left and right channels, either individually or in sum, by more than a certain threshold ratio, the downmix generator 42 may assume that a voice phase is present and act accordingly, i.e. apply the level reduction. Similarly, the downmix generator 42 may use the level differences between the centre, left and right channels to realise the above-mentioned dependences.
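The level comparison just described could look roughly like the sketch below. It assumes RMS as the level measure, the "in sum" variant of the comparison and a binary on/off reduction; the threshold ratio of 2.0, the 6 dB maximum and all function names are hypothetical choices for illustration.

```python
import math


def rms(frame):
    """Root-mean-square level of one frame of samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def centre_dominates(centre, left, right, threshold_ratio=2.0):
    """True if the centre level exceeds the summed left and right levels
    by more than `threshold_ratio` (the ratio value is an assumption)."""
    return rms(centre) > threshold_ratio * (rms(left) + rms(right))


def reduction_for_frame(centre, left, right, max_reduction_db=6.0):
    """Binary switching: full reduction in assumed voice phases, none otherwise."""
    if centre_dominates(centre, left, right):
        return max_reduction_db
    return 0.0
```

A continuous variant would map the measured level ratio to a reduction between zero and the maximum instead of switching between the two extremes.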

In addition, the downmix generator 42 may respond to spatial parameters that describe the spatial image of the multiple channels of the multi-channel signal 18. This is shown in figure 5. Figure 5 shows an example of the downmix generator 42 for the case where the multi-channel signal 18 represents the plurality of channels by means of spatial audio coding, i.e. by a downmix signal 62 into which the plurality of channels has been downmixed, and spatial parameters 64 describing the spatial image of the plurality of channels. Optionally, the multi-channel signal 18 may also contain downmixing information describing the ratios at which the individual channels have been mixed into the downmix signal 62, or into the individual channels of the downmix signal 62, since the downmix signal 62 may be, for example, a mono downmix signal 62 or a stereo downmix signal 62. The downmix generator 42 of figure 5 comprises a decoder 64 and a mixer 66. The decoder 64 decodes the multi-channel signal 18 in accordance with spatial audio decoding in order to obtain the plurality of channels, including, among others, the centre channel 66 and the other channels 68. The mixer 66 mixes the centre channel 66 and the other, non-centre channels 68 to derive the mono or stereo downmix 48, applying the level reduction described above. The dashed line 70 indicates that the mixer 66 may use the spatial parameters 64 to switch between the level-reduction modes or to vary the amount of level reduction, as discussed above.
The spatial parameters 64 used by the mixer 66 may be, for example, channel prediction coefficients describing how the centre channel 66, a left channel or a right channel can be derived from the downmix signal 62. The mixer 66 may additionally use inter-channel coherence/cross-correlation parameters representing the coherence or cross-correlation between the just-mentioned left and right channels, which in turn may be downmixes of the front left and rear left channels and of the front right and rear right channels, respectively. For example, the centre channel may be mixed at a fixed ratio into the left channel and the right channel of the stereo downmix signal. In this case two channel prediction coefficients are sufficient to determine how the centre, left and right channels can be derived from a respective linear combination of the two channels of the stereo downmix signal 62. For example, the mixer 66 may use the ratio between the sum and the difference of the channel prediction coefficients to distinguish between voice and non-voice phases.

Although the level reduction of the centre channel has been used to illustrate the weighted summation of the plurality of channels, in which the channels contribute to the mono or stereo downmix at levels that differ for at least two channels of the multi-channel signal 18, there are also cases in which other channels are advantageously reduced or amplified in level relative to another channel or other channels, because the sound-source content present in that channel or those channels should be subjected to room processing not at the same level as the other content of the multi-channel signal, but at a lower or higher level.

Figure 5 illustrates in rather general terms how the plurality of input channels can be represented by a downmix signal 62 and spatial parameters 64. Figure 6 extends this explanation. The description of figure 6 also helps in understanding the embodiments presented below with reference to figures 10-13. Figure 6 shows the downmix signal 62 spectrally decomposed into a plurality of subbands 82. In figure 6 the subbands 82 are, for clarity, shown as horizontal stripes arranged with frequency increasing from bottom to top, as indicated by the frequency axis arrow 84. The horizontal direction represents the time axis 86. For example, the downmix signal 62 consists of a sequence of spectral values 88 per subband 82. The time resolution at which the subbands 82 are sampled by the sample values 88 may be defined by filter bank slots 90. The time slots 90 and the subbands 82 thus define a time-frequency grid, i.e. a time-frequency resolution. A coarser time-frequency grid is formed by uniting neighbouring sample values 88 into time-frequency tiles 92, indicated by dashed outlines in figure 6; these tiles define the time-frequency parameter resolution, or parameter grid. The above-mentioned spatial parameters 64 are defined at this time-frequency parameter resolution 92. The time-frequency parameter resolution 92 may change over time. To this end, the signal 62 is divided into consecutive frames 94. For each frame, the time-frequency parameter grid 92 can be set individually. If the downmix signal 62 is received by the decoder 64 in the time domain, the decoder 64 contains an analysis filter bank that derives the representation of the downmix signal 62 shown in figure 6. If the downmix signal 62 enters the decoder 64 already in the form shown in figure 6, no analysis filter bank is needed within the decoder 64.
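The relation between the fine grid of sample values 88 and the coarser grid of tiles 92 can be illustrated with a small sketch. It assumes that the tiles of one frame are rectangular and given by lists of slot and subband boundaries; this boundary representation and the function names are hypothetical.

```python
def tile_index(position, boundaries):
    """Return the index i with boundaries[i] <= position < boundaries[i + 1]."""
    for i in range(len(boundaries) - 1):
        if boundaries[i] <= position < boundaries[i + 1]:
            return i
    raise ValueError("position lies outside the grid")


def expand_parameters(tile_params, slot_bounds, band_bounds):
    """Map one parameter per tile to one value per (slot, subband) sample.

    tile_params[t][b] is the parameter of the tile covering time slots
    slot_bounds[t]..slot_bounds[t + 1] - 1 and subbands
    band_bounds[b]..band_bounds[b + 1] - 1 (an assumed representation).
    """
    n_slots, n_bands = slot_bounds[-1], band_bounds[-1]
    return [
        [
            tile_params[tile_index(slot, slot_bounds)][tile_index(band, band_bounds)]
            for band in range(n_bands)
        ]
        for slot in range(n_slots)
    ]


# Usage: 4 slots x 3 subbands covered by 2 x 2 tiles
grid = expand_parameters([[1, 2], [3, 4]], [0, 2, 4], [0, 1, 3])
```

Here every sample inside a tile simply inherits the parameter of that tile; a different set of boundaries could be supplied for each frame.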
As already mentioned in the context of figure 5, two channel prediction coefficients may be present for each time-frequency tile 92, showing how the right and left channels are derived from the left and right channels of the stereo downmix signal 62. In addition, an inter-channel coherence/cross-correlation (ICC) parameter may be present per tile 92, indicating the ICC similarity between the left and right channels to be derived from the stereo downmix signal 62, one of which has been completely mixed into one channel of the stereo downmix signal 62 and the other completely mixed into the other channel of the stereo downmix signal 62. A channel level difference (CLD) parameter, indicating the level difference between the just-mentioned left and right channels, may further be present for each tile 92. The CLD parameters may be quantised non-uniformly on a logarithmic scale, with high accuracy near zero dB and coarser resolution as the level difference between the channels grows. In addition, the spatial parameters 64 may include further parameters. These may define, among other things, the CLD and ICC relating to the channels that were mixed to form the just-mentioned left and right channels, such as the rear left, front left, rear right and front right channels.

It should be noted that the embodiments described above may be combined with each other. Some combination possibilities have already been mentioned above. Further possibilities will be mentioned in the following description of the embodiments of figures 7-13. In addition, the embodiments of figures 1 and 5 assumed that the intermediate channels 20, 66 and 68, respectively, are actually present within the device. However, this is not necessary. For example, the modified HRTFs derived by the device of figure 2 may be used to define the directional filters of figure 1 while omitting the similarity reducer 12; in this case the device of figure 1 may operate on a downmix signal, such as the downmix signal 62 of figure 5, representing the plurality of channels 18a-18d, by suitably combining the spatial parameters and the modified HRTFs at the time-frequency parameter resolution 92 and applying the resulting linear combination coefficients to form the binaural signals 22a and 22b.

Similarly, the downmix generator 42 may suitably combine the spatial parameters 64 and the amount of level reduction of the centre channel in order to derive the mono or stereo downmix 48 passed to the room processor 44. Figure 7 shows a binaural output signal generator according to an embodiment of the invention. The generator, indicated generally by 100, comprises a multi-channel decoder 102, a binaural output 104, and two paths extending between the output of the multi-channel decoder 102 and the binaural output 104, namely a direct path 106 and a reverberation path 108. In the direct path, directional filters 110 are connected to the output of the multi-channel decoder 102. The direct path further comprises a first group of adders 112 and a second group of adders 114. The adders 112 sum the output signals of a first half of the directional filters 110, and the adders 114 sum the output signals of a second half of the directional filters 110. The summed outputs of the first and second groups of adders 112 and 114 represent the direct-path contribution 22a and 22b to the binaural output signal. Adders 116 and 118 combine the contribution signals 22a and 22b with the binaural contribution signals provided by the reverberation path 108, i.e. the signals 46a and 46b. In the reverberation path 108, a mixer 120 and a room processor 122 are connected in series between the output of the multi-channel decoder 102 and the respective inputs of the adders 116 and 118, whose outputs define the binaural output signal at the output 104.
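The signal flow of figure 7 can be sketched as below, assuming time-domain FIR directional filters and reverberation contributions that have already been computed. The function names, argument layout and tap values are illustrative assumptions; only the routing (filter pairs, adders 112/114, adders 116/118) follows the description.

```python
def fir(signal, taps):
    """Filter `signal` with the FIR impulse response `taps` (output keeps the input length)."""
    return [
        sum(h * signal[n - k] for k, h in enumerate(taps) if n - k >= 0)
        for n in range(len(signal))
    ]


def binaural_output(channels, dir_filters, reverb_46a, reverb_46b):
    """Hypothetical sketch of the direct path plus reverberation path.

    channels:    list of decoded channels (lists of samples, equal lengths)
    dir_filters: one (taps_left, taps_right) pair per channel
    reverb_46a, reverb_46b: contributions of the reverberation path
    """
    n = len(channels[0])
    direct_l = [0.0] * n
    direct_r = [0.0] * n
    for samples, (taps_l, taps_r) in zip(channels, dir_filters):
        for i, value in enumerate(fir(samples, taps_l)):
            direct_l[i] += value          # adders 112 -> contribution 22a
        for i, value in enumerate(fir(samples, taps_r)):
            direct_r[i] += value          # adders 114 -> contribution 22b
    out_l = [d + r for d, r in zip(direct_l, reverb_46a)]  # adder 116
    out_r = [d + r for d, r in zip(direct_r, reverb_46b)]  # adder 118
    return out_l, out_r


# Usage: two channels, one-tap directional filters
out_l, out_r = binaural_output(
    [[1.0, 0.0], [0.0, 1.0]],
    [([1.0], [0.5]), ([0.25], [1.0])],
    [0.0, 0.5], [0.5, 0.0],
)
```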

To ease understanding of the device of figure 7, the reference signs used for the corresponding elements in figures 1-6 are partly reused in its description. The correspondences will become clearer from the following discussion. It should be noted that, to simplify the explanation, the following embodiments assume that the similarity reducer performs a correlation reduction. Accordingly, it is referred to below as the correlation reducer. However, as is clear from the preceding discussion, the embodiments described below are readily transferable to cases where the similarity reducer reduces similarity in some way other than by reducing correlation. In addition, although the embodiments below assume that the mixer generating the downmix for the room processing performs a level reduction of the centre channel, they are, as mentioned above, readily transferable to alternative embodiments.

The device of figure 7 uses a decoded multi-channel signal 124 to generate the headphone output signal at the output 104. The multi-channel decoder 102 derives the decoded multi-channel signal 124 from a bitstream received at the bitstream input 126, for example by means of spatial audio decoding. After decoding, each signal, or channel, of the decoded multi-channel signal 124 is filtered by a pair of directional filters 110. For example, the first (upper) channel of the decoded multi-channel signal 124 is filtered by the directional filters DirFilter(1,L) and DirFilter(1,R), the second channel (second from the top) is filtered by the directional filters DirFilter(2,L) and DirFilter(2,R), and so on. The filters 110 model the acoustic transmission from a virtual sound source position in a room to the ear canal of a listener, i.e. they implement so-called binaural room transfer functions (BRTFs). They can apply time, level and spectral modifications and may also partially model room reflections and reverberation. The directional filters 110 may be implemented in the time domain or in the frequency domain. Because many directional filters 110 are needed (N x 2, where N is the number of decoded channels), modelling the room reflections and reverberation completely would require rather long filters, on the order of 20000 filter taps at 44.1 kHz, which leads to high computational complexity. The directional filters 110 are therefore advantageously reduced to a minimum, namely to so-called head-related transfer functions (HRTFs), and the room reflections and reverberation are modelled by the common room processing module 122.
The room processing module 122 may implement a reverberation algorithm in the time domain or in the frequency domain and may operate on a one-channel or two-channel input signal 48, which the mixer 120 computes from the decoded multi-channel signal 124 by means of a mixing matrix. The room processing module models room reflections and/or reverberation. Room reflections and reverberation are essential for localising sounds, especially with regard to distance and externalisation, i.e. the perception of sound sources as being outside the listener's head.
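The complexity argument for moving the reverberation out of the N x 2 directional filters can be made concrete with a rough operation count. The sketch assumes direct time-domain convolution (one multiply-accumulate per tap and output sample); the channel count of 5 and the short filter length of 256 taps are hypothetical values chosen only for comparison, and fast convolution would lower both figures.

```python
def direct_fir_macs_per_second(num_channels, taps_per_filter, sample_rate_hz):
    """Multiply-accumulates per second for N x 2 FIR filters in direct form."""
    num_filters = 2 * num_channels
    return num_filters * taps_per_filter * sample_rate_hz


# Full room modelling inside every directional filter (20000 taps at 44.1 kHz)
long_filters = direct_fir_macs_per_second(5, 20000, 44100)

# Short head-related filters only (256 taps is an assumed length)
short_filters = direct_fir_macs_per_second(5, 256, 44100)

print(long_filters, short_filters)
```

Under these assumptions the long filters cost 8.82 billion multiply-accumulates per second against roughly 113 million for the short ones, which is why a single shared room processing module is attractive.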

Multi-channel sound is typically produced such that the dominant sound energy is contained in the front channels, i.e. front left, front right and centre. Film dialogue and music are mostly mixed into the centre channel. If centre-channel signals are fed to the room processing module 122, the resulting output is often perceived as unnaturally reverberant and spectrally coloured. Therefore, in the embodiment of figure 7, the centre channel is fed by the mixer 120 to the room processing module 122 with a significant level reduction, for example about 6 dB. In this respect the embodiment of figure 7 corresponds to the configurations of figures 3 and 5, with the elements 102, 124, 120 and 122 of figure 7 corresponding to the decoder 64 operating on the signal 18, the combination of the channels 66 and 68, the mixer 66 and the room processor 44 of figures 3 and 5, respectively.

Figure 8 shows another embodiment of a binaural output signal generator. This generator is indicated generally by 140. To ease the description of figure 8, the same reference signs as in figure 7 are used. To indicate that the mixer 120 does not necessarily perform all the functions described for figures 3, 5 and 7, namely the level reduction of the centre channel, the module combining the blocks 102, 120 and 122 is denoted 40'. In other words, the level reduction in the mixer 120 is optional in the device of figure 8. In contrast to figure 7, however, decorrelators are connected between each pair of directional filters 110 and the output of the decoder 102 for the associated channel of the decoded multi-channel signal 124. The decorrelators are denoted 142_1, 142_2 and so on. The decorrelators 142_1-142_4 act as the correlation reducer 12 of figure 1. Although figure 8 shows a decorrelator 142_1-142_4 for each channel of the decoded multi-channel signal 124, this is not strictly necessary. A single decorrelator is often sufficient. The decorrelators 142 may simply be delays. Preferably, the delays introduced by the decorrelators 142_1-142_4 differ from one another. Alternatively, the decorrelators 142_1-142_4 may be all-pass filters, i.e. filters whose magnitude response is constantly one but which change the phase of the spectral components of the respective channel. The phase modifications caused by the decorrelators 142_1-142_4 should preferably differ from channel to channel. Of course, other possibilities exist. For example, the decorrelators 142_1-142_4 may be implemented as finite impulse response (FIR) filters or the like.
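The simplest decorrelator variant mentioned above, a different delay per channel, can be sketched as follows. Integer sample delays, the zero-lag correlation measure and all function names are assumptions made for illustration.

```python
import math


def delay(signal, samples):
    """Delay a signal by an integer number of samples, keeping its length."""
    if samples <= 0:
        return list(signal)
    return ([0.0] * samples + list(signal))[: len(signal)]


def correlation(a, b):
    """Normalised cross-correlation of two signals at lag zero."""
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den else 0.0


def decorrelate(channels, delays):
    """Apply a (preferably different) delay to each channel."""
    return [delay(ch, d) for ch, d in zip(channels, delays)]


# Usage: two identical channels become uncorrelated at lag zero
pulse = [1.0, 0.0, 0.0, 0.0]
before = correlation(pulse, pulse)
after = correlation(*decorrelate([pulse, pulse], [0, 1]))
```

For this single-pulse input the zero-lag correlation drops from 1.0 to 0.0; for real programme material the reduction depends on the signal and on the chosen delays.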

It follows that the elements 142_1-142_4, 110, 112 and 114 of the embodiment of figure 8 together function in accordance with the device 10 of figure 1.

Like figure 8, figure 9 shows a variant of the binaural output signal generator of figure 7. Accordingly, figure 9 is also described using the reference signs of figure 7. As in the embodiment of figure 8, the level reduction in the mixer 120 is optional in the device of figure 9, which is why the reference sign 40' is used rather than 40 as in figure 7. The embodiment of figure 9 addresses the problem that significant correlation exists between the channels of multi-channel sound productions. After the multi-channel signals have passed through the directional filters 110, the two intermediate signals of each filter pair are summed by the adders 112 and 114 to form the headphone output signal at the output 104. Summing correlated output signals in the adders 112 and 114 greatly reduces the spatial width of the output signal at the output 104 and impairs externalisation. This is particularly problematic for the correlation between the left and right signals and the centre channel of the decoded multi-channel signal 124. In the embodiment of figure 9 the directional filters are configured to produce output signals that are as decorrelated as possible. To this end, the arrangement of figure 9 includes the device 30, which forms the set of HRTFs used by the directional filters 110 on the basis of some original set of HRTFs.
As discussed above, the device 30 may use one or a combination of the following techniques on the HRTFs of the directional filter pair associated with one or more channels of the decoded multi-channel signal 124: delaying the directional filter or the respective pair of directional filters, for example by shifting its impulse response, e.g. by displacing the filter taps; modifying the phase response of the respective directional filters; and applying a decorrelation filter, such as an all-pass filter, to the respective directional filters of the respective channel. Such an all-pass filter could be implemented as an FIR filter.

As mentioned above, the device 30 may operate in response to a change in the loudspeaker configuration for which the bitstream at the bitstream input 126 is intended.

The embodiments of figures 7-9 concerned a decoded multi-channel signal. The following embodiments concern parametric multi-channel decoding for headphones. Generally speaking, spatial audio coding is a multi-channel compression technique that exploits the perceptual inter-channel irrelevance in multi-channel audio signals to achieve higher compression. It captures spatial cues, or spatial parameters, i.e. parameters describing the spatial image of a multi-channel audio signal. Spatial cues typically include level/intensity differences, phase differences and measures of correlation/coherence between channels and can be represented in a very compact form. The concept of spatial audio coding has been adopted by MPEG, resulting in the MPEG Surround standard, i.e. ISO/IEC 23003-1. The spatial parameters used in spatial audio coding can also be used to describe directional filters. In this way, the decoding of the spatial audio data and the application of the directional filters can be combined for high-quality decoding and rendering of multi-channel audio for headphone playback.

The general structure of a spatial audio decoder for headphone output is shown in figure 10. The decoder of figure 10, indicated generally by 200, comprises a binaural spatial subband modifier 202, which has an input for a stereo or mono downmix signal 204, an input for spatial parameters 206 and an output for the binaural output signal 208. The downmix signal together with the spatial parameters 206 forms the multi-channel signal 18 described above and represents its plurality of channels.

Internally, the subband modifier 202 comprises an analysis filter bank 208, a matrixing unit or linear combiner 210 and a synthesis filter bank 212, connected in that order between the downmix signal input and the output of the subband modifier 202. The subband modifier 202 further comprises a parameter converter 214, which is fed with the spatial parameters 206 and with the modified set of HRTFs produced by the device 30.

In figure 10 it is assumed that the downmix signal has already been decoded beforehand, including, for example, entropy decoding. The binaural spatial audio decoder is fed with the downmix signal 204. The parameter converter 214 uses the spatial parameters 206 and a parametric description of the directional filters, in the form of the modified HRTF parameters 216, to form binaural parameters 218. These parameters 218 are applied by the matrixing unit 210, in the form of a two-by-two matrix (in the case of a stereo downmix signal) or a one-by-two matrix (in the case of a mono downmix signal 204), in the frequency domain to the spectral values 88 output by the analysis filter bank 208 (see figure 6). In other words, the binaural parameters 218 vary at the time-frequency parameter resolution of the tiles 92 shown in figure 6 and are applied to each sample value 88. Interpolation may be used to smooth the matrix coefficients, i.e. the binaural parameters 218, in the transition from the coarser time-frequency parameter domain 92 to the time-frequency resolution of the analysis filter bank 208. Thus, in the case of a stereo downmix 204, the matrixing performed by the unit 210 yields two sample values for each pair consisting of a sample value of the left channel of the downmix signal 204 and the corresponding sample value of the right channel of the downmix signal 204. The resulting two sample values are part of the left and right channels of the binaural output signal 208, respectively. In the case of a mono downmix signal 204, the matrixing unit 210 yields two sample values for each sample value of the mono downmix signal 204, namely one for the left channel and one for the right channel of the binaural output signal 208.
The binaural parameters 218 define the matrix operation by which one or two sample values of the downmix signal 204 are mapped to the corresponding sample values of the left and right channels of the binaural output signal 208. The binaural parameters 218 already reflect the modified HRTF parameters. They therefore introduce the decorrelation of the input channels of the multi-channel signal 18 described above.
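The matrixing step for the stereo case can be sketched as below. The sketch assumes that each spectral value is a Python complex number and that the smoothing between neighbouring parameter sets is a plain linear interpolation; the description only says that interpolation may be used, so the interpolation rule and the function names are assumptions.

```python
def matrix_stereo(left_dmx, right_dmx, m):
    """Apply a two-by-two matrix `m` to one pair of downmix spectral values.

    Returns the corresponding values of the left and right binaural channels.
    """
    out_left = m[0][0] * left_dmx + m[0][1] * right_dmx
    out_right = m[1][0] * left_dmx + m[1][1] * right_dmx
    return out_left, out_right


def interpolate(m_prev, m_next, fraction):
    """Linearly interpolate between two matrices (an assumed smoothing rule)."""
    return [
        [(1.0 - fraction) * a + fraction * b for a, b in zip(row_prev, row_next)]
        for row_prev, row_next in zip(m_prev, m_next)
    ]


# Usage: halfway between two parameter sets, applied to one spectral pair
m_half = interpolate([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]], 0.5)
out_left, out_right = matrix_stereo(1 + 1j, 1 - 1j, m_half)
```

For a mono downmix the same idea applies with a matrix that has a single input column, giving two output values per input value.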

Accordingly, the output of the matrixing unit 210 is a modified spectrogram of the kind shown in figure 6. The synthesis filter bank 212 reconstructs the binaural output signal 208 from it. In other words, the synthesis filter bank 212 converts the resulting two-channel signal output by the matrixing unit 210 into the time domain. This is, of course, optional.

In the device of figure 10, room reflections and reverberation are not handled as separate processes. If these effects are to be taken into account, they have to be incorporated into the HRTFs 216. Figure 11 shows a binaural output signal generator that combines a binaural spatial audio decoder 200' with separate room reflection/reverberation processing. The prime in the reference sign 200' in figure 11 indicates that the binaural spatial audio decoder 200' may use unmodified HRTFs, i.e. the original HRTFs as indicated in figure 2. Optionally, however, the binaural spatial audio decoder 200' of figure 11 may be the one shown in figure 10. In either case, the binaural output signal generator of figure 11, indicated generally by 230, comprises, besides the binaural spatial audio decoder 200', a downmix audio decoder 232, a modified spatial audio subband modifier 234, a room processor 122 and two adders 116 and 118. The downmix audio decoder 232 is connected between the bitstream input 126 and the binaural spatial audio subband modifier 202 of the binaural spatial audio decoder 200'. The downmix audio decoder 232 decodes the bitstream arriving at the input 126 to derive the downmix signal 204 and the spatial parameters 206. The downmix signal 204 and the spatial parameters 206 are provided both to the binaural spatial audio subband modifier 202 and to the modified spatial audio subband modifier 234.
From the downmix signal 204, the modified spatial audio subband modifier 234 computes, using the spatial parameters 206 and modification parameters 236 reflecting the above-mentioned amount of level reduction of the centre channel, the mono or stereo downmix 48 that serves as the input of the room processor 122. The contributions output by the binaural spatial audio subband modifier 202 and by the room processor 122 are summed channel by channel in the adders 116 and 118 to form the binaural output signal at the output 238.

Figure 12 shows a block diagram illustrating the functionality of the binaural spatial audio decoder 200' of figure 11. It should be noted that figure 12 shows not the actual internal structure of the binaural spatial audio decoder 200' of figure 11 but the signal modifications it performs. In general, the internal structure of the binaural spatial audio decoder 200' corresponds to that shown in figure 10, except that the device 30 may be omitted if the decoder operates with the original HRTFs. In addition, figure 12 illustrates the functionality of the binaural spatial audio decoder 200' for the example of a multi-channel signal 18 comprising three channels that are converted into the binaural output signal 208. In particular, a TTT, or "2->3", box derives a centre channel 242, a right channel 244 and a left channel 246 from the two channels of the stereo downmix 204. In other words, figure 12 assumes by way of example that the downmix 204 is a stereo downmix. The spatial parameters 206 used by the TTT box 248 comprise the above-mentioned channel prediction coefficients. The correlation reduction is achieved by three delays, denoted DelayL, DelayR and DelayC in figure 12. They correspond to the decorrelation introduced, for example, in figures 1 and 7. It should be recalled, however, that figure 12 only shows the signal modifications performed by the binaural spatial audio decoder 200', whereas its actual structure is shown in figure 10. Therefore, although the delays forming the correlation reducer 12 are shown as separate from the HRTFs forming the directional filters 14, the delays in the correlation reducer 12 may be regarded as a modification of the HRTF parameters forming the original HRTFs of the directional filters 14 of figure 1.
First of all, figure 12 shows that the binaural spatial audio decoder 200' decorrelates the channels for headphone reproduction. The decorrelation is achieved by simple means, namely by adding a delay block to the parametric processing for the matrix M of the binaural spatial audio decoder 200'. Accordingly, the binaural spatial audio decoder 200' may apply the following modifications to the individual channels: delaying the centre channel, preferably by at least one sample; delaying the centre channel by different intervals in each frequency band; delaying the left and right channels, preferably by at least one sample; and/or delaying the left and right channels by different intervals in each frequency band.

The figure shows an example structure of the modified spatial audio subband modifier of figure 11. The subband modifier 234 comprises a two-to-three (TTT) block 262, weighting stages 264a-264e, first adders 266a and 266b, second adders 268a and 268b, an input for the stereo downmix 204, an input for the spatial parameters 206, a further input for a residual signal 270, and an output for the downmix 48, which in this figure is a stereo signal intended for further processing by the room processor.

As can be seen from the structure of the modified spatial audio subband modifier 234 in the figure, the TTT ("2->3") block 262 simply reconstructs the center channel 242, the right channel 244 and the left channel 246 from the stereo downmix 204 using the spatial parameters 206. Recall that in the earlier figure illustrating the binaural spatial audio decoder 200', the channels 242-246 are not actually computed. Rather, the binaural spatial audio subband modifier modifies the matrix M so that the stereo downmix 204 is converted directly into the binaural contribution reflecting the HRTFs. The TTT block 262 of the present figure, however, actually performs the reconstruction. Optionally, as shown in the figure, the TTT block 262 can use a residual signal 270, reflecting the prediction residual, when reconstructing the channels 242-246 from the stereo downmix 204 and the spatial parameters 206, which, as stated previously, contain the channel prediction coefficients and, optionally, inter-channel coherence (ICC) values. The first adders 266a and 266b add up the channels 242-246 to form the left channel of the stereo downmix 48. In particular, the adders 266a and 266b form a weighted sum, where the weights are set by the weighting stages 264a, 264b, 264c and 264e, which apply the weighting values EQ_LL, EQ_RL and EQ_CL to the respective channels 246 to 242. Similarly, the adders 268a and 268b form a weighted sum of the channels 246 to 242, weighted by the stages 264b, 264d and 264e, and this weighted sum forms the right channel of the stereo downmix 48.
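
The weighted sums formed by the adders can be sketched as below. This is only an illustration under stated assumptions: the function name `weighted_downmix` and the dictionary keys are hypothetical, the keys `LL`, `RL` and `CL` stand for the weights the text names for the left output, and the keys `LR`, `RR` and `CR` for the right output are assumed by analogy because the text does not name them. The numeric weights are example values, not values from the patent.

```python
def weighted_downmix(left, right, center, weights):
    """Form a stereo downmix as per-sample weighted sums of left, right and center."""
    out_left = [weights["LL"] * l + weights["RL"] * r + weights["CL"] * c
                for l, r, c in zip(left, right, center)]
    out_right = [weights["LR"] * l + weights["RR"] * r + weights["CR"] * c
                 for l, r, c in zip(left, right, center)]
    return out_left, out_right


# Hypothetical weights: the center is mixed equally, at reduced level, into both outputs.
weights = {"LL": 1.0, "RL": 0.0, "CL": 0.5,
           "LR": 0.0, "RR": 1.0, "CR": 0.5}

left = [1.0, 0.0]
right = [0.0, 1.0]
center = [2.0, 2.0]
out_left, out_right = weighted_downmix(left, right, center, weights)
```

Using one weight per input/output pair keeps the sketch close to the structure of weighting stages followed by adders; real EQ parameters could also vary per subband, which is not modelled here.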

The parameters 270 for the weighting stages 264a-264e are selected so that the above-described reduction of the center channel level in the stereo downmix 48 is achieved, which yields the natural sound perception mentioned previously.

In other words, the figure shows a room processing module that can be combined with the binaural parametric decoder 200'. The module is fed with the downmix signal 204. The downmix 204 contains all signals of the multi-channel signal so that stereo compatibility is preserved. As explained above, the room processing module should be fed with a signal containing only a reduced center signal. This level reduction is performed by the modified spatial audio subband modifier shown in the figure. In particular, as can be seen in the figure, a residual signal 270 may be used to reconstruct the center, left and right channels 242-246. The residual signal for the center, left and right channels 242-246 may be decoded by the downmix audio decoder 232 of figure 11, although this is not shown there. The EQ parameters, or weighting values, applied by the weighting stages 264a-264e may be real-valued for the left, right and center channels 242-246. A single parameter set may be stored and applied for the center channel 242, so that, as illustrated in the figure, the center channel is mixed equally into the left and right output channels of the stereo downmix 48.

The EQ parameters 270 fed into the modified spatial audio subband modifier 234 may have the following properties. First, the center channel signal may be attenuated, preferably by at least 6 dB. In addition, the center channel signal may have a low-pass characteristic. Further, the difference signal of the remaining channels may be boosted at low frequencies. To compensate for the lower level of the center channel 242 relative to the other channels 244 and 246, the gain of the HRTF parameters for the center channel used in the binaural spatial audio subband modifier 202 is increased accordingly.
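
As a worked illustration of the 6 dB figure, the following sketch converts an attenuation in decibels into a linear amplitude gain and applies it to a signal. The function names `db_to_gain` and `attenuate` are assumptions made for illustration; the conversion itself is the standard amplitude relation gain = 10^(dB/20).

```python
def db_to_gain(db):
    """Convert a level change in dB to a linear amplitude factor."""
    return 10.0 ** (db / 20.0)


def attenuate(signal, attenuation_db):
    """Scale a signal down by `attenuation_db` decibels (a positive value attenuates)."""
    gain = db_to_gain(-attenuation_db)
    return [gain * x for x in signal]


center = [1.0, -2.0, 0.5]
reduced_center = attenuate(center, 6.0)   # roughly halves the amplitude
```

An attenuation of 6 dB therefore corresponds to multiplying the center channel by approximately 0.5.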

The main purpose of setting the EQ parameters is to attenuate the center signal in the output fed to the room processing module. However, the center channel can only be attenuated to a limited extent: the center signal is subtracted from the left and right downmix channels inside the TTT ("2->3") block, so if the center level is reduced, artifacts may become audible in the left and right channels. The center level reduction in the EQ stage is therefore a trade-off between attenuation and artifacts. A fixed setting of the EQ parameters is possible, but it will not be optimal for all signals. An embodiment may therefore include an adaptive algorithm or module 274 that controls the amount of center level reduction using one or a combination of the following parameters.

The spatial parameters 206, which the TTT block 262 uses to decode the center channel 242 from the left and right channels of the downmix 204, can be used, as indicated by the dashed line 276.

The levels of the center, left and right channels can be used, as indicated by the dashed line 278.

The level differences between the center, left and right channels 242-246 can also be used, as indicated by the dashed line 278.

The output of a signal-type detection algorithm, such as a voice activity detector, can also be used, as indicated by the dashed line 278.

Finally, static or dynamic metadata describing the audio content can be used to determine the amount of center level reduction, as indicated by the dashed line 280.
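
One of the control inputs listed above, the output of a voice activity detector, can be sketched as a per-frame choice of the center level reduction. This is a hypothetical illustration and not the adaptation module 274 itself: the function name `center_reduction_db` and the two reduction values are assumptions, chosen only so that the reduction is stronger in speech frames than in non-speech frames, as the claims describe.

```python
def center_reduction_db(speech_flags, speech_db=9.0, non_speech_db=6.0):
    """Return a center level reduction in dB for each frame.

    `speech_flags` holds one boolean per frame (True means speech was detected).
    The default values are illustrative assumptions, not values from the text.
    """
    if speech_db <= non_speech_db:
        raise ValueError("reduction in speech frames must exceed non-speech frames")
    return [speech_db if is_speech else non_speech_db
            for is_speech in speech_flags]


# Toy detector output for four consecutive frames.
flags = [True, False, False, True]
reductions = center_reduction_db(flags)
```

In practice the other inputs named above (spatial parameters, channel levels, level differences, metadata) could be combined with the detector output; how they are combined is not specified in the text and is left open here.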

Although most aspects of the invention have been described here in the context of a device, these aspects clearly also represent a description of the corresponding method, since any block or device corresponds to a method step or to a feature of a method step. Similarly, aspects described in the context of a method step also represent a description of a corresponding block, component or structural feature of a corresponding device, for example a part of an application-specific integrated circuit (ASIC), a subroutine of program code or a part of programmed programmable logic.

The inventive encoded audio signal can be stored on a digital storage medium or can be transmitted over a transmission medium, such as a wireless transmission medium or a wired transmission medium such as the Internet.

Depending on the implementation requirements, embodiments of the invention can be implemented in hardware or in software. The implementation can use a digital storage medium, for example a floppy disk, a DVD, a CD, a ROM, a PROM, an EPROM, an EEPROM or a FLASH memory, that stores electronically readable control signals and cooperates (or is capable of cooperating) with a programmable computer system such that the respective method is performed.

Some embodiments according to the invention comprise a data carrier having electronically readable control signals that are capable of cooperating with a programmable computer system such that one of the methods described here is performed.

In general, embodiments of this invention can be implemented as a computer program product with program code that performs one of the methods when the computer program product runs on a computer. The program code may, for example, be stored on a machine-readable carrier.

Other embodiments comprise a computer program, stored on a machine-readable carrier, for performing one of the methods described here.

In other words, an embodiment of the inventive method is a computer program having program code for performing one of the methods described here when the computer program runs on a computer.

A further embodiment of the inventive method is therefore a data carrier (or a digital storage medium, or a computer-readable medium) having recorded thereon the computer program for performing one of the methods described here.

A further embodiment of the inventive method is therefore a data stream or a sequence of signals representing the computer program for performing one of the methods described here. The data stream or the sequence of signals may be configured to be transferred via a data communication connection, for example via the Internet.

A further embodiment comprises processing means, for example a computer or a programmable logic device, configured or adapted to perform one of the methods described here.

A further embodiment comprises a computer having installed thereon the computer program for performing one of the methods described here.

In some embodiments, a programmable logic device (for example, a field programmable gate array) may be used to perform some or all of the functionality of the methods described here. In some embodiments, a field programmable gate array may cooperate with a microprocessor to perform one of the methods described here. Generally, the described methods can be performed by any hardware apparatus.

The embodiments described above merely illustrate the basic principles of the present invention. It is understood that modifications and variations of the arrangements and details described here will be apparent to those skilled in the art. The invention is therefore intended to be limited only by the scope of the appended claims, and not by the specific details presented here by way of description and explanation of the embodiments.

1. A device for generating a binaural signal on the basis of a multi-channel signal representing a plurality of channels and intended for reproduction by a speaker configuration in which a virtual sound source position is associated with each channel, comprising: a similarity reducer (12) for differently processing, and thereby reducing the similarity between, at least one of a left and a right channel of the plurality of channels, a front and a rear channel of the plurality of channels, and a center and a non-center channel of the plurality of channels, in order to obtain an inter-similarity reduced set of channels (20); a plurality (14) of directional filters for modeling the acoustic transmission of a respective one of the inter-similarity reduced set of channels (20) from the virtual sound source position associated with the respective channel to a respective ear canal of a listener; a first mixer (16a) for mixing the outputs of the directional filters modeling the acoustic transmission to the first ear canal of the listener, to obtain a first channel (22a) of the binaural signal; a second mixer (16b) for mixing the outputs of the directional filters modeling the acoustic transmission to the second ear canal of the listener, to obtain a second channel (22b) of the binaural signal; a downmixer (42) for forming a mono or stereo downmix of the plurality of channels represented by the multi-channel signal; a room processor [spatial processor] (44) for generating a room-reflections/reverberation related contribution of the binaural signal, including a first channel output and a second channel output, by modeling room reflections/reverberation on the basis of the mono or stereo downmix; a first adder (116) for adding the first channel output of the room processor to the first channel (22a) of the binaural signal; and a second adder (118) for adding the second channel output of the room processor to the second channel (22b) of the binaural signal.

2. The device according to claim 1, in which the similarity reducer (12) performs the different processing by causing a relative delay between, and/or by differently modifying, in a spectrally varying sense, the phase responses of, at least one of a left and a right channel of the plurality of channels, a front and a rear channel of the plurality of channels, and a center and a non-center channel of the plurality of channels, and/or by differently modifying, in a spectrally varying sense, the magnitude responses of at least one of a left and a right channel of the plurality of channels, a front and a rear channel of the plurality of channels, and a center and a non-center channel of the plurality of channels.

3. A device for generating a binaural signal on the basis of a multi-channel signal representing a plurality of channels and intended for reproduction by a speaker configuration in which a virtual sound source position is associated with each channel, comprising: a similarity reducer (12) for causing a relative delay between, and/or performing, in a spectrally varying sense, a different phase and/or magnitude modification of, at least two channels of the plurality of channels, in order to obtain an inter-similarity reduced set of channels (20); a plurality (14) of directional filters for modeling the acoustic transmission of a respective one of the inter-similarity reduced set of channels (20) from the virtual sound source position associated with the respective channel to a respective ear canal of a listener; a first mixer (16a) for mixing the outputs of the directional filters modeling the acoustic transmission to the first ear canal of the listener, to obtain a first channel (22a) of the binaural signal; a second mixer (16b) for mixing the outputs of the directional filters modeling the acoustic transmission to the second ear canal of the listener, to obtain a second channel (22b) of the binaural signal; a downmixer (42) for forming a mono or stereo downmix of the plurality of channels represented by the multi-channel signal; a room processor (44) for generating, on the basis of the mono or stereo downmix, a room-reflections/reverberation related contribution of the binaural signal, including a first channel output and a second channel output; a first adder (116) for adding the first channel output of the room processor to the first channel (22a) of the binaural signal; and a second adder (118) for adding the second channel output of the room processor to the second channel (22b) of the binaural signal.

4. A device for forming an inter-similarity reduced set of head-related transfer functions (HRTFs) for modeling the acoustic transmission of a plurality of channels from a virtual sound source position associated with the respective channel to the ear canals of a listener, comprising: an HRTF provider (32) for providing an original set of HRTFs, implemented as FIR filters, by looking up or computing filter taps for each of the original set of HRTFs in response to a selection or change of the virtual sound source positions; and an HRTF processor (34) for causing the impulse responses of the HRTFs modeling the acoustic transmission of a predetermined pair of channels to be delayed relative to each other, or for differently modifying, in a spectrally varying sense, the phase and/or magnitude responses of these HRTFs, the pair of channels being one of a left and a right channel of the plurality of channels, a front and a rear channel of the plurality of channels, and a center and a non-center channel of the plurality of channels.

5. The device according to claim 4, in which the HRTF processor (34) causes the impulse responses of the HRTFs modeling the acoustic transmission of the predetermined pair of channels to be delayed relative to each other by shifting the filter taps.

6. The device according to claim 4, in which the HRTF processor (34) causes the impulse responses of the HRTFs modeling the acoustic transmission of the predetermined pair of channels to be delayed relative to each other, or differently modifies, in a spectrally varying sense, their phase and/or magnitude responses, such that the group delay of a first one of the HRTFs relative to another one of the HRTFs shows, across Bark bands, a standard deviation of at least one eighth of a sample.

7. The device according to claim 4, in which the HRTF provider (32) provides the original set of HRTFs on the basis of the virtual sound source positions and HRTF parameters.

8. The device according to claim 4, in which the HRTF processor (34) performs different all-pass filtering of the impulse responses of the predetermined pair of channels.

9. A method of generating a binaural signal on the basis of a multi-channel signal representing a plurality of channels and intended for reproduction by a speaker configuration in which a virtual sound source position is associated with each channel, comprising: differently processing, and thereby reducing the correlation between, at least one of a left and a right channel of the plurality of channels, a front and a rear channel of the plurality of channels, and a center and a non-center channel of the plurality of channels, in order to obtain an inter-similarity reduced set of channels (20); subjecting the inter-similarity reduced set of channels (20) to a plurality (14) of directional filters for modeling the acoustic transmission of a respective one of the set of channels (20) from the virtual sound source position associated with the respective channel to a respective ear canal of a listener; mixing the outputs of the directional filters modeling the acoustic transmission to the first ear canal of the listener to obtain a first channel (22a) of the binaural signal; mixing the outputs of the directional filters modeling the acoustic transmission to the second ear canal of the listener to obtain a second channel (22b) of the binaural signal; forming a mono or stereo downmix of the plurality of channels represented by the multi-channel signal; generating a room-reflections/reverberation related contribution of the binaural signal, including a first channel output and a second channel output, by modeling room reflections/reverberation on the basis of the mono or stereo downmix; adding the first channel output to the first channel (22a) of the binaural signal; and adding the second channel output to the second channel (22b) of the binaural signal.

10. A method of generating a binaural signal on the basis of a multi-channel signal representing a plurality of channels and intended for reproduction by a speaker configuration in which a virtual sound source position is associated with each channel, comprising: performing a different phase and/or magnitude modification of at least two channels of the plurality of channels in order to obtain an inter-similarity reduced set of channels (20); subjecting the inter-similarity reduced set of channels (20) to a plurality (14) of directional filters for modeling the acoustic transmission of a respective one of the inter-similarity reduced set of channels (20) from the virtual sound source position associated with the respective channel to a respective ear canal of a listener; mixing the outputs of the directional filters modeling the acoustic transmission to the first ear canal of the listener to obtain a first channel (22a) of the binaural signal; mixing the outputs of the directional filters modeling the acoustic transmission to the second ear canal of the listener to obtain a second channel (22b) of the binaural signal; forming a mono or stereo downmix of the plurality of channels represented by the multi-channel signal; generating a room-reflections/reverberation related contribution of the binaural signal, including a first channel output and a second channel output, by modeling room reflections/reverberation on the basis of the mono or stereo downmix; adding the first channel output to the first channel (22a) of the binaural signal; and adding the second channel output to the second channel (22b) of the binaural signal.

11. A method of forming an inter-similarity reduced set of head-related transfer functions (HRTFs) for modeling the acoustic transmission of a plurality of channels from a virtual sound source position associated with the respective channel to the ear canals of a listener, comprising: providing an original set of HRTFs in the form of FIR filters by looking up or computing filter taps for each of the original set of HRTFs in response to a selection or change of the virtual sound source positions; and differently modifying, in a spectrally varying sense, the phase and/or magnitude responses of the impulse responses of the HRTFs modeling the acoustic transmission of a predetermined pair of channels, such that the group delay of a first one of the HRTFs relative to another one of the HRTFs shows, across Bark bands, a standard deviation of at least one eighth of a sample, the pair of channels being one of a left and a right channel of the plurality of channels, a front and a rear channel of the plurality of channels, and a center and a non-center channel of the plurality of channels.

12. A computer-readable storage medium having recorded thereon a computer program for performing, when executed on a computer, the method according to claim 9.

13. A computer-readable storage medium having recorded thereon a computer program for performing, when executed on a computer, the method according to claim 10.

14. A computer-readable storage medium having recorded thereon a computer program for performing, when executed on a computer, the method according to claim 11.

15. A device for generating a room-reflections/reverberation related contribution of a binaural signal on the basis of a multi-channel signal representing a plurality of channels and intended for reproduction by a speaker configuration in which a virtual sound source position is associated with each channel, comprising: a downmixer for forming a mono or stereo downmix of the channels of the multi-channel signal; and a room processor for generating the room-reflections/reverberation related contribution of the binaural signal by modeling room reflections/reverberation on the basis of the mono or stereo downmix, wherein the downmixer forms the mono or stereo downmix such that the plurality of channels contribute to the mono or stereo downmix at a level that differs between at least two channels of the multi-channel signal, and wherein the downmixer forms the mono or stereo downmix such that a center channel of the plurality of channels contributes to the mono or stereo downmix at a level reduced relative to the other channels of the multi-channel signal.

16. The device according to claim 15, in which the downmixer reconstructs the plurality of channels by spatial audio decoding of a downmix signal using accompanying spatial parameters describing level differences, phase differences and/or measures of correlation between the plurality of channels.

17. The device according to claim 16, in which the downmixer generates the downmix such that the amount of level reduction of a first of the at least two channels relative to a second of the at least two channels depends on the spatial parameters.

18. The device according to claim 16, in which the downmixer reconstructs the plurality of channels by spatial audio decoding of a stereo downmix using channel prediction coefficients, which describe how the channels of the stereo downmix are to be linearly combined to predict a triplet consisting of center, right and left channels, and a residual signal (270) reflecting the prediction residual of said triplet.

19. The device according to claim 15, 16, 17 or 18, in which the downmixer generates the downmix such that the amount of level reduction of a first of the at least two channels relative to a second of the at least two channels depends on level differences and/or correlations between individual channels of the plurality of channels.

20. The device according to claim 19, in which the downmixer derives the level differences and/or correlations between individual channels of the plurality of channels from spatial parameters accompanying a downmix signal, which together represent the plurality of channels.

21. The device according to claim 15, 16, 17 or 18, in which the downmixer generates the downmix such that the amount of level reduction of a first of the at least two channels relative to a second of the at least two channels varies in time, as indicated by a time-varying indicator transmitted as side information within the multi-channel signal.

22. The device according to claim 15, further comprising a signal-type detector for detecting speech and non-speech phases in the multi-channel signal, wherein the downmixer generates the downmix such that the amount of level reduction is higher in speech phases than in non-speech phases.

23. A method of generating a room-reflections/reverberation related contribution of a binaural signal on the basis of a multi-channel signal representing a plurality of channels and intended for reproduction by a speaker configuration in which a virtual sound source position is associated with each channel, comprising: forming a mono or stereo downmix of the channels of the multi-channel signal; and generating the room-reflections/reverberation related contribution of the binaural signal by modeling room reflections/reverberation on the basis of the mono or stereo downmix, wherein the mono or stereo downmix is formed such that the plurality of channels contribute to the mono or stereo downmix at a level that differs between at least two channels of the multi-channel signal, and such that a center channel of the plurality of channels contributes to the mono or stereo downmix at a level reduced relative to the other channels of the multi-channel signal.

24. A device for generating a room-reflections/reverberation related contribution of a binaural signal on the basis of a multi-channel signal representing a plurality of channels and intended for reproduction by a speaker configuration in which a virtual sound source position is associated with each channel, comprising: a downmixer for forming a mono or stereo downmix of the channels of the multi-channel signal; and a room processor for generating the room-reflections/reverberation related contribution of the binaural signal by modeling room reflections/reverberation on the basis of the mono or stereo downmix, wherein the downmixer forms the mono or stereo downmix such that the plurality of channels contribute to the mono or stereo downmix at a level that differs between at least two channels of the multi-channel signal, wherein the downmixer reconstructs the plurality of channels by spatial audio decoding of a downmix signal using accompanying spatial parameters describing level differences, phase differences and/or measures of correlation between the plurality of channels, and wherein the downmixer generates the downmix such that the amount of level reduction of a first of the at least two channels relative to a second of the at least two channels depends on the spatial parameters.

25. A method of generating a room-reflections/reverberation related contribution of a binaural signal on the basis of a multi-channel signal representing a plurality of channels and intended for reproduction by a speaker configuration in which a virtual sound source position is associated with each channel, comprising: forming a mono or stereo downmix of the channels of the multi-channel signal; and generating the room-reflections/reverberation related contribution of the binaural signal by modeling room reflections/reverberation on the basis of the mono or stereo downmix, wherein the mono or stereo downmix is formed such that the plurality of channels contribute to it at a level that differs between at least two channels of the multi-channel signal; the method further comprising reconstructing the plurality of channels by spatial audio decoding of a downmix signal using accompanying spatial parameters describing level differences, phase differences and/or measures of correlation between the plurality of channels, and forming the downmix such that the amount of level reduction of a first of the at least two channels relative to a second of the at least two channels depends on the spatial parameters.

26. A device for generating a room-reflections/reverberation related contribution of a binaural signal on the basis of a multi-channel signal representing a plurality of channels and intended for reproduction by a speaker configuration in which a virtual sound source position is associated with each channel, comprising: a downmixer for forming a mono or stereo downmix of the channels of the multi-channel signal; and a room processor for generating the room-reflections/reverberation related contribution of the binaural signal by modeling room reflections/reverberation on the basis of the mono or stereo downmix, wherein the downmixer forms the mono or stereo downmix such that the plurality of channels contribute to the mono or stereo downmix at a level that differs between at least two channels of the multi-channel signal, and wherein the downmixer forms the downmix such that the amount of level reduction of a first of the at least two channels relative to a second of the at least two channels depends on level differences and/or correlations between individual channels of the plurality of channels, or such that the amount of level reduction of the first of the at least two channels relative to the second of the at least two channels varies in time, as indicated by a time-varying indicator included in side information of the multi-channel signal.

27. A method of generating a room-reflections/reverberation related contribution of a binaural signal on the basis of a multi-channel signal representing a plurality of channels and intended for reproduction by a speaker configuration in which a virtual sound source position is associated with each channel, comprising: forming a mono or stereo downmix of the channels of the multi-channel signal; and generating the room-reflections/reverberation related contribution of the binaural signal by modeling room reflections/reverberation on the basis of the mono or stereo downmix, wherein the mono or stereo downmix is formed such that the plurality of channels contribute to it at a level that differs between at least two channels of the multi-channel signal; and wherein the downmix is formed such that the amount of level reduction of a first of the at least two channels relative to a second of the at least two channels depends on level differences and/or correlations between individual channels of the plurality of channels, or such that the amount of level reduction of the first of the at least two channels relative to the second of the at least two channels varies in time, as indicated by a time-varying indicator included in side information of the multi-channel signal.

28. A device for generating a room-reflections/reverberation related contribution of a binaural signal on the basis of a multi-channel signal representing a plurality of channels and intended for reproduction by a speaker configuration in which a virtual sound source position is associated with each channel, comprising: a downmixer for forming a mono or stereo downmix of the channels of the multi-channel signal; and a room processor for generating the room-reflections/reverberation related contribution of the binaural signal by modeling room reflections/reverberation on the basis of the mono or stereo downmix, wherein the downmixer forms the mono or stereo downmix such that the plurality of channels contribute to the mono or stereo downmix at a level that differs between at least two channels of the multi-channel signal; the device further comprising a signal-type detector for detecting speech and non-speech phases in the multi-channel signal, wherein the downmixer generates the downmix such that the amount of level reduction is higher in speech phases than in non-speech phases.

29. A method for generating a room-reflection/reverberation-related contribution of a binaural signal based on a multi-channel signal representing a plurality of channels and intended for reproduction by a speaker configuration in which a virtual sound source position is associated with each channel, the method comprising: forming a mono or stereo downmix of the channels of the multi-channel signal; and generating the room-reflection/reverberation-related contribution of the binaural signal by modelling room reflections/reverberation based on the mono or stereo downmix, wherein the mono or stereo downmix is formed such that the plurality of channels contribute to the downmix at a level that differs among at least two channels of the multi-channel signal; the method further comprising detecting speech and non-speech phases in the multi-channel signal, wherein the downmix is formed such that the level is reduced more strongly during speech phases than during non-speech phases.

30. A computer-readable storage medium having stored thereon a computer program for performing, when executed on a computer, the method according to claim 23.

31. A computer-readable storage medium having stored thereon a computer program for performing, when executed on a computer, the method according to claim 25.

32. A computer-readable storage medium having stored thereon a computer program for performing, when executed on a computer, the method according to claim 27.

33. A computer-readable storage medium having stored thereon a computer program for performing, when executed on a computer, the method according to claim 29.
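
The level-reduced downmix recited in claims 27 to 29 can be illustrated with a minimal sketch. It assumes a five-channel layout (L, R, C, Ls, Rs), a per-frame speech flag supplied by some external detector, and purely illustrative gain values; none of these names or numbers come from the claims, and the room processor that would consume the downmix is omitted.

```python
# Hypothetical sketch: stereo downmix in which the centre channel contributes
# at a lower level than the other channels, and lower still during speech.
# Channel names, gain values and the frame layout are assumptions.

SPEECH_CENTER_GAIN = 0.25      # assumed: stronger reduction while speech is detected
NON_SPEECH_CENTER_GAIN = 0.7   # assumed: milder reduction otherwise


def center_gain(is_speech: bool) -> float:
    """Return the centre-channel gain for one frame."""
    return SPEECH_CENTER_GAIN if is_speech else NON_SPEECH_CENTER_GAIN


def stereo_downmix(frame: dict, is_speech: bool) -> tuple:
    """Mix one frame of L, R, C, Ls, Rs sample lists down to (left, right)."""
    g = center_gain(is_speech)
    left = [l + ls + g * c
            for l, ls, c in zip(frame["L"], frame["Ls"], frame["C"])]
    right = [r + rs + g * c
             for r, rs, c in zip(frame["R"], frame["Rs"], frame["C"])]
    return left, right


# A frame that contains signal only in the centre channel.
frame = {"L": [0.0, 0.0], "R": [0.0, 0.0], "C": [1.0, -1.0],
         "Ls": [0.0, 0.0], "Rs": [0.0, 0.0]}
speech_left, speech_right = stereo_downmix(frame, is_speech=True)
music_left, music_right = stereo_downmix(frame, is_speech=False)
```

With this centre-only frame, the downmix carries the centre signal at a lower level when the frame is flagged as speech than when it is not, which is the behaviour the signal-type detector of claims 28 and 29 is meant to control.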

Same patents:

FIELD: information technology.

SUBSTANCE: method comprises estimating a first wave representation comprising a first wave direction measure characterising the direction of a first wave and a first wave field measure being related to the magnitude of the first wave for the first spatial audio stream, having a first audio representation comprising a measure for pressure or magnitude of a first audio signal and a first direction of arrival of sound; estimating a second wave representation comprising a second wave direction measure characterising the direction of the second wave and a second wave field measure being related to the magnitude of the second wave for the second spatial audio stream, having a second audio representation comprising a measure for pressure or magnitude of a second audio signal and a second direction of arrival of sound; processing the first wave representation and the second wave representation to obtain a merged wave representation comprising a merged wave field measure, a merged direction of arrival measure and a merged diffuseness parameter; processing the first audio representation and the second audio representation to obtain a merged audio representation, and forming a merged audio stream.

EFFECT: high quality of a merged audio stream.

15 cl, 7 dwg

FIELD: physics.

SUBSTANCE: apparatus (100) for generating a multichannel audio signal (142) based on an input audio signal (102) comprises a main signal upmixing means (110), a section (segment) selector (120), a section signal upmixing means (130) and a combiner (140). The main signal upmixing means (110) is configured to provide a main multichannel audio signal (112) based on the input audio signal (102). The section selector (120) is configured to select or not select a section of the input audio signal (102) based on analysis of the input audio signal (102). The selected section of the input audio signal (102), a processed selected section of the input audio signal (102) or a reference signal associated with the selected section of the input audio signal (102) is provided as section signal (122). The section signal upmixing means (130) is configured to provide a section upmix signal (132) based on the section signal (122), and the combiner (140) is configured to overlay the main multichannel audio signal (112) and the section upmix signal (132) to obtain the multichannel audio signal (142).

EFFECT: improved flexibility and sound quality.

12 cl, 10 dwg

FIELD: information technology.

SUBSTANCE: invention relates to a lossless multi-channel audio codec which uses adaptive segmentation with random access point (RAP) and multiple prediction parameter set (MPPS) capability. The lossless audio codec encodes/decodes a lossless variable bit rate (VBR) bit stream with random access point (RAP) capability to initiate lossless decoding at a specified segment within a frame and/or multiple prediction parameter set (MPPS) capability partitioned to mitigate transient effects. This is accomplished with an adaptive segmentation technique that fixes segment start points based on constraints imposed by the existence of a desired RAP and/or detected transient in the frame and selects an optimum segment duration in each frame to reduce encoded frame payload subject to an encoded segment payload constraint. RAP and MPPS are particularly applicable to improve overall performance for longer frame durations.

EFFECT: higher overall encoding efficiency.

48 cl, 23 dwg

FIELD: physics.

SUBSTANCE: method and system for generating output signals for reproduction by two physical speakers in response to input audio signals indicative of sound from multiple source locations including at least two rear locations. Typically, the input signals are indicative of sound from three front locations and two rear locations (left and right surround sources). A virtualiser generates left and right surround output signals suitable for driving front loudspeakers to emit sound that a listener perceives as emitted from rear sources. Typically, the virtualiser generates left and right surround output signals by transforming rear source input signals in accordance with a sound perception simulation function. To ensure that virtual channels are well heard in the presence of other channels, the virtualiser performs dynamic range compression on rear source input signals. The dynamic range compression is preferably performed by amplifying rear source input signals or partially processed versions thereof in a nonlinear way relative to front source input signals.

EFFECT: separating virtual sources while avoiding excessive emphasis of virtual channels.

34 cl, 9 dwg

FIELD: information technologies.

SUBSTANCE: invention discloses a method for reproducing multiple audio channels, according to which out-of-phase information is extracted from the side and/or rear side channels contained in a multi-channel audio signal.

EFFECT: improved reproduction of a multi-channel audio signal.

15 cl, 10 dwg

FIELD: information technologies.

SUBSTANCE: audio decoder for decoding a multi-object audio signal comprises a module for computing a prediction coefficient matrix C, consisting of prediction coefficients, based on object level difference (OLD) data, and means for upmixing based on the prediction coefficients to obtain a first upmix audio signal approximating the audio signal of the first type and/or a second upmix audio signal approximating the audio signal of the second type. The multi-object audio signal has audio signals of a first type and of a second type encoded therein, and consists of a downmix signal (112) and side information. The side information comprises level data for the audio signals of the first and second types at a first predetermined time/frequency resolution.

EFFECT: separation of individual audio objects in mixing and decreasing/increasing channel number.

20 cl, 24 dwg

FIELD: physics, acoustics.

SUBSTANCE: invention relates to processing audio signals, particularly to improving the intelligibility of dialogue and spoken speech, for example in surround-sound entertainment audio. A multichannel audio signal is processed to form a first characteristic and a second characteristic. The first channel is processed to generate a speech probability value. The first characteristic corresponds to a first measured indicator which depends on the signal level in the first channel of the multichannel audio signal containing speech and non-speech audio. The second characteristic corresponds to a second measured indicator which depends on the signal level in the second channel of the multichannel audio signal primarily containing non-speech audio. Further, the first and second characteristics of the multichannel audio signal are compared to generate an attenuation coefficient, wherein the difference between the first measured indicator and the second measured indicator is determined, and the attenuation coefficient is calculated based on the obtained difference and a threshold value. The attenuation coefficient is then adjusted in accordance with the speech probability value and the second channel is attenuated using the adjusted attenuation coefficient.

EFFECT: improved speech perceptibility.

12 cl, 5 dwg
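
The attenuation scheme summarised in this abstract can be sketched as follows. The abstract only states that the attenuation coefficient is calculated from the level difference and a threshold and then adjusted by the speech probability; the specific rule used here (attenuate by the shortfall of the difference below the threshold, scaled linearly by the probability, all in decibels) is an assumption made for illustration, as are the function names.

```python
# Hypothetical sketch of a speech-dependent attenuation of the non-speech
# channel. The dB-domain rule and all names are assumptions, not taken from
# the referenced patent.

def attenuation_db(speech_level_db: float, other_level_db: float,
                   threshold_db: float) -> float:
    """Attenuation derived from the level difference and a threshold."""
    difference = speech_level_db - other_level_db
    # Assumed rule: attenuate only by the amount the difference falls short
    # of the threshold; no attenuation if the speech channel is loud enough.
    return max(0.0, threshold_db - difference)


def adjusted_gain(speech_level_db: float, other_level_db: float,
                  threshold_db: float, speech_probability: float) -> float:
    """Linear gain for the non-speech channel, scaled by speech probability."""
    att = attenuation_db(speech_level_db, other_level_db, threshold_db)
    att *= speech_probability          # no attenuation when speech is unlikely
    return 10.0 ** (-att / 20.0)       # convert dB attenuation to linear gain


def attenuate(samples: list, gain: float) -> list:
    """Apply the adjusted gain to the samples of the non-speech channel."""
    return [gain * s for s in samples]


gain_speech = adjusted_gain(60.0, 55.0, 15.0, speech_probability=0.5)
gain_no_speech = adjusted_gain(60.0, 55.0, 15.0, speech_probability=0.0)
quieter = attenuate([1.0, -1.0], gain_speech)
```

In this example the level difference is 5 dB against a 15 dB threshold, so the unadjusted attenuation is 10 dB; a speech probability of 0.5 halves it to 5 dB, and a probability of 0 leaves the second channel untouched.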

FIELD: radio engineering.

SUBSTANCE: invention relates to a mechanism which monitors the signals of a secondary microphone in a mobile device with multiple microphones in order to warn the user if one or more secondary microphones are covered while the mobile device is in use. In one example, smoothed average power estimates of the secondary microphone may be calculated and compared with an estimate of the minimum noise level of the main microphone. Detection of a covered microphone may be carried out by comparing the smoothed power estimates of the secondary microphone with the estimate of the minimum noise level of the main microphone. In another example, the estimates of the minimum noise level for the signals of the main and secondary microphones may be compared with the difference in sensitivity of the first and second microphones in order to detect whether the secondary microphone is covered. Once a covered microphone is detected, a warning signal may be generated and issued to the user.

EFFECT: improved sound quality of the main audio signal.

37 cl, 9 dwg
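
A minimal sketch of the first detection example in this abstract follows. It assumes that the estimate being smoothed is the mean-square signal power of each frame, that smoothing is a simple exponential average, and that the secondary microphone is flagged as covered when its smoothed estimate falls below the main microphone's minimum-noise-level estimate. The abstract does not specify any of these details, and every name and constant here is hypothetical.

```python
# Hypothetical sketch of covered-microphone detection. The power measure,
# the smoothing rule, the decision rule and all names are assumptions.

def frame_power(samples: list) -> float:
    """Mean-square power of one frame of samples."""
    return sum(s * s for s in samples) / len(samples)


def smooth(previous: float, current: float, alpha: float = 0.9) -> float:
    """Exponentially smoothed estimate (alpha close to 1 means slow updates)."""
    return alpha * previous + (1.0 - alpha) * current


def is_covered(secondary_frames: list, main_noise_floor: float,
               alpha: float = 0.9) -> bool:
    """Flag the secondary microphone as covered when its smoothed power
    falls below the main microphone's minimum-noise-level estimate."""
    smoothed = frame_power(secondary_frames[0])
    for frame in secondary_frames[1:]:
        smoothed = smooth(smoothed, frame_power(frame), alpha)
    return smoothed < main_noise_floor


open_mic = [[0.1, -0.1]] * 5         # frame power 0.01
covered_mic = [[0.001, -0.001]] * 5  # frame power 1e-6
noise_floor = 0.001                  # assumed main-microphone estimate

open_result = is_covered(open_mic, noise_floor)
covered_result = is_covered(covered_mic, noise_floor)
```

In a real device the result would feed the warning signal mentioned in the abstract; the threshold comparison is the only decision logic sketched here.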

FIELD: information technology.

SUBSTANCE: signal processing method involves: receiving a signal and spatial information which includes channel level difference (CLD) information, a channel prediction coefficient (CPC) and interchannel coherence (ICC) information; obtaining mode information for determining the encoding scheme and modification flag information indicating whether the signal has been modified. If the mode information indicates an audio encoding scheme, the signal is decoded according to the audio encoding scheme. If the modification flag information indicates that the signal has been modified, restoration information is obtained after modification, which indicates the value for adjusting the window length applied to the signal; the window length is modified based on the restoration information after modification and the signal is decoded using the window with the modified length. Further, based on extension information, a base extension signal is determined; a downmix extended signal is generated, having a bandwidth which is extended using the base extension signal by restoring the high-frequency region signal; and a multichannel signal is generated by applying the spatial information to the downmix extended signal.

EFFECT: high signal encoding efficiency.

7 cl, 22 dwg

FIELD: information technology.

SUBSTANCE: converter generates parameters which determine the relationship between a first and a second channel for a multichannel audio signal, associated with configuration of a multichannel acoustic system. Level parameters are generated based on object parameters from a plurality of audio objects associated with a downmixing channel, which are generated using audio signals of an object associated with audio objects. Object parameters contain an energy parameter which indicates energy of the audio signal of the object. A parametric generator is used to obtain coherence and level parameters which combine the energy parameter and reproduction parameters of the object, and which depend on the desired reproduction configuration.

EFFECT: less complex application of various systems which are designed to encode and decode parametric multichannel audio streams.

27 cl, 10 dwg

Slit type gas laser // 2273116

FIELD: quantum electronics; may be used in the design of industrial slit type gas lasers.

SUBSTANCE: slit type gas laser has a hermetic chamber, a pair of metallic electrodes, an alternating voltage source, a pair of dielectric barriers, and an optical resonator. The chamber is filled with an active gas substance. The metallic electrodes are mounted within the chamber, each having a surface facing the surface of the other electrode. The alternating voltage source is connected to the electrodes to supply excitation voltage to them. The dielectric barriers are positioned between the metallic electrodes so that the surfaces of the barriers facing each other form a slit discharge gap in which a barrier discharge is formed in the gas substance.

EFFECT: construction of a slit type gas laser excited by a barrier discharge, with dielectric barriers designed to improve heat removal from the active substance of the laser, decrease the voltage drop across the dielectric barriers, allow a larger electrode area, improve the efficiency of laser radiation generation, increase the output power of the laser, and improve the mode composition of its output signal.

8 cl, 4 dwg

FIELD: stereophonic systems with more than two channels.

SUBSTANCE: in accordance with the method, parametric code data is generated for a first subset of sound input channels for a first frequency area by using parametric multi-channel audio encoding; and parametric code data is generated for a second subset of sound input channels for a second frequency area by applying parametric multi-channel audio encoding, where the second frequency area is different from the first frequency area, and the second subset of sound input channels is different from the first subset of sound input channels.

EFFECT: reduced data processing load in encoder and decoder, and also reduced BCC bit code streams.

6 cl, 2 dwg

Audio coding // 2325046

FIELD: audio coding.

SUBSTANCE: in binaural coding, only one monophonic channel is coded. An additional layer contains parameters for the left and right signals. A coder is described which associates transient information extracted from the monophonic coded signal with parametric multichannel layers. Transient locations may also be determined directly from the bitstream or calculated using other coded parameters (e.g., the window switch flag if specified in customer's requirements).

EFFECT: increase in efficiency due to use of transient information in the parametric multichannel layer.

13 cl, 4 dwg

FIELD: physics.

SUBSTANCE: said utility invention relates to sound recording and sound reproduction equipment and may be used for recording and restoration of a multi-dimensional acoustic scene, as well as during its transmission through media. In a recording room, the acoustic axes of all microphones are directed towards the centre of the acoustic scene being recorded, which is located on a vertical plane passing through the performers' front, the acoustic scene centre is located at the listeners' head level and in the middle of the microphones; in the listening room, the acoustic system arrangement on the vertical plane relative to the centre of the acoustic scene being restored is equivalent to the arrangement of microphones in the recording room; during transmission of all acoustic scene signal components from the microphone to the acoustic system and their amplification, output amplitude and phase relationships equivalent to the input ones are provided; displacement of acoustic systems in the vertical plane performs their phasing between one another, and rotation of the acoustic systems converges their axes into the point of acoustic scene restoration. When multi-band acoustic systems are used, the band phase adjustment and acoustic axis angles convergence may be performed online.

EFFECT: possibility to restore amplitude/phase acoustic scene.

2 cl, 2 dwg

FIELD: radio engineering.

SUBSTANCE: invention relates to a device and method for processing a multichannel sound signal in a stereo-compatible format. While processing a multichannel sound signal having at least three initial channels, a first mixing channel and a second mixing channel, which are extracted from the initial channels, are provided (12). Additional channel information is calculated (14) for an initial channel selected from the initial channels in such a way that a mixing channel, or a combined mixing channel including the first and the second mixing channels, when weighted with the additional channel information, yields an approximation of the selected initial channel. The additional channel information and the first and second mixing channels form output data (20), which are to be transmitted to the decoder. If a low-level decoder is used, only the first and second mixing channels are decoded; if a high-level decoder is used, a complete multichannel sound signal is provided based on the mixing channels and the additional channel information.

EFFECT: because the additional channel information occupies few bits and the decoder does not use an inverse matrix, an effective and high-quality multichannel extension for stereo players and multichannel players is obtained.

29 cl, 10 dwg

FIELD: physics, acoustics.

SUBSTANCE: invention relates to multichannel audio signal processing, specifically to reconstruction of a multichannel audio signal using a primary channel and parametric side information. A multichannel synthesiser contains a postprocessor for determining postprocessed reconstruction parameters, or values derived from the reconstruction parameter, for the current time portion of the input signal, so that the postprocessed reconstruction parameter or the postprocessed value differs from the corresponding quantised and inversely quantised reconstruction parameter in that the value of the postprocessed reconstruction parameter or of the derived value is not limited by the quantisation step size. A multichannel reconstruction unit (12) applies the postprocessed reconstruction parameter to reconstruct the multichannel output signal. The technical result is that postprocessing of reconstruction parameters in multichannel coding/decoding enables a low data transfer rate on the one hand and high quality on the other, because strong changes in the reconstructed multichannel output signal caused by a large quantisation step size for the reconstruction parameter, which is preferable because of the required data transfer rate, are reduced.

EFFECT: improved quality of signal transmission.

25 cl, 16 dwg

Audio encoding // 2363116

FIELD: communication devices.

SUBSTANCE: invention relates to encoding a multichannel audio signal, particularly a multichannel signal containing first, second and third signal components. The method of encoding a multichannel audio signal containing at least a first signal component (LF), a second signal component (LR) and a third signal component (RF) involves encoding the first and second signal components using a first parametric encoder (202) to obtain a first encoded signal (L) and a first set (P2) of coding parameters. The first encoded signal and an additional signal (R) are encoded using a second parametric encoder to obtain a second encoded signal (T) and a second set (P1) of coding parameters. The additional signal is obtained from at least the third signal component, and the multichannel audio signal is represented by at least the resultant encoded signal (T), obtained from at least the second encoded signal, the first set of coding parameters and the second set of coding parameters.

EFFECT: more efficient encoding.

13 cl, 13 dwg

FIELD: individual supplies.

SUBSTANCE: invention concerns multichannel sound reproduction systems, particularly the application of psychoacoustic principles in acoustic system design. A surround sound reproduction system uses a number of filters and a system of main and auxiliary speakers to produce the effect of phantom rear surround channels, or phantom surround sound, from an acoustic system or a system of two speakers installed in front of the listener. The acoustic system includes left and right surround input signals and left and right front input signals. Left and right auxiliary speakers and left and right main speakers are positioned in front of the listening position. The distance between the respective main and auxiliary speakers is equal to the distance between the ears of an average human.

EFFECT: surround sound reproduction by speakers installed only in front of the listener.

59 cl, 21 dwg

FIELD: physics; acoustics.

SUBSTANCE: invention relates to coding several signals from audio sources which must be transmitted or stored so that, after the source signals are decoded, they can be mixed to synthesise a wave field or multichannel three-dimensional or stereophonic audio signals. The proposed method codes the signals jointly more efficiently than coding them separately, even when there is no redundancy between the signals. This is possible due to the statistical properties of the signals, the properties of the coding method and spatial hearing. The sum of the signals is transmitted together with the statistical properties which mainly determine the perceptually important spatial features of the final mixed audio signals. The signals are reconstructed in a receiver so that their statistical properties are approximately identical to the corresponding properties of the initial signals from the sources.

EFFECT: more efficient coding when mixing coded signals.

22 cl, 14 dwg

FIELD: physics; communications.

SUBSTANCE: invention relates to multichannel audio technology and, specifically, to applications of multichannel audio in connection with headphone technologies. The device for generating an encoded stereo signal from a multichannel representation includes a multichannel decoder (11), which forms three or more channels from at least one main channel and parametric information. These three or more channels are subjected to headphone signal processing (12) so as to generate an uncoded first stereo channel and an uncoded second stereo channel, which are then input into a stereo encoder (13) so as to generate an encoded stereo file at the output side. The encoded stereo file can be transmitted to any suitable playback device, such as a CD player or portable playback device, so that the user receives not only a normal stereo impression but a multichannel impression as well.

EFFECT: efficient signal processing concept, which allows for multichannel quality playback on headphones in simple playback devices.

12 cl, 11 dwg
