Apparatus for merging spatial audio streams

FIELD: information technology.

SUBSTANCE: method comprises estimating a first wave representation comprising a first wave direction measure characterising the direction of a first wave and a first wave field measure related to the magnitude of the first wave for the first spatial audio stream, which has a first audio representation comprising a measure for pressure or magnitude of a first audio signal and a first direction of arrival of sound; estimating a second wave representation comprising a second wave direction measure characterising the direction of a second wave and a second wave field measure related to the magnitude of the second wave for the second spatial audio stream, which has a second audio representation comprising a measure for pressure or magnitude of a second audio signal and a second direction of arrival of sound; processing the first wave representation and the second wave representation to obtain a merged wave representation comprising a merged wave field measure, a merged direction of arrival measure and a merged diffuseness parameter; processing the first audio representation and the second audio representation to obtain a merged audio representation; and forming a merged audio stream.

EFFECT: high quality of a merged audio stream.

15 cl, 7 dwg

 

The present invention relates to the field of audio processing, in particular spatial sound processing, and combining multiple spatial audio streams.

DirAC (DirAC = Directional Audio Coding), see V. Pulkki and C. Faller, Directional audio coding in spatial sound reproduction and stereo upmixing, in AES 28th International Conference, Pitea, Sweden, June 2006, and V. Pulkki, A method for reproducing natural or modified spatial impression in multichannel listening, Patent WO 2004/077884 A1, September 2004, is an efficient approach to the analysis and reproduction of spatial sound. DirAC uses a parametric representation of sound fields based on the features that are relevant for the perception of spatial sound, namely the direction of arrival (DOA = Direction Of Arrival) and the diffuseness of the sound field in frequency subbands. In fact, DirAC assumes that interaural time differences (ITD = Interaural Time Differences) and interaural level differences (ILD = Interaural Level Differences) are perceived correctly when the DOA of the sound field is reproduced correctly, while interaural coherence (IC = Interaural Coherence) is perceived correctly if the diffuseness is reproduced accurately.

These parameters, namely DOA and diffuseness, constitute side information that accompanies a mono signal in what is referred to as a mono DirAC stream. The DirAC parameters are obtained from a time-frequency representation of the microphone signals. The parameters therefore depend on time and on frequency. On the reproduction side, this information allows accurate spatial rendering. Recreating the spatial sound at a desired listening position requires a multi-loudspeaker setup. However, its geometry is arbitrary. In fact, the loudspeaker signals are determined as a function of the DirAC parameters.

There are substantial differences between DirAC and parametric multichannel audio coding such as MPEG Surround, although they share very similar processing structures, see Lars Villemoes, Juergen Herre, Jeroen Breebaart, Gerard Hotho, Sascha Disch, Heiko Purnhagen, and Kristofer Kjörling, MPEG Surround: The forthcoming ISO standard for spatial audio coding, in AES 28th International Conference, Pitea, Sweden, June 2006. While MPEG Surround is based on a time-frequency analysis of the different loudspeaker channels, DirAC takes as input the channels of coincident microphones, which effectively describe the sound field at one point. Thus, DirAC is also an efficient recording technique for spatial audio.

Another conventional system that deals with spatial audio is SAOC (SAOC = Spatial Audio Object Coding), see Jonas Engdegard, Barbara Resch, Cornelia Falch, Oliver Hellmuth, Johannes Hilpert, Andreas Hoelzer, Leonid Terentiev, Jeroen Breebaart, Jeroen Koppens, Erik Schuijers, and Werner Oomen, Spatial audio object coding (SAOC): the upcoming MPEG standard on parametric object based audio coding, in 124th AES Convention, May 17-20, 2008, Amsterdam, The Netherlands, 2008, currently under standardization in ISO/MPEG.

It builds upon the rendering engine of MPEG Surround and treats different sound sources as objects. This kind of audio coding offers very high efficiency in terms of bitrate and gives unprecedented freedom of interaction at the reproduction side. The approach promises compelling new features and functionality in legacy systems, as well as in other applications.

The object of the present invention is to provide an improved concept for merging spatial audio signals.

The object is achieved by an apparatus for merging according to one of claims 1 or 14 and by a method for merging according to one of claims […] or 15.

Note that the merging would be trivial in the case of a multi-channel DirAC stream, i.e. if the 4 B-format audio channels were available. In fact, the signals from the different sources can be summed directly to obtain the B-format signals of the merged stream. However, if these channels are not available, direct merging is problematic.
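The trivial B-format case mentioned above can be sketched as a channel-wise sum. This is a minimal illustration only; the function name `merge_b_format` and the dictionary layout with keys `w`, `x`, `y`, `z` are assumptions made for the example and are not taken from the description.

```python
def merge_b_format(streams):
    """Sum B-format streams sample by sample.

    Each stream is a dict with keys 'w', 'x', 'y', 'z' (hypothetical layout),
    each mapping to a list of samples of the same length.
    """
    channels = ("w", "x", "y", "z")
    length = len(streams[0]["w"])
    merged = {ch: [0.0] * length for ch in channels}
    for stream in streams:
        for ch in channels:
            if len(stream[ch]) != length:
                raise ValueError("all channels must have the same length")
            for i, sample in enumerate(stream[ch]):
                merged[ch][i] += sample
    return merged


# Two hypothetical two-sample B-format streams
stream_b = {"w": [1.0, 2.0], "x": [0.5, 0.0], "y": [0.0, 0.0], "z": [0.0, 1.0]}
stream_c = {"w": [1.0, -2.0], "x": [0.5, 1.0], "y": [2.0, 0.0], "z": [0.0, 1.0]}
merged = merge_b_format([stream_b, stream_c])
```

A mono DirAC stream carries only one audio channel plus side information, so this direct sum is not available there.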

The present invention is based on the finding that spatial audio signals can be represented by the sum of a wave representation, e.g. a plane wave representation, and a diffuse field representation. A direction can be assigned to the former. When merging several audio streams, embodiments may allow the side information of the merged stream to be obtained, e.g. in terms of a diffuseness and a direction. Embodiments may obtain this information from the wave representations as well as from the input audio streams. When merging several audio streams, each of which can be modelled by a wave part or representation and a diffuse part or representation, the wave parts or components and the diffuse parts or components can be merged separately. Merging the wave parts yields a merged wave part, for which a merged direction can be obtained from the directions of the individual wave part representations. Moreover, the diffuse parts can also be merged separately, and an overall diffuseness parameter can be derived from the merged diffuse part.

Embodiments may provide a method for merging two or more spatial audio signals coded as mono DirAC streams. The resulting merged signal can also be represented as a mono DirAC stream. In embodiments, mono DirAC encoding can serve as a compact way of describing spatial audio, since only a single audio channel needs to be transmitted together with side information.

A possible scenario is a teleconferencing application with more than two parties. For instance, let user A communicate with users B and C, who generate two separate mono DirAC streams. At the location of A, the embodiment allows the streams of users B and C to be merged into a single mono DirAC stream, which can be reproduced with conventional DirAC synthesis. In a network topology that includes a multipoint control unit (MCU), i.e. a hardware and software device that combines audio and video for multipoint conferencing, the merging operation would be performed by the MCU itself, so that user A receives a single mono DirAC stream that already contains the speech of both B and C. Clearly, the DirAC streams to be merged can also be generated synthetically, meaning that appropriate side information can be added to a mono audio signal. In the example just mentioned, user A might receive two audio streams from B and C without any side information. It is then possible to assign a certain direction and diffuseness to each stream, thereby adding the side information needed to construct the DirAC streams, which can then be merged by an embodiment of the invention.

Another possible scenario for embodiments can be found in multiplayer online games and virtual reality applications. In these cases, several streams are generated either by players or by virtual objects. Each stream is characterized by a certain direction of arrival relative to the listener and can therefore be expressed as a DirAC stream. The embodiment can be used to merge the different streams into a single DirAC stream, which is then reproduced at the listener position.

Embodiments of the present invention will be described in detail with reference to the accompanying drawings.

Fig. 1a shows an embodiment of an apparatus for merging.

Fig. 1b shows the pressure and the components of a particle velocity vector in the Gaussian plane for a plane wave.

Fig. 2 shows an embodiment of a DirAC encoder.

Fig. 3 illustrates an ideal merging of audio streams.

Fig. 4 shows the inputs and outputs of an embodiment of a general DirAC merging processing block.

Fig. 5 shows a block diagram of an embodiment.

Fig. 6 shows a flowchart of an embodiment of a method for merging.

Fig. 1a illustrates an embodiment of an apparatus 100 for merging a first spatial audio stream with a second spatial audio stream to obtain a merged audio stream. The embodiment shown in Fig. 1a merges two audio streams, but it is not limited to two; multiple spatial audio streams can be merged in a similar way. The first spatial audio stream and the second spatial audio stream may, for example, be mono DirAC streams, in which case the merged audio stream may also be a single mono DirAC stream. As will be described in detail below, a mono DirAC stream may comprise a pressure signal, e.g. captured by an omnidirectional microphone, and side information. The side information may comprise time-frequency dependent measures of diffuseness and of the direction of arrival of sound. Fig. 1a shows the apparatus 100 for merging the first spatial audio stream with the second spatial audio stream to obtain the merged audio stream, comprising an estimator 120 for estimating a first wave representation comprising a first wave direction measure and a first wave field measure for the first spatial audio stream, which has a first audio representation and a first direction of arrival, and for estimating a second wave representation comprising a second wave direction measure and a second wave field measure for the second spatial audio stream, which has a second audio representation and a second direction of arrival. In embodiments, the first and/or second wave representation may correspond to a plane wave representation.

In the embodiment shown in Fig. 1a, the apparatus 100 further comprises a processor 130 for processing the first wave representation and the second wave representation to obtain a merged wave representation comprising a merged wave field measure and a merged direction of arrival measure, and for processing the first audio representation and the second audio representation to obtain a merged audio representation. The processor 130 is further adapted to provide the merged audio stream comprising the merged audio representation and the merged direction of arrival measure.

The estimator 120 may be adapted to estimate the first wave field measure in terms of a first wave field amplitude, to estimate the second wave field measure in terms of a second wave field amplitude, and to estimate a phase difference between the first wave field measure and the second wave field measure. In embodiments, the estimator may be adapted to estimate a first wave field phase and a second wave field phase. In embodiments, the estimator 120 may estimate only a phase shift or difference between the first and second wave representations, or between the first and second wave field measures, respectively. The processor 130 may then be adapted to process the first wave representation and the second wave representation to obtain a merged wave representation comprising a merged wave field measure, which may comprise a merged wave field amplitude, a merged wave field phase and a merged direction of arrival measure, and to process the first audio representation and the second audio representation to obtain a merged audio representation.

In embodiments of the invention, the processor 130 can be further adapted to process the first wave representation and the second wave representation to obtain the merged wave representation comprising the merged wave field measure, the merged direction of arrival measure and a merged diffuseness parameter, and to provide the merged audio stream comprising the merged audio representation, the merged direction of arrival measure and the merged diffuseness parameter.

In other words, in embodiments of the invention a diffuseness parameter can be determined for the merged audio stream on the basis of the wave representations. The diffuseness parameter may establish a measure of the spatial diffuseness of an audio stream, i.e. a measure of a spatial distribution, for example an angular distribution around a certain direction. One possible scenario is the merging of two synthetic mono signals that carry only directional information.

The processor 130 can be adapted to process the first wave representation and the second wave representation to obtain the merged wave representation, wherein the merged diffuseness parameter is based on the first wave direction measure and on the second wave direction measure. In embodiments, the first and second wave representations may have different directions of arrival, and the merged direction of arrival may lie between them. In this embodiment, although the first and second spatial audio streams may not provide any diffuseness parameters, the merged diffuseness parameter can be determined from the first and second wave representations, i.e. on the basis of the first wave direction measure and the second wave direction measure. For example, if two plane waves impinge from different directions, i.e. the first wave direction measure differs from the second wave direction measure, the merged audio representation may comprise a merged direction of arrival together with a non-vanishing merged diffuseness parameter in order to account for the first and second wave direction measures. In other words, while two focused spatial audio streams may not have or provide any diffuseness, the merged audio stream may have a non-vanishing diffuseness, since it is based on the angular distribution established by the first and second audio streams.

Embodiments may estimate a diffuseness parameter Ψ, for example for a merged DirAC stream. Generally, embodiments may then set or assume the diffuseness parameters of the individual streams to be a fixed value, for instance 0 or 0.1, or a varying value derived from an analysis of the audio representations and/or the direction representations.

In other embodiments, the apparatus 100 for merging the first spatial audio stream with the second spatial audio stream to obtain a merged audio stream may comprise the estimator 120 for estimating a first wave representation comprising a first wave direction measure and a first wave field measure for the first spatial audio stream, which has a first audio representation, a first direction of arrival and a first diffuseness parameter. In other words, the first audio representation may correspond to an audio signal with a certain spatial width, or one that is diffuse to a certain extent. In one embodiment, this may correspond to a scenario in a computer game. A first player may be in a scenario in which the first audio representation represents a sound source such as a passing train, which creates a sound field that is diffuse to a certain extent. In such an embodiment, the sounds caused by the train itself may be diffuse, whereas the sound produced by the train's horn, i.e. the corresponding frequency components, may not be diffuse.

The estimator 120 may further be adapted to estimate the second wave representation comprising the second wave direction measure and the second wave field measure for the second spatial audio stream, which has the second audio representation, the second direction of arrival and a second diffuseness parameter.

In other words, the second audio representation may correspond to an audio signal with a certain spatial width, or one that is diffuse to a certain extent. This may again correspond to the computer game scenario, in which a second sound source is represented by the second audio stream, for example background noise from another train passing on another track. For the first player in the computer game, both sound sources may be diffuse because the player is located at the train station.

In embodiments of the invention, the processor 130 can be adapted to process the first wave representation and the second wave representation to obtain the merged wave representation comprising the merged wave field measure and the merged direction of arrival measure, to process the first audio representation and the second audio representation to obtain the merged audio representation, and to provide the merged audio stream comprising the merged audio representation and the merged direction of arrival measure. In other words, the processor 130 may not determine a merged diffuseness parameter. This may correspond to the sound field experienced by a second player in the computer game described above. The second player may be located farther away from the train station, so that the two sound sources are not perceived as diffuse by the second player but instead appear as rather focused sound sources because of the larger distance.

In embodiments, the apparatus 100 may further comprise a means 110 for determining the first audio representation and the first direction of arrival for the first spatial audio stream, and for determining the second audio representation and the second direction of arrival for the second spatial audio stream. In embodiments, the means 110 for determining may be provided with an audio stream directly, i.e. the determining may amount to reading the audio representation, e.g. in terms of a pressure signal and a DOA, and optionally also the diffuseness parameters, from the side information.

The estimator 120 can be adapted to estimate the first wave representation from the first spatial audio stream, which further has a first diffuseness parameter, and/or to estimate the second wave representation from the second spatial audio stream, which further has a second diffuseness parameter. The processor 130 can be adapted to process the merged wave field measure, the first and second audio representations and the first and second diffuseness parameters to obtain the merged diffuseness parameter for the merged audio stream, and the processor 130 can further be adapted to provide the audio stream comprising the merged diffuseness parameter. The means 110 for determining can be adapted to determine the first diffuseness parameter for the first spatial audio stream and the second diffuseness parameter for the second spatial audio stream.

Put differently, where the first and/or the second spatial audio stream carries a diffuseness parameter, the estimator 120 estimates the corresponding wave representation from that stream, the processor 130 derives the merged diffuseness parameter for the merged audio stream from the merged wave field measure, the first and second audio representations and the first and second diffuseness parameters, and the processor 130 provides the audio stream comprising the merged diffuseness parameter. The means 110 for determining supplies the first diffuseness parameter of the first spatial audio stream and the second diffuseness parameter of the second spatial audio stream.

The processor 130 can be adapted to process the spatial audio streams, the audio representations, the DOA and/or the diffuseness parameters blockwise, i.e. in terms of segments of samples or values. In some embodiments, a segment may comprise a predetermined number of samples corresponding to a frequency representation of a certain frequency band at a certain time of a spatial audio stream. Such a segment may correspond to a mono representation and have an associated DOA and diffuseness parameter.

In embodiments, the means 110 for determining can be adapted to determine the first and second audio representations, the first and second directions of arrival and the first and second diffuseness parameters in a time-frequency dependent way, and/or the processor 130 can be adapted to process the first and second wave representations, diffuseness parameters and/or DOA measures, and/or to determine the merged audio representation, the merged direction of arrival measure and/or the merged diffuseness parameter, in a time-frequency dependent way.

In embodiments of the invention, the first audio representation may correspond to a first mono representation, the second audio representation may correspond to a second mono representation, and the merged audio representation may correspond to a merged mono representation. In other words, the audio representations may correspond to a single audio channel.

In embodiments of the invention, the means 110 for determining can be adapted to determine, and/or the processor can be adapted to process, the first and second mono representations, the first and second DOA and the first and second diffuseness parameters, and the processor 130 may provide the merged mono representation, the merged DOA measure and/or the merged diffuseness parameter in a time-frequency dependent way. In embodiments, the spatial audio streams may already be provided in terms of, for example, a DirAC representation; the means 110 for determining may then be adapted to determine the first and second mono representations, the first and second DOA and the first and second diffuseness parameters simply by extracting them from the first and second audio streams, e.g. from the DirAC side information.

In the following, an embodiment is described in detail, starting with the notation and the data model. In embodiments, the means 110 for determining can be adapted to determine the first and second audio representations, and/or the processor 130 can be adapted to provide the merged mono representation, in terms of a pressure signal p(t) or a time-frequency transformed pressure signal P(k,n), where k denotes the frequency index and n denotes the time index.

In embodiments, the first and second wave direction measures, as well as the merged direction of arrival measure, may correspond to any directional quantity, such as a vector, an angle, a direction (azimuth), etc., and they may be derived from any directional measure representing an audio component, such as an intensity vector, a particle velocity vector, etc. The first and second wave field measures, as well as the merged wave field measure, may correspond to any physical quantity describing an audio component, which can be real or complex valued and may correspond to a pressure signal, a particle velocity amplitude or magnitude, loudness, etc. Moreover, the measures may be considered in the time and/or frequency domain.

Embodiments may be based on the estimation of a plane wave representation for the wave field measures of the wave representations of the input streams, which can be carried out by the estimator 120 in Fig. 1a. In other words, the wave field measure may be modelled using a plane wave representation. In general, there are several equivalent exhaustive (i.e. complete) descriptions of a plane wave, or of waves in general. In the following, a mathematical description is introduced for computing diffuseness parameters and directions of arrival or direction measures for the different components. Although only a few descriptions relate directly to physical quantities such as pressure or particle velocity, there is potentially an infinite number of different ways to describe wave representations, of which one is presented below as an example that is not meant to limit embodiments of the present invention in any way.

To illustrate the variety of possible descriptions, consider two real numbers a and b. The information contained in a and b may be transferred by sending c and d, where

$$\begin{bmatrix} c \\ d \end{bmatrix} = \Omega \begin{bmatrix} a \\ b \end{bmatrix},$$

and Ω is a known 2×2 matrix. This example considers only linear combinations, although in general any combination, including a non-linear one, is conceivable.
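As a minimal sketch of this idea, the following code sends c and d instead of a and b and recovers the original pair. It assumes that Ω is invertible, which the text does not state explicitly; the function names `encode` and `decode` and the example matrix are hypothetical.

```python
def encode(a, b, omega):
    """Map (a, b) to (c, d) with a known 2x2 matrix omega given as nested tuples."""
    (w11, w12), (w21, w22) = omega
    return w11 * a + w12 * b, w21 * a + w22 * b


def decode(c, d, omega):
    """Recover (a, b) from (c, d); assumes omega is invertible."""
    (w11, w12), (w21, w22) = omega
    det = w11 * w22 - w12 * w21
    if det == 0:
        raise ValueError("omega must be invertible to recover a and b")
    a = (w22 * c - w12 * d) / det
    b = (-w21 * c + w11 * d) / det
    return a, b


# Hypothetical example: send the sum and the difference instead of a and b
OMEGA = ((1.0, 1.0), (1.0, -1.0))
c, d = encode(3.0, 5.0, OMEGA)
a, b = decode(c, d, OMEGA)
```

Both pairs carry the same information, which is the sense in which different wave descriptions are called equivalent here.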

In the following, scalars are represented by lowercase letters a, b, c, while column vectors are represented by bold lowercase letters a, b, c. The superscript (·)^T denotes the transpose, whereas an overbar and (·)^* denote complex conjugation. The complex phasor notation is distinguished from the temporal one. For instance, the pressure p(t), which is a real number and from which a possible wave field measure can be derived, can be expressed by means of the phasor P, which is a complex number and from which another possible wave field measure can be derived, by

$$p(t) = \mathrm{Re}\{P\,e^{j\omega t}\},$$

where Re{·} denotes the real part and ω = 2πf is the angular frequency. Furthermore, capital letters used for physical quantities denote phasors in the following. To avoid confusion in the introductory example below, note that all quantities with the subscript "PW" refer to plane waves.
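The relation between the phasor and the temporal notation, p(t) = Re{P·e^{jωt}} with ω = 2πf, can be illustrated with a short sketch. The function name and the numerical values are chosen for illustration only.

```python
import cmath
import math


def pressure_from_phasor(phasor: complex, freq_hz: float, t: float) -> float:
    """Evaluate p(t) = Re{P * exp(j*omega*t)} with omega = 2*pi*f."""
    omega = 2.0 * math.pi * freq_hz
    return (phasor * cmath.exp(1j * omega * t)).real


# Hypothetical phasor with amplitude 2 and phase pi/3, at 100 Hz
P = cmath.rect(2.0, math.pi / 3)
samples = [pressure_from_phasor(P, 100.0, n / 8000.0) for n in range(4)]
```

The real signal is |P|·cos(ωt + ∠P), so the magnitude of the phasor is the amplitude and its angle is the phase of the pressure.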

For an ideal monochromatic plane wave, the particle velocity vector U_PW can be written as

$$\mathbf{U}_{PW} = \frac{P_{PW}}{\rho_0 c}\,\mathbf{e}_d,$$

where the unit vector e_d points in the direction of propagation of the wave, e.g. corresponding to a direction measure. It can be shown that

$$\mathbf{I}_a = \frac{1}{2\rho_0 c}\,|P_{PW}|^2\,\mathbf{e}_d,$$

$$E = \frac{1}{2\rho_0 c^2}\,|P_{PW}|^2,$$

$$\Psi = 0,$$

where I_a denotes the active intensity [sound intensity is a vector quantity equal to the flow of sound energy passing through a unit area per unit time in the direction of the local particle velocity; see GOST 30457.3-2006, Acoustics. Determination of sound power levels of noise sources using sound intensity], ρ_0 denotes the air density, c denotes the speed of sound, E denotes the sound field energy, and Ψ denotes the diffuseness. It is interesting to note that, since all components of e_d are real numbers, the components of U_PW are all in phase with P_PW. Fig. 1b illustrates an exemplary U_PW and P_PW in the Gaussian plane. As just mentioned, all components of U_PW share the same phase as P_PW, namely θ. Their magnitudes, on the other hand, are related by

$$\frac{|P_{PW}|}{\rho_0 c} = \sqrt{|U_x|^2 + |U_y|^2 + |U_z|^2} = \|\mathbf{U}_{PW}\|.$$
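The plane-wave relations can be checked numerically. The sketch below assumes U_PW = P_PW/(ρ_0·c)·e_d, the active intensity ½·Re{P·conj(U)}, Fahy's energy density E = ρ_0/4·‖U‖² + |P|²/(4·ρ_0·c²) and the diffuseness Ψ = 1 − ‖I_a‖/(c·E). The values of ρ_0 and c and the function name `plane_wave` are assumptions made for the example.

```python
import cmath
import math

RHO0 = 1.2   # assumed air density in kg/m^3
C = 343.0    # assumed speed of sound in m/s


def plane_wave(p_phasor, direction):
    """Return (U, I_a, E, Psi) for one ideal monochromatic plane wave."""
    norm = math.sqrt(sum(x * x for x in direction))
    e_d = [x / norm for x in direction]                    # unit propagation vector
    u = [p_phasor / (RHO0 * C) * x for x in e_d]           # particle velocity phasors
    i_a = [0.5 * (p_phasor * comp.conjugate()).real for comp in u]
    energy = (RHO0 / 4.0 * sum(abs(comp) ** 2 for comp in u)
              + abs(p_phasor) ** 2 / (4.0 * RHO0 * C ** 2))
    i_norm = math.sqrt(sum(x * x for x in i_a))
    diffuseness = 1.0 - i_norm / (C * energy)
    return u, i_a, energy, diffuseness


# Hypothetical wave: pressure amplitude 1.5, phase 0.7 rad, direction (1, 2, 2)
p = cmath.rect(1.5, 0.7)
u, i_a, energy, psi = plane_wave(p, (1.0, 2.0, 2.0))
```

For a single plane wave the computed diffuseness is zero up to rounding, and every velocity component has the same phase as the pressure.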

Even when several sound sources are present, the pressure and the particle velocity can still be expressed as a sum of individual components. Without loss of generality, the case of two sound sources is considered. The extension to a larger number of sources is straightforward.

Let P^(1) and P^(2) be the pressures that would have been recorded for the first and second source, respectively, e.g. representing the first and second wave field measures.

Similarly, let U^(1) and U^(2) be the complex particle velocity vectors. Given the linearity of the propagation phenomenon, when the sources are active at the same time, the observed pressure P and particle velocity U are

$$P = P^{(1)} + P^{(2)},$$

$$\mathbf{U} = \mathbf{U}^{(1)} + \mathbf{U}^{(2)}.$$

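The active intensities of the individual sources are therefore

$$\mathbf{I}_a^{(1)} = \frac{1}{2}\,\mathrm{Re}\{P^{(1)}\,\overline{\mathbf{U}^{(1)}}\},$$

$$\mathbf{I}_a^{(2)} = \frac{1}{2}\,\mathrm{Re}\{P^{(2)}\,\overline{\mathbf{U}^{(2)}}\}.$$

Thus,

$$\mathbf{I}_a = \mathbf{I}_a^{(1)} + \mathbf{I}_a^{(2)} + \frac{1}{2}\,\mathrm{Re}\{P^{(1)}\,\overline{\mathbf{U}^{(2)}} + P^{(2)}\,\overline{\mathbf{U}^{(1)}}\}.$$

Note that, apart from special cases,

$$\mathbf{I}_a \neq \mathbf{I}_a^{(1)} + \mathbf{I}_a^{(2)}.$$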
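The effect of the cross terms can be seen in a small numerical sketch. It assumes plane waves with U = P/(ρ_0·c)·e_d and the active intensity ½·Re{P·conj(U)}; the constants and function names are hypothetical.

```python
import cmath
import math

RHO0, C = 1.2, 343.0  # assumed air density and speed of sound


def velocity(p, e_d):
    """Particle velocity phasors of a plane wave with pressure phasor p."""
    return [p / (RHO0 * C) * x for x in e_d]


def active_intensity(p, u):
    """Active intensity 0.5 * Re{P * conj(U)}, component by component."""
    return [0.5 * (p * comp.conjugate()).real for comp in u]


def superpose(p1, e1, p2, e2):
    """Return (total, i1, i2, cross) for two simultaneously active plane waves."""
    u1, u2 = velocity(p1, e1), velocity(p2, e2)
    p = p1 + p2
    u = [a + b for a, b in zip(u1, u2)]
    total = active_intensity(p, u)          # intensity of the summed field
    i1 = active_intensity(p1, u1)
    i2 = active_intensity(p2, u2)
    cross = [0.5 * (p1 * b.conjugate() + p2 * a.conjugate()).real
             for a, b in zip(u1, u2)]
    return total, i1, i2, cross


# Two hypothetical waves along x and y with a 45 degree phase offset
total, i1, i2, cross = superpose(1.0 + 0j, (1.0, 0.0, 0.0),
                                 cmath.rect(0.5, math.pi / 4), (0.0, 1.0, 0.0))
```

In this example the cross term is non-zero, so simply adding the two individual intensity vectors would give the wrong merged intensity.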

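When the two waves, e.g. plane waves, are exactly in phase (although travelling in different directions),

$$P^{(2)} = \gamma \cdot P^{(1)},$$

where γ is a real number. It follows that

$$\mathbf{I}_a^{(1)} = \frac{1}{2}\,\mathrm{Re}\{P^{(1)}\,\overline{\mathbf{U}^{(1)}}\},$$

$$\mathbf{I}_a^{(2)} = \frac{1}{2}\,\mathrm{Re}\{P^{(2)}\,\overline{\mathbf{U}^{(2)}}\},$$

and

$$\mathbf{I}_a = (1+\gamma)\,\mathbf{I}_a^{(1)} + \left(1+\frac{1}{\gamma}\right)\mathbf{I}_a^{(2)}.$$

When the waves are in phase and travel in the same direction, they can clearly be interpreted as a single wave.

For γ = −1 and an arbitrary direction, the pressure vanishes and there is no flow of energy, i.e. ‖I_a‖ = 0.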

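When the waves are exactly in quadrature,

$$P^{(2)} = \gamma \cdot e^{j\pi/2}\,P^{(1)},$$

$$\mathbf{U}^{(2)} = \gamma \cdot e^{j\pi/2}\,\mathbf{U}^{(1)},$$

$$U_x^{(2)} = \gamma \cdot e^{j\pi/2}\,U_x^{(1)},$$

$$U_y^{(2)} = \gamma \cdot e^{j\pi/2}\,U_y^{(1)},$$

$$U_z^{(2)} = \gamma \cdot e^{j\pi/2}\,U_z^{(1)},$$

where γ is a real number. It follows that

$$\mathbf{I}_a^{(1)} = \frac{1}{2}\,\mathrm{Re}\{P^{(1)}\,\overline{\mathbf{U}^{(1)}}\},$$

$$\mathbf{I}_a^{(2)} = \frac{1}{2}\,\mathrm{Re}\{P^{(2)}\,\overline{\mathbf{U}^{(2)}}\},$$

and

$$\mathbf{I}_a = \mathbf{I}_a^{(1)} + \mathbf{I}_a^{(2)}.$$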

Using these equations, it can easily be shown that for a plane wave each of the exemplary sets of quantities U, P and e_d, or P and I_a, may represent an equivalent and exhaustive description, since all other physical quantities can be derived from them; i.e. any combination of them may be used in embodiments in place of the wave field measure or the wave direction measure. For example, in embodiments the 2-norm of the active intensity vector may be used as the wave field measure.

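A minimum description can be identified that suffices to perform the merging as specified by the embodiments. The pressure and particle velocity vectors for the i-th plane wave can be expressed as

$$P^{(i)} = |P^{(i)}|\,e^{j\angle P^{(i)}},$$

$$\mathbf{U}^{(i)} = \frac{|P^{(i)}|}{\rho_0 c}\,\mathbf{e}_d^{(i)}\,e^{j\angle P^{(i)}},$$

where ∠P^(i) represents the phase of P^(i). The merged intensity vector, i.e. the merged wave field measure and the merged direction of arrival measure, can be expressed in terms of these variables as

$$\mathbf{I}_a = \frac{1}{2\rho_0 c}|P^{(1)}|^2\,\mathbf{e}_d^{(1)} + \frac{1}{2\rho_0 c}|P^{(2)}|^2\,\mathbf{e}_d^{(2)} + \frac{1}{2}\,\mathrm{Re}\left\{|P^{(1)}|\,e^{j\angle P^{(1)}}\,\frac{|P^{(2)}|}{\rho_0 c}\,\mathbf{e}_d^{(2)}\,e^{-j\angle P^{(2)}}\right\} + \frac{1}{2}\,\mathrm{Re}\left\{|P^{(2)}|\,e^{j\angle P^{(2)}}\,\frac{|P^{(1)}|}{\rho_0 c}\,\mathbf{e}_d^{(1)}\,e^{-j\angle P^{(1)}}\right\}.$$

Note that the first two summands are I_a^(1) and I_a^(2). The equation can be further simplified to

$$\mathbf{I}_a = \frac{1}{2\rho_0 c}|P^{(1)}|^2\,\mathbf{e}_d^{(1)} + \frac{1}{2\rho_0 c}|P^{(2)}|^2\,\mathbf{e}_d^{(2)} + \frac{1}{2\rho_0 c}|P^{(1)}||P^{(2)}|\,\mathbf{e}_d^{(2)}\cos(\angle P^{(1)} - \angle P^{(2)}) + \frac{1}{2\rho_0 c}|P^{(2)}||P^{(1)}|\,\mathbf{e}_d^{(1)}\cos(\angle P^{(2)} - \angle P^{(1)}).$$

Substituting

$$\Delta^{(1,2)} = |\angle P^{(2)} - \angle P^{(1)}|$$

yields

$$\mathbf{I}_a = \frac{1}{2\rho_0 c}\left(|P^{(1)}|^2\,\mathbf{e}_d^{(1)} + |P^{(2)}|^2\,\mathbf{e}_d^{(2)} + |P^{(1)}||P^{(2)}|\cos(\Delta^{(1,2)})\,(\mathbf{e}_d^{(1)} + \mathbf{e}_d^{(2)})\right).$$

This equation shows that the information needed to compute I_a can be reduced to |P^(i)|, e_d^(i) and |∠P^(2) − ∠P^(1)|. In other words, the representation of each wave, e.g. each plane wave, can be reduced to the amplitude of the wave and its direction of propagation. In addition, the relative phase difference between the waves may be taken into account. When more than two waves are to be merged, the phase differences between all pairs of waves may be considered. Clearly, there are several other descriptions that contain the same information. For example, knowing the intensity vectors and the phase difference would be equivalent.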
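The minimum description can be turned into a small sketch that merges two plane waves from their amplitudes, propagation directions and phase difference. It assumes the closed form I_a = 1/(2·ρ_0·c)·(|P1|²·e1 + |P2|²·e2 + |P1|·|P2|·cos(Δ)·(e1 + e2)) and takes the direction of arrival as the negated unit intensity vector; the function names and constants are hypothetical.

```python
import math


def merged_intensity(amp1, dir1, amp2, dir2, phase_diff, rho0=1.2, c=343.0):
    """Merged active intensity of two plane waves.

    amp1, amp2: pressure amplitudes; dir1, dir2: unit propagation vectors;
    phase_diff: relative phase difference in radians.
    rho0 and c are assumed values for air.
    """
    scale = 1.0 / (2.0 * rho0 * c)
    cross = amp1 * amp2 * math.cos(phase_diff)
    return [scale * (amp1 ** 2 * d1 + amp2 ** 2 * d2 + cross * (d1 + d2))
            for d1, d2 in zip(dir1, dir2)]


def merged_doa(intensity):
    """Direction of arrival as the negated unit intensity vector, or None if undefined."""
    norm = math.sqrt(sum(x * x for x in intensity))
    if norm == 0.0:
        return None  # no net energy flow, direction undefined
    return [-x / norm for x in intensity]


SCALE = 1.0 / (2.0 * 1.2 * 343.0)
X, Y = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)

quadrature = merged_intensity(1.0, X, 2.0, Y, math.pi / 2)  # intensities simply add
same_wave = merged_intensity(1.0, X, 1.0, X, 0.0)           # acts like one wave of amplitude 2
cancelled = merged_intensity(1.0, X, 1.0, Y, math.pi)       # gamma = -1, no energy flow
doa = merged_doa(quadrature)
```

The three calls correspond to the special cases discussed earlier: waves in quadrature, waves in phase travelling in the same direction, and the case γ = −1.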

In general, an energetic description of the plane waves may not be sufficient to carry out the merging correctly. The merging could be approximated by assuming that the waves are in quadrature. An exhaustive description of the waves (i.e. all physical quantities of the waves are known) can be sufficient for the merging, but may not be necessary in all embodiments. In embodiments that carry out the merging correctly, the amplitude of each wave, the direction of propagation of each wave and the relative phase difference between each pair of waves to be merged may be taken into account.

The means 110 for determining can be adapted to provide, and/or the processor 130 can be adapted to process, the first and second directions of arrival, and/or to provide the merged direction of arrival measure, in terms of a unit vector e_DOA(k,n), with

$$\mathbf{e}_{DOA}(k,n) = -\mathbf{e}_I(k,n)$$

and

$$\mathbf{I}_a(k,n) = \|\mathbf{I}_a(k,n)\| \cdot \mathbf{e}_I(k,n),$$

where

$$\mathbf{I}_a(k,n) = \frac{1}{2}\,\mathrm{Re}\{P(k,n) \cdot \mathbf{U}^*(k,n)\}$$

and

$$\mathbf{U}(k,n) = [U_x(k,n),\,U_y(k,n),\,U_z(k,n)]^T$$

denotes the time-frequency transform of the particle velocity vector u(t) = [u_x(t), u_y(t), u_z(t)]^T. In other words, let p(t) and u(t) = [u_x(t), u_y(t), u_z(t)]^T be the pressure and the particle velocity vector, respectively, at a specific point in space, where [·]^T denotes the transpose. These signals can be transformed into the time-frequency domain by means of a suitable filter bank, e.g. a short-time Fourier transform (STFT), as suggested for example by V. Pulkki and C. Faller, Directional audio coding: Filterbank and STFT-based design, in 120th AES Convention, May 20-23, 2006, Paris, France, May 2006.

Let P(k,n) and U(k,n) = [U_x(k,n), U_y(k,n), U_z(k,n)]^T denote the transformed signals, where k and n are indices for frequency (or frequency band) and time, respectively. The active intensity vector I_a(k,n) can be defined as

$$\mathbf{I}_a(k,n) = \frac{1}{2}\,\mathrm{Re}\{P(k,n) \cdot \mathbf{U}^*(k,n)\},$$

where (·)^* denotes complex conjugation and Re{·} extracts the real part. The active intensity vector expresses the net flow of energy that characterizes the sound field, see F.J. Fahy, Sound Intensity, Essex: Elsevier Science Publishers Ltd., 1989, and can thus be used as a wave field measure.

Let c denote the speed of sound in the medium considered and E the sound field energy as defined by F.J. Fahy,

$$E(k,n) = \frac{\rho_0}{4}\,\|\mathbf{U}(k,n)\|^2 + \frac{1}{4\rho_0 c^2}\,|P(k,n)|^2,$$

where ‖·‖ computes the 2-norm. In the following, the content of a mono DirAC stream is described in detail.

The mono DirAC stream may consist of a mono signal p(t) and side information. This side information may include the time-frequency dependent direction of arrival and a time-frequency dependent measure of diffuseness. The former is denoted by the unit vector eDOA(k,n), which points in the direction from which the sound arrives. The latter, the diffuseness, is denoted by Ψ(k,n).

In embodiments of the invention, the means 110 for determining and/or the processor 130 can be adapted to provide/process the first and second DOAs and/or the merged DOA in terms of the unit vector eDOA(k,n). The direction of arrival can be obtained as

eDOA(k,n)=-eI(k,n),

where the unit vector eI(k,n) indicates the direction in which the active intensity vector points, namely

$$ I_a(k,n) = \|I_a(k,n)\| \cdot e_I(k,n), \qquad e_I(k,n) = \frac{I_a(k,n)}{\|I_a(k,n)\|}. \qquad (3) $$

Alternatively, the DOA can be expressed in terms of azimuth and elevation angles in a spherical coordinate system. For example, if φ and ϑ are the azimuth and elevation angles, respectively, then

$$ e_{DOA}(k,n) = [\cos\varphi \cos\vartheta,\; \sin\varphi \cos\vartheta,\; \sin\vartheta]^T. \qquad (4) $$

In embodiments of the invention, the means 110 for determining and/or the processor 130 can be adapted to provide/process the first and second diffuseness parameters and/or the merged diffuseness parameter Ψ(k,n) in a time-frequency dependent manner. The means 110 for determining can be adapted to provide the first and/or second diffuseness parameters, and/or the processor 130 can be adapted to provide the merged diffuseness parameter, in terms of

$$ \Psi(k,n) = 1 - \frac{\|\langle I_a(k,n)\rangle_t\|}{c\,\langle E(k,n)\rangle_t}, \qquad (5) $$

where <·>t indicates averaging over time.
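The analysis described so far can be sketched for a single frequency bin as follows. This is only an illustrative sketch, not the patented implementation: it assumes the active intensity is 1/2·Re{P·U*}, the energy is ρ0/4·||U||² + |P|²/(4ρ0c²), and the diffuseness is 1 − ||<Ia>t|| / (c·<E>t). The function names, the default values of rho0 and c, and the handling of silent input are hypothetical choices made for the example.

```python
import math

def active_intensity(P, U):
    """I_a = 1/2 * Re{P * conj(U)} for one time-frequency tile (U has 3 components)."""
    return [0.5 * (P * u.conjugate()).real for u in U]

def energy(P, U, rho0=1.2, c=343.0):
    """E = rho0/4 * ||U||^2 + |P|^2 / (4 * rho0 * c^2)."""
    u_norm_sq = sum(abs(u) ** 2 for u in U)
    return rho0 / 4.0 * u_norm_sq + abs(P) ** 2 / (4.0 * rho0 * c ** 2)

def dirac_parameters(P_frames, U_frames, rho0=1.2, c=343.0):
    """Return (e_DOA, diffuseness) for one frequency bin, averaging over the given frames."""
    n = len(P_frames)
    I_mean = [0.0, 0.0, 0.0]
    E_mean = 0.0
    for P, U in zip(P_frames, U_frames):
        I = active_intensity(P, U)
        I_mean = [m + x / n for m, x in zip(I_mean, I)]
        E_mean += energy(P, U, rho0, c) / n
    norm = math.sqrt(sum(x * x for x in I_mean))
    if E_mean == 0.0 or norm == 0.0:
        # Assumption for this sketch: no net energy flow means "no direction, fully diffuse".
        return None, 1.0
    e_doa = [-x / norm for x in I_mean]   # e_DOA = -e_I
    psi = 1.0 - norm / (c * E_mean)
    return e_doa, psi

# Example: a single plane wave arriving from the +x direction, U = -P/(rho0*c) * e_DOA.
rho0, c = 1.2, 343.0
P_frames = [1.0 + 0.5j, 0.8 - 0.2j]
U_frames = [[-P / (rho0 * c), 0j, 0j] for P in P_frames]
e_doa, psi = dirac_parameters(P_frames, U_frames, rho0, c)
```

For the single plane wave in the example, the sketch is expected to return a DOA along +x and a diffuseness close to zero, which matches the interpretation of Ψ as the non-directional share of the field.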

In practice, there are various strategies for obtaining P(k,n) and U(k,n). One possibility is to use a B-format microphone, which delivers four signals, namely w(t), x(t), y(t) and z(t). The first one, w(t), corresponds to the pressure reading of an omnidirectional microphone. The latter three are pressure readings of microphones with figure-of-eight pickup patterns directed along the three axes of a Cartesian coordinate system. These signals are also proportional to the particle velocity. Thus, in some embodiments

P(k,n)=W(k,n),

$$ U(k,n) = -\frac{1}{\sqrt{2}\,\rho_0 c}\,[X(k,n), Y(k,n), Z(k,n)]^T, \qquad (6) $$

where W(k,n), X(k,n), Y(k,n) and Z(k,n) are the transformed B-format signals. Note that the factor √2 in (6) results from the convention used in the definition of B-format signals, see Michael Gerzon, Surround sound psychoacoustics. In Wireless World, volume 80, pages 483-486, December 1974.
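As a minimal sketch of this mapping, the following function converts one time-frequency tile of B-format signals into a pressure and a particle velocity vector. It assumes the scaling U = −[X, Y, Z] / (√2·ρ0·c), i.e. the B-format convention in which W is attenuated by 3 dB relative to X, Y and Z; the function name and the default values of rho0 and c are hypothetical.

```python
import math

def bformat_to_pu(W, X, Y, Z, rho0=1.2, c=343.0):
    """Map transformed B-format signals of one tile to (P, U).

    Assumed convention: P = W and U = -[X, Y, Z] / (sqrt(2) * rho0 * c).
    """
    gain = -1.0 / (math.sqrt(2.0) * rho0 * c)
    return W, [gain * X, gain * Y, gain * Z]

# Example: a plane wave from +x has X = sqrt(2) * W under the assumed convention.
P, U = bformat_to_pu(1.0 + 0j, math.sqrt(2.0) + 0j, 0j, 0j)
```

Under these assumptions the example yields U = −P/(ρ0·c) along x, the plane-wave relation between pressure and particle velocity for sound arriving from +x.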

Alternatively, P(k,n) and U(k,n) can be estimated by means of an omnidirectional microphone array, as suggested in J.Merimaa, Applications of a 3-D microphone array, in 112th AES Convention, Paper 5501, Munich, May 2002. The processing steps described above are also illustrated in figure 2.

Figure 2 shows a DirAC encoder 200, which is adapted to compute a mono audio channel and side information from suitable input signals, e.g. microphone signals. In other words, figure 2 illustrates a DirAC encoder 200 for determining the diffuseness and the direction of arrival from suitable microphone signals. The DirAC encoder 200 comprises a P/U estimation unit 210. The P/U estimation unit receives the microphone signals as input information, on which the P/U estimation is based. Since all information is available, the P/U estimation is straightforward according to the above equations. An energetic analysis stage 220 enables estimation of the direction of arrival and the diffuseness parameter of the merged stream.

In embodiments, audio streams other than mono DirAC audio streams can be merged. In other words, in embodiments the means 110 for determining can be adapted to convert any other audio stream, for example stereo or surround audio data, into the first and second audio streams. When embodiments of the invention merge DirAC streams other than mono, different cases can be distinguished. If the DirAC stream carries B-format signals as audio signals, the particle velocity vectors are known and the merging is trivial, as detailed below. When the DirAC stream carries audio signals other than B-format signals or a mono omnidirectional signal, the means 110 for determining can be adapted to first convert it into two mono DirAC streams, and an embodiment of the invention can then merge the converted streams. Thus, in embodiments, the first and second spatial audio streams can represent converted mono DirAC streams.

Embodiments can combine the available audio channels to approximate an omnidirectional pickup pattern. For example, in the case of a stereo DirAC stream, this can be achieved by summing the left channel L and the right channel R.

In the following, the physics of a field generated by multiple sound sources is illustrated. When multiple sound sources are present, the pressure and the particle velocity can still be expressed as a sum of individual components.

Let P(i)(k,n) and U(i)(k,n) be the pressure and the particle velocity that would have been recorded for the i-th source if it were the only source. Assuming linearity of the propagation, when N sources act together, the observed pressure P(k,n) and particle velocity U(k,n) are

$$ P(k,n) = \sum_{i=1}^{N} P^{(i)}(k,n) \qquad (7) $$

and

$$ U(k,n) = \sum_{i=1}^{N} U^{(i)}(k,n). \qquad (8) $$

The previous equations show that if both the pressure and the particle velocity are known, obtaining the merged mono DirAC stream is straightforward. This situation is depicted in figure 3. Figure 3 illustrates an embodiment of an optimized or possibly ideal merging of multiple audio streams. Figure 3 assumes that all pressure and particle velocity vectors are known. Unfortunately, such a trivial merging is not possible for mono DirAC streams, for which the particle velocity U(i)(k,n) is unknown.
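The ideal case, in which every stream provides both pressure and particle velocity, amounts to a plain summation per time-frequency tile. A minimal sketch, with a hypothetical function name and made-up input values:

```python
def merge_pu(pressures, velocities):
    """Sum per-source pressures and particle velocity vectors of one tile.

    pressures:  list of complex P^(i)
    velocities: list of 3-component lists U^(i)
    """
    P = sum(pressures)
    U = [sum(u[axis] for u in velocities) for axis in range(3)]
    return P, U

# Example with two sources.
P, U = merge_pu([1 + 1j, 2 - 1j], [[1j, 0j, 0j], [0j, 2 + 0j, 0j]])
```

The merged P and U can then be passed to the same energetic analysis that is used for a single stream. For mono DirAC streams this route is not available, because U(i)(k,n) is not transmitted.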

Figure 3 illustrates N streams, for each of which a P/U estimation is carried out in blocks 301, 302-30N. The outputs of the P/U estimation blocks are the corresponding time-frequency representations of the individual signals P(i)(k,n) and U(i)(k,n), which can then be combined according to equations (7) and (8) above by means of the two adders 310 and 311. Once the merged P(k,n) and U(k,n) have been obtained, an energetic analysis stage 320 can directly determine the diffuseness parameter Ψ(k,n) and the direction of arrival eDOA(k,n).

Figure 4 illustrates an embodiment for merging multiple mono DirAC streams. As described above, the N streams are to be merged by the embodiment of the apparatus 100 shown in figure 4. As shown in figure 4, each of the N input streams can be represented by a time-frequency dependent mono representation P(i)(k,n), a direction of arrival $e_{DOA}^{(1)}(k,n)$ and a diffuseness Ψ(1)(k,n), where the superscript (1) denotes the first stream. Figure 4 also shows the corresponding representation of the merged stream.

The task of merging two or more mono DirAC streams is depicted in figure 4. Since the pressure P(k,n) can be obtained simply by summing the known quantities P(i)(k,n), as in (7), the problem of merging two or more mono DirAC streams reduces to the determination of eDOA(k,n) and Ψ(k,n). The following embodiment is based on the assumption that the field of each source consists of a plane wave summed with a diffuse field. Thus, the pressure and the particle velocity of the i-th source can be expressed as

$$ P^{(i)}(k,n) = P_{PW}^{(i)}(k,n) + P_{diff}^{(i)}(k,n), \qquad (9) $$

$$ U^{(i)}(k,n) = U_{PW}^{(i)}(k,n) + U_{diff}^{(i)}(k,n), \qquad (10) $$

where the subscripts "PW" and "diff" denote the plane wave and the diffuse field, respectively. In the following, an embodiment is presented with a strategy for estimating the direction of arrival of sound and the diffuseness. The corresponding processing steps are shown in figure 5.

Figure 5 illustrates another apparatus 500 for merging multiple audio streams, which is detailed below. Figure 5 illustrates the processing of the first spatial audio stream in terms of a first mono representation P(1), a first direction of arrival $e_{DOA}^{(1)}$ and a first diffuseness parameter Ψ(1). According to figure 5, the first spatial audio stream is decomposed into an approximated plane wave representation $\hat{P}_{PW}^{(1)}(k,n)$, and likewise the second spatial audio stream and possibly further spatial audio streams. Estimates are indicated by a hat above the respective symbol.

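The estimator 120 can be adapted to estimate a plurality of N wave representations $\hat{P}_{PW}^{(i)}(k,n)$ and diffuse field representations $\hat{P}_{diff}^{(i)}(k,n)$ as approximations $\hat{P}^{(i)}(k,n)$ for a plurality of N spatial audio streams, with 1≤i≤N. The processor 130 can be adapted to determine the merged direction of arrival on the basis of an estimate

$$ \hat{e}_{DOA}(k,n) = -\frac{\hat{I}_a(k,n)}{\|\hat{I}_a(k,n)\|}, $$

where

$$ \hat{I}_a(k,n) = \tfrac{1}{2}\,\mathrm{Re}\{\hat{P}_{PW}(k,n) \cdot \hat{U}_{PW}^*(k,n)\}, $$

$$ \hat{P}_{PW}(k,n) = \sum_{i=1}^{N} \hat{P}_{PW}^{(i)}(k,n), $$

$$ \hat{P}_{PW}^{(i)}(k,n) = \alpha^{(i)}(k,n) \cdot P^{(i)}(k,n), $$

$$ \hat{U}_{PW}(k,n) = \sum_{i=1}^{N} \hat{U}_{PW}^{(i)}(k,n), $$

$$ \hat{U}_{PW}^{(i)}(k,n) = -\frac{1}{\rho_0 c}\,\beta^{(i)}(k,n) \cdot P^{(i)}(k,n) \cdot e_{DOA}^{(i)}(k,n), $$

with real numbers α(i)(k,n), β(i)(k,n)∈{0...1}.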

In figure 5, the estimator 120 and the processor 130 are indicated by dashed lines. In the embodiment shown in figure 5, the means 110 for determining is not present, since it is assumed that the first spatial audio stream and the second spatial audio stream, and possibly further audio streams, are provided in mono DirAC representation, i.e. the mono representations, the DOAs and the diffuseness parameters are simply separated from the stream. As shown in figure 5, the processor 130 can be adapted to determine the merged DOA on the basis of an estimate.

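The direction of arrival of sound, i.e. the direction measure, can be estimated by $\hat{e}_{DOA}(k,n)$, which is defined as

$$ \hat{e}_{DOA}(k,n) = -\frac{\hat{I}_a(k,n)}{\|\hat{I}_a(k,n)\|}, \qquad (11) $$

where $\hat{I}_a(k,n)$ is the estimate of the active intensity vector of the merged stream. It can be obtained as follows:

$$ \hat{I}_a(k,n) = \tfrac{1}{2}\,\mathrm{Re}\{\hat{P}_{PW}(k,n) \cdot \hat{U}_{PW}^*(k,n)\}, \qquad (12) $$

where $\hat{P}_{PW}(k,n)$ and $\hat{U}_{PW}(k,n)$ are the estimates of the pressure and the particle velocity corresponding to the plane waves only, i.e. wave field measures. They can be defined as

$$ \hat{P}_{PW}(k,n) = \sum_{i=1}^{N} \hat{P}_{PW}^{(i)}(k,n), \qquad (13) $$

$$ \hat{P}_{PW}^{(i)}(k,n) = \alpha^{(i)}(k,n) \cdot P^{(i)}(k,n), \qquad (14) $$

$$ \hat{U}_{PW}(k,n) = \sum_{i=1}^{N} \hat{U}_{PW}^{(i)}(k,n), \qquad (15) $$

$$ \hat{U}_{PW}^{(i)}(k,n) = -\frac{1}{\rho_0 c}\,\beta^{(i)}(k,n) \cdot P^{(i)}(k,n) \cdot e_{DOA}^{(i)}(k,n). \qquad (16) $$

The factors α(i)(k,n) and β(i)(k,n) are, in general, frequency dependent and can be inversely related to the diffuseness Ψ(i)(k,n). In fact, when the diffuseness Ψ(i)(k,n) is close to 0, it can be assumed that the field consists of a single plane wave, so that

$$ \hat{P}_{PW}^{(i)}(k,n) \approx P^{(i)}(k,n), \qquad \hat{U}_{PW}^{(i)}(k,n) \approx -\frac{1}{\rho_0 c}\,P^{(i)}(k,n) \cdot e_{DOA}^{(i)}(k,n), \qquad (17) $$

which means that α(i)(k,n)=β(i)(k,n)=1.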
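The estimation of the merged direction of arrival from mono DirAC streams can be sketched as follows for one time-frequency tile. The sketch assumes plane-wave estimates of the form P_PW(i) = α(i)·P(i) and U_PW(i) = −β(i)·P(i)·eDOA(i)/(ρ0·c), sums them over the streams, and takes the negated, normalized active intensity as the merged DOA. The function name, the default values of rho0 and c, and returning None when no net intensity remains are hypothetical choices.

```python
import math

def merged_doa(P, e_doa, alpha, beta, rho0=1.2, c=343.0):
    """Estimate the merged DOA unit vector for one tile.

    P:      list of complex pressures P^(i)
    e_doa:  list of DOA unit vectors e_DOA^(i) (3 floats each)
    alpha:  list of real weights alpha^(i) in [0, 1]
    beta:   list of real weights beta^(i) in [0, 1]
    """
    # Summed plane-wave pressure estimate.
    P_pw = sum(a * p for a, p in zip(alpha, P))
    # Summed plane-wave particle velocity estimate.
    U_pw = [0j, 0j, 0j]
    for p, e, b in zip(P, e_doa, beta):
        for axis in range(3):
            U_pw[axis] += -b * p * e[axis] / (rho0 * c)
    # Active intensity of the merged plane-wave field.
    I = [0.5 * (P_pw * u.conjugate()).real for u in U_pw]
    norm = math.sqrt(sum(x * x for x in I))
    if norm == 0.0:
        return None  # assumption: direction is undefined without net energy flow
    return [-x / norm for x in I]

# Example: two equally strong, in-phase, non-diffuse streams from +x and +y.
e = merged_doa([1 + 0j, 1 + 0j],
               [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
               alpha=[1.0, 1.0], beta=[1.0, 1.0])
```

In the example the merged DOA lies midway between the two source directions. Because the pressures enter as complex values, the relative phase between the streams influences the result, which is the reason a purely energetic description is not sufficient.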

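In the following, two embodiments are presented which determine α(i)(k,n) and β(i)(k,n). First, energetic considerations of the diffuse fields are taken into account. In embodiments, the estimator 120 can be adapted to determine the factors α(i)(k,n) and β(i)(k,n) on the basis of the diffuse field. Embodiments may assume that the field consists of a plane wave summed with an ideal diffuse field. In embodiments, the estimator 120 can be adapted to determine α(i)(k,n) and β(i)(k,n) according to

$$ \alpha^{(i)}(k,n) = \beta^{(i)}(k,n), \qquad (18) $$

$$ \beta^{(i)}(k,n) = \sqrt{1 - \Psi^{(i)}(k,n)}. \qquad (19) $$

Setting the air density ρ0 equal to 1 and dropping the functional dependency (k,n) for simplicity, it can be written

$$ \Psi^{(i)} = 1 - \frac{\langle |P_{PW}^{(i)}|^2 \rangle_t}{\langle |P_{PW}^{(i)}|^2 \rangle_t + 2c^2 \langle E_{diff} \rangle_t}. \qquad (20) $$

In embodiments, the processor 130 can be adapted to approximate the diffuse fields on the basis of their statistical properties; an approximation can be obtained as

$$ \langle |P_{PW}^{(i)}|^2 \rangle_t + 2c^2 \langle E_{diff} \rangle_t \approx \langle |P^{(i)}|^2 \rangle_t, \qquad (21) $$

where Ediff is the energy of the diffuse field. Embodiments can thus estimate

$$ \langle |\hat{P}_{PW}^{(i)}| \rangle_t \approx \sqrt{1 - \Psi^{(i)}}\; \langle |P^{(i)}| \rangle_t. \qquad (22) $$

To compute instantaneous estimates (i.e. for each time-frequency tile), embodiments can remove the expectation operators, obtaining

$$ \hat{P}_{PW}^{(i)}(k,n) = \sqrt{1 - \Psi^{(i)}(k,n)}\; P^{(i)}(k,n). \qquad (23) $$

By exploiting the plane wave assumption, the estimate of the particle velocity can be derived directly:

$$ \hat{U}_{PW}^{(i)}(k,n) = \frac{1}{\rho_0 c}\,\hat{P}_{PW}^{(i)}(k,n) \cdot e_I^{(i)}(k,n). \qquad (24) $$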

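In embodiments, a simplified modeling of the particle velocity can be applied. In embodiments, the estimator 120 can be adapted to approximate the factors α(i)(k,n) and β(i)(k,n) on the basis of the simplified modeling. Embodiments may use an alternative solution, which can be derived by introducing a simplified modeling of the particle velocity:

α(i)(k,n)=1,

$$ \beta^{(i)}(k,n) = \frac{1 - \sqrt{1 - (1 - \Psi^{(i)}(k,n))^2}}{1 - \Psi^{(i)}(k,n)}. \qquad (25) $$

A derivation is given in the following. The particle velocity U(i)(k,n) is modeled as

$$ U^{(i)}(k,n) = \beta^{(i)}(k,n) \cdot \frac{P^{(i)}(k,n)}{\rho_0 c} \cdot e_I^{(i)}(k,n). \qquad (26) $$

The factor β(i)(k,n) can be obtained by substituting (26) into (5), which leads to

$$ \Psi^{(i)}(k,n) = 1 - \frac{\frac{1}{2\rho_0 c}\,\left\| \langle \beta^{(i)}(k,n)\,|P^{(i)}(k,n)|^2\, e_I^{(i)}(k,n) \rangle_t \right\|}{c\,\left\langle \frac{1}{4\rho_0 c^2}\,|P^{(i)}(k,n)|^2\,\bigl(1 + \beta^{(i)}(k,n)^2\bigr) \right\rangle_t}. \qquad (27) $$

To obtain instantaneous values, the expectation operators can be removed, and solving for β(i)(k,n) yields

$$ \beta^{(i)}(k,n) = \frac{1 - \sqrt{1 - (1 - \Psi^{(i)}(k,n))^2}}{1 - \Psi^{(i)}(k,n)}. \qquad (28) $$

Note that this approach leads to directions of arrival of sound similar to those obtained according to (19), but with lower computational complexity, given that the factor α(i)(k,n) is equal to unity.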
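The two ways of choosing the weights can be compared with a short sketch. It assumes the energetic variant α = β = √(1 − Ψ), and for the simplified variant α = 1 with β taken as the root in [0, 1] of Ψ = 1 − 2β/(1 + β²), which is what results when a velocity model U = β·P·eI/(ρ0·c) is inserted into the diffuseness definition without time averaging. The function names are hypothetical, and the closed form for β is a reconstruction under these assumptions rather than a quotation.

```python
import math

def weights_energetic(psi):
    """alpha = beta = sqrt(1 - psi)."""
    b = math.sqrt(1.0 - psi)
    return b, b

def weights_simplified(psi):
    """alpha = 1, beta = (1 - sqrt(1 - (1 - psi)^2)) / (1 - psi)."""
    d = 1.0 - psi
    if d == 0.0:
        return 1.0, 0.0  # limit of the expression for psi -> 1
    return 1.0, (1.0 - math.sqrt(1.0 - d * d)) / d

def diffuseness_from_beta(beta):
    """Diffuseness implied by the simplified velocity model: 1 - 2*beta / (1 + beta^2)."""
    return 1.0 - 2.0 * beta / (1.0 + beta * beta)

# Round trip: the simplified beta reproduces the diffuseness it was computed from.
checks = [(psi, diffuseness_from_beta(weights_simplified(psi)[1]))
          for psi in (0.0, 0.3, 0.7, 1.0)]
```

Both variants give α = β = 1 for a non-diffuse stream (Ψ = 0) and reduce the contribution of a stream to the merged particle velocity as its diffuseness grows. The simplified variant leaves the pressure untouched, which saves one multiplication per stream and tile.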

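In embodiments, the processor 130 can be adapted to estimate the diffuseness, i.e. to estimate the merged diffuseness parameter. The diffuseness of the merged stream, denoted by Ψ(k,n), can be estimated directly from the known quantities Ψ(i)(k,n) and P(i)(k,n) and from the estimate $\hat{I}_a(k,n)$ obtained as described above. Following the energetic considerations introduced in the previous section, embodiments can use the estimator

$$ \hat{\Psi}(k,n) = 1 - \frac{\|\langle \hat{I}_a(k,n) \rangle_t\|}{\|\langle \hat{I}_a(k,n) \rangle_t\| + \frac{1}{2c} \sum_{i=1}^{N} \Psi^{(i)}(k,n) \cdot \langle |P^{(i)}(k,n)|^2 \rangle_t}. \qquad (29) $$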

The knowledge of $\hat{P}_{PW}^{(i)}$ and $\hat{U}_{PW}^{(i)}$ allows the use of the alternative representations given in equation (b) in embodiments of the invention. In fact, the direction of the wave can be obtained from $\hat{U}_{PW}^{(i)}$, whereas $\hat{P}_{PW}^{(i)}$ gives the amplitude and phase of the i-th wave. From the latter, all phase differences ∆(i,j) can be readily computed. The parameters of the merged DirAC stream can then be computed by substituting equation (b) into equations (a), (3) and (5).

Figure 6 illustrates an embodiment of a method for merging two or more DirAC streams. Embodiments can provide a method for merging a first spatial audio stream with a second spatial audio stream to obtain a merged audio stream. In embodiments, the method may include a step of determining, for the first spatial audio stream, a first audio representation and a first DOA, and, for the second spatial audio stream, a second audio representation and a second DOA. In embodiments in which DirAC representations of the spatial audio streams are available, the determining step simply reads the corresponding representations from the audio streams. Figure 6 assumes that the two or more DirAC streams can simply be obtained from the audio streams in step 610.

In embodiments, the method may include a step of estimating a first wave representation comprising a first wave direction measure and a first wave field measure for the first spatial audio stream, based on the first audio representation, the first DOA and optionally the first diffuseness parameter. Accordingly, the method may include a step of estimating a second wave representation comprising a second wave direction measure and a second wave field measure for the second spatial audio stream, based on the second audio representation, the second DOA and optionally the second diffuseness parameter.

The method may further comprise a step of combining the first wave representation and the second wave representation to obtain a merged wave representation comprising a merged wave field measure and a merged DOA measure, and a step of combining the first and second audio representations to obtain a merged audio representation, which is indicated in figure 6 by step 620 for mono audio channels. The embodiment depicted in figure 6 comprises a step of computing α(i)(k,n) and β(i)(k,n) according to (19) and (25), which enables the estimation of the pressure and particle velocity vectors of the plane wave representations in step 640. In other words, the steps of estimating the first and second wave representations are carried out in steps 630 and 640 of figure 6 in terms of plane wave representations.

The combining of the first and second plane wave representations is carried out in step 650, where the pressures and particle velocity vectors of all streams can be summed.

In step 660 of figure 6, the active intensity vector is computed and the DOA is estimated on the basis of the merged plane wave representation.

Embodiments may include a step of combining or processing the merged wave field measure, the first and second mono representations and the first and second diffuseness parameters to obtain the merged diffuseness parameter. In the embodiment depicted in figure 6, the diffuseness is computed in step 670, for example on the basis of (29).
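The diffuseness computation of step 670 can be sketched as follows for one frequency bin. The sketch assumes an estimator of the form Ψ = 1 − ||<Ia>t|| / (||<Ia>t|| + 1/(2c)·Σ Ψ(i)·<|P(i)|²>t) with the air density set to 1, i.e. the directional energy flow of the merged plane waves is compared with the diffuse energy contributed by all streams. The exact form of the estimator, the function name, the default value of c and the value returned for silent input are assumptions of this example.

```python
import math

def merged_diffuseness(I_frames, P_frames, psi_frames, c=343.0):
    """Estimate the merged diffuseness of one frequency bin (air density assumed to be 1).

    I_frames:   per-frame merged intensity estimates [Ix, Iy, Iz]
    P_frames:   per-frame lists of stream pressures P^(i)
    psi_frames: per-frame lists of stream diffuseness values Psi^(i)
    """
    n = len(I_frames)
    # Norm of the time-averaged merged intensity (directional part).
    I_mean = [sum(I[axis] for I in I_frames) / n for axis in range(3)]
    directional = math.sqrt(sum(x * x for x in I_mean))
    # Time-averaged diffuse contribution of all streams.
    diffuse = sum(
        psi * abs(p) ** 2
        for P, Psi in zip(P_frames, psi_frames)
        for p, psi in zip(P, Psi)
    ) / (2.0 * c * n)
    total = directional + diffuse
    if total == 0.0:
        return 1.0  # assumption: treat silence as fully diffuse
    return 1.0 - directional / total

# Example: one stream with diffuseness 0.4 whose plane-wave part carries
# the intensity (1 - 0.4) * |P|^2 / (2c) along -x (sound arriving from +x).
c = 343.0
P1, psi1 = 2.0 + 0j, 0.4
I1 = [-(1.0 - psi1) * abs(P1) ** 2 / (2.0 * c), 0.0, 0.0]
psi_merged = merged_diffuseness([I1], [[P1]], [[psi1]], c)
```

With a single input stream the sketch returns that stream's own diffuseness, which is a useful sanity check for the assumed form of the estimator.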

Embodiments have the advantage that combining spatial audio streams can be performed with high quality at a moderate complexity.

Depending on certain implementation requirements, the methods of the invention can be implemented in hardware or in software. The implementation can be performed using a digital storage medium, in particular a flash memory, a DVD or a CD, having electronically readable control signals stored thereon, which cooperate with a programmable computer system such that the methods of the invention are performed. Generally, the present invention is therefore a computer program code stored on a machine-readable carrier, the program code being operative for performing the methods of the invention when the computer program runs on a computer or processor. In other words, the methods of the invention are a computer program having a program code for performing at least one of the methods of the invention when the computer program runs on a computer.

1. Apparatus (100) for merging a first spatial audio stream with a second spatial audio stream to obtain a merged audio stream, comprising an estimator (120) for estimating a first wave representation comprising a first wave direction measure characterizing the direction of a first wave, and a first wave field measure being related to the magnitude of the first wave, for the first spatial audio stream having a first audio representation comprising a measure for pressure or magnitude of a first audio signal (P(1)), and a first direction of arrival, and for estimating a second wave representation comprising a second wave direction measure characterizing the direction of a second wave, and a second wave field measure being related to the magnitude of the second wave, for the second spatial audio stream having a second audio representation comprising a measure for pressure or magnitude of a second audio signal (P(2)), and a second direction of arrival; and a processor (130) for processing the first wave representation and the second wave representation to obtain a merged wave representation comprising a merged wave field measure, a merged direction of arrival measure and a merged diffuseness parameter, wherein the merged diffuseness parameter is based on the merged wave field measure, the first audio representation (P(1)) and the second audio representation (P(2)), and the merged wave field measure is based on the first wave field measure, the second wave field measure, the first wave direction measure and the second wave direction measure, and wherein the processor (130) is adapted to process the first audio representation (P(1)) and the second audio representation (P(2)) to obtain a merged audio representation (P), and to form the merged audio stream comprising the merged audio representation (P), the merged direction of arrival measure and the merged diffuseness parameter.

2. The apparatus (100) of claim 1, wherein the estimator (120) is adapted to estimate the first wave field measure in terms of a first wave field amplitude and the second wave field measure in terms of a second wave field amplitude, and to estimate a phase difference between the first wave field measure and the second wave field measure, and/or to estimate a first wave field phase and a second wave field phase.

3. The apparatus (100) of claim 1, comprising means (110) for determining, for the first spatial audio stream, the first audio representation, the first direction of arrival measure and the first diffuseness parameter, and for determining, for the second spatial audio stream, the second audio representation, the second direction of arrival measure and the second diffuseness parameter.

4. The apparatus (100) of claim 1, wherein the processor (130) is adapted to determine the merged audio representation, the merged direction of arrival measure and the merged diffuseness parameter in a time-frequency dependent manner.

5. The apparatus (100) of claim 1, wherein the estimator (120) is adapted to estimate the first and/or second wave representations, and wherein the processor (130) is adapted to provide the merged audio representation in terms of a pressure signal p(t) or a time-frequency transformed pressure signal P(k,n), where k denotes a frequency index and n denotes a time index.

6. The apparatus (100) of claim 5, wherein the processor (130) is adapted to process the first and second direction of arrival measures and/or to provide the merged direction of arrival measure in terms of a unit vector eDOA(k,n), with
eDOA(k,n)=-eI(k,n) and
$$ I_a(k,n) = \|I_a(k,n)\| \cdot e_I(k,n), $$
$$ I_a(k,n) = \tfrac{1}{2}\,\mathrm{Re}\{P(k,n) \cdot U^*(k,n)\}, $$
where P(k,n) is the pressure of the merged stream, U(k,n)=[Ux(k,n), Uy(k,n), Uz(k,n)]^T denotes the time-frequency transform of the particle velocity vector u(t)=[ux(t), uy(t), uz(t)]^T of the merged stream, and Re{·} denotes the real part.

7. The apparatus (100) of claim 6, wherein the processor (130) is adapted to process the first and/or second diffuseness parameters and/or to provide the merged diffuseness parameter in terms of
$$ \Psi(k,n) = 1 - \frac{\|\langle I_a(k,n)\rangle_t\|}{c\,\langle E(k,n)\rangle_t} $$
and
$$ E(k,n) = \frac{\rho_0}{4}\,\|U(k,n)\|^2 + \frac{1}{4\rho_0 c^2}\,|P(k,n)|^2, $$
where U(k,n)=[Ux(k,n), Uy(k,n), Uz(k,n)]^T denotes the time-frequency transform of the particle velocity vector u(t)=[ux(t), uy(t), uz(t)]^T of the merged stream, Re{·} denotes the real part, P(k,n) denotes the time-frequency transform of the pressure signal p(t), k denotes the frequency index, n denotes the time index, c is the speed of sound, E(k,n) denotes the sound field energy, ρ0 denotes the air density and <·>t denotes averaging over time.

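8. The apparatus (100) of claim 7, wherein the estimator (120) is adapted to estimate a plurality of N wave representations $\hat{P}_{PW}^{(i)}(k,n)$ and diffuse field representations $\hat{P}_{diff}^{(i)}(k,n)$ as approximations $\hat{P}^{(i)}(k,n)$ for a plurality of N spatial audio streams, where 1≤i≤N, and wherein the processor (130) is adapted to determine the merged direction of arrival of sound on the basis of an estimate
$$ \hat{e}_{DOA}(k,n) = -\frac{\hat{I}_a(k,n)}{\|\hat{I}_a(k,n)\|}, $$
$$ \hat{I}_a(k,n) = \tfrac{1}{2}\,\mathrm{Re}\{\hat{P}_{PW}(k,n) \cdot \hat{U}_{PW}^*(k,n)\}, $$
$$ \hat{P}_{PW}(k,n) = \sum_{i=1}^{N} \hat{P}_{PW}^{(i)}(k,n), $$
$$ \hat{P}_{PW}^{(i)}(k,n) = \alpha^{(i)}(k,n) \cdot P^{(i)}(k,n), $$
$$ \hat{U}_{PW}(k,n) = \sum_{i=1}^{N} \hat{U}_{PW}^{(i)}(k,n), $$
$$ \hat{U}_{PW}^{(i)}(k,n) = -\frac{1}{\rho_0 c}\,\beta^{(i)}(k,n) \cdot P^{(i)}(k,n) \cdot e_{DOA}^{(i)}(k,n), $$
with real numbers α(i)(k,n), β(i)(k,n)∈{0...1}, where U(k,n)=[Ux(k,n), Uy(k,n), Uz(k,n)]^T denotes the time-frequency transform of the particle velocity vector u(t)=[ux(t), uy(t), uz(t)]^T of the merged stream, Re{·} denotes the real part, P(i)(k,n) denotes the time-frequency transform of the pressure signal p(i)(t), k denotes the frequency index, n denotes the time index, N is the number of spatial audio streams, c is the speed of sound and ρ0 denotes the air density.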

9. The apparatus (100) of claim 8, wherein the estimator (120) is adapted to determine α(i)(k,n) and β(i)(k,n) according to
α(i)(k,n)=β(i)(k,n),
$$ \beta^{(i)}(k,n) = \sqrt{1 - \Psi^{(i)}(k,n)}. $$

10. The apparatus (100) of claim 8, wherein the processor (130) is adapted to determine α(i)(k,n) and β(i)(k,n) according to
α(i)(k,n)=1,
$$ \beta^{(i)}(k,n) = \frac{1 - \sqrt{1 - (1 - \Psi^{(i)}(k,n))^2}}{1 - \Psi^{(i)}(k,n)}. $$

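11. The apparatus (100) of claim 9, wherein the processor (130) is adapted to determine the merged diffuseness parameter by
$$ \hat{\Psi}(k,n) = 1 - \frac{\|\langle \hat{I}_a(k,n) \rangle_t\|}{\|\langle \hat{I}_a(k,n) \rangle_t\| + \frac{1}{2c} \sum_{i=1}^{N} \Psi^{(i)}(k,n) \cdot \langle |P^{(i)}(k,n)|^2 \rangle_t}. $$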

12. The apparatus (100) of claim 1, wherein the first spatial audio stream further comprises a first diffuseness parameter (Ψ(1)), the second spatial audio stream further comprises a second diffuseness parameter (Ψ(2)), and the processor (130) is adapted to compute the merged diffuseness parameter using the first diffuseness parameter (Ψ(1)) and the second diffuseness parameter (Ψ(2)).

13. Method for merging a first spatial audio stream with a second spatial audio stream to obtain a merged audio stream, comprising: estimating a first wave representation comprising a first wave direction measure characterizing the direction of a first wave, and a first wave field measure being related to the magnitude of the first wave, for the first spatial audio stream having a first audio representation comprising a measure for pressure or magnitude of a first audio signal (P(1)), and a first direction of arrival of sound; estimating a second wave representation comprising a second wave direction measure characterizing the direction of a second wave, and a second wave field measure being related to the magnitude of the second wave, for the second spatial audio stream having a second audio representation comprising a measure for pressure or magnitude of a second audio signal (P(2)), and a second direction of arrival of sound; processing the first wave representation and the second wave representation to obtain a merged wave representation comprising a merged wave field measure, a merged direction of arrival measure and a merged diffuseness parameter, the merged diffuseness parameter being obtained on the basis of the first wave direction measure and the second wave direction measure; processing the first audio representation (P(1)) and the second audio representation (P(2)) to obtain a merged audio representation (P); and forming the merged audio stream comprising the merged audio representation (P), the merged direction of arrival measure and the merged diffuseness parameter.

14. The method of claim 13, wherein the first spatial audio stream further comprises a first diffuseness parameter (Ψ(1)), the second spatial audio stream further comprises a second diffuseness parameter (Ψ(2)), and the merged diffuseness parameter is computed in the processing step additionally using the first diffuseness parameter (Ψ(1)) and the second diffuseness parameter (Ψ(2)).

15. A computer-readable medium having stored thereon a computer program with program code for performing the method of claim 13 when the program is executed by a computer or processor.



 

Same patents:

FIELD: physics.

SUBSTANCE: apparatus (100) for generating a multichannel audio signal (142) based on an input audio signal (102) comprises a main signal upmixing means (110), a section (segment) selector (120), a section signal upmixing means (110) and a combiner (140). The section signal upmixing means (110) is configured to provide a main multichannel audio signal (112) based on the input audio signal (102). The section selector (120) is configured to select or not select a section of the input audio signal (102) based on analysis of the input audio signal (102). The selected section of the input audio signal (102), a processed selected section of the input audio signal (102) or a reference signal associated with the selected section of the input audio signal (102) is provided as section signal (122). The section signal upmixing means (130) is configured to provide a section upmix signal (132) based on the section signal (122), and the combiner (140) is configured to overlay the main multichannel audio signal (112) and the section upmix signal (132) to obtain the multichannel audio signal (142).

EFFECT: improved flexibility and sound quality.

12 cl, 10 dwg

FIELD: information technology.

SUBSTANCE: invention relates to lossless multi-channel audio codec which uses adaptive segmentation with random access point (RAP) and multiple prediction parameter set (MPPS) capability. The lossless audio codec encodes/decodes a lossless variable bit rate (VBR) bit stream with random access point (RAP) capability to initiate lossless decoding at a specified segment within a frame and/or multiple prediction parameter set (MPPS) capability partitioned to mitigate transient effects. This is accomplished with an adaptive segmentation technique that fixes segment start points based on constraints imposed by the existence of a desired RAP and/or detected transient in the frame and selects a optimum segment duration in each frame to reduce encoded frame payload subject to an encoded segment payload constraint. RAP and MPPS are particularly applicable to improve overall performance for longer frame durations.

EFFECT: higher overall encoding efficiency.

48 cl, 23 dwg

FIELD: physics.

SUBSTANCE: method and system for generating output signals for reproduction by two physical speakers in response to input audio signals indicative of sound from multiple source locations including at least two rear locations. Typically, the input signals are indicative of sound from three front locations and two rear locations (left and right surround sources). A virtualiser generates left and right surround output signals suitable for driving front loudspeakers to emit sound that a listener perceives as emitted from rear sources. Typically, the virtualiser generates left and right surround output signals by transforming rear source input signals in accordance with a sound perception simulation function. To ensure that virtual channels are well heard in the presence of other channels, the virtualiser performs dynamic range compression on rear source input signals. The dynamic range compression is preferably performed by amplifying rear source input signals or partially processed versions thereof in a nonlinear way relative to front source input signals.

EFFECT: separating virtual sources while avoiding excessive emphasis of virtual channels.

34 cl, 9 dwg

FIELD: information technologies.

SUBSTANCE: invention discloses the method for reproduction of multiple audio channels, according to which out-of-phase information is extracted from side and/or rear side channels contained in a multi-channel audio signal.

EFFECT: improved reproduction of a multi-channel audio signal.

15 cl, 10 dwg

FIELD: information technologies.

SUBSTANCE: audio decoder for decoding multi-object audio signal comprises module to compute factor of forecasting matrix C consisting of factors forecasts based on data about object level difference (OLD), as well as means for step-up mixing proceeding from forecast factors for getting first upmix audio signal tending first type audio signal and/or second upmix signal tending to second type audio signal. Note here that multi-object audio signal comprises coded audio signals of first and second types. Multi-object audio signal consists of downmix signal 112 and service info. Service info comprises data on first and second type signal levels in first predefined frequency-time resolution.

EFFECT: separation of individual audio objects in mixing and decreasing/increasing channel number.

20 cl, 24 dwg

FIELD: physics, acoustics.

SUBSTANCE: invention relates to processing audio signals, particularly to improving intelligibility of dialogue and oral speech, for example, in surround entertainment ambient sound. A multichannel audio signal is processed to form a first characteristic and a second characteristic. The first channel is processed to generate a speech probability value. The first characteristic corresponds to a first measured indicator which depends on the signal level in the first channel of the multichannel audio signal containing speech and non-speech audio. The second characteristic corresponds to a second measured indicator which depends on the signal level in the second channel of the multichannel audio signal primarily containing non-speech audio. Further, the first and second characteristics of the multichannel audio signal are compared to generate an attenuation coefficient, wherein the difference between the first measured indicator and the second measured indicator is determined, and the attenuation coefficient is calculated based on the obtained difference and a threshold value. The attenuation coefficient is then adjusted in accordance with the speech probability value and the second channel is attenuated using the adjusted attenuation coefficient.

EFFECT: improved speech perceptibility.

12 cl, 5 dwg

FIELD: radio engineering.

SUBSTANCE: invention relates to a mechanism which monitors the signals of a secondary microphone in a mobile device with multiple microphones in order to warn the user if one or more secondary microphones are covered while the mobile device is in use. In one example, smoothed average estimates of the secondary microphone power may be calculated and compared to an estimate of the minimum noise level of the main microphone. Microphone coverage may be detected by comparing the smoothed estimates of the secondary microphone power with the estimate of the minimum noise level of the main microphone. In another example, the estimates of the minimum noise level for the signals of the main and secondary microphones may be compared with the difference in sensitivity of the first and second microphones in order to detect whether the secondary microphone is covered. Once coverage is detected, a warning signal may be generated and issued to the user.

EFFECT: improved sound quality of the main audio signal.

37 cl, 9 dwg
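
The first detection example in this abstract can be sketched roughly as follows. The abstract names only the quantities being compared, so the details are assumptions: frame power is the mean square of the samples, smoothing is first-order recursive averaging with a hypothetical factor `alpha`, the noise floor of the main microphone is taken as the minimum of its smoothed power, and `margin` is a hypothetical tuning constant.

```python
def frame_power(frame):
    """Mean-square power of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)


def smooth(values, alpha=0.9):
    """First-order recursive (exponential) smoothing of a sequence."""
    out = []
    state = values[0]
    for v in values:
        state = alpha * state + (1.0 - alpha) * v
        out.append(state)
    return out


def is_secondary_covered(main_frames, secondary_frames,
                         alpha=0.9, margin=1.0):
    """Return True if the secondary microphone looks covered.

    Assumption: a covered microphone delivers less smoothed power than
    the noise floor estimated for the main microphone.
    """
    main_pow = smooth([frame_power(f) for f in main_frames], alpha)
    sec_pow = smooth([frame_power(f) for f in secondary_frames], alpha)
    noise_floor = min(main_pow)  # minimum of the smoothed main power
    return sec_pow[-1] < margin * noise_floor
```

A caller would feed in the most recent frames from both microphones and raise the user warning when the function returns True.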

FIELD: information technology.

SUBSTANCE: signal processing method involves: receiving a signal and spatial information which includes channel level difference (CLD) information, a channel prediction coefficient (CPC) and interchannel coherence (ICC) information; obtaining mode information for determining the encoding scheme and modification flag information indicating whether the signal has been modified. If the mode information indicates an audio encoding scheme, the signal is decoded according to the audio encoding scheme. If the modification flag information indicates that the signal has been modified, modification restoration information is obtained, which indicates the value for adjusting the window length applied to the signal; the window length is modified based on the modification restoration information and the signal is decoded using the window with the modified length. Further, based on extension information, a base extension signal is determined; a downmix extended signal is generated, having a bandwidth which is extended using the base extension signal by restoring the high-frequency region signal; and a multichannel signal is generated by applying the spatial information to the downmix extended signal.

EFFECT: high signal encoding efficiency.

7 cl, 22 dwg

FIELD: information technology.

SUBSTANCE: converter generates parameters which determine the relationship between a first and a second channel for a multichannel audio signal, associated with configuration of a multichannel acoustic system. Level parameters are generated based on object parameters from a plurality of audio objects associated with a downmixing channel, which are generated using audio signals of an object associated with audio objects. Object parameters contain an energy parameter which indicates energy of the audio signal of the object. A parametric generator is used to obtain coherence and level parameters which combine the energy parameter and reproduction parameters of the object, and which depend on the desired reproduction configuration.

EFFECT: less complex application of various systems which are designed to encode and decode parametric multichannel audio streams.

27 cl, 10 dwg

FIELD: information technologies.

SUBSTANCE: audio signal coder comprises means for receiving an M-channel audio signal, where M>2; downmix means for downmixing the M-channel audio signal into a first stereo signal and related parametric data; modification means for modifying the first stereo signal in order to generate a second stereo signal in response to the related parametric data and spatial parameter data which specify a binaural perception transfer function, wherein the second stereo signal is a binaural signal; means for coding the second stereo signal in order to generate coded data; and output means for generating an output data stream containing the coded data and the related parametric data.

EFFECT: increased efficiency of stereo coding of multichannel signals with reduction of coding complexity.

35 cl, 11 dwg

Slit type gas laser // 2273116

FIELD: quantum electronics, possible use for engineering technological slit type gas lasers.

SUBSTANCE: slit type gas laser has hermetic chamber, a pair of metallic electrodes, alternating voltage source, a pair of dielectric barriers, and an optical resonator. Chamber is filled with active gas substance. Metallic electrodes are mounted within aforementioned chamber, each of them has surface, directed to face surface of another electrode. Source of alternating voltage is connected to aforementioned electrodes for feeding excitation voltage to them. Dielectric barriers are positioned between metallic electrodes, so that surfaces of these barriers directed to each other form slit discharge gap for forming of barrier discharge in gas substance.

EFFECT: makes it possible to construct a slit type gas laser excited by barrier discharge, in which the dielectric barriers are made specifically to improve heat removal from the active substance of the laser, decrease the voltage drop across the dielectric barriers, allow the electrode area to be increased, improve the efficiency of laser radiation generation, increase the output power of the laser, and improve the mode composition of its output signal.

8 cl, 4 dwg

FIELD: stereophonic systems with more than two channels.

SUBSTANCE: in accordance with the method, parametric code data is generated for a first subset of audio input channels for a first frequency region by applying parametric multichannel audio encoding; and parametric code data is generated for a second subset of audio input channels for a second frequency region by applying parametric multichannel audio encoding, where the second frequency region is different from the first frequency region, and the second subset of audio input channels is different from the first subset of audio input channels.

EFFECT: reduced data processing load in encoder and decoder, and also smaller BCC bitstreams.

6 cl, 2 dwg

Audio coding // 2325046

FIELD: audio coding.

SUBSTANCE: in binaural coding, only one monophonic channel is coded. An additional layer contains parameters for the LH and RH signals. A coder is described which associates transient process information extracted from the monophonic coded signal with parametric multichannel layers. Transient process locations may also be determined directly from the bitstream or calculated using other coded parameters (e.g., the window switch flag if specified in customer's requirements).

EFFECT: increase in efficiency due to use of transient process information in parametric multichannel layer.

13 cl, 4 dwg

FIELD: physics.

SUBSTANCE: said utility invention relates to sound recording and sound reproduction equipment and may be used for recording and restoration of a multi-dimensional acoustic scene, as well as during its transmission through media. In a recording room, the acoustic axes of all microphones are directed towards the centre of the acoustic scene being recorded, which is located on a vertical plane passing through the performers' front, the acoustic scene centre is located at the listeners' head level and in the middle of the microphones; in the listening room, the acoustic system arrangement on the vertical plane relative to the centre of the acoustic scene being restored is equivalent to the arrangement of microphones in the recording room; during transmission of all acoustic scene signal components from the microphone to the acoustic system and their amplification, output amplitude and phase relationships equivalent to the input ones are provided; displacement of acoustic systems in the vertical plane performs their phasing between one another, and rotation of the acoustic systems converges their axes into the point of acoustic scene restoration. When multi-band acoustic systems are used, the band phase adjustment and acoustic axis angles convergence may be performed online.

EFFECT: possibility to restore amplitude/phase acoustic scene.

2 cl, 2 dwg

FIELD: radio engineering.

SUBSTANCE: invention relates to a device and method for processing a multichannel sound signal in a stereo-compatible format. While processing a multichannel sound signal having at least three initial channels, a first mixing channel and a second mixing channel, which are derived from the initial channels, are transmitted (12). Additional channel information is calculated (14) for an initial channel selected from the initial channels in such a way that a mixing channel, or a combined mixing channel including the first and the second mixing channels, yields an approximation of the selected initial channel when weighted using the additional channel information. The additional channel information and the first and second mixing channels form output data (20) to be transmitted to the decoder. If a low-level decoder is used, only the first and second mixing channels are decoded; if a high-level decoder is used, a composite multichannel sound signal is produced based on the mixing channels and the additional channel information.

EFFECT: because the additional channel information occupies few bits and the decoder does not use an inverse matrix, an effective and high-quality multichannel extension for stereo players and multichannel players is obtained.

29 cl, 10 dwg
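
The idea of approximating a selected initial channel by weighting a mixing channel can be illustrated with a short sketch. The abstract does not state how the additional channel information is computed; the least-squares gain used here is a swapped-in assumption chosen for simplicity, and the function names are hypothetical.

```python
def side_gain(original, downmix):
    """Least-squares gain g that minimises |original - g * downmix|^2.

    Assumption: a single real-valued gain stands in for the
    'additional channel information' of the abstract.
    """
    num = sum(o * d for o, d in zip(original, downmix))
    den = sum(d * d for d in downmix)
    return num / den if den > 0.0 else 0.0


def approximate(downmix, gain):
    """Decoder side: weight the mixing channel to approximate the original."""
    return [gain * d for d in downmix]
```

In this sketch a stereo-only decoder would ignore the gain and play the two mixing channels, while a multichannel decoder would call `approximate` for each selected initial channel; one scalar per channel and block is cheap to transmit and needs no matrix inversion at the decoder.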

FIELD: physics, acoustics.

SUBSTANCE: invention refers to multichannel audio signal processing, specifically to multichannel audio signal restoration using a primary channel and parametric supplementary information. The multichannel synthesiser contains a postprocessor which determines postprocessed restoration parameters, or values derived from a restoration parameter, for the current time portion of the input signal, such that the postprocessed restoration parameter or postprocessed value differs from the corresponding quantised and inversely quantised parameter in that its value is not limited by the quantisation step size. A multichannel restoration unit (12) applies the postprocessed restoration parameter to restore the multichannel output signal. The technical result is that postprocessing the restoration parameters in multichannel coding/decoding enables a low data transfer rate, on the one hand, and high quality, on the other hand, since strong changes in the restored multichannel output signal, caused by the large quantisation step size for the restoration parameter that is preferable because of the required data transfer rate, are reduced.

EFFECT: improved quality of signal transmission.

25 cl, 16 dwg
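
A minimal sketch of such postprocessing is given below. The abstract only requires that the postprocessed values are not tied to the quantisation grid; the uniform quantiser and the first-order smoothing over successive time portions, with a hypothetical factor `alpha`, are assumptions made for illustration.

```python
def dequantize(indices, step):
    """Inverse uniform quantisation: every value is a multiple of step."""
    return [i * step for i in indices]


def postprocess(params, alpha=0.5):
    """Smooth dequantised parameters over successive time portions.

    Assumption: first-order recursive smoothing. The outputs may lie
    between quantisation levels, so jumps of a full step are softened.
    """
    out = []
    state = params[0]
    for p in params:
        state = alpha * state + (1.0 - alpha) * p
        out.append(state)
    return out
```

For quantiser indices 0, 0, 1, 1 and a step of 2.0, the dequantised sequence 0, 0, 2, 2 becomes 0, 0, 1.0, 1.5 after smoothing, so the abrupt jump of one full quantisation step is replaced by a gradual transition.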

Audio encoding // 2363116

FIELD: communication devices.

SUBSTANCE: invention relates to encoding a multichannel audio signal, particularly encoding a multichannel signal containing first, second and third signal components. The method of encoding a multichannel audio signal containing at least a first signal component (LF), a second signal component (LR) and a third signal component (RF) involves encoding the first and second signal components using a first parametric encoder (202) to obtain a first encoded signal (L) and a first set (P2) of coding parameters. The first encoded signal and an additional signal (R) are encoded using a second parametric encoder to obtain a second encoded signal (T) and a second set (P1) of coding parameters. The additional signal is obtained from at least the third signal component, and the multichannel audio signal is represented by at least the resultant encoded signal (T), obtained from at least the second encoded signal, the first set of coding parameters and the second set of coding parameters.

EFFECT: more efficient encoding.

13 cl, 13 dwg

FIELD: individual supplies.

SUBSTANCE: invention concerns multichannel sound reproduction systems, particularly the application of psychoacoustic principles in acoustic system design. The surround sound reproduction system uses a number of filters and a system of main and auxiliary speakers to produce the effect of phantom rear surround channels, or phantom surround sound, from an acoustic system or a system of two speakers installed in front of the listener. The acoustic system includes left and right surround input signals and left and right front input signals. The left and right auxiliary speakers and the left and right main speakers are positioned in front of the listening position. The distance between the respective main and auxiliary speakers is equal to the distance between the ears of an average human.

EFFECT: surround sound reproduction by speakers installed only in front of the listener.

59 cl, 21 dwg

FIELD: physics; acoustics.

SUBSTANCE: invention relates to coding several signals from audio sources, which must be transmitted or stored so that, after the source signals are decoded, they can be mixed in order to synthesise a wave field or signals for multichannel, three-dimensional or stereophonic audio. The proposed method provides more efficient joint coding of the signals compared to coding them separately, even when there is no redundancy between the signals. This is possible due to the statistical properties of the signals, the properties of the coding method and spatial hearing. The sum of the signals is transmitted together with the statistical properties, which mainly determine the perceptually important spatial features of the final mixed audio signals. The signals are reconstructed in a receiver so that their statistical properties are approximately identical to the corresponding properties of the initial signals from the sources.

EFFECT: more efficient coding when mixing coded signals.

22 cl, 14 dwg
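
The principle of transmitting a sum signal together with statistical properties can be sketched in a few lines. The abstract does not specify which statistics are used or how they are applied; this sketch assumes that the only statistic is the broadband mean-square power of each source and that the receiver rescales the sum signal to match it. A practical system would more likely work per sub-band and per time frame, which is also an assumption here.

```python
import math


def encode(sources):
    """Return the sum signal and the mean-square power of each source."""
    n = len(sources[0])
    total = [sum(src[i] for src in sources) for i in range(n)]
    powers = [sum(s * s for s in src) / n for src in sources]
    return total, powers


def decode(total, powers):
    """Rebuild one signal per source whose power matches the sent value.

    Assumption: each output is the sum signal scaled by a single gain,
    so only the power statistic is restored, not the waveform.
    """
    n = len(total)
    total_power = sum(s * s for s in total) / n
    outputs = []
    for p in powers:
        gain = math.sqrt(p / total_power) if total_power > 0.0 else 0.0
        outputs.append([gain * s for s in total])
    return outputs
```

Only one waveform plus a few numbers per source are transmitted, and each reconstructed signal has approximately the same power as its original, which is the property the abstract relies on for the subsequent mixing.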

FIELD: physics; communications.

SUBSTANCE: invention relates to multichannel audio technology and, specifically, to applications of multichannel audio in connection with headphone technologies. The device for generating an encoded stereo signal from a multichannel representation includes a multichannel decoder (11), which forms three or more channels from at least one main channel and parametric information. Said three or more channels are subjected to headphone signal processing (12) so as to generate an uncoded first stereo channel and an uncoded second stereo channel, which are then input into a stereo encoder (13) so as to generate an encoded stereo file at the output side. The encoded stereo file can be transmitted to any suitable playback device, such as a CD player or a portable playback device, so that the user receives not only a normal stereo impression but a multichannel impression as well.

EFFECT: efficient signal processing concept, which allows for multichannel quality playback on headphones in simple playback devices.

12 cl, 11 dwg
