RussianPatents.com

Method of encoding and decoding audio signal and device for realising said method

Russian patent RU 2473062

FIELD: information technology.

SUBSTANCE: spatial information associated with an audio signal is encoded into a bit stream which is transmitted to a decoder or recorded on a data storage medium. The bit stream contains different syntax associated with the time, frequency and spatial domains, and also includes one or more data structures (e.g., frames) which contain ordered sets of time intervals to which given parameters are applied. The data structures can be fixed or variable. A data structure type indicator may be inserted in the bit stream to enable the decoder to determine the data structure type and activate the corresponding decoding process. The data structure includes position information which the decoder can use to identify the correct time interval to which a given set of parameters is applied. The position information of the time interval may be encoded using a fixed number of bits or a variable number of bits, depending on the data structure type indicated by the data structure type indicator. For variable type data structures, the position information may be encoded using a variable number of bits based on the position of the time interval in the ordered set of time intervals.

EFFECT: transmission of a multichannel audio signal at low bit rates.

8 cl, 26 dwg

 

Technical field

The subject matter of this application relates generally to audio signal processing.

Background art

New approaches to perceptual coding of multichannel audio signals, commonly referred to as spatial audio coding (SAC), are currently being researched and developed. SAC makes it possible to transmit a multichannel audio signal at a low bit rate, which makes it suitable for many popular audio applications (such as Internet streaming and music downloads).

Instead of discretely encoding the individual input channels, SAC captures the spatial image of the multichannel audio signal in a compact set of parameters. These parameters can be transmitted to the decoder, where they are used to synthesize or reconstruct the spatial properties of the audio signal.

In some SAC applications, the spatial parameters are transmitted to the decoder as part of a bitstream. The bitstream includes spatial frames that contain ordered sets of time intervals to which sets of spatial parameters can be applied. The bitstream also includes position information, which the decoder can use to identify the correct time interval to which a given set of parameters is applied.

Some SAC applications use conceptual elements in the encoding/decoding paths. One element is usually called the one-to-two (OTT) element, and another is usually called the two-to-three (TTT) element, where the names refer to the number of input and output signals of the corresponding decoder element. The OTT encoder element extracts two spatial parameters and generates a downmix signal and a residual signal. The TTT element downmixes three audio signals into a stereo downmix signal plus a residual signal. These elements can be combined to create a variety of spatial audio configurations (e.g., surround sound).
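
As a rough illustration of the signal path through an OTT element, the sketch below downmixes two channels into one channel plus a residual and then reconstructs them. The sum/difference rule and the function names `ott_downmix` and `ott_upmix` are assumptions made for this example; the text does not specify how the downmix and residual are computed, and the extraction of the spatial parameters is left out here.

```python
def ott_downmix(ch1, ch2):
    """Hypothetical OTT encoder element: two channels in, downmix + residual out.

    Assumption: a plain sum/difference split, chosen only because it is
    trivially invertible. The patent text does not prescribe this rule.
    """
    downmix = [(a + b) / 2.0 for a, b in zip(ch1, ch2)]
    residual = [(a - b) / 2.0 for a, b in zip(ch1, ch2)]
    return downmix, residual


def ott_upmix(downmix, residual):
    """Hypothetical OTT decoder element: one channel (plus residual) in, two out."""
    ch1 = [m + s for m, s in zip(downmix, residual)]
    ch2 = [m - s for m, s in zip(downmix, residual)]
    return ch1, ch2


left = [1.0, 2.0, -1.0]
right = [0.5, 1.0, -0.5]
mix, res = ott_downmix(left, right)
restored_left, restored_right = ott_upmix(mix, res)
```

With the residual available, the two input channels are recovered exactly; without it, a decoder would have to rely on the spatial parameters alone.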

Some SAC applications can operate in a non-guided mode, in which only a stereo downmix signal is transmitted from the encoder to the decoder, without the need to transmit spatial parameters. The decoder synthesizes the spatial parameters from the downmix signal and uses these parameters to create a multichannel audio signal.

Disclosure of the invention

Spatial information associated with an audio signal is encoded into a bitstream that can be transmitted to a decoder or recorded on a data carrier. The bitstream can contain different syntax related to the time, frequency and spatial domains. In some embodiments, the bitstream includes one or more data structures (e.g., frames) that contain ordered sets of time intervals to which parameters can be applied. These data structures can be either fixed or variable. A data structure type indicator can be inserted in the bitstream to allow the decoder to determine the data structure type and invoke the appropriate decoding process. The data structure can include position information, which the decoder can use to identify the correct time interval to which a given set of parameters is applied. The position information of a time interval can be encoded using a fixed number of bits or a variable number of bits, depending on the data structure type indicated by the data structure type indicator. For variable-type data structures, the position information of a time interval can be encoded using a variable number of bits based on the position of the time interval in the ordered set of time intervals.

In some embodiments, a method of encoding an audio signal includes: determining a number of time intervals and a number of parameter sets, where the parameter sets include one or more parameters; generating information indicating the position of at least one time interval in an ordered set of time intervals to which a parameter set is applied; encoding the audio signal as a bitstream that includes a frame, where the frame contains the ordered set of time intervals; and inserting into the bitstream a variable number of bits that represent the position of the time interval in the ordered set of time intervals, where the variable number of bits is determined by the position of the time interval.

In some embodiments, a method of decoding an audio signal includes: receiving a bitstream representing an audio signal, where the bitstream contains a frame; determining a number of time intervals and a number of parameter sets from the bitstream, where the parameter sets include one or more parameters; determining position information from the bitstream, where the position information indicates the position of a time interval in an ordered set of time intervals to which a parameter set is applied, and the ordered set of time intervals is contained in the frame; and decoding the audio signal based on the number of time intervals, the number of parameter sets and the position information, where the position information is represented by a variable number of bits based on the position of the time interval.
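
The idea of spending a variable number of bits on each position can be sketched as follows. This section does not state the exact rule for choosing the bit width, so the sketch makes its own assumptions: positions are 0-based and strictly increasing, each parameter set needs its own time interval, and each position is coded as an offset within the range of intervals that is still possible. The names `bits_for`, `encode_positions` and `decode_positions` are hypothetical.

```python
def bits_for(range_size):
    """Bits needed to distinguish range_size values (0 when only one value is possible)."""
    return (range_size - 1).bit_length()


def encode_positions(positions, num_slots):
    """Encode strictly increasing time interval positions as a string of '0'/'1'.

    Assumed rule: the i-th position must leave room for the remaining
    parameter sets, so its admissible range (and bit width) depends on where
    it sits in the ordered set.
    """
    num_sets = len(positions)
    chunks = []
    lower = 0
    for i, pos in enumerate(positions):
        upper = num_slots - (num_sets - i)  # last admissible interval for set i
        if not lower <= pos <= upper:
            raise ValueError("positions must be strictly increasing and fit in the frame")
        width = bits_for(upper - lower + 1)
        if width:
            chunks.append(format(pos - lower, "0{}b".format(width)))
        lower = pos + 1
    return "".join(chunks)


def decode_positions(bitstring, num_sets, num_slots):
    """Invert encode_positions, given the number of parameter sets and time intervals."""
    positions = []
    lower = 0
    cursor = 0
    for i in range(num_sets):
        upper = num_slots - (num_sets - i)
        width = bits_for(upper - lower + 1)
        offset = int(bitstring[cursor:cursor + width], 2) if width else 0
        cursor += width
        positions.append(lower + offset)
        lower = positions[-1] + 1
    return positions


coded = encode_positions([1, 4, 7], num_slots=8)
decoded = decode_positions(coded, num_sets=3, num_slots=8)
```

For a frame of 8 time intervals and parameter sets at positions 1, 4 and 7, this rule spends 3 + 3 + 2 = 8 bits, whereas a fixed width of 3 bits per position would spend 9. The decoder can recompute every width on its own because it already knows the number of time intervals and the number of parameter sets.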

Other embodiments of time interval position coding are also disclosed, directed to systems, methods, devices, data structures and machine-readable media.

It should be understood that both the foregoing general description and the following detailed description of embodiments of the invention are exemplary and explanatory and are intended to provide further explanation of the invention as claimed.

Brief description of drawings

The accompanying drawings, which are included to provide a better understanding of the invention and form a part of this application, illustrate embodiment(s) of the invention and together with the description serve to explain the principles of the invention. In the drawings:

Fig. 1 is a diagram illustrating the principle of generating spatial information according to one embodiment of the present invention;

Fig. 2 is a block diagram of an encoder for encoding an audio signal according to one embodiment of the present invention;

Fig. 3 is a block diagram of a decoder for decoding an audio signal according to one embodiment of the present invention;

Fig. 4 is a block diagram of a channel conversion module included in the upmixing block of the decoder according to one embodiment of the present invention;

Fig. 5 is a diagram explaining a method of configuring the bitstream of an audio signal according to one embodiment of the present invention;

Figs. 6A and 6B are a diagram and a time/frequency graph, respectively, explaining the relationship between a parameter set, a time interval and parametric ranges according to one embodiment of the present invention;

Fig. 7A illustrates syntax representing the configuration information of a spatial information signal according to one embodiment of the present invention;

Fig. 7B is a table of the number of parametric ranges of a spatial information signal according to one embodiment of the present invention;

Fig. 8A illustrates syntax representing the number of parametric ranges applied to an OTT block as a fixed number of bits according to one embodiment of the present invention;

Fig. 8B illustrates syntax representing the number of parametric ranges applied to an OTT block as a variable number of bits according to one embodiment of the present invention;

Fig. 9A illustrates syntax representing the number of parametric ranges applied to a TTT block as a fixed number of bits according to one embodiment of the present invention;

Fig. 9B illustrates syntax representing the number of parametric ranges applied to a TTT block as a variable number of bits according to one embodiment of the present invention;

Fig. 10A illustrates syntax of the spatial extension configuration information for a spatial extension frame according to one embodiment of the present invention;

Figs. 10B and 10C illustrate syntaxes of the spatial extension configuration information for a residual signal in the case where the residual signal is contained in the spatial extension frame according to one embodiment of the present invention;

Fig. 10D illustrates syntax for representing the number of parametric ranges for a residual signal according to one embodiment of the present invention;

Fig. 11A is a block diagram of a decoding device for non-guided coding according to one embodiment of the present invention;

Fig. 11B is a diagram of a method of representing the number of parametric ranges as a group according to one embodiment of the present invention;

Fig. 12 illustrates syntax of the configuration information of a spatial frame according to one embodiment of the present invention;

Fig. 13A illustrates syntax of the position information of a time interval to which a parameter set is applied according to one embodiment of the present invention;

Fig. 13B illustrates syntax for representing the position information of a time interval to which a parameter set is applied as an absolute value and a difference value according to one embodiment of the present invention;

Fig. 13C is a diagram representing multiple items of position information of time intervals, to which parameter sets are applied, as a group according to one embodiment of the present invention;

Fig. 14 is a flowchart of an encoding method according to one embodiment of the present invention;

Fig. 15 is a flowchart of a decoding method according to one embodiment of the present invention; and

Fig. 16 is a block diagram of a device architecture for implementing the encoding and decoding processes described with reference to Figs. 1-15.

Best mode for carrying out the invention

Fig. 1 is a diagram illustrating the principle of generating spatial information according to one embodiment of the present invention. Perceptual coding schemes for multichannel audio signals are based on the fact that humans perceive audio signals in three-dimensional space. The three-dimensional space of an audio signal can be represented using spatial information, including, but not limited to, the following well-known spatial parameters: channel level difference (CLD), inter-channel correlation/coherence (ICC), channel time difference (CTD), channel prediction coefficients (CPC), etc. The CLD parameter describes the difference in energy level between two audio channels, the ICC parameter describes the amount of correlation or coherence between two audio channels, and the CTD parameter describes the time difference between two audio channels.

The generation of the CTD and CLD parameters is illustrated in Fig. 1. A first direct sound wave 103 from a remote sound source 101 reaches the listener's left ear 107, while a second direct sound wave 102 diffracts around the listener's head to reach the right ear 106. The direct sound waves 102 and 103 differ from each other in time of arrival and energy level. The CTD and CLD parameters can be generated from the differences in arrival time and energy level of sound waves 102 and 103, respectively. In addition, reflected sound waves 104 and 105, which are not mutually correlated, reach ears 106 and 107, respectively. The ICC parameter can be generated from the correlation between sound waves 104 and 105.
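
The three parameters described above can be estimated from two sampled channels with textbook formulas, as in the sketch below. The text gives no formulas, so everything here is an assumption: CLD is taken as an energy ratio in decibels, ICC as the normalized zero-lag cross-correlation, and CTD as the lag (in samples) that maximizes the cross-correlation. The function names are hypothetical.

```python
import math


def cld_db(ch1, ch2, eps=1e-12):
    """Channel level difference: energy ratio of ch1 to ch2 in dB (assumed definition)."""
    e1 = sum(a * a for a in ch1)
    e2 = sum(b * b for b in ch2)
    return 10.0 * math.log10((e1 + eps) / (e2 + eps))


def icc(ch1, ch2, eps=1e-12):
    """Inter-channel correlation: normalized cross-correlation at zero lag (assumed definition)."""
    e1 = sum(a * a for a in ch1)
    e2 = sum(b * b for b in ch2)
    cross = sum(a * b for a, b in zip(ch1, ch2))
    return cross / math.sqrt(e1 * e2 + eps)


def ctd_samples(ch1, ch2, max_lag):
    """Channel time difference: lag by which ch2 trails ch1, in samples.

    Searches lags in [-max_lag, max_lag] for the largest cross-correlation.
    """
    best_lag = 0
    best_score = float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for t, a in enumerate(ch1):
            u = t + lag
            if 0 <= u < len(ch2):
                score += a * ch2[u]
        if score > best_score:
            best_score = score
            best_lag = lag
    return best_lag


# ch2 is a copy of ch1 delayed by two samples and halved in amplitude.
ch1 = [0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
ch2 = [0.0, 0.0, 0.0, 0.5, 0.0, 0.0]
level = cld_db(ch1, ch2)
delay = ctd_samples(ch1, ch2, max_lag=3)
```

In the example, halving the amplitude gives a level difference of about 6 dB, and the delay search recovers the two-sample offset. A practical codec would compute such values per time interval and per frequency range rather than over a whole signal.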

In the encoder, spatial information (e.g., spatial parameters) is extracted from the multichannel input audio signal, and a downmix signal is generated. The downmix signal and the spatial parameters are transmitted to the decoder. The downmix signal can have any number of audio channels, including but not limited to a mono signal, a stereo signal or a multichannel audio signal. In the decoder, a multichannel upmix signal is created from the downmix signal and the spatial parameters.

Fig. 2 is a block diagram of an encoder for encoding an audio signal according to one embodiment of the present invention. The encoder includes a downmixing block 202, a spatial information generating block 203, a downmix signal encoding block 207 and a multiplexing block 209. Other encoder configurations are possible. Encoders can be implemented in hardware, software or a combination of hardware and software. Encoders can be implemented in integrated circuits, chipsets, systems on a chip (SoC), digital signal processors, general-purpose processors and various digital and analog devices.

Downmixing block 202 generates downmix signal 204 from multichannel audio signal 201. In Fig. 2, x1, ..., xn denote the input audio channels. As mentioned above, downmix signal 204 can be a mono, stereo or multichannel audio signal. In the example shown, x'1, ..., x'm denote the channels of downmix signal 204. In some embodiments, instead of downmix signal 204, the encoder processes a downmix signal 205 supplied from outside (for example, an artistic downmix).

Spatial information generating block 203 extracts spatial information from multichannel audio signal 201. Here, the term "spatial information" means information about the audio signal channels that is used in the decoder to upmix downmix signal 204 into a multichannel audio signal. Downmix signal 204 is generated by downmixing the multichannel audio signal. The spatial information is encoded to provide encoded spatial information signal 206.

Downmix signal encoding block 207 generates encoded downmix signal 208 by encoding downmix signal 204 generated in downmixing block 202.

Multiplexing block 209 generates bitstream 210, which includes encoded downmix signal 208 and encoded spatial information signal 206. Bitstream 210 can be transmitted to a downstream decoder and/or recorded on a data carrier.
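
The role of the multiplexing block, and of the inverse operation a decoder needs, can be pictured with the toy container below. The layout (two big-endian 32-bit length fields followed by the two payloads) is purely an assumption for illustration and is not the bitstream format of the invention; `multiplex` and `demultiplex` are hypothetical names.

```python
import struct

HEADER = ">II"  # assumed layout: two unsigned 32-bit big-endian payload lengths


def multiplex(encoded_downmix: bytes, encoded_spatial: bytes) -> bytes:
    """Combine the encoded downmix and encoded spatial information into one stream."""
    header = struct.pack(HEADER, len(encoded_downmix), len(encoded_spatial))
    return header + encoded_downmix + encoded_spatial


def demultiplex(bitstream: bytes):
    """Split a stream produced by multiplex() back into its two payloads."""
    n_downmix, n_spatial = struct.unpack_from(HEADER, bitstream, 0)
    start = struct.calcsize(HEADER)
    downmix = bitstream[start:start + n_downmix]
    spatial = bitstream[start + n_downmix:start + n_downmix + n_spatial]
    if len(downmix) != n_downmix or len(spatial) != n_spatial:
        raise ValueError("truncated bitstream")
    return downmix, spatial


stream = multiplex(b"downmix-bytes", b"spatial-bytes")
downmix_payload, spatial_payload = demultiplex(stream)
```

Keeping the two payloads separable is what lets a decoder that cannot produce multichannel output ignore the spatial information and decode the downmix alone.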

Fig. 3 is a block diagram of a decoder for decoding an encoded audio signal according to one embodiment of the present invention. The decoder includes a demultiplexing block 302, a down-mixed signal decoding block 305, a spatial information decoding block 307 and an up-mixing block 309. Decoders can be implemented in hardware, software or a combination of hardware and software. Decoders can be implemented in integrated circuits, chipsets, systems on a chip (SoC), digital signal processors, general-purpose processors and various digital and analog devices.

In some embodiments, the demultiplexing block 302 receives a bitstream 301 representing an audio signal and then separates an encoded down-mixed signal 303 and an encoded spatial information signal 304 from the bitstream 301. In Fig. 3, x'1,...,x'm denote the channels of the down-mixed signal 303. The down-mixed signal decoding block 305 outputs a decoded down-mixed signal 306 by decoding the encoded down-mixed signal 303. If the decoder is not capable of outputting a multi-channel audio signal, the down-mixed signal decoding block 305 can output the down-mixed signal 306 directly. In Fig. 3, y'1,...,y'm denote the direct output channels of the down-mixed signal decoding block 305.

The spatial information decoding block 307 extracts the configuration information of the spatial information signal from the encoded spatial information signal 304 and then decodes the spatial information signal 304 using the extracted configuration information.

The up-mixing block 309 can up-mix the down-mixed signal 306 into a multi-channel audio signal 310 using the extracted spatial information 308. In Fig. 3, y1,...,yn denote the output channels of the up-mixing block 309.

Fig. 4 is a block diagram of a channel conversion module that can be included in the up-mixing block 309 of the decoder shown in Fig. 3. In some embodiments, the up-mixing block 309 can include multiple channel conversion modules. A channel conversion module is a conceptual device that, using specific information, can make the number of output channels differ from the number of input channels.

In some embodiments, the channel conversion module can include an OTT (one-to-two) block for converting one channel into two channels and vice versa, and a TTT (two-to-three) block for converting two channels into three channels and vice versa. The OTT and/or TTT blocks can be arranged in a variety of useful configurations. For example, the up-mixing block 309 shown in Fig. 3 can include a 5-1-5 configuration, a 5-2-5 configuration, a 7-2-7 configuration, a 7-5-7 configuration, etc. In the 5-1-5 configuration, a one-channel down-mixed signal is generated by down-mixing five channels into one channel, which can then be up-mixed to five channels. Other configurations can be created in the same way using different combinations of OTT and TTT blocks.

Fig. 4 shows, as an example, a 5-2-5 configuration for an up-mixing block 400. In the 5-2-5 configuration, a down-mixed signal 401 having two channels is input to the up-mixing block 400. In the example shown, a left channel (L) and a right channel (R) are provided as inputs to the up-mixing block 400. In this embodiment, the up-mixing block 400 includes one TTT block 402 and three OTT blocks 406, 407 and 408. The two-channel down-mixed signal 401 is provided as input to the TTT block (TTT0) 402, which processes the down-mixed signal 401 and provides three channels 403, 404 and 405 as output. One or more spatial parameters (e.g., CPC, CLD, ICC), which are used for processing the down-mixed signal 401 as described below, can be provided as input to the TTT block 402. In some embodiments, a residual signal can be selectively provided as input to the TTT block 402. In this case, the CPC parameter can be defined as a prediction coefficient for generating three channels from two channels.

The channel 403, which is provided as output from the TTT block 402, is input to the OTT block 406, which generates two output channels using one or more spatial parameters. In the example shown, the two output channels represent the front left (FL) and back left (BL) speaker positions in, for example, a surround sound environment. The channel 404 is provided as input to the OTT block 407, which generates two output channels using one or more spatial parameters. In the example shown, the two output channels represent the front right (FR) and back right (BR) speaker positions. The channel 405 is provided as input to the OTT block 408, which generates two output channels. In the example shown, the two output channels represent the center (C) speaker position and the low-frequency enhancement (LFE) channel. In this case, spatial information (e.g., CLD, ICC) can be provided as input to each of the OTT blocks. In some embodiments, residual signals (Res1, Res2) can be provided as inputs to the OTT blocks 406 and 407. In this embodiment, a residual signal may be provided as an input signal to the OTT block 408, which outputs the center channel and the LFE channel.
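The routing of the 5-2-5 configuration described above can be written down as plain data. The following Python sketch is only an illustration of the channel topology of Fig. 4; the dictionary layout and the helper name output_channels are assumptions introduced here, and no signal processing or spatial parameters are modelled.

```python
# Topology of the 5-2-5 tree of Fig. 4 as plain data.
# Each entry maps a block to (input channels, output channels).
# The layout and the helper below are illustrative assumptions.
TREE_5_2_5 = {
    "TTT 402": (("L", "R"), ("403", "404", "405")),
    "OTT 406": (("403",), ("FL", "BL")),
    "OTT 407": (("404",), ("FR", "BR")),
    "OTT 408": (("405",), ("C", "LFE")),
}


def output_channels(tree, inputs):
    """Propagate channel labels through the tree and return the final outputs."""
    available = list(inputs)
    pending = dict(tree)
    while pending:
        progressed = False
        for name, (ins, outs) in list(pending.items()):
            if all(ch in available for ch in ins):
                # Consume the block's inputs and expose its outputs.
                available = [ch for ch in available if ch not in ins]
                available.extend(outs)
                del pending[name]
                progressed = True
        if not progressed:
            raise ValueError("a block's inputs are never produced")
    return available


channels = output_channels(TREE_5_2_5, ["L", "R"])
print(channels)
```

Starting from the two down-mixed channels L and R, the traversal ends with six channel labels, one per speaker position plus the LFE channel.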

The configuration shown in Fig. 4 is one example of a configuration for a channel conversion module. Other configurations of the channel conversion module are possible, including various combinations of OTT and TTT blocks. Since each channel conversion module can operate in the frequency domain, the number of parametric ranges applied to each channel conversion module can be defined. A parametric range denotes at least one frequency range to which a single parameter applies. The number of parametric ranges is described with reference to Fig. 6B.

Fig. 5 is a diagram illustrating ways of configuring the bitstream of an audio signal according to one embodiment of the present invention. Fig. 5(a) shows a bitstream of an audio signal that includes only a spatial information signal, and Figs. 5(b) and 5(c) show bitstreams of an audio signal that include a down-mixed signal and a spatial information signal.

Referring to Fig. 5(a), the bitstream of the audio signal can include configuration information 501 and a frame 503. The frame 503 can be repeated in the bitstream and, in some embodiments, can include only a spatial frame 502 that contains the spatial audio information.

In some embodiments, the configuration information 501 includes information describing the total number of time intervals in one spatial frame 502, the total number of parametric ranges covering the frequency range of the audio signal, the number of parametric ranges in an OTT block, the number of parametric ranges in a TTT block and the number of parametric ranges in a residual signal. If desired, other information can also be included in the configuration information 501.

In some embodiments, the spatial frame 502 includes one or more spatial parameters (e.g., CLD, ICC), the frame type, the number of parameter sets in one frame, and the time intervals to which the parameter sets can be applied. Optionally, other information can also be included in the spatial frame 502. The meaning and use of the configuration information 501 and of the information contained in the spatial frame 502 are explained below with reference to Figs. 6 through 10.

Referring to Fig. 5(b), the bitstream of the audio signal can include configuration information 504, a down-mixed signal 505 and a spatial frame 506. In this case, one frame 507 can include the down-mixed signal 505 and the spatial frame 506, and the frame 507 can be repeated in the bitstream.

Referring to Fig. 5(c), the bitstream of the audio signal can include a down-mixed signal 508, configuration information 509 and a spatial frame 510. In this case, one frame 511 can include the configuration information 509 and the spatial frame 510, and the frame 511 can be repeated in the bitstream. If the configuration information 509 is inserted in each frame 511, a playback device can start playing back the audio signal from an arbitrary position.

Although Fig. 5(c) shows the configuration information 509 inserted in the bitstream in every frame 511, it should be apparent that the configuration information 509 can instead be inserted in the bitstream at intervals of several frames, periodically or aperiodically.

Figs. 6A and 6B are diagrams illustrating the relationships between a parameter set, a time interval and parametric ranges according to one embodiment of the present invention. A parameter set is one or more spatial parameters applied to one time interval. The spatial parameters can include spatial information such as CLD, ICC, CPC, etc. A time interval is an interval of the audio signal to which spatial parameters can be applied. One spatial frame can include one or more time intervals.

Referring to Fig. 6A, multiple parameter sets 1,...,P can be used in a spatial frame, and each parameter set can include one or more data fields 1,...,Q-1. A parameter set can be applied to the entire frequency range of the audio signal, and each spatial parameter in the parameter set can be applied to one or more portions of the frequency range. For example, if a parameter set includes 20 spatial parameters, the entire frequency band of the audio signal can be divided into 20 zones (hereinafter called "parametric ranges"), and the 20 spatial parameters of the parameter set can be applied to the 20 parametric ranges. The parameters can be applied to the parametric ranges as required. For example, the spatial parameters can be applied densely to low-frequency parametric ranges and sparsely to high-frequency parametric ranges.

Referring to Fig. 6B, a time/frequency graph shows the relationship between parameter sets and time intervals. In the example shown, three parameter sets (parameter set 1, parameter set 2, parameter set 3) are applied to an ordered set of 12 time intervals in one spatial frame. In this case, the entire frequency range of the audio signal is divided into 9 parametric ranges. Thus, the horizontal axis indicates the number of time intervals and the vertical axis indicates the number of parametric ranges. Each of the three parameter sets is applied to a specific time interval. For example, the first parameter set (parameter set 1) is applied to time interval #1, the second parameter set (parameter set 2) is applied to time interval #5, and the third parameter set (parameter set 3) is applied to time interval #9. The parameter sets can be applied to the other time intervals by interpolating and/or copying the parameter sets to those time intervals. In general, the number of parameter sets can be less than or equal to the number of time intervals, and the number of parametric ranges can be less than or equal to the number of frequency bands of the audio signal. By encoding spatial information for only some portions of the time-frequency domain of the audio signal rather than for the entire time-frequency domain, the amount of spatial information sent from the encoder to the decoder can be reduced. This data reduction is possible because, according to known principles of perceptual audio coding, sparse information in the time-frequency domain is often sufficient for human auditory perception.
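As an illustration of how three transmitted parameter sets could cover all 12 time intervals, the Python sketch below fills the remaining intervals either by copying the most recent set or by linear interpolation towards the next one. The text only states that interpolation and/or copying can be used; the linear rule, the function name expand_parameter_sets and the numeric parameter values are assumptions chosen for the example.

```python
def expand_parameter_sets(anchor_slots, parameter_sets, num_slots, interpolate=True):
    """Return one parameter vector per time interval (intervals numbered from 1).

    anchor_slots   -- intervals at which a parameter set is transmitted, ascending
    parameter_sets -- one list of per-range values for each anchor interval
    """
    per_slot = []
    for slot in range(1, num_slots + 1):
        # Index of the last anchor at or before this interval (first anchor if none).
        k = max((i for i, a in enumerate(anchor_slots) if a <= slot), default=0)
        current = parameter_sets[k]
        is_last = k == len(anchor_slots) - 1
        if not interpolate or is_last or slot <= anchor_slots[k]:
            per_slot.append(list(current))  # copy the nearest earlier set
            continue
        following = parameter_sets[k + 1]
        w = (slot - anchor_slots[k]) / (anchor_slots[k + 1] - anchor_slots[k])
        per_slot.append([(1 - w) * c + w * f for c, f in zip(current, following)])
    return per_slot


NUM_RANGES = 9  # parametric ranges, as in the Fig. 6B example
# Placeholder values, not taken from the source.
sets = [[0.0] * NUM_RANGES, [4.0] * NUM_RANGES, [8.0] * NUM_RANGES]
slots = expand_parameter_sets([1, 5, 9], sets, num_slots=12)
copied = expand_parameter_sets([1, 5, 9], sets, num_slots=12, interpolate=False)
```

Only 3 x 9 values are transmitted in this example, yet every one of the 12 x 9 time/frequency cells receives a value.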

An important feature of the embodiments disclosed here is the encoding and decoding of the positions of the time intervals to which parameter sets are applied, using a fixed or variable number of bits. The number of parametric ranges can also be represented by a fixed or variable number of bits. The variable-bit coding scheme can also be applied to other information used in spatial audio coding, including, but not limited to, information related to the time, spatial and/or frequency domains (for example, the number of frequency sub-bands at the output of a filter bank).

Fig. 7A shows a syntax for representing the configuration information of a spatial information signal according to one embodiment of the present invention. The configuration information includes a plurality of fields 701 through 718, to each of which a number of bits can be assigned.

The field “bsSamplingFreqencyIndex” 701 indicates the sampling frequency obtained from sampling the audio signal. Four bits are allocated to the field “bsSamplingFreqencyIndex” 701 to represent the sampling frequency. If the value of the field “bsSamplingFreqencyIndex” 701 is 15, that is, binary 1111, a field “bsSamplingFreqency” 702 is added to represent the sampling frequency. In this case, 24 bits are allocated to the field “bsSamplingFreqency” 702.
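The escape mechanism for the sampling frequency can be sketched as follows. The field widths (4 and 24 bits) and the escape value 15 come from the text; the BitReader class, the MSB-first bit order and the function name are assumptions made for this illustration.

```python
class BitReader:
    """Reads unsigned integers MSB-first from a bytes object (bit order assumed)."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # current position in bits

    def read(self, nbits):
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos // 8]
            bit = (byte >> (7 - self.pos % 8)) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def read_sampling_frequency(reader):
    """Return (index, explicit_frequency or None)."""
    index = reader.read(4)               # bsSamplingFreqencyIndex, 4 bits
    if index == 15:                      # escape value, binary 1111
        return index, reader.read(24)    # bsSamplingFreqency, 24 bits
    return index, None


# 4-bit escape index followed by a 24-bit explicit value, padded to 4 bytes.
data = (((15 << 24) | 48000) << 4).to_bytes(4, "big")
result = read_sampling_frequency(BitReader(data))
```

The escape value lets common sampling rates cost 4 bits while still allowing an arbitrary rate at a cost of 28 bits.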

The field “bsFrameLength” 703 indicates the total number of time intervals (hereinafter called “numSlots”) in one spatial frame, and “numSlots” and the field “bsFrameLength” 703 can be related by numSlots = bsFrameLength + 1.

The field “bsFreqRes” 704 indicates the total number of parametric ranges covering the entire frequency domain of the audio signal. The field “bsFreqRes” 704 is explained below with reference to Fig. 7B.

The field “bsTreeConfig” 705 indicates information about a tree configuration that includes a plurality of channel conversion modules such as those described with reference to Fig. 4. The tree configuration information includes the type of channel conversion module, the number of channel conversion modules, the type of spatial information used in the channel conversion module, the number of input/output channels of the audio signal, etc.

The tree configuration can be one of a 5-1-5 configuration, a 5-2-5 configuration, a 7-2-7 configuration, a 7-5-7 configuration, etc., according to the type of channel conversion module or the number of channels. Fig. 4 shows the 5-2-5 tree configuration.

The field “bsQuantMode” 706 indicates the quantization mode of the spatial information.

The field “bsOneIcc” 707 indicates whether one ICC parameter subset is used for all OTT blocks. In this case, a parameter subset is the parameter set applied to a specific time interval and a specific channel conversion module.

The field “bsArbitraryDownmix” 708 indicates the presence or absence of an arbitrary down-mixing gain. The field “bsFixedGainSur” 709 indicates the gain applied to the surround channels, for example, LS (left surround) and RS (right surround).

The field “bsFixedGainLF” 710 indicates the gain applied to the LFE channel.

The field “bsFixedGainDM” 711 indicates the gain applied to the down-mixed signal.

The field “bsMatrixMode” 712 indicates whether the encoder generates a matrix-compatible stereo down-mixed signal.

The field “bsTempShapeConfig” 713 indicates the operating mode of temporal shaping (e.g., TES (temporal envelope shaping) and/or TP (temporal shaping)) in the decoder.

The field “bsDecorrConfig” 714 indicates the operating mode of the decorrelator of the decoder.

The field “bs3DaudioMode” 715 indicates whether the down-mixed signal is encoded as a 3D (three-dimensional) signal and whether inverse HRTF (head-related transfer function) processing is used.

After the information of each field has been determined/extracted in the encoder/decoder, the encoder/decoder determines/extracts information about the number of parametric ranges applied to the channel conversion modules. First, the number of parametric ranges applied to the OTT blocks is determined/extracted (716), and then the number of parametric ranges applied to the TTT blocks is determined/extracted (717). The number of parametric ranges for the OTT block and/or the TTT block is described in detail with reference to Figs. 8A through 9B.

If an extension frame exists, the block “spatialExtensionConfig” 718 includes configuration information for the extension frame. The information included in the block “spatialExtensionConfig” 718 is described below with reference to Figs. 10A through 10D.

Fig. 7B shows a table for the number of parametric ranges of a spatial information signal according to one embodiment of the present invention. “numBands” indicates the number of parametric ranges for the entire frequency domain of the audio signal, and “bsFreqRes” indicates index information for the number of parametric ranges. For example, the entire frequency domain of the audio signal can be divided into a chosen number of parametric ranges (e.g., 4, 5, 7, 10, 14, 20, 28, etc.).

In some embodiments, one parameter can be applied to each parametric range. For example, if “numBands” is 28, the entire frequency domain of the audio signal is divided into 28 parametric ranges, and each of 28 parameters can be applied to a respective one of the 28 parametric ranges. In another example, if “numBands” is 4, the entire frequency domain of the audio signal is divided into 4 parametric ranges, and each of 4 parameters can be applied to a respective one of the 4 parametric ranges. In Fig. 7B, the term “Reserved” means that the number of parametric ranges for the entire frequency domain of the audio signal is not defined.

It should be noted that the human auditory organ is not sensitive to the number of parametric ranges used in the coding scheme. Thus, using a small number of parametric ranges can give the listener a spatial audio effect similar to the one obtained with a larger number of parametric ranges.

Unlike “numBands”, “numSlots”, represented by the field “bsFrameLength” 703 shown in Fig. 7A, can take any value. However, the values of “numSlots” can be restricted if the number of samples in one spatial frame must be exactly divisible by “numSlots”. Thus, if the maximum value of “numSlots” to be represented is 'b', the field “bsFrameLength” 703 can be represented by ceil{log2(b)} bits. Here, 'ceil(x)' means the smallest integer greater than or equal to 'x'. For example, if one spatial frame includes 72 time intervals, ceil{log2(72)} = 7 bits can be allocated to the field “bsFrameLength” 703, and the number of parametric ranges applied to a channel conversion module can be set to a value within “numBands”.
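The bit allocation for “bsFrameLength” and the relation numSlots = bsFrameLength + 1 can be checked with a few lines of Python. The function names are hypothetical; the integer expression (b - 1).bit_length() is used here in place of ceil(log2(b)) because the two agree for every integer b >= 1 and the integer form avoids floating-point rounding.

```python
def frame_length_bits(max_num_slots):
    """Bits for bsFrameLength when numSlots never exceeds max_num_slots.

    (b - 1).bit_length() equals ceil(log2(b)) for every integer b >= 1.
    """
    return (max_num_slots - 1).bit_length()


def encode_frame_length(num_slots):
    return num_slots - 1          # bsFrameLength = numSlots - 1


def decode_frame_length(bs_frame_length):
    return bs_frame_length + 1    # numSlots = bsFrameLength + 1


bits = frame_length_bits(72)      # the 72-interval example from the text
field = encode_frame_length(72)
```

With 72 time intervals the stored field value is 71, which fits in the 7 bits computed above.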

Fig. 8A shows a syntax for representing, with a fixed number of bits, the number of parametric ranges applied to an OTT block according to one embodiment of the present invention. Referring to Figs. 7A and 8A, 'i' takes values from zero to numOttBoxes - 1, where 'numOttBoxes' is the total number of OTT blocks. That is, the value of 'i' identifies each OTT block, and the number of parametric ranges applied to each OTT block is given for the corresponding value of 'i'. If an OTT block has an LFE channel mode, the number of parametric ranges (hereinafter “bsOttBands”) applied to the LFE channel of the OTT block can be represented using a fixed number of bits. In the example shown in Fig. 8A, 5 bits are allocated to the field “bsOttBands” 801. If an OTT block does not have an LFE channel mode, the total number of parametric ranges (numBands) can be applied to the channels of the OTT block.
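The loop over the OTT blocks described for Fig. 8A might be sketched as follows. The read_bits callable, the list of per-block LFE flags and the function name are assumptions made for this illustration; in particular, the text does not say here how the decoder learns which blocks have an LFE channel mode, so the flags are simply passed in.

```python
def read_ott_bands(read_bits, num_bands, ott_has_lfe):
    """Return the number of parametric ranges for each OTT block.

    read_bits   -- callable returning the next n bits as an unsigned integer
    ott_has_lfe -- one boolean per OTT block, True if the block has an LFE mode
    """
    bands = []
    for has_lfe in ott_has_lfe:            # i = 0 .. numOttBoxes - 1
        if has_lfe:
            bands.append(read_bits(5))     # bsOttBands, fixed 5 bits (Fig. 8A)
        else:
            bands.append(num_bands)        # the full numBands applies
    return bands


# Stand-in reader that returns prepared field values instead of parsing bits.
values = iter([2])
ott_bands = read_ott_bands(lambda nbits: next(values), 28, [False, False, True])
```

Only the block with an LFE channel mode consumes bits from the stream; the other blocks inherit numBands.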

Fig. 8B shows a syntax for representing, with a variable number of bits, the number of parametric ranges applied to an OTT block according to one embodiment of the present invention. Fig. 8B is similar to Fig. 8A, except that the field “bsOttBands” 802 shown in Fig. 8B is represented by a variable number of bits. In particular, the field “bsOttBands” 802, whose value is less than or equal to “numBands”, can be represented by a variable number of bits that depends on “numBands”.

If “numBands” is greater than or equal to 2^(n-1) and less than 2^n, the field “bsOttBands” 802 can be represented by n bits.

For example: (a) if “numBands” is 40, the field “bsOttBands” 802 is represented by 6 bits; (b) if “numBands” is 28 or 20, the field “bsOttBands” 802 is represented by 5 bits; (c) if “numBands” is 14 or 10, the field “bsOttBands” 802 is represented by 4 bits; and (d) if “numBands” is 7, 5 or 4, the field “bsOttBands” 802 is represented by 3 bits.

Alternatively, if “numBands” is greater than 2^(n-1) and less than or equal to 2^n, the field “bsOttBands” 802 can be represented by n bits.

For example: (a) if “numBands” is 40, the field “bsOttBands” 802 is represented by 6 bits; (b) if “numBands” is 28 or 20, the field “bsOttBands” 802 is represented by 5 bits; (c) if “numBands” is 14 or 10, the field “bsOttBands” 802 is represented by 4 bits; (d) if “numBands” is 7 or 5, the field “bsOttBands” 802 is represented by 3 bits; and (e) if “numBands” is 4, the field “bsOttBands” 802 is represented by 2 bits.
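The two alternative rules for the bit width differ only in how exact powers of two are treated. A short Python sketch, with hypothetical function names, reproduces both lists of examples; for positive integers the first rule equals ceil(log2(numBands + 1)) and the second equals ceil(log2(numBands)).

```python
def bits_lower_inclusive(num_bands):
    """n such that 2**(n-1) <= num_bands < 2**n."""
    return num_bands.bit_length()


def bits_upper_inclusive(num_bands):
    """n such that 2**(n-1) < num_bands <= 2**n."""
    return (num_bands - 1).bit_length()


BAND_COUNTS = (40, 28, 20, 14, 10, 7, 5, 4)
first_rule = [bits_lower_inclusive(nb) for nb in BAND_COUNTS]
second_rule = [bits_upper_inclusive(nb) for nb in BAND_COUNTS]

for nb, a, b in zip(BAND_COUNTS, first_rule, second_rule):
    print(f"numBands={nb:2d}  rule 1: {a} bits  rule 2: {b} bits")
```

The rules disagree only for numBands = 4 among the listed values, because 4 is an exact power of two.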

The field “bsOttBands” 802 can be represented by a variable number of bits using the ceiling function (the function that returns the smallest integer not less than its argument), with “numBands” as the variable.

In particular: (i) if 0 < bsOttBands ≤ numBands or 0 ≤ bsOttBands < numBands, the field “bsOttBands” 802 is represented by ceil(log2(numBands)) bits; or (ii) if 0 ≤ bsOttBands ≤ numBands, the field “bsOttBands” 802 can be represented by ceil(log2(numBands+1)) bits.

If a value less than or equal to “numBands” (hereinafter called “numberBands”) is chosen arbitrarily, the field “bsOttBands” 802 can be represented by a variable number of bits using the ceiling function with “numberBands” as the variable.

In particular: (i) if 0 < bsOttBands ≤ numberBands or 0 ≤ bsOttBands < numberBands, the field “bsOttBands” 802 is represented by ceil(log2(numberBands)) bits; or (ii) if 0 ≤ bsOttBands ≤ numberBands, the field “bsOttBands” 802 can be represented by ceil(log2(numberBands+1)) bits.

If more than one OTT block is used, the combination of the “bsOttBands” values can be expressed by Formula 1 below.

[Formula 1]

where bsOttBands_i indicates the i-th “bsOttBands”. For example, suppose there are three OTT blocks and three values (N = 3) of the field “bsOttBands” 802. In this example, the three values of the field “bsOttBands” 802 (hereinafter a1, a2 and a3), applied to the three respective OTT blocks, can be represented by 2 bits each. Hence, a total of 6 bits is needed to express the values a1, a2 and a3 separately. If the values a1, a2 and a3 are instead represented as a group, there are 27 (= 3·3·3) possible combinations, which can be represented by 5 bits, saving one bit. If “numBands” is 3 and the group value represented by the 5 bits is 15, the group value can be written as 15 = 1·3^2 + 2·3^1 + 0·3^0. The decoder, by applying the inverse of Formula 1, can therefore determine from the group value 15 that the three values a1, a2 and a3 of the field “bsOttBands” 802 are 1, 2 and 0, respectively.
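The grouping in the worked example amounts to writing the individual values as digits of one number in base “numBands”. The Python sketch below reproduces the example under the assumption that the first value is the most significant digit, which is what 15 = 1·3^2 + 2·3^1 + 0·3^0 with a1 = 1, a2 = 2, a3 = 0 implies; the function names are hypothetical and the exact form of Formula 1 is not reproduced here.

```python
def pack_group(values, base):
    """Combine values (each in 0..base-1) into one integer, first value most significant."""
    code = 0
    for v in values:
        if not 0 <= v < base:
            raise ValueError("value out of range for the chosen base")
        code = code * base + v
    return code


def unpack_group(code, base, count):
    """Inverse of pack_group: recover `count` values from the group code."""
    values = []
    for _ in range(count):
        code, digit = divmod(code, base)
        values.append(digit)
    return values[::-1]


def group_bits(base, count):
    """Bits needed to represent any of the base**count combinations."""
    return (base ** count - 1).bit_length()


code = pack_group([1, 2, 0], base=3)
decoded = unpack_group(code, base=3, count=3)
bits_grouped = group_bits(3, 3)
bits_separate = 3 * (3 - 1).bit_length()
```

For three values with three possibilities each, the grouped code needs 5 bits where separate coding needs 6.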

In the case of multiple OTT blocks, the combination of the “bsOttBands” values can be represented by one of Formulas 2 through 4 below using “numberBands”. Because the representation of “bsOttBands” using “numberBands” is similar to the representation using “numBands” in Formula 1, a detailed explanation of the following formulas is omitted.

[Formula 2]

[Formula 3]

[Formula 4]

Fig. 9A shows a syntax for representing, with a fixed number of bits, the number of parametric ranges applied to a TTT block according to one embodiment of the present invention. Referring to Figs. 7A and 9A, 'i' takes values from zero to numTttboxes - 1, where 'numTttboxes' is the total number of TTT blocks. That is, the value of 'i' identifies each TTT block. The number of parametric ranges applied to each TTT block is given for the corresponding value of 'i'. In some embodiments, a TTT block can be divided into a low-frequency range and a high-frequency range, and different processing can be applied to the low-frequency and high-frequency ranges. Other divisions are possible.

The field “bsTttDualMode” 901 indicates whether the given TTT block operates in different modes (hereinafter called “dual mode”) for the low-frequency and high-frequency ranges. For example, if the value of the field “bsTttDualMode” 901 is zero, one mode is used for the entire range without distinguishing between the low-frequency and high-frequency ranges. If the value of the field “bsTttDualMode” 901 is 1, different modes can be used for the low-frequency and high-frequency ranges.

The field “bsTttModeLow” 902 indicates the operating mode of the given TTT block, which can have various operating modes. For example, the TTT block can operate in a prediction mode, which uses, for example, the CPC and ICC parameters, in an energy-based mode, which uses, for example, the CLD parameters, etc. If the TTT block has a dual mode, additional information may be needed for the high-frequency range.

The field “bsTttModeHigh” 903 indicates the operating mode of the high-frequency range in the case where the TTT block has a dual mode.

The field “bsTttBandsLow” 904 indicates the number of parametric ranges applied to the TTT block.

The field “bsTttBandsHigh” 905 takes the value of “numBands”.

If the TTT block has a dual mode, the low-frequency range covers the parametric ranges greater than or equal to zero and less than “bsTttBandsLow”, while the high-frequency range covers the parametric ranges greater than or equal to “bsTttBandsLow” and less than “bsTttBandsHigh”.

If the TTT block does not have a dual mode, the parametric ranges applied to the TTT block are those greater than or equal to zero and less than “numBands” (907).
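The split of a TTT block's parametric ranges into a low and a high part can be illustrated with half-open index ranges. The function name and the dictionary keys are assumptions made for this sketch; the boundaries follow the text.

```python
def ttt_band_ranges(num_bands, dual_mode, bs_ttt_bands_low=None):
    """Return the parametric-range indices covered by one TTT block.

    Ranges are half-open: range(a, b) covers a <= index < b.
    """
    if not dual_mode:                       # bsTttDualMode == 0
        return {"all": range(0, num_bands)}
    bs_ttt_bands_high = num_bands           # bsTttBandsHigh takes the value numBands
    return {
        "low": range(0, bs_ttt_bands_low),
        "high": range(bs_ttt_bands_low, bs_ttt_bands_high),
    }


dual = ttt_band_ranges(28, dual_mode=True, bs_ttt_bands_low=10)
single = ttt_band_ranges(28, dual_mode=False)
```

In dual mode the two ranges are adjacent and together cover every index below numBands, so each parametric range is processed by exactly one of the two modes.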

The field “bsTttBandsLow” 904 can be represented by a fixed number of bits. For example, as shown in Fig. 9A, 5 bits can be allocated to represent the field “bsTttBandsLow” 904.

Fig. 9B shows a syntax for representing, with a variable number of bits, the number of parametric ranges applied to a TTT block according to one embodiment of the present invention. Fig. 9B is similar to Fig. 9A but differs in that the field “bsTttBandsLow” 907 in Fig. 9B is represented by a variable number of bits, whereas the field “bsTttBandsLow” 904 in Fig. 9A is represented by a fixed number of bits. In particular, because “bsTttBandsLow” 907 has a value less than or equal to “numBands”, the field “bsTttBandsLow” 907 can be represented by a variable number of bits that depends on “numBands”.

In particular, if “numBands” is greater than or equal to 2^(n-1) and less than 2^n, the field “bsTttBandsLow” 907 can be represented by n bits.

For example: (i) if “numBands” is 40, the field “bsTttBandsLow” 907 is represented by 6 bits; (ii) if “numBands” is 28 or 20, the field “bsTttBandsLow” 907 is represented by 5 bits; (iii) if “numBands” is 14 or 10, the field “bsTttBandsLow” 907 is represented by 4 bits; and (iv) if “numBands” is 7, 5 or 4, the field “bsTttBandsLow” 907 is represented by 3 bits.

Alternatively, if “numBands” is greater than 2^(n-1) and less than or equal to 2^n, the field “bsTttBandsLow” 907 can be represented by n bits.

For example: (i) if “numBands” is 40, the field “bsTttBandsLow” 907 is represented by 6 bits; (ii) if “numBands” is 28 or 20, the field “bsTttBandsLow” 907 is represented by 5 bits; (iii) if “numBands” is 14 or 10, the field “bsTttBandsLow” 907 is represented by 4 bits; (iv) if “numBands” is 7 or 5, the field “bsTttBandsLow” 907 is represented by 3 bits; and (v) if “numBands” is 4, the field “bsTttBandsLow” 907 is represented by 2 bits.

The field “bsTttBandsLow” 907 can be represented by a number of bits determined by the ceiling function with “numBands” as the variable.

For example: (i) if 0 < bsTttBandsLow ≤ numBands or 0 ≤ bsTttBandsLow < numBands, the field “bsTttBandsLow” 907 is represented by ceil(log2(numBands)) bits; or (ii) if 0 ≤ bsTttBandsLow ≤ numBands, the field “bsTttBandsLow” 907 can be represented by ceil(log2(numBands+1)) bits.

If a value less than or equal to “numBands”, called “numberBands”, is chosen arbitrarily, the field “bsTttBandsLow” 907 can be represented by a variable number of bits using “numberBands”.

In particular: (i) if 0 < bsTttBandsLow ≤ numberBands or 0 ≤ bsTttBandsLow < numberBands, the field “bsTttBandsLow” 907 is represented by ceil(log2(numberBands)) bits; or (ii) if 0 ≤ bsTttBandsLow ≤ numberBands, the field “bsTttBandsLow” 907 can be represented by ceil(log2(numberBands+1)) bits.

In the case of multiple TTT blocks, the combination of the “bsTttBandsLow” values can be expressed by Formula 5 below.

[Formula 5]

Here, bsTttBandsLow_i indicates the i-th “bsTttBandsLow”. Because Formula 5 is analogous to Formula 1, a detailed explanation of Formula 5 is omitted.

In the case of multiple TTT blocks, the combination of the “bsTttBandsLow” values can be represented by one of Formulas 6 through 8 using “numberBands”. Because Formulas 6 through 8 are analogous to Formulas 2 through 4, a detailed explanation of Formulas 6 through 8 is omitted.

[Formula 6]

[Formula 7]

[Formula 8]

The number of parametric ranges applied to a channel conversion module (e.g., an OTT block and/or a TTT block) can be represented as a divided value of “numBands”. In this case, the divided value is half of “numBands” or the result of dividing “numBands” by a specific number.

Once the number of parametric ranges applied to the OTT and/or TTT blocks has been determined, the parameter sets that can be applied to each OTT block and/or each TTT block within that number of parametric ranges can be determined. Each parameter set can be applied to each OTT block and/or each TTT block on a time-interval basis. That is, one parameter set can be applied to one time interval.

As mentioned above, one spatial frame can include multiple time intervals. If the spatial frame is of a fixed frame type, the parameter sets can be applied to time intervals at equal spacing. If the frame is of a variable frame type, information about the position of the time interval to which each parameter set is applied is needed. This is explained in detail below with reference to Figs. 13A through 13C.

Fig. 10A shows a syntax for the spatial extension configuration information of a spatial extension frame according to one embodiment of the present invention. The spatial extension configuration information can include a field “bsSacExtType” 1001, a field “bsSacExtLen” 1002, a field “bsSacExtLenAdd” 1003, a field “bsSacExtLenAddAdd” 1004 and a field “bsFillBits” 1007. Other fields are possible.

The field “bsSacExtType” 1001 indicates the data type of the spatial extension frame. For example, the spatial extension frame can be filled with zeros, residual signal data, arbitrary down-mix residual signal data or arbitrary tree data.

The field “bsSacExtLen” 1002 indicates the number of bytes of the spatial extension configuration information.

The field “bsSacExtLenAdd” 1003 indicates the additional number of bytes of the spatial extension configuration information if the number of bytes of the spatial extension configuration information is greater than or equal to, for example, 15.

The field “bsSacExtLenAddAdd” 1004 indicates the additional number of bytes of the spatial extension configuration information if the number of bytes of the spatial extension configuration information is greater than or equal to, for example, 270.
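One way to read the three length fields is as a cascade of escape codes, sketched below in Python. The thresholds 15 and 270 come from the text. Everything else is an assumption made for this illustration: the field widths of 4, 8 and 16 bits (the first two are suggested by 15 = 2^4 - 1 and 270 = 15 + 255, the third is a guess), the rule that the additional fields are summed, and the names read_ext_length and reader_from.

```python
def read_ext_length(read_bits):
    """Return the byte length of the spatial extension configuration information.

    read_bits -- callable returning the next n bits as an unsigned integer
    """
    length = read_bits(4)                  # bsSacExtLen (width assumed: 4 bits)
    if length == 15:                       # threshold 15 from the text
        length += read_bits(8)             # bsSacExtLenAdd (width assumed: 8 bits)
        if length == 270:                  # threshold 270 = 15 + 255 from the text
            length += read_bits(16)        # bsSacExtLenAddAdd (width assumed: 16 bits)
    return length


def reader_from(values):
    """Stand-in reader that returns prepared field values instead of parsing bits."""
    it = iter(values)
    return lambda nbits: next(it)


short_len = read_ext_length(reader_from([7]))
medium_len = read_ext_length(reader_from([15, 10]))
long_len = read_ext_length(reader_from([15, 255, 30]))
```

Under these assumptions short lengths cost only the first field, and each further field is read only when the preceding one is saturated.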

After the respective fields have been determined/extracted in the encoder/decoder, the configuration information for the data type included in the spatial extension frame is determined (1005).

As mentioned above, the spatial extension frame can contain residual signal data, arbitrary down-mix residual signal data, tree configuration data or the like.

Next, the number of unused bits is calculated (1006) from the length of the spatial extension configuration information.

The field “bsFillBits” 1007 indicates the number of data bits that can be ignored in order to fill the unused bits.

In figures 10B and 10C shows the syntax for configuration information with the spatial extension of the residual signal in the case where the residual signal is included in the frame of spatial expansion, according to one variant of the present invention.

Please refer to Fig. 10B, where the field “bsResidualSamplingFrequencyIndex” 1008 indicates the sample rate of the residual signal.

Field bsResidualFramesPerSpetialFrame” 1009 indicates the number of residual frames in one spatial frame. For example, in one spatial frame can be 1, 2, 3 or 4 residual frame.

Block ResidualConfig” 1010 indicates the number of parameters is practical ranges for residual signal, applied to each block OTT and/or TTT.

Please refer to Fig. 10C, where the field “bsResidualPresent” 1011 indicates the applicability of the residual signal for each block OTT and/or TTT.

Field bsResidualBands” 1012 specifies the number of parametric ranges of the residual signal that exists in each block OTT and TTT, or both, if the residual signal exists in each block OTT and/or TTT. A number of parametric ranges of the residual signal can be represented by a fixed number of bits or variable number of bits. In the case when the number of parametric ranges presents a fixed number of bits, the residual signal may have a value less than or equal to the total number of parametric ranges of the audio signal. So for all parametric ranges can be allocated the necessary number of bits (for example, 5 bits in Fig. 10C).

Fig. 10D shows the syntax for representing the number of parametric ranges of the residual signal with a variable number of bits according to one variant of the present invention. The field “bsResidualBands” 1014 may be represented by a variable number of bits using “numBands”. If “numBands” is greater than or equal to 2^(n-1) and less than 2^n, the field “bsResidualBands” 1014 may be represented by n bits.

For example: (i) if “numBands” is 40, the field “bsResidualBands” 1014 is represented by 6 bits; (ii) if “numBands” is 28 or 20, the field is represented by 5 bits; (iii) if “numBands” is 14 or 10, the field is represented by 4 bits; (iv) if “numBands” is 7, 5 or 4, the field is represented by 3 bits.

Alternatively, if “numBands” is greater than 2^(n-1) and less than or equal to 2^n, the number of parametric ranges of the residual signal can be represented by n bits.

For example: (i) if “numBands” is 40, the field “bsResidualBands” 1014 is represented by 6 bits; (ii) if “numBands” is 28 or 20, the field is represented by 5 bits; (iii) if “numBands” is 14 or 10, the field is represented by 4 bits; (iv) if “numBands” is 7 or 5, the field is represented by 3 bits; and (v) if “numBands” is 4, the field is represented by 2 bits.

In addition, the field “bsResidualBands” 1014 can be represented by a number of bits determined by the ceiling function (rounding up to the nearest integer) with “numBands” as its variable.

In particular: (i) in the case when 0 < bsResidualBands ≤ numBands or 0 ≤ bsResidualBands < numBands, the field “bsResidualBands” 1014 is represented by ceil(log2(numBands)) bits; or (ii) in the case when 0 ≤ bsResidualBands ≤ numBands, the field “bsResidualBands” 1014 may be represented by ceil(log2(numBands+1)) bits.
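The bit-count rules above can be checked with the following minimal sketch. The function names are illustrative and do not appear in the syntax; integer arithmetic is used instead of floating-point logarithms to avoid rounding problems.

```python
def bits_power_of_two_rule(num_bands: int) -> int:
    """n such that 2**(n-1) <= num_bands < 2**n (first rule in the text)."""
    return num_bands.bit_length()


def bits_ceil_log2(num_values: int) -> int:
    """ceil(log2(num_values)) for num_values >= 1 (second and third rules)."""
    return (num_values - 1).bit_length()


for num_bands in (40, 28, 20, 14, 10, 7, 5, 4):
    print(num_bands,
          bits_power_of_two_rule(num_bands),  # 2^(n-1) <= numBands < 2^n
          bits_ceil_log2(num_bands),          # ceil(log2(numBands))
          bits_ceil_log2(num_bands + 1))      # ceil(log2(numBands+1))
```

The first and last columns coincide, since a range 0 ≤ bsResidualBands ≤ numBands contains numBands+1 values; the two rules differ from ceil(log2(numBands)) only when “numBands” is an exact power of two, as with the value 4.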

In some embodiments, the field “bsResidualBands” 1014 can be represented using a value “numberBands” that is less than or equal to “numBands”.

In particular: (i) in the case when 0 < bsResidualBands ≤ numberBands or 0 ≤ bsResidualBands < numberBands, the field “bsResidualBands” 1014 is represented by ceil(log2(numberBands)) bits; or (ii) in the case when 0 ≤ bsResidualBands ≤ numberBands, the field “bsResidualBands” 1014 may be represented by ceil(log2(numberBands+1)) bits.

In the case of a plurality (N) of residual signals, the combination of “bsResidualBands” values can be expressed by formula 9 below.

[Formula 9]

In this case, bsResidualBands_i indicates the i-th “bsResidualBands”. Because formula 9 is identical to formula 1, a detailed explanation of formula 9 is omitted from the further description.

In the case of many residual signals, the combination of “bsResidualBands” values can be represented by one of formulas 10 through 12 using “numberBands”. Because the representation of “bsResidualBands” using “numberBands” is identical to that in formulas 2 through 4, a detailed explanation is omitted from the following description.

[Formula 10]

[Formula 11]

[Formula 12]

The number of parametric ranges of the residual signal can also be represented as a value obtained by dividing “numBands”. In this case, the divided value can be half of “numBands” or the result of dividing “numBands” by a specific value.

The residual signal may be included in the bitstream of the audio signal together with the down-mixed signal and the spatial information signal, and this bitstream may be sent to the decoder. The decoder can extract the down-mixed signal, the spatial information signal and the residual signal from the bitstream.

The down-mixed signal is then up-mixed using the spatial information, and the residual signal is applied to the down-mixed signal during the up-mixing. In particular, the down-mixed signal is up-mixed in a plurality of channel conversion modules using the spatial information, and the residual signal is supplied to the channel conversion modules while this is done. As mentioned in the foregoing description, a channel conversion module has a number of parametric ranges, and a set of parameters is applied to the channel conversion module for each time interval. When the residual signal is applied to a channel conversion module, the residual signal may be required to update the inter-channel correlation information of the audio signal to which it is applied. The updated inter-channel correlation information is then used in the up-mixing process.

Fig. 11A shows a block diagram of a decoder for unmanaged coding according to one variant of the present invention. Unmanaged coding means that spatial information is not included in the bitstream of the audio signal.

In some embodiments, the decoder includes analysis comb filters 1102, an analysis block 1104, a spatial synthesis block 1106 and synthesis comb filters 1108. Although Fig. 11A shows a stereo down-mixed signal, other types of down-mixed signals can be used.

During operation, the decoder receives the down-mixed signal 1101, and the analysis comb filters 1102 convert the received signal 1101 into a signal 1103 in the frequency domain. The analysis block 1104 creates spatial information from the converted signal 1103. The analysis block 1104 performs its processing for each interval, and the spatial information 1105 can be created for a plurality of intervals. Here, an interval means a time interval.

The spatial information can be created in two stages. First, a down-mix parameter is created from the down-mixed signal. Second, the down-mix parameter is converted into spatial information, such as a spatial parameter. In some embodiments, the down-mix parameter can be created by a matrix calculation on the down-mixed signal.

The spatial synthesis block 1106 creates a multi-channel audio signal 1107 by synthesizing the down-mixed signal 1103 with the spatial information 1105. The created multi-channel audio signal 1107 passes through the synthesis comb filters 1108 and is converted into an audio signal 1109 in the time domain.

Spatial information can be created at predetermined time interval positions. The distance between these positions can be equal (i.e. equidistant). For example, spatial information can be created every 4 time intervals. Spatial information can also be created at variable time interval positions. In this case, the position information of the time interval at which the spatial information is created can be extracted from the bitstream. The position information may be represented by a variable number of bits, and it can be represented as an absolute value and as a difference relative to the position information of the previous time interval.

In the case of unmanaged coding, the number of parametric ranges (hereinafter called “bsNumguidedBlindBands”) for each channel of the audio signal may be represented by a fixed number of bits. Alternatively, “bsNumguidedBlindBands” may be represented by a variable number of bits using “numBands”. For example, if “numBands” is greater than or equal to 2^(n-1) and less than 2^n, “bsNumguidedBlindBands” may be represented by n bits.

In particular, (a) if “numBands” is 40, “bsNumguidedBlindBands” is represented by 6 bits, (b) if “numBands” is 28 or 20, “bsNumguidedBlindBands” is represented by 5 bits, (c) if “numBands” is 14 or 10, “bsNumguidedBlindBands” is represented by 4 bits, and (d) if “numBands” is 7, 5 or 4, “bsNumguidedBlindBands” is represented by 3 bits.

If “numBands” is greater than 2^(n-1) and less than or equal to 2^n, then “bsNumguidedBlindBands” may be represented by n bits.

For example: (a) if “numBands” is 40, “bsNumguidedBlindBands” is represented by 6 bits, (b) if “numBands” is 28 or 20, “bsNumguidedBlindBands” is represented by 5 bits, (c) if “numBands” is 14 or 10, “bsNumguidedBlindBands” is represented by 4 bits, (d) if “numBands” is 7 or 5, “bsNumguidedBlindBands” is represented by 3 bits, and (e) if “numBands” is 4, “bsNumguidedBlindBands” is represented by 2 bits.

In addition, “bsNumguidedBlindBands” may be represented by a variable number of bits determined by the ceiling function (rounding up to the nearest integer) with “numBands” as its variable.

For example, (i) if 0 < bsNumguidedBlindBands ≤ numBands or 0 ≤ bsNumguidedBlindBands < numBands, “bsNumguidedBlindBands” is represented by ceil(log2(numBands)) bits; or (ii) in the case when 0 ≤ bsNumguidedBlindBands ≤ numBands, “bsNumguidedBlindBands” may be represented by ceil(log2(numBands+1)) bits.

If an arbitrarily defined value “numberBands” that is less than or equal to “numBands” is used, “bsNumguidedBlindBands” can be represented as follows.

In particular, (i) if 0 < bsNumguidedBlindBands ≤ numberBands or 0 ≤ bsNumguidedBlindBands < numberBands, “bsNumguidedBlindBands” is represented by ceil(log2(numberBands)) bits; or (ii) in the case when 0 ≤ bsNumguidedBlindBands ≤ numberBands, “bsNumguidedBlindBands” may be represented by ceil(log2(numberBands+1)) bits.

If there are N channels, the combination of “bsNumguidedBlindBands” can be expressed in the following formula 13.

[Formula 13]

In this case, bsNumguidedBlindBands_i indicates the i-th “bsNumguidedBlindBands”. Because formula 13 is identical to formula 1, a detailed explanation of formula 13 is omitted from the subsequent description. If there are many channels, “bsNumguidedBlindBands” can be represented by one of formulas 14 to 16 using “numberBands”. Since the representation of “bsNumguidedBlindBands” using “numberBands” is identical to the representations in formulas 2 through 4, a detailed explanation of formulas 14 to 16 is omitted from the following description.

[Formula 14]

[Formula 15]

[Formula 16]

Fig. 11B shows a scheme for representing the numbers of parametric ranges as a group according to one variant of the present invention. The information about the number of parametric ranges includes the number of parametric ranges used for a channel conversion module, the number of parametric ranges used for the residual signal, and the number of parametric ranges for each audio channel when unmanaged coding is used. When there are many pieces of information about the number of parametric ranges, these pieces of information (for example, “bsOttbands”, “bsTttbands”, “bsResidualBand” and/or “bsNumguidedBlindBands”) can be represented as one or more groups.

Referring to Fig. 11B, if there are (kN+L) pieces of information about the number of parametric ranges and each piece requires Q bits, the pieces of information can be represented as the following groups. In this case, 'k' and 'N' are arbitrary non-zero integers, and 'L' is an arbitrary integer satisfying the inequality 0≤L<N.

The grouping method includes the steps of creating k groups, each by linking N pieces of information about the number of parametric ranges, and creating a last group by linking the last L pieces of information. Each of the k groups can be represented by M bits, and the last group can be represented by p bits. In this case, M is preferably less than the N·Q bits that would be needed to represent each piece of information without grouping, and p is preferably less than or equal to the L·Q bits that would be needed without grouping.

Suppose, for example, that two pieces of information about the number of parametric ranges are b1 and b2. If each of b1 and b2 can take five values, 3 bits are needed to represent each of them. Although 3 bits can represent eight values, only five values are actually required, so each of b1 and b2 has three redundant values. If b1 and b2 are instead represented as a group by linking them together, 5 bits can be used instead of 6 bits (= 3 bits + 3 bits). In particular, since there are 25 (= 5·5) combinations of b1 and b2, the group of b1 and b2 can be represented by 5 bits. Since 5 bits can represent 32 values, the grouped representation has seven redundant values. The redundancy of the grouped representation is therefore smaller than when each of b1 and b2 is represented by 3 bits. The representation of multiple pieces of information about the number of parametric ranges as groups can be implemented in various ways, as follows.
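The b1/b2 example can be illustrated by the following minimal sketch, which links several values into one group code by treating them as digits of a mixed-radix number. This particular packing order is an assumption made for illustration; the formulas referred to in the text are not reproduced here, and the function names are hypothetical.

```python
from typing import List, Sequence


def group_bits(num_values: int, group_size: int) -> int:
    """Bits needed for one group: ceil(log2(num_values ** group_size))."""
    return (num_values ** group_size - 1).bit_length()


def pack_group(values: Sequence[int], num_values: int) -> int:
    """Link values (each in 0..num_values-1) into a single group code."""
    code = 0
    for v in values:
        if not 0 <= v < num_values:
            raise ValueError("value out of range")
        code = code * num_values + v
    return code


def unpack_group(code: int, num_values: int, group_size: int) -> List[int]:
    """Recover the individual values from a group code."""
    values = []
    for _ in range(group_size):
        code, v = divmod(code, num_values)
        values.append(v)
    return values[::-1]


b1, b2 = 3, 4                       # two values, five possibilities each
code = pack_group([b1, b2], 5)      # one code out of 25 combinations
print(group_bits(5, 2), code, unpack_group(code, 5, 2))
```

With five possible values per item, every pair maps to a code below 25, which fits in the 5 bits mentioned above instead of 3 + 3 = 6 bits.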

If each of the pieces of information about the number of parametric ranges can take 40 values, the k groups can be created using N equal to 2, 3, 4, 5 or 6. The k groups can then be represented by 11, 16, 22, 27 and 32 bits, respectively. Alternatively, the k groups can be represented by combining the respective cases.

If each of the pieces of information about the number of parametric ranges can take 28 values, the k groups can be created using N equal to 6, and the k groups can be represented by 29 bits.

If each of the pieces of information about the number of parametric ranges can take 20 values, the k groups can be created using N equal to 2, 3, 4, 5, 6 or 7. The k groups can then be represented by 9, 13, 18, 22, 26 and 31 bits, respectively. Alternatively, the k groups can be represented by combining the respective cases.

If each of the pieces of information about the number of parametric ranges can take 14 values, the k groups can be created using N equal to 6. The k groups can then be represented by 23 bits.

If each of the pieces of information about the number of parametric ranges can take 10 values, the k groups can be created using N equal to 2, 3, 4, 5, 6, 7, 8 or 9. The k groups can then be represented by 7, 10, 14, 17, 20, 24, 27 and 30 bits, respectively. Alternatively, the k groups can be represented by combining the respective cases.

If each of the pieces of information about the number of parametric ranges can take 7 values, the k groups can be created using N equal to 6, 7, 8, 9, 10 or 11. The k groups can then be represented by 17, 20, 23, 26, 29 and 31 bits, respectively. Alternatively, the k groups can be represented by combining the respective cases.

If each of the pieces of information about the number of parametric ranges can take, for example, 5 values, the k groups can be created using N equal to 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or 13. The k groups can then be represented by 5, 7, 10, 12, 14, 17, 19, 21, 24, 26, 28 and 31 bits, respectively. Alternatively, the k groups can be represented by combining the respective cases.
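The bit counts listed above are consistent with ceil(log2(V^N)), where V is the number of values each item can take and N is the group size. The following sketch recomputes them under that assumption; the table `LISTED` simply transcribes the figures from the text, and the function names are illustrative.

```python
def group_bits(num_values: int, group_size: int) -> int:
    """Bits for one group of `group_size` items with `num_values` values each."""
    return (num_values ** group_size - 1).bit_length()


def saving(num_values: int, group_size: int) -> int:
    """Bits saved per group compared with coding each item separately."""
    q = (num_values - 1).bit_length()      # Q bits per ungrouped item
    return group_size * q - group_bits(num_values, group_size)


# Figures as listed in the text: {values per item: {N: bits per group}}.
LISTED = {
    40: {2: 11, 3: 16, 4: 22, 5: 27, 6: 32},
    28: {6: 29},
    20: {2: 9, 3: 13, 4: 18, 5: 22, 6: 26, 7: 31},
    14: {6: 23},
    10: {2: 7, 3: 10, 4: 14, 5: 17, 6: 20, 7: 24, 8: 27, 9: 30},
    7: {6: 17, 7: 20, 8: 23, 9: 26, 10: 29, 11: 31},
    5: {2: 5, 3: 7, 4: 10, 5: 12, 6: 14, 7: 17, 8: 19, 9: 21,
        10: 24, 11: 26, 12: 28, 13: 31},
}


def mismatches():
    """Return every listed entry that differs from the recomputed bit count."""
    return [(v, n, bits, group_bits(v, n))
            for v, row in LISTED.items()
            for n, bits in row.items()
            if group_bits(v, n) != bits]


print(mismatches())
print(saving(40, 2), saving(5, 13))
```

All listed group sizes keep the group within 32 bits, which suggests (as an interpretation, not a statement from the text) that N was chosen so that a group fits a 32-bit word.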

In addition, the pieces of information about the number of parametric ranges can be configured either to be represented as the above groups or to be represented successively, with each piece of information forming an independent bit sequence.

Fig. 12 shows the syntax representing the configuration information of a spatial frame according to one variant of the present invention. The spatial frame includes the block “FramingInfo” 1201, the field “bsIndependencyfield” 1202, the block “OttData” 1203, the block “TttData” 1204, the block “SmgData” 1205 and the block “tempShapeData” 1206.

The block “FramingInfo” 1201 includes information about the number of sets of parameters and information about the time interval to which each set of parameters is applied. The block “FramingInfo” 1201 is explained in detail with reference to Fig. 13A.

The field “bsIndependencyfield” 1202 specifies whether the current frame can be decoded without information about the previous frame.

The block “OttData” 1203 includes all the spatial parameter information for all OTT blocks.

The block “TttData” 1204 includes all the spatial parameter information for all TTT blocks.

The block “SmgData” 1205 includes information for the temporal smoothing applied to a dequantized spatial parameter.

The block “tempShapeData” 1206 includes information for the temporal envelope shaping applied to a decorrelated signal.

Fig. 13A shows the syntax for representing information about the position of the time interval to which a set of parameters is applied, according to one variant of the present invention. The field “bsFramingType” 1301 indicates whether the spatial frame of the audio signal is a fixed frame or a variable frame. A fixed frame is a frame in which a set of parameters is applied to a preset time interval. For example, the sets of parameters are applied to time intervals spaced at a predetermined equal distance. A variable frame is a frame that separately receives information about the position of the time interval to which a set of parameters is applied.

The field “bsNumParamSets” 1302 indicates the number of sets of parameters in one spatial frame (hereinafter called “NumParamSets”); “NumParamSets” and “bsNumParamSets” are related by “NumParamSets = bsNumParamSets + 1”.

For example, since 3 bits are allocated to the field “bsNumParamSets” 1302 in Fig. 13A, a maximum of eight sets of parameters can be provided in one spatial frame. Since the number of allocated bits is not limited to this, more sets of parameters can be provided in a spatial frame.

If the spatial frame is of the fixed type, the information about the position of the time interval to which a set of parameters is applied can be determined according to a predetermined rule, and no additional information about that position is necessary. However, if the spatial frame is of the variable type, information about the position of the time interval to which a set of parameters is applied is required.

The field “bsParamSlot” 1303 indicates the position information of the time interval to which a set of parameters is applied. The field “bsParamSlot” 1303 can be represented by a variable number of bits using the number of time slots in one spatial frame, i.e. “numSlots”. In particular, if “numSlots” is greater than or equal to 2^(n-1) and less than 2^n, the field “bsParamSlot” 1303 can be represented by n bits.

For example: (i) if “numSlots” is in the range between 64 and 127, the field “bsParamSlot” 1303 can be represented by 7 bits; (ii) if “numSlots” is in the range between 32 and 63, the field can be represented by 6 bits; (iii) if “numSlots” is in the range between 16 and 31, the field can be represented by 5 bits; (iv) if “numSlots” is in the range between 8 and 15, the field can be represented by 4 bits; (v) if “numSlots” is in the range between 4 and 7, the field can be represented by 3 bits; (vi) if “numSlots” is in the range between 2 and 3, the field can be represented by 2 bits; (vii) if “numSlots” is 1, the field can be represented by 1 bit; and (viii) if “numSlots” is 0, the field can be represented by 0 bits.

If there are multiple (N) sets of parameters, the combination of “bsParamSlot” values can be represented according to formula 9.

[Formula 9]

In this case, “bsParamSlot_i” specifies the time interval to which the i-th set of parameters is applied. Suppose, for example, that “numSlots” is 3 and that the field “bsParamSlot” 1303 may have ten values. In this case, three pieces of information (hereinafter referred to as C1, C2 and C3, respectively) are needed for the field “bsParamSlot” 1303. Since each of C1, C2 and C3 requires 4 bits, a total of 12 (= 4·3) bits is needed. If C1, C2 and C3 are represented as a group by linking them together, 1000 (= 10·10·10) cases can occur, which can be represented by 10 bits, saving 2 bits. If “numSlots” is 3 and the value represented by 5 bits is 31, this value can be written as 31 = 1·3^2 + 5·3^1 + 7·3^0. The decoding device can determine that C1, C2 and C3 are 1, 5 and 7, respectively, by applying the inverse of formula 9.

Fig. 13B shows the syntax for representing information about the position of the time interval to which a set of parameters is applied as an absolute value and a difference, according to one variant of the present invention. If the spatial frame is of the variable type, the field “bsParamSlot” 1303 in Fig. 13A can be represented as an absolute value and difference values, using the fact that the value of “bsParamSlot” increases monotonically.

For example: (i) the position of the time interval to which the first set of parameters is applied can be formed as an absolute value, that is, “bsParamSlot[0]”; and (ii) the position of the time interval to which the second or a subsequent set of parameters is applied can be formed as a difference, that is, the “difference value” between “bsParamSlot[ps]” and “bsParamSlot[ps-1]”, or the “difference value - 1” (hereinafter called “bsDiffParamSlot[ps]”). In this case, “ps” denotes a set of parameters.

The field “bsParamSlot[0]” 1304 can be represented by a number of bits (hereinafter called “nBitsParamSlot(0)”) calculated using “numSlots” and “numParamSets”.

The field “bsDiffParamSlot[ps]” 1305 can be represented by a number of bits (hereinafter called “nBitsParamSlot(ps)”) calculated using “numSlots”, “numParamSets” and the time interval to which the previous set of parameters is applied, that is, “bsParamSlot[ps-1]”.

In particular, to represent “bsParamSlot[ps]” with the minimum number of bits, the number of bits can be determined from the following rules: (i) the values “bsParamSlot[ps]” form an ascending sequence (bsParamSlot[ps] > bsParamSlot[ps-1]); (ii) the maximum value of “bsParamSlot[0]” is “numSlots - numParamSets”; and (iii) if 0 < ps < numParamSets, “bsParamSlot[ps]” can only take values between “bsParamSlot[ps-1] + 1” and “numSlots - numParamSets + ps”.

For example, if “numSlots” is 10 and “numParamSets” is 3, then, because “bsParamSlot[ps]” increases, the maximum value of “bsParamSlot[0]” is 10 - 3 = 7. Namely, “bsParamSlot[0]” should be selected from the values 0 to 7. The reason is that too few time intervals would remain for the other sets of parameters (for example, ps equal to 1 or 2) if “bsParamSlot[0]” had a value greater than 7.

If “bsParamSlot[0]” is 5, the position “bsParamSlot[1]” of the time interval for the second set of parameters must be chosen from the values between 5 + 1 = 6 and 10 - 3 + 1 = 8.

If “bsParamSlot[1]” is 7, then “bsParamSlot[2]” may be 8 or 9. If “bsParamSlot[1]” is 8, then “bsParamSlot[2]” may only be 9.
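Rules (ii) and (iii) can be expressed as a small helper that returns the admissible range of “bsParamSlot[ps]”. This is a sketch only; the function name and argument names are illustrative and are not part of the syntax.

```python
from typing import Optional, Tuple


def param_slot_range(num_slots: int, num_param_sets: int, ps: int,
                     prev_slot: Optional[int] = None) -> Tuple[int, int]:
    """Inclusive (lowest, highest) admissible value of bsParamSlot[ps]."""
    if ps == 0:
        low = 0                      # the first position has no predecessor
    else:
        if prev_slot is None:
            raise ValueError("prev_slot is required for ps > 0")
        low = prev_slot + 1          # positions are strictly increasing
    # enough slots must remain for the parameter sets that follow
    high = num_slots - num_param_sets + ps
    return low, high


# The example from the text: numSlots = 10, numParamSets = 3.
print(param_slot_range(10, 3, 0))      # range for bsParamSlot[0]
print(param_slot_range(10, 3, 1, 5))   # bsParamSlot[0] = 5
print(param_slot_range(10, 3, 2, 7))   # bsParamSlot[1] = 7
print(param_slot_range(10, 3, 2, 8))   # bsParamSlot[1] = 8
```

The width of each range, high - low + 1, is the number of distinct values that the corresponding field actually has to distinguish.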

Therefore, “bsParamSlot[ps]” can be represented by a variable number of bits using the above properties, instead of by a fixed number of bits.

When “bsParamSlot[ps]” is written to the bit stream, if “ps” is 0, “bsParamSlot[0]” may be represented as an absolute value using the number of bits given by “nBitsParamSlot(0)”. If “ps” is greater than 0, “bsParamSlot[ps]” can be represented as a difference using the number of bits given by “nBitsParamSlot(ps)”. When “bsParamSlot[ps]” written in this way is read from the bit stream, the bit length of each item, that is, “nBitsParamSlot(ps)”, can be found using formula 10.

[Formula 10]

In particular, “nBitsParamSlot(0)” can be found as nBitsParamSlot(0) = fb(numSlots - numParamSets + 1). If 0 < ps < numParamSets, “nBitsParamSlot(ps)” can be found as nBitsParamSlot(ps) = fb(numSlots - numParamSets + ps - bsParamSlot[ps-1]). “nBitsParamSlot(ps)” can also be determined using formula 11, which extends formula 10 up to 7 bits.

[Formula 11]

The following is an example of the function fb(x). If “numSlots” is 15 and “numParamSets” is 3, the function evaluates to nBitsParamSlot(0) = fb(15-3+1) = 4 bits.

If “bsParamSlot[0]”, represented by 4 bits, is 7, the function evaluates to nBitsParamSlot(1) = fb(15-3+1-7) = 3 bits. In this case, the field “bsDiffParamSlot[1]” 1305 can be represented by 3 bits.

If the value represented by these 3 bits is 3, “bsParamSlot[1]” becomes 7 + 3 = 10. Therefore, nBitsParamSlot(2) = fb(15-3+2-10) = 2 bits. In this case, the field “bsDiffParamSlot[2]” 1305 may be represented by 2 bits. If the number of remaining time slots equals the number of remaining sets of parameters, 0 bits can be assigned to the field “bsDiffParamSlot[ps]”. In other words, no additional information is required to represent the time interval to which the set of parameters is applied.
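The worked example can be reproduced with the following minimal sketch, which assumes fb(x) = ceil(log2(x)); the text presents this only as one possible form of fb(x). The function names are illustrative.

```python
from typing import Optional


def fb(x: int) -> int:
    """Assumed form of fb(x): ceil(log2(x)) for x >= 1, in integer arithmetic."""
    if x < 1:
        raise ValueError("fb is only defined here for x >= 1")
    return (x - 1).bit_length()


def n_bits_param_slot(num_slots: int, num_param_sets: int, ps: int,
                      prev_slot: Optional[int] = None) -> int:
    """Number of bits of bsParamSlot[0] (ps == 0) or bsDiffParamSlot[ps]."""
    if ps == 0:
        return fb(num_slots - num_param_sets + 1)
    if prev_slot is None:
        raise ValueError("prev_slot is required for ps > 0")
    return fb(num_slots - num_param_sets + ps - prev_slot)


# numSlots = 15, numParamSets = 3, bsParamSlot[0] = 7, bsParamSlot[1] = 10.
print(n_bits_param_slot(15, 3, 0))
print(n_bits_param_slot(15, 3, 1, prev_slot=7))
print(n_bits_param_slot(15, 3, 2, prev_slot=10))
# Only one slot is left for the last parameter set: the field takes 0 bits.
print(n_bits_param_slot(15, 3, 2, prev_slot=13))
```

The last call illustrates the 0-bit case: when “bsParamSlot[1]” is 13, the only admissible value of “bsParamSlot[2]” is 14, so nothing has to be transmitted.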

Thus, the number of bits for “bsParamSlot[ps]” can be made variable. The number of bits for “bsParamSlot[ps]” can be read from the bitstream using the function fb(x) in the decoder. In some embodiments, the function fb(x) may include the function ceil(log2(x)).

When the information for “bsParamSlot[ps]”, represented as an absolute value and difference values, is read from the bit stream, the decoder can first read “bsParamSlot[0]” and then read “bsDiffParamSlot[ps]” for 0<ps<numParamSets. “bsParamSlot[ps]” can then be found for 0≤ps<numParamSets using “bsParamSlot[0]” and “bsDiffParamSlot[ps]”. For example, as shown in Fig. 13B, “bsParamSlot[ps]” can be found by adding “bsParamSlot[ps-1]” to “bsDiffParamSlot[ps] + 1”.
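The reading procedure can be sketched as follows. The sketch assumes fb(x) = ceil(log2(x)) and the “difference value - 1” convention, in which “bsDiffParamSlot[ps] + 1” is added to the previous position. `BitReader` is a hypothetical helper that reads from a string of '0' and '1' characters and is not part of the described syntax.

```python
from typing import List


def fb(x: int) -> int:
    """Assumed form of fb(x): ceil(log2(x)) for x >= 1."""
    return (x - 1).bit_length()


class BitReader:
    """Hypothetical bit reader over a string of '0'/'1' characters."""

    def __init__(self, bits: str) -> None:
        self.bits = bits
        self.pos = 0

    def read(self, n: int) -> int:
        if n == 0:
            return 0                      # a 0-bit field carries no data
        chunk = self.bits[self.pos:self.pos + n]
        if len(chunk) < n:
            raise EOFError("not enough bits")
        self.pos += n
        return int(chunk, 2)


def read_param_slots(reader: BitReader, num_slots: int,
                     num_param_sets: int) -> List[int]:
    """Read bsParamSlot[0] and the differences, return all positions."""
    slots = [reader.read(fb(num_slots - num_param_sets + 1))]
    for ps in range(1, num_param_sets):
        n_bits = fb(num_slots - num_param_sets + ps - slots[ps - 1])
        diff = reader.read(n_bits)        # bsDiffParamSlot[ps]
        slots.append(slots[ps - 1] + diff + 1)
    return slots


# 4 bits "0111" (7), then 3 bits "010" (2), then 2 bits "01" (1).
print(read_param_slots(BitReader("011101001"), 15, 3))
```

Because each field width depends on the previously decoded position, the fields have to be read strictly in order, as in the loop above.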

Fig. 13C shows the syntax for representing information about the position of the time interval to which a set of parameters is applied as a group, according to one variant of the present invention. When there are many sets of parameters, the “bsParamSlots” 1307 for these sets of parameters may be represented as one or more groups.

If the number of “bsParamSlots” 1307 is (kN+L) and each of the “bsParamSlots” 1307 requires Q bits, the “bsParamSlots” 1307 may be represented as the following groups. In this case, 'k' and 'N' are arbitrary non-zero integers, and 'L' is an arbitrary integer satisfying the inequality 0≤L<N.

The grouping method may include the steps of creating k groups, each by linking N “bsParamSlots” 1307, and creating a last group by linking the last L “bsParamSlots” 1307. Each of the k groups can be represented by M bits, and the last group can be represented by p bits. In this case, M is preferably less than the N·Q bits that would be needed to represent each of the “bsParamSlots” 1307 without grouping, and p is preferably less than or equal to the L·Q bits that would be needed without grouping.

For example, assume that a pair of “bsParamSlots” 1307 for two sets of parameters is d1 and d2. If each of d1 and d2 can take five values, 3 bits are needed to represent each of them. Although 3 bits can represent eight values, only five values are actually needed, so each of d1 and d2 has three redundant values. If d1 and d2 are instead represented as a group by linking them together, 5 bits can be used instead of 6 bits (= 3 bits + 3 bits). In particular, since there are 25 (= 5·5) combinations of d1 and d2, the group of d1 and d2 can be represented by only 5 bits. Since 5 bits can represent 32 values, the grouped representation has seven redundant values. The redundancy of the grouped representation is therefore smaller than when each of d1 and d2 is represented by 3 bits.

When the data are grouped, the group can be formed using “bsParamSlot[0]” for the initial value and the difference between successive pairs of “bsParamSlot[ps]” for the second and subsequent values.

When the group is formed, the bits can be allocated directly without grouping if the number of sets of parameters is 1, and can be allocated after grouping if the number of sets of parameters is greater than or equal to 2.

In Fig. 14 presents a block diagram of the encoding method according to one variant of the present invention. Next is explained a method of coding an audio signal and the operation of the encoder according to the present image is ateneu.

First (S1401) is determined by the total number of time intervals (numSlots) in one spatial frame and the total number of parametric ranges (numBands) audio.

Then (S1402) is determined by the number of parametric ranges used for transformation module channels (block OTT and/or block TTT), and/or residual signal.

If the block OTT has mode LFE channel, then separately determine the number of parametric ranges used to block OTT.

If the block OTT has no mode LFE channel, the number of parameters used to block OTT, use the “numBands”.

The following defines the type of the spatial frame. In this case, the space frame can be attributed to the fixed frame type or frame variable type.

If the spatial frame refers to the variable type (S1403), it is determined (S1406), the number of sets of parameters that are used in the same spatial frame. In this case, the set of parameters can be used to transform module channels at each time interval.

Next (S1407) determine the position of the time interval, which is used for this set of parameters. In this case, the position of the time interval, which is used for this set of parameters can be presented as absolute values and the values of p is snasti. For example, the position of the time interval for which applies the first set of parameters may be represented in the form of absolute values, and the position of the time interval for which a second or subsequent set of parameters can be represented as the difference relative to the provisions of the preceding time interval. In this case, the position of the time interval, which is used for this set of parameters can be represented by a variable number of bits.

In particular, the position of the time interval for which applies the first set of parameters may be represented by the number of bits calculated using the total number of time intervals and the total number of parameter sets. The position of the time interval for which a second or subsequent set of parameters can be represented by a number of bits calculated using the total number of time intervals, the total number of sets of parameters and the time interval for which applied the previous set of parameters.

If the spatial frame is of the fixed type, the number of parameter sets used in one spatial frame is determined (S1404). In this case, the position of the time interval to which a given parameter set is applied is chosen using predetermined rules. For example, the position of the time interval to which a parameter set is applied can be defined so that it is equidistant from the position of the time interval to which the previous parameter set is applied (S1405).
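A possible predetermined rule for the fixed frame type is sketched below. The description only requires equidistant positions, so the exact formula used here (0-based positions, with the last parameter set falling on the last time interval) is an assumption, and `fixed_positions` is a hypothetical name.

```python
def fixed_positions(num_slots, num_sets):
    """Spread num_sets parameter sets evenly over num_slots time intervals.

    Assumed rule: position i = ceil(num_slots * (i + 1) / num_sets) - 1,
    computed with integer arithmetic only.
    """
    if not 1 <= num_sets <= num_slots:
        raise ValueError("need 1 <= num_sets <= num_slots")
    # -(-a // b) is the integer ceiling of a / b.
    return [-(-num_slots * (i + 1) // num_sets) - 1 for i in range(num_sets)]
```

Because both sides can compute these positions from the number of time intervals and the number of parameter sets, no position information has to be transmitted for a fixed frame.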

Next, the down-mixing block and the spatial information generation block generate the down-mixed signal and the spatial information, respectively, using the values determined above: the total number of time intervals, the total number of parametric ranges, the number of parametric ranges used for the channel conversion module, the total number of parameter sets in one spatial frame, and the position information of the time intervals to which the parameter sets are applied (S1408).

Finally, the multiplexing unit generates a bit stream that includes the down-mixed signal and the spatial information and transmits the generated bit stream to the decoder (S1409).

Fig. 15 presents a flowchart of a decoding method according to one embodiment of the present invention. The method of decoding the audio signal and the operation of the decoder according to the present invention are explained below.

First, the decoder receives the audio bitstream (S1501). The demultiplexing unit separates the down-mixed signal and the spatial information signal from the received bitstream (S1502). Next, the spatial information signal decoding block extracts, from the configuration information of the spatial information signal, the total number of time intervals in one spatial frame, the total number of parametric ranges, and the number of parametric ranges used for the channel conversion module (S1503).

If the spatial frame is of the variable type (S1504), the number of parameter sets in one spatial frame and the position information of the time interval to which each parameter set is applied are extracted from the spatial frame (S1505). The position information of a time interval can be represented by a fixed or variable number of bits. In this case, the position information of the time interval to which the first parameter set is applied may be represented as an absolute value, and the position information of the time intervals to which the second or subsequent parameter sets are applied may be represented as a difference. The actual position of a time interval to which a second or subsequent parameter set is applied can be found by adding the difference to the position of the time interval to which the previous parameter set is applied.
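The reconstruction of positions on the decoder side can be sketched as follows. As assumptions of this sketch, the bitstream is modelled as a string of '0' and '1' characters, positions are 0-based and strictly increasing, the field width is the smallest that covers the admissible values, and the transmitted difference is offset by one. The name `decode_positions` is hypothetical.

```python
def decode_positions(bits, num_slots, num_sets):
    """Read num_sets interval positions from a '0'/'1' string.

    Returns the list of positions and the number of bits consumed.
    """
    positions = []
    cursor = 0
    prev = None
    for i in range(num_sets):
        if prev is None:
            count = num_slots - num_sets + 1        # range of the absolute value
        else:
            count = num_slots - num_sets + i - prev  # range of the difference
        width = (count - 1).bit_length()
        if cursor + width > len(bits):
            raise ValueError("bitstream too short")
        value = int(bits[cursor:cursor + width], 2) if width else 0
        cursor += width
        # First field is absolute; later fields are added to the previous
        # position (plus the assumed offset of one).
        pos = value if prev is None else prev + value + 1
        positions.append(pos)
        prev = pos
    return positions, cursor
```

For 16 time intervals and three parameter sets, the five bits `11001` decode to the positions 12, 14 and 15: four bits for the absolute value, one bit for the first difference, and no bits for the second.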

Finally, the down-mixed signal is converted into a multi-channel audio signal using the extracted information (S1506).

The embodiments of the invention disclosed above provide several advantages over conventional audio encoding schemes.

First, when encoding a multi-channel audio signal, the disclosed embodiments can reduce the amount of transmitted data by representing the position of the time interval to which a parameter set is applied with a variable number of bits.

Second, the disclosed embodiments can reduce the amount of transmitted data by representing the position of the time interval to which the first parameter set is applied as an absolute value and the positions of the time intervals to which the second or subsequent parameter sets are applied as differences.

Third, the disclosed embodiments can reduce the amount of transmitted data by representing the number of parametric ranges used for a channel conversion module, such as an OTT block and/or a TTT block, with a fixed or variable number of bits. In this case, the positions of the time intervals to which the parameter sets are applied can be represented using the principle discussed above, where the parameter sets can lie within the number of parametric ranges.

Fig. 16 presents a block diagram of an exemplary device architecture 1600 for implementing the audio encoder/decoder described with reference to Figs. 1-15. The device architecture 1600 is applicable to a wide variety of devices, including, but not limited to: personal computers, server computers, consumer electronics, mobile phones, personal digital assistants (PDAs), tablets, television systems, television set-top boxes, game consoles, media players, music players, navigation systems, or any other device capable of decoding audio signals. Some of these devices may implement a modified architecture that uses a combination of hardware and software.

Architecture 1600 includes one or more processors 1602 (e.g., PowerPC®, Intel Pentium® 4, etc.), one or more display devices 1604 (e.g., a cathode ray tube (CRT) or liquid crystal display (LCD)), an audio subsystem 1606 (e.g., audio hardware/software), one or more network interfaces 1608 (e.g., Ethernet, FireWire®, USB, etc.), input devices 1610 (e.g., keyboard, mouse, etc.), and one or more machine-readable media 1612 (e.g., random-access memory (RAM), read-only memory (ROM), synchronous dynamic RAM (SDRAM), hard disk, optical disk, flash memory, etc.). These components can exchange messages and data through one or more buses 1614 (e.g., EISA, PCI, PCI Express, etc.).

The term “machine-readable medium” refers to any medium that participates in providing instructions to the processor 1602 for execution, including, but not limited to, non-volatile media (e.g., optical or magnetic disks), volatile media (e.g., memory), and transmission media. Transmission media include, but are not limited to, coaxial cables, copper wire, and optical fiber. Transmission media may also take the form of acoustic, light, or radio waves.

The machine-readable medium 1612 further includes an operating system 1616 (e.g., Mac OS®, Windows®, Linux, etc.), a network communication module 1618, an audio codec 1620, and one or more applications 1622. The operating system 1616 may be multi-user, multiprocessing, multitasking, multithreading, real-time, and so on. The operating system 1616 performs basic tasks, including, but not limited to: recognizing input from the input devices 1610; sending output to the display devices 1604 and the audio subsystem 1606; keeping track of files and directories on the machine-readable media 1612 (e.g., memory or a storage device); controlling peripheral devices (e.g., disk drives, printers, etc.); and managing traffic on the one or more buses 1614.

The network communication module 1618 includes various components for establishing and maintaining network connections (e.g., software for implementing communication protocols such as TCP/IP, HTTP, Ethernet, etc.). The network communication module 1618 may include a browser that enables operators of the device architecture 1600 to search a network (e.g., the Internet) for information (e.g., audio content).

The audio codec 1620 is responsible for implementing all or part of the encoding and/or decoding processes described with reference to Figs. 1-15. In some embodiments, the audio codec works together with hardware (e.g., the processor(s) 1602 and the audio subsystem 1606) to process audio signals, including encoding and/or decoding audio signals, in accordance with the present invention described here.

The applications 1622 may include any software application related to audio content and/or to the encoding and/or decoding of audio content, including, but not limited to, media players, music players (e.g., MP3 players), mobile phone applications, PDAs, television systems, and television set-top boxes. In one embodiment, the audio codec can be used by an application service provider to provide encoding/decoding services over a network (e.g., the Internet).

In the above description, numerous specific details are set forth for purposes of explanation in order to provide a thorough understanding of the invention. However, it should be apparent to those skilled in the art that the invention can be practiced without these specific details. In other instances, structures and devices are shown in block diagram form in order not to obscure the essence of the invention.

In particular, those skilled in the art should understand that other architectures and graphical environments can be used and that the present invention can be implemented using graphical tools and products other than those described above. In particular, the client/server approach is merely an example of an architecture providing the functionality of the present invention; it is obvious to those skilled in the art that approaches other than the client/server scheme can also be used.

Some portions of the detailed description are presented in terms of algorithms and symbolic representations of operations on data bits within a computer memory. These algorithmic descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. An algorithm is here, and generally, understood to be a self-consistent sequence of steps leading to a desired result. The steps are those requiring physical manipulation of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. In practice, it is convenient to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, etc.

Industrial applicability

However, it should be borne in mind that all of these and similar terms are to be associated with the appropriate physical quantities and are merely convenient labels applied to these quantities. As is evident from this discussion, unless specifically stated otherwise, throughout the description, discussions using terms such as “processing,” “computing,” “calculating,” “determining,” or “displaying” and the like refer to the actions and processes of a computer system, or similar electronic computing device, that manipulates and transforms data represented as physical (electronic) quantities within the registers and storage devices of the computer system into other data similarly represented as physical quantities within the storage devices or registers of the computer system or other devices for storing, transmitting, or displaying information.

The present invention also relates to a device for performing the operations described here. This device may be specially constructed for the required purposes, or it may comprise a general-purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored on a machine-readable medium, such as, but not limited to, any type of disk, including floppy disks, optical disks, compact disks (CD-ROMs), and magneto-optical disks, read-only memory (ROM), random-access memory (RAM), erasable programmable read-only memory (EPROM), electrically erasable programmable read-only memory (EEPROM), magnetic or optical cards, or any type of media suitable for storing electronic instructions, each coupled to a computer system bus.

The algorithms and modules presented here are not inherently related to any particular computer or other device. Various general-purpose systems may be used with programs in accordance with the principles set out here, or it may prove more convenient to construct a more specialized device to perform the steps disclosed here. The required structure for a variety of these systems follows from the description. In addition, the present invention is not described with reference to any particular programming language. Many different programming languages can be used to implement the principles of the invention described here. In addition, it should be clear to those skilled in the art that the modules, features, attributes, methodologies, and other aspects of the invention can be implemented as software, hardware, firmware, or any combination thereof. Of course, when a component of the present invention is implemented as software, it can be implemented as a standalone program, as part of a larger program, as a plurality of separate programs, as a statically or dynamically linked library, as a kernel loadable module, and/or in any other way known now or in the future to those skilled in the art of computer programming. In addition, the present invention is not limited to implementation in any specific operating system or environment.

It should be clear to those skilled in the art that various changes or modifications can be made to the embodiments of the invention discussed here without departing from the spirit or scope of the invention. Thus, it is intended that the present invention cover all such modifications and alterations of the embodiments disclosed here, provided that they fall within the scope of the appended claims and their equivalents.

1. A method of decoding an audio signal, performed by an audio coding system, the audio signal including at least one frame, the frame containing at least one time interval and at least one parameter set, the method comprising the steps of:
extracting a number of time intervals and a number of parameter sets from the audio signal in order to identify time interval information, the time interval information indicating the time interval to which a parameter set is applied;
determining a bit length allocated to the time interval information, the bit length being variable in accordance with the number of time intervals, the number of parameter sets, and previous time interval information associated with the previous parameter set;
extracting the time interval information on the basis of the bit length; and
decoding the audio signal on the basis of the time interval information and the corresponding parameter sets,
wherein the number of items of extracted time interval information is equal to the number of parameter sets, and the time interval information includes an absolute value indicating the time interval to which the first parameter set is applied and a differential value indicating the time interval to which a parameter set following the first parameter set is applied.

2. The method according to claim 1, wherein the time interval information is position information indicating the position of the time interval to which a parameter set is applied.

3. The method according to claim 1, wherein the time interval to which the following parameter set is applied is determined by adding the differential value to the previous time interval.

4. The method according to claim 1, wherein the absolute value is determined within a first maximum range, the first maximum range being calculated using the number of parameter sets and the number of time intervals, and
wherein the differential value is determined within a second maximum range, the second maximum range being calculated in accordance with the previous time interval information.

5. A device for decoding an audio signal, the audio signal including a down-mixed signal and spatial information, the spatial information including at least one frame having at least one time interval and at least one parameter set, the device comprising:
a spatial information decoding block configured to extract a number of time intervals and a number of parameter sets from the audio signal in order to identify time interval information, the time interval information indicating the time interval to which a parameter set is applied, to determine a bit length allocated to the time interval information, the bit length being variable in accordance with the number of time intervals, the number of parameter sets, and previous time interval information associated with the previous parameter set, to extract the time interval information on the basis of the bit length, and to decode the spatial information on the basis of the time interval information and the corresponding parameter sets,
wherein the number of items of extracted time interval information is equal to the number of parameter sets, and the time interval information includes an absolute value indicating the time interval to which the first parameter set is applied and a differential value indicating the time interval to which a parameter set following the first parameter set is applied;
a down-mixed signal decoding block configured to decode the down-mixed signal; and
a multi-channel signal generation block configured to generate a multi-channel signal using the decoded down-mixed signal and the decoded spatial information.

6. The device according to claim 5, wherein the time interval information is position information indicating the position of the time interval to which a parameter set is applied.

7. The device according to claim 5, wherein the time interval to which the following parameter set is applied is determined by adding the differential value to the previous time interval.

8. The device according to claim 5, wherein the absolute value is determined within a first maximum range, the first maximum range being calculated using the number of parameter sets and the number of time intervals, and
wherein the differential value is determined within a second maximum range, the second maximum range being calculated in accordance with the previous time interval information.

 
