Transformation of audio file format

FIELD: physics.

SUBSTANCE: the invention relates to coding audio signals into audio data streams. Handling of audio data, such as combining separate audio data streams into multi-channel audio data streams, is simplified by modifying the data blocks of an audio data stream that is divided into data blocks, each with a definition block and data block audio data, for instance by completing them, adding to them or replacing part of them, so that they include a length indication giving the amount or length of the data block audio data or of the data block itself, thereby yielding a second audio data stream with modified data blocks. Alternatively, an audio data stream whose definition blocks contain pointers to the definition block audio data associated with them, but distributed among different data blocks, is converted into an audio data stream in which the definition block audio data are combined into contiguous definition block audio data. The contiguous definition block audio data may then be included in a self-contained channel element together with their definition block.

EFFECT: simpler handling of audio data, both when combining separate audio data streams into multi-channel audio data streams and when manipulating an audio data stream in general.

13 cl, 9 dwg

 

The present invention relates to audio data streams that encode audio signals, and more particularly to easier handling of audio data streams in a file format in which the audio data associated with a time stamp can be distributed over different data blocks, as is the case in the MP3 format.

Audio compression according to the MPEG standard is a particularly efficient way of storing audio signals, such as music or film sound, in digital form, requiring as little memory space as possible on the one hand while maintaining the best possible audio quality on the other. Over the last few years, MPEG audio compression has proved to be one of the most successful solutions in this field.

Meanwhile, several versions of the MPEG audio compression standard exist. Generally speaking, the audio signal is sampled at a certain sampling frequency, and the resulting sequence of audio samples is assigned to overlapping time periods or time stamps, respectively. These time stamps are then fed individually into, for example, a hybrid filter bank consisting of a polyphase filter bank and a modified discrete cosine transform (MDCT), which suppresses aliasing effects. The actual data compression occurs during quantization of the MDCT coefficients. The MDCT coefficients quantized in this way are then converted into a Huffman code of Huffman code words, which yields further compression by assigning shorter code words to more frequently occurring coefficients. Overall, MPEG compression is therefore lossy, but the audible loss is limited because psychoacoustic knowledge is incorporated into the way the DCT (discrete cosine transform) coefficients are quantized.

A widely used MPEG standard is the so-called MP3 standard, described in ISO/IEC 11172-3 and 13818-3. This standard allows the information loss caused by compression to be adapted to the bit rate at which the audio data are to be transmitted in real time. Transmission of the compressed data over a channel with a constant bit rate is also provided for in other MPEG standards. To ensure that the listening quality at the receiving decoder remains sufficient even at low bit rates, the MP3 standard provides for an MP3 encoder to have a so-called bit reservoir. This means the following. Normally, because of the fixed bit rate, an MP3 encoder would have to encode each time stamp into a block of code words of the same size, so that this block could be transmitted at the given bit rate within the repetition period of the time stamps. However, this would not take into account that some parts of an audio signal, such as sounds following a very loud sound in a piece of music, require less precise quantization at constant quality than other parts, such as passages with many different instruments. An MP3 encoder therefore does not generate a simple bit stream format in which each time stamp is encoded in one frame with the same frame length for all frames. Such a self-contained frame would consist of a frame header, additional information (side information) and the main data associated with the time stamp of the frame, namely the encoded MDCT coefficients, where the additional information tells the decoder how the DCT coefficients are to be decoded, for example how many subsequent DCT coefficients are 0, in order to indicate which DCT coefficients are included in the main data in sequence. Instead, a back pointer is included in the additional information or in the header, pointing to a position within the main data of one of the previous frames. This position is the beginning of the main data belonging to the time stamp with which the frame containing the back pointer is associated. The back pointer indicates, for example, the number of bits by which the beginning of the main data is offset in the bit stream. The end of these main data can lie in any frame, depending on how high the compression rate is for this time stamp. The length of the main data of the individual time stamps is thus no longer constant, and the number of bits with which a block is encoded can adapt to the properties of the signal. At the same time, a constant bit rate can be achieved. This technique is called the "bit reservoir". Generally speaking, the bit reservoir is a buffer of bits that can be used to provide more bits for encoding a block of time samples than the constant output data rate normally allows. The bit reservoir technique exploits the fact that some blocks of audio samples can be encoded with fewer bits than specified by the constant transmission rate, so that these blocks fill the bit reservoir, while other blocks of audio samples have psychoacoustic properties that do not permit such high compression, so that the available bits would not actually suffice to encode these blocks with low noise or without noise, respectively. The additional bits required are taken from the bit reservoir, so that the bit reservoir is emptied during such blocks. The bit reservoir technique is also described in the above-mentioned MPEG Layer 3 standard.
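The interplay between a fixed frame size and a varying amount of main data per time stamp can be illustrated with a minimal sketch. It is not taken from the MP3 standard or from an actual encoder: the function name back_pointers, the byte granularity and the max_reservoir limit of 511 bytes (the largest value a 9-bit back pointer can hold) are assumptions made for illustration.

```python
def back_pointers(main_data_sizes, capacity, max_reservoir=511):
    """Return a hypothetical back pointer (in bytes) for each frame.

    main_data_sizes[i] is the number of main-data bytes spent on time stamp i;
    capacity is the fixed number of main-data bytes that one frame can carry.
    """
    pointers = []
    reservoir = 0  # bytes left unused in earlier frames
    for size in main_data_sizes:
        # The main data of this time stamp begin `reservoir` bytes
        # before the header of its own frame.
        pointers.append(reservoir)
        available = reservoir + capacity
        if size > available:
            raise ValueError("time stamp needs more bytes than the reservoir can supply")
        # Assumption: savings beyond max_reservoir are lost (e.g. to stuffing).
        reservoir = min(available - size, max_reservoir)
    return pointers


if __name__ == "__main__":
    print(back_pointers([300, 300, 500, 400], capacity=400))
```

With 400 bytes of capacity per frame and main data sizes of 300, 300, 500 and 400 bytes, the sketch yields back pointers of 0, 100, 200 and 100 bytes: the third time stamp draws on the bytes saved by the first two.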

Although the MP3 format, by providing back pointers, has advantages on the encoder side, it has clear disadvantages on the decoder side. If, for example, a decoder receives an MP3 bit stream not from the beginning but starting from a certain frame in the middle, the audio signal encoded for the time stamp associated with that frame can be played back immediately only if the back pointer happens to be 0, which would indicate that the main data for this frame happen to begin immediately after the header or additional information, respectively. Usually, however, this is not the case. Playback of the audio signal for this time stamp is therefore impossible when the back pointer of the frame received first points to a previous frame that has not been received. In this case, playback can at first begin only with a subsequent frame.

Further problems arise on the receiver side when handling frames that are linked to one another by back pointers and are therefore not self-contained. A further problem of bit streams with back pointers for a bit reservoir is that, when different channels of an audio signal are individually MP3-encoded, main data that belong together in the two bit streams because they are associated with the same time stamp may be shifted relative to one another, by an offset that varies over the sequence of frames, which again impedes combining these individual MP3 streams into a multi-channel audio data stream.

There is also a need for a simple way of generating easily manageable, MP3-compatible multi-channel audio data streams. Multi-channel MP3 audio data streams according to ISO/IEC 13818-3 require matrix operations on the decoder side to retrieve the input channels from the transmitted channels, use several back pointers, and are therefore complicated to handle.

MPEG-1/2 Layer 2 audio data streams correspond to MP3 audio data streams in that they consist of successive frames and in the structure and arrangement of these frames, namely a structure of header, additional information and main data, and an arrangement with a quasi-static frame length that depends on the sampling rate and on a bit rate that can vary from frame to frame; however, they differ from MP3 streams in that no back pointers or bit reservoir, respectively, are used during encoding. Time periods of the audio signal that are costly to encode and those that are cheap to encode are encoded with the same frame length. The main data belonging to a time stamp are located in the same frame as the corresponding header.

The object of the present invention is to provide a scheme for converting an audio data stream into a further audio data stream or vice versa, so that handling of the audio data is made easier, for example with regard to combining separate audio data streams into multi-channel audio data streams, or handling of an audio data stream in general.

This object is achieved by a method according to claims 1, 10, 13, 14 or 15 and by a device according to claims PP, 18, 19, 20 or 21.

Handling of audio data, for example combining separate audio data streams into multi-channel audio data streams or manipulating an audio data stream in general, can be simplified by modifying the data blocks of an audio data stream that is divided into data blocks, each with a definition block and data block audio data, for example by completing them, adding to them or replacing part of them, so that they include a length indication giving the amount or length of the data block audio data or of the data block itself, thereby obtaining a second audio data stream with modified data blocks. Alternatively, an audio data stream whose definition blocks contain pointers to the definition block audio data associated with them, but distributed among different data blocks, is converted into an audio data stream in which the definition block audio data are combined into contiguous definition block audio data. The contiguous definition block audio data can then be included in a self-contained channel element together with their definition block.

The present invention is based on the finding that a pointer-based audio data stream, in which a pointer indicates the beginning of the definition block audio data of the respective data block, becomes easier to handle when the stream is manipulated such that all definition block audio data, i.e. the audio data relating to the same time stamp or encoding the audio values for the same time stamp, are combined into a contiguous block of contiguous definition block audio data, and the definition block associated with these contiguous definition block audio data is attached to it. After being arranged or lined up in sequence, respectively, the channel elements obtained in this way form a new audio data stream in which all audio data belonging to one time stamp, or encoding the audio values or samples, respectively, for that time stamp, are likewise combined in one channel element, so that the new audio data stream is easier to handle.

According to an embodiment of the present invention, each definition block or each channel element in the new audio data stream is modified, for example by adding or replacing a part, to obtain a length indication giving the length or amount of data, respectively, of the channel element or of the contiguous audio data contained in it, in order to simplify decoding of the new audio data stream with its channel elements of variable length. Advantageously, the modification is performed by replacing a redundant part of the definition blocks, which is identical for all definition blocks of the input audio data stream, with the respective length indication. This measure ensures that the bit rate of the resulting audio data stream equals that of the original audio data stream despite the length indication added in comparison with the original, pointer-based audio data stream, and that the back pointer, which is actually superfluous in the new audio data stream, can be retained so that the original audio data stream can be reconstructed from the new one.

The identical redundant part of these definition blocks can be placed ahead of the new resulting audio data stream in an overall definition block. On the receiver side, the resulting second audio data stream can thus be reconverted into the original audio data stream, so that existing decoders capable of decoding audio data streams of the original file format can be used to decode the resulting audio data stream in the pointer-free format.

According to a further embodiment of the present invention, conversion of a first audio data stream into a second audio data stream of another file format is used to generate a multi-channel audio data stream from several audio data streams of the first file format. Handling on the receiver side is improved compared with a mere combination of the original pointer-based audio data streams, since in the multi-channel audio data stream all channel elements belonging to one time stamp, i.e. those containing the contiguous definition block audio data obtained by encoding simultaneous time periods of the different channels of the multi-channel audio signal, can be combined into access units. This is not possible with pointer-based audio data formats, since the audio data for one time stamp can be distributed among different data blocks. Providing the data blocks in the different audio data streams of the different channels with a length indication makes it easier to parse the access units when the audio data streams are merged into a multi-channel audio data stream with access units.

Furthermore, the present invention is based on the finding that it is very easy to reconvert the resulting audio data streams described above into the original file format, which can then be decoded into the audio signal by existing decoders. Although the resulting channel elements have different lengths and are thus sometimes longer and sometimes shorter than the length available in a data block of the original audio data stream, it is not necessary to shift or merge the main data according to back pointers that are ultimately superfluous for playing back the audio data stream in the new file format; rather, it is possible to increase the bit rate indication in the definition blocks of the audio data stream to be generated in the original file format. The effect is that, according to this bit rate indication, even the longest of the channel elements in the audio data stream to be decoded is shorter than or as long as the data block length of the data blocks in the audio data stream of the first file format. The back pointers are set to zero and the channel elements are extended to the length corresponding to the increased bit rate indication by adding don't-care bits. This yields data blocks of an audio data stream of the original file format in which the associated main data are contained only in the data block itself and not in any other. The audio data stream of the first file format reconverted in this way can then be fed to an existing decoder for audio data streams of the first file format, using the bit rate raised according to the increased bit rate indication. Costly shift operations for reconversion are thus avoided, as is the replacement of existing decoders by new ones.
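The choice of an increased bit rate and the subsequent padding can be sketched as follows. This is an illustration under stated assumptions rather than the claimed method itself: the bit rate table and the factor 144 are the MPEG-1 Layer III values, the padding byte of the frame is ignored, zero bytes stand in for the don't-care bits, and the function names are hypothetical. Rewriting the bit rate field in each header and setting the back pointer to zero are not shown.

```python
# MPEG-1 Layer III bit rates in kbit/s (assumption: only these are selectable)
BIT_RATES_KBPS = [32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320]


def frame_length_bytes(bit_rate_kbps, sample_rate_hz):
    # 1152 samples per frame / 8 bits per byte = 144; padding byte ignored
    return 144 * bit_rate_kbps * 1000 // sample_rate_hz


def choose_bit_rate(element_lengths, sample_rate_hz):
    """Lowest bit rate whose frame length holds even the longest channel element."""
    longest = max(element_lengths)
    for rate in BIT_RATES_KBPS:
        if frame_length_bytes(rate, sample_rate_hz) >= longest:
            return rate
    raise ValueError("no bit rate yields a frame long enough")


def pad_to_common_frame_length(elements, sample_rate_hz):
    """Extend every channel element to the frame length of the chosen bit rate."""
    rate = choose_bit_rate([len(e) for e in elements], sample_rate_hz)
    target = frame_length_bytes(rate, sample_rate_hz)
    # Zero bytes stand in for the don't-care bits appended to each element.
    return rate, [e + bytes(target - len(e)) for e in elements]


if __name__ == "__main__":
    rate, frames = pad_to_common_frame_length(
        [bytes(300), bytes(450), bytes(500)], 44100)
    print(rate, [len(f) for f in frames])
```

Because every padded frame then contains its complete main data, no data has to be moved between frames, at the cost of a higher nominal bit rate.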

On the other hand, according to a further embodiment, the original audio data stream can be restored from the resulting audio data stream by using the information contained in the overall definition block of the resulting audio data stream about the identical redundant part of the definition blocks, in order to recover the part that was overwritten by the length indication.

Brief description of drawings

The invention is further explained in the description of specific variants of its implementation with reference to the drawings, in which:

figure 1 - a schematic drawing illustrating the MP3 file format with back pointers,

figure 2 - a block diagram illustrating an arrangement for converting an MP3 audio data stream into an MPEG-4 audio data stream,

figure 3 - a flowchart of a method for converting an MP3 audio data stream into an MPEG-4 audio data stream according to an embodiment of the present invention,

figure 4 - a schematic drawing illustrating the step of combining associated audio data with attachment of the definition blocks, and the step of modifying the definition blocks, in the method of figure 3,

figure 5 - a schematic drawing illustrating a method for converting several MP3 audio data streams into a multi-channel MPEG-4 audio data stream according to a further embodiment of the present invention,

figure 6 - a block diagram of an arrangement for converting an MPEG-4 audio data stream obtained according to figure 3 back into an MP3 audio data stream so that it can be decoded by existing decoders,

figure 7 - a flowchart of a method for reconverting the MPEG-4 audio data stream obtained according to figure 3 into multiple audio data streams in MP3 format,

figure 8 - a flowchart of a method for reconverting the MPEG-4 audio data stream obtained according to figure 3 into one or more audio data streams in MP3 format according to a further embodiment of the present invention, and

figure 9 - a flowchart of a method for converting an MP3 audio data stream into an MPEG-4 audio data stream according to a further embodiment of the present invention.

The present invention is explained below with reference to the drawings by means of embodiments in which the original audio data stream, in a file format where back pointers in the definition blocks of the data blocks point to the beginning of the main data belonging to the respective definition block, is by way of example an MP3 audio data stream, whereas the resulting audio data stream, consisting of self-contained channel elements in which the audio data belonging to a time stamp are combined, is by way of example an MPEG-4 audio data stream. The MP3 format is described in ISO/IEC 11172-3 and 13818-3, cited in the prior art section, whereas the MPEG-4 file format is described in ISO/IEC 14496-3.

First, the MP3 format is described with reference to figure 1. Figure 1 shows a portion of an MP3 audio data stream 10. The audio data stream 10 consists of a sequence of frames or data blocks, respectively, of which only three are shown in figure 1, namely 10a, 10b and 10c. The MP3 audio data stream 10 has been generated by an MP3 encoder from an audio signal. The audio signal encoded by the data stream 10 is, for example, music, noise, a mixture of these, etc. Each of the data blocks 10a, 10b and 10c is associated with one of the consecutive, possibly overlapping time periods into which the audio signal was divided by the MP3 encoder. Each time period corresponds to a time stamp of the audio signal, and in this description the term "time stamp" is therefore often used for the time period. Each time period was encoded into main data (main_data) by the MP3 encoder individually, for example by a hybrid filter bank consisting of a polyphase filter bank and a modified discrete cosine transform, followed by entropy coding such as Huffman coding. The main data belonging to the three consecutive time stamps associated with data blocks 10a-10c are shown in figure 1 as contiguous blocks, separately from the actual audio data stream 10.

The data blocks 10a-10c of the audio data stream 10 are arranged equidistantly in the audio data stream 10. This means that each data block 10a-10c has the same data block length or frame length, respectively. The frame length in turn depends on the bit rate at which the audio data stream 10 must at least be transmitted in order to be played back in real time, and on the sampling rate that the MP3 encoder used to sample the audio signal prior to the actual encoding. The relation is that the sampling rate, together with the fixed number of samples per time stamp, determines how long a time stamp lasts, and that from the bit rate and the duration of a time stamp it can be calculated how many bits can be transmitted in this time period.
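The relation between sampling rate, bit rate and frame length can be made concrete with a short calculation. The sketch below assumes the MPEG-1 Layer III value of 1152 samples per time stamp, which is not stated in this description, and the function names are chosen for illustration only.

```python
SAMPLES_PER_FRAME = 1152  # assumption: MPEG-1 Layer III samples per time stamp


def frame_duration_seconds(sample_rate_hz):
    """Duration of one time stamp, fixed by the sampling rate."""
    return SAMPLES_PER_FRAME / sample_rate_hz


def frame_length_bytes(bit_rate_bps, sample_rate_hz, padding=False):
    """Bytes that can be transmitted at bit_rate_bps during one time stamp.

    bits per frame = bit rate * duration; dividing by 8 gives the factor 144.
    An optional padding byte compensates for the rounding down.
    """
    length = (SAMPLES_PER_FRAME // 8) * bit_rate_bps // sample_rate_hz
    return length + (1 if padding else 0)


if __name__ == "__main__":
    print(frame_duration_seconds(44100))        # about 26 ms
    print(frame_length_bytes(128000, 44100))    # frame length without padding
    print(frame_length_bytes(128000, 48000))
```

At 128 kbit/s and 44.1 kHz this gives frames of 417 bytes (418 with the padding byte), each covering roughly 26 ms of audio.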

Both parameters, i.e. the bit rate and the sampling rate, are indicated in the frame headers 14 of the data blocks 10a-10c. Thus, each data block 10a-10c has its own frame header 14. Generally speaking, all information important for decoding the audio data stream is stored in each frame 10a-10c itself, so that a decoder can begin decoding in the middle of the MP3 audio data stream 10.

Apart from the frame header 14 at its beginning, each data block 10a-10c has an additional information section 16 and a main data section 18 containing data block audio data. The additional information section 16 immediately follows the header 14. It contains information that the decoder of the audio data stream 10 needs in order to locate the main data or definition block audio data, respectively, associated with the respective data block, which are merely Huffman code words arranged linearly in series, and to decode them correctly into DCT or MDCT coefficients, respectively. The main data section 18 forms the end of each data block.

As mentioned in the prior art section, the MP3 standard supports a bit reservoir function. This is made possible by the back pointers contained in the additional information of the additional information sections 16, indicated in figure 1 by reference numeral 20. If a back pointer is set to 0, the main data belonging to this additional information begin immediately after the additional information section 16. Otherwise, the back pointer 20 (main_data_begin) indicates the beginning of the main data that encode the time stamp associated with the data block whose additional information 16 contains the back pointer 20, this beginning lying in a previous data block. For example, in figure 1 the data block 10a is associated with the time stamp encoded by the main data 12a. The back pointer 20 in the additional information 16 of data block 10a points, for example, to the beginning of the main data 12a, which lies in a data block preceding data block 10a in stream direction 22, by specifying an offset in bits or bytes measured from the beginning of the header 14 of data block 10a. This means that at this point during encoding of the audio signal, the bit reservoir of the MP3 encoder generating the MP3 audio data stream 10 was not full but could still be loaded up to the level of the back pointer. From the position to which the back pointer 20 of data block 10a points onward, the main data 12a are inserted into the audio data stream 10 with its equidistantly arranged pairs of header and additional information 14, 16. In this example, the main data 12a extend slightly beyond half of the main data section 18 of data block 10a. The back pointer 20 in the additional information section 16 of the subsequent data block 10b points to a position immediately after the main data 12a in data block 10a. The same applies to the back pointer 20 in the additional information 16 of data block 10c.

As can be seen, it is rather the exception in the MP3 audio data stream 10 that the main data belonging to a time stamp actually lie exclusively in the data block associated with that time stamp. Rather, the main data are mostly distributed over one or several data blocks, which may not even include the associated data block itself, depending on the size of the bit reservoir.

Having described the structure of an MP3 audio data stream with reference to figure 1, an arrangement is now described with reference to figure 2 which is suitable for converting an MP3 audio data stream into an MPEG-4 audio data stream, or for obtaining from an audio signal an MPEG-4 audio data stream that can easily be converted into MP3 format.

Figure 2 shows an MP3 encoder 30 and an MP3-to-MPEG-4 converter 32. The MP3 encoder 30 has an input for receiving an audio signal to be encoded and an output for outputting an MP3 audio data stream that encodes the audio signal at the input. The MP3 encoder 30 operates according to the above-mentioned MP3 standard.

The MP3 audio data stream, whose structure was described with reference to figure 1, consists, as mentioned, of frames with a fixed frame length that depends on the set bit rate and the underlying sampling rate, as well as on a padding byte that is either set or not set. The MP3-to-MPEG-4 converter 32 receives the MP3 audio data stream at its input and outputs at its output an MPEG-4 audio data stream whose structure will become apparent from the following description of the operation of the MP3-to-MPEG-4 converter 32. The converter 32 thus converts the MP3 audio data stream from MP3 format into MPEG-4 format. The MPEG-4 data format has the advantage that all main data belonging to one time stamp are contained in a contiguous access unit or channel element, so that handling of the latter is much easier.

Figure 3 shows the individual steps of the method performed by the converter 32 when converting the MP3 audio data stream into the MPEG-4 audio data stream. First, in step 40, the MP3 audio data stream is received. Receiving may comprise storing the complete audio data stream, or only the current portion of it, in a latch. Accordingly, the subsequent steps of the conversion can be performed either during the receiving step 40 in real time, or only after it.

Then, in step 42, all audio data or main data, respectively, belonging to one time stamp are combined into a contiguous block, and this is done for all time stamps. Step 42 is illustrated in more detail in figure 4, where elements of the MP3 audio data stream similar to those shown in figure 1 are denoted by the same or similar reference numerals, and a repeated description of these elements is omitted.

As indicated by the data stream direction 22, the parts of the MP3 audio data stream 10 shown further to the left in figure 4 reach the converter 32 earlier than the parts to the right. Two data blocks, 10a and 10b, are shown completely in figure 4. The time stamp belonging to data block 10a is encoded by the main data MD1, which in the example of figure 4 are contained partly in the data block preceding data block 10a and partly in data block 10a, here in particular in its main data section 18. The main data encoding the time stamp associated with the subsequent data block 10b are contained exclusively in the main data section 18 of data block 10a and are denoted MD2. The main data MD3 belonging to the data block that follows data block 10b are distributed over the main data sections 18 of data blocks 10a and 10b.

In step 42, the converter 32 combines all main data encoding the same time stamp into contiguous blocks. Thus, the portion 44 located before data block 10a and the portion 46 located within data block 10a of the main data MD1 yield a contiguous block 48 when combined in step 42. The same is done for the other main data MD2, MD3.

For step 42, the converter 32 reads the pointer in the additional information 16 of a data block 10a and then, based on this pointer, reads the first portion 44 of the definition block audio data 12a for this data block 10a, which is contained in section 18 of the previous data block, starting at the position specified by the pointer and continuing up to the header of the current data block 10a. It then reads the second portion 46 of the audio data, which is contained in section 18 of the current data block 10a and includes the end of the definition block audio data for this data block 10a, starting from the end of the additional information 16 of the current data block 10a and continuing up to the beginning of the next audio data MD2, belonging to the next data block 10b, to which the pointer in the additional information 16 of the subsequent data block 10b points, which the converter 32 likewise reads. Joining the two portions yields, as described, the block 48.

In step 50, the converter 32 attaches the associated header 14, together with the associated additional information 16, to the contiguous blocks in order to form MP3 channel elements 52a, 52b and 52c. Each channel element 52a-52c thus consists of the header 14 of the corresponding MP3 data block, followed by the additional information section 16 of the same MP3 data block and the contiguous block 48 of main data encoding the time stamp associated with the data block from which the header and the additional information originate.
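Steps 42 and 50 can be sketched on a simplified frame model. The Frame class and both function names are hypothetical; the sketch assumes that the back pointer counts only main data bytes (headers and additional information are skipped), that it is available as an already parsed integer, and that the main data of the last frame extend to the end of the received stream.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    header: bytes       # frame header 14
    side_info: bytes    # additional information 16 (contains the back pointer)
    back_pointer: int   # parsed main_data_begin, in main data bytes (assumption)
    payload: bytes      # physical main data section 18 of this frame


def gather_main_data(frames: List[Frame]) -> List[bytes]:
    """Step 42: one contiguous main data block per time stamp."""
    stream = b"".join(f.payload for f in frames)  # all sections 18 in sequence
    starts = []
    offset = 0  # position of the current frame's section 18 within `stream`
    for f in frames:
        start = offset - f.back_pointer
        if start < 0:
            raise ValueError("back pointer refers to data before the first frame")
        starts.append(start)
        offset += len(f.payload)
    # The main data of one time stamp end where those of the next one begin.
    ends = starts[1:] + [len(stream)]
    return [stream[s:e] for s, e in zip(starts, ends)]


def build_channel_elements(frames: List[Frame]) -> List[bytes]:
    """Step 50: header + additional information + contiguous main data."""
    blocks = gather_main_data(frames)
    return [f.header + f.side_info + b for f, b in zip(frames, blocks)]


if __name__ == "__main__":
    frames = [
        Frame(b"H1", b"S1", 0, b"AAAB"),
        Frame(b"H2", b"S2", 1, b"BBCC"),
        Frame(b"H3", b"S3", 2, b"CCCC"),
    ]
    print(build_channel_elements(frames))
```

In the usage example the three frames all carry four payload bytes, yet the resulting channel elements have different lengths, because the third time stamp uses bytes left over in the second frame.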

The MP3 channel elements resulting from steps 42 and 50 have different channel element lengths, as indicated by double arrows 54a-54c. It should be noted that the data blocks 10a, 10b in the MP3 audio data stream 10 have a fixed frame length 56, whereas the amount of main data for the individual time stamps fluctuates around a mean value because of the bit reservoir.

To facilitate decoding, and in particular parsing of the individual MP3 channel elements 52a-52c on the decoder side, the headers 14, H1-H3, are modified to contain the length of the respective channel element 52a-52c, i.e. 54a-54c. This is done in step 56. The length indication is written into a part that is identical, and hence redundant, in all headers 14 of the audio data stream 10. In the MP3 format, each header 14 begins with a fixed synchronization word (sync word) consisting of 12 bits. In step 56, this sync word is overwritten with the length of the respective channel element. The 12 bits of the sync word are sufficient to represent the length of the respective channel element in binary form, so that the length of the resulting channel elements 58a-58c with modified headers h1-h3 remains the same despite step 56, i.e. equal to 54a-54c. Consequently, after the MP3 channel elements 58a-58c are lined up in the order of the time stamps they encode, the audio information can be transmitted or played back in real time at the same bit rate as the original MP3 audio data stream 10 despite the added length indication, provided that no further overhead is added by additional headers.
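Overwriting the sync word with a length indication amounts to a small bit manipulation on the first two header bytes. The following sketch assumes that the length is counted in bytes and that the 12-bit sync word consists of ones only (0xFFF); the function names are hypothetical.

```python
SYNC_BITS = 12
MAX_LENGTH = (1 << SYNC_BITS) - 1  # 4095, the largest length 12 bits can hold


def write_length_indication(element: bytes) -> bytes:
    """Replace the 12-bit sync word at the start of a channel element by its length."""
    length = len(element)
    if length < 2 or length > MAX_LENGTH:
        raise ValueError("channel element length cannot be stored in 12 bits")
    first = length >> 4                                      # upper 8 bits of the length
    second = ((length & 0x0F) << 4) | (element[1] & 0x0F)    # lower 4 bits + rest of header
    return bytes([first, second]) + element[2:]


def read_length_indication(element: bytes) -> int:
    """Read the 12-bit length back, e.g. to find the next channel element."""
    return (element[0] << 4) | (element[1] >> 4)


def restore_sync_word(element: bytes) -> bytes:
    """Undo the modification by writing the all-ones sync word back."""
    return bytes([0xFF, 0xF0 | (element[1] & 0x0F)]) + element[2:]


if __name__ == "__main__":
    original = bytes([0xFF, 0xFB, 0x90, 0x00]) + bytes(414)
    modified = write_length_indication(original)
    print(read_length_indication(modified), len(modified) == len(original))
```

Since the sync word is the same in every header, it can be written back without loss, and the channel element does not grow by a single bit.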

In step 58, a file header or, in case the data stream to be generated is not a file but a streaming data stream, a data stream header is generated for the desired MPEG-4 audio data stream (step 60). Since, according to the present embodiment, an audio data stream compliant with the MPEG-4 standard is to be generated, the file header is generated according to the MPEG-4 standard, in which case the header has a fixed structure given by the function AudioSpecificConfig defined in the above-mentioned MPEG-4 standard. The interface to MPEG-4 Systems is provided by the element ObjectTypeIndication set to the value h, and by the indication of an audioObjectType with the number 29. The function AudioSpecificConfig specified by the MPEG-4 standard is extended relative to its original definition in ISO/IEC 14496-3 as follows, the example below showing only those parts of AudioSpecificConfig that are relevant to the present description, and not all of them:

 1  AudioSpecificConfig() {
 2      audioObjectType;
 3      samplingFrequencyIndex;
 4      if (samplingFrequencyIndex == 0xf)
 5          samplingFrequency;
 6      channelConfiguration;
 7      if (audioObjectType == 29) {
 8          MPEG_1_2_SpecificConfig();
 9      }
10  }

The above listing uses the common notation for the function AudioSpecificConfig, which is used in the decoder to parse or read the call parameters in the file header, namely samplingFrequencyIndex, channelConfiguration and audioObjectType, and which thus indicates how the file header is to be decoded or parsed.

As can be seen, the file header generated at step 60 begins with the indication audioObjectType, which is set to the number 29 (line 2), as mentioned above. The parameter audioObjectType tells the decoder how the data has been encoded and, in particular, how further information on the encoding can be extracted from the file header, as explained below.

Then follows the call parameter samplingFrequencyIndex, which indicates a position in a standardized table of sampling frequencies (line 3). If the index equals 0xf (line 4), the sampling frequency is instead given explicitly, without reference to the standardized table (line 5).

Then follows the indication of the channel configuration (line 6), which indicates, in a manner described below, how many channels are contained in the generated MPEG-4 audio data stream; in contrast to the present embodiment, it is also possible to combine more than one MP3 audio data stream into one MPEG-4 audio data stream, as described below with reference to figure 5.

Then, if audioObjectType is 29, as in this case, there follows a part of the file header AudioSpecificConfig that contains the redundant part of the MP3 frame headers in the audio data stream 10, that is, the part that remains the same across the frame headers 14 (line 8). This part is denoted here by MPEG_1_2_SpecificConfig(), which is again a function that defines the structure of this part.
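The parsing order described for AudioSpecificConfig can be illustrated with a short sketch. The field widths used here (5, 4, 24 and 4 bits) are taken from ISO/IEC 14496-3 and are not stated in this description, so they should be treated as an assumption; the escape mechanism for audioObjectType values of 31 and above is not handled, and `BitReader` and `parse_audio_specific_config` are hypothetical names.

```python
class BitReader:
    """Reads big-endian bit fields from a byte string."""

    def __init__(self, data: bytes):
        self.data = data
        self.pos = 0  # current position in bits

    def read(self, nbits: int) -> int:
        value = 0
        for _ in range(nbits):
            byte = self.data[self.pos >> 3]
            bit = (byte >> (7 - (self.pos & 7))) & 1
            value = (value << 1) | bit
            self.pos += 1
        return value


def parse_audio_specific_config(data: bytes) -> dict:
    reader = BitReader(data)
    cfg = {"audioObjectType": reader.read(5)}            # line 2 (assumed 5 bits)
    cfg["samplingFrequencyIndex"] = reader.read(4)       # line 3 (assumed 4 bits)
    if cfg["samplingFrequencyIndex"] == 0xF:             # line 4
        cfg["samplingFrequency"] = reader.read(24)       # line 5 (assumed 24 bits)
    cfg["channelConfiguration"] = reader.read(4)         # line 6 (assumed 4 bits)
    # Line 7: audioObjectType 29 marks a reformatted MP3 stream; the bits that
    # follow would then be MPEG_1_2_SpecificConfig, which is not parsed here.
    cfg["is_reformatted_mp3"] = cfg["audioObjectType"] == 29
    cfg["specific_config_bit_offset"] = reader.pos
    return cfg


# 11101 (29) | 0100 (index 4) | 0001 (one channel) | 000 (padding to a byte boundary)
example = parse_audio_specific_config(bytes([0xEA, 0x08]))
```

A recovery unit could use `is_reformatted_mp3` as the check on the stream type and continue reading at `specific_config_bit_offset`.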

Although the structure of the element MPEG_1_2_SpecificConfig can also be taken from the MP3 standard, since it corresponds to the fixed part of an MP3 frame header that does not change from frame to frame, its structure is shown below by way of illustration:

1   MPEG_1_2_SpecificConfig(channelConfiguration) {
2     syncword
3     ID
4     layer
5     reserved
6     sampling_frequency
7     reserved
8     reserved
9     reserved
10    if (channelConfiguration == 0) {
11      channel configuration description;
12    }
13  }

In the part MPEG_1_2_SpecificConfig, all bits that differ from frame header to frame header 14 in the MP3 audio data stream are set to 0. In any case, the first parameter of MPEG_1_2_SpecificConfig, namely the 12-bit synchronization word (syncword), which allows an MP3 decoder to synchronize when receiving an MP3 audio data stream (line 2), is the same for every frame header. The subsequent parameter ID (line 3) indicates the MPEG version, that is 1 or 2, the corresponding standard being ISO/IEC 13818-3 for version 2 and ISO/IEC 11172-3 for version 1. The parameter layer (line 4) indicates layer 3, which corresponds to the MP3 standard. The next bit is reserved (line 5), because its value may change from frame to frame and is transmitted in the MP3 channel elements. This bit indicates whether the header is followed by a CRC (cyclic redundancy check) variable. The next variable, sampling_frequency (line 6), refers to a table of sampling rates defined in the MP3 standard and thus indicates the sampling rate underlying the MP3 DCT coefficients. Then, in line 7, follows a bit reserved for application-specific use, as do the bits in lines 8 and 9. Then (lines 11, 12) follows the exact definition of the channel configuration if the parameter given in line 6 of AudioSpecificConfig does not indicate a predefined channel configuration but has the value 0. Otherwise, the channel configuration of table 1.11 in subsection 1 of ISO/IEC 14496-3 applies.

Step 60, and in particular the provision of the element MPEG_1_2_SpecificConfig in the file header, which contains all redundant information of the frame headers 14 of the original MP3 audio data stream 10, ensures that overwriting this redundant part of the frame headers with data that facilitates decoding, such as the length of the channel element inserted at step 56, does not lead to an irrecoverable loss of this information in the MPEG-4 file to be generated, since the modified part can be restored from the MPEG-4 file header.

Then, at step 62, the MPEG-4 audio data stream is output in the order of the MPEG-4 file header generated at step 60, followed by the channel elements in the order of their associated time stamps; the complete MPEG-4 audio data stream then yields an MPEG-4 file or is transmitted via MPEG-4 systems.

The above description relates to the conversion of one MP3 audio data stream into an MPEG-4 audio data stream. However, as shown by the dotted lines in figure 2, it is also possible to convert several MP3 audio data streams from two MP3 encoders, namely 30 and 30', into a multi-channel MPEG-4 audio data stream. In this case, the MP3-to-MPEG-4 converter 32 receives the MP3 audio data streams of all encoders 30 and 30' and outputs a multi-channel audio data stream in MPEG-4 format.

The upper half of figure 5 illustrates, in a representation corresponding to that of figure 4, how a multi-channel audio data stream according to the MPEG-4 standard can be obtained, the conversion again being performed by the converter 32. Three sequences 70, 72 and 74 of channel elements are shown, each of which was generated according to steps 40-56 from the audio signal of one of the encoders 30 or 30' (figure 2). Of each sequence 70, 72 and 74 of channel elements, two channel elements are shown, namely 70a, 70b, 72a, 72b and 74a, 74b, respectively. In figure 5, the channel elements arranged one above the other, here 70a-74a and 70b-74b, respectively, are associated with the same time stamp. For example, the channel elements of sequence 70 encode an audio signal recorded according to an appropriate standard from the front left and right (front), whereas sequences 72 and 74 encode audio signals representing a recording of the same audio source from other directions or with a different frequency spectrum, such as the center loudspeaker (center) and the rear right and left (surround).

As shown by arrows 76, these channel elements are now combined into units during output (step 62 in figure 3) in the MPEG-4 audio data stream, referred to below as access units 78. Thus, in the MPEG-4 audio data stream, the data contained in an access unit 78 is always associated with one time stamp. The arrangement of the MP3 channel elements 70a, 72a and 74a within the access unit 78, here in the order front channel, center channel and surround channel, is taken into account in the file header generated for the MPEG-4 audio data stream (step 60 in figure 3) by setting the call parameter channelConfiguration in the function AudioSpecificConfig accordingly, reference again being made to subsection 1 of the standard ISO/IEC 14496-3. The access units 78 are again arranged sequentially in the MPEG-4 stream in the order of their time stamps, and they are preceded by the MPEG-4 file header. The parameter channelConfiguration is set in the MPEG-4 file header so as to indicate to the decoder the order of the channel elements in the access units, or their meaning, respectively.

As follows from the above description of figure 5, MP3 audio data streams can be combined very easily into a multi-channel audio data stream if, as proposed by the present invention, the MP3 audio data streams are manipulated so that self-contained channel elements are obtained from the data blocks, with all data for one time stamp contained in one channel element; the channel elements of the individual channels can then simply be combined into access units.
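The combination into access units described for figure 5 amounts to grouping, per time stamp, one channel element from each channel. A minimal sketch, assuming each channel is already available as a list of self-contained channel elements in time-stamp order (the function name `build_access_units` is hypothetical):

```python
from typing import List, Sequence


def build_access_units(channels: Sequence[Sequence[bytes]]) -> List[bytes]:
    """channels[c][t] is the channel element of channel c for time stamp t."""
    if not channels:
        return []
    count = len(channels[0])
    if any(len(channel) != count for channel in channels):
        raise ValueError("all channels must cover the same time stamps")
    # One access unit per time stamp; the channel order is kept, so it can be
    # signalled once in the file header via channelConfiguration.
    return [b"".join(elements) for elements in zip(*channels)]


front = [b"F0", b"F1"]
center = [b"C0", b"C1"]
surround = [b"S0", b"S1"]
access_units = build_access_units([front, center, surround])
```

Here the two-byte strings merely stand in for real channel elements; with real data each element would carry its own length indication so that the access unit can later be split again.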

The preceding description related to the conversion of one or more MP3 audio data streams into an MPEG-4 audio data stream. An essential insight of the present invention, however, is that all the advantages of the resulting MPEG-4 audio data stream, such as the easier handling of the individual self-contained MP3 channel elements at an equal transmission rate and the possibility of multi-channel transmission, can be used without having to replace existing MP3 decoders with completely new decoders, because the reconversion can also be performed without problems, so that the existing decoders can be used for decoding the above-mentioned MPEG-4 audio data stream.

Figure 6 shows an arrangement comprising an MP3 recovery unit 100, whose mode of operation is described in more detail below, and decoders 102, 102'... The MP3 recovery unit 100 receives an MPEG-4 audio data stream generated according to one of the previous embodiments and outputs one or, in the case of a multi-channel audio data stream, several MP3 audio data streams to one or more decoders 102, 102'..., each of which decodes the MP3 audio data stream it receives into a corresponding audio signal and passes it on to the corresponding loudspeakers arranged according to the channel configuration.

A particularly simple way of reconstructing MP3 audio data streams from an MPEG-4 audio data stream generated according to figure 5 is described below with reference to figures 5 and 7, the steps being performed by the MP3 recovery unit 100 of figure 6.

First, at step 110, the MP3 recovery unit 100 verifies that the MPEG-4 audio data stream received at its input is a reformatted MP3 audio data stream by checking whether the call parameter audioObjectType in the file header, according to AudioSpecificConfig, has the value 29. If this is the case (line 7 in AudioSpecificConfig), the MP3 recovery unit 100 continues parsing the file header of the MPEG-4 audio data stream and reads, from the part MPEG_1_2_SpecificConfig, the redundant part of the frame headers of the original MP3 audio data stream from which the MPEG-4 audio data stream was obtained (step 112).

After evaluating MPEG_1_2_SpecificConfig, the MP3 recovery unit 100 replaces, at step 114, in each channel element 74a-74c, one or more parts of the respective header HF, HC, HS with components of MPEG_1_2_SpecificConfig, in particular the length indication of the channel element with the synchronization word from MPEG_1_2_SpecificConfig, in order to obtain again the frame headers of the original MP3 audio data streams, HF, HC and HS, as shown by arrows 116. At step 118, the MP3 recovery unit 100 modifies the additional information SF, SC and SS in each channel element of the MPEG-4 audio data stream. In particular, the back pointer is set to 0 to obtain new additional information S'F, S'C and S'S. The manipulation according to step 118 is indicated in figure 5 by arrow 120. Then, at step 122, the MP3 recovery unit 100 sets the bit rate index in the frame header HF, HC, HS of each channel element 74a-74c, which at step 114 was provided with the synchronization word instead of the length indication of the channel element, to the highest allowable value. As a result, the resulting headers differ from the original ones, which is indicated in figure 5 by an apostrophe, that is, H'F, H'C and H'S. The manipulation of the channel elements according to step 122 is likewise indicated by arrow 120.

To illustrate the changes made in steps 114-122, figure 5 shows the individual parameters of the header H'F and of the additional information S'F. Reference numeral 124 denotes the individual parameters of the header H'F. The frame header H'F begins with the parameter syncword. The syncword is set to its original value (step 114), as in every MP3 audio data stream, namely the value 0xFFF. Overall, the frame header H'F resulting from steps 114-122 differs from the original MP3 frame header contained in the original MP3 audio data stream 10 only in that the bit rate index is set to the highest allowable value, which according to the MP3 standard is 0xE.

The purpose of changing the bit rate index is to obtain, for the new MP3 audio data stream to be generated, a new frame length or data block length that is greater than that of the original MP3 audio data streams from which the MPEG-4 audio data stream with the access units 78 was generated. The trick here is that the frame length in the MP3 format always depends on the bit rate according to the following equation:

for MPEG 1 layer 3:

frame length [bits] = 1152 * bit rate [bits/s] / sampling rate [Hz] + 8 * padding bit

for MPEG 2 layer 3:

frame length [bits] = 576 * bit rate [bits/s] / sampling rate [Hz] + 8 * padding bit

In other words, the frame length of an MP3 audio data stream according to the standard is directly proportional to the bit rate and inversely proportional to the sampling frequency. Added to this is the value of the padding bit, which is indicated in the MP3 frame headers HF, HC, HS and can be used to set the bit rate exactly. The sampling frequency is fixed, because it determines the speed at which the decoded audio signal is played back. Changing the bit rate relative to the original setting makes it possible to fit into the data block length of the new MP3 audio data stream even those MP3 channel elements 74a-74c that are longer than the original data block length because, when the original audio data stream was generated, their main data was produced by borrowing bits from the bit reservoir.
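As a worked example of the two equations, the following sketch computes the frame length in whole bytes. Rounding down to whole bytes is an assumption based on common MP3 practice and is not spelled out in the description; the function name `frame_length_bytes` is hypothetical.

```python
def frame_length_bytes(bit_rate: int, sampling_rate: int, padding: int,
                       mpeg_version: int = 1) -> int:
    """Frame length of a layer 3 frame in bytes.

    bit_rate in bits/s, sampling_rate in Hz, padding is the padding bit (0 or 1).
    """
    samples_per_frame = 1152 if mpeg_version == 1 else 576
    # The equations give the length in bits; dividing by 8 yields bytes,
    # and the padding bit adds one byte (8 bits).
    return samples_per_frame * bit_rate // (8 * sampling_rate) + padding


original = frame_length_bytes(128_000, 44_100, 0)   # a 128 kbit/s source frame
enlarged = frame_length_bytes(320_000, 44_100, 0)   # same stream at 320 kbit/s
```

With a 44.1 kHz MPEG 1 stream, raising the bit rate from 128 kbit/s to 320 kbit/s (the rate that bit rate index 0xE denotes for MPEG 1 layer 3) enlarges the frame from 417 to 1044 bytes, which is the extra room the index change is meant to provide.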

Thus, although in the present embodiment the bit rate index is always set to the highest allowable value, it would also be possible to increase the bit rate index only to a value sufficient to yield a data block length according to the MP3 standard into which even the longest of the MP3 channel elements 74a-74c fits.

Reference numeral 126 illustrates that the back pointer main_data_begin is set to 0 in the resulting additional information. This simply means that, in the MP3 audio data stream generated by the method of Fig.7, the data blocks are always self-contained, so that the main data belonging to a given frame header and its additional information always begins immediately after the additional information, within the same data block.

Steps 114, 118 and 122 are performed on each channel element by extracting it from its access unit, the length indications of the channel elements being useful for this extraction.
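How the length indication helps with the extraction can be sketched as follows. The sketch assumes that the length is stored in the first 12 bits of each channel element, is counted in bytes and includes the header; `split_access_unit` and `restore_syncword` are hypothetical names.

```python
from typing import List


def read_length(data: bytes, offset: int) -> int:
    """Read the 12-bit length indication stored in place of the syncword."""
    return (data[offset] << 4) | (data[offset + 1] >> 4)


def split_access_unit(access_unit: bytes) -> List[bytes]:
    """Cut an access unit into its channel elements using the length indications."""
    elements = []
    offset = 0
    while offset < len(access_unit):
        if offset + 2 > len(access_unit):
            raise ValueError("truncated channel element header")
        length = read_length(access_unit, offset)
        if length < 2 or offset + length > len(access_unit):
            raise ValueError("invalid length indication")
        elements.append(access_unit[offset:offset + length])
        offset += length
    return elements


def restore_syncword(element: bytes) -> bytes:
    """Step 114 (sketch): put the syncword 0xFFF back in place of the length."""
    return bytes([0xFF, 0xF0 | (element[1] & 0x0F)]) + element[2:]


# Two dummy channel elements of 6 and 5 bytes whose first 12 bits hold their length.
first = bytes([0x00, 0x6B, 0x90, 0x00, 0xAA, 0xBB])
second = bytes([0x00, 0x5B, 0x90, 0x00, 0xCC])
parts = split_access_unit(first + second)
```

No search for a synchronization pattern is needed: each element announces where the next one begins.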

Then, at step 128, as many fill data or don't-care bits are appended to each channel element 74a-74c as are needed to bring all MP3 channel elements uniformly up to the MP3 data block length determined by the new bit rate index 0xE. This fill data is indicated by reference numeral 128 in figure 5. The amount of fill data can be calculated for each channel element, for example, by evaluating the length indication of the channel element and the padding bit.
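Steps 118, 122 and 128 can be illustrated together in a short sketch. It rests on several assumptions that are not stated in the description: an MPEG 1 layer 3 frame without CRC, a 4-byte header whose third byte carries the bit rate index in its upper four bits, and a 9-bit main_data_begin field at the very start of the additional information. The function name `make_self_contained` is hypothetical, and the target frame length is passed in rather than derived from the header.

```python
def make_self_contained(element: bytes, frame_length: int) -> bytes:
    """Apply steps 122, 118 and 128 to one channel element whose syncword is already restored."""
    if len(element) < 6:
        raise ValueError("element too short for header and main_data_begin")
    if len(element) > frame_length:
        raise ValueError("element does not fit into the target frame length")
    buf = bytearray(element)
    # Step 122: set the bit rate index (upper nibble of header byte 2) to 0xE.
    buf[2] = (0xE << 4) | (buf[2] & 0x0F)
    # Step 118: set the 9-bit back pointer main_data_begin to 0
    # (all of byte 4 plus the top bit of byte 5; assumed MPEG 1, no CRC).
    buf[4] = 0x00
    buf[5] &= 0x7F
    # Step 128: append fill bytes up to the uniform data block length.
    buf.extend(bytes(frame_length - len(buf)))
    return bytes(buf)


element = bytes([0xFF, 0xFB, 0x90, 0x00, 0xFF, 0xFF]) + bytes(30)
block = make_self_contained(element, 1044)
```

In a real stream the padding bit in the header would also have to agree with the chosen data block length; the sketch leaves it untouched.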

Then, at step 130, the channel elements modified according to the preceding steps, denoted in figure 5 by reference numerals 74a'-74c', are passed to the respective MP3 decoders or decoder instances 134a-134c as data blocks of an MP3 audio data stream, in the order of the time stamps they encode. The MPEG-4 file header is omitted. The resulting MP3 audio data streams are denoted in figure 5 generally by reference numerals 132a, 132b and 132c. The MP3 decoder instances 134a-134c have, for example, been initialized beforehand, in the same number as there are channel elements in the individual access units.

By evaluating the call parameter channelConfiguration in AudioSpecificConfig of the MPEG-4 audio data stream, the MP3 recovery unit 100 knows which of the channel elements 74a-74c in an access unit 78 of the MPEG-4 audio data stream belongs to which of the MP3 audio data streams to be generated. Thus, the decoder instance 134a connected to the front loudspeakers receives the audio data stream 132a corresponding to the front channel, and correspondingly the decoder instances 134b and 134c receive the audio data streams 132b and 132c associated with the center channel and the surround channel, and output the resulting audio signals to correspondingly arranged loudspeakers, for example a subwoofer or loudspeakers located at the rear left and rear right, respectively.

Of course, for decoding the MPEG-4 audio data stream in real time with the arrangement of figure 6, the generated MP3 audio data streams 132a-132c must be passed to the decoder instances 102, 102' or 134a-134c at the bit rate raised in step 122, which is higher than that of the original audio data streams 10. This is not a problem, however, because the arrangement between the MP3 recovery unit 100 and the MP3 decoders 102, 102' or 134a-134c is fixed, so that the transmission paths there are correspondingly short and can be designed for a correspondingly high data rate at low cost and effort.

In the embodiment described with reference to Fig.7, the multi-channel MPEG-4 audio data stream obtained according to figure 5 from the original audio data streams 10 was not converted back exactly into the original MP3 audio data streams; instead, other MP3 audio data streams were generated from it in which, in contrast to the original audio data streams, all back pointers are set to 0 and the bit rate index is set to the highest value. The data blocks of these newly generated MP3 audio data streams are thus self-contained in that all data associated with a particular time stamp is contained in the same data block, and fill data is used to raise the data block length to a uniform value.

Fig.8 shows an embodiment of a method by which an MPEG-4 audio data stream generated according to the embodiments of figures 1 and 5 can be converted back into the original MP3 audio data stream or streams.

In this case, the MP3 recovery unit 100 again checks at step 150, as in step 110, whether the MPEG-4 audio data stream is a reformatted MP3 audio data stream. The subsequent steps 152 and 154 likewise correspond to steps 112 and 114 of the procedure of Fig.7.

Instead of changing the back pointers in the additional information and the bit rate index in the frame header, the MP3 recovery unit 100 reconstructs, in the method of Fig.8, at step 156, the original data block length of the original MP3 audio data streams that were converted into the MPEG-4 audio data stream, on the basis of the sampling rate, the bit rate and the padding bit. The sampling rate and the padding indication are given in MPEG_1_2_SpecificConfig, and the bit rate is given in each channel element if it differs from frame to frame.

The equation for calculating the frame length of the original audio data stream, which is to be reconstructed, is again the same as above:

for MPEG 1 layer 3:

frame length [bits] = 1152 · bit rate [bits/s] / sampling rate [Hz] + 8 · padding bit

for MPEG 2 layer 3:

frame length [bits] = 576 · bit rate [bits/s] / sampling rate [Hz] + 8 · padding bit

The MP3 audio data stream or streams are then generated by arranging the frame headers of the respective channel at intervals equal to the calculated data block length and filling the gaps by inserting the audio data or main data at the positions indicated by the pointers in the additional information. In contrast to the embodiments of figures 7 and 5, the main data associated with a given header or its additional information is inserted into the MP3 audio data stream starting at the position indicated by the back pointer. In other words, the start of the dynamic main data is shifted by the value of main_data_begin. The MPEG-4 file header is omitted. The resulting MP3 audio data stream or streams correspond to the original MP3 audio data streams on which the MPEG-4 audio data stream was based. These MP3 audio data streams can therefore be decoded into audio signals by conventional MP3 decoders, like the audio data streams of Fig.7.
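The redistribution of the main data according to the back pointers can be sketched as follows. The sketch treats the main data parts of all frames as one continuous reservoir; `capacities` is assumed to hold, per frame, the calculated data block length minus the size of header and additional information, and the back pointers are assumed to be byte offsets. The function name `reinsert_main_data` is hypothetical.

```python
from typing import List, Sequence


def reinsert_main_data(capacities: Sequence[int],
                       back_pointers: Sequence[int],
                       main_data: Sequence[bytes]) -> List[bytes]:
    """Return the main data part of every frame of the restored stream."""
    total = sum(capacities)
    reservoir = bytearray(total)          # unused positions stay zero
    slot_start = 0                        # start of the current frame's main data part
    for capacity, back, data in zip(capacities, back_pointers, main_data):
        begin = slot_start - back         # main data starts `back` bytes earlier
        if begin < 0 or begin + len(data) > total:
            raise ValueError("main data does not fit at the indicated position")
        reservoir[begin:begin + len(data)] = data
        slot_start += capacity
    parts = []
    position = 0
    for capacity in capacities:
        parts.append(bytes(reservoir[position:position + capacity]))
        position += capacity
    return parts


restored = reinsert_main_data([5, 4, 1], [0, 1, 2], [b"AAAA", b"BBB", b"CCC"])
```

In the example, the main data of the second and third frames start one and two bytes, respectively, before their own frame's main data part, so they spill into the preceding parts exactly as the back pointers prescribe.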

With regard to the preceding description, it should be noted that the MP3 audio data streams described as single-channel MP3 audio data streams may in some cases actually have been two-channel MP3 audio data streams as specified in ISO/IEC 13818-3; the description did not go into this detail because it is not essential to understanding the present invention. The matrixing operations from the transmitted channels to the input channels on the decoder side and the use of several back pointers in such multi-channel signals are not described here; reference is made to the relevant standards instead.

The above embodiments make it possible to store MP3 data blocks in a modified form in the MPEG-4 file format. MPEG-1/2 audio layer 3, MP3 for short, or proprietary formats derived from it such as MPEG 2.5 or mp3PRO, can be packed into an MPEG-4 file on the basis of these procedures, so that this new representation provides a multi-channel representation of an arbitrary number of channels in a simple way. The use of the complicated and hard-to-apply method of ISO/IEC 13818-3 is not required. In particular, the MP3 data blocks are packed such that each block, i.e. each channel element or access unit, belongs to one defined time stamp.

In the embodiments for changing the format of the digital signal representation, parts of the representation were overwritten with other data. In other words, information required by or useful to the decoder is written into a part of the MP3 data block that is constant across the different blocks within a data stream.

By packing several mono or stereo MP3 data blocks into one access unit of an MPEG-4 file, a multi-channel representation can be obtained that is much easier to handle than the representation of ISO/IEC 13818-3.

In the previous embodiments, the representation of an MP3 data block was reformatted so that all data belonging to a given time stamp is contained in one access unit. This is generally not the case for MP3 data blocks, because the element main_data_begin, i.e. the back pointer, in the original MP3 data block may point to earlier data blocks.
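The reformatting summarized above, in which the main data scattered over earlier data blocks is collected per time stamp, can be sketched as follows. The sketch assumes that each frame is given as a pair of its back pointer (in bytes) and the content of its main data part, that the main data of one frame ends where the main data of the next frame begins, and that the last frame's main data runs to the end of the stream. The name `gather_main_data` is hypothetical.

```python
from typing import List, Sequence, Tuple


def gather_main_data(frames: Sequence[Tuple[int, bytes]]) -> List[bytes]:
    """frames[i] = (main_data_begin, main data part of frame i)."""
    reservoir = b"".join(part for _, part in frames)
    starts = []
    position = 0                          # start of the current frame's main data part
    for back, part in frames:
        begin = position - back           # follow the back pointer
        if begin < 0:
            raise ValueError("back pointer reaches before the start of the stream")
        starts.append(begin)
        position += len(part)
    ends = starts[1:] + [len(reservoir)]  # each block ends where the next begins
    return [reservoir[s:e] for s, e in zip(starts, ends)]


blocks = gather_main_data([(0, b"AAAAB"), (1, b"BBCC"), (2, b"C")])
```

Each entry of `blocks` is the contiguous main data for one time stamp and could be appended to its header and additional information to form a self-contained channel element.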

It is also possible to restore the original data stream (Fig.8). As shown, this means that the recovered data streams can be processed by any compliant decoder.

The above embodiments make it possible to encode or decode more than two channels. Furthermore, in the above embodiments, already encoded MP3 data merely has to be reformatted by simple operations in order to obtain a multi-channel format. On the decoder side, in turn, only this operation or these operations have to be reversed.

Although an MP3 data stream normally contains data blocks of different lengths, because the dynamic data belonging to one block can be packed into preceding blocks, in the previous embodiments the dynamic data was bundled immediately after the additional information. The resulting MPEG-4 audio data stream then had a constant average bit rate but data blocks of different lengths. The element main_data_begin, i.e. the back pointer, is transmitted unmodified to ensure that the original data stream can be restored.

Furthermore, with reference to figure 5, an extension of the MPEG-4 syntax was described for packing several MP3 data blocks as MP3 channel elements into one multi-channel MPEG-4 file. All MP3 channel elements relating to one point in time were packed into one access unit. In accordance with the MPEG-4 standard, the information needed for configuration on the decoder side can be taken from the so-called AudioSpecificConfig. In addition to the audioObjectType, the sampling rate, the channel configuration and so on, it contains a descriptor relevant to the respective audioObjectType. This descriptor was described above as MPEG_1_2_SpecificConfig.

According to the previous embodiments, the 12-bit MPEG-1/2 syncword in the header was replaced by the length of the respective MP3 channel element. According to ISO/IEC 13818-3, 12 bits are sufficient for this. The rest of the header was not changed further, although this could also be done, for example by shortening the frame header to remove the remaining redundant part in addition to the syncword, in order to reduce the amount of information to be transmitted.

Various variations of the above embodiments are readily possible. For instance, the order of the steps in figures 3, 7 and 8 can be changed, in particular steps 42, 50, 56 and 60 in figure 3, steps 11, 114, 118, 122 and 128 in Fig.7, and steps 152, 154 and 156 in Fig.8.

Furthermore, with regard to figures 3, 7 and 8, it should be noted that the steps shown there are performed by corresponding features of the converter or of the recovery unit of figure 2 or 6, respectively, which can be implemented, for example, as a computer or as a hard-wired circuit.

In the embodiment of Fig.7, the manipulation of the headers and of the additional information (steps 118, 122) was performed on the receiver or decoder side in order to supply the MP3 decoders with MP3 data streams slightly modified compared to the original audio data streams. In many cases, it may be advantageous to perform these steps on the encoder or transmitter side instead, since receivers are often mass-produced devices, so that savings in electronics on the receiver side yield a greater benefit. According to an alternative embodiment, these steps are therefore already performed during the MP3-to-MPEG-4 format conversion. The steps of this alternative format conversion method are shown in Fig.9, where steps identical to those of figure 3 are denoted by the same reference numerals and are not described again.

First, the MP3 audio data stream to be converted is received at step 40, and at step 42 the audio data belonging to one time stamp, i.e. representing the encoding of one time period of the audio signal encoded by the MP3 audio data stream, is combined into a contiguous block; this is done for all time stamps. The headers are again added to the contiguous blocks to obtain the channel elements (step 50). The headers are, however, modified not only by replacing the synchronization word with the length of the respective channel element, as in step 56. Rather, further modifications follow at steps 180 and 182, which correspond to steps 118 and 122 of Fig.7. At step 180, the pointer in the additional information of each channel element is set to zero, and at step 182 the bit rate index in the header of each channel element is changed so that, as described above, the MP3 data block length, which depends on the bit rate, is sufficient to hold all the audio data of the respective channel element, i.e. of the respective time stamp, together with the header and the additional information. Step 182 may also include changing the padding bits in the headers of successive channel elements in order to obtain an exact bit rate later, when the MPEG-4 audio data stream produced by the method of Fig.9 is supplied to the decoder by a method according to Fig.7 but without steps 118 and 122. The filling can be performed on the decoder side at step 128.

At step 182, it may be useful not to set the bit rate index to the highest possible value, as described for step 122. The value may instead be set to the minimum value sufficient to accommodate all the audio data, the header and the additional information of a channel element in the calculated MP3 frame length, which may also mean that, for passages of the encoded audio signal that can be encoded with fewer coefficients, the bit rate index is reduced.
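Choosing the minimum sufficient bit rate index can be sketched as a simple search. The bit rate table for MPEG 1 layer 3 (indices 1 to 14) is taken from the MP3 standard rather than from this description, and the function name `smallest_bit_rate_index` is hypothetical; frame lengths are rounded down to whole bytes, which is likewise an assumption.

```python
# Bit rates in kbit/s for MPEG 1 layer 3, bit rate index 1 .. 14 (0xE).
MPEG1_LAYER3_KBPS = [32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320]


def smallest_bit_rate_index(required_bytes: int, sampling_rate: int,
                            padding: int = 0) -> int:
    """Smallest index whose frame length holds header, additional information and audio data."""
    for index, kbps in enumerate(MPEG1_LAYER3_KBPS, start=1):
        frame_bytes = 1152 * kbps * 1000 // (8 * sampling_rate) + padding
        if frame_bytes >= required_bytes:
            return index
    raise ValueError("channel element does not fit even at the highest bit rate index")


index = smallest_bit_rate_index(500, 44_100)
```

For a 500-byte channel element at 44.1 kHz, the frame at 128 kbit/s (417 bytes) is too short, so the search settles on index 10, i.e. 160 kbit/s with a 522-byte frame.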

After these modifications, steps 60 and 62 simply generate the file header (AudioSpecificConfig) and output it together with the MP3 channel elements as an MPEG-4 audio data stream. As already mentioned, this stream can be played back by the method of Fig.7, in which, however, steps 118 and 122 can be omitted, which simplifies the implementation on the decoder side. Here too, steps 42, 50, 56, 180, 182 and 60 can be performed in any order.

The preceding description referred, by way of example only, to MP3 audio data streams with a fixed data block length in bits. Of course, MP3 audio data streams with variable data block length, in which the bit rate index and thus also the data block length changes from frame to frame, can also be processed according to the previous embodiments.

The preceding description referred to MP3 audio data streams. For other audio data streams that are not pointer-based, an embodiment of the present invention provides for modifying the headers in the data blocks of, for example, an MPEG-1/2 audio layer 2 audio data stream, which contain, in addition to the headers, the associated additional information and the associated audio data and are thus already self-contained, in order to generate an MPEG-4 audio data stream. The modification provides each header with a length indication that indicates the amount of data of the respective data block or of the audio data in the respective data block, so that the MPEG-4 audio data stream can be decoded more easily, especially when it has been combined from several MPEG-1/2 audio layer 2 audio data streams into a multi-channel audio data stream, as in the above description with reference to figure 5. Preferably, the modification is made, as described above, by replacing the syncword or another redundant part in the headers of the MPEG-1/2 layer 2 data stream with the length indication. The reformatting or pointer resolution of the preceding embodiments, in which the audio data belonging to one time stamp is combined, is omitted for layer 2 data streams, because there is no back pointer. Decoding an MPEG-4 audio data stream combined from two MPEG-1/2 audio layer 2 audio data streams representing two channels of a multi-channel audio data stream can then easily be done by reading the length indications and accessing the individual channel elements of the access units on that basis. These can then be passed to conventional MPEG-1/2 layer 2 compatible decoders.

Furthermore, it is immaterial to the present invention where exactly the back pointer is located in the data blocks of the pointer-based audio data stream. It could also be located directly in the frame headers, so as to form a contiguous determination block together with them.

In particular, it should be noted that, depending on the circumstances, the file format conversion scheme according to the invention can also be implemented in software. The implementation can be on a digital storage medium, in particular a disk or CD (compact disc) with electronically readable control signals, which can cooperate with a programmable computer system such that the corresponding method is executed. In general, the invention thus also consists in a computer program product with program code stored on a machine-readable carrier for performing the method according to the invention when the computer program product runs on a computer. In other words, the invention can also be realized as a computer program with program code for performing the method when the computer program is executed on a computer.

1. A method of converting a first audio data stream (10), which represents an encoded audio signal and has a first file format, into a second audio data stream, which represents the encoded audio signal and has a second file format, wherein, according to the first file format, the first audio data stream is divided into consecutive data blocks (10A-10C), each of which is associated with respective main data obtained by encoding an associated one of successive time periods of the audio signal, each time period containing a number of audio signal values, wherein each data block contains a definition block (14, 16) and a main data part (18), the main data associated with consecutive data blocks are arranged successively in the main data parts of the consecutive data blocks, and each definition block contains a pointer pointing to the beginning of the associated main data (12A-12C), the end of which lies before the main data (12b, 12C) associated with the following data block, the method comprising the steps of:

combining (42), for each data block, the associated main data (44, 46) from the consecutive data blocks to generate, for each data block, a contiguous block (48);

appending (50), for each data block, the contiguous block (48) to the definition block (14, 16) of the data block to obtain self-contained channel elements (52a) of differing lengths;

arranging the channel elements in the order of the successive time periods to obtain the second audio data stream; and

modifying (56) each channel element (54A-C) so that it includes a length indication indicating the data length of the respective channel element (54A-C) or the length of the contiguous block of the respective channel element, wherein the modifying step includes replacing (56) a redundant part, which is identical for all definition blocks, with the length indication.
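
The conversion steps of claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation: the `Frame` class, the byte-oriented back pointer, the two-byte redundant prefix and the two-byte big-endian length field are all assumptions chosen for readability, and the main data of the last frame is assumed to run to the end of the stream.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Frame:
    """Toy model of one data block of the first (pointer-based) format."""
    header: bytes      # definition block; its first 2 bytes are assumed identical in all frames
    back_pointer: int  # bytes to step back from the start of this frame's main-data part
    main_part: bytes   # main-data part physically stored in this frame


def to_channel_elements(frames: List[Frame]) -> List[bytes]:
    """Build self-contained, length-prefixed channel elements from frames."""
    # All main-data parts, concatenated in stream order.
    reservoir = b"".join(f.main_part for f in frames)

    # Resolve each back pointer to an absolute start offset in the reservoir.
    starts, pos = [], 0
    for f in frames:
        starts.append(pos - f.back_pointer)
        pos += len(f.main_part)
    # The main data of one frame ends where the next frame's main data begins.
    ends = starts[1:] + [len(reservoir)]

    elements = []
    for f, start, end in zip(frames, starts, ends):
        contiguous = reservoir[start:end]          # combining step (42)
        element = f.header + contiguous            # appending step (50)
        length = len(element).to_bytes(2, "big")   # length of the whole element
        elements.append(length + element[2:])      # replacing step (56)
    return elements


frames = [
    Frame(b"\xff\xfbA", 0, b"aaabb"),
    Frame(b"\xff\xfbB", 2, b"bcc"),
    Frame(b"\xff\xfbC", 2, b"c"),
]
elements = to_channel_elements(frames)
```

Because the hypothetical length field has the same size as the redundant part it replaces, each element keeps its length, and a parser can walk the resulting stream element by element without following any back pointer.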

2. The method according to claim 1, further comprising a step of adding (60, 62) a file/data-stream header to the second audio data stream, the file/data-stream header containing the redundant part that is identical for all definition blocks.

3. The method according to claim 1 or 2, wherein the combining step includes the sub-steps of reading the pointer in the definition block of a predetermined data block;

reading a first part of the main data associated with the predetermined data block from the main data part (18) of a first one of the consecutive data blocks, which precedes the predetermined data block and contains the location referenced by the pointer in the definition block of the predetermined data block;

reading a second part of the main data associated with the predetermined data block from the main data part of a second one of the consecutive data blocks, which follows the first data block and contains the end of said main data; and

combining the first and second parts of the main data to obtain the contiguous block for the predetermined data block.
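
The two read operations of claim 3 can be sketched as follows. This is an illustrative Python fragment under assumed conventions: the pointer counts bytes backwards from the end of the preceding main-data part, and the length of the main data (`length`) is taken as known, for instance from the definition block. Neither the function name nor its parameters come from the source.

```python
def read_split_main_data(first_part: bytes, second_part: bytes,
                         back_pointer: int, length: int) -> bytes:
    """Join main data that straddles two consecutive main-data parts."""
    if not 0 <= back_pointer <= len(first_part):
        raise ValueError("back pointer lies outside the preceding main-data part")
    # First part: the tail of the preceding part, starting where the pointer points.
    first = first_part[len(first_part) - back_pointer:][:length]
    # Second part: whatever is still missing, taken from the start of the next part.
    second = second_part[:length - len(first)]
    if len(first) + len(second) != length:
        raise ValueError("main data extends beyond the two parts given")
    return first + second
```

For example, with `first_part = b"aaabb"`, `second_part = b"bcc"`, a back pointer of 2 and a length of 3, the function joins the last two bytes of the first part with the first byte of the second part.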

4. A method of combining a first audio data stream representing a first encoded audio signal and a second audio data stream representing a second encoded audio signal into a multichannel audio data stream, the method comprising the steps of:

converting the first audio data stream into a first audio data substream by the method according to claim 1 or 2; and

converting the second audio data stream into a second audio data substream by the method according to claim 1 or 2,

wherein the arranging steps are performed such that the two audio data substreams together form the multichannel audio data stream and, in the multichannel audio data stream, the channel elements (70A) of the first audio data substream and the channel elements (72A) of the second audio data substream that contain contiguous blocks obtained by encoding time periods of equal time are arranged successively in a contiguous access block (78).

5. The method according to claim 4, further comprising a step of adding a file/data-stream header to the second audio data stream, wherein the file/data-stream header includes a format indication that indicates the order in which the channel elements (70A) of the first audio data substream and of the second audio data substream (70b) are arranged in the access blocks (78).
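
How channel elements of two substreams might be grouped into access blocks can be sketched in Python as below. The text-based header, the dictionary input and the function name `multiplex` are hypothetical; the sketch only assumes that every channel element already carries its own length indication, so that elements can simply be concatenated.

```python
from typing import Dict, List, Tuple


def multiplex(substreams: Dict[str, List[bytes]],
              order: List[str]) -> Tuple[bytes, List[bytes]]:
    """Interleave channel elements of equal time periods into access blocks."""
    counts = {len(substreams[name]) for name in order}
    if len(counts) != 1:
        raise ValueError("all substreams must cover the same number of time periods")
    # Hypothetical format indication: channel names in the order used per access block.
    header = ",".join(order).encode("ascii") + b"\n"
    access_blocks = [
        b"".join(substreams[name][t] for name in order)
        for t in range(counts.pop())
    ]
    return header, access_blocks


header, blocks = multiplex(
    {"L": [b"\x00\x03a", b"\x00\x03b"], "R": [b"\x00\x04xx", b"\x00\x04yy"]},
    ["L", "R"],
)
```

A reader of such a stream would first take the channel order from the header and then split each access block by following the length indications of the elements in that order.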

6. The method according to claim 1 or 2, wherein the data blocks are data blocks of equal predetermined size or of a size that varies depending on a sampling-rate indication and a bit-rate indication in the definition blocks of said data blocks.

7. A method of converting a first audio data stream, which represents an encoded audio signal and has a first file format, into a second audio data stream, which represents the encoded audio signal and has a second file format, wherein, according to the first file format, the first audio data stream is divided into successive data blocks, each of which is associated with respective main data obtained by encoding an associated one of successive time periods of the audio signal, each time period containing a number of audio signal values, and each data block contains a definition block and a main data part, the method comprising a step of modifying the data blocks so that they include a length indication indicating the length of the data block or the length of the main data of the data block, to obtain channel elements that form the second audio data stream from the data blocks, wherein the modifying step includes replacing a redundant part, which is identical for all definition blocks, with the length indication.

8. The method according to claim 1 or 2, further comprising the steps of:

resetting (180) the pointers in the definition blocks so that they indicate that the respective main data start immediately after the respective definition block; and

changing (182) the bit-rate indication in the definition blocks so that the data block length, which depends on the bit-rate indication according to the first audio file format, is sufficient to accommodate the respective definition block and the associated main data.
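
For the MP3 case mentioned in the description, the bit-rate change of claim 8 amounts to picking a bit rate whose fixed frame length can hold the definition block plus its main data. The Python sketch below uses the MPEG-1 Layer III frame-length relation (144 * bit rate / sampling rate, in bytes, without padding) and that layer's bit-rate table. The function names are illustrative, and choosing the smallest fitting bit rate is an assumption, since the claim only requires a sufficient one.

```python
# MPEG-1 Layer III bit rates in kbit/s (the "free format" setting is ignored here).
MPEG1_L3_BITRATES_KBPS = [32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320]


def frame_length_bytes(bitrate_kbps: int, sample_rate_hz: int) -> int:
    """Length of an MPEG-1 Layer III frame without the optional padding byte."""
    return 144 * bitrate_kbps * 1000 // sample_rate_hz


def sufficient_bitrate(needed_bytes: int, sample_rate_hz: int) -> int:
    """Smallest table bit rate whose frame holds `needed_bytes`."""
    for bitrate in MPEG1_L3_BITRATES_KBPS:
        if frame_length_bytes(bitrate, sample_rate_hz) >= needed_bytes:
            return bitrate
    raise ValueError("no bit rate in the table gives a long enough frame")
```

At a sampling rate of 44100 Hz, a definition block and main data totalling 500 bytes do not fit into a 128 kbit/s frame, so the sketch selects the next table entry that does.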

9. A method of decoding a second audio data stream, which represents an encoded audio signal and has a second file format, by means of a decoder capable of decoding a first audio data stream, which represents the encoded audio signal and has a first file format, to obtain the audio signal, wherein, according to the first file format, the first audio data stream is divided into successive data blocks (10A-10C), each of which is associated with respective main data obtained by encoding an associated one of successive time periods of the audio signal, each time period containing a number of audio signal values, each data block contains a definition block (14, 16) and a main data part (18), the main data associated with consecutive data blocks are arranged successively in the main data parts of the consecutive data blocks, and each definition block contains a pointer pointing to the beginning of the associated main data (12A-12C), the end of which lies before the main data (12A-12C) associated with the following data block, and wherein, according to the second file format, the second audio data stream is divided into successive channel elements, each channel element containing a contiguous block (44, 46), obtained by combining the main data associated with the corresponding one of the consecutive data blocks, and the associated definition block in a form in which a redundant part, which is identical for all definition blocks, has been replaced with a length indication indicating the length of the respective channel element or the length of the respective contiguous block, the method comprising the steps of:

forming an input data stream, which represents the encoded audio signal and has the first file format, from the second audio data stream by:

parsing the second audio data stream using the length indications,

resetting the pointers in the definition blocks of the channel elements of the second audio data stream so that they indicate that the main data start immediately after the respective definition block, to obtain reset definition blocks,

changing the bit-rate indication in the definition blocks of the channel elements of the second audio data stream so that the data block length, which depends on the bit-rate indication according to the second audio file format, is sufficient to accommodate the respective definition block and the associated main data, to obtain definition blocks with a changed bit-rate indication and reset pointers, and

inserting bits between each channel element and the following channel element so that the length of each channel element plus the inserted bits matches the increased bit-rate indication; and

feeding the input data stream to the decoder in accordance with the changed bit-rate indication to obtain the audio signal.
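
The stream regeneration of claim 9 can be sketched as below. The layout is a toy assumption, not the patented format: each channel element starts with a two-byte big-endian length indication that took the place of a two-byte redundant part, stuffing is done in whole bytes rather than bits, and the target frame length is passed in directly. The pointer and bit-rate fields are not modeled, so the sketch covers only parsing, restoring the redundant part and inserting stuffing.

```python
from typing import List


def parse_channel_elements(stream: bytes) -> List[bytes]:
    """Split the second-format stream using the length indications."""
    elements, pos = [], 0
    while pos < len(stream):
        length = int.from_bytes(stream[pos:pos + 2], "big")
        if length < 2 or pos + length > len(stream):
            raise ValueError("corrupt length indication")
        elements.append(stream[pos:pos + length])
        pos += length
    return elements


def rebuild_frames(stream: bytes, redundant: bytes, frame_length: int) -> List[bytes]:
    """Turn channel elements back into fixed-length frames for a legacy decoder."""
    frames = []
    for element in parse_channel_elements(stream):
        # Put the redundant part back where the length indication was stored.
        frame = redundant + element[2:]
        if len(frame) > frame_length:
            raise ValueError("frame length too small; a higher bit rate is needed")
        # Stuffing bytes bring every frame up to the signalled frame length.
        frames.append(frame + b"\x00" * (frame_length - len(frame)))
    return frames


frames = rebuild_frames(b"\x00\x06Aaaa" + b"\x00\x05Bbb", b"\xff\xfb", 8)
```

Since every rebuilt frame carries its main data directly behind its own definition block, a decoder for the first format would not need to reach back into earlier frames.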

10. A device for converting a first audio data stream (10), which represents an encoded audio signal and has a first file format, into a second audio data stream, which represents the encoded audio signal and has a second file format, wherein, according to the first file format, the first audio data stream is divided into consecutive data blocks (10A-10C), each of which is associated with respective main data obtained by encoding an associated one of successive time periods of the audio signal, each time period containing a number of audio signal values, wherein each data block contains a definition block (14, 16) and a main data part (18), the main data associated with consecutive data blocks are arranged successively in the main data parts of the consecutive data blocks, and each definition block contains a pointer pointing to the beginning of the associated main data (12A-12C), the end of which lies before the main data (12b, 12C) associated with the following data block, the device comprising:

means for combining (42), for each data block, the associated main data (44, 46) from the consecutive data blocks to generate, for each data block, a contiguous block (48);

means for appending (50), for each data block, the contiguous block (48) to the definition block (14, 16) of the data block to obtain self-contained channel elements (52a) of differing lengths;

means for arranging the channel elements in the order of the successive time periods to obtain the second audio data stream; and

means for modifying (56) each channel element (54A-C) so that it includes a length indication indicating the data length of the respective channel element (54A-C) or the length of the contiguous block of the respective channel element, wherein the means for modifying replaces (56) a redundant part, which is identical for all definition blocks, with the length indication.

11. A device for converting a first audio data stream, which represents an encoded audio signal and has a first file format, into a second audio data stream, which represents the encoded audio signal and has a second file format, wherein, according to the first file format, the first audio data stream is divided into successive data blocks, each of which is associated with respective main data obtained by encoding an associated one of successive time periods of the audio signal, each time period containing a number of audio signal values, and each data block contains a definition block and a main data part, the device comprising means for modifying the data blocks so that they include a length indication indicating the length of the data block or the length of the main data of the data block, to obtain channel elements that form the second audio data stream from the data blocks, wherein the means for modifying replaces a redundant part, which is identical for all definition blocks, with the length indication.

12. A device for decoding a second audio data stream, which represents an encoded audio signal and has a second file format, on the basis of a decoder capable of decoding a first audio data stream, which represents the encoded audio signal and has a first file format, to obtain the audio signal, wherein, according to the first file format, the first audio data stream is divided into successive data blocks (10A-10C), each of which is associated with respective main data obtained by encoding an associated one of successive time periods of the audio signal, each time period containing a number of audio signal values, each data block contains a definition block (14, 16) and a main data part (18), the main data associated with consecutive data blocks are arranged successively in the main data parts of the consecutive data blocks, and each definition block contains a pointer pointing to the beginning of the associated main data (12A-12C), the end of which lies before the main data (12A-12C) associated with the following data block, and wherein, according to the second file format, the second audio data stream is divided into successive channel elements, each channel element containing a contiguous block (44, 46), obtained by combining the main data associated with the corresponding one of the consecutive data blocks, and the associated definition block in a form in which a redundant part, which is identical for all definition blocks, has been replaced with a length indication indicating the length of the respective channel element or the length of the respective contiguous block, the device comprising:

means for forming an input data stream, which represents the encoded audio signal and has the first file format, from the second audio data stream by:

parsing the second audio data stream using the length indications,

resetting the pointers in the definition blocks of the channel elements of the second audio data stream so that they indicate that the main data start immediately after the respective definition block, to obtain reset definition blocks,

changing the bit-rate indication in the definition blocks of the channel elements of the second audio data stream so that the data block length, which depends on the bit-rate indication according to the second audio file format, is sufficient to accommodate the respective definition block and the associated main data, to obtain definition blocks with a changed bit-rate indication and reset pointers, and

inserting bits between each channel element and the following channel element so that the length of each channel element plus the inserted bits matches the increased bit-rate indication; and

means for feeding the input data stream to the decoder in accordance with the changed bit-rate indication to obtain the audio signal.

13. A machine-readable medium storing program code in the form of readable control signals and designed to cooperate with a programmable computer system so as to convert a first audio data stream having a first file format into a second audio data stream having a second file format by the method according to claim 1 or 7, or to decode a second audio data stream having the second file format by the method according to claim 9.



 

