Method for restoration of distorted compressed files. RU patent 2510957.

IPC classes for Russian patent "Method for restoration of distorted compressed files" (RU 2510957):

H03M7/40 - Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code

Other patents in the same IPC classes:

Encoding variable-length codes with efficient memory usage / 2426227
Invention is aimed at methods for adaptive variable-length coding (VLC) with efficient memory usage and low degree of complexity with respect to data for different fields of application, such as digital video data, image data, audio and voice data. In certain aspects, these methods can employ properties of defined sets of code words to support very compact data structures. In other aspects, the methods can support adaptive coding and decoding with low degree of complexity with respect to binary sequences generated by memoryless sources.

Method to process moving image, data medium, where software is recorded to process moving image, and device to process moving image / 2423017
Syntax elements with high frequency of occurrence are processed, using variable conditions of probability contained in the second memory, access delay of which is small, and other syntax elements are processed, using variable conditions of probability, contained in the first memory, access delay of which is large.

Apparatus and method of estimating code volume, as well as data medium for implementing said method / 2420911
Disclosed is a method of determining code volume when coding quantised values of orthogonal transformation coefficients which are greater than the size of orthogonal transformation, assigned to a variable-length code chart. Quantised values are transformed to a one-dimensional type to obtain Serial-Value sets. Based on the proportion between the area of orthogonal transformation which corresponds to the size of orthogonal transformation assigned to the variable-length code chart, and the area of orthogonal transformation, the number of groups is calculated for the coding target object. The length of each Serial-Value set in each group is determined by accessing the variable-length code chart. The volume of the formed code is considered equal to the total length of the codes from all groups.

Efficient coding and decoding conversion units / 2417518
Codec encodes conversion coefficients through composite coding of nonzero coefficients with subsequent series of coefficients with zero values (dwg. 14). When nonzero coefficients are last in their unit, the last indicator is replaced for the value of the series in the symbol of that coefficient (1435). Initial nonzero coefficients are indicated in a special symbol which jointly codes the nonzero coefficient together with initial and subsequent series of zeroes (1440). The codec enables several coding contexts by detecting interruptions in the series of nonzero coefficients and coding nonzero coefficients on any side of that interruption separately (1460). The codec also reduces the size of the code table by indicating in each symbol whether a nonzero coefficient has an absolute value greater than 1, and whether the series of zeroes have positive values (1475), and separately codes the level of coefficients and the length of the series outside the symbols (1490).

Memory efficient adaptive block coding / 2413360
In the adapted variable length coding method, variable length coding is performed according to a code structure. The code structure defines groups of codewords in a coding tree, where each of the groups includes codewords representing values having the same weight coefficients, and the codewords in each of the groups are ordered lexicographically with respect to the values represented by the codewords. The code structure also defines a first and a second subgroup of codewords in each of the groups, where the first subgroup includes codewords having a first length, and the second subgroup includes codewords having a second length which is different from the first. A variable length coding result is generated for at least one of storage in memory, transmission to a device, or presentation to a user.

Method of creating and checking electronic image certified by digital watermark / 2399953
Binary sequence of a secret identification key and a binary sequence of a secret embedding key, a cryptographic function and several Fourier coefficients of the electronic image are pre-generated for the sender and the receiver. An electronic image certified by a digital watermark is created for the sender, for which the electronic image is divided into M units with pixel size n×n. An identifier for the m-th unit of the electronic image is created. The binary sequence of the digital watermark of the m-th unit of the electronic image is determined. The digital watermark is embedded into the m-th unit of the electronic image and operations for certifying units of the electronic image for the sender with the digital watermark are repeated until completion. The receiver is sent the electronic image certified with the digital watermark. Authenticity of the electronic image received by the receiver is checked.

Method for data compression / 2386210
Method of data compression is realised with the help of coder. The first unit of coder memory stores preliminarily recorded code combinations (CC1) with number of digits n, where n = 2, 3, 4…, representing a complete set of possible inlet code combinations (CC). The second unit of coder memory stores preliminarily recorded code combinations CC2, which definitely comply with CC1 with number of digits that is less or equal to the number of CC1. Input flow of data is separated into CC with identical number of digits n. CC is serially entered into coder, identified, by means of comparison to CC1, presented by according output code combination CC2. CC2 present a sequence of groups with identical number of digits n in each one. Combined number of code combinations CC2 is mn, where m = 2, 3, 4…, n = 1, 2, 3…. Number of subsequent groups CC2 is identified as mn-1, mn-2…. Number of digits of CC2 in group is leveled by addition of a nonsignificant zero prior to code combination.

Adaptive grouping of parameters for improved efficiency of coding / 2368074
The invention relates to lossless coding of parameters, and in particular to the generation and use of a coding rule for efficient compression of parameters. The invention is based on the finding that parameters comprising a first set of parameters representing a first part of an initial signal and a second set of parameters representing a second part of the initial signal can be coded efficiently when the parameters are arranged in a first sequence of tuples and a second sequence of tuples, where the first sequence comprises tuples having two parameters from a single part of the initial signal, and the second sequence comprises tuples having one parameter from the first part and one parameter from the second part of the initial signal. Efficient coding may be achieved by using a bit estimation unit to assess the number of bits required to code the first and the second sequence of tuples, and coding only the sequence of tuples that results in the smaller number of bits.

Inserting additional data in coded signal / 2251819
Proposed procedure for inserting additional data such as watermarks in audio signals and extracting them from compressed audio signal involves separation of audio signal during insertion of additional data into frames, generation of set of prediction filtration factor for each frame, prediction coding of each frame using mentioned coefficients, and for inserting additional data at least one of chosen filtration factors is adjusted to equal value of these additional data. In this way any impact on bit transfer speed is made absolutely impossible.

Method for restoration of distorted compressed files / 2510957
The invention relates to means of losslessly compressing and restoring transmitted digital data generated according to the Deflate format in information systems and telecommunication systems. By introducing an error search procedure for the current code segment and correcting distortions in the decoded data using context modelling of the information, it becomes possible to restore data from a damaged region of an archive, thereby reducing the loss of information when decompressing distorted compressed files.

FIELD: physics, communications.

SUBSTANCE: invention relates to means of losslessly compressing and restoring transmitted digital data generated according to the Deflate format in information systems and telecommunication systems. By introducing an error search procedure for the current code segment and correcting distortions in the decoded data using context modelling of the information, it becomes possible to restore data from a damaged region of an archive, thereby reducing the loss of information when decompressing distorted compressed files.

EFFECT: reduced loss of information when uncompressing distorted compressed files.

2 dwg

The invention relates to the field of telecommunications, in particular to reducing the redundancy of transmitted information, and can be used to restore distorted losslessly compressed digital data generated according to the Deflate format in information systems and telecommunication systems.

The Deflate archive format, developed by Phil Katz, is widely used in practice, for example in the HTTP protocol, PNG, ZIP and GZIP. It combines the LZ77 dictionary compression method (J. Ziv, A. Lempel, "A Universal Algorithm for Sequential Data Compression", IEEE Transactions on Information Theory, Vol. 23, No. 3, pp. 337-343) with Huffman coding (D.A. Huffman, "A Method for the Construction of Minimum Redundancy Codes", Proceedings of the Institute of Radio Engineers, September 1952, Vol. 40, No. 9, pp. 1098-1101).

A data compression method is known (see US patent # 5051745, publ. 24.09.1991) in which the string being encoded is replaced by references to a character sequence located in a fixed-length sliding window that holds the preceding text of the message, after which the references are encoded with a Huffman or Shannon-Fano code.

The main disadvantage of this method is that information cannot be recovered from corrupted data segments during decompression.
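The following minimal sketch illustrates why a plain sliding-window decoder cannot cope with damaged data. It is a toy model, not the coder of the cited patent: the token layout (single-character literals and `(distance, length)` pairs) and the function name `lz77_decode` are assumptions made for illustration only.

```python
from typing import List, Tuple, Union

# Assumed toy token format: a literal character, or a (distance, length)
# back-reference into the text that has already been decoded.
Token = Union[str, Tuple[int, int]]


def lz77_decode(tokens: List[Token]) -> str:
    """Expand literals and (distance, length) references into text."""
    out: List[str] = []
    for tok in tokens:
        if isinstance(tok, str):
            out.append(tok)
            continue
        distance, length = tok
        if distance < 1 or distance > len(out):
            raise ValueError(f"reference outside the window: {tok}")
        start = len(out) - distance
        for k in range(length):
            # The copy may overlap the text it is producing.
            out.append(out[start + k])
    return "".join(out)


clean = lz77_decode(["a", "b", "c", (3, 6), "d"])
# Same stream with the distance field damaged (3 became 2).
damaged = lz77_decode(["a", "b", "c", (2, 6), "d"])
print(clean)    # abcabcabcd
print(damaged)  # abcbcbcbcd
```

A single wrong distance value silently changes six output characters, and the decoder itself has no way of noticing that anything went wrong.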

The Deflate archive format specification is known (see Deutsch, P., "Deflate Compressed Data Format Specification version 1.3", Aladdin Enterprises, Network Working Group, May 1996, 16 pages), which describes how data are compressed and decompressed.

The main disadvantage of this specification is that it provides no way to recover information from corrupted data segments during decompression.
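This behaviour can be observed with any standard Deflate decoder. The sketch below uses Python's `zlib` module, which wraps a Deflate stream in a small header and an Adler-32 checksum; it flips each bit of a compressed stream in turn and counts how often the original data can no longer be obtained. The helper names and the sample text are illustrative assumptions, and the exact counts depend on the zlib build, so none are quoted here.

```python
import zlib
from typing import Optional, Tuple


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with one bit inverted."""
    buf = bytearray(data)
    buf[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(buf)


def inflate_or_none(data: bytes) -> Optional[bytes]:
    """Decompress a zlib stream, or return None if the decoder rejects it."""
    try:
        return zlib.decompress(data)
    except zlib.error:
        return None


def count_lost(original: bytes) -> Tuple[int, int]:
    """Try every single-bit error and count those that prevent exact recovery."""
    packed = zlib.compress(original)
    total = len(packed) * 8
    lost = sum(
        1
        for i in range(total)
        if inflate_or_none(flip_bit(packed, i)) != original
    )
    return lost, total


text = b"the quick brown fox jumps over the lazy dog. " * 20
lost, total = count_lost(text)
print(f"{lost} of {total} single-bit errors prevent exact recovery")
```

In this setup a standard decoder either rejects the damaged stream outright or returns nothing usable, which is the gap the claimed method sets out to narrow.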

A device for decompressing Deflate archives, the "Deflate decompressor", is known; it decodes compressed data according to the Deflate archive format specification (see U.S. Patent №8125357 B1, publ. 22.02.2012).

The main disadvantage of this device is the inability to extract information from corrupted data segments.

A method for recovering data from corrupted archives is also known (see US patent # 76033390 B2, publ. 13.10.2009), in which, for archives that store multiple compressed files (e.g. a ZIP archive), the files contained in the undamaged areas of the archive are restored.

The main disadvantage of this method is the inability to retrieve information from the damaged segments of the archive.

The closest in technical essence to the claimed invention (the prototype) is an information decompression method (see U.S. Patent №7538696 B2, publ. 26.05.2009), in which compressed files are read, LZ77 code segments are extracted from the input bit stream by comparing them with predetermined code values, a lookup table index is calculated from the value of the LZ77 code segment, and the LZ77 code segment is decoded using the lookup table.

The main disadvantage of this method is the lack of procedures for recovering data from damaged segments of the archive, resulting in complete or partial loss of information when decompressing the archive.

The objective of the invention is to provide a method for restoring distorted compressed files that reduces the loss of information when such files are decompressed.

This objective is achieved by a method for restoring distorted compressed files in which compressed files are read, LZ77 code segments are extracted from the input bit stream by comparing them with predetermined code values, a lookup table index is calculated from the value of the LZ77 code segment, and the LZ77 code segment is decoded using the lookup table, and which, according to the invention, is supplemented by the following sequence of operations:

- after the LZ77 code segments are extracted from the input bit stream, search for errors in the current LZ77 code segment;

- adjust the subsequent LZ77 code segments;

- after decoding the LZ77 code segment, form a context model of the decoded data;

- locate the distortion by comparing the context model of the decoded data with a predetermined general context data model;

- correct the distortion of the decoded data.

The listed set of essential features solves the stated problem because the method makes it possible to recover data from distorted compressed files, minimizing the loss of information during decompression through procedures that correct errors in LZ77 code segments and through context models of the decoded information.

An analysis of the prior art established that there are no analogues characterized by a set of features identical to all features of the claimed technical solution, which indicates that the claimed method meets the "novelty" condition of patentability.

A search of known solutions in this and related fields of technology, carried out to identify features that coincide with the features distinguishing the claimed object from the prototype, showed that these features do not follow explicitly from the prior art. Nor does the prior art disclose that the distinguishing essential features are known to produce the same technical result as is achieved in the present method. Therefore, the claimed invention meets the "inventive step" condition of patentability.

The "industrial applicability" of the invention follows from the availability of a component base on which a device implementing this method can be built.

The claimed method is illustrated by drawings, showing:

figure 1 - generalized block diagram of the algorithm of the method for restoring distorted compressed files;

figure 2 - comparison of the simulation results for the prototype method and the proposed method.

The claimed method is implemented as follows (Figure 1). Before the compressed files are read, the input data for forming the general context data model (OCMD) are supplied, and the OCMD is formed on the basis of a priori information or assumptions about the type of data that the archives may contain, for example texts in various natural languages, so that the decompression of the compressed data can subsequently be validated (blocks 1 and 2). An input block of compressed data is written to the input data buffer (Bvnd) for subsequent decompression (blocks 3, 4 and 5). Decompression proceeds as in the prototype (blocks 6, 8, 9, 10, 12 and 13), except that if the bits of an LZ77 code segment do not match any of the predetermined values, the read pointer into the bit sequence in Bvnd is shifted one bit to the right relative to the current position in order to correct the subsequent LZ77 code segments (blocks 7 and 15). The decoded (decompressed) information is written to the buffer of restored decoded data (BUDD) so that the presence and location of distortions in the decoded data can be determined and the distortions corrected (block 11). From the information contained in BUDD (block 16), a context model of the decoded data (CMD) is formed (block 17). The presence and location of a distortion are determined according to formulas (1) and (2) (block 18):

$$P = \frac{n_g}{n_b}, \qquad (1)$$

$$I = \arg\left(\max\left(\sum_{i=0}^{N} P(i)\right)\right), \qquad (2)$$

where P is a value characterizing the degree of similarity between the contexts of the general context data model (OCMD) and the context model of the decoded data (CMD), $n_g$ is the number of CMD contexts that match OCMD contexts, $n_b$ is the number of CMD contexts that do not match OCMD contexts, i is the position of the read pointer in the symbol sequence in BUDD, N is the number of characters in BUDD, arg is the argument function (arg(f(x)) = x), and max is the maximum value of the function. If I = N-1, no distortion has been found in the character sequence in BUDD; the contents of BUDD are copied to the output data buffer (Bvyha) and the information is output from Bvyha (block 25). Otherwise a distortion has been found at position I in BUDD (blocks 19, 20 and 24). The distortion is corrected by performing the following sequence of operations (block 21):

- determine the index of the BUDD character in the LZ77 dictionary (each character written to BUDD from the LZ77 dictionary is assigned a specific index, calculated from the current location of that character in the dictionary);

- calculate the Euclidean distance D by formula (3) between each OCMD context and the current CMD context;

- select the OCMD context that corresponds to the minimum Euclidean distance $D_{min}$;

- replace all characters in BUDD that correspond to the index of the current character in BUDD with the characters of the OCMD context that has distance $D_{min}$.

$$D = \sum_{i=0}^{n} \left(s_i^{(1)} - s_i^{(2)}\right)^2, \qquad (3)$$

where D is the Euclidean distance between the characters of an OCMD context and a CMD context, n is the number of characters contained in the context (the context order), $s^{(1)}$ is a character of the OCMD context, and $s^{(2)}$ is a character of the CMD context.
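A rough sketch of how formulas (1) and (3) might be evaluated is given below. It rests on several assumptions not stated in the description: contexts are taken to be overlapping character n-grams, the general model is a plain set of n-grams built from a sample text, characters are compared through their code points, and ties are broken alphabetically. All function names are hypothetical.

```python
from typing import List, Set


def contexts(text: str, order: int) -> List[str]:
    """All overlapping character n-grams of the given order."""
    return [text[i:i + order] for i in range(len(text) - order + 1)]


def similarity(decoded: str, model: Set[str], order: int) -> float:
    """Formula (1): matching contexts divided by non-matching contexts."""
    grams = contexts(decoded, order)
    n_g = sum(1 for g in grams if g in model)
    n_b = len(grams) - n_g
    # Assumption: with no mismatches the ratio is treated as infinite.
    return float("inf") if n_b == 0 else n_g / n_b


def distance(a: str, b: str) -> int:
    """Formula (3) on character code points (an assumed symbol encoding)."""
    return sum((ord(x) - ord(y)) ** 2 for x, y in zip(a, b))


def nearest_context(ctx: str, model: Set[str]) -> str:
    """Model context with the minimal D; ties are broken alphabetically."""
    return min(sorted(model), key=lambda m: distance(m, ctx))


model = set(contexts("the cat sat on the mat", 3))
print(similarity("the cat", model, 3))  # inf
print(similarity("the cbt", model, 3))  # 1.5
print(nearest_context("cbt", model))    # cat
```

In this toy run the damaged context "cbt" is pulled back to "cat", the model context at the smallest distance. A real general model would be built from a much larger corpus of the expected data type.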

If the distortion in the character sequence in BUDD could not be corrected (block 22), the bit sequence in the input data buffer is shifted one bit to the left at the position of the distortion (block 23) and decompression is carried out again; otherwise the contents of BUDD are copied to the output data buffer (Bvyha) (block 24) and the information is output from Bvyha (block 25).
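The idea of moving the read position by one bit when a code segment matches none of the predetermined values can be sketched as follows. This is only an illustration under assumptions: the code table is a made-up toy prefix code rather than a Deflate table, bits are held in a string for readability, and the pointer is simply advanced by one bit, whereas the description distinguishes right and left shifts in different blocks.

```python
from typing import Dict, List, Tuple


def decode_with_resync(
    bits: str, table: Dict[str, str]
) -> Tuple[List[str], List[int]]:
    """Decode a bit string with a code table, skipping one bit on mismatch.

    Returns the decoded symbols and the bit positions that were skipped.
    """
    max_len = max(len(code) for code in table)
    symbols: List[str] = []
    skipped: List[int] = []
    pos = 0
    while pos < len(bits):
        for length in range(1, max_len + 1):
            segment = bits[pos:pos + length]
            if len(segment) == length and segment in table:
                symbols.append(table[segment])
                pos += length
                break
        else:
            # No predetermined code matches at this position: move the
            # read pointer one bit forward and try again from there.
            skipped.append(pos)
            pos += 1
    return symbols, skipped


TABLE = {"00": "a", "01": "b", "100": "c"}  # hypothetical toy code
print(decode_with_resync("0001100", TABLE))  # (['a', 'b', 'c'], [])
print(decode_with_resync("001101", TABLE))   # (['a', 'b'], [2, 3])
```

In the second call two stray bits are skipped and decoding resumes on the next valid code. In the claimed method a result of this kind would still be checked against the context model before being accepted.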

To compare the proposed method with the prototype method, an experiment was conducted by running on a computer the program "ArcRecovery", built according to the given algorithm in the Visual Studio programming environment, and in the MATLAB simulation environment. The results of the experiment are presented as the dependence of the percentage of decoded information on the number of bit errors in the compressed data (Figure 2). They show that, under the same conditions (the same distorted compressed input files), the proposed method reduces information loss by 10-15% compared with the prototype method, depending on the chosen order of the OCMD and CMD contexts and on the type of distortion.

A method for restoring distorted compressed files, in which compressed files are read, LZ77 code segments are extracted from the input bit stream by comparing them with predetermined code values, a lookup table index is calculated from the value of the LZ77 code segment, and the LZ77 code segment is decoded using the lookup table, characterized in that, after the LZ77 code segments are extracted from the input bit stream, the current LZ77 code segment is searched for errors, the subsequent LZ77 code segments are adjusted, after the LZ77 code segment is decoded a context model of the decoded data is formed, the location of the distortion is determined by comparing the context model of the decoded data with a predetermined general context data model, and the distortion of the decoded data is corrected.