# Dynamic encoding filters

FIELD: motion compensation in video coding, namely a method for encoding the coefficients of interpolation filters used to reconstruct image pixel values in motion-compensated video encoders and video decoders.

SUBSTANCE: in a video coding system for encoding a video sequence containing a series of video frames, each of which has a matrix of pixel values, an interpolation filter is defined for reconstructing pixel values during decoding. The system encodes the interpolation filter coefficients differentially relative to a given base filter to produce a set of difference values. Because the coefficients of the base filter are known to both the encoder and the decoder and can be made statistically acceptably close to the filters actually used in the video sequence, the decoder can reconstruct the pixel values on the basis of the set of difference values.

EFFECT: efficient encoding of the coefficient values of adaptive interpolation filters while maintaining resilience to errors in the encoded bit stream.

5 cl, 17 dwg

The technical field to which the invention relates.

The present invention relates to motion compensation in video coding. More specifically, the invention relates to a method of encoding the coefficients of the interpolation filters used to reconstruct image pixel values in motion-compensated video encoders and video decoders. The invention also relates to a corresponding video encoder, video decoder and video coding system that carry out the method according to the invention.

Background art

Various video coding standards now exist. They include ITU-T (International Telecommunication Union, Telecommunication Standardization Sector) Recommendations H.261 and H.263, and the MPEG-1, MPEG-2 and MPEG-4 standards of the Moving Picture Experts Group of the International Organization for Standardization (ISO). These video coding standards are based on motion-compensated prediction and prediction error coding. Motion-compensated prediction is performed by analyzing and coding the motion between successive frames of a video sequence and reconstructing image blocks using the motion information. The reconstruction of image blocks is built using motion interpolation filters, which are capable of generating image (pixel) values for the required pixel and sub-pixel positions. The basic principles of motion-compensated prediction and of image reconstruction using interpolation filters are described in more detail in the following paragraphs.

Digital video sequences, like ordinary motion pictures on film, contain a sequence of still images, often called "frames". The illusion of motion is created by displaying these frames one after another at a relatively fast rate, typically 15 to 30 frames per second. Because of the relatively fast frame rate, the image content of successive frames tends to be quite similar, and thus successive frames contain a significant amount of redundant information.

Each frame of a digital video sequence contains a matrix of image pixels. In the commonly used digital video format known as Quarter Common Intermediate Format (QCIF), a frame contains a matrix of 176×144 pixels, and thus each frame has 25344 pixels. Each pixel of the frame is represented by a number of bits that carry information about the luminance and/or color content (chrominance) of the image area corresponding to the pixel. The so-called YUV format is commonly used to represent the luminance and chrominance content of the image. The luminance, or Y, component represents the intensity (brightness) of the image, while the color content of the image is represented by two chrominance components, labelled U and V.

A color model based on a luminance/chrominance representation of image content provides certain advantages compared with color models based on a representation involving primary colors (i.e. red, green and blue, RGB). Since the human visual system is more sensitive to intensity changes than to color changes, the YUV color model exploits this property by using a lower spatial resolution for the chrominance components (U, V) than for the luminance component (Y). In this way, the amount of information needed to encode the color information in an image can be reduced with little degradation of image quality.

The lower spatial resolution of the chrominance components is usually achieved by spatial subsampling. Typically, a block of 16×16 image pixels is encoded as one block of 16×16 values representing the luminance information and two chrominance components, each represented by a block of 8×8 values covering an image area equivalent to the 16×16 matrix of luminance values. The chrominance components are thus spatially subsampled by a factor of 2 in the horizontal and vertical directions. The resulting set of one 16×16 luminance block and two 8×8 chrominance blocks is usually called a YUV macroblock, or macroblock for brevity.

A QCIF image contains 11×9 macroblocks. If the luminance blocks and chrominance blocks are represented with 8-bit resolution (that is, by numbers in the range from 0 to 255), then the total number of bits required per macroblock is (16×16×8) + 2×(8×8×8) = 3072 bits. The number of bits required to represent a video frame in QCIF format with 8-bit resolution per component is therefore 99×3072 = 304128 bits. Hence the amount of data required to transmit, record or display a video sequence containing a series of QCIF-format frames at 30 frames per second is more than 9 Mbit/s (million bits per second). Such a data rate is impractical for use in recording, transmission and display applications because of the very large storage capacity, transmission channel bandwidth and hardware performance required. For this reason video coding standards, such as those mentioned above, have been developed to reduce the amount of information required to represent and transmit video data while maintaining acceptable image quality.
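The bit-count arithmetic above can be recomputed directly (a Python sketch; all figures come from the text):

```python
# Bit-count arithmetic for the QCIF example above.
luma_bits = 16 * 16 * 8          # one 16x16 luminance block, 8 bits per sample
chroma_bits = 2 * (8 * 8 * 8)    # two 8x8 chrominance blocks
macroblock_bits = luma_bits + chroma_bits
print(macroblock_bits)           # -> 3072 bits per macroblock

macroblocks_per_frame = 11 * 9   # QCIF: 176x144 pixels = 11x9 macroblocks
frame_bits = macroblocks_per_frame * macroblock_bits
print(frame_bits)                # -> 304128 bits per frame

bitrate = frame_bits * 30        # at 30 frames per second
print(bitrate / 1e6)             # -> 9.12384, i.e. more than 9 Mbit/s
```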

Each of the previously mentioned video coding standards is intended for use in video recording or transmission systems with different characteristics. For example, the ISO MPEG-1 standard was developed specifically for use in situations where the available data bandwidth is up to about 1.5 Mbit/s. The MPEG-2 video coding standard is primarily applicable to digital storage media and to video broadcast and communication with available data bandwidth up to about 10 Mbit/s. ITU-T Recommendation H.263 is directed towards use in systems where the available bandwidth is in general much lower. It is particularly suitable for use in situations where video data must be transmitted in real time over a network or a dedicated line, such as an Integrated Services Digital Network (ISDN) or a conventional Public Switched Telephone Network (PSTN), where the available bandwidth for data transmission is typically of the order of 64 kbit/s. In mobile video telephony, where transmission takes place at least partially over a radio link, the available bandwidth may be as low as 20 kbit/s.

Although the various existing video coding standards are designed for use in different situations, the mechanisms they use to reduce the amount of information to be transmitted have many common features. In particular, they all operate so as to reduce the amount of redundant and perceptually unimportant information in the transmitted sequence. Essentially, there are three types of redundancy in video sequences: spatial, temporal and spectral redundancy. Spatial redundancy is the term used to describe the correlation between neighboring pixels within a single frame of the sequence. Temporal redundancy expresses the fact that objects appearing in one frame of the sequence are likely to appear in subsequent frames. Spectral redundancy refers to the correlation between the different color components of the same image.

Sufficiently effective compression usually cannot be achieved simply by reducing the various forms of redundancy in a given image sequence. Thus, most current video encoders also reduce the quality of those parts of the sequence that are subjectively least important. In addition, the redundancy of the compressed video bit stream itself is reduced by means of efficient lossless coding. Typically, this is achieved using entropy coding.

Motion-compensated prediction is a form of temporal redundancy reduction in which the content of some (often many) frames in the video sequence is "predicted" from other frames in the sequence by tracking the motion of objects or regions of an image between frames. Frames that are compressed using motion-compensated prediction are commonly referred to as INTER-coded or P-frames, while frames that are compressed without the use of motion-compensated prediction are called INTRA-coded or I-frames. A predicted (motion-compensated, INTER-coded) image is rarely accurate enough to represent the image content with sufficient quality, and therefore each INTER frame is associated with a spatially compressed prediction error (PE) frame. Many video coding schemes can also use bidirectionally predicted frames, which are commonly referred to as B-pictures or B-frames. B-pictures are inserted between reference or so-called "anchor" picture pairs (I- or P-frames) and are predicted from either one or both of the anchor pictures.

The different types of frames that occur in a typical compressed video sequence are illustrated in Fig. 3 of the accompanying drawings. As can be seen from this drawing, the sequence starts with an INTRA or I-frame 30. In Fig. 3, arrows 33 indicate the "forward" prediction process by which P-frames 34 are formed. The bidirectional prediction process by which B-frames 36 are formed is indicated by arrows 31a and 31b, respectively.

The conventional layout of a generic video coding system using motion-compensated prediction is shown in Fig. 1 and 2. Fig. 1 illustrates an encoder 10 employing motion-compensated prediction, and Fig. 2 illustrates the corresponding decoder 20. The encoder 10 shown in Fig. 1 contains a motion field estimation block 11, a motion field coding block 12, a motion-compensated prediction block 13, a prediction error coding block 14, a prediction error decoding block 15, a multiplexing block 16, a frame memory 17 and an adder 19. The decoder 20 comprises a motion-compensated prediction block 21, a prediction error decoding block 22, a demultiplexing block 23 and a frame memory 24.

The operating principle of video encoders employing motion-compensated prediction is to minimize the amount of information in the prediction error frame *E*_{n}(*x,y*), which is the difference between the current frame *I*_{n}(*x,y*) being coded and the predicted frame *P*_{n}(*x,y*). The prediction error frame is thus defined as follows:

*E*_{n}(*x,y*) = *I*_{n}(*x,y*) - *P*_{n}(*x,y*). (1)
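Equation (1) amounts to a pixelwise subtraction. A minimal sketch with hypothetical toy frame values:

```python
import numpy as np

# Toy 4x4 "frames" (hypothetical values) illustrating equation (1):
# the prediction error frame is the pixelwise difference between the
# current frame I_n and the predicted frame P_n.
I_n = np.array([[10, 12, 11, 10],
                [13, 15, 14, 12],
                [11, 13, 12, 11],
                [10, 11, 11, 10]], dtype=np.int16)
P_n = np.array([[10, 11, 11, 10],
                [12, 15, 13, 12],
                [11, 12, 12, 11],
                [10, 11, 10, 10]], dtype=np.int16)

E_n = I_n - P_n                  # equation (1)
print(int(E_n.max()))            # -> 1: small residuals are cheap to encode

# The decoder inverts this: adding the decoded error frame back to the
# prediction recovers the current frame.
assert np.array_equal(P_n + E_n, I_n)
```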

The predicted frame *P*_{n}(*x,y*) is constructed using the pixel values of a reference frame *R*_{n}(*x,y*), which in general is one of the previously coded and transmitted frames, for example the frame immediately preceding the current frame, and is available from the frame memory 17 of the encoder 10. More specifically, the predicted frame *P*_{n}(*x,y*) is constructed by finding "prediction" pixels in the reference frame *R*_{n}(*x,y*) that essentially correspond to pixels in the current frame. Motion information describing the relationship (for example, relative displacement, rotation, scale, etc.) between pixels in the current frame and their corresponding prediction pixels in the reference frame is derived, and the predicted frame is constructed by moving the prediction pixels according to the motion information. In this way, the predicted frame is constructed as an approximate representation of the current frame using pixel values in the reference frame. The prediction error frame mentioned above therefore represents the difference between the approximate representation of the current frame provided by the predicted frame and the current frame itself. The main advantage provided by encoders that use motion-compensated prediction arises from the fact that a comparatively compact description of the current frame can be obtained in the form of the motion information required to form the prediction, together with the associated prediction error information in the prediction error frame.

Because of the large number of pixels in a frame, it is generally inefficient to transmit a separate motion vector to the decoder for each pixel. Instead, in most video coding schemes the current frame is divided into a number of larger image segments *S*_{k}, and motion information relating to these segments is transmitted to the decoder. For example, motion information is usually provided for each macroblock of a frame, and the same motion information is then used for all pixels within the macroblock. In some coding standards, such as ITU-T Recommendation H.26L, currently under development, a macroblock may be divided into smaller blocks, each smaller block being provided with its own motion information.

The motion information usually takes the form of motion vectors [Δ*x*(*x,y*), Δ*y*(*x,y*)]. The pair of numbers Δ*x*(*x,y*) and Δ*y*(*x,y*) represents the horizontal and vertical displacement of a pixel (*x,y*) in the current frame *I*_{n}(*x,y*) with respect to a pixel in the reference frame *R*_{n}(*x,y*). The motion vectors [Δ*x*(*x,y*), Δ*y*(*x,y*)] are calculated in the motion field estimation block 11, and the set of motion vectors of the current frame [Δ*x*(·), Δ*y*(·)] is called the motion vector field.

Typically, the location of a macroblock in the current video frame is specified by the (*x,y*) coordinates of its upper left corner. Thus, in a coding scheme in which motion information is associated with each macroblock of a frame, each motion vector describes the horizontal and vertical displacement Δ*x*(*x,y*) and Δ*y*(*x,y*) of the pixel representing the upper left corner of a macroblock in the current frame *I*_{n}(*x,y*) with respect to the pixel in the upper left corner of the substantially corresponding block of prediction pixels in the reference frame *R*_{n}(*x,y*) (as shown in Fig. 4b).

Motion estimation is a computationally intensive task. Given a reference frame *R*_{n}(*x,y*) and, for example, a square macroblock of *N*×*N* pixels in the current frame (as shown in Fig. 4a), the goal of motion estimation is to find an *N*×*N* block of pixels in the reference frame that matches the characteristics of the macroblock of the current image according to some criterion.
This criterion can be, for example, the sum of absolute differences (SAD) between the pixels of the macroblock in the current frame and the block of pixels in the reference frame with which it is compared. This method is known generally as "block matching". It should be noted that, in general, the geometry of the matched block and of the block in the reference frame need not be the same, since real-world objects may undergo changes of scale, as well as rotation and deformation. However, current international coding standards, such as those mentioned above, use only a translational motion model (see below), and therefore a fixed rectangular geometry is sufficient.

Ideally, to achieve the best chance of finding a match, the whole of the reference frame should be searched. However, this is impractical, since it imposes too high a computational burden on the video encoder. Instead, the search region is usually restricted to a range [-*p,p*] around the original location of the macroblock in the current frame, as shown in Fig. 4c.
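The restricted block matching search described above can be sketched as follows (a simplified illustration; the function name and test data are hypothetical):

```python
import numpy as np

def block_match(cur_block, ref, top, left, p):
    """Full search over [-p, p] around (top, left) in the reference frame,
    minimizing the sum of absolute differences (SAD). A sketch of the
    block-matching scheme described in the text; border positions that
    fall outside the reference frame are simply skipped."""
    n = cur_block.shape[0]
    best = (0, 0)
    best_sad = np.abs(cur_block.astype(int)
                      - ref[top:top + n, left:left + n].astype(int)).sum()
    for dy in range(-p, p + 1):
        for dx in range(-p, p + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + n > ref.shape[0] or x + n > ref.shape[1]:
                continue
            sad = np.abs(cur_block.astype(int)
                         - ref[y:y + n, x:x + n].astype(int)).sum()
            if sad < best_sad:
                best_sad, best = sad, (dx, dy)
    return best, best_sad

# Hypothetical example: the current 8x8 block is an exact copy of the
# reference region at (9, 10), i.e. offset (dx, dy) = (2, 1) from (8, 8).
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, (32, 32))
cur_block = ref[9:17, 10:18]
(mvx, mvy), sad = block_match(cur_block, ref, top=8, left=8, p=4)
print(mvx, mvy, sad)                  # -> 2 1 0 (exact match found)
```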

In order to reduce the amount of motion information to be transmitted from the encoder 10 to the decoder 20, the motion vector field is coded in the motion field coding block 12 of the encoder 10 by representing it with a motion model. In this process, the motion vectors of the image segments are re-expressed using certain predetermined functions; in other words, the motion vector field is represented by a model. Almost all currently used motion vector field models are additive motion models, complying with the following general formulas:

Δ*x*(*x,y*) = Σ_{i} *a*_{i} *f*_{i}(*x,y*) (2)

Δ*y*(*x,y*) = Σ_{i} *b*_{i} *g*_{i}(*x,y*) (3)

where *a*_{i} and *b*_{i} are motion coefficients. The motion coefficients are transmitted to the decoder 20 (information stream 2 in Fig. 1 and 2). The functions *f*_{i} and *g*_{i} are motion field basis functions. They are known to both the encoder and the decoder. An approximate motion vector field can be constructed from the coefficients and the basis functions. Since the basis functions are known to (i.e. stored in) both the encoder 10 and the decoder 20, the encoder needs to transmit only the motion coefficients, thereby reducing the amount of information required to represent the motion information of the frame.

The simplest motion model is the translational motion model, which requires only two coefficients to describe the motion vectors of each segment. The values of the motion vectors are defined by the equations

Δ*x*(*x,y*) = *a*_{0}

Δ*y*(*x,y*) = *b*_{0} (4)

This model is used in ITU-T Recommendations H.261 and H.263 and in the ISO standards MPEG-1, MPEG-2 and MPEG-4 to describe the motion of 16×16 and 8×8 pixel blocks. Systems that use a translational motion model typically perform motion estimation at full-pixel resolution or at some integer fraction of full-pixel resolution, for example at half- or quarter-pixel resolution.

The predicted frame *P*_{n}(*x,y*) is formed in the motion-compensated prediction block 13 of the encoder 10 and is defined by the equation

*P*_{n}(*x,y*) = *R*_{n}[*x* + Δ*x*(*x,y*), *y* + Δ*y*(*x,y*)] (5)
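A sketch of motion-compensated prediction under the translational model of equation (4), applied here to a single whole-frame segment with integer offsets (real codecs apply this per macroblock; the function name and data are hypothetical):

```python
import numpy as np

def predict_translational(ref, a0, b0):
    """Form the predicted frame P_n from reference R_n using the translational
    model: every pixel is offset by (a0, b0), i.e. equation (5) with
    dx = a0, dy = b0. A sketch for one whole-frame segment with integer
    offsets; coordinates are clamped at the frame borders."""
    h, w = ref.shape
    ys = np.clip(np.arange(h) + b0, 0, h - 1)
    xs = np.clip(np.arange(w) + a0, 0, w - 1)
    return ref[np.ix_(ys, xs)]

ref = np.arange(16).reshape(4, 4)
P = predict_translational(ref, a0=1, b0=0)   # shift one pixel horizontally
print(P[0].tolist())                          # -> [1, 2, 3, 3] (border clamped)
```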

In the prediction error coding block 14, the prediction error frame *E*_{n}(*x,y*) is typically compressed by representing it as a finite series (transform) of some two-dimensional functions. For example, a two-dimensional discrete cosine transform (DCT) can be used. The transform coefficients are quantized and entropy coded (for example, Huffman coded) before they are transmitted to the decoder (information stream 1 in Fig. 1 and 2). Because quantization introduces error, this operation usually leads to some degradation (loss of information) in the prediction error frame *E*_{n}(*x,y*). To compensate for this degradation, the encoder 10 also includes a prediction error decoding block 15, in which a decoded prediction error frame is constructed using the transform coefficients. This locally decoded prediction error frame is added to the predicted frame *P*_{n}(*x,y*) by the adder 19, and the resulting decoded current frame is stored in the frame memory 17 for further use as the next reference frame *R*_{n+1}(*x,y*).

Information stream 2, carrying the motion vector information, is combined with the prediction error information in the multiplexer 16, and information stream 3, containing, as a rule, at least these two types of information, is sent to the decoder 20.

The operation of the corresponding decoder 20 will now be described.

The frame memory 24 of the decoder 20 stores a previously reconstructed reference frame *R*_{n}(*x,y*). The predicted frame *P*_{n}(*x,y*) is formed in the motion-compensated prediction block 21 of the decoder 20 according to equation (5), using the received motion information and the pixel values of the previously reconstructed reference frame *R*_{n}(*x,y*). The transmitted transform coefficients of the prediction error frame *E*_{n}(*x,y*) are used in the prediction error decoding block 22 to construct the decoded prediction error frame. The pixels of the decoded current frame are then reconstructed by summing the predicted frame *P*_{n}(*x,y*) and the decoded prediction error frame:

*Ĩ*_{n}(*x,y*) = *P*_{n}(*x,y*) + *Ẽ*_{n}(*x,y*) (6)

This decoded current frame may be stored in the frame memory 24 as the next reference frame *R*_{n+1}(*x,y*).

In the scheme of motion-compensated encoding and decoding of digital video described above, the motion vector [Δ*x*(*x,y*), Δ*y*(*x,y*)] describing the motion of a macroblock in the current frame with respect to the reference frame *R*_{n}(*x,y*) can point to any of the pixels in the reference frame. This means that motion between frames of a digital video sequence can be represented only with a resolution determined by the image pixels of the frame (so-called full-pixel resolution). Real motion, however, has arbitrary precision, and therefore the system described above can provide only approximate modelling of the motion between successive frames of a digital video sequence. Typically, modelling of motion between video frames with full-pixel resolution is not precise enough to allow effective minimization of the prediction error (PE) associated with each macroblock or frame. Therefore, in order to enable more accurate modelling of real motion and to help reduce the amount of PE information that must be transmitted from the encoder to the decoder, many video coding standards allow motion vectors to point "between" image pixels. In other words, motion vectors can have "sub-pixel" resolution. Allowing motion vectors to have sub-pixel resolution adds to the complexity of the encoding and decoding operations to be performed, so it is advantageous to limit the degree of spatial resolution a motion vector may have. Thus, video coding standards, such as those previously mentioned, typically allow motion vectors to have full-, half- or quarter-pixel resolution.

Motion estimation with sub-pixel resolution can be implemented as a two-stage process, as illustrated in exemplary form in Fig. 5 for a generic coding scheme in which the motion vectors may have full- or half-pixel resolution. In the first stage, a motion vector having full-pixel resolution is found using a suitable motion estimation scheme, such as the block matching method described above. The resulting motion vector, having full-pixel resolution, is shown in Fig. 5.

In the second stage, the motion vector found in the first stage is refined to obtain the desired half-pixel resolution.
In the example illustrated in Fig. 5, this is done by forming eight new search blocks of 16×16 pixels, the location of the upper left corner of each block being marked with an X in Fig. 5. These locations are denoted [Δ*x* + *m*/2, Δ*y* + *n*/2], where *m* and *n* can take the values -1, 0 and +1 but cannot both be zero at the same time. Since only the pixel values of the original image pixels are known, the values (for example, luminance and/or chrominance values) of the sub-pixels lying at half-pixel locations must be estimated for each of the eight new search blocks using some form of interpolation scheme.

Having interpolated the sub-pixel values with half-pixel resolution, each of the eight search blocks is compared with the macroblock whose motion vector is being sought. As in the block matching method performed to find the motion vector with full-pixel resolution, the macroblock is compared with each of the eight search blocks according to some criterion, for example SAD. As a result of the comparisons, a minimum SAD value is generally obtained. Depending on the nature of the motion in the video sequence, this minimum value may correspond to the location specified by the original motion vector (with full-pixel resolution), or it may correspond to a location with half-pixel resolution. Thus it is possible to determine whether the motion vector points to a full- or half-pixel location and, if half-pixel resolution is appropriate, to find the correct motion vector with half-pixel resolution.
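One simple choice of such an interpolation scheme is bilinear interpolation of the half-pixel values (a sketch only; H.26L in fact uses longer filters, and the function name and test values are hypothetical):

```python
import numpy as np

def half_pixel_bilinear(ref):
    """Estimate half-pixel values by bilinear interpolation. Returns a frame
    upsampled 2x in each direction, in which even coordinates hold the
    original full-pixel samples and odd coordinates hold the interpolated
    half-pixel samples."""
    h, w = ref.shape
    up = np.zeros((2 * h - 1, 2 * w - 1))
    up[::2, ::2] = ref                                   # full-pixel positions
    up[::2, 1::2] = (ref[:, :-1] + ref[:, 1:]) / 2       # horizontal half-pels
    up[1::2, ::2] = (ref[:-1, :] + ref[1:, :]) / 2       # vertical half-pels
    up[1::2, 1::2] = (ref[:-1, :-1] + ref[:-1, 1:]       # diagonal half-pels
                      + ref[1:, :-1] + ref[1:, 1:]) / 4
    return up

ref = np.array([[0, 4], [8, 12]], dtype=float)
print(half_pixel_bilinear(ref).tolist())
# -> [[0.0, 2.0, 4.0], [4.0, 6.0, 8.0], [8.0, 10.0, 12.0]]
```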

In practice, the estimation of sub-pixel values in the reference frame is performed by interpolating the sub-pixel values from the surrounding pixel values. In general, the interpolation of a value *F(x,y)* located at a non-integer location *(x,y)* = *(n* + Δ*x*, *m* + Δ*y)* can be formulated as a two-dimensional operation, represented mathematically as

*F(x,y)* = Σ_{k} Σ_{l} *f(k,l)* *F*(*n* + *k*, *m* + *l*) (7)

where *f(k,l)* are the filter coefficients, and *n* and *m* are obtained by truncating *x* and *y*, respectively, to integer values. Typically, the filter coefficients depend on the values of Δ*x* and Δ*y*, and the interpolation filters are then usually referred to as "fractional filters", in which case the sub-pixel value *F(x,y)* can be calculated as follows:

*F(x,y)* = Σ_{k} Σ_{l} *f*_{Δ*x*,Δ*y*}(*k,l*) *F*(*n* + *k*, *m* + *l*) (8)
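As a concrete instance of such a filter, the 6-tap filter (1, -5, 20, 20, -5, 1)/32 is used for half-pixel luminance positions in the H.26L/H.264 family of codecs; a sketch of applying it in one dimension (the function name and sample values are hypothetical):

```python
def interpolate_half_pixel(row, n):
    """Apply the 6-tap filter f = (1, -5, 20, 20, -5, 1)/32 at the half-pixel
    position between row[n] and row[n+1], as one concrete one-dimensional
    instance of equation (8). 'row' is a list of full-pixel values."""
    taps = (1, -5, 20, 20, -5, 1)
    acc = sum(t * row[n - 2 + k] for k, t in enumerate(taps))
    return acc / 32

row = [10, 10, 10, 20, 20, 20]         # a step edge (hypothetical samples)
print(interpolate_half_pixel(row, 2))  # -> 15.0, midway between 10 and 20
```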

The motion vectors are calculated in the encoder. Once the corresponding motion coefficients have been transmitted to the decoder, it is a simple matter to interpolate the required sub-pixels using an interpolation method identical to that used in the encoder. In this way, a frame following the reference frame in the frame memory 24 can be reconstructed from the reference frame and the transmitted motion vectors.

Traditionally, the interpolation filters used in video encoders and video decoders have employed fixed filter coefficient values, and the same filter (that is, the same type of filter with the same filter coefficient values) is used for all frames of the encoded video sequence. The same filter is used for all video sequences, regardless of their nature and of how they were acquired (captured). *Wedi* ("Adaptive Interpolation Filter for Motion Compensated Hybrid Video Coding", Picture Coding Symposium (PCS 2001), Seoul, Korea, April 2001) proposes the use of interpolation filters with adaptive filter coefficient values to compensate for certain shortcomings in this coding approach. In particular, *Wedi* describes how aliasing in the image acquisition process, the finite resolution of the permissible motion vectors and the limited accuracy of the translational motion model introduce additional prediction errors. Aliasing in the video material is caused by the use of non-ideal low-pass filters (and the consequent violation of the Nyquist sampling theorem) during image acquisition. Aliasing disturbs motion-compensated prediction in the video sequence and gives rise to an additional prediction error component. The finite accuracy of the permissible motion vectors (for example, full-, half- or quarter-pixel) and the ability of the translational motion model to represent only horizontal and vertical translational motion between successive video frames also give rise to additional prediction error components. *Wedi* further suggests that improvements in coding efficiency can be achieved by adapting the coefficient values of the interpolation filter in such a way that the filter compensates for the additional prediction errors introduced by aliasing, the finite accuracy of the motion vectors and the limited accuracy of the translational motion model.

More generally, it should be understood that, since the nature and characteristics of the motion in a video sequence change, the optimal interpolation filter varies as a function of time and of location within the image. *Wedi* presents an example in which an interpolation filter with dynamically adaptive filter coefficient values is embedded in an H.26L codec, specifically in the version of this codec defined by Test Model (TML) 4. TML-4 of H.26L uses quarter-pixel motion vector resolution and a Wiener-type interpolation filter with six symmetric filter coefficients (a 6-tap filter). The example presented in the *Wedi* article proposes adapting the filter coefficients of the interpolation filter on a frame-by-frame basis, differentially encoding the filter coefficients and transmitting them to the decoder as side information alongside the main image data. A proposal based on this approach has been made to include the use of interpolation filters with adaptive filter coefficient values in Test Model 8 of the H.26L video codec. This is presented in the ITU Telecommunication Standardization Sector documents entitled "Adaptive Interpolation Filter for H.26L", Study Group 16, Question 6, Video Coding Experts Group (VCEG), document VCEG-N28, September 2001, and "More Results on Adaptive Interpolation Filter for H.26L", Study Group 16, Question 6, Video Coding Experts Group (VCEG), document VCEG-O16r1, November 2001.

The use of dynamically adaptive interpolation filters raises an important issue concerning the coding efficiency of the encoded video data stream, and also affects the error resilience of the encoded video data. The coding efficiency issue is easy to understand. In a video coding system that uses an interpolation filter with fixed filter coefficient values, it is not necessary to include any information relating to the filter coefficient values in the encoded video bit stream. The filter coefficient values can simply be stored in the video encoder and decoder. In other words, in a video coding system implemented according to a particular coding standard that uses fixed interpolation filters, the coefficient values are pre-programmed into the encoder and the decoder according to the specifications of the standard. However, if dynamically adaptive filter coefficients are allowed, it becomes necessary to transmit information related to the coefficient values. Since the filter coefficients are updated periodically (for example, on a frame-by-frame basis), this information necessarily adds to the amount of information to be transferred from the video encoder to the decoder, and has an adverse effect on coding efficiency. In low bit rate coding applications, any increase in the amount of information to be transmitted is in most cases undesirable.

Thus, in order to model and compensate for motion optimally, an efficient dynamic representation of the interpolation filters is needed.

As far as error resilience is concerned, it should be understood that the way in which information about the dynamically varying coefficients of the interpolation filter is transmitted from the encoder to the decoder may affect the susceptibility of the video data to transmission errors. More specifically, in video coding systems that use dynamically adaptive interpolation filters, correct reconstruction of the frames of the video sequence at the decoder depends on correct reception and decoding of the filter coefficient values. If the information relating to the coefficient values is subject to errors during transmission from the encoder to the decoder, distortion of the reconstructed video data is likely. Three ways of coding the filter coefficients are known in the prior art. The first is to entropy code the filter coefficient values independently of one another. The second is to entropy code the filter coefficient values differentially with respect to the coefficients of an already decoded filter (as proposed in the *Wedi* article), and the third is to define a set of filters and to encode the index of the selected filter.

The prior art solutions that could be used to encode the coefficients of an interpolation filter, as mentioned above, have problems associated with them in different usage scenarios. The first approach, in which the coefficients of the interpolation filter are encoded independently, offers lower coding performance because it does not make use of any prior information (that is, information about previously encoded coefficient values of the interpolation filter). This approach therefore requires an excessively large amount of information to be added to the encoded video bit stream in order to describe the coefficient values of the interpolation filter. Differential encoding of the coefficients, as proposed in the *Wedi* article, is efficient but cannot be used in an environment subject to transmission errors, since the filter coefficients depend on the correct decoding of earlier filter coefficients. As described previously, if the encoded bit stream is subjected to errors during transmission from the encoder to the decoder, distortion of the video data reconstructed by the decoder is likely. The third prior art solution, with a predefined set of filters, provides only limited alternatives and thereby impairs coding performance. In other words, this option cannot achieve the full benefit of using an interpolation filter with dynamically adaptive filter coefficient values, as set out in the *Wedi* article.

Thus, it should be clear that there is a need for a way of encoding the coefficient values of adaptive interpolation filters which is efficient and does not degrade the error resilience of the encoded bit stream.

The invention

The present invention combines the good coding efficiency of differential encoding with error-resilience properties that allow its use in all environments. It is therefore particularly suitable for implementation in a video coding system intended for use in error-prone environments, for example when the encoded video bit stream must be transmitted over a radio link subject to errors.

Thus, according to a first aspect of the present invention, there is provided a method of encoding images in a digital video sequence to obtain encoded video data, the digital video sequence containing a sequence of video frames, each video frame having a plurality of pixel values, wherein an interpolation filter having a set of coefficients, represented by a set of coefficient values, is used to reconstruct the pixel values in a frame of said digital video sequence from the encoded video data.

The method differs in that

the values of the coefficients of the interpolation filter are encoded differentially relative to a specified base filter to generate a set of difference values, and

the set of difference values is incorporated into the encoded video data so that reconstruction of the pixel values is based on the set of difference values.

Preferably, the encoded video data includes coded values expressing the set of difference values, and the set of difference values is entropy-coded prior to transmission from the encoder to the decoder.

Preferably, said base filter has a plurality of coefficients with values statistically similar to the values of the coefficients of the interpolation filter.

Preferably, the coefficients of the interpolation filter are chosen for interpolating the pixel values in a selected segment of the image.

Preferably, said base filter has fixed coefficients.

Preferably, said base filter has a set of coefficients adapted to the statistics of the video sequence.

Preferably, the interpolation filter is symmetric, so that only half of the filter coefficients are encoded.

Preferably, the values of the coefficients of the interpolation filter are coded in a specific order from the first coefficient value to the last coefficient value, and this specific order differs from the spatial order of said coefficients.

Preferably, the sum of the values of the coefficients of the interpolation filter is fixed.

Preferably, said base filter has a set of coefficient values, and a constant value is added to these coefficient values of said base filter so as to reduce the amplitude of the differences between the values of the coefficients of the interpolation filter and the values of the coefficients of said base filter.

According to a second aspect of the present invention, there is provided a video encoder, which contains

means for encoding images in a digital video sequence having a sequence of video frames to obtain encoded video data expressing the sequence, each frame of the sequence containing a number of pixel values, and

means for determining an interpolation filter for reconstructing the pixel values in a frame of said digital video sequence in the decoding process, the interpolation filter having a number of coefficients represented by a set of coefficient values.

The video encoder differs in that it contains

means, responsive to the interpolation filter, for calculating the differences between the values of the coefficients of the interpolation filter and the specified base filter to obtain a set of difference values, and

means for incorporating the set of difference values into the encoded video data so that reconstruction of the pixel values in the decoding process is based on the set of difference values.

Preferably, the encoder includes means for entropy coding the set of difference values before incorporating the set of difference values into the encoded video data.

According to a third aspect of the present invention, there is provided a method of decoding video data expressing a digital video sequence containing a sequence of video frames, each frame of the sequence containing a number of pixel values, wherein an interpolation filter having a set of coefficients, represented by a set of coefficient values, is used for reconstructing the pixel values in a frame of said digital video sequence.

The method differs in that

a set of difference values is extracted from the video data, the set of difference values expressing the differences between the values of the coefficients of the interpolation filter and the specified base filter,

an additional filter is created on the basis of the set of difference values and the specified base filter, and

the pixel values are reconstructed on the basis of the additional filter.

Preferably, said base filter has a number of coefficients, represented by a set of coefficient values, and the additional filter is created by summing the set of difference values with these coefficient values of said base filter.

Preferably, the set of difference values is extracted from the video data by entropy decoding.

According to a fourth aspect of the present invention, there is provided a video decoder that includes means for receiving video data in a bit stream, the received video data expressing a digital video sequence containing a sequence of video frames, each frame of the sequence containing a number of pixel values.

The video decoder differs in that it contains

means for extracting a set of difference values from the bit stream,

means for creating an interpolation filter based on a specified base filter and the set of difference values, and

means for reconstructing the pixel values in a frame of the video sequence on the basis of the interpolation filter and the received video data.

Preferably, the video decoder further has means for summing the set of difference values with the corresponding coefficient values of the specified base filter to generate the interpolation filter, and means for entropy decoding the set of difference values from the bit stream.

According to a fifth aspect of the present invention, there is provided a video coding system that contains

an encoder for encoding images in a digital video sequence having a sequence of video frames to obtain encoded video data in a bit stream representing the video sequence, each frame of the sequence containing a number of pixel values, the encoder having means for determining an interpolation filter for reconstructing the pixel values in a frame of said digital video sequence in the decoding process, the interpolation filter having a set of filter coefficients represented by a set of coefficient values, and

a decoder for receiving the encoded video data in the bit stream for reconstruction of the pixel values in a frame of the sequence in the decoding process.

The video coding system differs in that

the encoder further comprises

means for calculating the differences between the interpolation filter and the specified base filter to obtain a set of difference values, and

means for incorporating the set of difference values into the bit stream, and

the decoder contains

means for extracting the set of difference values from the bit stream, and

means for creating an additional filter based on the specified base filter and the extracted set of difference values, so that reconstruction of the pixel values in the decoding process is based on the additional filter.

These and other features of the present invention will become clear from the following description taken in conjunction with the accompanying drawings. It should be understood, however, that the drawings are designed solely for purposes of illustration and not as a definition of the limits of the invention.

Brief description of drawings

Fig. 1 is a block diagram illustrating a general video encoder according to the prior art.

Fig. 2 is a block diagram illustrating a general video decoder according to the prior art.

Fig. 3 is a schematic view illustrating the types of frames used in video encoding.

Fig. 4A is a schematic view illustrating a macroblock in the current frame.

Fig. 4b is a schematic view illustrating a reference frame for block matching.

Fig. 4C is a schematic view illustrating a search area around the original location of the macroblock in the current frame.

Fig. 5 is a schematic view illustrating the process of motion estimation to sub-pixel resolution according to the prior art.

Fig. 6A is a schematic view illustrating an optimal interpolation filter.

Fig. 6b is a schematic view illustrating the optimal interpolation filter decomposed into a base filter and difference coefficients.

Fig. 6C is a schematic view illustrating the difference coefficients to be encoded and sent to the decoder.

Fig. 7 is a block diagram illustrating a terminal device containing video encoding and video decoding equipment capable of implementing the present invention.

Fig. 8A is a block diagram illustrating a video encoder according to a preferred embodiment of the present invention.

Fig. 8b is a block diagram illustrating a video encoder according to another embodiment of the present invention.

Fig. 8C is a block diagram illustrating a video encoder according to another embodiment of the present invention.

Fig. 9a is a block diagram illustrating a video decoder according to a preferred embodiment of the present invention.

Fig. 9b is a block diagram illustrating a video decoder according to another embodiment of the present invention.

Fig. 9c is a block diagram illustrating a video decoder according to another embodiment of the present invention.

Best mode for carrying out the invention

The encoder according to the present invention encodes the filter coefficients differentially with respect to the coefficients of a specified base filter. Fig. 6A-6C illustrate the method according to the present invention. The bar graphs shown in Fig. 6A are examples of the values of the coefficients of an interpolation filter, each column corresponding to one of the filter coefficients. The height of a column represents the corresponding coefficient value; bars extending above the horizontal axis represent positive coefficient values, and bars extending below the horizontal axis represent negative coefficient values. In Fig. 6A and 6b, the bar graph 110 represents the filter detected by the encoder as the best fit for interpolating the motion of the selected segment, while the bar graph 140 represents the base filter. In the example shown in Fig. 6A, the filter is a 6-tap symmetric filter with 6 filter coefficients. Instead of sending the filter coefficients themselves, only the differences between the selected filter 110 and the base filter 140 are encoded and sent. The transmitted coefficients 120 are shown in Fig. 6C.
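The decomposition of Fig. 6A-6C can be sketched in code. The numeric filter values below are illustrative assumptions only (the patent specifies no numeric filters); the point is that only the element-wise differences between the selected filter and the shared base filter are transmitted.

```python
# Illustrative values only: neither the base filter nor the "optimal" filter
# below is taken from the patent; both are assumptions for this sketch.
BASE_FILTER = [1, -5, 20, 20, -5, 1]   # 6-tap base filter known to both ends

def encode_differences(optimal, base=BASE_FILTER):
    """Encoder side: code the selected filter differentially against the base filter."""
    return [o - b for o, b in zip(optimal, base)]

def reconstruct_filter(diffs, base=BASE_FILTER):
    """Decoder side: add the received differences back to the stored base filter."""
    return [b + d for b, d in zip(base, diffs)]

optimal = [2, -6, 20, 20, -6, 2]       # filter selected by the encoder
diffs = encode_differences(optimal)    # [1, -1, 0, 0, -1, 1]: small amplitudes
assert reconstruct_filter(diffs) == optimal
```

Because the differences cluster around zero, an entropy coder can represent them with far fewer bits than the raw coefficient values would require.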

The present invention provides a coding gain because differences of small amplitude can be encoded efficiently by an entropy coder. When such difference values are included in the encoded video bit stream produced by the video encoder, and this video bit stream is transmitted from the encoder to a corresponding decoder, the coefficients of the interpolation filter can be recovered in the decoder by extracting the difference values from the encoded bit stream and adding them to the corresponding coefficient values of the specified base filter stored in the decoder.

It should be noted that the base filter can also be adapted to the statistics of the video sequence and to the received filter coefficients to further improve the coding efficiency. It is also possible for the base filter to be fixed for the codec. In other words, the same base filter is used for all video sequences to be encoded, regardless of their characteristics or the way in which they are received. Alternatively, the base filter is adapted to the image data, i.e. different base filters are used for different video sequences, or the base filter can be adapted during the encoding of a specific sequence according to some specified rules.

If the filter is symmetric, as shown in Fig. 6A-6C, it is necessary to encode only half of the filter coefficients. The rest can be obtained by copying.
In the example shown in Fig. 6C, the amount of information needed to represent the values of the coefficients of the adaptive interpolation filter in the encoded video bit stream can be reduced further by recognizing that the fourth, fifth and sixth filter coefficients are identical to the third, second and first filter coefficients, respectively. Thus, in this case, the six coefficients of the interpolation filter can actually be encoded by three values, of which the first represents the difference between the first coefficient of the interpolation filter and the first coefficient of the specified base filter, the second represents the difference between the second coefficient of the interpolation filter and the second coefficient of the specified base filter, and the third represents the difference between the third coefficient of the interpolation filter and the third coefficient of the specified base filter. It is then only necessary to include these three difference values in the encoded video bit stream transmitted from the encoder to the decoder, because the decoder can obtain the remaining three coefficients of the interpolation filter by corresponding copying of the first three reconstructed coefficient values. The same approach can be adopted if the base filter and interpolation filter have not an even but an odd number of coefficients,
but are nonetheless symmetric. In this case, it should be understood that the number of difference values to be encoded is (n/2) + 1 (with integer division), where n is the number of coefficients in the base filter / interpolation filter.
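The symmetry argument above can be sketched as follows; the helper names and filter values are illustrative assumptions, not from the patent. For an n-tap symmetric filter, n//2 differences suffice when n is even and n//2 + 1 when n is odd (the centre tap has no mirror partner), matching the (n/2) + 1 count given above for odd n.

```python
def encode_symmetric(optimal, base):
    """Encode only the unique half of a symmetric filter, differentially."""
    n = len(optimal)
    unique = n // 2 + (n % 2)          # n//2 for even n, n//2 + 1 for odd n
    return [optimal[i] - base[i] for i in range(unique)]

def decode_symmetric(diffs, base):
    """Reconstruct the full filter by mirroring the decoded half."""
    n = len(base)
    half = [base[i] + d for i, d in enumerate(diffs)]
    return half + half[:n // 2][::-1]  # reflect, excluding any centre tap

base = [1, -5, 20, 20, -5, 1]          # assumed symmetric 6-tap base filter
optimal = [2, -6, 20, 20, -6, 2]
diffs = encode_symmetric(optimal, base)   # only 3 values need transmission
assert decode_symmetric(diffs, base) == optimal
```

The same functions handle an odd-length symmetric filter: the centre tap is coded once and is simply excluded from the mirrored portion.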

The method according to the present invention can be combined with other methods of encoding coefficients. For example, a set of the most commonly used filters can be encoded by their indexes, while less frequently used filters can be encoded as described in this invention, providing the maximum variety of available filters and thereby overcoming the previously mentioned drawbacks of the third prior art method of encoding coefficient values.

The encoding order of the filter coefficients is not necessarily the spatial order. For example, the difference values representing the values of the coefficients of the interpolation filter need not be included in the encoded video bit stream in the same order as the coefficients are used in the filter. In this case, a rule specifying the order in which the difference values appear in the bit stream must be defined and known to both encoder and decoder.
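As a sketch of such a shared ordering rule, consider a hypothetical "centre-out" permutation. The rule itself is an assumption for illustration; the patent only requires that some fixed rule be defined and known to both ends.

```python
def transmission_order(n):
    """Hypothetical shared rule: send the centre taps first, then work outward."""
    centre = (n - 1) / 2
    return sorted(range(n), key=lambda i: abs(i - centre))

def to_bitstream_order(values, order):
    """Encoder: permute spatial-order values into transmission order."""
    return [values[i] for i in order]

def from_bitstream_order(sent, order):
    """Decoder: apply the same shared rule to restore spatial order."""
    spatial = [0] * len(sent)
    for k, i in enumerate(order):
        spatial[i] = sent[k]
    return spatial

order = transmission_order(6)              # centre-out index permutation
diffs = [1, -1, 0, 0, -1, 1]               # difference values in spatial order
assert from_bitstream_order(to_bitstream_order(diffs, order), order) == diffs
```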

It is also possible for the base filter to adapt to the previously received coefficients of the same filter. For example, if the first transmitted filter coefficient is greater than the corresponding coefficient of the base filter, the second coefficient of the base filter can be reduced. This is especially applicable if the sum of the filter coefficients is known.

Typically, the sum of the filter coefficients is fixed. In this case, there is no need to encode the last filter coefficient, since it can be calculated by subtracting the sum of the preceding coefficients from the total sum. If the sum of the filter coefficients is not fixed, a separately transmitted or fixed constant can be added to the coefficients of the base filter or the derived filter to reduce the amplitude of the coefficient differences.
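The fixed-sum property can be sketched as follows. The normalisation constant 32 is an assumption chosen for illustration (typical for integer-arithmetic filters); it is not a value from the patent.

```python
FIXED_SUM = 32                      # assumed fixed sum of the coefficients

def drop_last(coeffs):
    """Encoder: the last coefficient is implied by the fixed sum, so omit it."""
    assert sum(coeffs) == FIXED_SUM
    return coeffs[:-1]

def restore_last(partial):
    """Decoder: recover the omitted coefficient by subtraction from the total."""
    return partial + [FIXED_SUM - sum(partial)]

coeffs = [1, -5, 20, 20, -5, 1]     # sums to 32
assert restore_last(drop_last(coeffs)) == coeffs
```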

Fig. 7 illustrates a terminal device containing video encoding and video decoding equipment which can be adapted to operate in accordance with the present invention. More precisely, Fig. 7 illustrates a multimedia terminal 60 implemented according to the relevant ITU-T recommendation. This terminal can be regarded as a multimedia transmitting and receiving device. It includes elements that capture, encode and multiplex multimedia data streams for transmission over a communication network, as well as elements that receive, demultiplex, decode and display received multimedia content. The ITU-T recommendation defines the overall operation of the terminal and refers to other recommendations that govern the operation of its various component parts. This kind of multimedia terminal can be used in real-time applications, such as conversational communication, or in non-real-time applications, such as retrieving or streaming video clips, for example from a multimedia content server on the Internet.

In the context of the present invention, it should be understood that the terminal shown in Fig. 7 is just one of several alternative embodiments of a multimedia terminal suitable for application of the invented method. It should also be noted that there are several alternatives relating to the location and embodiment of the terminal equipment. As illustrated in Fig. 7, the multimedia terminal can be located in communications equipment connected to a fixed-line telephone network such as the PSTN (public switched telephone network). In this case, the multimedia terminal is equipped with a modem 71 conforming to ITU-T recommendations V.8, V.34 and, optionally, V.8bis. Alternatively, the multimedia terminal can be connected to an external modem. The modem converts the multiplexed digital data and control signals produced by the multimedia terminal into an analog form suitable for transmission over the PSTN. It further enables the multimedia terminal to receive data and control signals in analog form from the PSTN and to convert them into a digital data stream that can be demultiplexed and processed by the terminal.

The multimedia terminal can also be implemented so that it can be connected directly to a digital fixed-line network, such as an ISDN (integrated services digital network). In this case, the modem 71 is replaced by an ISDN user-network interface. In Fig. 7 this user-network interface is represented by the alternative block 72.

The multimedia terminal can also be adapted for use in mobile applications. When used with a wireless communication link, the modem 71 can be replaced with a suitable wireless interface, as represented by the alternative block 73 in Fig. 7. For example, the multimedia terminal may include a radio transceiver that provides connection to an existing second-generation GSM mobile telephone network or to the proposed third-generation UMTS (universal mobile telephone system).

It should be noted that in a multimedia terminal designed for bidirectional communication, i.e. for transmission and reception of video data, it is useful to provide both a video encoder and a video decoder implemented in accordance with the present invention. Such an encoder-decoder pair is often implemented as a single combined functional unit called a "codec".

A typical multimedia terminal will now be described in more detail with reference to Fig. 7. The multimedia terminal 60 includes a set of elements called "terminal equipment". This terminal equipment includes video, audio and telematic devices, designated generally by the reference numerals 61, 62 and 63, respectively. The video equipment 61 may include, for example, a video camera for capturing video, a monitor for displaying received video content and optional video processing equipment. The audio equipment 62 typically includes a microphone, for example for capturing spoken messages, and a loudspeaker for reproducing received audio content. The audio equipment may also include additional audio processing units. The telematic equipment 63 may include a data terminal, a keyboard, an electronic whiteboard or a transceiver of still images, such as a fax unit.

The video equipment 61 is connected to the video codec 65. The video codec 65 contains a video encoder and a corresponding video decoder, both of which are implemented according to the invention. Such an encoder and decoder will be described below. The video codec 65 is responsible for encoding captured video data in a form suitable for further transmission over the communication link and for decoding compressed video content received from the communication network. In the example illustrated in Fig. 7, it is assumed that the codec is implemented in such a way that it includes the use of dynamically adaptive interpolation filters. It is further assumed that the encoder section of the codec is configured to encode and transmit the values of the interpolation filter coefficients to a corresponding decoder according to an embodiment of the invented method, as described previously. Similarly, the decoder section of the codec is configured to receive and decode filter coefficient values that are encoded according to the same embodiment of the invented method.

The audio equipment of the terminal is connected to an audio codec, indicated in Fig. 7 by the reference numeral 66. Like the video codec, the audio codec contains an encoder-decoder pair. It converts audio data captured by the terminal's audio equipment into a form suitable for transmission over the communication link, and converts encoded audio data received from the network back into a form suitable for reproduction, for example on the terminal's loudspeaker. The output of the audio codec is connected to a delay block 67. This compensates for the delay introduced by the video encoding process and thereby ensures synchronization of the audio and video content.

The system control block 64 of the multimedia terminal controls signaling between the terminal and the network using an appropriate control protocol (signaling block 68) to establish a common mode of operation between the transmitting and receiving terminals. The signaling block 68 exchanges information about the encoding and decoding capabilities of the transmitting and receiving terminals and can be used to enable different encoding modes of the video encoder. The system control block 64 also controls the use of data encryption. Information about the type of encryption to be used in data transmission is passed from the encryption block 69 to the multiplexer-demultiplexer (MUX/DMUX unit) 70.

During data transmission from the multimedia terminal, the MUX/DMUX unit 70 combines the encoded and synchronized video and audio streams with data input from the telematic equipment 63 and possible control data to form a single bit stream. Information provided by the encryption block 69 (if any) concerning the type of data encryption to be applied to the bit stream is used to select the encryption mode. Accordingly, when a multiplexed and possibly encrypted multimedia bit stream is received, the MUX/DMUX unit 70 is responsible for decrypting the bit stream, dividing it into its constituent media components and passing those components to the appropriate codec(s) and/or terminal equipment for decoding and playback.

Fig. 8A is a block diagram of a video encoder 700 implemented according to a preferred embodiment of the invention. The structure of the video encoder shown in Fig. 8A is in many respects similar to the structure of the prior art video encoder illustrated in Fig. 1, with suitable modifications in those parts of the encoder which perform operations associated with sub-pixel value interpolation and the formation of the encoded video data stream. Most elements of the video encoder 700 operate in the same way as the corresponding elements of the previously described prior art video encoder 100 (see Fig. 1). A description of such elements is omitted for brevity. In particular, the video encoder 700 includes a motion field estimation block 711, a motion field coding block 712, a motion-compensated prediction block 713, a prediction error coding block 714, a prediction error decoding block 715, a multiplexing block 716, a frame memory 717 and an adder 719. As shown in Fig. 8A, the motion field estimation block 711 also includes a difference coefficient calculation block 710, used for calculating the differences between the selected filter and the base filter 709.

The operation of the video encoder 700 will now be described. Like the prior art video encoder,
the video encoder 700 according to an embodiment of the present invention applies motion-compensated prediction relative to a reference frame R_n(x,y) to generate a bit stream representing a video frame encoded in INTER format. It applies motion-compensated prediction with sub-pixel resolution, and uses an interpolation filter having dynamically variable filter coefficient values in order to form the sub-pixel values required during the motion estimation process.

The video encoder 700 performs motion-compensated prediction on a block-by-block basis, and performs motion compensation with sub-pixel resolution as a two-stage process for each block. In the first stage, a motion vector having full-pixel resolution is found by block matching, i.e. by searching for the block of pixel values in the reference frame R_n(x,y) that best matches the pixel values of the current image block to be coded. The block matching operation is performed by the motion field estimation block 711 together with the frame memory 717, from which the pixel values of the reference frame R_n(x,y) are retrieved. In the second stage of motion-compensated prediction, the motion vector found in the first stage is refined to the desired sub-pixel resolution.
To do this, the motion field estimation block 711 forms new search blocks having sub-pixel resolution by interpolating the pixel values of the reference frame R_n(x,y) in the area previously identified as the best match for the current image block to be encoded (see Fig. 5). As part of this process, the motion field estimation block 711 finds the optimal interpolation filter for interpolating the sub-pixel values. Preferably, the values of the interpolation filter coefficients are adapted in connection with the coding of each image block. In alternative embodiments, the interpolation filter coefficients can be adapted less frequently, for example once for each frame or at the beginning of the video sequence to be encoded.
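To make the interpolation step concrete, the sketch below computes one half-pixel value from six neighbouring full-pixel samples. The 6-tap kernel [1, -5, 20, 20, -5, 1]/32 is an assumption chosen for illustration (a well-known half-pixel kernel in standard codecs); the adaptive filters of the invention would supply their own coefficient values.

```python
def half_pixel(samples, taps=(1, -5, 20, 20, -5, 1), shift=5):
    """Interpolate the half-pixel value lying between samples[2] and
    samples[3] from six full-pixel samples, with rounding and clipping."""
    acc = sum(t * s for t, s in zip(taps, samples))
    value = (acc + (1 << (shift - 1))) >> shift   # round and divide by 2**shift
    return max(0, min(255, value))                # clip to the 8-bit pixel range

print(half_pixel([100, 100, 100, 100, 100, 100]))  # flat area -> 100
print(half_pixel([0, 32, 64, 96, 128, 160]))       # linear ramp -> 80
```

On a flat area the filter reproduces the constant level, and on a linear ramp it lands midway between the two central samples, which is the behaviour expected of a half-pixel interpolator.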

Having interpolated the necessary sub-pixel values and formed the new search blocks, the motion field estimation block 711 performs a further search in order to determine whether any of the new search blocks provides a better match to the current image block than the best matching block originally identified at full-pixel resolution. In this way, the motion field estimation block 711 determines whether the motion vector representing the image block currently being coded should have full-pixel or sub-pixel resolution.

The motion field estimation block 711 outputs the identified motion vector to the motion field coding block 712, which approximates the motion vector using a motion model, as described previously. The motion-compensated prediction block 713 then generates a prediction for the current image block using the calculated motion vector information, and the prediction error is formed. Following this, the prediction error is encoded in the prediction error coding block 714. The encoded prediction error information for the current image block is then sent from the prediction error coding block 714 to the multiplexing block 716. The multiplexing block 716 also receives information about the approximated motion vector (in the form of motion coefficients) from the motion field coding block 712, as well as information about the optimal interpolation filter used in the motion-compensated prediction of the current image block from the motion field estimation block 711. According to this embodiment of the present invention, the motion field estimation block 711, on the basis of the result computed by the difference coefficient calculation block 710, transmits the set of difference values 705 indicating the differences between the filter coefficients of the optimal interpolation filter for the current block and the coefficients of the specified base filter 709 stored in the encoder 700. The multiplexing block 716 then generates an encoded bit stream 703 representing the current image block by combining the motion information (motion coefficients), the prediction error data, the difference values of the filter coefficients and possible control information. Each of the different types of information may be entropy-coded before it is included in the bit stream and subsequently transmitted to a corresponding decoder.

In an alternative embodiment, the motion field estimation block 711 sends values 704 indicating the filter coefficients of the optimal interpolation filter to the difference coefficient calculation block 710, which is located between the motion field estimation block 711 and the multiplexing block 716, as shown in Fig. 8b. Based on the base filter 709, the difference coefficient calculation block 710 calculates the difference values 705 and transmits them to the multiplexing block 716.

In another alternative embodiment, the difference coefficient calculation block 710 is located inside the multiplexing block 716. In this case, the filter coefficients 704 of the optimal interpolation filter can be sent directly from the motion field estimation block 711 to the multiplexing block 716, as shown in Fig. 8C.

Fig. 9a is a block diagram of a video decoder 800 implemented according to a preferred embodiment of the present invention and corresponding to the video encoder illustrated in Fig. 8A. The decoder 800 includes a motion-compensated prediction block 721, a prediction error decoding block 722, a demultiplexing block 723 and a frame memory 824. Most of the elements of the decoder 800 function in the same way as the corresponding elements of the prior art decoder (see Fig. 2). However, the decoder 800 according to the present invention, as shown in Fig. 9a, includes a filter reconstruction block 810 that reconstructs the optimal interpolation filter 110 (see Fig. 6A) based on the difference values 130 (Fig. 6b and 6C) and the specified base filter 809. The specified base filter 809 is preferably identical to the base filter 709 (Fig. 8A-8C).

The operation of the decoder 800 will now be described. The demultiplexer 823 receives the encoded bit stream 803, divides this bit stream into its constituent parts (motion coefficients, prediction error data, difference values of the filter coefficients and possible control information), and performs any necessary entropy decoding of the different data types. The demultiplexer 823 sends the prediction error information from the received bit stream 803 to the prediction error decoding block 822.
It also directs the received motion information to the motion-compensated prediction block 821. In this embodiment of the present invention, the demultiplexer 823 directs the received (and entropy-decoded) difference values via signal 802 to the motion-compensated prediction block 821 to allow the filter reconstruction block 810 to reconstruct the optimal interpolation filter 110 (see Fig. 6A) by adding the received difference values to the coefficients of the specified base filter 809 stored in the decoder. The motion-compensated prediction block 821 then uses the optimal interpolation filter, as determined by the reconstructed coefficient values, for building the prediction of the image block currently being decoded. More specifically, the motion-compensated prediction block 821 generates the prediction for the current image block by extracting the pixel values of the reference frame R_n(x,y) stored in the frame memory 824 and interpolating them as necessary, according to the received motion information, to form any required sub-pixel values. The prediction for the current image block is then combined with the corresponding prediction error data to reconstruct the image block in question.

Alternatively, the filter recovery block 810 is located outside the motion-compensated prediction block 821, as shown in Fig. 9b. From the difference values contained in the signal 802 received from the demultiplexer 823, the filter recovery block 810 restores the optimal interpolation filters and sends the restored filter coefficients 805 to the motion-compensated prediction block 821. In yet another alternative embodiment, the filter recovery block 810 resides in the demultiplexer block 823. The demultiplexer block 823 then sends the restored coefficients of the optimal interpolation filter to the motion-compensated prediction block 821.

The encoder according to the present invention encodes the filter coefficients differentially with respect to the coefficients of a given base filter, allowing the decoder to recover the optimal interpolation filter on the basis of the difference values. To obtain good coding performance, the base filter coefficients must be known to both the encoder and the decoder and must be statistically acceptably close to the actual filters used in the sequence. In other words, according to the method of the present invention, a base filter having a specific set of coefficient values is determined, and the differences between the coefficients of the base filter and the coefficients of the actual interpolation filter are then encoded and included in the video bit stream. In this way, the amount of information needed to represent the coefficients of the adaptive interpolation filter in the encoded video bit stream is reduced compared with an approach in which each coefficient of the adaptive filter is encoded separately. If the coefficients of the base filter are sufficiently similar to the actual interpolation filter coefficients, the difference values to be encoded are small. Advantageously, the given base filter is therefore chosen to be statistically similar to the actual interpolation filters, since in that case the difference values are small and a further improvement in coding efficiency is achieved.
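On the encoder side, the corresponding step is an element-wise subtraction: the closer the base filter is to the per-sequence optimal filter, the smaller the differences and the fewer bits they cost after entropy coding. A sketch under the same hypothetical 6-tap assumption as above (illustrative values only, not the codec's actual filters):

```python
def encode_filter_differentially(optimal_coeffs, base_coeffs):
    """Encode interpolation filter coefficients as differences relative
    to a base filter known to both the encoder and the decoder."""
    if len(optimal_coeffs) != len(base_coeffs):
        raise ValueError("coefficient count mismatch")
    return [o - b for o, b in zip(optimal_coeffs, base_coeffs)]

# Hypothetical base filter and per-sequence optimal filter.
base = [1, -5, 20, 20, -5, 1]
optimal = [1, -4, 18, 19, -4, 1]
diffs = encode_filter_differentially(optimal, base)
print(diffs)  # [0, 1, -2, -1, 1, 0] -- small magnitudes, cheap to entropy-code
```

A statistically well-chosen base filter keeps these differences concentrated near zero, which is exactly the regime in which entropy coding pays off.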

In contrast to the differential coding method proposed in the article by *Wedi*, the method according to the present invention retains relatively good error resilience. If an error occurs during transmission of the encoded video bit stream from the encoder to the decoder, only the difference between the base filter and the actual interpolation filter is affected by the error.

Finally, it should be noted that the functional elements of a multimedia terminal, video encoder, video decoder and video codec according to the present invention can be implemented as software, as dedicated hardware, or as a combination of the two. The video encoding and decoding methods according to the invention are particularly suitable for embodiment in the form of a computer program comprising machine-readable instructions for performing the functional steps of the invention. As such, the encoder, decoder and codec according to the invention can be implemented as software code stored on a storage medium and executed in a computer, such as a personal desktop computer, in order to provide that computer with video encoding and/or decoding functionality.

Although the invention has been described in the context of specific embodiments, it will be clear to those skilled in the art that various modifications and changes may be made to these teachings. Thus, although the invention has been specifically shown and described with respect to one or more of its preferred embodiments, those skilled in the art will understand that certain modifications or changes may be made therein without departing from the scope and spirit of the invention as set forth above.

1. A method of encoding images in a digital video sequence to obtain encoded video data, the digital video sequence comprising a sequence of video frames, each frame having a set of pixel values, and an interpolation filter having a set of coefficients represented by a set of coefficient values being used for the restoration of the pixel values in a frame of said digital video sequence from the encoded video data, characterized in that

the coefficient values of the interpolation filter are encoded differentially relative to a given base filter to generate a set of difference values, and

the set of difference values is incorporated into the encoded video data so that restoration of the pixel values is based on the set of difference values.

2. The method according to claim 1, characterized in that the encoded video data is transmitted from an encoder to a decoder, the encoded video data includes coded values expressing the set of difference values, and the set of difference values is entropy coded prior to transmission from the encoder to the decoder.

3. The method according to claim 1, characterized in that the given base filter has a plurality of coefficients with values statistically similar to the coefficient values of the interpolation filter.

4. The method according to claim 1, characterized in that the coefficients of the interpolation filter are chosen for the interpolation of pixel values in a selected segment of the image.

5. The method according to claim 1, characterized in that the given base filter has fixed coefficients.

6. The method according to claim 1, characterized in that the given base filter has a set of coefficients adapted to the statistics of the video sequence.

7. The method according to claim 1, characterized in that the interpolation filter is symmetric, so that only half of the filter coefficients are encoded.

8. The method according to claim 1, characterized in that the coefficient values of the interpolation filter are encoded in a specific order from the first coefficient value to the last coefficient value.

9. The method according to claim 8, characterized in that the specific order in which the coefficient values are encoded differs from the spatial order of said coefficients.

10. The method according to claim 8, characterized in that the sum of the coefficient values of the interpolation filter is fixed.

11. The method according to claim 1, characterized in that the given base filter has a plurality of coefficient values, wherein a constant value is added to the coefficient values of the given base filter in order to reduce the amplitude of the differences between the coefficient values of the interpolation filter and the coefficient values of the given base filter.

12. A video encoder, comprising

means for encoding images in a digital video sequence having a sequence of video frames to obtain encoded video data representing the sequence, each frame of the sequence containing a number of pixel values, and

means for determining an interpolation filter for the restoration of the pixel values in a frame of said digital video sequence in the decoding process, the interpolation filter having a plurality of coefficients represented by a set of coefficient values, characterized in that it comprises

means, responsive to the interpolation filter, for calculating the differences between the coefficient values of the interpolation filter and a given base filter to obtain a set of difference values, and

means for incorporating the set of difference values into the encoded video data so that restoration of the pixel values in the decoding process is based on the set of difference values.

13. The video encoder according to claim 12, characterized in that it further comprises means for entropy encoding the set of difference values before incorporating the set of difference values into the encoded video data.

14. The video encoder according to claim 13, characterized in that the interpolation filter is symmetric, and the entropy encoding means is configured to encode only half of the set of difference values.

15. A method of decoding video data representing a digital video sequence containing a sequence of video frames, each frame of the sequence containing a number of pixel values, wherein an interpolation filter having a set of coefficients represented by a set of coefficient values is used for the restoration of the pixel values in a frame of said digital video sequence, characterized in that

a set of difference values is extracted from the video data, the set of difference values expressing the differences between the coefficient values of the interpolation filter and a given base filter,

an additional filter is created based on the set of difference values and the given base filter, and

the pixel values are restored on the basis of the additional filter.

16. The method according to claim 15, characterized in that the given base filter has a plurality of coefficients represented by a set of coefficient values, and in that

the set of difference values is summed with the coefficient values of the given base filter to create the additional filter.

17. The method according to claim 16, characterized in that the set of difference values is extracted from the video data by entropy decoding.

18. A video decoder, comprising

means for receiving video data in a bit stream, the received video data representing a digital video sequence containing a sequence of video frames, each frame of the sequence containing a number of pixel values, characterized in that it comprises

means for retrieving a set of difference values from the bit stream,

means for creating an interpolation filter based on a given base filter and the set of difference values, and

means for restoring the pixel values in a frame of the video sequence on the basis of the interpolation filter and the received video data.

19. The video decoder according to claim 18, characterized in that the given base filter has a plurality of coefficients represented by coefficient values, said video decoder further comprising

means for summing the set of difference values with the coefficient values of the given base filter to generate the interpolation filter.

20. The video decoder according to claim 18, characterized in that it further comprises means for entropy decoding the set of difference values from the bit stream.

21. A video coding system, comprising

an encoder for encoding images in a digital video sequence having a sequence of video frames to obtain encoded video data in a bit stream representing the video sequence, each frame of the sequence containing a number of pixel values, the encoder comprising means for determining an interpolation filter for the restoration of the pixel values in a frame of said digital video sequence in the decoding process, the interpolation filter having a set of filter coefficients represented by a set of coefficient values, and

a decoder for receiving the encoded video data in the bit stream for restoration of the pixel values in a frame of the sequence in the decoding process, characterized in that

the encoder further comprises

means for calculating the differences between the interpolation filter and a given base filter to obtain a set of difference values, and

means for incorporating the set of difference values into the bit stream, and

the decoder comprises

means for extracting the set of difference values from the bit stream, and

means for creating an additional filter based on the given base filter and the extracted set of difference values, so that restoration of the pixel values in the decoding process is based on the additional filter.

**Same patents:**

FIELD: video decoders; measurement engineering; TV communication.

SUBSTANCE: values of motion vectors of blocks are determined which blocks are adjacent with block where the motion vector should be determined. On the base of determined values of motion vectors of adjacent blocks, the range of search of motion vector for specified block is determined. Complexity of evaluation can be reduced significantly without making efficiency of compression lower.

EFFECT: reduced complexity of determination.

7 cl, 2 dwg

FIELD: video encoding, in particular, methods and devices for ensuring improved encoding and/or prediction methods related to various types of video data.

SUBSTANCE: the method is claimed for usage during encoding of video data in video encoder, containing realization of solution for predicting space/time movement vector for at least one direct mode macro-block in B-image, and signaling of information of space/time movement vector prediction solution for at least one direct mode macro-block in the header, which includes header information for a set of macro-blocks in B-image, where signaling of aforementioned information of space/time movement vector prediction solution in the header transfers a space/time movement vector prediction solution into video decoder for at least one direct mode macro-block in B-image.

EFFECT: creation of improved encoding method, which is capable of supporting newest models and usage modes of bi-directional predictable (B) images in a series of video data with usage of spatial prediction or time distance.

2 cl, 17 dwg

FIELD: movement estimation, in particular, estimation of movement on block basis in video image compression application.

SUBSTANCE: method and device are claimed for conducting search for movement in video encoder system using movement vectors which represent difference between coordinates of macro-block of data in current frame of video data and coordinates of corresponding macro-block of data in standard frame of video data. A set of movement vector prediction parameters is received, where movement vector prediction parameters represent approximations of possible movement vectors for current macro-block, movement vector search pattern is determined and search is conducted around each movement vector prediction parameter from the set of movement vector prediction parameters using search pattern, and on basis of search result, the final movement vector is determined.

EFFECT: increased efficiency of video signals compression.

3 cl, 7 dwg

FIELD: physics.

SUBSTANCE: said utility invention relates to video encoders and, in particular, to the use of adaptive weighing of reference images in video encoders. A video encoder and a method of video signal data processing for an image block and the specific reference image index for predicting this image block are proposed, which use the adaptive weighing of reference images to increase the video signal compression, the encoder having a reference image weighting factor assigning module for the assignment of the weighting factor corresponding to the said specific reference image index.

EFFECT: increased efficiency of reference image predicting.

8 cl, 7 dwg

FIELD: physics, computing.

SUBSTANCE: invention relates to the field of coding and decoding of a moving image. In the method, at least one reference image for the processing of the field macroblock is selected from at least one reference image list, using information about reference image indexes, each at least one reference image selected is a field, and the parity of at least one reference field selected may be based on the parity of the field macroblock and the reference image index information.

EFFECT: efficient provision of information about reference image compensating motion, by reference image indexes determined in different ways, according to the coded macroblock modes.

10 cl, 12 dwg

FIELD: information systems.

SUBSTANCE: invention refers to video coders using adaptive weighing of master images. The video decoder for decoding data from a video signal for an image having multiple motion boxes contains a master image weighting coefficient module for accepting at least one master image index, wherein each of the mentioned master image indexes is intended to independently indicate, without using any other indexes, one of the multiple master images used for prediction of the current motion box and a weighting coefficient from the set of weighting coefficients for the current one of the mentioned multiple motion boxes.

EFFECT: increase of efficiency in predicting master images.

20 cl, 7 dwg

FIELD: information technology.

SUBSTANCE: method is offered to compress digital motion pictures or videosignals on the basis of superabundant basic transformation using modified algorithm of balance search. The algorithm of residual energy segmentation is used to receive an original assessment of high energy areas shape and location in the residual image. The algorithm of gradual removal is used to decrease the number of balance assessments during the process of balance search. The algorithm of residual energy segmentation and algorithm of gradual removal increase encoding speed to find a balanced basis from the previously defined dictionary of the superabundant basis. The three parameters of the balanced combination form an image element, which is defined by the dictionary index and the status of the basis selected, as well as scalar product of selected basic combination and the residual signal.

EFFECT: creation of simple, yet effective method and device to perform frame-accurate encoding of residual movement on the basis of superabundant basic transformation for video compressing.

10 cl, 15 dwg

FIELD: information technology.

SUBSTANCE: playback with variable speed is performed without picture quality deterioration. Controller 425 creates EP_map () with RAPI address in videoclip information file, dedicated information selection module 423 RAPI, image PTS with internal encoding, which is immediately preceded by RAPI, one of final positions of the picture with internal encoding, as well as the second, the third and the fourth reference pictures, which are preceded by the picture with internal encoding. The controller saves EP_map () in output server 426, i.e. controller 425 copies the value, close to given number of sectors (quantity of sectors, which can be read at one time during encoding process) of final positions for the four reference pictures (1stRef_picture, 2ndRef_picture, 3rdRef_picture and 4thRef_picture) to N-th_Ref_picture_copy, defines value of index_minus 1 on the basis of N-th_Ref_picture_copy and records it to disc.

EFFECT: effective process performance with constant data reading time.

8 cl, 68 dwg

FIELD: information technology.

SUBSTANCE: invention proposed contains videodecoder (200) and corresponding methods of videosignal data processing for image block with two reference frames' indices to predict this image block. The methods use latent scaling of reference images to improve video compressing. The decoder (200) contains latent scaling coefficient module (280) of reference images, which are used to determine a scaling coefficient value, corresponding to each of the reference image indices. Decoding operations contain receiving reference image indices with data, which corresponds to image block, calculation of latent scaling coefficient in response to image block location relative to reference images, indicated by each index of reference image, extraction of reference image for each of the indices, motion compensation relative to extracted reference image and multiplication of reference images, relative to which the motion compensation was performed, to a corresponding scaling value.

EFFECT: increase of decoding efficiency.

25 cl, 6 dwg