Method for interpolation of sub-pixel values

FIELD: method for interpolating values of sub-pixels during encoding and decoding of data.

SUBSTANCE: an interpolation method for use in video data encoding is claimed, in which an image comprising pixels arranged in rows and columns and represented by values having a given dynamic range, the pixels in the rows being at integer horizontal positions and the pixels in the columns being at integer vertical positions, is interpolated so as to generate values for sub-pixels at fractional horizontal and vertical positions, the method comprising the following steps: a) when values are required for sub-pixels at half-integer horizontal and integer vertical positions, and at integer horizontal and half-integer vertical positions, such values are interpolated directly using weighted sums of pixels located at integer horizontal and integer vertical positions; b) when values are required for sub-pixels at half-integer horizontal and half-integer vertical positions, such values are interpolated directly using a weighted sum of values of sub-pixels located at half-integer horizontal and integer vertical positions, computed in accordance with step a); and c) when a value is required for a sub-pixel at a quarter-integer horizontal and quarter-integer vertical position, such a value is interpolated by averaging at least one of a first pair of values, comprising a sub-pixel located at a half-integer horizontal and integer vertical position and a sub-pixel located at an integer horizontal and half-integer vertical position, and a second pair of values, comprising a pixel located at an integer horizontal and integer vertical position and a sub-pixel located at a half-integer horizontal and half-integer vertical position.

EFFECT: provision of an improved method for interpolating sub-pixel values in the encoding and decoding of data.

13 cl, 26 dwg, 2 tbl

 

The present invention relates to a method for interpolating sub-pixel values in the encoding and decoding of data. In particular, it relates to the encoding and decoding of digital video, but is not limited thereto.

Prior art

Digital video sequences, like ordinary motion pictures recorded on film, comprise a sequence of still images, the illusion of motion being created by displaying the images one after the other at a relatively fast rate, typically 15 to 30 frames per second. Because of the relatively fast frame rate, images in consecutive frames tend to be quite similar and thus contain a considerable amount of redundant information. For example, a typical scene may comprise some stationary elements, such as background scenery, and some moving areas, which may take many different forms, for example the face of a newsreader, moving traffic and so on. Alternatively, the camera recording the scene may itself be moving, in which case all elements of the image have the same kind of motion. In many cases, this means that the overall change from one video frame to the next is rather small. Of course, this depends on the nature of the movement: the faster the movement, the greater the change from one frame to the next. Similarly, if a scene contains a number of moving elements, the change from one frame to the next is greater than in a scene where only one element is moving.

It should be appreciated that each frame of raw, that is uncompressed, digital video comprises a very large amount of image information. Each frame of an uncompressed digital video sequence is formed from an array of image pixels. For example, in a commonly used digital video format known as the Quarter Common Intermediate Format (QCIF), a frame comprises an array of 176×144 pixels, in which case each frame has 25344 pixels. In turn, each pixel is represented by a certain number of bits, which carry information about the luminance and/or color content of the region of the image corresponding to the pixel. Commonly, a so-called YUV color model (a combination of a luminance signal Y and two color-difference signals U and V) is used to represent the luminance and chrominance content of the image. The luminance, or Y, component represents the intensity (brightness) of the image, while the color content of the image is represented by the two chrominance components, denoted U and V.

Color models based on a luminance/chrominance representation of image content provide certain advantages compared with color models based on primary colors (that is, red, green and blue, RGB). The human visual system is more sensitive to intensity variations than to color variations, and the YUV color model exploits this property by using a lower spatial resolution for the chrominance components (U, V) than for the luminance component (Y). In this way, the amount of information needed to code the color information in an image can be reduced with an acceptable reduction in image quality.

The lower spatial resolution of the chrominance components is usually achieved by sub-sampling. Typically, a block of 16×16 pixels in the image is represented by one block of 16×16 pixels comprising luminance information, and each of the corresponding chrominance components is represented by one block of 8×8 pixels covering an area of the image equivalent to that of the 16×16 pixel block of the luminance component. The chrominance components are thus spatially sub-sampled by a factor of 2 in the x and y directions. The combination of one 16×16 pixel luminance block and the two 8×8 pixel chrominance blocks is commonly referred to as a YUV macroblock, or macroblock for short.

A QCIF image comprises 11×9 macroblocks. If the luminance blocks and chrominance blocks are represented with 8-bit resolution (that is, by numbers in the range 0 to 255), the total number of bits required per macroblock is (16×16×8)+2×(8×8×8)=3072 bits. The number of bits needed to represent a video frame in QCIF format is thus 99×3072=304128 bits. This means that the amount of data required to transmit/record/display an uncompressed video sequence in QCIF format, represented using the YUV color model, at 30 frames per second, is more than 9 Mbit/s. This is an extremely high data rate and is impractical for use in video recording, transmission and display applications because of the very large storage capacity, transmission channel capacity and hardware performance required.
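The figures quoted above can be checked with a short calculation. The following Python sketch simply reproduces the arithmetic; the variable names are illustrative and do not appear in the QCIF specification.

# Arithmetic check of the uncompressed QCIF bit rate quoted above.
bits_per_sample = 8
luma_bits = 16 * 16 * bits_per_sample            # one 16x16 luminance block
chroma_bits = 2 * (8 * 8 * bits_per_sample)      # two 8x8 chrominance blocks
macroblock_bits = luma_bits + chroma_bits        # 3072 bits per macroblock
macroblocks_per_frame = 11 * 9                   # QCIF: 176x144 pixels = 99 macroblocks
frame_bits = macroblocks_per_frame * macroblock_bits   # 304128 bits per frame
frames_per_second = 30
bitrate = frame_bits * frames_per_second         # about 9.1 Mbit/s uncompressed
print(macroblock_bits, frame_bits, bitrate)      # 3072 304128 9123840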

If video data is to be transmitted in real time over a fixed-line network such as an Integrated Services Digital Network (ISDN) or a conventional Public Switched Telephone Network (PSTN), the available data transmission bandwidth is typically of the order of 64 kbit/s. In mobile videotelephony, where transmission takes place at least in part over a radio communications link, the available bandwidth can be as low as 20 kbit/s. This means that a significant reduction in the amount of information used to represent video data must be achieved in order to enable transmission of digital video sequences over low-bandwidth communication networks. For this reason, video compression techniques have been developed which reduce the amount of information transmitted while retaining an acceptable image quality.

Video compression methods are based on reducing the redundant and perceptually irrelevant parts of video sequences. The redundancy in video sequences can be categorised into spatial, temporal and spectral redundancy. The term "spatial redundancy" is used to describe the correlation between neighbouring pixels within a frame. The term "temporal redundancy" expresses the fact that objects appearing in one frame of a sequence are likely to appear in subsequent frames, while "spectral redundancy" refers to the correlation between different color components of the same image.

Sufficiently efficient compression cannot usually be achieved simply by reducing the various forms of redundancy in a sequence of images. Thus, most current video encoders also reduce the quality of those parts of the video sequence that are subjectively the least important. In addition, the redundancy of the compressed video bit-stream itself is reduced by means of efficient lossless encoding. This is usually accomplished using a technique known as variable length coding (VLC).

Modern video coding standards, such as ITU-T Recommendations H.261, H.263(+)(++) and H.26L, and the Moving Picture Experts Group standard MPEG-4, make use of "motion compensated temporal prediction". This is a form of temporal redundancy reduction in which the content of some (often many) frames in a video sequence is "predicted" from other frames in the sequence by tracking the motion of objects or regions of the image between frames.

Compressed images which do not make use of temporal redundancy reduction are usually called INTRA-coded or I-frames, whereas temporally predicted images are called INTER-coded or P-frames. In the case of an INTER frame, the predicted (motion-compensated) image is rarely precise enough to represent the image content with sufficient quality, and therefore a spatially compressed prediction error (PE) frame is associated with each INTER frame. Many video compression schemes can also make use of bidirectionally predicted frames, commonly referred to as B-pictures or B-frames. B-pictures are inserted between pairs of reference or so-called anchor pictures (I- or P-frames) and are predicted from either one or both of the anchor pictures. B-pictures are not themselves used as anchor pictures, that is, no other frames are predicted from them, and so they can be discarded from the sequence without causing any degradation in the quality of future pictures.

The different types of frame that occur in a typical compressed video sequence are shown in Figure 3 of the accompanying drawings. As can be seen from the drawing, the sequence starts with an INTRA or I-frame 30. In Figure 3, arrows 33 denote the "forward" prediction process by which P-frames (denoted 34) are formed. The bidirectional prediction process by which B-frames (36) are formed is shown by arrows 31a and 31b, respectively.

Figures 1 and 2 show a schematic diagram of an example video coding system using motion compensated prediction. Figure 1 shows an encoder 10 employing motion compensation, and Figure 2 shows the corresponding decoder 20. The encoder 10 shown in Figure 1 comprises a motion field estimation block 11, a motion field coding block 12, a motion compensated prediction block 13, a prediction error coding block 14, a prediction error decoding block 15, a multiplexing block 16, a frame memory 17 and an adder 19. The decoder 20 comprises a motion compensated prediction block 21, a prediction error decoding block 22, a demultiplexing block 23 and a frame memory 24.

The operating principle of video encoders using motion compensation is to minimise the amount of information in the prediction error frame En(x,y), which is the difference between the current frame In(x,y) being coded and the prediction frame Pn(x,y). The prediction error frame is thus:

En(x,y) = In(x,y) - Pn(x,y) (1)

The prediction frame Pn(x,y) is constructed using the pixel values of a reference frame Rn(x,y), which is generally one of the previously coded and transmitted frames, for example the frame immediately preceding the current frame, and which is available from the frame memory 17 of the encoder 10. More specifically, the prediction frame Pn(x,y) is constructed by finding so-called prediction pixels in the reference frame Rn(x,y) which correspond substantially to pixels in the current frame. Motion information, describing the relationship (for example relative location, rotation, scale, etc.) between pixels in the current frame and their corresponding prediction pixels in the reference frame, is derived, and the prediction frame is constructed by moving the prediction pixels according to the motion information. In this way, the prediction frame is constructed as an approximate representation of the current frame, using pixel values in the reference frame. The prediction error frame referred to above therefore represents the difference between the approximate representation of the current frame provided by the prediction frame and the current frame itself. The basic advantage provided by video encoders which use motion compensated prediction arises from the fact that a comparatively compact description of the current frame can be obtained by representing it in terms of the motion information required to form its prediction, together with the associated prediction error information in the prediction error frame.
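By way of illustration only, the following Python sketch shows the relationship of equation (1) for a single macroblock, assuming one translational motion vector per macroblock and ignoring frame-boundary handling. The function names and the use of NumPy are assumptions made for the example and are not part of any standard described here.

import numpy as np

def predict_macroblock(reference, x, y, dx, dy, n=16):
    # Copy the N x N block that the motion vector points to in the reference frame.
    return reference[y + dy:y + dy + n, x + dx:x + dx + n]

def prediction_error(current, reference, x, y, dx, dy, n=16):
    # Prediction error for one macroblock: current block minus its prediction.
    current_block = current[y:y + n, x:x + n].astype(np.int16)
    predicted = predict_macroblock(reference, x, y, dx, dy, n).astype(np.int16)
    return current_block - predicted            # E_n(x, y) = I_n(x, y) - P_n(x, y)

# Example with a random 8-bit reference frame and a current frame that is a
# shifted copy of it; the motion vector (dx, dy) = (2, -1) gives a perfect match.
rng = np.random.default_rng(0)
ref = rng.integers(0, 256, (144, 176), dtype=np.uint8)
cur = np.roll(ref, shift=(1, -2), axis=(0, 1))
err = prediction_error(cur, ref, x=32, y=32, dx=2, dy=-1)
print(err.shape, int(np.abs(err).sum()))         # (16, 16) 0 for a perfect match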

However, because of the very large number of pixels in a frame, it is generally not efficient to transmit separate motion information for each pixel to the decoder. Instead, in most video coding schemes, the current frame is divided into larger image segments Sk, and motion information relating to the segments is transmitted to the decoder. For example, motion information is typically provided for each macroblock of a frame, and the same motion information is then used for all pixels within the macroblock. In some video coding standards, such as H.26L, a macroblock can be divided into smaller blocks, each smaller block being provided with its own motion information.

The motion information usually takes the form of motion vectors [Δx(x,y), Δy(x,y)]. The pair of numbers Δx(x,y) and Δy(x,y) represents the horizontal and vertical displacement of a pixel at location (x,y) in the current frame In(x,y) with respect to a pixel in the reference frame Rn(x,y). The motion vectors [Δx(x,y), Δy(x,y)] are calculated in the motion field estimation block 11, and the set of motion vectors of the current frame [Δx(·), Δy(·)] is referred to as the motion vector field.

Typically, the location of a macroblock in the current video frame is specified by the (x,y) coordinates of its upper left-hand corner. Thus, in a video coding scheme in which motion information is associated with each macroblock of a frame, each motion vector describes the horizontal and vertical displacement Δx(x,y) and Δy(x,y) of a pixel representing the upper left-hand corner of a macroblock in the current frame In(x,y) with respect to a pixel in the upper left-hand corner of the substantially corresponding block of prediction pixels in the reference frame Rn(x,y) (as shown in Figure 4b).

Motion estimation is a computationally intensive task. Given a reference frame Rn(x,y) and, for example, a square macroblock comprising N×N pixels in the current frame (as shown in Figure 4a), the objective of motion estimation is to find a block of N×N pixels in the reference frame that matches the characteristics of the macroblock in the current image according to some criterion. This criterion can be, for example, the sum of absolute differences (SAD) between the pixels of the macroblock in the current frame and the block of pixels in the reference frame with which it is compared. This process is known generally as "block matching". It should be noted that, in general, the geometry of the block to be matched and that of the block in the reference frame do not have to be the same, as real-world objects can undergo changes of scale, as well as rotation and warping. However, in current international video coding standards only a translational motion model is used (see below), and therefore a fixed rectangular geometry is sufficient.

Ideally, in order to achieve the best chance of finding a match, the whole of the reference frame should be searched. However, this is impractical because it imposes too high a computational burden on the video encoder. Instead, the search region is restricted to a region [-R, R] around the original location of the macroblock in the current frame, as shown in Figure 4c.
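A minimal sketch of full-pixel block matching with the SAD criterion and a search region restricted to [-R, R], as described above, is given below. The brute-force search strategy and all names are illustrative; practical encoders use faster search strategies.

import numpy as np

def sad(block_a, block_b):
    # Sum of absolute differences between two equally sized blocks.
    return int(np.abs(block_a.astype(np.int32) - block_b.astype(np.int32)).sum())

def full_pixel_motion_search(current, reference, x, y, n=16, search_range=7):
    # Return the full-pixel motion vector (dx, dy) minimising SAD for one block.
    height, width = reference.shape
    target = current[y:y + n, x:x + n]
    best = (0, 0, sad(target, reference[y:y + n, x:x + n]))
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            yy, xx = y + dy, x + dx
            if 0 <= yy <= height - n and 0 <= xx <= width - n:
                cost = sad(target, reference[yy:yy + n, xx:xx + n])
                if cost < best[2]:
                    best = (dx, dy, cost)
    return best

rng = np.random.default_rng(1)
ref = rng.integers(0, 256, (144, 176), dtype=np.uint8)
cur = np.roll(ref, shift=(3, -4), axis=(0, 1))         # scene shifted down 3, left 4
print(full_pixel_motion_search(cur, ref, x=64, y=64))  # (4, -3, 0)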

In order to reduce the amount of motion information to be transmitted from the encoder 10 to the decoder 20, the motion vector field is coded in the motion field coding block 12 of the encoder 10 by representing it with a motion model. In this process, the motion vectors of the image segments are re-expressed using certain predetermined functions or, in other words, the motion vector field is represented with a model. Almost all currently used motion vector field models are additive motion models, satisfying the following general formulas:

Δx(x,y) = Σi ai·fi(x,y) (2)

Δy(x,y) = Σi bi·gi(x,y) (3)

where the coefficients ai and bi are called motion coefficients. The motion coefficients are transmitted to the decoder 20 (information stream 2 in Figures 1 and 2). The functions fi and gi are called motion field basis functions and are known to both the encoder and the decoder. An approximate motion vector field can be constructed using the coefficients and the basis functions. Since the basis functions are known to (that is, stored in) both the encoder 10 and the decoder 20, only the motion coefficients need to be transmitted, thereby reducing the amount of information required to represent the motion information of the frame.

The simplest motion model is the translational motion model, which requires only two coefficients to describe the motion vectors of each segment. The values of the motion vectors are given by:

Δx(x,y) = a0 and Δy(x,y) = b0 (4)

This model is widely used in various international standards (ISO MPEG-1, MPEG-2, MPEG-4, ITU-T Recommendations H.261 and H.263) to describe the motion of 16×16 and 8×8 pixel blocks. Systems which use the translational motion model typically perform motion estimation at full-pixel resolution or at some integer fraction of full-pixel resolution, for example at half- or quarter-pixel resolution.
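The following sketch illustrates the additive motion model of equations (2) and (3) and its translational special case (4): the motion vector field is a weighted sum of basis functions known to both encoder and decoder, and only the motion coefficients need to be transmitted. The function names and the affine example are illustrative assumptions.

def motion_vector(x, y, a, b, basis_f, basis_g):
    # Evaluate [dx, dy] at (x, y) from motion coefficients and basis functions.
    dx = sum(ai * fi(x, y) for ai, fi in zip(a, basis_f))
    dy = sum(bi * gi(x, y) for bi, gi in zip(b, basis_g))
    return dx, dy

# Translational model (equation (4)): f0 = g0 = 1, so dx = a0 and dy = b0 everywhere.
constant = [lambda x, y: 1.0]
print(motion_vector(10, 20, a=[2.5], b=[-1.0], basis_f=constant, basis_g=constant))

# A richer (affine) model simply uses more basis functions, for example [1, x, y].
affine = [lambda x, y: 1.0, lambda x, y: float(x), lambda x, y: float(y)]
print(motion_vector(10, 20, a=[1.0, 0.1, 0.0], b=[0.0, 0.0, 0.1],
                    basis_f=affine, basis_g=affine))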

The prediction frame Pn(x,y) is constructed in the motion compensated prediction block 13 of the encoder 10 and is given by:

Pn(x,y) = Rn(x + Δx(x,y), y + Δy(x,y)) (5)

In the prediction error coding block 14, the prediction error frame En(x,y) is typically compressed by representing it as a finite series (transform) of some two-dimensional functions. For example, a two-dimensional Discrete Cosine Transform (DCT) can be used. The transform coefficients are quantized and entropy coded (for example by Huffman coding) before being transmitted to the decoder (information stream 1 in Figures 1 and 2). Because of the error introduced by quantization, this operation usually produces some degradation (loss of information) in the prediction error frame En(x,y). To compensate for this degradation, the encoder 10 also comprises a prediction error decoding block 15, in which a decoded prediction error frame Ẽn(x,y) is constructed using the transform coefficients. This locally decoded prediction error frame is added to the prediction frame Pn(x,y) in the adder 19, and the resulting decoded current frame Ĩn(x,y) is stored in the frame memory 17 for further use as the next reference frame Rn+1(x,y).

Information stream 2, carrying information about the motion vectors, is combined with information about the prediction error in the multiplexer 16, and an information stream 3, typically comprising at least these two types of information, is sent to the decoder 20.

The operation of the corresponding decoder 20 will now be described.

The frame memory 24 of the decoder 20 holds a previously reconstructed reference frame Rn(x,y). The prediction frame Pn(x,y) is constructed in the motion compensated prediction block 21 of the decoder 20 according to equation 5, using the received motion coefficient information and the pixel values of the previously reconstructed reference frame Rn(x,y). The transmitted transform coefficients of the prediction error frame En(x,y) are used in the prediction error decoding block 22 to construct the decoded prediction error frame Ẽn(x,y).

The pixels of the decoded current frame Ĩn(x,y) are then reconstructed by adding the prediction frame Pn(x,y) and the decoded prediction error frame Ẽn(x,y):

Ĩn(x,y) = Pn(x,y) + Ẽn(x,y) (6)

This decoded current frame may be stored in the frame memory 24 as the next reference frame Rn+1(x,y).

In the scheme for encoding and decoding digital video with motion compensation described above, the motion vector [Δx(x,y), Δy(x,y)] describing the motion of a macroblock in the current frame with respect to the reference frame Rn(x,y) can point to any of the pixels in the reference frame. This means that motion between frames of a digital video sequence can only be represented at a resolution determined by the image pixels in the frame (so-called full-pixel resolution). Real motion, however, has arbitrary precision, and thus the system described above can only provide an approximate modelling of the motion between successive frames of a digital video sequence. Typically, modelling of motion between video frames with full-pixel resolution is not sufficiently accurate to allow efficient minimisation of the prediction error (PE) information associated with each macroblock/block. Therefore, to enable more accurate modelling of real motion and to help reduce the amount of PE information that must be transmitted from encoder to decoder, many video coding standards, such as H.263(+)(++) and H.26L, allow motion vectors to point "between" image pixels. In other words, motion vectors can have sub-pixel resolution. Allowing motion vectors to have sub-pixel resolution adds to the complexity of the encoding and decoding operations that must be performed, so it is still advantageous to limit the degree of spatial resolution a motion vector may have. Thus, video coding standards such as those mentioned above typically only allow motion vectors to have full-pixel, half-pixel (double resolution) or quarter-pixel (quadruple resolution) resolution.

Motion estimation with sub-pixel resolution is usually performed as a two-stage process, as shown in Figure 5, for a video coding scheme which allows motion vectors to have full- or half-pixel resolution. In the first stage, a motion vector having full-pixel resolution is determined using any appropriate motion estimation scheme, such as the block matching process described above. The resulting motion vector, having full-pixel resolution, is shown in Figure 5.

In the second stage, the motion vector found in the first stage is refined to obtain the desired half-pixel resolution. In the example shown in Figure 5, this is done by forming eight new 16×16 pixel search blocks, the location of the upper left-hand corner of each block being marked with an X in Figure 5. These locations are denoted [Δx+m/2, Δy+n/2], where m and n can take the values -1, 0 and +1, but cannot both be zero at the same time. As only the values of the original image pixels are known, the values (for example luminance and/or chrominance values) of the sub-pixels residing at half-pixel locations must be estimated for each of the eight new search blocks, using some form of interpolation scheme.

Having interpolated the sub-pixel values with half-pixel resolution, each of the eight search blocks is compared with the macroblock whose motion vector is being sought. As in the block matching process performed in order to determine the motion vector with full-pixel resolution, the macroblock is compared with each of the eight search blocks according to some criterion, for example the sum of absolute differences (SAD). As a result of the comparisons, a minimum SAD value is generally obtained. Depending on the nature of the motion in the video sequence, this minimum may correspond to the location specified by the original motion vector (with full-pixel resolution), or it may correspond to a location having half-pixel resolution. Thus, it is possible to determine whether the motion vector should point to a full-pixel or a sub-pixel location and, if sub-pixel resolution is appropriate, to determine the correct motion vector with sub-pixel resolution. It should be appreciated that the scheme just described can be extended to other sub-pixel resolutions (for example quarter-pixel resolution) in an entirely analogous way.
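A sketch of the two-stage refinement just described is given below: starting from a full-pixel motion vector, the eight neighbouring half-pixel offsets are evaluated and the candidate with the smallest SAD is retained. For simplicity the sketch uses bilinear averaging as the "form of interpolation scheme"; the test models discussed later use 6-tap filters, and all names are illustrative.

import numpy as np

def sad(a, b):
    return int(np.abs(a.astype(np.int32) - b.astype(np.int32)).sum())

def sample_half_pel(reference, y2, x2, n=16):
    # Extract an N x N block whose top-left corner is at half-pixel units (y2, x2),
    # bilinearly averaging the four surrounding full pixels (weights 0 or 1/2).
    y0, x0 = y2 // 2, x2 // 2
    fy, fx = y2 % 2, x2 % 2
    r = reference.astype(np.int32)
    block = r[y0:y0 + n + 1, x0:x0 + n + 1]
    top = block[:-1, :-1] * (2 - fx) + block[:-1, 1:] * fx
    bot = block[1:, :-1] * (2 - fx) + block[1:, 1:] * fx
    return (top * (2 - fy) + bot * fy + 2) // 4

def refine_to_half_pel(current, reference, x, y, dx, dy, n=16):
    target = current[y:y + n, x:x + n]
    best = (2 * dx, 2 * dy, sad(target, sample_half_pel(reference, 2 * (y + dy), 2 * (x + dx), n)))
    for m in (-1, 0, 1):
        for k in (-1, 0, 1):
            if m == 0 and k == 0:
                continue
            cost = sad(target, sample_half_pel(reference, 2 * (y + dy) + m, 2 * (x + dx) + k, n))
            if cost < best[2]:
                best = (2 * dx + k, 2 * dy + m, cost)
    return best   # motion vector in half-pixel units, plus its SAD

rng = np.random.default_rng(2)
ref = rng.integers(0, 256, (64, 64), dtype=np.uint8)
cur = np.roll(ref, shift=(0, -1), axis=(0, 1))
print(refine_to_half_pel(cur, ref, x=16, y=16, dx=1, dy=0))   # stays at (2, 0, 0)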

In practice, the estimation of a sub-pixel value in the reference frame is performed by interpolating the value of the sub-pixel from the values of surrounding pixels. In general, the interpolation of a value F(x,y) of a sub-pixel located at a non-integer location (x,y) = (n+Δx, m+Δy) can be formulated as a two-dimensional operation, represented mathematically as:

F(x,y) = Σk Σl f(k,l)·R(n+k, m+l) (7)

where f(k,l) are the filter coefficients and n and m are obtained by truncating x and y, respectively, to integer values. Typically the filter coefficients depend on the values of Δx and Δy, and the interpolation filters are usually so-called separable filters, in which case the sub-pixel value F(x,y) can be calculated as:

F(x,y) = Σk f(Δx,k)·[ Σl f(Δy,l)·R(n+k, m+l) ] (8)
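The following sketch illustrates equations (7) and (8): the sub-pixel value F(x,y) is a weighted sum of surrounding pixels, and with separable filters the two-dimensional operation splits into a horizontal and a vertical one-dimensional pass. The two-tap (bilinear) filter used here is purely illustrative and is not the filter of any test model.

import numpy as np

def interpolate_separable(ref, x, y, taps_for_fraction):
    # F(x, y) for non-integer (x, y) using separable 1D filters.
    n, m = int(np.floor(x)), int(np.floor(y))          # truncate to integers
    fx, fy = x - n, y - m
    hx = taps_for_fraction(fx)                         # horizontal filter coefficients
    hy = taps_for_fraction(fy)                         # vertical filter coefficients
    acc = 0.0
    for l, wl in enumerate(hy):
        row = 0.0
        for k, wk in enumerate(hx):
            row += wk * ref[m + l, n + k]              # inner (horizontal) sum
        acc += wl * row                                # outer (vertical) sum
    return acc

def bilinear_taps(frac):
    # Two-tap filter whose coefficients depend on the fractional offset.
    return [1.0 - frac, frac]

ref = np.arange(25, dtype=float).reshape(5, 5)
print(interpolate_separable(ref, 1.5, 2.25, bilinear_taps))   # 12.75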

The motion vectors are calculated in the encoder. Once the corresponding motion coefficients have been transmitted to the decoder, it is a straightforward matter to interpolate the required sub-pixels using an interpolation method identical to that used in the encoder. In this way, a frame following the reference frame in the frame memory 24 can be reconstructed from the reference frame and the motion vectors.

The most straightforward way of applying sub-pixel value interpolation in a video encoder is to interpolate each sub-pixel value every time it is needed. However, this is an inefficient solution for the video encoder, since it is likely that the same sub-pixel value will be needed several times, and thus the calculation of the same sub-pixel value by interpolation will be performed many times. This results in an unnecessary increase in the computational complexity/burden of the encoder.

An alternative approach, which limits the complexity of the encoder, is to pre-compute all the sub-pixel values and store them in a memory associated with the encoder. This solution is referred to later in this document as "pre-interpolation". While limiting complexity, pre-interpolation has the disadvantage of significantly increasing memory usage. For example, if the motion vector accuracy is a quarter pixel in the horizontal and vertical directions, storing pre-computed sub-pixel values for a whole image results in a memory usage 16 times greater than that required to store the original, non-interpolated image. In addition, it involves the calculation of some sub-pixel values that may not actually be needed when calculating the motion vectors in the encoder. Pre-interpolation is also very inefficient in the decoder, since most of the pre-interpolated sub-pixel values will never be needed by the decoder. Thus, it is preferable not to use pre-computation in the decoder.

To reduce the memory requirements of the encoder, so-called on-demand interpolation can be used. For example, if the desired motion vector resolution is a quarter pixel, only the sub-pixel values at half-pixel resolution are interpolated before the whole frame is stored in memory. Values of sub-pixels at quarter-pixel resolution are calculated only during the motion estimation/compensation process, as and when they are needed. In this case, the memory used is only 4 times greater than that required to store the original, non-interpolated image.

It should be noted that when pre-interpolation is used, the interpolation process constitutes only a small part of the overall computational complexity/burden of the encoder, since each sub-pixel is interpolated only once. Therefore, in an encoder using pre-interpolation of sub-pixel values, the complexity of the interpolation process itself is not particularly critical. On the other hand, on-demand interpolation imposes a significant computational burden on the encoder, since sub-pixels may be interpolated many times. Therefore, the complexity of the interpolation process, which can be thought of in terms of the number of computational operations or cycles that must be performed in order to interpolate the sub-pixel values, becomes an important consideration.

In the decoder, the same sub-pixel values are in most cases used only a small number of times, and in some cases not at all. Therefore, in the decoder it is preferable not to use pre-interpolation, that is, it is preferable not to pre-compute any of the sub-pixel values.

As part of the work ongoing in the ITU Telecommunication Standardization Sector, Study Group 16, Video Coding Experts Group (VCEG), Questions 6 and 15, two interpolation schemes have been developed. These approaches were proposed for inclusion in ITU-T Recommendation H.26L and were implemented in test models (TML) for evaluation and further development. The test model corresponding to Question 15 is referred to as Test Model 5 (TML5), while that resulting from the study of Question 6 is known as Test Model 6 (TML6). The interpolation schemes proposed in TML5 and TML6 will now be described.

Throughout the description of the sub-pixel value interpolation scheme used in test model TML5, reference will be made to Figure 12a, which defines the nomenclature used to describe the pixel and sub-pixel locations specific to TML5. A separate nomenclature, defined in Figure 13a, will be used in the discussion of the sub-pixel value interpolation scheme used in TML6. A further separate nomenclature, shown in Figure 14a, will be used later in the text in connection with the method of sub-pixel value interpolation according to the invention. It should be appreciated that the three different nomenclatures used in the text are intended to ease understanding of each interpolation method and to highlight the differences between them. However, in all three drawings the letter A is used to denote original image pixels (full-pixel resolution). Specifically, the letter A represents pixel locations in the image data representing a frame of the video sequence, the pixel values A either being received as part of the current frame In(x,y) from a video source, or being reconstructed and stored as a reference frame Rn(x,y) in the frame memory 17, 24 of the encoder 10 or decoder 20. All other letters represent sub-pixel locations, the values of the sub-pixels residing at the sub-pixel locations being obtained by interpolation.

Certain other terms will also be used in a consistent manner throughout the text to identify specific pixel and sub-pixel locations. These terms are as follows:

The term "integer horizontal location" is used to describe the location of any sub-pixel that lies in a column of the original image data. Sub-pixels c and e in Figures 12a and 13a, as well as sub-pixels b and e in Figure 14a, have integer horizontal locations.

The term "integer vertical location" is used to describe the location of any sub-pixel that lies in a row of the original image data. Sub-pixels b and d in Figures 12a and 13a, as well as sub-pixels b and d in Figure 14a, have integer vertical locations.

By definition, pixels A have both integer horizontal and integer vertical locations.

The term "half-integer horizontal location" is used to describe the location of any sub-pixel that lies in a column at half-pixel resolution. Sub-pixels b, c and e in Figures 12a and 13a fall into this category, as do sub-pixels b, c and f in Figure 14a. Similarly, the term "half-integer vertical location" is used to describe the location of any sub-pixel that lies in a row at half-pixel resolution, for example sub-pixels c and d in Figures 12a and 13a, as well as sub-pixels b, c and g in Figure 14a.

Further, the term "quarter-integer horizontal location" refers to any sub-pixel that lies in a column at quarter-pixel resolution, for example sub-pixels d and e in Figure 12a, sub-pixels d and g in Figure 13a and sub-pixels d, g and h in Figure 14a. Similarly, the term "quarter-integer vertical location" refers to any sub-pixel that lies in a row at quarter-pixel resolution. In Figure 12a, the sub-pixels e fall into this category, as do sub-pixels e, f and g in Figure 13a and sub-pixels e, f and h in Figure 14a.

The definition of each of the above terms is illustrated by the "lines" drawn in the relevant figures.

It should further be noted that it is often convenient to refer to a particular pixel by a two-dimensional designation. In this case, a suitable two-dimensional designation can be obtained by examining the mutual intersection of the lines in Figures 12a, 13a and 14a. Applying this principle, sub-pixel d, for example, has a half-integer horizontal and a half-integer vertical location, and sub-pixel e has an integer horizontal and a quarter-integer vertical location. In addition, for ease of reference, sub-pixels that lie at half-integer horizontal and integer vertical locations, at integer horizontal and half-integer vertical locations, and at half-integer horizontal and half-integer vertical locations will be referred to as sub-pixels at resolution 1/2. Sub-pixels that lie at any quarter-integer horizontal location and/or quarter-integer vertical location will be referred to as sub-pixels at resolution 1/4.

It should also be noted that, both in the descriptions of the two test models and in the detailed description of the invention, it is assumed that the pixel values have a minimum value of 0 and a maximum value of 2^n - 1, where n is the number of bits reserved for a pixel value. The number of bits is typically 8. After sub-pixel interpolation, if the value of an interpolated sub-pixel falls outside this range, it is restricted to the range [0, 2^n - 1], that is, values smaller than the minimum allowed value are set to the minimum value (0) and values greater than the maximum are set to the maximum value (2^n - 1). This operation is referred to as clipping.
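A minimal sketch of the clipping operation, assuming n-bit pixel values as stated above:

def clip(value, n_bits=8):
    # Restrict an interpolated value to the allowed range [0, 2^n - 1].
    max_value = 2 ** n_bits - 1            # 255 for 8-bit pixels
    return min(max(value, 0), max_value)

print(clip(-7), clip(123), clip(300))      # 0 123 255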

The sub-pixel value interpolation scheme according to TML5 will now be described in detail, with reference to Figures 12a, 12b and 12c.

1. The value of the sub-pixel at a half-integer horizontal and integer vertical location, that is sub-pixel b at resolution 1/2 in Figure 12a, is calculated using a 6-tap filter. The filter interpolates the value of the resolution 1/2 sub-pixel b on the basis of the values of the 6 pixels (A1 to A6) located in a row at integer horizontal and integer vertical locations, symmetrically about b, as shown in Figure 12b, according to the formula b = (A1 - 5A2 + 20A3 + 20A4 - 5A5 + A6 + 16)/32. The operator "/" denotes division with truncation. The result is clipped so as to lie in the range [0, 2^n - 1].

2. The values of the resolution 1/2 sub-pixels labelled c are calculated using a 6-tap filter similar to that used in operation 1 and the six nearest pixels or sub-pixels (A or b) in the vertical direction. In Figure 12c, the filter interpolates the value of the resolution 1/2 sub-pixel c located at an integer horizontal and half-integer vertical location on the basis of the values of the 6 pixels (A1 to A6) located in a column at integer horizontal and integer vertical locations, symmetrically about c, according to the formula c = (A1 - 5A2 + 20A3 + 20A4 - 5A5 + A6 + 16)/32. Similarly, the value of the resolution 1/2 sub-pixel c located at a half-integer horizontal and half-integer vertical location is calculated according to the formula c = (b1 - 5b2 + 20b3 + 20b4 - 5b5 + b6 + 16)/32. Again, the operator "/" denotes division with truncation. The values calculated for the sub-pixels are then clipped so as to lie in the range [0, 2^n - 1].
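The following sketch illustrates steps 1 and 2 of TML5 as described above: half-resolution sub-pixel values are obtained with the 6-tap filter (1, -5, 20, 20, -5, 1), a rounding offset of 16, division by 32 with truncation, and clipping. The helper names and example values are illustrative.

def half_pel_6tap(p1, p2, p3, p4, p5, p6, n_bits=8):
    # b = (A1 - 5*A2 + 20*A3 + 20*A4 - 5*A5 + A6 + 16) / 32, then clipped.
    # Floor division is used here; any negative intermediate is clipped to 0 anyway.
    value = (p1 - 5 * p2 + 20 * p3 + 20 * p4 - 5 * p5 + p6 + 16) // 32
    return min(max(value, 0), 2 ** n_bits - 1)

# b from six horizontally neighbouring full pixels A1..A6:
row = [90, 100, 110, 120, 130, 140]
b = half_pel_6tap(*row)

# c (half-integer horizontal, half-integer vertical) from six vertically
# neighbouring b values, already rounded and clipped as in TML5:
column_of_b = [b] * 6                       # illustrative: identical values
c = half_pel_6tap(*column_of_b)
print(b, c)                                 # 115 115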

At this point in the interpolation process, values have been calculated for all the sub-pixels at resolution 1/2, and the process proceeds to calculate the values of the sub-pixels at resolution 1/4.

3. The values of the resolution 1/4 sub-pixels labelled d are calculated using linear interpolation and the values of the nearest pixels and/or resolution 1/2 sub-pixels in the horizontal direction. More specifically, the values of the resolution 1/4 sub-pixels d located at quarter-integer horizontal and integer vertical locations are calculated by averaging the immediately neighbouring pixel at an integer horizontal and integer vertical location (pixel A) and the immediately neighbouring resolution 1/2 sub-pixel at a half-integer horizontal and integer vertical location (sub-pixel b), that is, according to the formula d = (A + b)/2. The values of the resolution 1/4 sub-pixels d located at quarter-integer horizontal and half-integer vertical locations are calculated by averaging the immediately neighbouring resolution 1/2 sub-pixels located at integer horizontal and half-integer vertical, and at half-integer horizontal and half-integer vertical locations, respectively, that is, according to the formula d = (c1 + c2)/2. Again, the operator "/" denotes division with truncation.

4. The values of the resolution 1/4 sub-pixels labelled e are calculated using linear interpolation and the values of the nearest pixels and/or resolution 1/2 sub-pixels in the vertical direction. In particular, the values of the resolution 1/4 sub-pixels e located at integer horizontal and quarter-integer vertical locations are calculated by averaging the immediately neighbouring pixel at an integer horizontal and integer vertical location (pixel A) and the immediately neighbouring sub-pixel at an integer horizontal and half-integer vertical location (sub-pixel c), that is, according to the formula e = (A + c)/2. Resolution 1/4 sub-pixels e located at half-integer horizontal and quarter-integer vertical locations are calculated by averaging the immediately neighbouring sub-pixel at a half-integer horizontal and integer vertical location (sub-pixel b) and the immediately neighbouring sub-pixel at a half-integer horizontal and half-integer vertical location (sub-pixel c), that is, according to the formula e = (b + c)/2. Furthermore, resolution 1/4 sub-pixels e located at quarter-integer horizontal and quarter-integer vertical locations are calculated by averaging the immediately neighbouring sub-pixel at a quarter-integer horizontal and integer vertical location and the corresponding sub-pixel at a quarter-integer horizontal and half-integer vertical location (sub-pixels d), that is, according to the formula e = (d1 + d2)/2. Again, the operator "/" denotes division with truncation.

5. The value of the sub-pixel f at resolution 1/4 is interpolated by averaging the values of the 4 nearest pixels at integer horizontal and integer vertical locations, according to the formula f = (A1 + A2 + A3 + A4 + 2)/4, where A1, A2, A3 and A4 denote the nearest pixels.
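The following sketch illustrates steps 3 to 5 of TML5: quarter-resolution values are obtained by linear averaging of already rounded neighbouring values with truncating division, and the sub-pixel f by averaging its four nearest full pixels. The example values are illustrative.

def quarter_pel_average(v1, v2):
    # d = (v1 + v2) / 2 with truncation (used for the d and e sub-pixels).
    return (v1 + v2) // 2

def f_sub_pixel(a1, a2, a3, a4):
    # f = (A1 + A2 + A3 + A4 + 2) / 4 with truncation.
    return (a1 + a2 + a3 + a4 + 2) // 4

A, b, c = 100, 115, 108                     # a full pixel and two half-pel values
d = quarter_pel_average(A, b)               # quarter-integer horizontal, integer vertical
e = quarter_pel_average(A, c)               # integer horizontal, quarter-integer vertical
print(d, e, f_sub_pixel(100, 110, 120, 130))   # 107 104 115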

A disadvantage of TML5 is that the decoder is computationally complex. This arises from the fact that TML5 uses an approach in which the interpolation of the resolution 1/4 sub-pixel values depends on the interpolated values of the resolution 1/2 sub-pixels. This means that, in order to interpolate the resolution 1/4 sub-pixels, the values of the resolution 1/2 sub-pixels from which they are calculated must be computed first. Moreover, because the values of some resolution 1/4 sub-pixels depend on interpolated values obtained for other resolution 1/4 sub-pixels, truncation of the resolution 1/4 sub-pixel values has a detrimental effect on the accuracy of some of the resolution 1/4 sub-pixel values. More specifically, those values are less accurate than they would be if they were calculated from values that had not been rounded or truncated. A further disadvantage of TML5 is that the resolution 1/2 sub-pixel values must be stored for use in interpolating the resolution 1/4 sub-pixel values. Redundant memory is therefore needed to store results that are ultimately not required.

The scheme for interpolating sub-pixel values according to TML6, referred to here as direct interpolation, will now be described. In the encoder, the interpolation method according to TML6 operates in a manner similar to the TML5 interpolation method described above, except that the highest precision is maintained throughout. This is achieved by using intermediate values that are neither rounded nor clipped. A step-by-step description of the interpolation method according to TML6, as used in the encoder, is given below with reference to Figures 13a, 13b and 13c.

1. The value of the sub-pixel at a half-integer horizontal and integer vertical location, that is sub-pixel b at resolution 1/2 in Figure 13a, is obtained by first calculating an intermediate value b using a 6-tap filter. The filter computes b on the basis of the values of the 6 pixels (A1 to A6) located in a row at integer horizontal and integer vertical locations, symmetrically about b, as shown in Figure 13b, according to the formula b = (A1 - 5A2 + 20A3 + 20A4 - 5A5 + A6). The final value of b is then calculated as b = (b + 16)/32 and clipped so as to lie in the range [0, 2^n - 1]. As before, the operator "/" denotes division with truncation.

2. The values of the resolution 1/2 sub-pixels labelled c are obtained by first calculating intermediate values c. In Figure 13c, the intermediate value of the resolution 1/2 sub-pixel located at an integer horizontal and half-integer vertical location is calculated on the basis of the values of the 6 pixels (A1 to A6) located in a column at integer horizontal and integer vertical locations, symmetrically about c, according to the formula c = (A1 - 5A2 + 20A3 + 20A4 - 5A5 + A6). The final value of the resolution 1/2 sub-pixel located at an integer horizontal and half-integer vertical location is calculated as c = (c + 16)/32. Similarly, the intermediate value of the resolution 1/2 sub-pixel located at a half-integer horizontal and half-integer vertical location is calculated according to the formula c = (b1 - 5b2 + 20b3 + 20b4 - 5b5 + b6), where the b values are the intermediate values obtained in step 1. The final value of this resolution 1/2 sub-pixel is then calculated as c = (c + 512)/1024. Again, the operator "/" denotes division with truncation, and the values calculated for the resolution 1/2 sub-pixels are then clipped so as to lie in the range [0, 2^n - 1].

3. The values of the resolution 1/4 sub-pixels labelled d are calculated as follows. The values of the resolution 1/4 sub-pixels d located at quarter-integer horizontal and integer vertical locations are calculated from the value of the immediately neighbouring pixel at an integer horizontal and integer vertical location (pixel A) and from the intermediate value b, calculated in step (1), of the immediately neighbouring resolution 1/2 sub-pixel at a half-integer horizontal and integer vertical location (sub-pixel b), that is, according to the formula d = (32A + b + 32)/64. The values of the resolution 1/4 sub-pixels d located at quarter-integer horizontal and half-integer vertical locations are interpolated using the intermediate values calculated for the immediately neighbouring resolution 1/2 sub-pixels located at integer horizontal and half-integer vertical, and at half-integer horizontal and half-integer vertical locations, respectively, that is, according to the formula d = (32c1 + c2 + 1024)/2048. Again, the operator "/" denotes division with truncation, and the resulting final values of the sub-pixels d are clipped so as to lie in the range [0, 2^n - 1].

4. The values of the resolution 1/4 sub-pixels labelled e are calculated as follows. The values of the resolution 1/4 sub-pixels e located at integer horizontal and quarter-integer vertical locations are calculated from the value of the immediately neighbouring pixel at an integer horizontal and integer vertical location (pixel A) and from the intermediate value, calculated in step (2), of the immediately neighbouring resolution 1/2 sub-pixel at an integer horizontal and half-integer vertical location, that is, according to the formula e = (32A + c + 32)/64. The values of the resolution 1/4 sub-pixels located at half-integer horizontal and quarter-integer vertical locations are calculated on the basis of the intermediate value b, calculated in step (1), of the immediately neighbouring resolution 1/2 sub-pixel at a half-integer horizontal and integer vertical location, and the intermediate value, calculated in step (2), of the immediately neighbouring resolution 1/2 sub-pixel at a half-integer horizontal and half-integer vertical location, that is, according to the formula e = (32b + c + 1024)/2048. Again, the operator "/" denotes division with truncation, and the resulting final values of the sub-pixels e are clipped so as to lie in the range [0, 2^n - 1].

5. The values of the resolution 1/4 sub-pixels labelled g are calculated using the value of the nearest original pixel A and the intermediate values of the three neighbouring resolution 1/2 sub-pixels, according to the formula g = (1024A + 32b + 32c1 + c2 + 2048)/4096. As before, the operator "/" denotes division with truncation, and the resulting final values of the sub-pixels g are clipped so as to lie in the range [0, 2^n - 1].

6. The value of the sub-pixel f at resolution 1/4 is interpolated by averaging the values of the 4 nearest pixels at integer horizontal and integer vertical locations, according to the formula f = (A1 + A2 + A3 + A4 + 2)/4, where A1, A2, A3 and A4 denote the nearest original pixels.
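The following sketch illustrates the TML6 idea of direct interpolation described above: the 6-tap filter sums are kept as unrounded intermediate values with an extended dynamic range, and further values are derived from these intermediates with a single final rounding, division and clipping step. The helper names and example values are illustrative.

def six_tap(p):
    # Unrounded 6-tap sum: p1 - 5*p2 + 20*p3 + 20*p4 - 5*p5 + p6.
    return p[0] - 5 * p[1] + 20 * p[2] + 20 * p[3] - 5 * p[4] + p[5]

def clip(v, n_bits=8):
    return min(max(v, 0), 2 ** n_bits - 1)

row = [90, 100, 110, 120, 130, 140]           # six full pixels A1..A6 in a row
b_int = six_tap(row)                          # intermediate value, scaled by 32
b = clip((b_int + 16) // 32)                  # final half-pel value

column_of_b_int = [b_int] * 6                 # illustrative: six identical b intermediates
c_int = six_tap(column_of_b_int)              # intermediate value, scaled by 1024
c = clip((c_int + 512) // 1024)               # final half/half value

A = row[2]                                    # nearest full pixel
d = clip((32 * A + b_int + 32) // 64)         # quarter-pel from A and the b intermediate
print(b, c, d)                                # 115 115 113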

In the decoder, the sub-pixel values can be obtained directly by applying 6-tap filters in the horizontal and vertical directions. In the case of a resolution 1/4 sub-pixel, with reference to Figure 13a, the filter coefficients to be applied to the pixels and sub-pixels at integer vertical locations are: [0, 0, 64, 0, 0, 0] for the set of six pixels A, [1, -5, 52, 20, -5, 1] for a set of six sub-pixels d, [2, -10, 40, 40, -10, 2] for the set of six sub-pixels b, and [1, -5, 52, 20, -5, 1] for the other set of six sub-pixels d. These filter coefficients are applied to the corresponding sets of pixels or sub-pixels lying in line with the sub-pixel value being interpolated.

After the filters have been applied in the horizontal and vertical directions, the interpolated value is normalised according to the formula c = (c + 2048)/4096 and clipped so as to lie in the range [0, 2^n - 1]. When the motion vector points to an integer pixel location in either the horizontal or the vertical direction, many zero coefficients are used. In the practical implementation of TML6, the software uses a number of branches, optimised for the different sub-pixel cases, so that no multiplications by zero coefficients are performed.
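The following one-dimensional sketch illustrates direct filtering with the coefficient sets quoted above. Division by 64 (with an offset of 32) is the one-dimensional normalisation; the (c + 2048)/4096 normalisation quoted above corresponds to applying 6-tap filters in both directions. All names and example values are illustrative.

def direct_filter(samples, coeffs, offset, divisor, n_bits=8):
    # Weighted sum of six samples, then rounding, normalisation and clipping.
    acc = sum(c * s for c, s in zip(coeffs, samples))
    value = (acc + offset) // divisor
    return min(max(value, 0), 2 ** n_bits - 1)

row = [90, 100, 110, 120, 130, 140]                  # six full pixels A1..A6
quarter = direct_filter(row, [1, -5, 52, 20, -5, 1], offset=32, divisor=64)
half = direct_filter(row, [2, -10, 40, 40, -10, 2], offset=32, divisor=64)
print(quarter, half)                                 # 113 115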

It should be noted that in TML6 the resolution 1/4 sub-pixel values are obtained directly using the intermediate values referred to above, and are not derived from rounded and clipped resolution 1/2 sub-pixel values. Therefore, in order to obtain resolution 1/4 sub-pixel values, there is no need to calculate final values for each of the resolution 1/2 sub-pixels. More specifically, there is no need for the truncation and clipping operations associated with calculating the final values of the resolution 1/2 sub-pixels. Nor is there any need to store final values of the resolution 1/2 sub-pixels for use in calculating the resolution 1/4 sub-pixel values. TML6 is therefore computationally less complex than TML5, since it requires fewer truncation and clipping operations. However, a drawback of TML6 is that high-precision arithmetic must be used both in the encoder and in the decoder. High-precision interpolation requires more silicon area in an application-specific integrated circuit (ASIC) implementation and more computation on some central processors. Moreover, implementing direct interpolation in the "on-demand" mode, as defined in TML6, requires a large amount of memory. This is an important consideration, particularly in embedded devices.

In view of the above discussion, it should be appreciated that, because of the differing requirements of the encoder and the decoder with respect to the interpolation of sub-pixel values, there is a significant problem in devising a method of sub-pixel value interpolation capable of providing satisfactory performance in both the encoder and the decoder. Moreover, neither of the existing test models (TML5, TML6) described above provides a solution that is optimal for use in both the encoder and the decoder.

The invention

According to a first aspect of the invention there is provided an interpolation method for use in the encoding of video data, in which an image comprising pixels arranged in rows and columns and represented by values having a specified dynamic range, the pixels in the rows residing at unit horizontal locations and the pixels in the columns residing at unit vertical locations, is interpolated to generate values for sub-pixels at fractional horizontal and vertical locations, the fractional horizontal and vertical locations being defined by 1/2^x, where x is a positive integer having a maximum value N, the method comprising the following steps:

a) when required values for sub-pixels in a horizontal locations 1/2N-1the whole and entire vertical locations, as well as in the whole horizontal locations and vertical locations 1/2N-1whole, interpolating such values directly using weighted sums of pixels in the entire horizontal and entire vertical locations;

b) when values are required for sub-pixels at 1/2^(N-1) unit horizontal and 1/2^(N-1) unit vertical locations, interpolating such values directly using a first weighted sum of values of sub-pixels residing at 1/2^(N-1) unit horizontal and unit vertical locations and a second weighted sum of values of sub-pixels residing at unit horizontal and 1/2^(N-1) unit vertical locations, the first and second weighted sums being formed from values calculated according to step a); and

c) when a value is required for a sub-pixel residing at a 1/2^N unit horizontal and 1/2^N unit vertical location, interpolating that value as a weighted average of the value of a first sub-pixel or pixel residing at a 1/2^(N-m) unit horizontal and 1/2^(N-n) unit vertical location and the value of a second sub-pixel or pixel residing at a 1/2^(N-p) unit horizontal and 1/2^(N-q) unit vertical location, where the variables m, n, p and q are integers in the range 1 to N, such that the first and second sub-pixels or pixels are located diagonally with respect to the sub-pixel at the 1/2^N unit horizontal and 1/2^N unit vertical location.

Preferably, first and second weights are used in calculating the weighted average according to step c), the relative values of the weights being inversely proportional to the proximity (along the straight diagonal line) of the first and second sub-pixels or pixels, respectively, to the sub-pixel at the 1/2^N unit horizontal and 1/2^N unit vertical location.

In a situation where the first and second sub-pixels or pixels are located symmetrically with respect to (that is, equidistant from) the sub-pixel at the 1/2^N unit horizontal and 1/2^N unit vertical location, the first and second weights can have equal values.

The first weighted sum of values of sub-pixels at 1/2^(N-1) unit horizontal and unit vertical locations in step b) can be used when a sub-pixel at a 1/2^(N-1) unit horizontal and 1/2^N unit vertical location is required.

The second weighted sum of values of sub-pixels at unit horizontal and 1/2^(N-1) unit vertical locations in step b) can be used when a sub-pixel at a 1/2^N unit horizontal and 1/2^(N-1) unit vertical location is required.

In one embodiment, when values are required for sub-pixels at 1/2^N unit horizontal and unit vertical locations, and at 1/2^N unit horizontal and 1/2^(N-1) unit vertical locations, such values are interpolated by averaging the value of a first pixel or sub-pixel located at a vertical location corresponding to the vertical location of the sub-pixel being calculated and at a unit horizontal location, and the value of a second pixel or sub-pixel located at a vertical location corresponding to the vertical location of the sub-pixel being calculated and at a 1/2^(N-1) unit horizontal location.

When values are required for sub-pixels at unit horizontal and 1/2^N unit vertical locations, and at 1/2^(N-1) unit horizontal and 1/2^N unit vertical locations, they can be interpolated by averaging the value of a first pixel or sub-pixel located at a horizontal location corresponding to the horizontal location of the sub-pixel being calculated and at a unit vertical location, and the value of a second pixel or sub-pixel located at a horizontal location corresponding to the horizontal location of the sub-pixel being calculated and at a 1/2^(N-1) unit vertical location.

Values for sub-pixels at 1/2^N unit horizontal and 1/2^N unit vertical locations can be interpolated by averaging the value of a pixel located at a unit horizontal and unit vertical location and the value of a sub-pixel located at a 1/2^(N-1) unit horizontal and 1/2^(N-1) unit vertical location.

Values for sub-pixels at 1/2^N unit horizontal and 1/2^N unit vertical locations can be interpolated by averaging the value of a sub-pixel located at a 1/2^(N-1) unit horizontal and unit vertical location and the value of a sub-pixel located at a unit horizontal and 1/2^(N-1) unit vertical location.

Values for one half of the sub-pixels at 1/2^N unit horizontal and 1/2^N unit vertical locations can be interpolated by averaging a first pair of values, namely the value of a sub-pixel located at a 1/2^(N-1) unit horizontal and unit vertical location and the value of a sub-pixel located at a unit horizontal and 1/2^(N-1) unit vertical location, and values for the other half of the sub-pixels at 1/2^N unit horizontal and 1/2^N unit vertical locations can be interpolated by averaging a second pair of values, namely the value of a pixel located at a unit horizontal and unit vertical location and the value of a sub-pixel located at a 1/2^(N-1) unit horizontal and 1/2^(N-1) unit vertical location.

Values for sub-pixels at 1/2^N unit horizontal and 1/2^N unit vertical locations can be interpolated alternately, the value for one such sub-pixel being obtained by averaging a first pair of values, namely the value of a sub-pixel located at a 1/2^(N-1) unit horizontal and unit vertical location and the value of a sub-pixel located at a unit horizontal and 1/2^(N-1) unit vertical location, and the values for the neighbours of that sub-pixel being obtained by averaging a second pair of values, namely the value of a pixel located at a unit horizontal and unit vertical location and the value of a sub-pixel located at a 1/2^(N-1) unit horizontal and 1/2^(N-1) unit vertical location.

Sub-pixels at 1/2^N unit horizontal and 1/2^N unit vertical locations can be interpolated alternately in the horizontal direction.

Sub-pixels at 1/2^N unit horizontal and 1/2^N unit vertical locations can be interpolated alternately in the vertical direction.

When values are required for some of the sub-pixels at 1/2^N unit horizontal and 1/2^N unit vertical locations, these values can alternatively be interpolated by averaging a set of nearest neighbouring pixels.

At least one of steps a) and b), in which sub-pixel values are interpolated directly using weighted sums, can comprise the calculation of intermediate values for the sub-pixels, the intermediate values having a dynamic range greater than the specified dynamic range.

An intermediate value for a sub-pixel having sub-pixel resolution 1/2^(N-1) can be used to calculate values for sub-pixels having sub-pixel resolution 1/2^N.

According to a second aspect of the invention there is provided an interpolation method for use in the encoding of video data, in which an image comprising pixels arranged in rows and columns and represented by values having a specified dynamic range, the pixels in the rows residing at integer horizontal locations and the pixels in the columns residing at integer vertical locations, is interpolated to generate values for sub-pixels at fractional horizontal and vertical locations, the method comprising the following steps:

a) when values are required for sub-pixels at half-integer horizontal and integer vertical locations, and at integer horizontal and half-integer vertical locations, interpolating such values directly using weighted sums of pixels residing at integer horizontal and integer vertical locations;

b) when values are required for sub-pixels at one-half unit horizontal and one-half unit vertical locations, interpolating such values directly using a weighted sum of values for sub-pixels at one-half unit horizontal and unit vertical locations, calculated in accordance with step a); and

c) when values are required for sub-pixels at one-quarter unit horizontal and one-quarter unit vertical locations, interpolating such values by averaging at least one of a first pair of values, namely that of a sub-pixel located at a one-half unit horizontal and unit vertical location and that of a sub-pixel located at a unit horizontal and one-half unit vertical location, and a second pair of values, namely that of a pixel located at a unit horizontal and unit vertical location and that of a sub-pixel located at a one-half unit horizontal and one-half unit vertical location.
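
Steps a) to c) can be summarized in a short sketch. The Python fragment below is purely illustrative: it assumes 8-bit pixel values, a symmetric six-tap filter with coefficients (1, -5, 20, 20, -5, 1) and rounding to the nearest integer, and it shows only one of the two diagonal pairs in step c); the actual filter lengths, coefficients and rounding conventions of the method are given in the detailed description.

    import numpy as np

    TAPS = np.array([1, -5, 20, 20, -5, 1])

    def clip8(v):
        return int(np.clip(v, 0, 255))

    # Step a): a one-half unit sub-pixel in a row (or, transposed, in a column) is a
    # weighted sum of pixels located at unit horizontal and unit vertical locations.
    # Border handling is omitted: 2 <= x <= row length - 4 is assumed.
    def half_in_row(row, x):
        window = np.asarray(row[x - 2:x + 4], dtype=np.int64)
        return clip8((window @ TAPS + 16) >> 5)

    # Step b): the sub-pixel at a one-half unit horizontal and one-half unit vertical
    # location is a weighted sum of the step-a) values in the surrounding rows.
    def half_center(img, x, y):
        column = np.array([half_in_row(img[y - 2 + k], x) for k in range(6)],
                          dtype=np.int64)
        return clip8((column @ TAPS + 16) >> 5)

    # Step c): a one-quarter unit sub-pixel is the average of one diagonal pair, here
    # the pixel at a (unit, unit) location and the sub-pixel at the (1/2, 1/2) location.
    def quarter_diagonal(pixel_value, half_center_value):
        return (int(pixel_value) + int(half_center_value) + 1) >> 1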

In accordance with a third aspect of the invention, there is provided an interpolation method for use in encoding video data, in which an image comprising pixels arranged in rows and columns and represented by values having a specified dynamic range, the pixels in the rows residing at unit horizontal locations and the pixels in the columns residing at unit vertical locations, is interpolated to generate values for sub-pixels at fractional horizontal and vertical locations, these fractional horizontal and vertical locations being defined by the formula 1/2^x, where x is a positive integer having a maximum value of N, the method comprising the following steps:

a) when values are required for sub-pixels at 1/2^(N-1) unit horizontal and unit vertical locations, as well as at unit horizontal and 1/2^(N-1) unit vertical locations, interpolating such values directly using weighted sums of pixels located at unit horizontal and unit vertical locations;

b) when a value is required for a sub-pixel at a sub-pixel horizontal location and a sub-pixel vertical location, interpolating that value directly by selecting one of a first weighted sum of values for sub-pixels located at a vertical location corresponding to the vertical location of the sub-pixel being calculated and a second weighted sum of values for sub-pixels located at a horizontal location corresponding to the horizontal location of the sub-pixel being calculated.

The sub-pixels used in the first weighted sum can be sub-pixels located at 1/2^(N-1) unit horizontal and unit vertical locations, and the first weighted sum can be used to interpolate a value for a sub-pixel at a 1/2^(N-1) unit horizontal location and a 1/2^N unit vertical location.

The sub-pixels used in the second weighted sum can be sub-pixels located at unit horizontal and 1/2^(N-1) unit vertical locations, and the second weighted sum can be used to interpolate a value for a sub-pixel at a 1/2^N unit horizontal location and a 1/2^(N-1) unit vertical location.

When values are required for sub-pixels at 1/2^N unit horizontal and 1/2^N unit vertical locations, they can be interpolated by averaging at least one of a first pair of values, namely that of a sub-pixel located at a 1/2^(N-1) unit horizontal location and a unit vertical location and that of a sub-pixel located at a unit horizontal location and a 1/2^(N-1) unit vertical location, and a second pair of values, namely that of a pixel located at a unit horizontal and unit vertical location and that of a sub-pixel located at a 1/2^(N-1) unit horizontal and 1/2^(N-1) unit vertical location.

In the above aspects, N can be an integer selected from the list consisting of the values 2, 3 and 4.

Sub-pixels at one-quarter unit horizontal locations should be interpreted as sub-pixels having as their nearest neighbor to the left a pixel at a unit horizontal location and as their nearest neighbor to the right a sub-pixel at a one-half unit horizontal location, as well as sub-pixels having as their nearest neighbor to the left a sub-pixel at a one-half unit horizontal location and as their nearest neighbor to the right a pixel at a unit horizontal location. Correspondingly, sub-pixels at one-quarter unit vertical locations should be interpreted as sub-pixels having as their nearest neighbor above a pixel at a unit vertical location and as their nearest neighbor below a sub-pixel at a one-half unit vertical location, as well as sub-pixels having as their nearest neighbor above a sub-pixel at a one-half unit vertical location and as their nearest neighbor below a pixel at a unit vertical location.

The term "dynamic range" refers to the range of values that can take the values of sub-pixels and the weighted sums.

Preferably, changing the dynamic range, whether by extension or by reduction, means changing the number of bits used to represent the dynamic range.

In an embodiment of the invention the method is applied to an image that is divided into a number of image blocks. Preferably, each image block has four corners, each defined by a pixel located at a unit horizontal and unit vertical location. Preferably, the method is applied to each image block as that block becomes available for interpolation of sub-pixel values. Alternatively, interpolation of sub-pixel values in accordance with the method of the invention takes place once all blocks of the image are available for interpolation of sub-pixel values.

Preferably the method is used when encoding the video data. Preferably the method is used when decoding the video data.

In one embodiment of the invention, when the method is used for encoding, it is implemented as pre-interpolation, in which the values of all sub-pixels at one-half unit locations and the values of all sub-pixels at one-quarter unit locations are calculated and stored for later use in forming a prediction frame during motion-predicted encoding. In alternative embodiments the method is implemented as a combination of pre-interpolation and interpolation "on demand". In this case a certain proportion or category of sub-pixel values is calculated and stored before being used in forming the prediction frame, while the remaining sub-pixel values are computed only when needed during motion-predicted encoding.
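
The difference between the two strategies can be sketched as follows. The Python fragment is a minimal sketch with hypothetical names: interp is any routine implementing the interpolation steps described above, coordinates are expressed in one-quarter unit steps, and the simple dictionary cache stands in for the frame store; none of this is taken from the description.

    # Sketch only: pre-interpolation computes and stores all fractional values in
    # advance; on-demand interpolation computes a value the first time it is needed.
    class ReferenceStore:
        def __init__(self, frame, interp, pre_interpolate=False):
            """frame: 2-D list of decoded pixel values;
               interp(frame, x4, y4): hypothetical routine returning the value at
               quarter-unit coordinates (x4, y4)."""
            self.frame, self.interp, self.values = frame, interp, {}
            if pre_interpolate:
                h, w = len(frame), len(frame[0])
                for y4 in range(4 * (h - 1) + 1):
                    for x4 in range(4 * (w - 1) + 1):
                        self.values[(x4, y4)] = interp(frame, x4, y4)

        def value(self, x4, y4):
            # On-demand path: compute only when a motion search or a motion vector
            # actually refers to this sub-pixel location.
            if (x4, y4) not in self.values:
                self.values[(x4, y4)] = self.interp(self.frame, x4, y4)
            return self.values[(x4, y4)]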

Preferably, when the method is used for decoding, sub-pixel values are interpolated only when the need for them is indicated by a motion vector.

In accordance with a fourth aspect of the invention, there is provided a device for encoding video data, for encoding an image comprising pixels arranged in rows and columns and represented by values having a specified dynamic range, the pixels in the rows residing at unit horizontal locations and the pixels in the columns residing at unit vertical locations, the device for encoding video data comprising an interpolator adapted to generate values for sub-pixels at fractional horizontal and vertical locations, these fractional horizontal and vertical locations being defined by the formula 1/2^x, where x is a positive integer having a maximum value of N, and the interpolator being adapted to perform the following operations:

a) to interpolate values for sub-pixels at 1/2^(N-1) unit horizontal and unit vertical locations, as well as at unit horizontal and 1/2^(N-1) unit vertical locations, directly using weighted sums of pixels located at unit horizontal and unit vertical locations;

b) to interpolate values for sub-pixels at 1/2^(N-1) unit horizontal and 1/2^(N-1) unit vertical locations directly by selecting one of a first weighted sum of values for sub-pixels at 1/2^(N-1) unit horizontal and unit vertical locations and a second weighted sum of values for sub-pixels at unit horizontal and 1/2^(N-1) unit vertical locations, the first and second weighted sums being calculated in accordance with operation a); and

c) to interpolate a value for a sub-pixel at a 1/2^N unit horizontal location and a 1/2^N unit vertical location by taking a weighted average of the value of a first sub-pixel or pixel located at a 1/2^(N-m) unit horizontal location and a 1/2^(N-n) unit vertical location and the value of a second sub-pixel or pixel located at a 1/2^(N-p) unit horizontal location and a 1/2^(N-q) unit vertical location, where the variables m, n, p and q are integers in the range from 1 to N, such that the first and second sub-pixels or pixels lie diagonally with respect to the sub-pixel at the 1/2^N unit horizontal and 1/2^N unit vertical location.
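
A small sketch may help to visualize operation c). In the fragment below the coordinates of the target sub-pixel and of its two operands are expressed in units of 1/2^N of the pixel spacing, so that a value at resolution 1/2^(N-m) has coordinates that are multiples of 2^m; the assertion merely checks the diagonal placement required by the operation, and the equal-weight average is an assumption made for this sketch.

    # Sketch only: weighted average of two diagonally placed coarser-resolution values.
    def diagonal_average(target_xy, first, second):
        """target_xy: (x, y) of the sub-pixel being interpolated, in 1/2^N units;
           first, second: ((x, y), value) pairs for the two operands."""
        tx, ty = target_xy
        (ax, ay), va = first
        (bx, by), vb = second
        # Diagonal placement: the operands lie on opposite sides of the target both
        # horizontally and vertically.
        assert (ax - tx) * (bx - tx) < 0 and (ay - ty) * (by - ty) < 0
        return (va + vb + 1) >> 1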

The device for encoding video data may comprise a video encoder. It may comprise a video decoder. It may be a video codec comprising both a video encoder and a video decoder.

In accordance with a fifth aspect of the invention, there is provided a communication terminal comprising a device for encoding video data, for encoding an image comprising pixels arranged in rows and columns and represented by values having a specified dynamic range, the pixels in the rows residing at unit horizontal locations and the pixels in the columns residing at unit vertical locations, the device for encoding video data comprising an interpolator adapted to generate values for sub-pixels at fractional horizontal and vertical locations, these fractional horizontal and vertical locations being defined by the formula 1/2^x, where x is a positive integer having a maximum value of N, and the interpolator being adapted to perform the following operations:

a) to interpolate values for sub-pixels at 1/2^(N-1) unit horizontal and unit vertical locations, as well as at unit horizontal and 1/2^(N-1) unit vertical locations, directly using weighted sums of pixels located at unit horizontal and unit vertical locations;

b) to interpolate values for sub-pixels at 1/2^(N-1) unit horizontal and 1/2^(N-1) unit vertical locations directly by selecting one of a first weighted sum of values for sub-pixels at 1/2^(N-1) unit horizontal and unit vertical locations and a second weighted sum of values for sub-pixels at unit horizontal and 1/2^(N-1) unit vertical locations, the first and second weighted sums being calculated in accordance with operation a); and

c) to interpolate a value for a sub-pixel at a 1/2^N unit horizontal location and a 1/2^N unit vertical location by taking a weighted average of the value of a first sub-pixel or pixel located at a 1/2^(N-m) unit horizontal location and a 1/2^(N-n) unit vertical location and the value of a second sub-pixel or pixel located at a 1/2^(N-p) unit horizontal location and a 1/2^(N-q) unit vertical location, where the variables m, n, p and q are integers in the range from 1 to N, such that the first and second sub-pixels or pixels lie diagonally with respect to the sub-pixel at the 1/2^N unit horizontal and 1/2^N unit vertical location.

The communication terminal may comprise a video encoder. It may comprise a video decoder. Preferably it comprises a video codec that includes both a video encoder and a video decoder.

Preferably, the communication terminal comprises a user interface, a processor, at least one of a transmitting unit and a receiving unit, and a device for encoding video data in accordance with at least one of the third and fourth aspects of the invention. Preferably, the processor controls the operation of the transmitting unit and/or the receiving unit and of the device for encoding video data.

In accordance with a sixth aspect of the invention, there is provided a telecommunications system comprising a communication terminal and a telecommunications network, the telecommunications network and the communication terminal being connected by a communication link over which encoded video data can be transmitted, the communication terminal comprising a device for encoding video data, for encoding an image comprising pixels arranged in rows and columns and represented by values having a specified dynamic range, the pixels in the rows residing at unit horizontal locations and the pixels in the columns residing at unit vertical locations, the device for encoding video data comprising an interpolator adapted to generate values for sub-pixels at fractional horizontal and vertical locations, these fractional horizontal and vertical locations being defined by the formula 1/2^x, where x is a positive integer having a maximum value of N, and the interpolator being adapted to perform the following operations:

a) to interpolate values for sub-pixels at 1/2^(N-1) unit horizontal and unit vertical locations, as well as at unit horizontal and 1/2^(N-1) unit vertical locations, directly using weighted sums of pixels located at unit horizontal and unit vertical locations;

b) to interpolate values for sub-pixels at 1/2^(N-1) unit horizontal and 1/2^(N-1) unit vertical locations directly by selecting one of a first weighted sum of values for sub-pixels at 1/2^(N-1) unit horizontal and unit vertical locations and a second weighted sum of values for sub-pixels at unit horizontal and 1/2^(N-1) unit vertical locations, the first and second weighted sums being calculated in accordance with operation a); and

c) to interpolate a value for a sub-pixel at a 1/2^N unit horizontal location and a 1/2^N unit vertical location by taking a weighted average of the value of a first sub-pixel or pixel located at a 1/2^(N-m) unit horizontal location and a 1/2^(N-n) unit vertical location and the value of a second sub-pixel or pixel located at a 1/2^(N-p) unit horizontal location and a 1/2^(N-q) unit vertical location, where the variables m, n, p and q are integers in the range from 1 to N, such that the first and second sub-pixels or pixels lie diagonally with respect to the sub-pixel at the 1/2^N unit horizontal and 1/2^N unit vertical location.

Preferably, the telecommunications system is a mobile communication system comprising a mobile communication terminal and a wireless network, the connection between the mobile communication terminal and the wireless network being formed over a radio link. Preferably, the network enables the communication terminal to communicate with other communication terminals connected to the network over communication links between those other communication terminals and the network.

In accordance with a seventh aspect of the invention, there is provided a telecommunications system comprising a communication terminal and a telecommunications network, the telecommunications network and the communication terminal being connected by a communication link over which encoded video data can be transmitted, the telecommunications network comprising a device for encoding video data, for encoding an image comprising pixels arranged in rows and columns and represented by values having a specified dynamic range, the pixels in the rows residing at unit horizontal locations and the pixels in the columns residing at unit vertical locations, the device for encoding video data comprising an interpolator adapted to generate values for sub-pixels at fractional horizontal and vertical locations, these fractional horizontal and vertical locations being defined by the formula 1/2^x, where x is a positive integer having a maximum value of N, and the interpolator being adapted to perform the following operations:

a) to interpolate values for sub-pixels at 1/2^(N-1) unit horizontal and unit vertical locations, as well as at unit horizontal and 1/2^(N-1) unit vertical locations, directly using weighted sums of pixels located at unit horizontal and unit vertical locations;

b) to interpolate values for sub-pixels at 1/2^(N-1) unit horizontal and 1/2^(N-1) unit vertical locations directly by selecting one of a first weighted sum of values for sub-pixels at 1/2^(N-1) unit horizontal and unit vertical locations and a second weighted sum of values for sub-pixels at unit horizontal and 1/2^(N-1) unit vertical locations, the first and second weighted sums being calculated in accordance with operation a); and

c) to interpolate a value for a sub-pixel at a 1/2^N unit horizontal location and a 1/2^N unit vertical location by taking a weighted average of the value of a first sub-pixel or pixel located at a 1/2^(N-m) unit horizontal location and a 1/2^(N-n) unit vertical location and the value of a second sub-pixel or pixel located at a 1/2^(N-p) unit horizontal location and a 1/2^(N-q) unit vertical location, where the variables m, n, p and q are integers in the range from 1 to N, such that the first and second sub-pixels or pixels lie diagonally with respect to the sub-pixel at the 1/2^N unit horizontal and 1/2^N unit vertical location.

In accordance with an eighth aspect of the invention, there is provided a device for encoding video data, for encoding an image comprising pixels arranged in rows and columns and represented by values having a specified dynamic range, the pixels in the rows residing at unit horizontal locations and the pixels in the columns residing at unit vertical locations, the encoding device comprising an interpolator adapted to generate values for sub-pixels at fractional horizontal and vertical locations, the resolution of the sub-pixels being defined by a positive integer N, and the interpolator being adapted to perform the following operations:

a) to interpolate values for sub-pixels at 1/2^(N-1) unit horizontal and unit vertical locations, as well as at unit horizontal and 1/2^(N-1) unit vertical locations, directly using weighted sums of pixels located at unit horizontal and unit vertical locations;

b) to interpolate a value for a sub-pixel at a sub-pixel horizontal location and a sub-pixel vertical location directly by selecting one of a first weighted sum of values for sub-pixels located at a vertical location corresponding to the vertical location of the sub-pixel being calculated and a second weighted sum of values for sub-pixels located at a horizontal location corresponding to the horizontal location of the sub-pixel being calculated.

The interpolator may be further adapted to form the first weighted sum from values of sub-pixels at 1/2^(N-1) unit horizontal and unit vertical locations, and to use the first weighted sum to interpolate a value for a sub-pixel at a 1/2^(N-1) unit horizontal location and a 1/2^N unit vertical location.

The interpolator may be further adapted to form the second weighted sum from values of sub-pixels at unit horizontal and 1/2^(N-1) unit vertical locations, and to use the second weighted sum to interpolate a value for a sub-pixel at a 1/2^N unit horizontal location and a 1/2^(N-1) unit vertical location.

The interpolator may be further adapted to interpolate values for sub-pixels at 1/2^N unit horizontal and 1/2^N unit vertical locations by averaging at least one of a first pair of values, namely that of a sub-pixel located at a 1/2^(N-1) unit horizontal location and a unit vertical location and that of a sub-pixel located at a unit horizontal location and a 1/2^(N-1) unit vertical location, and a second pair of values, namely that of a pixel located at a unit horizontal and unit vertical location and that of a sub-pixel located at a 1/2^(N-1) unit horizontal and 1/2^(N-1) unit vertical location.

In accordance with a ninth aspect of the invention, there is provided a communication terminal comprising a device for encoding video data, for encoding an image comprising pixels arranged in rows and columns and represented by values having a specified dynamic range, the pixels in the rows residing at unit horizontal locations and the pixels in the columns residing at unit vertical locations, the encoding device comprising an interpolator adapted to generate values for sub-pixels at fractional horizontal and vertical locations, the resolution of the sub-pixels being defined by a positive integer N, and the interpolator being adapted to perform the following operations:

a) to interpolate values for sub-pixels at 1/2^(N-1) unit horizontal and unit vertical locations, as well as at unit horizontal and 1/2^(N-1) unit vertical locations, directly using weighted sums of pixels located at unit horizontal and unit vertical locations;

b) to interpolate a value for a sub-pixel at a sub-pixel horizontal location and a sub-pixel vertical location directly by selecting one of a first weighted sum of values for sub-pixels located at a vertical location corresponding to the vertical location of the sub-pixel being calculated and a second weighted sum of values for sub-pixels located at a horizontal location corresponding to the horizontal location of the sub-pixel being calculated.

In accordance with a tenth aspect of the invention, there is provided a telecommunications system comprising a communication terminal and a telecommunications network, the telecommunications network and the communication terminal being connected by a communication link over which encoded video data can be transmitted, the communication terminal comprising a device for encoding video data, for encoding an image comprising pixels arranged in rows and columns and represented by values having a specified dynamic range, the pixels in the rows residing at unit horizontal locations and the pixels in the columns residing at unit vertical locations, the encoder comprising an interpolator adapted to generate values for sub-pixels at fractional horizontal and vertical locations, the resolution of the sub-pixels being defined by a positive integer N, and the interpolator being adapted to perform the following operations:

a) to interpolate values for sub-pixels at 1/2^(N-1) unit horizontal and unit vertical locations, as well as at unit horizontal and 1/2^(N-1) unit vertical locations, directly using weighted sums of pixels located at unit horizontal and unit vertical locations;

b) to interpolate a value for a sub-pixel at a sub-pixel horizontal location and a sub-pixel vertical location directly by selecting one of a first weighted sum of values for sub-pixels located at a vertical location corresponding to the vertical location of the sub-pixel being calculated and a second weighted sum of values for sub-pixels located at a horizontal location corresponding to the horizontal location of the sub-pixel being calculated.

In accordance with an eleventh aspect of the invention, there is provided a telecommunications system comprising a communication terminal and a telecommunications network, the telecommunications network and the communication terminal being connected by a communication link over which encoded video data can be transmitted, the network comprising a device for encoding video data, for encoding an image comprising pixels arranged in rows and columns and represented by values having a specified dynamic range, the pixels in the rows residing at unit horizontal locations and the pixels in the columns residing at unit vertical locations, the encoding device comprising an interpolator adapted to generate values for sub-pixels at fractional horizontal and vertical locations, the resolution of the sub-pixels being defined by a positive integer N, and the interpolator being adapted to perform the following operations:

a) to interpolate values for sub-pixels at 1/2^(N-1) unit horizontal and unit vertical locations, as well as at unit horizontal and 1/2^(N-1) unit vertical locations, directly using weighted sums of pixels located at unit horizontal and unit vertical locations;

b) to interpolate a value for a sub-pixel at a sub-pixel horizontal location and a sub-pixel vertical location directly by selecting one of a first weighted sum of values for sub-pixels located at a vertical location corresponding to the vertical location of the sub-pixel being calculated and a second weighted sum of values for sub-pixels located at a horizontal location corresponding to the horizontal location of the sub-pixel being calculated.

List of figures

An embodiment of the invention will now be described, by way of example only, with reference to the accompanying drawings, in which:

Figure 1 shows a video encoder according to the prior art;

Figure 2 shows a video decoder according to the prior art;

Figure 3 shows the types of frame used in the encoding of video data;

Figures 4a, 4b and 4c illustrate the block-matching process;

Figure 5 shows the process of motion estimation at sub-pixel resolution;

Figure 6 shows a terminal device containing video encoding and decoding equipment in which the method according to the invention can be implemented;

Figure 7 shows a video encoder according to an embodiment of the present invention;

Figure 8 shows a video decoder according to an embodiment of the present invention;

Figures 9 and 10 are not used in this description and any such drawings should be disregarded;

Figure 11 shows a schematic diagram of a mobile communication network according to an embodiment of the present invention;

Figure 12a shows the notation used to describe the locations of pixels and sub-pixels in OM;

Figure 12b shows the interpolation of sub-pixels at resolution 1/2;

Figure 12c shows the interpolation of sub-pixels at resolution 1/2;

Figure 13a shows the notation used to describe the locations of pixels and sub-pixels in OM;

Figure 13b shows the interpolation of sub-pixels at resolution 1/2;

Figure 13c shows the interpolation of sub-pixels at resolution 1/2;

Figure 14a shows the notation used to describe the locations of pixels and sub-pixels according to the invention;

Figure 14b shows the interpolation of sub-pixels at resolution 1/2 in accordance with the invention;

Figure 14c shows the interpolation of sub-pixels at resolution 1/2 in accordance with the invention;

Figure 15 shows possible choices of diagonal for the interpolation of sub-pixels;

Figure 16 shows the sub-pixel values at resolution 1/2 required to compute other sub-pixel values at resolution 1/2;

Figure 17a shows the sub-pixel values at resolution 1/2 that must be computed in order to interpolate values for sub-pixels at resolution 1/4 of an image block using the interpolation method of OM;

Figure 17b shows the sub-pixel values at resolution 1/2 that must be computed in order to interpolate values for sub-pixels at resolution 1/4 of an image block using the interpolation method in accordance with the invention;

Figure 18a shows the number of sub-pixels at resolution 1/2 that must be calculated in order to obtain values for sub-pixels at resolution 1/4 of an image block using the method of sub-pixel value interpolation according to OM;

Figure 18b shows the number of sub-pixels at resolution 1/2 that must be calculated in order to obtain values for sub-pixels at resolution 1/4 of an image block using the method of sub-pixel value interpolation according to the invention;

Figure 19 shows the numbering scheme for each of the 15 sub-pixel locations;

Figure 20 shows the nomenclature used to describe pixels, sub-pixels at resolution 1/2, sub-pixels at resolution 1/4 and sub-pixels at resolution 1/8;

Figure 21a shows the diagonal directions used for the interpolation of each sub-pixel at resolution 1/8 in an embodiment of the invention;

Figure 21b shows the diagonal directions used for the interpolation of each sub-pixel at resolution 1/8 in another embodiment of the invention;

Figure 22 shows the nomenclature used to describe sub-pixels at resolution 1/8 within an image.

Detailed description of the invention

Figures 1 to 5, 12a, 12b, 12c, 13a, 13b and 13c have already been described above.

Figure 6 shows a terminal device containing equipment for encoding and decoding video data that can be adapted to operate in accordance with the present invention. More specifically, the figure shows a multimedia terminal 60 implemented in accordance with ITU-T recommendations. The terminal can be regarded as a multimedia transceiver device. It includes elements that capture, encode and multiplex multimedia data streams for transmission over a communication network, as well as elements that receive, demultiplex, decode and display received multimedia content. The relevant ITU-T recommendations define the overall operation of the terminal and refer to other recommendations governing the operation of its various parts. This kind of multimedia terminal can be used in real-time applications, such as conversational videotelephony, or in non-real-time applications, such as the retrieval or streaming of video clips, for example from a multimedia content server on the Internet.

In the context of the present invention it should be understood that the terminal shown in Figure 6 is only one of many alternative multimedia terminal implementations suitable for use with the method according to the present invention. It should also be noted that various alternatives exist relating to the location and implementation of the terminal equipment.

As shown in Figure 6, the multimedia terminal may form part of communication equipment connected to a fixed-line telephone network, such as an analog public switched telephone network (PSTN). In this case the multimedia terminal is equipped with a modem 71 compliant with ITU-T recommendations V.8, V.34 and, optionally, V.8bis. Alternatively, the multimedia terminal may be connected to an external modem. The modem enables the multiplexed digital data and control signals produced by the multimedia terminal to be converted into an analog form suitable for transmission over the PSTN. It further enables the multimedia terminal to receive data and control signals in analog form from the PSTN and to convert them into a digital data stream that can be properly demultiplexed and processed by the terminal.

The multimedia terminal may also be implemented so that it can be connected directly to a fixed-line digital network such as an ISDN (integrated services digital network). In this case the modem 71 is replaced by an ISDN user-network interface, represented in Figure 6 by the alternative block 72.

The multimedia terminal can also be adapted for use in mobile applications. When a wireless communication link is used, the modem 71 can be replaced by any suitable wireless interface, represented by the alternative block 73 in Figure 6. For example, the multimedia terminal may include a radio transceiver enabling connection to a second-generation GSM mobile telephone network or to the proposed third-generation universal mobile telecommunication system (UMTS).

It should be noted that a multimedia terminal intended for two-way communication, that is for both transmission and reception of video data, is preferably equipped with both a video encoder and a video decoder implemented in accordance with the present invention. Such an encoder and decoder pair is often implemented as a single combined functional unit referred to as a "codec".

Since a device for encoding video data in accordance with the present invention performs motion-compensated encoding at sub-pixel resolution using a particular interpolation scheme and particular combinations of pre-interpolation and "on-demand" interpolation of sub-pixel values, it is in general necessary for the video decoder in the receiving terminal to be implemented in a manner compatible with the encoder of the transmitting terminal that generates the digital data stream. Failure to provide such compatibility may adversely affect the quality of motion compensation and the accuracy of the reconstructed video frames.

A typical multimedia terminal will now be described in more detail with reference to Figure 6.

The multimedia terminal 60 includes a variety of elements grouped under the term "terminal equipment". These include video, audio and telematic equipment, denoted generally by the reference numbers 61, 62 and 63, respectively. The video equipment 61 may include, for example, a video camera for capturing video images, a monitor for displaying received video content and, optionally, video-processing equipment. The audio equipment 62 typically includes a microphone, for example for capturing spoken messages, and a loudspeaker for reproducing received audio content. The audio equipment may also include additional audio-processing units. The telematic equipment 63 may include a data terminal, a keyboard, an electronic whiteboard or a still-image transceiver such as a facsimile unit.

The video equipment 61 is connected to a video codec 65. The codec 65 comprises a video encoder and a corresponding video decoder, both implemented in accordance with the invention; such an encoder and decoder will be described below. The codec 65 is responsible for encoding captured video data in an appropriate form for onward transmission over the communication link and for decoding compressed video content received from the communication network. In the example shown in Figure 6, the codec 65 is implemented in accordance with an ITU-T recommendation, with appropriate modifications to implement the method of sub-pixel value interpolation according to the invention in both the encoder and the decoder of the codec.

Similarly, the terminal's audio equipment is connected to an audio codec, indicated in Figure 6 by the reference number 66. Like the video codec, the audio codec comprises an encoder/decoder pair. It converts audio data captured by the terminal's audio equipment into a form suitable for transmission over the communication link and converts encoded audio data received from the network back into a form suitable for reproduction, for example through the terminal's loudspeaker. The output of the audio codec passes through a delay block 67. This compensates for the delay introduced by the video encoding process and thus ensures synchronization of the audio and video content.

The system control block 64 of the multimedia terminal controls signalling along the path from the terminal equipment towards the network using an appropriate control protocol (signalling block 68), in order to establish a common mode of operation between the transmitting and receiving terminals. The signalling block 68 exchanges information about the encoding and decoding capabilities of the transmitting and receiving terminals and can be used to enable the various encoding modes of the video encoder. The system control block 64 also controls the use of data encryption. Information about the type of encryption to be used in data transmission is passed from the encryption block 69 to the multiplexer/demultiplexer block 70.

During data transmission from the multimedia terminal, the multiplexer/demultiplexer block 70 combines the encoded and synchronized video and audio streams with data input from the telematic equipment 63 and any control data to form a single bit stream. Information on the type of encryption (if any) to be applied to the bit stream, provided by the encryption block 69, is used to select the encryption mode. Correspondingly, when a multiplexed and possibly encrypted multimedia bit stream is received, the multiplexer/demultiplexer block 70 is responsible for decomposing the bit stream, dividing it into its constituent multimedia components and passing those components to the appropriate codec(s) and/or terminal equipment for decoding and reproduction.

It should be noted that the functional elements of the multimedia terminal, and the video encoder, video decoder and video codec according to the invention, can be implemented as software, as dedicated hardware, or as a combination of the two. The video encoding and decoding methods according to the invention are particularly suited to implementation as a computer program comprising machine-readable instructions for performing the functional steps of the invention. As such, the encoder and decoder according to the invention can be implemented as software code stored on a storage medium and executed on a computer, such as a desktop personal computer, in order to provide that computer with video encoding and/or decoding functionality.

If the multimedia terminal 60 is a mobile terminal, that is, if it is equipped with the radio transceiver 73, a person skilled in the art will appreciate that it may also contain additional elements. In one embodiment it comprises a user interface with a display and a keypad, which allows the user to operate the mobile terminal 60, together with the necessary functional blocks, including a central processing unit, such as a high-performance embedded microprocessor, which controls the blocks responsible for the various functions of the multimedia terminal, a random access memory (RAM), a read-only memory (ROM) and a digital camera. The microprocessor's instructions, that is, the program code corresponding to the basic functions of the multimedia terminal 60, are stored in the read-only memory (ROM) and can be executed by the microprocessor as required, for example under user control. In accordance with the program code, the microprocessor uses the radio transceiver 73 to establish a connection with the mobile communication network, enabling the multimedia terminal 60 to transmit information to the mobile communication network and to receive information from it over a radio path.

The microprocessor monitors the state of the user interface and controls the digital camera. In response to a user command, the microprocessor instructs the camera to capture a digital image into the RAM. Once the image has been captured, or alternatively during the capturing process, the microprocessor segments the image into image segments (for example, macroblocks) and uses the encoder to perform motion-compensated encoding of these segments in order to generate a compressed image sequence, as explained in the foregoing description. The user may command the multimedia terminal 60 to display the captured images on its display or to send the compressed image sequence via the radio transceiver 73 to another multimedia terminal, to a videophone connected to a fixed-line network (PSTN) or to some other telecommunication device. In a preferred embodiment, transmission of the image data begins as soon as the first segment has been encoded, so that the recipient can begin the corresponding decoding process with a minimum delay.

Figure 11 is a schematic diagram of a mobile communication network according to an embodiment of the present invention. Mobile multimedia terminals MS communicate with base stations BTS over a radio link. Each base station BTS is further connected, through a so-called Abis interface, to a base station controller BSC, which controls and manages several base stations.

The entity formed by a number of base stations BTS (typically a few dozen base stations) and a single base station controller BSC managing them is called a base station subsystem (BSS). In particular, the base station controller BSC manages the radio channels and handovers. The base station controller BSC is also connected, through a so-called A interface, to a mobile switching center (MSC), which coordinates the establishment of connections to and from mobile stations. A further connection is made, via the mobile switching center, to destinations outside the mobile communication network. Outside the mobile communication network there may be one or more further networks, such as the Internet or a public switched telephone network (PSTN), connected to the mobile network by gateway(s). In such an external network, or within the mobile network itself, there may be video encoding and decoding stations, such as personal computers. In an embodiment of the invention, the mobile communication network comprises a video server providing video data to mobile terminals that have subscribed to the service. The video data is compressed using motion-compensated compression as described above. The video server may act as a gateway to an interactive video source or may contain previously recorded video clips. Typical videotelephony applications may involve, for example, two mobile stations, or a mobile station MS and a videophone connected to the PSTN, a PC connected to the Internet, or a compatible terminal connected either to the Internet or to the PSTN.

Figure 7 shows a video encoder 700 according to an embodiment of the invention. Figure 8 shows a video decoder 800 according to an embodiment of the invention.

The encoder 700 comprises an input 701 for receiving video data from a camera or other video source (not shown). It further comprises a discrete cosine transform (DCT) block 705, a quantizer 706, an inverse quantizer 709, an inverse discrete cosine transform (IDCT) block 710, combiners 712 and 716, a sub-pixel value pre-interpolation block 730, a frame store 740 and an on-demand sub-pixel value interpolation block 750 operating in conjunction with a motion estimation block 760. The encoder also comprises a motion field encoding block 770 and a motion-compensated prediction block 780. Switches 702 and 714 are operated jointly by control means 720 to switch the encoder between an INTRA mode of video encoding and an INTER mode of video encoding. The encoder 700 also comprises a multiplexer/demultiplexer block 790 that forms a single bit stream from the various types of information produced by the encoder 700 for onward transmission to a remote receiving terminal or, for example, for storage on a data carrier such as a computer hard disk (not shown).

It should be noted that the presence and implementation of the sub-pixel value pre-interpolation block 730 and the on-demand sub-pixel value interpolation block 750 in the encoder architecture depend on the way in which the sub-pixel interpolation method according to the invention is applied. In embodiments of the invention in which no pre-interpolation of sub-pixel values is performed, the encoder 700 does not contain the sub-pixel value pre-interpolation block 730. In other embodiments of the invention only pre-interpolation of sub-pixel values is performed, and the encoder therefore does not include the on-demand sub-pixel value interpolation block 750. In embodiments in which both pre-interpolation of sub-pixel values and on-demand interpolation of sub-pixel values are performed, the encoder 700 includes both blocks 730 and 750.

The operation of the encoder 700 according to the invention will now be described. In this description it is assumed that each frame of uncompressed video data is received from the video source at input 701 and is processed macroblock by macroblock, preferably in raster-scan order. It is further assumed that, when encoding of a new video sequence begins, the first frame of the sequence is encoded in INTRA mode. Thereafter the encoder is programmed to encode each frame in INTER format, unless one of the following conditions is met: 1) it is judged that the current frame being encoded differs so much from the reference frame used in its prediction that an excessive prediction error results; 2) a predefined INTRA-frame repetition interval has expired; or 3) a feedback signal is received from the receiving terminal indicating a request for a frame to be encoded in INTRA format.

The occurrence of condition 1) is detected by monitoring the output of the combiner 716. The combiner 716 forms the difference between the current macroblock of the frame being encoded and its prediction produced in the motion-compensated prediction block 780. If a measure of this difference (for example, a sum of absolute differences of pixel values) exceeds a predetermined threshold, the combiner 716 informs the control means 720 via control line 717, and the control means 720 operates the switches 702 and 714 so as to switch the encoder 700 into INTRA encoding mode. The occurrence of condition 2) is monitored by means of a timer or frame counter implemented in the control means 720, so that if the timer expires, or the frame counter reaches a predetermined number of frames, the control means 720 operates the switches 702 and 714 so as to switch the encoder 700 into INTRA encoding mode. Condition 3) is triggered if the control means 720 receives a feedback signal, for example from the receiving terminal, via control line 718, indicating that an INTRA frame update is required by the receiving terminal. Such a condition may arise, for example, if a previously transmitted frame was badly corrupted by interference during transmission, making it impossible to decode it at the receiver. In this situation the receiver issues a request for the next frame to be encoded in INTRA format, thereby re-initializing the coding sequence.
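
The three conditions can be restated as a small piece of control logic. The sketch below uses hypothetical names (sad_threshold, intra_interval and so on) and a sum-of-absolute-differences measure as an example of the difference measure; it is not taken from the description.

    # Sketch only: frame-level coding mode decision restating conditions 1) to 3).
    def select_coding_mode(prediction_error_sad, sad_threshold,
                           frames_since_intra, intra_interval,
                           intra_request_received):
        if prediction_error_sad > sad_threshold:   # condition 1): prediction too poor
            return "INTRA"
        if frames_since_intra >= intra_interval:   # condition 2): INTRA refresh due
            return "INTRA"
        if intra_request_received:                 # condition 3): feedback request
            return "INTRA"
        return "INTER"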

It is further assumed that the encoder and decoder are implemented in such a way that motion vectors can be determined with a spatial resolution of one quarter of a pixel. As will be seen below, finer levels of resolution are also possible.

The operation of the encoder 700 in INTRA encoding mode will now be described. In INTRA mode, the control means 720 operates the switch 702 to accept its input from input line 719. The input signal is received macroblock by macroblock from input 701 via input line 719, and each macroblock of pixels of the original image is transformed into DCT coefficients by the DCT block 705. The DCT coefficients are then passed to the quantizer 706, where they are quantized using a quantization parameter QP. The choice of the quantization parameter QP is controlled by the control means 720 via control line 722. Each DCT-transformed and quantized macroblock, which constitutes the INTRA-coded image information 723 of the frame, is passed from the quantizer 706 to the multiplexer/demultiplexer block 790. The multiplexer/demultiplexer block 790 combines the INTRA-coded image information with any control information (for example header data, information on the quantization parameter, error-correction data and the like) to form a single bit stream of encoded image information 725. As is well known to those skilled in the art, variable-length coding (VLC) is used to reduce the redundancy of the compressed video bit stream.
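
The forward path through blocks 705 and 706 can be sketched as follows, assuming for illustration an 8x8 block size, an orthonormal DCT and a uniform quantizer whose step size grows with QP; the actual transform size and quantization rule are those of the codec in which the method is embedded and are not specified here.

    import numpy as np

    def dct_matrix(n=8):
        """Orthonormal DCT-II basis matrix of size n x n."""
        k = np.arange(n)
        c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n))
        c[0, :] = np.sqrt(1.0 / n)
        return c

    def transform_and_quantize(block, qp):
        """Block 705 (sketch): forward 2-D DCT of a macroblock sub-block.
           Block 706 (sketch): uniform quantization with an assumed step of 2*qp."""
        c = dct_matrix(block.shape[0])
        coeffs = c @ block.astype(np.float64) @ c.T
        return np.round(coeffs / (2 * qp)).astype(np.int32)

The local decoding path through blocks 709 and 710 would apply the inverses of these two operations.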

A locally decoded image is formed in the encoder 700 by passing the output of the quantizer 706 through the inverse quantizer 709 and applying the inverse DCT in the IDCT block 710 to the dequantized data. The resulting data are then fed to the combiner 712. In INTRA mode the switch 714 is set so that the input applied to the combiner 712 via the switch 714 is zero. The operation performed by the combiner 712 is thus equivalent to passing the decoded image data produced by the inverse quantizer 709 and the IDCT block 710 through unchanged.

In embodiments of the invention in which sub-pixel values are pre-interpolated, the output data of the combiner 712 are applied to the sub-pixel value pre-interpolation block 730. The input to the sub-pixel value pre-interpolation block 730 takes the form of decoded image blocks. In the sub-pixel value pre-interpolation block 730 each decoded macroblock is subjected to sub-pixel interpolation, so that a predetermined subset of sub-pixel values at sub-pixel resolution is calculated in accordance with the interpolation method of the invention and is stored, together with the decoded pixel values, in the frame store 740.

In embodiments in which pre-interpolation of sub-pixel values is not performed, the encoder architecture does not contain a sub-pixel value pre-interpolation block, and the output of the combiner 712, comprising decoded image blocks, is applied directly to the frame store 740.

As successive macroblocks of the current frame are received and subjected to the previously described encoding and decoding operations in blocks 705, 706, 709, 710 and 712, a decoded version of the INTRA frame is built up in the frame store 740. When the last macroblock of the current frame has been INTRA-coded and subsequently decoded, the frame store 740 contains a completely decoded frame available for use as a prediction reference frame when the next received frame is encoded in INTER format. In embodiments of the invention in which sub-pixel values are pre-interpolated, the reference frame held in the frame store 740 is at least partially interpolated to sub-pixel resolution.

The operation of the encoder 700 in INTER encoding mode will now be described. In INTER encoding mode the control means 720 operates the switch 702 to accept its input from line 721, which carries the output of the combiner 716. The combiner 716 forms prediction error information representing the difference between the current macroblock of the frame being encoded and its prediction produced in the motion-compensated prediction block 780. The prediction error information is DCT-transformed in block 705 and quantized in block 706 to form a macroblock of DCT-transformed and quantized prediction error data. Each macroblock of DCT-transformed and quantized prediction error data is passed from the quantizer 706 to the multiplexer/demultiplexer block 790. The multiplexer/demultiplexer block 790 combines the prediction error information 723 with motion coefficients 724 (described below) and control information (for example header data, information on the quantization parameter, error-correction data and the like) to form a single bit stream of encoded image information 725.

Locally decoded prediction error information for each macroblock of the INTER-coded frame is then formed in the encoder 700 by passing the encoded prediction error information 723 output by the quantizer 706 through the inverse quantizer 709 and applying the inverse DCT in block 710. The resulting locally decoded macroblock of prediction error information is then fed to the combiner 712. In INTER mode the switch 714 is set so that the combiner 712 also receives the motion-predicted macroblocks for the current INTER frame produced by the motion-compensated prediction block 780. The combiner 712 combines these two sources of information to produce reconstructed image blocks for the current INTER frame.

As described above in connection with INTRA-coded frames, in embodiments of the invention in which sub-pixel values are pre-interpolated, the output of the combiner 712 is applied to the sub-pixel value pre-interpolation block 730. The data supplied to the pre-interpolation block 730 during INTER coding thus likewise take the form of decoded image blocks. In the pre-interpolation block 730 each decoded macroblock is subjected to sub-pixel interpolation, so that a predetermined subset of sub-pixel values is calculated in accordance with the interpolation method of the invention and is stored, together with the decoded pixel values, in the frame store 740. In embodiments in which pre-interpolation of sub-pixel values is not performed, the pre-interpolation block is absent from the encoder architecture and the output of the combiner 712, comprising decoded image blocks, is applied directly to the frame store 740.

As successive macroblocks of video data are received from the video source and subjected to the previously described encoding and decoding operations in blocks 705, 706, 709, 710 and 712, a decoded version of the INTER frame is built up in the frame store 740. When the last macroblock of the frame has been INTER-coded and subsequently decoded, the frame store 740 contains a completely decoded frame available for use as a prediction reference frame when the next received frame is encoded in INTER format. In embodiments of the invention in which sub-pixel values are pre-interpolated, the reference frame held in the frame store 740 is at least partially interpolated to sub-pixel resolution.

The formation of a prediction for a macroblock of the current frame will now be described.

Any frame encoded in INTER format requires a reference frame for motion-compensated prediction. This means, among other things, that the first frame to be encoded, whether it is the first frame of the sequence or some other frame, must be encoded in INTRA format. It also means that, by the time the video encoder 700 is switched into INTER coding mode by the control means 720, a complete reference frame, formed by locally decoding a previously encoded frame, is already available in the frame store 740 of the encoder. In general, the reference frame is formed by locally decoding either an INTRA-coded frame or an INTER-coded frame.

The first step in forming a prediction for a macroblock of the current frame is performed by the motion estimation block 760. The motion estimation block 760 receives the current macroblock of the frame being encoded via line 727 and performs a block-matching operation in order to identify a region of the reference frame that best matches the current macroblock. In accordance with the invention, the block-matching process is performed to sub-pixel resolution in a manner that depends on the implementation of the encoder 700 and on the extent to which sub-pixel values have been pre-interpolated. The basic principle underlying the block-matching process is nevertheless the same in all cases. Specifically, the motion estimation block 760 performs block matching by calculating difference values (for example, sums of absolute differences) representing the difference in pixel values between the macroblock of the current frame under examination and candidate best-matching regions of pixels/sub-pixels in the reference frame. A difference value is produced for all possible offsets (for example, x and y displacements at one-quarter or one-eighth sub-pixel precision) between the macroblock of the current frame and a candidate test region within a predefined search region of the reference frame, and the motion estimation block 760 determines the smallest calculated difference value. The offset between the macroblock in the current frame and the candidate test region of pixel/sub-pixel values in the reference frame that yields the smallest difference value defines the motion vector for the macroblock. In some embodiments of the invention an initial estimate of the motion vector is first determined to full-pixel precision and is then refined to a higher level of sub-pixel resolution, as described above.
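
The block-matching search can be sketched as follows. The helper ref_value(x4, y4) stands for whichever mechanism supplies interpolated reference values at one-quarter unit coordinates (retrieval from the frame store 740 or on-demand calculation in block 750); the exhaustive quarter-unit search shown here corresponds to the single-step variant, and the names and the SAD measure are assumptions of this sketch.

    import numpy as np

    def sad(block_a, block_b):
        """Sum of absolute differences between two equally sized blocks."""
        return int(np.abs(block_a.astype(np.int32) - block_b.astype(np.int32)).sum())

    def best_motion_vector(cur_block, ref_value, x0, y0, search_range_q):
        """Exhaustive matching at one-quarter unit precision.
           cur_block: n x m array from the current frame;
           (x0, y0): position of the block in quarter-unit coordinates;
           search_range_q: search range, also in quarter-unit steps."""
        n, m = cur_block.shape
        best_cost, best_mv = None, (0, 0)
        for dy in range(-search_range_q, search_range_q + 1):
            for dx in range(-search_range_q, search_range_q + 1):
                cand = np.array([[ref_value(x0 + 4 * j + dx, y0 + 4 * i + dy)
                                  for j in range(m)] for i in range(n)])
                cost = sad(cur_block, cand)
                if best_cost is None or cost < best_cost:
                    best_cost, best_mv = cost, (dx, dy)
        return best_mv   # motion vector in quarter-unit steps

A two-step variant would first run the same loop with dx and dy restricted to multiples of 4 (full-pixel positions) and then repeat it over a small window around the best candidate.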

In encoder implementations that do not pre-interpolate sub-pixel values, all sub-pixel values required in the block-matching process are calculated in the on-demand sub-pixel value interpolation block 750. The motion estimation block 760 controls the on-demand sub-pixel value interpolation block 750 so that each sub-pixel value required in the block-matching process is computed "on demand", if and when it is needed. In this case the motion estimation block 760 may be implemented so as to perform block matching as a single-step process, in which the motion vector at the desired sub-pixel resolution is obtained directly, or it may be implemented so as to perform block matching as a two-step process. If a two-step process is used, the first step may comprise a search for a motion vector at, for example, full-pixel or half-pixel resolution, and the second step refines that motion vector to the desired sub-pixel resolution. Since block matching is an exhaustive process, in which n×m blocks of pixels of the current frame are compared one after another with n×m blocks of pixels or sub-pixels of the interpolated reference frame, it will be appreciated that a sub-pixel value calculated "on demand" by the on-demand sub-pixel value interpolation block 750 may need to be calculated many times as the successive difference values are determined. In a video encoder this approach is therefore not the most efficient possible in terms of computational complexity/load.

In encoder implementations that use only pre-interpolation of sub-pixel values, block matching can be performed as a single-step process, since all the sub-pixel values of the reference frame required to determine a motion vector at the desired sub-pixel resolution have already been calculated in block 730 and stored in the frame store 740. They are therefore directly available for use in the block-matching process and can be retrieved from the frame store 740 by the motion estimation block 760 as required. However, even when all sub-pixel values are available from the frame store 740, it is still computationally more efficient to perform block matching as a two-step process, since this requires the calculation of fewer difference values. It should be appreciated that although full pre-interpolation of sub-pixel values reduces the computational complexity of the encoder, it is not the most efficient approach in terms of memory consumption.

In encoder implementations that use both pre-interpolation of sub-pixel values and on-demand interpolation of sub-pixel values, the motion estimation block 760 is implemented so that it can retrieve the sub-pixel values previously calculated in the sub-pixel value pre-interpolation block 730 and stored in the frame store 740, and can additionally control the on-demand sub-pixel value interpolation block 750 to calculate any further sub-pixel values that may be required. The block-matching process may be performed as a single-step process or as a two-step process. If a two-step implementation is used, the pre-calculated sub-pixel values retrieved from the frame store 740 can be used in the first step of the process, and the second step can be implemented to use the sub-pixel values calculated by the on-demand sub-pixel value interpolation block 750. In this case certain sub-pixel values used in the second step of the block-matching process may need to be calculated more than once as successive comparisons are performed, but the number of such repeated calculations is much smaller than in the case where no pre-interpolation of sub-pixel values is used. Moreover, memory consumption is reduced compared with implementations in which only pre-interpolation of sub-pixel values is performed.

When motion estimation block 760 has determined a motion vector for the macroblock of the current frame, it outputs the motion vector to motion field coding block 770. Motion field coding block 770 then approximates the motion vector received from motion estimation block 760 using a motion model. A motion model in the general case comprises a set of basis functions. More specifically, motion field coding block 770 represents the motion vector as a set of coefficient values (known as motion coefficients) which, when multiplied by the basis functions, form an approximation of the motion vector. The motion coefficients 724 are passed from motion field coding block 770 to motion compensated prediction block 780. Motion compensated prediction block 780 also receives the pixel/sub-pixel values of the best-matching candidate test region of the reference frame identified by motion estimation block 760. In Fig. 7 these values are shown as being transferred on line 729 from on-demand interpolation block 750. In alternative embodiments of the invention the values in question are supplied directly from motion estimation block 760.

Using the approximated representation of the motion vector generated by motion field coding block 770 and the pixel/sub-pixel values of the best-matching candidate test region, motion compensated prediction block 780 generates a predicted macroblock of pixel values. The predicted macroblock of pixel values represents a prediction of the pixel values of the current macroblock, generated on the basis of the interpolated reference frame. The predicted macroblock of pixel values is fed to combiner 716, where it is subtracted from the new current frame to generate the prediction error information 723 for the macroblock, as described above.
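As a simple illustration of what combiner 716 (and, on the decoder side, combiner 860) does with the predicted macroblock, the following hedged sketch forms the prediction error and reconstructs a macroblock; it assumes 8-bit pixel data held in NumPy arrays and is not tied to any particular transform or quantizer.

```python
import numpy as np

def prediction_error(current_mb, predicted_mb):
    # Encoder side: the prediction error information is the current macroblock
    # minus the motion compensated prediction (kept in a signed type).
    return current_mb.astype(np.int16) - predicted_mb.astype(np.int16)

def reconstruct_macroblock(predicted_mb, decoded_error_mb):
    # Decoder side: prediction plus decoded prediction error, clipped back to 8 bits.
    summed = predicted_mb.astype(np.int16) + decoded_error_mb.astype(np.int16)
    return np.clip(summed, 0, 255).astype(np.uint8)
```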

The motion coefficients 724 generated by the motion field coding block are also fed to multiplexing/demultiplexing block 790, where they are combined with the prediction error information 723 for the macroblock and possible control information from control means 720 to generate the encoded video stream 725 for transmission to the receiving terminal.

The operation of the decoder 800 in accordance with the invention will now be described. Referring to Fig. 8, the decoder 800 comprises a multiplexing/demultiplexing block 810, which receives the encoded video stream 725 from the encoder 700 and demultiplexes it, an inverse quantizer 820, an inverse DCT block 830, a motion compensated prediction block 840, a frame memory 850, a combiner 860, control means 870, a video output 880, a sub-pixel value pre-interpolation block 845 and an on-demand sub-pixel value interpolation block 890, the last two being associated with motion compensated prediction block 840. In practice, the control means 870 of the decoder 800 and the control means 720 of the encoder 700 may be the same processor. This can be the case if the encoder 700 and the decoder 800 are part of the same video codec.

Fig. 8 shows a variant of implementation in which the decoder uses a combination of pre-interpolation of sub-pixel values and on-demand interpolation of sub-pixel values. Other variants of implementation use only pre-interpolation of sub-pixel values, in which case the decoder 800 does not include the on-demand interpolation block 890. In a preferred embodiment of the invention the decoder does not use pre-interpolation of sub-pixel values at all, and the pre-interpolation block 845 is therefore omitted from the decoder architecture. If both pre-interpolation of sub-pixel values and on-demand interpolation of sub-pixel values are performed, the decoder contains both block 845 and block 890.

The control means 870 controls the operation of the decoder 800 depending on whether an INTRA-frame or an INTER-frame is being decoded. An INTRA/INTER switching control signal, which instructs the decoder to switch between decoding modes, is derived, for example, from picture-type information provided in the header portion of each compressed video frame received from the encoder. The INTRA/INTER switching control signal is supplied to the control means 870 on control line 815, together with other codec control signals demultiplexed from the encoded video stream 725 by multiplexing/demultiplexing block 810.

When an INTRA-frame is decoded, the encoded video stream 725 is demultiplexed into INTRA-coded macroblocks and control information. No motion vectors are included in the encoded video stream 725 for an INTRA-coded frame. The decoding process is performed macroblock by macroblock. When the encoded information 723 for a macroblock is extracted from the video stream 725 by multiplexing/demultiplexing block 810, it is supplied to the inverse quantizer 820. The control means controls the inverse quantizer 820 to apply the appropriate level of inverse quantization to the encoded macroblock information, in accordance with the control information provided in the video stream 725. The inverse-quantized macroblock is then inverse-transformed in inverse DCT block 830 to form a decoded block of image information. The control means 870 controls the combiner 860 to prevent the use of any reference information in decoding the INTRA-coded macroblock. The decoded block of image information is supplied to the video output 880 of the decoder.

In variants of implementation of the decoder which use pre-interpolation of sub-pixel values, the decoded block of image information (pixel values) resulting from the inverse quantization and inverse transform operations performed in blocks 820 and 830 is supplied to the sub-pixel value pre-interpolation block 845. Here, interpolation of sub-pixel values is performed in accordance with the method of the invention, and the degree of pre-interpolation of sub-pixel values applied is determined by the details of the decoder implementation. In embodiments of the invention in which sub-pixel values are not interpolated on demand, pre-interpolation block 845 interpolates all sub-pixel values. In variants of implementation which use a combination of pre-interpolation of sub-pixel values and on-demand interpolation of sub-pixel values, pre-interpolation block 845 interpolates a predetermined subset of the sub-pixel values. This subset may contain, for example, all sub-pixels at half-pixel locations, or a combination of sub-pixels at half-pixel and quarter-pixel locations. In either case, after pre-interpolation of the sub-pixels, the interpolated sub-pixel values are stored in frame memory 850 together with the original decoded pixel values. As subsequent macroblocks are decoded, pre-interpolated and stored, a decoded frame, at least partially interpolated to sub-pixel resolution, is progressively assembled in frame memory 850 and becomes available for use as a reference frame for motion compensated prediction.

In variants of implementation of the decoder which do not use pre-interpolation of sub-pixel values, the decoded block of image information (pixel values) resulting from the inverse quantization and inverse transform operations performed in blocks 820 and 830 is supplied directly to frame memory 850. As subsequent macroblocks are decoded and stored, a decoded frame having full-pixel resolution is progressively assembled in frame memory 850 and becomes available for use as a reference frame for motion compensated prediction.

When an INTER-frame is decoded, the encoded video stream 725 is demultiplexed into encoded prediction error information 723 for each macroblock of the frame, associated motion coefficients 724 and control information. Again, the decoding process is performed macroblock by macroblock. When the encoded prediction error information 723 for a macroblock is extracted from the video stream 725 by multiplexing/demultiplexing block 810, it is supplied to the inverse quantizer 820. The control means 870 controls the inverse quantizer 820 to apply the appropriate level of inverse quantization to the encoded macroblock prediction error information, in accordance with the control information received in the video stream 725. The inverse-quantized macroblock of prediction error information is then inverse-transformed in inverse DCT block 830 to obtain decoded prediction error data for the macroblock.

The motion coefficients 724 associated with the macroblock are extracted from the video stream 725 by multiplexing/demultiplexing block 810 and supplied to motion compensated prediction block 840, which reconstructs the motion vector for the macroblock using the same motion model as that used to encode the INTER-coded macroblock in the encoder 700. The reconstructed motion vector approximates the motion vector originally determined by motion estimation block 760 of the encoder. The motion compensated prediction block 840 of the decoder uses the reconstructed motion vector to identify the position of a block of pixel/sub-pixel values in a prediction reference frame stored in frame memory 850. The reference frame may be, for example, a previously decoded INTRA-frame or a previously decoded INTER-frame. In either case, the block of pixel/sub-pixel values indicated by the reconstructed motion vector represents the prediction for the macroblock.

The reconstructed motion vector can point to any pixel or sub-pixel. If the motion vector indicates that the prediction for the current macroblock is formed from pixel values (i.e. values at full-pixel locations), these can simply be retrieved from frame memory 850, since the values in question are obtained directly in the course of decoding each frame. If the motion vector indicates that the prediction for the current macroblock is formed from sub-pixel values, these must either be retrieved from frame memory 850 or calculated in on-demand interpolation block 890. Whether the sub-pixel values must be calculated or can simply be retrieved from frame memory depends on the degree of pre-interpolation of sub-pixel values used in the decoder.

In variants of implementation of the decoder which do not use pre-interpolation of sub-pixel values, all the required sub-pixel values are calculated in on-demand interpolation block 890. Conversely, in variants of implementation in which all sub-pixel values are interpolated in advance, motion compensated prediction block 840 can retrieve the required sub-pixel values directly from frame memory 850. In variants of implementation which use a combination of pre-interpolation of sub-pixel values and on-demand interpolation of sub-pixel values, the action required to obtain the necessary sub-pixel values depends on which sub-pixels were interpolated in advance. Considering as an example a variant of implementation in which all sub-pixel values at half-pixel locations are calculated in advance, it is evident that if the reconstructed motion vector for a macroblock points to a full-pixel location or to a half-pixel sub-pixel location, all pixel and sub-pixel values required to form the prediction for the macroblock are already present in frame memory 850 and can be retrieved by motion compensated prediction block 840. However, if the motion vector points to a sub-pixel at a quarter-pixel location, the sub-pixel values required to form the prediction for the macroblock are not present in frame memory 850 and are therefore calculated in on-demand interpolation block 890. In this case, on-demand interpolation block 890 retrieves from frame memory 850 any pixel or sub-pixel values required to perform the interpolation and applies the interpolation method described below. The sub-pixel values calculated by on-demand interpolation block 890 are supplied to motion compensated prediction block 840.
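The decision described above, of whether a prediction block can be taken straight from frame memory 850 or must involve on-demand interpolation block 890, can be sketched as follows. The sketch assumes the variant in which only half-pixel values are pre-interpolated; the helpers read_from_frame_memory and interpolate_quarter_on_demand are illustrative assumptions, and motion vector components are in quarter-pixel units.

```python
def fetch_prediction_block(mvx, mvy, read_from_frame_memory, interpolate_quarter_on_demand):
    # Even quarter-pixel components correspond to integer or half-pixel positions,
    # which in this variant are already held in the half-pixel-resolution frame memory.
    if mvx % 2 == 0 and mvy % 2 == 0:
        return read_from_frame_memory(mvx // 2, mvy // 2)
    # Otherwise at least one component points to a quarter-pixel position, so only
    # the values actually required are interpolated on demand.
    return interpolate_quarter_on_demand(mvx, mvy)
```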

When the prediction for the macroblock has been obtained, the prediction (i.e. the predicted macroblock of pixel values) is supplied from motion compensated prediction block 840 to combiner 860, where it is combined with the decoded prediction error information for the macroblock to form a reconstructed image block, which in turn is supplied to the video output 880 of the decoder.

It should be understood that in practical implementations of the encoder 700 and decoder 800, the extent to which frames are pre-interpolated to sub-pixel resolution, and thus the amount of on-demand interpolation of sub-pixel values performed, can be chosen in accordance with the constraints imposed (or dictated) by the hardware implementation of the video encoder 700, or by the environment in which it is to be used. For example, if the memory available to the video encoder is limited, or memory must be reserved for other functions, it is advisable to limit the amount of pre-interpolation of sub-pixel values performed. In other cases, when the microprocessor that carries out the video encoding operations has limited processing performance, i.e. the number of operations per second it can perform is comparatively small, it is more appropriate to limit the amount of on-demand interpolation of sub-pixel values performed. In the context of mobile communication, for example, where video encoding and decoding are integrated into a mobile telephone or similar wireless terminal communicating with a mobile communications network, both the memory and the processing power of the processor may be limited. In this case a combination of pre-interpolation of sub-pixel values and on-demand interpolation of sub-pixel values may be the best choice for an efficient implementation of the video encoder. In the decoder 800 the use of pre-interpolated sub-pixel values is generally not preferred, because it usually leads to the calculation of many sub-pixel values that are never in fact used in the decoding process. It should also be understood that although the encoder and decoder may each use different amounts of pre-interpolation and on-demand interpolation of sub-pixel values in order to be individually optimized, the encoder and decoder can equally be designed to use the same split between pre-interpolation of sub-pixel values and on-demand interpolation of sub-pixel values.

Although the description above does not describe the construction of bi-directionally predicted frames (B-frames) in the encoder 700 and the decoder 800, it should be understood that this capability may be provided in embodiments of the invention. Providing such a capability is considered to be within the competence of a specialist in this field of technology.

The encoder 700 or decoder 800 corresponding to the invention can be implemented in hardware or in software, or using a combination of the two. An encoder or decoder implemented in software can be, for example, a separate program or a software building block that can be used by various programs. In the above description and drawings the functional blocks are represented as separate units, but the functions of these blocks may be implemented, for example, in a single software module.

In addition, the encoder 700 and the decoder 800 may be combined to create a video codec having both encoding and decoding functionality. Besides being implemented in a multimedia terminal, such a codec can also be implemented in a network. A codec in accordance with the invention may be a computer program or an element of a computer program, or it may be implemented at least partly in hardware.

The method of sub-pixel value interpolation used in the encoder 700 and decoder 800 in accordance with the invention will now be described. The method will first be introduced at a general conceptual level, and two preferred variants of implementation will then be described. In the first preferred embodiment, interpolation of sub-pixel values is carried out to a resolution of 1/4 pixel, while in the second embodiment the method is extended to a resolution of 1/8 pixel.

It should be noted that the interpolation must produce the same values in the encoder and the decoder, but its implementation can be optimized for each of the two separately. For example, in an encoder in accordance with the first embodiment of the invention, in which interpolation of sub-pixel values is performed to a resolution of 1/4 pixel, it is most effective to pre-calculate the 1/2-resolution sub-pixel values and to compute the 1/4-resolution sub-pixel values "on demand", only when they are needed in the motion estimation process. This limits memory usage while keeping the computational complexity/load at an acceptable level. In the decoder, on the other hand, it is preferable not to pre-calculate any of the sub-pixel values. Consequently, the preferred implementation of the decoder does not include the pre-interpolation block 845, and all sub-pixel interpolation is performed in the on-demand interpolation block 890.

In the description of the interpolation method given below, reference is made to the pixel positions shown in Fig. 14a. In this drawing, the pixels denoted A represent original pixels (i.e. pixels at integer horizontal and integer vertical locations). Pixels labelled with other letters represent sub-pixels whose values are to be interpolated. The description below follows the previously introduced conventions concerning the description of pixel and sub-pixel locations.

The operations needed to interpolate all of the sub-pixel values are described below.

The values of the 1/2-resolution sub-pixels labelled b are obtained by first calculating an intermediate value b using a filter of order K in accordance with the following equation:

b = x · A,     (9)

where x = (x1, x2, ..., xK) is the vector of filter coefficients, A = (A1, A2, ..., AK) is the corresponding vector of values of original pixels A located at integer horizontal and integer vertical locations, and K is an integer defining the order of the filter. Thus, equation 9 can also be expressed in the following way:

b = x1A1 + x2A2 + ... + xKAK.

The values of the filter coefficients xi and the order K of the filter can vary from one variant of implementation to another. Similarly, different coefficient values can be used for calculating different sub-pixels within a single variant of implementation. In other variants of implementation the values of the filter coefficients xi and the filter order can depend on which of the 1/2-resolution sub-pixels b is being interpolated. The pixels Ai are located symmetrically with respect to the interpolated 1/2-resolution sub-pixel b and are the nearest neighbours of that sub-pixel. When the 1/2-resolution sub-pixel b is at a half-integer horizontal and integer vertical location, the pixels Ai are arranged horizontally with respect to b (as shown in Fig. 14b). If the interpolated 1/2-resolution sub-pixel b is at an integer horizontal and half-integer vertical location, the pixels Ai are arranged vertically with respect to b (as shown in Fig. 14c).

The final value of the 1/2-resolution sub-pixel b is calculated by dividing the intermediate value b by a constant scale1, truncating the result to obtain an integer, and clipping it so that it lies in the range [0, 2^n - 1]. In alternative embodiments of the invention rounding can be performed instead of truncation. Preferably the constant scale1 is chosen to be equal to the sum of the filter coefficients xi.

The value of the 1/2-resolution sub-pixel denoted c is likewise obtained by first calculating an intermediate value using a filter of order M in accordance with the following equation:

c = y · b,

where y = (y1, y2, ..., yM) is the vector of filter coefficients and b = (b1, b2, ..., bM) is the corresponding vector of intermediate values bi taken in either the horizontal or the vertical direction, that is:

c = y1b1 + y2b2 + ... + yMbM.

The values of the filter coefficients yi and the order M of the filter can vary from one variant of implementation to another. Similarly, different coefficient values can be used for calculating different sub-pixels within a single variant of implementation. Preferably the values b are the intermediate values of the 1/2-resolution sub-pixels b that are located symmetrically with respect to the 1/2-resolution sub-pixel c and are the nearest neighbours of the sub-pixel c. In one embodiment of the invention the 1/2-resolution sub-pixels b are arranged horizontally with respect to sub-pixel c, and in an alternative embodiment they are arranged vertically with respect to sub-pixel c.

The final value of the 1/2-resolution sub-pixel c is calculated by dividing the intermediate value c by a constant scale2, truncating the result to obtain an integer and clipping it so that it lies in the range [0, 2^n - 1]. In alternative embodiments of the invention rounding can be performed instead of truncation. Preferably the constant scale2 is equal to scale1 × scale1.

It should be noted that using the intermediate values b in the horizontal direction leads to the same result as using the intermediate values b in the vertical direction.

There are two alternative ways of interpolating the values of the 1/4-resolution sub-pixels denoted h. Both involve linear interpolation along a diagonal line connecting 1/2-resolution sub-pixels adjacent to the 1/4-resolution sub-pixel h being interpolated. In the first alternative, the value of sub-pixel h is calculated by averaging the values of the two 1/2-resolution sub-pixels b nearest to sub-pixel h. In the second alternative, the value of sub-pixel h is calculated by averaging the values of the nearest pixel and the nearest 1/2-resolution sub-pixel c. It should be understood that this allows different combinations of diagonal interpolation to be used in determining the values of the sub-pixels h within different groups of four image pixels A. However, it should also be understood that the same combination must be used in both the encoder and the decoder in order to obtain identical interpolated values. Fig. 15 shows four possible choices of diagonal interpolation for sub-pixel h in adjacent groups of four pixels within an image. Simulations carried out in the OHM environment confirmed that both alternatives lead to the same compression efficiency. The second alternative has higher complexity, because the calculation of sub-pixel c requires the calculation of several intermediate values. The first alternative is therefore preferred.

The values of the 1/4-resolution sub-pixels denoted d and g are calculated from the values of their nearest horizontal neighbours using linear interpolation. In other words, the value of a 1/4-resolution sub-pixel d is obtained by averaging the values of its nearest horizontal neighbours: a pixel A of the original image and a 1/2-resolution sub-pixel b. Similarly, the value of a 1/4-resolution sub-pixel g is obtained by averaging the values of its two nearest horizontal neighbours: the 1/2-resolution sub-pixels b and c.

The values of the 1/4-resolution sub-pixels denoted e, f and i are calculated from the values of their nearest vertical neighbours using linear interpolation. More specifically, the value of a 1/4-resolution sub-pixel e is obtained by averaging the values of its two nearest vertical neighbours: a pixel A of the original image and a 1/2-resolution sub-pixel b. Similarly, the value of a 1/4-resolution sub-pixel f is obtained by averaging the values of its two nearest vertical neighbours: the 1/2-resolution sub-pixels b and c. In one embodiment of the invention the 1/4-resolution sub-pixel i is obtained in the same manner as has just been described for the 1/4-resolution sub-pixel f. However, in an alternative embodiment of the invention, and in connection with the previously described test models, the 1/4-resolution sub-pixel i is determined using the values of the four nearest pixels of the original image in accordance with the formula (A1+A2+A3+A4+2)/4.

It should also be noted that in all cases where averaging is applied to pixel and/or sub-pixel values, the average value can be formed in any suitable manner. For example, the value of a 1/4-resolution sub-pixel d can be defined as d=(A+b)/2 or as d=(A+b+1)/2. The effect of adding 1 to the sum of the values of pixel A and 1/2-resolution sub-pixel b is that any rounding or truncation operation applied later rounds or truncates the value of d to the next highest integer value. The same holds for any sum of integer values and can be applied to any of the averaging operations performed in accordance with the method of the invention in order to control the result of rounding or truncation.
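A small numerical illustration of the effect of the optional +1; the values 97 and 100 are arbitrary assumptions:

```python
A, b = 97, 100               # an arbitrary pixel value and a neighbouring 1/2-resolution value
d_down = (A + b) // 2        # 98: truncation rounds the exact average 98.5 downwards
d_up   = (A + b + 1) // 2    # 99: adding 1 makes the same truncation round 98.5 upwards
```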

It should be noted that the method of sub-pixel value interpolation in accordance with the invention provides advantages in comparison with each of the models A and OM.

Compared with OM, in which the values of certain 1/4-resolution sub-pixels depend on previously interpolated values obtained for other 1/4-resolution sub-pixels, in the method in accordance with the invention all 1/4-resolution sub-pixels are calculated from original image pixel locations or from 1/2-resolution sub-pixels using linear interpolation. Thus, the reduction in the accuracy of 1/4-resolution sub-pixel values that occurs in A because of the intermediate truncation and clipping of the other 1/4-resolution sub-pixels from which they are calculated does not occur in the method corresponding to the present invention. In particular, in Fig. 14a the 1/4-resolution sub-pixels h (and the sub-pixels i in one embodiment of the invention) are interpolated diagonally in order to reduce the dependence on other 1/4-resolution sub-pixels. Moreover, in accordance with the method of the invention the amount of computation (and hence the number of processor cycles) required to obtain values for these 1/4-resolution sub-pixels in the decoder is reduced in comparison with OM. In addition, the calculation of any one 1/4-resolution sub-pixel value requires approximately the same number of calculations as any other 1/4-resolution sub-pixel value. Specifically, in situations where the required 1/2-resolution sub-pixel values are already available, for example pre-calculated, the number of calculations required to interpolate any one 1/4-resolution sub-pixel value from the available 1/2-resolution sub-pixel values is similar to that required for any other 1/4-resolution sub-pixel value.

Compared with A, the method in accordance with the invention does not require the use of high-precision arithmetic for calculating all of the sub-pixels. Specifically, since all 1/4-resolution sub-pixel values are calculated from original image pixels or from 1/2-resolution sub-pixel values using linear interpolation, lower-precision arithmetic can be used for such interpolation. Consequently, in hardware implementations of the method corresponding to the invention, for example in an application-specific integrated circuit (ASIC), the use of lower-precision arithmetic reduces the number of components (e.g. gates) that must be provided to calculate the 1/4-resolution sub-pixel values. This, in turn, reduces the total silicon area that must be devoted to the interpolation function. Since most of the sub-pixels are in fact 1/4-resolution sub-pixels (12 of the 15 sub-pixels shown in Fig. 14a), the advantage provided by the invention in this respect is particularly significant. In software implementations, where sub-pixel interpolation is performed using the standard instruction set of a general-purpose central processing unit (CPU) or a digital signal processor (DSP), reducing the precision of the required calculations tends to increase the speed with which the values can be calculated. This is a significant advantage in low-cost implementations, in which it is preferable to use a general-purpose CPU rather than some form of ASIC.

The method in accordance with the invention provides further advantages in comparison with OM. As mentioned previously, in the decoder only one of the 15 sub-pixel locations is required at any given time, namely the one indicated by the received motion vector information. It is therefore desirable that the value of a sub-pixel at any sub-pixel location can be calculated with the minimum number of steps that yields a correctly interpolated value. The method corresponding to the invention provides this possibility. As mentioned previously in the detailed description, the 1/2-resolution sub-pixel c can be interpolated by filtering in either the vertical or the horizontal direction, and the same value is obtained for c regardless of whether horizontal or vertical filtering is used. Thus, the decoder can take advantage of this property when calculating values for the 1/4-resolution sub-pixels f and g, in such a way as to minimize the number of operations required to obtain the desired values. For example, if the decoder needs the value of a 1/4-resolution sub-pixel f, the 1/2-resolution sub-pixel c should be interpolated in the vertical direction. If the value of a 1/4-resolution sub-pixel g is required, it is preferable to interpolate the value for c in the horizontal direction. Thus, in general terms, the method in accordance with the invention provides flexibility in how the values for a particular sub-pixel are obtained. This flexibility is not provided in ON.

Two specific variants of implementation will now be described in detail. The first variant of implementation is the preferred implementation for calculating sub-pixel values to a resolution of 1/4 pixel, whereas in the second variant the application of the method in accordance with the invention is extended to the calculation of sub-pixel values with a resolution of up to 1/8 pixel. For both embodiments a comparison is provided between the computational complexity/load resulting from the use of the method in accordance with the invention and the complexity that would otherwise arise when using the interpolation methods in accordance with OM and OM under the same conditions.

The preferred implementation for interpolation of sub-pixel values to a resolution of 1/4 pixel is described with reference to Figs. 14a, 14b and 14c. In the following it is assumed that all pixels of the image and the final interpolated sub-pixel values are represented by 8 bits.

Calculation of 1/2-resolution sub-pixels at (i) half-integer horizontal and integer vertical locations and (ii) integer horizontal and half-integer vertical locations.

1. The value of a sub-pixel at a half-integer horizontal and integer vertical location, i.e. a 1/2-resolution sub-pixel b in Fig. 14a, is obtained by first calculating an intermediate value b in accordance with the formula b=(A1-5A2+20A3+20A4-5A5+A6), using the values of the 6 pixels (A1-A6) that are located at integer horizontal and integer vertical locations, in either the row or the column of pixels containing b, and that are positioned symmetrically about b, as shown in Figs. 14b and 14c. The final value of the 1/2-resolution sub-pixel b is calculated as (b+16)/32, where the operator "/" denotes division with truncation. The result is clipped to lie in the range [0, 255].

Calculation of 1/2-resolution sub-pixels at half-integer horizontal and half-integer vertical locations.

2. The value of a sub-pixel at a half-integer horizontal and half-integer vertical location, i.e. the 1/2-resolution sub-pixel c in Fig. 14a, is calculated in accordance with the formula c=(b1-5b2+20b3+20b4-5b5+b6+512)/1024, using the intermediate values b of the six nearest 1/2-resolution sub-pixels, which are located in either the row or the column of sub-pixels containing c and are positioned symmetrically about c, as shown in Figs. 14b and 14c. Again, the operator "/" denotes division with truncation. The result is clipped to lie in the range [0, 255]. As explained previously, using the intermediate values b of the 1/2-resolution sub-pixels b in the horizontal direction leads to the same result as using the intermediate values b of the 1/2-resolution sub-pixels b in the vertical direction. Thus, in an encoder in accordance with the invention, the direction used for interpolating the 1/2-resolution sub-pixel c can be chosen as preferred in a given implementation. In a decoder in accordance with the invention, the direction used for interpolating the 1/2-resolution sub-pixel c is chosen according to which 1/4-resolution sub-pixel, if any, will subsequently be interpolated using the result obtained for the 1/2-resolution sub-pixel c.
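The two half-pixel rules above can be written compactly as follows. This is a sketch under the stated assumptions (8-bit data, the 6-tap filter (1, -5, 20, 20, -5, 1)); Python's floor division stands in for division with truncation, which gives the same result here because any negative intermediate is clipped to 0 in either case.

```python
def clip8(value):
    # Clip to the 8-bit range [0, 255].
    return max(0, min(255, value))

def half_pel_b_intermediate(a1, a2, a3, a4, a5, a6):
    # Intermediate (unscaled) value of sub-pixel b from six full pixels in a row or column.
    return a1 - 5 * a2 + 20 * a3 + 20 * a4 - 5 * a5 + a6

def half_pel_b(a1, a2, a3, a4, a5, a6):
    # Final value of sub-pixel b: (b + 16) / 32, then clipping.
    return clip8((half_pel_b_intermediate(a1, a2, a3, a4, a5, a6) + 16) // 32)

def half_pel_c(b1, b2, b3, b4, b5, b6):
    # Final value of sub-pixel c from six intermediate b values (row or column),
    # with the combined scale factor 32 * 32 = 1024 and offset 512.
    return clip8((b1 - 5 * b2 + 20 * b3 + 20 * b4 - 5 * b5 + b6 + 512) // 1024)
```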

Calculation of 1/4-resolution sub-pixels at (i) quarter-integer horizontal and integer vertical locations, (ii) quarter-integer horizontal and half-integer vertical locations, (iii) integer horizontal and quarter-integer vertical locations, and (iv) half-integer horizontal and quarter-integer vertical locations.

3. The values of the 1/4-resolution sub-pixels d, located at quarter-integer horizontal and integer vertical locations, are calculated in accordance with the formula d=(A+b)/2, using the nearest pixel A of the original image and the nearest 1/2-resolution sub-pixel b in the horizontal direction. Similarly, the values of the 1/4-resolution sub-pixels g, located at quarter-integer horizontal and half-integer vertical locations, are calculated in accordance with the formula g=(b+c)/2, using the two nearest 1/2-resolution sub-pixels in the horizontal direction. Likewise, the values of the 1/4-resolution sub-pixels e, located at integer horizontal and quarter-integer vertical locations, are calculated in accordance with the formula e=(A+b)/2, using the nearest pixel A of the original image and the nearest 1/2-resolution sub-pixel b in the vertical direction. The values of the 1/4-resolution sub-pixels f, located at half-integer horizontal and quarter-integer vertical locations, are calculated in accordance with the formula f=(b+c)/2, using the two nearest 1/2-resolution sub-pixels in the vertical direction. In all cases the operator "/" denotes division with truncation.

Calculation of 1/4-resolution sub-pixels at quarter-integer horizontal and quarter-integer vertical locations.

4. The values of the 1/4-resolution sub-pixels h, located at quarter-integer horizontal and quarter-integer vertical locations, are calculated in accordance with the formula h=(b1+b2)/2, using the two nearest 1/2-resolution sub-pixels b in a diagonal direction. Again, the operator "/" denotes division with truncation.

5. The value of the 1/4-resolution sub-pixel denoted i is calculated in accordance with the formula i=(A1+A2+A3+A4+2)/4, using the 4 nearest original pixels A. Again, the operator "/" denotes division with truncation.
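Collecting the quarter-pixel rules of steps 3 to 5, a short sketch follows; the surrounding pixel and half-pixel values are arbitrary illustrative assumptions, not taken from any figure.

```python
def trunc_avg(p, q):
    # (p + q) / 2 with division by truncation; inputs here are non-negative 8-bit values.
    return (p + q) // 2

# Illustrative neighbouring values (assumptions):
A = 100                       # nearest full pixel
b_h, b_v = 104, 96            # nearest 1/2-resolution sub-pixels b, horizontal and vertical
c = 102                       # nearest 1/2-resolution sub-pixel c
b_diag1, b_diag2 = 104, 96    # the two nearest b values on the chosen diagonal
A1, A2, A3, A4 = 100, 104, 96, 102   # the four nearest full pixels around i

d = trunc_avg(A, b_h)             # quarter-integer horizontal, integer vertical
g = trunc_avg(b_h, c)             # quarter-integer horizontal, half-integer vertical
e = trunc_avg(A, b_v)             # integer horizontal, quarter-integer vertical
f = trunc_avg(b_v, c)             # half-integer horizontal, quarter-integer vertical
h = trunc_avg(b_diag1, b_diag2)   # quarter/quarter location, diagonal average of two b values
i = (A1 + A2 + A3 + A4 + 2) // 4  # bilinear average of the four nearest full pixels
```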

The computational complexity of the first preferred embodiment of the invention will now be analysed.

In the encoder it is likely that the same sub-pixel values will be calculated many times. Therefore, as explained above, the complexity of the encoder can be reduced by pre-computing all sub-pixel values and storing them in memory. However, this solution greatly increases memory usage. In a preferred embodiment of the invention, in which the motion vectors have a resolution of 1/4 pixel in both the horizontal and vertical directions, storing pre-calculated sub-pixel values for an entire image requires 16 times the memory needed to store the original non-interpolated image. To reduce memory usage, all 1/2-resolution sub-pixels can be interpolated in advance and the 1/4-resolution sub-pixels can be calculated on demand, i.e. only when necessary. In accordance with the method corresponding to the invention, on-demand interpolation of the 1/4-resolution sub-pixel values requires only linear interpolation based on the 1/2-resolution sub-pixels. Storing the pre-calculated 1/2-resolution sub-pixels requires 4 times the memory needed for the original image, since their representation requires only 8 bits.
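The memory figures quoted above follow from simple position counting, as the following sketch shows; it assumes 8-bit storage for every value.

```python
# Interpolating to 1/4-pixel resolution in both directions creates a 4 x 4 grid of
# positions per original pixel, so storing everything needs 16 times the original memory.
positions_full_quarter_pel = 4 * 4      # 16

# Pre-computing only the 1/2-resolution sub-pixels creates a 2 x 2 grid per original
# pixel, i.e. 4 times the memory of the non-interpolated image.
positions_half_pel_only = 2 * 2         # 4
```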

However, if the same strategy of pre-calculating all 1/2-resolution sub-pixels by pre-interpolation is used in conjunction with the direct interpolation scheme of ON, the memory requirement grows to 9 times that needed to store the original non-interpolated image. This is because a greater number of bits is required to store the high-precision intermediate values associated with each 1/2-resolution sub-pixel in OM. In addition, the complexity of interpolating the sub-pixels during motion estimation is higher in A, since scaling and clipping must be performed for each sub-pixel location.

Below, the complexity of the sub-pixel value interpolation method in accordance with the invention, as applied in the decoder, is compared with the complexity of the interpolation schemes used in AM and AM. In this analysis it is assumed that each interpolation method uses, for each sub-pixel value, only the minimum number of steps required to obtain a correctly interpolated value. It is further assumed that each method is implemented in a block-based manner, i.e. intermediate values that are common to all sub-pixels to be interpolated within a particular N×M block are computed only once. As an illustration, consider the example shown in Fig. 16, where it can be seen that a 4×4 block of 1/2-resolution sub-pixels c is calculated by first calculating a 9×4 block of 1/2-resolution sub-pixels b.

Compared with the sub-pixel value interpolation method used in AM, the method in accordance with the invention has lower computational complexity for the following reasons:

1. In contrast to the sub-pixel value interpolation scheme used in OM, in accordance with the method of the invention the value of the 1/2-resolution sub-pixel c can be obtained by filtering in either the vertical or the horizontal direction. Thus, to reduce the number of operations, the 1/2-resolution sub-pixel c can be interpolated in the vertical direction when the value of a 1/4-resolution sub-pixel f is required, and in the horizontal direction when the value of a 1/4-resolution sub-pixel g is required. For example, Fig. 17 shows all of the 1/2-resolution sub-pixel values that must be computed in order to interpolate the values of the 1/4-resolution sub-pixels g of an image block defined by 4×4 pixels of the original image, using the interpolation method of A (Fig. 17a) and using the method according to the invention (Fig. 17b). In this example the sub-pixel value interpolation method in accordance with OM requires 88 1/2-resolution sub-pixels to be interpolated, whereas the method in accordance with the invention requires the calculation of 72 1/2-resolution sub-pixels. As can be seen in Fig. 17b, in accordance with the invention the 1/2-resolution sub-pixels c are interpolated in the horizontal direction in order to reduce the number of required computations.

2. In accordance with the method of the invention, the 1/4-resolution sub-pixel h is calculated by linear interpolation of the two nearest neighbouring 1/2-resolution sub-pixels in a diagonal direction. The respective numbers of 1/2-resolution sub-pixels that must be calculated in order to obtain the values of the 1/4-resolution sub-pixels h within a block of 4×4 pixels of the original image, using the sub-pixel value interpolation method of A and the method in accordance with the invention, are shown in Figs. 18(a) and 18(b) respectively. When the method in accordance with OM is used, a total of 56 1/2-resolution sub-pixels must be interpolated, whereas in accordance with the method of the invention it is necessary to interpolate 40 1/2-resolution sub-pixels.

Table 1 summarizes the decoder complexity of three sub-pixel value interpolation methods: the method in accordance with OM, the direct interpolation used in OM, and the method in accordance with the invention. Complexity is measured in terms of the number of 6-tap filter operations and linear interpolation operations. It is assumed that the 1/4-resolution sub-pixel i is calculated in accordance with the formula i=(A1+A2+A3+A4+2)/4, which is a bilinear interpolation and effectively comprises two linear interpolation operations.

The operations necessary to interpolate the sub-pixel values within one block of 4×4 pixels of the original image are listed for each of the 15 sub-pixel locations, which, for convenience of reference, are numbered according to the scheme shown in Fig. 19. In Fig. 19, location 1 is the location of a pixel A of the original image, and locations 2-16 are sub-pixel locations. Location 16 is the location of the 1/4-resolution sub-pixel i. In calculating the average number of operations it is assumed that the probability of the motion vector specifying each location is identical. The average complexity is therefore the average of the 15 sums calculated for the sub-pixel locations and the one calculated for the full-pixel location.

Table 1

The complexity of interpolation of 1/4-resolution sub-pixels in OM, OM and the method in accordance with the invention

Location        AM              AM              Method according to the invention
                Linear  6-tap   Linear  6-tap   Linear  6-tap
1               0       0       0       0       0       0
3, 9            0       16      0       16      0       16
2, 4, 5, 13     16      16      0       16      16      16
11              0       52      0       52      0       52
7, 15           16      52      0       52      16      52
10, 12          16      68      0       52      16      52
6, 8, 14        48      68      0       52      16      32
16              32      0       32      0       32      0
Average         19      37      2       32      13      28.25

From Table 1 it can be seen that the method in accordance with the invention requires fewer 6-tap filter operations than the sub-pixel value interpolation method of ON, and a few additional linear interpolation operations. Because a 6-tap filter operation is significantly more complex than a linear interpolation operation, the complexities of the two methods are comparable. The sub-pixel value interpolation method of OM has significantly greater complexity.
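As a cross-check of how the "Average" row of Table 1 is formed for the method according to the invention, the following sketch recomputes the weighted averages of the per-location operation counts listed above, each of the 16 locations being assumed equally probable:

```python
# (location group) -> (linear operations, 6-tap operations) for the method of the invention
rows = {
    (1,): (0, 0),
    (3, 9): (0, 16),
    (2, 4, 5, 13): (16, 16),
    (11,): (0, 52),
    (7, 15): (16, 52),
    (10, 12): (16, 52),
    (6, 8, 14): (16, 32),
    (16,): (32, 0),
}
n_locations = sum(len(group) for group in rows)                        # 16
avg_linear = sum(len(g) * ops[0] for g, ops in rows.items()) / n_locations
avg_6tap = sum(len(g) * ops[1] for g, ops in rows.items()) / n_locations
print(avg_linear, avg_6tap)   # 13.0 and 28.25, matching the final column of Table 1
```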

A second variant of implementation, in which sub-pixel values are interpolated to a resolution of up to 1/8 pixel, will now be described with reference to Figs. 20, 21 and 22.

Fig. 20 shows the nomenclature used to describe the pixels, the 1/2-resolution sub-pixels, the 1/4-resolution sub-pixels and the 1/8-resolution sub-pixels in this wider application of the method in accordance with the invention.

1. The values of the 1/2-resolution and 1/4-resolution sub-pixels denoted b1, b2 and b3 in Fig. 20 are obtained by first calculating intermediate values in accordance with the following formulas:

b1=(-3A1+12A2-37A3+229A4+71A5-21A6+6A7-A8); b2=(-3A1+12A2-39A3+158A4+158A5-39A6+12A7-3A8); and b3=(-A1+6A2-21A3+71A4+229A5-37A6+12A7-3A8),

using the values of the eight nearest pixels (A1-A8) of the image, located at integer horizontal and integer vertical locations in either the row or the column containing b1, b2 and b3, and positioned symmetrically about the 1/2-resolution sub-pixel b2. The asymmetry of the filter coefficients used to obtain the intermediate values b1 and b3 reflects the fact that the pixels A1-A8 are located asymmetrically with respect to the 1/4-resolution sub-pixels b1 and b3. The final values of the sub-pixels bi, i=1, 2, 3, are calculated in accordance with the formula bi=(bi+128)/256, where the operator "/" denotes division with truncation. The result is clipped to lie in the range [0, 255].
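A sketch of step 1 in code form. The central b2 coefficients, garbled in the source text, are filled in here on the assumption that the filter is symmetric and that each filter sums to 256 (so that (b + 128)/256 preserves the 8-bit dynamic range); floor division again stands in for division with truncation, which gives the same clipped result.

```python
def clip8(value):
    return max(0, min(255, value))

TAPS_B1 = (-3, 12, -37, 229, 71, -21, 6, -1)
TAPS_B2 = (-3, 12, -39, 158, 158, -39, 12, -3)   # reconstructed: symmetric, sums to 256
TAPS_B3 = (-1, 6, -21, 71, 229, -37, 12, -3)

def eighth_pel_b(pixels, taps):
    # pixels: the eight nearest full-pixel values (A1..A8) in the row or column concerned.
    intermediate = sum(t * a for t, a in zip(taps, pixels))
    # Final value: add 128, divide by 256, clip to [0, 255].
    return clip8((intermediate + 128) // 256)
```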

2. The values of the 1/2-resolution and 1/4-resolution sub-pixels denoted cij, i, j=1, 2, 3, are calculated in accordance with the following formulas:

c1j=(-3b1j+12b2j-37b3j+229b4j+71b5j-21b6j+6b7j-b8j+32768)/65536; c2j=(-3b1j+12b2j-39b3j+158b4j+158b5j-39b6j+12b7j-3b8j+32768)/65536; and c3j=(-b1j+6b2j-21b3j+71b4j+229b5j-37b6j+12b7j-3b8j+32768)/65536,
using the intermediate values b1, b2 and b3 calculated for the eight nearest sub-pixels (b1j-b8j) in the vertical direction, located in the column of sub-pixels containing the 1/2-resolution and 1/4-resolution sub-pixels cij being interpolated and positioned symmetrically about the 1/2-resolution sub-pixel c2j. The asymmetry of the filter coefficients used to obtain the values of the sub-pixels c1j and c3j reflects the fact that the sub-pixels b1j-b8j are located asymmetrically with respect to the 1/4-resolution sub-pixels c1j and c3j. Again, the operator "/" denotes division with truncation. Before the interpolated values of the sub-pixels cij are stored in frame memory, they are clipped to lie in the range [0, 255]. In an alternative embodiment of the invention the 1/2-resolution and 1/4-resolution sub-pixels cij are calculated by a similar method using the intermediate values b1, b2 and b3 in the horizontal direction.

3. The values of the 1/8-resolution sub-pixels denoted d are calculated using linear interpolation from the values of their nearest neighbouring image pixel, 1/2-resolution sub-pixel or 1/4-resolution sub-pixel in the horizontal or vertical direction. For example, the upper left 1/8-resolution sub-pixel d in Fig. 20 is calculated in accordance with the formula d=(A+b1+1)/2. As before, the operator "/" denotes division with truncation.

4. The values of the 1/8-resolution sub-pixels denoted e and f are calculated using linear interpolation of the values of image pixels, 1/2-resolution sub-pixels or 1/4-resolution sub-pixels in a diagonal direction. For example, in Fig. 20 the upper left 1/8-resolution sub-pixel e is calculated in accordance with the formula e=(b1+b1+1)/2. The diagonal directions to be used in the interpolation of each 1/8-resolution sub-pixel in the first preferred embodiment of the invention, hereinafter referred to as "preferred method 1", are shown in Fig. 21(a). The values of the 1/8-resolution sub-pixels denoted g are calculated in accordance with the formula g=(A+3c22+3)/4. As always, the operator "/" denotes division with truncation. In an alternative embodiment of the invention, hereinafter referred to as "preferred method 2", the computational complexity is further reduced by interpolating the 1/8-resolution sub-pixel f using linear interpolation of the 1/2-resolution sub-pixels b2, i.e. in accordance with the formula f=(3b2+b2+2)/4, in which the sub-pixel b2 closest to f is multiplied by 3. The pattern of diagonal interpolation used in this alternative embodiment of the invention is shown in Fig. 21(b). Further alternative embodiments may provide other schemes of diagonal interpolation.

It should be noted that in all cases where the determination of a 1/8-resolution sub-pixel uses an average value formed from pixel and/or sub-pixel values, this average value can be formed in any suitable manner. The effect of adding 1 to the sum of the values used in calculating such an average is that any rounding or truncation operation applied later rounds or truncates the average value to the next highest integer value. In alternative embodiments of the invention the addition of 1 is not used.
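A short sketch of the 1/8-resolution averaging rules of steps 3 and 4; all the neighbouring values are arbitrary illustrative assumptions, and b1_a/b1_b simply stand for the two nearest values used on the chosen diagonal.

```python
def avg_round_up(p, q):
    # Average with the optional +1, so that truncation rounds the .5 case upwards.
    return (p + q + 1) // 2

# Illustrative neighbouring values (assumptions):
A = 100                     # nearest full pixel
b1_a, b1_b = 104, 98        # the two nearest values used on the chosen diagonal
c22 = 102                   # the central 1/4-resolution sub-pixel c22
b2_near, b2_far = 104, 96   # the nearer and farther 1/2-resolution sub-pixels b2

d = avg_round_up(A, b1_a)               # 1/8 position next to A (horizontal or vertical)
e = avg_round_up(b1_a, b1_b)            # diagonal 1/8 position, per preferred method 1
g = (A + 3 * c22 + 3) // 4              # weighted form given for sub-pixel g
f_method2 = (3 * b2_near + b2_far + 2) // 4   # preferred method 2: only b2 values used
```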

As in the case of interpolation of sub-pixel values to a resolution of 1/4 pixel described above, the memory requirements of the encoder can be reduced by appropriate pre-calculation of the sub-pixel values to be interpolated. In the case of interpolation of sub-pixel values to a resolution of 1/8 pixel, it is preferable to pre-compute all 1/2-resolution and 1/4-resolution sub-pixels and to calculate the values of the 1/8-resolution sub-pixels "on demand", only when they are needed. With this approach, both the interpolation method in accordance with OM and the interpolation method in accordance with the invention require 16 times more memory than is required to store the original image in order to store the 1/2-resolution and 1/4-resolution sub-pixel values. However, if the direct interpolation in accordance with OM is used, intermediate values for the 1/2-resolution and 1/4-resolution sub-pixels must be stored. These intermediate values are represented with 32-bit precision, and this leads to a requirement for 64 times more memory than the original non-interpolated image.

Below, the complexity of the sub-pixel value interpolation method in accordance with the invention, as applied in the decoder to calculate sub-pixel values to a resolution of 1/8 pixel, is compared with the complexity of the interpolation schemes used in AM and AM. As in the analogous analysis for interpolation of sub-pixel values to a resolution of 1/4 pixel described above, it is assumed that each interpolation method obtains any given sub-pixel value using only the minimum number of operations required to obtain a correctly interpolated value. It is further assumed that each method is implemented in a block-based manner, i.e. intermediate values that are common to all sub-pixels to be interpolated within a particular N×M block are computed only once.

Table 2 summarizes the complexity of the three interpolation methods. Complexity is measured in terms of the number of 8-tap filter operations and linear interpolation operations performed by each method. The table gives the number of operations required to interpolate the sub-pixel value at each of the 63 sub-pixel locations of the 1/8-resolution grid within one block of 4×4 pixels of the original image, where each sub-pixel location is identified by a corresponding number, as shown in Fig. 22. In Fig. 22, location 1 is the location of a pixel of the original image, and locations 2-64 are sub-pixel locations. In calculating the average number of operations it is assumed that the probability of the motion vector specifying each sub-pixel location is identical. The average complexity is therefore the average of the 63 sums calculated for the sub-pixel locations and the one calculated for the full-pixel location.

Table 2

The complexity of interpolation of 1/8-resolution sub-pixels in OM, OM and the method in accordance with the invention (the results are shown separately for Preferred method 1 and Preferred method 2).

Location                           AM              AM              Preferred method 1   Preferred method 2
                                   Linear  8-tap   Linear  8-tap   Linear  8-tap        Linear  8-tap
1                                  0       0       0       0       0       0            0       0
3, 5, 7, 17, 33, 49                0       16      0       16      0       16           0       16
19, 21, 23, 35, 37, 39, 51, 53, 55 0       60      0       60      0       60           0       60
2, 8, 9, 57                        16      16      0       16      16      16           16      16
4, 6, 25, 41                       16      32      0       16      16      32           16      32
10, 16, 58, 64                     32      76      0       60      16      32           16      32
11, 13, 15, 59, 61, 63             16      60      0       60      16      60           16      60
18, 24, 34, 40, 50, 56             16      76      0       60      16      60           16      60
12, 14, 60, 62                     32      120     0       60      16      32           16      32
26, 32, 42, 48                     32      108     0       60      16      32           16      32
20, 22, 36, 38, 52, 54             16      120     0       60      16      76           16      76
27, 29, 31, 43, 45, 47             16      76      0       60      16      76           16      76
28, 30, 44, 46                     32      152     0       60      16      60           16      60
Average                            64      290.25  0       197.5   48      214.75       48      192.75

As can be seen from Table 2, the number of 8-tap filter operations performed in accordance with preferred methods 1 and 2 is, respectively, 26% and 34% smaller than the number of 8-tap filter operations performed in the sub-pixel value interpolation method of ON. The number of linear operations is 25% smaller in both preferred method 1 and preferred method 2 compared with OM, but this improvement is less important than the reduction in the number of 8-tap filter operations. It can further be seen that the direct interpolation scheme used in OM has a complexity comparable with that of preferred method 1 and preferred method 2 when used to interpolate the values of 1/8-resolution sub-pixels.

From the above description it will be obvious to a person skilled in the art that various modifications may be made within the scope of the present invention. Although several preferred embodiments of the invention have been described in detail, it should be understood that many modifications and variations are possible, all of which lie within the spirit and scope of the invention.

1. A method of interpolating sub-pixel values in order to determine values for sub-pixels located within a bounded rectangular region defined by four corner pixels, with no intermediate pixels between the corner pixels, the pixels and sub-pixels being arranged in rows and columns, and the locations of the pixels and sub-pixels within said bounded rectangular region being expressible mathematically using the coordinate notation K/2^N, L/2^N, where K and L are positive integers taking values between zero and 2^N respectively, and N is a positive integer greater than one representing a particular degree of sub-pixel value interpolation, the method comprising the steps of:

interpolating a sub-pixel value, for sub-pixels having coordinates in which both K and L are odd, according to a predetermined selection of either a weighted average of the value of the nearest neighbouring pixel and the value of the sub-pixel whose location is defined by the coordinates 1/2, 1/2, or a weighted average of the values of a pair of diagonally located sub-pixels having coordinates in which both K and L are even, including zero, situated in the quadrant of said bounded rectangular region defined by the location having coordinates 1/2, 1/2 and said nearest neighbouring pixel;

interpolating sub-pixel values, for sub-pixels having coordinates in which K is even and L is zero and for sub-pixels having coordinates in which K is zero and L is even, used in the interpolation of sub-pixels having coordinates in which both K and L are odd, using weighted sums of the values of pixels arranged in rows and columns respectively; and

interpolating sub-pixel values, for sub-pixels having coordinates in which both K and L are even, used in the interpolation of sub-pixel values for sub-pixels having coordinates in which both K and L are odd, using a predetermined selection of either a weighted sum of the values of sub-pixels having coordinates in which K is even and L is zero and the values of sub-pixels with corresponding coordinates in adjacent bounded rectangular regions, or a weighted sum of the values of sub-pixels having coordinates in which K is zero and L is even and the values of sub-pixels with corresponding coordinates in adjacent bounded rectangular regions.

2. The method according to claim 1, comprising a step in which first and second weighting factors are used in the weighted average selected for interpolating the values of sub-pixels having coordinates in which both K and L are odd, when the selected weighted average is the one that uses the value of the nearest neighbouring pixel and the value of the sub-pixel whose location is defined by the coordinates 1/2, 1/2, the relative magnitudes of the first and second weighting factors being chosen to be inversely proportional to the respective distances, along a straight diagonal line, from the nearest neighbouring pixel and from the sub-pixel whose location is defined by the coordinates 1/2, 1/2 to the sub-pixel, having coordinates in which both K and L are odd, whose value is being interpolated.

3. The method according to claim 1, comprising a step in which first and second weighting factors are used in the weighted average selected for interpolating the values of sub-pixels having coordinates in which both K and L are odd, when the selected weighted average is the one that uses the values of a pair of diagonally located sub-pixels having coordinates in which both K and L are even, the relative magnitudes of the first and second weighting factors being chosen to be inversely proportional to the respective distances, along a straight diagonal line, from said sub-pixels having coordinates in which both K and L are even to the sub-pixel, having coordinates in which both K and L are odd, whose value is being interpolated.

4. The method according to claim 2, comprising a step in which first and second weighting factors of equal magnitude are used when said nearest neighbouring pixel and the sub-pixel whose location is defined by the coordinates 1/2, 1/2 are located at equal distances from the sub-pixel, having coordinates in which both K and L are odd, whose value is being interpolated.

5. The method according to claim 3, comprising a step in which first and second weighting factors of equal magnitude are used when said sub-pixels having coordinates in which both K and L are even are located at equal distances from the sub-pixel, having coordinates in which both K and L are odd, whose value is being interpolated.

6. The method according to claim 1, comprising a step in which the values of sub-pixels having coordinates in which both K and L are even, used in the interpolation of the values of sub-pixels having coordinates in which both K and L are odd, are interpolated using a weighted sum of the values of sub-pixels having coordinates in which K is even and L is zero and the values of sub-pixels with corresponding coordinates in adjacent bounded rectangular regions, when the value of a sub-pixel having coordinates in which K is even and L is odd is also required.

7. The method according to claim 1, comprising a step in which the values for sub-pixels having coordinates in which both K and L are even, which are used in interpolating the values for sub-pixels having coordinates in which both K and L are odd, are interpolated using a weighted sum of the values of sub-pixels having coordinates with K equal to zero and L equal to an even value and of the values of sub-pixels having the corresponding coordinates in neighbouring bounded rectangular regions, when the value of a sub-pixel having coordinates with K equal to an odd value and L equal to an even value is also required.

8. The method according to claim 1, comprising a step in which the value for a sub-pixel having coordinates with K equal to an odd value O and L equal to an even value E, excluding zero and 2^N, is interpolated by finding the average of the value of a first sub-pixel having coordinates with K equal to the even value O - 1 and L equal to E and the value of a second sub-pixel having coordinates with K equal to the even value O + 1 and L equal to E.

9. The method according to claim 1, comprising a step in which the value for a sub-pixel having coordinates with K equal to 1 and L equal to zero is interpolated by finding the average of the value of the pixel having coordinates with K equal to zero and L equal to zero and the value of the sub-pixel having coordinates with K equal to 2 and L equal to zero.

10. The method according to claim 1, comprising a step in which the value for a sub-pixel having coordinates with K equal to 2^N - 1 and L equal to zero is interpolated by finding the average of the value of the pixel having coordinates with K equal to 2^N and L equal to zero and the value of the sub-pixel having coordinates with K equal to 2^N - 2 and L equal to zero.

11. The method according to claim 1, comprising a step in which the value for a sub-pixel having coordinates with K equal to 1 and L equal to 2^N is interpolated by finding the average of the value of the pixel having coordinates with K equal to zero and L equal to 2^N and the value of the sub-pixel having coordinates with K equal to 2 and L equal to 2^N.

12. The method according to claim 1, comprising a step in which the value for a sub-pixel having coordinates with K equal to 2^N - 1 and L equal to 2^N is interpolated by finding the average of the value of the pixel having coordinates with K equal to 2^N and L equal to 2^N and the value of the sub-pixel having coordinates with K equal to 2^N - 2 and L equal to 2^N.

13. The method according to claim 1, comprising a step in which the value for a sub-pixel having coordinates with K equal to an even value E, excluding zero and 2^N, and L equal to an odd value O is interpolated by finding the average of the value of a first sub-pixel having coordinates with K equal to E and L equal to the even value O - 1 and the value of a second sub-pixel having coordinates with K equal to E and L equal to the even value O + 1.

14. The method according to claim 1, comprising a step in which the value for a sub-pixel having coordinates with K equal to zero and L equal to 1 is interpolated by finding the average of the value of the pixel having coordinates with K equal to zero and L equal to zero and the value of the sub-pixel having coordinates with K equal to zero and L equal to 2.

15. The method according to claim 1, comprising a step in which the value for a sub-pixel having coordinates with K equal to 2^N and L equal to 1 is interpolated by finding the average of the value of the pixel having coordinates with K equal to 2^N and L equal to zero and the value of the sub-pixel having coordinates with K equal to 2^N and L equal to 2.

16. The method according to claim 1, comprising a step in which the value for a sub-pixel having coordinates with K equal to zero and L equal to 2^N - 1 is interpolated by finding the average of the value of the pixel having coordinates with K equal to zero and L equal to 2^N and the value of the sub-pixel having coordinates with K equal to zero and L equal to 2^N - 2.

17. The method according to claim 1, comprising a step in which the value for a sub-pixel having coordinates with K equal to 2^N and L equal to 2^N - 1 is interpolated by finding the average of the value of the pixel having coordinates with K equal to 2^N and L equal to 2^N and the value of the sub-pixel having coordinates with K equal to 2^N and L equal to 2^N - 2.

18. The method according to claim 1, comprising a step in which the value for a sub-pixel having coordinates with K equal to 2^N - 1 and L equal to 2^N - 1 is interpolated by finding a weighted average of the values of the four corner pixels that define said bounded rectangular region.

19. The method according to claim 1, comprising a step in which the value of N is chosen from the list consisting of the values 2, 3 and 4.

20. A method of sub-pixel value interpolation at quarter-pixel resolution for determining values for sub-pixels located within a bounded rectangular region defined by four corner pixels with no intermediate pixels between the corner pixels, the pixels and sub-pixels being arranged in rows and columns, and the locations of the pixels and sub-pixels within said bounded rectangular region being expressed mathematically using the coordinate notation K/4, L/4, where K and L are positive integers with values between zero and 4 respectively, the method comprising the steps in which: the value for a sub-pixel with coordinates in which both K and L are equal to 1 or 3 is interpolated, according to a predefined selection, as either a weighted average of the value of the nearest neighbouring pixel and the value of the sub-pixel located at coordinates 2/4, 2/4, or a weighted average of the values of a pair of diagonally located sub-pixels with coordinates in which both K and L are equal to zero, 2 or 4, located within the quadrant of said bounded rectangular region defined by the corner points having coordinates 2/4, 2/4 and said nearest neighbouring pixel;

interpolating the values of podpisala for podpisala with such coordinates, which is equal To 2 and L equal to zero, and podpisala with such coordinates, which is equal To zero and L RA is but 2, used for interpolation of podpisala with such coordinates that as K, and L equal to 1 or 3, using a weighted sum of pixel values, suitably arranged in rows and columns; and

the value for the sub-pixel with coordinates in which both K and L are equal to 2, which is used in interpolating the values for sub-pixels with coordinates in which both K and L are equal to 1 or 3, is interpolated using a predefined choice of either a weighted sum of the value of the sub-pixel with coordinates in which K is equal to 2 and L is equal to zero and of the values of sub-pixels having the corresponding coordinates in neighbouring bounded rectangular regions, or a weighted sum of the value of the sub-pixel with coordinates in which K is equal to zero and L is equal to 2 and of the values of sub-pixels having the corresponding coordinates in neighbouring bounded rectangular regions.

21. A method of sub-pixel value interpolation at one-eighth-pixel resolution for determining values for sub-pixels located within a bounded rectangular region defined by four corner pixels with no intermediate pixels between the corner pixels, the pixels and sub-pixels being arranged in rows and columns, and the locations of the pixels and sub-pixels within said bounded rectangular region being expressed mathematically using the coordinate notation K/8, L/8, where K and L are positive integers with values between zero and 8 respectively, the method comprising the steps in which: the value for a sub-pixel with coordinates in which both K and L are equal to 1, 3, 5 or 7 is interpolated, according to a predefined selection, as either a weighted average of the value of the nearest neighbouring pixel and the value of the sub-pixel located at coordinates 4/8, 4/8, or a weighted average of the values of a pair of diagonally located sub-pixels with coordinates in which both K and L are equal to zero, 2, 4, 6 or 8, located within the quadrant of said bounded rectangular region defined by the corner points having coordinates 4/8, 4/8 and said nearest neighbouring pixel;

interpolating the values of podpisala for podpisala with such coordinates, that is 2, 4 or 6 and L equal to zero, and podpisala with such coordinates, which is equal To zero and L equal to 2, 4 or 6, used in the interpolation of podpisala with such coordinates that as K, and L equal to 1, 3, 5 or 7, by using a weighted sum of pixel values, suitably arranged in rows and columns; and

interpolating the values of podpisala for podpisala with such coordinates that as K, and L equal to 2, 4 or 6, used in the interpolation of values is udpixel for podpisala, with such coordinates that as K, and L equal to 1, 3, 5 or 7, using a pre-defined choice of either a weighted sum of the values of podpisala with such coordinates, that is 2, 4 or 6 and L equal to zero, and values of podpisala with the corresponding coordinates in adjacent bounded by the rectangle areas, or the weighted sum of the values of podpisala with such coordinates, which is equal To zero and L equal to 2, 4 or 6, and values of podpisala with the corresponding coordinates in adjacent bounded by the rectangle areas.

22. A sub-pixel interpolation device configured to determine values for sub-pixels located within a bounded rectangular region defined by four corner pixels with no intermediate pixels between the corner pixels, the pixels and sub-pixels being arranged in rows and columns, and the locations of the pixels and sub-pixels within said bounded rectangular region being expressed mathematically using the coordinate notation K/2^N, L/2^N, where K and L are positive integers with values between zero and 2^N respectively and N is a positive integer greater than one representing a particular degree of sub-pixel value interpolation, the device comprising: circuitry configured to interpolate the value for a sub-pixel having coordinates in which both K and L are odd, according to a predefined selection, as either a weighted average of the value of the nearest neighbouring pixel and the value of the sub-pixel located at coordinates 1/2, 1/2, or a weighted average of the values of a pair of diagonally located sub-pixels having coordinates in which both K and L are even, including zero, located within the quadrant of said bounded rectangular region defined by the corner points having coordinates 1/2, 1/2 and said nearest neighbouring pixel; circuitry configured to interpolate the values for sub-pixels having coordinates with K equal to an even value and L equal to zero and for sub-pixels having coordinates with K equal to zero and L equal to an even value, which are used in interpolating the sub-pixels having coordinates in which both K and L are odd, using a weighted sum of the values of pixels correspondingly arranged in rows and columns; and

circuitry configured to interpolate the values for sub-pixels having coordinates in which both K and L are even, which are used in interpolating the values for sub-pixels having coordinates in which both K and L are odd, using a predefined choice of either a weighted sum of the values of sub-pixels having coordinates with K equal to an even value and L equal to zero and of the values of sub-pixels having the corresponding coordinates in neighbouring bounded rectangular regions, or a weighted sum of the values of sub-pixels having coordinates with K equal to zero and L equal to an even value and of the values of sub-pixels having the corresponding coordinates in neighbouring bounded rectangular regions.

23. The device according to claim 22, configured to use first and second weights in the weighted average chosen for interpolating the value for a sub-pixel having coordinates in which both K and L are odd, the weighted average being chosen so as to use the value of the nearest neighbouring pixel and the value of the sub-pixel located at coordinates 1/2, 1/2, the device being configured to choose the relative magnitudes of the first and second weights so that they are inversely proportional to the respective distances, along a straight diagonal line, from the nearest neighbouring pixel and from the sub-pixel located at coordinates 1/2, 1/2 to the sub-pixel, having coordinates in which both K and L are odd, whose value is being interpolated.

24. The device according to claim 22, configured to use first and second weights in the weighted average chosen for interpolating the value for a sub-pixel having coordinates in which both K and L are odd, the weighted average being chosen so as to use the values of a pair of diagonally located sub-pixels having coordinates in which both K and L are even, the device being configured to choose the relative magnitudes of the first and second weights so that they are inversely proportional to the respective distances, along a straight diagonal line, from said sub-pixels having coordinates in which both K and L are even to the sub-pixel, having coordinates in which both K and L are odd, whose value is being interpolated.

25. The device according to item 23, made with the possibility to use the first and second weights with equal values when the said nearest neighboring pixel and the second putpixel, the location of which is defined by the coordinates 1/2, 1/2, are located at equal distances from podpisala having coordinates with odd values of both K and L whose value is interpolated.

26. The device according to claim 24, configured to use first and second weights of equal magnitude when said sub-pixels having coordinates in which both K and L are even lie at equal distances from the sub-pixel, having coordinates in which both K and L are odd, whose value is being interpolated.

27. The device according to claim 22, configured to interpolate the values for sub-pixels having coordinates in which both K and L are even, which are used in interpolating the values for sub-pixels having coordinates in which both K and L are odd, using a weighted sum of the values of sub-pixels having coordinates with K equal to an even value and L equal to zero and of the values of sub-pixels having the corresponding coordinates in neighbouring bounded rectangular regions, when the value of a sub-pixel having coordinates with K equal to an even value and L equal to an odd value is also required.

28. The device according to claim 22, configured to interpolate the values for sub-pixels having coordinates in which both K and L are even, which are used in interpolating the values for sub-pixels having coordinates in which both K and L are odd, using a weighted sum of the values of sub-pixels having coordinates with K equal to zero and L equal to an even value and of the values of sub-pixels having the corresponding coordinates in neighbouring bounded rectangular regions, when the value of a sub-pixel having coordinates with K equal to an odd value and L equal to an even value is also required.

29. The device according to claim 22, configured to interpolate the value for a sub-pixel having coordinates with K equal to an odd value O and L equal to an even value E, excluding zero and 2^N, by finding the average of the value of a first sub-pixel having coordinates with K equal to the even value O - 1 and L equal to E and the value of a second sub-pixel having coordinates with K equal to the even value O + 1 and L equal to E.

30. The device according to claim 22, configured to interpolate the value for a sub-pixel having coordinates with K equal to 1 and L equal to zero by finding the average of the value of the pixel having coordinates with K equal to zero and L equal to zero and the value of the sub-pixel having coordinates with K equal to 2 and L equal to zero.

31. The device according to claim 22, configured to interpolate the value for a sub-pixel having coordinates with K equal to 2^N - 1 and L equal to zero by finding the average of the value of the pixel having coordinates with K equal to 2^N and L equal to zero and the value of the sub-pixel having coordinates with K equal to 2^N - 2 and L equal to zero.

32. The device according to claim 22, configured to interpolate the value for a sub-pixel having coordinates with K equal to 1 and L equal to 2^N by finding the average of the value of the pixel having coordinates with K equal to zero and L equal to 2^N and the value of the sub-pixel having coordinates with K equal to 2 and L equal to 2^N.

33. The device according to claim 22, configured to interpolate the value for a sub-pixel having coordinates with K equal to 2^N - 1 and L equal to 2^N by finding the average of the value of the pixel having coordinates with K equal to 2^N and L equal to 2^N and the value of the sub-pixel having coordinates with K equal to 2^N - 2 and L equal to 2^N.

34. The device according to claim 22, configured to interpolate the value for a sub-pixel having coordinates with K equal to an even value E, excluding zero and 2^N, and L equal to an odd value O by finding the average of the value of a first sub-pixel having coordinates with K equal to E and L equal to the even value O - 1 and the value of a second sub-pixel having coordinates with K equal to E and L equal to the even value O + 1.

35. The device according to claim 22, configured to interpolate the value for a sub-pixel having coordinates with K equal to zero and L equal to 1 by finding the average of the value of the pixel having coordinates with K equal to zero and L equal to zero and the value of the sub-pixel having coordinates with K equal to zero and L equal to 2.

36. The device according to claim 22, configured to interpolate the value for a sub-pixel having coordinates with K equal to 2^N and L equal to 1 by finding the average of the value of the pixel having coordinates with K equal to 2^N and L equal to zero and the value of the sub-pixel having coordinates with K equal to 2^N and L equal to 2.

37. The device according to claim 22, configured to interpolate the value for a sub-pixel having coordinates with K equal to zero and L equal to 2^N - 1 by finding the average of the value of the pixel having coordinates with K equal to zero and L equal to 2^N and the value of the sub-pixel having coordinates with K equal to zero and L equal to 2^N - 2.

38. The device according to claim 22, configured to interpolate the value for a sub-pixel having coordinates with K equal to 2^N and L equal to 2^N - 1 by finding the average of the value of the pixel having coordinates with K equal to 2^N and L equal to 2^N and the value of the sub-pixel having coordinates with K equal to 2^N and L equal to 2^N - 2.

39. The device according to claim 22, configured to interpolate the value for a sub-pixel having coordinates with K equal to 2^N - 1 and L equal to 2^N - 1 by finding a weighted average of the values of the four corner pixels that define said bounded rectangular region.

40. The device according to claim 22, in which N is set equal to 2.

41. The device according to claim 22, in which N is set equal to 3.

42. A device for encoding video data, comprising a sub-pixel interpolation device according to claim 22.

43. A device for encoding still images, comprising a sub-pixel interpolation device according to claim 22.

44. A device for decoding video data, comprising a sub-pixel interpolation device according to claim 22.

45. A device for decoding still images, comprising a sub-pixel interpolation device according to claim 22.

46. A codec comprising a device for encoding video data according to claim 42 and a device for decoding video data according to claim 44.

47. A codec comprising a device for encoding still images according to claim 43 and a device for decoding still images according to claim 45.

48. A communication terminal comprising a device for encoding video data according to claim 42.

49. A communication terminal comprising a device for encoding still images according to claim 43.

50. A communication terminal comprising a device for decoding video data according to claim 44.

51. A communication terminal comprising a device for decoding still images according to claim 45.
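
The two Python sketches below are added for orientation only; they are not part of the claims and show just one possible reading of them. This first sketch illustrates a minimal form of the "weighted sum of the values of pixels correspondingly arranged in rows and columns" referred to in claims 20 to 22. The six-tap coefficients (1, -5, 20, 20, -5, 1)/32, the 8-bit dynamic range, the clipping and the rounding are assumptions chosen for the example; the claims fix none of these specifics.

    # Illustrative sketch only: a direct weighted-sum interpolation of a
    # half-integer value, in the spirit of claims 20-22.  The tap values,
    # the 8-bit range and the rounding offset are assumptions.

    TAPS = (1, -5, 20, 20, -5, 1)  # assumed example weights; they sum to 32


    def clip8(value):
        # keep the result inside an assumed 8-bit dynamic range
        return max(0, min(255, value))


    def half_value(samples):
        """Weighted sum of six values taken along one row or one column,
        giving the sub-pixel midway between the third and fourth samples:
        for instance the sub-pixel at (2/4, 0) from six integer-position
        pixels, or the sub-pixel at (2/4, 2/4) from six already-interpolated
        values of one predefined orientation (the 'either ... or ...' choice
        in the claims)."""
        if len(samples) != len(TAPS):
            raise ValueError("exactly six samples are required")
        acc = sum(t * s for t, s in zip(TAPS, samples))
        return clip8((acc + 16) >> 5)  # divide by 32, rounding to nearest

A practical encoder or decoder would normally keep the output of the first filtering pass at higher precision before reusing it in the second pass; that detail is left out of the sketch.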

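The second sketch, also hypothetical, fills the quarter-resolution grid of a single bounded rectangular region (claim 20, coordinates K/4, L/4 with K and L running from 0 to 4; equivalently the case N = 2 of claims 1 and 8 to 18), given the four corner pixels and half-integer values such as those produced by the sketch above. The function and helper names, the rounding offsets, the equal weights and the particular variant used for the position 3/4, 3/4 are assumptions made for the example.

    def avg2(a, b):
        # average of two values with rounding to nearest
        return (a + b + 1) >> 1


    def avg4(a, b, c, d):
        # equal-weight average of four values with rounding (claim 18 style)
        return (a + b + c + d + 2) >> 2


    def fill_quarter_grid(corners, halves, use_pixel_and_centre=True):
        """corners: the four integer pixels keyed by (K, L) in
        {(0, 0), (4, 0), (0, 4), (4, 4)}.
        halves: directly interpolated half-integer values keyed by (K, L) in
        {(2, 0), (0, 2), (4, 2), (2, 4), (2, 2)}.
        use_pixel_and_centre selects between the two predefined choices for
        the odd/odd positions: nearest pixel averaged with the sub-pixel at
        (2, 2), or a diagonal pair of even-coordinate sub-pixels."""
        v = {}
        v.update(corners)
        v.update(halves)

        pairs = {
            # quarter-pels on the region edges (claims 9-12, 14-17 with N = 2)
            (1, 0): ((0, 0), (2, 0)), (3, 0): ((4, 0), (2, 0)),
            (0, 1): ((0, 0), (0, 2)), (0, 3): ((0, 4), (0, 2)),
            (1, 4): ((0, 4), (2, 4)), (3, 4): ((4, 4), (2, 4)),
            (4, 1): ((4, 0), (4, 2)), (4, 3): ((4, 4), (4, 2)),
            # quarter-pels on the row and column through (2, 2) (claims 8, 13)
            (2, 1): ((2, 0), (2, 2)), (2, 3): ((2, 4), (2, 2)),
            (1, 2): ((0, 2), (2, 2)), (3, 2): ((4, 2), (2, 2)),
        }
        for (k, l), (a, b) in pairs.items():
            v[(k, l)] = avg2(v[a], v[b])

        # odd/odd positions: the predefined choice described in claims 1 and 20
        for pos, corner, diag in [
            ((1, 1), (0, 0), ((2, 0), (0, 2))),
            ((3, 1), (4, 0), ((2, 0), (4, 2))),
            ((1, 3), (0, 4), ((0, 2), (2, 4))),
        ]:
            if use_pixel_and_centre:
                v[pos] = avg2(v[corner], v[(2, 2)])
            else:
                v[pos] = avg2(v[diag[0]], v[diag[1]])

        # position (3/4, 3/4): weighted (here equal) average of the four
        # corner pixels, following the claim 18 variant
        v[(3, 3)] = avg4(v[(0, 0)], v[(4, 0)], v[(0, 4)], v[(4, 4)])
        return v

Called, for instance, with corners {(0, 0): 10, (4, 0): 20, (0, 4): 30, (4, 4): 40} and five suitable half-integer values, the function returns a dictionary holding all twenty-five positions of the region.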


 
