Method for masking errors in video sequences

FIELD: video data encoding, in particular, masking of distortions introduced by errors.

SUBSTANCE: a method and device are claimed for masking errors in video sequences. When a scene transition exists in a video sequence and an error is present in an image belonging to the scene transition, an error masking procedure based on the type of scene transition is used to conceal the error. Information about the scene transition, together with information about its type, is conveyed to the video decoder in a supplemental enhancement information message. If the scene transition is a gradual transition, a spatio-temporal error masking algorithm is used to conceal the image; if the scene transition is a scene cut and only part of the image is lost or damaged, spatial error masking is used to conceal the lost or damaged part of the image; and if a whole image belonging to a scene cut is lost or damaged and the image begins a new scene, it is not masked.

EFFECT: provision of a method by which a suitable form of error masking may be selected for frames belonging to scene transitions in a video sequence.

3 cl, 3 dwg, 11 tbl

 

The technical field to which the invention relates

The present invention relates generally to video coding and, in particular, to masking distortions introduced by transmission errors.

Prior art

A video sequence consists of a series of still images, or frames. Video compression methods are based on reducing the redundant and perceptually unimportant parts of the sequence. Redundancy in video sequences can be classified into spectral, spatial and temporal redundancy. Spectral redundancy refers to the similarity between the different color components of the same image. Spatial redundancy results from the similarity between neighboring pixels in an image. Temporal redundancy exists because objects appearing in the previous frame are likely also to appear in the current frame. Compression can be achieved by exploiting this temporal redundancy and predicting the current image from another image, called the reference image. Further compression is achieved by generating motion compensation data, which describe the motion between the current image and the reference image.

Video compression methods typically distinguish between images that exploit temporal redundancy reduction and images that do not. Compressed images that do not exploit temporal redundancy reduction are usually referred to as INTRA- (or I-) frames or images. Temporally predicted images are usually predicted from an image occurring before the current image and are referred to as INTER- (or P-) frames. A compressed video fragment (video clip) typically consists of a sequence of images that can be roughly classified into temporally independent INTRA-images and temporally differentially coded INTER-images. INTRA-images are usually used to stop the temporal propagation of transmission errors in the reconstructed video signal and to provide random access points into the bit stream. Since the compression efficiency provided by INTRA-images is usually lower than that provided by INTER-images, they are generally used sparingly, especially in low-bit-rate applications.

A video sequence may be composed of many scenes, or video clips, that have been shot with a single camera. A video clip is defined as a succession of nearly continuous frames or images shot with one camera. In general, frames within the same video clip are strongly correlated. In a typical sequence, however, the image content differs significantly from one scene to another, and therefore the first image of a scene is usually encoded using INTRA-frame coding. The changes between video clips in a sequence are called "scene transitions". Scene transitions can take many different forms. For example, one video clip may end and another may begin abruptly, with a sharp scene transition (a "scene cut"). In other cases, the scene transition is gradual and spans more than one frame. Examples of gradual scene transitions are dissolves, fade-ins and fade-outs (the gradual appearance and disappearance of the image), and wipes (a gradual change from one image to another).

Compressed video is easily corrupted by transmission errors, mainly for two reasons. First, because of the use of temporally predictive differential coding (INTER-frames), errors propagate both in space and in time. Second, the use of variable-length codes increases the susceptibility of the video bitstream to errors. At the receiver (decoder), there are many ways of dealing with distortions introduced along the transmission path. In general, errors that occur during transmission are first detected and then corrected or concealed by the decoder. The term "error correction" refers to the process of fully recovering the erroneous data as if no errors had originally occurred, whereas "error masking" refers to the process of concealing the effects of transmission errors so that they are barely visible in the reconstructed video sequence.

The current video codec developed by the Joint Video Team (JVT) of the Moving Picture Experts Group of the International Organization for Standardization (ISO)/International Electrotechnical Commission (IEC) and the Video Coding Experts Group of the International Telecommunication Union - Telecommunication Sector (ITU-T), the ITU-T H.264/MPEG-4 Part 10 AVC video encoder-decoder (codec), lacks a way of deciding how to conceal a transmission error in encoded INTRA-frames and scene transition frames, and it is in this context that the present invention was developed.

Summary of the invention

The present invention provides a method by which a suitable type of error masking can be selected for frames that belong to scene transitions in a video sequence. The method is equally applicable to abrupt scene transitions (i.e. scene cuts) and to gradual scene transitions, such as fade-ins, fade-outs, wipes, dissolves, etc.

To perform effective error masking for frames that belong to scene transitions, two kinds of information are needed: 1) information about the frames in which the scene change begins and ends; and 2) the type of scene transition (cut, dissolve, fade-in or fade-out, wipe, etc.). As these two types of information are not required for the correct decoding of video coding layer (VCL) data, the present invention proposes that information related to scene transitions be provided as supplemental enhancement information (SEI) and included in the encoded video bit stream in an SEI message. All the information required for masking errors that occur in frames belonging to scene transitions can then be derived from the SEI message.

According to the invention, each scene of the video sequence is associated with a scene identifier value. The scene identifier values of successive scenes differ from each other, and thus the video decoder can conclude that a scene change has occurred when it receives a scene identifier different from the one it received earlier. Frames within a scene transition period are associated with two scene identifier values, one from each of the two scenes between which the transition occurs. In addition, gradual scene transitions are associated with a particular transition type, which may be a dissolve, a fade-in, a fade-out, a wipe, or "none of the above" (i.e. some other kind of transition). This coarse classification provides the decoder with sufficient control information to choose a suitable error masking algorithm for concealing the loss or corruption of data during scene transitions.

Thus, according to a first aspect of the present invention, a method is provided for masking errors in a frame of a video sequence that contains at least a first scene and a second scene, there being a transition from the first scene to the second scene, which contains a plurality of frames and is one of a plurality of types. The method comprises:

identifying the type of the scene transition; and

applying an error masking procedure to mask errors in a frame belonging to said transition, based on the identified type of scene transition.

The identified type of scene transition may be a scene cut or a gradual scene transition.

Preferably, if a whole image belonging to a scene cut is lost, the lost image is not masked.

Preferably, if part of an image belonging to a scene cut is lost or damaged, a spatial error masking algorithm is applied to mask said lost or damaged part of the image.

Preferably, if a whole image belonging to a gradual transition is lost or damaged, a spatio-temporal error masking algorithm is applied to mask the lost or damaged image.

Preferably, if part of an image belonging to a gradual transition is lost or damaged, a spatio-temporal error masking algorithm is applied to mask said lost or damaged part of the image.

Preferably, information indicating the identified scene transition is supplied to the decoder in a supplemental enhancement information message, to allow the decoder to mask the error based on said information.

Preferably, the information indicating the identified scene transition includes an indication of the type of scene transition, and the information indicating the identified scene transition is provided for each frame belonging to the transition.

According to a second aspect of the present invention, a video encoding device is provided for encoding a video sequence into a data stream, the sequence containing at least a first scene and a second scene, there being a scene transition from the first scene, and the scene transition containing a plurality of frames and being one of a plurality of types. The video encoding device includes:

means for identifying frames associated with said transition; and

means for providing information about the transition.

According to a third aspect of the present invention, a video decoding device is provided for decoding a video sequence from a data stream, the sequence containing at least a first scene and a second scene, there being a scene transition from the first scene, and the scene transition containing a plurality of frames and being one of a plurality of types. The video decoding device contains:

means for receiving the data stream; and

means for applying an error masking algorithm to mask errors in a frame belonging to said transition, based on the scene transition.

The present invention will become apparent upon reading the description considered in conjunction with Figs. 1-3.

List of figures

Fig. 1 is a flow diagram illustrating a method of masking errors according to the present invention, showing how a suitable error masking method is chosen for an image in a scene transition depending on the type of scene transition.

Fig. 2 is a block diagram showing a video encoder implemented in accordance with the present invention to provide an encoded data stream that includes information specifying scene transitions for the purpose of error masking.

Fig. 3 is a block diagram showing a video decoder implemented in accordance with the present invention and corresponding to the encoder shown in Fig. 2.

The preferred embodiment of the invention

As explained above, supplemental enhancement information (SEI) included in the encoded bit stream contains information that is not required for the correct decoding of the encoded video data, but which is nevertheless useful for display or decoding purposes. SEI is thus an ideal vehicle for conveying information about the scene to which a particular frame of the sequence belongs, and for providing information about scene transitions.

According to the ITU-T H.264/MPEG-4 Part 10 AVC video coding standard, an SEI element contains one or more SEI messages. Each SEI message consists of an SEI header and SEI payload information. The type and size of the SEI payload are coded using an extensible syntax. The SEI payload size is indicated in bytes. The permissible SEI payload types are listed in Annex C of the JVT committee draft (see document JVT_D015d5).

An SEI payload may have an SEI payload header. For example, the payload header may specify the picture to which the payload data belong. The payload header is defined separately for each payload type. Descriptions of SEI payloads are given in Annex C of the JVT committee draft (again, see document JVT_D015d5).

SEI units are transmitted synchronously with the other NAL (network abstraction layer) units. An SEI message may refer to a slice, parts of a picture, a picture, any group of pictures, a past sequence, the sequence currently being decoded, or a sequence that will be decoded in the future. An SEI message may also refer to one or more NAL units preceding or following it in transmission order.

Table 1 below defines the SEI payload syntax used in the ITU-T H.264/MPEG-4 Part 10 AVC video coding standard, while Table 2 presents the specific syntax used in connection with the signaling of scene information proposed according to the present invention.

Table 1
SEI payload syntax

SEI_payload( PayloadType, PayloadSize ) {                Category  Descriptor
    if( PayloadType == 1 )
        temporal_reference( PayloadType, PayloadSize )   7
    else if( PayloadType == 2 )
        clock_timestamp( PayloadType, PayloadSize )      7
    else if( PayloadType == 3 )
        panscan_rect( PayloadType, PayloadSize )         7
    else if( PayloadType == 4 )
        Scene_information( PayloadType, PayloadSize )    7
    else
        reserved                                         Variable
    if( !byte_aligned( ) ) {
        bit_equal_to_one                                           f(1)
        while( !byte_aligned( ) )
            bit_equal_to_zero                                      f(1)
    }
}
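The payload-type dispatch of Table 1 can be sketched as follows. This is an illustrative Python sketch, not an implementation of the standard; the payload type numbers come from the table, while the per-type parsers are hypothetical stubs.

```python
def parse_sei_payload(payload_type, payload):
    """Dispatch an SEI payload to a type-specific parser (cf. Table 1)."""
    handlers = {
        1: parse_temporal_reference,
        2: parse_clock_timestamp,
        3: parse_panscan_rect,
        4: parse_scene_information,
    }
    handler = handlers.get(payload_type)
    if handler is None:
        # Payload types not listed in the table are reserved and skipped.
        return {"payload_type": payload_type, "reserved": True}
    return handler(payload)

# Hypothetical stubs: each would decode its payload fields per the standard.
def parse_temporal_reference(payload):
    return {"payload_type": 1, "raw": payload}

def parse_clock_timestamp(payload):
    return {"payload_type": 2, "raw": payload}

def parse_panscan_rect(payload):
    return {"payload_type": 3, "raw": payload}

def parse_scene_information(payload):
    return {"payload_type": 4, "raw": payload}
```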

Table 2
Scene information SEI syntax

Scene_information( PayloadType, PayloadSize ) {          Category  Descriptor
    scene_identifier                                               u(8)
    if( more_SEI_payload_data( ) ) {
        second_scene_identifier                                    u(8)
        if( more_SEI_payload_data( ) )
            scene_transition_type                                  e(v)
    }
}
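The conditional field layout of Table 2 can be sketched as follows, assuming the payload has already been extracted as a byte string. For brevity scene_transition_type is read here as a single byte, whereas the table specifies an e(v) entropy-coded value.

```python
def parse_scene_information(payload):
    """Parse the scene information SEI payload fields of Table 2."""
    info = {"scene_identifier": payload[0]}            # u(8)
    if len(payload) > 1:                               # more_SEI_payload_data()
        info["second_scene_identifier"] = payload[1]   # u(8)
        if len(payload) > 2:                           # more_SEI_payload_data()
            # Simplification: the table specifies e(v), not a raw byte.
            info["scene_transition_type"] = payload[2]
    return info
```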

The scene information parameters shown in Table 2 refer to the next NAL unit, in transmission order, that contains encoded macroblock data.

scene_identifier: A scene is defined as a sequence of continuous frames shot with one camera. In general, frames within the same scene are strongly correlated. According to the invention, frames within a given scene share the same value of the scene_identifier parameter, and scenes that are consecutive in coding order must not have the same scene_identifier value.

second_scene_identifier: When present, the second_scene_identifier parameter indicates that the next NAL unit containing encoded macroblock data belongs to a frame that contains image data from two scenes. In other words, the frame belongs to a gradual scene transition. The second_scene_identifier parameter is the identifier of the subsequent scene in coding order.

scene_transition_type: If the scene_transition_type parameter is not present in the scene information SEI, the type of scene transition is unknown, undefined or irrelevant. When present, the values listed in Table 3 below are valid:

Table 3
Scene transition types according to a preferred embodiment of the invention

Value          Description
0              Dissolve
1              Fade-out (gradual disappearance of the image)
2              Fade-in (gradual appearance of the image)
4              Wipe
Other values   Reserved
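The value-to-name mapping of Table 3 can be expressed as a simple lookup. This is a sketch: value 0 is rendered as "dissolve" following the list of gradual transitions given in the summary, and any value not listed in the table (including 3) is treated as reserved.

```python
# Mapping of scene_transition_type values per Table 3.
SCENE_TRANSITION_TYPES = {
    0: "dissolve",
    1: "fade-out",
    2: "fade-in",
    4: "wipe",
}

def transition_type_name(value):
    """Return the name for a scene_transition_type value, or 'reserved'."""
    return SCENE_TRANSITION_TYPES.get(value, "reserved")
```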

The manner in which the scene information described above is used in the decoding process to handle data loss or corruption will now be described.

If an entire picture immediately before the current picture is lost and the scene has changed since the previously received picture, the lost picture should not be masked, because it starts a new scene. If an entire picture immediately before the current picture is lost and no scene change has occurred since the previously received picture, the decoder conceals the lost picture. If an entire picture is lost during a transition period, the decoder uses the indicated scene transition type when masking the lost picture.

If part of the current picture has been lost or damaged, and if a scene information SEI message is associated with the picture, the decoder performs the following steps:

1. If the scene has changed since the previously received picture, the decoder applies a spatial error masking algorithm to mask the lost or damaged parts of the current picture.

2. If the scene has not changed since the previously received picture, the decoder applies a spatio-temporal error masking algorithm.

3. If the current picture belongs to a scene transition, the decoder uses the indicated scene transition type when masking the lost data.
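The whole-picture rules and the three partial-loss steps above can be sketched as a single selection function. The returned labels are illustrative names for the concealment strategies, not identifiers from the standard, and the ordering of the checks is one reasonable reading of the rules.

```python
def choose_concealment(whole_picture_lost, scene_changed,
                       in_transition, transition_type=None):
    """Select a concealment strategy per the decoder rules above."""
    if whole_picture_lost:
        if in_transition:
            # Lost during a transition period: use the signaled type.
            return "transition-aware concealment (%s)" % transition_type
        if scene_changed:
            # The lost picture starts a new scene: do not mask it.
            return "no concealment"
        # Same scene as the previous picture: conceal the lost picture,
        # e.g. from the previously received picture.
        return "conceal from previous picture"
    # Only part of the picture is lost or damaged (steps 1-3).
    if in_transition:
        return "transition-aware concealment (%s)" % transition_type
    if scene_changed:
        return "spatial concealment"
    return "spatio-temporal concealment"
```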

According to the invention, the encoder should generate scene information SEI messages if it operates in an error-prone transmission environment, or if there is a need to create a content description based on the encoded video signal. Even if there is no immediate need to describe the content, the need may arise later for certain types of video content, such as entertainment video. Therefore, according to the present invention, encoders preferably always generate scene information SEI messages whenever possible.

Accordingly, the encoder creates a scene information SEI message for each scene cut and each gradual scene transition. For each scene cut picture there is a scene information SEI message, which may be repeated later for robustness. For each gradual scene transition there is a (preferably repeated) scene information SEI message associated with the first transition picture (i.e. the first picture that is composed both of the scene being transitioned from and the scene being transitioned to). For each gradual scene transition there is also a (preferably repeated) scene information SEI message associated with the first picture after the last transition picture (the last transition picture being the last picture composed both of the scene being transitioned from and the scene being transitioned to). As mentioned above, the value of the scene_identifier parameter differs between consecutive scenes.
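The placement rule for gradual transitions can be sketched as follows. Frame indices are in coding order; sending each message exactly twice is an assumption, the text saying only "preferably repeated".

```python
def sei_message_frames(first_transition_frame, last_transition_frame):
    """Frames that carry a scene information SEI message for one gradual
    transition: the first transition picture and the first picture after
    the last transition picture, each message repeated once."""
    placements = []
    for frame in (first_transition_frame, last_transition_frame + 1):
        placements.extend([frame, frame])  # original message plus repeat
    return placements
```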

In packet-oriented transport environments, the packetization unit duplicates each scene information SEI message in at least two packets, if possible, to ensure the correct reception of at least one message. When transporting over RTP (the real-time transport protocol), the packetization unit uses compound packets to link scene information SEI messages with the encoded picture content. In byte-stream-oriented transport environments, each scene information SEI message is sent at least twice.

The method of masking errors in video sequences by conveying scene change information according to the present invention is shown as sequence 100 in Fig. 1.

When the decoder encounters data loss or corruption during the decoding process, it determines at step 110 whether the lost or damaged data concerns a whole picture or part of a picture. If a whole picture has been lost, the decoder determines which loss situation has occurred (step 120). If the entire picture immediately before the current picture was lost and the scene has changed since the previously received picture (e.g., as indicated by the value of the scene_identifier parameter in the received SEI), the lost picture should not be masked, because in this case, as explained above, the lost picture represents the beginning of a new scene (step 124). If the entire picture immediately before the current picture was lost and no scene change has occurred since the previously received picture, the decoder conceals the lost picture, as shown in step 122. If the entire picture was lost during a transition period, the decoder uses the indicated scene transition type (obtained from the received SEI) when masking the lost picture, as shown in step 126.

If part of a picture has been lost, the decoder determines which loss situation has occurred (step 130). If part of the current picture has been lost or damaged, and if a scene information SEI message is associated with the picture, the decoder performs the following actions: if the scene has changed since the previously received picture, the decoder applies a spatial error masking algorithm to mask the lost or damaged parts of the current picture, as shown in step 134. If the scene has not changed since the previously received picture, the decoder applies a spatio-temporal error masking algorithm, as shown in step 132. If the current picture belongs to a scene transition, the decoder uses the indicated scene transition type when masking the lost data, as shown in step 136.

To perform the error masking method shown as sequence 100 in Fig. 1, a video encoder implemented according to the invention must be capable of monitoring scene changes and of inserting information indicating a scene change into the bit stream it generates. A block diagram of such a video encoder 200 is shown in Fig. 2. As shown in this figure, the video encoder implemented in accordance with the invention includes a scene transition monitoring means 210, a video data encoding means 220, a control block 230, and a multiplexing/packetization block 240. The input video signal representing a video sequence is fed to the input of the video encoder and passed through the scene transition monitoring means 210 to the video data encoding means 220, where the individual frames of the video are encoded, for example, in INTRA- or INTER-frame format. The scene transition monitoring means examines the frames, for example by calculating the cumulative sum of absolute differences between pixels in successive frames of the sequence, or by using any other scene detection method known from the prior art, to identify the various scenes that make up the sequence. The scene transition monitoring means 210 provides the control block 230 with an indication of the scene to which each frame belongs. When a scene transition is detected (for example, a scene cut, a fade-in or fade-out, a dissolve, a wipe, etc.), the scene transition monitoring means 210 also provides the control block 230 with an indication of the transition. The control block assigns an identifier (e.g. a number) to each scene identified by the scene transition monitoring means and associates the identifier with each frame identified as belonging to that scene.
In addition, when a scene transition is detected, the control block 230 instructs the video data encoding means 220 to encode the first frame of the new scene in INTRA-frame coding format. Preferably, all subsequent frames belonging to the new scene are then encoded in INTER-frame coding format, unless it becomes necessary for some reason to encode a given frame in a different format. In a preferred embodiment of the invention, the control block 230 associates supplemental enhancement information (SEI) with each frame belonging to a scene transition and passes this SEI to the multiplexing/packetization block 240. Preferably, the SEI for the frames forming part of a scene transition is formed as described earlier for the preferred embodiment of the invention. The multiplexing/packetization block also receives the encoded video data from the video data encoding means 220 and forms a single bit stream from the encoded video data and the SEI. The bit stream is then passed, for example, to a corresponding video decoder (see Fig. 3) over a transmission channel, or to a storage device (not shown) for later retrieval and viewing.
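The frame examination performed by the scene transition monitoring means 210 can be sketched with the sum-of-absolute-differences measure mentioned above. This is a minimal sketch: the threshold value and the flat pixel-list frame representation are assumptions.

```python
def detect_scene_changes(frames, threshold):
    """Flag frame indices where the sum of absolute pixel differences
    from the previous frame exceeds the given threshold."""
    changes = []
    for i in range(1, len(frames)):
        sad = sum(abs(a - b) for a, b in zip(frames[i - 1], frames[i]))
        if sad > threshold:
            changes.append(i)
    return changes
```

In practice the threshold would be tuned so that gradual transitions, which spread the pixel difference over many frames, are distinguished from scene cuts, which concentrate it in a single frame.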

Fig. 3 is a block diagram showing a video decoder 300 implemented in accordance with the present invention and corresponding to the video encoder described in connection with Fig. 2. As can be seen in Fig. 3, the video decoder according to the invention includes a depacketization/demultiplexing block 310, a control block 320, an error masking module 330, and a video data decoding means 340. The depacketization/demultiplexing block receives the encoded bit stream representing a video sequence from the transmission channel in the form of data packets. It reconstructs the encoded video bit stream from the received data packets and divides the video bit stream into its constituent parts (i.e. the various types of information related to the encoded frames of the video sequence). According to the invention, the depacketization/demultiplexing block extracts the supplemental enhancement information, which contains, among other things, information related to scene transitions, from the encoded video bitstream and passes the SEI to the control block 320. The data needed for decoding the encoded video frames are passed from the depacketization/demultiplexing block 310 to the video data decoding means 340, where the individual frames of the video sequence are reconstructed using, for example, INTRA- and INTER-frame decoding methods. As each frame is decoded, the video data decoding means 340 examines the received video data for errors that may have been introduced during transmission of the encoded video bitstream over the transmission channel from the encoder. If the video data decoding means detects that a particular frame contains such errors, or that a frame is so badly damaged that it cannot be decoded (i.e. the frame is effectively lost), it attempts to mask the error, or the entire frame, using an appropriate error masking algorithm. According to the present invention, the choice of the appropriate error masking method is performed by the error masking module 330.
The error masking module receives information about the scene to which each frame belongs from the control block 320. In the case of frames forming part of a scene transition, the control block 320 also passes to the error masking module information related to the type of scene transition. Thus, when the decoding means 340 detects an error affecting a frame that is part of a scene transition, the error masking module 330 can choose a suitable error masking method taking into account the scene to which the frame belongs and the type of scene transition. Preferably, in making this selection, the error masking module 330 applies the selection method described in the preferred embodiment of the invention presented above. The video data decoding means 340 then masks the error in the frame using the selected error masking algorithm and outputs the decoded frame, for example, for display on a display device (not shown).

To confirm the effectiveness of the error masking methods for scene cut and scene transition frames according to the invention, a number of simulation experiments were performed using a video encoder and decoder implemented according to the ITU-T H.264/MPEG-4 Part 10 AVC coding standard and modified to operate in accordance with the method of the present invention. These simulation experiments are described in detail below.

A. Simulation of error masking in random access frames and scene cuts

This simulation used the sequences and bit rates proposed in document VCEG-N79r1 and the common conditions for packet-lossy environments (as defined in VCEG-N79r1). In addition, to simulate the effect of error masking on scene cut frames, an artificial sequence with a regular scene cut every 30 frames was created from the well-known sequences "News", "Foreman", "Coastguard", "Carphone" and "Silent". In the following, this synthetic sequence is referred to as "MixedSeq". An INTRA-frame period of approximately 1 second was used to allow frequent random access in all cases. For the MixedSeq sequence, this INTRA-frame period meant that all scene cuts were coded using INTRA-frame coding. Loss-aware R/D (rate-distortion) optimization (LA-RDO) was also used. The other encoding parameters used to encode the sequences are shown below in Table 4:

Table 4
Encoder parameters used in the simulation of error masking in random access frames and scene cuts

Bitstream mode:                                RTP (real-time transport protocol)
Motion vector resolution:                      1/4 pixel
Hadamard transform:                            used
Maximum search range:                          16
Number of previous frames used for
INTER-frame motion search:                     5
Permissible block types:                       all
Slice mode:                                    fixed size, 1400 bytes/slice
B-frames and SP-frames:                        not used
Symbol mode:                                   UVLC (universal variable-length coding)
Data partitions:                               1 partition per slice
Sequence header:                               no sequence header
Search range restriction:                      none
Constrained INTRA prediction:                  used
Constrained key frames:                        used
Number of decoders for LA-RDO (loss-aware
rate-distortion optimized macroblock mode
selection algorithm):                          30

Error masking

The creation of scene information SEI messages in the encoder according to the method of the invention was modeled in accordance with the recommendations presented in the above-described preferred embodiment of the invention, and a comparison of two decoding processes was performed:

1. The standard joint model decoder, which includes the error masking method described in Appendix D of the JVT working draft (see document JVT-C039).

2. The joint model decoder extended with the decoding process according to the present invention.

Calculation of the bit rate and PSNR (peak signal-to-noise ratio)

As specified in the common conditions defined in document VCEG-N79r1, the coding parameters, such as the quantization parameter, were chosen so as to make the resulting bit rate as close as possible to the channel bit rate, allowing for 40 bytes of IP/UDP/RTP (Internet protocol/user datagram protocol/real-time transport protocol) headers per packet. The PSNR values were calculated using every frame of the original sequence, including skipped and lost frames. To reduce the effect of the first frame on the overall result (the first encoded frames have a larger average size than the average size over the entire sequence), the bit rate and average PSNR were calculated starting from the sixth encoded frame. This approach makes it possible to encode short sequences with reliable results. Instead of coding 4000 frames, the 300-400 frames of each sequence were used, ensuring that at least 100 frames were coded and at least 300 frames were used.

Modeling of packet losses

In this simulation, it was assumed that the packet containing the parameter set (Table 4) was transmitted reliably (for example, over a separate channel during session setup), and therefore no error pattern was read for it from the error pattern file. At least one packet of the first frame must be received to avoid accidental decoder failure. To meet this condition, the first packet of the first frame was always accepted, regardless of the corresponding error pattern.

Representative decoding run

The encoded bit stream was decoded several times (each pass is referred to as a decoding run). The initial position in the loss pattern for decoding run n+1 immediately follows the final position used in the n-th decoding run. The number of decoding runs was chosen so that in total at least 8000 packets were processed. The overall average PSNR was obtained by averaging the average PSNR values of all decoding runs. The representative decoding run was chosen so that its average PSNR was closest to the overall average PSNR. The instantaneous PSNR values and the decoded sequence of the representative decoding run were then used to plot instantaneous PSNR graphs and for subjective quality assessment.
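The selection of the representative decoding run reduces to picking the run whose average PSNR is closest to the overall average. A minimal sketch (the function name is ours):

```python
def pick_representative_run(run_average_psnrs):
    """Given the average PSNR of each decoding run, return the index of the
    run whose average PSNR is closest to the overall average PSNR."""
    overall = sum(run_average_psnrs) / len(run_average_psnrs)
    return min(range(len(run_average_psnrs)),
               key=lambda i: abs(run_average_psnrs[i] - overall))
```
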

Results

As the simulation results for the MixedSeq sequence at 144 kbit/s, presented in Table 11 below, show, using INTRA-frame error concealment for scene-cut frames provides better objective and subjective quality than using INTER-frame error concealment. Conversely, INTER-frame error concealment performs better than INTRA-frame error concealment for frames that do not belong to a scene cut, as can be seen in the other six encoding cases presented in Tables 5-10. This demonstrates the applicability of the present invention.

Table 5
"Foreman" sequence, encoded at 64 kbit/s and 7.5 frames/s
Resulting bit rate: 59.81 kbit/s; QP (quantization parameter): 24

                                                       Average PSNR (dB) at packet loss rate (%)
Algorithm                                              0      3      5      10     20
Standard joint model decoder (Appendix D concealment)  25.54  25.20  24.93  24.43  23.34
Method according to the invention                      25.54  25.29  25.11  24.64  23.86

Table 6
"Foreman" sequence, encoded at 144 kbit/s and 7.5 frames/s
Resulting bit rate: 143.54 kbit/s; QP (quantization parameter): 18

                                                       Average PSNR (dB) at packet loss rate (%)
Algorithm                                              0      3      5      10     20
Standard joint model decoder (Appendix D concealment)  26.78  26.19  25.88  24.97  23.61
Method according to the invention                      26.78  26.43  26.16  25.53  24.57

Table 7
"Hall" sequence, encoded at 32 kbit/s and 10 frames/s
Resulting bit rate: 29.73 kbit/s; QP (quantization parameter): 24

                                                       Average PSNR (dB) at packet loss rate (%)
Algorithm                                              0      3      5      10     20
Standard joint model decoder (Appendix D concealment)  30.53  29.89  29.53  28.28  26.79
Method according to the invention                      30.53  30.40  30.28  30.01  29.55

Table 8
"Irene" sequence, encoded at 384 kbit/s and 30 frames/s
Resulting bit rate: 334.96 kbit/s; QP (quantization parameter): 22

                                                       Average PSNR (dB) at packet loss rate (%)
Algorithm                                              0      3      5      10     20
Standard joint model decoder (Appendix D concealment)  34.99  34.09  33.40  31.35  28.79
Method according to the invention                      34.99  34.62  34.32  33.58  32.35

Table 9
"Paris" sequence, encoded at 144 kbit/s and 15 frames/s
Resulting bit rate: 139.18 kbit/s; QP (quantization parameter): 28

                                                       Average PSNR (dB) at packet loss rate (%)
Algorithm                                              0      3      5      10     20
Standard joint model decoder (Appendix D concealment)  26.41  25.34  24.66  23.44  21.01
Method according to the invention                      26.41  26.23  26.08  25.70  25.10

Table 10
"Paris" sequence, encoded at 384 kbit/s and 15 frames/s
Resulting bit rate: 355.32 kbit/s; QP (quantization parameter): 22

                                                       Average PSNR (dB) at packet loss rate (%)
Algorithm                                              0      3      5      10     20
Standard joint model decoder (Appendix D concealment)  29.56  27.75  26.95  24.06  21.54
Method according to the invention                      29.56  29.20  28.92  28.33  27.34

Table 11
"MixedSeq" sequence, encoded at 144 kbit/s and 15 frames/s
Resulting bit rate: 124.31 kbit/s; QP (quantization parameter): 21

                                                       Average PSNR (dB) at packet loss rate (%)
Algorithm                                              0      3      5      10     20
Standard joint model decoder (Appendix D concealment)  30.37  30.04  29.86  29.17  28.23
Method according to the invention                      30.37  30.06  29.88  29.26  28.36

B. Modeling of error concealment for the gradual appearance or disappearance of the image

To simulate the effect of error concealment on frames of gradual appearance (fade-in) and gradual disappearance (fade-out) of the image, two artificial sequences were created, each consisting of 10 fade-out frames, 10 fade-in frames and 10 normal frames. One was made from a combination of the "News" and "Akiyo" sequences (low motion); the other from the "Carphone" and "Foreman" sequences (moderate motion). After encoding with the generalized JVT model coder and an I-P-P-P coding pattern, the loss of some of the fade-in or fade-out frames was simulated, and the bit stream with losses was fed to the decoder. To conceal the errors introduced by the loss of fade-in or fade-out frames, two different error concealment methods were used:

1. The usual error concealment method of the JVT codec (as described in Appendix D of the JVT working draft (JVT-C039)); and

2. The special error concealment method for the gradual appearance or disappearance of the image according to the invention, as described below.

Error concealment method

A missing frame within a gradual transition is concealed by copying and scaling the pixel values of the previous frame. However, if there is only one previous frame within the transition period, no scaling is performed. If Mn' is the mean Y (luminance) pixel value of the previous frame, and Mn'' is the mean Y pixel value of the frame before the previous frame, the scale factor f is calculated as follows:

f = (2·Mn' - Mn")/Mn'

The concealed values Ys, Us and Vs, where U and V are the colour-difference signals, for a pixel are calculated from the spatially corresponding values (Y, U, V) of the previous image as follows:

Ys = f·Y

Us = f·(U - 128) +128

Vs = f·(V - 128) + 128
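The fade concealment rule above can be sketched as follows. The clipping of the scaled values to the 8-bit range [0, 255] is our assumption, since the text does not state how out-of-range results are handled; the function name is ours.

```python
import numpy as np

def conceal_fade_frame(prev_y, prev_u, prev_v, mean_prev=None, mean_prev2=None):
    """Conceal a lost frame inside a fade by copying the previous frame and
    scaling it, extrapolating the luminance trend of the last two frames.
    prev_y/u/v: Y, U, V planes of the previous frame; mean_prev and
    mean_prev2: mean luminance of the previous frame (Mn') and of the frame
    before it (Mn'')."""
    if mean_prev is None or mean_prev2 is None:
        # Only one previous frame in the transition period: plain copy, no scaling.
        return prev_y.copy(), prev_u.copy(), prev_v.copy()
    f = (2.0 * mean_prev - mean_prev2) / mean_prev       # f = (2*Mn' - Mn'')/Mn'
    ys = np.clip(f * prev_y, 0, 255)                     # Ys = f*Y
    us = np.clip(f * (prev_u - 128.0) + 128.0, 0, 255)   # chroma scaled about 128
    vs = np.clip(f * (prev_v - 128.0) + 128.0, 0, 255)
    return ys, us, vs
```
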

As the simulation results showed, using the special error concealment method for gradual transitions according to the present invention provided significantly better objective and subjective quality than the usual error concealment method described in Appendix D of the JVT video coding recommendations. It could be argued that the visual quality of a gradual scene transition is not important. However, poor error concealment during a scene transition leads not only to poor quality of the transition frames themselves, but also to poor quality of the normal frames after the transition, due to temporal error propagation.

Error concealment for dissolves

Two methods of concealing a lost frame during a dissolve are described below. If the decoder is able to buffer a sufficient number of frames to be decoded, algorithm A should be used; otherwise, algorithm B should be used.

Algorithm A:

If the decoder buffer contains an encoded INTRA frame of the second scene (a frame after the transition period), this INTRA frame is used as the second reference frame for error concealment. If no such INTRA frame is available, algorithm B should be used. The first reference frame is the last reconstructed frame. If dt1 is the temporal distance between the first reference frame and the missing frame, dt2 is the corresponding distance to the second reference frame, (y1, u1, v1) is a pixel in the first reference frame and (y2, u2, v2) is the spatially corresponding pixel in the second reference frame, the concealed pixel (y, u, v) is then set using:

y = clip1((y1·dt2 + y2·dt1)/(dt2 + dt1)),

where u and v are calculated in a similar way, but taking into account their offset of 128:

u = clip1(((u1 - 128)·dt2 + (u2 - 128)·dt1)/(dt2 + dt1) + 128),

where the mathematical function clip1 is defined as follows:

clip1(c) = clip3(0, 255, c)
clip3(a, b, c) = a, if c < a
               = b, if c > b
               = c, otherwise.
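Algorithm A's bidirectional interpolation and the clip functions can be sketched per pixel as follows; integer division is our assumption, as the text does not specify the rounding rule, and the function names are ours.

```python
def clip3(a, b, c):
    """Clamp c to the interval [a, b]."""
    return a if c < a else b if c > b else c

def clip1(c):
    """Clamp c to the 8-bit sample range [0, 255]."""
    return clip3(0, 255, c)

def conceal_dissolve_pixel(p1, p2, dt1, dt2):
    """Bidirectionally interpolate one lost pixel during a dissolve.
    p1 = (y1, u1, v1) comes from the first reference frame (the last
    reconstructed frame); p2 = (y2, u2, v2) from a buffered INTRA frame of
    the second scene; dt1 and dt2 are the temporal distances of the lost
    frame to the first and second reference, respectively (the nearer
    reference therefore gets the larger weight)."""
    y1, u1, v1 = p1
    y2, u2, v2 = p2
    y = clip1((y1 * dt2 + y2 * dt1) // (dt2 + dt1))
    u = clip1(((u1 - 128) * dt2 + (u2 - 128) * dt1) // (dt2 + dt1) + 128)
    v = clip1(((v1 - 128) * dt2 + (v2 - 128) * dt1) // (dt2 + dt1) + 128)
    return y, u, v
```
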

Algorithm B:

Conventional spatial-temporal error concealment is used.

Error concealment for wipes

The decoder must detect:

1. the shape of the boundary between the two scenes involved in the wipe; and

2. the speed, which determines how quickly the ending scene is covered by the beginning scene.

Detection can be performed, for example, by comparing reconstructed images and calculating the correlation block by block. If two spatially corresponding blocks of temporally successive images are correlated, they belong to the same scene; otherwise they are from different scenes.

Based on the estimated shape and speed, the decoder can predict the location and shape of the boundary in the missing image or region. Missing areas that belonged to the ending scene in the previous image and are estimated to belong to the ending scene in the lost image/region may be concealed by copying the corresponding area of the previous image. Similarly, missing areas that belonged to the beginning scene in the previous image and are estimated to belong to the beginning scene in the lost image/region may be concealed by copying the corresponding area of the previous image. Missing areas that belonged to the ending scene in the previous image but are estimated to belong to the beginning scene in the lost image/region should be concealed from the adjacent spatial content of the beginning scene. When concealing the missing areas, boundary matching against correctly reconstructed neighbouring blocks may be used, as is often done in error concealment.
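The block-by-block correlation test for deciding scene membership can be sketched with a normalised cross-correlation. The correlation measure and the 0.5 threshold are our assumptions, since the text only says "calculating the correlation block by block"; the function name is ours.

```python
import numpy as np

def same_scene(block_prev, block_curr, threshold=0.5):
    """Decide whether two spatially corresponding blocks of temporally
    successive reconstructed frames belong to the same scene, using the
    normalised cross-correlation of their (mean-removed) pixel values."""
    a = block_prev.astype(np.float64).ravel()
    b = block_curr.astype(np.float64).ravel()
    a -= a.mean()
    b -= b.mean()
    denom = np.sqrt((a ** 2).sum() * (b ** 2).sum())
    if denom == 0:
        return True  # two flat blocks: treat as the same scene
    return bool((a @ b) / denom >= threshold)
```
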

Error concealment for other transition types

A simple spatial-temporal error concealment method should be used.

Although this invention has been described with reference to a preferred embodiment, those skilled in the art should understand that the foregoing and various other changes, omissions of parts, and changes in the form and details of this embodiment can be made without departing from the scope of the invention.

1. A method of concealing errors in a frame of a video sequence, the sequence containing at least a first scene and a second scene with a scene transition from the first scene to the second scene, the scene transition containing a plurality of frames and being one of a plurality of scene transition types, the method comprising the steps of: extracting from the encoded data stream information identifying the type of the scene transition; and applying an error concealment procedure to conceal errors in a frame belonging to the scene transition, based on the identified scene transition type.

2. The method according to claim 1, wherein the identified scene transition type is a scene cut.

3. The method according to claim 2, wherein, if a whole image belonging to the scene cut is lost, the lost image is not concealed.

4. The method according to claim 2, wherein, if a part of an image belonging to the scene cut is lost or damaged, a spatial error concealment algorithm is applied to conceal said lost or damaged part of the image.

5. The method according to claim 1, wherein the identified scene transition type is a gradual scene transition.

6. The method according to claim 5, wherein the scene transition is a gradual appearance or disappearance of the image.

7. The method according to claim 5, wherein the scene transition is a dissolve.

8. The method according to claim 5, wherein the scene transition is a wipe.

9. The method according to claim 5, wherein, if a whole image belonging to the gradual transition is lost or damaged, a spatial-temporal error concealment algorithm is applied to conceal the lost or damaged image.

10. The method according to claim 5, wherein, if a part of an image belonging to the gradual transition is lost or damaged, a spatial-temporal error concealment algorithm is applied to conceal said lost or damaged part of the image.

11. The method according to claim 1, wherein information indicating the identified scene transition is supplied to the decoder in a supplemental enhancement information message, to provide the decoder with the capability to conceal the error based on said information.

12. The method according to claim 11, wherein said information indicating the identified scene transition comprises an indication of the scene transition.

13. The method according to claim 11, wherein said information indicating the identified scene transition is provided for each frame belonging to the scene transition.

14. A video signal encoding device for encoding a video sequence into an encoded video data stream, the sequence containing at least a first scene and a second scene with a scene transition from the first scene to the second scene, the scene transition containing a plurality of frames and being one of a plurality of scene transition types, said video signal encoding device containing: an identification module for identifying frames associated with the scene transition; and a multiplexing module for providing information indicating the type of the scene transition in the encoded data stream.

15. The video signal encoding device according to claim 14, wherein said information is provided in a supplemental enhancement information message.

16. The video signal encoding device according to claim 15, wherein said information is provided for each frame belonging to the scene transition.

17. A video signal decoding device for decoding a video sequence from an encoded video data stream, the sequence containing a first scene and a second scene with a scene transition from the first scene to the second scene, said video signal decoding device containing: a demultiplexing module for receiving the encoded video data stream and extracting from it the information identifying the type of the scene transition; and an error concealment module for concealing errors in a frame belonging to the scene transition, based on the scene transition type.

18. The video signal decoding device according to claim 17, wherein the scene transition type is extracted from supplemental enhancement information in the encoded data stream.

19. The video signal decoding device according to claim 17, wherein the scene transition type is a gradual scene transition, an image belonging to this gradual scene transition is lost or damaged, and the error concealment module applies a spatial-temporal error concealment algorithm to conceal the lost or damaged image.

20. The video signal decoding device according to claim 17, wherein the scene transition type is a gradual scene transition, a part of an image belonging to the gradual scene transition is lost or damaged, and the error concealment module applies a spatial-temporal error concealment algorithm to conceal said lost or damaged part of the image.

21. The video signal decoding device according to claim 17, wherein the scene transition type is a scene cut, a part of an image belonging to the scene cut is lost or damaged, and the error concealment module applies a spatial error concealment algorithm to conceal the errors in the image.

22. The video signal decoding device according to claim 17, wherein the scene transition type is a scene cut, an image belonging to the scene cut is lost or damaged, and the error concealment module is configured to ignore the lost or damaged image.



 

Same patents:

FIELD: digital processing of images, possible use for transmitting images through low speed communication channels.

SUBSTANCE: in accordance to the invention, the image is divided onto rank blocks, for each rank block of original image a domain or a block is found in the code book and a corresponding transformation, which best covers the given rank block, if no sufficiently precise match is found, then rank blocks are divided onto blocks of smaller size, continuing the process, until acceptable match is achieved, or the size of rank blocks reaches certain predetermined limit, while after the division of the image onto rank blocks, classification of the blocks is performed, in accordance to which each domain is related to one of three classes, also except classification of domain blocks of original image, code book blocks classification is also performed, and further domain-rank matching is only performed for those domains, which belong to similarity class of given rank area. As a result, during the encoding, the search for area, which is similar to a rank block, is performed not only among the domains which are blocks of the image being encoded, but also among the code book blocks which match the rank area class.

EFFECT: increased speed of encoding with preserved speed of transmission and frame format length.

3 dwg

FIELD: technology for processing digital images, namely, encoding and decoding of images.

SUBSTANCE: in the system and the method, serial conversion and encoding of digital images are performed by means of application of transformation with superposition (combination) of several resolutions, ensuring serial visualization and reduction of distortions of image block integrity and image contour when compared to many standard data compression systems. The system contains a converter of color space, block for transformation with superposition of several resolutions, quantizer, scanner and statistical encoder. Transformation by scanning with usage of several resolutions outputs transformation coefficients, for example, first transformation coefficients and second transformation coefficients. Representation with usage of several resolutions may be produced using second transformation coefficients with superposition of several resolutions. The transformer of color space transforms the input image to representation of color space of the input image. Then, the representation of color space of input image is used for transformation with superposition of several resolutions. The quantizer receives first transformation coefficients and/or second transformation coefficients and outputs quantized coefficients for use by scanner and/or statistical encoder. The scanner scans quantized coefficients for creating a one-dimensional vector, which is used by statistical encoder. The statistical encoder encodes quantized coefficients received from quantizer and/or scanner, which results in compression of data.

EFFECT: increased traffic capacity and increased precision of image reconstruction.

27 cl, 19 dwg

FIELD: image processing systems, in particular, methods and systems for encoding and decoding images.

SUBSTANCE: in accordance to the invention, input image is divided onto several image blocks (600), containing several image elements (610), further image blocks (600) are encoded to form encoded representations (700) of blocks, which contains color code word (710), intensity code word (720) and intensity representations series (730). Color code word (710) is a representation of colors of elements (610) of image block (600). Intensity code word (720) is a representation of a set of several intensity modifiers for modification of intensity of elements (610) in image block (600), and series (730) of representations includes representation of intensity for each element (610) in image block (600), where the series identifies one of intensity modifiers in a set of intensity modifiers. In process of decoding, code words (710, 720) of colors and intensity and intensity representation (730) are used to generate decoded representation of elements (610) in image block (600).

EFFECT: increased efficiency of processing, encoding/decoding of images for adaptation in mobile devices with low volume and productivity of memory.

9 cl, 21 dwg, 3 tbl

FIELD: method for encoding and decoding digital data transferred by prioritized pixel transmission method or stored in memory.

SUBSTANCE: in accordance to the invention, informational content being encoded and decoded consists of separate pixel groups, where each pixel group contains value of position, at least one pixel value and priority value assigned to it, where at least one key is used, with which value of position and/or pixel value/values of pixels of pixel group are selectively encoded or decoded. Depending on used keys and on parts of informational content which are encoded, for example, value of positions and/or values of pixel groups, many various requirements may be taken into consideration during encoding.

EFFECT: ensured scaling capacity of encoding and decoding of digital data.

8 cl, 5 dwg, 3 tbl

FIELD: systems for encoding and decoding video signals.

SUBSTANCE: method and system for statistical encoding are claimed, where parameters which represent the encoded signal are transformed to indexes of code words, so that decoder may restore the encoded signal from aforementioned indexes of code words. When the parameter space is limited in such a way that encoding becomes inefficient and code words are not positioned in ordered or continuous fashion in accordance with parameters, sorting is used to sort parameters into various groups with the goal of transformation of parameters from various groups into indexes of code words in different manner, so that assignment of code word indexes which correspond to parameters is performed in continuous and ordered fashion. Sorting may be based on absolute values of parameters relatively to selected value. In process of decoding, indexes of code words are also sorted into various groups on basis of code word index values relatively to selected value.

EFFECT: increased efficiency of compression, when encoding parameters are within limited range to ensure ordered transformation of code word indexes.

6 cl, 3 dwg

FIELD: technology for encoding and decoding of given three-dimensional objects, consisting of point texture data, voxel data or octet tree data.

SUBSTANCE: method for encoding data pertaining to three-dimensional objects includes following procedures as follows: forming of three-dimensional objects data, having tree-like structure, with marks assigned to nodes pointing out their types; encoding of data nodes of three-dimensional objects; and forming of three-dimensional objects data for objects, nodes of which are encoded into bit stream.

EFFECT: higher compression level for information about image with depth.

12 cl, 29 dwg

The invention relates to the representation of three-dimensional objects on the basis of images with depth

The invention relates to the representation of three-dimensional objects on the basis of images with depth

The invention relates to the representation of three-dimensional objects obtained using photos of real objects

The invention relates to photo - and video system technology

FIELD: technology for encoding and decoding of given three-dimensional objects, consisting of point texture data, voxel data or octet tree data.

SUBSTANCE: method for encoding data pertaining to three-dimensional objects includes following procedures as follows: forming of three-dimensional objects data, having tree-like structure, with marks assigned to nodes pointing out their types; encoding of data nodes of three-dimensional objects; and forming of three-dimensional objects data for objects, nodes of which are encoded into bit stream.

EFFECT: higher compression level for information about image with depth.

12 cl, 29 dwg

FIELD: systems for encoding and decoding video signals.

SUBSTANCE: method and system for statistical encoding are claimed, where parameters which represent the encoded signal are transformed to indexes of code words, so that decoder may restore the encoded signal from aforementioned indexes of code words. When the parameter space is limited in such a way that encoding becomes inefficient and code words are not positioned in ordered or continuous fashion in accordance with parameters, sorting is used to sort parameters into various groups with the goal of transformation of parameters from various groups into indexes of code words in different manner, so that assignment of code word indexes which correspond to parameters is performed in continuous and ordered fashion. Sorting may be based on absolute values of parameters relatively to selected value. In process of decoding, indexes of code words are also sorted into various groups on basis of code word index values relatively to selected value.

EFFECT: increased efficiency of compression, when encoding parameters are within limited range to ensure ordered transformation of code word indexes.

6 cl, 3 dwg

FIELD: method for encoding and decoding digital data transferred by prioritized pixel transmission method or stored in memory.

SUBSTANCE: in accordance to the invention, informational content being encoded and decoded consists of separate pixel groups, where each pixel group contains value of position, at least one pixel value and priority value assigned to it, where at least one key is used, with which value of position and/or pixel value/values of pixels of pixel group are selectively encoded or decoded. Depending on used keys and on parts of informational content which are encoded, for example, value of positions and/or values of pixel groups, many various requirements may be taken into consideration during encoding.

EFFECT: ensured scaling capacity of encoding and decoding of digital data.

8 cl, 5 dwg, 3 tbl

FIELD: image processing systems, in particular, methods and systems for encoding and decoding images.

SUBSTANCE: in accordance to the invention, input image is divided onto several image blocks (600), containing several image elements (610), further image blocks (600) are encoded to form encoded representations (700) of blocks, which contains color code word (710), intensity code word (720) and intensity representations series (730). Color code word (710) is a representation of colors of elements (610) of image block (600). Intensity code word (720) is a representation of a set of several intensity modifiers for modification of intensity of elements (610) in image block (600), and series (730) of representations includes representation of intensity for each element (610) in image block (600), where the series identifies one of intensity modifiers in a set of intensity modifiers. In process of decoding, code words (710, 720) of colors and intensity and intensity representation (730) are used to generate decoded representation of elements (610) in image block (600).

EFFECT: increased efficiency of processing, encoding/decoding of images for adaptation in mobile devices with low volume and productivity of memory.

9 cl, 21 dwg, 3 tbl

FIELD: technology for processing digital images, namely, encoding and decoding of images.

SUBSTANCE: in the system and the method, serial conversion and encoding of digital images are performed by means of application of transformation with superposition (combination) of several resolutions, ensuring serial visualization and reduction of distortions of image block integrity and image contour when compared to many standard data compression systems. The system contains a converter of color space, block for transformation with superposition of several resolutions, quantizer, scanner and statistical encoder. Transformation by scanning with usage of several resolutions outputs transformation coefficients, for example, first transformation coefficients and second transformation coefficients. Representation with usage of several resolutions may be produced using second transformation coefficients with superposition of several resolutions. The transformer of color space transforms the input image to representation of color space of the input image. Then, the representation of color space of input image is used for transformation with superposition of several resolutions. The quantizer receives first transformation coefficients and/or second transformation coefficients and outputs quantized coefficients for use by scanner and/or statistical encoder. The scanner scans quantized coefficients for creating a one-dimensional vector, which is used by statistical encoder. The statistical encoder encodes quantized coefficients received from quantizer and/or scanner, which results in compression of data.

EFFECT: increased traffic capacity and increased precision of image reconstruction.

27 cl, 19 dwg

FIELD: digital processing of images, possible use for transmitting images through low speed communication channels.

SUBSTANCE: in accordance to the invention, the image is divided onto rank blocks, for each rank block of original image a domain or a block is found in the code book and a corresponding transformation, which best covers the given rank block, if no sufficiently precise match is found, then rank blocks are divided onto blocks of smaller size, continuing the process, until acceptable match is achieved, or the size of rank blocks reaches certain predetermined limit, while after the division of the image onto rank blocks, classification of the blocks is performed, in accordance to which each domain is related to one of three classes, also except classification of domain blocks of original image, code book blocks classification is also performed, and further domain-rank matching is only performed for those domains, which belong to similarity class of given rank area. As a result, during the encoding, the search for area, which is similar to a rank block, is performed not only among the domains which are blocks of the image being encoded, but also among the code book blocks which match the rank area class.

EFFECT: increased speed of encoding with preserved speed of transmission and frame format length.

3 dwg

FIELD: video data encoding, in particular, masking of distortions introduced by errors.

SUBSTANCE: method and device are claimed which are meant for masking errors in video sequences. When a transition between scenes exists in a video sequence and an error is present in the image which is subject to transition between scenes, error masking procedure is used, which is based on type of transition between scenes, to conceal the error. Information about transition between scenes together with information about the type of transition between scenes is transferred to video decoder in the message of additional extension information, if the transition between the scenes represents a gradual transition between the scenes, algorithm of spatial-temporal masking of errors is used for masking the image, if the transition between the scenes represents a scene cut, and only a part of the image is lost or damaged, then spatial error masking is used to mask the lost or damaged part of the image, and if the whole image belonging to scene cut is lost or damaged, and the image begins a new scene, it is not masked.

EFFECT: creation of method, by means of which the appropriate form of error masking may be selected for frames which are subject to transitions between scenes in a video series.

3 cl, 3 dwg, 11 tbl

FIELD: electrical engineering.

SUBSTANCE: invention relates to communication, in particular, to reducing the message redundancy. The developed method allows transmitting additional data without increasing the volume of transmitted data with the transmission rate left intact. First, the initial image is separated into not overlapping range units to be classified. Here, note that every range unit is refereed to one of the three classes, and the said classification is applied to the domains and units from the code book as well. Additional data is entered into lower category of the domain or units indices, to the rest categories of indices of the domain of the initial image or units from the code book applied is the trial inversion procedure. Now, the domain indices and units from the code book are optimised to be transmitted, along with the data on indices of their orientation, over the communication channel. The receiving party isolates the additional data and restores the initial image.

EFFECT: transmission of additional data without increasing the total volume of transmitted data at the required transmission rate.

4 dwg, 1 tbl
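The embedding step can be illustrated with a much-simplified sketch: hiding payload bits in the least significant part (the "lower category") of a list of block indices. The classification and trial-inversion machinery of the actual method is omitted, and all names here are illustrative.

```python
# Simplified illustration of hiding payload bits in the low bit of block
# indices; the actual method's classification and trial-inversion steps
# are omitted. All names are illustrative.
from typing import List

def embed(indices: List[int], bits: List[int]) -> List[int]:
    """Overwrite the least significant bit of each index with a payload bit."""
    assert len(bits) <= len(indices)
    out = list(indices)
    for i, b in enumerate(bits):
        out[i] = (out[i] & ~1) | (b & 1)
    return out

def extract(indices: List[int], n_bits: int) -> List[int]:
    """Recover the payload from the low bits of the first n_bits indices."""
    return [idx & 1 for idx in indices[:n_bits]]
```

Because only the lowest bit of each index changes, the total number of transmitted index bits stays the same, which mirrors the abstract's claim of unchanged data volume.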

FIELD: information technology.

SUBSTANCE: the invention relates to a method and an electronic device for determining the applicability of an encoded file in an application that can use files of that type but has restrictions on the properties of such files, as well as to a computer-readable medium containing a computer program for performing the method. To carry out the method, the electronic device contains at least one block for correlating files associated with the application, which accepts (50) at least one property of the encoded file, correlates (52) the property with the application, creates (54) an indicator showing whether the file can be used by the application based on the correlation, and associates (56) the indicator with the encoded file so that a quick decision on use of the file by the application can subsequently be made.

EFFECT: quick selection of encoded files for use by an application without preliminary decoding of the file.

16 cl, 7 dwg
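The indicator idea above can be sketched as follows: the file's properties are compared against the application's restrictions once, and the verdict is attached to the file's metadata so that later decisions need no decoding. The property names and limits below are invented for illustration.

```python
# Sketch of the applicability indicator: correlate file properties with
# an application's limits once, then attach the verdict to the file.
# Property names and limit values are invented for illustration.
def make_indicator(file_props: dict, app_limits: dict) -> bool:
    """True if every restricted property is within the application's limits."""
    return all(file_props.get(k, 0) <= v for k, v in app_limits.items())

app_limits = {"width": 1920, "height": 1080, "bitrate_kbps": 8000}

file_meta = {"name": "clip.bin",
             "props": {"width": 1280, "height": 720, "bitrate_kbps": 4500}}
# Correlate once and attach the indicator to the file's metadata, so the
# application can decide on use without decoding the file again.
file_meta["usable"] = make_indicator(file_meta["props"], app_limits)
```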

FIELD: physics, computation equipment.

SUBSTANCE: the invention claims a method of image blur compensation involving: calculating the difference between the measured brightness of an image pixel and a brightness estimate obtained earlier from the preceding frame sequence; detecting movement by comparing the obtained difference with a threshold value; determining the direction of movement for each pixel; combining adjacent pixels having the same direction of movement into a single object; sharpening the contours of moving objects by summing the initial image B(k) and the image gradient ∇(B(k)); and generating the output image as the weighted sum k1·B(k) + k2·∇(B(k)), where k1 and k2 are weight factors. The image blur compensation device includes: an image sensor, a controller, a movement mode detection module, an object detection module, a correction module, first, second and third RAM devices, a counter, first and second comparators, first to seventh multiplexers, and first and second demultiplexers.

EFFECT: real-time blur compensation for images of moving objects.

2 cl, 11 dwg
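The output step, a weighted sum of the image and its gradient, can be illustrated on a one-dimensional brightness profile. This is a toy sketch: `numpy.gradient` stands in for the gradient operator, and the weight values are arbitrary examples rather than the patent's.

```python
# Toy illustration of the output step B_out = k1*B + k2*grad(B) on a 1-D
# brightness profile; numpy.gradient stands in for the gradient operator
# and the weights k1, k2 are arbitrary example values.
import numpy as np

def compensate(B: np.ndarray, k1: float = 1.0, k2: float = 0.5) -> np.ndarray:
    """Sharpen a brightness signal by adding a weighted gradient term."""
    return k1 * B + k2 * np.gradient(B)

profile = np.array([10.0, 10.0, 20.0, 30.0, 30.0])  # a blurred edge
out = compensate(profile)
```

Adding the gradient term steepens the ramp of the blurred edge, which is the contour-sharpening effect the abstract describes.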

