Encoder for three-dimensional video signals

FIELD: physics, video.

SUBSTANCE: invention relates to encoding three-dimensional video signals, and specifically to a transport format used to transport three-dimensional content. The technical result is achieved using a device which is characterised in that it includes a means of generating a stream which is structured into multiple levels: level 0, having two independent layers: a base layer containing video data of a right-side image, and a level 0 extension layer containing video data of a left-side image, or vice versa; level 1, having two independent extension layers: a level 1 first extension layer containing a depth map relating to the image of the base layer, a level 1 second extension layer containing a depth map relating to the image of the level 0 extension layer; level 2, having a level 2 extension layer containing occlusion data relating to the image of the base layer.

EFFECT: high quality of three-dimensional images when a large number of views is used.

6 cl, 2 dwg

 

SCOPE OF THE INVENTION

The invention relates to the coding of three-dimensional video signals, and more specifically to the transport format used for transmitting three-dimensional content.

The field is that of three-dimensional video, which includes cinema content used for theatrical projection, for distribution on DVD, or for broadcast by television channels. More precisely, it covers three-dimensional digital cinema, three-dimensional DVD and three-dimensional television.

PRIOR ART

Numerous systems exist today for displaying images in relief.

Three-dimensional digital cinema, known as a stereoscopic system, is based on the wearing of glasses, for example with polarizing filters, and uses a stereographic pair of views (left/right), or the equivalent of two "reels" for a film.

The three-dimensional screen for digital television in relief, known as an autostereoscopic system because it does not require wearing glasses, is based on the use of polarizing lenses or strips. These systems are designed so that the viewer receives a different image in the angular cone arriving at the right eye and at the left eye:

- The three-dimensional TV screen produced by Newsight contains a parallax barrier, a film of transparent and opaque regions corresponding to vertical slits that act as the optical center of a lens; the rays that are not deviated are those that pass through these slits. The system actually uses 8 views, 4 views to the right and 4 views to the left, and these views make it possible to create a motion parallax effect when the perspective changes or the viewer moves. This motion parallax effect gives the viewer a better sense of immersion in the scene than that produced by a simple autostereoscopic presentation, in other words a single view to the right and a single view to the left, creating a stereoscopic parallax. The Newsight three-dimensional television screen requires as its input format a multi-view stream of 8 views, which is still undergoing standardization. The MVC (Multiview Video Coding) extension of the MPEG-4 AVC/H.264 standard from the JVT of MPEG/ITU-T, relating to multi-view video coding, accordingly proposes encoding each of the views for transmission in the stream; there is no image synthesis at reception.

- The three-dimensional television screen produced by Philips contains lenses in front of the TV panel. The system uses 9 views: 4 views to the right, 4 views to the left and one central two-dimensional view. It uses the "2D+z" format, in other words a standard two-dimensional video stream carrying conventional two-dimensional video plus auxiliary data corresponding to a depth map z, standardized by MPEG-C part 3. The two-dimensional image is synthesized using the depth map to produce the right and left images for display. This format is compatible with the current standard for two-dimensional images, but it is not sufficient to ensure high-quality three-dimensional images, especially when the number of views is high. For example, the available data still does not allow occlusions to be processed correctly, which causes artifacts. One solution, called LDV (Layered Depth Video), consists in representing the scene by successive planes. In addition to the "2D+z" data, content data relating to these occlusions is then transmitted, namely occlusion layers composed of a color map that gives the values of the occluded pixels and a depth map for these occluded pixels. To transport this data, Philips uses the following format: the image, for example an HD (high-definition) image, is divided into four sub-images; the first sub-image is the central two-dimensional image, the second is the depth map, the third is the occlusion map of pixel values, and the last is the occlusion depth map.
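
The occlusion problem of the "2D+z" format can be illustrated with a minimal sketch. It assumes, purely for illustration, that the depth map has already been converted into an integer horizontal disparity per pixel and that a side view is obtained by shifting each pixel of one image row by that disparity; the function name and the sample row are hypothetical and do not come from the patent or from MPEG-C part 3.

```python
def synthesize_row(colors, disparities):
    """Forward-warp one image row to a side view.

    Each pixel moves left by its disparity (assumed here to be a
    non-negative integer number of pixels, larger for nearer objects).
    When two pixels land on the same position, the nearer one wins.
    Positions that receive no pixel stay None: these are the areas
    hidden in the central view, which 2D+z alone cannot fill.
    """
    out = [None] * len(colors)
    best = [-1] * len(colors)  # largest disparity written so far per position
    for x, (color, disparity) in enumerate(zip(colors, disparities)):
        target = x - disparity
        if 0 <= target < len(out) and disparity > best[target]:
            out[target] = color
            best[target] = disparity
    return out


# Hypothetical row: background "b" with a foreground object "F" in front of it.
row = ["b", "b", "F", "F", "b", "b"]
disp = [0, 0, 2, 2, 0, 0]
side_view = synthesize_row(row, disp)
holes = [i for i, value in enumerate(side_view) if value is None]
```

In this sketch the positions listed in `holes` are exactly those that an LDV occlusion layer (color and depth of the occluded pixels) would be used to fill.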

It is also worth noting that the existing solutions lead to a loss of spatial resolution because of the additional information that must be transmitted for three-dimensional display. For example, for a high-definition panel of 1080 lines by 1920 pixels, each of the 8 or 9 views suffers a loss of spatial resolution by a factor of 8 or 9, since the transmission bit rate and the number of pixels of the television remain constant.
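
The arithmetic behind this loss can be checked with a short sketch. It assumes that the panel pixels are shared equally between the views, which is a simplification made here for illustration; the function name is hypothetical.

```python
PANEL_WIDTH = 1920   # pixels per line
PANEL_HEIGHT = 1080  # lines


def pixels_per_view(num_views: int) -> int:
    """Pixels left for each view when the panel is shared equally between views."""
    return (PANEL_WIDTH * PANEL_HEIGHT) // num_views


for n in (1, 8, 9):
    print(f"{n} view(s): {pixels_per_view(n)} pixels per view")
```

With a fixed panel of 2,073,600 pixels, 8 views leave 259,200 pixels per view and 9 views leave 230,400.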

Research on displaying images in relief on screens is currently directed towards:

- multi-view autostereoscopic systems, in other words the use of more than 2 views without wearing special glasses. This includes, for example, the previously mentioned LDV format or the MVD (Multiview Video + Depth) format, which uses depth maps,

- stereoscopic systems, in other words the use of 2 views and the wearing of special glasses. The content, in other words the data used, may be stereoscopic data relating to two images, left and right, or data corresponding to the LDV format, or data relating to the MVD format. Examples include the 3D DLP (Digital Light Processing) Rear Projection HDTV from Samsung, the 3D Plasma HDTV from the same manufacturer, the 3D LCD system from Sharp, etc.

In addition, note that content for three-dimensional digital cinema can be distributed on DVD media; the systems currently under study are called, for example, Sensio or DDD.

The formats of the elementary streams used for exchanging three-dimensional content are not harmonized. Proprietary solutions coexist. Only one format is standardized, a transport encapsulation format (MPEG-C part 3), but it applies only to the manner of encapsulation in an MPEG-2 TS transport stream and therefore does not define a new format for the elementary stream.

This diversity of elementary video stream formats for three-dimensional video content, this lack of convergence, does not facilitate conversion from one system to another, for example from digital cinema to DVD distribution and television broadcast.

One of the purposes of the invention is to overcome the aforementioned disadvantages.

SUMMARY OF THE INVENTION

The subject of the invention is an encoder designed to use the data output by different tools for creating three-dimensional images, namely data relating to a right image and a left image, data relating to depth maps associated with the right image and/or the left image, and/or data relating to occlusion layers, the device being characterized in that it contains means for forming a stream structured on several levels:

- level 0, which contains two independent layers: a base layer containing the video data of the right image, and a level 0 extension layer containing the video data of the left image, or vice versa,

- level 1, which contains two independent extension layers: a first level 1 extension layer containing a depth map relating to the image of the base layer, and a second level 1 extension layer containing a depth map relating to the image of the level 0 extension layer,

- level 2, which contains a level 2 extension layer containing occlusion data relating to the image of the base layer.
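
The three levels listed above can be pictured as a small data structure. This is only a sketch: the patent defines the levels and what each layer contains, but the class, field and layer names below are hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Layer:
    name: str                        # hypothetical identifier
    level: int                       # 0, 1 or 2
    content: str                     # what the layer carries
    refers_to: Optional[str] = None  # layer whose image this data describes


STREAM_LAYERS = (
    Layer("base", 0, "video of one view (right or left)"),
    Layer("ext0", 0, "video of the other view"),
    Layer("ext1_first", 1, "depth map", refers_to="base"),
    Layer("ext1_second", 1, "depth map", refers_to="ext0"),
    Layer("ext2", 2, "occlusion data", refers_to="base"),
)


def layers_at(level: int) -> list:
    """Return the layers that belong to the given level."""
    return [layer for layer in STREAM_LAYERS if layer.level == level]
```

Levels 0 and 1 each hold two layers and level 2 holds one; every level 1 or level 2 layer points back to a level 0 image.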

In accordance with a particular embodiment, the data relating to level 0, level 1 or level 2 comes from means for generating images by three-dimensional synthesis and/or from means for creating three-dimensional data from:

- two-dimensional data from two-dimensional cameras and/or two-dimensional video content, and/or

- data from stereoscopic cameras and/or multi-view cameras.

In accordance with a specific embodiment, the means for creating three-dimensional data uses, to calculate the data relating to level 1, dedicated means for acquiring depth information and/or means for computing depth maps from data from stereoscopic cameras and/or multi-view cameras.

In accordance with a specific embodiment, the means for creating three-dimensional data uses, to calculate the data relating to level 2, means for computing occlusion maps from data coming from means for acquiring depth information, from stereoscopic cameras and/or from multi-view cameras.

The subject of the invention is also a device for decoding three-dimensional data from a stream, for display on a screen, the stream being structured on several levels:

- level 0, which contains two independent layers: a base layer containing the video data of the right image, and a level 0 extension layer containing the video data of the left image, or vice versa,

- level 1, which contains two independent extension layers: a first level 1 extension layer containing a depth map relating to the image of the base layer, and a second level 1 extension layer containing a depth map relating to the image of the level 0 extension layer,

- level 2, which contains a level 2 extension layer containing occlusion data relating to the image of the base layer,

for display on a display device, characterized in that it contains a three-dimensional display adaptation circuit that uses the data of one or more of the received layers of the data stream to make them compatible with the display device.

In accordance with a specific embodiment, the three-dimensional display adaptation circuit uses:

- the level 0 layers when the display is on a three-dimensional cinema screen, a stereoscopic screen with 2 views requiring the use of glasses, or an autostereoscopic screen with 2 views,

- the base layer and the first level 1 extension layer when the display is on a screen of the Philips "2D+z" type,

- all layers of level 0 and level 1 when the display is on an autostereoscopic three-dimensional TV of the MVD type,

- the base layer, the first level 1 extension layer and the level 2 layer when the display is on a screen of the LDV type.

The subject of the invention is also a video data transport stream, characterized in that the stream syntax distinguishes between data layers in accordance with the following structure:

- a level 0 layer consisting of two independent layers: a base layer containing the video data of the right image, and an extension layer containing the video data of the left image, or vice versa,

- a level 1 extension layer itself consisting of two independent extension layers: a first level 1 extension layer containing a depth map relating to the image of the base layer, and a second level 1 extension layer containing a depth map relating to the image of the level 0 extension layer,

- a level 2 extension layer containing occlusion data relating to the image of the base layer.

A single "multi-level" format is used to distribute different three-dimensional content on different media and for different display systems, such as content for three-dimensional digital cinema, three-dimensional DVD and three-dimensional television.

Thus, it is possible to retrieve three-dimensional content coming from the different existing production modes and to address the whole range of autostereoscopic display devices from a single transmission format.

By defining a format for the video and by structuring the data in the stream so as to enable the extraction and selection of the appropriate data, compatibility of one three-dimensional system with another is ensured.

BRIEF DESCRIPTION OF THE DRAWINGS

Other characteristic features and advantages will appear from the following description, provided as a non-limiting example and referring to the attached drawings, in which:

- figure 1 shows a system for the creation and distribution of three-dimensional content,

- figure 2 shows the organization of coding layers in accordance with the invention.

DETAILED DESCRIPTION OF EMBODIMENTS OF THE INVENTION

It appears that multi-view autostereoscopic screens, such as the Newsight screen, give the best results in terms of quality when they are supplied with N views in which the extreme views correspond to a pair of stereoscopic views and the intermediate views are interpolated, rather than when they are supplied from a shoot with multiple cameras. This is due to the constraints that must be observed concerning the focal axes of the cameras, their aperture, their arrangement (distance between cameras, direction with respect to the optical axes, etc.), and the size of and distance to the subject. For real scenes, indoors or outdoors, and "realistic" cameras, in other words cameras with reasonable focal lengths and apertures that do not give an impression of distortion of the scene on the display, a camera system is typically used whose optical axes must be spaced about 1 cm apart. The average distance between human eyes is 6.25 cm.

Therefore, it appears advantageous to convert the data relating to multiple cameras into data relating to left and right stereoscopic views corresponding to the distance between the eyes. This data is processed to provide stereoscopic views with depth maps and, where possible, occlusion masks. It therefore becomes unnecessary to transmit multi-view data, in other words data relating to a number of two-dimensional images corresponding to the number of cameras used.

For data relating to stereoscopic cameras, the left and right images can be processed to provide, in addition to the images, depth maps and possibly occlusion masks, which make it possible for autostereoscopic display devices to use the data after processing.

As for the depth information, it can be estimated by suitable means, such as laser or infrared, or be calculated by measuring the motion disparity between the right image and the left image, or be obtained manually by estimating the depth of regions.

Data from a single two-dimensional camera can be processed to provide two images, two views, allowing relief. A three-dimensional model can be created from this single two-dimensional video with human intervention consisting, for example, in reconstructing scenes by using successive views, in order to provide stereoscopic images.

It appears that the N views used for multi-view display systems and coming from N cameras can in fact be calculated from stereoscopic content by interpolation. Stereoscopic content can therefore serve as the basis for the transmission of television signals, the data relating to the stereoscopic pair making it possible to obtain the N views for the three-dimensional display device by interpolation and, possibly, extrapolation.

Taking these observations into account, it can be concluded that the different types of data needed to display three-dimensional video content, depending on the type of display device, are as follows:

- a single view and a depth map, possibly with occlusion masks, for an autostereoscopic display device of the Philips 9-view type,

- a stereographic pair for:

  - a sequential, metameric or polarized three-dimensional digital cinema display,

  - a stereoscopic display device with only two views, used with shutter or polarized glasses,

  - an autostereoscopic display device with only two views, with servo-control based on head position or gaze direction techniques, known as head tracking and eye tracking of the user,

- a stereographic pair with two depth maps to facilitate the interpolation of intermediate views if the two transmitted views are degraded by compression, for an autostereoscopic display device of the Newsight 8-view type,

- a stereographic pair with different depth maps and occlusion layers for display devices complying with the upcoming FTV (free viewpoint TV) standard, in other words compatible with MVD and LDV.

Figure 1 schematically shows a system for the creation and distribution of three-dimensional content.

Existing conventional two-dimensional content, coming for example from transmission or storage means, reference 1, and video data from a standard two-dimensional camera, reference 2, are transmitted to the creation means, reference 3, which converts them into three-dimensional video.

Video data from the stereoscopic cameras 4 and from the multi-view cameras 5, and data from the distance measurement means 6, are transmitted to the three-dimensional creation circuit 7. This circuit contains a depth map calculation circuit 8 and an occlusion mask calculation circuit 9.

Video data coming from the synthetic image generation circuit 10 is transmitted to the compression and transport circuit 11. The information from the three-dimensional creation circuits 3 and 7 is also transmitted to circuit 11.

The compression and transport circuit 11 compresses the data using, for example, the MPEG-4 compression method. The signals are adapted for transport, and the transport stream syntax distinguishes between the layers that structure the video data potentially available at the input of the compression circuit, as described later. The data from circuit 11 can be transmitted to the receiving circuits in different ways:

- via physical media in the form of a three-dimensional DVD or other digital media,

- via physical media stored on reels for cinema (roll-out),

- by broadcast, cable, satellite, etc.

The signals are thus transmitted by the compression and transport circuit in accordance with the transport stream structure described later, and the signals are placed on DVD or on reels in accordance with this transport stream structure. The signals are received by the circuit for adaptation to three-dimensional display devices, reference 12. From the different layers in the transport stream or program stream, this block computes the data needed by the display device to which it is connected. The display devices are of the following types: a screen for stereoscopic projection 13, a stereoscopic display 14, an autostereoscopic or multi-view autostereoscopic display 15, an autostereoscopic display 16 with servo-control, or another type.

Figure 2 schematically shows the stacking of the different layers for transporting data.

In the vertical direction, the layers of level zero, level one and level two are defined. In the horizontal direction, for a given level, a first layer and, where applicable, a second layer are defined.

The video data of the first image of the stereoscopic pair, for example the left view of a stereoscopic image, is assigned to the base layer, the first layer of level 0 in accordance with the designation proposed above. This base layer is the one used by a standard television; conventional video data, such as two-dimensional data relating to the image displayed on a standard television, is also assigned to this base layer. Compatibility with existing products is thus maintained, a compatibility that does not exist in the Multiview Video Coding (MVC) standardization.

The video data of the second image of the stereoscopic pair, for example the right view, is assigned to the second layer of level 0, which is called the stereographic layer. It is an extension layer of the first layer of level 0.

The video data for the depth maps is assigned to the level one extension layers: the first layer of level one, called the left depth layer, for the left view, and the second layer of level one, called the right depth layer, for the right view.

The video data relating to the occlusion masks is assigned to an extension layer of level two; the first layer of level two is called the occlusion layer.

The multi-level format for the elementary video stream therefore consists of:

- a base layer containing the standard video, the left view of the stereographic pair,

- a stereographic extension layer containing the right view of the stereographic pair,

- two depth extension layers, with depth maps corresponding to the left and right views of the stereographic pair,

- an occlusion extension layer, with N occlusion masks.

Thanks to this organization of the data on different layers, it is possible to address content for stereoscopic devices for three-dimensional digital cinema and for autostereoscopic devices of the multi-view type or of the type using depth maps and occlusion maps. The multi-level format makes it possible to address at least 5 different types of display device. The configuration used for each of these types of display device is indicated in figure 2, where the layers used for each configuration are grouped together.

The base layer alone, reference 17, addresses conventional display devices.

The base layer combined with the stereographic layer, grouping reference 18, makes possible a three-dimensional cinema type display, as well as DVD display on stereoscopic screens with glasses or on autostereoscopic screens with only two views and tracking of the user's head.

The base layer associated with the "left" depth layer, grouping 19, makes it possible to address a display device of the Philips 2D+z type.

The base layer associated with the "left" depth layer and the occlusion layer, in other words the first layer of level 0 and the first layers of extension levels one and two, grouping 20, makes it possible to address a display device of the LDV (Layered Depth Video) type.

The base layer associated with the stereographic layer and with the left and right depth layers, in other words the layers of level 0 and level one, grouping 21, relates to an autostereoscopic three-dimensional television display device of the MVD (Multiview Video + Depth) type.
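
The five groupings 17 to 21 described above amount to a lookup from display type to a set of layers. The sketch below assumes short hypothetical layer names and display-type keys; only the reference numbers and the layer combinations are taken from the description.

```python
# Layer sets for the groupings of figure 2 (reference numbers 17 to 21).
GROUPINGS = {
    17: frozenset({"base"}),
    18: frozenset({"base", "stereo"}),
    19: frozenset({"base", "depth_left"}),
    20: frozenset({"base", "depth_left", "occlusion"}),
    21: frozenset({"base", "stereo", "depth_left", "depth_right"}),
}

# Hypothetical keys naming the display types mentioned in the text.
DISPLAY_TO_GROUPING = {
    "conventional_2d": 17,
    "stereo_3d": 18,          # 3D cinema, glasses, or two-view autostereoscopic
    "philips_2d_plus_z": 19,
    "ldv": 20,
    "mvd": 21,
}


def layers_for(display_type: str) -> frozenset:
    """Return the set of layers a display of the given type needs."""
    return GROUPINGS[DISPLAY_TO_GROUPING[display_type]]
```

For example, `layers_for("ldv")` yields the base layer, the left depth layer and the occlusion layer, while every grouping contains the base layer, which is what keeps the stream usable by a conventional display.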

Such structuring of the transport stream enables a convergence of formats, for example of the Philips 2D+z and 2D+z+occlusion (LDV) type formats with the cinema-type stereoscopic format and with formats of the LDV or MVD type.

Returning to figure 1, the three-dimensional display adaptation circuit 12 performs the selection of layers: it selects the base layer and the stereographic extension layer, in other words the layers of level 0, if the display is a stereoscopic projection 13 or uses the three-dimensional display device 16 with servo-control; it selects the base layer, the left depth layer and the occlusion layer, in other words the first layer of level 0, of level one and of level two, for the display device 14 of the LDV type; and it selects the layers of level 0 and level one for the display device 15 of the multi-view type (MVD). In this last case, for example, the adaptation circuit computes 8 views from the 2 stereoscopic views and the depth maps for transmission to the display device 15 of the multi-view type (MVD).

Therefore, conventional two-dimensional or three-dimensional video signals, whether they come from recording media, broadcast or cable, can be displayed on any two-dimensional or three-dimensional system. The decoder, which for example contains the adaptation circuit, selects and uses the layers in accordance with the three-dimensional display system to which it is connected.

This structure also makes it possible to transmit to the receiver, for example by cable, only the layers that are needed by the three-dimensional display system in use.

The invention is described in the preceding text by way of example. It is understood that those skilled in the art can create variants of the invention without departing from its scope.
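
Transmitting only the needed layers can be sketched as a filter over the stream. The patent states that the stream syntax distinguishes the layers but does not give that syntax, so the `Packet` type, its `layer` field and the layer names below are assumptions made for illustration only.

```python
from typing import Iterable, Iterator, NamedTuple


class Packet(NamedTuple):
    layer: str      # hypothetical layer identifier carried by the stream syntax
    payload: bytes  # compressed data of that layer


def keep_layers(packets: Iterable[Packet], wanted: frozenset) -> Iterator[Packet]:
    """Yield only the packets whose layer is needed by the connected display."""
    for packet in packets:
        if packet.layer in wanted:
            yield packet


# Hypothetical stream carrying one packet of each layer.
stream = [
    Packet("base", b"\x01"),
    Packet("stereo", b"\x02"),
    Packet("depth_left", b"\x03"),
    Packet("depth_right", b"\x04"),
    Packet("occlusion", b"\x05"),
]

# A receiver feeding a 2D+z type display only needs the base layer and its depth map.
sent = list(keep_layers(stream, frozenset({"base", "depth_left"})))
```

Because the filter only inspects the layer identifier, it can be applied without decoding the payloads, which is what makes dropping unused layers before transmission inexpensive in this sketch.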

1. An encoder designed to use data from different tools for creating 3D images, namely data relating to a right image and a left image, data relating to depth maps associated with the right image and/or the left image, and/or data relating to occlusion layers, characterized in that it contains means for forming a stream structured on several levels:
- level 0, which contains two layers: a base layer containing the video data of the right image, and a level 0 extension layer containing the video data of the left image, or vice versa,
- level 1, which contains two extension layers: a first level 1 extension layer containing a depth map relating to the image of the base layer, and a second level 1 extension layer containing a depth map relating to the image of the level 0 extension layer,
- level 2, which contains a level 2 extension layer containing occlusion data relating to the image of the base layer.

2. The device according to claim 1, characterized in that the data relating to level 0, level 1 or level 2 comes from means (10) for generating images by 3D synthesis and/or from means (3, 7) for creating 3D data from:
- 2D data from 2D cameras and/or 2D video content (1), and/or
- data from stereoscopic cameras and/or multi-view cameras (4, 5).

3. The device according to claim 1, characterized in that the means for creating 3D data uses, to calculate the data relating to level 1, dedicated means (6) for acquiring depth information and/or means (8) for calculating depth maps from data from stereoscopic cameras and/or multi-view cameras (4, 5).

4. The device according to claim 1, characterized in that the means for creating 3D data uses, to calculate the data relating to level 2, means for calculating occlusion maps from data coming from means for acquiring depth information, from stereoscopic cameras and/or from multi-view cameras.

5. A device for decoding 3D data from a stream, for display on a screen, the stream being structured on several levels:
- level 0, which contains two layers: a base layer containing the video data of the right image, and a level 0 extension layer containing the video data of the left image, or vice versa,
- level 1, which contains two extension layers: a first level 1 extension layer containing a depth map relating to the image of the base layer, and a second level 1 extension layer containing a depth map relating to the image of the level 0 extension layer,
- level 2, which contains a level 2 extension layer containing occlusion data relating to the image of the base layer, for display on a display device,
characterized in that it contains a 3D display adaptation circuit that uses data from one or more of the received layers of the data stream to make them compatible with the display device.

6. The device according to claim 5, characterized in that the 3D display adaptation circuit uses:
- the level 0 layers (18) when the display is on a 3D cinema screen, a stereoscopic screen with 2 views requiring the use of glasses, or an autostereoscopic screen with 2 views,
- the base layer and the first level 1 extension layer (19) when the display is on a screen of the Philips "2D+z" type,
- all layers (21) of level 0 and level 1 when the display is on an autostereoscopic 3D TV of the MVD type,
- the base layer, the first level 1 extension layer and the level 2 layer (20) when the display is on a screen of the LDV type.



 

Similar patents:

FIELD: physics, computer engineering.

SUBSTANCE: invention relates to computer engineering. The image processing device includes an extraction means which performs motion compensation using as a reference frame a frame formed from a decoded image, and using a motion vector in an image that was encoded, and for extracting a motion compensation image corresponding to a predicted image from the reference frame; a means of generating an image with intra-frame prediction which performs intra-frame prediction for the current frame for which the predicted image is to be generated, and which generates an image with intra-frame prediction, which corresponds to the predicted image from a portion of the decoded image; and a means of generating a predicted image, which generates a predicted image by performing filter processing to compensate for high-frequency component shortcomings in the motion compensation image extracted by the extraction means, and an image with intra-frame prediction generated by generating an image with intra-frame prediction using correlation in a temporal direction which is included in the motion compensation image and the image with intra-frame prediction.

EFFECT: providing high encoding efficiency without increasing the volume of transmission of motion vectors in a stream.

13 cl, 32 dwg

FIELD: physics, video.

SUBSTANCE: invention relates to video encoding technology. Disclosed is a method of controlling video encoding, which encodes an input video signal by controlling the generated bit rate to prevent failure of a hypothetical buffer in a decoder. The method includes a step of successively encoding each image in a group of images in an encoding queue in accordance with a predefined encoding parameter. The group of images in the encoding queue includes a predefined number of images and is a set of successive images in the encoding queue. Further, the method includes calculating quantisation statistic of each image based on information about the quantisation parameter used to encode each image every time each image is encoded, and checking if the quantisation statistic exceeds a predefined threshold.

EFFECT: high efficiency of encoding images.

14 cl, 26 dwg

FIELD: physics, computer engineering.

SUBSTANCE: invention relates to computer engineering. A multi-view video encoding method comprising generating a projection-synthesised image on which a projection-synthesised image is synthesised from an already encoded base projection frame, which corresponds to a target encoding frame in a target encoding image; searching for a base portion on the already encoded base frame in the target encoding image which corresponds to the projection-synthesised image, for each processing element portion, having a predefined size; calculating a correction parameter for correcting mismatch between cameras based on the projection-synthesised image for the processing element portion and the base frame for the base portion; correcting the projection-synthesised image for the processing element portion using the calculated correction parameter; and performing predictive video encoding in the target encoding projection using the corrected projection-synthesised image.

EFFECT: high efficiency of encoding/decoding multi-view video without further encoding/decoding correction parameters.

23 cl, 7 dwg

FIELD: radio engineering, communication.

SUBSTANCE: invention relates to video conferencing. A video conference server may receive video streams from each client in a video conference and may receive subscription requests from each client. The subscription requests may include requests to view video streams from specific other clients with a given resolution and/or frame rate. The video conference server may match up the received video streams with the subscription requests in order to send the subscribing clients their desired video streams. The server may also be able to request different versions of video streams from participants (e.g. different resolutions) and/or alter the video streams in order to better comply with the subscription request.

EFFECT: providing video conferencing subscriptions using multiple bit rate streams.

20 cl, 6 dwg

FIELD: physics, video.

SUBSTANCE: invention relates to video encoding and decoding. The video decoding method includes receiving and parsing a bit stream of an encoded video and decoding encoded image data for a maximum size of a coding unit based on information on coding depth and coding mode for the maximum coding unit based on a raster scanning order for the maximum coding unit and a zigzag scanning order for coding units of the maximum depth coding unit.

EFFECT: high efficiency of encoding or decoding high-resolution or high-quality video content.

15 cl, 29 dwg, 1 tbl

FIELD: physics, computer engineering.

SUBSTANCE: invention relates to image encoding. The image predictive coding apparatus includes: a partitioning module for dividing an input image into a plurality of units; a prediction signal generating module for generating a prediction signal relative to a pixel signal which is included in the unit under consideration, which is to be processed, from a plurality of units; a residual signal generating module for generating a residual signal between the pixel signal of the unit under consideration and the generated prediction signal; a signal encoding module for generating a compressed signal by encoding the residual signal; and a storage module for decompressing the compressed signal and storing the decompressed signal as a reconstructed pixel signal.

EFFECT: high encoding efficiency by increasing pixel prediction accuracy.

16 cl, 16 dwg

FIELD: physics, computer engineering.

SUBSTANCE: invention relates to computer engineering. A method for network-wide storage and distribution of data for internet protocol television (IPTV) comprising adding, by a live broadcast media code stream sending server of a content delivery network, an identifier of a program to which a media code stream data packet belongs and a storage identifier of the media code stream data packet into the media code stream data packet, where the identifier of the program to which the media code stream data packet belongs is a program label and the storage identifier of the media code stream data is a storage offset label; transmitting, by the live broadcast media code stream sending server, the media code stream data packet to a recording node; and storing, by the recording node, the media code stream data as a recording file according to the identifier of the program and the storage identifier; distributing and requesting, by the client terminal, the media code stream data from the edge node or the recording node; and transmitting the media code stream data to the client terminal.

EFFECT: providing seamless distribution of media code stream data.

8 cl, 5 dwg

FIELD: physics, video.

SUBSTANCE: invention relates to multiview image encoding/decoding. The technical result is increasing efficiency of encoding multiview images in which there is localised mismatch of illumination and colour between cameras, as well as reducing the amount of code. The invention provides, when dividing an encoded and decoded frame and encoding/decoding each region, generation of a prediction image not only for the processed region, but also for already encoded/decoded regions adjacent to the processed region. Prediction images are generated using the same prediction method. A correction parameter is then estimated for correcting mismatch of illumination and colour from the predicted images and the decoded images of adjacent regions. The estimated correction parameter can be found even at the decoding side. Encoding of the correction parameter is therefore unnecessary.

EFFECT: corrected prediction image is generated using estimated correction parameters in order to correct a prediction image that was generated for a processed region.

16 cl, 8 dwg
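
The idea in the abstract above, that the correction parameter can be derived identically at the encoder and the decoder from already decoded neighbouring regions, can be illustrated with a minimal sketch. The pure offset model and the function names are assumptions; the abstract does not state which correction model is used.

```python
def estimate_offset(pred_neighbours, decoded_neighbours):
    """Mean difference between decoded and predicted samples of adjacent regions.

    Both inputs are available at the decoder, so the offset need not be transmitted.
    """
    n = len(pred_neighbours)
    return sum(d - p for p, d in zip(pred_neighbours, decoded_neighbours)) / n


def correct_prediction(pred_block, offset, max_value=255):
    """Apply the estimated offset to the prediction of the processed region."""
    return [min(max_value, max(0, round(p + offset))) for p in pred_block]
```

For example, if the prediction of the adjacent regions is on average 10 levels darker than their decoded samples, the prediction for the processed region is brightened by 10 levels and clipped to the valid sample range.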

FIELD: physics, computer engineering.

SUBSTANCE: invention relates to a recording device which stores a basic image stream and an extended image stream, obtained by encoding multiview video. The technical result is that data on the medium using the disclosed record encoding may be reproduced in a device which is incompatible with reproduction of multiview video. In an Access Unit containing basic video display, MVC header encoding is prohibited. For a display component contained in an Access Unit without an MVC header, determination is carried out such that the "view_id" parameter thereof is recognised as 0.

EFFECT: present invention can be applied to a reproducing device which is compatible with the BD-ROM standard.

7 cl, 48 dwg

FIELD: physics, computer engineering.

SUBSTANCE: invention relates to image processing means. The device has a first and a second extraction means, where the first extraction means is designed for motion compensation and extraction of a motion compensation image corresponding to a predicted image, and the second extraction means is designed for extracting a part which coincides with the motion compensation image extracted by said first means, which is a motion compensation image corresponding to the predicted image, as well as a means of generating a predicted image by performing filter processing over the motion compensation images extracted by the first and second extraction means, where the filtering process adds a high-frequency component by using correlation in a temporal direction included in the motion compensation images.

EFFECT: high image encoding efficiency by preventing increase in volume of transmitted motion vectors in a stream.

18 cl, 30 dwg

FIELD: physics, computer engineering.

SUBSTANCE: invention relates to means of encoding and decoding images. In the method, the motion vector of a current image section is predicted relative to the motion vector of a reference section which has the same shape as the current section and belongs to a reference image which is different from the current image and is broken down in advance, as a result of encoding with subsequent decoding, into a plurality of sections. When the reference section overlaps a set of k reference sections from said plurality of sections of the reference image, said motion vector of the current image section is determined based on a function of at least one reference motion vector belonging to the set of reference motion vectors associated with the k overlapped reference sections.

EFFECT: high accuracy of predicting the motion vector of an image section.

15 cl, 6 dwg

FIELD: physics, computer engineering.

SUBSTANCE: invention relates to motion vector predictive encoding/decoding of moving pictures. The moving picture encoding apparatus includes: a primary candidate reference motion vector determination unit which sets N primary candidate reference motion vectors; a degree of reliability calculation unit which calculates, using encoded or decoded picture information, the reliability of each primary candidate reference motion vector, which quantitatively represents its effectiveness in motion vector prediction of the block to be decoded; a reference motion vector determination unit which selects M (M<N) secondary candidate reference motion vectors in accordance with the degree of reliability of the N primary candidate reference motion vectors; and a motion vector encoding unit which calculates a predictive motion vector of the block to be encoded using the M secondary candidate reference motion vectors with high reliability.

EFFECT: improved efficiency of predicting and encoding moving pictures.

16 cl, 14 dwg
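
The selection of M out of N candidate vectors in the abstract above can be sketched as follows. The reliabilities are taken here as given numbers, and the component-wise median used as the predictor is an assumption; the abstract states only that the predictive motion vector is calculated from the M selected candidates.

```python
from statistics import median_low


def select_candidates(candidates, reliabilities, m):
    """Keep the m candidate motion vectors with the highest reliability."""
    ranked = sorted(zip(candidates, reliabilities), key=lambda cr: cr[1], reverse=True)
    return [c for c, _ in ranked[:m]]


def predict_mv(selected):
    """Component-wise median of the selected candidates (assumed predictor)."""
    xs = [x for x, _ in selected]
    ys = [y for _, y in selected]
    return (median_low(xs), median_low(ys))


def mv_residual(mv, predictor):
    """Difference that would be entropy-coded instead of the full vector."""
    return (mv[0] - predictor[0], mv[1] - predictor[1])
```

Because the reliability is computed from already encoded or decoded picture information, a decoder can repeat the same selection without receiving any extra side information.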

FIELD: information technology.

SUBSTANCE: method for alphabetical representation of images includes a step for primary conversion of an input image to a multi-centre scanning (MCS) format, constructed according to rules of a plane-filling curve (PFC). The initial MCS cell is a discrete square consisting of nine cells (3×3=9), having its own centre and its own four faces (sides). Scanning of the initial MCS cell is performed from the centre to the edge of the square while bypassing the other cells on a circle. The path with a bypass direction to the left from the centre of the square and then on a circle, clockwise, is the priority path for scanning and displaying images.

EFFECT: high efficiency of encoding images.

3 cl, 5 dwg
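
One possible reading of the priority scanning path in the abstract above is sketched below: start at the centre of the nine-cell square, step to the left, then go around the outer ring clockwise. The exact path is an interpretation of the abstract's wording, and the function names are hypothetical.

```python
def mcs_cell_path():
    """(row, col) visiting order for one 3 x 3 cell.

    Assumed path: centre, one step left, then clockwise around the ring
    (rows grow downwards, as in image coordinates).
    """
    centre = (1, 1)
    ring = [(1, 0), (0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0)]
    return [centre] + ring


def scan_cell(cell):
    """Read the nine values of a 3 x 3 cell along the assumed path."""
    return [cell[r][c] for r, c in mcs_cell_path()]
```

Applied to the cell `[[1, 2, 3], [4, 5, 6], [7, 8, 9]]`, the path reads the centre value 5 first, then 4, and then the remaining border values clockwise.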

FIELD: radio engineering, communication.

SUBSTANCE: invention relates to video monitoring means. The method involves a mobile client sending a request, at the request of an external control device, to a multimedia transcoder; the multimedia transcoder receiving said request; requesting an encoded multimedia stream of said external control device from a fixed streaming media network server; transcoding the obtained encoded multimedia stream; the multimedia transcoder outputting a transcoded encoded multimedia stream to the mobile client or mobile streaming media network server, where the multimedia transcoder sets video transcoding parameters corresponding to various mobile network standards.

EFFECT: enabling a user to perform video monitoring of a control point using a mobile terminal.

10 cl, 4 dwg

FIELD: physics, computation hardware.

SUBSTANCE: invention relates to coding/decoding of picture signals. The method for modifying a reference block (RFBL) of reference pixels in a reference picture (I_REF) transforms (TRF) the reference block into a first set of coefficients (REF(u, v)), modifies the first set of coefficients (REF(u, v)) by means of one or more weights (TR(u, v)), and applies the inverse transform (ITR) to the modified set of coefficients. The weights (TR(u, v)) are determined using additional pixels in the current picture (I_CUR) and additional reference pixels in the reference picture. Using the additional pixels and additional reference pixels allows the spectral weights to be determined so that they reflect the effects of attenuation. In particular, if the reference frame consists of two black-out frames, one of which is to be predicted using the reference frame, then assigning weights in the spectral domain allows the significant frame to be isolated from the two frames.

EFFECT: efficient coding in the case of attenuation.

10 cl, 3 dwg

FIELD: physics, computer engineering.

SUBSTANCE: invention relates to computer engineering. The method of transmitting a stream of unencrypted images involves encoding the stream of images and sending a compressed stream of images to at least one receiving device. Before encoding, images of the unencrypted stream of images are converted via a secure reversible conversion to obtain a converted stream of images which is encoded and transmitted in place of the unencrypted stream of images. The secure reversible conversion converts each image from a sequence of unencrypted images.

EFFECT: high security of a stream of unencrypted images.

12 cl, 9 dwg

FIELD: information technology.

SUBSTANCE: method of compressing images programmed in a controller of a device, comprising: partitioning an image into one or more blocks; applying gamma conversion to each pixel of the image to generate data with the same number of bits; computing prediction values for each pixel in each block of the one or more blocks using a plurality of prediction modes; applying quantisation to each pixel of each block of the one or more blocks using a plurality of quantisation numbers; computing differential pulse code modulation (DPCM) to generate residuals of the quantised values for each of the plurality of quantisation numbers, wherein the number of bits generated for each block of the one or more blocks is equal to the bit budget; computing pulse code modulation (PCM), which includes shifting each pixel value by a fixed number of bits; selecting for each block of said one or more blocks, DPCM with a quantisation number where the best quantisation accuracy is achieved; selecting an encoding method from the DPCM with said quantisation number and PCM; and generating a bit stream containing data encoded using the selected encoding method.

EFFECT: compression without visual losses.

14 cl, 17 dwg

FIELD: physics, computer engineering.

SUBSTANCE: invention relates to means of digitising a frame image.

EFFECT: frame digitisation with not three converters in each matrix element but with one converter in each matrix element which, during the frame period, concurrently and synchronously performs three successive conversions of colours R, G, B with 15 bits each, and image digitisation ends at the end of the frame period.

4 dwg

FIELD: information technology.

SUBSTANCE: image compression method, based on excluding a certain portion of information, wherein the information is excluded from the space domain through numerical solution of Poisson or Laplace differential equations, and subsequent estimation of the difference between the obtained solution and actual values at discrete points of the image; an array of boundary conditions is generated which includes a considerable number of equal elements and which is compressed, and the image is reconstructed by solving Poisson or Laplace partial differential equations using the array of boundary conditions.

EFFECT: eliminating loss of image integrity, high efficiency of compressing images having large areas of the same tone or gradient and maintaining contrast of boundaries between different objects of an image.

2 cl, 16 dwg

FIELD: information technology.

SUBSTANCE: method is carried out by realising automatic computer formation of a prediction procedure which is appropriately applied to an input image. The technical result is achieved by making an image encoding device for encoding images using a predicted pixel value generated by a predetermined procedure for generating a predicted value which predicts the value of a target encoding pixel using a pre-decoded pixel. The procedure for generating a predicted value, having the best estimate cost, is selected from procedures for generating a predicted value as parents and descendants, where the overall information content for displaying a tree structure and volume of code estimated by the predicted pixel value, obtained through the tree structure, is used as an estimate cost. The final procedure for generating a predicted value is formed by repeating the relevant operation.

EFFECT: high efficiency of encoding and decoding, and further reduction of the relevant volume of code.

12 cl, 14 dwg

FIELD: physics, computer engineering.

SUBSTANCE: invention relates to three-dimensional (3D) rendering, particularly processing an image of an object to place said image at a perceptual depth. Methods for rendering at least one object on a stereoscopic image for a display device are provided. Perceptual depth data as a fraction of viewer distance for the object are received, wherein said perceptual depth data can be normalised. A pixel separation offset for a particular display device is calculated from the perceptual depth data. Left and right eye images of the object are respectively inserted into the stereoscopic image with the pixel separation offset, wherein said object includes captioning to be inserted.

EFFECT: insertion of 3D objects positioned automatically and/or independently of a display device.

14 cl, 14 dwg
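
The conversion from a display-independent perceptual depth to a display-specific pixel offset in the abstract above can be illustrated with standard similar-triangles stereoscopic geometry. The formula, the sign convention (positive depth fraction means in front of the screen), the default eye separation of 0.065 m and all names are assumptions for illustration, not the method claimed in the patent.

```python
def pixel_separation(depth_fraction, screen_width_m, screen_width_px,
                     eye_separation_m=0.065):
    """Signed horizontal offset in pixels between the left- and right-eye images.

    depth_fraction: perceived depth as a fraction of the viewer-to-screen
    distance; positive values lie in front of the screen (assumed convention).
    A negative result is crossed parallax, a positive result uncrossed parallax.
    """
    if depth_fraction >= 1.0:
        raise ValueError("object cannot be at or behind the viewer")
    # Similar triangles: parallax on the screen plane, in metres.
    parallax_m = -eye_separation_m * depth_fraction / (1.0 - depth_fraction)
    # Convert metres to pixels for this particular display.
    return parallax_m * screen_width_px / screen_width_m
```

The same depth fraction yields different pixel offsets on displays with different physical widths or resolutions, which is why the offset is computed per display device rather than stored with the content.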
