Device and method for representing a three-dimensional object on the basis of images with depth

 

The invention relates to the representation of three-dimensional objects on the basis of images with depth. It is intended for rendering three-dimensional images in computer graphics and animation, and the technical result achieved consists in compact storage of the image information and fast rendering with a high-quality output image. This result is achieved by the fact that the method includes generating: information about the observation point; color images based on the color information of the corresponding pixels among the pixels constituting the object; images with depth; and image nodes consisting of the information about the observation point and the color images and images with depth corresponding to that information; and encoding the generated image nodes. 10 independent and 56 dependent claims, 54 drawings, 13 tables.

Technical field

The present invention relates to a device and method for representing three-dimensional (3D) objects on the basis of images with depth and, more particularly, to a device and method for representing three-dimensional objects using images with depth for computer graphics and animation, adopted in AFX (the animation framework extension for MPEG-4).

Description of the prior art

From the very beginning of research in the field of three-dimensional graphics, the ultimate goal of researchers has been the synthesis of realistic graphic scenes (rendered three-dimensional environments) similar to real images. For this purpose, research was carried out on traditional rendering (visualization) technologies using polygonal models, which resulted in the development of modeling and rendering technologies providing very realistic three-dimensional representations of the environment. However, the procedure for generating complex models requires enormous expert effort and is time-consuming. In addition, a realistic and detailed representation of the environment requires very large amounts of information and leads to reduced efficiency of storage and transmission.

Currently, polygonal models are typically used to represent three-dimensional objects in computer graphics. An arbitrary shape can be essentially represented by a set of colored polygons, i.e., triangles. Greatly improved software algorithms and graphics hardware make it possible to visualize such models as complex still and moving images.

However, in the last decade the search for alternative three-dimensional representations has been pursued very actively. The main reasons for this include the difficulty of constructing polygonal models for real-world objects, as well as the rendering complexity and the unsatisfactory quality achieved in forming truly photorealistic scenes.

Applications require an extremely large number of polygons; for example, a detailed model of a human body contains several million triangles, which makes it difficult to process. Although recent advances in range-finding methods, for example laser range scanning, allow high-density range data to be acquired with acceptable error, it is still very expensive and very difficult to obtain a seamless complete polygonal model of the whole object. On the other hand, rendering algorithms aiming at near-photographic quality require complex computations and do not provide real-time rendering.

The invention

One aspect of the present invention is to provide a device and method for representing three-dimensional objects on the basis of images with depth, adopted in the AFX (animation framework extension) standard for MPEG-4.

Another aspect of the present invention is to provide computer-readable recording media holding a program for implementing the method of representing a three-dimensional object on the basis of images with depth (DIBR), for computer graphics and animation, adopted in the AFX (animation framework extension) standard for MPEG-4.

In one aspect, the present invention provides a device for representing three-dimensional objects on the basis of images with depth, containing a viewpoint information generator for generating at least one piece of information about the observation point; a first image generator for generating a color image on the basis of the color information, corresponding to the observation point information, of the points of the pixels forming the object; a second image generator for generating an image with depth on the basis of the depth information, corresponding to the observation point information, of the points of the pixels forming the object; a node generator for generating image nodes consisting of the observation point information and the color image and image with depth corresponding to that information; and an encoder for encoding the generated image nodes.

The present invention also provides a device for representing three-dimensional objects on the basis of images with depth, comprising a viewpoint information generator for generating at least one piece of information about the observation point from which the object is observed; a plane information generator for generating information about the plane that defines the width, height and depth of the image plane corresponding to the observation point information; a depth information generator for generating a sequence of depth data for the depths of all the points projected onto the image plane; a color information generator for generating a sequence of color data for the corresponding projected points; and a node generator for generating a node on the basis of the plane information corresponding to the image plane, the sequence of depth data and the sequence of color data.

In another aspect, the present invention provides a device for representing a three-dimensional object on the basis of images with depth, comprising a shape information generator for generating information about the shape of the object by dividing an octree containing the object into 8 subcubes, and a reference image containing a color image for each cube divided by the shape information generator; an index generator for generating index information of the reference images in accordance with the shape information; a node generator for generating octree nodes including the shape information, the index information and the reference images; and an encoder for encoding the octree nodes into output bit streams, wherein the shape information generator performs the division iteratively until the subcubes become smaller than a predetermined size.

In another aspect, the present invention provides a device for representing three-dimensional objects on the basis of images with depth, containing an input unit for receiving input bit streams; a first extraction unit for extracting octree nodes from the input bit streams; a decoder for decoding the octree nodes; a second extraction unit for extracting the shape information and the reference images for the multiple cubes forming the octree from the decoded octree nodes; and an object representation unit for representing the object by combining the extracted reference images in accordance with the shape information.

In another aspect, the present invention provides a method of representing three-dimensional objects on the basis of images with depth, including generating at least one piece of information about the observation point; generating color images on the basis of the color information, corresponding to the observation point information, of the points of the pixels forming the object; generating images with depth on the basis of the depth information, corresponding to the observation point information, of the points of the pixels forming the object; generating image nodes consisting of the observation point information and the color image and image with depth corresponding to the observation point information; and encoding the generated image nodes.

In another aspect, the present invention provides a method of representing three-dimensional objects on the basis of images with depth, including generating information about the observation point from which the object is observed; generating information about the plane that defines the width, height and depth of the image plane corresponding to the observation point information; generating a sequence of depth data for the depths of all the points projected onto the image plane; generating a sequence of color data for the corresponding projected points; and generating a node on the basis of the plane information corresponding to the image plane, the sequence of depth data and the sequence of color data.

In another aspect, the present invention provides a method of representing a three-dimensional object on the basis of images with depth, including generating information about the shape of the object by dividing an octree containing the object into 8 subcubes and defining the divided subcubes as child nodes; determining a reference image containing a color image for each divided cube; generating the index information of the reference images corresponding to the shape information; generating octree nodes including the shape information, the index information and the reference images; and encoding the octree nodes into output bit streams, wherein, in the step of generating the shape information, the division is performed iteratively until the subcubes become smaller than a predetermined size.

In another aspect, the present invention provides a method of representing three-dimensional objects on the basis of images with depth, including receiving input bit streams; extracting octree nodes from the input bit streams and decoding them; extracting the shape information and the reference images for the multiple cubes forming the octree from the decoded octree nodes; and representing the object by combining the extracted reference images in accordance with the shape information.

According to the present invention, the rendering time for image-based models is proportional to the number of pixels in the reference and output images, but not, in general, to the geometric complexity, as it is in the case of polygonal models. In addition, when the image-based representation is applied to objects and scenes of the real world, it becomes possible to visualize natural scenes with photographic quality without using millions of polygons and expensive computations.

Brief description of drawings

The above objects and advantages of the present invention are explained in the detailed description of preferred embodiments of the invention with reference to the drawings, which represent the following:

Fig.1 - examples of image-based representations integrated into modern software;

Fig.2 is a diagram of the octree structure and the order of the child elements;

Fig.3 is a graph representing octree compression ratios;

Fig.4 is a diagram of the correspondence between voxels and bits: a - black cells correspond to ones ("1") and white cells correspond to zeros ("0"); b - a two-dimensional cross-section in the (x, depth) coordinates;

Fig.5 is a diagram showing the invariance of the node probability: a - the original current and parent nodes; b - the current and parent nodes rotated about an axis by 90 degrees;

Fig.7, 8, 9 - graphs of geometry compression ratios for the best PPM-based method;

Fig.6 is a diagram illustrating the assumption of orthogonal invariance;

Fig.10 is a diagram showing two ways of reordering the color field of the point texture model "angel" into a two-dimensional image;

Fig.11 is a diagram of examples of lossless geometry compression and lossy color compression: a and b - the original and compressed model "angel", respectively; c and d - the original and compressed model "Morton 256", respectively; and

Fig.12 is a diagram of the binary volumetric octree (BVO) model and the textured binary volumetric octree (TBVO) model of the "angel";

Fig.13 is a diagram showing the additional images taken by additional cameras in TBVO: a - the camera index image; b - the first and second additional images;

Fig.14 is a diagram showing an example of writing the TBVO stream: a - the TBVO tree structure (grey denotes the "undefined" camera); b - the resulting TBVO stream, in which filled cubes and octree cubes denote the texture bytes and the BVO bytes, respectively;

Fig.15, 17, 18 and 19 are graphs showing the TBVO compression results for the models "angel", "Morton", "Palma 512" and "Robots 512", respectively; and

Fig.16 is a diagram showing peeled (shell-removed) images of the models "angel" and "Morton";

Fig.20 is a diagram of an example texture image and depth map;

Fig.21 is a diagram of an example of a layered image with depth (LDI): a - projection of the object; b - pixels of the layered image;

Fig.22 is a diagram of an example of a texture unit (TU), in which six simple textures (pairs of an image and a depth map) are used to visualize the model shown in the center;

Fig.23 is a diagram of an example of a generalized texture unit: a - camera locations for the "palm tree" model; b - the reference image planes for the same model (21 simple textures are used);

Fig.24 is a diagram showing an example of a two-dimensional octree representation: a - the "points"; b - the corresponding maps;

Fig.25 - pseudocode for writing the TBVO bit stream;

Fig.26 is a diagram showing the specification of the nodes of the depth image-based representation (DIBR);

Fig.27 is a diagram of the orthogonal representation;

Fig.28 - pseudocode for OpenGL-based rendering of a simple texture;

Fig.29 is a diagram of an example showing compression of the reference image in a simple texture: a - the original reference image; b - the modified reference image in JPEG format;

Fig.30 is a diagram of an example showing the results of visualizing the "Morton" model in different formats: a - in the original polygonal format; b - in the image with depth format; c - in the octree image format;

Fig.31 is a diagram of visualization examples: a - the scanned "Tower" model in the image with depth format; b - the same model in the octree image format (the scanner data were used without noise removal, hence the black dots in the upper part of the model);

Fig.32 is a diagram of visualization examples of the "Palma" model: a - in the original polygonal format; b - the same model in the image with depth format;

Fig.33 is an example figure showing a frame of the "Dragon 512" animation in the image with depth format;

Fig.34 is a diagram of an example of visualizing the "Angel 512" model in the point texture format;

Fig.35 is a block diagram of a device for representing three-dimensional objects on the basis of images with depth in accordance with an embodiment of the present invention;

Figs.36 and 37 are flowcharts illustrating the procedure of the method for representing three-dimensional objects on the basis of images with depth using a simple texture according to an embodiment of the present invention; and

Fig.38 is a block diagram of a device for representing three-dimensional objects on the basis of images with depth according to the present invention; and

Fig.39 is a flowchart showing the procedure of the method for representing three-dimensional objects on the basis of images with depth using a point texture according to the present invention; and

Fig.40 is a block diagram of a device for representing three-dimensional objects on the basis of images with depth using an octree in accordance with the present invention; and

Fig.41 is a detailed block diagram of the preprocessor 2310;

Fig.42 is a detailed block diagram of the generator 2340 indexes;

Fig.43 is a detailed block diagram of the encoder 2360;

Fig.44 is a detailed block diagram of the second section 2630 encoding;

Fig.45 is a detailed block diagram of the third section 2640 encoding;

Fig.46 is a flowchart showing the procedure of the method for representing three-dimensional objects on the basis of images with depth using an octree according to an embodiment of the present invention; and

Fig.47 is a flowchart showing the implementation of the index generation process;

Fig.49 is a flowchart showing the implementation of the encoding;

Fig.50 is a flowchart showing the implementation of the second encoding stage;

Fig.51 is a flowchart showing the implementation of the third encoding stage;

Fig.52 is a flowchart showing the process of generating bit streams during encoding;

Fig.53 is a block diagram of a device for representing three-dimensional objects on the basis of images with depth using an octree in accordance with another embodiment of the present invention; and

Fig.54 is a flowchart showing the procedure of the method for representing three-dimensional objects on the basis of images with depth using an octree according to another embodiment of the present invention.

Description of the preferred embodiments of the present invention

This application claims priority from the U.S. provisional applications listed below, which are incorporated into the present description by reference in their entirety.

I. Coding according to ISO/IEC JTC 1/SC 29/WG 11 (coding of moving pictures and audio)

1. Introduction

This document sets out the results of experiments on image-based rendering technology using textures with depth. It also presents changes made to the node specifications on the basis of experiments conducted after the 57th MPEG meeting and of discussions during the AFX Ad hoc Group meeting in October.

2. Experimental results

2.1. Test models

For stationary objects

Node "image with depth" with a simple texture

Dog

King Rex (an image with depth using about 20 cameras)

The veranda (monster) (an image with depth using about 20 cameras)

Concorde (an image with depth, scanned data)

Palma (an image with depth, 20 cameras)

The image with depth with a multilayered texture

Angel

The image with depth with point texture

Angel

Node "octree image"

Create a

Dragon

For animated objects

Node "image with depth" with a simple texture

The dragon in the environment

The image with depth with a multilayered texture

Not available

Node "octree image"

Robot

The dragon in the environment

More data (scanned or simulated) will be provided in the future.

2.2. The results of the test

All the nodes proposed in Sydney have been integrated into the blaxxun contact 4.3 reference software. However, the sources have not yet been uploaded to the CVS server.

Animated image-based representation (IBR) formats require synchronization between multiple movie files, such that images belonging to the same key frame in each movie file are delivered at the same time. However, the current reference software does not support this synchronization capability, which is possible in MPEG systems. Therefore, for the present, the animated formats can be visualized using texture files that are movies in AVI format.

After several experiments with layered textures, the layered texture node was found to be inefficient. This node was proposed for layered images with depth. However, there is also the point texture node, which can support them. Therefore, it was proposed to remove the layered texture node from the node specification. Fig.1 presents examples of image-based representations (IBR) integrated into the current reference software.

3. Updated specification of the DIBR nodes

The conclusion of the Sydney meeting on the IBR proposal was that there should be an IBR stream containing the images and the camera information, and that the IBR node should only have a link (Url, universal resource locator) to it. However, during the discussion of IBR at the AhG meeting in Rennes it was decided to have the images and camera information both in the IBR nodes and in the stream. Thus, the following is the updated specification of the IBR nodes. The requirements for the IBR stream are given in the explanation of the Url field.

The "image with depth" node defines a single DIBR texture. When multiple "image with depth" nodes relate to one another, they are processed as a group and should therefore be placed under the same transform node.

Field di" defines the texture with depth, which should be displayed in the area defined in the host image with depth. She must be one of the textures of the different types of texture images with depth (simple texture or bitmap texture).

The "position" and "orientation" fields determine the relative location of the observation point of the DIBR texture in the local coordinate system. The position is given relative to the origin (0, 0, 0) of the coordinate system, and the orientation determines the rotation relative to the default orientation. In the default position and orientation, the observer is on the Z axis and looks along the Z axis towards the origin, with the +X axis to the right and the +Y axis straight up. However, the transformation hierarchy affects the final position and orientation.

Field "field of view" defines the angle from the observation point of the camera defined by the fields of position and orientation. The first value represents the angle to the horizontal direction, and the second value represents the angle to the vertical side. The values set default is "field of view" refers to the width and height of the near and far plane.

The "near plane" and "far plane" fields define the distances from the observation point to the near and far planes of the viewing volume. The texture and depth data represent the region bounded by the near plane, the far plane and the observation point. The depth data are normalized to the distance from the near plane to the far plane.
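By way of illustration only (this is not part of the node specification), the depth normalization just described can be sketched in Python; the function name and the sample near/far values are assumptions of ours:

    # Sketch: normalize a raw depth d lying between the near and far planes
    # to the [0, 1] range, as described above.
    def normalize_depth(d: float, near: float, far: float) -> float:
        return (d - near) / (far - near)

    # Example: with near = 10 and far = 100, a point at depth 55 maps to 0.5.
    assert normalize_depth(55.0, 10.0, 100.0) == 0.5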

Field "orthogonal" determines the type of the observation of the texture of the CIP. When it is set to TRUE, then the texture of the CIP is based on orthogonal observation. Otherwise, the texture of the CIP is based on the observation in the future.

The field "image with depth Url" specifies the address of the image with depth stream, which may optionally include the following contents:

Position

Orientation

The field of view

The near plane

The far plane

Orthogonal

di (simple texture or point texture)

Header of 1 byte with the include/exclude flags for the above fields

The "depth" determines the depth for each pixel in the texture. Depth map must have the same size as the image or movie in the "texture". It must be one of various types of nodes textures (texture image, the texture of the movie or pixel texture). If the node depth is set to ZERO or the "depth" is vague, then the alpha channel in the texture should be used as a depth map.

The "point texture" node defines multiple layers of DIBR points. The "width" and "height" fields specify the width and height of the texture.

The "depth" defines multiple depths at each point (in normalized coordinates) in the plane of projection in the crossing procedure, which starts from a point in the lower left corner and crosses the right to end on a horizontal line before moving to a higher line. For each point the number of depths (pixels) first remember, and this is the number of depth values then.

The "color" field defines the color of the current pixel. The order is the same as for the "depth" field, except that the number of depths (pixels) for each point is not included.

Site image ochoterena" defines the structure ochoterena and projected textures. The size of the cube described full ochoterena 111, and the center of the cube ochoterena must be at the origin (0, 0, 0) local coordinate system.

The "resolution ochoterena" specifies the maximum number of leaves ochoterena along the sides of the cube described. Level ochoterena can be determined from the resolution ochoterena using the following equation: level oktodelete=int(iog2(resolution ochoterena-1))+1).

Field oktodelete" defines the number of internal nodes ochoterena. Each internal node is represented by a byte. "1" in the i-th bit of the first byte means that the child nodes exist for the i-th child element of this internal node, and ' 0 ' means that it is not. The internal nodes ochoterena should be the order of the signal width ochoterena. Order eight child elements of the internal node is shown in Fig.2.

The image ochoterena" defines the set of nodes "image depth" with a simple structure for the field "di". However, the field "middle plan" and "long-range plan" node "image with depth the supply ochoterena with the following content:

the header for flags

the octree resolution

the octree

the octree images (multiple "image with depth" nodes)

the near plane is not used

the far plane is not used

di (simple texture without depth)

II. Coding according to ISO/IEC JTC 1/SC 29/WG 11 (coding of moving pictures and audio)

1. Introduction

This document sets out the results of the core experiment on depth image-based representation (DIBR), AFX A8.3. This core experiment concerns the nodes for depth image-based representation that use textures with depth. These nodes were accepted and included in the proposal for the Committee Draft at the meeting in Pattaya. However, the formation of the stream of this information through the "Url" field of the "octree image" node and the "image with depth Url" field of the "image with depth" node still remains a subject of research. This document describes the format of the stream formation, which includes compression of the "octree" field of the "octree image" node and of the "depth/color" fields of the "point texture" node.

2. The format of the stream for the "Url" field

2.1. The stream format

The "octree image" node includes the field "Url", which defines the address of the octree image stream. This stream may optionally include the following contents:

the header for flags

the octree resolution

the octree

the octree images (multiple "image with depth" nodes)

the near plane is not used

the far plane is not used

di (simple texture without depth)

Field oktodelete" defines the number of internal nodes ochoterena. Each internal node is represented by a byte. "1" in the i-th bit of the first byte means that the child nodes exist for the i-th child element of this internal node, and ' 0 ' means that it is not. The internal nodes ochoterena should be the order of the signal width ochoterena. Order eight child elements of the internal node is shown in Fig.2.

Field oktodelete" node "image octoder the second flow. The following section describes the compression scheme ochoterena for oktodelete" node "image ochoterena".

2.2. A compression scheme for the "octree" field

In the octree representation, the DIBR data are contained in the "octree" field, which represents the geometric component. The octree represents the set of points of the enclosing cube that completely defines the object surface.

Non-ideal reconstruction of the geometry from a compressed representation leads to very noticeable artifacts. Therefore, the geometry must be compressed without loss of information.

2.2.1. Octree compression

For the compression of the "octree" field, which represents the octree in depth-first traversal order, a lossless compression method was developed that uses some ideas of the prediction by partial matching (PPM) method. The main idea used is "prediction" (i.e., probability estimation) of the next symbol from several previous symbols, which are called the context. For each context there is a probability table containing the estimated probability of occurrence of each symbol in this context. This is used in combination with an arithmetic coder called a range coder. The two main ideas of the method are:

1) using the parent node as the context for the child node;

2) using the assumption of "orthogonal invariance" to reduce the number of contexts.

The second idea is based on the observation that the "transition probability" for "parent-child" pairs of nodes is typically invariant under orthogonal transformations (rotations and symmetries). This assumption is illustrated in Annex 1. It makes it possible to use more complex contexts without having too many probability tables. This, in turn, allowed quite good results to be achieved in terms of volume and speed, because the more contexts are used, the more accurate the probability estimate and the more compact the code.

Encoding is the process of constructing and updating the probability tables according to the context model. In the proposed method, the context is modeled as a "parent-child" hierarchy in the octree structure. First, the symbol is defined as a byte of a node whose bits indicate the occupancy of the subcubes after internal subdivision. Therefore, each node in the octree can be a symbol, and its numeric value is from 0 to 255. A probability table (PT) contains 256 integer variables; the value of the i-th variable (0 ≤ i ≤ 255), divided by the sum of all the variables, equals the frequency (probability estimate) of occurrence of the i-th symbol. The probabilistic context table (PCT) is a set of probability tables (PTs). The probability of a symbol is determined by one and only one PT; the specific PT depends on the context. An example of a PCT is shown in Table 1.

The encoder works as follows. It first uses the 0-context model (i.e., a single PT for all symbols), starting from the uniform distribution and updating the PT after each newly encoded symbol. The tree is traversed in depth-first order. When enough statistics have been collected (the empirically found value is 512 encoded symbols), the encoder switches to the 1-context model, which has 27 contexts, defined as follows.

Consider the set of 32 fixed orthogonal transformations, which include symmetries and rotations by 90° about the coordinate axes (see Annex 2). Then the symbols can be divided into categories according to the occupancy configurations of their subcubes. In the method used by the applicant there are 27 sets of symbols, called groups, such that two symbols are connected by one of these transformations if and only if they belong to the same group.

In byte notation, the groups are 27 sets of numbers (see Annex 2). It is assumed that the probability table depends not on the parent node itself (in which case there would be 256 tables), but only on the group (denoted by the generating symbol in Fig.2) to which the parent node belongs (hence 27 tables).

At the moment of switching, the PTs of all the contexts are set to copies of the 0-context PT. Then each of the 27 PTs is updated as it is used for encoding.

After 2048 symbols (another heuristic value) have been encoded with the 1-context model, the encoder switches to the 2-context model, which uses the pairs (generating symbol, node symbol) as contexts. The node symbol is simply the position of the current node within the parent node. There are thus 27·8 contexts for the 2-context model. At the moment of switching to this model, the PTs obtained for each context are used for each node position within that context and are updated independently from that point on.
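Purely as an illustration of this staged switching (not the normative encoder), the bookkeeping might look as follows; the thresholds 512 and 2048 and the 27 groups are taken from the text, while the group lookup and table copying are reduced to simple lists:

    # Sketch of the three-stage context selection described above.
    # group_of maps a node byte (0..255) to its group id (0..26).
    class ContextModel:
        def __init__(self, group_of):
            self.group_of = group_of
            self.coded = 0                     # symbols encoded so far
            self.pt0 = [1] * 256               # 0-context PT (uniform start)
            self.pt1 = None                    # 27 PTs after 512 symbols
            self.pt2 = None                    # 27*8 PTs after 2048 symbols

        def table_for(self, parent_byte, child_pos):
            if self.pt2 is not None:
                return self.pt2[self.group_of[parent_byte]][child_pos]
            if self.pt1 is not None:
                return self.pt1[self.group_of[parent_byte]]
            return self.pt0

        def maybe_switch(self):
            if self.coded == 512:    # copy the 0-context PT into all 27 contexts
                self.pt1 = [list(self.pt0) for _ in range(27)]
            elif self.coded == 2048: # copy each group PT to its 8 positions
                self.pt2 = [[list(t) for _ in range(8)] for t in self.pt1]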

Turning to a more detailed technical description, encoding for the 1-context and 2-context models proceeds as follows. For the context of the current symbol (i.e., the parent node), its group is determined by table lookup (the geometric analysis was performed at the program development stage). Then the orthogonal transformation is applied that takes the context to a "standard" element of its group (chosen arbitrarily once and for all). The same transformation is applied to the symbol itself (these operations are likewise implemented as table lookup; naturally, all the computations for all possible combinations are made in advance). In effect, this amounts to computing the correct position of the current symbol in the probability table of the group that contains its context. Then the corresponding probability is fed to the range coder.

In short, given the generating symbol and the position of the node, the context identifier (ID) is determined, which identifies the group ID and the position of the PT in the PCT. The probability distribution of the PT and the context ID are fed to the range coder. After encoding, the PCT is updated for use in the next encoding step. Note that the range coder is a variant of arithmetic coding that performs renormalization in bytes instead of bits, runs twice as fast, and gives 0.01% worse compression than the standard implementation of arithmetic coding.

The decoding process need not be described in detail, since it is standard: it uses exactly the same methods of determining the contexts, updating the probabilities, and so on.

2.3. The results of the test

Fig.3 presents a comparison of the approach proposed by the applicant for both static and animated models (the ordinate indicates the compression ratio). The octree compression ratio varies around 1.5-2 relative to the original octree size and outperforms universal lossless compression methods (Lempel-Ziv based, as in RAR) by about 30%.

3. The format of the stream for the field "image with depth Url"

3.1. The stream format

Node "Sobranie with depth" includes the "image Url depth, which defines the address of the image stream with depth. This thread may optionally include the following contents:

Header of 1 byte with the include/exclude flags for the following fields

Position

Orientation

The field of view

The near plane

The far plane

The definition of the "point texture" node, which can be used in the "di" field of the "image with depth" node, has the following form:

The "point texture" node defines multiple layers of DIBR points. The "width" and "height" fields specify the width and height of the texture. The "depth" field defines multiple depths for each point (in normalized coordinates) in the plane of projection, in the traversal order that starts from the point in the lower left corner and proceeds to the right to the end of the horizontal line before moving to the line above. For each point, the number of depths (pixels) is stored first, followed by that number of depth values. The "color" field defines the color of the current pixel. The order is the same as for the "depth" field, except that the number of depths (pixels) for each point is not included.

Field "depth" and "color" pinpoint texture set in the original (raw) format, and the size of these fields is likely very large. Therefore, these fields need to be compressed, in order to have an efficient flow. The following section describes the compression scheme for the fields in the node point of the texture.

3.2. The compression scheme for point textures

3.2.1. Compression of the "depth" field

The "depth" field of the "point texture" node is simply the set of points in the discretized enclosing cube. Assume that the bottom plane is the plane of projection. Given the m×n×l grid for the model, with the points being the centers of the cells (called voxels, as in the octree case), the filled voxels can be treated as ones ("1") and the empty voxels as zeros ("0"). The resulting set of bits (m·n·l bits) is then organized into a stream of bytes. This is done by traversing the voxels in the depth direction (orthogonally to the plane of projection) in layers of depth 8 ("columns") over the plane of projection (padding the last layer of bytes with zeros, if necessary, when the depth dimension is not a multiple of 8). Thus, the point set can be viewed as a set ("stack") of 8-bit grayscale images (or, as a variant, 16-bit images). The correspondence between voxels and bits is illustrated in Fig.4a.

For example, as shown in Fig.4b, the black squares correspond to points of the object. The horizontal plane is the plane of projection. Consider the slice of height 16 (its upper boundary is shown with a bold line) and interpret its columns as bytes. That is, the column above the point marked in the drawing is a stack of two bytes with the values 18 and 1 (or a single 16-bit value). If the byte stream obtained in this way is fed to the best PPM-based compressors, fairly good results can be obtained. However, applying a simple 1-context method directly (orthogonal invariance or hierarchical contexts cannot be used here) leads to slightly better compression. Below is a table of the volumes required for different types of geometry representation based on layered images with depth (LDI): compression of the binary volumetric octree (BVO), compression of the above byte array by the best PPM-based compressor, and compression of the same array by the compressor currently used by the applicant (the figures are in kilobytes).
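A minimal sketch of this packing, assuming that bit i of each byte encodes the i-th voxel along the depth direction:

    # Pack one depth "column" of 0/1 voxel occupancies into bytes (Fig.4).
    def pack_column(column: list[int]) -> bytes:
        column = column + [0] * (-len(column) % 8)   # zero-pad to a multiple of 8
        out = bytearray()
        for i in range(0, len(column), 8):
            byte = 0
            for bit, v in enumerate(column[i:i + 8]):
                byte |= v << bit
            out.append(byte)
        return bytes(out)

    # The worked example above: a 16-voxel column whose occupied cells are at
    # depths 1, 4 and 8 yields the two bytes 18 (0b00010010) and 1.
    col = [0] * 16
    col[1] = col[4] = col[8] = 1
    assert pack_column(col) == bytes([18, 1])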

3.2.2. Compression of the "color" field

The field "color" node "point the texture is a set of colors assigned to the pixels of the object. Unlike the case ochoterena, the color field is characterized exact match (one to one) with the field depth. The idea is to represent the color information as a single image, which can be compressed in one of known methods with the loss. The number of elements of e which is a significant motivation for this method. The image may be obtained by scanning pixels of depth in some natural order.

Consider first the scanning order dictated by the original storage format of the point texture: scanning the geometry "depth-first". The multipixels are scanned in the natural order across the plane of projection, as if they were ordinary pixels, and the pixels within one and the same multipixel are scanned in the depth direction. This scanning order forms a one-dimensional array of colors (the 1st non-zero multipixel, the 2nd non-zero multipixel, and so on). Once the depth is known, the colors of the points can be successfully restored from this array. To make image compression methods applicable, this long string must be mapped one-to-one onto a two-dimensional lattice. This can be done in different ways.

The method used in the tests is so-called "block scanning", in which the color line is arranged in 8×8 blocks, and these blocks are arranged in columns. The resulting image is shown in Fig.5.

The image was compressed by various methods, among which the best results were given by texture compression. This method is based on adaptive local "palettization" (reducing the number of bits per pixel by representing each 8×8 block with a smaller color palette). It has two modes: 8- and 12-fold compression (compared with the 24-bit-per-pixel "raw" true-color BMP format). The success of this method when applied to images of this type can be explained by its "palette" nature (building a local palette), which makes it possible to account for sharp (even irregular) local variations in color caused by the "mixing" of points of the front and back surfaces (which may differ to a very great extent, as in the case of the "angel" model). The search for an optimal scan aims to reduce such variations to the greatest possible extent.
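By way of illustration only, the local palettization idea can be sketched as follows; the palette-selection heuristic and the fixed palette size are assumptions of ours, not the actual definition of the 8- and 12-fold modes:

    from collections import Counter

    # Sketch: reduce one 8x8 block (64 RGB tuples) to a small palette plus
    # per-pixel palette indices, in the spirit of local "palettization".
    def palettize_block(block, palette_size=4):
        palette = [c for c, _ in Counter(block).most_common(palette_size)]
        def nearest(c):
            return min(range(len(palette)),
                       key=lambda i: sum((a - b) ** 2 for a, b in zip(palette[i], c)))
        return palette, [nearest(c) for c in block]

    # With 4 palette colors, each pixel needs 2 bits instead of 24, which is
    # the order of reduction the 8- and 12-fold modes rely on.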

3.3. The results of the test

Examples of the models in the original and compressed formats are presented in Annex 3. The quality of some models (for example, "angel") after compression is still unsatisfactory, while the quality of others is very good ("Grasshopper"). However, it is expected that this problem can be solved by using a proper scanning order. Potentially, even the 12-fold compression mode could be used, bringing the overall results close to the geometry compression results of the best PPM-based method.

Below is a table of compression ratios.

4. Conclusions

This document presents the results of the core experiment AFX A8.3 on depth image-based representation. Streams of the depth image-based representation (DIBR) concepts were introduced, which are connected through the Url (universal resource locator) fields of the DIBR nodes. These streams consist of all the elements of the DIBR node, together with a flag for each element to enable optional use. The compression of octrees and point texture data was also studied.

Annex 1. The geometric meaning of the orthogonal invariance of contexts in the BVO compression algorithm

The assumption of orthogonal invariance is illustrated in Fig.6. Consider the rotation about the vertical axis by 90° clockwise. Consider an arbitrary filling configuration of a node and its parent before the rotation (upper image) and after it (lower image). Then the two different configurations can be treated as one and the same configuration.

Annex 2. Groups and transformations

1. 32 fixed orthogonal transformations

Each transformation is specified by a 5-bit word. A combination of bits corresponds to a composition of the following basic transformations (i.e., if the k-th bit is "1", the corresponding transformation is performed):

1st bit - swapping the x and y coordinates;

2nd bit - swapping the y and z coordinates;

3rd bit - symmetry in the (y-z) plane;

4th bit - symmetry in the (x-z) plane;

5th bit - symmetry in the (x-y) plane.

2. 27 groups

For each group, the order of the group and the number of non-zero bits in its elements are given: the group number, the order of the group, and the number of filled bits (voxels).

3. The characters and transformations

For each symbol (s), the index of the group (g) to which it belongs and the value of the transformation (t) taking it to the "standard" element of the group are given.

The binary number of a symbol maps to binary voxel coordinates in the following way: the i-th bit of the number has the binary coordinates x = i&1, y = i&(1<<1), z = i&(1<<2).
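The same mapping, written out as a small sketch:

    # Bit i of a symbol byte corresponds to the subcube (voxel) with the
    # binary coordinates given above.
    def bit_to_voxel(i: int) -> tuple[int, int, int]:
        return i & 1, (i >> 1) & 1, (i >> 2) & 1

    # Example: bit 5 = binary 101 is the voxel at x = 1, y = 0, z = 1.
    assert bit_to_voxel(5) == (1, 0, 1)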

Annex 3. Screenshots of point texture compression

Fig.7, 8 and 9 present the geometry compression data for the best PPM-based method.

III. Coding according to ISO/IEC JTC 1/SC 29/WG 11 (coding of moving pictures and audio)

1. Introduction

This document sets out the results of the core experiment on depth image-based representation (DIBR), AFX A8.3. This core experiment concerns the nodes for depth image-based representation that use textures with depth. These nodes were accepted and included in the proposal for the Committee Draft at the meeting in Pattaya. However, the formation of the stream of this information through the "octree image" node and the "image with depth" node still remains a subject of research. This document describes the format of the stream that should be linked through these nodes. The stream format includes compression of the "octree" field of the "octree image" node and of the "depth/color" fields of the "point texture" node.

2. Compression of the DIBR formats

Below, a new method of efficient lossless compression of the linkless octree data structure is described, which makes it possible to reduce the volume of this already compact representation by about 1.5-2 times, as shown by experiments. Various methods of lossless and lossy compression of the point texture format are also described, using an intermediate voxel representation in combination with entropy coding and a specialized block texture compression method.

2.1. Octree compression

The "octree images" and "octree" fields are compressed separately. The described methods were developed on the premise that the "octree" field must be compressed losslessly, while some visually acceptable distortion is permissible for the octree images. The "octree images" field is compressed by means of MPEG-4 image compression (for static models) or video compression tools (for animated models).

2.1.1. Compression of the "octree" field

Octree compression is the most important part of octree image compression, because it deals with compressing an already very compact linkless binary tree representation. However, in the applicant's experiments, the method explained below reduced this structure to approximately half of its original volume. In the animated octree image version, the "octree" field is compressed separately for each three-dimensional frame.

2.1.1.1. The context model

Compression is performed by a variant of adaptive arithmetic coding (implemented as a "range coder") that makes explicit use of the geometric nature of the data. The octree is a stream of bytes. Each byte represents a node (i.e., a subcube) of the tree, in which the bits indicate the occupancy of the subcubes after internal subdivision. The compression algorithm processes the bytes one by one, as follows.

- The context of the current byte is determined.

- The probability (normalized frequency) of occurrence of the current byte in this context is retrieved from the probability table (PT) corresponding to the context.

- The probability value is fed to the range coder.

- The current PT is updated by adding 1 to the frequency of the current byte in the current context (and, if necessary, is subsequently renormalized, as described in more detail below).

Thus, coding is the process of constructing and updating the PTs according to the context model. In context-based adaptive arithmetic coding schemes (such as prediction by partial matching), the context of a symbol is usually a string of several preceding symbols. However, in our case compression efficiency is increased by exploiting the octree structure and the geometric properties of the data. The described approach is based on two ideas that are apparently new for the octree compression problem.

A. For the current node, the context is either its parent node, or the pair {parent node, position of the current node within the parent node}.

B. It is assumed that the "probability table" is invariant with respect to a specific set of orthogonal transformations (such as rotations or symmetries).

The assumption of "In" is illustrated in the drawing for converting R, which represents the rotation by -90in the x-z plane. The main position on item "B" is the probability of encountering a particular type of the child node in a particular type of the parent node should depend only on their relative positions. This assumption is confirmed in the experiments of the applicant by analyzing tables of probabilities. This allows you to use a more complex context without having to have too many tables of probabilities. This, in turn, contributes to the achievement of a reasonably good results in terms of data size and performance. Note that the more complex contexts are used, the more accurate the assessment of the probability and the more compact is the code.

We introduce a set of transformations for which we assume the invariance of the probability distributions. To be applicable in the given situation, such transformations must map the enclosing cube to itself. Consider the set G of orthogonal transformations in Euclidean space obtained by all compositions, in any number and order, of the three basic transformations (generators) m1, m2 and m3, where m1 and m2 are reflections in the planes x=y and y=z, respectively, and m3 is the reflection in the plane x=0. One of the classical results of the theory of groups generated by reflections asserts that G contains 48 distinct orthogonal transformations and is, in a sense, the maximal group of orthogonal transformations taking the cube to itself (the so-called Coxeter group). For example, the rotation R in Fig.6 is expressed in terms of the generators as

R = m3·m2·m1·m2,

where "·" denotes matrix multiplication.

A transformation from G applied to an octree node yields a node with a different filling configuration of subcubes. This allows the nodes to be categorized according to the filling configurations of their subcubes. Using group-theoretic terminology, we can say that G acts on the set of all filling configurations of octree nodes. Calculations show that there are 22 distinct classes (also called orbits in group theory), where, by definition, two nodes belong to the same class if and only if they are related by a transformation from G. The number of elements in a class varies from 1 to 24 and is always a divisor of 48. The practical consequence of assumption "B" is that the probability table depends not on the parent node itself, but only on the class to which the parent node belongs. Note that there would be 256 tables for the parent-based context and an additional 256·8 = 2048 tables for the context based on the positions of the parent and child elements, whereas only 22 tables are needed for the context based on the class of the parent element, plus 22·8 = 176 tables in the latter case. Therefore, a context of equivalent complexity can be used with a relatively small number of probability tables. The constructed PT has the form shown in Table 2.

2.1.1.2. The encoding process

To make the statistics of the probability tables more accurate, they are collected in different ways at three stages of the encoding process.

- At the first stage the context is not used at all: the "0-context model" is adopted, and a single probability table with 256 entries is maintained, starting from the uniform distribution.

- After the first 512 nodes (an empirically found number) have been encoded, the encoder switches to the "1-context model", which uses the parent node class as the context. At the moment of switching, the 0-context PT is copied to the PTs of all the contexts.

- After 2048 further nodes have been encoded, the encoder switches to the "2-context model". At this point, the 1-context PT of each parent class is copied to the PTs for each position in that same parent class.

The key point of this algorithm is the determination of the context and probability for the current byte. This is implemented as follows. In each class, one element is fixed and called the "standard element". A class map table (CMT) is maintained, which indicates the class to which each of the 256 possible nodes belongs, together with the precomputed transformation from G that takes the given node to the standard element of its class. Thus, to determine the probability of the current node N, the following steps are performed.

- Determine the parent node P of the current node.

- Retrieve from the CMT the class to which P belongs and the transformation T that takes P to the standard node of the class. Let the class number be c.

- Apply T to P and find the child position p in the standard node to which the current node N is mapped.

- Apply T to N. The newly obtained filling configuration TN is then at position p in the standard node of class c.

- Retrieve the required probability from the entry of the probability table corresponding to the class-position combination (c, p) at the position corresponding to TN. For the 1-context model, the above steps are modified in the obvious way. Needless to say, all the transformations are precomputed and implemented as lookup (transcoding) tables.
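A schematic transcription of these steps, with the class map table, the transformation tables and the PTs passed in as parameters (all the names are illustrative, not the patent's code):

    # N, P: current and parent node bytes.
    # cmt[P] -> (class c, transform T to the standard node of the class)
    # transform(T, byte) -> byte with T applied to the filling pattern
    # child_pos(standard_parent, N) -> position p of the current node
    # pct[(c, p)] -> probability table (list of 256 integer counts)
    def node_probability(N, P, cmt, transform, child_pos, pct):
        c, T = cmt[P]
        p = child_pos(transform(T, P), N)   # position of N in the standard node
        TN = transform(T, N)                # N expressed in the standard frame
        table = pct[(c, p)]
        return table[TN] / sum(table)       # estimate fed to the range coder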

Note that at the decoding stage the parent node P of node N has already been decoded, and therefore the transformation T is known. All the steps performed during decoding are exactly identical to the corresponding steps of the encoding procedure.

Finally, we describe the procedure for updating the probabilities. Let P be the probability table for a certain context. Denote by P(N) the entry of P corresponding to the probability of occurrence of node N in this context. In the described implementation, P(N) is an integer, and after each occurrence of N, P(N) is updated as

P(N)=P(N)+A,

where A is an integer increment parameter typically varying from 1 to 4 for different context models.

Let S(P) be the sum of all the entries of P. Then the probability of N that is fed to the arithmetic coder (in this case, the range coder) is computed as P(N)/S(P). As soon as S(P) reaches the threshold value 2^16, all the entries are renormalized: to avoid zero values in P, the entries equal to 1 are left as they are, while the others are divided by 2.
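For illustration, the update and renormalization can be sketched as follows (the threshold and the increment range are the ones named above; the list-based table is an assumption):

    A = 4                  # integer increment parameter (1..4)
    THRESHOLD = 1 << 16    # renormalization threshold for S(P)

    def update(table: list[int], n: int) -> None:
        table[n] += A
        if sum(table) >= THRESHOLD:
            for i, v in enumerate(table):
                table[i] = v if v == 1 else v // 2   # keep 1s to avoid zeros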

2.2. Point texture compression

The "point texture" node contains two fields to be compressed, namely "depth" and "color". The main difficulties with compressing point texture data are due to the following requirements.

- The geometry must be compressed without loss, since distortions in this type of geometric representation are often quite noticeable.

- The color information has no natural two-dimensional structure, and therefore image compression methods are not directly applicable.

This section proposes three methods for compression of the point texture model.

- A lossless method for the standard node representation.

- A lossless method for the lower-resolution node representation.

- A method with lossless geometry compression and lossy color compression for the lower-resolution node representation.

These methods correspond to three levels of "fidelity" of the object description. The first method assumes that the depth information must be stored accurate to the original 32 bits. In practice, however, the depth information may often be quantized with a much smaller number of bits without loss of quality. In particular, when the point texture model is converted from a polygonal mesh, the quantization resolution is chosen in accordance with the actual size of the visible details of the original model and with the desired resolution of the output display image. In this case, 8-11 bits may well satisfy the requirements. The above-mentioned second method uses this lossless "lower-resolution" representation. The key point here is that with a relatively small number of bits an intermediate voxel representation of the model can be used, which allows the "depth" field to be compressed essentially without loss of information. In both cases the color information is compressed losslessly and stored in PNG format after reordering the color data into an auxiliary two-dimensional image. Finally, the third method achieves a much higher compression ratio by combining lossless compression of the geometry with lossy compression of the color data. The latter is done by a specialized block texture compression method. The following three subsections describe these methods in more detail.

2.2.1. Lossless point texture compression for the standard node representation

This is a simple lossless coding method, which works as follows.

- The "depth" is compressed adaptive encoder distances similar to those used in the compression field oktodelete". For the format used by the version in which the table of probabilities is maintained for each of the 1-character to is seen as a stream of bytes and geometric structure in an explicit form is not used.

- The field "color" is compressed after conversion to a flat image true color. The color of points in the point model textures are first written to a temporary one-dimensional array in the same order as the values of depth of field depth. If the total number of points in the model is equal to L, then calculates the smallest integer 1 such that 1-1L, and this "long line" color values "collapses" into a square image with a side of 1 (if necessary, fill-in-black pixels). This image is then compressed using one of the tools of MPEG-4 image compression without loss. This approach was used PNG format. The resulting image for the model "angel" shown in Fig.10A.

2.2.2. Lossless point texture compression for the lower-resolution node representation

In many cases, 16-bit resolution for the depth information is already very fine. Indeed, the depth resolution should correspond to the resolution of the screen on which the model is to be rendered. In situations where small variations of the model depth at different points produce displacements in the screen plane much smaller than a pixel, a format is used in which the depth occupies 8-11 bits. Such models are usually obtained from other formats, for example from a polygonal model, by sampling the depth and color over a suitable spatial grid.

Such a lower-resolution representation can itself be considered a compressed form of the standard model with 32-bit depth. However, for such models there exists a still more compact representation using an intermediate voxel space. Indeed, the points of the model can be considered to belong to the nodes of a uniform spatial grid whose cells are determined by the discretization step. The grid can be assumed uniform and orthogonal since, in the case of a model in perspective view, we can work in the parametric space. Under these assumptions, the depth and color of a lower-resolution point texture are compressed as follows.

- The "color" field is compressed by a lossless image compression method, as in the previous method.

- The "depth" is first converted to a representation on the basis of voxels and then compressed using a version of the encoder distance described in the previous section.

The intermediate model uses a voxel space of size width × height × 2^s (the "width" and "height" parameters are explained in the point texture specification, and s is the number of bits used for the depth values). For our purposes, there is no need to work with the potentially huge voxel space as a whole, but only with its "thin" sections. Let (r, c) be the row-column coordinates in the plane of projection, and let d be the depth coordinate. We convert the slices {c = const}, i.e., the cross-sections of the model by vertical planes, into the voxel-based representation. Scanning a slice along the "columns" parallel to the plane of projection, we mark the voxel (r, c, d) as "black" if and only if there is a point of the model with depth value d that is projected onto (r, c). This process is illustrated in Fig.4.

Once a slice has been built, it is compressed by the 1-context range coder, and compression of the next slice begins. In this way, working with very large data arrays is avoided. The probability tables are not reinitialized for each new slice. For a wide range of models, only a small proportion of the voxels are "black", which makes it possible to achieve quite high compression ratios. Decompression is the obvious inversion of the described operations.

A comparison of this compression of the "depth" field with other methods is given below. However, the "color" field of such a model, being an irregular image, cannot simply be compressed as an image without distortion. The next subsection considers the combination of lossless geometry compression with lossy color compression.

2.2.3. Lossless geometry compression and lossy color compression for the lower-resolution point texture representation

As in the method described above, this method converts the "depth" field into the voxel-based representation, which is then compressed by the adaptive 1-context range coder. The "color" field is likewise mapped onto a two-dimensional image. The mapping is arranged so that points that are close in three-dimensional space map onto nearby points in the two-dimensional image plane. A specialized texture compression method (ABR, adaptive subdivision into blocks) is then applied to the resulting image. The main steps of the algorithm are as follows (see also the sketch after the list).

1. To convert a "slice" of four consecutive vertical planes" model point texture-based representation of voxels.

2. Scan the array of voxels width42sby:

the vertical passage is parallel to the plane of projection: first column, closest to the plane of projection, then the next column, and so on (i.e., in the usual manner of passing a two-dimensional array);

- passing of voxels within each subcube 444 in the manner similar to the procedure used when passing subcubes nodes ochoterena.

3. Record the color of the points of the model detected during this procedure, in the auxiliary one-dimensional array.

4. To reorder the received array of colors in the two-dimensional image.

5. Serial 64 sample colors are arranged in columns in a block of 88 pixels, the next 64 samples are reordered in a neighboring block of 88 pixels, and so on,

6. Compress the resulting image method ABR.
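As an illustration of steps 4 and 5, the following C++ sketch packs the auxiliary one-dimensional color array into an image made of column-wise-filled 8×8 blocks; the Rgb type and the zero-padding of the last block are assumptions made for the example.

#include <cstddef>
#include <vector>

struct Rgb { unsigned char r, g, b; };

std::vector<std::vector<Rgb>> packBlocks(const std::vector<Rgb>& colors) {
    const int nBlocks = int((colors.size() + 63) / 64);   // pad the last block
    std::vector<std::vector<Rgb>> img(8, std::vector<Rgb>(8 * nBlocks));
    for (std::size_t i = 0; i < colors.size(); ++i) {
        const int block = int(i / 64), k = int(i % 64);
        const int col = 8 * block + k / 8;   // each block is filled by columns
        const int row = k % 8;
        img[row][col] = colors[i];
    }
    return img;
}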

This way of scanning the three-dimensional array and mapping the result onto a two-dimensional image was chosen for the following reasons. Note that a 4×4×4 subcube and an 8×8 image block contain the same number of samples. If several sequentially scanned subcubes contain enough color samples to fill an 8×8 block, that block is very likely to be fairly uniform after decompression. The ABP algorithm compresses 8×8 blocks independently of one another using local palettization (reducing the number of bits per pixel by representing the block with a small color palette). In our tests, the distortion introduced into the final three-dimensional model by ABP compression was significantly smaller than the distortion for the JPEG algorithm. Another reason for choosing this algorithm is its high decompression speed (for which it was originally designed). The compression ratio can take one of two values, 8 or 12; in the point-texture compression algorithm the applicant chose the compression ratio 8.

Unfortunately, this algorithm is not universally applicable. Although the image obtained by this method from the color field, shown in Fig.10b, is much more uniform than the image for the "natural" scanning order, some two-dimensional 8×8 blocks may contain color samples corresponding to points that are distant in three-dimensional space. In such cases the lossy ABP method can "mix" colors from distant parts of the model, which leads to local but noticeable distortion after decompression.

However, for many models this approach works well; the reduction of the model volume is in both cases about a factor of 7.

3. Test results

This section compares the compression results for two models, "Angel" and "Morton 256", in two different formats: octree image and point texture. The size of the reference images for each model was 256×256 pixels.

3.1. Compression of point textures

Tables 3-5 show the results of the different compression methods. The models for this experiment were obtained from models with an 8-bit "depth" field. The depth values were spread over the range (1, 2^30) by quantizing with the step 2^21+1, so as to make the bit distribution of the 32-bit depth values more uniform and thus simulate, to some extent, "true" 32-bit values (an 8-bit value v is mapped to v·(2^21+1), which is at most about 2^29).

High compression ratios should not be expected for this model. The reduction is of the same order as for standard lossless compression of true-color images. The compressed depth and color fields are of comparable size, since the geometric properties of the data are not exploited by this method.

Now let us see how many times the same models can be compressed losslessly when their "true" depth resolution is used. In contrast to the previous case, the redundancy of the geometric data is here much more pronounced: indeed, only a small fraction of the voxels is "black". However, since the uncompressed size of the models is smaller than in the 32-bit case, the "color" field now determines the overall compression ratio, which is smaller than in the 32-bit case (although the output files are also smaller). Thus it is desirable to be able to compress the color field at least as well as the "depth" field.

Our third method uses for this purpose the lossy compression method called ABP (described in section 2.2.3). This method gives much higher compression. However, like all lossy compression methods, it can in some cases lead to unpleasant artifacts. An example of an object for which this occurs is the "Angel" model. When the model points are scanned, spatially distant points sometimes fall into the same block of the two-dimensional image. The colors at distant locations of this model can differ very considerably, and local palettization cannot provide an accurate approximation when a block contains too many distinct colors. On the other hand, local palettization accurately compresses the vast majority of blocks, for which the introduced distortion is small. The visual quality of the "Morton 256" model compressed by the same method is excellent, and this was the case for most models in the conducted experiments.

3.2. Compression of octree images

Table 6 shows the sizes of the compressed and uncompressed octree components of the two test models. An approximately 1.6-1.9-fold reduction of this field is observed.

However, compared to uncompressed point-texture models, even those with an 8-bit "depth" field, octree images are much more compact. Table 7 shows compression ratios of 7.2 and 11.2. This is more than could be achieved by compressing the point textures without conversion to octree images (6.7 and 6.8 times, respectively). However, as mentioned above, an octree image may contain incomplete color information, which is the case for the "Angel" model. In such cases three-dimensional color interpolation is used.

As a result, we can conclude that the presented experiments demonstrate the effectiveness of the developed compression methods. Choosing the best tool for a given model depends on its geometric complexity and on the character of its textures and their components (the file sizes are rounded to kilobytes).

5. Comments on the study of ISO/IEC 14496-1/PDAM4

After applying the following revisions to the study of ISO/IEC 14496-1/PDAM4 (N4627), the revised study of ISO/IEC 14496-1/PDAM4 should be incorporated into ISO/IEC 14496-1/FPDAM4.

Section 6.5.3.1.1, technical change

Problem: the default value of the "orthographic" field should be the most commonly used value.

Solution: change the default value of the "orthographic" field from FALSE to TRUE, as shown below.

Proposed change:

Section 6.5.3.1.1, technical change

Problem: DIBR stream formation should be consistent with the uniform stream formation for AFX.

Solution: delete the "depthImageUrl" field from the DepthImage node. Proposed change:

Section 6.5.3.1.2, editorial change

Problem: the term "normalized" is misleading when applied to the "depth" field in the present context.

Solution: in the 5th paragraph, replace "normalized" with "scaled".

Proposed change:

The nearPlane and farPlane fields specify the distances from the viewpoint to the near plane and the far plane. The depth data are scaled to the distance from nearPlane to farPlane.

Section 6.5.3.1.2, technical change

Problem: DIBR stream formation should be consistent with the uniform stream formation for AFX.

Solution: delete the definition of the "depthImageUrl" field (7th paragraph and below).

Proposed change:

Section 6.5.3.2.2, editorial change

Problem: the semantics of the "depth" field are defined incompletely.

Solution: change the specification of the "depth" field in the 3rd paragraph as follows.

The "depth" field specifies the depth for each pixel in the "texture" field. The depth map must be of the same size as the image or movie in the "texture" field. The "depth" field must be one of the texture node types (ImageTexture, MovieTexture or PixelTexture), where only nodes representing grayscale images are allowed. If the "depth" field is not specified, the alpha channel of the "texture" field is used as the depth map. If the depth map is specified neither through the "depth" field nor through the alpha channel, the result is undefined.

The "depth" allows you to calculate the current rasstoyania the near plane and far plane:

where d is the depth value and dmax- the maximum depth value. It is assumed that for point models d>0, where d=1 corresponds to the far plane a d=dmaxcorresponds to the near plane.

This formula holds for the case of the perspective view, and for the orthogonal case, since d is the distance between point and plane, dmaxis the maximum value d, which may be represented by bits used for each pixel.
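As an illustration (with values assumed here, not taken from the specification): for nearPlane = 10, farPlane = 100 and an 8-bit depth field (dmax = 255), a pixel with d = 255 lies on the near plane (dist = 10), a pixel with d = 1 lies on the far plane (dist = 100), and d = 128 gives dist = 10 + 90·(1 - 127/254) = 55, exactly halfway between the planes.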

(1) If the depth is specified through the "depth" field, the depth value d equals the grayscale value.

(2) If the depth is specified through the alpha channel of the image given in the "texture" field, the depth value d equals the alpha channel value.

The depth value is also used to indicate which points belong to the model: only points with nonzero d belong to the model.

For animated image-based models with depth, only DepthImage nodes with SimpleTextures as diTextures are used.

Each of the SimpleTextures can be animated in one of the following ways:

(1) the "depth" field is a still image (satisfying the above condition), and the "texture" field is an arbitrary MovieTexture;

(2) the "depth" field is an arbitrary MovieTexture satisfying the above condition on the "depth" field, and the "texture" field is a still image;

(3) both "depth" and "texture" are MovieTextures, and "depth" satisfies the above condition;

(4) the "depth" field is not used, and the depth is extracted from the alpha channel of the MovieTexture used as the "texture".

Section 6.5.3.3, editorial change

Problem: the semantics of the "depth" field are not defined completely.

Solution: replace the specification of the "depth" field (3rd paragraph) with the proposed amended version.

Proposed amended text:

The geometric meaning of the depth values, and all conventions on their interpretation adopted for SimpleTexture, are applicable here as well.

The "depth" field specifies multiple depths for each point in the projection plane, which is assumed to coincide with the far plane (see above), in the following order: the traversal starts from the point in the lower left corner, moves right to the end of a horizontal line, and then moves to the line above. For each point, the number of depths (pixels) is stored first, followed by that number of depth values.

Section 6.5.3.4.1, technical change

Problem: the current type of the "octree" field can lead to incompatible values.

Solution: change the type of the "octree" field to MFInt32.

Proposed change:

In section 6.5.3.4.1:

In section H.1, in the table for the OctreeImage, change the "octree" column as follows:

Section 6.5.3.4.1, technical change

Problem: DIBR stream formation should be consistent with the uniform stream formation for AFX.

Solution: delete the "octreeUrl" field from the OctreeImage node.

Proposed change:

Section 6.5.3.4.2, editorial change

Problem: the definition of the "octreeResolution" field (2nd paragraph) admits an incorrect interpretation.

Solution: change the description by adding the word "allowed".

The "octreeResolution" field specifies the maximum allowed number of octree leaves along a side of the enclosing cube. The octree level can be determined from octreeResolution using the following equation: octreeLevel = int(log2(octreeResolution - 1)) + 1

Section 6.5.3.4.2, technical change

Problem: DIBR stream formation should be consistent with the uniform stream formation for AFX.

Solution: delete the definition of the "octreeUrl" field.

Section 6.5.3.4.2, editorial change

Problem: the animation of the OctreeImage is not described completely.

Solution: add a paragraph at the end of Section 6.5.3.4.2 describing the animation of the OctreeImage.

Proposed change:

Animation of the OctreeImage can be performed by the same method as the first three ways of animating the image-based model with depth, with the only difference that the "octree" field is used instead of the "depth" field.

Section H.1, technical change

Problem: the depth data range of the PointTexture node may be too small for future applications. Many graphics software tools use 24-bit or 32-bit depth data for their z-buffers. However, the "depth" field of the PointTexture currently has the range [0, 65535], which corresponds to 16 bits.

Solution: in section H.1, in the table for the PointTexture, change the range of the "depth" column as proposed.

IV. ISO/IEC JTC 1/SC 29/WG 11, Coding of moving pictures and audio

1. Introduction

This document describes the OctreeImage node of the Depth Image-Based Representation (DIBR), AFX A8.3. The OctreeImage node was accepted and included in the proposal; in some cases, however, the rendering quality is unsatisfactory because of occluded geometry of the object. This document describes an improved version of the OctreeImage node, the textured binary volumetric octree (TBVO), and the way it is compressed for streaming.

2. Textured binary volumetric octree (TBVO)

2.1. The main properties of TBVO

The purpose of TBVO is to create a more flexible representation/compression format with fast rendering, as an improvement of the binary volumetric octree (BVO). This is achieved by storing some additional information on top of the BVO. A BVO-based representation consists of (octree structure + set of reference images), while a TBVO-based representation consists of (BVO octree structure + set of reference images + camera indices).

The main problem of BVO visualization is that during rendering we must determine the corresponding camera index of each voxel. For this purpose, we must not only project voxels onto the cameras but also perform the reverse ray-casting procedure. At the very least, we must determine whether there exists a camera from which a given voxel is visible; therefore, we must find for every voxel the cameras onto which it projects without occlusion. We have developed an algorithm that performs this quickly and accurately for most object shapes. Nevertheless, difficulties remain for voxels that are not visible from any camera.

A possible solution could be to store an explicit color for each voxel. However, in this case we encountered problems with the compression of the color information: if the voxel colors are grouped in an image format and compressed, the color correlation of neighboring voxels is destroyed, so the compression ratio is poor.

In TBVO this problem is solved by storing the camera (image) index for each voxel. The index is usually the same for large groups of voxels, which allows the octree structure to be used for economical storage of the additional information. Note that, on average, only a 15% volume increase was observed in experiments with our models. The modeling is somewhat more complex, but it provides a more flexible way of representing objects of any geometry.

The advantages of TBVO over BVO are that its rendering is simpler and faster and that, evidently, no restrictions are imposed on the geometry of the object.

2.2. An example of TBVO

Fig.12a shows the BVO model "Angel". With the usual six BVO textures, some parts of the body and the wing are not visible from any camera, which leads to rendered images with a large number of visible "cracks". In the TBVO representation of the same model a total of 8 cameras is used (the 6 cube faces + 2 additional cameras). Fig.13a shows the image of the camera indices, where different colors denote different camera indices. The additional cameras are placed inside the cube, viewing the front and back faces orthographically. Figs.13b and 13c show the additional images taken by the additional cameras. As a result we obtain a smooth and crisply rendered model, as shown in Fig.12b.

2.3. Description of the uncompressed TBVO stream

We assume that 255 cameras are enough and assign 1 byte to the index. The TBVO stream is a stream of symbols. Each TBVO symbol is either a BVO symbol or a texture symbol.

A texture symbol denotes a camera index, which may be a specific number or the code "undefined". In the following description the "undefined" code is written as "?".

The TBVO stream is traversed in breadth-first order. Let us describe how the TBVO stream is written, given a BVO with a camera index assigned to every voxel. All BVO nodes, including the leaf nodes (which have no BVO symbol), must be visited in breadth-first order. The following pseudocode completes the writing of the stream.

If CurNode is not a leaf node

{

Write the BVO symbol corresponding to this node

}

if all the children of CurNode have the same camera index (texture symbol)

{

if CurNode has camera index "?"

Write the camera index common to the children

}

else

{

Write "?"

}

According to this procedure, for the TBVO tree shown in Fig.14a the stream of symbols shown in Fig.14b is obtained. In this example each texture symbol is represented by one byte. However, in the real stream each texture symbol needs only 2 bits, since only three values have to be represented (two cameras and the undefined code).
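A compact C++ sketch of this breadth-first writing procedure is given below. It is only an illustration of the rules above: the node layout, the use of -1 for the "?" code, and the precomputed per-subtree camera field are assumptions, not part of the format.

#include <cstdint>
#include <queue>
#include <vector>

struct TbvoNode {
    uint8_t bvoSymbol = 0;            // filling pattern; meaningless for leaves
    int camera = -1;                  // camera shared by the whole subtree, or -1 ("?")
    std::vector<TbvoNode*> children;  // empty for leaf nodes
    bool isLeaf() const { return children.empty(); }
};

// Writes BVO symbols and texture symbols in breadth-first order.
// "resolved" marks subtrees whose camera index was already written above.
void writeStream(TbvoNode* root, std::vector<int>& out) {
    struct Item { TbvoNode* n; bool resolved; };
    std::queue<Item> q;
    q.push({root, false});
    while (!q.empty()) {
        Item it = q.front(); q.pop();
        if (!it.n->isLeaf())
            out.push_back(it.n->bvoSymbol);   // BVO symbol of an internal node
        if (!it.resolved) {
            out.push_back(it.n->camera);      // texture symbol: index, or -1 for "?"
            if (it.n->camera >= 0) it.resolved = true;  // children inherit the index
        }
        for (TbvoNode* c : it.n->children) q.push({c, it.resolved});
    }
}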

2.4. Compression of TBVO

The "octreeimages" and "octree" fields of the OctreeImage node are compressed separately. The described methods were developed on the premise that the "octree" field must be compressed losslessly, while some degree of visually acceptable distortion is allowed for the octree images.

2.4.1. Compression of the "octreeimages" field

The "octreeimages" field is compressed either by the MPEG-4 image compression algorithm (for a static model) or by the MPEG-4 video compression tools (for animated models). In our approach we used the JPEG format for the octree images, after some preprocessing that we call "minimization" of the JPEG images: for each texture, only the points necessary for three-dimensional rendering are kept; in other words, the parts of a given texture that are not used at the three-dimensional rendering stage can be compressed as coarsely as desired.

2.4.2. Compression of the "octree" field

Octree compression is the most important part of the OctreeImage compression, since it deals with compressing a representation that is already a very compact linkless binary tree. Nevertheless, in our experiments the method explained below reduced this structure to approximately half of its original volume. In the animated OctreeImage version, the "octree" field is compressed separately for each three-dimensional frame.

2.4.2.1. The context model

Compression is performed by a variant of adaptive arithmetic coding (implemented as a "range coder") that makes explicit use of the geometric nature of the data. The octree is a stream of bytes. Each byte represents a node (i.e., a subcube) of the tree, in which the bits indicate the occupancy of the subcubes after internal subdivision. This bit pattern is called the filling pattern of the node. The described compression algorithm processes the bytes one by one, as follows.

- The context for the current byte is determined.

- The probability (normalized frequency) of the occurrence of the current byte in this context is retrieved from the "probability table" (PT) corresponding to the context.

- The probability value is fed into the range coder.

- The current PT is updated by adding 1 to the frequency of the current byte in the current context (if necessary, with subsequent renormalization, as described in more detail below).

Thus, coding is the process of constructing and updating the PTs according to the context model. In context-based adaptive arithmetic coding schemes (such as "prediction by partial matching"), the context of a symbol is usually a string of several preceding symbols. In our case, however, the compression efficiency is increased by exploiting the octree structure and the geometric properties of the data. The described approach is based on two ideas that are apparently new in the problem of octree compression.

A. For the current node, the context is either its parent node or the pair {parent node, position of the current node within the parent node}.

B. It is assumed that the "probability" of the occurrence of a given node at a given geometric position within a given parent node is invariant with respect to a specific set of orthogonal transformations (such as rotations or symmetries).

Assumption "B" is illustrated in Fig.6 for the transformation R, which is the rotation by -90° in the x-z plane. The main idea behind "B" is that the probability of encountering a particular type of child node at a particular position within a particular type of parent node should depend only on their relative positions. This assumption is confirmed in our experiments by analysis of the probability tables. It allows more complex contexts to be used without requiring too many probability tables. This, in turn, contributes to achieving quite good results in terms of both data size and speed. Note that the more complex the contexts used, the more accurate the probability estimates and the more compact the code.

We introduce the set of transformations for which we assume the invariance of the probability distributions. To be applicable in our situation, such transformations must preserve the enclosing cube. Consider the set G of orthogonal transformations of Euclidean space obtained by all compositions, in any number and order, of three basic transformations (generators) m1, m2 and m3, where m1 and m2 are the reflections in the planes x = y and y = z, respectively, and m3 is the reflection in the plane x = 0.

One of the classical results of the theory of groups generated by reflections asserts that G contains 48 distinct orthogonal transformations and is, in effect, the maximal group of orthogonal transformations taking the cube into itself (the so-called Coxeter group). For example, the rotation R in Fig.6 is expressed in terms of the generators as

R = m3·m2·m1·m2,

where "·" denotes matrix multiplication.

A transformation from G applied to an octree node yields a node with a different filling pattern of the subcubes. This allows the nodes to be categorized according to the filling patterns of their subcubes. Using group-theoretic terminology, we say that G acts on the set of all filling patterns of octree nodes. Calculation shows that there are 22 distinct classes (also called orbits in group theory), in which, by definition, two nodes belong to the same class if and only if they are related by a transformation from G. The number of elements in a class ranges from 1 to 24 and is always a divisor of 48.
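The class count can be verified directly. The following self-contained C++ sketch generates G from the three generators, represented as permutations of the eight subcubes, and counts the orbits of the 256 filling patterns; the octant indexing convention i = x + 2y + 4z is an assumption made for the illustration. It prints |G| = 48 and 22 classes, in agreement with the statements above.

#include <cstdio>
#include <set>
#include <vector>

// Applies an octant permutation to a filling pattern (one bit per subcube).
static int applyPerm(const int* perm, int sym) {
    int out = 0;
    for (int i = 0; i < 8; ++i)
        if (sym & (1 << i)) out |= 1 << perm[i];
    return out;
}

int main() {
    // Generators m1 (swap x,y), m2 (swap y,z), m3 (mirror x) as permutations
    // of the octants (x,y,z), indexed by i = x + 2y + 4z.
    int m[3][8];
    for (int i = 0; i < 8; ++i) {
        int x = i & 1, y = (i >> 1) & 1, z = (i >> 2) & 1;
        m[0][i] = y + 2 * x + 4 * z;        // m1: (x,y,z) -> (y,x,z)
        m[1][i] = x + 2 * z + 4 * y;        // m2: (x,y,z) -> (x,z,y)
        m[2][i] = (1 - x) + 2 * y + 4 * z;  // m3: (x,y,z) -> (1-x,y,z)
    }
    // Close the generators into the full group G (expected size: 48).
    std::set<std::vector<int>> G;
    G.insert({0, 1, 2, 3, 4, 5, 6, 7});     // identity
    for (bool grew = true; grew; ) {
        grew = false;
        std::set<std::vector<int>> next = G;
        for (const auto& g : G)
            for (int k = 0; k < 3; ++k) {
                std::vector<int> h(8);
                for (int i = 0; i < 8; ++i) h[i] = m[k][g[i]];
                if (next.insert(h).second) grew = true;
            }
        G = next;
    }
    // Count the orbits of G on the 256 filling patterns (expected: 22).
    bool seen[256] = {false};
    int classes = 0;
    for (int s = 0; s < 256; ++s) {
        if (seen[s]) continue;
        ++classes;
        for (const auto& g : G) seen[applyPerm(g.data(), s)] = true;
    }
    std::printf("|G| = %zu, classes = %d\n", G.size(), classes);
    return 0;
}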

A practical consequence "In" is that the table veroyatnostei is, there is 256 tables for context-based generating element, and additionally 256x8=2048 tables for context, based on the position of generating and child elements in the first case, though it takes only 22 tables for context, based on the class of the parent element, plus 22x8=176 tables in the latter case. Therefore, you can use the context of equivalent complexity with a relatively small number of tables of probabilities. Formed TV will have the form as shown in table 8.

2.4.2.2. The encoding process

To make the statistics of the probability tables more accurate, the statistics are collected in different ways at three stages of the encoding process.

- At the first stage no context is used at all: the "0-context model" is adopted, and a single probability table with 256 entries is maintained, starting from the uniform distribution.

- As soon as the first 512 nodes (an empirically found number) are encoded, we switch to the "1-context model", which uses the parent-node class as the context. At the switch, the 0-context PT is copied to the PTs of all 22 contexts.

- After 2048 more nodes (another heuristic value) are encoded, we switch to the "2-context model". At this moment, the 1-context PTs are copied to the PTs corresponding to each child position within the same class of parent node.

The key point of this algorithm is the determination of the context and the probability for the current byte. This is implemented as follows. In each class we fix a single element, called the "standard element" of the class. We maintain a class map table (CMT) indicating, for each of the 256 possible nodes, the class to which it belongs, together with the precomputed transformation from G that takes the given node into the standard element of its class. Thus, to determine the probability of the current node N, the following steps are performed:

- Determine the parent node P of the current node.

- Retrieve from the CMT the class to which P belongs and the transformation T that takes P into the standard node of the class. Let the class number be c.

- Apply T to P and find the child position p in the standard node into which the current node N is mapped.

- Apply T to N. The newly obtained filling pattern TN is then at position p in the standard node of class c.

- Retrieve the required probability from the entry TN of the probability table corresponding to the class-position pair (c, p).

For the 1-context model, the above steps are modified in the obvious way. Needless to say, all transformations are precomputed and implemented as lookup tables. Note that at the stage of decoding node N, its parent node P has already been decoded and, therefore, the transformation T is known. All steps of the decoding are completely identical to the corresponding steps of the encoding procedure.

Finally, let us describe the procedure for updating the probabilities. Let P be the probability table of a certain context. Denote by P(N) the entry of P corresponding to the probability of occurrence of node N in this context. In the described implementation, P(N) is an integer, and after each occurrence of N, P(N) is updated as

P(N) = P(N) + A,

where A is an integer increment parameter, typically varying from 1 to 4 for the different context models.

Let S(P) be the sum of all entries of P. Then the probability of N that is fed into the arithmetic coder (the range coder in our case) is computed as P(N)/S(P). As soon as S(P) reaches the threshold value 2^16, all entries are renormalized: to avoid zero values in P, entries equal to 1 are left as they are, while all the others are divided by 2.
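The update rule can be sketched in a few lines of C++; the struct name and layout are assumptions for illustration, not part of the specification.

#include <cstdint>

struct ProbabilityTable {
    uint32_t freq[256];
    uint32_t total;                      // S(P), maintained incrementally

    ProbabilityTable() : total(256) {
        for (auto& f : freq) f = 1;      // uniform start, no zero entries
    }

    // Called after each occurrence of symbol N; A is the increment (1..4).
    void update(int N, uint32_t A) {
        freq[N] += A;
        total += A;
        if (total >= (1u << 16)) {       // renormalize at the threshold 2^16
            total = 0;
            for (auto& f : freq) {
                if (f > 1) f /= 2;       // entries equal to 1 stay, avoiding zeros
                total += f;
            }
        }
    }

    // The value passed to the range coder is P(N)/S(P).
    double probability(int N) const { return double(freq[N]) / double(total); }
};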

2.4.2.3. Coding of the "camera nodes"

The stream of symbols specifying the texture (camera) number for each voxel is compressed using its own probability table. In the terms used above, it has a single context. The entries of this table are updated with a larger increment than the entries for the node symbols; in the rest, the process is identical to node-symbol coding.

2.5. Results of TBVO compression and visualization

Figs.15, 17, 18 and 19 show the TBVO compression results. Fig.16 shows the peeled images of the "Angel" and "Morton" models. The compressed size is comparable with compressed BVO: in the third column, the number in parentheses is the volume of the compressed geometry, while the first number is the total volume of the TBVO-compressed model (i.e., with the textures taken into account). As a measure of visual distortion, the peak signal-to-noise ratio (PSNR) was computed to estimate the color difference after the LDI→(T)BVO→LDI conversion. The compressed size of each model equals the size of all the textures (stored as minimized JPEGs, see section 2.4.1) plus the size of the compressed geometry. In the TBVO case, the compressed geometry also includes the camera information. The PSNR of TBVO is improved significantly in comparison with BVO.

TBVO provides faster rendering than BVO. For the "Angel" model, the frame rate of TBVO-12 is 10.8 fps, while for BVO it is 7.5 fps. For the "Morton" model, the corresponding rate is 3.0 fps for TBVO-12 and 2.1 fps for BVO (on an 850 MHz Celeron). On the other hand, rendering is accelerated much more for animated TBVO: for the "Dragon" model, the frame rate of TBVO-12 is 73 fps. We tested two ways of using 12 cameras: TBVO-12 and TBVO-(6+6). TBVO-12 uses the 6 BVO cameras (cube faces) plus 6 images taken from the center of the cube parallel to the faces. The (6+6) configuration uses the 6 BVO cameras, then removes ("peels") all voxels visible from these cameras and "photographs" the parts that become visible, with the same 6 cameras. Examples of such images are shown in Fig.16.

Note the significant difference in quality (both subjective and in PSNR values) between BVO and TBVO-6 for the "Angel" model. Although the same camera locations are used, TBVO allows camera indices to be assigned to all voxels, even those that are not visible from any camera. These indices are chosen so as to best match the original colors (i.e., for each point, the best color match over all camera images is selected, regardless of direct visibility). For the "Angel" model this gives a significant gain.

We also note the very moderate difference in the volume of the "geometry" (i.e., BVO plus cameras) between the 6- and 12-camera cases. In fact, the additional cameras typically cover small areas, so their identifiers are rare, and their textures are sparse and compress well. All this applies not only to the "Angel" model but to the other test models as well.

2.6. Node specification

The OctreeImage node defines the TBVO structure, comprising the octree structure, the corresponding array of camera indices, and the set of octree images.

The "image ochoterena" defines a node-set "image depth" with a simple structure for the field "di"; the "depth" of these nodes is a simple texture is not used. Field "orthogonal" must be TRUE for nodes "image depth". For each of the simple texture field texture stores the color information of the object or part of object type (for example, the cross-section plane of the camera), the obtained orthogonal camera position and orientation, which is defined in the respective fields of the image with depth. Part of the object corresponding to each camera, are distributed at the stage of model design. The division of the object using the field values of the position, orientation and texture is to minimize the number of cameras (or, equivalently, the images used ochoterena), and at the same time, to include all parts of the object, potentially observable from an arbitrarily selected position. Field orientation must satisfy the conditions: vector surveillance camera has only one nonzero component (i.e., perpendicular to one of the edges is the parties covering the cube.

Field oktodelete" completely describes the geometry of the object. The geometry is represented as a set of voxels that make up this object. Oktodelete is a tree data structure in which each node is represented by a byte. "1" in the i-th bit of this byte means that the child nodes exist for the i-th child node of the internal node; and "0" means that it is not. The internal nodes ochoterena should be the order of the passage width ochoterena. The order for the eight child nodes of the internal node is shown at 14b. The size of the covering cube full ochoterena equal to 111, and the center of the cube ochoterena must comply with the origin (0,0,0) in the local coordinate system.

The field "ID" camera contains an array of indexes of the camera assigned to the voxels. At the stage of rendering the color assigned to the leaves ochoterena, is determined by the orthogonal projection of the leaves on one of the images ochoterena with a specific index. The indexes are stored in a manner similar to oktodelete: if a particular camera can be used for all leaves contained in a specific node, the node containing the index of the camera is given in the flow, as stated which will be determined separately for the child sobeslav the current node (the same recursive way). If the field "ID" camera is empty, then the indices of the chamber are defined in the rendering stage (as in the case BWO).

The "resolution ochoterena" specifies the maximum number of leaves ochoterena along side covering Cuba. Level ochoterena can be determined from the resolution ochoterena using the ratio of:

Level oktodelete =log2(resolution ochoterena)

2.7. Bitstream specification

2.7.1. Octree compression

2.7.1.1. Overview

The OctreeImage node in the depth image-based representation defines the octree structure and the projected textures. Each texture stored in the array of octree images is defined by a DepthImage node with a SimpleTexture. The other fields of the OctreeImage node can be compressed by octree compression.

2.7.1.2. Octree

2.7.1.2.1. Syntax

class Octree ()

{

OctreeHeader ();

aligned bit (32)* next;

while (next == 0x000001C8)

{

aligned bit (32) octree_frame_start_code;

OctreeFrame (octreeLevel);

aligned bit (32)* next;

}

}

2.7.1.2.2. Semantics

The compressed octree stream contains the octree header and one or more octree frames, each of which is preceded by octree_frame_start_code. The value of octree_frame_start_code is always 0x000001C8. This value is detected by look-ahead parsing of the stream.

2.7.1.3. Octree header

2.7.1.3.1. Syntax

class OctreeHeader ()

{

unsigned int (5) octreeResolutionBits;

unsigned int (octreeResolutionBits) octreeResolution;

int octreeLevel=ceil(log(octreeResolution)/log(2));

unsigned int (3) textureNumBits;

unsigned int (textureNumBits) numOfTextures;

}

2.7.1.3.2. Semantics

This class reads the header information for the octree compression.

The octreeResolution field, whose length is given by octreeResolutionBits, contains the value of the "octreeResolution" field of the OctreeImage node. This value is used to derive the octree level.

The numOfTextures field, whose length is textureNumBits, describes the number of textures (or cameras) used in the OctreeImage node. This value is used for the arithmetic coding of the camera ID of each octree node. If the value of textureNumBits is 0, the texture symbols are not coded, and the "current texture" of the root node is set to 255.
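For example (with illustrative values, not prescribed by the specification): octreeResolutionBits = 9 allows the value octreeResolution = 256 to be read, giving octreeLevel = ceil(log2(256)) = 8; with textureNumBits = 4, values of numOfTextures up to 15 can be represented, while textureNumBits = 0 means that no texture symbols are coded at all.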

2.7.1.4. Octree frame

2.7.1.4.1. Syntax

class OctreeFrame (int octreeLevel)

{

for (int curLevel=0; curLevel<octreeLevel; curLevel++)

{

for (int nodeIndex=0; nodeIndex<nNodesInCurLevel; nodeIndex++)

{

int nodeSym = ArithmeticDecodeSymbol (contextID);

if (curTexture == 0)

{

curTexture = ArithmeticDecodeSymbol (textureContextID);

}

}

}

}

2.7.1.4.2. Semantics

This class reads one octree frame in breadth-first traversal order. Starting from the first node at level 0, after every node of the current level has been read, the number of nodes at the next level is known by counting the "1" bits of each node symbol. At the next level, this number of nodes (nNodesInCurLevel) will be read from the stream.

For the decoding of each node, the appropriate contextID is set, as described in section 2.7.1.6.

If the texture (camera) ID of the current node (curTexture) has not been defined by the parent node, the texture ID is also read from the stream, using the texture context given by textureContextID. If a nonzero value is retrieved (i.e., the texture ID is defined), this value is also applied to all children of the node at the subsequent levels. After every node has been decoded, the textureID is assigned to those leaf nodes of the octree that have not yet been assigned a textureID value.

2.7.1.5. Adaptive arithmetic decoding

This section describes the adaptive arithmetic coder used in the octree compression, in C++-style syntax. aa_decode() is the function that decodes a symbol using the model specified through the array cumul_freq[], and PCT is the array of probability context tables.

int ArithmeticDecodeSymbol (int contextID)

{

unsigned int MAXCUM = 1<<13;

unsigned int TextureMAXCUM = 256;

int *p, allsym, maxcum;

if (contextID != textureContextID)

{

p = PCT[contextID];

allsym = 256;

maxcum = MAXCUM;

}

else

{

p = TexturePCT;

allsym = numOfTextures;

maxcum = TextureMAXCUM;

}

int cumul_freq[allsym];

int cum = 0;

for (int i = allsym-1; i >= 0; i--)

{

cum += p[i];

cumul_freq[i] = cum;

}

if (cum > maxcum)

{

cum = 0;

for (int i = allsym-1; i >= 0; i--)

{

p[i] = (p[i]+1)/2;

cum += p[i];

cumul_freq[i] = cum;

}

}

return aa_decode (cumul_freq);

}

2.7.1.6. The process of decoding

The general structure of the decoding process was described above (see also the description of the encoding process). It shows how the TBVO nodes are obtained from the stream of bits forming the arithmetically encoded (compressed) TBVO model.

At each step of the decoding process it is necessary to update the context number (i.e., the index of the probability table being used) and the probability table itself. We call the set of all probability tables (arrays of integers) the probabilistic model; the j-th element of the i-th probability table, divided by the sum of the elements of that table, estimates the probability of occurrence of the j-th symbol in the i-th context.

The procedure of updating a probability table is as follows. Initially, the probability tables are set so that all entries are equal to 1. Before a symbol is decoded, the context number (contextID) must be chosen; the arithmetic decoder then uses the probability table identified by this contextID. After that, the probability table is updated by adding the adaptation step to the frequency of the decoded symbol. If the total (cumulative) sum of the entries of the table exceeds the cumulative threshold, normalization is performed (see section 2.7.1.5).

2.7.1.6.1. Context modeling of the texture symbols

The texture symbols are modeled with a single context. This means that only one probability table is used. The size of this table equals numOfTextures plus one. Initially, this table is filled with "1"s. The maximum entry value is set to 256. The adaptation step is set to 32. This combination of parameter values allows adaptation to a highly variable stream of texture numbers.

2.7.1.6.2. Context modeling of the node symbols

There are 256 different node symbols, each representing a 2×2×2 binary matrix of voxels. Three-dimensional orthogonal transformations can be applied to these matrices, transforming the symbols into one another.

Consider the set of 48 fixed orthogonal transformations, i.e., the rotations by 90·n degrees (n = 0, 1, 2, 3) about the coordinate axes and the reflections in the planes x = y, y = z and x = 0, together with their compositions; in the implementation these are stored as a precomputed array of 48 transformations.

There are 22 sets of symbols, called classes, such that two symbols are related by one of these transformations exactly when they belong to the same class. The coding method constructs the PCTs as follows: the contextID of a symbol equals either the number of the class to which its parent belongs, or the combined number (class of the parent, position of the current node within the parent node). This significantly reduces the number of contexts and thereby the time required to gather meaningful statistics.

For each class, a single base symbol is defined (see Table 9), and for each symbol, the orthogonal transformation that takes it into the base symbol of its class is precomputed (in the actual encoding/decoding process a lookup table is used). After the contextID of a symbol has been determined, the transformation inverse (i.e., the transposed matrix) to the one that takes its parent into the base element is applied to the symbol. Table 10 lists the contexts and the corresponding direct transformation for each symbol.

The context model depends on the number N of symbols already decoded.

For N < 512, the 0-context model is used: a single probability table, initialized with the uniform distribution, serves all symbols.

For 512 ≤ N < 2560 (= 2048 + 512), the 1-context model is used (in the sense that the context number is a single parameter, the class number). This model uses 22 PCTs. The contextID is the number of the class to which the parent of the decoded node belongs. This number can be determined from the lookup tables (see Table 10), since the parent node is decoded earlier than its children. Each of the 22 PCTs is initialized with the PCT of the previous stage. The number of symbols in each probability table is 256. The adaptation step is 3. The maximum cumulative frequency is also 8192. After a symbol has been decoded, it is transformed using the inverse orthogonal transformation defined above. The number of the orthogonal transformation can be found in Table 10 under the ID of the node symbol of the parent of the current node.

When 2560 symbols have been decoded, the decoder switches to the 2-context model (in the sense that the context number now consists of two parameters, as explained below). This model uses 176 (= 22·8, i.e., 22 classes times 8 positions) PCTs. The contextID here depends on the parent class and on the position of the current node within the parent node. At the moment of switching, for all 8 positions the PCT is a clone of the PCT obtained for the given class at the previous stage, so the initial statistics depend only on the class and not yet on the position. The number of symbols in each probability table is 256. The adaptation step is 4. The maximum cumulative frequency is again 8192.

After a symbol has been decoded, it is likewise transformed using the inverse orthogonal transformation (see Table 10), as in the earlier models.

The geometry of the base element of each class is easily obtained from Table 10: the base elements are exactly those symbols whose orthogonal transformation ID is 0 (the number 0 is assigned to the identity transformation).

Below, the MPEG-4 node specification and the compression techniques of the OctreeImage formats, used in the device and method for depth image-based representation of three-dimensional objects according to the present invention, are explained in more detail.

The present invention describes a family of data structures, depth image-based representations (DIBR), which provide effective and efficient representations based mostly on images and depth maps. The main DIBR formats are SimpleTexture, PointTexture and OctreeImage.

Fig.20 is a diagram showing an example of a texture image and the corresponding depth map; Fig.21 is a diagram showing an example of a layered depth image: (a) projection of the object; (b) layers of pixels.

SimpleTexture is a data structure consisting of an image, the corresponding depth map, and a camera description (position, orientation and type, orthographic or perspective). The representational capability of a single SimpleTexture is restricted to objects such as the facade of a building: a frontal image together with a depth map allows the reconstruction of facade views over a significant range of angles. However, a collection of SimpleTextures produced by properly positioned cameras makes it possible to represent the entire building, provided the reference images cover all potentially visible parts of the building surface. The same applies, of course, to trees, human figures, cars, and so on. Moreover, a set of SimpleTextures provides a natural means for handling three-dimensional animation data. In this case the reference images are replaced with reference video streams, and the depth maps can be represented either in the alpha channels of these video streams or by a separate grayscale video stream. In this type of representation, the images can be stored in lossy compression formats such as JPEG. This significantly reduces the volume of the color information, especially in the animated case. However, the geometry data (the depth maps) should be compressed losslessly, which affects the overall saving of memory.

For complex objects it is sometimes difficult to cover the entire visible surface with an acceptable number of reference images. A preferable representation for such cases may be PointTexture. This format also stores a reference image and a depth map, but in this case both are multivalued: for each line of sight of the camera (orthographic or perspective), a color and a distance are stored for every intersection of the line with the object. The number of intersections may vary from line to line. The union of several PointTextures provides a very detailed representation even of complex objects. This format, however, lacks the two-dimensional regularity of SimpleTexture and therefore has no natural image-based compressed form; for the same reason it is used only for still objects. The OctreeImage format occupies an intermediate position between the "mostly two-dimensional" SimpleTexture and the "mostly three-dimensional" PointTexture: it stores the geometry of the object in an octree-structured volumetric representation (a hierarchy of regularly organized voxels obtained by the usual binary subdivision of the enclosing cube), while the color component is represented by a set of images. This format also contains an additional octree-like data structure that stores, for each leaf voxel, the index of the reference image containing its color. At the OctreeImage rendering stage, the color of a leaf voxel is determined by its orthographic projection onto the corresponding reference image. We have developed a highly efficient compression method for the geometric part of OctreeImage. It is a variant of context-based adaptive arithmetic coding in which the contexts are constructed using explicit geometric properties of the data. The use of this compression together with lossy-compressed reference images makes OctreeImage a very space-efficient representation. Like SimpleTexture, OctreeImage has an animated version: reference video streams instead of reference images, plus two additional octree streams representing the geometry and the voxel-to-image correspondence for each three-dimensional frame. A very useful property of the OctreeImage format is its inherent multiresolution capability.

The DIBR formats have been developed for inclusion in the MPEG-4 AFX standard. AFX provides more enhanced features for synthetic MPEG-4 environments and includes a set of interoperable tools that provide a reusable architecture for interactive animated content (compatible with existing MPEG-4). Each AFX tool is compatible with BIFS nodes, synthetic streams and audiovisual streams. The current version of AFX comprises high-level animation descriptions (for example, skeleton-based animation with skinning), enhanced rendering (for example, procedural texturing, light-field mapping), compact representations (for example, NURBS, solid representations, subdivision surfaces), low-bit-rate animation (for example, interpolator compression) and others, as well as our proposed DIBR.

The DIBR formats were designed to combine the advantages of various previously proposed ideas, providing the user with flexible tools best suited to a particular task. For example, the non-animated SimpleTexture and PointTexture are particular cases of known formats, whereas OctreeImage is an apparently new representation. But in the context of MPEG-4, all three basic DIBR formats can be considered as building blocks, and their combinations by means of the MPEG-4 constructs not only cover many of the image-based concepts proposed in the literature, but also provide significant potential for constructing new formats.

The following describes the depth image-based representation.

Taking into account the ideas described above, as well as some of our own developments, we propose the following set of image-based formats for use in MPEG-4 AFX: SimpleTexture, PointTexture, DepthImage and OctreeImage. Note that SimpleTexture and OctreeImage have animated versions.

SimpleTexture is a single image combined with a depth image. It is equivalent to RT (relief texture), while PointTexture is equivalent to LDI (layered depth image).

Using SimpleTexture and PointTexture as building blocks, a variety of representations can be constructed by means of the MPEG-4 constructs. The formal specification is given below; here we describe the result geometrically.

A DepthImage structure defines either a SimpleTexture or a PointTexture together with a bounding box. A set of DepthImages can be unified under a single structure, called a Transform node, which makes it possible to build many useful representations. The two most commonly used ones have no special definition within MPEG-4, but in our practice we call them box texture (BT) and generalized box texture (GBT). BT is a set of six SimpleTextures corresponding to the faces of a cube enclosing the object or the scene, while GBT is an arbitrary collection of any number of SimpleTextures that together provide a consistent three-dimensional representation. An example of BT is shown in Fig.22, which shows the reference images, the depth maps and the resulting three-dimensional object. BT can be rendered with the aid of the incremental warping algorithm, but we use a different approach, applicable to GBT as well. An example of a GBT representation is shown in Fig.23, where 21 SimpleTextures are used to represent a complex object, a palm tree.

It should be noted that the unification mechanism allows, for instance, the same object, or parts of the same object, to be represented several times by different cameras. Therefore, data structures such as image-based objects are particular cases of this format, which permits much greater flexibility in adapting the positions and resolutions of SimpleTextures and PointTextures to the structure of the scene.

The following describes OctreeImage: the textured binary volumetric octree (TBVO).

In order to use multiresolution geometry and textures with a more flexible representation and fast rendering, we have developed the OctreeImage representation, which is based on the textured binary volumetric octree (TBVO). The purpose of TBVO is a flexible representation/compression format with fast high-quality rendering. TBVO consists of three main components: a binary volumetric octree (BVO), which represents the geometry, a set of reference images, and the image indices corresponding to the octree nodes.

Geometric information in the form of a BVO is a set of binary (occupied or empty) regularly spaced voxels combined into larger cells in the usual octree manner. This representation can easily be obtained from DepthImage data through the intermediate form of a "point cloud", since each pixel with depth defines a unique point in three-dimensional space; standard voxelization likewise allows the conversion of polygonal models to BVO. The texture information of the BVO can be obtained from the reference images. A reference image is the texture of the voxels for a given position and orientation of the camera. Hence, the BVO together with the reference images already provides a representation of the model. However, it turned out that an additional structure, storing the index of the reference image for each BVO leaf, provides much faster rendering with better quality.

The main problem of BVO visualization is that during rendering we must determine the corresponding camera index of each voxel. To this end, we must first determine the cameras from which a given voxel is visible. This procedure is very slow if performed by brute force. In addition to this problem, certain difficulties remain for voxels that are not visible from any camera, which leads to undesirable artifacts in the rendered image.

A possible solution could be to store an explicit color for each voxel. However, in this case we would face problems with the compression of the color information: if the voxel colors are grouped in an image format and compressed, the correlation of the colors of neighboring voxels is destroyed, making the compression ratio unsatisfactory. In TBVO this problem is solved by storing the camera (image) index for each voxel. The index is typically the same for large groups of voxels, which allows the octree structure to be used for economical storage of the additional information. Note that in the experiments with our models, on average, only a 15% volume increase was observed compared to the representation using only the BVO and the reference images. The modeling is slightly more complex, but it provides a more flexible way of representing objects of any geometry.

Note that TBVO is a very convenient representation for splat-based rendering, because the splat size is easily computed from the voxel size. The color of a voxel is easily determined using the reference images and the voxel's image index.

The following describes the textured binary volumetric octree stream.

We assume that 255 cameras are enough and assign 1 byte to the index. The TBVO stream is a stream of symbols. Each TBVO symbol is either a BVO symbol or a texture symbol. A texture symbol denotes a camera index, which can be a specific number or the code "not defined".

In the following description the code "not defined" is denoted "?". The TBVO stream is traversed in breadth-first order. Suppose we have a BVO and an image index assigned to every leaf voxel; this can be done at the modeling stage. The stream writer then traverses all the BVO nodes, including the leaf nodes (which have no BVO symbol), in breadth-first order. Fig.25 shows the pseudocode that performs the writing of the stream.

An example of writing a TBVO bitstream is shown in Fig.14. For the TBVO tree shown in Fig.14a, a stream of symbols can be obtained according to this procedure, as shown in Fig.14c. In this example the texture symbols are represented by one byte each. However, in the real stream each texture symbol would require only 2 bits, since only three values need to be represented (two cameras and the code "not defined").
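For instance, with two cameras the three values could be assigned the 2-bit codes 00 for "?", 01 for camera 0 and 10 for camera 1; this assignment is illustrative, since the format itself does not fix the code values.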

The following describes DIBR-based animation.

Animated versions have been defined for two of the DIBR formats: DepthImage containing only SimpleTextures, and OctreeImage. The volume of data is one of the crucial issues in three-dimensional animation. These particular formats were chosen because video streams can be naturally embedded in their animated versions, providing a substantial reduction of the data volume.

For DepthImage, animation is performed by replacing the reference images with MPEG-4 MovieTextures. High-quality lossy video compression does not noticeably affect the appearance of the resulting three-dimensional objects. The depth maps can be stored (in near-lossless mode) in the alpha channels of the reference video streams. At the rendering stage, a three-dimensional frame is rendered after all the reference image and depth frames have been received and decompressed.

Animation of OctreeImage is similar: the reference images are replaced with MPEG-4 MovieTextures, and a new octree stream appears.

The following describes the MPEG-4 node specification.

The DIBR formats are described in detail in the MPEG-4 AFX node specifications. The DepthImage node contains the fields that define the parameters of the view volume for either a SimpleTexture or a PointTexture. The OctreeImage node represents an object in the form of a geometry defined by the TBVO, together with a set of reference image formats. Scene-dependent information is stored in dedicated fields of the DIBR data structures, enabling the correct interaction of DIBR objects with the rest of the scene. The definitions of the DIBR nodes are shown in Fig.26.

Fig.27 shows the spatial layout of a DepthImage, illustrating the meaning of each field. Note that a DepthImage node defines a single DIBR object. When several DepthImage nodes are related to each other, they are processed as a group and therefore should be placed under the same Transform node. The diTexture field specifies a texture with depth (SimpleTexture or PointTexture), which is to be mapped into the region defined in the DepthImage node.

The OctreeImage node defines the octree structure and the projected textures. The "octreeResolution" field specifies the maximum number of octree leaves along a side of the enclosing cube. The "octree" field defines the set of internal octree nodes. Each internal node is represented by a byte. A "1" in the i-th bit of this byte means that child nodes exist for the i-th child of this internal node, while "0" means that they do not. The order of the internal octree nodes must be the order of a breadth-first traversal of the octree. The order of the eight children of an internal node is shown in Fig.14b. The "voxelImageIndex" field contains the array of image indices assigned to the voxels. At the rendering stage, the color attributed to an octree leaf is determined by orthographically projecting the leaf onto one of the images with the corresponding index. The indices are stored in an octree-like manner: if a particular image can be used for all the leaves contained in a specific voxel, the voxel containing the index of the image is output to the stream; otherwise, a voxel containing a fixed "further subdivision" code is output, which means that the image index will be specified separately for each child of the current voxel (in the same recursive manner). If the "voxelImageIndex" field is empty, the image indices are determined at the rendering stage. The "images" field defines a set of DepthImage nodes with SimpleTextures in the diTexture field; however, the "nearPlane" and "farPlane" fields of these DepthImage nodes and the "depth" field of the SimpleTexture nodes are not used.

The following describes the compression of the OctreeImage format.

In this section we consider the compression method for the OctreeImage. Typical test results are presented and commented on below. It should be noted that compression of the PointTexture is not yet supported and will be implemented in a future version of AFX.

The "octreeimages" and "octree" fields of the OctreeImage are compressed separately. The proposed methods were developed on the premise that the "octree" field must be compressed losslessly, while some visually acceptable distortion is allowed for the octree images.

The "octreeimages" field is compressed by means of the image compression method (for a static model) or the video compression tools (for animated models) supported by the MPEG-4 standard. In our approach we used the JPEG format for the octree images, after a preprocessing that we call JPEG "minimization": for each texture, only the points necessary for three-dimensional rendering are kept. Discarding the irrelevant pixels and suppressing the compression artifacts at the object/background boundaries simultaneously improves both the compression ratio and the rendering quality.

Octree compression is the most important part of the OctreeImage compression, since it deals with compressing a representation that is already a very compact linkless binary tree. Nevertheless, in our experiments the method explained below reduced this structure to approximately half of its original volume. In the animated OctreeImage version, the "octree" field is compressed separately for each three-dimensional frame.

Compression is performed by a variant of context-based adaptive arithmetic coding that makes use of the explicit geometric nature of the data. The octree is a stream of bytes. Each byte represents a node (i.e., a subcube) of the tree, in which the bits indicate the occupancy of the subcubes after internal subdivision. This bit pattern is called the filling pattern of the node. The described compression algorithm processes the bytes one by one, as follows.

- Determine the context for the current byte.

- The probability (normalized frequency) of occurrence of the current byte in this context is retrieved from the probability table (PT) corresponding to the context.

- The current byte is arithmetically coded using the probability of its occurrence in the current context (followed, if necessary, by renormalization, as described in more detail below); a minimal sketch of this per-byte loop is given after the list.
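To make the loop concrete, the following is a minimal Python sketch, assuming a hypothetical range-coder hook encode_symbol(low, high, total) and probability tables stored as plain frequency lists; these names are stand-ins for illustration, not part of the patented implementation.

```python
def encode_octree_stream(octree_bytes, tables, get_context, encode_symbol):
    """Sketch of the per-byte loop: context -> PT lookup -> code -> update.

    tables maps a context to a 256-entry frequency list; get_context and
    encode_symbol (a range-coder hook) are hypothetical stand-ins.
    """
    for byte in octree_bytes:
        ctx = get_context(byte)                       # step 1: determine context
        table = tables[ctx]                           # step 2: PT for this context
        total = sum(table)
        low = sum(table[:byte])                       # cumulative frequency below byte
        encode_symbol(low, low + table[byte], total)  # step 3: arithmetic-code byte
        table[byte] += 1                              # adaptive update (increment A)
```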

Thus, coding is the process of constructing and updating the PTs according to the context model. In context-based adaptive arithmetic coding schemes (such as "prediction by partial matching"), the context of a symbol is usually a string of several preceding symbols. In our case, however, compression efficiency is increased by exploiting the octree structure and the geometric properties of the data. The proposed approach is based on two new ideas applied to the octree compression problem.

A1: for the current node, the context is either its parent node or the pair {parent node, position of the current node within the parent node};

A2: it is assumed that the probability of occurrence of a given node at a given geometric position within a particular parent node is invariant with respect to a certain set of orthogonal transformations (such as rotations or symmetries).

Assumption A2 is illustrated in Fig.6 for the transformation R, which is a rotation by -90° in the x-z plane. The underlying premise is that the probability of occurrence of a given node within a particular parent node should depend only on their relative positions. This assumption was confirmed in our experiments by analysis of the probability tables. It allows the use of more complex contexts without requiring too many probability tables, which in turn helps achieve quite good results in terms of both data size and speed. Note that the more complex the contexts used, the more accurate the probability estimates and the more compact the code.

We introduce a set of transformations for which the invariance of the probability distributions is assumed. To be applicable in this situation, such transformations must map the enclosing cube to itself. Consider the set G of orthogonal transformations in Euclidean space obtained by all compositions, in any number and order, of the three basic transformations (generators) m1, m2 and m3, defined as follows:

m1: (x, y, z) → (y, x, z),  m2: (x, y, z) → (x, z, y),  m3: (x, y, z) → (-x, y, z),

where m1 and m2 are reflections in the planes x=y and y=z, respectively, and m3 is a reflection in the plane x=0.

One of the classical results of the theory of groups generated by reflections states that G contains 48 distinct orthogonal transformations that map the cube to itself (the so-called Coxeter group). For example, the rotation R in Fig.6 can be expressed through the generators as

R = m3·m2·m1·m2,

where "·" denotes matrix multiplication.

A transformation from G applied to an octree node produces a node with a different occupancy pattern of its subcubes. This allows the nodes to be categorized according to their occupancy patterns. Using group-theoretic terminology, we say that G acts on the set of all occupancy patterns of octree nodes. Calculations show that there are 22 distinct classes (also called orbits in group theory), in which, by definition, two nodes belong to the same class if and only if they are related by a transformation from G. The number of elements in a class varies from 1 to 24 and is always a divisor of 48.
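These group-theoretic counts are easy to verify computationally. The short Python check below closes the three generators under matrix multiplication and collects the orbits of the 256 occupancy patterns; the child-bit ordering is an assumption of this sketch, but the two counts do not depend on it.

```python
I3 = ((1, 0, 0), (0, 1, 0), (0, 0, 1))
M1 = ((0, 1, 0), (1, 0, 0), (0, 0, 1))   # reflection in the plane x = y
M2 = ((1, 0, 0), (0, 0, 1), (0, 1, 0))   # reflection in the plane y = z
M3 = ((-1, 0, 0), (0, 1, 0), (0, 0, 1))  # reflection in the plane x = 0

def matmul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(3))
                       for j in range(3)) for i in range(3))

def closure(gens):
    group = {I3} | set(gens)
    while True:
        new = {matmul(a, b) for a in group for b in gens} - group
        if not new:
            return group
        group |= new

G = closure([M1, M2, M3])

def transform_pattern(m, byte):
    """Permute the 8 child bits of an occupancy byte by transformation m."""
    out = 0
    for i in range(8):
        if byte >> i & 1:
            v = [2 * ((i >> k) & 1) - 1 for k in range(3)]  # subcube center
            w = [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]
            out |= 1 << sum(((w[k] + 1) // 2) << k for k in range(3))
    return out

orbits = {frozenset(transform_pattern(m, b) for m in G) for b in range(256)}
print(len(G), len(orbits))  # expected output: 48 22
```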

A practical consequence of assumption A2 is that the probability table depends not on the parent node itself but only on the class to which the parent node belongs. Note that there would be 256 tables for a context based on the parent node alone, and an additional 256×8 = 2048 tables for a context based on the parent node and the child position, whereas using classes requires only 22 tables in the first case and 22×8 = 176 tables in the second. It is therefore possible to use a context of equivalent complexity with a relatively small number of probability tables. The resulting PT has the form shown in Table 11.

To make the statistics of the probability tables more accurate, they are collected in different ways at three stages of the encoding process.

At the first stage, the context is not used at all: a "0-context model" is adopted, and a single probability table with 256 entries is maintained, starting from the uniform distribution.

As soon as the first 512 nodes (an empirically found number) have been encoded, we switch to the "1-context model", which uses the parent node as the context. At the moment of switching, the PT of the 0-context model is copied to the PTs for all 22 contexts.

After 2048 nodes (another heuristic value) have been encoded, we switch to the "2-context model". At this moment, the 1-context PT for each parent node class is copied to the PTs for each position in that same parent node configuration.
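A sketch of the table hand-off between stages, assuming PTs are plain 256-entry frequency lists and the thresholds 512 and 2048 given above (all container names are illustrative):

```python
NUM_CLASSES = 22     # parent-node classes (orbits under G)
NUM_POSITIONS = 8    # child positions within a parent node

def switch_tables(nodes_encoded, pt0, pt1, pt2):
    """Copy probability tables at the empirically chosen switching points."""
    if nodes_encoded == 512:
        # 1-context model: one PT per parent class, seeded from the 0-context PT.
        pt1[:] = [list(pt0) for _ in range(NUM_CLASSES)]
    elif nodes_encoded == 2048:
        # 2-context model: one PT per (class, position) pair, seeded per class.
        pt2[:] = [[list(pt1[c]) for _ in range(NUM_POSITIONS)]
                  for c in range(NUM_CLASSES)]
```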

The key point of this algorithm is the determination of the context and the probability for the current byte. It is implemented as follows. A class map is constructed which indicates the class to which each of the 256 possible nodes belongs, together with a precomputed transformation from G that maps a given node to the standard element of its class. Thus, to determine the probability of the current node N, the following steps are performed.

- Determine the parent node P of the current node.

- Retrieve from the class map the class to which P belongs and the transformation T that maps P to the standard node of its class. Let the class number be C.

- Apply T to P and find the position p of the child in the standard node that corresponds to the current node N.

- Apply T to N. The newly obtained occupancy pattern TN then occupies position p in the standard node of class C.

- Extract the desired probability from the entry for TN in the probability table corresponding to the class-position combination (C, p).

For the 1-context model, the above steps are modified in the obvious way. Needless to say, all the transformations are precomputed and implemented as transcoding tables.
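The lookup steps can be summarized by the sketch below; class_of, transform_of, apply_to_pattern and apply_to_position stand for the precomputed class map and transcoding tables (the names are illustrative, not from the patent):

```python
def current_node_probability(n_pattern, pos_in_parent, p_pattern,
                             class_of, transform_of,
                             apply_to_pattern, apply_to_position, tables):
    """Probability of the current node N under the 2-context model."""
    c = class_of[p_pattern]                    # class C of the parent P
    t = transform_of[p_pattern]                # T maps P to its standard node
    p = apply_to_position(t, pos_in_parent)    # child position in the standard node
    tn = apply_to_pattern(t, n_pattern)        # transformed occupancy pattern TN
    table = tables[c, p]                       # PT for the pair (C, p)
    return table[tn] / sum(table)
```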

Note that at the decoding stage, the parent node P of node N has already been decoded, and therefore the transformation T is known; all the steps performed during decoding are exactly the same as during encoding. Finally, consider the process of updating the probabilities. Let R be the probability table for a certain context. Denote by P(N) the entry of R corresponding to the probability of occurrence of node N in this context. In the described implementation, P(N) is an integer, and after each occurrence of N, P(N) is updated as

P(N)=P(N)+A,

where A is an integer increment parameter varying, in a typical case, from 1 to 4 for different context models.

Let S(R) be the sum of all entries in R. Then the probability of N that is fed to the arithmetic coder (in this case, a range coder) is computed as P(N)/S(R). As soon as S(R) reaches the threshold value 2^16, all entries are renormalized: to avoid zero values in R, entries equal to 1 are left as they are, while the others are divided by 2.
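In code, the update and renormalization rule reads roughly as follows (increment and threshold as given above; the table is a plain frequency list):

```python
RENORM_THRESHOLD = 1 << 16   # 2^16, the threshold on S(R)

def update_and_normalize(table, n, increment=1):
    """P(N) <- P(N) + A; halve all entries except the 1s once S(R) hits 2^16."""
    table[n] += increment
    if sum(table) >= RENORM_THRESHOLD:
        for i, v in enumerate(table):
            table[i] = v if v == 1 else v // 2   # 1-entries stay to avoid zeros
    return table[n] / sum(table)                 # value fed to the range coder
```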

The stream of symbols defining the image index for each voxel is compressed using its own probability table. In the terms used above, it has a single context. The PT entries are updated with a larger increment than the entries for octree nodes; this allows the probabilities to adapt to the highly variable frequencies of the symbols. In all other respects the coding is the same as for node symbols.

Rendering methods for the DIBR formats are not part of AFX, but the rendering techniques we use are based on splats, small flat colored "patches" used as "rendering primitives". The two approaches described below are oriented toward the two different representations: depth images and octree images. In our implementation, OpenGL functions are used for splatting to speed up rendering. However, purely software rendering is also possible, which allows the computations to be optimized using the simple structure of the depth image or octree image.

The method used to render depth images is extremely simple. It should be noted, however, that it relies on OpenGL features and runs much faster with a hardware accelerator. In this method, we transform all pixels with depth from the simple textures and point textures to be rendered into three-dimensional points, then place small polygons (splats) at these points and apply the OpenGL rendering functions. The pseudocode of this procedure for the case of a simple texture is shown in Fig.28. The case of a point texture is handled in exactly the same way.
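A purely software back-projection step might look like the sketch below; the orthographic scaling and the convention that zero depth marks background pixels are assumptions of this sketch, not requirements of the format.

```python
def simple_texture_to_points(color, depth, width, height,
                             fov_w, fov_h, near, far):
    """Back-project one simple texture into 3D points (orthographic camera).

    color and depth are row-major lists of length width*height; depths are
    assumed normalized to [0, 1] between the near and far planes.
    """
    points = []
    for v in range(height):
        for u in range(width):
            d = depth[v * width + u]
            if d == 0:                          # assumed background marker
                continue
            x = (u + 0.5) / width * fov_w - fov_w / 2
            y = (v + 0.5) / height * fov_h - fov_h / 2
            z = near + d * (far - near)
            points.append(((x, y, z), color[v * width + u]))
    return points   # splats are then drawn at these points (e.g., via OpenGL)
```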

The splat size must be adapted to the distance between the point and the observer. We used the following simple method based on a coarse auxiliary grid. The splat size is computed for each grid cell, and this value is used for the points inside the cell. The computation is performed as follows.

- Map the cell onto the screen using OpenGL.

- Compute the length L of the largest diagonal of the projection (in pixels).

- Estimate D (the splat diameter) as D = C·L/N, where N is the average number of points per cell side and C is a heuristic constant approximately equal to 1.3; see the sketch below.
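In code, the estimate for one grid cell is a one-liner (names illustrative):

```python
def splat_diameter(diag_pixels, points_per_side, c=1.3):
    """Heuristic splat size for a grid cell: D = C * L / N."""
    return c * diag_pixels / points_per_side
```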

It should be emphasized that this method could be improved by more accurate radius computations, more complex splats, and antialiasing filtering. However, even this simple method provides good visual quality.

The same method is suitable for octree images, where the octree nodes at one of the coarser levels are used in the above splat size computations. However, for the octree image, the color information must first be mapped onto the set of voxels. This is done very simply, because each voxel has the index of its corresponding reference image, and the position of a pixel in the reference image is also known during parsing of the octree stream. As soon as the colors of the voxels are determined, they are rendered with OpenGL as explained above.

The DIBR formats have been implemented and tested on several three-dimensional models. One of the models ("Tower") was obtained by scanning a real physical object (a Cyberware color three-dimensional scanner was used); the others were converted from the 3DS MAX demo package. Testing was performed on a Pentium IV 1.8 GHz computer with an OpenGL accelerator.

Below we explain how the conversion from the polygonal format to DIBR is performed, and then present modeling results for the views and the compression of the different DIBR formats. Most of the data relate to models in the depth image and octree image formats; these formats have animated versions and can be effectively compressed. All the models were created with an orthographic camera, since this is the preferred way of representing "compact" objects. Note that the perspective camera is used mainly for economical DIBR representation of distant environment scenes.

Generation of a DIBR model starts with obtaining a sufficient number of simple textures. For a polygonal object, the simple textures are computed directly. The next step depends on which DIBR format it is desired to use.

The depth image is simply the set of obtained simple textures. Although the depth maps can be stored in compressed form, only lossless compression is acceptable here, since even small distortions in the geometry are often quite noticeable.

The reference images can be stored in lossy compressed form, but in this case preprocessing is necessary. Although it is generally permissible to use popular methods such as lossy JPEG compression, edge artifacts become more noticeable in the resulting three-dimensional object views, especially because of the boundaries between the object and the background of the reference image, where the background color appears to "leak" into the object. The solution used to cope with this problem was to expand the image in the edge blocks into the background, using the average color of the block and a fast intensity decay, and then apply JPEG compression. The effect amounts to "squeezing" the distortion into the background, where it is harmless, since the background pixels are not used for rendering. Internal boundaries in a reference image compressed with losses can also produce artifacts.

Generation of an octree image model is based on an intermediate point-based representation (OTP). The set of points forming the OTP is a set of colored points obtained by shifting the pixels of the reference images by the distances specified in the corresponding depth maps. The original simple textures should be constructed so that the resulting OTP provides a reasonably good approximation of the object surface. After that, the OTP is converted into an octree image, as shown in Fig.24, and is used to generate a new complete set of reference images satisfying the constraints imposed by the format. At the same time, an additional data structure is generated: the voxel image index, representing the reference image indices for the octree voxels. If the reference images are to be stored in lossy formats, they are first preprocessed as explained above. In addition, since the TBWO structure explicitly defines the pixel containing the color of each voxel, the redundant pixels are discarded, which further reduces the size of the voxel image index data. Examples of original and preprocessed reference images in JPEG format show that the quality degradation is slight for octree images, though it is sometimes noticeable for depth image objects.
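The boundary-block expansion can be sketched as follows; the block size, the flat decay factor, and the explicit object mask are assumptions of this illustration (the actual decay is described only qualitatively as "fast"):

```python
def expand_edge_blocks(img, mask, width, height, block=8, decay=0.5):
    """Fill background pixels of boundary blocks before lossy (JPEG) coding.

    img is a row-major list of (r, g, b) tuples; mask[i] is True for object
    pixels. Background pixels in blocks straddling the object boundary are
    replaced by the block's average object color, attenuated by `decay`.
    """
    out = list(img)
    for by in range(0, height, block):
        for bx in range(0, width, block):
            idx = [y * width + x
                   for y in range(by, min(by + block, height))
                   for x in range(bx, min(bx + block, width))]
            obj = [i for i in idx if mask[i]]
            if not obj or len(obj) == len(idx):
                continue                      # block lies entirely in or out
            avg = tuple(sum(img[i][k] for i in obj) / len(obj)
                        for k in range(3))
            for i in idx:
                if not mask[i]:
                    out[i] = tuple(decay * c for c in avg)
    return out
```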

Point texture models are created by projecting the object onto a reference plane. If this does not provide enough samples (which can happen for surface parts nearly tangential to the projection direction), additional simple textures are formed to obtain more samples. The resulting set of points is then reordered into the point texture structure.

Table 12 compares the data sizes of different polygonal models and their DIBR versions. The numbers in the model names indicate the resolution (in pixels) of the reference images.

The depth maps of the depth image models were stored in PNG format, and the reference images in high-quality JPEG format. The data in Table 12 show that the size of a depth image model is not always smaller than that of the archived polygonal model. However, the compression provided by the octree image is usually much higher. This is a consequence of unifying the depth maps into a single, effectively compressed octree data structure and of the complex preprocessing that discards redundant pixels. At the same time, the format provides a simple and universal tool for representing complex objects such as the "Palma" model, which are difficult to handle without such preprocessing.

Table 13 presents data specific to the octree image, illustrating the efficiency of the compression developed for this format. The entries in the table are the sizes of the compressed and uncompressed parts of the models containing the octree and voxel image index components. It is shown that this part is reduced by a factor of 2 to 2.5. Note that the "Palma" model in Table 13 is not the same as the "Palma" model in Table 12.

Below we present data on rendering speed.

The rendering speed of the "Palma 512" depth image model is about 2 frames per second (fps); note that it consists of 21 simple textures. The other static models tested, with a reference image side of 512, are rendered at 5-6 fps. Note that the rendering speed depends mainly on the number and resolution of the reference images and not on the complexity of the scene. This is an important advantage over polygonal representations, especially in animation. The animated octree image model "Dragon 512" is rendered at 24 fps; its data include six reference video streams in compressed AVI format (1370 KB), the full data amounting to 2280 KB.

The depth image model "Angel 256" is shown in Fig.22. Figs.30-34 show several other DIBR and polygonal models. Fig.30 compares views of the "Morton" model for the polygonal case and for the depth image case. The depth image model uses reference images in JPEG format, and rendering is performed by the simplest splat formation described in section 5; nevertheless, the image quality is quite acceptable. Fig.31 compares the two versions of the scanned "Tower" model. The black dots in the upper part of the model are due to noise in the input data. Fig.32 presents the more complex "Palma" model, composed of 21 simple textures. It also demonstrates good quality, although the leaves are generally somewhat wider than in the original 3DS MAX model, which is a consequence of the simplified splat formation.

Fig.33 presents a three-dimensional frame of the animated octree image model "Dragon 512". Fig.34 demonstrates the ability of the point texture format to produce models of excellent quality.

A device and method for representing three-dimensional objects on the basis of depth images are described below. Fig.35 is a block diagram of a device for representing three-dimensional objects on the basis of depth images using simple textures in accordance with an embodiment of the present invention.

According to Fig.35, the device 1800 for representing three-dimensional objects on the basis of depth images contains a viewpoint information generator 1810, a preprocessor 1820, a first image generator 1830, a second image generator 1840, a node generator 1850, and an encoder 1860.

The viewpoint information generator 1810 generates at least one piece of viewpoint information. The viewpoint information includes a set of fields defining the image plane for the object. The fields constituting the viewpoint information include a position field, an orientation field, a field of view field, a projection method field, and a distance field.

In the position and orientation fields, the position and orientation from which the image plane is observed are recorded. The position represents the relative location of the viewpoint with respect to the origin of the coordinate system, while the orientation represents the amount of rotation of the viewpoint relative to the default orientation.

In the field of view field, the field of view from the viewpoint to the image plane is recorded.

In the projection method field, the projection method is recorded, selected from an orthogonal projection method, in which the field of view is represented by width and height, and a perspective projection method, in which the field of view is represented by a horizontal angle and a vertical angle. When the orthogonal projection method is selected, the width and height of the field of view correspond to the width and height of the image plane, respectively. When the perspective projection method is selected, the horizontal and vertical angles of the field of view correspond to the angles formed by the horizontal and vertical sides of the view volume extending from the viewpoint to the image plane.

In the distance field, the distance from the viewpoint to the near boundary plane and the distance from the viewpoint to the far boundary plane are recorded. The distance field is composed of a near plane field and a far plane field and defines the range for the depth information.

The first image generator 1830 generates a color image based on the color information of the respective pixels constituting the object, corresponding to the viewpoint information. In the case of a video data format for forming an animated object, the depth information and the color information are multiple sequences of image frames. The second image generator 1840 generates a depth image corresponding to the viewpoint information, based on the depth information of the respective pixels constituting the object. The node generator 1850 generates image nodes consisting of the viewpoint information and the color and depth images corresponding to the viewpoint information.

The preprocessor 1820 preprocesses the pixels on the boundary between the object and the background of the color image. The preprocessor is shown in more detail in Fig.36. According to Fig.36, the preprocessor 1820 contains an expansion section 1910 and a compression section 1920. The expansion section 1910 spreads the colors of the boundary pixels into the background using the average color of each block and a fast intensity decay. The compression section 1920 performs block-based compression, shifting the distortion into the background region. The encoder 1860 encodes the generated image nodes to output bit streams.

Fig.37 is a flowchart illustrating a method for representing three-dimensional objects on the basis of depth images using simple textures according to an embodiment of the invention.

According to Fig.37, at step S2000 the viewpoint information generator 1810 generates at least one piece of viewpoint information. At step S2010 the first image generator 1830 generates a color image based on the color information of the respective pixels constituting the object, corresponding to the viewpoint information. At step S2020 the second image generator 1840 generates a depth image corresponding to the viewpoint information, based on the depth information of the respective pixels constituting the object. At step S2030 the node generator 1850 generates image nodes consisting of the viewpoint information and the color and depth images corresponding to the viewpoint information.

At step S2040 the expansion section 1910 spreads the colors of the pixels on the boundary between the object and the background using the average color of each block and a fast intensity decay. At step S2050 the compression section 1920 performs block-based compression, shifting the distortion into the background region. At step S2060 the encoder 1860 encodes the generated image nodes to output bit streams.

The apparatus and method for representing three-dimensional objects on the basis of depth images according to the present invention, described above with reference to Figs.35-37, are also applicable to the representation of objects based on simple textures; a simple texture is illustrated in Fig.26.

Fig.38 is a block diagram of a device for representing three-dimensional objects on the basis of depth images using point textures in accordance with the present invention.

According to Fig.38, the device 2100 for representing three-dimensional objects on the basis of depth images contains a sampling block 2110, a viewpoint information generator 2120, a plane information generator 2130, a depth information generator 2140, a color information generator 2150, and a node generator 2160.

The sampling block 2110 generates samples for the image plane by projecting the object onto the reference plane. A sample for the image plane consists of an image pair: a color image and a depth image.

The viewpoint information generator 2120 generates viewpoint information defining the point from which the object is observed. The viewpoint information includes a set of fields defining the image plane for the object. The fields constituting the viewpoint information include a position field, an orientation field, a field of view field, a projection method field, and a distance field.

In the position and orientation fields, the position and orientation from which the image plane is observed are recorded; the viewpoint is determined by the position and orientation. In the field of view field, the width and height of the field of view are recorded. The projection method is selected from an orthogonal projection method, in which the field of view is represented by width and height, and a perspective projection method, in which the field of view is represented by a horizontal angle and a vertical angle. In the distance field, the distance from the viewpoint to the near boundary plane and the distance from the viewpoint to the far boundary plane are recorded; the distance field is composed of a near plane field and a far plane field and defines the range for the depth information.

The plane information generator 2130 generates plane information defining the width, height, and depth of the image plane consisting of the set of points obtained from the samples for the image plane corresponding to the viewpoint information. The plane information consists of multiple fields. The fields constituting the plane information include a first field, in which the width of the image plane is recorded, a second field, in which the height of the image plane is recorded, and a depth resolution field, in which the resolution of the depth information is recorded.

The depth information generator 2140 generates a sequence of depth information for the depths of all points of the object projected onto the image plane, and the color information generator 2150 generates a sequence of color information for the respective projected points. In the depth information sequence, the number of projected points and the depths of the corresponding projected points are sequentially recorded. In the color information sequence, the color values corresponding to the depth values of the corresponding projected points are sequentially recorded.

The node generator 2160 generates image nodes consisting of the plane information, the depth information sequence, and the color information sequence corresponding to the image plane.

Fig.39 is a flowchart illustrating a method for representing three-dimensional objects on the basis of depth images using point textures according to the present invention.

According to Fig.39, at step S2200 the viewpoint information generator 2120 generates viewpoint information defining the point from which the object is observed. At step S2210 the plane information generator 2130 generates plane information defining the width, height, and depth of the image plane corresponding to the viewpoint information. At step S2220 the sampling block 2110 generates samples for the image plane by projecting the object onto the reference plane. If there is already a sufficient number of samples for the image plane, step S2220 is not executed.

At step S2230 the depth information generator 2140 generates a sequence of depth information for the depths of all points of the object projected onto the image plane.

At step S2240 the color information generator 2150 generates a sequence of color information for the respective projected points. At step S2250 the node generator 2160 generates image nodes consisting of the plane information, the depth information sequence, and the color information sequence corresponding to the image plane.

The apparatus and method for representing three-dimensional objects on the basis of depth images according to the present invention, described above with reference to Figs.35-37, are likewise applicable to the representation of objects based on point textures, as illustrated in Fig.26.

Fig.40 is a block diagram of a device for representing three-dimensional objects on the basis of depth images using octrees according to the present invention.

According to Fig.40, the device 2300 for representing three-dimensional objects on the basis of depth images contains a preprocessor 2310, a reference image determination block 2320, a shape information generator 2330, an index generator 2340, a node generator 2350, and an encoder 2360.

The preprocessor 2310 preprocesses the reference images. The detailed structure of the preprocessor 2310 is shown in Fig.41. According to Fig.41, the preprocessor 2310 contains an expansion section 2410 and a compression section 2420. The expansion section 2410 spreads the colors of the pixels on the block boundaries of the reference image into the background using the average color of each block and a fast intensity decay. The compression section 2420 performs block-based compression on the reference image, shifting the distortion into the background region.

The reference image determination block 2320 determines the reference image containing a color image for each cube divided by the shape information generator 2330. The reference image is a depth image node consisting of viewpoint information and a color image corresponding to the viewpoint information. Here the viewpoint information includes a set of fields defining the image plane for the object; the respective fields constituting the viewpoint information have been described above, and their detailed description is not repeated. The color image is recorded as a texture.

The shape information generator 2330 generates information about the shape of the object by dividing the octree containing the object into 8 subcubes and defining the divided subcubes as child nodes. The shape information generator 2330 performs the division iteratively until each subcube becomes smaller than a predetermined size. The shape information includes a resolution field, in which the maximum number of octree leaves along a side of the cube containing the object is recorded, an octree field, in which the sequence of internal nodes is recorded, and an index field containing the indices of the reference images corresponding to each internal node. The index generator 2340 generates index information of the reference images corresponding to the shape information. Fig.42 presents the detailed structure of the index generator 2340. According to Fig.42, the index generator 2340 includes a colored point generator 2510, a point-based representation (OTP) generator 2520, an image converter 2530, and an index information generator 2540.

The colored point generator 2510 obtains colored points by shifting the pixels existing in the reference image by the distance specified in the corresponding depth map. The OTP generator 2520 generates an intermediate OTP image from the set of colored points. The image converter 2530 converts the OTP image into an octree image represented by a cube corresponding to each point. The index information generator 2540 generates index information of the reference image corresponding to each cube.
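The first two steps, shifting reference-image pixels into colored points and quantizing them into the voxel grid, can be sketched as follows; to_world abstracts the camera of the reference image, and the assumption that coordinates are normalized to the enclosing cube is specific to this illustration.

```python
def reference_to_otp(color, depth, width, height, to_world):
    """Colored points of the OTP: each pixel shifted by its depth-map distance."""
    return [(to_world(u, v, depth[v * width + u]), color[v * width + u])
            for v in range(height) for u in range(width)]

def otp_to_voxels(points, resolution):
    """Quantize OTP points to the voxel grid of an octree image.

    resolution is the maximum number of octree leaves along the cube side;
    point coordinates are assumed normalized to [0, 1) inside the cube.
    """
    voxels = {}
    for (x, y, z), col in points:
        key = (int(x * resolution), int(y * resolution), int(z * resolution))
        voxels.setdefault(key, col)   # first color to land in a voxel wins
    return voxels
```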

The node generator 2350 generates octree nodes including the shape information, the index information, and the reference images.

The encoder 2360 encodes the octree nodes into output bit streams. The detailed structure of the encoder 2360 is shown in Fig.43. According to Fig.43, the encoder 2360 includes a context determination section 2610, a first encoding section 2620, a second encoding section 2630, a third encoding section 2640, a symbol byte writing section 2650, and an image index writing section 2660.

The context determination section 2610 determines the context of the current octree node on the basis of the number of encoding cycles performed for octree nodes. The first encoding section 2620 encodes the first 512 nodes using the 0-context model and arithmetic coding while maintaining a single probability table with 256 entries. The first encoding section 2620 starts coding from the uniform distribution.

The second encoding section 2630 encodes the nodes from the 513th to the 2048th using the 1-context model, with the parent node as context. At the moment of switching from the 0-context to the 1-context model, the second encoding section 2630 copies the probability table of the 0-context model into all probability tables of the 1-context model.

Fig.44 presents the block diagram of the second encoding section 2630. According to Fig.44, the second encoding section 2630 contains a probability retrieval section 2710, an arithmetic coder 2720, and a table update section 2730. The probability retrieval section 2710 retrieves the probability of occurrence of the current node in the context from the probability table corresponding to the context. The arithmetic coder 2720 compresses the octree by means of the probability sequence containing the retrieved probability. The table update section 2730 updates the probability table with a predetermined increment, such as 1, for the occurrence frequency of the current node in the current context.

The third encoding section 2640 encodes the nodes following the 2048th node using the 2-context model and arithmetic coding, with the parent node and the child position as contexts. At the moment of switching from the 1-context to the 2-context model, the third encoding section 2640 copies the probability table of the 1-context model for each parent node configuration into the probability tables of the 2-context model corresponding to the respective positions in that same parent node configuration.

Fig.45 presents the block diagram of the third encoding section 2640, which contains a first extraction section 2810, a first determination section 2820, a second extraction section 2830, a configuration obtaining section 2840, a second determination section 2850, an arithmetic coder 2860, and a table update section 2870.

The first extraction section 2810 extracts the parent node of the current node. The first determination section 2820 determines the class to which the extracted parent node belongs and the transformation by which the parent node is converted into the standard node of that class. The second extraction section 2830 applies the obtained transformation to the parent node and retrieves the position of the current node in the transformed parent node. The configuration obtaining section 2840 applies the transformation to the current node and obtains the configuration as a combination of the determined class and the position index of the current node. The second determination section 2850 determines the probability from the entry of the probability table corresponding to the obtained configuration. The arithmetic coder 2860 compresses the octree by means of the probability sequence containing the retrieved probability. The table update section 2870 updates the probability table with a predetermined increment, such as 1. The symbol byte writing section 2650 writes the symbol byte corresponding to the current node into the bit streams. If all child nodes of the current node have the same reference image index and the parent node of the current node has an "undefined" reference image index, the image index writing section 2660 writes that same reference image index into the bit streams for the subcubes of the current node. If the child nodes have different reference image indices, the image index writing section 2660 writes the "undefined" reference image index into the bit streams for the subcubes of the current node.

Fig.46 is a flowchart illustrating a method for representing three-dimensional objects on the basis of depth images using octrees according to an embodiment of the present invention. According to Fig.46, at step S2900 the shape information generator 2330 generates information about the shape of the object by dividing the octree containing the object into subcubes and defining the divided subcubes as child nodes. The shape information includes a resolution field, in which the maximum number of octree leaves along a side of the cube containing the object is recorded, an octree field, in which the sequence of internal nodes is recorded, and an index field containing the indices of the reference images corresponding to each internal node. Each internal node is represented by a byte, and the node information recorded in the bit sequence constituting that byte represents the presence or absence of child nodes among the children belonging to the internal node. At step S2910 the division is performed iteratively, forming 8 subcubes, as long as each subcube is larger than a predetermined size (this value can be found empirically).
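The subdivision step can be sketched as a breadth-first construction of occupancy bytes; the child-bit ordering (cf. Fig.14b) and the integer voxel input are assumptions of this sketch.

```python
from collections import deque

def build_octree_bytes(voxels, size=256, min_size=1):
    """Occupancy bytes of the internal nodes in breadth-first order.

    voxels is a set of integer (x, y, z) leaf coordinates inside a cube of
    side `size` (a power of two); subdivision stops at subcubes of min_size.
    """
    def occupied(o, s):
        return any(all(o[k] <= v[k] < o[k] + s for k in range(3))
                   for v in voxels)

    nodes, queue = [], deque([((0, 0, 0), size)])
    while queue:
        o, s = queue.popleft()
        half = s // 2
        byte = 0
        for i in range(8):                 # assumed child ordering (cf. Fig.14b)
            co = tuple(o[k] + half * ((i >> k) & 1) for k in range(3))
            if occupied(co, half):
                byte |= 1 << i
                if half > min_size:        # leaf-size subcubes are not split
                    queue.append((co, half))
        nodes.append(byte)                 # one byte per internal node
    return nodes
```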

At step S2920 the reference image determination block 2320 determines the reference image containing a color image for each cube divided by the shape information generator 2330. The reference image is a depth image node consisting of viewpoint information and a color image corresponding to the viewpoint information. The viewpoint information has been described above. The reference image may be subjected to preprocessing at a preliminary stage.

Fig.47 is a flowchart illustrating the reference image preprocessing procedure. According to Fig.47, at step S3000 the expansion section 2410 spreads the colors of the pixels on the block boundaries into the background using the average color of each block and a fast intensity decay. At step S3010 the compression section 2420 performs block-based compression in order to shift the distortion into the background.

At step S2930 the index generator 2340 generates index information of the reference images according to the shape information.

Fig.48 is a flowchart illustrating the index generation procedure. According to Fig.48, at step S3100 the colored point generator 2510 obtains colored points by shifting the pixels existing in the reference image by the distance specified in the corresponding depth map. At step S3110 the OTP generator 2520 generates an intermediate OTP image from the set of colored points. At step S3120 the image converter 2530 converts the OTP image into an octree image represented by a cube corresponding to each point. At step S3130 the index information generator 2540 generates index information of the reference image corresponding to each cube.

At step S2940 the node generator 2350 generates octree nodes including the shape information, the index information, and the reference images.

At step S2950 the encoder 2360 encodes the octree nodes into output bit streams.

Fig.49 is a flowchart illustrating the encoding procedure. According to Fig.49, at step S3200 the context determination section 2610 receives an octree node. At step S3210 it is determined whether the position of the current node is less than or equal to 512. If so, the first encoding stage is performed at step S3220 using the 0-context model and arithmetic coding. If it is determined at step S3210 that the position of the current node is greater than 512, the context of the current node is determined (step S3230) and the second encoding stage is performed using the 1-context model with the parent node as context (step S3240). If it is determined at step S3250 that the position of the current node is greater than 2048, the context of the current node is determined (step S3260) and the third encoding stage is performed using the 2-context model (step S3270).

Here the 0-context means that no context is used; the 1-context is the class of the parent node. The total number of classes is 22: two nodes belong to the same class if they are related by one of the orthogonal transformations of the group G generated by the basic transformations. The basic transformations m1, m2 and m3 are defined as follows:

m1: (x, y, z) → (y, x, z),  m2: (x, y, z) → (x, z, y),  m3: (x, y, z) → (-x, y, z),

where m1 and m2 are reflections in the planes x=y and y=z, respectively, and m3 is a reflection in the plane x=0; the 2-context is the pair consisting of the class of the parent node and the position of the current node within the parent node.

Fig.50 is a flowchart illustrating the second encoding stage. According to Fig.50, at step S3300 the probability retrieval section 2710 retrieves the probability of occurrence of the current node in the context from the probability table corresponding to the context. At step S3310 the arithmetic coder 2720 compresses the octree by means of the probability sequence containing the retrieved probability. At step S3320 the table update section 2730 updates the probability table with a predetermined increment, such as 1, for the occurrence frequency of the current node in the current context.

Fig.51 is a flowchart illustrating the third encoding stage. According to Fig.51, at step S3400 the first extraction section 2810 extracts the parent node of the current node. At step S3410 the first determination section 2820 determines the class to which the extracted parent node belongs and the transformation by which the parent node is converted into the standard node of that class. At step S3420 the second extraction section 2830 applies the obtained transformation to the parent node and retrieves the position of the current node in the transformed parent node. At step S3430 the configuration obtaining section 2840 applies the transformation to the current node and obtains the configuration as a combination of the determined class and the position index of the current node. At step S3440 the second determination section 2850 determines the probability from the entry of the probability table corresponding to the obtained configuration. At step S3450 the arithmetic coder 2860 compresses the octree by means of the probability sequence containing the retrieved probability. At step S3460 the table update section 2870 updates the probability table with a predetermined increment, such as 1, for the occurrence frequency of the current node in the current context.

Fig.52 is a flowchart of the bit stream generation procedure in the encoding process. According to Fig.52, if at step S3500 it is determined that the current node is not a leaf node, the symbol byte writing section 2650 writes, at step S3510, the symbol byte corresponding to the current node into the bit streams and proceeds to step S3520. If the current node is a leaf node, the procedure goes directly to step S3520 without executing step S3510.

If at step S3520 it is determined that all child nodes of the current node have the same reference image index and the parent node of the current node has an "undefined" reference image index, the image index writing section 2660 at step S3530 writes that same reference image index into the bit streams for the subcubes of the current node. If the child nodes have different reference image indices, the image index writing section 2660 at step S3540 writes the "undefined" reference image index into the bit streams for the subcubes of the current node.
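A compact sketch of this decision; the numeric code for "undefined" and the emit hook are illustrative stand-ins:

```python
UNDEFINED = -1   # illustrative stand-in for the "undefined" index

def write_index(child_indices, parent_index, emit):
    """Write the reference image index for the subcubes of the current node."""
    if len(set(child_indices)) == 1 and parent_index == UNDEFINED:
        emit(child_indices[0])   # all children share one index: write it once
    else:
        emit(UNDEFINED)          # children differ: defer to per-child indices
```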

Fig.53 is a block diagram of a device for representing three-dimensional objects on the basis of depth images using octrees according to another embodiment of the present invention. Fig.54 is a flowchart illustrating a method for representing three-dimensional objects on the basis of depth images using octrees according to another embodiment of the present invention.

According to Figs.53 and 54, the device 3600 for representing three-dimensional objects on the basis of depth images according to the present invention contains an input block 3610, a first extraction block 3620, a decoder 3630, a second extraction block 3640, and an object representation block 3650.

At step S3700 the input block 3610 receives bit streams from an external device. At step S3710 the first extraction block 3620 extracts octree nodes from the input bit streams.

At step S3720 the decoder 3630 decodes the extracted octree nodes. The decoder 3630 contains components corresponding to those of the encoder; the operations of the respective components of the decoder 3630 are similar to the corresponding operations of the same components of the encoder described with reference to Figs.43-45 and Figs.49-52, and their detailed description is therefore not given here. At step S3730 the second extraction block 3640 extracts the shape information and the reference images for the multiple cubes forming the octree from the decoded octree nodes. At step S3740 the object representation block 3650 represents the object by combining the extracted reference images in accordance with the shape information.

The present invention can be implemented on a computer-readable recording medium as computer-readable code. Computer-readable recording media include all types of recording devices in which data readable by a computer system are stored; examples include ROM, RAM, CD-ROM, magnetic tape, floppy disks, optical data storage devices, and so on; the invention can also be embodied in a carrier wave, for example for transmission over the Internet or another transmission medium. The computer-readable recording medium can also be distributed over network-connected computer systems so that the computer-readable code is stored and executed in a distributed fashion.

As described above, in the image-based representation according to the present invention, high-quality information about a colored three-dimensional object is encoded by a set of simple and regular two-dimensional image structures to which well-known image processing and compression methods are applicable; the algorithm is simple and can be supported by hardware in many respects. In addition, the rendering time for image-based models is proportional to the number of pixels in the reference and output images and, in general, not to the geometric complexity, as in the polygonal case. Furthermore, since the image-based representation is applicable to real-world objects and scenes, photo-realistic rendering of real scenes becomes possible without the use of millions of polygons and expensive computations.

The above description of the invention has been presented for purposes of illustration and explanation. It is not exhaustive and does not limit the invention to the precise form disclosed. Modifications and variations are possible in light of the above principles or may be acquired from practice of the invention. The scope of the invention is defined by the claims.

1. A device for representing three-dimensional objects on the basis of depth images, comprising: a viewpoint information generator for generating at least one piece of viewpoint information; a first image generator for generating a color image on the basis of the color information of the respective pixels constituting the object, corresponding to the viewpoint information; a second image generator for generating a depth image corresponding to the viewpoint information on the basis of the depth information of the respective pixels constituting the object; a node generator for generating image nodes consisting of the viewpoint information and the color and depth images corresponding to the viewpoint information; and an encoder for encoding the generated image nodes.

2. The device according to claim 1, characterized in that the viewpoint information includes a set of fields defining the image plane for the object, the fields constituting the viewpoint information including a position field, in which the position from which the image plane is observed is recorded, an orientation field, in which the orientation from which the image plane is observed is recorded, a projection method field, in which the method of projection from the viewpoint onto the image plane is recorded, a first distance field, in which the distance from the viewpoint to the near boundary plane is recorded, and a second distance field, in which the distance from the viewpoint to the far boundary plane is recorded, the range for the depth information being determined by the distance fields.

3. The device according to claim 1, characterized in that it further comprises a preprocessor for preprocessing the pixels on the boundary between the object and the background of the color image, the preprocessor containing an expansion section for spreading the colors of the boundary pixels into the background using the average color of each block and a fast intensity decay, and a compression section for performing block-based compression that shifts the distortion toward the background.

4. A method for representing three-dimensional objects on the basis of depth images, comprising: generating at least one piece of viewpoint information; generating a color image on the basis of the color information of the respective pixels constituting the object, corresponding to the viewpoint information; generating a depth image corresponding to the viewpoint information on the basis of the depth information of the respective pixels constituting the object; generating image nodes consisting of the viewpoint information and the color and depth images corresponding to the viewpoint information; and encoding the generated image nodes.

5. The method according to claim 4, characterized in that the viewpoint information includes a set of fields defining the image plane for the object, the fields constituting the viewpoint information including a position field, in which the position from which the image plane is observed is recorded, an orientation field, in which the orientation from which the image plane is observed is recorded, a field of view field, in which the field of view from the viewpoint to the image plane is recorded, a projection method field, in which the method of projection from the viewpoint onto the image plane is recorded, a first distance field, in which the distance from the viewpoint to the near boundary plane is recorded, and a second distance field, in which the distance from the viewpoint to the far boundary plane is recorded, the range for the depth information being determined by the distance fields.

6. The method according to claim 5, characterized in that the position represents the relative location of the viewpoint with respect to the origin of the coordinate system.

7. The method according to claim 5, characterized in that the orientation represents the amount of rotation relative to the default orientation.

8. The method according to claim 5, characterized in that, when the orthogonal projection method is selected, the width and height of the field of view correspond to the width and height of the image plane, respectively, and, when the perspective projection method is selected, the horizontal and vertical angles of the field of view correspond to the angles formed by the horizontal and vertical sides of the view volume extending from the viewpoint to the image plane.

9. The method according to claim 4, characterized in that, in the case of a video data format for generating animated objects, the depth information and the color information are multiple sequences of image frames.

10. The method according to claim 4, characterized in that the step of generating the color image includes spreading the colors of the boundary pixels into the background using the average color of each block and a fast intensity decay, and performing block-based compression that shifts the distortion toward the background.

11. A device for representing three-dimensional objects on the basis of depth images, comprising: a viewpoint information generator for generating at least one piece of viewpoint information; a plane information generator for generating plane information defining the width, height, and depth of the image plane corresponding to the viewpoint information; a depth information generator for generating a sequence of depth information for the depths of all points of the object projected onto the image plane; a color information generator for generating a sequence of color information for the respective projected points; and a node generator for generating nodes consisting of the plane information, the depth information sequence, and the color information sequence corresponding to the image plane.

12. The device according to claim 11, characterized in that the viewpoint information includes a set of fields defining the image plane for the object, the fields constituting the viewpoint information including a position field, in which the position from which the image plane is observed is recorded, an orientation field, in which the orientation from which the image plane is observed is recorded, a field of view field, in which the field of view from the viewpoint to the image plane is recorded, a projection method field, in which the method of projection from the viewpoint onto the image plane is recorded, a first distance field, in which the distance from the viewpoint to the near boundary plane is recorded, and a second distance field, in which the distance from the viewpoint to the far boundary plane is recorded, the range for the depth information being determined by the distance fields.

13. The device according to claim 11, characterized in that the plane information consists of multiple fields, the fields constituting the plane information including a first field, in which the width of the image plane is recorded, a second field, in which the height of the image plane is recorded, and a depth resolution field, in which the resolution of the depth information is recorded.

14. The device according to claim 11, characterized in that, in the depth information sequence, the number of projected points and the depths of the corresponding projected points are sequentially recorded, and, in the color information sequence, the color values corresponding to the depth values of the corresponding projected points are sequentially recorded.

15. A method for representing three-dimensional objects on the basis of depth images, comprising: generating at least one piece of viewpoint information; generating plane information defining the width, height, and depth of the image plane corresponding to the viewpoint information; generating a sequence of depth information for the depths of all points of the object projected onto the image plane; generating a sequence of color information for the respective projected points; and generating nodes consisting of the plane information, the depth information sequence, and the color information sequence corresponding to the image plane.

16. The method according to claim 15, characterized in that the viewpoint information includes a set of fields defining the image plane for the object, the fields constituting the viewpoint information including a position field, in which the position from which the image plane is observed is recorded, an orientation field, in which the orientation from which the image plane is observed is recorded, a field of view field, in which the field of view from the viewpoint to the image plane is recorded, a projection method field, in which the method of projection from the viewpoint onto the image plane is recorded, a first distance field, in which the distance from the viewpoint to the near boundary plane is recorded, and a second distance field, in which the distance from the viewpoint to the far boundary plane is recorded, the range for the depth information being determined by the distance fields.

17. The method according to claim 15, characterized in that the plane information consists of multiple fields, the fields constituting the plane information including a first field, in which the width of the image plane is recorded, a second field, in which the height of the image plane is recorded, and a depth resolution field, in which the resolution of the depth information is recorded.

18. The method according to claim 15, characterized in that, in the depth information sequence, the number of projected points and the depths of the corresponding projected points are sequentially recorded, and, in the color information sequence, the color values corresponding to the depth values of the corresponding projected points are sequentially recorded.

19. A device for representing three-dimensional objects on the basis of depth images, comprising: a shape information generator for generating information about the shape of the object by dividing the octree containing the object into 8 subcubes and defining the divided subcubes as child nodes, until each subcube becomes smaller than a predetermined size; a reference image determination block for determining a reference image containing a color image for each cube divided by the shape information generator; an index generator for generating index information of the reference images corresponding to the shape information; a node generator for generating octree nodes including the shape information, the index information, and the reference images; and an encoder for encoding the octree nodes into output bit streams.

20. The device according to claim 19, characterized in that the shape information includes a resolution field, in which the maximum number of octree leaves along a side of the cube containing the object is recorded, an octree field, in which the sequence of internal nodes is recorded, and an index field containing the indices of the reference images corresponding to each internal node.

21. The device according to claim 19, characterized in that the reference image is a depth image node containing viewpoint information and a color image corresponding to the viewpoint information.

22. The device according to claim 21, characterized in that the viewpoint information includes a set of fields defining the image plane for the object, the fields constituting the viewpoint information including a position field, in which the position from which the image plane is observed is recorded, an orientation field, in which the orientation from which the image plane is observed is recorded, a field of view field, in which the field of view from the viewpoint to the image plane is recorded, and a projection method field, in which the method of projection from the viewpoint onto the image plane is recorded, selected from an orthogonal projection method, in which the field of view is represented by width and height, and a perspective projection method, in which the field of view is represented by a horizontal angle and a vertical angle.

23. The device according to claim 19, characterized in that it further comprises a preprocessor for preprocessing the pixels on the boundaries between blocks in the reference image and supplying the preprocessed pixels to the reference image determination block, the preprocessor containing an expansion section for spreading the colors of the boundary pixels into the background using the average color of each block and a fast intensity decay, and a compression section for performing block-based compression that shifts the distortion toward the background.

24. The device according to claim 19, characterized in that the index generator comprises a colored point generator for obtaining colored points by shifting the pixels existing in the reference image by the distance specified in the corresponding depth map; a point-based representation (OTP) generator for generating an intermediate OTP image from the set of colored points; an image converter for converting the OTP image into an octree image represented by a cube corresponding to each point; and an index information generator for generating index information of the reference image corresponding to each cube.

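The colored-point generation can be sketched as follows, with viewpoint_to_world standing in, as an assumption of this illustration, for the camera model implied by the viewpoint information:

    def colored_points(depth_map, color_image, viewpoint_to_world):
        """Shift each reference-image pixel along its viewing ray by the
        distance stored in the depth map, yielding colored 3-D points.
        viewpoint_to_world(u, v, d) -> (x, y, z) encapsulates the projection
        (orthogonal or perspective) of the viewpoint information."""
        points = []
        for v, row in enumerate(depth_map):
            for u, d in enumerate(row):
                if d > 0:                     # 0 = no sample at this pixel
                    points.append((viewpoint_to_world(u, v, d),
                                   color_image[v][u]))
        return points
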
25. The device according to claim 19, characterized in that the encoder comprises: a context-determination section for determining the context of the current octree node based on the number of encoding cycles performed for the octree nodes; a first encoding section for encoding a first predetermined number of nodes by a 0-context model and arithmetic coding, while maintaining a single probability table with a predetermined number of entries; a second encoding section for encoding a second predetermined number of nodes, other than the first predetermined number of nodes, by a 1-context model using the parent node as the context; and a third encoding section for encoding the remaining nodes, following the second predetermined number of nodes, by a 2-context model and arithmetic coding using the parent and child nodes as contexts, wherein the first encoding section starts coding from a uniform distribution, the second encoding section copies the probability table of the 0-context model into all probability tables of the 1-context model at the moment of switching from the 0-context to the 1-context model, and the third encoding section copies the probability table of the 1-context model for a parent-node configuration into the probability tables of the 2-context model corresponding to the respective positions in the same parent-node configuration at the moment of switching from the 1-context to the 2-context model.

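The switching schedule of claim 25 can be sketched as follows; the thresholds n0 and n1 stand for the first and second predetermined numbers of nodes and are illustrative values, and the caller is assumed to advance coded after each symbol:

    class ContextSchedule:
        """0-context -> 1-context -> 2-context switching with the table
        copying described in claim 25.  coded is advanced by the caller
        after every encoded symbol."""
        N_CLASSES = 22                      # parent-node classes (claim 38)

        def __init__(self, n0=512, n1=2048):
            self.coded = 0
            self.n0, self.n1 = n0, n1       # illustrative thresholds
            self.t0 = [1] * 256             # single table, uniform start
            self.t1 = None                  # one table per parent class
            self.t2 = None                  # one per (class, child position)

        def table(self, parent_class, child_pos):
            if self.coded < self.n0:
                return self.t0
            if self.t1 is None:             # switch 0-context -> 1-context:
                self.t1 = [list(self.t0)    # copy t0 into every class table
                           for _ in range(self.N_CLASSES)]
            if self.coded < self.n1:
                return self.t1[parent_class]
            if self.t2 is None:             # switch 1-context -> 2-context:
                self.t2 = [[list(t) for _ in range(8)]  # copy each class
                           for t in self.t1]            # table to 8 positions
            return self.t2[parent_class][child_pos]
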
26. The device according to claim 25, characterized in that the second encoding section comprises: a probability-extraction section for extracting the probability of occurrence of the current node in the context from the probability table corresponding to that context; an arithmetic encoder for compressing the octree by means of a probability sequence containing the extracted probability; and a table-update section for updating the probability table with a predetermined increment to the occurrence frequency of the current node in the current context.

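The probability extraction and table update of claim 26 reduce to a few lines; the increment is an illustrative value and the arithmetic coder that consumes the probability is omitted:

    def code_symbol(table, symbol, increment=32):
        probability = table[symbol] / sum(table)  # fed to the arithmetic coder
        table[symbol] += increment                # adapt to symbol frequencies
        return probability
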
27. The device according to claim 25, wherein the third encoding section comprises: a first extraction section for retrieving the parent node of the current node; a first determination section for determining the class to which the retrieved parent node belongs and the transformation by which the parent node is converted into the standard node of that class; a second extraction section for applying the determined transformation to the parent node and retrieving the position of the current node in the transformed parent node; a configuration-obtaining section for applying the transformation to the current node and obtaining its configuration as a combination of the determined class and the position index of the current node; a second determination section for determining the required probability from the entries of the probability table corresponding to the obtained configuration; an arithmetic encoder for compressing the octree by means of a probability sequence containing the extracted probability; and a table-update section for updating the probability table with a predetermined increment to the occurrence frequency of the current node in the current context.

28. The device according to claim 25, wherein the encoder further comprises: a byte-symbol writing section for writing the byte symbol corresponding to the current node into the bit streams if the current node is not a leaf node; and an image-index writing section for writing the same reference-image index into the bit streams for the subtree of the current node, if all the child nodes of the current node have the same reference-image index and the parent node of the current node has an "undefined" reference-image index, or writing the "undefined" reference-image index into the bit streams for the subtree of the current node, if the child nodes of the current node have different reference-image indexes.

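A simplified serialization sketch in the spirit of claim 28; the value 255 is assumed here to encode the "undefined" index, and the condition on the parent's index is folded into the recursion:

    from dataclasses import dataclass, field
    from typing import List

    UNDEFINED = 255          # assumed encoding of the 'undefined' image index

    @dataclass
    class Node:
        child_mask: int = 0
        image_index: int = UNDEFINED
        children: List["Node"] = field(default_factory=list)

    def write_node(stream, node):
        if not node.children:               # leaf: no byte symbol is written
            return
        stream.append(node.child_mask)      # byte symbol of an internal node
        indexes = {c.image_index for c in node.children}
        if len(indexes) == 1:               # whole subtree shares one image:
            stream.append(indexes.pop())    # write that index once
        else:                               # mixed subtree: mark 'undefined'
            stream.append(UNDEFINED)        # and descend into the children
            for child in node.children:
                write_node(stream, child)
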
29. A method of representing three-dimensional objects on the basis of depth images, comprising: generating shape information of an object by dividing a cube containing the object into 8 subcubes and defining the divided subcubes as child nodes of an octree until each subcube becomes smaller than a predetermined size; determining reference images containing a color image for each divided cube; generating index information of the reference images corresponding to the shape information; generating octree nodes that include the shape information, the index information and the reference images; and encoding the octree nodes into output bit streams.

30. The method according to claim 29, wherein the shape information includes a resolution field, in which the maximum number of octree leaves along a side of the cube containing the object is recorded, an octree field, in which the sequence of internal nodes is recorded, and an index field containing the indexes of the reference images corresponding to each internal node.

31. The method according to claim 30, characterized in that each internal node is represented by a byte, and the node information recorded in the bit sequence forming that byte represents the presence or absence of the child nodes belonging to the internal node.

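As a sketch of claim 31's layout, bit i of the byte is set exactly when the i-th child node is present:

    def pack_children(present):             # present: 8 booleans
        byte = 0
        for i, p in enumerate(present):
            if p:
                byte |= 1 << i
        return byte

    def has_child(byte, i):
        return bool(byte >> i & 1)

    assert pack_children([True] + [False] * 7) == 0b00000001
    assert has_child(0b10000001, 7)
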
32. The method according to claim 29, wherein each reference image is a depth-image node consisting of viewpoint information and a color image corresponding to the viewpoint information.

33. The method according to claim 32, wherein the viewpoint information includes a set of fields defining an image plane for the object, the fields making up the viewpoint information comprising: a position field, in which the position from which the image plane is observed is recorded; an orientation field, in which the orientation from which the image plane is observed is recorded; a viewing-area field, in which the viewing area from the viewpoint to the image plane is recorded; and a projection-method field, in which the method of projection from the viewpoint onto the image plane is recorded, selected from an orthogonal projection method, in which the viewing area is represented by a width and a height, and a perspective projection method, in which the viewing area is represented by a horizontal angle and a vertical angle.

34. The method according to claim 29, wherein generating the index information includes: obtaining colored points by shifting the pixels existing in a reference image by the distances defined in the corresponding depth map; generating an intermediate point-based representation (PBR) image by means of the octree, represented by a cube corresponding to each point; and generating the index information of the reference image corresponding to each cube.

35. The method according to claim 29, wherein the step of determining the reference images includes spreading the boundary pixels into the background using the average color of the block and fast intensity decay, and performing block-based compression so that the distortion is shifted toward the background.

36. The method according to claim 29, wherein the encoding step includes: determining the context of the current octree node based on the number of encoding cycles performed for the octree nodes; a first encoding step of encoding a first predetermined number of nodes by a 0-context model and arithmetic coding, while maintaining a single probability table with a predetermined number of entries; a second encoding step of encoding a second predetermined number of nodes, other than the first predetermined number of nodes, by a 1-context model using the parent node as the context; and a third encoding step of encoding the remaining nodes, following the second predetermined number of nodes, by a 2-context model and arithmetic coding using the parent and child nodes as contexts, wherein the first encoding step starts from a uniform distribution, the second encoding step copies the probability table of the 0-context model into all probability tables of the 1-context model at the moment of switching from the 0-context to the 1-context model, and the third encoding step copies the probability table of the 1-context model for a parent-node configuration into the probability tables of the 2-context model corresponding to the respective positions in the same parent-node configuration at the moment of switching from the 1-context to the 2-context model.

37. The method according to claim 36, characterized in that the 1-context model represents the class of the parent node.

38. The method according to claim 37, characterized in that the total number of classes is equal to 22, and two nodes belong to the same class if they are linked by an orthogonal transformation G generated by the basic transformations m1, m2 and m3, defined as follows:

m_1 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad m_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \qquad m_3 = \begin{pmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}

where m_1 and m_2 are reflections in the planes x = y and y = z, respectively, and m_3 is a reflection in the plane x = 0.
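
The class count stated in claim 38 can be checked mechanically: closing m1, m2 and m3 under composition yields the 48 symmetries of the cube, and grouping the 256 possible child-configuration bytes into orbits yields 22 classes. A sketch, taking the unit cube so that the reflection in the plane x = 0 becomes x -> 1 - x about the cube's own symmetry plane:

    from itertools import product

    VERTS = list(product((0, 1), repeat=3))   # the 8 octants, indexed 0..7

    def m1(v):                 # reflection in the plane x = y
        x, y, z = v
        return (y, x, z)

    def m2(v):                 # reflection in the plane y = z
        x, y, z = v
        return (x, z, y)

    def m3(v):                 # reflection in the plane x = 0 (here x -> 1 - x)
        x, y, z = v
        return (1 - x, y, z)

    # Each transformation becomes a permutation of the 8 octant indexes.
    gens = [tuple(VERTS.index(g(v)) for v in VERTS) for g in (m1, m2, m3)]

    group, frontier = set(gens), set(gens)
    while frontier:            # close under composition
        new = {tuple(p[q[i]] for i in range(8))
               for p in group for q in frontier} - group
        group |= new
        frontier = new

    def act(p, sym):           # apply a permutation to a node's child bits
        out = 0
        for i in range(8):
            if sym >> i & 1:
                out |= 1 << p[i]
        return out

    classes = {min(act(p, s) for p in group) for s in range(256)}
    print(len(group), len(classes))   # -> 48 22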

39. The method according to claim 36, characterized in that the 2-context includes the class of the parent node and the position of the current node in the parent node.

40. The method according to claim 36, characterized in that the second encoding step includes: extracting the probability of occurrence of the current node in the context from the probability table corresponding to that context; compressing the octree by means of a probability sequence containing the extracted probability; and updating the probability table with a predetermined increment to the occurrence frequency of the current node in the current context.

41. The method according to claim 36, characterized in that the third encoding step includes: retrieving the parent node of the current node; determining the class to which the retrieved parent node belongs and the transformation by which the parent node is converted into the standard node of that class; applying the determined transformation to the parent node and retrieving the position of the current node in the transformed parent node; applying the transformation to the current node and obtaining its configuration as a combination of the determined class and the position index of the current node; determining the required probability from the entries of the probability table corresponding to that configuration; and compressing the octree by means of a probability sequence containing the extracted probability, and updating the probability table with a predetermined increment to the occurrence frequency of the current node in the current context.

42. The method according to claim 36, characterized in that the encoding step further includes: writing the byte symbol corresponding to the current node into the bit streams if the current node is not a leaf node; and writing the same reference-image index into the bit streams for the subtree of the current node, if all the child nodes of the current node have the same reference-image index and the parent node of the current node has an "undefined" reference-image index, or writing the "undefined" reference-image index into the bit streams for the subtree of the current node, if the child nodes of the current node have different reference-image indexes.

43. A device for representing three-dimensional objects on the basis of depth images, comprising: an input block for inputting bit streams; a first selection block for extracting octree nodes from the input bit streams; a decoder for decoding the octree nodes; a second selection block for extracting the shape information and the reference images for the set of cubes forming the octree from the decoded octree nodes; and an object-representation block for representing the object by combining the extracted reference images in accordance with the shape information.

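The decoding pipeline of claim 43, as a top-level sketch; decode_octree stands in for the decoder of claims 44 to 49 and project for the mapping of a reference image onto a cube, both assumptions of this illustration:

    def represent(bitstream, decode_octree, reference_images, project):
        leaves = decode_octree(bitstream)    # -> [(cube, image_index), ...]
        points = []
        for cube, image_index in leaves:     # combine each reference image
            image = reference_images[image_index]
            points.extend(project(image, cube))  # with the cube it textures
        return points                        # colored points ready to render
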
44. The device according to claim 43, wherein the decoder includes: a context-determination section for determining the context of the current octree node based on the number of decoding cycles performed for the octree nodes; a first decoding section for decoding a first predetermined number of nodes by a 0-context model and arithmetic decoding, while maintaining a single probability table with a predetermined number of entries; a second decoding section for decoding a second predetermined number of nodes, other than the first predetermined number of nodes, by a 1-context model using the parent node as the context; and a third decoding section for decoding the remaining nodes, following the second predetermined number of nodes, by a 2-context model and arithmetic decoding using the parent and child nodes as contexts, wherein the first decoding section starts decoding from a uniform distribution, the second decoding section copies the probability table of the 0-context model into all probability tables of the 1-context model at the moment of switching from the 0-context to the 1-context model, and the third decoding section copies the probability table of the 1-context model for a parent-node configuration into the probability tables of the 2-context model corresponding to the respective positions in the same parent-node configuration at the moment of switching from the 1-context to the 2-context model.

45. The device according to claim 44, characterized in that the 1-context model represents the class of the parent node.

46. The device according to claim 45, characterized in that the total number of classes is equal to 22, and two nodes belong to the same class if they are linked by an orthogonal transformation G generated by the basic transformations m1, m2 and m3, defined as follows:

m_1 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad m_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \qquad m_3 = \begin{pmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}

where m_1 and m_2 are reflections in the planes x = y and y = z, respectively, and m_3 is a reflection in the plane x = 0.

47. The device according to claim 44, characterized in that the 2-context includes the class of the parent node and the position of the current node in the parent node.

48. The device according to claim 44, characterized in that the second decoding section comprises: a probability-extraction section for extracting the probability of occurrence of the current node in the context from the probability table corresponding to that context; an octree-compression section for compressing the octree by means of a probability sequence containing the extracted probability; and a table-update section for updating the probability table with a predetermined increment to the occurrence frequency of the current node in the current context.

49. The device according to claim 44, wherein the third decoding section comprises: an extraction section for extracting the parent node of the current node; a determination section for determining the class to which the extracted parent node belongs and the transformation by which the parent node is converted into the standard node of that class; a position-extraction section for applying the determined transformation to the parent node and retrieving the position of the current node in the transformed parent node; a configuration-obtaining section for applying the transformation to the current node and obtaining its configuration as a combination of the determined class and the position index of the current node; a probability-determination section for determining the probability from the entries of the probability table corresponding to that configuration; an octree-compression section for compressing the octree by means of a probability sequence containing the extracted probability; and a table-update section for updating the probability table with a predetermined increment to the occurrence frequency of the current node in the current context.

50. The device according to claim 43, wherein the shape information includes a resolution field, in which the maximum number of octree leaves along a side of the cube containing the object is recorded, an octree field, in which the sequence of internal nodes is recorded, and an index field containing the indexes of the reference images corresponding to each internal node.

51. The device according to claim 50, characterized in that each internal node is represented by a byte, and the node information recorded in the bit sequence forming that byte represents the presence or absence of the child nodes belonging to the internal node.

52. The device according to claim 43, wherein each reference image is a depth-image node consisting of viewpoint information and a color image corresponding to the viewpoint information.

53. The device according to claim 52, wherein the viewpoint information includes a set of fields defining an image plane for the object, the fields making up the viewpoint information comprising: a position field, in which the position from which the image plane is observed is recorded; an orientation field, in which the orientation from which the image plane is observed is recorded; a viewing-area field, in which the viewing area from the viewpoint to the image plane is recorded; and a projection-method field, in which the method of projection from the viewpoint onto the image plane is recorded, selected from an orthogonal projection method, in which the viewing area is represented by a width and a height, and a perspective projection method, in which the viewing area is represented by a horizontal angle and a vertical angle.

54. A method of representing three-dimensional objects on the basis of depth images, comprising: inputting bit streams; extracting octree nodes from the input bit streams; decoding the octree nodes; extracting the shape information and the reference images for the set of cubes forming the octree from the decoded octree nodes; and representing the object by combining the extracted reference images in accordance with the shape information.

55. The method according to claim 54, wherein the decoding step includes: determining the context of the current octree node based on the number of decoding cycles performed for the octree nodes; a first decoding step of decoding a first predetermined number of nodes by a 0-context model and arithmetic decoding, while maintaining a single probability table with a predetermined number of entries; a second decoding step of decoding a second predetermined number of nodes, other than the first predetermined number of nodes, by a 1-context model using the parent node as the context; and a third decoding step of decoding the remaining nodes, following the second predetermined number of nodes, by a 2-context model and arithmetic decoding using the parent and child nodes as contexts, wherein the first decoding step starts from a uniform distribution, in the second decoding step the probability table of the 0-context model is copied into all probability tables of the 1-context model at the moment of switching from the 0-context to the 1-context model, and in the third decoding step the probability table of the 1-context model for a parent-node configuration is copied into the probability tables of the 2-context model corresponding to the respective positions in the same parent-node configuration at the moment of switching from the 1-context to the 2-context model.

56. The method according to claim 55, characterized in that the 1-context model represents the class of the parent node.

57. The method according to claim 56, characterized in that the total number of classes is equal to 22, and two nodes belong to the same class if they are linked by an orthogonal transformation G generated by the basic transformations m1, m2 and m3, defined as follows:

m_1 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix}, \qquad m_2 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}, \qquad m_3 = \begin{pmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}

where m_1 and m_2 are reflections in the planes x = y and y = z, respectively, and m_3 is a reflection in the plane x = 0.

58. The method according to claim 55, characterized in that the 2-context includes the class of the parent node and the position of the current node in the parent node.

59. The method according to claim 55, characterized in that the second decoding step includes: extracting the probability of occurrence of the current node in the context from the probability table corresponding to that context; compressing the octree by means of a probability sequence containing the extracted probability; and updating the probability table with a predetermined increment to the occurrence frequency of the current node in the current context.

60. The method according to claim 55, characterized in that the third decoding step includes: retrieving the parent node of the current node; determining the class to which the retrieved parent node belongs and the transformation by which the parent node is converted into the standard node of that class; applying the determined transformation to the parent node and retrieving the position of the current node in the transformed parent node; applying the transformation to the current node and obtaining its configuration as a combination of the determined class and the position index of the current node; determining the required probability from the entries of the probability table corresponding to that configuration; and compressing the octree by means of a probability sequence containing the extracted probability, and updating the probability table with a predetermined increment to the occurrence frequency of the current node in the current context.

61. The method according to claim 54, wherein the shape information includes a resolution field, in which the maximum number of octree leaves along a side of the cube containing the object is recorded, an octree field, in which the sequence of internal nodes is recorded, and an index field containing the indexes of the reference images corresponding to each internal node.

62. The method according to claim 61, characterized in that each internal node is represented by a byte, and the node information recorded in the bit sequence forming that byte represents the presence or absence of the child nodes belonging to the internal node.

63. The method according to claim 54, wherein each reference image is a depth-image node consisting of viewpoint information and a color image corresponding to the viewpoint information.

64. The method according to claim 63, wherein the viewpoint information includes a set of fields defining an image plane for the object, the fields making up the viewpoint information comprising: a position field, in which the position from which the image plane is observed is recorded; an orientation field, in which the orientation from which the image plane is observed is recorded; a viewing-area field, in which the viewing area from the viewpoint to the image plane is recorded; and a projection-method field, in which the method of projection from the viewpoint onto the image plane is recorded, selected from an orthogonal projection method, in which the viewing area is represented by a width and a height, and a perspective projection method, in which the viewing area is represented by a horizontal angle and a vertical angle.

65. A computer-readable recording medium on which a program for implementing on a computer the method of representing a three-dimensional object on the basis of depth images according to claim 29 is recorded.

66. A computer-readable recording medium on which a program for implementing on a computer the method of representing a three-dimensional object on the basis of depth images according to claim 54 is recorded.

Same patents:

The invention relates to the representation of three-dimensional objects obtained using photos of real objects

The invention relates to computer technology and can be used for modeling communication systems

The invention relates to the representation of three-dimensional objects obtained using photos of real objects

The invention relates to the field of data compression, in particular to a method and device for coding shape-change information of a three-dimensional (3D) object

The invention relates to stereological analysis of the size distributions of objects described in the form of elliptical cylinders

The invention relates to the field of stereological analysis of the spatial organization of objects, in particular when studying objects in their planar images

The invention relates to the representation of three-dimensional objects obtained using photos of real objects

The invention relates to the representation of three-dimensional objects obtained using photos of real objects

The invention relates to the field of stereological analysis of the spatial organization of objects, in particular when studying objects in their planar images

The invention relates to computer technology and can be used for modeling the dynamics of the interaction of large-scale systems

The invention relates to the field of computer engineering and can be used in computer-aided design

FIELD: computer-laser breadboarding.

SUBSTANCE: using a system for three-dimensional geometric modeling, a volumetric model of the product is made and divided into thin transverse layers, and a solid model is synthesized layer by layer; the thickness A of the transverse layers is picked from the condition A≤F, where F is the allowed deviation from the nominal profile of the model surface, and the generatrix of the model surface profile passes through the middle line of the transverse layers.

EFFECT: shorter time needed for manufacture of solid model.

1 dwg

FIELD: computer science.

SUBSTANCE: the method includes forming a computer model of the object and determining the mass-center and inertial characteristics of the object model. According to the first variant, the object model is made as a mass-inertia imitator, imitating the mass and the principal central moments of inertia; according to the second variant, as an assembly imitator obtained by combining a dimensional imitator of the object model, in the form of a three-dimensional model with the appropriate outer geometry, with the imitator of mass and principal central moments of inertia; and according to the third variant, as a component imitator, in the form of an assembly consisting of a dimensional imitator of the object model, in the form of a three-dimensional model of the object with the appropriate outer geometry.

EFFECT: higher efficiency, broader functional capabilities, lower laboriousness.

3 cl, 5 dwg

FIELD: technology for encoding and decoding of given three-dimensional objects, consisting of point texture data, voxel data or octet tree data.

SUBSTANCE: the method for encoding data pertaining to three-dimensional objects includes the following procedures: forming three-dimensional object data having a tree-like structure, with labels assigned to the nodes indicating their types; encoding the data nodes of the three-dimensional objects; and forming the three-dimensional object data whose nodes are encoded into a bit stream.

EFFECT: higher compression level for information about image with depth.

12 cl, 29 dwg

FIELD: technology for layer-wise shape generation as part of accelerated modeling systems based on laser-computer modeling.

SUBSTANCE: in the method, by means of a three-dimensional geometric modeling system, a volumetric model of the product is formed and split into thin transverse layers, and layer-wise synthesis of a solid model is performed, the transverse layers being made with different thicknesses A, determined from an appropriate mathematical formula.

EFFECT: less time required for manufacture of solid model.

1 dwg

FIELD: engineering of image processing devices.

SUBSTANCE: information is produced about the position of the surface of an input three-dimensional object; this surface is simplified as a set of base polygons; information is produced about the position of the simplified surface, and a surface information map is generated on the basis of the position information of the surface before and after simplification. The surface of each base polygon in the information map is split into multiple areas, and an excitation function is produced for each area; the error between the object reconstructed from the excitation functions and the given three-dimensional object is determined and compared with a threshold value. If the error is less than the threshold value, a match is set between the coefficients of the excitation functions for the base polygons and the information about the base polygons, the information map being the information about the surface of the input three-dimensional object; if the error is not less than the threshold value, the surface of the object represented by the information map is split more finely than in the previous splitting.

EFFECT: possible processing of three-dimensional object with highly efficient compression.

2 cl, 13 dwg

FIELD: computer-aided design, possible usage for video monitoring of development process of large-scale systems.

SUBSTANCE: the method is based on using arrays of data about the technical-economic characteristics of the military equipment objects being developed, with display and combination of this information in windows on the display screen.

EFFECT: a method of computer modeling of the warfare process that simplifies the modeling of warfare.

10 dwg, 7 tbl

FIELD: technology for displaying multilevel text data on volumetric map.

SUBSTANCE: a three-dimensional map is displayed on screen, and text data are displayed with varying density levels in accordance with the distances from the observation point of the displayed three-dimensional map to the assemblies where the text data are to be displayed. Further, the text data may be displayed with local adjustment of their density on the screen.

EFFECT: transformation of cartographic data with two-dimensional coordinates to cartographic data with three-dimensional coordinates, thus increasing readability of text data.

2 cl, 11 dwg

FIELD: metrological equipment for navigational systems of railroad transport.

SUBSTANCE: in accordance with the method, the working sides of the railroad track are coordinated at a given stationing interval by means of a measuring-computing complex mounted on a moving object. The measuring-computing complex includes a rover, a gyroscopic indicator of Euler angles, indicators of course and track width, a controller and a personal computer. To provide unity of measurements on the railroad main, a single three-dimensional orthogonal coordinate system in a special projection is used. The abscissa axis on the surface of the Earth ellipsoid is aligned with a geodesic line coinciding with the main direction of the railroad main. Geometrical perpendiculars to the abscissa axis are used as ordinates. The base of the coordinate system consists of a system of temporary base stations of a satellite radio-navigation system. The satellite radio-navigation stations are positioned along the railroad at 50-100 km intervals for the time of movement of the measuring-computing complex. Continuous synchronous recording of the indications of all devices and of the satellite receivers of the base stations is performed. Coordinate models of the railroad track having no substantial distortions of angles and distances are taken as the standard. To compensate for systematic errors, the indications of the Euler-angle indicator of the measuring-computing complex are smoothed by a sliding-average filter on a sliding interval equal to the wheel circumference of the moving object. Corrections for the inclination of the antenna are introduced into the satellite coordinates received by the rover of the measuring-computing complex. The indications of the course indicator are calibrated by means of center-affine transformations, converting them to a series of directional angles and scaled horizontal projections. Joint estimation of the complex measurements and of the parameters of the statistical model is performed by the recurrent generalized method of least squares.

EFFECT: increased precision when determining standard coordinate model.

2 cl, 3 dwg
