# Method of generating a node structure for representing three-dimensional objects based on depth images

The invention relates to the representation of three-dimensional objects on the basis of depth images. It can be used when rendering three-dimensional images in computer graphics and animation, and provides the technical result of compact storage of image information together with fast rendering of high-quality output images. This result is achieved by a method comprising the steps of: creating a texture field, in which a color image containing color information for each pixel is recorded; creating a depth field, in which a depth image containing depth information for each pixel is recorded; and generating a SimpleTexture node by combining the texture field and the depth field in the given order. 8 independent and 15 dependent claims, 36 drawings, 10 tables.

I. Field of the Invention

The present invention relates to a structure of nodes for representing three-dimensional objects on the basis of depth images and, more particularly, to a method of generating a node structure for representing objects by images with depth information.

II. Background Art

Since the inception of research into three-dimensional graphics, a principal goal of researchers has been to synthesize realistic graphic scenes resembling real images. Research has therefore been conducted in traditional rendering technologies using polygonal models, and modeling and rendering technologies have been developed far enough to provide very realistic three-dimensional environments. However, the process of forming complex models requires considerable effort by experts and takes much time. Moreover, a realistic and complex environment requires a huge amount of information, which lowers the efficiency of storage and transmission.

At present, polygonal models are commonly used to represent three-dimensional objects in computer graphics. Essentially, an object of arbitrary shape can be represented by sets of colored polygons, such as triangles. Highly advanced software algorithms and developments in graphics hardware make it possible to render complex objects and scenes as quite realistic still and moving polygonal-model images.

However, during the last decade the search for alternative three-dimensional representations has been active, motivated by the complexity of building polygonal models of real-world objects, as well as by the complexity of rendering and the unsatisfactory quality achieved when forming a truly photorealistic scene.

Demanding applications require enormous numbers of polygons; for example, a detailed model of a human body contains several million triangles, which are not easy to handle. Although recent advances in range-finding techniques, such as laser range scanners, allow dense range data to be acquired with acceptable error, obtaining a seamless, complete polygonal model of an entire object remains very costly and difficult. On the other hand, rendering algorithms designed to achieve photographic-quality output are computationally complex and thus far from real-time rendering.

SUMMARY OF THE INVENTION

One aspect of the present invention is to provide a method of generating node structures for computer graphics and animation, designed to represent three-dimensional objects using depth images and called depth image-based representation (DIBR). According to one aspect, the depth image-based node structure includes a texture field, in which a color image containing the color for each pixel is recorded, and a depth field, in which a depth value for each pixel is recorded.

According to another aspect, the depth image-based node structure includes a size field, in which information on the size of the image plane is recorded; a resolution field, in which the depth resolution for each pixel is recorded; a depth field, in which multiple sequences of depth information for each pixel are recorded; and a color field, in which color information corresponding to each pixel is recorded.

According to another aspect, the depth image-based node structure includes a viewpoint field, in which the viewpoint of the image plane is recorded; a visibility field, in which the visibility area from the viewpoint to the image plane is recorded; a projection method field, in which the method of projection from the viewpoint onto the image plane is recorded; and a distance field, in which the distance from the near plane to the far plane is recorded.

According to another aspect, the depth image-based node structure includes a resolution field, in which the maximum number of octree leaves along a side of the enclosing cube containing the object is recorded; an octree field, in which the structure of the internal octree nodes is recorded; an index field, in which the index of the reference image corresponding to an internal node is recorded; and an image field, in which the reference image is recorded.

In accordance with the present invention, the rendering time for depth image-based models is proportional to the number of pixels in the reference and output images but, in general, not to the geometric complexity, as is the case for polygonal models. Moreover, when depth image-based representation is applied to real-world objects and scenes, it becomes possible to render natural scenes with photographic quality without using millions of polygons and expensive computations.

BRIEF DESCRIPTION of DRAWINGS

The above objects and advantages of the present invention will become more apparent from the accompanying drawings, in which:

Fig.1 is a diagram showing examples of image-based representations integrated into the current reference software;

Fig.2 is a diagram of the octree structure and the order of its child nodes;

Fig.3 is a graph illustrating octree compression;

Fig.4 is a diagram of examples of a layered depth image (LDI): (a) illustrates the projection of the object, where the dark cells (voxels) correspond to ones and the white cells to zeros, and (b) illustrates a two-dimensional cross-section in (x, depth) coordinates;

Fig.5 is a diagram illustrating the color component of the "Angel" model after reordering its color data;

Fig.6 is a diagram illustrating the orthogonal invariance of node probabilities: (a) illustrates the original parent and child nodes, and (b) illustrates the same parent and child nodes rotated by 90° around the y-axis;

Fig.7, 8 and 9 show geometry compression data for the best method based on prediction by partial matching (PPM);

Fig.10 is a diagram illustrating two ways of rearranging the color field of the "Angel" model in PointTexture format into a two-dimensional image;

Fig.11 is a diagram of compression examples: (a) and (b) are the original and compressed versions of the "Angel" model, and (c) and (d) are the original and compressed versions of the "Morton" model, respectively;

Fig.12 is a diagram illustrating the "Angel" model in Binary Volumetric Octree (BVO) format and in Textured BVO (TBVO) format;

Fig.13 is a diagram illustrating additional images taken by additional cameras in the TBVO case: (a) the camera index image, (b) the first additional image, and (c) the second additional image;

Fig.14 is a diagram illustrating an example of writing a TBVO data stream: (a) illustrates the TBVO tree structure, where grey corresponds to the "undefined" texture symbol and each color denotes a camera index, (b) illustrates the octree traversal order of the BVO nodes and the camera indices, and (c) illustrates the resulting TBVO data stream, in which filled cubes and octree cubes denote texture bytes and BVO bytes, respectively; and

Fig.15, 17, 18 and 19 are diagrams illustrating TBVO compression for the "Angel", "Morton", "Palm" and "Robots" models, respectively; and

Fig.16 is a diagram illustrating images of the "Angel" and "Morton" models obtained after selective removal of some voxels;

Fig.20 is a diagram of an example of a relief texture image and depth map;

Fig.21 is a diagram showing examples of multilayer pixels;

Fig.22 is a diagram of an example of Box Texture (BT), in which six SimpleTexture objects (pairs of image and depth map) are used to render the model shown in the center;

Fig.23 is a diagram of an example of Generalized Box Texture (GBT): (a) illustrates the camera positions for the palm-tree model, and (b) illustrates the reference image planes for the same model (21 SimpleTexture objects are used);

Fig.24 is a diagram of an example showing a two-dimensional analog of the octree representation: (a) illustrates a "point cloud", and (b) illustrates the corresponding averaged map;

Fig.25 shows the pseudo-code used for writing the TBVO bit stream;

Fig.26 shows the specification of the DIBR nodes;

Fig.27 is a schematic representation of the three-dimensional model view for DepthImage: (a) perspective projection, (b) orthographic projection;

Fig.28 shows pseudo-code for OpenGL-based rendering of SimpleTexture;

Fig.29 is an example illustrating compression of a reference image in SimpleTexture format: (a) illustrates the original reference image, and (b) illustrates the modified reference image in JPEG format;

Fig.30 is an example illustrating rendering of the "Morton" model in different formats: (a) in the original polygonal format, (b) in DepthImage format, and (c) in OctreeImage format;

Fig.31 is a rendering example: (a) illustrates a scanned model of the "Tower" in DepthImage format, and (b) illustrates the same model in OctreeImage format (the scanner data were used without noise removal, so black dots are visible in the upper part of the model);

Fig.32 shows rendering examples of the "Palm" model: (a) illustrates the original polygonal format, and (b) illustrates the same model in DepthImage format;

Fig.33 is a rendering example illustrating a frame from the "Dragon" animation in OctreeImage format;

Fig.34 is a rendering example of the "Angel" model in PointTexture format;

Fig.35A and 35B are diagrams illustrating the relationship of the respective nodes when an object is represented in DepthImage format with a SimpleTexture node and a PointTexture node, respectively; and

Fig.36 is a diagram illustrating the structure of the corresponding OctreeImage node when an object is represented by the OctreeImage node.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

This patent application is based on the U.S. provisional applications listed below, which are fully incorporated herein by reference.

The following summarizes the invention described in U.S. provisional application No. 60/333167, filed November 27, 2001 and entitled "Method and apparatus for encoding an image-based representation of a three-dimensional scene".

This document presents the results of core experiment AFX A8.3 on image-based rendering. The core experiment concerns image-based rendering technology using texture and depth information. Also presented are some changes to the node specification, based on experiments performed after the 57th MPEG (Moving Picture Experts Group) meeting and on discussions during the AFX Ad Hoc Group meeting held in October.

2. Experimental results

2.1 Test models

For stationary objects

The DepthImage node with SimpleTexture

Dog

T-Rex (DepthImage, about 20 cameras)

Monster (DepthImage, about 20 cameras)

ChumSungDae (DepthImage, scanned data)

Palm (DepthImage, 20 cameras)

The DepthImage node with LayeredTexture (layered texture)

Angel

The DepthImage node with PointTexture

Angel

The OctreeImage node

Creature

For animated objects

The DepthImage node with SimpleTexture

Dragon

Dragon within a scene environment

The DepthImage node with LayeredTexture

Not provided

The OctreeImage node

Robot

Dragon within a scene environment

Additional data (obtained by scanning or modeling) will be provided in the future.

2.2 test Results

All of the proposed DIBR nodes have been implemented in the current reference software through the version management system server.

For animated image-based representation (DIBR) formats, synchronization between multiple movie files is required, so that images in the same key frame from each movie file are supplied at the same time. However, the current reference software does not support this synchronization capability, which is possible in MPEG Systems. Thus, at the moment, the animated formats can be rendered only on the assumption that all animation data are already in the files. Temporarily, movie files in AVI format are used for each animated texture.

After several experiments with layered textures, we concluded that the LayeredTexture node is ineffective. This node was proposed for layered depth images. However, there is also the PointTexture node, which supports the same type of image. Therefore, we propose to delete the LayeredTexture node from the list of DIBR nodes. Fig.1 illustrates examples of DIBR integrated into the current reference software.

(Figure removed)

3. Update of the DIBR nodes

In the previous proposal, the DIBR stream contained the images and the camera information, while the DIBR node only needed to contain a link (uniform resource locator, url) to this stream. However, the result of the DIBR discussion during the Ad Hoc Group meeting in Rennes was that the image and camera information must be kept in the DIBR nodes as well as in the stream. Thus, the following is the updated specification of the DIBR nodes. The requirements for the DIBR data stream are given in the section explaining the url field.

Decoder (bit streams) - specification of nodes

The DepthImage node defines a single DIBR texture. When multiple DepthImage nodes are connected to each other, they are processed as a group and thus should be placed under the same Transform node.

The diTexture field specifies the texture with depth, which is mapped onto the region defined in the DepthImage node. It shall be one of the types of depth image-based textures (SimpleTexture or PointTexture).

The position and orientation fields specify the relative position of the viewpoint of the DIBR texture in the local coordinate system. The position is given relative to the origin (0, 0, 0) of the coordinate system, while the orientation specifies a rotation relative to the default orientation. In the default position and orientation, the viewer is on the Z-axis, looking down the -Z axis toward the origin, with +X to the right and +Y straight up. However, the transformation hierarchy affects the final position and orientation of the viewpoint.

The fieldOfView field specifies the viewing angle from the camera viewpoint defined by the position and orientation fields. The first value denotes the angle to the horizontal side, and the second value denotes the angle to the vertical side. The default values are 45 degrees expressed in radians. However, if the orthogonal field is set to TRUE, the fieldOfView field denotes the width and height of the near and far planes.

The nearPlane and farPlane fields specify the distances from the viewpoint to the near and far planes of the visibility area. The texture and depth data show the region enclosed by the near plane, the far plane and the fieldOfView. The depth data are normalized to the distance between nearPlane and farPlane.
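As an illustration of this normalization, the following is a minimal sketch; the function names and the linear mapping are our assumptions, not part of the node specification:

```python
def normalize_depth(z, near_plane, far_plane):
    """Map a distance z, measured from the viewpoint, into [0, 1],
    normalized to the [nearPlane, farPlane] interval as described
    for the depth data (assumed linear)."""
    return (z - near_plane) / (far_plane - near_plane)

def denormalize_depth(d, near_plane, far_plane):
    """Inverse mapping: recover the distance from a normalized depth d."""
    return near_plane + d * (far_plane - near_plane)
```

A point halfway between the planes thus maps to a normalized depth of 0.5, regardless of the absolute plane distances.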

The orthogonal field specifies the view type of the DIBR texture. If this field is set to TRUE, the DIBR texture is based on an orthographic view. Otherwise, the DIBR texture is based on a perspective view.

The depthImageUrl field specifies the address of the depth-image data stream, which may optionally include a header with presence/absence flags for the above fields.

The SimpleTexture node defines a single layer of DIBR texture.

The texture field specifies the flat image containing the color for each pixel. It shall be one of the texture node types (ImageTexture, MovieTexture or PixelTexture).

The depth field specifies the depth for each pixel in the texture field. The size of the depth map shall be the same as the size of the image or movie in the texture field. This field shall be one of the texture node types (ImageTexture, MovieTexture or PixelTexture). If the depth node is NULL or the depth field is unspecified, the alpha channel of the texture field shall be used as the depth map.

The PointTexture node defines multiple layers of DIBR points.

The width and height fields specify the width and height of the texture, respectively.

The depth field specifies a multiple sequence of depth values for each point (in normalized coordinates) in the projection plane, in a traversal order that starts from the point in the lower left corner, continues from left to right to the end of the horizontal line, and then jumps to the line above. For each point, the number of depths (pixels) is stored first, followed by the depth values themselves. The color field specifies the color of the current pixel. The procedure is the same as for the depth field, except that the number of depths (pixels) for each point is not included.
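The traversal order and per-point depth counts described above can be sketched as follows; this is an illustrative helper of our own, not code from the specification:

```python
def encode_pointtexture_depth(layers, width, height):
    """Serialize PointTexture depth data in the traversal order described
    above: bottom-left origin, left to right along each line, then up one
    line. layers[y][x] is the list of normalized depths at point (x, y),
    with y = 0 being the bottom row. For each point the count of depths
    is written first, followed by the depth values."""
    stream = []
    for y in range(height):          # bottom line first, then upward
        for x in range(width):       # left to right within a line
            depths = layers[y][x]
            stream.append(len(depths))   # number of depths (pixels) first
            stream.extend(depths)        # then the actual depth values
    return stream
```

Points with no depths contribute only a zero count, so the decoder can always resynchronize on the per-point counts.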

(Fig.1 and 2 removed)

The OctreeImage node defines an octree structure and the projected textures. The size of the enclosing cube of the entire octree is 1×1×1, and the center of the octree cube is at the origin (0, 0, 0) of the local coordinate system.

The octreeresolution field specifies the maximum number of octree leaves along a side of the enclosing cube. The octree level can be determined from octreeresolution using the following equation: octreelevel = int(log2(octreeresolution - 1)) + 1.
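The equation can be exercised directly; a small sketch (the helper name is ours):

```python
import math

def octree_level(octreeresolution):
    """Octree depth implied by the resolution field, per the equation
    octreelevel = int(log2(octreeresolution - 1)) + 1."""
    return int(math.log2(octreeresolution - 1)) + 1
```

For example, a resolution of 256 leaves per side yields level 8, since 2^8 = 256 subdivisions suffice along each axis.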

The octree field specifies the set of internal octree nodes. Each internal node is represented by one byte. A 1 in the i-th bit of this byte means that child nodes exist for the i-th child of the given internal node, while 0 means that they do not. The order of the internal nodes is the breadth-first traversal order of the octree. The order of the eight children of an internal node is shown in Fig.2.
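Reading such a node byte can be sketched as follows; the assumption that bit i is the i-th least significant bit is ours for illustration, since the text fixes only the child order of Fig.2:

```python
def children_present(node_byte):
    """Return the indices (0-7, in the child order of Fig.2) of the
    children that exist, given one internal-node byte of the octree
    field. Bit i set means child i has descendants."""
    return [i for i in range(8) if node_byte & (1 << i)]
```

A byte of 0 would denote a node with no subdivided children, and 255 a node all eight of whose children are subdivided.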

(Fig.3 removed)

The octreeimages field specifies a set of DepthImage nodes with SimpleTexture for the diTexture field. However, the nearPlane and farPlane fields of the DepthImage node and the depth field of the SimpleTexture node are not used.

The octreeUrl field specifies the address of the octreeImage data stream, which includes:

the header for flags

octreeresolution

octree

octreeimages (multiple DepthImage nodes)

nearPlane not used

farPlane not used

diTexture: SimpleTexture without the depth field

The following summarizes the invention described in U.S. provisional application No. 60/363545, filed March 8, 2002 and entitled "Method and apparatus for compressing and streaming a depth image-based representation".

II. CODING of MOVING PICTURES AND AUDIO ISO/IEC JTC 1/SC 29/WG 11

1. Introduction

This document presents the results of core experiment AFX A8.3 on depth image-based representation (DIBR). This core experiment concerns the depth image-based representation nodes that use textures with depth. These nodes were accepted and included in the proposal for the Committee Draft during the Pattaya meeting. However, the conversion of this information into data streams through the octreeUrl field of the OctreeImage node and the depthImageUrl field of the DepthImage node is still ongoing. This document describes the formats for forming the data streams linked through the mentioned url fields, together with compression schemes for the octree field and the PointTexture node.

2. The data stream format for octreeUrl

2.1. The stream format

The OctreeImage node includes an octreeUrl field, which specifies the address of the OctreeImage data stream. This stream may optionally include:

the header for flags

octreeresolution

octree

octreeImages (multiple DepthImage nodes)

nearPlane not used

farPlane not used

diTexture: SimpleTexture without the depth field

The octree field specifies the set of internal octree nodes. Each internal node is represented by one byte. A 1 in the i-th bit of this byte means that child nodes exist for the i-th child of the given internal node, while 0 means that they do not. The order of the internal nodes is the breadth-first traversal order of the octree. The order of the eight children of an internal node is shown in Fig.2.

(Fig.1 removed)

The octree field of the OctreeImage node is presented in a compact format. However, this field can be compressed further to ensure efficient streaming. The following section describes the compression scheme for the octree field of the OctreeImage node.

2.2. The compression scheme for the field octree

In the octree-based DIBR representation, the data consist of the octree field, which represents the geometric component. Non-identical reproduction of the geometry in a compressed representation leads to clearly visible artifacts. Therefore, the geometry must be compressed without loss of information.

2.2.1. The compressed octree

To compress the octree field, represented in depth-first traversal order of the octree, we developed a lossless compression method that uses some ideas of the approach known as prediction by partial matching (PPM) [1, 2]. The main idea used is "prediction" (i.e., probability estimation) of the next symbol from several previous symbols, which are called the "context". For each context there is a probability table containing the estimated probability of occurrence of each symbol in that context. This is used in combination with an arithmetic coder called a range coder [3, 4].

The two main distinctive features of this method are:

1) use the parent node as the context for the child node.

2) using assumptions about "orthogonal invariance" in order to reduce the number of contexts.

The second idea is based on the observation that the "transition probability" for "parent node - child node" pairs is typically invariant under orthogonal transformations such as rotations and symmetries. This assumption allows us to use more complex contexts without an excessively large number of probability tables. This, in turn, allowed us to achieve very good results in terms of both size and speed, because the more contexts are used, the more accurate the probability estimates and thus the more compact the code.

Coding is the process of constructing and updating the probability tables according to the context model. In the proposed method, the context is modeled as the "parent node - child node" hierarchy in the octree structure. First, we define a symbol as a one-byte node whose bits indicate the occupancy of the subcubes after internal subdivision. Therefore, each octree node can be a symbol, and its numeric value lies in the range 0 to 255. A probability table (PT) contains 256 integer values. The value of the i-th entry (0 ≤ i ≤ 255), divided by the sum of all entries, equals the frequency (probability estimate) of occurrence of the i-th symbol. The probability context table (PCT) is a set of PTs. The probability of a symbol is determined by one and only one PT; which particular PT is used depends on the context. An example of a PCT is shown in Table 1.
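The behavior of a single PT can be sketched as follows; the initial counts and the update increment are our assumptions for illustration, since the document does not fix them here:

```python
class ProbabilityTable:
    """Adaptive probability table sketch for one context: 256 integer
    counts, where an entry divided by the total gives the estimated
    probability of that symbol in this context."""
    def __init__(self):
        # Start uniform so that no symbol ever has zero probability.
        self.counts = [1] * 256

    def probability(self, symbol):
        """Estimated probability of `symbol` (0-255) in this context."""
        return self.counts[symbol] / sum(self.counts)

    def update(self, symbol, increment=1):
        """After coding `symbol` in this context, raise its frequency."""
        self.counts[symbol] += increment
```

In the scheme above, one such table would exist per context, and the table's distribution is what gets fed to the range coder.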

Consider the set of 32 fixed orthogonal transformations, which comprises symmetries and rotations by 90° about the coordinate axes (see Appendix 2). We can then classify the symbols according to the occupancy patterns of their subcubes. According to our method, there are 27 sets of symbols, here called groups, characterized by the following property: two symbols are related by one of these fixed transformations if and only if they belong to the same group.

In byte notation, these groups are represented by 27 sets of numbers (see Appendix 2). In the 1-context model, we assume that the probability table depends not on the parent node itself but only on the group (out of the 27) to which the parent node belongs (thus, 27 tables).

At the moment of switching, the PTs for all contexts are set to copies of the PT of the 0-context model. Then each of the 27 PTs is updated whenever it is used for coding.

After 2048 symbols (another heuristic value) have been coded with the 1-context model, we switch to the 2-context model, which uses the pairs (ParentSymbol, NodeSymbol) as contexts. NodeSymbol simply represents the position of the current node within its parent node. Thus, we have 27*8 contexts for the 2-context model. At the moment of switching to this model, the PT obtained for each context is used for every node within that context, and from that point on the PTs are updated independently.

In more detail, coding with the 1-context and 2-context models proceeds as follows. For the context of the current symbol (that is, the parent node), its group is identified. This is done by table lookup (the geometric analysis was performed at program-development time). Next, we apply the orthogonal transformation that takes our context into the "standard" element of the group (chosen arbitrarily once and for all). The same transformation is applied to the symbol itself (these operations are also performed by table lookup; of course, all computations for all possible combinations were done in advance). In effect, this procedure computes the correct position of the current symbol in the probability table of the group that contains its context. The corresponding probability is then supplied to the range coder.

Thus, given the parent node and the position of the current node, a ContextID is determined, which identifies the group and the position of the PT within the PCT. The probability distribution in that PT and the ContextID are supplied to the range coder. After coding, the PCT is updated for use in the next coding step. It should be noted that the range coder is a variant of arithmetic coding that performs renormalization in bytes instead of bits, halving the execution time at the cost of compression that is only 0.01% worse than a standard implementation of arithmetic coding.

The decoding process is essentially the reverse of the encoding process. It is a routine procedure requiring no detailed description, as it uses exactly the same methods for determining contexts, updating probabilities, and so on.

2.3. Results

Fig.3 shows the compression results for static and animated models (the y-axis denotes the compression ratio). The octree compression ratio varies between 1.5 and 2 times relative to the original octree size, and is at least 30% better than general-purpose lossless compression methods (based on the Lempel-Ziv algorithm, for example, RAR).

(Fig.3 removed)

3. The data stream format for depthImageUrl

3.1. The stream format

The DepthImage node includes a depthImageUrl field, which specifies the address of the depth-image data stream. This stream may optionally include:

a one-byte header of presence/absence flags for the following fields

position

orientation

fieldOfView

nearPlane

farPlane

orthogonal

diTexture (SimpleTexture or PointTexture)

The definition of the PointTexture node, which can be used in the diTexture field, is as follows.

PointTexture{

The PointTexture node defines multiple layers of DIBR points. The width and height fields specify the width and height of the texture, respectively. The depth field specifies a multiple sequence of depth values for each point (in normalized coordinates) in the projection plane, in a traversal order that starts from the point in the lower left corner, continues from left to right to the end of the horizontal line, and then jumps to the line above; for each point, the number of depths (pixels) is stored first, followed by the actual depth values. The color field specifies the color of the current pixel. The procedure is the same as for the depth field, except that the number of depths (pixels) for each point is not included.

The depth and color fields of the PointTexture node are presented in raw format, and the size of these fields will most likely be very large. Therefore, these fields must be compressed to ensure efficient streaming. The following section describes the compression scheme for the mentioned fields of the PointTexture node.

3.2. A compression scheme for PointTexture

3.2.1. Compression of the depth field

The depth field of the PointTexture node is simply a set of points in a "discretized enclosing cube". We assume the projection plane to be the bottom plane. Given an m*n*l grid for the model, with points being the centers of the grid cells (in the octree case we call them voxels), we can treat occupied voxels as ones and empty voxels as zeros. The resulting set of bits (m*n*l bits) is then arranged into a byte stream. This is done by traversing the voxels along the depth direction (orthogonal to the projection plane) in layers of depth 8, and in the usual order across the projection plane (padding with zeros if the depth dimension is not evenly divisible by 8). Thus, we can regard our set of points as a stack of 8-bit gray-scale images (optionally, 16-bit images). The correspondence between voxels and bits is illustrated in Fig.4(a), below.
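The packing of one voxel column into bytes can be sketched as follows; the choice of the first depth layer as the least significant bit is our assumption for illustration:

```python
def pack_depth_column(bits):
    """Pack one voxel column (a list of 0/1 values along the depth axis,
    nearest layer first) into bytes, 8 depth layers per byte, padding the
    final byte with zeros when the depth is not divisible by 8."""
    out = []
    for start in range(0, len(bits), 8):
        chunk = bits[start:start + 8]
        byte = 0
        for i, b in enumerate(chunk):
            byte |= (b & 1) << i   # layer (start + i) -> bit i of this byte
        out.append(byte)
    return out
```

With this bit order, the column from Fig.4(b) discussed below, whose occupied layers are the 2nd, 5th and 9th, packs into the two bytes 18 and 1, i.e. the 16-bit value 18 + 256*1 = 274.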

(Fig.4 removed)

For example, in Fig.4(b) the black squares correspond to points on the object. The projection plane is the horizontal plane. Consider the cross-section of height 16 (its upper boundary is marked with a bold line). Let us interpret the columns as bytes. In other words, the column marked with a dot in the figure is a stack of 2 bytes with values 18 and 1 (or the 16-bit integer 274). If we apply the best of the current PPM-based compression methods to the byte array obtained in this way, we get very good results. However, if a simple 1-context method is applied directly (of course, neither orthogonal invariance nor hierarchical contexts can be used in this case), the compression ratio is somewhat lower. Below is a table giving, for each model, the volume of the uncompressed depth array, the same array compressed by the best PPM compressor, and the same array compressed by our current compressor (figures in Kbytes).

3.2.2. Compression of the color field

The color field of the PointTexture node is a set of colors attributed to the points of the object. Unlike the octree case, the color field is in one-to-one correspondence with the depth field. The idea is to represent the color information as a single image, which can be compressed by one of the known lossy compression methods. The number of elements of this image is much smaller than the number of elements of the reference images in the octree or DepthImage case, which is the main motivation for this approach. The image can be obtained by scanning the points with depth in some natural order.

Consider, first, the scanning order dictated by our original storage format for LDI (PointTexture): geometry scanning in "depth-first" order. Multipixels are scanned in their natural order across the projection plane, as if they were ordinary pixels, and the points within one multipixel are scanned in the depth direction. This scanning order produces a one-dimensional array of colors (the colors of the 1st nonzero multipixel, the 2nd nonzero multipixel, and so on). Once the depth is known, the point colors can be successively reconstructed from this array. To make image compression methods applicable, we must map this long string one-to-one onto a two-dimensional array. This can be done in various ways.

The approach used in the following tests is so-called "blockwise scanning", in which the color string fills 8*8 blocks, and these blocks are arranged in column order. The resulting image is shown in Fig.5.
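A sketch of this blockwise arrangement; the row-major fill within a block and the column order of blocks are our reading of the description, not details fixed by the document:

```python
def blockwise_scan(colors, blocks_per_column=4):
    """Arrange a 1-D color string into 8x8 blocks stacked in columns:
    the first 64 colors fill the first 8x8 block row by row, the next 64
    fill the block directly below it, and so on; when a column of blocks
    is full, filling continues in the next column to the right."""
    block_count = (len(colors) + 63) // 64
    width = 8 * ((block_count + blocks_per_column - 1) // blocks_per_column)
    height = 8 * blocks_per_column
    image = [[None] * width for _ in range(height)]
    for idx, c in enumerate(colors):
        block, offset = divmod(idx, 64)           # which 8x8 block, offset inside
        bcol, brow = divmod(block, blocks_per_column)  # blocks run down each column
        r, col = divmod(offset, 8)                # row-major fill inside the block
        image[brow * 8 + r][bcol * 8 + col] = c
    return image
```

Unfilled cells (when the string length is not a multiple of 64) are left as None here; a real encoder would pad them with some neutral color.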

Compression of this image was performed by several methods, including the standard JPEG method. It turned out that, at least for this type of color scan, significantly better results are obtained with the texture compression method described in [5]. This method is based on adaptive local palettization of each 8*8 block. It has two modes: 8-fold and 12-fold compression (relative to the "raw" BMP format with true-color reproduction at 24 bits per pixel). The success of this method on this type of image is explained by its local palette-based character, which copes with the sharp (even in contrasting colors) local variations of color that arise from "mixing" the colors of points on the front and rear surfaces (which can differ drastically, as in the case of the "Angel" model). The search for an optimal scan aims to reduce these variations as far as possible.

(Fig.5 removed)

3.3. Test results

Appendix 3 provides examples of models in the original and compressed formats. The quality of some models (for example, "Angel") after compression is still not quite satisfactory, while the quality of others ("Grasshopper") is very good. However, there is reason to believe that this problem can be solved by a proper choice of scan. Potentially, the 12-fold compression mode could be used, so that the overall compression ratio would increase still further. Finally, the lossless compression could be improved so as to approach the results of the best octree-based geometry compression methods.

Here we provide a table of compression ratios.

4. Conclusion

This document has reported the results of core experiment A8.3 on depth image-based representation. DIBR data streams and the node fields linking to them were introduced, and compression of the octree and PointTexture data was examined.

5. Links

[1] Cleary, J. G., Witten I. H., Data compression using adaptive coding and partial string matching, IEEE Transactions on Communications, Vol.32(4), pp.396-402, April 1984.

[2] J. J. Rissanen, G. G. Langdon, Universal modeling and coding, IEEE Transactions on Information Theory, Vol.27(1), pp.12-23, Jan. 1981.

[3] M. Schindler, A byte oriented arithmetic coding, Proceedings of Data Compression Conference, 1998.

[4] Martin, G. N. N., Range encoding: an algorithm for removing redundancy from a digitized message, Video & Data Recording Conference, March 1979.

[5] Levkovich-Maslyuk, L., Kalyuzhny, P., Zhirkov, A., Texture compression with adaptive block partitions, Proceedings of the 8th ACM International Conference on Multimedia (Multimedia 2000).

Appendix 1. The geometric meaning of the orthogonal invariance of contexts in the BVO compression algorithm.

The assumption of orthogonal invariance is illustrated in Fig.6. Consider a rotation about the vertical axis by 90° clockwise. Consider the fill patterns of an arbitrary node and of its parent before the rotation (upper image) and after it (lower image). The two different patterns can then be treated as one and the same pattern.

(Figure removed)

Appendix 2. Groups and transformations.

1. 32 fixed orthogonal transformations.

Each transformation is specified by a 5-bit word. A combination of bits corresponds to a combination of the following basic transformations:

1st bit: swap of the x and y coordinates;

2nd bit: swap of the y and z coordinates;

3rd bit: reflection symmetry in the (y-z) plane;

4th bit: reflection symmetry in the (x-z) plane;

5th bit: reflection symmetry in the (x-y) plane.
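The 5-bit encoding above can be illustrated with a small Python sketch. The order in which the elementary swaps and reflections are composed (swaps first, then reflections) is an assumption for illustration, since the source does not specify it:

```python
def apply_transform(code, p):
    """Apply one of the 32 fixed orthogonal transforms coded by 5 bits
    to the point p = (x, y, z). Bit meanings follow the list above."""
    x, y, z = p
    if code & 1:        # 1st bit: swap x and y
        x, y = y, x
    if code & 2:        # 2nd bit: swap y and z
        y, z = z, y
    if code & 4:        # 3rd bit: reflection in the (y-z) plane: x -> -x
        x = -x
    if code & 8:        # 4th bit: reflection in the (x-z) plane: y -> -y
        y = -y
    if code & 16:       # 5th bit: reflection in the (x-y) plane: z -> -z
        z = -z
    return (x, y, z)
```

For example, code 0 is the identity and code 28 (all three reflections) maps a point to its antipode.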

2. 27 groups

For each group, here is the order of this group and the number of non-zero bits in its elements: NumberOfGroup, QuantityOfGroup and NumberOfFillBits(SetVoxels).

3. The characters and transformations

For each symbol (s), the code (g) of the group to which it belongs is given, together with the value (t) of the transformation that maps it to the "standard" element of that group.

The binary symbol number maps to binary voxel coordinates as follows: the i-th bit of the number gives the coordinates x=i&1, y=i&(1<<1), z=i&(1<<2).
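A minimal Python rendering of this bit-to-coordinate mapping, with each coordinate shifted down so that it is 0 or 1:

```python
def bit_coords(i):
    """Binary voxel coordinates of subcube bit i of a node byte.

    As in the text: x = i & 1, y = i & (1 << 1), z = i & (1 << 2);
    y and z are shifted down here so each coordinate is 0 or 1."""
    x = i & 1
    y = (i & (1 << 1)) >> 1
    z = (i & (1 << 2)) >> 2
    return (x, y, z)
```

The eight bit indices 0-7 thus enumerate the eight corners of the unit cube.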

Appendix 3. Images illustrating point texture compression.

Fig.7, 8 and 9 show data describing the compression of the geometry by the best octree-based method.

(Figure removed)

The following summarizes the invention described in provisional U.S. patent application No. 60/376563, filed May 1, 2002, and entitled "Method and apparatus for compressing and forming a data stream for depth image-based representation" (core experiment A8.3).

1. Introduction

This document presents the results of core experiment A8.3 on depth image-based representation (DIBR). This core experiment concerns DIBR nodes that use textures with depth. These nodes received approval and were included in the proposals to the Drafting Committee at the meeting in Pattaya. However, work on streaming this information through the OctreeImage node and the DepthImage node is still ongoing. This document describes the format of the data stream whose linking is performed by these nodes. The streaming format includes the compressed octree field of the OctreeImage node and the compressed depth/color fields of the PointTexture node.

2. DIBR compression formats

This document describes a new method for efficient lossless compression of linkless octree data structures, which in our experiments reduces the volume of this already compact representation by about 1.5-2 times. We also propose several methods of lossless and lossy compression of the PointTexture format, using an intermediate voxel representation in combination with entropy (statistical) coding. These methods were developed on the basis of the observation that the octree field must be compressed without loss, while for the octreeimages field some visually acceptable level of distortion is permissible. The octreeimages field is compressed by MPEG-4 image compression tools (in the case of static models) or by video compression tools (in the case of animated models).

(Fig.1 removed)

2.1.1. Compression of the octree field

The octree field compression is the most important part of OctreeImage compression, since it deals with compressing an already very compact linkless binary tree representation. Nevertheless, in our experiments the method described below reduced the volume of this structure to approximately half of its original size. In the animated OctreeImage version, the octree field is compressed separately for each three-dimensional frame.

2.1.1.1. The context model

Compression is performed by a variant of adaptive arithmetic coding (implemented as a "range coder" [3], [4]) that explicitly exploits the geometric nature of the data. The octree field is a stream of bytes. Each byte represents a node (i.e., a subcube) of the tree, in which the bits indicate the occupancy of the subcubes after internal subdivision. The compression algorithm processes the bytes one by one, as follows:

The context of the current byte is determined.

From the probability table corresponding to this context, the probability (normalized frequency) of occurrence of the current byte in the given context is retrieved.

The probability value is fed to the range coder.

The current probability table is updated by adding 1 to the frequency of occurrence of the current byte in the current context (and, if necessary, it is subsequently renormalized; see details below).
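The per-byte loop above can be sketched as follows. This is a simplified illustration: the range coder itself (step 3) is omitted, `encode_step` only shows how the probability is derived and the frequency table updated, and all names are illustrative rather than taken from the source:

```python
from collections import defaultdict

class ContextModel:
    """Per-context frequency tables feeding an arithmetic (range) coder.

    Tables start uniform (all frequencies 1), as in the 0-context
    model described in the text."""

    def __init__(self, increment=1):
        self.tables = defaultdict(lambda: [1] * 256)
        self.increment = increment

    def encode_step(self, context, byte):
        table = self.tables[context]           # steps 1-2: pick the table
        prob = table[byte] / sum(table)        # probability fed to the coder
        table[byte] += self.increment          # step 4: update the table
        return prob

m = ContextModel()
p1 = m.encode_step(context=0, byte=42)   # uniform table: 1/256
p2 = m.encode_step(context=0, byte=42)   # byte 42 is now more likely
```

A real coder would turn each `prob` into bits via range coding; here we only observe that repeated bytes become progressively more probable.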

Thus, coding is the process of building and updating the probability tables in accordance with the context model. In context-based adaptive arithmetic coding schemes (such as "Prediction by Partial Matching" [1]-[3]), the context of a symbol is usually a string of several preceding symbols. In our case, however, compression efficiency is increased by exploiting the octree structure and the geometric nature of the data. The approach considered here is based on two ideas, which are apparently used for the first time in the octree compression problem.

A. The context of the current node is either its parent node, or the pair {parent node, position of the current node within the parent node}.

B. It is assumed that the probability of occurrence of a specific node at a specific position within a specific parent node is invariant with respect to a certain set of orthogonal transformations (such as rotations or symmetry transformations).

Assumption "B" is illustrated in Fig.6 for the transformation R, which is a rotation by -90° in the x-z plane. The main idea behind assumption "B" is the observation that the probability of occurrence of a specific type of child node at a specific type of parent node should depend only on their relative positions. This assumption is confirmed in our experiments by analysis of the probability tables. It allows us to use more complex contexts without requiring too many probability tables. This, in turn, helps to achieve very good results in terms of data volume and speed. It should be noted that the more contexts are used, the more accurate the probability estimates and, therefore, the more compact the code.

We introduce a set of transformations for which we shall assume invariance of the probability distributions. For these transformations to be applicable in our case, they must map the enclosing cube onto itself. Consider the set G of orthogonal transformations in Euclidean space obtained by all compositions, in any number and order, of the three basic transformations (generators) m_{1}, m_{2} and m_{3},

where m_{1} and m_{2} are the reflections with respect to the planes x=z and y=z, respectively, and m_{3} is the reflection with respect to the plane x=0. One of the classical results of the theory of groups generated by reflections [27] states that G contains 48 distinct orthogonal transformations and is, in fact, the maximal group of orthogonal transformations mapping the cube onto itself (the so-called Coxeter group [27]). For example, the rotation R in Fig.6 is expressed through the generators as

R=m_{3}·m_{2}·m_{1}·m_{2},

where “·” denotes multiplication of matrices.

Applied to an octree node, a transformation from G produces a node with a different subcube fill pattern. This allows the nodes to be categorized according to the fill patterns of their subcubes. In group-theoretic terminology, G acts on the set of all fill patterns of octree nodes. Calculations show that there are 22 distinct classes (in group theory also called orbits), in which, by definition, two nodes belong to the same class if and only if they are related by a transformation from G. The meaning of assumption "B" is then that the probability table depends not on the parent node itself, but only on the class to which the parent node belongs. Note that a context based on the parent nodes would require 256 tables, plus an additional 256×8=2048 tables for a context based on the mutual positions of parent and child nodes, whereas a context based on the class of the parent node requires only 22 tables, plus 22×8=176 tables. It is therefore possible to use an equally complex context with a relatively small number of probability tables. The resulting probability table would have the form shown in Table 2.

2.1.1.2. The encoding process

To improve the accuracy of statistical data for tables of probability, these data are collected in different ways at three stages of the encoding process.

At the first stage, no context is used at all: the "0-context model" is adopted, and a single probability table with 256 entries is maintained, initialized to the uniform distribution.

As soon as a sufficient number of nodes (a heuristic value) has been encoded, we switch to the "1-context model", which uses the parent node class as the context. At the moment of this switch, the probability table of the 0-context model is copied to the tables of all 22 contexts.

After 2048 nodes have been encoded (also a heuristic value), we switch to the "2-context model". At this moment, the 1-context-model probability tables for the parent node configurations are copied to the tables for each position within the same parent node configuration.

The key point of this algorithm is the determination of the context and probability of the current byte. It is implemented as follows. In each class, one fixed element is chosen, called the "standard element". We maintain a class map table (CMT) indicating, for each of the 256 possible nodes, the class to which it belongs, together with a precomputed transformation from G that maps the given node to the standard element of its class. Thus, to determine the probability of the current node N, the following steps are performed:

Obtain the parent node P of the current node.

Retrieve from the class map table the class to which P belongs and the transformation T that maps P to the standard node of that class. Let the class number be c.

Apply T to N. The resulting fill pattern TN then occupies position p within the standard node of class c.

Retrieve the required probability value from the entry TN of the probability table corresponding to the class-position combination (c, p).
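The lookup steps above can be sketched as follows. The tables here are stand-ins filled with a single dummy entry and an identity transformation, purely to show the data flow; a real implementation holds entries for all 256 node bytes and actual transformations from G:

```python
# Hypothetical stand-in tables (not the source's data):
CMT = {0b1011: (5, "identity")}            # parent byte -> (class c, transform T)
TRANSFORMS = {"identity": lambda node: node}
PT = {}                                     # (class, position) -> 256 frequencies

def probability(parent, node, position):
    """Probability of `node` at `position` under `parent`, via the CMT."""
    c, t_name = CMT[parent]                 # class of P and its transformation T
    tn = TRANSFORMS[t_name](node)           # fill pattern of N in the standard node
    table = PT.setdefault((c, position), [1] * 256)   # table for (c, p), uniform start
    return table[tn] / sum(table)           # entry TN of that table

p = probability(parent=0b1011, node=0b0110, position=2)
```

With a freshly initialized uniform table, the returned probability is simply 1/256.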

The modification of the above steps for the 1-context model is obvious. Needless to say, all transformations are precomputed and implemented as lookup tables.

It should be noted that at the decoding stage, by the time node N is decoded, its parent P has already been decoded, and therefore the transformation T is known. All decoding steps are similar to the corresponding encoding steps.

Finally, consider the process of updating the probability values. Let P be the probability table of some context. Denote the entry of P corresponding to the probability of occurrence of node N in this context by P(N). In our implementation, P(N) is an integer, and after each occurrence of N, P(N) is updated as follows:

P(N)=P(N)+A,

where A is an increment parameter whose value typically varies from 1 to 4 for the different context models. Let S(P) be the sum of all entries of P. Then the probability of N fed to the arithmetic coder is computed as P(N)/S(P). As soon as S(P) reaches a threshold value of 2^{16}, all entries are renormalized: to avoid zero values in P, entries equal to 1 are left unchanged, while all other entries are divided by 2.
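A sketch of this update rule in Python, keeping entries equal to 1 unchanged during renormalization so that no probability ever becomes zero (function and parameter names are illustrative):

```python
def update(table, n, increment=1, threshold=1 << 16):
    """Apply P(N) = P(N) + A; renormalize when the table total
    reaches the threshold: entries equal to 1 stay, others are halved."""
    table[n] += increment
    if sum(table) >= threshold:
        for i, v in enumerate(table):
            if v > 1:
                table[i] = v // 2
    return table

t = update([1] * 256, 42, increment=4)   # no renormalization yet
```

Raising `increment` makes the model adapt faster at the cost of noisier estimates.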

2.2. PointTexture compression.

The PointTexture node contains two fields subject to compression, namely depth and color. The main difficulties in compressing PointTexture data are connected with the following requirements:

Geometry compression must be lossless, since distortions in this type of geometric representation are often very noticeable.

The color information has no natural two-dimensional structure, so image compression methods cannot be applied to it directly.

This section proposes three methods of PointTexture compression:

A lossless compression method for nodes in the standard representation.

A lossless compression method for nodes in the reduced-resolution representation.

A compression method for nodes in the reduced-resolution representation, in which the geometry is compressed losslessly and the color lossily.

These methods correspond to three levels of "fidelity" of the object description. The first method assumes that the depth information must be stored with the original 32-bit accuracy. In practice, however, if the PointTexture model is obtained by conversion from a polygonal model, the discretization resolution is chosen according to the actual size of the visible details of the source model and the desired output resolution. In this case 8-11 bits may well satisfy the requirements, and the depth is initially stored in this reduced-resolution format. Our second method then provides lossless compression of such a reduced-resolution representation. The important observation is that for such a relatively small number of bits (compared with the standard 32) one can use an intermediate voxel representation of the model, which allows the depth field to be compressed substantially without loss of information. In both cases the color information is compressed losslessly: after grouping the color data into an auxiliary two-dimensional image, it is stored in PNG (Portable Network Graphics) format.

Finally, the third method achieves a much higher degree of compression by combining lossless compression of the geometry with lossy compression of the color data, the latter performed by a specialized block texture compression method.

2.2.1. Lossless compression for nodes in the standard representation.

This is a simple lossless coding method, which works as follows:

The depth field is compressed by an adaptive range coder similar to the one used for octree field compression. For this format, we use a variant in which a probability table is maintained for each of the one-symbol contexts, the context being simply the previous byte; hence 256 probability tables are used. The depth field is treated as a byte stream, and the geometric structure is not used explicitly.

The color field is compressed after being converted into a flat true-color image. The colors of the points of the PointTexture model are first written into a temporary one-dimensional array, in the same order as the depth values in the depth field. If the total number of points in the model is l, the smallest integer L such that l ≤ L·L is computed, and this long "string" of color values is "wrapped" into a square image with side L (padded, if necessary, with black pixels). This image is then compressed by one of the MPEG-4 lossless image compression tools; in our approach, the PNG (Portable Network Graphics) format was used.
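The "wrapping" of the color string into the smallest sufficient square can be sketched as follows. Plain Python lists stand in for the image, and the padding color and function name are illustrative:

```python
import math

def pack_colors(colors):
    """Wrap a 1-D list of RGB colors into the smallest square image
    of side L with len(colors) <= L*L, padding with black pixels."""
    l = len(colors)
    L = math.isqrt(l)
    if L * L < l:
        L += 1                                  # smallest L with l <= L*L
    padded = colors + [(0, 0, 0)] * (L * L - l)
    # Cut the padded string into L rows of L pixels each.
    return [padded[r * L:(r + 1) * L] for r in range(L)]

img = pack_colors([(255, 255, 255)] * 10)       # 10 points -> 4x4 image
```

Ten points thus land in a 4×4 image with six black padding pixels at the end.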

2.2.2. Lossless compression for nodes in the reduced-resolution representation.

In many cases, 16-bit depth resolution is far too fine. In fact, the depth resolution should correspond to the resolution of the screen on which the model is visualized. In situations where small variations of model depth at different points lead to displacements in the image plane that are much smaller than the pixel size, it is reasonable to use a lower depth resolution, and models are often given in a format in which the depth values occupy 8-11 bits. Such models are usually obtained from other formats, e.g. from polygonal models, by sampling the depth and color on a suitable spatial grid.

Such a reduced-resolution representation can itself be regarded as a compressed form of the standard model with 32-bit depth. However, for such models there exists a still more compact representation that uses an intermediate voxel space. Indeed, the points of the model can be regarded as belonging to the nodes of a uniform spatial grid whose cell size is determined by the discretization step, and the grid can always be assumed uniform and orthogonal. Based on this observation, the depth and color fields of a reduced-resolution PointTexture are compressed as follows.

The color field is compressed by lossless image compression methods, as in the previous method.

The depth field is first converted into the voxel representation and then compressed by the variant of the range coder described in the previous subsection.

The intermediate voxel model is constructed as follows. Consider a discrete voxel space of size width×height×2^{s} (the width and height parameters are described in the PointTexture specification), corresponding to a model depth resolution of s bits. For our purposes, it is not necessary to work with the potentially huge voxel space as a whole, but only with its thin cross-sections. Denote the row-column coordinates in the projection plane by (r, c), and let d denote the depth coordinate. We convert the slices {c=const}, i.e. cross-sections of the model by "vertical planes", into the voxel representation. Scanning a slice along the "columns" parallel to the projection plane, voxel (r, c, d) is declared black if and only if there exists a point of the model with depth value d that projects onto (r, c). This process is shown in Fig.4.

(Fig.2 removed)
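The slice-by-slice voxelization just described might be sketched as follows. The input layout (a list of per-row depth lists for one column slice) is an assumed convenience, not the source's data structure:

```python
def slice_to_voxels(column_points, height, s):
    """Convert one {c = const} slice of a PointTexture to voxels.

    `column_points[r]` lists the depth values of the points projecting
    onto row r of this column. Returns a height x 2**s boolean array
    (nested lists): voxels[r][d] is True ("black") iff a point with
    depth d projects onto (r, c)."""
    voxels = [[False] * (1 << s) for _ in range(height)]
    for r, depths in enumerate(column_points):
        for d in depths:
            voxels[r][d] = True
    return voxels

# Three rows, 2-bit depth: points at depths 1 and 3 in row 0, one at 0 in row 2.
v = slice_to_voxels([[1, 3], [], [0]], height=3, s=2)
```

Only a small fraction of the voxels is black, which is exactly the sparsity the range coder exploits.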

As soon as a slice has been processed, we pass to the next slice; in this way we avoid working with very large arrays. The probability tables are not reinitialized for each new slice. For a wide range of models, only a small fraction of the voxels is black, which makes it possible to achieve a very high compression ratio. Decompression is performed by the obvious inversion of the described operations.

A comparison of depth field compression by the method just described and by the octree representation is given below. The overall compression ratio of the model, however, is determined by the color field, since such irregular images cannot be compressed without distortion. The next subsection considers a combination of lossless geometry compression with lossy color compression.

2.2.3. Compression of PointTexture in the reduced-resolution representation, with the geometry compressed losslessly and the color compressed lossily.

As in the previous method, the depth field is converted into the voxel representation, which is then compressed by a 1-context adaptive range coder. The color field is likewise mapped onto a two-dimensional image; however, the mapping is constructed so that points that are close to one another in three-dimensional space tend to end up close in the image, and a specialized texture compression method (adaptive block partitions, ABP) is applied to the result. The algorithm comprises the following steps:

1. Convert a "slice" of four successive "vertical planes" of the PointTexture model to the voxel representation.

2. Traverse the resulting array of 4×width×2^{s} voxels by:

Traversing the vertical "planes" of 4×4×4 voxel subcubes along the columns parallel to the projection plane: first the column closest to the projection plane, then the next one, and so on (i.e., in the usual order of traversing a two-dimensional array).

Traversing the voxels within each 4×4×4 subcube in an order analogous to the one used to traverse the subcubes of the OctreeImage node.

3. Write the colors of the points of the model into an auxiliary one-dimensional array in the order in which they are encountered during this traversal.

4. Reorder the resulting color array into a two-dimensional image so that successive groups of 64 color samples are placed, column by column, into 8×8 pixel blocks, with the next 64 samples going into the adjacent 8×8 block, and so on.

This way of traversing the array and mapping the result onto a two-dimensional image was chosen for the following reasons. Note that a 4×4×4 subcube and an 8×8 image block contain the same number of samples. If several sequentially traversed subcubes contain enough color samples to fill an 8×8 block, that block is very likely to be fairly uniform, so that after decompression the distortions in the three-dimensional model are hardly noticeable. The ABP algorithm compresses 8×8 blocks independently of one another by means of local palettization. In our tests, the distortions introduced by ABP compression into the resulting three-dimensional model were much smaller than those introduced by JPEG. Another reason for choosing this algorithm was its high decompression speed (for which it was originally designed). The compression ratio can take one of two values, 8 or 12; in the PointTexture compression algorithm, the ratio is fixed at 8.

Unfortunately, this algorithm is not universally applicable. Although the image in Fig.10(b), obtained from a color field in this way, is much more homogeneous than one produced by the natural scanning order, an 8×8 block may still contain color samples corresponding to points that are far apart in three-dimensional space. In that case, the lossy ABP method can "mix" colors from distant parts of the model, which leads to local but noticeable distortions after decompression.

Nevertheless, for many models this algorithm works well. Fig.11 shows a "bad" case (the "Angel" model) and a good case (the "Morton" model). In both cases the model size is reduced by about a factor of 7.

(Fig.4 removed)

3. Test results

This section compares the compression results for the two models, "Angel" and "Morton", in two different formats, OctreeImage and PointTexture. The reference images for each model were 256×256 pixels.

3.1. PointTexture compression.

Tables 3-5 show the results of the different compression methods. The models for this experiment were obtained from models with 8-bit depth fields. The depth values were expanded into the range (1, 2^{30}) with a sampling step of 2^{31}+1, to achieve a more uniform distribution of the bits of the 32-bit depth and thereby, to some extent, imitate "true" 32-bit values.

High compression ratios should not be expected from this kind of lossless compression of true-color images. Since this approach does not use the geometric nature of the data, the compressed depth and color fields are of comparable size.

Now consider how the same models can be compressed losslessly when taken at their "true" depth resolution. Unlike the previous case, the depth field can now be compressed losslessly by about 5-6 times. This is achieved by the intermediate voxel representation, which makes the geometric redundancy of the data much more pronounced: indeed, only a small fraction of the voxels is black. However, since the uncompressed size of the models is smaller than in the 32-bit case, the overall compression ratio is now determined by the compression ratio of the color field, which is even lower than in the 32-bit case (although the output file size is also smaller). It is therefore desirable to be able to compress the color field at least as well as the depth field.

With this aim in mind, the third method uses the lossy compression method called ABP [6]. This method provides much higher compression; however, as with any lossy compression method, it can in some cases produce unpleasant artifacts. In the "Angel" model, spatially distant points do sometimes fall into the same block of the two-dimensional image. The colors of such distant points can differ significantly, and if a block contains too many different colors, local palettization cannot provide an accurate approximation. On the other hand, it is precisely local palettization that allows the vast majority of blocks to be compressed accurately, whereas the distortions introduced by, say, the standard JPEG algorithm become totally unacceptable once the restored colors are placed at the corresponding points of three-dimensional space. The visual quality of the "Morton" model compressed by the same method is, however, excellent, and this was the case for most of the models in our experiments.

3.2. OctreeImage compression

Table 6 presents the sizes of the compressed and uncompressed octree components of our two test models. As can be seen, the size of this field is reduced by approximately 1.6-1.9 times.

However, if we compare these results with uncompressed PointTexture models, even those with 8-bit depth fields, OctreeImage turns out to be much more compact. Table 7 compares PointTexture with OctreeImage (which is 6.7 and 6.8 times more compact, respectively). However, as already mentioned, OctreeImage may contain incomplete color information, as is the case for the "Angel" model. In such cases, three-dimensional color interpolation is used.

Summarizing, we can conclude that the above experiments demonstrate the effectiveness of the developed compression tools. The choice of the best tool for a given model depends on its geometric complexity, the nature of its color distribution, the required rendering speed, and other factors.

4. The list of references.

[1] J. Cleary and I. Witten, "Data compression using adaptive coding and partial string matching", IEEE Trans. on Communications, vol.32, no.4, pp.396-402, April 1984.

[2] J. Rissanen and G. Langdon, "Universal modeling and coding", IEEE Trans. on Information Theory, vol.27, no.1, pp.12-23, January 1981.

[3] M. Schindler, "A byte oriented arithmetic coding", Proc. of Data Compression Conference, March 1998.

[4] G. Martin, "Range encoding: an algorithm for removing redundancy from a digitized message". Video & Data Recording Conference, March 1979.

[5] H. Coxeter and W. Moser, Generators and relations for discrete groups, 3^{rd}edition, Springer-Verlag, 1972.

[6] L. Levkovich-Maslyuk, P. Kalyuzhny and A. Zhirkov, "Texture compression with adaptive block partitions", Proc. of the 8^{th} ACM International Conference on Multimedia, pp.401-403, October 2000.

5. Comments to the Study of ISO/IEC 14496-1/PDAM4

We propose the following corrections to the Study of ISO/IEC 14496-1/PDAM4 (N4627).

Problem: the default value of the orthographic field should be the value used in the most general case.

Solution: change the default value of orthographic from "FALSE" (false) to "TRUE" (true), as follows.

Suggested fix:

Problem: conversion of DIBR into a data stream should be carried out by a uniform method of data stream conversion.

Solution: remove field depthImageUrl of the DepthImage node.

Suggested fix:

Problem: the term "normalized" may be misleading as applied to the depth field in the present context.

Solution: in the 5th paragraph, replace the wording with "is represented in the scale of distances".

Suggested fix:

The nearPlane and farPlane fields specify the distances from the viewpoint to the near and far planes of the viewing volume. The texture and depth data describe the region enclosed by the near plane, the far plane and the fieldOfView. The depth data is represented in the scale of distances from nearPlane to farPlane.

Item 6.5.3.1.2, technical

Problem: conversion of DIBR into a data stream should be carried out by a uniform method of data stream conversion.

Solution: delete the description of the depthImageUrl field (7th paragraph and below) and revise the specification of the depth field in the 3rd paragraph as follows:

Suggested fix:

The depth field specifies the depth of each pixel in the texture field. The depth map must have the same size as the image or movie in the texture field. The depth field must be one of the texture node types (ImageTexture, MovieTexture or PixelTexture), and only nodes representing gray-scale images are allowed. If the depth field is not specified, the alpha channel of the texture field is used as the depth map. If the depth map is specified neither through the depth field nor through the alpha channel, the result is undefined.

The depth field makes it possible to compute the actual distance from a three-dimensional point of the model to the plane passing through the viewpoint parallel to the near and far planes:

(formula removed)

where d is the depth value and d_{max} is the maximum depth value. It is assumed that for the points of the model d>0, with d=1 corresponding to the far plane and d=d_{max} corresponding to the near plane.

The formula is valid both for perspective and for orthogonal projection, since d is the distance between the point in question and the plane. d_{max} is the largest attainable depth value: (1) if the depth is specified through the depth field, the depth value d is equal to the gray-scale value;

(2) If the depth is specified through the alpha channel of the image defined in the texture field, the depth value d is equal to the value of the alpha channel.

The depth value is also used to indicate which points belong to the model: only points with nonzero d belong to the model.

For animated models based on DepthImage, a DepthImage with SimpleTexture as diTexture is used.

Each of the SimpleTexture objects can be animated in one of the following ways:

(1) The depth field is a still image satisfying the above condition, and the texture field is an arbitrary MovieTexture.

(2) The depth field is an arbitrary MovieTexture satisfying the above condition on the depth field, and the texture field is a still image.

(3) Both the depth and the texture fields are MovieTexture, and the depth field satisfies the above condition.

(4) The depth field is not used, and the depth is extracted from the alpha channel of the MovieTexture used to animate the texture field.

Item 6.5.3.3.2, editorial

Problem: the semantics of the depth field are not fully specified.

Solution: replace the specification of the depth field (3rd paragraph) with the proposal below. The conventions adopted for SimpleTexture remain valid in this case.

The depth field specifies a set of depth values for each point in the projection plane, taken with respect to the farPlane (see above), in a traversal order that starts from the point in the lower left corner, continues from left to right to the end of each horizontal line, and then jumps to the line above. For each point, the number of depth values (pixels) is stored first, followed by the depth values themselves.

Item 6.5.3.4.1, H. 1, technical

Problem: using the SFString type for the octree field can lead to inconsistent results.

Solution: replace the type of the octree field with MFInt32.

Suggested fix:

In Paragraph H.1, in the table for Octree, change the octree column as follows:

Item 6.5.3.4.1, technical

Problem: conversion of DIBR into a data stream should be carried out by a uniform method of data stream conversion.

Solution: remove the octreeUrl field from the OctreeImage node.

Suggested fix:

Problem: the definition of the octreeresolution field (2nd paragraph) admits incorrect interpretation.

Solution: correct the description by stating that octreeresolution specifies the maximum allowed number of octree leaves along a side of the enclosing cube. The octree level can be determined from octreeresolution by the following formula: octreelevel = int(log2(octreeresolution - 1)) + 1
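A minimal sketch of this formula in Python (the function name is illustrative):

```python
from math import log2

# Octree level from octreeresolution, per the corrected description:
# octreelevel = int(log2(octreeresolution - 1)) + 1
def octree_level(octree_resolution: int) -> int:
    return int(log2(octree_resolution - 1)) + 1

# e.g. a resolution of 256 leaves along a cube side needs 8 levels,
# while resolutions 9..16 need 4 subdivision levels
print(octree_level(256))  # 8
print(octree_level(9))    # 4
```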

Item 6.5.3.4.2, technical

Solution: remove the description of the octreeUrl field (5th paragraph and below).

Item 6.5.3.4.2, editorial

Problem: the description of OctreeImage animation is not given completely.

Solution: add a paragraph at the end of Paragraph 6.5.3.4.2 describing the order of OctreeImage animation.

Suggested fix:

OctreeImage animation can be performed by the same approaches as the first three of the above-described DepthImage-based animation methods, with the only difference that the octree field is used instead of the depth field.

Item H.1, technical

Problem: the range of the depth data in the PointTexture node may be too small for future applications. Many graphics tools allow z-buffer depth data with a bit depth of 24 or 36 bits. However, the depth field in PointTexture has the range [0, 65535], i.e. 16 bits.

Solution: in Paragraph H.1, in the table for PointTexture, change the column for the range of values, as described in provisional application No. 60/395,304 for a U.S. patent, filed July 12, 2002 and entitled "Method and apparatus for representing and compressing octree data in a depth image-based view".

IV. CODING OF MOVING PICTURES AND AUDIO, ISO/IEC JTC 1/SC 29/WG 11

1. Introduction

This document describes an enhanced version of OctreeImage for the depth image-based representation (DIBR), AFX A8.3. The OctreeImage node was approved and included in the proposals to the Drafting Committee at the Pattaya meeting. However, due to occlusion of the object geometry, visual quality was poor in some specific cases. This document describes an enhanced version of the OctreeImage node, the textured binary volumetric octree (TBVO), and the method of compressing it into a data stream.

2. Textured binary volumetric octree (TBVO)

2.1. General description of TBVO

The purpose of TBVO, as an improved variant of the binary volumetric octree (BVO), is to provide a more flexible representation/compression format with fast rendering. This is achieved by storing camera indices together with the BVO: a BVO-based view consists of (BVO structure + a set of reference images), while a TBVO-based view consists of (BVO structure + a set of reference images + camera indices).

The main problem of BVO visualization is that during rendering the camera index corresponding to each voxel must be determined. To this end, not only the projection onto the cameras, but also an inverse ray-tracing procedure has to be performed. At the very least, the camera from which a given voxel is visible must be determined; therefore, all voxels projected onto a particular camera must be found. With a brute-force approach, this procedure is very slow. We have developed an algorithm that is fast and accurate for the vast majority of object shapes. However, difficulties remain with voxels that are not visible from any of the cameras.

A possible solution could be to store explicit colors for each voxel. However, in this case we face problems when compressing the color information: if we group the voxel colors into an image format and compress it, the color correlation of neighboring voxels is broken, so that the compression ratio becomes unacceptable. In TBVO the solution is to store, for each voxel, the index of the camera (image). This index is usually the same for large groups of voxels, which allows the octree structure to be used for economical storage of the additional information. It should be noted that in experiments with our models the overall volume increase was, on average, only 15%. The modeling is somewhat more complex, but it provides a more flexible way of representing objects of arbitrary geometry.

(Fig.1 removed)

The advantage of TBVO over BVO is that rendering with it is easier and much faster than with BVO, with virtually no restrictions on the geometry of the object.

2.2. An example of TBVO

This section presents a typical example illustrating the effectiveness and the key elements of the TBVO representation. Fig.12(a) shows the BVO-based model "Angel". Since some parts of the body and wing are not observed from any of the cameras, the image rendered with the standard BVO (6 textures) contains many visible "cracks". In the TBVO representation of the same model a total of 8 cameras is used (the 6 cube faces plus 2 additional cameras). The additional cameras are placed inside the cube and aimed perpendicularly at the front and back planes of the cube. Fig.13(b) and (c) show the additional images obtained from these cameras. As a result, as shown in Fig.12(b), a smooth and clear rendering of the model is achieved.

(Fig.2 removed)

2.3. Description of the uncompressed TBVO data stream

We believe that 255 cameras will be enough, so up to 1 byte is allocated for the index. The TBVO data stream is a stream of symbols. Each TBVO symbol is either a BVO symbol or a texture symbol. A texture symbol denotes a camera index, which can be a specific number or the code "undefined". In the following description the code "undefined" is denoted by '?'.

The TBVO data stream is written in breadth-first order. The following describes the procedure for writing the TBVO data stream, given a BVO in which each leaf voxel carries a camera index. This should be done at the modeling stage. All nodes of the BVO, including the leaf nodes (which have no BVO symbol), are traversed in breadth-first order. The following pseudo-code writes the stream.

If CurNode is not a leaf node,

{

Write the BVO symbol corresponding to this node

}

If all sub-nodes of CurNode have the same camera index (texture symbol),

{

If the camera index of the parent of CurNode is equal to '?',

Write the camera index common to the sub-nodes

}

Otherwise,

{

Write the symbol '?'

}

Fig.3. Pseudo-code for writing the TBVO stream.

Following this procedure, the TBVO tree shown in Fig.14(a) yields the symbol stream shown in Fig.14(b). In this example one byte is used to represent a texture symbol. However, in a real data stream each texture symbol requires only 2 bits, since only three values need to be represented (two cameras and the "undefined" code).
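The writing procedure can also be sketched as runnable code. The Node class and the derivation of the common camera index below are assumptions made for the illustration, not part of the specification:

```python
from collections import deque

# Runnable sketch of the stream-writing pseudo-code above. Node layout is
# an assumption: `children` lists the 8 sub-nodes (None for empty
# subcubes) and a leaf carries its camera index in `cam`.

class Node:
    def __init__(self, children=None, cam=None):
        self.children = children if children is not None else [None] * 8
        self.cam = cam

def common_cam(node):
    """Camera index shared by all leaves below `node`, else None ('?')."""
    if all(c is None for c in node.children):
        return node.cam
    cams = {common_cam(c) for c in node.children if c is not None}
    return cams.pop() if len(cams) == 1 else None

def write_tbvo_stream(root):
    stream = []
    queue = deque([(root, False)])   # (node, camera defined by an ancestor?)
    while queue:
        node, parent_defined = queue.popleft()
        if any(c is not None for c in node.children):
            # BVO symbol: one occupancy bit per subcube
            stream.append(sum(1 << i
                              for i, c in enumerate(node.children) if c))
        written = parent_defined
        if not parent_defined:       # parent's camera index was '?'
            cam = common_cam(node)
            if cam is not None:      # one index covers the whole sub-tree
                stream.append(cam)
                written = True
            else:
                stream.append('?')
        for c in node.children:
            if c is not None:
                queue.append((c, written))
    return stream

# a root with two leaf voxels seen by different cameras
root = Node()
root.children[0] = Node(cam=0)
root.children[1] = Node(cam=1)
print(write_tbvo_stream(root))   # [3, '?', 0, 1]
```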

(Fig.4 removed)

2.4. Compression of TBVO

The octreeimages and octree fields of the OctreeImage node are compressed separately. The following methods were developed based on the consideration that the octree field must be compressed losslessly, while some visually acceptable level of distortion is allowed for the octreeimages field.

2.4.1. Compression of the octreeimages field

The octreeimages field is compressed by means of MPEG-4 image compression (for static models), or by the video compression tools allowed in MPEG-4 (for animated models). In our approach, the JPEG format is used for the octreeimages (after some preprocessing in which only the points necessary for three-dimensional visualization are kept; in other words, those parts of the textures that are not used at the three-dimensional visualization stage can be compressed as coarsely as is acceptable).

(Fig.5 removed)

2.4.2. Compression of the octree field

Compression of the octree field is the most important part of OctreeImage compression, because it deals with the compression of an already very compact linkless binary tree representation. Nevertheless, in our experiments the method described below reduced this structure to approximately half of its original size. In the animated OctreeImage version the octree field is compressed separately for each three-dimensional frame.

2.4.2.1. The context model

Compression is performed by a variant of adaptive arithmetic coding (implemented as a "range encoder" [3], [4]) that explicitly exploits the geometric nature of the data. The octree field is a stream of bytes. Each byte represents a node (i.e., a subcube) of the tree, in which the bits indicate the occupancy of the subcubes after internal subdivision. This bit pattern is called the fill pattern of the node. The compression algorithm processes the bytes one by one, as follows:

The context of the current byte is determined, and the probability (normalized frequency) of occurrence of the current byte in that context is retrieved from the probability table (PT) corresponding to the context.

The probability value is fed to the range encoder.

The current PT is updated by adding 1 to the frequency of occurrence of the current byte in the current context (and, if necessary, it is subsequently renormalized; see details below).
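The three steps above can be summarized in a schematic sketch. All names here are illustrative, and the (symbol, probability) pairs merely stand in for range-coder output; in the octree coder the context comes from the parent node:

```python
# Schematic of the per-byte coding loop described above.

def encode_stream(symbols, get_context, tables, increment=1):
    coded = []
    for s in symbols:
        ctx = get_context(s)            # 1. determine the context
        pt = tables[ctx]                # probability table for the context
        prob = pt[s] / sum(pt)          # 2. frequency fed to the range coder
        coded.append((s, prob))
        pt[s] += increment              # 3. update the current PT
    return coded

tables = {0: [1, 1, 1, 1]}
out = encode_stream([2, 2], lambda s: 0, tables)
print(out)   # [(2, 0.25), (2, 0.4)] - the model adapts to the data
```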

Thus, coding is the process of constructing and updating PTs in accordance with the context model. In context-based adaptive arithmetic coding schemes (such as "prediction by partial matching" [1]-[3]), the context of a symbol is usually a string of several preceding symbols. In our case, however, compression efficiency is increased by exploiting the octree structure and the geometric nature of the data. The approach is based on two ideas that are apparently new in the problem of octree compression.

A. For the current node, the context is either its parent node or the pair {parent node, position of the current node within the parent node}.

B. It is assumed that the probability of occurrence of a given node at a given geometric position within a particular parent node is invariant with respect to a certain set of orthogonal transformations. This is illustrated in Fig.6 for the transformation R, a rotation by -90° in the x-z plane. The basic idea behind assumption "B" is the observation that the probability of occurrence of a particular type of child node in a particular type of parent node should depend only on their relative positions. This assumption is confirmed in our experiments by analysis of the probability tables. It allows us to use a more complex context without having too many probability tables. This, in turn, helps to achieve very good results in terms of both data volume and speed. Note that the more contexts are used, the more accurate the probability estimates, and hence the more compact the code.

We introduce the set of transformations for which the probability distributions are assumed to be invariant. For these transformations to be applicable in our situation, they must leave the enclosing cube unchanged. Consider the set G of orthogonal transformations of Euclidean space obtained by all possible combinations, taken in any number and order, of the following three basic transformations (generators) m_{1}, m_{2} and m_{3}: m_{1} is the reflection in the plane x=y, m_{2} is the reflection in the plane y=z, and m_{3} is the reflection in the plane x=0. A classical result of the theory of groups generated by reflections [5] states that G contains 48 distinct orthogonal transformations and is, in fact, the maximal group of orthogonal transformations taking the cube into itself (the so-called Coxeter group). For example, the rotation R in Fig.6 is expressed through the generators as

R=m_{3}·m_{2}·m_{1}·m_{2},

where "·" denotes matrix multiplication.

Applied to an octree node, a transformation from G produces a node with a different fill configuration of subcubes. This allows the nodes to be classified according to the fill configurations of their subcubes.

Using group-theoretic terminology, we say that G acts on the set of all fill configurations of octree nodes. Calculation shows that there are 22 distinct classes (in group theory they are also called orbits), for which, by definition, two nodes belong to the same class if and only if they are related by a transformation from G. The number of elements in a class varies from 1 to 24 and is always a divisor of 48. The practical consequence of assumption "B" is that the probability table depends not on the parent node itself, but only on the class to which the parent node belongs. Note that a context based on parent nodes would require 256 tables, plus an additional 256×8 = 2048 tables for a context based on the mutual positions of parent and child nodes, whereas a context based on the class of the parent node requires only 22 tables, plus 22×8 = 176 tables. It is therefore possible to use an equally complex context with a relatively small number of probability tables. The resulting PT has the form shown in Table 8.
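The class count quoted above can be checked directly by a short program that builds the 48 transformations from the three generators, acting on the 8 subcube corners, and counts the orbits of the 256 fill patterns. This is an illustrative verification, not part of the method:

```python
from itertools import product

# Build G from the generators (reflections in the planes x=y, y=z, x=0)
# as permutations of the 8 subcube corners, then count the orbits of the
# 256 node fill patterns under G.

corners = list(product((0, 1), repeat=3))      # the 8 subcubes of a node

def perm(f):
    """Permutation of corner indices induced by the coordinate map f."""
    return tuple(corners.index(f(c)) for c in corners)

m1 = perm(lambda c: (c[1], c[0], c[2]))        # reflection in plane x=y
m2 = perm(lambda c: (c[0], c[2], c[1]))        # reflection in plane y=z
m3 = perm(lambda c: (1 - c[0], c[1], c[2]))    # reflection in plane x=0

def compose(p, q):
    return tuple(p[q[i]] for i in range(8))

group = {tuple(range(8))}
while True:                                    # closure under the generators
    new = {compose(g, m) for g in group for m in (m1, m2, m3)} - group
    if not new:
        break
    group |= new

def transform(p, pattern):
    """Apply corner permutation p to an 8-bit fill pattern."""
    return sum(((pattern >> i) & 1) << p[i] for i in range(8))

# two patterns are in one class iff some g in G maps one onto the other
orbits = {min(transform(g, pat) for g in group) for pat in range(256)}

print(len(group), len(orbits))   # 48 transformations, 22 classes
```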

2.4.2.2. The encoding process

To improve the statistical accuracy of the probability tables, the data are collected in different ways at the three stages of the encoding process.

In the first stage the context is not used at all (the "0-context model"); a single probability table with 256 entries is maintained, initialized to a uniform distribution.

Once coding of the first 512 nodes has been completed (this number was obtained empirically), we switch to the 1-context model. At the moment of switching, the PT of the 0-context model is copied into the PTs for all 22 contexts.

After 2048 nodes have been coded (also a heuristic value), we switch to the 2-context model. At this moment, the PTs of the 1-context model corresponding to the parent-node configurations are copied into the PTs for each position in the same parent-node configuration.

The key point of the algorithm is the determination of the context and of the probability of the current byte. It is implemented as follows. For each class, one fixed element is chosen, called the "standard element". We maintain a class map table (CMT) indicating, for each of the 256 possible nodes, the class to which it belongs, together with the precomputed transformation from G that takes the given node to the standard element of its class. Thus, to determine the probability of the current node N, the following steps are performed:

Obtain the parent node P of the current node.

Extract from the CMT the class to which P belongs, and the transformation T that takes P to the standard node of that class. Let the class number be c.

Apply T to P and find the child position p into which the current node N is mapped, so that the fill configuration TN is located at position p in the standard node of class c.

Extract the required probability value from the entry of the probability table corresponding to the class-position combination (c, p).

The modification of the above steps for the 1-context model is obvious. Needless to say, all the transformations are precomputed and implemented as a lookup table.

Note that at the stage of decoding a node N, its parent node P has already been decoded and, therefore, the transformation T is known. All the steps at the decoding stage are similar to the corresponding encoding steps.

Finally, consider the process of updating the probability values. Let P be the probability table for some context. Denote by P(N) the entry of P corresponding to the probability of node N in this context. In our implementation, P(N) is an integer, and after each occurrence of N, P(N) is updated as follows:

P(N)=P(N)+A,

where A is an increment parameter whose value typically varies from 1 to 4 for the different context models. Let S(P) be the sum of all the entries of P. Then the probability of N that is fed to the arithmetic coder (the range coder in our case) is computed as P(N)/S(P). As soon as S(P) reaches a threshold value, all the entries are renormalized: in order to prevent entries of P from becoming zero, entries equal to 1 are left unchanged, while all the other entries are divided by 2.
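A minimal sketch of this update and renormalization rule follows; the threshold value used here is an assumption chosen for illustration, since the text only requires that some cumulative bound triggers renormalization:

```python
# Sketch of the PT update rule described above.

def update_pt(pt, symbol, increment, threshold=8192):
    """Add the adaptive increment; when the table sum exceeds the
    threshold, halve every entry except those equal to 1."""
    pt[symbol] += increment
    if sum(pt) > threshold:
        for i, v in enumerate(pt):
            if v > 1:
                pt[i] = v // 2
    return pt

pt = update_pt([1, 1, 1, 1], 2, 4)
print(pt, pt[2] / sum(pt))   # [1, 1, 5, 1] 0.625
```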

2.4.2.3. Coding of "camera" nodes

Compression of the stream of symbols that define the texture (camera) number for each voxel uses its own probability table. In the terminology introduced above, this stream has a single context. The PT entries are updated with a larger increment value than that used for the octree nodes; in other respects the procedure does not differ from the coding of node symbols.

2.5. Results of TBVO compression and rendering

Fig.15, 17, 18 and 19 show the TBVO compression results. Fig.16 shows images of the models "Angel" and "Morton" obtained after selective removal of some of the voxels. The compressed volumes are compared with the compressed BVO: in the third column the number in parentheses is the volume of the compressed geometry, while the first number is the total size of the compressed TBVO-based model (i.e. including textures). As a measure of visual distortion, the peak signal-to-noise ratio (PSNR) was computed; it estimates the color difference after the LDI→(T)BVO→LDI conversion. The size of the compressed model consists of the compressed geometry and the compressed textures. For TBVO, the compressed geometry also includes the camera information. TBVO shows a significant PSNR improvement compared with BVO.

Rendering with TBVO is faster than with BVO. For the "Angel" model the frame rate of TBVO-12 is 10.8 fps, while for BVO it is 7.5 fps. For the "Morton" model the frame rate of TBVO-12 is 3.0 fps, versus 2.1 fps for BVO (on a Celeron 850 MHz). It was also found that for animation TBVO rendering is accelerated even more: for the "Dragon" model the frame rate of TBVO-12 was 73 fps, versus 29 fps for BVO (on a Pentium IV 1.8 GHz).

(Fig.6 removed)

The TBVO format provides greater flexibility. For example, Fig.6 illustrates two ways of using 12 cameras: TBVO-12 and TBVO-(6+6). TBVO-12 uses the 6 BVO cameras (cube faces) plus 6 images parallel to the faces, taken from the center of the cube. In the (6+6) configuration, the 6 BVO cameras are used first, then all voxels visible from these cameras are removed, and the same 6 cameras "photograph" the parts that become visible after this removal. Note the substantial difference in quality (and in PSNR) between the BVO and TBVO-6 representations of the "Angel" model: although the same camera positions are used in both cases, TBVO makes it possible to assign camera numbers to all voxels, even those that are not visible from any of the cameras. These numbers are chosen so as to match the original colors as closely as possible (that is, for each point, regardless of its visibility, the best color match among all the "camera" images is chosen). For the "Angel" model this gives excellent results.

Note also the modest difference in the volume of the "geometry" (i.e., BVO + cameras) between the 6-camera and 12-camera cases. In fact, the additional cameras usually cover small areas, so their identifiers are rare and their corresponding textures are sparse (and easily compressed). All of the above applies not only to the "Angel" model, but also to the "Morton", "Palm" and "Robots" models.

(Fig.8-10 removed)

2.6. Node specification

The OctreeImage node defines the TBVO structure, showing the structure of a binary octree, the array of camera indices corresponding to it, and a set of images in the octree format.

The octreeimages field specifies a set of DepthImage nodes with SimpleTexture as diTexture; the orthographic field of these DepthImage nodes must be set to TRUE. The texture field of each SimpleTexture object stores the color information of the object, or of a part of it (for example, its cross-section by the camera plane), as obtained by an orthographic camera whose position and orientation are specified in the corresponding fields of the DepthImage. The correspondence between the parts of the object and the cameras is established at the model-construction stage. The partitioning of the object, using the values of the position, orientation and texture fields, is performed so as to minimize the number of cameras (or, equivalently, of the images in the octree format that are used) while covering all parts of the object potentially visible from an arbitrarily chosen viewpoint. The orientation field must satisfy the following condition: the camera view vector must have only one non-zero component (i.e., it must be perpendicular to one of the faces of the enclosing cube). In addition, the sides of the SimpleTexture image must be parallel to the corresponding sides of the enclosing cube.

The octree field completely describes the object geometry. The geometry is represented as the set of voxels making up the object. An octree is a tree data structure in which each node is represented by one byte: a 1 in the i-th bit of the byte means that child nodes exist for the i-th subcube of the given internal node, while 0 means that they do not. The order of the internal octree nodes is the order of breadth-first traversal of the octree. The order of the 8 children of an internal node is shown in Fig.14(b). The size of the enclosing cube of the whole octree is 1×1×1, and the center of the octree cube is located at the origin (0, 0, 0) of the local coordinate system.

The cameraID field contains an array of camera indices assigned to the voxels. At the rendering stage, the color attributed to an octree leaf is determined by orthogonally projecting the leaf onto one of the images in the octree format with a particular index. The indices are stored in an octree-like form: if a particular camera can be used for all the leaves contained in a specific node, a node containing the index of this camera is placed into the stream; otherwise, a node containing the fixed "further subdivision" code is placed, which means that the camera index is specified separately for each child subcube of the current node (in the same recursive form). If the cameraID field is empty, the camera indices are determined during rendering (as in the BVO case).

The octreeresolution field specifies the maximum allowed number of octree leaves along a side of the enclosing cube. The octree level can be determined from octreeresolution using the following equation:

octreelevel = ⌈log2(octreeresolution)⌉

2.7. Bitstream specification

2.7.1. Octree compression

2.7.1.1. Overview

The OctreeImage node in the depth image-based representation defines the octree structure and the projected textures. Each texture stored in the octreeimages array is defined through a DepthImage node with SimpleTexture. The other fields of the OctreeImage node can be compressed by octree compression.

2.7.1.2. Octree

2.7.1.2.1. Syntax

2.7.1.2.2. Semantics

The compressed octree data stream contains an octree header and one or more octree frames, each preceded by octree_frame_start_code. The value of octree_frame_start_code is always 0x000001C8. This value is detected by look-ahead parsing of the stream.

2.7.1.3. OctreeHeader

2.7.1.3.1. Syntax

2.7.1.3.2. Semantics

This class reads the header information of the compressed octree.

The variable octreeResolution, whose length is specified by octreeResolutionBits, contains the value of the octreeResolution field of the OctreeImage node. This value is used to derive the octree level.

The variable numOfTextures contains the number of textures (cameras) used; this value is used in the arithmetic coding of the camera ID for each octree node. If the value is 0, curTexture of the root node is set equal to 255 and texture symbols are not coded.

2.7.1.4. OctreeFrame

2.7.1.4.1. Syntax

2.7.1.4.2 Semantics

This class reads a single octree frame in breadth-first traversal order. Starting from the first node at level 0, all nodes of the current level are read; the number of nodes at the next level is determined by counting the 1s in each node symbol. At the next level, that number of nodes (nNodesInCurLevel) is read from the stream.

For the decoding of each node, the corresponding contextID is computed as described in Paragraph 2.7.1.6.

If the texture (camera) ID of the current node (curTexture) is not defined by the parent node, the texture ID is also read from the stream, using the texture-ID context given by the variable textureContextID. If a non-zero value is extracted (i.e., the texture ID is defined), this value is applied to all child nodes at the subsequent levels. After every node has been decoded, the texture-ID value is assigned to those leaf nodes of the octree that have not yet received one.

2.7.1.5. The adaptive arithmetic decoder

This section describes, using C++-style syntactic descriptions, the adaptive arithmetic coder used in octree compression. aa_decode() is the function that decodes a symbol using the model specified by the array cumul_freq[], and PCT is an array of context probability tables described in Paragraph 2.7.1.6.

2.7.1.6. The decoding process

The general structure of the decoding process is described in Paragraph 0 (see also the description of the encoding process above). This paragraph shows how the TBVO nodes are obtained from the bit stream, which is the TBVO model coded (compressed) by arithmetic coding.

At each stage of the decoding process, the context number (i.e., the index of the probability table in use) and the probability table itself must be updated. We call the union of all the probability tables (arrays of integers) the probabilistic model; the j-th element of the i-th probability table, divided by the sum of all elements of that table, estimates the probability of occurrence of the j-th symbol in the i-th context.

The process of updating a probability table is as follows. Before decoding a symbol, the context number (contextID) must be chosen. The contextID is determined from the already decoded data, as specified below in Paragraphs 0 and 0. Once the contextID has been obtained, the symbol is decoded by the binary arithmetic decoder. The probability table is then updated by adding the adaptive increment to the frequency value of the decoded symbol. If the total (cumulative) sum of the table entries becomes greater than the cumulative threshold, normalization is performed (see 2.7.1.5.1).

2.7.1.6.1. Context modeling of texture symbols

In modeling texture symbols, only one context is used. This means that only one probability table is used. The size of this table is equal to numOfTextures plus one. Initially, all the entries of the table are set to 1. The maximum allowed entry value is set to 256. The adaptive increment is set to 32. This combination of parameter values allows adaptation to a highly variable stream of texture numbers.

2.7.1.6.2. Context modeling of node symbols

There are 256 different node symbols, each representing a 2×2×2 binary array of voxels. Three-dimensional orthogonal transformations applied to the voxel arrays transform the corresponding symbols into each other.

Consider a set of 48 fixed orthogonal transformations, i.e., rotations by 90·n (n = 0, 1, 2, 3) degrees about the coordinate axes together with the symmetry (reflection) transformations. The matrices of these transformations are listed below in the order of their numbers:

OrthogonalTransforms[48] =

{

(48 matrices removed)

}

There are 22 sets of symbols, called classes, such that two symbols are connected by one of these transformations if and only if they belong to the same class. With this method of constructing the PTs, the context ID of a symbol is either the number of the class to which its parent node belongs, or a composite number (the class of the parent node, the position of the current node within the parent node). This considerably reduces the number of contexts, which reduces the time required to gather meaningful statistics.

For each class, one base symbol is defined (see Table 9), and for each symbol the orthogonal transformation that takes it to the base symbol of its class is precomputed (in the actual encoding/decoding process a lookup table is used). After the context of a symbol has been determined, the transformation (its matrix) that takes its parent to the base element is applied. Table 10 gives, for each symbol, its context and the corresponding direct transformation.

The choice of the context model depends on the number N of symbols already decoded:

For N < 512 only one context is used. The entries of the probability table are initialized to 1. The number of symbols in the probability table is 256. The adaptive increment is 2. The maximum cumulative frequency is 8192.

For 512 ≤ N < 2560 (= 2048 + 512), the 1-context model is used (in the sense that the context number is a single parameter, the class number). This model uses 22 PTs. The contextID is the number of the class to which the parent of the decoded node belongs. This number can always be determined from the lookup table (see Table 10), since the parent is decoded earlier than the child. Each of the 22 PTs is initialized with the PT of the previous stage. The number of symbols in each probability table is 256. The adaptive increment is 3. The maximum cumulative frequency is also 8192. After a symbol is decoded, the inverse of the orthogonal transformation associated with the parent node symbol of the current node is applied to it.

After 2560 symbols have been decoded, the decoder switches to the 2-context model (in the sense that the context number is now composed of two parameters, as explained below). This model comprises 176 (= 22×8, i.e., 22 classes times 8 positions) PTs. Here, the contextID depends on the class of the parent node and on the position of the current node within the parent node. The initial values of the probability tables in this model depend only on the class, not on the position: for each of the 8 positions, the corresponding PT is a clone of the PT obtained for that class at the previous stage. The number of symbols in each probability table is 256. The adaptive increment is 4. The maximum cumulative frequency is also 8192.

As with the previous model, after a symbol is decoded it is likewise transformed by the inverse orthogonal transformation (the transformation associated with the parent symbol, as specified in Table 10).
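The three-stage schedule described above can be summarized as a small selector; this is a sketch, and the returned tuple layout is illustrative (the maximum cumulative frequency, 8192, is the same at every stage):

```python
# Summary of the three-stage context schedule described above. Returns
# (model name, number of probability tables, adaptive increment); the
# thresholds 512 and 2560 = 2048 + 512 come from the text.

def context_stage(n_decoded):
    if n_decoded < 512:
        return ("0-context", 1, 2)     # one uniformly initialized PT
    if n_decoded < 2560:
        return ("1-context", 22, 3)    # one PT per parent-node class
    return ("2-context", 176, 4)       # 22 classes x 8 positions

print(context_stage(100), context_stage(3000))
```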

Using Table 10, the geometry of the base elements of each class can easily be obtained. The base elements are the symbols whose transformation identifier is 0 (0 being assigned to the identity transformation).

[1] J. Cleary and I. Witten, "Data compression using adaptive coding and partial string matching", IEEE Trans. on Communications, vol. 32, no. 4, pp. 396-402, April 1984.

[2] J. Rissanen and G. Langdon, "Universal modeling and coding", IEEE Trans. on Information Theory, vol.27, no.1, pp.12-23, January 1981.

[3] M. Schindler, "A byte oriented arithmetic coding", Proc. of Data Compression Conference, March 1998.

[4] G. Martin, "Range encoding: an algorithm for removing redundancy from a digitized message". Video & Data Recording Conference, March 1979.

[5] H. Coxeter and W. Moser, Generators and Relations for Discrete Groups, 3rd edition, Springer-Verlag, 1972.

The following is a detailed description of the node specifications and of the MPEG-4 compression methods for octree-based image formats used in the device and method of the present invention for representing a three-dimensional object on the basis of depth images.

This invention describes a family of data structures, depth image-based representations (DIBR), which provide an effective and efficient representation based mostly on images and depth maps, making full use of the advantages described above. The following is a brief description of the key DIBR formats: SimpleTexture, PointTexture and OctreeImage.

Fig.20 is a diagram showing an example of a relief-texture image and a depth map, and Fig.21 is a diagram showing an example of a layered depth image (LDI). A SimpleTexture consists of an image, the corresponding depth map and a camera description (position, orientation and type: orthographic or perspective). The representation capability of a single SimpleTexture is limited to objects such as a building facade: a frontal image together with a depth map allows the facade to be reproduced over a significant range of viewing angles. However, a collection of SimpleTexture objects produced by properly positioned cameras makes it possible to reproduce the whole building, since the reference images then cover all potentially visible parts of the building surface. Naturally, the same applies to trees, human figures, cars, etc. Moreover, the union of SimpleTexture objects provides a very natural means of handling three-dimensional animation data. In this case the reference images are replaced by reference video streams. The depth maps for each three-dimensional frame can be represented either by the alpha values of these streams or by separate grey-scale video streams. In this type of representation the images can be stored in lossy compression formats such as JPEG, which significantly reduces the volume of the color information and, hence, the total volume of stored information.

For objects of complex shape it can sometimes be extremely difficult to cover the whole visible surface with an acceptable number of reference images. The preferred representation for such cases may be PointTexture. This format also stores a reference image and a depth map, but in this case both are multi-valued: for each viewing ray provided by the camera (orthographic or perspective), a color and a distance are stored for every intersection of that ray with the object. The number of intersections can vary from ray to ray. The union of several PointTexture objects gives a very detailed representation even for complex objects. However, this format lacks, first of all, the two-dimensional regularity of SimpleTexture, and thus has no natural image-based compressed form. For the same reason it is used only for stationary objects.
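The multi-valued idea above can be sketched as follows: one list of (depth, color) intersections per viewing ray, with a ray per pixel of the reference camera. The class and method names are our illustration, not the normative format.

```python
# Minimal sketch of the PointTexture idea: every viewing ray keeps
# *all* intersections with the object as (depth, color) pairs, so the
# number of samples can differ from ray to ray.
class PointTexture:
    def __init__(self, width, height):
        self.width, self.height = width, height
        # one list of (depth, color) pairs per pixel/ray
        self.layers = [[] for _ in range(width * height)]

    def add_sample(self, x, y, depth, color):
        self.layers[y * self.width + x].append((depth, color))

pt = PointTexture(2, 2)
pt.add_sample(0, 0, 5, 0xFF0000)   # front-surface hit
pt.add_sample(0, 0, 9, 0x00FF00)   # back-surface hit along the same ray
```

Note that, unlike a SimpleTexture, the ragged per-ray lists have no fixed two-dimensional layout, which is exactly why no natural image-based compression applies.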

The OctreeImage format occupies an intermediate position between the "mostly two-dimensional" SimpleTexture and the "mostly three-dimensional" PointTexture: it stores the geometry of the object in a volumetric representation structured as an octree (hierarchically organized voxels in the usual binary subdivision manner), together with a set of reference images and an additional octree-like structure in which, for each leaf voxel, the index of the reference image containing its color is stored. At the rendering stage, the color of a leaf voxel is determined by its orthogonal projection onto the corresponding reference image. We have developed a highly efficient compression method for the geometric part of OctreeImage. This method is a variant of context-based adaptive arithmetic coding, where the contexts are constructed with explicit use of the geometrical nature of the data. Used together with reference images compressed with loss, this makes OctreeImage a very economical representation in terms of data volume. Like SimpleTexture, OctreeImage has an animated version: reference video streams instead of reference images, plus two additional streams of octree data representing the geometry and the voxel-to-image correspondence for each frame. A very useful feature of the OctreeImage format is its built-in capability for averaged (mipmap-like) multiresolution rendering.

The DIBR family was developed for the new version of the MPEG-4 standard and accepted for inclusion in the Animation Framework eXtension (AFX) of MPEG-4, which comprises a collection of interoperable tools providing a reusable architecture for interactive animated content (compatible with existing MPEG-4). Each AFX tool is compatible with a node of the Binary Format for Scenes (BIFS), a synthesized data stream and an audio-visual stream. The current version of AFX consists of high-level animation descriptions (for example, skeletal animation), advanced rendering (for example, procedural texturing, mapping of colors onto objects), compact representations (for example, non-uniform rational B-splines (NURBS), solid representations, subdivision surfaces and curves), low-bit-rate animation (for example, interpolator compression) and other descriptions, including our DIBR.

The DIBR formats were developed with the aim of combining the advantages of different ideas proposed earlier, thereby giving the user flexible tools best suited to a particular task. For example, the non-animated SimpleTexture and PointTexture are particular cases of known formats, while OctreeImage is clearly a new representation. But in the context of MPEG-4 all three basic formats not only cover many of the image-based representations proposed in the literature, but also provide considerable potential for constructing new formats.

The following describes the depth image-based representations.

Taking into account the ideas mentioned in the previous section, as well as some of our own developments, we propose the following set of depth image-based formats intended for use in the new version of MPEG-4: SimpleTexture, PointTexture, DepthImage and OctreeImage. It should be noted that SimpleTexture and OctreeImage have animated versions.

SimpleTexture is a single image combined with an image with depth. It is the equivalent of a relief texture (RT), while PointTexture is the equivalent of a layered depth image (LDI).

Taking SimpleTexture and PointTexture as building blocks, we can construct a variety of representations using the structural components of MPEG-4. Formal specifications are given below; here the result is described from a geometrical point of view.

The DepthImage structure specifies either a SimpleTexture or a PointTexture together with a bounding cube, a position in space and some additional information. A set of DepthImage objects can be combined under a single structure called a Transform node. These combinations do not have dedicated MPEG-4 names, but in our practice we call them "box texture" (BT) and "generalized box texture" (GBT). BT is a union of six SimpleTexture objects corresponding to the faces of a cube bounding the object or scene, while GBT is a union of an arbitrary number of SimpleTexture objects that together provide a consistent three-dimensional representation. An example of BT is given in Fig.22, which shows the reference images, the depth maps and the resulting three-dimensional object. Rendering based on BT can be performed using the incremental warping algorithm [6], but we use another approach, which is also applicable to GBT. An example of a GBT-based representation is shown in Fig.23, where 21 SimpleTexture objects are used to represent a complex object, a palm tree.

It should be noted that the union mechanism allows, for example, the use of several LDI images with different cameras to represent the same object or parts of the same object. Consequently, data structures such as image-based objects, cells of an LDI tree and cells of a surfel-based tree structure are particular cases of this format, which offers much greater flexibility in adapting the position and resolution of SimpleTexture and PointTexture objects to the structure of the scene.

The following describes OctreeImage: the textured binary volumetric octree (TBVO).

In order to use geometry and textures of variable resolution in a more flexible representation and rendering, we develop the OctreeImage representation, which is based on a textured binary volumetric octree (TBVO). The purpose of TBVO is to provide a flexible representation/compression format with fast high-quality rendering. TBVO consists of three main components: a binary volumetric octree (BVO), which represents the geometry, a set of reference images, and image indices corresponding to the octree nodes.

Geometrical information in BVO form is a set of binary (occupied or empty) regularly spaced voxels, combined into cells of larger size in the usual octree manner. This representation can easily be obtained from DepthImage data through the intermediate form of a "point cloud", since each pixel with depth defines a unique point in three-dimensional space. Conversion of a point cloud into a BVO is illustrated in Fig.24. A similar process allows conversion of polygonal models into BVO form. Texture information for the voxels can be retrieved from the reference images, that is, from images of the model taken by cameras at given positions and orientations. Thus, the BVO together with the reference images already provides a representation of the model. It turns out, however, that the additional structure storing a reference image index for each BVO leaf allows rendering that is much faster and of higher quality.
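The DepthImage-to-point-cloud-to-BVO conversion described above can be sketched as follows for an orthographic camera: each pixel (x, y) with depth d defines a unique three-dimensional point, and the points are then quantized onto a regular voxel grid (only the binary occupancy set of the BVO is shown; resolutions and scales are illustrative).

```python
# Sketch of DepthImage -> point cloud -> occupied-voxel set.
def depth_image_to_points(width, height, depth, scale=1.0):
    pts = []
    for y in range(height):
        for x in range(width):
            d = depth[y * width + x]
            if d is not None:                  # skip empty pixels
                pts.append((x * scale, y * scale, d * scale))
    return pts

def voxelize(points, voxel_size):
    # The binary part of the BVO: the set of occupied voxels.
    return {(int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
            for x, y, z in points}

pts = depth_image_to_points(2, 1, [4, None])   # one valid pixel
occupied = voxelize(pts, 2.0)
```

Merging the occupied voxels into larger octree cells, bottom-up, then yields the hierarchical BVO itself.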

The main problem of BVO rendering is that during visualization we must determine the corresponding camera index for each voxel. To do this we must at least determine which cameras see the given voxel. This procedure is very slow if a brute-force approach is used. In addition, difficulties remain for voxels that are not visible from any of the cameras, which leads to undesirable artifacts in the rendered image.

A possible solution would be to store an explicit color for each voxel. However, in this case we encounter problems when compressing the color information. In other words, if we group the voxel colors into an image format and compress it, the color correlation of neighboring voxels is destroyed, so that the compression ratio becomes unsatisfactory. In TBVO this problem is solved by storing, for each voxel, the index of the reference image (camera). This index is usually the same for large groups of voxels, which allows the octree structure to be used for economical storage of the additional information. It should be noted that in experiments with our models, on average only a 15% increase in volume was observed compared with the representation using only the BVO and the reference images. Modeling is slightly more complex, but this provides a more flexible way of representing objects of arbitrary geometry.

It should be noted that TBVO is a very convenient representation for rendering with splats, because the splat size is easily computed for a voxel. The color of a voxel is easily determined using the reference images and the image index of that voxel.

The following describes the conversion of a textured binary volumetric octree into a data stream.

We assume that 255 cameras are sufficient and allocate up to one byte for the index. A TBVO data stream is a stream of symbols. Every TBVO symbol is either a BVO symbol or a texture symbol. A texture symbol denotes a camera index, which can be a specific number or the code "undefined", written in breadth-first order. The following is a description of the procedure for writing a TBVO data stream, assuming that we have a BVO and that every leaf voxel has an image index; this must be done at the modeling stage. All BVO nodes, including the leaf nodes (which have no BVO symbol), are traversed in breadth-first order. Fig.25 shows pseudocode which completes the writing of the data stream.
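The breadth-first stream writing just described can be sketched as follows: internal nodes emit a BVO symbol (a child-occupancy byte), while leaf voxels emit a texture symbol, that is, a camera index or an "undefined" code. The node class and the choice of 255 as the "undefined" code are our illustration, not the normative bitstream syntax of Fig.25.

```python
from collections import deque

UNDEFINED = 255            # illustrative "undefined camera" code

class Node:
    def __init__(self, children=None, cam=UNDEFINED):
        self.children = children or []   # empty list => leaf voxel
        self.cam = cam                   # camera index for leaves

def write_tbvo_stream(root):
    # Breadth-first traversal: internal nodes write their occupancy
    # byte, leaves write their texture symbol.
    stream, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        if node.children:
            mask = 0
            for i, child in enumerate(node.children):
                if child is not None:
                    mask |= 1 << i
                    queue.append(child)
            stream.append(mask)          # BVO symbol
        else:
            stream.append(node.cam)      # texture symbol
    return stream

leaf_a, leaf_b = Node(cam=0), Node(cam=1)
root = Node(children=[leaf_a, None, leaf_b, None, None, None, None, None])
```

For this two-leaf tree the stream is the root's occupancy byte followed by the two camera indices, matching the symbol ordering described above.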

An example of writing a TBVO bitstream is shown in Fig.14. In accordance with this procedure, the TBVO tree in Fig.14(a) yields the symbol stream depicted in Fig.14(b). In this example one byte is used per texture symbol. In a real data stream, however, only 2 bits are required for each texture symbol, since only three values need to be represented (two cameras and the code "undefined").

The following describes DIBR animation.

Animated versions were defined for two DIBR formats: DepthImage containing only SimpleTexture objects, and OctreeImage. Data volume is one of the main problems of three-dimensional animation. We chose these formats because video streams can be naturally embedded in their animated versions, thereby providing a significant reduction in data volume. For the color information, high-quality lossy video compression does not seriously affect the appearance of the resulting three-dimensional objects. The depth maps can be stored (almost without loss) in the alpha channels of the reference video streams. At the rendering stage, a three-dimensional frame is rendered after all frames of the reference images and depth maps have been received and decompressed.

OctreeImage animation is similar: the reference images are replaced by MovieTexture objects of the MPEG-4 standard, and the octree data is carried in a new data stream.

The following defines the MPEG-4 node specifications.

The DIBR formats are described in detail in the MPEG-4 AFX node specifications [4]. DepthImage contains fields defining the parameters of the view frustum for either SimpleTexture or PointTexture. The OctreeImage node represents an object in the form of geometric data defined by a TBVO and a set of reference images. Scene-dependent information is stored in dedicated fields of the DIBR data structures, providing correct interaction of DIBR objects with the rest of the scene. The definitions of the DIBR nodes are shown in Fig.26.

Fig.27 illustrates the spatial layout of a DepthImage, in which the meaning of each field is shown. When several DepthImage nodes are related to each other, they are processed as a group and thus should be placed under a single Transform node. The diTexture field specifies a texture with depth (SimpleTexture or PointTexture), which is mapped into the region defined in the DepthImage node.

The OctreeImage node defines an octree structure and the projected textures. The octreeResolution field specifies the maximum number of octree leaves along a side of the enclosing cube. The octree field specifies the set of internal octree nodes. Each internal node is represented by one byte. A 1 in the i-th bit of this byte means that child nodes exist for the i-th child of the considered internal node, while 0 means that they do not. The order of the internal octree nodes is the breadth-first traversal order of the octree. The order of the eight children of an internal node is shown in Fig.14(b). The voxelImageIndex field contains an array of image indices assigned to the voxels. At the rendering stage, the color attributed to an octree leaf is determined by the orthogonal projection of that leaf onto one of the images with the given index. The indices are stored in an octree-like form: if a particular image can be used for all the leaves of a voxel, the voxel containing the index of that image is issued; otherwise a voxel containing a fixed "further subdivision" code is issued, which means that the image index is specified separately for each child of the current voxel (in the same recursive form). If the voxelImageIndex field is empty, the image indices are determined at the rendering stage. The images field specifies a set of DepthImage nodes with SimpleTexture in the diTexture field. In this case the nearPlane and farPlane fields of the DepthImage node and the depth field of the SimpleTexture node are not used.
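The recursive voxelImageIndex scheme described above can be sketched as a lookup: an entry is either an image index valid for the whole subtree, or a "further subdivision" marker whose eight children are resolved one level deeper. The nested-tuple encoding and the SPLIT marker are our illustration, not the MPEG-4 bitstream.

```python
SPLIT = -1   # illustrative "further subdivision" code

def image_index(tree, path):
    # tree: either an int (index valid for the whole subtree) or
    # (SPLIT, [eight subtrees]); path: child numbers from the root
    # down to the leaf of interest.
    while isinstance(tree, tuple) and tree[0] == SPLIT:
        tree = tree[1][path[0]]
        path = path[1:]
    return tree

# One image (index 2) serves children 0..6 entirely; child 7 needs
# per-child indices, alternating between images 0 and 1.
tree = (SPLIT, [2, 2, 2, 2, 2, 2, 2, (SPLIT, [0, 1, 0, 1, 0, 1, 0, 1])])
```

Because the index is usually constant over large subtrees, the marker rarely fires, which is why this octree-like storage is economical.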

Rendering methods for the DIBR formats are not part of AFX, but it is nevertheless necessary to explain the ideas used to achieve simplicity, speed and quality when rendering DIBR objects. Our rendering methods are based on splats: small flat color patches of irregular shape used as rendering primitives. The two approaches described below are oriented to two different representations: DepthImage and OctreeImage. In our implementation, OpenGL functions are used for splatting in order to accelerate rendering. However, software rendering is also possible, and would allow optimized computations exploiting the simple structure of DepthImage or OctreeImage.

The method we use for rendering DepthImage objects is extremely simple, although it relies on OpenGL functions and runs considerably faster with a hardware accelerator. According to this method, we transform all pixels with depth belonging to the SimpleTexture and PointTexture objects to be rendered into three-dimensional points, then position small polygons ("splats", small color patches) at these points and apply the OpenGL rendering functions. Pseudocode of this procedure for the SimpleTexture case is shown in Fig.28. The PointTexture case is handled in exactly the same way.

The splat size must be adapted to the distance between the point in question and the observer. We used the following simple approach. First, the cube enclosing the given three-dimensional object is subdivided into a coarse uniform grid. The splat size is calculated for each cell of this grid, and that value is used for all points inside the given cell. The calculation is performed as follows:

- map the cell onto the screen by means of OpenGL;

- compute the length L of the largest diagonal of the projection (in pixels);

- estimate the splat diameter D as D = s·L/√N, where N is the average number of points per cell face and s is a heuristic constant approximately equal to 1.3.
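The per-cell computation above can be sketched directly; note that the formula D = s·L/√N is our reconstruction from the quantities named in the text (with √N points spaced along a diagonal of length L, the inter-point spacing is L/√N, and s ≈ 1.3 makes neighboring splats slightly overlap), not a verbatim quotation.

```python
import math

def splat_diameter(L, N, s=1.3):
    # L: longest projected diagonal of the grid cell, in pixels.
    # N: average number of points per cell face.
    # s: heuristic overlap constant, approximately 1.3.
    return s * L / math.sqrt(N)

d = splat_diameter(L=64.0, N=16)   # 1.3 * 64 / 4 = 20.8 pixels
```

All points inside the cell are then rendered with this one diameter, so the estimate is computed only once per grid cell, not per point.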

We would like to emphasize that this method can, of course, be improved, for example by a finer splat-size calculation or by suppressing image-overlay defects. However, even this simple approach provides good visual quality.

The same approach also works for OctreeImage, where octree nodes at one of the coarser levels are used in the splat-size calculation above. However, in the OctreeImage case the color information must first be mapped onto the set of voxels. This is done very simply, because each voxel has a corresponding reference image index, and the pixel position in the reference image also becomes known while parsing the octree data stream. Once the colors of the OctreeImage voxels are determined, the splat sizes are estimated as described above and OpenGL-based rendering is applied.

The DIBR formats were implemented and tested on several three-dimensional models. One of these models ("Tower") was obtained by scanning a physical object (a Cyberware colored three-dimensional scanner was used), while the others were converted from the 3DS MAX demonstration package. The tests were performed on an Intel Pentium IV 1.8 GHz processor with an OpenGL accelerator.

The following subsections explain the methods of model conversion for the different DIBR formats. Most of the data refers to the DepthImage and OctreeImage models; these formats have animated versions and can be effectively compressed. All presented models were built with an orthographic camera, since in the general case this is the preferred way of representing "compact" objects. It should be noted that the perspective camera is used mainly for economical DIBR representation of distant environments.

Generation of a DIBR model starts with obtaining a sufficient number of SimpleTexture objects. For a polygonal object the SimpleTexture objects are computed, while for a real-world object the data are obtained from digital cameras and scanning devices. The next step depends on the DIBR format to be used.

DepthImage is simply a union of the obtained SimpleTexture objects. Although the depth maps may be stored in compressed form, only lossless compression is acceptable in this case, since even small distortions of the geometry are often quite noticeable.

The reference images can be stored in lossy compressed form, but then pre-processing is required. Although it is, in general, acceptable to use popular lossy compression methods such as JPEG, visible artifacts appear, especially at the boundaries between the object and the background of the reference image, where the background color "penetrates" into the object. The solution we used to overcome this problem is to extend the image in the boundary blocks into the background of the reference image, using the average color of the block and a rapid decay of intensity, with subsequent application of JPEG compression. The resulting effect is analogous to "squeezing" the distortion into the background, where it is harmless, since the background pixels are not used for rendering. Internal boundaries in a lossily compressed reference image can also produce artifacts, but these are, in general, less noticeable.
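The boundary-block pre-processing described above can be sketched as follows; for brevity the background pixels of a boundary block are simply replaced by the block's average object color, without the intensity-decay ramp mentioned in the text. Block size, grey-level images and the mask layout are illustrative assumptions.

```python
# Fill background pixels of object-boundary blocks with the block's
# average object color, so lossy JPEG compression does not bleed the
# background color into the object.
def extend_boundary_blocks(color, mask, width, height, block=8):
    out = list(color)
    for by in range(0, height, block):
        for bx in range(0, width, block):
            idx = [y * width + x
                   for y in range(by, min(by + block, height))
                   for x in range(bx, min(bx + block, width))]
            obj = [i for i in idx if mask[i]]        # object pixels
            if obj and len(obj) < len(idx):          # boundary block
                avg = sum(color[i] for i in obj) // len(obj)
                for i in idx:
                    if not mask[i]:
                        out[i] = avg                 # overwrite background
    return out

# 2x2 grey-level image, one block: object pixels 100, background 0.
res = extend_boundary_blocks([100, 0, 100, 0], [1, 0, 1, 0], 2, 2, block=2)
```

The altered background pixels are harmless because rendering never reads them; only the object pixels, now surrounded by similar colors, survive JPEG with fewer ringing artifacts.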

To generate OctreeImage models we use an intermediate point-based representation (PBR). The set of points constituting the PBR is the union of the colored points obtained by shifting the pixels of the reference images by the distances specified in the corresponding depth maps. The original SimpleTexture objects should be constructed so that the resulting PBR provides a reasonably accurate approximation of the object surface. After that, the PBR is converted into an OctreeImage as shown in Fig.24. At the same time the additional voxelImageIndex data structure, representing the reference image indices of the octree voxels, is generated. If the reference images are to be stored in lossy compression formats, they are first pre-processed as explained in the previous subsection. In addition, since the TBVO structure explicitly specifies for each voxel the pixel containing its color, redundant pixels are discarded, which further reduces the stored volume. Examples of original and processed reference images in JPEG format are shown in Fig.29.

It should be noted that quality degradation due to lossy compression is negligible for OctreeImage objects, but still sometimes noticeable for DepthImage objects.

PointTexture models are built by projecting the object onto the reference plane, as explained in Section 2.1. If this produces an insufficient number of samples (which can happen for surface parts nearly tangent to the projection direction), additional SimpleTexture objects are used to produce more samples. The resulting set of points is then reorganized into the PointTexture structure.

The complex "Palm" model (it should be noted that it consists of 21 simple textures) renders more slowly, while for the other static models we tested, with a reference image side of 512, rendering ran at 5-6 frames/s. It should be noted that rendering speed depends mainly on the number and resolution of the reference images, and not on the complexity of the scene. This is an important advantage over polygonal representations, especially in the case of animation. The animated "Dragon" model in OctreeImage format is rendered at 24 frames per second (fps).

The "Angel" model in DepthImage format is shown in Fig.22. Figs.30-34 show several other DIBR and polygonal models. Fig.30 compares the appearance of the "Morton" model in polygonal format and in DepthImage format. The DepthImage model uses reference images in JPEG format and rendering is performed by the simple splatting described in Section 5, yet the image quality is quite acceptable. Fig.31 compares two versions of the scanned "Tower" model. The black dots in the upper part of this model are due to noisy input data. Fig.32 illustrates the more complex "Palm" model, composed of 21 SimpleTexture objects; it also shows good quality, although the leaves are generally somewhat wider than in the original 3DS MAX model, which is a consequence of the simplified splatting.

Fig.33 shows a three-dimensional frame of the "Dragon" animation in OctreeImage format. Fig.34 demonstrates the ability of the PointTexture format to provide models of excellent quality.

The node structure based on images with depth according to the present invention includes a SimpleTexture node, a PointTexture node, a DepthImage node and an OctreeImage node. The DepthImage node is composed of depth information and a color image. The color image is selected from the SimpleTexture node and the PointTexture node.

When the object is observed from six observation points (front, back, top, bottom, left and right), it can be represented by six pairs of SimpleTexture nodes. The specification of the SimpleTexture node is shown in Fig.26.

According to Fig.26, the SimpleTexture node is built from a texture field, in which a color image containing the color of every pixel is recorded, and a depth field, in which the depth of every pixel is recorded. The SimpleTexture node defines a single DIBR texture. In this case the term "texture" means a colored flat image.

The flat image containing the color of each of the pixels forming the image is recorded in the texture field. The depth of each pixel corresponding to that flat image is recorded in the depth field. The image with depth is a flat image represented in grey scale according to the depth values. In the case of video intended for forming animated objects, the depth information and the color information are each a plurality of sequences of image frames.

The flat image of the texture field (that is, the color image) and the flat image of the depth field (that is, the image represented in grey scale) constitute the SimpleTexture node. Fig.20 shows the "Morton" object represented by a SimpleTexture node for the frontal observation point. Ultimately the object is represented by six SimpleTexture nodes, which are pairs of images generated for the six observation points. Fig.22 shows the "Angel" object represented by six SimpleTexture nodes.

A color image can also be represented by PointTexture nodes. Fig.21 shows a point texture formed by projecting the object onto a reference plane (in this case a plane removed from the object by a predetermined distance so as to face its rear side). According to Fig.26, the PointTexture node is composed of a size field, a resolution field, a depth field and a color field. Information about the size of the image plane is recorded in the size field. The size field is composed of width and height fields, in which the width and the height of the image plane are recorded, respectively. The size of the image plane is set large enough to cover the entire object projected onto the reference plane.

Depth resolution information for each pixel is recorded in the resolution field. For example, when the resolution field contains the number 8, the depth of the object is represented by 256 grey levels based on the distance from the reference plane.

The plurality of items of depth information for each pixel is recorded in the depth field. The depth information is a sequence of the numbers of points projected onto the image plane and the depths of the respective points. The color information is a sequence of colors corresponding to the respective points projected onto the image plane.
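The field layout just described can be sketched as a flattening step: pixel by pixel, the depth field receives the number of projected points followed by their depths, while the color field receives the matching colors in the same order. The encoding below is our illustration of this description, not the normative syntax.

```python
# Flatten per-pixel (depth, color) sample lists into a depth field
# (count, then depths, per pixel) and a parallel color field.
def encode_point_texture(layers):
    depth_field, color_field = [], []
    for samples in layers:                # one list per pixel
        depth_field.append(len(samples))  # number of points on the pixel
        for d, c in samples:
            depth_field.append(d)
            color_field.append(c)
    return depth_field, color_field

# Two pixels: the first has two projected points, the second none.
layers = [[(5, 0xFF0000), (9, 0x00FF00)], []]
df, cf = encode_point_texture(layers)
```

A decoder reads each count, consumes that many depths, and takes the same number of colors from the color field, so the two sequences stay in lockstep.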

The observation-point information comprising the DepthImage node includes several fields, such as a viewpoint field, a visibility field, a projection method field and a distance field.

The observation point from which the image plane is observed is recorded in the viewpoint field. The viewpoint field includes position and orientation fields. The position recorded in the position field is the position of the observation point relative to the origin (0, 0, 0) of the coordinate system, while the orientation recorded in the orientation field is the amount of rotation of the observation point relative to the default orientation.

The observation region from the observation point to the image plane is recorded in the visibility field. The method of projection from the observation point onto the image plane is recorded in the projection method field. In the present invention, the projection methods include an orthogonal projection method, in which the observation region is represented by a width and a height, and a perspective projection method, in which the observation region is represented by a horizontal angle and a vertical angle. If the orthogonal projection method is chosen, that is, if the projection method field is set to TRUE, the width and height of the observation region correspond to the width and height of the image plane, respectively. If the perspective projection method is chosen, the horizontal and vertical angles of the observation region correspond to the angles formed by the planes passing through the observation point and the horizontal and vertical sides of the image plane. The distance from the observation point to the near boundary plane and the distance from the observation point to the far boundary plane are recorded in the distance field. The distance field consists of nearPlane and farPlane fields and specifies the range for the depth information.
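The relation between the perspective angles and the plane size described above can be made concrete with a small derivation of our own (not quoted from the text): at distance d from the observation point, an observation region with horizontal angle a spans a width of 2·d·tan(a/2), and likewise for the height.

```python
import math

# Size of the region seen at distance d for a perspective camera with
# the given horizontal and vertical field-of-view angles (radians).
# Our illustrative derivation, not a normative DepthImage computation.
def plane_size(h_angle, v_angle, d):
    return (2.0 * d * math.tan(h_angle / 2.0),
            2.0 * d * math.tan(v_angle / 2.0))

w, h = plane_size(math.pi / 2, math.pi / 2, 1.0)   # 90-degree cone
```

For the orthogonal case no such computation is needed: the width and height fields give the region size directly, independent of distance.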

Figs.35A and 35B are diagrams illustrating the relationships of the respective nodes when representing an object in DepthImage format with SimpleTexture nodes and with PointTexture nodes, respectively.

According to Fig.35A, the object in question can be represented by a set of DepthImage nodes corresponding to the six observation points. Each of the respective DepthImage nodes consists of observation-point information and a SimpleTexture. The SimpleTexture consists of a pair: a color image and an image with depth.

According to Fig.35B, the object in question can be represented by a single DepthImage node. The specification of the DepthImage node is as described above. The PointTexture node is composed of information about the plane onto which the object is projected, as well as depth information and color information for the various points of the object projected onto the image plane.

In the case of the OctreeImage node, the object is represented by a structure of internal nodes constituting the voxels that contain the object, and by reference images. The specification of the OctreeImage node is shown in Fig.26.

According to Fig.26, the OctreeImage node includes the fields octreeResolution, octree, voxelImageIndex and images.

The maximum number of octree leaves along a side of the enclosing cube containing the object is recorded in the octreeResolution field. The structure of the internal nodes is recorded in the octree field. An internal node is a node for a subcube formed by splitting the enclosing cube that contains the whole object. The splitting of each subcube into 8 subcubes is performed iteratively and continues until a predetermined number of subcubes is reached. When the iterative splitting is performed three times, the node for a subcube after the first splitting iteration is called the parent node, the node for a subcube after the third iteration is called a child node, and the node for a subcube after the second iteration is called the current node. The 8 subcubes produced by a split are ordered breadth-first. Fig.14 shows the manner of assigning priorities to the subcubes. Each internal node is represented by one byte. The node information recorded in the bits constituting this byte indicates the presence or absence of the child nodes belonging to the considered internal node.
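The one-byte encoding just described can be sketched as a simple bit test: bit i of the node byte set means that child i of the internal node exists, with children numbered in the breadth-first subcube order.

```python
# Decode an internal-node byte of the octree field: return the list
# of child positions (0..7) whose subcubes are present.
def children_present(node_byte):
    return [i for i in range(8) if node_byte & (1 << i)]

# 0b00000101: children 0 and 2 exist, the other six subcubes are empty.
present = children_present(0b00000101)
```

Reading the octree field is then a matter of consuming one such byte per internal node in breadth-first order and queueing the children it declares.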

The indices of the reference images corresponding to the internal nodes are recorded in the voxelImageIndex field. The DepthImage nodes, whose structure is as described above, are recorded in the images field.

Fig.36 is a diagram illustrating the structure of the corresponding OctreeImage node when representing an object using OctreeImage nodes.

According to Fig.36, the OctreeImage node is encapsulated by bit wrappers, each of which includes an OctreeImage node. When the object is represented by SimpleTexture nodes, the OctreeImage node includes six DepthImage nodes, and each DepthImage node contains a SimpleTexture. On the other hand, when the object is represented by a PointTexture node, the OctreeImage node includes one DepthImage node.

The present invention can be implemented on a machine-readable recording medium using machine-readable codes. Machine-readable recording media include all kinds of recording devices storing data readable by a computer system; examples of such devices are read-only memory (ROM), random-access memory (RAM), CD-ROM, magnetic tape, diskettes, optical storage devices and so on. The media can also be realized in the form of carrier waves propagating, for example, over the Internet. The machine-readable recording medium can also be distributed over computer systems connected in a network, so that the machine-readable codes are stored and executed in a distributed manner.

In accordance with the present invention, in image-based representations the rendering algorithms are simple and can be supported by hardware in many respects, because the complete information about a colored three-dimensional object is encoded by a set of two-dimensional images: simple and regular structures to which widely known image processing and compression methods apply directly. In addition, rendering time for image-based models is proportional to the number of pixels in the reference and output images, but, in general, not to the geometric complexity, as it is for polygonal models. Moreover, when the image-based representation is applied to real-world objects and scenes, photo-realistic rendering of natural scenes becomes possible without the use of millions of polygons and expensive computations.

The preceding description of embodiments of the present invention is presented for illustrative and descriptive purposes. It is not exhaustive and does not limit the invention to the precise forms disclosed; modifications and variations are possible in the light of the ideas described above or may be acquired from practice of the present invention. The scope of the present invention is defined by the claims and their equivalents.

Claims

1. A method of generating a node structure for representing a three-dimensional object using an image with depth, the method comprising the steps of: creating a texture field, in which a color image containing color information for each pixel is recorded; creating a depth field, in which an image with depth containing depth information for each pixel is recorded; and generating a simple texture node by combining the texture field and the depth field in the given order.

2. The method according to claim 1, characterized in that, in the case of video intended for forming animated objects, the depth information and the color information are multiple sequences of image frames.
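The simple-texture node of claims 1 and 2 pairs a color image with a depth image over the same pixel grid. A minimal Python sketch, in which all class and field names are invented here for illustration and are not taken from the patent or any standard API:

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class SimpleTexture:
    """Hypothetical simple-texture node: a texture field plus a depth field."""
    texture: List[List[Tuple[int, int, int]]]  # color image: (r, g, b) per pixel
    depth: List[List[int]]                     # depth image: one depth value per pixel

    def __post_init__(self):
        # Both fields must describe the same image plane, row by row.
        assert len(self.texture) == len(self.depth)
        for t_row, d_row in zip(self.texture, self.depth):
            assert len(t_row) == len(d_row)

# A 2x2 example: a color and a depth value for every pixel.
node = SimpleTexture(
    texture=[[(255, 0, 0), (0, 255, 0)], [(0, 0, 255), (255, 255, 255)]],
    depth=[[10, 12], [11, 13]],
)
```

For the animated case of claim 2, each of the two fields would hold a sequence of such frames instead of a single image.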

3. A method of generating a node structure for representing a three-dimensional object using an image with depth, the method comprising the steps of: creating a size field in which information about the size of the image plane is recorded; creating a resolution field in which the depth resolution for each pixel is recorded; creating a depth field in which multiple pieces of depth information relating to each pixel are recorded; creating a color field in which color information for each pixel is recorded; and generating a point-texture node by combining the size field, the resolution field, the depth field, and the color field in the given order.

4. The method according to claim 3, characterized in that the depth information is a sequence of the numbers of pixels projected onto the image plane and of depth values for the respective pixels, and the color information is a sequence of colors corresponding to the respective pixels projected onto the image plane.
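In the point-texture node of claims 3 and 4, each pixel of the image plane may carry several points along its projection ray, so the depth field stores, per pixel, a count followed by that many depth values, with one color per stored point. A sketch under that reading; all names are illustrative assumptions:

```python
from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class PointTexture:
    """Hypothetical point-texture node: size, resolution, depth and color fields."""
    width: int               # size field: image-plane width in pixels
    height: int              # size field: image-plane height in pixels
    depth_resolution: int    # resolution field: number of representable depth levels
    depth: List[int] = field(default_factory=list)  # per pixel: count, then depths
    color: List[Tuple[int, int, int]] = field(default_factory=list)  # one color per point

def add_pixel(node: PointTexture,
              depths: List[int],
              colors: List[Tuple[int, int, int]]) -> None:
    """Append one pixel's points in claim-4 order: count first, then depth values."""
    assert len(depths) == len(colors)
    assert all(0 <= d < node.depth_resolution for d in depths)
    node.depth.append(len(depths))
    node.depth.extend(depths)
    node.color.extend(colors)

node = PointTexture(width=2, height=1, depth_resolution=256)
add_pixel(node, [5, 17], [(255, 0, 0), (0, 255, 0)])  # two points on this ray
add_pixel(node, [9], [(0, 0, 255)])                   # one point on this ray
# node.depth is now [2, 5, 17, 1, 9]
```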

5. A method of generating a node structure for representing a three-dimensional object using an image with depth, the method comprising the steps of: creating a viewpoint field in which information about a viewpoint from which the image plane is observed is recorded; creating a field-of-view field in which information about a field of view from the viewpoint to the image plane is recorded; creating a projection-method field in which information about a method of projection from the viewpoint onto the image plane is recorded; creating a distance field in which information about the distance between the near plane and the far plane is recorded; creating a texture field in which a color image is recorded; and generating a depth-image node by combining the viewpoint field, the field-of-view field, the projection-method field, the distance field, and the texture field in the given order.

6. The method according to claim 5, characterized in that the viewpoint field includes a position field in which the position of the viewpoint is recorded and an orientation field in which the orientation of the viewpoint is recorded, the above-mentioned position being a position relative to the origin of the coordinate system and the orientation being determined by an angle of rotation relative to a default orientation.

7. The method according to claim 5, characterized in that the projection method includes an orthographic projection method, in which the field of view is represented by a width and a height, and a perspective projection method, in which the field of view is represented by a horizontal angle and a vertical angle.

8. The method according to claim 7, characterized in that, if the orthographic projection method is chosen, the width and height of the field of view correspond to the width and height of the image plane, respectively, and, if the perspective projection method is chosen, the horizontal and vertical angles correspond to the angles subtended by the horizontal and vertical sides of the image plane, respectively.
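The correspondence in claim 8 between perspective view angles and the image plane can be illustrated with the usual pinhole relation; the formula below is standard projective geometry offered as an assumption, not a computation stated in the patent text:

```python
import math

def plane_extent(h_angle: float, v_angle: float, distance: float):
    """Width and height of the image plane subtended by the given view
    angles at the given distance from the viewpoint."""
    width = 2.0 * distance * math.tan(h_angle / 2.0)
    height = 2.0 * distance * math.tan(v_angle / 2.0)
    return width, height

# tan(45 deg) = 1, so a 90 x 90 degree view at distance 1 spans a 2 x 2 plane.
w, h = plane_extent(math.radians(90), math.radians(90), 1.0)
```

Under orthographic projection no such conversion is needed: the recorded width and height are the image-plane extent directly.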

9. The method according to claim 5, characterized in that, in the case of video intended for forming animated objects, the depth information and the color information are multiple sequences of image frames.

10. The method according to claim 5, characterized in that the color image consists of one or more simple-texture nodes, each consisting of a flat image that contains a color for each pixel and a depth value for that pixel.

11. The method according to claim 5, characterized in that the color image consists of one or more point-texture nodes, each consisting of size information, a depth resolution, multiple pieces of depth information relating to each of the pixels constituting the color image, and color information for each pixel.

12. The method according to claim 11, characterized in that the depth information is a sequence of the numbers of pixels projected onto the image plane and of depth values for the respective pixels, and the color information is a sequence of colors corresponding to the respective pixels projected onto the image plane.
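Claims 5 through 12 together describe the depth-image node as a container for the camera parameters plus a texture. A structural sketch, assuming field names of my own choosing and using plain tuples for positions and rotations:

```python
from dataclasses import dataclass
from typing import Tuple, Optional

@dataclass
class Viewpoint:
    """Claim 6: position relative to the origin, orientation as a rotation
    (axis x, y, z plus angle) relative to the default orientation."""
    position: Tuple[float, float, float]
    orientation: Tuple[float, float, float, float]

@dataclass
class DepthImage:
    """Hypothetical depth-image node combining the fields of claim 5."""
    viewpoint: Viewpoint
    field_of_view: Tuple[float, float]  # ortho: width/height; perspective: angles
    orthographic: bool                  # projection-method field
    near_plane: float                   # distance field: near boundary plane
    far_plane: float                    # distance field: far boundary plane
    texture: Optional[object]           # a simple- or point-texture node (claims 10-11)

di = DepthImage(
    viewpoint=Viewpoint(position=(0.0, 0.0, 10.0),
                        orientation=(0.0, 0.0, 1.0, 0.0)),
    field_of_view=(0.785398, 0.785398),  # perspective: ~45 degree angles
    orthographic=False,
    near_plane=1.0,
    far_plane=100.0,
    texture=None,  # would hold a SimpleTexture or PointTexture node
)
```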

13. A method of generating a node structure for representing a three-dimensional object using an image with depth, the method comprising the steps of: creating a resolution field in which the maximum number of octree leaves along an edge of the cube containing the object is recorded; creating an octree field in which the structure of the internal octree nodes is recorded; creating an index field in which the index of the reference image corresponding to said internal node is recorded; creating an image field in which the reference image is recorded; and generating an octree-image node by combining the resolution field, the octree field, the index field, and the image field in the given order.

14. The method according to claim 13, characterized in that an internal octree node is represented by a byte, and node information on the presence or absence of child nodes belonging to the given internal node is recorded in the bit stream constituting that byte.
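Claim 14 packs the occupancy of an internal node's eight child octants into one byte, one bit per child. A minimal decoding sketch; the bit ordering (bit i flags child octant i) is an assumption, since the patent text does not fix it here:

```python
def children_present(node_byte: int) -> list:
    """Return the indices (0-7) of the child octants whose presence bit is set."""
    return [i for i in range(8) if node_byte & (1 << i)]

# Bits 0, 5 and 7 of 0b10100001 are set, so those three children exist.
assert children_present(0b10100001) == [0, 5, 7]
assert children_present(0b00000000) == []          # empty internal node
assert children_present(0b11111111) == list(range(8))  # fully occupied node
```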

15. The method according to claim 13, characterized in that the reference image is an image with depth, which includes viewpoint information and a color image corresponding to that viewpoint information.

16. The method according to claim 15, characterized in that the viewpoint information includes a viewpoint field in which information about a viewpoint from which the image plane is observed is recorded; a field-of-view field in which information about a field of view from the viewpoint to the image plane is recorded; and a projection-method field in which information about a method of projection from the viewpoint onto the image plane is recorded.

17. The method according to claim 16, characterized in that the viewpoint information includes a position field in which the position of the viewpoint is recorded and an orientation field in which the orientation of the viewpoint is recorded, the position being a position relative to the origin of the coordinate system and the orientation being determined by an angle of rotation relative to a default orientation.

18. The method according to claim 16, characterized in that the projection method is an orthographic projection method in which the width and height of the field of view correspond to the width and height of the image plane, respectively.

19. The method according to claim 16, characterized in that the color image consists of one or more simple-texture nodes, each consisting of a flat image containing a color for each pixel and a depth value for that pixel.
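The octree-image node of claims 13 through 19 can likewise be sketched as a container; names and the choice of a breadth-first byte stream for the octree field are assumptions made here for illustration:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class OctreeImage:
    """Hypothetical octree-image node combining the fields of claim 13."""
    resolution: int       # max number of octree leaves along an edge of the cube
    octree: bytes         # stream of internal-node bytes, one bit per child (claim 14)
    index: List[int]      # reference-image index for each internal node
    images: List[object]  # reference images: depth-image nodes (claim 15)

node = OctreeImage(
    resolution=256,                # 2**8 leaves per edge implies a depth-8 octree
    octree=bytes([0b10100001]),    # a single root node with children 0, 5 and 7
    index=[0],                     # the root refers to reference image 0
    images=["depth-image placeholder"],
)
```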

20. A machine-readable recording medium containing a record of a node structure for representing a three-dimensional object using an image with depth, said node structure containing a texture field in which a color image containing color information for each pixel is recorded, and a depth field in which a depth image containing depth information for each pixel is recorded.

21. A machine-readable recording medium containing a record of a node structure for representing a three-dimensional object using an image with depth, said node structure containing a size field in which information about the size of the image plane is recorded; a resolution field in which the depth resolution for each pixel is recorded; a depth field in which multiple pieces of depth information relating to each pixel are recorded; and a color field in which color information for each pixel is recorded.

22. A machine-readable recording medium containing a record of a node structure for representing a three-dimensional object using an image with depth, said node structure containing a viewpoint field in which information about a viewpoint from which the image plane is observed is recorded; a field-of-view field in which information about a field of view from the viewpoint to the image plane is recorded; a projection-method field in which information about a method of projection from the viewpoint onto the image plane is recorded; a distance field in which information about the distance between the near plane and the far plane is recorded; and a texture field in which a color image is recorded.

23. A machine-readable recording medium containing a record of a node structure for representing a three-dimensional object using an image with depth, said node structure containing a resolution field in which the maximum number of octree leaves along an edge of the cube containing the object is recorded; an octree field in which the structure of the internal octree nodes is recorded; an index field in which the index of the reference image corresponding to said internal node is recorded; and an image field in which the reference image is recorded.
