Automatic recognition of characters on a structured background by combining models of the characters and of the background

FIELD: automated recognition of symbols.

SUBSTANCE: the method includes the following stages: a debugging (tuning) stage, in which models of the background and of the symbols are formed, and a recognition stage, in which the background model is registered with the background of the read image; the elementary background image for each symbol position is extracted from the registered background model; for each symbol position the models of letters and/or digits are combined with the corresponding elementary background image to form combined models; the unknown symbols are compared with the combined models; and each unknown symbol is recognised as the symbol whose combined model matches it best according to the "template matching" technique.

EFFECT: higher efficiency.

10 cl, 10 dwg

 

Prior art

The present invention relates to a method for the automatic recognition of characters printed on any suitable material, even when the background carries highly contrasted structures that strongly interfere with the structure of the printed characters.

The vast majority of known character recognition systems approach this problem by trying to separate the characters from the background using thresholds, often very sophisticated and complex ones.

Unfortunately, this technique fails when the contrast of the background structures is very high, especially if the position of the characters varies relative to those structures. As a result, the character images sometimes contain parts of the background (those that exceed the threshold), and sometimes they are incomplete, because some parts of the character structure do not exceed the threshold.

This applies in particular to the inspection of banknotes, on which the serial numbers are printed in a separate phase (usually later) from the printing of the rest of the image, and usually with different printing equipment. The registration therefore cannot be perfect, and the serial numbers "move" relative to the background.

This means that if the numbers are printed on a structured area of the banknote, that is, on an area containing a drawing, they "move", or can be shifted arbitrarily, relative to the structure (the drawing) of the background. Moreover, in such cases even the search for and segmentation of the characters may fail because of the background structures.

Indeed, despite a huge number of variations, the process of character extraction and recognition almost always passes through the following stages:

- reading an image of the document or, more generally, of the object containing the characters to be recognised. The image is acquired by an electronic camera and is then usually processed to improve contrast and reduce noise;

- searching the image (now an electronic one) for the positions of the characters to be recognised. This search is often based on the analysis of abrupt changes in brightness (such as transitions from white to black) and, in particular, on the spatial distribution of these transitions;

- segmenting the identified zone into areas each of which contains only one character. This segmentation is carried out, for example, by analysing the projection of the density of black onto a segment parallel to the base line of the characters: the minima of this density correlate with the white spaces between characters;

- comparing each character isolated in this way with the prototypes (models) of all letters and/or all digits, either in terms of the degree of coincidence (the technique known as "template matching") or in terms of the sequence of characteristic structures, such as vertical lines, horizontal lines, diagonal lines, etc. (the technique known as "feature extraction" or structural analysis).

In any case, it is evident that if the portion of the image segmented as a character contains structures foreign to the shape of the character (for example, lines belonging to the structure of the background), the risk that the comparison with the prototypes fails is very high. The same risk arises when distinctive parts of the character structure are lost through overly drastic thresholding in the phase of separating the character from the background.

Therefore, previous approaches to the automatic recognition of characters printed on highly structured, high-contrast backgrounds are not very satisfactory.

Brief description of the invention

In accordance with the invention, the objects on which the characters to be recognised are printed are analysed optically using well-known optoelectronic means, such as a CCD camera (linear or matrix, black-and-white or colour) with the desired resolution, to form an electronic image of the characters to be recognised.

In what follows, the term "image" is used in the sense of an electronic image, that is, a discrete set of light-intensity values, usually organised as a rectangular matrix. Each element of this matrix, called a pixel, is a measure of the intensity of the light reflected from the corresponding part of the object. A colour image is generally described by three matrices, corresponding to the red, green and blue components of each pixel. For simplicity, the following description refers to the black-and-white case; the extension to colour images is obtained by repeating the same operations on the three matrices.

The invention is based on performing automatic recognition on electronic images of characters printed on a highly structured background whose contrast may even be comparable to that of the character structures (as in the example of Fig/4-c).

The first step of the method according to the invention consists in forming a model of the background, which can be obtained by acquiring images of one or more samples that carry only the background drawing, without any characters (as in the example of Fig/4-b).

The arithmetic average of the images of these samples can, in particular, be used as the model. For black-and-white images this gives a single averaged matrix, whereas for colour images there are three averaged matrices, for example for red, green and blue.
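The averaging described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `background_model` is hypothetical, the samples are assumed to be equally sized NumPy arrays that are already registered with one another, and a colour image is assumed to be stored as a height x width x 3 array.

```python
import numpy as np

def background_model(samples):
    """Pixel-wise arithmetic average of background-only sample images.

    `samples` is a list of equally sized arrays (H x W for grey levels,
    H x W x 3 for colour). The images are assumed to be registered already.
    """
    stack = np.stack([np.asarray(s, dtype=np.float64) for s in samples])
    # Averaging over axis 0 treats every pixel (and every colour channel)
    # independently, so the same code covers one matrix or three.
    return stack.mean(axis=0)
```

Because the mean is taken over the sample axis only, a colour input yields three averaged matrices, one per channel, as the text describes.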

Models of the characters to be recognised (e.g., letters and/or digits) are then formed, either by acquiring images of a set of these characters printed on a white background or by directly using the electronic images contained in the computer files that are now commercially available for most fonts. In the first case, the model of each character to be recognised can be generated as the average of the images of a number of instances of the same character printed on a white background.

Once the models of all characters and the model of the background have been formed, the first phase of the method, which may be called the "debugging phase", ends.

In the subsequent character recognition phase, the following steps are performed:

- an image is read of the sample to be recognised, which contains unknown characters printed on the background in positions that are also unknown (an example is shown in Fig/4-a);

- the model of the background is registered with the read image using any of the known image registration techniques, for example the method of maximum correlation;

- the registered model of the background is subtracted from the read image. The resulting difference image, in which the background is almost completely eliminated, shows the positions of the characters very clearly (an example of a difference image, that is, the read image minus the registered model of the background, is shown in Fig/4-b);

- the position of each character is searched for on the difference image. This is done using any of the known techniques for locating and segmenting characters, for example by analysing the sharp transitions in intensity from black to white. For each character position an elementary image is thus extracted whose dimensions match those of the character models (Fig/4-b shows examples of elementary images of segmented characters);

- the elementary background image corresponding to each unknown character is extracted from the registered model of the background;

- for each character position, the models of the characters are combined with the corresponding elementary image of the background model (see Fig/4-c). Because the model of the background has been registered with the background of the image containing the characters to be recognised, in the elementary images that combine the background model with the models of digits and/or letters the position of the character relative to the background is the same as in the unknown image. This synthesis thus creates, for each character position, new prototypes (combined models) of the characters (letters and/or digits) with the same background as in the unknown image. One of the combination techniques developed is described in the section "Description of the preferred embodiments of the invention"; however, any of the methods proposed by other authors may be used;

- each of the unknown characters is compared with all the models combined in the previous steps. A character on its background is thus recognised by comparison with models of the characters on the same background and in the same position. Any known recognition technique can be used here, for example "template matching" or "feature extraction".
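The subtraction and localisation steps listed above can be illustrated as follows. This is a minimal sketch, not the method prescribed by the text: the function names and the `threshold` parameter are hypothetical, the background model is assumed to be registered already, and the column-projection segmentation is just one of the "known techniques" the text allows.

```python
import numpy as np

def difference_image(read_image, registered_background):
    """Absolute difference between the read image and the registered background model."""
    return np.abs(np.asarray(read_image, dtype=np.float64)
                  - np.asarray(registered_background, dtype=np.float64))

def character_columns(diff, threshold):
    """Return (start, stop) column ranges where the summed difference exceeds `threshold`.

    Each range is taken as one character position; `stop` is exclusive.
    """
    profile = diff.sum(axis=0)          # projection onto the base line
    active = profile > threshold
    spans, start = [], None
    for x, on in enumerate(active):
        if on and start is None:
            start = x                   # a character begins
        elif not on and start is not None:
            spans.append((start, x))    # a character ends
            start = None
    if start is not None:               # character touching the right edge
        spans.append((start, len(active)))
    return spans
```

The same column ranges can then be used to cut the elementary images out of both the read image and the registered background model, so that the two patches cover exactly the same pixels.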

Brief description of drawings

Fig/4 shows an example of a sequence of characters printed on a highly structured, high-contrast background: view (a) shows the sequence of characters printed on a white background, view (b) shows the background drawing alone, and view (c) shows the sequence of characters (a) printed on the background (b).

In Fig/4, view (a) is identical to view (c) of Fig/4, whereas view (b) shows the result of subtracting the registered model of the background from the image of the fully printed banknote.

In Fig/4, view (a) shows the portion of the banknote from the example of the previous figures that contains the characters to be recognised, and view (b) shows the elementary image corresponding to each character position, as obtained by segmentation. View (c) shows, for each character position, the combination of the corresponding elementary image of the registered background with the models of all possible characters, that is, the combined models described in the text. This example clearly shows that the characters to be processed (b) can be recognised more effectively by comparison with the combined models (c) than with models of the characters printed on a white background (shown, for example, in Fig/4-d).

Fig. IV/4 shows a typical layout of the recognition system described in the text.

Description of the preferred embodiments of the invention

One preferred embodiment, relating to the automatic reading of serial numbers printed on banknotes, is described below as a non-limiting example of implementation of the invention.

Indeed, on most types of banknote the serial number is printed partly or wholly on the background drawing of the note. Banknotes are printed using a mix of different techniques, usually at least offset and intaglio (copperplate). The latter, in particular, typically produces areas with a large number of very high-contrast lines. When the serial number is printed on one of these areas, it is quite difficult with conventional techniques to separate the characters from the background, and hence to recognise them.

In addition, the serial number is usually printed in the final production phase, after offset and intaglio printing, and on a different printing machine. Even with very sophisticated registration systems, the relative register between the serial number and the background turns out to be rather variable, and the number can typically "move" within a few millimetres.

Fig. IV/4 illustrates the layout of a system for recognising serial numbers on banknotes. Here a linear CCD camera 1, with its lens 2 and its lighting system 3, is used to acquire images of the banknotes 4 whose serial numbers are to be read while they are transported by a suction belt 5.

The lines scanned by the camera are stored sequentially in a first buffer memory of the image processing subsystem 6 to form an electronic image of each banknote.

The image processing subsystem 6, which may be based either on specialised hardware or on programmable computers such as a DSP (Digital Signal Processor), a fast personal computer (PC), etc., carries out the various operations of the debugging phase (formation of the model of the background and of the character models) and of the recognition phase.

During the debugging phase for the model of the background, the image processing subsystem 6:

- acquires images of unnumbered banknotes, selected as the "Set for Debugging the Background" (NOF, or EAF), and stores this set in a suitable memory;

- selects from the set NOF one reference image for registration, either automatically (for example, the first image of the set NOF) or with the aid of the system operator, using the console 7;

- registers all the images of the set NOF, first determining the horizontal offset Δx and the vertical offset Δy of each image relative to the reference image, and then applying the shift −Δx, −Δy. In this embodiment the offset is measured by the method of maximum correlation. A small rectangular portion S0 (the marker) of the reference image, centred on coordinates x0, y0 and chosen, for example, by the operator (outside the area where the characters are printed), is compared with a portion S1 of the same size whose centre is shifted step by step over each position (pixel) of the image from the set NOF, in order to find the position x1, y1 at which the correlation coefficient reaches its maximum (this corresponds to the most precise overlap between the two images). The offset is then given by:

Δx = x1 − x0 and Δy = y1 − y0

In this embodiment the model of the background Mf is the average of the images of the set NOF registered with the reference image.
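The registration by maximum correlation and the subsequent averaging can be sketched as below. All names (`find_offset`, `register`, `build_background_model`) and the parameters `half` (half-size of the marker S0) and `search` (largest offset tried, in pixels) are assumptions made for this illustration. The exhaustive search is slow but easy to follow, and `np.roll` wraps pixels around the image border, which a real system would replace by cropping or padding.

```python
import numpy as np

def correlation(a, b):
    """Normalised correlation coefficient between two equally sized patches."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return 0.0 if denom == 0 else float((a * b).sum() / denom)

def find_offset(reference, image, x0, y0, half, search):
    """Return (dx, dy) = (x1 - x0, y1 - y0) that maximises the correlation
    between the marker S0 centred on (x0, y0) in `reference` and S1 in `image`."""
    size = 2 * half + 1
    s0 = reference[y0 - half:y0 - half + size, x0 - half:x0 - half + size].astype(np.float64)
    best_c, best_dx, best_dy = -2.0, 0, 0
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            top, left = y0 + dy - half, x0 + dx - half
            if top < 0 or left < 0:
                continue                      # S1 would leave the image
            s1 = image[top:top + size, left:left + size].astype(np.float64)
            if s1.shape != s0.shape:
                continue                      # S1 cut off at the far border
            c = correlation(s0, s1)
            if c > best_c:
                best_c, best_dx, best_dy = c, dx, dy
    return best_dx, best_dy

def register(image, dx, dy):
    """Apply the shift (-dx, -dy); np.roll wraps around (acceptable for a sketch)."""
    return np.roll(image, shift=(-dy, -dx), axis=(0, 1))

def build_background_model(reference, images, x0, y0, half, search):
    """Average of all images after registration with the reference image."""
    registered = [register(img, *find_offset(reference, img, x0, y0, half, search))
                  for img in images]
    return np.mean(np.stack(registered).astype(np.float64), axis=0)
```

The marker position (x0, y0) should lie outside the area where characters are printed, as the text notes, so that the same marker can be reused unchanged in the recognition phase.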

During the debugging phase for the character models, the image processing subsystem 6:

- acquires a set of images of banknotes on which all the digits and/or letters used in the serial numbers are printed on a white background, each one once and in known positions (the "Set for Debugging the Symbols", NOS, or EAC);

- then segments the images of the set NOS into elementary images, each containing a single character. In this embodiment the segmentation uses the standard technique of analysing white-to-black transitions, which is very effective when the characters are printed on a white background;

- forms a model Ms for each character (letter or digit) as the arithmetic average, over the set NOS, of the elementary images at each position, registered, for example, with the corresponding character of the first banknote of the set NOS, taken as the reference. Registration and averaging are carried out as for the model of the background, except that the marker coincides with the whole elementary image of the character.

The serial numbers of banknotes usually consist of alphabetic and numeric characters in a single font. There will therefore typically be only one position per character on the banknotes of the set NOS (one for the character A, one for the character B, etc.). Otherwise it will be necessary to provide as many positions per character as there are fonts in use (e.g., A in New York, A in Courier, A in Geneva, etc.).

During the recognition phase, in the described embodiment of the invention, the image processing subsystem 6, after reading the image:

- first registers the background of the image of each banknote to be read with the model of the background, using the same marker as for debugging the model and the same correlation technique;

- generates the difference image, that is, the registered image of the complete banknote minus the model of the background, and then searches for the positions of the characters. The technique used for this is based on the analysis of transitions mentioned above. The search can typically be confined to a limited area of the banknote, because the printed serial number is displaced relative to the background drawing by no more than a few millimetres;

- extracts, for each character position found on the difference image, the corresponding elementary image from the model of the background. Since the model has been registered, this elementary image represents precisely the portion of the background on which the unknown character was printed;

- combines, for each character position, the corresponding elementary image of the registered background Mf with each of the character models Ms.

For each character position this yields new models (character plus background) with the same relative position as on the banknote to be read. In this embodiment the combination Mc is formed, pixel by pixel, using the following equations:

[equation [1]]

if the background is printed first and the characters afterwards; otherwise:

[equation [2]]

In either case the coefficients K0 and K1 are constants characteristic of the inks and paper used. In equations [1] and [2] the first term (namely, the product K Mf Ms) accounts for the transmissivity of the printing inks and the reflectivity of the paper, whereas the second term relates to the surface reflectivity of the ink that was applied last;

- computes, for each character position, the correlation coefficient between the corresponding elementary image of the banknote and all the new models (character plus background). The character being processed is recognised as the character of the combined model that gives the maximum value of this correlation coefficient;

- in this embodiment, optionally compares this maximum value of the correlation coefficient with a threshold in order to check the print quality of the character and of the background in the elementary image at each character position. If the quality is good (the elementary image being processed and the combined model are almost identical), the coefficient is very close to 1, whereas poor quality gives a coefficient closer to zero.
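The combination, correlation and threshold steps above can be sketched as follows. Equations [1] and [2] are not reproduced in this text, so the `combine` function is an assumption: it uses a product term K0·Mf·Ms plus a second term K1·(1 − M_top), where M_top is the model of the layer printed last, which is merely one form consistent with the verbal description of the two terms. The names `combine`, `recognise`, `characters_last` and the default `threshold` of 0.8 are likewise hypothetical, and all images are assumed to be scaled to the range 0 (black) to 1 (white).

```python
import numpy as np

def combine(m_f, m_s, k0=1.0, k1=0.0, characters_last=True):
    """ASSUMED combination rule (the source equations are not available):
    k0 * Mf * Ms + k1 * (1 - M_top), with M_top the layer printed last."""
    top_layer = m_s if characters_last else m_f
    return k0 * m_f * m_s + k1 * (1.0 - top_layer)

def correlation(a, b):
    """Normalised correlation coefficient between two equally sized images."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return 0.0 if denom == 0 else float((a * b).sum() / denom)

def recognise(unknown, background_patch, symbol_models, threshold=0.8, **combine_args):
    """Compare `unknown` with every combined model for this character position.

    Returns (label, score, quality_ok): the label of the best combined model,
    its correlation coefficient, and whether the score reaches `threshold`.
    """
    scores = {label: correlation(unknown, combine(background_patch, m_s, **combine_args))
              for label, m_s in symbol_models.items()}
    best = max(scores, key=scores.get)
    return best, scores[best], scores[best] >= threshold
```

A typical call passes the elementary image cut from the banknote, the matching patch of the registered background model, and a dictionary of character models, for example `recognise(patch, background_patch, {"A": model_a, "B": model_b})`. The threshold test corresponds to the optional print-quality check; its value would have to be tuned on real samples.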

Other preferred embodiments of the invention include:

a) application to character recognition on documents other than banknotes, such as letters, postcards, labels, postal or bank cheques, etc.;

b) replacement of the belt transport system by a transport system suitable for large sheets, for example a cylinder of the type used in printing presses, or a system in accordance with U.S. Patent No. 5598006 of 28.01.1998;

c) replacement of the linear camera in the recognition system by a matrix camera;

d) use of the averaged image of the set NOF as the reference image for registering the background;

e) automatic extraction of the marker used to register the background, for example in accordance with the technique proposed in Principality of Monaco Patent No. 2411.99.2479;

f) construction of the model of the background by a method other than averaging, for example using the technique described in U.S. Patent No. 5778088 of 7.07.1998.

List of references

(1) L. Stringa, "Inspection automatique de la qualité d'impression par un modèle élastique", Patent No. 2411.99.2479, issued by the Ministry of State of the Principality of Monaco (27.04.99).

(2) L. Stringa, "Procedure for Producing a Reference Model etc.", U.S. Patent No. 5,778,088, July 7, 1998.

(3) L. Stringa, "Procédé de contrôle automatique de la qualité d'impression d'une image multichrome", European Patent Application No. 97810160.8-2304.

(4) L. Stringa, "Installation for Quality Control of Printed Sheets, Especially Security Paper", U.S. Patent No. 5,598,006, January 28, 1998.

(5) Rice, Nagy, Nartker, "Optical Character Recognition", Kluwer Academic Publishers, 1999.

1. A method for the automatic recognition, by electronic means, of characters such as letters and/or digits printed on any material, even when the background carries highly contrasted structures, using an optoelectronic image reading device and an image processing system, comprising the following steps: a debugging phase, in which a model of the background is formed by acquiring images of one or more samples carrying only the background, and models of the characters (digits and/or letters and other symbols) are formed by reading images of a set of characters printed on a white background that contains at least one instance of each character; followed by a recognition phase, in which an image is read of the sample to be recognised, containing unknown characters printed on the background; the model of the background is registered with the background of the read image; the elementary background image corresponding to each unknown character is extracted from the registered model of the background; for each character position, the models of the letters and/or digits are combined with the corresponding elementary background image, thereby creating combined models; the unknown characters are compared with all the combined models for the same character position; and each unknown character is recognised as the character whose combined model matches it best according to template matching.

2. The method according to claim 1, in which the model of the background is one of the images of the Set for Debugging the Background (NOF).

3. The method according to claim 1, in which the model of the background is the average of the images of the set NOF registered with one another.

4. The method according to claim 1, in which the model of the background is obtained from a set of samples containing either the background or the characters, using a technique for separating the characters from the background.

5. The method according to claim 4, in which the models of the characters to be recognised are formed as the averages of the corresponding images of the Set for Debugging the Symbols (NOS).

6. The method according to claim 4, in which the models of the characters to be recognised are formed from computer files.

7. The method according to claim 6, in which a colour image reading device is used and recognition is carried out in the colour channel that gives the best match.

8. The method according to claim 7, in which, on each image to be processed, the unknown characters are separated from the background by subtracting the registered model of the background.

9. The method according to claim 8, in which the models of the background and of the characters are combined in accordance with the following equations:

[equation [1]];

Ms≈Mf,

if the background is printed first and the characters afterwards; otherwise:

[equation [2]];

Ms≈Mf,

where Mb is the model of the symbol b;

Mf is the model of the background;

Ms is the model of each character;

K0 and K1 are constant coefficients.

10. The method according to claim 9, in which the print quality is checked by comparing with a threshold the value of the correlation coefficient between the elementary image at each character position and the combined model selected at the recognition stage.



 
