RussianPatents.com
Method of training artificial neural network. RU patent 2504006.
FIELD: information technology.

SUBSTANCE: the method comprises the steps of: determining the required number of training vectors; limiting the input vector space to a certain region O; indicating M vectors which describe the most typical representatives of each of the investigated classes of objects belonging to region O; generating K training vectors of input signals of the artificial neural network (ANN), first in the vicinity of the M vectors, with subsequent expansion to region O; creating visual images clearly describing the objects specified by the generated training vectors; determining the one of the M classes to which each of the K generated training vectors of ANN input signals belongs; recording the generated training vectors and the reference signals corresponding to the classes of objects to which the generated vectors relate, in the form of pairs; reading the recorded pairs and feeding them to the ANN inputs; correcting the vector of synaptic weights of the neurons w(n) with correction step η until training of the ANN is complete.

EFFECT: training of an ANN without a statistically sufficient series of observations of the investigated objects.

2 cl, 3 dwg
The invention relates to the field of computer systems based on biological models, more precisely to computer models of artificial neural networks (ANN) designed for solving problems of classification of objects described by sets of numeric characteristics (vectors), and in particular to methods of training them.

It is known (patent RU 2424561 C2; IPC G06F 15/18, G06K 9/66, G06N 3/08, published 20.07.2011) that for some computational tasks, such as the classification problem, there exist well-established methods of machine learning. The main one involves the use of ANNs: mathematical models, and their software and/or hardware implementations, built on the principle of the organization and functioning of networks of nerve cells of living organisms. Neural networks are based on the concept of interconnected neurons. Neurons hold data values, each of which affects the value of a connected neuron according to a connection with a predefined weight, and according to whether the sum of all connections into each individual neuron exceeds a predefined threshold. By identifying appropriate connection weights (a process called training), an ANN can achieve an effective solution of classification problems.

Consider the method of training an ANN called "supervised learning" for the case of classification of linearly separable objects with the number of classes M equal to two, which can serve as a basis for solving more complicated problems. One of the ANN models solving this problem is the neural network called the perceptron (Rutkowska D. et al. Neural networks, genetic algorithms and fuzzy systems: translated from Polish. Moscow: Hot Line - Telecom, 2006, p. 21-25). Figure 1 presents the structure of the perceptron. As the function f in the McCulloch-Pitts neuron model, the bipolar activation function is used:

y = f( Σ_{i=1..N} w_i·u_i − ν ),     (1)

f(x) = 1 if x ≥ 0, and f(x) = −1 if x < 0,     (2)

where u_1, ..., u_N are the ANN inputs; w_1, ..., w_N are the synaptic weights; y is the ANN output; ν is the threshold value.
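The McCulloch-Pitts neuron with a bipolar activation function described above can be illustrated by a short Python sketch (illustrative only, not part of the patent; the AND-gate weights and threshold are an assumed example):

```python
def bipolar_activation(x):
    """Bipolar threshold function: +1 for x >= 0, -1 otherwise."""
    return 1 if x >= 0 else -1

def perceptron_output(u, w, v):
    """Neuron output y = f(sum_i w_i*u_i - v) for inputs u, weights w, threshold v."""
    return bipolar_activation(sum(wi * ui for wi, ui in zip(w, u)) - v)

# Assumed example: weights (1, 1) with threshold 1.5 make the neuron fire (+1)
# only when both inputs equal 1, i.e. it realises a bipolar AND gate.
print(perceptron_output([1, 1], [1, 1], 1.5))   # 1
print(perceptron_output([1, 0], [1, 1], 1.5))   # -1
```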
The signal x at the output of the linear part of the perceptron is given by the expression:

x = Σ_{i=0..N} w_i·u_i,     (3)

where w_0 = ν and u_0 = −1. The task of the perceptron is the classification of the vector u = [u_1, ..., u_N]^T in the sense of assigning it to one of two classes (M = 2), denoted by the symbols L_1 and L_2. The perceptron assigns the vector u to class L_1 when the output signal takes the value 1, and to class L_2 if the output signal takes the value −1. The perceptron thereby divides the N-dimensional space of input vectors u into two half-spaces, separated by an (N−1)-dimensional hyperplane given by the equation:

Σ_{i=0..N} w_i·u_i = 0.     (4)

Hyperplane (4) is called the decision boundary. If N = 2, the decision boundary is a straight line defined by the equation:

u_2 = −(w_1/w_2)·u_1 + ν/w_2.     (5)

Any point (u_1, u_2) lying above this line, shown in figure 2, belongs to class L_1, while any point (u_1, u_2) lying below this line belongs to class L_2. As a rule, the weights w_i, i = 0, 1, ..., N in the hyperplane equation (4) are unknown, while so-called training vectors (signals) u(n), n = 1, 2, ..., K, u(n) = [u_1(n), ..., u_N(n)]^T, are fed sequentially to the perceptron input. The unknown values of the weights are determined in the process of training the perceptron. This is referred to as "supervised learning" or "learning under supervision". The role of the "teacher" consists in the correct classification of the signals u(n) into the classes L_1 and L_2 despite the uncertainty of the weights of the decision-boundary equation (4). On completion of the learning process the perceptron must correctly classify incoming input signals, including those absent from the training sequence u(n), n = 1, 2, ..., K. In addition, we assume that the sets of vectors u(n), n = 1, 2, ..., K, for which the perceptron output takes the values 1 and −1, are linearly separable, i.e. they lie in two different half-spaces separated by hyperplane (4). In other words, the training sequence {u(n)} can be divided into two subsequences {u^1(n)} and {u^2(n)} such that {u^1(n)} belongs to L_1 and {u^2(n)} belongs to L_2.
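For N = 2, classification by the decision boundary (4) amounts to checking on which side of a straight line a point lies. A minimal illustrative sketch (the particular weights and threshold are assumed values, not from the patent):

```python
def classify_2d(u1, u2, w1, w2, v):
    """Assign point (u1, u2) to class L1 or L2 by the sign of the
    linear part x = w1*u1 + w2*u2 - v (decision boundary x = 0)."""
    x = w1 * u1 + w2 * u2 - v
    return "L1" if x >= 0 else "L2"

# Assumed boundary u2 = u1 (w1 = -1, w2 = 1, v = 0): points above the
# line fall in class L1, points below it in class L2.
print(classify_2d(1.0, 3.0, -1.0, 1.0, 0.0))   # L1 (above the line)
print(classify_2d(3.0, 1.0, -1.0, 1.0, 0.0))   # L2 (below the line)
```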
At the n-th time step the output signal of the linear part of the perceptron is determined by the expression:

x(n) = w^T(n)·u(n),     (6)

where u(n) = [−1, u_1(n), u_2(n), ..., u_N(n)]^T and w(n) = [ν(n), w_1(n), w_2(n), ..., w_N(n)]^T. Training the perceptron consists in a recurrent correction of the weight vector w(n) according to the equations:

w(n+1) = w(n) + η·u(n), if u(n) belongs to L_1 and y(n) = −1,     (7)
w(n+1) = w(n) − η·u(n), if u(n) belongs to L_2 and y(n) = 1,     (8)

and w(n+1) = w(n) otherwise, where the parameter η, 0 < η < 1, is the correction step, and the initial values of the components of the weight vector are equal to zero, i.e. w(0) = 0. Dependencies (7) and (8) can be represented in a more compact form. To this end we define the so-called reference (desired) signal d(n) in the form:

d(n) = 1 if u(n) belongs to L_1, and d(n) = −1 if u(n) belongs to L_2.     (9)

Additionally, we note that the output signal of the perceptron can be described by the expressions:

y(n) = f(x(n)),     (10)
y(n) = f(w^T(n)·u(n)).     (11)

Taking into account the introduced notation, recursions (7) and (8) take the form:

w(n+1) = w(n) + η·[d(n) − y(n)]·u(n)     (12)

(the factor of two arising from d(n) − y(n) = ±2 is absorbed into the correction step). The difference d(n) − y(n) can be interpreted as the error ε(n) between the reference (desired) signal d(n) and the actual output signal y(n). Under the above condition of linear separability of the input signals, algorithm (12) converges, i.e.

w(n) → w after a finite number of steps.     (13)

Training concludes with the decision boundary of the perceptron determined by the expression:

Σ_{i=0..N} w_i·u_i = 0,     (14)

and the perceptron correctly classifies both signals that belong to the training set {u(n)} and signals that are not part of this set but satisfy the linear-separability condition. Training of other ANN models to solve more complex tasks by "supervised learning" is, in general, similar to that described above.

It is known (Beams, E.V. Development of a neural network control system for technological processes at railway sorting humps: dissertation for the scientific degree of candidate of technical sciences, specialty 05.13.06, Rostov-on-Don, 2011) that, to form a training sample for training an ANN by "supervised learning", one typically relies on data from the following sources:
1. local data of organizations (databases, spreadsheets, etc.);
2. external data available via the Internet (stock prices, weather information, etc.);
3. data from various devices (equipment sensors, camcorders, etc.).
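The recurrent correction rule (12), with zero initial weights and the input vector augmented by u_0 = −1, can be sketched in Python as follows (an illustrative implementation, not part of the patent; the toy data set is invented for the example):

```python
def train_perceptron(U, d, eta=0.5, max_epochs=100):
    """Perceptron training by the recursion (12):
    w(n+1) = w(n) + eta * (d(n) - y(n)) * u(n).
    Each input vector is augmented with u0 = -1 so that w0 plays the
    role of the threshold; initial weights are zero, as in the text."""
    U = [[-1.0] + [float(c) for c in u] for u in U]
    w = [0.0] * len(U[0])
    for _ in range(max_epochs):
        errors = 0
        for u, target in zip(U, d):
            x = sum(wi * ui for wi, ui in zip(w, u))   # linear part x(n)
            y = 1 if x >= 0 else -1                    # bipolar output y(n)
            if y != target:                            # error d(n) - y(n) != 0
                w = [wi + eta * (target - y) * ui for wi, ui in zip(w, u)]
                errors += 1
        if errors == 0:   # no misclassifications left: training has converged
            break
    return w

def classify(w, u):
    """Class of u after training: +1 (class L1) or -1 (class L2)."""
    x = sum(wi * ui for wi, ui in zip(w, [-1.0] + [float(c) for c in u]))
    return 1 if x >= 0 else -1

# Invented, linearly separable toy data: L1 above the line u2 = u1, L2 below.
U = [[0, 1], [1, 2], [2, 3], [1, 0], [2, 1], [3, 2]]
d = [1, 1, 1, -1, -1, -1]
w = train_perceptron(U, d)
# After convergence the perceptron classifies every training vector correctly.
```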
The disadvantage of this method is the impossibility of applying it in the absence of a statistically sufficient number of observations of the studied objects, when a sufficient number of training vectors for correct training of the ANN by "supervised learning" cannot be formed. The technical problem solved by the invention is extending the class of tasks solvable with the help of ANN technology to the case of a lack of a statistically sufficient number of observations of the studied objects. This goal is achieved in that the training vectors are formed on the basis of the knowledge of an expert in the given field: the expert sequentially determines the classes of studied objects to which belong training vectors of ANN input signals that are generated using a random number generator and belong to the considered region, aided by computer-generated visual images clearly depicting the objects specified by the generated training vectors. By "expert", in the context of this invention, is meant a person possessing special knowledge about the studied objects and competent in the given field. The sequence of actions of the proposed method, implemented using a computer, contains the following stages: 1. determination of the necessary number K of training vectors u(n), n = 1, 2, ..., K for training the ANN, i.e. the number of points in the N-dimensional space of input vectors u; 2. specification of the range of variation of the ANN input signals, i.e. limitation of the N-dimensional space of input vectors u to some considered region O (shown by shading in figure 3); 3. indication by the expert of M vectors describing the most typical representatives of each of the M studied classes of objects L_1, L_2, ..., L_j, j = 1, 2, ..., M, belonging to the region O; 4.
generation by the computer, using a random number generator, of K training vectors u(n), n = 1, 2, ..., K of ANN input signals belonging to the region O, first in the near neighbourhood of the points indicated by the expert at stage 3 of this method, i.e. near the M vectors describing the most typical representatives of each of the M studied classes of objects L_1, L_2, ..., L_j, with subsequent successive uniform extension of this neighbourhood up to the indicated region O; 5. creation by the computer of visual images clearly depicting the objects specified by the generated training vectors; 6. demonstration to the expert of the generated training vectors and of the visual images clearly depicting the objects specified by the generated training vectors; 7. determination by the expert, on the basis of his knowledge of the studied objects within the considered region O, of the one of the M classes to which each of the K generated training vectors u(n) of ANN input signals belongs; 8. recording of the generated training vectors u(n) and of the reference signals d_j(n), corresponding to the classes L_j(n) of objects to which, according to the expert, the generated vectors belong, in the form of pairs <u(n), d_j(n)> on a material carrier; 9. reading of the recorded pairs <u(n), d_j(n)> from the material carrier and feeding of the read signals of the training vectors u(n) and the corresponding reference signals d_j(n) to the ANN inputs; 10. correction of the vector of synaptic weights of the neurons w(n) of the ANN in accordance with (12) with correction step η until completion of the training.
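Stage 4 of the method (random generation near the typical representatives with gradual expansion towards the region O) might be sketched as follows. This is an illustrative assumption: the particular radius schedule, round-robin choice of prototype and clamping to the region bounds are choices made for the example; the patent itself only specifies successive uniform extension of the neighbourhood:

```python
import random

def generate_training_vectors(prototypes, K, region, expand_steps=5):
    """Draw K random training vectors, starting in small neighbourhoods
    of the M prototype vectors (stage 3) and gradually widening the
    neighbourhood until it covers the whole region O (stage 2).

    prototypes : list of M typical vectors, one per class
    region     : list of (low, high) bounds per coordinate, defining O
    """
    vectors = []
    step = max(1, K // expand_steps)
    for n in range(K):
        p = prototypes[n % len(prototypes)]          # cycle over prototypes
        # neighbourhood radius grows from 10% to 100% of the coordinate span
        frac = min(1.0, 0.1 + 0.9 * (n // step) / max(1, expand_steps - 1))
        v = []
        for coord, (lo, hi) in zip(p, region):
            r = frac * (hi - lo)
            # clamp so every generated vector stays inside the region O
            v.append(min(hi, max(lo, coord + random.uniform(-r, r))))
        vectors.append(v)
    return vectors

random.seed(0)
prototypes = [[1.0, 1.0], [4.0, 4.0]]    # assumed typical members of L1, L2
region = [(0.0, 5.0), (0.0, 5.0)]        # assumed region O
train = generate_training_vectors(prototypes, K=20, region=region)
# Each vector would then be visualised and labelled by the expert (stages 5-8).
```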
The described method can be improved in that the expert, in case of difficulty in assigning any of the K generated training vectors u(n) of ANN input signals to one or another of the M classes (step 7 of the above sequence of actions), has the possibility of rejecting such a vector and having new training vectors re-generated (return to step 4) without determining their class membership, until he is able to correctly determine the class membership of one of the newly generated vectors. The invention is illustrated by the following graphic materials: figure 1 - structure of the perceptron; figure 2 - two-dimensional space of input vectors (hyperplane); figure 3 - limitation of the two-dimensional space of input vectors to some considered region O. The use of the proposed method of training an ANN provides, in comparison with the known method, the following technical advantages: a) extension of the class of tasks solvable with the help of ANN technology to the case of a lack of a statistically sufficient number of observations of the studied objects; b) the ANN contains in itself the knowledge of the expert who participated in the training about the investigated objects, and can mimic his intellectual activity when solving problems of classification of objects described by sets of numeric characteristics (vectors). A computer program implementing the described method of training an ANN was created in the Delphi 7 environment. The pseudorandom number generator in this case means the use of the standard functions Randomize and Random() of the Pascal programming language, and the material carrier is a paper printout. The visual images clearly depicting the objects specified by the generated training vectors are displayed on the computer monitor. 1.
Method of training an artificial neural network (ANN), intended for solving problems of classification of objects described by sets of numeric characteristics (vectors), involving an N-dimensional space of training vectors u(n) = [u_1(n), ..., u_N(n)]^T, n = 1, 2, ..., K for training the ANN; M studied classes of objects L_1, L_2, ..., L_j, j = 1, 2, ..., M; reference signals d_j(n) corresponding to the studied classes L_j(n) of objects; the vector of synaptic weights of the neurons w(n) of the ANN; the correction step η, 0 < η < 1; and the ANN output signals y(n); characterized in that the training vectors u(n), n = 1, 2, ..., K are formed on the basis of the knowledge of a person competent in the given field (an expert), in the case of the absence of a statistically sufficient number of observations of the studied objects, the expert sequentially determining the classes of studied objects to which belong the training vectors of ANN input signals generated using a random number generator and belonging to some considered region, aided by computer-generated visual images clearly depicting the objects specified by the generated training vectors; wherein the sequence of actions of the method, implemented using a computer, contains the following stages: define the required number of training vectors u(n), n = 1, 2, ..., K for training the ANN; limit the N-dimensional space of input vectors u to some considered region O; indicate M vectors describing the most typical representatives of each of the M studied classes of objects L_1, L_2, ..., L_j, j = 1, 2, ..., M, belonging to the region O; generate with the computer, using the random number generator, K training vectors u(n), n = 1, 2, ..., K of ANN input signals belonging to the region O, first in the near neighbourhood of these M vectors describing the most typical representatives of each of the M studied classes of objects L_1, L_2, ..., L_j, with subsequent successive uniform extension of this neighbourhood up to the indicated region O; create with the computer visual images clearly depicting the objects specified by the generated training vectors; demonstrate to the expert the generated training vectors and the visual images clearly depicting the objects specified by the generated training vectors; determine, on the basis of the expert's knowledge of the studied objects within the considered region O, the one of the M classes to which each of the K generated training vectors u(n) of ANN input signals belongs; record the generated training vectors u(n) and the reference signals d_j(n), corresponding to the classes L_j(n) of objects to which, according to the expert, the generated vectors belong, in the form of pairs <u(n), d_j(n)> on a material carrier; read the recorded pairs <u(n), d_j(n)> from the material carrier and feed the read signals of the training vectors u(n) and the corresponding reference signals d_j(n) to the ANN inputs; correct the vector of synaptic weights of the neurons w(n) with correction step η until completion of the training of the ANN. 2. Method of training an artificial neural network (ANN), intended for solving problems of classification of objects described by sets of numeric characteristics (vectors), according to claim 1, characterized in that, in case of difficulty for the expert in assigning any of the K generated training vectors u(n) of ANN input signals to one or another of the M classes, it is possible to reject such a vector and re-generate new training vectors without determining their class membership, until the class membership of one of the newly generated vectors can be determined on the basis of the expert's knowledge of the studied objects.
© 2013-2014 Russian business network RussianPatents.com - Special Russian commercial information project for a worldwide audience. Foreign filing in English.