Method of designing primary structure of protein with specified secondary structure

FIELD: chemistry.

SUBSTANCE: the invention relates to a computer method that uses biochemical databases in the design of novel protein compounds. The design is performed by an operator using the specially written PROTCOM software, which relies on a database of protein pentafragments. The operator specifies and enters into PROTCOM an initial sequence of five amino acids (the specified initial pentafragment) and a ten-digit binary number that describes the secondary structure of this pentafragment. The sequence is searched for in the database folder whose number corresponds to the specified ten-digit number. The search is continued until the specified initial pentafragment is found in the database. Once found, this pentafragment is considered the first of the possible number N of pentafragments of the designed primary structure, and it is recorded in the programme working file together with the ten-digit folder number describing its secondary structure. The secondary structure of each of the following (N-1) pentafragments is then specified by entering into the programme either the same ten-digit number that describes the secondary structure of the previous pentafragment or a changed one, and the database is searched for pentafragments that contain four amino acids of the previous pentafragment recorded in the working file and one new amino acid. When such pentafragments are found, one of the new amino acids is selected and linked to the last four amino acids of the previous pentafragment; the new amino acid and the ten-digit folder number describing the secondary structure of each found pentafragment are recorded in the working file. The amino acid sequence obtained in the working file, with the corresponding description of its secondary structure, is considered the designed primary structure of the protein.

EFFECT: claimed method of designing primary structure of protein considerably simplifies and accelerates the task of designing proteins with specified secondary structure.

5 dwg, 21 tbl, 2 ex

 

The invention relates to a computer method that uses biochemical databases in the development of new protein compounds for the pharmaceutical, biotechnology and other industries, as well as for scientific research in medicine, biochemistry, molecular biology and genetics, where the use of new compounds based on protein amino acids is essential.

This invention belongs to protein engineering, a field of molecular biology whose task is to create the knowledge and methods needed to obtain proteins with a predetermined structure and function. One aspect of this field is the design of protein molecules. The design problem is the inverse of the problem of predicting protein structure. When predicting protein structure, we start from a known amino acid sequence and must first find its secondary structure, i.e. the positions of the α-helices, β-structural regions and bends. In design, we must specify a previously unknown amino acid sequence for the primary structure such that, after its synthesis and under suitable conditions, it adopts the desired spatial structure with the intended order and size of α-helices, β-structural regions and bends.

New proteins are, as a rule, designed on the basis of methods developed for predicting protein structures, and success in designing new proteins with a predictable structure depends on how good these methods are. In most cases the published results are only a few successful examples among a large number of unsuccessful variants that the authors do not report.

Attempts are known to design protein structures on the basis of the general regularities of their formation. One of the first was the work of DeGrado's group (D.Eisenberg, W.Wilcox, S.M.Eshita, P.M.Pryciak, S.P.Ho, W.F.DeGrado. 1986. The design, synthesis, and crystallization of an alpha-helical peptide. Proteins: Structure, Function, and Bioinformatics. V.1, Issue 1, pp.16-22). The authors started from a simple idea: the hydrophobic groups of a protein should be hidden in a hydrophobic core, and the hydrophilic groups should be in contact with the solvent. Based on these considerations, the authors designed and synthesized an artificial protein containing only a few kinds of amino acids (Leu, Glu, Lys) and consisting of four α-helices (W.F.DeGrado, L.Regan, S.P.. The Design of a Four-helix Bundle Protein. Cold Spring Harb Symp Quant Biol 1987. 52: 521-526).

However, this simplified approach does not make it possible to design complex, near-natural proteins that consist of 20 different types of amino acids and have both specified structural and specified functional properties.

The artificial protein albebetin was based on a structure that does not exist in nature and consists of two repeats of the α-β-β type (V.V.Chemeris, D.A.Dolgikh, A.N.Fedorov, A.V.Finkelstein, M.P.Kirpichnikov, V.N.Uversky, O.B.Ptitsyn. A new approach to artificial and modified proteins: theory-based design, synthesis in a cell-free system and fast testing of structural properties by radiolabels. Protein Eng. (1994) 7 (8): 1041-1052). Its structure was developed on the basis of the physical theory of the formation of protein secondary structure developed by the authors (Ptitsyn O.B., Finkelstein A.V. Theory of protein secondary structure and algorithm of its prediction. Biopolymers. 1983. V.22. P.15-25). A structural study of albebetin showed that it has the secondary structure specified by the authors and is in a molten globule state. It should be noted that the accuracy of the approach used by the authors does not exceed 80%, which does not allow proteins with a given structure to be designed with full confidence. In practice the authors designed only one protein, and further studies were discontinued.

To improve the predictive power of the known method based on physical potentials, it was proposed to introduce a number of parameters that take into account the properties of the amino acid sequence (A.M.Poole and R.Ranganathan. Knowledge-based potentials in protein design. Current Opinion in Structural Biology 2006, 16, 508-513). On the basis of this method, with the introduced parameters taken into account, the authors designed a number of proteins de novo (WO 2007030594, "Methods of using and analyzing biological sequence data", IPC G06F 19/22; G06F 19/18, publ. 15.03.2007). However, this approach is compilatory in nature and provides only a slight improvement of the underlying methods, without changing the probabilistic nature of the original physical method.

An invention is known that relates to apparatus and methods for the quantitative design and optimization of protein structure (US 2002106694 "Apparatus and method for automated protein design", IPC SC 1/00; C07K 14/00; C12N 15/10; G06F 17/50; G06F 19/00, publ. 08.08.2002). The automated design method developed there quantitatively takes into account the interactions of the side chains of surface residues, based on the evaluation of three types of potentials and on stereochemical constraints, and makes it possible to select from a large number of variants the protein FSD-1 with a ββα motif, based on the structure of a zinc-finger domain. The amino acid sequence of this protein has very little similarity to that domain. Nevertheless, a study of this protein in solution by nuclear magnetic resonance showed that it forms a structure identical to the one intended by the design (B.I.Dahiyat and S.L.Mayo. De Novo Protein Design: Fully Automated Sequence Selection. Science (1997) 278, 82-87).

The disadvantage of this method is the need for a template protein, on the basis of which a new structure is selected from a large number of variants.

Using the Rosetta methodology, presented in (Kuhlman B, Dantas G, Ireton GC, Varani G, Stoddard BL, Baker D. Design of a novel globular protein fold with atomic-level accuracy. Science, 2003, 302(5649), 1364-8) and based on the optimization of selected structures, the artificial protein Top7, unknown in nature, was designed and synthesized, and its structure was confirmed experimentally. The core of Rosetta is a physical model of macromolecular interactions together with algorithms that search for the amino acid sequence with the lowest energy for a given protein structure. The authors applied their method (US 7574306 "Method and system for optimization of polymer sequences with stable, 3-dimensional conformations", IPC G06F 19/00, publ. 11.08.2009) to develop the structures of other proteins. However, this method requires rather complex calculations and does not always lead to successful results. It also requires template structures.

Such methods do not solve the problem of creating a simple way to design new proteins with any specified structure and functional properties, and the need to use specific protein structures as templates limits the range of structures that can be designed.

This problem is particularly important in the technology of pharmaceutical and immunological preparations of protein origin.

The problem addressed by the claimed invention is the development of a method for designing the primary structure of a protein that achieves the following technical result: simplification of the method together with expansion of the range of structures that can be designed.

The proposed method for designing the primary structure of a protein, based on obtaining the amino acid sequence that characterizes it and a description of its secondary structure, is as follows:

A) create a database of amino acid pentafragments of proteins, containing folders of pentafragments and a list of the folders compiled by their names, which are formed from the binary-encoded description of the hydrogen bonds of the peptide groups of the pentafragments in the secondary structure of proteins, and record it on an information carrier;

B) load the database of amino acid pentafragments of proteins recorded on the information carrier into computer memory;

C) specify and enter into computer memory an initial sequence of five amino acids belonging to the group of the twenty canonical amino acids of proteins, which is the specified initial pentafragment;

D) determine and enter into computer memory a description of the secondary structure of the specified initial pentafragment in the form of a ten-digit binary number;

E) load into computer memory the PROTCOM program, which selects and searches for pentafragments of the designed protein in the database and records the names of the amino acids of the found pentafragments and the numbers of the database folders, describing the secondary structure, in which the sought pentafragments are found;

F) enter and store in the PROTCOM program the specified initial pentafragment of the designed protein as a sequence of five amino acids;

G) enter and store in the PROTCOM program the specified secondary structure of the specified initial pentafragment as a ten-digit binary number;

H) search for the specified initial pentafragment of the designed protein in the database using the PROTCOM program previously loaded into the computer, where the search algorithm includes:

- encoding the specified initial pentafragment for the purposes of the database search;

- searching for the specified initial pentafragment in the database folder with the specified secondary structure of the pentafragment;

- if the specified initial pentafragment is found in the folder, this pentafragment is considered the first of the possible number N of pentafragments of the designed primary structure of the protein, and the following is performed:

- recording the number of the database folder that contains the first pentafragment;

- recording the amino acid sequence of the first pentafragment in the working file of the program;

- recording in the working file the ten-digit folder number describing the secondary structure of the found first pentafragment;

- if the specified initial pentafragment is not found in the folder:

- specifying and entering into computer memory a new initial sequence of five amino acids belonging to the group of the twenty canonical amino acids of proteins, which is the new specified initial pentafragment;

- entering and storing in the PROTCOM program the new specified initial pentafragment of the designed protein as a sequence of five amino acids;

- searching for the new specified initial pentafragment of the designed protein in the database using the PROTCOM program previously loaded into the computer, where the search algorithm includes:

- encoding the new specified initial pentafragment for the purposes of the database search;

- searching for the new specified initial pentafragment in the database folder with the specified secondary structure of the pentafragment;

- repeating the specification of a new initial pentafragment and the search for it until a pentafragment is found whose amino acid sequence is present in the database folder describing the specified secondary structure of the pentafragment;

I) specify the secondary structure of each of the subsequent (N-1) pentafragments by entering into the PROTCOM program either the same ten-digit number that describes the secondary structure of the previous pentafragment or a modified one;

J) search the database for pentafragments containing four amino acids of each of the (N-1) pentafragments stored in the working file and one new amino acid, where the search algorithm includes:

- selecting and storing the last four amino acids of each of the (N-1) pentafragments stored in the working file;

- searching the database folder with the specified secondary structure for pentafragments containing the last four amino acids of each of the (N-1) pentafragments stored in the working file and one new amino acid;

- if such pentafragments are found:

- selecting one of the new amino acids and appending it to the last four amino acids of the previous pentafragment;

- recording the new amino acid in the working file, which reflects the designed primary structure of the protein;

- recording the ten-digit folder number describing the secondary structure of each found pentafragment;

- if such pentafragments are not found:

- specifying a modified secondary structure;

- selecting the last four amino acids for the subsequent pentafragment;

- searching the database folder with the modified secondary structure for pentafragments containing the last four amino acids of the previous pentafragment and one new amino acid;

- repeating the modification of the secondary structure and the database search until at least one pentafragment containing the four amino acids of the previous pentafragment is found;

K) the amino acid sequence obtained in the working file, with the corresponding description of its secondary structure, is considered the designed primary structure of the protein.

The method is as follows:

A) create a database of amino acid pentafragments of proteins, containing folders of pentafragments and a list of the folders compiled by their names, which are formed from the binary-encoded description of the hydrogen bonds (H-bonds) of the peptide groups of the pentafragments in the secondary structure of proteins, and record it on an information carrier;

a) publicly available files with the atomic coordinates of protein crystals studied by X-ray structural analysis are downloaded from the Protein Data Bank. To create the initial database, 2500 protein files were downloaded.

b) using the computer program Protein 3D (computer program "Protein 3D", registered in Russia, RosAPO, No. 980143 of 03.05.98, authors: Karasev V.A., Demchenko EL), text files are created from the files obtained from the Protein Data Bank; these text files contain the primary structure of the proteins together with a description of the H-bonds formed by the peptide groups of the main chain in the secondary structure;

c) using a set of programs, the database is created in the following steps:

- the primary structures of proteins are cut into fragments of five amino acids (pentafragments) so that, moving from the bottom up, each subsequent fragment is selected with a shift of one amino acid relative to the previous fragment, and the information about the H-bonds of each selected fragment in the secondary structure of the protein is fully preserved. As an example, table 1 shows the procedure for cutting a fragment of the text file of protein 1SCN (subtilisin Carlsberg). The table shows that the H-bonds in the pentafragments remain unchanged.

- pentafragments with a homologous structure of H-bonds of the peptide groups in the secondary structure of the protein are sorted into folders, and the folders are given names that encode, in the binary system, the description of the H-bonds of the peptide groups. The presence of an H-bond is denoted by the numeral "1", its absence by "0".

Each pentafragment has 5 links (peptide groups), and the H-bonding of each link is described by one of four pairs of variables: no H-bonds - 00, H-bond O...HN - 01, H-bond NH...O - 10, and two H-bonds, O...HN and NH...O - 11. Thus the name of the folder containing pentafragments of homologous structure consists of 10 symbols 0 and 1, read from top to bottom and written in a line from left to right.

Examples of selected pentafragments and the ten-digit binary numbers describing them are shown in table 2. Thus, a pentafragment taken from a region of β-structure (first row of the example, left) contains no short-range H-bonds and is described by the number 0000000000. A pentafragment located in the transition region from β-structure to α-helix (first row, right) contains one link with O...HN and NH...O bonds (pair of variables 11) and four links with O...HN bonds (01) and is characterized by the number 1101010101. The central region of an α-helix, as shown in table 2, contains five links with O...HN and NH...O bonds (11) and is described by the number 1111111111. The transition region from α-helix to β-structure contains four links with NH...O bonds (10) and one link with O...HN and NH...O bonds (pair of variables 11), which gives the ten-digit number 1010101011. Finally, a bend of β-structure with one H-bond, as can be seen from table 2, contains one link with an NH...O bond (10), three links without H-bonds (00) and one link with an O...HN bond (01), and is described by the number 1000000001.
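The encoding rule above can be illustrated with a minimal sketch. The function and constant names (`link_pair`, `folder_name`, `NONE`, `O_HN`, `NH_O`, `BOTH`) are hypothetical and are not taken from PROTCOM or Protein 3D; the sketch only assumes that the first digit of each pair marks an NH...O bond and the second an O...HN bond, as in the pairs 10 and 01 given in the text.

```python
# Illustrative sketch only; names are assumptions, not part of PROTCOM.
# Each link is (has_nh_o, has_o_hn): first digit = NH...O bond, second = O...HN bond.

def link_pair(has_nh_o: bool, has_o_hn: bool) -> str:
    """Return the two-digit pair of variables for one link."""
    return f"{int(has_nh_o)}{int(has_o_hn)}"

def folder_name(links) -> str:
    """Build the ten-digit folder name from five links listed from top to bottom."""
    if len(links) != 5:
        raise ValueError("a pentafragment has exactly five links")
    return "".join(link_pair(nh_o, o_hn) for nh_o, o_hn in links)

NONE, O_HN, NH_O, BOTH = (False, False), (False, True), (True, False), (True, True)

# The five examples described in the text for table 2.
examples = {
    "beta-structure": [NONE] * 5,
    "beta-to-alpha transition": [BOTH] + [O_HN] * 4,
    "central alpha-helix": [BOTH] * 5,
    "alpha-to-beta transition": [NH_O] * 4 + [BOTH],
    "beta bend with one H-bond": [NH_O, NONE, NONE, NONE, O_HN],
}

for label, links in examples.items():
    print(label, folder_name(links))
```

Under these assumptions the sketch yields the same five numbers quoted in the text: 0000000000, 1101010101, 1111111111, 1010101011 and 1000000001.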

When the database is created, the text files are processed by moving along the protein chain from the bottom up with a shift of one amino acid at each step, and each selected pentafragment receives the corresponding ten-digit description. In table 1 these values are given in the column "10-character description". As a result, this column contains a series of ten-digit descriptions of the structure of the 1SCN region that overlap by 4/5, and each pentafragment goes into the database folder with the same number. The 10-digit numbers of pentafragments similar to those shown in table 2 are in bold.

Table 1
An example of the procedure for cutting pentafragments from an α-helical fragment of protein 1SCN
Text file (residue: H-bonds) | Stages of selection of pentafragments (pentafragment, top to bottom) | 10-character description | No. of top residue
69 PRO | PRO ILE GLY THR CYS | 0000000000 | 69
68 ILE | ILE GLY THR CYS GLY | 0000000000 | 68
67 GLY | GLY THR CYS GLY ALA | 0000000010 | 67
66 THR | THR CYS GLY ALA LEU | 0000001010 | 66
65 CYS | CYS GLY ALA LEU LEU | 0000101010 | 65
64 GLY | GLY ALA LEU LEU LYS | 0010101010 | 64
63 ALA: N - 59 TYR O | ALA LEU LEU LYS TYR | 1010101011 | 63
62 LEU: N - 58 THR O | LEU LEU LYS TYR THR | 1010101111 | 62
61 LEU: N - 57 ARG O | LEU LYS TYR THR ARG | 1010111111 | 61
60 LYS: N - 56 TYR O | LYS TYR THR ARG TYR | 1011111111 | 60
59 TYR: O - 63 ALA N; N - 55 GLU O | TYR THR ARG TYR GLU | 1111111111 | 59
58 THR: O - 62 LEU N; N - 54 ASP O | THR ARG TYR GLU ASP | 1111111101 | 58
57 ARG: O - 61 LEU N; N - 53 ARG O | ARG TYR GLU ASP ARG | 1111110101 | 57
56 TYR: O - 60 LYS N; N - 52 LEU O | TYR GLU ASP ARG LEU | 1111010101 | 56
55 GLU: O - 59 TYR N; N - 51 GLN O | GLU ASP ARG LEU GLN | 1101010101 | 55
54 ASP: O - 58 THR N | ASP ARG LEU GLN PRO | 0101010100 | 54
53 ARG: O - 57 ARG N | ARG LEU GLN PRO ALA | 0101010000 | 53
52 LEU: O - 56 TYR N | LEU GLN PRO ALA ASP | 0101000000 | 52
51 GLN: O - 55 GLU N | GLN PRO ALA ASP SER | 0100000000 | 51
50 PRO | PRO ALA ASP SER ARG | 0000000000 | 50
49 ALA | | |
48 ASP | | |
47 SER | | |
46 ARG | | |
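The cutting procedure of table 1 can be sketched as a sliding window over the chain. This is an illustration under stated assumptions, not the authors' program: the names `CHAIN` and `cut_pentafragments` are hypothetical, and the per-residue pairs are transcribed from the 1SCN fragment of table 1 (first digit: NH...O bond, second digit: O...HN bond).

```python
# Illustrative sketch; names are assumptions. Residues 69 down to 46 of the
# 1SCN fragment, listed from top to bottom as in the text file.
CHAIN = [
    (69, "PRO", "00"), (68, "ILE", "00"), (67, "GLY", "00"), (66, "THR", "00"),
    (65, "CYS", "00"), (64, "GLY", "00"), (63, "ALA", "10"), (62, "LEU", "10"),
    (61, "LEU", "10"), (60, "LYS", "10"), (59, "TYR", "11"), (58, "THR", "11"),
    (57, "ARG", "11"), (56, "TYR", "11"), (55, "GLU", "11"), (54, "ASP", "01"),
    (53, "ARG", "01"), (52, "LEU", "01"), (51, "GLN", "01"), (50, "PRO", "00"),
    (49, "ALA", "00"), (48, "ASP", "00"), (47, "SER", "00"), (46, "ARG", "00"),
]

def cut_pentafragments(chain):
    """Move from the bottom up with a shift of one residue per step.

    Yields (number of top residue, residues from top to bottom, ten-digit description).
    """
    for start in range(len(chain) - 5, -1, -1):
        window = chain[start:start + 5]
        residues = [name for _, name, _ in window]
        description = "".join(pair for _, _, pair in window)
        yield window[0][0], residues, description

stages = list(cut_pentafragments(CHAIN))
codes = {top: description for top, _, description in stages}
for top, residues, description in stages:
    print(top, " ".join(residues), description)
```

Each consecutive pair of windows shares four residues, which is the 4/5 overlap mentioned in the text; the ten-digit descriptions produced this way match the values listed in table 1.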

The central parts of α-helices and β-structures of proteins are described, respectively, by runs of the repeating 10-digit numbers 1111111111 and 0000000000. The transitions from β-structure to α-helix and from α-helix to β-structure are described by blocks of 10-digit numbers with a gradually changing composition of the pairs of variables. Examples of such blocks are shown in table 3. The initial and final sections of the transitions and their 10-digit descriptions are shown in bold.

Table 3
Examples of transition sections and their descriptions using the 10-digit number
(in each row the 10-character description is that of the pentafragment whose top residue is the residue of that row)

Transition from β-structure to α-helix, 1SCN
Text file (residue: H-bonds) | 10-character description
59 TYR: O - 63 ALA N; N - 55 GLU O | 1111111111
58 THR: O - 62 LEU N; N - 54 ASP O | 1111111101
57 ARG: O - 61 LEU N; N - 53 ARG O | 1111110101
56 TYR: O - 60 LYS N; N - 52 LEU O | 1111010101
55 GLU: O - 59 TYR N; N - 51 GLN O | 1101010101
54 ASP: O - 58 THR N | 0101010100
53 ARG: O - 57 ARG N | 0101010000
52 LEU: O - 56 TYR N | 0101000000
51 GLN: O - 55 GLU N | 0100000000
50 PRO | 0000000000
49 ALA |
48 ASP |
47 SER |
46 ARG |

Transition from α-helix to β-structure, 1SCN
Text file (residue: H-bonds) | 10-character description
68 ILE | 0000000000
67 GLY | 0000000010
66 THR | 0000001010
65 CYS | 0000101010
64 GLY | 0010101010
63 ALA: N - 59 TYR O | 1010101011
62 LEU: N - 58 THR O | 1010101111
61 LEU: N - 57 ARG O | 1010111111
60 LYS: N - 56 TYR O | 1011111111
59 TYR: O - 63 ALA N; N - 55 GLU O | 1111111111
58 THR: O - 62 LEU N; N - 54 ASP O |
57 ARG: O - 61 LEU N; N - 53 ARG O |
56 TYR: O - 60 LYS N; N - 52 LEU O |
55 GLU: O - 59 TYR N; N - 51 GLN O |

Transition from β-structure to α-helix, 1AMF
Text file (residue: H-bonds) | 10-character description
131 GLU: O - 135 LYS N; N - 127 ILE O | 1111111111
130 LYS: O - 134 GLN N; N - 126 GLY O | 1111111101
129 ALA: O - 133 LEU N; N - 125 ALA O | 1111110101
128 TYR: O - 132 ALA N; N - 124 PRO O | 1111010101
127 ILE: O - 131 GLU N; N - 123 VAL O | 1101010111
126 GLY: O - 130 LYS N | 0101011100
125 ALA: O - 129 ALA N | 0101110000
124 PRO: O - 128 TYR N | 0111000000
123 VAL: O - 127 ILE N; N - 119 ASP O | 1100000001
122 HIS | 0000000100
121 GLU | 0000010000
120 PRO | 0001000000
119 ASP: O - 123 VAL N | 0100000000
118 GLY | 0000000000
117 VAL |
116 ALA |
115 LEU |
114 ARG |

Transition from α-helix to β-structure, 3BBY
Text file (residue: H-bonds) | 10-character description
93 PRO | 0000000000
92 TYR | 0000000010
91 ILE | 0000001000
90 ARG | 0000100000
89 GLU | 0010000000
88 TRP: N - 84 ALA O | 1000000011
87 THR | 0000001110
86 PRO | 0000111010
85 PRO | 0011101010
84 ALA: O - 88 TRP N; N - 80 GLU O | 1110101011
83 PHE: N - 79 LEU O | 1010101111
82 ARG: N - 78 TYR O | 1010111111
81 ASP: N - 77 GLU O | 1011111111
80 GLU: O - 84 ALA N; N - 76 ALA O | 1111111111
79 LEU: O - 83 PHE N; N - 75 ILE O |
78 TYR: O - 82 ARG N; N - 74 ALA O |
77 GLU: O - 81 ASP N; N - 73 SER O |
76 ALA: O - 80 GLU N; N - 72 SER O |

We have established that the number of such blocks is limited, and that the transitions from β-structure to α-helix and from α-helix to β-structure are related by the symmetry (0←→1). A catalog has been compiled for these transitions. A similar symmetry (0←→1) is also observed for the bends of α-helices and β-structures, examples of which are presented in table 4. A catalog has also been compiled for these blocks. The beginning and end of the bends and the pairs of variables denoting the hydrogen bonds in the bends are shown in bold.

Table 4
Comparison of a bend of α-helix with a gap of one H-bond with a bend of β-structure with one H-bond
(in each row the 10-character description is that of the pentafragment whose top residue is the residue of that row)

Bend of α-helix with a gap of one H-bond, 1DOG
Text file (residue: H-bonds) | 10-character description
334 GLN |
333 TYR: O - 337 LYS N; N - 329 TYR O | 1111111111
332 LEU: O - 336 ASP N; N - 328 LEU O | 1111111101
331 ALA: O - 335 TRP N; N - 327 GLN O | 1111110111
330 ASP: O - 334 GLN N; N - 326 GLU O | 1111011111
329 TYR: O - 333 TYR N; N - 325 ALA O | 1101111111
328 LEU: O - 332 LEU N | 0111111110
327 GLN: O - 331 ALA N; N - 323 ALA O | 1111111011
326 GLU: O - 330 ASP N; N - 322 LEU O | 1111101111
325 ALA: O - 329 TYR N; N - 321 THR O | 1110111111
324 ALA: N - 320 CYS O | 1011111101
323 ALA: O - 327 GLN N; N - 319 LEU O | 1111110101
322 LEU: O - 326 GLU N; N - 318 PHE O |
321 THR: O - 325 ALA N; N - 317 TRP O |
320 CYS: O - 324 ALA N |
319 LEU: O - 323 ALA N |

Bend of β-structure with one H-bond, 1GZM
Text file (residue: H-bonds) | 10-character description
31 LEU | 0000000000
30 TYR | 0000000010
29 TYR | 0000001000
28 GLN | 0000100000
27 PRO | 0010000000
26 ALA: N - 22 SER O | 1000000001
25 GLU | 0000000100
24 PHE | 0000010000
23 PRO | 0001000000
22 SER: O - 26 ALA N | 0100000000
21 ARG | 0000000000
20 VAL |
19 VAL |
18 GLY |
17 THR |

By combining these blocks, all types of secondary structures of proteins can be designed.

d) the selected pentafragments are simplified by removing the information about the structure of H-bonds and leaving only the sequence of five amino acids;

e) to facilitate the subsequent search for pentafragments in the database, the fragments are sorted into files, each containing fragments with the same five-digit numeric index, which is assigned by attributing each amino acid of the pentafragment to one of four groups of symmetry transformations (Karasev V.A., Luchinin V.V. Introduction to the design of the bionic nanosystems. - M.: Fizmatlit, 2009, 464 p., Chapter 8). These groups are given in table 5.
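The symmetry (0←→1) between the two kinds of transition block can be checked directly on the 1SCN blocks of table 3. The sketch below is only an illustration; the names `BETA_TO_ALPHA`, `ALPHA_TO_BETA` and `complement` are hypothetical, and the two lists are the 10-character descriptions of the 1SCN transitions read from top to bottom.

```python
# Illustrative sketch; names are assumptions. Descriptions are taken from the
# 1SCN transition blocks, listed from top to bottom.
BETA_TO_ALPHA = [
    "1111111111", "1111111101", "1111110101", "1111010101", "1101010101",
    "0101010100", "0101010000", "0101000000", "0100000000", "0000000000",
]
ALPHA_TO_BETA = [
    "0000000000", "0000000010", "0000001010", "0000101010", "0010101010",
    "1010101011", "1010101111", "1010111111", "1011111111", "1111111111",
]

SWAP = str.maketrans("01", "10")

def complement(description: str) -> str:
    """Apply the (0 <-> 1) exchange to every digit of a ten-digit description."""
    return description.translate(SWAP)

mirrored = [complement(d) for d in BETA_TO_ALPHA]
print(mirrored == ALPHA_TO_BETA)
```

For these two blocks the digit-by-digit exchange of 0 and 1 maps one transition onto the other, row by row.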

Table 5
Distribution of amino acids among the antisymmetry groups
Symmetry group | Amino acids
Group 1 | Gly, Pro
Group 2 | Ala, Leu
Group 3 | Ser, Thr, Cys, Met, His, Trp, Phe, Tyr
Group 4 | Asp, Glu, Asn, Gln, Arg, Lys, Val, Ile

The file name consists of the five-digit code and the name of the folder in which the file is located. If the pentafragment

Efg
Def
Cde
Bcd
Abc

is described by the 10-digit number 0000000000, the index is formed from the top down and written from left to right: for example, if the amino acid Efg belongs to group 1, Def to group 2, Cde to group 3, Bcd to group 4 and Abc to group 1, then the 5-digit code is 12341 and the file name is 12341_0000000000.
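The formation of the five-digit index and of the file name can be sketched as follows, using the groups of table 5. The names `SYMMETRY_GROUPS`, `GROUP_OF`, `five_digit_code` and `file_name` are hypothetical, and the example pentafragment (PRO ILE GLY THR CYS, read from top to bottom, folder 0000000000) is taken from the 1SCN fragment of table 1 for illustration only.

```python
# Illustrative sketch; names are assumptions. Group membership follows table 5.
SYMMETRY_GROUPS = {
    1: ("Gly", "Pro"),
    2: ("Ala", "Leu"),
    3: ("Ser", "Thr", "Cys", "Met", "His", "Trp", "Phe", "Tyr"),
    4: ("Asp", "Glu", "Asn", "Gln", "Arg", "Lys", "Val", "Ile"),
}
GROUP_OF = {aa: group for group, members in SYMMETRY_GROUPS.items() for aa in members}

def five_digit_code(pentafragment) -> str:
    """Encode five amino acids, given from top to bottom, as group numbers left to right."""
    if len(pentafragment) != 5:
        raise ValueError("a pentafragment has exactly five amino acids")
    return "".join(str(GROUP_OF[aa.capitalize()]) for aa in pentafragment)

def file_name(pentafragment, folder: str) -> str:
    """Join the five-digit code and the ten-digit folder name."""
    return f"{five_digit_code(pentafragment)}_{folder}"

print(file_name(["PRO", "ILE", "GLY", "THR", "CYS"], "0000000000"))  # 14133_0000000000
```

A dictionary lookup is used here because every one of the twenty canonical amino acids belongs to exactly one of the four groups.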

The created database contains more than 500 thousand pentafragments sorted into more than 500 folders. The database is organized as a system of 16 hypercubes, each isomorphic to the Boolean hypercube B6 (Database of pentafragments of proteins. Authors: Vagarosa, AIESEC, Win. Registered July 7, 2010 with the Federal Agency ROSPATENT, No. 2010620364).

The database is constantly updated by processing new files from the Protein Data Bank. A theoretical database can also be created.

B) load the database of amino acid pentafragments of proteins recorded on the information carrier into computer memory;

C) specify and enter into computer memory an initial sequence of five amino acids belonging to the group of the twenty canonical amino acids of proteins, which is the specified initial pentafragment;

The intended initial sequence of five amino acids is written as a column of three-letter amino acid abbreviations, numbered on the left from bottom to top:

5 Efg
4 Def
3 Cde
2 Bcd
1 Abc

D) determine and enter into computer memory a description of the secondary structure of the specified initial pentafragment in the form of a ten-digit binary number;

E) load into computer memory the PROTCOM program, which selects and searches for pentafragments of the designed protein in the database and records the names of the amino acids of the found pentafragments and the numbers of the database folders, describing the secondary structure, in which the sought pentafragments are found;

F) enter and store in the PROTCOM program the specified initial pentafragment of the designed protein as a sequence of five amino acids;

The operator enters the intended sequence of five amino acids (the specified initial pentafragment) into the program.

The amino acids are entered into the program from the top down, starting with the fifth amino acid and ending with the first: Efg, Def, Cde, Bcd, Abc.

G) enter and store in the PROTCOM program the specified secondary structure of the specified initial pentafragment as a ten-digit binary number;

Example of an entered ten-digit number: 0000000000

H) search for the specified initial pentafragment of the designed protein in the database using the PROTCOM program previously loaded into the computer, where the search algorithm includes:

- encoding the specified initial pentafragment for the purposes of the database search;

The program reads the amino acids of the pentafragment from the top down, encodes them according to the symmetry group to which they belong and writes the code number from left to right, in the same way as the file indexes were generated, for example: Efg - 1, Def - 2, Cde - 3, Bcd - 4, Abc - 4, code number - 12344.

- searching for the specified initial pentafragment in the database folder with the specified secondary structure of the pentafragment;

For the entered ten-digit number 0000000000, the specified initial pentafragment is searched for in database folder number 0000000000, in the file with code number 12344, i.e. 12344_0000000000.

- if the specified initial pentafragment is found in the folder, this pentafragment is considered the first of the possible number N of pentafragments of the designed primary structure of the protein, and the following is performed:

- recording the number of the database folder that contains the first pentafragment;

- recording the amino acid sequence of the first pentafragment in the working file of the program;

- recording in the working file the ten-digit folder number describing the secondary structure of the found first pentafragment;

The format of the working file created by the PROTCOM program is shown in table 6.

Table 6
The format of the working file created by the PROTCOM program
1 | 2 | 3
N | STP | bbbbbbbbbb
... | ... | ...
5 | Efg | bbbbbbbbbb
4 | Def |
3 | Cde |
2 | Bcd |
1 | Abc |

The amino acid sequence of the protein is written into the working file from the bottom up, which reflects the course of protein synthesis on the ribosome (elongation of the protein by attaching amino acids to the top amino acid). The columns of the file have the following functions:

1 - the numbers of the amino acids in the designed protein, written from the bottom up;

2 - the sequence of amino acids in the designed protein, written from the bottom up in three-letter notation;

3 - the ten-digit numbers of the database folders (bbbbbbbbbb) describing the secondary structure of the designed pentafragments, written from the bottom up.

Row N contains the signal for the end of the protein sequence (STP).

The first pentafragment and the ten-digit number of the folder in which this pentafragment was found are shown in bold.

- if the specified initial pentafragment is not found in the folder:

- specifying and entering into computer memory a new initial sequence of five amino acids belonging to the group of the twenty canonical amino acids of proteins, which is the new specified initial pentafragment;

- entering and storing in the PROTCOM program the new specified initial pentafragment of the designed protein as a sequence of five amino acids;

- searching for the new specified initial pentafragment of the designed protein in the database using the PROTCOM program previously loaded into the computer, where the search algorithm includes:

- encoding the new specified initial pentafragment for the purposes of the database search;

- searching for the new specified initial pentafragment in the database folder with the specified secondary structure of the pentafragment;

- repeating the specification of a new initial pentafragment and the search for it until a pentafragment is found whose amino acid sequence is present in the database folder describing the specified secondary structure of the pentafragment.

I) specify the secondary structure of each of the subsequent (N-1) pentafragments stored in the working file by entering into the PROTCOM program either the same ten-digit number that describes the secondary structure of the previous pentafragment or a modified one;

J) search the database for pentafragments containing four amino acids of each of the (N-1) pentafragments stored in the working file and one new amino acid, where the search algorithm includes:

- selecting and storing the last four amino acids of each of the (N-1) pentafragments stored in the working file;

- searching the database folder with the specified secondary structure for pentafragments containing the last four amino acids of each of the (N-1) pentafragments stored in the working file and one new amino acid;

For example, in table 7 the last four amino acids of the previous pentafragment and the description of the secondary structure used to search for a new pentafragment are shown in bold.

- if such pentafragments are found:

- selecting one of the new amino acids and appending it to the last four amino acids of the previous pentafragment;

Table 7
Selection of amino acids and their secondary structure for searching for pentafragments in the database
1 | 2 | 3
..........
6 | | 0000000000
5 | Efg | 0000000000
4 | Def |
3 | Cde |
2 | Bcd |
1 | Abc |

- the new amino acid is recorded in the working file, which reflects the designed primary structure of the protein;

- the ten-digit folder number describing the secondary structure of each found pentafragment is recorded;

- when no such pentafragments are found:

- a modified secondary structure is specified;

- the last four amino acids are selected for the subsequent pentafragment;

- the database folder with the modified secondary structure is searched for pentafragments containing the last four amino acids of the previous pentafragment and one new amino acid;

- modification of the secondary structure and the database search are repeated until at least one pentafragment containing four amino acids of the previous pentafragment is found;

L) the amino acid sequence obtained in the working file, together with the corresponding description of its secondary structure, is considered to be the designed primary structure of the protein.
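The extension step with its fallback to a modified secondary structure can be sketched as follows. This is a minimal illustration and not the PROTCOM code: the in-memory dictionary `TOY_DB`, the function `extend` and the order in which folders are tried are assumptions made for the example, and pentafragments are stored here in sequence order (first residue first) rather than top to bottom as in the tables.

```python
from typing import Dict, List, Optional, Sequence, Set, Tuple

Penta = Tuple[str, str, str, str, str]

# Hypothetical toy database: folder name -> pentafragments in sequence order.
TOY_DB: Dict[str, Set[Penta]] = {
    "1101010101": {
        ("Ser", "Pro", "Ala", "Asp", "Lys"),
        ("Ser", "Pro", "Ala", "Asp", "Ser"),
    },
}


def extend(
    db: Dict[str, Set[Penta]],
    last_four: Sequence[str],
    folders: Sequence[str],
) -> Optional[Tuple[str, List[str]]]:
    """Try each folder in turn and return the first one holding a pentafragment
    that starts with `last_four`, together with the candidate new amino acids."""
    key = tuple(last_four)
    for folder in folders:
        candidates = sorted(p[4] for p in db.get(folder, set()) if p[:4] == key)
        if candidates:
            return folder, candidates
    return None  # the operator would have to specify yet another structure


previous: Penta = ("Leu", "Ser", "Pro", "Ala", "Asp")
# The first folder is absent from the toy database, so the search falls back
# to the second (modified) secondary structure.
result = extend(TOY_DB, previous[1:], ["1111111111", "1101010101"])
print(result)
```

The operator then picks one of the returned amino acids; only that choice and the folder number are written to the working file.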

As a result of the actions of PROTCOM and of the operator designing the protein, the working file has its second column, containing the primary structure of the protein, and its third column, from which the secondary structure of the protein is judged, completely filled in. Consecutive folders numbered 0000000000 in the third column characterize a fragment as β-structural. Several consecutive folders numbered 1111111111 can be attributed to an α-helical fragment (see Table 2). Transitional regions between α-helical and β-structural conformations, as well as bends of the β-structure (Tables 2-4), are designed and described by the corresponding folders.
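Reading the third column of the working file can be sketched in a few lines. The helper names `label` and `runs` and the sample column are hypothetical; only the two folder numbers 0000000000 and 1111111111 and their interpretation come from the text, and every other folder is lumped together here as "transition or bend".

```python
from itertools import groupby
from typing import List, Sequence, Tuple


def label(folder: str) -> str:
    """Map a ten-digit folder number to a coarse structural label."""
    if folder == "0000000000":
        return "beta-structure"
    if folder == "1111111111":
        return "alpha-helix"
    return "transition or bend"


def runs(column: Sequence[str]) -> List[Tuple[str, int]]:
    """Collapse consecutive folders with the same label into (label, length)."""
    return [(name, len(list(group))) for name, group in groupby(map(label, column))]


# Hypothetical third column, listed in order of increasing residue number.
column = ["0000000000"] * 3 + ["1000000000"] + ["1111111111"] * 4
print(runs(column))
```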

The description of the application is illustrated by the following figures:

Figure 1. Screenshots of secondary-structure fragments of proteins NNW and 3EOK, obtained using PROTEIN 3D.

a - NNW (human); b - 3EOK (duck);

Figure 2. Screenshots of secondary-structure fragments of proteins 1AGD and 2R37, obtained using PROTEIN 3D for the designed region of the protein.

a - pentafragment of 1AGD (103-107); b - pentafragment of 2R37 (189-193).

Figure 3. Screenshots of secondary-structure fragments of proteins 1AGD and V, obtained using PROTEIN 3D for the designed region of the protein.

a - pentafragment of 1AGD (105-109); b - pentafragment of V (35-39).

Figure 4. Screenshots of secondary-structure fragments of proteins 1AGD and 1BAS, obtained using PROTEIN 3D for the designed region of the protein.

a - pentafragment of 1AGD (106-110); b - pentafragment of 1BAS (80-84).

Figure 5. Screenshots of protein 1AGD, investigated by X-ray structural analysis, obtained using PROTEIN 3D.

a - the protein; b - view of a fragment of the protein; c - detailed view of the secondary structure of protein 1GDJ corresponding to the specified secondary structure of example 2.

The method is illustrated by examples.

Example 1.

This example considers the design of the primary structure of a protein whose secondary structure is specified as an α-helix containing a region of transition from β-structure to α-helix, the central region of the α-helix, and a region of transition from α-helix to β-structure.

When designing the primary structure of a protein with a specified secondary structure on the basis of the amino acid sequence characterizing it and the description of its secondary structure, the following is carried out:

A) create a database of amino acid pentafragments of proteins containing folders of pentafragments, together with the initial list of folders compiled from their names, which are formed from the binary-encoded description of the hydrogen bonds of the peptide groups of pentafragments in the secondary structure of proteins, and record them on an information carrier;

B) create a catalog of descriptions of secondary structures, containing the descriptions in the form of sequences of ten-digit binary numbers;

(B) enter into computer memory the database of amino acid pentafragments of proteins recorded on the information carrier;

G) specify the description of the secondary structure of the designed primary structure of the protein as a sequence of ten-digit binary numbers, based on the catalog of descriptions of secondary structures;

In this example the secondary structure is specified as an α-helix containing a region of transition from β-structure to α-helix, the central region of the α-helix, and a region of transition from α-helix to β-structure. The operator finds their description in the catalog of secondary structures and records it (Table 8).

Table 8
Description of the designed secondary structure for example 1
18 | 0010101010 | Region of transition from α-helix to β-structure
17 | 1010101011 |
16 | 1010101111 |
15 | 1010111111 |
14 | 1011111111 |
13 | 1111111111 | Central part of the α-helix
12 | 1111111111 |
11 | 1111111111 |
10 | 1111111111 |
9 | 1111111101 | Region of transition from β-structure to α-helix
8 | 1111110101 |
7 | 1111010101 |
6 | 1101010101 |
5 | 0101010100 |
4 | |
3 | |
2 | |
1 | |

D) specify and enter into computer memory the initial sequence of five amino acids belonging to the group of the twenty canonical amino acids of proteins, which is the specified initial pentafragment:

5 Asp
4 Ala
3 Pro
2 Ser
1 Leu

which is recorded from bottom to top: Leu, Ser, Pro, Ala, Asp.

E) specify and enter into computer memory the description of the secondary structure of the specified initial pentafragment as a ten-digit binary number, which is the first ten-digit number in the specified description of the secondary structure and corresponds to the name of the database folder containing the specified initial pentafragment: the ten-digit number 0101010100. As can be seen from Table 8, it is the first ten-digit number in the specified description of the secondary structure.

It corresponds to the description of a pentafragment with four H-bonds C=O...HN, each described by the pair of variables 01, and one pair of variables 00 (no H-bond); see Table 3 (transition from β-structure to α-helix).

G) enter into computer memory the PROTCOM program for selecting and searching for pentafragments of the designed protein in the database and for recording the names of the amino acids of the found pentafragments and the numbers of the database folders, describing the secondary structure, in which the sought pentafragments were found;

1. The program is installed in a dedicated folder, in which it creates working files containing the designed primary structure of the protein and the description of its secondary structure as ten-digit binary numbers.

2. The newly installed program contains no files other than the program itself and opens with a start screen for entering the specified initial pentafragment.

3. At startup the program presents, in the form of a table, the system of twenty amino acids, consisting of four groups.

C) enter and store the specified initial pentafragment of the designed protein as a sequence of five amino acids in the PROTCOM program: the operator enters the amino acids making up the specified initial pentafragment in order from the fifth to the first, i.e. from top to bottom: Asp, Ala, Pro, Ser, Leu.

I) enter and store the specified description of the secondary structure of the specified initial pentafragment as a ten-digit binary number in the PROTCOM program: the operator enters the sequence 0101010100 into the PROTCOM program.

K) search for the specified initial pentafragment of the designed protein in the database using the PROTCOM program previously entered into computer memory, the search algorithm including:

- encoding the specified initial pentafragment for the purpose of searching the database.

The program performs the encoding by assigning each amino acid of the specified initial pentafragment to one of the symmetry groups (Table 5 of the description of the application).

In this example: Asp - 4, Ala - 2, Pro - 1, Ser - 3, Leu - 2. This numerical sequence is written to program memory from left to right as 42132 and is used to find the specified initial pentafragment in folder 0101010100 of the database, in file 42132_0101010100.
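The encoding just described amounts to a lookup followed by string concatenation. The sketch below is illustrative only: the names `GROUP`, `encode` and `file_name` are hypothetical, and the mapping lists only the five group numbers quoted in this example, not the full Table 5.

```python
from typing import Dict, Sequence

# Partial symmetry-group mapping, limited to the residues quoted in example 1.
GROUP: Dict[str, int] = {"Asp": 4, "Ala": 2, "Pro": 1, "Ser": 3, "Leu": 2}


def encode(pentafragment: Sequence[str]) -> str:
    """Concatenate the group numbers in the order the residues are entered."""
    return "".join(str(GROUP[residue]) for residue in pentafragment)


def file_name(pentafragment: Sequence[str], folder: str) -> str:
    """Build the file name <encoding>_<ten-digit folder number>."""
    return f"{encode(pentafragment)}_{folder}"


# Residues in entry order, fifth to first, as the operator types them.
initial = ["Asp", "Ala", "Pro", "Ser", "Leu"]
print(file_name(initial, "0101010100"))
```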

- searching for the specified initial pentafragment in the database folder with the specified description of the secondary structure of the pentafragment;

The program found the specified initial pentafragment in file 42132_0101010100:

Asp
Ala
Pro
Ser
Leu

This pentafragment was extracted from a text file produced by the PROTEIN 3D program by processing the atomic coordinates of a protein from the Protein Data Bank, and it has the structure 0101010100 of the region of transition from β-structure to α-helix (see Table 8).

Tables 9, 10 and 11 illustrate the operation of the program. The left part, entitled "Input", contains: in the first column, the sequence number of the designed amino acid entered into the PROTCOM program; in the second column, the amino acids entered according to step C) or the pairs of variables entered according to step L), selected by the operator on the basis of the specified secondary structure (Table 8); in the third column, the description of the secondary structure entered into the program as a ten-digit number. The central part, entitled "Search for the pentafragment in the database", contains in its first column the names of the files, consisting of the encoding number and the number of the specified folder, and in its second column the names of the amino acids found in the pentafragments. The right part of the table shows the entry made by the PROTCOM program in the working file after the specified initial pentafragment has been detected and, subsequently, after an amino acid has been chosen from a file with the given encoding number and folder number.

Table 9
Search for the specified initial pentafragment in the database
Input (columns 1-3) | Search for the pentafragment in the database (columns 4-5) | Entry in the working file (columns 6-8)
No. | Amino acids or pairs of variables | Description of the secondary structure | File name (encoding number and number of the specified folder) | Names of the amino acids | No. | Names of the amino acids | Description of the secondary structure
5 | Asp | 0101010100 | 42132_0101010100 | | 5 | Asp | 0101010100
4 | Ala | | | | 4 | Ala |
3 | Pro | | | | 3 | Pro |
2 | Ser | | | | 2 | Ser |
1 | Leu | | | | 1 | Leu |

- when the specified initial pentafragment is found in the folder, this pentafragment is considered to be the first of the possible number N of pentafragments of the designed primary structure of the protein, and the following is carried out:

- recording the number of the database folder containing the first pentafragment;

- recording the amino acid sequence of the first pentafragment in the working file of the program;

- recording in the working file the ten-digit folder number describing the secondary structure of the first pentafragment found;

The program finds the entered initial pentafragment in the file with the corresponding encoding and folder number and makes an entry in the working file (Table 9).

Since the specified initial pentafragment was found, the search steps relating to the case where the specified initial pentafragment is not in the folder are omitted.

L) specify the description of the secondary structure for each of the subsequent (N-1) pentafragments, using the specified description of the secondary structure as a sequence of ten-digit binary numbers corresponding to the names of the database folders containing the specified pentafragments, by entering into the PROTCOM program the same or a modified ten-digit number describing the secondary structure of the previous pentafragment;

To specify the description of the secondary structure, the PROTCOM program prompts the operator to enter one of the pairs of variables 00, 01, 10 or 11. Table 8 shows that the next ten-digit number is 1101010101. The operator therefore selects 11 and enters this pair of variables into the program (column "Amino acids or pairs of variables" in Table 10). The program adds 11 on the left and removes a pair of digits on the right, which changes the ten-digit number describing the secondary structure of the previous pentafragment, as reflected in the column "Description of the secondary structure" of Table 10.
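The update of the ten-digit number can be written as a two-character shift. The following sketch is a reading of the paragraph above rather than the program itself; the function name `next_structure` is hypothetical, and the list of pairs is the one recorded in the "Input" column of Table 11 for amino acids 6 to 18.

```python
from typing import List


def next_structure(previous: str, pair: str) -> str:
    """Prepend the entered pair and drop the rightmost pair of digits."""
    if len(previous) != 10 or pair not in {"00", "01", "10", "11"}:
        raise ValueError("expected a ten-digit number and a pair 00, 01, 10 or 11")
    return pair + previous[:-2]


# Pairs entered for amino acids 6..18 in example 1 (Table 11, "Input" column).
pairs: List[str] = ["11"] * 8 + ["10"] * 4 + ["00"]

structure = "0101010100"  # folder of the initial pentafragment (row 5)
column = [structure]
for pair in pairs:
    structure = next_structure(structure, pair)
    column.append(structure)

print(column[1])   # folder for amino acid 6
print(column[-1])  # folder for amino acid 18
```

Replaying the pairs in this way reproduces the column of Table 8 from row 5 up to row 18.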

Table 10
Search for the second pentafragment in the database
Input (columns 1-3) | Search for the pentafragment in the database (columns 4-5) | Entry in the working file (columns 6-8)
No. | Amino acids or pairs of variables | Description of the secondary structure | File name (encoding number and number of the specified folder) | Names of the amino acids | No. | Names of the amino acids | Description of the secondary structure
6 | 11 | 1101010101 | 34213_1101010101 | Ser | | |
 | | | 44213_1101010101 | Lys | 6 | Lys | 1101010101
5 | Asp | 0101010100 | 42132_0101010100 | | 5 | Asp | 0101010100
4 | Ala | | | | 4 | Ala |
3 | Pro | | | | 3 | Pro |
2 | Ser | | | | 2 | Ser |
1 | Leu | | | | 1 | Leu |

M) search the database for pentafragments containing four amino acids of each of the (N-1) pentafragments recorded in the working file and one new amino acid, the search algorithm including:

- selecting and storing the last four amino acids of each of the (N-1) pentafragments recorded in the working file;

- searching the database folder with the specified description of the secondary structure for pentafragments containing the last four amino acids of each of the (N-1) pentafragments recorded in the working file and one new amino acid;

To do this, the program selects from the pentafragment recorded in the working file (Table 10) four amino acids, listed from top to bottom: Asp, Ala, Pro, Ser.

Next, the program encodes them according to the symmetry group to which each belongs and writes the code number from left to right, in the same way as for the index files but without the first amino acid: 4213. It then searches the database for pentafragments containing the four selected amino acids in the folder with the specified structure of the next pentafragment (1101010101), i.e. in files X4213_1101010101, where X can take the values 1, 2, 3, 4 corresponding to the numbers of the symmetry groups (see Table 5 of the description of the application): 14213_1101010101, 24213_1101010101, 34213_1101010101, 44213_1101010101.
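The list of files to inspect follows mechanically from the code of the previous pentafragment and the next folder number. A minimal sketch, assuming the file-naming pattern shown above; the function name `candidate_files` is hypothetical.

```python
from typing import List


def candidate_files(previous_code: str, folder: str) -> List[str]:
    """Keep the first four digits of the previous code (the four retained
    residues) and prefix each possible group number of the new amino acid."""
    kept = previous_code[:4]
    return [f"{group}{kept}_{folder}" for group in (1, 2, 3, 4)]


files = candidate_files("42132", "1101010101")
print(files)
```

Each file corresponds to one symmetry group of the new amino acid, so an empty or missing file means that no amino acid of that group continues the chain in the given folder.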

The search found pentafragments containing the last four amino acids Asp, Ala, Pro, Ser followed by the following fifth amino acids, recorded together with the codes of the proteins from which they were obtained:

- in the file of group 1 (14213_1101010101): no pentafragments found;

- in the file of group 2 (24213_1101010101): no pentafragments found;

- in the file of group 3 (34213_1101010101): 1 - Ser;

- in the file of group 4 (44213_1101010101): 2 - Lys;

- when such pentafragments are found, the operator:

- selects one of the new amino acids and attaches it to the last four amino acids of the previous pentafragment;

Of the amino acids found by the program, either Ser from the file of group 3 or Lys from the file of group 4 can be chosen. The program allows only one option to be selected. The design will proceed differently depending on the choice, which becomes apparent only in the course of the design. The operator chose Lys as the fifth amino acid and entered this into the program.

Next, the program:

- records the new amino acid (Lys) in the working file ("Entry in the working file", Table 10), which reflects the designed primary structure of the protein;

- records the ten-digit folder number describing the secondary structure of each found pentafragment (1101010101);

Since a pentafragment was found, the search steps relating to the case where no pentafragment is found in the folder are omitted.

The actions of steps L) and M) are then repeated until the end of the design process. As can be seen from Table 11, during the design of the amino acid sequence there was a choice of two or three amino acids at stages 11, 13 and 18; at the other stages the program found only one amino acid.

H) the amino acid sequence obtained in the working file of the PROTCOM program, together with the corresponding description of its secondary structure, is considered to be the designed primary structure of the protein; it is presented in the right part of Table 11.

Table 11
Search for subsequent pentafragments in the database
Input (columns 1-3) | Search for the pentafragment in the database (columns 4-5) | Entry in the working file (columns 6-8)
No. | Amino acids or pairs of variables | Description of the secondary structure | File name (encoding number and number of the specified folder) | Names of the amino acids | No. | Names of the amino acids | Description of the secondary structure
18 | 00 | 0010101010 | 11443_0010101010 | Gly | 18 | Gly | 0010101010
 | | | 41443_0010101010 | Lys | | |
17 | 10 | 1010101011 | 14433_1010101011 | Gly | 17 | Gly | 1010101011
16 | 10 | 1010101111 | 44334_1010101111 | Ile | 16 | Ile | 1010101111
15 | 10 | 1010111111 | 43341_1010111111 | Lys | 15 | Lys | 1010111111
14 | 10 | 1011111111 | 33414_1011111111 | Ser | 14 | Ser | 1011111111
13 | 11 | 1111111111 | 24144_1111111111 | Ala | | |
 | | | 34144_1111111111 | Phe | 13 | Phe | 1111111111
12 | 11 | 1111111111 | 41444_1111111111 | Val | 12 | Val | 1111111111
 | | | | Ile | | |
11 | 11 | 1111111111 | 14443_1111111111 | Gly | 11 | Gly | 1111111111
 | | | 24443_1111111111 | Ala | | |
 | | | 34443_1111111111 | Thr | | |
10 | 11 | 1111111111 | 44434_1111111111 | Lys | 10 | Lys | 1111111111
9 | 11 | 1111111101 | 44344_1111111101 | Val | 9 | Val | 1111111101
8 | 11 | 1111110101 | 43442_1111110101 | Asn | 8 | Asn | 1111110101
7 | 11 | 1111010101 | 34421_1111010101 | Thr | 7 | Thr | 1111010101
6 | 11 | 1101010101 | 34213_1101010101 | Ser | | |
 | | | 44213_1101010101 | Lys | 6 | Lys | 1101010101
5 | Asp | 0101010100 | 42132_0101010100 | | 5 | Asp | 0101010100
4 | Ala | | | | 4 | Ala |
3 | Pro | | | | 3 | Pro |
2 | Ser | | | | 2 | Ser |
1 | Leu | | | | 1 | Leu |

To confirm experimentally that the primary structure designed in example 1, with its specified secondary structure, exists, several protein fragments were found in the Protein Data Bank whose amino acid sequences partially coincide with the amino acid sequence designed in example 1 (Table 12).

As can be seen from Table 12, the primary structure designed in example 1 is most similar to the primary structures of the protein fragments NNW and 3EOK. The designed primary structure from the 1st to the 10th amino acid is identical to the primary structure of the NNW fragment from the 2nd to the 11th amino acid. From the 11th to the 18th amino acid, the designed primary structure is identical to the primary structure of the 3EOK fragment from the 12th to the 19th amino acid.

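Table 12
Comparison of the primary structures of protein fragments
No. | Designed amino acid sequence of example 1 | No. of amino acid in the proteins | Protein fragments: NNW | 3EOK | 3DHR | 3D4X
18 | Gly | 19 | Ala | Gly | Gly | Ser
17 | Gly | 18 | Gly | Gly | Gly | Gly
16 | Ile | 17 | Val | Ile | Ile | Ile
15 | Lys | 16 | Lys | Lys | Lys | Lys
14 | Ser | 15 | Gly | Ser | Ala | Gly
13 | Phe | 14 | Trp | Phe | Phe | Trp
12 | Val | 13 | Ala | Val | Val | Cys
11 | Gly | 12 | Ala | Gly | Ala | Ala
10 | Lys | 11 | Lys | Lys | Lys | Lys
9 | Val | 10 | Val | Val | Val | Val
8 | Asn | 9 | Asn | Asn | Asn | Asn
7 | Thr | 8 | Thr | Thr | Ser | Ser
6 | Lys | 7 | Lys | Lys | Lys | Lys
5 | Asp | 6 | Asp | Asp | Asp | Asp
4 | Ala | 5 | Ala | Ala | Asn | Ala
3 | Pro | 4 | Pro | Ala | Ala | Ala
2 | Ser | 3 | Ser | Ser | Ser | Ser
1 | Leu | 2 | Leu | Leu | Leu | Leu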
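The position-by-position comparison in Table 12 can be checked with a short script. The sequences below are transcribed from the table (designed positions 1 to 18 against protein positions 2 to 19); the function name `matches` is hypothetical, and the sketch only counts identities without performing any alignment.

```python
from typing import List, Set

designed = ("Leu Ser Pro Ala Asp Lys Thr Asn Val Lys "
            "Gly Val Phe Ser Lys Ile Gly Gly").split()
nnw = ("Leu Ser Pro Ala Asp Lys Thr Asn Val Lys "
       "Ala Ala Trp Gly Lys Val Gly Ala").split()
eok = ("Leu Ser Ala Ala Asp Lys Thr Asn Val Lys "
       "Gly Val Phe Ser Lys Ile Gly Gly").split()


def matches(reference: List[str], fragment: List[str]) -> Set[int]:
    """Return the 1-based positions at which both sequences carry the same residue."""
    return {i for i, (a, b) in enumerate(zip(reference, fragment), start=1) if a == b}


print(sorted(matches(designed, nnw)))
print(sorted(matches(designed, eok)))
```

Positions 1 to 10 are all identical to the NNW fragment and positions 11 to 18 are all identical to the 3EOK fragment, which is the overlap described in the text.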

Table 13 shows the two-dimensional description of the hydrogen bonds of the protein fragments presented in Table 12, obtained using Protein 3D from the files of those proteins. It also gives the description of their secondary structure as ten-digit binary numbers, which is completely identical to the specified description of the secondary structure of the designed primary structure (Table 11).

Figures 1,a and 1,b show screenshots of the secondary-structure fragments of proteins NNW and 3EOK, obtained using Protein 3D, together with the corresponding amino acid sequences of the primary structure. The figures show that the secondary-structure fragments from which the primary structure of example 1 is designed overlap from the 7th to the 11th amino acid, and that the amino acid sequences of their primary structures overlap as well (the overlapping sequence is in italics). The designed sequence is therefore identical to the original fragments of the secondary structure.

Thus, example 1 presents a designed primary structure of a protein consisting of fragments of proteins NNW and 3EOK, whose specified secondary structure fully coincides with the secondary structure of each of these proteins.

Table 13
Fragments of proteins with a two-dimensional description of their hydrogen bonds
Secondary structure of protein fragments from the Protein Data Bank (columns 1-4) | Description of the secondary structure of protein fragments (column 5)
NNW | 3EOK | 3DHR | 3D4X | No. and ten-digit number
19 ALA | 19 GLY | 19 GLY | 19 SER |
18 GLY N - 14 TRP O | 18 GLY N - 14 PHE O | 18 GLY N - 14 PHE O | 18 GLY N - 14 TRP O |
18 GLY | 18 GLY | 18 GLY | 18 GLY |
17 VAL N - 13 ALA O | 17 ILE N - 13 VAL O | 17 ILE N - 13 VAL O | 17 ILE N - 13 CYS O |
17 VAL | 17 ILE | 17 ILE | 17 ILE |
16 LYS N - 12 ALA O | 16 LYS N - 12 GLY O | 16 LYS N - 12 ALA O | 16 LYS N - 12 ALA O | 18 0010101010
16 LYS | 16 LYS | 16 LYS | 16 LYS |
15 GLY N - 11 LYS O | 15 SER N - 11 LYS O | 15 ALA N - 11 LYS O | 15 GLY N - 11 LYS O | 17 1010101011
15 GLY | 15 SER | 15 ALA | 15 GLY |
14 TRP O - 18 GLY N | 14 PHE O - 18 GLY N | 14 PHE O - 18 GLY N | 14 TRP O - 18 GLY N | 16 1010101111
14 TRP N - 10 VAL O | 14 PHE N - 10 VAL O | 14 PHE N - 10 VAL O | 14 TRP N - 10 VAL O |
14 TRP | 14 PHE | 14 PHE | 14 TRP | 15 1010111111
13 ALA O - 17 VAL N | 13 VAL O - 17 ILE N | 13 VAL O - 17 ILE N | 13 CYS O - 17 ILE N | 14 1011111111
13 ALA N - 9 ASN O | 13 VAL N - 9 ASN O | 13 VAL N - 9 ASN O | 13 CYS N - 9 ASN O |
13 ALA | 13 VAL | 13 VAL | 13 CYS | 13 1111111111
12 ALA O - 16 LYS N | 12 GLY O - 16 LYS N | 12 ALA O - 16 LYS N | 12 ALA O - 16 LYS N | 1111111111
12 ALA N - 8 THR O | 12 GLY N - 8 THR O | 12 ALA N - 8 SER O | 12 ALA N - 8 SER O | 12 1111111111
12 ALA | 12 GLY | 12 ALA | 12 ALA |
11 LYS O - 15 GLY N | 11 LYS O - 15 SER N | 11 LYS O - 15 ALA N | 11 LYS O - 15 GLY N | 11 1111111111
11 LYS N - 7 LYS O | 11 LYS N - 7 LYS O | 11 LYS N - 7 LYS O | 11 LYS N - 7 LYS O | 10 1111111111
11 LYS | 11 LYS | 11 LYS | 11 LYS |
10 VAL O - 14 TRP N | 10 VAL O - 14 PHE N | 10 VAL O - 14 PHE N | 10 VAL O - 14 TRP N | 9 1111111101
10 VAL N - 6 ASP O | 10 VAL N - 6 ASP O | 10 VAL N - 6 ASP O | 10 VAL N - 6 ASP O |
10 VAL | 10 VAL | 10 VAL | 10 VAL | 8 1111110101
9 ASN O - 13 ALA N | 9 ASN O - 13 VAL N | 9 ASN O - 13 VAL N | 9 ASN O - 13 CYS N | 7 1111010101
9 ASN N - 5 ALA O | 9 ASN N - 5 ALA O | 9 ASN N - 5 ASN O | 9 ASN N - 5 ALA O |
9 ASN | 9 ASN | 9 ASN | 9 ASN | 6 1101010101
8 THR O - 12 ALA N | 8 THR O - 12 GLY N | 8 SER O - 12 ALA N | 8 SER O - 12 ALA N |
8 THR N - 4 PRO O | 8 THR N - 4 ALA O | 8 SER N - 4 ALA O | 8 SER N - 4 ALA O | 5 0101010100
8 THR | 8 THR | 8 SER | 8 SER |
7 LYS O - 11 LYS N | 7 LYS O - 11 LYS N | 7 LYS O - 11 LYS N | 7 LYS O - 11 LYS N | 4
7 LYS N - 3 SER O | 7 LYS N - 3 SER O | 7 LYS N - 3 SER O | 7 LYS N - 3 SER O | 3
7 LYS | 7 LYS | 7 LYS | 7 LYS |
6 ASP O - 10 VAL N | 6 ASP O - 10 VAL N | 6 ASP O - 10 VAL N | 6 ASP O - 10 VAL N | 2
6 ASP | 6 ASP | 6 ASP | 6 ASP | 1
5 ALA O - 9 ASN N | 5 ALA O - 9 ASN N | 5 ASN O - 9 ASN N | 5 ALA O - 9 ASN N |
5 ALA | 5 ALA | 5 ASN | 5 ALA |
4 PRO O - 8 THR N | 4 ALA O - 8 THR N | 4 ALA O - 8 SER N | 4 ALA O - 8 SER N |
4 PRO | 4 ALA | 4 ALA | 4 ALA |
3 SER O - 7 LYS N | 3 SER O - 7 LYS N | 3 SER O - 7 LYS N | 3 SER O - 7 LYS N |
3 SER | 3 SER | 3 SER | 3 SER |
2 LEU | 2 LEU | 2 LEU | 2 LEU |

Information on the secondary structures of proteins NNW and 3EOK, which belong to the class of hemoglobins, has been published and is presented in Table 14.

Table 14
List of proteins whose structure, coinciding with the specified secondary structure, was investigated by X-ray analysis
No. | Protein code | Name of the protein and its source | Literature
1 | NNW | HEMOGLOBIN SUBUNIT ALPHA, HUMAN (human) | G. Fermi, M.F. Perutz, B. Shaanan, R. Fourme. The crystal structure of human deoxyhaemoglobin at 1.74 angstroms resolution. J. Mol. Biol. v.175, p.159 (1984)
2 | 3EOK | HEMOGLOBIN SUBUNIT ALPHA, DUCK (duck) | Sathya Moorthy, K. Neelagandan, M. Balasubramanian, M.N. Ponnuswamy. Crystal Structure Determination of the Duck (Anas Platyrhynchos) Hemoglobin at 2.1 Angstrom Resolution. To be published (structural data from the Protein Data Bank)

Example 2.

This example considers the design of the primary structure of a protein whose secondary structure is specified as an inverted β-bend.

When designing the primary structure of a protein with a specified secondary structure on the basis of the amino acid sequence characterizing it and the description of its secondary structure, the following is carried out:

A) create a database of amino acid pentafragments of proteins containing folders of pentafragments, together with the initial list of folders compiled from their names, which are formed from the binary-encoded description of the hydrogen bonds of the peptide groups of pentafragments in the secondary structure of proteins, and record them on an information carrier;

B) create a catalog of descriptions of secondary structures, containing the descriptions in the form of sequences of ten-digit binary numbers;

(B) enter into computer memory the database of amino acid pentafragments of proteins recorded on the information carrier;

G) specify the description of the secondary structure of the designed primary structure of the protein as a sequence of ten-digit binary numbers, based on the catalog of descriptions of secondary structures;

In this example the secondary structure is specified as an inverted β-bend. The operator finds its description in the catalog of secondary structures and records it (Table 15).

Table 15
Description of the designed secondary structure for example 2
14 | 0000000001
13 | 0000000100
12 | 0000010000
11 | 0001000000
10 | 0100000010
9 | 0000001000
8 | 0000100000
7 | 0010000000
6 | 1000000000
5 | 0000000000
4 |
3 |
2 |
1 |

D) specify and enter into computer memory the initial sequence of five amino acids belonging to the group of the twenty canonical amino acids of proteins, which is the specified initial pentafragment:

5 Val
4 Asp
3 Cys
2 Gly
1 Tyr

which is recorded from bottom to top: Tyr, Gly, Cys, Asp, Val.

E) specify and enter into computer memory the description of the secondary structure of the specified initial pentafragment as a ten-digit binary number, which is the first ten-digit number in the specified description of the secondary structure and corresponds to the name of the database folder containing the specified initial pentafragment: the ten-digit number 0000000000. As can be seen from Table 15, it is the first ten-digit number in the specified description of the secondary structure.

G) enter into computer memory the PROTCOM program for selecting and searching for pentafragments of the designed protein in the database and for recording the names of the amino acids of the found pentafragments and the numbers of the database folders, describing the secondary structure, in which the sought pentafragments were found;

Installation of the program is the same as in example 1.

C) enter and store the specified initial pentafragment of the designed protein as a sequence of five amino acids in the PROTCOM program: the operator enters the amino acids making up the specified initial pentafragment in order from the fifth to the first, i.e. from top to bottom: Val, Asp, Cys, Gly, Tyr.

I) enter and store the specified description of the secondary structure of the specified initial pentafragment as a ten-digit binary number in the PROTCOM program: the operator enters the sequence 0000000000 into the PROTCOM program.

K) search for the specified initial pentafragment of the designed protein in the database using the PROTCOM program previously entered into computer memory, the search algorithm including:

- encoding the specified initial pentafragment for the purpose of searching the database.

The program performs the encoding by assigning each amino acid of the specified initial pentafragment to one of the symmetry groups (Table 5 of the description of the application).

In this example: Val - 4, Asp - 4, Cys - 3, Gly - 1, Tyr - 3. This numerical sequence is written to program memory from left to right as 44313 and is used to find the specified initial pentafragment in folder 0000000000 of the database, in file 44313_0000000000.

- searching for the specified initial pentafragment in the database folder with the specified description of the secondary structure of the pentafragment;

The program found the specified initial pentafragment in file 44313_0000000000:

Val
Asp
Cys
Gly
Tyr

This pentafragment was extracted from a text file produced by the PROTEIN 3D program by processing the atomic coordinates of a protein from the Protein Data Bank, and it has a β-structure, described as 0000000000, containing no H-bonds within the pentafragment itself.

Tables 16, 17 and 18 illustrate the operation of the program. The left part, entitled "Input", contains: in the first column, the sequence number of the designed amino acid entered into the PROTCOM program; in the second column, the amino acids entered according to step C) or the pairs of variables entered according to step L), selected by the operator on the basis of the specified secondary structure (Table 15); in the third column, the description of the secondary structure entered into the program as a ten-digit number. The central part, entitled "Search for the pentafragment in the database", contains in its first column the names of the files, consisting of the encoding number and the number of the specified folder, and in its second column the names of the amino acids found in the pentafragments. The right part of the table shows the entry made by the PROTCOM program in the working file after the specified initial pentafragment has been detected and, subsequently, after an amino acid has been chosen from a file with the given encoding number and folder number.

- when the specified initial pentafragment is found in the folder, this pentafragment is considered to be the first of the possible number N of pentafragments of the designed primary structure of the protein, and the following is carried out:

- recording the number of the database folder containing the first pentafragment;

- recording the amino acid sequence of the first pentafragment in the working file of the program;

- recording in the working file the ten-digit folder number describing the secondary structure of the first pentafragment found;

The program found the entered initial pentafragment in the file with the corresponding encoding and folder number and makes an entry in the working file (Table 16). Since the specified initial pentafragment was found, the search steps relating to the case where the specified initial pentafragment is not in the folder are omitted.

L) specify the description of the secondary structure for each of the subsequent (N-1) pentafragments, using the specified description of the secondary structure as a sequence of ten-digit binary numbers corresponding to the names of the database folders containing the specified pentafragments, by entering into the PROTCOM program the same or a modified ten-digit number describing the secondary structure of the previous pentafragment;

To specify the description of the secondary structure, the PROTCOM program prompts the operator to enter one of the pairs of variables 00, 01, 10 or 11. Table 15 shows that the next ten-digit number is 1000000000. The operator therefore selects 10 and enters this pair of variables into the program (column "Amino acids or pairs of variables" in Table 17). The program adds 10 on the left and removes a pair of digits on the right, which changes the ten-digit number describing the secondary structure of the previous pentafragment, as reflected in the column "Description of the secondary structure" of Table 17.

M) perform a search in the database of Pentagrammaton containing four amino acids of each of the (N-1) Pentagrammaton, stored in the working file, and one new, and the search algorithm includes:

- selection and memorization of the last four amino acids in each of the (N-1) Pentagrammaton, stored in the working file;

- search Pentagrammaton containing the last four amino acids of each of the (N-1) Pentagrammaton, stored in the working file, and one new amino acid in the database in the folder with the specified description of the secondary structure;

This program provides interamente recorded work file table 16 four amino acids, recorded from top to bottom: Val, Asp, Cys, Gly.

Next, the program encodes them in accordance with the affiliation to one or another group symmetry and writes the code number from left to right, similar to the generated index files, but without the first amino acids: 4431 and conducts a database search Penta the fragments, containing four selected amino acids in the folder with the specified structure of the next interamente (1000000000), i.e. in files H, where X can take the values 1, 2, 3, 4, corresponding to the numbers of symmetry groups (see table 5 description of application) - 14431_1000000000, 24431_1000000000, 34431_1000000000, 44431_1000000000.

As a result of the search, pentafragments were found that contain the last four amino acids Val, Asp, Cys, Gly and the following fifth amino acids, recorded together with the codes of the proteins from which they were obtained:

- in the file of group 1 (14431_1000000000): Gly;

- in the file of group 2 (24431_1000000000): no pentafragments found;

- in the file of group 3 (34431_1000000000): no pentafragments found;

- in the file of group 4 (44431_1000000000): no pentafragments found.

Note that no pentafragments were found in the files of groups 2, 3 and 4, so only the amino acid Gly is available for the design.
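The outcome of this search can be mimicked with a toy lookup. In the sketch below a plain dictionary stands in for the database folder; the real file format is not described here, so the structure of `index` and the function name `find_new_residues` are assumptions made only for illustration.

```python
def find_new_residues(index, file_names):
    """Return, per file, the fifth amino acids found; files without hits are skipped."""
    hits = {}
    for name in file_names:
        residues = index.get(name, [])
        if residues:
            hits[name] = list(residues)
    return hits


# Toy folder contents mirroring the search result reported above.
index = {"14431_1000000000": ["Gly"]}
names = [f"{x}4431_1000000000" for x in range(1, 5)]

hits = find_new_residues(index, names)
candidates = sorted({aa for residues in hits.values() for aa in residues})
print(hits)
print(candidates)
```

When `candidates` holds a single entry, as here, the operator has no real choice at this step.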

- when such pentafragments are found, the operator:

- selects one of the new amino acids and adds it to the last four amino acids of the previous pentafragment;

The operator chose Gly as the fifth amino acid and entered this choice into the program.

Next, the program:

- records the new amino acid (Gly) in the working file (column "Entry in the working file", table 17), which reflects the designed primary structure of the protein;

- records the ten-digit folder number describing the secondary structure of each found pentafragment (1000000000);

Because a pentafragment was found, we omit the search steps that apply when no pentafragment is found in the folder.

The actions of items L) and M) are then repeated until the end of the design process. As can be seen from table 18, at every stage of designing the amino acid sequence only one amino acid was available for selection.
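The repetition of items L) and M) can be summarized in a short loop. This is a minimal sketch under stated assumptions, not the PROTCOM implementation: the database is replaced by a toy dictionary keyed by the description and the last four amino acids, the two toy entries are read from table 20 (104 GLY and 105 PRO of 1AGD), and the branch for the case where nothing is found simply raises an error instead of modifying the description.

```python
def extend_design(sequence, descriptions, pairs, database,
                  choose=lambda options: options[0]):
    """Repeat items L) and M) once for every pair entered by the operator."""
    sequence = list(sequence)
    descriptions = list(descriptions)
    for pair in pairs:
        # Item L): new description = entered pair + previous one without its last pair.
        description = pair + descriptions[-1][:-2]
        # Item M): look up the last four amino acids in the folder for this description.
        options = database.get((description, tuple(sequence[-4:])), [])
        if not options:
            raise LookupError(f"no pentafragment found in folder {description}")
        sequence.append(choose(options))   # new amino acid goes to the working file
        descriptions.append(description)   # together with the folder number
    return sequence, descriptions


# Toy database (assumption): only the two entries needed for this illustration.
toy_db = {
    ("1000000000", ("Gly", "Cys", "Asp", "Val")): ["Gly"],
    ("0010000000", ("Cys", "Asp", "Val", "Gly")): ["Pro"],
}
seq, descs = extend_design(["Tyr", "Gly", "Cys", "Asp", "Val"],
                           ["0000000000"], ["10", "00"], toy_db)
print(seq)
print(descs)
```

The `choose` argument stands for the operator's selection; by default the sketch takes the first available amino acid.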

H) the amino acid sequence obtained in the working file, together with the corresponding description of its secondary structure, is considered the designed primary structure of the protein; it is stored in the working file of the program PROTCOM and presented in the right part of table 18.

To confirm experimentally that the primary structure designed in example 2 with the given secondary structure exists, several protein fragments were found in the Protein Data Bank whose amino acid sequence is identical either to the entire designed sequence of example 2 or to a part of it (table 19).

As can be seen from table 19, the primary structure designed in example 2 is identical to the amino acid sequence of a fragment of protein 1AGD. Table 20 gives a two-dimensional description of the hydrogen bonds of the protein fragments listed in table 19, obtained using Protein 3D from the files listed in table 19. The descriptions of their secondary structure in the form of ten-digit binary numbers are listed in the right column of table 20. For protein 1AGD they are completely identical to the given description of the secondary structure of the designed primary structure of example 2 (table). It was also found that some parts of this sequence can be built from pentafragments of proteins 2R37 (No. 9), 3B02 (No. 11) and 1BAS (No. 12), which are unrelated to protein 1AGD. Table 20 describes their secondary structure in the form of ten-digit binary numbers, which coincide with the given description of the secondary structure of the designed primary structure of example 2.

Table 20
The secondary structure of the protein fragments
Columns 1-4: fragments of protein secondary structure from Protein Data Bank. Columns 5-6: description of the secondary structure of the protein fragment 1AGD.
1AGD | 2R37 | 3B02 | 1BAS | Stage No. | Description
 |  |  |  | 14 | 0000000001
112 GLY |  |  |  | 13 | 0000000100
111 ARG |  |  | 84 LEU | 12 | 0000010000
110 LEU |  |  | 83 LEU |  |
109 LEU |  | 39 LEU | 82 ARG O - 78 LYS N | 11 | 0001000000
108 ARG O - 104 GLY N |  | 38 ARG O - 34 LEU N | 81 GLY | 10 | 0100000010
108 ARG |  |  |  |  |
107 GLY |  | 37 GLY | 80 ASP | 9 | 0000001000
106 ASP | 192 GLY; 191 ASP | 36 ASP; 35 PRO |  | 8 | 0000100000
105 PRO | 190 PRO |  |  | 7 | 0010000000
104 GLY N - 108 ARG O |  |  |  |  |
104 GLY | 189 GLY N - 193 ILE O |  |  | 6 | 1000000000
103 VAL | 188 VAL |  |  | 5 | 0000000000
102 ASP |  |  |  | 4 |
101 CYS |  |  |  |  |
100 GLY |  |  |  | 3 |
99 TYR |  |  |  | 2 |
 |  |  |  | 1 |
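The ten-digit numbers of stages 5 to 14 in table 20 can be checked against the rule of adding a pair on the left and removing a pair on the right. The sketch below is an illustration only: the values are transcribed from table 20, and the function name `entered_pairs` is hypothetical. It recovers, for each stage, the pair that must have been entered.

```python
# Descriptions by stage, transcribed from table 20.
DESCRIPTIONS = {
    5: "0000000000", 6: "1000000000", 7: "0010000000", 8: "0000100000",
    9: "0000001000", 10: "0100000010", 11: "0001000000", 12: "0000010000",
    13: "0000000100", 14: "0000000001",
}


def entered_pairs(descriptions):
    """Recover the pair entered at each stage; fail if a step breaks the shift rule."""
    pairs = {}
    stages = sorted(descriptions)
    for prev, cur in zip(stages, stages[1:]):
        if descriptions[cur][2:] != descriptions[prev][:-2]:
            raise ValueError(f"stage {cur} is not a shift of stage {prev}")
        pairs[cur] = descriptions[cur][:2]
    return pairs


print(entered_pairs(DESCRIPTIONS))
```

Every stage passes the check, so the whole column can be generated from the initial number and nine entered pairs.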

Figures 2-4 show screenshots of the secondary-structure fragments of proteins 1AGD, 2R37, 3B02 and 1BAS, obtained using Protein 3D. Comparing the fragment of protein 1AGD with the fragment of protein 2R37 (figures 2,a and 2,b), with the fragment of protein 3B02 (figures 3,a and 3,b) and with the fragment of protein 1BAS (figures 4,a and 4,b) allows us to conclude that their secondary structures are identical and interchangeable. This means that it makes no difference whether the protein of example 2 is designed from pentafragments of protein 1AGD alone or from pentafragments obtained from the four different proteins 1AGD, 2R37, 3B02 and 1BAS.

The general view of protein 1AGD, whose structure was determined by the X-ray method, is shown in figure 5,a. The rectangle marks the region corresponding to the primary structure designed by the claimed method. Figure 5,b shows this region in close-up, and figure 5,c shows a detailed view of the secondary-structure fragment of protein 1AGD corresponding to the given secondary structure of example 2. These figures clearly illustrate the presence of this fragment in a real protein.

Thus, the primary structure of the protein designed in example 2 is confirmed in two variants. First variant: the primary structure consists only of pentafragments of protein 1AGD. The given secondary structure of the designed protein of example 2 coincides with the secondary structure of the fragment of protein 1AGD. Second variant: the primary structure consists of pentafragments obtained from the four different proteins 1AGD, 2R37, 3B02 and 1BAS. The description of the secondary structures of the pentafragments of proteins 1AGD, 2R37, 3B02 and 1BAS also coincides with the given description of the secondary structure of the designed primary structure of example 2.

Information about the secondary structure of proteins 1AGD, 2R37, 3B02 and 1BAS has been published and is presented in table 21.
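The second variant can be illustrated by cutting the designed sequence into overlapping pentafragments and comparing them with the fragments of the other proteins. This is a sketch under assumptions: the sequence is taken to be residues 99 to 112 of 1AGD as listed in table 20, the first pentafragment is taken to correspond to stage 5, and the fragments of 2R37 (residues 188 to 192), of the stage 11 protein (code 3B02 in the header of table 20, residues 35 to 39) and of 1BAS (residues 80 to 84) are transcribed from the same table.

```python
# Residues 99..112 of 1AGD as listed in table 20 (assumed to be the designed sequence).
DESIGNED = ["Tyr", "Gly", "Cys", "Asp", "Val", "Gly", "Pro",
            "Asp", "Gly", "Arg", "Leu", "Leu", "Arg", "Gly"]
FIRST_STAGE = 5  # stage number assumed for the first pentafragment

# Stage number and pentafragment of the other proteins, transcribed from table 20.
OTHER = {
    "2R37": (9, ("Val", "Gly", "Pro", "Asp", "Gly")),
    "3B02": (11, ("Pro", "Asp", "Gly", "Arg", "Leu")),
    "1BAS": (12, ("Asp", "Gly", "Arg", "Leu", "Leu")),
}


def pentafragments(sequence):
    """All overlapping windows of five consecutive amino acids."""
    return [tuple(sequence[i:i + 5]) for i in range(len(sequence) - 4)]


windows = pentafragments(DESIGNED)
matches = {code: windows[stage - FIRST_STAGE] == fragment
           for code, (stage, fragment) in OTHER.items()}
print(len(windows), matches)
```

Each of the three fragments coincides with the window of its stage, so the same sequence is obtained whichever protein supplies that pentafragment.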

Table 21
The list of proteins whose structure, coinciding with the given secondary structure, was investigated by the X-ray method
Stage No. | Protein code | Protein name | Literature
5-8, 10, 13, 14 | 1AGD | Histocompatibility complex | S.W. Reid, S. McAdam, K.J. Smith, P. Klenerman, C.A. O'Callaghan, K. Harlos, B.K. Jakobsen, A.J. McMichael, J.I. Bell, D.I. Stuart, E.Y. Jones. Antagonist HIV-1 Gag peptides induce structural changes in HLA B8. J. Exp. Med. V. 184, 2279, 1996. ASTM JEMEAV US ISSN 0022-100 0774. Resolution 2.05 Angstroms.
9 | 2R37 | Human glutathione peroxidase 3 | E.S. Pilka, K. Guo, O. Gileadi, A. Rojkowa, F. von Delft, A.C.W. Pike, K.L. Kavanagh, C. Johannson, M. Sundstrom, C.H. Arrowsmith, J. Weigelt, A.M. Edwards, U. Oppermann. Crystal structure of human glutathione peroxidase 3 (selenocysteine to glycine mutant). No recorded citation in PubMed. Resolution 1.85 Angstroms.
11 | 3B02 | Transcriptional regulator, CRP family | Y. Agari, S. Kuramitsu, A. Shinkai. X-ray crystal structure of TTHB099, a CRP/FNR superfamily transcriptional regulator from Thermus thermophilus HB8, reveals a DNA-binding protein with no required allosteric effector molecule. Proteins (2012), to be published. Resolution 1.92 Angstroms.
12 | 1BAS | Fibroblast growth factor | X. Zhu, H. Komiya, A. Chirino, S. Faham, G.M. Fox, T. Arakawa, B.T. Hsu, D.C. Rees. Three-dimensional structures of acidic and basic fibroblast growth factors. Science V. 251, 90, 1991. ASTM SCIEAS US ISSN 0036-8075 038. Resolution 1.9 Angstroms.

Registration information for the database and software used in the description of the application

"Database of protein pentafragments".

Authors: Karasev V.A., Belyaev A.I., Luchinin V.V.

Certificate of state registration of the database No. 2010620364.

Registered in the Register of Databases on 7 July 2010.

"A computer program for the design of the primary structure of a protein with a given secondary structure" ("PROTCOM").

Authors: Karasev V.A., Belyaev A.I., Luchinin V.V.

Certificate of state registration of the computer program No. 2011611105.

Registered in the Register of Computer Programs on 2 February 2011.

A method of designing the primary structure of a protein with a given secondary structure on the basis of its characteristic amino acid sequences and descriptions of secondary structures, which consists in the following:
A) create a database of amino acid pentafragments of proteins, containing folders of pentafragments and a source list of folders compiled by their names, which are formed from the binary-encoded description of the hydrogen bonds of the peptide groups of the pentafragments in the secondary structure of proteins, and record it on an information carrier;
B) create a catalog of descriptions of secondary structures, containing descriptions of secondary structures in the form of sequences of ten-digit binary numbers;
(B) enter into computer memory the database of amino acid pentafragments of proteins recorded on the information carrier;
G) set the description of the secondary structure of the designed primary structure of the protein as a sequence of ten-digit binary numbers based on the catalog of descriptions of secondary structures;
D) specify and enter into computer memory an initial sequence of five amino acids belonging to the group of the twenty canonical amino acids of proteins, which is the specified initial pentafragment;
E) specify and enter into computer memory a description of the secondary structure of the specified initial pentafragment in the form of a ten-digit number in the binary system, which is the first ten-digit number in the given description of the secondary structure and corresponds to the name of the folder in the database containing the specified initial pentafragment;
G) enter into computer memory the program PROTCOM for selecting and searching for pentafragments of the designed protein in the database and for recording the names of the amino acids of the found pentafragments and the numbers of the database folders describing the secondary structure in which the sought pentafragments are found;
C) enter and store in the program PROTCOM the specified initial pentafragment of the designed protein as a sequence of five amino acids;
I) enter and store in the program PROTCOM the given description of the secondary structure of the specified initial pentafragment in the form of a ten-digit number in the binary system;
K) search for the specified initial pentafragment of the designed protein in the database using the program PROTCOM previously stored in the computer, where the search algorithm includes:
- encoding the specified initial pentafragment for the purposes of searching in the database;
- searching for the specified initial pentafragment in the database, in the folder with the specified description of the secondary structure of the pentafragment;
- when the specified initial pentafragment is found in the folder, considering this pentafragment the first of the possible number N of pentafragments of the designed primary structure of the protein and performing:
- recording of the number of the database folder that contains the first pentafragment;
- recording of the amino acid sequence of the first pentafragment in the working file of the program;
- recording in the working file of the ten-digit folder number describing the secondary structure of the found first pentafragment;
- when the specified initial pentafragment is not found in the folder:
- specifying and entering into computer memory a new initial sequence of five amino acids belonging to the group of the twenty canonical amino acids of proteins, which is the new specified initial pentafragment;
- entering and storing in the program PROTCOM the new specified initial pentafragment of the designed protein as a sequence of five amino acids;
- searching for the new specified initial pentafragment of the designed protein in the database using the program PROTCOM previously stored in the computer, where the search algorithm includes:
- encoding the new specified initial pentafragment for the purposes of searching in the database;
- searching for the new specified initial pentafragment in the database, in the folder with the given secondary structure of the pentafragment;
- repeating the specification of a new initial pentafragment and the search for it until a pentafragment is found whose amino acid sequence is located in the database folder describing the given secondary structure of the pentafragment;
L) set the description of the secondary structure for each of the subsequent (N-1) pentafragments, using the given description of the secondary structure in the form of a sequence of ten-digit binary numbers that correspond to the names of the folders in the database containing the specified pentafragments, by entering into the program PROTCOM the same or a modified ten-digit number describing the secondary structure of the previous pentafragment;
M) perform a search in the database for pentafragments containing the four amino acids of each of the (N-1) pentafragments stored in the working file and one new amino acid, where the search algorithm includes:
- selecting and memorizing the last four amino acids of each of the (N-1) pentafragments stored in the working file;
- searching the database, in the folder with the specified description of the secondary structure, for pentafragments containing the last four amino acids of each of the (N-1) pentafragments stored in the working file and one new amino acid;
- when such pentafragments are found, performing:
- selection of one of the new amino acids and its addition to the last four amino acids of the previous pentafragment;
- recording of the new amino acid in the working file, which reflects the designed primary structure of the protein;
- recording of the ten-digit folder number describing the secondary structure of each found pentafragment;
- when no such pentafragments are found, performing:
- modification of the specified description of the secondary structure;
- selection of the last four amino acids for the subsequent pentafragment;
- searching the database, in the folder with the modified description of the secondary structure, for pentafragments containing the last four amino acids of the previous pentafragment and one new amino acid;
- repeating the change of the description of the secondary structure and the database search until at least one pentafragment is found that contains the four amino acids of the previous pentafragment;
H) the amino acid sequence obtained in the working file, together with the corresponding description of its secondary structure, is considered the designed primary structure of the protein.



 

Same patents:

FIELD: chemistry.

SUBSTANCE: invention relates to biotechnology, specifically a method of producing artificial oligonucleotides that are potentially capable of forming non-canonical structures that stable in physiological conditions and conditions close to physiological, said structures being imperfect G-quadruplexes (lmGQ) which include one nucleotide substitution in the G4 plane in the G-quadruplexes (GQ). Said method includes using an algorithm describing nucleotide sequences in form of a defined set of formulae for further synthesis of selected oligonucleotides.

EFFECT: invention enables to use bioinformation analysis to obtain artificial oligonucleotides that are potentially capable of forming a new conformation - imperfect G-quadruplexes.

4 dwg, 2 tbl, 2 ex

FIELD: information technology.

SUBSTANCE: pre-examination patient information gathering system comprises an electronic user interface including a display and at least one user input device, and an electronic processor configured to present an initial set of questions to a patient via the electronic user interface, receive responses to the initial set of questions from the patient via the electronic user interface, construct or select follow-up questions based on the received responses, present the constructed or selected follow up questions to the patient via the electronic user interface, and receive responses to the constructed or selected follow up questions from the patient via the electronic user interface. A physiological sensor may be configured to autonomously measure a patient physiological parameter as the patient interacts with the electronic user interface.

EFFECT: high efficiency of a medical facility.

11 cl, 3 dwg

FIELD: information technology.

SUBSTANCE: apparatus includes a subject record database, a time-dependent relationship identifier, an event predictor, a coded subject record database, a decision support system processor and a user interface. The time-dependent relationship identifier processes the data in the subject record database to identify time-dependent relationships in the data. Information indicative of the identified relationships is processed by the processor and presented to a user via the user interface.

EFFECT: identifying relationships in subject information which includes event data indicative of an event experienced by the subject, outcome data indicative of an outcome experienced by the subject, and intervention data indicative of an intervention applied to the subject.

8 cl, 7 dwg

FIELD: physics.

SUBSTANCE: method involves determining current time characteristics, taking into account the state of the atmosphere, determining the spatial position of the imaging means, based on data from spatial positioning means, the obtained image is compared with three-dimensional models of the surrounding environment and electronic maps stored in a dynamically populated knowledge base, identifying objects of the surrounding environment that are part of the image using means of recognising and identifying samples associated with said base, where said base is constantly populated and improved with knew data obtained from identification of said objects.

EFFECT: high accuracy of displaying artificial objects on an image of the surrounding environment in real time owing to analysis of dynamic changes in the surrounding environment.

5 cl, 1 dwg

FIELD: information technology.

SUBSTANCE: portable storage device has a data management application which receives and processes data with measurement results from a measuring device which measures an analysed substance. The portable device can use an interface protocol which directly provides compatibility of the portable device with different operating systems and hardware configurations. The data management application is launched automatically upon connecting the portable device with a master computer.

EFFECT: managing medical data using different processing devices without the need for pre-installation of additional programs, clients, device drivers or other program components on separate processing devices.

60 cl, 19 dwg

FIELD: medicine.

SUBSTANCE: group of inventions relates to medicine. In realisation of methods implanted gastric restricting device is implanted into patient's body. Data, containing information about values of parameter, perceived inside the body, are collected for a time period. In the first version of method realisation determined are values of perceived parameter, which exceed the first threshold, are below the first threshold or below the second threshold in such a way that pulse is determined by time between values, which exceed the first threshold and values, which are below the first threshold or below the second threshold. In the second version of the method additional values of perceived parameter, accompanied by decreasing values, are determined. In the third version of the method areas under the curve of pressure dependence on time are determined, compared and the result of comparison is correlated with the state. In the fourth version of the method values of perceived pressure are formed for demonstration on display or further analysis. In the fifth version of the method average value of pressure for time X within the specified time period is calculated on the basis of values of perceived pressure within the window of averaging in specified period of time.

EFFECT: group of inventions makes it possible to increase treatment safety and efficiency due to control of implanted device.

32 cl, 77 dwg

FIELD: medicine.

SUBSTANCE: group of inventions relates to medical equipment. Wireless system of cardiac control contains ECG monitor and mobile phone. ECG monitor contains transceiver for wireless transmission of ECG signal data. ECG monitor contains connected with transceiver unit of notification about status for transmission of notification in case of change of ECG monitor status. Mobile phone contains electronics, transceiver for wireless reception of ECG signal data or notifications from ECG monitor and controller for transmission of ECG signal data into the control centre by electronics via mobile connection net. Controller can respond to notification from ECG monitor by communicating notification to patient by means of mobile phone or transmission of notification into the control centre. Notification is communicated to patient by means of mobile phone display, tone signal or verbal prompt, formed by mobile phone. Controller can delay transmission of specified notification into the control centre to give time for reception of notification about status of disorder elimination. When patient is informed about change in status patient is given possibility to answer immediately or to delay respond to notification.

EFFECT: invention makes it possible for patient to recognize and correct situation with changed status without transmission of notification or response of the control centre.

6 cl, 38 dwg, 1 tbl

FIELD: oil and gas industry.

SUBSTANCE: system contains one or more sources providing data representing aggregated fractures in formation, processor of computer connected to one or more sources of data, at that processor of computer contains carriers containing output code of the computer consisting of the first program code for selection of variety of materials to control drill mud losses out of list of materials in compliance with data representing total number of fractures in formation and the second program code related to the first program code and purposed for determination of optimised mixture for selected materials to control drill mud losses to apply them for fractures; at that optimised mixture is based on comparison of statistical distribution for selected sizes of materials to control drill mud losses and sizes of aggregated fractures.

EFFECT: reducing loss of materials and improving operational efficiency of wells.

20 cl, 6 dwg

FIELD: medicine.

SUBSTANCE: invention relates to field of medicine. System of cardiac monitoring contains battery-supplied ECG monitor, which is worn by patient and has processor of patient's ECG signal, device for identification of arrhythmia and wireless transceiver for sending messages about the state and obtaining information about configuration of device of arrhythmia identification. System of cardiac control additionally contains mobile phone, which has electronic devices of mobile phone, transceiver and controller. In the process of method version realisation, parameter of specified arrhythmia to be identified, and limit of switching on alarm signals for specified arrhythmia, are determined and stored in configuration file in the centre of monitoring. ECG monitor is fixed to patient and activated to start ECG monitoring. Message about state is sent by wireless communication line from ECG monitor into the centre of monitoring. Reply to message, which includes only configuration file, is sent to ECG monitor. Configuration file is used to adjust device for arrhythmia identification.

EFFECT: invention makes it possible to provide completely wireless ECG monitoring to increase patient's comfort and convenience.

18 cl, 48 dwg, 1 tbl

FIELD: information technologies.

SUBSTANCE: in the method a type of the map is built and placed using logics determined by the map type component, corresponding to each visual element, besides, such logics may depend on one or more values of parameters of the map type component. Some of these values of parameters correspond to available values of map model parameters, and other ones are calculated using a model, which determines analytic ratios between parameters of the map model. Sequence of operations for building of map type may be fully controlled by data and may include a mechanism for canonisation of input data and linkage of canonised input data to model parameters.

EFFECT: expansion of functional capabilities, due to provision of generation of a layout controlled by infrastructure data, which depends on input data.

20 cl, 16 dwg

FIELD: physics, computer engineering.

SUBSTANCE: invention relates to providing content to a device for reproducing content. In one implementation, the computer-implemented method receives content data and metadata. The metadata are associated with a plurality of temporal positions in the content data. Viewing parameters corresponding to the plurality of temporal positions are calculated based on the received metadata. The content data are selectively delivered based on said association.

EFFECT: improved method.

20 cl, 13 dwg

FIELD: information technology.

SUBSTANCE: apparatus comprises a modem consisting of a demodulator and a modulator, a mutual difference coefficient measuring device consisting of two multipliers, a phase changer, two integrators, two squaring devices, an adder, a gating unit and a normalising unit, a group of AND elements, an OR element, a NOR element, a flip-flop, a register, a unit for measuring the signal energy to noise spectral density ratio, a mutual difference coefficient threshold measuring device consisting of an AND element, a doubler, a squaring device, a logarithm device, a divider, a comparator, a control result output unit, a group of delay lines, an analogue-to-digital converter, a controlled delay line, a switch, and further includes an OR element, two AND elements, an RS flip-flop, a comparator, two devices for calculating mathematical expectation consisting of two OR elements, two inverters, a register, two shift registers, a group of AND elements, a group of adders, a counter and a divider.

EFFECT: high reliability of controlling communication link quality of a data transmission channel and end transmission equipment.

2 dwg

FIELD: information technology.

SUBSTANCE: versions provide an architecture to enable composite autonomous applications and services to be built and deployed. In addition, an infrastructure is provided to enable communication between distributed applications and services. In one or more versions, an exemplary architecture includes or otherwise uses five logical modules including connectivity services, process services, identification services, lifecycle services and tools.

EFFECT: high efficiency of building and deploying data-controlled composite applications.

9 cl, 17 dwg

FIELD: information technology.

SUBSTANCE: method of operating a computer interactively with a user interface device to prepare a beverage recipe for an integrated beverage preparation system, having a dispensing module for dispensing selected ingredients into a beverage container and a blending/mixing module which blends and/or mixes ingredients in the beverage container, said method including a step of: executing on the computer a recipe program which includes: presenting to a user one or more images on a display of said user interface device for said user to enter recipe parameters for said beverage recipe; and saving said entered recipe parameters as said beverage recipe in memory associated with said computer, and then executing the program responsible for the user selection of said saved recipe to prepare the beverage. Said recipe parameters include a blending profile for blending coarse particles without changing granularity and/or a mixing profile for grinding coarse particles into a finely ground product for the blending and/or mixing module for blending and/or mixing ingredients in the beverage container.

EFFECT: creating a beverage recipe and saving said recipe, the saved recipe having parameters which include mixture parameters for mixing coarse particles without changing granularity or blending profile for grinding coarse particles.

16 cl, 63 dwg

FIELD: information technology.

SUBSTANCE: in a version of the invention, execution of one or more processes which include content received through a network is controlled by another process of the same application which includes the one or more processes. The control involves ending one or more processes if they are not responding. Execution of one or more processes is isolated from the other process such that when one or more processes are not responding, the other process remains responsive. Content in the one or more ended processes is then restored.

EFFECT: isolating content through processes in an application.

20 cl, 5 dwg

FIELD: information technologies.

SUBSTANCE: in the method of automated language detection and (or) text document coding, byte sequences are identified, and statistics of frequency of identified byte sequences is counted. Then, using the statistics, profiles of each language and (or) each coding are built, a search engine is built to extract sought-for byte sequences from the byte flow of the inspected document, and the built search engine and profiles of languages and (or) codes are saved into the memory. Byte sequences are found in electronic version of each inspected document with the help of the search engine, and statistics of frequency of found byte sequences is counted as the profile of the inspected document. The calculated profile of the inspected document is compared with profiles of languages and (or) codes to identify relevance of the language and (or) code to this inspected document.

EFFECT: expanded arsenal of technical facilities, making it possible to automatically detect language and coding of text according to previously collected statistics in any text documents.

3 cl

FIELD: information technology.

SUBSTANCE: first version and at least one cell associated with the document are received, wherein at least one cell has a cell identifier and the cell identifier is associated with the first version, having at least one first version identifier. Each of the at least one first version identifiers presents cell status at a moment in time, and the coverage area defines a plurality of cells and versions, the coverage area including at least one root object. Updates for a first computing device are received. The updates indicate the identifier of the updated version, associated with each cell, associated with the document. The first version of each cell is stored if the first version identifier matches the identifier of the updated version of the cell. A new version of each cell is generated, wherein generation of the new version includes assigning a new version of the identifier of the new version if the identifier of the first version of the cell does not match the identifier of the updated version of the cell. Any cell on which there were no links in root objects is deleted and the document is synchronised by replacing cells with a new version of each cell.

EFFECT: reduced volume of altered information.

12 cl, 6 dwg

FIELD: information technologies.

SUBSTANCE: associative Identifier of Events, Technological, implements a circuit of identification of expected design events/conditions of a control system, determined readings of primary sensors of process control, and whenever such occur, it generates alternative design data for direct control of process without application of software and processor resources in asynchronous mode and at the moment of control data arrival, at the same time it includes a multi-layer architecture of an item, organising address-free space of memory and providing for equal and asynchronous access of input information to each layer, in respect to input data, all layers are interconnected memory cells with elements of data comparison and control of recording procedure.

EFFECT: increased indices of reliability, trustworthiness and validity.

3 cl, 3 dwg

FIELD: radio engineering, communication.

SUBSTANCE: disclosed system for controlling, collecting and processing data with onboard spacecraft recording equipment includes at least one onboard recording equipment unit connected by at least two communication channels to a data control and processing unit, which is connected onboard spacecraft equipment through at least one communication channel for subsequent collection of information on Earth. The data control and processing unit includes: an interfacing device, a self-contained timer, a single-board computer, a forced cooling system, a heat sensor system, a storage unit, a synchronous data transmission unit, a secondary power unit and a command transmission and power distribution system.

EFFECT: easier and more reliable simultaneous connection to different onboard recording equipment.

7 cl, 2 dwg

FIELD: oil and gas industry.

SUBSTANCE: the proposed method involves acquiring oil deposit databases related to oil-field objects. A self-organising map (SOM) is formed by assigning each of multiple data fields to one of multiple SOM maps and assigning each of multiple oil-field objects to one of multiple SOM positions, based on a pre-determined SOM algorithm, so as to represent statistical patterns in the oil deposit databases. A stochastic database is formed from the oil deposit databases using an artificial neural network. The oil deposit databases are screened, based on the stochastic database, to identify candidates among the oil-field objects. A detailed assessment of each candidate is performed, and an oil-field object is selected from the candidates based on that assessment. Oil-field operations are then performed for the selected oil-field object.

EFFECT: improved accuracy of oil-field object assessment.

22 cl, 23 dwg
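
The abstract above does not specify its "pre-determined SOM algorithm", so the following is only a generic Kohonen-style self-organising map written as a minimal sketch. The function names `train_som` and `best_matching_unit`, the grid size, the learning-rate schedule and the toy feature vectors are all assumptions made for illustration, not details from the patent. It shows how each object, described by a vector of data fields, can be assigned to one SOM position.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid position whose weight vector is closest to x."""
    return min(
        weights,
        key=lambda pos: sum((w - xi) ** 2 for w, xi in zip(weights[pos], x)),
    )


def train_som(samples, rows, cols, epochs=50, lr0=0.5, sigma0=None, seed=0):
    """Train a rows x cols map on a list of equal-length feature vectors."""
    rng = random.Random(seed)
    dim = len(samples[0])
    sigma0 = sigma0 or max(rows, cols) / 2
    # One randomly initialised weight vector per grid position.
    weights = {
        (r, c): [rng.random() for _ in range(dim)]
        for r in range(rows)
        for c in range(cols)
    }
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1 - frac)                    # decaying learning rate
        sigma = sigma0 * (1 - frac) + 1e-3       # shrinking neighbourhood
        for x in rng.sample(samples, len(samples)):
            bmu = best_matching_unit(weights, x)
            for pos, w in weights.items():
                d2 = (pos[0] - bmu[0]) ** 2 + (pos[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2 * sigma * sigma))
                # Pull each node towards the sample, weighted by grid distance.
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return weights


# Toy "oil-field objects", each described by two normalised data fields.
objects = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
som = train_som(objects, rows=3, cols=3)
positions = [best_matching_unit(som, obj) for obj in objects]
```

Objects that land on the same or neighbouring positions have similar data fields, which is the kind of statistical pattern the screening step could then exploit.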

FIELD: chemistry.

SUBSTANCE: invention relates to biotechnology, namely to a method of detecting O-glycosylated proteins in cell homogenates prepared for proteomic and phosphoproteomic analysis. The method involves two-dimensional electrophoresis followed by identification of spots using MALDI-TOF mass spectrometry and phosphoproteomic techniques. The cell homogenates are desalted by gel permeation chromatography or dialysis. The cell homogenates are subjected to deglycosylation based on the β-elimination principle in a 0.05 M NaOH solution containing 38 mg/ml NaBH4 for 16 hours at +45°C, followed by addition of the cyanine dye JC-1 at a concentration of 10⁻⁶ M. The cell homogenates are incubated for 15 minutes at room temperature. The homogenates are concentrated by precipitation with 50% acetone and subjected to two-dimensional electrophoresis to form electrophoregrams, which are analysed for fluorescence when illuminated on a blue light transilluminator with an amber light filter; the fluorescence appears visually as strips that glow in the dark. Said strips are extracted from the gel and used for proteomic or phosphoproteomic analysis. The intensity and arrangement of the extracted strips are further analysed by comparing silver nitrate-stained electrophoregrams of the homogenates before and after the deglycosylation procedure.

EFFECT: disclosed invention enables identification of proteins whose composition or degree of O-glycosylation changes as a result of any physiological action on the cell.

5 dwg
