RussianPatents.com
Method of designing primary structure of protein with specified secondary structure. RU patent 2511002.
FIELD: chemistry. SUBSTANCE: the invention relates to a computer method that uses biochemical databases in the design of novel protein compounds. The design is performed by an operator using the purpose-written software PROTCOM, which works with a database of protein pentafragments. The operator first specifies and enters into PROTCOM an initial sequence of five amino acids (the specified initial pentafragment) and a ten-digit binary number that describes the secondary structure of that pentafragment. The sequence is searched for in the database folder whose number matches the specified ten-digit number, and the search is repeated until the specified initial pentafragment is found in the database. Once found, this pentafragment is taken as the first of a possible N pentafragments of the designed primary structure, and it is recorded in the program's working file together with the ten-digit folder number describing its secondary structure. The secondary structure of each of the following (N-1) pentafragments is then specified by entering either the same ten-digit number as for the previous pentafragment or a changed one, and the database is searched for pentafragments that contain four amino acids of the pentafragment recorded in the working file plus one new amino acid. When such pentafragments are found, one of the new amino acids is selected and joined to the last four amino acids of the previous pentafragment; the new amino acid and the ten-digit folder number describing the secondary structure of the found pentafragment are recorded in the working file. The amino acid sequence obtained in the working file, with the corresponding description of its secondary structure, is taken as the designed primary structure of the protein.
EFFECT: the claimed method considerably simplifies and accelerates the design of proteins with a specified secondary structure. 5 dwg, 21 tbl, 2 ex
The invention relates to a computer method that uses biochemical databases in the development of new protein compounds for the pharmaceutical, biotechnology and other industries, as well as for research in medicine, biochemistry, molecular biology and genetics, where the use of new amino-acid-based protein compounds is important. The invention belongs to protein engineering, a field of molecular biology whose tasks include creating the knowledge and methods needed to obtain proteins with a predetermined structure and function. One aspect of this field is the design of protein molecules. The design problem is the inverse of the problem of predicting protein structure. In prediction, the first stage is to find, from a known amino acid sequence, the secondary structure, i.e. the positions of α-helices, β-structural segments and turns. In design, we must specify a previously unknown amino acid sequence (primary structure) that, after synthesis and under suitable conditions, will adopt the desired spatial structure with the intended order and sizes of α-helices, β-structural segments and turns. New proteins are, as a rule, designed on the basis of existing methods of predicting protein structure, and the degree of success in designing new proteins with a predictable structure depends on how successful those methods are. In most cases the published results are only a few successful examples out of a large number of failed variants that the authors do not mention. There are known attempts to design protein structures on the basis of the general regularities of their formation. One of the first was the work of the DeGrado group (D.Eisenberg, W.Wilcox, S.M.Eshita, P.M.Pryciak, S.P.Ho, W.F.Degrado. 1986. The design, synthesis and crystallization of an alpha-helical peptide. Proteins: Structure, Function, and Bioinformatics. V.1, Issue 1, pp.16-22).
The authors proceeded from a simple idea: the hydrophobic parts of a protein structure should be minimized or buried in the hydrophobic core, while the hydrophilic parts should be in contact with the solvent. Based on these considerations, the authors designed and synthesized an artificial protein containing only a few kinds of amino acids (Leu, Glu, Lys) and consisting of four α-helices (W.F.DeGrado, L.Regan, S.P.. The Design of a Four-helix Bundle Protein. Cold Spring Harb Symp Quant Biol 1987. 52: 521-526). However, this simplified approach does not allow the design of proteins close to real, complex ones, which are composed of 20 different types of amino acids and have specified structural as well as functional properties. The artificial protein albebetin was based on a structure that does not exist in nature, consisting of two repeats of the α-β-β type (V.V.Chemeris, D.A.Dolgikh, A.N.Fedorov, A.V.Finkelstein, M.P.Kirpichnikov, V.N.Uversky, O.B.Ptitsyn. A new approach to artificial and modified proteins: theory-based design, synthesis in a cell-free system and fast testing of structural properties by radiolabels. Protein Eng. (1994) 7 (8): 1041-1052). Its structure was developed on the basis of a physical theory of protein secondary structure formation developed by the authors (Ptitsyn O.B., Finkelstein A.V. Theory of protein secondary structure and algorithm of its prediction. Biopolymers. 1983. V.22. P.15-25). A structural study showed that albebetin has the secondary structure intended by the authors and exists in a molten globule state. It should be noted that the accuracy of the approach used by the authors does not exceed 80%, which does not allow proteins with a specified structure to be designed with confidence. In practice the authors designed only one protein, and further investigations were discontinued.
To improve the predictive power of the known method based on physical potentials, it was proposed to introduce a number of parameters that take into account properties of the amino acid sequence (A.M.Poole and R.Ranganathan. Knowledge-based potentials in protein design. Current Opinion in Structural Biology 2006, 16, 508-513). Using this method with the added parameters, the authors designed a number of proteins de novo (WO 2007030594, "Methods of using and analyzing biological sequence data", IPC G06F 19/22; G06F 19/18, publ. 15.03.2007). However, this approach is compilatory in character and only slightly improves the underlying methods, without changing the probabilistic nature of the original physical method. An invention is known that concerns an apparatus and methods for the quantitative design and optimization of protein structure (US 2002106694 "Apparatus and method for automated protein design", IPC SC 1/00; C07K 14/00; C12N 15/10; G06F 17/50; G06F 19/00, publ. 08.08.2002). The automated design method quantitatively accounts for the interactions of the side chains of surface residues by calculating three types of potentials and applying stereochemical restrictions. It made it possible to select, from a large number of variants, the protein FSD-1 with a ββα motif, based on the structure of a zinc finger domain. The amino acid sequence of this protein has very little similarity to that domain. Nevertheless, a nuclear magnetic resonance study of the protein in solution showed that it forms a structure identical to the one targeted by the design (B.I.Dahiyat and S.L.Mayo. De Novo Protein Design: Fully Automated Sequence Selection. Science, (1997) 278, 82-87). The disadvantage of this method is the need for a template protein, on the basis of which the new structure is selected from a large number of variants.
Using the Rosetta methodology presented in (Kuhlman B, Dantas G, Ireton GC, Varani G, Stoddard BL, Baker D. Design of a novel globular protein fold with atomic-level accuracy. Science, 2003, 302(5649), 1364-8), which is based on the optimization of selected structures, the artificial protein Top7, unknown in nature, was designed and synthesized, and its structure was confirmed experimentally. The core of Rosetta is a physical model of macromolecular interactions together with algorithms that search for the amino acid sequence with the lowest energy for a given protein structure. The authors applied the method (US 7574306 "Method and system for optimization of polymer sequences with stable, 3-dimensional conformations", IPC G06F 19/00, publ. 11.08.2009) to develop the structures of a number of other proteins. However, this method requires rather complex calculations and does not always lead to successful results. It also requires template structures. Such methods do not solve the problem of creating a simple way to design new proteins with any specified structure and functional properties, and the need to use specific protein structures as templates limits the range of structures that can be designed. Solving this problem is especially important in the technology of pharmaceutical and immunological protein preparations. The task addressed by the claimed invention is the development of a method of designing the primary structure of a protein whose technical result is a simpler procedure together with a wider range of designable structures.
The proposed method of designing the primary structure of a protein on the basis of its characterizing amino acid sequence and a description of its secondary structure is as follows:
A) create a database of amino acid pentafragments of proteins containing folders of pentafragments, where the list of folders is ordered by their names, which are formed from binary-coded descriptions of the hydrogen bonds of the peptide groups of the pentafragments in the secondary structure of proteins, and record it on any information carrier;
B) load the database of amino acid pentafragments recorded on the carrier into computer memory;
C) specify and enter into computer memory an initial sequence of five amino acids belonging to the twenty canonical amino acids of proteins, which is the specified initial pentafragment;
D) specify and enter into computer memory the description of the secondary structure of the specified initial pentafragment as a ten-digit binary number;
E) load into computer memory the program PROTCOM, which selects and searches for pentafragments of the designed protein in the database and records the names of the amino acids of the found pentafragments and the numbers of the database folders, describing the secondary structure, in which the pentafragments were found;
F) enter and store in PROTCOM the specified initial pentafragment of the designed protein as a sequence of five amino acids;
G) enter and store in PROTCOM the specified secondary structure of the specified initial pentafragment as a ten-digit binary number;
H) search for the specified initial pentafragment of the designed protein in the database using PROTCOM, the search algorithm comprising:
- encoding the specified initial pentafragment for the database search;
- searching for the specified initial pentafragment in the database folder with the specified secondary structure;
- if the specified initial pentafragment is found in the folder, taking it as the first of a possible N pentafragments of the designed primary structure and:
  - recording the number of the database folder containing the first pentafragment;
  - recording the amino acid sequence of the first pentafragment in the working file of the program;
  - recording in the working file the ten-digit folder number describing the secondary structure of the first pentafragment;
- if the specified initial pentafragment is not found in the folder:
  - specifying and entering into computer memory a new initial sequence of five amino acids belonging to the twenty canonical amino acids of proteins, which is the new specified initial pentafragment;
  - entering and storing in PROTCOM the new specified initial pentafragment as a sequence of five amino acids;
  - searching for the new specified initial pentafragment in the database using PROTCOM, which comprises encoding the new pentafragment for the database search and searching for it in the folder with the specified secondary structure;
  - repeating the specification of, and search for, new initial pentafragments until a pentafragment with the given amino acid sequence is found in the database folder that describes the specified secondary structure;
I) specify the secondary structure of each of the following (N-1) pentafragments by entering into PROTCOM either the same ten-digit number that describes the secondary structure of the previous pentafragment or a changed one;
J) search the database for pentafragments containing four amino acids of each of the (N-1) pentafragments recorded in the working file and one new amino acid, the search algorithm comprising:
- selecting and storing the last four amino acids of each of the (N-1) pentafragments recorded in the working file;
- searching, in the database folder with the specified secondary structure, for pentafragments that contain these last four amino acids and one new amino acid;
- when such pentafragments are found:
  - selecting one of the new amino acids and joining it to the last four amino acids of the previous pentafragment;
  - recording the new amino acid in the working file, which reflects the designed primary structure of the protein;
  - recording the ten-digit folder number describing the secondary structure of each found pentafragment;
- when no such pentafragments are found:
  - specifying a modified secondary structure;
  - selecting the last four amino acids of the previous pentafragment;
  - searching, in the folder with the modified secondary structure, for pentafragments containing the last four amino acids of the previous pentafragment and one new amino acid;
  - repeating the change of secondary structure and the database search until at least one pentafragment containing the four amino acids of the previous pentafragment is found;
K) the amino acid sequence obtained in the working file, with the corresponding description of its secondary structure, is taken as the designed primary structure of the protein.

The method is carried out as follows:
A) create a database of amino acid pentafragments of proteins containing folders of pentafragments, where the list of folders is ordered by their names, which are formed from binary-coded descriptions of the hydrogen bonds (H-bonds) of the peptide groups of the pentafragments in the secondary structure of proteins, and record it on any information carrier:
a) publicly available files with the atomic coordinates of protein crystals studied by X-ray structural analysis are downloaded from the Protein Data Bank. To create the initial database, 2500 protein files were downloaded;
b) using the computer program Protein 3D (computer program "Protein 3D", registered with RosAPO, No. 980143 of 03.05.98, authors: V. Karasev, Demchenko EL), a text file is created from the Protein Data Bank files; it contains the primary structures of the proteins with a description of the H-bonds formed by the peptide groups of the main chains in the protein secondary structure;
c) using a set of programs for creating the database, the following steps are carried out:
- the primary structures of the proteins are cut into fragments of five amino acids (pentafragments) so that each successive fragment, moving from the bottom up, is selected with a shift of one amino acid relative to the previous one, and the information about the H-bonds of each selected fragment in the protein secondary structure is fully preserved. Table 1 shows, as an example, the cutting of a fragment of the text file of protein 1SCN (subtilisin Carlsberg). The table shows that the H-bonds in the pentafragments remain unchanged;
- pentafragments with a homologous structure of peptide-group H-bonds in the protein secondary structure are sorted into folders, whose names are the binary-coded descriptions of the peptide-group H-bonds. The presence of an H-bond is denoted by "1" and its absence by "0". Each pentafragment has five peptide groups (links), and the H-bonding of each is described by one of four pairs of variables: no H-bonds, 00; H-bond O...HN, 01; H-bond NH...O, 10; two H-bonds, O...HN and NH...O, 11. Thus the name of a folder containing homologous pentafragment structures consists of 10 digits, 0s and 1s, which are read from the top down and written in a row from left to right. Examples of selected pentafragments and the ten-digit binary numbers describing them are shown in table 2. For instance, a pentafragment taken from a β-structure segment (first row, left example) contains no short-range H-bonds and is described by the number 0000000000.
The segment with a pentafragment located in the transition region from β-structure to α-helix (first row, right) contains one link with O...HN and NH...O bonds, pair of variables 11, and four links with O...HN bonds, 01, and is characterized by the number 1101010101. The central region of an α-helix, as shown in table 2, contains five links with O...HN and NH...O bonds, 11, and is described by the number 1111111111. The transition region from α-helix to β-structure contains four links with NH...O bonds, 10, and one link with O...HN and NH...O bonds, pair of variables 11, which gives the ten-digit number 1010101011. Finally, a bend of a β-structure with one H-bond, as follows from table 2, contains one link with an NH...O bond, 10, three links without H-bonds, 00, and one link with an O...HN bond, 01, and is described by the number 1000000001. When the database was created, the text files were processed by moving along the protein chain from the bottom up with a shift of one amino acid at each step, and each selected pentafragment received a ten-digit description. In table 1 these values are given in the column of ten-digit descriptions. This column contains a series of ten-digit descriptions, overlapping by 4/5, of a segment of protein 1CSN, each of which goes into the database folder with the same number. The highlighted ten-digit numbers belong to pentafragments similar to those shown in table 2.
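The encoding described above can be illustrated with a short Python sketch. It is not the authors' software: the function names are hypothetical, and the only data used are the H-bond pairs of residues 50-59 as they can be read from table 1 (first digit of a pair: NH...O bond present; second digit: O...HN bond present).

```python
def pair_code(nh_bond: bool, o_bond: bool) -> str:
    """Two-digit code of one peptide group: 00, 01 (O...HN), 10 (NH...O) or 11."""
    return f"{int(nh_bond)}{int(o_bond)}"


def descriptor(pairs_top_down):
    """Ten-digit folder name of one pentafragment, read from the top down."""
    if len(pairs_top_down) != 5:
        raise ValueError("a pentafragment has exactly five peptide groups")
    return "".join(pairs_top_down)


def sliding_descriptors(pairs_bottom_up):
    """Descriptors of all pentafragments obtained by moving up one residue at a time."""
    result = []
    for top in range(4, len(pairs_bottom_up)):
        window = pairs_bottom_up[top - 4: top + 1]   # five residues, bottom-up
        result.append(descriptor(list(reversed(window))))
    return result


# Residues 50..59 of table 1, bottom-up: PRO, GLN, LEU, ARG, ASP, GLU, TYR, ARG, THR, TYR
pairs_50_to_59 = [
    pair_code(False, False),                            # 50 PRO: no H-bonds
    *[pair_code(False, True) for _ in range(4)],        # 51-54: O...HN only
    *[pair_code(True, True) for _ in range(5)],         # 55-59: both bonds
]

for name in sliding_descriptors(pairs_50_to_59):
    print(name)
```

Under these assumptions the printed series runs from 0101010100 (pentafragment 50-54) to 1111111111 (pentafragment 55-59), and neighbouring descriptors share four of their five pairs, which is the 4/5 overlap mentioned in the text.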
Table 1. Example of cutting an α-helical fragment of protein 1SCN into pentafragments (→ marks the highlighted descriptions)

10-digit description   Text file (1CSN): residue and H-bonds          Pentafragment selected (residues)
   0000000000          69 PRO                                         65-69
   0000000000          68 ILE                                         64-68
   0000000010          67 GLY                                         63-67
   0000001010          66 THR                                         62-66
   0000101010          65 CYS                                         61-65
   0010101010          64 GLY                                         60-64
→  1010101011          63 ALA  N - 59 TYR O                           59-63
   1010101111          62 LEU  N - 58 THR O                           58-62
   1010111111          61 LEU  N - 57 ARG O                           57-61
   1011111111          60 LYS  N - 56 TYR O                           56-60
→  1111111111          59 TYR  O - 63 ALA N;  N - 55 GLU O            55-59
   1111111101          58 THR  O - 62 LEU N;  N - 54 ASP O            54-58
   1111110101          57 ARG  O - 61 LEU N;  N - 53 ARG O            53-57
   1111010101          56 TYR  O - 60 LYS N;  N - 52 LEU O            52-56
→  1101010101          55 GLU  O - 59 TYR N;  N - 51 GLN O            51-55
   0101010100          54 ASP  O - 58 THR N                           50-54
   0101010000          53 ARG  O - 57 ARG N                           49-53
   0101000000          52 LEU  O - 56 TYR N                           48-52
   0100000000          51 GLN  O - 55 GLU N                           47-51
   0000000000          50 PRO                                         46-50
                       49 ALA
                       48 ASP
                       47 SER
                       46 ARG

The central parts of α-helices and β-structures of proteins are described, respectively, by runs of the repeated ten-digit numbers 1111111111 and 0000000000. The transition regions from β-structure to α-helix and from α-helix to β-structure are described by blocks of ten-digit numbers in which the composition of the pairs of variables changes gradually. Examples of such blocks are shown in table 3. The initial and final parts of the transitions and their ten-digit descriptions are in bold.
Table 4. Comparison of a bend of an α-helix with a break of one H-bond and a bend of a β-structure with one H-bond

Bend of an α-helix with a break of one H-bond (1DOG), text file from the top down:
334 GLN
333 TYR  O - 337 LYS N;  N - 329 TYR O
332 LEU  O - 336 ASP N;  N - 328 LEU O
331 ALA  O - 335 TRP N;  N - 327 GLN O
330 ASP  O - 334 GLN N;  N - 326 GLU O
329 TYR  O - 333 TYR N;  N - 325 ALA O
328 LEU  O - 332 LEU N
327 GLN  O - 331 ALA N;  N - 323 ALA O
326 GLU  O - 330 ASP N;  N - 322 LEU O
325 ALA  O - 329 TYR N;  N - 321 THR O
324 ALA  N - 320 CYS O
323 ALA  O - 327 GLN N;  N - 319 LEU O
322 LEU  O - 326 GLU N;  N - 318 PHE O
321 THR  O - 325 ALA N;  N - 317 TRP O
320 CYS  O - 324 ALA N
319 LEU  O - 323 ALA N
10-digit descriptions (1DOG), from the top down: 1111111111, 1111111101, 1111110111, 1111011111, 1101111111, 0111111110, 1111111011, 1111101111, 1110111111, 1011111101, 1111110101.

Bend of a β-structure with one H-bond (1GZM), text file from the top down:
31 LEU
30 TYR
29 TYR
28 GLN
27 PRO
26 ALA  N - 22 SER O
25 GLU
24 PHE
23 PRO
22 SER  O - 26 ALA N
21 ARG
20 VAL
19 VAL
18 GLY
17 THR
10-digit descriptions (1GZM), from the top down: 0000000000, 0000000010, 0000001000, 0000100000, 0010000000, 1000000001, 0000000100, 0000010000, 0001000000, 0100000000, 0000000000.

By combining these blocks, all types of protein secondary structure can be designed.
d) the selected pentafragments are simplified by removing the information about the structure of the H-bonds and leaving only the sequence of five amino acids;
e) to facilitate the subsequent search for pentafragments in the database, they are sorted into files that contain fragments with the same five-digit numeric index, which is obtained by assigning each amino acid of the pentafragment to one of four groups of symmetry transformations (V. Karasev, V.V. Luchinin Introduction to designing bionic nano. - M: Fizmatlit, 2009, 464 pp., Chapter 8). These groups are given in table 5.
Table 5. Distribution of amino acids by antisymmetry group

Symmetry group   Amino acids
Group 1          Gly, Pro
Group 2          Ala, Leu
Group 3          Ser, Thr, Cys, Met, His, Trp, Phe, Tyr
Group 4          Asp, Glu, Asn, Gln, Arg, Lys, Val, Ile

The file name records the five-digit code and the name of the folder in which the file is located. If the pentafragment Efg, Def, Cde, Bcd, Abc (listed from the top down) is described by the ten-digit number 0000000000, its index is formed from the top down and written from left to right. For example, if amino acid Efg belongs to group 1, Def to group 2, Cde to group 3, Bcd to group 4 and Abc to group 1, the five-digit code is 12341 and the file name is 12341_0000000000. The created database contains more than 500 thousand pentafragments sorted into more than 500 folders. The database is organized as a system of 16 hypercubes isomorphic to the Boolean hypercube 6 (Database of protein pentafragments. Authors: Vairaki, AIESEC, Win. Registered July 7, 2010 with the Federal Agency ROSPATENT, No. 2010620364). The database is continually updated by processing new files from the Protein Data Bank. A theoretical database can also be created.
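The construction of the file index can be sketched in Python as follows. This is an illustration only: the function names are hypothetical, and the assignment of every amino acid to a group follows one reading of table 5 (in particular the placement of Met, His, Trp, Phe, Tyr in group 3 and of Val, Ile in group 4 is an assumption about how the table rows should be read).

```python
# Assumed reading of table 5: group number -> three-letter amino acid codes.
GROUPS = {
    1: "Gly Pro",
    2: "Ala Leu",
    3: "Ser Thr Cys Met His Trp Phe Tyr",
    4: "Asp Glu Asn Gln Arg Lys Val Ile",
}
SYMMETRY_GROUP = {aa: group for group, names in GROUPS.items() for aa in names.split()}


def index_code(pentafragment_top_down):
    """Five-digit index: group numbers read top-down, written left to right."""
    if len(pentafragment_top_down) != 5:
        raise ValueError("a pentafragment has exactly five amino acids")
    return "".join(str(SYMMETRY_GROUP[aa]) for aa in pentafragment_top_down)


def file_name(pentafragment_top_down, folder):
    """File name in the form <five-digit code>_<ten-digit folder name>."""
    return f"{index_code(pentafragment_top_down)}_{folder}"


# Example with real amino acids, listed from the fifth residue down to the first.
print(file_name(["Asp", "Ala", "Pro", "Ser", "Leu"], "0101010100"))
```

With the grouping assumed here, the pentafragment Asp, Ala, Pro, Ser, Leu receives the code 42132. Because the index uses only four symbols per position, there are at most 4^5 = 1024 index files per folder, which narrows the search before individual sequences are compared.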
B) load the database of amino acid pentafragments recorded on the carrier into computer memory;
C) specify and enter into computer memory an initial sequence of five amino acids belonging to the twenty canonical amino acids of proteins, which is the specified initial pentafragment.
The chosen initial sequence of five amino acids is presented as a column of three-letter amino acid abbreviations with their numbers on the left, written from the bottom up:
5 Efg
4 Def
3 Cde
2 Bcd
1 Abc
D) specify and enter into computer memory the description of the secondary structure of the specified initial pentafragment as a ten-digit binary number;
E) load into computer memory the program PROTCOM, which selects and searches for pentafragments of the designed protein in the database and records the names of the amino acids of the found pentafragments and the numbers of the database folders, describing the secondary structure, in which the pentafragments were found;
F) enter and store in PROTCOM the specified initial pentafragment of the designed protein as a sequence of five amino acids.
The operator enters the planned sequence of five amino acids (the specified initial pentafragment) into the program. The amino acids are entered from the top down, starting with the fifth amino acid and ending with the first: Efg, Def, Cde, Bcd, Abc.
G) enter and store in PROTCOM the specified secondary structure of the specified initial pentafragment as a ten-digit binary number.
Example of an entered ten-digit number: 0000000000
H) search for the specified initial pentafragment of the designed protein in the database using PROTCOM, the search algorithm comprising:
- encoding the specified initial pentafragment for the database search. The program reads the amino acids of the pentafragment from the top down, encodes each according to its symmetry group and writes the code number from left to right, in the same way as the file index was created, for example: Efg - 1, Def - 2, Cde - 3, Bcd - 4, Abc - 4, code number 12344;
- if the specified initial pentafragment is found in the folder, taking it as the first of a possible N pentafragments of the designed primary structure and:
  - recording the number of the database folder containing the first pentafragment;
  - recording the amino acid sequence of the first pentafragment in the working file of the program;
  - recording in the working file the ten-digit folder number describing the secondary structure of the first pentafragment.
The format of the working file created by PROTCOM is shown in table 6.

Table 6. Format of the working file created by PROTCOM

1    2     3
N    STP   bbbbbbbbbb
.    ...   ......
5    Efg   bbbbbbbbbb
4    Def
3    Cde
2    Bcd
1    Abc

The amino acid sequence of the protein is written in the working file from the bottom up, which reflects the order of protein synthesis at the ribosome (the protein is extended by adding amino acids to the top amino acid).
The columns of the file have the following functions: 1 - numbers of the amino acids in the designed protein, written from the bottom up; 2 - the sequence of amino acids in the designed protein, written from the bottom up in three-letter notation; 3 - ten-digit numbers of the database folders (bbbbbbbbbb) describing the secondary structure of the designed pentafragments, written from the bottom up. Line N contains the end-of-sequence signal (STP). The first pentafragment and the ten-digit number of the folder in which it was found are in bold.
- if the specified initial pentafragment is not found in the folder:
  - specify and enter into computer memory a new initial sequence of five amino acids belonging to the twenty canonical amino acids of proteins, which is the new specified initial pentafragment;
  - enter and store in PROTCOM the new specified initial pentafragment as a sequence of five amino acids;
  - search for the new specified initial pentafragment in the database using PROTCOM, which comprises encoding the new pentafragment for the database search and searching for it in the folder with the specified secondary structure;
  - repeat the specification of, and search for, new initial pentafragments until a pentafragment with the given amino acid sequence is found in the database folder that describes the specified secondary structure.
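The lookup of the initial pentafragment and the first entries of the working file can be sketched as follows. This is a minimal illustration, not PROTCOM itself: the database is replaced by a hypothetical in-memory dictionary (folder name -> set of pentafragments written top-down), the placeholder amino acid names are those used in table 6, and the function names are invented for the example.

```python
def find_initial(database, candidates, folder):
    """Return the first candidate pentafragment present in the given folder, or None.

    `candidates` stands for the sequences the operator tries one after another.
    """
    stored = database.get(folder, set())
    for candidate in candidates:
        pentafragment = tuple(candidate)
        if pentafragment in stored:
            return pentafragment
    return None


def start_working_file(pentafragment_top_down, folder):
    """Rows (number, amino acid, folder) numbered from the bottom up, as in table 6.

    The folder number is written on the row of the fifth (top) amino acid.
    """
    rows = []
    for number, amino_acid in enumerate(reversed(pentafragment_top_down), start=1):
        rows.append((number, amino_acid, folder if number == 5 else ""))
    return rows


# Hypothetical toy database with a single folder and a single pentafragment.
toy_database = {"0000000000": {("Efg", "Def", "Cde", "Bcd", "Abc")}}

attempts = [
    ("Xyz", "Def", "Cde", "Bcd", "Abc"),   # not in the folder, so the operator retries
    ("Efg", "Def", "Cde", "Bcd", "Abc"),   # found
]
found = find_initial(toy_database, attempts, "0000000000")
for row in reversed(start_working_file(found, "0000000000")):
    print(*row)
```

The loop over `attempts` mirrors the repeated specification of new initial pentafragments; in an interactive program each attempt would come from the operator rather than from a prepared list.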
I) specify the secondary structure of each of the following (N-1) pentafragments recorded in the working file by entering into PROTCOM either the same ten-digit number that describes the secondary structure of the previous pentafragment or a modified one;
J) search the database for pentafragments containing four amino acids of each of the (N-1) pentafragments recorded in the working file and one new amino acid, the search algorithm comprising:
- selecting and storing the last four amino acids of each of the (N-1) pentafragments recorded in the working file;
- searching, in the database folder with the specified secondary structure, for pentafragments that contain these last four amino acids and one new amino acid. For example, in table 7 the last four amino acids of the previous pentafragment and the entered description of the secondary structure for the search for a new pentafragment are in bold;

Table 7. Selection of amino acids and their secondary structures for the search for pentafragments in the database

1    2     3
.    ...   ......
6          0000000000
5    Efg   0000000000
4    Def
3    Cde
2    Bcd
1    Abc

- when such pentafragments are found:
  - selecting one of the new amino acids and joining it to the last four amino acids of the previous pentafragment;
  - recording the new amino acid in the working file, which reflects the designed primary structure of the protein;
  - recording the ten-digit folder number describing the secondary structure of each found pentafragment;
- when no such pentafragments are found:
  - specifying a modified secondary structure;
  - selecting the last four amino acids of the previous pentafragment;
  - searching, in the folder with the modified secondary structure, for pentafragments containing the last four amino acids of the previous pentafragment and one new amino acid;
  - repeating the change of secondary structure and the database search until at least one pentafragment containing the four amino acids of the previous pentafragment is found;
K) the amino acid sequence obtained in the working file, with the corresponding description of its secondary structure, is taken as the designed primary structure of the protein.
As a result of the actions of PROTCOM and of the operator designing the protein, the working file is completely filled in: the second column contains the primary structure of the protein, and the third column is the basis for judging the secondary structure of this protein. Consecutive folders 0000000000 in the third column characterize a fragment as β-structural. Several consecutive folders numbered 1111111111 can be attributed to an α-helical fragment (see table 2). Transition regions between the α-helical and β-structural conformations, as well as bends of β-structures (tables 2-4), are designed and described by the corresponding folders.
The description of the application is illustrated by the following graphic materials:
Fig. 1. Screenshots of fragments of the secondary structure of proteins NV and 3EOK obtained with PROTEIN 3D: a - NV (human); b - EOK (duck).
Fig. 2. Screenshots of fragments of the secondary structure of proteins 1AGD and 2R37 obtained with PROTEIN 3D for the designed segment of the protein.
a - pentafragment of 1AGD (103-107); b - pentafragment of 2R37 (189-193).
Fig. 3. Screenshots of fragments of the secondary structure of proteins 1AGD and V obtained with PROTEIN 3D for the designed segment of the protein. a - pentafragment of 1AGD (105-109); b - pentafragment of V (35-39).
Fig. 4. Screenshots of fragments of the secondary structure of proteins 1AGD and 1BAS obtained with PROTEIN 3D for the designed segment of the protein. a - pentafragment of 1AGD (106-110); b - pentafragment of 1BAS (80-84).
Fig. 5. Screenshots, obtained with PROTEIN 3D, of protein 1AGD studied by X-ray structural analysis. a - the protein; b - view of a protein fragment; c - detailed view of the secondary structure of protein 1GDJ corresponding to the secondary structure specified in example 2.
The method is illustrated by examples.
Example 1. This example describes the design of the primary structure of a protein whose specified secondary structure is an α-helix, comprising the transition from β-structure to α-helix, the central region of the α-helix and the transition from α-helix to β-structure.
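The way the third column of the working file is read (runs of 1111111111 as α-helix, runs of 0000000000 as β-structure, everything else as a transition or bend) can be sketched in Python. The function names and labels are hypothetical; the descriptor series is the target of Example 1 as listed in table 8, rows 5 to 18, from the bottom up.

```python
from itertools import groupby


def classify(descriptor):
    """Label one ten-digit folder name according to the rule stated in the text."""
    if descriptor == "1111111111":
        return "alpha-helix"
    if descriptor == "0000000000":
        return "beta-structure"
    return "transition or bend"


def regions(descriptors_bottom_up):
    """Collapse a series of descriptors into (label, run length) pairs."""
    return [(label, len(list(run)))
            for label, run in groupby(descriptors_bottom_up, key=classify)]


# Rows 5..18 of table 8, from the bottom up.
example_1_target = [
    "0101010100", "1101010101", "1111010101", "1111110101", "1111111101",
    "1111111111", "1111111111", "1111111111", "1111111111",
    "1011111111", "1010111111", "1010101111", "1010101011", "0010101010",
]

for label, length in regions(example_1_target):
    print(f"{label}: {length} descriptors")
```

For this series the sketch reports a five-descriptor transition, a four-descriptor α-helical core and another five-descriptor transition, which corresponds to the three regions named in the example.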
When carrying out the method of designing the primary structure of a protein with a specified secondary structure, on the basis of the amino acid sequence characterizing it and the description of the secondary structure, the following is performed:
A) create a database of amino acid pentafragments of proteins containing folders of pentafragments, where the initial list of folders is compiled by their names, formed from the binary-encoded descriptions of the hydrogen bonds of the peptide groups of the pentafragments in the secondary structure of proteins, and record it on any information carrier;
B) create a catalog of descriptions of secondary structures, containing the descriptions as sequences of ten-digit Boolean numbers;
V) enter into the computer memory the database of amino acid pentafragments of proteins recorded on the carrier;
G) set the description of the secondary structure of the designed primary structure of the protein as a sequence of ten-digit Boolean numbers, based on the catalog of descriptions of secondary structures;
In this example the specified secondary structure is an α-helix comprising a transition region from β-structure to α-helix, the central region of the α-helix, and a transition region from α-helix to β-structure. The operator finds the description in the catalog of secondary structures and fixes it (table 8).
Table 8. Description of the designed secondary structure for example 1
No. | Ten-digit number | Region
18 | 0010101010 | Transition from α-helix to β-structure
17 | 1010101011 | Transition from α-helix to β-structure
16 | 1010101111 | Transition from α-helix to β-structure
15 | 1010111111 | Transition from α-helix to β-structure
14 | 1011111111 | Transition from α-helix to β-structure
13 | 1111111111 | Central part of the α-helix
12 | 1111111111 | Central part of the α-helix
11 | 1111111111 | Central part of the α-helix
10 | 1111111111 | Central part of the α-helix
9 | 1111111101 | Transition from β-structure to α-helix
8 | 1111110101 | Transition from β-structure to α-helix
7 | 1111010101 | Transition from β-structure to α-helix
6 | 1101010101 | Transition from β-structure to α-helix
5 | 0101010100 | Transition from β-structure to α-helix
4 | |
3 | |
2 | |
1 | |
D) determine and enter into the computer memory the initial sequence of five amino acids belonging to the group of twenty canonical amino acids of proteins, which is the specified initial pentafragment:
5 Asp
4 Ala
3 Pro
2 Ser
1 Leu
which is written in order from the bottom up: Leu, Ser, Pro, Ala, Asp.
E) determine and enter into the computer memory the description of the secondary structure of the specified initial pentafragment as a ten-digit binary number, namely the first ten-digit number in the specified description of the secondary structure, which corresponds to the name of the database folder containing the specified initial pentafragment: the ten-digit number 0101010100. As can be seen from table 8, it is the first ten-digit number in the specified description of the secondary structure. It describes a pentafragment with four C=O...HN hydrogen bonds, each described by the pair of variables 01, and one pair of variables 00 (no H-bonds); see table 3 (transition from β-structure to α-helix).
Zh) load into the computer memory the PROTCOM program, which selects and searches for pentafragments of the designed protein in the database and records the names of the amino acids of the pentafragments found and the numbers of the database folders, describing the secondary structure, in which the sought pentafragments were found;
1. The program is installed in a dedicated folder, in which the working files are created; these contain the designed primary structure of the protein and the description of its secondary structure as ten-digit binary numbers.
3. At program start, the system of twenty amino acids, consisting of four groups, is displayed in table form.
Z) enter and store in the PROTCOM program the specified initial pentafragment of the designed protein as a sequence of five amino acids: the operator enters the amino acids that make up the specified initial pentafragment in order from the fifth to the first, that is, from top to bottom: Asp, Ala, Pro, Ser, Leu.
I) enter and store in the PROTCOM program the specified description of the secondary structure of the specified initial pentafragment as a ten-digit binary number: the operator enters the sequence 0101010100 into the PROTCOM program.
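The reading of the folder number 0101010100 as four pairs 01 and one pair 00 can be illustrated with a short sketch. It is not part of PROTCOM; the function name `split_pairs` is hypothetical, and only the meanings of the pairs 01 and 00 stated above are relied on.

```python
def split_pairs(folder: str) -> list[str]:
    """Split a ten-digit binary folder number into its five two-digit pairs."""
    if len(folder) != 10 or set(folder) - {"0", "1"}:
        raise ValueError("expected exactly ten binary digits")
    return [folder[i:i + 2] for i in range(0, 10, 2)]


pairs = split_pairs("0101010100")
print(pairs)              # ['01', '01', '01', '01', '00']
print(pairs.count("01"))  # 4 pairs describing a C=O...HN hydrogen bond
print(pairs.count("00"))  # 1 pair with no H-bonds
```

The folder number is handled as a string rather than an integer so that leading zeros are preserved.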
K) search for the specified initial pentafragment of the designed protein in the database using the PROTCOM program previously loaded into the computer; the search algorithm includes:
- encoding the specified initial pentafragment for the database search;
The encoding is done by the program by assigning each amino acid of the specified initial pentafragment to one of the symmetry groups (table 5 of the description of the application). In this example: Asp - 4, Ala - 2, Pro - 1, Ser - 3, Leu - 2. This numeric sequence is recorded in the program memory from left to right as 42132 and is used to search for the specified initial pentafragment in the database folder 0101010100, in the file 42132_0101010100.
- searching for the specified initial pentafragment in the database folder with the specified description of the secondary structure of the pentafragment;
The program found the specified initial pentafragment in the file 42132_0101010100: Asp Ala Pro Ser Leu. This pentafragment was extracted from a text file produced by the PROTEIN 3D program by processing the atomic coordinates of a protein from the Protein Data Bank, and it has the structure 0101010100 of the transition region from β-structure to α-helix (see table 8).
Tables 9, 10 and 11 illustrate the work of the program. In the left part, entitled "Input", the first column contains the sequence numbers of the designed amino acids entered into the PROTCOM program, and the second column contains the amino acids entered according to item Z) or the pairs of variables entered according to item L), selected by the operator on the basis of the specified secondary structure (table 8). The third column contains the description of the secondary structure entered into the program as ten-digit numbers. In the central part, entitled "Search for pentafragments in the database", the first column contains the names of the files, consisting of the code number and the number of the specified folder, and the second column contains the names of the amino acids found in the pentafragments.
In the right part of the table, the PROTCOM program makes the record in the working file after the specified initial pentafragment has been found and, subsequently, after an amino acid has been selected in a file with the code number and the number of the specified folder.
Table 9. Search for the specified initial pentafragment in the database
Input: No. | Input: amino acids or pairs of variables | Input: description of the secondary structure | Search: file name (code number and folder number) | Search: names of amino acids | Working file: No. | Working file: amino acid | Working file: description of the secondary structure
5 | Asp | 0101010100 | 42132_0101010100 | | 5 | Asp | 0101010100
4 | Ala | | | | 4 | Ala |
3 | Pro | | | | 3 | Pro |
2 | Ser | | | | 2 | Ser |
1 | Leu | | | | 1 | Leu |
- when the specified initial pentafragment is found in the folder, it is considered the first of the possible number N of pentafragments of the designed primary structure of the protein, and the following is performed:
- record the number of the database folder containing the first pentafragment;
- record the amino acid sequence of the first pentafragment in the working file of the program;
- record in the working file the ten-digit folder number describing the secondary structure of the first pentafragment found;
The program found the entered initial pentafragment in the file with the corresponding encoding and folder number and made an entry in the working file (table). Since the specified initial pentafragment was found, we omit the search steps that apply when the specified initial pentafragment is not found in the folder.
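The encoding of the initial pentafragment into the file name 42132_0101010100 can be sketched as follows. This is an illustration, not the PROTCOM source: the names `GROUP`, `encode` and `file_name` are hypothetical, and the group numbers are only those quoted in example 1 (the full assignment of table 5 is not reproduced here).

```python
# Symmetry-group numbers quoted in example 1 only (assumption: the complete
# assignment for all twenty amino acids is given in table 5 of the application).
GROUP = {"Asp": 4, "Ala": 2, "Pro": 1, "Ser": 3, "Leu": 2}


def encode(residues: list[str]) -> str:
    """Write the group number of each amino acid from left to right."""
    return "".join(str(GROUP[name]) for name in residues)


def file_name(residues: list[str], folder: str) -> str:
    """Combine the code number and the ten-digit folder number."""
    return f"{encode(residues)}_{folder}"


# The operator enters the pentafragment from the fifth amino acid to the first.
initial = ["Asp", "Ala", "Pro", "Ser", "Leu"]
print(file_name(initial, "0101010100"))  # 42132_0101010100
```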
L) set the description of the secondary structure of each subsequent one of the (N-1) pentafragments, using the specified description of the secondary structure as a sequence of ten-digit Boolean numbers that correspond to the names of the database folders containing the specified pentafragments, by entering into the PROTCOM program the same or a changed ten-digit number describing the secondary structure of the previous pentafragment;
For this purpose, when the description of the secondary structure is being set, the PROTCOM program offers to enter a pair of variables 00, 01, 10 or 11. Table 8 shows that the next ten-digit number is 1101010101. For this reason, the operator selects 11 and enters the pair of variables 11 into the program (the column "Amino acids or pairs of variables" in table 10). The program adds 11 on the left and removes a pair of digits on the right, which changes the ten-digit number describing the secondary structure of the previous pentafragment, as reflected in the column "Description of the secondary structure" of table 10.
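The update of the ten-digit number described above (a pair added on the left, a pair of digits removed on the right) can be written as a minimal sketch; the function name `shift` is hypothetical and is not taken from PROTCOM.

```python
def shift(previous: str, pair: str) -> str:
    """Prepend the chosen pair and drop the two rightmost digits."""
    if pair not in {"00", "01", "10", "11"}:
        raise ValueError("pair must be one of 00, 01, 10, 11")
    if len(previous) != 10:
        raise ValueError("expected a ten-digit number")
    return (pair + previous)[:10]


# Example 1: the operator selects 11 after the initial folder 0101010100.
print(shift("0101010100", "11"))  # 1101010101
```

Applying the same function with the pair 11 once more gives 1111010101, the next number in table 8.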
Table 10. Search for the second pentafragment in the database
Input: No. | Input: amino acids or pairs of variables | Input: description of the secondary structure | Search: file name (code number and folder number) | Search: names of amino acids | Working file: No. | Working file: amino acid | Working file: description of the secondary structure
6 | 11 | 1101010101 | 34213_1101010101 | Ser | 6 | Lys | 1101010101
 | | | 44213_1101010101 | Lys | | |
5 | Asp | 0101010100 | 42132_0101010100 | | 5 | Asp | 0101010100
4 | Ala | | | | 4 | Ala |
3 | Pro | | | | 3 | Pro |
2 | Ser | | | | 2 | Ser |
1 | Leu | | | | 1 | Leu |
M) search the database for pentafragments containing four amino acids of each of the (N-1) pentafragments recorded in the working file and one new amino acid; the search algorithm consists of:
- selecting and storing the last four amino acids of each of the (N-1) pentafragments recorded in the working file;
- searching, in the database folder with the specified description of the secondary structure, for pentafragments containing the last four amino acids of each of the (N-1) pentafragments recorded in the working file and one new amino acid;
To do this, the program selects from the pentafragment recorded in the working file (table 10) four amino acids, written from top to bottom: Asp, Ala, Pro, Ser. Next, the program encodes them according to their symmetry groups and writes the code number from left to right, in the same way as the file index is formed but without the first amino acid: 4213. It then searches the database for pentafragments containing the four selected amino acids in the folder with the specified structure of the next pentafragment (1101010101), i.e. in the files X4213_1101010101, where X takes the values 1, 2, 3, 4 corresponding to the numbers of the symmetry groups (see table 5 of the description of the application): 14213_1101010101, 24213_1101010101, 34213_1101010101, 44213_1101010101.
As a result of the search, pentafragments were found that contain the last four amino acids Asp, Ala, Pro, Ser followed by a fifth amino acid, recorded together with the codes of the proteins from which they were obtained:
- in the file of group 1 (14213_1101010101): no pentafragments found;
- in the file of group 2 (24213_1101010101): no pentafragments found;
- in the file of group 3 (34213_1101010101): 1 - Ser;
- in the file of group 4 (44213_1101010101): 2 - Lys;
- when such pentafragments are found, the operator:
- selects one of the new amino acids and adds it to the last four amino acids of the previous pentafragment;
Of the amino acids found by the program, either Ser from the file of group 3 or Lys from the file of group 4 can be selected. The program allows only one option to be selected. The design will differ depending on the choice, which can only be established in the course of the design. The operator chose Lys as the fifth amino acid and entered this information into the program. Next, the program:
- records the new amino acid in the working file ("Working file" part of table 10), reflecting the designed primary structure of the protein (Lys);
- records the ten-digit folder number describing the secondary structure of each pentafragment found (1101010101);
Because a pentafragment was found, we omit the search steps that apply when no pentafragment is found in the folder. The actions of items L) and M) are then repeated until the end of the design process. As can be seen from table 11, at stages 11, 13 and 18 of the design process there was a choice of two or three amino acids; at the other stages the program found only one amino acid.
N) the sequence of amino acids obtained in the working file, with the corresponding description of its secondary structure, is considered to be the designed primary structure of the protein; it is stored in the working file of the PROTCOM program and presented in the right part of table 11.
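The search of item M) and the operator's choice can be sketched as follows, using the values of this example. This is an illustration under stated assumptions, not the PROTCOM implementation: `TOY_DB` is a hypothetical in-memory stand-in for the database files, the function names are invented, and only group numbers quoted in the text are used.

```python
# Group numbers quoted in example 1 (Lys belongs to the file of group 4).
GROUP = {"Asp": 4, "Ala": 2, "Pro": 1, "Ser": 3, "Leu": 2, "Lys": 4}

# Hypothetical stand-in for the database: file name -> pentafragments,
# each written from the newest amino acid to the oldest.
TOY_DB = {
    "34213_1101010101": [("Ser", "Asp", "Ala", "Pro", "Ser")],
    "44213_1101010101": [("Lys", "Asp", "Ala", "Pro", "Ser")],
}


def candidate_files(last_four: list[str], folder: str) -> list[str]:
    """File names X<code>_<folder> for the four symmetry groups X = 1..4."""
    code = "".join(str(GROUP[name]) for name in last_four)
    return [f"{x}{code}_{folder}" for x in (1, 2, 3, 4)]


def find_new_amino_acids(last_four, folder, db):
    """Collect the fifth amino acid of every pentafragment that matches."""
    found = []
    for name in candidate_files(last_four, folder):
        for fragment in db.get(name, []):
            if list(fragment[1:]) == list(last_four):
                found.append(fragment[0])
    return found


working_file = ["Asp", "Ala", "Pro", "Ser", "Leu"]  # newest amino acid first
options = find_new_amino_acids(working_file[:4], "1101010101", TOY_DB)
print(options)  # ['Ser', 'Lys']

chosen = "Lys"  # the operator's choice in example 1
working_file.insert(0, chosen)
print(working_file[:5])  # the new pentafragment: Lys, Asp, Ala, Pro, Ser
```

Repeating this step with the updated folder number for each stage reproduces the loop over items L) and M).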
Table 11. Subsequent search for pentafragments in the database
Input: No. | Input: amino acids or pairs of variables | Input: description of the secondary structure | Search: file name (code number and folder number) | Search: names of amino acids | Working file: No. | Working file: amino acid | Working file: description of the secondary structure
18 | 00 | 0010101010 | 11443_0010101010 | Gly | 18 | Gly | 0010101010
 | | | 41443_0010101010 | Lys | | |
17 | 10 | 1010101011 | 14433_1010101011 | Gly | 17 | Gly | 1010101011
16 | 10 | 1010101111 | 44334_1010101111 | Ile | 16 | Ile | 1010101111
15 | 10 | 1010111111 | 43341_1010111111 | Lys | 15 | Lys | 1010111111
14 | 10 | 1011111111 | 33414_1011111111 | Ser | 14 | Ser | 1011111111
13 | 11 | 1111111111 | 24144_1111111111 | Ala | 13 | Phe | 1111111111
 | | | 34144_1111111111 | Phe | | |
12 | 11 | 1111111111 | 41444_1111111111 | Val, Ile | 12 | Val | 1111111111
11 | 11 | 1111111111 | 14443_1111111111 | Gly | 11 | Gly | 1111111111
 | | | 24443_1111111111 | Ala | | |
 | | | 34443_1111111111 | Thr | | |
10 | 11 | 1111111111 | 44434_1111111111 | Lys | 10 | Lys | 1111111111
9 | 11 | 1111111101 | 44344_1111111101 | Val | 9 | Val | 1111111101
8 | 11 | 1111110101 | 43442_1111110101 | Asn | 8 | Asn | 1111110101
7 | 11 | 1111010101 | 34421_1111010101 | Thr | 7 | Thr | 1111010101
6 | 11 | 1101010101 | 34213_1101010101 | Ser | 6 | Lys | 1101010101
 | | | 44213_1101010101 | Lys | | |
5 | Asp | 0101010100 | 42132_0101010100 | | 5 | Asp | 0101010100
4 | Ala | | | | 4 | Ala |
3 | Pro | | | | 3 | Pro |
2 | Ser | | | | 2 | Ser |
1 | Leu | | | | 1 | Leu |
As seen from table 12, the designed primary structure of the protein in example 1 is most similar to the primary structure of the protein fragments NV and 3EOK. The designed primary structure from the 1st to the 10th amino acid is identical to the primary structure of the protein fragment NV from the 2nd to the 11th amino acid. At the same time, from the 11th to the 18th amino acid the designed primary structure is identical to the primary structure of the protein fragment 3EOK from the 12th to the 19th amino acid.
Table 12. Comparison of the primary structures of protein fragments
No. | Designed amino acid sequence of example 1 | No. of amino acids in proteins | NV | 3EOK | 3DHR | 3D4X
18 | Gly | 19 | Ala | Gly | Gly | Ser
17 | Gly | 18 | Gly | Gly | Gly | Gly
16 | Ile | 17 | Val | Ile | Ile | Ile
15 | Lys | 16 | Lys | Lys | Lys | Lys
14 | Ser | 15 | Gly | Ser | Ala | Gly
13 | Phe | 14 | Trp | Phe | Phe | Trp
12 | Val | 13 | Ala | Val | Val | Cys
11 | Gly | 12 | Ala | Gly | Ala | Ala
10 | Lys | 11 | Lys | Lys | Lys | Lys
9 | Val | 10 | Val | Val | Val | Val
8 | Asn | 9 | Asn | Asn | Asn | Asn
7 | Thr | 8 | Thr | Thr | Ser | Ser
6 | Lys | 7 | Lys | Lys | Lys | Lys
5 | Asp | 6 | Asp | Asp | Asp | Asp
4 | Ala | 5 | Ala | Ala | Asn | Ala
3 | Pro | 4 | Pro | Ala | Ala | Ala
2 | Ser | 3 | Ser | Ser | Ser | Ser
1 | Leu | 2 | Leu | Leu | Leu | Leu
Table 13 shows the two-dimensional description of the hydrogen bonds of the protein fragments presented in table 12, obtained with the help of Protein 3D from the files listed in table 12. It also gives the description of their secondary structure as ten-digit Boolean numbers, which is completely identical to the specified description of the secondary structure of the designed primary structure (table 11). Figures 1,a and 1,b present screenshots of fragments of the secondary structure of proteins NV and 3EOK obtained with the help of Protein 3D, together with the corresponding amino acid sequences of the primary structure. These figures show that the secondary structures of the fragments that make up the designed primary structure of example 1 overlap from the 7th to the 11th amino acid, and that the amino acid sequences of their primary structures overlap as well (shown in italics in the sequences). Consequently, the designed sequence is identical to the original fragments in secondary structure. Thus, example 1 presents a designed primary structure of a protein consisting of fragments of proteins NV and 3EOK, whose specified secondary structure fully coincides with the secondary structure of each of these proteins.
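The identity statements drawn from table 12 can be checked with a short sketch. The sequences are copied from table 12 and written from the first designed amino acid to the eighteenth; the variable and function names are hypothetical.

```python
# Columns of table 12, read from the bottom row (No. 1) to the top row (No. 18).
designed = "Leu Ser Pro Ala Asp Lys Thr Asn Val Lys Gly Val Phe Ser Lys Ile Gly Gly".split()
nv = "Leu Ser Pro Ala Asp Lys Thr Asn Val Lys Ala Ala Trp Gly Lys Val Gly Ala".split()
eok = "Leu Ser Ala Ala Asp Lys Thr Asn Val Lys Gly Val Phe Ser Lys Ile Gly Gly".split()


def differing_positions(a: list[str], b: list[str]) -> list[int]:
    """1-based positions of the designed sequence where a and b differ."""
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


# Designed positions 1-10 coincide with the fragment NV (protein residues 2-11).
print(designed[:10] == nv[:10])            # True
# Designed positions 11-18 coincide with 3EOK (protein residues 12-19).
print(designed[10:] == eok[10:])           # True
print(differing_positions(designed, eok))  # [3]
```

The last line shows that, in the data of table 12, the 3EOK fragment differs from the designed sequence only at designed position 3 (Pro versus Ala).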
Table 13. Fragments of proteins with the two-dimensional description of their hydrogen bonds
The first two columns give the description of the secondary structure of the protein fragments; the remaining columns give the secondary structure of the protein fragments from the Protein Data Bank. Hydrogen bonds of a residue are given in parentheses.
No. | Ten-digit number | NV | 3EOK | 3DHR | 3D4X
18 | 0010101010 | 19 ALA | 19 GLY | 19 GLY | 19 SER
17 | 1010101011 | 18 GLY (N - 14 TRP O) | 18 GLY (N - 14 PHE O) | 18 GLY (N - 14 PHE O) | 18 GLY (N - 14 TRP O)
16 | 1010101111 | 17 VAL (N - 13 ALA O) | 17 ILE (N - 13 VAL O) | 17 ILE (N - 13 VAL O) | 17 ILE (N - 13 CYS O)
15 | 1010111111 | 16 LYS (N - 12 ALA O) | 16 LYS (N - 12 GLY O) | 16 LYS (N - 12 ALA O) | 16 LYS (N - 12 ALA O)
14 | 1011111111 | 15 GLY (N - 11 LYS O) | 15 SER (N - 11 LYS O) | 15 ALA (N - 11 LYS O) | 15 GLY (N - 11 LYS O)
13 | 1111111111 | 14 TRP (O - 18 GLY N; N - 10 VAL O) | 14 PHE (O - 18 GLY N; N - 10 VAL O) | 14 PHE (O - 18 GLY N; N - 10 VAL O) | 14 TRP (O - 18 GLY N; N - 10 VAL O)
12 | 1111111111 | 13 ALA (O - 17 VAL N; N - 9 ASN O) | 13 VAL (O - 17 ILE N; N - 9 ASN O) | 13 VAL (O - 17 ILE N; N - 9 ASN O) | 13 CYS (O - 17 ILE N; N - 9 ASN O)
11 | 1111111111 | 12 ALA (O - 16 LYS N; N - 8 THR O) | 12 GLY (O - 16 LYS N; N - 8 THR O) | 12 ALA (O - 16 LYS N; N - 8 SER O) | 12 ALA (O - 16 LYS N; N - 8 SER O)
10 | 1111111111 | 11 LYS (O - 15 GLY N; N - 7 LYS O) | 11 LYS (O - 15 SER N; N - 7 LYS O) | 11 LYS (O - 15 ALA N; N - 7 LYS O) | 11 LYS (O - 15 GLY N; N - 7 LYS O)
9 | 1111111101 | 10 VAL (O - 14 TRP N; N - 6 ASP O) | 10 VAL (O - 14 PHE N; N - 6 ASP O) | 10 VAL (O - 14 PHE N; N - 6 ASP O) | 10 VAL (O - 14 TRP N; N - 6 ASP O)
8 | 1111110101 | 9 ASN (O - 13 ALA N; N - 5 ALA O) | 9 ASN (O - 13 VAL N; N - 5 ALA O) | 9 ASN (O - 13 VAL N; N - 5 ASN O) | 9 ASN (O - 13 CYS N; N - 5 ALA O)
7 | 1111010101 | 8 THR (O - 12 ALA N; N - 4 PRO O) | 8 THR (O - 12 GLY N; N - 4 ALA O) | 8 SER (O - 12 ALA N; N - 4 ALA O) | 8 SER (O - 12 ALA N; N - 4 ALA O)
6 | 1101010101 | 7 LYS (O - 11 LYS N; N - 3 SER O) | 7 LYS (O - 11 LYS N; N - 3 SER O) | 7 LYS (O - 11 LYS N; N - 3 SER O) | 7 LYS (O - 11 LYS N; N - 3 SER O)
5 | 0101010100 | 6 ASP (O - 10 VAL N) | 6 ASP (O - 10 VAL N) | 6 ASP (O - 10 VAL N) | 6 ASP (O - 10 VAL N)
4 | | 5 ALA (O - 9 ASN N) | 5 ALA (O - 9 ASN N) | 5 ASN (O - 9 ASN N) | 5 ALA (O - 9 ASN N)
3 | | 4 PRO (O - 8 THR N) | 4 ALA (O - 8 THR N) | 4 ALA (O - 8 SER N) | 4 ALA (O - 8 SER N)
2 | | 3 SER (O - 7 LYS N) | 3 SER (O - 7 LYS N) | 3 SER (O - 7 LYS N) | 3 SER (O - 7 LYS N)
1 | | 2 LEU | 2 LEU | 2 LEU | 2 LEU
Information about the secondary structure of proteins NV and 3EOK, which belong to the class of hemoglobins, has been published and is presented in table 14.
Table 14. List of proteins studied by X-ray analysis whose structure matches the specified secondary structure
No. | Protein code | Protein name and source | Literature
1 | NV | HEMOGLOBIN ALPHA SUBUNIT, HUMAN | G. Fermi, M.F. Perutz, B. Shaanan, R. Fourme. The crystal structure of human deoxyhaemoglobin at 1.74 angstroms resolution. J. Mol. Biol. v.175, p.159 (1984)
2 | 3EOK | HEMOGLOBIN ALPHA SUBUNIT, DUCK | Sathya Moorthy, K. Neelagandan, M. Balasubramanian, M.N. Ponnuswamy. Crystal Structure Determination of Duck (Anas Platyrhynchos) Hemoglobin at 2.1 Angstrom Resolution. To be published (structural data from the PDB)
Example 2. This example describes the design of the primary structure of a protein whose specified secondary structure is an inverted β-bend.
When carrying out the design of the primary structure of a protein with a specified secondary structure, on the basis of the amino acid sequence characterizing it and the description of the secondary structure, the following is performed:
A) create a database of amino acid pentafragments of proteins containing folders of pentafragments, where the initial list of folders is compiled by their names, formed from the binary-encoded descriptions of the hydrogen bonds of the peptide groups of the pentafragments in the secondary structure of proteins, and record it on any information carrier;
B) create a catalog of descriptions of secondary structures, containing the descriptions as sequences of ten-digit Boolean numbers;
V) enter into the computer memory the database of amino acid pentafragments of proteins recorded on the carrier;
G) set the description of the secondary structure of the designed primary structure of the protein as a sequence of ten-digit Boolean numbers, based on the catalog of descriptions of secondary structures;
In this example the specified secondary structure is an inverted β-bend. The operator finds the description in the catalog of secondary structures and fixes it (table).
Table 15. Description of the designed secondary structure for example 2
No. | Ten-digit number
14 | 0000000001
13 | 0000000100
12 | 0000010000
11 | 0001000000
10 | 0100000010
9 | 0000001000
8 | 0000100000
7 | 0010000000
6 | 1000000000
5 | 0000000000
4 |
3 |
2 |
1 |
D) determine and enter into the computer memory the initial sequence of five amino acids belonging to the group of twenty canonical amino acids of proteins, which is the specified initial pentafragment:
5 Val
4 Asp
3 Cys
2 Gly
1 Tyr
which is written in order from bottom to top: Tyr, Gly, Cys, Asp, Val.
E) determine and enter into the computer memory the description of the secondary structure of the specified initial pentafragment as a ten-digit binary number, namely the first ten-digit number in the specified description of the secondary structure, which corresponds to the name of the database folder containing the specified initial pentafragment: the ten-digit number 0000000000. As can be seen from table 15, it is the first ten-digit number in the specified description of the secondary structure.
Zh) load into the computer memory the PROTCOM program, which selects and searches for pentafragments of the designed protein in the database and records the names of the amino acids of the pentafragments found and the numbers of the database folders, describing the secondary structure, in which the sought pentafragments were found;
The program is installed as described in example 1.
Z) enter and store in the PROTCOM program the specified initial pentafragment of the designed protein as a sequence of five amino acids: the operator enters the amino acids that make up the specified initial pentafragment in order from the fifth to the first, i.e. from top to bottom: Val, Asp, Cys, Gly, Tyr.
I) enter and store in the PROTCOM program the specified description of the secondary structure of the specified initial pentafragment as a ten-digit binary number: the operator enters the sequence 0000000000 into the PROTCOM program.
K) search for the specified initial pentafragment of the designed protein in the database using the PROTCOM program previously loaded into the computer; the search algorithm includes:
- encoding the specified initial pentafragment for the database search;
The encoding is done by the program by assigning each amino acid of the specified initial pentafragment to one of the symmetry groups (table 5 of the description of the application). In this example: Val - 4, Asp - 4, Cys - 3, Gly - 1 and Tyr - 3.
This numeric sequence is recorded in the program memory from left to right as 44313 and is used to search for the specified initial pentafragment in the database folder 0000000000, in the file 44313_0000000000.
- searching for the specified initial pentafragment in the database folder with the specified description of the secondary structure of the pentafragment;
The program found the specified initial pentafragment in the file 44313_0000000000: Val Asp Cys Gly Tyr. This pentafragment was extracted from a text file produced by the PROTEIN 3D program by processing the atomic coordinates of a protein from the Protein Data Bank, and it has a β-structure, described as 0000000000, that contains no H-bonds within the pentafragment itself.
Tables 16, 17 and 18 illustrate the work of the program. In the left part, entitled "Input", the first column contains the sequence numbers of the designed amino acids entered into the PROTCOM program, and the second column contains the amino acids entered according to item Z) or the pairs of variables entered according to item L), selected by the operator on the basis of the specified secondary structure (table 15). The third column contains the description of the secondary structure entered into the program as ten-digit numbers. In the central part, entitled "Search for pentafragments in the database", the first column contains the names of the files, consisting of the code number and the number of the specified folder, and the second column contains the names of the amino acids found in the pentafragments. In the right part of the table, the PROTCOM program makes the record in the working file after the specified initial pentafragment has been found and, subsequently, after an amino acid has been selected in a file with the code number and the number of the specified folder.
- when the specified initial pentafragment is found in the folder, it is considered the first of the possible number N of pentafragments of the designed primary structure of the protein, and the following is performed:
- record the number of the database folder containing the first pentafragment;
- record the amino acid sequence of the first pentafragment in the working file of the program;
- record in the working file the ten-digit folder number describing the secondary structure of the first pentafragment found;
The program found the entered initial pentafragment in the file with the corresponding encoding and folder number and made an entry in the working file (table 16). Since the specified initial pentafragment was found, we omit the search steps that apply when the specified initial pentafragment is not found in the folder.
L) set the description of the secondary structure of each subsequent one of the (N-1) pentafragments, using the specified description of the secondary structure as a sequence of ten-digit Boolean numbers that correspond to the names of the database folders containing the specified pentafragments, by entering into the PROTCOM program the same or a changed ten-digit number describing the secondary structure of the previous pentafragment;
For this purpose, when the description of the secondary structure is being set, the PROTCOM program offers to enter a pair of variables 00, 01, 10 or 11. Table 15 shows that the next ten-digit number is 1000000000. For this reason, the operator selects 10 and enters the pair of variables 10 into the program (the column "Amino acids or pairs of variables" in table 17). The program adds 10 on the left and removes a pair of digits on the right, which changes the ten-digit number describing the secondary structure of the previous pentafragment, as reflected in the column "Description of the secondary structure" of table 17.
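Because each ten-digit number is obtained from the previous one by adding a pair on the left and removing a pair on the right, the pairs the operator has to enter can be read off the specified description. The sketch below does this for the numbers of table 15 (stages 5 to 14); the names `TABLE_15` and `pair_between` are hypothetical, and the consistency check is an illustration rather than a documented PROTCOM feature.

```python
# Ten-digit numbers of table 15, from stage 5 up to stage 14.
TABLE_15 = [
    "0000000000", "1000000000", "0010000000", "0000100000", "0000001000",
    "0100000010", "0001000000", "0000010000", "0000000100", "0000000001",
]


def pair_between(previous: str, current: str) -> str:
    """Return the pair that turns `previous` into `current`, or raise."""
    pair = current[:2]
    if pair + previous[:8] != current:
        raise ValueError(f"{current} cannot follow {previous}")
    return pair


pairs = [pair_between(a, b) for a, b in zip(TABLE_15, TABLE_15[1:])]
print(pairs)  # ['10', '00', '00', '00', '01', '00', '00', '00', '00']
```

The first entry, 10, is the pair selected by the operator in the step described above.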
M) search the database for pentafragments containing four amino acids of each of the (N-1) pentafragments recorded in the working file and one new amino acid; the search algorithm includes:
- selecting and storing the last four amino acids of each of the (N-1) pentafragments recorded in the working file;
- searching, in the database folder with the specified description of the secondary structure, for pentafragments containing the last four amino acids of each of the (N-1) pentafragments recorded in the working file and one new amino acid;
To do this, the program selects from the pentafragment recorded in the working file (table 16) four amino acids, written from top to bottom: Val, Asp, Cys, Gly. Next, the program encodes them according to their symmetry groups and writes the code number from left to right, in the same way as the file index is formed but without the first amino acid: 4431. It then searches the database for pentafragments containing the four selected amino acids in the folder with the specified structure of the next pentafragment (1000000000), i.e. in the files X4431_1000000000, where X takes the values 1, 2, 3, 4 corresponding to the numbers of the symmetry groups (see table 5 of the description of the application): 14431_1000000000, 24431_1000000000, 34431_1000000000, 44431_1000000000.
As a result of the search, pentafragments were found that contain the last four amino acids Val, Asp, Cys, Gly followed by a fifth amino acid, recorded together with the codes of the proteins from which they were obtained:
- in the file of group 1 (14431_1000000000): Gly;
- in the file of group 2 (24431_1000000000): no pentafragments found;
- in the file of group 3 (34431_1000000000): no pentafragments found;
- in the file of group 4 (44431_1000000000): no pentafragments found.
Note that no pentafragments were found in the files of groups 2, 3 and 4. The single amino acid Gly is therefore used for the design.
- when such pentafragments are found, the operator:
- selects one of the new amino acids and adds it to the last four amino acids of the previous pentafragment;
The operator chose Gly as the fifth amino acid and entered this information into the program. Next, the program:
- records the new amino acid in the working file ("Working file" part of table 17), reflecting the designed primary structure of the protein (Gly);
- records the ten-digit folder number describing the secondary structure of each pentafragment found (1000000000);
Because a pentafragment was found, we omit the search steps that apply when no pentafragment is found in the folder.
As follows from table 19, the primary structure designed in example 2 is identical to the amino acid sequence of the protein fragment 1AGD. Table 20 shows the two-dimensional description of the hydrogen bonds of the protein fragments presented in table 19, obtained with the help of Protein 3D from the files listed in table 19. The description of their secondary structure as ten-digit Boolean numbers is given in a separate column of table 20. For protein 1AGD it is completely identical to the specified description of the secondary structure of the designed primary structure of the protein of example 2 (table). It was also found that some parts of this sequence can be assembled from pentafragments of proteins 2R37 (No. 9), 3B02 (No. 11) and 1BAS (No. 12), which are unrelated to protein 1AGD. Table 20 gives the description of their secondary structure as ten-digit Boolean numbers, which fully coincides with the specified description of the secondary structure of the designed primary structure of example 2.
Table 20. Secondary structure of protein fragments
The first two columns give the description of the secondary structure of the protein fragment 1AGD; the remaining columns give the fragments of the secondary structure of proteins from the Protein Data Bank. Hydrogen bonds of a residue are given in parentheses.
No. | Ten-digit number | 1AGD | 2R37 | 3B02 | 1BAS
14 | 0000000001 | 112 GLY | | |
13 | 0000000100 | 111 ARG | | |
12 | 0000010000 | 110 LEU | | | 84 LEU
11 | 0001000000 | 109 LEU | | 39 LEU | 83 LEU
10 | 0100000010 | 108 ARG (O - 104 GLY N) | | 38 ARG (O - 34 LEU N) | 82 ARG (O - 78 LYS N)
9 | 0000001000 | 107 GLY | 192 GLY | 37 GLY | 81 GLY
8 | 0000100000 | 106 ASP | 191 ASP | 36 ASP | 80 ASP
7 | 0010000000 | 105 PRO | 190 PRO | 35 PRO |
6 | 1000000000 | 104 GLY (N - 108 ARG O) | 189 GLY (N - 193 ILE O) | |
5 | 0000000000 | 103 VAL | 188 VAL | |
4 | | 102 ASP | | |
3 | | 101 CYS | | |
2 | | 100 GLY | | |
1 | | 99 TYR | | |
Figures 2-4 present screenshots of fragments of the secondary structure of proteins 1AGD, 2R37, 3B02 and 1BAS obtained with the help of Protein 3D. Comparison of the protein fragment 1AGD with a fragment of protein 2R37 (figures 2,a and 2,b), with a fragment of protein 3B02 (figures 3,a and 3,b) and with a fragment of protein 1BAS (figures 4,a and 4,b) leads to the conclusion that their secondary structures are identical and that they are interchangeable. This means that it makes no difference whether the protein of example 2 is designed from pentafragments of protein 1AGD alone or from pentafragments obtained from the four different proteins 1AGD, 2R37, 3B02 and 1BAS. A general view of protein 1AGD, investigated by X-ray analysis, is shown in figure 5,a. The rectangle marks the region corresponding to the primary structure designed by the claimed method. Figure 5,b shows it in close-up, and figure 5,c gives a detailed view of the fragment of the secondary structure of protein 1GDJ corresponding to the specified secondary structure of example 2. These figures illustrate the presence of this fragment in a real protein. Thus, the primary structure of the protein designed in example 2 is confirmed in two variants. First variant: the primary structure consists only of pentafragments of protein 1AGD.
The specified secondary structure of the protein designed in example 2 is identical to the secondary structure of the 1AGD protein fragment. Second variant: the primary structure consists of pentafragments obtained from four different proteins: 1AGD, 2R37, 3B02 and 1BAS. The description of the secondary structures of the pentafragments of proteins 1AGD, 2R37, 3B02 and 1BAS also completely coincides with the specified description of the secondary structure of the primary structure designed in example 2. Information about the secondary structure of proteins 1AGD, 2R37, 3B02 and 1BAS has been published and is presented in table 21.
Table 21 The list of proteins whose structure was investigated by the X-ray method and matches the specified secondary structure
Stage №: 5-8, 10, 13, 14
Protein code: 1AGD
Protein name: Histocompatibility complex
Literature: S.W. Reid, S. McAdam, K.J. Smith, P. Klenerman, C.A. O'Callaghan, K. Harlos, B.K. Jakobsen, A.J. McMichael, J.I. Bell, D.I. Stuart, E.Y. Jones. Antagonist HIV-1 Gag peptides induce structural changes in HLA-B8. J. Exp. Med. V. 184, 2279, 1996. ASTM JEMEAV, US, ISSN 0022-1007, 0774. Resolution 2.05 Angstroms.
Stage №: 9
Protein code: 2R37
Protein name: Human glutathione peroxidase 3
Literature: E.S. Pilka, K. Guo, O. Gileadi, A. Rojkowa, F. von Delft, A.C.W. Pike, K.L. Kavanagh, C. Johannson, M. Sundstrom, C.H. Arrowsmith, J. Weigelt, A.M. Edwards, U. Oppermann. Crystal structure of human glutathione peroxidase 3 (selenocysteine to glycine mutant). No recorded citation in PubMed. Resolution 1.85 Angstroms.
Stage №: 11
Protein code: 3B02
Protein name: Transcriptional regulator, CRP family
Literature: Y. Agari, S. Kuramitsu, A. Shinkai. X-ray crystal structure of TTHB099, a CRP/FNR superfamily transcriptional regulator from Thermus thermophilus HB8, reveals a DNA-binding protein with no required allosteric effector molecule. Proteins (2012), to be published. Resolution 1.92 Angstroms.
Stage №: 12
Protein code: 1BAS
Protein name: Fibroblast growth factor
Literature: X. Zhu, H. Komiya, A. Chirino, S. Faham, G.M. Fox, T. Arakawa, B.T. Hsu, D.C. Rees. Three-dimensional structures of acidic and basic fibroblast growth factors. Science V. 251, 90, 1991. ASTM SCIEAS, US, ISSN 0036-8075, 038. Resolution 1.9 Angstroms.
Registration information for the databases and software used in the description of the application
"The database of protein pentafragments". Authors: V. Karasev, A.I. Belyaev, V.V. Luchinin. Certificate of state registration of the database №2010620364. Registered in the Register of databases on 7 July 2010.
"Computer program for constructing the primary structure of a protein with a given secondary structure" - "PROTCOM". Authors: V. Karasev, A.I. Belyaev, V.V. Luchinin. Certificate of state registration of the computer program №2011611105. Registered in the Register of computer programs on 2 February 2011.
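The two variants compared in example 2 (pentafragments from 1AGD alone, or from 1AGD, 2R37, 3B02 and 1BAS) can be checked mechanically at the sequence level. The sketch below rests on assumptions: the one-letter sequences are transcribed from the residue labels of table 20, a stage number is taken to be the position of the last residue of a pentafragment, and the helper names are hypothetical. It compares amino acid sequences only and does not verify the ten-digit secondary-structure descriptions.

```python
from typing import Dict, List


def pentafragments(seq: str) -> List[str]:
    """Return every window of five consecutive residues in seq."""
    return [seq[i:i + 5] for i in range(len(seq) - 4)]


def sources_per_stage(designed: str,
                      fragments_by_protein: Dict[str, str]) -> Dict[int, List[str]]:
    """For each stage, list the proteins whose fragment contains that pentafragment."""
    result: Dict[int, List[str]] = {}
    for i, window in enumerate(pentafragments(designed)):
        stage = i + 5  # assumption: the first pentafragment is complete at stage 5
        result[stage] = sorted(
            code for code, frag in fragments_by_protein.items() if window in frag
        )
    return result


# Sequences transcribed from table 20 (assumption: one-letter codes,
# 1AGD residues 99-112, 2R37 188-192, 3B02 35-39, 1BAS 80-84).
designed = "YGCDVGPDGRLLRG"
fragments = {
    "1AGD": "YGCDVGPDGRLLRG",
    "2R37": "VGPDG",
    "3B02": "PDGRL",
    "1BAS": "DGRLL",
}

table = sources_per_stage(designed, fragments)
for stage in sorted(table):
    print(stage, table[stage])
```

Under these assumptions every stage is covered by 1AGD, while stages 9, 11 and 12 are also covered by 2R37, 3B02 and 1BAS respectively, which agrees with the stage numbers listed for those proteins in table 21.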