RussianPatents.com

Method of designing primary structure of protein with specified secondary structure. RU patent 2511002.

IPC classes for Russian patent "Method of designing primary structure of protein with specified secondary structure" (RU 2511002):

G06F19/10 - Digital computing or data processing equipment or methods, specially adapted for specific applications (G06F0017000000 takes precedence; data processing systems or methods specially adapted for administrative, commercial, financial, managerial, supervisory or forecasting purposes G06Q)

G06F17/00 - Digital computing or data processing equipment or methods, specially adapted for specific functions

C12Q1/00 - Measuring or testing processes involving enzymes or micro-organisms (measuring or testing apparatus with condition measuring or sensing means, e.g. colony counters, C12M0001340000); Compositions therefor; Processes of preparing such compositions

FIELD: chemistry.

SUBSTANCE: invention relates to a computer method that uses biochemical databases in the design of novel protein compounds. Design is performed by an operator using the purpose-written software PROTCOM, based on a database of protein pentafragments. The design process consists of specifying and entering into the PROTCOM software an initial sequence of five amino acids (the specified initial pentafragment) and a ten-digit number, written in the binary system, that describes the secondary structure of the specified initial pentafragment. The sequence is searched for in the database folder whose number corresponds to the specified ten-digit number. The search is repeated until the specified initial pentafragment is found in the database. Once found, the pentafragment is considered the first of the possible number N of pentafragments of the designed primary protein structure, and it is recorded in the programme working file together with the ten-digit folder number describing its secondary structure. After that, the secondary structure of each of the following (N-1) pentafragments is specified by entering into the programme the same or a changed ten-digit number describing the secondary structure of the previous pentafragment, and the database is searched for pentafragments containing the last four amino acids of each of the (N-1) pentafragments recorded in the working file and one new amino acid. When such pentafragments are found, one of the new amino acids is selected and attached to the last four amino acids of the previous pentafragment, and the new amino acid and the ten-digit folder number describing the secondary structure of each found pentafragment are recorded in the working file. The amino acid sequence obtained in the working file, with the corresponding description of its secondary structure, is considered the designed primary structure of the protein.

EFFECT: claimed method of designing primary structure of protein considerably simplifies and accelerates the task of designing proteins with specified secondary structure.

5 dwg, 21 tbl, 2 ex
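
The procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the PROTCOM program: the dictionary layout of the database, the names `PentaDB`, `check_code`, `check_penta`, `new_amino_acids` and `design`, the one-letter amino acid codes, and the contents of `toy_db` (including its ten-digit numbers) are all assumptions invented for this example. The sketch only shows the idea of looking up an initial pentafragment in the folder named by its ten-digit binary number and then extending the chain one amino acid at a time.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

# Hypothetical in-memory stand-in for the pentafragment database:
# ten-digit binary folder number -> set of five-letter amino acid strings.
PentaDB = Dict[str, Set[str]]

# One-letter codes of the twenty canonical amino acids.
CANONICAL = set("ACDEFGHIKLMNPQRSTVWY")


def check_code(code: str) -> str:
    """Validate a ten-digit binary secondary-structure number."""
    if len(code) != 10 or set(code) - {"0", "1"}:
        raise ValueError(f"not a ten-digit binary number: {code!r}")
    return code


def check_penta(penta: str) -> str:
    """Validate a sequence of five canonical amino acids."""
    if len(penta) != 5 or set(penta) - CANONICAL:
        raise ValueError(f"not a canonical pentafragment: {penta!r}")
    return penta


def new_amino_acids(db: PentaDB, last4: str, code: str) -> List[str]:
    """Amino acids X such that last4 + X is stored in the folder `code`."""
    folder = db.get(check_code(code), set())
    return sorted(p[4] for p in folder if p[:4] == last4)


def design(
    db: PentaDB,
    initial: str,
    codes: List[str],
    choose: Callable[[List[str]], str] = lambda options: options[0],
) -> Optional[Tuple[str, List[Tuple[str, str]]]]:
    """codes[0] describes the initial pentafragment, codes[1:] each next one.

    Returns (sequence, working_file), or None when a search fails; in the
    described method the operator would then enter a new initial
    pentafragment or a changed ten-digit number and search again.
    """
    first_code = check_code(codes[0])
    if check_penta(initial) not in db.get(first_code, set()):
        return None
    sequence = initial
    working_file = [(initial, first_code)]
    for code in codes[1:]:
        options = new_amino_acids(db, sequence[-4:], code)
        if not options:
            return None
        amino_acid = choose(options)  # the operator's choice in the method
        sequence += amino_acid
        working_file.append((amino_acid, code))
    return sequence, working_file


# Invented toy data, for illustration only.
toy_db: PentaDB = {
    "1111111111": {"AEELL", "EELLK", "ELLKK"},
    "1111100000": {"LLKKG"},
}
codes = ["1111111111", "1111111111", "1111111111", "1111100000"]
sequence, working_file = design(toy_db, "AEELL", codes)
print(sequence)  # AEELLKKG
```

Each new pentafragment overlaps the previous one by four amino acids, so every search adds exactly one residue to the designed sequence. The `choose` callback stands in for the operator's selection among the candidate amino acids.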

The invention relates to a computer method that uses biochemical databases in the development of new protein compounds for the pharmaceutical, biotechnology and other industries, as well as for scientific research in medicine, biochemistry, molecular biology and genetics, where the use of new amino acid-based protein compounds is important.

The invention belongs to the field of protein engineering in molecular biology, whose tasks include creating the knowledge and methods needed to obtain proteins with a predetermined structure and function. One aspect of this field is the design (construction) of protein molecules. The design problem is the inverse of the problem of predicting protein structure. In prediction, the amino acid sequence is known, and the first stage is to find its secondary structure, i.e. the positions of α-helices, β-structural regions and turns. In design, we must specify a previously unknown amino acid sequence (the primary structure) intended to produce the desired spatial structure, so that after synthesis, under suitable conditions, it adopts the required order and size of α-helices, β-structural regions and turns.

Design of new proteins is, as a rule, carried out on the basis of existing methods of protein structure prediction, and the degree of success in designing new proteins with a predictable structure depends on the success of those methods. In most cases the published results are only a few successful examples out of a large number of failed variants that the authors do not mention.

Attempts to design protein structures on the basis of the general regularities of their formation are known. One of the first was the work of the DeGrado group (D.Eisenberg, W.Wilcox, S.M.Eshita, P.M.Pryciak, S.P.Ho, W.F.DeGrado. 1986. The design, synthesis and crystallization of an alpha-helical peptide. Proteins: Structure, Function, and Bioinformatics. V.1, Issue 1, pp.16-22). The authors proceeded from a simple idea: hydrophobic groups of protein structures should be minimized or hidden in the hydrophobic core, and hydrophilic groups should be kept in contact with the solvent. Based on these considerations, the authors designed and synthesized an artificial protein containing only a few amino acids (Leu, Glu, Lys) and consisting of four alpha-helices (W.F.DeGrado, L.Regan, S.P.. The Design of a Four-helix Bundle Protein. Cold Spring Harb Symp Quant Biol 1987. 52: 521-526).

However, this simplified approach does not make it possible to design proteins close to real, complex ones, which are composed of 20 different types of amino acids and have specified structural and functional properties.

The artificial protein albebetin was based on a structure that does not exist in nature and consists of two repeats of the α-β-β type (V.V.Chemeris, D.A.Dolgikh, A.N.Fedorov, A.V.Finkelstein, M.P.Kirpichnikov, V.N.Uversky, O.B.Ptitsyn. A new approach to artificial and modified proteins: theory-based design, synthesis in a cell-free system and fast testing of structural properties by radiolabels. Protein Eng. (1994) 7 (8): 1041-1052). Its structure was developed on the basis of a physical theory of the formation of protein secondary structure developed by the authors (Ptitsyn O.B., Finkelstein A.V. Theory of protein secondary structure and algorithm of its prediction. Biopolymers. 1983. V.22. P.15-25). Structural studies of albebetin showed that it has the secondary structure specified by the authors and is in a molten globule state. It should be noted that the accuracy of the approach used by the authors does not exceed 80%, which does not allow proteins with a specified structure to be designed with confidence. In practice they designed only one protein, and further investigations were discontinued.

To improve the predictive power of the known method based on physical potentials, it was proposed to introduce a number of parameters that take into account the properties of the amino acid sequence (A.M.Poole and R.Ranganathan. Knowledge-based potentials in protein design. Current Opinion in Structural Biology 2006, 16, 508-513). Using this method with the introduced parameters, the authors designed a number of proteins de novo (WO 2007030594, "Methods of using and analyzing biological sequence data", IPC G06F 19/22; G06F 19/18, publ. 15.03.2007). However, this approach is compilative in nature and provides only a slight improvement of the underlying methods, without changing the probabilistic nature of the original physical method.

An invention is known that relates to apparatus and methods for the quantitative design and optimization of protein structure (US 2002106694 "Apparatus and method for automated protein design", IPC SC 1/00; C07K 14/00; C12N 15/10; G06F 17/50; G06F 19/00, publ. 08.08.2002). The developed automated design method quantitatively accounts for the interactions of the side chains of surface residues by calculating three types of potentials and taking stereochemical restrictions into account. It made it possible to select, from a large number of variants, the protein FSD-1 with a ββα motif, based on the structure of a zinc-finger protein domain. The amino acid sequence of this protein has very little resemblance to that domain. Despite this, a study of the protein in solution by nuclear magnetic resonance showed that it forms a structure identical to the one proposed in its design (B.I.Dahiyat and S.L.Mayo. De Novo Protein Design: Fully Automated Sequence Selection. Science, (1997) 278, 82-87).

The disadvantage of this method is the need for a template protein, on the basis of which the new structure is selected from a large number of variants.

Using the Rosetta methodology, presented in (Kuhlman, Dantas G, Ireton GC, Varani G, Stoddard BL, D. Baker. Design of a novel globular protein fold with atomic-level accuracy. Science, 2003, 302(5649), 1364-8) and based on the optimization of selected structures, the artificial protein Top7, unknown in nature, was designed and synthesized, and its structure was confirmed experimentally. The core of Rosetta is a physical model of macromolecular interactions together with algorithms that search for the amino acid sequence with the lowest energy for a given protein structure. The authors applied the method (US 7574306 "Method and system for optimization of polymer sequences with stable, 3-dimensional conformations", IPC G06F 19/00, publ. 11.08.2009) to develop the structures of a number of other proteins. However, this method requires rather complex calculations and does not always lead to successful results. Its use also requires template structures.

Such methods do not solve the problem of creating a simple way to design new proteins with any specified structure and functional properties, and the need to use specific protein structures as templates limits the range of structures that can be designed.

The solution to this problem is especially important in the technology of pharmaceutical and immunological protein preparations.

The task addressed by the claimed invention is to develop a method of designing the primary structure of a protein that achieves the technical result of simplifying the method while expanding the range of structures that can be designed.

The proposed method of designing the primary structure of a protein, on the basis of a characterization of its amino acid sequence and a description of its secondary structure, is as follows:

A) create a database of amino acid phentaramine proteins containing folder interamente, and the source folder list composed by their names, formed on the basis of the encoded in binary descriptions of hydrogen bonds peptide groups phentaramine in the secondary structure of proteins, and record it on any information carrier;

B) enter in the memory of a computer-recorded information to the media database amino acid phentaramine proteins;

B) determine and enter it in the computer memory initial sequence of five amino acids belonging to the group of twenty canonical amino acids of proteins, which is the specified initial Pentagrammaton;

G) determine and enter it in the computer memory description secondary structure specified initial interamente in the form of ten-digit number in the binary system;

D) to introduce in the memory of a computer program PROTCOM to highlight and search phentaramine projected protein in the database and write the names of the amino acids found phentaramine and rooms folder database describing the secondary structure, which found the search interagency;

E) introduce and remember the specified initial interagent projected protein in the form of a sequence of five amino acids in the program PROTCOM;

G) introduce and remember specified secondary structure specified initial interamente in the form of ten-digit number in the binary system in the program PROTCOM;

C) search for the specified initial pentafragment of the designed protein in the database using the program PROTCOM previously loaded into the computer memory, where the search algorithm includes:

- encoding the specified initial pentafragment for the database search;

- searching for the specified initial pentafragment in the database folder with the specified secondary structure of the pentafragment;

- if the specified initial pentafragment is found in the folder, treating it as the first of the N possible pentafragments of the designed primary structure of the protein and performing:

- recording the number of the database folder containing the first pentafragment;

- recording the amino acid sequence of the first pentafragment in the working file of the program;

- recording in the working file the ten-digit folder number describing the secondary structure of the first pentafragment found;

- if the specified initial pentafragment is not found in the folder:

- determining and entering into the computer memory a new initial sequence of five amino acids belonging to the group of twenty canonical amino acids of proteins, which constitutes the new specified initial pentafragment;

- entering and storing in the program PROTCOM the new specified initial pentafragment of the designed protein as a sequence of five amino acids;

- searching for the new specified initial pentafragment of the designed protein in the database using the program PROTCOM previously loaded into the computer memory, where the search algorithm includes:

- encoding the new specified initial pentafragment for the database search;

- searching for the new specified initial pentafragment in the database folder with the specified secondary structure of the pentafragment;

- repeating the specification of a new initial pentafragment and the search for it until a pentafragment with this amino acid sequence is found in the database folder that describes the specified secondary structure of the pentafragment;

I) set the secondary structure of each of the following (N-1) pentafragments by entering into the program PROTCOM either the same ten-digit number that describes the secondary structure of the previous pentafragment or a modified one;

K) search the database for pentafragments that contain four amino acids of each of the (N-1) pentafragments recorded in the working file plus one new amino acid, where the search algorithm includes:

- selecting and storing the last four amino acids of each of the (N-1) pentafragments recorded in the working file;

- searching the database folder with the specified secondary structure for pentafragments that contain the last four amino acids of each of the (N-1) pentafragments recorded in the working file plus one more amino acid;

- if such pentafragments are found, performing:

- selecting one of the new amino acids and adding it to the last four amino acids of the previous pentafragment;

- recording the new amino acid in the working file, which reflects the designed primary structure of the protein;

- recording the ten-digit folder number describing the secondary structure of each pentafragment found;

- if no such pentafragments are found, performing:

- setting a modified secondary structure;

- selecting the last four amino acids for the subsequent pentafragment;

- searching the database folder with the modified secondary structure for pentafragments that contain the last four amino acids of the previous pentafragment plus one more amino acid;

- repeating the modification of the secondary structure and the database search until at least one pentafragment containing the four amino acids of the previous pentafragment is found;

L) the amino acid sequence obtained in the working file, together with the corresponding description of its secondary structure, is taken as the designed primary structure of the protein.

The method is as follows:

A) create a database of amino acid pentafragments of proteins consisting of folders of pentafragments, where the list of folders is built from folder names formed from binary-encoded descriptions of the hydrogen bonds (H-bonds) of the peptide groups of the pentafragments in the secondary structure of proteins, and record it on any information carrier;

a) download from the Protein Data Bank the publicly available files with the atomic coordinates of protein crystals studied by X-ray structural analysis. To create the initial database, 2500 protein files were downloaded.

b) using the computer program Protein 3D (computer program "Protein 3D", registered with RosAPO, No. 980143 of 03.05.98; authors: V. Karasev, Demchenko EL), create from the files obtained from the Protein Data Bank text files that contain the primary structure of the proteins together with a description of the H-bonds formed by the peptide groups of the main chains in the protein secondary structure;

c) using a set of programs for creating the database, carry out the following steps:

- cut the obtained primary structures of proteins into fragments of five amino acids (pentafragments) so that, moving from the bottom up, each subsequent fragment is shifted by one amino acid relative to the previous one and the information about the H-bonds of each fragment in the protein secondary structure is fully preserved. Table 1 shows, as an example, the cutting of a fragment of the text file of protein 1SCN (subtilisin Carlsberg). The table shows that the H-bonds in the pentafragments remain unchanged.

- sort pentafragments with a homologous structure of the H-bonds of their peptide groups in the protein secondary structure into folders, and give each folder a name that is the binary-encoded description of the H-bonds of the peptide groups. The presence of an H-bond is denoted by the digit "1" and its absence by the digit "0".

Each pentafragment has 5 peptide groups, and the H-bonds of each group are described by one of four pairs of variables: no H-bonds - 00, H-bond O...HN - 01, H-bond NH...O - 10, and two H-bonds, O...HN and NH...O - 11. The name of a folder that contains pentafragments of homologous structure therefore consists of 10 digits, each 0 or 1, read from the top down and written in a row from left to right.
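The pair coding described above can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the text, not the code of PROTEIN 3D or PROTCOM; the function names `pair_code` and `folder_name` and the tuple layout are assumptions made for the example.

```python
# Illustrative sketch only: names and data layout are assumptions,
# not part of PROTEIN 3D or PROTCOM.

def pair_code(has_nh_o: bool, has_o_hn: bool) -> str:
    """Two-digit code of one peptide group.

    First digit:  the group forms an NH...O bond (pair 10).
    Second digit: the group forms an O...HN bond (pair 01).
    """
    return ("1" if has_nh_o else "0") + ("1" if has_o_hn else "0")


def folder_name(groups) -> str:
    """Ten-digit folder name for five peptide groups listed top-down.

    Each element of `groups` is a (has_nh_o, has_o_hn) tuple.
    """
    groups = list(groups)
    if len(groups) != 5:
        raise ValueError("a pentafragment has exactly five peptide groups")
    return "".join(pair_code(nh_o, o_hn) for nh_o, o_hn in groups)


# Transition from beta-structure to alpha-helix: one pair 11, then four pairs 01.
transition = [(True, True)] + [(False, True)] * 4
print(folder_name(transition))  # 1101010101
```

Reading the five groups top-down and concatenating their pairs left to right gives the folder name directly, so pentafragments with the same H-bond pattern always land in the same folder.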

Examples of pentafragments and their ten-digit binary descriptions are shown in table 2. A pentafragment taken from a β-structure region (first row, left) contains no short-range H-bonds and is described by the number 0000000000. A pentafragment in the transition region from β-structure to α-helix (first row, right) contains one unit with both bonds, O...HN and NH...O (pair 11), and four units with O...HN bonds (pair 01), and is described by the number 1101010101. The central region of an α-helix, as shown in table 2, contains five units with both bonds (pair 11) and is described by the number 1111111111. The transition region from α-helix to β-structure contains four units with NH...O bonds (pair 10) and one unit with both bonds (pair 11), which gives the ten-digit number 1010101011. Finally, a bend of a β-structure with one H-bond, as follows from table 2, contains one unit with an NH...O bond (10), three units without H-bonds (00) and one unit with an O...HN bond (01), and is described by the number 1000000001.

When the database was created, the text files were processed by moving along the protein chain from the bottom up with a shift of one amino acid at each step, and each pentafragment received a ten-digit description. In table 1 these values are given in the second column from the right. This column contains a series of ten-digit descriptions of a region of protein 1CSN in which neighbouring pentafragments overlap by four of their five residues; each pentafragment goes into the database folder with the same number. The 10-digit numbers of pentafragments similar to those shown in table 2 are highlighted.

Table 1

Example of cutting an α-helical fragment of protein 1SCN into pentafragments

Protein: 1CSN. Each 10-digit description refers to the pentafragment formed by the residue in that row and the four residues below it. The sign → marks pentafragments similar to those shown in table 2.

Residue | H-bonds of the peptide group (text file) | 10-digit description | Pentafragment (stage of selection)
69 PRO | - | 0000000000 | 69-65
68 ILE | - | 0000000000 | 68-64
67 GLY | - | 0000000010 | 67-63
66 THR | - | 0000001010 | 66-62
65 CYS | - | 0000101010 | 65-61
64 GLY | - | 0010101010 | 64-60
63 ALA | 63 ALA N - 59 TYR O | 1010101011 | 63-59 →
62 LEU | 62 LEU N - 58 THR O | 1010101111 | 62-58
61 LEU | 61 LEU N - 57 ARG O | 1010111111 | 61-57
60 LYS | 60 LYS N - 56 TYR O | 1011111111 | 60-56
59 TYR | 59 TYR O - 63 ALA N; 59 TYR N - 55 GLU O | 1111111111 | 59-55 →
58 THR | 58 THR O - 62 LEU N; 58 THR N - 54 ASP O | 1111111101 | 58-54
57 ARG | 57 ARG O - 61 LEU N; 57 ARG N - 53 ARG O | 1111110101 | 57-53
56 TYR | 56 TYR O - 60 LYS N; 56 TYR N - 52 LEU O | 1111010101 | 56-52
55 GLU | 55 GLU O - 59 TYR N; 55 GLU N - 51 GLN O | 1101010101 | 55-51 →
54 ASP | 54 ASP O - 58 THR N | 0101010100 | 54-50
53 ARG | 53 ARG O - 57 ARG N | 0101010000 | 53-49
52 LEU | 52 LEU O - 56 TYR N | 0101000000 | 52-48
51 GLN | 51 GLN O - 55 GLU N | 0100000000 | 51-47
50 PRO | - | 0000000000 | 50-46
49 ALA | - | |
48 ASP | - | |
47 SER | - | |
46 ARG | - | |
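The cutting procedure of table 1 can be reproduced with a short sliding-window sketch. The H-bond sets below are transcribed from the text-file column of table 1; the variable and function names are hypothetical, and the code is only an illustration of the procedure, not the database-building programs themselves.

```python
# Illustrative sketch; names and data layout are assumptions.
# Residues of the fragment in table 1, listed top-down (69 ... 46).
NUMBERS = list(range(69, 45, -1))
NAMES = ["PRO", "ILE", "GLY", "THR", "CYS", "GLY", "ALA", "LEU",
         "LEU", "LYS", "TYR", "THR", "ARG", "TYR", "GLU", "ASP",
         "ARG", "LEU", "GLN", "PRO", "ALA", "ASP", "SER", "ARG"]

# Residues whose peptide group forms an NH...O bond (first digit of the pair)
# and an O...HN bond (second digit), as listed in table 1.
NH_O = set(range(55, 64))   # residues 55..63
O_HN = set(range(51, 60))   # residues 51..59


def pair(number: int) -> str:
    """Two-digit H-bond code of one residue."""
    return ("1" if number in NH_O else "0") + ("1" if number in O_HN else "0")


def cut_into_pentafragments(numbers, names):
    """Slide a five-residue window from the bottom of the chain upwards.

    Returns (top residue number, residue names top-down, 10-digit description)
    for every window, in bottom-up order.
    """
    fragments = []
    for start in range(len(numbers) - 5, -1, -1):
        window = numbers[start:start + 5]
        description = "".join(pair(n) for n in window)
        fragments.append((window[0], tuple(names[start:start + 5]), description))
    return fragments


DESCRIPTIONS = {top: d for top, _, d in cut_into_pentafragments(NUMBERS, NAMES)}
print(DESCRIPTIONS[63], DESCRIPTIONS[59], DESCRIPTIONS[55])
```

Each window shares four residues with its neighbour, which is why consecutive descriptions differ only by one pair entering on the left and one pair leaving on the right.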

The central parts of α-helices and β-structures of proteins are described by series of the repeated 10-digit numbers 1111111111 and 0000000000, respectively. The transition regions from β-structure to α-helix and from α-helix to β-structure are described by blocks of 10-digit numbers in which the composition of the pairs of variables changes gradually. Examples of such blocks are shown in table 3. The initial and final parts of the transitions and their 10-digit descriptions are in bold.

Table 4

Comparison of a bend of an α-helix with a break of one H-bond and a bend of a β-structure with one H-bond

Bend of an α-helix with a break of one H-bond (protein 1DOG)

Residue | H-bonds of the peptide group | 10-digit description | Pentafragment
334 GLN | - | |
333 TYR | 333 TYR O - 337 LYS N; 333 TYR N - 329 TYR O | 1111111111 | 333-329
332 LEU | 332 LEU O - 336 ASP N; 332 LEU N - 328 LEU O | 1111111101 | 332-328
331 ALA | 331 ALA O - 335 TRP N; 331 ALA N - 327 GLN O | 1111110111 | 331-327
330 ASP | 330 ASP O - 334 GLN N; 330 ASP N - 326 GLU O | 1111011111 | 330-326
329 TYR | 329 TYR O - 333 TYR N; 329 TYR N - 325 ALA O | 1101111111 | 329-325
328 LEU | 328 LEU O - 332 LEU N | 0111111110 | 328-324
327 GLN | 327 GLN O - 331 ALA N; 327 GLN N - 323 ALA O | 1111111011 | 327-323
326 GLU | 326 GLU O - 330 ASP N; 326 GLU N - 322 LEU O | 1111101111 | 326-322
325 ALA | 325 ALA O - 329 TYR N; 325 ALA N - 321 THR O | 1110111111 | 325-321
324 ALA | 324 ALA N - 320 CYS O | 1011111101 | 324-320
323 ALA | 323 ALA O - 327 GLN N; 323 ALA N - 319 LEU O | 1111110101 | 323-319
322 LEU | 322 LEU O - 326 GLU N; 322 LEU N - 318 PHE O | |
321 THR | 321 THR O - 325 ALA N; 321 THR N - 317 TRP O | |
320 CYS | 320 CYS O - 324 ALA N | |
319 LEU | 319 LEU O - 323 ALA N | |

Bend of a β-structure with one H-bond (protein 1GZM)

Residue | H-bonds of the peptide group | 10-digit description | Pentafragment
31 LEU | - | 0000000000 | 31-27
30 TYR | - | 0000000010 | 30-26
29 TYR | - | 0000001000 | 29-25
28 GLN | - | 0000100000 | 28-24
27 PRO | - | 0010000000 | 27-23
26 ALA | 26 ALA N - 22 SER O | 1000000001 | 26-22
25 GLU | - | 0000000100 | 25-21
24 PHE | - | 0000010000 | 24-20
23 PRO | - | 0001000000 | 23-19
22 SER | 22 SER O - 26 ALA N | 0100000000 | 22-18
21 ARG | - | 0000000000 | 21-17
20 VAL | - | |
19 VAL | - | |
18 GLY | - | |
17 THR | - | |

By combining these blocks, all types of protein secondary structure can be designed.

d) simplify the selected pentafragments by removing the information about the structure of their H-bonds and leaving only the sequence of five amino acids;

e) to simplify subsequent searches for pentafragments in the database, sort them into files that contain fragments with the same five-digit numeric index, which is assigned by attributing each amino acid of the pentafragment to one of four groups of symmetry transformations (V. Karasev, V.V. Luchinin, Introduction to designing bionic nano. - M.: Fizmatlit, 2009, 464 pp., Chapter 8). These groups are given in table 5.

Table 5

Distribution of amino acids among the antisymmetry groups

Symmetry group | Amino acids
Group 1 | Gly, Pro
Group 2 | Ala, Leu
Group 3 | Ser, Thr, Cys, Met, His, Trp, Phe, Tyr
Group 4 | Asp, Glu, Asn, Gln, Arg, Lys, Val, Ile

The file name records the five-digit code and the name of the folder in which the file is located. If the pentafragment

Efg
Def
Cde
Bcd
Abc

is described by the 10-digit number 0000000000, its index is formed by reading from the top down and writing from left to right: for example, if amino acid Efg belongs to group 1, Def to group 2, Cde to group 3, Bcd to group 4 and Abc to group 1, the 5-digit code is 12341 and the file name is 12341_0000000000.
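As a minimal sketch of this naming rule, the group assignment of table 5 can be held in a lookup table and the file name built from it. The names `GROUPS`, `group_code` and `file_name` are assumptions made for the illustration; the actual database programs are not described at this level of detail.

```python
# Illustrative sketch; the names below are assumptions, not PROTCOM identifiers.
GROUPS = {
    1: ("Gly", "Pro"),
    2: ("Ala", "Leu"),
    3: ("Ser", "Thr", "Cys", "Met", "His", "Trp", "Phe", "Tyr"),
    4: ("Asp", "Glu", "Asn", "Gln", "Arg", "Lys", "Val", "Ile"),
}
# Reverse lookup: amino acid -> group number (table 5).
GROUP_OF = {aa: group for group, members in GROUPS.items() for aa in members}


def group_code(amino_acids_top_down) -> str:
    """Group digits of the amino acids, read top-down, written left to right."""
    return "".join(str(GROUP_OF[aa]) for aa in amino_acids_top_down)


def file_name(pentafragment_top_down, folder: str) -> str:
    """File name '<5-digit code>_<10-digit folder name>'."""
    if len(pentafragment_top_down) != 5:
        raise ValueError("a pentafragment has exactly five amino acids")
    if len(folder) != 10 or set(folder) - {"0", "1"}:
        raise ValueError("the folder name must be ten binary digits")
    return f"{group_code(pentafragment_top_down)}_{folder}"


# Groups 1, 2, 3, 4, 1 as in the example of the text.
print(file_name(["Gly", "Ala", "Ser", "Asp", "Pro"], "0000000000"))  # 12341_0000000000
```

Because the code depends only on group membership, one file holds every pentafragment with the same group pattern, which narrows a search to a single small file per candidate group.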

The resulting database contains more than 500 thousand pentafragments sorted into more than 500 folders. The database is organized as a system of 16 hypercubes isomorphic to the Boolean hypercube B6 (Database of protein pentafragments. Authors: Vairaki, AIESEC, Win. Registered on July 7, 2010 with the Federal Agency ROSPATENT, No. 2010620364).

The database is continually updated by processing new files from the Protein Data Bank. A theoretical database can also be created.

B) load into the computer memory the database of amino acid pentafragments of proteins recorded on the information carrier;

C) determine and enter into the computer memory an initial sequence of five amino acids belonging to the group of twenty canonical amino acids of proteins, which constitutes the specified initial pentafragment;

The intended initial sequence of five amino acids is presented as a column of three-letter amino acid abbreviations with their numbers on the left, written from the bottom up:

5 Efg
4 Def
3 Cde
2 Bcd
1 Abc

G) determine and enter into the computer memory a description of the secondary structure of the specified initial pentafragment in the form of a ten-digit binary number;

E) enter and store in the program PROTCOM the specified initial pentafragment of the designed protein as a sequence of five amino acids;

The operator enters the intended sequence of five amino acids (the specified initial pentafragment) into the program.

The amino acids are entered into the program from top to bottom, starting with the fifth amino acid and ending with the first: Efg, Def, Cde, Bcd, Abc.

G) enter and store in the program PROTCOM the specified secondary structure of the specified initial pentafragment as a ten-digit binary number;

Example of a ten-digit number entered: 0000000000

C) search for the specified initial pentafragment of the designed protein in the database using the program PROTCOM previously loaded into the computer memory, where the search algorithm includes:

- encoding the specified initial pentafragment for the database search;

The program reads the amino acids of the pentafragment from the top down, encodes each according to the symmetry group it belongs to, and writes the code from left to right in the same way as the file index was created, for example: Efg - 1, Def - 2, Cde - 3, Bcd - 4, Abc - 4, code number 12344.

- if the specified initial pentafragment is found in the folder, treating it as the first of the N possible pentafragments of the designed primary structure of the protein and performing:

- recording the number of the database folder containing the first pentafragment;

- recording the amino acid sequence of the first pentafragment in the working file of the program;

- recording in the working file the ten-digit folder number describing the secondary structure of the first pentafragment found;

The format of the working file created by the program PROTCOM is shown in table 6.

Table 6

Format of the working file created by PROTCOM

1 | 2 | 3
N | STP |
. | ... | bbbbbbbbbb
. | ... | ..........
5 | Efg | bbbbbbbbbb
4 | Def |
3 | Cde |
2 | Bcd |
1 | Abc |

The amino acid sequence of the protein is written into the working file from the bottom up, which reflects the order of protein synthesis on the ribosome (the protein is extended by adding amino acids to the top amino acid). The columns of the file have the following functions:

1 - numbers of the amino acids in the designed protein, written from the bottom up;

2 - the sequence of amino acids in the designed protein, written from the bottom up in three-letter notation;

3 - ten-digit numbers of the database folders (bbbbbbbbbb) describing the secondary structure of the designed pentafragments, written from the bottom up.

Line N contains the signal for the end of the protein sequence (STP).

The first pentafragment and the ten-digit number of the folder in which it was found are in bold.

- if the specified initial pentafragment is not found in the folder:

- entering and storing in the program PROTCOM the new specified initial pentafragment of the designed protein as a sequence of five amino acids;

- searching for the new specified initial pentafragment of the designed protein in the database using the program PROTCOM previously loaded into the computer memory, where the search algorithm includes:

- encoding the new specified initial pentafragment for the database search;

- searching for the new specified initial pentafragment in the database folder with the specified secondary structure of the pentafragment;

- repeating the specification of a new initial pentafragment and the search for it until a pentafragment with this amino acid sequence is found in the database folder that describes the specified secondary structure of the pentafragment.

I) set the secondary structure of each of the following (N-1) pentafragments recorded in the working file by entering into the program PROTCOM either the same ten-digit number that describes the secondary structure of the previous pentafragment or a modified one;

K) search the database for pentafragments that contain four amino acids of each of the (N-1) pentafragments recorded in the working file plus one new amino acid, where the search algorithm includes:

- selecting and storing the last four amino acids of each of the (N-1) pentafragments recorded in the working file;

For example, in table 7 the last four amino acids of the previous pentafragment and the description of the secondary structure entered for the search of a new pentafragment are in bold.

- if such pentafragments are found, performing:

- selecting one of the new amino acids and adding it to the last four amino acids of the previous pentafragment;

Table 7

Selection of amino acids and their secondary structures for the pentafragment search in the database

1 | 2 | 3
. | ... | ......
6 | | 0000000000
5 | Efg | 0000000000
4 | Def |
3 | Cde |
2 | Bcd |
1 | Abc |

- recording the new amino acid in the working file, which reflects the designed primary structure of the protein;

- recording the ten-digit folder number describing the secondary structure of each pentafragment found;

- if no such pentafragments are found, performing:

- setting a modified secondary structure;

- selecting the last four amino acids for the subsequent pentafragment;

- searching the database folder with the modified secondary structure for pentafragments that contain the last four amino acids of the previous pentafragment plus one more amino acid;

- repeating the modification of the secondary structure and the database search until at least one pentafragment containing the four amino acids of the previous pentafragment is found;

L) the amino acid sequence obtained in the working file, together with the corresponding description of its secondary structure, is taken as the designed primary structure of the protein.
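The extension step of items K) and L) can be pictured with the following sketch, which works on a toy in-memory database. It is a simplified illustration under stated assumptions, not the PROTCOM implementation: the dictionary layout, the function names `find_candidates` and `extend`, and the `choose` callback standing in for the operator are all hypothetical. The two database entries are the ones reported in Example 1.

```python
# Illustrative sketch with a toy database; all names are assumptions.
GROUPS = {
    1: ("Gly", "Pro"),
    2: ("Ala", "Leu"),
    3: ("Ser", "Thr", "Cys", "Met", "His", "Trp", "Phe", "Tyr"),
    4: ("Asp", "Glu", "Asn", "Gln", "Arg", "Lys", "Val", "Ile"),
}
GROUP_OF = {aa: g for g, members in GROUPS.items() for aa in members}

# Toy database: file name -> pentafragments written top-down.
TOY_DB = {
    "34213_1101010101": [("Ser", "Asp", "Ala", "Pro", "Ser")],
    "44213_1101010101": [("Lys", "Asp", "Ala", "Pro", "Ser")],
}


def find_candidates(chain_top_down, folder, database):
    """New amino acids that can be placed on top of the chain in `folder`."""
    last_four = tuple(chain_top_down[:4])
    tail_code = "".join(str(GROUP_OF[aa]) for aa in last_four)
    found = []
    for group in (1, 2, 3, 4):  # group of the unknown new amino acid
        name = f"{group}{tail_code}_{folder}"
        for fragment in database.get(name, []):
            if tuple(fragment[1:]) == last_four:
                found.append(fragment[0])
    return found


def extend(chain_top_down, folders_to_try, database, choose):
    """Try the requested folder first, then the modified ones, until a hit."""
    for folder in folders_to_try:
        found = find_candidates(chain_top_down, folder, database)
        if found:
            return [choose(found)] + list(chain_top_down), folder
    raise LookupError("no pentafragment found in any of the folders tried")


chain = ["Asp", "Ala", "Pro", "Ser", "Leu"]  # initial pentafragment, top-down
new_chain, used = extend(chain, ["0001010101", "1101010101"], TOY_DB,
                         choose=lambda found: found[-1])
print(new_chain[0], used)
```

In this toy run the first folder yields nothing, so the description is modified and the second folder supplies the candidates Ser and Lys, of which the callback picks one, mirroring the operator's choice in the method.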

As a result of the actions of the program PROTCOM and of the operator designing the protein, the working file is completely filled: the second column contains the primary structure of the protein, and the third column contains the folder numbers from which the secondary structure of this protein is judged. A run of consecutive folders 0000000000 in the third column characterizes a fragment as β-structural. Several consecutive folders numbered 1111111111 can be attributed to a fragment of α-helix (see table 2). The transition regions between the α-helical and β-structural conformations, as well as bends of β-structures (tables 2-4), are designed and described by the corresponding folders.

The description of the application is illustrated by the following graphic materials:

Fig. 1. Screenshots of fragments of the secondary structure of proteins NV and 3EOK obtained with PROTEIN 3D.

a - NV (human); b - 3EOK (duck);

Fig. 2. Screenshots of fragments of the secondary structure of proteins 1AGD and 2R37 obtained with PROTEIN 3D for the designed region of the protein.

a - pentafragment of 1AGD (103-107); b - pentafragment of 2R37 (189-193).

Fig. 3. Screenshots of fragments of the secondary structure of proteins 1AGD and V obtained with PROTEIN 3D for the designed region of the protein.

a - pentafragment of 1AGD (105-109); b - pentafragment of V (35-39).

Fig. 4. Screenshots of fragments of the secondary structure of proteins 1AGD and 1BAS obtained with PROTEIN 3D for the designed region of the protein.

a - pentafragment of 1AGD (106-110); b - pentafragment of 1BAS (80-84).

Fig. 5. Screenshots, obtained with PROTEIN 3D, of protein 1AGD studied by X-ray structural analysis.

a - the protein; b - view of a protein fragment; c - detailed view of the secondary structure of protein 1GDJ corresponding to the specified secondary structure of example 2.

The method is illustrated by examples.

Example 1.

This example describes the design of the primary structure of a protein whose specified secondary structure is an α-helix comprising the transition from β-structure to α-helix, the central region of the α-helix, and the transition from α-helix to β-structure.

To carry out the method of designing the primary structure of a protein with a specified secondary structure on the basis of its amino acid sequence and the description of its secondary structure, the following is performed:

A) create a database of amino acid pentafragments of proteins consisting of folders of pentafragments, where the list of folders is built from folder names formed from binary-encoded descriptions of the hydrogen bonds of the peptide groups of the pentafragments in the secondary structure of proteins, and record it on any information carrier;

B) create a catalog of descriptions of secondary structures, containing the descriptions as series of ten-digit Boolean numbers;

B) load into the computer memory the database of amino acid pentafragments of proteins recorded on the information carrier;

D) specify the description of the secondary structure of the designed primary structure of the protein as a series of ten-digit Boolean numbers, based on the catalog of descriptions of secondary structures;

In this example the specified α-helical secondary structure contains the transition from β-structure to α-helix, the central region of the α-helix and the transition from α-helix to β-structure. The operator finds the description in the catalog of secondary structures and records it (table 8).

Table 8

Description of the designed secondary structure for example 1

0010101010 | Transition from α-helix to β-structure
1010101011 |
1010101111 |
1010111111 |
1011111111 |
1111111111 | Central part of the α-helix
1111111111 |
1111111101 | Transition from β-structure to α-helix
1111110101 |
1111010101 |
1101010101 |
0101010100 |

4 3 2 1

D) determine and enter into the computer memory an initial sequence of five amino acids belonging to the group of twenty canonical amino acids of proteins, which constitutes the specified initial pentafragment:

5 Asp
4 Ala
3 Pro
2 Ser
1 Leu

which, written in order from the bottom up, reads Leu, Ser, Pro, Ala, Asp.

E) determine and enter into the computer memory the description of the secondary structure of the specified initial pentafragment as a ten-digit binary number. This is the first ten-digit number of the specified description of the secondary structure and corresponds to the name of the database folder that contains the specified initial pentafragment: the ten-digit number 0101010100. As can be seen from table 8, it is the first ten-digit number in the specified description of the secondary structure.

It corresponds to a pentafragment with four H-bonds C=O...HN, each described by the pair of variables 01, and one pair of variables 00 (no H-bonds); see table 3 (transition from β-structure to α-helix).

W) load into the computer memory the program PROTCOM, which selects and searches for pentafragments of the designed protein in the database and records the names of the amino acids of the pentafragments found and the numbers of the database folders describing the secondary structure in which the sought pentafragments were found;

1. The program is installed in a dedicated folder, in which it creates the working files containing the designed primary structure of the protein and the description of its secondary structure as ten-digit binary numbers.

3. At start-up the program displays, in table form, the system of twenty amino acids divided into four groups.

C) enter and store in the program PROTCOM the specified initial pentafragment of the designed protein as a sequence of five amino acids: the operator enters the amino acids of the specified initial pentafragment in order from the fifth to the first, that is, top-down: Asp, Ala, Pro, Ser, Leu.

I) enter and store in the program PROTCOM the specified description of the secondary structure of the specified initial pentafragment as a ten-digit binary number: the operator enters the sequence 0101010100 into the program PROTCOM.

K) search for the specified initial pentafragment of the designed protein in the database using the program PROTCOM previously loaded into the computer memory, where the search algorithm includes:

- encoding the specified initial pentafragment for the database search;

The program performs the encoding by assigning each amino acid of the specified initial pentafragment to one of the symmetry groups (table 5 of the description).

In this example: Asp - 4, Ala - 2, Pro - 1, Ser - 3, Leu - 2. This numeric sequence is stored in the program memory from left to right as 42132 and is used to search for the specified initial pentafragment in folder 0101010100 of the database, in file 42132_0101010100.

- searching for the specified initial pentafragment in the database folder with the specified description of the secondary structure of the pentafragment;

The program found the specified initial pentafragment in file 42132_0101010100:

Asp Ala Pro Ser Leu

This pentafragment was extracted from a text file produced by the program PROTEIN 3D from the atomic coordinates of a protein from the Protein Data Bank, and has the structure 0101010100 of the transition region from β-structure to α-helix (see table 8).

Tables 9, 10 and 11 illustrate the operation of the program. In the left part, entitled "Enter", the first column contains the sequence numbers of the designed amino acids entered into the program PROTCOM, and the second column contains either the amino acids entered according to step C) or the pairs of variables entered according to step L), selected by the operator on the basis of the specified secondary structure (table 8). The third column contains the description of the secondary structure entered into the program, as ten-digit numbers. In the central part, entitled "Search for the pentafragment in the database", the first column contains the names of the files, made up of the code number and the number of the specified folder, and the second column contains the names of the amino acids found in the pentafragment. The right part of the table shows the entry made by the program PROTCOM in the working file after the specified initial pentafragment has been found and, subsequently, after an amino acid has been selected from a file with the given code number and folder number.

Table 9

Search for the specified initial pentafragment in the database

Column groups: Enter (columns 1-3); Search for the pentafragment in the database (columns 4-5); Entry in the working file (columns 6-8)

No. | Amino acid or pair of variables | Description of the secondary structure | File name (code number and number of the specified folder) | Names of amino acids | No. | Name of amino acid | Description of the secondary structure
5 | Asp | 0101010100 | 42132_0101010100 | | 5 | Asp | 0101010100
4 | Ala | | | | 4 | Ala |
3 | Pro | | | | 3 | Pro |
2 | Ser | | | | 2 | Ser |
1 | Leu | | | | 1 | Leu |

- if the specified initial pentafragment is found in the folder, treating it as the first of the N possible pentafragments of the designed primary structure of the protein and performing:

- recording the number of the database folder containing the first pentafragment;

- recording the amino acid sequence of the first pentafragment in the working file of the program;

- recording in the working file the ten-digit folder number describing the secondary structure of the first pentafragment found;

The program found the entered initial pentafragment in the file with the corresponding code and folder number and made an entry in the working file (table 9).

Since the specified initial pentafragment was found, the search steps that apply when the specified initial pentafragment is not found in the folder are omitted.

L) set the description of the secondary structure for each subsequent one of the (N-1) pentafragments, using the specified description of the secondary structure as a sequence of ten-digit Boolean numbers that correspond to the names of the database folders containing the specified pentafragments, by entering into the program PROTCOM the same or a modified ten-digit number that describes the secondary structure of the previous pentafragment;

To this end, when the description of the secondary structure is being set, the program PROTCOM prompts for a pair of variables: 00, 01, 10 or 11. Table 8 shows that the next ten-digit number is 1101010101. The operator therefore chooses 11 and enters this pair of variables into the program (column "Amino acids or pairs of variables" in table 10). The program adds 11 on the left and removes the pair of digits on the right, which changes the ten-digit number describing the secondary structure of the previous pentafragment, as shown in the column "Description of the secondary structure" of table 10.
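The shift described here (a new pair added on the left, the rightmost pair dropped) is easy to check with a small sketch. The function name `next_description` is an assumption for illustration; the pairs in the loop are those listed for amino acids 6 to 18 in the "Amino acids or pairs of variables" column of table 11.

```python
# Illustrative sketch; the function name is an assumption.
VALID_PAIRS = {"00", "01", "10", "11"}


def next_description(previous: str, pair: str) -> str:
    """Add `pair` on the left of the 10-digit number and drop the rightmost pair."""
    if pair not in VALID_PAIRS:
        raise ValueError("the pair must be 00, 01, 10 or 11")
    if len(previous) != 10 or set(previous) - {"0", "1"}:
        raise ValueError("the description must be ten binary digits")
    return pair + previous[:-2]


# Pairs entered for amino acids 6..18 in example 1 (table 11).
pairs = ["11"] * 8 + ["10"] * 4 + ["00"]
description = "0101010100"  # folder of the initial pentafragment
history = [description]
for p in pairs:
    description = next_description(description, p)
    history.append(description)

print(history[1])   # 1101010101
print(history[-1])  # 0010101010
```

Starting from 0101010100, the thirteen entered pairs reproduce every folder number of the right-hand column of table 11, ending at 0010101010 for amino acid 18.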

Table 10

Search for the second pentafragment in the database

Column groups: Enter (columns 1-3); Search for the pentafragment in the database (columns 4-5); Entry in the working file (columns 6-8)

No. | Amino acid or pair of variables | Description of the secondary structure | File name (code number and number of the specified folder) | Names of amino acids | No. | Name of amino acid | Description of the secondary structure
6 | 11 | 1101010101 | 34213_1101010101 | Ser | 6 | Lys | 1101010101
 | | | 44213_1101010101 | Lys | | |
5 | Asp | 0101010100 | 42132_0101010100 | | 5 | Asp | 0101010100
4 | Ala | | | | 4 | Ala |
3 | Pro | | | | 3 | Pro |
2 | Ser | | | | 2 | Ser |
1 | Leu | | | | 1 | Leu |

M) search the database for pentafragments that contain four amino acids of each of the (N-1) pentafragments recorded in the working file plus one new amino acid, where the search algorithm includes:

- selecting and storing the last four amino acids of each of the (N-1) pentafragments recorded in the working file;

- searching the database folder with the specified description of the secondary structure for pentafragments that contain the last four amino acids of each of the (N-1) pentafragments recorded in the working file plus one more amino acid;

To do this, the program selects from the pentafragment recorded in the working file (table 10) four amino acids, written from top to bottom: Asp, Ala, Pro, Ser.

Next, the program encodes them according to the symmetry groups they belong to and writes the code from left to right in the same way as the file index is formed, but without the first amino acid: 4213. It then searches the database for pentafragments containing the four selected amino acids in the folder with the structure specified for the next pentafragment (1101010101), i.e. in the files X4213_1101010101, where X can take the values 1, 2, 3 and 4, corresponding to the numbers of the symmetry groups (see table 5 of the description): 14213_1101010101, 24213_1101010101, 34213_1101010101, 44213_1101010101.
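The four candidate file names can be generated mechanically from the four-digit tail code and the folder number. The helper below is a hypothetical illustration of that enumeration only; it does not open or read any database files.

```python
# Illustrative sketch; `candidate_files` is a hypothetical helper name.

def candidate_files(tail_code: str, folder: str):
    """File names X<tail_code>_<folder> for the four symmetry groups X = 1..4."""
    if len(tail_code) != 4 or set(tail_code) - set("1234"):
        raise ValueError("the tail code must be four group digits (1-4)")
    if len(folder) != 10 or set(folder) - {"0", "1"}:
        raise ValueError("the folder name must be ten binary digits")
    return [f"{x}{tail_code}_{folder}" for x in (1, 2, 3, 4)]


for name in candidate_files("4213", "1101010101"):
    print(name)
```

Only four files have to be inspected at each step, regardless of the size of the folder, because the group of the unknown fifth amino acid is the only free digit in the code.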

The search found pentafragments containing the last four amino acids Asp, Ala, Pro, Ser and the following fifth amino acid, recorded together with the codes of the proteins from which they were obtained:

- in the file of group 1 (14213_1101010101): no pentafragments found;

- in the file of group 2 (24213_1101010101): no pentafragments found;

- in the file of group 3 (34213_1101010101): 1 - Ser;

- in the file of group 4 (44213_1101010101): 2 - Lys;

- if such pentafragments are found, the operator performs:

- selecting one of the new amino acids and adding it to the last four amino acids of the previous pentafragment;

Of the amino acids found by the program, either Ser from the file of group 3 or Lys from the file of group 4 can be selected. The program allows only one option to be selected. The result of the design depends on this choice, and the difference becomes apparent only as the design proceeds. The operator chose Lys as the fifth amino acid of the new pentafragment and entered it into the program.

Next, the program performs:

- recording the new amino acid in the working file (part "Entry in the working file" of table 10), which reflects the designed primary structure of the protein (Lys);

- recording the ten-digit folder number describing the secondary structure of each pentafragment found (1101010101);

Since a pentafragment was found, the search steps that apply when no pentafragment is found in the folder are omitted.

Next, the actions of steps L) and M) are repeated until the end of the design process. As can be seen from table 11, at stages 11, 13 and 18 of the design there was a choice of two or three amino acids, whereas at the other stages the program found only one amino acid.

N) the amino acid sequence obtained in the working file, with the corresponding description of its secondary structure, stored in the working file of the program PROTCOM and presented in the right part of table 11, is taken as the designed primary structure of the protein.

Table 11

Sequential search for pentafragments in the database

Input: No. | Input: amino acid or pair of variables | Input: description of the secondary structure | Search in the database: file name (code number and folder number) | Search in the database: amino acid found | Entry in the working file: No. | Entry in the working file: amino acid | Entry in the working file: description of the secondary structure
18 | 00 | 0010101010 | 11443_0010101010 | Gly | 18 | Gly | 0010101010
 | | | 41443_0010101010 | Lys | | |
17 | 10 | 1010101011 | 14433_1010101011 | Gly | 17 | Gly | 1010101011
16 | 10 | 1010101111 | 44334_1010101111 | Ile | 16 | Ile | 1010101111
15 | 10 | 1010111111 | 43341_1010111111 | Lys | 15 | Lys | 1010111111
14 | 10 | 1011111111 | 33414_1011111111 | Ser | 14 | Ser | 1011111111
13 | 11 | 1111111111 | 24144_1111111111 | Ala | 13 | Phe | 1111111111
 | | | 34144_1111111111 | Phe | | |
12 | 11 | 1111111111 | 41444_1111111111 | Val, Ile | 12 | Val | 1111111111
11 | 11 | 1111111111 | 14443_1111111111 | Gly | 11 | Gly | 1111111111
 | | | 24443_1111111111 | Ala | | |
 | | | 34443_1111111111 | Thr | | |
10 | 11 | 1111111111 | 44434_1111111111 | Lys | 10 | Lys | 1111111111
9 | 11 | 1111111101 | 44344_1111111101 | Val | 9 | Val | 1111111101
8 | 11 | 1111110101 | 43442_1111110101 | Asn | 8 | Asn | 1111110101
7 | 11 | 1111010101 | 34421_1111010101 | Thr | 7 | Thr | 1111010101
6 | 11 | 1101010101 | 34213_1101010101 | Ser | 6 | Lys | 1101010101
 | | | 44213_1101010101 | Lys | | |
5 | Asp | 0101010100 | 42132_0101010100 | | 5 | Asp | 0101010100
4 | Ala | | | | 4 | Ala |
3 | Pro | | | | 3 | Pro |
2 | Ser | | | | 2 | Ser |
1 | Leu | | | | 1 | Leu |

As can be seen from table 12, the primary structure designed in example 1 is most similar to the primary structures of the protein fragments NV and 3EOK. Amino acids 1 to 10 of the designed primary structure are identical to amino acids 2 to 11 of the NV fragment, and amino acids 11 to 18 are identical to amino acids 12 to 19 of the 3EOK fragment.

Table 12

Comparison of the primary structures of protein fragments

No. | Designed amino acid | No. of amino acid in proteins | NV | 3EOK | 3DHR | 3D4X
18 | Gly | 19 | Ala | Gly | Gly | Ser
17 | Gly | 18 | Gly | Gly | Gly | Gly
16 | Ile | 17 | Val | Ile | Ile | Ile
15 | Lys | 16 | Lys | Lys | Lys | Lys
14 | Ser | 15 | Gly | Ser | Ala | Gly
13 | Phe | 14 | Trp | Phe | Phe | Trp
12 | Val | 13 | Ala | Val | Val | Cys
11 | Gly | 12 | Ala | Gly | Ala | Ala
10 | Lys | 11 | Lys | Lys | Lys | Lys
9 | Val | 10 | Val | Val | Val | Val
8 | Asn | 9 | Asn | Asn | Asn | Asn
7 | Thr | 8 | Thr | Thr | Ser | Ser
6 | Lys | 7 | Lys | Lys | Lys | Lys
5 | Asp | 6 | Asp | Asp | Asp | Asp
4 | Ala | 5 | Ala | Ala | Asn | Ala
3 | Pro | 4 | Pro | Ala | Ala | Ala
2 | Ser | 3 | Ser | Ser | Ser | Ser
1 | Leu | 2 | Leu | Leu | Leu | Leu
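The comparison in table 12 can be checked mechanically. The following Python sketch is an illustration only: the sequences are transcribed from table 12 (designed positions 1 to 18, fragment positions 2 to 19), the fragment codes follow table 13, and the helper name `matching_positions` is hypothetical.

```python
# Sequences transcribed from table 12, listed from designed position 1 to 18.
designed = "Leu Ser Pro Ala Asp Lys Thr Asn Val Lys Gly Val Phe Ser Lys Ile Gly Gly".split()
fragments = {
    "NV":   "Leu Ser Pro Ala Asp Lys Thr Asn Val Lys Ala Ala Trp Gly Lys Val Gly Ala".split(),
    "3EOK": "Leu Ser Ala Ala Asp Lys Thr Asn Val Lys Gly Val Phe Ser Lys Ile Gly Gly".split(),
    "3DHR": "Leu Ser Ala Asn Asp Lys Ser Asn Val Lys Ala Val Phe Ala Lys Ile Gly Gly".split(),
    "3D4X": "Leu Ser Ala Ala Asp Lys Ser Asn Val Lys Ala Cys Trp Gly Lys Ile Gly Ser".split(),
}


def matching_positions(a, b):
    """Return the 1-based positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x == y]


for code, fragment in fragments.items():
    positions = matching_positions(designed, fragment)
    print(code, len(positions), positions)
```

Listing the matching positions, rather than a single identity percentage, makes it visible which stretch of the designed sequence each fragment covers.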

Table 13 gives the two-dimensional description of the hydrogen bonds of the protein fragments listed in table 12, obtained with the program Protein 3D. It also gives the description of their secondary structure as ten-digit binary numbers, which is completely identical to the specified description of the secondary structure of the designed primary structure (table 11).

Figures 1,a and 1,b show screenshots of the secondary-structure fragments of proteins NV and 3EOK obtained with Protein 3D, together with the corresponding amino acid sequences of the primary structure. The figures show that the secondary structures of the fragments making up the designed primary structure of example 1 overlap from the 7th to the 11th amino acid, as do the amino acid sequences of their primary structures (shown in italics in the sequences). Consequently, the designed sequence has the same secondary structure as the original fragments.

Thus, example 1 presents a designed primary structure of a protein that consists of fragments of proteins NV and 3EOK and whose specified secondary structure fully coincides with the secondary structure of each of these proteins.

Table 13

Fragments of proteins with two-dimensional description of their hydrogen bonds

Description of the secondary structure of protein fragments

The secondary structure of protein fragments from Protein Data Bank

NV 3EOK 3DHR 3D4X

19 ALA 19 GLY 19 GLY 19 SER

18 GLY N - 14 TRP O

18 GLY N - 14 PHE O

18 GLY N - 14 TRP O

18 GLY 18 GLY 18 GLY 18 GLY

17 VAL N - 13 ALA O

17 ILE N - 13 VAL O

17 ILE N - 13 CYS O

17 VAL 17 ILE 17 ILE 17 ILE

16 LYS N - 12 ALA O

16 LYS N - 12 GLY O

16 LYS N - 12 ALA O

0010101010

16 LYS 16 LYS 16 LYS 16 LYS

15 GLY N - 11 LYS O

15 SER N - 11 LYS O

15 ALA N - 11 LYS O

15 GLY N - 11 LYS O

1010101011

15 GLY 15 SER 15 ALA 15 GLY

14 TRP O - 18 GLY N

14 PHE O - 18 GLY N

14 TRP O - 18 GLY N

1010101111

14 TRP N - 10 VAL O

14 PHE N - 10 VAL O

14 TRP N - 10 VAL O

14 TRP 14 PHE 14 PHE 14 TRP 15

1010111111

13 ALA O - 17 VAL N

13 VAL O - 17 ILE N

13 CYS O - 17 ILE N

1011111111

13 ALA N - 9 ASN O

13 VAL N - 9 ASN O

13 CYS N - 9 ASN O

14 13 ALA 13 VAL 13 VAL 13 CYS 13

1111111111

12 ALA O - 16 LYS N

12 GLY O - 16 LYS N

12 ALA O - 16 LYS N

1111111111

12 ALA N - 8 THR O

12 GLY N - 8 THR O

12 ALA N - 8 SER O

1111111111

12 ALA 12 GLY 12 ALA 12 ALA

11 LYS O - 15 GLY N

11 LYS O - 15 SER N

11 LYS O - 15 ALA N

11 LYS O - 15 GLY N

1111111111

11 LYS N - 7 LYS O

1111111111

11 LYS 11 LYS 11 LYS 11 LYS

10 VAL O - 14 TRP N

10 VAL O - 14 PHE N

10 VAL O - 14 TRP N

1111111101

10 VAL N - 6 ASP O

10 VAL 10 VAL 10 VAL 10 VAL 8

1111110101

9 ASN O - 13 ALA N

9 ASN O - 13 VAL N

9 ASN O - 13 CYS N

1111010101

9 ASN N - 5 ALA O

9 ASN N - 5 ASN O

9 ASN N - 5 ALA O

9 ASN 9 ASN 9 ASN 9 ASN 6

1101010101

8 THR O - 12 ALA N

8 THR O - 12 GLY N

8 SER O - 12 ALA N

8 THR N - 4 PRO O

8 THR N - 4 ALA O

8 SER N - 4 ALA O

0101010100

8 THR 8 THR 8 SER 8 SER

7 LYS O - 11 LYS N

7 LYS N - 3 SER O

3 7 LYS 7 LYS 7 LYS 7 LYS

6 ASP O - 10 VAL N

6 ASP O - 10 VAL N

2 6 ASP 6 ASP 6 ASP 6 ASP 1

5 ALA O - 9 ASN N

5 ASN O - 9 ASN N

5 ALA O - 9 ASN N

5 ALA 5 ALA 5 ASN 5 ALA

4 PRO O - 8 THR N

4 ALA O - 8 THR N

4 ALA O - 8 SER N

4 PRO 4 ALA 4 ALA 4 ALA

3 SER O - 7 LYS N

3 SER 3 SER 3 SER 3 SER

2 LEU 2 LEU 2 LEU 2 LEU

Information on the secondary structure of proteins NV and 3EOK, which belong to the class of hemoglobins, has been published and is presented in table 14.

Table 14

List of proteins whose structure, studied by X-ray analysis, matches the specified secondary structure

No.

Code protein

The name of protein and source selection

Literature

1 NV

Hemoglobin alpha subunit, human

G. Fermi, M.F. Perutz, B. Shaanan, R. Fourme. The crystal structure of human deoxyhaemoglobin at 1.74 angstroms resolution. J. Mol. Biol. v.175, p.159 (1984)

2 3EOK

Hemoglobin alpha subunit, duck

Sathya Moorthy, K. Neelagandan, M. Balasubramanian, M.N. Ponnuswamy. Crystal structure determination of duck (Anas platyrhynchos) hemoglobin at 2.1 angstrom resolution. To be published (structural data from the Protein Data Bank)

Example 2.

This example describes the design of the primary structure of a protein whose secondary structure is specified as an inverted β-bend.

When designing the primary structure of a protein with a given secondary structure on the basis of its amino acid sequence and the description of its secondary structure, the following is carried out:

A) create a database of amino acid pentafragments of proteins containing folders of pentafragments, and a list of the folders by name, the names being formed from the binary-encoded descriptions of the hydrogen bonds of the peptide groups of the pentafragments in the secondary structure of proteins, and record it on any information carrier;

B) create a catalog of descriptions of secondary structures, each given as a series of ten-digit binary numbers;

B) load into computer memory the database of amino acid pentafragments of proteins recorded on the carrier;

D) specify the description of the secondary structure of the designed primary structure of the protein as a series of ten-digit binary numbers, using the catalog of descriptions of secondary structures;

In this example the secondary structure is specified as an inverted β-bend. The operator finds its description in the catalog of secondary structures and records it (table 15).

Table 15

Description of the designed secondary structure for example 2

No. | Description of the secondary structure
14 | 0000000001
13 | 0000000100
12 | 0000010000
11 | 0001000000
10 | 0100000010
9 | 0000001000
8 | 0000100000
7 | 0010000000
6 | 1000000000
5 | 0000000000
4 |
3 |
2 |
1 |

D) determine and enter into computer memory an initial sequence of five amino acids from the twenty canonical amino acids of proteins; this is the specified initial pentafragment:

5 Val
4 Asp
3 Cys
2 Gly
1 Tyr

which is read from bottom to top: Tyr, Gly, Cys, Asp, Val.

E) determine and enter into computer memory the description of the secondary structure of the specified initial pentafragment as a ten-digit binary number. This is the first ten-digit number in the specified description of the secondary structure, and it corresponds to the name of the database folder that contains the specified initial pentafragment: 0000000000. As can be seen from table 15, it is the first ten-digit number in the specified description of the secondary structure.

The program is installed in the same way as in example 1.

K) search the database for the specified initial pentafragment of the designed protein using the computer program PROTCOM stored beforehand; the search algorithm includes:

- encoding the specified initial pentafragment for the database search;

The program performs the encoding by assigning each amino acid of the specified initial pentafragment to a symmetry group (table 5 of the description of the application).

In this example: Val - 4, Asp - 4, Cys - 3, Gly - 1 and Tyr - 3. This numeric sequence, 44313, is written into program memory from left to right and is used to search for the specified initial pentafragment in folder 0000000000 of the database, in the file 44313_0000000000.

- searching for the specified initial pentafragment in the database folder with the specified description of the secondary structure of the pentafragment;

The program found the specified initial pentafragment in the file 44313_0000000000:

Val Asp Cys Gly Tyr

This pentafragment was extracted from a text file produced by the program PROTEIN 3D from the atomic coordinates of a protein in the Protein Data Bank. It has a β-structure, described as 0000000000, which contains no H-bonds within the pentafragment itself.

Tables 16, 17 and 18 illustrate the operation of the program. In the left part, entitled "Input", the first column gives the sequence numbers of the designed amino acids entered into PROTCOM; the second column gives the amino acids entered according to item PS) or the pairs of variables entered according to item PL), selected by the operator on the basis of the specified secondary structure (table 15); the third column gives the description of the secondary structure entered into the program as ten-digit numbers. In the central part, entitled "Search for pentafragments in the database", the first column gives the names of the files, composed of the code number and the number of the specified folder, and the second column gives the names of the amino acids found in the pentafragments. The right part of the table shows the entries that PROTCOM makes in the working file after finding the specified initial pentafragment and, subsequently, after an amino acid is selected from a file with the given code number and folder number.

- when the specified initial pentafragment is found in the folder, it is taken as the first of the N possible pentafragments of the designed primary structure of the protein, and the following is done:

- the number of the database folder containing the first pentafragment is recorded;

- the amino acid sequence of the first pentafragment is recorded in the working file of the program;

- the ten-digit folder number describing the secondary structure of the first pentafragment found is recorded in the working file;

The program found the entered initial pentafragment in the file with the corresponding code and folder number and made an entry in the working file (table 16). Since the specified initial pentafragment was found, we omit the search steps that apply when it is not found in the folder.
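The encoding of the initial pentafragment into a file name, as described in this step, can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions rather than the PROTCOM code: the function name `file_name` is hypothetical, and the `GROUP` mapping contains only the five assignments quoted in this example (the full assignment is in table 5 of the application).

```python
# Symmetry groups quoted in example 2; the complete mapping is in table 5
# of the application and is not reproduced here.
GROUP = {"Val": 4, "Asp": 4, "Cys": 3, "Gly": 1, "Tyr": 3}


def file_name(pentafragment, description):
    """Build '<five-digit code>_<ten-digit description>' for a pentafragment.

    The pentafragment is listed from the 5th (newest) to the 1st residue,
    matching the left-to-right order of the code number.
    """
    if len(pentafragment) != 5:
        raise ValueError("a pentafragment has exactly five amino acids")
    if len(description) != 10 or set(description) - {"0", "1"}:
        raise ValueError("the description must be a ten-digit binary number")
    code = "".join(str(GROUP[aa]) for aa in pentafragment)
    return f"{code}_{description}"


print(file_name(["Val", "Asp", "Cys", "Gly", "Tyr"], "0000000000"))
# prints 44313_0000000000
```

The part after the underscore doubles as the folder name, so the same string locates both the folder and the file within it.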

To specify the description of the secondary structure, PROTCOM prompts for a pair of variables: 00, 01, 10 or 11. Table 15 shows that the next ten-digit number is 1000000000. The operator therefore selects 10 and enters this pair into the program (the column "Amino acids or pairs of variables" in table 17). The program adds 10 on the left and removes two digits on the right, which changes the ten-digit number describing the secondary structure of the previous pentafragment, as reflected in the column "Description of the specified secondary structure" of table 17.
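The update of the ten-digit description can be written as a simple string shift. The Python sketch below is an illustration with hypothetical names (`next_description`, `replay`). Only the first pair, 10, is stated in the text; the remaining pairs are inferred here from the sequence of numbers in table 15 and should be read as an assumption.

```python
def next_description(previous, pair):
    """Prepend the operator's pair and drop the two rightmost digits."""
    if pair not in {"00", "01", "10", "11"}:
        raise ValueError("pair must be one of 00, 01, 10, 11")
    if len(previous) != 10 or set(previous) - {"0", "1"}:
        raise ValueError("description must be a ten-digit binary number")
    return pair + previous[:-2]


def replay(initial, pairs):
    """Apply a series of pairs and return every intermediate description."""
    descriptions = [initial]
    for pair in pairs:
        descriptions.append(next_description(descriptions[-1], pair))
    return descriptions


# The first pair (10) is taken from the text; the others are inferred from table 15.
pairs = ["10", "00", "00", "00", "01", "00", "00", "00", "00"]
for stage, description in enumerate(replay("0000000000", pairs), start=5):
    print(stage, description)
```

Each step moves the existing digits two places to the right, so every pair entered by the operator remains visible in the description for five consecutive stages.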

M) search the database for pentafragments containing four amino acids of each of the (N-1) pentafragments recorded in the working file and one new amino acid; the search algorithm includes:

- selecting and storing the last four amino acids of each of the (N-1) pentafragments recorded in the working file;

- searching the database folder with the specified description of the secondary structure for pentafragments containing the last four amino acids of each of the (N-1) pentafragments recorded in the working file and one new amino acid;

To do this, the program takes from the pentafragment recorded in the working file (table 16) four amino acids, written from top to bottom: Val, Asp, Cys, Gly.

Next, the program encodes them according to their symmetry groups and writes the code number from left to right, in the same way as the index file names are formed but without the first amino acid: 4431. It then searches the database for pentafragments containing the four selected amino acids, in the folder with the specified structure of the next pentafragment (1000000000), i.e. in the files X4431_1000000000, where X takes the values 1, 2, 3, 4 corresponding to the numbers of the symmetry groups (see table 5 of the description of the application): 14431_1000000000, 24431_1000000000, 34431_1000000000, 44431_1000000000.

The search found pentafragments containing the last four amino acids (Val, Asp, Cys, Gly) followed by a fifth amino acid, recorded together with the codes of the proteins from which they were obtained:

- in the file of group 1 (14431_1000000000): Gly;

- in the file of group 2 (24431_1000000000): no pentafragments found;

- in the file of group 3 (34431_1000000000): no pentafragments found;

- in the file of group 4 (44431_1000000000): no pentafragments found.

Note that no pentafragments were found in the files of groups 2, 3 and 4. The single amino acid Gly is therefore used for the design.

- when such pentafragments are found, the operator:

- selects one of the new amino acids and adds it to the last four amino acids of the previous pentafragment;

The operator chose Gly as the fifth amino acid of the new pentafragment and entered this into the program.

Next, the program:

- records the new amino acid (Gly) in the working file (the column "Entry in the working file", table 17), which reflects the designed primary structure of the protein;

- records the ten-digit folder number describing the secondary structure of the pentafragment found (1000000000);

Because the pentafragment was found, we omit the search steps that apply when no pentafragment is found in the folder.

As follows from table 19, the amino acid sequence designed in example 2 is identical to the amino acid sequence of a fragment of protein 1AGD. Table 20 gives the two-dimensional description of the hydrogen bonds of the protein fragments listed in table 19, obtained with Protein 3D. The description of their secondary structure as ten-digit binary numbers is given in the right column of table 20. For protein 1AGD it is completely identical to the specified description of the secondary structure of the designed primary structure of example 2 (table 15). It was also found that some parts of this sequence can be assembled from pentafragments of proteins 2R37 (No. 9), 3B02 (No. 11) and 1BAS (No. 12), which are unrelated to protein 1AGD. Their secondary structure, described in table 20 as ten-digit binary numbers, fully coincides with the specified description of the secondary structure of the designed primary structure of example 2.

Table 20

The secondary structure of protein fragments

Fragments of secondary structure proteins from Protein Data Bank

Description of the secondary structure of a protein fragment 1AGD

1AGD 2R37 3B02 1BAS 14

0000000001

112 GLY 13

0000000100

111 ARG 84 LEU 12

0000010000

110 LEU 83 LEU 109 LEU 39 LEU

82 ARG O - 78 LYS N

0001000000

108 ARG O - 104 GLY N

38 ARG O - 34 LEU N

81 GLY 10

0100000010

108 ARG 107 GLY 37 GLY 80 ASP 9

0000001000

106 ASP

192 GLY 191 ASP

36 ASP 35 PRO

0000100000

105 PRO 190 PRO 7

0010000000

104 GLY N - 108 ARG O

104 GLY

189 GLY N - 193 ILE O

1000000000

103 VAL 188 VAL 5

0000000000

102 ASP 4 101 CYS 100 GLY 3 99 TYR 2 1

Figures 2-4 show screenshots of the secondary-structure fragments of proteins 1AGD, 2R37, 3B02 and 1BAS obtained with Protein 3D. Comparison of the 1AGD fragment with the fragment of protein 2R37 (figures 2,a and 2,b), with the fragment of protein 3B02 (figures 3,a and 3,b) and with the fragment of protein 1BAS (figures 4,a and 4,b) leads to the conclusion that their secondary structures are identical and that the fragments are interchangeable. It therefore makes no difference whether the protein of example 2 is designed from pentafragments of protein 1AGD alone or from pentafragments obtained from four different proteins: 1AGD, 2R37, 3B02 and 1BAS.

A general view of protein 1AGD, studied by X-ray analysis, is shown in figure 5,a. The rectangle marks the region corresponding to the primary structure designed by the claimed method. Figure 5,b shows this region in close-up, and figure 5,c gives a detailed view of the fragment of the secondary structure of protein 1GDJ corresponding to the specified secondary structure of example 2. These figures illustrate the presence of this fragment in a real protein.

Thus, the primary structure designed in example 2 is confirmed in two variants. First variant: the primary structure consists only of pentafragments of protein 1AGD, and the specified secondary structure of the designed protein of example 2 is identical to the secondary structure of the 1AGD fragment. Second variant: the primary structure consists of pentafragments obtained from four different proteins, 1AGD, 2R37, 3B02 and 1BAS. The descriptions of the secondary structures of the pentafragments of these four proteins also coincide completely with the specified description of the secondary structure of the designed primary structure of example 2.

Information on the secondary structure of proteins 1AGD, 2R37, 3B02 and 1BAS has been published and is presented in table 21.

Table 21

List of proteins whose structure, studied by X-ray analysis, matches the specified secondary structure

Stage No.

Protein code

Protein name

Literature

5-8, 10, 13, 14

1AGD

Histocompatibility complex

S.W. Reid, S. McAdam, K.J. Smith, P. Klenerman, C.A. O'Callaghan, K. Harlos, B.K. Jakobsen, A.J. McMichael, J.I. Bell, D.I. Stuart, E.Y. Jones. Antagonist HIV-1 Gag peptides induce structural changes in HLA-B8. J. Exp. Med. v.184, p.2279 (1996). ISSN 0022-1007. Resolution 2.05 angstroms.

9 2R37

Human glutathione peroxidase 3

E.S. Pilka, K. Guo, O. Gileadi, A. Rojkowa, F. Von Delft, A.C.W. Pike, K.L. Kavanagh, C. Johannson, M. Sundstrom, C.H. Arrowsmith, J. Weigelt, A.M. Edwards, U. Oppermann. Crystal structure of human glutathione peroxidase 3 (selenocysteine to glycine mutant). No recorded citation in PubMed. Resolution 1.85 angstroms.

11 3B02

Transcriptional regulator, CRP family

Y. Agari, S. Kuramitsu, A. Shinkai. X-ray crystal structure of TTHB099, a CRP/FNR superfamily transcriptional regulator from Thermus thermophilus HB8, reveals a DNA-binding protein with no required allosteric effector molecule. Proteins (2012), to be published. Resolution 1.92 angstroms.

12 1BAS

Fibroblast growth factor

X. Zhu, H. Komiya, A. Chirino, S. Faham, G.M. Fox, T. Arakawa, B.T. Hsu, D.C. Rees. Three-dimensional structures of acidic and basic fibroblast growth factors. Science v.251, p.90 (1991). ISSN 0036-8075. Resolution 1.9 angstroms.

Registration information for the databases and software used in the description of the application

"Database of protein pentafragments".

Authors: V. Karasev, A.I. Belyaev, V.V. Luchinin

Certificate of state registration database №2010620364

Registered in the Register of databases on 7 July 2010

"Computer program for constructing the primary structure of a protein with a given secondary structure" - "PROTCOM".

Authors: V. Karasev, A.I. Belyaev, V.V. Luchinin

Certificate of state registration of the computer program №2011611105

Registered in the Register of computer programs on 2 February 2011