Apparatus for buffering data streams read from RAM. RU patent 2475817.

IPC classes for Russian patent RU 2475817, "Apparatus for buffering data streams read from RAM":

G06F9/30 - Arrangements for executing machine instructions, e.g. instruction decode (for executing micro-instructions G06F0009220000; for executing subprogrammes G06F0009400000)

Other patents in the same IPC classes:

Using register renaming systems to forward intermediate results between constituent instructions and expanded instruction / 2431887
Method of executing an expanded instruction in a system with a rename table, a free list, and a constituent instruction rename table comprising converting the expanded instruction into a plurality of separately executable constituent instructions including a first constituent instruction and a second constituent instruction; assigning a physical register number associated with a physical register to the first constituent instruction by mapping an identifier of the first constituent instruction to the physical register number in the constituent instruction rename table, wherein the first constituent instruction generates an intermediate result; and associating the assigned physical register number with the second constituent instruction receiving the intermediate result.

Flushing of a segmented pipeline for mispredicted branches / 2427889
The processor pipeline is segmented into an upper part, preceding the point at which instructions may go out of program order, and one or more lower parts downstream of the upper part. The upper pipeline is flushed as soon as a branch instruction is detected to have been mispredicted, which minimises the delay in fetching instructions from the correct branch target address. The lower pipelines may continue execution until the mispredicted branch instruction is confirmed, at which point all uncommitted instructions are flushed from the lower pipelines. Existing exception pipeline-flushing mechanisms may be reused by adding a mispredicted-branch identifier, which reduces the complexity and hardware cost of flushing the lower pipelines.

Instruction and logical circuit to carry out dot product operation / 2421796
System for carrying out a dot product operation includes: a first memory device to store a SIMD (single instruction, multiple data) dot product instruction; and a processor connected to the first memory device to execute the SIMD dot product instruction, wherein the instruction includes a source operand indicator, a destination operand indicator and at least one immediate value indicator, the immediate value indicator including multiple control bits.

Delayed application launching / 2419840
Delaying the launch of certain applications can enhance overall system performance. Applications that must be delayed may be placed in a container object or package so that they can be monitored and so that other applications, which depend on the delayed applications, can be processed appropriately.

Methods and apparatus for emulating branch prediction behaviour of explicit subroutine call / 2417407
Apparatus has a first input which is configured to receive an instruction address, and a second input which is configured to receive predecoded information which describes the instruction address as being related to an implicit subroutine call in a subroutine. In response to the predecoded information, the apparatus also includes an adder configured to add a constant to the instruction address defining a return address, causing the return address to be stored to an explicit subroutine resource, thus, facilitating subsequent branch prediction of a return call instruction.

Pre-decoding variable length instructions / 2412464
Method involves the following: identification of a property of a first instruction, where the property differs from other properties encoded in a first set of pre-decoding bits, for which all available encodings are defined or reserved; coding the first instruction in a second format, whose length differs from that of the first format, including part of the first instruction and the first set of pre-decoding bits, where the second format contains part of the second instruction and a second set of pre-decoding bits, encoding the second set of pre-decoding bits using one of the available encodings.

Expansion of stacked register file using shadow registers / 2405189
Method of managing shadow register file system involves the following steps: allocating one or more multi-port registers from a physical register file to a first procedure, corresponding to part of the logic stack of registers, storing data associated with the first procedure in the allocated multi-port registers; selectively saving data associated with the first procedure from one or more multi-port registers to one or more registers of the first file of shadow registers of the shadow register file system, wherein one or more registers has independent data reading/recording ports, and freeing up corresponding allocated multi-port registers for allocating the second procedure; storing data associated with the first procedure from the first shadow register file to the second shadow register file; storing at least part of data associated with the first procedure from a specific register of the second shadow register file in backing memory, and then extraction of said part of data associated with the first procedure from the backing memory to a specific register of the second shadow register file; extracting data from the second shadow register file into one or more registers of the first shadow register file; and before continuing to execute the first procedure, retrieving data associated with the first procedure from one or more registers into one or more multi-port registers, and reallocating the first procedure one or more multi-port registers.

Methods and device for ensuring correct pre-decoding / 2405188
Method involves defining a granule which is equal to the smallest length instruction in the instruction set and defining the number of granules making up the longest length instruction in the instruction denoted MAX. The method also involves determining the end of an embedded data segment, when a program is compiled or assembled into the instruction string and inserting a padding of length MAX-1 into the instruction string to the end of the embedded data. Upon pre-decoding of the padded instruction string, a pre-decoder maintains synchronisation with the instructions in the padded instruction string even if embedded data are randomly encoded to resemble an existing instruction in the variable length instruction set.

Error handling for early decoding through branch correction / 2367004
Invention relates to processors with pipeline architecture. The method of correcting an incorrectly early decoded instruction comprises stages on which: the early decoding error is detected and a procedure is called for correcting branching with a destination address for the incorrectly early decoded instruction in response to detection of the said error. The early decoded instruction is evaluated as an instruction, which corresponds to incorrectly predicted branching.

Hybrid microprocessor / 2359315
Present invention relates to computer engineering and can be used in signal processing systems. The device contains an instruction buffer, memory control unit, second level cache memory, integral arithmetic-logic unit (ALU), floating point arithmetic unit and a system controller.

Method for processing with use of one commands stream and multiple data streams / 2279706
A system is disclosed with an instruction (ADD8TO16) which unpacks non-adjacent portions of a data word using sign or zero extension and combines them by means of a "single instruction, multiple data" arithmetic operation, such as addition, performed in response to one and the same instruction. The instruction is especially useful in systems having a data path that contains a shifting circuit before the arithmetic circuit.

Configurable computing device / 2291477
Device consists of two computing device units, each of them divided into at least four subunits, which consist of a quantity of unit cells. Named units are spatially located so that the distance between unit cell of first unit and equal unit cell in the second unit is minimal. Computing device configuration can be changed using configurational switches, which are installed between device subunits.

Expandable communication control means / 2313188
Expandable communication control means is used for maintaining communication between computing device and remote communication device. In a computer program adapted for using expandable communication control means, information about contacting side is found, and on basis of found contact information it is determined which types of transactions may be used for communication with contacting side at remote communication device. As soon as communication setup function is determined using contacting side information, communication setup request, associated with such a function, is dispatched to communication address. After receipt, expandable communication control means begins conduction of communication with remote communication device.

Method and device for shuffling data / 2316808
In accordance to shuffling instruction, first operand is received, which contains a set of L data elements, and second operand, which contains a set of L shuffling masks, where each shuffling mask includes a "reset to zero" field and selection field, for each shuffling mask, if the "reset to zero" field of shuffling mask is not set, then data indicated by shuffling mask selection field are moved, from data element of first operand, into associated data element of result, and if "reset to zero" field of shuffling mask is set, then zero is placed in associated data element of result.

Processing of message authentication control commands providing for data security / 2327204
Invention pertains to the means of providing for computer architecture. Description is given of the method, system and the computer program for computing the data authentication code. The data are stored in the memory of the computing medium. The memory unit required for computing the authentication code is given through commands. During the computing operation the processor defines one of the encoding methods, which is subject to implementation during computation of the authentication code.

Digital signal processors with configurable binary multiplier-accumulation unit and binary arithmetic-logical unit / 2342694
Present invention pertains to digital signal processors with configurable multiplier-accumulation units and arithmetic-logical units. The device has a first multiplier-accumulation unit for receiving and multiplying the first and second operands, storage of the obtained result in the first intermediate register, adding it to the third operand, a second multiplier-accumulation unit, for receiving and multiplying the fourth and fifth operands, storage of the obtained result in the second intermediate register, adding the sixth operand or with the stored second intermediate result, or with the sum of the stored first and second intermediate results. Multiplier-accumulation units react on the processor instructions for dynamic reconfiguration between the first configuration, in which the first and second multiplier-accumulation units operate independently, and the second configuration, in which the first and second multiplier-accumulation units are connected and operate together.

Processing of message digest generation commands / 2344467
A message digest generation instruction is fetched from memory. In response, the message digest operation to be executed is determined from a previously specified function code, which defines either a message digest calculation operation or a function query operation. If the operation to be executed is a message digest calculation, a message digest calculation that includes a hashing algorithm is performed on the operand. If the operation to be executed is a function query, status word bits corresponding to the one or more function codes installed in the processor are stored in a parameter block.

FIELD: information technology.

SUBSTANCE: apparatus for buffering data streams transferred between two interfaces which are RAM and CPU data buses, respectively, and having a buffer which is based on memory or registers and accumulates data for transmission thereof upon request from a second interface without accessing the first interface, wherein the apparatus has additional buffers, a tag controller and an output multiplexer, the tag controller being connected to a multiplexer, a buffer based on memory or registers and additional buffers for tracking relevancy of data stored therein, and buffer inputs of the apparatus are connected to RAM.

EFFECT: high efficiency of a memory subsystem, lying in shorter delay in obtaining data requested by a CPU, high flexibility of application and high carrying capacity of RAM data buses.

1 dwg

The invention relates to the field of computer engineering, namely to computing systems based on general-purpose microprocessors.

A system-controller unit for working with external memory (RAM) is known. It is part of a hybrid microprocessor that includes a CPU, a system controller, external memory and a two-level cache (patent RU 2359315, cl. G06F 9/30, 20.06.2009).

The disadvantage of this unit is its poor performance when accessing external memory.

The closest in technical essence and in the result achieved is a mechanism for buffering data streams that are read from RAM and forwarded between two interfaces, which are the RAM and CPU data buses respectively. It contains a buffer, implemented in memory or registers, that stores data for transmission on request from the second interface without accessing the first (US Patent 7581072 B2, cl. G06F 12/00, 14.12.2006).

The disadvantage of this buffering device is its low efficiency. In addition, the device does not track the addresses at which writes to memory are made, so data stored in the buffer may lose relevance when a new write is made to RAM at the same address.

The expected technical result of the invention is higher performance of the memory subsystem, which consists in lower latency for the CPU to obtain data, greater flexibility of application through storing several streams of data requested from memory, and higher throughput of the RAM data bus through a reduced load on it.

This technical result is achieved in that the apparatus for buffering data streams transferred between two interfaces, which are the RAM and CPU data buses respectively, containing a buffer that is implemented in memory or registers and stores data for transmission on request from the second interface without accessing the first, additionally contains further buffers, a tag controller and an output multiplexer. The tag controller is connected to the multiplexer, to the buffer implemented in memory or registers and to the additional buffers in order to track the relevance of the data stored in them, and the buffer inputs of the apparatus are connected to RAM.

The delay in obtaining data is reduced because the data are pre-loaded into a buffer before the CPU requests them. Multiple buffers make it possible to accumulate data for four independent addresses, and data are passed to the second interface without accessing the first (i.e. RAM), which offloads the data bus.

The invention is illustrated by a drawing, in which Fig. 1 shows a block diagram of the apparatus for buffering data streams read from RAM.

The apparatus for buffering data streams read from RAM consists of tag controller 1, read address bus 2, write address bus 3, strobes 4 and 5 confirming the validity of the read and write addresses, RAM data bus 6, buffers 7, 8, 9 and 10, and multiplexer 11, which switches the buffer outputs. It also contains CPU data bus 12, control logic 13, which manages requests to memory and responses to the CPU and includes request state machine 14 and data-receipt state machine 15, read address bus 16 to RAM with request strobe 17 to the RAM controller, and strobe 18 confirming data for the CPU.

The apparatus works as follows. After reset, tag controller 1 uses the buffer-select signal to grant write permission to buffer 7. The output of buffer 10 is connected through multiplexer 11 to the CPU data bus by the output-buffer select signal from tag controller 1. When a read request 4 at address 2 arrives from the CPU, it goes both to tag controller 1 and to control logic 13. Tag controller 1, finding no data for that address in the buffers, signals this over the internal bus to the control logic, which uses request state machine 14 to pass read request 17 at address 16 to RAM. The data received fill buffer 7. Depending on the ratio of the memory bus frequency to the CPU bus frequency, data begin to be output from multiplexer 11 onto bus 12 together with confirmation signal 18, under the control of data-receipt state machine 15, until the data have been read in full. As the new data arrive from RAM in bursts, the transfer of the requested amount to the CPU is completed. Writing into buffer 7 does not stop at this point, however, and continues until the buffer is full. When the write is complete, request state machine 14 enters standby mode and control logic 13 asserts a load-complete signal to tag controller 1, which removes write permission from buffer 7 and transfers it to buffer 8. Thus, if the next request is to an address that differs from those recorded earlier, the algorithm described above continues with buffer 8, while the data in buffer 7 are preserved for a possible future request.
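
The miss path described above can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the names `StreamBuffer`, `TagController` and `cpu_read`, the buffer capacity of 8 words, and the round-robin rotation of write permission after the last buffer are all hypothetical choices that the description does not specify.

```python
BUFFER_WORDS = 8   # assumed buffer capacity in words
NUM_BUFFERS = 4    # the description lists four buffers (7, 8, 9 and 10)


class StreamBuffer:
    """One buffer: a run of consecutive RAM words starting at `base`."""

    def __init__(self):
        self.base = None
        self.words = []

    def holds(self, addr):
        return self.base is not None and self.base <= addr < self.base + len(self.words)


class TagController:
    """Hypothetical model: remembers buffered addresses and owns write permission."""

    def __init__(self):
        self.buffers = [StreamBuffer() for _ in range(NUM_BUFFERS)]
        self.write_sel = 0  # after reset, the first buffer is write-enabled

    def lookup(self, addr):
        """Return the index of the buffer holding `addr`, or None on a miss."""
        for index, buf in enumerate(self.buffers):
            if buf.holds(addr):
                return index
        return None

    def load_done(self):
        # Assumption: write permission rotates round-robin to the next buffer.
        self.write_sel = (self.write_sel + 1) % NUM_BUFFERS


def cpu_read(tags, ram, addr, count):
    """Serve a CPU read of `count` words starting at `addr`."""
    index = tags.lookup(addr)
    if index is None:
        # Miss: fetch from RAM into the write-enabled buffer and keep
        # filling until the buffer is full, beyond what the CPU asked for.
        index = tags.write_sel
        buf = tags.buffers[index]
        buf.base = addr
        buf.words = list(ram[addr:addr + BUFFER_WORDS])
        tags.load_done()
    buf = tags.buffers[index]
    offset = addr - buf.base
    return buf.words[offset:offset + count]
```

In this model a first read at a new address fills one buffer completely and moves write permission to the next one, so a later read at a different address does not evict the earlier data.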

If the processor issues a read request to an address from which a read has already been made, tag controller 1 reports the address match to control logic 13 and at the same time switches multiplexer 11 to take data from the buffer that holds the required data. Control logic 13 forms address 16, from which the buffer will be read, on the basis of the CPU read address 2. Control logic 13 generates, through data-receipt state machine 15, the signal that confirms the data for the CPU, and the data are sent on CPU data bus 12. If another read request 4 follows at any later time and its address is sequential relative to the previous request, then, once the buffer (buffer 8 in this example) has been emptied by half, request state machine 14 issues read request 17 at address 16, which continues sequentially from the address of the CPU request, so that buffer 8 becomes full again. This happens simultaneously with the delivery of data to the CPU by data-receipt state machine 15.
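
The refill-at-half-empty behaviour for sequential reads can be sketched in the same way. The class name `PrefetchBuffer`, the capacity of 8 words and the `refills` counter are illustrative assumptions. The model is also simplified to a single buffer that serves only the next sequential address, whereas the description allows any buffered address to hit.

```python
from collections import deque

BUFFER_WORDS = 8  # assumed buffer capacity in words


class PrefetchBuffer:
    """Hypothetical single-buffer model of sequential reads with refill."""

    def __init__(self, ram, base):
        self.ram = ram
        self.head = base  # RAM address of the oldest word still buffered
        self.fifo = deque(ram[base:base + BUFFER_WORDS])
        self.refills = 0  # number of refill requests issued to RAM

    def read_sequential(self, addr):
        """Return the word at `addr` if it is next in the stream, else None."""
        if not self.fifo or addr != self.head:
            return None  # not a sequential hit; handled elsewhere as a miss
        word = self.fifo.popleft()
        self.head += 1
        if len(self.fifo) <= BUFFER_WORDS // 2:
            # Buffer is half empty: request the following addresses from RAM
            # while the CPU is still being served from the buffer.
            self._refill()
        return word

    def _refill(self):
        start = self.head + len(self.fifo)
        need = BUFFER_WORDS - len(self.fifo)
        self.fifo.extend(self.ram[start:start + need])
        self.refills += 1
```

With a capacity of 8, the fourth sequential read leaves four words in the buffer, which triggers one refill request and brings the buffer back to eight words.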

A separate task of the tag controller is to monitor write address 3, at which the CPU or external devices write to RAM on request 5, because the data stored in the buffers may no longer be relevant once a write has been made to the corresponding address in RAM. In this case tag controller 1 sets a flag indicating that the data in the buffer are not relevant.
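
The relevance tracking can be sketched as a snoop on write addresses. The names `TagEntry`, `snoop_write` and `lookup` are hypothetical, and the assumption that each buffer covers one contiguous address range is made only for illustration.

```python
class TagEntry:
    """Hypothetical tag for one buffer: address range plus a relevance flag."""

    def __init__(self, base, length):
        self.base = base
        self.length = length
        self.valid = True  # cleared when RAM is written inside the range

    def covers(self, addr):
        return self.base <= addr < self.base + self.length


def snoop_write(tags, write_addr):
    """Mark every buffer whose range contains `write_addr` as not relevant.

    Returns the indices of the buffers that were invalidated.
    """
    invalidated = []
    for index, tag in enumerate(tags):
        if tag.valid and tag.covers(write_addr):
            tag.valid = False
            invalidated.append(index)
    return invalidated


def lookup(tags, read_addr):
    """Return the index of a still-relevant buffer holding `read_addr`, or None."""
    for index, tag in enumerate(tags):
        if tag.valid and tag.covers(read_addr):
            return index
    return None
```

After a write lands inside a buffered range, a later read of that range no longer hits the stale buffer and has to go back to RAM, while the other buffers are unaffected.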

An apparatus for buffering data streams transferred between two interfaces, which are the RAM and CPU data buses respectively, containing a buffer that is implemented in memory or registers and stores data for transmission on request from the second interface without accessing the first, characterized in that the apparatus contains additional buffers, a tag controller and an output multiplexer, the tag controller being connected to the multiplexer, to the buffer implemented in memory or registers and to the additional buffers in order to track the relevance of the data stored in them, and the buffer inputs of the apparatus being connected to RAM.