Digital signal processors with configurable dual multiply-accumulate unit and dual arithmetic logic unit

FIELD: physics; computer technology.

SUBSTANCE: the present invention pertains to digital signal processors with configurable multiply-accumulate (MAC) units and arithmetic logic units (ALUs). The device has a first MAC unit that receives and multiplies first and second operands, stores the result in a first intermediate register, and adds it to a third operand, and a second MAC unit that receives and multiplies fourth and fifth operands, stores the result in a second intermediate register, and adds a sixth operand to either the stored second intermediate result or the sum of the stored first and second intermediate results. The MAC units respond to processor instructions for dynamic reconfiguration between a first configuration, in which the first and second MAC units operate independently, and a second configuration, in which the first and second MAC units are coupled and operate together.

EFFECT: faster operation of the device and flexible simultaneous carrying out of different types of operations.

21 cl, 9 dwg

 

Technical field of the invention

The present invention relates generally to electronics, and more specifically to digital signal processors (DSPs) with configurable multiply-accumulate (MAC) units and arithmetic logic units (ALUs).

Prior art

A DSP is a specialized microprocessor designed to perform mathematical computations very quickly. DSPs are widely used in many electronic units, such as CD players, hard disk drives of personal computers (PCs), modem banks, audio devices, cellular phones, and so on. In cellular phones, the demand for DSP computation capability continues to grow, driven by the increasing needs of applications such as 3G (3rd generation) modem processing, position location, image and video processing, 3D gaming, and so on. These applications require DSPs that can perform computations quickly and efficiently.

A DSP typically contains a multiply-accumulate (MAC) unit and an arithmetic logic unit (ALU). The MAC unit is used for multiply-accumulate operations, which are commonly used in filtering and signal processing. The ALU is used for addition, subtraction, logical operations, shift operations, and bit manipulation. A DSP may also contain multiple MAC units for higher computational throughput. An exemplary DSP architecture with two MAC units is described in U.S. Patent No. 6,557,022, entitled "Digital Signal Processor with Coupled Multiply-Accumulate Units," issued on April 29, 2003.

The goals of any DSP design are (1) to achieve the highest number of operations per unit time and (2) to provide the flexibility to perform different types of operations simultaneously so that the available hardware is better utilized. DSP architectures that can satisfy these goals are highly desirable for meeting the processing demands of modern applications.

Summary of the invention

DSP architectures with improved performance are described below. In one embodiment, the DSP includes two MAC units and two ALUs, where one of the ALUs replaces the adder of one of the two MAC units. This DSP can be configured, possibly on an instruction-by-instruction basis, to operate in a dual-MAC/single-ALU configuration, a single-MAC/dual-ALU configuration, or a dual-MAC/dual-ALU configuration. This configuration flexibility allows the DSP to handle various types of signal processing operations and improves the utilization of the available hardware. The DSP additionally includes pipeline registers that break up the critical path (or paths) and allow the DSP to operate at a higher clock speed for greater throughput. Other embodiments of the DSP architecture are also described below.

Various aspects and embodiments of the invention are described in more detail below.

In one aspect, a processor is provided, comprising: a first multiply-accumulate (MAC) unit configured to receive and multiply first and second operands to obtain a first intermediate result, store the first intermediate result in a first register, add the stored first intermediate result to a third operand, and provide a first output; and a second MAC unit configured to receive and multiply fourth and fifth operands to obtain a second intermediate result, store the second intermediate result in a second register, add a sixth operand to either the stored second intermediate result or the sum of the stored first and second intermediate results, and provide a second output.

In another aspect, a processor is provided, comprising: a first multiply-accumulate (MAC) unit including a first multiplier configured to receive and multiply first and second operands and provide a first intermediate result, and a first arithmetic logic unit (ALU) configured to receive the first intermediate result, a third operand, and at least one additional operand, operate on the received operands, and provide a first output; and a second MAC unit including a second multiplier configured to receive and multiply fourth and fifth operands and provide a second intermediate result, a first adder configured to add the second intermediate result to either zero or the first intermediate result from the first MAC unit, and a second adder configured to add the output of the first adder to a sixth operand and provide a second output.

In another aspect, a processor is provided, comprising: a first multiply-accumulate (MAC) unit including a first multiplier configured to receive and multiply first and second operands and provide a first intermediate result, a first register configured to store the first intermediate result and provide the stored first intermediate result, and a first arithmetic logic unit (ALU) configured to receive and operate on the stored first intermediate result, a third operand, at least one other operand, or a combination thereof, and provide a first output; and a second MAC unit including a second multiplier configured to receive and multiply fourth and fifth operands and provide a second intermediate result, a second register configured to store the second intermediate result and provide the stored second intermediate result, a first adder configured to add the stored second intermediate result to either zero or the stored first intermediate result from the first MAC unit, and a second adder configured to add the output of the first adder to a sixth operand and provide a second output.

In another aspect, a wireless device is provided, comprising: a first multiply-accumulate (MAC) unit including a first multiplier configured to receive and multiply first and second operands and provide a first intermediate result, and a first arithmetic logic unit (ALU) configured to receive the first intermediate result, a third operand, and at least one additional operand, operate on the received operands, and provide a first output; a second MAC unit including a second multiplier configured to receive and multiply fourth and fifth operands and provide a second intermediate result, a first adder configured to add the second intermediate result to either zero or the first intermediate result from the first MAC unit, and a second adder configured to add the output of the first adder to a sixth operand and provide a second output; an ALU path including a shifter configured to receive and shift a seventh operand or an eighth operand and provide a third intermediate result, and a second ALU configured to operate on the third intermediate result, the seventh operand, the eighth operand, or a combination thereof to provide a third output; and a register file configured to provide the first through eighth operands to the first and second MAC units and the ALU path and to store the first through third outputs from the first and second MAC units and the ALU path.

Brief description of drawings

Fig. 1 illustrates a DSP with two MAC units and one ALU.

Fig. 2 illustrates a pipelined DSP with two MAC units and one ALU.

Fig. 3 illustrates a configurable DSP with two MAC units and two ALUs.

Fig. 4 illustrates a configurable pipelined DSP with two MAC units and two ALUs.

Fig. 5 illustrates another configurable pipelined DSP with two MAC units and two ALUs.

Fig. 6, 7 and 8 illustrate the DSP of Fig. 5 operating in the dual-MAC/single-ALU, single-MAC/dual-ALU, and dual-MAC/dual-ALU configurations, respectively.

Fig. 9 illustrates a wireless device in the communication system.

Detailed description

The word "exemplary" is used herein to mean "serving as an example, instance, or illustration". Any embodiment or design described below as "exemplary" is not necessarily to be construed as preferred or advantageous over other embodiments or designs.

Fig. 1 shows a block diagram of a DSP 100 with two MAC units and one ALU. A register file 110 contains a bank of general-purpose registers that can be used to store operands and results for the MAC units and the ALU. Register file 110 is coupled to and exchanges data with a memory unit (not shown in Fig. 1). For the embodiment shown in Fig. 1, register file 110 has three input ports, labeled PI1 through PI3, and eight output ports, labeled PO1 through PO8. In general, a register file may have any number of input and output ports.

For the first MAC unit (MAC1), a multiplier 122a receives and multiplies two operands from output ports PO4 and PO5 of register file 110 and provides the result to one input of an adder 140a. Adder 140a receives another operand from output port PO6, adds its two input operands, and provides its output to input port PI2 of register file 110. A multiplexer 128 receives the output of multiplier 122a and a value of zero on its two inputs and provides either the multiplier output or zero, depending on a multiplexer select (MS) control signal.

For the second MAC unit (MAC2), a multiplier 122b receives and multiplies two operands from output ports PO2 and PO3 of register file 110 and provides its result to one input of an adder 130. Adder 130 also receives the output of multiplexer 128, adds its two input operands, and provides its output to one input of an adder 140b. Adder 140b receives another operand from output port PO1, adds its two input operands, and provides the result to input port PI1 of register file 110.

For the ALU path, a shifter 154 receives two inputs from output ports PO7 and PO8 of register file 110 and a third input from an auxiliary bus. The auxiliary bus carries an immediate value embedded in an instruction to the ALU. Shifter 154 selects one of these three inputs, shifts the operand from the selected input by a specified number of bits (for example, 0, 1, 2 or 3 bits to the left), and provides the result to one input of each of multiplexers 158a and 158b. Multiplexer 158a also receives the operand from output port PO7 and provides one of its two inputs to one input of an ALU 160. Multiplexer 158b also receives the immediate value from the auxiliary bus and provides one of its two inputs to the other input of ALU 160. ALU 160 operates on its input operands and provides its output to input port PI3 of register file 110.

The units in DSP 100 may be designed with any number of bits. As an example, multipliers 122a and 122b may be 16x16-bit multipliers, adder 130 may be a 32-bit adder, adders 140a and 140b may be 40-bit adders, and shifter 154 and ALU 160 may be 40-bit units. Similarly, register file 110 may be designed with any number of bits for its input and output ports. As an example, output ports PO1, PO6 and PO7 may provide 40-bit operands, output ports PO2, PO3, PO4 and PO5 may provide 16-bit operands, output port PO8 may provide 16-bit or 40-bit operands, and input ports PI1, PI2 and PI3 may receive 40-bit results. The values above are exemplary, and other numbers of bits may also be used.

DSP 100 can be configured to operate either as two independent MAC units or as two coupled MAC units. For the independent dual-MAC configuration, multiplexer 128 is controlled to pass the zero value to adder 130, and MAC1 and MAC2 operate independently and can perform two MAC operations simultaneously on different sets of operands. For the coupled dual-MAC configuration, multiplexer 128 is controlled to pass the output of multiplier 122a, and MAC1 and MAC2 together perform the computation (B*C)±(D*E) or A±(B*C)±(D*E), where A through E are the operands from output ports PO1 through PO5, respectively. These two computations are very useful for complex multiplication and accumulation.
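The two configurations can be illustrated with a minimal behavioural sketch in Python. It is not taken from the source: the function name and the boolean `coupled` flag (standing in for the multiplexer 128 control) are hypothetical, bit widths are ignored, and only the "+" variant of the ± operations is modelled.

```python
def dsp100_dual_mac(a, b, c, d, e, f, coupled):
    """Behavioural sketch of the two multiply-accumulate paths of DSP 100.

    a..f stand for the operands on output ports PO1..PO6.
    coupled is a hypothetical flag for the multiplexer 128 control.
    Returns (value sent to PI1, value sent to PI2).
    """
    prod1 = d * e                      # multiplier 122a: PO4 * PO5
    prod2 = b * c                      # multiplier 122b: PO2 * PO3
    mux128 = prod1 if coupled else 0   # multiplexer 128: product or zero
    sum130 = prod2 + mux128            # adder 130
    pi1 = a + sum130                   # adder 140b adds the PO1 operand
    pi2 = f + prod1                    # adder 140a adds the PO6 operand
    return pi1, pi2


# Independent: PI1 = A + B*C and PI2 = F + D*E
print(dsp100_dual_mac(1, 2, 3, 4, 5, 6, coupled=False))
# Coupled: PI1 = A + B*C + D*E
print(dsp100_dual_mac(1, 2, 3, 4, 5, 6, coupled=True))
```

In the coupled case the sketch still computes the PI2 value; whether that result is written back in a given instruction is an assumption left open here.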

It is highly desirable to increase the clock frequency of the DSP in order to increase the amount of processing per unit time (i.e. to perform more operations per second). For example, if the clock frequency can be increased by 50%, then 50% more operations per second can be performed with the same hardware. However, because the coupled dual-MAC path and the ALU path each have multiple operations in series on their critical paths, the DSP architecture shown in Fig. 1 does not scale well as the clock frequency increases. The coupled dual-MAC path has one multiplication and two additions on its critical path, through multiplier 122a or 122b and adders 130 and 140b. The ALU path has a shift operation and an addition on its critical path. These operations take some time to complete and therefore limit the clock frequency that can be used for the DSP.

Fig. 2 illustrates a block diagram of a pipelined DSP 102 with two MAC units and one ALU. DSP 102 includes all of the elements of DSP 100 shown in Fig. 1. DSP 102 additionally includes (1) a register 124a coupled between multiplier 122a and adder 140a, (2) a register 124b coupled between multiplier 122b and adder 130, and (3) a register 156 coupled between shifter 154 and multiplexers 158a and 158b.

Registers 124a, 124b and 156 are pipeline registers inserted in the critical paths of MAC1, MAC2 and the ALU path, respectively. These registers break up the critical paths and allow DSP 102 to be clocked at a higher frequency. The execution cycle of DSP 102 is split into two pipeline stages. In the first pipeline stage, multipliers 122a and 122b fetch operands from register file 110, perform the multiplications, and store their results in registers 124a and 124b, respectively. Similarly, for the ALU path, shifter 154 receives inputs from register file 110 and/or the auxiliary bus, performs the specified shift, and stores the result in register 156. In the second pipeline stage, the adders in MAC1 and MAC2 and ALU 160 in the ALU path are active. For the independent dual-MAC configuration, adder 140a adds the output of register 124a to the operand from output port PO6 and provides its output to input port PI2, and adder 140b adds the output of register 124b to the operand from output port PO1 and provides its output to input port PI1. For the coupled dual-MAC configuration, adder 130 adds the outputs of registers 124a and 124b, and adder 140b adds the output of adder 130 to the operand from output port PO1 and provides its output to input port PI1. For the ALU path, ALU 160 receives the output of register 156 and/or the operands from output port PO7 and the auxiliary bus, operates on its input operands, and provides its output to input port PI3.

DSP 102 can provide all of the functionality of DSP 100. However, DSP 102 can be clocked at a higher frequency than DSP 100 (up to two times faster), because the critical paths in DSP 102 are broken up by the pipeline registers. This allows DSP 102 to achieve a higher overall throughput than DSP 100. A pipeline register can also be inserted between adders 130 and 140b to break up this path further, if it becomes a new critical path with a much longer delay than all other paths in DSP 102. In that case, the execution cycle of DSP 102 may be split into three pipeline stages.
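The two-stage operation can be sketched as a small clocked model in Python. This is only an illustration under stated assumptions: the class and method names are hypothetical, the accumulate operands from PO1 and PO6 are carried alongside the products for simplicity (in the described hardware they are read by the second stage), and bit widths are ignored.

```python
class PipelinedDualMac:
    """Two-stage sketch of the multiply-accumulate paths of DSP 102."""

    def __init__(self):
        self.reg124a = 0   # pipeline register after multiplier 122a
        self.reg124b = 0   # pipeline register after multiplier 122b
        self.ctrl = None   # (a, f, coupled) for the operation in flight

    def clock(self, operands=None):
        """Advance one clock; return the stage-2 result or None."""
        result = None
        # Stage 2: adders work on values latched on the previous clock.
        if self.ctrl is not None:
            a, f, coupled = self.ctrl
            sum130 = self.reg124b + (self.reg124a if coupled else 0)
            result = (a + sum130, f + self.reg124a)   # (PI1, PI2)
        # Stage 1: multiply and latch the products for the next clock.
        if operands is not None:
            a, b, c, d, e, f, coupled = operands
            self.reg124a = d * e
            self.reg124b = b * c
            self.ctrl = (a, f, coupled)
        else:
            self.ctrl = None
        return result


pipe = PipelinedDualMac()
print(pipe.clock((1, 2, 3, 4, 5, 6, True)))    # nothing to retire yet
print(pipe.clock((0, 1, 1, 1, 1, 0, False)))   # first operation retires
print(pipe.clock())                            # second operation retires
```

Each result appears one clock after its operands are issued, but a new operation can be issued every clock, which is why the shorter stages translate into higher throughput.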

The DSP architecture shown in Fig. 1 has limited configurability and is not suited to all types of signal processing computations. DSP 100 can perform two MAC operations (either independent or coupled) and one ALU operation in parallel. For some applications it may be preferable to have two ALU operations and one MAC operation performed in parallel, or to have two MAC operations and two ALU operations all performed in parallel. Applications for which two parallel ALU operations are preferable include computation of the sum of absolute differences (SAD) metric for motion estimation in video compression, template matching in voice recognition, and path distance computation in Viterbi decoding, all of which are known in the art.
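As a reminder of why such workloads are ALU-bound, here is a minimal Python sketch of the sum of absolute differences. It contains no multiplication at all; the split of each step into two ALU operations shown in the comments is an assumption made for illustration, not something the source specifies.

```python
def sum_of_absolute_differences(block_a, block_b):
    """SAD metric over two equally sized sequences of pixel values."""
    total = 0
    for x, y in zip(block_a, block_b):
        diff = x - y          # one ALU operation: subtraction
        total += abs(diff)    # another ALU operation: absolute value, accumulate
    return total


print(sum_of_absolute_differences([1, 5, 3], [4, 2, 3]))
```

With only one ALU available, the two operations per pixel would have to be issued in separate instructions while the multipliers sit unused.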

Fig. 3 illustrates a block diagram of a configurable DSP 104 with two MAC units and two ALUs. DSP 104 includes most of the elements of DSP 100 shown in Fig. 1. DSP 104 additionally includes multiplexers 142a and 142b and an ALU 150, which replaces adder 140a of DSP 100.

For the embodiment shown in Fig. 3, multiplexer 142a receives the output of multiplier 122a and operands from output port PO5 and the auxiliary bus. Multiplexer 142a selects one of its three inputs and provides the operand from the selected input to one input of ALU 150. Multiplexer 142b receives operands from output ports PO4 and PO6, selects one of its two inputs, and provides the operand from the selected input to the other input of ALU 150. ALU 150 can perform logical operations and bit manipulation, as well as addition and subtraction, on its input operands and provides its output to input port PI2.
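The operand selection around ALU 150 can be sketched in Python as follows. The select names ("mult", "po5", "bus", "po4", "po6") and the operation names are hypothetical encodings chosen for this illustration; the source does not define control encodings or the full ALU operation set.

```python
import operator

# Hypothetical subset of ALU operations, keyed by an illustrative name.
ALU_OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "and": operator.and_,
    "or": operator.or_,
    "xor": operator.xor,
}


def alu150(mult122a_out, po4, po5, po6, bus, sel_a, sel_b, op):
    """Sketch of multiplexers 142a/142b feeding ALU 150 in DSP 104."""
    # Multiplexer 142a: multiplier output, PO5 operand, or immediate value.
    in_a = {"mult": mult122a_out, "po5": po5, "bus": bus}[sel_a]
    # Multiplexer 142b: PO4 operand or PO6 operand.
    in_b = {"po4": po4, "po6": po6}[sel_b]
    return ALU_OPS[op](in_a, in_b)


# Acting as the accumulator of the first MAC unit: product + PO6 operand.
print(alu150(20, 4, 5, 6, 9, "mult", "po6", "add"))
# Acting as a second, independent ALU: PO5 - PO4.
print(alu150(20, 4, 5, 6, 9, "po5", "po4", "sub"))
```

The same unit therefore serves either as the adder of a MAC path or as a general ALU, depending only on the multiplexer selections.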

Fig. 3 illustrates the shared use of output ports PO4, PO5 and PO6 of register file 110 to supply data to MAC1 and ALU 150. This reduces the number of output ports needed to supply data to the MAC unit and the ALU, which can simplify the design of the register file. Fig. 3 also illustrates a specific example of connecting ALU 150 to the output ports of register file 110 and to other units in DSP 104. Other connections are also possible. For example, multiplexers 142a and 142b may have more inputs in order to receive more operands, and/or may receive operands from other output ports of register file 110.

DSP 104 can be used in the different configurations listed in Table 1. These configurations can be selected by appropriately setting the connections between the various units in DSP 104, for example, using DSP instructions. The configuration of DSP 104 can be changed dynamically, for example, on an instruction-by-instruction basis.

Table 1
              Single MAC    Dual MAC
Single ALU    Supported     Supported
Dual ALU      Supported     Supported

For DSP 104, some of the operands are shared in some configurations because of the limited number of output ports and connections.

The flexibility to operate the DSP in a variety of configurations allows it to adapt better to different types of signal processing operations. It also allows better utilization of the available hardware and improves overall throughput. Various configurations of the DSP are illustrated below.

Fig. 4 illustrates a block diagram of a configurable pipelined DSP 106 with two MAC units and two ALUs. DSP 106 includes most of the elements of DSP 104 shown in Fig. 3. DSP 106 additionally includes pipeline registers 124a, 124b and 156, which are placed at the outputs of multipliers 122a and 122b and shifter 154, respectively. DSP 106 can support all of the configurations shown in Table 1 for DSP 104. However, DSP 106 can operate at a higher clock rate than DSP 104, because pipeline registers 124a, 124b and 156 break up the critical paths of MAC1, MAC2 and the ALU path, respectively.

The DSP data paths can be designed with more units and/or connections than those shown in Fig. 3 and 4 in order to achieve greater flexibility and functionality. In addition, the register file can be provided with more output ports to support greater flexibility in the selection of operands.

Fig. 5 illustrates a block diagram of another configurable pipelined DSP 108 with two MAC units and two ALUs. DSP 108 includes most of the elements of DSP 106 shown in Fig. 4. However, DSP 108 includes a register file 112 with ten output ports, which replaces register file 110 with its eight output ports. DSP 108 also includes additional units and connections for MAC1, MAC2 and the ALU path, as described below.

For MAC1, a shifter 126a receives the output of register 124a, shifts its input operand by a specified number of bits, and provides its output to one input of each of multiplexers 128 and 142a. Multiplexer 142a also receives operands from output ports PO4, PO5 and PO7 and the auxiliary bus. Multiplexer 142a provides one of its five inputs to one input of ALU 150.

For MAC2, a shifter 126b receives the output of register 124b, shifts its input operand by a specified number of bits, and provides its output to adder 130. A shifter 132 receives the operand from output port PO1, shifts its input operand by a specified number of bits, and provides its output to one input of a multiplexer 134. Multiplexer 134 also receives the values '0' and '0x8000' and provides one of its three inputs to adder 140. In particular, multiplexer 134 provides '0' when no addition is required of adder 140, '0x8000' for rounding, and the operand from output port PO1 when accumulation is performed.
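The role of the '0x8000' constant can be shown with a short Python sketch. It assumes, purely for illustration, that only the upper part of the sum (bits 16 and above) is kept as the final result, so that 0x8000 corresponds to half of the least significant retained bit; the source does not state how the result is truncated. The select names and function names are hypothetical, and the shifters are omitted.

```python
def mux134(select, po1_operand=0):
    """Sketch of multiplexer 134: zero, rounding constant, or accumulator."""
    if select == "none":
        return 0                # no addition required
    if select == "round":
        return 0x8000           # half of bit 16, for rounding
    if select == "accumulate":
        return po1_operand      # operand from output port PO1
    raise ValueError(f"unknown select: {select}")


def product_high_part(x, y, select, po1_operand=0):
    """Multiply, add the multiplexer 134 value, keep bits 16 and above."""
    total = x * y + mux134(select, po1_operand)
    return total >> 16          # assumed truncation, not from the source


# 384 * 256 = 0x18000, i.e. 1.5 in units of bit 16.
print(product_high_part(384, 256, "none"))    # truncates down
print(product_high_part(384, 256, "round"))   # rounds to nearest
```

Adding the constant before truncation turns a floor operation into round-to-nearest, which is the extra half bit of precision mentioned in the description.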

For the ALU path, a multiplexer 152 receives operands from output port PO8 and the auxiliary bus and provides its output to shifter 154. Shifter 154 also receives the operand from output port PO7, selects one of its two inputs, shifts the operand from the selected input by a specified number of bits, and provides its output to register 156. Multiplexer 158a receives the output of register 156 and the operand from output port PO9, selects one of its two inputs, and provides the operand from the selected input to one input of ALU 160. Multiplexer 158b receives operands from output port PO10 and the auxiliary bus, selects one of its two inputs, and provides the operand from the selected input to the other input of ALU 160. ALU 160 operates on its input operands and provides its output to a multiplexer 164. A shifter 162 receives the operand from output port PO9 and the output of multiplexer 158b on its two inputs, selects one of these two inputs, shifts the operand from the selected input by a specified number of bits, and provides its output to multiplexer 164. Multiplexer 164 provides one of its two inputs to an ALU saturation unit 166, which saturates the received value and provides the saturated value to input port PI3.
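The final selection and saturation step can be sketched in Python. The 40-bit signed width used as the default is an assumption borrowed from the exemplary 40-bit ALU width given for DSP 100; the source does not state the width used by unit 166. The function names and the boolean select are hypothetical.

```python
def saturate(value, bits=40):
    """Clamp value to the signed two's-complement range of `bits` bits."""
    hi = (1 << (bits - 1)) - 1
    lo = -(1 << (bits - 1))
    return max(lo, min(hi, value))


def alu_path_output(alu160_out, shifter162_out, select_shifter, bits=40):
    """Sketch of multiplexer 164 followed by ALU saturation unit 166."""
    mux164 = shifter162_out if select_shifter else alu160_out
    return saturate(mux164, bits)


print(alu_path_output(123, 0, select_shifter=False))       # in range, unchanged
print(alu_path_output(1 << 39, 0, select_shifter=False))   # clamped to maximum
```

Saturation keeps an overflowing result at the nearest representable value instead of letting it wrap around, which is usually the less harmful failure mode in signal processing.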

Shifters 126a, 126b and 132 are provided in MAC1 and MAC2 to handle numbers of different orders of magnitude. Shifters 154 and 162 are provided in the ALU path for shift operations. Each of these shifters can be individually configured to shift its input operand by, for example, 0, 1, 2 or 3 bits to the left, or by some other number of bits. Multiplexer 134 supports rounding by supplying '0x8000', which provides an additional half bit of precision.

DSP 108 differs from DSP 100 shown in Fig. 1 in the following ways. First, pipeline registers 124a and 124b are placed at the outputs of multipliers 122a and 122b in MAC1 and MAC2, respectively, and pipeline register 156 is placed at the output of shifter 154 in the ALU path. Second, adder 140a in MAC1 is replaced by ALU 150, which can perform logical operations and bit manipulation in addition to addition and subtraction. Third, shifter 162 and two additional output ports PO9 and PO10 of register file 112 are added for the ALU path. Fourth, various new connections are provided to ALU 150 for MAC1.

DSP 108 can support all of the configurations shown in Table 1 for DSP 104 of Fig. 3. DSP 108 can support more types and combinations of operations because of the additional shifters, multiplexers, output ports and connections. DSP 108 can also support a higher clock frequency, because pipeline registers 124a, 124b and 156 break up the critical paths of MAC1, MAC2 and the ALU path, respectively.

Fig. 6 illustrates DSP 108 operating in the dual-MAC/single-ALU configuration. In this configuration, MAC1 and MAC2 can be operated independently or coupled, by appropriately controlling multiplexer 128. ALU 150 receives the output of shifter 126a (via multiplexer 142a, which is not shown in Fig. 6 for clarity) and the operand from output port PO6 (via multiplexer 142b, which is also not shown). In this configuration, ALU 150 functions as an adder and adds its two input operands.

Fig. 7 illustrates DSP 108 operating in the single-MAC/dual-ALU configuration. In this configuration, MAC1 is idle and MAC2 is operational. Multiplexer 142a can receive operands from output ports PO4, PO5 and PO7 and the auxiliary bus, select one of these four inputs, and provide the operand from the selected input to one input of ALU 150. Multiplexer 142b can receive operands from output ports PO4 and PO6, select one of these two inputs, and provide the operand from the selected input to the other input of ALU 150. ALU 150 can perform any ALU operation on its input operands.

Fig. 8 illustrates DSP 108 operating in the dual-MAC/dual-ALU configuration. In this configuration, MAC1 and MAC2 operate in the coupled dual-MAC configuration, and multiplexer 128 is omitted for clarity. Multiplexer 142a can receive operands from output port PO7 and the auxiliary bus, select one of these two inputs, and provide the operand from the selected input to one input of ALU 150. ALU 150 can also receive the operand from output port PO6 on its other input and perform any ALU operation on its input operands.

DSPs 104 and 106 can also be operated in the dual-MAC/single-ALU, single-MAC/dual-ALU and dual-MAC/dual-ALU configurations in a manner similar to that shown in Fig. 6, 7 and 8 for DSP 108. However, the connections in DSPs 104 and 106 for these configurations may differ from the connections in DSP 108, because DSPs 104 and 106 have fewer output ports and multiplexers than DSP 108.

The configurable architectures of DSPs 104, 106 and 108 allow these DSPs to perform various types and combinations of computations in a single instruction. For example, the following types and combinations of computations can be performed by these DSPs in a single instruction:

A=B+C; D=E+F; G=H+(I*J).

A=B+C; D=E-F; G=H+(I*J)+(K*L).

A=(B<<3)+C; D=E&F; G=H-(I*J).

The input operands for the computations shown above may come from the output ports of the register file and the auxiliary bus. The three results A, D and G of these computations may be provided to the three input ports of the register file. Many other types and combinations of computations can also be performed by DSPs 104, 106 and 108.
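For concreteness, the three example instructions above can be written out as plain Python functions. This is only an arithmetic illustration: the function names are hypothetical, bit widths and saturation are ignored, and the comments about which unit might produce which result are assumptions, since the source does not assign the computations to specific units.

```python
def instruction_1(b, c, e, f, h, i, j):
    """A=B+C; D=E+F; G=H+(I*J): two ALU additions plus one MAC operation."""
    return b + c, e + f, h + (i * j)


def instruction_2(b, c, e, f, h, i, j, k, l):
    """A=B+C; D=E-F; G=H+(I*J)+(K*L): two ALU operations plus a coupled dual MAC."""
    return b + c, e - f, h + (i * j) + (k * l)


def instruction_3(b, c, e, f, h, i, j):
    """A=(B<<3)+C; D=E&F; G=H-(I*J): shift-and-add, bitwise AND, multiply-subtract."""
    a = (b << 3) + c   # assumed: shifter followed by an ALU in the ALU path
    d = e & f          # assumed: the ALU that replaces the first MAC adder
    g = h - (i * j)    # assumed: one MAC unit
    return a, d, g


print(instruction_1(1, 2, 3, 4, 10, 2, 3))
print(instruction_2(1, 2, 5, 3, 10, 2, 3, 4, 5))
print(instruction_3(1, 2, 0b1100, 0b1010, 100, 3, 4))
```

Each function returns the triple (A, D, G), mirroring the three results that are written back to the register file.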

The configurable architectures of DSPs 104, 106 and 108 are better suited to all types of signal processing operations than the architecture of DSP 100, because they support all of the parallel combinations shown in Table 1.

The configurable and/or pipelined DSPs described above can be used for various applications, including communications, computing, networking, personal electronics, and so on. An exemplary use of the DSP for wireless communication is described below.

Fig. 9 illustrates a block diagram of a wireless device 900 in a wireless communication system. Wireless device 900 may be a cellular phone, a handset, a terminal, a mobile station, or some other device or design. The wireless communication system may be a code division multiple access (CDMA) system, a Global System for Mobile Communications (GSM) system, a multiple-input multiple-output (MIMO) system, an orthogonal frequency division multiplexing (OFDM) system, an orthogonal frequency division multiple access (OFDMA) system, and so on. Wireless device 900 is capable of providing bidirectional communication via a receive path and a transmit path.

For the receive path, signals transmitted by base stations in the system are received by an antenna 912, routed through a duplexer (D) 914, and provided to a receiver unit (RCVR) 916. Receiver unit 916 conditions (e.g., filters, amplifies, and frequency downconverts) the received signal, digitizes the conditioned signal, and provides data samples to a DSP 920 for further processing. For the transmit path, data to be transmitted from wireless device 900 is provided by DSP 920 to a transmitter unit (TMTR) 918. Transmitter unit 918 conditions (e.g., filters, amplifies, and frequency upconverts) the data and generates a modulated signal, which is routed through duplexer 914 and transmitted via antenna 912 to the base stations.

DSP 920 includes various units such as, for example, a register file 930, MAC units 932, ALUs 934, an internal controller 940, and an internal memory unit 942, all of which are coupled via an internal bus. Internal controller 940 executes instructions that direct MAC units 932 and ALUs 934 to perform various computations. For example, DSP 920 may perform encoding, interleaving, modulation, channelization, spectral spreading, filtering, and so on for the transmit path. DSP 920 may perform filtering, despreading, channelization, demodulation, deinterleaving, decoding, and so on for the receive path. These various operations are known in the art. The specific processing to be performed by DSP 920 depends on the communication system. Register file 930, MAC units 932 and ALUs 934 may be implemented with any of the DSP architectures shown in Fig. 2, 3, 4 and 5.

A controller 950 controls the operation of DSP 920 and other units in wireless device 900. The other units are not shown in Fig. 9, since they do not contribute to the understanding of the various embodiments. Memory units 942 and 952 store program codes and data used by controllers 940 and 950, respectively.

Fig. 9 illustrates an exemplary design of a wireless device in which the configurable and/or pipelined DSPs described above can be used. These DSPs can also be used in other electronic devices.

The configurable and/or pipelined DSP architectures described herein can be implemented in various hardware units. For example, these DSP architectures can be implemented in an application specific integrated circuit (ASIC), a digital signal processing device (DSPD), a programmable logic device (PLD), a field programmable gate array (FPGA), a processor, a controller, a microcontroller, a microprocessor, and other electronic units.

The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not limited to the embodiments shown above but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

1. A processor with a scalable processor architecture, comprising

a first multiply-accumulate (MAC) unit configured to receive and multiply first and second operands from a register file to obtain a first intermediate result, store the first intermediate result in a first intermediate register, sum the stored first intermediate result with a third operand, and provide a first output result to the register file; and

a second MAC unit configured to receive and multiply fourth and fifth operands from the register file to obtain a second intermediate result, store the second intermediate result in a second intermediate register, sum a sixth operand with either the stored second intermediate result or the sum of the stored first and second intermediate results, and provide a second output result to the register file,

wherein the first MAC unit and the second MAC unit respond to processor instructions for dynamic reconfiguration between a first configuration, in which the first MAC unit and the second MAC unit operate as two independent MAC units, and a second configuration, in which the first MAC unit and the second MAC unit operate as coupled MAC units, and

wherein the first configuration or the second configuration includes a MAC unit and an arithmetic logic unit (ALU).

2. The processor according to claim 1, further comprising

an arithmetic logic unit (ALU) path configured to receive and perform a first operation on a seventh operand or an eighth operand to obtain a third intermediate result, store the third intermediate result in a third register, perform a second operation on the third intermediate result, the seventh operand, the eighth operand, or a combination thereof, and provide a third output result.

3. The processor according to claim 2, further comprising the register file, configured to provide the first through eighth operands to the first and second MAC units and the ALU path and to store the first through third output results from the first and second MAC units and the ALU path.

4. The processor according to claim 1, in which

the first MAC unit includes

a first multiplier configured to receive and multiply the first and second operands and provide the first intermediate result, followed by a first register configured to store the first intermediate result, and

a first adder configured to sum the stored first intermediate result from the first register with the third operand and provide the first output result, and

the second MAC unit includes

a second multiplier configured to receive and multiply the fourth and fifth operands and provide the second intermediate result, followed by a second register configured to store the second intermediate result,

a second adder configured to sum the stored second intermediate result from the second register with either zero or the first intermediate result from the first register, and

a third adder configured to sum the output of the second adder with the sixth operand and provide the second output result.
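The data flow recited in claims 1 and 4 can be illustrated with a small behavioral model. This is only a sketch under stated assumptions: the class name `DualMac`, the method names, and the `coupled` flag are hypothetical and do not appear in the claims, and plain Python integers stand in for fixed-width hardware operands.

```python
from dataclasses import dataclass


@dataclass
class DualMac:
    """Toy model of the two MAC units (all names are illustrative)."""
    reg1: int = 0  # first intermediate register
    reg2: int = 0  # second intermediate register

    def multiply(self, op1, op2, op4, op5):
        # Stage 1: both multipliers run and latch their products.
        self.reg1 = op1 * op2
        self.reg2 = op4 * op5

    def accumulate(self, op3, op6, coupled):
        # First MAC unit: the first adder sums reg1 with the third operand.
        out1 = self.reg1 + op3
        # Second MAC unit: the second adder sums reg2 with zero
        # (independent configuration) or with reg1 (coupled configuration);
        # the third adder then adds the sixth operand.
        second_adder = self.reg2 + (self.reg1 if coupled else 0)
        out2 = second_adder + op6
        return out1, out2


mac = DualMac()
mac.multiply(2, 3, 4, 5)                            # reg1 = 6, reg2 = 20
independent = mac.accumulate(10, 100, coupled=False)
together = mac.accumulate(10, 100, coupled=True)
```

In this model the only difference between the two configurations is the second input of the second adder, which is one way to read the "either zero or the first intermediate result" wording.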

5. The processor according to claim 2, in which the ALU path includes

a shifter configured to receive and shift the seventh or eighth operand and provide the third intermediate result, followed by the third register configured to store the third intermediate result; and

an ALU configured to perform the second operation on the third intermediate result from the third register, the seventh operand, the eighth operand, or a combination thereof to provide the third output result.
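The shifter, register, and ALU arrangement of claims 2 and 5 can be sketched the same way. The function name `alu_path`, its parameters, the signed shift-amount convention, and the particular set of ALU operations are assumptions made for illustration; the claims do not specify them.

```python
import operator

# Hypothetical operation set; the claims only say "an operation".
ALU_OPS = {
    "add": operator.add,
    "sub": operator.sub,
    "and": operator.and_,
    "or": operator.or_,
    "xor": operator.xor,
}


def alu_path(op7, op8, shift_src, shift_amount, alu_op, alu_other):
    """Shift op7 or op8, latch the result, then combine it in the ALU."""
    # Shifter: selects the seventh or eighth operand and shifts it.
    # A positive amount shifts left, a negative amount shifts right
    # (an illustrative convention).
    source = op7 if shift_src == 7 else op8
    if shift_amount >= 0:
        reg3 = source << shift_amount
    else:
        reg3 = source >> -shift_amount
    # ALU: combines the stored third intermediate result with op7 or op8.
    other = op7 if alu_other == 7 else op8
    return ALU_OPS[alu_op](reg3, other)


result = alu_path(3, 5, shift_src=8, shift_amount=2,
                  alu_op="add", alu_other=7)  # (5 << 2) + 3
```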

6. A processor having a register file, comprising

a first processing path coupled to the register file, the first processing path including

a first multiplier configured to receive and multiply first and second operands from the register file and provide a first intermediate result, and

a first arithmetic logic unit (ALU) configured to receive the first intermediate result, a third operand, and at least one additional operand, perform operations on the received operands, and provide a first output result; and

a second processing path coupled to the register file, the second processing path including

a second multiplier configured to receive and multiply fourth and fifth operands from the register file and provide a second intermediate result;

a first adder configured to sum the second intermediate result with either zero or the first intermediate result from the first processing path, and

a second adder configured to sum the output of the first adder with a sixth operand from the register file and provide a second output result,

wherein the first processing path is configured to selectively provide a first option of use as a multiply-accumulate processing path and a second option of use as an ALU processing path.
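The selectable first processing path of claim 6 might be modeled as below. This is a hedged sketch: the function name `first_path`, the `mode` strings, and the choice of addition as the stand-in ALU operation are hypothetical, and treating the multiplier as bypassed in ALU mode is an assumption rather than something the claim states.

```python
def first_path(op1, op2, op3, extra, mode):
    """Toy model of a path usable either as a MAC or as an ALU."""
    if mode == "mac":
        # The multiplier output (first intermediate result) is
        # accumulated with the third operand.
        return op1 * op2 + op3
    if mode == "alu":
        # Assumption: the multiplier output is ignored and the ALU
        # combines the third operand with the additional operand.
        # Addition stands in for any ALU operation.
        return op3 + extra
    raise ValueError(f"unknown mode: {mode!r}")


mac_result = first_path(2, 3, 4, 0, mode="mac")
alu_result = first_path(2, 3, 4, 7, mode="alu")
```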

7. The processor according to claim 6, further comprising

an ALU path including

a shifter configured to receive and shift a seventh operand or an eighth operand to provide a third intermediate result; and

a second ALU configured to perform operations on the third intermediate result, the seventh operand, the eighth operand, or a combination thereof to provide a third output result.

8. The processor according to claim 7, further comprising

the register file, configured to provide the first through eighth operands to the first and second MAC units and the ALU path and to store the first through third output results from the first and second MAC units and the ALU path.

9. The processor according to claim 8, in which the register file includes at least two output ports configured to supply data to the first MAC unit, which operates either as a MAC unit or as an ALU.

10. The processor according to claim 7, in which the first and second ALUs are further configured to receive operands from an auxiliary bus.

11. The processor according to claim 7, configurable in a dual-MAC/single-ALU configuration, a single-MAC/dual-ALU configuration, or a dual-MAC/dual-ALU configuration.

12. The processor according to claim 7, configurable on an instruction-by-instruction basis to operate in a dual-MAC/single-ALU configuration, a single-MAC/dual-ALU configuration, or a dual-MAC/dual-ALU configuration.

13. The processor according to claim 6, further comprising

a first register configured to store the first intermediate result and provide the stored first intermediate result to the first ALU and the first adder; and

a second register configured to store the second intermediate result and provide the stored second intermediate result to the first adder.

14. The processor according to claim 7, further comprising

a first register configured to store the first intermediate result and provide the stored first intermediate result to the first ALU and the first adder;

a second register configured to store the second intermediate result and provide the stored second intermediate result to the first adder; and

a third register configured to store the third intermediate result and provide the stored third intermediate result to the second ALU.

15. A processor comprising

a register file,

a first processing path coupled to the register file, the first processing path including

a first multiplier configured to receive and multiply first and second operands from the register file and provide a first intermediate result,

a first intermediate register configured to store the first intermediate result and provide the stored first intermediate result, and

a first arithmetic logic unit (ALU) configured to receive and perform operations on the stored first intermediate result, a third operand from the register file, at least one other operand, or a combination thereof to provide a first output result; and

a second processing path coupled to the register file, the second processing path including

a second multiplier configured to receive and multiply fourth and fifth operands from the register file and provide a second intermediate result;

a second intermediate register configured to store the second intermediate result and provide the stored second intermediate result,

a first adder configured to sum the stored second intermediate result with either zero or the stored first intermediate result from the first processing path, and

a second adder configured to sum the output of the first adder with a sixth operand from the register file and provide a second output result,

wherein the first processing path is selectively configurable to provide a MAC unit processing path and an ALU processing path.

16. The processor according to claim 15, further comprising

a third processing path coupled to the register file, the third processing path including

a first shifter configured to receive and shift a seventh operand or an eighth operand from the register file and provide a third intermediate result,

a third intermediate register configured to store the third intermediate result and provide the stored third intermediate result, and

a second ALU configured to perform operations on the stored third intermediate result, the seventh operand, the eighth operand, or a combination thereof to provide a third output result,

wherein the third processing path is configurable as an ALU processing path operating in parallel with the first and second processing paths.

17. The processor according to claim 16, in which the second ALU is configured to perform operations on the stored third intermediate result, the seventh operand, the eighth operand, a ninth operand from the register file, a tenth operand from the register file, or a combination thereof to provide the third output result.

18. The processor according to claim 16, in which the third processing path further includes

a second shifter configured to receive ninth and tenth operands and provide a fourth output result, and a multiplexer configured to receive the third and fourth output results and provide either the third or the fourth output result.

19. The processor according to claim 15, in which

the first processing path further includes a first shifter configured to receive the stored first intermediate result and provide a first shifted result to the first ALU and the first adder, and

the second processing path further includes a second shifter configured to receive the stored second intermediate result and provide a second shifted result to the first adder.

20. The processor according to claim 16, in which the first ALU is configured to perform operations on the stored first intermediate result, the first operand, the second operand, the third operand, the seventh operand, or a combination thereof to provide the first output result.

21. A wireless device having at least one antenna coupled to a processor for exchanging information by wireless transmission in a wireless communication system, the wireless device comprising

a register file for storing and transferring multiple operands received from the antenna,

a first processing path including

a first multiplier configured to receive and multiply first and second operands from the register file and provide a first intermediate result, and

a first arithmetic logic unit (ALU) configured to receive the first intermediate result, a third operand from the register file, and at least one additional operand, perform operations on the received operands, and provide a first output result;

a second processing path including

a second multiplier configured to receive and multiply fourth and fifth operands from the register file and provide a second intermediate result;

a first adder configured to sum the second intermediate result with either zero or the first intermediate result from the first processing path, and

a second adder configured to sum the output of the first adder with a sixth operand from the register file and provide a second output result;

a third processing path including

a shifter configured to receive and shift a seventh operand or an eighth operand from the register file and provide a third intermediate result, and

a second ALU configured to perform operations on the third intermediate result, the seventh operand, the eighth operand, or a combination thereof to provide a third output result; and

wherein the register file is configured to provide the first through eighth operands to the first, second, and third processing paths and to store the first through third output results, and

wherein the first processing path is dynamically reconfigurable between a multiply-accumulate unit processing path configuration and an ALU processing path configuration.



 

