# Method of facilitating multiplication of floating-point numbers represented in residue number system

FIELD: information technology.

SUBSTANCE: method comprises steps of: concurrently writing the remainder on base p_{1} of a multiplicand in memory elements; concurrently counting the number of units b_{i} in each column of the i-th matrix; shifting the binary number b_{1} one bit to the right; summing with a number b_{2}; shifting the obtained sum _{3}. Similarly, the obtained sums are shifted and summed with subsequent numbers to obtain a sum _{1} is the first multiplication bit s_{1}, the least significant bit of each obtained sum _{2*m}. If s is greater than p_{1}, the obtained product s is corrected by successive subtraction of the base p_{1} from s until s is less than p_{1}, otherwise correction is not performed; similarly, products of m-bit residues on the rest of the bases are calculated and corrected; the powers of multipliers are simultaneously summed up and the resultant sum is the power of the determined product.

EFFECT: faster computation.

2 dwg

The invention relates to computing and is designed to build high-speed parallel-pipelined multipliers processing arrays positive floating-point numbers in the system of residual classes (JUICE).

The multiplication operation is performed in parallel on several grounds p_{i}their number n is determined by the range of P representations of numbers: P=p_{1}*p_{2}*...*p_{n}. Representation of a number in the JUICE provides the least nonnegative residues of A_{i}system coprime bases p_{i}(i∈[1,n]).

The number in the system of residual classes represent positional order and the mantissa consists of a set of residues on the grounds p_{i}.

Well-known iterative method for multiplication of integers, floating-point, which is applicable to numbers presented in positional notation, and in the system of residual classes. In this way the multiplication is reduced to a sequence of plies stacked, which are executed sequentially. When shifts multiplier vacated bits are filled with zeros. If the first bit of m-bit multiplier is equal to one, then the rst term is mnimum, otherwise the rst term is equal to zero. If the second bit of the multiplier is equal to one, then the second term is mnimum, shifted to od the n discharge to the left, otherwise, the second term is zero. To the sum of the first and the second term is added to the multiplicand, shifted by two digits to the left, if the second bit of the multiplier is equal to one, otherwise the added zero. Then to the resulting amount is added to the multiplicand, shifted by three digits to the left, if the third bit of the multiplier is equal to one, otherwise the added zero. And so on until the k-th digit of the multiplier, the accumulated amount of the multiplicand is added, shifted by k positions to the left, if the k-th bit of the multiplier is equal to one, otherwise the added zero. And so on up to m-th digit of the multiplier, the accumulated amount of the multiplicand is added, shifted by n bits to the left, if the m-th bit of the multiplier is equal to one, otherwise the added zero. In the end, the cumulative sum is the desired product of factors. The disadvantage is that, first, when the iterative method of multiplication is m-1 operations totals, and taking into account the sequential method transfers the most significant bit is the number of clock cycles summation is equal to (m-1)*2*m. Secondly, the process of forming the sum is a serial process.

The technical result from the use of the organization's way of multiplying the floating-point numbers represented in the system of residual classes, is to increase the computation speed due to the replacement of a series of m-1 arithmetic operations is adding 2*(m-1) in parallel executable operations of counting the number of single bit in the bit slices,
formed from the digits mnimogo. This operation is performed in parallel for all residues on the bases of the system of residual classes, forming cofactors. Based on the analysis and modification of the obtained values of the number of units in all bit slices is the formation of values in the binary number, which is the value of the required works. As a result, the number of clock cycles necessary for the formation of the value of the sum of an integer array of binary numbers - work, will be equal to (log_{2}m)*2*m clock cycles. Thus, the proposed method ensures the operation of forming works faster-known iterative method in ((m-1)*2*m)/((log_{2}m)*2*m)=(m-1)/log_{2}m times, for example, when m=64 calculations will be performed 8 times faster.

Description of work unit: each i-th binary positional residue on the basis of p_{i}can be represented as a sequence of bits A_{i}(a_{m}, a_{m-1}, ..., a_{2}, a_{1}), where m is the bit width of the remainder, i∈[1,n].

Is parallel to the account balance on the basis of p_{1}mnimogo in the cells of the matrix in the memory elements, the matrix size is (2*m-1) columns and m rows, where m is the bit width of the first Foundation of the system of residual classes. In cells 1 through m of the first row of the matrix is written m-bit balance based on the Yu p_{
i}mnimogo in the case when the first digit of the multiplier is equal to one, otherwise zeros are written.

In cell 2 through m+1 the second row of the matrix is written m-bit balance on the basis of pi mnimogo in the case where the second digit of the multiplier is equal to one, otherwise zeros are written.

And so on, in a cell with k (m+k-1) k-th row of the matrix is written m-bit balance on the basis of p_{1}mnimogo in the case when the k-th bit of the multiplier is equal to one, otherwise zeros are written.

And so on, in cells m (2*m-1) second row of the matrix is written bitwise balance on the basis of p_{1}mnimogo in the case when the m-th bit of the multiplier is equal to one, otherwise zeros are written.

In the other cells of the matrix are written with zeros, in General, the placement mnimogo in the cells of the matrix in the memory elements is as follows: