
Author Topic:  Fixed Arbitrary Precision and multiplication  (Read 480 times)


Offline superheal

  • Fractal Feline
  • **
  • Posts: 190
Fixed Arbitrary Precision and multiplication
« on: January 13, 2022, 03:38:06 PM »
I was reading through this: http://www.hpdz.net/TechInfo/Bignum.htm#General and I was also exploring MandelMachine's BigNum_Arb class, which is basically a fixed-precision lib. I wanted to find some way to do the multiplication, but I am struggling a bit. MM is probably doing the multiplication part in C or asm. It also has some other obsolete BigNum classes with hardcoded precision that do implement multiplication, but their data storage format seems to be different (if I store the same number created in the arb class into the fixed-digit class, the printed double value is wrong).

The format is an array of 64-bit longs, but only 60 bits of each are used (4 bits are reserved for overflow).
[60bits] .decimal_point. [60bits][60bits][60bits]...
I have tried to make a small example with only 2 decimals in order to try the multiplication as described in http://www.hpdz.net/TechInfo/Bignum.htm#General
but I probably mess up the sub-products, because they need twice as many bits to be calculated.

Does anyone have any experience with this?

Linkback: https://fractalforums.org/index.php?topic=4598.0

Offline marcm200

  • 3e
  • *****
  • Posts: 1122
Re: Fixed Arbitrary Precision and multiplication
« Reply #1 on: January 13, 2022, 06:08:17 PM »
The multiplication in the link is standard school math multiplication in base-10.

I implemented my own fixed-precision number type, where I use digits to base 1 billion (which fit in 32 bits). Multiplying two such numbers is then performed by casting each digit to int64_t and multiplying one number with each of the digits of the 2nd one, shifting the sub-result an appropriate number of digit positions (done by trailing zeros after shifting, as the fractional part is left-aligned and the integer part right-aligned), then adding it to the ongoing overall result with a fixed-number addition routine that respects carry-over. That way I can reuse all tested and working routines.

I suspect you're doing the same. Or are you using the presumed speed-up described in the article, where some of the sub-products are not computed in the first place (when truncation is used as the rounding mode)?
(I'm not using any of those, as my number type is intended for interval arithmetic and I perform outward rounding according to the sign of the left or right interval end.)
« Last Edit: January 13, 2022, 07:21:45 PM by marcm200, Reason: interval rounding description »
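The scheme described above (base-1e9 digits, 64-bit sub-products, shift-and-add with carry-over) can be sketched in Java roughly as follows; the little-endian digit layout and all names here are illustrative assumptions, not code from either library:

```java
public class Base1e9 {
    static final long BASE = 1_000_000_000L; // digits to base 1 billion fit in 32 bits

    // Schoolbook product of little-endian base-1e9 digit arrays: multiply a by
    // each digit of b, shift by the digit position, and add with carry-over.
    static int[] mul(int[] a, int[] b) {
        long[] acc = new long[a.length + b.length];
        for (int j = 0; j < b.length; j++) {
            long carry = 0;
            for (int i = 0; i < a.length; i++) {
                // 64-bit sub-product plus previous contents plus carry: < 2^63
                long t = acc[i + j] + (long) a[i] * b[j] + carry;
                acc[i + j] = t % BASE;
                carry = t / BASE;
            }
            acc[a.length + j] += carry;
        }
        int[] r = new int[acc.length];
        for (int i = 0; i < r.length; i++) r[i] = (int) acc[i];
        return r;
    }
}
```

For example, mul({999999999, 1}, {2}) (little-endian, i.e. 1999999999 × 2) yields {999999998, 3, 0}, i.e. 3999999998.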

Offline superheal

Re: Fixed Arbitrary Precision and multiplication
« Reply #2 on: January 28, 2022, 05:22:47 PM »
Well, I managed to get it working; as you said, using ints as digits. In order to do the multiplication you have to calculate the N^2 sub-multiplications and then add each column together. This sum is done in a long, but what happens if this long overflows? This happens if the numbers have many digits, let's say 10.

From the 64 bits, I kept some of the top bits as a carry for the next digit, but I was first accumulating the whole sum and only then masking out the carry for the next digit. Should I move the carry to the next digit on every sum to avoid overflow?

here is a snippet:

long old_sum = 0;
long sum = 0;
final int SHIFT = 30;
final long MASK = 0x3FFFFFFFL;

for (int i = fracDigits; i > 0; i--) {

    sum = old_sum >>> SHIFT; // carry from the previous column
    for (int j = 0, k = i; j <= i; j++, k--) {
        long bj = bdigits[j];
        long ak = digits[k];

        sum += bj * ak; // THIS CAN OVERFLOW
    }
    resDigits[i] = (int) (sum & MASK);

    old_sum = sum;
}

sum = old_sum >>> SHIFT;

long bj = bdigits[0];
long ak = digits[0];

sum += bj * ak;

resDigits[0] = (int) sum;

Offline hobold

  • Fractal Frankfurter
  • *
  • Posts: 573
Re: Fixed Arbitrary Precision and multiplication
« Reply #3 on: January 28, 2022, 06:12:42 PM »
Quote from superheal: "Should I move the carry to the next digit on every sum to avoid overflow?"
If you reserved no leading bits (to overflow into) then you will have to check overflow on every add, and potentially ripple carry-outs along all of the sum total's parts.

This can be very slow, so people usually do use some extra bits for intermediate results (say, for the overall sum of each column) and only ripple the carries through the sum total in a final single post processing step.
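As an illustration of the second option (extra bits per column, carries rippled once at the end), here is a sketch with 30-bit limbs and 64-bit column accumulators; the names are made up. One caveat: a 30×30-bit product already uses 60 of the 64 bits, so a signed 64-bit column can only safely absorb about eight such products before the carries have to be drained periodically instead of only at the end.

```java
public class ColumnMul {
    static final int SHIFT = 30;
    static final long MASK = 0x3FFFFFFFL;

    // a, b: little-endian 30-bit limbs. Each column sum lives in a 64-bit
    // slot; carries ripple through in one final post-processing pass.
    static int[] mul(int[] a, int[] b) {
        long[] col = new long[a.length + b.length];
        for (int i = 0; i < a.length; i++)
            for (int j = 0; j < b.length; j++)
                col[i + j] += (long) a[i] * b[j]; // ok while few products per column
        int[] r = new int[col.length];
        long carry = 0;
        for (int k = 0; k < col.length; k++) { // the single final carry ripple
            long t = col[k] + carry;
            r[k] = (int) (t & MASK);
            carry = t >>> SHIFT;
        }
        return r;
    }
}
```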

Offline superheal

Re: Fixed Arbitrary Precision and multiplication
« Reply #4 on: January 29, 2022, 09:50:08 AM »
Well, there are reserved bits: in every int I use only 30 bits. But when you multiply two 30-bit ints you need 60 bits, and that's why I use a long to store the temporary product.
The summation for each column takes place in a long, but this overflows. (The products are large because every digit after the decimal point has bits starting from the 30th bit and downwards.)

I switched the code to this and it works for now:

long old_sum = 0; // there is an extra multiplication step before this, but for simplicity I show only the final step
long sum = 0;
long carry = 0;
final int SHIFT = 30;
final long MASK = 0x3FFFFFFFL;

for (int i = fracDigits; i > 0; i--) {

    sum = old_sum; // carry from the previous column
    carry = 0;
    for (int j = 0, k = i; j <= i; j++, k--) {
        long bj = bdigits[j];
        long ak = digits[k];

        sum += bj * ak;
        carry += sum >>> SHIFT; // drain the high bits after every addition
        sum = sum & MASK;       // keep only the low 30 bits
    }
    resDigits[i] = (int) sum;

    old_sum = carry;
}

sum = old_sum;

long bj = bdigits[0];
long ak = digits[0];

sum += bj * ak;

resDigits[0] = (int) sum;

result.sign = sign * b.sign;


Do you have any other proposal?

Offline claude

  • 3f
  • ******
  • Posts: 2202
    • mathr.co.uk
Re: Fixed Arbitrary Precision and multiplication
« Reply #5 on: January 29, 2022, 12:37:27 PM »
https://code.mathr.co.uk/fixed-point/blob/31f4d5c709978b05d31ff4a106e68f017ba050f1:/fixed-point.c#l424 is how I implemented multiplication (tl;dr: use 64-bit accumulators with 32-bit limbs). More info, including a diagram: https://mathr.co.uk/blog/2019-09-30_fixed-point_numerics.html

Offline superheal

Re: Fixed Arbitrary Precision and multiplication
« Reply #6 on: January 29, 2022, 05:48:11 PM »
Thanks, I will have a look at this. For now I think that using a carry for the high bits of the sum (bit 30 and up) solved the issue. Hopefully my carry won't overflow :P
The high 30 bits of the long sum for digit i are basically the low 30 bits of the long sum for digit i + 1.
I managed to refactor my Karatsuba implementation as well.

I have implemented mult, multKaratsuba, square, squareKaratsuba (square is faster no matter what).

Offline superheal

Re: Fixed Arbitrary Precision and multiplication
« Reply #7 on: January 30, 2022, 12:20:02 PM »
After I completed this, the next thing is the following.
Currently my digit arrays are ints, and only 30 bits of each are used.
Do you think it makes sense to switch the digit arrays to longs (60 bits used)?
Additions and subtractions will be faster, as the hardware will do twice the work on each digit.
The problem comes with multiplication.
I will have to use the current implementation that operates on int digits.
So from each long I will have to create two ints, one storing the 30 MSBs and the next storing the 30 LSBs.

So basically, if I have 5 long digits, the multiplication will operate on 10 int digits.
Do you think that this extra pre-processing step will negate the addition/subtraction gains?
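The split step itself is a single O(n) pass next to the O(n^2) multiply, so its cost is usually negligible. A hypothetical sketch (little-endian digit order assumed, names invented):

```java
public class Split {
    static final int SHIFT = 30;
    static final long MASK = 0x3FFFFFFFL;

    // Each 60-bit long digit becomes two 30-bit int digits (low half first).
    static int[] toInts(long[] longDigits) {
        int[] r = new int[longDigits.length * 2];
        for (int i = 0; i < longDigits.length; i++) {
            r[2 * i]     = (int) (longDigits[i] & MASK);    // 30 LSBs
            r[2 * i + 1] = (int) (longDigits[i] >>> SHIFT); // 30 MSBs
        }
        return r;
    }

    // Inverse: recombine pairs of 30-bit int digits into 60-bit long digits.
    static long[] toLongs(int[] intDigits) {
        long[] r = new long[intDigits.length / 2];
        for (int i = 0; i < r.length; i++)
            r[i] = (intDigits[2 * i] & MASK) | ((long) intDigits[2 * i + 1] << SHIFT);
        return r;
    }
}
```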

Offline hobold

Re: Fixed Arbitrary Precision and multiplication
« Reply #8 on: January 30, 2022, 04:36:10 PM »
Processors these days usually come with hardware to do 64-bit × 64-bit = 128-bit multiplication, but programming languages don't usually offer easy access to the result's upper 64 bits. It may be possible, but you might have to dig deep to find out which obstacles you need to overcome.

You could choose to do multiplication in smaller chunks (as it is now) and addition in larger chunks. But that introduces a dependency on the byte order of numbers in memory ("endianness"). Not a showstopper, but it causes portability issues.

Offline superheal

Re: Fixed Arbitrary Precision and multiplication
« Reply #9 on: February 04, 2022, 11:48:35 AM »
Well, I did my best, but Java is weird. So for me, doing the calculations with integers is faster.
The numbers are stored in Q format. I am still using apfloat for the other calculations, but I am converting to my format for the reference loop.
I had to recompile apfloat to introduce some getters for its internals. (I didn't want to use hacky reflection.)

Offline superheal

Re: Fixed Arbitrary Precision and multiplication
« Reply #10 on: March 14, 2022, 12:48:06 PM »
I have a question regarding Karatsuba. All the wikis that I read basically say that if you have N digits, you cut N in half and perform three multiplications: one with the low N/2 digits (x0*y0), one with the high N/2 digits (x1*y1), and then you combine the results using the middle term:

(x1 + x0)*(y1 + y0) - x1*y1 - x0*y0

where x0, y0 are the low N/2 digits and x1, y1 the high N/2 digits.

Can't we do it a different way?

Using the attached example:

c4 (not used as an output) = b3*a1 + b2*a2 + b1*a3
c3 = b3*a0 + b2*a1 + b1*a2 + b0*a3 + carry(c4)
c2 = b2*a0 + b1*a1 + b0*a2 + carry(c3)
c1 = b1*a0 + b0*a1 + carry(c2)
c0 = b0*a0 + carry(c1)

Applying the Karatsuba formula internally gives:

c4 (not used as an output) = (b3 + b1)*(a3 + a1) - b3*a3 - b1*a1 + b2*a2
c3 = (b3 + b0)*(a3 + a0) - b3*a3 - b0*a0 + (b2 + b1)*(a2 + a1) - b2*a2 - b1*a1 + carry(c4)
c2 = (b2 + b0)*(a2 + a0) - b2*a2 - b0*a0 + b1*a1 + carry(c3)
c1 = (b1 + b0)*(a1 + a0) - b1*a1 - b0*a0 + carry(c2)
c0 = b0*a0 + carry(c1)

We can precalculate all the bn*an products.

Isn't this equivalent?
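The expansions above do check out term by term. For reference, one step of the textbook identity, with the middle term (x1 + x0)(y1 + y0) - x1*y1 - x0*y0, looks like this on plain longs; splitting at 16 bits and all names are just for illustration:

```java
public class Karatsuba {
    // One Karatsuba step on 32-bit nonnegative values, split at 16 bits:
    // three sub-products instead of four. Exact as long as x*y fits in a long.
    static long mul(long x, long y) {
        long x1 = x >>> 16, x0 = x & 0xFFFF;
        long y1 = y >>> 16, y0 = y & 0xFFFF;
        long z2 = x1 * y1;
        long z0 = x0 * y0;
        long z1 = (x1 + x0) * (y1 + y0) - z2 - z0; // middle term, one product
        return (z2 << 32) + (z1 << 16) + z0;
    }
}
```

Real limb-array code would recurse on the halves instead of using built-in long multiplication, but the sign pattern of the recombination is the same.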

