
Author Topic:  Interval arithmetics on GPUs  (Read 357 times)


marcm200

• Posts: 953
Interval arithmetics on GPUs
« on: May 25, 2020, 10:02:21 AM »
Is anyone using an interval arithmetics library on a graphics card?

I recently read about this (I have no experience in GPU programming), and since the articles are a few years old I am not sure whether it is currently possible to use such a library with double precision on GPUs, or whether one only exists with single precision.

This one uses a float2 type in one function (double?) but only speaks about single precision in the text.
Code: [Select]
S. Collange, J. Flórez, D. Defour. A GPU interval library based on Boost interval. 2008. hal-00263670v1. Full text: https://hal.archives-ouvertes.fr/file/index/docid/263670/filename/interval_gpu_fpl_v2.pdf

Code: [Select]
D. J. Nayak, P. S. V. Nataraj. Performance Comparison of a CUDA interval arithmetic library with a standard interval arithmetic library. 2016. Full text: http://on-demand.gputechconf.com/gtc/2016/posters/GTC_2016_Tools_and_Libraries_TL_04_P6248_WEB.pdf

The speed-up would be quite high (2-3 orders of magnitude), and since I only use basic operations when calculating exterior Mset bounds, with no memory transfer needed besides some coordinates, that would allow me to achieve finer resolutions.

claude

• Posts: 1830
Re: Interval arithmetics on GPUs
« Reply #1 on: May 25, 2020, 10:13:48 AM »
I guess float2 is a 2-vector of float. Two-Sum is a circuit used in "double-double" / "compensated" arithmetic (the unevaluated sum of two floating-point values to gain extra precision).

marcm200

• Posts: 953
Re: Interval arithmetics on GPUs
« Reply #2 on: May 28, 2020, 08:12:46 AM »
I implemented my own interval mini library, not yet for the GPU (might come in the far future) just for the main processor in C++, custom-made for my tasks at hand.

https://github.com/marcm200/custInterval

It uses constant upward rounding and simulates downward rounding by appropriate negation (from the article below), so only one rounding-mode switch at program start is necessary (this gives about a 40-fold speed-up compared to switching directly before every operation), but it relies on the discretion of the user to ensure the mode is never changed afterwards.

Double precision is used throughout for the interval endpoints, with direct function calls instead of operator overloading with temporary objects, and with passing by reference for speed. All four basic operations are implemented.

The repository above contains a demo program following the orbit of the critical point in the quadratic Mandelbrot set.

Code: [Select]
S. Rump, T. Ogita, Y. Morikura, S. Oishi. Interval arithmetic with fixed rounding mode. 2016. Full text: http://www.ti3.tu-harburg.de/paper/rump/RuOgMoOi16.pdf

sjhalayka

• Posts: 69
Re: Interval arithmetics on GPUs
« Reply #3 on: June 07, 2020, 10:34:45 PM »
If you can explain interval arithmetic, I can explain how to implement it on GPU, if it is even parallelizable, that is.

Your best bet is to use OpenGL, not OpenCL, nor CUDA.

See https://github.com/sjhalayka/qjs_compute_shader for a small program that calculates the quaternion Julia set. The code uses OpenGL 4.3 Compute Shaders. You may have a problem getting support for that on Mac, mainly because Apple sucks. Otherwise, it will work on modern PC nVidia, AMD, and Intel graphics systems.

jukzi

• Fractal Phenom
• Posts: 59
Re: Interval arithmetics on GPUs
« Reply #4 on: June 08, 2020, 02:07:46 AM »
Since OpenGL does not specify rounding modes (it only says "The fraction 0.5 will round in a direction chosen by the implementation, presumably the direction that is fastest."), it may be fast but would give less accurate bounds.

The most sophisticated interval arithmetic library I know is http://www.ti3.tu-harburg.de/intlab/ (for MATLAB).
It really shines on matrix algebra, as the bounds can be computed much better for high-level operations than with low-level arithmetic operators.

marcm200

• Posts: 953
Re: Interval arithmetics on GPUs
« Reply #5 on: June 08, 2020, 09:00:42 AM »
Quote
If you can explain interval arithmetic, I can explain how to implement it on GPU, if it is even parallelizable, that is.

To arrive at a reliable result for interval seeds [a..b]+[c..d]*i, namely whether the origin escapes for all such individual values, I need to account for the rounding error during squaring and other operations. This can be done by choosing the rounding mode so that adding two intervals adds the lower endpoints with downward rounding and the upper endpoints with upward rounding. Similar definitions exist for the other basic operations like multiplication. For my purposes I only need addition, subtraction and multiplication.

I tried OpenMP to iterate several different seed intervals in parallel, but that led to a slow-down.

My main loop looks something like this (pseudocode):
Code: [Select]
... set seeds into an array of complex intervals with double endpoints
for(int32_t iteration=0;iteration<maxit;iteration++) {
    ...
    #pragma omp parallel for
    for(int32_t p=0;p<MAXPARALLEL;p++) {
        iteration(z_current[p],seed[p],z_new[p]);
        z_current[p] = z_new[p];
        ... some tests (inf, nan etc)
    }
    ...
}
The principal function iteration then performs the actual IA operations. Using my own implementation of a simple IA mini library, it looks like this (I prefer sequential, reference-argument-taking functions over operator overloading when looking for speed).

Code: [Select]
int32_t iterationCustIA_z2c(CustComplex& zn,CustComplex& C,CustComplex& erg) {
    CustInterval x2,y2,xy;
    int32_t error=0;

    error += custMul_ZAB(x2,zn.re,zn.re);
    error += custMul_ZAB(y2,zn.im,zn.im);
    error += custMul_ZAB(xy,zn.re,zn.im);

    CustInterval tmp1;
    error += custSub_ZAB(tmp1,x2,y2);
    error += custAdd_ZAB(erg.re,tmp1,C.re);

    CustInterval tmp2;
    error += custAdd_ZAB(tmp2,xy,xy);
    error += custAdd_ZAB(erg.im,tmp2,C.im);

    return error;
}
Currently my algorithm spends about 20% of its time in iteration and 80% elsewhere. So my idea was: if I could optimize the 80% and arrive at a more balanced split like 50/50, pushing the iteration to the GPU would give the algorithm another boost, and the CPU might then do some of the former 80% work in parallel as well.

Now I was wondering: would it be possible to perform the iteration function on a graphics card? The data transfer is only a few references and the endpoints, but quite a number of temporary objects are involved. Maybe with a similarly simple statement like #pragma do-this-on-GPU in 64 threads.

Are there GPUs that directly support double precision? I'm currently approaching the limits of single (although I always use double, since float is no faster with my C++ compiler). If not, I might additionally need to implement a double-single datatype like claude mentioned.

Quote
Since OpenGL does not specify rounding modes (it only says "The fraction 0.5 will round in a direction chosen by the implementation, presumably the direction that is fastest."), it may be fast but would give less accurate bounds.

That was another question I had. If I can't specify the rounding mode, is it possible to retrieve which rounding mode the GPU currently uses? And is it stable, in the sense that the GPU does not change it itself at some point? And just to be safe: do GPUs adhere to the IEEE standard, so that virtually the infinite-precision operation is performed and the result then rounded, with the benefit that some operations are exact?

It is however possible to simulate correct rounding under any rounding mode (by just adding or subtracting some 2^-52*value; I remove subnormals and inf by testing beforehand), although it makes things a lot more complicated.

My winner is the repetitive-negation method, like -( (-a) + (-b) ) for the downward-rounded addition of a and b when the rounding mode is upward. That was a very nice idea I read about.

claude

• Posts: 1830
Re: Interval arithmetics on GPUs
« Reply #6 on: June 08, 2020, 10:58:51 AM »
Quote
Are there GPUs that support directly double precision?
Yes!

jukzi

• Fractal Phenom
• Posts: 59
Re: Interval arithmetics on GPUs
« Reply #7 on: June 08, 2020, 11:19:21 AM »
Since the accuracy/rounding is vendor-specific, you need to treat every operation as if all possible roundings may have occurred, which lets the interval grow. Not only for explicit rounding, but also for simple things like a sum, which may also have no guarantee of precise rounding. You also need to check for overflow, which I guess is a mess with auto-clamping instead of INF/NAN.
The best you could do is to ask the author of the paper about his library, and prove his implementation correct before you prove some numbers.

jukzi

• Fractal Phenom
• Posts: 59
Re: Interval arithmetics on GPUs
« Reply #8 on: June 08, 2020, 11:29:26 AM »
for a particular vendor there are guarantees given:
https://docs.nvidia.com/cuda/floating-point/index.html#operations-and-accuracy
https://docs.nvidia.com/cuda/floating-point/index.html#rounding-modes

Nevertheless I would not trust that all of the thousands of cores work deterministically and error-free. So I would not trust any single result that is not reproduced on other hardware.

marcm200

• Posts: 953
Re: Interval arithmetics on GPUs
« Reply #9 on: June 08, 2020, 12:15:44 PM »
Quote
for a particular vendor there are guarantees given:
Those links are very interesting and informative. Thanks!

Quote
Nevertheless I would not trust that all of the thousands of cores work deterministically and error-free. So I would not trust any single result that is not reproduced on other hardware.

If it is as bad as that, reading this I come to the conclusion that using GPUs is not recommended for my purposes, as reliable calculations are a must and speed is only a secondary concern. I do not necessarily need the tightest bounds, so correcting the results outward after each operation would be acceptable. But I think I'll switch to the next best option: buying a new and faster CPU instead.

3DickUlus

• Posts: 2188
Re: Interval arithmetics on GPUs
« Reply #10 on: June 09, 2020, 05:55:27 AM »
depends how you go about it...
