| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by addaon 381 days ago
	> Aside: Why 13 bits instead of 12? For our purposes, we’re going to ignore the carries in the most significant limb, allowing numbers to wrap when they overflow past 2256 - 1 (just like how unsigned addition works in C with normal size integer types). As a result, we can assign 52 bits to the most significant limb and ignore the fact that it will run out of room for carries before the other limbs do. Why not give the top limb 64 bits and the other four limbs 48 bits each, then? You can accumulate more additions before normalization, you can take advantage of word alignment during splitting and normalization if your instruction set has anything useful there, and your overflow properties are identical, no?

3 comments

phkahler 381 days ago

>> Why not give the top limb 64 bits and the other four limbs 48 bits each, then?

I think one goal is to use 5 64 bit registers to do 256 bit math. That means using 256/5 = 51.2 bits of each word. That's probably some kind of ideal if you want 256bit math, but not optimal if you're writing a generic big-int library. In the old days you'd want to use exactly one byte for the carry(s) because we didn't have barrel shifters to do arbitrary bit shifts efficiently. In that case I'd use 56 bits of the 64 to get nice byte alignment.

This is all quite relevant for RISC-V since the ISA does not have flags.

link

andrewla 381 days ago

Even with this explanation a 64 + 48*4 is clearly superior. You can go longer without overflow (since you have 16 bits of carry space per pseudo-digit), and the amount of carry space is aligned even more nicely.

link

Thorrez 381 days ago

>That means using 256/5 = 51.2 bits of each word.

Why must each word have the same amount? Why not 64 bits on the top word, and 48 bits on the other 4 words?

link

LegionMammal978 381 days ago

Evenly distributing the number of bits per word lets you chain more additions/subtractions before having to normalize.

link

xigoi 381 days ago

Sure, but the point is that for the most significant limb, there is no point in having redundant bits because whatever you put in them will be discarded after normalization.

link

LegionMammal978 381 days ago

Ah, in that case, you're right, it would make sense to use all 64 bits for the top limb. Still, making them all equal-sized can have benefits if you use SIMD or similar techniques to operate on them uniformly. One project of mine has been trying to work with large integers in CUDA by distributing their limbs across a warp.

link

dgoldstein0 380 days ago

Maybe some use cases want to know if there was overflow? In which case having more space for carry bits to accumulate makes it easy to give an accurate answer

link

Sukera 381 days ago

Because adding the top limbs of two encoded numbers would overflow too soon. If you set both to 2^63 for example, they overflow immediately. Might be fine for wraparound arithmetic, but not in general.

link

volemo 381 days ago

Setting both to 2^63 means your original 256-bit numbers were 2^255, thus the addition would overflow no matter what intermediate encoding you’re using.

link

vitus 381 days ago

Sure, then set one to 2^62 and the other to -2^62 (namely: 0b1100..00). It's overflow as far as unsigned arithmetic is concerned, but not in the case of signed arithmetic.

That said, when you're dealing with 256-bit integers, you're almost assuredly not working with signed arithmetic.

link

immibis 381 days ago

...so? They don't care about top limb overflow, at all. That's the point.

link

bboreham 381 days ago

Then you would need 6 words to hold a 256-bit value instead of 5 in the OP, and consequently more instructions to add them.

link

addaon 381 days ago

64 + 48 * 4 == 256... still just five 64-bit words.

link

bboreham 381 days ago

Now you can’t detect overflow?

link