| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by nwellnhof 847 days ago
	It should be noted that in C89, both behaviors of division and modulo are allowed and it's implementation-defined whether division results are truncated or rounded towards negative infinity. This has changed in C99 which only allows truncation towards zero.

3 comments

Karellen 846 days ago

I've always considered it a mistake to refer to "%" as a "modulo" operator in C because of its ability to return negative numbers. That's not how modulo arithmetic works. C's "%" is a remainder operator.

link

marko_oktabyr 846 days ago

> That's not how modulo arithmetic works.

We should be a bit careful with what one means by "modulo arithmetic" here. If we are talking about arithmetic in Z/nZ (read "integers modulo n"), then the objects being acted upon are no longer integers, but equivalence classes of integers. That is, the set of all integers that satisfy the equivalence relation "~" where "a ~ b" means a - b = k*n for some integer k. For example, one of the equivalence classes in Z/3Z would be the set {..., -4, -1, 2, 5, ...}.

Since every element of the set is equivalent under that relation, we typically will choose a representative of that equivalence class, e.g., [2] for the previous class. If we view the "mod" or "modulo" operator to be a mapping from the integers to a particular representative of its equivalence class, there's no reason that negative values should be excluded. [-1] refers to the same equivalence class as [2], [32], and [-7]. The details of how integers are mapped to a chosen representative seem to vary from language to language, but modular arithmetic works all the same between them.

link

shikon7 846 days ago

But 2 mod 3 should still give the same as -1 mod 3. If the representative for the same equivalence class is allowed to be different depending on the input, you could just as well use the number itself.

link

marko_oktabyr 846 days ago

At the risk of being snarky, I should certainly hope the output depends on the input.

On a more serious note, there are quite reasonable properties that are satisfied by keeping the sign of the first argument as the chosen representative. In particular, for any two nonzero integers a and b, we have that:

a = (a / b) * b + (a mod b)

Where division should be interpreted as integer division that rounds towards zero. If a=-1 and b=3, then (a/b) would be zero which would require (a mod b) to be -1 if we want the above to hold. Also note that other rounding choices (e.g., rounding towards negative infinity) could impose (a mod b) = 2.

So choosing a particular representative comes down to choosing what properties you want your function to have in relation to other arithmetic.

link

oasisaimlessly 846 days ago

There is no "particular representative"; every equivalence class (except 0) has two possible representatives.

... and that's exactly why the C modulo operator sucks.

link

seanhunter 846 days ago

Neither of them is wrong. Languages (like python) which only return a positive number are returning the least positive residue of division under the modulus. That is the most common interpretation of it in number theory at least far as I know.

C generally does return negative numbers when you use the % operator.

link

kazinator 846 days ago

Under modulo math, -1, 9 and 19, for example, are all the same element of the modulo 10 congruence. 9 is called the "least positive residue", which is sometimes important.

link

matsemann 846 days ago

I have to re-learn this each year for Advent of Code, when inevitable my something % anything goes into the negative when searching downwards.

link

marcosdumay 846 days ago

That's not any clearer. Both "modulo" and "remainder" can be defined in ways (yeah, more than one) that include or exclude negative numbers.

You have to memorize the full definition. There's no mnemonic shortcut.

link

chubot 846 days ago

Remainder goes with division, and respects invariants

Modulus doesn’t. I believe that algebraically equality module N can exist without division

link

pwdisswordfishc 846 days ago

Exactly. Modulo is not an operator, but a term for an equivalence class of differing by a multiple of a number fixed in advance.

link

Karellen 846 days ago

> Modulo is not an operator,

"operator" is being used as a term of art in programming language design, to describe a symbol used to indicate that an operation should be performed on one or more operands.

https://en.wikipedia.org/wiki/Operator_(computer_programming...

So the "%" symbol is an operator in C and Python, and it is standard usage to describe the operator that performs a mathematical function "X" as the "X operator".

link

marko_oktabyr 846 days ago

As sibling comments have mentioned, "operator" here is being used in the programming language sense of the word, not the mathematical sense. Using the name "mod" [as in mod(a,n)] or "modulo" for the mapping that takes integers to a particular representative of its equivalence class in Z/nZ ("integers modulo n") doesn't seem like an unreasonable choice to me.

link

kazinator 846 days ago

(mod N) is an operator which indicates that the arithmetic in the preceding formula or equation is understood as taking place in a modulo N congruence.

A triple equal sign is usually used for modular equations. That triple equal sign, together with the (mod N) out to the right, constitute an operator.

-1 ≡ 9 (mod 10)

Since it is written as part of the syntax of a formula or equation, and modifies its semantics, it is an operator. Just like d/dx, sigma notation and whatever not.

link

mathgradthrow 846 days ago

That's an operator. A map to the quotient.

link

Findecanor 846 days ago

One thing that is still "implementation-defined" in C and C++ is the result of a right shift of a negative integer. On pretty much all platforms, the >> operator shifts in the sign bit and does not round the result — which makes it equivalent to flooring division by a power of two. It is consistent with the division operator only when the left-hand value is positive.

link

kazinator 846 days ago

However, the oposite shift is undefined if a 1 goes into the sign bit.

More precisely, regarding E1 << E2, it is written (in the April 2023 ISO C draft):

"If E1 has a signed type and nonnegative value, and E1 × 2ᴱ² is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined."

Thus if E1 is negative, or if the result overflows, UB.

link

rcfox 846 days ago

Didn't they specify two's complement signed integers somewhat recently? What's the rationale for leaving these behaviours undefined?

link

Findecanor 845 days ago

On some CPUs architectures, the operand size for some instructions could be larger than an `int`, in which case the upper part of a CPU register would become invalid on overflow instead of containing an extension of the sign bit.

There are also CPUs that do "saturating arithmetic" where overflow results in INT_MAX instead of wrapping.

link

kazinator 846 days ago

Because the shift operations are arithmetically defined, and the situations that are undefined correspond to overflow. v << 1 means the same thing as v * 2 and is undefined if v * 2 is undefined.

link

kragen 846 days ago

mentioned in the comments on the post, which merit reading in this case

i haven't read the rationale, but presumably the committee did this because it's what virtually all cpus do (because it's what fortran does), so it's the only thing that can be implemented efficiently on virtually any cpu with a division instruction, and so virtually all c implementations did it, and standardizing the behavior of virtually all c implementations is better than leaving it implementation-defined or standardizing a behavior that conflicts with virtually all existing implementations and can't be implemented efficiently on most high-end hardware

link

DonHopkins 846 days ago

Who would have ever thought that division could be so ... divisive?

link

kragen 846 days ago

as sophie w. arm said, 'i have a multiplier, not a divider'

link

DonHopkins 846 days ago

At the RISC of sounding short on instructions, do you mean Sophie Wilson's ARM? ;)

https://en.wikipedia.org/wiki/Sophie_Wilson

link