Calculating a record-breaking 31.4T digits of Pi with GCP

Y	Hacker News new \| ask \| show \| jobs

	Calculating a record-breaking 31.4T digits of Pi with GCP (cloud.google.com)
	129 points by ossama 2659 days ago

23 comments

aaaaaaaaaaab 2659 days ago

Meh, throwing raw power at the problem is not that impressive.

Bellard's [1] 2009 record was much more impressive, because he used a clever formula to break the existing record with a mere (albeit beefy) desktop computer: https://bellard.org/pi/pi2700e9/

The record he broke with his desktop PC was made using a supercomputer cluster.

[1] If somebody is not familiar with him, he is also the original author of QEMU, ffmpeg and the Tiny C Compiler.

link

arethuza 2659 days ago

And JSLinux - full PC emulation in your browser (this never ceases to amaze me no matter how often I play with it):

https://bellard.org/jslinux/

link

burk96 2658 days ago

There are quite a bit of other very impressive projects under their belt as well. QEMU and FFMPEG to name some I use daily. What a legend.

link

philipkiely 2658 days ago

According to the FAQ, the bottleneck on this process was hard disk speed, would running the algorithm today be noticeably faster on a NVME SSD? Could it be modified to run on a computer with a sufficient amount of RAM, perhaps in batches?

Have there been other improvements in single-core clock speeds, multi-core performance, or any other CPU hardware components to make running this noticeably faster on high-end consumer hardware?

link

Smithalicious 2658 days ago

Wow, I had no idea that the same guy is behind both QEMU and FFMPEG! That's really impressive.

link

ousta 2658 days ago

I struggle to see any complexity into google approach - unlike bellard's one that is an amazing feat.

They basically pulled more machine to compute more. nothing really impressive. pretty much any dev with that computing power could have done it

link

web007 2658 days ago

It's fairly impressive.

Keeping a single server online for 111 days straight at full CPU and RAM usage over 96 cores and 1.4TB of RAM is a good start. The fact that such a machine exists is already mind-blowing. Then add 25 more nodes running iSCSI, all out 24/7 for 111 days. Hell, just mounting 240TB on a single system is a good stunt, go ahead and try it and let me know it's not "complex".

And your last point kind of IS the point of their marketing: any dev could do it if they have the skill, and they'll rent you the hardware.

link

onlydeadheroes 2658 days ago

While those are interesting systems administration tasks, I believe the point is that the previous record was held by someone who used smarter maths (than the previous previous record) and vastly fewer computing resources.

link

Chirono 2658 days ago

I think that's the point they are trying to make; that doing this with GCP is super boring and easy.

link

aaaaaaaaaaab 2658 days ago

Exactly. It's a matter of K8s configuration, which a devops intern can do in a few weeks probably.

Compare this with the Chudnovsky brothers, who built their pi calculating "supercomputer" from commodity parts in their NY apartment, back in 1992: https://www.newyorker.com/magazine/1992/03/02/the-mountains-...

(and being mathematicians, they also discovered novel formulas for speeding up their calculation)

link

wetpaws 2659 days ago

I don't think it's about the PI tbh, more like a pretty stunt to me

link

LinuxBender 2659 days ago

They did it so I could say,

There are three kinds of mathematicians, those that can count and those that can not.

link

smarx007 2659 days ago

How Many Decimals of Pi Do We Really Need? (NASA/JPL): https://www.jpl.nasa.gov/edu/news/2016/3/16/how-many-decimal...

link

bspammer 2659 days ago

I don't think anyone is claiming this is useful in any way, it's basically an advert for the stability of Google Cloud Services (apparently the VM was migrated thousands of times over the course of the computaton). It's also just kinda fun.

link

StreakyCobra 2659 days ago

In short 40 is already crazy too many digits for most if not all applications. Yet in the original article they say «Granted, most scientific applications don’t need π beyond a few hundred digits, …». Is there scientific applications where they would really need more than 40? Or is it just the author making some guess?

link

bostonpete 2659 days ago

The linked NASA article points out that with 40 digits of pi you could compute the circumference of the visible universe to an accuracy equal to the diameter of a hydrogen atom. I'm gonna say there's no practical application that would require even 40 digits, never mind a few hundred

link

rlanday 2658 days ago

You need more digits than that to accurately compute double-precision trigonometric functions (if the input is close to pi, you need enough accurate digits left after performing range reduction).

This paper claims you need 2/pi accurate to 1144 bits which is about 345 decimal digits: https://www.csee.umbc.edu/~phatak/645/supl/Ng-ArgReduction.p...

link

ectospheno 2658 days ago

As a counterpoint, no real computation I've performed on a computer needed to compute the cos of 2^1023 radians. I can't imagine such a scenario either.

link

rlanday 2658 days ago

You can either implement the functions accurately or inaccurately. Implementing them inaccurately is a slippery slope. Intel botched the hardware implementations in their processors not only for large inputs but also for inputs nearish to multiples of pi:

http://notabs.org/fpuaccuracy/index.htm

link

fghtr 2658 days ago

>Is there scientific applications where they would really need more than 40? Or is it just the author making some guess?

It is still not clear whether Pi is a normal number or not [0]. Such calculations could in principle give an insight here.

[0] https://en.wikipedia.org/wiki/Pi#Irrationality_and_normality

link

WilliamEdward 2659 days ago

Whether there's a use for 40 digits or 40 trillion digits, needing more accuracy is not why we find these numbers. There's no 'need' for this at all. The same way there's no 'need' to find bigger prime numbers. We're just seeing how far we can go, maybe seeing a new pattern emerge.

link

klohto 2659 days ago

> The same way there's no 'need' to find bigger prime numbers.

From cryptography standpoint, there is always a need to find bigger prime number. I wouldn't compare this with a Pi.

link

benchaney 2658 days ago

Finding bigger prime numbers has no impact on cryptography at all.

The large prime numbers needed for cryptography are a few hundred digits long. The are generated by picking a random numbers and checking it’s neighbors for primality.

The largest prime numbers that have been discovered have millions of digits. Finding a larger prime would have no effect whatsoever on our ability to quickly generate primes with a few hundred digits.

link

frutiger 2658 days ago

How does finding a bigger prime help or hinder cryptography? And with many forms of cryptography (e.g. DH kex, ECC) primes only enter as the modulus of the modular arithmetic where being larger is usually not particularly helpful.

link

XorNot 2659 days ago

Finding a change in the behavior of Pi that emerges at high precisions would be a significant discovery.

link

mhh__ 2658 days ago

But it's only 9s after the 762nd digit?

link

klohto 2658 days ago

Absolutely! I'm not disputing that it's not "interesting", just that I wouldn't compare Pi number search to prime number search.

link

vkaku 2658 days ago

This! Leaving a decimal here .

link

angrygoat 2658 days ago

Alexander Yee's writeup is interesting – CPU utilisation was only about 12%, they encountered quite nasty I/O bottlenecks (particularly for writes.)

http://www.numberworld.org/blogs/2019_3_14_pi_record/

Contradicts the Google blog a little, especially where he points out that they hit performance issues with live migration (Google said it worked fine without impact on the application.)

link

bluedino 2658 days ago

He's also the author of one of the most famous Stack Overflow answers of all time, 'Why is it faster to process a sorted array than an unsorted array?'

https://stackoverflow.com/a/11227902/1760335

link

s_gourichon 2659 days ago

This is a time someone must cite 3Blue1Brown: [(183) The most unexpected answer to a counting puzzle - YouTube](https://www.youtube.com/watch?v=HEfHFsfGXjs)

link

philshem 2659 days ago

The piano music for each digit at the A-π (API) page is particularly beautiful

https://pi.delivery/#demosmusic

link

nothanksmydude 2659 days ago

Similar, but metal. This is the extended verison that goes to 110 digits.

"The numbers and rests in the formula translate to 16th notes on the kick drum, and 16th note rests. There is no kick drum beats where there are snare drums.

With the decimal point BEFORE the number, and starting with the first number, move that many decimal points to the right and insert that many 16th note rests. Use one 16th note rest to divide the numbers you passed (when applicable). Continue on throughout the rest of the figure. No repeats."

The details of the video have the full explanation

https://www.youtube.com/watch?v=uh-EdSbfdrA

link

trollied 2659 days ago

Martin Krzywinski has produced some brilliant Pi posters. http://mkweb.bcgsc.ca/pi/piday/

I have the "dots" one. https://fineartamerica.com/featured/768-digits-of-pi-up-to-f...

link

leowoo91 2658 days ago

Literally this should contain all the songs written, and the ones haven't written yet.

link

geekuillaume 2659 days ago

I'm curious, did someone estimate how much it would cost to replicate this experiment on GCloud with the same infrastructure?

link

triodan 2659 days ago

Using GCP's pricing calculator and the specs described in the blog, it's around 170,000 USD to run it for 3 months.

https://cloud.google.com/products/calculator/#id=ef2329f6-f1...

link

yjftsjthsd-h 2658 days ago

If you can break it into appropriate jobs that can handle interruptions, you should use interruptible instances to save money. And you really need to be able to handle that anyway in case an instance fails midway.

link

fhoffa 2659 days ago

Don't miss the video on how Emma did this:

- https://www.youtube.com/watch?v=JvEvTcXF-4Q

(length 3:14)

link

zimbatm 2659 days ago

this is perfect for mounting the Pi filesystem :)

https://github.com/philipl/pifs

link

sverige 2658 days ago

Whenever I see the ridiculous number of places to which pi has been calculated, I wonder if anyone has checked to see if there is a repeating pattern. I mean, 31 trillion places leaves a lot of possibilities for repetition of a couple of billion digits.

Or is my understanding of what constitutes an irrational number outdated? Is there another definition that precludes even looking for repetition in hopes of finding a denominator?

link

DannyB2 2658 days ago

Not consecutively repeating patterns.

But if you take any length pattern of digits, it would repeat an infinite number of times.

Let's take a one digit pattern, say '5'. Since the digits of pi continue forever, there would be an infinite number of '5's.

Now consider a longer pattern '53'. Since the digits of pi continue forever, there would be an infinite number of '53's. In fact, each 53 will be from one of the infinite number of '5's in the previous pattern '5'.

Now consider a longer pattern '537' . . .

. . . to continue . . .

It was long ago when I read Contact (the book), so I hope I don't misremember this too badly. At the end of the book the main character was given a budget, lab, resources, etc. They were working on looking for a message in the digits of Pi. Eventually her beeper beeped and they had found one! It must be woven into the fabric of the universe.

I think any sequence of digits that had any kind of message you are going to eventually find in Pi. Just like, if you look long enough you'll find a 5. If you keep looking you'll eventually find a 53. Keep looking, you'll eventually find a 537. Etc.

link

wool_gather 2658 days ago

This is the concept of [a "normal" number][0] (which is a strange choice of word, I think). Apparently pi is not proven to be normal. But the implication as I understand is that, yes, any given sequence that you'd care to look for is there somewhere.

[0]:https://en.wikipedia.org/wiki/Normal_number

link

stephencanon 2658 days ago

Almost all numbers are normal, so it seems like a perfectly reasonable choice of terminology =)

[On the other hand, almost none of the numbers someone on the street might name are normal, so ...]

link

pmiller2 2658 days ago

Offhand, I would guess that “normal” is the most overloaded word in mathematics. We should probably use it less.

link

wool_gather 2657 days ago

Ah, I didn't realize that, thanks! That makes sense now that you say it.

link

stephencanon 2657 days ago

Almost all real numbers are a lot of things, though:

   - uncomputable
   - irrational
   - transcendental
   - ...

So it's not really that good of a justification. =)

link

gibspaulding 2658 days ago

Pi can be proven to be an irrational number: https://en.wikipedia.org/wiki/Proof_that_%CF%80_is_irrationa...

link

hi41 2658 days ago

Quick question. Is 22/7 an approximate value of pie? What is the correct formula and why?

link

pmiller2 2658 days ago

22/7 is an approximation for pi. I prefer to use 355/113, which has an error of about 2.7x10^-7. The reason these are good approximations is that they’re convergents for the continued fraction for pi. See https://en.wikipedia.org/wiki/Continued_fraction

link

CRUDite 2659 days ago

Perhaps they subliminally hope the final pages of the novel "contact" to be real

link

maxst 2658 days ago

Are the some PI-like constants, but much harder to calculate? Like even a million digits would be hard to calculate?

link

StrangeDoctor 2658 days ago

That's actually most numbers, but we don't know any yet. https://youtu.be/5TkIe60y2GI numberphile (math YouTube series) describing some related concepts here.

link

anth_anm 2658 days ago

"we have lots of compute resources and access to software, aren't we awesome?"

link

sigi45 2658 days ago

Awesome, now we are a game of life which has generated Pi to 31.4T digits at least once :D

link

atomical 2658 days ago

What is the file size?

link

hokobomo 2659 days ago

> March 14 (represented as 3/14 in many parts of the world)

haha

link

StreakyCobra 2659 days ago

I wanted to comment the same part, but I wanted to let them the benefit of the doubt, so I went to Wikipedia «date format by country» page [1], and summed up from the table how many people have "MD" vs "DM" in their date format:

DM -> 3392/5550 ~= 61.1% MD -> 2158/5550 ~= 38.9%

Note: I ignored both green and red regions that have both "DM" and "MD" in their format.

So it is definitively not the majority of people. Using the word "some" instead would have been better, but "Many" is not totally wrong… I guess.

[1] https://en.wikipedia.org/wiki/Date_format_by_country#Usage_m...

link

klohto 2659 days ago

But that is population not a number of countries. If you look at it from a country perspective, the percentage is tiny.

link

StreakyCobra 2659 days ago

Probably. The statement talk about "parts of the world", and this concept of "parts" can be seen as countries, continents, surface of earth, atoms, etc. I chose to reduce it to the smallest meaningful entity concerning the concept of dates: humans. It is arbitrary, but justifiable :)

link

FakeComments 2659 days ago

You seem to imply it’s wrong to refer to 40% of the population as “many” — which seems odd.

link

vonmoltke 2658 days ago

I treat "some" as an informal expression of relative quantity and "many" as an informal expression of absolute quantity. So it is valid to say many countries use "MD" and many countries use "DM", though most countries use "DM" and only some use "MD".

link

isostatic 2659 days ago

The US and Philippines use month-day-year.

The rest of the world doesn't. The most popular format is Day-Month-Year, followed by Year-Month-Day.

link

s_gourichon 2659 days ago

Yes. ISO chooses year-month-day, which puts largest component first and smallest last. This has the nice benefit that treating it as a string and doing alphanumerical sort matches the actual day sort. Ref. https://en.wikipedia.org/wiki/ISO_8601

link

rkangel 2658 days ago

The main benefit of Y-M-D, is that no-one uses Y-D-M, and the 'Y' component is easily recognisable. So if you use Y-M-D, then everyone knows it's Y-M-D and there is no ambiguity.

link

isostatic 2658 days ago

I found out recently the x509/tls certa are YYMMDD

200122

For example. Amazing.

link

isostatic 2659 days ago

There are benefits to both little endian and big endian

link

yjftsjthsd-h 2658 days ago

Yes, but the US uses middle endian.

link

xpil 2659 days ago

My two favourite date formats are: (1) yyyy-MMM-dd and (2): yyyyMMdd.

Why?

(1) eliminates any ambiguity regarding interpretation. You can have an instance of a server of an unknown provenience and/or regional settings and still count on yyyy-MMM-dd to be correctly recognized.

(2) on the other hand has a nice feature of being sortable and can also be stored as an integer value if needed.

The US standard of MM/dd/yyyy is ridiculous. But it's just one of many and I'm not going to fix the world ;)

On the main topic: if anyone needs to calculate the circumference of the Universe (radius: 50bn lightyears ish) with the accuracy of the Planck distance (approximately 1.6 x 10^-30 meters), they need less than 65 digits of Pi to do so. So, as already stated, it's just a PR stunt, nothing more.

link

isostatic 2659 days ago

yyyy-MM-dd is sortable (2019-03-14)

yyyy-MMM-dd is not (2019-MAR-14)

link

skykooler 2658 days ago

Yeah, but we can't celebrate on the other way around, because there's no 14th month.

link

UncleSlacky 2658 days ago

There's the 22nd of July (22/7), also known as Pi Approximation Day, which in any case is more accurate than 3.14.

link

v_lisivka 2658 days ago

We can celebrate Pi day every month at 3 14:15:92.65359.

link

andrewla 2658 days ago

And April only has 30 days.

link

patrickg_zill 2659 days ago

How can you tell if the results are accurate?

link

onion2k 2659 days ago

Something quite interesting about pi is that you don't need to start at the beginning to calculate any particular digit, so you can check if the 31.4trillionth digit is correct yourself. The algorithm you'd need to use is the Bailey–Borwein–Plouffe formula - https://en.wikipedia.org/wiki/Bailey%E2%80%93Borwein%E2%80%9... (I think there are others too.)

link

chris_overseas 2659 days ago

This is explained in the second paragraph:

> Yee independently verified the calculation using Bellard's formula[0] and BBP formula[1]

[0] https://en.wikipedia.org/wiki/Bellard%27s_formula

[1] https://en.wikipedia.org/wiki/Bailey%E2%80%93Borwein%E2%80%9...

link

audiometry 2659 days ago

Disappointed the Wikipedia article didn’t explain how in the world they arrived at that formula, why it even works. This type of stuff is pure magic to me.

link

aboutruby 2658 days ago

There is a proof linked from the Wikipedia article: https://www.davidhbailey.com/dhbpapers/digits.pdf

link

mhh__ 2658 days ago

https://bellard.org/pi/pi_bin.pdf

link

curiousgal 2659 days ago

Why would you need to do that? It's not like those digits are useful for anything at all.

link

onion2k 2659 days ago

I guess it depends on your definition of "useful", but one potential way to use pi would be as a way to transmit compressed data incredibly efficiently. If you could find the data you want to transmit in pi somewhere you'd only need to transmit an offset and a length to send anything than can be represented numerically.

The hard part would be finding what you want to send though...

;)

link

gliptic 2659 days ago

Unless it's not clear, the size of that offset would be about the same size as whatever you wanted to encode.

link

mkup 2659 days ago

Offset and a length in Pi are not going to be shorter than original data to transmit.

link

thaumaturgy 2659 days ago

Not necessarily. A double gets you pretty far into pi for a cost of just 8 bytes. A little bit of rounding, or checksumming, or other tomfoolery in principle should make it possible to reach absurdly far into pi for a byte or two more.

link

jerf 2658 days ago

Given that I personally have to have seen this proposal at least a dozen times myself, and I'm not even particularly "in the field", if it was as easy as people supposed, we'd be using it.

I will point out you're thinking of it incorrectly, though. The challenge with compression isn't to "reach far into pi". The challenge is to reach correctly into pi. A double can identify 2^64 places to reach into pi. It doesn't really matter how you specify those 2^64 locations, that's all it can do, by the simple argument that one double can't specify more than one location.

The location of the US Constitution in pi may be "far", but it's not the "far" we have in our fuzzy human brains where things sort of logarithmically just pile together until there's no meaningful difference to us humans between the 1,839,837,237,938,739,837,954th position of pi and the 1,839,827,237,938,739,837,954th position. But a compression algorithm off by that much is useless; indeed, even being off by one digit (in your choice of base) is going to produce garbage. So it's not just about being able to reach "far", it's about being able to reach far and precisely. It doesn't matter how you arrange the possible 2^64 locations a machine word can point to; specify it as 1,739,837,237,938,739,837,954 + the binary as a 64-bit int for all it matters. That reaches "far" into pi. But you won't find anything useful enough to pay off the 64bits you spent getting there.

(I mean, you want to reach far into Pi, I give you "Go BusyBeaver(64-bit int) digits into Pi". That's reaching in pretty darned far. But it's still useless as a compression algorithm, even with the mighty power of BusyBeaver there.)

(Amusingly, at BB(42) digits into Pi, you get "42" as the next digits, proving that 42 really is the answer. Prove me wrong!)

link

onion2k 2659 days ago

If your offset is a million digits long you need to be sending more than a million digits of data to make it worthwhile, but the chance of your chosen million digits being available in that space is effectively zero. :)

link

patrickg_zill 2658 days ago

You've kind of re-invented, in a way, using DCT transforms to encode music more efficiently...

link

DannyB2 2658 days ago

> Why would you need to do that? It's not like those digits are useful for anything at all.

People complain about that.

But . . . why do we have athletic competitions? It's not like it is useful for anything at all.

Or Reality TV for that matter.

link

psadri 2658 days ago

Did they find the patterns (circle?) that was referend at the end of Carl Sagan’s Contact book?

link

megaremote 2659 days ago

Does pi compress? It must if it is only numbers, but only by a half?

link

mcbain 2659 days ago

Trillions of digits of Pi compresses insanely well - to Pi, of course.

The decompression just takes a while.

link

dekhn 2658 days ago

pi is effectively randomly distributed (I don't know if this is proven) which means it would be effectively impossible to compress is using conventional entropy or dictionary based techniques.

However, there are compact formulas which can generate pi to arbitrary precision, trading off compute time with space. So it;'s effectively compressible, I guess this relates to Kolomogorov complexity in some way.

link

yjftsjthsd-h 2658 days ago

If it's stored as ASCII/UTF8 text, then it will be compressible because it's just numbers. On the other hand, if they somehow have a number format that spans that many digits then it'll already be perfectly efficient.

link

dekhn 2658 days ago

Um, sure. But I don't think anybody really cares if pi is compressible because it's ASCII representation of numbers. At best, you'd be getting some constant factor improvement due to the unused bits, but there's nothing specific to pi in that.

link

rkangel 2658 days ago

If you consider compression to be 'representing information in a way that you can recover with some processing (from less information)', then any programmatic definition of Pi is an enormous compression of the value.

link

techbio 2658 days ago

Nobody wants to steel man this question? The cute trivial++ answers are missing OPs point...

link

abdulhaq 2658 days ago

You don't get those gigajoules of heat waste back at the end

link

pgrote 2658 days ago

Why did she stop calculating? A time limit or resource issue?

link

DougBTX 2658 days ago

They calculated π * 10^13 digits for Pi day (14th March, 3/14 in month/date format) so it was an artistic/aesthetic choice.