	by michaelgv 2853 days ago
	As someone who just recently went through CDN hell and rebuilt our entire CDN from the ground up (software and hardware), I was wondering why you picked RoR?

2 comments

zrail 2853 days ago

It’s what I know best and what I’m most productive in. The project is to get something running and learn a handful of new things, and learning a new framework would be a detriment to that first goal.

The manager app is not in the hot path with this design so performance doesn’t matter all that much.

michaelgv 2853 days ago

Are you designing this CDN to pull from origin and cache temporarily? Or to pull from a local file and put a strong cache on it?

If you need a hand let me know, I’ve built pretty large CDNs before (10M r/s at peak)

zrail 2853 days ago

The former to start but I want to add push zones and/or “s3sync” zones that proactively sync an s3 bucket to local disk.

Thanks for the offer! I might just take you up on it :)

michaelgv 2853 days ago

Just be careful: understand that if you do a PULL only CDN, you're not going to gain big benefits. If you do want a pull only CDN, have a background task runner retrieve the files and update them locally.

tatersolid 2850 days ago

> understand that if you do a PULL only CDN, you're not going to gain big benefits.

This statement makes no sense. A CDN edge node is just a cache; its size and your access patterns determine the hit ratio.

At $dayjob we get Nginx cache hit ratios on our edge in excess of 99% for “an origin fetch” setup. That is a very large benefit.

Cloudflare works entirely on origin fetch. They seem to be doing okay.

zrail 2853 days ago

Sure. I have Nginx set to keep files around for a long time, serve stale, and refresh in the background, but proactively refreshing periodically is a good idea.
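
For readers unfamiliar with the "serve stale and refresh in the background" pattern, here is a minimal sketch of the idea in Go. It is not Nginx's implementation and not zrail's configuration; `StaleCache`, its fields, and the `fetch` callback standing in for the origin are all hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// entry is one cached object plus the time it was fetched.
type entry struct {
	body      string
	fetchedAt time.Time
}

// StaleCache (hypothetical) answers from cache immediately and refreshes
// expired entries in the background instead of making the client wait.
type StaleCache struct {
	mu         sync.Mutex
	ttl        time.Duration
	fetch      func(key string) string // stand-in for an origin fetch
	items      map[string]entry
	refreshing map[string]bool
	wg         sync.WaitGroup
}

func NewStaleCache(ttl time.Duration, fetch func(string) string) *StaleCache {
	return &StaleCache{
		ttl:        ttl,
		fetch:      fetch,
		items:      map[string]entry{},
		refreshing: map[string]bool{},
	}
}

// Get blocks on the origin only for a miss. A stale hit returns the old
// body right away and starts at most one background refresh per key.
func (c *StaleCache) Get(key string) string {
	c.mu.Lock()
	e, ok := c.items[key]
	if !ok {
		c.mu.Unlock()
		body := c.fetch(key)
		c.mu.Lock()
		c.items[key] = entry{body: body, fetchedAt: time.Now()}
		c.mu.Unlock()
		return body
	}
	if time.Since(e.fetchedAt) > c.ttl && !c.refreshing[key] {
		c.refreshing[key] = true
		c.wg.Add(1)
		go c.refresh(key)
	}
	c.mu.Unlock()
	return e.body
}

// refresh re-fetches key from the origin and swaps in the new body.
func (c *StaleCache) refresh(key string) {
	defer c.wg.Done()
	body := c.fetch(key)
	c.mu.Lock()
	c.items[key] = entry{body: body, fetchedAt: time.Now()}
	delete(c.refreshing, key)
	c.mu.Unlock()
}

// Wait blocks until in-flight background refreshes have finished.
func (c *StaleCache) Wait() { c.wg.Wait() }

func main() {
	version := 0
	cache := NewStaleCache(time.Hour, func(key string) string {
		version++
		return fmt.Sprintf("%s@v%d", key, version)
	})
	fmt.Println(cache.Get("/app.css")) // miss: fetched from the origin
	fmt.Println(cache.Get("/app.css")) // fresh hit: served from the cache
}
```

The trade-off is that a client may briefly see an outdated body, in exchange for never waiting on the origin once an object has been cached.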

tmh88j 2853 days ago

What would you suggest and why? Not a loaded question for all the people eager to downvote.

skunkworker 2853 days ago

Personally I would build it in something like Go. I've done a lot of work in Rails, and I would probably have the signup/profile/interface built in Rails 5.2 but use a high-performance Go framework for the really intensive stuff.

I've been considering building my own, but getting a gossip protocol up and running to share data between nodes isn't the easiest thing in the world to code.

michaelgv 2853 days ago

It’s a pain to code. I’ve done it, and I hated every second of it. Keeping data in sync with dynamic data in near real-time is terrible.

I wrote the CDN in Go, with Redis and a smaller Go-powered daemon to retrieve assets every 20 seconds, sync them to a local storage drive, and after 5 days retrieve again - or, if there are no requests within 48 hours, clear the unused items.
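
The refresh and cleanup rules described above can be sketched as a small decision function. This is only an illustration under assumptions, not michaelgv's code: the `Asset` fields, the `Decide` and `Sweep` names, and the reading of "every 20 seconds" as the daemon's sweep interval are all guesses made for the example.

```go
package main

import (
	"fmt"
	"time"
)

// Thresholds taken from the comment; how they were applied is assumed.
const (
	refreshAfter = 5 * 24 * time.Hour // re-fetch an asset after 5 days
	evictAfter   = 48 * time.Hour     // drop an asset nobody asked for in 48 hours
	sweepEvery   = 20 * time.Second   // assumed interval between daemon passes
)

// Asset is a hypothetical record of one file held on local storage.
type Asset struct {
	Path          string
	FetchedAt     time.Time
	LastRequested time.Time
}

// Action is what the daemon should do with an asset on this pass.
type Action int

const (
	Keep Action = iota
	Refresh
	Evict
)

// Decide checks eviction first so that idle assets are not re-fetched.
func Decide(a Asset, now time.Time) Action {
	switch {
	case now.Sub(a.LastRequested) >= evictAfter:
		return Evict
	case now.Sub(a.FetchedAt) >= refreshAfter:
		return Refresh
	default:
		return Keep
	}
}

// Sweep applies Decide to every asset and returns the paths to act on.
func Sweep(assets []Asset, now time.Time) (refresh, evict []string) {
	for _, a := range assets {
		switch Decide(a, now) {
		case Refresh:
			refresh = append(refresh, a.Path)
		case Evict:
			evict = append(evict, a.Path)
		}
	}
	return refresh, evict
}

func main() {
	now := time.Now()
	assets := []Asset{
		{Path: "/logo.png", FetchedAt: now.Add(-6 * 24 * time.Hour), LastRequested: now.Add(-time.Hour)},
		{Path: "/old.js", FetchedAt: now.Add(-24 * time.Hour), LastRequested: now.Add(-72 * time.Hour)},
	}
	refresh, evict := Sweep(assets, now)
	fmt.Println("refresh:", refresh, "evict:", evict)
	// A long-running daemon would repeat Sweep on a time.NewTicker(sweepEvery).
}
```

Keeping the policy in a pure function of the asset and the current time makes it easy to test without touching disk, Redis, or the network.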

Then I set up a system so that if one edge requests an “unpopular” file, it’ll ping a simple REST API and have all the other edges pull that file, thus allowing the edges to stay “one step ahead” of the user load.
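
A rough in-memory sketch of that fan-out follows. In the setup described, the call would travel over a REST API; here `Coordinator`, `Edge`, and `Announce` are hypothetical names, the HTTP layer is left out, and announcing each path only once is an added assumption rather than something the comment states.

```go
package main

import "fmt"

// Edge is a hypothetical handle to one edge node. Prefetch only records
// the path; a real edge would pull the file from the origin.
type Edge struct {
	Name       string
	Prefetched []string
}

func (e *Edge) Prefetch(path string) {
	e.Prefetched = append(e.Prefetched, path)
}

// Coordinator stands in for the service behind the REST API.
type Coordinator struct {
	edges     []*Edge
	announced map[string]bool // assumption: each path is fanned out once
}

func NewCoordinator(edges ...*Edge) *Coordinator {
	return &Coordinator{edges: edges, announced: map[string]bool{}}
}

// Announce is called by the edge that just served an unpopular file.
// Every other edge is told to prefetch it; the return value is the
// number of edges notified.
func (c *Coordinator) Announce(from, path string) int {
	if c.announced[path] {
		return 0
	}
	c.announced[path] = true
	told := 0
	for _, e := range c.edges {
		if e.Name == from {
			continue // the reporting edge already has the file
		}
		e.Prefetch(path)
		told++
	}
	return told
}

func main() {
	ams, fra, nyc := &Edge{Name: "ams"}, &Edge{Name: "fra"}, &Edge{Name: "nyc"}
	coord := NewCoordinator(ams, fra, nyc)
	n := coord.Announce("ams", "/video/rare.mp4")
	fmt.Println("edges notified:", n)
	fmt.Println("fra prefetched:", fra.Prefetched)
}
```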

skunkworker 2853 days ago

Yeah, when thinking it through personally, it comes down to a hard math problem, because you have to maintain the state of the local files and decide whether they should live in memory vs SSD vs another node. Did you use an LRU cache for expunging less utilized resources?
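
For context, an LRU policy of the kind asked about can be sketched with Go's `container/list`. This is a generic, size-bounded illustration, not anything from michaelgv's system; the `LRU` type and its `Touch` and `Has` methods are hypothetical names.

```go
package main

import (
	"container/list"
	"fmt"
)

// item is what each list element stores.
type item struct {
	key  string
	size int
}

// LRU (hypothetical) tracks cached objects by total size and evicts the
// least recently used ones when the byte budget is exceeded.
type LRU struct {
	capacity int        // maximum total size in bytes
	used     int        // current total size in bytes
	order    *list.List // front = most recently used
	index    map[string]*list.Element
}

func NewLRU(capacity int) *LRU {
	return &LRU{capacity: capacity, order: list.New(), index: map[string]*list.Element{}}
}

// Touch records an access. A known key moves to the front; a new key is
// added, and older keys are evicted until the cache fits its budget
// again. The evicted keys are returned so the caller can delete files.
func (c *LRU) Touch(key string, size int) []string {
	if el, ok := c.index[key]; ok {
		c.order.MoveToFront(el)
		return nil
	}
	c.index[key] = c.order.PushFront(item{key: key, size: size})
	c.used += size

	var evicted []string
	// Never evict the entry that was just added, even if it is oversized.
	for c.used > c.capacity && c.order.Len() > 1 {
		old := c.order.Remove(c.order.Back()).(item)
		delete(c.index, old.key)
		c.used -= old.size
		evicted = append(evicted, old.key)
	}
	return evicted
}

// Has reports whether key is currently tracked.
func (c *LRU) Has(key string) bool {
	_, ok := c.index[key]
	return ok
}

func main() {
	cache := NewLRU(10)
	cache.Touch("/a.js", 4)
	cache.Touch("/b.js", 4)
	cache.Touch("/a.js", 4) // /a.js becomes the most recently used
	evicted := cache.Touch("/c.js", 4)
	fmt.Println("evicted:", evicted)
}
```

The same bookkeeping could decide between tiers (memory, SSD, another node) by demoting evicted keys instead of deleting them, though that is beyond what this sketch covers.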

michaelgv 2853 days ago

State is much less important to track, and it’s easier to do. The real challenge is garbage collection: you need it, but you don’t want to collect too much in memory. That’s why Redis is a great tool for our edge servers.

skunkworker 2853 days ago

And nothing made me realize just how slow the speed of light is until I started looking into the CAP theorem and distributed databases like CockroachDB.
