| HN Mirror


	by renewiltord 2119 days ago
	Jesus Christ, this is insane. Almost a Terabyte of 12.6 Gbps reads? I have a bunch of geospatial entity resolution workloads that I could absolutely smash with this. For way cheaper than the fat mem instances.

3 comments

jiggawatts 2118 days ago

GBps, not Gbps! They were getting 12.6 gigabytes per second, which would be 100 gigabits per second almost exactly.
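As a quick check of that conversion, here is a minimal Python sketch. It assumes decimal units and 8 bits per byte, and it ignores protocol overhead; the function name is made up for illustration.

```python
def gbytes_to_gbits(gbytes_per_sec: float) -> float:
    """Convert gigabytes per second to gigabits per second (8 bits per byte)."""
    return gbytes_per_sec * 8


# The 12.6 GB/s read figure quoted in this thread.
optane_gbps = gbytes_to_gbits(12.6)
print(f"12.6 GB/s is about {optane_gbps:.1f} Gbps")
```

The factor of eight is the whole difference between the "Gbps" and "GBps" readings of the same number.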

However, even their 6-DIMM test produces only 300 Gbps, which is insufficient to saturate a modern 400 GbE network adapter for either reads or writes.

This would be most relevant on a "single master" system storing some sort of simple data where consistency requirements mean that the "writer" cannot be distributed. In a situation like this, the NIC and the storage bandwidth are the ultimate limits.

In general, Intel SSDs and Intel Optane have poor but consistent bandwidth, and consistently low latency. Coupled with the high price and small capacity, they have their niche, but they're not a clear winner in any category.

As a reference point for how crazy high bandwidths are these days, NVIDIA sells a turnkey solution with 200 GB/sec network bandwidth (1.6 Tbps!): https://www.nvidia.com/en-au/data-center/dgx-a100/

lmilcin 2118 days ago

Where this is useful is when the application needs to process much more data than it then needs to transmit over the network.

For a database system like mongodb this could be perfect depending on workload.

lrem 2119 days ago

Back in the day when I interviewed for Google I had this beautiful question. The interviewer fished for a basic distributed key-value store. I just kept coming up with a single-machine solution to his numbers. "No, I really can have that storage+bandwidth, here's the part number."

I'm still wondering if that interview cost me a level.

lmilcin 2118 days ago

Being an interviewer myself, I can say interviewers tend to be stubborn. They have an excellent question (at least they think they have) that they have spent a lot of time thinking through and discussing with other candidates. Your answer causes them to miss the opportunity to have a real discussion and understand your knowledge of the topic.

It is better to play along and only mention the real solution at the end, to finish on a high note.

thechao 2118 days ago

Alternately, the interviewer should recognize that their question is flawed & move on. I’ve had candidates nail a 30 minute discussion question with a 2 minute nonobvious answer. Shit happens: you’re not the smartest person in the room that day.

dijit 2118 days ago

It should go without saying that as an interviewer, I am never the smartest person in the room. I am a person with different experiences and I need to find out if you’re a fit, not if you’re unintelligent. If you got to me, you’re not unintelligent.

lmilcin 2118 days ago

Tell this to my candidates:)

I have interviewed a couple of candidates who were offended by the fact that I am asking hard questions that I know the answers to.

This resulted in remarks like "I have never seen this question on the Internet" or "So what is the answer to this question anyway?" or "If I knew the questions would be this hard I would not have bothered to come".

Somebody explain to people that it doesn't matter how hard the questions are but how the candidates compare. I am fully prepared for candidates to study questions that are available on the Internet; I want to see how they deal with something that requires a bit more than a couple of hours of effort and rote memorization.

mushi 2118 days ago

Is this because you were interviewing candidates who were incapable of saying “I don’t know?” That’s a good filter during any interview.

AnIdiotOnTheNet 2118 days ago

Most interviewers, though, aren't actually any good at it, and very often don't like not being the smartest person in the room.

lmilcin 2118 days ago

Whether you are the interviewer or the interviewee, putting your ego before the goal of the interview is just going to waste the time of both parties.

I can't hire a person who can't restrain their ego for the duration of the interview, and an interviewer who is focusing on anything other than figuring out whether the candidate is the right person for the job is doing a disservice to everybody.

dhosek 2118 days ago

I guess it depends on where you're doing the interviewing. I've had surprisingly unintelligent people get to me in the interviewing stages. It amazes me how many developers, even experienced ones, can't write a simple recursive function. And we're not even looking at "can you come up with a recursive function as a solution to this problem?"; we're looking at "write a recursive function to calculate the factorial" as the question.
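For reference, one answer to that factorial question might look like the Python sketch below. The thread does not name a language, so Python and the choice to reject negative input are assumptions.

```python
def factorial(n: int) -> int:
    """Return n! computed recursively; n must be a non-negative integer."""
    if n < 0:
        raise ValueError("factorial is undefined for negative integers")
    if n <= 1:
        # Base case: 0! and 1! are both 1.
        return 1
    # Recursive case: n! = n * (n - 1)!
    return n * factorial(n - 1)


print(factorial(5))
```

The recursion depth grows linearly with `n`, so in CPython this version hits the interpreter's recursion limit for large inputs, where a plain loop would not.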

joefourier 2118 days ago

How often do recursive functions come up in your daily programming life? For me they're rare enough that it makes sense that many otherwise competent developers draw a blank when asked to write one in the context of a high-stress interview.

They may look clever, but they're often not the ideal solution compared to an iterative one that is simpler to understand and often easier to optimise. I write plenty of DSP, GPGPU, and computer vision code, and I honestly can't remember the last time I wrote a recursive function.

asdfasgasdgasdg 2118 days ago

Even if you can do a particular problem on a single machine, sometimes that isn't the right call. In a work scheduled cluster environment, a task that wants the entire machine may have trouble getting a slot unless it has the priority to preempt everything else going on on that machine. We call such VMs "picky" and they don't get scheduling guarantees.

lrem 2118 days ago

Heh, sure. But Borg was not part of the stated question, just of the expected answer. It wasn't even close to being needed to meet the stated parameters (IIRC the whole data set would fit into a single SSD card, so one could even have room for growth without scaling out).

michaelt 2118 days ago

I'm having trouble thinking of how you'd end up with a "cluster" that couldn't provide the power of a single machine?

asdfasgasdgasdg 2118 days ago

To make it concrete: consider a cluster of ten machines, each with 256 GB of RAM. Team A and team B both schedule jobs with a requirement of 129 GB of RAM and five replicas each. The replicas get scheduled, one on each machine. Team C wants to schedule a task that takes an entire 256 GB of RAM. If they don't have sufficient priority to preempt the other jobs, then they will not schedule.

In real heterogeneous-workload production clusters, every available machine likely has several VMs scheduled on it if the cluster isn't idle. There is never a full machine that's free unless some special effort has been taken to make it so.
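The ten-machine example can be played out in a few lines of Python. This is a toy sketch only: the `Machine` class and the first-fit `schedule` function are hypothetical, and a real cluster scheduler would also handle priorities and preemption.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Machine:
    capacity_gb: int
    used_gb: int = 0

    @property
    def free_gb(self) -> int:
        return self.capacity_gb - self.used_gb


def schedule(machines: List[Machine], request_gb: int) -> Optional[int]:
    """First-fit without preemption: return the chosen machine index, or None."""
    for index, machine in enumerate(machines):
        if machine.free_gb >= request_gb:
            machine.used_gb += request_gb
            return index
    return None


# Ten machines with 256 GB each, as in the example above.
machines = [Machine(capacity_gb=256) for _ in range(10)]

# Teams A and B: five replicas each at 129 GB. Two replicas cannot share a
# machine (2 * 129 > 256), so each replica lands on its own machine.
placements = [schedule(machines, 129) for _ in range(10)]

# Team C asks for a whole machine and finds none free.
team_c = schedule(machines, 256)
total_free = sum(m.free_gb for m in machines)

print("replica placements:", placements)
print("team C placement:", team_c)
print("total free RAM across cluster (GB):", total_free)
```

The cluster still has plenty of free RAM in aggregate, but it is fragmented into chunks too small for a whole-machine task.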

fragmede 2118 days ago

The knapsack problem is known to be NP-complete, and an algorithm that fixates on getting jobs the size of a single machine to run will fail to run multi-machine jobs as successfully as an algorithm that does not. It's a far more interesting algorithm to think about than sorting lists of numbers. Job priorities are easy enough to add in, but the far more practical issue is with noisy neighbors. Even limiting things to single jobs on single machines, network and storage bandwidth have bottlenecks the cluster scheduler has to optimize for.

To make things more complicated, a long-lived cluster is going to be made up of different classes of machines, with different CPU microarchitectures, so 'single machine' is overly constraining. E.g., it's not interesting that a job with a 4 MiB requirement can always run if your job needs 32 GiB.

throwaway_pdp09 2118 days ago

http://www.frankmcsherry.org/graph/scalability/cost/2015/01/...

"Rather than making your computation go faster, the systems introduce substantial overheads which can require large compute clusters just to bring under control.

In many cases, you’d be better off running the same computation on your laptop."

My limited experience fits this in that a bit of smarts on a single box beats a bunch of boxes.

(the link is a very good read BTW)

yencabulator 2117 days ago

The biggest problem with that write-up is ignoring availability. "Fast all the way up to the crash" can be much worse than slow and steady.

Of course, for a batch job with a runtime under 11 minutes, that probably doesn't really matter too much. Just don't generalize that too much.

sukilot 2118 days ago

Your interviewer may have lacked skill in guiding the process, but you could have taken the question in the spirit it was given and answered "now if the usage grows 10x as much as hardware advances" or "if we need to be resilient to hardware failures" or "if we need to roll out a logic upgrade slowly" and continued to solve the problem.

> I'm still wondering if that interview cost me a level.

Unlikely. Candidate level is decided before the interview.

twic 2118 days ago

An interviewer once gave me a problem about scraping some information out of a log file. I wrote a short shell pipeline. The interviewer asked for a more complicated analysis. I wrote a longer pipeline. The interviewer added more requirements, with aggregation, state, etc in order to push me to write an actual program. I wrote an even longer pipeline.

By this point we had both realised that this was a battle of wits: could he come up with a problem that i couldn't solve with a pipeline?

At the end of the interview, i had a pipeline that took up most of a piece of A4 paper to write out. I had won the battle, and was offered the job.

Of course, i would not advise you to actually write a pipeline like that in production, but it's a fun exercise.

Anyway, the moral of this story is that if the interviewer wants you to solve a problem a certain way, and you can solve it in a simpler way, then a good interviewer will mark you up, not down. Perhaps at Google they didn't; they don't really seem like a company that has it together.

sukilot 2118 days ago

Your thesis is that a company that has it together would only write software that is either a shell pipeline or can be solved in 45 minutes?

> would not advise you to actually write a pipeline like that in production,

So when you are interviewing for a production job, why would you fight for non-production quality solutions?

ur-whale 2118 days ago

If the interviewer was gauging your political skills/adaptability, it doesn't look like you fared too well.

Frost1x 2118 days ago

That seems like an excuse for poor behavior more so than a deeper level of testing (testing how you react when you know you're right but someone with more authority tells you otherwise). While possible, I'd have to think it's unlikely.

lrem 2118 days ago

Do you test new grads for their political skills? I don't.

dominotw 2118 days ago

> solution to his numbers.

Why did he have concrete numbers? Couldn't he simply have increased those numbers when you proposed a single-machine solution?

Jabbles 2118 days ago

Does anyone have good numbers for a) RAM, and b) the cost of each of these solutions?

(RAM is obviously volatile.)