| HN Mirror



	by weakwire 1151 days ago
	It's really fast! That is very important. It also provides 3 draft variations. I see this as a winner.

1 comments

cush 1151 days ago

The Copilot VSCode extension does 10 drafts


londons_explore 1151 days ago

Both can be made to give more drafts by just asking the same question again...

I don't see more drafts as a real differentiator.


lavasalesman 1151 days ago

UX is a feature


cush 1149 days ago

> I don't see more drafts as a real differentiator.

Well, the person I replied to did.


weakwire 1151 days ago

It spits out 3 drafts in one go, without the need to wait. It is simply way faster than GPT-4.


verdverm 1151 days ago

I wonder how much of that has to do with TPUv4 vs the hardware used for GPT4?

Google has invested in custom AI hardware for some time now and does not run its workloads on Nvidia cards.


londons_explore 1151 days ago

Neural networks are really parallelizable. If I scale up my AI service to handle double the number of users by buying double the number of GPUs, it is theoretically possible to also serve each user in half the time.

To do so, you need to split the matrix multiplies across the new machines. You also need more inter-machine network bandwidth, but with GPT-3 that works out to 48 kilobytes per predicted token, collected from every processing node and given to every processing node. Even if Bard is 100x as big, that is still very doable with datacenter-scale networking.
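
A rough sketch of the idea above, in plain Python. The comment does not say where the 48 kilobytes comes from; one plausible reading, assumed here, is GPT-3's published hidden size of 12288 values stored at 4 bytes each. The helper names (`split_rows`, `sharded_matvec`, `activation_bytes`) and the worker count are made up for illustration, and the sketch only covers a single exchange, not how many exchanges a full forward pass would need.

```python
# Sketch: split one matrix-vector multiply across hypothetical workers,
# then estimate the size of the vector every worker must receive.

def matvec(weights, x):
    """Multiply a row-major matrix (list of rows) by the vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def split_rows(weights, n_workers):
    """Give each worker a contiguous slice of the output rows."""
    step = (len(weights) + n_workers - 1) // n_workers
    return [weights[i:i + step] for i in range(0, len(weights), step)]

def sharded_matvec(weights, x, n_workers):
    """Each worker computes its own slice of the output.

    Concatenating the slices stands in for the network step in which
    every node collects the partial results from every other node.
    """
    out = []
    for shard in split_rows(weights, n_workers):
        out.extend(matvec(shard, x))
    return out

def activation_bytes(hidden_size, bytes_per_value):
    """Size of one full activation vector, in bytes."""
    return hidden_size * bytes_per_value

GPT3_HIDDEN_SIZE = 12288  # published GPT-3 175B hidden size
BYTES_PER_VALUE = 4       # assumption: 32-bit floats

payload = activation_bytes(GPT3_HIDDEN_SIZE, BYTES_PER_VALUE)
payload_kib = payload // 1024

# Tiny check that sharding does not change the result.
W = [[1, 2], [3, 4], [5, 6], [7, 8]]
x = [1, 1]
full = matvec(W, x)
sharded = sharded_matvec(W, x, n_workers=2)
```

Under those assumptions `payload_kib` comes out to 48, matching the figure quoted above, and `sharded` equals `full`, so splitting the rows changes only where the work happens, not the answer.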

However, OpenAI doesn't seem to have done this - I suspect an individual request is simply routed to one of n machine clusters. As they scale up, they are just increasing n, which doesn't give any latency benefit for individual requests.


verdverm 1151 days ago

Yup, the TPUv4 pod is highly optimized.

They are claiming to be the first to achieve >50% saturation during training. Pretty sure I recall Midjourney is using TPUv4 pods too.

https://cloud.google.com/blog/products/ai-machine-learning/g...

https://cloud.google.com/tpu/docs/system-architecture-tpu-vm
