Hacker News
DeepMind’s new AI with a memory outperforms algorithms 25 times its size (singularityhub.com)
324 points by darkscape 1641 days ago
15 comments

Very interesting. GPT-J is a free, open-source alternative to GPT-3 and requires at least 12.1GB of memory to run the model (reduced from the original 48GB of RAM). But if the model stores some kind of index and does internet (or hard drive) searches instead, then it could scale much further, as there is a limit on how much memory you can use in production.
Just doing some napkin math, the whole GPT-J corpus was around 500 billion tokens, which at 4 bytes per token would be roundabout 2 terabytes. That, parked on a fast NVMe SSD, will give you roundabout 1MM random lookups per second. Even with some transfers in between, this should be more than enough to not just perform in equal time, but probably less, as well as cost you less than the GPU you need for the reduced-size model.

Exciting times.
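
For anyone who wants to redo the napkin math, here is a minimal sketch. It assumes an average of about 4 bytes of text per token (a rough rule of thumb, not a measured figure) and uses decimal terabytes; the function name is made up for illustration.

```python
# Back-of-the-envelope corpus size. Both inputs are rough assumptions
# taken from the comment above, not measured values.
TOKENS = 500e9          # ~500 billion tokens
BYTES_PER_TOKEN = 4     # assumed average bytes of text per token


def corpus_size_tb(tokens: float, bytes_per_token: float) -> float:
    """Return the corpus size in decimal terabytes (1 TB = 1e12 bytes)."""
    return tokens * bytes_per_token / 1e12


size_tb = corpus_size_tb(TOKENS, BYTES_PER_TOKEN)
print(f"{size_tb:.1f} TB")
```

Changing `BYTES_PER_TOKEN` shows how sensitive the estimate is to the tokenizer.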

The real problem with (NVMe) SSD is that they have a limited number of write cycles (a max TB written).

If you don't update your database and indices they are great. But that's something really tempting to do when you do some machine learning (especially if you know that people with deeper pockets will do so).

Typically you will have a neural network, you run it on your dataset, it produces a new dataset of embeddings, you index them, and you use this index to train a new neural network, and you repeat the loop, hopefully improving results along the way.

An NVMe SSD can write at 6GB/s but can only write ~800TB in total; that's about 37 hours of lifetime at max speed.
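
The endurance figure is easy to check. This is a small sketch that takes the 800TB write budget and 6GB/s write speed above as given (real drives vary) and uses decimal units; the function name is hypothetical.

```python
def hours_until_worn_out(endurance_tb: float, write_gb_per_s: float) -> float:
    """Hours of continuous writing before the rated write budget is used up."""
    seconds = endurance_tb * 1000 / write_gb_per_s  # TB -> GB (decimal units)
    return seconds / 3600


hours = hours_until_worn_out(endurance_tb=800, write_gb_per_s=6)
print(f"about {hours:.0f} hours")
```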

> Just doing some napkin math, the whole GPT-J corpus was around 500 billion tokens, which at 4 bytes per token would be roundabout 2 terabytes.

"Only" 825 GB actually: https://pile.eleuther.ai/

A not-insignificant fraction of that is definitively copyrighted material, though, which raises some interesting questions when switching to a model of distributing "a smaller trained model plus the original raw training data" (though it seems that the team behind GPT-J are clearly happy to distribute their full set of data anyway, and seem to be enough under the radar to not attract the wrong sort of attention, at least for now).

Not pointing out such potential problems in public forums is likely to extend the possibility that it remains readily available.
Touché. (Though with regard to those particular problematic bits, they already tweeted themselves about it, and that tweet had more likes than this submission currently has points)
>48GB ram

48GB VRAM? 48+ gigabytes of system ram is cheap, 48 gigabytes of ram on a GPU is still painfully expensive.

Yes, in this GPU market that's essentially a new car's worth of cash.
Or $10 a month for colab or any of the various and sundry cloud services available. If you end up needing a dedicated service, huggingface is phenomenal - the barriers to using large language models are trivial.
The AMD APUs would be interesting, although underpowered. They give you the option of setting "VRAM" size to almost any percentage of system memory.
GPUs are the slowing factor in general if I'm not mistaken when it comes to Deep Learning progress.
Yes, vram
Would m1 max work?
Right now there's meh Deep Learning support on the m1 max. TensorFlow is ~supported~, PyTorch is not. Apple has their Neural Engine and CPU/GPU architectures sort of hidden and so it's hard to port or support them.
Isn't this what watson used to do?
A neural net with access to Wikipedia is faster than a neural net that contains Wikipedia? Seems odd to call it AI with a memory though... unless I'm misunderstanding. It's more like AI with a decent memory and an understanding of how to use an encyclopedia.
Yeah, memory implies persisted state in the model, this is static lookups separate from the transformer.

Still superb, though, there's no reason you can't use other gofai tools vs a static database, to trigger expert systems or formalized reasoning.

It's not gofai. It's locality sensitive hashing published in 2008.
An external knowledge base plugged in to an inference engine? Well, that is 100% pure GOFAI, distilled and bottled, ready to go. Sorry.
I got into a very long debate with an openai person 4/5 years ago about this + adversarial learning + access to a quantum computer (think just straight up world class abacus) was close to the primitives required for more generalized AI. They didn't agree with me, but that's ok! :)
> + access to a quantum computer

Yeah... i think i already agree with the person disagreeing with you just based on this description. "world class abacus" is not an accurate description of what a quantum computer is.

That poor researcher.
Thankfully we're very good friends, so it was a few hours well spent over beers (I hope...?)
Does anyone know if we understand enough about natural, generally intelligent brains to dismiss the idea that they are using quantum phenomena for computation? Is it unlikely for any reason?
I think people bring up this quantum phenomenon stuff a bit too often in an attempt to add a layer of mysticism to humanity. We should only jump to that conclusion if we have some kind of proof or reason to believe so. We don’t even understand a lot of how the brain works at the macro scale yet so it’s a bit early to start saying it’s dependent on those processes (other than in the same way all physical systems are dependent on quantum phenomena.)

I’m also not a neuroscientist or a physicist though, so this is just my relative layman’s take.

I believe plenty of "quantum phenomena" are utilized in humans' biochemistry[0]. Whether the brain "calculates" things using a method that is particularly similar to the methods used in today's quantum computers is...unlikely. It probably is "quantum" in other ways though.

0: https://www.the-scientist.com/infographics/infographic--quan...

Yes. It's unavoidably "quantum" in the sense that, as a physical machine in our universe, it's subject to the rules, including quantum physics. However, there is no apparent mechanism by which "thinking" could harness any interesting properties of quantum physics; it's just not happening at the right scale, like the way kids' sand walls on the beach don't alter the world's tides.
Quantum computers don't seem very useful yet, but brains are extremely useful (and they use much less power). Maybe current generation quantum computers would be better if they used quantum mechanics more similarly to the ways our brains use it. The article you linked mentions that entanglement and qubits are being "studied in human consciousness" but I'm not sure what exactly that entails.
I had been doing a bunch of research on quantum biology around the time that I had the conversation I mentioned above, that's probably why at the time I wanted to discuss it with him. I think these two lectures are pretty good and I'd watched them around the time I was discussing this stuff with Jack.

Quantum Biology: The Hidden Nature of Nature - https://www.youtube.com/watch?v=ADiql3FG5is

An Introduction to Quantum Biology - with Philip Ball - https://www.youtube.com/watch?v=bLeEsYDlXJk

Roger Penrose’s The Emperor’s New Mind proposed this in 1989. He gave circumstantial possibilities, and I don’t think we know more than that now.

As an amateur, my instinct is that the mystery of the observer "causing" wave function collapse is the best clue we have either way.

I wouldn’t be surprised if we are just weighted neurons, or if we do something quantum too. I doubt the something quantum will be the same as what our quantum computers do.

It's worth noting that it's not the observer causing anything, but a "technical observation" i.e. any interaction with the macroscopic environment; the analogy of "observer" leads to a misleading implication that there's something special if an agent or conscious entity is doing the observation, which it is not.
> Consciousness causing the wave-function to collapse isn't a very popular idea these days and there are alternative explanations, though as far as I know there isn't any evidence that it is involved or if it isn't. The wave function collapse happens regardless of whether a human or any conscious being is present, all that is needed is something does an observation. Imagine a scenario where a QM experiment is done, and the result is recorded by machines and stored as a data file. Let 100 years pass then distribute copies of the file to a million people. They will all see the same thing.

Source https://www.physicsforums.com/threads/does-consciousness-cau...

Actually I had just done a bunch of researching on Penrose around the time I was having that conversation with Jack, in fact I think I'd watched these two lectures within a few months of that conversation:

Quantum Biology: The Hidden Nature of Nature - https://www.youtube.com/watch?v=ADiql3FG5is

An Introduction to Quantum Biology - with Philip Ball - https://www.youtube.com/watch?v=bLeEsYDlXJk

Not necessary. Quantum AI is basically explaining something mysterious by something else mysterious, even when there is no indication they are related.
Aren't a lot of AI breakthroughs like this though? "It works, but we don't know why"
There's actually a whole field of study that uses tools from quantum information theory to successfully model cognitive and social situations. This doesn't require any assumptions of "quantum physicality". It just makes sense to model probabilistic, reflexive, slippery-to-observe systems with tools that were designed for this. Quantum theory has really great tools and they work well.

I realize this isn't quite the answer you were looking for, but I thought it was worth mentioning.

I personally think this idea that there is some obvious categorical distinction between hard "physical" quantum phenomena and classically probabilistic ones is a fallacy. Quantum theory is in some sense just probability theory with more features.

More and more researchers are catching onto this and I think that's really exciting.

Surely they'd just use tools from regular information theory and Bayesian inference? Quantum theory is just one particular application of information theory, where uncertainty is directly controlled by physical processes rather than simply due to imperfect knowledge.
The quantum phenomena for calculations require strictly limited interactions as otherwise you just get decoherence. Quantum computations inside of our brain cells are unlikely because it's relatively very hot, having random interactions which would disrupt quantum phenomena.
Someone asked this question to Leonard Susskind once at a Q&A and he said the chances are basically zero. When the guy kept pressing on the idea Susskind got kind of exasperated and seemed like he was on the verge of calling the idea woo.
I don't think we know enough to know, I also don't think we know enough to rule it out, to me it's plausible, and that was generally the debate... is a quantum requirement there or not. Also as a note, this was conversation over beers with a buddy, so it wasn't HN scrutiny standard, we were getting philosophical. (I know very little about the subject so my opinions aren't much more than "fun ideas".)
Nobody needs to "dismiss the idea"; the null hypothesis is that brains don't exploit quantum phenomena. In the absence of any reason for them doing so, and given the lack of evidence that they do so, the onus is on the quantum brain folks to come up with better falsifiable theories.
Even if they were (big if without really any evidence), we have no idea how, so adding a quantum computer to the mix won't get us any closer to general AI.

It would be like adding a classical computer we don't know how to turn on; it won't help anything if we don't know how to use it.

We will only know if we can't build it.

My gut feeling is that even if we use some probabilistic quantum compute it should be transferable to normal compute. Also animals vs human don't have enough difference to assume we are special.

What’s your argument?
I hope I didn't come off like I think I know anything about this field, because I don't. A friend who used to work for OpenAI (Jack Clark) and I once spent some time over beers discussing general purpose AI. I proposed that quantum is a dark horse on the road to general purpose AI, and he disagreed.
I think the person you're replying to is asking for your specific argument. As in what exactly do you think is the advantage gained by quantum computing and why is it important.
Sounds like a fun debate to me.
Yeah he's right, you're wrong.
I wonder what we can learn from AI models about how humans work.

Like, could we assume that for humans it's also faster to search for information on Wikipedia or would it be faster to recall from memory of already read Wikipedia? Although with humans stored information decay is present. (In a way a human form of garbage collection :P).

I'd be interested to see if these models are robust against algorithms like TextFooler [0]. I'm skeptical this trend of 10x'ing the parameters will solve the "clever hans" problem.

[0]: https://github.com/jind11/TextFooler

Dup https://news.ycombinator.com/item?id=29486607

(This is a different blogpost, but does not seem to add over the original)

Edit: following derac's comment see https://news.ycombinator.com/item?id=29646112 for RETRO

This article is about RETRO [0], not Gopher.

[0] https://deepmind.com/research/publications/2021/improving-la...

Seems odd to claim 25x reduction in size when the algo involves looking into a database of a trillion chunks of text.
The "algo" here refers to the neural net itself. The text index is considered an easy problem as you can do lookups in logarithmic time.
The word "Algo" here is definitely awkward. The point is though that what matters most here is the number of parameters, as those correlate quite closely with training and inference time. Storage space is pretty trivial, but TPU cycles are less so.
Thanks for this. Rereading + your comment and I think I have a better understanding of why this is progress.
I've not kept track of where large transformers like this have gotten to, GPT3 and the like - has GPT3 made any real difference to the world? Are people using it? Has it vastly improved any software?
It's a safe bet that Google is using transformers at scale for search and translations - the full extent isn't public but they release a fair amount of research papers, e.g. the current article, or https://ai.googleblog.com/2020/06/recent-advances-in-google-...

Github Copilot is definitely GPT-3-based and is seeing real-world use https://copilot.github.com

Transformers are state of the art for many tasks so they are likely to be used for "intelligent" processing of text or speech data, but due to practical limitations you are probably interacting with them mostly through web services.

I don't know about world changing but it's saved me hundreds of hours. I use it to help read academic papers, put formatting on things like markdown and subtitles, and do creative writing. A lot of the things that take it 15 seconds to do take me 2 minutes and drain me mentally for about 15 mins.

If anything, it's being used in force for social media marketing, where you're trying to say "buy this thing" in different ways every day.

Forgive the ignorance, but how? What tools are you using on top of GPT3 to do those things?
I recently used GPT-J to create some handouts (fake 1930's newspaper articles) for a roleplaying game. I wrote a headline and byline, then have the model suggest some text. I change the text to include the details I want the players to have + enforce consistency, then reuse the text-so-far as a prompt, and repeat.

I definitely cranked out the newspaper text much faster than I would have on my own, and the model actually made some really nice embellishments and added a couple ideas that I kept in the final text.

It's a funny question, like asking how to write a book with a computer, but perfectly valid.

You can access GPT-3 directly now. There's no waitlist, but there still are restrictions. There's some examples here: https://beta.openai.com/examples

You don't even need the API. Once you get access, it comes with access to the playground, which is enough to do anything you like.

If you look at the examples, it's very "no code". You literally tell the AI what you're trying to do and it tries its best. Most of the work in prompt engineering is writing something that can't be misunderstood. But you just have to explain to it what you want like you would to a child.

There is an API for GPT-3.

GitHub Copilot uses Codex, a descendant of GPT-3.

Seconding someone else's comment: what is your workflow for those tasks? How does it help you to read academic papers? Or to put formatting on markdown?
For academic papers, I basically just copy the abstract or particularly hard parts of a paper into this prompt: https://beta.openai.com/examples/default-tldr-summary

For formatting, I copy from our spreadsheet instruction, and copy what the "markdown" format looks like.

  Convert the following text to markdown format.

  Text:
  1. Open the app

  2. Enter your USERNAME and PASSWORD

  3. Go to the menu and click "Foobar"

  4. Enter the following into the field [CORRECT INPUT HERE]

  ```
  Markdown:
  # How to Use Foobar
  ## Subtitle
  - Open the app
  - Enter your USERNAME and PASSWORD
  - Select the menu and click <b>Foobar</b>
  - Enter the following into the field <b><span style="color:green;">{correctInput}</span></b>

  ```
  Text:
  (your input here)
Yes, I know it's not really markdown. But you give it an example of input and output and it will figure it out as easily as a human can.

The problem we faced was that it was meant to be markdown but it's not easy to write the parser, so we had a different format of "markdown" on different front ends. You'd have something a little different on Android, iOS, HTML.

But the beautiful thing is we can keep the same input and change the output to whatever the parser needs. And instead of having to write up regex that detects [CAPITAL LETTER INPUT] and converts that, the AI can just recognize it.
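
To make the pattern concrete, here is a rough sketch of how such a few-shot prompt could be assembled in code. The `build_prompt` helper and the `Text:` / `Markdown:` labels are illustrative assumptions modeled on the example above; sending the prompt to a model is deliberately left out, since that depends on whichever service you use.

```python
def build_prompt(instruction, examples, new_input):
    """Assemble a few-shot prompt: instruction, worked examples, then the new input.

    `examples` is a list of (source_text, desired_output) pairs. The prompt ends
    with an open "Markdown:" label so the model continues with the converted text.
    """
    parts = [instruction, ""]
    for source, target in examples:
        parts += ["Text:", source, "", "Markdown:", target, ""]
    parts += ["Text:", new_input, "", "Markdown:"]
    return "\n".join(parts)


examples = [
    (
        "1. Open the app\n2. Enter your USERNAME and PASSWORD",
        "- Open the app\n- Enter your USERNAME and PASSWORD",
    ),
]
prompt = build_prompt(
    "Convert the following text to markdown format.",
    examples,
    "1. Go to the menu\n2. Click [SAVE]",
)
print(prompt)
```

Swapping the example outputs for another front end's format changes what the model is asked to produce without touching the inputs.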

Thanks! Pretty smart!
If we point it at the horrendously bad scots wiki (some kid in the US decided he'd translate Wikipedia into what he thought was lowland scots/Doric.. it's a disaster) we might get entertainingly bad outcomes.
Note, the stuff written by said kid has long since been deleted. However i have no idea what the quality of the rest of scowiki is.
It's not awful, but I feel it's still pretty meh. I say this as a person raised in Edinburgh in the sixties and seventies. How bad? Well.. in their backend meta pages they link to the DSL (Dictionary of the scots language/dictionars o' Scots Leid [0]) which says this:

Written Scots In the written mode, Scots spelling remains variable. Attempts to make it more consistent, notably the Scots Style Sheet produced by the Makars’ Club in 1947 or the Recommendations for Writers in Scots published by the Scots Language Society in 1985, have had at best only limited success, competing with other systems that have been developed to represent more closely localized varieties of spoken Scots.

When your reference text says the language isn't yet well captured in a single print, you better believe the wiki page is a hot mess.

[0] https://dsl.ac.uk/about-scots/what-is-scots/

Well, it is pretty hard to make something in a language when it is a dialect continuum and not a standardized variety that is forced onto the whole population through the education system and media.
Not really relevant to the topic in question but... Isn't this how most languages begin?

First you have a continuum of language dialects, then one of them dominates for political reasons, then it gets codified, then enforced onto everybody through centralised education. Dialects not under direct unified political control become related but separate languages... And so on.

Yes, but I think the point is that trying to record a standardised version of something that has not yet been codified through that process is going to inevitably be somewhat approximate at best.
Could someone explain the article to a layman engineer?
It's language modelling with a search engine in the loop.

Instead of training GPT-3 with 175B weights, you train a 25x smaller model and allow it to retrieve useful snippets from a large text index as additional information.

This solves the problem of very large models and the problem of updating an already trained model, as you can swap the text corpus with a newer one. The model learns mostly syntax, burning less trivia in its weights than a regular LM as it can simply copy the relevant information from the index.

This development was bound to happen as large LMs are expensive to use and it was an obvious idea. We've had these semantic search text indices for a few years already[1], they just weren't combined with text generation.

[1] https://github.com/spotify/annoy
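
A toy sketch of the "retrieve, then generate" idea, with heavy caveats: the bag-of-words `embed` function stands in for a learned embedding, the brute-force scan stands in for an approximate nearest-neighbour index such as Annoy, and pasting the retrieved snippet into a prompt is a simplification (as far as I understand, RETRO feeds retrieved chunks into the network through attention rather than plain concatenation). All names and the sample chunks are made up for illustration.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: lowercase bag-of-words counts (stand-in for a learned model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, chunks, k=1):
    """Return the k chunks most similar to the query (brute force, no index)."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


chunks = [
    "The Eiffel Tower is in Paris and was completed in 1889.",
    "GPT-J is an open-source language model with 6 billion parameters.",
    "NVMe drives connect over PCIe.",
]
query = "When was the Eiffel Tower completed?"
context = retrieve(query, chunks, k=1)

# The retrieved text becomes extra input for a (much smaller) language model.
prompt = "\n".join(context) + "\n\nQuestion: " + query + "\nAnswer:"
print(prompt)
```

Updating what the model "knows" then amounts to replacing `chunks`, with no retraining.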

So the memory doesn't solve the context problem of e.g. "conversation context"? I.e. the storage isn't modified while the model is used? If I make an app that makes conversation using such a model, then the storage isn't modified to insert knowledge about what the early parts of the conversation were about, and it's only bringing a database of fixed information into the conversation? (I have a friend who is just like that).
You could update your storage as you go, the indexing doesn't appear to be that expensive.

For many tasks it wouldn't be helpful because the input is small enough to be covered by the context already, and for summarizing and question answering tasks, you want it to repeat information from other documents, but not from earlier in its own output.

It might be interesting for a long-context task like "given the first parts of this book, complete the next chapter".

I know AI Dungeon and Novel AI both factor in several recent text inputs when generating new text, and also have a memory section where you can add things you want the AI to never 'forget' about the current story.
Yes, the key technology here is a scalable embedding store. The leading players here are the indexes: Faiss and ScaNN. The open source platforms are OpenSearch, Elasticsearch, Featureform, and Milvus. Then there are SaaS products like Pinecone.
> Gebru, a widely respected leader in AI ethics research, is known for coauthoring a groundbreaking paper that showed facial recognition to be less accurate at identifying women and people of color, which means its use can end up discriminating against them.

Surely this is a function of location? I understand the U.S.-English term “person of color” to be convoluted language for “not white”. One simple thing I notice is that if I search for, say, “child” on Google Image Search, the images indeed tend to look like what one would expect from the average inhabitant of an English-speaking nation; when I search “子供”, I indeed mostly see what I would expect from Japan. Similarly, if I search for “huis”, what I find tends to look like a house most likely situated in the Netherlands; with “บ้าน”, it does resemble more so stereotypical Thai architecture.

I would assume that a.i.'s made in, say, Japan would yield different results.

The meaning in woke terminology is more subtle than just "not white". For example Asians would likely be excluded in this case, and Middle Easterners and other minorities. "People of color" in this case means blacks and dark skinned Latinos.

The idea that AI itself can be biased (as opposed to the dataset) also has some significant problems. The lead of Facebook AI Research got canceled on Twitter because he pointed out that it's the bias in the dataset used to train the AI that results in bias in the AI and not the AI itself that's biased. I'd also question whether Gebru is a "widely respected leader in AI ethics research". Model interpretability is not even close to a solved problem so just because you can demonstrate some correlation between images of black people and worse performance does not imply that "black person" is a causative factor. It could literally be dataset distribution or image contrast or any number of other plausible explanations that are easily fixable by an ML engineer.

The AI is the final product of applying a learning algorithm on a training set.

Claiming that "the AI is not biased, the training set is" is like saying "this running program isn't buggy, it's just the source code that is buggy".

Searching for "child" or "house" will yield what has been classified as such in training - and searching for Japanese or Thai labels will do the same. No surprise there, if the labels don't get normalized before training.

And normally, that's harmless - as you said, you'd expect to see an AI finding pictures of houses in the region/culture you are searching it. But in a multi-cultural/multi-ethnic society, searching for "people" and showing up only what is considered the "majority" has a whole different lot of ethical implications.

Identifying and ideally remediating such issues is why ethics research is so sorely needed.

> And normally, that's harmless - as you said, you'd expect to see an AI finding pictures of houses in the region/culture you are searching it.

I am not actually; I am searching for “huis”, not “Nederlands huis”; I'd expect the result I obtain from the former with the latter.

I'd actually expect “house” and “huis” to reveal similar results from a good search engine. Obviously this is not easily possible with how it is trained with corpora in a specific language, but from usability I think this is undesirable, if I specifically want Dutch houses I can always add that term as a specification; there is no way to simply search for houses, wherever they might be, in Dutch, or English, or Thai, or any other language.

That is to say, I'm not arguing that there is no problem; I'm arguing that the problem is highly dependent upon location, and that the article should not take such a U.S.A.-centric stance and act as though the rest of the world does not exist.

No, remediating such issues (only predicting the maximum likelihood class in the dataset) is a problem of _machine learning and optimization_ research, not ethics research. There is nothing an ethicist can do to solve this problem. It is easy to point out problems with existing AI and write a bunch of papers to get yourself tenure. It is very hard to fundamentally advance our understanding of deep learning models past a fancy maximum likelihood estimation problem.
> _machine learning and optimization_ research

Ethics education is (unfortunately) not really seen as necessary across the tech field, which is why ethics researchers need to be part of all stages of AI development.

And for what it's worth, ethics researchers should be part of all technology development - the "racist soap dispenser" should have been more than enough proof of how even a very simple, innocent product can contribute to ethnic discrimination.

The problem is that AI (and the English language to some extent) transcends borders. So even if it's an AI developed in the US, it can potentially impact people outside the US and it makes ethical sense to build something that doesn't exclude groups based on arbitrary conditions.
Yes, but to offset that, many a.i. in English were also made outside of English-speaking regions, in what one assumes to be proportional degree.

This is probably why there is more variance when searching for English terms as well, as a lingua franca. If I search “house” I do see some styles of architecture not commonly found in Anglo-Saxon nations, whereas all occurrences of “huis” do seem to be situated in the Netherlands.

> many a.i. in English were also made outside of English-speaking regions

Different regions, yes - but where did the training and benchmark datasets come from? AI research is surprisingly monocultural (or use "standardized benchmarks" if you're feeling charitable). Not too long ago, there was a paper posted on HN that showed that a bunch of the datasets contain mislabeled data, which means a lot of "different" models are encoding similar biases.

Completely not the focus of the article, and you've turned the result of an error rate of 0.8 percent for gender classification of light-skinned men and a 34.7 percent error rate for the same classifier on dark-skinned women - into some kind of google image search language game?

I can only quote Joy Buolamwini on this:

“To fail on one in three, in a commercial system, on something that’s been reduced to a binary classification task, you have to ask, would that have been permitted if those failure rates were in a different subgroup?”

The answer would probably be yes if that subgroup wasn't a large percentage of the dataset used for training and testing. Or if that subgroup wasn't a large percentage of the user base.

Come on, if you've worked at any large company using ML you know model performance is literally just taking the average accuracy/ROC/precision/etc over your training dataset plus some hold out sets. Then you track proxy metrics like engagement to see if your model actually works in production. At no point does race come into the equation. Naturally, if your choice of subgroup happens to not be a large proportion of either the dataset or the userbase then you don't see the poor performance on that subgroup show up in your metrics so you don't care to fix it.
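
A quick illustration of how an aggregate metric can hide this. The two error rates are the ones quoted upthread (0.8 percent and 34.7 percent); the group sizes are purely hypothetical, chosen only to show the effect of an imbalanced evaluation set.

```python
# Error rates are the figures quoted upthread; the sample counts are
# hypothetical and only illustrate an imbalanced evaluation set.
groups = {
    "light-skinned men": {"n": 950, "error_rate": 0.008},
    "dark-skinned women": {"n": 50, "error_rate": 0.347},
}

total = sum(g["n"] for g in groups.values())

# Aggregate error is the sample-weighted average of the per-group error rates.
aggregate = sum(g["n"] * g["error_rate"] for g in groups.values()) / total

# The worst-performing subgroup only shows up if metrics are broken out by group.
worst = max(groups, key=lambda name: groups[name]["error_rate"])

print(f"aggregate error: {aggregate:.1%}")
for name, g in groups.items():
    print(f"{name}: {g['error_rate']:.1%} on {g['n']} samples")
print(f"worst subgroup: {worst}")
```

With these made-up proportions the headline error stays in the low single digits even though one group fails about a third of the time.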

Obviously, but the question is, why were there no Black women in the data set, and what care can be taken to prevent racialized bias when selecting the data set in the future?
I would assume these data sets are not manually selected but imported from some mechanism.

Other issues which are sure to arise are that the a.i. will have trouble with people who aren't smiling, and that the data set probably contains people who look better than average, and almost certainly excludes people who suffer from injuries or deformities in appropriate proportions.

Perhaps an interesting project is simply the compilation of a vast dataset of “world proportional pictures of people”. — It would be an interesting undertaking to realize such a dataset.

World proportional is not good enough for this type of task. If we are to rely on AI for things like identifying people in pictures in a trial, we would need equal representation in the data set, so the AI doesn't have any kind of systematic bias. Otherwise, the AI's bias will compound errors in the real world. So you would need as many pictures of Aboriginal Australians in the data set as of Han Chinese people if you wanted to be sure there isn't a risk that a random person would be confused for someone of the over- or under-represented groups.
Certainly you can ask these questions but these are business process issues, not technical ones. They're unrelated to AI.

My personal take is you won't see any tangible movement on this until black women (or whatever group you choose) comprise a tangible proportion of revenue generating users. Corporations operate for money and nothing else.

Of course they are related to what we call AI, because what we call AI is primarily dependent on the quality of the business processes behind data selection and testing. If there is a strong trend of business processes to create systematic errors in the results the technology generates (an AI trained in China sucking at recognising white people wouldn't be a counter example of this phenomenon, it would be the same issue) it's an underlying weakness of the technology, and the utility of the technology needs to be viewed in the context that it's likely compromised by biases in the business processes of its developers.

Black women or other groups not viewed as the mainstream target for an AI solution aren't going to form a tangible proportion of revenue generating users if the software doesn't function properly for them. And a lot of the use cases for AI analysis don't involve the unrepresented-in-corpus minority group being the consumer anyway, they involve it being used to screen them by a third party who's been sold the tool on the false premise that it's free from human bias.

Okay. Now make the small AI with memory 25 times bigger!
This seems like a very interesting approach to creating an AI that can continuously learn new things by just updating its database. Maybe a first step towards a general purpose AI? It would be interesting to create a personal assistant based on this whose database was fed the entire digital stream generated by a person's life. How would you protect such an AI from misuse? Add another AI with a database of information on ethics that acts as a gatekeeper? Could you somehow keep the gatekeeper from being turned off, perhaps by using cryptography in some fashion for access control?
Could we say that they are re-inventing the human mind architecture by enhancing "fluid intelligence" with "crystallized intelligence"?

As humans age we apparently lose the former but compensate with the latter as best we can.

oftentimes one can shrink a model down dramatically once one has a bigger, more robust model. but shrinking a huge model is still a great achievement.
So where can we download these models?
Given that this is DeepMind and not some more open AI organization, I assume you cannot.
Sounds as if they stored all the correct answers in a database and call it "better". How do they even evaluate these models? Like they already have a billion preprepared correct answers in the database. How do they come up with new questions for the evaluation?
It's the equivalent of taking a test where you can use the internet. Sure, you know the information needed to answer the question exists, but it can be difficult to extract the answer and word it into an English sentence.
Instead of storing the correct answers in an encoded/embedded form in the weights of the neural net (certain neurons very loosely corresponding to certain "answers") the correct answers are stored elsewhere. That way we can scale down the model to the necessary "thinking" parts and we don't need to use excess neurons for the "memory" part. Kind of handwavey but hopefully that explains the general idea.
You mean otherwise the whole words would be encoded in the net, and now you only need to encode the index in the database?
> all the correct answers

That is clearly not possible, so it can't be what they are doing.

Rather than diffusely encoding that knowledge in a massive number of self-organized layers of weights, it is explicitly encoded. The remaining network can "focus" on mapping input to retrieve the relevant information stored in that database, and extracting/interpolating/extrapolating that information based on the current context to generate useful output.