by vicaya 2421 days ago
Just like various chat bot incarnations, it's reasonably OK at generating generic sentences without real understanding. I made several attempts with this leading fragment: "Is borrow checking unique to Rust? Probably not. But". The results are not pretty (the best one follows). Not impressed at all with this giant model :|

"Is borrow checking unique to Rust? Probably not. But the Rust library is written by and for Rust programmers, so if you're looking for a library that's designed for the language you already know, this might be your first choice.

The library is still in a fairly early stage. It does not support the standard libraries.

If you're interested in reading about how the library works, the docs are up on GitHub for reference!

How can I get this for my project?

Get it here: https://github.com/paulp/rustbinder

I've just read your blog post and want to use it, what can I do?

If you're looking for a project to use this library for, check out the docs for the "RustBinder project".

What are the current limitations and how will future changes affect this library?

This is still a work in progress. The library currently does not support the standard Rust library. There are a few work in progress"

6 comments

Am I the only one impressed by the generated text? Sure, it doesn't have any understanding, but are you factoring in that 1. most people in the world do not know that Rust is a programming language and 2. a single person cannot have that much general knowledge? Sure, they may know about the Rust borrow checker, but they will not be able to expand that much on another subject.
This looks like it can copy large chunks of text. That's it.

It is a hard task in itself ("copy" is a standard test for memory-enhanced neural systems like RNNs and such), but here it needs to "understand" things. For example, explaining what makes the borrow checker in Rust unique would be a valid continuation.
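For anyone unfamiliar with the "copy" task mentioned above, here is a minimal sketch of how one training example is usually laid out: the model sees a random sequence, waits through some blanks, and must reproduce the sequence after a delimiter. The function name, the symbol encoding (0 for blank, `vocab_size + 1` for the delimiter), and the parameters are assumptions made for illustration, not taken from any particular paper or library.

```python
import random

def make_copy_example(seq_len, vocab_size, delay, rng):
    """Build one (inputs, targets) pair for a copy-task sketch.

    Symbols 1..vocab_size are data, 0 is blank, and vocab_size + 1 is the
    "start recalling" delimiter. This encoding is an illustrative assumption.
    """
    blank, delimiter = 0, vocab_size + 1
    payload = [rng.randint(1, vocab_size) for _ in range(seq_len)]
    # Input: the payload, a stretch of blanks, the delimiter, then blanks
    # while the model is expected to write the payload back out.
    inputs = payload + [blank] * delay + [delimiter] + [blank] * seq_len
    # Target: blanks until the delimiter has been seen, then the payload.
    targets = [blank] * (seq_len + delay + 1) + payload
    return inputs, targets

if __name__ == "__main__":
    x, y = make_copy_example(seq_len=5, vocab_size=8, delay=3, rng=random.Random(0))
    print(x)
    print(y)
```

A model solves this only by carrying the payload across the delay, which is why it tests memory rather than understanding.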

Have you tried googling the generated sentences?
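If you had a reference corpus on hand, the "is it just copying?" question could be checked mechanically instead of by googling. Below is a rough sketch that measures what fraction of the generated text's word n-grams appear verbatim in a set of documents. The helper names and the choice of `n=8` are assumptions, and `corpus_docs` stands in for text nobody in this thread actually has (the model's training data is not available here).

```python
def ngrams(text, n):
    """Return the set of word n-grams in text (lowercased, whitespace split)."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def copied_fraction(generated, corpus_docs, n=8):
    """Fraction of the generated text's n-grams found verbatim in any corpus doc."""
    gen = ngrams(generated, n)
    if not gen:
        # Text shorter than n words has no n-grams to compare.
        return 0.0
    seen = set()
    for doc in corpus_docs:
        seen |= ngrams(doc, n)
    return len(gen & seen) / len(gen)
```

A value near 1.0 would suggest large chunks were lifted verbatim; a value near 0.0 would suggest the text is at least not a straight copy of anything in that corpus.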
Language models actually do "understand" things in the sense that they make decent foundations for knowledge bases (not forgetting that this is NOT what they are designed to do).

See for example https://www.aclweb.org/anthology/D19-1250.pdf (released today), which shows that the BERT language model performs extremely competitively with specialised knowledge bases and KB construction methods.

Wow! That is meaningless but difficult to distinguish at a glance, especially when read by someone not familiar with the subject.

Are we going to see such auto-generated content take over as the primary pillar of the SEO content farms?

... Kinda like an automated (text) version of Siraj Raval videos :-P

It's interesting how the first paragraph in particular reads like a deflection of the original question, somewhat akin to a middle manager avoiding exposing their lack of domain expertise in response to a technical question by giving a vague answer.

There were heaps of semantic pathways the network could have chosen, such as guessing what a borrower is, or maybe even weighing the significance of "borrower" more than Rust in the original question because it has no clue what Rust+borrower is; but it instead picked the safer option of deflection. Fascinating stuff.

Pattern matching on noise yields false positives, more news at 11.
Love the fake link to GitHub... Which model was this? Was it trained on software-related discussions?
Presumably it’s GPT-2 extra large, since that is what this whole thread is about.
Kind of shocking how broad that model really is. It learned multiple programming languages and tons of things of human interest...
Well, for detailed knowledge it is not enough.

But for Internet ramblings about anything (politics, religion, capitalism vs socialism), I bet it is well beyond the average human level. (If you want to protest, go to some random Facebook, YouTube, Reddit or Twitter thread. No, not HN, or special interest groups, or anything dominated by academics or IT specialists.)

Also, somewhat related: https://news.ycombinator.com/item?id=21438318 "Undercover reporter reveals life in a Polish troll farm"

A few friends of mine became parents and started participating in some parenting FB groups. For them it was a shocking first contact with life outside the intellectual bubble.

I would be really interested in judging the quality of GPT-2-generated texts against human texts. Questions like "does the person know what they are talking about?" and "are they smart?", controlling for the rater's knowledge of the particular subject (e.g. do they know Rust?), would give some insight into the effective level of AI for text generation.
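A very rough sketch of how such a blind comparison could be organised: shuffle human and generated samples so raters see only an id and the text, then average their scores per hidden source afterwards. Everything here (the function names, the `(source, text)` sample format, the numeric rating scale) is a hypothetical setup for illustration, not an established protocol.

```python
import random
from collections import defaultdict

def blind_order(samples, seed=0):
    """Shuffle (source, text) samples; raters see only the sample id and text."""
    order = list(range(len(samples)))
    random.Random(seed).shuffle(order)
    return [(i, samples[i][1]) for i in order]

def mean_rating_by_source(samples, ratings):
    """Average scores per hidden source label.

    ratings is a list of (sample_id, score) pairs collected from raters.
    """
    scores = defaultdict(list)
    for sample_id, score in ratings:
        scores[samples[sample_id][0]].append(score)
    return {source: sum(s) / len(s) for source, s in scores.items()}
```

Controlling for subject knowledge would then amount to running `mean_rating_by_source` separately on ratings from raters who know the subject and raters who don't, and comparing the two results.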

> Well, for detailed knowledge it is not enough. .. But for Internet ramblings about anything (politics, religion, capitalism vs socialism), I bet it is well beyond the average human level.

The real question is, how far apart are these? It certainly seems that there is a wide gulf, but the technology is moving incredibly rapidly; it's unclear whether closing it is 3 years, 5 years, or 100 years away.

Everyone knows that PaulP only writes Scala and Boa - Rust just isn't his style! So unrealistic!