by fabian2k 2 hours ago
> 30-50% of engineers on core teams have been forcefully reassigned to data labeling and RLHF, upsetting folks even more.

This really doesn't sound believable to me, but who knows with all the craziness going on. Software developers in the US are seriously expensive, using them for data labeling would be a waste of resources. And the percentage sounds very high, unless "core teams" is only a small subset of the total developer count.

10 comments

> Software developers in the US are seriously expensive, using them for data labeling would be a waste of resources.

The frontier work is on labeling and training expert content, by experts. It's unglamorous work and almost certainly doesn't warrant FAANG pay, but neither did most of the work that most FAANG engineers were already doing. But it does require competent talent from the expert domain.

Like their peer companies, Meta is still sitting on a huge pool of vetted-as-competent workers from the hiring boom, and expert AI training is the most ripe business opportunity in a fragile economy where pretty much every comparable opportunity has evaporated.

> Software developers in the US are seriously expensive, using them for data labeling would be a waste of resources.

Zuck basically went to a town hall and explained to his employees that their remaining value to him is as training mules for his AI.

Zuck literally said that he wants folks with higher intelligence on the Applied Intelligence team. And the best way to do that was to move folks internally, since they were "intelligent" enough to pass the Meta interviews.

Soooo, yes it is a waste of resources ($$$). But this was the initial intention.

The belief that engineers are not doing anything for x amount of time that could be better spent on other immediately measurable things is as old as the profession itself.

Ironically this vanishes when the tables are turned and we ask for things like better hardware or software. There are plenty of us here with stories of how much effort it took to convince employers that SSDs were worth it when they were new, small, and very expensive.

One of the funniest things is how hard it was to get approval for a $100 software license but now people are being encouraged to burn thousands on tokens.
From the article it sounds like what they're actually doing is reviewing LLM-generated code; for that you do need good software engineers.

Although it goes without saying that good software engineers won't enjoy doing this very much.

It's only until Cold Harbor is completed.
I don't know what Cold Harbor means in a Meta context, but it's interesting that it's named after the battle that exemplified Grant's strategy of attrition during the American Civil War. I suspect it means waves of engineers ground down against the defenses of OpenAI/Anthropic in the hopes of eventually finding a crack. Might be best to get out while you can.
Then all the engineers will get to rejoin their outies.
> using them for data labeling would be a waste of resources

Would it? It seems like they can spend a few months extracting intelligence and "taste" from their engineers, then get years' worth of it back from the AI.

I wouldn't trust any engineers I know of with their "taste". At best it's a highly skewed view of the world. At worst, it's outright opposite to genpop.
I assume taste was meant in terms of coding. "Taste" is still often the trait that LLMs lack when it comes to code design.
Seriously, what a world that would create.
Unless they collude and hatch a plan to sabotage the LLM training.
Are there any examples of this actually working? I keep seeing this fantasy repeated but have not seen a plausible explanation for how they wouldn't be contributing to the pile of negative examples, which are just as valuable if not more.
Poison pilling skills is a thing, though finding evidence for it is difficult given the crux is an absence of information. The baseline instruction and training is given to the model by the expert, but edge cases are willfully neglected. The degree of neglect generally determines how detectable it is, but if all the SMEs are in on it a lot of them will probably persist. Effectiveness and impact are obviously relative to the system and the edge case. Not particularly different from the fallout previously seen during the offshoring era.
it's fantasy

Scale AI's value prop was catching people like this

Isn't that Scale AI investment an investment in a company that does labeling? What are we missing? Are we all going to be labelers soon too?
I believe it, because it makes a kind of sense. Post-training has a huge impact on how well LLMs perform, and labeled data is what determines the effectiveness of post-training. This is why companies like Anthropic are so worried about distillation.

So if you have access to a large number of highly skilled people, and you don't absolutely need them to do other things, why wouldn't you force data labeling tasks on them?

Facebook is also planning a 10% layoff, so this also works as encouragement for people to leave voluntarily.

(Before you downvote me, note that I'm not endorsing this or saying it's a good idea. I'm just saying that I believe it's true, because I can see how Facebook's leadership would think it's a good idea.)

From the article:

> Forced data labeling with 4,500+ engineers is to generate high-quality RLHF

I doubt that you get high quality from forced reassignments where the now-data labelers don’t actually want to do that kind of work.

It’s crazy to think that Meta leadership believed that it makes sense.

Do the skills these people have overlap with the skills needed for a good data labeler? I'm guessing being a domain expert is most valuable as a data labeler.
Because you can just get rid of all those people and do the data labeling tasks for 1/4 the cost?
unironically, if those engineers were considered to be 'bloat', it's better to have them label data because they are smarter and vetted

basically a soft layoff

Silicon Valley strikes again?

https://www.youtube.com/watch?v=obS-qZO9uCQ