by TheDong 64 days ago
What about "cowork", aiming to be the claude code of excel files and pdfs and screenshotting your desktop to tell you what's wrong?

Like, that feels like it's also a huge amount of token churn ("sure, I can search every xls file on your machine to find the 2023 invoice from that company"), and very early in its adoption curve.

Most people are still using AI as a webpage chatbot to ask questions of and copy+paste from, but running an "openclaw"-like assistant, which can access your files and email and opens you up to wild security attacks, seems like a really big killer app.

Cowork to me also seems like it'll take longer to reach the broader market since the models are less good at "use the mouse and keyboard to do this repetitive task" than "write code", but I see it as having killer-app potential with lots of token churn.

5 comments

I think The Verge said it best. Taking advantage of these tools to the maximum requires you to have "software brain", which the average person does not have. They struggle to set up a simple automation in their smart home platform of choice. There is little reason to believe they will take the leap to use such tools to simplify daily tasks, because it requires people to think about which daily tasks can be simplified and automated.
I don't think 'software brain' is required for non-coding tasks. Rather, it requires 'manager brain', the ability to delegate, direct, and review the output. Manager brain is more prevalent than software brain and likely learnable by many knowledge workers who don't yet have it.
I think you still need software brain, because ultimately, this stuff still has limitations driven by software constraints, and having the AI try to explain it to them doesn't necessarily help.

I think we all have had experiences with people treating their computers as magic boxes and not understanding why certain requests simply are not possible to satisfy.

A growing number of non-technical managers are now using Claude Code to build small custom software. A larger share will use Cowork to automate routine business tasks. Claude Cowork will become easier to use and more automated over time, as it learns the user's preferences, just like a good executive assistant does.

Granted, it's possible that a majority of people will not acquire proper 'manager brain' either and we'll see how that pans out. Evolutionarily, managerial skills are much more aligned with what many hunter-gatherers might learn as they mature and become more of an advisor than a doer.

Even if only 10-20% of people end up using multiple autonomous agents regularly for their work and business, that will change the economy. Contrast this with <1% of people who develop software professionally.

You have to recognize that it's a problem to delegate in the first place. One example I love to trot out is, do you have any toilet seats in your life that kinda slide around a bit and don't seem securely attached? It's absolutely trivial to fix this, and it's really annoying when it happens, yet with shocking frequency I encounter people who've just been dealing with the annoyance because they didn't process it as something they could solve.
It's not that easy to fix, and it can be kinda gross, and once it happens once, it tends to happen again in fairly short order. I'm someone that's fixed those loose seats countless times, and continues to do so, but the gap between me noticing it and fixing it is consistently growing.
You also need the brain of not giving up after 2/3/10 tries. I don't know what the exact numbers are but if something doesn't work properly after the second or third try a huge percentage of people give up.
How do you delegate, direct, and validate results if you have no idea what you're looking at?

This is the same issue many managers of people have for the same reason.

You’ve never tried to train the average admin.

Basic forms can be a challenge. Even things like selecting a dropdown menu or pushing a button can be surprisingly hard.

Most people here have no idea what works for the majority of people - who don’t want to spend time figuring stuff out.

I’m sure many here live in delulu land wondering why everyone doesn’t find the open claw stuff as fascinating as they do.

Yes. And that’s not a criticism of average people. Tools should fit the user, not the other way around. Designs systematically removed shadows and visual cues. Developers render buttons off the screen, requiring a scroll to submit. Hard to criticize the user under those circumstances. But there are people with art brains, and math brains, and software brains. So it may be the case that AI adoption is limited by how it expects the user to relate to the tool.
The whole point of point and click (GUI) was that one barely had to engage the brain vs using a terminal.

The ideal experience is where one’s resources are able to be allocated such that one can achieve some goal with minimal effort. We are very far away from this ideal with LLMs, and absurd amounts of money have already been spent.

The point of AI is that it's supposed to be intelligent. Why silo it in an app? Instead of telling it what to automate, shouldn't it sit at the OS level, watch everything you do, and figure out what to automate by itself?
Most people don’t have good enough hardware to run a decent model. I’m not even sure if any local models can handle image input (but I’m by no means an expert in local models).

So if you’re going to need the data center to process it, then you run into the same issue Microsoft did when they announced the OS feature where they took screenshots of your desktop all the time for advanced search or whatever. People consider it to be a privacy issue.

> shouldn't it sit at the OS level, watch everything you do, and figure out what to automate by itself?

Read that again and really ask yourself if you want a private company to have access to all that and the ability to do whatever it wants with your system at the OS level.

On a smartphone, you're trusting Apple or Google to make the OS. They already can do anything they want with your system. Do you read every line of code in every security update?
Humans do not want something sitting at the OS level, watching everything you do. Microsoft, famously, tried this and the backlash was immediate and intense.

If you believe you can do better, then build it! I don't think the tide has changed though.

Cowork is a dead end. Most people can’t operate OneDrive.

Tools like Claude are best at answering things when the user understands the question.

Why did they even bother putting resources into that project? Bizarre.

It’s telling how scarce vision is.

It’s an incredibly useful product for the people who can use it.

It just isn’t the next Microsoft Office. A market of 10M people vs 2B!

"Push buttons for me" in the most common ways I see it used ("add this ticket to Jira so I don't have to") is a nice timesaver for being lazy but it's not a 10x multiplier to justify the subscribe-forever cost.

I think it's more likely that the companies that employ large numbers of people to perform manual push-the-button-then-the-other-button workflows will replace the tools that need button-pushing with other sorts of automation.

And outside of work I wouldn't spend any money on something to save myself the ten minutes of logging in to pay my credit cards or check my bank statements once a month or so. I have no real need for an always-running assistant and even the things that it seems most useful for today (beating unassisted humans to the punch for limited-quantity things) are only something it could help with as long as only a very few people have access.

An AI that consumes every document on the system in response to a simple search request is going to be fired just as quickly as a human who does the same thing, not long after replacements that can use conventional search tools to accomplish the same task efficiently become widely available.
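
To make the "conventional search tools" point concrete, here's a minimal Python sketch of narrowing candidates using only file metadata before anything gets opened (or fed to a model). The function name `find_candidates` and its parameters are made up for illustration, and using the file's modification year as a stand-in for the invoice year is an assumption that won't always hold.

```python
from datetime import datetime, timezone
from pathlib import Path


def find_candidates(root, suffixes=(".xls", ".xlsx"), keyword="", year=None):
    """Return files under root whose name, extension, and modification year match.

    Hypothetical helper: only file metadata is inspected, no contents are read.
    """
    matches = []
    for path in Path(root).rglob("*"):
        # Skip directories and files with the wrong extension.
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        # Case-insensitive filename match; an empty keyword matches everything.
        if keyword.lower() not in path.name.lower():
            continue
        if year is not None:
            # Assumption: modification year approximates the document's year.
            modified = datetime.fromtimestamp(path.stat().st_mtime, tz=timezone.utc)
            if modified.year != year:
                continue
        matches.append(path)
    return sorted(matches)
```

Something like `find_candidates("/home/me/Documents", keyword="acme", year=2023)` would hand back a short list of paths, and only those would need to be read in full.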

Similarly, customers who rely on AI cowork tools will come to favor systems and applications that expose AI-friendly interfaces, which shouldn't be difficult to implement in most cases under the assumption that the models in question are already good at consuming API documentation and writing code (and, for that matter, writing API documentation, refactoring, and generating relatively straightforward wrapper code).

I have less faith in the market's ability to effectively respond to security threats in a timely fashion, alas.

> What about "cowork", aiming to be the claude code of excel files and pdfs and screenshotting your desktop to tell you what's wrong?

I’ve been using these types of functions for a while for some specific use cases, and they’re super useful. E.g., go into my budgeting app and explain to me why a certain discrepancy between forecast and actual occurred, which would otherwise cost me a huge amount of time.

I’ve also been using Cowriter AI, which actively learns from what you’re doing by taking screenshots of your screen every few seconds.

These types of utilities are just starting, they’re underexplored, and will definitely burn lots of tokens (while creating value).