by gklitt 872 days ago
Post author here! I wrote this post five years ago. Since then, my conviction in the value of customizable software has only grown, but I've also updated my thinking in a few ways:

1) AI

AI is rapidly getting better at coding. Current AI is often bad at high-level architecture but is capable of making small local tweaks. Seems like a good fit for the kind of code you need to write a browser extension!

I'm exploring this direction; wrote more about it in "Malleable software in the age of LLMs" [1]

2) Security

Having talked to people who worked on various extension platforms including the browser extensions API, I see more clearly than I did five years ago that security is often the key bottleneck to deploying extension platforms meant for mass adoption. Anytime you want everyday computer users to be installing invasive extensions to important software from untrusted third parties, it's gonna be challenging to protect them.

That said, I still think that conversations around extensions tend to focus too much on security at the expense of all else. Customizability is important enough that it may be worth prioritizing it over security in some cases.

I also think there are many reasonable paths forward here. One is to exchange extensions with trusted parties -- e.g., coworkers or friends -- rather than installing from random people on the internet. Another might be to only build your own extensions; perhaps that'll become more viable with AI-assisted programming, although that introduces its own new security issues. And finally, I've met a few people who have smart ideas for architecting software in a way that helps resolve the core tensions; see [2] for an example.

3) Backend access as a key limitation

I've increasingly realized that the fact that browser extensions can only access client code in a fairly server-centric web means that many deep customizations are out of reach. Perhaps you can't read the data you want, or there's not a write API to do the thing you need.

While I'm optimistic about what extensions can do within the boundary of the client, this is an inherent limitation of the platform.

At Ink & Switch (the research lab I now work for), we're working towards local-first [3] software: collaborative software where the data and the code live on your device. Among other benefits like privacy, we think this is the right foundation for more powerful extensions, since your data and the app code aren't locked away on a server.

[1] https://www.geoffreylitt.com/2023/03/25/llm-end-user-program...

[2] https://www.wildbuilt.world/p/inverting-three-key-relationsh...

[3] https://www.inkandswitch.com/local-first/

8 comments

The security problem of open platforms is the key.

Anything that is open enough to let someone who knows what they're doing customize the system to their liking, will also be abused by bad actors persuading people who don't know what they are doing to customize the system in ways that harm them.

The fact I can write my own custom keyboards on Android is great! But the fact someone can convince your grandparents to install a keyboard that includes an embedded key logger is not!

Browser extensions have always been a malware-rich ecosystem. Joking about removing all the toolbars from your parents' Internet Explorer whenever you went home for Thanksgiving dates back to about 1999.

Custom keyboards are a great example of an app that - by default - shouldn't have write access to shared resources (that is, no network access, no writing to files which other apps can read).

Adding either of those entitlements to a keyboard app should require extremely scary dialogs. Needs to be possible - perhaps you want your password manager with sync to be part of the keyboard app - but it's clearly a huge risk.

Until you want to be able to download language dictionaries or an updated language model. Or if your keyboard is actually a remote keyboard or shared keyboard taking input from some other devices.

> Until you want to be able to download language dictionaries or an updated language model.

You don't need the keyboard application to be able to communicate externally for that. You could have a separate, optional, downloader/installer. That's better for security all around.
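A minimal sketch of that split, assuming a shared dictionary directory that the keyboard only reads. The class names, the JSON file layout, and the injected `fetch` callable (a stand-in for a real network request) are all hypothetical; no mobile OS is claimed to work this way.

```python
import json
import tempfile
from pathlib import Path


class DictionaryInstaller:
    """Separate, optional component: the only one handed a way to fetch data."""

    def __init__(self, store: Path, fetch):
        self.store = store
        self.fetch = fetch  # hypothetical callable standing in for a download

    def install(self, language: str) -> Path:
        words = self.fetch(language)
        target = self.store / f"{language}.json"
        target.write_text(json.dumps(sorted(words)))
        return target


class Keyboard:
    """Holds no fetch callable, so this code has no path to the network."""

    def __init__(self, store: Path):
        self.store = store

    def suggestions(self, language: str, prefix: str) -> list:
        path = self.store / f"{language}.json"
        if not path.exists():
            return []  # no dictionary installed yet
        words = json.loads(path.read_text())
        return [word for word in words if word.startswith(prefix)]


if __name__ == "__main__":
    store = Path(tempfile.mkdtemp())
    keyboard = Keyboard(store)
    installer = DictionaryInstaller(store, lambda lang: {"hello", "help", "world"})
    installer.install("en")
    print(keyboard.suggestions("en", "he"))
```

In a real system the separation would have to be enforced by the OS sandbox rather than by object boundaries; the sketch only shows where the line is drawn.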

Mobile OS vendors have already thought of that and come up with the exact same solution of requiring entitlements to access the network from a keyboard app:

https://developer.apple.com/documentation/uikit/keyboards_an...

The question is do you actually trust regular users to understand what’s going on when they’re asked for permission to grant an app the ability to do something sketchy?

Bear in mind that on iOS, you can't just prompt for permission; those "regular users" need to be able to navigate to the settings app, find the relevant (deeply nested) section, and enable it there.

That narrows the gap significantly - to users who can't understand the issues, but can (even with the app providing an explanation) find reasonably well-hidden settings.

I've heard from a couple developers over the years that it's entirely impossible to implement a setting that will not be changed by people who don't know what it does.

It doesn't matter if it's behind a footnote, an easter egg, a password input, a magic email code, a call with the main project developer, all of the above, etc. No matter how many steps you try to add, there are still an incredible number of idiots who will mindlessly tap through literally any number of dialogs, warnings, and disclaimers to get to what they want.

Their brain will entirely filter out the path they took. They will probably not even remember a single one of those intermediate steps. The only thing they care about is that they're fixing some problem.

This could be one of the reasons Apple and Google don't want you jailbreaking/rooting your devices. Someone will inevitably make a guide, and millions of idiots will follow it. It will legitimately make the device less secure for them because they won't have any idea what they are doing and likely won't even remember doing it. The only thing they care about is that they're fixing some problem.

This is one reason why some people get so panicked and upset when anything on their computer changes unexpectedly, even if the change is actually harmless. They never actually understood anything. They had managed to accidentally get it how they want it through a combination of stuff that they don't remember. When anything changes, they have to go through that process again.

Look, these people are great at following guides and learning routines. Repetitive, mindless tasks like data entry are perfect for them, because they have no other talent to worry about wasting. But because these people exist, you have to be really careful about what settings you add, no matter how well you think it is hidden, because they will be changed by people who don't know what they're doing.

So far, the devs that have told me this have done so because I asked for some setting to turn off some safeguards, and they said that it's a near-universal request from power users, but they still can't do it, because the rest of their userbase is too clueless to be trusted with that setting. They'd receive bug reports from people who have no clue what went wrong, when the reality is that they disabled the safeguards in order to make something work, and then promptly forgot what happened once it worked the way they wanted. This has supposedly happened so many times in the past that they just don't take the risk anymore.

Anyway, all this is to say that while hiding a setting, as opposed to automatically prompting for it, can definitely rule out a decent chunk of idiots, you will never be able to rule out the resourceful idiots that can mindlessly follow instructions.

I think you underestimate how much we all are these resourceful idiots under the right circumstances.

I bookmarked this post, thanks! Really interesting.

A great XKCD on the topic: https://xkcd.com/2044/

I do think that with every turn of that cycle we end up with better compromises. They’ll still be compromises, though.

Executing untrusted code would be a lot safer if browsers and mobile OSes would make it easy to provide fake resources to the app/extension.

Yes, you may read my phone contents, and as far as you know, it's the contents, the whole contents and nothing but the contents - it just happens to be a folder to me. An empty folder. It's a new phone you see.

Yes here's my contact list. Sorry it's mostly empty, there's just the costly premium number in there. I hope your mothership doesn't try to call it.

Yes, here's my microphone. Oh thank you, yes, I do a good impression of Rick Astley.

Pictures on my phone? Oh yes, right this way. It's all pictures of turnips. Do you like them?
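The "fake resources" idea above can be sketched as a broker that sits between apps and data. This is only an illustration under stated assumptions: `ResourceBroker`, `Grant`, and the app and resource names are hypothetical, and no browser or mobile OS is claimed to expose such an API. The point is that the caller gets an ordinary answer either way and cannot tell which source it came from.

```python
from dataclasses import dataclass, field
from enum import Enum


class Grant(Enum):
    REAL = "real"
    DECOY = "decoy"


@dataclass
class ResourceBroker:
    real: dict
    decoy: dict
    # (app, resource) -> Grant; anything not listed falls back to decoy data
    policy: dict = field(default_factory=dict)

    def set_policy(self, app: str, resource: str, grant: Grant) -> None:
        self.policy[(app, resource)] = grant

    def read(self, app: str, resource: str) -> list:
        grant = self.policy.get((app, resource), Grant.DECOY)
        source = self.real if grant is Grant.REAL else self.decoy
        # Always return a plain list and never raise, so a denied app
        # sees the same shape of response as a trusted one.
        return list(source.get(resource, []))


if __name__ == "__main__":
    broker = ResourceBroker(
        real={"contacts": ["Alice"], "photos": ["beach.jpg"]},
        decoy={"contacts": ["Premium Line"], "photos": ["turnip.jpg"]},
    )
    broker.set_policy("messenger", "contacts", Grant.REAL)
    print(broker.read("messenger", "contacts"))
    print(broker.read("flashlight", "photos"))
```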

Similarly, every browser should be able to report to sites that the user has notifications enabled when they actually don’t. That would end those annoying in-site “pre-prompts”, which bait you into saying no to the pre-prompt so the site can ask you again later, rather than just dealing with the fact that the user denied permission at the browser-level prompt and isn’t interested.

I don’t think this is a bad idea per se (after all, a fundamental principle of the open web is that the user should control the browser). However, although your suggestion is fun, it is mere civil disobedience for geeks.

The million-dollar question is: how do you deliver those capabilities (a) without having grandma's phone full of spyware and (b) without giving your favorite Silicon Valley thought leader a 40% cut and total control of the ecosystem?

I don’t have the answer. Just trying to formulate the problem.

> The million-dollar question is: how do you deliver those capabilities (a) without having grandma's phone full of spyware and (b) without giving your favorite Silicon Valley thought leader a 40% cut and total control of the ecosystem?

That seems orthogonal? Grandma's phone has the same spyware either way, but this makes it a toss-up whether it can spy on anything real.

iOS does offer options for "read selected photos" and "add-only photos".

Contact list subset and pseudo-sensors (camera, microphone, accelerometer, barometer) are much needed.

Preset location is also needed, but some apps enforce DRM or other policy by location.

App-level network policy (whitelist, blacklist) is needed. For enterprise MDM, iOS allows per-app VPNs, which could enforce app-specific network filtering. With Apple Configurator policy files, Safari can have on-demand VPNs for specific websites.
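As a rough sketch of what a per-app allow/block list could check, assuming matching is done on hostnames and that block entries win over allow entries. `AppNetworkPolicy` and the example domains are hypothetical; this is not how iOS per-app VPNs or Apple Configurator profiles are configured.

```python
from dataclasses import dataclass


def _matches(host: str, domain: str) -> bool:
    """True if host is the domain itself or one of its subdomains."""
    return host == domain or host.endswith("." + domain)


@dataclass(frozen=True)
class AppNetworkPolicy:
    allow: frozenset = frozenset()
    block: frozenset = frozenset()
    default_allow: bool = False  # deny anything not listed unless told otherwise

    def permits(self, host: str) -> bool:
        host = host.lower().rstrip(".")
        if any(_matches(host, domain) for domain in self.block):
            return False  # block entries take priority
        if any(_matches(host, domain) for domain in self.allow):
            return True
        return self.default_allow


if __name__ == "__main__":
    policy = AppNetworkPolicy(
        allow=frozenset({"example.org"}),
        block=frozenset({"ads.example.org"}),
    )
    for host in ("api.example.org", "ads.example.org", "example.com"):
        print(host, policy.permits(host))
```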

> iOS does offer options for "read selected photos" and "add-only photos".

The annoying thing here is how apps either insist on full album access so they can implement their own photo picker, or don’t provide a button to re-trigger reselection of “selected photos”.

I wish they’d just use the standard OS selector dialog and call it a day. I don’t care if the standard selector doesn’t meet some stupid product requirement, it’s good enough.

> don't provide a button to re-trigger reselection of "selected photos"

iOS Settings should have an app setting menu to "Edit Selected Photos".

There is already a permission system?

The issue the parent is trying to solve is that you don't really have fine-grained enough control, or apps nag you and won't load until you give them everything they want.

My mom has a cheap security camera app that lets me see the live streams remotely. Every single time I open the app, it asks me again if I want to allow it access to my local network. The answer is a resounding "no". If I could just say "fake yes, here is my fake network", then I wouldn't be continually coerced into giving permissions to something I really don't want to share.

I can think of many similar examples. Another really common one is giving apps access to my contacts. Absolutely not, stop asking me, here is "Uncle Bob" with phone number 1-222-222-2222. Leave me alone.

I wish it were easier to deny internet access to apps. It isn't a perfect solution, but it prevents the simplest data theft. Unfortunately, side-channel attacks are still too easy: either a cooperating app, or a one-time send of high-value data via a link click that opens the browser.

From what I can tell, internet access is the default just to allow apps to have advertising. Too cynical?

Android originally could deny internet access to apps, which I found useful.

Certainly I don't want an extension or plugin to have pull access to the internet. That may limit functionality. But often only push is needed (e.g. blocking list could be pushed). No third-party keyboard should have internet access.

Edit: rewrote a little clearer.

Denying access to apps: if you're on Android, you can root it and use AFWall+, which just sets up a basic Linux firewall - but apps are installed as individual users, so you can just allow the apps that actually need internet - messengers and browsers, and things you want to sync across networks.
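Because each app runs under its own UID, a per-app firewall can be expressed with the iptables `owner` match. The sketch below only builds the command lines and does not run them. It is an assumption about the general shape of such rules, not AFWall+'s actual rule set, and the UIDs in the example are made up.

```python
def firewall_rules(allowed_uids):
    """Build iptables argument lists: drop outbound traffic by default,
    then accept it only for the listed app UIDs."""
    rules = [
        ["iptables", "-P", "OUTPUT", "DROP"],                  # default: deny
        ["iptables", "-A", "OUTPUT", "-o", "lo", "-j", "ACCEPT"],  # keep loopback working
    ]
    for uid in sorted(allowed_uids):
        rules.append(
            ["iptables", "-A", "OUTPUT",
             "-m", "owner", "--uid-owner", str(uid),
             "-j", "ACCEPT"]
        )
    return rules


if __name__ == "__main__":
    # Hypothetical UIDs for a messenger and a browser.
    for rule in firewall_rules({10123, 10087}):
        print(" ".join(rule))
```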
XPrivacyLua for Android does just that. It requires LSPosed, which enables deep modifications of the OS and other apps. Needless to say, that has its own security implications.

Denying "local network" permissions is hilariously worthless. On both Android and iOS, all it does is prevent software from sending out multicast packets (for things like device discovery, Chromecast, etc. that don't use DNS-SD). The app can still go ahead and iterate through the entire RFC 1918 address space and try to connect to everything on your network.

I spent a bunch of time trying to figure out how I would implement such a feature on a standard Linux system to sandbox apps on my PinePhone, but there's no sane way you can implement a standard "you can have internet access but not touch my local network" policy.
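For scale, the three RFC 1918 blocks together hold about 17.9 million addresses. The sketch below covers only the easy part of an "internet but not my local network" policy, namely classifying a destination address with Python's standard `ipaddress` module. The function names are hypothetical, it ignores other local ranges such as link-local and IPv6, and it says nothing about enforcement, which is the hard part the comment describes.

```python
import ipaddress

# The three private IPv4 blocks defined by RFC 1918.
RFC1918_NETWORKS = tuple(
    ipaddress.ip_network(cidr)
    for cidr in ("10.0.0.0/8", "172.16.0.0/12", "192.168.0.0/16")
)


def is_rfc1918(address: str) -> bool:
    """True if the address falls inside one of the RFC 1918 blocks."""
    ip = ipaddress.ip_address(address)
    return ip.version == 4 and any(ip in net for net in RFC1918_NETWORKS)


def internet_only_allows(address: str) -> bool:
    """Hypothetical policy check: permit anything that is not RFC 1918."""
    return not is_rfc1918(address)


if __name__ == "__main__":
    total = sum(net.num_addresses for net in RFC1918_NETWORKS)
    print(f"{total:,} private addresses to scan")
    for addr in ("192.168.1.10", "172.32.0.1", "8.8.8.8"):
        print(addr, internet_only_allows(addr))
```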

Well, maybe the best reaction would be to uninstall the app and give it zero stars.

Of course, if you've bought hardware controlled by it, that's unfeasible. Keep it in mind for next time.

I don't suppose there are review sites that mention how predatory and nagging a mobile app is?

I basically gave up on mobile apps around when the iPad 3 launched and never looked back. The reasoning: I got an iPad 1 when it was new, and you could still find pay-once games then. But they all got replaced by free-to-play gambling applications mislabeled as 'games'. Then the news came out about utility applications tricking you into $50/month subscriptions...

I'm so excited about the malleable software / local-first / local-AI crossover, I feel like we are at the dawn of a new era of software. If we play our cards right, we can bring back control of our data from the large corporations, have ownership, and more control of how we work.

I'm particularly interested in how general purpose CRDT toolkits like Automerge and Yjs could become the backing filetype for local-first software with interoperable sync/collaboration backends. The user can then have direct access to the underlying data via standard tooling. Files can be linked, embedded within each other, forked and merged.
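To make "forked and merged" concrete, here is a toy last-writer-wins map, one of the simplest CRDTs. It is a sketch for illustration only and does not use or mirror the Automerge or Yjs APIs, which implement far richer data types. `LWWMap` and its methods are hypothetical names. Each write is tagged with a (clock, replica id) stamp, and merging keeps the entry with the larger stamp, so two forks converge regardless of merge order.

```python
from dataclasses import dataclass, field


@dataclass
class LWWMap:
    replica: str
    clock: int = 0
    # key -> ((clock, replica id), value)
    entries: dict = field(default_factory=dict)

    def set(self, key: str, value: str) -> None:
        self.clock += 1
        self.entries[key] = ((self.clock, self.replica), value)

    def get(self, key: str):
        entry = self.entries.get(key)
        return entry[1] if entry else None

    def fork(self, replica: str) -> "LWWMap":
        """Copy the document under a new replica id."""
        return LWWMap(replica, self.clock, dict(self.entries))

    def merge(self, other: "LWWMap") -> None:
        """Keep, per key, the entry with the larger (clock, replica) stamp."""
        for key, entry in other.entries.items():
            if key not in self.entries or entry[0] > self.entries[key][0]:
                self.entries[key] = entry
        self.clock = max(self.clock, other.clock)


if __name__ == "__main__":
    a = LWWMap("a")
    a.set("title", "Draft")
    b = a.fork("b")
    a.set("title", "Notes")   # concurrent edits to the same key
    b.set("title", "Plan")
    b.set("tags", "todo")
    a.merge(b)
    b.merge(a)
    print(a.entries == b.entries, a.get("title"))
```

Last-writer-wins silently drops one of two concurrent edits to the same key, which is the main reason real toolkits use more careful merge rules.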

We could have a new hypermedia platform built on this, where every document can be shared, forked, and edited in real time...

Basically, love what you are all doing at Ink and Switch, excited to see what you publish next.

Taking back control from evil corporations is a funding/finance problem, not a technology problem. Everyone dreams of democratized ownership until they have to pay the huge developer salaries. And the go-to-market costs are even higher than that: all channels are saturated, and you have to be louder than the noise.

It’s absolutely a technology problem. The hacker mentality is still the one that innovates, and a single person is more than enough to make a significant contribution towards a very different future. That person is probably already working on it.

And here I will interject and argue a third point: that it's primarily an organizational problem, and I am already working on it.

Not ready to spill the beans yet though on my projects, first have low back surgery tomorrow to get an artificial disc put in between L5-S1 - and will see how much my overall pain goes down, and how much my productivity can go up - before knowing when I can make any public announcements.

A major limitation of browser extensions is that if you just want to write them for yourself, there's no user-friendly, scalable way to install them. There's no way to tell the browser that you trust all extensions in some directory, so that they are loaded automatically and used without signing, and maybe without even having to be packed into an XPI file. There's no "put a bunch of code+manifest into a directory and have the browser use that" feature. This kind of simple deployment drove me to write a ton of userscripts back when Greasemonkey just loaded plain files from the gm_scripts/ subdir of the browser profile directory. It was fun and easy to extend websites back then. Mozilla killed all that.
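A sketch of what the discovery half of such a "trusted directory" feature could look like, assuming one subdirectory per extension with a `manifest.json` containing a `name` field. The function is hypothetical and no browser is claimed to offer this; it only shows how little is needed to find unpacked extensions on disk.

```python
import json
import tempfile
from pathlib import Path


def discover_extensions(root: Path) -> dict:
    """Map extension name -> directory for every readable manifest under root."""
    found = {}
    for manifest_path in sorted(root.glob("*/manifest.json")):
        try:
            manifest = json.loads(manifest_path.read_text())
        except json.JSONDecodeError:
            continue  # skip directories whose manifest is not valid JSON
        if isinstance(manifest, dict) and isinstance(manifest.get("name"), str):
            found[manifest["name"]] = manifest_path.parent
    return found


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())
    (root / "dark-mode").mkdir()
    (root / "dark-mode" / "manifest.json").write_text(
        json.dumps({"name": "Dark mode", "manifest_version": 2})
    )
    print(discover_extensions(root))
```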

Deployment is just terrible. There's no way I'm sending my extensions somewhere over the internet to get signed after every change so I can use code I wrote on my own computer. WTF dystopia is that? Never mind that, the last time I checked, the tooling for signing was some stupid-ass 100MiB+ NPM/node app that I now have to trust too. It's bigger than a freaking Linux kernel build itself.

I would normally agree with your assessment, but the problem is that the browser vendors often revoke APIs and destroy good, popular extensions.

> Customizability is important enough that it may be worth prioritizing it over security in some cases.

100% this. It should at least be acknowledged that "security" often means fewer options for the user.

Solution: move everything to client side.

Are you sure browser extensions improve the web apps?

Maybe they attempt to fix them because they're limited by the platform and mostly low quality software?