Hacker News new | ask | show | jobs
by chrismccord 359 days ago
Phoenix creator here. I'm happy to answer any questions about this! Also worth noting that phoenix.new is a global Elixir cluster that spans the planet. If you sign up in Australia, you get an IDE and agent placed in Sydney.
20 comments

Amazing work.

Just a clarifying question since I'm confused by the branding use of "Phoenix.new" (since I associate "Phoenix" as a web framework for Elixir apps but this seems to be a lot more than that).

- Is "Phoenix.new" an IDE?

- Is "Phoenix.new" ... AI to help you create an app using the Phoenix web framework for Elixir?

- Does "Phoenix.new" require the app to be hosted/deployed on Fly.io? If that's the case, maybe a naming like "phoenix.flyio.new" would be better and extensible for any type of service Fly.io helps in deployment - Phoenix/Elixir being one)

- Is it all 3 above?

And how does this compare to Tidewave.ai (created as presumably you know, by Elixir creator)

Apologies if I'm possibility conflating topics here.

Yes all 3. It has been weird trying to position/brand this as we started out just going for full-stack Elixir/Phoenix and it became very clear this is already much bigger than a single stack. That said, we wanted to nail a single stack super well to start and the agent is tailored for vibe'd apps atm. I want to introduce a pair mode next for more leveled assistance without having to nag it.

You could absolutely treat phoenix.new as your full dev IDE environment, but I think about it less an IDE, and more a remote runtime where agents get work done that you pop into as needed. Or another way to think about it, the agent doesn't care or need the vscode IDE or xterm. They are purely conveniences for us meaty humans.

For me, something like this is the future of programming. Agents fiddling away and we pop in to see what's going on or work on things they aren't well suited for.

Tidewave is focused on improving your local dev experience while we sit on the infra/remote agent/codex/devin/jules side of the fence. Tidewave also has a MCP server which Phoenix.new could integrate with that runs inside your app itself.

> For me, something like this is the future of programming. Agents fiddling away and we pop in to see what's going on or work on things they aren't well suited for.

Honestly, this is depressing. Pop in from what? Our factory jobs?

I understand that we are slowly taking away our own jobs but I do not find it depressing. I do find it concerning since most people do not talk about this openly. We are not sure how we are restructure so many jobs. If we cannot find jobs, what is the financial future for a large number of people across the world. This needs more thinking, honest acceptance of the situation. It will happen, we should take a positive approach to finding a new future.
Read up on the Jevons Paradox
> In economics, the Jevons paradox (/ˈdʒɛvənz/; sometimes Jevons effect) occurs when technological advancements make a resource more efficient to use (thereby reducing the amount needed for a single application); however, as the cost of using the resource drops, if the price is highly elastic, this results in overall demand increasing, causing total resource consumption to rise. Governments have typically expected efficiency gains to lower resource consumption, rather than anticipating possible increases due to the Jevons paradox.[1]

I do think there will be some Jevons effect going on with this, but I think it's important to recognize that software development as a resource is different than something like coal. For example, if the average iPhone-only teenager can now suddenly start cranking out apps, that may ultimately increase demand for apps and there may be more code than ever getting "written," but there won't necesarily be a need for your CS-grad software engineer anymore, so we could still be fucked. Why would you pay a high salary for a SWE when your business teams can just generate whatever app they need without having to know anything about how it actually works?

I think the arguments about "AI isn't good enough to replace senior engineers" will hold true for a few years, but not much beyond that. Jevon's Paradox will probably hold true for software as a resource, but not for SWEs as a resource. In the coal scenario, imagine that coal gets super cheap to procure because we invent robots that can do it from alpha to omega. Coal demand may go up, but the job for the coal miner is toast, and unless that coal miner has ownership stake, they will be out on their ass.

[1] https://en.wikipedia.org/wiki/Jevons_paradox

> Pop in from what? Our factory jobs?

Oh, you sweet summer child. ;)

You will pop in from the other 9 projects you are currently popping in on, of course! While running 10 agents at once!

And from which exactly am I earning an income to feed myself? Who's buying what I'm making? Where are they getting their money?

We're building a serfdom again.

LOL, what? Take on 10 projects at once, and start making way more money... if you're not an external-locus-of-control moron at least

You've literally been given an excavator when you currently have a shovel, and you're worried that other excavators will dig you out of a job. That is a literal analogy to your POV, here

Hopefully, from sitting by the pool drinking margaritas ... but I doubt we will get to keep our new found freedom.
Never going to happen. More efficiency and automation won’t lead to more free time and money for the masses, it will lead to fewer people employed, and those that are will be working the same hours for the same money but outputting more. Only the rich people will benefit.

In the long term. In the short term, we get to do the same work but faster.

Indeed, why would an employer pay us a high salary to sit by the pool? The benefits will go to the founders/investors and the customers. They'll benefit greatly from the increased output and lower costs, but the middlemen (SWEs) will be cut out. That's a great thing if you're a founder/investor or a customer, but not if you're the middleman. New opportunities may come around, but I don't think that's inevitable. It remains to be seen.
How about our software engineering jobs, which will now entail managing a team of agents?
wow that sounds fun /s
Sounds preferable to managing people tbh
The Phoenix.new environment includes a headless Chrome browser that our agent knows how to drive. Prompt it to add a front-end feature to your application, and it won’t just sketch the code out and make sure it compiles and lints. It’ll pull the app up itself and poke at the UI, simultaneously looking at the page content, JavaScript state, and server-side logs.

Is it possible to get that headless Chrome browser + agent working locally? With something like Cursor?

Playwright has an MCP server which I believe should be able to give you this.
When Roo Code uses Claude, it does this while developing. It renders in the sidebar and you can watch it navigate around. Incredibly slow, but that’s only a matter of time.
Does it work with VSCode GitHub Copilot LLM provider? They have Claude in there
I know it's early days, but here's a must-have wish list for me:

- ability to run locally somehow. I have my own IDE, tools etc. Browser IDEs are definitely not something I use willingly.

- ability to get all code, and deploy it myself, anywhere

---

Edit: forgot to add. I like that every video in Elixir/Phoenix space is the spiritual successor to "15-minute rails blog" from 20 year ago. No marketing bullshit, just people actually using the stuff they build.

You can push and pull code to and from local desktop already: hamburger menu => copy git clone/copy git push.

You could also have it use GitHub and do PRs for a codex/devin style workflows. Running phoenix.new itself locally isn't something we're planning, but opening the runtime for SSH access is high on our list. Then you could do remote ssh access with local vscode or whatever.

> Running phoenix.new itself locally isn't something we're planning

So no plans to open the source code?

Everyone has to eat.
For sure. I'm just hesitant to recommend sending one's codebase to a server running code I can't inspect. I suppose that's the status quo with LLM's these days, though.
confirm
"15-minute rails blog" changed the game so I definitely resonate with this. My videos are pretty raw, so happy to hear it works for some folks.
run locally or in your private cloud would be amazing. The latter bit would be a great paid option for large enterprises
Include optional default email, auth, analytics, job management (you know… the one everyone uses ::cough:: Oban ::cough::), dev/staging/prod modes (with “deployment” or something akin to CD… I know it’s already in the cloud, but you know what I mean) and some kind of non-ephemeral disk storage, maybe even domain management… and this will slay. Base44 just got bought for $80M for supplying all those, but nothing is as cool as Elixir of course!

These other details that are not “just coding” are always the biggest actual impediments to “showing your work”. Thanks for making this!! Somehow I am only just discovering it (toddler kid robbing my “learning tech by osmosis” time… a phenomenon I believe you are also currently familiar with, lol)

Hi just to confirm as I cannot find anything related to security or your use of using submitted code for training purposes. Where is your security policies with regards to that.
We don't do any model training, and only use existing open source or hosted models. Code gets sent to those providers in context windows. They all promise not to train on it, so far.
Did I not say it good enough, Kurt?
You said it terribly to be honest
Ask some security questions, I'll get you security answers. We're not a model company; we don't "train" anything.
Is there a transparent way to see credit used/remaining/topped up, and do you have any tips for how you can prompt the agent that might offer more effective use of credits?

The LLM chat taps out but I can't find a remaining balance on the fly.io dashboard to gauge how I'm using it. I _can_ see a total value of purchased top ups, but I'm not clear how much credit was included in the subscription.

It's very addictive (because it is awesome!) but I've topped up a couple of times now on a small project. The amount of work I can get out the agent per top-up does seem to be diminishing quite quickly, presumably as the context size increases.

Is there something comparable that works similarly but completely offline with appropriate hardware? Not everywhere has internet or trusts remote execution and data storage.

PS: Why can't I get IEx to have working command-line history and editing? ;-P

Any takeaways on using Fly APIs for provisioning isolated environments? I'm looking into doing something similar to Phoenix.new but for a low-code server-less workflow system.
1 week of work to go from local-only to fly provisioned IDE machines with all the proxying. fly-replay is the unsung hero in this case, that's how we can route the *.phx.run urls to your running dev servers, how we proxy `git push` to phoenix.new to your IDE's git server, and how we frame your app preview within the IDE in a way that works with Safari (cross origin websocket iframes are a no go). We're also doing a bunch of other neat tricks involving object storage, which we'll write about at some point. Feel free to reach out in slack/email if you want to chat more.
Would love to read about some of the techniques for how you accomplished this.
Thanks, I might hit you up when I'm in the weeds of that feature.
1. What's your approach to accessibility? Do you test accessibility of the phoenix.new UI? Considering that many people effectively use Phoenix to write front-ends, have you conducted any evals on how accessible those frontends come out?

2. How do you handle 3rd party libraries? Can the agent access library docs somehow? Considering that Elixir is less popular than more mainstream languages, and hence has less training data available, this seems like an important problem to solve.

It seems like they're giving you lower level building building blocks here. It's up to the developer to address these things. Instruct the agent to build/test for accessibility, feed it docs via MCP or by other means.
They use the Daisy UI component library in 1.8+ Phoenix versions which should have basic accessibility baked in.
Watched the Tetris demo of this and it was very impressive. I was particularly surprised how well it seems to work with the brand-new scopes, despite the obvious lack of much prior art. How did you get around this, how much work was the prompt, and are you comfortable sharing it?
What is the benefit of this vs. just running your agent of choice in any ole container?
The whole post is about that. Not everything is for everybody, so if it doesn't resonate for you, that's totally OK.
Oh geez so sorry for the dumb question! I read a lot about the benefits of containerization in general for agents, but thought it might be enlightening/instructive to know what this specific project adds to that (other than the special Elixir-tuned prompting).

But either way I hear you, thanks so much for taking the time to set me straight. It seems like either way you have done some visionary things here and you should be content with your good work! This stuff does not work for me for just circumstantial reasons (too poor), but still always very curious about the stuff coming out!

Again, so sorry. Congrats on the release and hope your day is good.

You're fine! Just encouraging people to read Chris's post. :)
Gotcha! I'll keep reading it I guess until I see what I am missing! Good job again!
I did none of the work! I'm just like Flavor Flav or Bez in this situation. I will relay your congrats to Chris and the team, though. ;)
This looks amazing! I keep loving Phoenix more the more I use it.

I was curious what the pricing for this is? Is it normal fly pricing for an instance, and is there any AI cost or environment cost?

And can it do multiple projects on different domains?

It’s $20 per month if you click through, and I haven’t tried it but almost certainly the normal hosting costs will be added on top.
I've tried it, the $20 of included credits lasted me about 45 minutes
Thanks, apparently didn't click through enough
Just tried it out, but it's unclear what the different buttons at the bottom of the chat history does. The rightmost one (cloud with an upwards arrow) seems to do the same as the first?
I'm also having trouble with getting it to read PDFs from URLs. I got this error:

web https://example.com/file.pdf Error: page.goto: net::ERR_ABORTED at https://example.com/file.pdf Call log: - navigating to "https://example.com/file.odf", waiting until "load" at main (/usr/local/lib/web2md/web2md.js:313:18) { name: 'Error' }

/workspace#

Do you have a package for calling LLM services we can use? This service is neat, but I don't need another LLM IDE built in Elixir but I COULD really use a way to call LLMs from Elixir.
Req.post to /chat/completions, streaming the tokens through a parser and doing regular elixir messages. It's really not more complicated than that :)
even less complicated, just set stream: false in your json :)
Thanks for everything you do Chris! Keep crushing it.
How tightly coupled to Fly.io are generated apps?
Everything starts as a stock phx.new app which use sqlite by default. Nothing is specific to fly. You should be able to copy the git clone url, paste, cd && mix deps.get && mix phx.server locally and the app will just work.
If you're willing to share, is maintaining that modularization the plan going forward? I'm pretty happy to use and pay for this and deploy it to fly, but only as long as I'm not "locked in."
Does it mean I can build and deploy a SQLite based app on fly.io with this approach without using Postgres? If yes, how does the pricing for the permanent storage ( add) needed for SQLite works? Thanks
You would need to add a fly volume ($0.15/GB per month of provisioned capacity ), also check out https://fly.io/blog/litestream-revamped/
What LLM(s) is the agent using? Are you fine-tuning the model? Is the agent/model development a proprietary effort?
Currently claude 4 sonnet as the main driver, with a combination of smaller models for certain scenarios
I'm assuming you're using FLAME?

How do you protect the host Elixir app from the agent shell, runtime, etc

Not using FLAME in this case. The agent runs entirely separately from your apps/IDE/compute. It communicates with and drives your runtime over phoenix channels
Oh interesting. So how do messages come from the container? Is there a host elixir app that is running the agent env? How does that work?
Yes, elixir app deployed across the planet as a single elixir cluster. We spawn the agents (GenServer's), globally register them, and then the end-user LiveView chat communicates with the agent with regular elixir messages, and the IDE is a phoenix channels client that communicates with and is driven by the agent.
how are they isolating ai agent state from app-level processes without breaking BEAM's supervision guarantees?
They run on separate machines and your agent just controls the remote runtime when it needs to interact with the system/write/read/etc
appreciate the clarity, that helps.

quick followup if the agent's running on a separate machine and interacting remotely, how are failure modes handled across the boundary? like if the agent crashes mid-operation or sends a malformed command, does the remote runtime treat it as an external actor or is there a strategy linking both ends for fault recovery or rollback? just trying to understand where the fault tolerance guarantees begin and end across that split.

token auth and re-handshake. Agent is respawned if it's no longer alive, and project index is resynced
the ai agent runs inside the same remote runtime as the app. does it share the BEAM vm or run as a port process?
The agent runs outside your IDE instance and controls/communicates with it over Phoenix channels