by ianbicking 894 days ago
I'm sure there will be improvements, but when I made a custom GPT a few weeks ago I was very unimpressed:

1. The GPT builder itself didn't feel like it was a well-tuned prompt (i.e., the prompt they use to guide prompt creation). It created long-winded prompts that left out information and didn't pay attention to what I said. Anything I enter into the GPT builder interface is probably very important!

2. The quotas are fairly low, and apply to testing. I was only able to do maybe 10 minutes of playtesting before I ran out of quota.

3. There are no tools to help with testing; it's all just vibes. No prompt comparisons.

4. The implied RAG is entirely opaque. You can upload documents, and I guess they get used...? But how? The best I could figure out was to put text into the prompt telling GPT to be very open about how it used documents, then basically ask it questions to see if it understood the content and purpose of the documents I uploaded.

5. There's no extended interface outside of the intro questions. No way to emit buttons or choices, just the ever-present text field.

6. There's no hidden state. I don't particularly want impossible-to-see state, but a powerful technique is to get GPT to make plans or internal notes as it responds. These are very confusing when presented in the chat itself. In applications I often use tags like <plan>...</plan> to mark these, which is compatible with the simple data model of a chat.

7. There's no context management. Like hidden state, I'd like to be able to mark things as "sticky"; things that should be prioritized when the context outgrows the context window.
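For points 6 and 7, here is a minimal Python sketch of what an application outside the GPT builder might do. It is an illustration under assumptions, not anything Custom GPTs expose: the `Message` type, the `sticky` flag, the function names, and the use of character counts as a stand-in for tokens are all hypothetical.

```python
import re
from dataclasses import dataclass

# Assumed tag name, following the "<plan>...</plan>" convention mentioned above.
PLAN_RE = re.compile(r"<plan>(.*?)</plan>", re.DOTALL)


def split_plan(reply: str) -> tuple[str, list[str]]:
    """Return (text to show the user, hidden plan notes)."""
    notes = [m.strip() for m in PLAN_RE.findall(reply)]
    visible = PLAN_RE.sub("", reply)
    # Collapse runs of blank lines left behind by the removed tags.
    visible = re.sub(r"\n{3,}", "\n\n", visible).strip()
    return visible, notes


@dataclass
class Message:
    role: str
    text: str
    sticky: bool = False  # hypothetical flag: keep this message when trimming


def trim_context(messages: list[Message], budget: int) -> list[Message]:
    """Keep every sticky message, then as many recent non-sticky ones as fit.

    Size is measured in characters as a crude stand-in for tokens. Sticky
    messages are always kept, even if they alone exceed the budget.
    """
    used = sum(len(m.text) for m in messages if m.sticky)
    keep = {i for i, m in enumerate(messages) if m.sticky}
    # Walk backwards so the most recent messages win the remaining space.
    for i in range(len(messages) - 1, -1, -1):
        m = messages[i]
        if m.sticky:
            continue
        if used + len(m.text) > budget:
            break
        used += len(m.text)
        keep.add(i)
    # Preserve the original ordering of whatever survived.
    return [m for i, m in enumerate(messages) if i in keep]
```

The full reply (plan included) would stay in the stored history so the model can see its own notes on later turns; only the visible part goes to the UI.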

These are all fixable, though I worry that OpenAI's confidence in AI maximalism will keep them from building the hard features, and that they'll instead just rely on GPT "getting smarter" and magically not needing real features.

3 comments

Re: 1. I've made many GPTs and never took the time to read that first page / use the GPT builder. I always immediately jumped into "Configure". Had no idea this was a thing.

Re: 2. 100%. It's rough.

Re: 4. Had a ton of issues where it would just error and say it couldn't find the document, which completely ruined the point / purpose of one I made.

Re: Hidden state - could be fun, but I do like the transparency of everything needing to be out in the open. But maybe a "hidden by default" scratchpad.

Maybe you can hack some context management and hidden state with custom functions? The scratch buffer can literally be an endpoint that returns OK and has a field like "plain: list[str]", assuming it keeps function call data in memory. Better context memory could perhaps be done by having the system prompt remind it to always call getPinned() every 5 messages...
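A rough sketch of what the backend for that hack could look like, written as plain Python handlers rather than a real HTTP service. Everything here is hypothetical: the `ScratchStore` class, its method names, the `plan` field name, and the JSON shapes are my assumptions, and whether a GPT actually keeps function call arguments in its context is exactly the open question above.

```python
import json


class ScratchStore:
    """Hypothetical backend for two custom functions: a scratchpad and pins."""

    def __init__(self) -> None:
        self.pinned: list[str] = []

    def write_scratch(self, body: str) -> str:
        """Accept a JSON body like {"plan": ["..."]} and just acknowledge it.

        Nothing is stored on purpose: the idea is that the model's own call
        arguments stay in the conversation and act as the scratchpad.
        """
        payload = json.loads(body)
        plan = payload.get("plan", [])
        if not isinstance(plan, list) or not all(isinstance(p, str) for p in plan):
            return json.dumps({"status": "error", "detail": "plan must be a list of strings"})
        return json.dumps({"status": "OK"})

    def pin(self, text: str) -> str:
        """Remember a note that should be re-read periodically."""
        self.pinned.append(text)
        return json.dumps({"status": "OK"})

    def get_pinned(self) -> str:
        """What a getPinned() call would return to the model."""
        return json.dumps({"pinned": self.pinned})
```

Each method would sit behind its own endpoint, with the system prompt telling the model when to call which one.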

But yes, the whole thing does seem bound to regress to low-quality "I'll enter 1/5th of your request in for you". You can probably even make a script that uses GPT-3.5 to generate these "GPTs".

Yeah, you can definitely do a lot with Actions, and I've seen some examples of that. But if I want to deploy a service then I'd rather make a nice frontend along with it, and Custom GPTs aren't that! (But the fact I don't have to monetize just to support OpenAI API costs is still very appealing.)

At the moment Actions are pretty brutal to the experience, with constant confirmations, and they seem to halve both the speed and the quota, since I think each one takes two GPT responses: one to assemble the Action and another to react to the result. Using a Custom GPT that makes extensive use of Actions means sitting and watching it think, waiting to hit a button, then sitting and watching it think for a long time, only to run out of quota before you get to anything interesting.

I use several custom GPTs that I built inside GPT Pro and they have doubled my productivity over using GPT alone, which had already doubled my productivity.

Every time a custom GPT gives me a response that is off, I give it a nudge to edit its configuration. The main issue is that its configuration doesn't have enough space for all the rules, and it likes to ditch the more minor rules when you give it something extra to remember.