by pedrovhb 824 days ago
For all the brilliance in the AI and infra departments of OpenAI, their official Python library (which is the flagship one as I understand) feels pretty unidiomatic, designed without much thought for common patterns in the language.

2012 JavaScript called, it wants its callbacks wrapped in objects back. Why do we have a context manager named "stream" on which you call `.until_done()`? This could've been an iterator, or better - an asynchronous iterator, since this is streaming over the network. We could be destructuring instances of named tuples with pattern matching, or even just doing `"".join(delta.text for delta in prompt (...)`. But no, "here, subclass this instead," the wrapper around a web API tells me.
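
For illustration, here is a minimal sketch of the iterator style this comment asks for. `TextDelta`, `fake_stream`, and `collect_text` are hypothetical names, not part of the OpenAI library, and the network is simulated with `asyncio.sleep(0)`.

```python
import asyncio
from typing import AsyncIterator, NamedTuple


class TextDelta(NamedTuple):
    """Hypothetical event type: one chunk of streamed text."""
    index: int
    text: str


async def fake_stream(chunks: list[str]) -> AsyncIterator[TextDelta]:
    """Stand-in for a network stream: yields one delta per chunk."""
    for i, chunk in enumerate(chunks):
        await asyncio.sleep(0)  # simulate waiting on the network
        yield TextDelta(index=i, text=chunk)


async def collect_text(chunks: list[str]) -> str:
    """Consume the stream with an async list comprehension and join the text."""
    return "".join([delta.text async for delta in fake_stream(chunks)])
```

Note that `str.join` cannot consume an async generator directly, which is why the sketch builds a list with an async comprehension first.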

4 comments

Hey there, I helped design the Python library.

The `stream` context manager actually does expose an async iterator (in the async client), so you could instead do this for the simple case:

    async with client.beta.threads.runs.create_and_stream(…) as stream:
      async for text in stream.text_deltas:
        print(text, end="", flush=True)

which I think is roughly what you want.

Perhaps the docs should be updated to highlight this simple case earlier.

We are also considering expanding this design, and perhaps replacing the callbacks, like so:

    async with client.beta.threads.runs.create_and_stream(…) as stream:
      async for event in stream.all_events:
        if event.type == 'text_delta':
          print(event.delta.value, end='')
        elif event.type == 'run_step_delta':
          event.snapshot.id
          event.delta.step_details...

which I think is also more in line with what you expect. (You could also `match event: case TextDelta: …`).

Note that the context manager is required because otherwise there's no way to tell if you `break` out of the loop (or otherwise stop listening to the stream) which means we can't close the request (and you both keep burning tokens and leak resources in your app).
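
A small sketch of why the context manager matters when the consumer stops early. `FakeConnection` and `open_stream` are hypothetical stand-ins for an open streaming request, not the library's implementation.

```python
import asyncio
from contextlib import asynccontextmanager
from typing import AsyncIterator


class FakeConnection:
    """Hypothetical stand-in for an open HTTP streaming response."""

    def __init__(self) -> None:
        self.closed = False

    async def chunks(self) -> AsyncIterator[str]:
        for word in ["one", "two", "three"]:
            await asyncio.sleep(0)  # simulate waiting on the network
            yield word

    async def aclose(self) -> None:
        self.closed = True


@asynccontextmanager
async def open_stream(conn: FakeConnection) -> AsyncIterator[AsyncIterator[str]]:
    try:
        yield conn.chunks()
    finally:
        # Runs on normal exit, on `break`, and on exceptions alike.
        await conn.aclose()


async def read_first_chunk() -> tuple[str, bool]:
    """Read one chunk, break out of the loop, and report whether cleanup ran."""
    conn = FakeConnection()
    first = ""
    async with open_stream(conn) as stream:
        async for chunk in stream:
            first = chunk
            break  # stop listening early
    return first, conn.closed
```

With a bare `async for` over the iterator there is no hook that fires at the `break`; the `finally` in the context manager is what guarantees the connection is closed.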

Context managers are a great abstraction.

Everything feels unidiomatic. The API design is bad, the frontends they build are horrific, reliability and availability are shocking.

And yet the AI is so good I put up with them every day.

If they ever grow into a proper product org they'll be unstoppable.

Hi there, I help design the OpenAI APIs. Would you be able to share more?

You can reply here or email me at atty@openai.com.

(Please don't hold back; we would love to hear the pain points so we can fix them.)

does your team do usability tests on the apis before launching them?

if you got 3-5 developers to try and use one of the sdks to build something, i bet you'd see common trends.

e.g. we recently had to update an assistant with new data every day and get 1 response, and this is what the engineer came up with. it could probably be improved, but this is really ugly

```
  const file = await openai.files.create({
   file: fs.createReadStream(fileName),
   purpose: 'assistants',
  })
  await openai.beta.assistants.update(assistantId, {
   file_ids: [file.id],
  })

  const { id: threadId } = await openai.beta.threads.create({
   messages: [
    {
     role: 'user',
     content:
      'Create PostSuggestions from the file. Remember to keep the style fun and engaging, not just regurgitating the headlines. Read the WHOLE article.',
    },
   ],
  })
  const getSuggestions = async (runIdArg: string) => {
   return new Promise<PostSuggestions>(resolve => {
    const checkStatus = async () => {
     const { status, last_error, required_action } = await openai.beta.threads.runs.retrieve(threadId, runIdArg)

     console.log({ status })
     if (status === 'requires_action') {
      if (required_action?.type === 'submit_tool_outputs') {
       required_action?.submit_tool_outputs?.tool_calls?.forEach(async toolOutput => {
        const parsed = PostSuggestions.safeParse(JSON.parse(toolOutput.function.arguments))
        if (parsed.success) {
         await openai.beta.threads.runs.cancel(threadId, runIdArg)
         resolve(parsed.data)
        } else {
         console.error(`failed to parse args from openai to my type (errors=${parsed.error.errors})`)
        }
       })
      } else {
       console.error(`requires_action, but not submit_tool_outputs (type=${required_action?.type})`)
      }
     } else if (status === 'completed') {
      throw new Error(`status is completed, but no data. supposed to go to requires_action`)
     } else if (status === 'failed') {
      throw new Error(`message=${last_error?.message}, code=${last_error?.code}`)
     } else {
      setTimeout(checkStatus, 500)
     }
    }

    checkStatus()
   })
  }
  const { id: runId } = await openai.beta.threads.runs.create(threadId, {
   assistant_id: assistantId,
  })
  console.time('openai create thread')
  const newsSuggestions = await getSuggestions(runId)
  console.timeEnd('openai create thread')
```

just to add to this, it's not helped by the docs. either they don't exist, or the seo isn't working right.

e.g. my search term was "openai assistant service function call node". The first 2 results are community forums, not what i'm looking for. The 3rd is seemingly the official one but doesn't actually answer the question (how to use the Assistants service with node and function calling) with an example. The 4th is in python.

https://community.openai.com/t/how-does-function-calling-act...

https://community.openai.com/t/how-assistant-api-function-ca...

https://platform.openai.com/docs/guides/function-calling

https://learn.microsoft.com/en-us/azure/ai-services/openai/h...

I'm sorry for your experience, and thanks very much for sharing the code snippet - that's helpful!

We did indeed code up some sample apps and highlighted this exact concern. We have some helpers planned to make it smoother, which we hope to launch before Assistants GA. For streaming beta, we were focused just on the streaming part of these helpers.
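
For illustration only, and not the planned helpers themselves: a generic sketch of the poll-until-done loop from the snippet above, written in Python. `poll_until_terminal` and `retrieve_status` are hypothetical names, and the set of terminal states is an assumption limited to the statuses handled in that snippet.

```python
import time
from typing import Callable

# Assumption: only the statuses that the snippet above treats as end states.
TERMINAL_STATES = {"requires_action", "completed", "failed"}


def poll_until_terminal(
    retrieve_status: Callable[[], str],
    interval: float = 0.5,
    timeout: float = 60.0,
) -> str:
    """Call `retrieve_status` until it returns a terminal state or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        status = retrieve_status()
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still {status!r} after {timeout}s")
        time.sleep(interval)
```

Unlike the recursive `setTimeout` version, a failure here surfaces as a return value or an exception in the caller rather than being thrown inside a detached callback.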

Hey, random question.

Is there a technical reason why log probs aren't available when using function calling? It's not a problem, I've already found a workaround. I was just curious haha.

In general I feel like the function calling/tool use is a bit cumbersome and restrictive so I prefer to write the typescript in the functions namespace myself and just use json_mode.

Have you seen/tried the `.runTools()` helper?

Docs: https://github.com/openai/openai-node?tab=readme-ov-file#aut...

Example: https://github.com/openai/openai-node/blob/bb4bce30ff1bfb06d...

(if what you're fundamentally trying to do is really just get JSON out, then I can see how json_mode is still easier).

Who can I reach out to for feedback on the web UI? Specifically, the chat.openai.com interface.

Web developer/designer for 24 years so I have a lot of ideas

...except for all the others.

Use Claude in Safari and the browser completely locks up after a single response.

My experience is their official Python library was easy to use, no surprises, everything is typed and generated from the OpenAPI spec in a thoughtful way.

The tools are great because they don't invent their own DSL, they "just" use JSON schemas.

Maybe they ought to contribute changes to OpenAPI to support streaming APIs better.

In contrast so many startups make their own annotation-driven DSLs for Python with their branding slapped over everything. It gives desperate-for-lock-in vibes. The last people OpenAI should be taking advice from for their API design is this forum.

How is suggesting the use of iterators and named tuples related to creating domain specific languages? If anything I'd say they're a much more generic and universally recognizable approach than having users subclass `AssistantEventHandler` to be passed to `client.beta.threads.runs.create_and_stream`, the context manager. This is very much a long way past just using JSON schemas but that part is ok - there's a REST API, and there's a library. If you're keen on the simplicity of JSON schema then by all means use the API with `requests` or your preferred http client library. Since that's always an option, it stands to reason that the point of having a dedicated library is to provide thoughtful abstractions that make it easier to use the service.

What I'm arguing is precisely that the abstractions in the library (such as the `AssistantEventHandler` shown in the article) are ineffective in making things simpler. They force you to over-engineer solutions and distribute state unnecessarily and be aware of that specific class interface when it could've just been something you use in a `for x in y` loop like everyone would know to do without spending an afternoon looking over docs and figuring out how the underlying implicit FSM works.
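
A minimal sketch of the contrast being drawn, under stated assumptions: `EventHandler`, `JoiningHandler`, and `dispatch` are hypothetical and only loosely modelled on the callback pattern discussed here; they are not the library's `AssistantEventHandler`.

```python
from typing import Iterable


class EventHandler:
    """Hypothetical callback-style base class."""

    def on_text_delta(self, text: str) -> None:
        pass


class JoiningHandler(EventHandler):
    """Callback style: the accumulated text has to live on the instance."""

    def __init__(self) -> None:
        self.parts: list[str] = []

    def on_text_delta(self, text: str) -> None:
        self.parts.append(text)


def dispatch(chunks: Iterable[str], handler: EventHandler) -> None:
    """Stand-in for a library driving the handler as chunks arrive."""
    for chunk in chunks:
        handler.on_text_delta(chunk)


def join_with_handler(chunks: Iterable[str]) -> str:
    handler = JoiningHandler()
    dispatch(chunks, handler)
    return "".join(handler.parts)


def join_with_loop(chunks: Iterable[str]) -> str:
    """Iterator style: the state is an ordinary local variable."""
    parts: list[str] = []
    for chunk in chunks:
        parts.append(chunk)
    return "".join(parts)
```

Both functions return the same string; the difference is where the state lives and how much interface the caller has to learn first.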

Probably written by GPT4

It’s not the case. The SDK is a collaboration between OpenAI and Stainless.

https://www.stainlessapi.com/

As a Stainless contributor I can guarantee you a lot of thought has been put into the design, and it definitely isn’t written by an ML model.