> call to external db, and then to llm with retrieved context.
Right, neither of those adds CPU load of a different kind than LLM inference itself (the second one is LLM inference).
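A rough sketch of what I mean, assuming the inference runs on a separate server and the app only orchestrates. `fetch_context` and `call_llm` are hypothetical stand-ins that simulate network waits with `asyncio.sleep`; they are not a real DB or LLM client. The point is only that the calling process spends its time waiting, not computing:

```python
import asyncio
import time


async def fetch_context(query: str) -> str:
    # Hypothetical stand-in for an external DB lookup (simulated network wait).
    await asyncio.sleep(0.05)
    return f"docs for {query!r}"


async def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for a request to a remote inference server.
    await asyncio.sleep(0.05)
    return f"answer based on: {prompt}"


async def handle(query: str) -> str:
    # Retrieve context first, then pass it to the model along with the query.
    context = await fetch_context(query)
    return await call_llm(f"{context}\n\n{query}")


def measure(query: str):
    # Compare wall-clock time with CPU time consumed by this process.
    wall_start, cpu_start = time.perf_counter(), time.process_time()
    answer = asyncio.run(handle(query))
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return answer, wall, cpu


if __name__ == "__main__":
    answer, wall, cpu = measure("refund policy")
    print(answer)
    print(f"wall: {wall:.3f}s, cpu in this process: {cpu:.3f}s")
```

With real clients the numbers will differ, but the shape should be similar: wall time is dominated by the two waits, and the CPU time spent in the orchestrating process is small by comparison.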
> also business rules, no?
Business rules vary a lot in content and complexity, but they tend to be either simple enough that they won't add much load, or complex enough that you'll probably want to use an existing rules engine. Many of those have Python bindings regardless of their implementation language, and they behave the same way no matter what language you call them from.