Voice systems only require an Internet connection because companies have designed them that way, to keep you on the hook and collect your data. There’s no reason voice control can’t run on a local box except for this business decision. Voice control systems were around for years before everything moved to the cloud, and with the advances made in the technology since then, the accuracy would be just fine on a local system.
That is in fact how most smart lights are implemented. There is a central unit that each lightbulb connects to, and that central unit is (optionally) connected to the internet.
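A rough sketch of that layout, in case it helps. All the names here (`Hub`, `Bulb`, `set_power`) are made up for illustration and don't correspond to any real product's API; the point is only that commands are handled by the hub itself, and the internet link is a flag it never needs.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Bulb:
    """Hypothetical bulb; it only ever talks to the hub."""
    name: str
    on: bool = False


@dataclass
class Hub:
    """Hypothetical central unit that every bulb connects to."""
    internet_connected: bool = False  # optional; not used for local control
    bulbs: Dict[str, Bulb] = field(default_factory=dict)

    def register(self, bulb: Bulb) -> None:
        # Bulbs join the hub over the local network.
        self.bulbs[bulb.name] = bulb

    def set_power(self, name: str, on: bool) -> bool:
        # The command is handled entirely on the hub; no cloud round trip.
        bulb = self.bulbs[name]
        bulb.on = on
        return bulb.on


hub = Hub(internet_connected=False)
hub.register(Bulb("kitchen"))
hub.set_power("kitchen", True)
```

With `internet_connected` left as `False`, the kitchen bulb still switches on, which is the whole argument in miniature.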
I don't know of any that are for sale online, but the deep learning, mesh networking, and microcontroller tech needed to put such a system together is readily available and open source.
Newer devices and newer OSs can do some recognition entirely on-device. But many features, and certainly anything that goes through the older HomePods, require cloud assistance.
Apple introduced on-device voice processing with iOS 15 (that's the current version, if you're not so familiar with the system). Older versions of iOS required an internet connection for any voice command. From what I remember, on-device processing needs an A12 Bionic chip or newer (iPhone XS and later), so the older devices that iOS 15 still supports (back to the iPhone 6s) still need a connection.