The signals you pick up between 0.5 Hz and 40 Hz are most likely "population"-level signals or, put another way, the average activity of millions of neurons spread across large swaths of the brain. A good analogy: trying to decode the brain from EEG is like trying to work out what is going on in a soccer match using only the noise of the crowd.
From start to finish, an individual action potential lasts about 1 ms, followed by a refractory period of a few milliseconds. There is probably not much biologically relevant activity happening at faster timescales.
However, one often wants to "sort" spikes so that those arising from individual neurons are grouped together. To do so, you need to sample fast enough to capture the shape of the action potential, not just the fact that it happened. This usually requires sampling rates in the tens of kHz (~30 kHz would be good).
Eh, it depends on what you're calling "the signal of interest."
A spike train, the output of a neuron, can reasonably be represented as a binary signal (1=spike, 0=no spike) at 1 kHz. In fact, I'd say this is nearly standard.
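As a rough illustration of that binary representation, here is a minimal Python sketch that bins spike times into 1 ms bins. The function name `spike_train_binary`, its parameters, and the example spike times are all made up for illustration; it also assumes at most one spike per bin, which the refractory period makes reasonable at 1 kHz.

```python
import numpy as np

def spike_train_binary(spike_times_s, duration_s, fs_hz=1000.0):
    """Turn spike times (in seconds) into a 0/1 array sampled at fs_hz.

    Hypothetical helper: 1 = at least one spike in the bin, 0 = no spike.
    """
    n_bins = int(round(duration_s * fs_hz))
    train = np.zeros(n_bins, dtype=np.uint8)
    # Map each spike time to the index of the bin that contains it.
    idx = np.floor(np.asarray(spike_times_s) * fs_hz).astype(int)
    # Drop spikes that fall outside the recording window.
    idx = idx[(idx >= 0) & (idx < n_bins)]
    train[idx] = 1
    return train

# Made-up spike times (seconds) in a 100 ms recording.
spike_times = [0.0105, 0.0325, 0.0875]
train = spike_train_binary(spike_times, duration_s=0.1)
spike_bins = np.flatnonzero(train).tolist()
print(len(train), spike_bins)
```

With 1 ms bins, the 100 ms recording becomes 100 samples, and each spike shows up as a single 1 in the bin containing it.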
The neuron's membrane potential, a combination of its current inputs, intrinsic properties, and recent history, changes on faster timescales. Whatever processing the brain does probably does not use this information itself, but one probably wants access to it anyway for experimental and practical reasons.
The signal that an extracellular electrode picks up isn't that 1 kHz spike train. It's many of them mixed together and you'll need finer features to tell them apart. In other words, you can't just invoke Nyquist, sample at 2 kHz, and call it a day.
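To put rough numbers on that, here is a back-of-the-envelope sketch of how many samples land on a single action potential at different sampling rates. It assumes a spike lasting about 1 ms; the helper name `samples_per_spike` is just for illustration.

```python
def samples_per_spike(fs_hz, spike_duration_s=1e-3):
    """Approximate number of samples that fall within one spike waveform."""
    return int(round(fs_hz * spike_duration_s))

# Compare a "Nyquist on the spike train" rate with a typical spike-sorting rate.
low_rate = samples_per_spike(2_000)    # 2 kHz
high_rate = samples_per_spike(30_000)  # 30 kHz
print(low_rate, high_rate)
```

Two samples per spike tell you little more than that something happened, while thirty or so give you a waveform shape that you can actually compare across neurons.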