The idea of putting a timebox around a spike is really good. Then the spike is done whenever you get an answer or the timebox runs out, and "We don't know yet" is an acceptable answer. That clearly distinguishes spikes from other development stories.
So often, our development stories depend on assumptions about how/if a new technology is going to make the change we want to make work. It might not work, and estimating with unfamiliar tech is really hard.
To put this differently, putting a timebox around a spike is really good if and only if "we don't know yet" is an acceptable answer. I find that it usually isn't - you're doing the spike precisely because you don't know enough to estimate or do some other piece of work, and if you don't get to the point where you do, you still can't do it.
In addition to the great information you've provided, we identify features/decisions/choices that require spikes based on an assessment that they are high risk and/or that we have low confidence in our estimate. We also sometimes use spikes to break xxx-large estimates into smaller pieces.
That way, until a project/plan/item has low (enough) risk, and high (enough) confidence, it needs more spikes.
We've found this model is a great way of identifying what needs spiking, and when a spike is 'done'. And, it also helps to ensure spikes don't end up consuming too much time.
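To make the rule concrete, here's a minimal sketch of how that check could look. The 0-to-1 scales, the threshold values, and the names (`Item`, `needs_spike`, `MAX_RISK`, `MIN_CONFIDENCE`) are all made up for illustration; the only thing taken from the description above is the rule itself: keep spiking until risk is low enough and confidence is high enough.

```python
from dataclasses import dataclass

# Hypothetical thresholds: the model only says "low (enough)" risk and
# "high (enough)" confidence, so each team would pick its own numbers.
MAX_RISK = 0.3
MIN_CONFIDENCE = 0.7


@dataclass
class Item:
    name: str
    risk: float        # assumed scale: 0.0 (no risk) to 1.0 (very risky)
    confidence: float  # assumed scale: 0.0 (a guess) to 1.0 (certain)


def needs_spike(item: Item) -> bool:
    """An item needs (more) spiking while risk is too high or confidence is too low."""
    return item.risk > MAX_RISK or item.confidence < MIN_CONFIDENCE


backlog = [
    Item("switch to a new database", risk=0.8, confidence=0.4),
    Item("rename a button", risk=0.1, confidence=0.9),
]

# Only the items that fail the check get a spike scheduled.
to_spike = [item.name for item in backlog if needs_spike(item)]
```

After each spike you would re-score the item and run the same check again, which is also what keeps spikes from running on indefinitely.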
I often hear people use the word spike w/r/t this kind of thing, but it seems to me that "spike" is used as an agile code word when speaking with stakeholders to conceal what is actually going on: good ol' fashioned _research_. There's nothing wrong with research or planning, but I've seen a lot of engineers who seemed constitutionally incapable of calling it this in front of stakeholders.
You mention defining when a spike is 'done', and timeboxing. I understand why we do this, but there are times where I believe it contradicts the very nature of doing adequate research. We have a known unknown, and so we do some research to make that unknown known. In the course of doing research though, you will discover there's a cascading set of new unknowns which must also be understood. So, identifying a 'done' state, depending on the nature of the research we're talking about, can be a totally fraught activity.
Kinda ranting here, but this is a common scenario: "Within the next week, we'll spike to get a more accurate estimate of the time it will take to build a cluster that meets the requirements we've agreed to, and propose several technical recommendations which will fit the budget." I've seen too many instances of this where 2 more days, another week, maybe a whole month could have been spent answering these questions _correctly_. Now sure, if you have a trusting relationship with your stakeholders, you can easily negotiate that you think a solid set of recommendations will take another week, but there is a certain amount of pressure to just be done with it and smile, and I've seen this kind of thing backfire over and over again depending on the level of trust that's been established between product teams, stakeholders, and engineers. 3 months into the project, the realization settles in that the database you've chosen will not scale on writes the way you'd hoped it would. Now what?
I don't recall where this quote is from, so if someone knows the source please let me know: "Weeks of programming can save you hours of planning."
I _think_ we are saying the same thing. I'd put 'done' in quotes for exactly the reasons you mention.
Timeboxing is merely used as a forced checkpoint; to re-assess across all activities/plans. It doesn't mean we can't follow on with more spike work.
At the end of the day, we, as a business, have to determine what we do (or don't do) and when. And we use spikes (or 'research', sic) to try to help inform those decisions.
If a problem occurs - as per your scalability example - or we identify more 'unknowns', we now need to make more decisions, and use spikes (and experience, etc.) to help inform those ... rinse/repeat.
We decided to use Gulp and a bunch of real frontend tools like Browserify and Sass. We moved all the assets to a `frontend` directory, uninstalled the gem wrappers for JS libraries and switched to npm to manage them. The asset pipeline is good for small apps, but when you grow, you need the right tools. Frontend tools are good for frontend, no matter how much you like or dislike the JavaScript ecosystem.
I'm curious about this too. I'm at the point in my current project where the asset pipeline is becoming somewhat of a burden, and want to know what others have done to replace it or mitigate it.
It's on my ToDo list to write a blog post about the technical problem itself :) Not only the replacement of the assets pipeline but also what we did before.
Following this approach is very effective for getting R&D credits in Canada. Documenting your process in a lightweight manner like this helps tremendously.