by jacques_chester 1947 days ago
I feel like I'm fulfilling a stereotyped HN commenter role by asking this, but isn't "prediction based on historical data" actually, you know, an estimate?
4 comments

I couldn't read all the text, but I believe the meta-point is that the need for accurate estimates drops as the cycle time is reduced.

It is all about feedback.

My analogy is an old analog-style joystick that positions a simulated robot arm on a screen. If the update rate is high enough I can track a rapidly moving dot, but if it drops below some threshold and I need to accurately position the arm at some time in the future, then I need to construct a huge model of the system and know the force, stiction, friction, mass, moment and thermal expansion. (Edit: I have another analogy. Anyone can spray a moving target with a hose, but using a bow and arrow requires skill and practice.)

Feedback allows us to use unpredictable components to make predictable systems. Those systems are nearly always amplifiers. Systems that use direct feedback don't have to have the same reductionist model as something that needs better prediction (estimates).
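A rough sketch of that claim, in Python. Everything here is an assumption made up for illustration: the function name `track`, the idea of an actuator whose real gain (0.8) differs from the gain we believe it has (1.0), and the target path. With feedback we look at the current error on every step; without it we trust the model and never look.

```python
def track(targets, true_gain, assumed_gain, use_feedback):
    """Move an arm toward each target in turn; return the final absolute error.

    true_gain is how the actuator really responds, assumed_gain is what our
    model believes. Both are hypothetical parameters for this sketch.
    """
    arm = 0.0
    planned = 0.0  # where the open-loop model believes the arm is
    for target in targets:
        if use_feedback:
            desired_move = target - arm      # measure the real error
        else:
            desired_move = target - planned  # trust the model blindly
        command = desired_move / assumed_gain
        arm += command * true_gain           # the real actuator responds
        planned = target
    return abs(targets[-1] - arm)


# A dot moving one unit per step for 20 steps (illustrative path).
targets = [float(i) for i in range(1, 21)]

open_loop_error = track(targets, true_gain=0.8, assumed_gain=1.0, use_feedback=False)
closed_loop_error = track(targets, true_gain=0.8, assumed_gain=1.0, use_feedback=True)

print(f"open loop error:   {open_loop_error:.2f}")
print(f"closed loop error: {closed_loop_error:.2f}")
```

In this toy setup the open-loop error keeps growing with the distance travelled, while the closed-loop error stays bounded even though the actuator model is wrong. To do as well without feedback you would have to know the true gain exactly, which is the "huge model" problem.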

This is why Lisp was a superpower in the 80s: it had a REPL. Same with Smalltalk, where the IDE, the REPL and the universe were all the same thing. It makes total sense that agile came out of a system based around REPLs and instant feedback. Arduino did it for embedded dev, HyperCard for programming, and the spreadsheet before that.

High-bandwidth feedback allows us to be less skilled. Good estimators need to be highly skilled to make those estimates. Hose vs arrow. That reminds me, have you seen a really skilled FPS player on a predictable but high-ping connection? They are almost timeless in how they predict the future, and to everyone else they dance between every 10th frame. Amazing predictors!

You can't agile a Martian probe (yet). As new ways are discovered to reduce cycle times, the time between cause and effect, each preceding structure of feedback is replaced with an even higher-bandwidth one. Robust DFU is a meta-REPL, as hardware manufacturers race to ship products that are literally not finished and require a firmware update on boot to even function.

I just wanted to reply to say this was a great explanation and that I broadly agree.
Thanks for the feedback.

Quip aside, I also learned a lot writing it. Always bet on the Scientific Method and if we know something works, we should have to justify not using it.

This aligned perfectly with the thread, https://youtu.be/S1nc_chrNQk?t=120

I went on a tour of Blue Origin and learned next to nothing. It looked like the rich kid's house that had a Coleco and a Neo Geo.

The picture they paint of SpaceX is one of being in continuous flow, as opposed to cautious pessimism.

I think the type of estimation meant here is the agile type, where the team looks at a feature/requirement and estimates the time or relative effort based on the nature of the requirement + whatever other factors they choose. It's usually more intuitive and not based on historical data.
He also links to a conference talk about this. "you can make a claim that estimates are based on past behavior, but the fact is that what you're implementing is something that hasn't been implemented before. So any kind of measurement that you've made of something that has happened in the past is not going to impact what you're doing now" https://www.youtube.com/watch?v=QVBlnCTu9Ms
I'm rather skeptical of this idea that just because a specific feature was never implemented, or just because a specific bug was never fixed, estimates based on past behavior don't work. That assertion doesn't have any basis in reality. I mean, implementing a feature or fixing a bug is not an isolated event performed with improvised approaches starting from scratch. Teams have processes and procedures that are standardized, take time, and need to be performed sequentially, which means that the improvisation part at best represents a small subset of the time invested working on a ticket.

For a concrete example, let's imagine a team which has a continuous delivery pipeline that involves a code review step and manual acceptance tests. Let's say that the code review can stay in a queue for a couple of hours, or even slip into the next day, and that the manual acceptance tests require the feature to be deployed to a preprod stage after passing through all unit and integration tests, and that they might take a day to run.

With this process alone, the ticket already takes at least 2 or 3 days between being assigned to someone and being marked as done.

Now, let's say that the coding bit of a random ticket might take 5 minutes or 3 days. This means that the overall time between the start and end of a ticket is about 4 days ± 2 days, which means that in the worst-case scenario it takes 6 days to close a ticket.
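The arithmetic above can be spelled out as a quick sketch. The numbers are the ones from this example; treating "5 minutes" as a fraction of an 8-hour working day is my own assumption, and the function name is made up.

```python
# Figures from the example above, in working days.
PROCESS_DAYS = (2.0, 3.0)           # review queue + preprod acceptance tests
CODING_DAYS = (5 / (60 * 8), 3.0)   # 5 minutes of an assumed 8-hour day, up to 3 days


def cycle_time_range(process, coding):
    """Add the two (best, worst) ranges end to end."""
    best = process[0] + coding[0]
    worst = process[1] + coding[1]
    return best, worst


best, worst = cycle_time_range(PROCESS_DAYS, CODING_DAYS)
midpoint = (best + worst) / 2
half_width = (worst - best) / 2

print(f"{best:.2f} to {worst:.2f} days, about {midpoint:.1f} +/- {half_width:.1f}")
```

The fixed process overhead sets the floor, so even a 5-minute change lands at roughly 2 days, and the coding time only moves the upper end of the range.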

How is this sort of estimate not possible?

The problem of providing estimates is not one of predicting the amount of time it takes to close a ticket. The problem of providing estimates is a problem of processes, and how to adequately organize, structure, and classify work. If you don't know what you're doing then you don't know when you're done.

Good point: you hardly ever start with a clean slate. Whether you use historical data or ask engineers to estimate how long it will take, the answer will always be based on past performance.

But the point I try to make is that it is hard to take into account all the factors you have to deal with in a complex situation. As a human, you tend to ignore irregular influences. With tracking tools, you get this data right out of the box and it is more accurate in my opinion.
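To make that concrete, here is a minimal sketch of a forecast taken straight from tracked cycle times. The sample data is invented for illustration (including the one 9-day outlier, the kind of irregular influence people tend to forget), and `percentile_forecast` is a hypothetical helper, not something from a particular tracking tool.

```python
import statistics

# Made-up cycle times in days, for illustration only.
# In practice this would come from a tracker export.
historical_cycle_days = [2, 3, 3, 4, 2, 6, 3, 5, 4, 3, 9, 2]


def percentile_forecast(samples, percent):
    """Cycle time that roughly `percent`% of past tickets finished within."""
    cuts = statistics.quantiles(samples, n=100, method="inclusive")
    return cuts[percent - 1]  # cuts[0] is the 1st percentile


p50 = percentile_forecast(historical_cycle_days, 50)
p85 = percentile_forecast(historical_cycle_days, 85)

print(f"half of past tickets closed within {p50:.1f} days")
print(f"85% of past tickets closed within {p85:.1f} days")
```

The gap between the 50th and 85th percentile is where the irregular tickets show up, without anyone having to remember them.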

Yeah, I think what the OP overlooked is that most of the cycle time was taken up by various internal company processes, and these are really not novel every time, so it takes a similar amount of time to deliver different features. This in my mind is simply called estimation.
The external clock is going to be some multiple of the internal clock, and tasks pile up behind contended locks. The Ferrari and the skateboard make it through rush hour traffic at the same speed.
Thank you for your question!

Yes, I try to make the distinction between prediction, which looks at historical data, and estimation, which asks the engineers how long it will take.