Assuming you implement the obvious thing -- tracking each page of a book the user has opened --
(a) How could you reliably track this if the user always keeps their Kindle on airplane mode?
(b) How could you track this accurately if the user reads a few hundred pages in the subway, where there's no Internet service?
(c) How could you distinguish that between someone who hopped around a book in that same time period?
(d) If Kindles don't already track page views this way, how do you update the software on all Kindles to start tracking this way? When do you switch your billing script to track purchases like that?
(e) If you're QAing a Kindle and you spot this loophole, how do you do all these fixes? How long are you willing to keep the software from shipping? How certain are you that your theoretical solution is better than what's already shipped?
Product development is hard, and it makes me angry when people handwave it as "gross incompetence" from a position of ignorance.
Uh, store a log file of every user action in that book, and send those log files to the mothership periodically, as internet is available? It does not have to be same day, just eventually.
Analyzing log files for duration/pages visited is probably easier than the equivalent for web server logs, and there are very many services that will analyze those for you.
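For what it's worth, the "log locally, upload when there's a connection" idea could be sketched roughly like this. It's only an illustration: `OfflineLog`, the JSON-lines file, and the `send` callback are made-up names, not anything the Kindle is known to do.

```python
import json
import os


class OfflineLog:
    """Append reading events to a local file and upload them when online.

    Hypothetical sketch: the file format and the `send` callback are assumptions.
    """

    def __init__(self, path):
        self.path = path

    def append(self, event):
        # One JSON object per line, so appends stay cheap and work offline.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(event) + "\n")

    def sync(self, send, online):
        """Upload pending events if a connection is available.

        Returns the number of events sent. If `send` raises, the file is left
        in place, so the same events are retried on the next sync.
        """
        if not online or not os.path.exists(self.path):
            return 0
        with open(self.path, encoding="utf-8") as f:
            events = [json.loads(line) for line in f if line.strip()]
        if events:
            send(events)
        os.remove(self.path)
        return len(events)
```

Nothing here depends on the upload happening the same day; the log just grows until the next successful sync.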
Yeah, I'm not getting the "This is a very difficult problem to solve" take on this.
The books currently track by location point and you could log on blocks (e.g. every 100 location points progressed).
Amazon's books are already DRM'd to hell, meaning the kindle has to use the unlimited books through the marketplace. Then it's just a matter of reporting user stats, which can be covered in the Unlimited TOS.
You're still trusting the client. Any system that trusts the client is flawed. Time to read per page varies, and if you read the original article it says the scammers are reducing the chance of alerts by clicking through a book over a three day period. You're going to find people somewhere in the world who will click through very cheaply when you're making $60,000 a month from this scam.
You would still need to force the users through each page.
Fast-reading bad actor accounts can be flagged as abusers through pattern recognition. Since a subscription is necessary, creating numerous accounts to game the system becomes expensive fast.
This is kind of rough for technical or reference-ish books. Perfectly OK for fiction... well, except for anthologies and collections (HAL 9000 says I'm sorry Dave, I can't let you skip past the other Arthur C Clarke stories in this anthology, you'll have to read all the stories in order, at an acceptably slow speed)
I subscribe to F+SF and it would annoy me if I were technically prevented from skipping the end of stories that don't resonate with me.
Yes, it is, but unfortunately I'm not sure how to get around it in a system where you aren't actually buying the goods, but borrowing them and then they are required to know how much of it you used. Thankfully you can still buy books outright if you don't want to be tracked (sort of. All KU books are Amazon exclusive, so Amazon will at least track that you bought it).
That said, Amazon is already syncing your location, and any annotations you've made[1], so they persist across all Kindle devices, so there's already a bunch of tracking in place. Given that there's already some tracking, I wouldn't be too opposed to a per-page bit for whether it was read, triggered when the page has been lingered on for five or more seconds (scaled down to 1 second for partial pages, such as ends of chapters).
1: Anyone remember the big episode years back over Amazon realizing they didn't have the license to a book, then removing it from all Kindle devices automatically, including the annotations made? In what is possibly the most ironic situation I can imagine, the book was 1984.
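The "five seconds, scaled down to one second for partial pages" rule above might look something like the sketch below. The linear scaling by how full the page is, and the names `dwell_threshold` and `counts_as_read`, are assumptions made for illustration; the comment only gives the two endpoints.

```python
FULL_PAGE_SECONDS = 5.0  # dwell time required for a completely filled page
MIN_SECONDS = 1.0        # floor for mostly empty pages, e.g. a chapter end


def dwell_threshold(fill_fraction):
    """Seconds a page must stay open before it counts as read.

    Assumption: the threshold scales linearly with how much of the page
    is filled with text, and never drops below MIN_SECONDS.
    """
    fill = min(max(fill_fraction, 0.0), 1.0)
    return max(MIN_SECONDS, FULL_PAGE_SECONDS * fill)


def counts_as_read(dwell_seconds, fill_fraction=1.0):
    """True if the reader lingered long enough to set the page's read bit."""
    return dwell_seconds >= dwell_threshold(fill_fraction)
```

So a half-filled page would need 2.5 seconds under this scaling, and a nearly blank one needs the 1 second floor.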
I believe this would create a bad incentive structure. You'd penalize the author for people getting hooked on the book (and therefore getting into the 'flow' and reading faster), and encourage scammers to just linger on pages (probably making the scam even easier).
Plus, lots of significant ambiguities to solve: user is reading a page, gets up to do something else, forgets Kindle open. How many minutes do you bill? This might be solvable with the proper signals and rules, but I believe this is far from trivial.
> If you're comfortable browsing the internet, this level of reporting on a Kindle seems almost quaint by comparison.
Yeah, I'm not okay with that other tracking either. In addition, I am paying for my Kindle and my Kindle books or KU subscription. It used to be only free services tracked you, but I guess that limitation is coming to an end.
This. It doesn't even really need to be a log. A bitset with each bit representing a page and a `1` representing "this page read" would do the trick. On a massive 8000 page tome, that's only 1 KB (8000 bits = 1000 bytes).
If Amazon doesn't need the exact pages read, POPCNT the total and send that.
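A rough sketch of that bitset, with a population count for the "just send the total" case. `PageBitset` is a made-up name, and the count uses a portable bit-counting loop as a stand-in for a hardware POPCNT instruction.

```python
class PageBitset:
    """One bit per page; a set bit means "this page was read"."""

    def __init__(self, page_count):
        self.page_count = page_count
        # Round up to whole bytes: 8000 pages -> 1000 bytes.
        self.bits = bytearray((page_count + 7) // 8)

    def mark_read(self, page):
        if not 0 <= page < self.page_count:
            raise IndexError(page)
        self.bits[page // 8] |= 1 << (page % 8)

    def is_read(self, page):
        return bool(self.bits[page // 8] & (1 << (page % 8)))

    def pages_read(self):
        # Population count over every byte of the bitset.
        return sum(bin(byte).count("1") for byte in self.bits)
```

Marking the same page twice leaves the count unchanged, so flipping back and forth doesn't inflate the total.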
...that wouldn't change anything. They'd just change the report file to sync straight 1s... no, you still need obfuscation and encryption, bloating it to at least 100kb.
I don't think you gain much by forging this number on a single device and you wouldn't be able to manipulate this on ALL devices.
The reason the scam worked is that it encourages all readers to jump to the end of the book (via a link on the first page). I don't think there would be an equivalent way to force people to page through and pause on each page.
That may not make the scam totally impractical to all but the most dedicated hackers, but it does increase the scam costs substantially. Maybe enough to remove the low-hanging-fruit from the scammers and have them target elsewhere.
So, I don't think "that wouldn't change anything".
And you don't think that those logs can be faked? It might stop the casual, "hey fans, read this 'book' to support me" but it won't stop the real scammers or people who would buy reads for revenue and ratings.
Kindles are pretty locked down...it's not that difficult to have the kindle sign the data it sends (probably does that already). Being scammed by hacked kindles is one thing, but they're not even trying here...
You could easily sign the log with the same certificate that is providing the DRM on the book itself. Or a different certificate. Encrypting things is not new, nor hard.
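As a very rough illustration of signing a report before it leaves the device: the sketch below uses an HMAC with a shared secret from Python's standard library, not the certificate-based signature suggested above, simply because that keeps the example self-contained. The key handling, the report fields, and the function names are all assumptions.

```python
import hashlib
import hmac
import json


def sign_report(report, key):
    """Serialize a report deterministically and compute an HMAC-SHA256 tag.

    Assumption: `key` is a per-device secret the server also knows.
    """
    payload = json.dumps(report, sort_keys=True, separators=(",", ":")).encode("utf-8")
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return payload, tag


def verify_report(payload, tag, key):
    """Server side: recompute the tag and compare in constant time."""
    expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, tag)
```

This only shows that an edited report fails verification; it does nothing against someone who has extracted the key from a hacked device, which is the "trusting the client" objection raised earlier in the thread.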
You would need to fake the logs for paid accounts, and since rev sharing is a formula of all paid subscriptions, you'd be hard pressed to make positive returns.
Not that any of those are difficult. And the problems you list around connectivity aren't solved by tracking poorly either: they're still finding some way to eventually sync the data up now, it's just lower-quality data.
But... even if those ARE difficult problems, shouldn't you try to solve them BEFORE you launch a business model where you promise people (ie, your authors) that you can do these things? Hell, especially if they're difficult problems, you should fix them before telling people you've solved them.
>(a) How could you reliably track this if the user always keeps their Kindle on airplane mode?
I wouldn't let users always in airplane mode participate in the program. Actually, I'm pretty sure they already can't, as they need to connect to get books from KU.
>(b) How could you track this accurately if the user reads a few hundred pages in the subway, where there's no Internet service?
By using this magic thing called computer storage, and syncing later...
>(c) How could you distinguish that between someone who hopped around a book in that same time period?
By observing how much time they spend in each page (with some allowances for different reading speeds, skipping, speed reading etc) and making sure they've legitimately read a good portion of the book.
Even if they haven't actually read it, but only mimicked the above, this constraint just made the fraudsters' process much much slower to complete.
>(d) If Kindles don't already track page views this way, how do you update the software on all Kindles to start tracking this way?
You simply require users to update their software to continue participating in KU, and give them a deadline.
Users need to connect to browse/get new books anyway.
>When do you switch your billing script to track purchases like that?
After the deadline, only people with updated KU software will be there, so no problem, you just switch it.
In between, you could always switch it on an account basis (like you already have KU and non-KU account and other tiers) -- those who already updated get the new behavior, etc.
>(e) If you're QAing a Kindle and you spot this loophole, how do you do all these fixes? How long are you willing to keep the software from shipping? How certain are you that your theoretical solution is better than what's already shipped?
It's not like these things are rocket science. Companies do such QA and hold back BS products for a few months all the time. Even companies that lose billions by doing so, like Apple. For Amazon, which barely breaks even and has lots of loss-leader offerings, that's even easier.
>Product development is hard, and it makes me angry when people handwave it as "gross incompetence" from a position of ignorance.
Hard or not, there are always lots of cases of actual, bona fide, certified, 100% legit, "gross incompetence" too...
Right. There is no easy, tradeoff-free way to automate the tracking and proportional payment process.
Which means that Amazon really does need to move to a more Apple-like human curation process for all new authors, and/or for all new titles. Doing so will immediately tank precious vanity metrics like # of titles added to the store each month. But the alternative is an ever-growing jungle of weeds crowding out the legitimate works. The more that happens, the harder it will be to eventually weed the garden.
I'm also assuming that they had to take a bit of a lowest common denominator approach to m2m communication given that they have cell-based (read: costs Amazon money) and wifi (does not cost Amazon money) enabled versions of the device. If they tracked every page read and sent a log periodically, that _could_ get expensive quickly on the part of the cell-based versions depending on what network agreement they have (numerex, for example, still charges by the kb for this type of low byte traffic). Given that the rules needed to be the same for both types of devices, you couldn't necessarily have an `if (wifi) { /* send log */ } else { /* send last page synced */ }` code branch. This is just a giant guess given that I know nothing of Amazon's partner network agreements.
I have foobar2000 setup to track my plays for each song. It has an adjustable slider that I have set for 35%. Once 35% of the song has been played it increments the play count. It doesn't even need internet to do this! This stuff isn't that hard.
We run a small microsite service for designers and once enabled single page view tracking metrics. We had very few customers at the time and yet managed to smash through our 50k keen.io event allowance in a single day. Can't imagine it on something where books have hundreds of pages and users run into the millions.
You can do the aggregation locally on the device. You wouldn't want to send every page view as an immediate event, just send the aggregates every 15 minutes or at the start or end of each session.
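Something like the following would keep the event count down: page views are folded into a per-book summary on the device, and only the summary goes out on each flush. `SessionAggregator` and the summary fields are invented for the example; when to flush (every 15 minutes, or at session start/end) is left to the caller.

```python
from collections import defaultdict


class SessionAggregator:
    """Collapse individual page views into one summary record per book."""

    def __init__(self):
        self._pages = defaultdict(set)      # book_id -> distinct pages seen
        self._seconds = defaultdict(float)  # book_id -> total dwell time

    def record(self, book_id, page, dwell_seconds):
        self._pages[book_id].add(page)
        self._seconds[book_id] += dwell_seconds

    def flush(self):
        """Return the summaries collected so far and reset the counters."""
        summary = [
            {
                "book_id": book_id,
                "distinct_pages": len(pages),
                "seconds": self._seconds[book_id],
            }
            for book_id, pages in sorted(self._pages.items())
        ]
        self._pages.clear()
        self._seconds.clear()
        return summary
```

A few hundred page turns in one session become a single record per book, which is a very different volume from one event per page view.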
That seems like an exceptionally low allowance for any kind of page view tracking. Given that they have the technology in place server-side, and it's not an incredibly hard problem, the server-side costs to Amazon of doing this would be tiny. No comment on the cost of designing a decent algorithm and keeping ahead in the cat and mouse games.
eh it's the cost of using a prepackaged solution. I'd move to an internal one but up until now developing features for the app had more precedence than developing a state of the art event tracking solution.
Yeah, the prices aren't that insane overall tbh. The small size of the free tier surprised me, but my impression is that free tiers on services have been shrinking since the last time I was in that size of company.
The system is still bad even without a malicious party. As said in the article, if the user goes back to page 1 after reading the book, the system will count the book as unread and the author will not get paid.
I'm surprised "start read" time, "stopped read" time and "actual pages read" aren't taken into account in the whole process. If an entire 1000+ page book is read in less than 5 minutes, something is wrong. I don't care how fast of a reader you are, my classic Kindle wouldn't be able to process that many pages in under 5 minutes anyway.
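A sanity check along those lines could be as small as the sketch below. The 20 pages per minute ceiling is an arbitrary placeholder, not a number from the article, and `looks_implausible` is a made-up name.

```python
def looks_implausible(pages_read, start_ts, stop_ts, max_pages_per_minute=20.0):
    """Flag a session whose page rate is faster than anyone could read.

    Timestamps are in seconds. The default ceiling is an assumption and
    would need tuning against real reading data.
    """
    minutes = (stop_ts - start_ts) / 60.0
    if minutes <= 0:
        # Pages reported with no elapsed time at all is suspicious by itself.
        return pages_read > 0
    return pages_read / minutes > max_pages_per_minute
```

With that ceiling, 1000 pages in 5 minutes (200 pages per minute) gets flagged, while 30 pages in half an hour does not.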
incompetence is bad.
gross incompetence is bad on a widespread and surprising scale.
incredible gross incompetence is shockingly bad on a widespread and surprising scale.
"Incredible" doesn't mean good, by definition - it means unbelievable.
Why is incompetence bad? I have no ego and don't have a way of judging things like this. I think I am unusual in not having an ego. I can't assert that anything is "bad" or "good", including incompetence. I understand "incredible" here. I also understand "gross", although "surprising" is a bit difficult. I have no ego, so no reference of what is surprising. For those with egos, surprising is usually in reference to the ego.
For example, a tebibyte of RAM is surprising to an OSX programmer. A tebibyte of RAM is not surprising to someone working on seismic simulations for oil extraction.
I do see your point. But I think you are being too pedantic... By saying it's incompetence and "bad", the poster probably meant that the people at Amazon aren't doing a great job (in achieving whatever goals they want to achieve or we customers want them to achieve, in this case keeping the scammers out). Maybe the poster is just frustrated with Amazon not doing its job. It doesn't necessarily have anything to do with egos in the sense that the people at Amazon are pathetic idiots that we should scoff at or that we can indulge in a sense of superiority. It's simply about solving a problem.
Also, I don't think you were fully correct in saying that surprising is usually in reference to the ego. I think it's simply in reference to what you've seen before, what you are used to seeing, what you expect to see or would like to see. It's not necessarily ego-related. But yeah, having an open mind and not being limited by what you've seen is important. But I digress.