by andymatuschak 1996 days ago
(author of parent article here)

This is very interesting! Can you say more about the task you perform when you see these prompts? How do you decide how to "grade" your response? Presumably it's not just whether or not you can remember the deleted word(s)?

Also: are you blanking out individual words? Phrases? Sentences? An example would be interesting if you're willing to share!

Edit: Though my examples below are all fiction, that's half or less of my current reading list. I also have philosophy, research papers, long-form academic books, popular science books, and the like.

-----

For reference, here's the next new card for one of the books I'm reading:

  Bleak House
  Charles Dickens
  CHAPTER III
  ---------------------------------------------------------------------
  coach gave me a terrible start.
  
  It said, "What the de-vil are you crying for?"
  
  I was so frightened that I lost my voice and could only answer in a
  whisper, "Me, sir?" For of course I knew it must have been the
  gentleman in the quantity of wrappings, though he was still looking
  out of his window.
  
  "Yes, you," he said, turning round.
  
  "I didn't know I was crying, sir," I faltered.
  
  "But you are!" said the gentleman. "Look here!" He came quite
  opposite to me from the other corner of the coach, brushed one of his
  [...] furry cuffs across my eyes (but without hurting me), and showed
  me that it was wet.
  
  "There! Now you know you are," he said. "Don't you?"
  
  "Yes, sir," I said.
  
  "And what are you crying for?" said the gentleman, "Don't you want to
  go there?"

(The prior cards include the text up to and including the line starting "But you are!")

My response, having never seen the passage before, was "coat's"; the correct word is "large". The substitution doesn't meaningfully alter the passage: The only thing we know about the gentleman at this point is that he's a stranger whom Esther has just met in the coach, wearing a large coat (cloak?). This is the first time he speaks in the book.

I selected "again" because this is a new card; if this had been a review, I might have deemed it acceptable. The bar for similarity varies depending on the book the passage came from, my mood on any particular day, and how much I care about the contents of the passage. The Rime of the Ancient Mariner, being poetry, doesn't get any latitude: I must remember the word exactly. Some of the tedious battle descriptions in Le Morte d'Arthur get a pass no matter how badly they go wrong.

If a card is causing me trouble, there are a few corrective actions I can take: In some instances I'll edit the card to blank out a more meaningful word than the one randomly selected, which makes for an easier and more engaging review task. If I can't figure out the blanked word from context, I'll look it up in a dictionary; this happens most often with my second-language texts. And in extreme cases, I have enough coverage of each book that permanently dropping a card or two won't cause problems.

The idea for this came from Taylor's 1953 paper[1] that introduced the concept of cloze deletions. They were originally envisioned as a readability score. Instead of calculating statistics about the words used, as traditional methods do, Taylor proposed to blank out every Nth word and measure how many could be guessed correctly by a reader. Readability and comprehension are two sides of the same coin: both rely on observing the relationship between text and reader. If the measurement is used to evaluate the text, it is called readability; if it is used to evaluate the reader, comprehension.

[1] https://www.gwern.net/docs/psychology/writing/1953-taylor.pd...
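
The every-Nth-word procedure described above can be sketched in a few lines of Python. This is only an illustration, not code from the paper: the function names, the default of n=5, and the exact-match scoring rule (case-insensitive, ignoring surrounding punctuation) are all assumptions.

```python
BLANK = "[...]"
PUNCT = ".,;:!?\"'()"

def make_cloze(text, n=5):
    """Blank out every n-th word; return the cloze text and the deleted words."""
    words = text.split()
    answers = []
    for i in range(n - 1, len(words), n):
        answers.append(words[i])
        words[i] = BLANK
    return " ".join(words), answers

def cloze_score(answers, guesses):
    """Fraction of blanks where the guess matches the deleted word.

    Assumption: a guess counts only if it equals the original word,
    ignoring case and surrounding punctuation.
    """
    if not answers:
        return 0.0
    hits = sum(
        a.strip(PUNCT).lower() == g.strip(PUNCT).lower()
        for a, g in zip(answers, guesses)
    )
    return hits / len(answers)
```

A higher score across many readers would suggest a more readable text; a higher score for one reader across many texts would suggest better comprehension.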

I'm intrigued. I've read your comments in this thread, and I can't comprehend how you decide what to cloze on. Is this something that's decided by your script? Do you have multiple clozes per note? Why not cloze on "wrappings"?

If you have a blog post describing your process, I'm sure I, and many others, would love to read it.

I have so many questions, I am intrigued! Why are you doing this? Are you reviewing whole books in this fashion, as in the whole text? How long does that take? Do you read those books first? Are you reviewing in the order the text was written? How much time do you spend reviewing?

First, the mechanics: I'm reviewing entire books that I haven't read before; I have a Python script that will split up a text file and produce a CSV file for Anki to import. New cards are set to come up in the original order, but reviews follow Anki's scheduling algorithm. Because I only have 1-2 new cards per book per day, the effect is similar to having several bookmarks: One is the most current, the second trails the first by a week, and the third by a month, etc.
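
For readers wondering what such a script might look like, here is a minimal sketch. It is not the script described above; the function names, the window sizes (`step` new lines per card, `context` lines on either side), the random choice of word, and the three-column CSV layout are all assumptions made for illustration.

```python
import csv
import random

BLANK = "[...]"

def make_cards(lines, step=3, context=10, seed=0):
    """Return (passage, answer) pairs; each card blanks one word in `step` new lines."""
    rng = random.Random(seed)
    cards = []
    for start in range(0, len(lines), step):
        target = range(start, min(start + step, len(lines)))
        # (line index, word index) for every word in this card's new lines
        slots = [(i, j) for i in target for j, _ in enumerate(lines[i].split())]
        if not slots:
            continue  # nothing to blank, e.g. only empty lines
        i, j = rng.choice(slots)
        words = lines[i].split()
        answer, words[j] = words[j], BLANK
        # Surround the blanked line with context from before and after
        lo = max(0, start - context)
        hi = min(len(lines), start + step + context)
        passage = lines[lo:i] + [" ".join(words)] + lines[i + 1:hi]
        cards.append(("\n".join(passage), answer))
    return cards

def write_csv(cards, title, out_file):
    """Write one row per card: title, passage with the blank, deleted word."""
    writer = csv.writer(out_file)
    for passage, answer in cards:
        writer.writerow([title, passage, answer])

# Hypothetical usage:
#   with open("book.txt") as f, open("book.csv", "w", newline="") as out:
#       write_csv(make_cards(f.read().splitlines()), "Bleak House", out)
```

Because the rows are written in text order, importing them in sequence would give new cards that follow the book from start to finish.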

A "normal" length book, like a genre fiction or popular science book, generally produces about 1000 cards like the one above. At a pace of 1 new card per day, that's about a three-year commitment. It's a lot easier to have more books going than to increase the pace on any single book; the brain likes variety. I've got about 20 books going at once, which is around an hour of reading every day, including both reviews and new material. Overall, that works out to an average pace of 1 "standard" book every two months.
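
The arithmetic behind those figures can be checked with a quick back-of-the-envelope calculation. It assumes every book is the "standard" 1000 cards and that all 20 books advance at exactly 1 new card per day, which is a simplification.

```python
cards_per_book = 1000
new_cards_per_book_per_day = 1
books_in_progress = 20

# Time to finish a single book at one new card per day
days_per_book = cards_per_book / new_cards_per_book_per_day
years_per_book = days_per_book / 365

# Combined pace across all books, measured in "standard" books
new_cards_per_day = books_in_progress * new_cards_per_book_per_day
days_per_standard_book = cards_per_book / new_cards_per_day

print(f"{years_per_book:.1f} years per book")          # 2.7 years per book
print(f"{days_per_standard_book:.0f} days per book")   # 50 days per book
```

Fifty days is a little under two months, which is consistent with the rough pace quoted above.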

As for why, it started as an experiment to fix several problems I was having at the same time: My preexisting Anki deck was running dry, but still contained items that I wanted to keep reviewing; I needed a source of low-effort cards to keep the review habit going. I also had a long list of books that I should read someday but that day never seemed to be getting any closer; I decided to force the issue.

And finally, I had been unable to figure out how to make flashcards for literature at all. What series of questions / prompts can you write that captures the essence of something like this:

  It is an ancient Mariner,
  And he stoppeth one of three.
  'By thy long grey beard and glittering eye,
  Now wherefore stopp'st thou me?

  The Bridegroom's doors are opened wide,
  And I am next of kin;
  The guests are met, the feast is set:
  May'st hear the merry din.'

  He holds him with his skinny hand,
  'There was a ship,' quoth he.
  'Hold off! unhand me, grey-beard loon!'
  Eftsoons his hand dropt he.

  He holds him with his glittering eye—
  The Wedding-Guest stood still,
  And listens like a three years' child:
  The Mariner hath his will.

  The Wedding-Guest sat on a stone:
  He cannot choose but hear;
  And thus spake on that ancient man,
  The bright-eyed Mariner.

  (...)

Short of memorizing the poem word for word, it's hard to imagine any set of prompts that would adequately capture this: The events are almost incidental compared to the cadence and sound of the words. Or, using the Bleak House passage from above: If this gentleman becomes a recurring character in the book, I'm bound to learn more about his personality as the story progresses. If he exhibits some small behavior in this scene that foreshadows his character development later, how am I to notice it? If I have simply noted that he and Esther first meet on this trip and never revisit the actual text, I'll not have the opportunity to make that connection.

Thank you for your detailed response. I'm really tempted to try this as I'm already in the habit of using Anki on a daily basis. Would you be willing to share your Python script?

If I may, how do you ensure you're only getting one or two cards per book? Do you create a deck per book, or is it all in one deck with some trickery in the review settings?

Additionally, it's really interesting to me that the reviewing is non-linear. What's the impact on your enjoyment of the books? And does this approach give you better retention of details?

Finally, is the length of text set, and how did you determine what it should be?

Sorry for the interrogation

This is quite striking. Thank you for sharing.