by Jtsummers 1387 days ago
I'm not quite getting your categorization of the first approach as top-down and the second as bottom-up; they are the opposite to me.

Your program is going to have a flow like this:

  $ rcd2-awesome-finance-app september2022.csv
  Category    Amount
  Food        $100.00
  Clothes     $200.00
  ...
Internally it'll be like this:

  main(string filename)
    csv = open(filename)
    cat->$ = new dict()
    foreach line in csv
      transaction = parse(line)
      cat = categorize(transaction)
      amount = amount(transaction) // may just be t[2] not a function
      cat->$[cat] = cat->$[cat].or_default(0) + amount // or other logic for first time seeing it
    foreach (cat, amount) in cat->$
      print("{cat}\t{amount}\n")
So your second example is the top-down one: it requires all this logic to exist. That makes TDD harder, because going from a one-line CSV file with the amount and category hardcoded (first code is dumb code) to a second one-line CSV and actually categorizing is a massive jump.
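For concreteness, here is a rough Python sketch of the flow above. The column layout (date, description, amount, with the amount at index 2), the keyword rules in `RULES`, and the `summarize`/`report` helper names are all assumptions made up for illustration, not anything from your actual app.

```python
import csv
from collections import defaultdict
from decimal import Decimal

# Hypothetical keyword rules; a real app would need its own.
RULES = {"grocery": "Food", "restaurant": "Food", "shoes": "Clothes"}


def categorize(description):
    """Return a category for a transaction description (assumed rules)."""
    lowered = description.lower()
    for keyword, category in RULES.items():
        if keyword in lowered:
            return category
    return "Other"


def summarize(rows):
    """Sum amounts per category; each row is assumed to be (date, description, amount)."""
    totals = defaultdict(Decimal)  # missing categories start at Decimal("0")
    for _date, description, amount in rows:
        totals[categorize(description)] += Decimal(amount)
    return dict(totals)


def report(totals):
    """Format the totals as a tab-separated table, one category per line."""
    lines = ["Category\tAmount"]
    for category, amount in totals.items():
        lines.append(f"{category}\t${amount:.2f}")
    return "\n".join(lines)


def main(filename):
    """Read the CSV file and print the per-category report."""
    with open(filename, newline="") as f:
        print(report(summarize(csv.reader(f))))
```

Calling `main("september2022.csv")` would print the table. Only `categorize` has any real logic in it; everything else is plumbing.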

Categorizing is the core, so categorizing is the thing that really needs testing. The rest doesn't even need (substantial) testing anyway; it's a bog-standard CSV-driven tool. Which leads to a point I want to make:

You don't need to do TDD on everything to do TDD. TDD is the "show your steps" of programming (analogous to the high school algebra requirement that you show and name each step in a solution). Many people despise that in their math homework, but it's important for the teacher for the same reason TDD can be important to you, the programmer.

Math teachers don't ask students to show their steps just for grins (OK, some may, but they're poor teachers). They ask so they can observe the student's thought process. A student tasked with solving for x in `2x = 4` may come up with `x = 2`, but how? Did they divide by 2 or subtract 2? (I've done my share of math tutoring, and I have seen students subtract where they should have divided.) If they subtracted, they got the right answer by coincidence. So when they are given another problem where that coincidence doesn't work, they get the wrong answer, and in more complex problems how they went wrong is non-obvious. (In `3x = 9 => x = 6` it would be glaringly obvious, but with more terms and unobserved steps it's not; it's early and the coffee hasn't kicked in yet, so I can't think of good example problems to illustrate this.)

TDD does the same thing, but it's there for you, not an instructor, to observe what you are thinking along the way. Each test is supposed to be small so that you can easily see that the logic is correct. However, if you can see a larger step to take (like that structure I threw up at the top, or parts of it), then go for it. I've written I don't know how many CLI apps, and I would never use TDD for anything but that inner categorization logic (or its equivalent in them). It would be silly for me to write a test to see that I was reading the file correctly. I can think of only one reason to do it: you don't know the language and API yet, so you want to test your own knowledge, more than the program itself, to ensure you're calling it correctly. Even then, unless it has a weird file-reading API I wouldn't bother; 99% of them will be the same or similar enough.
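To make "small steps on the categorization logic" concrete, here is roughly what I have in mind, as a Python sketch. The `categorize` function and its keyword rules are hypothetical stand-ins; your real rules will differ. Each test is tiny and was (in spirit) written before the line of code that makes it pass: the first can be satisfied by returning a hardcoded "Food", the second forces an actual lookup, and so on.

```python
import unittest

# Hypothetical keyword rules, assumed purely for illustration.
RULES = {"grocery": "Food", "shoes": "Clothes"}


def categorize(description):
    """Map a transaction description to a category via keyword match."""
    lowered = description.lower()
    for keyword, category in RULES.items():
        if keyword in lowered:
            return category
    return "Other"


class CategorizeTest(unittest.TestCase):
    # Step 1: passes even if categorize() just returns "Food".
    def test_grocery_is_food(self):
        self.assertEqual(categorize("grocery store"), "Food")

    # Step 2: a second category forces a real lookup.
    def test_shoes_are_clothes(self):
        self.assertEqual(categorize("shoes outlet"), "Clothes")

    # Step 3: bank exports are rarely consistent about case.
    def test_matching_ignores_case(self):
        self.assertEqual(categorize("GROCERY STORE"), "Food")

    # Step 4: anything unrecognized needs a defined fallback.
    def test_unknown_falls_back_to_other(self):
        self.assertEqual(categorize("mystery charge"), "Other")
```

Saved to a file, these run with `python -m unittest`. Note there is no test for opening or parsing the file.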

1 comments

Interesting point of view, thanks.