Hacker News new | ask | show | jobs
by hatsix 3496 days ago
Parse this file and tell me the total of all $ amounts, and the line where we've reached 50% of the total $.

How do you approach this problem and break it down.

I've seen problems like this solved by looping through the file twice, because they answer one question at a time...

But I'm hoping there is something out there that can teach the general process of breaking down the problem, identifying loops, identifying generalizable code, and anything else that needs to happen before your code can work.

1 comments

I have no idea if this works (it did for 3 test cases), and I still end up reading through the array (but not the file) twice.

  # assuming file with $100\n$200\n$23\n$50 etc.
  linenum_vals = []
  with open('numbers.txt') as input_file:
      for i, line in enumerate(input_file):
         try:
             linenum_vals.append([i, float(line[:-1][1:])])
         except ValueError:
             print("Not a number value on line", i)
  
  total = 0
  for i in linenum_vals:
      total += i[1]
  
  half = total/2.0
  
  for p in range(len(linenum_vals)):
      if linenum_vals[p][1] == half:
          print("Half of total found on line", linenum_vals[p][0])
          break
      else:
          if linenum_vals[p-1][1] < half and linenum_vals[p][1] > half:
              print("Half of total in between lines", p-1, p)
              break
          else:
              if p+1 != len(linenum_vals):
                  if linenum_vals[p][1] < half and linenum_vals[p+1][1] > half:
                      print("Half of total in between lines", p, p+1)
                      break
          print("Half of total wasn't found in the file.") # probably because the list of values isn't in order
          
What way would you do it? I feel like I'm missing something.
I would probably keep a running total as I read through each line, then store `line_number: running_total` in a dictionary.

At the end, return the highest key in the dictionary for your total, and then run a reduce that returns the first key/value pair that is >= half the total

Thinking a little outside of the box - but if you could sort the lines by amount in ascending order, then total from both ends, take half after each addition, then shift a pointer to where "about half" would be based on where you were in the totaling (and which end) as you go.

I don't have any specific implementation - or if this idea would even work - but I do think sorting the lines first might be of help toward a better solution (I could be completely wrong on this). As well as tackling the totaling and checking from both ends of the list.

You'd have to keep track of everything, of course, in order to know at the end where in the -real list- (unsorted) the half-way total point is at. If this idea has any merit at least.