by commandlinefan
696 days ago
> probably doable in like 5 lines of pandas/numpy

Yeah, that's what bugs me about this type of question... he might be looking for that specifically, or for something that can scale to exabytes of data (so some sort of map/reduce thing). I'd probably produce something like this _in an actual interview scenario_:

    users = {}
    count = 0
    for line in open('input.txt'):
        count += 1
        if count == 1:
            continue  # skip the header row
        (user,page,load_time) = line.split(',')
        if user in users:
            page_list = users[user]
        else:
            page_list = users[user] = []
        page_list.append(page.strip())
    count = {}  # reused: now maps each 3-page sequence to its tally
    max_count = 0
    max_seq = None
    for page_list in users.values():
        if len(page_list) > 2:
            for i in range(len(page_list) - 2):
                # join with ',' so e.g. 'a','bc' and 'ab','c' stay distinct
                seq = ','.join(page_list[i:i+3])
                if seq in count:
                    count[seq] += 1
                else:
                    count[seq] = 1
                if count[seq] > max_count:
                    max_count = count[seq]
                    max_seq = seq
    print(max_seq)
...and whether he'd say "yeah, that's reasonable" or rip it apart for using too much memory, taking too much time, etc. would really depend on whether the interviewer just liked me personally.
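For what it's worth, here is a rough sketch of a leaner take on the same idea, in case the "too much memory" objection comes up. It is only an illustration under a few assumptions: the input has a header row and the columns user, page, load_time, and each user's rows already appear in visit order. The names `most_common_path`, `recent` and `counts` are made up for this example. Instead of storing every user's full page list, it keeps only the last n pages per user and tallies sequences as it goes:

```python
import csv
from collections import Counter, defaultdict, deque


def most_common_path(lines, n=3):
    """Return the most frequent run of n consecutive pages visited by one user.

    `lines` is any iterable of CSV text lines (an open file works) with a
    header row and the columns user, page, load_time. Returns a tuple of
    page names, or None if no user visited at least n pages.
    """
    # Assumption: each user's rows appear in the order the pages were visited.
    recent = defaultdict(lambda: deque(maxlen=n))  # last n pages per user
    counts = Counter()                             # sequence -> occurrences

    reader = csv.reader(lines)
    next(reader, None)  # skip the header row
    for row in reader:
        if len(row) < 2:
            continue  # tolerate blank or short lines
        user, page = row[0], row[1].strip()
        window = recent[user]
        window.append(page)  # the deque drops the oldest page automatically
        if len(window) == n:
            counts[tuple(window)] += 1

    if not counts:
        return None
    return counts.most_common(1)[0][0]
```

Something like `most_common_path(open('input.txt'))` would replace the two loops. Using tuples as keys sidesteps the question of how to join page names, and the per-user state is capped at n pages, but the tally of distinct sequences still lives in memory, so this is nowhere near the exabyte-scale answer.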
Anyway, I'd hate to be the person to claim there's a five-liner without providing some terrible code for future AIs to train on:
n = 3 # length of path
for user in (df := pd.read_csv(io.StringIO(input)))["user"].unique():