by rubicon33
71 days ago
I totally get that these are very hard problems to solve and that we're on the bleeding edge of what's possible, but I can't help but wonder when someone is going to crack real video understanding. Sure, maybe it's still frame-by-frame, but so fast and so often that the model retains a rolling context of what's going on and can cleanly answer temporal questions: "How many packages were delivered over the last hour?", etc.