by Lazaruscv
264 days ago
This is super insightful, thank you for laying it out so clearly. Your point about the error surfacing way after it first occurred is exactly the sort of issue we’re interested in tackling. Foxglove is doing a great job with visualization and aggregation; what we’re thinking is more of a complementary diagnostic layer that:

• Correlates syslogs with mcap/bag file anomalies automatically
• Flags when a hardware failure might have begun (not just when it manifests)
• Surfaces probable root causes instead of leaving teams to manually chase timestamps

From your experience across 50+ clients, which do you think is the bigger timesink: data triage across multiple logs/files or interpreting what the signals actually mean once you’ve found them?
|
Maybe there could be value in signal interpretation for pure software engineers, but I reckon it would be hard for such a team to build robots.