Our traffic data is generated in-house using anonymized location telemetry from our SDKs, much like most other traffic providers. The live speed predictions we generate for congestion and ETA models are competitive globally, providing the most accurate ETAs available in many parts of the world. Our accuracy today is particularly strong in the US compared to similar services.
Not at this time. It's an awesome project, but we started developing our traffic engines around the same time and our needs quickly diverged. We do try to open-source many of the low-level pieces of our model as reusable libraries whenever possible, such as our graph normalization algorithms[1]. At the end of the day, much of the core is difficult to decouple from data engineering infrastructure, which will be internal to each organization, because of the immense volumes of live data that must be handled to power a modern traffic engine.
Aspects of the model like noise reduction, modality classification, and speed distribution estimation also require lots of fine-tuning that is specific to the telemetry being ingested and the usage patterns of the output. For example, our speed models learn from and correct for observed errors in various situations over time, which is coupled to our internal metrics data.
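To give a rough feel for what "learning from observed errors" can look like, here is a minimal sketch of a generic per-situation bias corrector. This is not our production model: the `BiasCorrector` class, the `alpha` smoothing factor, and the string context keys are all hypothetical names chosen for illustration, and the technique shown is a plain exponentially weighted average of prediction error.

```python
from collections import defaultdict


class BiasCorrector:
    """Tracks a running mean of prediction error per context and applies it.

    Illustrative only: the class, its parameters, and the context keys are
    assumptions, not part of any real traffic engine.
    """

    def __init__(self, alpha: float = 0.2):
        # alpha is a hypothetical smoothing factor; higher values react
        # faster to new errors but are noisier.
        self.alpha = alpha
        # context -> smoothed (observed - predicted) speed error in km/h
        self.bias = defaultdict(float)

    def correct(self, context: str, predicted_kph: float) -> float:
        """Return the raw prediction shifted by the learned bias, floored at 0."""
        return max(0.0, predicted_kph + self.bias[context])

    def observe(self, context: str, predicted_kph: float, observed_kph: float) -> None:
        """Fold one observed error (against the raw prediction) into the bias."""
        error = observed_kph - predicted_kph
        self.bias[context] += self.alpha * (error - self.bias[context])


if __name__ == "__main__":
    corrector = BiasCorrector(alpha=0.5)
    # The raw model predicted 100 km/h, but telemetry later showed 80 km/h.
    corrector.observe("rain:highway", predicted_kph=100.0, observed_kph=80.0)
    print(corrector.correct("rain:highway", 100.0))
```

The hard part in practice is everything around a loop like this: where the observed speeds come from, how situations are bucketed, and how the corrections are evaluated, which is exactly the piece that depends on internal metrics data.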