by mona_rakibe 1797 days ago
Telmai (https://www.telm.ai/) is a real-time data quality monitoring platform that automatically detects and investigates data quality issues as data is ingested. Our tool uses a statistical and ML engine that helps data product owners understand data anomalies and intuitively define correct versus incorrect data. These definitions are then used to proactively monitor and alert on data quality problems.

We have decades of experience with enterprise data and find that this approach towards data quality addresses a huge gap in data platforms. Detecting and investigating data quality issues is extremely tedious, time-consuming and expensive. Using Telmai, companies like Dun & Bradstreet and Myers-Holum are able to find and resolve such issues across millions of records in minutes. Ask us anything!

2 comments

Telm.ai (YC S21) - Real-time data quality monitoring

Looks interesting! I worked on https://github.com/capitalone/DataProfiler

We are looking to monitor correlation changes over time, see if sensitive data gets entered, track schema changes, etc., and see the impact on downstream modeling.

I'm curious how much input is required, because these systems usually take a lot of effort to set up. Any idea?

Thanks for your feedback and the link; it is indeed a very nice open source profiler. The complexity of the initial analysis of data in search of anomalies was one of the main drivers for us. Our approach is based on providing an interactive experience through which you can see the impact of various statistical distributions and ML suggestions, narrow down the important criteria, and explore the actual data associated with them. All of this helps in building much more accurate models of data correctness to apply to new data, and in doing so in much less time. However, as of now we don't do data classification; it's one of our future topics of interest.

Congrats on the launch!

I wanted to give some feedback on the pricing page: it's very vague, to the point that I don't know how much I will end up paying if I go over the 500k values (is that the free tier?).

Also, is this only for data monitoring, or can it be used for things like server monitoring too?

Thank you so much for this feedback. Regarding the pricing page, we will tighten it up and also add a calculator, so please stay tuned. For now, the way it works is that after the first 500K values we charge $150 per 1M attribute values per month.
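
As a rough illustration of that pricing, here is a minimal sketch. It assumes the first 500K values are free and that usage beyond that is prorated linearly at $150 per 1M values. The reply does not say whether billing is prorated or rounded up to whole millions, so treat the proration as an assumption; the function and constant names are hypothetical.

```python
FREE_VALUES = 500_000        # free allowance quoted in the reply
PRICE_PER_MILLION = 150.0    # USD per 1M attribute values per month


def estimate_monthly_cost(attribute_values: int) -> float:
    """Estimate the monthly cost in USD.

    Assumption: values beyond the free allowance are billed
    proportionally (not rounded up to whole millions).
    """
    if attribute_values < 0:
        raise ValueError("attribute_values must be non-negative")
    billable = max(0, attribute_values - FREE_VALUES)
    return billable / 1_000_000 * PRICE_PER_MILLION


print(estimate_monthly_cost(500_000))    # 0.0
print(estimate_monthly_cost(1_500_000))  # 150.0
print(estimate_monthly_cost(3_000_000))  # 375.0
```

Under these assumptions, a dataset of 3M attribute values per month would come to about $375.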

We don't monitor servers or any other infrastructure; we are designed for data quality monitoring. We can flag issues like missing data, volume drifts, and schema mismatches, and where we stand out is in monitoring the actual accuracy of data at the record-value level. For example, we can flag anomalous titles and emails, overrepresented phone numbers, etc.
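
To make two of those checks concrete, here is a minimal sketch of a missing-data rate and an overrepresented-value check on a single column. This is not Telmai's implementation; the function names, the sample data, and the `max_share` threshold are all hypothetical, chosen only to illustrate the idea with the Python standard library.

```python
from collections import Counter


def is_missing(value) -> bool:
    """Treat None and blank strings as missing (an illustrative rule)."""
    return value is None or str(value).strip() == ""


def missing_rate(values) -> float:
    """Fraction of entries in the column that are missing."""
    if not values:
        return 0.0
    return sum(1 for v in values if is_missing(v)) / len(values)


def overrepresented(values, max_share=0.05) -> dict:
    """Return {value: share} for non-missing values whose share of the
    non-missing entries exceeds max_share (a hypothetical threshold)."""
    present = [v for v in values if not is_missing(v)]
    if not present:
        return {}
    total = len(present)
    counts = Counter(present)
    return {v: c / total for v, c in counts.items() if c / total > max_share}


# Hypothetical phone-number column: one number repeated suspiciously often.
phones = ["555-0100"] * 6 + ["555-0101", "555-0102", None, ""]
print(missing_rate(phones))                    # 0.2
print(overrepresented(phones, max_share=0.5))  # {'555-0100': 0.75}
```

A fixed threshold like this is the simplest possible rule; the statistical and ML approach described in the post would presumably learn what counts as "overrepresented" from the data instead.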