| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by opportune 1065 days ago
	I read the paper. It’s fundamentally I think an exponential problem but they apply some constraints to reduce it: only considering tuples of size n <= 3, pruning. Personally I think the case where you’re doing this on one column is not that interesting: for each column, get all values with >= support frequency, group by it on old and new, include in results is risk_ratio over threshold. List results in order of risk_ratio. Going from that to just two columns is much much more computationally demanding and where pruning and such really matters.

1 comments

0cf8612b2e1e 1065 days ago

The article was giving me a market basket analysis vibe(maybe just the shared terminology), so possibly a similar underlying algorithm to do the pruning.