by opportune
1065 days ago
I read the paper. I think it's fundamentally an exponential problem, but they apply some constraints to reduce it: only considering tuples of size n <= 3, and pruning. Personally I think the case where you're doing this on one column is not that interesting: for each column, get all values with frequency >= the support threshold, group by that column on old and new, and include a value in the results if its risk_ratio is over the threshold. List results in order of risk_ratio. Going from that to just two columns is much, much more computationally demanding, and that's where pruning and such really matters.
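To make the one-column case concrete, here is a rough sketch of what I mean. It is not the paper's implementation. The function name, the parameter names, and the default thresholds are all made up for illustration. I'm also assuming risk ratio means P(new | value) / P(new | not value), and that support is measured as the fraction of new rows carrying the value; the paper may define either differently.

```python
from collections import Counter


def single_column_diff(old_rows, new_rows, column,
                       min_support=0.1, min_risk_ratio=2.0):
    """Rank the values of one column by risk ratio (hypothetical helper).

    Assumptions: rows are dicts, support is the fraction of new rows with
    the value, and risk ratio is P(new | value) / P(new | not value).
    """
    old_counts = Counter(row[column] for row in old_rows)
    new_counts = Counter(row[column] for row in new_rows)
    n_old, n_new = len(old_rows), len(new_rows)

    results = []
    for value, new_with in new_counts.items():
        # Skip values that are too rare in the new data.
        if new_with / n_new < min_support:
            continue
        old_with = old_counts.get(value, 0)
        new_without = n_new - new_with
        old_without = n_old - old_with

        # Share of rows that are "new" among rows with / without the value.
        p_with = new_with / (new_with + old_with)
        rest = new_without + old_without
        p_without = new_without / rest if rest else 0.0

        ratio = p_with / p_without if p_without > 0 else float("inf")
        if ratio >= min_risk_ratio:
            results.append((value, ratio))

    # Highest risk ratio first.
    results.sort(key=lambda item: item[1], reverse=True)
    return results
```

This is two counting passes plus a loop over distinct values, so it's linear in the data. With two columns you'd be doing the same thing over pairs of values, which is where the candidate count blows up and pruning by support starts to earn its keep.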