The coolest instance of MILP I have encountered "in the wild" at work was a disa...

_Algernon_ · on May 7, 2022

Was the purpose of aggregating the data to "anonymize" it? If so, this just illustrates how useless anonymizing data is.

lrem · on May 7, 2022

Yup, k-anonymity is hard. The state of the art is rapidly evolving, requiring higher and higher k (i.e. coarser, less useful reports). Most companies out there prefer to, um, not notice that.
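
To make the k versus usefulness trade-off concrete, here is a minimal sketch. The records, the column names, and the `coarsen` helper are all hypothetical, made up for illustration. It treats k as the size of the smallest group of rows that share the same quasi-identifier values, and shows that coarsening the columns raises k while throwing away detail.

```python
from collections import Counter

def k_anonymity(rows, quasi_identifiers):
    """Size of the smallest group of rows sharing the same quasi-identifier values."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(groups.values())

# Hypothetical records, invented for this example.
rows = [
    {"zip": "94103", "age": 34},
    {"zip": "94103", "age": 36},
    {"zip": "94107", "age": 35},
    {"zip": "94107", "age": 38},
]

def coarsen(row):
    """Hypothetical generalization: keep 3 zip digits, bucket age by decade."""
    return {"zip": row["zip"][:3] + "**", "age": row["age"] // 10 * 10}

# Fine-grained data: every row is unique on (zip, age).
k_fine = k_anonymity(rows, ["zip", "age"])

# Coarsened data: all rows collapse into one (zip prefix, decade) group.
k_coarse = k_anonymity([coarsen(r) for r in rows], ["zip", "age"])

print(k_fine, k_coarse)
```

The coarsened report is safer by this measure, but it no longer distinguishes the two zip codes or the individual ages, which is the "less useful" part.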

ironSkillet · on May 7, 2022

Yes, and I mostly agree. Most companies just randomize an ID or do some group-bys and consider the data "anonymized", but the result is susceptible to a variety of correlation techniques (such as setting up this MILP problem). And the more data sets you have with the same underlying masked primary key (e.g. person), the easier it gets. I have seen these weak attempts at hiding information in multiple industries. Just for the record, in my use case there was never an intent to map data to individuals, just to get the most granular information possible about key metrics.
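
As a toy illustration of why group-bys alone do not hide much, here is a rough sketch. The entities, the report totals, and the `max_value` bound are hypothetical. It uses plain brute-force enumeration as a stand-in for a real MILP solver, and it is not the formulation from my actual use case. Each report gives the total of a hidden non-negative integer metric over a group of entities, and overlapping groups can pin down the individual values.

```python
from itertools import product

# Hypothetical published aggregates: (group of entities, total of the hidden metric).
reports = [
    ({"a", "b"}, 7),
    ({"b", "c"}, 5),
    ({"a", "c"}, 6),
]
entities = ["a", "b", "c"]
max_value = 10  # assumed upper bound on any single entity's value

def consistent_assignments(entities, reports, max_value):
    """Enumerate every integer assignment that reproduces all published totals."""
    solutions = []
    for values in product(range(max_value + 1), repeat=len(entities)):
        candidate = dict(zip(entities, values))
        if all(sum(candidate[e] for e in group) == total
               for group, total in reports):
            solutions.append(candidate)
    return solutions

solutions = consistent_assignments(entities, reports, max_value)
print(solutions)
```

If only one assignment survives, the aggregates have leaked the per-entity values exactly. Enumeration blows up quickly with more entities, which is where an integer programming solver earns its keep.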

webcomthrowaway · on May 7, 2022

Hello, would this have been on vendor reports from a certain large online advertising and search company?

ironSkillet · on May 7, 2022

Ha yes, I probably know you.