Google today announced that it is open-sourcing its so-called differential privacy library, an internal tool the company uses to securely draw insights from datasets that contain the private and sensitive personal information of its users.

Differential privacy is a statistical approach to data analysis that allows someone relying on software-aided analysis to draw insights from massive datasets while protecting user privacy. It does so by blending real user data with artificial “white noise,” as explained by Wired’s Andy Greenberg. That way, the results of any analysis cannot be used to unmask individuals or allow a malicious third party to trace any one data point back to an identifiable source.
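To make the noise-injection idea concrete, here is a minimal sketch of one common differential privacy technique, the Laplace mechanism, applied to a simple count query. This is an illustrative example, not code from Google’s library; the function names and the choice of epsilon are assumptions for the sketch.

```python
import math
import random

def laplace_noise(scale):
    # Hypothetical helper: sample from a Laplace(0, scale) distribution
    # via the inverse-CDF method, using only the standard library.
    u = random.random() - 0.5
    sign = -1 if u < 0 else 1
    return -scale * sign * math.log(1 - 2 * abs(u))

def private_count(values, predicate, epsilon=1.0):
    # A count query has sensitivity 1: adding or removing one person
    # changes the true count by at most 1, so Laplace noise with
    # scale 1/epsilon gives epsilon-differential privacy.
    true_count = sum(1 for v in values if predicate(v))
    return true_count + laplace_noise(1.0 / epsilon)

# Example: count users over 30 without revealing any individual's age.
ages = [22, 35, 41, 29, 57]
noisy = private_count(ages, lambda age: age >= 30, epsilon=0.5)
```

Smaller values of epsilon add more noise and give stronger privacy at the cost of accuracy; an analyst sees only the noisy result, never the exact count.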

The technique is the bedrock of Apple’s approach to privacy-minded machine learning, for instance. It lets Apple extract data from iPhone users, statistically anonymize that data, and still draw useful insights that can help it improve, say, Siri.