Privacy-Preserving Social Media Data Publishing for Personalized Ranking-Based Recommendation

Abstract:

Personalized recommendation is crucial to help users find pertinent information. It often relies on a large collection of user data, in particular users’ online activity (e.g., tagging/rating/checking-in) on social media, to mine user preferences. However, releasing such user activity data makes users vulnerable to inference attacks, as private data (e.g., gender) can often be inferred from the users’ activity data. In this paper, we propose PrivRank, a customizable and continuous privacy-preserving social media data publishing framework that protects users against inference attacks while enabling personalized ranking-based recommendation. Its key idea is to continuously obfuscate user activity data such that the privacy leakage of user-specified private data is minimized under a given data distortion budget, which bounds the ranking loss incurred by the data obfuscation process in order to preserve the utility of the data for recommendation. An empirical evaluation on both synthetic and real-world datasets shows that our framework can efficiently provide effective and continuous protection of user-specified private data, while still preserving the utility of the obfuscated data for personalized ranking-based recommendation. Compared to state-of-the-art approaches, PrivRank achieves both better privacy protection and higher utility in all the ranking-based recommendation use cases we tested.
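To make the core idea concrete, the following is a minimal, illustrative sketch of obfuscation under a distortion budget — not PrivRank's actual optimization, which minimizes privacy leakage subject to a bound on ranking loss. Here, a randomized-response-style substitution replaces each activity with a decoy item with probability equal to the budget, so the expected fraction of distorted entries is bounded by that budget. All names (`obfuscate_activities`, the item identifiers) are hypothetical.

```python
import random

def obfuscate_activities(activities, item_pool, budget, rng=None):
    """Illustrative obfuscation sketch (NOT PrivRank's actual mechanism):
    each activity is replaced by a random item from `item_pool` with
    probability `budget`, which bounds the expected fraction of
    distorted entries by the distortion budget."""
    rng = rng or random.Random()
    obfuscated = []
    for item in activities:
        if rng.random() < budget:
            obfuscated.append(rng.choice(item_pool))  # substitute a decoy item
        else:
            obfuscated.append(item)                   # keep the true activity
    return obfuscated

rng = random.Random(42)
public = ["movie_a", "movie_b", "movie_c", "movie_d"]
pool = ["movie_" + c for c in "abcdefgh"]
noisy = obfuscate_activities(public, pool, 0.25, rng)
```

A budget of 0 releases the data unchanged; a budget of 1 replaces every entry, trading away all utility for maximal protection.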

Existing System:

To apply privacy-preserving data publishing techniques to social media-based recommendation, one immediate strategy is to obfuscate user public data on the user side before it is sent to social media. However, such an approach is unrealistic, as it hinders key benefits for users. In real-world use cases, social media provides users with a social sharing platform, where they can interact with their friends by intentionally sharing their comments/ratings on items, blogs, photos, videos, or even their real-time locations.

Disadvantage:

As it is inappropriate to obfuscate user public data before it is sent to social media, an alternative solution is to protect user privacy when releasing public data from social media to third-party services. Specifically, many third-party services for social media require access to user activity data (or data streams) in order to provide personalized recommendations. In addition to such public data, these services may request optional access to users’ profiles. While some privacy-conscious users want to keep certain profile data (e.g., gender) private, other non-privacy-conscious users may not care about the same type of data and choose to release it. Subsequently, an adversary could illegitimately infer the private data of the privacy-conscious users by learning the correlation between the public and the private data of the non-privacy-conscious users.
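The inference attack described above can be sketched in a few lines. This is a deliberately naive, illustrative attacker (a majority vote over per-item label counts, not a real attack model): it estimates how strongly each public item correlates with a private attribute among users who disclosed it, then predicts the attribute of a privacy-conscious user from their public activity alone. All identifiers and the toy data are hypothetical.

```python
from collections import Counter, defaultdict

def infer_private_attribute(disclosing_users, target_items):
    """Toy inference attack (illustrative): learn P(attribute | item)
    counts from users who disclosed the attribute, then predict a
    target user's attribute by a naive vote over their public items."""
    # Count attribute labels per item among disclosing users.
    item_label_counts = defaultdict(Counter)
    for items, label in disclosing_users:
        for item in items:
            item_label_counts[item][label] += 1
    # Aggregate label votes over the target user's public items.
    votes = Counter()
    for item in target_items:
        for label, count in item_label_counts[item].items():
            votes[label] += count
    return votes.most_common(1)[0][0] if votes else None

# Non-privacy-conscious users: (public items, disclosed gender)
disclosed = [
    (["lipstick", "heels"], "F"),
    (["razor", "cologne"], "M"),
    (["heels", "cologne"], "F"),
]
# A privacy-conscious user withholds gender, but their public
# activity still leaks it through the learned correlations.
guess = infer_private_attribute(disclosed, ["lipstick", "heels"])
```

Even this crude attacker recovers the withheld attribute here, which is exactly the leakage the obfuscation framework is designed to suppress.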

Proposed System:

We study the problem of privacy-preserving publishing of user social media data by considering both the specific requirements of user privacy on social media and the data utility needed for high-quality personalized recommendation. Towards this goal, we face three challenges. First, users often have different privacy concerns: a specific type of data (e.g., gender) may be considered private by some users, while other users may prefer to keep it public in order to get better personalized services. Therefore, the first challenge is to provide users with customizable privacy protection, i.e., to protect only user-specified private data.
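The customization requirement can be illustrated with a minimal per-user privacy specification: each user declares which profile fields are private, and only the remaining fields are released. This sketch shows only the interface idea, not PrivRank's obfuscation itself; the function and field names are hypothetical.

```python
def publish_profile(profile, private_fields):
    """Customizable protection sketch (illustrative): release only the
    profile fields the user has NOT marked as private. In PrivRank, the
    user-specified private fields would additionally drive how the
    public activity data is obfuscated."""
    return {k: v for k, v in profile.items() if k not in private_fields}

alice = {"gender": "F", "age": 30, "city": "Zurich"}
# Alice marks gender as private; age and city remain public.
released = publish_profile(alice, {"gender"})
```

A non-privacy-conscious user would simply pass an empty set of private fields and have their full profile released.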