Modern web search engines exploit users' search history to personalize search results, with the goal of improving service utility on a per-user basis. However, this very practice creates a risk of privacy infringement and has raised serious public concern. In this work, we propose a client-centered, intent-aware query obfuscation solution for protecting user privacy in a personalized web search scenario. In our solution, each user query is submitted with l additional cover queries and corresponding clicks, which act as decoys to mask the user's genuine search intent from the search engine. The cover queries are sequentially sampled from a set of hierarchically organized language models to ensure the coherence of the fake search intents within a cover search task. Our approach emphasizes the plausibility of the generated cover queries, with respect not only to the current genuine query but also to previous queries in the same task, to make it harder for a search engine to identify a user's true intent. We also develop two new metrics from an information-theoretic perspective to evaluate the effectiveness of the privacy protection provided. Comprehensive experimental comparisons with state-of-the-art query obfuscation techniques are performed on the public AOL search log, and the promising results substantiate the effectiveness of our solution.
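The obfuscation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the topic names, the toy unigram models, and the `CoverSession` class are all hypothetical, the hierarchy is reduced to path-like topic names, and cover clicks are omitted. It only shows the idea of fixing l cover topics for a search task so that successive cover queries stay coherent with one another.

```python
import random

# Hypothetical toy models: topic path -> unigram language model (word -> probability).
# A real system would use a much larger, genuinely hierarchical set of models.
TOPIC_MODELS = {
    "health/nutrition": {"vitamin": 0.4, "diet": 0.3, "protein": 0.3},
    "travel/flights": {"cheap": 0.3, "flights": 0.4, "airline": 0.3},
    "sports/soccer": {"league": 0.4, "scores": 0.3, "transfer": 0.3},
    "tech/laptops": {"laptop": 0.5, "battery": 0.2, "review": 0.3},
}


def sample_query(topic, rng, length=2):
    """Draw a short query word by word from the topic's unigram model."""
    model = TOPIC_MODELS[topic]
    words = rng.choices(list(model), weights=list(model.values()), k=length)
    return " ".join(words)


class CoverSession:
    """Holds l cover topics fixed for one search task (assumed design)."""

    def __init__(self, true_topic, l, rng):
        # Cover topics are drawn from topics other than the genuine one.
        candidates = [t for t in TOPIC_MODELS if t != true_topic]
        self.cover_topics = rng.sample(candidates, l)
        self.rng = rng

    def obfuscate(self, true_query):
        """Return the genuine query mixed with one cover query per cover topic."""
        batch = [true_query]
        batch += [sample_query(t, self.rng) for t in self.cover_topics]
        self.rng.shuffle(batch)  # hide the position of the genuine query
        return batch
```

A session would be created once per search task, for example `CoverSession("health/nutrition", 2, random.Random())`, and `obfuscate` called for each genuine query in that task, so every submission carries l cover queries drawn from the same cover topics.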