confaide.github.io - Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory

Description: Contextual Privacy in LLMs

privacy (2031) llm (259) confaide (1)

Example domain paragraphs

The interactive use of large language models (LLMs) in AI assistants (at work, home, etc.) introduces a new set of inference-time privacy risks: LLMs are fed different types of information from multiple sources in their inputs and we expect them to reason about what to share in their outputs, for what purpose and with whom, in a given context. In this work, we draw attention to the highly critical yet overlooked notion of contextual privacy by proposing ConfAIde , a benchmark designed to identify critical w

Our benchmark consists of four tiers, each with distinct evaluation tasks. Tier 1 (Info-Sensitivity) : assesses LLMs on their ability to understand the sensitivity of given information, using ten predefined information types. Tier 2 (InfoFlow-Expectation) : evaluates models' expectations of information flow using vignettes based on three contextual factors: information type, actor, and use. This tier includes two sub-tiers: Tier 2.a and Tier 2.b, with Tier 2.b expanding vignettes into short stories. Tier 3

Effect of actor and use on privacy expectations (tiers 1-2.b). The figure above shows how GPT-4's judgment varies based on different contextual factors and data sensitivity, progressing through tiers 1, 2.a and 2.b. For example, the sensitivity of sharing Social Security Numbers (SSN) decreases when it's shared with insurance (Tier 2.a) instead of being highly sensitive (Tier 1). The figure also shows that sharing SSN with a doctor becomes less of a privacy concern when moving from Tier 2.a to 2.b with GPT-

Links to confaide.github.io (4)