Summary: [2305.18449] Taming AI Bots: Controllability of Neural States in Large Language Models

We then introduce a stronger notion of controllability as
"almost certain reachability", and show that, when restricted to the space
of meanings, an AI bot is controllable.

We then characterize the
subset of meanings that can be reached by the state of the LLMs for some input
prompt, and show that a well-trained bot can reach any meaning albeit with
small probability.
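
As a rough illustration of how a small per-attempt probability relates to reaching a target almost certainly, the sketch below computes the chance that at least one of n independent attempts succeeds. It is a toy calculation and not the paper's construction: the independence of attempts, the fixed per-attempt probability p, and the function names are all assumptions made for this example.

```python
import math


def prob_reached_within(p: float, n: int) -> float:
    """Probability that at least one of n independent attempts succeeds,
    when each attempt succeeds with probability p (toy assumption)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if n < 0:
        raise ValueError("n must be non-negative")
    return 1.0 - (1.0 - p) ** n


def attempts_needed(p: float, confidence: float) -> int:
    """Smallest n such that prob_reached_within(p, n) >= confidence."""
    if not 0.0 < p <= 1.0:
        raise ValueError("p must lie in (0, 1]")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie in (0, 1)")
    if p == 1.0:
        return 1
    # Solve 1 - (1 - p)^n >= confidence for n.
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p))


if __name__ == "__main__":
    # Hypothetical per-attempt probabilities, chosen only for illustration.
    for p in (0.1, 0.01, 0.001):
        n = attempts_needed(p, 0.99)
        print(f"p={p}: {n} attempts give probability {prob_reached_within(p, n):.4f}")
```

Under these assumptions, any strictly positive p drives the probability of reaching the target towards 1 as n grows, which is the intuition behind moving from "reachable with small probability" to "almost certainly reachable".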

Abstract: We tackle the question of whether an agent can, by suitable choice of
prompts, control an AI bot to any state.

The fact that AI bots are
controllable means that an adversary could steer them towards any state.

We do so after introducing a functional
characterization of attentive AI bots, and finally derive necessary and
sufficient conditions for controllability.

Read the complete article at: arxiv.org