Could you explain how these lines protect against prompt injection? #129
-
(Lines 88 to 94 in 0fd6549)

Can't I submit a prompt that closes the double quotes, like this one:
Maybe the static portion of the prompt needs a spec too? Something to instruct the AI which symbol delimits the boundaries of the prompt?
Replies: 1 comment
-
Ultimately an app has to assume rogue input from both users and language models. You can create your own translator object that hardens against malicious user input in some way. That's not something we've worked on, but it would be interesting to see examples of where that's worked. Rather than hardening against bad inputs, though, TypeChat provides tools to help handle the places where things can go bad in case either the user or the language model acts in a rogue way:
So much of this comes down to thinking about the UX alongside what responsible AI usage looks like. Having language models respond with well-typed structured data means that you can describe and preview the exact set of steps that will occur, or flag destructive or risky operations, all before actually running them. If your app is performing an operation in a document that a user can undo, you might consider the severity relatively low and just run the operation. If you're moving money around bank accounts, you might want to show a confirmation dialog. Either way, you should only expose operations that a user should be capable of doing in the first place.
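
To make that concrete, here's a minimal sketch of the pattern, assuming the 0.0.x-style `createJsonTranslator(model, schema, typeName)` API; the `BankingActions` schema and the `sanitizeRequest`/`confirmWithUser`/`runAction` helpers are hypothetical, app-specific pieces rather than anything in TypeChat:

```ts
// A minimal sketch of the flow described above, not code from the TypeChat repo.
import * as fs from "fs";
import * as path from "path";
import { createLanguageModel, createJsonTranslator } from "typechat";

// Hypothetical schema: the only operations the model can express are the ones
// the app is willing to perform.
interface Transfer {
    type: "transfer";
    fromAccount: string;
    toAccount: string;
    amountUsd: number;
}

interface BalanceQuery {
    type: "balanceQuery";
    account: string;
}

interface BankingActions {
    actions: (Transfer | BalanceQuery)[];
}

const model = createLanguageModel(process.env);
const schema = fs.readFileSync(path.join(__dirname, "bankingActionsSchema.ts"), "utf8");
const translator = createJsonTranslator<BankingActions>(model, schema, "BankingActions");

// Optional input hardening: whatever checks make sense for the app, applied
// before the request is ever placed into a prompt.
function sanitizeRequest(request: string): string {
    return request.slice(0, 2000);
}

// Placeholder for a real confirmation dialog.
async function confirmWithUser(message: string): Promise<boolean> {
    console.log(`CONFIRM: ${message}`);
    return false;
}

// Placeholder for actually performing an action.
async function runAction(action: Transfer | BalanceQuery): Promise<void> {
    console.log("Running:", action);
}

async function handleUserRequest(request: string) {
    const response = await translator.translate(sanitizeRequest(request));
    if (!response.success) {
        // The model's output didn't validate against the schema, so nothing runs.
        console.log(`Could not translate request: ${response.message}`);
        return;
    }
    // The result is well-typed, so the app can preview each step and gate the
    // risky ones before executing anything.
    for (const action of response.data.actions) {
        if (action.type === "transfer") {
            const ok = await confirmWithUser(
                `Transfer $${action.amountUsd} from ${action.fromAccount} to ${action.toAccount}?`
            );
            if (!ok) {
                continue;
            }
        }
        await runAction(action);
    }
}
```

The schema is the first gate: anything the model produces that doesn't validate against `BankingActions` is rejected before it runs. The typed result then lets the app decide per operation whether to run it, preview it, or require confirmation.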