Hack Nux: New ask Hacker News story: Ask HN: How do you implement censorship to a LLM?

Ask HN: How do you implement censorship to a LLM?
6 by mcmoor | 3 comments on Hacker News.
OpenAI keeps increasingly adding censorship to its LLM to comply to various laws. But I'm confused how do they do that? I thought it's impossible to incorporate it while training because it seems like the LLM doesn't have those censors before and now they do (?). But it also seems very unlikely that they tinker with the neural nodes directly. The only method I can think of is that they add lots of prompt themselves that's added automatically with every user prompt. Something like "You may not talk about X. You may not talk about Y." before user prompts. If it's like that, it explains why users can jailbreak the censorship, we just have to overpower those censors prompts.