The AI boom has allowed the average user to turn to AI chatbots like ChatGPT for answers to questions of both breadth and depth. However, these models are still prone to hallucination, confidently providing answers that are incorrect and sometimes dangerous. While some hallucinations are caused by incorrect training data, generalization, or other side effects of data harvesting, researchers at Oxford have targeted the problem from another angle. In Nature, they published details of a newly developed method for detecting confabulations, the arbitrary and incorrect answers an LLM sometimes generates.
LLMs find answers by looking for patterns in their training data. This doesn't always work, because an AI bot can still find a pattern where none exists, much as a human can spot animal shapes in clouds. The difference is that a human knows those are just cloud formations, not an actual giant elephant floating in the sky. An LLM, by contrast, may treat a spurious pattern as gospel truth, leading it to confidently describe future tech that doesn't exist yet, among other nonsense.
Semantic entropy is the key.
The Oxford researchers use semantic entropy to determine whether an LLM is hallucinating. Semantic entropy arises when the same words can carry different meanings: "desert" can refer to a geographical feature, or it can mean to abandon someone. When an LLM uses such words, it may become uncertain about what it is trying to say, so by measuring the semantic entropy of an LLM's output, the researchers aim to determine whether it is likely to be hallucinating or not.
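At a high level, the intuition can be illustrated with a small sketch: sample several answers to the same question, group the answers that mean the same thing, and measure how spread out the meanings are. The code below is only an illustration of that idea, not the researchers' implementation; the means_the_same helper is a hypothetical stand-in for a real semantic-equivalence check (for example, an entailment model), and sampling from an actual LLM is assumed to happen elsewhere.

```python
import math
from typing import Callable, List

def semantic_entropy(answers: List[str],
                     means_the_same: Callable[[str, str], bool]) -> float:
    """Group sampled answers by meaning, then compute entropy over the groups."""
    clusters: List[List[str]] = []
    for ans in answers:
        for cluster in clusters:
            if means_the_same(ans, cluster[0]):
                cluster.append(ans)   # same meaning as an existing group
                break
        else:
            clusters.append([ans])    # a new meaning starts a new group

    total = len(answers)
    entropy = 0.0
    for cluster in clusters:
        p = len(cluster) / total      # share of samples carrying this meaning
        entropy -= p * math.log(p)    # Shannon entropy over meanings
    return entropy

# Toy example: three rephrasings of one answer plus one conflicting answer.
# The hypothetical equivalence check here just looks for the word "paris".
samples = ["Paris", "It is Paris.", "The capital is Paris", "Lyon"]
same = lambda a, b: ("paris" in a.lower()) == ("paris" in b.lower())
print(round(semantic_entropy(samples, same), 3))  # ~0.562
```

If the model keeps giving answers with the same meaning, the entropy stays low; if its answers scatter across contradictory meanings, the entropy rises, which is the kind of signal used to flag a likely confabulation.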
The advantage of using semantic entropy is that it works on LLMs without any additional human supervision or reinforcement, making it faster to detect whether an AI bot is hallucinating. Because it doesn't rely on task-specific data, it can be applied to new tasks the LLM hasn't encountered before, allowing users to rely on it more, even the first time the AI comes across a particular question or command.
According to the research team, “Our method helps users understand when they should take extra care with LLMs and opens up new possibilities for using LLMs that are otherwise prevented by their unreliability.” If semantic entropy proves to be an effective way to detect hallucinations, tools built on it could be used to double-check the accuracy of AI output, making LLMs a more reliable partner for professionals. However, just as no human being is infallible, we must remember that LLMs, even with the most advanced error-detection tools, can still go wrong. So it's always wise to double-check the answer ChatGPT, Copilot, Gemini, or Siri gives you.