
With the advent of GenAI in the technology market, Large Language Models (LLMs) have become the preferred choice for developers, marketers, and businesses looking to boost their daily workflow efficiency. However, the persistent risk of hallucinations, bias, and toxicity remains a significant concern, as LLMs can sometimes produce inaccurate or flawed outputs.

Before diving in, I recommend reading my previous blog, where I laid out the fundamental concepts of hallucinations, bias, and toxicity in LLMs and explored their root causes and detrimental effects. With that background in place, this blog will give you a more comprehensive understanding.

Here, we’ll focus on some of the practical solutions to combat hallucinations and improve the reliability of LLM outputs.

So, without further ado, let’s get started with the techniques!

Techniques/Methods for Mitigating Hallucinations

Fighting hallucinations in LLMs requires a multi-faceted approach, using various techniques to address different aspects of the issue. Each method has its strengths and limitations, which are important to understand for an effective strategy.


Note: Instead of just fine-tuning models, solutions may involve changing the entire model architecture to address hallucinations.

Now, let’s delve deeper into each of these techniques, including their characteristics, how they’re implemented, and their unique value in combating hallucinations in LLMs.

1. Knowledge Graph Integration

Here are two methods within knowledge graph (KG) integration for you to follow:

a. Reducing Hallucination in Open-Domain Dialogues with Knowledge Grounding: This involves leveraging relationships between linked entities in a KG to enhance response accuracy. Here, the model incorporates context-related subgraphs for better knowledge encoding and integrates cross-attention between the KG and query context, known as the “global knowledge-grounding strategy”. Additionally, it employs the local knowledge-grounding strategy, focusing on utilizing the KG directly.

By combining these strategies, the model aims to generate more precise responses. It uses conversational reasoning to re-evaluate generated responses, ensuring the selection of the most accurate answer among them.

Here is a diagram illustrating knowledge grounding using a knowledge graph:

Fig. 1: Knowledge Grounding Using a Knowledge Graph

Example:

Let’s assume we are having a conversation about France. Here’s the conversation history:

  • User: What is the capital of France?

To get the answer/output, the model can access a knowledge graph. In this case, the KG contains the following information:

  • Entities: France, Paris
  • Relation: capital
  • Triple: (France, capital, Paris)

The model utilizes this knowledge graph to enhance the accuracy of its responses. Here’s how it works:

1. Local Knowledge Grounding: The model first performs local knowledge grounding by looking up the query “capital of France” in the KG. This retrieves the entity “Paris”.

2. Global Knowledge Grounding: The model then performs global knowledge grounding by considering the entire conversation history and the KG. It can do this by using a technique called “attention”, which allows the model to focus on the most relevant parts of the KG based on the conversation history. In this case, the model would focus on the relationship between “France” and “capital”.

3. Conversational Reasoning: Finally, the model uses a conversational reasoning model to rank the candidate responses. This model considers the conversation history, the KG, and the candidate responses themselves. In this case, the candidate responses could be:

  • Paris is the capital of France.
  • London is the capital of France.

So, based on the conversation history and the knowledge graph, the conversational reasoning model would rank the first response as more likely to be correct.
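To make these steps concrete, here is a minimal Python sketch of the flow. The tiny triple store, the keyword-based lookup, and the overlap-based ranking are simplified stand-ins of my own (a real system would use learned cross-attention over subgraphs and a trained conversational reasoning model), but the local-lookup-then-rank structure mirrors the steps above.

```python
# Minimal sketch of knowledge-grounded response selection (illustrative only).
# The triple store, lookup, and scoring heuristic are toy stand-ins; a real
# system would use learned cross-attention and a trained reasoning model.

# Knowledge graph as (subject, relation, object) triples.
KG = [
    ("France", "capital", "Paris"),
]

def local_grounding(query: str) -> list[tuple[str, str, str]]:
    """Local grounding: retrieve triples whose subject and relation appear in the query."""
    return [
        (s, r, o) for (s, r, o) in KG
        if s.lower() in query.lower() and r.lower() in query.lower()
    ]

def rank_candidates(query: str, candidates: list[str]) -> list[str]:
    """Toy stand-in for conversational reasoning: prefer candidates that
    mention the object of a retrieved triple."""
    facts = local_grounding(query)
    def score(candidate: str) -> int:
        return sum(1 for (_, _, o) in facts if o.lower() in candidate.lower())
    return sorted(candidates, key=score, reverse=True)

query = "What is the capital of France?"
candidates = [
    "Paris is the capital of France.",
    "London is the capital of France.",
]
print(rank_candidates(query, candidates)[0])  # -> Paris is the capital of France.
```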

b. Factual Error Detection and Correction with Evidence Retrieved from External Knowledge (FLEEK): This tool serves as an intelligent assistant to verify and enhance the factual accuracy of text. It works with any model and offers these capabilities:

  • autonomously identifies potential facts in the text
  • generates related questions for each fact
  • searches the web and knowledge graphs for supporting evidence

This evidence is then used to verify facts and suggest corrections. By integrating evidence, generated questions, and extracted facts in its verification process, the tool ensures transparency, enabling users to comprehend its reasoning.

Here is a diagram illustrating the FLEEK methodology:

Fig. 2: FLEEK Methodology

Example:

Let’s say the LLM outputs the following sentence: “Mount Everest stands as the tallest mountain globally, towering at 29,031 feet (8,848 meters).”

  • First, FLEEK would analyze this sentence and identify the claim about the tallest mountain’s height. It might then generate a question like: “What is the documented height of the tallest mountain on Earth?”
  • Then it would search curated knowledge graphs (e.g., Wikidata) and the open web (e.g., scientific websites) for answers.

If the evidence overwhelmingly points to a different height (e.g., 29,032 feet), FLEEK will flag the original statement as potentially containing an error. Finally, FLEEK would suggest a correction like:

“Mount Everest stands as the tallest mountain globally, towering at 29,032 feet (8,848 meters) according to reliable sources.”

So, by highlighting the evidence and reasoning behind the correction, FLEEK allows users to assess its credibility and make informed decisions.
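Below is a rough Python sketch of a FLEEK-style verification loop over the Everest sentence. The claim extractor, question generator, and evidence retriever here are hypothetical placeholders (the actual tool relies on LLM-based fact extraction plus web and knowledge-graph search, and a learned verification step), so treat this as an illustration of the pipeline shape rather than the tool’s implementation.

```python
# Illustrative FLEEK-style pipeline: extract claims, generate questions,
# retrieve evidence, and flag unsupported claims. All components are toy
# placeholders for the real extraction, retrieval, and verification models.

def extract_claims(text: str) -> list[dict]:
    # Placeholder: a real system identifies verifiable facts automatically.
    return [{"claim": "Mount Everest is 29,031 feet tall",
             "question": "What is the documented height of Mount Everest?"}]

def retrieve_evidence(question: str) -> str:
    # Placeholder: stand-in for querying Wikidata or the open web.
    return "29,032 feet (8,848.86 meters)"

def verify(text: str) -> list[dict]:
    """Flag claims whose retrieved evidence does not support them."""
    findings = []
    for item in extract_claims(text):
        evidence = retrieve_evidence(item["question"])
        supported = "29,031" in evidence  # toy check; real systems use a verification model
        findings.append({**item, "evidence": evidence, "supported": supported})
    return findings

sentence = ("Mount Everest stands as the tallest mountain globally, "
            "towering at 29,031 feet (8,848 meters).")
for finding in verify(sentence):
    if not finding["supported"]:
        print(f"Flagged: {finding['claim']!r}; evidence says {finding['evidence']}")
```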

Onwards to the next method!

2. Supervised Fine-tuning (SFT)

Here are two methods within SFT for you to follow:

a. R-Tuning: This is a new method for training large language models (LLMs) to refuse to answer questions they don’t have the answer to. It works by figuring out the difference between what the LLM already knows and what it’s being taught during fine-tuning.

Based on this difference, R-Tuning creates special training data that teaches the LLM when to refuse answering a question, especially when the topic is outside its area of expertise. This method involves two key steps, which are:

i. Determining ambiguous queries by assessing the gap between the LLM’s inherent knowledge and the specific tuning queries. The tuning data is divided into certain and uncertain questions by making a single inference on the training set and comparing predictions to labels.

ii. Appending “refusal phrases” to the uncertain questions in the training data. Once this data is prepared, the LLM can be fine-tuned on these new examples, allowing it to better understand and respond to situations where a refusal is appropriate.

Example:

Question | Label | Certain | Uncertain
What is the capital of France? | Paris | Yes | No
What is the meaning of life? | 42 | No | Yes
What is the airspeed velocity of an unladen swallow? | None | No | Yes

In the above example, the first question ("What is the capital of France?") is certain because the LLM predicts the answer confidently (Paris). The second and third questions ("What is the meaning of life?" and "What is the airspeed velocity of an unladen swallow?") are uncertain because the LLM lacks confidence in its answers. So, training the LLM on a dataset containing both certain and uncertain questions teaches it to refrain from answering questions it isn’t sure about.
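Here is a small Python sketch of how R-Tuning-style training data could be assembled. The `model_predict` stub and the exact refusal phrasing are assumptions for illustration; the method itself runs a single inference pass over the training set and compares predictions to labels, as described in step (i), before appending refusal phrases in step (ii).

```python
# Sketch of R-Tuning-style data preparation. `model_predict` and the refusal
# phrasing are illustrative placeholders, not the method's exact wording.

REFUSAL_SUFFIX = " I am not sure about this answer."  # hypothetical refusal phrase

def model_predict(question: str) -> str:
    # Placeholder for a single inference pass with the pre-trained LLM.
    return {"What is the capital of France?": "Paris"}.get(question, "unknown")

def build_r_tuning_data(dataset: list[tuple[str, str]]) -> list[dict]:
    examples = []
    for question, label in dataset:
        certain = model_predict(question) == label  # compare prediction to label
        target = label if certain else label + REFUSAL_SUFFIX
        examples.append({"question": question, "target": target, "certain": certain})
    return examples

data = [
    ("What is the capital of France?", "Paris"),
    ("What is the meaning of life?", "42"),
]
for example in build_r_tuning_data(data):
    print(example)
```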

b. Think While Effectively Articulating Knowledge (TWEAK): This method treats the generated text at each step, along with its predicted continuation, as a hypothesis. It then uses a Hypothesis Verification Model (HVM) to score each candidate based on how well the HVM’s verification of the hypothesis aligns with the original input facts. In simpler terms, TWEAK checks if the generated text and its continuation both make sense in the context of the original information provided.

Here is a diagram illustrating the TWEAK methodology:

Fig. 3: TWEAK Methodology
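As a rough illustration of the idea, the sketch below reranks two candidate continuations against the input facts. The keyword-overlap scorer is a toy stand-in for a trained Hypothesis Verification Model, and the candidates are hard-coded rather than decoded step by step, so only the overall rerank-by-verification flow reflects TWEAK itself.

```python
# Toy sketch of TWEAK-style reranking: score each hypothesis (generated text
# plus predicted continuation) against the input facts and keep the best one.
# The overlap scorer below stands in for a trained Hypothesis Verification Model.

def hvm_score(hypothesis: str, facts: list[str]) -> float:
    """Fraction of fact tokens that appear in the hypothesis (toy HVM)."""
    fact_tokens = {tok.lower() for fact in facts for tok in fact.split()}
    hyp_tokens = {tok.lower().strip(".,") for tok in hypothesis.split()}
    return len(fact_tokens & hyp_tokens) / max(len(fact_tokens), 1)

facts = ["Mount Everest", "height 8848 meters"]
candidates = [
    "Mount Everest has a height of 8848 meters.",   # faithful to the facts
    "Mount Everest has a height of 8000 meters.",   # hallucinated height
]
best = max(candidates, key=lambda c: hvm_score(c, facts))
print(best)
```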

3. Decoding Strategy

Here are two methods within decoding strategy for you to follow:

a. Context-Aware Decoding (CAD): This technique aims to enhance the accuracy of responses from language models when their current knowledge conflicts with the given context. It uses a contrastive output distribution to highlight differences in probability for potential outputs, depending on whether the model generates text with or without the provided context.

CAD’s main advantage is its ability to address bias from the model’s existing knowledge when it contradicts the context provided. This is especially useful in scenarios where resolving such conflicts is essential for producing precise and relevant outputs. For example, consider a language model trained on extensive factual data.

When summarizing a news article that mentions “Company X is facing financial difficulties,” the model might tend to contradict this based on its prior belief that Company X is financially stable.

CAD addresses this bias by boosting the likelihood of generating text aligned with the news article, enabling the model to create a more accurate summary.

It also provides a major implementation advantage. That is, it doesn’t need the language model to undergo extra training, making it easy to apply to existing pre-trained models. This saves computational resources and time otherwise spent on retraining for better accuracy.

Example:

Imagine an LLM trained on a dataset of mostly positive restaurant reviews. Let’s say you provide the LLM with the following context: “A customer had an unsatisfactory experience at a restaurant. The food was undercooked, service was unresponsive, and the atmosphere was noisy.”

  • Without CAD: The LLM, biased by its training data, might downplay the negativity and generate a summary like: “The dining experience was satisfactory.”
  • With CAD: The contrastive output distribution would significantly increase the probability of the LLM generating text aligned with the context, such as: “The customer faced issues like undercooked food, unresponsive service, and a noisy atmosphere.”

So, by amplifying the probability of context-consistent outputs, CAD helps LLMs overcome their prior bias and produce more accurate and faithful responses.
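The sketch below shows a single decoding step with a CAD-style contrastive adjustment, assuming a hypothetical `logits_fn` in place of a real language-model call and a toy five-token vocabulary. The `(1 + ALPHA) * with_context - ALPHA * without_context` adjustment follows the general contrastive recipe described above; treat the exact coefficients as an assumption rather than a definitive formulation.

```python
import numpy as np

# One step of context-aware decoding (sketch). `logits_fn` is a hypothetical
# placeholder for the language model; the contrast boosts tokens that the
# provided context makes more likely than the model's prior alone.

ALPHA = 1.0  # contrast strength (tunable hyperparameter)

def softmax(x: np.ndarray) -> np.ndarray:
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

def logits_fn(prompt: str) -> np.ndarray:
    # Placeholder: a real implementation would query the LLM here.
    rng = np.random.default_rng(abs(hash(prompt)) % (2**32))
    return rng.normal(size=5)  # toy 5-token vocabulary

context = "A customer had an unsatisfactory experience at a restaurant."
question = "Summarize the customer's experience."

with_context = logits_fn(context + " " + question)  # logits conditioned on context + query
without_context = logits_fn(question)               # logits from the query alone

adjusted_logits = (1 + ALPHA) * with_context - ALPHA * without_context
next_token_probs = softmax(adjusted_logits)
print(next_token_probs.argmax())  # index of the context-consistent next token
```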

b. Decoding by Contrasting Layers (DoLa): This is a simple method designed to tackle the issue of hallucinations in pre-trained LLMs without requiring additional fine-tuning or external information. This approach uses the insight that factual knowledge tends to strengthen in the deeper layers of transformer-based LLM architectures.

Here’s how it works:

  • Contrasting Layer Outputs: DoLa compares the outputs of different layers in the LLM. Specifically, it focuses on the difference in “logits,” which are the raw scores assigned to each possible next word by the model.
  • Projecting to Vocabulary Space: These logit differences are then projected into the vocabulary space, meaning they are transformed into a format that reflects the actual words the model can generate.
  • Next-Token Distribution: By analyzing these projected differences, it can effectively predict the most likely next token in the sequence, while simultaneously filtering out improbable or factually incorrect options.

This approach capitalizes on the fact that factual knowledge tends to be more prominent in later layers of the LLM. As a result, DoLa helps to:

  • Reduce False Information: By filtering out improbable outputs, it minimizes the generation of factually incorrect or misleading information.
  • Improve Truth Recognition: By focusing on layers with stronger factual grounding, it enhances the LLM’s ability to identify and prioritize truthful information.

DoLa has been shown to improve the overall truthfulness of LLM responses across a range of tasks, including multiple-choice and open-ended generation benchmarks such as TruthfulQA, and it has delivered performance improvements for LLaMA-family models.
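To ground the mechanics, here is a toy Python sketch of one DoLa-style decoding step. The per-layer logits are random placeholders for hidden states projected through the model’s output head, and the plausibility cutoff is a simplification; the point is only to show how contrasting a mature (late) layer with a premature (early) layer reshapes the next-token choice.

```python
import numpy as np

# One DoLa-style decoding step (sketch). Random vectors stand in for the
# premature- and mature-layer logits after projection to the vocabulary.

def softmax(x: np.ndarray) -> np.ndarray:
    x = x - x.max()
    e = np.exp(x)
    return e / e.sum()

rng = np.random.default_rng(0)
vocab_size = 5
premature_logits = rng.normal(size=vocab_size)  # early layer, projected to vocabulary space
mature_logits = rng.normal(size=vocab_size)     # final layer, projected to vocabulary space

p_mature = softmax(mature_logits)
p_premature = softmax(premature_logits)

# Keep only tokens the mature layer already finds plausible, then contrast
# the two distributions to amplify knowledge that emerges in later layers.
plausible = p_mature >= 0.1 * p_mature.max()
contrast = np.where(plausible, np.log(p_mature) - np.log(p_premature), -np.inf)

next_token = int(np.argmax(contrast))
print(next_token)  # index of the selected next token
```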

By now I’m confident that you’ve gained a solid understanding of how to tackle hallucinations in LLMs.

Further, feel free to explore various post-development techniques like Retrieval-Augmented Generation (RAG), which utilizes external knowledge bases to enhance the factual accuracy and reliability of LLM responses. As research in this area continues to evolve, we can expect even more sophisticated and effective methods for mitigating hallucinations, bringing us closer to realizing the full potential of LLMs.

Want to know more about Generative Artificial Intelligence’s (GenAI) potential? Reach us at Nitor Infotech today!
