Part of the MEng project "Mitigating Hallucinations in LLMs".
arXiv paper: (link coming soon)
Doubt Injection is a proposed technique that aims to encourage an LLM's Chain-of-Thought (CoT) to explore a wider set of ideas, motivated by observations of idea exploration in CoTs. It works by randomly injecting a short statement, e.g. "But", at the start of each new paragraph in the CoT. This can yield marginal (but currently statistically insignificant) improvements for distilled DeepSeek models on arithmetic reasoning (29.2%
Example:
The research investigates how the injection string, injection probability, temperature and model size affect the performance of Doubt Injection compared with regular generation.
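The injection mechanism described above can be sketched in a few lines. This is a minimal illustration, not the code used in the evaluation scripts: `generate_paragraph` is a hypothetical callable standing in for the LLM (it returns the continuation up to and including the next paragraph break), and the default `injection_prob`, the `"</think>"` end marker and `max_paragraphs` are assumptions chosen for the example.

```python
import random
from typing import Callable, Optional


def generate_with_doubt_injection(
    generate_paragraph: Callable[[str], str],  # hypothetical wrapper around the LLM
    prompt: str,
    injection_string: str = "But",
    injection_prob: float = 0.5,      # assumed default, for illustration only
    end_marker: str = "</think>",     # assumed end-of-CoT marker
    max_paragraphs: int = 50,
    rng: Optional[random.Random] = None,
) -> str:
    """Generate a CoT paragraph by paragraph, sometimes starting the next
    paragraph with `injection_string` so the model has to continue from it."""
    rng = rng or random.Random()
    text = prompt
    for _ in range(max_paragraphs):
        chunk = generate_paragraph(text)
        text += chunk
        # Stop when the model produces nothing or closes its reasoning.
        if not chunk or end_marker in chunk:
            break
        # At each new paragraph, inject the doubt string with the given probability.
        if rng.random() < injection_prob:
            text += injection_string
    return text


def toy_model(context: str) -> str:
    """Stand-in for an LLM: emits three short paragraphs, then finishes."""
    n = context.count("\n\n")
    return " done.</think>" if n >= 3 else f" step {n + 1}.\n\n"


print(generate_with_doubt_injection(toy_model, "Q:", injection_prob=1.0))
```

With a real model, `generate_paragraph` would decode tokens until a paragraph break appears; setting `injection_prob=0.0` recovers regular generation.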
The paper also lays out a simple statistical framework for gauging the significance of LLM accuracy results, which we hope will be used more widely. For example, for a single run through a 200-question dataset, the claim that LLM B (73.0%) is better than LLM A (72.5%) is statistically insignificant: there is only a 54% chance that LLM B has a higher true accuracy than LLM A. This figure comes from comparing posterior probability distributions over each LLM's accuracy.
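The comparison in the example above can be approximated with a short Monte Carlo sketch. It assumes a uniform Beta(1, 1) prior over each model's true accuracy and treats each question as an independent Bernoulli trial; the paper's exact prior and method may differ, and `prob_b_better` is a name introduced here for illustration.

```python
import random


def prob_b_better(correct_a: int, correct_b: int, n_questions: int,
                  n_samples: int = 100_000, seed: int = 0) -> float:
    """Estimate P(true accuracy of B > true accuracy of A) by sampling from
    Beta posteriors (assumption: uniform prior, independent questions)."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(n_samples):
        # Posterior after k correct answers out of n is Beta(k + 1, n - k + 1).
        p_a = rng.betavariate(correct_a + 1, n_questions - correct_a + 1)
        p_b = rng.betavariate(correct_b + 1, n_questions - correct_b + 1)
        wins += p_b > p_a
    return wins / n_samples


# 72.5% and 73.0% of 200 questions correspond to 145 and 146 correct answers.
p = prob_b_better(correct_a=145, correct_b=146, n_questions=200)
print(f"P(B better than A) ~ {p:.2f}")
```

Under these assumptions the estimate should come out close to the 54% quoted above, which illustrates how little a one-question difference on a 200-question benchmark says about which model is better.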
This repository contains the evaluation scripts used to obtain all results reported in the research paper, primarily on the AIME 2024 and SimpleBench benchmarks. Additional scripts used to obtain motivating results, analyse the ideas explored in CoT responses, and reformat result files are provided in additional_results/.
A small number of example LLM responses and experimental results are provided in responses/.
First install PyTorch from the official website. Then:
pip install numpy pandas transformers protobuf sentencepiece
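To run an evaluation on the AIME 2024 dataset (here $1 is a shell positional parameter; replace it with the desired --doubt_injection value when running the command directly):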
python aime_eval.py --doubt_injection $1 --llm_name "deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B" --temperature_set "0.6" --injection_string "But"
To evaluate a specific question from the SimpleBench dataset:
python simplebench_eval.py --q_id 2 --llm_name deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B --temperature_set "0.6" --injection_string "But"