https://wiki-square.win/index.php/What_to_Believe_About_%22Llama_4_Maverick_4.6%25_Vectara%22_Summarization_Accuracy
AI hallucination benchmarking measures how often, and how severely, language models produce factually incorrect or nonsensical outputs.
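As a rough illustration of "frequency and severity", the sketch below computes a hallucination rate and a mean severity from a list of labeled outputs. The `Judgment` type, the 0-2 severity scale, and the sample labels are all assumptions made for this example; they are not taken from any particular benchmark, and real benchmarks may define both quantities differently.

```python
from dataclasses import dataclass


@dataclass
class Judgment:
    """One model output with a hypothetical human or automated label."""
    hallucinated: bool
    severity: int  # assumed scale: 0 = none, 1 = minor, 2 = major


def hallucination_rate(judgments):
    """Fraction of outputs labeled as hallucinated (the frequency)."""
    if not judgments:
        raise ValueError("need at least one judgment")
    return sum(j.hallucinated for j in judgments) / len(judgments)


def mean_severity(judgments):
    """Average severity over hallucinated outputs only; 0.0 if there are none."""
    flagged = [j.severity for j in judgments if j.hallucinated]
    return sum(flagged) / len(flagged) if flagged else 0.0


# Made-up labels for illustration only, not real benchmark data.
sample = [
    Judgment(False, 0),
    Judgment(True, 1),
    Judgment(False, 0),
    Judgment(True, 2),
]
rate = hallucination_rate(sample)
severity = mean_severity(sample)
print(f"hallucination rate: {rate:.1%}, mean severity: {severity:.2f}")
```

Reporting the two numbers separately keeps a model that makes many minor errors distinguishable from one that makes few but serious errors.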