Posts

Showing posts with the label Conformal Prediction

2024-12-30: Paper Summary: Benchmarking LLMs via Uncertainty Quantification

Figure 1: Overall process of applying conformal prediction for uncertainty quantification in LLMs (Figure 2 in the paper)

Introduction

As large language models (LLMs) gain prominence in academia and industry, evaluating their performance becomes increasingly critical. Popular platforms such as the HuggingFace Open LLM Leaderboard focus solely on accuracy, neglecting an essential dimension: uncertainty. Uncertainty quantification (UQ) is vital for comprehensive model evaluation, as two models may achieve similar accuracy yet exhibit different levels of uncertainty, as illustrated in Figure 2 below. In this blog, we summarize the paper "Benchmarking LLMs via Uncertainty Quantification" [1] by Fanghua Ye et al., presented at NeurIPS 2024. This paper introduces a novel benchmarking framework that incorporates uncertainty quantification using conformal prediction [2]. The paper quantifies the uncertainties of nine LLMs across five core natural language ...
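
To make the conformal prediction idea concrete, below is a minimal sketch (in Python with NumPy) of split conformal prediction applied to multiple-choice softmax scores: a held-out calibration set is used to pick a score threshold, and each test question then gets a prediction set whose average size serves as an uncertainty measure. The function and variable names (conformal_prediction_sets, calib_probs, alpha, etc.) are illustrative assumptions, not code or notation from the paper.

```python
import numpy as np

def conformal_prediction_sets(calib_probs, calib_labels, test_probs, alpha=0.1):
    """Build prediction sets with ~(1 - alpha) coverage via split conformal prediction.

    calib_probs:  (n_calib, n_options) softmax scores on a held-out calibration set
    calib_labels: (n_calib,) index of the correct option for each calibration example
    test_probs:   (n_test, n_options) softmax scores on the test set
    """
    n = len(calib_labels)
    # Nonconformity score: 1 minus the probability assigned to the true option.
    scores = 1.0 - calib_probs[np.arange(n), calib_labels]
    # Conformal quantile with the standard finite-sample correction.
    q_level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    q_hat = np.quantile(scores, q_level, method="higher")
    # Include every option whose nonconformity score stays within the threshold.
    return [np.where(1.0 - p <= q_hat)[0] for p in test_probs]

# Toy usage with random softmax-like scores:
# a larger average prediction-set size indicates higher uncertainty.
rng = np.random.default_rng(0)
calib_probs = rng.dirichlet(np.ones(4), size=500)
calib_labels = rng.integers(0, 4, size=500)
test_probs = rng.dirichlet(np.ones(4), size=100)
sets = conformal_prediction_sets(calib_probs, calib_labels, test_probs)
print("average prediction-set size:", np.mean([len(s) for s in sets]))
```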