Can large language model (LLM) agents serve as proxies for human investors in behavioral finance experiments? I deploy 96 GPT-4-family agents in a staggered difference-in-differences design, exposing ...
As Large Language Models (LLMs) are increasingly deployed in autonomous, high-stakes environments, the fragility of current Reinforcement Learning from Human Feedback (RLHF) alignment protocols remain...
Hallucination in large language model-generated summaries poses a critical safety challenge for knowledge-intensive domains such as Chinese history and culture. Existing faithfulness detection methods...
This paper introduces an empirical extension of the Deep Personal Privacy (DPP) framework, a novel paradigm that reconceptualizes privacy as resistance to inference rather than mere control over data ...
The large-scale deployment of Large Language Models (LLMs) is constrained by significant energy consumption and operational costs, with inference accounting for up to 90% of the total energy footprint...
Large language model (LLM)-based chatbots are increasingly integrated into various sectors of people's lives, including education, healthcare, and retail. As they become more ubiquitous, safety concer...
This paper studies whether large language models (LLMs) express systematically different opinions on contested questions in development economics and political economy, and whether those opinions are ...
Large language models are increasingly embedded across the DevOps pipeline, from planning and code generation to testing and deployment, yet multi-agent LLM pipelines remain opaque: errors propagate s...
Assessing LLM Hallucinations And The Reliability Of Using LLMs For Automated Hallucination Detection
Large language models (LLMs) are increasingly deployed to assess, diagnose, and predict clinical symptoms and outcomes from textual data. However, prior work has shown that LLMs are susceptible to hal...
SinoXiv