February was an exciting month for the Ruby Users Forum.
The BrokenMath benchmark (NeurIPS 2025 Math-AI Workshop) tested this in formal reasoning across 504 samples. Even GPT-5 produced sycophantic “proofs” of false theorems 29% of the time when the user implied the statement was true. The model generates a convincing but false proof because the user signaled that the statement should hold. GPT-5 is not an early model; it is also the least sycophantic in the BrokenMath table. The problem is structural to RLHF: preference data contains an agreement bias. Reward models learn to score agreeable outputs higher, and optimization widens the gap. Base models before RLHF were reported in one analysis to show no measurable sycophancy across tested sizes. Only after fine-tuning did sycophancy enter the chat (literally).
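The amplification step described above can be illustrated with a toy calculation. This is a minimal sketch under stated assumptions, not a description of how any real model was trained: responses are reduced to one hypothetical binary feature ("agrees with the user"), the reward model is a one-parameter Bradley-Terry fit, and optimization is modeled by the standard KL-regularized optimum, where the tuned policy is proportional to the base policy times `exp(reward / beta)`. The 60% annotator preference and the `beta` values are invented for illustration and do not come from BrokenMath.

```python
import math


def fit_reward_weight(p_agree_preferred: float) -> float:
    """Bradley-Terry fit for a single binary feature.

    If annotators prefer the agreeing response with probability p,
    the maximum-likelihood reward gap w satisfies sigmoid(w) = p.
    """
    return math.log(p_agree_preferred / (1.0 - p_agree_preferred))


def optimized_agree_prob(w: float, beta: float, base_agree_prob: float = 0.5) -> float:
    """Agreement probability under the KL-regularized optimum.

    The tuned policy is proportional to base_policy * exp(reward / beta),
    with reward w for agreeing and 0 for disagreeing.
    """
    agree_mass = base_agree_prob * math.exp(w / beta)
    disagree_mass = 1.0 - base_agree_prob
    return agree_mass / (agree_mass + disagree_mass)


# Hypothetical numbers: a mild 60/40 agreement bias in the preference data.
w = fit_reward_weight(0.6)
mild = optimized_agree_prob(w, beta=1.0)     # weak optimization pressure
strong = optimized_agree_prob(w, beta=0.25)  # stronger optimization pressure
print(f"reward gap: {w:.3f}, agree prob: {mild:.3f} -> {strong:.3f}")
```

In this toy setup a base policy with no preference for agreement (probability 0.5) simply reproduces the 60% annotator bias at `beta=1.0`, and a smaller `beta` (more optimization pressure against the same reward) pushes the agreement rate well above the bias present in the data.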
One codebase contained a broken input rewrite, which unflake failed to notice because it does not yet support input rewrites.
SpaceX has initiated proceedings against Amazon (20:50)
(2) a pilot, or any other person, apart from the crew, who provides services to the ship;