Howardism · vol. 03 · quiet corner of the web

RLHF, tagged.

Notes: 2 · Tag: RLHF · Oldest: 14 Apr 2026 · Newest: 8 May 2026

Every article tagged rlhf, newest first.

Title | Summary | Date
Alignment Fine-Tuning (AFT) | Standard post-pretraining stage (SFT + RLHF) for installing values; shallow-alignment failure mode motivates Model Spec Midtraining | 8 May 2026
Scale-Dependent Prompt Sensitivity | Large models underperform small ones on 7.7% of standard benchmarks due to overthinking; brevity constraints recover 26pp and fully reverse hierarchy on GSM8K/MMLU-STEM | 14 Apr 2026