LLMs Struggle with Importance Detection and Nuance
· agents, product-design
The Idea
Current LLMs/agents are bad at identifying what actually matters. They fail to pick up on nuance and to calibrate their responses to context.
Example: when asked to write feature documentation, the LLM produces verbose docs even for simple features that warrant a brief description. It seems optimized for hitting a target response length rather than for delivering what the user actually needs.
This suggests a deeper problem: the reward signal (likely from RLHF) may be biased toward longer, more "comprehensive" outputs, conflating length with quality when the real target is appropriateness.
Why This Matters
- For agent builders: Need to think about how to give agents better judgment about "what's enough"
- For product design: Output calibration is a core UX problem - verbose when unnecessary, terse when detail matters
- For prompting: May need explicit signals about desired depth/brevity
- Fundamental limitation: Current models may lack the meta-cognitive ability to assess "is this the right level of detail?"
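One workaround for the prompting point above is to move the depth decision out of the model entirely: estimate complexity with a crude heuristic and attach an explicit length budget to the prompt. A minimal sketch, assuming a word-count proxy for complexity; the function names and thresholds are illustrative, not a real API:

```python
# Sketch: externalize the "how much detail?" judgment instead of
# trusting the model's own calibration. Thresholds are arbitrary
# assumptions for illustration.

def depth_budget(feature_description: str) -> str:
    """Pick a brevity instruction from a crude complexity proxy (word count)."""
    words = len(feature_description.split())
    if words < 20:
        return "Document this in 2-3 sentences. Do not pad."
    if words < 100:
        return "Document this in one short section (about 150 words)."
    return "Write full documentation with examples and edge cases."

def build_prompt(feature_description: str) -> str:
    """Prepend the explicit depth signal to the task prompt."""
    return f"{depth_budget(feature_description)}\n\nFeature: {feature_description}"
```

The point is not the heuristic itself (word count is a weak proxy) but that an explicit budget removes the model's incentive to default to "comprehensive."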
Related
- Clay study - relevant if Clay addresses data enrichment depth