When an LLM is called with temperature=0 and returns an empty response, OpenHands retries the same call with temperature=1.0. The randomness escapes whatever local minimum produced the empty completion.
```python
def call_llm(messages):
    r = client.complete(messages, temperature=0)
    if not r.content.strip():
        # Deterministic call came back empty; retry with randomness.
        r = client.complete(messages, temperature=1.0)
    return r
```
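A defensive variant of the same idea, sketched here as an assumption rather than the OpenHands code: a single temperature=1.0 retry can itself come back empty, so cap the attempts and escalate the temperature. The fake client below exists only to make the sketch runnable.

```python
def call_llm_bounded(client, messages, temperatures=(0, 0.7, 1.0)):
    """Retry an empty completion at progressively higher temperatures.

    Illustrative variant, not the OpenHands implementation.
    """
    r = None
    for temp in temperatures:
        r = client.complete(messages, temperature=temp)
        if r.content.strip():
            break
    return r  # may still be empty after exhausting all temperatures


# Hypothetical stand-in for a real LLM client, so the sketch is self-contained:
# it returns an empty string at temperature=0 and text otherwise.
class FakeResp:
    def __init__(self, content):
        self.content = content

class FakeClient:
    def __init__(self):
        self.calls = []

    def complete(self, messages, temperature):
        self.calls.append(temperature)
        return FakeResp("" if temperature == 0 else "ok")

client = FakeClient()
resp = call_llm_bounded(client, [])
```

The escalation ladder (0, 0.7, 1.0) is arbitrary; the point is that the loop terminates whether or not the model ever produces output.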
Why the empty response happens
Models occasionally produce zero-length output when the most-likely-next-token is the stop token. With temperature=0, the model is deterministic — same inputs, same empty answer on retry.
Bumping temperature breaks the determinism just enough that a non-stop token wins, and the model resumes generating.
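A toy sampler makes the mechanism concrete. The logits below are invented for illustration: the stop token (index 0) is only slightly more likely than its rivals, so greedy decoding always picks it, while temperature=1.0 sampling usually does not.

```python
import math
import random

def sample_next(logits, temperature, rng):
    """Sample a token index from raw logits at the given temperature.

    temperature=0 degenerates to argmax, i.e. fully deterministic decoding.
    """
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    # Softmax over temperature-scaled logits (shifted by the max for stability).
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

# Toy vocabulary: index 0 is the stop token, marginally ahead of three rivals.
logits = [1.1, 1.0, 1.0, 1.0]
rng = random.Random(0)

# Greedy decoding (temperature=0) picks the stop token every time...
greedy = sample_next(logits, 0, rng)

# ...while at temperature=1.0 the stop token wins only ~27% of draws,
# so a retry almost always resumes generating.
draws = [sample_next(logits, 1.0, rng) for _ in range(1000)]
```

This is why the fix is cheap: nothing about the prompt changes, only the tie-break at the first token.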
When NOT to apply this
If your agent is “stuck” on a real conceptual problem, more randomness doesn’t help. The fix is to break the prompt, not break the temperature.
This pattern is specific to the empty-response failure mode. Don’t apply it to all errors.
Generalization
The broader point: LLM calls have specific failure modes that have specific cheap fixes. Most teams treat all LLM errors as “retry with backoff,” which is a coarse hammer. A small library of failure-mode-specific recoveries is more reliable:
- Empty response with temp=0: retry temp=1.
- Truncated JSON in tool args: retry with a “complete the JSON” repair prompt.
- Refused with no tool call: check guardrails, simplify the request.
- Tool call with hallucinated tool name: include allowed tool list in the retry.
Each is a few lines. Together they make the agent dramatically more robust.
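The list above can be sketched as a small dispatch table. Every name here (`classify_failure`, `plan_retry`, `RECOVERIES`, and the response-dict shape) is illustrative, not an OpenHands API:

```python
import json

def is_complete_json(text):
    """True if text parses as a complete JSON value."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

def classify_failure(resp, allowed_tools):
    """Map a raw response dict to a failure-mode label, or None if healthy."""
    tool = resp.get("tool_call")
    if tool and tool["name"] not in allowed_tools:
        return "hallucinated_tool"
    if tool and not is_complete_json(tool["args"]):
        return "truncated_json"
    if not resp.get("content", "").strip() and not tool:
        return "empty"
    return None

# Each recovery returns the *adjustment* to apply on the retry call.
RECOVERIES = {
    "empty": lambda resp, tools: {"temperature": 1.0},
    "truncated_json": lambda resp, tools: {
        "repair_prompt": "Complete this JSON object: " + resp["tool_call"]["args"]},
    "hallucinated_tool": lambda resp, tools: {
        "reminder": "Allowed tools: " + ", ".join(sorted(tools))},
}

def plan_retry(resp, allowed_tools):
    mode = classify_failure(resp, allowed_tools)
    return None if mode is None else RECOVERIES[mode](resp, allowed_tools)
```

The table structure is the point: each entry is a few lines, and adding a newly observed failure mode means adding one classifier branch and one recovery entry, not touching the others.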
Sources
- openhands-2/05-llm-integration.md:545? unverified