When an LLM is called with temperature=0 and returns an empty response, OpenHands retries the same call with temperature=1.0. The randomness escapes whatever local minimum produced the empty completion.
```python
def call_llm(messages):
    r = client.complete(messages, temperature=0)
    if not r.content.strip():
        # Deterministic call came back empty; retry with randomness.
        r = client.complete(messages, temperature=1.0)
    return r
```
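A defensive variant of the same idea, sketched here as an assumption rather than the OpenHands code: a single temperature=1.0 retry can itself come back empty, so cap the attempts and escalate the temperature. The fake client below exists only to make the sketch runnable.

```python
def call_llm_bounded(client, messages, temperatures=(0, 0.7, 1.0)):
    """Retry an empty completion at progressively higher temperatures.

    Illustrative variant, not the OpenHands implementation.
    """
    r = None
    for temp in temperatures:
        r = client.complete(messages, temperature=temp)
        if r.content.strip():
            break
    return r  # may still be empty after exhausting all temperatures


# Hypothetical stand-in for a real LLM client, so the sketch is self-contained:
# it returns an empty string at temperature=0 and text otherwise.
class FakeResp:
    def __init__(self, content):
        self.content = content

class FakeClient:
    def __init__(self):
        self.calls = []

    def complete(self, messages, temperature):
        self.calls.append(temperature)
        return FakeResp("" if temperature == 0 else "ok")

client = FakeClient()
resp = call_llm_bounded(client, [])
```

The escalation ladder (0, 0.7, 1.0) is arbitrary; the point is that the loop terminates whether or not the model ever produces output.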
Why the empty response happens
Models occasionally produce zero-length output when the most-likely-next-token is the stop token. With temperature=0, the model is deterministic — same inputs, same empty answer on retry.
Bumping temperature breaks the determinism just enough that a non-stop token wins, and the model resumes generating.
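A toy sampler makes the mechanism concrete. The logits below are invented for illustration: the stop token (index 0) is only slightly more likely than its rivals, so greedy decoding always picks it, while temperature=1.0 sampling usually does not.

```python
import math
import random

def sample_next(logits, temperature, rng):
    """Sample a token index from raw logits at the given temperature.

    temperature=0 degenerates to argmax, i.e. fully deterministic decoding.
    """
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    # Softmax over temperature-scaled logits (shifted by the max for stability).
    scaled = [l / temperature for l in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]

# Toy vocabulary: index 0 is the stop token, marginally ahead of three rivals.
logits = [1.1, 1.0, 1.0, 1.0]
rng = random.Random(0)

# Greedy decoding (temperature=0) picks the stop token every time...
greedy = sample_next(logits, 0, rng)

# ...while at temperature=1.0 the stop token wins only ~27% of draws,
# so a retry almost always resumes generating.
draws = [sample_next(logits, 1.0, rng) for _ in range(1000)]
```

This is why the fix is cheap: nothing about the prompt changes, only the tie-break at the first token.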
When NOT to apply this
If your agent is “stuck” on a real conceptual problem, more randomness doesn’t help. The fix is to break the prompt, not break the temperature.
This pattern is specific to the empty-response failure mode. Don’t apply it to all errors.
Generalization
The broader point: LLM calls have specific failure modes that have specific cheap fixes. Most teams treat all LLM errors as “retry with backoff,” which is a coarse hammer. A small library of failure-mode-specific recoveries is more reliable:
- Empty response with temp=0: retry temp=1.
- Truncated JSON in tool args: retry with a “complete the JSON” repair prompt.
- Refused with no tool call: check guardrails, simplify the request.
- Tool call with hallucinated tool name: include allowed tool list in the retry.
Each is a few lines. Together they make the agent dramatically more robust.
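The list above can be sketched as a small dispatch table. Every name here (`classify_failure`, `plan_retry`, `RECOVERIES`, and the response-dict shape) is illustrative, not an OpenHands API:

```python
import json

def is_complete_json(text):
    """True if text parses as a complete JSON value."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False

def classify_failure(resp, allowed_tools):
    """Map a raw response dict to a failure-mode label, or None if healthy."""
    tool = resp.get("tool_call")
    if tool and tool["name"] not in allowed_tools:
        return "hallucinated_tool"
    if tool and not is_complete_json(tool["args"]):
        return "truncated_json"
    if not resp.get("content", "").strip() and not tool:
        return "empty"
    return None

# Each recovery returns the *adjustment* to apply on the retry call.
RECOVERIES = {
    "empty": lambda resp, tools: {"temperature": 1.0},
    "truncated_json": lambda resp, tools: {
        "repair_prompt": "Complete this JSON object: " + resp["tool_call"]["args"]},
    "hallucinated_tool": lambda resp, tools: {
        "reminder": "Allowed tools: " + ", ".join(sorted(tools))},
}

def plan_retry(resp, allowed_tools):
    mode = classify_failure(resp, allowed_tools)
    return None if mode is None else RECOVERIES[mode](resp, allowed_tools)
```

The table structure is the point: each entry is a few lines, and adding a newly observed failure mode means adding one classifier branch and one recovery entry, not touching the others.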
Sources
- openhands-2/05-llm-integration.md:545? unverified