
fix: handle direct model answers in ReACT loop#763

Draft
markstur wants to merge 2 commits into generative-computing:main from markstur:issue_762

Conversation

@markstur
Contributor

@markstur markstur commented Mar 27, 2026

Misc PR

Type of PR

  • Bug Fix
  • New Feature
  • Documentation
  • Other

Description

fix: handle direct model answers in ReACT loop

The ReACT framework now properly handles cases where the model provides
a direct answer without calling tools. Previously, these answers were
ignored and the loop would continue until exhausting the budget.

Added test coverage for both scenarios (no tools, unused tools).

Testing

  • Tests added to the respective file if code was changed
  • New code has 100% coverage if code was added
  • Ensure existing tests and github automation passes (a maintainer will kick off the github automation when the rest of the PR is populated)

Fixes: generative-computing#762

Signed-off-by: Mark Sturdevant <[email protected]>
@markstur markstur requested a review from a team as a code owner March 27, 2026 23:26
@github-actions github-actions bot added the bug Something isn't working label Mar 27, 2026
@github-actions
Contributor

The PR description has been updated. Please fill out the template for your PR to be reviewed.

@markstur
Contributor Author

Note to reviewers: I do have concerns about my limited test environment and side effects. I'm seeing this as a good fix for when react_using_mellea with DuckDuckGo starts to find nothing. I suspect it could be impacted by DDGS rate limiting, but I'm not sure why it looks like this. The bottom line, though, is that there is a case where we have an answer in the step value (is_complete) but we miss our is_final handling.

I wonder if there is any reason to only do this check after running out of iterations (last-ditch handling), but it seems more right to me to just use the value when this elif case happens.

Contributor

@jakelorocco jakelorocco left a comment


Hi @markstur, thanks for the PR! I think this might not be an ideal way to fix the issue. I do agree that our current version of the react thinking pattern does get stuck in loops (especially for simpler answers that can be accomplished in one response).

However, I don't think we should automatically assume that a step with no tool calls and a response is the final answer. There are moments where the model will output its thoughts in those intermediate steps and then continue.

The issue I see the most (especially with smaller models) is that the model thinks its final tool has already been called. As a result, it just keeps repeating the same output and gets stuck until the loop exhausts.

I think there are a few potential solutions:

  • We could change the requirements for calling a tool to finalize. A lot of react patterns just look for "final_answer:" and parse the output. We could also add this in addition to the current tool call approach.
  • We could try to detect repetitions and prompt the model out of those situations. I'm not quite sure if the exact repetitions are applicable only to granite models / small models / our prompts though.
  • We could add a subsequent LLM call after each step that is a requirement that validates if the question has been answered. This adds overhead to each loop iteration, but is likely relatively low since the context should be cached. Then, if that requirement is valid, we could do one more prompt to extract the final answer using the tool / the current approach.

@markstur
Contributor Author

Thanks @jakelorocco

Yes I agree my naive approach is probably assuming too much. I was unsuccessful when I first tried to fix this with some alternative approaches but I think I need to revisit because the problem was flaky at the time. I think I can reproduce it better now (maybe).

I'll see if I can get good results that align better with your bulleted suggestions.

With some models (and Ollama for example) we get stuck where
the model has the answer but won't call finalize. Before
failing due to iteration limit, ask the model if it has the
answer and if it responds True then use it.

Note: This is only done at the end of iterations because it is
questionable to penalize other models on each iteration. When
failure is the only option, it seems to be worth a try.

Fixes: generative-computing#762

Signed-off-by: Mark Sturdevant <[email protected]>
Contributor

@planetf1 planetf1 left a comment


Just one small comment on test classification, otherwise LGTM.



@pytest.mark.ollama
@pytest.mark.llm
Contributor


Suggested change:
-@pytest.mark.llm
+@pytest.mark.e2e

We updated our markers (llm->e2e).

@markstur markstur requested a review from a team as a code owner April 18, 2026 00:18
@markstur markstur requested review from AngeloDanducci and ajbozarth and removed request for AngeloDanducci and ajbozarth April 18, 2026 00:18
@markstur markstur marked this pull request as draft April 18, 2026 00:19
@markstur
Contributor Author

@planetf1 @jakelorocco I was letting this sit because I don't have a good solution, but I wanted to at least share this hack. It waits until the iteration loop ends and then asks the LLM if it has an answer before failing. It works for me in this ollama small model loopy situation. So before I give up on this, I thought I'd share this and ask if you think there is a more Melleaic way of doing this. Tweaking the jinja templates didn't do it for me. Probably because this is a small model behaving badly.

String parsing for "final answer" in a variety of formats would work sometimes, but it is also hacky and inconsistent.

I like the idea of not doing it in every loop, but maybe as @jakelorocco said, it could be fast enough with caching.

Can/should we have a way to pass in an optional are-you-done test for the case where tools are not getting called? We could call it either whenever tools are not called, or at the end, or after X loops...
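One way that optional hook could look, purely as a sketch; `run_react`, `are_you_done`, and `policy` are all invented names, and each step is reduced to a (used_tools, value) pair:

```python
def run_react(steps, are_you_done=None, policy="on_no_tools"):
    """steps: list of (used_tools, value) pairs standing in for model turns."""
    last_value = None
    for used_tools, value in steps:
        last_value = value
        if used_tools:
            continue  # tool dispatch omitted in this sketch
        # "on_no_tools" policy: check whenever the model skipped tools
        if are_you_done and policy == "on_no_tools" and are_you_done(value):
            return value
    # "at_end" policy: one last-ditch check before giving up
    if are_you_done and policy == "at_end" and are_you_done(last_value):
        return last_value
    return None

done = lambda v: v is not None and v.endswith(".")
print(run_react([(True, None), (False, "The answer is 4.")], done))
```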


Labels: bug (Something isn't working)

Linked issue: fix: react loop is missing a check for result without tool calls

3 participants