cookbook(openai-agents): use gpt-4.1-mini + sharpen Pattern-1 prompts by DK09876 · Pull Request #40 · vectorize-io/hindsight-cookbook

DK09876 · 2026-06-04T01:51:36Z

Summary

Two related changes to notebooks/10-openai-agents-memory.ipynb to make the memory demos work reliably end-to-end. Both pair with the integration PR vectorize-io/hindsight#1866.

What & why

1. Model upgrade: `gpt-4o-mini` → `gpt-4.1-mini` (cells 4 and 9)

Pattern 2 (memory_instructions auto-inject) was returning "I currently don't have that information" under gpt-4o-mini even though memory_instructions had injected the recalled facts into the system prompt — the model was simply not grounding its answer on them. Under gpt-4.1-mini, cell 10 now returns the recalled facts verbatim:

"You use Python and SQL as your programming languages. As for tools, you use VS Code as your code editor, and you prefer to work in dark mode."

Pattern 2 is the load-bearing demo for the auto-inject path most users will actually adopt.
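
To make the Pattern 2 mechanism concrete, here is a minimal sketch of how recalled facts could be folded into a system prompt before the model is called. It is an illustration only: `build_system_prompt` and its arguments are hypothetical names, not the `memory_instructions` API from the integration, whose signature is not shown in this PR.

```python
def build_system_prompt(base_instructions: str, recalled_facts: list[str]) -> str:
    """Append recalled facts to the base instructions (hypothetical helper)."""
    if not recalled_facts:
        # Nothing recalled: leave the instructions untouched.
        return base_instructions
    bullet_list = "\n".join(f"- {fact}" for fact in recalled_facts)
    return (
        f"{base_instructions}\n\n"
        "Known facts about the user (answer from these when relevant):\n"
        f"{bullet_list}"
    )


# Example facts taken from the cell 10 answer quoted above.
prompt = build_system_prompt(
    "You are a helpful assistant.",
    ["Uses Python and SQL", "Uses VS Code", "Prefers dark mode"],
)
print(prompt)
```

The failure described above happens after this step: the facts are already in the prompt, but the model still has to ground its answer on them, and that is the part the model swap addresses.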

2. Pattern-1 prompt sharpening (cells 7 and 8)

Same template the Claude SDK cookbook adopted (commit d92ccab373 on cookbook main): nudge the agent to call recall_memory explicitly before answering, instead of leaving the tool-use decision to the model.

Cell 7: "What IDE do I use? And what's my job?" → "Recall what you know about me first, then answer: what IDE do I use and what's my job?"
Cell 8: same template ("Use the recall tool to refresh what you know about me first, then recommend...").

Cell 7 stays best-effort under gpt-4.1-mini's tool-use variance — the model still sometimes asks "what's your IDE?" rather than recalling — but with the sharpened prompt the failure rate drops and the cell makes the user's intent explicit to a reader.
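
One way to put a number on "best-effort" is to repeat the cell and count how often the recall tool is actually called. The sketch below assumes a hypothetical `run_once` callable that runs the agent and reports whether the recall tool was invoked; `fake_agent` is an offline stand-in with made-up probabilities, not a measurement of either model.

```python
import random
from typing import Callable

RECALL_FIRST = "Recall what you know about me first, then answer: "


def sharpen(question: str) -> str:
    """Prefix a question with the recall-first nudge used in cell 7."""
    return RECALL_FIRST + question


def recall_rate(run_once: Callable[[str], bool], prompt: str, trials: int = 20) -> float:
    """Fraction of trials in which the agent called the recall tool.

    `run_once` is hypothetical: it runs the agent on `prompt` and
    returns True if the recall tool was invoked during that run.
    """
    hits = sum(1 for _ in range(trials) if run_once(prompt))
    return hits / trials


_rng = random.Random(0)  # seeded so the offline stand-in is repeatable


def fake_agent(prompt: str) -> bool:
    """Offline stand-in for a real agent run; probabilities are invented."""
    probability = 0.9 if prompt.startswith(RECALL_FIRST) else 0.5
    return _rng.random() < probability


baseline = recall_rate(fake_agent, "What IDE do I use? And what's my job?")
sharpened = recall_rate(fake_agent, sharpen("what IDE do I use and what's my job?"))
print(f"baseline={baseline:.2f} sharpened={sharpened:.2f}")
```

Swapping `fake_agent` for a function that runs the real cell 7 agent and inspects the run for a `recall_memory` call would turn this into an actual before/after comparison.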

Relationship to other PRs

Pairs with vectorize-io/hindsight#1866, "fix(openai-agents): default to Cloud + gated E2E + requires_real_llm bucketing" (the integration code is unchanged; only the cookbook is touched here).
Supersedes the unmerged PR #38, "cookbook: sharpen Pattern-1 prompts in 08-llamaindex + 10-openai-agents" (cookbook/pattern1-prompt-polish-oa-llamaindex), for the OpenAI Agents notebook. That PR was abandoned because the prompt polish alone didn't reliably move Pattern 1 cells under gpt-4o-mini. This PR pairs the polish with the model bump, which is what unlocks Pattern 2 reliably.

Test plan

Cookbook 11/11 cells execute against canonical PR #1866 build, 0 errors.
Cell 10 (Pattern 2 auto-inject) reliably returns recalled facts (Python/SQL/VS Code/dark mode).
Cell 7 (Pattern 1 tool recall) is best-effort under gpt-4.1-mini; cookbook narrative now matches what the user can expect.
Reviewer to run end-to-end against the merged hindsight-openai-agents.
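
As a quick pre-check before the full run, a reviewer could confirm that no code cell still references the old model. This is a sketch using only the standard library and the notebook's JSON layout; `cells_using_model` and `load_notebook` are hypothetical helpers, and the indices they return are 0-based positions in the file, which may not match the cell numbers used in this description.

```python
import json


def cells_using_model(notebook: dict, model: str) -> list[int]:
    """Return 0-based indices of code cells whose source mentions `model`."""
    hits = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        # Notebook JSON stores source as either a string or a list of lines.
        text = source if isinstance(source, str) else "".join(source)
        if model in text:
            hits.append(index)
    return hits


def load_notebook(path: str) -> dict:
    """Read an .ipynb file as plain JSON."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Tiny in-memory notebook so the sketch runs without the real file.
sample = {
    "cells": [
        {"cell_type": "markdown", "source": ["gpt-4o-mini was used before"]},
        {"cell_type": "code", "source": ['agent = Agent(model="gpt-4.1-mini")']},
    ]
}
print(cells_using_model(sample, "gpt-4o-mini"))   # → []
print(cells_using_model(sample, "gpt-4.1-mini"))  # → [1]
```

Against the real file, `cells_using_model(load_notebook("notebooks/10-openai-agents-memory.ipynb"), "gpt-4o-mini")` should come back empty if the swap is complete.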

🤖 Generated with Claude Code

Two related changes to make 10-openai-agents-memory.ipynb's memory demos work reliably end-to-end:

1. Model: gpt-4o-mini → gpt-4.1-mini in both cells where the Agent is constructed (cell 4 explicit-tool agent, cell 9 auto-memory agent). Pattern 2 (memory_instructions auto-inject) was returning "I currently don't have that information" under gpt-4o-mini even though the memory_instructions hook had injected the recalled facts into the system prompt; the model was simply not grounding its answer on them. With gpt-4.1-mini, cell 10 now returns: "You use Python and SQL as your programming languages. As for tools, you use VS Code as your code editor, and you prefer to work in dark mode."

2. Pattern-1 prompts (cells 7 and 8): nudge the agent to call recall_memory explicitly before answering, instead of leaving the tool-use decision to the model. This is the same template the Claude SDK cookbook adopted (commit d92ccab on cookbook main).

Cell 7 stays best-effort under gpt-4.1-mini's tool-use variance (the model still sometimes asks "what's your IDE" rather than recalling), but with the sharpened prompt the failure rate is visibly lower, and the cell makes the user's intent (recall first, then answer) explicit to a reader. Pattern 2 (cell 10) is the load-bearing demo for the auto-inject path that most users will actually adopt.

Verified end-to-end against the canonical PR #1866 build of hindsight_openai_agents.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

DK09876 mentioned this pull request Jun 4, 2026

cookbook(langgraph): use gpt-4.1-mini + sharpen Pattern-1 prompts #41


DK09876 wants to merge 1 commit into main from cookbook/openai-agents-model-prompt-polish
