Fix server model roles, RAG, image tasks, and GPU options #126

Open
sheikhti1205 wants to merge 11 commits into Siddhesh2377:re-write from sheikhti1205:codex/server-model-roles-rag-gpu

Conversation

@sheikhti1205

Hi, this PR includes the ToolNeuron fixes/improvements I mentioned in discussion #125.

What changed

Server model selection

  • Fixed remote server chat model selection so it no longer silently auto-picks the wrong model.
  • Added stricter model handling for server requests.
  • Added Android Server screen chat model picker.
  • Updated the bundled Web UI settings so it uses a real chat model dropdown from /v1/models.
  • Prevented embedding/upscaler models from being selected as chat models.

Manual model categories

  • Added manual model category assignment for installed models.
  • Categories include Chat, Embedding, Image Generation, Image Upscaler, TTS, and STT.
  • These categories are used across Store, Model Manager, Server, and Image Task screens.
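
The category assignment and the stricter chat model selection described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `ModelCategory`, `InstalledModel`, `resolveChatModel`, and `chatModelChoices` are hypothetical names, and the idea that a manual assignment overrides a detected category is an assumption.

```kotlin
// Hypothetical types for illustration; not taken from the ToolNeuron codebase.
enum class ModelCategory { CHAT, EMBEDDING, IMAGE_GENERATION, IMAGE_UPSCALER, TTS, STT }

data class InstalledModel(
    val id: String,
    val detected: ModelCategory,        // category guessed at install time (assumption)
    val manual: ModelCategory? = null,  // user's manual assignment, if any
) {
    // A manual assignment wins over the detected category.
    val category: ModelCategory get() = manual ?: detected
}

sealed interface ChatModelResolution {
    data class Resolved(val model: InstalledModel) : ChatModelResolution
    data class Rejected(val reason: String) : ChatModelResolution
}

// Resolve the model named in a server request. There is deliberately no
// fallback to "some other installed model": a bad request is rejected.
fun resolveChatModel(requestedId: String?, installed: List<InstalledModel>): ChatModelResolution {
    if (requestedId.isNullOrBlank()) return ChatModelResolution.Rejected("no model specified")
    val model = installed.firstOrNull { it.id == requestedId }
        ?: return ChatModelResolution.Rejected("unknown model: $requestedId")
    if (model.category != ModelCategory.CHAT) {
        return ChatModelResolution.Rejected("${model.id} is not a chat model")
    }
    return ChatModelResolution.Resolved(model)
}

// Entries a chat model picker would offer: chat models only.
fun chatModelChoices(installed: List<InstalledModel>): List<InstalledModel> =
    installed.filter { it.category == ModelCategory.CHAT }
```

Returning an explicit `Rejected` result instead of picking a default is what keeps a wrong model from being chosen silently; the caller decides how to report the error.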

Image task improvements

  • Added better handling for image generation, inpaint, and upscale tasks.
  • Added progress/metrics UI.
  • Added output options like keeping result in session, replacing input image, saving to Photos, and Save As.
  • Added GPU/OpenCL toggle for image tasks.

RAG improvements

  • Allowed selecting all file types for RAG.
  • Added fallback text extraction for more document-like formats.
  • Improved document summary behavior so summaries use broader document excerpts instead of only small top-k retrieval chunks.
  • Added better default RAG/embedding model repos.
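
One way to give a summary prompt broader coverage than top-k chunks is to pass the whole text when it fits and otherwise sample evenly spaced excerpts. The sketch below is only an illustration of that idea: `summaryContext` is a hypothetical function, and the budget is counted in characters as a stand-in for tokens.

```kotlin
// Hypothetical helper; the character budget is a simplification of a token budget.
fun summaryContext(text: String, budgetChars: Int, excerptCount: Int = 4): String {
    require(budgetChars > 0 && excerptCount > 0)
    // Small documents go in whole.
    if (text.length <= budgetChars) return text

    val separator = "\n[...]\n"
    // Split the budget between the excerpts, leaving room for the separators.
    val excerptLen = (budgetChars - separator.length * (excerptCount - 1)) / excerptCount
    require(excerptLen > 0) { "budget too small for $excerptCount excerpts" }

    // Spread excerpt start positions from the beginning to the end of the text.
    val stride = (text.length - excerptLen) / (excerptCount - 1).coerceAtLeast(1)
    return (0 until excerptCount).joinToString(separator) { i ->
        val start = i * stride
        text.substring(start, start + excerptLen)
    }
}
```

Sampling across the whole document means the summary sees the beginning, middle, and end, rather than only the few chunks most similar to the query.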

GPU option for chat/server

  • Added GGUF GPU offload option in model loading settings.
  • Applied the same model config to normal chat and server/VLM loading paths.
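
Sharing one config between the chat and server paths could look something like the sketch below. All names here (`ModelLoadConfig`, `LoadParams`, `toLoadParams`, `loadForChat`, `loadForServer`) are hypothetical, and treating 0 offloaded layers as CPU-only is an assumption about the backend.

```kotlin
// Hypothetical settings object; field names and defaults are assumptions.
data class ModelLoadConfig(
    val contextSize: Int = 4096,
    val gpuOffload: Boolean = false,
    val requestedGpuLayers: Int = 99,
)

// Parameters handed to whatever actually loads the GGUF file.
data class LoadParams(val path: String, val contextSize: Int, val gpuLayers: Int)

// Single place where the GPU toggle is turned into a layer count.
fun ModelLoadConfig.toLoadParams(path: String): LoadParams =
    LoadParams(
        path = path,
        contextSize = contextSize,
        gpuLayers = if (gpuOffload) requestedGpuLayers.coerceAtLeast(0) else 0,
    )

// Both entry points go through the same conversion, so they cannot drift apart.
fun loadForChat(path: String, config: ModelLoadConfig): LoadParams = config.toLoadParams(path)
fun loadForServer(path: String, config: ModelLoadConfig): LoadParams = config.toLoadParams(path)
```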

Web UI

  • Redesigned the bundled server Web UI to visually match the Android app more closely.

Tested

./gradlew :app:compileDebugKotlin --console=plain
./gradlew :app:assembleDebug --console=plain

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bec32eeacc

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread app/src/main/assets/server_webui.html Outdated
Comment thread app/src/main/java/com/dark/tool_neuron/viewmodel/ImageTaskViewModel.kt Outdated
@sheikhti1205

Author

Should I convert this to a draft and apply the fixes it recommends?
I also noticed another issue with "Server model roles":
Screenshot_2026-06-14-19-41-17-532_com dark tool_neuron
Screenshot_2026-06-14-19-41-13-691_com dark tool_neuron
@Siddhesh2377

Owner

Yeah bro, fix whatever you can. I'm currently out, so I can't code much.
cc: @sheikhti1205

@sheikhti1205

sheikhti1205 commented Jun 15, 2026

Author

For the time being, I think the app is in a good state. I tested it as much as I could, and so far I haven’t noticed any major issues—only some thoughts for future improvements.

Since you mentioned some new things in your last release note, I think you can start working on those whenever you get time.

Also, about this PR, would you prefer to merge the current changes first and continue the remaining improvements in follow-up PRs, or should I keep this PR open and add more fixes here? I’m okay with both approaches.
@Siddhesh2377

Owner

It's OK, you can add more stuff in this PR.
@sheikhti1205

Author

@Siddhesh2377

Note: my exams are starting soon, so I may not be able to work on this for a month or more. If you have suggestions or want changes in a specific direction, please let me know here and I’ll try to address them when I’m available.

Recent update summary:

  • Improved model selection behavior:

    • chat now accepts manually assigned chat-capable models, not only VLM-style models
    • image input still requires a vision-capable model
    • clearer warnings when no model is installed, selected, still downloading, or incompatible with the current input
  • Improved model store organization:

    • added shared model taxonomy/grouping
    • changed the store into a clearer family -> task -> model structure
    • kept DeepSeek separated so DeepSeek/Qwen-style model names do not crowd the Qwen section
    • added a default filter that hides models larger than 2GB, with an option to show them manually
    • added LFM 1.2B instruct/thinking and LFM2.5-VL 1.6B entries
    • added small Gemma, SmolLM, Qwen, and DeepSeek reasoning entries
    • reduced the crowded filter/category area
  • Improved backup/import:

    • added setup restore entry point
    • added export/import progress with ETA
    • added import preview with per-model selection
    • added overwrite/conflict handling
    • added checksum verification
    • added support for exporting content-URI models
    • added notifications for backup completion/failure
  • Improved RAG behavior:

    • better document-summary prompt
    • avoids meta answers like “the question is asking...”
    • uses full extracted document text when it fits in context
    • increased usable RAG context budget where the model allows it
    • changed “Possible sources” to “Sources” when document chunks are attached
  • Improved web search behavior:

    • better handling for direct links, especially Play Store / Google Play requests
    • better query targeting for app download links
    • carries context for short follow-up requests like “exact link”
  • Added app-side notification when an AI response completes while the app is not foregrounded.
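
The checksum verification step for imported models could be sketched as below. The summary above does not say which hash is used, so SHA-256 is an assumption, and `sha256Hex` and `verifyImportedModel` are hypothetical helper names.

```kotlin
import java.io.File
import java.security.MessageDigest

// Stream the file through the digest so large model files are never fully in memory.
fun sha256Hex(file: File): String {
    val digest = MessageDigest.getInstance("SHA-256")
    file.inputStream().use { input ->
        val buffer = ByteArray(DEFAULT_BUFFER_SIZE)
        while (true) {
            val read = input.read(buffer)
            if (read < 0) break
            digest.update(buffer, 0, read)
        }
    }
    return digest.digest().joinToString("") { "%02x".format(it) }
}

// Compare against the checksum recorded at export time (hypothetical manifest value).
fun verifyImportedModel(file: File, expectedSha256: String): Boolean =
    sha256Hex(file).equals(expectedSha256.trim(), ignoreCase = true)
```

A mismatch would typically mean the backup was truncated or corrupted, so the import should skip or flag that model instead of installing it.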

Important note:

  • I attempted to move the bundled server Web UI toward an Open WebUI-style structure, but the current result is not good enough. The web interface got messy and should probably be treated as needing a full rewrite rather than small patching.

Other notes:

  • Web search is improved, but still not perfect.
  • RAG is also improved, but I think more effective changes will need better real-world examples and articles/cases to test against. Until I stumble across better references for what works well here, this is a reasonable stopping point.

Tested with:

./gradlew :app:compileDebugKotlin --console=plain
./gradlew :app:assembleDebug --console=plain

@Siddhesh2377

Owner

Hey @sheikhti1205
You can focus on your exams, bro. I'll look into this after a month. Best of luck, man!

@sheikhti1205

sheikhti1205 commented Jun 22, 2026

Author

@Siddhesh2377 Since I'm putting effort into this, I really hope it turns into a great project! I'm adding a bit more to make things right, plus some extras 🌟🌟
Question: should I open a new PR or keep this one?
Edit: I should add that the initial setup UI needs more fixes, for example for loopholes. I'm not fixing those this time, sorry. I'll leave that to you.

@sheikhti1205

Author

Major changes so far:

  • Added server model roles and multi-engine remote server catalog handling.
  • Improved the Remote Server Web UI with a responsive layout, a separate CSS asset, settings, history, Markdown export, read aloud, attachments, and mobile behavior.
  • Added a /webui.css native server route and allowed it through public auth.
  • Improved server chat routing, VLM routing, setup state, and model role fallback behavior.
  • Added model backup/import/export support. (needs fixing/improvement)
  • Added a Downloads screen, download history, labels, active-download tracking, and now retry/clear handling. (untested)
  • Added automatic download retry for transient network/HTTP failures, preserving the partial .hxd_tmp file for resume. (untested)
  • Removed the visible Tool/Search model UX; legacy TOOL_SEARCH installs migrate to normal GGUF.
  • Simplified Vision Store browsing to provider/family sections instead of nested VLM group cards.
  • Kept the VLM base + projector auto-download behavior internally.
  • Added an adaptive image upscale workflow: 2x/4x/8x/custom. (needs more fixing)
  • Removed the temporary ONNX image-operation UI/runtime direction from active scope.
  • Added Storage maintenance: Quick clean, Detailed check, Deep model test, and report summaries. (untested)
  • Restructured Settings into grouped areas: Models, Storage, Downloads, Remote Server, Web Search, Privacy & Security, Appearance, Advanced, and About.
  • Improved the web search workflow, query modes, page fetching/extraction, search cards, and result state handling. (partially tested before further improvements were added, so treat it as untested 😕)
  • Improved RAG/search-related flows and model setup packs.
  • Added system TTS/STT fallback plumbing and voice handling polish.
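
The automatic download retry with partial-file resume mentioned above could work along these lines. This is a sketch under assumptions, not the PR's implementation: `HttpStatusException`, `isTransient`, `backoffMillis`, and `downloadWithRetry` are hypothetical names, the set of status codes treated as transient is a guess, and the actual network transfer is left to the caller through `fetchFrom`.

```kotlin
import java.io.File
import java.io.IOException

// Hypothetical exception carrying the HTTP status of a failed request.
class HttpStatusException(val code: Int) : IOException("HTTP $code")

// Assumed policy: timeouts, rate limits, 5xx, and plain I/O errors are worth retrying.
fun isTransient(error: Throwable): Boolean = when (error) {
    is HttpStatusException -> error.code == 408 || error.code == 429 || error.code in 500..599
    is IOException -> true
    else -> false
}

// Exponential backoff: base * 2^(attempt - 1), capped at maxMillis.
fun backoffMillis(attempt: Int, baseMillis: Long = 1_000, maxMillis: Long = 30_000): Long =
    (baseMillis shl (attempt - 1).coerceIn(0, 20)).coerceAtMost(maxMillis)

// Retries fetchFrom, each time resuming at the current size of the partial file.
// The partial file is never deleted here, so bytes already downloaded are kept.
fun downloadWithRetry(
    partial: File,
    maxAttempts: Int = 5,
    sleep: (Long) -> Unit = { Thread.sleep(it) },
    fetchFrom: (offset: Long) -> Unit,
) {
    var attempt = 1
    while (true) {
        try {
            fetchFrom(if (partial.exists()) partial.length() else 0L)
            return
        } catch (e: Exception) {
            if (!isTransient(e) || attempt >= maxAttempts) throw e
            sleep(backoffMillis(attempt))
            attempt++
        }
    }
}
```

In a real client, `fetchFrom` would send the offset in a `Range` request header and append the response body to the partial file; non-transient errors such as a 404 are rethrown immediately instead of being retried.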

I know I said I would stop, but I wanted to make a few more fixes. There are still defects and flaws, and it all needs testing too, which I didn't do :-)
For example, the server Web UI is designed to work on desktop but not mobile (for now, on mobile you can basically only view it in desktop mode).


2 participants