A powerful SSMS extension for chatting with local Large Language Models directly in SQL Server Management Studio—completely private and secure
Features • Installation • Getting Started • Configuration • Commands
- Privacy-First: All data stays on your machine—no cloud APIs required
- Multiple LLM Support: Works with Ollama, LM Studio, OpenAI-compatible endpoints, and custom APIs
- SQL Server Expertise: Chat with AI assistants specialized in T-SQL, database design, and optimization
- Persistent Conversations: Chat history is maintained throughout your session
- Selectable Responses: Copy and paste AI responses easily
- Native SSMS Integration: Access via View → Other Windows → Local LLM Chat
- Simple UI: Clean, distraction-free chat interface
- Instant Access: No need to switch between applications
- Keyboard Shortcuts: Send messages with Ctrl+Enter
- Read Files: Load SQL scripts and files into the conversation context
- List Directories: Browse your script folders directly from chat
- Search Files: Find files using glob patterns (e.g., `*.sql`)
- Working Directory: Easily manage files in your documents folder
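The `/search` feature's pattern matching can be pictured with standard glob semantics. This is a sketch of how such matching typically behaves, not the extension's actual implementation, which may differ:

```python
from fnmatch import fnmatch

# Illustrative file list and glob patterns like those accepted by /search
files = ["backup_orders.sql", "restore.sql", "notes.txt"]

for pattern in ["*.sql", "backup_*.sql"]:
    matches = [f for f in files if fnmatch(f, pattern)]
    print(pattern, "->", matches)
```

Here `*.sql` matches both SQL scripts, while `backup_*.sql` narrows the results to files whose names start with `backup_`.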
Built-in slash commands for quick actions:
- `/help` - Display all available commands
- `/info` - Show extension information (author, version, GitHub)
- `/config` - Show current LLM configuration (API URL, model, settings)
- `/read <file-path>` - Read a SQL script or file into context
- `/list [directory]` - List files in a directory
- `/search <pattern>` - Search for files with patterns
- `/write <path>` - Prepare to write content to a file
- `/clear` - Clear conversation history
- Download the `.vsix` file from the releases page
- Double-click the `.vsix` file to install
- Restart SQL Server Management Studio
- Access via View → Other Windows → Local LLM Chat
git clone https://github.com/markusbegerow/local-llm-chat-ssms.git
cd local-llm-chat-ssms
# Open local-llm-chat-ssms.sln in Visual Studio
# Build the solution (F6)
# The VSIX will be in bin\Debug or bin\Release

Run the included `uninstall.bat` file or uninstall via Extensions → Manage Extensions in SSMS.
Option A: Ollama (Recommended for local use)
# Install Ollama from https://ollama.ai
ollama pull llama3
ollama serve

- Default URL: `http://localhost:11434`
- Provider: Select "Ollama"
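Under the hood, a client talks to Ollama over its local HTTP API. A minimal sketch of the request body, using field names from Ollama's documented `/api/chat` endpoint (this is illustrative; the extension's own `OllamaClient` may differ in detail):

```python
import json

# Minimal chat request body for Ollama's /api/chat endpoint
# (illustrative sketch, not the extension's actual code)
request = {
    "model": "llama3",
    "messages": [
        {"role": "system", "content": "You are a helpful SQL Server assistant."},
        {"role": "user", "content": "Explain clustered vs. nonclustered indexes."},
    ],
    "stream": False,  # request a single JSON response instead of a token stream
}

body = json.dumps(request)
# POST this body to http://localhost:11434/api/chat
```

Setting `"stream": False` keeps the example simple; a chat UI would normally stream tokens as they arrive.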
Option B: LM Studio
- Download from lmstudio.ai
- Load a model (e.g., Llama 3, Mistral)
- Start the local server (default: `http://localhost:1234/v1/chat/completions`)
- Provider: Select "LM Studio"
Option C: Custom/OpenAI-compatible API
- Use any OpenAI-compatible API endpoint
- Provider: Select "Custom" or "OpenAI Compatible"
- Set full endpoint URL (e.g., `https://your-api.com/v1/chat/completions`)
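For OpenAI-compatible endpoints, the request shape follows the standard chat completions format. A sketch of what such a request looks like, with the URL and token as placeholders (this mirrors the settings the extension asks for, not its internal code):

```python
import json

# Shape of an OpenAI-compatible chat completion request
# (url and token are placeholders from the configuration examples)
url = "https://your-api.com/v1/chat/completions"
headers = {
    "Content-Type": "application/json",
    "Authorization": "Bearer your-api-key-here",  # placeholder bearer token
}
payload = {
    "model": "gpt-4o",
    "messages": [
        {"role": "user", "content": "Write a T-SQL query listing the 10 largest tables."}
    ],
    "temperature": 0.7,
    "max_tokens": 2048,
}

body = json.dumps(payload)
# POST body to url with the headers above
```

Note that the full path (`/v1/chat/completions`) must be part of the configured API URL for Custom providers, which is why the troubleshooting section below checks for it.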
1. Open SSMS
2. Go to Tools → Options → Local LLM Chat → General
3. Configure your settings:
   - API Provider: Select your LLM provider (Ollama, LM Studio, OpenAI Compatible, or Custom)
   - API URL: Enter your API endpoint URL
   - Model Name: Enter the model to use (e.g., `llama3`, `gpt-4o`)
   - Temperature: Control randomness (0.0 = focused, 2.0 = creative)
   - Bearer Token: Optional authentication token (if required)
   - System Prompt: Customize the AI's behavior
   - Advanced Settings: Max tokens, timeout, history length
4. Click OK to save
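As an illustration, a System Prompt along these lines steers the assistant toward T-SQL work (the default prompt shipped with the extension may be worded differently):

```text
You are an expert SQL Server assistant. Answer questions about T-SQL,
database design, indexing, and query optimization. Prefer concise,
runnable T-SQL examples and call out version-specific behavior.
```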
- Open View → Other Windows → Local LLM Chat
- Type your question or command
- Press Send or Ctrl+Enter
- The AI will respond based on your configuration!
Tools → Options → Local LLM Chat → General
| Setting | Default | Description |
|---|---|---|
| API Provider | Ollama | The LLM provider to use (Ollama, LM Studio, OpenAI Compatible, Custom) |
| API URL | `http://localhost:11434` | Base URL for the LLM API endpoint |
| Model Name | `llama3` | Name of the model to use |
| Temperature | 0.7 | Controls randomness in responses (0.0-2.0) |
| Max Tokens | 2048 | Maximum number of tokens in the response |
| Timeout | 120 seconds | Request timeout in seconds |
| Max History Length | 50 | Maximum number of messages to keep in history |
| System Prompt | SQL Server assistant | System prompt that defines the AI's behavior |
| Bearer Token | (empty) | Optional bearer token for API authentication |
Ollama (Local)
Provider: Ollama
API URL: http://localhost:11434
Model: llama3
Temperature: 0.7
LM Studio (Local)
Provider: LM Studio
API URL: http://localhost:1234/v1/chat/completions
Model: llama-3-8b
Temperature: 0.7
Custom/OpenAI-compatible API
Provider: Custom
API URL: https://your-api.com/v1/chat/completions
Model: gpt-4o
Temperature: 0.7
Bearer Token: your-api-key-here
Type these commands in the chat window:
| Command | Description | Example |
|---|---|---|
| `/help` | Show all available commands | `/help` |
| `/info` | Show extension information | `/info` |
| `/config` | Display current configuration | `/config` |
| `/read <file-path>` | Read a SQL script or file | `/read scripts\backup.sql` |
| `/list [directory]` | List files in a directory | `/list C:\SQLScripts` |
| `/search <pattern>` | Search for files | `/search *.sql` |
| `/write <path>` | Prepare to write to a file | `/write test.sql` |
| `/clear` | Clear conversation history | `/clear` |
How do I optimize a query with multiple LEFT JOINs?
/read backup_script.sql
Can you review this backup script and suggest improvements?
Create a stored procedure that archives old orders to a history table
/config
Displays:
Current Configuration:
LLM Settings:
Provider: Ollama
API URL: http://localhost:11434
Model: llama3
Temperature: 0.7
Max Tokens: 2048
Timeout: 120s
Max History: 50
Bearer Token: (not set)
Working Directory: C:\Users\YourName\Documents
Tip: Change settings in Tools → Options → Local LLM Chat
/search *.sql
/list C:\SQLScripts
- Verify your LLM server is running:
  - Ollama: `curl http://localhost:11434/api/tags`
  - LM Studio: Check that the server is started in LM Studio
- Check the API URL in Tools → Options → Local LLM Chat
- For custom APIs, verify the full endpoint URL includes `/v1/chat/completions` or the correct path
- Make sure your API Provider matches your actual LLM service
- For OpenAI-compatible APIs, use "Custom" or "OpenAI Compatible" provider
- Verify the API URL is complete (e.g., `http://localhost:1234/v1/chat/completions`)
- Check your model is loaded in the LLM server
- Increase the Timeout setting in options
- Verify network connectivity to the API endpoint
- Close and reopen the chat window after changing settings
- Check that settings are saved in Tools → Options
- Restart SSMS if settings don't apply
- SSMS: Versions 18.x, 19.x, and 20 or higher are supported
- .NET Framework: 4.7.2 or higher
- Local LLM: Ollama, LM Studio, or compatible OpenAI API server
- Architecture: 64-bit (amd64)
# Clone repository
git clone https://github.com/markusbegerow/local-llm-chat-ssms.git
cd local-llm-chat-ssms
# Open in Visual Studio
start local-llm-chat-ssms.sln
# Build solution (F6)
# Output: bin\Debug\local-llm-chat-ssms.vsix or bin\Release\local-llm-chat-ssms.vsix

local-llm-chat-ssms/
├── ChatMessageViewModel.cs # View model for chat messages
├── LlmConfig.cs # Configuration model
├── LocalLlmChatSsmsPackage.cs # VSIX package entry point
├── LocalLlmChatSsmsPackage.vsct # VS command table (menu definitions)
├── LocalLlmChatWindow.cs # Tool window definition
├── LocalLlmChatWindowControl.xaml # Chat UI (XAML)
├── LocalLlmChatWindowControl.xaml.cs # Chat UI code-behind
├── LocalLlmChatWindowCommand.cs # Command to open chat window
├── OllamaClient.cs # LLM API client
├── OptionsPage.cs # Tools → Options page
├── SettingsManager.cs # Settings persistence
├── SlashCommandHandler.cs # Slash command processor
├── Utils.cs # Utility functions
├── source.extension.vsixmanifest # VSIX manifest
├── local-llm-chat-ssms.csproj # Project file
├── local-llm-chat-ssms.sln # Solution file
├── uninstall.bat # Uninstaller script
└── README.md # This file
- Llama 3.1 8B: Fast, good for general SQL questions
- CodeLlama 13B: Specialized for code generation
- Qwen 2.5 Coder: Excellent code understanding
- DeepSeek Coder: Strong at complex database logic
- Llama 3.1: Best all-around performance
- Mistral 7B: Fast and efficient
- Phi-3: Compact but capable
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
This project is licensed under the GNU General Public License v2.0 (GPLv2) - see the LICENSE file for details.
This means:
- ✅ You can freely use, modify, and distribute this software
- ✅ You must provide source code when distributing
- ✅ Any derivative works must also be licensed under GPLv2
- ✅ No warranty is provided
- Thanks to the Ollama team for making local LLMs accessible
- LM Studio for providing an excellent local inference platform
- The Visual Studio extensibility team for comprehensive documentation
If you encounter any issues or have questions:
- 🐛 Report bugs
- 💡 Request features
- ⭐ Star the repo if you find it useful!
If you like this project, support further development with a repost or coffee:
- 🧑‍💻 Markus Begerow
- 💾 GitHub
Privacy Notice: This extension operates entirely locally by default. No data is sent to external servers unless you explicitly configure it to use a remote API endpoint.
