Devoxx Genie is a fully Java-based LLM Code Assistant plugin for IntelliJ IDEA, designed to integrate with local LLM providers such as Ollama, LMStudio, GPT4All, Llama.cpp and Exo, as well as cloud-based LLMs such as OpenAI, Anthropic, Mistral, Groq, Gemini, DeepInfra, DeepSeek, OpenRouter, Azure OpenAI and Amazon Bedrock.
We now also support RAG-based prompt context based on your vectorized project files, in addition to a Git Diff viewer and LLM-driven web search with Google and Tavily.
With MCP support and frontier models like Claude Sonnet 3.7 and Gemini Pro, DevoxxGenie isn’t just another developer tool; it’s a glimpse into the future of agentic programming. One thing’s clear: we’re in the midst of a paradigm shift in AI-Augmented Programming (AAP) 🐒
- Building full-stack AI agents: From project generation to code execution (Devoxx France 2025)
- Agentic programming with DevoxxGenie (VoxxedDays Bucharest 2025)
- DevoxxGenie in action (Devoxx Belgium 2024)
- How ChatMemory works
- Hands-on with DevoxxGenie
- The Era of AAP: Ai Augmented Programming using only Java
- DevoxxGenie Demo 2024
- DevoxxGenie: Your AI Assistant for IDEA
- The Devoxx Genie IntelliJ Plugin Provides Access to Local or Cloud Based LLM Models
- 10K+ Downloads Milestone for DevoxxGenie!
- 🔥️ MCP Support: You can now add MCP servers and use them in your conversations
- 🗂️ DEVOXXGENIE.md: By incorporating this into the system prompt, the LLM will gain a deeper understanding of your project and provide more relevant responses.
- 📸 DnD images: You can now drag and drop (DnD) images when working with multimodal LLMs.
- 🧐 RAG Support: Retrieval-Augmented Generation (RAG) support for automatically incorporating project context into your prompts.
- 👀 Chat History: Your chats are stored locally, allowing you to easily restore them in the future.
- 🧠 Project Scanner: Add source code (full project or by package) to prompt context when using Anthropic, OpenAI or Gemini.
- 💰 Token Cost Calculator: Calculate the cost when using Cloud LLM providers.
- 🔍 Web Search : Search the web for a given query using Google or Tavily.
- 🏎️ Streaming responses: See each token as it's received from the LLM in real-time.
- 🧐 Abstract Syntax Tree (AST) context: Automatically include parent class and class/field references in the prompt for better code analysis.
- 💬 Chat Memory Size: Set the size of your chat memory; by default it's set to a total of 10 messages (system + user & AI msgs).
- ☕️ 100% Java: An IDEA plugin using local and cloud-based LLM models, fully developed in Java using LangChain4j.
- 👀 Code Highlighting: Supports highlighting of code blocks.
- 💬 Chat conversations: Supports chat conversations with configurable memory size.
- 📁 Add files & code snippets to context: You can add open files to the chat window context to get better answers, or add only code snippets if you want a tightly focused context window.
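The Chat Memory Size setting in the list above can be pictured as a sliding window over the conversation. The following is a minimal sketch in plain Java, not the plugin's actual implementation: the class name `WindowedChatMemory`, its methods, and the rule that system messages are evicted last are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sliding-window chat memory; all names here are illustrative. */
final class WindowedChatMemory {
    enum Role { SYSTEM, USER, AI }

    static final class Message {
        final Role role;
        final String text;

        Message(Role role, String text) {
            this.role = role;
            this.text = text;
        }
    }

    private final int maxMessages;
    private final List<Message> messages = new ArrayList<>();

    WindowedChatMemory(int maxMessages) {
        if (maxMessages < 1) {
            throw new IllegalArgumentException("maxMessages must be at least 1");
        }
        this.maxMessages = maxMessages;
    }

    /** Appends a message, then evicts messages until the window size is respected. */
    void add(Role role, String text) {
        messages.add(new Message(role, text));
        while (messages.size() > maxMessages) {
            messages.remove(indexToEvict());
        }
    }

    /** Assumption: system messages are evicted last so the instructions survive. */
    private int indexToEvict() {
        for (int i = 0; i < messages.size(); i++) {
            if (messages.get(i).role != Role.SYSTEM) {
                return i;
            }
        }
        return 0; // only system messages are left: drop the oldest one
    }

    /** Returns a copy so callers cannot modify the window directly. */
    List<Message> messages() {
        return new ArrayList<>(messages);
    }

    public static void main(String[] args) {
        WindowedChatMemory memory = new WindowedChatMemory(10); // default size named above
        memory.add(Role.SYSTEM, "You are a helpful coding assistant.");
        for (int i = 1; i <= 6; i++) {
            memory.add(Role.USER, "question " + i);
            memory.add(Role.AI, "answer " + i);
        }
        // 13 messages were added; the window keeps the system message plus the 9 newest.
        System.out.println(memory.messages().size());
    }
}
```

A larger window gives the model more conversational context at the price of more tokens per request, which is why the size is configurable.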
- Download and start Ollama
- Open a terminal and download a model using the command "ollama run llama3.2"
- Start your IDEA and go to plugins > Marketplace and enter "Devoxx"
- Select "DevoxxGenie" and install plugin
- In the DevoxxGenie window select Ollama and available model
- Start prompting
- Start your IDEA and go to plugins > Marketplace and enter "Devoxx"
- Select "DevoxxGenie" and install plugin
- Click the DevoxxGenie cog (settings) icon, then click the Cloud Provider link icon to create an API key
- Paste the API key in the Settings panel
- In the DevoxxGenie window select your cloud provider and model
- Start prompting
Initial support for Model Context Protocol (MCP) server tools, including debugging of MCP requests & responses! MCP support is a crucial step toward full agentic support within DevoxxGenie. Watch a short demo of MCP in action using DevoxxGenie.
 
Example of the Filesystem-server MCP which allows you to interact with the given directory.
 
Go to the DevoxxGenie settings to enable and add your MCP servers.
 
When configured correctly you can see the tools that the MCP brings to your LLM conversations
 
You can now generate a DEVOXXGENIE.md file directly from the "Prompts" plugin settings page or just use /init in the prompt input field.
 
By incorporating this into the system prompt, the LLM will gain a deeper understanding of your project and provide more relevant responses. This is a first step toward enabling agentic AI features for DevoxxGenie 🔥
Once generated, you can edit the DEVOXXGENIE.md file and add more details about your project as needed.
 
You can now drag and drop images (and project files) directly into the input field when working with multimodal LLMs like Google Gemini, Anthropic Claude, ChatGPT 4.x, or even local models such as LLaVA.
 
 
You can even combine screenshots together with some code and then ask related questions!
 
Starting from v0.4.0, Devoxx Genie includes a Retrieval-Augmented Generation (RAG) feature, which enables advanced code search and retrieval capabilities. This feature uses embedding-based semantic search to analyze code snippets and identify relevant results based on their meaning rather than exact keywords.
With RAG, you can:
- Search for code snippets using natural language queries
- Retrieve relevant code examples that match your query's intent
- Explore related concepts and ideas in the codebase
We currently use Ollama and Nomic Text embedding to generate vector representations of your project files. These embedding vectors are then stored in a Chroma DB (v0.6.2) running locally within Docker. The vectors are used to compute similarity scores between search queries and your code, with everything running locally.
The RAG feature is a significant enhancement to Devoxx Genie's code search capabilities, enabling developers to quickly find relevant code examples and accelerate their coding workflow.
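To make the similarity scoring described above concrete, here is a small sketch that ranks files by cosine similarity between a query vector and stored vectors. It is an illustration under assumptions: cosine similarity is one common metric but the metric actually configured in Chroma is not stated here, the class and method names are hypothetical, and the three-dimensional vectors are toy values rather than real embeddings.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical ranking helper; not part of DevoxxGenie or Chroma. */
final class SimilaritySketch {

    /** Cosine similarity: dot(a, b) / (|a| * |b|); returns 0 for a zero vector. */
    static double cosine(double[] a, double[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the names of the k entries most similar to the query, best first. */
    static List<String> topMatches(double[] query, Map<String, double[]> index, int k) {
        return index.entrySet().stream()
                .sorted(Comparator.comparingDouble(
                        (Map.Entry<String, double[]> e) -> cosine(query, e.getValue())).reversed())
                .limit(k)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Toy "embeddings"; a real embedding model produces much longer vectors.
        Map<String, double[]> index = new LinkedHashMap<>();
        index.put("UserService.java", new double[] {0.9, 0.1, 0.0});
        index.put("InvoicePrinter.java", new double[] {0.0, 0.2, 0.9});

        double[] query = {1.0, 0.0, 0.0};
        System.out.println(topMatches(query, index, 1));
    }
}
```

In a real setup the vector store performs this nearest-neighbour search itself; the sketch only shows what a similarity score between a query and a file means.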
See also Demo
We expect to add GraphRAG support in the near future.
In the IDEA settings you can modify the REST endpoints and the LLM parameters. Make sure to press enter and apply to save your changes.
We now also support cloud-based LLMs; you can paste the API keys on the Settings page.
 
The language model dropdown is not just a list anymore, it's your compass for smart model selection.
- See available context window sizes for each cloud model
- View associated costs upfront
- Make data-driven decisions on which model to use for your project
You can now add the full project to your prompt IF your selected cloud LLM has a big enough window context.
Leverage the prompt cost calculator for precise budget management. Get real-time updates on how much of the context window you're using.
See the input/output costs and window context per Cloud LLM. Eventually we'll also allow you to edit these values.
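The arithmetic behind a prompt cost calculator and a context-window usage indicator is simple enough to sketch. The example below is an illustration only: the class and method names are hypothetical, and the per-million-token prices are made-up placeholders, not the rates of any real provider.

```java
/** Hypothetical cost helper; prices and names are assumptions for illustration. */
final class TokenCostSketch {

    /** Cost in USD given token counts and prices expressed per one million tokens. */
    static double costUsd(long inputTokens, long outputTokens,
                          double inputUsdPerMillion, double outputUsdPerMillion) {
        return inputTokens / 1_000_000.0 * inputUsdPerMillion
                + outputTokens / 1_000_000.0 * outputUsdPerMillion;
    }

    /** Fraction of the context window consumed by the prompt (1.0 means full). */
    static double contextUsage(long promptTokens, long contextWindowTokens) {
        if (contextWindowTokens <= 0) {
            throw new IllegalArgumentException("context window must be positive");
        }
        return (double) promptTokens / contextWindowTokens;
    }

    /** True when the prompt fits into the model's context window. */
    static boolean fits(long promptTokens, long contextWindowTokens) {
        return promptTokens <= contextWindowTokens;
    }

    public static void main(String[] args) {
        long promptTokens = 70_000;      // example prompt size
        long expectedOutput = 1_000;     // assumed response size
        long window = 1_000_000;         // a 1 million token context window

        // Placeholder prices: 3 USD per million input tokens, 15 USD per million output tokens.
        double cost = costUsd(promptTokens, expectedOutput, 3.0, 15.0);

        System.out.printf("cost: %.4f USD%n", cost);
        System.out.printf("window used: %.1f%%%n", contextUsage(promptTokens, window) * 100);
        System.out.println("fits: " + fits(promptTokens, window));
    }
}
```

Keeping input and output prices separate matters because providers typically charge different rates for each.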
"But wait," you might say, "my project is HUGE!" 😅
Fear not! We've got options:
- Leverage Gemini's Massive Context:
Gemini's colossal 1 million token window isn't just big, it's massive. We're talking about the capacity to digest approximately 30,000 lines of code in a single go. That's enough to digest most codebases whole, from the tiniest scripts to some decent projects.
But if that's not enough you have more options...
- Smart Filtering:
The new "Copy Project" panel lets you:
  - Exclude specific directories
  - Filter by file extensions
  - Remove JavaDocs to slim down your context
 
- Selective Inclusion
Right-click to add only the most relevant parts of your project to the context.
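The filtering options listed above (excluding directories, filtering by extension, removing JavaDocs) can be sketched as follows. This is not the code behind the "Copy Project" panel: the class and method names are hypothetical, and the JavaDoc removal uses a deliberately naive regular expression that would also strip `/** ... */` sequences inside string literals.

```java
import java.nio.file.Path;
import java.util.Set;
import java.util.regex.Pattern;

/** Hypothetical project filter; an illustration, not the plugin's implementation. */
final class ProjectFilterSketch {

    // Naive JavaDoc matcher: from "/**" to the next "*/", across line breaks.
    private static final Pattern JAVADOC = Pattern.compile("/\\*\\*.*?\\*/", Pattern.DOTALL);

    /** Keeps a file only if no path element is excluded and its extension is allowed. */
    static boolean include(Path file, Set<String> excludedDirs, Set<String> extensions) {
        for (Path part : file) {
            if (excludedDirs.contains(part.toString())) {
                return false;
            }
        }
        Path fileName = file.getFileName();
        if (fileName == null) {
            return false;
        }
        String name = fileName.toString();
        int dot = name.lastIndexOf('.');
        return dot >= 0 && extensions.contains(name.substring(dot + 1));
    }

    /** Removes JavaDoc comments to reduce the number of tokens sent to the LLM. */
    static String stripJavaDocs(String source) {
        return JAVADOC.matcher(source).replaceAll("");
    }

    public static void main(String[] args) {
        Set<String> excluded = Set.of("build", ".git");
        Set<String> allowed = Set.of("java");

        System.out.println(include(Path.of("src/main/java/App.java"), excluded, allowed));
        System.out.println(include(Path.of("build/generated/Stub.java"), excluded, allowed));
        System.out.println(stripJavaDocs("/** Entry point. */\nclass App {}"));
    }
}
```

Regular `/* ... */` and `//` comments are left untouched here, since only JavaDoc removal is mentioned above.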
The DevoxxGenie project itself, at about 70K tokens, fits comfortably within most high-end LLM context windows. This allows for incredibly nuanced interactions – we're talking advanced queries and feature requests that leave tools like GitHub Copilot scratching their virtual heads!
DevoxxGenie now also supports JLama, a 100% modern Java LLM inference engine.
JLama offers a REST API compatible with the widely used OpenAI API. Use the Custom OpenAI URL to connect.
You can also integrate it seamlessly with Llama3.java by using the Spring Boot OpenAI API wrapper coupled with the JLama DevoxxGenie option.
Use the custom OpenAI URL to connect to Exo, a local LLM cluster for Apple Silicon which allows you to run Llama 3.1 8b, 70b and 405b on your own Apple computers 🤩
Write a unit test and let DevoxxGenie generate the implementation for that unit test. This approach was explained by Bouke Nijhuis in his Devoxx Belgium presentation.
A demo of how to accomplish this can be seen in this 𝕏 post.
As of today (February 2, 2025), alongside the DeepSeek API Key, you can access the full 671B model for FREE using either Nvidia or Chutes! Simply update the Custom OpenAI URL, Model and API Key on the Settings page as follows:
 
Chutes URL : https://chutes-deepseek-ai-deepseek-r1.chutes.ai/v1/
Nvidia URL : https://integrate.api.nvidia.com/v1
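For readers wondering what "OpenAI compatible" means in practice for a Custom OpenAI URL, the sketch below builds a chat request with the JDK's `java.net.http` API. It assumes the server exposes the standard OpenAI `chat/completions` path under the base URL; the class name, the placeholder model id and API key, and the minimal JSON escaping are all assumptions for illustration, and a real client should use a JSON library.

```java
import java.net.URI;
import java.net.http.HttpRequest;

/** Hypothetical request builder for an OpenAI-compatible endpoint. */
final class OpenAiCompatibleRequestSketch {

    /** Builds (but does not send) a chat completion request. */
    static HttpRequest chatRequest(String baseUrl, String apiKey, String model, String prompt) {
        String base = baseUrl.endsWith("/") ? baseUrl : baseUrl + "/";
        String body = "{\"model\":\"" + escapeJson(model) + "\","
                + "\"messages\":[{\"role\":\"user\",\"content\":\"" + escapeJson(prompt) + "\"}]}";
        return HttpRequest.newBuilder(URI.create(base + "chat/completions"))
                .header("Content-Type", "application/json")
                .header("Authorization", "Bearer " + apiKey)
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    /** Minimal escaping for backslashes, quotes and newlines only. */
    static String escapeJson(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"").replace("\n", "\\n");
    }

    public static void main(String[] args) {
        HttpRequest request = chatRequest(
                "https://integrate.api.nvidia.com/v1", // base URL from the text above
                "YOUR_API_KEY",                        // placeholder
                "your-model-id",                       // placeholder, check the provider's model list
                "Explain this stack trace");
        System.out.println(request.method() + " " + request.uri());
        // To actually send it:
        // HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString());
    }
}
```

Because the request shape is the same for every compatible server, switching providers only requires changing the URL, model and API key, which is what the Settings page fields correspond to.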
Create an account on Grok and generate an API Key. Now open the DevoxxGenie settings and enter the OpenAI compliant URL for Grok, the model you want to use and your API Key.
 
- From IntelliJ IDEA: Go to Settings -> Plugins -> Marketplace, enter 'Devoxx' to find the plugin, OR install the plugin from disk
- From Source Code: Clone the repository, build the plugin using ./gradlew buildPlugin, then install the plugin from the build/distributions directory by selecting the file 'DevoxxGenie-X.Y.Z.zip'
- IntelliJ minimum version is 2023.3.4
- Java minimum version is JDK 17
Gradle IntelliJ Plugin prepares a ZIP archive when running the buildPlugin task.
You'll find it in the build/distributions/ directory
./gradlew buildPlugin
It is recommended to use the publishPlugin task for releasing the plugin.
./gradlew publishPlugin
- Select an LLM provider from the DevoxxGenie panel (right corner)
- Select some code
- Enter a shortcode command to review, explain, or generate unit tests for the selected code, or enter a custom prompt.
Enjoy!
The DevoxxGenie IDEA Plugin processes user prompts through the following steps:
- UserPromptPanel → Captures the prompt from the UI.
- PromptSubmissionListener.onPromptSubmitted() → Listens for the submission event.
- PromptExecutionController.handlePromptSubmission() → Starts execution.
- PromptExecutionService.executeQuery() → Handles token usage calculations and checks RAG/GitDiff settings.
- ChatPromptExecutor.executePrompt() → Dispatches the prompt to the selected LLM provider.
- LLMProviderService.getAvailableModelProviders() → Retrieves the appropriate model from ChatModelFactory.
- ChatModelFactory.getModels() → Gets the models for the selected LLM provider.
- Cloud-based LLMs:
- Local models:
- If streaming is enabled:
  - StreamingPromptExecutor.execute() → Begins token-by-token streaming.
  - ChatStreamingResponsePanel.createHTMLRenderer() → Updates UI in real time.
- If non-streaming:
  - PromptExecutionService.executeQuery() → Formats the full response.
  - ChatResponsePanel.displayResponse() → Renders the text and code blocks.
- Indexing Source Code for Retrieval
  - ProjectIndexerService.indexFiles() → Indexes project files.
  - ChromaDBIndexService.storeEmbeddings() → Stores embeddings in ChromaDB.
- Retrieval & Augmentation
  - SemanticSearchService.search() → Fetches relevant indexed code.
  - SemanticSearchReferencesPanel → Displays retrieved results.
- The response is rendered in ChatResponsePanel with:
  - ResponseHeaderPanel → Shows metadata (LLM name, execution time).
  - ResponseDocumentPanel → Formats text & code snippets.
  - MetricExecutionInfoPanel → Displays token usage and cost.
Below is a detailed flow diagram illustrating this workflow:
- Start by exploring PromptExecutionController.java to see how prompts are routed.
- Modify ChatResponsePanel.java if you want to enhance response rendering.
- To add a new LLM provider, create a factory under chatmodel/cloud/ or chatmodel/local/.
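As an orientation aid before opening those classes, the streaming versus non-streaming branch of the prompt flow described earlier can be reduced to a few lines. The sketch below is a simplification under assumptions: `Model`, `ResponseView`, `EchoModel` and `RecordingView` are hypothetical stand-ins invented for this example and do not mirror the plugin's real class signatures or LangChain4j types.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical, heavily simplified view of the prompt dispatch step. */
final class PromptFlowSketch {

    /** Stand-in for an LLM provider. */
    interface Model {
        String generate(String prompt);
        void stream(String prompt, Consumer<String> onToken);
    }

    /** Stand-in for the panels that render a response. */
    interface ResponseView {
        void appendToken(String token);
        void showResponse(String text);
    }

    /** Streaming pushes tokens as they arrive; otherwise the full response is rendered once. */
    static void executePrompt(String prompt, Model model, boolean streaming, ResponseView view) {
        if (streaming) {
            model.stream(prompt, view::appendToken);
        } else {
            view.showResponse(model.generate(prompt));
        }
    }

    /** Toy model that echoes the prompt, word by word when streaming. */
    static final class EchoModel implements Model {
        @Override
        public String generate(String prompt) {
            return prompt;
        }

        @Override
        public void stream(String prompt, Consumer<String> onToken) {
            for (String word : prompt.split(" ")) {
                onToken.accept(word);
            }
        }
    }

    /** Records what a UI would have displayed, which keeps the sketch testable. */
    static final class RecordingView implements ResponseView {
        final List<String> events = new ArrayList<>();

        @Override
        public void appendToken(String token) {
            events.add("token:" + token);
        }

        @Override
        public void showResponse(String text) {
            events.add("full:" + text);
        }
    }

    public static void main(String[] args) {
        RecordingView view = new RecordingView();
        executePrompt("explain this code", new EchoModel(), true, view);
        executePrompt("explain this code", new EchoModel(), false, view);
        view.events.forEach(System.out::println);
    }
}
```

Separating the model from the view in this way is what lets a new provider factory plug in without touching the rendering code.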
Want to contribute? Submit a PR! 🚀









