In today’s world of artificial intelligence, building a personal AI assistant has become surprisingly accessible thanks to the wide range of available large language models (LLMs). With Spring Boot and open-source models, we can build conversational AI right into a Java application. Let’s explore how this works.
In this example, we’ll install and run Ollama locally, then create an endpoint that accepts a user’s prompt and returns the model’s response.
Step 1: Install Ollama
To get started, download and install Ollama from the official website, ollama.com.
Once the installation is complete, download one of the available models. For this example, we’ll use the LLaMA 3 model:
ollama run llama3   # or any other available model
After the model is downloaded, you can also interact with it directly through the terminal:
>>> hello
Hello! It's nice to meet you. Is there something I can help you with, or would you like to chat?
>>>
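Under the hood, Ollama also runs a local HTTP server (on port 11434 by default), which is what our Spring Boot application will talk to. A quick sanity check that it is up:

curl http://localhost:11434
# should respond with "Ollama is running"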
Step 2: Connect to Ollama from a Spring Boot Application
Now it’s time to connect Ollama with a Spring Boot application.
Let’s quickly generate a Spring Boot project using start.spring.io.
Here’s an example of the build.gradle file with the necessary dependencies:
plugins {
    id 'java'
    id 'org.springframework.boot' version '3.5.0'
    id 'io.spring.dependency-management' version '1.1.7'
}

group = 'com.example'
version = '0.0.1-SNAPSHOT'

java {
    toolchain {
        languageVersion = JavaLanguageVersion.of(21)
    }
}

repositories {
    mavenCentral()
}

ext {
    set('springAiVersion', "1.0.0")
}

dependencies {
    // Web starter for the REST endpoint, Spring AI starter for the Ollama integration
    implementation 'org.springframework.boot:spring-boot-starter-web'
    implementation 'org.springframework.ai:spring-ai-starter-model-ollama'

    // Lombok, used by @RequiredArgsConstructor in the controller below
    compileOnly 'org.projectlombok:lombok'
    annotationProcessor 'org.projectlombok:lombok'

    testImplementation 'org.springframework.boot:spring-boot-starter-test'
    testRuntimeOnly 'org.junit.platform:junit-platform-launcher'
}

dependencyManagement {
    imports {
        mavenBom "org.springframework.ai:spring-ai-bom:${springAiVersion}"
    }
}

tasks.named('test') {
    useJUnitPlatform()
}
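By default, the Spring AI Ollama starter connects to a local Ollama instance at http://localhost:11434, so no extra configuration is strictly required. If Ollama runs elsewhere, or you prefer to set the model and temperature globally rather than per request, a minimal application.properties sketch (property names per the Spring AI Ollama reference documentation) might look like this:

spring.ai.ollama.base-url=http://localhost:11434
spring.ai.ollama.chat.options.model=llama3
spring.ai.ollama.chat.options.temperature=0.4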
Step 3: Create a REST Endpoint
After setting up the dependencies, all we need is a simple REST endpoint that receives a user prompt, sends it to the LLaMA 3 model, and returns the response.
For this, we can use the ChatModel interface, which the Spring AI starter auto-configures for us.
import lombok.RequiredArgsConstructor;
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.ollama.api.OllamaModel;
import org.springframework.ai.ollama.api.OllamaOptions;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@RestController
@RequiredArgsConstructor
public class ChatController {

    // Auto-configured by the Spring AI Ollama starter
    private final ChatModel chatModel;

    @GetMapping("/message")
    public String handleMessage(@RequestParam String message) {
        var response = chatModel.call(
                new Prompt(
                        message,
                        OllamaOptions.builder()
                                .model(OllamaModel.LLAMA3)
                                .temperature(0.4)
                                .build()
                ));
        return response.getResult().getOutput().getText();
    }
}
Key Components:
ChatModel — Interface that calls the AI model to generate responses.
Prompt — Wraps the user’s message and AI configuration into a single request.
OllamaOptions — Configures model settings such as temperature and model selection.
OllamaModel.LLAMA3 — Specifies the LLaMA 3 model for processing the input.
temperature — Controls the randomness of the response (values near 0 are more deterministic; values near 1 are more creative and varied).
It’s time to test:

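Once the application is running (on port 8080 by default), we can hit the endpoint with a quick request, for example:

curl "http://localhost:8080/message?message=hello"
# returns the model's reply as plain text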
Conclusion
Using Ollama locally with Spring AI makes it easy to build AI-powered APIs without relying on external cloud services. Without paying anything for the LLM itself, we now have an AI-powered chat endpoint up and running. With a little prompt engineering, we can make it more specialized, as sketched below.
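For example, here is a minimal sketch of how the prompt in our controller could be specialized with a system message (the persona text is just an illustration; SystemMessage and UserMessage come from Spring AI's org.springframework.ai.chat.messages package):

import java.util.List;
import org.springframework.ai.chat.messages.SystemMessage;
import org.springframework.ai.chat.messages.UserMessage;
import org.springframework.ai.chat.prompt.Prompt;

// Inside handleMessage: prepend a system message that constrains the assistant's behavior
var prompt = new Prompt(
        List.of(
                new SystemMessage("You are a concise assistant for a Java developer blog. Answer briefly."),
                new UserMessage(message)
        ),
        OllamaOptions.builder()
                .model(OllamaModel.LLAMA3)
                .temperature(0.4)
                .build()
);
var response = chatModel.call(prompt);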