Local AI Model Setup Guide

Connect Ollama to ITransBook — Completely Free, Completely Private

Follow this step-by-step guide to deploy open-source AI models on your own computer and connect them to ITransBook.

                ✨ Why Local Models?

                Zero API fees · Total data privacy · Works offline

1 What is Ollama?

Ollama is an open-source tool that simplifies running large language models locally on your own computer.

Visit the Ollama download page
Download the Windows installer (OllamaSetup.exe)
Run the installer and follow the prompts to complete installation
After installation, Ollama typically runs in the background — you'll see its icon in the system tray

Open PowerShell or Command Prompt and run:

ollama pull llama3.2

Model	Parameters	Min. RAM	Best For
`llama3.2:3b`	3B	4GB+	Fastest speed, great for beginners and low-spec devices
`translategemma:4b`	4B	8GB+	Specialized translation model, 55 languages, excellent quality
`qwen2.5:7b`	7B	8GB+	Excellent Chinese language support, well-balanced
`mistral`	7B	8GB+	Strong reasoning, fast response
`deepseek-r1:7b`	7B	8GB+	Logical reasoning specialist, ideal for complex AI conversations
`llama3.2`	8B	8GB+	Balanced performance and speed for general use
`deepseek-r1:8b`	8B	12GB+	Upgraded version with stronger reasoning
`translategemma:12b`	12B	24GB+	Specialized translation model, outperforms 27B general models
`llama3.2:70b`	70B	64GB+	Maximum performance, requires high-end hardware

To check installed models:

ollama list

ollama run llama3.2

Open your browser and visit: http://localhost:11434
If you see Ollama is running, the service is ready.

Click Configure next to LocalAI
API Url: http://127.0.0.1:11434/v1/chat/completions
API Key: Enter any value (cannot be left empty, e.g., local-key)
Model: Click the + icon on the left side of the model list, then enter the model name you downloaded (e.g., llama3.2)
Click Test Connection to verify the configuration
Save your settings

Once configured, you can select the llama3.2 model under LocalAI in AI Chat, Text Translation, and File Translation.

Model downloads slowly

Out of memory when running a model

ITransBook cannot connect to Ollama

Model responds slowly

Match your model to your hardware:
- 8GB RAM → 3B–7B models
- 16GB → 13B models
- 32GB+ → larger models
Leverage GPU acceleration:
- NVIDIA GPUs automatically use CUDA
Keep models updated:
- Run ollama pull <model-name> regularly
Explore the ecosystem:
- Browse the Ollama library to discover more models