Local AI Model Setup Guide
Connect Ollama to ITransBook — Completely Free, Completely Private
Follow this step-by-step guide to deploy open-source AI models on your own computer and connect them to ITransBook.
✨ Why Local Models?
Zero API fees · Total data privacy · Works offline · Faster response
1 What is Ollama?
Ollama is an open-source tool that simplifies running large language models locally on your own computer.
- One-Command Setup: Download and run models with a single terminal command
- Completely Free: No API fees — just your own hardware
- Privacy First: All data is processed locally and never leaves your machine
- Rich Model Library: Supports Llama, Mistral, Qwen, and many more
2 Install Ollama
- Visit the Ollama download page
- Download the Windows installer (
OllamaSetup.exe)
- Run the installer and follow the prompts to complete installation
- After installation, Ollama typically runs in the background — you'll see its icon in the system tray
3 Download a Model
Open PowerShell or Command Prompt and run:
ollama pull llama3.2
Model Selection Guide
| Model | Parameters | Min. RAM | Best For |
llama3.2:3b | 3B | 4GB+ | Fastest speed, great for beginners and low-spec devices |
translategemma:4b | 4B | 8GB+ | Specialized translation model, 55 languages, excellent quality |
qwen2.5:7b | 7B | 8GB+ | Excellent Chinese language support, well-balanced |
mistral | 7B | 8GB+ | Strong reasoning, fast response |
deepseek-r1:7b | 7B | 8GB+ | Logical reasoning specialist, ideal for complex AI conversations |
llama3.2 | 8B | 8GB+ | Balanced performance and speed for general use |
deepseek-r1:8b | 8B | 12GB+ | Upgraded version with stronger reasoning |
translategemma:12b | 12B | 24GB+ | Specialized translation model, outperforms 27B general models |
llama3.2:70b | 70B | 64GB+ | Maximum performance, requires high-end hardware |
To check installed models:
ollama list
4 Run the Model
ollama run llama3.2
5 Configure LocalAI in ITransBook
5.1 Verify Ollama is running
Open your browser and visit: http://localhost:11434
If you see Ollama is running, the service is ready.
5.2 Open ITransBook Settings
- Launch ITransBook
- Click the Settings icon in the bottom-left corner
- Select API Settings
5.3 Configure the LocalAI connection
- Click Configure next to LocalAI
- Endpoint:
http://127.0.0.1:11434/v1/chat/completions
- API Key: Enter any value (cannot be left empty, e.g.,
local-key)
- Model: Click the + icon on the left side of the model list, then enter the model name you downloaded (e.g.,
llama3.2)
- Click Test Connection to verify the configuration
- Save your settings
5.4 Start using LocalAI
Once configured, you can select the llama3.2 model under LocalAI in AI Chat, Text Translation, and File Translation.
FAQ
Model downloads slowly
- Check your network
- download during off-peak hours
- use a proxy or mirror
Out of memory when running a model
- Choose a smaller model variant (e.g.,
:3b)
- close other memory-heavy programs
- add more RAM
ITransBook cannot connect to Ollama
- Confirm Ollama is running
- double-check the Endpoint URL and Model
- visit
http://localhost:11434 in your browser to verify
Model responds slowly
- Switch to a smaller model (e.g., 3B)
- if you have a GPU, ensure Ollama is using GPU acceleration
- reduce conversation context length
Tips & Best Practices
- Match your model to your hardware:
- 8GB RAM → 3B–7B models
- 16GB → 13B models
- 32GB+ → larger models
- Leverage GPU acceleration:
- NVIDIA GPUs automatically use CUDA
- Keep models updated:
- Run
ollama pull <model-name> regularly
- Explore the ecosystem: