Local AI Model Setup Guide

Connect Ollama to ITransBook — Completely Free, Completely Private

Follow this step-by-step guide to deploy open-source AI models on your own computer and connect them to ITransBook.

✨ Why Local Models?
Zero API fees · Total data privacy · Works offline · Faster response

1 What is Ollama?

Ollama is an open-source tool that simplifies running large language models locally on your own computer.

2 Install Ollama

  1. Visit the
  2. Download the Windows installer (OllamaSetup.exe)
  3. Run the installer and follow the prompts to complete installation
  4. After installation, Ollama typically runs in the background — you'll see its icon in the system tray

3 Download a Model

Open PowerShell or Command Prompt and run:

ollama pull llama3.2

Model Selection Guide

ModelParametersMin. RAMBest For
llama3.2:3b3B4GB+Fastest speed, great for beginners and low-spec devices
translategemma:4b4B8GB+Specialized translation model, 55 languages, excellent quality
qwen2.5:7b7B8GB+Excellent Chinese language support, well-balanced
mistral7B8GB+Strong reasoning, fast response
deepseek-r1:7b7B8GB+Logical reasoning specialist, ideal for complex AI conversations
llama3.28B8GB+Balanced performance and speed for general use
deepseek-r1:8b8B12GB+Upgraded version with stronger reasoning
translategemma:12b12B24GB+Specialized translation model, outperforms 27B general models
llama3.2:70b70B64GB+Maximum performance, requires high-end hardware

To check installed models:

ollama list

4 Run the Model

ollama run llama3.2

5 Configure LocalAI in ITransBook

5.1 Verify Ollama is running

Open your browser and visit: http://localhost:11434
If you see Ollama is running, the service is ready.

5.2 Open ITransBook Settings

  1. Launch ITransBook
  2. Click the Settings icon in the bottom-left corner
  3. Select API Settings

5.3 Configure the LocalAI connection

  1. Click Configure next to LocalAI
  2. Endpoint: http://127.0.0.1:11434/v1/chat/completions
  3. API Key: Enter any value (cannot be left empty, e.g., local-key)
  4. Model: Click the + icon on the left side of the model list, then enter the model name you downloaded (e.g., llama3.2)
  5. Click Test Connection to verify the configuration
  6. Save your settings

5.4 Start using LocalAI

Once configured, you can select the llama3.2 model under LocalAI in AI Chat, Text Translation, and File Translation.



FAQ

Model downloads slowly
  • Check your network
  • download during off-peak hours
  • use a proxy or mirror
Out of memory when running a model
  • Choose a smaller model variant (e.g., :3b)
  • close other memory-heavy programs
  • add more RAM
ITransBook cannot connect to Ollama
  • Confirm Ollama is running
  • double-check the Endpoint URL and Model
  • visit http://localhost:11434 in your browser to verify
Model responds slowly
  • Switch to a smaller model (e.g., 3B)
  • if you have a GPU, ensure Ollama is using GPU acceleration
  • reduce conversation context length

Tips & Best Practices

  1. Match your model to your hardware:
    • 8GB RAM → 3B–7B models
    • 16GB → 13B models
    • 32GB+ → larger models
  2. Leverage GPU acceleration:
    • NVIDIA GPUs automatically use CUDA
  3. Keep models updated:
    • Run ollama pull <model-name> regularly
  4. Explore the ecosystem: