INSTALLATION GUIDE

Install and deploy Guaardvark on your own Linux hardware in under 15 minutes. This guide walks you through every step — from system dependencies to your first AI conversation — so you can run a fully self-hosted AI platform with zero cloud dependency.

System Requirements

Guaardvark is designed to run on a wide range of hardware, from modest workstations to dedicated GPU servers. The platform scales its capabilities based on available resources — you can run basic chat and RAG search on a machine with 4GB of RAM, or unlock video generation, voice chat, and multi-agent workflows on a system with a dedicated GPU and more memory. Before starting the installation, review the hardware tiers below to understand what your machine can support and what you might want to upgrade later.

Minimum 4GB RAM · 2 CPU cores · 10GB disk
Recommended 16GB RAM · 8 CPU cores · 50GB disk · NVIDIA GPU
Optimal 32GB+ RAM · GPU with 8GB+ VRAM · NVMe SSD

The minimum tier handles text-based inference with smaller models like Phi-3. You can run AI agents, chat, and basic RAG search, but response times will be slower and you will be limited to models under 4 billion parameters. This tier is suitable for experimenting with the platform and running lightweight tasks.

The recommended tier is where Guaardvark starts to feel responsive. With 16GB of RAM and an NVIDIA GPU, you can load 7B and 8B parameter models entirely into VRAM, which dramatically reduces inference latency. Image generation becomes practical, voice chat runs in near real-time, and agents can execute multi-step workflows without noticeable lag between tool calls.

The optimal tier unlocks every feature at full quality. A GPU with 8GB or more of VRAM can comfortably run mid-size models at high speed, run dedicated image and video generation models, or handle larger models like Llama 3 70B when heavily quantized with layers offloaded to system RAM. An NVMe SSD reduces model load times from minutes to seconds, which matters when switching between different models during a session. This tier is ideal for production deployments, enterprise use, or anyone who wants the best possible experience from a self-hosted AI setup.
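Before choosing a tier, you can survey your machine from the shell. This is a generic Linux check (it assumes procfs and GNU coreutils; lspci may require the pciutils package):

```shell
# Report total RAM, CPU cores, free disk on /, and any NVIDIA GPU
awk '/MemTotal/ {printf "RAM:   %.1f GB\n", $2/1048576}' /proc/meminfo
echo "Cores: $(nproc)"
echo "Disk:  $(df -h --output=avail / | tail -n 1 | tr -d ' ') free on /"
lspci 2>/dev/null | grep -i nvidia || echo "GPU:   no NVIDIA GPU detected"
```

Compare the output against the tiers above to decide which models and features your hardware can support.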

Prerequisites

Guaardvark runs on Linux. Ubuntu 22.04 LTS or newer is the recommended distribution because it has the widest testing coverage and the most straightforward dependency installation. However, Guaardvark also supports Debian 12+, Fedora 38+, and Arch Linux. The core dependencies are available on all major distributions; only the package manager commands differ.

You will need the following software installed before running the Guaardvark setup script:

Python 3.12 or newer, with venv and pip
Node.js v20 or newer, with npm
Redis 5.0 or newer
git, curl, and a C/C++ build toolchain

The following are optional but enable additional capabilities:

NVIDIA driver and CUDA toolkit · GPU-accelerated inference
FFmpeg · voice chat and audio processing
ComfyUI · image and video generation backend

Step 1: Install System Dependencies

Start by updating your package index and installing the core dependencies. The following commands are for Ubuntu and Debian-based distributions. If you are running Fedora, replace apt with dnf; on Arch, use pacman.

sudo apt update
sudo apt install -y python3 python3-venv python3-pip \
  nodejs npm redis-server git curl wget build-essential

After installation, verify that the correct versions are available:

python3 --version    # Should output 3.12 or higher
node --version       # Should output v20 or higher
redis-server --version   # Should output 5.0 or higher

If your distribution ships an older version of Python or Node.js, you may need to add a third-party repository or use a version manager. For Python, pyenv is a reliable option. For Node.js, nvm allows you to install and switch between Node.js versions without affecting the system installation.
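For example, installing Node.js 20 through nvm looks like this (the nvm version shown is pinned as an example; check the nvm releases page for the current one):

```shell
# Install nvm, then install and activate Node.js 20
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.7/install.sh | bash
export NVM_DIR="$HOME/.nvm"
. "$NVM_DIR/nvm.sh"
nvm install 20
nvm use 20
node --version    # v20.x
```

Because nvm installs into your home directory, this does not interfere with the distribution's own Node.js packages.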

Make sure Redis is running before proceeding:

sudo systemctl enable redis-server
sudo systemctl start redis-server
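You can confirm Redis is answering before moving on. This assumes the default localhost instance; redis-cli ships with the redis-server package:

```shell
# PONG means the server is up and reachable on the default port
if redis-cli ping 2>/dev/null | grep -q PONG; then
  echo "Redis is ready"
else
  echo "Redis is not responding; check: systemctl status redis-server" >&2
fi
```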

Fedora equivalent: sudo dnf install python3 python3-pip nodejs npm redis git curl wget gcc make

Arch equivalent: sudo pacman -S python python-pip nodejs npm redis git curl wget base-devel

Step 2: Clone the Repository

Clone the Guaardvark repository from GitHub and navigate into the project directory:

git clone https://github.com/guaardvark/guaardvark.git
cd guaardvark

This downloads the full source code, configuration files, and setup scripts. The repository is approximately 50MB without model files — models are downloaded separately during the configuration step.

Note: The Guaardvark repository is coming soon on GitHub. If you want to be notified when it becomes available, send an email to [email protected] with the subject line "Notify me" and we will let you know as soon as the repository is public.

Step 3: Run the Setup Script

The setup script automates the entire environment configuration process. It handles everything that would otherwise require a dozen manual commands:

./setup.sh

The setup script prepares the application environment: it creates the Python virtual environment, installs the backend and web interface dependencies, verifies that Redis is reachable, and generates the default configuration. It typically completes in 2–5 minutes depending on your internet speed and hardware. If any step fails, the script prints a clear error message with instructions for resolving the issue. You can safely re-run ./setup.sh after fixing a problem — it is idempotent and will skip steps that have already completed successfully.
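The idempotent behavior can be pictured with a small sketch. This is hypothetical, not Guaardvark's actual script: each step drops a marker file on success and is skipped on later runs.

```shell
# Sketch of an idempotent step runner: a marker file records completion
run_once() {
  step="$1"; shift
  marker=".setup/$step.done"
  if [ -f "$marker" ]; then
    echo "skip: $step (already done)"
    return 0
  fi
  "$@" && mkdir -p .setup && touch "$marker"
}

run_once deps   echo "installing dependencies..."
run_once config echo "writing default config..."
```

Run the sketch twice and the second pass skips both steps, which is exactly why re-running a setup script like this is safe.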

Step 4: Configure AI Models

Guaardvark uses Ollama as its primary model runtime. Ollama manages model downloads, quantization, and inference serving through a clean local API. Install it with one command:

curl -fsSL https://ollama.com/install.sh | sh

Once Ollama is installed, pull the models you want to use. We recommend starting with two models — a larger one for complex reasoning tasks and a smaller one for fast, lightweight interactions:

ollama pull llama3      # 8B parameters — strong general-purpose model
ollama pull phi3        # 3.8B parameters — fast, efficient for simple tasks

Model selection matters. Larger models produce higher-quality responses but require more RAM and VRAM. A 7B or 8B parameter model needs roughly 4–6GB of VRAM for comfortable inference speed. A 13B model needs 8–10GB. If you are running on CPU only (no NVIDIA GPU), stick with models under 8B parameters to keep response times reasonable. Guaardvark lets you switch between models at any time from the interface, so you can experiment with different sizes and find the best balance for your hardware.

Ollama also supports other model families. You can pull models like mistral, codellama, gemma, or any model available in the Ollama library. Each model specializes in different tasks — CodeLlama excels at code generation, Mistral is strong at instruction following, and Gemma offers a good balance of quality and speed at smaller sizes.
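A quick smoke test confirms Ollama is serving and a model responds. The model name assumes the phi3 pull above; the check skips gracefully if ollama is not on PATH:

```shell
# List installed models, then ask the smallest one for a reply
if command -v ollama >/dev/null; then
  ollama list
  ollama run phi3 "Reply with one short sentence."
else
  echo "ollama not found on PATH; re-run the install script above" >&2
fi
```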

Step 5: Start Guaardvark

With dependencies installed, the repository cloned, the setup script completed, and at least one model pulled, you are ready to launch the platform:

./start.sh

The start script boots the backend API server, the web interface, and the background task workers. Once everything is running, you will see a message confirming the platform is ready:

Guaardvark is running at http://localhost:5002

Open http://localhost:5002 in your browser to access the web interface. From there, you can start a conversation with an AI agent, upload documents for RAG search, generate images, or configure additional features.
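If the browser shows nothing, a reachability check from the terminal narrows the problem down. The port comes from the startup message above:

```shell
# -f makes curl treat HTTP errors as failures, -s silences progress output
if curl -sf -o /dev/null http://localhost:5002; then
  echo "Guaardvark UI is reachable"
else
  echo "Nothing answering on port 5002; check the ./start.sh output" >&2
fi
```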

Guaardvark also provides a command-line interface (CLI) for terminal-based interaction. The CLI supports the same features as the web interface — chat, agents, RAG queries, and model management — and is especially useful for scripting, automation, and SSH sessions on remote machines.

Optional Configuration

The five steps above give you a working Guaardvark installation with text-based AI chat and agent capabilities. The following optional configurations unlock additional features depending on your hardware and use case.

GPU Acceleration

If you have an NVIDIA GPU, install the CUDA toolkit to enable GPU-accelerated inference. Guaardvark auto-detects GPU availability at startup — no additional configuration is needed beyond installing the drivers and toolkit:

sudo apt install -y nvidia-cuda-toolkit

After installation, restart Ollama and Guaardvark. The platform will automatically offload model inference to the GPU, which typically provides a 5–20x speedup over CPU-only inference depending on the model size and your GPU's VRAM capacity.
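You can confirm the driver sees the GPU before restarting anything:

```shell
# Prints the GPU name and total VRAM if the driver is installed correctly
if command -v nvidia-smi >/dev/null; then
  nvidia-smi --query-gpu=name,memory.total --format=csv,noheader
else
  echo "nvidia-smi not found; install the NVIDIA driver first" >&2
fi
```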

Voice Chat

Voice chat requires FFmpeg for audio processing. Install it from your distribution's package manager:

sudo apt install -y ffmpeg

Once FFmpeg is available, Guaardvark's voice pipeline activates automatically. The Whisper speech-to-text model downloads on first use (approximately 1.5GB for the base model). Voice chat works entirely offline — speech recognition and text-to-speech both run locally without any external API calls.

Video Generation

Video generation requires ComfyUI as the inference backend. Guaardvark integrates with ComfyUI's API to generate video using Wan2.2 or CogVideoX models. A GPU with at least 8GB of VRAM is strongly recommended for video generation — the process is extremely memory-intensive and impractical on CPU alone. See the Video Generation feature page for detailed setup instructions.

Interconnector (Multi-Device Sync)

The Interconnector module enables synchronization across multiple Guaardvark instances running on different machines in your local network. This is useful for setups where you have a powerful desktop for inference and a laptop for the interface, or when multiple users share access to a single Guaardvark server. Configure the Interconnector by editing the config/interconnector.yaml file after installation.

Raspberry Pi Installation

Guaardvark fully supports ARM64 architecture, which means it runs natively on Raspberry Pi 4 and Raspberry Pi 5 hardware without emulation or compatibility layers. The installation steps are identical to the standard Linux installation described above. On a Raspberry Pi 5 with 8GB of RAM, you can run smaller models like Phi-3 and TinyLlama with acceptable response times for chat and basic agent tasks.
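Before installing on a Pi, confirm you are running a 64-bit OS, since ARM64 support requires one:

```shell
uname -m    # aarch64 means 64-bit; armv7l means a 32-bit OS that will not work
```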

For a detailed walkthrough of Raspberry Pi deployment including performance benchmarks, recommended model selections for constrained hardware, and tips for optimizing inference speed on ARM, see the Raspberry Pi & Edge Deployment use case page.

Deploy Your Own AI Platform

Guaardvark is coming soon. Get notified when it launches, or watch the repository on GitHub for the public release.
