```
$ sudo dnf install ramalama
```
```
$ ramalama version
ramalama version x.x.x
```
```
$ ramalama run granite3.1-moe:3b
```
```
$ ramalama run huggingface://MaziyarPanahi/Mistral-7B-Instruct-v0.3-GGUF
```
```
$ ramalama run -c 16384 llama3.1:8b
```
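Raising the context window with -c is not free: the KV cache grows linearly with context length. As a rough sanity check before running on a memory-constrained machine, you can estimate the cache size from the model's architecture. The sketch below uses the published Llama 3.1 8B values (32 layers, 8 KV heads via grouped-query attention, head dimension 128) as an assumption; it is a back-of-the-envelope estimate, not something RamaLama reports.

```python
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, ctx_len, bytes_per_elem=2):
    """Rough KV-cache size: keys + values, every layer, fp16 by default."""
    return 2 * n_layers * n_kv_heads * head_dim * ctx_len * bytes_per_elem

# Assumed Llama 3.1 8B architecture values, not read from RamaLama.
gib = kv_cache_bytes(32, 8, 128, 16384) / 2**30
print(f"KV cache at 16384 tokens: ~{gib:.1f} GiB")  # ~2.0 GiB
```

At 16384 tokens that is roughly 2 GiB on top of the model weights themselves, which is why capping context matters on 8GB machines.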
```
$ ramalama run --temp 0 granite3.1-moe:3b
```
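--temp 0 makes output effectively deterministic: temperature divides the logits before the softmax, so as it approaches zero the probability mass collapses onto the single highest-scoring token (greedy decoding). A toy illustration with made-up logits:

```python
import math

def softmax_with_temp(logits, temp):
    # Scale logits by 1/temp, then normalize; small temp sharpens the distribution.
    scaled = [l / temp for l in logits]
    m = max(scaled)  # subtract the max for numeric stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]                # hypothetical token scores
print(softmax_with_temp(logits, 1.0))   # probability spread across all tokens
print(softmax_with_temp(logits, 0.01))  # nearly all mass on the top token
```

This is why low temperatures suit factual Q&A while higher ones suit creative text.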
```
$ ramalama run --backend vulkan granite3.1-moe:3b  # AMD/Intel or CPU fallback
$ ramalama run --backend cuda granite3.1-moe:3b    # NVIDIA
$ ramalama run --backend rocm granite3.1-moe:3b    # AMD ROCm
```
```
$ ramalama --debug run granite3.1-moe:3b
```
```
$ ramalama list
```
```
$ ramalama pull llama3.1:8b
```
```
$ ramalama rm llama3.1:8b
```
```
$ ramalama serve granite3.1-moe:3b
```
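The served model speaks an OpenAI-compatible REST API, so any OpenAI-style client can talk to it. The endpoint path and port below are assumptions (check the address that ramalama serve prints on startup); the sketch only builds the request body, so it runs without a live server.

```python
import json

# Assumed endpoint; `ramalama serve` prints the actual host/port when it starts.
URL = "http://localhost:8080/v1/chat/completions"

def chat_payload(model, prompt, temperature=0.7):
    """Build an OpenAI-style chat-completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }

body = json.dumps(chat_payload("granite3.1-moe:3b", "What is Podman?"))
print(body)
# To send it:  curl -s $URL -H 'Content-Type: application/json' -d "$body"
```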
```
$ ramalama serve --webui off granite3.1-moe:3b
```

- A Fedora system (this guide uses Fedora with dnf)
- Podman installed; RamaLama uses it as the default container engine
- Sufficient disk space for model storage (models range from ~2GB to 10GB+)
- At least 8GB RAM for smaller models; 16GB+ recommended for 7B+ parameter models

- Model format compatibility: Some Hugging Face models require a pre-converted GGUF version to work with RamaLama. Stick to GGUF-format models when in doubt.
- Memory and context size: Always check the model's default context length before running on a memory-constrained machine. Use -c to cap it appropriately.
- Model size vs. accuracy: Smaller models (3B) are fast and lightweight but may lack knowledge on niche topics. For factual accuracy, 7B+ models are noticeably more reliable.
- --debug flag placement: It must come before the subcommand, i.e. ramalama --debug run not ramalama run --debug.
- RamaLama is still in active development: The project moves fast. Flag names, behaviors, and supported features can change between versions. When in doubt, check ramalama --help or the official docs.

- RamaLama Official Docs
- RamaLama GitHub Repository
- RamaLama Blog