llama.cpp Quickstart with CLI and Server

March 12, 2026

Install llama.cpp, run GGUF models with llama-cli, and serve OpenAI-compatible APIs using llama-server. Key flags, examples, and tuning tips with a short commands cheatsheet

Search This Blog

Software Development News

llama.cpp Quickstart with CLI and Server

Comments

Post a Comment

Popular posts from this blog

Gitflow Workflow overview

Reranking text documents with Ollama and Qwen3 Embedding model - in Golang:

UV - a New Python Package Project and Environment Manager. Here we provide it's short description, performance statistics, how to install it and it's main commands