Posts

OpenClaw: Examining a Self-Hosted AI Assistant as a Real System

A case-study exploration of OpenClaw — a self-hosted AI assistant system that integrates local LLMs, retrieval, memory, routing, and observability into a cohesive local infrastructure.

Oh My Opencode QuickStart for OpenCode: Install, Configure, Run

A practical Oh My Opencode quickstart for OpenCode. Learn installation via bunx or npm, configuration file locations, ultrawork mode, agent models, and real command examples for daily dev.

Best LLMs for OpenCode - Tested Locally

Hands-on comparison of LLMs in OpenCode - local Ollama and llama.cpp models vs cloud. Coding tasks, migration map accuracy stats, and honest failure analysis.

OpenHands Coding Assistant QuickStart: Install, CLI Flags, Examples

OpenHands QuickStart for developers. Install the CLI, configure your LLM API key, learn core command-line flags and safety modes, and run practical examples in interactive and headless workflows.

LocalAI QuickStart: Run OpenAI-Compatible LLMs Locally

Learn to install LocalAI, load models from the gallery or Hugging Face, and serve an OpenAI-compatible API plus Web UI for chat, embeddings, images, and audio on your own hardware.

Retrieval-Augmented Generation (RAG) Tutorial: Architecture, Implementation, and Production Guide

Step-by-step RAG tutorial: build retrieval-augmented generation systems with vector databases, hybrid search, reranking, and web search. Architecture, implementation, and production best practices.

Designing Non-Blocking RAG Pipelines

Learn how to design non-blocking RAG pipelines using asynchronous processing, vector databases, and robust error handling to achieve low-latency, high-throughput AI systems in 2026.

vLLM Quickstart: High-Performance LLM Serving in 2026

Complete vLLM setup guide with Docker, OpenAI API compatibility, PagedAttention optimization. Compare vLLM vs Ollama vs Docker Model Runner for production.