Posts

Showing posts from January, 2026

Hardware Price Crisis: GPUs and RAM in 2025-2026

The hardware market is in turmoil: both RAM and GPU prices are climbing. Since September 2025, we’ve witnessed unprecedented price increases across PC components. DDR5 memory kits now cost more than high-end GPUs, the RTX 5090 has vanished from shelves, and anyone building AI infrastructure faces difficult choices. Here’s what’s happening and what it means for self-hosters and developers. The memory shortage isn’t ending soon. Industry analysts expect constraints to persist through 2026 as AI datacenter demand continues outpacing supply expansion. For individual developers and small teams, this means:

- Buy RAM now if you need it: prices are unlikely to drop soon
- Consider unified memory systems: DGX Spark and Mac Studio sidestep the DDR5 crisis
- Right-size your local models: smaller, quantized models may be the pragmatic choice
- Hybrid approaches: combine local inference for privacy with cloud APIs for heavy lifting

The hardware landscape has fundamentally changed...

Top 17 Trending Python Projects on GitHub.

Discover the hottest Python projects on GitHub this month, ranked by stars gained. Claude Skills dominate with AI agents, RAG frameworks, and development tools leading the charge. https://www.glukhov.org/post/2026/01/most-popular-python-projects-on-github/ #Python #OpenSource #AI #AICoding #LLM #RAG #Claude #MCP #DevOps #DeepLearning

Top 23 Trending Rust Projects on GitHub - January 2026.

Discover the hottest Rust projects on GitHub this month, ranked by stars gained. From AI coding agents to terminal tools, app frameworks to trading platforms - complete overview with stats, licenses, and use cases. https://www.glukhov.org/post/2026/01/most-popular-rust-projects-on-github/ #Rust #OpenSource #AI #AICoding #LLM #DevOps #Git #Security #CLI #Terminal #TUI

Top 19 Trending Go Projects on GitHub - January 2026.

Discover the hottest Go projects on GitHub this month, ranked by stars gained. From AI coding agents to Docker management, self-hosted apps to LLM gateways - complete overview with stats, licenses, and use cases. https://www.glukhov.org/post/2026/01/most-popular-go-projects-on-github/ #Go #Golang #OpenSource #AI #AICoding #LLM #Ollama #Docker #K8S #SelfHosting #API #DevOps #Git #Security #RAG #Privacy

Integrating PostgreSQL MCP Server with Docker and Claude Desktop

PostgreSQL Model Context Protocol (MCP) Server integration with Docker and Claude Desktop in 2026 enables secure, scalable AI development environments. This integration addresses the need for consistent, isolated, and efficient model training and inference workflows. The discussion covers MCP Server architecture, Docker containerization strategies, Claude Desktop configuration, practical use cases, and best practices. Familiarity with Docker, PostgreSQL, and AI development frameworks is assumed.

Kubernetes Security: RBAC, Pod Security Standards, and Policy Engines

Kubernetes has become the de facto standard for container orchestration, but its security model requires careful configuration. Effective security implementation in Kubernetes 2026 relies on robust Role-Based Access Control (RBAC), strict Pod Security Standards, and advanced policy engines. This article examines the principles and practical implementation of these three components, focusing on their role in preventing privilege escalation, enforcing compliance, and automating security policies. Familiarity with Kubernetes cluster administration and basic security concepts is assumed.

Anaconda vs Miniconda vs Mamba Guide.

Complete comparison of Anaconda, Miniconda, and Mamba for Python package management. Learn installation, performance differences, and when to use each tool for data science and development. https://www.glukhov.org/post/2026/01/anaconda-vs-miniconda-vs-mamba/ #Python #Anaconda #Linux #Dev #OpenSource #AI #DeepLearning #PyTorch

Wayland vs X11: 2026 Comparison.

Complete comparison of Wayland and X11 display servers: architecture, security, performance, compatibility, and migration guide for Linux users in 2026. https://www.glukhov.org/post/2026/01/wayland-vs-x11-comparison/ #Linux #DevOps #OpenSource

LLM Development Ecosystem: Backends, Frontends & RAG

Here is a set of articles about the LLM Development Ecosystem: Backends, Frontends & RAG. Topics covered: LLM hosting (Ollama, Docker Model Runner, cloud providers), coding in Python and Go, RAG, vector stores, embeddings, and MCP.

GPU and RAM Prices Surge in Australia: RTX 5090 Up 15%, RAM Up 38% - January 2026.

RTX 5090 prices jumped 15.2% to $5,566, RTX 5080 rose 6% to $1,899, and RAM surged 38% to $689 in Australia. Latest price analysis for January 2026 across Centrecom, PCCG, and Scorptec. https://www.glukhov.org/post/2026/01/ram-and-gpu-price-increase/ #SelfHosting #Hardware

NVIDIA DGX Spark Pricing: $6,249-$7,999 at Major Retailers in Australia

The NVIDIA DGX Spark (GB10 Grace Blackwell) is now available in Australia at major PC retailers with local stock. If you’ve been following the global DGX Spark pricing and availability, you’ll be interested to know that Australian pricing ranges from $6,249 to $7,999 AUD depending on storage configuration and retailer. These systems are well suited to local AI/LLM workloads, such as LLM inference with Ollama. The Australian configurations come with 128GB unified memory and up to 1 PFLOP of AI compute power, making them suitable for models up to ~200B parameters.

Building LLM Applications with Rust: candle and llm Crates

Building LLM applications in Rust using the candle and llm crates reveals candle as the more viable choice, thanks to its active development and broader hardware support. Candle 0.4.0 with CUDA 12.1 enables GPU acceleration for tensor operations, demonstrated in fintech applications with reduced latency. The llm crate, being archived and limited to GGMLv3 models, lacks support for modern formats like GGUF and newer hardware. For new projects, prioritize candle, leveraging its 2026 release features such as quantization for LLaMA and distributed inference. Explore tools like kalosm and atoma-infer to extend candle’s capabilities in production deployments.

Best Open-Source LLMs You Can Run on 16 GB VRAM (As of 2026)

Running powerful open-source LLMs on 16 GB VRAM systems is feasible through quantization and optimized deployment. Converting models like Mistral Large 3 to 4-bit precision reduces VRAM usage by up to 4x, enabling execution on consumer-grade GPUs. Phi-3 Mini achieves 68.8 MMLU and 62.2 HumanEval scores at 3.8B parameters with 8 GB VRAM at 4-bit quantization, making it ideal for low-latency applications. Use vLLM with speculative execution to deploy Mixtral 8x7B on RTX 4090 via Docker for high-parameter workloads. Evaluate model size, quantization level, and inference tools like BitsAndBytes and Hugging Face Transformers to select the best fit for your VRAM and performance needs.
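The VRAM savings from quantization described above follow directly from arithmetic: weight memory scales with parameter count times bits per weight. A minimal back-of-the-envelope sketch (the overhead factor is a hypothetical allowance for KV cache and activations; real usage depends on context length, batch size, and runtime):

```python
def estimate_vram_gb(params_billion: float, bits: int, overhead: float = 1.2) -> float:
    """Rough VRAM estimate for model weights at a given quantization level.

    overhead is an illustrative fudge factor for KV cache and activations;
    actual usage varies with context length, batch size, and inference engine.
    """
    weight_bytes = params_billion * 1e9 * bits / 8
    return weight_bytes * overhead / 1e9  # decimal GB

# Weights alone for a 7B model: ~14 GB at fp16, ~3.5 GB at 4-bit
print(round(estimate_vram_gb(7, 16, overhead=1.0), 1))  # 14.0
print(round(estimate_vram_gb(7, 4, overhead=1.0), 1))   # 3.5
```

This is why 4-bit quantization cuts weight memory by 4x relative to fp16 and brings double-digit-billion-parameter models within reach of a 16 GB card.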

Infrastructure as Code: Terraform vs OpenTofu vs Pulumi - A 2026 Comparison

Infrastructure as Code (IaC) has become essential for managing cloud resources efficiently. This post compares Terraform 1.5, OpenTofu 1.0, and Pulumi 5.0, analyzing their architecture, performance, features, and use cases. Key differences include language support, state management, plugin systems, and integration with cloud providers. The comparison covers technical aspects relevant to deployment pipelines, team collaboration, and infrastructure scalability.

Building High-Performance APIs with FastAPI and Async Python

FastAPI, leveraging Python’s async/await model, enables the development of high-performance, scalable APIs suitable for modern web services. Asynchronous programming reduces latency and improves concurrency, making it essential for handling high request volumes efficiently. This post covers the fundamentals of async programming in FastAPI, designing efficient endpoints, optimizing with middleware and background tasks, and testing performance through benchmarking. The target audience is developers familiar with Python 3.11+ and basic web framework concepts; prior experience with async programming is advantageous.
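The concurrency win described above comes from asyncio, which FastAPI builds on: while one coroutine awaits I/O, others run. A stdlib-only sketch of the effect (the `fetch` helper is a stand-in for an awaited database or HTTP call, not a FastAPI API):

```python
import asyncio
import time

async def fetch(delay: float) -> float:
    # Stand-in for an awaited I/O call (DB query, outbound HTTP request)
    await asyncio.sleep(delay)
    return delay

async def handle_requests() -> float:
    start = time.perf_counter()
    # Three "requests" run concurrently instead of back-to-back
    await asyncio.gather(fetch(0.1), fetch(0.1), fetch(0.1))
    return time.perf_counter() - start

elapsed = asyncio.run(handle_requests())
print(f"3 concurrent 0.1s waits finished in {elapsed:.2f}s")  # ~0.1s, not 0.3s
```

In FastAPI the same principle applies per request handler: declaring an endpoint `async def` lets the event loop interleave it with other in-flight requests during awaits.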

Terraform Best Practices: Code Organization and Standards

Terraform best practices emphasize modular, reusable code and strict naming conventions to ensure maintainability and scalability. Modularization through self-contained modules with input/output parameters improves deployment consistency and reduces duplication. Terraform v1.6.5 enforces lowercase hyphenated resource names to avoid parsing errors, while TFLint 0.52.0 integrates with CI/CD tools for automated validation. Implement Git with descriptive commit messages and CI/CD pipelines using Terraform fmt and TFLint for pre-commit checks. For large-scale projects, adopt a centralized module repository and enforce .tflint.hcl configurations for team-wide standards.
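The lowercase-hyphenated naming convention mentioned above can be checked mechanically in a pre-commit hook. A hypothetical validator (the regex encodes the convention the post describes; names, not enforced by Terraform itself, are illustrative):

```python
import re

# Lowercase segments separated by single hyphens, e.g. "web-server-prod"
NAME_RE = re.compile(r"^[a-z][a-z0-9]*(-[a-z0-9]+)*$")

def valid_name(name: str) -> bool:
    """Check a resource name against the lowercase-hyphenated convention."""
    return NAME_RE.fullmatch(name) is not None

print(valid_name("web-server-prod"))  # True
print(valid_name("WebServer"))        # False: uppercase letters
print(valid_name("db_primary"))       # False: underscore instead of hyphen
```

A check like this slots naturally next to `terraform fmt` and TFLint in the pre-commit stage of a CI/CD pipeline.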

REST API Design Best Practices

REST API design in 2026 prioritizes noun-based URL structures for improved caching and performance, with benchmarks showing 15% faster response times. HTTP/3 adoption by 68% of major services enhances state synchronization and reduces latency through QUIC. Implementing HATEOAS with embedded links and forms enables dynamic client discovery and stateless interactions. For security, OAuth 2.0 with OpenID Connect (OIDC) per RFC 9700 is mandatory, with Keycloak 26.5 supporting JWT Authorization Grants. Use cursor-based pagination and gzip compression to reduce database load and payload sizes by 60-80%, ensuring scalable, high-performance APIs.
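Cursor-based pagination, mentioned above, hands the client an opaque token encoding the last item seen rather than an offset. A minimal sketch (the base64-encoded JSON cursor, in-memory dataset, and page size are all illustrative choices):

```python
import base64
import json
from typing import Optional

ITEMS = [{"id": i, "name": f"item-{i}"} for i in range(1, 11)]

def encode_cursor(last_id: int) -> str:
    # Opaque token: clients must not parse or construct it themselves
    return base64.urlsafe_b64encode(json.dumps({"after": last_id}).encode()).decode()

def decode_cursor(cursor: Optional[str]) -> int:
    if cursor is None:
        return 0
    return json.loads(base64.urlsafe_b64decode(cursor))["after"]

def list_items(cursor: Optional[str] = None, limit: int = 4) -> dict:
    after = decode_cursor(cursor)
    page = [it for it in ITEMS if it["id"] > after][:limit]
    next_cursor = encode_cursor(page[-1]["id"]) if len(page) == limit else None
    return {"data": page, "next_cursor": next_cursor}

first = list_items()                        # ids 1-4
second = list_items(first["next_cursor"])   # ids 5-8
print([it["id"] for it in second["data"]])  # [5, 6, 7, 8]
```

Unlike offset pagination, the cursor keys directly into an indexed column (`id > after`), so the database never scans and discards skipped rows.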

Self-hosted Plausible Analytics

Self-hosted Plausible Analytics with Community Edition v5.2.1 provides bloggers with GDPR-compliant, cookie-free tracking via a lightweight script added before the closing </head> tag. Real-time dashboards update every 30 seconds, and funnel analysis enhances performance insights. To avoid adblocker interference, proxy the analytics script through a custom domain using Plausible’s managed proxy option. Verify installation by checking real-time traffic data in the Plausible dashboard after visiting the blog. For advanced use, consider implementing UTM campaign tracking and exploring self-hosting under the AGPL license for full data control.

Curated List of Articles about Coding in Python:

Architecture and Design Patterns, Modern Package Management, Building Production-Ready APIs, AI, RAG and LLM Integration, Data Science and Analysis, Document Processing and Web Scraping, Testing, and DevOps Deployment. #Python #Coding #Dev #DevOps #DataScience #AI #RAG #LLM #Architecture #Web #Testing

Best Linux Terminal Emulators: 2026 Comparison.

Compare top Linux terminal emulators: Alacritty, Kitty, WezTerm, GNOME Terminal, and more. Features, performance, and customization options reviewed. https://www.glukhov.org/post/2026/01/terminal-emulators-for-linux-comparison/ #Linux #bash #Dev #DevOps #OpenSource #Hardware #NVidia

Implementing Ollama client applications in Go

Ollama API is a powerful tool designed to facilitate the development and deployment of large language models (LLMs) by providing a robust set of features and efficient model serving capabilities. As of 2025, Ollama supports multiple LLMs, including the latest versions of models such as Google’s FunctionGemma, Nemotron 3 Nano, Olmo 3, and Devstral-Small-2. This versatility allows developers to choose the most suitable model for their specific use cases, whether for code generation, natural language processing, or other specialized tasks.

Building LLM applications with Go using the Ollama API enables scalable, efficient deployments with support for models like Llama3 and Gemma. The /v1/chat/completions endpoint allows Go applications to send HTTP POST requests in OpenAI-compatible format, while the /api/generate endpoint supports real-time inference and log probabilities for specialized use cases. Streaming responses via the stream parameter reduces latency an...
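The OpenAI-compatible endpoint described above accepts a standard chat-completion body regardless of client language; a minimal sketch of constructing one, shown in Python for brevity (the model name and prompt are illustrative; 11434 is Ollama's default port):

```python
import json

def chat_payload(model: str, prompt: str, stream: bool = True) -> dict:
    """Build an OpenAI-compatible request body for /v1/chat/completions."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }

body = json.dumps(chat_payload("llama3", "Say hello"))
print(body)
# POST this body to http://localhost:11434/v1/chat/completions;
# with stream=True, Ollama returns incremental chunks rather than one response.
```

The same body shape works from Go's net/http, which is what makes the OpenAI-compatible endpoint convenient: existing client code for OpenAI-style APIs can be pointed at a local Ollama instance by changing only the base URL.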

RAG vs Long-Context LLMs: A Comprehensive Comparison

RAG and long-context LLMs are two approaches to handling complex language tasks in 2025, each leveraging different mechanisms for information processing. This comparison evaluates their underlying architectures, inference efficiency, context handling, and scalability in real-world applications. Key differences include retrieval integration, memory constraints, and adaptability to dynamic data sources. The analysis covers leading implementations from 2025, including major model versions and framework capabilities. RAG and long-context LLMs both address complex querying but differ in architecture and performance. RAG systems, using Elasticsearch 8.10 and LlamaIndex with Gemini 1.5 Pro, achieve 1-second response times and reduce hallucinations by retrieving external data, making them ideal for dynamic datasets. Long-context LLMs like Gemini 1.5 Pro process up to 1 million tokens internally, enabling single-pass analysis but incurring 45-second latency and higher costs. Choo...

Ollama GPU Acceleration: Ultimate CUDA & ROCm Guide.

Learn how to configure and optimize Ollama for production AI deployment using NVIDIA CUDA and AMD ROCm. This guide covers GPU acceleration setup, performance tuning, and best practices for scalable inference. https://dasroot.net/posts/2026/01/ollama-gpu-acceleration-nvidia-cuda-amd-rocm-guide/ #Ollama #NVIDIA #CUDA #AMDROCm #GPU #AI

Open WebUI: Self-Hosted LLM Interface.

Complete guide to Open WebUI: a powerful self-hosted web interface for Ollama and OpenAI-compatible APIs with RAG, multi-user auth, and Docker deployment. https://www.glukhov.org/post/2026/01/open-webui-overview-quickstart-and-alternatives/ #AI #LLM #Ollama #Docker #SelfHosting #OpenSource #Python #K8S

Calendar of IT Events in Melbourne in 2026

Melbourne continues to be a vibrant hub for technology professionals in 2026, offering a diverse range of conferences, workshops, and community meetups. For people interested in software development, cybersecurity, cloud computing, or AI, Melbourne’s tech scene has something for everyone. Here’s an essential guide to the key IT events in Melbourne happening throughout the year 2026.

Melbourne Tech Events to Go To in 2026.

Comprehensive guide to tech conferences, meetups, and developer events in Melbourne throughout 2026, covering AI, DevOps, Python, security, and more https://www.glukhov.org/post/2026/01/tech-events-melbourne/ #Community #Melbourne #Python #AI #DevOps #Security #K8S

vLLM Quickstart: High-Performance LLM Serving.

Complete vLLM setup guide with Docker, OpenAI API compatibility, PagedAttention optimization. Compare vLLM vs Ollama vs Docker Model Runner for production. https://www.glukhov.org/post/2026/01/vllm-quickstart/ #LLM #AI #Python #Docker #API #Ollama #DevOps #SelfHosting #NVidia #Hardware #PyTorch #DeepLearning #OpenSource #bash #Linux #Cloud #K8S

DGX Spark AU Pricing: $6,249-$7,999 at Major Retailers.

NVIDIA DGX Spark now available in Australia from $6,249 AUD. Compare prices at Centrecom, Scorptec, PCCaseGear, and PLE - ASUS Ascent GX10 & MSI EdgeXpert GB10 configs. https://www.glukhov.org/post/2026/01/dgx-spark-pricing-in-australia/ #SelfHosting #LLM #AI #AICoding #Hardware #NVidia #Melbourne #Ollama #DGXSpark

Programming in Go - Essential Resources

Over time, I have created a set of articles about coding in Go. Here is an updated release note: https://glukhov.au/posts/2026/go-coding/ Hope you find it useful.

DIY Printing Planner Inserts: How to.

Comprehensive guide to creating and printing custom planner inserts using booklet binding software https://www.glukhov.org/post/2025/12/diy-printing-planner-inserts/ #Offline #Filofax #DigitalDetox

Extract Text from PDFs with PDFMiner in Python.

Learn how to extract text from PDF files using PDFMiner.six in Python with practical examples, layout analysis, and performance optimization techniques https://www.glukhov.org/post/2025/12/extract-text-from-pdf-using-pdfminer-python/ #Python #API #Dev #Linux #OpenSource

Playwright: Web Scraping & Testing.

Complete guide to Playwright for web scraping, testing, and browser automation with Python, JavaScript, and TypeScript examples for modern web apps. https://www.glukhov.org/post/2025/12/playwright-for-scraping-and-testing-webapps/ #Python #JavaScript #Node.js #Testing #TypeScript #API #DevOps #OpenSource

Self-Hosting Cognee: Choosing LLM on Ollama.

Testing Cognee RAG framework with local LLMs - gpt-oss, qwen3, deepseek-r1, and others. Real-world results, configs, and performance insights. https://www.glukhov.org/post/2025/12/selfhosting-cognee-quickstart-llms-comparison/ #SelfHosting #LLM #AI #RAG #Python #Ollama #Hardware #Docker #OpenSource

Detecting AI Slop: Techniques & Red Flags.

Learn practical methods for identifying low-quality AI-generated content, including detection tools, linguistic patterns, and technical approaches. https://www.glukhov.org/post/2025/12/ai-slop-detection/ #AI #LLM #NLP #Python #DeepLearning

Vector Stores for RAG Comparison.

Comprehensive comparison of vector databases for Retrieval Augmented Generation: Pinecone, Chroma, Weaviate, Milvus, Qdrant, FAISS, and pgvector. Performance, features, and use cases. https://www.glukhov.org/post/2025/12/vector-stores-for-rag-comparison/ #LLM #AI #RAG #Python #Cloud #SelfHosting #Dev

Ubuntu lost network after kernel upgrade.

How to fix lost network connectivity in Ubuntu after a kernel upgrade https://www.glukhov.org/post/2025/12/ubuntu-lost-network/ #Linux #bash #DevOps #Hardware

Snap vs Flatpak: Ultimate Guide for 2025.

Comprehensive comparison of Snap and Flatpak universal package managers: architecture, performance, security, and which one fits your Linux workflow best. https://www.glukhov.org/post/2025/12/snap-vs-flatpack/ #Linux #DevOps #OpenSource

RAM Price Surge: Up to 619% in 2025.

RAM prices surged 163-619% across global markets in late 2025: AI data center demand, supply constraints, and DDR5 pricing trends. Impact on PC builds and future outlook. https://www.glukhov.org/post/2025/12/ram-price-increase/ #Hardware #AI #SelfHosting

RAM Price in Australia - December 2025.

RAM Price in Australia in December 2025 https://www.glukhov.org/post/2025/12/ram-price-in-australia-december-2025/ #Hardware