vLLM V1: Serving LLMs Efficiently at Scale

2025-06-29

Ubicloud's open-source cloud service uses vLLM V1 to serve large language models. This article walks through the vLLM V1 architecture by following an inference request from reception through scheduling and model execution to output processing, and explains key techniques such as asynchronous IPC, continuous batching, and KV cache management. vLLM V1 maximizes GPU utilization through asynchronous processing, a continuous batching algorithm, and parallel GPU computation, enabling high-throughput text generation at scale. The article is aimed at AI engineers deploying LLMs and at anyone interested in how large language models are served efficiently.
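
As a rough illustration of the continuous batching idea mentioned above, the following toy simulation admits a waiting request as soon as a running one finishes, instead of waiting for the whole batch to drain. It is a minimal sketch, not vLLM's scheduler: the `Request` class, the `continuous_batching` function, and the `max_batch_size` parameter are hypothetical names chosen for this example, and each decode step is assumed to produce exactly one token per running request.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    """A hypothetical inference request; this is not a vLLM class."""
    req_id: str
    tokens_to_generate: int  # assumed to be at least 1
    generated: int = 0


def continuous_batching(requests, max_batch_size):
    """Simulate decode steps; return the step at which each request finishes."""
    waiting = deque(requests)
    running = []
    finished_at = {}
    step = 0
    while waiting or running:
        # Admit waiting requests as soon as a batch slot is free.
        while waiting and len(running) < max_batch_size:
            running.append(waiting.popleft())
        step += 1
        # One decode step: every running request produces one token.
        for req in running:
            req.generated += 1
        # Evict finished requests so their slots can be reused next step.
        still_running = []
        for req in running:
            if req.generated >= req.tokens_to_generate:
                finished_at[req.req_id] = step
            else:
                still_running.append(req)
        running = still_running
    return finished_at


if __name__ == "__main__":
    reqs = [Request("a", 2), Request("b", 5), Request("c", 3)]
    print(continuous_batching(reqs, max_batch_size=2))
```

With a batch size of 2, request `c` takes over the slot freed by `a` while `b` is still decoding, so it does not have to wait for `b` to finish before it starts.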

Local NVMe SSDs: The Future of Cloud Databases?

2025-06-02

Cloud storage was originally designed around the limitations of older hardware, using network-attached disks to improve durability and scalability. Today's cost-effective NVMe SSDs, however, offer far better performance. This article shows that PostgreSQL databases on local NVMe SSDs outperform AWS RDS and Aurora several times over in TPC-C and TPC-H benchmarks. Network-attached storage retains advantages in elasticity and durability, but the reliability and affordability of NVMe SSDs now largely offset them, making local NVMe SSDs a compelling option for the future of cloud databases.

Ubicloud's Burstable VMs: CPU Slicing with cgroups v2

2025-05-02

Ubicloud, an open-source alternative to AWS, introduced burstable VMs to reduce cloud costs. Built on Linux cgroups v2, these VMs run on a fraction of shared CPU resources and burst to higher usage during peak loads. The article details how cgroups v2 is configured and used, including the cpuset and cpu controllers, and how it is managed through the virtual filesystem or systemd. In testing, burstable VMs achieved a performance boost of around 30% under light loads, although the gain is limited by cgroups v2's micro-interval restrictions.
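
To make the "fraction of shared CPU" idea concrete, the sketch below builds the value that the cgroups v2 cpu controller expects in its `cpu.max` file, which has the form `<quota> <period>` in microseconds. The helper name `cpu_max_line` and the example fractions are assumptions for illustration; the summary does not state the values Ubicloud actually uses, and the code only formats the string rather than writing to the cgroup filesystem.

```python
def cpu_max_line(cpu_fraction, period_us=100_000):
    """Return a cpu.max value granting `cpu_fraction` of one CPU per period.

    `cpu_fraction` may exceed 1.0 to allow more than one CPU's worth of time.
    The 100000 microsecond period matches the kernel's default for cpu.max.
    """
    if cpu_fraction <= 0:
        raise ValueError("cpu_fraction must be positive")
    if period_us <= 0:
        raise ValueError("period_us must be positive")
    quota_us = int(round(cpu_fraction * period_us))
    return f"{quota_us} {period_us}"


if __name__ == "__main__":
    # Half a CPU: 50 ms of CPU time in every 100 ms period.
    print(cpu_max_line(0.5))
    # Two full CPUs' worth of time per period.
    print(cpu_max_line(2.0))
```

The resulting string would be written to the `cpu.max` file of the VM's cgroup, either directly through the cgroup virtual filesystem or indirectly through systemd, as the summary describes.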

Hetzner AX162 Server Reliability Nightmare: A Painful Debugging Journey

2025-02-19

Ubicloud encountered serious reliability issues with Hetzner's new AX162 servers, which crashed 16 times as often as their predecessor, the AX161. After months of debugging, the team suspected power limiting by Hetzner and motherboard defects as the root causes. Multiple hardware upgrades, especially motherboard replacements, ultimately resolved the issue. The experience highlighted the risks of early adoption and led to process improvements, including more thorough vetting and gradual hardware rollouts.

Deep Dive into Cloud Virtualization: Red Hat, AWS Firecracker, and Ubicloud Internals

2025-01-24

This blog post examines the core architectures of cloud virtualization, using Red Hat, AWS Firecracker, and Ubicloud as case studies to compare their virtual machine monitors (VMMs), kernel virtualization, and resource isolation. It explains the roles of key components such as KVM, QEMU, and libvirt, and analyzes how technologies such as cgroups, nftables, and seccomp-bpf provide resource and security isolation. The author also contrasts these with the AWS Nitro system, summarizing the evolution of cloud virtualization technology and the importance of open-source technology in this field.
