Overprovisioning Fiber: Better Safe Than Sorry

2025-03-25

When planning fiber cabling between rooms or buildings, err on the side of caution and install more fiber than you initially need. Future expansion, bandwidth upgrades, and new protocols all demand extra capacity. Furthermore, fiber failures do happen—sometimes inexplicably—and having spare pairs allows for quick recovery. While single-mode and multi-mode fibers have different applications, having sufficient redundancy is crucial for minimizing downtime and costs.

Loki's Structured Metadata: A Logistical Nightmare

2025-03-19

Grafana Loki, often touted as 'Prometheus for logs,' initially adopted a data model similar to Prometheus. However, this proved disastrous for system logs (syslog or systemd journal). Unlike Prometheus, Loki stores each distinct set of label values separately and lacks log compaction, so labels with many possible values lead to cardinality explosions. To address this, Loki introduced 'structured metadata,' but as of version 3.0.0, it remains underdeveloped. Structured metadata labels aren't treated as regular Loki labels and require different query syntax. Migrating from existing labels is complex and potentially catastrophic, with the risk of unintentionally creating high-cardinality labels. Upgrading requires caution, migrating existing data is very expensive, and the feature deserves careful consideration before it is used in new projects.
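The cardinality problem can be illustrated with a small sketch. This is a simplified model, not Loki's actual implementation: it assumes that every distinct combination of label values counts as one separately stored stream, and the record fields (`host`, `unit`, `pid`) are hypothetical.

```python
def count_streams(records, label_keys):
    """Count distinct label-value combinations (a stand-in for 'streams').

    Simplified model: each unique tuple of label values is treated as
    one separately stored stream.
    """
    return len({tuple(r.get(k) for k in label_keys) for r in records})


if __name__ == "__main__":
    # Hypothetical journal-style records: 2 hosts, 2 units, a fresh PID each time.
    records = [
        {"host": f"h{i % 2}", "unit": f"u{(i // 2) % 2}", "pid": str(i)}
        for i in range(1000)
    ]
    print("host+unit:", count_streams(records, ["host", "unit"]))
    print("host+unit+pid:", count_streams(records, ["host", "unit", "pid"]))
```

With bounded labels the count stays small no matter how many log lines arrive; once a per-process value such as a PID becomes a label, the count grows with the number of records.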

Development system logs

JSON: A Pragmatic Choice for Machine-Readable Output on Unix

2025-02-24

The author advocates for using JSON as the machine-readable output format, based on their experience deleting emails from a Postfix mail queue. While not perfect, JSON offers several practical advantages on Unix systems: clarity, broad compatibility, extensive tool support, and easy conversion to other formats. For new programs, the author suggests that using only JSON is the simplest approach, avoiding the complexities of designing custom formats and promoting interoperability between Unix programs.
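As a rough illustration of why JSON output is convenient to consume, here is a minimal sketch in which one program emits one JSON object per line and another filters on a field. The record layout (`queue_id`, `sender`) and the helper names are hypothetical and chosen only for the example; they are not taken from any particular program's output.

```python
import json


def to_json_lines(entries):
    """Serialize each entry as one JSON object per line."""
    return "\n".join(json.dumps(e, sort_keys=True) for e in entries)


def select_ids(json_lines, sender):
    """Return the queue IDs of all entries from the given sender."""
    ids = []
    for line in json_lines.splitlines():
        if not line.strip():
            continue  # tolerate blank lines in the stream
        entry = json.loads(line)
        if entry.get("sender") == sender:
            ids.append(entry["queue_id"])
    return ids


if __name__ == "__main__":
    # Hypothetical queue entries, for illustration only.
    queue = [
        {"queue_id": "A1", "sender": "spam@example.com"},
        {"queue_id": "B2", "sender": "alice@example.org"},
    ]
    print(select_ids(to_json_lines(queue), "spam@example.com"))
```

The consumer needs no custom parser and no knowledge of column positions or quoting rules, which is the practical advantage the summary describes.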

Hidden Cache Hogs: Why Your Disk Space Is Vanishing

2025-02-08

Many Unix programs cache data in hidden `.cache` and `.local` directories, making it difficult for users to find and clear these large cache files that consume significant disk space. The author witnessed firsthand how graduate students in a shared fileserver environment were baffled by these hidden caches, with hundreds of GBs of disk space being unknowingly consumed. The article calls for developers to store caches in visible directories and suggests that disk space usage tools should explicitly show the contents of these hidden directories to aid in user disk space management.
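A disk usage report that surfaces hidden directories could look something like the sketch below. It is only an illustration: the function names are made up for this example, and it sums apparent file sizes (`st_size`), not the blocks actually allocated on disk.

```python
import os


def dir_size(path):
    """Sum the apparent sizes of all files under path (symlinks not followed)."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                pass  # file vanished or is unreadable; skip it
    return total


def hidden_dir_usage(home):
    """Return (name, bytes) for each hidden directory in home, largest first."""
    usage = []
    for entry in os.scandir(home):
        if entry.name.startswith(".") and entry.is_dir(follow_symlinks=False):
            usage.append((entry.name, dir_size(entry.path)))
    return sorted(usage, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for name, size in hidden_dir_usage(os.path.expanduser("~")):
        print(f"{size / 1e9:8.2f} GB  {name}")
```

Listing dot-directories explicitly and sorting them by size puts entries such as `.cache` and `.local` at the top of the report instead of leaving them invisible.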

Sophisticated Phishing Attack Leverages VPN Access

2025-01-29

The University of Toronto's Computer Science department was hit by a highly sophisticated phishing attack. The attacker spoofed a departmental email address, successfully phishing a user's password. Alarmingly, the attacker used the stolen credentials to quickly register the user for the department's VPN, then used the internal-only SMTP gateway to send spam. This demonstrates pre-attack reconnaissance of the target's VPN and email environment, highlighting increasingly advanced attack techniques and the need for robust cybersecurity defenses.

Disabling Password Authentication for Internet-Facing SSH: Security Boost or Overkill?

2025-01-18

This article weighs the pros and cons of disabling password authentication for internet-facing SSH. While strong passwords offer protection against brute-force attacks, the author argues that disabling password authentication provides extra layers of security against stolen credentials, SSH server vulnerabilities, and attacks targeting default accounts. However, this also introduces inconvenience, such as the inability to log in without a keypair. The author suggests a careful consideration of the trade-offs based on individual circumstances.

/etc/glob: The Untold Story of Early Unix Shell Globbing

2025-01-13

This article delves into the history and function of `/etc/glob` in early Unix systems. Before the V7 Bourne Shell, Unix shell globbing wasn't handled by the shell itself but delegated to the external program `/etc/glob`. `/etc/glob` received the command and arguments, expanded wildcards, and then executed the command. The article details how `/etc/glob` worked across different Unix versions, including handling escaped characters and the rationale behind using an external program—likely due to resource constraints in early systems.
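The division of labor described above (a helper program receives the command and its arguments, expands the wildcards, then runs the command) can be sketched in modern terms. This is an illustration of the idea only, not a reconstruction of the historical program: the function names are invented, and passing an unmatched pattern through unchanged is a simplification made for this sketch.

```python
import glob
import subprocess
import sys


def expand_args(args):
    """Expand wildcard arguments; leave all other arguments untouched."""
    expanded = []
    for arg in args:
        if any(ch in arg for ch in "*?["):
            matches = sorted(glob.glob(arg))
            # Simplification for this sketch: keep an unmatched pattern as-is.
            expanded.extend(matches if matches else [arg])
        else:
            expanded.append(arg)
    return expanded


def run_with_globbing(command, args):
    """Run command with its wildcard arguments expanded; return its exit status."""
    return subprocess.run([command, *expand_args(args)]).returncode


if __name__ == "__main__":
    # Usage: python globrun.py COMMAND [ARGS...]
    if len(sys.argv) < 2:
        sys.exit("usage: globrun.py COMMAND [ARGS...]")
    sys.exit(run_with_globbing(sys.argv[1], sys.argv[2:]))
```

In this arrangement the calling shell only has to notice that an argument contains a wildcard character and hand the whole command line to the helper, which keeps the shell itself small.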

Development Unix history Globbing

WireGuard Setup Complexity: A Guide from Simple to Advanced

2025-01-05

This blog post explores various WireGuard setup complexities, ranging from the simplest, with completely isolated internal IP address spaces, to the most challenging 'VPN' setup where some endpoints are accessible both inside and outside the WireGuard tunnel. The author details the difficulty and potential issues of each setup, such as routing conflicts and recursive routing. The article stresses the importance of upfront planning and suggests opting for simpler configurations to avoid complex routing when designing a WireGuard environment.
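One of the pitfalls mentioned, recursive routing, arises when a peer's outer endpoint address falls inside a network that is itself routed through the tunnel. A planning-time check for that situation might look like the sketch below. The function name and the example addresses are assumptions made for illustration; the check only compares addresses and does not read any real WireGuard configuration.

```python
import ipaddress


def endpoint_routed_through_tunnel(endpoint_ip, tunnel_networks):
    """Return True if the peer's outer endpoint falls inside a tunnelled network.

    tunnel_networks is a list of CIDR strings that would be routed over
    the tunnel (for example, the networks listed as allowed IPs).
    """
    addr = ipaddress.ip_address(endpoint_ip)
    return any(
        addr in ipaddress.ip_network(net, strict=False)
        for net in tunnel_networks
    )


if __name__ == "__main__":
    # Documentation-range addresses, used purely as examples.
    print(endpoint_routed_through_tunnel("192.0.2.10", ["192.0.2.0/24"]))
    print(endpoint_routed_through_tunnel("198.51.100.1", ["10.0.0.0/8"]))
```

In the simplest setups the internal address space is completely separate from any endpoint address, so a check like this never fires; the harder setups are the ones where it can.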

Development Network Configuration

Potential Issue with zpool import/export in Linux OpenZFS

2024-12-26

A potential issue exists in Linux OpenZFS versions (as of 2.3.0) regarding importing and exporting ZFS pools. Even if no filesystems within a ZFS pool have the 'sharenfs' property set, `zpool import` and `zpool export` still run `exportfs -ra`. This can wipe out manually added or modified NFS exports, impacting environments like high-availability systems using custom NFS export configurations. The problem stems from OpenZFS blindly executing `exportfs -ra`, regardless of whether NFS exports need changing.

Development

Server Reboot Failure: Cool-Down Reboot Solves Kernel Crash

2024-12-25

The author encountered two identical servers whose kernel crashes could not be resolved by a simple reboot. After crashing, the servers printed a series of machine check exception errors at the system firmware stage, pointing to CPU hardware issues. Powering the servers off, waiting a few minutes, and then powering them back on resolved the problem. This suggests that a brief power interruption may not fully reset certain x86 system components, and that a cool-down period can be needed for complete recovery.

A Decade-Old Fileserver's Second Life: Cost-Effective Storage Solution

2024-12-17

A company is still running a fileserver that is more than a decade old in production. The machine is outdated, with a BMC that requires Java for KVM-over-IP, but its 16 disk bays and 10G Ethernet ports make it ideal for repurposing. Used as a bring-your-own-disk, low-cost storage server, it fulfills the need for high-capacity, low-performance storage despite its age and limited RAM. This highlights the value of reusing old hardware when requirements align.

Scheduled Reboots: A Preventative Approach

2024-12-13

A university research team faced a challenging sysadmin problem: their servers had been running for too long and needed rebooting, but frequent reboots disrupt users. Their default was to avoid reboots, but a recent large-scale reboot forced by prolonged uptime prompted a change. To prevent similar issues, they have decided on a regular reboot schedule of at least three reboots a year, aligned with the university's teaching schedule, to balance preventative maintenance against user experience.
