mcblockd 5 years on

mcblockd, the firewall automation I created 5 years ago, continues to work.

However, it’s interesting to note how things have changed. Looking at just the addresses I block from accessing port 22…

While China remains at the top of my list of total number of blocked IP addresses, the US is now in 2nd place. In 2017, the US wasn’t even in the top 20. What has changed?

Most of the change here is driven by my automation seeing more and more attacks originating from cloud hosted services. Amazon EC2, Google, Microsoft, DigitalOcean, Linode, Oracle, et al. While my automation policy won’t go wider than a /24 for a probe from a known US entity, over time I see probes from entire swaths of contiguous /24 networks in the same address space allocation, which will be coalesced to reduce firewall table size. Two adjacent /24 networks become a single /23. Two adjacent /23 networks become a single /22. All the way up to a possible /8 (the automation stops there).
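
This aggregation is the classic CIDR coalescing operation. A minimal sketch of the idea (not mcblockd’s actual code) using Python’s standard ipaddress module:

```python
# Sketch of prefix coalescing: adjacent, properly aligned networks
# merge into a single shorter prefix.  Not mcblockd's code; just the
# same idea using Python's standard library.
from ipaddress import ip_network, collapse_addresses

blocked = [
    ip_network('203.0.112.0/24'),   # two adjacent, aligned /24s...
    ip_network('203.0.113.0/24'),
    ip_network('198.51.100.0/24'),  # ...and one with no adjacent peer
]

merged = sorted(collapse_addresses(blocked))
print(merged)
# [IPv4Network('198.51.100.0/24'), IPv4Network('203.0.112.0/23')]
```

Note that collapse_addresses applies the merge recursively, so four aligned /24s would come back as a single /22, and so on up the ladder.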

So today, on the last day of 2022, I see some very large blocks owned by our cloud providers being blocked by my automation due to receiving ssh probes from large contiguous swaths of their address space.

I am very appreciative of the good things from big tech. But I’m starting to see the current cloud computing companies as the arms dealers of cyberspace.

My top 2 countries:

    CN 131,560,960 addresses
       /9 networks:    1 (8,388,608 addresses)
      /10 networks:   10 (41,943,040 addresses)
      /11 networks:   12 (25,165,824 addresses)
      /12 networks:   18 (18,874,368 addresses)
      /13 networks:   29 (15,204,352 addresses)
      /14 networks:   48 (12,582,912 addresses)
      /15 networks:   48 (6,291,456 addresses)
      /16 networks:   37 (2,424,832 addresses)
      /17 networks:   14 (458,752 addresses)
      /18 networks:    7 (114,688 addresses)
      /19 networks:   10 (81,920 addresses)
      /20 networks:    5 (20,480 addresses)
      /21 networks:    3 (6,144 addresses)
      /22 networks:    3 (3,072 addresses)
      /23 networks:    1 (512 addresses)

    US 92,199,996 addresses
       /9 networks:    3 (25,165,824 addresses)
      /10 networks:    5 (20,971,520 addresses)
      /11 networks:   10 (20,971,520 addresses)
      /12 networks:    9 (9,437,184 addresses)
      /13 networks:   16 (8,388,608 addresses)
      /14 networks:   10 (2,621,440 addresses)
      /15 networks:    8 (1,048,576 addresses)
      /16 networks:   42 (2,752,512 addresses)
      /17 networks:   10 (327,680 addresses)
      /18 networks:   11 (180,224 addresses)
      /19 networks:    8 (65,536 addresses)
      /20 networks:   10 (40,960 addresses)
      /21 networks:    2 (4,096 addresses)
      /22 networks:    9 (9,216 addresses)
      /23 networks:    9 (4,608 addresses)
      /24 networks:  818 (209,408 addresses)
      /25 networks:    4 (512 addresses)
      /26 networks:    5 (320 addresses)
      /27 networks:    5 (160 addresses)
      /28 networks:    2 (32 addresses)
      /29 networks:    7 (56 addresses)
      /30 networks:    1 (4 addresses)

You can clearly see the effect of my automation policy for the US. Lots of /24 networks get added, most of them with a 30 to 35 day expiration. Note that expirations increase for repeat offenses. But over time, as contiguous /24 networks are added due to sending probes at my firewall, aggregation leads to wider net masks (shorter prefix lengths). Since I’m sorting countries by the total number of addresses I’m blocking, shorter prefixes obviously have a much more profound effect than longer ones.

TREBLEET Super Thunderbolt 3 Dock: First Impressions

TREBLEET Super Thunderbolt 3 Dock at Amazon

https://www.trebleet.com/product-page/mac-mini-thunderbolt-3-dock-with-nvme-sata-slot-cfexpress-card-slot-gray

I received this on August 25, 2022. I immediately installed a Samsung 980 Pro 1TB NVMe, then plugged the dock into AC power via the included power supply brick and into the Mac Studio M1 Ultra via the included Thunderbolt 3 cable. The performance to/from the Samsung 980 Pro 1TB NVMe is what I had hoped.

This is more than 3X faster than any other dock in this form factor available today. Sure, it’s not PCIe 4.0 NVMe speeds, but given that all other docks available in this form factor max out at 770 MB/s, and that Thunderbolt 3/4 is 5 GB/s, this is great.

I also checked some of the data in system report. All looks OK.

My first impression: this is the only dock to buy if you want NVMe in this form factor. Nothing else comes close speed-wise. Yes it’s pricey. Yes, it’s not a big brand name in North America. But they did the right thing with PCIe lane allocation, which hasn’t happened with OWC, Satechi or anyone else.

There’s really no point in buying a dock with NVMe if it won’t ever be able to run much faster than a good SATA SSD (I hope OWC, Satechi, Hagibis, AGPTEK, Qwizlab and others are paying attention). Buy this dock if you need NVMe storage. I can’t speak to longevity yet, but my initial rating: 5 out of 5 stars.

Mac Studio M1 Ultra: Thanks, Apple!

I finally have a computer I LIKE to place on my desk. I’m speaking of the Mac Studio M1 Ultra.

Apple finally created a desktop that mostly fits my needs. My only wishlist item: upgradeable internal storage (DIY or at Apple Store, I don’t care).

This was partly coincidence. The Mac Studio with M1 Ultra ticks the boxes I care about for my primary desktop. Faster than my Threadripper 3960X for compiling my C++ projects while small, aesthetically pleasing, quiet and cool. 10G ethernet? Check. 128G RAM? Check. Enough CPU cores for my work? Check. Fast internal storage? Check. Low power consumption? Check.

I’m serious: thanks, Apple!

This machine won’t be for everyone. News flash: no machine is for everyone. But for my current and foreseeable primary desktop needs, it’s great. And it’ll remain that way as long as we still have accessories for Thunderbolt available that are designed for the Mac Mini or Mac Studio. This isn’t a substitute for the Mac Pro; I can’t put PCIe cards in it, nor 1.5TB of RAM (or any beyond the 128G that came with mine). It’s also way more than a current Mac Mini. But that’s the point: it fills a spot that was empty in Apple’s lineup for a decade, which happens to be the sweet spot for people like me. Time is money, but I don’t need GPUs. I don’t need 1.5TB of RAM. I don’t need 100G ethernet (though I do need 10G ethernet). I’m not a video editor nor photographer; my ideal display is 21:9 at around 38 inches, for productivity (many Terminal windows), not for media. Hence the Studio Display and the Pro XDR are not good fits for me. But the Mac Studio M1 Ultra does what I need, really well.

Some people at Apple did their homework. Some championed what was done. Some did some really fine work putting it all together, from design to manufacturing. Some probably argued that it was a stopgap until the Apple silicon Mac Pro, and that’s true.

That last part doesn’t make it a temporary product. Apple, please please please keep this tier alive. There are many of us out here who can’t work effectively with a Mac Mini, iMac or MacBook Pro but find it impossible to cost-justify a Mac Pro. And post-COVID there are many of us with multiple offices, one of which is at home. At home I don’t need a Mac Pro, nor do I really want one in my living space. I need just enough oomph to do real work efficiently, but don’t want a tower on my desk or the floor, or even a rack-mounted machine (my home office racks are full).

I don’t care what machine occupies this space. But I’ll buy in this space, again and again, whereas I don’t see myself ever buying a Mac Pro for home with the current pricing structure.

Mac Studio M1 Ultra: The First Drive

Given that my new Mac Studio M1 Ultra is an ‘open box’ unit, I needed to fire it up and make sure that it works properly. One of the things I needed to check: that it works fine with my Dell U3818DW via USB-C for display. I have seen many reports of problems with ultra wide displays and M1 Macs, and I do not have a new display on my shopping list.

So on Sunday I left my hackintosh plugged in to the DisplayPort on the U3818DW, and plugged the Mac Studio into the USB-C port. It looks to me like it works just fine. I get native resolution, 3840×1600, with no fuss.

I am using a new Apple Magic Trackpad 2, and an old WASD CODE keyboard just to set things up. I don’t really need the new trackpad, since eventually I’ll decommission my hackintosh and take the trackpad from there. But I need one during the transition, and it was on sale at B&H.

With just a 30 minute spin… wow. I honestly can’t believe how zippy this machine is, right out of the box. Therein lies the beauty of using the same desktop computer for 10 years; when you finally upgrade, the odds are very good that you’re going to notice a significant improvement. In some cases, some of it will just be “less accumulated cruft launched at startup and login”. But in 10 years, the hardware is going to be much faster.

Compiling libDwm on the Mac Studio M1 Ultra with ‘make -j20’ takes 32 seconds. Compiling it on my Threadripper 3960X machine with 256G of RAM with ‘make -j24’ takes 40 seconds. You read that correctly… the M1 Ultra soundly beats my Threadripper 3960X at my most common ‘oomph’ activity (compiling C++ code), despite having a slower base clock and only 16 performance cores and 4 efficiency cores. While using a fraction of the electricity. Bravo!

“Moore’s Law is dead.” In the strictest sense, just on transistor density, this is mostly true. Process shrink has slowed down, etc. But the rules changed for many computing domains long before we were talking about TSMC 5nm. See Herb Sutter’s “The Free Lunch is Over”. Dies have grown (more transistors), core counts have grown, and clock speed has increased, but very slowly compared to the olden days. Cache is, well, something you really need to look at when buying a CPU for a given workload.

This last point is something I haven’t had time to research, in terms of analysis. If you need performant software on a general purpose computer, cache friendliness is likely to matter. Up until recently, reaching out to RAM versus on-chip or on-die cache came with a severe penalty. That of course remains true on our common platforms (including Apple silicon). However, Apple put the RAM in the SoC. For the M1 Ultra, the claimed bandwidth is 800 GB/sec. DDR4-3200 is 25.6 GB/sec per channel: 51.2 GB/sec for a typical dual-channel desktop, 204.8 GB/sec for an 8-channel workstation. DDR5-4800 is 38.4 GB/sec per channel. Let that sink in for a moment… the memory bandwidth of the M1 Ultra is more than a decimal order of magnitude higher than what we see in typical dual-channel Intel and AMD desktops. My question: how significant has this been for the benchmarks and real workloads? If significant, does this mean we’re going to see the industry follow Apple here? AMD and Intel releasing SoCs with CPU and RAM?
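
The peak-bandwidth arithmetic is simple to check: transfer rate times bus width per channel, times channel count. (The 800 GB/sec figure is Apple’s; the rest is just multiplication.)

```python
# Peak theoretical DRAM bandwidth: transfer rate (MT/s) x bus width.
def channel_bw_gbs(mt_per_s, bytes_per_transfer=8):
    """Peak bandwidth of one 64-bit channel in GB/s."""
    return mt_per_s * bytes_per_transfer / 1000

ddr4_3200 = channel_bw_gbs(3200)   # 25.6 GB/s per channel
ddr5_4800 = channel_bw_gbs(4800)   # 38.4 GB/s per channel

print(ddr4_3200 * 2)               # 51.2 GB/s: typical dual-channel desktop
print(ddr4_3200 * 8)               # 204.8 GB/s: 8-channel workstation/server
print(800 / (ddr4_3200 * 2))       # 15.625x: M1 Ultra vs dual-channel DDR4
```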

I know there are tinkerers that bemoan this future. But we bemoan the loss of many things in computing. I’m going to remain optimistic. Do I personally really care if today’s CPU + RAM purchase turns into an SoC purchase? To be honest, not really. But that’s just me; computing needs are very diverse. Those of us who tinker, well, we might just wind up tinkering with fewer parts. I don’t see the whole PC industry reversing any time soon in a manner that creates a walled garden any more than what we have today. It’s not like the current industry hasn’t been good for Intel and AMD. Yes, computing needs have diversified and we’ve put ‘enough’ power into smaller devices to meet the needs of many more consumers. And Intel and AMD have largely been absent in mobile. But they’ve maintained a solid foothold in the server market, cloud infrastructure, HPC, etc. As a consumer I appreciate the diversity of options in the current marketplace. We speak with our wallets. If we’re a market, I trust we’ll be served.

Apple turned heads here. For some computing needs (including my primary desktop), it appears the M1 Mac Studio is a winner. It doesn’t replace my Linux and Windows workstation, nor any of my servers, nor any of my Raspberry Pis. But for what I (and some others) need from a desktop computer, the M1 Mac Studio is the best thing Apple has done in quite some time. It hits the right points for some of us, in a price tier that’s been empty since the original cheese grater Mac Pro (2006 to way-too-late 2013). It also happens to be a nice jolt of competition. This is good for us, the consumers. Even if I never desired an Apple product, I’d celebrate. Kudos to Apple. And thanks!

Mac Studio M1 Ultra: The Decision

I’ve needed a macOS desktop for many years. My hackintosh, built when Apple had no current hardware to do what I needed to do, is more than 10 years old. It’s my primary desktop. It’s behind on OS updates (WAY behind). It’s old. To be honest, I’m quite surprised it still runs at all. Especially the AIO CPU cooler.

The urgency was amplified when Apple silicon for macOS hit the streets. Apple is in transition, and at some point in the future, there will be no support for macOS on Intel. They’ve replaced Intel in the laptops, there’s an M1 iMac and an M1 mini, and now the Mac Studio. We’ve yet to see an Apple silicon Mac Pro, and while I’m sure it’s coming, I can’t say when, nor anything about the pricing. If I assume roughly 2X multicore performance versus the M1 Ultra SoC, plus reasonable PCIe expansion, it’ll likely be out of my price range.

Fortunately, for today and the foreseeable future, the Mac Studio fits my needs. In terms of time == money, my main use is compiling C++ and C code. While single-core performance helps, so does multi-core for any build that has many units that can be compiled in parallel. So, for example, the Mac Studio with M1 Ultra has 20 CPU cores. Meaning my builds can compile 20 compilation units in parallel. Obviously there are points in my builds where I need to invoke the archiver or the linker on the result of many compiles. Meaning that for parts of the build, we’ll be single-core for a short period unless the tool itself (archiver, linker, etc.) uses multiple threads.

It’s important to note that a modern C++ compiler is, in general, a memory hog. It’s pretty common for me to see clang using 1G of RAM (resident!). Run 20 instances, that’s 20G of RAM. In other words, the 20 cores need at least 1G each to run without swapping. Add on all the apps I normally have running, and 32G is not enough RAM for me to make really effective use of the 20 cores, day in and day out.
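
That sizing rule is simple enough to sketch. (The 1G-per-instance figure is my own observation of clang on my projects, not a guarantee.)

```python
# Rule of thumb: usable build parallelism is capped by both core count
# and the RAM left over after the OS and everyday apps.  The 1 GB per
# compile job is an observed figure for my C++ code, not a constant.
def max_parallel_jobs(cores, total_ram_gb, other_usage_gb, gb_per_compile=1.0):
    """How many compile jobs fit without swapping."""
    ram_limited = int((total_ram_gb - other_usage_gb) / gb_per_compile)
    return max(1, min(cores, ram_limited))

# 20 cores, 32G total, ~14G in use elsewhere: RAM-limited to 18 jobs.
print(max_parallel_jobs(20, 32, 14))   # 18
# Same 20 cores with 64G: core-limited, all 20 usable.
print(max_parallel_jobs(20, 64, 14))   # 20
```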

So 64G would be my target. And given that the CPU and GPU share that memory, that’s a good target for me. However…

Availability of Mac Studios with the exact configuration I wanted has been abysmal since… well… introduction. I wanted M1 Ultra with 64-core GPU, 64G RAM, 2TB storage. Apple’s lead time for this or anything close: 12 weeks. I’m assuming that a lot of this is the ongoing supply chain issues, COVID and possibly yield issues for the M1 Ultra. Apple is missing out on revenue here, so it’s not some sort of intentional move on their part, as near as I can tell. While I think there are M2 Pro and M2 Max on the horizon for the MacBook Pro (I dunno, 1H2023?), I think it’ll be a year before I see something clearly better for my use than the M1 Ultra. I can’t wait a year, unfortunately. I also can’t wait 3 months.

In fact, since I’m closing in on finishing the den, and need to move my office there, this is now urgent just from a space and aesthetics perspective. I intentionally designed the desk overbridges in the den to comfortably accommodate a Mac Studio (or Mac Mini) underneath either side. I DON’T want my hackintosh in this room! I want quiet, aesthetically pleasing, small, inconspicuous, efficient, and not a major source of heat. I need 10G ethernet. Fortunately, the Mac Studio ticks all of the boxes.

Today I picked up what was available, not exactly what I wanted. It’s an open box and hence $500 off: a Mac Studio with M1 Ultra, 64-core GPU, 128G of RAM and 1TB storage. The only thing from my wishlist not met here: 2TB storage. However, I’m only using 45% of the space on my 1TB drive in my hackintosh, and I haven’t tried to clean up much. I don’t keep music and movies on my desktop machine, but if I wanted to with the Mac Studio, I could plug in Thunderbolt 4 storage.

I’m much more excited about moving into the den than I am about the new computer. That’s unlikely to change, since the den remodeling is the culmination of a lot of work. And I know that I’m going to have to fiddle to make the new Mac Studio work well with my Dell U3818DW display. Assuming that goes well, I’m sure I’ll have a positive reaction to the Mac Studio. The Geekbench single-core scores are double that of my hackintosh. The multi-core scores are 7 times higher. This just gives me confidence that I’ll notice the speed when using it for my work. Especially since the storage is roughly a decimal order of magnitude faster. The 2TB model would be faster still, but even at 1TB the jump from SATA to NVMe will be huge for my desktop. I notice this in my Threadripper machine and I’ll notice it here.

My main concern long-term is the cooling system. Being a custom solution from Apple, I don’t have options when the blower fans fail. Hopefully Apple will extend repairability beyond my first 3 years of AppleCare+. I like keeping my main desktop for more than 3 years. While in some ways it’s the easiest one to replace since it’s not rackmounted and isn’t critical to other infrastructure, it’s also my primary interface to all of my other machines: the Threadripper workstation for Linux and Windows development, my network storage machine, my web server, my gateway, and of course the web, Messages, email, Teams, Discord, etc. It saves me time and money if it lasts awhile.

UPS fiasco and mcrover to the rescue

I installed a new Eaton 5PX1500RT in my basement rack this week. I’d call it “planned, sort of…”. My last Powerware 5115 1U UPS went into an odd state which precipitated the new purchase. However, it was on my todo list to make this change.

I already own an Eaton 5PX1500RT, which I bought in 2019. I’ve been very happy with it. It’s in the basement rack, servicing a server, my gateway, ethernet switches and broadband modem. As is my desire, it is under 35% load.

The Powerware 5115 was servicing my storage server, and also under 35% load. This server has dual redundant 900W power supplies.

Installation of the new UPS… no big deal: install the ears, install the rack rails, rack the UPS.

Shut down the devices plugged into the old UPS, plug them in to the new UPS. Boot up, check each device.

Install the USB cable from the UPS to the computer that will monitor the state of the UPS. Install Network UPS Tools (nut) on that computer. Configure it, start it, check it.
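
For reference, the Network UPS Tools side of this is a small ups.conf entry along these lines (the paths and entry name here are mine, illustrative only):

```
# /usr/local/etc/nut/ups.conf on FreeBSD (the entry name is arbitrary)
[eaton5px]
    driver = usbhid-ups
    port = auto
    desc = "Eaton 5PX1500RT"
```

After that, `upsdrvctl start` brings up the driver and `upsc eaton5px` should dump the UPS variables.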

This week, at this step things got… interesting.

I was monitoring the old Powerware 5115 from ‘ria’. ‘ria’ is a 1U SuperMicro server with a single Xeon E3-1270 V2. It has four 1G ethernet ports and a Mellanox 10G SFP+ card. Two USB ports. And a serial port which has been connected to the Powerware 5115 for… I don’t know, 8 years?

I can monitor the Eaton 5PX1500RT via a serial connection. However, USB is more modern, right? And the cables are less unwieldy (more wieldy). So I used the USB cable.

Trouble started here. The usbhid-ups driver did not reliably connect to the UPS. When it did, it took a long time (in excess of 5 seconds, an eternity in computing time). ‘ria’ is running FreeBSD 12.3-STABLE on bare metal.

I initially decided that I’d deal with it this weekend. Either go back to using a serial connection or try using a host other than ‘ria’. However…

I soon noticed long periods where mcrover was displaying alerts for many services on many hosts. Including alerts for local services, whose test traffic does not traverse the machine I touched (‘ria’). And big delays when using my web browser. Hmm…

Poking around, I seemed to only be able to reliably reproduce a network problem by pinging certain hosts with ICMPv4 from ria and observing periods where the round trip time would go from .05 milliseconds to 15 or 20 seconds. No packets lost, just periods with huge delays. These were all hosts on the same 10G ethernet network. ICMPv6 to the same hosts: no issues. Hmm…

I was eventually able to correlate (in my head) what I was seeing in the many mcrover alerts. On the surface, many didn’t involve ‘ria’. But under the hood they DO involve ‘ria’, simply because ‘ria’ is my primary name server. So, for example, tests that probe via both IPv6 and IPv4 might get the AAAA record but not the A record for the destination, or vice versa, or neither, or both. ‘ria’ is also the default route for these hosts. I homed in on the 10G ethernet interface on ‘ria’.

What did IPv4 versus IPv6 have to do with the problem? I don’t know without digging through kernel source. What was happening: essentially a network ‘pause’. Packets destined for ‘ria’ were not dropped, but queued for later delivery. As many as 20 seconds later! The solution? Unplug the USB cable for the UPS and kill usbhid-ups. In the FreeBSD kernel, is USB hoarding a lock shared with part of the network stack?

usbhid-ups works from another Supermicro server running the same version of FreeBSD. Different hardware (dual Xeon L5640). Same model of UPS with the same firmware.

This leads me to believe this isn’t really a lock issue. It’s more likely an interrupt routing issue. And I do remember that I had to add hw.acpi.sci.polarity="low" to /boot/loader.conf on ‘ria’ a while ago to avoid acpi0 interrupt storms (commented out recently with no observed consequence). What I don’t remember: what were all the issues I found that prompted me to add that line way back when?

Anyway… today’s lesson: assume the last thing you changed has a high probability of being the cause, even if there seems to be no sensible correlation. My experience this week: “Unplug the USB connection to the UPS and the 10G ethernet starts working again. Wait, what?!”

And today’s thanks goes to mcrover. I might not have figured this out for considerably longer if I did not have alert information in my view. Being a comes-and-goes problem that only seemed to be reproducible between particular hosts using particular protocols might have made this a much more painful problem to troubleshoot without reliable status information on a dedicated display. Yes, it took some thinking and observing, and then some manual investigation and backtracking. But the whole time, I had a status display showing me what was observable. Nice!

mcrover updates

I spent some weekend time updating mcrover this month. I had two motivations:

  1. I wanted deeper tests for two web applications: WordPress and piwigo. Plex will be on the list soon, I’m just not thrilled that their API is XML-based (Xerces is a significant dependency, and other XML libraries are not as robust or complete).
  2. I wanted a test for servers using libDwmCredence. Today that’s mcroverd and mcweatherd. dwmrdapd and mcblockd will soon be transitioned from libDwmAuth to libDwmCredence.

I created a web application alert, which allows me to create code to test web applications fairly easily and generically. Obviously, the test for a given application is application-specific. Applications with REST interfaces that emit JSON are pretty easy to add.
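
mcrover itself is C++, but the shape of such a check is straightforward. A hedged sketch in Python (the status URL and field names any caller would pass are hypothetical; each application defines its own):

```python
# Sketch of a generic "deep" web application check: fetch a JSON
# status document and verify it has the fields we care about.
# Hypothetical URL and field names; not mcrover's actual code.
import json
from urllib.request import urlopen

def validate_status(doc, required_fields):
    """Check a parsed JSON status document for required fields."""
    missing = [f for f in required_fields if f not in doc]
    return (not missing), missing

def check_json_app(url, required_fields, timeout=5):
    """Fetch a JSON status URL and validate it; returns (ok, detail)."""
    try:
        with urlopen(url, timeout=timeout) as resp:
            doc = json.load(resp)
    except Exception as e:
        return False, str(e)
    ok, missing = validate_status(doc, required_fields)
    return ok, ("ok" if ok else f"missing fields: {missing}")

print(validate_status({'version': '1.0', 'db': 'up'}, ['version', 'db']))
# (True, [])
```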

The alerts and code to check servers using libDwmCredence allow me to attempt connection and authentication to any service using Dwm::Credence::Peer. These are the steps that establish a connection; they are the prologue to application messaging. This allows me to check the status of a service using Dwm::Credence::Peer in lieu of a deeper (application-specific) check, so I can monitor new services very early in the deployment cycle.

Time for a new desktop keyboard: the switches

I just ordered a set of Kailh Box Pink key switches from novelkeys.xyz. And I’m going to order a barebones hot swap keyboard. Why?

First, I spilled some coffee in my desktop keyboard. Maybe 1/4 cup. Which is a lot, though nowhere near what I’ve spilled on some of my buckling spring keyboards. I tore it down, doused it with 99% isopropyl alcohol, cleaned all the switches, the PCB (both sides), the case (inside and out), the keycaps. It’s working again, but it’s a reminder….

It’s my oldest CODE keyboard, Cherry MX blue switches. I like the CODE keyboard. I don’t love the price though, and I also don’t love how difficult it is to disassemble. I’ve had it apart a couple of times. And this time, one of the case tabs broke. It’s not a huge issue, but it’s annoying. If I press on the lower right corner of the keyboard case… “SQUEAK”. It’s likely more the matter that I know it’s broken than it is any sort of real annoyance, but…

I’ve never loved Cherry MX switches. My computing life began when we had some truly delightful keyboards to type on. Before the IBM Model M (very good). Even before the IBM Model F (better). Anyone truly interested in how we got where we are today would do well to do some history homework online before making their next mechanical keyboard purchase. But I can sum it up in two words: cost reductions.

I’m not going to judge that. It is what it is, and the good far outweighs the bad; cost reductions are what have allowed many of us to own desktop and laptop computers for many decades.

But… there are those of us that spend our lives at our keyboards because it’s our livelihood. And there are a LOT of us. And some of us care deeply about our human-machine interface that we spend 8+ hours at each day. And guess what? We’re not all the same. Unfortunately, we’ve mostly been saddled with only two predominant key switches for keyboards for a very long time now: various rubber dome keyboards (pretty much universally considered crummy but inexpensive), and those with Cherry MX (or something similar to Cherry MX). We do still have Unicomp making buckling spring keyboards with tooling from Lexmark (who manufactured the Model M keyboards for the North American market). And we have some new switch types (optical, hall effect, etc.). But at the moment, the keyboard world is predominantly Cherry colored.

Perhaps worse, those of us that like a switch to be both tactile and clicky have few good choices. Unicomp buckling spring is at the top for readily available and reasonably priced. But the compromises for a modern desktop are significant for a large crowd of us. Number one would be that it’s huge (including the SSK version). For some, number two would be no backlighting. And yet others want more keycap options. But it’s a long drop from the buckling spring to any MX-style switch if your goal is clicky and tactile.

I don’t hate Cherry MX blues. Nor Gateron blues. Nor many others. But… most of them feel like just what they’re designed to be. They’re not smooth (including the linears), most of them are not truly tactile, and they’re fragile (not protected from dust or liquid ingress). Most have considerable key wobble. They’re usable, I’ve just been wanting something better for a while. In a TKL (tenkeyless) keyboard with minimal footprint.

Some personal history… one of the reasons I stopped using buckling spring was just the sheer size of any of my true Model M keyboards or any of the Unicomps. The other was the activation force. I wanted something a little lighter to the touch, but still noisy. The Cherry MX blue and similar have filled the niche for me for a while. But… the scratchiness/crunchiness has always been less than ideal to me, and the sound in any board I’ve tried has been less satisfying than I’d like. I’ve not had any of the switches die on me, which is a testament to durability. But I’ve had to clean them on more than one occasion due to a small spill, to get them working again. And over time, despite the fact that they still function, their characteristics change. Some keys lose some of their clickiness. Some get louder. And out of the box, they’re not terribly consistent sound-wise. And while I’ve disassembled keyboards to clean and lube the switches… it’s very time consuming. And despite the fact that I have a pretty good hot-air rework setup, it’s very hard for me to justify spending time replacing soldered switches. I can barely justify time swapping hot-swap keys!

So… I want a more durable switch. And something smoother (less scratch/crunch) than a Cherry MX, but with a nice distinct click. And unfortunately, something that works in a PCB designed for Cherry MX since there are far and away the most options there. The Kailh White or Pink seem to fit the bill. The white are readily available, so I bought the pink just to make sure I don’t miss out on available inventory. I’ll put them in a hot-swap board with PBT keycaps and give them a test drive for a few weeks.

I know the downsides ahead of time. I had an adjustment to make when I went from buckling spring to Cherry MX blue. Buckling spring feedback and activation occur at the same time; it’s ideal. Cherry MX and related designs… most of them activate after the click. The Kailh pink and white appear to activate before the click, and they don’t have the hysteresis of the Cherry MX switches. But based on my own personal preferences which are aligned pretty closely to those who’ve reviewed the Kailh Box White and Kailh Box Pink (like Chyrosran on YouTube), I think one of these switches will make me happier than my MX blues.

Of course I could be wrong. But that’s why I’m going with an inexpensive hot-swap board for this test drive. PCB, mounting and chassis all play a significant role in how a keyboard feels and sounds. But I know many of those differences, and the goal at the moment is to pick the switches I want in my next long-term keyboard.

Threadripper 3960X: the birth of ‘thrip’

I recently assembled a new workstation for home. My primary need was a machine for software development, including deep learning. This machine is named “thrip”.

Having looked hard at my options, I decided on AMD Threadripper 3960X as my CPU. A primary driver was of course bang for the buck. I wanted PCIe 4.0, at least 18 cores, at least 4-channel RAM, the ability to utilize 256G or more of RAM, and to stay in budget.

By CPU core count alone, the 3960X is over what I needed. On the flip side, it’s constrained to 256G of RAM, and it’s also more difficult to keep cool than most CPUs (280W TDP). But on price-per-core, and overall performance per dollar, it was the clear winner for my needs.

Motherboard-wise, I wanted 10G ethernet, some USB-C, a reasonable number of USB-A ports, room for 2 large GPUs, robust VRM, and space for at least three NVMe M.2 drives. Thunderbolt 3 would have been nice, but none of the handful of TRX40 boards seem to officially support it (I don’t know if this is an Intel licensing issue or something else). The Gigabyte board has the header and Wendell@Level1Techs seems to have gotten it working, but I didn’t like other aspects of the Gigabyte TRX40 AORUS EXTREME board (the XL-ATX form factor, for example, is still limiting in terms of case options).

I prefer to build my own workstations. It’s not due to being particularly good at it, or winding up with something better than I could get pre-built. It’s that I enjoy the creative process of selecting parts and putting it all together.

I had not assembled a workstation in quite some time. My old i7-2700K machine has met my needs for most of the last 8 years. And due to a global pandemic, it wasn’t a great time to build a new computer. The supply chain has been troublesome for over 6 months now, especially for some specific parts (1000W-and-up 80+ Titanium PSUs, for example). We’ve also had a huge availability problem for the current GPUs from NVIDIA (RTX 3000 series) and AMD (Radeon 6000 series). And I wasn’t thrilled about doing a custom water-cooling loop again, but I couldn’t find a worthy quiet cooling solution for Threadripper and 2080ti without going custom loop. Given the constraints, I wound up with these parts as the guts:

  • Asus TRX40 ROG Zenith II Extreme Alpha motherboard
  • AMD Threadripper 3960X CPU (24 cores)
  • 256 gigabytes of G.Skill Trident Z Neo Series RGB DDR4-3200 CL16 RAM (8 x 32G)
  • EVGA RTX 2080 Ti FTW3 Ultra GPU with EK Quantum Vector FTW3 waterblock
  • Sabrent 1TB Rocket NVMe 4.0 Gen4 PCIe M.2 Internal SSD
  • Seasonic PRIME TX-850, 850W 80+ Titanium power supply
  • Watercool HEATKILLER IV PRO for Threadripper, pure copper CPU waterblock

It’s all in a Lian Li PC-O11D XL case. I have three 360mm radiators, ten Noctua 120mm PWM fans, an EK Quantum Kinetic TBE 200 D5 PWM pump, PETG tubing and a whole bunch of Bitspower fittings.

My impressions thus far: it’s fantastic for Linux software development. It’s so nice to be able to run ‘make -j40’ on large C++ projects and have them complete in a timely manner. And it runs cool and very quiet.

An ode to NSFNET and ANSnet: a simple NMS for home

A bit of history…

I started my computing career at NSFNET at the end of 1991, which then became ANSnet. In those days, we had a home-brewed network monitoring system. I believe most (or all) of it was originally the brainchild of Bill Norton. Later there were several contributors: Linda Liebengood, myself, and others. The important thing for today’s thoughts: it was named “rover”, and its user interface philosophy was simple but important: “Only show me actionable problems, and do it as quickly as possible.”

To understand this philosophy, you have to know something about the primary users: the network operators in the Network Operations Center (NOC). One of their many jobs was to observe problems, perform initial triage, and document their observations in a trouble ticket. From there they might fix the problem, escalate to network engineering, etc. But it wasn’t expected that we’d have some omniscient tool that could give them all of the data they (or anyone else) needed to resolve the problem. We expected everyone to use their brains, and we wanted our primary problem reporter to be fast and as clutter-free as possible.

For decades now, I’ve spent a considerable amount of time working at home. Sometimes because I was officially telecommuting, at other times just because I love my work and burn midnight hours doing it. As a result, my home setup has become more complex over time. I have 10 gigabit ethernet throughout the house (some fiber, some Cat6A).  I have multiple 10 gigabit ethernet switches, all managed.  I have three rackmount computers in the basement that run 7×24.  I have ZFS pools on two of them, used for nightly backups of all networked machines, source code repository redundancy, Time Machine for my macOS machines, etc.  I run my own DHCP service, an internal DNS server, web servers, an internal mail server, my own automated security software to keep my pf tables current, Unifi, etc.  I have a handful of Raspberry Pis doing various things.  Then there’s all the other devices: desktop computers in my office, a networked laser printer, Roku, AppleTV, Android TV, Nest thermostat, Nest Protects, WiFi access points, laptops, tablet, phone, watch, Ooma, etc.  And the list grows over time.

Essentially, my home has become somewhat complex.  Without automation, I spend too much time checking the state of things or just being anxious about not having time to check everything at a reasonable frequency.  Are my ZFS pools all healthy?  Are all of my storage devices healthy?  Am I running out of storage space anywhere?  Is my DNS service working?  Is my DHCP server working?  My web server?  NFS working where I need it?  Is my Raspberry Pi garage door opener working?  Are my domains resolvable from the outside world?  Are the cloud services I use working?  Is my Internet connection down?  Is there a guest on my network?  A bandit on my network?  Is my printer alive?  Is my internal mail service working?  Are any of my UPS units running on battery?  Are there network services running that should not be?  What about the ones that should be, like sshd?
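Many of these checks are individually simple.  As one hypothetical example (a sketch, not mcrover’s actual code): ZFS makes the “only show me actionable problems” philosophy easy, because ‘zpool status -x’ prints a one-line all-clear when nothing is wrong.  A predicate over its output only fires when there’s something worth seeing:

```cpp
#include <string>

// Returns true if the output of `zpool status -x` indicates a problem.
// With the stock zpool CLI, `status -x` prints "all pools are healthy"
// (or "no pools available") when everything is fine, and prints pool
// details only for troubled pools.
bool ZpoolNeedsAttention(const std::string & statusX)
{
  return (statusX.find("all pools are healthy") == std::string::npos
          && statusX.find("no pools available") == std::string::npos);
}
```

The function name and structure here are illustrative; the point is that a well-behaved CLI with a terse all-clear makes the alerting decision a one-liner.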

I needed a monitoring system that worked like rover: only show me actionable issues.  So I wrote my own, and named it “mcrover”.  It’s more of a host and service monitoring system than a network monitoring system, but it’s distributed and secure (using ed25519 via libDwmAuth).  It’s modern C++, relatively easy to extend, and has some fun bits (ASCII art in the curses client when there are no alerts, for example).  Like the old Network Operations Center, I have a dedicated display in my office that displays only the mcrover Qt client, 24 hours a day.  Since most of the time there are no alerts, the Qt client toggles between a display of the next week’s forecast and a weather radar image.  If there are alerts, the alert display is shown instead, and will not go away until there are no alerts (or I click on the page switch in the UI).  The dedicated display is driven by a Raspberry Pi 4B running the Qt client from boot, using EGLFS (no X11).  The Pi 4 is powered via PoE.  It is also running the mcrover service, to monitor local services on the Pi as well as many network services.  In fact, the mcrover service runs on every 7×24 general purpose computing device here.  mcrover instances can exchange alerts, hence I only need to look at one instance to see what’s being reported by all instances.
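The display behavior amounts to a tiny state function.  A minimal sketch (the names and types here are hypothetical, not mcrover’s actual code): pending alerts always win, and with no alerts the client simply alternates between the two weather pages:

```cpp
#include <cstddef>

enum class Page { Forecast, Radar, Alerts };

// Pick the next page for the dedicated display.  Any pending alerts force
// the alert page; otherwise toggle between the forecast and radar pages.
Page NextPage(Page current, std::size_t numAlerts)
{
  if (numAlerts > 0) {
    return Page::Alerts;
  }
  return (current == Page::Forecast) ? Page::Radar : Page::Forecast;
}
```

Keeping this decision in one small pure function makes it trivial to test, and the rest of the client just renders whatever page it’s told.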

This has relieved me of a lot of sysadmin and network admin drudgery.  It wasn’t trivial to implement, mostly due to the variety (not the quantity) of things it’s monitoring.  But it has proven itself very worthwhile.  I’ve been running it for many months now, and I no longer get anxious about keeping up with things like daily/weekly/monthly mail from cron and manual checking.  All critical (and some non-critical) things are now checked every 60 seconds, and my attention is only stolen when mcrover finds an actionable issue.
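At its heart, this kind of monitor is just a list of named checks run on a timer, reporting only the failures.  A minimal sketch (the struct and function names are hypothetical, not mcrover’s):

```cpp
#include <functional>
#include <string>
#include <vector>

// One named health check: isOK() returns true when the thing is healthy.
struct Check {
  std::string            name;   // e.g. "dns", "zfs:tank", "sshd"
  std::function<bool()>  isOK;
};

// Run every check and return only the names needing attention.  An empty
// result means there's nothing to show, and the display can stay on the
// weather pages.
std::vector<std::string> FailedChecks(const std::vector<Check> & checks)
{
  std::vector<std::string>  failed;
  for (const auto & c : checks) {
    if (! c.isOK()) {
      failed.push_back(c.name);
    }
  }
  return failed;
}
```

Run something like this once a minute, and the hard part is only in writing each isOK() for the variety of services being watched, not in the loop itself.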

So… an ode to the philosophy of an old system.  Don’t make me plow through a bunch of data to find the things I need to address.  I’ll do that when there’s a problem, not when there isn’t a problem.  For 7×24 general purpose computing devices running Linux, macOS or FreeBSD, I install and run the mcrover service and connect it to the mesh.  And it requires very little oomph; it runs just fine on a Raspberry Pi 3 or 4.

So why the weather display?  It’s just useful to me, particularly in the mowing season when I need to plan ahead for yard work.  And I’ve grown tired of the weather websites.  Most are loaded with ads and clutter.  All of them are tracking us.  Why not just pull the data from tax-funded sources in JSON form and do it myself?  I’ve got a dedicated display with no alerts to show most of the time, so it made sense to put the weather there.

The Qt client using X11, showing the weather forecast.


The Qt client using EGLFS, showing the weather radar.

The curses client, showing ASCII art since there are no alerts to be shown.
