Jan 14, 2023 Unacknowledged SYNs by country
It’s sometimes interesting to look at how different a single day might be versus the longer-term trends. And to see what happens when you make changes to your pf rules.
I added all RU networks I was blocking from ssh to the list blocked for everything. I also fired up a torrent client on my desktop.
RU moving up the list versus the previous 5 days is no surprise; a good portion of traffic I receive from RU is port scanning. But I’ll have to look to see what caused the CZ numbers to climb.
I think the only interesting thing about the torrent client is that I should do something to track UDP in a similar manner as I track TCP. If I have a torrent client running, I will wind up with a lot of UDP traffic (much of it directed to port 6881 on this day), and will respond with ICMP port unreachable. To some extent this is a burden on my outbound bandwidth, but on the other hand it will allow me to add an easy new tracker to mcflowd: “to whom am I sending ICMP port unreachables?”. Of course, UDP is trivially spoofed, so I don’t truly know the source of the UDP.
mcblockd 5 years on
However, it’s interesting to note how things have changed. Looking at just the addresses I block from accessing port 22…
While China remains at the top of my list of total number of blocked IP addresses, the US is now in 2nd place. In 2017, the US wasn’t even in the top 20. What has changed?
Most of the change here is driven by my automation seeing more and more attacks originating from cloud hosted services. Amazon EC2, Google, Microsoft, DigitalOcean, Linode, Oracle, et. al. While my automation policy won’t go wider than a /24 for a probe from a known US entity, over time I see probes from entire swaths of contiguous /24 networks from the same address space allocation, which will be coalesced to reduce firewall table size. Two adjacent /24 networks become a single /23. Two adjacent /23 networks become a single /22. All the way up to a possible /8 (the automation stops there).
So today, the last of 2022, I see some very large blocks owned by our cloud providers being blocked by my automation due to receiving ssh probes from large contiguous swaths of their address space.
I am very appreciative of the good things from big tech. But I’m starting to see the current cloud computing companies as the arms dealers of cyberspace.
My top 2 countries:
CN 131,560,960 addresses /9 networks: 1 (8,388,608 addresses) /10 networks: 10 (41,943,040 addresses) /11 networks: 12 (25,165,824 addresses) /12 networks: 18 (18,874,368 addresses) /13 networks: 29 (15,204,352 addresses) /14 networks: 48 (12,582,912 addresses) /15 networks: 48 (6,291,456 addresses) /16 networks: 37 (2,424,832 addresses) /17 networks: 14 (458,752 addresses) /18 networks: 7 (114,688 addresses) /19 networks: 10 (81,920 addresses) /20 networks: 5 (20,480 addresses) /21 networks: 3 (6,144 addresses) /22 networks: 3 (3,072 addresses) /23 networks: 1 (512 addresses) US 92,199,996 addresses /9 networks: 3 (25,165,824 addresses) /10 networks: 5 (20,971,520 addresses) /11 networks: 10 (20,971,520 addresses) /12 networks: 9 (9,437,184 addresses) /13 networks: 16 (8,388,608 addresses) /14 networks: 10 (2,621,440 addresses) /15 networks: 8 (1,048,576 addresses) /16 networks: 42 (2,752,512 addresses) /17 networks: 10 (327,680 addresses) /18 networks: 11 (180,224 addresses) /19 networks: 8 (65,536 addresses) /20 networks: 10 (40,960 addresses) /21 networks: 2 (4,096 addresses) /22 networks: 9 (9,216 addresses) /23 networks: 9 (4,608 addresses) /24 networks: 818 (209,408 addresses) /25 networks: 4 (512 addresses) /26 networks: 5 (320 addresses) /27 networks: 5 (160 addresses) /28 networks: 2 (32 addresses) /29 networks: 7 (56 addresses) /30 networks: 1 (4 addresses)
You can clearly see the effect of my automation policy for the US. Lots of /24 networks get added, most of them with a 30 to 35 day expiration. Note that expirations increase for repeat offenses. But over time, as contiguous /24 networks are added due to sending probes at my firewall, aggregation will lead to wider net masks (shorter prefix lengths). Since I’m sorting countries based on the total number of addresses I’m blocking, obviously shorter prefixes have a much more profound effect than long prefixes.
mcrover now monitors Plex server
mcrover is now monitoring my Plex server.
This was more work than expected. A big part of the issue here is that the REST API uses XML. I’ve always disliked using XML. It’s a nice technology, but when it comes to open source libraries for C++, it’s always been lacking.
Long ago, I used Xerces. Not because it’s the best, but because it was the only liberally licensed library with support for DTD and Schema validation. That is still the case today. Unfortunately, it’s very cumbersome to use and is written in old C++ (as in C++ 1998). There’s a lot of boilerplate, a considerable amount of global state (very bad for multithreaded applications), and a lot of the memory management is left to the application. I can’t imagine anyone today is using it in production.
But I plunged ahead anyway. Sadly, it was a mistake. Somewhere it was stomping on the stack, often in ways that caused problems deep inside openssl (which I don’t use directly, instead using boost::beast and boost::certify). The stack corruption caused problems trying to debug, and I didn’t have the time to figure it out. And of course I’m always suspicious of openssl, given the fact that it’s written in C and many of us lived through Heartbleed and many other critical openssl vulnerabilities. To be honest, we’ve been in desperate need of a really good, modern (and in at least C++11) C++ implementation of TLS for more than a decade. Of course I could rant about our whole TLS mess for hours, but I’ll spare you.
Time being of the essence, I switched to pugixml. I don’t need DTD/Schema validation. Problem gone, a lot less code, and a much more modern API (much harder to shoot yourself in the foot).
Inside mcrover, I’m using XPath on the XML returned by Plex. The internal code is generic; it would not be much work to support other Web applications with XML interfaces. The XPath I look for in the XML is a configuration item, and really the only reason I have something specific for Plex is their use of an API token. But the configuration is generic enough that supporting other XML Web applications shouldn’t be difficult.
At any rate, what I have now works. So now I don’t get blackholed fixing a Plex issue when I haven’t used it for months and something has gone wrong. I know ahead of time.
UPS fiasco and mcrover to the rescue
I installed a new Eaton 5PX1500RT in my basement rack this week. I’d call it “planned, sort of…”. My last Powerware 5115 1U UPS went into an odd state which precipitated the new purchase. However, it was on my todo list to make this change.
I already own an Eaton 5PX1500RT, which I bought in 2019. I’ve been very happy with it. It’s in the basement rack, servicing a server, my gateway, ethernet switches and broadband modem. As is my desire, it is under 35% load.
The Powerware 5115 was servicing my storage server, and also under 35% load. This server has dual redundant 900W power supplies.
Installation of the new UPS… no big deal. install the ears, install the rack rails, rack the UPS.
Shut down the devices plugged into the old UPS, plug them in to the new UPS. Boot up, check each device.
Install the USB cable from the UPS to the computer that will monitor the state of the UPS. Install Network UPS Tools (nut) on that computer. Configure it, start it, check it.
This week, at this step things got… interesting.
I was monitoring the old Powerware 5115 from ‘ria’. ‘ria’ is a 1U SuperMicro server with a single Xeon E3-1270 V2. It has four 1G ethernet ports and a Mellanox 10G SFP+ card. Two USB ports. And a serial port which has been connected to the Powerware 5115 for… I don’t know, 8 years?
I can monitor the Eaton 5PX1500RT via a serial connection. However, USB is more modern, right? And the cables are less unwieldy (more wieldy). So I used the USB cable.
Trouble started here. The usbhid-ups driver did not reliably connect to the UPS. When it did, it took a long time (in excess of 5 seconds, an eternity in computing time). ‘ria’ is running FreeBSD 12.3-STABLE on bare metal.
I initially decided that I’d deal with it this weekend. Either go back to using a serial connection or try using a host other than ‘ria’. However…
I soon noticed long periods where mcrover was displaying alerts for many services on many hosts. Including alerts for local services, whose test traffic does not traverse the machine I touched (‘ria’). And big delays when using my web browser. Hmm…
Poking around, I seemed to only be able to reliably reproduce a network problem by pinging certain hosts with ICMPv4 from ria and observing periods where the round trip time would go from .05 milliseconds to 15 or 20 seconds. No packets lost, just periods with huge delays. These were all hosts on the same 10G ethernet network. ICMPv6 to the same hosts: no issues. Hmm…
I was eventually able to correlate (in my head) what I was seeing in the many mcrover alerts. On the surface, many didn’t involve ‘ria’. But under the hood they DO involve ‘ria’ simply because ‘ria’ is my primary name server. So, for example, tests that probe via both IPv6 and IPv4 might get the AAAA record but not the A record for the destination, or vice versa, or neither, or both. ‘ria’ is also the default route for these hosts. I honed in on the 10G ethernet interface on ‘ria’.
What did IPV4 versus IPv6 have to do with the problem? I don’t know without digging through kernel source. What was happening: essentially a network ‘pause’. Packets destined for ‘ria’ were not dropped, but queued for later delivery. As many as 20 seconds later! The solution? Unplug the USB cable for the UPS and kill usbhid-ups. In the FreeBSD kernel, is USB hoarding a lock shared with part of the network stack?
usbhid-ups works from another Supermicro server running the same version of FreeBSD. Different hardware (dual Xeon L5640). Same model of UPS with the same firmware.
This leads me to believe this isn’t really a lock issue. It’s more likely an interrupt routing issue. And I do remember that I had to add hw.acpi.sci.polarity="low" to /boot/loader.conf on ‘ria’ a while ago to avoid acpi0 interrupt storms (commented out recently with no observed consequence). What I don’t remember: what were all the issues I found that prompted me to add that line way back when?
Anyway… today’s lesson. Assume the last thing you changed has high probability of cause, even if there seems to be no sensible correlation. My experience this week: “Unplug the USB connection to the UPS and the 10G ethernet starts working again. Wait, what?!”.And today’s thanks goes to mcrover. I might not have figured this out for considerably longer if I did not have alert information in my view. Being a comes-and-goes problem that only seemed to be reproducible between particular hosts using particular protocols might have made this a much more painful problem to troubleshoot without reliable status information on a dedicated display. Yes, it took some thinking and observing, and then some manual investigation and backtracking. But the whole time, I had a status display showing me what was observable. Nice!
I spent some weekend time updating mcrover this month. I had two motivations:
- I wanted deeper tests for two web applications: WordPress and piwigo. Plex will be on the list soon, I’m just not thrilled that their API is XML-based (Xerces is a significant dependency, and other XML libraries are not as robust or complete).
- I wanted a test for servers using libDwmCredence. Today that’s mcroverd and mcweatherd. dwmrdapd and mcblockd will soon be transitioned from libDwmAuth to libDwmCredence.
I created a web application alert, which allows me to create code to test web applications fairly easily and generically. Obviously, the test for a given application is application-specific. Applications with REST interfaces that emit JSON are pretty easy to add.
The alerts and code to check servers using libDwmCredence allow me to attempt connection and authentication to any service using Dwm::Credence::Peer. These are the steps that establish a connection; they are the prologue to application messaging. This allows me to check the status of a service using Dwm::Credence::Peer in lieu of a deeper (application-specific) check, so I can monitor new services very early in the deployment cycle.
libDwmCredence integrated into mcrover
I’ve completed my first pass at libDwmCredence (a replacement for libDwmAuth). As previously mentioned, I did this work in order to drop my use of Crypto++ in some of my infrastructure. Namely mcrover, my personal host and network monitoring software.
I’ve integrated it into mcrover, and deployed the new mcrover on 6 of my internal hosts. So far, so good.
Replacing libDwmAuth due to Crypto++ problems
This has been a while in the making, but…
I am in the process of replacing libDwmAuth with a new library I’m calling libDwmCredence. Why?
Mainly due to poor maintenance of Crypto++, which I used in libDwmAuth. Quite some time ago I tried to explain two problems with Crypto++ to the maintainers that led to memory leaks and corruption on some platforms. I gave up due to the current maintainers not resolving the problem, and growing tired of maintaining my own fork in a private repository. The root of the issue was an incorrect optimization and attempt at forced alignment. In essence, an attempt at optimization that critically destroyed the integrity of the library on some platforms (notably an important IoT platform I use, Raspbian). The last place you want code to stomp on the stack or heap is in your crypto library!
Way back when, I used Crypto++ because it was FIPS validated. That hasn’t been true for a while, and it doesn’t appear that it will be true in the future. Then when elliptic curve solutions came along, Crypto++ was very slow to adopt due to design constraints. This was troublesome for me since some IoT platforms don’t have hardware acceleration for AES, and hence I wanted to be able to use XChaCha20poly1305 and the like to improve performance, as well as Ed25519 and X25519 keys. There was a period where I couldn’t do this at all with Crypto++.
Finally, for my own work, I don’t really need much of what Crypto++ provides. And parts of the interface use C++ in atypical ways, making code hard to read.
So, I’ve switched to libsodium and wrapped it with C++. While there are many things I don’t like about libsodium, it provides what I need and I am able to make it safer by wrapping it. Furthermore, I am using boost::asio and a semi-elegant wrapper that allows me to send and receive just about anything on the wire easily. It does what I need it to do, and no more. Of course the boost::asio interfaces are considerably different than socket()/bind()/select()/accept()/connect()/etc. but I mostly only have to think about that on the server side.
New IPv4 and IPv6 container templates
I’ve spent a little bit of time working on some new slimmed-down C++ containers keyed by IPv4 addresses, IPv6 address, IPv4 prefixes and IPv6 prefixes. The containers that are keyed by prefixes allow longest-match searching by address, as would be expected.
My main objective here was to minimize the amount of code I need to maintain, by leveraging the C++ standard library and existing classes and class templates in libDwm. A secondary objective was to make sure the containers are fast enough for my needs. A third objective was to make the interfaces thread safe.
I think I did OK on the minimal code front. For example, DwmIpv4PrefixMap.hh is only 102 lines of code (I haven’t added I/O functionality yet). DwmIpv6PrefixMap.hh is 185 lines of code, including I/O functionality. Obviously they leverage existing code (Ipv4Prefix, Ipv6Prefix, et. al.).
The interfaces are thread safe. I’m in the process of switching them from mutex and lock_guard to shared_mutex and shared_lock/unique_lock.
Performance-wise, it looks pretty good. I’m using prefix dumps from routeviews to have realistic data for my unit tests. On my Threadripper 3960X development machine running Ubuntu 20.04:
% ./TestIpv4AddrMap -p 831,915 addresses, 7,380,956 inserts/sec 831,915 addresses, 16,641,961 lookups/sec 831,915 addresses, 9,032,736 removals/sec 831,915 addresses, 8,249,196 inserts/sec (bulk lock) 831,915 addresses, 54,097,737 lookups/sec (bulk lock) 831,915 addresses, 9,489,272 removals/sec (bulk lock) 831,918/831,918 passed % ./TestIpv4PrefixMap -p 901,114 prefixes, 6,080,842 prefix inserts/sec 901,114 prefixes, 14,639,881 prefix lookups/sec 901,114 addresses, 5,105,259 longest match lookups/sec 901,114 prefixes, 6,378,710 prefix inserts/sec (bulk lock) 901,114 prefixes, 25,958,230 prefix lookups/sec (bulk lock) 901,114 addresses, 5,368,727 longest match lookups/sec (bulk lock) 1,802,236/1,802,236 passed % ./TestIpv6AddrMap -p 104,970 addresses, 11,360,389 inserts/sec 104,970 addresses, 15,206,431 lookups/sec 104,970 addresses, 9,159,685 removals/sec 104,970 addresses, 12,854,518 inserts/sec (bulk lock) 104,970 addresses, 20,434,105 lookups/sec (bulk lock) 104,970 addresses, 10,302,286 removals/sec (bulk lock) 104,976/104,976 passed % ./TestIpv6PrefixMap -p 110,040 prefixes, 11,181,790 prefix lookups/sec 110,040 prefixes, 1,422,403 longest match lookups/sec 440,168/440,168 passed
What is ‘bulk lock’? The interfaces allow one to get a shared or unique lock and then perform multiple operations while holding the lock. As seen above, this doesn’t make a huge difference for insertion or removal of entries, where the time is dominated by operations other than locking and unlocking. It does make a significant difference for exact-match searches. One must be careful using the bulk interfaces to avoid deadlock, of course. But they are useful in some scenarios.
The best part, IMHO, is that these are fairly thin wrappers around
std::unordered_map. Meaning I don’t have my own hash table or trie code to maintain and I can count on
std::unordered_map behaving in a well-defined manner due to it being part of the C++ standard library. It is not the fastest means of providing longest-match lookups. However, from my perspective as maintainer… it’s a small bit of code, and fast enough for my needs.
Threadripper 3960X: the birth of ‘thrip’
I recently assembled a new workstation for home. My primary need was a machine for software development, including deep learning. This machine is named “thrip”.
Having looked hard at my options, I decided on AMD Threadripper 3960X as my CPU. A primary driver was of course bang for the buck. I wanted PCIe 4.0, at least 18 cores, at least 4-channel RAM, the ability to utilize 256G or more of RAM, and to stay in budget.
By CPU core count alone, the 3960X is over what I needed. On the flip side, it’s constrained to 256G of RAM, and it’s also more difficult to keep cool than most CPUs (280W TDP). But on price-per-core, and overall performance per dollar, it was the clear winner for my needs.
Motherboard-wise, I wanted 10G ethernet, some USB-C, a reasonable number of USB-A ports, room for 2 large GPUs, robust VRM, and space for at least three NVMe M.2 drives. Thunderbolt 3 would have been nice, but none of the handful of TRX40 boards seem to officially support it (I don’t know if this is an Intel licensing issue or something else). The Gigabyte board has the header and Wendell@Level1Techs seems to have gotten it working, but I didn’t like other aspects of the Gigabyte TRX40 AORUS EXTREME board (the XL-ATX form factor, for example, is still limiting in terms of case options).
I prefer to build my own workstations. It’s not due to being particularly good at it, or winding up with something better than I could get pre-built. It’s that I enjoy the creative process of selecting parts and putting it all together.
I had not assembled a workstation in quite some time. My old i7-2700K machine has met my needs for most of the last 8 years. And due to a global pandemic, it wasn’t a great time to build a new computer. The supply chain has been troublesome for over 6 months now, especially for some specific parts (1000W and above 80+ titanium PSUs, for example). We’ve also had a huge availability problem for the current GPUs from NVIDIA (RTX 3000 series) and AMD (Radeon 6000 series). And I wasn’t thrilled about doing a custom water-cooling loop again, but I couldn’t find a worthy quiet cooling solution for Threadripper and 2080ti without going custom loop. Given the constraints, I wound up with these parts as the guts:
- Asus TRX40 ROG Zenith II Extreme Alpha motherboard
- AMD Threadripper 3960X CPU (24 cores)
- 256 gigabytes G.Skill Trident Z Neo Series RGB DDR4-3200 CL16 RAM (8 x 32G)
- EVGA RTX 2080 Ti FTW3 Ultra GPU with EK Quantum Vector FTW3 waterblock
- Sabrent 1TB Rocket NVMe 4.0 Gen4 PCIe M.2 Internal SSD
- Seasonic PRIME TX-850, 850W 80+ Titanium power supply
- Watercool HEATKILLER IV PRO for Threadripper, pure copper CPU waterblock
It’s all in a Lian Li PC-O11D XL case. I have three 360mm radiators, ten Noctua 120mm PWM fans, an EK Quantum Kinetic TBE 200 D5 PWM pump, PETG tubing and a whole bunch of Bitspower fittings.
My impressions thus far: it’s fantastic for Linux software development. It’s so nice to be able to run ‘
make -j40‘ on large C++ projects and have them complete in a timely manner. And thus far, it runs cool and very quiet.