Below is a chart showing the number of IPv4 addresses blocked from accessing ssh on my network, by country, for the 25 countries with the most address space blocked. The rankings have changed over the years: the U.S. is now #2, whereas in 2017 it wasn’t even in the top 20. What hasn’t changed: China remains dominant.
It’s worth noting that my automation blocks address space based on access attempts. A prefix doesn’t wind up in the list unless the automation has seen failed access attempts from that prefix. Of course, policy, address allocation and aggregation determine the width of each prefix. No one gets blocked forever, but repeat offenders are blocked longer. Bear in mind that ssh on my network is intended for a single user: me. If I’m not in China at the moment, ssh need not be accessible from China.
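To illustrate "no one gets blocked forever, but repeat offenders are blocked longer", here is a minimal Python sketch of an escalating block duration. It is not the automation described above: the base duration, the doubling rule, the cap and the name `block_duration` are all assumptions made up for the example.

```python
from datetime import timedelta

# Assumed values for illustration only; the real policy is not described in the post.
BASE_BLOCK = timedelta(days=1)    # hypothetical duration for a first offense
MAX_BLOCK = timedelta(days=365)   # hypothetical cap, so no block lasts forever
MAX_DOUBLINGS = 16                # keeps the multiplication well within timedelta's range


def block_duration(prior_offenses: int) -> timedelta:
    """Double the block for each prior offense, never exceeding MAX_BLOCK."""
    doublings = min(max(prior_offenses, 0), MAX_DOUBLINGS)
    return min(BASE_BLOCK * (2 ** doublings), MAX_BLOCK)
```

Under these assumed numbers a first offense is blocked for a day, a fourth for eight days, and a persistent offender tops out at a year before the block expires.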
This next chart comes from mcflowd data. One of the things it tracks is unacknowledged SYN packets that hit my gateway, per source IPv4 address. I can process this data with IP to country information to get an idea of which countries are the predominant prowlers. It shows how often different countries are either probing for a service I don’t run or trying to hit a service that I’ve blocked some of their IPv4 address space from accessing. A metaphor: the number of times they knocked on my door and I purposely didn’t answer (I did not reply with a SYN-ACK). I call this the chump factor chart; many of the address spaces that contribute to this chart have been the same for 5 years.
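The aggregation step can be sketched roughly as follows, in Python. This is not how mcflowd stores or exports its data; the input is assumed to be a plain mapping of source address to unacknowledged SYN count, and the prefix table, the placeholder country codes "AA" and "BB", and the function names are all hypothetical. The prefixes are the reserved documentation ranges.

```python
import ipaddress
from collections import Counter

# Hypothetical prefix-to-country table built from documentation-only prefixes.
# A real table would come from an IP to country service.
PREFIX_COUNTRY = [
    (ipaddress.ip_network("192.0.2.0/24"), "AA"),
    (ipaddress.ip_network("198.51.100.0/24"), "BB"),
]


def country_of(addr: str) -> str:
    """Return the country of the longest matching prefix, or '??' if none matches."""
    ip = ipaddress.ip_address(addr)
    best = None
    for net, country in PREFIX_COUNTRY:
        if ip in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, country)
    return best[1] if best else "??"


def syn_counts_by_country(per_source: dict) -> Counter:
    """Sum per-source unacknowledged SYN counts into per-country totals."""
    totals = Counter()
    for addr, count in per_source.items():
        totals[country_of(addr)] += count
    return totals
```

Sorting the result with `totals.most_common(25)` would give the kind of ranking a per-country chart needs. A linear scan over the table is fine for a sketch; a real table with many prefixes would want a trie or similar structure.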
Note that this second chart is from a period of less than 5 days. Anyone with a public Internet connection should not kid themselves into thinking they’re not being probed, constantly.
This is how you splinter the Internet: make us fed up enough with your traffic, which is indistinguishable from traffic with criminal intent, and we block you. Works great for authoritarian governments that don’t want their citizenry to communicate with the free world, and those with other motives too. 🙁
The good news for me is that I have automation that’s pretty flexible in configuration and input sources (the log parser, for example, can be used on a number of different log formats as long as they’re text and contain offending IP addresses). It saves the data I need, and isn’t a significant resource consumer on my gateway. It’s very secure, using my Credence library for ECDH, authentication and authorization (which under the hood is using libsodium at the moment). I have a reasonably robust IP to country service, which updates itself via RDAP. Sadly the registries are a disaster saga, so occasionally I wind up reloading with GeoLite or similar data. But since I only use the country data to determine how long and how wide a prefix will be blocked, and not whether a prefix will be blocked, it’s mostly inconsequential. It’s just useful to be able to see where the nefarious traffic is coming from, through a geopolitical lens.
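A format-agnostic log parser of the kind mentioned above could look something like this Python sketch. It is not the actual parser: the `marker` parameter, the regular expression and the function name are assumptions, and it only handles IPv4. The idea is simply that any text log works as long as offending lines can be recognized and contain an address.

```python
import ipaddress
import re
from collections import Counter

# Dotted-quad candidates that are not part of a longer run of digits and dots.
IPV4_RE = re.compile(r"(?<![\d.])(?:\d{1,3}\.){3}\d{1,3}(?![\d.])")


def offending_addresses(lines, marker="Failed"):
    """Count IPv4 addresses on lines containing `marker` (a hypothetical failure tag)."""
    hits = Counter()
    for line in lines:
        if marker not in line:
            continue
        for candidate in IPV4_RE.findall(line):
            try:
                hits[str(ipaddress.IPv4Address(candidate))] += 1
            except ipaddress.AddressValueError:
                pass  # looked like an address but is not valid, e.g. 999.1.1.1
    return hits
```

Swapping the marker (or replacing it with a per-format predicate) is what would let the same loop run over sshd, mail or web server logs.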
Hey U.S. (my own country): we shouldn’t throw stones while we live in a glass house. And if there’s anything we should do about big tech, I’d say regulating the weaponization of massive cloud computing resources would be a good start. Where do a lot of the U.S. probes come from? Amazon EC2, Google, Microsoft, DigitalOcean, Linode, Oracle. The same holds true for probes from Canada and other parts of the western world. Some of these are legitimate research probes. However, to a large extent they’re indistinguishable from nefarious activity. And besides, I pay for my bandwidth at the end of a thin straw we call broadband in the U.S. I don’t want this traffic, yet I pay for it.