May 9, 2012

Measuring TCP round-trip times, part 5: first round of clean-ups

I added the ability to toggle a chart’s y-axis scale between linear and logarithmic. This has been deployed on the Site Traffic page.

Code cleanup: when I added the round-trip time plot, I wound up writing a lot of code that largely duplicates what I already had for the traffic plot. Obviously there are differences in the data and the presentation, but much of it is similar or the same. Tonight I started looking at moving common functionality into base classes, functions and possibly a template or two.

I started with the obvious: there’s little sense in having a lot of duplicate code for the basics of the charts. While both instances were of type Wt::Chart::WCartesianChart, I had separate code to set things like the chart palette, handle a click on the chart widget, etc. I’ve moved the common functionality into my own chart class. It’s likely I’ll later use this class on my Site Health page.

May 6, 2012

Measuring TCP round-trip times, part 4: plotting the data

by dwm — Categories: FreeBSD, Software Development, Web Development

I added a new plot to the Site Traffic page. This is just another Wt widget in my existing Wt application that displays the traffic chart. Since Wt does not have a box plot, I’m displaying the data as a simple point/line chart. There’s a data series for each of the minimum, 25th percentile, median, 75th percentile and 95th percentile. These are global round-trip time measurements across all hosts that accessed my web site. In a very rough sense, they represent the network distance of the clients of my web site. It’s worth noting that the minimum line typically represents my own traffic, since my workstation shares an ethernet connection with my web server.

Clicking on the chart will display the values (in a table below the chart) for the time that was clicked. I added the same function to the traffic chart while I was in the code. I also started playing with mouse drag tracking so I can later add zooming.

May 5, 2012

Measuring TCP round-trip times, part 3: data storage classes

I’ve completed the design and initial implementation of some C++ classes for storage of TCP round trip data. These classes are simple, especially since I’m leveraging functionality from the Dwm::IO namespace in my libDwm library.

The Dwm::WWW::TCPRoundTrips class is used to encapsulate a std::vector of round trip times. Each round trip time is represented by a Dwm::TimeValue (class from libDwm). I don’t really care about the order of the entries in the vector, since a higher-level container holds the time interval in which the measurements were taken. That means I can use mutating algorithms (ones that reorder the elements) on the vector when desired.

The Dwm::WWW::TCPHostRoundTrips class contains a std::map of the aforementioned Dwm::WWW::TCPRoundTrips objects, keyed by the remote host IP address (represented by Dwm::Ipv4Address from libDwm). An instance of this class is used to store all round trip data during a given interval. This class also contains a Dwm::TimeInterval (from my libDwm library) representing the measurement interval in which the round trip times were collected.

Both of these classes have OrderStats() members which will fetch order statistics from the encapsulated data. I’m hoping to develop a box plot class for Wt in order to display the order statistics.
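As a rough illustration of the idea (not the actual Dwm::WWW code), here is a minimal sketch of the two containers. Everything in it is an assumption made for the example: std::chrono::microseconds stands in for Dwm::TimeValue, a std::string stands in for Dwm::Ipv4Address, the class names, the OrderStatsResult struct and the Percentile() helper are hypothetical, and the measurement interval member is left out. It shows why not caring about element order is convenient: std::nth_element can reorder the vector in place to find each percentile.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Stand-in for Dwm::TimeValue (assumption: microsecond resolution).
using Rtt = std::chrono::microseconds;

// Hypothetical result type; the real OrderStats() signature is not shown.
struct OrderStatsResult {
  Rtt min{0}, p25{0}, median{0}, p75{0}, p95{0};
};

// Returns the sample at index round(pct/100 * (n-1)) of the sorted order.
// Reorders 'v' (std::nth_element is a mutating algorithm).
// Precondition: 'v' is not empty.
inline Rtt Percentile(std::vector<Rtt> & v, double pct)
{
  std::size_t idx =
    static_cast<std::size_t>(pct / 100.0 * (v.size() - 1) + 0.5);
  std::nth_element(v.begin(), v.begin() + static_cast<std::ptrdiff_t>(idx),
                   v.end());
  return v[idx];
}

// Sketch of a container of round trip times for one host.
class RoundTrips {
public:
  void Add(Rtt rtt) { rtts_.push_back(rtt); }
  const std::vector<Rtt> & Values() const { return rtts_; }

  // Not const: the stored samples may be reordered.
  bool OrderStats(OrderStatsResult & out)
  {
    if (rtts_.empty()) { return false; }
    out.min    = *std::min_element(rtts_.begin(), rtts_.end());
    out.p25    = Percentile(rtts_, 25);
    out.median = Percentile(rtts_, 50);
    out.p75    = Percentile(rtts_, 75);
    out.p95    = Percentile(rtts_, 95);
    return true;
  }

private:
  std::vector<Rtt> rtts_;
};

// Sketch of the per-interval container, keyed by remote host address.
class HostRoundTrips {
public:
  void Add(const std::string & host, Rtt rtt) { hosts_[host].Add(rtt); }

  // Order statistics across all hosts, computed on a merged copy.
  bool OrderStats(OrderStatsResult & out) const
  {
    RoundTrips all;
    for (const auto & h : hosts_) {
      for (Rtt r : h.second.Values()) { all.Add(r); }
    }
    return all.OrderStats(out);
  }

private:
  std::map<std::string, RoundTrips> hosts_;
};
```

The five values returned are the same ones a box plot needs, so a structure like this could feed a box plot widget directly.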

May 2, 2012

Measuring TCP round-trip times, part 2: the throwaway

I previously posted about measuring TCP round-trip times from my web server to its clients. Last night I quickly added code to my existing site traffic monitor to perform this task, as the experimental throwaway to validate my design idea. I have not yet designed the data store, only the collection of round-trip times. To see what it’s doing, I syslog the rtt measurements. It appears to be working fine. Here’s some data from Google’s crawlers prowling my site:


May 2 21:37:23 www sitetrafficd[2318]: [I] 66.249.66.43:45825 rtt 123.2 ms
May 2 21:38:49 www sitetrafficd[2318]: [I] 66.249.66.57:38926 rtt 123.6 ms
May 2 21:38:49 www sitetrafficd[2318]: [I] 66.249.66.43:38085 rtt 123.5 ms
May 2 21:40:16 www sitetrafficd[2318]: [I] 66.249.66.143:39725 rtt 137.8 ms
May 2 21:40:16 www sitetrafficd[2318]: [I] 66.249.66.143:37657 rtt 126.2 ms
May 2 21:41:25 www sitetrafficd[2318]: [I] 66.249.66.204:47961 rtt 160.9 ms
May 2 21:41:25 www sitetrafficd[2318]: [I] 66.249.66.143:45623 rtt 121.1 ms
May 2 21:41:47 www sitetrafficd[2318]: [I] 66.249.66.60:36603 rtt 142 ms
May 2 21:42:15 www sitetrafficd[2318]: [I] 66.249.66.204:48875 rtt 123.6 ms
May 2 21:43:15 www sitetrafficd[2318]: [I] 66.249.66.43:56275 rtt 125.8 ms
May 2 21:44:42 www sitetrafficd[2318]: [I] 66.249.66.57:49966 rtt 124.1 ms
May 2 21:44:42 www sitetrafficd[2318]: [I] 66.249.66.204:53209 rtt 122.9 ms
May 2 21:45:59 www sitetrafficd[2318]: [I] 66.249.66.238:46595 rtt 123.8 ms
May 2 21:47:27 www sitetrafficd[2318]: [I] 66.249.66.60:60241 rtt 142.2 ms

I believe I can call the raw measurement design valid. It’s a bonus that it was not difficult to add to my existing data collection application. It’s another bonus that the data collection remains fairly lightweight in user space. My collection process has a resident set size just over 4M, and that’s on a 64-bit machine. My round-trip time measurement resolution is microseconds, and I’m using the timestamps from pcap to reduce scheduler-induced variance. Since I’m not using any special kernel facilities directly, this code should port fairly easily to OS X and Linux.
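For illustration, pcap delivers each packet with a header whose ts field is a struct timeval, so a round-trip time in microseconds is just the difference of two such timestamps. The helper below is a minimal sketch of that arithmetic; the function name ElapsedMicroseconds is hypothetical and this is not the code from sitetrafficd.

```cpp
#include <cassert>
#include <cstdint>
#include <sys/time.h>   // struct timeval

// Microseconds elapsed from 'earlier' to 'later'.  Both arguments are
// assumed to be capture timestamps (e.g. copies of pcap_pkthdr::ts).
inline std::int64_t ElapsedMicroseconds(const timeval & earlier,
                                        const timeval & later)
{
  // Widen before subtracting so the result is correct even when
  // later.tv_usec is smaller than earlier.tv_usec.
  return (static_cast<std::int64_t>(later.tv_sec) - earlier.tv_sec) * 1000000
       + (static_cast<std::int64_t>(later.tv_usec) - earlier.tv_usec);
}
```

Because both timestamps are taken at capture time rather than when the process gets scheduled, the difference is not inflated by time the collector spends waiting to run.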

May 3, 2012
I added the ability to track SYN to SYN ACK round-trip time, so I can run data collection on my desktop or gateway and characterize connections where I’m acting as the TCP client.

April 23, 2012

Measuring TCP round-trip times, part 1: random thought

by dwm — Categories: FreeBSD, Software Development

I would like to start measuring TCP round-trip times from my web server. This could potentially be done on either my web server or my firewall. But given that I’m already sniffing related packets on my web server for other purposes, it makes sense to do the work there, possibly in the same process.

The idea is simple, and surely unoriginal: measure the time between my server’s SYN ACK and the client’s ACK of my SYN ACK (the last 2/3 of a TCP handshake). Record the wall time, the client IP address, and the time between the transmission of my SYN ACK and the reception of the client’s ACK of my SYN ACK.
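The bookkeeping for this can be sketched in a few lines. The class below is only an illustration under stated assumptions: the names HandshakeTimer, SynAckSent and AckReceived are hypothetical, clients are identified by IPv4 address and port in host byte order, and timestamps are microseconds since some epoch supplied by the packet capture. A fuller version would probably also check the ACK number against the SYN ACK's sequence number and expire entries for handshakes that never complete.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>

using Micros = std::chrono::microseconds;

// (client IPv4 address, client TCP port), both in host byte order.
using ClientKey = std::pair<std::uint32_t, std::uint16_t>;

class HandshakeTimer {
public:
  // Call when the server's SYN ACK to 'client' is captured.
  // A retransmitted SYN ACK overwrites the earlier timestamp.
  void SynAckSent(const ClientKey & client, Micros ts)
  {
    pending_[client] = ts;
  }

  // Call when an ACK from 'client' is captured.  If a SYN ACK is pending
  // for that client, stores the round-trip time in 'rtt', forgets the
  // pending entry and returns true.  Otherwise returns false.
  bool AckReceived(const ClientKey & client, Micros ts, Micros & rtt)
  {
    auto it = pending_.find(client);
    if (it == pending_.end()) { return false; }
    rtt = ts - it->second;
    pending_.erase(it);
    return true;
  }

  std::size_t Pending() const { return pending_.size(); }

private:
  std::map<ClientKey, Micros> pending_;
};
```

The caller would record the wall time, the client address and the returned round-trip time whenever AckReceived() returns true.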

In the not too distant future, I will upgrade my desktop machine to FreeBSD 9.0-STABLE. At that point I’ll start writing code that uses the new h_ertt(4) kernel module.

Much of what I want is actually client-anonymous: an idea of the distribution of network distance of the visitors of my web site. I will want a facility to deal with crawlers, since they’re of less interest to me than human eyeballs and are likely to skew some statistics.

April 23, 2012

UL1998 and the software development process

by dwm — Categories: embedded, Software Development

At the moment this is just a placeholder. Later I will post here about some of my experiences with UL1998 and the ETL certification process. For some reason there’s a dearth of useful information on the world wide web. I’ve not had luck finding information that is sufficient for someone new to the process to put rubber to the road, or even know where to start. The UL1998 document of course may be purchased, but it is far from sufficient for a beginner or even an experienced developer with no safety certification experience. It is a sparse document with no concrete examples of sufficient evidence for any requirement. It is, after all, a certification requirements document (and somewhat dated).

I will not be quoting from the UL1998 document. In fact I will not look at it while writing this post, but instead give an overview of what to expect to need for certification. I may also make some references to ISO 26262 (an automotive safety standard), mostly in the interest of highlighting cross-domain knowledge that is applicable for both UL1998 and ISO 26262.

April 23, 2012

dwmqping revived

by dwm — Categories: FreeBSD, Software Development

Recently I had the need for a program to collect and plot round-trip times from my desktop. I wrote such a program in 2008 for FreeBSD, called dwmqping. I decided to revive it this month.

The old user interface was somewhat ugly (was I aiming for a Star Trek color scheme?), and at some point I’ll probably change it. However, it is functional, and I had forgotten how useful it can be. It can ping one or more destinations, and it uses TCP packets for transmission and BPF for timing, so it’s fairly accurate and not as dramatically affected by process scheduling, etc. as timestamping in application code. Clicking on the plot causes a snapshot to be saved in /tmp; this is useful but also a nuisance since it doesn’t let you supply a filename and is easily triggered when you don’t want a snapshot.

I used Qt for the GUI, and I’ve no plans to change to something other than Qt.

The buttons need some work, but here are some screenshots of what it looks like right now.

This first screenshot is with 2 destinations on my local network: ria and www. Here we have the live plot showing for ria. The packet rate was 1 packet per second. The blue lines and cyan dots in the plot are raw round-trip samples. The yellow line is the median of the last N points, where N was 1600 for this picture. If packet loss is observed, it appears as a red area plot using the Y-axis on the right. This axis is logarithmic because TCP begins to suffer significantly before 10% packet loss.
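The "median of the last N points" filter can be sketched as below. This is not the dwmqping code, just a minimal illustration: the class name WindowMedian is hypothetical, samples are plain doubles, and the median is recomputed from a copy of the window on each call, which is simple rather than fast.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Keeps the most recent 'n' samples and reports their median.
class WindowMedian {
public:
  explicit WindowMedian(std::size_t n) : n_(n) {}

  // Adds a sample, discarding the oldest one once the window is full.
  void Add(double sample)
  {
    window_.push_back(sample);
    if (window_.size() > n_) { window_.pop_front(); }
  }

  // Median of the current window (the upper middle value when the
  // sample count is even).  Returns 0 if the window is empty.
  double Median() const
  {
    if (window_.empty()) { return 0.0; }
    // Work on a copy so the arrival order in the window is preserved.
    std::vector<double> tmp(window_.begin(), window_.end());
    auto mid = tmp.begin() + static_cast<std::ptrdiff_t>(tmp.size() / 2);
    std::nth_element(tmp.begin(), mid, tmp.end());
    return *mid;
  }

  std::size_t Size() const { return window_.size(); }

private:
  std::size_t        n_;
  std::deque<double> window_;
};
```

A median is a reasonable choice for this kind of line because a single slow sample moves it very little, unlike a mean over the same window.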

The next screenshot is the history plot for a single destination on the other side of the country at 20 packets per second. The history plot is a box or candle plot. The top of the blue line represents the 95th percentile of the samples in the box’s set. I intentionally don’t display the maximum sample; I don’t want a single outlier to skew the Y-axis scale. The top of the cyan box represents the 75th percentile. The line through the cyan box represents the 50th percentile (median). The bottom of the cyan box represents the 25th percentile. The bottom of the blue line represents the minimum sample.

The next screenshot is the distribution plot. It shows the round trip time distribution for the last N samples, where N is 1600 in this case.

April 23, 2012

sitehealth repaired

I finally got around to hunting down and squashing a bug on my Site Health page. It was one of those bugs that wasn’t easy to reproduce, but I happened to catch it occurring repeatedly on Sunday.

One of the benefits of having C++ code using Wt for this kind of page: I can attach to a running process with the debugger (gdb) and debug a multithreaded process live. The problem was the allocation of an extra data series in the filesystem plot, without assigning header data to that data series. The header data is needed when rendering the legend. If it’s not there… one of the threads either causes a seg fault or hangs.

I fixed the problem and the Site Health page is working again. I will need to update Randy’s server when he is connected at his new location.

April 18, 2012

Systainer stack 1 for detailing

by dwm — Categories: Automotive Detailing

Tonight I ordered the Systainer SYS 3 T-Loc (P/N 497565) to complete the first Systainer stack for automotive detailing. A crude diagram of this stack:

April 16, 2012

CPU gauge and plot added to dwmqlaunch

by dwm — Categories: FreeBSD, Software Development
Having grown tired of the amount of CPU used by gkrellm and how difficult it is to create styles for it (as bad as xmms), I decided to start adding some optional widgets to dwmqlaunch to show the things I was using gkrellm to display. I started with CPU utilization. I used my old gauge code to create a small CPU utilization gauge, and created a new widget to display a CPU utilization plot. They’re at the bottom of the dwmqlaunch screenshot shown to the right of this text. In the plot, the blue bars are live unfiltered data and the orange line is the median of the 10 most recent samples.
© 2012 rfdm blog
All rights reserved