JANOS Network Capture utilizes Predictive Debugging Mode

Written by Bruce Cloutier on Oct 24, 2018 4:07 pm @bscloutier

The key to debugging these days lies in the ability to recreate the failure. Some of the more elusive problems start out as rumors where a user swears that something failed to do what was expected but that he/she cannot prove it. After a while perhaps the issue resurfaces and this time someone is able to collect enough evidence to barely prove that the problem is a real concern. What exactly occurs remains a complete mystery. Eventually someone files a bug report but the developers are clueless and cannot effectively address the cause.

Sometimes there are suggestions as to what actions may have triggered the problem. The Support Team may take the time to try to recreate the problem on a local system but to no avail. Other issues which stand a better chance of being resolved take precedence. Meanwhile news comes in that the original issue has again occurred and may even now happen in completely different locations and environments. What to do next?

Perhaps it is appropriate to regenerate the application for the user’s site to add logging and other debugging statements in hopes of creating a better record of the events leading up to a future failure. The difficulty here is in deciding what exactly to log and to be logged by what aspects of the system given no real knowledge of the source of the issue. Some guesses are made and the logging is enabled. Some time later which could be weeks the issue reoccurs and the logging proves to have not been useful. It may even be that the content of the logs are now meaningless as the logic behind them has been forgotten. So it is difficult to make progress.

It would be nice to know when the issue is going to occur so you know exactly when you need to start watching for it. Better yet, if the problem occurs you could skip back in time to only a few seconds before and you could solve it! If the issue involves network interaction you could setup a sniffer like Wireshark and set it to collecting detailed network traffic. Now with an external device this may involve the use of a Network Hub as opposed to a Network Switch so you can add a machine running Wireshark that would see all of the traffic. But that is a lot of hardware and it might just not be feasible.

Here’s where JANOS and the JNIOR might surprise you. If the issue is with the JNIOR application you can use the netstat -c command from the Command Line interface to generate a PCAP capture file capturing network traffic. That PCAP file can be downloaded to a PC with Wireshark installed and it will open right up for analysis. And here’s where the magic occurs. It is almost like JANOS knows ahead of time when the issue will occur, that PCAP file can contain traffic from BEFORE the event happened although it has been generated after the fact.

Okay. So it is not magic but the foresight of the operating system which always is capturing network traffic. There is a capture buffer that by default is 512KB and that can be enlarged through the IpConfig/CaptureBuffer Registry Key to as much as 8MB. It is not so much a ‘buffer’ as it is a queue. At any one point in time it contains a significant amount of recent network traffic. You can also set filtering that is to be employed during capture to limit the collected traffic to that of concern and thereby greatly extend the time frames involved. You can also use a filter in generating the PCAP file to limit what you transfer for analysis.

Now what if the event occurs overnight and by the time you realize it the data that you need has been overwritten? A simple utility application can be added to the unit to automatically generate the network PCAP capture file when the event is detected. You can even then have that file emailed to you!

Those of you who may have worked with this aspect of the JNIOR will note that the network capture only includes data from the previous boot if it is not yet begun to wrap around. If the failure involves an assertion that might cause the JNIOR to reboot the information is lost. Well this is true, however, as of JANOS v1.7.1 which now as of this writing a Release Candidate the capture buffer is non-volatile. It will survive a reboot and contain a record of the prior network traffic. The buffer is only reinitialized when power is applied.

It is also possible to configure the JNIOR in Promiscuous Mode wherein it would capture all external network traffic and not only that targeted to the JNIOR. If you couple that with a Network Hub you could utilize the JNIOR to monitor the interaction of other devices on a local network segment. I bet you never thought of JNIOR as a IT debugging tool?

If this aspect of the JNIOR interests you or if you need to utilize it in debugging, please feel free to contact INTEG for support. You may also consult the Registry Key Assignments documentation for details on network capture. This information is supplied as part of the Dynamic Configuration Pages (DCP) served by the JNIOR WebServer accessible by your favorite browser.