Saturday, November 16, 2013

Proof of concept: offloading the formatting of log messages from an embedded device

Elecia White explains in Making Embedded Systems: Design Patterns for Great Software that you should offload tasks from the device to the host to save precious processing power for things that must run on the device. This inspired me to show how you can offload the formatting of log messages from the device. The result is a very simple C logging library for the device and a few nifty Ruby scripts for the host. The C library uses some preprocessor magic to mark the logging messages for the Ruby scripts and to create a shorthand syntax for outputting log messages. There are still some things to do to turn it into production-ready code, like decoupling the logging library call from the actual output on the peripheral. But hey, it's just a proof of concept! You can find it at https://github.com/matthiaskraaz/binary-logger. There is also a job running on Travis-CI (https://travis-ci.org/matthiaskraaz/binary-logger) showing you the output of the latest version.
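To illustrate the general idea (not the actual wire format or API of binary-logger), here is a minimal Python sketch. It assumes a hypothetical record layout of a 16-bit message ID followed by raw 32-bit integer arguments, and a hypothetical format table that the host would extract from the marked C sources. The device only ships the ID and the raw arguments; the host does the expensive string formatting.

```python
import struct

# Hypothetical table: message ID -> (format string, number of int arguments).
# In a real setup the host scripts would generate this from the C sources.
FORMAT_TABLE = {
    1: ("Temperature is %d degrees", 1),
    2: ("Motor %d running at %d rpm", 2),
}


def encode_record(message_id, *args):
    """Stand-in for the device side: pack the ID and the raw arguments."""
    # "<" = little endian without padding, "H" = uint16, "i" = int32
    return struct.pack("<H%di" % len(args), message_id, *args)


def decode_record(record):
    """Host side: look up the format string and do the actual formatting."""
    (message_id,) = struct.unpack_from("<H", record, 0)
    fmt, arg_count = FORMAT_TABLE[message_id]
    args = struct.unpack_from("<%di" % arg_count, record, 2)
    return fmt % args


if __name__ == "__main__":
    record = encode_record(2, 3, 1500)
    print(len(record), "bytes on the wire")
    print(decode_record(record))
```

In this sketch the second message costs 10 bytes on the wire instead of the full formatted string, and the device never has to run a printf-style formatter.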

Wednesday, August 21, 2013

Disable dialog asking to debug or close program

This dialog is quite pesky if your unit tests run automatically and one of them crashes. Instead of the test executor being notified about the crash, the unit test sits there forever, waiting for the user to react. Of course the test executor could implement a timeout and shoot the process down. Another option is to disable the dialog. Just add the following to your registry:

    Windows Registry Editor Version 5.00

    [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows NT\CurrentVersion\AeDebug]
    "Debugger"="drwtsn32 -p %ld -e %ld -g"
    "Auto"="1"

This activates Dr. Watson as the automatically selected debugger. Dr. Watson will create a crash dump for post-mortem debugging. Alternatively you can select some other debugger or tool, but you can't leave it empty for the solution to work.
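The timeout alternative mentioned above can be sketched as follows. This is only an illustration of the idea in Python, assuming a hypothetical test executor that starts each test as a child process; the function name and the result strings are made up for this example.

```python
import subprocess
import sys


def run_with_timeout(command, timeout_seconds):
    """Run one test process and report 'passed', 'failed' or 'timeout'."""
    try:
        # subprocess.run kills the child and waits for it when the
        # timeout expires, then raises TimeoutExpired.
        completed = subprocess.run(command, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        return "timeout"
    return "passed" if completed.returncode == 0 else "failed"


if __name__ == "__main__":
    # A child that hangs, standing in for a test stuck behind the dialog.
    hanging_test = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(run_with_timeout(hanging_test, timeout_seconds=1))
```

The drawback compared to the registry setting is that every crash costs the full timeout, and you get no crash dump.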

Tuesday, April 9, 2013

How to debug an unexpected UAC prompt

Recently a self-built executable showed the UAC prompt and it wasn't clear why. It sure didn't need administrative privileges. So how do you debug this?

According to Wikipedia, the UAC prompt can be triggered by:

  • the "Run this program as an administrator" compatibility option: just check the file properties
  • the executable's manifest: if requestedExecutionLevel is present and is not asInvoker, the UAC prompt is triggered. An external manifest can simply be opened in a text viewer. Also check for an embedded manifest: open the executable file in Visual Studio and check the RT_MANIFEST resource. If Visual Studio displays an error message that there are no resources, you are fine: without resources there is no embedded manifest
  • the UAC heuristic: it checks the file name, the string resources and the manifest for keywords that indicate an installer
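The manifest check from the list above can also be scripted. The following Python sketch assumes you already have the manifest as text (for example an external .manifest file); the sample manifest is made up for illustration, and the namespaces are ignored on purpose so the lookup does not depend on the schema version used.

```python
import xml.etree.ElementTree as ET

# Made-up sample manifest for illustration only.
SAMPLE_MANIFEST = """
<assembly xmlns="urn:schemas-microsoft-com:asm.v1" manifestVersion="1.0">
  <trustInfo xmlns="urn:schemas-microsoft-com:asm.v3">
    <security>
      <requestedPrivileges>
        <requestedExecutionLevel level="requireAdministrator" uiAccess="false"/>
      </requestedPrivileges>
    </security>
  </trustInfo>
</assembly>
"""


def requested_execution_level(manifest_xml):
    """Return the 'level' attribute of requestedExecutionLevel, or None."""
    root = ET.fromstring(manifest_xml.strip())
    for element in root.iter():
        # Tags look like "{namespace}name"; compare the local name only.
        if element.tag.rsplit("}", 1)[-1] == "requestedExecutionLevel":
            return element.get("level")
    return None


if __name__ == "__main__":
    level = requested_execution_level(SAMPLE_MANIFEST)
    if level is not None and level != "asInvoker":
        print("manifest requests elevation:", level)
```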

In my case it was the UAC heuristic. To be specific, the UAC prompt was triggered by the long and cryptic (and automatically generated) file name. Some sequence in it told UAC that this must be an installer. So renaming the file fixed the problem - no UAC prompt any more.
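For automatically generated file names, a small check in the build can catch this early. The sketch below is not the real UAC heuristic, which is internal to Windows; the keyword list is an assumption that only covers a few commonly cited keywords, and the file names are invented.

```python
# Assumed, incomplete keyword list; the real heuristic is internal to Windows.
INSTALLER_KEYWORDS = ("install", "setup", "update")


def installer_keywords_in(filename):
    """Return the keywords that occur anywhere in the file name."""
    name = filename.lower()
    return [keyword for keyword in INSTALLER_KEYWORDS if keyword in name]


if __name__ == "__main__":
    for candidate in ("x9SETUP_4f21.exe", "tool_4f21.exe"):
        print(candidate, installer_keywords_in(candidate))
```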

Friday, April 5, 2013

Why backups if you have a NAS?

I recently bought a Synology DS212j and set up an SHR volume (RAID1 with extra features). Then, after some toying (sorry: thorough feature evaluation), I configured backup tasks using the Time Backup extension.

Why, you might ask, do you really think two hard disks are going to crash at the same moment?

No, I don't. But mirroring doesn't protect us against files being corrupted by faulty applications. Or accidentally deleted. Thanks to Jeff Mitchell for the reminder.

BTW: What IS the probability of two hard disks in a RAID array failing? Two hard disks from the same lot? With the exact same load through all their lifetime? Within the time frame that it takes to rebuild the RAID?

Synology: Time Backup on DiskStation has failed. Please check the package log for further information.

Today my new Synology DS212j notified me via email about: Time Backup on DiskStation has failed. Please check the package log for further information.

Aside from the email notification and missing versions in the timeline, I could find no indication of the error or its reason, neither in the general web interface nor in the Time Backup web interface.

Google told me that the Time Backup package log could be found at /var/log/timebkp.log.

So I logged in via ssh and found:

    {WARN}{1365130850}{...}{Task [matthias] has failed to backup shared folder [matthias] due to [error occurred while copying files].}
    {ERR}{1365130850}{matthias}{Task [matthias] has failed to backup version [20130405-0500] due to [error occurred while copying files].}
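The second field looks like a Unix timestamp. A small Python sketch for making such lines readable, assuming the four-field {level}{epoch}{task}{message} layout that can be inferred from the two lines above (the format is not documented anywhere I know of):

```python
import re
from datetime import datetime, timezone

# Layout inferred from the log excerpt; treat it as an assumption.
LINE_PATTERN = re.compile(
    r"\{(?P<level>[^}]*)\}\{(?P<epoch>\d+)\}\{(?P<task>[^}]*)\}\{(?P<message>.*)\}$"
)


def parse_line(line):
    """Split one log line into (level, UTC timestamp, task, message)."""
    match = LINE_PATTERN.match(line.strip())
    if match is None:
        return None
    when = datetime.fromtimestamp(int(match.group("epoch")), tz=timezone.utc)
    return match.group("level"), when, match.group("task"), match.group("message")


if __name__ == "__main__":
    sample = ("{ERR}{1365130850}{matthias}{Task [matthias] has failed to backup "
              "version [20130405-0500] due to [error occurred while copying files].}")
    level, when, task, message = parse_line(sample)
    print(level, when.isoformat(), task)  # ERR 2013-04-05T03:00:50+00:00 matthias
```

03:00:50 UTC is 05:00:50 in Central European Summer Time, which matches the version name 20130405-0500.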

Not very informative. However, in the sibling file /var/log/timebkp.debug I found:

    Apr 05 00:00:54 [14260]BK_ERR:rsync return with error, return code = 41

Which led me to notice the file /var/log/rsync.error, which held the answer:

    Apr 05 00:00:53 (14436) [ERROR] log.c (350): rsync: recv_generator: mkdir "/volumeUSB1/usbshare/TimeBackup/..." failed: No space left on device (28)

At first I was confused because the device had plenty of space left, but then I realized that the ext4 file system had run out of inodes. I had originally provisioned the file system for another purpose and had therefore reduced the inode count by a factor of 256. This, together with Time Backup's habit of using lots of hard links, had created the inode shortage: hard links to unchanged files share an inode, but every version still needs new inodes for its directories and for every changed file.
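"No space left on device" with plenty of free space is the typical symptom of inode exhaustion. A minimal Python sketch to check for it, using os.statvfs (available on Unix-like systems only); the function name and the returned dictionary are just choices made for this example:

```python
import os


def inode_usage(path):
    """Return total, free and used inode counts for the file system at path."""
    stats = os.statvfs(path)
    total = stats.f_files   # total number of inodes
    free = stats.f_ffree    # number of free inodes
    used = total - free
    # Some file systems report 0 total inodes; avoid dividing by zero.
    percent = 100.0 * used / total if total else None
    return {"total": total, "free": free, "used": used, "percent_used": percent}


if __name__ == "__main__":
    # "/" is a placeholder; point this at the mount point of the backup disk.
    print(inode_usage("/"))
```

If the percentage is close to 100 while the disk still has free blocks, the file system has run out of inodes rather than space.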

As ext4 doesn't allow increasing the inode count after the file system has been created, I had to re-create the file system.

I plugged the hard disk back into the Synology and issued "Back Up Now" for all backup tasks. I had expected the version count to be reset to 1, but instead Time Backup retained the old version count. Uneasy with this obvious difference between Time Backup's bookkeeping and the content of the hard disk, I resorted to deleting and re-creating all backup tasks. This seems to have done it.

Update 2013-12-13: The file system has run out of inodes again. Seems like you should use a higher-than-default ratio of inodes to size for Time Backup.
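To get a feeling for the numbers, here is a rough estimate in Python. Everything in it is an assumption for illustration: a disk size of 1 TB (the real size isn't given here), 16384 bytes per inode as the usual mke2fs default, and simple division where mke2fs would additionally round to block groups.

```python
ASSUMED_DISK_BYTES = 10**12          # hypothetical 1 TB backup disk
ASSUMED_DEFAULT_BYTES_PER_INODE = 16384  # usual mke2fs default, may differ


def estimated_inode_count(fs_bytes, bytes_per_inode):
    """Rough inode count: one inode per bytes_per_inode bytes of disk."""
    return fs_bytes // bytes_per_inode


if __name__ == "__main__":
    default = estimated_inode_count(ASSUMED_DISK_BYTES,
                                    ASSUMED_DEFAULT_BYTES_PER_INODE)
    reduced = estimated_inode_count(ASSUMED_DISK_BYTES,
                                    ASSUMED_DEFAULT_BYTES_PER_INODE * 256)
    print("default ratio:        ", default)   # 61035156
    print("reduced by factor 256:", reduced)   # 238418
```

Under these assumptions the reduced file system has room for only about 240,000 files and directories in total, across all backup versions.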

Saturday, February 16, 2013

I really like SnagIt the screen logger

Recently I was searching for a screen logger that would record my actions while I tried to reproduce a hard-to-find bug. I found SnagIt, the award-winning software. And I really liked it.

First of all, I was up and running with SnagIt in an instant. Getting the evaluation key, download, installation, firing it up: a matter of minutes.

Second, it is light on the CPU. The system under test was a PC simulation of a two-channel medical device. Meaning: there are two processes monitoring each other with quite sharp timeouts. Meaning: any CPU-intensive task might make the simulation go into safe state, because one of the channels doesn't get enough CPU time. SnagIt is light enough on the CPU to not disturb this quite sensitive application.

The third thing I noticed was the very nice end result: a quite small video file that exactly captured the application and my interaction with it. Perfect to be attached to a bug report.