Archive for December, 2010

Two new fascinating features

December 15, 2010

By now you should know: I am not the guy who would sit idle forever. Two nice features that I had conceived in the past were always at the back of my mind. Motivated by some friends who asked about them, I decided to dive once more into the deep waters of firmware, kernel code and user-side development. (I know, by now I should have known better, but it seems I cannot help it.)

Before I go into the details, however, please let me spend one minute on something completely unrelated: the availability of prototype boards. Probably because of their high price, or for whatever other reason, only a few people were interested in buying prototypes. Thus, I decided to stop selling them. A number of boards are still available, but not for purchase. You can still reach me at openusbfxs -at- gmail dot com if you are interested in getting your hands on a prototype or a DIY kit, but you can no longer buy one. I will consider only non-commercial arrangements (e.g. sending you a board in exchange for some other piece of equipment or for a service you render me). Mail me if you need a board and you think you have an interesting proposition.

OK, now back to the new fascinating features. What are they? Well, the first is about upgrading the firmware without the need to touch S1(b) (that’s S2 in my older design), in pure software. Doesn’t this sound like a very professional-grade feature? Just run a tiny program, and there it is, your firmware gets upgraded in-place, without even needing to unplug your device!

The second feature may sound just a little bit more complex. It bears the obscure name “channel number persistence”, but in principle it is simple. Suppose for a second that you have two Open USB FXS devices, each connected to a phone. The ugly thing so far was that, if you happened to unplug and then replug the two devices, or even to reboot your system with the devices plugged in, the dahdi channel-#-to-device mapping might change, depending on the order in which you plugged them back in. Along with a device’s channel #, its extension number would of course change too, so you might find yourself in the unpleasant situation of getting e.g. your next-door colleague’s former extension number (and her getting your former extension number, whichever of the two is worse). Channel # persistence fixes exactly this, by assigning each Open USB FXS device a fixed channel number, regardless of the order in which the devices are plugged into the system.

So, let us first dive into the software-based firmware upgrade story. Let me remind you that the Open USB FXS board has always had firmware upgrade support: a small bootloader lives in the low memory area of the PIC (0x000 to 0x800), and if you switch S1(b) to its “on” position, unplug, and then re-plug the board, the bootloader takes control of the board. In this mode, the PIC is programmed to behave like one of Microchip’s PICDEM boards. Thus, up to now, I have been using a Windows utility from Microchip, PICDEM-FS, to flash the FXS firmware onto the higher memory area.

What was the downside of this? First, you would have to set up and maintain a separate Windows environment for firmware upgrades. And second (and most important), you would need physical access to the board in order to unplug it, toggle a DIP switch, replug it, unplug it, turn the switch back off and replug. Which means you could not plug the device into the back panel of your tower PC and forget it there, under your desk, forever; you’d have to pull the tower case out from time to time, remove the board, plug it in, flash it with new firmware, remove it again and plug it back in. Moreover, my dongle would never make it into a housing case, because then, from time to time, the user would have to open the case and fiddle with the DIP switches. Certainly not too practical, is it?

So I decided to make the board able to upgrade its firmware right from Linux, without unplugging or touching S1(b). The first step was to check out a small Linux program, fsusb, which proved to do the flashing job perfectly well. Goodbye, Windows! However, this was the easy part. I now had to find a way to bypass S1(b), which was not as straightforward as it might sound.

At first sight, bypassing S1(b) looked too tricky. When the device boots, the bootloader takes control first; it checks the setting of S1(b) and, if the switch is in the “off” position, jumps to the FXS firmware. Moreover, the bootloader is compiled with Microchip’s MCC compiler with all optimizations turned on, which is what makes it fit into its allotted 0x800-byte (2 KB) “memory box”. My copy of MCC had passed the end of its demo period long ago, and one of the optimizations was no longer supported. If I recompiled the bootloader from source, it would no longer fit in that space, and I would have to do extensive rewrites to the FXS firmware to align with the new layout. So this direction seemed a clear “no go”.

Another thought that crossed my mind was to replicate the bootloader’s functionality in the FXS firmware itself. But this seemed too convoluted as well. Of course, I could not let the FXS firmware overwrite itself. Thus, I would have to define another memory area, and write the firmware code so as to contain an “egg” part with the necessary code to overwrite the FXS firmware flash; when instructed, the code would place the “egg” at some high memory area and invoke it. But then, I would also have to rewrite the fsusb program so that it would recognize the USB vendor and product ids of the FXS “personality” (more on personalities below). Naaah, all this sounded much too complicated…

My third thought was the one that actually made it. The bootloader code is fixed and unlikely to ever change. What if I instructed the FXS firmware to jump back into the bootloader, at the point right after its check for S1(b)? The bootloader would then run regardless of the S1(b) setting. Conveniently, the check for S1(b) is done very early in the bootloader code, before the USB circuitry is initialized and hence before the board assumes either of its two possible personalities; thus, there would be no side effects.

Well, this sounded fine, however there was still a minor issue needing to be resolved: the bootloader code assumes it is being run right after a reset, so that it can take a PICDEM “personality” (USB vendor id, product id, interface endpoints, etc.). The FXS firmware has a totally different “personality” (other product id, other endpoints, etc.). Thus, it was clear: the jump to the bootloader would have to be right after a reset. But — wait a moment: then, how would I instruct the board over software to go into bootload mode? Right after a reset there is no way for the driver to talk to the board; and when the driver is finally able to talk to the board, it’s too late, since the board has already assumed its FXS personality!

The solution I found to this dead-end was again the PIC’s EEPROM. At boot time, the FXS firmware would examine byte #4 of the EEPROM (bytes #0 to #3 are occupied by the serial number) and, if it found this byte set to zero, it would go on booting normally as an FXS device. Otherwise, it would reset the byte to zero (to avoid an endless loop at the next boot) and then it would jump to the bootloader. Yes, that would clearly make it.
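The boot-time check described above can be modelled in a few lines. This is only a minimal sketch in Python, not the firmware itself (which runs on the PIC): the function names `boot_decision` and `request_upgrade`, and the flag value 1, are made up for illustration. Only the EEPROM layout (serial number in bytes #0 to #3, flag in byte #4) comes from the text.

```python
BOOT_FLAG_ADDR = 4   # EEPROM byte #4; bytes #0..#3 hold the serial number


def boot_decision(eeprom: bytearray) -> str:
    """Model of the check made right after a reset.

    Returns "fxs" for a normal boot, or "bootloader" if the flag was set.
    """
    if eeprom[BOOT_FLAG_ADDR] == 0:
        return "fxs"
    # Clear the flag before jumping, so that the next reset boots normally
    # instead of landing in the bootloader forever.
    eeprom[BOOT_FLAG_ADDR] = 0
    return "bootloader"


def request_upgrade(eeprom: bytearray) -> None:
    """Hypothetical host request: set the flag; the board is then reset."""
    eeprom[BOOT_FLAG_ADDR] = 1   # any non-zero value would do


eeprom = bytearray([0x12, 0x34, 0xAB, 0xCD, 0x00])
first = boot_decision(eeprom)    # flag clear: normal FXS boot
request_upgrade(eeprom)
second = boot_decision(eeprom)   # flag set: bootloader, and flag cleared
third = boot_decision(eeprom)    # flag clear again: normal FXS boot
```

The point of clearing the byte before the jump is that the request is one-shot: whatever happens in the bootloader, the reset after it brings the board back as an FXS device.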

And indeed, it did make it. What came next was a tiresome, but relatively straightforward job: I transcribed the code from the (open-source) utility fsusb into a small program of mine, checkfmwr.c. Now this new program does everything there is to do in order to perform a tidy firmware upgrade: it scans for Open USB FXS devices on the system; if it finds one, it checks whether (according to the program’s invocation arguments) this board requires a firmware upgrade; if it does, the program instructs the board to reset into bootloader mode; it then scans the USB bus again, and finds the board with its bootloader “personality”; it flashes the new firmware, and finally it reboots the board back into FXS mode. It all works like a charm.

A tricky part of the procedure relates to associating the (new) PICDEM device with the (old) Open USB FXS device after the board resets to bootloader mode. Since the board reboots, it is removed from the system; and when the bootloader-personality board boots, it appears with a different device number, because the Linux kernel gives new devices a new, always-increasing, device number. So the algorithm I chose was to scan the USB buses until I find the bus that held the Open USB FXS device in question; then, look for a PICDEM device on that bus with a device number higher than that of the Open USB FXS device found earlier. This works, unless you have more PICDEM devices lurking on the same USB bus. You do not? That’s what I thought, too.
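The matching rule can be sketched as follows. This is an illustration in Python rather than the code of checkfmwr.c: the `UsbDevice` type and the function name are hypothetical, the vendor and product ids are placeholders (the real values are not given here), and picking the lowest matching device number when several candidates exist is my own assumption.

```python
from typing import Iterable, NamedTuple, Optional


class UsbDevice(NamedTuple):
    bus: int
    devnum: int
    vendor_id: int
    product_id: int


# Placeholder ids for the bootloader personality, for illustration only.
BOOTLOADER_IDS = (0xAAAA, 0x0001)


def find_bootloader_device(devices: Iterable[UsbDevice],
                           old: UsbDevice,
                           ids=BOOTLOADER_IDS) -> Optional[UsbDevice]:
    """Find the re-enumerated board: same bus, right ids, higher device number."""
    candidates = [d for d in devices
                  if d.bus == old.bus
                  and d.devnum > old.devnum
                  and (d.vendor_id, d.product_id) == ids]
    # Assumption: with several candidates, take the one enumerated first.
    return min(candidates, key=lambda d: d.devnum) if candidates else None
```

A caller would remember the bus and device number of the FXS device before asking it to reset, rescan the bus afterwards, and pass both to this function. If another bootloader-mode board sits on the same bus, the match is ambiguous, which is exactly the limitation mentioned above.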

A funny, paradoxical situation here is that the firmware first needs to be upgraded to a version that supports the auto-upgrade functionality before it can actually auto-upgrade :-). OK, I agree, it does feel funny; but remember: it is the last upgrade you will have to perform with the traditional method (S1(b))! All your upgrades from then on will be done in software alone.

Enough with the firmware upgrades; it’s now time for the second feature, channel # persistence. This was conceptually simpler and more straightforward: to activate the feature, one would pass the oufxs driver a parameter mapping serial numbers to dahdi channel #’s. The driver would then invoke dahdi_chan_register() and actually allocate these channel numbers, mapped to internal dev structures inside it. Later on, when such a pre-reserved Open USB FXS device was plugged in, the driver would notice and, instead of allocating a brand-new dev structure for the new device, it would just use the pre-allocated dev structure that was already registered with dahdi. Simple, isn’t it?
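To make the idea concrete, here is a toy model of the bookkeeping, written in Python rather than kernel C. The class and method names are hypothetical, and the rule for devices without a reservation (take the lowest channel that is neither reserved nor in use) is an assumption made for the sake of the example, not necessarily what the driver does.

```python
from typing import Dict


class ChannelTable:
    """Toy model: channels are reserved per serial number at load time."""

    def __init__(self, reserved: Dict[str, int]):
        self.reserved = dict(reserved)      # serial -> pre-reserved channel
        self.attached: Dict[int, str] = {}  # channel -> serial now plugged in

    def plug(self, serial: str) -> int:
        chan = self.reserved.get(serial)
        if chan is None:
            # Assumed policy for unreserved devices: lowest free channel
            # that is not set aside for somebody else.
            taken = set(self.reserved.values()) | set(self.attached)
            chan = 1
            while chan in taken:
                chan += 1
        self.attached[chan] = serial
        return chan

    def unplug(self, serial: str) -> None:
        for chan, owner in list(self.attached.items()):
            if owner == serial:
                del self.attached[chan]
```

With reservations for two serials, the order of plugging no longer matters: each board lands on its own channel every time, and only boards without a reservation get whatever is left.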

Of course it was not that simple in practice. There are many execution paths within the driver, and I had to make sure that, in each and every one of them, I would not use unallocated or uninitialized stuff, I would free or unregister all unneeded malloc()ed space and unneeded dahdi channels, I would check that persistent devices were present before attempting to use them, I would not forget to unregister free()ed structures, I would not leave any spinlocks locked, etc. Which means I froze and oops’ed the kernel quite a few times before I got it right; but I got it right in the end.

Along the way, I thought it was a good idea to check compatibility with dahdi version 2.3.x (earlier work was on 2.2.x). This first resulted in some BUG_ON messages, because the new dahdi version uses kernel module reference counts, and I was not supplying oufxs as the exporting module for my dahdi devices. With that fixed, I then had to make the code port well to both dahdi 2.2 and dahdi 2.3 (and 2.4, which I have not tried yet). A little fiddling with the Kbuild makefile did it.

But then my (now correct) code triggered a wholly new bug, this time in the dahdi code. I had designed the driver so that pre-reserved devices that are not present on the system failed on open() (this looked like the right thing to do, when a physical device does not exist for a pre-reserved dahdi channel). But on that execution path (failure in open()), dahdi was not decrementing the module reference count, and thus my oufxs module remained locked in memory until the next reboot. I fixed the bug in my dahdi-base.c copy, and posted a bug report on Digium’s site.

A funny story went on then, where the site administrator kept removing the fix code that I posted because of site policy reasons, until I had to describe the fix in English. Anyway, my fix was finally accepted and made it into their CVS (or whatever it is they use for version control).

Anyway, the code now works well (supposedly in both dahdi 2.2.x and 2.3.x, although I have not yet checked both thoroughly). So you can perfectly well pass the driver at load time a parameter like this:

rsvsn2chan=1234ABCD:2,DEADDEAD:4

and have two dahdi channels, 2 and 4, pre-reserved for the two boards with serials 1234ABCD and DEADDEAD. This means that now you can go into /etc/dahdi/system.conf and specify parameters for channels 2 and 4, knowing that each setting will always pertain to a given board, no matter which USB port this board is plugged into. Moreover (and more importantly) you can fix extension numbers for dahdichan values 2 and 4 in /etc/asterisk/users.conf and know that an extension will always pertain to a given board. Which, I guess, means you will no longer receive accidental calls intended for your colleague next door.
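The format of the parameter is simple enough to be checked by hand, but a small parser shows what is and is not a sensible value. This is a sketch in Python, not the driver's own parsing code; the function name is made up, and the validation rules (serials of exactly eight hex digits, channels numbered from 1, no duplicates) are assumptions inferred from the example above.

```python
import string
from typing import Dict


def parse_rsvsn2chan(value: str) -> Dict[str, int]:
    """Parse 'SERIAL:CHAN,SERIAL:CHAN,...' into a serial -> channel map."""
    mapping: Dict[str, int] = {}
    for item in value.split(","):
        serial, sep, chan = item.partition(":")
        if not sep:
            raise ValueError(f"missing ':' in {item!r}")
        serial = serial.strip().upper()
        # Assumption: a serial is 4 bytes, written as 8 hex digits.
        if len(serial) != 8 or any(c not in string.hexdigits for c in serial):
            raise ValueError(f"bad serial number {serial!r}")
        chan_no = int(chan)  # raises ValueError on non-numeric input
        if chan_no < 1:
            raise ValueError(f"bad channel number {chan_no}")
        if serial in mapping or chan_no in mapping.values():
            raise ValueError(f"duplicate entry in {item!r}")
        mapping[serial] = chan_no
    return mapping
```

Rejecting duplicates matters more than it may seem: two serials mapped to the same channel would bring back exactly the ambiguity the feature is meant to remove.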

There is still a caveat, which I will maybe resolve in a future code fix. This fail-on-open() policy of mine for pre-reserved devices has an unwanted side effect: when a device is actually plugged in, chan_dahdi has to be reloaded in Asterisk in order for the device to be opened and work. This may not be what one wants in a production system, because it will break all active dahdi calls. So, in the future, I think I could try a more clever policy, i.e. permit the open() to succeed and then make sure in all other places in the driver that no attempt will be ever made to access non-existing hardware. This feels like a totally new bunch of bugs are waiting for me to stumble on them, so I am leaving it for now.

Another idea would be, if this does not somehow already exist in dahdi, to cooperate with Digium and make this pre-reservation code into a separate (dahdi) module, that would apply to other pluggable devices as well. Ouch, this looks like a big, big adventure for courageous and fearless kernel programmers — certainly not what I feel like right now, after all these nights of bug-hunting.

As has always been the case in this Open USB FXS saga, each new feature opens the way for some interesting new work in the future. Nevertheless, the current state is not too bad either: my devices can now be software-upgraded and be given fixed channel numbers, settings, and extension numbers. So why not try these features yourself (and let me know how it went)?
