
Open USB FXS is now ready!

January 18, 2010

This is a historic moment! After successful completion of the remaining read() tests, I am hereby declaring my Open USB FXS project as working!

In summary, the board works, and it has a Linux driver that can be used by easy-to-read, easy-to-write, easy-to-understand userspace code. So I decided that this post should wrap up some of the issues I found and handled during development of the driver. Before doing that, however, I must confess there are some caveats: (a) my three prototype boards all exhibit some issues and need to be carefully debugged; (b) the driver does not handle error cases well (e.g., a board that stops responding after some time of correct operation) and needs work in this area. Other than that, things are quite OK. That said, here come the nasty details you were longing for.

First comes the patch to the Open USB FXS schematic:

As you can see, the control bus between the PIC and the 3210 now includes a new signal, \INT, that connects pin 2 of the Si3210 (\INT) to pin 13 of the PIC (RC2). Because the Si3210 drives \INT as an open-drain output (when the signal is not logical zero, it is tri-stated, which makes it possible to connect several 3210’s in parallel to a controller using a single wire), \INT needs to be pulled up to +5 V via a 10k resistor (RINT). You may have also noticed that I removed the 32.768-kHz crystal and its capacitors from the design, since these were not used anywhere.

Second comes the patch to the board design:

A few signal lines around pin 2 of the Si3210 have been moved slightly in order to make room for the new \INT signal; the signal lines are highlighted using Eagle’s “show” function and appear in brighter colors. RINT has been placed on the board’s bottom side. However, if you need to hand-patch the previous version of the board using wires, it’s easier to solder RINT on the top side, to a +5 V DC line, e.g., near CDC2.

Next, I am giving out some details about the Linux driver (a typical char device driver). The driver code can be thought of as consisting of four conceptual sections: (1) the USB core interface, (2) the device initialization code, (3) the isochronous I/O engine, (4) the file ops section (open, release, read, write and ioctl).

The USB core interface section is copied as shamelessly as possible from Greg Kroah-Hartman’s USB skeleton driver (which comes with the Linux kernel source). It consists of all the ugly stuff that is needed for the driver to recognize the device when it is plugged in and to set up the interface between the kernel’s USB core and the driver. This part also handles (quite cleanly, I must admit) the ugly things that happen when the user disconnects the device by unplugging the USB cable all of a sudden, by coincidence at the exact time that your driver is doing its most-critical-task-of-all-tasks-ever. It still doesn’t crash, which means Greg has done a very good job! Since this section is mostly copied and pasted from the USB skeleton code, there’s not really much more to discuss here, so I am moving on to the next sections.

The device initialization section is also dealt with very extensively in my previous posts. Essentially, after making sure that some of the 3210’s registers report back meaningful values, the code initializes everything there is to initialize on the board. A subtlety worth noting is that, since this takes a long time and may involve repetitive patterns, like testing the DC-DC converter output a number of times, the whole initialization is performed by a worker thread. A “board-state” variable is the means of communication between the initializer thread and the rest of the code; if the state is not OK, all file ops will fail, reporting either “Try again later” or “I/O error”, depending on whether the board is still initializing or has failed.
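The board-state check described above could be sketched roughly as follows. This is a userspace illustration, not the driver's code: the enum values, `struct board` and `fileop_precheck()` are hypothetical names, and mapping “Try again later” to `EAGAIN` is my assumption (“I/O error” is the usual text for `EIO`).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical board states; the real driver's names may differ. */
enum board_state {
    BOARD_INITIALIZING,  /* worker thread is still bringing the board up */
    BOARD_OK,            /* initialization finished successfully */
    BOARD_FAILED         /* initialization gave up */
};

/* Hypothetical per-device structure holding the shared state variable. */
struct board {
    enum board_state state;
};

/*
 * Called at the top of every file operation. Returns 0 if the operation
 * may proceed, or a negative errno in the style of kernel file ops.
 */
int fileop_precheck(const struct board *b)
{
    assert(b != NULL);
    switch (b->state) {
    case BOARD_OK:
        return 0;
    case BOARD_INITIALIZING:
        return -EAGAIN;  /* assumed mapping for "Try again later" */
    case BOARD_FAILED:
    default:
        return -EIO;     /* "I/O error" */
    }
}
```

A real driver would also have to make sure the worker thread's updates to the state variable are visible to the file ops (for example with a lock); the sketch leaves that out.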

The isochronous I/O engine is where I have departed significantly from the USB skeleton code. This is not something I did for fun; simply enough, the skeleton code does not contain anything isochronous, and the USB core provides only low-level primitives for isochronous transfers. The driver code has to prepare its own URBs (this is what a “request” is called in USB jargon; the acronym stands for “USB request block”), submit them to the core and provide a callback function that is invoked when each URB completes. Moreover, the calling code must explicitly provide the buffering and the exact position of each “packet” within a larger buffer that can contain many small isochronous packets. Thus, in the general case, isochronous packets may vary in size, although that is not our case.
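To make the “exact position of each packet” part concrete, here is a small userspace sketch of laying out equal-sized packets inside one URB buffer. The `struct iso_packet_desc` below is only a stand-in I made up for the per-packet descriptors that a URB carries, and `layout_iso_packets()` is a hypothetical helper; the packet size is a parameter because the post does not state it.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for the per-packet descriptor attached to an isochronous URB. */
struct iso_packet_desc {
    unsigned int offset;  /* where the packet starts inside the URB buffer */
    unsigned int length;  /* how many bytes the packet may carry */
};

/*
 * Lay out npackets equal-sized packets back to back in a single buffer.
 * Returns the total buffer size that the URB needs.
 */
size_t layout_iso_packets(struct iso_packet_desc *desc,
                          unsigned int npackets,
                          unsigned int packet_size)
{
    unsigned int i;

    assert(desc != NULL);
    for (i = 0; i < npackets; i++) {
        desc[i].offset = i * packet_size;  /* packets never overlap */
        desc[i].length = packet_size;      /* fixed size in our case */
    }
    return (size_t)npackets * packet_size;
}
```

With variable-sized packets, the same loop would accumulate a running offset instead of multiplying.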

In theory, designing an isochronous engine with these tools is not that hard: one just submits a bunch of URBs (a single submitted URB at a time is not safe and will most probably result in data loss; it is best to have at least one more pre-submitted for the USB core to pick up when it is done with the first), and then, as each URB completes, one provides code in the callback function to re-arm the URB and re-submit it with new data. This is easier said than done, though… Plus, some interesting questions arise. The most obvious one is, what is a good number of URBs to pre-submit? And, similarly, how many packets should each URB contain? Obviously, there is a tradeoff between submitting too few URBs, or too few packets per URB (the asynchronous nature of events in the Linux kernel may result in late invocation of the driver code, which means data loss), and too many of them (unneeded delays between the computer and the board will build up, due to buffering too much data). In the driver code, I have made these two parameters into module variables that can be set when the module is loaded into the kernel (separate variables exist for the OUT and IN directions). The defaults provide for four pre-queued URBs with four packets each, amounting to a total delay of 16 ms, which is not too terrible, I hope. With these settings, there are sometimes audible clicks when the VM (remember, I am running Linux on VMware) performs some other job, like compiling the module. Other than that, the results are not bad. One can always compile in other defaults, or use the module variables to test how the module does with various values.
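The 16 ms figure is simple arithmetic, sketched below under the assumption (implied by the numbers above) that each isochronous packet covers 1 ms. The struct and function names are hypothetical and are not the driver's actual module variables.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tunables for one direction (OUT or IN). */
struct iso_queue_params {
    unsigned int urbs;             /* URBs kept pre-submitted */
    unsigned int packets_per_urb;  /* isochronous packets in each URB */
};

#define PACKET_MS 1u  /* assumed: one packet per 1 ms USB frame */

/* Worst-case buffering delay, in milliseconds, for a full queue. */
unsigned int queue_delay_ms(const struct iso_queue_params *p)
{
    assert(p != NULL);
    return p->urbs * p->packets_per_urb * PACKET_MS;
}
```

With the defaults (4 URBs of 4 packets) this gives 16 ms; halving either parameter halves the delay but leaves less slack for a late callback.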

The other interesting question (which brings me to the file ops section) is, how the heck can this automaton synchronize with user I/O operations like read() and write(), which are inherently asynchronous? In order to answer that, it is best to think of the isochronous URBs as railroad “trains”. Here is how this metaphor helps in understanding read and write I/O.

It’s easier to first think of write I/O as follows: each byte to be written can be thought of as a passenger, who is to travel with the “train”. Each train departs at a given time, and if our passenger is not there to catch it, that’s too bad for her: the train will go anyway and she’ll have to wait for the next train. If the passenger is on time for the train that is waiting at the station and the train has free room for her, she will get on this train. If the train is full, however, she will have to wait for the next train (and, if that’s full as well, for the next one, up to a limit). If all scheduled trains are full, our passenger just goes to sleep and must be awakened when a new train is scheduled (when a URB is completed).

To complicate things a bit further, the board does not understand individual bytes, but instead requires a minimum sample of 8 bytes. OK, here is how this fits into the train metaphor. Passengers must travel in compartments (“coupés”) of eight persons. A compartment can either be fully populated, or else it must remain empty. Thus, before getting into the train, passengers who arrive at the station one by one first have to gather in a small waiting chamber which has a capacity of eight persons. If the time comes for the train to go while the waiting chamber is only partly full, the unfortunate passengers have to wait for the next train. Otherwise, as soon as the chamber fills up, all eight passengers embark together at once (needless to say, this whole silly waiting chamber routine is not needed when we have a massive arrival of passengers, except for the last few, fewer than eight, passengers).
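The waiting chamber can be sketched in a few lines of userspace C. This is only an illustration of the idea, with names I made up (`struct chamber`, `chamber_push()`); it ignores locking and the case where the train has no free compartment.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK 8  /* the board's minimum sample size, per the text */

/* The small waiting chamber: holds up to CHUNK lone passengers (bytes). */
struct chamber {
    unsigned char bytes[CHUNK];
    size_t count;  /* passengers currently waiting */
};

/*
 * One passenger arrives. Returns 1 and copies a full compartment into
 * out[] when the chamber fills up, or 0 if passengers are still waiting.
 */
int chamber_push(struct chamber *ch, unsigned char b, unsigned char *out)
{
    assert(ch != NULL && out != NULL);
    assert(ch->count < CHUNK);

    ch->bytes[ch->count++] = b;
    if (ch->count < CHUNK)
        return 0;                    /* not enough passengers yet */

    memcpy(out, ch->bytes, CHUNK);   /* all eight embark together */
    ch->count = 0;                   /* chamber is empty again */
    return 1;
}
```

A large write would bypass the chamber for every complete group of eight bytes and use it only for the remainder.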

This is roughly how write() works. There are some details, needed in order to synchronize the user-facing side of write() with the device-facing side of the “train-scheduling” automaton. For example, write() cannot be allowed to hold a buffer (a “train”) for long intervals; when the time comes, the URB completion callback will only spin-wait until write() places the next 8-byte chunk, and will then take the train (ehm, the buffer) away from write(). In this case, the train departs with fewer compartments than its full capacity, in the hope that a large write() will fill the next buffer.

Read() works almost the same. Trains arrive at regular intervals and bring into the station a fixed number of passengers each time. Passengers wait inside their trains to get picked up (by a car or by bus or whatever, let’s not be too realistic in this case). The station can hold a limited number of trains simultaneously. If the passengers of the oldest-arrived train have not been picked up when a new train arrives, the un-picked passengers have to be, ehmm, eradicated (didn’t I tell you, don’t take this example too literally…) to make room for the newcomers. Again, a small eight-person waiting chamber helps pick up loners in the correct order. If you wonder why this is needed in the case of read(), it’s because we cannot leave a train with a half-empty compartment (which is what would happen if we let passengers disembark one by one from trains), or else we would misalign the next samples. Instead, passengers must again get off the train as a full-compartment group and then wait in the small chamber to be picked up individually.
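The “station” on the read side behaves like a fixed-size ring that discards its oldest entry when a new train arrives and there is no free platform. Here is a minimal userspace sketch of that policy; the sizes and all names (`struct station`, `station_arrive()`, `station_pickup()`) are hypothetical, and each train carries a single compartment just to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TRAIN_BYTES   8  /* hypothetical: one compartment per train */
#define STATION_SLOTS 4  /* hypothetical station capacity */

struct station {
    unsigned char trains[STATION_SLOTS][TRAIN_BYTES];
    unsigned int head;      /* index of the oldest train */
    unsigned int count;     /* trains currently in the station */
    unsigned long dropped;  /* trains discarded before being read */
};

/* A train arrives; if the station is full, the oldest one is discarded. */
void station_arrive(struct station *s, const unsigned char *data)
{
    unsigned int tail;

    assert(s != NULL && data != NULL);
    if (s->count == STATION_SLOTS) {
        s->head = (s->head + 1) % STATION_SLOTS;  /* drop the oldest */
        s->count--;
        s->dropped++;
    }
    tail = (s->head + s->count) % STATION_SLOTS;
    memcpy(s->trains[tail], data, TRAIN_BYTES);
    s->count++;
}

/* Pick up the oldest train. Returns 1 on success, 0 if none is waiting. */
int station_pickup(struct station *s, unsigned char *out)
{
    assert(s != NULL && out != NULL);
    if (s->count == 0)
        return 0;
    memcpy(out, s->trains[s->head], TRAIN_BYTES);
    s->head = (s->head + 1) % STATION_SLOTS;
    s->count--;
    return 1;
}
```

In a driver, the arrival side would run in the URB completion callback and the pickup side in read(), so the two would need to be protected against each other.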

I hope all these details are more than enough and that by now you are completely fed up with read() and write(). Good, because there are also a few ioctls to discuss. The most interesting ones (which, me being a lazy guy, are the only ones I have implemented) are IOCSRING (sets ring on or off), IOCSLMODE (sets the line mode to open or forward active), IOCGHOOK (returns the hook state) and IOCGDTMF (returns the DTMF code currently seen). The DTMF part deserves a few more words. The board piggybacks the (hook and) DTMF state on all data packets (the newly-patched \INT signal serves to tell the PIC when to look for these). The driver code implements a sort of one-digit “latch”, to report each pressed DTMF digit once and only once, and only while the digit key is being pressed. Thus, the user code must poll for DTMF by probing the device frequently, but in theory this probing does not have to be as frequent as data I/O (a valid DTMF code should last at least 75 ms).
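The one-digit latch can be illustrated with a tiny state machine. This is my own userspace sketch of the behavior described above, not the driver's implementation; `struct dtmf_latch`, `dtmf_update()` and `dtmf_poll()` are hypothetical names, and 0 is assumed to mean “no digit”.

```c
#include <assert.h>
#include <stddef.h>

struct dtmf_latch {
    char current;  /* digit seen in the latest packet, 0 if none */
    int reported;  /* nonzero once the current press was handed out */
};

/* Called for each incoming packet with the digit it carries (0 = none). */
void dtmf_update(struct dtmf_latch *l, char digit)
{
    assert(l != NULL);
    if (digit != l->current) {  /* key released, or a different key */
        l->current = digit;
        l->reported = 0;        /* a new press may be reported again */
    }
}

/* Called when userspace polls: returns a digit once per press, else 0. */
char dtmf_poll(struct dtmf_latch *l)
{
    assert(l != NULL);
    if (l->current == 0 || l->reported)
        return 0;
    l->reported = 1;
    return l->current;
}
```

Note the consequence: a key that is pressed and released between two polls is never reported, which is why userspace has to poll more often than the minimum digit duration.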

A final caveat for the device driver: in its current version, the code does not allow more than one simultaneous open() of the device. This saved me a lot of trouble and case-handling in the code. On the other hand, this precaution does not really guarantee much, since a multi-threaded userspace app can still mess up everything by issuing many simultaneous read()s or write()s. The driver in its current state is guaranteed to misbehave or crash if this happens. Don’t try it, or at least don’t blame me for what will happen.
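A single-open guard of this kind can be sketched with a test-and-set flag. The following is a userspace illustration using C11 atomics, not what the driver actually does; `struct fxs_dev`, `fxs_open()` and `fxs_release()` are hypothetical, and returning `EBUSY` on a second open is my assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical per-device structure. */
struct fxs_dev {
    atomic_flag opened;  /* set while some process holds the device open */
};

/* Returns 0 for the first opener, -EBUSY (assumed) for anyone else. */
int fxs_open(struct fxs_dev *d)
{
    assert(d != NULL);
    if (atomic_flag_test_and_set(&d->opened))
        return -EBUSY;
    return 0;
}

/* Called on the last close; lets the next open() succeed. */
void fxs_release(struct fxs_dev *d)
{
    assert(d != NULL);
    atomic_flag_clear(&d->opened);
}
```

As the paragraph above points out, this only serializes open(); it does nothing about several threads sharing one file descriptor.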

Here is an example of userspace code. I hope it is clean and easy to read for all readers (after all, that’s what the driver is useful for: it hides all the ugly floor planks, holes and cracks underneath a nice Persian rug). So, ideally, you should not need to ask me what the program does. Ah, and let me not forget this as well: I will provide an update to this post when I upload the newest versions to the project’s Google Code site.

#include <fcntl.h>
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/ioctl.h>   /* ioctl() */
#include "openusbfxs.h"  /* OPENUSBFXS_IOC* request codes */

int main (void) {
    int d, o, i;
    char c[8];           /* one 8-byte sample chunk */
    ssize_t n;           /* return value of write() */
    int t = 0;           /* total bytes written */
    int h;               /* hook state: non-zero means off-hook */
    int k;               /* DTMF digit, 0 if none */

    /* Audio file to play (8-bit u-law) and the FXS device */
    if ((d = open ("vm-options.ulaw", O_RDONLY)) < 0) {
        perror ("open vm-options.ulaw failed");
        exit (1);
    }
    if ((o = open ("/dev/openusbfxs0", O_WRONLY)) < 0) {
        perror ("open /dev/openusbfxs0 failed");
        exit (1);
    }
    sleep (1);

    /* Ring the phone for one second, unless it is already off-hook */
    if ((i = ioctl (o, OPENUSBFXS_IOCGHOOK, &h)) < 0) {
        perror ("IOCGHOOK failed");
        exit (1);
    }
    if (h) {
        printf ("Not ringing since set is off-hook\n");
    }
    else {
        if ((i = ioctl (o, OPENUSBFXS_IOCSRING, 1)) < 0) {
            perror ("IOCSRING (on) failed");
            exit (1);
        }
        sleep (1);
        if ((i = ioctl (o, OPENUSBFXS_IOCSRING, 0)) < 0) {
            perror ("IOCSRING (off) failed");
            exit (1);
        }
    }

    /* Poll once per second until the user picks up the phone */
    while (1) {
        if ((i = ioctl (o, OPENUSBFXS_IOCGHOOK, &h)) < 0) {
            perror ("IOCGHOOK failed");
            exit (1);
        }
        printf ("Phone is %s-hook\n", (h)? "off":"on");
        if (h) break;
        sleep (1);
    }

    /* Play the file in 8-byte chunks, checking hook and DTMF state */
    while (read (d, &c[0], 8) == 8) {
        if ((n = write (o, &c[0], 8)) < 0) {
            perror ("write failed");
            exit (1);
        }
        if (n < 8) {
            fprintf (stderr, "write returned %zd\n", n);
            break;
        }
        t += 8;
        if ((i = ioctl (o, OPENUSBFXS_IOCGHOOK, &h)) < 0) {
            perror ("IOCGHOOK failed");
            exit (1);
        }
        if (!h) {        /* user hung up: stop playing */
            printf ("Phone is %s-hook\n", (h)? "off":"on");
            break;
        }
        if ((i = ioctl (o, OPENUSBFXS_IOCGDTMF, &k)) < 0) {
            perror ("IOCGDTMF failed");
            exit (1);
        }
        if (k) {
            printf ("DTMF key pressed: %c\n", k);
        }
    }
    printf ("A total of %d bytes were written\n", t);
    close (o);
    close (d);
    return 0;
}

Update, Jan 21: all the new code and changes/fixes to existing code have been uploaded to the project’s Google Code site. This includes the new patched versions of the schematic and the board (the old versions have been kept too in the source, as “openusbfxs-unpatched”), the updated firmware (in source and hex format), the latest Win32 console driver code (a fix was needed for understanding ring trip detection [this is what an off-hook event while the phone is ringing is called in 3210’s language]), and, of course, the newest version of the driver, including the read() and ioctl functionality, plus many other small fixes.

Now that the driver code is out in its entirety, it would be nice to hear some reports on how well the code ports to various Linux kernel versions, so if anyone out there tries it with versions other than 2.6.26-2, please let me know the results. One note: I know that the code will not port to versions prior to 2.6.11 or .12 (some API changes were made to the kernel around then), and I don’t really intend to back-port the driver to older kernel versions at this time (maybe later, because there is nothing special in these API changes that would really prevent the code from working on older kernels). On the other hand, if the code does not port well to newer kernel versions, and somebody is kind enough to point me to whatever incompatibility is found there, I would appreciate it very much.
