Archive for May, 2009

Hello, World!

May 15, 2009

For many months I have been imagining the day that I would write a post with this title.

Now that I am actually typing the words, it feels weird: I have reached one of the most important milestones in my Open USB FXS “pet” project, and yet I suddenly realize that the road ahead might be quite a bit longer than the road behind. This is the exact feeling I had the day I graduated from high school, or the day I finished my military service (which is mandatory in Greece, where I live), and that feeling goes something like, “it wasn’t that hard, was it?”, not to mention, of course, that what follows will definitely be harder. Anyway, enough with all this crap; let’s jump into the details that I know every reader is looking forward to.

As is obvious from the title of the post, I made my board utter its first “Hello, World!” sound! My board has passed its PCM language exam. I am not sure whether it has earned an “A” grade (see below for more on that), but nevertheless, the simple “Hello, World!” voice message is reproduced loud and clear. How did I get here, though?

My previous post, “Time (in a bottle)”, was devoted to the details of rewriting the TMR1 ISR. I won’t return to those details, with one exception: the position of the FSYNC pulse. It seems that moving it to the 32nd cycle (the end) of the sequence instead of the first cycle (the start) was not necessary. That move was due to my hasty re-reading of the timing diagrams on p.55 of the Si3210 datasheet. In reality, the chip starts counting bytes from the rising edge of FSYNC. I did not notice this until late in my tests, and there’s more on this later on. However, subject-wise this belongs to the previous post, so I thought I would mention it first.

So, again, how did I get here?

It seems that the most important challenge was to create a good PCM audio communication path between the board and the PC. This required understanding and implementing USB isochronous transfers. So let me start with these.

The USB standard distinguishes isochronous endpoints along two axes: by synchronization type (e.g., synchronous or asynchronous) and by usage type (data or feedback). An asynchronous endpoint means that the USB device has its own clock and the host may send (in an isochronous manner) more or less data per frame to match demand (e.g., of some codec with variable bit rate). This was not my case. In contrast, a synchronous endpoint always transfers a given amount of data per frame, and the USB device operates synchronously to the USB clocking. It looked like I was in this category. A data endpoint sends data (my case), whereas a feedback endpoint sends feedback (e.g., to increase or decrease the amount of data per frame; not my case).
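As a hedged illustration of how these categories show up on the wire: in a USB 2.0 endpoint descriptor, the bmAttributes byte carries the transfer type in bits 1..0 and, for isochronous endpoints, the synchronization type in bits 3..2 and the usage type in bits 5..4 (the specification also lists an "adaptive" synchronization type and an "implicit feedback" usage type). The function name below is made up for this sketch; it is not part of any library or of the board's firmware.

```python
# Field encodings of bmAttributes in a USB 2.0 endpoint descriptor.
TRANSFER_TYPES = ("control", "isochronous", "bulk", "interrupt")
SYNC_TYPES = ("none", "asynchronous", "adaptive", "synchronous")
USAGE_TYPES = ("data", "feedback", "implicit feedback data", "reserved")


def decode_bm_attributes(value):
    """Split an endpoint descriptor's bmAttributes byte into its fields.

    Hypothetical helper for illustration only.
    """
    transfer = TRANSFER_TYPES[value & 0x03]
    if transfer != "isochronous":
        # Sync and usage bits are only meaningful for isochronous endpoints.
        return {"transfer": transfer}
    return {
        "transfer": transfer,
        "sync": SYNC_TYPES[(value >> 2) & 0x03],
        "usage": USAGE_TYPES[(value >> 4) & 0x03],
    }


# 0x0D = 0b00001101: isochronous, synchronous, data endpoint.
print(decode_bm_attributes(0x0D))
```

A synchronous data endpoint, the category described above, would therefore advertise 0x0D in this field.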

“Time (in a bottle)” describes what it takes for synchronous/isochronous transfers to work correctly from the viewpoint of the board: the 3210 PCM highway I/O tasks and the USB I/O had to be strictly synchronized, and timing had to be rigorously checked to run equally fast on both sides. All this was done via the TMR1 ISR.

However, on the PC side things were a bit more complicated. The user-level API of libusb-win32 that I finally worked with required my PC “driver” program to pre-queue a buffer and wait for the transfer to finish. However, as I was soon to find out, if I waited for a queued buffer to finish and then re-queued the next buffer, the library was missing frames. So, at every point in time, I had to have not one, but two buffers queued. One buffer would be the one transmitting, and the next one would just be waiting, so that no gap would occur when the first one ended. One exception to this “always-two” rule seemed natural: when a buffer was drained, I had only one buffer queued while I was preparing the next one to be queued. Here is an outline of how this works in a loop:

prepare (buffer1);
context1 = LibUSBenqueue (buffer1);
prepare (buffer2);
context2 = LibUSBenqueue (buffer2);
current context = context1;
#transferred packets = 0;
#last time checked = 0;
while (true) {
    #packets transferred from current buffer = LibUSBcheck (current context);
    if (#packets transferred from current buffer > #last time checked) {
        increase #transferred packets by #packets transferred from current buffer - #last time checked;
        #last time checked = #packets transferred from current buffer;
    }
    if (#transferred packets == desired) break while loop;
    if (#packets transferred from current buffer == (buffer size / packet size)) {
        // the current buffer is fully transferred: release it and re-queue it
        LibUSBfinishWith (current context);

        if (current context == context1) {
            prepare (buffer1);
            context1 = LibUSBenqueue (buffer1);
            current context = context2; // which has been previously queued
        } else {
            prepare (buffer2);
            context2 = LibUSBenqueue (buffer2);
            current context = context1; // which has been previously queued
        }
        #last time checked = 0; // the new current buffer starts counting from zero
    }
}
LibUSBfinishWith (current context);
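To see why one queued buffer is not enough, here is a toy simulation, not the real driver. It assumes the host moves exactly one packet per tick (one USB frame) from the head of the queue, and that re-preparing and re-queuing a finished buffer takes prep_ticks ticks. The names simulate, packets_per_buffer and prep_ticks are invented for this sketch and have nothing to do with the libusb-win32 API.

```python
from collections import deque


def simulate(depth, packets_per_buffer=4, prep_ticks=1, total_ticks=40):
    """Return (frames transferred, frames missed) with `depth` buffers kept queued."""
    queue = deque([packets_per_buffer] * depth)  # packets left in each queued buffer
    pending = []  # ticks at which re-queued buffers become visible to the host
    sent = missed = 0
    for tick in range(total_ticks):
        # Buffers whose preparation has finished join the tail of the queue.
        arrived = sum(1 for t in pending if t <= tick)
        pending = [t for t in pending if t > tick]
        queue.extend([packets_per_buffer] * arrived)

        if queue:
            queue[0] -= 1  # the host transfers one packet this frame
            sent += 1
            if queue[0] == 0:
                # Buffer drained: the driver notices, prepares and re-queues it.
                queue.popleft()
                pending.append(tick + 1 + prep_ticks)
        else:
            missed += 1  # nothing was queued for this frame

    return sent, missed


print(simulate(depth=1))  # (32, 8)
print(simulate(depth=2))  # (40, 0)
```

With a single buffer, every refill costs a frame; with two, the second buffer covers the gap while the first is being prepared again, which is the "always-two" rule in the outline above.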

This is not exactly what you could call “simple”, especially if one takes into account the details of filling in buffers with data from a file (appropriately sliced into packets with headers and payloads), plus the fact that this whole thing has to work in parallel for both read (IN) and write (OUT) USB transfers [to be honest, this requirement is not necessary for a “Hello, World!” test which sends one-way audio, but naturally one needs to make sure that two-way audio would work in principle, hence the double effort].

Of course, it took days and days of trial and error to get the above thing working. Sometimes I was missing IN packets, and I had no idea why. Other times, the whole thing stalled on me, and again, I had no clue as to what the error was. Anyway, as soon as I had this working, I pushed all the gory details into a “SendAudioFile” function, and tried that inside a test sequence within my driver PC program. The test sequence went like, “initialize the board, then ring the phone once, then wait until the phone goes off-hook, then SendAudioFile, then check the on/off-hook status again, if off-hook SendAudioFile, etc., etc., ad infinitum”.

At the beginning, I just heard an acute sound with a steady frequency. After some thinking, I attributed that to the fact that my buffer size had changed from 64 bytes (its size in my previous ISR version) to just 8 bytes. So, whenever I was not sending any PCM data, the ISR kept repeating the same 8 bytes every millisecond, and this produced a pattern that repeated itself every millisecond. This is 1 kHz, and yes, I was listening to a 1-kHz pattern! So everything was in order there, apart, of course, from the fact that this whole thing meant that my board was not receiving any USB data.
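The arithmetic behind that tone can be checked in a couple of lines. This is only a sketch of the reasoning above: it assumes one 8-bit PCM sample per 125-microsecond frame (8 bytes per millisecond, as stated), and the function name is invented for the example.

```python
SAMPLE_RATE_HZ = 8000  # 8 bytes per millisecond, one byte per 125 us frame


def repetition_frequency_hz(buffer_bytes, sample_rate_hz=SAMPLE_RATE_HZ):
    """Fundamental frequency of a buffer that is replayed over and over."""
    return sample_rate_hz / buffer_bytes


print(repetition_frequency_hz(8))   # 1000.0 (the tone with the new 8-byte buffer)
print(repetition_frequency_hz(64))  # 125.0  (what the old 64-byte buffer would give)
```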

Activating the test pattern b10101010 (or 0xAA, if you prefer) that pre-existed in my code made the acute sound disappear, which means that the PIC-to-3210 path was likely OK. Then I scrutinized the ISR code, only to find that I had left in a #define for debug purposes, which was bypassing the OUT part altogether. Fixing that gave me an audible result full of noise, cracks and clicks. Definitely better than the steady 1-kHz tone, but something was still wrong. What?

It took another day or two until I tried a longer message from the collection of Asterisk’s IVR sounds and noticed that I could distinguish periods with more noise and others with less, in the familiar pattern of a voice that speaks, then makes short pauses between words and longer ones between phrases. This led me to think that the audio was getting through halfway, and that the culprit was the position of the TXD/RXD pulse stream with respect to FSYNC. So, after a better look at the timing diagrams of the PCM highway on p.55 of the 3210 datasheet, I set the RXS and TXS registers to 1 instead of zero, meaning that the audio pulse train starts one PCLK cycle later than FSYNC. And after that, in my next test, the phone rang and “Hello, World” came out, loud and clear!
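A bit-level toy model shows why a single PCLK cycle matters so much. It is only an illustration, not a description of the Si3210's internals: it assumes the data bits start one clock after frame sync, and compares a reader that starts at offset 0 with one that starts at offset 1. Both helper names are invented for the sketch.

```python
def bytes_to_bits(data):
    """MSB-first list of bits for a sequence of byte values."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def read_bytes(bits, start_offset):
    """Group bits into bytes, starting `start_offset` clocks after frame sync."""
    usable = bits[start_offset:]
    usable = usable[: len(usable) - len(usable) % 8]  # keep whole bytes only
    out = []
    for i in range(0, len(usable), 8):
        value = 0
        for bit in usable[i:i + 8]:
            value = (value << 1) | bit
        out.append(value)
    return out


sent = [0x12, 0x34, 0x56]
line = [0] + bytes_to_bits(sent)  # data begins one clock after frame sync

print([hex(b) for b in read_bytes(line, 1)])  # ['0x12', '0x34', '0x56']
print([hex(b) for b in read_bytes(line, 0)])  # ['0x9', '0x1a', '0x2b']
```

Read one bit early, every byte is still related to the original (it is shifted, with a bit borrowed from its neighbour), which is consistent with hearing the rhythm of speech under a lot of noise.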

It goes without saying that there are still issues. The most important one is that audio occasionally comes out distorted, in a way very familiar to the ear of someone who has been using VoIP systems: some packets come in late. I need to do some more debugging for this (e.g., increment a counter whenever no data is received in time, and check its value). Then I’ll see what remedy I can find.

Another issue (now resolved) is the acute 1-kHz sound: in periods where no actual audio data were sent, the board just kept repeating the last two packets (two, because of ping-pong buffering), which resulted in the now-well-known acute 1-kHz sound. I changed the ISR execution path for the case where no data were received so that it overwrites the receive buffer with 0xFF’s (u-law silence), and this solved the problem.
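The idea can be sketched on the PC in a few lines. The decoder below follows the usual G.711 mu-law expansion and confirms that 0xFF decodes to a zero sample; the Playout class and its missed counter are hypothetical, standing in for the ISR path and for the debugging counter mentioned earlier, and are not the firmware's actual code.

```python
ULAW_SILENCE = 0xFF


def ulaw_to_linear(code):
    """Decode one G.711 mu-law byte into a signed linear sample."""
    code = ~code & 0xFF  # mu-law bytes are stored bit-inverted
    sign = code & 0x80
    exponent = (code >> 4) & 0x07
    mantissa = code & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


class Playout:
    """Chooses what to play each millisecond (hypothetical stand-in for the ISR path)."""

    def __init__(self, packet_bytes=8):
        self.packet_bytes = packet_bytes
        self.missed = 0  # how many times no data arrived in time

    def next_packet(self, received):
        if received is None:
            self.missed += 1
            # Play silence instead of replaying stale samples.
            return bytes([ULAW_SILENCE]) * self.packet_bytes
        return bytes(received)


playout = Playout()
print(ulaw_to_linear(ULAW_SILENCE))      # 0
print(playout.next_packet(None).hex())   # ffffffffffffffff
print(playout.missed)                    # 1
```

The same decoder gives a constant non-zero value for a constant 0xAA stream, that is, a DC level with no tone, which fits the observation that the test pattern made the acute sound disappear.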

So, having reached this milestone, I’ll allow myself the few beers (and the few days off the project) that I feel I deserve. Then I’ll probably go and fix some bugs (the packet loss issue and another annoying thing where, on occasion, back-on-hook is not recognized by the board) and move on to either dealing with DTMF first and then writing a Linux driver, or writing the Linux driver right away. Bear with me, folks! It seems that I might make it in the end! I still need to publish code, updated timing diagrams, and an updated BOM (I have been neglecting these, I know), but I’ll do it eventually.


Time (in a bottle)

May 8, 2009

If I could save time in a bottle
The first thing that I’d like to do
Is to save every day
Till eternity passes away
Just to spend them with you

This time, starting to discuss my version 2 of the ISR code, I chose the lyrics from an older song, “Time in a bottle” by Jim Croce. It feels quite like the new spec for the version 2 ISR. That’s exactly what the ISR is: a series of bottled (or canned, if you prefer) prescribed time. You may check out the code of the current version of the ISR here. Lots of unfinished things still lurk around, but the code’s general shape and organization are rather close to final, unless something serious comes up.

I am not going to describe in detail what the code does here. The reasoning is much the same as in my original conception. There are four “periods” as spanned by the outer (in the sense of nested for-loops, because in its actual location in the code it’s an inner one) counter cnt4, and each period contains 8 full PCLK cycles, spanned by the “inner” counter cnt8. Each 8-cycle train is carefully profiled (and painfully debugged) to take exactly 375 TCy’s, for a full 32-PCLK-cycle train that takes 1500 TCy’s, or equivalently, 125 microseconds (which yields a PCLK of exactly 256 kHz).
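These figures can be cross-checked with a few lines of arithmetic. This is just a sketch of the numbers quoted above; the instruction rate is derived from them (1500 TCy in 125 microseconds), and the function name is invented for the example.

```python
from fractions import Fraction

FRAME_RATE_HZ = 8000  # one 32-PCLK train every 125 microseconds


def isr_timing(tcy_per_period=375, periods=4, pclk_per_period=8):
    """Derive the clock figures implied by the ISR's cycle budget."""
    tcy_per_frame = tcy_per_period * periods
    pclk_per_frame = pclk_per_period * periods
    return {
        "tcy_per_frame": tcy_per_frame,
        "instruction_rate_hz": tcy_per_frame * FRAME_RATE_HZ,
        "pclk_hz": pclk_per_frame * FRAME_RATE_HZ,
        "avg_tcy_per_pclk": Fraction(tcy_per_frame, pclk_per_frame),
    }


timing = isr_timing()
print(timing["tcy_per_frame"])            # 1500
print(timing["instruction_rate_hz"])      # 12000000
print(timing["pclk_hz"])                  # 256000
print(float(timing["avg_tcy_per_pclk"]))  # 46.875
```

Note that the average of 46.875 TCy per PCLK cycle is not a whole number, so it is the 8-cycle train as a whole, rather than each individual cycle, that has to hit its budget exactly.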

Special actions are taken when cnt8 loops over to zero, and for the whole 8-cnt8-cycle train when cnt4 is zero. During the first 8 cnt8 cycles of the 32-PCLK-pulse train, PCM audio I/O between the PIC and the 3210 takes place. When cnt4=1 and cnt8=0, the DRX (sent from the PIC to the 3210) and DTX (received by the PIC from the 3210) bytes are placed in an 8-byte buffer (actually it’s a 16-byte buffer, and data are placed starting at an initial offset of 8) directly in USB-reachable memory. This yields tremendous time savings because the very same buffers are then directly transmitted and received over USB, at times cnt4=2 and cnt4=3 respectively, once 8 bytes have been collected (for DTX) or drained out (for DRX). One more change from the original version is that FSYNC is now pulsed when cnt4=3 and cnt8=7, that is, at the 32nd cycle.

Happily, at 256 kHz, filling in an 8-byte buffer with the above schedule takes exactly one millisecond. This is veeeery convenient, because this is exactly the period at which a USB host sends Start-of-Frame (SOF) packets to the board. A few nanoseconds after the SOF, the board responds with an isochronous IN (from the board to the host) USB packet, which contains 8 bytes’ worth of PCM data. That is, 1 millisecond’s worth of data gets transmitted every 1 millisecond, and this happens synchronously with the 3210’s clock and the USB SOF timing signal. Couldn’t hope for any better, could I?
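A tiny timeline makes the match explicit. The sketch below assumes the two clocks are already aligned, with one PCM byte completed every 125 microseconds and one SOF every 1000 microseconds; the function name is made up for the illustration.

```python
FRAME_US = 125   # one PCM byte is completed per 32-PCLK train
SOF_US = 1000    # the host starts a new USB frame every millisecond


def bytes_ready_at_each_sof(n_sofs):
    """Count how many PCM bytes have been collected each time an SOF arrives."""
    ready = []
    collected = 0
    for t in range(FRAME_US, n_sofs * SOF_US + 1, FRAME_US):
        collected += 1
        if t % SOF_US == 0:
            ready.append(collected)
            collected = 0
    return ready


print(bytes_ready_at_each_sof(3))  # [8, 8, 8]
print(8 * 8 * 1000)                # 64000 bits per second in each direction
```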

Using the USB memory directly as buffers is feasible only thanks to a buffering technique that the PIC supports, called “Ping-Pong buffering”. In Ping-Pong buffering mode, the PIC uses even- and odd-rank Buffer Descriptors for each endpoint (in each direction, IN or OUT). Then, one set of IN/OUT transactions uses the even buffers, the next one uses the odd ones, and so on. This complicated things a bit in my code, but not very much. In return, one gets a tremendous speedup, because useless memory copies are avoided altogether. I felt very lucky that this mode existed in the first place, because otherwise I am afraid that the time in the bottle would not suffice.
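The principle can be modelled in a few lines. This is a conceptual sketch only, not the PIC's Buffer Descriptor mechanism: the class and method names are invented, and the point is simply that handing a buffer over means switching an index, never copying bytes.

```python
class PingPongEndpoint:
    """Two buffers: one being filled by the ISR, one owned by the USB side."""

    def __init__(self, size=8):
        self.buffers = [bytearray(size), bytearray(size)]  # even and odd
        self.cpu_index = 0  # buffer the ISR is currently filling

    def write_byte(self, offset, value):
        self.buffers[self.cpu_index][offset] = value

    def swap(self):
        """Hand the filled buffer to the USB side and start filling the other one."""
        ready = self.buffers[self.cpu_index]
        self.cpu_index ^= 1  # even <-> odd, no data is copied
        return ready


ep = PingPongEndpoint()
for i in range(8):
    ep.write_byte(i, i)
first = ep.swap()
for i in range(8):
    ep.write_byte(i, 8 + i)
second = ep.swap()

print(first.hex())   # 0001020304050607
print(second.hex())  # 08090a0b0c0d0e0f
```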

Another nice trick is the initial synchronization between the 3210 FSYNC and the USB signals. For that, I had to tweak the USB firmware provided by Microchip a bit. The default firmware works in polling mode, and checks asynchronously for various events. One of these events is the SOF interrupt. In order to synchronize the ISR and the SOF signal, I needed to have complete control of the SOF flag from within the ISR, so I had to comment out an instruction in Microchip’s USB firmware that resets the flag. Obviously, my handling of SOF in the ISR renders some other parts of Microchip’s firmware useless too, like the provided SOF callback function.

I should probably mention that debugging this thing was a loong, painful experience. In the end I resorted to TMR3, which can also run, just like TMR1, at the pace of the program counter (Fosc/4). I fixed the value of TMR3 so that it read zero when the ISR started, and then moved a debugging block around the ISR to verify my expected ‘@+xx’ values. Finally, after everything looked OK, I copied the value of TMR3 onto the IN USB packet and made sure that each value I was receiving had the same offset from the previous one, and that this offset was exactly 12000, as expected for one millisecond (just in case you are wondering, this took me three whole days to conceive, plan, implement, test, debug, correct, re-try, etc.; I would never like to do that again, never…).
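On the PC side, such a check might look like the sketch below. It assumes the timer value travels in the packet as a 16-bit counter that wraps around at 65536, so differences are taken modulo 2^16; both function names are invented for the example.

```python
def timer_deltas(readings, bits=16):
    """Differences between consecutive timer readings, allowing for wrap-around."""
    mask = (1 << bits) - 1
    return [(b - a) & mask for a, b in zip(readings, readings[1:])]


def all_offsets_match(readings, expected=12000):
    """True if every packet arrived exactly `expected` timer counts after the previous one."""
    return all(delta == expected for delta in timer_deltas(readings))


# Ten simulated readings, 12000 counts apart, wrapping at 65536.
good = [(500 + 12000 * i) % 65536 for i in range(10)]
bad = list(good)
bad[5] += 3  # one packet stamped three counts late

print(all_offsets_match(good))  # True
print(all_offsets_match(bad))   # False
print(timer_deltas(bad)[4:6])   # [12003, 11997]
```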

Bottom line: the song goes

But there never seems to be enough time
To do the things you want to do
Once you find them

but I think that this time I squeezed just enough time in the bottle to keep my board drinking PCM data forever. As of this writing, the actual audio functionality is still untested, but I’ll test it soon and report back by updating this post. In the meantime, readers (?? if any…) can enjoy browsing through the TMR1 ISR code (GPL-licensed, so freely usable in their own projects).

I just hope to see — that is, to hear — all this code working fine in practice. Otherwise, I risk winding up like the old, white-haired mad scientist from Episode 7 (was it?) of the Muppet Show, who prepares himself a series of weird-colored, steaming potions and drinks them one after the other, each time getting younger and younger, under the melody of “Time in a bottle” (definitely a must-see); until, of course, he drinks the last potion, which turns him back into an old man…