Archive for the ‘Open Hardware’ Category

Discussing further development

October 19, 2009

So, what’s next? If you have been reading this blog for a while, you may wonder if this project is (as a matter of fact, if I am) active any longer. Well, there is not an easy answer to this question. The real answer is yes and no.

The answer is “no” because, since the end of June, we have had our second baby at home with us. This has resulted in a drastic transformation of the spatio-temporal parameters of my electronics hobby life: there is no time for heating my soldering iron, nor can I find anywhere in sight the countless hours that I used to need to experiment with my board. Plus, there is no space on which I can lay my half-finished boards, SMD components, voltmeters, soldering paste, phone sets, etc., at least not without risking one of the kids swallowing any of the above (ok, not the voltmeter and the phone set, but the rest, yes, why not?), getting burned by the soldering iron or getting tangled in the various cables while I am busy nursing the other one. And, of course, my lab desk has been evicted from the house, in order to make room for new beds, drawers, toys and other equipment.

Although taking care of two kids in parallel requires my wife and me to move at speeds approaching the speed of light, apparently the result is that the available space and time shrink both proportionally to one another, contrary to what Albert Einstein’s theory of relativity predicts. It seems that the mechanics of baby care somehow eluded the great physicist when he was creating this monumental work, or else I am sure he would have formulated his theory so as to cover this exception. The solution might be to add to the equations a new quantity, called “love”; this quantity would grow inversely to the product of the available space and time, so as to restore the theory’s correctness, even in extreme conditions like the ones I am describing above.

The answer about the project’s activity is “yes” as well. I haven’t completely dumped the project, since I have turned my attention to porting it to the Linux world. After all, if you check my replies to the comments on the “About” page of this blog (e.g., replies to Denver or Vincent), you will see that I am thinking about targeting the project at Linux/Asterisk (which was my initial plan anyway). As you may have guessed, there are two challenges to this; the first one is called “Linux” and the second one “Asterisk”.

Let me start with the “Linux” part. To prepare myself somewhat, I thought that a good idea would be to study the Linux kernel and device driver modules to see how they deal with USB communication. Without claiming that I have mastered even the slightest part of it, it seems that a classless (“custom”) USB device driver is not that difficult to write, at least not in principle. While the trickiest task is to correctly handle the various kernel beasties such as spinlocks, semaphores and the like, the USB primitives themselves seem very clear and simple to use. Even isochronous communication is very much like in (the Windows flavor of) libusb: one simply has to initialize a transfer by specifying the packet size, and then queue one or more buffers, while the kernel core handles the packetization and the actual I/O. Transfer completion is reported asynchronously; one needs to check for it in order to queue up additional buffers.

As a quick grep reveals, some devices (e.g. USB cameras) make extensive use of isochronous USB I/O. In principle, I could learn by studying the respective device drivers and produce a driver for Open USB FXS.

Asterisk, on the other hand, is another story. In order to create a device that Asterisk can understand, one needs to write code for a “channel”. I am still studying Asterisk and DAHDI (the Asterisk ex-Zaptel device drivers), in order to see how to proceed with this task. Whereas writing a Linux driver does not seem too hard, the “channel” code looks a bit tangled to me. I am sure that I will eventually understand its insides better after some studying, whenever I manage to do that studying, of course…

One additional piece of work that needs to be done is to update the board’s firmware in order to fit some status reporting along with the isochronous PCM data. My reply to Vincent’s comment on the “About” page explains this somewhat.

So, this is where I stand for the time being. My progress is awfully slow, so I must apologize to any readers who expect a quick something, such as a project “code release”. On the other hand, since I am not buried in the loneliness of soldering components, but am instead trying to write Linux code, and since the project’s code is posted in public, anyone (e.g., kernel driver developers) interested in contributing to the project is warmly welcome to contact me and provide help. Feel free: this is an Open project anyway!

Update Nov 9 2009: after studying a bit of David’s chan_mp code (see his comment below) and some Linux USB device drivers (particularly ones with isochronous transfers), I think I’ve got a grasp of the whole thing. First of all, writing a device driver is not strictly necessary. I could rely on usbdevfs to access the device (I verified that a Linux system sees the board and all its USB endpoints as usbdevfs descriptors), and write all code in userland using libusb (plus some trickery for ISO transfers that plain libusb does not support yet). Although feasible, this is somewhat too tricky and would result in ugly code. It seems that the best separation between driver and userspace functionality would be to orchestrate the isochronous transfer code in the device driver and the higher-level handling in userspace. Doing that has a great advantage: the kernel can invoke a callback function in the device driver code every time an isochronous transfer is completed, and that callback can do several things, like freeing buffer space, queueing additional data, etc., which is quite different from the polling model I’ve been using in my strictly-userspace code. Moreover, I can hide tons of low-level functionality in the device driver code (like initializing the ProSLIC, etc.) and provide a high-level IOCTL interface to userland. Not bad. At least in principle, before drowning myself in all the kernel oopses and crashes I am likely going to trigger in my device driver code.

And one more piece of news: Open USB FXS now has its own USB VID/PID pair. I have not bought a VID (it’s very expensive); however, Microchip has a VID/PID sublicensing program, to which I applied, and they responded immediately with the magic numbers 0x04D8 and 0xFCF1. Thanks, Microchip! The code has not been adapted yet; however, in the (extremely unlikely) case that someone out there is experimenting with my board and my code, they should from now on avoid abusing the company’s own demo card VID/PID and use the above ID numbers instead.

The (un)holy explanation

July 25, 2009

As the title says, this post is all about the reason why my board sometimes used to produce garbled audio. As of this writing, I am in the midst of fixing the bug, and in some cases I am already able to listen to the complete 15-second audio file from the Asterisk IVR collection (“press one to record your unavailable message, press two to record your busy message” etc.) without a single audible hiccup. This is “a small step for the project, but a giant leap for me and my own good mood”! I was nearly depressed with this bug resisting arrest for all this time, but the moral of the story is that in the end, no bug can last forever. An optimistic message, isn’t it?

To recapitulate: until my previous post, I was in doubt whether something was wrong with the board’s firmware or the issue of garbled audio was caused by libusb. Well, as one would expect, the answer is not so straightforward. The problem has a dual cause: it was caused by a bug in the firmware which was triggered by libusb occasionally delaying an OUT packet.

So, I am going to explain the issue here. If all this is too technical for you, you may skip to the next post (whenever this comes out). There’s absolutely no need for you to go through my bug analysis session. But, if you find it instructive (or amusing, or both) to read stories on how other people blew it, you are on the right track here.

The libusb side of the problem is easy to understand. Occasionally, libusb was delaying an OUT packet a bit. I am not sure yet why this is happening, but I am not complaining: in a real-life VoIP environment, a lost or delayed packet is something usual. Although this should not happen in a USB environment (with just a one-foot shielded cable intervening between the software and my board), the truth is that it is allowed to happen. Isochronous USB I/O clearly allows packets to be lost on the way. Thus, my firmware should clearly allow for a packet to be missed or delayed. Did it really?

Again, the answer is not so straightforward. My PIC USB configuration employs something called ping-pong buffering, which uses two sets of USB endpoint control descriptors, a set for even transmissions, and another set for odd ones. In my TMR1 ISR code, the thing was designed to work as follows: I had two sets of OUT/IN buffers, one OUT/IN pair for the even ping-pong phase, and one for the odd one. The reasoning was that I would “arm” two USB I/O operations (an OUT and an IN one) using, say, the “even” endpoint control descriptors and the even set of buffers; then, while waiting for these I/O operations to complete, I would do PCM I/O to and from the 3210, transferring data from and to the odd buffers. Then, at the next ping-pong phase, the “armed” I/O operations for the even buffers would have completed, so I could “arm” new I/O operations using the odd endpoint control descriptors and the odd buffers, while doing PCM I/O to and from the 3210 using the even buffers. And so forth.

Of course, my code provided for packet loss in the OUT direction (that is, OUT from the PC to the board, which is the input direction for the board — confusing, I know, but one gets used to it after some time). If, at the time that the ISR was expecting an OUT packet, no such packet had shown up yet, the code would just ignore the packet loss and fill the respective (even or odd) buffer with PCM “silence” (a bunch of 0xFF’s).

When writing the code, this looked like the right thing to do; however, as usual, doing the right thing turns out to be more complicated than one thought. The first question to ask is, what will happen if libusb misses one packet? So, instead of, say, sending packets 1, 2, 3, 4, 5, 6, …, libusb sends packets 1, 2, 4, 5, 6, … — what will happen then? Supposing that packet 1 corresponds to an even buffer, this is the sequence of events:

1 --> E
2 --> O
X --> E

, and then, guess what: packet number 4 will not be received, because it’s still the turn of an even OUT operation! The PIC queues up to four USB packets internally, so packet 4 is queued to be received by the next even OUT. However, the ISR tries an odd OUT and sees no packet there. Thus, the next events are

[4 -Q> E]
X --> O
4 --> E
5 --> O

This doesn’t look so bad, does it? And, it should not sound too bad either: only two milliseconds of silence, and then everything is back in order, queued a bit behind.

While all this is fine so far, what will happen if libusb delays a packet but does not miss it? In other words, what if libusb transmits instead of packets 1, 2, 3, 4, 5, 6, …, packets 1, 2, X, 3&4, 5, 6, … ? This means that packet 3 is delayed a bit, so that when the ISR checks for the packet, it doesn’t see it, but the packet is sent shortly after, within the same SOF frame, and the PIC sees it and queues it. Packet 4 then comes on-time, so no loss occurs. Here is what happens in this case:

1 --> E
2 --> O
X --> E
[3 -Q> E]
4 --> O
[5 -Q> E]
3 --> E
6 --> O

Wow! This looks quite different, doesn’t it? The effect is that all subsequent packets are reordered: each pair is now “played” swapped with respect to the intended order, like this: 1, 2, 4, 3, 6, 5, …, until eventually the same thing happens again and the order is fixed. This was the notorious bug that had been producing garbled audio (and causing me endless hours of headache and nights of fruitless debugging)!

To resolve this, I am working towards decoupling the odd-even endpoint descriptor handling from the odd-even buffers. So, if a (say) even phase OUT yields no packet, in the next round I re-try the even USB endpoint descriptor, but this time data will be pointed to the “odd” buffer (it does not really make sense to call the buffers odd and even anymore). This seems to work right most of the time, but the firmware still needs more fixes here and there. But things look (and sound!) definitely better now than before.
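The three cases above (a lost packet, a delayed packet, and the decoupled retry) can be replayed with a small simulation. This is only a sketch under simplifying assumptions that are mine and not taken from the firmware: the hardware fills the even and odd descriptors alternately, packets that find their descriptor still occupied wait in a queue, and a "late" packet arrives just after the ISR has checked in its frame. The names `simulate`, `late`, `lost` and `decoupled` are hypothetical.

```python
from collections import deque

def simulate(n_ticks, late=(), lost=(), decoupled=False):
    """Return what the ISR 'plays' on each 1 ms tick ('X' means silence).

    Packet k is sent by the host in tick k. Index 0 is the even
    descriptor, index 1 the odd one.
    """
    slots = [None, None]   # packet currently held by each descriptor
    fifo = deque()         # packets waiting for their descriptor to be re-armed
    hw_ptr = 0             # descriptor the hardware will fill next
    isr_ptr = 0            # descriptor the ISR checks next (decoupled mode only)
    played = []

    def deliver():
        nonlocal hw_ptr
        while fifo and slots[hw_ptr] is None:
            slots[hw_ptr] = fifo.popleft()
            hw_ptr ^= 1

    for tick in range(1, n_ticks + 1):
        if tick not in late and tick not in lost:
            fifo.append(tick)          # on-time packet arrives before the ISR runs
            deliver()
        # Original firmware: the phase strictly alternates with time.
        # Decoupled fix: retry the same descriptor until it yields a packet.
        phase = isr_ptr if decoupled else (tick - 1) % 2
        if slots[phase] is None:
            played.append("X")         # nothing there: play silence
        else:
            played.append(slots[phase])
            slots[phase] = None        # re-arm the descriptor
            isr_ptr = phase ^ 1
            deliver()
        if tick in late:
            fifo.append(tick)          # delayed packet shows up after the check
            deliver()
    return played

print(simulate(8, lost={3}))                  # lost packet: two gaps, order kept
print(simulate(8, late={3}))                  # delayed packet: pairs get swapped
print(simulate(8, late={3}, decoupled=True))  # retry on the same descriptor
```

Under these assumptions the delayed-packet run reproduces the 1, 2, X, 4, 3, 6, 5 pattern described above, while the decoupled variant loses one slot to silence and then stays in order.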

This was the (un)holy explanation behind a difficult and elusive bug. It certainly feels nice to leave such things behind and go on. So, what’s next? But of course it’s Linux, and Linux-wards my footsteps are driving me! But more on this in my next post(s).

Update July 26 (later on): here is an example of mis-sync between the PC and my board, as captured by the USB sniffer (you may need to click on the image to make it appear full-size):

[Image: “missync”, the USB sniffer capture discussed below]

In each row sniffed, you can see two IN packets, each starting with the “magic sequence” 0xBA, 0xBE (I call them “the babes” :-) ). Next to that, you can see a 0xDD for “odd” packets or a 0xEE for “even” ones (the change I made does not affect IN packets, so this corresponds to the ping-pong phase of the PIC USB system). Then, byte #3 (counting from zero; the first boxed column of packets is #3) mirrors the sequence number of the respective OUT packet (so, under normal conditions, it is incremented on every IN packet). Then, bytes #4 and #5 (on odd packets only) reflect the value of TMR3 at a specific point during the TMR1 ISR sequence, which occurs once before an odd packet is transmitted (the value is in little-endian byte order). Subtracting each number in this sequence from the next one yields the constant number 0x5DC0, or 24000 decimal (which, multiplied by the 83.333+ ns that each PIC clock cycle takes, yields 2 milliseconds!). The next byte (not boxed, has the value 0x5) on odd packets reflects the number of OUT packet misses since the last reset of the board. The last boxed column, byte #7 on even packets only, is an IN sequence number, which is incremented independently on each even packet.

Now, observe that the buffer contains only 31 packets, whereas it should contain 32. The 19th packet, albeit correctly an odd one, contains a wrong sequence number (thick gray box, value 0x12; the expected value is 0x52). The next packet is again an odd one! This is definitely a mis-sync or other similar issue on the side of libusb, because the firmware cannot repeat transmission of an odd packet. This second odd IN packet correctly continues the sequence of mirrored OUT packets (0x52)! In other words, no OUT packet was missed (and byte #6 of even packets is constantly 5, or else it should have been incremented). But now, subtracting 0x64F9 (TMR3 value on the last correct odd packet appearing on the right-hand side) from 0x2079 (TMR3 value on the first odd packet appearing on the left-hand side), one gets 0xBB80 once the wrap-around of the 16-bit counter is taken into account (decimal 48000, or 4 milliseconds)! This means that an even IN packet is missing altogether, and this is also proved by the sequence numbers of even packets (the boxed sequence on the left-hand side ends at 0xAB, and the one on the right-hand side starts at 0xAD).
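The timer arithmetic used in this analysis is easy to check by hand, but a few lines make the wrap-around explicit. This is a sketch that assumes TMR3 is read as a 16-bit counter and that one instruction cycle lasts about 83.333 ns (that is, 12000 cycles per millisecond), as the figures in the post imply; the helper names are made up for illustration.

```python
TCY_PER_MS = 12_000  # assumption: 1 TCy is about 83.333 ns, so 12000 TCy per ms

def tmr3_from_bytes(low, high):
    """Rebuild the 16-bit TMR3 value from packet bytes #4 (low) and #5 (high)."""
    return low | (high << 8)

def tmr3_delta(current, previous):
    """Elapsed cycles between two TMR3 samples, allowing one 16-bit wrap."""
    return (current - previous) & 0xFFFF

def cycles_to_ms(cycles):
    return cycles / TCY_PER_MS

# Normal spacing between two consecutive odd packets
print(cycles_to_ms(0x5DC0))
# The gap around the mis-sync: 0x2079 follows 0x64F9 after a counter wrap
gap = tmr3_delta(0x2079, 0x64F9)
print(hex(gap), gap, cycles_to_ms(gap))
```

The masking with `0xFFFF` is what turns the apparently negative difference into 0xBB80; it only works as long as fewer than 65536 cycles (about 5.4 ms) pass between two samples.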

In my opinion, this shows quite clearly a mis-sync situation between the board and libusb. In this case, the board has recovered without any audible distortion. What is still puzzling is why no packet appears to have been lost in the OUT direction. If, by reading this, you can suggest an explanation, I’d be happy to hear about it.

Interlude II

June 30, 2009

It’s the second time in the lifetime of this blog that development has slowed down almost to a halt. Just like in my November 2008 posts, both project-related and real-life conditions contribute to this. From the project’s standpoint, I have been offered a proposal for volunteer help, and have decided to construct a few more prototypes, so I can send over (a working) one to start development in parallel [BTW, should anyone else be interested in volunteering, please drop me a note]. So, I have swept the dust off my stash of PCBs, and ordered again the necessary materials to build three more boards (my only prototype is showing some signs of bad behavior on occasion, so building a few more prototypes to have handy does not sound like a bad idea anyway).

There is news from the real-life front as well: our newborn second son has just arrived home! As you can imagine, a lot of non-openusbfxs-related tasks keep me busy most of the day, and for the rest of the time, I have to try to fill some gaps for my real-life job. Although all this leaves little time for the project now, I hope that things will gradually settle down and I’ll have the chance to pick up the soldering iron again.

In the meantime, I have uploaded all project-related code, this time to Google Code, for people to see, and possibly comment on, use, or contribute to (I decided to move off sourceforge.net; however, the policy there is to keep code around, so the old, and possibly copyright-violating, code is still there; confusing, isn’t it?). Some of the uploading work included removing parts of code that are clearly Microchip’s and providing just a few source patches to indicate my own changes (all binary PIC code is readily available, however). That, plus adding appropriate copyright and license notes everywhere, kept me a bit busy the last few days.

In the trial-and-error front, the situation is much as I described it in my previous post. This means that I cannot get a good voice quality yet: initially, and for one or two seconds, audio comes out OK, but then it looks as if synchronization is lost or packet losses occur. Then, if the audio file is sufficiently long, it seems to synch again for a while, then I get losses again, and so forth. This is weird, because I have been mirroring the sequence numbers of the OUT packets (the ones sent from the PC to my board) into the IN packets. By observing the mirrored sequence numbers, it seems that no loss occurs. Not a single packet is missed or reordered (well… almost; see below). I cannot tell anything about strict packet transmission timing, since the sniffer does not give me any real visibility into what happens on-the-wire.

In addition, I got a weird finding, which could be related to the intermittent audio quality. During the first two libusb isochronous receive (reap) operations, the first two packet slots are empty (remember that, depending on the buffer size, a buffer consists of several 16-byte packets, and each 16-byte packet contains 8 bytes of audio data, plus other control information). It is normal to see the first two slots empty: because of ping-pong buffering on the PIC, it takes two packets’ time to get the first OUT sequence number mirrored into an IN packet. Again, this happens for the first two reap operations. However, from the third reap operation onwards, the first packet slot in each receive (IN) buffer seems to be occupied by the packet that comes next after the last one in that buffer! In other words, it seems that libusb returns the IN buffer one packet too late, while it resets some internal pointer, so that one packet more than the expected number appears, and this extraneous packet is placed at the beginning of the buffer. Unfortunately (from a debugging standpoint), the sniffer cannot tell me whether this is also happening in the OUTgoing direction: the sniffer “sniffs” data at the user-kernel interface, and not on-the-wire, so it is hard to tell what data actually gets transmitted on the wire. Obviously, this is why debugging this issue is hard.

All this remains a mystery to me. There are two plausible scenarios: either my board loses synchronization with the SOF pulse, or libusb occasionally shuffles OUT packet data, much the same way as it does with the IN packet data. I am in favor of the second scenario, not just because I am the proud designer and builder of my board, but because it makes more sense. I have measured inter-packet time using TMR3 (you have to look at the ISR code to see how), and the time is constantly 24000 PIC cycles per two packets (I am reporting the time only on even packets; or is it only on odd ones? It eludes me now), which amounts to 2 ms exactly.

On the other hand, these 2 ms are counted using the PIC’s own crystal-based clock frequency as a time reference, and drifts between that and the PC clock may well exist. But my feeling is that such drifts would not be so important as to cause a loss-of-synch condition every few seconds.

Moreover, if it were a mis-synch issue, periodic re-synch would recur at fixed time intervals. This is not the case, though: re-synch occurs at seemingly random intervals. Sometimes, many seconds of audio come out without distortion, while in other cases, I get garbled sound right away. Moreover, by fiddling with libusb buffer sizes, I have managed to get better or worse results. All these tend to put the blame on the side of libusb (or my userspace code) rather than on the board’s side.

In any case, my plan is to abandon further development using libusb and see how I can write a Linux kernel driver. Presumably, there I will have fewer undesirable effects, such as the .NET garbage collector or some libusb bug getting in my way. On the other hand, the code debugging cycle will be considerably longer. But I am not worrying about this. Time will tell.

In any case, switching to a Linux kernel driver will take some time; in addition, I need to assemble a few boards, so more time there. Hang on, though, it will happen in the end. And, let me not forget to mention that all kinds of help are greatly appreciated!

Hello, World!

May 15, 2009

For many months have I been imagining the day that I would write a post with this title.

Now that I am actually typing the words, it feels weird: I have reached one of the most important milestones in my Open USB FXS “pet”-project, and yet I suddenly realize that the road ahead might be quite longer than the road behind. This is the exact feeling I had the day I graduated high school, or the day I finished with my military service (which is mandatory in Greece where I live), and that feeling goes something like, “it wasn’t that hard, was it?” — not to mention, of course, that what’s going to follow will be definitely harder. Anyway, enough with all this crap, let’s jump into the details that I know every reader is looking forward to.

As it is obvious from the title of the post, I made my board utter its first “Hello, World!” sound! My board has passed its PCM language exam. I am not sure whether it has gotten an “A” grade (see below for more on that), but nevertheless, the simple “Hello, World!” voice message is reproduced loud and clear. How did I reach here, though?

My previous post, “time (in a bottle)”, was devoted to the details of rewriting the TMR1 ISR routine. I won’t return to those details, with one exception: the position of the FSYNC pulse. It seems that moving this to the 32nd cycle (end) of the sequence instead of the first cycle (start) of the sequence was not necessary. That was due to my hasty re-reading of the timing diagrams on p.55 of the Si3210 datasheet. In reality, the chip starts counting bytes from the rising edge of FSYNC. I did not notice this until late in my tests, and there’s more on this later on. However, from a subject point of view, this belongs to the previous post, so I thought about mentioning it first.

So, again, how did I reach here?

It seems that the most important challenge was to create a good PCM audio communication path between the board and the PC. This required understanding and implementing USB isochronous transfers. So let me start with these.

The USB standard defines various types of isochronous endpoints, namely, synchronous, asynchronous, data, and feedback. An asynchronous endpoint means that the USB device has its own clock and the host may send (in an isochronous manner) more or less data per frame to match demand (e.g., of some codec with variable bit rate). This was not my case. In contrast, a synchronous endpoint always transfers a given amount of data per frame, and the USB device operates synchronously to the USB clocking. It looked like I was in this category. A data endpoint sends data (my case), whereas a feedback endpoint sends feedback (to e.g., increase or decrease the amount of data per frame — not my case).
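For reference, the USB 2.0 specification encodes these properties in the `bmAttributes` byte of the endpoint descriptor: bits 1..0 give the transfer type, and for isochronous endpoints bits 3..2 give the synchronization type and bits 5..4 the usage type (the specification also defines an "adaptive" synchronization type, which the paragraph above does not mention). The decoder below is a small sketch of that bit layout; the function name is made up, and the value `0x0D` is only an example of a synchronous isochronous data endpoint, not necessarily what the board's descriptors contain.

```python
TRANSFER_TYPES = ("control", "isochronous", "bulk", "interrupt")
SYNC_TYPES = ("no synchronization", "asynchronous", "adaptive", "synchronous")
USAGE_TYPES = ("data", "feedback", "implicit feedback data", "reserved")

def decode_bm_attributes(value):
    """Split an endpoint descriptor's bmAttributes byte into its fields.

    The synchronization and usage fields are only meaningful for
    isochronous endpoints, so None is returned for other transfer types.
    """
    transfer = TRANSFER_TYPES[value & 0x03]
    if transfer != "isochronous":
        return (transfer, None, None)
    sync = SYNC_TYPES[(value >> 2) & 0x03]
    usage = USAGE_TYPES[(value >> 4) & 0x03]
    return (transfer, sync, usage)

print(decode_bm_attributes(0x0D))  # example: synchronous isochronous data endpoint
print(decode_bm_attributes(0x05))  # example: asynchronous isochronous data endpoint
```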

Time (in a bottle) describes what it takes for synchronous/isochronous transfers to work correctly from the viewpoint of the board: the 3210 PCM highway I/O tasks and the USB I/O had to be strictly synchronized and timing had to be rigorously checked to run equally fast on both sides. All this was done via the TMR1 ISR.

However, from the side of the PC things were a bit more complicated. The user-level API of libusb-win32 that I finally worked with required my PC “driver” program to pre-queue a buffer and wait for the transfer to finish. However, as I was soon to find out, if I waited for a queued buffer to finish and then re-queued the next buffer, the library was missing frames. So, at every point in time, I had to have not one, but two buffers queued. One buffer would be the one transmitting, and the next one would be just waiting so that no gap would occur when the first one ended. One exception to this “always-two” rule seemed natural: when a buffer was drained out, I had only one buffer queued while I was preparing the next one to be queued. Here is an outline of how this works in a loop:

prepare (buffer1);
context1 = LibUSBenqueue (buffer1);
prepare (buffer2);
context2 = LibUSBenqueue (buffer2);
current context = context1;
#transferred packets = 0;
#last time checked = 0;
while (true) {
    // how many packets of the current buffer have gone through so far?
    #packets transferred from current buffer = LibUSBcheck (current context);
    if (#packets transferred from current buffer > #last time checked) {
        increase #transferred packets by #packets transferred from current buffer - #last time checked;
        #last time checked = #packets transferred from current buffer;
    }
    if (#transferred packets == desired) break while loop;
    if (#packets transferred from current buffer == (buffer size / packet size)) {
        // current buffer is drained: release it, refill it, queue it behind the other one
        LibUSBfinishWith (current context);

        if (current context == context1) {
            prepare (buffer1);
            context1 = LibUSBenqueue (buffer1);
            current context = context2; // which has been previously queued
        } else {
            prepare (buffer2);
            context2 = LibUSBenqueue (buffer2);
            current context = context1; // which has been previously queued
        }

        #last time checked = 0; // the new current buffer starts counting from zero
    }
}
LibUSBfinishWith (current context);

This is not exactly what you could call “simple”, especially if one takes into account the details of filling in buffers with data from a file (appropriately sliced into packets with headers and payloads), plus the fact that this whole thing has to work in parallel for both read (IN) and write (OUT) USB transfers [to be honest, this requirement is not necessary for a "Hello, World!" test which sends one-way audio, but naturally one needs to make sure that both-way audio would work in principle, hence the double effort].
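To see why the second queued buffer matters, the loop can be run against a toy model of the USB side. Everything in this sketch is hypothetical: `FakeIsoQueue` is a stand-in, not the libusb-win32 API, it moves exactly one packet per simulated 1 ms frame, and refilling a drained buffer is assumed to cost the application one frame.

```python
from collections import deque

class FakeIsoQueue:
    """Toy stand-in for an asynchronous isochronous queue (not libusb-win32)."""

    def __init__(self, packets_per_buffer):
        self.ppb = packets_per_buffer
        self.queue = []     # entries are [context_id, packets_done]
        self.next_id = 0
        self.gaps = 0       # frames that passed with nothing left to send

    def enqueue(self):
        self.next_id += 1
        self.queue.append([self.next_id, 0])
        return self.next_id

    def tick(self):
        """One USB frame: send one packet of the oldest unfinished buffer."""
        for entry in self.queue:
            if entry[1] < self.ppb:
                entry[1] += 1
                return
        self.gaps += 1

    def check(self, ctx):
        return next(done for cid, done in self.queue if cid == ctx)

    def finish(self, ctx):
        self.queue = [e for e in self.queue if e[0] != ctx]

def stream(total_packets, packets_per_buffer, depth):
    """Send total_packets while keeping `depth` buffers queued; count gaps."""
    usb = FakeIsoQueue(packets_per_buffer)
    contexts = deque(usb.enqueue() for _ in range(depth))
    transferred = 0
    last_checked = 0
    while True:
        usb.tick()
        done = usb.check(contexts[0])
        transferred += done - last_checked
        last_checked = done
        if transferred >= total_packets:
            break
        if done == packets_per_buffer:
            usb.finish(contexts.popleft())
            usb.tick()                       # refilling the buffer costs one frame
            contexts.append(usb.enqueue())
            last_checked = 0                 # the next buffer counts from zero
    return transferred, usb.gaps

print(stream(16, 4, depth=1))   # single buffer: a frame is skipped at every refill
print(stream(16, 4, depth=2))   # two buffers: the second one covers the refill
```

With one buffer the model skips a frame each time the buffer is refilled; with two, the waiting buffer carries on during the refill and no frame goes by empty.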

Of course, it took days and days of trial-and-error to get the above thing working. Sometimes I was missing IN packets, and I had no idea why. Some other times, the whole thing stalled on me, and again, I had no clue as to what the error was. Anyway, as soon as I had this working, I pushed all the gory details into a “SendAudioFile” function, and tried that inside a test sequence within my driver PC program. The test sequence went like, “initialize the board, then ring the phone once, then wait until the phone goes off-hook, then SendAudioFile, then check again on/off hook status, if off-hook SendAudioFile, etc., etc.”, ad infinitum.

At the beginning, I just heard an acute sound with a steady frequency. After some thinking, I attributed that to the fact that my buffer size had changed from 64 bytes (its size in my previous ISR version) to just 8 bytes. So, whenever I was not sending any PCM data, the ISR kept repeating the same 8 bytes every millisecond, and this produced a pattern that repeated itself every millisecond. This is 1 kHz, and yes, I was listening to a 1-kHz pattern! So everything was in order there, apart, of course, from the fact that this whole thing meant that my board was not receiving any USB data.

Activating the test pattern b10101010 (or, 0xAA, if you prefer) that pre-existed in my code made the acute sound disappear; so this means that the PIC-to-3210 path was likely OK. Then, I scrutinized the ISR code, just to find that I had left in a #define for debug purposes, which was bypassing the OUT part altogether. Fixing that gave me an audible result full of noise, cracks and clicks. Definitely better than the 1 kHz steady tone; however, something was still wrong. What?

It took another day or two until I tried a longer message from the collection of Asterisk’s IVR sounds and noticed that I could distinguish periods with more and others with less noise, in the familiar pattern of a voice that speaks, then makes short pauses between words and longer ones between phrases. This led me into thinking that the audio was passing halfway through, and the culprit for that was the position of the TXD/RXD pulse stream with respect to FSYNC. So, after a better look at the timing diagrams of the PCM highway on p.55 of the 3210 datasheet, I set the RXS and TXS registers to 1 instead of zero, meaning that the audio pulse train was one PCLK-cycle later than FSYNC. And after that, in my next test, the phone rang and “Hello, World” came out, loud and clear!

It goes without saying that there are still issues. The most important issue is that audio occasionally comes out distorted, in a way very familiar to the ear of someone who has been using VoIP systems: some packets come in late. I need to do somewhat more debugging for this (e.g., increment a counter whenever no data is received in time and check its value). Then, I’ll see what remedy I can find.

Another issue (now resolved) is the acute 1-kHz sound: in periods where no actual audio data are sent, the board just kept repeating the last two packets (two, because of ping-pong buffering), which resulted in the now-well-known 1-kHz acute sound. By changing the ISR execution path where no data were received into replacing the received data buffer with 0xFF’s (u-law silence), this has now been solved.
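Why 0xFF counts as silence can be checked with a standard G.711 u-law decoder: the byte is stored inverted, so 0xFF carries a zero sign, zero exponent and zero mantissa, which expands to a linear sample of 0. The sketch below is the textbook expansion, not code from the board's firmware, and the function name is made up.

```python
def ulaw_decode(byte):
    """Expand one G.711 u-law byte into a signed linear PCM sample."""
    inverted = ~byte & 0xFF            # u-law bytes are transmitted bit-inverted
    sign = inverted & 0x80
    exponent = (inverted >> 4) & 0x07
    mantissa = inverted & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude

print(ulaw_decode(0xFF))                      # the fill byte used for silence
print([ulaw_decode(0xFF) for _ in range(8)])  # one 8-byte buffer of silence
print(ulaw_decode(0x00))                      # by contrast, the most negative sample
```

A buffer filled with zeros would therefore not be silent at all; it would hold the decoder at full negative amplitude.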

So, having reached this milestone, I’ll allow myself the few beers (and the few days off the project) that I feel I deserve. Then, I’ll probably go and fix some bugs (the packet loss issue and another annoying thing where, on occasion, back-on-hook is not recognized by the board) and move on to either first dealing with DTMF and then writing a Linux driver, or writing the Linux driver right away. Bear with me, folks! It seems that I might make it in the end! I still need to publish code, updated timing diagrams, and an updated BOM (I have been neglecting these, I know), but I’ll do it eventually.

Time (in a bottle)

May 8, 2009

If I could save time in a bottle
The first thing that I’d like to do
Is to save every day
Till eternity passes away
Just to spend them with you

This time, starting to discuss my version 2 of the ISR code, I chose the lyrics from an older song, “Time in a bottle” by Jim Croce. It feels quite like the new spec for version 2 ISR. That’s exactly what the ISR is: a series of bottled (or canned, if you prefer) prescribed time. You may check out the code of the current version of the ISR here. Lots of unfinished things still lurk around, but the code’s general shape and organization are rather close to final, unless something serious comes up.

I am not going to describe in detail what the code does here. The reasoning is much the same as in my original conception. There are four “periods” as spanned by the outer (in the sense of nested for-loops, because in its actual location in the code it’s an inner one) counter cnt4, and each period contains 8 full PCLK cycles, spanned by the “inner” counter cnt8. Each 8-cycle train is carefully profiled (and painfully debugged) to take exactly 375 TCy’s, for a full 32-PCLK-cycle train that takes 1500 TCy’s, or equivalently, 125 microseconds (which yields a PCLK of exactly 256 kHz).

Special actions are taken when cnt8 loops over to zero and for the whole 8-cnt8-cycle train when cnt4 is zero. During the first 8 cnt8 cycles of the 32-PCLK-pulse train, PCM audio I/O between the PIC and the 3210 takes place. When cnt4=1 and cnt8=0, the DRX (sent from the PIC to the 3210) and DTX (received by the PIC from the 3210) bytes are placed in an 8-byte buffer (actually it’s a 16-byte buffer, and data start to be placed with an initial offset of 8) directly in USB-reachable memory. This yields tremendous time savings because the very same buffers are then directly transmitted and received over USB at times cnt4=2 and cnt4=3, respectively (once 8 bytes have been collected, for DTX, or drained out, for DRX). One more change from the original version is that FSYNC is now pulsed when cnt4=3 and cnt8=7, that is, at the 32nd cycle.
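The schedule just described can be written down as a small lookup. This is a Python sketch of the description only, not a transcription of the assembly; the function names and the action labels are invented here, and the exact cnt8 value at which the USB hand-offs happen is not stated in the post, so the sketch attaches them to the whole cnt4 period.

```python
def slot_index(cnt4, cnt8):
    """Position (0..31) of a PCLK cycle within the 32-cycle train."""
    return cnt4 * 8 + cnt8

def slot_actions(cnt4, cnt8):
    """Labels of the special actions scheduled at (cnt4, cnt8)."""
    actions = []
    if cnt4 == 0:
        actions.append("PCM bit I/O with the 3210")
    if (cnt4, cnt8) == (1, 0):
        actions.append("move DRX/DTX bytes to/from the USB buffers")
    if cnt4 == 2:
        actions.append("USB IN window (DTX data, once 8 bytes are collected)")
    if cnt4 == 3:
        actions.append("USB OUT window (DRX data, once 8 bytes are drained)")
    if (cnt4, cnt8) == (3, 7):
        actions.append("pulse FSYNC")
    return actions

# Print the whole train, one line per PCLK cycle that has something to do.
for cnt4 in range(4):
    for cnt8 in range(8):
        acts = slot_actions(cnt4, cnt8)
        if acts:
            print(slot_index(cnt4, cnt8), acts)
```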

Happily, at 256 kHz, filling in an 8-byte buffer with the above schedule takes exactly one millisecond. This is veeeery convenient, because this is exactly the period at which a USB host sends Start-of-Frame (SOF) packets to the board. A few nanoseconds after the SOF, the board responds with an isochronous IN (from the board to the host) USB packet, which contains 8 bytes’ worth of PCM data. That is, 1 millisecond worth of data gets transmitted every 1 millisecond, and this happens synchronously with the 3210’s clock and the USB timing SOF signal. Couldn’t hope for any better, could I?
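For the record, the timing arithmetic behind these numbers can be checked in a few lines of Python. All inputs come from the text (375 TCy per 8-cycle period, four periods, 125 microseconds per 32-cycle train, 8 bytes per packet); the only assumption is that one PCM byte is exchanged per 32-cycle train.

```python
TCY_PER_PERIOD = 375      # instruction cycles per 8-PCLK period
PERIODS = 4               # cnt4 periods per train
PCLK_PER_TRAIN = 32       # PCLK cycles per train
TRAIN_NS = 125_000        # duration of one train: 125 microseconds
BYTES_PER_PACKET = 8      # assumes one PCM byte per train

train_tcy = TCY_PER_PERIOD * PERIODS                    # 1500 TCy per train
tcy_hz = train_tcy * 1_000_000_000 // TRAIN_NS          # implied instruction rate
pclk_hz = PCLK_PER_TRAIN * 1_000_000_000 // TRAIN_NS    # PCLK frequency
packet_us = BYTES_PER_PACKET * TRAIN_NS // 1000         # time to fill one packet
tcy_per_packet = BYTES_PER_PACKET * train_tcy           # TCy between packets

print(train_tcy, tcy_hz, pclk_hz, packet_us, tcy_per_packet)
```

The implied instruction rate of 12 MHz and the 12000 TCy per packet are the same figures that show up again when timing the ISR with a free-running timer.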

Using directly the USB memory as buffers is feasible only thanks to a buffering technique that the PIC supports, called “Ping-Pong buffering”. In Ping-Pong buffering mode, the PIC uses even- and odd-rank Buffer Descriptors for each endpoint (in each direction, IN or OUT). Then, one set of IN/OUT transactions uses the even buffers, the next one uses the odd ones, and so on. This complicated things a bit in my code, but not very much. In return, one gets a tremendous speedup, because useless memory copies are avoided altogether. I felt very lucky that this mode existed in the first place, because otherwise I am afraid that the time in the bottle would not suffice.
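The idea can be pictured with a toy model. This is a Python sketch under simplifying assumptions: the class name is invented, there is a single endpoint direction, and the "ownership" flip is just an index toggle, whereas on the real chip it goes through the buffer descriptors.

```python
class PingPongEndpoint:
    """Toy model of one endpoint direction with an even and an odd buffer."""

    def __init__(self, size=8):
        self.buffers = [bytearray(size), bytearray(size)]  # [even, odd]
        self.cpu_side = 0  # index of the buffer the ISR may touch right now

    def isr_buffer(self):
        """Buffer currently owned by the CPU (filled or drained by the ISR)."""
        return self.buffers[self.cpu_side]

    def usb_buffer(self):
        """Buffer currently owned by the USB engine."""
        return self.buffers[self.cpu_side ^ 1]

    def swap(self):
        """Hand the filled buffer to USB and take the other one: no copying."""
        self.cpu_side ^= 1


ep = PingPongEndpoint()
filled = ep.isr_buffer()
filled[:] = b"ABCDEFGH"      # ISR collects 8 PCM bytes
ep.swap()                    # ownership flips
print(ep.usb_buffer() is filled)
```

The point of the model is the last line: after the swap, the USB side sees the very same object the ISR filled, so no memory copy is involved.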

A nice trick too is the initial synchronization between the 3210 FSYNC and the USB signals. For that, I had to tweak a bit the USB firmware provided by Microchip. The default firmware works in polling mode, and checks asynchronously for various events. One of these events was the SOF interrupt. In order to synchronize the ISR and the SOF signal, I needed to have complete control of the SOF flag from within the ISR, so I had to comment out an instruction in Microchip’s USB firmware that resets the flag. Obviously, my handling SOF in the ISR renders useless some other parts of Microchip’s firmware too, like the provided SOF callback function.

Probably I need to mention that debugging this thing was a loong, painful experience. Finally I resorted to TMR3, which can also run, just like TMR1, at the pace of the program counter (Fosc/4). I fixed the value of TMR3 so that it read zero when the ISR started, and then moved a debugging block around the ISR to verify my expected ‘@+xx’ values. Finally, after everything looked OK, I copied the value of TMR3 onto the IN USB packet and made sure that each value I was receiving had the same offset from the previous one, and that offset was exactly 12000 as expected (just in case you are wondering, this took me three whole days to conceive, plan, implement, test, debug, correct, re-try, etc. — I would never like to do that again, never…).
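The host-side check on the received TMR3 values boils down to a modular subtraction. Here is a minimal Python sketch, assuming TMR3 is read as a free-running 16-bit counter (so consecutive stamps wrap around at 65536); the function names are made up for illustration.

```python
def tmr3_delta(prev, cur):
    """Ticks elapsed between two 16-bit timer stamps, allowing for wraparound."""
    return (cur - prev) & 0xFFFF

def all_on_schedule(stamps, expected=12000):
    """True if every packet is exactly `expected` ticks after the previous one."""
    return all(tmr3_delta(a, b) == expected for a, b in zip(stamps, stamps[1:]))


# Ten ideal stamps, one per millisecond, as they would appear after wrapping.
ideal = [(i * 12000) & 0xFFFF for i in range(10)]
print(all_on_schedule(ideal))
```

A single stamp that is one tick early or late makes `all_on_schedule` return False, which is the kind of slip the debugging block was meant to catch.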

Bottom line: although the song goes

But there never seems to be enough time
To do the things you want to do
Once you find them

, I think that this time I squeezed just enough time in the bottle to keep my board drinking PCM data forever. As of this writing, the actual audio functionality is still untested, but I’ll test it soon and report back by updating this post. In the meantime, readers (?? if any…) can enjoy browsing through the TMR1 ISR code (GPL-licensed, so freely usable in their own projects).

I just hope to see — that is, to hear — all this code working fine in practice. Otherwise, I risk winding up like the old, white-haired mad scientist from Episode 7 (was it?) of the Muppet Show, who prepares himself a series of weird-colored, steaming potions and drinks them one after the other, each time getting younger and younger, under the melody of “Time in a bottle” (definitely a must-see); until, of course, he drinks the last potion, which turns him back into an old man…

Rewriting the firmware…

April 27, 2009

[Note: as of now, I have not yet fully finished rewriting the firmware. However, during the process, I have run into a vast amount of interesting information, which I am sure I will forget in the end. So, I decided to document the rewriting process in parallel within this post. Probably, when I finish up, I'll edit out this first paragraph.]

I decided to rewrite the firmware because of the following test: one second’s worth of 64kbps-encoded PCM data sampled at 8kHz using 8-bit values (be it linear, μ- or A-law) consists of 8000 bytes. Using 8-byte chunks, this corresponds to 1000 chunks of data being exchanged between a USB device and the PC within a single second. So, I decided to measure how long my current implementation takes to transfer this amount of data from the PC to the Open USB FXS board. And the result was: 32 whole seconds.

In other words, I would need to make my board perform 32 times faster. The current firmware used USB polling and request servicing in a serial manner, so I judged that this was never going to work. I had to make drastic changes.

One thing that I could not help noticing right away was that I had been using the old version of Microchip’s firmware (1.x); however, the company has moved to version 2.x, which is very different in its philosophy. Version 2.x is much cleaner and more abstract than 1.x; the framework files are used from the compiler’s tree and not copied over to the user code (actually, this is not a framework change, but rather an IDE enhancement); the cumbersome multi-level directory structure of 1.x is now gone. Porting my code to the new version proved easy enough, although it took me quite some time.

Porting to 2.x came with a pleasant surprise: the firmware was much, much faster now. My previous test (sending over 1 second’s worth of PCM data) now took only 2 seconds instead of ~32. One good explanation is that the new version uses USB-specific RAM as buffer space, thus eliminating data copy operations between USB-specific and “user” RAM. Later, I tried to both send and receive the same amount of data; this took slightly more than 4 seconds. Which, as I was soon to find out, was the best one could hope for.

Soon after, I started scratching my head about isochronous USB transfer. I read and re-read the standard and tried to find something in the firmware and/or the C++ examples from Microchip that could serve me as an example. In the course of studying isochronous transfer, I found out why my four seconds was the lower bound for bulk transfer: the USB Bulk transfer mode requires an ACK for each data packet, and a stream of packets is only initiated after a SOF (start-of-frame) packet. The USB host sends one SOF per millisecond. So, sending over a data chunk of 8 bytes and receiving back the upstream equivalent takes four milliseconds (at best): one for the OUT data, one for the ACK back to the host, one for the IN data and one for the ACK back to the device. It was clear that I would never be able to perform faster than that, unless I increased my chunk size. A bare minimum of 32 bytes per chunk was needed (four milliseconds’ worth of audio at 8 bytes per millisecond). Not what I had in mind.
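The four-second bound and the 32-byte minimum follow from the same bit of arithmetic, sketched here in Python. It takes the post's model at face value (four 1-ms frames per OUT/IN exchange) rather than describing how a USB host actually schedules bulk traffic; the function names are made up.

```python
import math

BYTES_PER_SECOND = 8000   # 8 kHz, one byte per sample
FRAME_MS = 1              # one SOF per millisecond

def transfer_seconds(total_bytes, chunk, frames_per_exchange=4):
    """Time to move `total_bytes` each way when every exchange costs 4 frames."""
    exchanges = math.ceil(total_bytes / chunk)
    return exchanges * frames_per_exchange * FRAME_MS / 1000

def min_chunk_bytes(frames_per_exchange=4):
    """Smallest chunk that keeps up with real time under the same model."""
    return BYTES_PER_SECOND * frames_per_exchange * FRAME_MS // 1000

print(transfer_seconds(8000, 8))    # one second of audio in 8-byte chunks
print(min_chunk_bytes())            # chunk size needed to break even
print(transfer_seconds(8000, 32))   # same audio in 32-byte chunks
```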

This made me search further for isochronous USB transfer. This mode does not require handshake (ACK packets). Again, the upper bound is one packet per SOF frame, but this was exactly how much I needed. Maybe I could provide something like a self-clocked synchronous transfer method as follows: the firmware provides a SOF callback function, which is invoked on every SOF. Then, inside that function, I would send out a chunk using an isochronous IN packet (note that “IN” and “OUT” are host-side terminology, so a USB device sends IN packets and receives OUT ones), and I would receive an isochronous OUT packet. Lost packets would never incur any timeouts etc. On the host side, I would wait for the IN packet (clocked at the rate of one packet per ms by the pace of the SOF) and reply with an OUT packet. Later on, once this tested OK, I would get rid of the ring data structures in my ISR and arrange the ISR code to work directly with the USB packet buffers without copying (and also to synchronize with the SOF 1-kHz “heartbeat”).
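The self-clocked scheme is easy to simulate. The following Python sketch is only a model of the plan described above, with invented names; it uses no USB library and simply steps through one IN and one OUT chunk per frame, dropping selected IN packets to show that a loss leaves a gap but never stalls the stream.

```python
CHUNK = 8  # bytes of PCM per 1-ms frame

def run_frames(mic, speaker, lost_in_frames=()):
    """Exchange one IN and one OUT chunk per frame; return what each side got.

    mic            -- bytes the board sends to the host (IN direction)
    speaker        -- bytes the host sends to the board (OUT direction)
    lost_in_frames -- frame numbers whose IN packet is dropped (no retry)
    """
    frames = len(mic) // CHUNK
    host_rx, board_rx = [], []
    for n in range(frames):                      # one iteration per SOF
        chunk = mic[n * CHUNK:(n + 1) * CHUNK]
        host_rx.append(None if n in lost_in_frames else chunk)
        board_rx.append(speaker[n * CHUNK:(n + 1) * CHUNK])
    return frames, host_rx, board_rx


frames, host_rx, board_rx = run_frames(bytes(range(16)), bytes(16), {1})
print(frames, host_rx)
```

Whatever gets lost, the number of frames (and hence the elapsed time in milliseconds) depends only on the amount of audio, which is the whole point of the isochronous mode.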

My tests using the Microchip Generic USB host-side device driver all failed. As I found out soon thereafter, this device driver does not provide isochronous transfer primitives. Using bulk data primitives instead caused the primitives to halt waiting for an ACK which never came; then, of course, the primitives were timing out. No, definitely not good. I had to look further.

Googling the subject brought me some interesting discussions. Mostly this thread on the Microchip users forum site shows the efforts of some good men to make the PIC speak isochronous. In a nutshell, they all seem to agree that (1) both the PIC and the firmware can do fine with isochronous transfers, (2) Microchip’s Generic Host-side device driver is inadequate because it does not support isochronous transfer primitives. One example also mentioned libusb-win32 (available here) as an isoc-capable driver. So, I tried changing the device driver in my host-side controller.

[BTW: the Linux kernel supports isochronous transfers, so I am on the safe side with that - maybe it's time to move to Linux? We'll see...].

Apart from some minor issues (that I overlooked for the time being), porting my host-side Windows source to libusb-win32 was not that hard. However, the tests I tried produced mixed results. Many times the IN isochronous primitives were failing, and there was no good explanation for that. If I only did IN without OUT transfers, then one in every two primitives worked OK, while the other one failed. When intermixing OUT and IN transfers, the IN success rate was much, much lower, whereas the OUT success rate stayed at 1:2. As of now, I have not yet resolved the issue.

In the meantime, I re-read the thread I mentioned above and noticed that some people report success using the Cypress CyUSB host-side device driver. This is another direction in testing: I would need to port my code once more, this time to the Cypress driver. Having done this once, I thought I might be able to do it again.

Update, April 29: partial success using libusb-win32! One thing I had gotten wrong was that libusb expects isochronous transfer requests to be submitted using large buffers that the library fills in (or drains out, for the OUT direction) at its own pace [BTW, checking out CyUSB, it seems to work in a similar manner]. An interesting question then is, how can the user-level program synchronize with these asynchronous primitives? Obviously, one cannot wait until a large buffer is filled in and then proceed, since this would defeat the purpose of using small-size isochronous transfers. Libusb offers a nice (though totally undocumented) way of doing it: one can ask the driver about the amount of data transferred so far (with the usb_reap_async_nocancel() primitive); if the number returned is larger than last time, the user program can safely assume that one more chunk has been transferred.
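The polling trick amounts to watching a byte counter grow. Below is a Python sketch of just that bookkeeping; it makes no libusb calls at all, the list of readings stands in for successive answers from the driver's progress query, and all names are invented for illustration.

```python
def new_chunks(prev_bytes, now_bytes, chunk=8):
    """Whole chunks completed between two readings of the byte counter."""
    return now_bytes // chunk - prev_bytes // chunk

def watch(progress_readings, on_chunk, chunk=8):
    """Call `on_chunk` once per newly completed chunk; return the total count."""
    prev = 0
    total = 0
    for now in progress_readings:
        for _ in range(new_chunks(prev, now, chunk)):
            on_chunk()
            total += 1
        prev = now
    return total


ticks = []
count = watch([0, 0, 8, 8, 16, 32], lambda: ticks.append("chunk"))
print(count, len(ticks))
```

Note that a reading that jumps by more than one chunk (16 to 32 above) triggers the callback twice, so a slow poller still accounts for every chunk.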

A delicate point with this method (and one of the library’s inadequacies) is that, if the IN-pipe misses a packet for some reason, then the returned value is not updated (although the pointer in the provided buffer is incremented, thus a missed packet creates an unidentifiable “hole” in the buffer). Even after receiving the whole buffer, other than using packet integrity checksums etc., one cannot really tell where the “hole” is, because there is really no signaling mechanism from the driver to the user space to inform on the event that a packet was missed.

So, I devised a method whereby I query the OUT-pipe for its progress, and assume that the IN-pipe will be ready at the same pace. This seems to work, in that the OUT-pipe is synchronized with the SOF frames and thus the transmitted byte count is incremented at a steady pace. This is nice, in that every millisecond I get the chance to run the relevant user-level code that checks for IN-packet data in a near-synchronous manner.

However, my success with this method was only partial: using a USB sniffer program I observed many missed IN-packets. It is really hard to tell what causes this. There are two suspects: the firmware and the driver. The firmware might be late in transmitting packets, or the driver may miss them altogether, although they are transmitted OK. A reason for the first scenario is the large portion of time that the PIC spends in the TMR1 ISR, so it might respond late to the SOF. Clearly, I need more trial-and-error work here; I’ll report again later, by updating this post. [Nevertheless, I do not feel so worried about all these little problems; a driver in kernel space should be able to handle isochronous transfer in much more efficient ways; for the time being, I only need to make sure that the culprit is not my firmware].

Quick update, later on the same day: bypassing my ISR (by adding an immediate return instruction) makes the above test work fine with libusb! Now I can even rely on usb_reap_async_nocancel() from the IN pipe catching all packets, so the isochronous IN-pipe can be used reliably for timing purposes in the user code! So I guess the explanation is that, because the USB is serviced in poll mode and the ISR in its current form takes much time, the firmware gets to transmit the isochronous IN packets late some of the time. But, as I have already said somewhere earlier, I need to rewrite the ISR anyway and get rid of much of the code in there (ring buffer management and so on), so I’ll probably come up with an acceptable tradeoff between ISR time and USB polling. If not, I can still devote some ISR cycles to fire up the isochronous USB transfer from within the ISR in a synchronized manner (see next paragraph).

Moreover (and this is something I have been keeping in mind, while carefully sweeping it under the rug so far), I need to synchronize the ISR to the SOF frames anyway. This is because I will need to swap USB/PCM buffers around (preferably using the PING-PONG buffering method of the PIC, which I still need to try out) after the first 8 PCLK-cycles of my 32-cycle ISR (remember that PCM audio I/O between the PIC and the 3210 occurs during these first 8 cycles, so buffers had better stay untouched at that stage). All this sounds like quite a lot of work; only this time it seems doable, whereas up to now, things felt more like a survival-in-the-jungle bootcamp. The “Hello, World” milestone now seems closer than ever before!

Update, May 7: lots and lots of re-planning, strategy changes, rewrites from scratch and finally, isochronous INs (from the board to the PC) work fine! To cut a long story short, I decided to quit my early ideas of mix-n-match between interrupt and USB polling, and to rewrite my TMR1 ISR from scratch in order to make it fully capable of handling isochronous PCM I/O. Early trials were disappointing, in that I saw no packets at all coming from the board. However, when I added code to synchronize once between the SOF frame and the IN isochronous packets, I started seeing some packets on the PC. With the aid of a USB sniffer, I noted that, sooner or later, my ISR was missing the right time frame to send a packet. This led me to a very tiresome and difficult debugging of my ISR, until finally I trimmed it not to miss a single clock cycle. It now works fine! So I plan to write a post dedicated to the ISR code, while in the meantime I will continue with my PCM audio trials.

Halfway through getting PCM to work

April 13, 2009

At the time I started writing this post, I did not have a sure answer as to whether the board’s PCM was working correctly; however, debugging was fun, so I decided to start writing without finishing first. To take things in order, as soon as I got the board to ring a phone set, the next thing I did was to augment the functionality of the controller so as to display direct and indirect registers together. Here is the result (the rightmost cluster of values are the indirect registers — and, yes, I know a descriptive text label is missing there, but I was too lazy to add it).

Controller capture, now with indirect registers

After finishing with that (and re-soldering the crystal, which decided right then to break loose on one of its pins, giving me almost a heart attack when the board suddenly died on me), I turned to PCM audio. The test scenario I had in mind was to make the board reproduce an audio message on the phone. To that end, I implemented two more firmware functions, one for sending and one for receiving chunks of PCM audio data (there are lots of things to discuss here, but I am leaving this discussion for a bit later). Supposedly, these functions would write(/read) data  to(/from) the output(/input) ring(s) in the ISR area, respecting the input and output ring pointers so as to not overwrite any data. Then, I wrote a piece of code in my controller program which would open a file with PCM μ-law data and use the new function to send these over USB. As simple as that.

Am I catching anyone by surprise here by saying that this didn’t work? I guess not… Leaving aside a really stupid bug (in the first tries, an off-by-one error resulted in a 0xFF ‘RESET BOARD’ command being sent over, so instead of producing audio, the thing was rebooting!), nothing but “line noise” could be heard on the phone. A quick first glance through the 3210 datasheet revealed I had forgotten to set DR 1 to 0x28 (set PCME bit to enable PCM audio). After that, the phone started producing a clicking sound that did not even remotely resemble the original audio. OK, better than nothing, I agree, but still not what I wanted.

I then played with the firmware, instructing the board to ignore the data sent via USB and just send out zeros. The clicking sound should have disappeared, but it did not. Looking carefully through the firmware code, I found that I had incorrectly configured the DRX and DTX PIC ports (DRX is the receive PCM path for the 3210, not for the PIC). Fixing this (it took me two tries, because these were also wrong in the TMR1 ISR assembly code) did it: the clicking sound disappeared. Good! [I am not sure what was causing the clicking sound. My most plausible explanation is that, since both the 3210 and the PIC were placing the DRX line in high-impedance state, the line was acting like a small antenna and collecting noise from some other nearby signal on the board.]

Then this time PCM ought to work, right? Well, it did not. So I decided to preload the output ring with some data. This did not fix it either. This suggested that my ISR code was not sending out PCM data correctly. I then replaced all BSF (bit set) and BCF (bit clear) instructions driving the DRX line with BTG (bit toggle) ones. Of course, the result would not be audible, because this represents a constant value (0b10101010, or 0xAA) being sent to the 3210; however, the resulting pattern should be easily distinguishable on the oscilloscope. But what I saw on the glass was certainly not what I was expecting. Here are the PCLK and FSYNC signals. The resolution is such that more than two full ISR cycles are displayed:
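To see which byte the toggle test actually sends, here is a tiny Python model. It assumes the line is toggled once before each of the 8 bits and that bits go out MSB first; both are assumptions of the sketch, not facts taken from the ISR code, and the function name is made up.

```python
def toggled_pattern(start_level=0, bits=8):
    """Byte produced when the data line is flipped once per bit, MSB first."""
    level, value = start_level, 0
    for _ in range(bits):
        level ^= 1                       # what a bit-toggle instruction does
        value = (value << 1) | level     # the level sampled for this bit
    return value


print(hex(toggled_pattern(0)), hex(toggled_pattern(1)))
```

Starting from a low line the byte is 0b10101010 (0xAA); starting from a high line it is the complement, 0x55. Either way the scope should show a clean square wave at half the bit rate during the PCM slot.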

PCLK signal

FSYNC signal

(You may wish to click on the pictures to examine them in full size.) Here is the puzzling DRX signal:

DRX signal

This was certainly looking wrong. To remind you about my TMR1 ISR, the code was supposed to distinguish four ‘phases’ within a FSYNC period. All PCM I/O should occur during the first phase; instead, it seemed that the ISR code responsible for the first phase was executing three times. Back in the PIC datasheet, I found my bug: I was checking the C (carry) status bit after decrementing a counter value from zero to 0xFF with a DECF instruction; instead, I should have checked the N (negative) status bit. Fixed that, re-checked with the scope, and voilà!
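A small model shows why the two status bits behave so differently on a countdown. This Python sketch assumes the PIC18 convention that, for a decrement, C acts as an inverted borrow (C is cleared only when the result wraps below zero); that assumption should be checked against the datasheet, and the function names are invented here.

```python
def decf(value):
    """Model of an 8-bit decrement; returns (result, C, N).

    Assumption: C = NOT borrow, N = bit 7 of the result.
    """
    result = (value - 1) & 0xFF
    carry = 0 if value == 0 else 1
    negative = (result >> 7) & 1
    return result, carry, negative

def flags_over_countdown(start=3):
    """Decrement from `start` until the counter wraps; list (C, N) per step."""
    value, seen = start, []
    for _ in range(start + 1):
        value, c, n = decf(value)
        seen.append((c, n))
    return seen


print(flags_over_countdown(3))
```

Under this assumption, over the four decrements of a 3-to-wrap countdown C is set three times and N exactly once. A branch keyed on C being set would therefore run in three of the four phases, which at least matches the symptom seen on the scope, while N singles out the wrap from zero to 0xFF (for counters this small).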

Fixed DRX test signal

Of course, I would not leave without measuring the DTX signal (PCM input from the phone to the 3210 to the PIC). Here is what this looked like:

DTX signal

There are two things to note here. The first is that the actual data seems to consist of a constant 0xFF pattern (which seems OK, since in u-Law encoding this corresponds to a decoder output of zero). The second noticeable thing is the ramp-like pattern to the right of each 0xFF logic-true. This can be explained by the fact that during non-transmission periods, the 3210’s DTX pin goes tri-state, and the respective PIC input is also tri-stated. So, this pattern probably corresponds to a high-frequency signal while the energy captured in the transmission line between the 3210 and the PIC gradually discharges through a high-impedance path to the GND level.
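That 0xFF decodes to zero can be verified with the usual G.711 u-law expansion. The Python function below is a sketch of that standard decoder (bias 0x84, inverted bits), written here for illustration; it is not code from the board or the controller.

```python
def ulaw_decode(byte):
    """Expand one G.711 u-law byte to a signed linear sample."""
    u = ~byte & 0xFF                 # u-law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


print(ulaw_decode(0xFF), ulaw_decode(0x00), ulaw_decode(0x80))
```

So an idle line reading a steady 0xFF is exactly what digital silence looks like, and any zeros appearing in the data bits while speaking correspond to nonzero samples.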

What seemed encouraging here was that, when I spoke to the phone, I was able to note some “noise” in the data part, with the ramp-like part consisting of lower-placed ramps. This fits nicely with the above theory, since actual u-Law data contains some zeros, corresponding to the “noise” in the data part and an equally lower ramp-like “trajectory” to the GND level being displayed thereafter. So, the PCM receive path (although called DTX, it is the receive path) was rather working, although I was not collecting any data yet.

What about the transmit path? Well, that was not ready yet. It took another day or two until I noticed that the underrun test condition I had provided for in the ISR code was not ever taken care of. In order to avoid echoing, I had also provided for an execution path that stops transmitting PCM if a data underrun condition is detected. When removing that, I finally heard the 125-Hz (which is 1 / 8ms, consisting of a repeating pattern of test data) test sound I was expecting. But what about the actual audio data? Well… The pace at which the controller currently sends data to the board is very slow.

Actually, this is the interesting part! In contrast to writing, say, a flash disk driver, where I would be able to use 1-kB blocks, in PCM audio one needs to pass around small “chunks” of data in an isochronous manner. If the chunks get too large, then there will be considerable (and audible) delay introduced in the audio path. If the chunks get too small, however, an overrun or underrun condition becomes more likely to occur. So, what is the best chunk size there, given the actual processing power of the PIC? And, most important of all, is the PIC fast enough to cope with these requirements?
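The two sides of the tradeoff can be put into numbers. This is a back-of-the-envelope Python sketch for 8 kHz, one-byte-per-sample audio; it only covers the buffering delay of a single chunk and the resulting transfer rate, not any delay added by the USB stack or the PC, and the function names are made up.

```python
SAMPLE_RATE = 8000  # samples (and bytes) per second

def chunk_latency_ms(chunk_bytes):
    """Delay added by waiting for one full chunk to accumulate."""
    return chunk_bytes * 1000 / SAMPLE_RATE

def chunks_per_second(chunk_bytes):
    """How many transfers per second each direction must sustain."""
    return SAMPLE_RATE / chunk_bytes


for size in (8, 32, 1024):
    print(size, chunk_latency_ms(size), chunks_per_second(size))
```

A 1-kB block would add 128 ms of delay per direction, while 8-byte chunks add only 1 ms but require a transfer every millisecond, which is where the question of the PIC's speed comes in.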

To answer these questions, I took a fresh look at the USB code sample by Microchip, upon which my code has been built. It seems that the sample code is not really very efficient. Lots of functions call other functions, which in turn copy data around: very wasteful. The PIC can perform at warp-speed when doing USB transfers, using parallel dedicated hardware. But then, data are placed in a special RAM location, and the sample code copies them to “user space”, freeing the buffer back to the USB hardware. Thus, although I am not sure yet, it looks like I can save myself lots and lots of wasted PIC cycles by throwing away my transmit and receive rings and interfacing directly between USB memory and my ISR.

Another, more mundane problem that has escaped me so far is that the FSYNC pulse needs to be shifted one PCLK earlier: I am raising and lowering PCLK within the first ISR cycle, but the 3210 datasheet on p.16 (and other places) is clear: PCM transfer starts at the first rising edge of PCLK after the falling edge of FSYNC. Hmm… This means that I need to move the code that pulses the FSYNC to the 31st cycle of my ISR.

Summarizing here, the good news is that the PCM path of the board works both ways (well, so to speak…). So, it is time to look more closely at the actual firmware of the board and optimize the transfer paths, while at the same time keeping an eye on zaptel compatibility. It cannot be that hard, can it? So, I hope to be back soon with even better news!

Update, April 15: there seems to be a PIC USB configuration, called “ping-pong buffering” which seems to fit nicely my needs (although I am not yet entirely convinced about that). That thing works by using odd and even-numbered buffers, whose ownership is alternated between the CPU and the chip’s USB engine with a single bit change (in other words, very fast). More can be found in the PIC 18F2550 datasheet, p.177. Currently, I am studying that in parallel with Microchip’s USB stack code to check how easy it will be to adapt the sample USB stack code provided by Microchip into what I need.

Ring-ring!

April 6, 2009

Yes, you guessed it right! The title of this post means that I finally got my openusbfxs board to make a phone set ring! But please let me take things in order.

At the point where I had left things in my previous post, I was trying to implement one-by-one the initialization and calibration steps described in p.3 of Silabs’ AN35 application note. Most steps were relatively easy; however, when performing the gain mismatch (manual calibration), I noticed the error due to R7 mentioned at the end of my previous post. After fixing that, I tried to move on with common-mode calibration (steps 17–19 of AN35). However, there I was getting a calibration error.

In order to debug the issue a bit further, I tried to bypass this step and see what would happen if I set the line mode (register 64) to the “forward active” state. What actually happened was that the 3210 objected to that, and kept stubbornly the line mode in the “open” state. Hmm…

A quick look into the Si321x FAQ turned up the same question (second question on p. 6 of the FAQ). Unfortunately, it did not get me the correct answer, since my DC-DC converter values were allegedly OK, and I had just finished with manual calibration. What was the reason then?

For almost one week thereafter, I ran just every test I could come up with against the board. First, I tried to bypass automatic return to open state (set AOPN bit in DR 67 to zero). This produced a very interesting result: when I attempted to bring the line to “forward active” mode, the DC-DC converter was auto-shutting down. I found no way to instruct it not to, so I had to find another way of figuring out what was wrong.

Enabling power alarm interrupts (DR22 <- 0xFF) enlightened me a bit more, in that I saw in DR 19 that I was getting a power interrupt because some (or, on occasion, even all) of Q1, Q2, Q3, Q4, Q5 and Q6 were sensed to dissipate too much power. OK, it was clear then: I had to fix the initial values of Indirect Registers (IRs) 32–34 and 37–39. I have to admit that I had borrowed clues for these values from the zaptel driver, so it made sense trying to find the correct values for my case. In my board I am using the 3201 and not discrete transistors, so I had no idea what the power and thermal coefficients should be for that. The answer was well hidden in the bottom of p.4 of Silabs’ AN47, where some values are suggested (the same as for SOT89 transistor packages).

At this point, debugging should have ended. Well, it did not, and the reason was a stupid error of mine, as I am explaining in the next paragraph. So setting the correct values for Q1–Q6 did not solve it; thus, after measuring countless hours with the voltmeter, devising tens of test sequences in my driver code, trying pumped-up values for IRs 32-34, and just about everything else I could imagine, I finally chose to change-ineer the si3201, just to make sure it was not burnt or something. Of course, as usual, change-ineering did not do it either (advice: choose this method as your last resort, and only if you cannot find anything else to do: it will not fix anything, but it will make you feel better because at least you tried it).

In deep despair, I swore strict abstinence from debugging my board for the whole of last weekend. Guess what: it seems that this method did it: today (Monday), taking a fresh look at my code, I finally found the culprit: because of a stupid copy-paste error, I was not initializing any Indirect Registers correctly (all values were written into the same IR)! BTW, this is my second copy-paste bug that has taken me this long to find and fix. [It seems I must not copy-paste any more code and promise to type in every single bit. Statistically, this will save me weeks of fruitless debugging.] Anyway, right when I fixed this, everything worked magically. So now I passed steps 16, 17, 18, 19, 20 and 21 of the AN35 calibration and initialization procedure. It then sounded like a good idea to run a few tests with a phone set.

The board initialized OK when I connected the phone set I have at work (a Siemens euroset 2010) to the RJ11 jack. After initialization, taking the phone off-hook is detected in DR 68 (however, putting the phone back on-hook is not, so I need more work there). The “usual” line noise of a POTS phone line is heard from the phone’s earphone when DR64 is set to 0x01 (“forward active” mode). Moreover, with its current settings, my board can make the phone set ring! Just setting register 64 to 0x04 does it! Wow! That was actually my first milestone, back in year 2008 when I first started designing the board! I can’t really believe it took me six months or so to get here!
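In controller terms, ringing the phone is a single register write followed by a read-back. The Python sketch below uses a fake in-memory board so that it runs on its own; `FakeBoard`, `write_direct`, `read_direct` and `set_line_mode` are all hypothetical names, not the controller's real functions. Only the register number (64) and the two mode values (0x01 and 0x04) come from the text.

```python
LINEFEED_REG = 64        # direct register 64: line mode
FORWARD_ACTIVE = 0x01    # "forward active" mode
RINGING = 0x04           # makes the phone set ring

class FakeBoard:
    """Stand-in for the real board: just remembers register writes."""

    def __init__(self):
        self.regs = {}

    def write_direct(self, reg, value):
        self.regs[reg] = value & 0xFF

    def read_direct(self, reg):
        return self.regs.get(reg, 0)

def set_line_mode(board, mode):
    """Write the line mode and report whether the chip actually accepted it."""
    board.write_direct(LINEFEED_REG, mode)
    return board.read_direct(LINEFEED_REG) == mode


board = FakeBoard()
print(set_line_mode(board, FORWARD_ACTIVE), set_line_mode(board, RINGING))
```

The read-back is there because the chip may refuse a mode change and keep its previous line mode; a controller that only writes would never notice.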

There are still, however, some signs that I don’t like. The board does not seem to understand when the phone goes on-hook again. The (absolute) values of the voltage produced by the DC-DC converter go far below the nominal 65V during operation, and I don’t know if this is normal. So, during the next (few, I hope) days I will have to deal with these issues and correct any bugs I find.

As soon as this is finished, it will be PCM’s turn: I will try to have my board produce the good old Asterisk “Hello, World!” message on my phone. Keeping in mind my current progress pace, a date for that next milestone should not be expected anytime sooner than year 2010 :-) . This time, however, my hope is I’ll not piss off the Gods of hardware as much as I have been until now, so that they will help me reach this next milestone somewhat faster. We’ll see.

You might just as well ask yourselves what’s down the road. Well, I think that, once (read: if ever) the board gets into a stable state, it will be an easy step to write a zaptel-compliant driver and see how the board will do with Asterisk. This then, if ever accomplished, will be the end of development for this project.

I will update this post as soon as I have the updated versions of the TMR1 interrupt code, fixed board, schematic, BOM, etc. uploaded.

Quick update, April 8: about the reduced VBAT value in the forward active mode: this is OK, since this is exactly what setting TRACK to 1 does. In this “loop current tracking mode”, the chip provides just the necessary voltage to drive enough current through the loop, which presumably results in lots of power savings. So I need not worry about this. What’s not OK though is the back-on-hook non-detection – but I have not started debugging this yet.

Another quick update, April 9: after setting register 67 to its default 0x1F, the board now correctly detects the transition from off-hook back to on-hook (don’t ask me why, I cannot see any plausible reason for this).

Chasing an elusive bug around

March 15, 2009
Since my last post, I have been quite busy debugging my board. There seems to be more than one problem in the implementation of the DC-DC converter circuitry. However, this time I have gathered enough evidence on the bug; whatever it is, it looks like it is not the fault of the PIC, nor is the 3210 the culprit. My most educated guess is that the 5VDC for both VDDD and VUNREG defines a rather marginal working environment for the converter. Add to that the tolerances of various components that set up the bias for various transistors and the sensing circuitry, and you have my exact situation.

BTW, Silicon Labs’ application note AN45 states that the MOSFET+transformer edition of the converter is preferable for low values of VUNREG, but of course I am that type of excellent engineer who likes to first design, then build, and finally read the application notes (I trust you’ve met guys like me before, or, still worse, you are the exact same kind of engineer, so you know exactly what I am referring to…). Thus now I had two choices: either debug my board, or re-design it from scratch using the MOSFET version (which also sounds more expensive, so it defeats my initial $10 purpose…). Of course, I chose the first option! But let me take things in turn.

My first bug proved easy to spot. A difference in the voltages of the two sense pins of the 3210, SDCH and SDCL, led me to spot a bad “via” (actually, this had been corrected before my previous post, or else the converter would have never worked in the first place, not even intermittently).

Then, I started exploring my way, starting from the 3210 and testing onwards, with L1 as my final destination. Let me begin by introducing the schematic of the BJT version of the DC-DC converter, copied from the si3210’s datasheet:

DC-DC converter schematic

From the few things I understand about electronics, this is how the thing works: the chip drives the converter through DCDRV (a square pulse signal, with frequency and duty cycle as configured using the chip’s control registers). When a logical “1” appears on DCDRV, Q8 is driven into saturation, which then causes a current of (VUNREG - VCE)/(R16+R17) to pass through Q8, where VCE is the voltage drop between Q8’s collector and emitter (VUNREG is shown as VDC on this version of the schematic). With current values of R16 and R17, this should cause a voltage in the order of VUNREG - 1V to be applied to Q7’s base, driving in turn Q7 into saturation. Then, current flows through Q7, “charging” L1 with energy; then, when DCDRV is switched to logical “0”, Q8 and Q7 stop conducting, and L1 discharges through D1. This negative current is “stored” in C9, which also normalizes the “spiky” voltage produced by L1 discharge, producing the output voltage VBAT (-65V in our case). Should be as simple as that…

However, there are two more things to discuss: one is the DCFF signal, which connects to Q7’s base through C10. What does this do? In our version of the si3210, the DCFF pin produces the same signal as DCDRV, inverted; this signal, fed through C10 to Q7’s base, increases the converter’s efficiency by compensating for the capacitive load of Q8 and thus making Q7 switch off faster (as per AN45, “the capacitor, C10, provides additional charge pump boost current from the DCFF pin of the Si321x to turn Q7 off faster”). Still as per AN45, “C10 with a value of 22 nF is sufficient for most applications” (AN45, p.5); however, as you can witness yourself, the example converter schematic from the si3210 datasheet shows a value of 100nF for this same C10.

The next thing to discuss is the voltage and current feedback circuitry, which allows si3210 to monitor the status of the converter and its output VBAT voltage. However, I’ll discuss this a bit later.

I first debugged the signals around Q8 and Q7. Because of the unexpected behavior that I saw around Q7 (see the oscilloscope shots later on), I first suspected that the ratio of the resistors R16 and R17 was inadequate to drive Q7 into saturation. Thus, I experimented with various values (soldering resistors in parallel or replacing existing ones), with limited success. For example, I tried R17=426 Ohms (instead of its nominal 450 Ohms value), and R16=220 Ohms (instead of 200). By “limited success”, I mean that in my first experiment (R17=426 Ohms), the si3210 occasionally did report some current flowing through Q7 and Q8 in its respective registers; however, register 82 seldom showed anything (and, when it did, it showed a much larger value than expected, around -90V, which led the chip to quickly turn off the converter in order to protect it from burning). The second experiment, with R16=220 Ohms, did not produce anything remarkable: the converter never worked with this resistor value; however, on the positive side, the chip was not sensing any overflow, so it did not reset the converter, and now I had a steadier condition to debug (observing the circuit’s behavior through converter reset sequences was hard, because it involved transient behavior with much guesswork on when, why, and for how long some signals were changing or disappearing altogether).

So now I measured the various signals using an oscilloscope (from a video repair lab in my daytime job, whose employees I have to thank for tolerating me so many hours). In the pictures below you can see the interesting results.

PCLK signal

FSYNC signal

Here are the two PCM bus signals, PCLK and FSYNC. They are not shown in the same time scale: to show that PCLK has a duty cycle of less than 50% (remember a 5-PIC-cycle bug that I need to correct in my TIMR1 code?), I needed just a few PCLK cycles to appear on the screen, whereas the FSYNC pulse is too sparse (one pulse every 64 PCLKs) to fit in the same time scale. BTW, can you see the actual FSYNC pulse? It’s just a small “spike” on the left of the screen.
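
To see how a 5-cycle slip in the ISR skews the clock, here is a minimal Python sketch. The 50 instruction cycles per PCLK period used in the demo are an assumed round figure chosen for illustration, not a value measured on my board, and the function name is made up.

```python
def pclk_duty_cycle(period_cycles, early_drop_cycles):
    """Fraction of the PCLK period that the line spends high.

    A symmetric clock stays high for half of the period; dropping the line
    `early_drop_cycles` instruction cycles too soon shortens the high time
    by that many cycles.
    """
    high_cycles = period_cycles / 2 - early_drop_cycles
    return high_cycles / period_cycles


if __name__ == "__main__":
    # 50 cycles per period is an assumption for illustration only.
    print(pclk_duty_cycle(50, 0))  # symmetric clock
    print(pclk_duty_cycle(50, 5))  # PCLK dropped 5 cycles too early
```

The shorter the PCLK period is in instruction cycles, the more a fixed 5-cycle slip distorts the duty cycle.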

The PCLK and FSYNC signals have nothing really to do with my debugging the DC-DC converter; I just thought it would be nice to show them (plus, I wanted to be sure that the 3210 is driven correctly). So, back to the DC-DC converter now.

DCDRV signal

The image on the right shows the output of the DCDRV pin, fed into the base of Q8. The frequency is something near 80kHz, and the duty cycle is as set in 3210’s control registers. So, everything is fine there, let us move on.

Q8 emitter

This image on the right shows what is happening at the emitter of Q8, right above R17 (the time scale is not the same as in the previous picture, so do not worry about this). As Q8 is switching on and off, current flows on and off through R17, so a pulse is generated at Q8’s emitter. However, there is something worrisome: the height of the pulse seems to be 5V (the viewing scale on the scope is 2V per div). According to my analysis above, this should be R17 x (VUNREG - VCE)/(R16+R17), which yields something in the order of 3 to 3.5V; well, if this is 5V, then what the heck is happening at Q8’s collector? Let us see right away…
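
The expected pulse height can be checked with a few lines of Python. This is a back-of-the-envelope sketch under stated assumptions: Q8 is saturated, R16, Q8 and R17 form a plain series path from VUNREG to ground, Q7’s base current and C10 are ignored, and the VCE values tried are typical guesses rather than measurements. Since the probe sits right above R17, the voltage of interest is the drop across R17.

```python
def q8_emitter_pulse(vunreg, vce, r16, r17):
    """Return (current through Q8 in amperes, voltage at Q8's emitter).

    Assumes a simple series path VUNREG -> R16 -> Q8 -> R17 -> ground,
    ignoring Q7's base current and the DCFF/C10 branch.
    """
    current = (vunreg - vce) / (r16 + r17)
    v_emitter = r17 * current  # drop across R17, measured against ground
    return current, v_emitter


if __name__ == "__main__":
    # Nominal values from the post: VUNREG = 5V, R16 = 200, R17 = 450 ohms.
    # The VCE values below are assumed, not measured.
    for vce in (0.2, 0.3, 0.5):
        current, v_emitter = q8_emitter_pulse(5.0, vce, 200.0, 450.0)
        print(f"VCE={vce}V  I={current * 1000:.2f}mA  Ve={v_emitter:.2f}V")
```

With these inputs the model stays well below the 5V seen on the scope, which is why the measurement looked suspicious.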

Q8 collector

Wow, what’s that? Looks like something oscillating very quickly (the oscilloscope cannot really display a visible oscillating signal in any time scale) from 5 to 6V! Where did the additional 1V come from? The only possible explanation is the DCFF signal. But how come this signal adds up to 6V? A possible explanation could be that some inductance is introduced in series with C10 from some badly designed copper path on my PCB, acting as a small DC-DC up-converter. How could I verify this? But, before going on with debugging, I thought I’d measure the output of Q7. The result is shown below.

Q7 collector

This is pretty damn interesting, don’t you agree? But I have not the slightest idea what it means. My next step was then to remove C10 altogether, so that the DCFF signal is disconnected. Allegedly this signal is there only to “help Q7 switch off faster”. So, I thought I’d leave Q7 “helpless” for a while. As soon as I did that, I measured -65V at the anode of D1! However, I made the change at home, and I need to scope the board again to show pictures of what changed in the converter’s signals.

Why could this be happening? I do not really know, but I have a few plausible explanations. First, the DC-DC converter is designed for VUNREG>VDDD, which means that the pulses around Q8 would be something around 10V, whereas the DCFF signal would be 3.3 or 5V, so it would introduce a small 0.5-1V jitter, not as significant as in my case. Second, as I said before, it may be a bad PCB design (yes, I checked C10 and found it OK). Or, it could be that I need a lower-value capacitor for C10. Or, I may need to add a low-value capacitor in the order of 10pF between Q8’s collector and the signal ground, to create a low-pass filter that would eliminate the high frequencies apparently introduced there. We’ll see. For the time being, I am happy with Q7 working without DCFF’s “help” (although it gets somewhat hot, it seems to still work within its temperature limits without burning).

Moreover, this was not the end of my pains: although the converter is producing -65V, the 3210 still does not measure anything in register 82. Why? Probably this is a bug in the values of R28 and R29 (you have to consult my schematic to see these). I have to debug this further, plus I have to measure the signals anew with the oscilloscope. Well, had I known all this beforehand, I would have never started this project. But now I am midway through, and it would be a cowardly action to give up, wouldn’t it? So, let’s move on with debugging! Hang on, brave readers, probably this adventure will eventually come to an end!

Update shortly after the post: thanks to Klaus, who provided this valuable comment, I quickly found why the 3210 was not sensing the -65V. The cause was a bad “via” right next to R5. I had checked all “vias” one by one, but this one probably was a bad contact working on occasion, or else I would have never observed 65V before through the chip’s sense. Anyway, this is a big relief. Thanks a lot, Klaus!! This was really valuable help!

Update, March 20 2009: I tested soldering C10 back in. It turned out that there is nothing wrong with the signals measured as shown above, and that the DC-DC converter works fine. Moreover, after re-soldering C10, the signals at the various test points as shown on the oscilloscope are quite the same as in the shots above, so there’s nothing wrong there. Plus, DCFF through C10 really helps Q7 work without getting hot, so it does a valuable job after all. Why hadn’t I measured -65V before then? I don’t really know. Maybe I just did a bad debugging job (perhaps I had not tested the voltage at the cathode of D1?). Maybe the chip did not sense -65V and was periodically restarting the converter? I have not the slightest idea. Anyway, I won’t grieve about it; I’m happy that this all ended and I am on to my next steps.

Update, March 26 2009: the last few days I have been busy progressing with the initialization and calibration sequence of the 3210 as suggested in AN35. During these tests, I hit another stupid bug of mine. I noticed that the manual calibration routine for the RING gain mismatch (cf. AN35, p.3) was not completing successfully: I was writing 0x1F to DR 88 as suggested, and then stepwise decrementing that down to zero; however, DR 88 kept displaying 0xFF, the largest possible value (for sure not close to the desired value of zero, right?). This led me into re-checking my schematic. Guess what: there is an off-by-a-factor-of-10 error there: whereas R7 should have a value of 4.02kOhms, I have erroneously set a value of 40.2kOhms for it. So, I am going to replace R7 and re-try. I really do hope that I will not have any other major bugs. After this bug fix, I also owe a new version of the schematic and BOM with the correct value for R7, which I’ll post soon after making sure that changing R7 to the correct value fixes the problem. One more thing, too: meanwhile, among all my other debugging attempts, I fixed a bug in my PCLK TIMR1 ISR code (I was dropping PCLK 5 cycles too soon within the ISR code, and this is apparent in the PCLK oscilloscope shot above, which shows a duty cycle of ~40%). So now I have a PCLK which is much better than the one in the shot above. Thus, I also owe you the new, changed ISR code. Lots of TODOs…
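
The step-down search described above can be sketched in Python as follows. This is an illustration of the loop only, not the AN35 routine itself: `write_reg` and `read_reg` are hypothetical callables, `FakeSlic` is a toy model invented for the demo, and the register written in the demo (98) is my assumption and should be checked against AN35; only DR 88 is taken from the text. A real routine would probably also wait a little between the write and the read.

```python
def manual_calibrate(write_reg, read_reg, cal_reg, monitor_reg, start=0x1F):
    """Step the calibration value down from `start` until the monitor reads 0.

    Returns the first value for which read_reg(monitor_reg) == 0, or None if
    no value in start..0 gets there (the failure described in the post).
    """
    for value in range(start, -1, -1):
        write_reg(cal_reg, value)
        if read_reg(monitor_reg) == 0:
            return value
    return None


class FakeSlic:
    """Toy stand-in for the chip, for demonstration only.

    The monitor reads 0 once the calibration value is at or below
    `balance_point`, and 0xFF otherwise.
    """

    def __init__(self, cal_reg, balance_point):
        self.cal_reg = cal_reg
        self.balance_point = balance_point
        self.regs = {}

    def write(self, reg, value):
        self.regs[reg] = value

    def read(self, reg):
        value = self.regs.get(self.cal_reg, 0xFF)
        return 0 if value <= self.balance_point else 0xFF


if __name__ == "__main__":
    CAL_REG, MONITOR_REG = 98, 88  # 98 is an assumption; 88 is from the post
    healthy = FakeSlic(CAL_REG, balance_point=0x0A)
    print(manual_calibrate(healthy.write, healthy.read, CAL_REG, MONITOR_REG))
    # A board whose monitor never reaches zero, as with the wrong R7.
    broken = FakeSlic(CAL_REG, balance_point=-1)
    print(manual_calibrate(broken.write, broken.read, CAL_REG, MONITOR_REG))
```

Returning `None` instead of silently accepting the last value makes a hardware fault like the wrong R7 show up immediately.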

Update, March 26 2009 (later): putting in the correct value for R7 (~4.1k, made from two E12 8.2k resistors in parallel) fixed it! The RING gain mismatch calibration now works OK! So, I am moving on…
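
As a quick sanity check on the replacement value, here is the parallel-resistor arithmetic in Python. It assumes the two resistors are the standard E12 value of 8.2k and ignores their tolerances; the helper name is made up.

```python
def parallel(*resistors):
    """Equivalent resistance, in ohms, of resistors connected in parallel."""
    return 1.0 / sum(1.0 / r for r in resistors)


if __name__ == "__main__":
    nominal = 4020.0                   # R7 value that the schematic should have
    actual = parallel(8200.0, 8200.0)  # assumes two E12 8.2k resistors
    error = (actual - nominal) / nominal
    print(f"R7 = {actual:.0f} ohms, {error * 100:.1f}% off nominal")
```

The result is within a couple of percent of the nominal value, which was evidently close enough for the calibration to complete.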

That benign ghost

February 17, 2009

Back again!

Progress during the last two months has been terribly slow, because of too much stress in my so-to-speak daytime job — a demanding project there turned it into a day-and-night occupation for quite a few weeks. But I haven’t dropped my openusbfxs project (or at least, not yet :-) ). Thus, there are two things that I have managed to progress since my last post.

controllercapture

The first thing is that I have looked thoroughly through the direct register (DR) values that my 3210 returns to the controller program (shown in my last post, and repeated above as a thumbnail). Don’t quote me on this, however I was unable to find anything wrong with these values. If I am interpreting the 3210’s words correctly, it seems that the chip thinks that everything is fine, but still does not drive the DC-DC converter. Why is that so?

So I looked a bit more into what is happening in my second prototype board, this time using an oscilloscope. I was happily surprised to find that the 256 kHz PCLK and the FSYNC signals initially looked OK (at least, at first glance; this was actually the first time I had checked my board with a scope). What turned out to be in worse shape than expected was the PIC itself.

In my previous post I mentioned something about a benign ghost playing tricks on my board. An indication of that was a mysterious flickering on the status LED. The oscilloscope revealed the ugly truth: after a few minutes of operation, when the PIC warms up a bit (nope, it doesn’t get hot, it just warms normally), it seems to destabilize and produce a noisy signal instead of a steady logic-one. Thus, what previously used to be an on-off signal (the signal driving the status LED), after a while becomes a sequence of noise-then-silence cycles. As the circuit remains in operation, the problem keeps worsening, resulting in a highly noisy pattern instead of a logic-one.

Why did this not show up in normal operation, but only in bootloader mode? The answer is that it did show up in both cases. However, normal operation has a much longer LED flashing period than bootloader mode. So, in normal operation the LED was flickering during a longer time and the human eye was tricked into seeing it as if it were constantly lit; whereas the much shorter flashing period in bootloader mode (about 5Hz) did not provide enough time for the LED to become “incandescent” (light up at full power), so the flickering was more apparent to the eye. A more careful inspection after scoping the board revealed that the flickering was also visible under normal operation.

What’s even worse is that this noisy logic-one signal is not confined to the LED-driving signal; it also appears on the PCLK line. Thus, from an initial 256-kHz series of logic-ones and logic-zeros, the PCLK signal slowly turns into a 256-kHz succession of noise-then-logic-zeros. My guess is that this explains perfectly well why sometimes the 3210 fails to operate its SPI interface correctly and returns unexpected values or values consisting of all-ones.

Since this error does not occur until the PIC is a bit lukewarm (plus I could not see anything suspicious on the crystal oscillator pins), my guess is that something is likely wrong with the PIC itself. Thus, I have ordered a couple of new PICs in order to replace the one on-board. However, to be on the safe side, I’ll first try replacing the oscillator crystal; that is quite a bit easier and faster than replacing the PIC, so why not try it after all.

In any case, I think that an educated guess is that the noisy PCLK signal is what inhibits the 3210 from operating correctly its DC-DC converter driver. But time (plus replacing suspected components) will show.

I hope that this time I’ll be back posting much sooner, and that I’ll be bringing good news. Till then, do wish me a successful debugging session. Have I told you how much I hate bugs? Yes? Well, I wish they hated me as much as I hate them! Alas, this is not the case…

Update Feb 24 2009: after replacing the crystal, and as far as I can tell from visual inspection of the LED, the noise in the logic-one signal seems to have vanished; so, it seems that the culprit was the crystal and it was not the poor PIC’s fault after all. Nevertheless, fixing the logic-one signal did not fix the issue of the DC-DC converter not functioning. I am going to re-inspect the board with an oscilloscope to help me in locating the next bug in the queue. It may be that the 3210 was damaged during soldering and needs to be replaced (right now, this seems to me the most plausible explanation).

2nd update Feb 24 2009: after examining the board with a scope, it seems that this time there is a clock drift in the PCLK line. Last time, although I had observed noise, the clock itself was perfect and there was no drift at all. Seeing a clock drift now leads me to suspect the PIC again (probably its PLL?), so I’m going to replace it in any case.

Update Feb 25 2009: I have replaced the PIC; however, register 82 still remains stuck at zero. I’ll re-scope the board to check how the clock signals around the PIC are looking and will update the post later on.

2nd update Feb 25 2009: ok, the clock now looks fine; however, the 3210 still declares itself “unwilling to perform” (if the phrase rings a bell to you, it’s an LDAP protocol error message). My next suspect then is the 3210 itself. Although it should, it does not produce any frequency to drive the DC-DC converter circuitry. So, I’ll try replacing that and see what happens.

Update Feb 26: nope, it was not the 3210. The replacement 3210 behaves exactly the same as the previous one [remember that I am not an electronics expert at all, so this simple "changineering" strategy (replace suspected components until the problem is confined so that it becomes apparent) is all I can do for now]. I have observed a change in the voltage of SDCH (pin 8 of the 3210) when DR 14 is set to zero. So my next step will be to note down carefully voltages around the DC-DC converter circuitry in both states (DR 14 = 0x10 and 0x00) and see where this leads me.

2nd update, Feb 26 (and last for this post): partial success! I managed to see 65V again (then lost it). By observing voltages, I noted an inconsistent value in SDCL (5V on the 3210 pin and 4.2V elsewhere in the circuit), which led me to fix a bad via underneath the 3210. The 3210 now drives Q8 correctly; however, Q8 does not seem to let enough current through R16 to drive Q7 into its conducting state. So, the next thing to try is replacing Q8. Remember that this intermittent operation was the exact same behavior that my first prototype had; so I am now suspecting a bad choice of some specific component (and Q8 is a good candidate).