A trip in the dry lands of maintenance and compatibility

March 16, 2013

Super hackers and the VM potential

After an admittedly-long-time-no-see period, I decided to add this new post about compatibility with various Linux flavors. The reason is that, throughout all this period, while I did not make any progress on Open USB FXS (and FXO either), the globe kept spinning, and new versions of things kept popping up. New versions of the Linux kernel, new versions of Dahdi, and of course new versions of the Linux distributions that I had been writing about. All this otherwise welcome wave of progress was driving into obsolescence my older instructions for building and installing the Open USB FXS Dahdi driver.

I have to admit that this project tends to require a sort of “all-in-one” hacker to understand and build. One must know about electronics, have experience with SMD technology, get involved with USB intricacies, understand VoIP issues, and (worst of all) be knowledgeable in the Linux kernel and device drivers, not to mention a requirement for moderate experience with the Unix/Linux shell, editors, file system structure and commands. Although some of the fans that this project has attracted through its lifetime fitted this profile, quite a few did not. Both sides have been represented: electronics fans who did not know much about Linux, and Linux gurus who did not wish to get their hands dirty with soldering. People who did not wish to get involved with chasing and soldering louse-sized components have (or used to have, since my stash is now almost extinct) an alternative: try to get their hands on a ready-made board. This indeed covered one end of the spectrum. However, the other end was not so easy to satisfy: people with experience in electronics (or a great love for it, or both), who felt at ease with soldering irons, SMD, flux, magnifying lenses and multimeters but had little or no experience with Linux, were facing problems. What distribution should they choose? What to put in dkms.conf? Why did the compilation result in an error on distro X.13.1.20-2? And so on.

I wish I could provide a way to help these users. Obviously, I could not (and still cannot) ship a ready-made PC or other embedded system with Linux and my drivers installed, and even if I could, the cost would not be justified. One of my thoughts had been to provide a ready system, built as a VM, for people to download and experiment with. I do not know if people find this a good idea, but it has some shortcomings. The main problem is that, as documented through various posts in this blog, VMs do not perform well enough: an isochronous driver on a VM tends to miss USB slots, and this results in bad audio and other issues. Thus the usefulness of a preconfigured VM would be limited to a proof-of-concept or testing environment, where people would be able to test their self-made Open USB FXS board. The actual value of the VM concept aside, I was (and still am) terrified by the prospect of maintaining separate Linux distros as VMs, as per user requests. What if I provide, say, a Debian VM, and then someone comes in saying “please, please, do also provide an Ubuntu VM”? And what if I do provide an Ubuntu VM for version X.Y.V-12 of Ubuntu and then version X.Y.V-13 pops up, so I need to align?

For the exact same reasons, in my previous posts I had refrained from giving out instructions alleged to fully work on a given version of distro X. What if I provided good and valid instructions for a snapshot of distro X, and distro X then fired out an update (say, a kernel update) that would break it all? [And mind you, this is no science fiction; I found this to be exactly the case with Ubuntu 12.04 LTS. Not with my driver. With their own provided ones. Read on for the details.] Obviously this would not work. Linux users are, and have always been, required to a certain degree to take control of their spaceship and perform some system housekeeping and maintenance themselves (including some googling and guesswork) to keep up with issues like these, especially when it comes to distributions that consider spitting out untested updates on a daily basis to be good maintenance practice.

All the above fail to include some nasty — though I hope imaginary — support requests that come to my mind, like “I need to have VMWare tools installed”, or “my VMWare version does not support your VM”, or “your VM image does not see my USB devices”, or whatever else. No, no, please, take this cup of suffering away from me…

Nevertheless, if there are people who find all this crazy VM business a good idea (and promise not to ask me to keep up to date with the crazy rate at which Ubuntu and others fire updates of their systems), then I might go into that. Only for demo and proof of concept purposes, right? And I will not provide anything more than a couple of demo VMs and your favorite distro or version thereof may not be among them. And they may not work on your VMWare player or whatever. And I will just provide them as-is and they may be outdated. And if you bring them up and let them install updates from their distro repository and this breaks everything I shall not be the one to blame. And I will not provide support or updates for them. Agree? If yes, please post a comment or mail me and let me see what I can do about VMs.

Still believing that the best idea would be not to provide VMs but just good instructions to help people compile and install the drivers on their own bare Linux system (no VMs), I went on to test the status of my driver with several systems and kernel versions.

Debian 6.0.6 (“squeeze”)

Debian used to be very straightforward in the past, and proved to still be so. My main concern, which lasted only about ten minutes or so, was to install all the required packages. I reconstructed the list of packages I had selected by hand from dpkg’s log file, so I may be forgetting something, but essentially you need the following: the toolset to build kernel drivers, the dahdi binaries and the dahdi source, subversion (to check out my oufxs driver) and libusb-dev (because some programs in my driver depend on it). So, you need to install gcc (if it isn’t there), dahdi, dahdi-linux and dahdi-source, kernel-package, linux-headers (and maybe linux-source, I am not sure if this is required anywhere), subversion and libusb-dev. Just use “apt-get install package”, where package is one of the above packages. Then,

root# cd /usr/src
root# mkdir dahdi-src
root# cd dahdi-src
root# tar jxf ../dahdi.tar.bz2
root# cd modules/dahdi/drivers/dahdi
root# svn checkout http://openusbfxs.googlecode.com/svn/trunk/LinuxDahdiDriver/oufxs
root# cd ../..
root# make SUBDIRS_EXTRA+=oufxs
root# make SUBDIRS_EXTRA+=oufxs install
root# modprobe oufxs
root# lsmod | fgrep oufxs

, and there you have it. (Note that the long “svn checkout” line above may appear wrapped in your browser, so do not copy-paste blindly into the command shell.) The last ‘lsmod’ line is not strictly required; it is just for you to see that the module has been insmod’ed correctly. You should do the rest of the configuration and testing as described in my other blog posts (i.e., for installing Asterisk, setting up startup scripts and default module options, testing with fxstest, etc.).

Ubuntu Workstation 10.04 LTS

Ubuntu 10.04 is also straightforward. In terms of dependencies, things are very similar to Debian squeeze. Apart from the obvious packages required (gcc etc., which I think get installed by default), one needs dahdi-source, dahdi-linux and dahdi-dkms (these depend on each other, so you may get away with just one of them), subversion and libusb-dev. When installing dahdi-dkms, dpkg (the Debian-originating package installer) invokes dkms, so your system now already has dahdi installed through dkms. The next steps are then to (a) tell dkms to remove the Ubuntu-supplied dahdi driver from the dkms tree (and from the kernel), (b) download the oufxs driver, (c) fix dkms.conf and (d) rebuild dahdi the dkms way, this time with oufxs in. So, here we go:

root# dkms remove -m dahdi -v 2.2.1+dfsg-1ubuntu2 --all
root# cd /usr/src/dahdi-2.2.1+dfsg-1ubuntu2/
root# vi dkms.conf

Now, you have to go to line 5 (or so), starting with MAKE[0]=, and make that infamous and dreaded change to SUBDIRS_EXTRA (in the past, I have provided some not-so-accurate instructions, thus people kept on asking how to make this change). The line must now read (note again the line wrapping; this should be one line in your editor):

MAKE[0]="make modules KERNVER=$kernelver MODULES_EXTRA='opvxa12
00 wcopenpci dahdi_echocan_oslec' SUBDIRS_EXTRA='../staging/ech
o zaphfc/ oufxs' ;make;make firmware-loaders;echo : > drivers/d
ahdi/vpmadt032_loader/.vpmadt032_x86_32.o.cmd;echo : > drivers/
dahdi/vpmadt032_loader/.vpmadt032_x86_64.o.cmd;make"

Be careful to include oufxs inside the quotes, like this: SUBDIRS_EXTRA='../staging/echo zaphfc/ oufxs'. Then, go to the end of the file, and add the following three lines before the “AUTOINSTALL=yes” line:

BUILT_MODULE_NAME[35]="oufxs"
BUILT_MODULE_LOCATION[35]="drivers/dahdi/oufxs"
DEST_MODULE_LOCATION[35]="/kernel/drivers/telephony/dahdi/oufxs"

Note that the number 35 above is the number next to the last one used by the ubuntu-provided dahdi source. In my case this was 34, so I picked 35. If Ubuntu update their dahdi-dkms package to add or remove drivers, this number may change and you will have to adapt accordingly. Then, you have to perform the following steps:

root# cd /usr/src/dahdi-2.2.1+dfsg-1ubuntu2/drivers/dahdi/
root# svn checkout http://openusbfxs.googlecode.com/svn/tru
nk/LinuxDahdiDriver/oufxs oufxs
root# dkms add -m dahdi -v 2.2.1+dfsg-1ubuntu2
root# dkms build -m dahdi -v 2.2.1+dfsg-1ubuntu2
root# dkms install -m dahdi -v 2.2.1+dfsg-1ubuntu2
root# modprobe oufxs
root# lsmod | fgrep oufxs

Notice again the line wrapping on the “svn checkout” command above — this should be one line.
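
The index arithmetic for the three oufxs lines added to dkms.conf can be sketched in a few lines of Python. This is only an illustration, not part of the driver or of dkms: the function names and the two sample module names are made up for the example, and it assumes that dkms.conf lists its modules as BUILT_MODULE_NAME[n]= entries, as in the Ubuntu-provided file described above.

```python
import re


def next_module_index(dkms_conf_text):
    """Return one more than the highest BUILT_MODULE_NAME[n] index in the text."""
    pattern = re.compile(r'^\s*BUILT_MODULE_NAME\[(\d+)\]=', re.MULTILINE)
    indices = [int(m.group(1)) for m in pattern.finditer(dkms_conf_text)]
    # With no existing entries, numbering starts at 0.
    return max(indices) + 1 if indices else 0


def oufxs_entries(index):
    """Build the three dkms.conf lines for oufxs, using the given index."""
    return [
        'BUILT_MODULE_NAME[%d]="oufxs"' % index,
        'BUILT_MODULE_LOCATION[%d]="drivers/dahdi/oufxs"' % index,
        'DEST_MODULE_LOCATION[%d]="/kernel/drivers/telephony/dahdi/oufxs"' % index,
    ]


# Hypothetical tail of a dkms.conf file; the module names are placeholders.
SAMPLE_CONF = '''BUILT_MODULE_NAME[33]="some_module"
BUILT_MODULE_NAME[34]="another_module"
AUTOINSTALL=yes
'''

if __name__ == "__main__":
    for line in oufxs_entries(next_module_index(SAMPLE_CONF)):
        print(line)
```

If the distribution later adds or removes drivers in its dahdi-dkms package, rerunning such a check against the new dkms.conf would give the index to use instead of 35.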

Ubuntu 12.04 LTS: the story

OK, let us now come to the latest-and-greatest version of Ubuntu. Lots and lots have changed in this version. Starting from the kernel version and extending to the screen resolution, the incompatibility of their own packages with their own updated kernel version, and not forgetting that synaptic is no longer installed by default, yielding its place to the “Ubuntu Software Center”, this version of Ubuntu is guaranteed to cause more headaches with Open USB FXS. So, here is how my story goes with 12.04. I hope you will find useful information along the way.

To begin with, I do not have a host of spare PCs to test on three different OS’s. To tell the truth, I do not have even one spare PC; I do all my work on VMs. This time, I started by installing Ubuntu 12.04 LTS on a new VM, but I had to interrupt, and thus I suspended the VM in the course of installation. This is guaranteed to work on other OS’s, but on this one, resume gave me the following insightful message:

To revert to this snapshot, you must change the host pixel format to match that of the guest. The host’s current settings are: depth 24, bits per pixel: 32. The guest’s current settings are: depth 24, bits per pixel: 32. Error encountered while trying to restore the virtual machine state from file “…\Ubuntu12.0.4LTS.vmss”.

No action other than a power-reset of the VM was possible at this stage, and of course, since I had paused the VM halfway through the initial installation, this gave me a black screen. Starting from scratch, I got 12.04 installed properly without pausing the VM in the process. But then, at some later point, I had to pause, and resume gave me the same error. A little googling got me to this post, which accurately described my problem and pointed to the right (partial) fix. I am saying “partial” because it seems that the user must be logged on for this fix to work, but nevertheless, this was a great boost in productivity, compared to having to reboot each time I needed to pause VMWare.

So I let the update manager do its own work. Funnily enough, the first few times it tried, the update manager complained about being unable to verify the signatures of Ubuntu’s own packages. A bit of googling and reading on this gave the answer: just give the damn thing some time, and it will work. Indeed, it seems that 12.04 LTS and its update manager need to fish the signature keys from somewhere, and this somewhere takes time to respond or isn’t always up or whatever; I really did not care to find out the exact truth. I just noted the lousy user experience of bringing up a system that fails to update itself, complaining that it does not know the signatures of its own vendor’s packages, but I said, anyway, what the heck. Letting it live for ten minutes or so got me the keys out of the blue, and the update manager did its thing and updated the system, kernel included. Wow, now this system runs 3.5.0-25! Yeah right… just read on to the next paragraph.

The next surprise was synaptic. It was gone. OK, I admit that I could have done my work using apt-get; however, I am a fan of synaptic because it helps with searching through packages. Strangely enough, I was able to find dahdi in the “Ubuntu Software Center”, and I selected it for download, and… guess what: dahdi failed to install. But this was crazy! An Ubuntu-provided package fails to work on an Ubuntu-provided and Ubuntu-updated system! As always, a deeper look helped spot the problem. The dahdi version 2.5 that Ubuntu includes fails to compile with the kernel version 3.5 that Ubuntu guides you into updating to! The culprit is this little change after kernel version 3.3, which I read has broken quite a few third-party kernel drivers. Fortunately, googling found me an easy fix for this, so all I had to do was to un-install the Ubuntu-provided dahdi dkms package, fix it myself, and re-install it:

root# less /var/lib/dkms/dahdi/2.5.0.1+dfsg-1ubuntu2/build/make.log

got me the culprit file, wcb4xxp/base.c, which I had to fix according to the above fix. The steps:

root# dkms remove dahdi/2.5.0.1+dfsg-1ubuntu2 --all
root# cd /usr/src/dahdi-2.5.0.1+dfsg-1ubuntu2/drivers/dahdi
root# vi wcb4xxp/base.c      # apply the fix circa line 39
root# dkms add dahdi/2.5.0.1+dfsg-1ubuntu2
root# dkms build dahdi/2.5.0.1+dfsg-1ubuntu2

All this, to get the distro-provided package installed on the distro-provided system. But it was done. Now what? From some users who had tried compiling and running my oufxs driver on newer kernels, I had reports that some tweaking had to be made in the power management part of my driver. To fix this, I had better have the kernel source and/or the kernel man pages around. Looking up the kernel source 3.5 in the “Ubuntu Software Center” gave me nothing. Ooops… The distro-provided package repository does not give me the source for the distro-provided kernel. How come?

NOTE: the following paragraphs are a tale of my own steps. You do not need to repeat these steps.

I then decided to get the familiar synaptic back, mainly to get rid of all the annoying commercial software and magazines that the “Ubuntu Software Center” presented me with when searching. This was an easy step, so I got synaptic installed. Searching with synaptic confirmed that the Ubuntu 12.04 LTS source repository does not have the source for kernel 3.5.0-25. Great! So I had to manually add the repository of quantal (the next version) at the end of /etc/apt/sources.list:

deb http://ubuntu.mirror.cambrium.nl/ubuntu quantal-updates main

, then do an apt-get update to let apt know about the newly-added repository, then search with synaptic for the kernel source for 3.5 and add it, and then (to be on the safe side) remove the quantal repository from /etc/apt/sources.list and re-do an apt-get update. It was getting funny. Anyway, I did all that and got the kernel source for 3.5.0-25; besides, I only needed to study the source, not compile it.

I then tested plugging in my Open USB FXS board. As reported, the driver initialization failed because usb_autopm_get_interface was returning -EACCES. I checked again with the source and usb-skeleton.c, and it turned out that during all this time I had been using the usb_autopm_XXX functions incorrectly. Normally, I should use these when a process is opening the device. Instead, in order to disable autosuspend altogether (which was my purpose after all), I needed to provide a handler function and place it into the .suspend pointer of my driver structure, making sure that this function would return zero.

This is where I stand now. During the next days, I will update this post as I am fixing my driver and finally I will upload a new version on the project’s Googlecode repository. Stay tuned, if you are interested.

Update, April 16 2013

The code has now been adapted to work with Dahdi 2.5 and the 3.5 kernel that Ubuntu runs. No substantial changes were required for the Dahdi adaptation, just a few #if’s in the code to account for minor version 5. Before I set these, I paid for it with a couple of kernel panics and freezes, though, because the preprocessor tests were checking for minor version == 4 and not for >= 4. Lesson learned, I changed these ==’s to >= and all went fine. I also had to remove the usb_autopm_XXX functions altogether. However, this brings in an interesting question.

How would you like Open USB FXS to do power management?

As I said above, in order to get my driver to work, I had to remove the usb_autopm functions. I was using them wrongly anyway, so not much harm was done. First, I was supposedly invoking usb_autopm_get_interface at driver initialization time to prevent the kernel from putting my dongle board in low-power mode. However, my driver was not providing the kernel with hooks for that (like .suspend, for instance), so I guess that auto-suspend was disabled anyway. Second, the autopm_get function is supposed to be invoked at file open time, whereas I was invoking it at driver initialization. The function must be called in “process context” and not in “interrupt context”; however, I think I was fine on that count, because all the driver initialization was being done in a work queue (that is, in a separate kernel thread of its own), and I think this is considered “process context”. I have no idea why the kernel rejected the call with -EACCES, but in any case that call was not doing much anyway.

However, this brought in an interesting question. How should Open USB FXS behave if it had been engineered for suspend/resume functionality? In principle, it should go into “low-power” mode, turning off the converter and probably even shutting down all operations of the Si3210. The PIC’s USB functionality “knows” about suspend and resume, and the PIC is able to survive on extremely low currents. So, that would be a “USB-compliant” mode of suspending your Open USB FXS board. Would that be interesting, though?

My guess for an answer is no.

One could imagine another, more interesting mode of suspending a telephony device. How would you like, for example, your computer being put to sleep, and waking up when you take your phone off-hook? Or, in the case of an FXO device (assuming that I ever make it to get mine working), when the FXO interface indicates a ring from the Telco line? For me, that would mean lots of power savings. Is it doable? Certainly, for an FXS interface, it makes little sense: if we power down the converter, the device will no longer be able to track an off-hook event. However, an FXO device might be easier to deal with.

For the time being, I do not intend to do any work in the area of power management. Nevertheless, I am tossing the ideas out there, just in case someone reads this and wants to contribute (ideas, experience, views or whatever).

Power (PC) to the people

July 4, 2011

Just another piece of great news! A user reported that they have successfully tested Open USB FXS on a Power PC platform. Actually, they have tested two PPC platforms: a Mac PowerBook G4 and another PPC platform running stock Debian. Open USB FXS worked on both: right away on the Mac, and after a bug fix in the USB Host Controller of the second platform (the host controller did not initially support the number of USB endpoints required by Open USB FXS, but that was fixed by the manufacturer).

Well done folks! Keep it up, let the platforms that work with Open USB FXS proliferate!

Welcome, bro!

April 5, 2011

This is big news! Open USB FXO has just been born. Of course, not as a working circuit, but as a blog, just like Open USB FXS started a good two years ago. Welcome, bro! It’s nice to have you around!

Two good old friends

March 21, 2011

Sit back, relax… 

While visitors from Elektor were starting to visit my blog in waves, I was supposed to sit back, relax and enjoy the popularity of the site. Alas, seldom do things go exactly as they are supposed to, do they? By the time I was to relax, two user reports came that got me into a mad race to fix issues before they affected any other users too. But after all, that was the deal, right? Bugs were discovered. I had to fix them. A user community working in practice: exactly what I was hoping that my project would enjoy. Now that the race has ended and both reported issues are fixed, it’s time that readers heard the story.

DTMF intermittency

Back from the early days of the first Open USB FXS dongle boards, some of them were exhibiting an annoying bug. On occasion, and without any sign as to what exactly this “on occasion” coincided with, Asterisk refused to recognize DTMF tones from keys pressed on the phone. This was a very, very annoying bug, in that it made it impossible to dial out. Unplugging and replugging the board fixed the issue sometimes. Once the issue was fixed, it kept working OK for a lifetime — meaning, until the board was next unplugged or the computer was next rebooted (a good proof that a lifetime is a very short time). While I had not worried much about this issue, a user report made me think otherwise. Yes, to keep users happy, I had to fix the damn thing.

Sometimes I was able to cure the problem by adding or removing an echo canceller and/or tweaking the gain on the board and/or changing the “relaxdtmf” setting in /etc/asterisk/chan_dahdi.conf and/or changing the dahdi tonezone. But I could not get a consistent result. Googling about the issue did not help either. I kept coming upon the same advice over and over: tweak relaxdtmf, check the gain, etc. There seemed to be people around hitting the same problem with different hardware, but there seemed to be no definite answers as to what caused the problem.

Even now that I have fixed this issue, I do not know for sure why it occurs. My understanding is that this issue is caused by some combination of echo, gain and timing settings. A possible explanation is that the output audio signal (dial tone) gets somehow echoed back together with the input signal (DTMF tone) and garbles it. This assumption looks more plausible than others, because the issue occurs less often with some single-frequency tonezones (gr, de, and others) than with dual-frequency ones (us). But I am sure that I am still missing something here, because this assumption alone cannot explain why, once DTMF works right, it works for a “lifetime”. I decided anyway to postpone this discussion for some other day, whenever I find time to really look into the root cause of the issue and fix that. For now, a reliable workaround would suffice.

Eventually, what helped me resolve the issue was the code for some other FXS board drivers in dahdi. There, I found a very interesting call to a function dahdi_qevent_lock(), which I’ll explain in a second. The board already implemented DTMF recognition in hardware; the recognized DTMF status had always been reported to the host piggybacked onto the headers of isochronous USB packets, along with audio samples. BTW, on-board DTMF recognition has always worked reliably, but somehow I did not know what to do with it so far.

After discovering dahdi_qevent_lock(), I found out that one can send DTMF events to the “dahdi layer”. These events are either interpreted locally by Asterisk, or they are translated into control events (in SIP and other VoIP protocols) to be sent to the remote end. One must also provide for an ioctl to “mute” the qevent functionality when “bridging” locally in dahdi (e.g., when one connects from a local FXS port to another local FXO port; the latter could be the PTT line, and muting the qevent functionality serves to pass the DTMF codes as plain audio to e.g. this PTT line once a call gets connected through it).

To cut a long story short, I added calls to dahdi_qevent_lock() and – wonder! The DTMF issue vanished. Now my board dials reliably each and every time. The downside of this is that the 3210 reportedly does not understand the DTMF systems of some Asian countries. Well, no problem, the on-board DTMF recognition functionality is optional: I have added a new module parameter “hwdtmf” (default value is 1), which can be turned off if you happen to live in China and the 3210 does not recognize correctly your DTMF tones.

The sound of silence

The second very, very annoying issue that I became aware of was this: when the board was left plugged into some systems and the system rebooted, one out of two reboots or so a terrible noise was coming out of the phone instead of a dial tone (hmm… why did this feel like a déjà vu?).

I first tried to reproduce the issue on my laptop. Well, I could not. No matter how many reboots I tried, there was no noise. But then, I tried on different hardware and, surprise: yes, I could hear the noise, too. It sounded like a jigsaw working in the next room, a jigsaw cutting into small pieces all my boasting so far about crystal-clear audio and the like.

An interesting detail was that, when Asterisk was stopped so that there was no dial tone, there was no noise. Silence transmitted fine. But what use could my board be if it could only transmit silence? I agree, it could fit nicely among those April Fools’ Day circuits that Elektor magazine used to publish, like a “fuse blower” (reliably blew every fuse you put into it) or a “solar torch” (did not have any batteries, and thus worked only under direct sunlight). Yeah, the “FXS circuit that transmits silence correctly” made a very good candidate for a new member of that family.

Coming back to debugging, what was it that caused the noise? Could it be that the Linux driver was misbehaving, that dahdi was not working, or what? I quickly crossed out all these assumptions. A good deal of combinations of different kernel and dahdi versions all exhibited the issue. Moreover, a plain copy of an Asterisk u-law-encoded audio file to the device also came out with noise. In that path, there was no Asterisk, no dahdi, nothing other than my driver and my board. And a good deal of despair, too: hadn’t I crossed out all these annoying bugs? When would they finally go away for good?

I first tried in vain to capture differences in the device status between the two states, with noise present and absent. Nope. Si3210 register dumps seemed identical, other than a few insignificant differences in some analog values and calibration results; these differences could not be the root cause of the problem. The dahdi_diag utility also produced identical output, modulo the gain (I think that this is just the address of a gain buffer, so it is normal to see differences there, but I do not know for sure).

The next thing I tried was usbmon. This is a nice kernel debug utility that sniffs USB packets as they go in and out of the USB layer. I stopped Asterisk in order to let the driver send “silence” (0x7f’s) to the board, and then I cat’ed /sys/kernel/debug/usb/usbmon/2t (2 is a number that depends on the specific USB bus in use), and I got this:

f6dad180 1406459225 S Zo:003:02 -115 16 = 00000001 1a400100 7f7f7f7f 7f7f7f7f
f6d183c0 1406459230 C Zi:003:02 0 16 = badd80ef 00001583 7e7efffd ff7d7efe
f6d183c0 1406459233 S Zi:003:02 -115 16 <
f6dad240 1406460220 C Zo:003:02 0 16 >
f6dad240 1406460226 S Zo:003:02 -115 16 = 00000002 1a400100 7f7f7f7f 7f7f7f7f
f6f8d000 1406460231 C Zi:003:02 0 16 = baee00f0 00000084 7d7c7eff 7e7e7eff
f6f8d000 1406460234 S Zi:003:02 -115 16 <
f6dadf00 1406461218 C Zo:003:02 0 16 >
f6dadf00 1406461224 S Zo:003:02 -115 16 = 00000003 1a400100 7f7f7f7f 7f7f7f7f
f6f8d3c0 1406461229 C Zi:003:02 0 16 = badd80f1 00001584 fe7e7e7d 7d7d7e7d
f6f8d3c0 1406461232 S Zi:003:02 -115 16 <
f6860180 1406462219 C Zo:003:02 0 16 >
f6860180 1406462225 S Zo:003:02 -115 16 = 00000004 1a400100 7f7f7f7f 7f7f7f7f
f6f8d480 1406462230 C Zi:003:02 0 16 = baee00f2 00000085 fffcfcff fefe7d7e
f6f8d480 1406462234 S Zi:003:02 -115 16 <
f6860cc0 1406463221 C Zo:003:02 0 16 >
f6860cc0 1406463230 S Zo:003:02 -115 16 = 00000005 1a400100 7f7f7f7f 7f7f7f7f
f6f8da80 1406463235 C Zi:003:02 0 16 = badd80f3 00001585 fefefffe ff7e7e7d
f6f8da80 1406463241 S Zi:003:02 -115 16 <
f6860000 1406464222 C Zo:003:02 0 16 >
f6860000 1406464229 S Zo:003:02 -115 16 = 00000006 1a400100 7f7f7f7f 7f7f7f7f
f6f8d900 1406464234 C Zi:003:02 0 16 = baee00f4 00000086 7d7dfffd fdfdff7d
f6f8d900 1406464238 S Zi:003:02 -115 16 <
f6860900 1406465219 C Zo:003:02 0 16 >
f6860900 1406465225 S Zo:003:02 -115 16 = 00000007 1a400100 7f7f7f7f 7f7f7f7f
f6f8d240 1406465230 C Zi:003:02 0 16 = badd80f5 00001586 fe7e7e7d 7c7e7d7e
f6f8d240 1406465233 S Zi:003:02 -115 16 <
f68603c0 1406466218 C Zo:003:02 0 16 >
f68603c0 1406466224 S Zo:003:02 -115 16 = 00000008 1a400100 7f7f7f7f 7f7f7f7f
f6f8d9c0 1406466229 C Zi:003:02 0 16 = baee00f6 00000087 fdfffffe 7dff7e7e

You can see output (Zo) and input (Zi) USB “packets”, consisting of eight bytes of header data and another eight bytes of audio data. This dump, produced with the noise issue absent, looked fine. Then I tried to capture a dump with real audio (some Asterisk tone output), when noise was heard. I got this:

f6900e40 3916897398 S Zo:003:02 -115 16 = 00000001 e3400100 15131720 3eaf9d96
f6d89000 3916897407 C Zi:003:02 0 16 = badd80ee 00003314 c6dc7e7e 72696b68
f6d89000 3916897416 S Zi:003:02 -115 16 <
f6900cc0 3916898377 C Zo:003:02 0 16 >
f6900cc0 3916898390 S Zo:003:02 -115 16 = 00000002 e3400100 94979fb5 3e241c1a
f6d899c0 3916898397 C Zi:003:02 0 16 = baee00ef 00000015 fb664c5e e9faddee
f6d899c0 3916898404 S Zi:003:02 -115 16 <
f69003c0 3916899369 C Zo:003:02 0 16 >
f69003c0 3916899376 S Zo:003:02 -115 16 = 00000003 e3400100 1c2333dc b0a8a5a8
f6d89540 3916899381 C Zi:003:02 0 16 = badd80ee 00003415 fb736362 5a63677d
f6d89540 3916899385 S Zi:003:02 -115 16 <
f69000c0 3916900370 C Zo:003:02 0 16 >
f69000c0 3916900376 S Zo:003:02 -115 16 = 00000004 e3400100 aebde94b 424653fe
f6d89780 3916900381 C Zi:003:02 0 16 = baee00f1 00000016 e1e35a64 ed68eaf8
f6d89780 3916900384 S Zi:003:02 -115 16 <
f6900540 3916901369 C Zo:003:02 0 16 >
f6900540 3916901376 S Zo:003:02 -115 16 = 00000005 e3400100 dde4593f 332d2b2f
f6d89c00 3916901381 C Zi:003:02 0 16 = badd80f2 00003516 7461edcf d3ccd5dd
f6d89c00 3916901384 S Zi:003:02 -115 16 <
f6900c00 3916902369 C Zo:003:02 0 16 >
f6900c00 3916902375 S Zo:003:02 -115 16 = 00000006 e3400100 3eceaea2 9d9ca0ad
f68906c0 3916902380 C Zi:003:02 0 16 = baee00f5 00000017 74584e49 4b4f5eee
f68906c0 3916902383 S Zi:003:02 -115 16 <
f6900300 3916903370 C Zo:003:02 0 16 >
f6900300 3916903376 S Zo:003:02 -115 16 = 00000007 e3400100 6d2b1c17 161a2447
f6890780 3916903380 C Zi:003:02 0 16 = badd80f4 00003517 cbc0d0de dc79f36e
f6890780 3916903384 S Zi:003:02 -115 16 <
f6900480 3916904369 C Zo:003:02 0 16 >
f6900480 3916904376 S Zo:003:02 -115 16 = 00000008 e3400100 ad9c9593 969eb637
f6890240 3916904380 C Zi:003:02 0 16 = baee00f5 00000018 655e5757 5873efd7
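
For readers who would rather not eyeball such dumps, here is a minimal Python sketch that splits one usbmon text line into its header and audio parts. It assumes the field layout exactly as it appears in the two dumps above (seven leading fields, then an '=' marker followed by the data words); real usbmon output for isochronous transfers may carry additional fields, and the function name is made up for the example.

```python
def parse_usbmon_line(line):
    """Split a usbmon text line (as printed in this post) into its parts.

    Returns None for lines that carry no data words, i.e. those ending
    in '<' or '>' instead of '= <hex words>'.
    """
    fields = line.split()
    if len(fields) < 8 or fields[6] != "=":
        return None
    urb_tag, timestamp, event, address = fields[0:4]
    # The address field starts with the transfer type and direction,
    # e.g. 'Zo' (isochronous OUT) or 'Zi' (isochronous IN).
    direction = address.split(":")[0]
    payload = bytes.fromhex("".join(fields[7:]))
    return {
        "event": event,            # 'S' (submission) or 'C' (callback)
        "direction": direction,
        "header": payload[:8],     # eight bytes of header data
        "audio": payload[8:],      # eight bytes of audio samples
    }


if __name__ == "__main__":
    sample = ("f6dad180 1406459225 S Zo:003:02 -115 16 = "
              "00000001 1a400100 7f7f7f7f 7f7f7f7f")
    packet = parse_usbmon_line(sample)
    print(packet["direction"], packet["header"].hex(), packet["audio"].hex())
```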

Just a pile of junk data? No, I do not agree. These two dumps contain a cornucopia of debug information and show clearly the cause of the problem (though not the root cause). At first, I too was lost in that sea of numbers. But then I looked back into my PCM packet headers, and enlightenment came at once. First, let us isolate some Zo (OUT, from host to board) lines from the first dump and look at them (I removed the usbmon information and kept only the data for clarity; the excerpt is taken from the first dump, but the discussion below applies to the second one as well):

00000001 1a400100 7f7f7f7f 7f7f7f7f
00000002 1a400100 7f7f7f7f 7f7f7f7f
00000003 1a400100 7f7f7f7f 7f7f7f7f
00000004 1a400100 7f7f7f7f 7f7f7f7f
00000005 1a400100 7f7f7f7f 7f7f7f7f
00000006 1a400100 7f7f7f7f 7f7f7f7f

One can’t help but notice the increasing numbers 1, 2, 3, 4 in the fourth byte of the first word; yes, these are sequence numbers, which I built in on purpose, exactly for debugging (the other header information is irrelevant, so I won’t deal with that). Now, these get mirrored back into input packets (with an offset depending on {r,w}packsperurb and {r,w}urbsinflight). Here is a sequence of IN (board to host) data from the first dump:

badd80ef 00001583 7e7efffd ff7d7efe
baee00f0 00000084 7d7c7eff 7e7e7eff
badd80f1 00001584 fe7e7e7d 7d7d7e7d
baee00f2 00000085 fffcfcff fefe7d7e
badd80f3 00001585 fefefffe ff7e7e7d
baee00f4 00000086 7d7dfffd fdfdff7d
badd80f5 00001586 fe7e7e7d 7c7e7d7e
baee00f6 00000087 fdfffffe 7dff7e7e

At the same position, the fourth byte, you will again see an increasing sequence number. This is the mirrored OUT sequence. These headers contain other information too, some of which is relevant. If the second byte is 0xee, the packet was sent over the even USB endpoint; if it is 0xdd, over the odd one (odd and even endpoints are a PIC idiosyncrasy that helps avoid double buffering). Moreover, at the end of the second word (8th byte), one can see another sequence number that increases every two packets (0x83, 0x84, etc.); this is normal. The first trace shows a perfect USB communication. No lost sequence numbers, no nothing. Let us now look at the IN (Zi) lines from the second trace:

badd80ee 00003314 c6dc7e7e 72696b68
baee00ef 00000015 fb664c5e e9faddee
badd80ee 00003415 fb736362 5a63677d
baee00f1 00000016 e1e35a64 ed68eaf8
badd80f2 00003516 7461edcf d3ccd5dd
baee00f5 00000017 74584e49 4b4f5eee
badd80f4 00003517 cbc0d0de dc79f36e
baee00f5 00000018 655e5757 5873efd7

Uh-oh… What do we have here instead of an increasing sequence? The weird sequence 0xee, 0xef (correct), 0xee (repeated, instead of the expected 0xf0, which is missing altogether), then 0xf1, 0xf2 (fine), 0xf5 (where did 0xf3 and 0xf4 go?), 0xf4 (so 0xf3 was lost altogether), and 0xf5 (again, so another duplicate instead of the lost 0xf3).
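The eyeball check above can also be automated. The following is a minimal Python sketch, not part of the project's tooling: it takes the first header word of each IN packet, pulls out the endpoint marker (second byte) and the mirrored sequence number (fourth byte), and compares each sequence number against a running counter started at the first packet. The function names are made up for illustration, and the sketch assumes that the counter is one byte wide and wraps at 256, and that the capture itself has not dropped any lines.

```python
def decode_in_header(first_word: str) -> dict:
    """Decode the first 32-bit header word (8 hex digits) of an IN packet."""
    raw = bytes.fromhex(first_word)
    # Second byte: 0xee = even endpoint, 0xdd = odd endpoint (per the text).
    endpoint = {0xEE: "even", 0xDD: "odd"}.get(raw[1], "unknown")
    # Fourth byte: the mirrored OUT sequence number.
    return {"endpoint": endpoint, "seq": raw[3]}


def check_sequence(seqs):
    """Compare each sequence number with a counter started at seqs[0].

    Returns a list of (index, expected, got) for every mismatch.
    Assumption: the counter is a single byte and wraps at 256.
    """
    problems = []
    for index, got in enumerate(seqs):
        expected = (seqs[0] + index) & 0xFF
        if got != expected:
            problems.append((index, expected, got))
    return problems


# First header words of the IN packets, copied from the two excerpts above.
GOOD = ["badd80ef", "baee00f0", "badd80f1", "baee00f2",
        "badd80f3", "baee00f4", "badd80f5", "baee00f6"]
BAD = ["badd80ee", "baee00ef", "badd80ee", "baee00f1",
       "badd80f2", "baee00f5", "badd80f4", "baee00f5"]

good_seqs = [decode_in_header(word)["seq"] for word in GOOD]
bad_seqs = [decode_in_header(word)["seq"] for word in BAD]

print("first trace problems:", check_sequence(good_seqs))
for index, expected, got in check_sequence(bad_seqs):
    print(f"second trace, packet {index}: expected 0x{expected:02x}, got 0x{got:02x}")
```

A running counter (rather than "previous value plus one") is used here so that a single bad packet is reported once, instead of also flagging the correct packet that follows it.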

So this dump was showing fine what was happening down on the board: the PIC had gotten the sequences of IN packets all mixed up, and was sending audio with duplicate, lost, and re-ordered samples to the telephony chip. No wonder this was coming out as a jigsaw-like noise, then…

OK, this was a good explanation, but why was this happening in the first place? Why did this occur only on reboot? There, I made an assumption and tried to prove it correct. According to the assumption, the reason was that the PIC did not reset when the computer rebooted; it kept its previous state, and re-initialization caused this havoc. This was (relatively) easy to prove: I planted some special values to be echoed back with IN USB packet headers into several locations in the firmware. Among others, I added a counter that was incremented at each initialization step that I was testing. It took me 15 re-flashes of the firmware, but in the end I found it: the PIC was executing the USB SET_CONFIGURATION procedure twice! And this coincided exactly with noise.

OK, now I had a reasonable explanation as to what was triggering this havoc in data sequencing. But this did not really explain why the havoc was created, nor how to fix it. At that point, I made several assumptions. The assumption that made most sense was that, during SET_CONFIGURATION, the firmware “arms” an OUT (this means “receive” for the board, I know it’s confusing) operation to get the isochronous engine started. If the engine has already been started, this would confuse things. I thus tried to bypass this “arming” operation if SET_CONFIGURATION had already been executed once before. That did not work.

At that point, I decided to try a radical solution. Too many things in my firmware’s initialization sequence depend on the assumption that the PIC has just been powered on. So, what if I told it to reset if I found something I did not like? I thus changed the little test in the SET_CONFIGURATION initialization function to read: “if you have been here before, reset”. And this made the magic work. The noise that had pestered me for weeks went away for good. Just like that!
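To make the logic of that little test concrete, here is a toy Python model of it. This is only an illustration of the "been here before? then reset" idea; the real firmware is PIC code, and the class and method names below are invented for the sketch.

```python
class FirmwareModel:
    """Toy model of the board state that can survive a host reboot."""

    def __init__(self):
        self.reset_count = 0
        self.configured = False

    def reset(self):
        # Stands in for a full PIC reset: all volatile state starts over.
        self.reset_count += 1
        self.configured = False

    def on_set_configuration(self):
        if self.configured:
            # "If you have been here before, reset."
            self.reset()
            return "reset"
        self.configured = True
        return "configured"


board = FirmwareModel()
print(board.on_set_configuration())  # cold boot: first SET_CONFIGURATION
print(board.on_set_configuration())  # host rebooted, board kept its state
print(board.on_set_configuration())  # host enumerates the freshly reset board
```

The point of the model is that the second SET_CONFIGURATION never runs the initialization on top of stale state; the board starts from scratch and the host simply enumerates it again.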

A question that I was left with after all this was: why did the issue only occur with some hardware (and not with my laptop)? The answer I found had to do with power. My laptop seems to remove USB power when rebooting, and to take quite a few seconds to restore it. Other PCs seemingly either do not cut power completely (I think that the USB standard specifies a low-but-non-zero-voltage condition to signal a device powerdown?) or do cut power, but not for long. The latter had puzzled me because, on some rare occasions, when noise occurred, I had managed to unplug the board, re-plug it, and still hear noise. But how could that be? The answer lies in the DC decoupling electrolytics. When the board is unplugged, the voltage drops, but not to zero, at least not immediately. The DC-DC converter stops working, and maybe the 3210 powers off, but the PIC does not! PICs are designed to go into a power-saving mode, in which they can survive on peanut power, so a few hundred millivolts and a few microamps suffice for the PIC to keep its state for a few seconds, until the charge stored in the electrolytics drains out and the voltage drops to a near-zero value.

So there it was, the answer to the puzzle. Some hardware does not cut off power completely, or reboots so fast that the PIC is not reset during reboot. Thus, before my latest fix, when the host rebooted, it started the USB initialization sequence as usual on my board, which however had kept its prior state. The PIC then got the OUT (receive) sequence all mixed up, and this was creating packet losses, duplication and reordering. And that was of course causing noise. Elementary, my dear Watson!

Although this fix of mine corrects the issue, I am not very happy with it, in that the firmware should somehow recover better from such a situation. I still do not know what exact part of the re-initialization causes the problem, and perhaps the problem could re-appear under different conditions. But, until then, I think I can live with this quick (sort of…) fix that I came up with.

The moral of the story

Is there a moral in this story? I don’t know. If yes, it must be that bugs are just like good old friends. Just when you thought you had forgotten all about them, and they had forgotten about you too, here they are, knocking on your door late at night, right when you were contemplating a good sleep. And though you think you’d be far better off on your own, company (debugging) proves to be fun in the end. Until you finally get rid of them, just like when you’d show your good friends the door by 2:00 am. Only the bugs are not really your friends: two simple bugs like these (the board being unable to dial and outputting a terrible noise) could ruin this project altogether. Anyway, this project has so far survived quite a few such visits from these “good old friends”. Let’s hope it is always going to make it!

Welcome, Elektor readers!

January 14, 2011

Welcome!

This blog is happy to announce that it is expecting visitors directed here from the February issue of Elektor magazine. Yes, it is true! An article on Open USB FXS has been accepted for publication by Elektor and appears in the February 2011 issue. So, hello there Elektor readers, and welcome on board! Please read on for more fun and information on how you can make the most out of this blog.

What’s on this blog

First, a bit of history: around November 2010, Elektor magazine accepted an article draft I submitted that described Open USB FXS. This was the end of a long journey though. Before making it into an article, the project had gone through numerous steps, from its inception up to its current, so to speak, maturity stage.

So, if you are interested in the funny (and somewhat pre-) historic side of the project, you may start browsing the blog back from its very first post, “Control-Alt-Delete: a bit of History”. You can thus share the fun as I, the respectable author and designer of this gadget, demystify the innards of a homebrew (but misbehaving) PIC prototype board, debug a series of FXS prototypes that refuse to work for various reasons, destroy my only prototype, quit the project but restart from scratch later, try all sorts of hacks to get the firmware to work, design successive PCB versions that prove problematic for one reason or another, discover the miraculous world of Linux device drivers, and finally come up with the design that you will find in the magazine, including the firmware and software.

If you choose not to waste time on the details, you should probably take a quick look at the most useful pages found at the right side of this page. You should be seeing a list of links under “Pages”. Among these, there is a guide on how to build an Open USB FXS dongle board. I would say it is best to consult this guide before setting out to build your own board.

Once you get the board built, it’s a good idea to consult the “Setup and Debugging” guide, which will help you take your board through its first baby steps in its future VoIP telephony life.

Once your board is ready to go, and if you have no prior experience with Asterisk, you may choose to read the quick-n-dirty “Asterisk setup” guide, which will help you through your first VoIP call.

Kits and materials

At an earlier time, I had decided to offer a limited number of kits commercially. Unfortunately, these kits are no longer available to buy. Instead, and unless you want to build everything on your own, you are advised to purchase the PCB and a pre-programmed PIC from Elektor’s site. Notably, the PCB from Elektor is better than mine (it has the bottom-layer component placement silk, while the PCBs in my kits did not). The rest of the materials should be available from Mouser (the Silicon Labs chips) and other shops like Farnell etc. Please feel free to ask me for assistance on where to find components or on which alternatives to use. Please do also check my “how (not) to order” page.

Interact and contribute

Besides keeping track of all my failures before I got the project right, the idea behind writing a blog instead of a static page is that you, the readers, can contribute to it. First, you can contribute by posting comments. You can ask questions, suggest improvements, report problems, or otherwise comment. All comments (besides spam, which is automatically detected and deleted) are welcome. You can also use the comments facility to contact me in private; if you want your comment to stay private, please indicate so in the text and I will respect your wish.

This is an open-source hardware/firmware/software project. Besides commenting, you are also welcome to contribute. You can improve the design in many ways, or you can extend it in still others. There are lots of improvements that can be made.  It does not have to be just hardware; it can also be additions to the driver, firmware and so on. You are more than welcome to propose your own ideas and discuss them with other interested (blog and magazine) readers.

Acknowledgements

I feel deeply indebted to David Rowe for his invaluable help throughout the lifetime of this project. I do also want to thank my family for bearing with me while I was depriving them of all the time it took me to design schematics and PCBs, write code and debug (let alone the phone calls we missed because, instead of the PTT outlet, the home phone set was plugged into my dongle board for tests :-)).

I wish to thank Faidon Liambotis for various discussions we had on features etc. that helped me a lot, and Fotis Michael for sparing so much of his time with me on the lab’s oscilloscope, debugging funny DC-DC converter issues. I also wish to thank all the early commenters on this blog for the valuable advice that they have provided. And [initially forgotten and added on Feb 08 2011] a big thanks goes to my friend Christos A., for his support in the last phases of this project.

Last but by no means least, I want to thank the Elektor editor, Jan Buiting, for encouraging me to go on during the early stages of this project and for doing a great job altogether. Thanks, Jan!

That’s about all for this welcome post. Whether an Elektor reader or an occasional passer-by, enjoy the blog!

Two new fascinating features

December 15, 2010

By now you should have known. I am not the guy who would sit idle forever. There were two nice features I had conceived in the past that were always at the back of my mind. And, motivated by some friends who asked about these features, I decided to dive once more into the deep waters of firmware, kernel-code and user-side development. (I know, by now I should have known better, but it seems I cannot help it.)

Before however I go into the details, please let me spend one minute on something completely irrelevant: the availability of prototype boards. Probably because of their high price, or for whatever other reason, only a few people got interested in buying prototypes. Thus, I decided to stop selling them. A number of boards are still available, but not for purchase. You can still reach me at openusbfxs -at- gmail dot com if you are interested in getting your hands on a prototype or DIY kit; however, you can no longer buy one. I will consider only non-commercial arrangements (e.g. sending you a board in exchange for some other piece of equipment or for a service you render me). Mail me if you need a board and you think you have an interesting proposition.

OK, now back to the new fascinating features. What are they? Well, the first is about upgrading the firmware without the need to touch S1(b) (that’s S2 in my older design), in pure software. Doesn’t this sound like a very professional-grade feature? Just run a tiny program, and there it is, your firmware gets upgraded in-place, without even needing to unplug your device!

The second feature may sound  just a little bit more complex. It bears the obscure name “channel number persistence”, but in principle it is simple. Suppose for a second that you have two Open USB FXS devices, each connected to a phone. The ugly thing so far was that, if you got to unplug and then replug the two devices, or even to reboot your system with the devices plugged in, the dahdi channel-#-to-device mapping might change, depending on the order in which you would plug them back. Together with the channel # of a device would of course change its extension number, so you might find yourself in the unpleasant situation of getting e.g. your next door colleague’s former extension number (and her getting your former extension number, whichever of the two is worse). Channel # persistence fixes exactly this, by assigning each Open USB FXS device a fixed channel number, regardless of the order in which the devices are plugged into the system.

So, let us first dive into the software-based firmware upgrade story. Let me remind you that the Open USB FXS firmware has always had firmware upgrade support: a small bootloader firmware lives at the low memory area of the PIC (0x000 to 0x800), and if you switch S1(b) to its “on” position, unplug, and then re-plug the board, the bootloader takes control of the board. In this mode, the PIC is programmed to behave like one of Microchip’s PICDEM boards. Thus, up to now, I have been using a Windows utility from Microchip, PICDEM-FS, to flash the FXS firmware onto the higher memory area.

What was the downside of this? First, you would have to set up and maintain a separate Windows environment for firmware upgrades. And second (and most important), you would have to have physical access to the board in order to unplug it, toggle a DIP switch, replug it, unplug it, turn the switch back off and replug. Which means, you could not plug the device at the back panel of your tower PC and forget it there, under your desk forever; you’d have to pull the tower case out from time to time, remove the board, plug it in, flash it with new firmware, remove it again and plug it back in. Moreover, my dongle would never make it into a housing case, because then from time to time, the user would have to open the case and fiddle with the DIP switches. Certainly not too practical, is it?

So I decided to make the board able to upgrade its firmware right from Linux, without unplugging or touching S1(b). The first step was to check on a small Linux program, fsusb, which proved to do the flashing job perfectly well. Goodbye, Windows! However, this was the easy part. I now had to find a way to bypass S1(b), which was not as straightforward as it might sound.

At first sight, bypassing S1(b) looked too tricky. When the device boots, initially it is the bootloader that takes control; it checks the setting of S1(b), and, if the switch is in the “off” position, the bootloader jumps to the FXS firmware. Moreover, the bootloader is compiled using Microchip’s MCC compiler with all optimizations turned on, which makes it fit right into its allotted 0x800-byte “memory box”. The demo period of my copy of MCC had expired a long time ago, and one of the optimizations was no longer supported. If I recompiled the bootloader from source, it would take more than 0x800 bytes, and I would have to do extensive rewrites to the FXS firmware to align. Therefore it seemed that this direction was a clear “no go”.

Another thought that crossed my mind was to replicate the bootloader’s functionality into the FXS firmware itself. But this seemed too convoluted, too. Of course, I could not let the FXS firmware overwrite itself. Thus, I would have to define another memory area, and write the firmware code so as to contain an “egg” part with the necessary code to overwrite the FXS firmware flash; when instructed, the code would place the “egg” at some high memory area and invoke it. But then, I would also have to rewrite the fsusb program so that it would recognize the USB vendor and product ids of the FXS “personality” (see next paragraph). Naaah, all this sounded much too complicated…

My third thought was the one that actually made it. The bootloader code is fixed and unlikely to ever change. What if I instructed the FXS firmware code to jump back to the bootloader, right after the check for S1(b)? Then, the bootloader would ignore the setting for S1(b). Wisely enough, checking for S1(b) is done very early in the bootloader code, before the initialization of the USB circuitry, hence before the board assumes either of its two possible personalities; thus, there would not be any side effects.

Well, this sounded fine, however there was still a minor issue needing to be resolved: the bootloader code assumes it is being run right after a reset, so that it can take a PICDEM “personality” (USB vendor id, product id, interface endpoints, etc.). The FXS firmware has a totally different “personality” (other product id, other endpoints, etc.). Thus, it was clear: the jump to the bootloader would have to be right after a reset. But — wait a moment: then, how would I instruct the board over software to go into bootload mode? Right after a reset there is no way for the driver to talk to the board; and when the driver is finally able to talk to the board, it’s too late, since the board has already assumed its FXS personality!

The solution I found to this dead-end was again the PIC’s EEPROM. At boot time, the FXS firmware would examine byte #4 of the EEPROM (bytes #0 to #3 are occupied by the serial number) and, if it found this byte set to zero, it would go on booting normally as an FXS device. Otherwise, it would reset the byte to zero (to avoid an endless loop at the next boot) and then it would jump to the bootloader. Yes, that would clearly make it.
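The boot-time decision described above is small enough to model in a few lines. The following Python sketch only illustrates the logic; the real check lives in the PIC firmware, and the function and constant names are hypothetical. The EEPROM layout (serial number in bytes 0 to 3, flag in byte 4) is taken from the text.

```python
SERIAL_LENGTH = 4      # EEPROM bytes 0..3 hold the serial number (from the text)
BOOT_FLAG_OFFSET = 4   # EEPROM byte 4 is the "go to bootloader" flag


def boot_decision(eeprom: bytearray) -> str:
    """Decide which personality the board takes at boot.

    Returns "fxs" for a normal boot, or "bootloader" after clearing the
    flag so that the following boot is a normal one again.
    """
    if eeprom[BOOT_FLAG_OFFSET] == 0:
        return "fxs"
    # Clear the flag first, to avoid an endless bootloader loop.
    eeprom[BOOT_FLAG_OFFSET] = 0
    return "bootloader"


# Serial 1234ABCD, with the upgrade flag set by the host beforehand.
eeprom = bytearray([0x12, 0x34, 0xAB, 0xCD, 0x01])
print(boot_decision(eeprom))  # flag was set
print(boot_decision(eeprom))  # flag has been cleared by the previous boot
```

Clearing the flag before jumping, rather than after a successful flash, means that an interrupted upgrade still leaves the board able to boot as an FXS device the next time (assuming the old firmware is intact).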

And indeed, it did make it. What came next was a tiresome, but relatively straightforward job: I transcribed the code from the (open-source) utility fsusb into a small program of mine, checkfmwr.c. Now this new program does everything there is to do in order to perform a tidy firmware upgrade: it scans for Open USB FXS devices on the system; if it finds one, it checks whether (according to the program’s invocation arguments) this board requires a firmware upgrade; if it does, the program instructs the board to reset into bootloader mode; it then scans the USB bus again, and finds the board with its bootloader “personality”; it flashes the new firmware, and finally it reboots the board back into FXS mode. It all works like a charm.

A tricky part of the procedure relates to associating the (new) PICDEM device with the (old) Open USB FXS device after the board resets to bootloader mode. Since the board reboots, it is removed from the system; and when the bootloader-personality board boots, it appears with a different device number, because the Linux kernel gives new devices a new, always-increasing, device number. So the algorithm I chose was to scan the USB busses until the same bus is found that once held the Open USB FXS device in hand; then, look for a PICDEM device on that bus, with a device number higher than the one of the Open USB FXS device formerly found. This works, unless you have more PICDEM devices lurking on the same USB bus. You do not? That’s what I thought, too.
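The matching rule can be sketched in Python as follows. This is an illustration of the rule only, not the code from checkfmwr.c: the `UsbDevice` tuple and `find_bootloader` are made-up names, and the vendor/product pair is an assumption (it is believed to be the one used by Microchip's PICDEM FS USB bootloader, but check it against your own `lsusb` output). Picking the lowest matching device number when several candidates exist is also a choice made for this sketch.

```python
from typing import NamedTuple, Optional, Sequence, Tuple


class UsbDevice(NamedTuple):
    bus: int
    devnum: int
    vendor_id: int
    product_id: int


# Assumed IDs of the bootloader "personality"; verify before relying on them.
ASSUMED_BOOTLOADER_ID = (0x04D8, 0x000B)


def find_bootloader(
    devices: Sequence[UsbDevice],
    old_bus: int,
    old_devnum: int,
    bootloader_id: Tuple[int, int] = ASSUMED_BOOTLOADER_ID,
) -> Optional[UsbDevice]:
    """Find the re-enumerated board: same bus, bootloader IDs, and a device
    number higher than the one the FXS personality had before the reset."""
    candidates = [
        dev for dev in devices
        if dev.bus == old_bus
        and dev.devnum > old_devnum
        and (dev.vendor_id, dev.product_id) == bootloader_id
    ]
    if not candidates:
        return None
    # With several candidates the rule is ambiguous; take the lowest number.
    return min(candidates, key=lambda dev: dev.devnum)


scan = [
    UsbDevice(1, 5, 0x04D8, 0x000B),  # bootloader, but on another bus
    UsbDevice(3, 2, 0x04D8, 0x000B),  # same bus, but an older device number
    UsbDevice(3, 8, 0x1234, 0x5678),  # unrelated device
    UsbDevice(3, 9, 0x04D8, 0x000B),  # the board we are looking for
]
print(find_bootloader(scan, old_bus=3, old_devnum=7))
```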

A funny preposterous situation here is that the firmware needs first to be upgraded to a version that supports the auto-upgrade functionality before it can actually auto-upgrade :-). OK, I agree, it does indeed feel funny; but remember — it is the last upgrade you will have to perform with the traditional method (S1(b))! All your next upgrades from now on will be done in software alone.

Enough with the firmware upgrades, it’s now time for the second feature, channel # persistence. This was conceptually simpler and more straightforward: to activate this feature, one would have to pass the oufxs driver a parameter, mapping serial numbers to dahdi channel #’s. Then, the driver would try to invoke dahdi_chan_register() and actually allocate these channel numbers, mapped to internal dev structures inside it. Later on, when such a pre-reserved Open USB FXS device was plugged, the driver would note that and, instead of allocating a brand-new dev structure for the new device, it would just use the pre-allocated dev structure that was registered with dahdi. Simple, isn’t it?

Of course it was not that simple in practice. There are many execution paths within the driver, and I had to make sure that, in each and every one of these, I would not use unallocated or uninitialized stuff, I would free or unregister back all unneeded malloc()ed space and unneeded dahdi channels, I would check that persistent devices were present before attempting to use them, I would not forget to unregister free()ed structures, I would not forget any spinlocks locked, etc. Which means, I froze and ooops’ed the kernel quite a few times before I got it right; but I got it right in the end.

In the process, I thought it was a good idea to check compatibility with dahdi version 2.3.x (former work was on 2.2.x). This first resulted in some BUG_ON messages, because the new dahdi version uses kernel module reference counts, and I was not supplying oufxs as the exporting module for my dahdi devices. That fixed, I then had to make the code port well to both dahdi 2.2 and dahdi 2.3 (and 2.4, which I have not tried yet). A little fiddling with the Kbuild makefile made it.

But then my (now correct) code triggered a wholly new bug, this time in the dahdi code. I had designed the driver so that pre-reserved devices that are not present on the system failed on open() (this looked like the right thing to do, when a physical device does not exist for a pre-reserved dahdi channel). But on that execution path (failure in open()), dahdi was not decrementing the module reference count, and thus my oufxs module remained locked in memory until the next reboot. I fixed the bug in my dahdi-base.c copy, and posted a bug report on Digium’s site.

A funny story went on then, where the site administrator kept removing the fix code that I posted because of site policy reasons, until I had to describe the fix in English. Anyway, my fix was finally accepted and made it into their CVS (or whatever it is they use for version control).

Anyway, the code now works well (supposedly in both dahdi 2.2.x and 2.3.x, although I have not yet checked both thoroughly). So you can perfectly well pass the driver at load time a parameter like this:

rsvsn2chan=1234ABCD:2,DEADDEAD:4

and have two dahdi channels, 2 and 4, pre-reserved for the two boards with serials 1234ABCD and DEADDEAD. This means that now you can go into /etc/dahdi/system.conf and specify parameters for channels 2 and 4, knowing that each setting will always pertain to a given board, no matter which USB port this board is plugged into. Moreover (and more importantly) you can fix extension numbers for dahdichans 2 and 4 in /etc/asterisk/users.conf and know that an extension will always pertain to a given board. Which, I guess, means you will no longer receive accidental calls intended for your colleague-next-door.
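For readers who want to see how such a parameter string decomposes, here is a small Python sketch that parses the `serial:channel` pairs. It is not the driver's parser (the driver is kernel C code); `parse_rsvsn2chan` is a made-up helper, and the validation rules (8-hex-digit serials at most, channels starting at 1, no duplicates) are assumptions added for the example.

```python
def parse_rsvsn2chan(value: str) -> dict:
    """Parse 'SERIAL:CHAN,SERIAL:CHAN,...' into {serial: channel}.

    Serials are hexadecimal (4 bytes, as stored in the EEPROM);
    channels are decimal dahdi channel numbers.
    """
    mapping = {}
    for item in value.split(","):
        serial_text, channel_text = item.split(":")
        serial = int(serial_text, 16)
        channel = int(channel_text, 10)
        if not 0 <= serial <= 0xFFFFFFFF:
            raise ValueError(f"serial out of range: {serial_text}")
        if channel < 1:
            raise ValueError(f"bad channel number: {channel_text}")
        if serial in mapping or channel in mapping.values():
            raise ValueError(f"duplicate serial or channel in: {item}")
        mapping[serial] = channel
    return mapping


reserved = parse_rsvsn2chan("1234ABCD:2,DEADDEAD:4")
for serial, channel in reserved.items():
    print(f"board {serial:08X} -> dahdi channel {channel}")
```

Rejecting duplicates up front matters here because two boards claiming the same channel number would defeat the whole purpose of the reservation.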

There is still a caveat, which I will maybe resolve in a future code fix. This fail-on-open() policy of mine for pre-reserved devices has an unwanted side effect: when a device is actually plugged in, chan_dahdi has to be reloaded in Asterisk in order for the device to be opened and work. This may not be what one wants in a production system, because it will break all active dahdi calls. So, in the future, I think I could try a more clever policy, i.e. permit the open() to succeed and then make sure in all other places in the driver that no attempt will be ever made to access non-existing hardware. This feels like a totally new bunch of bugs are waiting for me to stumble on them, so I am leaving it for now.

Another idea would be, if this does not somehow already exist in dahdi, to cooperate with Digium and make this pre-reservation code into a separate (dahdi) module, that would apply to other pluggable devices as well. Ouch, this looks like a big, big adventure for courageous and fearless kernel programmers — certainly not what I feel like right now, after all these nights of bug-hunting.

As has always been the case in this Open USB FXS saga, each new feature opens the way for some interesting new work to do in the future. Nevertheless, the current state is not too bad either: my devices can now be software-upgraded and be given fixed channel numbers, settings, and extension numbers. So why not try these features yourself (and let me know how it went)?

So now, what?

October 12, 2010

It’s been some time since I last posted anything. I have not made any substantial progress with the board or the driver, so I do not have anything new to report. On the other hand, I have spent a considerable amount of time preparing a small stock of working prototypes. Whether you are considering ordering a board or not, a visit to the ordering information page would be worth the trouble.

Monitoring the blog’s web traffic and statistics shows a steady flow of referral hits from a small number of sites linking here, plus a growing number of search hits. Search strings reveal that a non-negligible number of people are looking around for things like how to program the Si3210 or what an FXS design schematic looks like. I don’t want to sound arrogant, but searches like these seem to be answered very well by the contents of this blog. Good, if this is your case, why not provide some feedback by leaving a comment? Did you find what you were looking for? Are there other sites/wikis/blogs that contain similar information, so that it would be worth linking to them from this blog? And so on.

To come to the title of this post: so now what? Am I going to progress any further with this Open USB FXS thing? The answer is, I don’t really know. As much as I dislike obituaries, except for a few optimizations (mainly in the Linux driver), there is little that I could do to improve on this very design. It might be worth though doing other things, like for example writing a Windows/Skype driver for Open USB FXS, or working on another, improved design. Besides that, there are things that I ought to do, like e.g. contact Digium and attempt to have them include my “oufxs” driver code in their dahdi distribution. This and others are things that I ought to do, but right now I am either too lazy or too busy to start with. So, do you have an idea/proposal/suggestion? Do you wish to participate and help? Yes? Good, then why not step in and leave a comment?

That’s all for now. So now, what? You, the readers, know the answer to this question better than I do.

And a page for debugging

August 12, 2010

This is another quick note: a new page with tons of information on setting up, testing and debugging your DIY (or ready-made) board is here. Happy DIY’ing!

Update, Aug 24: Statistics show that quite a few people read the blog and especially the two how-to guides. I would appreciate receiving comments and views from these readers.

A page for DIY’ers

July 27, 2010

This is a quick one: besides some ready-made prototypes shipped, I have also shipped the first Open USB FXS DIY kit! There is a new page for DIY’ers here, which I will keep updating with photos and hints. The ordering information page is still here.

Happy DIY’ing, prospective builders!

Careful with that choke, Eugene!

July 21, 2010

Yes, the title of this post is a paraphrase of a song’s title from the early Pink Floyd. Judging from the creepy scream coming straight out of hell that is heard about the end of the song, it seems that Eugene wasn’t that careful with his axe after all (that’s “axe” instead of “choke” in the song’s title). I found out many signs of carelessness in my first prototypes, too, but it seems that I have cured all of them. Here are all the nasty details.

To begin with, I am putting down sort of an apology for being so late in writing this new post (if you don’t like apologies or if you don’t care about the reasons I was late, you can safely move on to the next paragraph). As said in Power Games, I had given my laptop for repair when testing the first ten prototype boards. The boards misbehaved in the DC-DC converter part, and working with a different PC led me into the conclusion that this was the fault of the other PC’s USB port or power supply. Well… maybe… The boards now work OK with my (repaired) laptop, but this required a series of fixes to several hardware bugs and issues that I discovered. These issues are discussed in this post, however there were many more reasons why I was so late. To begin with, in the meantime since my last post I lost my job, because of the Greek public sector’s frugality policy (all contracts in the public sector were discontinued and only public servants remained in employment). As a collateral damage, besides losing my income, my lab workspace at work was also gone; thus I had to set up a lab workbench anew, from whatever ingredients were handy. The only candidate place for housing a new lab was  a sort of warehouse that we are renting across the street. There was a slight problem though: this warehouse did not have electricity, telephone and Internet…  My home and the warehouse being in the opposite side of the street, laying cables around was not an option. However, the two sites are in visual contact, so this gave me the following idea: I moved my home wireless router outdoors facing the warehouse and verified this still covers the interior of the house; then I installed an old unused wireless access point at the warehouse and configured it as a wireless bridge; finally, I used a couple of Sipura’s (a 3000 and a 9000) to gateway my home’s PSTN connection to the wireless LAN  and to an IP phone located at the warehouse. 
Fine, now I had my home phone and Internet connections extended to the warehouse! But how did I power all these at the warehouse? Easy: I installed a 130Wp solar cell on the roof, 230 Ah batteries and a 1000W inverter in the basement, and I now have enough power to satisfy my needs: five to ten hours of 50 to 100 W daily! Of course, during all these installations, I have been (and still am) actively seeking a job, and I have been running a couple of errands that might turn into regular jobs in the future. As you can imagine, all this fuss kept me away from my prototypes for quite some time.

Not forever, though: since last Monday, I got back into debugging my prototypes. And there are quite a few bugs to find. To begin with, my Mouser BOM was once again wrong in one component: I discovered that I had ordered the tantalum capacitor C25 as 1uF/25V instead of the correct 10uF/25V. Still worse, the 1uF caps had been used on all prototypes. Stupid as this might be, it was a hopeful discovery, since replacing that cap with the right-valued one might fix the problem. However, practice proved this was not a big issue. Probably if power came from an unregulated supply, the 1uF cap would also cause problems, but it seems that in a USB supply chain there are plenty of capacitors to compensate for the lost 9uF. As discussed later however, 1uF caps were problematic with larger inductor values that I tested.

After replacing C25 on a couple of boards, I re-tested both the fixed and the not-yet-fixed boards, and they were all now producing a high-frequency noise (a “hiss”). My next mistake was that I attributed this to the same DC-DC converter issue as before. Although this assumption was wrong, it helped me a lot, in that it gave me an incentive to search further and try out a few things.

The first thing I did was some good reading on inductors. Ferrite core inductors like L1 are known to saturate easily under high currents (check here for more details). It seemed like saturation was what I was seeing on the scope. To explain that in more depth, let me repeat the schematic of the converter here.

As explained in Silabs’ AN45, the converter works in cycles of loading L1 with energy by switching Q7 on, and then letting L1 discharge so that it pumps current out of C9 via D1, thus resulting in a negative VBAT.

When operating as expected, L1 oscillates at its self-resonant frequency (from my first oscilloscope shots, oscillation is apparent and the self-resonant frequency can be deduced to be on the order of 10 times the frequency of the converter, which yields about 8MHz for the originally-used power chokes; note that this frequency is highly choke-dependent). However, if L1 gets saturated, it does not resonate; it acts like a wire, letting the current through. This has the effect of storing much less energy within the inductor. When Q7 cuts off, the inductor pumps less current through D1, and this eventually is sensed as insufficient power in the VBAT line. Normally, the reaction of the Si3210 in this case would be to lengthen the ON period of its PWM signal to charge L1 even more. Alas, this worsens the situation, first by saturating L1 even quicker, and second by heating up the choke’s copper, thus increasing its series resistance and degrading its characteristics even further. Eventually, the converter sees an overcurrent condition and cuts off Q7, possibly even for a full cycle or more.

The more inductance L1 has, the more energy it can store. However, more energy means that more voltage and more current are required to make the converter work, and neither is in ample supply in a USB-powered device. Thus, in my design I had tried to use values of 5V and less than 0.5A, respectively. These values called for a smaller-than-usual inductance value for L1 (100 uH instead of a more common value of 150 uH), and a higher operating frequency (80 kHz instead of a more common frequency of 65-70 kHz). What would happen if I used a larger inductor, then?
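To put rough numbers on this trade-off, here is a minimal Python sketch. It is an illustration only and rests on assumptions that are not part of the original design notes: an ideal (lossless, non-saturating) inductor, a constant 5 V across it while Q7 is on, a ramp starting from zero current, and 0.5 A taken as an illustrative peak current. The function names are made up for this example.

```python
# Idealised inductor arithmetic; all names here are hypothetical helpers.

def stored_energy_j(inductance_h, current_a):
    """Energy held in an ideal inductor: E = 1/2 * L * I^2."""
    return 0.5 * inductance_h * current_a ** 2


def ramp_time_s(inductance_h, current_a, supply_v):
    """Time for an ideal inductor to ramp from 0 A to current_a
    with a constant voltage supply_v across it: t = L * I / V."""
    return inductance_h * current_a / supply_v


SUPPLY_V = 5.0   # USB supply voltage
PEAK_A = 0.5     # assumed peak current, used only for illustration

# (inductance in uH, converter frequency in kHz) pairs taken from the text
for l_uh, f_khz in ((100, 80), (150, 65)):
    l_h = l_uh * 1e-6
    energy_uj = stored_energy_j(l_h, PEAK_A) * 1e6
    ramp_us = ramp_time_s(l_h, PEAK_A, SUPPLY_V) * 1e6
    period_us = 1e3 / f_khz
    print(f"{l_uh} uH: {energy_uj:.2f} uJ at {PEAK_A} A, "
          f"ramp {ramp_us:.1f} us, period {period_us:.1f} us at {f_khz} kHz")
```

Under these idealised assumptions, the 100 uH choke holds 12.5 uJ at 0.5 A and needs 10 us to get there, while the 150 uH choke holds 18.75 uJ but needs 15 us, which no longer fits within the 12.5 us period of an 80 kHz converter. This is consistent with pairing the smaller inductor with the higher frequency.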

Enough with the theory, now; I had to find something to fix my problematic prototypes. I first replaced L1 on a prototype with a 150 uH inductor. It already worked better than before. From AN45, I knew I had to tweak the values of DRs 93 and 92 to adapt the frequency and the min off time to 150 uH.

So my next move was to look again at the wctdm.c code. I used a value of 230 for DR 93 and a value of 25 for DR 92 and, suddenly, the 150 uH board worked perfectly. OK! This was a sign that I was on the right track.

Getting back to wctdm.c, I was surprised to find a couple of lines of code that I had neglected so far, which instruct the chip to perform a calibration of the DC-DC converter. The wctdm.c code tries to make sure that the calibration result (a value between 0 and 15) is not too high or too low (i.e., lies between 2 and 13), presumably to ensure that the converter is working within its designed range. Repeating these lines in my driver and testing with a 100 uH board yielded a consistent calibration result of 0. This did not sound very good, so I thought about adjusting DRs 92 and 93.
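The range check described above can be paraphrased in a few lines of Python. This is a sketch only, not the wctdm.c code itself (which is C kernel code); the constant and function names are hypothetical, and only the bounds (0 to 15 for the raw result, 2 to 13 for the acceptable window) come from the description above.

```python
# Hypothetical names; the bounds are the ones quoted in the text.
CAL_MIN = 2    # lowest calibration result considered acceptable
CAL_MAX = 13   # highest calibration result considered acceptable


def calibration_in_range(result):
    """Return True if a DC-DC calibration result (0..15) lies within 2..13."""
    if not 0 <= result <= 15:
        raise ValueError("calibration result must be between 0 and 15")
    return CAL_MIN <= result <= CAL_MAX


# The 100 uH board consistently reported 0, which falls outside the window.
print(calibration_in_range(0))
```

A result of 0 sits at the very bottom of the scale, which is why it looked suspicious.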

I then played a lot with the values of DRs 92 and 93, to no avail. Then I fitted a 150 uH inductor on a second board, only to find out that the hiss was not gone on that one. This meant that the cause was not the inductor.

After that, it took me less than a minute to find the real cause of the hiss: it was C5 and C6, which were missing on all boards except, by the most devilish coincidence, the one with the replaced 150 uH inductor! If you remember, C5 and C6 were also wrongly ordered as 220 nF instead of 22 nF. Thus, I had used some stashed caps for C5 and C6 on a few boards and had left C5 and C6 unpopulated on the rest of the boards, while placing a new order for 22 nF caps. Due to the long time since my last test, and although the 22 nF caps had arrived, I had totally forgotten about the missing ones. These caps form a sort of low-pass filter that reduces the digital noise of the converter on the telephony line. I added the missing caps and, voila, now all prototypes were working fine.

Fine? Not exactly. I had implemented a switch argument (highfreq) on the driver to use different DR 92/93 values for the 150 uH-choke boards. Now I tested the two 150 uH boards again, and one of them was producing a beeping noise from time to time. Sure enough, I had not yet replaced the wrong 1 uF C25 with a 10 uF one on that board. Thus, I replaced C25 on all boards and re-tested. Now all boards were working fine.

I now have some prototypes available that can be ordered. I also feel safe shipping DIY kits, in that I (finally!) know that the materials are correct and, if assembled correctly, the boards are likely to work. The prices we have set with Medion 7 are EUR 29.50 for a DIY kit and EUR 42.50 for a tested prototype. You should add P&P and VAT (where applicable) to these prices. Currently, I only have four or five prototypes available, and two of them are 150 uH ones. I am soon going to have more boards assembled and tested. The availability of the boards will soon be announced in the ordering page of this blog. Please use the openusbfxs at gmail dot com address for ordering. Two pages on how to DoItYourself and how to set up the driver are to follow as well, so stay tuned.

On a final note, I am sorry for keeping people waiting for so long, although I guess I had plenty of reasons for that. I hope that all will be back to normal now. I also hope that the boards will behave OK on other people’s PCs and other devices, but I must stress once more that there is no such guarantee.