I’ve recently been given a Nabaztag rabbit. It’s able to do lots of little things such as reading RSS feeds, reporting the weather and air quality, … The problem is that it’s not reliable: it sometimes crashes, or repeats the same things over and over again. And most of all, it’s not very configurable unless you pay for a subscription (but who would want to pay for something that doesn’t work…). Anyway, it’s time for a “Do It Yourself” approach !

I’ve been playing with PIC audio sampling, ADC and PWM for a while, and I have to say the results are quite amazing ! Building a bot able to report the weather should be easy, provided:

  1. we’re able to produce a wav speech file from text (using text-to-speech, TTS)
  2. information from the Internet can be retrieved and parsed

I’ve tested several TTS programs: espeak, festival, kttsd. Of these, espeak is easy to use and produces a nice robotic voice :) It’s also able to put the result in a wav file. As for information retrieval, Yahoo Weather offers a nice API that returns an XML file (no registration needed) which is easy to parse (I actually used sed to get the weather and temperature from the XML file…).

The principle is the following:

The Jal program is simple: after configuring a PWM pin, it just receives data from the serial link (115200 bds) and sets the duty cycle with an 8-bit resolution. It can play a PCM, 8-bit, mono, 11025 Hz wav file (a wav file also contains a header describing the encoding. That header is sent too, but that’s not a big deal :) ).
include sb_config
include sb_protocol
include sb_mainboard

-- Configure PWM
pragma target ccp1  rb0
pin_b0_direction = output
include pwm_hardware_1
PWM_Init_Resolution(20,true,false)

include sb_sound
pwm_on()

var byte char
forever loop
    if sb_serial_read(char)
    then
        PWM_Set_DutyCycle(char,0)
    end if
end loop
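
On the PC side, it can help to check that a wav file really is PCM, 8 bit, mono, 11025 Hz before pushing it down the serial link. Here is a minimal sketch in Python using the standard `wave` module. The function name `pcm_bytes_for_pwm` is made up for this illustration, and unlike the setup described above it strips the wav header instead of sending it. Actually writing the bytes to the serial port is left out (it would need a serial library, which is an assumption beyond this post).

```python
import io
import wave


def pcm_bytes_for_pwm(wav_file):
    """Return the raw samples of a PCM, 8-bit, mono, 11025 Hz wav file.

    wav_file is a path or a binary file object. Each returned byte is
    one sample, i.e. one duty-cycle value for the PWM loop above.
    Hypothetical helper, not part of the SirBot code.
    """
    with wave.open(wav_file, "rb") as w:
        fmt = (w.getnchannels(), w.getsampwidth(), w.getframerate())
        if fmt != (1, 1, 11025):
            raise ValueError("expected 8-bit mono 11025 Hz, got %r" % (fmt,))
        return w.readframes(w.getnframes())
```

Note that at 115200 bds with 8 data bits plus start and stop bits, the link carries 11520 bytes per second, which is close to the 11025 Hz sample rate. That is presumably why the PIC can simply play each byte as it arrives.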
And… here’s the result:

(there might be problems with the video: the voice might be too low, this is a mov-to-flv conversion problem…).

If not clear enough, it says:
"""
Hi ! This is the SirBot Project. Today, I’m gonna show you how to simulate those crappy Nabaztag stuff…
Here is the deal. I’m going to search information about the weather on the Internet.
Yahoo can help us. It has a great API to retrieve information about the weather.
So, let’s go !

I’ve been configured to check the weather on Paris.
So, here’s the weather report on Paris, France:

The weather is “Partly Cloudy”.
Temperature is “19”.
"""

I couldn’t resist putting the whole thing into a new bot, SirFreud… Applying what I’ve learned, this bot is able to detect a sound and determine whether it’s coming from the left or the right. No accurate angle is computed here since, for an unknown reason, the timer used to compute delays grows faster than expected (see part 2, section “Measuring the delay”). Anyway, here’s a small video showing SirFreud…

Time to put the whole thing into a (I hope) great experiment…

The first step is to validate the theory, the algorithm and its configuration. So the first expected result is to determine whether we’re able to know if the sound is coming from the left or the right. In a second step, we’ll try to determine the angle precisely.

1. Hardware

So, we first need to build two sound sensors. These are just LM386-based pre-amps for electret microphones. Here’s a little picture. It has been designed to be as small as possible. I’ll later add the complete diagrams and build instructions directly within the SirBot Project.

The sound sensors are connected to the mainboard through a breadboard. It’s not shown here, but during the experiment the mics are about 15cm apart. Since we’re not going to measure the angle, we don’t have to worry about accuracy. What is important is that this distance is greater than 2.75cm (see theory).

2. Software

The Jal program, as described in part 2 (algorithm), measures the delay between the sound hitting the first mic and then the second. It handles a timeout if no sound is detected by the second mic. Note that the program waits for a while once data is available. Without this wait (see first experiment), there’s too much data, and it’s inconsistent and useless. The program is available here.

3. Attempt #1

The first experiment consists in:
  1. knocking 3 times near the left mic
  2. knocking 3 times between the mics
  3. knocking 3 times near the right mic
  4. knocking 3 times between the mics
  5. knocking 3 times near the left mic
In this experiment, no delay is occurring when data is available (sound localized). Plotting what the first and the second mic have received (shift with delay), this clearly shows all the expected peaks.

Looking more deeply into the data also shows there are no consistent results. Given one peak, the program alternately detects the sound coming from left, right, left, etc… Kind of a saturation… This can be seen when plotting the delay: if the delay is positive, the sound is coming from the left, if negative, it’s coming from the right. We’re thus expecting a clear distribution above and below 0 (and a mix when the sound is centered). The following figure shows this distribution is far from what we’re expecting…

While I first thought about a bug in the Jal program or, worse, a bug in the theory/algorithm, I then tried adding the delay…

4. Attempt #2

In this case, as soon as the data is available, the program sends the result and waits a little while. Thus, most of the sound wave is ignored, except the very first point, which is what we need. Note this experiment consists in:

  1. knocking 3 times near the left mic
  2. knocking 3 times near the right mic
  3. knocking 3 times near the left mic
  4. knocking 3 times near the right mic

We clearly see the delay is alternatively positive (left) and negative (right) (point with delay = 0 is an artifact and should be ignored).
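
The reading rule used here (positive delay means left, negative means right, zero is an artifact) can be sketched in a few lines of Python. This is only an illustration: the function names are made up, and the delay values in the usage note are invented, not measured data.

```python
def side_of(delay):
    """Map a signed delay to a side, following the convention above.

    Positive delay: sound from the left. Negative: from the right.
    A delay of exactly 0 is treated as an artifact and gives None.
    """
    if delay > 0:
        return "left"
    if delay < 0:
        return "right"
    return None


def sides(delays):
    """Classify a series of delays, dropping the zero-delay artifacts."""
    return [side_of(d) for d in delays if d != 0]
```

For instance, `sides([12, 0, -9])` would give `["left", "right"]`.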

Knocking near the mic has been done so the sound wave is very narrow (with plastic pieces). The delay is long enough so the next time we’re acquiring data, that’ll be for the next knock. Now, with the same delay, knocking a glass (sound is resonating) doesn’t give the same results… (knocking 2 times near the left mic, then 2 times near the right one).

There’s still something to say… The first two peaks seem to come from the left (ignoring null delays), but for the last two peaks it’s hard to say, even if the delay is larger when negative than when positive…

So… This cost me a *lot* of time, but the results are interesting: if we wait long enough (but not too long) between data acquisitions, we’re able to localize sound quite nicely… This delay is clearly important and depends on the type of sound waves we’re listening to.
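
The idea of waiting between acquisitions can be replayed offline on recorded data. Below is a minimal Python sketch, assuming a list of `(time, delay)` events sorted by time. The name `first_hits` and the `holdoff` parameter are hypothetical, and on the real bot the wait happens inside the Jal program, not on the PC.

```python
def first_hits(events, holdoff):
    """Keep the first event of each sound and ignore what follows it.

    events:  list of (time, delay) tuples, sorted by time.
    holdoff: how long to ignore new events after one has been kept
             (same unit as the event times).
    """
    kept = []
    next_allowed = None
    for t, delay in events:
        if next_allowed is None or t >= next_allowed:
            kept.append((t, delay))
            next_allowed = t + holdoff
    return kept
```

A short `holdoff` lets the tail of a resonating sound (the glass) through as bogus events, while a too long one would swallow the next knock. That is the trade-off described above.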

Next, I’ll try to determine where the sound comes from, computing the angle.

In the previous part, we’ve been able to find a way to theoretically localize a sound in space: “the delay between the two mics gives a distance x, which is correlated to the angle the sound comes from, as x = d·sin(β)”. OK. Now it’s time to get back to reality, and apply this. Several problems can occur:

  • the acquisition time may be too long
  • we cannot predict which microphone the sound will hit first
  • the delay between the mics may be too long… or too short
  • for some reason, one of the microphones may not record the sound

1. ADC acquisition time

We’re trying to measure (quite) precisely the delay which occurs when a sound hits a microphone, then another. Since the speed of sound is ~ 343m/s @20°C, this delay can be very short. The distance between the microphones is obviously involved, but so is the analog-to-digital conversion acquisition time. We’ll name it tacq so it looks like a complicated variable, meaning we’re probably smart people. So, tacq corresponds to:

  •  the time the PIC uses to setup the ADC. With a 20MHz Xtal, that’s about 4.8µs (setup) + 10µs (delay) ~ 15µs
  • the ADC itself. We’re using low resolution ADC (8 bits) but let’s compute it for high resolution (10 bits), so we’ll get a max value. With a 20MHz Xtal, it costs 1.6µs per bit, so 10 * 1.6µs = 16µs. There’s also a need to wait a little time (Tad time, see specs) after the acquisition. It costs 1µs.
So the total is 15 + 16 + 1 = 32µs. Because we’re a little bit paranoid, we’ll round this value up to 40µs. So:
tacq ~ 40µs

2. Minimal distance between the microphones

In a “classic” scenario, a sound wave hits the first microphone, then the second. Neglecting the code managing this sequence, we need at least 2 * tacq = 80µs. That is, the distance between the mics cannot be less than what gives this delay. But… in this scenario, we’re waiting for the sound to hit the first mic, while we actually don’t know which one is first, since the sound can come from anywhere. So:
  1. if a microphone detects a sound, we must check the second. The delay will give the distance, and so the angle of the sound
  2. if a microphone didn’t detect any sound, we must switch the mics, and go back to step one.
From 1., this means we’ll need 2 * tacq = 80µs. Using the speed of sound, this value corresponds to 80×10⁻⁶ s × 343 m/s = 0.02744m ≈ 2.75cm. So the minimal distance between the microphones is:
dmin ~ 2.75cm
From 2., this means that, in the worst case, the width of the sound wave must be at least:
Wmin ~ 80µs
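
The arithmetic of sections 1 and 2 can be re-checked with a small Python sketch. All figures are the ones quoted above (15µs setup, 1.6µs per bit, 1µs Tad, 343 m/s). The function and parameter names are made up for the illustration.

```python
SPEED_OF_SOUND = 343.0  # m/s at 20°C, value used in this post


def acquisition_time_us(setup_us=15.0, bits=10, per_bit_us=1.6, tad_us=1.0):
    """ADC acquisition time in µs: setup + conversion + Tad wait."""
    return setup_us + bits * per_bit_us + tad_us


def min_mic_distance_m(t_acq_us):
    """Smallest mic spacing that still leaves room for two acquisitions."""
    return 2 * t_acq_us * 1e-6 * SPEED_OF_SOUND
```

`acquisition_time_us()` gives the 32µs computed above (before rounding up to 40µs), and `min_mic_distance_m(40)` gives about 0.0274m, that is the 2.75cm of dmin.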

3. Measuring the delay

Measuring the delay occurring when the sound hits the first microphone, then the second, can be done using PIC timers. The PIC 16F88 has several timers available. An interesting one is timer1, a 16-bit counter, the largest this PIC can offer. Using the internal clock as reference, the timer is incremented at Fosc/4 = 20MHz / 4 = 5MHz (every 2×10⁻⁷ s = 0.2µs). Since it’s a 16-bit timer, it can count up to 65536 * 0.2µs = 13107µs, which corresponds to ~ 4.5m.
dmax = 4.5m
2.75cm < d < 450cm
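
The timer range can be reproduced the same way. This Python sketch assumes, as above, a 16-bit timer clocked at Fosc/4 with no prescaler. The name `timer_range` is hypothetical.

```python
def timer_range(fosc_hz=20_000_000, bits=16, speed_of_sound=343.0):
    """Return (tick in s, longest measurable delay in s, matching distance in m).

    The timer is assumed to increment once per instruction cycle (Fosc/4).
    """
    tick_s = 4.0 / fosc_hz
    max_delay_s = (2 ** bits) * tick_s
    return tick_s, max_delay_s, max_delay_s * speed_of_sound
```

With the default values this gives a 0.2µs tick, a maximum delay of about 13107µs and a distance of about 4.5m, matching dmax.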

4. Borderline cases

Mostly due to the acquisition time, data and information can be lost. In particular, being able to detect a sound depends on the wave width:
  1. The wave’s width is large enough so acquisition can be done at least one time during the wave.
  2. The peak is too narrow, due to a too long acquisition time, we skip the information.
Note that in case 1, the same information (“there is sound right now”) may be sent several times. Results may require an aggregation over a given period of time (we need to define what a peak is). Assuming the width is large enough, there may also be problems when the sound hits the second microphone. Actually, the second one may not even receive a signal at all:

This case is close to the first one, and is related to the minimal width. Another reason why the second mic may not receive a signal is related to the threshold. If mic1’s signal is close to the threshold value, mic2 may not register the signal, as it could be considered background noise:
So let’s summarize the whole:
  • the acquisition time is an important factor to get reliable results. We must use a max-speed Xtal, that is 20MHz.
  • the distance between the microphones can’t be randomly chosen. It’s between 3cm and 450cm. 10 or 15cm is good. If too long, the sound level may be too low when hitting the second mic.
  • there may be cases where information is lost, due to the wave form (too narrow) or to the fact that we can’t know which mic the sound will hit first (this increases the required acquisition time since we may need 3 acquisitions to get a result). Note, a possible optimization would say: “there’s a high probability that when a sound hits micA first, the next sound will also hit micA first. So if micA is the first, keep it as the first for the next sound detection”.
Those different points have to be tested and experienced in a “real-world” context. That’s what the next post will talk about…

One of the purposes of TweetyBot is to be able to know where sounds come from. This isn’t actually required to teach birds to sing correctly, but it’s so much fun. And ultimately, the robot would be able to work out which bird is responding and turn its “head” towards it…

Anyway. Here’s the problem: having two microphones, how do we know where the sound comes from ? What’s the angle ?

The problem looks like the following: if we’re able to know the delay between the sound wave hitting mic A and mic B, then we should be able to know the corresponding distance (the speed of sound being constant), which directly depends on the angle/direction of the sound. Measuring this delay is actually feasible, so that’s the good news.

Considering borderline cases, we have:

If sound comes from left or right, “delay distance” is the same as distance between mics

If sound is localized exactly between the mics, distance is null (no delay)

This smells like sine or cosine… After a few hours (days ?) recalling the maths and geometry I learned as a little child, here’s a figure summarizing the problem, with the solution.
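
To make the relation concrete, here is a small Python sketch of the sine relation, x = d·sin(β), where x is the “delay distance” and d the distance between the mics. The angle convention (0° straight ahead, ±90° fully to one side) and the function name are assumptions of this sketch, and the clamping only guards against rounding or measurement noise.

```python
import math


def angle_from_delay(delay_s, mic_distance_m, speed_of_sound=343.0):
    """Angle of the sound source in degrees, from the delay between mics.

    Uses x = d * sin(beta), with x = delay * speed of sound.
    0 means the source is exactly between the mics, +90 or -90 means
    it is fully on one side (the sign follows the sign of the delay).
    """
    if mic_distance_m <= 0:
        raise ValueError("mic distance must be positive")
    x = delay_s * speed_of_sound
    ratio = max(-1.0, min(1.0, x / mic_distance_m))
    return math.degrees(math.asin(ratio))
```

The two borderline cases above come out as expected: no delay gives 0°, and a “delay distance” equal to the mic spacing gives 90°.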

Note: I’ve submitted the problem to Master Fenyo. As usual, he said:

“- You’re dumb… There are multiple solutions to your problem since there’s one equation and two unknown variables.
- Ah…
- Yes. Considering your problem in a Euclidean space, you can see vectors blabla, blabla…
- Ah… But…
- No, it won’t work !
- But look at this figure.
- OK. This is only valid if you consider you have a plane sound wave
- Ah…
- This is only valid if the distance between your microphones is greater than…
- OK. Anyway, whatever the distance will be, I assume I can use any sound waveform I want to get my result the way I want…”

This being said, he’s right and results will only be an approximation…

After trying a simple peak detector (failed) and a bar-graph LM3916-based sound sensor (almost a success), it’s time to reach for the best, the amazing but scary analog-to-digital conversion… This should have been the most difficult, but it’s actually the simplest… Yes, thanks to a not-so-indigestible 16F88 datasheet, some nice Jal libraries and precious information from Great Bert’s website, I can now get this kind of graph (rapping near the electret microphone):

How does this work ?

It uses the LM386 based preamp electret from the peak detector (refer to the mainboard to have the whole base schematic). Since it converts sound into voltage, it can be directly wired to the PIC 16F88 (I hope/think so…).

The Jal code is quite simple (use SirBot’s trunk):

include sb_config
include sb_protocol
include sb_mainboard

-- configure ADC
const ADC_hardware_Nchan      = 3         ;number of selected channels
const ADC_hardware_NVref      = 0         ;number of external references
const ADC_hardware_Rsource    = 10_000    ;maximum source resistance
const ADC_hardware_high_resolution = false;true = high resolution = 10 bits
include adc_hardware
ADC_init

pin_a0_direction = input    ; electret mic is connected to...
forever loop
    var byte res = ADC_read_low_res(0)
    echo(res)
end loop

For now, only one ADC channel is used, but soon there’ll be at least two to localize sound in space (see later). No external Vref is used, so +5V/0V will be used as references. That’s OK since the output of the electret microphone preamp ranges between those.

I’ve tested the whole in “real” condition, that is recording my birds. The result is quite nice: the sound sensor is able to detect when birds sing “like a big fat pig”. There may be problems to detect when they just “twitter in the fresh air of the morning”, though…

My first idea was to set two thresholds: one above which birds are considered to twitter, another above which they sing like a big fat pig… This way, when the bot simulates songs, I would be able to know whether the birds are responding the correct way or not. This probably needs better amplification.
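
The two-threshold idea could look like this on the PC side, once the 8-bit samples have been echoed over the serial link. It’s a sketch only: the threshold values 140 and 200 are pure placeholders (nothing was measured), and it assumes that a larger ADC value means a louder sound.

```python
def classify_level(sample, twitter_threshold=140, loud_threshold=200):
    """Classify one 8-bit ADC sample (0..255) with two thresholds.

    Both thresholds are hypothetical and would have to be tuned
    against real recordings.
    """
    if not 0 <= sample <= 255:
        raise ValueError("expected an 8-bit sample")
    if sample >= loud_threshold:
        return "loud"      # "like a big fat pig"
    if sample >= twitter_threshold:
        return "twitter"   # "in the fresh air of the morning"
    return "quiet"
```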

Anyway, this sound sensor seems to be the most usable:

  • few components are required
  • no need to adjust sensitivity: everything can be configured through software
  • result is far richer than a binary response (got sound or not)
  • this is a first step to actually record sound, and play them back from the PC
Next step is to determine if the acquisition time (analog-to-digital conversion) is short enough to use two (or three) sensors to localize where sounds come from (see graph here). This time should be short compared to the time it takes a sound wave to hit one sound sensor, then another. This will require the use of timers…

Replacing the 16F628 with a 16F88, that’s the next “main” step for the SirBot Project. One of its most interesting features is the possibility of using a bootloader to self-program the chip. This avoids unplugging the chip from the board and plugging it into the programmer, it’s way faster and, most importantly for me, it seems to be the only way to program my chips using a USB-to-serial converter, since my PIC01 programmer (like many JDM programmers) can’t work with one. This is (IIRC from my investigation) due to voltage differences: a programmer needs 12V, but it can only get 5V from USB. Anyway, it can’t work, and since my laptop doesn’t have any serial port, this is for me the only alternative (yes, I have a serial port on my desktop PC, but it’s not my main PC).

So, in this context, a bootloader must be able to:

  • program my chips through the USB-to-serial converter
  • work under Linux (platform independent)
  • easily integrate within the SirBot Project (if possible)
I first tried Bloader and its screamer. What I liked is the little program which blinks LEDs connected to any port and sends “Ok” through the serial link. This is helpful to ensure everything is ready. But… I couldn’t make it work using wine. I just got an “Error: type mismatch”, followed by the meaningful alert message: “13”. I tried different baudrates but always got the same error.

I did not try further and switched to Tiny Bootloader. This bootloader seems to work fine with the 16F88 and with Jal. What’s more, a Python script, pytbl, is available to drive it: this is a great opportunity to use it within SirBot. So, let’s go… I first flashed the 20MHz/115200bds hex file with my programmer. Using wine, I’m able to detect my 16F88 but… still unable to write the flash (“Could not write… Error…”). However, it works like a charm with a “real” serial port (%*$%%µ! converter). During my Bloader/Screamer exploration, it was mentioned that the baudrate configuration (9600, …, 115200) was useful when a usb-to-serial converter was involved… Heh ? Sounds interesting ! I thought: “if you take that asm file, modify it to use a 20MHz xtal and 9600bds, compile the whole thing and put it in your 16F88, there will be a magical moment, for sure…”. And I did it… But no magical moment occurred, still the same error. Ah, yes, I can remember the magical moment. It was when I compiled the asm file with MPLAB (crap, gpasm failed). It said “The file path [to the asm file] mustn’t be longer than 62 characters”…

Then I had the great idea: using VirtualBox, I’m able to run XP under Linux, attach my usb-to-serial converter, then launch TinyBootloader, then program my 16F88, then have my magical moment…

Now, what about pytbl ? It can detect the chip, but after it programs it (without errors), nothing works anymore, not even the bootloader… I’ve contacted the author, so more on this later (I hope).

Last time, I made a few tests with a simple sound sensor (which I couldn’t make work, but I’m not giving up…). This time, it’s a bit more complex, since this sensor has an LM3916 dot/bar display driver:

  • at the beginning, there’s the same preamp electret mic LM386-based,
  • output is connected to the LM3916 (constructor’s schematics)

The idea is the following: this sound sensor will be used within the TweetyBot project, to detect when the birds sing loud (means “like a pig”) or soft (means “twittering in the fresh air of the morning”). If they sing loud, the display driver will light the high-dB LEDs. Using and/or gates (Master Fenyo said), I could connect the 10 LM3916 outputs to 4 input pins (2^3 < 10 < 2^4). It’s like an ADC, but levels are “steppized” (hum, I mean not continuous…). Anyway, even without those considerations, the result is quite nice…
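
What the and/or gates would compute can be simulated in Python: take the 10 outputs, find the highest lit LED, and write that level on 4 bits. This is a software sketch of the encoding only, with made-up function names. The actual gate wiring is not covered here.

```python
def bar_level(outputs):
    """Return the level 0..10 from the 10 driver outputs.

    outputs: 10 booleans, lowest LED first, True meaning "lit".
    Taking the highest lit LED works for both bar and dot display modes.
    """
    if len(outputs) != 10:
        raise ValueError("expected exactly 10 outputs")
    level = 0
    for position, lit in enumerate(outputs, start=1):
        if lit:
            level = position
    return level


def to_4_bits(level):
    """Encode a level 0..10 as 4 bits, most significant bit first."""
    if not 0 <= level <= 10:
        raise ValueError("level must be between 0 and 10")
    return tuple((level >> shift) & 1 for shift in (3, 2, 1, 0))
```

For example, three lit LEDs give level 3, that is `(0, 0, 1, 1)` on the 4 input pins.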

 

Just for the fun, this is one of my pretty birds, singing (almost) like a pig…

After determining which sound sensors should be used for the TweetyBot project, I’ve just tried to get a peak detector working. It should be easy to build, but it’s not. After spending most of Sunday on it, nothing is working…

This peak detector is quite simple:

  1. a LM386 acts as a preamp for the electret microphone
  2. a first part of a (hard-to-find and expensive) OPA2277-PA handles adjustment of the sensitivity
  3. another part of a (hard-to-find and expensive) OPA2277-PA handles comparisons and triggers the output level: 0V means “no sound”, +5V means “hey I got a sound”.

While I can get something from the LM386 output (tested with a multimeter and SirProbe), I get nothing from the OPA2277 output. I think this sensor must be built on a real board to avoid any contact failure, and tested with a real oscilloscope…

Yep ! I’ve just received a couple of PIC 16F88 microcontrollers. Hard to find (farnell), but finally got them.

16F88 is great compared to 16F628 because:

  • It has twice as much memory (4K words)
  • It has a built-in ADC
  • and most importantly, its memory can be programmed through software, that is, it can handle a bootloader.

So I start to play. I launch the latest version of ic-prog using wine, connect to my PIC01 programmer, and then try to program a bootloader (Tiny Bootloader), connect to it and flash a hex file… It fails. “PIC not found”… Errh ? OK, try again. Make sure the device is verified. OK. Make sure the device is correctly connected. OK. Try #2… Failed… Well, it seems that starting with a bootloader isn’t what can be called a “Hello World” for this new PIC. I need the “blinking led” test. Got a program from a website. Program it. Test it. OK. It works, the LED is blinking. I then try to compile one of my own programs (from SirBot). After fighting for an hour to find the JAL libraries and compile the program, I try it. The PIC is supposed to echo chars through the USART. Nothing… Nothing from the PIC…

Everything works like a charm using a 16F628, so why not now ? Maybe because the PICs are not the same :) I’ve read that “pins are compatible”, but it seems I read it too fast. Looking at the specs, the pins are not the same, starting with the USART’s:

I should really read the specs. At least, look at the pictures…
