Sound, tone and DTMF generation by using managed DirectSound and C# and sine tone detection with pure managed Goertzel algorithm implementation

February 17, 2008


[This blog was migrated. You will not be able to comment here.
The new URL of this post is http://khason.net/blog/sound-tone-and-dtmf-generation-by-using-managed-directsound-and-c-and-sine-tone-detection-with-pure-managed-goertzel-algorithm-implementation/]


Well, today, we'll speak about math. A lot of math. We have a number of challenges today.

  • Generate a sine, square or dual sine tone (DTMF, the sounds your phone keypad produces)
  • Play it from managed code
  • Detect the base frequencies using the Goertzel algorithm (an efficient way to evaluate single terms of the Fourier transform)

Let's start with tone generation. All we need is DirectSound. First of all, add an alias for its namespace

using DXS = Microsoft.DirectX.DirectSound;

Now, let's create our Device

Microsoft.DirectX.DirectSound.Device m_device = new Microsoft.DirectX.DirectSound.Device();
SourceHost = App.Current.MainWindow.Content as FrameworkElement;

m_device.SetCooperativeLevel(((HwndSource)PresentationSource.FromVisual(SourceHost)).Handle, DXS.CooperativeLevel.Priority);

As you can see, the Device behaves like a Control, so we have to give it a window handle to sit on. Nine times out of ten, when you run it, you'll get a strange exception: "DLL 'C:\Windows\assembly\GAC\Microsoft.DirectX\1.0.2902.0__31bf3856ad364e35\Microsoft.DirectX.dll' is attempting managed execution inside OS Loader lock. Do not attempt to run managed code inside a DllMain or image initialization function since doing so can cause the application to hang.". To get rid of it, just disable the LoaderLock exception under the Managed Debugging Assistants section (Debug > Exceptions) in Visual Studio. It's not very clear why it happens, but it does. After doing that, we can continue with in-memory wave generation.

DXS.WaveFormat format = new DXS.WaveFormat();
format.BitsPerSample = BitsPerSample;
format.Channels = Channels;
format.BlockAlign = BlockAlign;
format.FormatTag = DXS.WaveFormatTag.Pcm;
format.SamplesPerSecond = SampleRate;
format.AverageBytesPerSecond = format.SamplesPerSecond * format.BlockAlign;

DXS.BufferDescription desc = new DXS.BufferDescription(format);
desc.DeferLocation = true;
desc.BufferBytes = sample.Length;
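The format fields above are not independent: for PCM, BlockAlign and AverageBytesPerSecond follow from the channel count, bit depth and sample rate. Below is a minimal sketch of that arithmetic. The PcmFormat class and its method names are hypothetical helpers made up for illustration; they are not part of DirectSound.

```csharp
using System;

// Hypothetical helper (not part of DirectSound): derives the dependent
// PCM format fields from the independent ones.
public static class PcmFormat
{
    // Bytes per sample frame: one sample for every channel.
    public static short BlockAlign(short channels, short bitsPerSample)
    {
        return (short)(channels * bitsPerSample / 8);
    }

    // Bytes consumed per second of audio.
    public static int AverageBytesPerSecond(int sampleRate, short channels, short bitsPerSample)
    {
        return sampleRate * BlockAlign(channels, bitsPerSample);
    }

    // Buffer size in bytes for a clip of the given duration (in seconds).
    // Counting whole frames first keeps the size a multiple of BlockAlign.
    public static int BufferBytes(int sampleRate, short channels, short bitsPerSample, double duration)
    {
        int frames = (int)(sampleRate * duration);
        return frames * BlockAlign(channels, bitsPerSample);
    }
}
```

For 8-bit mono, which the byte-based samples in this post suggest, BlockAlign is 1 and the buffer length equals the number of samples.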

Now we can create a SecondaryBuffer on the Device, write the sample into it and play it

CurrentSample = sample;
CurrentBuffer = new DXS.SecondaryBuffer(desc, m_device);
CurrentBuffer.Write(0, sample, DXS.LockFlag.EntireBuffer);
CurrentBuffer.Play(0, DXS.BufferPlayFlags.Default);           

We're done. Now let's create a buffer to play. We'll start with the simplest one, a sine wave. This type of wave looks like this

[image: sine wave]

Not too hard to understand how to generate it

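int length = (int)(SampleRate * BlockAlign * duration);
byte[] buffer = new byte[length];

// Phase step per sample, in radians
double A = frequency * 2 * Math.PI / (double)SampleRate;
for (int i = 0; i < length; i++)
{
    // Assumes 8-bit mono PCM (BlockAlign == 1): samples are unsigned,
    // silence is 128, so the sine is scaled to the range 1..255
    buffer[i] = (byte)(128 + 127 * Math.Sin(i * A));
}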

However, phone tones (DTMF) are made of two waves. The result looks like this

[image: sum of two sine waves]

As you can see, we take two waves and sum them. Here is the table of tones used for DTMF.

     

                          Upper band
                1209 Hz  1336 Hz  1477 Hz  1633 Hz
Lower   697 Hz     1        2        3        A
band    770 Hz     4        5        6        B
        852 Hz     7        8        9        C
        941 Hz     *        0        #        D

 

There are also signalling events in telephony. They are built from two frequencies as well

                  Lower band   Upper band
Busy signal       480 Hz       620 Hz
Dial tone         350 Hz       440 Hz
Flash (ringback)  440 Hz       480 Hz

So, to create such waves, we use the following function (the simple one), where n is the sample index and rate is the sample rate

(128 + 63 * Math.Sin(n * 2 * Math.PI * LowFreq / rate) + 63 * Math.Sin(n * 2 * Math.PI * HighFreq / rate))
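To make the formula concrete, here is a minimal sketch that looks up the two frequencies of a keypad symbol in the DTMF table and fills an 8-bit buffer with their sum. It assumes unsigned 8-bit mono PCM; the DtmfGenerator class and its method names are hypothetical and not taken from the original project.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper class; names are illustrative only.
public static class DtmfGenerator
{
    static readonly int[] LowBand = { 697, 770, 852, 941 };
    static readonly int[] HighBand = { 1209, 1336, 1477, 1633 };
    // Keypad laid out row by row, matching the DTMF table.
    const string Keys = "123A456B789C*0#D";

    // Looks up the (low, high) frequency pair of a keypad symbol.
    public static KeyValuePair<int, int> Frequencies(char key)
    {
        int index = Keys.IndexOf(char.ToUpperInvariant(key));
        if (index < 0) throw new ArgumentException("Not a DTMF key: " + key);
        return new KeyValuePair<int, int>(LowBand[index / 4], HighBand[index % 4]);
    }

    // Sums two sines around the 8-bit midpoint (128).
    // Amplitudes 63 + 63 keep every sample inside 0..255.
    public static byte[] DualTone(double lowFreq, double highFreq, int sampleRate, double duration)
    {
        byte[] buffer = new byte[(int)(sampleRate * duration)];
        for (int n = 0; n < buffer.Length; n++)
        {
            double t = 2 * Math.PI * n / sampleRate;
            buffer[n] = (byte)(128 + 63 * Math.Sin(t * lowFreq) + 63 * Math.Sin(t * highFreq));
        }
        return buffer;
    }

    // Convenience wrapper: the tone of one keypad symbol.
    public static byte[] Key(char key, int sampleRate, double duration)
    {
        KeyValuePair<int, int> f = Frequencies(key);
        return DualTone(f.Key, f.Value, sampleRate, duration);
    }
}
```

Calling DtmfGenerator.Key('5', 8000, 0.05) would give 400 samples of the 770 Hz + 1336 Hz pair, and DualTone(350, 440, 8000, 1.0) would give a one-second dial tone from the events table.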

Actually, each sine of a dual (or triple) tone is better generated recursively, without calling Math.Sin for every sample, by using something like this

int length = (int)(SampleRate * BlockAlign * duration);
byte[] buffer = new byte[length];

double A = frequency * 2 * Math.PI / (double)SampleRate;
double coeff = 2 * Math.Cos(A);
// Keep the oscillator state in doubles: reading it back from the
// byte buffer would truncate it and break the recurrence
double y1 = Math.Cos(A);      // y[-1] = cos(-A)
double y2 = Math.Cos(2 * A);  // y[-2] = cos(-2A)
for (int i = 0; i < length; i++)
{
    // y[n] = 2*cos(A)*y[n-1] - y[n-2], which equals cos(n*A)
    double y = coeff * y1 - y2;
    // Assumes unsigned 8-bit PCM: silence is 128
    buffer[i] = (byte)(128 + 127 * y);
    y2 = y1;
    y1 = y;
}

Done. We can generate and play sounds, tones and DTMF. How do we detect what has been played? For this purpose we need the Fourier transform, but computing the whole transform is slow when only a few frequencies matter. That is why Dr. Gerald Goertzel developed an algorithm that evaluates individual terms of the discrete Fourier transform efficiently; it is used in almost every DSP-related product. A little theory about it can be found on Wikipedia (the implementation there is rather bad).

I simplified it: since we know the possible source frequencies in advance, all we have to do is check the result against those well-known values

public double CalculateGoertzel(byte[] sample, double frequency, int samplerate)
{
    // Recurrence coefficient 2*cos(w), w = 2*pi*f/fs, computed once
    double coeff = 2 * Math.Cos(2 * Math.PI * frequency / samplerate);
    double Skn, Skn1, Skn2;
    Skn = Skn1 = Skn2 = 0;
    for (int i = 0; i < sample.Length; i++)
    {
        Skn2 = Skn1;
        Skn1 = Skn;
        Skn = coeff * Skn1 - Skn2 + sample[i];
    }

    // WNk = e^(-jw) is complex, so |Skn - WNk * Skn1|^2 expands to:
    double power = Skn * Skn + Skn1 * Skn1 - coeff * Skn * Skn1;

    // Magnitude in dB: 20*log10(sqrt(power)) = 10*log10(power)
    return 10 * Math.Log10(power);
}
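For reference, here is the textbook form of the Goertzel recurrence that the method above is built on. The symbols x[n] (input samples), N (sample count), f_s (sample rate) and s[n] (filter state, called Skn in the code) are notation chosen for this sketch, not the post's.

```latex
\omega = \frac{2\pi f}{f_s}, \qquad
s[n] = x[n] + 2\cos(\omega)\, s[n-1] - s[n-2], \qquad s[-1] = s[-2] = 0

|X(f)|^2 = s[N-1]^2 + s[N-2]^2 - 2\cos(\omega)\, s[N-1]\, s[N-2]
```

The second line is the squared magnitude of s[N-1] - e^{-j\omega} s[N-2], so no complex arithmetic is needed when only the strength of a frequency matters.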

This method checks a range of frequencies in steps of one fifth of the expected one

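public int TestGoertzel(int frequency, byte[] sample)
{
    int stepsize = frequency / 5;
    Dictionary<int, double> res = new Dictionary<int, double>();
    for (int i = 0; i < 10; i++)
    {
        int freq = stepsize * i;
        res.Add(freq, CalculateGoertzel(sample, freq, SampleRate));
    }

    // Assumed ending (the post's snippet is cut off here):
    // return the strongest frequency, ignoring 0 Hz (the DC offset)
    int best = 0;
    foreach (KeyValuePair<int, double> pair in res)
    {
        if (pair.Key > 0 && (best == 0 || pair.Value > res[best])) best = pair.Key;
    }
    return best;
}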

For a 6000 Hz test tone (so the step is 1200 Hz), the actual result is rather clear

1200 – -272.273842954695
2400 – -198.538262992751
3600 – -214.476846236253
4800 – -224.242925995697
6000 – 87.1653837206957 <- here result
7200 – -225.52220053475
8400 – -222.836254116698
9600 – -230.526410368965
10800 – -220.587695185849
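The same idea extends to recognising a keypad symbol: measure the Goertzel power at the four lower-band and four upper-band DTMF frequencies, then take the strongest of each. The sketch below assumes unsigned 8-bit mono PCM and a buffer that contains exactly one clean tone. The DtmfDetector class is hypothetical and not part of the original project.

```csharp
using System;

// Hypothetical helper class; a sketch, not the original project's code.
public static class DtmfDetector
{
    static readonly double[] LowBand = { 697, 770, 852, 941 };
    static readonly double[] HighBand = { 1209, 1336, 1477, 1633 };
    static readonly string[] Keypad = { "123A", "456B", "789C", "*0#D" };

    // Goertzel power of one frequency. Samples are unsigned 8-bit PCM,
    // so 128 is subtracted to remove the DC offset.
    public static double Power(byte[] sample, double frequency, int sampleRate)
    {
        double coeff = 2 * Math.Cos(2 * Math.PI * frequency / sampleRate);
        double s1 = 0, s2 = 0;
        for (int i = 0; i < sample.Length; i++)
        {
            double s = coeff * s1 - s2 + (sample[i] - 128);
            s2 = s1;
            s1 = s;
        }
        return s1 * s1 + s2 * s2 - coeff * s1 * s2;
    }

    // Index of the strongest frequency within a band.
    static int Strongest(byte[] sample, double[] band, int sampleRate)
    {
        int best = 0;
        double bestPower = double.NegativeInfinity;
        for (int i = 0; i < band.Length; i++)
        {
            double p = Power(sample, band[i], sampleRate);
            if (p > bestPower) { bestPower = p; best = i; }
        }
        return best;
    }

    // Picks the strongest row tone and column tone.
    // It does not check whether any tone is present at all.
    public static char Detect(byte[] sample, int sampleRate)
    {
        int row = Strongest(sample, LowBand, sampleRate);
        int col = Strongest(sample, HighBand, sampleRate);
        return Keypad[row][col];
    }
}
```

A real detector would also compare the winning powers against a threshold and against the other frequencies in the band, so that silence or speech is not reported as a key; that part is left out here.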

Have fun with sines, waves and math, and be good people. BTW, you can implement a full telephone keypad that actually works, using one of my previous posts. To test it, just call any service that requires you to press phone buttons and play the tone you just generated into the phone: it will accept it :)


6 comments

  1. louisa, March 9, 2008 at 18:11

    im abit comfused im only ten so im not sure what age group this website is for

  2. teo, March 29, 2008 at 17:49

    You say you have tested that piece of code ?

    If you are so kind to attach the c# file with all the code, THAT WORKS.

    Thx.

  3. Tamir Khason, March 29, 2008 at 20:40

    Teo, the code parts here were taken from a big project, so I cannot attach it to the post. However, thank you for your interest in my blog.

  4. zubair, April 10, 2008 at 10:14

    Hi, I'm making a project on DTMF detection via mic. I'm not clear on how the Goertzel works. I have used a 50 ms sound sample of the DTMF tone of key 1, i.e. 1906 Hz. Using a sampling rate of 8000, I got the result of the calculation as 39.0567… This is the result after passing the parameters into the CalculateGoertzel function. Help me out: how can I recognise the exact DTMF at run time using the Goertzel algorithm?

    Waiting for your response .

    Thanks and regards

    Zubair

  5. sagar Kadam, June 9, 2008 at 15:02

    Can you explain me which frequency you are using here in above example?

    Because each dtmf signal has two frequecies associated with it. one lowerband freq and another is upperband frequency.

    Thanks and regards,
    Sagar Kadam

  6. Godwin, October 24, 2008 at 9:15

    How do you finally get the digits from it??
    I don't understand what you mean by this…
    1200 – -272.273842954695
    2400 – -198.538262992751
    3600 – -214.476846236253
    4800 – -224.242925995697
    6000 – 87.1653837206957 <- here result
    7200 – -225.52220053475
    8400 – -222.836254116698
    9600 – -230.526410368965
    10800 – -220.587695185849

    ??
    How do I relate these to 1,2 and 3??
    I'm confused.
    Please help me if possible.
    Thanks
