
MicArrayEchoCancellation

Walkthrough: C++
Capturing Audio Streams with Acoustic Echo Cancellation
and Beamforming
About This Walkthrough
In the Kinect for Windows Software Development Kit (SDK), the MicArrayEchoCancellation sample shows how to capture an audio stream from the microphone array of the Kinect for Xbox 360 sensor by using the MSRKinectAudio Microsoft DirectX media object (DMO) in a Microsoft DirectShow graph. This document provides a walkthrough review of the MicArrayEchoCancellation sample.

Resources
For a complete list of documentation for the Kinect for Windows SDK Beta, plus related reference material and links to the online forums, see the beta SDK website at:
http://kinectforwindows.org
Contents

Introduction
Program Description
Create and Configure the MSRKinectAudio DMO
Select the Kinect Sensor's Microphone Array
    Enumerate the Device Index
    Determine the Device Index
Record the Captured Stream and Determine the Source Direction
    Set Up the Data Buffer
    Set the Output Format
    Allocate Resources and the Output Buffer
    Capture the Audio Stream and Determine Source Direction
License: The Kinect for Windows SDK Beta is licensed for non-commercial use only. By installing, copying, or otherwise
using the beta SDK, you agree to be bound by the terms of its license. Read the license.
Disclaimer: This document is provided as-is. Information and views expressed in this document, including URL and other
Internet Web site references, may change without notice. You bear the risk of using it.
This document does not provide you with any legal rights to any intellectual property in any Microsoft product. You may
copy and use this document for your internal, reference purposes.
© 2011 Microsoft Corporation. All rights reserved.
Microsoft, DirectShow, DirectX, Kinect, MSDN, Windows, and Windows Media are trademarks of the Microsoft group of
companies. All other trademarks are property of their respective owners.


Introduction
The audio component of the Kinect for Xbox 360 sensor is a four-element linear microphone array.
An array provides some significant advantages over a single microphone, including more sophisticated
acoustic echo cancellation and noise suppression, and the ability to determine the direction of a sound
source.
The primary way for C++ applications to access the Kinect sensor's microphone array is through the
MSRKinectAudio Microsoft DirectX media object (DMO). A DMO is a standard COM object that can
be incorporated into a Microsoft DirectShow graph or a Microsoft Media Foundation topology. The
Kinect for Windows Software Development Kit (SDK) Beta includes an extended version of the
Windows microphone array DMO, referred to here as the MSRKinectAudio DMO, to support the
Kinect microphone array.
The MSRKinectAudio DMO supports all the standard microphone array functionality, which includes:

• Acoustic echo cancellation (AEC)
• Microphone array processing (MicArray)
• Noise suppression (NS)
• Automatic gain control (AGC)
• Voice activity detection (VAD)
• Sound source localization, which identifies the direction of the source in the horizontal plane
• Beamforming, which allows the array to function as a steerable directional microphone. The DMO supports 11 beams, with fixed directions that range from -50 to +50 degrees in 10-degree increments.

For more information on the standard microphone array, see Microphone Array Support in Windows
Vista and How to Build and Use Microphone Arrays for Windows Vista on the Microsoft Developer
Network (MSDN) website.
Although the internal details of the MSRKinectAudio DMO are different, you use it in much the same way
as the standard microphone array DMO, with the following exceptions. The MSRKinectAudio DMO:

• Has its own class identifier (CLSID), CLSID_CMSRKinectAudio.
• Exposes sound source localization functionality through a new interface, ISoundSourceLocalizer.
• Supports an additional microphone array mode, adaptive beamforming, which uses an internal sound source localizer to automatically determine the beam direction.

The MicArrayEchoCancellation sample shows how to capture an audio stream from the Kinect sensor's
microphone array by polling the MSRKinectAudio DMO in source mode. The application uses AEC to
record a high-quality audio stream and beamforming to determine the direction to the sound source.
The DMO can also be used with a Microsoft Media Foundation topology. For an example, see
MFAudioFilter Walkthrough: C++ Sample on the beta SDK website.
Note DirectShow is COM-based, and this document assumes that you are familiar with how to use
COM objects and interfaces. You do not need to know how to implement COM objects. For the basics
of how to use COM objects, see Programming DirectX with COM on the MSDN website. That MSDN
topic is written for DirectX programmers, but the basic principles apply to all COM-based applications.


Program Description
MicArrayEchoCancellation is installed with the Kinect for Windows Software Development Kit (SDK)
Beta samples in %KINECTSDK_DIR%\Samples\KinectSDKSamples.zip. MicArrayEchoCancellation is a
C++ console application that is implemented in MicArrayEchoCancellation.cpp.
The basic program flow is as follows:

1. Create and configure the MSRKinectAudio DMO.
2. Enumerate the available capture devices and select the Kinect sensor's microphone array.
3. Record 10 seconds of audio stream and determine the source direction as the capture process progresses.

To run MicArrayEchoCancellation, start MicArrayEchoCancellation.exe and follow the instructions in the
console window.
Tip Before attempting to capture audio from the microphone array, you must be actively streaming to
the audio render device that is specified for the DMO, typically the system's speakers. Otherwise, the
MSRKinectAudio DMO fails. AEC is designed to cancel interfering sounds, so there must be something
to cancel. The simplest solution is to start playing a tune in Windows Media Player before you run
the application. The Libraries\Music\Sample Music folder on your Windows PC contains some sample
music files.
The following is a lightly edited version of the output from a MicArrayEchoCancellation session, where
the sound source moved from side to side as capture progressed:
Start a song in Windows Media Player and then press any key to start recording (echo cancellation
processing expects speakers to be producing sound).
Recording using DMO
AEC-MicArray is running ... Press "s" to stop
Position: -0.051290
Confidence: 1.000000
Beam Angle = 0.0000000
Sound output was written to file: C:\KDK\Samples\Audio\MicArrayEchoCancellation\CPP\AECout.wav

The recording process uses beamforming, which creates a single directional channel from the four
microphones in 16-kHz, 16-bit mono pulse code modulation (PCM) format. The channel is oriented to
one of the 11 beam directions. MicArrayEchoCancellation uses adaptive beamforming, which
automatically selects the beam that is closest to the source direction.
You can use the captured stream for many purposes. MicArrayEchoCancellation simply writes the
captured audio stream to AECout.wav, which is a .wav file that can be played with Windows Media
Player.
The rest of this document is a walkthrough of the MicArrayEchoCancellation sample. It describes all the
sample's functionality except for writing the captured stream to a .wav file. For details on that process,
see the sample code.


Note This document includes code excerpts, most of which have been edited for brevity and
readability. In particular, most routine error-correction code has been removed. For the complete code,
see the MicArrayEchoCancellation sample. Hyperlinks in this walkthrough refer to content on the
MSDN website.

Create and Configure the MSRKinectAudio DMO


The application's entry point, _tmain, manages the overall program execution, with private methods
handling most details. The first step is to create and configure an instance of the MSRKinectAudio
DMO, as follows:
#include "MSRKinectAudio.h"
...
int __cdecl _tmain(int argc, const TCHAR ** argv)
{
HRESULT hr = S_OK;
CoInitialize(NULL);
int iMicDevIdx = -1;
IMediaObject* pDMO = NULL;
IPropertyStore* pPS = NULL;
...
SetPriorityClass (GetCurrentProcess(), HIGH_PRIORITY_CLASS);
CoCreateInstance(CLSID_CMSRKinectAudio,
NULL, CLSCTX_INPROC_SERVER,
IID_IMediaObject, (void**)&pDMO);
pDMO->QueryInterface(IID_IPropertyStore, (void**)&pPS);
PROPVARIANT pvSysMode;
PropVariantInit(&pvSysMode);
pvSysMode.vt = VT_I4;
pvSysMode.lVal = (LONG)(4) ;
pPS->SetValue(MFPKEY_WMAAECMA_SYSTEM_MODE, pvSysMode);
PropVariantClear(&pvSysMode);
...
}
In addition to standard header files, the file includes the beta SDK header file, MSRKinectAudio.h. This
file contains the globally unique identifiers (GUIDs) and interface declarations for the
MSRKinectAudio DMO.
Before creating the DMO, MicArrayEchoCancellation calls the SetPriorityClass function to set the
process's priority to HIGH_PRIORITY_CLASS. This helps ensure that the microphone is not preempted
during the capture process.
MicArrayEchoCancellation calls the CoCreateInstance function to create an instance of the DMO and
obtain its IMediaObject interface, which supports a set of methods that let you control the DMO. A
DMO supports a property store, which you access through an IPropertyStore interface, that contains
a collection of key-value pairs. You configure a DMO by setting the appropriate keys in the object's
property store.


DMO property store keys are GUIDs that are defined by Media Foundation. The associated values are
packaged as a PROPVARIANT structure, which must be initialized by calling the PropVariantInit
function. You then assign values to the following two members:

• PROPVARIANT.vt, a VARENUM value that specifies the data type. For example, a 4-byte signed integer corresponds to the VT_I4 data type.
• A value member, whose name depends on vt. For VT_I4, the value member is PROPVARIANT.lVal.

MicArrayEchoCancellation calls the DMO's QueryInterface method to obtain an IPropertyStore
pointer, and then calls the IPropertyStore::SetValue method to specify the system mode. The key is
MFPKEY_WMAAECMA_SYSTEM_MODE, and four modes are available, each of which has a
corresponding VT_I4 value as shown in the following table.
Mode                                          Value
Single-channel with AEC                       0
Microphone array                              2
Microphone array with AEC                     4
Single-channel with automatic gain control    5

MicArrayEchoCancellation uses the array with AEC, so the value is set to 4.


Note MicArrayEchoCancellation uses the MSRKinectAudio DMOs default microphone array mode
which enables adaptive beamforming. So MicArrayEchoCancellation does not explicitly set
MFPKEY_WMAAECMA_FEATR_MICARR_MODE.
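
If you do want to set the microphone array mode explicitly, the pattern is the same property-store call that the sample uses for the system mode. The following fragment is an illustrative sketch only, not part of the sample; it assumes the MFPKEY_WMAAECMA_FEATURE_MODE key and the MIC_ARRAY_MODE value MICARRAY_ADAPTIVE_BEAM as documented for the standard Windows AEC/microphone-array DMO:

// Illustrative sketch (not in the sample): explicitly select adaptive beamforming.
// Assumes MFPKEY_WMAAECMA_FEATURE_MODE and the MIC_ARRAY_MODE value
// MICARRAY_ADAPTIVE_BEAM as documented for the standard AEC DMO.
PROPVARIANT pv;

// Feature mode must be enabled before feature-level keys can be changed.
PropVariantInit(&pv);
pv.vt = VT_BOOL;
pv.boolVal = VARIANT_TRUE;
pPS->SetValue(MFPKEY_WMAAECMA_FEATURE_MODE, pv);
PropVariantClear(&pv);

// Select the adaptive-beamforming microphone array mode.
PropVariantInit(&pv);
pv.vt = VT_I4;
pv.lVal = MICARRAY_ADAPTIVE_BEAM;
pPS->SetValue(MFPKEY_WMAAECMA_FEATR_MICARR_MODE, pv);
PropVariantClear(&pv);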

Select the Kinect Sensor's Microphone Array

Because a system can have more than one active microphone, MicArrayEchoCancellation must specify
which microphone stream is to be captured by setting the MFPKEY_WMAAECMA_DEVICE_INDEXES key
to the appropriate device index. To determine the index, MicArrayEchoCancellation enumerates the
available microphones, determines which one is the Kinect sensor's microphone array, and uses that
index to set the value, as follows:
int __cdecl _tmain(int argc, const TCHAR ** argv)
{
    ...
    hr = GetMicArrayDeviceIndex(&iMicDevIdx);

    PROPVARIANT pvDeviceId;
    PropVariantInit(&pvDeviceId);
    pvDeviceId.vt = VT_I4;
    pvDeviceId.lVal = (unsigned long)(iSpkDevIdx << 16) |
                      (unsigned long)(0x0000ffff & iMicDevIdx);
    pPS->SetValue(MFPKEY_WMAAECMA_DEVICE_INDEXES, pvDeviceId);
    PropVariantClear(&pvDeviceId);
    ...
}


The value of MFPKEY_WMAAECMA_DEVICE_INDEXES is a 32-bit integer that contains both speaker and
microphone indices:

• The speaker index is in the upper 2 bytes.
• The microphone index is in the lower 2 bytes.
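
For example, the following sketch packs the two indices the same way the sample does. The helper name PackAecDeviceIndexes and the example index values are illustrative and do not appear in the sample:

// Illustrative helper (not in the sample): pack the speaker and microphone
// device indices into the single VT_I4 value that
// MFPKEY_WMAAECMA_DEVICE_INDEXES expects.
LONG PackAecDeviceIndexes(int iSpkDevIdx, int iMicDevIdx)
{
    return (LONG)(((unsigned long)iSpkDevIdx << 16) |
                  ((unsigned long)iMicDevIdx & 0x0000ffff));
}

// With speaker index 0 and microphone index 2, the packed value is 0x00000002;
// with speaker index 1 and microphone index 2, it is 0x00010002.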

The private GetMicArrayDeviceIndex method enumerates the available capture devices and determines
the Kinect sensor's microphone array device index.

Enumerate the Device Index


The first step in determining the device index is to enumerate the available capture devices, as follows:
HRESULT GetMicArrayDeviceIndex(int *piDevice)
{
    HRESULT hr = S_OK;
    UINT index, dwCount;
    IMMDeviceEnumerator* spEnumerator;
    IMMDeviceCollection* spEndpoints;

    *piDevice = -1;
    CoCreateInstance(__uuidof(MMDeviceEnumerator), NULL, CLSCTX_ALL,
                     __uuidof(IMMDeviceEnumerator), (void**)&spEnumerator);
    spEnumerator->EnumAudioEndpoints(eCapture, DEVICE_STATE_ACTIVE, &spEndpoints);
    ...
}
GetMicArrayDeviceIndex:

1. Creates a device enumerator object, and gets a pointer to its IMMDeviceEnumerator interface.
2. Enumerates the system's capture devices by calling the enumerator object's IMMDeviceEnumerator::EnumAudioEndpoints method, which enumerates the specified types of audio endpoints.

The EnumAudioEndpoints parameter values are as follows:

• A value from the EDataFlow enumeration that indicates the device type. eCapture directs EnumAudioEndpoints to enumerate only capture devices.
• A DEVICE_STATE_XXX constant that specifies which device states to enumerate. DEVICE_STATE_ACTIVE directs EnumAudioEndpoints to enumerate only active devices.
• The address of an IMMDeviceCollection interface pointer that receives the enumerated capture devices.


Determine the Device Index


GetMicArrayDeviceIndex then determines the Kinect sensor's microphone array device index, as follows:
HRESULT GetMicArrayDeviceIndex(int *piDevice)
{
    ...
    spEndpoints->GetCount(&dwCount);
    for (index = 0; index < dwCount; index++)
    {
        IMMDevice* spDevice;
        spEndpoints->Item(index, &spDevice);

        GUID subType = {0};
        GetJackSubtypeForEndpoint(spDevice, &subType);
        if (subType == KSNODETYPE_MICROPHONE_ARRAY)
        {
            *piDevice = index;
            break;
        }
    }
    ... // Clean up and return
}
To determine the device index, GetMicArrayDeviceIndex:

1. Calls the IMMDeviceCollection::GetCount method to determine the number of devices in the collection.
2. Calls the IMMDeviceCollection::Item method for each capture device to get its IMMDevice interface.
3. For each capture device, passes the IMMDevice interface to the private GetJackSubtypeForEndpoint method to determine the device subtype. The KSNODETYPE_MICROPHONE_ARRAY subtype corresponds to a microphone array, presumably belonging to the Kinect sensor.

When GetMicArrayDeviceIndex finds this subtype, it returns the associated device index to _tmain.
GetJackSubtypeForEndpoint determines the device's subtype, as follows:
HRESULT GetJackSubtypeForEndpoint(IMMDevice* pEndpoint, GUID* pgSubtype)
{
    ...
    IDeviceTopology* spEndpointTopology;
    IConnector*      spPlug;
    IConnector*      spJack;
    IPart*           spJackAsPart;

    pEndpoint->Activate(__uuidof(IDeviceTopology), CLSCTX_INPROC_SERVER,
                        NULL, (void**)&spEndpointTopology);
    spEndpointTopology->GetConnector(0, &spPlug);
    spPlug->GetConnectedTo(&spJack);
    spJack->QueryInterface(__uuidof(IPart), (void**)&spJackAsPart);
    hr = spJackAsPart->GetSubType(pgSubtype);
    ...
}
To determine a capture device's subtype, you must determine what the capture device is connected to
and query that connector for the capture device's subtype. GetJackSubtypeForEndpoint:

1. Calls the IMMDevice::Activate method to obtain the object's IDeviceTopology interface.
2. Calls the IDeviceTopology::GetConnector method to get the device's connector.
3. Calls the IConnector::GetConnectedTo method to determine what the connector from step 2 is connected to.
4. Calls QueryInterface on the object from step 3 to get its IPart interface.
5. Calls the IPart::GetSubType method to get the capture device's subtype GUID.

Record the Captured Stream and Determine the Source Direction

After configuring the DMO to capture an audio stream from the Kinect sensor's microphone array,
_tmain calls the private DShowRecord method to record the stream.

Set Up the Data Buffer


The output data is contained in a private CStaticMediaBuffer object that is assigned to the pBuffer
member of a DMO_OUTPUT_DATA_BUFFER structure, as follows:
HRESULT DShowRecord(...)
{
    ...
    CStaticMediaBuffer outputBuffer;
    DMO_OUTPUT_DATA_BUFFER OutputBufferStruct = {0};
    OutputBufferStruct.pBuffer = &outputBuffer;
    ...
}
For details on the buffer object, see the sample.

Set the Output Format


DShowRecord defines the output format by passing a DMO_MEDIA_TYPE structure with the information
to IMediaObject::SetOutputType, as follows:
HRESULT DShowRecord(IMediaObject* pDMO, IPropertyStore* pPS,
                    const TCHAR* outFile, int iDuration)
{
    ...
    WAVEFORMATEX wfxOut = {WAVE_FORMAT_PCM, 1, 16000, 32000, 2, 16, 0};
    DMO_MEDIA_TYPE mt = {0};

    hr = MoInitMediaType(&mt, sizeof(WAVEFORMATEX));
    mt.majortype = MEDIATYPE_Audio;
    mt.subtype = MEDIASUBTYPE_PCM;
    mt.lSampleSize = 0;
    mt.bFixedSizeSamples = TRUE;
    mt.bTemporalCompression = FALSE;
    mt.formattype = FORMAT_WaveFormatEx;
    memcpy(mt.pbFormat, &wfxOut, sizeof(WAVEFORMATEX));
    hr = pDMO->SetOutputType(0, &mt, 0);
    MoFreeMediaType(&mt);
    ...
}
The output is defined by a WAVEFORMATEX structure as follows:

• PCM audio
• 1 channel
• 16,000 samples/second
• 32,000 bytes/second, on average
• A block alignment of 2, which means that the DMO processes 2 bytes of data at a time
• 16 bits/sample
• No extra format information
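
The aggregate initializer in the excerpt above fills the WAVEFORMATEX fields in declaration order. The following field-by-field form is simply a restatement of the same format for readability; it is not additional code from the sample:

// The same output format as the sample's aggregate initializer,
// written field by field.
WAVEFORMATEX wfxOut = {0};
wfxOut.wFormatTag      = WAVE_FORMAT_PCM; // PCM audio
wfxOut.nChannels       = 1;               // mono
wfxOut.nSamplesPerSec  = 16000;           // 16,000 samples/second
wfxOut.nAvgBytesPerSec = 32000;           // nSamplesPerSec * nBlockAlign
wfxOut.nBlockAlign     = 2;               // 2 bytes per sample frame (1 channel x 16 bits)
wfxOut.wBitsPerSample  = 16;              // 16 bits/sample
wfxOut.cbSize          = 0;               // no extra format information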

Allocate Resources and the Output Buffer


DShowRecord next allocates resources, determines the audio frame size, and allocates an output buffer,
as follows:
HRESULT DShowRecord(...)
{
    ...
    DWORD cOutputBufLen = 0;
    BYTE *pbOutputBuffer = NULL;
    ...
    hr = pDMO->AllocateStreamingResources();

    int iFrameSize;
    PROPVARIANT pvFrameSize;
    PropVariantInit(&pvFrameSize);
    pPS->GetValue(MFPKEY_WMAAECMA_FEATR_FRAME_SIZE, &pvFrameSize);
    iFrameSize = pvFrameSize.lVal;
    PropVariantClear(&pvFrameSize);

    cOutputBufLen = wfxOut.nSamplesPerSec * wfxOut.nBlockAlign;
    pbOutputBuffer = new BYTE[cOutputBufLen];
    cTtlToGo = iDuration * 100;
    ...
}
IMediaObject::AllocateStreamingResources allocates any resources that the DMO requires.
MicArrayEchoCancellation gets the frame size from the DMO's property store, by passing the
MFPKEY_WMAAECMA_FEATR_FRAME_SIZE property key to the IPropertyStore::GetValue method. The
output buffer length is the product of the number of samples per second (16,000) and the block
alignment value (2), or 32,000 bytes, which is one second of audio. That value is then used to dimension
the output buffer. Finally, DShowRecord sets the cTtlToGo value to specify the maximum number of
frames to be recorded.
DShowRecord then prepares a file to receive the data. For details, see the sample.

Capture the Audio Stream and Determine Source Direction


Before starting the capture loop, DShowRecord calls QueryInterface on the DMO to get an
ISoundSourceLocalizer interface, which exposes the beamforming algorithms that determine the
source direction, as follows:
HRESULT DShowRecord(...)
{
    ...
    ISoundSourceLocalizer* pSC = NULL;
    ...
    hr = pDMO->QueryInterface(IID_ISoundSourceLocalizer, (void**)&pSC);
    ...
}
DShowRecord then starts a capture loop, to record the audio stream, as follows:
HRESULT DShowRecord(...)
{
    ...
    while (1)
    {
        Sleep(10); // sleep 10 ms
        if (cTtlToGo-- <= 0)
            break;

        do
        {
            // fill buffer
        } while (OutputBufferStruct.dwStatus &
                 DMO_OUTPUT_DATA_BUFFERF_INCOMPLETE);

        if (_kbhit())
        {
            int ch = _getch();
            if (ch == 's' || ch == 'S')
                break;
        }
    }
    // Clean up and return
}


The outer while loop captures the audio data one buffer at a time. If the maximum number of frames
has been recorded, the capture loop terminates. Otherwise, the inner do-while loop collects the next
buffer.
The do-while loop is essentially a wait loop that cycles until the next buffer is ready. If the buffer is not yet
ready, the DMO_OUTPUT_DATA_BUFFER.dwStatus value is set to
DMO_OUTPUT_DATA_BUFFERF_INCOMPLETE. After the buffer is ready, the loop checks whether the
user has pressed the "s" or "S" key. If so, the outer loop immediately terminates. Otherwise, the outer
loop continues to the next buffer.
After the recording process is complete, DShowRecord returns and _tmain performs cleanup and exits.
The do-while loop handles the mechanics of obtaining a filled buffer and determining the source
direction, as follows:
do
{
    outputBuffer.Init((byte*)pbOutputBuffer, cOutputBufLen, 0);
    OutputBufferStruct.dwStatus = 0;
    hr = pDMO->ProcessOutput(0, 1, &OutputBufferStruct, &dwStatus);
    if (hr == S_FALSE) {
        cbProduced = 0;
    }
    else {
        hr = outputBuffer.GetBufferAndLength(NULL, &cbProduced);
    }

    WriteToFile(hFile, pbOutputBuffer, cbProduced);
    totalBytes += cbProduced;

    hr = pSC->GetBeam(&dBeamAngle);
    double dConf;
    hr = pSC->GetPosition(&dAngle, &dConf);
    if (SUCCEEDED(hr))
    {
        if (dConf > 0.9)
        {
            _tprintf(_T("Position: %f\t\tConfidence: %f\t\tBeam Angle = %f\r"),
                     dAngle, dConf, dBeamAngle);
        }
    }
} while (OutputBufferStruct.dwStatus &
         DMO_OUTPUT_DATA_BUFFERF_INCOMPLETE);
The loop starts by initializing the output buffer object and setting
DMO_OUTPUT_DATA_BUFFER.dwStatus to 0. It then calls the DMO's IMediaObject::ProcessOutput
method, which drives processing samples through the pipeline in the following way:

• If ProcessOutput returns S_FALSE, the buffer is not filled yet. In that case, dwStatus is set to DMO_OUTPUT_DATA_BUFFERF_INCOMPLETE and the loop repeats.

• If ProcessOutput returns S_OK, the buffer is filled, and DShowRecord calls the buffer object's GetBufferAndLength method to get the buffer's length. The DMO_OUTPUT_DATA_BUFFERF_INCOMPLETE flag is cleared, so the do-while loop terminates and the outer loop can proceed to the next buffer. For details on GetBufferAndLength, see the sample.

The loop next writes the buffer to a file; the write does nothing if the buffer is not ready. For details, see
the sample.
Finally, the loop calls ISoundSourceLocalizer::GetBeam and ISoundSourceLocalizer::GetPosition:

• GetBeam retrieves the direction of the currently selected beam, in 0.175-radian (10-degree) increments relative to camera coordinates.
• GetPosition returns the estimated source direction.

Because the beam is limited to one of the 11 supported directions, it typically does not point directly
to the source and is accurate to only 5 degrees. GetPosition uses a source localization algorithm that
provides a more accurate estimate of the source direction, including a confidence value that is a
measure of the estimate's accuracy.
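
GetBeam and GetPosition report angles in radians. If your own code needs degrees, the conversion is straightforward; the variable names below are illustrative and not from the sample:

#define _USE_MATH_DEFINES // so that <math.h> defines M_PI on Visual C++
#include <math.h>

// Illustrative conversion (not in the sample): report the beam angle
// and estimated source position in degrees instead of radians.
double dBeamAngleDeg = dBeamAngle * 180.0 / M_PI; // e.g., 0.175 rad is about 10 degrees
double dAngleDeg     = dAngle * 180.0 / M_PI;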
You must call ProcessOutput before you call GetBeam or GetPosition. The ProcessOutput method
drives processing audio samples through the audio pipeline, which is required to determine the source
direction. GetBeam is successful only after ProcessOutput returns S_OK:

• If ProcessOutput returns S_OK, the loop checks the GetBeam return value to determine whether the source direction has changed and, if it has, prints the new direction.
• If ProcessOutput returns S_FALSE, GetBeam fails silently and the loop repeats.

For More Information


For more information about implementing audio and related samples, see the Programming Guide
page on the Kinect for Windows SDK Beta website at:
http://kinectforwindows.org
