The objective of this page is to teach you how to use the Intel libraries to build applications in which images or sequences of images have to be processed. In addition, the DirectShow technology is introduced; it is particularly useful for processing image sequences, including sequences captured with PC cameras.
Since this is a beginner’s guide, efforts have been made to describe in detail all the steps necessary to obtain the results shown. In addition, all the source code used here has been made available. Note, however, that the goal was to keep these programs as simple and short as possible; as a consequence, the programming style is not always of good quality. In particular, closer adherence to the object-oriented paradigm would have considerably improved the quality of the code.
All applications presented here will be simple dialog-based applications. This kind of application can easily be created using the MFC application wizard. On your Visual C++ menu bar, select the File|New option, then start the MFC AppWizard (exe). Choose a dialog-based application and select a name for it (here it is called cvision). VC++ should create a simple OK/Cancel dialog for you. The class whose name ends in Dlg will contain the member functions that control the widgets of the dialog.
The first task will be to open and display an image. To do this, we will first add a button that will allow us to select the file that contains the image. Drag a button onto the dialog, then right-click on it and select the Properties option; this will allow you to change the caption to Open Image. Once this is done, double-click on the new button and change the corresponding member function name to OnOpen. The dialog now looks like this:
The CFileDialog class is the one to use to create a file dialog. It will show up once the following code is added to the OnOpen member function:
void CCvisionDlg::OnOpen()
{
	CFileDialog dlg(TRUE, _T("*.bmp"), "",
		OFN_FILEMUSTEXIST|OFN_PATHMUSTEXIST|OFN_HIDEREADONLY,
		"image files (*.bmp; *.jpg) |*.bmp;*.jpg|AVI files (*.avi) |*.avi|All Files (*.*)|*.*||", NULL);
	if (dlg.DoModal() == IDOK) {
		// the selected file is then given by dlg.GetPathName()
	}
}
Note how the extensions of interest for the files to be opened (here .bmp, .jpg and .avi) are specified using the filter argument (the fifth) of the CFileDialog constructor. Now, by clicking on the Open Image button, the following dialog appears:
Now that we have learnt how to select a file, let’s load and display the underlying image. The Intel libraries will help us accomplish this task; in particular, the HighGUI component of OpenCV will be put to use. It contains the functions required to load, save and display images under the Windows environment. Since we will be using these libraries in all the examples that follow, we will first see how to set up our VC++ projects so that the libraries are linked to our application. Select the Project|Settings… option. A dialog will pop up. Select the C/C++ tab and the category Preprocessor. Add the following directories to additional include directories:
- C:\Program Files\Intel\plsuite\include
- C:\Program Files\Intel\opencv\cv\include
- C:\Program Files\Intel\opencv\otherlibs\highgui
Now select the Link tab, category Input. Add the following directories to additional library path:
- C:\Program Files\Intel\plsuite\lib\msvc
- C:\Program Files\Intel\opencv\lib
Finally, select the General category of the Link tab and add the libraries to be linked to the library modules:
This setup is valid for the current project only. It is a good idea to add all these directories to the global search path of your VC++ so that they are always active each time you create a new project. This can be done from the Tools|Options… menu; you then select the Directories tab. The following two screenshots show the information that should be included there.
Note also that we have included the DirectX directory information (in our case, C:\DXSDK\Lib), which we will use in later examples. This one should always be first in the list to avoid incompatibilities with other libraries.
With these global settings, only the names of the library modules need to be specified when a new project is
created:
Now add the following header file to the project, here called cvapp.h:
#if !defined CVAPP
#define CVAPP
#include <stdio.h>
#include <math.h>
#include <string.h>
#include "cv.h" // include core library interface
#include "highgui.h" // include GUI library interface
class ImageProcessor {
	IplImage *img;
public:
	ImageProcessor(const char* filename, bool display= true) {
		// load the image from file
		img= cvLoadImage(filename);
		if (display) {
			// create a window
			cvvNamedWindow( "Original Image", 1 );
			// display the image in it
			cvvShowImage( "Original Image", img );
		}
	}
	~ImageProcessor() {
		cvReleaseImage( &img );
	}
};
#endif
The function names starting with cvv are HighGUI functions. To use the ImageProcessor class in the application, just include the header in the dialog class file. Once a file has been selected, an ImageProcessor instance can be created as follows:
void CCvisionDlg::OnOpen()
{
	CFileDialog dlg(TRUE, _T("*.bmp"), "",
		OFN_FILEMUSTEXIST|OFN_PATHMUSTEXIST|OFN_HIDEREADONLY,
		"BMP files (*.bmp) |*.bmp|AVI files (*.avi) |*.avi|All Files (*.*)|*.*||", NULL);
	if (dlg.DoModal() == IDOK) {
		ImageProcessor proc(dlg.GetPathName());
	}
}
3. Processing an image
Now let’s try to call one of the OpenCV functions. We rewrite the header as follows:
#if !defined CVAPP
#define CVAPP
#include <stdio.h>
#include <math.h>
#include <string.h>
#include "cv.h" // include core library interface
#include "highgui.h" // include GUI library interface
void process(void *img); // the processing function
class ImageProcessor {
	IplImage *img;
public:
	ImageProcessor(const char* filename, bool display= true) {
		img= cvLoadImage(filename);
		if (display) {
			cvvNamedWindow( "Original Image", 1 );
			cvvShowImage( "Original Image", img );
		}
	}
	void display() {
		cvvShowImage( "Original Image", img );
	}
	void execute();
	~ImageProcessor() {
		cvReleaseImage( &img );
	}
};
#endif
and we add a C++ source file, here named cvapp.cpp, that contains the function that does the processing.
#include "stdafx.h"
#include "cvapp.h"
// A global variable
ImageProcessor *proc = 0;
// the processing function: here a simple morphological erosion
void process(void *img) {
	IplImage* image= reinterpret_cast<IplImage*>(img);
	cvErode( image, image, 0, 1 );
}
void ImageProcessor::execute() {
	process(img);
}
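The erosion performed by cvErode replaces each value by the minimum over its neighborhood. As a sanity check of the idea, here is a small OpenCV-free sketch (the function name is ours; 1-D with a 3-element window for brevity):

```cpp
#include <algorithm>
#include <vector>

// Illustrative 1-D version of a morphological erosion: each output value
// becomes the minimum of the 3-element window centered on it; border
// values are left unchanged.
std::vector<unsigned char> erode3(const std::vector<unsigned char>& in) {
    std::vector<unsigned char> out(in);
    for (std::size_t i = 1; i + 1 < in.size(); ++i)
        out[i] = std::min(in[i - 1], std::min(in[i], in[i + 1]));
    return out;
}
```

Applied to {9,5,7,3,8}, this yields {9,5,3,3,8}: small (dark) values grow, which is exactly the effect seen on an eroded image.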
The process function is the one that calls the OpenCV function that does the processing. In this example, the processing consists of a simple morphological erosion (cvErode). Obviously, all the processing could have been done directly inside the execute member function. Also, there is no justification, at this point, for having used a void pointer as parameter of the process function; this has been done for consistency with the examples to follow, where the process function will become a callback function in the processing of a sequence. Note that, for simplicity, we have added a global variable that points to the ImageProcessor instance used by this application. Let’s now modify our dialog by adding another button, i.e.:
void CCvisionDlg::OnOpen()
{
	CFileDialog dlg(TRUE, _T("*.bmp"), "",
		OFN_FILEMUSTEXIST|OFN_PATHMUSTEXIST|OFN_HIDEREADONLY,
		"image files (*.bmp; *.jpg) |*.bmp;*.jpg|AVI files (*.avi) |*.avi|All Files (*.*)|*.*||", NULL);
	if (dlg.DoModal() == IDOK) {
		if (proc != 0)
			delete proc;
		proc= new ImageProcessor(dlg.GetPathName());
	}
}
void CCvisionDlg::OnProcess()
{
	if (proc != 0) {
		proc->execute();
		proc->display();
	}
}
If you open an image and push the process button, then the result is:
In the preceding example, the image was created from a file. In many applications, it is also useful to create an image from scratch. This can be done using the IPL functions, in which case you must first create a header that specifies the image format. The following two examples show how to create a gray-level image and a color image:
IplImage* gray= iplCreateImageHeader(
	1, 0, IPL_DEPTH_8U, "GRAY", "GRAY",
	IPL_DATA_ORDER_PIXEL, IPL_ORIGIN_TL, IPL_ALIGN_QWORD,
	width, height, NULL, NULL, NULL, NULL);
iplAllocateImage(gray, 1, 0);
IplImage* color= iplCreateImageHeader(
	3, 0, IPL_DEPTH_8U, "RGB", "BGR",
	IPL_DATA_ORDER_PIXEL, IPL_ORIGIN_TL, IPL_ALIGN_QWORD,
	width, height, NULL, NULL, NULL, NULL);
iplAllocateImage(color, 1, 0);
The first parameter specifies the number of channels and the second is 0 if there is no alpha channel in the image (which is most often the case in computer vision). The third parameter defines the pixel type. An unsigned 8-bit pixel (IPL_DEPTH_8U) is the common choice, but 2-byte signed integers (IPL_DEPTH_16S) and 4-byte floats (IPL_DEPTH_32F) are also very useful. The next parameters specify the color model (basically "GRAY" or "RGB") and the channel sequence (in the case of a color image). The data order parameter specifies how the different color channels are ordered: under IPL the choices are pixel-oriented, i.e. RGBRGBRGB…, or plane-oriented, i.e. RRRR…GGGG…BBBB…. The origin is normally at the top left corner (IPL_ORIGIN_TL). For an efficient use of the MMX capabilities of the processor, the line length of an image should be a multiple of 8 bytes. This is guaranteed by choosing the quad-word alignment, each line being padded with dummy pixels if necessary. Finally, the width (number of columns) and the height (number of lines) of the image are specified. The last four parameters are usually NULL.
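The quad-word alignment mentioned above simply means that each line of the buffer is rounded up to a multiple of 8 bytes. A portable sketch of this computation (the function name is ours, not part of IPL):

```cpp
#include <cstddef>

// Compute the padded line length (what IPL stores as the line stride) for
// an image whose lines must start on align-byte boundaries.
std::size_t alignedStep(std::size_t width, std::size_t nChannels,
                        std::size_t bytesPerChannel, std::size_t align = 8) {
    std::size_t rowBytes = width * nChannels * bytesPerChannel;
    return ((rowBytes + align - 1) / align) * align; // round up to a multiple of align
}
```

A 10-pixel-wide, 8-bit RGB image thus has 30 useful bytes per line but a padded line length of 32.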
Once the header is created, memory must be allocated. This is the role of the iplAllocateImage function. An initial value for the pixel data can be specified; this is the last parameter. The middle parameter of this function must be set to 0 if no initialization is required. Do not forget to deallocate the images at the end of the process by calling iplDeallocate(image, IPL_IMAGE_ALL). Note that for floating-point images, iplAllocateImageFP and iplDeallocateImageFP must be used instead.
An alternative way to create and allocate an image is to use the equivalent OpenCV function. Here only the size, the pixel depth and the number of channels need to be specified, e.g.:
IplImage* img= cvCreateImage( cvSize(width,height), IPL_DEPTH_8U, 3 );
When manipulating images, it is common to sequentially access all the pixels of an image. To this end, the iplPutPixel and iplGetPixel functions can be used. You just specify the pixel coordinates and an array containing the channel values, as follows:
unsigned char values[3];
iplGetPixel( img, x, y, values ); // read the pixel at (x,y)
values[0]= 255;
iplPutPixel( img, x, y, values ); // write it back
But for a more efficient loop, it is possible to directly access the buffer containing the pixels. Caution must however be taken, because the way this loop must be executed depends on the exact image format. This is illustrated by the following process function, where an 8-bit RGB image with pixel-oriented data order is scanned:
void process(void* img) {
	IplImage* image= reinterpret_cast<IplImage*>(img);
	int nl= image->height; // number of lines
	int nc= image->width * image->nChannels; // number of useful bytes per line
	int step= image->widthStep; // effective line length, padding included
	unsigned char* data= reinterpret_cast<unsigned char*>(image->imageData);
	for (int i=0; i<nl; i++) {
		for (int j=0; j<nc; j++) {
			data[j]= 255-data[j]; // here a simple inversion of each value
		}
		data+= step; // next line
	}
}
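The role of the line stride in this kind of loop can be checked on a plain buffer, independently of OpenCV (illustrative names; a 2-line image with one padding byte per line):

```cpp
#include <vector>

// Invert all useful bytes of an image buffer stored line by line with a
// stride (step) possibly larger than the number of useful bytes (nc);
// the padding bytes are skipped.
void invertBuffer(std::vector<unsigned char>& buf, int nl, int nc, int step) {
    unsigned char* data = buf.data();
    for (int i = 0; i < nl; i++) {
        for (int j = 0; j < nc; j++)
            data[j] = 255 - data[j];
        data += step; // jump over the padding to the next line
    }
}
```

Forgetting the stride and iterating over all bytes would corrupt or misinterpret the padding bytes, which is exactly the format-dependent pitfall mentioned above.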
Although this is the most efficient way to scan an image, this process can be error-prone. In order to simplify this frequent task, an image iterator can be introduced. The role of this iterator template is to take care of the pointer manipulation involved in the processing of an image. The template is as follows:
template <class PEL>
class IplImageIterator {
	int i, i0, j;
	PEL* data;
	PEL* pix;
	int step;
	int nl, nc;
	int nch;
public:
	/* constructor */
	IplImageIterator(IplImage* image,
		int x=0, int y=0, int dx= 0, int dy=0) :
		i(x), j(y), i0(0) {
		data= reinterpret_cast<PEL*>(image->imageData);
		step= image->widthStep / sizeof(PEL);
		nl= image->height;
		if ((y+dy)>0 && (y+dy)<nl) nl= y+dy;
		if (y<0) j=0;
		data+= step*j;
		nc= image->width;
		if ((x+dx)>0 && (x+dx)<nc) nc= x+dx;
		nc*= image->nChannels;
		if (x>0) i0= x*image->nChannels;
		i= i0;
		nch= image->nChannels;
		pix= new PEL[nch];
	}
	/* destructor */
	~IplImageIterator() { delete [] pix; }
	/* has next ? */
	bool operator!() const { return j < nl; }
	/* next pixel */
	IplImageIterator& operator++() {
		i++;
		if (i >= nc) { i=i0; j++; data+= step; }
		return *this;
	}
	IplImageIterator& operator+=(int s) {
		i+=s;
		if (i >= nc) { i=i0; j++; data+= step; }
		return *this;
	}
	/* pixel access */
	PEL& operator*() { return data[i]; }
	const PEL operator*() const { return data[i]; }
	const PEL neighbor(int dx, int dy) const
		{ return *(data+dy*step+i+dx); }
	PEL* operator&() const { return data+i; }
};
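To see the iteration logic of this template in isolation, here is a stripped-down, OpenCV-free version of the same idea (simplified names of our own; no window or multi-channel support):

```cpp
#include <vector>

// Row-major iterator over a padded buffer, following the same
// ! / * / ++ protocol as the IplImageIterator template.
class RowIterator {
    unsigned char* data; // start of the current line
    int i, j;            // current byte within the line, current line
    int step, nl, nc;    // line stride, number of lines, useful bytes per line
public:
    RowIterator(unsigned char* buf, int lines, int cols, int stride)
        : data(buf), i(0), j(0), step(stride), nl(lines), nc(cols) {}
    bool operator!() const { return j < nl; }      // more data?
    unsigned char& operator*() { return data[i]; } // current value
    RowIterator& operator++() {                    // advance, skipping padding
        if (++i >= nc) { i = 0; j++; data += step; }
        return *this;
    }
};
```

A loop such as while (!it) { ...; ++it; } then touches every useful byte and never the padding.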
An iterator of this type is declared by specifying the type of the pixels in the image and by giving a pointer to the IplImage as argument to the iterator constructor, e.g.:
IplImageIterator<unsigned char> it(img);
Once the iterator is constructed, two operators can be used to iterate over an image: the ! operator, which determines whether the end of the image has been reached, and the * operator, which gives access to the current pixel. A typical loop will therefore look like this:
while (!it) {
	if (*it < 10) {
		*it= 0; // e.g. set low values to zero
	}
	++it;
}
Note that if the image contains more than one channel, each iteration will give access to one of the channels of a pixel. This means that, in the case of a color pixel, you have to iterate three times for each pixel. In order to access all the components of a pixel, the & operator can be used. It returns an array that contains the current pixel channel values. For example, the previous example will look like this (note how the iterator is incremented this time to make sure that we go from one pixel to the next):
while (!it) {
	unsigned char* pixel= &it;
	if (pixel[1]>pixel[0] && pixel[1]>pixel[2]) {
		// a predominantly green pixel: process it here
	}
	it+= 3;
}
The use of image iterators is as efficient as directly looping with pointers. This is true as long as you set the compiler to optimize for speed (the /O2 option of Visual C++).
When the processing involves more than one image, more than one iterator can be used. This is illustrated in
the following example:
IplImage* tmp= cvCloneImage(image);
IplImageIterator<unsigned char>
	src(tmp,1,1,tmp->width-2,tmp->height-2);
IplImageIterator<unsigned char>
	res(image,1,1,image->width-2,image->height-2);
while (!src) {
	// neighborhood processing (here, for illustration,
	// a simple sharpening based on the 4 neighbors)
	int v= 5 * *src - src.neighbor(-1,0) - src.neighbor(1,0)
		- src.neighbor(0,-1) - src.neighbor(0,1);
	*res= (unsigned char)(v<0 ? 0 : (v>255 ? 255 : v));
	++src;
	++res;
}
cvReleaseImage(&tmp);
Here the clone of the source image is used as input while the source image itself is modified inside the loop; two iterators are therefore defined. Since the processing also involves the neighboring pixels, the neighbor method defined by the iterator is used. Also, in this case, a window is specified when creating the iterators (here it defines a 1-pixel strip around the image where no processing is undertaken). The resulting image is:
In order to process image sequences (from files or from a camera), you have to use DirectShow. The DirectShow architecture, part of Microsoft DirectX, relies on a filter architecture. There are three types of filters: source filters that output video and/or audio signals, transform filters that process an input signal and produce one (or several) outputs, and finally rendering filters that display or save a media signal. The processing of a sequence is therefore done using a series of filters connected together, the output of one filter becoming the input of the next one (you can also have filters with multiple outputs). The first filter is usually a source that reads a file stream, and the last filter could be a renderer that displays the sequence in a window. In DirectShow terminology, a series of filters is called a filter graph.
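The pipeline idea behind a filter graph can be sketched without any DirectShow code (a toy model of our own, where a "frame" is just an int and each "filter" a function):

```cpp
#include <functional>
#include <vector>

// Toy filter graph: the output of each filter feeds the input of the next,
// as in a source -> transform -> renderer chain.
int runGraph(int frame,
             const std::vector<std::function<int(int)> >& filters) {
    for (const auto& f : filters)
        frame = f(frame); // connect the output "pin" to the next input "pin"
    return frame;
}
```

With a +1 filter followed by a *2 filter, frame 3 comes out as 8: order matters, exactly as when assembling real filters.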
We will first try to process an AVI sequence. Let’s first see if DirectX is working fine. To do so, just use the GraphEdit application. This is a very useful application included in the DirectX SDK that makes it easy to build filter graphs. It can be started from the Start|Programs|Microsoft DirectX 8.1 SDK|DirectX Utilities menu. The GraphEdit application window will pop up.
Our objective is now to visualize the building blocks required to obtain an AVI renderer.
Select Graph|Insert Filters… A window will display the list of available filters.
Choose the DirectShow Filters tree and select the File Source(Async.) filter.
You will be asked to select an AVI file. The filter will appear in the GraphEdit window in the form of a box. Right-click on the output pin and select the Render Pin option. This is an intelligent option that determines which filters are required to render the selected source file and automatically assembles them together, as shown here:
For an AVI sequence, the rendering chain should be composed of 3 filters. The first one is the splitter that separates the video and audio components; this filter normally has two outputs (video and audio), but note that in the case of the selected sequence, no audio component was available. The second one is the appropriate decompressor that decodes the video sequence. Finally, the third filter is the renderer itself, which creates the window and displays the frame sequence in it. Just push the play button to execute the graph and the selected AVI sequence should be displayed in a window.
We can build the same filter graph using Visual C++. You first need to add the following include path to your project settings:
C:\DXSDK\samples\Multimedia\DirectShow\BaseClasses
the library path:
C:\DXSDK\lib
and link with the library module:
STRMBASE.LIB
DirectX is implemented using the Microsoft COM technology. This means that when you want to do
something, you do it by using a given COM interface. In order to initialize the COM layer, you must call:
CoInitialize(NULL);
And similarly, when you are done with COM, you need to uninitialize it:
CoUninitialize();
A COM interface is an abstract class containing pure virtual functions (together forming the interface). Using a COM interface is the only way to communicate with a COM object. Interfaces are obtained by calling the appropriate API functions. These functions return a value of type HRESULT representing an error code; the simplest way to verify whether a COM call failed or succeeded is to check the return value using the FAILED macro. All COM interfaces derive from the IUnknown interface.
A very important rule when you use an interface is to never forget to release it after you have finished using it; otherwise resource leaks will result. This is done by calling the Release method of the IUnknown interface, which decrements the object's reference count by 1; when the count reaches 0, the object is deallocated. The safest way to call the Release method is to use the SAFE_RELEASE macro that can be found in dxutil.h, located in C:\DXSDK\samples\Multimedia\Common\include.
This macro is simply defined as:
#define SAFE_RELEASE(p) { if(p){(p)->Release();(p)=NULL;}}
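The contract this macro relies on can be demonstrated with a toy reference-counted class (not a real COM object; names are ours):

```cpp
#include <cstddef>

// Toy object following the COM Release contract: Release decrements the
// reference count and the object destroys itself when the count reaches 0.
struct RefCounted {
    static int alive;                 // number of live instances
    unsigned long refs;
    RefCounted() : refs(1) { ++alive; }
    unsigned long AddRef() { return ++refs; }
    unsigned long Release() {
        unsigned long r = --refs;
        if (r == 0) delete this;      // self-destruct at zero, as COM objects do
        return r;
    }
private:
    ~RefCounted() { --alive; }        // private: destruction only via Release
};
int RefCounted::alive = 0;

#define SAFE_RELEASE(p) { if(p){(p)->Release();(p)=NULL;} }
```

Releasing through the macro both frees the object (the count falls from 1 to 0) and nulls the pointer, so a second SAFE_RELEASE on the same variable is harmless.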
To use a DirectX component, you must first create it and obtain its top-level interface. Components are identified by a CLSID and each interface is identified by an IID. For example, to create a DirectShow filter graph (used to build a series of filters) you call:
IGraphBuilder *pGraph;
CoCreateInstance(CLSID_FilterGraph, // object identifier
NULL, CLSCTX_INPROC,
IID_IGraphBuilder, // interface identifier
(void **)&pGraph); // pointer to the
// top-level interface
To request the other interfaces of this object, you use the QueryInterface method. For example:
pGraph->QueryInterface(
	IID_IMediaControl, // interface identifier
	(void **)&pMediaControl); // pointer to the interface
Once the filter graph is created, it becomes easy to create all the filters required to render an AVI file. This is
done by calling
pGraph->RenderFile(MediaFile, NULL);
This call does what the Render Pin option does in the GraphEdit application. To play the video, two more interfaces are required: IMediaControl, used to start the playback, and IMediaEvent, used to detect when the stream rendering has completed. Here is the complete class:
class SequenceProcessor {
	IGraphBuilder *pGraph;
public:
	SequenceProcessor(CString mediaFile, bool display= true) {
		CoInitialize(NULL);
		pGraph= 0;
		CoCreateInstance(CLSID_FilterGraph, NULL, CLSCTX_INPROC,
			IID_IGraphBuilder, (void **)&pGraph);
		if (display) {
			BSTR name= mediaFile.AllocSysString();
			pGraph->RenderFile(name, NULL); // create the rendering filters
			SysFreeString(name);
		}
	}
	void run() { // play the sequence and wait until it completes
		IMediaControl *pControl; IMediaEvent *pEvent; long evCode;
		pGraph->QueryInterface(IID_IMediaControl, (void **)&pControl);
		pGraph->QueryInterface(IID_IMediaEvent, (void **)&pEvent);
		pControl->Run();
		pEvent->WaitForCompletion(INFINITE, &evCode);
		SAFE_RELEASE(pControl); SAFE_RELEASE(pEvent);
	}
	~SequenceProcessor() {
		SAFE_RELEASE(pGraph);
		CoUninitialize();
	}
};
When an AVI file is selected, a rendering filter is created and the sequence is displayed. To have an idea of
what filters have been created, we can enumerate them by adding the following member function to our
class:
std::vector<CString> enumFilters() {
	std::vector<CString> names;
	IEnumFilters *pEnum;
	IBaseFilter *pFilter;
	pGraph->EnumFilters(&pEnum);
	while (pEnum->Next(1, &pFilter, 0) == S_OK) {
		FILTER_INFO FilterInfo;
		pFilter->QueryFilterInfo(&FilterInfo);
		char szName[256];
		WideCharToMultiByte(CP_ACP, 0, FilterInfo.achName,
			-1, szName, 256, 0, 0);
		CString fname= szName;
		names.push_back(fname);
		SAFE_RELEASE(FilterInfo.pGraph);
		SAFE_RELEASE(pFilter);
	}
	SAFE_RELEASE(pEnum);
	return names;
}
This method simply creates a vector of strings (you have to include <vector>) containing the names of the filters associated with the generated filter graph. Each name is obtained by reading the FILTER_INFO structure. The enumeration is obtained by calling the EnumFilters method of the filter graph instance. Note how all interfaces are released, including the one indirectly obtained through FILTER_INFO, which also contains a pointer to the associated filter graph.
void CCvisionDlg::OnOpen()
{
	CFileDialog dlg(TRUE, _T("*.bmp"), "",
		OFN_FILEMUSTEXIST|OFN_PATHMUSTEXIST|OFN_HIDEREADONLY,
		"image files (*.bmp; *.jpg) |*.bmp;*.jpg|AVI files (*.avi) |*.avi|All Files (*.*)|*.*||", NULL);
	if (dlg.DoModal() == IDOK) {
		if (proc != 0)
			delete proc;
		if (procseq != 0)
			delete procseq;
		CString ext= dlg.GetFileExt();
		if (ext.Compare("avi")) { // not an AVI: process as a still image
			proc= new ImageProcessor(dlg.GetPathName());
		} else { // an AVI file: create the rendering filter graph
			procseq= new SequenceProcessor(dlg.GetPathName());
			std::vector<CString> names= procseq->enumFilters();
			m_list.ResetContent();
			for (int i=0; i<names.size(); i++)
				m_list.AddString(names[i]);
		}
	}
}
and now if you open an AVI file, you can see the filter list:
The next step is to build the same filter graph ourselves, without using the RenderFile method. Instead, we will create each filter and connect them together. This way, we will be able to modify the graph by adding our own filters and thus perform the processing we want. Filters are connected together using their pins: an output pin of a filter is connected to the input pin of the next filter. To obtain a pin of a filter, you have to use the EnumPins method; you then iterate through the pins until you find the required one (either output or input). This is what the following function does:
IPin* GetPin(IBaseFilter *pFilter, PIN_DIRECTION PinDir)
{
BOOL bFound = FALSE;
IEnumPins *pEnum;
IPin *pPin;
pFilter->EnumPins(&pEnum);
while(pEnum->Next(1, &pPin, 0) == S_OK)
{
PIN_DIRECTION PinDirThis;
pPin->QueryDirection(&PinDirThis);
if (bFound = (PinDir == PinDirThis))
break;
pPin->Release();
}
pEnum->Release();
return (bFound ? pPin : 0);
}
To add a filter (it must first be created) to the filter graph, we use the AddFilter method:
pGraph->AddFilter(pFilter, L"My Filter");
The second argument is a name for the filter that must identify it uniquely in the filter graph (if you set it to NULL, the graph manager will generate one for you). To connect two pins together, we simply use the Connect method:
pGraph->Connect(pOutputPin, pInputPin);
What filters do we need to display an AVI sequence? We know the answer from the results displayed in the filter list box or in the GraphEdit application:
Note that for some filters, the pins are created dynamically. This is the case for the AVI splitter, which creates the required output pins (video and/or audio) only when the source is connected to its input. This makes sense, since the output format of this kind of filter is known only when the type of its input is known. It should also be obvious that, to be connected together, the respective output and input pins of two filters must be of compatible types. The properties of a given pin (such as major type and subtype) can be obtained as follows:
AM_MEDIA_TYPE amt;
pPin->ConnectionMediaType(&amt);
The following member function will now create the complete filter graph. The procedure is simple: we create each filter using CoCreateInstance (finding the right CLSID identifier is the key to obtaining the filter we want), add it to the filter graph, obtain its input pin and connect it to the output pin of the previous filter.
int createFilterGraph() {
	// the source filter for the selected media file
	IBaseFilter *pSource;
	pGraph->AddSourceFilter(MediaFile, L"Source", &pSource);
	IPin *pSourceOut= GetPin(pSource, PINDIR_OUTPUT);
	// the AVI splitter
	IBaseFilter *pAVISplitter;
	CoCreateInstance(CLSID_AviSplitter, NULL, CLSCTX_INPROC_SERVER,
		IID_IBaseFilter, (void **)&pAVISplitter);
	pGraph->AddFilter(pAVISplitter, L"Splitter");
	IPin *pAVIsIn= GetPin(pAVISplitter, PINDIR_INPUT);
	if (!pAVIsIn) {
		::MessageBox( NULL,
			"Unable to obtain input splitter pin", "Error",
			MB_OK | MB_ICONINFORMATION );
		return 0;
	}
	pGraph->Connect(pSourceOut, pAVIsIn);
	IPin *pAVIsOut= GetPin(pAVISplitter, PINDIR_OUTPUT);
	if (!pAVIsOut) {
		::MessageBox( NULL,
			"Unable to obtain output splitter pin", "Error",
			MB_OK | MB_ICONINFORMATION );
		return 0;
	}
	// the AVI decompressor
	IBaseFilter *pAVIDec;
	CoCreateInstance(CLSID_AVIDec, NULL, CLSCTX_INPROC_SERVER,
		IID_IBaseFilter, (void **)&pAVIDec);
	pGraph->AddFilter(pAVIDec, L"Decoder");
	IPin *pAVIDecIn= GetPin(pAVIDec, PINDIR_INPUT);
	if (!pAVIDecIn) {
		::MessageBox( NULL,
			"Unable to obtain decoder input pin", "Error",
			MB_OK | MB_ICONINFORMATION );
		return 0;
	}
	pGraph->Connect(pAVIsOut, pAVIDecIn);
	IPin *pAVIDecOut= GetPin(pAVIDec, PINDIR_OUTPUT);
	if (!pAVIDecOut) {
		::MessageBox( NULL,
			"Unable to obtain decoder output pin",
			"Error", MB_OK | MB_ICONINFORMATION );
		return 0;
	}
	// render the decoded video (this creates the video renderer)
	pGraph->Render(pAVIDecOut);
	SAFE_RELEASE(pAVIDecIn);
	SAFE_RELEASE(pAVIDecOut);
	SAFE_RELEASE(pAVIDec);
	SAFE_RELEASE(pAVIsOut);
	SAFE_RELEASE(pAVIsIn);
	SAFE_RELEASE(pAVISplitter);
	SAFE_RELEASE(pSourceOut);
	SAFE_RELEASE(pSource);
	return 1;
}
By executing this manually built filter graph, the result is the same as previously.
It is now time to process an image sequence. What we want to do is to sequentially process each frame of an AVI sequence. To do so, the OpenCV library offers a special filter called ProxyTrans. It should be located in C:\Program Files\Intel\opencv\bin. Before it can be used, it must first be registered. This can be done from the MS-DOS window using the regsvr32 application (just type regsvr32 ProxyTrans.ax; you might have to include C:\Program Files\Intel\opencv\bin in your PATH environment variable).
To check if the ProxyTrans filter is ready to be used, we use again the GraphEdit application. Build a rendering filter graph and then delete the connection between the decompression filter and the video renderer (just click on the arrow and push the Delete button). Now select Graph|Insert Filters…; the ProxyTrans filter should be in the list of DirectShow filters. Insert it and connect its input pin to the decompressor and its output pin to the renderer. The sequence should appear again when you play the filter graph.
Obviously, if you play this graph, the resulting file will be the same as the original, because the ProxyTrans filter that is supposed to do the processing does not do anything for now. However, the size of the output sequence might differ from the size of the original one; this is because the compressor used in the graph might use different parameters to compress the sequence. You probably also noted that when you play the graph, no sequence is displayed, simply because we removed the renderer. It is quite easy to add an extra path to the graph in order to allow the simultaneous display and saving of the sequence; the Smart Tee is the filter you need. Add it and create the following graph:
As you can see in the figure above, the creation of a video processing filter graph requires connecting several filters together. Many lines would have to be added to our createFilterGraph method, and the probability of making an error then becomes quite high. However, a closer look at this method reveals that the same sequence of operations is repeated several times, suggesting that some generic function could be introduced to help the programmer.
This function can be written in a straightforward manner. First the filter is created using CoCreateInstance, then its input pin is obtained and connected to the specified output pin. Once this is done, the last step consists in obtaining the required number of output pins. The function is then as follows:
int addFilter(const GUID& filterCLSID, LPCWSTR filtername,
	IGraphBuilder *pGraph, IPin **outputPin, int numOutput= 1)
{
	char tmp[256];
	IBaseFilter *baseFilter;
	if(FAILED(CoCreateInstance(
		filterCLSID, NULL, CLSCTX_INPROC_SERVER,
		IID_IBaseFilter,
		(void**)&baseFilter)) ||!baseFilter)
	{
		sprintf(tmp,"Unable to create %ls filter", filtername);
		::MessageBox( NULL, tmp, "Error",
			MB_OK|MB_ICONINFORMATION );
		return 0;
	}
	pGraph->AddFilter(baseFilter, filtername);
	IPin *inputPin= GetPin(baseFilter, PINDIR_INPUT);
	if (!inputPin) {
		sprintf(tmp,
			"Unable to obtain %ls input pin", filtername);
		::MessageBox( NULL, tmp, "Error",
			MB_OK | MB_ICONINFORMATION );
		return 0;
	}
	if (FAILED(pGraph->Connect(*outputPin, inputPin))) {
		sprintf(tmp,
			"Unable to connect %ls filter", filtername);
		::MessageBox( NULL, tmp, "Error",
			MB_OK | MB_ICONINFORMATION );
		return 0;
	}
	SAFE_RELEASE(inputPin);
	SAFE_RELEASE(*outputPin);
	// obtain the required number of output pins
	for (int i=0; i<numOutput; i++) {
		outputPin[i]= 0;
		// n-th output pin (GetPin overloaded to return the n-th pin)
		outputPin[i]= GetPin(baseFilter, PINDIR_OUTPUT, i+1);
		if (!outputPin[i]) {
			sprintf(tmp,
				"Unable to obtain %ls output pin (%d)",
				filtername, i);
			::MessageBox( NULL, tmp, "Error",
				MB_OK | MB_ICONINFORMATION );
			return 0;
		}
	}
	SAFE_RELEASE(baseFilter);
	return 1;
}
Using this function, it becomes easy to create a complex filter graph. The one we will build now will include the ProxyTrans filter (note that the header file initguid.h must be included to be able to use this filter). To be useful, this filter must do something. In fact, its objective is to give the programmer access to each frame of the sequence, which can thus be processed. This is realized through a callback function that is automatically called for each frame of the sequence. This callback function receives as argument a pointer to the current image; the user is then free to analyze and modify this image. Here is an example of a valid callback function that can be used with the ProxyTrans filter:
void process(void* img) {
	IplImage* image= reinterpret_cast<IplImage*>(img);
	cvErode( image, image, 0, 1 ); // process the current frame
}
In order to have this function called, it must be registered with the ProxyTrans filter. This is simply done by calling the following method of the IProxyTransform interface:
pProxyTrans->set_transform(process, 0);
Here is now the function that creates the filter graph that processes an input sequence and saves the result in a file. Two preview windows are displayed, one for the original sequence, the other for the output sequence.
bool createFilterGraph() {
	IPin* pSourceOut[2];
	pSourceOut[0]= pSourceOut[1]= NULL;
	// Video source
	addSource(ifilename, pGraph, pSourceOut);
	// splitter, decompressor and processing filter
	addFilter(CLSID_AviSplitter, L"Splitter", pGraph, pSourceOut);
	addFilter(CLSID_AVIDec, L"Decoder", pGraph, pSourceOut);
	addFilter(CLSID_ProxyTransform, L"ProxyTrans", pGraph, pSourceOut);
	// a Smart Tee provides a capture output and a preview output
	addFilter(CLSID_SmartTee, L"Tee", pGraph, pSourceOut, 2);
	// preview path: render the second output
	pGraph->Render(pSourceOut[1]);
	// capture path: helper analogous to addSource that adds the
	// filters writing the sequence to the output file
	addOutputFile(ofilename, pGraph, pSourceOut);
	return 1;
}
You will note that the output file produced by this program is quite big. This is simply because we are not using any compressor when the sequence is saved; such a filter can only be obtained through enumeration, which is discussed in the next section.
Filters are registered COM objects made available by the operating system to your applications. Depending on the software installed on your machine, different sets of filters might be available. These filters are classified by category: there is, for example, a category identified by CLSID_VideoCompressorCategory that includes all the available compression filters. When you wish to use a filter of a given category, you must enumerate the available filters and select one of them.
To enumerate the filters, the first step consists in the creation of a system device enumerator:
ICreateDevEnum *pSysDevEnum;
CoCreateInstance(CLSID_SystemDeviceEnum, NULL,
	CLSCTX_INPROC_SERVER,
	IID_ICreateDevEnum,
	(void **)&pSysDevEnum);
An enumerator for the desired category is then obtained, and each filter of the category is represented by a moniker:
IEnumMoniker *pEnumCat;
pSysDevEnum->CreateClassEnumerator(
	CLSID_VideoCompressorCategory, &pEnumCat, 0);
IMoniker *pMoniker;
ULONG cFetched;
while(pEnumCat->Next(
	1, // number of elements requested
	&pMoniker, // pointer to the moniker
	&cFetched) // number of elements returned
	== S_OK)
{
	IPropertyBag *pPropBag;
	pMoniker->BindToStorage(0, 0, IID_IPropertyBag,
		(void **)&pPropBag);
	// process the moniker (see the complete function below)
}
Properties of a filter are obtained using the IPropertyBag interface. This generic interface is used to read and write properties as text. The moniker can also be used to create a filter:
IBaseFilter* baseFilter;
pMoniker->BindToObject(NULL, NULL,
	IID_IBaseFilter, (void**)&baseFilter);
This latter approach must be used to create an enumerated filter, instead of using the CoCreateInstance function. The function presented below can be used to obtain the available filters of a category. It returns the friendly name and the CLSID identifier of each filter; either can be used afterwards to create a given filter.
void enumFilters(const GUID& category,
	std::vector<CString>& fname, std::vector<CLSID>& fclsid)
{
	ICreateDevEnum *pSysDevEnum;
	CoCreateInstance(CLSID_SystemDeviceEnum, NULL, CLSCTX_INPROC_SERVER,
		IID_ICreateDevEnum, (void **)&pSysDevEnum);
	IEnumMoniker *pEnumCat;
	if (pSysDevEnum->CreateClassEnumerator(category, &pEnumCat, 0) == S_OK) {
		IMoniker *pMoniker;
		ULONG cFetched;
		while (pEnumCat->Next(1, &pMoniker, &cFetched) == S_OK) {
			IPropertyBag *pPropBag;
			pMoniker->BindToStorage(0, 0, IID_IPropertyBag,
				(void **)&pPropBag);
			VARIANT varName;
			VariantInit(&varName);
			HRESULT hr= pPropBag->Read(L"FriendlyName", &varName, 0);
			if (hr == S_OK) {
				fname.push_back(CString(varName.bstrVal));
				VariantClear(&varName);
			}
			VARIANT varFilterClsid;
			varFilterClsid.vt = VT_BSTR;
			hr= pPropBag->Read(L"CLSID", &varFilterClsid, 0);
			if(SUCCEEDED(hr))
			{
				CLSID clsidFilter;
				CLSIDFromString(varFilterClsid.bstrVal, &clsidFilter);
				fclsid.push_back(clsidFilter);
				SysFreeString(varFilterClsid.bstrVal);
			}
			// Clean up.
			pPropBag->Release();
			pMoniker->Release();
		}
		pEnumCat->Release();
	}
	pSysDevEnum->Release();
}
This function is used to select the compression filter to be used in our sequence processing application. The list of compression filters is displayed in a list box (shown after the output file has been selected):
void CCvisionDlg::OnSave()
{
// Select output file
std::vector<CString> fname;
std::vector<CLSID> fclsid;
enumFilters(CLSID_VideoCompressorCategory,
fname, fclsid);
m_list.ResetContent();
for (int i=0; i<fname.size(); i++)
m_list.AddString(fname[i]);
}
The compression filter is selected by clicking on the corresponding item before pushing the process button.
The sequence will then be saved, compressed according to the default control parameters of the chosen
compressor. What if you are not satisfied with the resulting compression rate? You can obviously try to
select another compression filter; however, it is also possible to use different control parameter values for the
chosen filter. This can be done through a special interface called IAMVideoCompression. This interface
is normally supported by the output pin of a compression filter. You can obtain the interface by calling the
QueryInterface method of the pin:
IAMVideoCompression *pCompress;
pPin->QueryInterface(IID_IAMVideoCompression,
(void**)&pCompress);
Once obtained, the interface can be used to set the compression properties, namely: the key frame rate (a long integer), the number of predicted frames per key frame (also a long integer), and the relative compression quality (a double expressing a percentage between 0.0 and 1.0). It is then easy to set these values using the appropriate methods, e.g.:
pCompress->put_KeyFrameRate(15);
pCompress->put_PFramesPerKeyFrame(4);
pCompress->put_Quality(0.80);
The same strategy can be used to select a video capture device (e.g. a USB camera). The only difference is that these devices obviously do not have input pins. However, they normally have two output pins (one for capture and one for preview). The basic steps to build a camera-based video processing filter graph are as follows. First, add the video capture device through enumeration:
CString cameraName= …; // the friendly name selected by the user
IPin* pSourceOut[2];
pSourceOut[0]= pSourceOut[1]= NULL;
addFilterByEnum(CLSID_VideoInputDeviceCategory,
cameraName,pGraph,pSourceOut,2);
addFilter(CLSID_ProxyTransform, L"ProxyTrans",
pGraph, pSourceOut);
And finally, you add the required filters to save the resulting sequence to a file:
The complete application that includes the camera selection is given here.
Now, to be able to change the camera settings (such as resolution or frame rate), you must access the facilities offered by the camera driver. The easiest way to do this is to use the old Video for Windows technology (an ancestor of DirectShow). If the camera you use has a driver compatible with this technology, then it is possible to obtain dialog boxes to control the camera settings. This is done through the IAMVfwCaptureDialogs interface of the camera filter. The first thing to do is then to check whether the camera supports this interface and, if so, which dialogs are available. The three standard dialogs are designated by an enumerated type: VfwCaptureDialog_Source, VfwCaptureDialog_Format, VfwCaptureDialog_Display. The procedure to obtain one of these dialogs is quite straightforward:
IAMVfwCaptureDialogs *pVfw = 0;