
file:///media/T hunder/home/patrick/personal/writing...

SAX2, a good example of the Visitor pattern

patrick@intervideo.com
$Date: 2000-09-18 03:38:58-07 $
$Revision: 1.2 $
$State: Exp $

Summary: Alongside DOM, SAX2 is a simple alternative for processing XML
documents. The design of SAX2 is based on the Visitor pattern. This article
introduces SAX2 briefly and explains its relation to the Visitor pattern.

Note :

What Is SAX2?
Have you ever thought that parsing XML documents using DOM (Document
Object Model) is a nightmare? Here is the remedy: SAX2, the Simple API for
XML, a fast, low-memory alternative for processing XML documents.

SAX2 is a push-model parser. In other words, you provide the handlers and the
parser calls them when a particular event occurs, such as the start of a
document or the start or end of an element. The SAX2 parser generates
several categories of events, including events that occur in the content of the
XML document, events that occur in the DTD, and error events. To handle
these events, you implement a corresponding handler class that contains
methods to process the appropriate events. Note that you only need to
implement handlers for those events you wish to process. If a handler is not
implemented for a specific type of event, the event is simply ignored.
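
To make the push model concrete, here is a minimal sketch in portable C++. It is not the SAX2 or MSXML API: EventHandler, parse, TagPrinter, and printTags are hypothetical names invented for this illustration, and the "parser" only walks a list of tag names that has already been split up. What it shows is the division of labor described above: the parser decides when to call, the handler decides what to do, and any event the handler does not override is silently ignored.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical handler interface (not SAX2): every callback has a
// do-nothing default, so a subclass overrides only the events it wants.
struct EventHandler {
    virtual ~EventHandler() = default;
    virtual void startDocument() {}
    virtual void endDocument() {}
    virtual void startElement(const std::string& /*name*/) {}
    virtual void endElement(const std::string& /*name*/) {}
};

// Toy "parser": walks a pre-tokenized list of tags and pushes one event
// per tag to the handler. A token starting with '/' is an end tag.
void parse(const std::vector<std::string>& tags, EventHandler& handler) {
    handler.startDocument();
    for (const std::string& tag : tags) {
        if (!tag.empty() && tag[0] == '/')
            handler.endElement(tag.substr(1));
        else
            handler.startElement(tag);
    }
    handler.endDocument();
}

// Handles only start-of-element events; every other event falls through
// to the empty defaults in EventHandler.
struct TagPrinter : EventHandler {
    std::ostream& out;
    explicit TagPrinter(std::ostream& o) : out(o) {}
    void startElement(const std::string& name) override {
        out << '<' << name << ">\n";
    }
};

// Runs the toy parser with a TagPrinter and returns what it printed.
std::string printTags(const std::vector<std::string>& tags) {
    std::ostringstream out;
    TagPrinter printer(out);
    parse(tags, printer);
    return out.str();
}
```

Calling printTags with the tokens "book", "title", "/title", "/book" yields one line per start tag; the end tags reach the handler too, but the default endElement does nothing with them.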

The following is a very simple command-line application that reads an XML file
and prints the file's tags to the console window. (There seem to be some errors
in the example code.) The application implements only the content handler.
The application consists of the following files:

MyContent.h: header file for the content handler.
MyContent.cpp: implementation of the content handler.
TestSax.cpp: the "command-line" console application.

1 of 6 08/16/2008 10:31 PM

// TestSax.cpp

#include "stdafx.h"      // Again, you need headers.
#include <stdio.h>       // This is needed only to print something.
#include <stdlib.h>      // mbstowcs
#include "MyContent.h"   // References to SAX are hidden here.

int main(int argc, char* argv[])   // Start!
{
    if (argc < 2)                  // Verify that you have argv[1].
    {
        printf("Usage: TestSax <file-or-URL>\n");
        return 1;
    }

    CoInitialize(NULL);            // Some magic to start COM. You may want to
                                   // use CoInitializeEx instead.

    ISAXXMLReader* pRdr = NULL;    // Create parser. (A bit more magic...)
    HRESULT hr = CoCreateInstance(__uuidof(SAXXMLReader), NULL, CLSCTX_ALL,
                                  __uuidof(ISAXXMLReader), (void **)&pRdr);

    if (SUCCEEDED(hr))
    {
        MyContent* pMc = new MyContent();   // Set your own content handler
                                            // (and other handlers as well).
        hr = pRdr->putContentHandler(pMc);  // And in real life, check this hr!

        static wchar_t URL[1000];           // parseURL expects a Unicode string;
                                            // argv[1] is ASCII.
        mbstowcs(URL, argv[1], 999);

        hr = pRdr->parseURL(URL);           // And parse!

        pRdr->Release();                    // Now just some cleanup work...
        delete pMc;                         // Our AddRef/Release do no counting,
                                            // so the handler is freed by hand.
    }
    else
    {
        printf("\nUh-oh... %08X\n\n", hr);  // Hopefully this will not
                                            // happen, but let's be ready.
    }

    CoUninitialize();              // And finally, again, some magic to
                                   // uninitialize COM.
    return 0;
}

// MyContent.h

#import <msxml3.dll> raw_interfaces_only   // We use this library.
using namespace MSXML2;   // And everything for SAX is in this namespace.

class MyContent : public ISAXContentHandler
{
public:
    MyContent();   // Define constructor and destructor.


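    virtual ~MyContent();

    // Copy all methods from ISAXContentHandler
    // interface...
    virtual HRESULT STDMETHODCALLTYPE startDocument(void);
    virtual HRESULT STDMETHODCALLTYPE endDocument(void);
    virtual HRESULT STDMETHODCALLTYPE startPrefixMapping(
        const wchar_t __RPC_FAR *pwchPrefix, int cchPrefix,
        const wchar_t __RPC_FAR *pwchUri, int cchUri);
    // startElement is implemented in MyContent.cpp, so it has to be
    // declared here too.
    virtual HRESULT STDMETHODCALLTYPE startElement(
        const wchar_t __RPC_FAR *pwchNamespaceUri, int cchNamespaceUri,
        const wchar_t __RPC_FAR *pwchLocalName, int cchLocalName,
        const wchar_t __RPC_FAR *pwchRawName, int cchRawName,
        ISAXAttributes __RPC_FAR *pAttributes);
    // ...and likewise for the remaining methods of the interface.

    // ...and the underlying IUnknown interface...
    long __stdcall QueryInterface(const struct _GUID &,void ** );
    unsigned long __stdcall AddRef(void);
    unsigned long __stdcall Release(void);

    // ...and add whatever you like to simplify implementation.
private:
    void prt ( wchar_t * pwchFmt, const wchar_t __RPC_FAR *pwchVal,
               int cchVal);
};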

// MyContent.cpp

#include "stdafx.h"   // We need the headers...
#include <stdio.h>
#include <stdlib.h>
#include <wchar.h>    // wcsncpy, wprintf
#include "MyContent.h"

// Usually there is nothing to do in constructors and destructors,
// but if so, do it here.
MyContent::MyContent() {}
MyContent::~MyContent() {}

// Now finish the IUnknown stuff.
// (However, keep in mind that if you want to implement handlers as COM
// objects, you can add more functionality than shown with these
// methods.)
long __stdcall MyContent::QueryInterface(const struct _GUID &,void ** )
{ return 0; }
unsigned long __stdcall MyContent::AddRef() { return 0; }
unsigned long __stdcall MyContent::Release() { return 0; }

// Now get down to business.
// First decide which events you want and which you don't.
// It's simple for methods you don't want:
HRESULT STDMETHODCALLTYPE MyContent::startDocument()
{
    return S_OK;   // Return S_OK to continue.
                   // Any error return code will abort parsing.
}


// And for events you want, do whatever you want!
HRESULT STDMETHODCALLTYPE MyContent::startElement(
    /* [in] */ const wchar_t __RPC_FAR *pwchNamespaceUri,
    /* [in] */ int cchNamespaceUri,
    /* [in] */ const wchar_t __RPC_FAR *pwchLocalName,
    /* [in] */ int cchLocalName,
    /* [in] */ const wchar_t __RPC_FAR *pwchRawName,
    /* [in] */ int cchRawName,
    /* [in] */ ISAXAttributes __RPC_FAR *pAttributes)
{
    // I want to print the tag name.
    prt(L"\n<%s>",pwchLocalName,cchLocalName);
    return S_OK;
}

// "prt" is a private method.
// SAX does not use it. You don't have to implement it or any other
// private methods.
// This one is just quick print.
void MyContent::prt ( wchar_t * pwchFmt, const wchar_t __RPC_FAR *pwchVal,
                      int cchVal )
{
    static wchar_t val[1000];
    cchVal = cchVal>999 ? 999 : cchVal;   // Never overrun the buffer.
    wcsncpy( val, pwchVal, cchVal ); val[cchVal] = 0;
    wprintf(pwchFmt,val);
}
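
A side note on prt: the handler receives each string as a pointer plus a character count, and prt copies the characters into a fixed static buffer before printing, so names longer than 999 characters are cut off. If standard C++ strings are available, a helper along the following lines avoids the fixed buffer. This is only a sketch; toWString is a hypothetical name and not part of SAX2 or MSXML.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not part of SAX2 or MSXML): builds an owned copy of
// a (pointer, length) pair. The input does not need a terminating null.
std::wstring toWString(const wchar_t* chars, int count) {
    if (chars == nullptr || count <= 0)
        return std::wstring();            // nothing to copy
    return std::wstring(chars, static_cast<std::size_t>(count));
}
```

The returned std::wstring owns its storage and can be printed or stored after the callback returns, whereas the static buffer in prt is overwritten on the next call.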

Does the SAX2 example seem familiar to you? What pattern do you see in it?

Yes, the Visitor pattern. The Visitor pattern uses the double-dispatch
technique to separate the traversal logic from the processing logic. Here the
traversal logic lies in the implementation of ISAXXMLReader, and the
processing logic lies in the implementation of ISAXContentHandler.
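
For readers who have not met the pattern before, here is a minimal textbook-style sketch of Visitor in standard C++. None of these names come from SAX2: Node, Element, Text, Visitor, NamePrinter, and elementNames are all hypothetical. The node classes own the traversal (the role the reader plays above), while the visitor owns the processing (the role of the content handler).

```cpp
#include <memory>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

struct Element;
struct Text;

// Processing logic lives in the visitor: one visit() overload per node type.
struct Visitor {
    virtual ~Visitor() = default;
    virtual void visit(const Element& e) = 0;
    virtual void visit(const Text& t) = 0;
};

// Structure and traversal live in the nodes.
struct Node {
    virtual ~Node() = default;
    // First dispatch: the virtual accept() call resolves on the node type.
    virtual void accept(Visitor& v) const = 0;
};

struct Text : Node {
    std::string value;
    explicit Text(std::string s) : value(std::move(s)) {}
    // Second dispatch: the virtual visit() call resolves on the visitor type.
    void accept(Visitor& v) const override { v.visit(*this); }
};

struct Element : Node {
    std::string name;
    std::vector<std::unique_ptr<Node>> children;
    explicit Element(std::string n) : name(std::move(n)) {}
    void accept(Visitor& v) const override {
        v.visit(*this);                    // process this node...
        for (const auto& child : children)
            child->accept(v);              // ...then walk the children
    }
};

// A concrete visitor that records element names and ignores text.
struct NamePrinter : Visitor {
    std::ostringstream out;
    void visit(const Element& e) override { out << '<' << e.name << '>'; }
    void visit(const Text&) override {}
};

// Walks the tree below root and returns the element names in visit order.
std::string elementNames(const Node& root) {
    NamePrinter printer;
    root.accept(printer);
    return printer.out.str();
}
```

Adding a new kind of processing means writing another Visitor subclass; the node classes and their traversal code stay untouched.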

Let us illustrate the SAX2 example with a UML diagram:


The Visitor pattern is definitely a good fit when some simple traversing and
processing is needed. On the other hand, if you want to do some complex
editing, Visitor is probably not a good choice because the visitor plays a
passive role.

Reference
Design Pattern
Thinking In Java, Bruce Eckel
SAX2 Jumpstart for XML Developers, Eldar A. Musayev
How to traverse DirectShow Graph using Visitor pattern, Tsai Ying-Hau
- I will add this link when I finish it. :-)

Revision History
$Log: SAX2\040a\040good\040example\040of\040Visitor\040pattern.htm,v $
Revision 1.2 2000-09-18 03:38:58-07 patrick
+ Complete summary
+ SAX brief introduction
+ Add pattern illustration diagram
+ More explanations about visitor pattern
+ One more reference link

Revision 1.1 2000-09-18 02:28:24-07 patrick
* Reformat

Revision 1.0 2000-09-17 20:26:34-07 patrick
Initial revision
