
file:///media/Thunder/home/patrick/personal/writing ...

SAX2, a good example of Visitor pattern

patrick@intervideo.com

$Date: 2000-09-18 03:38:58-07 $ $Revision: 1.2 $ $State: Exp $

Summary: Alongside DOM, SAX2 is a simple alternative for processing XML documents. The design of SAX2 is based on the Visitor pattern. This article briefly introduces SAX2 and explains its relation to the Visitor pattern.


What Is SAX2?

Have you ever thought that parsing XML documents with DOM (Document Object Model) is a nightmare? Here is the remedy: SAX2, the Simple API for XML, a fast, low-memory alternative for processing XML documents.

SAX2 is a push-model parser. In other words, you provide the handlers and the parser calls them when a particular event occurs, such as the start of a document or the start or end of an element. The SAX2 parser generates several categories of events, including events that occur in the content of the XML document, events that occur in the DTD, and error events. To handle these events, you implement a corresponding handler class that contains methods to process the appropriate events. Note that you only need to implement handlers for those events you wish to process. If a handler is not implemented for a specific type of event, the event is simply ignored.
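The push model can be sketched without COM or MSXML. The following is a minimal illustration, not the SAX2 API: ContentHandler, ToyReader, TagCollector and collectTags are hypothetical names, and the toy reader only recognizes plain <tag> and </tag> markers. It shows the two points made above: the reader drives the handler, and events the handler does not override are simply ignored.

```cpp
#include <string>
#include <vector>

// Hypothetical handler interface, loosely modelled on ISAXContentHandler.
// Every callback has a do-nothing default, so a handler overrides only
// the events it cares about; all other events are ignored.
class ContentHandler {
public:
    virtual ~ContentHandler() {}
    virtual void startDocument() {}
    virtual void endDocument() {}
    virtual void startElement(const std::string& name) { (void)name; }
    virtual void endElement(const std::string& name) { (void)name; }
};

// Toy "parser" (an assumption for illustration, not real XML parsing):
// it owns the traversal and pushes events to the registered handler.
class ToyReader {
public:
    ToyReader() : handler_(nullptr) {}
    void putContentHandler(ContentHandler* h) { handler_ = h; }

    void parse(const std::string& text) {
        if (!handler_) return;
        handler_->startDocument();
        std::string::size_type pos = 0;
        while ((pos = text.find('<', pos)) != std::string::npos) {
            std::string::size_type end = text.find('>', pos);
            if (end == std::string::npos) break;  // unterminated tag: stop
            std::string tag = text.substr(pos + 1, end - pos - 1);
            if (!tag.empty() && tag[0] == '/')
                handler_->endElement(tag.substr(1));
            else if (!tag.empty())
                handler_->startElement(tag);
            pos = end + 1;
        }
        handler_->endDocument();
    }

private:
    ContentHandler* handler_;
};

// A handler that only cares about start tags.
class TagCollector : public ContentHandler {
public:
    std::vector<std::string> tags;
    void startElement(const std::string& name) override { tags.push_back(name); }
};

// Wires a handler to the reader and returns the start tags in document order.
std::vector<std::string> collectTags(const std::string& text) {
    ToyReader reader;
    TagCollector collector;
    reader.putContentHandler(&collector);
    reader.parse(text);
    return collector.tags;
}
```

Calling collectTags("<a><b>hi</b></a>") hands the reader a handler that overrides only startElement; the end-element and document events still fire but fall through to the empty defaults.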

The following is a very simple command-line application that reads an XML file and prints the file's tags to the console window (there seem to be some errors in the example code). The application implements only the content handler and consists of the following files:

MyContent.h—header file for the content handler.

MyContent.cpp—implementation of the content handler.

TestSax.cpp—the "command-line" console application.


// TestSax.cpp

#include "stdafx.h"       // Again, you need headers.
#include "stdio.h"        // This is needed only to print something.
#include "MyContent.h"    // References to SAX are hidden here

int main(int argc, char* argv[])
{
    // Start!
    CoInitialize(NULL);   // Some magic to start COM. You may want to
                          // use CoInitializeEx instead.

    ISAXXMLReader* pRdr = NULL;

    // Create parser. (A bit more magic ...)
    HRESULT hr = CoCreateInstance(
        __uuidof(SAXXMLReader),
        NULL,
        CLSCTX_ALL,
        __uuidof(ISAXXMLReader),
        (void **)&pRdr);

    if(!FAILED(hr))
    {
        MyContent * pMc = new MyContent();
        hr = pRdr->putContentHandler(pMc);   // Set your own content
                                             // handler (and other handlers
                                             // as well).
                                             // And in real life, check
                                             // this hr!

        static wchar_t URL[1000];
        mbstowcs( URL, argv[1], 999 );       // ParseURL expects Unicode string,
                                             // argv[1] is ASCII.
                                             // Verify that you have argv[1].

        hr = pRdr->parseURL(URL);            // and parse!

        // Now just some cleanup work ...
        pRdr->Release();
    }
    else
    {
        printf("\nUh-oh... %08X\n\n", hr);   // Hopefully this will not
                                             // happen, but let's be ready.
    }

    CoUninitialize();     // And finally, again, some magic to
                          // uninitialize COM.
    return 0;
}

// MyContent.h

#import <msxml3.dll> raw_interfaces_only   // We use this library.
using namespace MSXML2;                    // and everything for SAX is in this namespace

class MyContent : public ISAXContentHandler
{
public:
    MyContent();             // Define constructor and destructor.
    virtual ~MyContent();

    // Copy all methods from ISAXContentHandler interface ...
    virtual HRESULT STDMETHODCALLTYPE startDocument(void);
    virtual HRESULT STDMETHODCALLTYPE endDocument(void);
    virtual HRESULT STDMETHODCALLTYPE startPrefixMapping(
        const wchar_t __RPC_FAR *pwchPrefix, int cchPrefix,
        const wchar_t __RPC_FAR *pwchUri, int cchUri);
    // ...

    // ... and the underlying IUnknown interface ...
    long __stdcall QueryInterface(const struct _GUID &, void **);
    unsigned long __stdcall AddRef(void);
    unsigned long __stdcall Release(void);

    // ... and add whatever you like to simplify implementation.
private:
    void prt(wchar_t * pwchFmt, const wchar_t __RPC_FAR *pwchVal, int cchVal);
};

// MyContent.cpp

#include "stdafx.h"
#include <stdio.h>
#include <stdlib.h>
#include <wchar.h>        // wcsncpy and wprintf
#include "MyContent.h"    // We need the headers ...

// Usually there is nothing to do in constructors and destructors,
// but if so, do it here.
MyContent::MyContent() {}
MyContent::~MyContent() {}

// Now finish the IUnknown stuff.
// (However, keep in mind that if you want to implement handlers as COM
// objects, you can add more functionality than shown with these
// methods.)
long __stdcall MyContent::QueryInterface(const struct _GUID &, void **)
{ return 0; }

unsigned long __stdcall MyContent::AddRef() { return 0; }

unsigned long __stdcall MyContent::Release() { return 0; }

// Now get down to business.
// First decide which events you want and which you don't.
// It's simple for methods you don't want:
HRESULT STDMETHODCALLTYPE MyContent::startDocument()
{
    return S_OK;    // Return S_OK to continue.
                    // Any error return code will abort parsing.
}

// And for events you want, do whatever you want!
HRESULT STDMETHODCALLTYPE MyContent::startElement(
    /* [in] */ wchar_t __RPC_FAR *pwchNamespaceUri,
    /* [in] */ int cchNamespaceUri,
    /* [in] */ wchar_t __RPC_FAR *pwchLocalName,
    /* [in] */ int cchLocalName,
    /* [in] */ wchar_t __RPC_FAR *pwchRawName,
    /* [in] */ int cchRawName,
    /* [in] */ ISAXAttributes __RPC_FAR *pAttributes)
{
    // I want to print the tag name.
    prt(L"\n<%s>", pwchLocalName, cchLocalName);
    return S_OK;
}

// "prt" is a private method.
// SAX does not use it. You don't have to implement it or any other
// private methods.
// This one is just quick print.
void MyContent::prt(wchar_t * pwchFmt, const wchar_t __RPC_FAR *pwchVal, int cchVal)
{
    // The name is passed as pointer plus length, not null-terminated,
    // so copy it into a local buffer and terminate it before printing.
    static wchar_t val[1000];
    cchVal = cchVal > 999 ? 999 : cchVal;
    wcsncpy( val, pwchVal, cchVal );
    val[cchVal] = 0;
    wprintf(pwchFmt, val);
}

Does this look familiar? What pattern do you see in this example? Yes, the Visitor pattern. The Visitor pattern uses the double-dispatch technique to separate the traversal logic from the processing logic. Here the traversal logic lies in the implementation of ISAXXMLReader and the processing logic lies in the implementation of ISAXContentHandler. The following UML diagram illustrates this example:

[Figure: UML diagram of the example]
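For comparison, here is a minimal sketch of the textbook Visitor with explicit double dispatch, written in portable C++ rather than against MSXML. All names (Node, Element, Text, NodeVisitor, TagLister, listTags) are hypothetical. In SAX2 terms, the structure that walks its children corresponds to the reader and the visitor corresponds to the content handler.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

class Element;
class Text;

// The visitor holds the processing logic: one visit() overload per node type.
class NodeVisitor {
public:
    virtual ~NodeVisitor() {}
    virtual void visit(const Element& e) = 0;
    virtual void visit(const Text& t) = 0;
};

class Node {
public:
    virtual ~Node() {}
    // First dispatch: a virtual call selected by the node's dynamic type.
    virtual void accept(NodeVisitor& v) const = 0;
};

class Text : public Node {
public:
    explicit Text(const std::string& s) : value(s) {}
    // Second dispatch: overload chosen by the static type of *this,
    // then resolved virtually on the visitor's dynamic type.
    void accept(NodeVisitor& v) const override { v.visit(*this); }
    std::string value;
};

class Element : public Node {
public:
    explicit Element(const std::string& n) : name(n) {}
    void add(std::unique_ptr<Node> child) { children.push_back(std::move(child)); }
    void accept(NodeVisitor& v) const override {
        v.visit(*this);
        // The traversal logic lives here, in the structure, not in the visitor.
        for (const auto& c : children)
            c->accept(v);
    }
    std::string name;
    std::vector<std::unique_ptr<Node>> children;
};

// A concrete visitor that records element names and ignores text nodes.
class TagLister : public NodeVisitor {
public:
    void visit(const Element& e) override { out += "<" + e.name + ">"; }
    void visit(const Text&) override {}
    std::string out;
};

// Builds <book><title>SAX2</title><author/></book> and lists its tags.
std::string listTags() {
    Element root("book");
    std::unique_ptr<Element> title(new Element("title"));
    title->add(std::unique_ptr<Node>(new Text("SAX2")));
    root.add(std::move(title));
    root.add(std::unique_ptr<Node>(new Element("author")));

    TagLister lister;
    root.accept(lister);
    return lister.out;
}
```

Adding a new kind of processing means writing another NodeVisitor subclass; the node classes and the traversal order stay untouched, which is the same division of labour the reader and handler have above.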


The Visitor pattern is definitely a good fit when some simple traversing and processing is needed. On the other hand, if you want to do complex editing, Visitor is probably not a good choice, because the visitor plays a passive role: it only reacts to what the traversal hands it.
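One way to see the passive role is that a handler is called once per event and never sees the document as a whole, so any context it needs must be tracked by hand. The sketch below is an illustration under that assumption; PathTracker and replayEvents are hypothetical names, and the events are replayed manually instead of coming from a real parser.

```cpp
#include <string>
#include <vector>

// Hypothetical handler: to know where it is in the document it has to
// maintain its own stack of open elements across callbacks.
class PathTracker {
public:
    void startElement(const std::string& name) {
        stack_.push_back(name);
        paths.push_back(currentPath());
    }
    void endElement(const std::string&) {
        if (!stack_.empty()) stack_.pop_back();
    }
    std::vector<std::string> paths;   // one entry per element seen

private:
    std::string currentPath() const {
        std::string p;
        for (const std::string& s : stack_) p += "/" + s;
        return p;
    }
    std::vector<std::string> stack_;
};

// Stand-in for a parser: replays the events for <a><b/><c/></a>.
std::vector<std::string> replayEvents() {
    PathTracker h;
    h.startElement("a");
    h.startElement("b");
    h.endElement("b");
    h.startElement("c");
    h.endElement("c");
    h.endElement("a");
    return h.paths;
}
```

Even this small task needs explicit bookkeeping in the handler; editing that depends on siblings or on content that appears later in the document would need far more, which is where a tree model such as DOM becomes easier to work with.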

Reference

  • Design Pattern

  • Thinking In Java, Bruce Eckel

  • SAX2 Jumpstart for XML Developers, Eldar A. Musayev

  • How to traverse DirectShow Graph using Visitor pattern, Tsai Ying-Hau - I will add this link when I finish it. :-)

Revision History

$Log: SAX2\040a\040good\040example\040of\040Visitor\040pattern.htm,v $

Revision 1.2 2000-09-18 03:38:58-07 patrick

+ Complete summary
+ SAX brief introduction
+ Add pattern illustration diagram
+ More explanations about visitor pattern
+ One more reference link

Revision 1.1 2000-09-18 02:28:24-07 patrick

* Reformat

Revision 1.0 2000-09-17 20:26:34-07 patrick

Initial revision
