International Journal of Computer Science Engineering and Information Technology Research (IJCSEITR) ISSN 2249-6831 Vol. 3, Issue 1, Mar 2013, 211-216 TJPRC Pvt. Ltd.

PARTIAL DECOMPRESSION USING SAX IN XML


VIJAY GULHANE¹ & M. S. ALI²

¹Sipna COET, Amravati, Maharashtra, India

²Prof. Ram Meghe COEM, Badnera, Amravati, Maharashtra, India

ABSTRACT
This research work demonstrates the extraction, compression, and query processing of XML documents using adaptive compression techniques and efficient query evaluation. Many algorithms exist for compressing XML datasets, and it is well known that none can claim to be the best. We propose algorithms for XML compression and efficient query evaluation: a feasible XML compressor built on a data compression algorithm, and a query processor using SAX parsing and interfaces. It is shown that the proposed techniques for XML data compression pave the way for better compression, improving the compression ratio and the performance of the compressor system.

KEYWORDS: Partial Decompression, SAX Parser, Compression Systems, Query Processor, Ziv-Lempel Algorithm

INTRODUCTION
The system first compresses the XML document using the proposed algorithm. The compressed file is divided into separate relational databases, so there is no need to decompress the complete file to retrieve the results of a query. Only the required information is decompressed and returned to the user. The average compression ratio of the designed compressor is competitive with that of other queriable XML compressors. Based on several experiments, the query processor is able to answer different kinds of queries, ranging from simple exact-match queries to complex ones that require retrieving information from several compressed XML documents.

The problem with XML is that it is text-based and verbose by design (the XML standard explicitly states that "terseness in XML markup is of minimal importance"). As a result, the amount of information that has to be transmitted, processed, and stored is often substantially larger than with other data formats. This can be a serious problem on many occasions, since the data has to be transmitted quickly and stored compactly. Large XML documents consume not only transmission time but also large amounts of storage space [H. Liefke and D. Suciu (2000)].

This section enumerates a few of the most promising techniques, such as XGrind and XQC. The evaluation of these compressors considers three main aspects, among them compression speed and compression ratio [Al-Hamadani, Baydaa (2011)]. A very large number of XML compressors have been proposed in the literature in recent years. These XML compressors can be classified with respect to two main characteristics. The first classification is based on their awareness of the structure of the XML documents.
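The idea of decompressing only the part of the file that a query needs can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `ContainerStore`, the use of element paths as keys, and the use of DEFLATE from `java.util.zip` as the compression algorithm are all assumptions made for illustration. Each group of values is compressed as its own block, so answering a query inflates one block and leaves the others untouched.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Hypothetical store: one independently compressed block per element path. */
final class ContainerStore {
    private final Map<String, byte[]> blocks = new HashMap<>();
    private int decompressions = 0; // how many blocks were actually inflated

    /** Compresses the values of one element path into its own block. */
    void put(String path, String values) {
        Deflater deflater = new Deflater();
        deflater.setInput(values.getBytes(StandardCharsets.UTF_8));
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished()) {
            int n = deflater.deflate(buf);
            out.write(buf, 0, n);
        }
        deflater.end();
        blocks.put(path, out.toByteArray());
    }

    /** Inflates only the block for the requested path; returns null if absent. */
    String get(String path) {
        byte[] block = blocks.get(path);
        if (block == null) {
            return null;
        }
        Inflater inflater = new Inflater();
        inflater.setInput(block);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && inflater.needsInput()) {
                    break; // guard against a truncated block
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt block for " + path, e);
        } finally {
            inflater.end();
        }
        decompressions++;
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    int decompressions() {
        return decompressions;
    }

    public static void main(String[] args) {
        ContainerStore store = new ContainerStore();
        store.put("/book/title", "XML Basics\nData Compression");
        store.put("/book/price", "10\n25");
        // Only the title block is inflated; the price block stays compressed.
        System.out.println(store.get("/book/title"));
        System.out.println("blocks inflated: " + store.decompressions());
    }
}
```

Keeping the blocks independent trades some compression ratio for the ability to skip irrelevant data, which matches the trade-off the paper describes for queriable compressors.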


Figure 1: XML Compression Technique

Figure 1 shows XML compression techniques. The group of queriable XML compressors allows queries to be processed over their compressed formats. Their main focus is to let queries be evaluated directly over the compressed format without decompressing the whole document. The compression ratio of this group is usually worse than that of the archival XML compressors. This type of compressor is very important for many applications hosted on resource-limited computing devices such as mobile devices and GPS systems. This section discusses representatives of the two main classes of this group.

THE PROPOSED APPROACH


We propose an XML compressor that compresses XML data for the purposes of retrieval and efficient evaluation of queries over the compressed data. Extensible Markup Language (XML) [XML 1.0 (Second Edition) W3C Recommendation, October (2000)] is a standardized data format designed for specifying and exchanging data on the Web. With the proliferation of mobile devices, such as palmtop computers, as a means of communication in recent years, it is reasonable to expect that in the foreseeable future a massive amount of XML data will be generated and exchanged between applications in order to perform dynamic computations over the Web. However, XML is verbose by nature, since terseness in XML markup is not considered a pressing issue from the design perspective [XML 1.0 (Second Edition) W3C Recommendation, October (2000)]. In practice, XML documents are usually large because they often contain much redundant data. This size problem hinders the adoption of XML, since it substantially increases the costs of data processing, data storage, and data exchange over the Web.

Because common generic text compressors, such as Gzip [J. Gailly et al], Bzip2 [http://sources.redhat.com/bzip2/], WinZip [http://www.winzip.com/], PKZIP [13], or MPEG-7 (BiM) [J. M. Martinez, MPEG-7 Overview (version 9)], are not able to produce usable compressed XML data, many XML-specific compression technologies have recently been proposed. The essential idea of these technologies is that, by utilizing the structure information exposed in the input XML document during the compression process, they pursue two important goals at the same time. First, they aim to achieve a good compression ratio and compression time compared to the generic text compressors mentioned above. Second, they aim to generate a compressed XML document that supports efficient evaluation of queries over the data.
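One common way to use the exposed structure information is to separate the markup from the text values while parsing. The sketch below illustrates this with the standard Java SAX API; it is an illustration under stated assumptions, not the method proposed in this paper. The class name `StructureSplitter`, the integer tag codes, and the use of `0` as the end-tag marker are hypothetical choices.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical SAX handler that splits a document into structure and values. */
final class StructureSplitter extends DefaultHandler {
    final Map<String, Integer> tagCodes = new LinkedHashMap<>();      // tag dictionary
    final List<Integer> structure = new ArrayList<>();                // tag code stream
    final Map<String, List<String>> values = new LinkedHashMap<>();   // text grouped by element
    private final Deque<String> open = new ArrayDeque<>();            // currently open elements
    private final StringBuilder text = new StringBuilder();

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        flushText();
        Integer code = tagCodes.get(qName);
        if (code == null) {
            code = tagCodes.size() + 1; // codes start at 1; 0 is reserved for end tags
            tagCodes.put(qName, code);
        }
        structure.add(code);
        open.push(qName);
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        flushText();
        structure.add(0); // assumed end-tag marker
        open.pop();
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        text.append(ch, start, length); // SAX may deliver text in several chunks
    }

    /** Stores pending non-blank text under the innermost open element. */
    private void flushText() {
        String s = text.toString().trim();
        text.setLength(0);
        if (!s.isEmpty() && !open.isEmpty()) {
            values.computeIfAbsent(open.peek(), k -> new ArrayList<>()).add(s);
        }
    }

    /** Parses an XML string and returns the populated handler. */
    static StructureSplitter split(String xml) {
        StructureSplitter handler = new StructureSplitter();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse XML", e);
        }
        return handler;
    }

    public static void main(String[] args) {
        StructureSplitter h = split(
                "<lib><book><title>A</title><price>10</price></book></lib>");
        System.out.println(h.tagCodes);
        System.out.println(h.structure);
        System.out.println(h.values);
    }
}
```

Grouping values by element name places similar data together, which tends to help a general-purpose compressor and lets a query touch only the groups it needs.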


THE ARCHITECTURE OF XVSGC

Figure 2: XVSGC Query Processor

XVSGC is written in Java with an object-oriented design in mind. We decided to implement it as a queriable compressor that is easy to use and extend. The functionality of both the compressor and the search engine is described in brief. The system is component-based, and Figure 2 shows the preliminary architecture of XVSGC. We base our work on the principle of LZF that XML compression and partial query processing techniques (such as operators, indexes, and values for query optimization) can be used together when properly combined. This principle has been stated and forcefully validated in the domain of relational query processing [1],[3]. It is therefore important for efficient partial query processing on XML datasets.

The figure above shows the query processor. The query processor takes its input from the XML compressor, and the loader loads the compressed file. The compressed file is then parsed, and the user's query is handled by the query engine. Once the query is obtained, the search engine looks up the data or values using the index values. In the optimization phase, the partial query is optimized and executed, and the result reports whether the data was found or not. In addition, we designed two timers that report elapsed time in the format HH:MM:SS:MS, one on the compressor side and one on the query processing side.

The system contains the following modules. The loader and compressor converts XML documents into a compressed yet queryable format using compression algorithms, and the query work loader accesses that file. The compressed repository stores the compressed documents and provides (i) compressed data. The query processor processes the compressed documents and provides (i) compressed data elements and (ii) values; it evaluates queries over compressed documents and allows for efficient evaluation over the compressed repository.
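The search step and the timer described above can be sketched as follows. This is a hedged illustration, not XVSGC's code: the class `QueryProcessorSketch`, the shape of the index (element name, then value, then the ids of the compressed blocks that contain it), and the three-digit millisecond field are assumptions. Only the HH:MM:SS:MS layout and the found/not-found outcome come from the description.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

/** Hypothetical exact-match search over an index, with an elapsed-time report. */
final class QueryProcessorSketch {
    // element name -> value -> ids of the compressed blocks holding that value
    private final Map<String, Map<String, List<Integer>>> index = new HashMap<>();

    /** Registers that a value of the given element lives in the given block. */
    void addEntry(String element, String value, int blockId) {
        index.computeIfAbsent(element, k -> new HashMap<>())
             .computeIfAbsent(value, k -> new ArrayList<>())
             .add(blockId);
    }

    /** Returns the blocks to decompress; an empty list means "not found". */
    List<Integer> search(String element, String value) {
        Map<String, List<Integer>> byValue = index.get(element);
        if (byValue == null) {
            return Collections.<Integer>emptyList();
        }
        List<Integer> hits = byValue.get(value);
        return hits == null ? Collections.<Integer>emptyList() : hits;
    }

    /** Formats a duration in milliseconds as HH:MM:SS:MS (MS padded to 3 digits). */
    static String formatElapsed(long millis) {
        long hours = millis / 3_600_000;
        long minutes = (millis / 60_000) % 60;
        long seconds = (millis / 1_000) % 60;
        long ms = millis % 1_000;
        return String.format(Locale.ROOT, "%02d:%02d:%02d:%03d", hours, minutes, seconds, ms);
    }

    public static void main(String[] args) {
        QueryProcessorSketch qp = new QueryProcessorSketch();
        qp.addEntry("title", "XML Basics", 0);

        long start = System.nanoTime();
        List<Integer> hits = qp.search("title", "XML Basics");
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;

        System.out.println((hits.isEmpty() ? "not found" : "found in blocks " + hits)
                + " in " + formatElapsed(elapsedMs));
    }
}
```

Because the index answers the lookup before any data is inflated, a query that finds nothing costs no decompression at all.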

RESULTS

Figure 3 : Comparative Results


Figure 3 shows the comparative results of our work (XVSGC) against other compressors. The graph shows that our approach gives better results than XMill, Gzip, and XQueC, and results roughly equal to those of XGrind.

CONCLUSIONS
To make an XML compressor adaptive and optimal, the underlying algorithm must itself be adaptive and optimal. We started this work with the objective of initiating an enquiry into existing XML compression techniques. The proposed work intends to achieve optimal, adaptive, and flexible XML compression together with efficient query evaluation.

REFERENCES
1. Mustafa Atay, Yezhou Sun, Dapeng Liu, Shiyong Lu, Farshad Fotouhi. Mapping XML Data to Relational Data: A DOM-Based Approach. Department of Computer Science, Wayne State University, Detroit, MI 48202.

2. Sherif Sakr. XML compression techniques: A survey and comparison. National ICT Australia (NICTA), 223 Anzac Parade, NSW 2052, Sydney, Australia. Journal of Computer and System Sciences 75 (2009) 303-322.

3. Pankaj M. Tolani, Jayant R. Haritsa. XGRIND: A Query-friendly XML Compressor. Proceedings of the 18th International Conference on Data Engineering (ICDE'02), IEEE, 2002.

4. Wilfred Ng, Wai-Yeung Lam, Peter T. Wood, Mark Levene. XCQ: A queriable XML compression system. Knowl Inf Syst (2006). DOI 10.1007/s10115-006-0012-z.

5. Wilfred Ng, Lam Wai Yeung, James Cheng. Comparative Analysis of XML Compression Technologies. Department of Computer Science, The Hong Kong University of Science and Technology, Hong Kong.

6. Al-Hamadani, Baydaa. Retrieving Information from Compressed XML Documents According to Vague Queries. July 2011. University of Huddersfield Repository, http://eprints.hud.ac.uk/

7. Vojtech Toman. Compression of XML Data. Master Thesis, Faculty of Mathematics and Physics, Charles University, Prague, March 20, 2003.

8. Wai Yeung Lam, Wilfred Ng, Peter T. Wood, Mark Levene. XCQ: XML Compression and Querying System. Hong Kong University of Science and Technology; Birkbeck College, University of London.

9. Andrei Arion, Angela Bonifati, Gianni Costa, Sandra D'Aguanno, et al. Efficient Query Evaluation over Compressed XML Data. In E. Bertino et al. (Eds.): EDBT 2004, LNCS 2992, pp. 200-218. Springer-Verlag Berlin Heidelberg, 2004.

10. Gregory Leighton and Denilson Barbosa. Optimizing XML Compression (Extended Version). arXiv:0905.4761v1 [cs.DB], 28 May 2009.

11. Pasco, R. 1976. Source Coding Algorithms for Fast Data Compression. Ph.D. dissertation, Dept. of Electrical Engineering, Stanford Univ., Stanford, Calif.

12. Llewellyn, J. A. 1987. Data Compression for a Source with Markov Characteristics. Computer J. 30, 2, 149-156.

13. Ryabko, B. Y. 1987. A Locally Adaptive Data Compression Scheme. Commun. ACM 16, 2 (Sept.), 792.

14. Ziv, J., and Lempel, A. 1977. A Universal Algorithm for Sequential Data Compression. IEEE Trans. Inform. Theory 23, 3 (May), 337-343.

15. H. Liefke and D. Suciu. XMill: An Efficient Compressor for XML Data. Proceedings of the ACM SIGMOD International Conference on Management of Data, pp. 153-164 (2000).

16. Al-Hamadani, Baydaa (2011). Retrieving Information from Compressed XML Documents According to Vague Queries. Doctoral thesis, University of Huddersfield.

17. J. Cheng and W. Ng. XQzip: Querying Compressed XML Using Structural Indexing. Proceedings of EDBT (2004).

18. P. M. Tolani and J. R. Haritsa. XGRIND: A Query-friendly XML Compressor. IEEE Proceedings of the 18th International Conference on Data Engineering (2002).

19. J. K. Min, M. J. Park, and C. W. Chung. XPRESS: A Queriable Compression for XML Data. Proceedings of the ACM SIGMOD International Conference on Management of Data (2003).

20. Antoshenkov. Dictionary-Based Order-Preserving String Compression. VLDB Journal 6, pp. 26-39 (1997).

21. G. Cleary, I. H. Witten. Data compression using adaptive coding and partial string matching. IEEE Trans. Commun. COM-32 (4) (1984) 396-402.

22. David Salomon. Data Compression: The Complete Reference. Springer-Verlag, 2004.

23. M. Girardot, N. Sundaresan. Millau: An encoding format for efficient representation and exchange of XML over the Web. Comput. Networks 33 (1-6) (2000) 747-765. XAUST Compressor

24. Sherif Sakr. An Experimental Investigation of XML Compression Tools. arXiv:0806.0075v1 [cs.DB], 31 May 2008.