
International Journal of Computer Trends and Technology (IJCTT), Volume 4, Issue 7, July 2013

Handling Queries in MANETs Using Apriori Algorithm to Increase Data Availability and Reduce Query Latency
Santhosh Kumar S1, Sundeep kumar K 2, Sridevi K.N 3
Computer Science and Engineering, CMR Institute of Technology. Bangalore, India.

Abstract- In mobile ad hoc networks, node and link failures are common because of frequent network partitions, and replication is the usual remedy: data items are replicated on mobile hosts. Several schemes exist to balance the trade-off between data availability and query latency under different system settings and requirements. However, these schemes do not consider the relationships between data items. We address this problem with a mining system that performs association rule mining, using the Apriori algorithm, to find frequent item (FI) sets. Nodes store this FI information in an FI table. When a data item is queried and it is found in the FI table, the related item is also queried and cached in the node. Extensive simulation results show that Apriori-based replication can achieve a balance between these two metrics (data availability and query latency) and provide satisfactory system performance.

Index Terms- data availability, query latency, mobile ad hoc network (MANET), association rule scheme.

I. INTRODUCTION
In mobile ad hoc networks (MANETs), nodes move freely, and node mobility causes network partitions: not all mobile nodes are reachable, so nodes in one partition cannot access data held by nodes in other partitions. Data replication has been widely used to improve data availability in MANETs [2]. Data availability can be improved by replicating data at mobile nodes that are not the holder of the original data, because with multiple replicas in the network the probability of finding one copy of the data is higher. Data replication can also reduce query latency, since mobile nodes can obtain the data from a nearby replica. However, it is impossible for one node to fetch all the data, given its constraints (memory, bandwidth). Taking these constraints into account, mobile nodes should not be expected to replicate all the data items in the network. One way to improve data access performance under the resource constraints of mobile nodes is for nodes to cooperate with each other, that is, to share part of their memory space to hold data for others [2], [6].

When a node replicates only part of the data, there is a trade-off between query latency and data availability. For example, replicating most data locally can reduce the query delay, but it reduces data availability, because many nodes may replicate the same data locally while other data items are not replicated by any node. To increase data availability, nodes should not replicate data that neighboring nodes already hold. This may increase the query latency, because some nodes may not be able to replicate the most frequently accessed data and have to access it from neighboring nodes. Although the delay of accessing data from neighbors is shorter than that of accessing it from the data owner, it is much longer than accessing it locally. To overcome this problem, data replication techniques exist that address both the query delay and the data availability issues. The problem with the existing system is that it does not consider the relationships between data items.

In our proposed solution, the data items queried at each node are sent to a mining system in the node. This mining system performs association rule mining using the Apriori algorithm to find the frequent item (FI) sets. Nodes store this FI information in an FI table. When a data item is queried and it is found in the FI table, the related item is also queried and cached in the node. Nodes also exchange their FI tables with any other node they meet, so all nodes can learn many FI patterns. This paper shows that data replication becomes easier than with previous schemes. The details of finding FI patterns are given in Section IV. As the number of nodes increases, data availability under FI-based replication increases more rapidly than under greedy replication. With Apriori-based replication, the delay of each query decreases as the number of nodes increases.
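The exchange of FI tables between nodes that meet can be pictured with a small sketch. The paper does not specify the table layout or the merge rule, so the representation below (a dictionary mapping a data item to the set of items frequently queried with it) and the function name `merge_fi_tables` are assumptions made purely for illustration.

```python
from typing import Dict, Set

# Assumed layout: data item -> items frequently queried together with it.
FITable = Dict[str, Set[str]]


def merge_fi_tables(own: FITable, received: FITable) -> FITable:
    """Return a new table holding the associations known to either node."""
    merged: FITable = {item: set(related) for item, related in own.items()}
    for item, related in received.items():
        merged.setdefault(item, set()).update(related)
    return merged


# Hypothetical tables of two nodes that meet.
table_a: FITable = {"d1": {"d2"}, "d3": {"d4"}}
table_b: FITable = {"d1": {"d5"}, "d6": {"d7"}}
merged = merge_fi_tables(table_a, table_b)
```

A plain union is only one possible choice; a real node would probably also need to bound the table size or age out old patterns, which this sketch leaves out.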

ISSN: 2231-2803

http://www.ijcttjournal.org

Page 2093

II. RELATED WORK
The existing system offers a few methods to address the data availability and query delay problems. The first is data replication: a node requests data from other nodes and stores replicas of that data in its memory before a network partition happens. The main replication schemes are:
1. Greedy technique: each node stores the most popular replicas in its local cache until the memory is full.
2. One-To-One Optimization technique: a node cooperates with one other node, so data replication is easy and the replicas are shared between these two nodes.
3. Reliable neighbour scheme: a node chooses another node that has a high capability to meet all other nodes in the network.
4. Reliable grouping (RG) scheme: a node always picks the most suitable data items to replicate on the group of neighboring nodes, to increase data availability and minimize data access latency within the group.

Several other strategies for replicating or caching data have been proposed [11], [12], [13], [14], [15]. Most of these strategies assume an environment where mobile hosts access databases at sites in a fixed network and replicate or cache data on the mobile hosts. They address the issue of keeping the original data and its replicas consistent with low communication costs. These strategies assume only one-hop wireless communication and are therefore completely different from our approach, Frequent Item Based Replication, which assumes multihop communication in mobile ad hoc networks.

Balancing the Trade-offs between Query Latency and Data Availability in MANETs: Because mobile nodes have limited storage capacity, they cannot store all the data items they require. As a result, they have to rely on other, neighboring nodes to get some data. If mobile nodes host only the data they require themselves, some data items may be replicated by every node while other data items are not replicated by any node. It is therefore important for mobile nodes to cooperate with each other and share part of their memory space to hold data for other nodes. Poor cooperation, however, may actually degrade performance, so the amount of memory space to share must be determined carefully. The authors proposed some techniques to solve this problem. In an ad hoc network, if the communication links between two mobile nodes are stable, more cooperation between these nodes can improve data availability; if the links between them are not very stable, it is better for the node to hold most of the required data locally. This technique mainly addresses data availability. To reduce query latency, it is better to place data near the nodes that require it. The degree of cooperation affects both data availability and query latency.
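The greedy technique in the list above, and the availability problem it causes, can be illustrated with a minimal sketch. The access frequencies, item sizes, and the function name `greedy_replicas` are hypothetical; the related work describes the idea only in words.

```python
from typing import Dict, List


def greedy_replicas(access_freq: Dict[str, float],
                    sizes: Dict[str, int],
                    capacity: int) -> List[str]:
    """Choose replicas in descending access frequency until memory is full."""
    chosen: List[str] = []
    used = 0
    # Ties are broken by item name so the result is deterministic.
    for item in sorted(access_freq, key=lambda d: (-access_freq[d], d)):
        if used + sizes[item] <= capacity:
            chosen.append(item)
            used += sizes[item]
    return chosen


# Hypothetical access frequencies shared by two neighbouring nodes.
freq = {"d1": 0.50, "d2": 0.30, "d3": 0.15, "d4": 0.05}
size = {"d1": 1, "d2": 1, "d3": 1, "d4": 1}
node_a = greedy_replicas(freq, size, capacity=2)
node_b = greedy_replicas(freq, size, capacity=2)
unreplicated = set(freq) - set(node_a) - set(node_b)
```

In this example both nodes pick the same two popular items, so `d3` and `d4` are held by neither node. This is the situation that the cooperative schemes try to avoid.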

III. PROPOSED METHODS
Problem Definition: Data replication schemes are effective at balancing metrics such as data availability and query delay, but they have two problems:
1. None of the algorithms considers the relationship between data items.
2. If the movement pattern of the nodes changes, the algorithms perform poorly.

Proposed Solution: To improve performance further, the Apriori algorithm is used along with these schemes.

Apriori algorithm: The mining system performs association rule mining using the Apriori algorithm to find the frequent item (FI) sets. Nodes store this FI information in an FI table. When a data item is queried and it is found in the FI table, the related item is also queried and cached in the node. Link failures between nodes in the network are predicted by a dedicated algorithm.
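The query-time behaviour described above (look up the queried item in the FI table and also fetch and cache its related items) might look roughly like the following sketch. The `Node` class, the `fetch` callable, and the table layout are assumptions for illustration; cache capacity limits are deliberately ignored here.

```python
from typing import Callable, Dict, List, Set


class Node:
    """Toy node: a local cache plus an FI table (item -> related items)."""

    def __init__(self, fi_table: Dict[str, Set[str]],
                 fetch: Callable[[str], str]) -> None:
        self.fi_table = fi_table
        self.fetch = fetch  # hypothetical: retrieves an item from the network
        self.cache: Dict[str, str] = {}

    def query(self, item: str) -> str:
        if item not in self.cache:
            self.cache[item] = self.fetch(item)
        # If the queried item is in the FI table, prefetch its related items.
        for related in sorted(self.fi_table.get(item, set())):
            if related not in self.cache:
                self.cache[related] = self.fetch(related)
        return self.cache[item]


remote_requests: List[str] = []


def fetch_from_network(item: str) -> str:
    """Stand-in for a remote lookup; records every request it serves."""
    remote_requests.append(item)
    return f"content of {item}"


node = Node({"d1": {"d2"}}, fetch_from_network)
node.query("d1")  # fetches d1 and prefetches the related item d2
node.query("d2")  # served from the local cache, no remote request
```

After both queries only two remote requests have been made, because the second query hits the prefetched copy.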

IV. ARCHITECTURE
The architecture of the proposed system is as follows:

The architecture of replication by association rule mining is as follows. Every node has a local cache, a cache manager, an FI miner, a node request handler, and so on. The FI miner decides which items to cache and which not to. The cache manager manages the cache in the node. The FI miner works on the transaction database. Each node communicates with other nodes through the wireless channel. The statistics module collects measurements such as the delivery ratio, which are used for the graphs.

THEORETICAL MODEL
The following is a formal statement of the problem. Let $I = \{i_1, i_2, \ldots, i_m\}$ be a set of literals, called the item set. Let $S$ be a set of transactions, where each transaction $T$ is a set of items such that $T \subseteq I$. Associated with each transaction is a unique identifier, called its TID (transaction id). We say that a transaction $T$ contains $X$, a set of some items in $I$, if $X \subseteq T$. An association rule is an implication of the form $X \Rightarrow Y$, where $X \subset I$, $Y \subset I$, and $X \cap Y = \emptyset$. The rule $X \Rightarrow Y$ holds in the transaction set $S$ with confidence $C$ if $C\%$ of the transactions in $S$ that contain $X$ also contain $Y$. The rule $X \Rightarrow Y$ has support $Sp$ in the transaction set $S$ if $Sp\%$ of the transactions in $S$ contain $X \cup Y$.

Given a set of transactions $S$, the problem of mining association rules is to generate all association rules whose support and confidence are greater than the user-specified minimum support and minimum confidence, respectively. The problem is usually divided into two subproblems. The first is to find the item sets whose occurrences in the database exceed a predefined threshold; those item sets are called frequent item sets. The second is to generate association rules from those frequent (large) item sets under the minimum confidence constraint.

Suppose one of the large item sets is $L_k = \{I_1, I_2, \ldots, I_k\}$. Association rules for this item set are generated in the following way. The first rule is $\{I_1, I_2, \ldots, I_{k-1}\} \Rightarrow \{I_k\}$; checking its confidence determines whether the rule is interesting or not. Further rules are generated by deleting the last item of the antecedent and inserting it into the consequent, and the confidence of each new rule is checked in the same way. This process is repeated until the antecedent becomes empty. Most research focuses on the first subproblem.

The Apriori-based frequent item technique finds the frequent sets $L$ in the transaction set $S$. Let $X, Y \subseteq I$ be any two item sets. Observe that if $X \subseteq Y$, then $\mathrm{sup}(X) \geq \mathrm{sup}(Y)$, which leads to the following two rules:
I. If $X$ is frequent, then any subset $Y \subseteq X$ is also frequent.
II. If $X$ is not frequent, then any superset $Y \supseteq X$ cannot be frequent.
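The definitions of support and confidence, and the rule generation procedure that moves the last antecedent item into the consequent, can be made concrete with a short sketch. The transaction data and the function names are hypothetical and only illustrate the definitions; they are not taken from the authors' implementation.

```python
from typing import FrozenSet, List, Sequence, Tuple

Rule = Tuple[Tuple[str, ...], Tuple[str, ...], float]


def support(itemset: FrozenSet[str],
            transactions: Sequence[FrozenSet[str]]) -> float:
    """Percentage of transactions that contain every item of `itemset`."""
    hits = sum(1 for t in transactions if itemset <= t)
    return 100.0 * hits / len(transactions)


def confidence(x: FrozenSet[str], y: FrozenSet[str],
               transactions: Sequence[FrozenSet[str]]) -> float:
    """Percentage of the transactions containing X that also contain Y."""
    sup_x = support(x, transactions)
    if sup_x == 0.0:
        return 0.0  # X never occurs, so the rule has no evidence
    return 100.0 * support(x | y, transactions) / sup_x


def rules_from_itemset(items: Sequence[str],
                       transactions: Sequence[FrozenSet[str]],
                       min_conf: float) -> List[Rule]:
    """Start from {I1..Ik-1} => {Ik}, then repeatedly move the last
    antecedent item into the consequent until the antecedent is empty."""
    rules: List[Rule] = []
    antecedent, consequent = list(items[:-1]), [items[-1]]
    while antecedent:
        c = confidence(frozenset(antecedent), frozenset(consequent), transactions)
        if c >= min_conf:
            rules.append((tuple(antecedent), tuple(consequent), c))
        consequent.insert(0, antecedent.pop())
    return rules


# Hypothetical query histories, one transaction per query session.
S = [frozenset(t) for t in (
    {"d1", "d2", "d3"}, {"d1", "d2"}, {"d1", "d3"},
    {"d2", "d3"}, {"d1", "d2", "d3"},
)]
rules = rules_from_itemset(["d1", "d2", "d3"], S, min_conf=60.0)
```

With these five transactions, {d1, d2} has support 60% and the rule {d1} ⇒ {d2} has confidence 75%. Of the rules derived from {d1, d2, d3}, only {d1, d2} ⇒ {d3} (confidence about 66.7%) passes the 60% threshold, while {d1} ⇒ {d2, d3} (50%) is rejected.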

APRIORI PSEUDO CODE
(Apriori based Frequent Item selection)

Apriori(T, ε)
    L_1 ← {large 1-itemsets that appear in more than ε transactions}
    k ← 2
    while L_{k-1} ≠ ∅
        C_k ← Generate(L_{k-1})
        for transactions t ∈ T
            C_t ← Subset(C_k, t)
            for candidates c ∈ C_t
                count[c] ← count[c] + 1
        L_k ← {c ∈ C_k | count[c] ≥ ε}
        k ← k + 1
    return ∪_k L_k

Based on the above observations, we can significantly improve the item set mining algorithm by reducing the number of candidates we generate, limiting the candidates to those that can potentially be frequent. First, we can stop generating supersets of a candidate once we determine that it is infrequent, since no superset of an infrequent item set can be frequent. Second, we can avoid any candidate that has an infrequent subset. These two observations can result in significant pruning of the search space. Each iteration consists of the following steps:
1. Find the frequent set $L_{k-1}$.
2. Join step: $C_k$ is generated by joining $L_{k-1}$ with itself.
3. Prune step: any $(k-1)$-item set that is not frequent cannot be a subset of a frequent $k$-item set, and hence should be removed.

Here $T$ is the set of transactions, ε is the minimum support count, $C_k$ is the candidate item set of size $k$, and $L_k$ is the frequent item set of size $k$.
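The pseudo code and the join and prune steps above can be sketched in Python as follows. This is a minimal, unoptimized illustration rather than the authors' Java implementation: the function name `apriori`, the parameter `min_count`, and the sample query histories are assumptions, and the sketch uses "at least `min_count` transactions" as the frequency test at every level.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List


def apriori(transactions: List[FrozenSet[str]],
            min_count: int) -> Dict[FrozenSet[str], int]:
    """Return each itemset found in at least `min_count` transactions,
    mapped to the number of transactions that contain it."""
    # L1: count single items and keep the frequent ones.
    counts: Dict[FrozenSet[str], int] = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_count}
    frequent = dict(current)

    k = 2
    while current:
        prev = set(current)
        # Join step: unite pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: drop candidates with any infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count how many transactions contain each surviving candidate.
        counts = {c: 0 for c in candidates}
        for t in transactions:
            for c in candidates:
                if c <= t:
                    counts[c] += 1
        current = {s: c for s, c in counts.items() if c >= min_count}
        frequent.update(current)
        k += 1
    return frequent


# Hypothetical query histories, one transaction per query session.
history = [frozenset(t) for t in (
    {"d1", "d2", "d3"}, {"d1", "d2"}, {"d1", "d3"},
    {"d2", "d3"}, {"d1", "d2", "d3"},
)]
frequent_sets = apriori(history, min_count=3)
```

For this history and `min_count=3`, the three single items and the three pairs are frequent, while {d1, d2, d3} appears in only two transactions and is discarded. In the proposed scheme, the frequent pairs are the kind of pattern a node would record in its FI table.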

WORKING
The node module communicates with other nodes wirelessly, and nodes move from one place to another. The node module checks the local cache whenever a data item is needed. If the item is not found, the node sends a request to neighbouring nodes. The node's query history is used by the FI mining module, which implements the Apriori algorithm to learn the FI patterns and moves them to the FI table. Nodes also exchange FI table contents with the other nodes they meet. Once a node notices that download requests for a particular chunk occur frequently and the request count exceeds the threshold value, the node keeps that data in its cache and discards previously stored, less requested data.

PERFORMANCE ANALYSIS
We implemented the proposed solution in Java and analyzed the performance of the system in terms of availability and delay.

Availability: the ratio of successful responses to the number of requests for a particular data item. Availability indicates whether the particular data is present in the requested region. Because MANET nodes are mobile, achieving good availability is hard. As the graph shows, the availability of data with FI-based replication is higher than with greedy replication, and it remains the best as the number of nodes increases.

Delay: the time between a request and its response. Generally, as the number of nodes or the data range increases, the delay increases, because a query for a particular data item has to be looked up in many nodes. To reduce delay, replication is based on frequent item sets, so frequently downloaded items are stored in the cache; when the next request for the same data arrives, the node serves it from its own cache. The figure shows that greedy replication has a much higher delay than FI-based replication.

CONCLUSION
In MANETs, network partitions commonly occur because of link failures. One way to improve data availability and reduce query latency is through data replication schemes. In this paper, we proposed association rule mining with the Apriori algorithm to relate data items, in order to further improve data availability and reduce query delay. The basic idea is that association rule generation is split into two separate steps: first, all frequent item sets in a database are found; second, association rules are formed from these frequent item sets under the minimum confidence constraint. An extensive performance evaluation demonstrates that the proposed schemes outperform the existing solutions in terms of data availability and query latency. The results also show that the Apriori algorithm improves data availability and reduces query latency.
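The request-count threshold rule from the WORKING section and the two metrics from the PERFORMANCE ANALYSIS section can be sketched as below. The class `ThresholdCache`, its eviction choice (evict the cached item with the fewest requests, and only if it has fewer requests than the newcomer), and the helper names are assumptions; the paper does not give these details.

```python
from typing import Dict, List


class ThresholdCache:
    """Cache an item once its request count exceeds a threshold."""

    def __init__(self, capacity: int, threshold: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.threshold = threshold
        self.request_count: Dict[str, int] = {}
        self.store: Dict[str, bytes] = {}

    def on_request(self, item: str, data: bytes) -> bool:
        """Record one request; return True if the item is cached afterwards."""
        self.request_count[item] = self.request_count.get(item, 0) + 1
        if item in self.store:
            return True
        if self.request_count[item] <= self.threshold:
            return False
        if len(self.store) >= self.capacity:
            # Assumed policy: evict the least requested cached item, but
            # only when the new item has been requested more often.
            victim = min(self.store, key=lambda d: self.request_count[d])
            if self.request_count[victim] >= self.request_count[item]:
                return False
            del self.store[victim]
        self.store[item] = data
        return True


def availability(successful: int, requests: int) -> float:
    """Ratio of successful responses to requests (0.0 if no requests)."""
    return successful / requests if requests else 0.0


def mean_delay(delays: List[float]) -> float:
    """Average time between request and response (0.0 if no samples)."""
    return sum(delays) / len(delays) if delays else 0.0


cache = ThresholdCache(capacity=1, threshold=2)
a_results = [cache.on_request("a", b"A") for _ in range(3)]
b_results = [cache.on_request("b", b"B") for _ in range(4)]
```

In this run, item `a` is admitted on its third request, and item `b` replaces it only on its fourth request, once `b` has been requested more often than `a`.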

REFERENCES
[1] Yang Zhang, "Balancing the Tradeoffs between Query Delay and Data Availability in MANETs," IEEE, April 2012.
[2] Takahiro Hara and Sanjay, "Data Replication for Improving Data Accessibility in Ad Hoc Networks," IEEE, November 2006.
[3] Bin Tang, Samir Das, "Benefit based Data Caching in Ad Hoc Networks," IEEE, 2006.
[4] Takahiro Hara and Madria, "Consistency Management Strategies for Data Replication in Mobile Ad Hoc Networks," IEEE, July 2009.
[5] Jiannong Cao, Yang Zhang Cao, Li Xie, "Data Consistency for Cooperative Caching in Mobile Environments," IEEE, 2007.
[6] Takahiro Hara, "Effective Replica Allocation in Ad Hoc Networks for Improving Data Accessibility," IEEE, 2001.
[7] Karen H., Baochun Li, "Efficient and Guaranteed Service Coverage in Partitionable Mobile Adhoc Networks," IEEE, 2002.
[8] Jiun-Long Huang and Ming-Syan Chen, "On the Effect of Group Mobility to Data Replication in Ad Hoc Networks," IEEE, May 2006.
[9] Takahiro Hara, "Quantifying Impact of Mobility on Data Availability in Mobile Ad Hoc Networks," IEEE, February 2010.
[10] Francoise Sailhan, Valerie Issarny, "Scalable Service Discovery for MANET," IEEE, 2005.
[11] D. Barbara and T. Imielinski, "Sleepers and Workaholics: Caching Strategies in Mobile Environments," 1994.
[12] Y. Huang, P. Sistla, and O. Wolfson, "Data Replication for Mobile Computer," Proc. ACM SIGMOD 94, pp. 13-24, 1994.
[13] J. Jing, A. Elmagarmid, A. Helal, "Bit-Sequences: An Adaptive Cache Invalidation Method in Mobile Client/Server Environments," 1997.
[14] E. Pitoura and B. Bhargava, "Maintaining Consistency of Data in Mobile Distributed Environments," 1995.
[15] K.L. Wu, P.S. Yu, and M.S. Chen, "Energy-Efficient Caching for Wireless Mobile Computing," 1996.
