Give


Graduate Exam Abstract


Chepchumba Limo

M.S. Final

May 6, 2015, 9:00 am - 11:00 am

LSC 304

Decentralized and Dynamic Community Formation in P2P Networks and Performance of Community Based Caching


Abstract: Distributed Hash Tables (DHT) are commonly used in large Peer-to-Peer networks to increase the efficiently of resolving queries. Minimizing the resource discovery time inP2P networks is highly desirable to improve system-wide performance. Distributed caching is an approach for reducing the look-up time. File sharing P2P networks have shown that there exists nodes/users who share similar interests based on semantics, geography, etc., and a group of nodes that share similar interests are said to form a community. A Community Based Caching (CBC) algorithm where nodes make caching decisions based on personal interests is investigated. A major contribution of CBC is that it alleviates the issue of nodes being limited to caching resources that are most popular in the entire network. Instead, caching decisions are primarily based on a node's community affiliations and interests. Community discovery algorithms that currently exists either need a centralized source to aid in community discovery or require additional messaging and complicated computations to determine whether to join a group or not. In many cases, nodes are also limited to being members of only one community at a time. A dynamic and decentralized community discovery algorithm, Dynamic Group Discovery (DGD), is proposed. DGD also allows nodes to be members of multiple communities at the same time. DGD's behavior and performance is then evaluated in conjunction with the Community Based Caching algorithm. To aid in group discovery during run time (i.e., dynamically), DGD uses special keys with embedded group identification information. Oversim, a flexible overly network simulation framework is used to evaluate the proposed DGD algorithm. Performance of DGD is compared to Chord and Static Group Allocation (SGA), in which group identification is done only once. Performance is evaluated for different network sizes, community sizes, and asymmetry among communities. Performance results are presented and analyzed when queries are resolved using cache data versus when queries are resolved using non-cache data. The analysis shows that DGD generally improves lookup performance when cache data is used to resolved queries. However, when non-cached data is used, DGD occasionally performs slightly worse than Chord and SGA. For example, in a network with 10,000 nodes, asymmetrical communities and no churn, DGD outperforms Chord by approximately half a hop and 0.1 seconds in latency. When churn was introduced to the same network, DGD performance drops by approximately one hop and 0.15 seconds in latency. The results also show that approximately 90% of the queries are resolved using non-cached data and therefore, even though DGD is guaranteed reduced lookup time when asymmetrical communities are present and cache records are to used to resolve queries, it is often not enough to significantly improve overall system performance. The results however confirm that caching resources based on personal interests really does reduced lookup performance when resolving queries using cache records. which loosely translates to there being few popular queries and many less popular queries. Therefore, already existing caching schemes would only favor a subset of nodes in the network. File sharing P2P networks have shown that there exists nodes/users who share similar interests based on semantics, geography etc., and a group of nodes that share similar interests are said to form a community. A Community Based Caching (CBC) algorithm where nodes make caching decisions based on personal interests is investigated. One of CBC’s major contributions is that it alleviates the issue of node’s being limited to caching resources that are popular relative to the entire network. Instead, caching decisions are primarily based on a node’s community affiliations and interests. Community discovery algorithms that currently exists either need a centralized source(s) to aid in community discovery or require additional messaging and complicated computations to determine whether to join a group or not. In many cases, nodes are also limited to being members of only one community at a time. A dynamic and decentralized community discovery algorithm, Dynamic Group Discovery (DGD), is proposed. In addition, DGD also allows nodes to be members of multiple communities at the same time. DGD’s behavior is then evaluated in conjunction with the Community Based Caching algorithm. To aid in group discovery during runtime (i.e., dynamically), DGD uses special keys with embedded group identification information. Oversim, a flexible overly network simulation framework is used to evaluate the proposed DGD algorithm. DGD’s performance is compared to Chord and Static Group Allocation (SGA – i.e., where groups identification is done only once and does not change) networks. For different cases, the size of the network, size of the communities and symmetry of the communities is varied. Results are presented and analyzed for performance when queries are resolved using cache data versus when queries are resolved using non-cache data. The analysis shows that using DGD generally improves lookup performance when cache data is used to resolved queries. However, when non-cache data is used, DGD occasionally performs slightly worse than Chord and SGA. For example, in a network with 10,000 nodes, asymmetrical communities and no churn group churn, DGD outperforms Chord by approximately half a hop and 0.1 seconds in latency. When churn was introduced to the same network, DGD’s performance drops by approximately one hop and 0.15 seconds in latency. The results also show that approximately 90% of the queries are resolved using non-cache data and therefore, even though DGD is guaranteed reduced lookup time when asymmetrical communities are present and cache records are to used to resolve queries, it is often not enough to significantly improve overall system performance. The results however confirm that caching resources based on personal interests really does reduced lookup performance when resolving queries using cache records.

Adviser: Dr. Anura Jayasumana
Co-Adviser: N/A
Non-ECE Member: christos@cs.colostate.edu
Member 3: lqyang@engr.colostate.edu
Addional Members: N/A

Publications:
N/A


Program of Study:
CS 457
CS 557
ECE 421
ECE 501
ECE 514
ECE 658
GRAD 544C
ECE 699