Paper Title
Improving Efficiency of Similarity of Document Network using Bisect K-Means

Abstract
We propose a new approach, based on bisect k-means and topic detection, for identifying clusters of documents that describe similar clinical conditions. Identifying cases with similar clinical characteristics in a database of clinical documents is a common problem in clinical informatics. The main goal of document clustering here is to identify clusters of documents that describe similar clinical cases, and to identify the topics of the documents that belong to the same cluster. To achieve this goal, the system first builds a document network by linking the reports in the VAERS dataset. We then apply the bisect k-means clustering algorithm to this network to find similar documents. The clustering algorithm is evaluated on two performance parameters, memory and time. The results show that bisect k-means clustering outperforms the k-means algorithm used in the previously available system on these measures. We are also working on a topic detection procedure for the document clusters, which will make it easier to understand the overall content of the documents that belong to the same cluster.

Index Terms - Bisect k-means clustering, topic detection, document clusters.
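
To make the clustering step concrete, the following is a minimal sketch of bisect k-means and not the system's actual implementation. It assumes each document has already been turned into a numeric feature vector, and it does not reproduce the document network built from the VAERS reports. All function names (`two_means`, `bisecting_kmeans`, and the helpers) and the toy data are hypothetical. The idea is to start with one cluster containing every document and to repeatedly split the cluster with the largest within-cluster error into two using ordinary k-means with k = 2, until the requested number of clusters is reached.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def sse(points):
    """Sum of squared distances from each point to the cluster centroid."""
    center = centroid(points)
    return sum(squared_distance(p, center) for p in points)


def two_means(points, rng, iterations=20):
    """Split points into two groups with plain k-means (k = 2)."""
    centers = rng.sample(points, 2)  # two random points as initial centers
    groups = ([], [])
    for _ in range(iterations):
        groups = ([], [])
        for p in points:
            d0 = squared_distance(p, centers[0])
            d1 = squared_distance(p, centers[1])
            groups[0 if d0 <= d1 else 1].append(p)
        if not groups[0] or not groups[1]:
            break  # degenerate split, let the caller decide what to do
        centers = [centroid(groups[0]), centroid(groups[1])]
    return groups


def bisecting_kmeans(points, k, seed=0):
    """Return up to k clusters by repeatedly bisecting the worst cluster."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    clusters = [list(points)]
    while len(clusters) < k:
        # Only clusters with at least two points can be split.
        candidates = [c for c in clusters if len(c) >= 2]
        if not candidates:
            break
        target = max(candidates, key=sse)  # cluster with the largest error
        left, right = two_means(target, rng)
        if not left or not right:
            break  # could not split further (e.g. identical points)
        clusters.remove(target)
        clusters.extend([left, right])
    return clusters


# Toy usage: six hypothetical 2-D document vectors forming two groups.
docs = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
clusters = bisecting_kmeans(docs, k=2)
```

Because each step only runs a two-way split on a single cluster, the per-step cost depends on the size of that cluster and not on the whole collection, which is one common reason bisecting variants are considered when time and memory matter.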