An Improved Data Clustering Algorithm for Mining Web Documents