Document Similarity Example
Each symbol represents a publication, coloured by its author (see legend below). Publications are clustered
by vocabulary similarity: the more similar a pair of publications, the closer together the layout tries to
place them in the graph. Each publication is linked to its 6 most similar publications.
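As a rough illustration of how "the 6 most similar publications" might be found, the sketch below ranks texts by the cosine similarity of their word counts. The similarity measure is an assumption (the page does not say which one the demonstrator uses), and the names `cosine_similarity` and `most_similar` are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors (assumed measure)."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(docs: dict, k: int = 6) -> dict:
    """Map each title to its k most similar other titles, best first."""
    # Naive tokenisation: lower-case and split on whitespace.
    counts = {title: Counter(text.lower().split()) for title, text in docs.items()}
    links = {}
    for title, vec in counts.items():
        scored = [
            (cosine_similarity(vec, other_vec), other)
            for other, other_vec in counts.items()
            if other != title
        ]
        # Stable sort on the score only, so ties keep their input order.
        scored.sort(key=lambda pair: pair[0], reverse=True)
        links[title] = [other for _, other in scored[:k]]
    return links
```

With `k=6`, the first entry in each list would correspond to the thick line in the graph and the remaining five to the thin lines.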
Key to author colours used in graph
Instructions
- Drag publications with the mouse to see how stable their positions are. Clicking on a publication reveals a
book summary and word cloud (in this demonstrator, word clouds are available for Dickens only).
- Holding the Shift key down while dragging zooms and pans across the graph: drag with the left mouse button
to zoom, or with the right mouse button to pan.
- 'R' key restores a zoomed/panned view to the original overview.
- 'L' key to toggle linking lines on and off. Thick lines join each document to its most similar publication;
thin lines to its five next most similar publications.
- Space key to toggle movement on and off. When movement is off, publications can be dragged without
returning to their original positions.
- Up and Down arrows (held down for a second or two) increase or decrease the minimum separation distance
between nodes. This can be useful for spreading out closely clustered publications.
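The behaviour described in these instructions (nodes that settle back after being dragged, and an adjustable minimum separation) is typical of a spring-style layout. The page does not describe the actual algorithm, so the following is only a generic sketch under that assumption; `relax`, `rest_len` and `step` are hypothetical names and parameters.

```python
import math
from itertools import combinations


def _offset(p, q):
    """Return the unit vector from p to q and the distance between them."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return (1.0, 0.0), 0.0  # arbitrary direction for coincident nodes
    return (dx / dist, dy / dist), dist


def relax(positions, links, min_sep=1.0, rest_len=2.0, step=0.1):
    """Return new positions after one relaxation step.

    positions: {name: (x, y)}; links: iterable of (name_a, name_b) pairs.
    """
    force = {name: [0.0, 0.0] for name in positions}

    def apply(a, b, unit, pull):
        # pull > 0 moves a and b towards each other; pull < 0 moves them apart.
        force[a][0] += unit[0] * pull
        force[a][1] += unit[1] * pull
        force[b][0] -= unit[0] * pull
        force[b][1] -= unit[1] * pull

    # Linked nodes behave like springs with a preferred length.
    for a, b in links:
        unit, dist = _offset(positions[a], positions[b])
        apply(a, b, unit, dist - rest_len)

    # Any two nodes closer than min_sep are pushed apart.
    for a, b in combinations(positions, 2):
        unit, dist = _offset(positions[a], positions[b])
        if dist < min_sep:
            apply(a, b, unit, dist - min_sep)

    return {
        name: (x + step * force[name][0], y + step * force[name][1])
        for name, (x, y) in positions.items()
    }
```

In this sketch, turning movement off would amount to not calling `relax`, and the Up and Down arrows would correspond to raising or lowering `min_sep` between steps.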
Changes since previous version
- Now uses database created 24th Nov with improved word counts
- Improved formatting of book summary window (ordinal rank and comma-separated large numbers)
- Improved labelling (now spans multiple lines)
- Improved colouring of books by author (maximises colour contrast between labels)
- Now uses top six most similar titles to improve clustering appearance
- Links between nodes can now be shown with colour and shape indicating tension
- Slightly improved performance
You can also download this as a stand-alone application, which runs slightly faster and at higher quality.
Last modified, 24th November, 2008