January 22nd, 2007

Tag maps update

As promised, here is an update to the tag maps appli­ca­tion I intro­duced below along with some explanations.

tag_maps.jpg

For the impa­tient: HERE’S THE LINK

(Update again: The lat­est ver­sion can be found here)

And for the curious, here are the explanations:

What are tags? Tags are personally chosen, free-form keywords assigned to digital content. So instead of putting a bookmark into a folder, you might assign it the tags “photography nature germany lomo”. If you later look for sites about photography, you will find it again under this term, but also if you search for “nature”. Another nice aspect is that your tagging creates an annotation to the existing content. If you share these annotations with others (e.g. via a public bookmarking service such as delicious), everybody benefits by discovering new sites and getting better matches for their searches.

You can find more info about tag­ging e.g. at Wikipedia.

What are tag clouds? Tag clouds represent a whole bunch of tags as weighted lists. The more often a tag has been used, the larger it is displayed in the list. This can be used to characterize single users, webpages, or whole communities.
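
To make the weighting concrete, here is a minimal sketch of how a tag cloud could derive font sizes from usage counts. The function name `tag_cloud_sizes`, the pixel range, and the linear scaling are illustrative assumptions, not how delicious or any particular service does it.

```python
from collections import Counter

def tag_cloud_sizes(bookmarks, min_px=10, max_px=32):
    """Map each tag to a font size that grows linearly with its usage count.

    `bookmarks` is a list of tag lists, one per bookmark (assumed format).
    """
    counts = Counter(tag for tags in bookmarks for tag in tags)
    lo, hi = min(counts.values()), max(counts.values())
    span = max(hi - lo, 1)  # avoid division by zero when all counts are equal
    return {
        tag: round(min_px + (n - lo) / span * (max_px - min_px))
        for tag, n in sorted(counts.items())  # alphabetical, like a classic cloud
    }

bookmarks = [
    ["photography", "nature", "germany", "lomo"],
    ["photography", "nature"],
    ["photography", "design"],
]
sizes = tag_cloud_sizes(bookmarks)
for tag, px in sizes.items():
    print(f"{tag}: {px}px")
```

The most frequent tag gets the largest size and every tag used only once gets the smallest, which is exactly the “big head” effect discussed further down.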

As an exam­ple, see my deli­cious tag cloud here:

Tag clouds can also be used for navigation: Click one of the tags and you will land on a web page displaying all of my bookmarks matching this tag.

What’s the problem with tag clouds?

Tag clouds are nice and really popular, but there is still quite some room for improvement:

* Tag clouds are not suited for long tail navigation: When tags simply add up over time, a consistent pattern emerges: there will be a few dominating tags (the “big head”) and a vast number of rarely used tags (the “long tail”). While the “big head” tags remain pretty constant over time and broadly characterize your interests, the “long tail” contains all the variety of things you encounter. Tag clouds visually prioritize the “big head”. However, both for browsing and for searching, access to the long tail is vital, since this is where the real information is contained.
* Summing up over time does not represent the dynamics of interests: Additionally, it can be questioned whether merely summing up tags is the right approach in general. What about topics you were once interested in, but aren’t anymore? Or conversely, very recent interests, which are pretty important to you but haven’t been tagged often enough to show up in the cloud? To solve this problem, Chirag Mehta had the nice idea of implementing tag clouds with a time slider. However, if you look at these, another problem becomes evident:
* Tag clouds are not suitable for animation: This is due to their alphabetical list order and visual messiness. Since every tag’s position in a tag cloud is defined by its predecessor’s size and position, things start jumping around once you start scaling tags. So tag clouds are not really suited to displaying the dynamic nature of tagging structures, that is, how tags appear and disappear.
* Tag clouds are ordered the wrong way: Tags denote concepts. As such, they have meaningful relations to each other. Tag clouds are ordered alphabetically or by size. It would be much more effective if tags that belong together could also be presented together.
Some of these rela­tions can be deduced auto­mat­i­cally, by observ­ing how tags are used: Some tags might always appear together, oth­ers some­times and oth­ers never. If tags co-occur fre­quently or have many com­mon “neigh­bors”, you can be sure the con­cepts denoted will be related in some manner.
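
As a rough illustration of “observing how tags are used”, the sketch below counts how often each pair of tags appears on the same bookmark. The data layout (one tag list per bookmark) and the function name `cooccurrence` are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

def cooccurrence(bookmarks):
    """Count how often each unordered pair of tags shares a bookmark."""
    pairs = Counter()
    for tags in bookmarks:
        # sorted(set(...)) gives each pair one canonical (a, b) order
        for a, b in combinations(sorted(set(tags)), 2):
            pairs[(a, b)] += 1
    return pairs

bookmarks = [
    ["photography", "nature", "germany", "lomo"],
    ["photography", "nature"],
    ["photography", "design"],
]
pairs = cooccurrence(bookmarks)
for (a, b), n in pairs.most_common(3):
    print(a, b, n)
```

Pairs that never show up together simply have a count of zero, so “always”, “sometimes” and “never” fall out of the same table.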

So whatcha gonna do about it?

All these issues led me to develop a mapping algorithm to analyse and display tag structures based on how tags occur together. Technically, it is based on a vector-space model, where each tagging action is assigned a point in a high-dimensional vector space. By applying the dimensionality reduction algorithms PCA (Principal Component Analysis) and CCA (Curvilinear Component Analysis), I calculate a two-dimensional map, which places frequently co-occurring tags close together. Additionally, covariance values for all tags are stored, so I know exactly how “related” each tag is to the others.
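
The following is a simplified sketch of the PCA half of that idea only, not the code behind the demo. It assumes each tagging action is a 0/1 vector with one dimension per tag, computes the tag covariance matrix, and uses the two leading principal axes (found by plain power iteration) as map coordinates. The CCA refinement step is left out, and all function names are made up for the example.

```python
import math

def covariance_matrix(rows):
    """Sample covariance between the columns (tags) of equal-length rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def principal_axes(cov, k=2, iterations=200):
    """Leading k (eigenvalue, eigenvector) pairs via power iteration + deflation."""
    d = len(cov)
    m = [row[:] for row in cov]
    axes = []
    for _ in range(k):
        v = [float(i + 1) for i in range(d)]  # fixed, non-uniform start vector
        norm = math.sqrt(sum(x * x for x in v))
        v = [x / norm for x in v]
        for _ in range(iterations):
            w = [sum(m[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:  # nothing left to extract
                break
            v = [x / norm for x in w]
        mv = [sum(m[i][j] * v[j] for j in range(d)) for i in range(d)]
        eigenvalue = sum(v[i] * mv[i] for i in range(d))
        axes.append((eigenvalue, v))
        for i in range(d):  # deflate: remove the component just found
            for j in range(d):
                m[i][j] -= eigenvalue * v[i] * v[j]
    return axes

def tag_map(tags, rows):
    """Return 2D coordinates per tag (PCA loadings) and the covariance matrix."""
    cov = covariance_matrix(rows)
    axes = principal_axes(cov)
    coords = {
        tag: tuple(math.sqrt(max(ev, 0.0)) * vec[i] for ev, vec in axes)
        for i, tag in enumerate(tags)
    }
    return coords, cov

tags = ["photography", "nature", "python", "code"]
# One row per tagging action; a 1 means the tag was used (toy data).
rows = [[1, 1, 0, 0]] * 3 + [[0, 0, 1, 1]] * 2 + [[1, 0, 0, 1]]
coords, cov = tag_map(tags, rows)
for tag, (x, y) in coords.items():
    print(f"{tag}: ({x:.2f}, {y:.2f})")
```

In this toy data, “photography” and “nature” have a positive covariance and land close together, while “photography” and “python” have a negative covariance and end up far apart. The stored covariance matrix is what would answer the “how related” question for any pair.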

This information can be used to display both maps and lists, ordered by “relatedness”. You can play around with some of the maps in the interactive demo.

tag_maps.jpg

In its ini­tial state, all tags are scaled accord­ing to their fre­quency. Click­ing a tag will trans­form the map or list and will bring all related tags to the front accord­ing to their degree of “relat­ed­ness”. Tags are col­ored accord­ing to their “fresh­ness”. Tags are con­sid­ered fresh if their aver­age usage has increased over the last 30 days.
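
One possible reading of the “freshness” rule, sketched under assumptions: compare a tag’s daily usage rate in the last 30 days with its daily rate before that, and call the tag fresh if the difference is positive. The exact formula used in the demo is not given here, so `freshness_delta` and its rate definition are hypothetical.

```python
from datetime import date, timedelta

def freshness_delta(usage_dates, today, window_days=30):
    """Recent daily usage rate minus earlier daily usage rate (assumed measure).

    A positive result means the tag is used more often lately than before.
    """
    cutoff = today - timedelta(days=window_days)
    recent = sum(1 for d in usage_dates if d > cutoff)
    earlier = len(usage_dates) - recent
    # Days between first use and the start of the window, at least 1
    earlier_days = max((cutoff - min(usage_dates)).days, 1)
    return recent / window_days - earlier / earlier_days

today = date(2007, 1, 22)
usage = {
    "lomo": [date(2006, 6, 1), date(2007, 1, 10), date(2007, 1, 15), date(2007, 1, 20)],
    "germany": [date(2006, 3, 1), date(2006, 4, 1), date(2006, 5, 1)],
}
fresh = {tag: freshness_delta(dates, today) > 0 for tag, dates in usage.items()}
print(fresh)
```

A tag with a recent burst of use comes out as fresh, while a tag last used months ago does not.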

To see how that freshness measure changes over time, there is also an animated version, described in an earlier post.

Clicking the 1D tab brings up a list representation of the same information. The interaction principle is the same.

tag_map_1d.jpg

Addi­tion­ally, I added two lit­tle dia­grams depict­ing the “fresh­ness” dis­tri­b­u­tion and the num­ber of occur­rences for each tag. These will be fil­tered once you click a tag.

mini_graphs.jpg

Insights gained and how to turn these exper­i­ments into usable interfaces

I would love to write a lit­tle bit about these issues, but this post is already long enough. I hope I will find the time in the next cou­ple of days to write a follow-up!

14 Responses to 'Tag maps update'

  1. Well-formed data » Tag clouds
    January 22nd, 2007 at 10:56 pm

    […] Just a lit­tle pointer to an ongo­ing project: [edit: The lat­est ver­sion can be found here] […]

  2. theCollegeKid
    January 23rd, 2007 at 4:47 pm

    Thanks. I’m just start­ing up a blog and I saw the but­ton for a “tag” and I had no idea what it was. This was helpful.

  3. acw
    January 26th, 2007 at 2:11 pm

    Well done! Thanx a lot!!

    Now I’m wait­ing for the 3D-version … But so far I’m very happy :-) Although I would like to inte­grate “my” map in my blog. Maybe it’s just some time to wait …

    Greetz, Anja

  4. Well-formed data » Emerging topics v2
    February 19th, 2007 at 12:58 am

    […] indi­vid­ual tag­ging behav­iour. You might have seen a first, ani­mated ver­sion of my stud­ies based on tag maps. The orig­i­nal ani­ma­tion shows the emer­gence of pre­vi­ously rarely used tags over time. Now I dug […]

  5. Jix
    March 14th, 2007 at 10:02 pm

    very very nice project! It adds visual depth to tag­ging… NICE!

  6. Stewart McKie
    March 25th, 2007 at 1:26 am

    Moritz

    At some time I would lke to chat with you relat­ing to my web site: http://www.scriptcloud.com

    Regards, Stew­art McKie

  8. Pnille
    April 8th, 2007 at 10:28 pm

    Hey Moritz, I am a stu­dent of Knowl­edgde Orga­ni­za­tion in CPH.,DK. Very inter­est­ing Blog u have! Won­der­ing if u came across any lit­ter­a­ture describing/discussing tag-clouds in rela­tion to other ways of rep­re­sent­ing data.? BTW: Any chance of get­ting to read your the­sis? :) Best Regards Pnille

  9. links for 2007-05-18 « Talkabout
    May 18th, 2007 at 5:07 am

    […] Well-formed data » Tag maps update (tags: tags visu­al­iza­tion del.icio.us tag­cloud via joshua) […]

  11. […] Well-formed data » Tag maps update All these issues lead me to develop a map­ping algo­rithm to ana­lyze and dis­play tag struc­tures based on how tags co-occur. Tech­ni­cally, it is based on a vector-space model, where each tag­ging action is assigned a point in a high-dimensional vec­tor space. (tags: algo­rithm ani­ma­tion data dat­a­min­ing visu­al­iza­tion tag­ging tag­cloud folk­son­omy design infor­ma­tion vec­tor math blog moritzstefaner) […]

  12. […] Moritz Ste­faner points out (and presents his own solu­tion for) sev­eral prob­lems with the format: […]

  14. Elastic tag map « Blog di Chiara Verdecchia
    April 13th, 2009 at 10:32 am

    […] in rete mi sono imbat­tuta in una tesi/esperimento di Moritz Ste­faner davvero inter­es­sante. La tesi risale al 2007 ed è stata progressivamente […]
