In a nutshell, similarity is very complex, and when we say two things are similar we really mean that they are similar in some specific sense. So our ontology focuses on reifying similarity - not just saying which things are similar, but also how it was determined that they are similar. As such, we treat similarity as a class rather than a property.
We define a class sim:Similarity for describing similarity statements. We then define the class sim:AssociationMethod for describing a method for determining similarity. By associating a similarity statement of type sim:Similarity with a method of type sim:AssociationMethod we are describing in what sense the elements involved in our statement are similar. We can further reify our method by providing provenance (who made the method) and even fully disclose our method by pointing to a graph describing our workflow. The diagram below illustrates a basic example involving two music tracks.
Here is the same example in Turtle:
@prefix sim: <http://purl.org/ontology/similarity/> .
@prefix mo: <http://purl.org/ontology/mo/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix : <#> .

:TrackA a mo:Track .
:TrackB a mo:Track .

# The similarity statement is itself a resource (a blank node here)
_:SimilarityStatement a sim:Similarity ;
    sim:element :TrackA ;
    sim:element :TrackB ;
    sim:method :TimbreSim .

# The method says in what sense the two tracks are similar
:TimbreSim a sim:AssociationMethod ;
    foaf:maker :Me ;
    sim:description <http://some.graph.uri> .

:Me a foaf:Person .
Hopefully this example is somewhat illustrative, although if you're not familiar with RDF or the Turtle syntax for serializing RDF it's probably not terribly useful. The important thing to understand is that :TimbreSim is our method for deriving our similarity statement. We specify the person who made this method (:Me) and we point to a Named Graph that describes our workflow. We could also attach a textual description and other information.
Of course this example is only illustrative and not very realistic. Enter our demo implementation, classical.catfishsmooth.net. Here we've taken some public domain classical music found on Musopen as well as some information about classical composers and their network of influence as specified by the Classical Music Navigator - this gives us a source of composer-to-composer similarity. We also use the Sonic-Annotator and some of the QMUL VAMP Plugins to analyze the audio files and get a couple of methods of determining track-to-track similarity.
In this implementation, we have 3 similarity methods:
- composer-to-composer influence similarity
- track-to-track audio-based timbre similarity
- track-to-track audio-based key similarity
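To make the point about methods a bit more concrete, here is a rough Python sketch of why it helps to treat similarity as a class. It is not how our demo is implemented: the dataclasses only stand in for sim:Similarity and sim:AssociationMethod, and the track names, method names and the similar_to helper are all made up for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class AssociationMethod:
    """Stands in for a sim:AssociationMethod: how similarity was determined."""
    name: str
    maker: str


@dataclass(frozen=True)
class Similarity:
    """Stands in for a sim:Similarity statement: elements plus a method."""
    elements: FrozenSet[str]
    method: AssociationMethod


def similar_to(item: str, statements: List[Similarity], method_name: str) -> List[str]:
    """Return the items stated to be similar to `item` under the named method."""
    found = set()
    for statement in statements:
        if statement.method.name == method_name and item in statement.elements:
            # Everything else in the statement is similar to `item` in this sense.
            found |= statement.elements - {item}
    return sorted(found)


# Hypothetical data: the same track takes part in two statements,
# each made in a different sense.
timbre = AssociationMethod(name="timbre", maker="Me")
key = AssociationMethod(name="key", maker="Me")
statements = [
    Similarity(frozenset({"TrackA", "TrackB"}), timbre),
    Similarity(frozenset({"TrackA", "TrackC"}), key),
]

print(similar_to("TrackA", statements, "timbre"))  # ['TrackB']
print(similar_to("TrackA", statements, "key"))     # ['TrackC']
```

Because each statement carries its own method, asking "what is similar to TrackA?" always comes with an answer to "in what sense?", which a plain similar-to property between two tracks could not give us.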
I've had some exciting (but brief) discussions with Mert Bay and Stephen Downie about integrating MuSim with the MIREX framework such that participants in the audio similarity task might optionally apply their algorithm to some Creative Commons dataset (probably Jamendo) and publish MuSim data for re-use and additional evaluation.
We haven't really covered the workflow description syntax or concepts here. This will be the focus of much future work. Our design leaves this specification open-ended, but we have used the N3-Tr framework developed by Yves Raimond in his PhD thesis to describe workflows. You can see some examples in our implementation.
For more details on MuSim you can read our ISMIR paper (pdf) or view the ontology specification.