User-Defined Metric Spaces on the Web

General similarity search of quantifiable resources is possible on the Web if these resources are represented in a standardized way. As a standardized data structure for representing quantifiable resources, a "Domain Vector" (DV) is proposed. A DV contains a feature vector, which represents the quantitative data, and a URI called the "Domain Space Identifier" (DSI). The DSI serves as a vector space identifier: it identifies the containing metric space, which is called the "Domain Space" (DS). The DS definition explains the meaning of every dimension (number) of the feature vector in the Domain Vector. DVs of the same DS are directly comparable using a given metric. In this way, similarities between the resources' data are mapped to spatial similarities between the feature vectors. Similarity search is therefore possible by calculating distances: the smaller the distance between the feature vectors of the representing DVs, the more similar the resources are. This DV search could be efficiently combined with conventional word-based search: Realization of vectorial web search * (2009).
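The idea described above can be illustrated with a minimal Python sketch. It is not the implementation referred to on this page. The class name `DomainVector`, the field `resource`, the helper names `euclidean` and `nearest`, the example.org URIs, and the choice of the Euclidean distance as "the given metric" are all assumptions made for illustration only.

```python
import math
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class DomainVector:
    """Hypothetical in-memory form of a DV: a DSI plus a feature vector."""
    dsi: str                     # Domain Space Identifier (a URI)
    features: Tuple[float, ...]  # one number per dimension defined by the DS
    resource: str                # assumed field: label/URL of the described resource


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance, used here as an assumed example of a DS metric."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same dimension")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(query: DomainVector,
            candidates: Sequence[DomainVector],
            k: int = 3) -> List[DomainVector]:
    """Return the k candidates closest to the query within the same DS."""
    # Only DVs that share the query's DSI are directly comparable.
    same_space = [dv for dv in candidates if dv.dsi == query.dsi]
    # Smaller distance means more similar, so sort ascending by distance.
    ranked = sorted(same_space,
                    key=lambda dv: euclidean(query.features, dv.features))
    return ranked[:k]


if __name__ == "__main__":
    ds_a = "http://example.org/ds/a"  # hypothetical DSI
    ds_b = "http://example.org/ds/b"  # hypothetical DSI
    stored = [
        DomainVector(ds_a, (1.0, 2.0), "r1"),
        DomainVector(ds_a, (4.0, 6.0), "r2"),
        DomainVector(ds_b, (1.0, 2.0), "r3"),  # other DS, never compared
    ]
    query = DomainVector(ds_a, (1.0, 2.5), "query")
    for dv in nearest(query, stored, k=2):
        print(dv.resource, euclidean(query.features, dv.features))
```

Filtering by DSI before computing any distance reflects the point that only DVs of the same DS are directly comparable. A real system would presumably use an index structure rather than the linear scan shown here.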

Talk pdf * (2010 presented here)

Application in medicine * (2010 published here)

Additional information:
A Searchable Patient Record Database for Decision Support * (2009, also accessible here)
Design of a global medical database which is searchable by human diagnostic patterns * (2008, also accessible here)
Short video * (uploaded 2008; it uses the old term "pattern", which is now replaced by "DV" *)

* The nomenclature and data representation changed during development. The latest version is the local implementation,
which has been online since July 2012. This video shows a demonstration of the implementation.
Please contact me if you are interested in details. There are many arguments for this approach. For example, a large amount of numeric data is stored in separate databases in the hidden web. The proposed DV structure also provides motivation to package such numeric data in a globally accessible and interchangeable form. It could thus help to make hidden numeric data from the deep web accessible to everyone.
The approach is efficient because the same software can handle and search numeric data on very different topics, that is, data with different DSIs. We therefore assume that the advantages of a standard will be recognized and that, sooner or later, a standard will be introduced. Over time, much more numeric data than before will be stored openly on the web.

DV, Domain Vector, DS, Domain Space, HTTP URI, feature vector, numeric search, Vectorial Web Search