Vectorial Web Search

General similarity search over quantifiable resources is possible on the Web if these resources are represented in a standardized way. As the standardized data structure for this representation, a "Vectorial Resource Descriptor" or "VRD" (formerly called "pattern"*) is proposed. It contains a feature vector that represents the quantitative data and a "Vector Space Identifier" or "VSI" (an HTTP URI) that uniquely identifies the meaning of every dimension (number) of the feature vector. Feature vectors of VRDs with the same VSI are directly comparable using a given metric. In this way, similarities in the resources' data are mapped to spatial proximity of the feature vectors, so similarity search reduces to calculating distances: the smaller the distance between the feature vectors of the representing VRDs, the more similar the resources. This vectorial (numeric) similarity search can be combined efficiently with conventional word-based search. We describe this here:
Realization of vectorial web search
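
As a rough illustration of the idea summarized above, the following Python sketch models a VRD as a VSI plus a feature vector and ranks candidates by distance. It is only a minimal sketch under stated assumptions: the field names, the example VSI URI, the example resources and the choice of the Euclidean metric are all hypothetical and are not taken from the linked description, which only requires "a given metric".

```python
from dataclasses import dataclass
from math import sqrt
from typing import List, Tuple


@dataclass(frozen=True)
class VRD:
    """Vectorial Resource Descriptor: a VSI plus a feature vector."""
    vsi: str                    # Vector Space Identifier (an HTTP URI)
    vector: Tuple[float, ...]   # one number per dimension defined by the VSI
    resource: str               # hypothetical field: URI of the described resource


def euclidean(a: Tuple[float, ...], b: Tuple[float, ...]) -> float:
    """Euclidean distance; assumed here as the metric of the vector space."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same dimension")
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def similarity_search(query: VRD, candidates: List[VRD], limit: int = 3):
    """Return up to `limit` (distance, VRD) pairs, nearest first.

    Only candidates that share the query's VSI are compared, because
    only their dimensions have the same meaning.
    """
    comparable = [c for c in candidates if c.vsi == query.vsi]
    ranked = sorted(
        ((euclidean(query.vector, c.vector), c) for c in comparable),
        key=lambda pair: pair[0],
    )
    return ranked[:limit]


# Hypothetical vector space: (body height in cm, body weight in kg).
BODY = "http://example.org/vsi/body-measures"
OTHER = "http://example.org/vsi/something-else"

index = [
    VRD(BODY, (170.0, 65.0), "http://example.org/records/a"),
    VRD(BODY, (180.0, 90.0), "http://example.org/records/b"),
    VRD(OTHER, (171.0, 66.0), "http://example.org/records/c"),  # different VSI
]

query = VRD(BODY, (172.0, 66.0), "http://example.org/records/query")
results = similarity_search(query, index)

for distance, vrd in results:
    print(f"{distance:8.3f}  {vrd.resource}")
```

Record c is numerically close to the query but is never compared, because its VSI differs. A combination with word-based search could, for example, first filter candidates by keyword and then rank the remainder by distance.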

Short video (with old nomenclature)*

One application is a medical database that uses this search concept:
Design of a global medical database which is searchable by human diagnostic patterns (also accessible here)

A shorter description with fewer examples but additional information about the structure:
A Searchable Patient Record Database for Decision Support (also accessible here)

A large amount of numeric data is stored in separate databases in the hidden web. The proposed VRD structure provides an incentive to package such numeric data in a globally accessible and interchangeable form, so it could also help to make hidden (numeric) data from the deep web accessible to everyone.

Keywords:
Patterns, VRD, Non-Text Search, Numerical Search, Feature Vectors, Pattern Names, Domain Names, Pattern Domains, Numeric Web Search, Searchable Global Medical Database, World Wide Web, Search Engines, Data Mining, Information Retrieval, Deep Web, Hidden Web

* Remark: The nomenclature and data representation have changed since 10 October 2009. Please contact me if you are interested in the details.

Contact