API (Application Programming Interface)

Since the very first version in 2001, HyperDic has been encoded in valid XHTML, an XML language that can interact with any XML processor. Thus, HyperDic can be used directly as an XML database. With XSLT stylesheets, it is easy to extract lexical information from HyperDic pages.

In HyperDic, all the senses of a word are presented on one page, and that page has a stable address on the web, following a canonical language/word format, where the language is represented as a two-character code ("en" for English, "es" for Spanish and "ca" for Catalan).

Within that page, each sense of the word has an HTML anchor that can be used to reference that particular sense.
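Because the pages are XHTML, a generic XML parser can list these anchors. The following is a minimal sketch using Python's standard xml.etree.ElementTree on an inline sample. The markup shown (one "a" element with an "id" attribute per sense) is only an assumption made for illustration and may not match the actual structure of HyperDic pages; the two identifiers are borrowed from the London example under "Permanent sense keys". The sample is parsed from a string, so nothing is fetched from the website.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical page fragment: the real HyperDic markup may differ.
SAMPLE_PAGE = """<html xmlns="http://www.w3.org/1999/xhtml">
  <body>
    <p><a id="n1500">London</a>: The capital and largest city of England</p>
    <p><a id="n1800">London</a>: United States writer of novels</p>
  </body>
</html>"""


def sense_anchors(xhtml_text):
    """Return the ids of all XHTML <a> elements that carry an id attribute."""
    root = ET.fromstring(xhtml_text)
    # Elements in an XHTML document live in the XHTML namespace.
    links = root.iter(f"{{{XHTML_NS}}}a")
    return [a.get("id") for a in links if a.get("id")]


print(sense_anchors(SAMPLE_PAGE))
```

An XSLT stylesheet could select the same nodes; the Python version is used here only because it needs no extra tooling.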

In versions 1.x of HyperDic, this anchor was based on WordNet synset offsets, which change between different versions of the database. Version 2 of HyperDic, introduced in 2004, replaced them with permanent sense identifiers.

Permanent sense keys

Since 2004, HyperDic has had permanent links to individual word senses, based on WordNet's sense keys, which do not change between different versions of WordNet. This makes HyperDic directly compatible with semantic web applications that rely on other versions of WordNet.

For example, the two senses of London are found on the page located at http://www.hyperdic.net/en/london.

Within that page, London (The capital and largest city of England) is referenced with the identifier n1500, while London (United States writer of novels) has the sense anchor n1800.
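A link to one particular sense can be assembled from the language code, the word and the sense anchor. The sketch below is only an illustration: the function name sense_url is hypothetical, the host is taken from the London example, and lowercasing the word is an assumption based on the /en/london address rather than a documented rule.

```python
from urllib.parse import quote

BASE_URL = "http://www.hyperdic.net"  # host taken from the London example


def sense_url(language, word, anchor=None):
    """Build a page URL, optionally pointing at one sense anchor."""
    # Assumption: words appear lowercased in the path, as in /en/london.
    url = f"{BASE_URL}/{language}/{quote(word.lower())}"
    if anchor:
        # The sense anchor goes into the URL fragment.
        url += f"#{anchor}"
    return url


print(sense_url("en", "London"))
print(sense_url("en", "London", "n1500"))
```

The first call yields the address of the whole page, the second the address of the sense "capital and largest city of England".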

Sense key encoding

These identifiers are directly equivalent to the corresponding WordNet sense keys, which have the following format: london%1:15:00::. The components used in sense key encoding are explained in WordNet's senseidx manual page.

The format used to express the sense keys in HyperDic is dictated by some constraints of HTML anchors: an anchor must start with a letter, and the % character must be avoided because it has a special meaning in URLs on the web. For this reason, the n in n1500 stands for the %1 component used to denote a noun in WordNet sense keys, while the digits 15 and 00 are carried over from the next two fields of the key.
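The mapping from a WordNet sense key to a HyperDic anchor can be sketched as follows. Only the noun case (%1 becomes n) is confirmed by the examples on this page; the letters for the other parts of speech follow WordNet's usual codes and are an assumption, as is ignoring the last two fields of the key (used by WordNet for adjective satellites). The function name is hypothetical.

```python
# Letter prefix per WordNet ss_type. Only "1" -> "n" is confirmed by the
# London example; the other letters are assumed from WordNet's usual codes.
POS_LETTERS = {"1": "n", "2": "v", "3": "a", "4": "r", "5": "s"}


def sense_key_to_anchor(sense_key):
    """Convert e.g. 'london%1:15:00::' into ('london', 'n1500')."""
    lemma, sep, lex_sense = sense_key.partition("%")
    fields = lex_sense.split(":")
    if not sep or len(fields) != 5:
        raise ValueError(f"not a WordNet sense key: {sense_key!r}")
    # Assumption: the last two fields (head word, head id) are not encoded.
    ss_type, lex_filenum, lex_id, _head_word, _head_id = fields
    if ss_type not in POS_LETTERS:
        raise ValueError(f"unknown ss_type in sense key: {sense_key!r}")
    return lemma, POS_LETTERS[ss_type] + lex_filenum + lex_id


print(sense_key_to_anchor("london%1:15:00::"))
print(sense_key_to_anchor("london%1:18:00::"))
```

The two calls reproduce the anchors n1500 and n1800 of the London page, together with the word that identifies the page.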

Interfacing semantic applications

This approach provides a stable hookup for semantic web applications that rely on other versions of WordNet: out of the 195,817 sense keys present in WordNet 1.7.1, 194,070 were still valid in WordNet 2.0, while only 1,747 (0.89%) had changed or disappeared.

To start interfacing with HyperDic, you may freely place links to HyperDic pages. You are also welcome to copy the HyperDic searchbox and/or the alphabetical HyperDic index bar on your web pages.

If you intend to use a program to automatically retrieve HyperDic pages, please contact us first, as we would need to make special arrangements to lift our restrictions for crawling the HyperDic website.

To get better acquainted with the structure of HyperDic, you can download MiniDic, a free version of HyperDic, which covers 1111 essential English words. Be sure to download the latest version, as versions earlier than 2.0 use deprecated synset-based anchors.

