Information Services

InformationService is a metadata element type that lets you reference data from external web services as metadata attached to CollectiveAccess records. It is also a plugin API that makes it easy to add support for additional external services. The exact information stored locally differs from plugin to plugin.

This metadata element type references data at a specific external web service by performing a lookup at the remote service and then letting the user pick a value from the result list. Core information about the selected item and a reference (URI) to the original resource are then stored.

For a full list of external web services supported by CollectiveAccess, please see

Configuration in an Installation Profile

A basic configuration of InformationService in an installation profile might look like:

<metadataElement code="my_element" datatype="InformationService">
  <labels>
    <label locale="en_US">
      <name>My InformationService Element</name>
    </label>
  </labels>
  <settings>
    <setting name="service"><!-- enter service here --></setting>
  </settings>
  <typeRestrictions>
    <restriction code="r1">
      <table>ca_objects</table>
      <settings>
        <setting name="minAttributesPerRow">0</setting>
        <setting name="maxAttributesPerRow">255</setting>
        <setting name="minimumAttributeBundlesToDisplay">1</setting>
      </settings>
    </restriction>
  </typeRestrictions>
</metadataElement>

Note that the service setting is mandatory, and defines the plugin used for that element. A list of available plugins is below.

Available Plugins

Below is a list of existing plugins and available settings.

CollectiveAccess

This plugin allows you to reference records in remote CollectiveAccess instances. Available settings are as follows:

Setting Name | Description | Example
service | Set service setting to ‘CollectiveAccess’ to use this plugin | CollectiveAccess
baseURL | URL used to query the information service | http://localhost/admin/
table | Valid CollectiveAccess table name | ca_entities
user_name | User name to authenticate with on remote system | webservice
password | Password to authenticate with on remote system | /
labelFormat | Display template to format query result labels with | ^ca_entities.preferred_labels
detailFormat | Display template to format detailed information blocks with | ^ca_objects.preferred_labels (^ca_objects.idno)

uBio

uBio is an initiative within the science library community to join international efforts to create and utilize a comprehensive and collaborative catalog of known names of all living (and once-living) organisms. Available settings for this implementation are:

Setting Name | Description | Example
service | Set service setting to ‘uBio’ to use this plugin | uBio
keyCode | uBio key code. See http://www.ubio.org/index.php?pagename=xml_services for details. Default is the ubio_keycode setting in app.conf | a1b2c3

Getty Linked Open Data Services

The Getty LOD Services are technically 3 different plugins that share a common code base. They allow referencing concepts in Getty’s AAT, TGN and ULAN vocabularies via Getty’s SPARQL Linked Open Data web service. Set the service setting to AAT, TGN or ULAN to use the corresponding service. The plugins use Getty’s SPARQL endpoint and its full-text indexes for fast lookups, and the full RDF representation of the concepts to display more detailed information and to make additional data available for search.

None of the 3 plugins has any custom settings at the element level, but they share a more comprehensive configuration in the configuration file linked_data.conf. The default configuration should work for most use cases. The file has 3 large blocks, one for each of the plugins (tgn, aat, ulan).

Their format is identical and consists of 3 settings:

Setting Name: search_text
Description: If set to 0 we use the luc:term field for searching, which only contains the terms/labels. If set to 1 we use the luc:text field instead, which can yield many more, but also more erratic, results. See http://vocab.getty.edu/doc/queries/#Exact-Match_Full_Text_Search_Query
Example: 1

Setting Name: detail_view_info
Description: List of attributes to show in the extended information panel. Info has to be in literal form, but can be pulled through related nodes (see below). Note that full URIs for both resources and literals have to be wrapped in < and >. See also http://www.easyrdf.org/docs/property-paths (this is the library we use to traverse the graph). Available settings are:

label defines the label used for this field in the extended info panel.
uri is an optional setting that allows pulling information through related RDF nodes.
literal should resolve to an RDF literal and defines the actual text that is pulled in for display.
limit limits the number of related nodes that are processed. Crawling the RDF graph can get very slow for a large number of nodes.
stripAfterLastComma lets you strip everything after (and including) the last comma in the individual literal string. This is useful for gvp:parentString, where the top-most category is usually not very useful.
invert is a setting handcrafted for gvp:parentString and inverts the hierarchy path so that it starts with the most generic node.

Note that this data is only visible when you scroll down the extended info panel. It is not available for search or in bundle displays! See below for how to add data for search.

Example:

type = {
    label = Type,
    # use uri if you want to pull from a related node
    uri = <http://vocab.getty.edu/ontology#placeTypePreferred>,
    literal = <http://www.w3.org/2004/02/skos/core#prefLabel>,
    limit = 1,
},

or

parentString = {
    label = Full path,
    literal = <http://vocab.getty.edu/ontology#parentString>,
    stripAfterLastComma = 1,
    invert = 1,
},

Setting Name: additional_indexing_info
Description: List of attributes to add to the search index (in addition to the display value). This allows you to use non-display information from the Getty services for search purposes. For instance, you might not want to display the full gvp:parentString for each related AAT keyword but you still want to search for the broader categories. Note that the syntax is virtually identical to the detail_view_info setting above, except for the absence of the label setting.

Example:

altLabels = {
    uri = <http://www.w3.org/2008/05/skos-xl#altLabel>,
    literal = <http://vocab.getty.edu/ontology#term>
}

Wikipedia

This service allows referencing Wikipedia articles. Available settings are:

Setting Name | Description | Example
service | Set service setting to ‘Wikipedia’ to use this plugin | Wikipedia
lang | 2- or 3-letter language code for Wikipedia to use. Defaults to “en”. See http://meta.wikimedia.org/wiki/List_of_Wikipedias | en

This plugin also tries to pull in an abstract and a preview image for local display. Both the abstract and the preview image are available in bundle displays. Suppose your Wikipedia metadata element has the code wikipedia. You can reference additional properties of a referenced article like this:

ca_objects.wikipedia.<property>

where <property> is one of the following:

Property | Description
image_thumbnail | Image thumbnail URL
image_thumbnail_width | Width of image thumbnail. Box is capped at 200px by 200px.
image_thumbnail_height | Height of image thumbnail. Box is capped at 200px by 200px.
image_viewer_url | (Valid for v1.5.1) URL for Wikipedia’s full screen image viewer
title | Title of the Wikipedia article
pageid | Numeric page identifier
fullurl | URL for the article
canonicalurl | Canonical URL for the article
extract | Extract of the article. This is usually an HTML representation of the full article!
abstract | CollectiveAccess tries to extract the first paragraph from the full article representation above to provide a shorter abstract. This is usually the part of the article shown above the table of contents, but the extraction might fail for poorly formatted articles.

Implementing New Plugins

InformationService implementations reside in /app/lib/Plugins/InformationService and should implement IWLPlugInformationService and extend BaseInformationServicePlugin. The class name must be “WLPlugInformationService<Service>” and the file name “<Service>.php”.

It can provide additional settings using the static $s_settings variable, usually derived from $g_information_service_settings_<Service>. It should set the “NAME” property of the info array in the constructor. The Wikipedia implementation is relatively simple, and uses most of the available features (except getDataForSearchIndexing()) so you could use that as a template.
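As a rough orientation, a new plugin following these conventions might start out like the sketch below. The service name Example, the apiKey setting, the getName() accessor, and the stand-in definitions of IWLPlugInformationService and BaseInformationServicePlugin are all assumptions made so the snippet runs on its own. In a real installation the interface and base class come from CollectiveAccess, and a settings entry may need more keys than shown here.

```php
<?php
// Stand-ins so this sketch runs outside CollectiveAccess. In a real plugin the
// interface and base class are provided by the application; remove these.
interface IWLPlugInformationService {
    public function lookup($pa_settings, $ps_search, $pa_options=null);
    public function getExtendedInformation($pa_settings, $ps_url);
}
class BaseInformationServicePlugin {
    protected $info = [];
    public function __construct() {}
}

// Hypothetical settings for a hypothetical "Example" service.
global $g_information_service_settings_Example;
$g_information_service_settings_Example = [
    'apiKey' => [
        'default' => '',
        'label' => 'API key',
        'description' => 'Key used to authenticate with the remote service',
    ],
];

// Class name follows the WLPlugInformationService<Service> pattern; the file
// would be saved as Example.php in /app/lib/Plugins/InformationService.
class WLPlugInformationServiceExample extends BaseInformationServicePlugin implements IWLPlugInformationService {
    public static $s_settings;

    public function __construct() {
        global $g_information_service_settings_Example;
        self::$s_settings = $g_information_service_settings_Example;
        parent::__construct();
        $this->info['NAME'] = 'Example';
    }

    public function lookup($pa_settings, $ps_search, $pa_options=null) {
        return ['results' => []]; // no remote call in this sketch
    }

    public function getExtendedInformation($pa_settings, $ps_url) {
        return ['display' => ''];
    }

    // Hypothetical accessor, only here so the sketch can show the NAME value.
    public function getName() {
        return $this->info['NAME'];
    }
}

$plugin = new WLPlugInformationServiceExample();
echo $plugin->getName(), "\n";
```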

Core Functions

The core functions you must implement are:

public function lookup($pa_settings, $ps_search, $pa_options=null);

where $pa_settings is an array containing the settings for this particular element (including the ones you provided) and $ps_search is the search expression provided by the user. The function should return an array whose “results” key holds a list of results for the given search expression. Each result should have a label, url and idno.
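A return value with this shape might be assembled as in the following sketch. buildLookupResults() and the $rawHits input are hypothetical stand-ins for whatever a remote service returns; only the results key and the label, url and idno fields come from the description above.

```php
<?php
// Hypothetical helper: convert raw hits from a remote service into the
// structure lookup() is expected to return.
function buildLookupResults(array $rawHits): array {
    $results = [];
    foreach ($rawHits as $hit) {
        $results[] = [
            'label' => $hit['title'],       // text shown in the result list
            'url'   => $hit['uri'],         // reference (URI) to the original resource
            'idno'  => (string)$hit['id'],  // identifier at the remote service
        ];
    }
    return ['results' => $results];
}

// Made-up sample data; example.org is a placeholder host.
$rawHits = [
    ['id' => 42, 'title' => 'Swordsmiths', 'uri' => 'https://example.org/records/42'],
];
print_r(buildLookupResults($rawHits));
```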

public function getExtendedInformation($pa_settings, $ps_url);

This should return an array with the “display” key set to an HTML representation of the given record (identified by the URL/URI). You can either go and look the detailed data up remotely or, for instance, call getExtraInfo() to get locally stored data (see below).
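One possible way to build such a return value from key/value data is sketched here. buildExtendedInformation() and the sample data are hypothetical; the only part taken from the description above is the “display” key holding HTML.

```php
<?php
// Hypothetical helper: render key/value data as an HTML snippet and wrap it in
// the array structure getExtendedInformation() is expected to return.
function buildExtendedInformation(array $data, string $url): array {
    $html = '';
    foreach ($data as $label => $value) {
        // Escape everything that originates from the remote service.
        $html .= '<b>' . htmlspecialchars($label) . '</b>: ' . htmlspecialchars($value) . "<br/>\n";
    }
    $safeUrl = htmlspecialchars($url);
    $html .= '<a href="' . $safeUrl . '" target="_blank">' . $safeUrl . '</a>';
    return ['display' => $html];
}

// Made-up sample data; example.org is a placeholder host.
$info = buildExtendedInformation(['Title' => 'Tom & Jerry'], 'https://example.org/records/42');
echo $info['display'], "\n";
```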

Optional functions

The functions listed below are optional, and have default (empty) implementations in BaseInformationServicePlugin, so it doesn’t hurt to leave them out of your plugin entirely. However, they can be used to provide useful features.

public function getExtraInfo($pa_settings, $ps_url);

Returns an array of key=>value pairs containing extra information to be stored locally, alongside the id, the display label and the URL. This data can be accessed using SearchResult::get(), so you should keep the keys alphanumeric, lowercase and without spaces.

public function getDataForSearchIndexing($pa_settings, $ps_url);

Returns a list of strings that are added to the search index for the record associated with this attribute. This allows you to add additional data points that can be used to find the CollectiveAccess record but are not necessarily available for display. Note that the data returned by getExtraInfo() is not indexed for search, so you might have to add the same data twice.
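The sketch below illustrates the two return shapes side by side, feeding the same remote record into both so that its values are stored for display and also indexed. The $record data and the helper names normalizeKey(), extraInfoFromRecord() and indexingDataFromRecord() are hypothetical; a real plugin would do this work inside getExtraInfo() and getDataForSearchIndexing() after fetching the record for $ps_url.

```php
<?php
// Hypothetical remote record; in a plugin this would be fetched using $ps_url.
$record = [
    'Page ID'   => 12345,
    'Full-URL'  => 'https://example.org/wiki/Sword',
    'Alt Names' => ['blade', 'sabre'],
];

// Keep keys lowercase, alphanumeric and free of spaces.
function normalizeKey(string $key): string {
    return preg_replace('/[^a-z0-9]/', '', strtolower($key));
}

// Shape of a getExtraInfo() return value: key => value pairs stored locally.
function extraInfoFromRecord(array $record): array {
    $info = [];
    foreach ($record as $key => $value) {
        $info[normalizeKey($key)] = is_array($value) ? join('; ', $value) : (string)$value;
    }
    return $info;
}

// Shape of a getDataForSearchIndexing() return value: a flat list of strings.
function indexingDataFromRecord(array $record): array {
    $strings = [];
    foreach ($record as $value) {
        foreach ((array)$value as $item) {
            $strings[] = (string)$item;
        }
    }
    return $strings;
}

print_r(extraInfoFromRecord($record));
print_r(indexingDataFromRecord($record));
```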

public function getDisplayValueFromLookupText($ps_text);

The default behavior is to use the (selected) label returned by the lookup() function as the display value for attribute values. That can be undesirable for use cases like the AAT, where on the one hand you want a lot of identifying information in the lookup dropdown, but on the other you probably don’t care about all that info once the “relationship” has been created, because the keyword is doing its job in the background (making the associated record findable). Maybe you just want a simple and short label instead to save space.

This function allows you to mangle the lookup text to create a different display value. The lookup text usually has the URL in it, so you could even look up additional info to pull in here if you wanted. An example can be found in the AAT implementation, where we do some regular expression magic to convert lookup texts:

before: [300025342] swordsmiths [people in crafts and trades by product, people in crafts and trades]

after: swordsmiths
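A conversion like the one above could be written roughly as follows. This is a minimal sketch, not the actual AAT implementation: the function name displayValueFromLookupText() and the regular expression are assumptions that only cover lookup texts of the form "[id] label [hierarchy]".

```php
<?php
// Hypothetical helper: strip an optional leading "[id]" and an optional
// trailing "[hierarchy]" from a lookup text, keeping only the label.
function displayValueFromLookupText(string $lookupText): string {
    $pattern = '/^\s*(?:\[[^\]]*\]\s*)?(.+?)(?:\s*\[[^\]]*\])?\s*$/u';
    if (preg_match($pattern, $lookupText, $matches)) {
        return trim($matches[1]);
    }
    // Fall back to the unchanged text if it does not match the expected form.
    return $lookupText;
}

$before = '[300025342] swordsmiths [people in crafts and trades by product, people in crafts and trades]';
echo displayValueFromLookupText($before), "\n"; // prints "swordsmiths"
```

Texts without brackets pass through unchanged, so the helper is safe to apply to labels that are already short.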