Expressions

Expressions are statements evaluated by CollectiveAccess to a text, numeric or boolean (true/false) value. Expressions can be used to conditionally trigger (or not) elements of an import mapping or display template, where the boolean value returned determines what happens. Values may be generated through use of functions (described in more detail below), comparisons and mathematical operations.

At its simplest an expression is a number or text quantity. These are examples of perfectly valid expressions:

  • 5

  • “Software is great”

You’ll notice in the examples above that numbers are just numbers while text must be enclosed in quotes (single or double). Any quantity that is non-empty and non-zero will evaluate to “true” meaning that 5 = true while 0 = false. -1 is also true, as it is a non-zero value. Any strings besides “” (no text at all) is true, even ” ” (a single space).

While single values are valid expressions, they’re usually only useful when used in conjunction with operators. Operators are symbols that take two operands (values), perform some operation, and return a new value based upon the operands. There are several types of operators available in expressions:

Comparison Operators

Comparison operators compare two operands and return true or false. The most common operator is “=”, which returns true if the operands are exactly the same, false if they are not. For example, “wood” = “wood” is true whereas “wood” = “cement” is not. In an import mapping it is possible to use the “=” operator to check if an input field is a certain value.

Other comparison operators are:

  • > greater than

  • < less than

  • >= greater than or equal

  • <= less than or equal

  • <> not equal

  • != not equal (alternate form)

Greater than/less than operators only work with numeric values. Equal and not equal work with numbers or text.

Boolean Comparison

As of version 1.8 values representing boolean true and false are available for use in comparisons. These allow you to more easily test the return value of an expression of function using the bare, unquoted word “true” or “false”. For example, this expression:

dateIsRange("1950's") = false

Would return true when dateIsRange() returns false, which is useful for importer actions and display templates where specific behaviors are triggered by true expressions.

Math Operators

With expressions you can perform mathematical operations on numbers using +, -, * and /. These are addition, subtraction, multiplication and division respectively. The + operator also works on text, and will merge two text values together into a single run-on text value. For example:

4 + 5

will return the value 9

"Julia" + " plus " + "Allison"

will return the value “Julia plus Allison”

Logical Operators

It is also possible to string together many expressions into a larger composite expression using the boolean logic operators “AND” and “OR”. “AND” returns true if, and only if, both operands evaluates as true. “OR” returns true if, and only if, at least one operand evaluates as true. For example:

(5 > 10) AND ("seth" = "seth")

is false because 5 is not greater than 10, and both expressions need to be true for the composite AND to be true

(5 > 10) OR ("seth" = "seth")

is true because “seth” = “seth” is true and only one needs to be true for logical OR to return true

Note

Prior to version 1.8 logical operators were required to be upper-case only. Both upper and lower-case operators are now allowed.

Additional Comparison Operators

The comparison operators shown above are useful but limited. There are a couple of additional ones that are really where the action is. They are:

The “IN” Operator

“IN” lets you compare a value to a list of values. It returns true if ANY value in the list matches the value you are comparing. For example:

"Seth" IN ["Julia", "Allison", "Sophie", "Maria", "Angie", "Seth"]

returns true while

"Joe" IN ["Julia", "Allison", "Sophie", "Maria", "Angie", "Seth"]

returns false.

There is also a related “NOT IN” operator which will return true if the value is not in the list.

The =~ (regular expression) Operator

You can compare a value against a regular expression using the =~ operator. Regular expressions are a very powerful and very flexible pattern matching syntax. At its most basic a regular expression is a simple bit of text that is matched anywhere in the value being compared. For example:

"Software is great" =~ /soft/

returns true.

Note that the regular expression is on the right side of the operator and is enclosed in “/” characters. This is a traditional notation for regular expressions; they are enclosed in the forward slashes to set them off from normal text.

There is also a related !~ operator which will return true when the value does not match the regular expression.

Variables

This is all well and good, but the above examples are not terribly useful with hardcoded values in them. Where things start getting truly useful is variables. Any source in an import record can be used as a variable by prefixing its name with a “^” character. So if you were importing an Excel spreadsheet and wanted to apply rules when the word “allison” appears anywhere in the value of column 4 you’d write

^4 =~ /allison/

Similarly, if you want to make sure that the value in the 10th column is equal to “metal” then you use the expression:

^10 = "metal"

If you wanted to make sure that both conditions applied to a record then you’d use:

(^4 =~ /allison/) AND (^10 = "metal")

If either would suffice you could use “OR” rather than “AND”

For XML input data the variable names are the XML paths – the exact same thing used in the source specification but with a “^” tacked onto the front.

Functions

Functions are black-boxes that you put a number of values into in order to get a single value out of. The expression system current allows the following functions:

Function

Description

Parameters

Return value

Example

abs

Returns the absolute value of a number (eg. changes negative numbers to positive ones); takes a single value as input

ceil

Rounds a fractional number up to the next highest integer; takes a single value as input

floor

rounds a fractional number down to the next lower integer; takes a single value as input

int

Forces a number to be an integer. If the number has a decimal component it is discarded; takes a single value as input

max

Returns the largest value of those passed to it; takes any number of values as input

min

Returns the smallest value of those passed to it; takes any number of values as input

round

Rounds the number to the closest integer; takes a single value as input

random

Returns a random number between zero and the number provided as input ; takes a single value as input

rand

Synonym for random

current

Evaluates to true if the supplied date expression encompasses the current server date/time [available from version 1.5]

String <date expression>

future

Evaluates to true if the supplied date expression ‘’ends’’ any time after the current server date/time. The start date is not considered, so the range may start before or after the current date/time and still evaluate to true [available from version 1.5]

wc

Returns number of words (wc = “word count”) in a supplied text value [available from version 1.5]

String <text>

length

Returns number of characters in a supplied text value [available from version 1.5]

String <text>

sizeof

Returns number of parameters. useful for counting values. See example below [available from version 1.6]

count

Synonym for sizeof

age

Calculates age in years. accepts an arbitrary number of parameters greater than 1. It’ll take the earliest and latest dates in the parameter list as start and end of the time span, so you don’t have to worry about the order. If the result is a span of 0 years (e.g. because only 1 date was passed), it’ll retry with the current date added to the list. This is useful to calculate something’s/someone’s current age. [available from version 1.6]

Any number of date expressions

ageyears

Alias for age [available from version 1.6]

Any number of date expressions

agedays

Same as age/ageyears, only for days. [available from version 1.6]

Any number of date expressions

avgdays

Calculates the average length of the time spans passed as parameters. Accepts an arbitrary number of parameters (>1). [available from version 1.6]

Any number of date expressions

formatdate

Formats a valid date expression using PHP’s [http://php.net/manual/en/function.date.php date() function]. Formats dates as ISO by default but accepts an optional second parameter to specify the format that gets passed to date(). See the PHP documentation for available options. [available from version 1.6]

String <date expression> [String <date format>]

formatgmdate

Formats a valid date expression in UTC using PHP’s [http://php.net/manual/en/function.gmdate.php gmdate() function]. Formats dates as ISO by default but accepts an optional second parameter to specify the format that gets passed to gmdate(). See the PHP documentation for available options. [available from version 1.6]

String <date expression> [String <date format>]

isvaliddate

Returns true if parameter parses as a valid date [available from version 1.7]

String <date expression>

date

Parses a natural language date into a pair of historic timestamp values, suitable for mathematical comparison.

String <date expression>

join

Returns a list of values delimited by the first argument. All other arguments are values. Alias ‘’implode’’. [available from version 1.7]

Any number of string values

implode

Synonym for join

trim

Trims leading and trailing whitespace from a string. [available from version 1.7]

<string> text

avg

Return average of parameter values.

Any number of numeric values

sum

Return sum of parameter values

Any number of numeric values

replace

Replace values using regular expression

String <Perl compatible regular expression> String <replacement value> String <subject value>

idnoUseCount

Return number of items a value is used as an identifier (idno) for a given table.

String <idno value> String <table> (optional, if omitted defaults to “ca_objects”)

dateIsRange

Return true if date is a range rather than a single day, month or year. [available from version 1.8]

String <date expression>

boolean

dateIsRange(1950’s)

fromUnixtime.”Convert Unix timestamp to ISO-8601 formatted date. [available from version 1.8]”

Integer <timestamp>

ISO date

To include the function-produced value in your expression just add the function name with a paren-enclosed list of values following. For example:

random(10) > 5

returns true if the random number between 0 and 10 is greater than 5.

  • ceil(5.2) returns 6

  • floor(5.6) returns 5

  • round(5.2) returns 5

  • round(5.6) returns 6

  • length(“hello”) returns 5

  • sizeof(1,2,3,4) returns 4

  • age(“23 June 1912”, “7 June 1954”) returns 41

  • age(“7 June 1954”, “23 June 1912”) returns 41 (order doesn’t matter)

  • age(“7 June 1954”, “9 May 1945”, “23 June 1912”) returns 41 (‘extra’ dates don’t matter)

  • age(“28 January 1985”) returns something > 29; 30 if you run it before 28 January 2016

  • agedays(“23 June 1912”, “7 June 1954”) returns 15324

  • agedays(“1912/06/23”) returns something > 37653

  • avgdays(“1912/06/23 - 1954/06/07”, “1985/01/28 - 2015/07/24”) returns 13229

  • avgdays(“1945/01/02 - 1945/01/03”, “1985/01/28 - 1985/01/29”) returns 1

  • formatdate(“1985/01/28”) returns 2015-08-05T14* 28* 31-04* 00. Note that this result can vary based on your time zone setting in setup.php!

  • formatgmdate(“1985/01/28”) returns 1985-01-28T05* 00* 00+00* 00. Note that this result can vary based on your time zone setting in setup.php!

  • formatgmdate(“1985/01/28”, “Y”) returns 1985

  • trim(” this text has spaces at the end “) returns “this text has spaces at the end”

  • join(“, “, “Smith”, “Bob”) returns “Smith, Bob”

Parentheses

You may have noticed that parens have been sprinkled through some of the examples. You can use matched parens to group elements of an expression. This makes it easier to read and also ensures that operators are applied in the desired sequence in complex expressions. The three things you need to know about parens are:

  • Each paren’ed sub-expression is evaluated as a single unit, before being combined with other sub-expressions

  • Always match each opening paren with a closing paren

  • Parens don’t hurt anything, but can improve readability of the expression so you are encouraged to use them liberally.