Table 12

The (Java) regular expressions used for the character feature in the GM task



Capitals, lower case, hyphen then digit


Capitals followed by digit


Single capital


Single Greek character

\p{InGreek}

Letters followed by digits


Lower case, hyphen then capitals


Single digit


Two digits


Four digits


Two capitals


Three capitals


Four capitals


Five or more capitals


Digit then hyphen


All lower case


All digits




Capital, lower case then digit


Lower case, capitals then any


Greek letter name

Match any Greek letter name

Roman digit


Capital, lower, capital and any


Contains digit


Contains capital


Contains hyphen


Contains period

.*\..*

Contains punctuation

.*\p{Punct}.*

All digits


All capitals


Is a personal title


Looks like an acronym


GM, gene mention.
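As an illustration of how such patterns might be turned into character features, the sketch below applies the three expressions that are legible in this table (single Greek character, contains period, contains punctuation) to a token. The class name, feature labels and the choice of whole-token matching are assumptions made for this example and are not taken from the original system.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper: the class name and feature labels are illustrative only.
class CharFeatureSketch {
    // Only the three patterns that are legible in Table 12 are included here.
    static final Map<String, Pattern> PATTERNS = new LinkedHashMap<>();
    static {
        PATTERNS.put("SINGLE_GREEK", Pattern.compile("\\p{InGreek}"));
        PATTERNS.put("CONTAINS_PERIOD", Pattern.compile(".*\\..*"));
        PATTERNS.put("CONTAINS_PUNCT", Pattern.compile(".*\\p{Punct}.*"));
    }

    // Returns the labels of all patterns that match the whole token
    // (assumption: each expression is tested against the entire token).
    static List<String> features(String token) {
        List<String> out = new ArrayList<>();
        for (Map.Entry<String, Pattern> e : PATTERNS.entrySet()) {
            if (e.getValue().matcher(token).matches()) {
                out.add(e.getKey());
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // "\u03b1" is the Greek small letter alpha.
        for (String t : new String[] {"\u03b1", "p53", "IL-2", "e.g."}) {
            System.out.println(t + " -> " + features(t));
        }
    }
}
```

Matcher.matches() succeeds only if the entire token matches, which is consistent with the "contains" expressions in the table being wrapped in leading and trailing .* rather than relying on a substring search.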

Alex et al. Genome Biology 2008 9(Suppl 2):S10   doi:10.1186/gb-2008-9-s2-s10
