This article contains information on creating and validating regular expression patterns in Aware, including examples for identifying sensitive data like PII and VAT IDs and custom patterns for industry-specific needs.
Regular Expressions are patterns that can trigger Events in Aware. They look for a specific combination of character, keyword, or number patterns within your connected Content Platform.
Examples include account numbers, addresses, credit card numbers, and national identification numbers. We have created and validated many common patterns for you, and they are available in Aware. If you would like to create a pattern for something industry- or company-specific, like an employee ID or customer account number, read below for some helpful tips.
Remember that your Customer Success Manager can help you build and validate a regular expression pattern for your use.
See http://www.regexr.com or https://www.regex101.com for a simple regular expression tool.
Creating a Regular Expression
There are many ways to go about creating regular expressions, but one of the most common ways is to follow these instructions:
- Identify the type of pattern you are trying to find within a message (e.g., a 6-digit number written as three pairs separated by dashes: ##-##-##)
- Use a RegEx tool for help when creating your pattern. The regular expression for this pattern is: \b\d{2}-\d{2}-\d{2}\b
- Validate in a RegEx tool or Aware that the pattern is correct and will bring back the intended content
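The validation step above can also be tried locally before the pattern is entered in Aware. The following is a minimal sketch in Python, assuming its standard `re` module behaves closely enough to the engine Aware uses for a simple pattern like this one. The sample messages and the `find_matches` helper are made up for illustration and are not part of Aware.

```python
import re

# Pattern from the steps above: three pairs of digits separated by dashes.
PATTERN = re.compile(r"\b\d{2}-\d{2}-\d{2}\b")

# Hypothetical sample messages, invented for this illustration.
samples = [
    "Your reference is 12-34-56, please keep it safe.",
    "Call extension 123-45-67 for help.",
    "No identifiers in this message.",
]

def find_matches(text):
    """Return every substring of text that the pattern matches."""
    return PATTERN.findall(text)

for message in samples:
    print(message, "->", find_matches(message))
```

Only the first message should produce a match. Note that `\b` treats a dash as a boundary, so a longer value such as 12-34-56-78 would still match on its first three pairs. Testing a few near-miss values like this helps confirm the pattern brings back only the intended content.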
Below are some examples of Regular Expression Patterns that we do not have available in our product:
- Personally Identifiable Information (PII): The following patterns match types of information that many countries consider to be personally identifiable.
| Country | Identity Number | Regular Expression Pattern |
| China (PRC) | Matches an 18-digit number. | \b\d{18}\b |
| Finland | Matches an 11-digit number where the last digit is sometimes a character. | \b\d{10}\w\b |
| Ireland | Matches a 7-digit number followed by two trailing characters. | \b\d{7}[a-zA-Z]{2}\b |
| Israel | Matches a 9-digit number. | \b\d{9}\b |
| Italy | Matches 6 letters followed by 9 digits and a final trailing character. | \b[a-zA-Z]{6}\d{9}\w\b |
| Poland | Matches an 11-digit number. | \b\d{11}\b |
| South Korea | Matches a 6-digit number followed by a dash and 7 trailing digits. | \b\d{6}-\d{7}\b |
| Sweden | Matches a 6-digit number followed by a dash and 4 trailing digits. | \b\d{6}-\d{4}\b |
| Switzerland | Matches an 11-digit number in the older grouping AAA.BB.CCC.DDD or a 13-digit number in the newer grouping 756.XXXX.XXXX.XY. | \b\d{3}[.]\d{2}[.]\d{3}[.]\d{3}\b|\b756[.]\d{4}[.]\d{4}[.]\d{2}\b |
| Spain | Matches an 8-digit number followed by a dash and a trailing letter: ########-X. | \b\d{8}-[a-zA-Z]\b |
| Taiwan | Matches a letter followed by 9 digits. | \b[a-zA-Z]\d{9}\b |
| Thailand | Matches a 13-digit number separated by dashes: #-####-#####-##-#. | \b\d{1}-\d{4}-\d{5}-\d{2}-\d\b |
| Turkey | Matches a 13-digit number. | \b\d{13}\b |
| United Kingdom | Matches a 10-digit number separated by dashes or the placeholder equivalent: ###-###-#### or xxx-xxx-xxxx. | \b\d{3}[-.]?\d{3}[-.]?\d{4}\b|xxx-xxx-xxxx |
| United States |
|
|
| Vietnam | Matches a 9-digit number in groupings of 3 separated by dashes: ###-###-###. | \b\d{3}[-.]?\d{3}[-.]?\d{3}\b|xxx-xxx-xxx |
- VAT IDs: The following patterns match VAT identification number formats, listed by country.
| Country | VAT ID | Regular Expression Pattern |
| Austria | Matches ATU + 8 digits. | \bATU\d{8}\b|U\d{8} |
| Belgium | Matches BE + 10 digits. | \bBE\d{10}\b|\d{10} |
| Bulgaria | Matches BG + 9 to 10 digits. | \bBG\d{9,10}\b|\d{9,10} |
| Croatia | Matches HR + 11 digits. | \bHR\d{11}\b|\d{11} |
| Cyprus | Matches CY + 8 digits + 1 trailing character. | \b(cy|CY)?\d{8}\w\b |
| Czech Republic | Matches CZ + 8 to 10 digits. | \b(cz|CZ)?\d{8,10}\b |
| Denmark | Matches DK + 8 digits. | \b(dk|DK)?\d{8}\b |
| Estonia | Matches EE + 9 digits. | \b(ee|EE)?\d{9}\b |
| Finland | Matches FI + 8 digits. | \b(fi|FI)?\d{8}\b |
| France | Matches FR + 2 characters followed by 9 digits. | \b(fr|FR)?[a-zA-Z]{2}\d{9}\b |
| Germany | Matches DE + 9 digits. | \b(de|DE)?\d{9}\b |
| Greece | Matches EL + 9 digits. | \b(el|EL)?\d{9}\b |
| Hungary | Matches HU + 8 digits. | \b(hu|HU)?\d{8}\b |
| Ireland | Matches IE + 7 digits followed by 1 or two characters. | \b(ie|IE)?\d{7}[a-zA-Z]{1,2}\b |
| Italy | Matches IT + 11 digits. | \b(it|IT)?\d{11}\b |
| Latvia | Matches LV + 11 digits. | \b(lv|LV)?\d{11}\b |
| Lithuania | Matches LT + 9 or 12 digits. | \b(lt|LT)?\d{9}\b|LT\d{12} |
| Luxembourg | Matches LU + 8 digits. | \b(lu|LU)?\d{8}\b |
| Malta | Matches MT + 8 digits. | \b(mt|MT)?\d{8}\b |
| Netherlands | Matches NL + 9 digits followed by the letter B and 2 more digits. | \b(nl|NL)?\d{9}B\d{2}\b |
| Poland | Matches PL + 10 digits. ###-###-##-## or ###-##-##-###. | \b(pl|PL)?\s\d{3}-\d{3}-\d{2}-\d{2}\b|PL\s\d{3}-\d{2}-\d{2}-\d{3} |
| Portugal | Matches PT + 9 digits. | \b(pt|PT)?\d{9}\b |
| Romania | Matches RO + 2 to 10 digits. | \b(ro|RO)?\d{2,10}\b |
| Slovakia | Matches SK + 10 digits. | \b(sk|SK)?\d{10}\b |
| Slovenia | Matches SI + 8 digits. | \b(si|SI)?\d{8}\b |
| Spain | Matches ES + a character or a digit followed by 7 digits and a final character or a digit. | \b(es|ES)?[a-zA-Z0-9]\d{7}[a-zA-Z0-9]\b |
| Sweden | Matches SE + 10 digits followed by 01. | \b(se|SE)?\d{10}01\b |
| United Kingdom (Standard) | Matches GB + 9 digits separated in groupings of 3, 4 and 2: GB### #### ##. | \b(gb|GB)?\d{3}\s\d{4}\s\d{2}\b |
| United Kingdom (Branch Traders) | Matches GB + 9 digits then a following block of 3 digits: GB######### ###. | \b(gb|GB)?\d{9}\s\d{3}\b |
| United Kingdom (Government Departments) | Matches GBGD + 3 digits. | \b(gbgd|GBGD)?\d{3}\b |
| United Kingdom (Health Authorities) | Matches GBHA + 3 digits. | \b(gbha|GBHA)?\d{3}\b |