Regular expressions

Available in Classic and VPC

A regular expression is a notation that describes many variations of a sentence in a single consolidated pattern. Chatbot builder uses its own regular expression syntax, called "nlu_script". Questions that natural language alone can't cover can be handled by making good use of regular expressions.

Basic description of nlu_script

Special symbols
The following special symbols are used as grammatical components in the nlu_script regular expression system.

( ) { } [ ] < > \ : + * ^ ~ = . | & - / `

Show regular expression syntax symbols as text
To enter a special symbol as literal text rather than as syntax, enclose it in double quotation marks (" "). A special symbol inside " " is recognized as text, not as regular expression syntax. Do not use single quotation marks.

잘지내(니|냐)"?"
-> 잘지내?, 잘지내니?, 잘지내냐?

Spacing
A pattern written without a space between tokens does not match input that contains a space at that position, whereas a pattern written with a space matches input both with and without it. It is therefore recommended to include as many spaces as the rules of Korean orthography permit.
For example, if you enter "안녕하세요", the chatbot recognizes only "안녕하세요", but if you enter "안녕 하세요", it recognizes both "안녕하세요" and "안녕 하세요".

안녕하세요
-> 안녕하세요 (O) / 안녕 하세요 (X)
안녕 하세요
-> 안녕하세요, 안녕 하세요 (O)
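
The spacing rule can be pictured with ordinary Python regular expressions. The sketch below is only an analogy, not how chatbot builder is implemented: the helper name spacing_to_regex is hypothetical, and it assumes that a space written in the pattern simply becomes optional in the input.

```python
import re

def spacing_to_regex(pattern: str) -> re.Pattern:
    """Turn a literal pattern into a regex in which each written space is optional."""
    tokens = pattern.split()
    # A space written in the pattern may be present or absent in the input;
    # where the pattern has no space, the input may not have one either.
    return re.compile(" ?".join(re.escape(token) for token in tokens))

no_space = spacing_to_regex("안녕하세요")
with_space = spacing_to_regex("안녕 하세요")

print(bool(no_space.fullmatch("안녕 하세요")))    # False
print(bool(with_space.fullmatch("안녕하세요")))   # True
print(bool(with_space.fullmatch("안녕 하세요")))  # True
```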

Basic notation

Required utterance expression [ ]

[ ] expresses required utterances. It is mainly used to group words that must appear, and it can be nested.
The example below places two required groups in sequence, each listing similar words separated by |, which means OR. This single regular expression covers 9 sentences.

[주문|배달|배송] [해줘|해주세요|해줘요]
-> 주문해줘, 배달해줘, 배송해줘, 주문해주세요, 배달해주세요, 배송해주세요, 주문해줘요, 배달해줘요, 배송해줘요
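
The count of 9 comes from choosing one alternative from each required group (3 × 3). A minimal Python sketch of that enumeration follows; the variable names are illustrative and the code only mirrors the arithmetic, not the matcher itself.

```python
from itertools import product

actions = ["주문", "배달", "배송"]        # alternatives in the first [ ] group
endings = ["해줘", "해주세요", "해줘요"]   # alternatives in the second [ ] group

# Pick one alternative from each required group. Endings vary slowest here
# only so that the order matches the list above.
sentences = [action + ending for ending, action in product(endings, actions)]

print(len(sentences))   # 9
print(sentences[:3])    # ['주문해줘', '배달해줘', '배송해줘']
```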

Optional utterance expression ( )

( ) expresses utterances that may or may not appear. It can be nested. It is mainly used to group postpositions or modifiers that may be omitted from a long question. In Korean, honorifics and verb endings can be grouped with ( ).
In the example 안녕(하세요|하신가|하냐), ( ) holds the endings that may optionally follow "안녕", and | means OR. This single pattern therefore handles 안녕, 안녕하세요, 안녕하신가, and 안녕하냐.

안녕(하세요|하신가|하냐)
-> 안녕, 안녕하세요, 안녕하신가, 안녕하냐
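
For readers familiar with conventional regular expressions, an optional group behaves much like a non-capturing group followed by `?`. The Python pattern below is an approximate analogue written for illustration, not the nlu_script engine.

```python
import re

# Rough Python analogue of 안녕(하세요|하신가|하냐): the group may be absent.
greeting = re.compile(r"안녕(?:하세요|하신가|하냐)?")

for utterance in ["안녕", "안녕하세요", "안녕하신가", "안녕하냐", "안녕하"]:
    print(utterance, bool(greeting.fullmatch(utterance)))
```

Only the last utterance, which ends in an unlisted ending, fails to match.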

Expression of connection between tokens ::

:: connects two tokens so that they match only when written together without a space. It is mainly used to attach an ending to a stem (stem::ending). The expression itself must be entered without spaces.

주[라|세요]
-> 주라, 주 라, 주세요, 주 세요
주::[라|세요]
-> 주라, 주세요

[아버지|어머니]가 방에 들어가신다
-> 아버지가 방에 들어가신다(O)/ 아버지 가방에 들어가신다(O)
[아버지|어머니]::가 방에 들어가신다
-> 아버지가 방에 들어가신다(O)/ 아버지 가방에 들어가신다(X)
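
The difference between the two forms can be approximated in Python by making the space between tokens optional or forbidden. This is a sketch under the assumption that adjacent tokens without :: tolerate one optional space, as the results above suggest; it is not the actual matcher.

```python
import re

loose = re.compile(r"주 ?(?:라|세요)")   # analogue of 주[라|세요]
tight = re.compile(r"주(?:라|세요)")     # analogue of 주::[라|세요]

print(bool(loose.fullmatch("주 세요")))  # True
print(bool(tight.fullmatch("주 세요")))  # False
print(bool(tight.fullmatch("주세요")))   # True
```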

OR logical symbol expression |

| is the logical OR operator. It is mainly used inside square brackets [ ] for required utterances or round brackets ( ) for optional utterances.

[칼로리|열량]알려줘
-> 칼로리 알려줘, 열량 알려줘

안녕(하세요|하신가|하냐)
-> 안녕, 안녕하세요, 안녕하신가, 안녕하냐

Expression to combine entered tokens ~[ ]~

~[ ]~ combines the tokens entered in [ ] in any order. Separate the tokens inside ~[ ]~ with commas (,). Because every ordering (permutation) of the tokens becomes a candidate pattern, entering too many tokens generates an excessive number of candidates, so ~[ ]~ accepts a maximum of 5 tokens.
For example, ~[account, creation]~ method covers both "account creation method" and "creation account method".

~[account, creation]~ method
-> account creation method, creation account method
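
The reason for the 5-token limit becomes clear when the orderings are enumerated. The Python sketch below assumes that each permutation yields one candidate pattern; the helper name combine is hypothetical.

```python
from itertools import permutations
from math import factorial

def combine(tokens, suffix):
    """List every ordering of the tokens, each followed by the fixed suffix."""
    return [" ".join(order) + " " + suffix for order in permutations(tokens)]

print(combine(["account", "creation"], "method"))
# ['account creation method', 'creation account method']

# Number of candidate orderings for 1 to 5 tokens.
print([factorial(n) for n in range(1, 6)])
# [1, 2, 6, 24, 120]
```

With 5 tokens there are already 120 orderings, and a sixth token would raise that to 720.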

Expression to combine entered tokens repeatedly [ ]^

[ ]^ repeats the token entered in [ ]. Because this can increase system load, the number of regular expressions that use this pattern is limited to 100 or fewer per domain.

  • [k]^(a,b): The token k entered in [ ] is repeated from a to b times.
  • [k]^<a: The token k entered in [ ] is repeated fewer than a times.
  • [k]^>a: The token k entered in [ ] is repeated more than a times.

[빨리]^(1,3) 
-> 빨리, 빨리빨리, 빨리빨리빨리(O) / 빨리빨리빨리빨리(X)
[빨리]^<3
-> 빨리, 빨리빨리(O) / 빨리빨리빨리, 빨리빨리빨리빨리(X)
[빨리]^>2
-> 빨리빨리빨리, 빨리빨리빨리빨리(O) / 빨리, 빨리빨리(X)
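
The three repetition forms map naturally onto bounded quantifiers in conventional regular expressions. The following Python sketch shows that mapping; the function repeat_to_regex is hypothetical, and the lower bound of 1 for the "<" form is an assumption based on the example results above.

```python
import re

def repeat_to_regex(token: str, spec: str) -> re.Pattern:
    """Translate a repetition spec such as '(1,3)', '<3' or '>2' into a Python regex."""
    unit = f"(?:{re.escape(token)})"
    if spec.startswith("("):
        low, high = (int(n) for n in spec.strip("()").split(","))
        return re.compile(f"{unit}{{{low},{high}}}")
    bound = int(spec[1:])
    if spec[0] == "<":
        # Fewer than `bound` repetitions; at least one is assumed.
        return re.compile(f"{unit}{{1,{bound - 1}}}")
    if spec[0] == ">":
        # Strictly more than `bound` repetitions.
        return re.compile(f"{unit}{{{bound + 1},}}")
    raise ValueError(f"unsupported repetition spec: {spec}")

for spec in ["(1,3)", "<3", ">2"]:
    regex = repeat_to_regex("빨리", spec)
    matched = [n for n in range(1, 5) if regex.fullmatch("빨리" * n)]
    print(spec, matched)
# (1,3) [1, 2, 3]
# <3 [1, 2]
# >2 [3, 4]
```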

Expression that matches an arbitrary word <?>

<?> is a wildcard that matches arbitrary words. Because it can cause unintended matches, it is recommended that you use it only when absolutely necessary.

  • <?>: Matches an arbitrary word of exactly one character (in the example below, "Hong" corresponds to the single Korean character 홍).
  • <?>*?: Matches arbitrary words. A match is made even when no word is present. There is no limit to the number of characters.
  • <?>+?: Matches arbitrary words. A match is not made if no word is present. There is no limit to the number of characters.

<?> Gildong
-> HongGildong (O)/HongHongGildong (X) 
[Hi] <?>*?
-> Hi Cheolsu (O)/Hi (O)
[Hi] <?>+?
-> Hi Cheolsu (O)/Hi (X)
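
As a rough comparison with conventional regular expressions, the three wildcard forms can be approximated in Python as shown below. This is an illustrative sketch only: it writes the name from the first example in Korean (홍길동) so that "one character" is literal, and it assumes that words are separated by single spaces.

```python
import re

# <?> 길동 : exactly one arbitrary character, then 길동 (the space is optional).
one_char = re.compile(r". ?길동")
# [Hi] <?>*? : "Hi", then zero or more arbitrary words.
zero_or_more = re.compile(r"Hi(?: \S+)*")
# [Hi] <?>+? : "Hi", then at least one arbitrary word.
one_or_more = re.compile(r"Hi(?: \S+)+")

print(bool(one_char.fullmatch("홍길동")))          # True
print(bool(one_char.fullmatch("홍홍길동")))        # False
print(bool(zero_or_more.fullmatch("Hi")))          # True
print(bool(one_or_more.fullmatch("Hi")))           # False
print(bool(one_or_more.fullmatch("Hi Cheolsu")))   # True
```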

Expression to specify an exception token \

\ specifies exception tokens that must not be matched in an utterance. Enter \ before a token to exclude it, which narrows the range of user utterances that the pattern accepts.

  • The exception expression must be placed in front of the wildcard.
  • We recommend that you use no more than one exception expression per regular expression pattern.

\[김철수|홍길동] <?>*? [슬픈|좋은|즐거운] 노래 틀어줘
-> 아이유 슬픈 노래 틀어줘 (O) / 아이유나 이지은 슬픈 노래 틀어줘 (O) / 지금 아이유 슬픈 노래 틀어줘 (O)
-> 아이유나 홍길동 슬픈 노래 틀어줘 (O) / 홍길동이나 아이유 슬픈 노래 틀어줘 (O)
-> 홍길동 슬픈 노래 틀어줘 (X) / 김철수 슬픈 노래 틀어줘 (X)

\[<?>*? 김철수|<?>*? 홍길동] <?>*? [슬픈|좋은|즐거운] 노래 틀어줘
-> 아이유 슬픈 노래 틀어줘 (O) / 아이유나 이지은 슬픈 노래 틀어줘 (O) / 지금 아이유 슬픈 노래 틀어줘 (O)
-> 아이유나 홍길동 슬픈 노래 틀어줘 (X) / 홍길동이나 아이유 슬픈 노래 틀어줘 (X)
-> 홍길동 슬픈 노래 틀어줘 (X) / 김철수 슬픈 노래 틀어줘 (X)

**[Examples of incorrect patterns]**
[\김철수|\홍길동] <?>*? [슬픈|좋은|즐거운] 노래 틀어줘
<?>*? [\김철수|\홍길동] [슬픈|좋은|즐거운] 노래 틀어줘
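
The results listed for the first valid pattern, \[김철수|홍길동] <?>*? [슬픈|좋은|즐거운] 노래 틀어줘, are consistent with a token-level reading in which only the leading token is checked against the exception list. The Python sketch below encodes that reading. It is an assumption inferred from the listed results rather than a description of the engine, and the names are hypothetical.

```python
EXCLUDED = {"김철수", "홍길동"}
MOODS = {"슬픈", "좋은", "즐거운"}

def matches_first_pattern(utterance: str) -> bool:
    tokens = utterance.split()
    # The exception rejects an utterance whose leading token is an excluded name.
    if not tokens or tokens[0] in EXCLUDED:
        return False
    # The wildcard may absorb any number of leading words; the tail is fixed.
    return (
        len(tokens) >= 3
        and tokens[-3] in MOODS
        and tokens[-2:] == ["노래", "틀어줘"]
    )

print(matches_first_pattern("아이유 슬픈 노래 틀어줘"))              # True
print(matches_first_pattern("홍길동이나 아이유 슬픈 노래 틀어줘"))   # True
print(matches_first_pattern("홍길동 슬픈 노래 틀어줘"))              # False
```

Under this reading, "홍길동이나" is a different token from "홍길동", which is why the second utterance is still accepted.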

Expression that substitutes a specific pattern with a designated text and saves it :

Use this expression to normalize a specific pattern to a designated text and save it. Usually, the entity analysis value is saved as the normalized value. The value settings can be changed so that, for tasks only, the value entered by the user is used as is.

  • Enter it in the form a:b. When the user enters a, it is replaced with b and saved as b.
  • Normalization applies only to the single token immediately before the :.

1. a1|a2|a3:b
- If the user enters a1: it matches the pattern and is saved as a1.
- If the user enters a2: it matches the pattern and is saved as a2.
- If the user enters a3: it matches the pattern and is replaced with b, so b is saved.
- If the user enters b: it does not match the pattern.

2. [a1|a2|a3]:b
- If the user enters a1: it matches the pattern and is replaced with b, so b is saved.
- If the user enters a2: it matches the pattern and is replaced with b, so b is saved.
- If the user enters a3: it matches the pattern and is replaced with b, so b is saved.
- If the user enters b: it does not match the pattern.

3. a1|[a2|a3]:b
- If the user enters a1: it matches the pattern and is saved as a1.
- If the user enters a2: it matches the pattern and is replaced with b, so b is saved.
- If the user enters a3: it matches the pattern and is replaced with b, so b is saved.
- If the user enters b: it does not match the pattern.

  • If English text is used for b, the normalized value is saved in lowercase letters only.

1. a:b
- If the user enters a: it matches the pattern and is saved as b.

2. a:B
- If the user enters a: it matches the pattern and is saved as b, not as B.

Note

This expression can only be used in pattern entities.
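
The three cases and the lowercase rule can be reproduced with a small Python function that maps each accepted input to the value that would be saved. This is a sketch restricted to the forms shown in this section. The function name parse_normalization is hypothetical, and the real parser may behave differently on other inputs.

```python
def parse_normalization(pattern: str) -> dict:
    """Map each accepted input token to the value that would be saved."""
    # Split on '|' only at the top level, that is, outside [ ].
    parts, depth, current = [], 0, ""
    for ch in pattern:
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
        if ch == "|" and depth == 0:
            parts.append(current)
            current = ""
        else:
            current += ch
    parts.append(current)

    saved = {}
    for part in parts:
        source, sep, target = part.partition(":")
        for token in source.strip("[]").split("|"):
            # Without ':' the token is saved as is; with ':' the target is
            # saved in lowercase, as described above for English text.
            saved[token] = target.lower() if sep else token
    return saved

print(parse_normalization("a1|a2|a3:b"))     # {'a1': 'a1', 'a2': 'a2', 'a3': 'b'}
print(parse_normalization("[a1|a2|a3]:b"))   # {'a1': 'b', 'a2': 'b', 'a3': 'b'}
print(parse_normalization("a1|[a2|a3]:b"))   # {'a1': 'a1', 'a2': 'b', 'a3': 'b'}
print(parse_normalization("a:B"))            # {'a': 'b'}
```

In every case the key b is absent from the result, which corresponds to "b does not match the pattern".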

Expression to call domain entities or system entities @{ }, `@{ }

@{ } and `@{ } are used to enter domain entities, system entities, regular expression variables, and system variables registered in the chatbot builder. @{ } calls words registered as a domain entity, and `@{ } calls terms registered in the system entities.

How to enter domain entity

@menu: chicken, pizza

@{menu}[order|delivery]
-> chicken order, chicken delivery, pizza order, pizza delivery
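
Calling a domain entity behaves like substituting a required group that lists the entity's words. The Python sketch below illustrates this with the menu entity from the example. The expand helper and the entities dictionary are hypothetical, and the helper ignores any literal text outside [ ] groups.

```python
import re
from itertools import product

entities = {"menu": ["chicken", "pizza"]}   # stands in for the registered domain entity

def expand(pattern: str, entities: dict) -> list:
    """Replace each @{name} with its words, then list one choice per group."""
    resolved = re.sub(
        r"@\{(\w+)\}",
        lambda m: "[" + "|".join(entities[m.group(1)]) + "]",
        pattern,
    )
    groups = re.findall(r"\[([^\]]*)\]", resolved)
    choices = [group.split("|") for group in groups]
    return [" ".join(combo) for combo in product(*choices)]

print(expand("@{menu}[order|delivery]", entities))
# ['chicken order', 'chicken delivery', 'pizza order', 'pizza delivery']
```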

How to enter system entity

`@{TELNUM}

`@{TELNUM}[search|check]please
-> 010-0000-0000 search please, 010-1234-5678 search please...

How to enter regular expression variable

@var.숫자: [1|2|3|4|5|6|7|8|9]

@var.{숫자} @var.{숫자} [세|살] (입니다)
-> 11세 입니다, 23살 입니다 

Expression to load form #{ }

#{ } is used to load forms. When creating a conversation, enter an answer in the #{form name} format to load an open-ended or multiple-choice form registered on the Form menu.

#{game service}

Expression to load action method ${ }

${ } is used to load action methods. Enter it in the form ${action method} to load a registered action method.

"The remaining point balance of ${membership.name} is ${membership.point} points."

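The effect of ${ } in an answer resembles placeholder substitution in a template. The Python sketch below illustrates the idea with a plain dictionary standing in for the values that action methods would return. The render function and the sample values are made up for illustration.

```python
import re

def render(template: str, values: dict) -> str:
    """Replace each ${name} with its value; unknown names are left untouched."""
    return re.sub(
        r"\$\{([^}]+)\}",
        lambda m: str(values.get(m.group(1), m.group(0))),
        template,
    )

# Hypothetical values; in the chatbot these would come from the action methods.
values = {"membership.name": "Cheolsu", "membership.point": 1200}
answer = "The remaining point balance of ${membership.name} is ${membership.point} points."

print(render(answer, values))
# The remaining point balance of Cheolsu is 1200 points.
```
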