Lumesh Regular Expression Module Documentation
Module Function List
Use help regex for assistance.
Function Description
Provides string processing capabilities based on regular expressions, implemented using the lightweight regex_lite library. All functions support two calling styles:
- Functional Call: regex.func(arg0, arg1...)
- Imperative Call: regex.func arg0 arg1...
Function List and Detailed Description
1. regex.match
- Functionality: Checks if the text matches the regular expression pattern.
- Parameters:
  - pattern: string - Regular expression.
  - text: string - Text to match.
- Return Value: boolean - Returns true if matched, otherwise false.
- Example:
# Check for digit match
regex.match '\d+' '123' # => true
regex.match '\d+' 'abc' # => false
2. regex.find
- Functionality: Finds the position and content of the first match.
- Parameters:
  - pattern: string - Regular expression.
  - text: string - Text to search.
- Return Value: [start, end, text] | none
  - start: integer - Starting index of the match (0-based).
  - end: integer - Ending index of the match (exclusive).
  - text: string - Matched content.
  - Returns none if no match is found.
- Example:
regex.find '\d+' 'abc123def'
# => [3, 6, '123']
3. regex.find_all
- Functionality: Finds all matches and their positions.
- Parameters:
  - pattern: string - Regular expression.
  - text: string - Text to search.
- Return Value: [[start, end, text], ...] - List of match results (may be empty).
- Example:
regex.find_all '\d+' '12a34b56'
# => [[0,2,'12'], [3,5,'34'], [6,8,'56']]
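The [start, end, text] triples use the left-closed, right-open convention noted under General Notes, so slicing the input from start to end gives back the matched text. The sketch below illustrates that convention with Python's standard re module as a stand-in. It is not Lumesh code, and the helper name find_all_spans is hypothetical. Python counts characters; whether Lumesh counts bytes or characters for non-ASCII text is not stated here.

```python
import re

def find_all_spans(pattern, text):
    """Return [[start, end, matched_text], ...] for every non-overlapping match."""
    return [[m.start(), m.end(), m.group(0)] for m in re.finditer(pattern, text)]

text = "12a34b56"
spans = find_all_spans(r"\d+", text)
print(spans)  # [[0, 2, '12'], [3, 5, '34'], [6, 8, '56']]

# Each span is half-open: text[start:end] reproduces the matched text.
for start, end, matched in spans:
    assert text[start:end] == matched
```

Because end is exclusive, end - start is the length of the match, and adjacent matches can share a boundary index without overlapping.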
4. regex.capture
- Functionality: Extracts the capture groups of the first match.
- Parameters:
  - pattern: string - Regular expression with capture groups.
  - text: string - Text to search.
- Return Value: [full, group1, group2, ...] | none
  - full: string - Full matched text.
  - groupN: string | none - Content of the N-th capture group (none if that group did not match).
- Example:
regex.capture '(\d+)-(\w+)' '123-abc'
# => ['123-abc', '123', 'abc']
5. regex.captures
- Functionality: Extracts the capture groups of every match.
- Parameters:
  - pattern: string - Regular expression with capture groups.
  - text: string - Text to search.
- Return Value: [[full, group1, ...], ...] - List of capture group lists, one per match.
- Example:
regex.captures '(\d+)' 'a1b2'
# => [['1'], ['2']]
6. regex.split
- Functionality: Splits text by the regular expression.
- Parameters:
  - pattern: string - Regular expression used as a delimiter.
  - text: string - Text to split.
- Return Value: [part1, part2, ...] - List of substrings after splitting.
- Example:
regex.split '\s*,\s*' 'a, b, c'
# => ['a', 'b', 'c']
7. regex.replace
- Functionality: Replaces all matches.
- Parameters:
  - pattern: string - Regular expression.
  - replacement: string - Replacement text (supports $n for capture group references).
  - text: string - Text to process.
- Return Value: string - New string after replacement.
- Example:
regex.replace '(\d+)' 'number:$1' '123 abc'
# => 'number:123 abc'
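To make the "replaces all matches" behavior and group references concrete, here is a small sketch using Python's standard re module as a stand-in. It is not Lumesh code: Python writes group references as \1 where this module uses $1, and the helper name replace_all is hypothetical.

```python
import re

def replace_all(pattern, replacement, text):
    """Replace every match of pattern in text; \\1, \\2, ... refer to capture groups."""
    return re.sub(pattern, replacement, text)

# A single match, mirroring the example above.
print(replace_all(r"(\d+)", r"number:\1", "123 abc"))   # number:123 abc

# Every match is rewritten, and groups can be reordered.
print(replace_all(r"(\d+)-(\w+)", r"\2-\1", "1-a 2-b"))  # a-1 b-2
```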
8. regex.capture_name
- Functionality: Extracts named capture groups.
- Parameters:
  - pattern: string - Regular expression with named capture groups (e.g., (?P<name>...)).
  - text: string - Text to search.
  - names: boolean (optional) - Whether to return group names (default is false).
- Return Value:
  - When names=false: [group1, group2, ...] | none - List of capture group values (skips the full match at index 0).
  - When names=true: [[name, value], ...] | none - List of pairs of group names and values.
- Example:
# Return group values
regex.capture_name '(?P<num>\d+)(\w+)' '123abc'
# => ['123', 'abc']
# Return group names and values
regex.capture_name '(?P<num>\d+)(\w+)' '123abc' true
# => [['num', '123'], [none, 'abc']]  (an unnamed group shows as none)
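The pairing of group names with values, including the placeholder for unnamed groups, can be sketched with Python's standard re module, which accepts the same (?P<name>...) syntax. This is an illustration of the output shape only, not the Lumesh implementation; capture_pairs is a hypothetical helper, and Python's None plays the role of none.

```python
import re

def capture_pairs(pattern, text):
    """Return [[name_or_None, value], ...] for the first match, or None if no match."""
    compiled = re.compile(pattern)
    m = compiled.search(text)
    if m is None:
        return None
    # groupindex maps group name -> group number; invert it for lookup by number.
    names = {number: name for name, number in compiled.groupindex.items()}
    # Start at 1 to skip the full match at index 0.
    return [[names.get(i), m.group(i)] for i in range(1, compiled.groups + 1)]

print(capture_pairs(r"(?P<num>\d+)(\w+)", "123abc"))
# [['num', '123'], [None, 'abc']]
```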
General Notes
- Regex Syntax: Follows regex_lite specifications (a lightweight subset of PCRE).
- Escape Handling: Backslashes must be doubled in double-quoted Lumesh strings (e.g., "\\d" represents a digit); single-quoted strings take '\d' as written.
- Error Handling:
- Throws errors for non-string parameters.
- Throws errors for invalid regex patterns.
- Indexing Rules: All position indices start from 0 (left-closed, right-open interval).
Regular Expression Syntax Reference
Follows standard regular expression syntax; special characters must be escaped with a backslash to be matched literally:
Character | Meaning | Example |
---|---|---|
. | Any character | a.c matches “abc” |
\d | Digit [0-9] | \d+ matches “123” |
\w | Word character [a-zA-Z0-9_] | \w+ matches “var1” |
\s | Whitespace character | a\sb matches “a b” |
* | 0 or more times | a*b matches “b”, “ab”, “aab” |
+ | 1 or more times | a+b matches “ab”, “aab” |
? | 0 or 1 time | a?b matches “b”, “ab” |
{n} | Exactly n times | a{3} matches “aaa” |
^ | Start of string | ^a matches strings starting with a |
$ | End of string | a$ matches strings ending with a |
[…] | Character set | [aeiou] matches any vowel |
[^…] | Negated character set | [^0-9] matches non-digit characters |
`a\|b` | Alternation | `a\|b` matches “a” or “b” |
() | Grouping | (ab)+ matches “abab” |
Note: In double-quoted strings, backslashes must be escaped, e.g., "\d" should be written as "\\d", while single-quoted strings do not require escaping: '\d'.
Advanced Usage
Performance Recommendations
- For frequently used regular expressions, use new to precompile and store the result.
- Prefer using find over match when only partial matches are needed.
- For simple string operations, consider using string functions instead of regular expressions.
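The first and third recommendations reflect a general pattern: compile once and reuse, and skip the regex engine when a plain string operation will do. A minimal sketch of both ideas, using Python's standard re module as a stand-in (re.compile is Python's API, not Lumesh's; the function names are hypothetical):

```python
import re

# Compile once, outside the loop, and reuse the compiled object.
DIGITS = re.compile(r"\d+")

def count_numeric_lines(lines):
    """Count lines containing at least one run of digits."""
    return sum(1 for line in lines if DIGITS.search(line))

def split_csv_simple(line):
    """A fixed delimiter needs no regex; a plain string split is enough."""
    return line.split(",")

print(count_numeric_lines(["abc", "a1", "22b", ""]))  # 2
print(split_csv_simple("a,b,c"))                      # ['a', 'b', 'c']
```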
Error Handling
- All functions validate the number and type of parameters.
- Invalid regular expressions will return descriptive errors.
- Non-matching cases typically return none instead of an error.