
Lumesh Regex Module


The Regex module provides regular expression operations: pattern matching, searching, capturing group extraction, text splitting, and replacement. All functions are built on the regex-lite library.

Function Overview

| Function Category | Main Functions | Purpose |
|---|---|---|
| Matching and Locating | find, find_all | Find match positions and content |
| Matching Validation | match | Validate if the entire text matches the pattern |
| Capturing Group Operations | capture, captures, capture_name | Extract captured group content |
| Text Processing | split, replace | Text splitting and replacement |

Matching and Locating Functions

find <pattern> <text> - Find the first match

  • Parameters:
    • pattern (required): String|Regex - Regular expression pattern
    • text (required): String - Text to search
  • Returns: Map|None - Match information mapping, containing start, end, and found fields; returns None if not found
  • Example: Regex.find(r'\d+', "abc123def") returns {start: 3, end: 6, found: "123"}

find_all <pattern> <text> - Find all matches

  • Parameters:
    • pattern (required): String|Regex - Regular expression pattern
    • text (required): String - Text to search
  • Returns: List[Map] - List of all matches, each element contains start, end, and found fields
  • Example: Regex.find_all(r'\d+', "abc123def456") returns [{start: 3, end: 6, found: "123"}, {start: 9, end: 12, found: "456"}]
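
For readers who want to check the documented return shapes outside Lumesh, the following is a rough Python sketch using the standard `re` module. It is not the Lumesh implementation: the helper names `find` and `find_all` are hypothetical stand-ins, and the offsets are assumed to agree with Lumesh only for ASCII input.

```python
import re

def find(pattern, text):
    """Hypothetical stand-in for Regex.find: first match as a map, or None."""
    m = re.search(pattern, text)
    if m is None:
        return None
    return {"start": m.start(), "end": m.end(), "found": m.group(0)}

def find_all(pattern, text):
    """Hypothetical stand-in for Regex.find_all: one map per match."""
    return [
        {"start": m.start(), "end": m.end(), "found": m.group(0)}
        for m in re.finditer(pattern, text)
    ]

print(find(r"\d+", "abc123def"))
# {'start': 3, 'end': 6, 'found': '123'}
print(find_all(r"\d+", "abc123def456"))
# [{'start': 3, 'end': 6, 'found': '123'}, {'start': 9, 'end': 12, 'found': '456'}]
```

The end offset is exclusive in this sketch, which matches the documented example where "123" spans start 3 to end 6.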

Matching Validation Functions

match <pattern> <text> - Check if the entire text matches the pattern

  • Parameters:
    • pattern (required): String|Regex - Regular expression pattern
    • text (required): String - Text to validate
  • Returns: Boolean - Returns true if the entire text matches
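
The difference between whole-text validation and a plain search can be sketched in Python with the standard `re` module. This is an analogy only: it assumes that "entire text" means the pattern must span the input from start to end, and the helper name `matches_entire` is hypothetical.

```python
import re

def matches_entire(pattern, text):
    """Hypothetical helper: True only if the pattern spans the whole text."""
    return re.fullmatch(pattern, text) is not None

print(matches_entire(r"\d+", "123"))      # True
print(matches_entire(r"\d+", "abc123"))   # False: leading letters are not covered
print(re.search(r"\d+", "abc123") is not None)  # True: a search accepts a partial hit
```

Anchoring the pattern explicitly with ^ and $ keeps the intent unambiguous whichever semantics the engine applies.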

Capturing Group Operation Functions

capture <pattern> <text> - Get the capturing groups of the first match

  • Parameters:
    • pattern (required): String|Regex - Regular expression containing capturing groups
    • text (required): String - Text to match
  • Returns: List|None - List of captured groups, index 0 is the full match, subsequent indices are each capturing group
  • Example: Regex.capture(r'(\d{4})-(\d{2})-(\d{2})', "2023-12-25") returns ["2023-12-25", "2023", "12", "25"]

captures <pattern> <text> - Get the capturing groups of every match

  • Parameters:
    • pattern (required): String|Regex - Regular expression containing capturing groups
    • text (required): String - Text to match
  • Returns: List[List] - One list of captured groups per match

capture_name <pattern> <text> - Get named capturing groups

  • Parameters:
    • pattern (required): String|Regex - Regular expression containing named capturing groups
    • text (required): String - Text to match
  • Returns: Map|None - Mapping of named capturing groups, with keys as group names and values as matched content
  • Example: Regex.capture_name(r'(?P<year>\d{4})-(?P<month>\d{2})', "2023-12") returns {year: "2023", month: "12"}
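
The three capture functions can be compared side by side with a Python sketch built on the standard `re` module. The helper names are hypothetical stand-ins, not Lumesh code, and the sketch assumes that captures places the full match at index 0 of each inner list in the same way capture does.

```python
import re

def capture(pattern, text):
    """Groups of the first match: index 0 is the full match."""
    m = re.search(pattern, text)
    return None if m is None else [m.group(0), *m.groups()]

def captures(pattern, text):
    """Groups of every match, one inner list per match (assumed layout)."""
    return [[m.group(0), *m.groups()] for m in re.finditer(pattern, text)]

def capture_name(pattern, text):
    """Named groups of the first match as a name -> text mapping."""
    m = re.search(pattern, text)
    return None if m is None else m.groupdict()

date = r"(\d{4})-(\d{2})-(\d{2})"
print(capture(date, "2023-12-25"))
# ['2023-12-25', '2023', '12', '25']
print(captures(date, "2023-12-25 to 2024-01-02"))
# [['2023-12-25', '2023', '12', '25'], ['2024-01-02', '2024', '01', '02']]
print(capture_name(r"(?P<year>\d{4})-(?P<month>\d{2})", "2023-12"))
# {'year': '2023', 'month': '12'}
```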

Text Processing Functions

split <pattern> <text> - Split text by regular expression

  • Parameters:
    • pattern (required): String|Regex - Splitting pattern
    • text (required): String - Text to split
  • Returns: List[String] - List of split strings
  • Example: Regex.split(r'\s+', "hello world test") returns ["hello", "world", "test"]

replace <pattern> <replacement> <text> - Replace all matches

  • Parameters:
    • pattern (required): String|Regex - Matching pattern
    • replacement (required): String - Replacement content
    • text (required): String - Target text
  • Returns: String - Text after replacement
  • Example: Regex.replace(r'\d+', "X", "abc123def456") returns "abcXdefX"
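
As a cross-check of the two examples above, the same inputs can be run through Python's standard `re` module. This is only an analogy to the documented behavior; `re.split` and `re.sub` are Python functions, not Lumesh ones, although `re.sub` happens to take its arguments in the same pattern, replacement, text order.

```python
import re

# Split on runs of whitespace.
parts = re.split(r"\s+", "hello world test")
print(parts)     # ['hello', 'world', 'test']

# Replace every run of digits with "X".
replaced = re.sub(r"\d+", "X", "abc123def456")
print(replaced)  # abcXdefX
```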

Parameter Handling Mechanism

The functions in the Regex module handle parameter types flexibly:

  • The pattern parameter accepts either a Regex value or a String value
  • String patterns are compiled into regular expressions automatically
  • Detailed error messages are provided
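
The accept-either-type behavior listed above can be sketched in Python as follows. This is an illustration of the general technique under stated assumptions, not the Lumesh source: `to_regex` is a hypothetical helper, and the wording of the error message is invented for the example.

```python
import re

def to_regex(pattern):
    """Hypothetical helper: normalize a str or compiled pattern to a compiled one."""
    if isinstance(pattern, re.Pattern):
        return pattern                      # already compiled, use as-is
    if isinstance(pattern, str):
        try:
            return re.compile(pattern)      # compile strings on demand
        except re.error as err:
            # Report which pattern failed and why (message format is illustrative).
            raise ValueError(f"invalid regex {pattern!r}: {err}") from err
    raise TypeError(f"pattern must be a string or a regex, got {type(pattern).__name__}")

compiled = re.compile(r"\d+")
print(to_regex(compiled) is compiled)            # True
print(to_regex(r"\d+").search("abc123").group()) # 123
```

Normalizing the pattern once at the top of each function keeps the rest of the function body independent of how the caller supplied it.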

Usage Examples

Basic Matching Operations

# Find numbers
Regex.find(r'\d+', "Price: $123.45")
# Returns: {start: 8, end: 11, found: "123"}

# Validate email format
Regex.match(r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$', "user@example.com")
# Returns: true

Capturing Group Operations

# Parse date
Regex.capture(r'(\d{4})-(\d{2})-(\d{2})', "Today is 2023-12-25")
# Returns: ["2023-12-25", "2023", "12", "25"]

# Using named capturing groups
Regex.capture_name(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})', "2023-12-25")
# Returns: {year: "2023", month: "12", day: "25"}

Text Processing

# Split text
Regex.split(r'[,;]\s*', "apple, banana; cherry")
# Returns: ["apple", "banana", "cherry"]

# Replacement operation
Regex.replace(r'\b\w+@\w+\.\w+\b', "[EMAIL]", "Contact us at support@example.com")
# Returns: "Contact us at [EMAIL]"

Pipeline Operation Examples

# Extract all numbers and sum them
"Price: $123, Tax: $45, Total: $168" | Regex.find_all(r'\d+') | List.map((m) -> Into.int(m.found)) | List.sum()
# Result: 336

# Clean and format text
"  Hello,   World!  " | Regex.replace(r'\s+', " ") | String.trim()
# Result: "Hello, World!"
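
The two pipelines above can be traced step by step with an equivalent Python sketch using the standard `re` module. It is an analogy only: each Python expression stands in for one pipeline stage, and nothing here is Lumesh syntax.

```python
import re

# Stage by stage: find all digit runs, convert each to int, then sum.
text = "Price: $123, Tax: $45, Total: $168"
numbers = [int(m.group(0)) for m in re.finditer(r"\d+", text)]
total = sum(numbers)
print(numbers)  # [123, 45, 168]
print(total)    # 336

# Collapse whitespace runs to a single space, then trim both ends.
cleaned = re.sub(r"\s+", " ", "  Hello,   World!  ").strip()
print(cleaned)  # Hello, World!
```

The intermediate list shows why the sum is 336: the pattern \d+ picks up the three whole numbers 123, 45, and 168.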

Notes

The Regex module is built on the regex-lite library. All functions accept either a String or a Regex value as the pattern and handle the conversion automatically. Capturing groups can be extracted by position or by name. In the function signatures, <> denotes a required parameter and [] denotes an optional one.

It is recommended to use raw strings (e.g., r'pattern') to avoid the complexity of escape characters.

In practical use, chained calls are supported, such as:

let rg = r'[,;]\s*'
rg.split("apple, banana; cherry")

The examples in this document spell out the module name (as in Regex.split) for clarity.
