Replacing overview
Documentation for version: 0.41.2
Use Pattern.replace() to perform regular expression search and replace.
# Replacing with a constant value
To replace occurrences of a regular expression pattern in a given subject, use Pattern.replace() to
instantiate a Replace operation for the subject, and .with() to replace each occurrence with a
constant string value.
Replace.with() returns a new string with the content of the subject, but with occurrences of the
regular expression replaced by the given string argument.
If the pattern doesn't match the subject at all, .with() returns the subject unmodified.
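The behavior described above can be sketched as follows. The T-Regx call in the comment assumes a Pattern::of() factory, which is not shown on this page. The runnable part uses PHP's built-in preg_replace() purely to illustrate the expected results; it is not T-Regx's implementation.

```php
<?php
// Assumed T-Regx equivalent (requires the library; Pattern::of() is an assumption):
//   Pattern::of('\d+')->replace($subject)->with('#');

// Plain-PHP illustration of replacing every occurrence with a constant string.
$replaced = preg_replace('/\d+/', '#', 'Rooms 12 and 345');

// The pattern does not match here, so the subject comes back unmodified.
$unchanged = preg_replace('/\d+/', '#', 'No digits here');

echo $replaced, PHP_EOL;  // Rooms # and #
echo $unchanged, PHP_EOL; // No digits here
```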
# Replacing with an empty string
While it is possible to call Replace.with() with an empty string "" to remove the occurrences,
it is clearer to call Pattern.prune() instead.
# Replace with a group
Method Replace.withGroup() replaces each occurrence of a regular expression in the subject with the
content of one of the expression's capturing groups. Pass the ordinal number of the group as the
first argument of Replace.withGroup().
For example, in a subject containing "www.google.com" and "www.t-regx.com", the regular expression
www.([\w.-]+) first matches substring "www.google.com", in which capturing group 1 captures
"google.com". When the replacement is invoked using withGroup(1), the matched occurrence is replaced
by the content of that capturing group. The second occurrence of the matched pattern is
"www.t-regx.com", in which the first capturing group captures "t-regx.com".
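A minimal sketch of this replacement in plain PHP follows. The helper replaceWithGroup() and the subject string are hypothetical, and the dot after www is escaped here as an assumption. The helper only mimics the described outcome of withGroup() with preg_replace_callback(); it is not how T-Regx is implemented.

```php
<?php
// Hypothetical stand-in for Replace.withGroup(int), built on preg_replace_callback().
function replaceWithGroup(string $pattern, string $subject, int $group): string
{
    return preg_replace_callback(
        $pattern,
        // Each occurrence is replaced by the content of the requested group.
        fn (array $match): string => $match[$group],
        $subject
    );
}

$pattern = '/www\.([\w.-]+)/';
$subject = 'Links: www.google.com and www.t-regx.com';

echo replaceWithGroup($pattern, $subject, 1), PHP_EOL; // Links: google.com and t-regx.com
echo replaceWithGroup($pattern, $subject, 0), PHP_EOL; // Links: www.google.com and www.t-regx.com
```

Passing 0 selects the whole match, so every occurrence is replaced by itself and the subject is unchanged.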
Method Replace.withGroup() accepts int|string. When an argument of type int is passed, the capturing
group is referred to by its ordinal number, commonly known as the "group index". Capturing groups are
ordered by their opening parenthesis. For example, in pattern (Cat)(Foo(Bar)), group (Cat) is
assigned ordinal number 1, group (Foo(Bar)) is assigned ordinal number 2, and group (Bar) is assigned
ordinal number 3.
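The ordering can be checked with PHP's built-in preg_match(), which numbers groups the same way because the numbering comes from PCRE. The subject 'CatFooBar' is chosen here only for illustration.

```php
<?php
preg_match('/(Cat)(Foo(Bar))/', 'CatFooBar', $groups);

// Index 0 is the whole match; indexes 1 to 3 follow the order of opening parentheses.
foreach ($groups as $index => $text) {
    echo $index, ' => ', $text, PHP_EOL;
}
// 0 => CatFooBar
// 1 => Cat
// 2 => FooBar
// 3 => Bar
```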
Because the matched occurrence is always implicitly captured and assigned ordinal number 0,
calling Replace.withGroup(0) performs a redundant search and replace: it returns the subject
unmodified, as each matched occurrence is replaced by itself.
# Named groups
Method Replace.withGroup() accepts int|string. When an argument of type string is passed, the
capturing group is referred to by its group name. Not all capturing groups are named; for example,
group (Foo) is not a named group.
Explicit syntax is available for named groups: (?<###>...), where ### is the name of the group,
for example: (?<capital>[A-Z])[a-z]+. Alternative syntax for a named group is (?P<###>...)
or (?'###'...).
Named capturing groups can be referred to either by ordinal number or by name. In pattern
(?<capital>[A-Z])[a-z]{2,}, the first group can be used to replace the occurrence either
using withGroup(1) or withGroup('capital'). In other words, all capturing groups are assigned
an ordinal number, but only named groups can be referred to by their name.
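As a rough illustration, PHP's built-in preg_match() exposes a named group under both its name and its ordinal number, and accepts all three naming syntaxes. The subject 'hello World' is an arbitrary example.

```php
<?php
// Three equivalent ways of naming the same group.
$patterns = [
    '/(?<capital>[A-Z])[a-z]{2,}/',
    '/(?P<capital>[A-Z])[a-z]{2,}/',
    "/(?'capital'[A-Z])[a-z]{2,}/",
];

foreach ($patterns as $pattern) {
    preg_match($pattern, 'hello World', $groups);
    // The named group is available both by name and by ordinal number.
    echo $groups[0], ' ', $groups['capital'], ' ', $groups[1], PHP_EOL; // World W W
}
```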
Additionally, modifier /n can be used when instantiating Pattern, either by passing
Pattern::NO_AUTOCAPTURE or by simply passing the string literal 'n'.
When modifier Pattern::NO_AUTOCAPTURE is used, only the named groups are captured. For example, in a
pattern consisting of (https?://) followed by (?<domain>[\w.]+), group (https?://) is not captured,
while (?<domain>[\w.]+) is captured and is assigned ordinal number 1.
In patterns without modifier 'n' (Pattern::NO_AUTOCAPTURE), syntax (?:...) can be used to add a
non-capturing group.
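The effect of (?:...) on group numbering can be sketched with plain preg_match(). The subject is an arbitrary example, and the snippet shows PCRE numbering rather than any T-Regx call.

```php
<?php
$subject = 'https://google.com';

// With a capturing group, the scheme is group 1 and the named group is group 2.
preg_match('~(https?://)(?<domain>[\w.]+)~', $subject, $capturing);

// With (?:...), the scheme is not captured, so the named group becomes group 1.
preg_match('~(?:https?://)(?<domain>[\w.]+)~', $subject, $nonCapturing);

echo $capturing[1], PHP_EOL;    // https://
echo $capturing[2], PHP_EOL;    // google.com
echo $nonCapturing[1], PHP_EOL; // google.com
```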
# Unmatched group
Method Replace.withGroup() throws GroupNotMatchedException when replacement is attempted with
an unmatched group.
For example, when group (http://) is followed by ?, the group may not be matched, even though the
whole pattern matches. Such groups are referred to as "optional groups", because for different
subjects they may or may not be matched.
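A minimal sketch of the idea in plain PHP follows. The helper replaceWithMatchedGroup() is hypothetical and throws a generic RuntimeException, not T-Regx's GroupNotMatchedException; the pattern and subjects are assumed examples. It relies on the PREG_UNMATCHED_AS_NULL flag of preg_replace_callback(), available since PHP 7.4.

```php
<?php
// Hypothetical helper: replace with a group, but refuse to continue when it is unmatched.
function replaceWithMatchedGroup(string $pattern, string $subject, int $group): string
{
    return preg_replace_callback(
        $pattern,
        function (array $match) use ($group): string {
            // With PREG_UNMATCHED_AS_NULL, an unmatched group is reported as null.
            if ($match[$group] === null) {
                throw new RuntimeException("Group #$group was not matched");
            }
            return $match[$group];
        },
        $subject,
        -1,
        $count,
        PREG_UNMATCHED_AS_NULL
    );
}

$pattern = '~(http://)?www\.[\w.]+~';

echo replaceWithMatchedGroup($pattern, 'http://www.google.com', 1), PHP_EOL; // http://

try {
    // The whole pattern matches, but optional group 1 does not.
    replaceWithMatchedGroup($pattern, 'www.google.com', 1);
} catch (RuntimeException $exception) {
    echo $exception->getMessage(), PHP_EOL; // Group #1 was not matched
}
```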
# Nonexistent group
Method Replace.withGroup() throws NonexistentGroupException when replacement is attempted with
a group that is not present in the pattern.
In fact, any operation on a missing group, apart from .groupExists(), throws
NonexistentGroupException.
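The difference between an unmatched group and a nonexistent one can be illustrated with plain preg_match(). This is only a sketch of the distinction under an assumed pattern and subject; it is not how T-Regx detects missing groups.

```php
<?php
preg_match('~(http://)?(www\.[\w.]+)~', 'www.google.com', $match, PREG_UNMATCHED_AS_NULL);

// Group 1 exists in the pattern, but was not matched: the key is present and holds null.
var_dump(array_key_exists(1, $match), $match[1]); // bool(true) NULL

// Group 2 exists and was matched.
var_dump($match[2]); // string(14) "www.google.com"

// Group 3 is not present in the pattern at all: there is no such key.
var_dump(array_key_exists(3, $match)); // bool(false)
```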
# Replace with a callback
Method Replace.callback() performs a regular expression search and passes control back to the
caller via a callable, which accepts each matched occurrence as a Detail argument. Each matched
occurrence of the regular expression in the subject is then replaced by the string value returned
from the callable. In other words, Replace.callback() accepts a callable which maps the received
Detail argument to a new string replacement.
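The mapping can be sketched with PHP's built-in preg_replace_callback(). Note the difference: the built-in function passes an array of groups to the callback, whereas Replace.callback() passes a Detail. The T-Regx call in the comment assumes a Pattern::of() factory that is not shown on this page.

```php
<?php
// Assumed T-Regx equivalent (requires the library; Pattern::of() is an assumption):
//   Pattern::of('\d+')->replace($subject)
//       ->callback(fn (Detail $detail) => (string) ($detail->text() * 2));

// Plain-PHP illustration: each matched number is mapped to a new string.
$result = preg_replace_callback(
    '/\d+/',
    fn (array $match): string => (string) ($match[0] * 2),
    'I have 2 apples and 10 pears'
);

echo $result, PHP_EOL; // I have 4 apples and 20 pears
```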
# Matched occurrence
The matched occurrence of the regular expression in the subject is passed as an argument
to the callable argument of Replace.callback().
The matched occurrence is passed as Detail, which is the same interface as the one
representing the matched occurrence in Pattern.match(), for example in Matcher.first().
All of the implementation differences between the internal structures of matching and replacing
are unified under the common Detail interface.
# Accepted return values
The callable passed to Replace.callback() can only return string, which is the new replacement.
When a value of a type other than string is returned from the callable, Replace.callback()
throws InvalidReplacementException.
# Stringable return values
Note that objects implementing Stringable, or objects with __toString(), are also invalid return
values.
The received Detail argument can be cast to string, which is the same as calling Detail.text(), but
returning an object of type Detail, or any other Stringable object, throws
InvalidReplacementException.
To conveniently return an object of type Detail or another Stringable type, either specify the PHP
string return type-hint on the anonymous function, so that PHP implicitly casts the object to
string, or explicitly cast the object to string. Alternatively, omit the type-hint and let
Replace.callback() validate the type of the returned values, in case a value of another type is
returned.
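The three options can be sketched in plain PHP. The class Word is a hypothetical stand-in for Detail or any other object with __toString(). The implicit cast applies only in PHP's default (non-strict) typing mode; with declare(strict_types=1) the typed closure would raise a TypeError instead.

```php
<?php
// Hypothetical stand-in for a Stringable object such as Detail.
final class Word
{
    private string $text;

    public function __construct(string $text)
    {
        $this->text = $text;
    }

    public function __toString(): string
    {
        return $this->text;
    }
}

// No return type: the object itself is returned (the case the page says is rejected).
$untyped = fn (Word $word) => $word;

// Return type string: PHP implicitly casts the object using __toString().
$typed = fn (Word $word): string => $word;

// Explicit cast: works regardless of the typing mode.
$cast = fn (Word $word) => (string) $word;

var_dump(is_string($untyped(new Word('foo')))); // bool(false)
var_dump($typed(new Word('foo')));              // string(3) "foo"
var_dump($cast(new Word('foo')));               // string(3) "foo"
```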
# PHP callable notation
In PHP, certain string and array values are also callable.
- 'strToUpper', 'strToLower' - a callable that behaves similarly to the global strToUpper() and strToLower() functions
- [$this, 'replace'] - a callable that behaves similarly to a method .replace() on $this
Function strToUpper() accepts string as its argument, and so when callback('strToUpper') is called,
the Detail is passed as an argument to strToUpper(). Because strToUpper() accepts string, the
Detail is cast to string, which is the same as calling Detail.text().
Notice that Replace.callback() truly accepts callable as its argument type. However, in PHP certain
values are also regarded as callable, for instance the aforementioned "strToUpper" or the array
[$this, 'replace'].
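Both notations can be tried out in plain PHP. The classes FakeDetail and Replacer are hypothetical: FakeDetail stands in for Detail only in that it casts to a string, and the snippet assumes PHP's default (non-strict) typing mode, in which strToUpper() accepts such an object.

```php
<?php
// Hypothetical stand-in for Detail: an object that casts to the matched text.
final class FakeDetail
{
    public function __toString(): string
    {
        return 'word';
    }
}

final class Replacer
{
    public function replace($detail): string
    {
        return '<' . $detail . '>';
    }

    public function callables(): array
    {
        // A function name and an [object, method] pair are both valid callables.
        return ['strToUpper', [$this, 'replace']];
    }
}

[$byName, $byMethod] = (new Replacer())->callables();

var_dump(is_callable($byName));   // bool(true)
var_dump(is_callable($byMethod)); // bool(true)

echo $byName(new FakeDetail()), PHP_EOL;   // WORD
echo $byMethod(new FakeDetail()), PHP_EOL; // <word>
```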
# Replace with references
Method Replace.withReferences() can be used to pass a formatting string to PCRE, the engine
underlying T-Regx and the other regular expression functions in PHP.
Replace.withReferences() is listed last, as it is the least recommended means of replacing elements.
Please try to use .with(), .withGroup() or .callback() first, and only use .withReferences()
as a last resort.
Method Replace.withReferences() replaces the matched occurrences with the formatting string,
with certain tokens populated by captured groups, for example: $1.
Special tokens in the formatting string:
- $ and a capturing group ordinal number: $1, $2 and also $0
- \ and a capturing group ordinal number: \1, \2 and also \0
- ${, a group ordinal number and a closing }: ${1}, ${2} and also ${0}
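The three token styles can be observed with PHP's built-in preg_replace(), which accepts the same PCRE-style formatting string. The pattern and subject are arbitrary examples; this shows the token format only, not a T-Regx call.

```php
<?php
$pattern = '/(\w+) (\w+)/';
$subject = 'John Smith';

// Single-quoted strings keep $ and \ literal, so the tokens reach PCRE untouched.
$dollar    = preg_replace($pattern, '$2, $1', $subject);
$backslash = preg_replace($pattern, '\2, \1', $subject);
$braces    = preg_replace($pattern, '${2}, ${1}', $subject);
$whole     = preg_replace($pattern, '[$0]', $subject);

echo $dollar, PHP_EOL;    // Smith, John
echo $backslash, PHP_EOL; // Smith, John
echo $braces, PHP_EOL;    // Smith, John
echo $whole, PHP_EOL;     // [John Smith]
```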
Please be advised that Replace.withReferences() is discouraged, for a number of reasons:
- Formatting string tokens use special characters which can be mistaken for PHP language syntax (for example, format token \1 can be mistaken for the PHP notation of chr(1))
- Unmatched groups are simply ignored, and implicitly replaced by an empty string "", which may or may not be desired
- Nonexistent groups are ignored, which can lead to hard-to-find bugs
- The formatting string can't be used to reference capturing groups by name
- It creates a tight dependency on the internal PCRE format
For these reasons we recommend using .with(), .withGroup() and .callback().
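The first three pitfalls can be reproduced with PHP's built-in preg_replace(), which uses the same PCRE formatting string. The patterns and subjects below are arbitrary examples chosen for this sketch.

```php
<?php
// In a double-quoted string, "\1" is PHP's octal escape for chr(1), not a group reference.
$isControlCharacter = "\1" === chr(1);
var_dump($isControlCharacter); // bool(true)

// Optional group 1 is unmatched, and is silently replaced by an empty string.
$unmatched = preg_replace('~(http://)?www~', '[$1]', 'www');
echo $unmatched, PHP_EOL; // []

// Group 2 does not exist in the pattern, yet no error is reported.
$nonexistent = preg_replace('/(a)/', '[$2]', 'a');
echo $nonexistent, PHP_EOL; // []
```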