I would like to replace some short strings (e.g. rz -> ż, ch -> h)
wherever they appear in an input sentence, except for occurrences
matched by a blacklist of longer regular-expression patterns.
For example, given a Polish sentence (or one in any other language) as input:
Tarzan się tarza a tarzanie jest głupie
I would like to replace every occurrence of this digraph:
rz -> ż
but skip any occurrence matched by a separate blacklist of negative patterns, e.g.:
$blacklist = [
    '/\bta(rz)an(?:|a|ach|ami|em|om|owi|ow|y)\b/u',
    '/\bTa(rz)anie\b/ui',
    // ... and a few more
];
For now, I first find the offsets of all occurrences of rz and store them
in a $found array. Then, in a foreach ($blacklist as $pattern) loop, I run
preg_match_all and remove from $found every offset that a blacklist pattern
matched. At the end I replace only those occurrences whose offsets remain
in $found. This seems inefficient to me, or at least inelegant, and I have
a feeling that it can be done better.
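
To make the current approach concrete, here is a minimal sketch of it. The function name replaceExceptBlacklisted and its parameters are hypothetical (they are not part of my real code), and the sketch assumes that capture group 1 of every blacklist pattern marks the protected digraph. All offsets are byte offsets, which is what both strpos and PREG_OFFSET_CAPTURE return, even with the /u modifier.

```php
<?php
// Hypothetical helper: replaces $search with $replace everywhere except
// where the occurrence is capture group 1 of a blacklist pattern match.
function replaceExceptBlacklisted(
    string $text,
    string $search,
    string $replace,
    array $blacklist
): string {
    // 1. Collect the byte offset of every occurrence of $search.
    $found = [];
    $pos = 0;
    while (($pos = strpos($text, $search, $pos)) !== false) {
        $found[$pos] = true;
        $pos += strlen($search);
    }

    // 2. Remove every offset that a blacklist pattern captured in group 1.
    foreach ($blacklist as $pattern) {
        if (preg_match_all($pattern, $text, $matches, PREG_OFFSET_CAPTURE)) {
            foreach ($matches[1] as [$captured, $offset]) {
                unset($found[$offset]);
            }
        }
    }

    // 3. Replace the surviving occurrences from right to left, so that
    //    the offsets not yet processed stay valid.
    $offsets = array_keys($found);
    rsort($offsets);
    foreach ($offsets as $offset) {
        $text = substr_replace($text, $replace, $offset, strlen($search));
    }

    return $text;
}

$blacklist = [
    '/\bta(rz)an(?:|a|ach|ami|em|om|owi|ow|y)\b/u',
    '/\bTa(rz)anie\b/ui',
];

$result = replaceExceptBlacklisted(
    'Tarzan się tarza a tarzanie jest głupie',
    'rz',
    'ż',
    $blacklist
);
echo $result, "\n"; // Tażan się taża a tarzanie jest głupie
```

With exactly the two patterns shown above, only "tarzanie" is protected: the first pattern has no i modifier, so it does not match the capitalised "Tarzan", and its suffix list does not cover "tarzanie".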