Non-Greedy Pattern Matching
Posted: Mon Apr 04, 2016 7:29 pm
by patgilmour
I'm trying to do a non-greedy pattern match using Regex to get a 3-6 digit order number.
The standard PCRE search pattern is: (Order No)(.*?)(\d{3,6})
Then I get the order number using $3 (or \3 for Switch)
The problem is .*? in Switch is greedy and so it grabs everything to the end of the file/string.
So, is there a way to restrict the search after 'Order No' to "any amount of any characters up to the next 3-6 digit integer?"
Many thanks, Pat
Re: Non-Greedy Pattern Matching
Posted: Mon Apr 04, 2016 8:07 pm
by gabrielp
Could you describe where you are doing this matching? Perhaps take a screenshot of where you are putting your current search pattern. The answer could go a few different ways depending on whether this is within the variable builder in Switch, Scripter, or elsewhere.
Re: Non-Greedy Pattern Matching
Posted: Mon Apr 04, 2016 8:44 pm
by patgilmour
Thanks for the reply.
I have been trying to do this in a variable builder within a flow. It examines a 'body' node from an 'Email' metadata set.
An example of the string the pattern would be matched against:
Order No</span></b><span style="font-size: 9pt;" class=""><span class="Apple-converted-space"> </span>033174<
My main interest is in knowing if there is a standard method in Switch of doing an any-character non-greedy match, as I've stumbled on this before.
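For reference, here is a minimal sketch of the two approaches in plain JavaScript, run against the example string above. It assumes a standard JavaScript regex engine, not Switch's variable builder, so it only shows what the pattern is expected to do where `.*?` is honoured. The second pattern is a possible workaround that uses greedy quantifiers only: it steps over whole HTML tags and whitespace until it reaches the digits. The variable names are made up for illustration.

```javascript
// Example string from the post above.
var body = 'Order No</span></b><span style="font-size: 9pt;" class="">' +
           '<span class="Apple-converted-space"> </span>033174<';

// Lazy version: relies on the engine supporting the *? quantifier.
// The "9" in "9pt" is skipped because \d{3,6} needs at least 3 digits.
var lazyPattern = /(Order No)(.*?)(\d{3,6})/;
var lazyMatch = lazyPattern.exec(body);
var lazyOrder = lazyMatch ? lazyMatch[3] : null;

// Greedy-only version: consume complete tags (<...>) and whitespace,
// then capture the first 3-6 digit run that follows them.
var tagSkippingPattern = /Order No(?:<[^>]*>|\s)*(\d{3,6})/;
var tagMatch = tagSkippingPattern.exec(body);
var tagOrder = tagMatch ? tagMatch[1] : null;
```

The tag-skipping pattern only works if nothing but tags and whitespace sits between 'Order No' and the number, which holds for this sample but may not hold for every email.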
Re: Non-Greedy Pattern Matching
Posted: Tue Apr 05, 2016 6:23 pm
by loicaigon
Hi,
What about this: \d{3,6}$
That would catch any run of 3-6 digits at the end of a string. If you need to ensure that it's not a false positive, you can filter the whole match first.
HTH
Loic
www.ozalto.com
Re: Non-Greedy Pattern Matching
Posted: Tue Apr 05, 2016 6:50 pm
by patgilmour
Thanks loicaigon - however the integer isn't at the end of a line, so it won't match.
Re: Non-Greedy Pattern Matching
Posted: Tue Apr 05, 2016 6:52 pm
by patgilmour
We found a solution.
In the end, we went with a Script and used this pattern to match the required string:
var regExEmailBody = /(Order No)(.*)(\/span\>)([\d]{3,6})(.*)(Sequential\sID)/;
The trick was to find something consistent *after* the required string, and we had "Sequential ID" to match against.
With that we were able to get the integer after the html span tags and pick out the 4th group using:
var orderID = regExEmailBody.capturedTexts[4];
Thanks for the help!
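For anyone trying the same pattern outside Switch, a rough equivalent in plain JavaScript might look like the sketch below. Standard JavaScript regex objects have no `capturedTexts` property, so the groups are read from the array returned by `exec` instead. The sample body is hypothetical: the markup before the number is shortened from the example earlier in the thread, and the text around 'Sequential ID' is invented, since the real email was not posted.

```javascript
// Hypothetical email body: only "Order No", the span tags, the number
// and "Sequential ID" are taken from the thread; the rest is made up.
var body = 'Order No</span></b><span class="Apple-converted-space"> </span>' +
           '033174</span> Sequential ID';

// Same structure as the pattern in the post, without the unneeded escapes.
// The greedy (.*) backtracks to the last "/span>" that is directly
// followed by 3-6 digits and, later on, by "Sequential ID".
var regExEmailBody = /(Order No)(.*)(\/span>)(\d{3,6})(.*)(Sequential\sID)/;

var match = regExEmailBody.exec(body);

// Group 4 holds the order number; null means the pattern did not match.
var orderID = match ? match[4] : null;
```

Because the digits must come straight after a closing span tag, a stray number elsewhere in the markup (such as the 9 in "9pt") is not picked up.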