I'd like to think I'm pretty good at RegEx, but this one has me stumped. The search string looks like this:
ISA*lots**of~other~data**with~~no terminating **pattern~ISA*lots**of~other~data**with~~no terminating **pattern~ISA*lots**of~other~data**with~~no terminating **pattern~ISA*lots**of~other~data**with~~no terminating **pattern~
No line breaks.
ISA* is a consistent starting pattern. The rest of the string is completely unpredictable.
I need ISA* and all characters until the next instance of that pattern.
What I've Tried
A positive look-ahead, but this doesn't capture the last result:
(ISA*(.*(?=ISA*))?)
A positive look-behind, but I can't figure out how to make it lazy. If it's not lazy, there is only one match. If it is lazy, I get the right number of matches, but each has only one additional character after the pattern:
ISA*(?<=ISA*).*?
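One look-ahead variant that may be worth comparing, sketched under the assumption that ISA* is meant literally (so the asterisk is escaped): make the match lazy and let the look-ahead also accept the end of the string, so the last record is not dropped. The sample string and the variable names ($edi, $pattern, $records) are hypothetical and only mimic the shape of the data above.

```php
<?php
// Hypothetical sample in the shape described above: three records, no line breaks.
$edi = 'ISA*a**b~c~ISA*d~~e **f~ISA*g~';

// Lazy match from a literal "ISA*" up to (but not including) the next "ISA*",
// or up to the very end of the string for the last record.
// The "s" modifier lets "." match newlines, in case any slip into the data.
$pattern = '/ISA\*.*?(?=ISA\*|\z)/s';

preg_match_all($pattern, $edi, $matches);
$records = $matches[0];

print_r($records);
// [0] => ISA*a**b~c~
// [1] => ISA*d~~e **f~
// [2] => ISA*g~
```

Whether this is actually faster than splitting would need to be measured on the real file.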
The other solution is to programmatically split or explode the string, remove the first (empty) result, and then re-attach the delimiter to each result. Indeed, that is what I already have in place. But the size of the file, the large number of results, and the post-processing are causing performance issues. In a preliminary test, using regex appears to offer some worthwhile performance gains.
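For reference, the split-and-reattach approach described above can be sketched roughly as follows. This is an illustration rather than the production code: the sample string and the names $edi, $delimiter, $parts and $records are made up.

```php
<?php
// Hypothetical sample in the shape described above: three records, no line breaks.
$delimiter = 'ISA*';
$edi = 'ISA*a**b~c~ISA*d~~e **f~ISA*g~';

// Split on the literal delimiter. Because the string starts with the
// delimiter, the first element is an empty string.
$parts = explode($delimiter, $edi);

// Remove that first (empty) result.
array_shift($parts);

// Re-attach the delimiter to the front of each remaining piece.
$records = [];
foreach ($parts as $part) {
    $records[] = $delimiter . $part;
}

print_r($records);
// [0] => ISA*a**b~c~
// [1] => ISA*d~~e **f~
// [2] => ISA*g~
```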
This is being processed with PHP. The string is sourced from an AS400 system, in an "EDI Transaction" text file. I have yet to find any libraries that contain a working regex for this type of file.