A DBA coworker wanted some regex to strip SGML tags and spaces from a database for some reports he was doing.
He wanted to turn this:
BiScO<sub>3</sub>:<html_ent glyph="@nbsp;" ascii=" "></html_ent> Centrosymmetric BiMnO<sub>3</sub>-type Oxide
into:
BiScO3:CentrosymmetricBiMnO3-typeOxide
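So I came up with this pattern: <(.|\n)*?>|\s, which matches either a whole tag (non-greedily, even if the tag spans a line break) or a single whitespace character. I tested it at http://www.regextester.com.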
Apparently, in Oracle 10g, you can use the REGEXP_REPLACE function to do nifty regex-based transformations like this directly in SQL.
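
If you want to sanity-check the pattern outside the database, here is a minimal Python sketch using the standard re module. It is only an illustration, not what my coworker ran: the function name strip_tags_and_spaces and the constant TAG_OR_SPACE are made up for this example, and I'm assuming the quotes in the sample are plain straight quotes. Python's regex flavor also isn't identical to Oracle's, so treat it as a check on the idea rather than on Oracle's behavior.

```python
import re

# The pattern from the post: either a complete tag (non-greedy, and allowed
# to span newlines via the (.|\n) alternation) or one whitespace character.
TAG_OR_SPACE = re.compile(r"<(.|\n)*?>|\s")


def strip_tags_and_spaces(text: str) -> str:
    """Remove every match of the pattern (hypothetical helper for this demo)."""
    return TAG_OR_SPACE.sub("", text)


# Sample row from the post, with straight quotes assumed.
sample = (
    'BiScO<sub>3</sub>:<html_ent glyph="@nbsp;" ascii=" "></html_ent> '
    "Centrosymmetric BiMnO<sub>3</sub>-type Oxide"
)

print(strip_tags_and_spaces(sample))
# prints: BiScO3:CentrosymmetricBiMnO3-typeOxide
```

Replacing each match with an empty string is the whole trick; the same pattern and an empty replacement would be the arguments to whatever regex-replace function your database or language provides.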

