Cleanup with WP-CLI and Regex

Recently one of my customer’s WordPress sites was filled with random hidden span tags containing a single word: “Save.” The source is a mystery, however I suspect it came from bad code rather then something malicious. It was showing up at random spots throughout the website and causing some display issues.

<span style="border-top-left-radius: 2px; border-top-right-radius: 2px; border-bottom-right-radius: 2px; border-bottom-left-radius: 2px; text-indent: 20px; width: auto; padding: 0px 4px 0px 0px; text-align: center; font-style: normal; font-variant: normal; font-weight: bold; font-size: 11px; line-height: 20px; font-family: 'Helvetica Neue', Helvetica, sans-serif; color: #ffffff; background-image: url(data:image/svg+xml; base64,phn2zyb4bwxucz0iahr0cdovl3d3dy53my5vcmcvmjawmc9zdmciighlawdodd0imzbwecigd2lkdgg9ijmwchgiihzpzxdcb3g9ii0xic0xidmxidmxij48zz48cgf0acbkpsjnmjkundq5lde0ljy2mibdmjkundq5ldiyljcymiaymi44njgsmjkumju2ide0ljc1ldi5lji1nibdni42mzismjkumju2idaumduxldiyljcymiawlja1mswxnc42njigqzaumduxldyunjaxidyunjmyldaumdy3ide0ljc1ldaumdy3iemymi44njgsmc4wnjcgmjkundq5ldyunjaxidi5ljq0oswxnc42njiiigzpbgw9iinmzmyiihn0cm9rzt0ii2zmziigc3ryb2tllxdpzhropsixij48l3bhdgg+phbhdgggzd0itte0ljczmywxljy4nibdny41mtysms42odygms42njusny40otugms42njusmtqunjyyiemxljy2nswymc4xntkgns4xmdksmjquodu0idkuotcsmjyunzq0iem5ljg1niwyns43mtggos43ntmsmjqumtqzidewljaxniwymy4wmjigqzewlji1mywymi4wmsaxms41ndgsmtyuntcyidexlju0ocwxni41nzigqzexlju0ocwxni41nzigmteumtu3lde1ljc5nsaxms4xntcsmtqunjq2iemxms4xntcsmtiuodqyideyljixmswxms40otugmtmuntiyldexljq5nsbdmtqunjm3ldexljq5nsaxns4xnzusmtiumzi2ide1lje3nswxmy4zmjmgqze1lje3nswxnc40mzygmtqundyylde2ljegmtqumdkzlde3ljy0mybdmtmunzg1lde4ljkznsaxnc43ndusmtkuotg4ide2ljayocwxos45odggqze4ljm1mswxos45odggmjaumtm2lde3lju1niaymc4xmzysmtqumdq2iemymc4xmzysmtauotm5ide3ljg4ocw4ljc2nyaxnc42nzgsoc43njcgqzewljk1osw4ljc2nya4ljc3nywxms41mzygoc43nzcsmtqumzk4iem4ljc3nywxns41mtmgos4ymswxni43mdkgos43ndksmtcumzu5iem5ljg1niwxny40odggos44nzismtcunia5ljg0lde3ljczmsbdos43ndesmtgumtqxidkuntismtkumdizidkundc3lde5ljiwmybdos40miwxos40nca5lji4ocwxos40otegos4wncwxos4znzygqzcunda4lde4ljyymia2ljm4nywxni4yntigni4zodcsmtqumzq5iem2ljm4nywxmc4yntygos4zodmsni40otcgmtuumdiyldyundk3iemxos41ntusni40otcgmjmumdc4ldkunza1idizlja3ocwxmy45otegqzizlja3ocwxoc40njmgmjaumjm5ldiylja2miaxni4yotcsmjiumdyyiemxnc45nzmsmjiumdyyidezljcyocwyms4znzkgmtmumzayldiwlju3mibdmtmumzayldiwlju3miaxmi42ndcsmjmumdugmtiundg4ldizljy1nybdmtiumtkzldi0ljc4ncaxms4zotysmjyumtk2idewljg2mywyny4wntggqzeylja4niwyny40mzqgmtmumzg2ldi3ljyznyaxnc43mzmsmjcunjm3iemyms45nswyny42mzcgmjcuodaxldixljgyocayny44mdesmtqunjyyiemyny44mdesny40otugmjeuotusms42odygmtqunzmzldeunjg2iibmawxspsijymqwodfjij48l3bhdgg+pc9npjwvc3znpg==); background-color: #bd081c; background-size: 14px; position: absolute; opacity: 1; z-index: 8675309; display: none; cursor: pointer; border: none; -webkit-font-smoothing: antialiased; background-position: 3px 50%; background-repeat: no-repeat no-repeat;">Save</span>

Manual removal was getting tedious

The unwanted span code varied slightly, making a typical find and replace difficult. I came up with the following regular expression  (<span.+?>Save<\/span>) which targets the bad entries.

Turns out this is fairly easy to remove using WP-CLI

The wp search-replace command supports regex using the flag --regex. The flags --skip-themes and --skip-plugins were added to speed up the search and replace. The following command ended up removing 91 matches.

wp search-replace '(<span.+?>Save<\/span>)' '' wp_posts --include-columns=post_content --skip-columns=guid --regex --skip-themes --skip-plugins

WP-CLI search and replace is quite powerful

If you make a mistake you can easily corrupt your data. Before running any search and replace commands I highly recommend using the --log and --dry-run flags which will display the changes without actually saving them. If all looks good then you can drop those extra flags to make the changes.