TCLUG Archive
Re: [TCLUG:18438] regex for stripping FONT tags?
Try:
perl -i -p -e 'BEGIN{$/=""} s{</*font.*?>}{}igs' [filenames]
The BEGIN sets $/ to the empty string, which puts you in paragraph mode,
so each blank-line-separated block is read as one record. The 's' causes
'.' to match a newline. Should work, haven't tested it.
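
A quick way to check the idea without touching real files is to pipe a
made-up sample through the same substitution in three modes. This is only
a sketch: the sample text and the variable names are invented for
illustration, and it assumes perl is on your PATH. The -0777 switch is
the command-line way to slurp the whole input, which has the same effect
as undefining $/.

```shell
# Made-up sample: the opening <font> tag breaks across a newline.
sample='<font
 size="2">Hello</font> world'

# Default line-at-a-time mode: the split opening tag is never seen whole.
by_line=$(printf '%s\n' "$sample" | perl -pe 's{</*font.*?>}{}igs')

# Paragraph mode ($/ = ""): each blank-line-separated block is one record.
by_para=$(printf '%s\n' "$sample" | perl -pe 'BEGIN{$/=""} s{</*font.*?>}{}igs')

# Slurp mode (-0777): the entire input is one record.
slurped=$(printf '%s\n' "$sample" | perl -0777 -pe 's{</*font.*?>}{}igs')

printf '%s\n' "$by_line" "$by_para" "$slurped"
```

Paragraph mode would still miss a tag that happens to contain a blank
line; slurp mode does not have that limit, at the cost of holding the
whole file in memory.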
Patrick McCabe
----- Original Message -----
From: <barnabas@pobox.com>
To: <tclug-list@mn-linux.org>
Sent: Thursday, June 01, 2000 6:43 AM
Subject: Re: [TCLUG:18438] regex for stripping FONT tags?
> You might try undefining the input record separator, $/ (IIRC) or
> $INPUT_RECORD_SEPARATOR if you're use-ing English. This will cause the
> entire file to be read into a single string and should allow you to use
> the regex below to get rid of <font> tags that cross new lines.
>
> HTH
>
> Eric
>
> Mike Hicks wrote:
> >
> > Luke Francl wrote:
> > >
> > > On Wed, 31 May 2000, Mike Hicks wrote:
> > >
> > > > I think you might try
> > > >
> > > > s/<\/*font.*?>//i
> > > >
> > > > > The ? will make the regex find the nearest ">" rather than one at the
> > > > > end of the line or the end of the document
> > >
> > > Ah, thank you. I really need to buy "Mastering Regular Expressions"...
> > >
> > > > I also needed to add "g" to the end to find all occurrences; it was only
> > > > finding one <font> tag per line.
> > >
> > > I've ended up with the following little blurb:
> > >
> > > perl -i -p -e 's/<\/*font.*?>//ig' [filenames]
> > >
> > > > It works pretty nice, but doesn't match font tags that break across
> > > > newlines. I tried throwing a \n* in there and adding "s" so that it treats
> > > > the string as a single line, but neither helped.
> >
> > Hmm.. I think that perl is already breaking the string up into
> > line-by-line strings. You'd probably have to somehow join() them back
> > together into one long string or prevent perl from breaking them up in
> > the first place.
> >
> > > This is OK since I can clean those culprits out by hand. Still, any ideas
> > > on how to fix that?
> > >
> > > Thanks,
> > >
> > > Luke
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: tclug-list-unsubscribe@mn-linux.org
> > > For additional commands, e-mail: tclug-list-help@mn-linux.org
> >
> > --
> > _ _ _ _ _ ___ _ _ _ ___ _ _ __ Microsoft Windows: A
> > / \/ \(_)| ' // ._\ / - \(_)/ ./| ' /(__ virus with mouse support.
> > \_||_/|_||_|_\\___/ \_-_/|_|\__\|_|_\ __)
> > [ Mike Hicks | http://umn.edu/~hick0088/ | mailto:hick0088@tc.umn.edu ]