TCLUG Archive
Re: [TCLUG:18438] regex for stripping FONT tags?
- To: tclug-list@mn-linux.org
- Subject: Re: [TCLUG:18438] regex for stripping FONT tags?
- From: barnabas@pobox.com
- Date: Thu, 01 Jun 2000 06:43:02 -0500
- References: <Pine.SOL.4.20.0005311814400.21184-100000@garnet.tc.umn.edu> <3935AA90.DF9D215C@tc.umn.edu>
- Sender: eric
You might try undefining the input record separator, $/ (IIRC), or
$INPUT_RECORD_SEPARATOR if you're use-ing English. This will cause the
entire file to be read into a single string and should allow the regex
below, with the "s" modifier added so that . also matches newlines, to
get rid of <font> tags that cross newlines.
HTH
Eric
Mike Hicks wrote:
>
> Luke Francl wrote:
> >
> > On Wed, 31 May 2000, Mike Hicks wrote:
> >
> > > I think you might try
> > >
> > > s/<\/*font.*?>//i
> > >
> > > The ? will make the regex find the nearest ">" rather than one at the
> > > end of the line or the end of the document
> >
> > Ah, thank you. I really need to buy "Mastering Regular Expressions"...
> >
> > I also needed to add "g" to the end to find all occurrences; it was only
> > finding one <font> tag per line.
> >
> > I've ended up with the following little blurb:
> >
> > perl -i -p -e 's/<\/*font.*?>//ig' [filenames]
> >
> > It works pretty nice, but doesn't match font tags that break across
> > newlines. I tried throwing a \n* in there and adding "s" so that it treats
> > the string as a single line, but neither helped.
>
> Hmm.. I think that perl is already breaking the string up into
> line-by-line strings. You'd probably have to somehow join() them back
> together into one long string or prevent perl from breaking them up in
> the first place.
>
> > This is OK since I can clean those culprits out by hand. Still, any ideas
> > on how to fix that?
> >
> > Thanks,
> >
> > Luke
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: tclug-list-unsubscribe@mn-linux.org
> > For additional commands, e-mail: tclug-list-help@mn-linux.org
>
> --
> _ _ _ _ _ ___ _ _ _ ___ _ _ __ Microsoft Windows: A
> / \/ \(_)| ' // ._\ / - \(_)/ ./| ' /(__ virus with mouse support.
> \_||_/|_||_|_\\___/ \_-_/|_|\__\|_|_\ __)
> [ Mike Hicks | http://umn.edu/~hick0088/ | mailto:hick0088@tc.umn.edu ]