On Oct 13, 2008, at 12:52 PM, Robert De Mars wrote:

> I was wondering if anyone knew of an open source project that can do
> the following.
>
> I have an internal web server at work that employees use for various
> things.
>
> I am looking for a piece of software (or several pieces if needed)
> that would crawl various industry related websites, and then save a
> local copy of the articles.  I would like the software to collect the
> selected content, and when it is done crawling create an index file
> where employees can see various industry news on one page.


wget and curl can do this for you.  wget is the more capable of the two
for this job: its recursive mode can mirror a site, and options like
--convert-links rewrite the paths in the saved pages so they work for
local viewing.  Pretty common thing to do.

As far as the 'index' page goes, that's something you'd have to munge  
together yourself, I think.
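A minimal sketch of both steps, assuming wget is available; the URL and
the "news" directory are placeholders, not anything from the thread:

```shell
#!/bin/sh
# Hypothetical sketch: crawl with wget, then munge the index together.
# The URL and the "news" directory below are placeholders.

# 1. Crawl and save a local, browsable copy of the articles.
#    --convert-links rewrites links so the saved pages work offline;
#    --page-requisites pulls in images/CSS; --no-parent keeps the
#    crawl inside the given section.  Commented out here so the index
#    step below can be tried without network access.
# wget --mirror --convert-links --adjust-extension --page-requisites \
#      --no-parent --directory-prefix=news \
#      https://example.com/industry-news/

# 2. Build a single index page linking every saved article.
mkdir -p news && cd news
{
  echo '<html><head><title>Industry News</title></head><body><ul>'
  find . -name '*.html' ! -name 'index.html' | sort |
    while read -r f; do
      printf '<li><a href="%s">%s</a></li>\n' "$f" "$f"
    done
  echo '</ul></body></html>'
} > index.html
```

Run from cron, step 1 refreshes the mirror and step 2 rebuilds the
index, so employees always get the current article list on one page.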
---
Eric Crist