The csplit program from GNU coreutils lets me split a file into sections
at each match of a delimiter pattern. What I really want to do is split a
file at a delimiter while forcing each section to be at least b bytes in
size, even if that means most or all sections contain multiple
delimiters.
For example, I have a 300 MB mbox file (email messages) containing 50,000
messages, and I want to break it into about 10 sections of at least 30 MB
each (the last section would have to be smaller because there wouldn't be
enough file left).
I can do stuff like this to divide the file "mbox" into individual email
messages, one per file...
    csplit -ksz mbox '/^From /' '{*}'
...but I can't figure out how to make the files bigger so that they
include multiple delimiters.
It seems like there ought to be a way to do this.
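To make the request concrete, here is a rough sketch of the behavior I'm
after, written with awk rather than csplit. The function name split_min,
its arguments, and the output prefix are made up for illustration. It
assumes every message starts with a line beginning "From " and that
counting each line's length plus one newline is a close enough measure of
bytes (LC_ALL=C is set so that awk counts bytes rather than characters).

```shell
# Hypothetical helper: split FILE at lines matching "^From ", but only
# start a new piece once the current piece has reached MIN_BYTES.
# Usage: split_min FILE MIN_BYTES [PREFIX]
split_min() {
    LC_ALL=C awk -v min="$2" -v prefix="${3:-xx}" '
        # At a delimiter, move to the next output file only if the
        # current piece is already at least min bytes long.
        /^From / && size >= min {
            close(out)
            n++
            size = 0
        }
        {
            out = sprintf("%s%02d", prefix, n)
            print > out
            size += length($0) + 1   # +1 for the newline
        }
    ' "$1"
}
```

With that, something like "split_min mbox 30000000 part" would write
part00, part01, and so on, each ending on a message boundary and each at
least 30,000,000 bytes long except possibly the last.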
Mike