Splitting a large file into smaller individual files
Apr 28 2015, 5:17 am
By: rockz  

Apr 28 2015, 5:17 am rockz Post #1

There are a million programs out there that will take a large file and split it into 500 MB chunks, but I'd like to split a large text file (or binary file) wherever a given header text appears.

For example, if I have a text file like this:
NEWFILE:The rain in Spain falls mainly on the plain.NEWFILE:Some people just want to watch the world burn.NEWFILE:When is a door not a door?

I want 3 files to be output, each containing one of the following lines:

The rain in Spain falls mainly on the plain.
Some people just want to watch the world burn.
When is a door not a door?

There has got to be an easy way to do this that someone has already come up with.

Apr 28 2015, 5:26 am O)FaRTy1billion[MM] Post #2

This would be easy to do with probably any scripting language ... otherwise it'd be easy enough to throw together a simple C program for it.
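For what it's worth, here is a rough Python sketch of the scripting-language route. It assumes the whole file fits in memory and that the header is the literal text NEWFILE: from the example above. The function names, the output naming scheme (file_1.txt, file_2.txt, ...), and the UTF-8 encoding are made up for illustration.

```python
from pathlib import Path


def split_on_header(text, header="NEWFILE:"):
    """Return the non-empty pieces of text found between occurrences of header."""
    return [piece for piece in text.split(header) if piece]


def write_pieces(pieces, out_dir, stem="file"):
    """Write each piece to out_dir/<stem>_<n>.txt and return the paths written."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for n, piece in enumerate(pieces, start=1):
        path = out / f"{stem}_{n}.txt"
        path.write_text(piece, encoding="utf-8")
        paths.append(path)
    return paths
```

Typical use would be something like write_pieces(split_on_header(Path("big.txt").read_text(encoding="utf-8")), "out"), where big.txt and out are placeholder names.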

Apr 28 2015, 5:27 am NudeRaider Post #3

Well, it should be pretty easy to program yourself. You just gotta know what the header looks like*, because individual files have an EOF marker, which obviously can't be inside the large file to separate the subfiles.

* And that's why there's probably no program that does that yet, because every file(type) has a different header, if it has one at all.
But I'm assuming you know what's inside the large file so it should be doable.
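If the header is known, locating it works the same way for text and binary data as long as you search bytes. A minimal Python sketch, assuming the data fits in memory; the function names are hypothetical and b"NEWFILE:" only stands in for whatever the real header is. Anything before the first header is ignored.

```python
def find_header_offsets(data, header):
    """Return the byte offset of every non-overlapping occurrence of header."""
    offsets = []
    pos = data.find(header)
    while pos != -1:
        offsets.append(pos)
        # Continue searching just past the header that was found.
        pos = data.find(header, pos + len(header))
    return offsets


def payloads(data, header):
    """Return the bytes that follow each header, up to the next header or the end."""
    starts = find_header_offsets(data, header)
    ends = starts[1:] + [len(data)]
    return [data[s + len(header):e] for s, e in zip(starts, ends)]
```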

Apr 28 2015, 1:51 pm Roy Post #4

Yeah, it's a trivial implementation. You could go and download LINQPad and then throw a few C# statements into it:
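// Read the whole file into memory (fine as long as it fits in RAM).
string text = File.ReadAllText(@"C:\path\to\file.txt");
string delimiter = "NEWFILE:";
// Escape the delimiter so any regex metacharacters in it are matched literally.
string[] splits = Regex.Split(text, Regex.Escape(delimiter));
for (int i = 0; i < splits.Length; i++)
{
    // Skip empty pieces, such as the one before a leading delimiter.
    if (!string.IsNullOrEmpty(splits[i]))
        File.WriteAllText(@"C:\path\to\file_" + i + ".txt", splits[i]);
}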

But I went ahead and made a GUI for you in case you don't want to bother with all that. See attached. (Haven't tested it with non-text files, though.)
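For non-text files, or files too big to read in one go, a streaming variant might look like the Python sketch below. This is not what the attached GUI does; it is only one possible approach. It reads the input in chunks, works on raw bytes, and holds back the last few bytes of each chunk in case a header straddles a chunk boundary. The function name, the part_N.bin naming, and the default chunk size are assumptions. Unlike the snippet above, it writes an empty file when two headers are adjacent, and it drops anything before the first header.

```python
from pathlib import Path


def split_stream(src, out_dir, header=b"NEWFILE:", chunk_size=1 << 20):
    """Split the binary stream src at each header; return the paths written."""
    if not header:
        raise ValueError("header must not be empty")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    out = None               # current output file; None until the first header
    buf = b""
    keep = len(header) - 1   # bytes held back in case a header straddles chunks
    try:
        while True:
            chunk = src.read(chunk_size)
            buf += chunk
            # Handle every complete header currently in the buffer.
            pos = buf.find(header)
            while pos != -1:
                if out is not None:
                    out.write(buf[:pos])
                    out.close()
                path = out_dir / f"part_{len(paths) + 1}.bin"
                paths.append(path)
                out = open(path, "wb")
                buf = buf[pos + len(header):]
                pos = buf.find(header)
            if not chunk:
                # End of input: flush whatever is left.
                if out is not None:
                    out.write(buf)
                break
            # Flush all but the last `keep` bytes, which might begin a header.
            safe = len(buf) - keep
            if safe > 0:
                if out is not None:
                    out.write(buf[:safe])
                buf = buf[safe:]
    finally:
        if out is not None:
            out.close()
    return paths
```

You would call it with a file opened in binary mode, e.g. with open("big.bin", "rb") as f: split_stream(f, "out"), where both names are placeholders.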

