Sunday, January 20, 2008

Opening really large text files

So I have this problem. I have a 500MB text file filled with prime numbers.

I'm not entirely sure how it got to be so big, or how it was even made without the JVM crashing in protest like it did when I tried holding stuff in arrays (I guess writing to a file doesn't keep everything in RAM the way an array does), but suffice it to say I'm not going to be able to open this file with any basic program (I know because opening a 4MB file in Notepad takes about 1-2 hrs). Yet my programming teacher keeps talking about how stuff like this is always handled as a string (primitive data types are limited in that they go up to 64 bits at most), so I got to thinking and googled "opening really large text files", figuring there was probably some top-secret technology that only the super computer gurus had.
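For what it's worth, here is a rough sketch of why writing the file probably never upset the JVM while the arrays did. It is not the program that made my file; the class name PrimeWriter, the file name primes.txt and the trial-division check are all made up for illustration. The point is only that each prime goes out through a BufferedWriter as soon as it is found, so nothing piles up in memory.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical example class, not the original program.
class PrimeWriter {

    // Plain trial division. Slow, but it needs no stored list of earlier primes.
    static boolean isPrime(long n) {
        if (n < 2) return false;
        if (n < 4) return true;            // 2 and 3
        if (n % 2 == 0) return false;
        // "d <= n / d" is the same test as "d * d <= n" without the overflow risk.
        for (long d = 3; d <= n / d; d += 2) {
            if (n % d == 0) return false;
        }
        return true;
    }

    // Writes every prime <= limit to the file, one per line.
    // Returns how many primes were written.
    static long writePrimes(Path out, long limit) throws IOException {
        long count = 0;
        try (BufferedWriter w = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            for (long n = 2; n <= limit; n++) {
                if (isPrime(n)) {
                    // The writer keeps only a small buffer and flushes it to disk
                    // as it fills, so memory use stays flat as the file grows.
                    w.write(Long.toString(n));
                    w.newLine();
                    count++;
                }
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        Path out = Paths.get("primes.txt"); // assumed output file name
        long written = writePrimes(out, 1_000_000L);
        System.out.println("Wrote " + written + " primes to " + out);
    }
}
```

Raising the limit makes the file bigger and the run slower, but the heap should not grow with it, which would explain a 500MB file coming out of a program that could not hold the same numbers in an array.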

Turns out that I'm not the only one with this problem, and the solution seems to be a little program called TextPad (www.textpad.com). Even the evaluation copy is powerful enough to open files larger than 500MB in a little over ten seconds. OpenOffice quit when it tried, and WordPad struggled (it opened part of the file and then froze... well, not froze exactly: it was still working but was unresponsive).

I'm not sure why it works so fast while others struggle or even just crash outright. I wish I knew, because I think it'd be nice to write a text editor one day (among a whole slew of other things, of course), and this is apparently a problem that crops up from time to time (usually when trying to read files from MySQL servers).
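I have no idea what TextPad actually does inside, so this is not its method. But one general way to deal with a huge text file from code is to never load the whole thing and instead read it one line at a time. Below is a minimal sketch of that, assuming the file holds one number per line (I haven't checked that mine does); the class name BigFileScan and the default file name are invented for the example. It also uses BigInteger, which is built from the number's text, so values past the 64-bit limit of long are not a problem.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical example class; assumes one number per line.
class BigFileScan {

    // Counts lines while holding only the current line in memory.
    static long countLines(Path file) throws IOException {
        long count = 0;
        try (BufferedReader r = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            while (r.readLine() != null) {
                count++;
            }
        }
        return count;
    }

    // Returns the largest number in the file, or null if it has no numbers.
    // Blank lines are skipped; a non-numeric line throws NumberFormatException.
    static BigInteger largest(Path file) throws IOException {
        BigInteger max = null;
        try (BufferedReader r = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = r.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) continue;
                BigInteger value = new BigInteger(line); // parsed from the string form
                if (max == null || value.compareTo(max) > 0) {
                    max = value;
                }
            }
        }
        return max;
    }

    public static void main(String[] args) throws IOException {
        // Assumed file name; pass a different path as the first argument.
        Path file = Paths.get(args.length > 0 ? args[0] : "primes.txt");
        System.out.println("Lines: " + countLines(file));
        System.out.println("Largest value: " + largest(file));
    }
}
```

This only scans the file; it doesn't show it on screen the way an editor does. It does suggest that the file size by itself isn't the obstacle, as long as the program doesn't insist on keeping everything in memory at once.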
