Thursday, October 23, 2008

Why oh why StreamReader

While playing around with Python, I was really impressed with the following bit of functional joy.

inList = open(fileLoc, 'rU').readlines()


I was so happy to find that little tidbit, and sad never to have used anything like it in C#. I always used the awful:


List<String> strings = new List<String>();
using (StreamReader reader = new StreamReader(fileloc))
{
    string line;
    // ReadLine returns null once the end of the file is reached.
    while ((line = reader.ReadLine()) != null)
    {
        strings.Add(line);
    }
}


This is the type of code you see on MSDN and in various "Professional C# 2.0" type books; they'll even show you TextReader to boot. However, for the vast majority of cases, the least bug-prone and easiest option appears to be something I totally missed for the past couple of years:


string[] readText = File.ReadAllLines(fileloc);


This has been around since .NET 2.0. It opens the file, reads every line into a string array, and closes the file. At this point I'd argue that the shorter option should be shown first to newer students and the remaining features left in an appendix somewhere.

EDIT: Jon Skeet pointed out his simple class that does the same job but reads one line at a time rather than loading the whole file into memory. http://csharpindepth.com/ViewChapterNotes.aspx?Chapter=6
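
To make the difference concrete, here is a rough sketch in Python, the language that started this post. It is not Jon's C# class (that lives at the link above), only an analogue of the idea. The helper names read_all_lines and iter_lines are made up for illustration: the first loads every line into a list, much like File.ReadAllLines, while the second is a generator that hands back one line at a time, so the whole file never has to sit in memory at once.

```python
import os
import tempfile


def read_all_lines(path):
    """Eager (hypothetical helper): read every line into a list, then close."""
    with open(path, "r") as handle:
        return [line.rstrip("\n") for line in handle]


def iter_lines(path):
    """Lazy (hypothetical helper): yield one line at a time.

    The file is closed when the loop finishes or the generator is discarded.
    """
    with open(path, "r") as handle:
        for line in handle:
            yield line.rstrip("\n")


if __name__ == "__main__":
    # Write a small throwaway file so the sketch runs on its own.
    fd, sample_path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as out:
        out.write("alpha\nbeta\ngamma\n")

    # Eager: the full list exists in memory before we touch the first line.
    print(read_all_lines(sample_path))

    # Lazy: each line is produced on demand as the loop asks for it.
    for line in iter_lines(sample_path):
        print(line)

    os.remove(sample_path)
```

For small files the two are interchangeable and the eager version is simpler to reason about; the lazy version should start to matter mainly when files get large or when you only need the first few lines.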

3 comments:

Jon Skeet said...

Unfortunately ReadAllLines has to read everything into memory. What you really want is a way to read a line at a time, but simply... Fortunately, with iterator blocks this is easy to implement. See this page for a small implementation.

http://csharpindepth.com/ViewChapterNotes.aspx?Chapter=6

Ryan Svihla said...
This comment has been removed by the author.
Ryan Svihla said...

EDIT: need edit function in blogger for typos

Thanks for the tip, Jon. I'll update the blog post today and add your reference.

However, I'm very curious how ReadAllLines is effectively worse.

Does the TextReader in your example read only one line into memory at a time, with the runtime then garbage collecting the previous line?

If so, I could see that helping on some of our larger batch processes, but I'd think the effect would be somewhat muted for most of our day-to-day processing (files of around 14k).

From looking at it, I'd have thought that File.OpenText(file) loads the file fully into RAM (but that was just an assumption). Regardless, I appreciate your feedback and I'll switch over our main file reader class to what you've suggested.