The other day I had to create a .CSV file with some funky Unicode characters. Not only that, but Excel had to be able to open and edit it. When using .NET’s StreamWriter default constructor, it uses a default encoding of UTF-8 without BOM (byte order mark), which Excel can’t read.
Well, actually, it can read it, but it doesn’t realize that this is a unicode file and special characters (such as this lovely one - Ž) have a tendency to look like someone just puked a letter on the screen.
The solution is quite simple. You need to use the other StreamWriter constructor, which accepts Encoding. You then have to pass it an Encoding of UTF-8 with BOM, which you can create by using the non-default constructor of the UTF8Encoding class. The whole thing looks something like this:
1: private static readonly Encoding Utf8WithBom = new UTF8Encoding(true);
2:
3: public void WriteSomething()
4: {
5: using (var streamWriter = new StreamWriter(@"c:\hello.csv", true, Utf8WithBom))
6: {
7: streamWriter.WriteLine("hello,world");
8: }
9: }
By the way, this is more of an issue with Excel then with .NET. As the above Wikipedia article states, the UTF-8 specification doesn’t require the BOM, and most applications can do without it.