Thursday, April 21, 2005

I got burnt by ToString()

I have encountered a gotcha that I want to share. Strong typing is my thing. There is nothing that
bothers me more than seeing hardcoded strings in code.

In ADO.NET it is common to do something like this:
DataSet ds = GetDataSetFromDB();
...
DataRow dr = ...

string Name = (string) dr["Name"];

If you mistype "Name" you are doomed. So I used to do this: define an enum called "Columns" that had all the
column names I am interested in.
enum Columns
{
    Name,
    Email,
    Zip
    // etc...
}

I can check the names once (visually and programmatically against the database) and in the code use this:


string Name = (string) dr[Columns.Name.ToString()];

As long as you remember to use the enum, you can't mess up.
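
The programmatic check mentioned above could look something like this. It is only a sketch: the class name ColumnCheck and the method FindMissingColumns are made up for illustration, and the sample table stands in for whatever your database really returns.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Hypothetical helper (not from the original code): verifies that every
// name in the Columns enum exists as a column in a given DataTable.
public static class ColumnCheck
{
    public enum Columns
    {
        Name,
        Email,
        Zip
    }

    // Returns the enum names that have no matching column in the table.
    public static List<string> FindMissingColumns(DataTable table)
    {
        List<string> missing = new List<string>();
        foreach (string name in Enum.GetNames(typeof(Columns)))
        {
            if (!table.Columns.Contains(name))
            {
                missing.Add(name);
            }
        }
        return missing;
    }

    public static void Main()
    {
        // Stand-in for a table loaded from the database; "Zip" is left out on purpose.
        DataTable table = new DataTable("Customers");
        table.Columns.Add("Name", typeof(string));
        table.Columns.Add("Email", typeof(string));

        foreach (string name in FindMissingColumns(table))
        {
            Console.WriteLine("Missing column: " + name);
        }
    }
}
```

Run it once at startup or in a unit test, so a renamed column shows up immediately instead of in the middle of a request.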

Now, before you go and do this, be warned: it is very slow. You would think that it is straightforward for the runtime
to convert an enum into its string representation. That's not the case.
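
If you want to measure this yourself, here is a rough timing sketch. The class name EnumToStringTiming, the Measure helper and the iteration count are my own choices for illustration, and the numbers will depend on your runtime version and machine, so treat the output as a relative comparison only.

```csharp
using System;
using System.Diagnostics;

// Rough micro-benchmark sketch: compares Enum.ToString() with a plain string literal.
public static class EnumToStringTiming
{
    public enum Columns
    {
        Name,
        Email,
        Zip
    }

    // Calls the delegate 'iterations' times and returns the elapsed milliseconds.
    public static long Measure(Func<string> getColumnName, int iterations)
    {
        int totalLength = 0; // use the result so the call is not trivially dead code
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            totalLength += getColumnName().Length;
        }
        watch.Stop();
        GC.KeepAlive(totalLength);
        return watch.ElapsedMilliseconds;
    }

    public static void Main()
    {
        const int iterations = 1000000;
        long viaEnum = Measure(() => Columns.Name.ToString(), iterations);
        long viaLiteral = Measure(() => "Name", iterations);
        Console.WriteLine("Enum.ToString(): {0} ms", viaEnum);
        Console.WriteLine("string literal:  {0} ms", viaLiteral);
    }
}
```

Both paths go through the same delegate call, so the difference between the two numbers is mostly the cost of ToString() itself.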

If you compile this into an exe

using System;

public class HelloClient
{
    public enum Columns
    {
        Name
    }

    public static void Main()
    {
        Console.WriteLine(Columns.Name.ToString());
    }
}


and look at ToString() with Reflector, here is what you get. This is the ToString() from the
Enum type.

public override string ToString()
{
    Type type1 = base.GetType();
    FieldInfo info1 = Enum.GetValueField(type1);
    object obj1 = ((RuntimeFieldInfo) info1).InternalGetValue(this, false);
    return Enum.InternalFormat(type1, obj1);
}

and look at what GetValueField does...

private static FieldInfo GetValueField(Type type)
{
    FieldInfo[] infoArray1;
    if (type is RuntimeType)
    {
        infoArray1 = ((RuntimeType) type).InternalGetFields(BindingFlags.NonPublic | (BindingFlags.Public | BindingFlags.Instance), false);
    }
    else
    {
        infoArray1 = type.GetFields(BindingFlags.NonPublic | (BindingFlags.Public | BindingFlags.Instance));
    }
    if ((infoArray1 == null) || (infoArray1.Length != 1))
    {
        throw new ArgumentException(Environment.GetResourceString("Arg_EnumMustHaveUnderlyingValueField"));
    }
    return infoArray1[0];
}


Argh!!! Reflection all over the place. No wonder it is slow...

Now, this is the generic Enum.ToString(). I would hope that it does not do that for Booleans... Let's check.

// Part of Boolean

public override string ToString()
{
    if (!this.m_value)
    {
        return bool.FalseString;
    }
    return bool.TrueString;
}

public static readonly string FalseString;
public static readonly string TrueString;

// End of Boolean


Good!! Somebody is thinking.

Well, I guess we are stuck with the implementation of ToString() for our custom enums. It would be nice, though, to be able to tell the
compiler to inline the enum's ToString(), with something like a custom attribute. For instance:

[HeyCompilerBeSmartAboutTheToStringMethodPleaaaaseAttribute()]
enum Columns
{
    Name,
    Email
}

It would be nice indeed.

So I was stuck and went back to using strings like this:

string Name = (string) dr["Name"];

Cool. If you use Reflector again, you will see that a hashtable lookup is done every time
you access that column. I did some timing and it is still slow by my standards!

Let's push it further. What we want is brute speed like this.

string Name = (string) dr[3]; // 3 == "Name"

That is fast but brittle. If you move the column around in your table, you are out of sync and you will
most likely get a beautiful exception.

This is why you need to use the "Ordinal" property on the "DataColumn".

If you want speed, you need to precompute the ordinals when the application starts and then use them.

You need to use a few tricks to do this at startup, but it is worth it. Now my code is fast. I am a happy man.
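
Here is a minimal sketch of the ordinal idea, not my actual production code. CustomerOrdinals, OrdinalDemo and the sample table are hypothetical names invented for this example; the only real API involved is the Ordinal property on DataColumn.

```csharp
using System;
using System.Data;

// Hypothetical cache: looks up each column's ordinal once, by name,
// so later row access can use the fast integer indexer.
public static class CustomerOrdinals
{
    public static int Name { get; private set; }
    public static int Email { get; private set; }
    public static int Zip { get; private set; }

    // Call once at startup, with a table that has the real schema.
    public static void Initialize(DataTable table)
    {
        Name = Require(table, "Name");
        Email = Require(table, "Email");
        Zip = Require(table, "Zip");
    }

    private static int Require(DataTable table, string columnName)
    {
        // The string indexer returns null when the column does not exist.
        DataColumn column = table.Columns[columnName];
        if (column == null)
        {
            throw new ArgumentException("Missing column: " + columnName);
        }
        return column.Ordinal;
    }
}

public static class OrdinalDemo
{
    // Stand-in for data coming from the database.
    public static DataTable BuildSampleTable()
    {
        DataTable table = new DataTable("Customers");
        table.Columns.Add("Id", typeof(int));
        table.Columns.Add("Email", typeof(string));
        table.Columns.Add("Zip", typeof(string));
        table.Columns.Add("Name", typeof(string));
        table.Rows.Add(1, "ann@example.com", "98052", "Ann");
        return table;
    }

    public static void Main()
    {
        DataTable table = BuildSampleTable();
        CustomerOrdinals.Initialize(table);

        DataRow dr = table.Rows[0];
        string name = (string) dr[CustomerOrdinals.Name]; // integer indexer, no name lookup
        Console.WriteLine(name);
    }
}
```

The column names appear as strings in exactly one place, and a mistyped or missing name fails loudly at startup rather than on some random row access. If the column order changes, the cached ordinals follow along the next time Initialize runs.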
