Transforming Arrays with Generics

I have begun to use .NET 2.0. So I’m only two years or so behind the cutting edge, but that’s how rock and roll I am.

Anyway. One of the features of C# 2 is generics. These allow one to specify a placeholder for the type of an object (say one to be placed in a list or a method parameter) which is then filled in at compile time based on the types actually used.

Most examples I’ve seen of their use are simple and don’t show the strengths of generics, and give no hint of the extra expressive power they provide. Showing how to avoid type-casting when using lists doesn’t really form a compelling argument.

I came across a problem, however, which allowed a more interesting use. Though not particularly complex, I think it demonstrates more of the potential of generics than I’ve previously seen. The solution shows that generics let you express things that are not possible without them, and that they can make your code easier to understand.

Before someone points out I’ve essentially re-implemented Array.ConvertAll(), there is a reason. I’m new to the generics language features and wanted to write something not completely trivial. At minimum, the implementation demonstrates some things the new language features make possible in a short and snappy hit¹.

The problem was that I had a series of fairly similar loops, each of which converted an array of one type of object into an array of another type of object. Each array needed a different procedure to translate it into the other type of array. A simple code snippet should illustrate this:



// array1 and array2 are the input arrays to be translated.
if (array1 != null && array1.Length > 0)
{
	String[] names = new String[array1.Length];
	for (int i = 0; i < array1.Length; i++)
	{
		names[i] = array1[i].name;
	}
}

if (array2 != null && array2.Length > 0)
{
	DateTime[][] date_ranges = new DateTime[array2.Length][];
	for (int i = 0; i < array2.Length; i++)
	{
		// Each range is a two-element array: start, then end.
		date_ranges[i] = new DateTime[2];
		date_ranges[i][0] = array2[i].start;
		date_ranges[i][1] = array2[i].end;
	}
}

[... and so on for several loops ...]

The problem here is the intent of the code — translating array1 into names and so on — is hidden in the morass of loop and control constructs. The code is begging for refactoring to move all the repetitive code into a separate method, leaving it obvious what the intent is.

If all the types used in the arrays were the same, or had similar interfaces, it would be easy to write a method to take in the two arrays and convert one to the other. This situation has an even more difficult aspect: DateTime is a struct and String is a class. A method that tries to be general by taking and returning object arrays therefore doesn’t work, because an array of structs such as DateTime[] cannot be passed where an object[] is expected.
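
To make the struct/class split concrete, here is a small sketch. ObjectArrayLimit and CountNonNull are made-up names for illustration, not part of the original code. An object[] parameter happily accepts a String[], but the same call with a DateTime[] is rejected by the compiler.

```csharp
using System;

public static class ObjectArrayLimit
{
    // Hypothetical pre-generics style helper: accepts any array of reference types.
    public static int CountNonNull(object[] items)
    {
        int n = 0;
        foreach (object o in items)
        {
            if (o != null) n++;
        }
        return n;
    }

    public static void Main()
    {
        string[] names = new string[] { "a", null, "b" };
        // Compiles: a string[] converts implicitly to object[].
        Console.WriteLine(CountNonNull(names));   // prints 2

        DateTime[] dates = new DateTime[] { DateTime.MinValue };
        // Does not compile if uncommented: there is no conversion
        // from DateTime[] (an array of structs) to object[].
        // CountNonNull(dates);
        Console.WriteLine(dates.Length);          // prints 1
    }
}
```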

In C# pre-generics, there would have been no option but to either leave the code as it was, or at best write a method per for-loop to do the specific processing needed, including creating the new array, populating it with new objects and performing the mapping between each of the types. Generics, however, provide us with a new, powerful way to solve this problem with a single method (in combination with some of the other new C# 2 features, as we will see).

Generics allow different types which share no common ancestor to be used transparently by a single method. A generic stand-in can represent an object, a primitive type (e.g., an int) or a struct. This alone means we can do something impossible pre-generics: write a method which can deal with all three kinds of data. Before, a separate method would have been needed for objects, for each type of primitive and for each type of struct.

We define the method with “placeholder” types which can be used within the method wherever we need a concrete type. In this particular case, we define the type of the input and output arrays as the generic types paramT and outputT:
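
Reconstructed from the full listing further down, the signature would look roughly like this. The wrapper class SignatureSketch, the public modifiers and the placeholder body are assumptions added only so the fragment compiles on its own.

```csharp
using System;

public static class SignatureSketch
{
    // Delegate type for the per-element conversion (explained later in the post).
    public delegate O Translation<T, O>(T input);

    // paramT and outputT are placeholders; the caller's types fill them in.
    public static outputT[] TranslateArray<paramT, outputT>(
            paramT[] input,
            Translation<paramT, outputT> f)
    {
        // Placeholder only: the real body appears in the full listing.
        throw new NotImplementedException();
    }
}
```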

In this method signature, paramT and outputT are used in place of real types. The Translation argument will be explained later, but it can be seen that this is a second generic function which we define as taking the types paramT and outputT. These types will be defined by the caller of the method, rather than in the method itself.

This allows us to create the new array without knowing in advance what specific type it needs to be, a second impossibility pre-generics for all data types:
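
In TranslateArray this is the line output = new outputT[input.Length]; from the full listing. The same idea in isolation might look like the sketch below, where MakeArray is a made-up helper rather than part of the original code. One method allocates arrays of a class, a primitive and a struct.

```csharp
using System;

public static class ArrayCreationSketch
{
    // Hypothetical helper: the caller chooses the element type T.
    public static T[] MakeArray<T>(int length)
    {
        return new T[length];   // every element starts as default(T)
    }

    public static void Main()
    {
        string[] names = MakeArray<string>(3);       // class: elements are null
        int[] counts = MakeArray<int>(3);            // primitive: elements are 0
        DateTime[] dates = MakeArray<DateTime>(3);   // struct: elements are DateTime.MinValue

        Console.WriteLine(names.Length + counts.Length + dates.Length);   // prints 9
    }
}
```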

The second problem is that we need to know how to map from the type of the input to the type of the output. Generics don’t help here with the mechanism of the mapping. What is needed is a mapping function which can take in an object of one type and produce an output of the required type. Generics can help, however, in the definition of this function.

To allow our method to perform the translation of inputs to outputs, a Translation delegate is passed into the TranslateArray function. This allows the TranslateArray function to delegate the responsibility of doing the object conversion to the caller, who supplies an appropriate Translation function. This delegate is the Translation<paramT, outputT> argument to the TranslateArray function.

The Translation delegate is defined like this:
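
Going by the comment and names in the full listing below, the declaration would read roughly as follows. It is declared public at the top level here so the sketch stands alone (the original nests it as a private member), and the TranslationDemo class and string-length example are illustrative additions.

```csharp
using System;

// Transform an object of type T to an object of type O.
public delegate O Translation<T, O>(T input);

public static class TranslationDemo
{
    public static void Main()
    {
        // An anonymous method converts to the delegate type;
        // here T is string and O is int.
        Translation<string, int> length = delegate(string s) { return s.Length; };
        Console.WriteLine(length("generics"));   // prints 8
    }
}
```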

Like the TranslateArray function, the delegate uses generics to define the types of its input and output values. This allows the method the caller passes into TranslateArray to map any types it likes, without having to resort to object parameters and outputs and the associated casting. As shown later, this also provides some type-safety to the TranslateArray method.

The full delegate and method combination is shown below:



// Transform an object of type T to an object of type O
private delegate O Translation<T, O>(T input);

// Translate an input array into an output array of a
// different type, using a function to translate each element.
// Returns null when the input is null or empty.
private static outputT[] TranslateArray<paramT, outputT>(
		paramT[] input,
		Translation<paramT, outputT> f)
{
	outputT[] output = null;
	if (input != null && input.Length != 0)
	{
		output = new outputT[input.Length];
		for (int i = 0; i < input.Length; i++)
		{
			output[i] = f(input[i]);
		}
	}
	return output;
}

One of the most interesting things is that the Translation<...> function passed into the TranslateArray function has the types paramT and outputT forced upon it. Any conversion function passed to TranslateArray has to translate between the types of TranslateArray’s inputs and outputs, because both TranslateArray and Translation are parameterised to the same types. This prevents type errors in the conversion functions, which are simple to make if the conversion function is only defined as taking two objects rather than specific types.
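
A minimal sketch of that compile-time check follows. The int-to-string example, the TypeSafetySketch class and the public modifiers are assumptions for illustration; the method body is condensed from the listing above.

```csharp
using System;

public static class TypeSafetySketch
{
    public delegate O Translation<T, O>(T input);

    public static outputT[] TranslateArray<paramT, outputT>(
            paramT[] input, Translation<paramT, outputT> f)
    {
        if (input == null || input.Length == 0) return null;
        outputT[] output = new outputT[input.Length];
        for (int i = 0; i < input.Length; i++) output[i] = f(input[i]);
        return output;
    }

    public static void Main()
    {
        int[] numbers = new int[] { 1, 2, 3 };

        // paramT = int, outputT = string: the delegate must take an int
        // and return a string.
        string[] labels = TranslateArray<int, string>(
                numbers, delegate(int n) { return "#" + n; });
        Console.WriteLine(string.Join(",", labels));   // prints #1,#2,#3

        // Rejected at compile time if uncommented: the delegate takes
        // a string, but paramT is int.
        // string[] bad = TranslateArray<int, string>(
        //         numbers, delegate(string s) { return s; });
    }
}
```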

Calling TranslateArray makes the intent of the code more obvious than the for loops and if statements in the previous examples. We can use anonymous methods for the delegates passed into the TranslateArray function to simplify things further — they allow translation functions to be defined exactly where they are needed rather than elsewhere in the class. Keeping related functionality together further eases reading of the code.



String[] array1_new = TranslateArray(
		array1,
		delegate(InputTypeA val) { return val.name; } );

DateTime[][] array2_new = TranslateArray(
		array2,
		delegate(TypeB val) {
			DateTime[] o = new DateTime[2];
			o[0] = val.start;
			o[1] = val.end;
			return o;
		});

Though not greatly shorter, the second version of the code conveys its intent much better than the first, which was obscured by if and for statements.

This is the greatest strength of generics: they allow clearer, more understandable code to be written.

¹ Jumping ahead a little, Array.ConvertAll() could be used to replace the for-loop in the TranslateArray method.
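
For comparison, here is a sketch of the names conversion written directly with Array.ConvertAll. The Person class is a hypothetical stand-in for the element type of array1, and NamesOf is a made-up wrapper.

```csharp
using System;

public static class ConvertAllSketch
{
    // Hypothetical input type standing in for the elements of array1.
    public class Person
    {
        public string name;
        public Person(string name) { this.name = name; }
    }

    public static string[] NamesOf(Person[] people)
    {
        // Array.ConvertAll allocates the output array and applies
        // the converter to each element in order.
        return Array.ConvertAll<Person, string>(
                people, delegate(Person p) { return p.name; });
    }

    public static void Main()
    {
        Person[] people = new Person[] { new Person("Ada"), new Person("Linus") };
        Console.WriteLine(string.Join(", ", NamesOf(people)));   // prints Ada, Linus
    }
}
```

One difference worth knowing: Array.ConvertAll throws an ArgumentNullException for a null input and returns an empty array for an empty one, whereas TranslateArray returns null in both cases.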