Convert a jagged array into a 2D-array

Yesterday there was an interesting question on Stack Overflow:

If you have an array like string[][], what’s the smartest way to convert it to a regular multidimensional array like string[,] assuming the former array is rectangular (not jagged)?

Of course, we should first define what “smartest” means: faster? more compact? more elegant?
Anyway, it was a nice challenge, as an exercise, to look for an alternative to the “classic imperative” approach.

In the following analysis we will consider a source jagged array that is rectangular and has non-zero dimensions.
The source array is created and filled as follows:

            //example jagged array
            string[][] ja = new string[2][];

            for (int i = 0; i < 2; i++)
            {
                ja[i] = new string[3];

                for (int k = 0; k < 3; k++)
                {
                    ja[i][k] = "Cell [" + i + "," + k + "]";
                }
            }

The imperative way.

The imperative way is the most intuitive, I think.
The task is as simple as iterating over the first (outer) dimension, then over the second (inner) one. That gives access to each cell of the target 2D array, and the matching source element is just as easy to get.

        static string[,] ImperativeConvert(string[][] source)
        {
            string[,] result = new string[source.Length, source[0].Length];

            for (int i = 0; i < source.Length; i++)
            {
                for (int k = 0; k < source[0].Length; k++)
                {
                    result[i, k] = source[i][k];
                }
            }

            return result;
        }
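
The method above relies on the assumption that every row is as long as the first one. A minimal sketch of how that assumption could be checked before converting; `JaggedChecks` and `IsRectangular` are hypothetical names, not part of the original code:

```csharp
using System;
using System.Linq;

static class JaggedChecks
{
    // Hypothetical helper: returns true when 'source' has at least one row,
    // contains no null rows, and every row is as long as the (non-empty) first row.
    public static bool IsRectangular(string[][] source)
    {
        if (source == null || source.Length == 0 || source[0] == null)
            return false;

        int width = source[0].Length;
        return width > 0 && source.All(row => row != null && row.Length == width);
    }
}
```

A caller could throw an ArgumentException when the check fails. Without it, a shorter row makes the conversion fail with an IndexOutOfRangeException, and a longer row silently loses its extra cells.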

What about the IL code of the “Release” build?

	IL_0000: ldarg.0
	IL_0001: ldlen
	IL_0002: conv.i4
	IL_0003: ldarg.0
	IL_0004: ldc.i4.0
	IL_0005: ldelem.ref
	IL_0006: ldlen
	IL_0007: conv.i4
	IL_0008: newobj instance void string[0..., 0...]::.ctor(int32, int32)
	IL_000d: stloc.0
	IL_000e: ldc.i4.0
	IL_000f: stloc.1
	IL_0010: br.s IL_0033
	// loop start (head: IL_0033)
		IL_0012: ldc.i4.0
		IL_0013: stloc.2
		IL_0014: br.s IL_0027
		// loop start (head: IL_0027)
			IL_0016: ldloc.0
			IL_0017: ldloc.1
			IL_0018: ldloc.2
			IL_0019: ldarg.0
			IL_001a: ldloc.1
			IL_001b: ldelem.ref
			IL_001c: ldloc.2
			IL_001d: ldelem.ref
			IL_001e: call instance void string[0..., 0...]::Set(int32, int32, string)
			IL_0023: ldloc.2
			IL_0024: ldc.i4.1
			IL_0025: add
			IL_0026: stloc.2

			IL_0027: ldloc.2
			IL_0028: ldarg.0
			IL_0029: ldc.i4.0
			IL_002a: ldelem.ref
			IL_002b: ldlen
			IL_002c: conv.i4
			IL_002d: blt.s IL_0016
		// end loop

		IL_002f: ldloc.1
		IL_0030: ldc.i4.1
		IL_0031: add
		IL_0032: stloc.1

		IL_0033: ldloc.1
		IL_0034: ldarg.0
		IL_0035: ldlen
		IL_0036: conv.i4
		IL_0037: blt.s IL_0012
	// end loop

	IL_0039: ldloc.0
	IL_003a: ret

It’s not as simple as the original C# code, but it’s still readable (with some effort).

The declarative (Linq) way.

I found a declarative way using Linq, and it works fine, but I have a hard time calling it “smarter” than the first solution.

        static string[,] LinqConvert(string[][] source)
        {
            return new[] { new string[source.Length, source[0].Length] }
                .Select(_ => new { x = _, y = source.Select((a, ia) => a.Select((b, ib) => _[ia, ib] = b).Count()).Count() })
                .Select(_ => _.x)
                .First();
        }
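
A note on the two Count() calls in the query above: Select is lazily evaluated, so the assignment inside each lambda only runs when something enumerates the sequence. Count() is used here purely to force that enumeration. The following sketch shows the effect in isolation; `DeferredDemo` and `CountSideEffects` are hypothetical names introduced for illustration:

```csharp
using System;
using System.Linq;

static class DeferredDemo
{
    // Returns how many times the Select lambda actually ran.
    // When 'force' is false the query is built but never enumerated.
    public static int CountSideEffects(bool force)
    {
        int calls = 0;

        // Same indexed Select overload as in LinqConvert; the lambda has a side effect.
        var query = new[] { "a", "b", "c" }.Select((s, i) => { calls++; return s; });

        if (force)
        {
            query.Count(); // enumerates the sequence, which runs the lambda per element
        }

        return calls;
    }
}
```

Dropping either Count() from LinqConvert would leave the corresponding Select unenumerated, so no cells would be copied.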

Linq is often considered an elegant way to perform repetitive tasks, but here the code is far from elegant or readable. Besides, the basic operation is an “action” (copying cells), and that clashes with the Linq concept, which “pulls” data.
Anyway, the IL code seems a bit more compact than before, though only because the real work is hidden behind the many calls to compiler-generated methods.

	IL_0000: newobj instance void ConsoleApplication4.Program/'<>c__DisplayClass5'::.ctor()
	IL_0005: stloc.0
	IL_0006: ldloc.0
	IL_0007: ldarg.0
	IL_0008: stfld string[][] ConsoleApplication4.Program/'<>c__DisplayClass5'::source
	IL_000d: ldc.i4.1
	IL_000e: newarr string[0..., 0...]
	IL_0013: stloc.1
	IL_0014: ldloc.1
	IL_0015: ldc.i4.0
	IL_0016: ldloc.0
	IL_0017: ldfld string[][] ConsoleApplication4.Program/'<>c__DisplayClass5'::source
	IL_001c: ldlen
	IL_001d: conv.i4
	IL_001e: ldloc.0
	IL_001f: ldfld string[][] ConsoleApplication4.Program/'<>c__DisplayClass5'::source
	IL_0024: ldc.i4.0
	IL_0025: ldelem.ref
	IL_0026: ldlen
	IL_0027: conv.i4
	IL_0028: newobj instance void string[0..., 0...]::.ctor(int32, int32)
	IL_002d: stelem.ref
	IL_002e: ldloc.1
	IL_002f: ldloc.0
	IL_0030: ldftn instance class '<>f__AnonymousType0`2'<string[0..., 0...], int32> ConsoleApplication4.Program/'<>c__DisplayClass5'::'<LinqConvert>b__0'(string[0..., 0...])
	IL_0036: newobj instance void class [mscorlib]System.Func`2<string[0..., 0...], class '<>f__AnonymousType0`2'<string[0..., 0...], int32>>::.ctor(object, native int)
	IL_003b: call class [mscorlib]System.Collections.Generic.IEnumerable`1<!!1> [System.Core]System.Linq.Enumerable::Select<string[0..., 0...], class '<>f__AnonymousType0`2'<string[0..., 0...], int32>>(class [mscorlib]System.Collections.Generic.IEnumerable`1<!!0>, class [mscorlib]System.Func`2<!!0, !!1>)
	IL_0040: ldsfld class [mscorlib]System.Func`2<class '<>f__AnonymousType0`2'<string[0..., 0...], int32>, string[0..., 0...]> ConsoleApplication4.Program::'CS$<>9__CachedAnonymousMethodDelegate4'
	IL_0045: brtrue.s IL_0058

	IL_0047: ldnull
	IL_0048: ldftn string[0..., 0...] ConsoleApplication4.Program::'<LinqConvert>b__3'(class '<>f__AnonymousType0`2'<string[0..., 0...], int32>)
	IL_004e: newobj instance void class [mscorlib]System.Func`2<class '<>f__AnonymousType0`2'<string[0..., 0...], int32>, string[0..., 0...]>::.ctor(object, native int)
	IL_0053: stsfld class [mscorlib]System.Func`2<class '<>f__AnonymousType0`2'<string[0..., 0...], int32>, string[0..., 0...]> ConsoleApplication4.Program::'CS$<>9__CachedAnonymousMethodDelegate4'

	IL_0058: ldsfld class [mscorlib]System.Func`2<class '<>f__AnonymousType0`2'<string[0..., 0...], int32>, string[0..., 0...]> ConsoleApplication4.Program::'CS$<>9__CachedAnonymousMethodDelegate4'
	IL_005d: call class [mscorlib]System.Collections.Generic.IEnumerable`1<!!1> [System.Core]System.Linq.Enumerable::Select<class '<>f__AnonymousType0`2'<string[0..., 0...], int32>, string[0..., 0...]>(class [mscorlib]System.Collections.Generic.IEnumerable`1<!!0>, class [mscorlib]System.Func`2<!!0, !!1>)
	IL_0062: call !!0 [System.Core]System.Linq.Enumerable::First<string[0..., 0...]>(class [mscorlib]System.Collections.Generic.IEnumerable`1<!!0>)
	IL_0067: ret

Performance.

The last chance for the Linq solution could be performance, so I ran both conversions 100k times each.
Here are the results:

[Image: timings]
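
A timing loop of the kind described above could look like the following sketch. It is not the harness used for the original measurements; `ConvertTiming` and `Measure` are hypothetical names, and the warm-up call is an assumption about how to keep JIT compilation out of the measurement:

```csharp
using System;
using System.Diagnostics;

static class ConvertTiming
{
    // Runs 'convert' on 'source' the given number of times and returns the elapsed time.
    public static TimeSpan Measure(Func<string[][], string[,]> convert, string[][] source, int iterations)
    {
        convert(source); // warm-up call, so JIT compilation is not counted

        Stopwatch watch = Stopwatch.StartNew();
        for (int n = 0; n < iterations; n++)
        {
            convert(source);
        }
        watch.Stop();

        return watch.Elapsed;
    }
}
```

Each conversion would then be measured with a call such as ConvertTiming.Measure(ImperativeConvert, ja, 100000), and likewise for LinqConvert.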

Conclusion.

I would never use the Linq way, at least in cases like this. Frankly, I don’t see any benefit, other than as a coding exercise.
Does any of you have a better solution?

Here is the source code of the application.
