The Daily Parker

Politics, Weather, Photography, and the Dog

T is for Type

Blogging A to ZNow that I've caught up, day 20 of the Blogging A-to-Z challenge is just a few hours late. (The rest of the week should be back to noon UTC/7 am Chicago time.)

Today's topic: Types.

Everything in .NET is a type, even System.Type, which governs their metadata. Types exist in a hierarchy called the Common Type System (CTS). Distilled, there are two kinds of types: value types and reference types. I alluded to this distinction Saturday earlier today when discussing strings, which are reference types (classes) that behave a lot like value types (structs).

The principal distinction is that value types live on the stack and reference types live on the heap. This means that all of the data in a value type is contained in one place. The CLR sets aside memory for the entire object and moves the whole thing around as a unit. Naturally, this means value types tend to be small: numbers, characters, booleans, that sort of thing.

Reference types also partially live on the stack but only as a pointer to the heap where they keep their main data. The .NET memory manager can move reference types and their data independently as needed to handle different situations.

One of the consequences of this distinction is that when you pass a value type to a method, you're passing the entire thing; but when you pass a reference type to a method, you're only passing its pointer. This can give new .NET developers terrific headaches:

public void ChangeStuff()
{
	var i = 12345; // i is a System.Int32
	var p = new MyClass { Name = "Hubert", Count = i };
	
	ChangeStuff(i, p);
	
	Debug.Assert(12345 == i); // Succeeds!
	Debug.Assert(54321 == MyClass.Count); // Succeeds!
	Debug.Assert("Hubert" == MyClass.Name); // Fails!
}

private void ChangeStuff(int i, MyClass myClass)
{
	i = 54321; // The i in this method is not the i in the calling method
	MyClass.Name = "Humbert"; // But the MyClass is
	MyClass.Count = i;
}

When the CLR passes i to the second ChangeStuff method, it creates a copy of i on the stack and passes in that copy. Then on line 15, we create a third copy of i that replaces the second copy on the stack.

But lines 16 and 17 aren't creating copies of MyClass; they're using the pointer to the instance of MyClass created on line 4, so that any operations on MyClass work on the very same instance.

I recommend new developers read the MSDN article on Types I referenced above. Also read up on boxing and unboxing and anonymous types.

S is for String

Blogging A to ZDay 19 of the Blogging A-to-Z challenge was Saturday, but Apollo After Hours drained me more or less completely for the weekend.

So this morning, let's pretend it's still Saturday for just a moment, and consider one of the oddest classes in the .NET Base Class Library (BCL): System.String.

A string is just a sequence of one or more characters. A character could be anything: a letter, a number, a random two-byte value, what have you. System.String holds the sequence for you and gives you some tools to control them, like Compare, Format, Join, Split, and StartsWith. Under the hood, the class holds the string as an array of char values.

Even though System.String is a class and not a struct, it behaves much more like the latter than the former. Strings are immutable: once you create a string, any changes you make to it create a new string instance. Also, below a certain length, strings live on the stack rather than in the heap, which has consequences for memory management and performance. But unlike structs, strings can be null. This is valid code:

var s = "Hello, world";
string t = null;
var u = t + s;

Strings can also be zero-length or all whitespace, which is why the class has a very useful method String.IsNullOrWhitespace().

The blog C# in Depth has a good description of strings that's worth reading. Jon Skeet also takes on string memory management in one of his longer posts.

R is for Reflection

Blogging A to ZOK, I lied. I managed to find 15 minutes to bring you day 18 of the Blogging A-to-Z challenge, in which I'll discuss one of the coolest feature of the .NET ecosystem: reflection.

Reflection gives .NET code the ability to inspect and use any other .NET code, full stop. If you think about it, the runtime has to have this ability just to function. But any code can use tools in the System.Reflection namespace. This lets you do some pretty cool stuff.

Here's a (necessarily brief) example, from the Inner Drive Extensible Architecture. This method, in the TypeInfo class, uses reflection to describe all the methods in a Type:

using System;
using System.Collections.Generic;
using System.Data;
using System.Diagnostics;
using System.Globalization;
using System.Linq;
using System.Reflection;
using System.Text;

public static IDictionary<string, MethodInfo> ReflectMethods(object target)
{
	if (null == target) return new Dictionary<string, MethodInfo>();

	var thisType = target.GetType();
	var methods = thisType.GetMethods(BindingFlags.Public | BindingFlags.NonPublic | BindingFlags.Instance | BindingFlags.Static);

	var returnList = new Dictionary<string, MethodInfo>();
	foreach (var thisMethod in methods.OrderBy(m => m.Name))
	{
		var parameters = thisMethod.GetParameters();
		var keyBuilder = new StringBuilder(thisMethod.Name.Length + (parameters.Length * 16));
		keyBuilder.Append(thisMethod.Name);
		foreach (var t in parameters)
		{
			keyBuilder.Append(";P=");
			keyBuilder.Append(t.ParameterType.ToString());
		}
		keyBuilder.Append(";R=");
		keyBuilder.Append(thisMethod.ReturnType.ToString());
		keyBuilder.Append(";F=");
		keyBuilder.Append(thisMethod.Attributes.ToString());

		try 
		{
			returnList.Add(keyBuilder.ToString(), thisMethod);
		}
		catch (ArgumentException aex) 
		{
			Trace.WriteLine(string.Format(CultureInfo.InvariantCulture, "> Exception on {0}: {1}", keyBuilder, aex.Message));
		}
		catch (Exception ex)
		{
			Trace.WriteLine(string.Format(CultureInfo.InvariantCulture, "> Exception on {0}: {1}", keyBuilder, ex.Message));
			throw;
		}
	}

	return returnList;
}

The System.Reflection.Assembly class also allows you to create and activate instances of any type, use its methods, read its properties, and send it events.

Reflection is hugely powerful and hugely useful. And very, very cool.

Dogleg

I published today's A-to-Z post a little late because I've had a lot going on this week, between the Apollo Chorus benefit tomorrow, rehearsals, and taking care of Parker.

Yesterday Parker got his sutures out. The vet said he's healing very well, no signs of infection or re-injury, and good progress on using the injured leg. He can go without the Cone of Shame while someone is observing him, and on Friday, he can have it off permanently. Both he and I really, really, really want that to happen.

They also gave me art. Here are his X-rays from two weeks ago, showing his right leg before and after surgery. Before:

After:

The after photo looks painful, but the vet and the dog both assure me it's better than the before image.

Another improvement: he can now go on 10-minute walks, increasing by 5 minutes a week until mid-May when he can go as long as he wants. We're still walking slowly so that he has to put weight on the leg. But today, when he gets his first trip around the block in almost 3 weeks, I'm sure he'll be happier.

Q is for Querying

Blogging A to ZPosting day 17 of the Blogging A-to-Z challenge just a little late because of stuff (see next post). Apologies.

Today's topic is querying, which .NET makes relatively easy through the magic of LINQ. Last week I showed how LINQ works when dealing with in-memory collections of things. In combination with Entity Framework, or another object-relational mapper (ORM), LINQ makes getting data out of your database a ton easier.

When querying a database in a .NET application, you will generally need a database connection object, a database command object, and a data reader. Here's a simple example using SQL Server:

public void DirectQueryExample(string connectionString)
{
	using (var conn = new SqlConnection(connectionString))
	{
		var command = new SqlCommand("SELECT * FROM LookupData", conn);
		conn.Open();
		var reader = command.ExecuteReader();
		foreach (var row in reader)
		{
			Console.WriteLine(reader[0]);
		}
	}
}

(Let's skip for now how bad it is to execute raw SQL from your application.)

With Entity Framework (or another ORM), the ORM generates classes that represent tables or views in your database. Imagine you have an animals table, represented by an animal class in your data project. Finding them in your database might now look like this:

public IEnumerable<Animal> OrmQueryExample(string species)
{
	var result = new List<Animal>();
	using (var db = Orm.Context)
	{
		var dtos = db.Animals.Where(p => p.Species == species);
		result.AddRange(dtos.ForEach(MapDtoToDomainObject));
	}

	return result;
}

private Animal MapDtoToDomainObject(AnimalDto animalDto)
{
	// Code elided
}

That looks a little different, no? Instead of opening a connection to the database and executing a query, we use a database context that Entity Framework supplies. We then execute a LINQ query directly on the Animals table representation with a predicate. The ORM handles constructing and executing the query, and returns an IQueryable<T> of its self-generated Animal data transfer object (DTO). When we get that collection back, we map the fields on the DTO to the domain object we want, and return an IEnumerable<T> of our own domain object back to the caller. If the query comes up empty, we return an empty list. (Here's a decent Stack Overflow post on the difference between the two collection types.)

These are naive examples, of course, and there are many other ways of using EF. For example, for field mapping we might use a package like AutoMapper instead of rolling our own field-level mapping. I encourage you to read up on EF and related technologies.

Go home, April. You're drunk.

So far, this April ranks as the 2nd coldest in Chicago history. We had snow this past weekend, and we expect to have snow tonight—on April 18th.

So it may come as a surprise to people who confuse "weather" and "climate" that, worldwide, things are pretty hot:

The warm air to our north and east has blocked the cold air now parked over the midwestern U.S. Europe, meanwhile, feels like August. And Antarctica feels like...well, Antarctica, but unusually warm.

Note that the temperature anomalies at the bottom of the image above are based on the 1980-2010 climate normal period, which was warmer than any previous 30-year period. In other words, the poles may be 3-5°C warmer than normal now and also 4-7°C warmer than any point in recorded history.

At least, historically, a cold spring means a cool summer here. Lake Michigan is a very cold 5°C today, a few degrees below normal for this time of year, and a huge sink for summer heat later on. Here's hoping, anyway.

P is for Polymorphism

Blogging A to ZWe're now past the half-way point, 16 days into the Blogging A-to-Z challenge. Time to go back to object-oriented design fundamentals.

OO design has four basic concepts:

All four have specific meanings. Today we'll just look at polymorphism (from Greek: "poly" meaning many and "morph" meaning shape).

Essentially, polymorphism means using the same identifiers in different ways. Let's take a contrived but common example: animals.

Imagine you have a class representing any animal (see under "abstraction"). Animals can move. So:

public abstract class Animal
{
	public abstract void Move();
}

Notice that the Move method has no implementation, since animal species have many different ways of moving.

Now imagine two concrete animal classes:

public class Dog : Animal
{
	public override void Move() 
	{
		// Walk like a quadraped
	}
}

public class Guppy : Animal
{
	public override void Move() 
	{
		// Swim like a fish
	}
}

Guppies and dogs both move around just fine in their own environments, and dogs can move around in the littoral areas of a guppy's environment as well. So both animals have a Move method.

In this way, the Move method is polymorphic. A caller doesn't need to know anything about guppies or dogs in order to get them to move. And the implementations of the Move method will be completely different:

public void MoveAll(IEnumerable<Animal> animals)
{
	animals.ForEach(a => a.Move());
}

That method doesn't care what the list contains. It moves them all the same.

Now imagine this class:

public class Electron : Lepton
{
	public override void Move() 
	{
		// Walk like a quadraped
	}
}

Electrons move too. The implementation of Electron.Move() differs from Dog.Move() or Guppy.Move() so vastly that no one really knows how electrons do it. But if you call Electron.Move(), you expect the thing to move.

I've only given examples of subtyping and duck typing today, so it's worth reading more about polymorphism in general. Also, as you recall from my discussion of interfaces, you probably would also define an interface like IMovable to express that your class can move, rather than relying on the abstract classes and inheritance. (Program to interfaces, not to implementations!)

Quick links

A couple stories of interest:

OK, back to being really too busy to breathe this week...

O is for O(n) Notation

Blogging A to ZFor day 15 of the Blogging A-to-Z challenge I want to talk about something that computer scientists use but application developers typically don't.

Longtime readers of the Daily Parker know that I put a lot of stock in having a liberal arts education in general, and having one in my profession in specific. I have a disclosed bias against hiring people with computer science (CS) degrees unless they come from universities with rigorous liberal arts core requirements. Distilled down to the essence, I believe that CS majors at most schools spend too much time on how computers work and not enough time on how people work.

But CS majors do have a lexicon that more liberally-educated developers don't really have. For example, when discussing how well code performs, CS majors use "Big O" notation.

In Big O notation, the O stands for "order of growth," meaning how much time could the algorithm could grow to take up given worst-case inputs.

The simplest notation is O(1), where the code always takes the same amount of time to execute no matter what the inputs are:

int q = 1;
int r = 2;
var s = q + r;

Adding two integers in .NET always takes the same amount of time, no matter what the integers are.

The titular notation for this post, O(n), means that the execution time grows linearly based on the input. Each item you add to the input increases the time the algorithm takes by exactly one unit, as in a simple for loop:

public long LoopExample(int[] numbers)
{
	long sum;
	for(var x = 0; x < numbers.Length; x++)
	{
		sum += numbers[x];
	}
	return sum;
}

In that example, each item you add to the array numbers increases the time the loop takes by exactly one unit of time, whatever that unit may be. (On most computers today that would be measured in units of milliseconds, or even hundreds of nanoseconds.)

Other algorithms may be slower or faster than these examples, except that no algorithm can be faster than O(1). The parenthetical can be any expression of n. Some algorithms grow by the logarithm of n, some grow by the double of n, some grow by the square of n.

This notation enables people to communicate precisely about how well code performs. If you find that your code takes O(n2) time to run, you may want to find a fix that reduces it to O(log n). And you can communicate to people how you increased efficiency in just that way.

So as much as I want application developers to have broad liberal educations, it's worth remembering that computer science fits into a liberal education as well.