The Daily Parker

Politics, Weather, Photography, and the Dog

Border cases

Just a quick note about debugging. I just spent about 30 minutes tracking down a bug that caused a client to get invoiced for -18 hours of premium time and 1.12 days of regular time.

The basic problem is that an appointment can begin and end at any time, but from 6pm to 8am, an appointment costs more per hour than during business hours. This particular appointment started at 5pm and went until midnight, which should be 6 hours of premium and 1 hour of regular.

The bottom line: I had unit tests, which automatically tested a variety of start and end times across all time zones (to ensure that local time always prevailed over UTC), including:

  • Starts before close, finishes after close before midnight
  • Starts before close, finishes after midnight before opening
  • Starts before close, finishes after next opening
  • Starts after close, finishes before midnight
  • Starts after close, finishes after midnight before opening
  • Starts after close, finishes after next opening
  • ...

Notice that I never tested what happened when the appointment ended at midnight.

The fix was a single equals sign, as in:

- if (localEnd > midnight & local <= localOpenAtEnd)
+ if (localEnd >= midnight & local <= localOpenAtEnd)
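A boundary test along these lines would have caught it. This is only a sketch, since the real appointment and invoicing types aren't shown here; every name in it is hypothetical:

// Hypothetical NUnit test for the missing boundary case: an appointment
// that ends exactly at midnight. Type and property names are made up.
[Test]
public void AppointmentEndingAtMidnightBillsSixPremiumAndOneRegularHour()
{
	// 5pm to midnight: 1 regular hour (5-6pm), 6 premium hours (6pm-midnight)
	var appointment = new Appointment(
		new DateTime(2013, 2, 8, 17, 0, 0),
		new DateTime(2013, 2, 9, 0, 0, 0));

	var invoice = InvoiceCalculator.Calculate(appointment);

	Assert.AreEqual(6, invoice.PremiumHours);
	Assert.AreEqual(1, invoice.RegularHours);
}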

Nicely done, Braverman. Nicely done.

When the Azure emulator is more forgiving than real life

Last night I made the mistake of testing a deployment to Azure right before going to bed. Everything had worked beautifully in development, I'd fixed all the bugs, and I had a virgin Windows Azure affinity group complete with a pre-populated test database ready for the Weather Now worker role's first trip up to the Big Time.

The first complete and total failure of the worker role, I should have predicted. Just as I do in the brick-and-mortar development world, I create low-privilege SQL accounts for applications to use. So immediately I had a bunch of SQL exceptions that I resolved with a few GRANT EXEC commands. No big deal.

Once I restarted the worker role, it connected to the database, loaded its settings, downloaded a file from NOAA and...crashed:

Inner Drive Weather threw System.Data.Services.Client.DataServiceRequestException
...
OutOfRangeInput

One of the request inputs is out of range.
RequestId:572bcfee-9e0b-4a02-9163-1c6163798d60
Time:2013-02-10T06:05:41.5664525Z

at System.Data.Services.Client.DataServiceContext.SaveResult.d__1e.MoveNext()

Oh no. The dreaded Azure Storage exception that tells you absolutely nothing.

Flash forward fifteen minutes (now past midnight; for context, I'm writing this on the 9am flight to Los Angeles): with Fiddler running on a local instance connected to production Azure storage, I found the XML block on which real Azure Storage barfed but which the Azure storage emulator passed without a second thought. The offending table entity is metadata that the NOAA downloader worker task stores to let the weather parsing worker task know it has work to do:

<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<entry xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices"
  xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"
  xmlns="http://www.w3.org/2005/Atom">
  <title />
  <author>
    <name />
  </author>
  <updated>2013-02-10T05:55:49.3316301Z</updated>
  <id />
  <content type="application/xml">
    <m:properties>
      <d:BlobName>20130209-0535-sn.0034.txt</d:BlobName>
      <d:FileName>sn.0034.txt</d:FileName>
      <d:FileTime m:type="Edm.DateTime">2013-02-09T05:35:00Z</d:FileTime>
      <d:IsParsed m:type="Edm.Boolean">false</d:IsParsed>
      <d:ParseTime m:type="Edm.DateTime">0001-01-01T00:00:00</d:ParseTime>
      <d:PartitionKey>201302</d:PartitionKey>
      <d:RetrieveTime m:type="Edm.DateTime">2013-02-10T05:55:29.1084794Z</d:RetrieveTime>
      <d:RowKey>20130209-0535-41d536ff-2e70-4564-84bd-7559a0a71d4d</d:RowKey>
      <d:Size m:type="Edm.Int32">68202</d:Size>
      <d:Timestamp m:type="Edm.DateTime">0001-01-01T00:00:00</d:Timestamp>
    </m:properties>
  </content>
</entry>

Notice that the ParseTime and Timestamp values are equal to System.DateTimeOffset.MinValue, which, it turns out, is not a legal Azure table value. Wow, would it have helped me if the emulator horked on those values during development.

The fix was simply to make sure that neither System.DateTimeOffset.MinValue nor System.DateTime.MinValue ever got into an outbound table entity, which took me about five minutes to implement. Also, it turned out that even though my table entity inherited from TableServiceEntity, I still had to set the Timestamp property when using real Azure storage. (The emulator sets it for you.)
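The guard looks something like this sketch (the names are hypothetical; the real code isn't in this post). Azure table storage only accepts DateTime values from 1601-01-01 UTC onward, so anything still sitting at .NET's MinValue gets bumped up to a safe floor before the entity goes out the door:

// Hypothetical helper: clamp anything below Azure table storage's supported minimum date.
private static readonly DateTime AzureMinimumDateTime =
	new DateTime(1601, 1, 1, 0, 0, 0, DateTimeKind.Utc);

private static DateTime MakeAzureSafe(DateTime value)
{
	return value < AzureMinimumDateTime ? AzureMinimumDateTime : value;
}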

By this point it was 12:30 and I needed to get some sleep, so my plan to run an overnight test will have to wait until this evening at my hotel. Then I'll find the other bits of code that work fine against the emulator but fail against real Azure storage because, for reasons that pass understanding, the emulator gets them completely wrong.

Putting a bow on it

We're just 45 minutes from releasing a software project to our client for user acceptance testing (UAT), and we're ready. (Of course, there are those 38 "known issues..." But that's what the UAT period is for!)

When I get back from the launch meeting, I'll want to check these out:

Off to the client. Then...bug fixes!

Performance improvement; or, how one line of code can change your life

I'm in the home stretch moving Weather Now to Azure. I've finished the data model, data retrieval code, integration with the existing UI, and the code that parses incoming weather data from NOAA, so now I'm working on inserting that data into the database.

To speed up development, improve the design, and generally make my life easier, I'm using Entity Framework 5.0 with database-first modeling. The problem that consumed me yesterday afternoon and on into this morning has been how to ramp up to realistic volumes of data.

The Worker Role that will go out to NOAA and put weather data where Weather Now can use it will receive somewhere around 60,000 weather reports every hour. Often, NOAA repeats reports; sometimes, NOAA sends truncated copies of reports; sometimes, NOAA sends garbled reports. The GetWeather application (soon to be Azure worker task) has to handle all of that and still function in bursts of up to 10,000 weather reports at once.

The WeatherStore class takes parsed METARs and stores them in the CurrentObservations, PastObservations, and ClimateObservations tables, as appropriate. As I've developed the class, I've written unit tests for each kind of thing it has to do: "Store single report," "Store many reports" (which tests batching them up and inserting them in smaller chunks), "Store duplicate reports," etc. Then yesterday afternoon I wrote an integration test called "Store real-life NOAA file" that took the 600 KB, 25,000-line, 6,077-METAR update NOAA published at 2013-01-01 00:00 UTC, and stuffed it in the database.

Sucker took 900 seconds—15 minutes. In real life, that would mean a complete collapse of the application, because new files come in about every 4 minutes and each contains thousands of lines to parse.

This morning, I attached JetBrains dotTrace to the unit test (easy to do since JetBrains ReSharper was running the test), and discovered that 90% of the method's time was spent in—wait for it—DbContext.SaveChanges(). As I dug through the line-by-line tracing, it was obvious Entity Framework was the problem.

I'll save you the steps to figure it out, except to say Stack Overflow is the best thing to happen to software development since the keyboard.

Here's the solution:

using (var db = new AppDataContext())
{
	db.Configuration.AutoDetectChangesEnabled = false;

	// do interesting work

	db.SaveChanges();
}
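The reason this one line matters so much, as I understand it: with AutoDetectChangesEnabled turned on, every call to DbSet.Add() triggers DetectChanges(), which re-scans every entity already in the change tracker, so adding roughly 6,000 observations costs millions of redundant scans before SaveChanges() ever talks to the database. With the flag off, the adds stay cheap. In the bulk-insert path the pattern looks roughly like this (the collection and DbSet names are stand-ins for the real ones):

// Sketch only; parsedMetars and CurrentObservations are hypothetical names.
using (var db = new AppDataContext())
{
	db.Configuration.AutoDetectChangesEnabled = false;

	foreach (var observation in parsedMetars)
	{
		db.CurrentObservations.Add(observation);
	}

	db.SaveChanges();
}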

The result: The unit test duration went from 900 seconds to...15. And that is completely acceptable. Total time spent on this performance improvement: 1.25 hours.

Chaining LINQ predicates

I've spent a good bit of free time lately working on migrating Weather Now to Azure. Part of this includes rewriting its Gazetteer, or catalog of places that it uses to find weather stations for users. For this version I'm using Entity Framework 5.0, which in turn allows me to use LINQ extensively.

I always try to avoid duplicating code, and I always try to write sufficient unit tests to prevent (and fix) any coding errors I make. (I also use ReSharper and Visual Studio Code Analysis to keep me honest.)

There are two methods in the Gazetteer's PlaceFinder class that search for places by distance. The prototypes are:

public static IEnumerable<PlaceDistance> FindNearby(ILocatable center, Length radius)

and:

public static IEnumerable<PlaceDistance> FindNearby(ILocatable center, Length radius, Expression<Func<Place, bool>> predicate)

In order for the first method to work, it has to create a predicate of its own that draws a box around the center location. (The ILocatable interface requires Latitude and Longitude. Length is a class in the Inner Drive Extensible Architecture representing a measurable two-dimensional distance.) So in order for the second method to work, it has to chain that bounding-box predicate with the caller's predicate.
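A minimal sketch of that bounding-box predicate, assuming a simple radius-to-degrees conversion and Latitude/Longitude properties on Place (both assumptions on my part), might look like this:

// Hypothetical sketch; DegreesOfLatitude and the Place property names are assumed.
// (A real implementation would widen the longitude bounds away from the equator.)
private static Expression<Func<Place, bool>> SearchDistancePredicate(
	ILocatable center, Length radius)
{
	var delta = DegreesOfLatitude(radius);
	var minLatitude = center.Latitude - delta;
	var maxLatitude = center.Latitude + delta;
	var minLongitude = center.Longitude - delta;
	var maxLongitude = center.Longitude + delta;

	return place =>
		place.Latitude >= minLatitude && place.Latitude <= maxLatitude &&
		place.Longitude >= minLongitude && place.Longitude <= maxLongitude;
}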

Fortunately, I found Joe and Ben Albahari's library of LINQ extensions. Here's the second method:

public static IEnumerable<PlaceDistance> FindNearby(
	ILocatable center,
	Length radius,
	Expression<Func<Place, bool>> predicate)
{
	var searchPredicate = 
		SearchDistancePredicate(center, radius)
		.And(predicate);

	var places = Find(searchPredicate);

	return SearchDistanceResults(places, center, radius);
}
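The And() extension comes from their PredicateBuilder, which stitches two expression trees together roughly like this (paraphrased from the published version; it needs System.Linq and System.Linq.Expressions):

public static Expression<Func<T, bool>> And<T>(
	this Expression<Func<T, bool>> first,
	Expression<Func<T, bool>> second)
{
	// Invoke the second expression with the first expression's parameter,
	// then AND the two bodies into a single expression tree.
	var invoked = Expression.Invoke(second, first.Parameters.Cast<Expression>());
	return Expression.Lambda<Func<T, bool>>(
		Expression.AndAlso(first.Body, invoked), first.Parameters);
}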

This allows me to use a single Find method that takes a predicate, engages a retry policy, and returns exactly what I'm looking for. And it allows me to do this, which just blows my mind:

var results = PlaceFinder.FindNearby(TestNode, TestRadius, p => p.Feature.Name == "airport");

Compared with how Weather Now works under the hood right now, and how much code the existing version needed to achieve the same results, I'm just stunned. And it will make migrating Weather Now a lot easier.

Upgrading to Azure Storage Client 2.0

Oh, Azure Storage team, why did you break everything?

I love upgrades. I really do. So when Microsoft released the new version of the Windows Azure SDK (October 2012, v1.8) along with a full upgrade of the Storage Client (to 2.0), I found a little side project to upgrade, and went straight to the NuGet Package Manager for my prize.

I should say that part of my interest came from wanting to use some of the .NET 4.5 features, including the asynchronous helper methods, HTML 5, and native support for SQL 2012 spatial types, in the new version of Weather Now that I hope to complete before year's end. The Azure SDK 1.8 supports .NET 4.5; previous versions didn’t. And the Azure SDK 1.8 includes a new version of the Azure Emulator, which supports 4.5 as well.

To support the new, Azure-based version (and to support a bunch of other projects that I migrated to Azure), I have a class library of façades supporting Azure. Fortunately, this architecture encapsulated all of my Azure Storage calls. Unfortunately, the upgrade broke every other line of code in the library.

0. Many of the namespaces have changed. But of course, you use ReSharper, which makes the problem go away.
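For what it's worth, the using-statement changes look roughly like this (your exact set will vary depending on which features you touch):

- using Microsoft.WindowsAzure.StorageClient;
+ using Microsoft.WindowsAzure.Storage;
+ using Microsoft.WindowsAzure.Storage.Blob;
+ using Microsoft.WindowsAzure.Storage.Table;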

1. The CloudStorageAccount.FromConfigurationSetting() method is gone. Instead, you have to use CloudStorageAccount.Parse(). Here is the delta from TortoiseHg:

- _cloudStorageAccount = CloudStorageAccount.FromConfigurationSetting(storageSettingName);
+ var setting = CloudConfigurationManager.GetSetting(storageSettingName);
+ _cloudStorageAccount = CloudStorageAccount.Parse(setting);

2. BlobContainer.GetBlobReference() is gone, too. Instead of getting a generic CloudBlob reference back, you have to specify whether you want a page blob or a block blob. In this app, I only use block blobs, so the delta looks like this:

- var blob = _blobContainer.GetBlobReference(blobName);
+ var blob = _blobContainer.GetBlockBlobReference(blobName);

Note that BlobContainer also has a GetPageBlobReference() method. It also has a GetBlobReferenceFromServer() method that throws a 404 error if the blob doesn’t exist, which makes it useless for creating new blobs.

3. Blob.DeleteIfExists() works somewhat differently, too:

- var blob = _blobContainer.GetBlobReference(blobName);
- blob.DeleteIfExists(new BlobRequestOptions 
-	{ DeleteSnapshotsOption = DeleteSnapshotsOption.IncludeSnapshots });
+ var blob = _blobContainer.GetBlockBlobReference(blobName);
+ blob.DeleteIfExists();

4. Remember downloading text directly from a blob using Blob.DownloadText()? Yeah, that’s gone too. Blobs are all about streams now:

- var blob = _blobContainer.GetBlobReference(blobName);
- return blob.DownloadText();
+ using (var stream = new MemoryStream())
+ {
+ 	var blob = _blobContainer.GetBlockBlobReference(blobName);
+ 	blob.DownloadToStream(stream);
+ 	using (var reader = new StreamReader(stream, true))
+ 	{
+ 		stream.Position = 0;
+ 		return reader.ReadToEnd();
+ 	}
+ }

5. Because blobs are all stream-based now, you can’t simply upload files or byte arrays to them. Here’s the fix for the disappearance of Blob.UploadByteArray():

- var blob = _blobContainer.GetBlobReference(blobName);
- blob.UploadByteArray(value);
+ var blob = _blobContainer.GetBlockBlobReference(blobName);
+ using (var stream = new MemoryStream(value))
+ {
+ 	blob.UploadFromStream(stream);
+ }

6. Microsoft even helpfully corrected a spelling error which, yes, broke my code:

- _blobContainer.CreateIfNotExist();
+ _blobContainer.CreateIfNotExists();

Yes, if not existS. Notice the big red S, which is something I’d like to give the Azure team after this upgrade.*

7. We’re not done yet. They fixed a "problem" with tables, too:

  var cloudTableClient = _cloudStorageAccount.CreateCloudTableClient();
- cloudTableClient.CreateTableIfNotExist(TableName);
- var context = cloudTableClient.GetDataServiceContext();
+ var table = cloudTableClient.GetTableReference(TableName);
+ table.CreateIfNotExists();
+ var context = cloudTableClient.GetTableServiceContext();

8. Finally, if you have used the CloudStorageAccount.SetConfigurationSettingPublisher() method, that’s gone too, but you don’t need it. Instead, use the CloudConfigurationManager.GetSetting() method directly. Instead of doing this:

if (RoleEnvironment.IsAvailable)
{
	CloudStorageAccount.SetConfigurationSettingPublisher(
		(configName, configSetter) => 
		configSetter(RoleEnvironment.GetConfigurationSettingValue(configName)));
}
else
{
	CloudStorageAccount.SetConfigurationSettingPublisher(
		(configName, configSetter) => 
		configSetter(ConfigurationManager.AppSettings[configName]));
}

You can simply do this:

var someSetting = CloudConfigurationManager.GetSetting(settingKey);

The CloudConfigurationManager.GetSetting() method first tries to get the setting from Azure, then from the ConfigurationManager (i.e., local settings).

I hope I have just saved you three hours of silently cursing Microsoft’s Azure Storage team.

* Apologies to Bill Cosby.

Starting the oldest item on my to-do list

I mentioned a few weeks ago that I've had some difficulty moving the last remaining web application in the Inner Drive Technology Worldwide Data Center, Weather Now, into Microsoft Windows Azure. Actually, I have two principal difficulties: first, I need to re-write almost all of it, to end its dependency on a Database of Unusual Size; and second, I need the time to do this.

Right now, the databases hold about 2 GB of geographic information and another 20 GB of archival weather data. Since these databases run on my own hardware, I don't have to pay for them outside of the server's electricity costs. In Azure, that amount of database space costs more than $70 per month, well above the $25 or so my database server costs me.

I've finally figured out the architecture changes needed to get the geographic and weather information into cheaper (or free) repositories. Some of the strategy involves not storing the information at all, and some will use the orders-of-magnitude-less-expensive Azure table storage. (In Azure storage, 25 GB costs $3 per month.)

Unfortunately for me, the data layer is about 80% of the application, including the automated processes that go out and get weather data. So, to solve this problem, I need a ground-up re-write.

The other problem: time. Last month, I worked 224 hours, which doesn't include commuting (24 hours), traveling (34 hours), or even walking Parker (14 hours). About my only downtime was during that 34 hours of traveling and while sitting in pubs in London and Cardiff.

I have to start doing this, though, because I'm spending way too much money running two servers that do very little. And I've been looking forward to it—it's not a chore, it's fun.

Not to mention, it means I get to start working on the oldest item on my to-do list, Case 46 ("Create new Gazetteer database design"), opened 30 August 2006, two days before I adopted Parker.

And so it begins.

I wish stuff just worked

Despite my enthusiasm for Microsoft Windows Azure, in some ways it suffers from the same problem all Microsoft version 1 products have: incomplete debugging tools.

I've spent the last three hours trying to add an SSL certificate to an existing Azure Web application. In previous attempts with different applications, this has taken me about 30 minutes, start to finish.

Right now, however, the site won't launch at all in my Azure emulator, presenting a generic "Internal server error - 500" when I try to start the application. The emulator isn't hitting any of my code, nor is it logging anything to the Windows System or Application logs. So I have no idea why it's failing.

I've checked the code into source control and built it on another machine, where it had exactly the same problem. So I know it's something under source control. I just don't know what.

I hate very little in this world, but lazy developers who fail to provide debugging information bring me near to violence. A simple error stack would probably lead me to the answer in seconds.

Update: The problem was in the web.config file.

Earlier, I copied a connection string element from a transformation file into the master web.config file, but I forgot to remove the transformation attributes xdt:Transform="Replace" and xdt:Locator="Match(name)". This prevented the IIS emulator from parsing the configuration file, which caused the 500 error.
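For anyone hitting the same wall, the offending element looked something like this (the connection string name and value here are placeholders). The xdt: attributes belong only in the Web.Debug.config/Web.Release.config transform files; the base web.config never declares the xdt namespace prefix, so the file won't parse with them present:

<!-- Hypothetical example; the real name and connection string aren't shown here. -->
<connectionStrings>
  <add name="DefaultConnection"
       connectionString="..."
       providerName="System.Data.SqlClient"
       xdt:Transform="Replace"
       xdt:Locator="Match(name)" />
</connectionStrings>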

I must reiterate, however, that some lazy developer neglected to provide this simple piece of debugging information, and my afternoon was wasted as a result.

It reminds me of a scene in Terry Pratchett and Neil Gaiman's Good Omens (one of the funniest books ever written). Three demons are comparing notes on how they have worked corruption on the souls of men. The first two have each spent years tempting a priest and corrupting a politician. Crowley's turn:

"I tied up every portable telephone system in Central London for forty-five minutes at lunchtime," he said.

"Yes?" said Hastur. "And then what?"

"Look, it wasn't easy," said Crowley.

"That's all?" said Ligur.

"Look, people—"

"And exactly what has that done to secure souls for our master?" said Hastur.

Crowley pulled himself together.

What could he tell them? That twenty thousand people got bloody furious? That you could hear the arteries clanging shut all around the city? And that then they went back and took it out on their secretaries or traffic wardens or whatever, and they took it out on other people? In all kinds of vindictive little ways which, and here was the good bit, they thought up themselves. The pass-along effects were incalculable. Thousands and thousands of souls all got a faint patina of tarnish, and you hardly have to lift a finger.

Somehow, debugging the Azure emulator made me think of Crowley, who no doubt helped Microsoft write the thing.

W-8 a second...

After installing Windows 8 yesterday, I discovered some interaction problems with my main tool, Visual Studio 2012. Debugging Azure has suddenly become difficult. So after installing the OS upgrade, I spent about five hours re-installing or repairing a whole bunch of other apps, and I'm not even sure I found the causes of the problems.

The next step is to install new WiFi drivers. But seriously, I'm only a few troubleshooting steps from rebuilding the computer from scratch back on Windows 7.

Cue the cursing...

W-8, W-8!

This morning I installed Microsoft Windows 8 on my laptop. As a professional geek, getting software after it's released to manufacturing but before the general public is a favorite part of my job.

It took almost no effort to set up, and I figured out the interface in just a few minutes. I like the new look, especially the active content on the Start screen. It definitely has a more mobile-computing look than previous Windows versions, with larger click targets (optimized for touch screens) and tons of integration with Windows Accounts. I haven't linked much to my LiveID yet, as I don't really want to share that much with Microsoft, but I'll need it to use SkyDrive and to rate and review the new features.

I also did laundry, vacuumed, cleaned out all my old programming books (anyone want a copy of Inside C# 2 from 2002?), and will now go shopping. And I promise never to share that level of picayune personal detail again on this blog.