The Daily Parker

Politics, Weather, Photography, and the Dog

Weakest link in the chain

I had planned to post some photos tonight showing the evolution of digital cameras, using a local landmark, but there's a snag. The CF card reader I brought along isn't showing up on my computer, even though the computer acknowledges that something is attached through a USB port.

As I'm visiting one of the most sophisticated and technological cities in the world, I have no doubt I can fix this tomorrow. Still, it's always irritating when technology that worked a few days ago simply stops working.

For those doubting my troubleshooting skills, I have confirmed that the CF card has all the photos I shot today; that the computer can see the CF card reader; and that the computer can connect effectively to other USB attachments. The problem is therefore either in the OS or in the card reader, and I'm inclined to suspect the card reader.

It's not the good times they care about, it's the bad

The repercussions from Monday's data-recovery debacle continued through yesterday.

By the time business started Tuesday morning, I had restored the client's application and database to the state it had at the moment of the upgrade, and I'd entered most of their appointments, including all of them through tomorrow (Thursday). When the client started their day, everything seemed to be all right, except for one thing I also didn't know about their business: some of their customers pay them based on the appointment ID, which is nothing more than a SQL IDENTITY column in the database.

If you know how databases work, you know that IDENTITY columns are officially non-deterministic. In this specific case, the column increments by one every time it adds a row, but also in this specific case, I didn't re-enter the data in the same order it was originally entered, since I prioritized the earlier appointments.
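To make the ordering problem concrete, here is a minimal sketch. It uses Python's built-in sqlite3 module, with an AUTOINCREMENT column standing in for the SQL IDENTITY column. The table and appointment names are made up for illustration and are not the client's actual schema.

```python
import sqlite3

def insert_all(names):
    """Insert appointments in the given order; return {name: generated id}."""
    conn = sqlite3.connect(":memory:")
    # AUTOINCREMENT is a stand-in (assumption) for a SQL Server IDENTITY column.
    conn.execute(
        "CREATE TABLE appointment (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)"
    )
    ids = {}
    for name in names:
        cur = conn.execute("INSERT INTO appointment (name) VALUES (?)", (name,))
        ids[name] = cur.lastrowid  # the ID the database handed out
    conn.close()
    return ids

# Order in which the rows were first created
original = insert_all(["next-week", "tomorrow", "today"])
# Order in which they were re-entered: earliest appointment first
reentered = insert_all(["today", "tomorrow", "next-week"])

changed = sorted(n for n in original if original[n] != reentered[n])
print(changed)  # ['next-week', 'today']
```

Same rows, different insertion order, different IDs. Anyone who wrote down the old ID now points at the wrong appointment, or at nothing.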

We've gotten through the problem now, and the client no longer wants to put my head on a spike, so I will now take a moment for an after-action review that might help other software developers in the future.

First, the things I did right:

  • When I deployed the upgrade Saturday, I preserved the state of the database and application at exactly that moment.
  • All of the data in the system, every field of it, was audited. It was trivially easy to produce a report of every change made to the system from roll-out Saturday afternoon through roll-back Monday night.
  • When I rolled back the upgrade Monday night, I preserved the state of the upgraded database and application at exactly that moment.
  • When the client first noticed the problem, I dropped everything else and worked out a plan with them. The plan centered around getting their business back up first, and then dealing with the technology.
  • Their customers were completely back to normal at the start of business Tuesday.
  • The application runs on Windows Azure, which made preserving the old application state not only easy, but possible.

So what should I have done better?

  • My biggest error was overconfidence in my ability to roll back the upgrade. No matter what other errors I made, this was the root of all of them.
  • The second major error was not testing the UI on Internet Explorer 8. Mitigating this was the fact that neither I nor my client was aware that the bulk of their customers used IE8. However, given that people using IE8 were totally unable to use the application, even if the number of customers using IE8 had been very small, the large impact should have put IE8 near the top of my regression test checklist.
  • Instead of spending a couple of hours re-entering data, I should have written a script to do it.
  • I have always regretted (though never more than today) exposing the appointments IDENTITY column to the end user, because naturally they'd use this ID for business purposes. This illustrates the danger (not just the sloppy design) of using a single database field for two purposes. Any future version of the application will have an OrderID field that is not a database plumbing field.
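The last point can be sketched in a few lines. This is only an illustration under stated assumptions: sqlite3 stands in for the real database, and the column names, order numbers, and dates are hypothetical. How the application would actually issue OrderID values is left out.

```python
import sqlite3

SCHEMA = """
CREATE TABLE appointment (
    id        INTEGER PRIMARY KEY AUTOINCREMENT, -- plumbing, never shown to users
    order_id  TEXT NOT NULL UNIQUE,              -- business key customers pay against
    starts_on TEXT NOT NULL
)
"""

def build(rows):
    """Create a fresh database and insert the rows in the order given."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO appointment (order_id, starts_on) VALUES (?, ?)", rows
    )
    return conn

def lookup(conn, order_id):
    """Return (plumbing id, start date) for a business key, or None."""
    return conn.execute(
        "SELECT id, starts_on FROM appointment WHERE order_id = ?", (order_id,)
    ).fetchone()

rows = [("A-1001", "2014-03-25"), ("A-1002", "2014-03-19")]
before = build(rows)
after = build(sorted(rows, key=lambda r: r[1]))  # re-entered earliest first

print(lookup(before, "A-1001"))  # (1, '2014-03-25')
print(lookup(after, "A-1001"))   # (2, '2014-03-25')
```

The plumbing id changes when the data is re-entered in a different order, but the order number the customer sees still finds the right appointment.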

All in all, the good things outweighed the bad, and I may get back in my client's good graces when I roll out the next update. You know, the one that works on IE8, but still solves the looming problem of the platform's age.

And the day started so well...

At 8:16 this morning, a long-time client sent me an email saying that one of his customers was getting a strange bug in their scheduling application. They could see everything except for the tabbed UI control they needed to use. In other words, there was a hole in the screen where the data entry should have been.

Here's how the rest of the day went around this issue. It's the kind of thing that makes me proud to be an engineer, in the same way the guys who built Galloping Gertie were proud.

It all started when I updated a Windows Azure cloud service from the no-longer-supported SDK 1.7 running on Windows Server 2008 to the current SDK (2.2) and operating system (Windows Server 2012 R2). I also upgraded the framework from .NET 4.0 to .NET 4.5.1, which is only possible on WS2012R2.

This upgrade started months ago, and proceeded slowly because both I and the clients had other priorities. I mean, who wants to spend a lot of money upgrading a platform without upgrading the application running on it? So the last build of the application went to production in October, and I haven't touched it since. I mean, it worked fine, why mess with it? Other than the fact that the operating system and Azure SDK are no longer supported.

Before pushing the update, I thoroughly tested the application. I mean, unit tests up the ying, with a tens-of-steps-long regression test on my local, and on an Azure test instance, before even looking askance at the Production instance. When I had tested everything I could imagine, I did this:

  1. Stopped the application, to ensure no one changed any data during the upgrade.
  2. Made a full copy of the production database ("CREATE DATABASE productioncopy AS COPY OF production")
  3. Once the data was fully copied, I uploaded the new bits to the Staging slot of the application.
  4. I updated the configuration info to the current standards.
  5. VIP swap! (I swapped the staging and production instances, so the old production instance was now in the staging slot.)
  6. And....it's running just fine. All that planning and testing worked!

So what happened? Well, it turns out there's one thing I didn't anticipate: Internet Explorer 8, released five years ago Thursday, and known to have difficulties with JavaScript. Plus, the controls we used when we originally deployed in January 2008, made by Infragistics, have known incompatibilities with IE8, but again: the application has worked just fine the whole time.

Since everything worked just fine on earlier versions of the application, and since this update didn't directly change the UI, and since IE8 hasn't been supported in quite some time, I figured there wouldn't be any problems.

It turns out that a sizable portion of my client's customers use IE8, because they're big hospitals with big IT departments and little budgets for updates.
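One way to find this out before an upgrade rather than after is to tally the User-Agent strings in the web server logs. The sketch below assumes the User-Agent values have already been pulled out of the logs; the function names and sample strings are illustrative. It relies only on the "MSIE <version>" token that Internet Explorer 10 and earlier put in their User-Agent strings.

```python
import re
from collections import Counter

# IE 10 and earlier identify themselves with an "MSIE <major>.<minor>" token.
MSIE = re.compile(r"MSIE (\d+)\.")

def browser_family(user_agent):
    """Return 'IE<major>' for IE 10 and earlier, otherwise 'other'."""
    match = MSIE.search(user_agent)
    return f"IE{match.group(1)}" if match else "other"

def browser_share(user_agents):
    """Map each browser family to its fraction of the requests."""
    counts = Counter(browser_family(ua) for ua in user_agents)
    total = sum(counts.values())
    return {family: count / total for family, count in counts.items()}

# Hypothetical sample; real input would come from the server's request logs.
sample = [
    "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0)",
    "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; Trident/4.0)",
    "Mozilla/5.0 (compatible; MSIE 10.0; Windows NT 6.1; Trident/6.0)",
    "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/33.0.1750.154 Safari/537.36",
]
print(browser_share(sample))  # {'IE8': 0.5, 'IE10': 0.25, 'other': 0.25}
```

Two caveats: IE11 dropped the MSIE token, so it lands in "other" here, and IE in compatibility view can report an older version than the one actually installed. Treat the numbers as a rough guide to what belongs on the regression checklist.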

Once I realized with abject horror that the application was simply broken for most of the people using it, I resigned myself to rolling back to the previous release, which had worked just fine. When I got home, I started this task, and the following things happened:

  1. Once again, I stopped the application.
  2. The actual database restore went fine, as did the VIP swap putting the previous version back in the Production slot and the new version in the Staging slot.
  3. When the application started up, I realized I'd forgotten to roll back the configuration information for the logging and messaging component. So the application failed to start.
  4. I rolled back the config.
  5. The application again failed to start. Only now, because the logging and messaging component is the part that's failing, I can't see any diagnostics.
  6. Fortunately, I deployed the application with Remote Desktop enabled, so I tried connecting to the virtual machine directly.
  7. The Remote Desktop user account had expired.
  8. Fortunately I use great source control. In Mercurial, I updated to the last production build before the update, and loaded it into Visual Studio.
  9. Tried to load into Visual Studio, and failed. See, I no longer have the Azure SDK v1.7. I never installed it on this machine, in fact. I'm running SDK 2.2, and I have no easy way of running an older version.
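Most of the surprises in that list could have been caught by a pre-flight check run before the upgrade. Here is a rough sketch of one. Every input is supplied by hand, nothing here queries Azure, and the function name, parameters, and sample dates are hypothetical.

```python
from datetime import date

def rollback_blockers(today, rdp_account_expires, installed_sdks,
                      sdk_for_old_build, old_config_archived):
    """Return a list of reasons a rollback would get stuck; empty means go."""
    blockers = []
    if rdp_account_expires <= today:
        blockers.append("Remote Desktop account has expired")
    if sdk_for_old_build not in installed_sdks:
        blockers.append(
            f"SDK {sdk_for_old_build} is not installed on the build machine"
        )
    if not old_config_archived:
        blockers.append("previous configuration is not archived with the release")
    return blockers

# Hypothetical values resembling the situation described above.
problems = rollback_blockers(
    today=date(2014, 3, 17),
    rdp_account_expires=date(2014, 1, 1),  # assumed date, for illustration only
    installed_sdks={"2.2"},
    sdk_for_old_build="1.7",
    old_config_archived=False,
)
for problem in problems:
    print("BLOCKER:", problem)
```

The point is less the code than the habit: list what a rollback depends on, and confirm each item while the old version is still running.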

So, as far as I knew at this point, there was simply no way to get into the application, and no way for me to re-upload the old version.

I decided to try a different tack. I rolled back the rollback and restarted the new version. I also started trying to get my last remaining Windows XP machine running so that I could confirm the bug, and start testing fixes on a Test instance running Windows Server 2012 R2.

Getting a 10-year-old laptop to boot, let me log in, stop wasting time with all the detritus it acquired in its years of service, connect to my network, and open up IE8, took 45 minutes.

Some time in there I walked Parker.

So now, I can see that the error exists in IE8, and I also have found an article on how to reset the RDP password expiration date. Only, I'm really tired, and I am worried I'll make stupid errors if I keep trying to debug this right now.

So I have two approaches I will try first thing in the morning: first, roll back to the October release, and manually update the RDP expiration date so I can remote in and debug the configuration problem. Then I'll have to re-create all the data my client added yesterday, which will take me at least an hour. If I'm supremely lucky I'll have this done by 8am. Since I've had no luck at all so far on this upgrade, I am not optimistic.

Second, I'll start removing the outdated Infragistics code. Believe it or not, jQuery works fine on IE8, despite it being pretty much the latest thing in user interface languages. It's the custom crap Infragistics pushed out years ago that fails. Unfortunately I won't be able to deploy this before leaving on Thursday morning. Fortunately the application isn't going to stop working suddenly; the OS and SDK are no longer supported, but they won't actually turn the OS off until June.

And there's the irony in a nutshell. I thought I did everything right in the deployment cycle, especially the part where I got three months ahead of the due date. The things that went wrong to get me to this state of frustration and exhaustion were numerous and tiny, kind of like the things that go wrong to cause an aviation accident. That said, the client has suffered no data loss, and I preserved a whole catalog of options to fix the problem (relatively) quickly. This isn't the disaster it would have been without the deployment tools you get with Azure.

Plus, I've learned to test everything on IE8 whenever health care companies are involved. Sheesh.

Why Ravenswood instead of, say, Lakeview?

Here are about 30 reasons, just from the last 48 hours:

CWB estimates that 21 people were taken into police custody during Wrigleyville's Saturday-into-Sunday St. Patrick's binge.

But there was only one tazing. (Rats!)

28 batteries were witnessed or otherwise confirmed by police. Few were formalized with police reports.

Ambulances took at least 17 people to area hospitals and officers were tied up with at least 19 calls from cab drivers who had disputes with their passengers over payment.

Here, now, are the notable moments in this year's green-laden blow out in the area (with a splash of Lincoln Park tossed in):

[Saturday,] 12:36PM - Huge party in an apartment, 600 block of Cornelia. It's big, it's loud, and people are urinating out the windows.

It goes on from there, and it doesn't even include yesterday's mishigos on Lake Shore Drive.

Meanwhile, back in Chicago

Some asshole with a gun and an arrest warrant has blocked the entire length of North Lake Shore Drive as every cop in Illinois tries to prise him from his car:

A car chase through the South Side and downtown involving a man wanted in connection with a murder in Georgia ended with a standoff between the man and police after the vehicle crashed on Lake Shore Drive on the Near North Side, officials and witnesses said.

According to Harvey Police Department spokeswoman Sandra Alvarado, the man in the vehicle police were pursuing is wanted in connection with a murder in Hampton, Ga. Alvarado said that at 12:24 p.m. today, Harvey police had been contacted by the Henry County Sheriff's Office asking for help in locating a homicide suspect. Harvey police were given a description of the vehicle, its registration, GPS location and arrest warrant information on the suspect, who was wanted in connection with a March homicide. Alvarado did not name the suspect.

Harvey police located the vehicle, which fled from officers about 12:27 p.m., beginning a chase on highways and interstates in the south suburbs and on the South Side of Chicago. Eventually the vehicle ended up on South Lake Shore Drive, and then North Lake Shore, where it crashed about 1:10 p.m. near Fullerton Parkway. The dark-colored vehicle came to rest in the grass just to the east of the northbound lanes there and police were seen surrounding it with guns drawn, pointing at the vehicle.

This seems like an overreaction, but I'm not a cop. I will say that it took me nearly 90 minutes to get from Wilmette to home this afternoon, which happens when the 40,000 cars that would ordinarily go down Lake Shore Drive during that period instead go down Broadway, Clark, Halsted, and Ashland.

The incident is still going on about 800 meters from my apartment. I'll know it's over when the news helicopters bugger off.

Crimean referendum finishes days after ballots counted

As predicted last week in private Kremlin memoranda, today's referendum in Crimea has determined that more people on the peninsula support union with Russia than actually live on the peninsula. As someone once said more eloquently than I:

But here's the BBC:

Some 95.5% of voters in Crimea have supported joining Russia, officials say, after half the votes have been counted in a disputed referendum.

Crimea's leader says he will apply to join Russia on Monday. Russia's Vladimir Putin has said he will respect the Crimean people's wishes.

Some review, I think, is in order.

First, Crimea doesn't have a leader that can apply for union with Russia any more than Long Island has a leader that can apply for union with Bermuda.

Second, Putin's respect for the Crimean people's wishes notwithstanding, I can't decide if we're back in 1980, 1939, 1914, or 1836; regardless, it's nice to have the USSR back in town as we're all sick of terrorists.

Third, is there anyone who thinks about these things seriously and believes that this action shows anything other than Russian weakness? Authoritarian leaders always make this mistake, and they wind up destroying their countries. You can't conquer your way to security. Just ask, well, us.

Doomed to repeat it

The news recently and Krugman this morning have brought Tennyson to mind:

Theirs not to make reply,
Theirs not to reason why,
Theirs but to do and die:
Into the valley of Death
  Rode the six hundred.

Heroism has its place, but not when it takes everyone else through hell.