Buddy Can You Spare a Few Bytes?

Another day in the Maelstrom of Badly Designed Software, this time involving not just an operating system function but also boot firmware. Geez, if you can’t trust your boot firmware, what can you trust?

This all began when the server which runs our DVR software (Sage; a great open source product — check it out at the community forum) began acting rather oddly. Truth to tell, this had been going on for months, but since we’ve switched about 90% of our TV watching to streaming services it was low on the priority list of things to fix. But since I had the great idea of saving my back and knees by relocating the server out of the hall closet to a little nook upstairs, I thought I might as well figure out what was wrong.

That ended up being a huge time sink and hair-puller. As my dad always said, it’s the five minute jobs that take two hours. Unfortunately, with computers the ratio is often 100 or more to 1, not 24.

I thought the problem was that one of the hard drives comprising a Windows Storage Space array was failing. No problem; I’d had to replace a failing drive once before (I’m looking at you, Seagate — can you please up your quality control??) and while it’s time consuming, it’s pretty straightforward.

The basic concept behind Storage Spaces is that you assign a bunch of individual drives to a pool and Windows virtualizes them into One Big Honking Drive. With built-in redundancy and error-checking. Expandable at will. Replaceable at will. Or so I thought.

Turns out things don’t go so well when (a) more than one drive fails at the same time (really, thanx again, Seagate!!) and (b) you’ve used up all the SATA ports on your motherboard. Kinda hard to “just add another drive” when there’s nothing to hook it to, and if you can’t add another drive, Storage Spaces won’t let you gracefully degrade the pool (e.g., shift whatever’s on the failing drives to the good drives). So even though my pool had enough space available to hold all the data on its good drives, I was stuck. Gotta be able to add in order to remove. Bizarre.

The resulting confusion and hair-pulling ultimately led to me copying what files I could out of the pool onto a new (Western Digital) drive hooked up to an add-in SATA card I installed. The net result was the total loss of my desktop system’s file history (the server also plays that role), various backups of other systems, and about 75% of our recorded TV shows. Fortunately that was Really Bad rather than Unbelievably Disastrous since, as I mentioned, we don’t use our DVR much anymore.

Because no Really Bad Computer Day is complete with just one set of problems, I also had to fight with the Gigabyte P55 USB3 motherboard powering the server. It turns out that if the boot process can “see” a hard drive, but can’t identify it, it just stalls. Without any message or beep code or alert of any kind. And either one of the built-in SATA ports is flaky or they have to be “consumed” in a particular order (e.g., master before slave on a given channel), so… It’s disturbing to plug drives in and have them work, only to plug the same drives into different ports and have the system freeze. With no hint as to what’s wrong.

Now, space is admittedly at a premium for firmware, so it’s not like it can contain a robust error reporting system. OTOH, modern firmware does contain a lot of stuff, including a number of messages. Would it really have been so hard to include “Uh, drive seen but not recognized on SATA port X”? Such a message would be really helpful, and leaving it out violates what I consider to be one of the most important rules of well-designed software: don’t leave the user hanging. Log something, somewhere (screen, log file, carrier pigeon); the location doesn’t matter, so long as it’s known.

There’s nothing worse than trying to figure out a problem with no information as to what it is. It forces you to go into trial and error mode, also known as Keep Moving Everything Around Until It Mysteriously Starts Working Again. Not a pleasant experience, and not one that anyone should have to experience…so long as the software is well-designed.
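To make the “don’t leave the user hanging” rule concrete, here is a minimal sketch in Python. It is purely illustrative: real boot firmware is not written in Python, and the names probe_ports, DriveNotRecognized, and fake_identify are made up for this example, not taken from any actual firmware or library. The point is only that a failed identification gets reported, with the port number, and the loop moves on instead of stalling.

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("boot")


class DriveNotRecognized(Exception):
    """Hypothetical error: something answered on the port but could not be identified."""


def probe_ports(ports, identify):
    """Identify the drive on each port; report every failure instead of stalling.

    ports    -- iterable of port numbers where a device was detected
    identify -- callable(port) -> model string, raising DriveNotRecognized on failure
    """
    recognized = {}
    problems = []
    for port in ports:
        try:
            recognized[port] = identify(port)
        except DriveNotRecognized as exc:
            # The whole point: say what went wrong and where, then keep going.
            message = f"Drive seen but not recognized on SATA port {port}: {exc}"
            log.error(message)
            problems.append(message)
    return recognized, problems


def fake_identify(port):
    """Stand-in for real hardware: port 3 pretends to be the flaky one."""
    if port == 3:
        raise DriveNotRecognized("no response to identify request")
    return f"Example Drive {port}"


if __name__ == "__main__":
    recognized, problems = probe_ports([0, 1, 3], fake_identify)
    print(recognized)
    print(problems)
```

Collecting the problems in a list as well as logging them means whatever runs next (a boot menu, a status screen) can show them to the user rather than leaving them buried in a log nobody reads.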

The morals of the story? A few:

  • If you use Windows Storage Spaces, always leave some unused hard drive ports available in your system.
  • Better yet, think really hard about using Windows Storage Spaces without a full-time IT staff (I’ve abandoned it based on this experience).
  • If your motherboard appears to freeze during the early stages of the boot process, consider that it may be having problems recognizing hard drives but is too ashamed to let you know that.

Keeping It Alive

I posted this on nVidia’s support forum, but felt it worth perpetuating somewhere else.

===

I was greatly relieved to see how nVidia is doing such a fine job of keeping alive the beautiful experience of having display drivers crash in the middle of work. Frankly, before I bought my GeForce 210 — running under Windows 10 — it’d been more than a decade since I’d enjoyed the fun of losing work by having a video driver crash and take down my entire system. Now I get to enjoy the ride every other day!

I also really appreciate how the nVidia Control Panel, and the nVidia Experience app, always display error messages when they open up. My particular favorite is “nVidia not available, please try again later”. I view that as a wonderful commentary on the demonstrated quality of nVidia’s software.

By dint of great effort, and working through repeated error messages, I believe my drivers are all up to snuff (I’m currently at version 341.95). I know that Windows 10 is completely up to date, since that happens automatically.

Thanx, nVidia, for perpetuating a key part of the computing experience that I feared had been lost forever.

“Please Contact Windows”

I’m a long-time user of the Adobe Creative Suite. So I’m more familiar with Adobe software than I’d like to be…because it is generally insanely great, from a creative point of view, and all too often not very well written, from a nuts-and-bolts point of view.

[Read more…]

Sometimes a Mind Wipe Helps

The other day the system drive, the one containing Windows and all my programs, died unexpectedly. As in, I didn’t have a backup for it.

Lesson #23,781 learned: never run any solid state drive without a robust backup process. Actually, that’s a revision to lesson #23,103 (“never run a cheap solid state drive without a robust backup process”). Apparently, all solid state drives are both wickedly fast and notoriously unreliable. Compared to dinosaur-like spinning platter drives, at least.

So I got to experience the joys of re-installing everything, including Windows 8, from the ground up.

Actually, it wasn’t all that bad: Win8 installs much faster than previous versions, and my mainstay apps (Microsoft Office and Adobe Master Suite) are sufficiently out-of-date that I didn’t run into any “you’ve already installed our software on another computer!” nonsense. I guess software companies don’t own real computers that, you know, catastrophically fail at unexpected times. Or maybe they all do regular backups.

I also noticed some benefits from doing a clean install: the nifty Win8 power management features now work properly. I can shut down my system in seconds, and restore it almost as quickly. In fact, I bet if I had a recent “instant on” motherboard the restore would probably be as fast as the shutdown. It’s also nice that it gets restored to exactly where I left off, with the same apps and documents open, although that feature’s been around for a while.

Tick…Tick…Tick

I managed to dodge a bullet today.

One of the hard drives in my main desktop system has been ticking for several months now. The problem first appeared after I installed Windows 8, so I naturally assumed it had something to do with the new OS. My research into the matter seemed to confirm that, when I found a number of reports of drive ticking caused by overly aggressive head parking by Windows under some circumstances.

But none of the fixes that others used to solve their ticking problems worked in my case. So I did some more digging, and learned that the far more common reason for a drive to be ticking is that it’s about to die.

It would be particularly painful for this specific drive to die because it has all my documents on it, including multiple gigabytes of family photos and videos. And I **blush** don’t do backups as often or as thoroughly as I should.

Replacing the drive and cloning the data from the old one to the new one solved the problem. No more distracting ticking! More importantly, much less risk of losing precious data!

Repeat after me, ten times: “Post hoc ergo propter hoc”. Which is Latin for “after this, therefore because of this”. And is a very, very famous logical fallacy.

Which I often quote to others, and should have remembered myself in this instance.

Good Samaritans Do Exist

Actually, when you stop and think about it, there are far more decent people in the world than jerks. It’s just that the actions of the jerks give them a social footprint far exceeding that of the majority.

Today I met yet another Good Samaritan, a gentleman who found my Surface tablet sitting on a park bench, and took it upon himself to secure it pending the owner’s returning for it. When neither Barbara nor I showed up he took it with him, used the login screen information to track me down, and contacted me to arrange for its return.

As luck would have it, he’s also an interesting person whose profession may well be useful in moving the San Carlos community ahead.

Perhaps it’s true that nothing happens by accident :).

Does Bank of America Hate Californians?

Or at least the ones who use Quicken to manage their personal finances with Bank of America’s online banking services? I don’t know, but it sure seems like that could be the case, based on my recent experience.

[Read more…]

MAC Addresses and Typos and Fonts, Oh My!

My old workhorse Brother black & white laser printer/scanner/copier/fax machine finally bit the dust. So naturally I went out and bought another Brother all-in-one to replace it. Why? Because I figure producing a good product 8 years ago means an even better product today.

Boy was I wrong.

[Read more…]