Friday, December 18, 2015

Woe-ARIA: aria-describedby: To Report or Not to Report?


In my last post, I waxed lyrical about the surprising complexity of the seemingly simple aria-label/ledby. Thanks to those who took the time to read it and provide their valuable thoughts. In particular, Steve Faulkner commented that he’d started working on “doc/test files suggesting what screen readers should announce from accname/description info“. Talk about responsive! Thanks Steve! His inclusion of description opened up another can of worms for me, so I thought I’d continue the trend and let the worms spill out right here. Thankfully, this particular can is somewhat smaller than the first one!

What are you on about this time?

Steve’s new document suggests that for an a tag with an href, screen readers should:
Announce accname + accdescription (if present and different from acc name), ignore element content.
I don’t agree with the “ignore element content” bit in all cases; see the “Why not just use the accessible name?” section of my label post for why. However, the bit of interest here is the suggestion that accDescription should be reported.

Well, of course it should! The spec says!

The spec allows elements to be described, so many argue that it logically follows that a supporting screen reader should always read the description. I strongly disagree.
While the label is primary information for many elements (including links), I believe the description is “secondary” information. The ARIA spec says that:
a label should be concise, where a description is intended to provide more verbose information.
“More verbose information” is the key here. It is reasonable to assume that users will not always be interested in this level of verbosity. If the information was important enough to be read always, why not just stick it in the label?

What on earth do you mean by secondary information?

I think of descriptions rather like tooltips. A tooltip isn’t always on screen, but rather, appears only when, say, the user moves their mouse over the associated element. The information is useful, but the user doesn’t always need to see it. They only need to see it if the element is of particular interest.
The HTML title attribute is most often presented as a tooltip and… wait for it… is usually presented as the accessible description (unless there’s no name).

But most screen reader users don’t use a mouse!

Quite so. But moving the mouse to an element can be generalised: some gesture that indicates the user is specifically interested in/wishes to interact with this element. When a user is just reading, they’re not doing this.

Why is this such a big deal?

Imagine you’re reading an article about the changing landscape of device connectors in portable computers over the years:
<p>There have been many different types of connections for peripheral devices in portable computers over the years: <a href="pcmcia" title="Personal Computer Memory Card International Association">PCMCIA</a>, <a href="usb" title="Universal Serial Bus">USB</a> and <a href="sata" title="Serial ATA">E-SATA</a>, just to name a few.</p>
(I use the title attribute here because it’s easier than aria-describedby, but the same could be done with aria-describedby.)
Imagine you’re reading this as a flat document, either line by line or all at once. Let’s check that out with all descriptions reported:
There have been many different types of connections for peripheral devices in portable computers over the years: link, PCMCIA, Personal Computer Memory Card International Association, link, USB, Universal Serial Bus, and link, E-SATA, Serial ATA, just to name a few.
Wow. That’s insanely verbose and not overly useful unless I’m particularly interested in the linked article. And that’s just one small sentence! If sighted users don’t have to see this all the time, why should I as a screen reader user?
Here’s another example based loosely on an issue item in the NVDA GitHub issue list:
<a href="issue/5612">Support for HumanWare Brailliant B using USB HID</a>
<a href="label/Braille" title="View all Braille issues">Braille</a>
<a href="label/enhancement" title="View all enhancement issues">enhancement</a><br>
opened <span title="16 Dec. 2015, 9:49 am AEST">2 days ago</a>
by <a href="user/jcsteh" title="View all issues opened by jcsteh">jcsteh</a>
Let’s read that entire item with descriptions:
link, Support for HumanWare Brailliant B using USB HID, link, Braille, View all Braille issues, link, enhancement, View all enhancement issues, #5612 opened 2 days ago, 16 Dec. 2015, 9:49 am AEST, by jcsteh, View all issues opened by jcsteh
In what universe is that efficient?

Slight digression: complete misunderstanding of description

As an aside, GitHub’s real implementation of this is actually far worse because they incorrectly use the aria-label attribute where I’ve used the title attribute, so you lose the real labels altogether. You get something like this:
link, Support for HumanWare Brailliant B using USB HID, link, View all Braille issues
which doesn’t even make sense. David MacDonald outlined this exact issue in his comment on my label post:
The most common mistake I’m correcting for aria-label/ledby is when it over rides the text in the element, or associated label and when that text or associated html label is important. For instance, a bit of help text on an input. They should use describedby but they don’t understand the difference between accName and accDescription.
Still, the spec is fairly clear on this point, so I guess this one is just up to evangelism.

So are you saying description should never be read? What’s the point of it, then?

Not at all. I’m saying it shouldn’t “always” be read.

When, then?

When there is “some gesture that indicates the user is specifically interested in/wishes to interact with this element”. For a screen reader, simply moving line by line through a document doesn’t satisfy this. Sure, the user is interacting with the device, but that’s because screen readers inherently require interaction; they aren’t entirely passive like sight. For me (and, surprise surprise, for NVDA), this “gesture” means something like tabbing to the link, moving to it using single letter navigation, using a command to query information about the current element, etc.

But VoiceOver reads it!

With VoiceOver, you usually move to each element individually. You don’t (at least not as often) move line by line (like you do with NVDA), where there can be several elements reported at once. With the individual element model, it makes sense to read the description because you’re dealing with a single element at a time and the user may well be interested in that specific element. And if the user really doesn’t care about it, they can always just move on to the next element early.

So now you’re saying we can’t have interoperability. Dude, make up your mind already!

Recall this from my last post:
If we want interoperability, we need solid rules. I’m not necessarily suggesting that this be compulsory or prescriptive; different AT products have different interaction models and we also need to allow for preferences and innovation.
This is one of those “different interaction models” examples.
Rich Schwerdtfeger commented on my last post:
The problem we have with AT vendors is that many have lobbied very hard for us to NOT dictate what they should do.
Examples like these are one reason AT vendors push back on this.

So, uh, what are we supposed to do?

I’m optimistic that there’s a middle ground: guidelines which allow for reasonable interoperability without restricting AT’s ability to innovate and best suit their users’ needs. As in software development, a bit of well-considered abstraction goes a long way to ensuring future longevity.
In this case, perhaps the guidelines could use the “secondary content” terminology I used above or something similar. They might say that for an a tag with an href, the name should be presented as the primary content if overridden using aria-label/ledby and the description should be treated as secondary content. This leaves it up to the AT vendor to decide exactly when this secondary content is presented based on the interaction model, while still providing some idea of how to best ensure interoperability.


  1. I kind of wish there wasn't an override feature to the accessible name calculation. I wish it was a concatenation model and users could decide how much they want to hear or not hear using their settings, with the default being it's all concatenated, and the user can move on early if they want.

    1. Leaving this stuff up to the user just isn't okay. That's more or less a cop-out for bad authoring and/or spec problems. I provided examples in my label post about why concatenation as a general rule would be a bad idea. Leaving this up to the user means the user has to constantly reconfigure their AT for each site which decides to do things differently.