Friday, December 18, 2015

Woe-ARIA: aria-describedby: To Report or Not to Report?


In my last post, I waxed lyrical about the surprising complexity of the seemingly simple aria-label/ledby. Thanks to those who took the time to read it and provide their valuable thoughts. In particular, Steve Faulkner commented that he’d started working on “doc/test files suggesting what screen readers should announce from accname/description info”. Talk about responsive! Thanks Steve! His inclusion of description opened up another can of worms for me, so I thought I’d continue the trend and let the worms spill out right here. Thankfully, this particular can is somewhat smaller than the first one!

What are you on about this time?

Steve’s new document suggests that for an a tag with an href, screen readers should:
Announce accname + accdescription (if present and different from acc name), ignore element content.
I don’t agree with the “ignore element content” bit in all cases; see the “Why not just use the accessible name?” section of my label post for why. However, the bit of interest here is the suggestion that accDescription should be reported.

Well, of course it should! The spec says!

The spec allows elements to be described, so many argue that it logically follows that a supporting screen reader should always read the description. I strongly disagree.
While the label is primary information for many elements (including links), I believe the description is “secondary” information. The ARIA spec says that:
a label should be concise, where a description is intended to provide more verbose information.
“More verbose information” is the key here. It is reasonable to assume that users will not always be interested in this level of verbosity. If the information was important enough to be read always, why not just stick it in the label?

What on earth do you mean by secondary information?

I think of descriptions rather like tooltips. A tooltip isn’t always on screen, but rather, appears only when, say, the user moves their mouse over the associated element. The information is useful, but the user doesn’t always need to see it. They only need to see it if the element is of particular interest.
The HTML title attribute is most often presented as a tooltip and… wait for it… is usually presented as the accessible description (unless there’s no name).

But most screen reader users don’t use a mouse!

Quite so. But moving the mouse to an element can be generalised: some gesture that indicates the user is specifically interested in/wishes to interact with this element. When a user is just reading, they’re not doing this.

Why is this such a big deal?

Imagine you’re reading an article about the changing landscape of device connectors in portable computers over the years:
<p>There have been many different types of connections for peripheral devices in portable computers over the years: <a href="pcmcia" title="Personal Computer Memory Card International Association">PCMCIA</a>, <a href="usb" title="Universal Serial Bus">USB</a> and <a href="sata" title="Serial ATA">E-SATA</a>, just to name a few.</p>
(I use the title attribute here because it’s easier than aria-describedby, but the same could be done with aria-describedby.)
Imagine you’re reading this as a flat document, either line by line or all at once. Let’s check that out with all descriptions reported:
There have been many different types of connections for peripheral devices in portable computers over the years: link, PCMCIA, Personal Computer Memory Card International Association, link, USB, Universal Serial Bus, and link, E-SATA, Serial ATA, just to name a few.
Wow. That’s insanely verbose and not overly useful unless I’m particularly interested in the linked article. And that’s just one small sentence! If sighted users don’t have to see this all the time, why should I as a screen reader user?
Here’s another example based loosely on an issue item in the NVDA GitHub issue list:
<a href="issue/5612">Support for HumanWare Brailliant B using USB HID</a>
<a href="label/Braille" title="View all Braille issues">Braille</a>
<a href="label/enhancement" title="View all enhancement issues">enhancement</a><br>
#5612 opened <span title="16 Dec. 2015, 9:49 am AEST">2 days ago</span>
by <a href="user/jcsteh" title="View all issues opened by jcsteh">jcsteh</a>
Let’s read that entire item with descriptions:
link, Support for HumanWare Brailliant B using USB HID, link, Braille, View all Braille issues, link, enhancement, View all enhancement issues, #5612 opened 2 days ago, 16 Dec. 2015, 9:49 am AEST, by link, jcsteh, View all issues opened by jcsteh
In what universe is that efficient?

Slight digression: complete misunderstanding of description

As an aside, GitHub’s real implementation of this is actually far worse because they incorrectly use the aria-label attribute where I’ve used the title attribute, so you lose the real labels altogether. You get something like this:
link, Support for HumanWare Brailliant B using USB HID, link, View all Braille issues
which doesn’t even make sense. David MacDonald outlined this exact issue in his comment on my label post:
The most common mistake I’m correcting for aria-label/ledby is when it over rides the text in the element, or associated label and when that text or associated html label is important. For instance, a bit of help text on an input. They should use describedby but they don’t understand the difference between accName and accDescription.
Still, the spec is fairly clear on this point, so I guess this one is just up to evangelism.

So are you saying description should never be read? What’s the point of it, then?

Not at all. I’m saying it shouldn’t “always” be read.

When, then?

When there is “some gesture that indicates the user is specifically interested in/wishes to interact with this element”. For a screen reader, simply moving line by line through a document doesn’t satisfy this. Sure, the user is interacting with the device, but that’s because screen readers inherently require interaction; they aren’t entirely passive like sight. For me (and, surprise surprise, for NVDA), this “gesture” means something like tabbing to the link, moving to it using single letter navigation, using a command to query information about the current element, etc.
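To make that rule concrete, here is a minimal Python sketch. The `Reason` enum and the `speak_link` function are hypothetical names invented for illustration, not NVDA's actual code; the only point is that the description is appended when the user has targeted the element, and left out when they are simply reading through it.

```python
from enum import Enum, auto


class Reason(Enum):
    """Why an element is being presented (hypothetical categories)."""
    READING = auto()    # moving line by line or reading everything
    FOCUS = auto()      # tabbing to the element
    QUICK_NAV = auto()  # single letter navigation
    QUERY = auto()      # explicit command asking about the current element


# Gestures suggesting the user is specifically interested in the element.
TARGETED = {Reason.FOCUS, Reason.QUICK_NAV, Reason.QUERY}


def speak_link(name: str, description: str, reason: Reason) -> str:
    """Return what would be spoken for a link, treating description as secondary."""
    parts = ["link", name]
    # Skip descriptions that are empty or merely repeat the name.
    if reason in TARGETED and description and description != name:
        parts.append(description)
    return ", ".join(parts)


print(speak_link("USB", "Universal Serial Bus", Reason.READING))  # link, USB
print(speak_link("USB", "Universal Serial Bus", Reason.FOCUS))    # link, USB, Universal Serial Bus
```

A product built around moving element by element could reasonably treat every move as targeted, which is one way to read the VoiceOver behaviour discussed next.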

But VoiceOver reads it!

With VoiceOver, you usually move to each element individually. You don’t (at least not as often) move line by line (like you do with NVDA), where there can be several elements reported at once. With the individual element model, it makes sense to read the description because you’re dealing with a single element at a time and the user may well be interested in that specific element. And if the user really doesn’t care about it, they can always just move on to the next element early.

So now you’re saying we can’t have interoperability. Dude, make up your mind already!

Recall this from my last post:
If we want interoperability, we need solid rules. I’m not necessarily suggesting that this be compulsory or prescriptive; different AT products have different interaction models and we also need to allow for preferences and innovation.
This is one of those “different interaction models” examples.
Rich Schwerdtfeger commented on my last post:
The problem we have with AT vendors is that many have lobbied very hard for us to NOT dictate what they should do.
Examples like these are one reason AT vendors push back on this.

So, uh, what are we supposed to do?

I’m optimistic that there’s a middle ground: guidelines which allow for reasonable interoperability without restricting AT’s ability to innovate and best suit their users’ needs. As in software development, a bit of well-considered abstraction goes a long way to ensuring future longevity.
In this case, perhaps the guidelines could use the “secondary content” terminology I used above or something similar. They might say that for an a tag with an href, the name should be presented as the primary content if overridden using aria-label/ledby and the description should be treated as secondary content. This leaves it up to the AT vendor to decide exactly when this secondary content is presented based on the interaction model, while still providing some idea of how to best ensure interoperability.

Thursday, December 17, 2015

Woe-ARIA: The Surprisingly but Ridiculously Complicated World of aria-label/ledby


WAI-ARIA is one of the best things ever to happen to web accessibility. It paved the way to free us from a world where JavaScript and any widget that didn’t have an HTML tag equated to inaccessibility. Aside from it being deployed by authors, I’ve even managed to majorly improve the accessibility of various websites using Greasemonkey scripts. I love ARIA.
But sometimes, I hate ARIA. Yes, you heard me. I said it. Sometimes, it drives me truly insane.
Let’s take aria-label and aria-labelledby. They’re awesome. Authors can just use them to make screen readers speak the right thing. Simple, right?
Not at all. I wish it were that simple, but it is so, so much more complicated than that. I’ve had a ridiculous number of discussions/arguments about aria-label/aria-labelledby over the years. Frankly, when I hear about aria-label/ledby, it just makes me cringe and groan and, depending on the day, consider quitting my job. (Okay, perhaps that last part is a bit melodramatic.)
The most frustrating part is that people frequently argue that assistive technology products aren’t following the spec when their particular use case doesn’t work as expected. Others bemoan the lack of interoperability between AT products and often blame the AT vendors. But actually, the ARIA spec and guidelines don’t say (not even in terms of recommendations) anything about what ATs should do. They talk only about what browsers should expose, and herein begins a great deal of misunderstanding, argument and confusion. And when we do try to fix one seemingly obvious use case, we often break another seemingly obvious use case.
In this epic ramble, I’ll attempt to explain just how complicated this superficially trivial issue is, primarily so I can avoid having this argument over and over and over again. While this is specifically related to aria-label/aria-labelledby, it’s worth noting there are similar cans of worms lurking in many other aspects of ARIA. Also, I specifically discuss screen readers with a focus on NVDA in particular, but some of this should still be relevant to other AT.

Why not just use the accessible name?

Essentially, aria-label/ledby alters what a browser exposes as the “name” of an element via accessibility APIs. Furthermore, ARIA specifies when the name should be calculated from the “text” of descendant elements. So before we even get into aria-label/ledby, let’s address the question: why don’t screen readers just use the name wherever it is present?
The major problem with this is that the “name” is just text. It doesn’t provide any semantic or formatting information.
Take this example:
<a href="foo"><em>bar</em> bas</a>
A browser will expose “bar bas” as the name of the link exactly as you might expect. But that “bar bas” is just text. What about the fact that “bar” was emphasised? If we just take the name, that information is lost. In this example:
<a href="foo"><img src="bar.png" alt="bar"> bas</a>
the name is again “bar bas”. But if we just take the name, the fact that “bar” is a graphic is lost.
These are overly simple, contrived examples, but imagine how this begins to matter once you have more complex content.
In short, content is more than just the name.
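The loss can be shown with a small Python sketch. The `Node` class below is a drastically simplified, hypothetical stand-in for an accessibility tree node, and `flat_name` only approximates how a name is computed from contents; neither reflects a real browser or screen reader API.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A simplified stand-in for an accessibility tree node (hypothetical)."""
    role: str   # e.g. "link", "emphasis", "graphic", "text"
    text: str = ""   # own text, or alt text for a graphic
    children: list["Node"] = field(default_factory=list)


def flat_name(node: Node) -> str:
    """Join all descendant text, roughly what a name from contents gives."""
    pieces = [node.text] if node.text else []
    pieces += [flat_name(child) for child in node.children]
    return " ".join(p for p in pieces if p)


def render(node: Node) -> list[str]:
    """Walk the tree and keep roles, so emphasis and graphics are still reported."""
    out = [] if node.role == "text" else [node.role]
    if node.text:
        out.append(node.text)
    for child in node.children:
        out += render(child)
    return out


emphasised = Node("link", children=[Node("emphasis", "bar"), Node("text", "bas")])
graphic = Node("link", children=[Node("graphic", "bar"), Node("text", "bas")])

print(flat_name(emphasised), "|", flat_name(graphic))  # bar bas | bar bas
print(render(emphasised))  # ['link', 'emphasis', 'bar', 'bas']
print(render(graphic))     # ['link', 'graphic', 'bar', 'bas']
```

Both links have the same name, yet walking the content tells them apart.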

Just use it when aria-label/ledby is present.

Okay. So we can’t always use the name. But if aria-label/ledby is present, then we can use the name, right?
Wrong. To disprove this, all we have to do is take a landmark:
<div role="navigation" aria-label="Main">Lots of navigation links here</div>
Now, our screen reader comes along looking for content and sees there’s a name, which it happily uses as the content for the entire element. Oops. All of our navigation links just disappeared. All we have left is “Main”. (Of course, no screen reader actually does or has ever done this as far as I'm aware.)

That’s just silly. You obviously don’t do it for landmarks!

Well, sure, but this raises the question: when do we use it and when don’t we? “Common sense” isn’t sufficient for people, let alone computers. We need clear, unambiguous rules. There is no document which provides any such guidance for AT, so each product has to try to come up with its own rules. And thus, the cracks in the mythical utopia of interoperability begin to emerge.
That really sucks. But enough doom and gloom. Let’s try to come up with some rules here.

Render aria-label/ledby before the real content?

Yup, this would fix the landmark case. It is bad for a case like this, though:
<button aria-label="Close">X</button>
That “X” is meaningless semantically, so the author thoughtfully used aria-label. If we use both the name and content, we’ll get “Close X”. Yuck!

Landmarks are just special. You can still use aria-label/ledby as content for everything else.

Not so much. Consider this tweet-like example:
<li tabindex="-1" aria-labelledby="user message time">
  <a id="user" href="alice">@Alice</a>
  <a id="time" href="6min">6 minutes ago</a>
  <span id="message">Wow. This blog is horrible: <a href=""></a></span>
  <a href="conv">View conversation</a>
</li>
Twitter uses this technique, though the code is obviously nothing like this. The “li” element is the tweet. It’s focusable and you can move between tweets by pressing j and k. The aria-labelledby means you get a nice, efficient summary experience when navigating between tweets; e.g. the time gets read last, the View conversation and Reply controls are excluded, etc. But if we used the name as content, we’d lose the formatting, links in the message, and the View conversation and Reply controls. If we render the name before the content, we end up with serious duplication.

But aria-label/ledby can override the content for links and buttons, right?

Believe it or not, I actually have good news this time: yes, you can. But why links and buttons? And what else falls into this category? We need a proper rule here, remember.
There are certain elements such as links, buttons, graphics, headings, tabs and menu items where the content is always what makes sense as the label. While it isn’t clear that it can be used for this determination, the ARIA spec includes a characteristic of “Name From: contents” which neatly categorises these controls.
Thus, we reach our first solid rule: if the ARIA characteristic “Name From: contents” applies, aria-label/ledby should completely override the content.
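As a sketch of how that rule might look in code, assuming a hand-picked subset of roles and a hypothetical `content_to_present` helper (this is not how any particular screen reader implements it):

```python
# A few ARIA roles whose "Name From" characteristic includes contents.
# Chosen for illustration only; this is not a complete list.
NAME_FROM_CONTENTS = {"link", "button", "heading", "tab", "menuitem"}


def content_to_present(role: str, has_aria_label: bool, name: str, content: str) -> str:
    """An author-supplied label replaces content only for name-from-contents roles."""
    if has_aria_label and role in NAME_FROM_CONTENTS:
        return name
    return content


print(content_to_present("button", True, "Close", "X"))  # Close
print(content_to_present("navigation", True, "Main", "Lots of navigation links here"))
# Lots of navigation links here
```

The Close button gets its label, and the landmark keeps its navigation links.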

What about check boxes and radio buttons?

Check boxes and radio buttons don’t quite fit this rule. The problem is that the label is often (but not always) presented separately from the check box element itself, as is the case with the standard HTML input tag:
<input id="cheese" type="checkbox"><label for="cheese">Cheese</label>
The equivalent using ARIA would be:
<div role="checkbox" aria-labelledby="cheeseLabel">&nbsp;</div><div id="cheeseLabel">Cheese</div>
In most cases, a screen reader will see both the check box and label elements separately. If we say the name should always be rendered for check boxes, we’ll end up with double cheese: the first instance will be the name of the check box, with the second being the label element itself. Duplication is evil, primarily because it causes excessive verbosity.
Okay, so we choose one of them. But which one?

Ignore the label element, obviously. Duh.

Perhaps. In fact, WebKit and derivatives choose to strip out the label element altogether as far as accessibility is concerned in some cases. But what about the formatting and other semantic info?
Let’s try this example in Google Chrome, which has its roots in WebKit:
<input type="checkbox" id="agree"><label for="agree">I agree to the <a href="terms">terms and conditions</a></label>
The label element gets stripped out, leaving a check box and a link. If I read this in NVDA browse mode, I get:
check box not checked, I agree to the terms and conditions, link, Terms and conditions
Ugh. That’s horrible. In contrast, this is what we get in Firefox (where the label isn’t stripped):
check box not checked, I agree to the, link, Terms and conditions
Ignoring the label element means we also lose its original position relative to other content. Particularly in tables, this can be really important, since the position of the label in the table might very much help you to understand the structure of the form or aid in navigation of the table.

Fine. So use the label element and ignore the name of the check box.

Great. You just broke this example:
<div role="checkbox" aria-label="Muahahaha">&nbsp;</div>

Make up your mind!

I know, right? The problem is that both of these suck.
The solution I eventually implemented in NVDA is that for check boxes and radio buttons, if the label is invisible, we do render the name as the content for the check box. Finally, another solid rule.
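In rough Python terms, the rule looks something like the sketch below. The `read_checkbox` function and its `label_visible` flag are hypothetical simplifications; working out whether a label is actually visible is the hard part and is not shown here.

```python
def read_checkbox(name: str, checked: bool, label_visible: bool) -> str:
    """Return the text rendered for the check box element itself.

    A visible label is read separately, wherever it sits in the document,
    so repeating the name here would only duplicate it.
    """
    state = "checked" if checked else "not checked"
    parts = [f"check box {state}"]
    if not label_visible:
        parts.append(name)
    return ", ".join(parts)


print(read_checkbox("Cheese", False, label_visible=True))      # check box not checked
print(read_checkbox("Muahahaha", False, label_visible=False))  # check box not checked, Muahahaha
```

No double cheese, and the aria-label-only check box still gets its name.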

Sweet! And this applies to other form controls too, yeah?

Alas, no. The trouble with other form controls like text boxes, list boxes, combo boxes, sliders, etc. is that their label could never be considered their “content”. Their content is the actual stuff entered into the control; e.g. the text typed into a text box.
If the label is visible, it’s easy: we render the label element and ignore the name of the control. If it isn’t visible, currently, NVDA browse mode doesn’t present it at all.
To solve this, we need to present the label separately. For a flat document representation such as NVDA browse mode, this is tricky, since the label isn’t the “content” of anything. I think the best solution for NVDA here is to present the name of the control as meta information, but only if the label isn’t visible. I haven’t yet implemented this.
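A minimal sketch of that idea, keeping in mind that it is unimplemented and that the `Rendered` shape and `render_text_box` function are purely hypothetical:

```python
from dataclasses import dataclass


@dataclass
class Rendered:
    """What a flat document view might show for one control (hypothetical shape)."""
    meta: list[str]  # information about the control, such as its type and name
    content: str     # the control's own content, such as the text typed into it


def render_text_box(name: str, value: str, label_visible: bool) -> Rendered:
    """Keep the value as content; surface the name as meta only without a visible label."""
    meta = ["edit"]
    if not label_visible and name:
        meta.append(name)
    return Rendered(meta=meta, content=value)


print(render_text_box("Search", "cats", label_visible=False))
# Rendered(meta=['edit', 'Search'], content='cats')
print(render_text_box("Search", "cats", label_visible=True))
# Rendered(meta=['edit'], content='cats')
```

The name never replaces the typed text; it only ever sits alongside it.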

Rocking. Can the label override the content for divs, spans and table cells?

No, because if it did, again, we’d lose formatting and semantic info. These elements in particular can contain just about any amount of anything. Do we really want to risk losing that much formatting/info? See the Twitter example above for just a taste of what we might lose.
Another problem with this is the title attribute. Remember I mentioned that aria-label/ledby just alters what the browser exposes as the “name”? The problem is that other things can be exposed as the name, too. If there is no other name, the title attribute will be used if present. I’d say it’s quite likely that the title attribute has been used on quite a lot of divs and spans in the wild, perhaps even table cells. If we replaced the content in this case, that would be… rather unfortunate.
Some have argued that for table cells, we should at least append the aria-label/ledby. Aside from the nasty duplication that might result, this raises a new category of use cases: those where the label should be appended to the content, not override it. With a new category begin the same questions: what are the rules for this category? And would this make sense for all use cases? It certainly seems sketchy to me, and sketchy just isn’t okay here. Again, we need solid, unambiguous rules.

Stop! Stop! I just can’t take it any more!

Yeah, I hear you. Welcome to my pain! But seriously, I hope this has given some insight into why this stuff is so complicated. It seems so simple when you consider a few use cases, but that simplicity starts to fall apart once you dig a little deeper. Trying to produce “common sense” behaviour for the multitude of use cases becomes extremely difficult, if not downright impossible.
If we want interoperability, we need solid rules. I’m not necessarily suggesting that this be compulsory or prescriptive; different AT products have different interaction models and we also need to allow for preferences and innovation. Right now, though, there’s absolutely nothing.