Thursday, March 28, 2013

An easy way to understand ATK text

As far as I can remember, I've always been in touch with the ATK text interface implementation in Mozilla. I started by writing and reviewing some patches back in 2006. But I didn't really understand that piece of code, so I was never sure that a change here wouldn't break things there. At some point we decided that we should get automated test coverage for the text interface before doing any serious work in this area. At least that allowed us to be sure, to a certain extent, that a single bug fix wouldn't cause a bad regression. I then helped my colleagues create the test suite. As part of this work we caught a bug in the ATK spec (thanks to Evan Yan, a Mozilla community member and a Sun engineer at the time). So that wasn't easy. It wouldn't be a lie to say that I've never seen a more complicated API for the things it's designed for.

I should note that, roughly speaking, the IAccessible2 text interface implementation in Firefox is done via the ATK text interface. So with a bad ATK implementation we deliver all its bugs right to IAccessible2 screen readers. It's a hot problem, in other words. Recently I felt brave enough (again) to say: we should stop this shame. I started to look at the code and the spec, trying to untangle things. And then I realized I still didn't have a good grasp of ATK text. First of all I thought it'd be good to add some drawings to the terse ATK spec, so that I and everybody else can easily check whether the expected results are actually correct.


ATK provides a bunch of methods to get text:
  • atk_text_get_text_before_offset
  • atk_text_get_text_at_offset
  • atk_text_get_text_after_offset
Of course ATK provides a bunch of other methods, but they are trivial and it doesn't make sense to even mention them. Each of the methods above takes an AtkTextBoundary as an argument, and its values are:
  • char (trivial)
  • word start and word end
  • line start and line end
  • sentence start and sentence end (not implemented in Firefox)
So we have the get_text_before/at/after methods and the word/line start/end offsets. These are the subject of this post.

About terms: a chapter for the advanced

If you don't plan to read the ATK spec or dig into the details, you can skip this chapter and move on to the pictures. Otherwise this chapter might be useful, since it contains some clarifications.

First of all, here are some terms which are used in the spec but aren't defined there:
  • word start offset - an offset where the word starts, for example, "hello, all" has two word start offsets: 0 for "hello" and 7 for "all";
  • word end offset - 5 for "hello", i.e. an offset after 'o' character, and 10 for "all";
  • inside word offset - any offset between a word start and its word end offset (including the boundaries), in our case these are offsets 0-5 and 7-10;
  • outside word offset - everything that is not inside a word, in our case this is only offset 6.
It's pretty much the same for line start and line end offsets.
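The terms above can be sketched in plain C. This is only an illustration: the function names are hypothetical (they are not ATK API), and it assumes that a word is a run of ASCII letters and digits, which is much simpler than what a real implementation does.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: ASCII letters and digits are word characters. */
bool is_word_char(const char *text, size_t len, size_t i) {
  return i < len && isalnum((unsigned char)text[i]);
}

/* A word starts where a word character follows a non-word character
   (or the beginning of the text). */
bool is_word_start(const char *text, size_t offset) {
  assert(text != NULL);
  size_t len = strlen(text);
  return is_word_char(text, len, offset) &&
         (offset == 0 || !is_word_char(text, len, offset - 1));
}

/* A word ends right after its last word character. */
bool is_word_end(const char *text, size_t offset) {
  assert(text != NULL);
  size_t len = strlen(text);
  return offset > 0 && is_word_char(text, len, offset - 1) &&
         !is_word_char(text, len, offset);
}

/* Inside a word: anywhere from its start to its end, boundaries included. */
bool is_inside_word(const char *text, size_t offset) {
  assert(text != NULL);
  size_t len = strlen(text);
  return is_word_char(text, len, offset) ||
         (offset > 0 && is_word_char(text, len, offset - 1));
}
```

For "hello, all" these helpers report word starts at 0 and 7, word ends at 5 and 10, and offset 6 as the only outside word offset.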

Also we need to mention edge cases: imaginary offsets. Say we have a paragraph that contains the single word "hello":


and then we do

  gint startOffset = 0, endOffset = 0;
  atk_text_get_text_at_offset(accessible, 1,
                              ATK_TEXT_BOUNDARY_WORD_START,
                              &startOffset, &endOffset);
  atk_text_get_text_at_offset(accessible, 1,
                              ATK_TEXT_BOUNDARY_WORD_END,
                              &startOffset, &endOffset);

In both cases we expect the "hello" string with (0, 5) start and end offsets, otherwise there's no way to traverse this paragraph by words. And this actually agrees with the spec: "The returned string will contain the word at the offset if the offset is inside a word". But this means that offsets 0 and 5 are each both a word start and a word end offset at the same time, because "the returned string is from the word start at or before the offset to the word start after the offset" and "the returned string is from the word end before the offset to the word end at or after the offset".

To summarize: the zero offset (0) and the last offset (the character count) are special offsets and can be treated as word start, word end, line start and line end offsets.

A Quick-n-Easy Guide

Update. The proposed algorithm must be corrected to handle edge offsets properly, see ATK text pitfalls. I won't spend time updating it, since Joanie has proposed an ATK text simplification and hopefully it will be accepted in the foreseeable future.

So we are ready to turn the spec's verbosity into nice pictures.


get_text_at_offset

WORD_START and LINE_START boundaries are illustrated this way (X symbol designates the initial offset):


Move forward to the boundary and then (if that was successful) move backward. The start offset is at or before the initial offset, the end offset is after the initial offset.

WORD_END and LINE_END boundaries:


Move backward to the boundary and then (if that was successful) move forward. The start offset is before the initial offset, the end offset is at or after the initial offset.
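Here is a minimal sketch of the two rules above for word boundaries, in plain C without ATK. Everything in it is an assumption made for illustration: the Boundary enum and the function names are hypothetical, words are runs of ASCII letters and digits, and offsets 0 and the character count are simply treated as boundaries of both kinds. It does not model the "if that was successful" check and it doesn't claim to get the edge offsets right.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the word members of AtkTextBoundary. */
typedef enum { WORD_START, WORD_END } Boundary;

/* Offsets 0 and len count as boundaries of both kinds (special offsets). */
bool is_boundary(const char *text, size_t len, size_t i, Boundary b) {
  if (i == 0 || i == len) return true;
  bool prev = isalnum((unsigned char)text[i - 1]) != 0;
  bool cur = isalnum((unsigned char)text[i]) != 0;
  return b == WORD_START ? (!prev && cur) : (prev && !cur);
}

/* Nearest boundary strictly after offset; len if we can't move. */
size_t next_boundary(const char *text, size_t len, size_t offset, Boundary b) {
  for (size_t i = offset + 1; i <= len; ++i)
    if (is_boundary(text, len, i, b)) return i;
  return len;
}

/* Nearest boundary strictly before offset; 0 if we can't move. */
size_t prev_boundary(const char *text, size_t len, size_t offset, Boundary b) {
  for (size_t i = offset; i > 0; --i)
    if (is_boundary(text, len, i - 1, b)) return i - 1;
  return 0;
}

void text_at_offset(const char *text, size_t offset, Boundary b,
                    size_t *start, size_t *end) {
  assert(text != NULL && start != NULL && end != NULL);
  size_t len = strlen(text);
  assert(offset <= len);
  if (b == WORD_START) {
    /* Move forward to the boundary, then backward from it. */
    *end = next_boundary(text, len, offset, b);
    *start = prev_boundary(text, len, *end, b);
  } else {
    /* Move backward to the boundary, then forward from it. */
    *start = prev_boundary(text, len, offset, b);
    *end = next_boundary(text, len, *start, b);
  }
}
```

With the text "hello, all", offset 1 and WORD_START this gives the range (0, 7), i.e. "hello, ", and offset 6 with WORD_END gives (5, 10), i.e. ", all". For the single word "hello" both boundary types give (0, 5) at offset 1.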


get_text_before_offset

WORD_START and LINE_START boundaries:


If the initial offset is the boundary then move backward to find the start offset. Otherwise move backward twice to pick up the start and end offsets.

WORD_END and LINE_END boundaries:

Move backward twice for start and end offsets.


get_text_after_offset

WORD_START and LINE_START boundaries:

Move forward twice for start and end offsets.

WORD_END and LINE_END boundaries:

If the initial offset is the boundary then move forward to find the end offset. Otherwise move forward twice to pick up the start and end offsets.
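The "move backward" and "move forward" rules for the text before and after the offset can be sketched in the same spirit. Again this is plain C with hypothetical names (none of it is ATK API), words are assumed to be runs of ASCII letters and digits, offsets 0 and the character count are treated as boundaries of both kinds, and the edge offsets are not handled with any special care.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for the word members of AtkTextBoundary. */
typedef enum { WORD_START, WORD_END } Boundary;

/* Offsets 0 and len count as boundaries of both kinds (special offsets). */
bool is_boundary(const char *text, size_t len, size_t i, Boundary b) {
  if (i == 0 || i == len) return true;
  bool prev = isalnum((unsigned char)text[i - 1]) != 0;
  bool cur = isalnum((unsigned char)text[i]) != 0;
  return b == WORD_START ? (!prev && cur) : (prev && !cur);
}

/* Nearest boundary strictly after offset; len if we can't move. */
size_t next_boundary(const char *text, size_t len, size_t offset, Boundary b) {
  for (size_t i = offset + 1; i <= len; ++i)
    if (is_boundary(text, len, i, b)) return i;
  return len;
}

/* Nearest boundary strictly before offset; 0 if we can't move. */
size_t prev_boundary(const char *text, size_t len, size_t offset, Boundary b) {
  for (size_t i = offset; i > 0; --i)
    if (is_boundary(text, len, i - 1, b)) return i - 1;
  return 0;
}

void text_before_offset(const char *text, size_t offset, Boundary b,
                        size_t *start, size_t *end) {
  assert(text != NULL && start != NULL && end != NULL);
  size_t len = strlen(text);
  assert(offset <= len);
  if (b == WORD_START && is_boundary(text, len, offset, b))
    *end = offset;                                 /* already on the boundary */
  else
    *end = prev_boundary(text, len, offset, b);    /* first move backward */
  *start = prev_boundary(text, len, *end, b);      /* last move backward */
}

void text_after_offset(const char *text, size_t offset, Boundary b,
                       size_t *start, size_t *end) {
  assert(text != NULL && start != NULL && end != NULL);
  size_t len = strlen(text);
  assert(offset <= len);
  if (b == WORD_END && is_boundary(text, len, offset, b))
    *start = offset;                               /* already on the boundary */
  else
    *start = next_boundary(text, len, offset, b);  /* first move forward */
  *end = next_boundary(text, len, *start, b);      /* last move forward */
}
```

With the text "hello, all", the text before offset 8 for WORD_START is the range (0, 7), and the text after offset 5 for WORD_END is the range (5, 10).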

That's all. Hallelujah to Love!

P.S. Well, if I'm wrong in anything said above then you'd better say so, otherwise this will be implemented in Firefox soon ;)

Wednesday, March 27, 2013

Accessible Mozilla: Tech overview of Firefox 21

Next week Firefox 21 reaches beta status, so it's time to list the accessibility-related changes we introduced for assistive technology. In this release we focused on under-the-hood improvements and ARIA bugs. Let's start.


* We pick up the accessible value of an ARIA combobox in name computation. An example:

  <label for="test">Flash the screen 
    <div role="combobox">
      <div role="textbox"></div>
      <ul role="listbox" style="list-style-type:none;">
        <li role="option">1</li>
        <li role="option" aria-selected="true">2</li>
        <li role="option">3</li>
      </ul>
    </div>
    times
  </label>

In this case the accessible name of the label is "Flash the screen 2 times" (refer to bug).

* ARIA grid has the editable state by default. The grid's editable or readonly state (the latter is the case of aria-readonly attribute usage) is inherited by grid cells unless an aria-readonly attribute on a gridcell overrides it (see bug).

* We no longer expose the hidden:false object attribute for aria-hidden="false", because the ARIA group members concluded that aria-hidden mirrors CSS display:none (check out the bug for details). In particular this means that in

  <div aria-hidden="true">
    <input aria-hidden="false">
  </div>

the HTML input element is still ARIA hidden.

Sure, an author has no reason at all to use aria-hidden="false" in a web app, but you know, people do crazy things all the time. If we exposed it, it could be confusing for screen readers (unless they knew that the "false" value is an author error, of course). However, you may ask why the ARIA spec doesn't treat the "false" value as the "true" value, like it does for any other erroneous value. It's less work for browsers, no burden for AT. You see, that wouldn't be bad. I do not know. Go ask your dad.

* We now support ARIA-based text attributes. You can use aria-invalid="grammar" to mark text that has a grammatical error. From the AT perspective it means that we expose the invalid:grammar text attribute.

Unfortunately the ARIA spec isn't perfect in how it defines aria-invalid. For example, it doesn't allow a list of values

  <p aria-invalid="spelling,grammar">my wrong text</p>

and it says nothing about aria-invalid inheritance which would allow you to do a trick

  <p aria-invalid="spelling">
    <span aria-invalid="grammar">my wrong text</span>
  </p>

to say that your text is misspelled and has a grammatical error. Also it doesn't say how the browser should resolve collisions between aria-invalid and native spelling support, which is also mapped to the invalid text attribute.


* The HTML5 main element was implemented. It's exposed as the xml-roles:main object attribute on the accessibility layer.

* We sorted out name computation for HTML input buttons:
  • HTML input@type="button", @type="submit" and @type="reset" get their name from:
    • @value attribute
    • @title attribute 
  • HTML input@type="image" gets name from:
    • @alt attribute
    • @value attribute
    • @title attribute
  • HTML input@type="image" having no valid @src attribute gets name from:
    • @alt attribute
    • @value attribute
    • "Submit Query" - a visible label of the button
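The fallback order in the list above can be written down as a small C function. This is only a sketch of the lookup order, not Firefox code: the type and function names are hypothetical, and it assumes that a missing or empty attribute is skipped in favor of the next one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical classification of the three cases from the list. */
typedef enum { BUTTON_LIKE, IMAGE, IMAGE_WITHOUT_SRC } InputKind;

/* Attribute values of the input element; NULL means "not specified". */
typedef struct {
  const char *alt;
  const char *value;
  const char *title;
} InputAttrs;

/* Returns s if it is a non-empty string, otherwise NULL.
   Assumption: empty attributes are treated like missing ones. */
const char *non_empty(const char *s) {
  return (s != NULL && strlen(s) > 0) ? s : NULL;
}

/* Walks the attributes in the order given in the list and returns the
   first usable one, or NULL if nothing provides a name. */
const char *input_button_name(InputKind kind, const InputAttrs *attrs) {
  assert(attrs != NULL);
  const char *name = NULL;
  if (kind != BUTTON_LIKE)          /* only image inputs look at @alt */
    name = non_empty(attrs->alt);
  if (name == NULL)
    name = non_empty(attrs->value);
  if (name == NULL)
    name = (kind == IMAGE_WITHOUT_SRC) ? "Submit Query"
                                       : non_empty(attrs->title);
  return name;
}
```

For example, an image input that only has a @title gets its name from the title, while the same input without a valid @src falls back to "Submit Query".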

Everything else

* We made one more fix to our name computation algorithm: we no longer jam together plain text and a name coming from a control element. For example:

  <label>foo<input type="text" value="bar">baz</label>

The accessible name of the label is "foo bar baz". Visually "foo" and "baz" are separated from the HTML input, thus we wrap the control's name in spaces when we compute the label name.

* We learned how to coalesce state change events, so you should get only one event when an object changes its state quickly. For example, this may happen during document loading, when the busy state of a document accessible is switched quickly.

* You can use magic offsets now to get text attributes.