Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrap download_list_html in <p> so VoiceRSS sees it #918

Closed
wants to merge 1 commit into from

Conversation

BryceStevenWilley
Copy link
Contributor

Originally found in mplp/docassemble-mlhframework#62, docassemble only sends parts of the page to Voice RSS to be read. It has a fairly complicated set of rules, including the fact that it skips over divs and the contents of any tag that isn't only a string. All that means that the download list buttons and document titles aren't read out.

It's fairly minor, and the voice RSS has a bunch of other issues as well (see mplp/docassemble-mlhframework#62 (comment)), but wrapping the entire download_list_html in paragraph tags stops docassemble from skipping over those buttons. It's a bit hacky, but it's not something that's worth fixing upstream, and this is a fairly simple change.

Originally found in mplp/docassemble-mlhframework#62,
docassemble only sends parts of the page to Voice RSS to be read. It has a
fairly complicated set of rules, including the fact that it skips over `div`s
and the contents of any tag that isn't only a string. All that means that the
download list buttons and document titles aren't read out.

It's fairly minor, and the voice RSS has a bunch of other issues as well (see
mplp/docassemble-mlhframework#62 (comment)),
but wrapping the entire `download_list_html` in paragraph tags stops
docassemble from skipping over those buttons. It's a bit hacky, but it's not
something that's worth fixing upstream, and this is a fairly simple change.
@nonprofittechy
Copy link
Member

Are other block level elements allowed to be in a <p> tag?

@BryceStevenWilley
Copy link
Contributor Author

Hm, not technically. Browsers seem to handle it by automatically closing the paragraph tag before the div, and ignoring the later closing p tag. But yeah, not to spec.

I don't think there's another way around the issue other than tricking docassemble like this or trying to change the screenreader code extensively, which itself is finicky (I tried to format everything such that the docassemble wouldn't skip the text, but it either skips almost everything, or duplicates what is read due to how it reads text content). I'm convinced that turning off VoiceRSS is the better option, but eh. I'll close this.

@nonprofittechy
Copy link
Member

Hm, not technically. Browsers seem to handle it by automatically closing the paragraph tag before the div, and ignoring the later closing p tag. But yeah, not to spec.

I don't think there's another way around the issue other than tricking docassemble like this or trying to change the screenreader code extensively, which itself is finicky (I tried to format everything such that the docassemble wouldn't skip the text, but it either skips almost everything, or duplicates what is read due to how it reads text content). I'm convinced that turning off VoiceRSS is the better option, but eh. I'll close this.

I'm open to hearing more about voicerss and other ways it might break things... we've not gotten a lot of favorable feedback on it. Can you say more?

@plocket
Copy link
Collaborator

plocket commented Jan 4, 2024

We might want to change this just a tad upstream in download_list_html if this is a pervasive problem. We'll then also be able to adjust styles if there's a problem with those.

@BryceStevenWilley
Copy link
Contributor Author

BryceStevenWilley commented Jan 4, 2024

Less that it breaks other things, and more that it's just not going to be great at reading everything on the page. I didn't know how it parsed the page until yesterday when I looked into it.

  • Docassemble first tries to exclude things that are hidden on the page, but it doesn't use browser semantics, it just finds things that have aria-hidden. This still reads out things that have the HTML attribute hidden, or display: none, both of which screen readers will skip over. For example, it reads the full content of collapsed templates (might be okay), notes that are hidden by show ifs (not okay), and each of the in-interview review widget pages ("Did we help you?", "Thank you for your feedback", "Thank you for your review") despite the latter two being hidden with display: none (less okay).

    • this can't be fixed until it starts gathering the styles of all of the elements on the page and includes a full CSS ruleset, which isn't going to happen on the backend where screenreader.py is getting everything.
  • The relevant part of the existing code looks something like this (paraphrased from screenreader.py):

    for element in all_html_tags:
       ...
       if element.name in ['div', 'option']:
           return False
       if element.name in ['img', 'input'] and element.has_attr('alt'):
           return True
       if element.name in ['p', 'h1', 'h2', 'h3', 'h4', 'h5', 'button', 'textarea', 'note', 'label', 'li']:
           return True
       if element.parent and element.parent.name in ['p', 'h1', 'h2', 'h3', 'h4', 'h5', 'button', 'textarea', 'note', 'label', 'li']:
           return False
       if element.string:
           return True
       return False

    This will read image alt text and all of the text content in p, h1, etc, or if the only content in an HTML tag is text. Otherwise, it skips it. As mentioned, I tried changing this a bit, but it seemed fairly fragile; I couldn't get it to read the "Download" "View" buttons without it also finding a lot of other duplicate or unnecessary content on the page. It looks like that's because we have extra fontawesome icons at the front, so docassemble skips the entire link name (this happens on review screens as well).

  • There's also the existing issue about the control buttons being read out by the screen reader: Remove "Listen to page" screen reader from tab order docassemble-AssemblyLine#621.

We might want to change this just a tad upstream in download_list_html

I thought about that. We could; changing the divs to tables and putting icons under an additional span might help. But this is just one case, and the more I look into it, the more screenreader.py doesn't behave great in other places as well.

Essentially, I'm not confident that it'll do a good job the more components we use on a page. To fix things would mean we'd start iteratively implement browser accessibility rules, which seems ironic given that we're saying that this isn't an accessibility feature. We could just say it's not intended to read all page content, but why have a feature that doesn't actually read you the full page? There might be some point between where it is now and something much too complicated, but I don't know where that is. All of that pushes me to say it's not worth the effort to polish the feature.

@BryceStevenWilley BryceStevenWilley deleted the voice_rss_download_list branch January 8, 2024 17:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants