My team has a robust digital accessibility program and processes for WCAG conformance in our apps. Because of this, there are definitely accessibility defects that get caught and addressed in order of impact and business priority like any other bug. Obviously we want to aim for 100% accessibility for our users, but it's a continual work in progress as new enhancements or changes are released.
I'm stuck on the appropriate measurement to indicate support. If we have 50 common tasks and the most central 10 tasks are solid but some supporting (but also common) tasks have a contrast fail or accessibleLabel missing, does that make the whole app not supporting the feature? If "completing the task" is the rubric there are a whole range of interpretations for that.
In a complex app, I anticipate that a group like ours will have strong support for many of the Accessibility Nutrition Labels accessibility features across tasks and devices, but realistically never be 100% free of defects for a given Apple Accessibility feature, even among core tasks.
As I consider the next steps for Nutrition Labels, I do not see anything in the documentation that gives a sort of baseline or measurement for inclusion. We plan to test all steps to complete a task, and log defects accordingly with an assigned timeline for fixing them (as would be true for functional defects).