I discovered that doing skip-cache re-reloading of https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/PDF32000_2008.pdf would *intermittently* cause (some of) the AnnotationLayers to break with errors printed in the console (see below).
In hindsight this bug is really obvious, however it took me quite some time to find it, since the `StructTreePage.prototype.serializable` getter will lookup various data and all of those cases can fail during loading when streaming and/or range requests are being used.
Finally, to prevent any future errors, ensure that the viewer won't break in these sort of situations.
```
Uncaught (in promise)
Object { message: "Missing data [19098296, 19098297)", name: "UnknownErrorException", details: "MissingDataException: Missing data [19098296, 19098297)", stack: "BaseExceptionClosure@resource://pdf.js/build/pdf.mjs:453:29\n@resource://pdf.js/build/pdf.mjs:456:2\n" }
viewer.mjs:8801:55
\#renderAnnotationLayer: "UnknownErrorException: Missing data [17552729, 17552730)". viewer.mjs:8737:15
Uncaught (in promise)
Object { message: "Missing data [17552729, 17552730)", name: "UnknownErrorException", details: "MissingDataException: Missing data [17552729, 17552730)", stack: "BaseExceptionClosure@resource://pdf.js/build/pdf.mjs:453:29\n@resource://pdf.js/build/pdf.mjs:456:2\n" }
viewer.mjs:8801:55
```
This allows using the new methods in browsers that support them, e.g. Firefox 133+, while still providing fallbacks where necessary; see https://github.com/tc39/proposal-arraybuffer-base64
*Please note:* These are not actual polyfills, but only implements what we need in the PDF.js code-base. Eventually this patch should be reverted, once support is generally available.
- Add explicit `length` validation of the /ID entries. Given the `EMPTY_FINGERPRINT` constant we're already *implicitly* assuming a particular length.
- Move the constants into the `fingerprints`-getter, since they're not used anywhere else.
- Replace the `hexString` helper function with the standard `Uint8Array.prototype.toHex` method; see https://github.com/tc39/proposal-arraybuffer-base64
After PR 18845 we're accessing this getter more, hence it seems like a good idea to ensure that the initial `formInfo` access is covered as well.
While unlikely to be a problem in practice, at least theoretically that data may not be available and the code in `fieldObjects` could thus currently be *unintentionally* invoked more than once.
The `renderForms` parameter pre-dates the introduction of the general `intent` parameter, which means that we're now effectively passing the same state twice to these `getOperatorList` methods.
Similar to the `mustBeViewed` method, we can check the relevant parameters within the `mustBeViewedWhenEditing` method itself since that (in my opinion) slightly helps readability of the code in the `src/core/document.js` file.
In *hindsight* this seems like a better idea, since it avoids the need to manually pass `isEditing` around as a boolean value.
Note that `RenderingIntentFlag` is *internal* functionality, not exposed in the official API, which means that it can be extended and modified as necessary.
Right now, editable annotations are using their own canvas when they're drawn, but
it induces several issues:
- if the annotation has to be composed with the page then the canvas must be correctly
composed with its parent. That means we should move the canvas under canvasWrapper
and we should extract composing info from the drawing instructions...
Currently it's the case with highlight annotations.
- we use some extra memory for those canvas even if the user will never edit them, which
the case for example when opening a pdf in Fenix.
So with this patch, all the editable annotations are drawn on the canvas. When the
user switches to editing mode, then the pages with some editable annotations are redrawn but
without them: they'll be replaced by their counterpart in the annotation editor layer.
- These changes will allow a simpler way of implementing PR 17770.
- The /Lang attribute is fetched lazily, with the first `getTextContent` invocation. Given the existing worker-thread caching, this will thus only need to be done *once* per PDF document (and most PDFs don't included this data).
- This makes the /Lang attribute *directly available* in the `textLayer`, which has the following advantages:
- We don't need to block, and thus delay, overall viewer initialization on fetching it (nor pass it around throughout the viewer).
- Third-party users of the `textLayer` will automatically benefit from this, once we start actually using the /Lang attribute in PR 17770.
*Please note:* This also, importantly, means that the `text` reference-tests will then cover this code (which wouldn't otherwise have been the case).
This manually ignores some cases where the resulting auto-formatting would not, as far as I'm concerned, constitute a readability improvement or where we'd just end up with more overall indentation.
Please see https://eslint.org/docs/latest/rules/arrow-body-style
Some fields, somewhere under the Fields entry in Acroform, could have no name (in T)
but with a parent which has a name but which isn't somewhere under Fields.
As a side-effect, this patch prevents infinite loops because of potential cycles
under Fields.
The `fieldObjects`-getter is implemented in the `PDFDocument` class, which means that the `this._localIdFactory`-property that we pass to `AnnotationFactory.create` doesn't actually exist.
The reason that this hasn't caused any bugs, that I'm aware of, is that all /Fields-entries need to be References to actually make sense.
The `fieldObjects`-getter itself is called, from `src/core/worker.js`, in a way that'll ensure that any `MissingDataException`s are handled. However the problem is that the actual data-lookups in `fieldObjects` and `#collectFieldObjects` are done inside of a Promise, which means that `MissingDataException`s won't be handled and parsing could thus break.
To address this we change all data-lookups to be asynchronous instead.
Over time the amount of "document level" data potentially needed during parsing of Annotations have increased a fair bit, which means that we currently need to ensure that a bunch of data is available for each individual Annotation.
Given that this data is "constant" for a PDF document we can instead create (and cache) it lazily, only when needed, *before* starting to parse the Annotations on a page. This way the parsing of individual Annotations should become slightly less asynchronous, which really cannot hurt.
An additional benefit of these changes is that we can reduce the number of parameters that need to be explicitly passed around in the annotation-code, which helps overall readability in my opinion.
One potential drawback of these changes is that the `AnnotationFactory.create` method no longer handles "everything" on its own, however given how few call-sites there are I don't think that's too much of a problem.
The issue described in the mentioned bug is reall because
Acrobat is rendering the XFA instead of the Acroform.
The original patch just tried to workaround the issue but it
induces some regressions.
In order to minimize the size the of a saved pdf, we generate only one
image and use a reference in each annotation using it.
When printing, it's slightly different since we have to render each page
independantly but we use the same image within a page.
After PR 12563 we're now free to use optional chaining in the worker-thread as well. (This patch also fixes one previously "missed" case in the `web/` folder.)
For the MOZCENTRAL build-target this patch reduces the total bundle-size by `1.6` kilobytes.
On my computer, it takes few tenths of a second to load a local font.
Since a font can be used several times in a document, the cache will
improve performances.