The PDF specification states that empty dash arrays, i.e. arrays with
zero elements, are in fact valid. In that case the dash array simply
corresponds to a solid, unbroken line. However, this case was erroneously
being flagged as invalid and therefore the annotation was not drawn
because its width was set to zero. This commit fixes the issue by
allowing dash arrays to have a length of zero.
This replaces our custom `PromiseCapability`-class with the new native `Promise.withResolvers()` functionality, which does *almost* the same thing[1]; please see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise/withResolvers
The only difference is that `PromiseCapability` also had a `settled`-getter, which was however not widely used and the call-sites can either be removed or re-factored to avoid it. In particular:
- In `src/display/api.js` we can tweak the `PDFObjects`-class to use a "special" initial data-value and just compare against that, in order to replace the `settled`-state.
- In `web/app.js` we change the only case to manually track the `settled`-state, which should hopefully be OK given how this is being used.
- In `web/pdf_outline_viewer.js` we can remove the `settled`-checks, since the code should work just fine without it. The only thing that could potentially happen is that we try to `resolve` a Promise multiple times, which is however *not* a problem since the value of a Promise cannot be changed once fulfilled or rejected.
- In `web/pdf_viewer.js` we can remove the `settled`-checks, since the code should work fine without them:
- For the `_onePageRenderedCapability` case the `settled`-check is used in a `EventBus`-listener which is *removed* on its first (valid) invocation.
- For the `_pagesCapability` case the `settled`-check is used in a print-related helper that works just fine with "only" the other checks.
- In `test/unit/api_spec.js` we can change the few relevant cases to manually track the `settled`-state, since this is both simple and *test-only* code.
---
[1] In browsers/environments that lack native support, note [the compatibility data](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise/withResolvers#browser_compatibility), it'll be polyfilled via the `core-js` library (but only in `legacy` builds).
The following are some highlights of this patch:
- In the Worker we only extract a *subset* of the potential contents of the `Usage` dictionary, to avoid having to implement/test a bunch of code that'd be completely unused in the viewer.
- In order to still allow the user to *manually* override the default visible layers in the viewer, the viewable/printable state is purposely *not* enforced during initialization in the `OptionalContentConfig` constructor.
- Printing will now always use the *default* visible layers, rather than using the same state as the viewer (as was the case previously).
This ensures that the printing-output will correctly take the `Usage` dictionary into account, and in practice toggling of visible layers rarely seem to be necessary except in the viewer itself (if at all).[1]
---
[1] In the unlikely case that it'd ever be deemed necessary to support fine-grained control of optional content visibility during printing, some new (additional) UI would likely be needed to support that case.
The specs are unclear about what kind of xref table format must be used.
In checking the validity of some pdfs in the preflight tool from Acrobat
we can guess that having the same format is the correct way to do.
The pdf in the mentioned bug, after having been changed, wasn't correctly
displayed in neither Chrome nor Acrobat: it's now fixed.
This should be a *tiny* bit more efficient, since it avoids parsing substrings that we don't care about.
*Please note:* I cannot find an ESLint rule to enforce this automatically.
It isn't really a fix for the mentioned bug but it slightly improve things.
In reducing the memory use, the time spent in the GC is reduced either.
The algorithm to compute the bounding box is the same as before but it has just
been rewritten to be more efficient.
This manually ignores some cases where the resulting auto-formatting would not, as far as I'm concerned, constitute a readability improvement or where we'd just end up with more overall indentation.
Please see https://eslint.org/docs/latest/rules/arrow-body-style
For arrow functions that are both simple and short, we can avoid using explicit `return` to shorten them even further without hurting readability.
For the `gulp mozcentral` build-target this reduces the overall size of the output by just under 1 kilo-byte (which isn't a lot but still can't hurt).
and try to load the font family (guessed from the font name) before trying
the local substitution.
The local(...) command expects to have a real font name and not a predefined
substitution it's why we try the font family.
When the text of an annotation is extracted in using getTextContent, consecutive white spaces
are just replaced by one space and. So this patch add an option to make sure that white
spaces are preserved when appearance is parsed.
For the case where there's no appearance, we can have a fast path to get the correct string
from the Content entry.
When an existing FreeText is edited, space (0x20) are replaced by non-breakable (0xa0) ones
to make to see all of them on screen.
In PR 11912 we started caching images that occur on multiple pages globally, which improved performance a lot in many PDF documents.
However, one slightly annoying limitation of the implementation is the need to re-parse the image once the global-caching threshold has been reached. Previously this was difficult to avoid, since large image-resources will cause cleanup to run on the main-thread after rendering has finished. In PR 16108 we started delaying this cleanup a little bit, to improve performance if a user e.g. zooms and/or rotates the document immediately after rendering completes.
Taking those two PRs together, we now have a situation where it's much more likely that the main-thread has "globally used" images cached at the page-level. Hence we can instead attempt to *copy* a locally cached image into the global object-cache on the main-thread and thus reduce unnecessary re-parsing of large/complex global images, which significantly reduces the rendering time in many cases.
For the PDF document in issue 11878, the rendering time of *the second page* changes as follows (on my computer):
- With the `master`-branch it takes >600 ms to render.
- With this patch that goes down to ~50 ms, which is one order of magnitude faster.
(Note that all other pages are, as expected, completely unaffected by these changes.)
This new main-thread copying is limited to "large" global images, since:
- Re-parsing of small images, on the worker-thread, is usually fast enough to not be an issue.
- With the delayed cleanup after rendering, it's still not guaranteed that an image is available in a page-level cache on the main-thread.
- This forces the worker-thread to wait for the main-thread, which is a pattern that you always want to avoid unless absolutely necessary.
There's obviously a few things wrong with the Annotations in the referenced PDF document, however parsing of an Annotation shouldn't just break if the /BS-entry isn't a dictionary.
When a pdf as a FreeText without appearance, we use a fake font in order to render it
and that leads to create few new refs for the font.
But then when we're saving, we create some new refs which start at the same number
as the previous created ones.
Consequently, when saving we're using some wrong objects (like a font) to check if
we're able to render the newly added FreeText.
In order to fix this bug, we just remove the persistent refs (which are only used
when rendering/printing) during the saving.
- Re-factor the existing `fetchData` helper function such that it can fetch more types of data, and it now supports "arraybuffer", "json", and "text".
This only needed minor adjustments in the `DOMCMapReaderFactory` and `DOMStandardFontDataFactory` classes.[1]
- Expose the `fetchData` helper function in the API, such that the viewer is able to access it.
- Use the `fetchData` helper function in the `GenericL10n` class, since this should allow fetching of localization-data even if the default viewer is run in an environment without support for the Fetch API.
---
[1] While testing this I also noticed a minor inconsistency when handling standard font-data on the worker-thread.
Some fields, somewhere under the Fields entry in Acroform, could have no name (in T)
but with a parent which has a name but which isn't somewhere under Fields.
As a side-effect, this patch prevents infinite loops because of potential cycles
under Fields.