3386 Commits

Author SHA1 Message Date
calixteman
efc5c3c231
Merge pull request #19862 from calixteman/bug1961423
Fix 'print to pdf' on Mac with a cid font (bug 1961423)
2025-04-25 15:11:55 +02:00
Jonas Jenwald
312c85bfd6
Merge pull request #19815 from Snuffleupagus/getMergedResources-size
Ensure that "local" /Contents stream-dict /Resources aren't empty (PR 19803 follow-up)
2025-04-25 10:46:04 +02:00
Calixte Denizet
785991a97c Fix 'print to pdf' on Mac with a cid font (bug 1961423) 2025-04-24 20:19:12 +02:00
Jonas Jenwald
64007e777e Ensure that the /Form XObject /Resources-entry is actually a dictionary (issue 19848) 2025-04-23 10:19:20 +02:00
Jonas Jenwald
adc9eb5a5a Always fallback to checking all destinations, when lookup fails (issue 19835)
In the referenced PDF document the keys, in the /Dests dictionary, need to account for PDFDocEncoding.
To improve destination handling in general we'll now unconditionally fallback to always checking all destinations.
2025-04-20 14:53:10 +02:00
Jonas Jenwald
91ba147317 Check that the Object.prototype hasn't been incorrectly extended (PR 11582 follow-up)
This complements, and extends, the existing check of the `Array.prototype` in the worker-thread.
To simplify the implementation we'll now abort immediately, rather than collecting all "bad" properties.
2025-04-18 12:19:29 +02:00
calixteman
4b1875c8c0
Merge pull request #19825 from calixteman/bug1961107
Avoid to create any subarrays when optimizing 'save, transform, constructPath, restore' (bug 1961107)
2025-04-17 19:42:28 +02:00
Calixte Denizet
d7cbda6cb5 Avoid to create any subarrays when optimizing 'save, transform, constructPath, restore' (bug 1961107)
Removing those `subarray`calls helps to improve performance by a factor 6 on Linux and by a factor of 3
on Windows 11.
2025-04-17 19:14:01 +02:00
Jonas Jenwald
bf553f22da Ensure that the /P-entry is actually a dictionary in StructTreePage.prototype.addNode
This may fix issue 19822, but without a test-case it's simply impossible to know for sure.
2025-04-17 14:01:53 +02:00
Jonas Jenwald
76f23ce3b5 Catch, and ignore, errors during Page.prototype.getStructTree
This way any errors thrown during parsing of the page-structTree will not be forwarded to the viewer.
2025-04-17 13:57:30 +02:00
Jonas Jenwald
245d9ba925 Ensure that "local" /Contents stream-dict /Resources aren't empty (PR 19803 follow-up)
This is a small, and quite possibly pointless, optimization which ensures that any "local" /Resources aren't empty, to avoid needlessly trying to load and merge dictionaries.
2025-04-14 09:58:15 +02:00
Jonas Jenwald
834423b51d Add more logical assignment in the src/ folder
This patch uses nullish coalescing assignment in cases where it's immediately obvious from surrounding code that doing so is safe, and logical OR assignment elsewhere (mostly the changes in XFA code).
2025-04-12 17:28:33 +02:00
Jonas Jenwald
1c80412f61 Change PDFDocument.prototype._xfaStreams to return a Map
Using a `Map` rather than an `Object` is a nicer, since it has better support for both iteration and checking if a key exists.
We also change the initial values to be `null`, rather than empty strings, and reduce duplication when creating the `Map`.

*Please note:* Since this is worker-thread code, these changes are "invisible" at the API-level.
2025-04-12 12:47:22 +02:00
Jonas Jenwald
1048508dd1 Catch circular references in /Form XObjects (issue 19800)
For simplicity we will abort /Form XObject parsing *immediately* when encountering a circular reference, rather than letting it continue up until some limit (as e.g. PDFium appears to do), which should be fine since there are never any guarantees if/how *corrupt* PDF documents will render.
2025-04-11 16:54:22 +02:00
Jonas Jenwald
7a94fafd30 Prefer /Resources from the /Contents stream-dict, if available
In rare cases /Resources are also found in the /Contents stream-dict, in addition to in the /Page dict, hence we need to prefer those when available; see `issue18894.pdf`.
2025-04-11 16:54:22 +02:00
Jonas Jenwald
835a456767 Use adjustWidths unconditionally for all embedded fonts (issue 19802)
Previously we'd only do this for Type1/CFF fonts, see e.g. PR 6736, since the font-program may update the /FontMatrix.
However, it seems that we should do this unconditionally to account for fonts with non-default /FontMatrix-entries in the font-dictionary (which seem to be pretty rare).
2025-04-11 15:01:35 +02:00
Jonas Jenwald
fbc4f4b12a Handle non-integer and out-of-range values correctly in Indexed color spaces
In PDF version 2.0 the handling of Indexed color spaces was clarified as follows:
> The index value should be an integer in the range 0 to hival. If the value is a real number, it shall be rounded to the nearest integer (0.5 values shall be rounded up); if it is outside the range 0 to hival, it shall be adjusted to the nearest value within that range.

Please refer to https://github.com/pdf-association/pdf-differences/tree/main/IndexedColor
2025-04-09 15:31:49 +02:00
Jonas Jenwald
12c7c7b0af
Merge pull request #19773 from Snuffleupagus/inline-PDFImage-createRawMask
Inline `PDFImage.createRawMask` in the `PDFImage.createMask` method
2025-04-08 17:19:09 +02:00
Jonas Jenwald
dc3e24a76a Inline PDFImage.createRawMask in the PDFImage.createMask method
After the introduction of `OffscreenCanvas` support we now have *two separate* mask-methods in the `PDFImage` class, and the reason that they were not combined is likely that we need the "raw" bytes when parsing Type3-glyph image masks.
However, that case is easy to support simply by disabling `OffscreenCanvas` usage when parsing Type3-glyphs and that way we're able to reduce some code duplication.

Another slightly strange property of the `PDFImage.createMask` method is that it needs various image-dictionary parameters *manually* provided, which is probably because this is very old code.
That feels slightly unwieldy, and we instead change the method to pass in the image-stream directly and do the necessary data-lookup internally.

A side-effect of this re-factoring is that we now support using the custom `isSingleOpaquePixel` operator in Type3-glyphs, which shouldn't hurt even though it seems extremely unlikely for that to ever happen in Type3-glyphs.
2025-04-08 12:01:50 +02:00
Jonas Jenwald
d882d0869c Move the IDENTITY_MATRIX constant into src/core/core_utils.js (PR 19772 follow-up)
After the changes in PR 19772 the `IDENTITY_MATRIX` constant is now only used on the worker-thread, which leads to Webpack marking the code as unused in the *built* `pdf.mjs` file; see https://phabricator.services.mozilla.com/D244533#change-8oITAexCvrlQ
2025-04-07 11:40:18 +02:00
Calixte Denizet
4c63905a18 Avoid to create an array when setting the text matrix 2025-04-05 20:45:26 +02:00
Jonas Jenwald
7cfb1be650
Merge pull request #19758 from Snuffleupagus/OperatorList-setOptions
Initialize the `isOffscreenCanvasSupported` option, in the `OperatorList` class, once per document
2025-04-05 18:45:55 +02:00
calixteman
7eef7dfc78
Merge pull request #19763 from calixteman/simplify_updaterect
Replace UpdateRectMinMax by getAxialAlignedBoundingBox
2025-04-04 21:33:05 +02:00
Calixte Denizet
e7a951547d Replace UpdateRectMinMax by getAxialAlignedBoundingBox
and don't use array destructuring because it induces a memory and perf penalties.
2025-04-04 19:57:55 +02:00
Jonas Jenwald
dad6febc39 Pass the /Info-strings as a Map to the src/core/writer.js code
We want to iterate through the data in the `computeMD5` function, and `Map`s have "nicer" support for that than generic objects.
(Somewhat recently `Map` performance was improved in Firefox, however this also isn't really performance sensitive code.)
2025-04-04 13:36:13 +02:00
Jonas Jenwald
8c5fb7979f Also transfer ImageBitmaps for image masks and inline images
Currently we're only transferring TypedArrays, however there's no reason (as far as I can tell) to not also transfer `ImageBitmap`s in these cases.
2025-04-03 18:50:39 +02:00
Calixte Denizet
41bed561f0 Simplify updateRectMinMax in order to use slightly less memory 2025-04-03 17:06:58 +02:00
Jonas Jenwald
4a6c47489e Initialize the isOffscreenCanvasSupported option, in the OperatorList class, once per document
Currently we're setting this option for each small inline image, which seems unnecessary since it should suffice to do that once per document.
2025-04-03 14:00:07 +02:00
Jonas Jenwald
4262603b06 Re-name the Util.applyTransformToBezierInPlace method
Given that all `Util.apply...` methods are now using in/out parameters, we can slightly shorten the name of this one.
2025-04-01 23:03:50 +02:00
Jonas Jenwald
fa643bb22f Change Util.applyTransform to use the point-argument as an in/out parameter
This will help reduce the total number of Array allocations, which cannot hurt, and also allows us to remove the `Util.applyTransformInPlace` method.
2025-04-01 23:03:36 +02:00
Jonas Jenwald
e5fbf52405
Merge pull request #19736 from Snuffleupagus/compileType3Glyph-worker
[api-minor] Move Type3-glyph compilation to the worker-thread
2025-04-01 19:40:30 +02:00
Jonas Jenwald
b33522a208
Merge pull request #19746 from Snuffleupagus/evaluator-img-cache-tweaks
Reduce some code duplication when handling globally cached images
2025-04-01 19:15:10 +02:00
calixteman
25693dc0ee
Merge pull request #19712 from calixteman/optimize_save_construct
Optimize save-transform-constructPath-restore
2025-04-01 18:20:59 +02:00
Jonas Jenwald
9cd5a9658a [api-minor] Move Type3-glyph compilation to the worker-thread
After PR 19731 the format of compiled Type3-glyphs is now simple enough that the compilation can be moved to the worker-thread, without introducing any significant additional complexity.
This allows us to, ever so slightly, simplify the implementation in `src/display/canvas.js` since the Type3 operatorLists will now directly include standard path-rendering operators (using the format introduced in PR 19689).

As part of these changes we also stop caching Type3 image masks since: we've not come across any cases where that actually helps, they're usually fairly small, and it simplifies the code.

Note that one "negative" change introduced in this patch is that we'll now compile Type3-glyphs *eagerly*, whereas previously we'd only do that lazily upon their first use.
However, this doesn't seem to impact performance in any noticeable way since the compilation is fast enough (way below 1 ms/glyph in my testing) and Type3-fonts are also limited to just 256 glyphs. Also, many (or most?) Type3-fonts don't even use image masks and are thus not affected by these changes.
2025-04-01 09:09:00 +02:00
Jonas Jenwald
213830f44f Use, and re-name, the addLocallyCachedImageOps helper for global images too
This avoids having to "manually" set the image operators for globally cached images.
2025-03-31 10:57:04 +02:00
Jonas Jenwald
e0e59eaf01 Define the global cache-data once in buildPaintImageXObject
Currently we duplicate the same identical code three times, which seems both unnecessary and error prone.
2025-03-31 10:29:29 +02:00
Tim van der Meij
1ccc82beff
Merge pull request #19733 from Snuffleupagus/import-vite-ignore
[api-minor] Attempt to improve support for using the PDF.js builds with Vite
2025-03-30 11:58:20 +02:00
Jonas Jenwald
8e3a3387e0 Reduce duplication when specifying the fn-operations in buildPaintImageXObject
Currently we explicitly specify the fn-`OPS` both when adding entries to the operatorList and to the image-caches, and by using a temporary variable we can reduce a bit of duplication (similar to the existing args-handling).
2025-03-29 15:56:46 +01:00
Jonas Jenwald
8bcc3664c9 [api-minor] Attempt to improve support for using the PDF.js builds with Vite
Similar to Webpack there's apparently other bundlers that will not leave `import`-calls alone unless magic comments are used.
Hence we extend the builder to also append `/* @vite-ignore */` comments to `import`-calls, in order to attempt to improve support for using the PDF.js builds together with Vite.

This patch also renames `__non_webpack_import__` to `__raw_import__` since the functionality is no longer bundler-specific.

***PLEASE NOTE:*** This patch is provided as-is, and it does *not* mean that the PDF.js project can/will provide official support for Vite.
2025-03-28 16:34:00 +01:00
Jonas Jenwald
29788eb0b8 Use Util.rectBoundingBox more in the annotation code (PR 19713 follow-up)
By changing the return format of the various `pointsCallback` functions we can use the `Util.rectBoundingBox` helper in the `MarkupAnnotation.prototype._setDefaultAppearance` method as well, thus shortening the code slightly.
2025-03-27 13:46:36 +01:00
Jonas Jenwald
c6b9338f6f
Merge pull request #19719 from Snuffleupagus/structTree-xref
Pass the `XRef`-instance explicitly to the `StructTreeRoot` class, and simplify the `StructElementNode.prototype.role` getter
2025-03-25 19:41:20 +01:00
Jonas Jenwald
22107145df
Merge pull request #19691 from Snuffleupagus/xfa-template-rm-reduce
Remove `Array.prototype.reduce` usage from the `src/core/xfa/template.js` file
2025-03-25 18:17:12 +01:00
Jonas Jenwald
532406427b Simplify the StructElementNode.prototype.role getter a tiny bit
Given that the `roleMap` only contains `Name`-strings, the lookup can be simplified thanks to modern JavaScript features.
2025-03-25 18:04:56 +01:00
Jonas Jenwald
3c93d63731 Pass the XRef-instance explicitly to the StructTreeRoot class
This avoids the current situation where we're accessing it through various dictionaries, since that's a somewhat brittle solution given that in the general case a `Dict`-instance may not have the `xref`-field set (see e.g. the empty-Dict).
2025-03-25 18:04:51 +01:00
Calixte Denizet
6146e5fee7 Optimize save-transform-constructPath-restore
The 4 operations can be replaced with just one in applying the transform to the points coordinates.
2025-03-25 15:31:45 +01:00
Calixte Denizet
dee80cb082 In the struct tree a kid can be a reference to an MCID entry (issue #19694) 2025-03-25 15:01:52 +01:00
Jonas Jenwald
9456b88f6b Remove Array.prototype.reduce usage from the src/core/xfa/template.js file
Using `Array.prototype.reduce` often leads to less readable code, and in these cases we can replace that with other Array-methods instead.
2025-03-24 16:40:41 +01:00
Jonas Jenwald
b85f0903ca Add new bounding-box helpers in Util to reduce code duplication
Currently we have a `Util`-helper for computing the bounding-box of a Bézier curve, however for simple points and rectangles we repeat virtually identical code in many spots throughout the code-base.

 - Introduce new `Util.pointBoundingBox` and `Util.rectBoundingBox` helpers.

 - Remove the "fallback" from `Util.bezierBoundingBox` and only support passing in a `minMax`-array, since there's only a single call-site using the other format and it could be easily updated.
2025-03-23 19:20:02 +01:00
Jonas Jenwald
e73224ded7 Introduce Math.sumPrecise usage in the code-base
This is a new JavaScript feature that makes it easy to compute the sum of list of values; see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Math/sumPrecise

This allows us to remove most cases of `Array.prototype.reduce`, which helps improve readability since that (in my opinion) often isn't the most intuitive code.
2025-03-23 13:03:56 +01:00
calixteman
d009e4b3a7
Merge pull request #19689 from calixteman/use_path2d
[api-minor] Use a Path2D when doing a path operation in the canvas (bug 1946953)
2025-03-22 21:46:27 +01:00