3221 Commits

Author SHA1 Message Date
Jonas Jenwald
641e2f506e [api-minor] Re-factor how the useWorkerFetch option is used internally
With the recently added OpenJPEG no-wasm fallback we need to send the `wasmUrl` option to the worker-thread *regardless* of the value of the `useWorkerFetch` option, since the fallback won't work if we don't have a URL to `import` it from.
For consistency the code is re-factored to always send the factory-urls to the worker-thread, and simply check the `useWorkerFetch` option there instead.

Also, as a follow-up to PR 19525, introduce a new `useWasm` option that can be used in e.g. browser-tests to forcibly disable WebAssembly usage.
2025-02-22 09:56:53 +01:00
Jonas Jenwald
6d3bb47655
Merge pull request #19525 from calixteman/bug1935076_part2
Provide a js fallback when the wasm version of openjpeg is failing to load (bug 1935076)
2025-02-22 09:34:40 +01:00
Calixte Denizet
36e4f5c222 Provide a js fallback when the wasm version of openjpeg is failing to load (bug 1935076) 2025-02-21 19:03:47 +01:00
Jonas Jenwald
d5ce35f744 Move the EXIF-block replacement into JpegStream (PR 19356 follow-up)
Currently we modify the EXIF-block in place, which may end up "breaking" the JPEG-data of the original PDF document since e.g. saving it from the viewer no longer contains the real EXIF-block.
Hence the EXIF-block replacement is moved into the `JpegStream` class, such that we can copy the data before doing the replacement.
2025-02-20 12:41:39 +01:00
Jonas Jenwald
db7cf40a30 Don't cache free/missing XRef entries (issue 19510)
During the XRef stream parsing we're attempting to lookup an entry that hasn't yet been found, since parsing is currently running, and given that we'd also cache free/missing XRef entries we'd then return an incorrect value during normal PDF parsing.

The simplest solution here is to just not cache free/missing XRef entries, since a properly generated PDF document shouldn't be trying to access objects it doesn't contain.
Furthermore, the amount of "extra" parsing now needed for such XRef entries shouldn't be significant enough to be an issue.
2025-02-18 18:04:00 +01:00
Jonas Jenwald
36979e9eb2 Fix all outstanding ESLint arrow-body-style warnings
Currently this rule is disabled in a number of spots across the code-base, and unless absolutely necessary we probably shouldn't disable linting, so let's just update the code to fix all the outstanding cases.
2025-02-17 15:45:44 +01:00
Jonas Jenwald
3f15e0c469
Merge pull request #19495 from Snuffleupagus/issue-19494
Check more of the stream when looking for commands after inline image (issue 19494)
2025-02-15 17:05:59 +01:00
Jonas Jenwald
65df1d336f Check more of the stream when looking for commands after inline image (issue 19494)
Currently we only check `followingBytes`, which turns out to be too short to find e.g. valid transform (cm) commands with decimal arguments.
2025-02-15 15:14:47 +01:00
Jonas Jenwald
a722ca4de5 Shorten the CipherTransformFactory.prototype.#buildObjectKey method
- Use `TypedArray.prototype.set()` rather than a manual loop when building the `key`.

 - Use an existing local variable to avoid re-computing the length of the `encryptionKey`.
2025-02-15 13:00:42 +01:00
calixteman
92ff26e4ff
Merge pull request #19427 from calixteman/bug1946181
Add some unicode mapping for ligatures when writing the cmap table in the font (bug 1946181)
2025-02-14 17:46:07 +01:00
Jonas Jenwald
33cba30bdb Search for destinations in both /Names and /Dests dictionaries (issue 19474)
Currently we only use either one of them, preferring the NameTree when it's available.
2025-02-14 15:49:05 +01:00
Jonas Jenwald
bd05b255fa [api-major] Apply the userUnit using CSS, to fix the text/annotation layers (bug 1947248)
Rather than modifying the "raw" dimensions of the page, we'll instead apply the `userUnit` as an *additional* scale-factor via CSS.

*Please note:* It's not clear to me if this solution is fully correct either, or if there's other problems with it, but it at least *appears* to work.

---

With these changes, the following CSS variables are now assumed to be available/set as necessary: `--total-scale-factor`, `--scale-factor`, `--user-unit`, `--scale-round-x`, and `--scale-round-y`.
2025-02-11 14:36:06 +01:00
Jonas Jenwald
b4a6b1ba0b
Merge pull request #19456 from Snuffleupagus/more-TypedArray-fill
Replace a couple of loops with `TypedArray.prototype.fill()`
2025-02-09 17:45:17 +01:00
calixteman
6862e84daa
Merge pull request #19437 from calixteman/signature_save_print1
[Editor] Add the ability to print and save some newly added signatures (bug 1946795)
2025-02-09 17:13:01 +01:00
Jonas Jenwald
294fa3e4e6 Replace a couple of loops with TypedArray.prototype.fill()
When you want to initialize a TypedArray with a non-zero value for all elements the `fill`-method is simpler than manually looping through it.
2025-02-09 16:24:45 +01:00
Tim van der Meij
8ba8e75d6c
Merge pull request #19446 from Snuffleupagus/shorten-MeshStreamReader-readBits
Shorten the `MeshStreamReader.prototype.readBits` method a little bit
2025-02-09 12:57:53 +01:00
Jonas Jenwald
7919f4e84f Use crypto.getRandomValues unconditionally in the src/core/crypto.js file
This functionality is now available in all browsers/environments that we support, please see https://developer.mozilla.org/en-US/docs/Web/API/Crypto/getRandomValues#browser_compatibility
2025-02-09 11:42:17 +01:00
Jonas Jenwald
36522d85cc Shorten the MeshStreamReader.prototype.readBits method a little bit
- Use a `BaseStream`-instance method to directly get the int32 value.

 - Use local variables more.
2025-02-08 11:17:35 +01:00
Calixte Denizet
24417a1a0b [Editor] Add the ability to print and save some newly added signatures (bug 1946795) 2025-02-07 23:07:27 +01:00
Tim van der Meij
b43efdd545
Merge pull request #19436 from Snuffleupagus/api-FetchBinaryData
Combine the main-thread message handlers for CMap-, StandardFontData-, and Wasm-files
2025-02-07 21:13:58 +01:00
Jonas Jenwald
88e5da1e37 Combine the main-thread message handlers for CMap-, StandardFontData-, and Wasm-files
Currently we have three separate and virtually identical message handlers for this data, which can easily be combined into a single message handler instead.
2025-02-07 14:33:15 +01:00
Calixte Denizet
185ec1faf6 Add some unicode mapping for ligatures when writing the cmap table in the font (bug 1946181) 2025-02-06 10:51:15 +01:00
Jonas Jenwald
925071ab46 Use the getPdfColorArray helper more in the src/core/annotation.js file
While the file-size reduction is obviously tiny, it should never hurt to remove code duplication.
2025-02-05 11:37:32 +01:00
Jonas Jenwald
6f2966628c Add width/height getters in the Annotation class
Currently we're manually computing the width/height of the /Rect-entry in a number of spots throughout the worker-thread Annotation code, which these new getters help avoid.
2025-01-31 13:21:53 +01:00
Jonas Jenwald
31deee26b9
Merge pull request #19393 from Snuffleupagus/fewer-hasFieldFlag
Use fewer `hasFieldFlag` calls in the `src/core/annotation.js` file
2025-01-30 15:10:14 +01:00
Jonas Jenwald
9e227de3c5 Use fewer hasFieldFlag calls in the src/core/annotation.js file
We lookup a number of field-flags multiple times, which seems unnecessary.
2025-01-30 13:38:57 +01:00
Jonas Jenwald
db53320da8 Initialize the image-options, on the worker-thread, once per document
Currently we're initializing the image-options for every page, which seems unnecessary since it should suffice to do that once per document.

Also, changes the `BasePdfManager` constructor to improve readability/documentation a little bit.
2025-01-30 11:52:15 +01:00
Jonas Jenwald
6f2706fad6 Support the password field-flag in TextWidgetAnnotation (issue 19389) 2025-01-29 12:40:09 +01:00
Jonas Jenwald
237a17ad56 Introduce a readInt16 helper function in the src/core/core_utils.js file
Rather than manually repeating code needed to get an int16, we can move and re-use an existing helper function instead.
2025-01-29 10:14:23 +01:00
Jonas Jenwald
6281a89778 Handle *empty* BigInt64Array/BigUint64Array in the isNumberArray helper
The current checks would accidentally allow *empty* BigInt64Array/BigUint64Array, which we can fix by instead checking directly for those types.
This should be fine since those types are available in all environments that we support, see:
 - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/BigInt64Array#browser_compatibility
 - https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/BigUint64Array#browser_compatibility
2025-01-27 12:57:47 +01:00
Jonas Jenwald
9241e1be8c [api-minor] Simplify clean-up of page resources after rendering
After PR 2317, which landed in 2012, we'd immediately clean-up after rendering for pages with large image resources. This had the effect that re-rendering, e.g. after zooming, would force us to re-parse the entire page which could easily lead to bad performance.
In PR 16108, which landed in 2023, we tried to lessen the impact of that by slightly delaying clean-up however that's obviously not a perfect solution (and it increased the complexity of the relevant code).

Furthermore, the condition for this "immediate" clean-up seems a bit arbitrary to me since a page could easily contain a large number of smaller images whose total size vastly exceeds the threshold.

Hence this patch, which suggests that we remove the conditional and delayed clean-up after rendering. Compared to the situation back in 2012, a number of things have improved since:
 - We have *multiple* caches for repeated image-resources on the worker-thread[1], which helps reduce overall memory usage and improves performance.
 - We downsize huge images on the worker-thread, which means that the images we're using on the main-thread cannot be arbitrarily large.
 - The amount of available RAM on devices should be a lot higher, since more than a decade has passed.

A future improvement here, for more resource constrained environments, could be to instead clean-up when actually needed using e.g. `WeakRef`s (see issue 18148).

---
[1] More specifically:
 - `LocalImageCache`, which caches image-data by /Name and /Ref on the `PartialEvaluator.prototype.getOperatorList` level.
 - `RegionalImageCache`, which caches image-data by /Ref on the `PartialEvaluator`-instance (i.e. at the page) level.
 - `GlobalImageCache`, which caches image-data by /Ref globally at the document level.
2025-01-22 12:19:44 +01:00
Calixte Denizet
1ccf6ed976 Correctly render the glyph outline when it has a stroke pattern
It fixes #19360.

Each glyph in the test case has a fill and a stroke pattern, so the current transform used
to scale the glyph outline must be the same.
In setting the stroke color to green, I noticed that the last outline contains some non-closed
subpaths, so when generating the glyph outline, every time we 'moveTo', we close the previous
subpath.
2025-01-21 15:30:16 +01:00
Jonas Jenwald
c4ba3ac23f Replace the EXIF-block with dummy data to prevent JPEG images being rotated (bug 1942064)
The `ImageDecoder` will respect the EXIF orientation, which can lead to JPEG images being incorrectly rotated. To avoid this we replace the entire EXIF-block with dummy data, which works since it'll cause EXIF parsing to bail out early in Firefox; see https://searchfox.org/mozilla-central/rev/9a66d18cb35595c89f499a1011c9dd7e573fce77/image/decoders/EXIF.cpp#130-138
2025-01-20 16:50:22 +01:00
Calixte Denizet
0c14d6e2db Handle the case where openjpeg.wasm is missing
It fixes #19336.
2025-01-17 22:11:24 +01:00
Jonas Jenwald
6038b5a992 Handle JPX wasm fetch-response errors correctly (PR 19329 follow-up)
Currently we're not checking that the response is actually OK before getting the data, which means that rather than throwing an error we can get an empty `ArrayBuffer`.

To avoid duplicating code we can move an existing helper into `src/core/core_utils.js` and re-use it when fetching the JPX wasm-file as well.
2025-01-17 10:20:16 +01:00
Calixte Denizet
94b4b54ef6 [api-major] Add openjpeg.wasm to pdf.js (bug 1935076)
In order to fix bug 1935076, we'll have to add a pure js fallback in case wasm is disabled
or simd isn't supported. Unfortunately, this fallback will take some space.

So, the main goal of this patch is to reduce the overall size (by ~93k).
As a side effect, it should make easier to use an other wasm file (which must export
_jp2_decode, _malloc and _free).
2025-01-16 21:09:50 +01:00
Jonas Jenwald
e5bc760316 Access the number of components correctly in JPEG 2000 images with color space entries (issue 19326)
This small typo appears to be a regression from PR 18204.
2025-01-15 10:16:06 +01:00
Jonas Jenwald
5e569cade5 Improve performance when reading very large TrueType "cmap" tables (issue 19319)
In the affected font the total number of mapping-entries is `1142348`, and no less than `997473` of them are duplicates.
Given that every duplicate causes a lot of Array elements to be moved this becomes extremely inefficient, which we can avoid by keeping track of seen `charCode`s and directly build the final mappings-Array instead.
2025-01-13 13:09:47 +01:00
Jonas Jenwald
e0be2a02dc Re-use existing helper functions in FontRendererFactory
Currently we re-implement a number of helper functions specifically for this code, which seems completely unnecessary since there's already general purpose ones available in the `src/core/core_utils.js` file.
2025-01-12 13:01:54 +01:00
Tim van der Meij
4d4e1befeb
Merge pull request #19289 from Snuffleupagus/issue-19281
Skip LinkAnnotations when collecting field objects (issue 19281)
2025-01-04 13:32:18 +01:00
Jonas Jenwald
6f062abb76 Skip LinkAnnotations when collecting field objects (issue 19281)
The `/Root/AcroForm/Fields` array contains a "ridiculous" number of LinkAnnotations, which obviously makes no sense since those are not form fields.
To improve performance we'll thus ignore those when collecting the field objects.
2025-01-04 11:54:45 +01:00
Jonas Jenwald
74c1795c9f Use Dict iteration more (PR 19051 follow-up)
There's a few cases where we're looping through the result of `Dict.prototype.getKeys` and then manually look-up the values, which after PR 19051 can be replaced with direct iteration instead.
2025-01-02 15:09:19 +01:00
Jonas Jenwald
20d5332009 For images that include SMask/Mask entries, ignore an SMask defined in the current graphics state
From section [11.6.4.3 Mask Shape and Opacity](https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/PDF32000_2008.pdf#G10.4848628) in the PDF specification:
 - An image XObject may contain its own *soft-mask image* in the form of a subsidiary image XObject in the `SMask` entry of the image dictionary (see "Image Dictionaries"). This mask, if present, shall override any explicit or colour key mask specified by the image dictionary's `Mask` entry. Either form of mask in the image dictionary shall override the current soft mask in the graphics state.
2024-12-30 14:25:07 +01:00
Tim van der Meij
8a50d2d302
Merge pull request #19259 from Snuffleupagus/more-wrapReason
Reduce duplication when handling "DocException" and "PasswordRequest" messages
2024-12-29 15:47:11 +01:00
Tim van der Meij
91d801728d
Merge pull request #19262 from Snuffleupagus/CompiledFont-assert-fontMatrix
Assert that the `fontMatrix` is always valid in `compileGlyph`
2024-12-29 15:41:04 +01:00
Tim van der Meij
5b94c865e7
Merge pull request #19263 from Snuffleupagus/Type2Compiled-fix-glyphNameMap
Remove the unused `glyphNameMap` parameter from `Type2Compiled`
2024-12-29 15:39:58 +01:00
Jonas Jenwald
aba6369638 Remove the unused glyphNameMap parameter from Type2Compiled
As part of the changes in PR 4259, which landed over ten years ago, the `glyphNameMap` property on `Font`-instances was removed.
The reason that this didn't cause any bugs is that we always fallback on `getGlyphsUnicode`, and when using that data we also rely on `StandardEncoding`, hence we should just remove the unused parameter from the `Type2Compiled` constructor.
2024-12-27 14:31:39 +01:00
Jonas Jenwald
2ae5aa08cc Assert that the fontMatrix is always valid in compileGlyph
While [bug 1893645](https://bugzilla.mozilla.org/show_bug.cgi?id=1893645) was fixed some time ago now, it still shouldn't hurt to also assert that the `fontMatrix` is always valid when invoking the `compileGlyph` method.
2024-12-27 11:41:49 +01:00
Jonas Jenwald
3a797317c8 Shorten some if-statements in the src/core/ folder
Thanks to modern JavaScript there's some existing code that can be shortened slightly.
2024-12-26 18:37:25 +01:00
Jonas Jenwald
6bde49a606 Reduce duplication when handling "DocException" and "PasswordRequest" messages
Rather than having to manually implement the exception-handling for the "DocException" message, we can instead re-use (and slightly extend) the existing `wrapReason` function since that one already does what we need.

Furthermore, we can also simplify handling of the "PasswordRequest" message a little bit and again re-use the `wrapReason` function.

Finally, the patch makes the following smaller changes:
 - Avoid needlessly re-creating exceptions in the `wrapReason` function.
 - Use a slightly shorter parameter name in the `wrapReason` function.
 - Remove the unused entries in the `CallbackKind`/`StreamKind` enumerations.
2024-12-26 12:55:49 +01:00