759 Commits

Author SHA1 Message Date
Jonas Jenwald
db43f158dc Inline the default Factory-definitions in getDocument
- Most of the these are only used in the `src/display/api.js` file, and this leads to slightly shorter code.

 - A number of unit-tests need a `BaseCanvasFactory`-instance, however that one is available through the `PDFDocumentProxy`-instance nowadays.

 - For other unit-tests the remaining necessary default Factory-definitions can be moved into the `test/unit/test_utils.js` file.
2025-01-18 14:09:14 +01:00
Jonas Jenwald
1ddce76a8b Simplify the JSDocs for the various getDocument Factory-parameters
Given that we nowadays provide default Node.js versions of these Factory-parameters it no longer seems necessary to mention that environment specifically.
2025-01-16 23:01:36 +01:00
Calixte Denizet
94b4b54ef6 [api-major] Add openjpeg.wasm to pdf.js (bug 1935076)
In order to fix bug 1935076, we'll have to add a pure js fallback in case wasm is disabled
or simd isn't supported. Unfortunately, this fallback will take some space.

So, the main goal of this patch is to reduce the overall size (by ~93k).
As a side effect, it should make easier to use an other wasm file (which must export
_jp2_decode, _malloc and _free).
2025-01-16 21:09:50 +01:00
Jonas Jenwald
6bde49a606 Reduce duplication when handling "DocException" and "PasswordRequest" messages
Rather than having to manually implement the exception-handling for the "DocException" message, we can instead re-use (and slightly extend) the existing `wrapReason` function since that one already does what we need.

Furthermore, we can also simplify handling of the "PasswordRequest" message a little bit and again re-use the `wrapReason` function.

Finally, the patch makes the following smaller changes:
 - Avoid needlessly re-creating exceptions in the `wrapReason` function.
 - Use a slightly shorter parameter name in the `wrapReason` function.
 - Remove the unused entries in the `CallbackKind`/`StreamKind` enumerations.
2024-12-26 12:55:49 +01:00
Jonas Jenwald
490e740365 [api-minor] Remove deprecated getDocument options (PR 18776 follow-up) 2024-12-15 14:13:44 +01:00
Jonas Jenwald
6153b15231 Remove the raw path-strings after creating the actual Path2D glyph-objects
The `Path2D` glyph-objects are cached on the `FontFaceObject`-instance, so we can save a little bit of memory by removing the raw path-strings once they're no longer needed.
2024-12-09 15:09:18 +01:00
Jonas Jenwald
c6e3fc4fe6 Take the userUnit into account in the PageViewport class (issue 19176) 2024-12-08 15:51:04 +01:00
Tim van der Meij
2bbf9b432c
Merge pull request #19099 from timvandermeij/updates
Update dependencies and translations to the most recent versions
2024-12-01 14:54:07 +01:00
Jonas Jenwald
4b0900fdfa Use even more optional chaining in the src/display/api.js file
This slightly shortens the code, in various `destroy`-methods, which cannot hurt.
Also, use pre-processor checks to simplify `PDFDocumentLoadingTask.destroy` in the Firefox PDF Viewer since the `PDFWorker.fromPort`-method isn't used there.
2024-11-30 12:06:27 +01:00
Tim van der Meij
725ae4998c
Fix the type definition of the fingerprints getter in src/display/api.js
The `Array` type takes one parameter that describes the possible types
of the inner  elements. This parameter may exist of pipes to indicate
multiple possible types. However, the current type definition provides
multiple parameters; this is incorrect syntax, and TypeScript 5.7+ now
fails on this due to stricter syntax validation.

This commit fixes the issue by changing the type definition to the
proper syntax, which together with the accompanying comment about the
contents of the fingerprints array should document it sufficiently.
2024-11-24 21:53:45 +01:00
Jonas Jenwald
471284f51b [api-minor] Disable ImageDecoder usage by default in Chromium browsers
Given that there are multiple issues with `ImageDecoder` in Chromium browsers, affecting both BMP and JPEG images, for now we (by default) disable that functionality there to avoid problems.

This also means that we can remove the previously added, and separate, `isChrome` API-option.
2024-11-14 12:05:15 +01:00
Jonas Jenwald
65eedfb0fc [api-minor] Add a getDocument option to disable ImageDecoder usage
This allows end-users to forcibly disable `ImageDecoder` usage, even if the browser appears to support it (similar to the pre-existing option for `OffscreenCanvas`).
2024-11-12 17:12:42 +01:00
Jonas Jenwald
cbf0ca71bf [api-minor] Only support the Fetch API for "remote" PDF documents in Node.js environments
The Fetch API has been supported since Node.js version 18, see https://developer.mozilla.org/en-US/docs/Web/API/Fetch_API#browser_compatibility
2024-11-03 16:18:10 +01:00
Jonas Jenwald
c7407230c1 [api-minor] Load Node.js packages/polyfills with process.getBuiltinModule
This allows *synchronous* loading of Node.js modules and (indirectly) packages, thus simplifying the code a fair bit.
2024-11-03 16:13:58 +01:00
Jonas Jenwald
4e12906061 Move the various DOM-factories into their own files
- Over time the number and size of these factories have increased, especially the `DOMFilterFactory` class, and this split should thus aid readability/maintainability of the code.

 - By introducing a couple of new import maps we can avoid bundling the `DOMCMapReaderFactory`/`DOMStandardFontDataFactory` classes in the Firefox PDF Viewer, since they are dead code there given that worker-thread fetching is always being used.

 - This patch has been successfully tested, by running `$ ./mach test toolkit/components/pdfjs/`, in a local Firefox artifact-build.

*Note:* This patch reduces the size of the `gulp mozcentral` output by `1.3` kilo-bytes, which isn't a lot but still cannot hurt.
2024-11-01 13:31:28 +01:00
Jonas Jenwald
7572382c7a Change the "FetchBuiltInCMap"/"FetchStandardFontData" message-handlers to be asynchronous
This way we can directly throw Errors, rather than having to "manually" return rejected Promises, which is ever so slightly shorter.

Also, since `useWorkerFetch` is always true in MOZCENTRAL builds these message-handlers should not be invoked there.
2024-10-31 09:29:11 +01:00
Jonas Jenwald
afb4813d1c Simplify the "ReaderHeadersReady" message-handler in the API
We can convert the handler to an `async` function, which removes the need to create a temporary Promise here.
Given the age of this code it shouldn't hurt to simplify it a little bit.
2024-10-29 14:59:39 +01:00
Calixte Denizet
b649b6f8dd Use a BMP decoder when resizing an image
The image decoding won't block the main thread any more.
For now, it isn't enabled for Chrome because issue6741.pdf leads to a crash.
2024-10-28 14:09:52 +01:00
Jonas Jenwald
788eabc76a Re-factor the MessageHandler-class event handler function
- Change the "message" event handler function to a private method.

 - Remove the "message" event listener with an `AbortSignal`.

 - Extend the `LoopbackPort`-class with `AbortSignal` support.
2024-10-16 10:16:27 +02:00
Jonas Jenwald
a989244570 Slightly re-factor the transportFactory initialization in getDocument
Given that the `WorkerTransport`-constructor will access all possible factory-instances, let's ensure that the `transportFactory`-object always has a consistent shape regardless of other options.

Also, since `useWorkerFetch` is always true in MOZCENTRAL builds we never need to create `CMapReaderFactory`/`StandardFontDataFactory`-instances there.
2024-09-26 12:16:05 +02:00
Jonas Jenwald
bb302dd993 [api-minor] Pass CanvasFactory/FilterFactory, rather than instances, to getDocument
This unifies the various factory-options, since it's consistent with `CMapReaderFactory`/`StandardFontDataFactory`, and ensures that any needed parameters will always be consistently provided when creating `CanvasFactory`/`FilterFactory`-instances.

As shown in the modified example this may simplify some custom implementations, since we now provide the ability to access the `CanvasFactory`-instance used with a particular `getDocument`-invocation.
2024-09-23 11:26:30 +02:00
Jonas Jenwald
f77a29d675 Simplify the code that picks the appropriate NetworkStream-implementation
This code is quite old and has been moved/re-factored a few times over the years, however we can simplify this even further since we don't actually need a function to determine what NetworkStream-implementation to use.
2024-09-17 12:23:43 +02:00
Jonas Jenwald
c4fdb28573 Remove PDFWorkerUtil and move its contents into PDFWorker instead
This is possible thanks to features, i.e. private fields and in particular static initialization blocks, that didn't exist back when we started using classes in the code-base.
2024-07-29 11:22:43 +02:00
Tim van der Meij
c77b97daff
Update the JS/CSS files for the new Prettier/Stylelint versions 2024-07-13 16:29:47 +02:00
Jonas Jenwald
a4ffc1066c Move the internal API/Worker isEditing-state into RenderingIntentFlag
In *hindsight* this seems like a better idea, since it avoids the need to manually pass `isEditing` around as a boolean value.
Note that `RenderingIntentFlag` is *internal* functionality, not exposed in the official API, which means that it can be extended and modified as necessary.
2024-07-04 23:34:30 +02:00
Calixte Denizet
64635f3b35 [api-minor][Editor] When switching to editing mode, redraw pages containing editable annotations
Right now, editable annotations are using their own canvas when they're drawn, but
it induces several issues:
 - if the annotation has to be composed with the page then the canvas must be correctly
   composed with its parent. That means we should move the canvas under canvasWrapper
   and we should extract composing info from the drawing instructions...
   Currently it's the case with highlight annotations.
 - we use some extra memory for those canvas even if the user will never edit them, which
   the case for example when opening a pdf in Fenix.

So with this patch, all the editable annotations are drawn on the canvas. When the
user switches to editing mode, then the pages with some editable annotations are redrawn but
without them: they'll be replaced by their counterpart in the annotation editor layer.
2024-07-02 14:11:40 +02:00
Jonas Jenwald
a4f1a9a41b Cancel the requestAnimationFrame in the API when cancelling rendering
Errors related to this `requestAnimationFrame` show up intermittently when running the integration-tests on the bots, however I've been unable to reproduce it locally.
Hence I cannot guarantee that it's enough to fix the timing issues, however this should be generally safe since the `requestAnimationFrame` invokes the `_next`-method and the first thing that one does is check that rendering hasn't been cancelled.
2024-06-26 17:26:02 +02:00
Jonas Jenwald
f3f88eecb4 Use an AbortController to remove the temporary "error" handler for the worker 2024-06-15 14:35:32 +02:00
Jonas Jenwald
2d0e08f1c8 Introduce a helper method for resolving the PDFWorker promise
This avoids having to repeat the same code multiple times, since besides resolving the promise we also need to send the "configure" message to the worker-thread.
2024-06-15 14:35:30 +02:00
Jonas Jenwald
8d4456172b Reduce duplication when handling the "test" message from the worker
The feature-testing on the worker-thread has been simplified in previous pull requests, which means that we can simplify this main-thread handler as well.
2024-06-15 14:35:28 +02:00
Jonas Jenwald
0a36b667e4 Use an early return in PDFWorker.prototype._initialize when workers are disabled
This helps reduce overall indentation in the method, thus leading to slightly less code.
Also, remove an old comment referring to Chrome 15 since that's no longer relevant now.
2024-06-15 14:35:12 +02:00
Jonas Jenwald
d6612b3427 Remove some now redundant validation in getDocument
Given that we now check/validate all options properly this old code can be simplified.
2024-06-15 14:35:10 +02:00
Calixte Denizet
ff6180a4c9 Add an option to enable/disable hardware acceleration (bug 1902012) 2024-06-12 18:41:07 +02:00
Jonas Jenwald
06334c97ef Improve the loadingParams functionality in the API
- Move the definition of the `loadingParams` Object, to simplify the code.

 - Add a unit-test, since none existed and the viewer depends on this functionality.
2024-05-24 09:26:40 +02:00
Jonas Jenwald
15b5808eee [api-minor] Re-factor the basic textLayer-functionality
This is very old code, and predates e.g. the introduction of JavaScript classes, which creates unnecessarily unwieldy code in the viewer.
By introducing a new `TextLayer` class in the API, similar to how e.g. the `AnnotationLayer` looks, we're able to keep most parameters on the class-instance itself. This removes the need to manually track them in the viewer, and simplifies the call-sites.

This also removes the `numTextDivs` parameter from the "textlayerrendered" event, since that's only added to support default-viewer functionality that no longer exists.

Finally we try, as far as possible, to polyfill the old `renderTextLayer` and `updateTextLayer` functions since they are exposed in the library API.
For *simple* invocations of `renderTextLayer` the behaviour should thus be the same, with only a warning printed in the console.
2024-05-17 14:20:20 +02:00
Tim van der Meij
4db843617f
Merge pull request #18047 from Snuffleupagus/issue-18042
Avoid re-parsing global images that failed decoding (issue 18042, PR 17428 follow-up)
2024-05-15 15:40:18 +02:00
Jonas Jenwald
6b171540b7 Initialize the networkStream synchronously in getDocument
This is fairly old code, and at some point the need for this to be asynchronous disappeared.
2024-05-14 17:04:25 +02:00
Jonas Jenwald
cbb8748a22 Inline the _fetchDocument helper function in getDocument
This function has been modified a number of times over the years, and at this point it's small/simple enough that we can just inline the code instead.
2024-05-14 16:29:41 +02:00
Jonas Jenwald
c5f92437f7 Avoid re-parsing global images that failed decoding (issue 18042, PR 17428 follow-up)
For images that failed to decode once we want to avoid a pointless round-trip to the main-thread, which could otherwise happen for globally cached images.
2024-05-14 13:58:36 +02:00
Jonas Jenwald
6d523c316c [api-minor] Include the document /Lang attribute in the textContent-data
- These changes will allow a simpler way of implementing PR 17770.

 - The /Lang attribute is fetched lazily, with the first `getTextContent` invocation. Given the existing worker-thread caching, this will thus only need to be done *once* per PDF document (and most PDFs don't included this data).

 - This makes the /Lang attribute *directly available* in the `textLayer`, which has the following advantages:
    - We don't need to block, and thus delay, overall viewer initialization on fetching it (nor pass it around throughout the viewer).

    - Third-party users of the `textLayer` will automatically benefit from this, once we start actually using the /Lang attribute in PR 17770.
      *Please note:* This also, importantly, means that the `text` reference-tests will then cover this code (which wouldn't otherwise have been the case).
2024-05-14 12:44:41 +02:00
Jonas Jenwald
298d72133e
Merge pull request #18051 from Snuffleupagus/NodePackages
[api-minor] Re-factor how Node.js packages/polyfills are  loaded (issue 17245)
2024-05-14 11:43:57 +02:00
Jonas Jenwald
761abc7cc3
Merge pull request #18066 from Snuffleupagus/rm-FontFaceObject-ignoreErrors
Remove the `ignoreErrors` option from the `FontFaceObject` class
2024-05-14 09:49:08 +02:00
Jonas Jenwald
5f6f1686b5 Remove the ignoreErrors option from the FontFaceObject class
- The `stopAtErrors` API option, which is the inverse of the "internal" `ignoreErrors` option, is explicitly documented as applying to *parsing* (i.e. the worker-thread) while the `FontFaceObject` class is used during rendering (i.e. the main-thread); see b6765403a1/src/display/api.js (L164-L167)

 - A glyph that fails in the `FontRendererFactory`, on the worker-thread, will already cause (overall) parsing to stop when `ignoreErrors === false` hence checking the option on the main-thread as well seems redundant; see b6765403a1/src/core/evaluator.js (L4527-L4533)

 - Removing this option simplifies the code, and slightly reduces the number of options that we need to handle in the main-thread code.
2024-05-11 10:18:23 +02:00
Jonas Jenwald
5e50479ac6 Use more object destructuring in the "commonobj" handler in the API 2024-05-11 09:44:10 +02:00
Jonas Jenwald
4a8d742592 Move the reporting of page Stats into the API
This avoids having to add a couple of event listeners in the viewer, when debugging is enabled, and is consistent with the existing handling of `FontInspector` and `StepperManager` in the API.
2024-05-11 09:42:05 +02:00
Jonas Jenwald
2643570364 [api-minor] Re-factor how Node.js packages/polyfills are loaded (issue 17245)
*Please note:* This removes top level await from the GENERIC builds of the PDF.js library.

Despite top level await being supported in all modern browsers/environments, note [the MDN compatibility data](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/await#browser_compatibility), it seems that many frameworks and build-tools unfortunately have trouble with it.
Hence, in order to reduce the influx of support requests regarding top level await it thus seems that we'll have to try and fix this.

Given that top level await is only needed for Node.js environments, to load packages/polyfills, we re-factor things to limit the asynchronicity to that environment.
The "best" solution, with the least likelihood of causing future problems, would probably be to await the load of Node.js packages/polyfills e.g. at the top of the `getDocument`-function. Unfortunately that doesn't work though, since that's a *synchronous* function that we cannot change without breaking "the world".

Hence we instead await the load of Node.js packages/polyfills together with the `PDFWorker` initialization, since that's the *first point* of asynchronicity during initialization/loading of a PDF document. The reason that this works is that the Node.js packages/polyfills are only needed during fetching of the PDF document respectively during rendering, neither of which can happen *until* the worker has been initialized.
Hopefully this won't cause any future problems, since looking at the history of the PDF.js project I don't believe that we've (thus far) ever needed a Node.js dependency at an earlier point.
This new pattern for accessing Node.js packages/polyfills will also require some care during development *and* importantly reviewing, to ensure that no new top level await is added in the main code-base.
2024-05-06 23:20:03 +02:00
Jonas Jenwald
f6cd03955b [api-minor] Move the page reference/number caching into the API
Rather than having to handle this *manually* throughout the viewer, this functionality can instead be moved into the API which simplifies the code slightly.
2024-04-29 18:54:06 +02:00
Calixte Denizet
551e63901c Simplify the way to pass the glyph drawing instructions from the worker to the main thread
and remove the use of eval in the font loader.
2024-04-27 21:28:31 +02:00
Jonas Jenwald
7206d0a237 Validate explicit destinations on the worker-thread to prevent DataCloneError (issue 17981)
*Note:* This borrows a helper function from the viewer, however the code cannot be directly shared since the worker-thread has access to various primitives.
2024-04-22 22:51:35 +02:00
Jonas Jenwald
e4d0e84802 [api-minor] Replace the PromiseCapability with Promise.withResolvers()
This replaces our custom `PromiseCapability`-class with the new native `Promise.withResolvers()` functionality, which does *almost* the same thing[1]; please see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise/withResolvers

The only difference is that `PromiseCapability` also had a `settled`-getter, which was however not widely used and the call-sites can either be removed or re-factored to avoid it. In particular:
 - In `src/display/api.js` we can tweak the `PDFObjects`-class to use a "special" initial data-value and just compare against that, in order to replace the `settled`-state.
 - In `web/app.js` we change the only case to manually track the `settled`-state, which should hopefully be OK given how this is being used.
 - In `web/pdf_outline_viewer.js` we can remove the `settled`-checks, since the code should work just fine without it. The only thing that could potentially happen is that we try to `resolve` a Promise multiple times, which is however *not* a problem since the value of a Promise cannot be changed once fulfilled or rejected.
 - In `web/pdf_viewer.js` we can remove the `settled`-checks, since the code should work fine without them:
     - For the `_onePageRenderedCapability` case the `settled`-check is used in a `EventBus`-listener which is *removed* on its first (valid) invocation.
     - For the `_pagesCapability` case the `settled`-check is used in a print-related helper that works just fine with "only" the other checks.
 - In `test/unit/api_spec.js` we can change the few relevant cases to manually track the `settled`-state, since this is both simple and *test-only* code.

---
[1] In browsers/environments that lack native support, note [the compatibility data](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise/withResolvers#browser_compatibility), it'll be polyfilled via the `core-js` library (but only in `legacy` builds).
2024-04-01 11:42:37 +02:00