615 Commits

Author SHA1 Message Date
calixteman
44affa76b9
Merge pull request #20197 from ryzokuken/sab-font-data
Serialize font data into an ArrayBuffer
2025-09-22 09:40:14 +02:00
Ujjwal Sharma
4bed7370f4 [WIP] Serialize font data into an ArrayBuffer
This PR serializes font data into an ArrayBuffer
that is then transfered from the worker to the
main thread. It's more efficient than the current
solution which clones the "export data" object
which includes the font data as a Uint8Array.

It prepares us to switch to a SharedArrayBuffer
in the future, which would allow us to share
the font data with multiple agents, which would be
crucial for the upcoming "renderer" worker.
2025-09-19 12:02:40 +05:30
Bedis Nbiba
b6165c02b9 use console.warn/info where appropriate
Change info to use console.info and warn function
to use console.warn, this not only makes sense semantically
but also in practice server side runtimes like deno
write console.log to stdout, and console.warn
to stderr (info goes to stdout, unfortunately?)
this is important because logging to stdout
can break some cli apps.
2025-09-15 22:37:08 +00:00
Calixte Denizet
25990e4287 [Editor] Add a sidebar allowing the user to navigate between the comments in a pdf (bug 1985567)
This implements what has been specified but it's still not done: we have to handle editing a comment
or an annotation containing one.
2025-08-29 16:00:06 +02:00
Nicolò Ribaudo
6a22da9c2e
Add logic to track rendering area of various PDF ops
This commit is a first step towards #6419, and it can also help with
first compute which ops can affect what is visible in that part of
the page.

This commit adds logic to track operations with their respective
bounding boxes. Only operations that actually cause something to
be rendered have a bounding box and dependencies.

Consider the following example:
```
0. setFillRGBColor
1. beginText
2. showText "Hello"
3. endText
4. constructPath [...] -> eoFill
```
here we have three rendering operations: the showText op (2) and the
path (4). (2) depends on (0), (1) and (3), while (4) only depends on
(0). Both (2) and (4) have a bounding box.

This tracking happens when first rendering a PDF: we then use the
recorded information to optimize future partial renderings of a PDF, so
that we can skip operations that do not affected the PDF area on the
canvas.

All this logic only runs when the new `enableOptimizedPartialRendering`
preference, disabled by default, is enabled.

The bounding boxes and dependencies are also shown in the pdfBug
stepper. When hovering over a step now:
- it highlights the steps that they depend on
- it highlights on the PDF itself the bounding box
2025-08-22 18:26:59 +02:00
Tim van der Meij
23bd705cea
Merge pull request #20080 from calixteman/add_comment_1
[Editor] Add the possibility to add Popup annotations (bug 1976724)
2025-07-17 21:07:06 +02:00
Calixte Denizet
5162209b27 [Editor] Make possible to change the highlight color from the main panel
It's consistent with the way to change colors for FreeText or Ink annotations.
2025-07-17 20:57:51 +02:00
Calixte Denizet
636ff503ff [Editor] Add the possibility to add Popup annotations (bug 1976724)
It's a first big step.
This patch contains a new modal dialog to let the user edit/update a Popup
2025-07-15 22:39:16 +02:00
Calixte Denizet
8fc51dc089 [Editor] Add the possibility to add a popup to an annotation when saving
When saving/printing, only update the properties which are provided and set
a default value only when there is no pre-existing one.
2025-07-11 21:42:21 +02:00
Noritaka Kobayashi
fa568e826d
Fix typos across the codebase 2025-07-07 09:59:36 +09:00
Calixte Denizet
2541d96bf5 [JS] Make the date parser less strict
and display the expected date formt as a tooltip.
2025-06-20 08:57:07 +02:00
Jonas Jenwald
b629bafd1c Allow to, optionally, keep Unicode escape sequences in stringToPDFString (PR 17331 follow-up)
Currently *some* of the links[1] on page three of the `issue19835.pdf` test-case aren't clickable, since the destination (of the LinkAnnotation) becomes empty.
The reason is that these destinations include the character `\x1b`, which is interpreted as the start of a Unicode escape sequence specifying the language of the string; please refer to section [7.9.2.2 Text String Type](https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/PDF32000_2008.pdf#G6.1957385) in the PDF specification.

Hence it seems that we need a way to optionally disable that behaviour, to avoid a "badly" formatted string from becoming empty (or truncated), at least for cases where we are:
 - Parsing named destinations[2] and URLs.
 - Handling "strings" that are actually /Name-instances.
 - Building a lookup Object/Map based on some PDF data-structure.

*NOTE:* The issue that prompted this patch is obviously related to destinations, however I've gone through the `src/core/` folder and updated various other `stringToPDFString` call-sites that (directly or indirectly) fit the categories listed above.

---
[1] Try clicking on anything on the line containing "Item 7A. Quantitative and Qualitative Disclosures About Market Risk 27".

[2] Unfortunately just skipping `stringToPDFString` in this case would cause other issues, such as the named destination becoming "unusable" in the viewer; see e.g. issues 14847 and 14864.
2025-04-30 20:51:10 +02:00
Jonas Jenwald
3d4e8bb17e Add a basic navigator polyfill for older Node.js versions
Modern Node.js versions now include a `navigator` implementation, with a few basic properties, that's actually enough for the PDF.js use-cases; please see https://nodejs.org/api/globals.html#navigator
Unfortunately we still support Node.js version `20`, hence we add a basic polyfill since that allows simplifying the code slightly.
2025-04-28 13:07:12 +02:00
Jonas Jenwald
abc9522886 Avoid (most) string parsing when removing/replacing the hash property of a URL 2025-04-25 23:13:05 +02:00
Calixte Denizet
d7cbda6cb5 Avoid to create any subarrays when optimizing 'save, transform, constructPath, restore' (bug 1961107)
Removing those `subarray`calls helps to improve performance by a factor 6 on Linux and by a factor of 3
on Windows 11.
2025-04-17 19:14:01 +02:00
Jonas Jenwald
22657e2b6e
Merge pull request #19778 from Snuffleupagus/replace-getAll
[api-minor] Replace various `getAll` methods with iterators
2025-04-08 17:26:24 +02:00
Jonas Jenwald
d882d0869c Move the IDENTITY_MATRIX constant into src/core/core_utils.js (PR 19772 follow-up)
After the changes in PR 19772 the `IDENTITY_MATRIX` constant is now only used on the worker-thread, which leads to Webpack marking the code as unused in the *built* `pdf.mjs` file; see https://phabricator.services.mozilla.com/D244533#change-8oITAexCvrlQ
2025-04-07 11:40:18 +02:00
Jonas Jenwald
2c593b06e4 [api-minor] Replace various getAll methods with iterators
These `getAll` methods are not used anywhere within the PDF.js code-base, outside of tests, and were mostly added (speculatively) for third-party users.
To still allow access to the same data we instead introduce iterators on these classes, which (slightly) shortens the code and allows us to remove the `objectFromMap` helper function.

A summary of the changes in this patch:
 - Replace the `getAll` methods with iterators in the following classes: `AnnotationStorage`, `Metadata`, and `OptionalContentGroup`.

 - Change, and also re-name, `AnnotationStorage.prototype.setAll` into a test-only method since it's not used elsewhere.

 - Remove the `Metadata.prototype.has` method, since it's only used in tests and can be trivially replaced by calling `Metadata.prototype.get` and checking if the returned value is `null`.
2025-04-06 21:43:16 +02:00
Calixte Denizet
e7a951547d Replace UpdateRectMinMax by getAxialAlignedBoundingBox
and don't use array destructuring because it induces a memory and perf penalties.
2025-04-04 19:57:55 +02:00
Jonas Jenwald
4262603b06 Re-name the Util.applyTransformToBezierInPlace method
Given that all `Util.apply...` methods are now using in/out parameters, we can slightly shorten the name of this one.
2025-04-01 23:03:50 +02:00
Jonas Jenwald
c852e877d8 Change Util.applyInverseTransform to use the point-argument as an in/out parameter
This will help reduce the total number of Array allocations, which cannot hurt.
2025-04-01 23:03:44 +02:00
Jonas Jenwald
fa643bb22f Change Util.applyTransform to use the point-argument as an in/out parameter
This will help reduce the total number of Array allocations, which cannot hurt, and also allows us to remove the `Util.applyTransformInPlace` method.
2025-04-01 23:03:36 +02:00
Jonas Jenwald
e5fbf52405
Merge pull request #19736 from Snuffleupagus/compileType3Glyph-worker
[api-minor] Move Type3-glyph compilation to the worker-thread
2025-04-01 19:40:30 +02:00
calixteman
25693dc0ee
Merge pull request #19712 from calixteman/optimize_save_construct
Optimize save-transform-constructPath-restore
2025-04-01 18:20:59 +02:00
Jonas Jenwald
9cd5a9658a [api-minor] Move Type3-glyph compilation to the worker-thread
After PR 19731 the format of compiled Type3-glyphs is now simple enough that the compilation can be moved to the worker-thread, without introducing any significant additional complexity.
This allows us to, ever so slightly, simplify the implementation in `src/display/canvas.js` since the Type3 operatorLists will now directly include standard path-rendering operators (using the format introduced in PR 19689).

As part of these changes we also stop caching Type3 image masks since: we've not come across any cases where that actually helps, they're usually fairly small, and it simplifies the code.

Note that one "negative" change introduced in this patch is that we'll now compile Type3-glyphs *eagerly*, whereas previously we'd only do that lazily upon their first use.
However, this doesn't seem to impact performance in any noticeable way since the compilation is fast enough (way below 1 ms/glyph in my testing) and Type3-fonts are also limited to just 256 glyphs. Also, many (or most?) Type3-fonts don't even use image masks and are thus not affected by these changes.
2025-04-01 09:09:00 +02:00
Calixte Denizet
6e9fbd9420 Simplify singularValueDecompose2dScale in order to make it using less memory
In using the Firefox profiler (with JS allocations tracking) and wuppertal.pdf, I noticed
we were using a bit too much memory for a function which is supposed to just compute 2 numbers.
The memory used by itself isn't so important but having a too much objects lead to waste some time
to gc them.

So this patch aims to simplify it a bit.
2025-03-31 10:28:01 +02:00
Calixte Denizet
6146e5fee7 Optimize save-transform-constructPath-restore
The 4 operations can be replaced with just one in applying the transform to the points coordinates.
2025-03-25 15:31:45 +01:00
Jonas Jenwald
b85f0903ca Add new bounding-box helpers in Util to reduce code duplication
Currently we have a `Util`-helper for computing the bounding-box of a Bézier curve, however for simple points and rectangles we repeat virtually identical code in many spots throughout the code-base.

 - Introduce new `Util.pointBoundingBox` and `Util.rectBoundingBox` helpers.

 - Remove the "fallback" from `Util.bezierBoundingBox` and only support passing in a `minMax`-array, since there's only a single call-site using the other format and it could be easily updated.
2025-03-23 19:20:02 +01:00
Jonas Jenwald
e73224ded7 Introduce Math.sumPrecise usage in the code-base
This is a new JavaScript feature that makes it easy to compute the sum of list of values; see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Math/sumPrecise

This allows us to remove most cases of `Array.prototype.reduce`, which helps improve readability since that (in my opinion) often isn't the most intuitive code.
2025-03-23 13:03:56 +01:00
calixteman
d009e4b3a7
Merge pull request #19689 from calixteman/use_path2d
[api-minor] Use a Path2D when doing a path operation in the canvas (bug 1946953)
2025-03-22 21:46:27 +01:00
Calixte Denizet
be1f5671bb [api-minor] Use a Path2D when doing a path operation in the canvas (bug 1946953)
With this patch, all the paths components are collected in the worker until a path
operation is met (i.e., stroke, fill, ...).
Then in the canvas a Path2D is created and will replace the path data transfered from the worker,
this way when rescaling, the Path2D can be reused.
In term of performances, using Path2D is very slightly improving speed when scaling the canvas.
2025-03-22 20:35:24 +01:00
Jonas Jenwald
028e4f7ea8 [api-minor] Add a basic AbortSignal.any polyfill in PDF.js legacy builds
This is an admittedly very basic polyfill, to allow us to remove a bunch of inline feature testing, that I've thrown together based on reading https://developer.mozilla.org/en-US/docs/Web/API/AbortSignal/any_static and related MDN articles.

Compared to PR 19218 it's obviously much more "primitive", however the implementation is simple and it doesn't suffer from any licensing issues (since I wrote the code myself).
2025-03-18 10:54:49 +01:00
Jonas Jenwald
07bbbf75a5 Introduce a helper function for clamping a value to a range
Currently we have a number of spots in the code-base where we need to clamp a value to a [min, max] range. This is either implemented using `Math.min`/`Math.max` or with a local helper function, which leads to some unnecessary duplication.

Hence this patch adds and re-uses a single helper function for this, which we'll hopefully be able to remove in the future once https://github.com/tc39/proposal-math-clamp/ becomes generally available.
2025-03-06 14:05:26 +01:00
Jonas Jenwald
165d90fe26 Re-use the isValidExplicitDest helper function in the worker/viewer
Currently we re-implement the same helper function twice, which in hindsight seems like the wrong decision since that way it's quite easy for the implementations to accidentally diverge.
The reason for doing it this way was because the code in the worker-thread is able to check for `Ref`- and `Name`-instances directly, which obviously isn't possible in the viewer but can be solved by passing validation-functions to the helper.
2025-03-01 12:08:56 +01:00
Jonas Jenwald
c2e33307b1 Introduce some URL.parse() usage in the code-base
This (fairly new) static method allows parsing URLs without having to wrap `new URL(...)` calls within `try...catch` blocks, thus simplifying the code; see https://developer.mozilla.org/en-US/docs/Web/API/URL/parse_static

For older browsers/environments the functionality will be polyfilled, but *only* in `legacy` builds, via `core-js`; see https://github.com/zloirock/core-js?tab=readme-ov-file#url-and-urlsearchparams

*Please note:* This is currently limited to the `src/`- and `web/`-folders, such that we don't break development/testing, since the functionality is not available in all Node.js versions that we support; see https://developer.mozilla.org/en-US/docs/Web/API/URL/parse_static#browser_compatibility
2025-02-15 19:10:36 +01:00
Jonas Jenwald
d5d3d8b3f7 [GENERIC viewer] Add Fluent PLATFORM function (PR 19414 follow-up)
This patch extends the `FeatureTest.platform` getter to provide the necessary information, and uses that one to implement a Fluent `PLATFORM` function based on: https://searchfox.org/mozilla-central/rev/d1fbe983fb7720f0a4aca0e748817af11c1a374e/intl/l10n/rust/fluent-ffi/src/bundle.rs#81-98
2025-02-06 18:59:35 +01:00
calixteman
42c2b7b657
Merge pull request #19339 from calixteman/signature_tools
[Editor] (WIP) Add a new tool in order to add an handwritten signature to a pdf (bug 1942343)
2025-01-30 10:54:56 +01:00
Calixte Denizet
2f828c7bf4 [Editor] (WIP) Add a new tool in order to add an handwritten signature to a pdf (bug 1942343)
This patch is adding some code in order to extract a drawing as curves from an image.
The algorithm is basically the following:
 - reduce the dimensions
 - make it gray
 - apply a bilateral filter in order to add some blurryness while keeping the edges
 - compute the histogram
 - guess what's the background color which should contain a large majority of the pixels
 - make a binary image
 - extract the contours in using the Suzuki algorithm
 - apply the Douglas-Peucker algorithm in order to reduce the number of points

The algorithm is improvable but it should work pretty well if there's a clear difference between
the background and the drawing.
In a v2 we could use a ML model in order to improve the extraction.

There's few changes related to the UI in order to make the tool usable, but they're very basic
for the moment.
2025-01-29 21:52:14 +01:00
Jonas Jenwald
9241e1be8c [api-minor] Simplify clean-up of page resources after rendering
After PR 2317, which landed in 2012, we'd immediately clean-up after rendering for pages with large image resources. This had the effect that re-rendering, e.g. after zooming, would force us to re-parse the entire page which could easily lead to bad performance.
In PR 16108, which landed in 2023, we tried to lessen the impact of that by slightly delaying clean-up however that's obviously not a perfect solution (and it increased the complexity of the relevant code).

Furthermore, the condition for this "immediate" clean-up seems a bit arbitrary to me since a page could easily contain a large number of smaller images whose total size vastly exceeds the threshold.

Hence this patch, which suggests that we remove the conditional and delayed clean-up after rendering. Compared to the situation back in 2012, a number of things have improved since:
 - We have *multiple* caches for repeated image-resources on the worker-thread[1], which helps reduce overall memory usage and improves performance.
 - We downsize huge images on the worker-thread, which means that the images we're using on the main-thread cannot be arbitrarily large.
 - The amount of available RAM on devices should be a lot higher, since more than a decade has passed.

A future improvement here, for more resource constrained environments, could be to instead clean-up when actually needed using e.g. `WeakRef`s (see issue 18148).

---
[1] More specifically:
 - `LocalImageCache`, which caches image-data by /Name and /Ref on the `PartialEvaluator.prototype.getOperatorList` level.
 - `RegionalImageCache`, which caches image-data by /Ref on the `PartialEvaluator`-instance (i.e. at the page) level.
 - `GlobalImageCache`, which caches image-data by /Ref globally at the document level.
2025-01-22 12:19:44 +01:00
Jonas Jenwald
75cba72ca6 [api-major] Replace MissingPDFException and UnexpectedResponseException with one exception
These old exceptions have a fair amount of overlap given how/where they are being used, which is likely because they were introduced at different points in time, hence we can shorten and simplify the code by replacing them with a more general `ResponseException` instead.

Besides an error message, the new `ResponseException` instances also include:
 - A numeric `status` field containing the server response status, similar to the old `UnexpectedResponseException`.

 - A boolean `missing` field, to allow easily detecting the situations where `MissingPDFException` was previously thrown.
2025-01-16 22:51:05 +01:00
Jonas Jenwald
6bde49a606 Reduce duplication when handling "DocException" and "PasswordRequest" messages
Rather than having to manually implement the exception-handling for the "DocException" message, we can instead re-use (and slightly extend) the existing `wrapReason` function since that one already does what we need.

Furthermore, we can also simplify handling of the "PasswordRequest" message a little bit and again re-use the `wrapReason` function.

Finally, the patch makes the following smaller changes:
 - Avoid needlessly re-creating exceptions in the `wrapReason` function.
 - Use a slightly shorter parameter name in the `wrapReason` function.
 - Remove the unused entries in the `CallbackKind`/`StreamKind` enumerations.
2024-12-26 12:55:49 +01:00
Calixte Denizet
2b05924504 Improve perfs of the font renderer
Some SVG paths are generated from the font and used in the main thread
to render the glyphs.
2024-12-08 17:12:10 +01:00
Jonas Jenwald
ef0331877d Introduce Promise.try() usage in the code-base
This simplifies the creation of Promises in some cases; see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise/try
2024-12-04 10:31:42 +01:00
Calixte Denizet
cee65fcd4e [Editor] Add a new base class to allow to add a drawing in the SVG layer.
This patch makes a clear separation between the way to draw and the editing stuff.
It adds a class DrawEditor which should be extended in order to create new drawing tools.
As an example, the ink tool has been rewritten in order to use it.
2024-11-28 15:23:03 +01:00
Calixte Denizet
1ef670411a Rescale the image data when they're really too large
It fixes #17190.
2024-11-23 22:42:30 +01:00
Jonas Jenwald
c290a12ce1 Simplify the getUuid helper function
We can remove most feature testing from this helper function, with the exception of `randomUUID` since that's only available in "secure contexts", and also remove the fallback code-path.
Note that this code was only added for Node.js compatibility, and it's no longer necessary now that the minimum support version is `20`; see also https://developer.mozilla.org/en-US/docs/Web/API/Crypto#browser_compatibility

Finally, this patch also adds a basic unit-test for the helper function.
2024-11-21 13:11:10 +01:00
Jonas Jenwald
af3c1f2a0d Enable the ESLint no-console rule in parts of the code-base
The purpose of these changes is to make it more difficult to accidentally include logging statements, used during development and debugging, when submitting patches for review.

For (almost) all code residing in the `src/` folder we should use our existing helper functions to ensure that all logging can be controlled via the `verbosity` API-option.

For the `test/unit/` respectively `test/integration/` folders we shouldn't need any "normal" logging, but it should be OK to print the *occasional* warning/error message.

Please find additional details about the ESLint rule at https://eslint.org/docs/latest/rules/no-console
2024-11-14 12:41:39 +01:00
Jonas Jenwald
65eedfb0fc [api-minor] Add a getDocument option to disable ImageDecoder usage
This allows end-users to forcibly disable `ImageDecoder` usage, even if the browser appears to support it (similar to the pre-existing option for `OffscreenCanvas`).
2024-11-12 17:12:42 +01:00
Jonas Jenwald
2145a7b9ca Use the hexNumbers structure in the stringToUTF16HexString helper
We can re-use the `hexNumbers` structure here, since that allows us to directly lookup the hexadecimal values and shortens the code.
2024-11-02 15:00:32 +01:00
Jonas Jenwald
db1238aae3 Re-factor the toHexUtil helper (PR 17862 follow-up)
We can re-use the `hexNumbers` structure, since that allows us to directly lookup the hexadecimal values and shortens the code.
2024-10-29 16:35:44 +01:00