This commit is a first step towards #6419, and it can also help with
first compute which ops can affect what is visible in that part of
the page.
This commit adds logic to track operations with their respective
bounding boxes. Only operations that actually cause something to
be rendered have a bounding box and dependencies.
Consider the following example:
```
0. setFillRGBColor
1. beginText
2. showText "Hello"
3. endText
4. constructPath [...] -> eoFill
```
here we have three rendering operations: the showText op (2) and the
path (4). (2) depends on (0), (1) and (3), while (4) only depends on
(0). Both (2) and (4) have a bounding box.
This tracking happens when first rendering a PDF: we then use the
recorded information to optimize future partial renderings of a PDF, so
that we can skip operations that do not affected the PDF area on the
canvas.
All this logic only runs when the new `enableOptimizedPartialRendering`
preference, disabled by default, is enabled.
The bounding boxes and dependencies are also shown in the pdfBug
stepper. When hovering over a step now:
- it highlights the steps that they depend on
- it highlights on the PDF itself the bounding box
This way it helps to reduce the overall canvas dimensions and make the rendering faster.
The drawback is that when scrolling, the page can be blurry in waiting for the rendering.
The default value is 200% on desktop and will be 100% for GeckoView.
Instead, we update the visible canvas every 500ms.
With large canvas, updating at 60fps lead to a lot gfx transactions and it can take a lot of time.
For example, with wuppertal_2012.pdf on Windows, displaying it at 150% takes around 14 min !!! without
this patch when it takes only around 14 sec with. Even at 30% it helps to improve the performance
by around 20%.
Modern Node.js versions now include a `navigator` implementation, with a few basic properties, that's actually enough for the PDF.js use-cases; please see https://nodejs.org/api/globals.html#navigator
Unfortunately we still support Node.js version `20`, hence we add a basic polyfill since that allows simplifying the code slightly.
It's already enabled by default in Firefox, and since there's no open issues regarding auto-linking I suppose that we can attempt to enable it unconditionally.
Browsers not only limit the maximum total canvas area, but additionally also limit their maximum width/height which affects PDF documents with e.g. very tall and narrow pages.
To address this we add a new `maxCanvasDim` viewer-option, which in Firefox will use a browser preference, such that both the total canvas area and the width/height will affect when CSS-zooming is used.
The `viewerCssTheme` was removed in #17222 and subsequently reenabled in #17293,
but only for Chromium and generic builds. This commit reenables the
function using the new method introduced in #17293.
When rendering big PDF pages at high zoom levels, we currently fall back
to CSS zoom to avoid rendering canvases with too many pixels. This
causes zoomed in PDF to look blurry, and the text to be potentially
unreadable.
This commit adds support for rendering _part_ of a page (called
`PDFPageDetailView` in the code), so that we can render portion of a
page in a smaller canvas without hiting the maximun canvas size limit.
Specifically, we render an area of that page that is slightly larger
than the area that is visible on the screen (100% larger in each
direction, unless we have to limit it due to the maximum canvas size).
As the user scrolls around the page, we re-render a new area centered
around what is currently visible.
Automatically detect links in the text content of a file and automatically
generate link annotations at the appropriate locations to achieve
automatic link detection and hyperlinking.
This patch is adding some code in order to extract a drawing as curves from an image.
The algorithm is basically the following:
- reduce the dimensions
- make it gray
- apply a bilateral filter in order to add some blurryness while keeping the edges
- compute the histogram
- guess what's the background color which should contain a large majority of the pixels
- make a binary image
- extract the contours in using the Suzuki algorithm
- apply the Douglas-Peucker algorithm in order to reduce the number of points
The algorithm is improvable but it should work pretty well if there's a clear difference between
the background and the drawing.
In a v2 we could use a ML model in order to improve the extraction.
There's few changes related to the UI in order to make the tool usable, but they're very basic
for the moment.
This prepares for a future where we're using more than one wasm-file, originating in different `external/`-folders, by extending the existing `gulp.watch` usage.
The following diff illustrates how to add more entries:
```diff
diff --git a/gulpfile.mjs b/gulpfile.mjs
index 0e0a5a1ac..1502755be 100644
--- a/gulpfile.mjs
+++ b/gulpfile.mjs
@@ -655,6 +655,10 @@ function createWasmBundle() {
base: "external/openjpeg",
encoding: false,
}),
+ gulp.src(["external/foobar/*.wasm"], {
+ base: "external/foobar",
+ encoding: false,
+ }),
]);
}
@@ -2125,7 +2129,7 @@ gulp.task(
},
function watchWasm() {
gulp.watch(
- "external/openjpeg/*",
+ ["external/openjpeg/*", "external/foobar/*"],
{ ignoreInitial: false },
gulp.series("dev-wasm")
);
```
These old exceptions have a fair amount of overlap given how/where they are being used, which is likely because they were introduced at different points in time, hence we can shorten and simplify the code by replacing them with a more general `ResponseException` instead.
Besides an error message, the new `ResponseException` instances also include:
- A numeric `status` field containing the server response status, similar to the old `UnexpectedResponseException`.
- A boolean `missing` field, to allow easily detecting the situations where `MissingPDFException` was previously thrown.
In order to fix bug 1935076, we'll have to add a pure js fallback in case wasm is disabled
or simd isn't supported. Unfortunately, this fallback will take some space.
So, the main goal of this patch is to reduce the overall size (by ~93k).
As a side effect, it should make easier to use an other wasm file (which must export
_jp2_decode, _malloc and _free).
We have a number of base-classes that are only intended to be extended, but never to be used directly. To help enforce this during development these base-class constructors will check for direct usage, however that code is obviously not needed in the actual builds.
*Note:* This patch reduces the size of the `gulp mozcentral` output by `~2.7` kilo-bytes, which isn't a lot but still cannot hurt.
This patch adds a new entry in the secondary menu in order to open a dialog to let the user:
- disables the alt-text generation thanks to a ML model;
- deletes the alt-text model downloaded in Firefox;
- disabled the new alt-text flow.
For the Firefox pdf viewer, we want to use AI to guess an alt-text when adding an image to a pdf.
For now the telemtry stuff is not implemented and will come soon.
In order to test it locally:
- set enableAltText, enableFakeMLManager and enableUpdatedAddImage to true.
or in Firefox:
- set browser.ml.enable, pdfjs.enableAltText and pdfjs.enableUpdatedAddImage to true.
- Initialize all user-options upfront, to allow getting the options without having to check if they exist first. (This is easier to implement after PR 18475 landed.)
- Move the user-options into a private field in `AppOptions`. (This code is old enough to pre-date general availability of that class feature.)
- Move code/methods limited to GENERIC-builds into `AppOptions`.
Currently we'll only dispatch events, for the options that support it, when updating preferences. Since this could potentially lead to inconsistent behaviour, let's avoid any future surprises by *always* dispatching events regardless of how the relevant option is being changed.
Obviously we should then also dispatch events in `AppOptions.set`, and to avoid adding even more duplicated code this method is changed into a wrapper for `AppOptions.setAll`.
While this is technically a tiny bit less efficient I don't think it matters in practice, since outside of development- and debug-mode `AppOptions.set` is barely used.
Since "localeProperties" is needed in Firefox, let's remove a tiny bit of option duplication by using it in the GENERIC builds as well.
For convenience, the old debug-only "locale" hash-parameter is kept intact.
For options with varying types, see `useSystemFonts`, we're not sufficiently validating the type when setting a new value. This means that for an option that uses `OptionKind.UNDEF_ALLOWED` we'd allow *any* value, which is obviously not the intention.
Hence we instead introduce a new and *optional* `type`-field that allows specifying exactly which types are valid when multiple ones are supported.
*Note:* This obviously didn't occur to me until after PR 18465 landed, sorry about that!
This option is old enough that it predates e.g. the introduction of AppOptions, so it probably cannot hurt to re-factor this a little bit now.
- In development-mode we can just set this directly in AppOptions.
- In the extension-builds we still need to set it dynamically, however by moving this code we get the benefit of being able to avoid storing a data-URL in that case; note how [the API ignores those anyway](98e772727e/src/display/api.js (L256-L262)).
To avoid introducing any inline "hacks" in the viewer-code this meant adding `useSystemFonts` to the AppOptions, which thus required some new functionality since the default value should be `undefined` given how the option is handled in the API; note [this code](ed83d7c5e1/src/display/api.js (L298-L301)).
Finally, also moves the definition of the development-mode `window.isGECKOVIEW` property to the HTML file such that it's guaranteed to be set regardless of how and when it's accessed.