For now it's only possible to create a single PDF by selecting some pages from different PDF sources.
The merge is still pretty basic (which is why this is still a WIP): none of the following data are merged yet:
- the struct trees
- the page labels
- the outlines
- named destinations
There are two new ref tests where new PDFs are created: one with some extracted pages, and another
one (encrypted) which is just rewritten.
The reference images are generated from the original PDFs by selecting the pages we want, and the new
images are taken from the generated PDFs.
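A purely hypothetical sketch of the capability described above (neither `PdfMerger` nor its methods exist in the actual code; this only illustrates "one new PDF built from pages selected in several source PDFs"):
```
// Purely hypothetical sketch; this API does not exist in the actual code.
const merger = new PdfMerger();
merger.addPages(firstSourcePdf, [1, 3, 5]); // pages picked from one source
merger.addPages(secondSourcePdf, [2]); // a page picked from another source
// Struct trees, page labels, outlines and named destinations are not
// carried over yet.
const mergedBytes = await merger.save();
```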
This PR changes the way we store bounding boxes so that they use less
memory and can be more easily shared across threads in the future.
Instead of storing the bounding box and list of dependencies for each
operation that renders _something_, we now only store the bounding box
of _every_ operation and no dependencies list. The bounding box of
each operation covers the bounding box of all the operations affected
by it that render something. For example, the bounding box of a
`setFont` operation will be the bounding box of all the `showText`
operations that use that font.
This affects the debugging experience in pdfBug, since the bounding
box of an operation may now be larger than what it renders itself. To help
with this, when hovering over an operation we now also highlight (in red)
all its dependents. Operations that do not affect any part of the page
(i.e. with an empty bbox) are highlighted with white stripes.
To save memory, we now store bounding box x/y coordinates as uint8
rather than float64. This effectively gives us a 256x256 uniform grid
that covers the page, which is a high enough resolution for the use case.
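A minimal sketch of the quantization this implies (illustrative, not the actual implementation):
```
// Illustrative sketch, not the actual implementation: map page-space
// coordinates onto a 256x256 uniform grid so that each coordinate fits
// in a uint8.
function quantizeBBox([x0, y0, x1, y1], pageWidth, pageHeight) {
  // Round minima down and maxima up, so the quantized box always covers
  // the original one.
  return new Uint8Array([
    Math.max(0, Math.floor((x0 / pageWidth) * 255)),
    Math.max(0, Math.floor((y0 / pageHeight) * 255)),
    Math.min(255, Math.ceil((x1 / pageWidth) * 255)),
    Math.min(255, Math.ceil((y1 / pageHeight) * 255)),
  ]);
}
```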
This commit is a first step towards #6419, and it can also help with
rendering only a part of a page: we could first compute which ops can
affect what is visible in that part of the page.
This commit adds logic to track operations with their respective
bounding boxes. Only operations that actually cause something to
be rendered have a bounding box and dependencies.
Consider the following example:
```
0. setFillRGBColor
1. beginText
2. showText "Hello"
3. endText
4. constructPath [...] -> eoFill
```
Here we have two rendering operations: the showText op (2) and the
path (4). (2) depends on (0), (1) and (3), while (4) only depends on
(0). Both (2) and (4) have a bounding box.
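The recorded data for this example could look something like this (the exact structure is illustrative):
```
// Illustrative only: one possible shape for the recorded data, keyed by the
// operation's index in the operator list.
const recorded = {
  2: { bbox: [50, 700, 120, 715], deps: [0, 1, 3] }, // showText "Hello"
  4: { bbox: [200, 300, 400, 500], deps: [0] }, // the filled path
};
// Non-rendering ops (0, 1, 3) have no entry of their own; they only appear
// as dependencies of the ops that actually draw something.
```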
This tracking happens when first rendering a PDF: we then use the
recorded information to optimize future partial renderings of a PDF, so
that we can skip operations that do not affect the rendered PDF area on
the canvas.
All this logic only runs when the new `enableOptimizedPartialRendering`
preference (disabled by default) is enabled.
The bounding boxes and dependencies are also shown in the pdfBug
stepper. When hovering over a step now:
- the steps it depends on are highlighted
- its bounding box is highlighted on the PDF itself
This is a precursor to moving the call into a
worker thread to let us use `OffscreenCanvas`. The
current position wouldn't work since we apply
transformations to the canvas object after the
`getContext` call, which isn't allowed for
`OffscreenCanvas`. It also isn't allowed to clone
the canvas, or call `transferControlToOffscreen`
on it, after the `getContext` call.
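A small sketch of the ordering constraint, using standard DOM APIs (the worker script name is illustrative):
```
const worker = new Worker("canvas-worker.js"); // illustrative worker script
const canvas = document.createElement("canvas");

// Transferring must happen *before* any getContext() call on this canvas:
const offscreen = canvas.transferControlToOffscreen();
worker.postMessage({ canvas: offscreen }, [offscreen]);

// The other order throws: once getContext() has been called, the canvas
// can no longer be transferred:
//   canvas.getContext("2d");
//   canvas.transferControlToOffscreen(); // -> InvalidStateError
```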
To make it easier to tell which PDF.js version/commit the *built* files correspond to, they have (for many years) included `pdfjsVersion` and `pdfjsBuild` constants with that information.
As currently implemented this has a few shortcomings:
- It requires manually adding the code, with its preprocessor statements, in all relevant files.
- It requires ESLint disable statements, since it's obviously unused code.
- Being unused, this code is removed in the minified builds.
- This information would be more appropriate as comments, however Babel discards all comments during building.
- It would be helpful to have this information at the top of the *built* files, however it's being moved during building.
To address all of these issues, we'll instead utilize Webpack to insert the version/commit information as a comment placed just after the license header.
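For example, one way to inject such a comment is Webpack's `BannerPlugin` (a sketch; the actual build configuration, and how the comment ends up positioned relative to the license header, may differ):
```
// Sketch only; the version/build values are illustrative.
import webpack from "webpack";

export default {
  plugins: [
    new webpack.BannerPlugin({
      banner: "pdfjsVersion = 4.0.0; pdfjsBuild = 0123abc",
      raw: false, // emit the banner wrapped in a comment
    }),
  ],
};
```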
These `getAll` methods are not used anywhere within the PDF.js code-base, outside of tests, and were mostly added (speculatively) for third-party users.
To still allow access to the same data we instead introduce iterators on these classes, which (slightly) shortens the code and allows us to remove the `objectFromMap` helper function.
A summary of the changes in this patch:
- Replace the `getAll` methods with iterators in the following classes: `AnnotationStorage`, `Metadata`, and `OptionalContentGroup`.
- Change, and also re-name, `AnnotationStorage.prototype.setAll` into a test-only method since it's not used elsewhere.
- Remove the `Metadata.prototype.has` method, since it's only used in tests and can be trivially replaced by calling `Metadata.prototype.get` and checking if the returned value is `null`.
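A simplified sketch of the pattern (not the exact PDF.js code):
```
// Simplified sketch: expose an iterator instead of a `getAll` method that
// builds a plain object.
class AnnotationStorage {
  #storage = new Map();

  [Symbol.iterator]() {
    return this.#storage.entries();
  }
}

// Call-sites then iterate directly:
const annotationStorage = new AnnotationStorage();
for (const [key, value] of annotationStorage) {
  // ...
}
```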
Currently we lookup the `devicePixelRatio`, with fallback handling, in a number of spots in the code-base.
Rather than duplicating code we can instead add a new static method in the `OutputScale` class, since that one is now exposed in the API.
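A sketch of such a helper (the exact method name and fallback handling are illustrative):
```
// Sketch; the method name and fallback are illustrative.
class OutputScale {
  static get pixelRatio() {
    return globalThis.devicePixelRatio || 1;
  }
}
```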
With the recently added OpenJPEG no-wasm fallback we need to send the `wasmUrl` option to the worker-thread *regardless* of the value of the `useWorkerFetch` option, since the fallback won't work if we don't have a URL to `import` it from.
For consistency the code is re-factored to always send the factory-urls to the worker-thread, and simply check the `useWorkerFetch` option there instead.
Also, as a follow-up to PR 19525, introduce a new `useWasm` option that can be used in e.g. browser-tests to forcibly disable WebAssembly usage.
Currently this rule is disabled in a number of spots across the code-base, and unless absolutely necessary we probably shouldn't disable linting, so let's just update the code to fix all the outstanding cases.
Rather than modifying the "raw" dimensions of the page, we'll instead apply the `userUnit` as an *additional* scale-factor via CSS.
*Please note:* It's not clear to me if this solution is fully correct either, or if there's other problems with it, but it at least *appears* to work.
---
With these changes, the following CSS variables are now assumed to be available/set as necessary: `--total-scale-factor`, `--scale-factor`, `--user-unit`, `--scale-round-x`, and `--scale-round-y`.
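An illustrative sketch of how these variables fit together (values made up for the example):
```
// Illustrative sketch: the userUnit becomes an extra CSS scale factor
// rather than a change to the page's raw dimensions.
const pageDiv = document.querySelector(".page");
pageDiv.style.setProperty("--scale-factor", 1.5); // the current zoom level
pageDiv.style.setProperty("--user-unit", 1); // from the page's /UserUnit
// ...with the stylesheet combining them along the lines of:
//   --total-scale-factor: calc(var(--scale-factor) * var(--user-unit));
```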
In order to fix bug 1935076, we'll have to add a pure JS fallback in case wasm is disabled
or SIMD isn't supported. Unfortunately, this fallback will take some space.
So, the main goal of this patch is to reduce the overall size (by ~93k).
As a side effect, it should make it easier to use another wasm file (which must export
_jp2_decode, _malloc and _free).
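A rough sketch of the kind of dispatch this enables (the function names are hypothetical):
```
// Hypothetical sketch of the dispatch; the real code and names differ.
async function getJp2Decoder(wasmUrl) {
  if (typeof WebAssembly !== "undefined" && wasmUrl) {
    const { instance } = await WebAssembly.instantiateStreaming(fetch(wasmUrl));
    // Any wasm module works here, as long as it exports these three symbols:
    const { _jp2_decode, _malloc, _free } = instance.exports;
    return { _jp2_decode, _malloc, _free };
  }
  return loadPureJsFallback(); // hypothetical pure-JS decoder
}
```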
Converting errors to string drops their stack trace, making it more
difficult to debug their actual reason. We can instead pass the error
objects as-is to console.warn/error, so that Firefox/Chrome devtools
will show both the stack trace of the console.warn/error call, and the
original stack trace of the error.
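For example (`fetchResource` is a hypothetical operation):
```
async function load() {
  try {
    await fetchResource(); // hypothetical operation that may throw
  } catch (error) {
    // Before: stringifying drops the error's own stack trace.
    console.warn(`fetchResource failed: ${error}`);
    // After: passing the object lets devtools show both the stack trace of
    // the console.warn call and the original stack trace of the error.
    console.warn("fetchResource failed:", error);
  }
}
```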
This commit also enables the `unicorn/no-console-spaces` ESLint rule,
which avoids accidental extra spaces when passing multiple parameters to
`console.*` methods.
This patch makes a clear separation between the drawing code and the editing code.
It adds a `DrawEditor` class which should be extended in order to create new drawing tools.
As an example, the ink tool has been rewritten to use it.
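A purely hypothetical sketch of what extending it might look like (the subclass and method names are made up):
```
// Hypothetical sketch: a new drawing tool built on DrawEditor. The method
// names here are made up for illustration.
class RectangleEditor extends DrawEditor {
  onDrawStart(x, y) {
    this.startPoint = { x, y };
  }

  onDrawMove(x, y) {
    // Only the tool-specific geometry needs implementing; DrawEditor takes
    // care of the common editing behaviour (selection, resizing, etc.).
    this.updateShape(this.startPoint, { x, y });
  }
}
```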
Given that `browsertest` runs repeatedly time out in Google Chrome, and considering that Firefox is the primary development target, we stop running them on the bots to avoid having to repeatedly deal with this.
Note that we already disabled these tests *on Windows* almost three years ago, because of stability issues; see PR 14392.
and tweak the highlight one a bit (e.g. it's useless to have 64-bit floating point numbers
when 32-bit ones are enough).
It's a required step for the refactoring of the ink tool (in order to use the draw layer).
It avoids calling several functions acting on the same SVG element.
One goal is to make the code for drawing with the Ink tool similar to the free-highlighting one:
it doesn't really make sense to have such different ways of doing almost the same thing.
When the zoom level is high, it avoids creating an overly large canvas covering the whole page, which
consumes more memory, makes the drawing very slow, and makes the overall user experience pretty bad.
A second goal is to be able to easily implement more drawing tools, where we would just have to
implement how to draw from the pointer coordinates.
Added annotations can have some quadpoints (highlight, ink).
The `isNumberArray` check was returning false and consequently the annotation wasn't
printable.
The tests didn't catch this issue because the quadpoints were passed as an `Array`.
So driver.js has been updated to pass them as a `Float32Array`, in order
to be in a situation similar to the real-life one.
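A simplified sketch of the problem (not the exact helper):
```
// Simplified sketch, not the exact helper: a check like this rejects
// typed arrays.
const isNumberArrayNaive = v =>
  Array.isArray(v) && v.every(n => typeof n === "number");

isNumberArrayNaive([0, 1, 2, 3]); // true
isNumberArrayNaive(new Float32Array([0, 1, 2, 3])); // false, but should pass

// A version that also accepts typed arrays:
const isNumberArray = v =>
  ArrayBuffer.isView(v) ||
  (Array.isArray(v) && v.every(n => typeof n === "number"));
```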
As far as I can tell `Outliner` is only exposed in the API because we need to access it when running some of the reference-tests, but is otherwise not used.
Hence this seems like something that should be kept *internal* and thus only exposed in TESTING-builds.
This is very old code, and predates e.g. the introduction of JavaScript classes, which creates unnecessarily unwieldy code in the viewer.
By introducing a new `TextLayer` class in the API, similar to how e.g. the `AnnotationLayer` looks, we're able to keep most parameters on the class-instance itself. This removes the need to manually track them in the viewer, and simplifies the call-sites.
This also removes the `numTextDivs` parameter from the "textlayerrendered" event, since that's only added to support default-viewer functionality that no longer exists.
Finally we try, as far as possible, to polyfill the old `renderTextLayer` and `updateTextLayer` functions since they are exposed in the library API.
For *simple* invocations of `renderTextLayer` the behaviour should thus be the same, with only a warning printed in the console.
The only reason that this code still accepts `TextContent` is for backward-compatibility purposes, so we can simplify the implementation by always using a `ReadableStream` internally.
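A simplified sketch of how static `TextContent` can be wrapped:
```
// Simplified sketch: wrap an already-complete TextContent object in a
// one-chunk ReadableStream, so the rest of the code only deals with streams.
function toReadableStream(textContent) {
  return new ReadableStream({
    start(controller) {
      controller.enqueue(textContent);
      controller.close();
    },
  });
}
```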
This replaces our custom `PromiseCapability`-class with the new native `Promise.withResolvers()` functionality, which does *almost* the same thing[1]; please see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise/withResolvers
The only difference is that `PromiseCapability` also had a `settled`-getter, which was however not widely used and the call-sites can either be removed or re-factored to avoid it. In particular:
- In `src/display/api.js` we can tweak the `PDFObjects`-class to use a "special" initial data-value and just compare against that, in order to replace the `settled`-state.
- In `web/app.js` we change the only case to manually track the `settled`-state, which should hopefully be OK given how this is being used.
- In `web/pdf_outline_viewer.js` we can remove the `settled`-checks, since the code should work just fine without it. The only thing that could potentially happen is that we try to `resolve` a Promise multiple times, which is however *not* a problem since the value of a Promise cannot be changed once fulfilled or rejected.
- In `web/pdf_viewer.js` we can remove the `settled`-checks, since the code should work fine without them:
- For the `_onePageRenderedCapability` case the `settled`-check is used in a `EventBus`-listener which is *removed* on its first (valid) invocation.
- For the `_pagesCapability` case the `settled`-check is used in a print-related helper that works just fine with "only" the other checks.
- In `test/unit/api_spec.js` we can change the few relevant cases to manually track the `settled`-state, since this is both simple and *test-only* code.
---
[1] In browsers/environments that lack native support (see [the compatibility data](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise/withResolvers#browser_compatibility)), it'll be polyfilled via the `core-js` library (but only in `legacy` builds).
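A sketch of the replacement pattern, including manual `settled` tracking where it's still needed:
```
// Before: const capability = new PromiseCapability();
// After:
const { promise, resolve, reject } = Promise.withResolvers();

// Where the removed `settled` getter is still needed (e.g. in tests), it
// can be tracked manually:
let settled = false;
promise.then(
  () => (settled = true),
  () => (settled = true)
);
```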
The following are some highlights of this patch:
- In the Worker we only extract a *subset* of the potential contents of the `Usage` dictionary, to avoid having to implement/test a bunch of code that'd be completely unused in the viewer.
- In order to still allow the user to *manually* override the default visible layers in the viewer, the viewable/printable state is purposely *not* enforced during initialization in the `OptionalContentConfig` constructor.
- Printing will now always use the *default* visible layers, rather than using the same state as the viewer (as was the case previously).
This ensures that the printing-output will correctly take the `Usage` dictionary into account, and in practice toggling of visible layers rarely seem to be necessary except in the viewer itself (if at all).[1]
---
[1] In the unlikely case that it'd ever be deemed necessary to support fine-grained control of optional content visibility during printing, some new (additional) UI would likely be needed to support that case.
This should be a *tiny* bit more efficient, since it avoids parsing substrings that we don't care about.
*Please note:* I cannot find an ESLint rule to enforce this automatically.
This manually ignores some cases where the resulting auto-formatting would not, as far as I'm concerned, constitute a readability improvement or where we'd just end up with more overall indentation.
Please see https://eslint.org/docs/latest/rules/arrow-body-style
When a PDF has a FreeText without an appearance, we use a fake font in order to render it,
and that leads to creating a few new refs for the font.
But then when we're saving, we create some new refs which start at the same number
as the previously created ones.
Consequently, when saving we're using some wrong objects (like a font) to check if
we're able to render the newly added FreeText.
In order to fix this bug, we just remove the persistent refs (which are only used
when rendering/printing) during the saving.
The goal is to be able to get these outlines to fill the shape corresponding
to a text selection, in order to highlight some text contents.
The outlines will be used to show both selected and hovered highlights.
This patch changes almost all viewer-components[1] to use "data-l10n-id"/"data-l10n-args" for localization, which means that in many cases we no longer need to pass around the `L10n`-instance any more.
One part of the code-base where the `L10n`-instance is still being used "directly" is the AnnotationEditors, however while it might be possible to convert (most of) that code as well that's not attempted in this patch.
---
[1] The one exception is the `PDFDocumentProperties` dialog, since the way it's currently implemented makes that less straightforward to fix without a lot of code changes.
- For the generic viewer we use @fluent/dom and @fluent/bundle.
- For the built-in PDF viewer in Firefox, we set a localization url
  and then rely on document.l10n, which is a DOMLocalization object.
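An illustrative sketch of the markup-driven pattern this enables (the message id and args are made up):
```
// Illustrative sketch; the message id and args are made up. The element
// carries its own localization metadata, so no L10n instance needs to be
// passed around.
const button = document.createElement("button");
button.setAttribute("data-l10n-id", "pdfjs-example-button");
button.setAttribute("data-l10n-args", JSON.stringify({ count: 2 }));
// document.l10n (a DOMLocalization instance) then translates any connected
// root containing such attributes.
```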
At this point in time all browsers, and also Node.js, support standard `import`/`export` statements and we can now finally consider outputting modern JavaScript modules in the builds.[1]
In order for this to work we can *only* use proper `import`/`export` statements throughout the main code-base, and (as expected) our Node.js support made this much more complicated since both the official builds and the GitHub Actions-based tests must keep working.[2]
One remaining issue is that the `pdf.scripting.js` file cannot be built as a JavaScript module, since doing so breaks PDF scripting.
Note that my initial goal was to try and split these changes into a couple of commits, however that unfortunately didn't really work since it turned out to be difficult for smaller patches to work correctly and pass (all) tests that way.[3]
This is a classic case of every change requiring a couple of other changes, with each of those changes requiring further changes in turn and the size/scope quickly increasing as a result.
One possible "issue" with these changes is that we'll now only output JavaScript modules in the builds, which could perhaps be a problem with older tools. However it unfortunately seems far too complicated/time-consuming for us to attempt to support both the old and modern module formats, hence the alternative would be to do "nothing" here and just keep our "old" builds.[4]
---
[1] The final blocker was module support in workers in Firefox, which was implemented in Firefox 114; please see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Statements/import#browser_compatibility
[2] It's probably possible to further improve/simplify especially the Node.js-specific code, but it does appear to work as-is.
[3] Having partially "broken" patches, that fail tests, as part of the commit history is *really not* a good idea in general.
[4] Outputting JavaScript modules was first requested almost five years ago, see issue 10317, and nowadays there *should* be much better support for JavaScript modules in various tools.
In order to minimize the size of a saved PDF, we generate only one
image and use a reference to it in each annotation using it.
When printing, it's slightly different since we have to render each page
independently, but we use the same image within a page.
With the changes in PR 16552 we can now move general translation into the `AnnotationLayer` itself, which should improve things ever so slightly in third-party implementations where the default viewer isn't used.
- it'll help to be able to move popups on screen to let the user read the text
- popups won't inherit some properties from their parent:
  - the popup can be misrendered if, for example, the parent has a clip-path property
- add an outline to the popup when the parent is focused
- hide a popup when it's clicked
Some Arabic chars, like \ufe94, can be searched for in a PDF, hence they must be normalized
when creating the search query. To avoid duplicating the normalization code,
everything is moved into the find controller.
The previous code to normalize text was using NFKC but with a hardcoded map; it
has been replaced by the use of `normalize("NFKC")` (which helps to reduce the bundle
size by 30kb).
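For example:
```
// U+FE94 (ARABIC LETTER TEH MARBUTA FINAL FORM) normalizes to the base
// letter U+0629 (ARABIC LETTER TEH MARBUTA):
"\ufe94".normalize("NFKC"); // "\u0629"
```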
While playing with this \ufe94 char, I noticed that the bidi algorithm wasn't taking
some RTL unicode ranges into account, that the generated font wasn't embedding the mapping
for this char, and that the unicode ranges in the OS/2 table weren't up-to-date.
When normalized, some chars can be replaced by several ones, which led to
some extra chars in the text layer. To avoid any regression, when copying some text
from the text layer, the copied string is normalized (NFKC) before being put in the
clipboard (it works like this in both Acrobat and Chrome).