pdf.js

Author	SHA1	Message	Date
Tim van der Meij	d01a0bd0c8	Fix annotation border style parsing by handling empty dash arrays The PDF specification states that empty dash arrays, i.e. arrays with zero elements, are in fact valid. In that case the dash array simply corresponds to a solid, unbroken line. However, this case was erroneously being flagged as invalid and therefore the annotation was not drawn because its width was set to zero. This commit fixes the issue by allowing dash arrays to have a length of zero.	2024-04-08 16:34:27 +02:00
Calixte Denizet	3f2f98336e	Update the current stride before composing when decoding a text region Fixes #17871. We do something similar to: https://source.chromium.org/chromium/chromium/src/+/main:third_party/pdfium/core/fxcodec/jbig2/JBig2_TrdProc.cpp;l=373-379;drc=24c6be6924df3ff585bb63f6aed4e2c81e791fb2	2024-04-03 18:44:50 +02:00
Calixte Denizet	8f5d907a52	Don't translate char codes when platform,encoding isn't (3,0)	2024-04-03 16:08:11 +02:00
Tim van der Meij	2e5282928f	Merge pull request #17854 from Snuffleupagus/rm-PromiseCapability [api-minor] Replace the `PromiseCapability` with `Promise.withResolvers()`	2024-04-02 15:21:43 +02:00
Jonas Jenwald	e4d0e84802	[api-minor] Replace the `PromiseCapability` with `Promise.withResolvers()` This replaces our custom `PromiseCapability`-class with the new native `Promise.withResolvers()` functionality, which does almost the same thing[1]; please see https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise/withResolvers The only difference is that `PromiseCapability` also had a `settled`-getter, which was however not widely used and the call-sites can either be removed or re-factored to avoid it. In particular: - In `src/display/api.js` we can tweak the `PDFObjects`-class to use a "special" initial data-value and just compare against that, in order to replace the `settled`-state. - In `web/app.js` we change the only case to manually track the `settled`-state, which should hopefully be OK given how this is being used. - In `web/pdf_outline_viewer.js` we can remove the `settled`-checks, since the code should work just fine without it. The only thing that could potentially happen is that we try to `resolve` a Promise multiple times, which is however not a problem since the value of a Promise cannot be changed once fulfilled or rejected. - In `web/pdf_viewer.js` we can remove the `settled`-checks, since the code should work fine without them: - For the `_onePageRenderedCapability` case the `settled`-check is used in a `EventBus`-listener which is removed on its first (valid) invocation. - For the `_pagesCapability` case the `settled`-check is used in a print-related helper that works just fine with "only" the other checks. - In `test/unit/api_spec.js` we can change the few relevant cases to manually track the `settled`-state, since this is both simple and test-only code. --- [1] In browsers/environments that lack native support, note [the compatibility data](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Promise/withResolvers#browser_compatibility), it'll be polyfilled via the `core-js` library (but only in `legacy` builds).	2024-04-01 11:42:37 +02:00
Jonas Jenwald	3d2eb36e27	Merge pull request #17858 from Snuffleupagus/issue-17848 Ensure that Mesh /Shadings have non-zero width/height (issue 17848)	2024-03-30 21:59:58 +01:00
Jonas Jenwald	07a8836ab2	Ensure that Mesh /Shadings have non-zero width/height (issue 17848)	2024-03-29 22:58:25 +01:00
Calixte Denizet	136c1faa7f	Display outlines even if one has no title Fixes #17856.	2024-03-29 21:30:24 +01:00
Jonas Jenwald	0d039937f9	Add better support for /Launch actions with /FileSpec dictionaries (issue 17846)	2024-03-26 20:15:48 +01:00
Calixte Denizet	9c3471dd01	Don't render corrupted inlined images Fixes #17794.	2024-03-15 15:33:18 +01:00
Jonas Jenwald	3c78ff5fb0	[api-minor] Implement basic support for OptionalContent `Usage` dicts (issue 5764, bug 1826783) The following are some highlights of this patch: - In the Worker we only extract a subset of the potential contents of the `Usage` dictionary, to avoid having to implement/test a bunch of code that'd be completely unused in the viewer. - In order to still allow the user to manually override the default visible layers in the viewer, the viewable/printable state is purposely not enforced during initialization in the `OptionalContentConfig` constructor. - Printing will now always use the default visible layers, rather than using the same state as the viewer (as was the case previously). This ensures that the printing-output will correctly take the `Usage` dictionary into account, and in practice toggling of visible layers rarely seems to be necessary except in the viewer itself (if at all).[1] --- [1] In the unlikely case that it'd ever be deemed necessary to support fine-grained control of optional content visibility during printing, some new (additional) UI would likely be needed to support that case.	2024-03-12 13:18:15 +01:00
Calixte Denizet	4e1b96c781	[Annotations] Widget annotations must be in front of the other ones (bug 1883609)	2024-03-05 19:04:58 +01:00
Calixte Denizet	e0b843d991	Always add links in the annotation layer Fixes #17730.	2024-02-27 22:48:08 +01:00
Calixte Denizet	a6eadf8150	Avoid accessing a missing cidSystemInfo property Fixes #17689.	2024-02-19 09:55:23 +01:00
Jonas Jenwald	a7bcc81eb1	Add a dummy `beginMarkedContentProps` operator when optional content parsing fails (issue 17679)	2024-02-17 13:45:16 +01:00
calixteman	a83a8d7e4f	Merge pull request #17674 from calixteman/issue17671 Fix the endoffset of the last glyph when it's followed by a null offset in the loca table	2024-02-15 10:19:55 +01:00
Calixte Denizet	fcad3718f0	Fix the endoffset of the last glyph when it's followed by a null offset in the loca table It fixes #17671.	2024-02-14 17:20:04 +01:00
Calixte Denizet	2133da166e	When updating, write the xref table in the same format as the previous one (bug 1878916) The specs are unclear about which xref table format must be used. By checking the validity of some PDFs in Acrobat's preflight tool, we can guess that keeping the same format is the correct approach. The PDF in the mentioned bug, after having been changed, wasn't correctly displayed in either Chrome or Acrobat: it's now fixed.	2024-02-13 14:14:37 +01:00
Jonas Jenwald	37e98e39f6	Skip any whitespace after the first object in linearized PDFs (issue 17665) This way the code is now consistent with the non-linearized branch in the `PDFDocument.startXRef` getter.	2024-02-12 22:05:36 +01:00
Jonas Jenwald	363dce6744	Use a limit, in more places, when splitting strings This should be a tiny bit more efficient, since it avoids parsing substrings that we don't care about. Please note: I cannot find an ESLint rule to enforce this automatically.	2024-02-02 13:10:52 +01:00
Calixte Denizet	7f2428a77e	Reduce memory use and improve perfs when computing the bounding box of a bezier curve (bug 1875547) It isn't really a fix for the mentioned bug but it slightly improves things. Reducing the memory use also reduces the time spent in the GC. The algorithm to compute the bounding box is the same as before; it has just been rewritten to be more efficient.	2024-01-24 23:41:14 +01:00
Jonas Jenwald	fa583427ef	Always export the "raw" /ToUnicode-data from `PartialEvaluator.preEvaluateFont` (PR 13354 follow-up) This, ever so slightly, simplifies the implementation in the `PartialEvaluator.extractDataStructures`-method.	2024-01-22 13:06:32 +01:00
Jonas Jenwald	f21a30dfb4	Convert the `PartialEvaluator.readToUnicode` method to be async	2024-01-22 12:47:06 +01:00
Jonas Jenwald	f5c01188dc	Convert the `PartialEvaluator.extractDataStructures` method to be async	2024-01-22 12:47:06 +01:00
Jonas Jenwald	cf0797dfbd	Use `await` consistently in the `PartialEvaluator.setGState` method	2024-01-22 12:47:06 +01:00
Jonas Jenwald	1cc83c4fdc	Use `await` consistently in the `PartialEvaluator.buildFormXObject` method	2024-01-22 12:47:06 +01:00
calixteman	bba831821d	Merge pull request #17558 from calixteman/bug1669097 Print correctly documents containing chars with an unicode greater than 0xFFFF (bug 1669097)	2024-01-22 12:23:06 +01:00
Calixte Denizet	06601fd90c	Print correctly documents containing chars with an unicode greater than 0xFFFF (bug 1669097)	2024-01-22 10:48:00 +01:00
Tim van der Meij	49b2d9b5af	Merge pull request #17556 from Snuffleupagus/issue-17554 Ensure that `EvaluatorPreprocessor.opMap` has a null-prototype (issue 17554)	2024-01-21 20:58:09 +01:00
Jonas Jenwald	d7e41d4cb6	Ensure that `EvaluatorPreprocessor.opMap` has a null-prototype (issue 17554) This accidentally regressed in PR 16956, sorry about that!	2024-01-21 19:59:13 +01:00
Jonas Jenwald	3c2c0ecd88	Use the ESLint `arrow-body-style` rule in more spots in `src/core/evaluator.js`	2024-01-21 17:42:33 +01:00
Jonas Jenwald	d1bef8cb86	Use `await` consistently in the `PartialEvaluator.translateFont` method	2024-01-21 17:36:50 +01:00
Jonas Jenwald	fc62eec901	Convert the `handleSetFont` methods, in `src/core/evaluator.js`, to be async	2024-01-21 17:32:05 +01:00
Jonas Jenwald	f9a384d711	Enable the `arrow-body-style` ESLint rule This manually ignores some cases where the resulting auto-formatting would not, as far as I'm concerned, constitute a readability improvement or where we'd just end up with more overall indentation. Please see https://eslint.org/docs/latest/rules/arrow-body-style	2024-01-21 16:20:55 +01:00
Jonas Jenwald	9dfe9c552c	Use shorter arrow functions where possible For arrow functions that are both simple and short, we can avoid using explicit `return` to shorten them even further without hurting readability. For the `gulp mozcentral` build-target this reduces the overall size of the output by just under 1 kilo-byte (which isn't a lot but still can't hurt).	2024-01-21 10:13:12 +01:00
Calixte Denizet	d64f334f98	[Editor] Add support for printing/saving free highlight annotations	2024-01-19 12:58:46 +01:00
Calixte Denizet	10389c5017	Add the font Linux Libertine as a possible substitution for Times New Roman and try to load the font family (guessed from the font name) before trying the local substitution. The local(...) command expects a real font name and not a predefined substitution, which is why we try the font family.	2024-01-16 12:31:23 +01:00
Calixte Denizet	e9946fa22a	[Editor] Draw a line instead of a Bezier curve when an Ink has only one point Fixes #17418.	2024-01-15 13:32:36 +01:00
Calixte Denizet	405f573d70	Take into account empty lines when extracting text content from the appearance Fixes #17492.	2024-01-14 20:23:29 +01:00
Calixte Denizet	0392feaee4	Remove terminal white spaces when extracting text from annotation appearances	2024-01-09 10:42:53 +01:00
Calixte Denizet	7839e7b495	Preserve the whitespaces when getting text from FreeText annotations (bug 1871353) When the text of an annotation is extracted using getTextContent, consecutive white spaces are just replaced by one space. So this patch adds an option to make sure that white spaces are preserved when the appearance is parsed. For the case where there's no appearance, we can have a fast path to get the correct string from the Content entry. When an existing FreeText is edited, spaces (0x20) are replaced by non-breakable ones (0xa0) so that all of them are visible on screen.	2024-01-05 10:20:32 +01:00
Jonas Jenwald	9f02cc36d4	Attempt to further reduce re-parsing for globally cached images (PR 11912, 16108 follow-up) In PR 11912 we started caching images that occur on multiple pages globally, which improved performance a lot in many PDF documents. However, one slightly annoying limitation of the implementation is the need to re-parse the image once the global-caching threshold has been reached. Previously this was difficult to avoid, since large image-resources will cause cleanup to run on the main-thread after rendering has finished. In PR 16108 we started delaying this cleanup a little bit, to improve performance if a user e.g. zooms and/or rotates the document immediately after rendering completes. Taking those two PRs together, we now have a situation where it's much more likely that the main-thread has "globally used" images cached at the page-level. Hence we can instead attempt to copy a locally cached image into the global object-cache on the main-thread and thus reduce unnecessary re-parsing of large/complex global images, which significantly reduces the rendering time in many cases. For the PDF document in issue 11878, the rendering time of the second page changes as follows (on my computer): - With the `master`-branch it takes >600 ms to render. - With this patch that goes down to ~50 ms, which is one order of magnitude faster. (Note that all other pages are, as expected, completely unaffected by these changes.) This new main-thread copying is limited to "large" global images, since: - Re-parsing of small images, on the worker-thread, is usually fast enough to not be an issue. - With the delayed cleanup after rendering, it's still not guaranteed that an image is available in a page-level cache on the main-thread. - This forces the worker-thread to wait for the main-thread, which is a pattern that you always want to avoid unless absolutely necessary.	2023-12-21 21:26:21 +01:00
Jonas Jenwald	e547b198a3	Compute the length of the final image-bitmap/data on the worker-thread Currently this is done in the API, but moving it into the worker-thread will simplify upcoming changes.	2023-12-21 21:26:21 +01:00
Jonas Jenwald	63eb8991a3	Support Annotations with corrupt /BS-entries There's obviously a few things wrong with the Annotations in the referenced PDF document, however parsing of an Annotation shouldn't just break if the /BS-entry isn't a dictionary.	2023-12-09 10:36:18 +01:00
Calixte Denizet	ae5828c968	[Editor] Avoid conflicts between new persistent refs and the ones created when saving (bug 1865341) When a PDF has a FreeText without appearance, we use a fake font in order to render it, and that leads to creating a few new refs for the font. But then when we're saving, we create some new refs which start at the same number as the previously created ones. Consequently, when saving we're using some wrong objects (like a font) to check if we're able to render the newly added FreeText. In order to fix this bug, we just remove the persistent refs (which are only used when rendering/printing) during the saving.	2023-12-05 12:33:21 +01:00
Calixte Denizet	52ea20eda4	Don't throw when there isn't enough data to get block info in a flate stream; just end the stream instead.	2023-11-26 18:12:22 +01:00
Calixte Denizet	f8f4432961	[Editor] Add support for saving/printing a newly added Highlight annotation (bug 1865708)	2023-11-22 10:41:55 +01:00
Jonas Jenwald	a6f0609a6e	Throw a `JpegError` when a JPEG image has no frame data (issue 17302) Given that there's nothing to parse in this case, since we're dealing with an invalid JPEG image, throwing an explicit Error makes sense here.	2023-11-20 17:33:49 +01:00
Jonas Jenwald	709d89420e	Re-factor how the `GenericL10n` class fetches localization-data - Re-factor the existing `fetchData` helper function such that it can fetch more types of data, and it now supports "arraybuffer", "json", and "text". This only needed minor adjustments in the `DOMCMapReaderFactory` and `DOMStandardFontDataFactory` classes.[1] - Expose the `fetchData` helper function in the API, such that the viewer is able to access it. - Use the `fetchData` helper function in the `GenericL10n` class, since this should allow fetching of localization-data even if the default viewer is run in an environment without support for the Fetch API. --- [1] While testing this I also noticed a minor inconsistency when handling standard font-data on the worker-thread.	2023-11-14 13:45:14 +01:00
Calixte Denizet	09b4fe6a30	Get the field name from its parent when it doesn't have one when collecting fields (bug 1864136) Some fields, somewhere under the Fields entry in Acroform, could have no name (in T) while having a parent which has a name but which isn't itself somewhere under Fields. As a side-effect, this patch prevents infinite loops because of potential cycles under Fields.	2023-11-13 14:41:14 +01:00
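
Commit e4d0e84802 above replaces `PromiseCapability` with `Promise.withResolvers()` and notes that a few call-sites now track the `settled`-state manually. The following is a minimal sketch of that idea, not the actual pdf.js code: the helper names `withResolvers` and `createTrackedCapability` are hypothetical, and the hand-written fallback only stands in for environments without native support (the commit itself relies on `core-js` in `legacy` builds).

```javascript
// Hypothetical helper (not part of pdf.js): use the native method when it
// exists, otherwise build the same { promise, resolve, reject } triple by hand.
function withResolvers() {
  if (typeof Promise.withResolvers === "function") {
    return Promise.withResolvers();
  }
  let resolve, reject;
  const promise = new Promise((res, rej) => {
    resolve = res;
    reject = rej;
  });
  return { promise, resolve, reject };
}

// Hypothetical wrapper that tracks the `settled` state manually, since
// `Promise.withResolvers()` does not provide a `settled` getter.
function createTrackedCapability() {
  const inner = withResolvers();
  const tracked = {
    promise: inner.promise,
    settled: false,
    resolve(value) {
      tracked.settled = true;
      inner.resolve(value);
    },
    reject(reason) {
      tracked.settled = true;
      inner.reject(reason);
    },
  };
  return tracked;
}

const demo = createTrackedCapability();
demo.resolve("first");
// Resolving again is harmless: a settled Promise keeps its original value.
demo.resolve("second");
demo.promise.then(value => console.log(value, demo.settled));
```

The second `resolve` call illustrates the point made in the commit message: resolving a Promise more than once is not an error, because its value cannot change once it is fulfilled or rejected.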


3063 Commits
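
Commit d7e41d4cb6 ensures that `EvaluatorPreprocessor.opMap` has a null-prototype. As a rough illustration of why that matters for a lookup table keyed by untrusted names, the sketch below compares a plain object with a null-prototype one. The table contents and the `lookup` helper are hypothetical and are not taken from pdf.js.

```javascript
// Hypothetical operator tables; the entries are illustrative only.
const entries = { m: "moveTo", l: "lineTo" };

// A plain object literal inherits from Object.prototype.
const plainMap = { ...entries };

// A null-prototype object has no inherited properties at all.
const nullProtoMap = Object.assign(Object.create(null), entries);

// Hypothetical helper: look up a name read from (possibly corrupt) input.
function lookup(map, name) {
  return map[name];
}

// With the plain object, a name such as "constructor" resolves to an
// inherited function rather than to "no such operator".
console.log(typeof lookup(plainMap, "constructor"));
// With the null-prototype object, unknown names are simply undefined.
console.log(lookup(nullProtoMap, "constructor"));
```

With a null-prototype table, a check such as `map[name] === undefined` is enough to reject unknown names, without an extra `Object.hasOwn` test.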