pdf.js

Author	SHA1	Message	Date
Jonas Jenwald	2979e23f3c	Ensure that `XFAFactory.prototype.isValid` returns a boolean value Considering the name of the method, and how it's actually being used, you'd expect it to return a boolean value. Given how it's currently being used this inconsistency doesn't cause any issues, however we should still fix this.	2025-05-04 13:42:17 +02:00
Tim van der Meij	5ca57fbd4b	Merge pull request #19885 from Snuffleupagus/loadXfaImages-simplify Simplify the `loadXfaImages` method and related code	2025-05-04 13:41:06 +02:00
Tim van der Meij	22cb3080ee	Merge pull request #19887 from Snuffleupagus/serializeXfaData-simplify Simplify the `serializeXfaData` method and related code	2025-05-04 13:38:01 +02:00
Jonas Jenwald	b3e16800f5	Remove the `BasePdfManager.prototype.catalog` getter This is only invoked once and it can be trivially replaced by the `ensureCatalog`-method, since the code where it's used is already asynchronous.	2025-05-03 13:40:23 +02:00
Jonas Jenwald	b531720d9c	Simplify the `serializeXfaData` method and related code Rather than having a dedicated `BasePdfManager`-method for this one call-site we can instead change `PDFDocument.prototype.serializeXfaData` to a non-async method, that we invoke via `BasePdfManager.prototype.ensureDoc`.	2025-05-03 11:20:42 +02:00
Jonas Jenwald	122822a750	Simplify the `loadXfaImages` method and related code Currently we create an intermediate `Dict` during parsing, however that seems unnecessary since (note especially the second point): - The `NameOrNumberTree.prototype.getAll` method will already resolve any references, as needed, during parsing. - The `Catalog.prototype.xfaImages` getter is invoked, via the `BasePdfManager`-instance, such that any `MissingDataException`s are already handled correctly.	2025-05-02 11:53:41 +02:00
calixteman	91bfe12f38	Merge pull request #19883 from gpanakkal/checkbutton-tostyle Fix arguments in `toStyle` call in `CheckButton`	2025-05-01 22:08:03 +02:00
Gautam Panakkal	7bba3bd4ad	Add missing `this` arg to `toStyle` in `CheckButton.prototype.[$toHTML]`	2025-05-01 10:19:28 -07:00
Jonas Jenwald	b629bafd1c	Allow to, optionally, keep Unicode escape sequences in `stringToPDFString` (PR 17331 follow-up) Currently some of the links[1] on page three of the `issue19835.pdf` test-case aren't clickable, since the destination (of the LinkAnnotation) becomes empty. The reason is that these destinations include the character `\x1b`, which is interpreted as the start of a Unicode escape sequence specifying the language of the string; please refer to section [7.9.2.2 Text String Type](https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/PDF32000_2008.pdf#G6.1957385) in the PDF specification. Hence it seems that we need a way to optionally disable that behaviour, to avoid a "badly" formatted string from becoming empty (or truncated), at least for cases where we are: - Parsing named destinations[2] and URLs. - Handling "strings" that are actually /Name-instances. - Building a lookup Object/Map based on some PDF data-structure. NOTE: The issue that prompted this patch is obviously related to destinations, however I've gone through the `src/core/` folder and updated various other `stringToPDFString` call-sites that (directly or indirectly) fit the categories listed above. --- [1] Try clicking on anything on the line containing "Item 7A. Quantitative and Qualitative Disclosures About Market Risk 27". [2] Unfortunately just skipping `stringToPDFString` in this case would cause other issues, such as the named destination becoming "unusable" in the viewer; see e.g. issues 14847 and 14864.	2025-04-30 20:51:10 +02:00
Jonas Jenwald	254431df1e	Avoid extra lookup/parsing when all destinations are already available Whenever we cannot find a destination we'll fallback to checking all destinations, to account for e.g. out-of-order NameTrees, and in those cases any subsequent destination-lookups can be made a tiny bit more efficient by immediately checking the already cached destinations.	2025-04-30 15:26:00 +02:00
Jonas Jenwald	0922aa9e9d	Merge pull request #19880 from Snuffleupagus/numberToString-assert-number Assert that `numberToString` is called with a number (issue 19877)	2025-04-29 20:35:32 +02:00
Jonas Jenwald	f5faf86180	Assert that `numberToString` is called with a number (issue 19877) NOTE: Given that this is an internal function, used only in the worker-thread, it's not clear to me that this is an entirely "necessary" change.	2025-04-29 20:31:24 +02:00
Calixte Denizet	7a251b206e	Fix the bbox when saving a rotated text field (bug 1963407)	2025-04-29 18:49:07 +02:00
Jonas Jenwald	abc9522886	Avoid (most) string parsing when removing/replacing the hash property of a URL	2025-04-25 23:13:05 +02:00
calixteman	efc5c3c231	Merge pull request #19862 from calixteman/bug1961423 Fix 'print to pdf' on Mac with a cid font (bug 1961423)	2025-04-25 15:11:55 +02:00
Jonas Jenwald	312c85bfd6	Merge pull request #19815 from Snuffleupagus/getMergedResources-size Ensure that "local" /Contents stream-dict /Resources aren't empty (PR 19803 follow-up)	2025-04-25 10:46:04 +02:00
Calixte Denizet	785991a97c	Fix 'print to pdf' on Mac with a cid font (bug 1961423)	2025-04-24 20:19:12 +02:00
Jonas Jenwald	64007e777e	Ensure that the /Form XObject /Resources-entry is actually a dictionary (issue 19848)	2025-04-23 10:19:20 +02:00
Jonas Jenwald	adc9eb5a5a	Always fallback to checking all destinations, when lookup fails (issue 19835) In the referenced PDF document the keys, in the /Dests dictionary, need to account for PDFDocEncoding. To improve destination handling in general we'll now unconditionally fallback to always checking all destinations.	2025-04-20 14:53:10 +02:00
Jonas Jenwald	91ba147317	Check that the `Object.prototype` hasn't been incorrectly extended (PR 11582 follow-up) This complements, and extends, the existing check of the `Array.prototype` in the worker-thread. To simplify the implementation we'll now abort immediately, rather than collecting all "bad" properties.	2025-04-18 12:19:29 +02:00
calixteman	4b1875c8c0	Merge pull request #19825 from calixteman/bug1961107 Avoid to create any subarrays when optimizing 'save, transform, constructPath, restore' (bug 1961107)	2025-04-17 19:42:28 +02:00
Calixte Denizet	d7cbda6cb5	Avoid to create any subarrays when optimizing 'save, transform, constructPath, restore' (bug 1961107) Removing those `subarray` calls helps to improve performance by a factor of 6 on Linux and by a factor of 3 on Windows 11.	2025-04-17 19:14:01 +02:00
Jonas Jenwald	bf553f22da	Ensure that the /P-entry is actually a dictionary in `StructTreePage.prototype.addNode` This may fix issue 19822, but without a test-case it's simply impossible to know for sure.	2025-04-17 14:01:53 +02:00
Jonas Jenwald	76f23ce3b5	Catch, and ignore, errors during `Page.prototype.getStructTree` This way any errors thrown during parsing of the page-structTree will not be forwarded to the viewer.	2025-04-17 13:57:30 +02:00
Jonas Jenwald	245d9ba925	Ensure that "local" /Contents stream-dict /Resources aren't empty (PR 19803 follow-up) This is a small, and quite possibly pointless, optimization which ensures that any "local" /Resources aren't empty, to avoid needlessly trying to load and merge dictionaries.	2025-04-14 09:58:15 +02:00
Jonas Jenwald	834423b51d	Add more logical assignment in the `src/` folder This patch uses nullish coalescing assignment in cases where it's immediately obvious from surrounding code that doing so is safe, and logical OR assignment elsewhere (mostly the changes in XFA code).	2025-04-12 17:28:33 +02:00
Jonas Jenwald	1c80412f61	Change `PDFDocument.prototype._xfaStreams` to return a `Map` Using a `Map` rather than an `Object` is nicer, since it has better support for both iteration and checking if a key exists. We also change the initial values to be `null`, rather than empty strings, and reduce duplication when creating the `Map`. Please note: Since this is worker-thread code, these changes are "invisible" at the API-level.	2025-04-12 12:47:22 +02:00
Jonas Jenwald	1048508dd1	Catch circular references in /Form XObjects (issue 19800) For simplicity we will abort /Form XObject parsing immediately when encountering a circular reference, rather than letting it continue up until some limit (as e.g. PDFium appears to do), which should be fine since there are never any guarantees if/how corrupt PDF documents will render.	2025-04-11 16:54:22 +02:00
Jonas Jenwald	7a94fafd30	Prefer /Resources from the /Contents stream-dict, if available In rare cases /Resources are also found in the /Contents stream-dict, in addition to in the /Page dict, hence we need to prefer those when available; see `issue18894.pdf`.	2025-04-11 16:54:22 +02:00
Jonas Jenwald	835a456767	Use `adjustWidths` unconditionally for all embedded fonts (issue 19802) Previously we'd only do this for Type1/CFF fonts, see e.g. PR 6736, since the font-program may update the /FontMatrix. However, it seems that we should do this unconditionally to account for fonts with non-default /FontMatrix-entries in the font-dictionary (which seem to be pretty rare).	2025-04-11 15:01:35 +02:00
Jonas Jenwald	fbc4f4b12a	Handle non-integer and out-of-range values correctly in Indexed color spaces In PDF version 2.0 the handling of Indexed color spaces was clarified as follows: > The index value should be an integer in the range 0 to hival. If the value is a real number, it shall be rounded to the nearest integer (0.5 values shall be rounded up); if it is outside the range 0 to hival, it shall be adjusted to the nearest value within that range. Please refer to https://github.com/pdf-association/pdf-differences/tree/main/IndexedColor	2025-04-09 15:31:49 +02:00
Jonas Jenwald	12c7c7b0af	Merge pull request #19773 from Snuffleupagus/inline-PDFImage-createRawMask Inline `PDFImage.createRawMask` in the `PDFImage.createMask` method	2025-04-08 17:19:09 +02:00
Jonas Jenwald	dc3e24a76a	Inline `PDFImage.createRawMask` in the `PDFImage.createMask` method After the introduction of `OffscreenCanvas` support we now have two separate mask-methods in the `PDFImage` class, and the reason that they were not combined is likely that we need the "raw" bytes when parsing Type3-glyph image masks. However, that case is easy to support simply by disabling `OffscreenCanvas` usage when parsing Type3-glyphs and that way we're able to reduce some code duplication. Another slightly strange property of the `PDFImage.createMask` method is that it needs various image-dictionary parameters manually provided, which is probably because this is very old code. That feels slightly unwieldy, and we instead change the method to pass in the image-stream directly and do the necessary data-lookup internally. A side-effect of this re-factoring is that we now support using the custom `isSingleOpaquePixel` operator in Type3-glyphs, which shouldn't hurt even though it seems extremely unlikely for that to ever happen in Type3-glyphs.	2025-04-08 12:01:50 +02:00
Jonas Jenwald	d882d0869c	Move the `IDENTITY_MATRIX` constant into `src/core/core_utils.js` (PR 19772 follow-up) After the changes in PR 19772 the `IDENTITY_MATRIX` constant is now only used on the worker-thread, which leads to Webpack marking the code as unused in the built `pdf.mjs` file; see https://phabricator.services.mozilla.com/D244533#change-8oITAexCvrlQ	2025-04-07 11:40:18 +02:00
Calixte Denizet	4c63905a18	Avoid to create an array when setting the text matrix	2025-04-05 20:45:26 +02:00
Jonas Jenwald	7cfb1be650	Merge pull request #19758 from Snuffleupagus/OperatorList-setOptions Initialize the `isOffscreenCanvasSupported` option, in the `OperatorList` class, once per document	2025-04-05 18:45:55 +02:00
calixteman	7eef7dfc78	Merge pull request #19763 from calixteman/simplify_updaterect Replace UpdateRectMinMax by getAxialAlignedBoundingBox	2025-04-04 21:33:05 +02:00
Calixte Denizet	e7a951547d	Replace UpdateRectMinMax by getAxialAlignedBoundingBox and don't use array destructuring because it induces memory and perf penalties.	2025-04-04 19:57:55 +02:00
Jonas Jenwald	dad6febc39	Pass the /Info-strings as a `Map` to the `src/core/writer.js` code We want to iterate through the data in the `computeMD5` function, and `Map`s have "nicer" support for that than generic objects. (Somewhat recently `Map` performance was improved in Firefox, however this also isn't really performance sensitive code.)	2025-04-04 13:36:13 +02:00
Jonas Jenwald	8c5fb7979f	Also transfer `ImageBitmap`s for image masks and inline images Currently we're only transferring TypedArrays, however there's no reason (as far as I can tell) to not also transfer `ImageBitmap`s in these cases.	2025-04-03 18:50:39 +02:00
Calixte Denizet	41bed561f0	Simplify updateRectMinMax in order to use slightly less memory	2025-04-03 17:06:58 +02:00
Jonas Jenwald	4a6c47489e	Initialize the `isOffscreenCanvasSupported` option, in the `OperatorList` class, once per document Currently we're setting this option for each small inline image, which seems unnecessary since it should suffice to do that once per document.	2025-04-03 14:00:07 +02:00
Jonas Jenwald	4262603b06	Re-name the `Util.applyTransformToBezierInPlace` method Given that all `Util.apply...` methods are now using in/out parameters, we can slightly shorten the name of this one.	2025-04-01 23:03:50 +02:00
Jonas Jenwald	fa643bb22f	Change `Util.applyTransform` to use the point-argument as an in/out parameter This will help reduce the total number of Array allocations, which cannot hurt, and also allows us to remove the `Util.applyTransformInPlace` method.	2025-04-01 23:03:36 +02:00
Jonas Jenwald	e5fbf52405	Merge pull request #19736 from Snuffleupagus/compileType3Glyph-worker [api-minor] Move Type3-glyph compilation to the worker-thread	2025-04-01 19:40:30 +02:00
Jonas Jenwald	b33522a208	Merge pull request #19746 from Snuffleupagus/evaluator-img-cache-tweaks Reduce some code duplication when handling globally cached images	2025-04-01 19:15:10 +02:00
calixteman	25693dc0ee	Merge pull request #19712 from calixteman/optimize_save_construct Optimize save-transform-constructPath-restore	2025-04-01 18:20:59 +02:00
Jonas Jenwald	9cd5a9658a	[api-minor] Move Type3-glyph compilation to the worker-thread After PR 19731 the format of compiled Type3-glyphs is now simple enough that the compilation can be moved to the worker-thread, without introducing any significant additional complexity. This allows us to, ever so slightly, simplify the implementation in `src/display/canvas.js` since the Type3 operatorLists will now directly include standard path-rendering operators (using the format introduced in PR 19689). As part of these changes we also stop caching Type3 image masks since: we've not come across any cases where that actually helps, they're usually fairly small, and it simplifies the code. Note that one "negative" change introduced in this patch is that we'll now compile Type3-glyphs eagerly, whereas previously we'd only do that lazily upon their first use. However, this doesn't seem to impact performance in any noticeable way since the compilation is fast enough (way below 1 ms/glyph in my testing) and Type3-fonts are also limited to just 256 glyphs. Also, many (or most?) Type3-fonts don't even use image masks and are thus not affected by these changes.	2025-04-01 09:09:00 +02:00
Jonas Jenwald	213830f44f	Use, and re-name, the `addLocallyCachedImageOps` helper for global images too This avoids having to "manually" set the image operators for globally cached images.	2025-03-31 10:57:04 +02:00
Jonas Jenwald	e0e59eaf01	Define the global cache-data once in `buildPaintImageXObject` Currently we duplicate the same identical code three times, which seems both unnecessary and error prone.	2025-03-31 10:29:29 +02:00
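
The Indexed color space rule quoted in commit fbc4f4b12a (round real-valued indices to the nearest integer with 0.5 rounded up, then adjust out-of-range values into 0 to hival) can be sketched roughly as follows. This is only an illustration of the quoted rule, not the code from that commit; the helper name `clampIndexedValue` and its parameters are hypothetical.

```javascript
// Hypothetical helper illustrating the PDF 2.0 rule for Indexed color
// space index values; this is NOT the actual pdf.js implementation.
function clampIndexedValue(value, hival) {
  // Math.round rounds halves toward +Infinity, which matches
  // "0.5 values shall be rounded up" for the relevant (non-negative) range.
  const rounded = Math.round(value);
  // Adjust anything outside [0, hival] to the nearest value in that range.
  return Math.max(0, Math.min(hival, rounded));
}

// Example usage with hival = 255.
console.log(clampIndexedValue(2.5, 255));
console.log(clampIndexedValue(300, 255));
console.log(clampIndexedValue(-3, 255));
```

Rounding before clamping keeps the result an integer in every case, including inputs such as 255.4 or -0.4 that lie just outside or at the edge of the valid range.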

3350 Commits