pdf.js

Author	SHA1	Message	Date
calixteman	0e2b59e3d5	Merge pull request #20064 from calixteman/dict_api Add few methods to the Dict class in order to simplify the code when writing an annotation	2025-07-08 22:16:57 +02:00
calixteman	481b74a1c2	Merge pull request #20063 from calixteman/issue20062 Fix the default appearance of a Polygon annotation when a fill color is provided	2025-07-08 22:10:52 +02:00
Calixte Denizet	63b37b4371	Add few methods to the Dict class in order to simplify the code when writing an annotation	2025-07-08 21:23:29 +02:00
Calixte Denizet	ecc7096a80	Fix the default appearance of a Polygon annotation when a fill color is provided It fixes #20062.	2025-07-08 20:51:58 +02:00
Tim van der Meij	b999d5ce58	Merge pull request #20060 from noritaka1166/fix-typos Fix typos across the codebase	2025-07-08 18:28:52 +02:00
Calixte Denizet	bb52a440ce	Use the creation date in the popup when there is no modification date Remove the h1 element in popup title because it caused a warning in Firefox and use a span instead.	2025-07-07 10:51:35 +02:00
Noritaka Kobayashi	fa568e826d	Fix typos across the codebase	2025-07-07 09:59:36 +09:00
Calixte Denizet	194e2ede4d	Add some telemetry in order to know what are the certificates used in pdfs (bug 1973573)	2025-06-24 22:23:29 +02:00
Calixte Denizet	3bdc5d54fe	Get the text under highlight/squiggly/underline/strikethrough annotations (bug 1885505) and add an invisible element containing the text in the annotation layer to make it readable by a screen reader.	2025-06-22 21:47:29 +02:00
Calixte Denizet	2541d96bf5	[JS] Make the date parser less strict and display the expected date format as a tooltip.	2025-06-20 08:57:07 +02:00
Calixte Denizet	14f0e88910	Fix a printing issue on Mac (bug 1961423) On mac, the pdf backend used when printing is using the cid from the font, so if a char has null cid then it's equivalent to .notdef and some viewers don't display it.	2025-06-07 21:33:17 +02:00
Tim van der Meij	d90ec29b4d	Merge pull request #19964 from Snuffleupagus/core-catalog-private Replace semi-private fields/methods with actual private ones in `src/core/catalog.js`	2025-05-24 14:16:06 +02:00
Jonas Jenwald	c5449a98e0	Ignore empty paths when optimizing `constructPath` operations (issue 19971) Note how we're handling empty paths in [src/display/canvas.js](`a8e05d82e2/src/display/canvas.js (L1423-L1428)`), hence we need to add similar code in the `QueueOptimizer` as well.	2025-05-23 13:59:05 +02:00
Jonas Jenwald	a90e46bdc9	Use nullish coalescing in the `Catalog.prototype.numPages` getter	2025-05-21 16:47:20 +02:00
Jonas Jenwald	0d2ab3c709	Replace semi-private fields/methods with actual private ones in `src/core/catalog.js` Additionally, move a couple of "public" fields out of the class constructor and initialize them directly instead.	2025-05-21 12:10:07 +02:00
calixteman	fc68a9f3ee	Merge pull request #19958 from calixteman/rm_useless_subarray_flate_stream Don't create a useless subarray when getting image data from a flate stream	2025-05-19 19:29:25 +02:00
calixteman	f148106cb1	Merge pull request #19957 from calixteman/rm_qcms_subarray Remove all the useless subarrays when using qcms.	2025-05-19 17:33:52 +02:00
Jonas Jenwald	36b40d959b	Merge pull request #19955 from Snuffleupagus/issue-19954 Support Type3 fonts with an incomplete /FontDescriptor dictionary (issue 19954)	2025-05-19 17:26:46 +02:00
Calixte Denizet	3ecbef516d	Don't create a useless subarray when getting image data from a flate stream	2025-05-19 16:48:23 +02:00
Calixte Denizet	782e883a87	Remove all the useless subarrays when using qcms. It reduces the memory use and the newly added function `getRgbHex` is 4 times faster.	2025-05-19 16:07:57 +02:00
Jonas Jenwald	c02ea0c681	Simplify how we handle Type3 fonts without a /FontDescriptor dictionary Part of this is very old code, which we can now simplify a little bit.	2025-05-19 15:26:11 +02:00
Calixte Denizet	5789afd3f8	Create the css color to use with the canvas in the worker It slightly reduces the time spent to draw and the memory used.	2025-05-19 14:52:24 +02:00
Jonas Jenwald	5f5d9dfc28	Support Type3 fonts with an incomplete /FontDescriptor dictionary (issue 19954) We have a fallback for the common case of Type3 fonts without a /FontDescriptor dictionary, however we also need to handle the case where it's present but lacking the required /FontName entry.	2025-05-19 12:56:14 +02:00
studystill	b585937340	chore: remove redundant word in comment Signed-off-by: studystill <chenghuiyue@outlook.com>	2025-05-16 15:16:26 +08:00
calixteman	ca05a0dbfc	Merge pull request #19911 from calixteman/issue19753 Decode appearance keys of checkboxes	2025-05-10 15:01:22 +02:00
Jonas Jenwald	f26f984fa0	Improve validation in the `Catalog.prototype.openAction` getter When the /OpenAction data is an Array we're currently using it as-is which could theoretically cause problems in corrupt PDF documents, hence we ensure that a "raw" destination is actually valid. (This change is covered by existing unit-tests.) Note: In the Dictionary case we're using the `Catalog.parseDestDictionary` method, which already handles all of the necessary validation.	2025-05-10 11:51:58 +02:00
Calixte Denizet	49a098cb5d	Decode appearance keys of checkboxes	2025-05-09 21:46:17 +02:00
calixteman	293506ada7	Merge pull request #19903 from Snuffleupagus/shorten-fieldObjects-getter Shorten the `PDFDocument.prototype.fieldObjects` getter slightly	2025-05-09 15:49:51 +02:00
Jonas Jenwald	1f7581b5c6	Shorten the `PDFDocument.prototype.fieldObjects` getter slightly The effect is probably not even measurable, however this patch ever so slightly reduces the asynchronicity in the `fieldObjects` getter. These changes should be safe since: - We're inside of the `PDFDocument`-class and the `annotationGlobals`-getter, which will always return a (shadowed) Promise and won't throw `MissingDataException`s, can be accessed directly without going through the `BasePdfManager`-instance. - The `acroForm`-dictionary can be accessed through the `annotationGlobals`-data, removing the need to "manually" look it up and thus the need for using `Promise.all` here. - We can also lookup the /Fields-data, in the `acroForm`-dictionary, synchronously since the initial `formInfo.hasFields` check guarantees that it's available.	2025-05-07 17:47:09 +02:00
Jonas Jenwald	36fafbc05c	Use object destructuring a bit more in the `src/core/document.js` file	2025-05-07 13:41:50 +02:00
Jonas Jenwald	92b065c87e	Replace a number of semi-private fields with actual private ones in `src/core/document.js` These are fields that can be moved out of their class constructors, and be initialized directly.	2025-05-07 13:41:44 +02:00
Jonas Jenwald	39803a9f25	Replace a number of semi-private methods with actual private ones in `src/core/document.js` There's a few remaining cases that are used with either cached getters or `BasePdfManager.prototype.ensure`-methods, and those cannot be converted.	2025-05-07 13:41:36 +02:00
Jonas Jenwald	0ded85e9b3	Add a `Page` helper method to create a `PartialEvaluator`-instance Currently we repeat the same identical code five times in the `Page`-class when creating a `PartialEvaluator`-instance, which given the number of parameters it needs seems like unnecessary duplication.	2025-05-07 13:41:29 +02:00
Jonas Jenwald	62009ffa70	Simplify how the `ObjectLoader` is used The `ObjectLoader.prototype.load` method has a fast-path, which avoids any lookup/parsing if the entire PDF document is already loaded. However, we still need to create an `ObjectLoader`-instance which seems unnecessary in that case. Hence we introduce a static `ObjectLoader.load` method, which will help avoid creating `ObjectLoader`-instances needlessly and also (slightly) shortens the call-sites. To ensure that the new method will be used, we extend the `no-restricted-syntax` ESLint rule to "forbid" direct usage of `new ObjectLoader()`.	2025-05-06 15:49:59 +02:00
Jonas Jenwald	ef1ad675c2	Unify method return values in the `ObjectLoader` class Given that all the methods are already asynchronous we can just use `await` more throughout this code, rather than having to explicitly return function-calls and `undefined`. Note also how none of the `ObjectLoader.prototype.load` call-sites use the return value.	2025-05-06 15:43:00 +02:00
Calixte Denizet	ac925f4f1b	Downscale jpeg2000 images, if needed, while decoding them It fixes #19517.	2025-05-05 22:39:59 +02:00
Jonas Jenwald	d9548b1c18	Slightly re-factor how we pre-load fonts and images in XFA documents Rather than "manually" invoking the methods from the `src/core/worker.js` file we introduce a single `PDFDocument`-method that handles this for us, and make the current methods private. Since this code is only invoked at most once per document, and only for XFA documents, we can use `BasePdfManager.prototype.ensureDoc` directly rather than needing a stand-alone method.	2025-05-04 13:44:33 +02:00
Jonas Jenwald	604153957a	Reduce duplication when parsing fonts in `loadXfaFonts` Currently we repeat virtually the same code when calling the `PartialEvaluator.prototype.handleSetFont` method, which we can avoid by introducing an inline helper function.	2025-05-04 13:42:17 +02:00
Jonas Jenwald	2979e23f3c	Ensure that `XFAFactory.prototype.isValid` returns a boolean value Considering the name of the method, and how it's actually being used, you'd expect it to return a boolean value. Given how it's currently being used this inconsistency doesn't cause any issues, however we should still fix this.	2025-05-04 13:42:17 +02:00
Tim van der Meij	5ca57fbd4b	Merge pull request #19885 from Snuffleupagus/loadXfaImages-simplify Simplify the `loadXfaImages` method and related code	2025-05-04 13:41:06 +02:00
Tim van der Meij	22cb3080ee	Merge pull request #19887 from Snuffleupagus/serializeXfaData-simplify Simplify the `serializeXfaData` method and related code	2025-05-04 13:38:01 +02:00
Jonas Jenwald	b3e16800f5	Remove the `BasePdfManager.prototype.catalog` getter This is only invoked once and it can be trivially replaced by the `ensureCatalog`-method, since the code where it's used is already asynchronous.	2025-05-03 13:40:23 +02:00
Jonas Jenwald	b531720d9c	Simplify the `serializeXfaData` method and related code Rather than having a dedicated `BasePdfManager`-method for this one call-site we can instead change `PDFDocument.prototype.serializeXfaData` to a non-async method, that we invoke via `BasePdfManager.prototype.ensureDoc`.	2025-05-03 11:20:42 +02:00
Jonas Jenwald	122822a750	Simplify the `loadXfaImages` method and related code Currently we create an intermediate `Dict` during parsing, however that seems unnecessary since (note especially the second point): - The `NameOrNumberTree.prototype.getAll` method will already resolve any references, as needed, during parsing. - The `Catalog.prototype.xfaImages` getter is invoked, via the `BasePdfManager`-instance, such that any `MissingDataException`s are already handled correctly.	2025-05-02 11:53:41 +02:00
calixteman	91bfe12f38	Merge pull request #19883 from gpanakkal/checkbutton-tostyle Fix arguments in `toStyle` call in `CheckButton`	2025-05-01 22:08:03 +02:00
Gautam Panakkal	7bba3bd4ad	Add missing `this` arg to `toStyle` in `CheckButton.prototype.[$toHTML]`	2025-05-01 10:19:28 -07:00
Jonas Jenwald	b629bafd1c	Allow to, optionally, keep Unicode escape sequences in `stringToPDFString` (PR 17331 follow-up) Currently some of the links[1] on page three of the `issue19835.pdf` test-case aren't clickable, since the destination (of the LinkAnnotation) becomes empty. The reason is that these destinations include the character `\x1b`, which is interpreted as the start of a Unicode escape sequence specifying the language of the string; please refer to section [7.9.2.2 Text String Type](https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/PDF32000_2008.pdf#G6.1957385) in the PDF specification. Hence it seems that we need a way to optionally disable that behaviour, to avoid a "badly" formatted string from becoming empty (or truncated), at least for cases where we are: - Parsing named destinations[2] and URLs. - Handling "strings" that are actually /Name-instances. - Building a lookup Object/Map based on some PDF data-structure. NOTE: The issue that prompted this patch is obviously related to destinations, however I've gone through the `src/core/` folder and updated various other `stringToPDFString` call-sites that (directly or indirectly) fit the categories listed above. --- [1] Try clicking on anything on the line containing "Item 7A. Quantitative and Qualitative Disclosures About Market Risk 27". [2] Unfortunately just skipping `stringToPDFString` in this case would cause other issues, such as the named destination becoming "unusable" in the viewer; see e.g. issues 14847 and 14864.	2025-04-30 20:51:10 +02:00
Jonas Jenwald	254431df1e	Avoid extra lookup/parsing when all destinations are already available Whenever we cannot find a destination we'll fallback to checking all destinations, to account for e.g. out-of-order NameTrees, and in those cases any subsequent destination-lookups can be made a tiny bit more efficient by immediately checking the already cached destinations.	2025-04-30 15:26:00 +02:00
Jonas Jenwald	0922aa9e9d	Merge pull request #19880 from Snuffleupagus/numberToString-assert-number Assert that `numberToString` is called with a number (issue 19877)	2025-04-29 20:35:32 +02:00
Jonas Jenwald	f5faf86180	Assert that `numberToString` is called with a number (issue 19877) NOTE: Given that this is an internal function, used only in the worker-thread, it's not clear to me that this is an entirely "necessary" change.	2025-04-29 20:31:24 +02:00
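
Commit 62009ffa70 above describes adding a static `ObjectLoader.load` method so that the "everything is already loaded" fast path no longer needs an `ObjectLoader` instance. The general pattern might look roughly like the sketch below. Everything in it is an assumption made for illustration: the `Loader` class, the `fullyLoaded` flag, the `fetch` method, and the `instancesCreated` counter are hypothetical stand-ins and are not taken from the pdf.js source.

```javascript
// Hypothetical stand-in for a loader class; not the actual pdf.js `ObjectLoader`.
class Loader {
  // Counter used only to make the effect of the fast path observable.
  static instancesCreated = 0;

  constructor(source, keys) {
    Loader.instancesCreated++;
    this.source = source;
    this.keys = keys;
  }

  // Static entry point: return early, without creating an instance,
  // when the (hypothetical) source reports that all data is available.
  static async load(source, keys) {
    if (source.fullyLoaded) {
      return;
    }
    await new Loader(source, keys).#load();
  }

  // Private instance method, reachable only through the static entry point.
  async #load() {
    for (const key of this.keys) {
      await this.source.fetch(key);
    }
  }
}
```

With this shape, call sites write `await Loader.load(source, keys)` instead of `await new Loader(source, keys).load()`. A lint rule forbidding direct `new Loader()` usage, as the commit message mentions for `ObjectLoader`, could then keep callers on the static path.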


3338 Commits