3338 Commits

Author SHA1 Message Date
calixteman
0e2b59e3d5
Merge pull request #20064 from calixteman/dict_api
Add few methods to the Dict class in order to simplify the code when writing an annotation
2025-07-08 22:16:57 +02:00
calixteman
481b74a1c2
Merge pull request #20063 from calixteman/issue20062
Fix the default appearance of a Polygon annotation when a fill color is provided
2025-07-08 22:10:52 +02:00
Calixte Denizet
63b37b4371 Add few methods to the Dict class in order to simplify the code when writing an annotation 2025-07-08 21:23:29 +02:00
Calixte Denizet
ecc7096a80 Fix the default appearance of a Polygon annotation when a fill color is provided
It fixes #20062.
2025-07-08 20:51:58 +02:00
Tim van der Meij
b999d5ce58
Merge pull request #20060 from noritaka1166/fix-typos
Fix typos across the codebase
2025-07-08 18:28:52 +02:00
Calixte Denizet
bb52a440ce Use the creation date in the popup when there is no modification date
Remove the h1 element in popup title because it caused a warning in Firefox and use a span instead.
2025-07-07 10:51:35 +02:00
Noritaka Kobayashi
fa568e826d
Fix typos across the codebase 2025-07-07 09:59:36 +09:00
Calixte Denizet
194e2ede4d Add some telemetry in order to know what are the certificates used in pdfs (bug 1973573) 2025-06-24 22:23:29 +02:00
Calixte Denizet
3bdc5d54fe Get the text under highlight/squiggly/underline/strikethrough annotations (bug 1885505)
and add an invisible element containing the text in the annotation layer to make
it readable by a screen reader.
2025-06-22 21:47:29 +02:00
Calixte Denizet
2541d96bf5 [JS] Make the date parser less strict
and display the expected date formt as a tooltip.
2025-06-20 08:57:07 +02:00
Calixte Denizet
14f0e88910 Fix a printing issue on Mac (bug 1961423)
On mac, the pdf backend used when printing is using the cid from the font,
so if a char has null cid then it's equivalent to .notdef and some viewers
don't display it.
2025-06-07 21:33:17 +02:00
Tim van der Meij
d90ec29b4d
Merge pull request #19964 from Snuffleupagus/core-catalog-private
Replace semi-private fields/methods with actual private ones in `src/core/catalog.js`
2025-05-24 14:16:06 +02:00
Jonas Jenwald
c5449a98e0 Ignore empty paths when optimizing constructPath operations (issue 19971)
Note how we're handling empty paths in [src/display/canvas.js](a8e05d82e2/src/display/canvas.js (L1423-L1428)), hence we need add similar code in the `QueueOptimizer` as well.
2025-05-23 13:59:05 +02:00
Jonas Jenwald
a90e46bdc9 Use nullish coalescing in the Catalog.prototype.numPages getter 2025-05-21 16:47:20 +02:00
Jonas Jenwald
0d2ab3c709 Replace semi-private fields/methods with actual private ones in src/core/catalog.js
Additionally, move a couple of "public" fields out of the class constructor and initialize them directly instead.
2025-05-21 12:10:07 +02:00
calixteman
fc68a9f3ee
Merge pull request #19958 from calixteman/rm_useless_subarray_flate_stream
Don't create a useless subarray when getting image data from a flate stream
2025-05-19 19:29:25 +02:00
calixteman
f148106cb1
Merge pull request #19957 from calixteman/rm_qcms_subarray
Remove all the useless subarrays when using qcms.
2025-05-19 17:33:52 +02:00
Jonas Jenwald
36b40d959b
Merge pull request #19955 from Snuffleupagus/issue-19954
Support Type3 fonts with an incomplete /FontDescriptor dictionary (issue 19954)
2025-05-19 17:26:46 +02:00
Calixte Denizet
3ecbef516d Don't create a useless subarray when getting image data from a flate stream 2025-05-19 16:48:23 +02:00
Calixte Denizet
782e883a87 Remove all the useless subarrays when using qcms.
It reduces the memory use and the newly added function `getRgbHex` is 4 times faster.
2025-05-19 16:07:57 +02:00
Jonas Jenwald
c02ea0c681 Simplify how we handle Type3 fonts without a /FontDescriptor dictionary
Part of this is very old code, which we can now simplify a little bit.
2025-05-19 15:26:11 +02:00
Calixte Denizet
5789afd3f8 Create the css color to use with the canvas in the worker
It slightly reduces the time spent to draw and the memory used.
2025-05-19 14:52:24 +02:00
Jonas Jenwald
5f5d9dfc28 Support Type3 fonts with an incomplete /FontDescriptor dictionary (issue 19954)
We have a fallback for the common case of Type3 fonts without a /FontDescriptor dictionary, however we also need to handle the case where it's present but lacking the required /FontName entry.
2025-05-19 12:56:14 +02:00
studystill
b585937340 chore: remove redundant word in comment
Signed-off-by: studystill <chenghuiyue@outlook.com>
2025-05-16 15:16:26 +08:00
calixteman
ca05a0dbfc
Merge pull request #19911 from calixteman/issue19753
Decode appearance keys of checkboxes
2025-05-10 15:01:22 +02:00
Jonas Jenwald
f26f984fa0 Improve validation in the Catalog.prototype.openAction getter
When the /OpenAction data is an Array we're currently using it as-is which could theoretically cause problems in corrupt PDF documents, hence we ensure that a "raw" destination is actually valid. (This change is covered by existing unit-tests.)

*Note:* In the Dictionary case we're using the `Catalog.parseDestDictionary` method, which already handles all of the necessary validation.
2025-05-10 11:51:58 +02:00
Calixte Denizet
49a098cb5d Decode appearance keys of checkboxes 2025-05-09 21:46:17 +02:00
calixteman
293506ada7
Merge pull request #19903 from Snuffleupagus/shorten-fieldObjects-getter
Shorten the `PDFDocument.prototype.fieldObjects` getter slightly
2025-05-09 15:49:51 +02:00
Jonas Jenwald
1f7581b5c6 Shorten the PDFDocument.prototype.fieldObjects getter slightly
The effect is probably not even measurable, however this patch ever so slightly reduces the asynchronicity in the `fieldObjects` getter. These changes should be safe since:

 - We're inside of the `PDFDocument`-class and the `annotationGlobals`-getter, which will always return a (shadowed) Promise and won't throw `MissingDataException`s, can be accessed directly without going through the `BasePdfManager`-instance.

 - The `acroForm`-dictionary can be accessed through the `annotationGlobals`-data, removing the need to "manually" look it up and thus the need for using `Promise.all` here.

 - We can also lookup the /Fields-data, in the `acroForm`-dictionary, synchronously since the initial `formInfo.hasFields` check guarantees that it's available.
2025-05-07 17:47:09 +02:00
Jonas Jenwald
36fafbc05c Use object destructuring a bit more in the src/core/document.js file 2025-05-07 13:41:50 +02:00
Jonas Jenwald
92b065c87e Replace a number of semi-private fields with actual private ones in src/core/document.js
These are fields that can be moved out of their class constructors, and be initialized directly.
2025-05-07 13:41:44 +02:00
Jonas Jenwald
39803a9f25 Replace a number of semi-private methods with actual private ones in src/core/document.js
There's a few remaining cases that are used with either cached getters or `BasePdfManager.prototype.ensure`-methods, and those cannot be converted.
2025-05-07 13:41:36 +02:00
Jonas Jenwald
0ded85e9b3 Add a Page helper method to create a PartialEvaluator-instance
Currently we repeat the same identical code five times in the `Page`-class when creating a `PartialEvaluator`-instance, which given the number of parameters it needs seems like unnecessary duplication.
2025-05-07 13:41:29 +02:00
Jonas Jenwald
62009ffa70 Simplify how the ObjectLoader is used
The `ObjectLoader.prototype.load` method has a fast-path, which avoids any lookup/parsing if the entire PDF document is already loaded.
However, we still need to create an `ObjectLoader`-instance which seems unnecessary in that case.

Hence we introduce a *static* `ObjectLoader.load` method, which will help avoid creating `ObjectLoader`-instances needlessly and also (slightly) shortens the call-sites.
To ensure that the new method will be used, we extend the `no-restricted-syntax` ESLint rule to "forbid" direct usage of `new ObjectLoader()`.
2025-05-06 15:49:59 +02:00
Jonas Jenwald
ef1ad675c2 Unify method return values in the ObjectLoader class
Given that all the methods are already asynchronous we can just use `await` more throughout this code, rather than having to explicitly return function-calls and `undefined`.
Note also how none of the `ObjectLoader.prototype.load` call-sites use the return value.
2025-05-06 15:43:00 +02:00
Calixte Denizet
ac925f4f1b Downscale jpeg2000 images, if needed, while decoding them
It fixes #19517.
2025-05-05 22:39:59 +02:00
Jonas Jenwald
d9548b1c18 Slightly re-factor how we pre-load fonts and images in XFA documents
Rather than "manually" invoking the methods from the `src/core/worker.js` file we introduce a single `PDFDocument`-method that handles this for us, and make the current methods private.
Since this code is only invoked at most *once* per document, and only for XFA documents, we can use `BasePdfManager.prototype.ensureDoc` directly rather than needing a stand-alone method.
2025-05-04 13:44:33 +02:00
Jonas Jenwald
604153957a Reduce duplication when parsing fonts in loadXfaFonts
Currently we repeat virtually the same code when calling the `PartialEvaluator.prototype.handleSetFont` method, which we can avoid by introducing an inline helper function.
2025-05-04 13:42:17 +02:00
Jonas Jenwald
2979e23f3c Ensure that XFAFactory.prototype.isValid returns a boolean value
Considering the name of the method, and how it's actually being used, you'd expect it to return a boolean value.
Given how it's currently being used this inconsistency doesn't cause any issues, however we should still fix this.
2025-05-04 13:42:17 +02:00
Tim van der Meij
5ca57fbd4b
Merge pull request #19885 from Snuffleupagus/loadXfaImages-simplify
Simplify the `loadXfaImages` method and related code
2025-05-04 13:41:06 +02:00
Tim van der Meij
22cb3080ee
Merge pull request #19887 from Snuffleupagus/serializeXfaData-simplify
Simplify the `serializeXfaData` method and related code
2025-05-04 13:38:01 +02:00
Jonas Jenwald
b3e16800f5 Remove the BasePdfManager.prototype.catalog getter
This is only invoked *once* and it can be trivially replaced by the `ensureCatalog`-method, since the code where it's used is already asynchronous.
2025-05-03 13:40:23 +02:00
Jonas Jenwald
b531720d9c Simplify the serializeXfaData method and related code
Rather than having a dedicated `BasePdfManager`-method for this one call-site we can instead change `PDFDocument.prototype.serializeXfaData` to a non-async method, that we invoke via `BasePdfManager.prototype.ensureDoc`.
2025-05-03 11:20:42 +02:00
Jonas Jenwald
122822a750 Simplify the loadXfaImages method and related code
Currently we create an intermediate `Dict` during parsing, however that seems unnecessary since (note especially the second point):
 - The `NameOrNumberTree.prototype.getAll` method will already resolve any references, as needed, during parsing.
 - The `Catalog.prototype.xfaImages` getter is invoked, via the `BasePdfManager`-instance, such that any `MissingDataException`s are already handled correctly.
2025-05-02 11:53:41 +02:00
calixteman
91bfe12f38
Merge pull request #19883 from gpanakkal/checkbutton-tostyle
Fix arguments in `toStyle` call in `CheckButton`
2025-05-01 22:08:03 +02:00
Gautam Panakkal
7bba3bd4ad Add missing this arg to toStyle in CheckButton.prototype.[$toHTML] 2025-05-01 10:19:28 -07:00
Jonas Jenwald
b629bafd1c Allow to, optionally, keep Unicode escape sequences in stringToPDFString (PR 17331 follow-up)
Currently *some* of the links[1] on page three of the `issue19835.pdf` test-case aren't clickable, since the destination (of the LinkAnnotation) becomes empty.
The reason is that these destinations include the character `\x1b`, which is interpreted as the start of a Unicode escape sequence specifying the language of the string; please refer to section [7.9.2.2 Text String Type](https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/PDF32000_2008.pdf#G6.1957385) in the PDF specification.

Hence it seems that we need a way to optionally disable that behaviour, to avoid a "badly" formatted string from becoming empty (or truncated), at least for cases where we are:
 - Parsing named destinations[2] and URLs.
 - Handling "strings" that are actually /Name-instances.
 - Building a lookup Object/Map based on some PDF data-structure.

*NOTE:* The issue that prompted this patch is obviously related to destinations, however I've gone through the `src/core/` folder and updated various other `stringToPDFString` call-sites that (directly or indirectly) fit the categories listed above.

---
[1] Try clicking on anything on the line containing "Item 7A. Quantitative and Qualitative Disclosures About Market Risk 27".

[2] Unfortunately just skipping `stringToPDFString` in this case would cause other issues, such as the named destination becoming "unusable" in the viewer; see e.g. issues 14847 and 14864.
2025-04-30 20:51:10 +02:00
Jonas Jenwald
254431df1e Avoid extra lookup/parsing when all destinations are already available
Whenever we cannot find a destination we'll fallback to checking all destinations, to account for e.g. out-of-order NameTrees, and in those cases any subsequent destination-lookups can be made a tiny bit more efficient by immediately checking the already cached destinations.
2025-04-30 15:26:00 +02:00
Jonas Jenwald
0922aa9e9d
Merge pull request #19880 from Snuffleupagus/numberToString-assert-number
Assert that `numberToString` is called with a number (issue 19877)
2025-04-29 20:35:32 +02:00
Jonas Jenwald
f5faf86180 Assert that numberToString is called with a number (issue 19877)
*NOTE:* Given that this is an *internal* function, used only in the worker-thread, it's not clear to me that this is an entirely "necessary" change.
2025-04-29 20:31:24 +02:00