791 Commits

Author SHA1 Message Date
Calixte Denizet
ac925f4f1b Downscale jpeg2000 images, if needed, while decoding them
It fixes #19517.
2025-05-05 22:39:59 +02:00
Calixte Denizet
7a251b206e Fix the bbox when saving a rotated text field (bug 1963407) 2025-04-29 18:49:07 +02:00
Jonas Jenwald
64007e777e Ensure that the /Form XObject /Resources-entry is actually a dictionary (issue 19848) 2025-04-23 10:19:20 +02:00
Jonas Jenwald
1048508dd1 Catch circular references in /Form XObjects (issue 19800)
For simplicity we will abort /Form XObject parsing *immediately* when encountering a circular reference, rather than letting it continue up until some limit (as e.g. PDFium appears to do), which should be fine since there are never any guarantees if/how *corrupt* PDF documents will render.
2025-04-11 16:54:22 +02:00
Jonas Jenwald
835a456767 Use adjustWidths unconditionally for all embedded fonts (issue 19802)
Previously we'd only do this for Type1/CFF fonts, see e.g. PR 6736, since the font-program may update the /FontMatrix.
However, it seems that we should do this unconditionally to account for fonts with non-default /FontMatrix-entries in the font-dictionary (which seem to be pretty rare).
2025-04-11 15:01:35 +02:00
Jonas Jenwald
fbc4f4b12a Handle non-integer and out-of-range values correctly in Indexed color spaces
In PDF version 2.0 the handling of Indexed color spaces was clarified as follows:
> The index value should be an integer in the range 0 to hival. If the value is a real number, it shall be rounded to the nearest integer (0.5 values shall be rounded up); if it is outside the range 0 to hival, it shall be adjusted to the nearest value within that range.

Please refer to https://github.com/pdf-association/pdf-differences/tree/main/IndexedColor
2025-04-09 15:31:49 +02:00
Nicolò Ribaudo
da5b681b16
[chrome] Fix text selection with .markedContent
The current text layer approach based on absolutely positioned
`<span>` elements by default causes flickering with text selection,
and we have browser-specific workarounds to solve that.

In Chrome, the workaround involves moving the `.endOfContent` element to
right after the last element that contains some selected content. This
works well in simple PDFs, but breaks when we have `span.markedContent`
elements. Given a text layer structure like the following, rendered
as four consecutive lines:
```html
<span class="markedContent">
  <br>
  <span>development enter the construction phase (estimated at around</span>
</span>
<span class="markedContent">
  <br>
  <span>300 MEUR).</span>
</span>
<span class="markedContent">
  <br>
  <span>Kreate's EBITA increased to 2.8 MEUR (Q4'23: 2.7 MEUR) and the</span>
</span>
<span class="markedContent">
  <br>
  <span>margin rose to 3.7% (Q4'23: 3.4%). However, profitability was</span>
</span>
```
when starting to select from inside the first line and dragging down
to the empty space after the second line, Chrome will anchor the
selection at the beginning of either the `<br>` or the `<span>` inside
the last `.markedContent`, depending on whether the selection is in
"per-character mode" (i.e. click and drag) or "per-word mode" (i.e.
double click and drag). This causes us to insert the `.endOfContent`
element in the wrong place (one element too far), which causes one
more line to be selected, which triggers another `"selecctionchange"`
event, which causes us to move `.endOfContent` again, and so on, looping
until when the whole page is selected.

This commit fixes the issue by making sure that when the end of the
selection range points to the _begining_ of an element, we walk back
the dom finding the first non-empty element, and attatch `.endOfContent`
to the end of that.
2025-04-07 17:36:40 +02:00
Jonas Jenwald
667645798f Apply char/word-spacing correctly for missing Type3-glyphs
In the included PDF document the Type3-font doesn't contain any glyph definition for "space", despite that character being referenced in the /Contents stream.
While missing Type3-glyphs obviously cannot be rendered, we still need to update the current canvas position such that any char/word-spacing is correctly applied.

The test-case was found at https://github.com/pdf-association/pdf-differences/tree/main/Type3WordSpacing
2025-03-29 00:12:08 +01:00
Calixte Denizet
a3c31904f1 Take into account the group bbox
It fixes #16742.
2025-03-24 15:07:31 +01:00
Calixte Denizet
2369e2d84f Take into account the path and the line width when consuming a stroked path 2025-03-23 18:08:06 +01:00
Jonas Jenwald
f90732e4c3 Extend getSupplementalGlyphMapForCalibri with Pound-sign (issue 19695) 2025-03-20 20:12:24 +01:00
Calixte Denizet
276c8f3518 [Editor] Take into account the parent rotation when switch to edit mode (issue #19424) 2025-03-19 14:26:55 +01:00
Jonas Jenwald
ee34c5c648 Let Lexer.prototype.getNumber treat more cases of a single minus sign as zero (bug 1953099)
This patch extends the approach of PR 14543, by also treating e.g. minus signs followed by '(' or '<' as zero.
Inside of a /Contents stream those characters will generally mean the start of one or more glyphs.
2025-03-12 17:50:13 +01:00
Calixte Denizet
4b4f85484e Always use the absolute value of the line thickness (issue 19633) 2025-03-11 14:03:23 +01:00
Calixte Denizet
f3454a738d [Editor] In edit mode, a non-editable stamp must be visible after the page is rendered 2025-03-06 15:37:27 +01:00
Jonas Jenwald
59cb9a064e Extend getGlyphMapForStandardFonts with some Cyrillic entries (issue 19550) 2025-02-26 10:16:06 +01:00
Ross Johnson
4f25d7f6cd Fix decryption of R=4, V=4 files with < 16-byte keys by 0-padding - undocumented but matches Acrobat behavior (issue #19484) 2025-02-24 15:36:37 -06:00
Jonas Jenwald
553ec7babc
Merge pull request #19128 from nicolo-ribaudo/draw-page-portion
[api-minor] Render high-res partial page views when falling back to CSS zoom (bug 1492303)
2025-02-21 22:24:41 +01:00
Xiphoseer
24aa39eb14 Consider textRise when showing type3 font glyphs
Add test case for issue 19532
2025-02-21 21:31:04 +01:00
Nicolò Ribaudo
458b2ee402
[api-minor] Render high-res partial page views when falling back to CSS zoom (bug 1492303)
When rendering big PDF pages at high zoom levels, we currently fall back
to CSS zoom to avoid rendering canvases with too many pixels. This
causes zoomed in PDF to look blurry, and the text to be potentially
unreadable.

This commit adds support for rendering _part_ of a page (called
`PDFPageDetailView` in the code), so that we can render portion of a
page in a smaller canvas without hiting the maximun canvas size limit.

Specifically, we render an area of that page that is slightly larger
than the area that is visible on the screen (100% larger in each
direction, unless we have to limit it due to the maximum canvas size).
As the user scrolls around the page, we re-render a new area centered
around what is currently visible.
2025-02-21 10:00:55 -08:00
Calixte Denizet
8f9232202f [JS] Skip throwing actions
It fixes #19505.

We were invaliding throwing actions (in setting event.rc to false) and all the event process was stopped.
Now we're just dumping the exception in the console: the action is skipped and event.rc is not set
else the input fields aren't updated wit KeyStroke actions.
2025-02-17 21:58:20 +01:00
Jonas Jenwald
33cba30bdb Search for destinations in both /Names and /Dests dictionaries (issue 19474)
Currently we only use either one of them, preferring the NameTree when it's available.
2025-02-14 15:49:05 +01:00
Jonas Jenwald
bd05b255fa [api-major] Apply the userUnit using CSS, to fix the text/annotation layers (bug 1947248)
Rather than modifying the "raw" dimensions of the page, we'll instead apply the `userUnit` as an *additional* scale-factor via CSS.

*Please note:* It's not clear to me if this solution is fully correct either, or if there's other problems with it, but it at least *appears* to work.

---

With these changes, the following CSS variables are now assumed to be available/set as necessary: `--total-scale-factor`, `--scale-factor`, `--user-unit`, `--scale-round-x`, and `--scale-round-y`.
2025-02-11 14:36:06 +01:00
Jonas Jenwald
37ef0b4a86 Ignore the URLs when checking if inferred links overlap existing LinkAnnotations (PR 19110 follow-up)
*Note:* For the issue mentioned on Matrix it'll obviously still make sense to improve the regular expression to detect more URL edge-cases.

However it occurred to me that even once that particular case is fixed there'll always be a risk that inferred links could overlap, and effectively block, the actual LinkAnnotations.
Hence this patch removes the URL comparison to ensure that overlapping inferred links will always be ignored.
2025-02-09 22:06:58 +01:00
Ujjwal Sharma
61ba1ea48c Enable automatic URL linking
Automatically detect links in the text content of a file and automatically
generate link annotations at the appropriate locations to achieve
automatic link detection and hyperlinking.
2025-02-05 16:56:54 +01:00
Jonas Jenwald
6f2706fad6 Support the password field-flag in TextWidgetAnnotation (issue 19389) 2025-01-29 12:40:09 +01:00
Calixte Denizet
1ccf6ed976 Correctly render the glyph outline when it has a stroke pattern
It fixes #19360.

Each glyph in the test case has a fill and a stroke pattern, so the current transform used
to scale the glyph outline must be the same.
In setting the stroke color to green, I noticed that the last outline contains some non-closed
subpaths, so when generating the glyph outline, every time we 'moveTo', we close the previous
subpath.
2025-01-21 15:30:16 +01:00
Jonas Jenwald
c4ba3ac23f Replace the EXIF-block with dummy data to prevent JPEG images being rotated (bug 1942064)
The `ImageDecoder` will respect the EXIF orientation, which can lead to JPEG images being incorrectly rotated. To avoid this we replace the entire EXIF-block with dummy data, which works since it'll cause EXIF parsing to bail out early in Firefox; see https://searchfox.org/mozilla-central/rev/9a66d18cb35595c89f499a1011c9dd7e573fce77/image/decoders/EXIF.cpp#130-138
2025-01-20 16:50:22 +01:00
Jonas Jenwald
e5bc760316 Access the number of components correctly in JPEG 2000 images with color space entries (issue 19326)
This small typo appears to be a regression from PR 18204.
2025-01-15 10:16:06 +01:00
Calixte Denizet
06f72d5662 [Editor] Don't try to use an non-existing canvas when rendering an invisible existing stamp editor
It fixes #19239.

When the canvas isn't existing the editor has no image: it's fine because the editor is invisible.
Once it's made visible, the canvas is set when the annotation layer has been rendered.
2025-01-10 16:05:59 +01:00
Jonas Jenwald
916fff0e42 Access the bbox/background data correctly in the MeshShadingPattern class (issue 18816)
This appears to have regressed in PR 13808, since it removed the `matrix`-entry from array returned by the `MeshShading.prototype.getIR` method *without* also updating the indexes in the `MeshShadingPattern` constructor.
2025-01-08 15:57:56 +01:00
Jonas Jenwald
20d5332009 For images that include SMask/Mask entries, ignore an SMask defined in the current graphics state
From section [11.6.4.3 Mask Shape and Opacity](https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/PDF32000_2008.pdf#G10.4848628) in the PDF specification:
 - An image XObject may contain its own *soft-mask image* in the form of a subsidiary image XObject in the `SMask` entry of the image dictionary (see "Image Dictionaries"). This mask, if present, shall override any explicit or colour key mask specified by the image dictionary's `Mask` entry. Either form of mask in the image dictionary shall override the current soft mask in the graphics state.
2024-12-30 14:25:07 +01:00
Nicolò Ribaudo
4e2aabc5cd
Fix left offset when scrolling to search result
When computing the left offset of the highlighted text, we cannot use
.offsetLeft because the text might have been scaled through CSS, and it
needs to be taken into account.

Use `.getClientRects()`/`.getBoundingClientRect()` instead, which will
return measurements scaled appropriately.
2024-12-16 16:04:13 +01:00
Calixte Denizet
d61b882888 [Editor] Avoid to focus an existing editor when enabling the layer
It fixes #18911.
2024-12-11 15:06:56 +01:00
Jonas Jenwald
c6e3fc4fe6 Take the userUnit into account in the PageViewport class (issue 19176) 2024-12-08 15:51:04 +01:00
Calixte Denizet
6fe6b6d6b7 Get the first codepoint instead of the first char when using the toUnicode map
It fixes #19182.
2024-12-06 18:25:13 +01:00
Calixte Denizet
7e02c77250 [Editor] Make ink annotation editable 2024-12-02 17:15:33 +01:00
Calixte Denizet
1460a8128a Fix the clickable area for rotated ink annotations 2024-12-02 09:48:22 +01:00
Calixte Denizet
2a337082c0 Build date consistently (in term of tz) when executing some embedded JS (bug 1934157)
The date was create in UTC+0 and then amended in using set-Month/Date which take into account
the user timezone.
With this patch we build all the date in the user timezone.
2024-11-29 19:50:21 +01:00
Calixte Denizet
aa9503e51f Correctly compute the mapping between text and normalized text when it contains a compound word on two lines
It fixes #19120.

The original text doesn't contain the cr so we must take that into account.
2024-11-28 15:56:04 +01:00
Calixte Denizet
b0b0de98e7 Use the V entry as an option when no options in a choice widget
It fixes #19083.

It isn't really a fix but more a workaround (we should correctly implement the
choice widget as a mix of text input+select).
2024-11-21 17:27:34 +01:00
Jonas Jenwald
e92a929a58 Try to improve handling of missing trailer dictionaries in XRef.indexObjects (issue 18986)
The problem with the referenced PDF document has nothing to do with invalid dates, as the issue seems to suggest, but rather with the fact that it has neither an XRef table nor a trailer dictionary.
Given that crucial parts of the internal document structure is missing, you might argue that it's not really a PDF document.

In an attempt to support this kind of corruption, we'll simply iterate through all (previously found) XRef entries and pick one that *might* be a valid /Root dictionary.
There's obviously no guarantee that this works, and it might not be fast in larger PDF documents, but at least it cannot be any worse than *immediately* throwing `InvalidPDFException` as we previously did here.

*Please note:* I'm totally fine with this patch being rejected, since it's somewhat questionable if we should actually attempt to support "PDF documents" with this level of corruption.
2024-11-05 18:19:26 +01:00
Calixte Denizet
d114f71feb Always fill the mask with the backdrop color
It fixes #18956.

In the patch #18029, for performance reasons and because I thought it was useless, I deliberately chose to not fill the mask
with the backdrop color when it's full black: it was a bad idea.
So in this patch we always add the backdrop color to the mask.
2024-10-26 14:14:51 +02:00
Calixte Denizet
86d1272d1b When changing a property on a radio button then change it for the siblings (bug 1922766) 2024-10-17 18:27:12 +02:00
Jonas Jenwald
689ffda9df
Merge pull request #18902 from Snuffleupagus/pdkids-rm-linked-test
Add the `pdkids` PDF document to the repository
2024-10-15 22:15:09 +02:00
Alexander Grahn
441efe456e Optional Content (OC) radiobutton (RB) groups implemented. Resolves #18823.
The code parses the /RBGroups entry in the OC configuration dict and adds the property `rbGroups' to instances of the OptionalContentGroup class. rbGroups takes an array of Sets, where each Set instance represents an RB group the OptionalContentGroup instance is a member of. Such a Set instance contains all OCG ids within the corresponding RB group. RB groups an OCG is associated with are processed when its visibility is set to true, as required by the PDF spec.
2024-10-15 11:34:45 +02:00
Jonas Jenwald
fb3c7b6d8f Add the pdkids PDF document to the repository
Given that the sub-title of that document is "Public domain texts for young people." and that the images have clear sources at the end of the document, it should (hopefully) be OK to add it to the repository rather than relying on a linked test-case.
2024-10-15 10:55:17 +02:00
Calixte Denizet
e7ab8cd8c1 Fallback on gray colorspace when there are no colorspace and no name in the scn/SCN arguments
It fixes #18894.
2024-10-13 16:02:07 +02:00
Calixte Denizet
3194f3de8b Keep the empty lines in the text fields
It fixes #18036.
2024-10-05 16:19:41 +02:00
Calixte Denizet
3103deaa44 Fix missing annotation parent in using the one from the Fields entry
Fixes #15096.
2024-10-04 20:00:19 +02:00