- When adding page dict candidates to the lookup tree, also initiate fetching them from xref, so if they are not yet loaded at all, the XHR will be sent
- Only at the top level - assume that if there is a /Pages tree, it is sensibly structured and the number of requests won't be too bad
- We can then await the cached Promise without the requests being pipelined one after another (see the sketch after this list)
- This yields a significant performance improvement for load-on-demand (i.e. with auto-fetch turned off) when a PDF has a large number of pages in the top-level /Pages collection, and those pages are spread throughout the file, so every candidate needs to be fetched separately
- PDFs with many pages, where each page is a big image and all the pages are at the top level, are quite a common output of digitisation programmes
- I would have liked to do something like "if it's the top-level collection and the page count equals the number of kids, then just fetch that page without traversing the tree", but unfortunately I agree with the comments on #8088 that there is no good general solution for allowing /Pages nodes with empty /Kids arrays
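A minimal sketch of the prefetching idea, assuming `xref`, `pagesDict` and the `Ref`/`RefSet` primitives from the PDF.js source are in scope; the loop itself is illustrative rather than the actual patch:

```js
// Kick off the fetch for every top-level page-dict candidate right away,
// so the underlying XHRs go out in parallel instead of one per candidate.
const visitedRefs = new RefSet();

for (const kidRef of pagesDict.get("Kids")) {
  if (!(kidRef instanceof Ref) || visitedRefs.has(kidRef)) {
    continue;
  }
  visitedRefs.put(kidRef);
  // Start the request now; since the resulting Promise is cached, awaiting
  // it later (when this candidate is examined) won't trigger a new XHR.
  xref.fetchAsync(kidRef).catch(() => {
    // Errors are handled where the candidate is actually consumed.
  });
}
```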
Given that we handle non-embedded Calibri fonts as "mapped to standard font", we really ought to be able to use the same glyph mapping as for an actual standard font.
Note that this actually improves consistency in the code, given how we already handle such fonts if they happen to be of the `CIDFontType2` type; see b47c7eca83/src/core/fonts.js (L1186-L1190)
We have a number of base-classes that are only intended to be extended, but never to be used directly. To help enforce this during development, these base-class constructors check for direct usage; however, that code is obviously not needed in the actual builds.
*Note:* This patch reduces the size of the `gulp mozcentral` output by `~2.7` kilobytes, which isn't a lot but still cannot hurt.
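The pattern, in sketch form, using `BaseCanvasFactory` as an example and the `unreachable` helper from `src/shared/util.js` (the exact `PDFJSDev` define-test may differ from the real code):

```js
import { unreachable } from "../shared/util.js";

class BaseCanvasFactory {
  constructor() {
    // Development/testing-only guard against direct instantiation; in the
    // actual builds the entire `PDFJSDev` branch is stripped out.
    if (
      (typeof PDFJSDev === "undefined" || PDFJSDev.test("TESTING")) &&
      this.constructor === BaseCanvasFactory
    ) {
      unreachable("Cannot initialize BaseCanvasFactory.");
    }
  }
}
```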
Added annotations can have some quadpoints (highlight, ink).
The `isNumberArray` check was returning false and consequently the annotation wasn't printable.
The tests didn't catch this issue because the quadpoints were passed as an Array, so driver.js has been updated to pass them as a Float32Array instead, matching the real-life situation.
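In sketch form, the kind of check involved (this is illustrative, not the actual `isNumberArray` helper):

```js
// Accept both plain Arrays and typed arrays such as Float32Array;
// typed arrays can only ever hold numbers, so checking the length suffices.
function isNumberArray(arr, len) {
  if (ArrayBuffer.isView(arr)) {
    return len === null || arr.length === len;
  }
  return (
    Array.isArray(arr) &&
    (len === null || arr.length === len) &&
    arr.every(x => typeof x === "number")
  );
}

// Quadpoints as they arrive in real documents:
isNumberArray(new Float32Array([0, 0, 1, 0, 0, 1, 1, 1]), 8); // true
```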
According to the PDF specification these destinations should have a zoom parameter, which may however be `null`, but it shouldn't be omitted; please see https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/PDF32000_2008.pdf#G11.2095870
Hence we try to work around bad PDF generators by making the zoom parameter optional when validating explicit destinations in both the worker and the viewer.
According to the PDF specification these destinations should have a coordinate parameter, which may however be `null`, but it shouldn't be omitted; please see https://opensource.adobe.com/dc-acrobat-sdk-docs/pdfstandards/PDF32000_2008.pdf#G11.2095870
Hence we try to work around bad PDF generators by making the coordinate parameter optional when validating explicit destinations in both the worker and the viewer.
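A rough sketch of the lenient validation, with illustrative names (the actual worker/viewer validators are more involved, and the destination name is assumed to be a `Name`-like object with a `name` property):

```js
// Validate an /XYZ destination [pageRef, /XYZ, left, top, zoom], treating
// each trailing parameter as a number, null, or omitted entirely, to cope
// with generators that drop e.g. the zoom or a coordinate value.
function isValidXYZDest(dest) {
  if (!Array.isArray(dest) || dest.length < 2 || dest.length > 5) {
    return false;
  }
  const [, name, ...params] = dest; // params: [left, top, zoom], may be short.
  if (name?.name !== "XYZ") {
    return false;
  }
  return params.every(p => typeof p === "number" || p === null);
}
```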
Add a unit test to check compatibility with such cmaps
In the PDF in issue 18099, the toUnicode cmap had a line mapping the glyph char codes from 00 to 7F to the corresponding code points. The syntax for mapping a range of char codes to a range of Unicode code points is
<start_char_code> <end_char_code> <start_unicode_codepoint>
As the Unicode code points are supposed to be given in UTF-16 BE, the PDF's line should probably have read
<00> <7F> <0000>
Instead, it omitted two leading zeros from the UTF-16 value, like this:
<00> <7F> <00>
This confused PDF.js into mapping these character codes to the UTF-16 characters with the corresponding HIGH bytes (01 became \u0100, 02 became \u0200, et cetera), which ended up turning Latin text in the PDF into Chinese when it was copied.
I'm not sure if the PDF spec actually allows PDFs to do this, but since there's at least one PDF in the wild that does, and other PDF readers read it correctly, PDF.js should probably support this.
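Conceptually the fix amounts to left-padding a too-short destination value, as in this sketch (hex strings are used for readability; the real cmap parser works on raw bytes):

```js
// Pad a bfrange destination that is narrower than the expected UTF-16 BE
// width, so that <00> <7F> <00> behaves like <00> <7F> <0000>.
function padDestHex(hex, byteWidth = 2) {
  return hex.padStart(2 * byteWidth, "0");
}

padDestHex("00"); // "0000" -> U+0000, instead of being read as a high byte
padDestHex("41"); // "0041" -> U+0041 ("A")
```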
The `renderForms` parameter pre-dates the introduction of the general `intent` parameter, which means that we're now effectively passing the same state twice to these `getOperatorList` methods.
Similar to the `mustBeViewed` method, we can check the relevant parameters within the `mustBeViewedWhenEditing` method itself since that (in my opinion) slightly helps readability of the code in the `src/core/document.js` file.
In *hindsight* this seems like a better idea, since it avoids the need to manually pass `isEditing` around as a boolean value.
Note that `RenderingIntentFlag` is *internal* functionality, not exposed in the official API, which means that it can be extended and modified as necessary.
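In sketch form, since `RenderingIntentFlag` is a bit-field, the editing state can be folded into the intent rather than threaded through as a separate boolean (the `IS_EDITING` flag and its value are shown here as assumptions, for illustration only):

```js
// A cut-down stand-in for the internal RenderingIntentFlag bit-field.
const RenderingIntentFlag = {
  ANY: 0x01,
  DISPLAY: 0x02,
  PRINT: 0x04,
  IS_EDITING: 0x08, // assumed value, for illustration only
};

const intent = RenderingIntentFlag.DISPLAY | RenderingIntentFlag.IS_EDITING;
const isEditing = !!(intent & RenderingIntentFlag.IS_EDITING); // true
```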
Right now, editable annotations use their own canvas when they're drawn, but
this induces several issues:
- if the annotation has to be composited with the page, then the canvas must be correctly
composited with its parent. That means we would have to move the canvas under the canvasWrapper
and extract compositing info from the drawing instructions...
This is currently the case with highlight annotations.
- we use some extra memory for those canvases even if the user never edits them, which
is the case, for example, when opening a PDF in Fenix.
So with this patch, all the editable annotations are drawn on the page canvas. When the
user switches to editing mode, the pages with editable annotations are redrawn, but
without them: they're replaced by their counterparts in the annotation editor layer (a rough sketch of this follows below).
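Purely illustrative pseudologic, with hypothetical method names, for the mode switch:

```js
// When entering editing mode, redraw any page that contains editable
// annotations, skipping those annotations on the page canvas; the
// annotation editor layer renders their editable counterparts instead.
function onEditingModeChanged(pageViews, isEditing) {
  for (const pageView of pageViews) {
    if (pageView.hasEditableAnnotations()) {
      pageView.redraw({ skipEditableAnnotations: isEditing });
    }
  }
}
```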
There's no specification for that (even if it's possible to get an idea from
the XFA specs), so we just want to hide them in order to avoid displaying something
wrong.
When an image has a non-zero SMaskInData it means that the image
has an alpha channel.
With JPX images, the colorspace isn't required (by spec), so when we
don't have it the JPX decoder will handle the conversion to RGBA
format.
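In sketch form, with a hypothetical helper (`Dict.get`/`Dict.has` are as in the PDF.js primitives):

```js
// A non-zero /SMaskInData means the JPX codestream itself carries alpha,
// and /ColorSpace is optional for JPXDecode images; when it's absent the
// JPX decoder is left to handle the conversion, producing RGBA output.
function jpxProducesRGBA(imageDict) {
  const smaskInData = imageDict.get("SMaskInData") || 0;
  return smaskInData !== 0 && !imageDict.has("ColorSpace");
}
```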
Instead of sending an array of Objects for a list of points (or quadpoints) to the main thread,
we'll send just a basic float buffer.
It should slightly improve performance (especially when cloning the data) and use slightly less memory.
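A minimal sketch of the flattening:

```js
// Pack a list of points into a flat Float32Array rather than an array of
// {x, y} Objects, so that structured cloning to the main thread is cheap
// (and the underlying ArrayBuffer could even be transferred outright).
function flattenPoints(points) {
  const buffer = new Float32Array(points.length * 2);
  for (let i = 0; i < points.length; i++) {
    buffer[2 * i] = points[i].x;
    buffer[2 * i + 1] = points[i].y;
  }
  return buffer;
}

// e.g. postMessage({ quadPoints: flattenPoints(points) }, [/* transfers */]);
```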
Note that the referenced file is trivially corrupt, since it contains *two* PDF documents placed in the same file which doesn't make sense (and isn't how a PDF document should be updated).
However it's still a good idea to ensure that `loadFont` is able to handle errors when resolving References, since that allows us to invoke the existing fallback font handling.
This also required changing the initial `charCodeToGlyphId`-data to an Object, which seems generally correct since it's consistent with existing code in the `src/core/{cff_font, type1_font}.js` files.
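The error-handling shape, sketched as a hypothetical helper (`warn` and `Dict` as in the PDF.js source; the real `loadFont` does considerably more):

```js
import { Dict } from "./primitives.js";
import { warn } from "../shared/util.js";

// Resolving a font Reference may reject in a corrupt document; catching
// that, instead of letting it propagate, lets the existing fallback-font
// handling kick in.
async function loadFontDict(xref, fontRef) {
  try {
    const fontDict = await xref.fetchAsync(fontRef);
    if (fontDict instanceof Dict) {
      return fontDict;
    }
  } catch (ex) {
    warn(`loadFont - lookup failed: "${ex}".`);
  }
  return null; // The caller invokes the fallback-font handling.
}
```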
For images that failed to decode once we want to avoid a pointless round-trip to the main-thread, which could otherwise happen for globally cached images.
- These changes will allow a simpler way of implementing PR 17770.
- The /Lang attribute is fetched lazily, with the first `getTextContent` invocation. Given the existing worker-thread caching, this will thus only need to be done *once* per PDF document (and most PDFs don't include this data).
- This makes the /Lang attribute *directly available* in the `textLayer` (see the sketch below), which has the following advantages:
- We don't need to block, and thus delay, overall viewer initialization on fetching it (nor pass it around throughout the viewer).
- Third-party users of the `textLayer` will automatically benefit from this, once we start actually using the /Lang attribute in PR 17770.
*Please note:* This also, importantly, means that the `text` reference-tests will then cover this code (which wouldn't otherwise have been the case).
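In sketch form, assuming the /Lang value is surfaced on the `getTextContent` result (the `lang` property name and the `textLayerDiv` element are assumptions here):

```js
const textContent = await page.getTextContent();
// With /Lang exposed alongside the text-content items, the textLayer can
// set the language directly, without the viewer fetching it separately.
if (textContent.lang) {
  textLayerDiv.setAttribute("lang", textContent.lang); // e.g. "en-US"
}
```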