Fix: Honor width and height HTML attributes in image generation #100
Open: nicolasiscoding wants to merge 14 commits into develop from fix/honor-html-image-dimensions
nicolasiscoding commented Sep 12, 2025
- `preProcessing` <?[Object]>
  - `skipHTMLMinify` <?[Boolean]> flag to skip minification of HTML. Defaults to `false`.
- `imageProcessing` <?[Object]>
  - `maxRetries` <?[Number]> maximum number of retry attempts for failed image downloads. Defaults to `2`.
Do the TypeScript typings.
684762c to f030781
96d6007 to c5eccae
- Check for width/height attributes in vNode.properties
- Use HTML-specified dimensions when available
- Fall back to actual image dimensions when not specified
- Particularly important for WYSIWYG editors like TinyMCE
- Added root:true to ESLint config to prevent parent config interference

Fixes #99

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Add HTML attribute processing in computeImageDimensions function
- HTML attributes without units default to pixels (e.g., width="100" → "100px")
- Support aspect ratio preservation when only width or height is specified
- Fix fallback logic that was overriding HTML attributes with original dimensions
- Add comprehensive test cases in example files

Fixes issue where TinyMCE image width/height attributes were ignored, causing all images to render at original size instead of specified dimensions.

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
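The attribute handling described in this commit can be sketched roughly as follows. This is a minimal illustration, not the PR's `computeImageDimensions`: the function name `resolveImageSize`, its parameters, and the rounding of the derived side are assumptions made for the example.

```javascript
// Hypothetical helper; the real logic lives in computeImageDimensions.
// attrs: HTML attributes from the <img>; natural: the decoded image size.
function resolveImageSize(attrs, natural) {
  // Bare numbers such as width="100" are treated as pixels.
  const toPx = (value) => {
    if (value === undefined || value === null || value === '') return null;
    const match = /^(\d+(?:\.\d+)?)(px)?$/.exec(String(value).trim());
    return match ? Number(match[1]) : null;
  };

  const width = toPx(attrs.width);
  const height = toPx(attrs.height);
  const ratio = natural.width / natural.height;

  // Both given: use them as-is.
  if (width !== null && height !== null) return { width, height };
  // Only one given: derive the other from the natural aspect ratio.
  if (width !== null) return { width, height: Math.round(width / ratio) };
  if (height !== null) return { width: Math.round(height * ratio), height };
  // Neither given: fall back to the actual image dimensions.
  return { width: natural.width, height: natural.height };
}
```

With an 800x600 source image, `resolveImageSize({ width: '200' }, { width: 800, height: 600 })` keeps the 4:3 ratio and yields a height of 150.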
- Extend calculateAbsoluteValues to support all units: px, pt, cm, in, %
- Update HTML attribute processing to detect all unit types
- Add comprehensive test cases covering all supported units:
  * Explicit pixel units (180px x 90px)
  * Point units (144pt x 72pt)
  * Centimeter units (4cm x 2cm)
  * Inch units (1.5in x 0.75in)
  * Percentage units (10% x 10%)
  * Mixed units (3cm width, 1in height)

This ensures full compatibility with TinyMCE and other rich text editors that may specify dimensions in various measurement units.

Test cases added to both example-node.js and example.js files for comprehensive validation of unit conversion accuracy.

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
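As a rough illustration of the unit handling listed above, the sketch below converts each supported unit to pixels. It assumes the CSS convention of 96 pixels per inch and a caller-supplied reference size for percentages; `toPixels` is a hypothetical name, and the PR's `calculateAbsoluteValues` may well convert to a different target unit.

```javascript
// Assumption of this sketch: 96 px per inch, as in CSS absolute lengths.
const PX_PER_INCH = 96;

// Hypothetical helper: returns a pixel value, or null if the input is not
// a number followed by one of px, pt, cm, in, %.
function toPixels(value, referencePx) {
  const match = /^(\d+(?:\.\d+)?)(px|pt|cm|in|%)?$/.exec(String(value).trim());
  if (!match) return null;
  const amount = Number(match[1]);
  switch (match[2] || 'px') { // unitless values default to pixels
    case 'px': return amount;
    case 'pt': return (amount * PX_PER_INCH) / 72;   // 72 pt per inch
    case 'in': return amount * PX_PER_INCH;
    case 'cm': return (amount * PX_PER_INCH) / 2.54; // 2.54 cm per inch
    case '%': return (amount * referencePx) / 100;   // relative to referencePx
    default: return null;
  }
}
```

Under these assumptions, `toPixels('144pt')` gives 192 and `toPixels('1.5in')` gives 144.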
- Add validation to detect when image URLs return HTML error pages instead of image data
- Add comprehensive buffer validation before calling sizeOf function
- Prevent "unsupported file type: undefined" errors with proper error detection
- Add graceful handling for invalid/corrupted image responses
- Provide clearer error messages for debugging image processing issues

This fixes the cryptic "unsupported file type: undefined (file: undefined)" errors by detecting when URLs return HTML error pages (common with Wikimedia) and providing meaningful error messages instead.

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
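One way such a check could look before the buffer is handed to `sizeOf` is sketched below. The function names, the 256-byte sniffing window, and the returned `{ ok, reason }` shape are assumptions for illustration, not the PR's code.

```javascript
// Hypothetical check: does the start of the buffer look like an HTML page?
function looksLikeHtml(buffer) {
  const head = buffer.subarray(0, 256).toString('utf8').trimStart().toLowerCase();
  return head.startsWith('<!doctype html') || head.startsWith('<html');
}

// Hypothetical validation step to run before measuring the image.
function validateImageBuffer(buffer) {
  if (!Buffer.isBuffer(buffer) || buffer.length === 0) {
    return { ok: false, reason: 'empty or non-buffer response' };
  }
  if (looksLikeHtml(buffer)) {
    return { ok: false, reason: 'response is an HTML page, not image data' };
  }
  return { ok: true };
}
```

A caller could log `reason` and skip the image instead of letting the size probe throw on unrecognized data.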
…d rate limiting

- Implement in-memory image cache with Map to store successful downloads
- Cache both successful downloads and failures to prevent retry spam within same document
- Clear cache between document generations to allow retry of failed URLs in new runs
- Add comprehensive cache statistics and logging for monitoring performance
- Prevent rate limiting by avoiding duplicate downloads of same image URLs
- Smart retry logic: cache failures per document, but allow retries across documents

Cache statistics show significant performance improvement:
- Only unique URLs are downloaded once per document generation
- Duplicate image references use cached data instantly
- Failed downloads are cached to prevent retry storms within same document
- Fresh attempts allowed for new document generations

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
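The caching behaviour described above could be sketched as follows. This is a simplified illustration under assumed names (`getImage`, `clearImageCache`, and an injected `download` function); the statistics and logging mentioned in the commit are omitted.

```javascript
// Per-document cache: URL -> { data } on success, { error } on failure.
const imageCache = new Map();

// Hypothetical wrapper around an injected async download(url) function.
async function getImage(url, download) {
  if (imageCache.has(url)) {
    const cached = imageCache.get(url);
    if (cached.error) throw cached.error; // cached failure: do not retry
    return cached.data;                   // cached success: no new request
  }
  try {
    const data = await download(url);
    imageCache.set(url, { data });
    return data;
  } catch (error) {
    imageCache.set(url, { error });
    throw error;
  }
}

// Call between document generations so failed URLs get a fresh attempt.
function clearImageCache() {
  imageCache.clear();
}
```

Note that this sketch does not deduplicate requests that are in flight at the same time; storing the pending promise in the map would be one way to cover that case.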
- Add comprehensive buffer validation before sizeOf calls in all locations
- Add HTML response detection to prevent "unsupported file type" errors
- Replace crashes with graceful error handling and continue processing
- Add retry mechanism with exponential backoff for image downloads
- Implement intelligent image caching to prevent rate limiting
- Clear cache between document generations to allow fresh retry attempts
- Add detailed logging for debugging image processing issues

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
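A retry loop with exponential backoff, as mentioned in this commit, might look like the sketch below. The helper name `withRetry`, the `baseDelayMs` parameter, and the doubling schedule are assumptions; only the default of two retries comes from the documented `maxRetries` option.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Hypothetical helper: runs task() once, then up to maxRetries more times,
// waiting baseDelayMs, 2x, 4x, ... between attempts.
async function withRetry(task, { maxRetries = 2, baseDelayMs = 500 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= maxRetries; attempt += 1) {
    try {
      return await task(attempt);
    } catch (error) {
      lastError = error;
      // Back off only if another attempt will follow.
      if (attempt < maxRetries) await sleep(baseDelayMs * 2 ** attempt);
    }
  }
  throw lastError;
}
```

With the defaults, a download that keeps failing is attempted three times in total before the last error is rethrown.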
- Add imageProcessing config section to defaultDocumentOptions in constants.js
- Make maxRetries configurable (default: 2) via documentOptions.imageProcessing.maxRetries
- Add verboseLogging option (default: false) for conditional debug output
- Remove duplicate constants - all reference centralized defaults from constants.js
- Update buildImage, convertVTreeToXML, and findXMLEquivalent to accept imageOptions
- Replace console.log with conditional logVerbose helper function
- Add comprehensive test script demonstrating configuration options
- Users can now configure: { imageProcessing: { maxRetries: 3, verboseLogging: true } }

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
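To make the configuration flow concrete, the sketch below merges user options over the stated defaults and builds a conditional logger. The defaults (`maxRetries: 2`, `verboseLogging: false`) are from the commit message; `resolveImageOptions` and `createVerboseLogger` are hypothetical names, and the PR's own `logVerbose` helper may be shaped differently.

```javascript
// Defaults as stated in the commit message.
const defaultImageProcessing = { maxRetries: 2, verboseLogging: false };

// Hypothetical helper: user-supplied values override the defaults.
function resolveImageOptions(documentOptions = {}) {
  return { ...defaultImageProcessing, ...(documentOptions.imageProcessing || {}) };
}

// Hypothetical helper: returns a logger that is silent unless
// verboseLogging is enabled.
function createVerboseLogger(imageOptions) {
  return (...args) => {
    if (imageOptions.verboseLogging) console.log(...args);
  };
}

const imageOptions = resolveImageOptions({
  imageProcessing: { maxRetries: 3, verboseLogging: true },
});
const logVerbose = createVerboseLogger(imageOptions);
logVerbose('image retries allowed:', imageOptions.maxRetries);
```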
- Add data URL detection in buildParagraph functions to handle cached images
- Prevent redundant imageToBase64 calls on already-processed data URLs
- Remove duplicate data URL parsing logic in buildParagraph
- Both buildParagraph sections now check if imageSource starts with 'data:'
- Resolves issue where cached images were being reprocessed causing sizeOf errors
- Images are now processed consistently across all code paths

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
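The `'data:'` prefix check described here can be illustrated with a small sketch. The names `isDataUrl` and `resolveBase64`, and the injected `download` function, are assumptions for the example rather than the code in `buildParagraph`.

```javascript
// True when the source is already an inline data URL.
function isDataUrl(source) {
  return typeof source === 'string' && source.startsWith('data:');
}

// Hypothetical helper: only fetch when the source is not already inlined.
async function resolveBase64(imageSource, download) {
  if (isDataUrl(imageSource)) {
    // Already processed: drop the "data:<mime>;base64," prefix.
    return imageSource.slice(imageSource.indexOf(',') + 1);
  }
  return download(imageSource);
}
```

Skipping the download for data URLs is what keeps a cached image from being pushed through the conversion step a second time.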
- Document maxRetries and verboseLogging configuration options
- Add practical TypeScript example showing image processing configuration
- Include options in main API documentation section
- Provide clear defaults and usage examples for new image processing features

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- buildImage now updates vNode.properties.src with cached/converted data URLs
- Eliminates dual processing of the same image through different code paths
- Prevents buildParagraph from reprocessing already-cached images
- Resolves "unsupported file type: undefined" errors completely
- All image processing paths now see consistent, processed data URLs

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
- Add imageProcessing interface with maxRetries and verboseLogging properties
- Keep preprocessing lowercase to match actual implementation
- TypeScript definitions now correctly reflect the implementation in constants.js

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
…ge processing

ISSUE 1 - Duplicate Buffer Creation (render-document-file.js):
- BEFORE: Buffer.from(response.fileContent, 'base64') called twice
  * For ZIP file creation
  * For image size analysis
- AFTER: Create imageBuffer once, reuse for both operations
- IMPACT: Reduces memory allocation and CPU usage for base64 decoding

ISSUE 2 - Code Duplication (render-document-file.js):
- BEFORE: Identical lineRule attribute setting code in two locations
  * figure > img case
  * direct img case
- AFTER: Extract into addLineRuleToImageFragment() helper function
- IMPACT: DRY principle, single source of truth for lineRule logic

ISSUE 3 - Double Regex Execution (docx-document.js):
- BEFORE: matches[1].match(/\/(.*?)$/) called twice in same expression
- AFTER: Execute once, store in mimeTypePart variable, reuse result
- IMPACT: Eliminates redundant regex execution and potential null reference

Applied to Nicolas's html-honor branch with retry/caching functionality intact.
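The ISSUE 3 change can be illustrated in isolation. The regex and the `mimeTypePart` variable name are taken from the commit message; the wrapping function `subtypeFromMimeType` and its null return are assumptions for the example.

```javascript
// Hypothetical wrapper around the single regex execution described in ISSUE 3.
function subtypeFromMimeType(mimeType) {
  // Run the match once and keep the result, instead of matching twice.
  const mimeTypePart = mimeType.match(/\/(.*?)$/);
  // A failed match returns null, so guard before indexing.
  return mimeTypePart ? mimeTypePart[1] : null;
}
```

Storing the result also gives one obvious place for the null guard, which is the "potential null reference" the commit message refers to.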
…controls

- Replace image-to-base64 library with axios for better control over HTTP requests
- Add configurable timeout (default 5s) to prevent hung image downloads
- Add maximum image size limits (default 10MB) to prevent memory issues
- Implement exponential backoff on timeouts for retry attempts
- Add proper error handling for HTTP status codes and network errors
- Update TypeScript definitions and documentation

Security improvements:
- Prevents DoS attacks from slow/unresponsive image servers
- Bounded resource usage with size and timeout limits
- Better error categorization (timeout vs network vs HTTP errors)

DRY improvements:
- Extracted downloadImageToBase64 to src/utils/image.js
- Eliminated duplicate function definitions between helper files
- Added input validation for timeout/size parameters
- Enhanced error messages with structured logging

🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>
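The input validation and limits mentioned above could be expressed as a small function that builds an axios request config. This sketch does not import axios and is not the PR's `downloadImageToBase64`; `buildImageRequestConfig` and its parameter names are hypothetical, and 10MB is assumed to mean 10 * 1024 * 1024 bytes. The `timeout`, `responseType`, and `maxContentLength` keys are standard axios request options.

```javascript
const DEFAULT_TIMEOUT_MS = 5000;             // "default 5s"
const DEFAULT_MAX_BYTES = 10 * 1024 * 1024;  // "default 10MB", assumed binary MB

// Hypothetical helper: validates the limits and returns an object that
// could be passed as the second argument to axios.get(url, config).
function buildImageRequestConfig({
  timeoutMs = DEFAULT_TIMEOUT_MS,
  maxBytes = DEFAULT_MAX_BYTES,
} = {}) {
  if (!Number.isFinite(timeoutMs) || timeoutMs <= 0) {
    throw new RangeError(`timeoutMs must be a positive number, got ${timeoutMs}`);
  }
  if (!Number.isInteger(maxBytes) || maxBytes <= 0) {
    throw new RangeError(`maxBytes must be a positive integer, got ${maxBytes}`);
  }
  return {
    responseType: 'arraybuffer', // binary body, suitable for base64 encoding
    timeout: timeoutMs,          // abort requests that take longer than this
    maxContentLength: maxBytes,  // reject responses larger than this
  };
}
```

Validating up front means a bad option fails once with a clear message rather than on every image request.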
8155dcd to 7fda70e
Summary
Fixes #99 - Images with width/height HTML attributes now render at the specified dimensions instead of always using actual image size.
Problem
When HTML contains images with explicit width/height attributes (common from WYSIWYG editors like TinyMCE), these dimensions were being ignored in favor of the actual image dimensions.
Solution
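- Check `vNode.properties.width` and `vNode.properties.height`
- Use the HTML-specified dimensions when available
- Fall back to the actual image dimensions when not specified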
Example
An image whose HTML attributes specify a width and height of 100 will now render as 100x100 in the DOCX, regardless of the actual image size.
Changes
- `buildImage` function in `src/helpers/render-document-file.js`
- Added `root: true` to `.eslintrc.json` to isolate submodule ESLint config
Testing
🤖 Generated with Claude Code