Markdownify upgrade is causing problems #1236

harshitgtypeface · 2025-05-05T05:35:18Z

With the recent version of markdownify, the library no longer converts HTML content into proper Markdown. Instead, it often returns the original HTML as-is. This is especially noticeable with HTML extracted from PDFs, where the output lacks the expected Markdown formatting.

afourney · 2025-05-21T22:11:58Z

Is there a sample document you could provide?

Also, to clarify: if HTML is extracted from a PDF, it won't automatically be converted to Markdown. That kind of recursive processing is not currently automatic.
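
Since the recursive step is not automatic, one workaround is to run the second pass explicitly. Below is a minimal sketch using only the standard library. `looks_like_html` and `second_pass` are hypothetical helper names, not part of any library, and the HTML-to-Markdown converter is passed in as a parameter rather than assumed.

```python
from html.parser import HTMLParser
from typing import Callable


class _TagCounter(HTMLParser):
    """Counts opening tags so we can guess whether text is still HTML."""

    def __init__(self) -> None:
        super().__init__()
        self.tags = 0

    def handle_starttag(self, tag, attrs):
        self.tags += 1


def looks_like_html(text: str) -> bool:
    """Heuristic (an assumption): any opening tag means the text is HTML."""
    counter = _TagCounter()
    counter.feed(text)
    counter.close()
    return counter.tags > 0


def second_pass(extracted: str, html_to_md: Callable[[str], str]) -> str:
    """Run the converter only when the extracted text still contains tags."""
    if looks_like_html(extracted):
        return html_to_md(extracted)
    return extracted
```

If markdownify is installed, its `markdownify` function could be passed as `html_to_md`, e.g. `second_pass(text, markdownify)`. The same `looks_like_html` check may also help when building a small sample that shows HTML coming back unconverted.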
