Could not convert stream / PDF to Markdown #1134
Comments
Thanks for the report. Let's get to the bottom of this. What version of the library are you using? Did you install it with [all] or at least [pdf]? I plan to add a debug option and more Python logging to better support debugging these kinds of scenarios.
Seeing the same. I installed markitdown version 0.1.1 using: "pip install -e packages/markitdown[all]". The only install command that didn't fail was this (below), but it leads to something like OP's reported error when used:
====================
Ignore my previous comment, it was a "me" issue. Referencing it here in case anyone runs into the same thing: adding quotation marks around the target ( 'markitdown[all]' ) allowed a proper install.
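For anyone wondering why the quotation marks matter: many shells treat square brackets as a glob character class, so an unquoted `markitdown[all]` may be rewritten or rejected before pip ever sees it (zsh, for instance, aborts with "no matches found" by default). The snippet below is only a small illustration of that behaviour using the Python standard library; it does not run pip, and the `target` string is simply the one from the comment above.

```python
import fnmatch
import shlex

target = "markitdown[all]"

# Read as a glob pattern, "[all]" is a character class matching ONE of 'a' or 'l',
# so the pattern matches "markitdowna" but not the literal text "markitdown[all]".
print(fnmatch.fnmatchcase("markitdowna", target))
print(fnmatch.fnmatchcase("markitdown[all]", target))

# shlex.quote adds the single quotes needed to pass the text through a POSIX shell unchanged.
quoted = shlex.quote(target)
print(f"pip install {quoted}")
```

The last line prints the same quoted form that fixed the install above.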
Hey there,
I wanted to generate Markdown from a really long PDF document (roughly 100 pages). A simple print works, but as soon as the document should be converted to Markdown, I get the error below. Is there a known limitation on the length of a document?
Traceback (most recent call last):
  File "/Users/user/Desktop/Repositories/markitdown/script/markdown.py", line 73, in <module>
    main()
    ~~~~^^
  File "/Users/user/Desktop/Repositories/markitdown/script/markdown.py", line 34, in main
    text = process_file(file_path)
  File "/Users/user/Desktop/Repositories/markitdown/script/markdown.py", line 19, in process_file
    result = md.convert(file_path)
  File "/Users/user/Desktop/Repositories/markitdown/packages/markitdown/src/markitdown/_markitdown.py", line 259, in convert
    return self.convert_local(source, stream_info=stream_info, **kwargs)
           ~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/Desktop/Repositories/markitdown/packages/markitdown/src/markitdown/_markitdown.py", line 310, in convert_local
    return self._convert(file_stream=fh, stream_info_guesses=guesses, **kwargs)
           ~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/user/Desktop/Repositories/markitdown/packages/markitdown/src/markitdown/_markitdown.py", line 541, in _convert
    raise UnsupportedFormatException(
        f"Could not convert stream to Markdown. No converter attempted a conversion, suggesting that the filetype is simply not supported."
    )
markitdown._exceptions.UnsupportedFormatException: Could not convert stream to Markdown. No converter attempted a conversion, suggesting that the filetype is simply not supported
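Since the replies in this thread point at the install extras rather than document length, one thing that may be worth ruling out is a missing optional dependency. The sketch below is not part of markitdown: `missing_modules` and `convert_with_hint` are hypothetical helper names, and the choice of `pdfminer` as the module behind the `[pdf]` extra is an assumption that should be checked against the installed version. It only wraps the `md.convert(...)` call and the exception type visible in the traceback above, and a clean result does not prove the dependency was the cause.

```python
import importlib.util


def missing_modules(names):
    """Return the entries of `names` that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def convert_with_hint(path, required=("pdfminer",)):
    """Convert `path` to Markdown; on an unsupported-format error, report missing modules.

    `required` is an assumed list of optional modules, not something markitdown exposes.
    """
    # Imported lazily so missing_modules() can be used even if markitdown is not installed.
    from markitdown import MarkItDown
    from markitdown._exceptions import UnsupportedFormatException

    try:
        return MarkItDown().convert(path).text_content
    except UnsupportedFormatException:
        missing = missing_modules(required)
        if missing:
            raise RuntimeError(
                f"Conversion failed and these modules are not installed: {missing}. "
                "Reinstalling with quoted extras, e.g. pip install 'markitdown[all]', may help."
            ) from None
        # Nothing obviously missing: re-raise the original error unchanged.
        raise


# The standard-library part can be tried on its own:
print(missing_modules(["json", "surely_not_an_installed_module_xyz"]))
```

If `missing_modules` reports nothing and the error persists, the cause is likely elsewhere (for example, how the file type was detected), which this sketch does not investigate.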