mark-down: unable to convert pdf file #275

I'm running the command below, but I get the following error:

markitdown path/to/file.pdf > document.md

Traceback (most recent call last):
  File "/opt/anaconda3/bin/markitdown", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File "/opt/anaconda3/lib/python3.11/site-packages/markitdown/__main__.py", line 42, in main
    result = markitdown.convert(args.filename)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.11/site-packages/markitdown/_markitdown.py", line 1094, in convert
    return self.convert_local(source, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.11/site-packages/markitdown/_markitdown.py", line 1114, in convert_local
    return self._convert(path, extensions, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.11/site-packages/markitdown/_markitdown.py", line 1255, in _convert
    raise FileConversionException(
markitdown._markitdown.FileConversionException: Could not convert '/Users/pranshujain/Desktop/python/markitdown/src/test.pdf' to Markdown. File type was recognized as ['.pdf', '.pdf']. While converting the file, the following error was encountered:

Traceback (most recent call last):
  File "/opt/anaconda3/lib/python3.11/site-packages/markitdown/_markitdown.py", line 1239, in _convert
    res = converter.convert(local_path, **_kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.11/site-packages/markitdown/_markitdown.py", line 490, in convert
    text_content=pdfminer.high_level.extract_text(local_path),
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.11/site-packages/pdfminer/high_level.py", line 169, in extract_text
    for page in PDFPage.get_pages(
  File "/opt/anaconda3/lib/python3.11/site-packages/pdfminer/pdfpage.py", line 154, in get_pages
    doc = PDFDocument(parser, password=password, caching=caching)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/anaconda3/lib/python3.11/site-packages/pdfminer/pdfdocument.py", line 748, in __init__
    raise PDFSyntaxError("No /Root object! - Is this really a PDF?")
pdfminer.pdfparser.PDFSyntaxError: No /Root object! - Is this really a PDF? 
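The final pdfminer message ("Is this really a PDF?") suggests the file itself may be truncated, corrupted, or not a PDF at all (for example, an HTML page saved with a .pdf extension). A minimal sketch of a sanity check follows; the function name `looks_like_pdf` and the 1024-byte probe window are assumptions made for illustration, and this is a heuristic, not a full validation of the file.

```python
from pathlib import Path


def looks_like_pdf(path, probe_bytes=1024):
    """Heuristic check of a file's PDF markers (hypothetical helper).

    Looks for the "%PDF-" header near the start of the file and the
    "%%EOF" marker near the end. Passing both checks does not prove the
    file is a valid PDF; failing either one suggests it is damaged or
    is not a PDF.
    """
    data = Path(path).read_bytes()
    head = data[:probe_bytes]
    tail = data[-probe_bytes:]
    return {
        "size_bytes": len(data),
        "has_pdf_header": b"%PDF-" in head,
        "has_eof_marker": b"%%EOF" in tail,
    }


# Example usage (path taken from the error message above):
# print(looks_like_pdf("/Users/pranshujain/Desktop/python/markitdown/src/test.pdf"))
```

If `has_pdf_header` or `has_eof_marker` comes back `False`, re-downloading or re-exporting the file is probably worth trying before looking at markitdown itself.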
