We are using MarkItDown and so far it is working very well.
We came across an issue with docx files that contain images. Looking into the code, it appears that the mammoth library, which DocxConverter uses, accepts a handler that can process each image and return, for example, alt text. When the output is converted to Markdown, that alt text becomes the description of the image.
However, I could not find a way to pass this handler through the convert call on the MarkItDown class.
Could this be exposed, if possible?
An example in mammoth looks like this:
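import mammoth

# convert_image is our own handler; mammoth's documented usage passes an open binary file
with open("<path to docx file>", "rb") as docx_file:
    html_result = mammoth.convert_to_html(
        docx_file,
        convert_image=mammoth.images.img_element(convert_image),
    )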
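For context, here is a rough sketch of what such a handler could look like. It assumes, based on mammoth's documented image interface, that the object passed to the handler offers `open()` and `content_type`, and that the handler returns a dict of attributes for the `<img>` element. The names `describe_image`, `StubImage`, and `docx_to_html_with_alt_text`, the `image-<digest>` src scheme, and the wording of the alt text are all made up for illustration; a real handler might call a captioning model instead.

```python
import hashlib
import io


def describe_image(data: bytes, content_type: str) -> str:
    """Hypothetical describer: a real one might call a vision model."""
    return f"Embedded {content_type} image ({len(data)} bytes)"


def convert_image(image) -> dict:
    """Handler meant to be wrapped by mammoth.images.img_element.

    Reads the image bytes and returns attributes for the <img> element.
    """
    with image.open() as stream:
        data = stream.read()
    # Assumption: prefer alt text already stored in the document, if any.
    alt = getattr(image, "alt_text", None) or describe_image(data, image.content_type)
    # Hypothetical src: a short content hash used as a placeholder name.
    digest = hashlib.sha256(data).hexdigest()[:8]
    return {"src": f"image-{digest}", "alt": alt}


class StubImage:
    """Stand-in for mammoth's image object, for trying the handler without a docx."""

    def __init__(self, data: bytes, content_type: str, alt_text=None):
        self._data = data
        self.content_type = content_type
        self.alt_text = alt_text

    def open(self):
        return io.BytesIO(self._data)


def docx_to_html_with_alt_text(path: str) -> str:
    """Run mammoth with the handler above (requires the mammoth package)."""
    import mammoth  # imported lazily so the handler can be tried on its own

    with open(path, "rb") as docx_file:
        result = mammoth.convert_to_html(
            docx_file,
            convert_image=mammoth.images.img_element(convert_image),
        )
    return result.value
```

What we are asking for is a way to hand a handler like `convert_image` to DocxConverter through MarkItDown, so that the resulting Markdown carries the image descriptions.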
Fantastic find. We should certainly expose this. I'm looking into a mechanism that will allow passing more options to the converters (as well as allowing for more of a plugin architecture).
It would be really nice if the mechanism could also handle images in xlsx files.
Plugins are now implemented, and I have an example that keeps the images as files using mammoth's CLI ImageWriter. I haven't explored alt text yet: #1099 (comment)