optional dependencies #103
Comments
@gagb @casperdcl, yeah, more generally, a lot of these should be optional dependencies. Ideally we would have something like: pip install markitdown[ocr,openai,yt_transcript] etc., to optionally include some of the more esoteric or heavy dependencies. We can then just include or exclude the converters accordingly. What do you think? |
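A minimal sketch of what "include or exclude the converters accordingly" could look like, assuming each converter declares the modules it needs. The converter names, the `CONVERTER_REQUIREMENTS` mapping, and the helper functions are hypothetical and are not markitdown's actual API:

```python
from importlib.util import find_spec

# Hypothetical mapping: converter name -> top-level modules it needs.
# These names are illustrative, not markitdown's real converters.
CONVERTER_REQUIREMENTS = {
    "plain_text": [],
    "llm_image_caption": ["openai"],
    "youtube_transcript": ["youtube_transcript_api"],
}


def is_installed(module_name):
    """Return True if a top-level module can be located without importing it."""
    return find_spec(module_name) is not None


def enabled_converters(requirements=CONVERTER_REQUIREMENTS):
    """List the converters whose required modules are all available."""
    return [
        name
        for name, modules in requirements.items()
        if all(is_installed(module) for module in modules)
    ]
```

With this shape, a converter with no extra requirements is always enabled, and the others appear only once the matching extra has been installed.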
I like this but there is so much appeal to the simplicity from just running |
Aliases are quite easy to implement...
Pretty common Pythonicity. btw you should probably rename this issue "optional dependencies" or similar. Also you can use a markdown quote block ( |
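For illustration only, one way such aliases could be composed is as a plain dictionary of extras, which setuptools accepts through its `extras_require` argument. The extra names and package lists below are assumptions, not the project's actual packaging metadata:

```python
# Hypothetical extras: each key is an extra name, each value the packages it pulls in.
extras = {
    "ocr": ["easyocr"],
    "llm": ["openai"],
    "yt_transcript": ["youtube-transcript-api"],
}

# An alias is just a second key pointing at the same requirement list.
extras["openai"] = extras["llm"]

# An "all" extra is the de-duplicated union of every other extra.
extras["all"] = sorted({pkg for pkgs in extras.values() for pkg in pkgs})
```

Under these assumptions, `pip install markitdown[all]` would behave like today's install with everything included, while a bare `pip install markitdown` would stay minimal.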
i think we can just make all the dependencies optional and have the script install them when needed, the way ultralytics does it: if a package is needed, it is installed at runtime, and updates also happen at runtime. |
Whoa, at most you could do:

try:
    import openai
except ImportError as exc:
    raise ImportError("please `pip/conda install openai` or `pip install markitdown[llm]`") from exc

Meanwhile, side effects like this are highly discouraged:

import os
import sys

try:
    import openai
except ImportError:
    # Discouraged: installs a package behind the user's back at import time.
    os.system(f"{sys.executable} -m pip install openai")
    import openai
|
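The recommended pattern above can be factored into a small helper so every converter reports a missing dependency the same way. This is a sketch under assumptions: `require` is a hypothetical name, and it assumes the importable module name matches the pip package name, which is not always the case:

```python
import importlib


def require(module_name, extra):
    """Import an optional module, or raise ImportError with an install hint.

    `extra` is the (hypothetical) markitdown extra that would provide the module.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"please `pip install {module_name}` or `pip install markitdown[{extra}]`"
        ) from exc
```

A converter would call `require("openai", "llm")` only when it is actually used, so users who never touch that converter never need the package and nothing is installed implicitly.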
i think something like
|
👍 to the idea of using optional dependencies. i wanted to try using markitdown as a global install (
there's a whole bunch of users out there that won't have the system privs to bring in this many dependencies (or will stay away because it doesn't make sense). it seems like it should be possible to install the minimum set for minimum stated functionality, "MarkItDown is a utility for converting various files to Markdown." |
This also increases exposure to a) supply chain attacks and b) CVEs in the whole repo. Just today, I added markitdown to my repo running safety checks, and got hit with a CVE:
Now here I can potentially just add a constraint on the dependency, but there will not always be "quick fixes", which prevents me from reliably using this library in anything production-grade. Additionally, when working with any kind of Docker setup or container registry, every additional dependency and every additional MB potentially translates to a LOT of extra cost. On top of that, sometimes I may want to make sure that a video is not accidentally leaked to a 3rd-party API when using markitdown. |
Maybe there are even more dependencies, haven't checked in detail. |
Yeah, I want to move to optional dependencies asap. Relatedly, the latest version in main (not PyPI) also supports 3rd-party extensions, minimizing -- I hope -- the need for the kitchen sink. So this is a known problem, and one I'm very keen to solve. The current status quo is a consequence of having lifted the code out of another (also experimental) project -- namely Magentic One. It hasn't yet been sufficiently generalized. |
Yes, this keeps me up at night. I want to make a series of breaking changes for 0.0.2, and I will include a move to optional dependencies in that. |
Folks, I have a potential design here. Let me know what you think -- before I expand it to all the converters we provide (this mechanism is not meant to be used with the plugin extensions.... for that, plugins are responsible for dependencies). |
Originally posted by @casperdcl in #100 (comment)