When attempting to convert a Zhihu article using the CLI tool, markitdown throws an unhandled exception due to a 403 Forbidden response from the target URL.
Error Traceback:
Traceback (most recent call last):
  File "/Users/xiongxinwei/Library/Caches/pypoetry/virtualenvs/telepace-server-WT4oou3h-py3.12/bin/markitdown", line 8, in <module>
    sys.exit(main())
             ^^^^^^
  File ".../markitdown/__main__.py", line 197, in main
    result = markitdown.convert(
             ^^^^^^^^^^^^^^^^^^^
  File ".../markitdown/_markitdown.py", line 271, in convert
    return self.convert_uri(source, stream_info=stream_info, **_kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../markitdown/_markitdown.py", line 443, in convert_uri
    response.raise_for_status()
  File ".../site-packages/requests/models.py", line 1024, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 403 Client Error: Forbidden for url: https://zhuanlan.zhihu.com/p/11654788270
Expected Behavior:
The tool should either:
- Successfully fetch and convert the article if access is allowed, or
- Gracefully handle 403 responses with a clear error message indicating that access is denied.
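
The second option could look roughly like the sketch below. It is only an illustration, not how markitdown currently behaves: `describe_http_failure` and `convert_or_exit` are hypothetical names, and the hint texts are assumptions. The wrapper catches the same `requests.exceptions.HTTPError` that appears in the traceback and exits with a one-line message instead.

```python
import sys
from http import HTTPStatus


def describe_http_failure(status_code: int, url: str) -> str:
    """Turn an HTTP status code into a one-line, user-facing message."""
    try:
        phrase = HTTPStatus(status_code).phrase
    except ValueError:
        phrase = "Unknown status"
    if status_code in (401, 403):
        hint = "access is denied; the site may require login or block automated clients"
    elif status_code == 404:
        hint = "the page was not found"
    elif status_code == 429:
        hint = "too many requests; retry later"
    else:
        hint = "the server did not return the page"
    return f"error: could not fetch {url}: HTTP {status_code} {phrase} ({hint})"


def convert_or_exit(url: str) -> str:
    """Hypothetical CLI wrapper: return Markdown, or exit with a message instead of a traceback."""
    # Imported lazily so describe_http_failure stays usable without these packages.
    import requests
    from markitdown import MarkItDown

    try:
        return MarkItDown().convert(url).text_content
    except requests.exceptions.HTTPError as exc:
        status = exc.response.status_code if exc.response is not None else 0
        # sys.exit with a string prints it to stderr and exits with status 1.
        sys.exit(describe_http_failure(status, url))
```

Keeping the message formatting in a separate pure function makes it easy to check without touching the network.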
Environment:
- OS: macOS
- Python: 3.12
- Tool version: latest from repo
- Installed via: Poetry
Possible Cause:
Zhihu may be blocking automated requests. It might be necessary to:
- Add custom headers (e.g., a User-Agent string) to mimic a browser
- Handle HTTP errors more gracefully
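
As a rough sketch of the first suggestion, browser-like headers can be set on a `requests.Session`, which can then be handed to markitdown. The header values and the helper names (`browser_like_headers`, `make_markitdown`) are assumptions for illustration, and the `requests_session` argument should be checked against the installed markitdown version. There is also no guarantee that Zhihu accepts such requests; it may still answer 403 or require a login.

```python
def browser_like_headers() -> dict:
    """Illustrative headers resembling a desktop browser (values are assumptions)."""
    return {
        "User-Agent": (
            "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) "
            "AppleWebKit/537.36 (KHTML, like Gecko) "
            "Chrome/124.0.0.0 Safari/537.36"
        ),
        "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
        "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8",
    }


def make_markitdown():
    """Hypothetical helper: build a MarkItDown instance that sends the headers above."""
    # Imported lazily so browser_like_headers works without these packages.
    import requests
    from markitdown import MarkItDown

    session = requests.Session()
    session.headers.update(browser_like_headers())
    # Assumes the installed markitdown accepts a requests_session argument.
    return MarkItDown(requests_session=session)


if __name__ == "__main__":
    md = make_markitdown()
    result = md.convert("https://zhuanlan.zhihu.com/p/11654788270")
    print(result.text_content)
```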
cubxxw changed the title from "markitdown fails with 403 Forbidden when converting Zhihu article URL" to "bug: markitdown fails with 403 Forbidden when converting Zhihu article URL" on Apr 21, 2025.