Fix Nested Table Conversion in HTML-to-Markdown Process #219

Closed
wants to merge 252 commits into from
Changes from 1 commit
Commits
80822b0
Updates to 0.4.1, pkgmeta included directly in setup.
skoczen Nov 27, 2017
7f7b49d
Adding MIT license file
Lqp1 Sep 27, 2018
8f5bae0
updating classifer to mit license
jvanasco Jun 13, 2019
5d49939
Add newline before and after a markdown list
AlexVonB Jul 4, 2019
44d7259
added tests for matthewwithanm#11
AlexVonB Jul 4, 2019
f290ea4
Merge pull request #11 from AlexVonB/AlexVonB-patch-1
matthewwithanm Jul 4, 2019
2da613c
remove prefixed and suffixed spaces from inline tags
AlexVonB Jul 11, 2019
cddea82
remove needless checks for emtpy text
AlexVonB Jul 12, 2019
c4f373d
Remove newline-only textnodes outside <pre>
tlp-red Nov 20, 2019
d9f7dd4
Add nested OL test (for newlines) and correct lists nesting
tlp-red Nov 21, 2019
626f85f
Correct inline UL test as paragraphs are followed by two newlines
tlp-red Nov 21, 2019
370a771
Remove debug prints
tlp-red Nov 22, 2019
2e6a31a
Merge pull request #14 from AlexVonB/fix-inline-spaces
AlexVonB Aug 9, 2020
55cfd8c
fixed nested lists and wrote correct tests
AlexVonB Aug 9, 2020
93c3186
added egg dirs to gitignore
AlexVonB Aug 9, 2020
57ab199
cleaning up changes with help of linter
AlexVonB Aug 9, 2020
8fd01fb
Merge branch 'develop'
AlexVonB Aug 9, 2020
227dfef
Bump to 0.5.0
AlexVonB Aug 9, 2020
fd23bc6
Merge branch 'master' into develop
AlexVonB Aug 10, 2020
d2fcad9
ignore build folder
AlexVonB Aug 10, 2020
a5e358e
Add some fancy badges
SimonIT Aug 10, 2020
17d690c
Remove alt because it makes some trouble
SimonIT Aug 10, 2020
3b8335e
Create python-publish.yml
matthewwithanm Aug 11, 2020
5400edb
Merge pull request #21 from matthewwithanm/python-publish
AlexVonB Aug 11, 2020
92f210f
Bump version 0.5.1
AlexVonB Aug 11, 2020
9348396
Support the start attribute for ordered lists
SimonIT Aug 11, 2020
51be536
Replace downloads badge
SimonIT Aug 13, 2020
00344af
Create python-app.yml
AlexVonB Aug 18, 2020
135ba8c
set max flake8 version
AlexVonB Aug 18, 2020
dcdcedb
set max flake8 version again
AlexVonB Aug 18, 2020
cc7164f
set max flake8 version again2
AlexVonB Aug 18, 2020
efe7867
set max flake8 version again3
AlexVonB Aug 18, 2020
459937c
use python3.6 for linting
AlexVonB Aug 18, 2020
4e57eaa
Merge pull request #22 from SimonIT/ol-start-attribute
AlexVonB Aug 18, 2020
c144600
Bump version 0.5.2
AlexVonB Aug 18, 2020
d4ace2c
Make badges inline
SimonIT Aug 19, 2020
8abaaec
Merge pull request #20 from SimonIT/badges
AlexVonB Aug 19, 2020
73b2511
Merge remote-tracking branch 'upstream/develop' into ordered-list
SimonIT Aug 26, 2020
458d765
Fix tests
SimonIT Aug 26, 2020
b209c2e
Fix parsing corrupt html
SimonIT Aug 31, 2020
0cf98c4
Merge pull request #24 from SimonIT/fix-corrupt-html
AlexVonB Sep 1, 2020
a8e7175
Bump version 0.5.3
AlexVonB Sep 1, 2020
fa864ec
Add support for headings that include nested block elements
idvorkin Nov 15, 2020
86a42ec
Add method for <code> tag
AndrewCRichards Nov 23, 2020
761a121
Correct test_code_with_tricky_content()
AndrewCRichards Nov 26, 2020
942ccb4
Formatting tweak
AndrewCRichards Nov 27, 2020
4f163d5
Using a regexp to determine if a tag is a heading.
idvorkin Dec 12, 2020
ab425ec
Add many tests and support image tag
idvorkin Dec 13, 2020
aa70748
Merge pull request #26 from idvorkin/develop
AlexVonB Dec 13, 2020
cb64cb9
Bump Version 0.6.0
AlexVonB Dec 13, 2020
cbce9a9
dont replace newlines and tabs with spaces
AlexVonB Dec 29, 2020
63233d8
Fixing autolinks
AlexVonB Jan 2, 2021
86a5b61
Merge branch 'fix-link-underscores' into develop
AlexVonB Jan 4, 2021
e8ab089
bump to v0.6.1
AlexVonB Jan 4, 2021
d9dcb68
Merge branch 'fix-extra-headline-whitespace' into develop
AlexVonB Jan 12, 2021
225e394
satisfy linter
AlexVonB Jan 12, 2021
1e8c03e
Add ignore comment tags
BrunoMiguens Feb 5, 2021
81a42ff
Add new line at the end of file
BrunoMiguens Feb 5, 2021
b4cc116
Merge pull request #34 from BrunoMiguens/add-ignore-comment-tags
AlexVonB Feb 7, 2021
8a7d8aa
Add basic support for HTML tables
BrunoMiguens Feb 8, 2021
d0dcf35
Add tests for basic and thead/tbody tables
BrunoMiguens Feb 8, 2021
55e8cbe
Remove unnecessary tests
BrunoMiguens Feb 8, 2021
1ff5f39
Fix lint
BrunoMiguens Feb 8, 2021
4163fd5
Remove empty header validation to allow empty header
BrunoMiguens Feb 8, 2021
1fe4535
Revert header validation and leave possibility to empty column
BrunoMiguens Feb 8, 2021
bf5a0e4
Allow for a custom strong or emphasis symbol
andredelft Feb 15, 2021
253c177
Allow for the use of backslash for newlines
andredelft Feb 15, 2021
3f25b79
Update README with the two new options
andredelft Feb 15, 2021
8535f93
Fix code ticks in README
andredelft Feb 15, 2021
21ee182
closing #25 and #18
AlexVonB Feb 21, 2021
b3ab97f
bump to v0.6.4
AlexVonB Feb 21, 2021
23b5640
upgrading code for python 3.x
AlexVonB Feb 21, 2021
21e238f
use python 3.8 instead of 3.6
AlexVonB Feb 21, 2021
e34c6cf
bump to v0.6.5
AlexVonB Feb 21, 2021
4d46c34
Test strong_em_symbol
andredelft Apr 5, 2021
f13bb9c
Change option to newline_style and use variables like heading_style does
andredelft Apr 5, 2021
e126f9b
Test newline_style
andredelft Apr 5, 2021
a164860
Use .lower() on _style option fetching
andredelft Apr 5, 2021
cc4ffb5
Fix linting
andredelft Apr 5, 2021
0dd1e78
Update README.rst
andredelft Apr 5, 2021
2d40d49
Separate the strong_em_symbol and newline style tests
andredelft Apr 5, 2021
0c543d8
Introduce OPTIONs for `strong_em_symbol`
andredelft Apr 18, 2021
25cae54
Merge pull request #37 from andredelft/develop
AlexVonB Apr 18, 2021
79adf8b
bump to v0.6.6
AlexVonB Apr 22, 2021
9b43239
guard table lines with pipes, resolves the empty header problem
AlexVonB Apr 22, 2021
938f993
Merge branch 'develop' into add-basic-support-for-tables
AlexVonB Apr 22, 2021
fdb2b77
Merge pull request #36 from BrunoMiguens/add-basic-support-for-tables
AlexVonB Apr 22, 2021
6b0aa72
bump to v0.7.0
AlexVonB Apr 22, 2021
cb4300c
Add conversion for hr element
jiulongw Apr 28, 2021
1b52455
fix hr tests
AlexVonB May 2, 2021
a74528b
Merge pull request #40 from jiulongw/jiulongw/hr
AlexVonB May 2, 2021
ba8e1c5
bump to v0.7.1
AlexVonB May 2, 2021
59a1af0
Merge branch 'develop' into ordere-list-update
AlexVonB May 2, 2021
f96e9f3
fixed whitespace issues at nested lists
AlexVonB May 2, 2021
e4c8b3d
Merge pull request #23 from SimonIT/ordere-list-update
AlexVonB May 2, 2021
2539b77
bump to v0.7.2
AlexVonB May 2, 2021
f2813db
Fix missing whitespaces in <li> node
jiulongw May 10, 2021
a95c263
Keep important spaces in <li> element
jiulongw May 10, 2021
62c25a8
Merge pull request #43 from jiulongw/develop
AlexVonB May 16, 2021
2d5339f
bump to v0.7.3
AlexVonB May 16, 2021
8163bc7
Allow for tables without header row
AlexVonB May 16, 2021
83ee8a8
allow tables with headers in first (or any) column
AlexVonB May 17, 2021
769987c
implemented table parsing correctly
AlexVonB May 17, 2021
5e3f315
Merge branch 'fix-headless-tables' into develop
AlexVonB May 18, 2021
e81e886
bump to v0.7.4
AlexVonB May 18, 2021
fda8adb
Merge branch 'andrewcrichards/add_code_samp_kbd_tags' of https://gith…
AlexVonB May 21, 2021
1e195e4
Merge branch 'AndrewCRichards-andrewcrichards/add_code_samp_kbd_tags'…
AlexVonB May 21, 2021
5b28e59
ordering functions alphabetically
AlexVonB May 21, 2021
42e414a
added del and s tags
AlexVonB May 21, 2021
2a8218f
refactor simple inline conversions
AlexVonB May 21, 2021
dac3d2d
added pre tag
AlexVonB May 21, 2021
9cf877e
bump to v0.8.0
AlexVonB May 21, 2021
d9bb969
ignore doctype tag, test cdata tag
AlexVonB May 30, 2021
03fe63d
bump to v0.8.1
AlexVonB May 30, 2021
bfaf6c4
add option 'default_title' to links
AlexVonB May 30, 2021
c399be1
restructured test files
AlexVonB May 30, 2021
b3361a6
add options for sub and sup tags
AlexVonB May 30, 2021
1d08cfc
bump to v0.9.0
AlexVonB May 30, 2021
d1ed6c8
add examples for custom converters
AlexVonB Jun 27, 2021
8aa3688
add figure/figcaption
AlexVonB Jun 30, 2021
3d78381
rewrote text processing to not escape _ in code
AlexVonB Jul 11, 2021
d147dac
Revert "add figure/figcaption"
AlexVonB Jul 11, 2021
64006b8
bump to v0.9.1
AlexVonB Jul 11, 2021
2d8470c
fix rst syntax error
AlexVonB Jul 11, 2021
aeef3f1
convert tags inside table cells as inline
AlexVonB Aug 25, 2021
76f8491
bump to v0.9.3
AlexVonB Aug 25, 2021
5e394cd
Fixed issue #52 - added stripping of text to list
Sep 4, 2021
223b063
Added appropriate test
Sep 4, 2021
f829d32
remove trailing whitespace to satisfy the linter
AlexVonB Sep 4, 2021
ce18139
Merge pull request #53 from Hozhyi/fix/bullet_list_tags_in_separate_l…
AlexVonB Sep 4, 2021
cc2e233
bump to v0.9.4
AlexVonB Sep 4, 2021
f818fc7
added language for multiline code
Inzaniak Nov 1, 2021
3f9903b
satisfy linter
AlexVonB Nov 17, 2021
abe3e12
differentiated between text and code language
AlexVonB Nov 17, 2021
647b568
Merge branch 'Inzaniak-develop' into develop
AlexVonB Nov 17, 2021
df4b13d
add readme for code_language
AlexVonB Nov 17, 2021
35a5ca0
fix readme for code_language
AlexVonB Nov 17, 2021
7f1730e
bump to v0.10.0
AlexVonB Nov 17, 2021
7a3a4f9
allow flake8 v4.x
AlexVonB Dec 11, 2021
746b358
bump to v0.10.1
AlexVonB Dec 11, 2021
fc8ccdc
add option to not escape underscores
AlexVonB Jan 18, 2022
3b9213c
bump to v0.10.2
AlexVonB Jan 18, 2022
d914d8b
wording
AlexVonB Jan 23, 2022
bf6969d
allow BeautifulSoup objects to be converted
AlexVonB Jan 23, 2022
f441409
bump to v0.10.3
AlexVonB Jan 23, 2022
98b4857
fixed readme
AlexVonB Jan 24, 2022
e2124be
Add code language callback
tdgroot Apr 9, 2022
7e1705c
add option to allow inline images in selected tags
AlexVonB Apr 13, 2022
eb6f1d3
add escaping of asterisks and option to disable it
AlexVonB Apr 13, 2022
4cdb061
Merge branch 'code_language_callback' of https://github.com/tdgroot/p…
AlexVonB Apr 13, 2022
192fa22
added readme for callback
AlexVonB Apr 13, 2022
236d137
Merge branch 'tdgroot-code_language_callback' into develop
AlexVonB Apr 13, 2022
a77116c
bump to v0.11.0
AlexVonB Apr 13, 2022
ce11a5d
Fix detection of "first row, not headline" (#63)
mvkorpel Apr 14, 2022
d2fd9fd
bump to v0.11.1
AlexVonB Apr 14, 2022
fd8f8f4
typo in readme
AlexVonB Apr 24, 2022
8159aea
added wrap option
AlexVonB Apr 24, 2022
2c03710
bump to v0.11.2
AlexVonB Apr 24, 2022
f2585bf
Add console entry point (#72)
BioBox Aug 28, 2022
52d50a9
added readme for cli
AlexVonB Aug 28, 2022
5d4f5db
don't escape text in pre tag (Fenced Code Blocks) (#67)
tjmnmk Aug 28, 2022
a56f81d
Switch to `tox` for tests (#73)
AlexVonB Aug 28, 2022
266d24f
bump to v0.11.3
AlexVonB Aug 28, 2022
a579fa2
fixed readme and added linter to detect this earlier
AlexVonB Aug 28, 2022
fa4d48b
bump to v0.11.4
AlexVonB Aug 28, 2022
3d499fe
fix cli options: default heading, em symbols
AlexVonB Aug 31, 2022
0a065f7
first test, then lint
AlexVonB Aug 31, 2022
dbf8656
bump to v0.11.5
AlexVonB Aug 31, 2022
f8d4a46
fixed cli parameters
AlexVonB Sep 2, 2022
20be440
added nix shell file
AlexVonB Sep 2, 2022
7169a01
bump to v0.11.6
AlexVonB Sep 2, 2022
dbaaef3
avoid text normalization/escaping in any preformatted/code context
chrispy-snps Jan 15, 2024
7f012a2
Merge pull request #104 from chrispy-snps/fix/97-101-102
chrispy-snps Jan 15, 2024
4dd3ae2
ignore script and style content (such as css and javascript) (#112)
tlk Mar 11, 2024
e87020e
Add no css example to readme (#111)
GeeCastro Mar 11, 2024
39fd670
Fix newline start in header tags (#89)
5yato4ok Mar 26, 2024
f8badf5
convert_td: strip text (#91)
carantunes Mar 26, 2024
1192e9c
added tests for linebreaks in table cells
AlexVonB Mar 26, 2024
d032e78
Strip text before adding blockquote markers (#76)
andredelft Mar 26, 2024
f2c75a3
revert workaround example in README.rst for <script> and <style> now …
chrispy-snps Mar 26, 2024
d69f66e
added further readme for custom converters
AlexVonB Mar 26, 2024
f4c730f
Support conversion of header rows in tables without th tag (#83)
huuyafwww Mar 26, 2024
e976fe5
make sure there are blank lines around table/figure captions (#114)
chrispy-snps Mar 26, 2024
a79f499
fixed tests for table caption
AlexVonB Mar 26, 2024
85b2ff4
Table merge cell horizontally (#110)
xydxydxyd1 Mar 26, 2024
2b96dfc
bump to v0.12.1
AlexVonB Mar 26, 2024
ba7ddb8
Avoid inline styles inside `<code>` / `<pre>` conversion (#117)
jsm28 Apr 4, 2024
e3d8f06
Escape all characters with Markdown significance (#118)
jsm28 Apr 4, 2024
ed6d68d
fixed github action badges
AlexVonB Apr 4, 2024
5b22cd3
More carefully separate inline text from block content
jsm28 Apr 9, 2024
20ab4e0
Update MANIFEST.in to exclude tests during packaging (#125)
samypr100 Jun 23, 2024
60ecf17
handle un-parsable colspan values
AlexVonB Jun 23, 2024
22dfe32
Special-case use of HTML tags for converting `<sub>` / `<sup>` (#119)
jsm28 Jun 23, 2024
b3bd99c
better naming for markup variables
AlexVonB Jun 23, 2024
18e80b9
handle ol start value is not number (#127)
microdnd Jun 23, 2024
0839124
added test for ol start check
AlexVonB Jun 23, 2024
50d3e83
fix pytest version to 8
AlexVonB Jul 14, 2024
23fdf3e
bump to v0.13.0
AlexVonB Jul 14, 2024
5c398a6
Migrated the metadata into PEP 621-compliant pyproject.toml (#138)
AlexVonB Jul 14, 2024
10e5a34
bump to version v0.13.1
AlexVonB Jul 14, 2024
81db09c
Merge branch 'develop' into para-newlines-92-98
jsm28 Sep 30, 2024
7347ce4
More selective escaping of `-#.)` (alternative approach)
jsm28 Oct 2, 2024
72a3c25
Fix whitespace issues around wrapping
jsm28 Oct 3, 2024
153b5a5
More thorough cleanup of input whitespace
jsm28 Oct 3, 2024
9a1c27d
Fix logic for indentation inside list items
jsm28 Oct 3, 2024
b716626
Set escape_misc to False by default to improve backwards compatibility
alfonsrv Oct 9, 2024
0f94326
Merge branch 'jsm28-selective-escaping' into jsm28-list-indentation
AlexVonB Nov 20, 2024
1b5903a
ignore bs4 warnings in tests
AlexVonB Nov 24, 2024
3c25ab2
Merge branch 'alfonsrv-fix-pr-118' into jsm28-list-indentation
AlexVonB Nov 24, 2024
d435f9e
renamed functions that return boolean
AlexVonB Nov 24, 2024
685a3a5
bump to version v0.14.0
AlexVonB Nov 24, 2024
0afab1e
prevent very large headline prefixes
AlexVonB Nov 24, 2024
1cf3266
prevent `<hn>` to call convert_hn and crash
AlexVonB Nov 24, 2024
b3ce178
bump to version v0.14.1
AlexVonB Nov 24, 2024
6a0e020
do not construct Markdown links in code spans and code blocks
chrispy-snps Dec 29, 2024
e42bcbd
insert a blank line between table caption, table content
chrispy-snps Dec 29, 2024
66daed3
Merge pull request #165 from chrispy-snps/chrispy/fix-a-in-code
chrispy-snps Jan 19, 2025
6d1e8bb
Merge pull request #167 from chrispy-snps/chrispy/table-caption-blank…
chrispy-snps Jan 19, 2025
b4d71a5
allow a wrap_width value of None for unlimited line lengths (#169)
chrispy-snps Jan 19, 2025
6cd0c0e
optimize empty-line handling for li and blockquote content
chrispy-snps Dec 30, 2024
afd4470
Merge pull request #171 from chrispy-snps/chrispy/optimize-li-blockqu…
chrispy-snps Jan 19, 2025
d530bf8
support HTML definition lists (<dl>, <dt>, and <dd>)
chrispy-snps Dec 31, 2024
a810ec9
Merge pull request #173 from chrispy-snps/chrispy/support-definition-…
chrispy-snps Jan 19, 2025
a14c0c4
Add a new configuration option to control tabler header row inference…
SomeBottle Jan 19, 2025
7216e28
for convert_* functions, allow for tags with special characters in th…
Fess-AKA-DeadMonk Jan 19, 2025
8787714
code simplification to remove need for children_only parameter (#174)
chrispy-snps Jan 19, 2025
b54acbd
add blank line before ATX-style headings to avoid ambiguity (#178)
chrispy-snps Jan 21, 2025
92f8edb
add blank line before/after preformatted block (#179)
chrispy-snps Jan 21, 2025
36aa22d
remove superfluous leading/trailing whitespace (#181)
chrispy-snps Jan 27, 2025
85b80f4
simplify computation of convert_children_as_inline variable (#182)
chrispy-snps Feb 4, 2025
8c1d9c4
when computing <ol><li> numbering, ignore non-<li> previous siblings …
chrispy-snps Feb 4, 2025
c1fc38c
make conversion non-destructive to soup; improve div/article/section …
chrispy-snps Feb 4, 2025
9b49570
use list-based processing (inspired by AlextheYounga) (#186)
chrispy-snps Feb 17, 2025
a39ba0b
propagate parent tag context downward to improve runtime (#191)
chrispy-snps Feb 18, 2025
b99d379
Avoid stripping nonbreaking spaces (#188)
jsm28 Feb 19, 2025
e6db566
Escape right square brackets (#187)
jsm28 Feb 19, 2025
f375e71
rename regex pattern variables (#195)
chrispy-snps Feb 20, 2025
662327e
use a conversion function cache to improve runtime (#196)
chrispy-snps Feb 24, 2025
712c5f6
use compiled regex for escaping patterns (#194)
chrispy-snps Feb 24, 2025
d903ce9
bump to version v1.0.0
chrispy-snps Feb 24, 2025
955cd93
Support `video` tag with `poster` attribute (#189)
itmammoth Feb 28, 2025
1155711
add missing newlines for definition lists (#200)
chrispy-snps Mar 2, 2025
307bd3b
in inline contexts, resolve <br/> to a space instead of an empty stri…
chrispy-snps Mar 4, 2025
80d353e
Generalize handling of colspan in case where colspan is in first row …
sbrown61 Mar 5, 2025
10feba9
bump to version v1.1.0
chrispy-snps Mar 5, 2025
9a78ae4
Add beautiful_soup_parser option (#206)
vincentkelleher Mar 29, 2025
6657b49
make convert_hn() public instead of internal (#213)
chrispy-snps Apr 20, 2025
0957650
Add conversion support for <q> tags (#217)
colinrobinsonuib Apr 28, 2025
6cfc19d
ensure that explicitly provided heading conversion functions are used…
chrispy-snps May 3, 2025
722ee8f
fix bug:nested tables in html are lost when converting to markdown
Wuhall May 14, 2025
ignore build folder
AlexVonB committed Aug 10, 2020
commit d2fcad9173dcb80dca366f10672f8572161d8417
1 change: 1 addition & 0 deletions .gitignore
@@ -7,3 +7,4 @@
 /dist
 /MANIFEST
 /venv
+build/