Open
Changes from 5 commits
Commits
59 commits
e7f187b
Modifying return value for parseBlock, parseInline
yanntrividic Jun 23, 2025
2c7d45d
Adding roles to headers for sections with roles
yanntrividic Jun 23, 2025
b626d72
Add roles to els parsed with withOptionalTitle
yanntrividic Jun 23, 2025
4b1bc98
Units tests for this PR and new role attributes
yanntrividic Jun 23, 2025
6b2d679
Merge branch 'jgm:main' into main
yanntrividic Jul 23, 2025
d932d30
Merge branch 'jgm:main' into main
yanntrividic Jul 25, 2025
08dbdfa
Adding roles to headers for sections with roles
yanntrividic Jun 23, 2025
5af69cb
Add roles to els parsed with withOptionalTitle
yanntrividic Jun 23, 2025
71edd85
Units tests for this PR and new role attributes
yanntrividic Jun 23, 2025
58f471c
pandoc-lua-engine: Allow hslua-2.4.0 in the tests
tarleb Jun 25, 2025
9d10af7
Lua: add more UTF-8-aware file operations to `pandoc.system`.
tarleb Jun 23, 2025
198eaa0
CI: use windows-2022. windows-2019 is no longer provided.
jgm Jul 6, 2025
dbcdb0c
PDF: make images from MediaBag available in tmp dir...
jgm Jul 8, 2025
71743f9
PDF: Use utf8ToText for LaTeX log messages.
jgm Jul 9, 2025
0d68b91
doc/lua-filters.md: Add example on using pandoc.Table constructor. (#…
SeanESCA Jul 10, 2025
10307e3
Update `--version` copyright dates.
jgm Jul 13, 2025
a493340
Use hardcoded string "pandoc" for program name in `--version`.
jgm Jul 13, 2025
b718f9e
Export `copyrightMessage` from Text.Pandoc.App module.
jgm Jul 13, 2025
8f66ada
Revert "Export `copyrightMessage` from Text.Pandoc.App module."
jgm Jul 14, 2025
27b2926
Remove code duplication around version info.
jgm Jul 14, 2025
f0fc5fd
Typst writer: set lang attribute in Divs.
jgm Jul 14, 2025
965c74a
Lua: add `normalize` function to *Pandoc* objects
tarleb Jul 19, 2025
356a507
Use latest dev citeproc and update the default CSL...
jgm Jul 20, 2025
2de4cda
Fix citeproc-87 test.
jgm Jul 20, 2025
eb2f3d4
Fix pandoc-citeproc-64 test.
jgm Jul 20, 2025
1d8218e
Use latest dev citeproc.
jgm Jul 20, 2025
55cbd96
Fix a test.
jgm Jul 21, 2025
fd2c684
Fixed cabal.project stanza for citeproc.
jgm Jul 21, 2025
3cd261e
Typst: add support for custom and/or translated "Abstract" titles
tarleb Jul 21, 2025
957add2
Markdown writer: match indents in definition items
tarleb Jul 22, 2025
02ce2ef
Djot writer: fix duplicate attributes before section headings.
jgm Jul 23, 2025
7cd0289
T.P.ImageSize: support avif images.
jgm Jul 23, 2025
23d480d
Fix incomplete pattern matches from new ImageType constructor.
jgm Jul 23, 2025
a52d8cb
Fix CI so that -Wall -Werror works again!
jgm Jul 23, 2025
3d1be4e
Makefile: add -Wall to ghc options.
jgm Jul 23, 2025
8dfb2fa
Lua: add function `pandoc.path.exists`.
tarleb Jul 23, 2025
addfa97
Use latest dev citeproc.
jgm Jul 23, 2025
a42a84c
Revise Makefile and CI treatment of `--ghc-options`.
jgm Jul 23, 2025
6e46b62
Ensure that all modules have explicit export lists.
jgm Jul 23, 2025
5f56d62
CI: don't warn on unused imports in ghc 9.10+.
jgm Jul 23, 2025
538bb04
CI: another stab at preventing ghc 9.10, 9.12 from erroring.
jgm Jul 23, 2025
16b6ec0
Fix CI again.
jgm Jul 23, 2025
f9ce3cd
Use latest dev citeproc.
jgm Jul 23, 2025
9b7287e
Use dev texmath.
jgm Jul 24, 2025
8ecb2a8
Fix stack.yaml.
jgm Jul 24, 2025
e1e2493
Add features to typst base template.
christopherkenny Jul 9, 2024
c365732
Org reader: Recognize "fast access" characters in TODO state definiti…
RyanGibb Jul 24, 2025
53c3f88
DocBook reader: Add rowspan support. (#10981)
SeanESCA Jul 24, 2025
6070379
Revert a test case that changed due to a reverted citeproc change.
jgm Jul 24, 2025
b11afcf
Use latest dev citeproc.
jgm Jul 24, 2025
7a30647
T.P.PDF: clean up `makePDF`
tarleb Jul 25, 2025
6f61b8e
PDF: allow `pdflatex-dev` and `lualatex-dev` as PDF engines
tarleb Jul 25, 2025
e7e1725
PDF: Improve error readability when pdf-engine is not supported.
tarleb Jul 25, 2025
a517533
Merge branch 'main' of https://github.com/yanntrividic/pandoc
yanntrividic Jul 25, 2025
4c01975
Adding roles to headers for sections w/ roles
yanntrividic Jun 23, 2025
eb8a928
Add roles to els parsed with withOptionalTitle
yanntrividic Jun 23, 2025
cc96c03
Merge branch 'main' of https://github.com/yanntrividic/pandoc
yanntrividic Jul 25, 2025
ce65132
Units tests for this PR and new role attributes
yanntrividic Jun 23, 2025
7d6d428
Merge branch 'main' of https://github.com/yanntrividic/pandoc
yanntrividic Jul 25, 2025
28 changes: 20 additions & 8 deletions src/Text/Pandoc/Readers/DocBook.hs
@@ -46,7 +46,7 @@ import Text.Pandoc.Builder
import Text.Pandoc.Class.PandocMonad (PandocMonad, report)
import Text.Pandoc.Options
import Text.Pandoc.Logging (LogMessage(..))
-import Text.Pandoc.Shared (safeRead, extractSpaces)
+import Text.Pandoc.Shared (safeRead, extractSpaces, addPandocAttributes)
import Text.Pandoc.Sources (ToSources(..), sourcesToText)
import Text.Pandoc.Transforms (headerShift)
import Text.TeXMath (readMathML, writeTeX)
@@ -855,15 +855,19 @@ getBlocks :: PandocMonad m => Element -> DB m Blocks
getBlocks e = mconcat <$>
mapM parseBlock (elContent e)

+getRoleAttr :: Element -> [(Text, Text)] -- extract role attribute and add it to the attribute list
+getRoleAttr e = case attrValue "role" e of
+"" -> []
+r -> [("role", r)]
+
parseBlock :: PandocMonad m => Content -> DB m Blocks
parseBlock (Text (CData CDataRaw _ _)) = return mempty -- DOCTYPE
parseBlock (Text (CData _ s _)) = if T.all isSpace s
then return mempty
else return $ plain $ trimInlines $ text s
parseBlock (CRef x) = return $ plain $ str $ T.toUpper x
-parseBlock (Elem e) =
-case qName (elName e) of
+parseBlock (Elem e) = do
+parsedBlock <- case qName (elName e) of
"toc" -> skip -- skip TOC, since in pandoc it's autogenerated
"index" -> skip -- skip index, since page numbers meaningless
"para" -> parseMixed para (elContent e)
@@ -975,6 +979,7 @@ parseBlock (Elem e) =
"title" -> return mempty -- handled in parent element
"subtitle" -> return mempty -- handled in parent element
_ -> skip >> getBlocks e
+return $ addPandocAttributes (getRoleAttr e) parsedBlock
where skip = do
let qn = qName $ elName e
let name = if "pi-" `T.isPrefixOf` qn
@@ -1112,7 +1117,12 @@ parseBlock (Elem e) =
modify $ \st -> st{ dbSectionLevel = n }
b <- getBlocks e
modify $ \st -> st{ dbSectionLevel = n - 1 }
-return $ headerWith (elId, classes, maybeToList titleabbrevElAsAttr++attrs) n' headerText <> b
+let content = headerWith (elId, classes, maybeToList titleabbrevElAsAttr)
+n' headerText <> b
+return $ case attrValue "role" e of
+"" -> content
+_ -> divWith ("", ["section"],
+("level", T.pack $ show n') : attrs) content
Comment on lines +1121 to +1124
Owner


We shouldn't be mixing the use of section Divs and bare headings in the same document the way you do here. Why not add the role attribute to the Header? When the resulting AST is passed through makeSections, it will become a section div.
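
The suggestion above can be sketched with a toy model. This is a minimal sketch under stated assumptions, not pandoc's implementation: `Block`, `Attr`, `toySections` and `closes` are hypothetical stand-ins that only imitate the general idea of a section-building pass such as `makeSections`, and the way attributes move from the heading to the wrapper is simplified for illustration.

```haskell
module Main where

-- Toy stand-ins for pandoc's AST; these are NOT the real pandoc types.
type Attr = [(String, String)]

data Block
  = Header Int Attr String
  | Para String
  | Div Attr [Block]
  deriving (Eq, Show)

-- A heading of level m closes a section opened at level n when m <= n.
closes :: Int -> Block -> Bool
closes n (Header m _ _) = m <= n
closes _ _ = False

-- Hypothetical section pass: wrap each heading and the blocks that follow it
-- (up to the next heading of the same or a higher rank) in a section Div,
-- moving the heading's attributes onto the wrapper.
toySections :: [Block] -> [Block]
toySections [] = []
toySections (Header n attrs title : rest) =
  let (inside, after) = break (closes n) rest
   in Div (("class", "section") : attrs) (Header n [] title : toySections inside)
        : toySections after
toySections (b : rest) = b : toySections rest

main :: IO ()
main =
  print $
    toySections
      [ Header 1 [("role", "sect1role")] "Headers"
      , Header 2 [] "Level 2"
      , Para "text"
      ]
```

In this model the role set on the level-1 heading ends up only on the outer section wrapper, and the nested level-2 section does not inherit it. Whether the real `makeSections` treats key-value attributes the same way should be checked against its source.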

Author

yanntrividic, Jul 25, 2025


Hello, thanks for taking the time to have a look at it again. I proposed this modification following what I understood from your recommendation in #10665 (comment).

The problem that led to this modification was that the role attributes were applied recursively to all child sections, because of the way addPandocAttributes is designed. From my understanding, we arrived at this "acceptable" solution to avoid this recursion.

Should we figure out something else then?
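
The propagation problem described in this reply can be illustrated with a small model. It is a sketch that assumes the attribute helper touches every block in the flat sequence it receives, which is how the thread describes the observed behavior; `addToAll` and `addToFirstHeader` are hypothetical names, not pandoc functions, and the types are toy stand-ins.

```haskell
module Main where

-- Toy stand-ins for pandoc's types; NOT the real pandoc API.
type Attr = [(String, String)]

data Block
  = Header Int Attr String
  | Para String
  deriving (Eq, Show)

-- Hypothetical model of a helper that adds attributes to every block it is
-- given. Because a section is parsed into a flat "heading <> children"
-- sequence, headings of child sections are reached as well.
addToAll :: Attr -> [Block] -> [Block]
addToAll kvs = map add
  where
    add (Header n as t) = Header n (as ++ kvs) t
    add other = other

-- Alternative: attach the attributes to the section's own heading only.
addToFirstHeader :: Attr -> [Block] -> [Block]
addToFirstHeader kvs (Header n as t : rest) = Header n (as ++ kvs) t : rest
addToFirstHeader _ bs = bs

-- A sect1 with a nested sect2, flattened the way the reader returns it.
section :: [Block]
section =
  [ Header 1 [] "Headers"
  , Header 2 [] "Level 2"
  , Para "text"
  ]

main :: IO ()
main = do
  print (addToAll [("role", "sect1role")] section)
  print (addToFirstHeader [("role", "sect1role")] section)
```

With `addToAll` the level-2 heading also receives `sect1role`, which is the unwanted propagation. `addToFirstHeader` keeps the role on the section's own heading, which is closer to what the review comment proposes.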

titleabbrevElAsAttr =
case filterChild (named "titleabbrev") e `mplus`
(filterChild (named "info") e >>=
@@ -1135,9 +1145,8 @@ parseBlock (Elem e) =
b <- p
case mbt of
Nothing -> return b
-Just t -> return $ divWith (attrValue "id" e,[],[])
+Just t -> return $ divWith (attrValue "id" e, [], getRoleAttr e)
(divWith ("", ["title"], []) (plain t) <> b)
-
-- Admonitions are parsed into a div. Following other Docbook tools that output HTML,
-- we parse the optional title as a div with the @title@ class, and give the
-- block itself a class corresponding to the admonition name.
@@ -1226,8 +1235,8 @@ parseInline (Text (CData _ s _)) = do
else return $ text s
parseInline (CRef ref) =
return $ text $ fromMaybe (T.toUpper ref) $ lookupEntity ref
-parseInline (Elem e) =
-case qName (elName e) of
+parseInline (Elem e) = do
+parsedInline <- case qName (elName e) of
"anchor" -> do
return $ spanWith (attrValue "id" e, [], []) mempty
"phrase" -> do
@@ -1349,6 +1358,9 @@ parseInline (Elem e) =
-- <?asciidor-br?> to in handleInstructions, above.
"pi-asciidoc-br" -> return linebreak
_ -> skip >> innerInlines id
+return $ case qName (elName e) of
+"emphasis" -> parsedInline
+_ -> addPandocAttributes (getRoleAttr e) parsedInline
where skip = do
let qn = qName $ elName e
let name = if "pi-" `T.isPrefixOf` qn
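
The role-extraction pattern used throughout this file's diff can be tried in isolation. The sketch below assumes that a missing attribute reads as the empty string, as the `"" -> []` branch of `getRoleAttr` implies. `Element` here is a toy record rather than pandoc's XML type, and `inlineRoleAttr` is a hypothetical helper that mirrors the `"emphasis"` exception in `parseInline`, presumably needed because an emphasis role such as `strong` already selects the inline's styling.

```haskell
module Main where

import Data.Maybe (fromMaybe)

-- Toy element: a tag name plus attribute pairs (not pandoc's Element type).
data Element = Element
  { elName :: String
  , elAttrs :: [(String, String)]
  }

-- Assumed convention: a missing attribute is reported as "".
attrValue :: String -> Element -> String
attrValue key = fromMaybe "" . lookup key . elAttrs

-- Same shape as the helper in the diff, over String instead of Text.
getRoleAttr :: Element -> [(String, String)]
getRoleAttr e = case attrValue "role" e of
  "" -> []
  r -> [("role", r)]

-- Hypothetical inline variant: emphasis never contributes a role attribute.
inlineRoleAttr :: Element -> [(String, String)]
inlineRoleAttr e
  | elName e == "emphasis" = []
  | otherwise = getRoleAttr e

main :: IO ()
main = do
  print (getRoleAttr (Element "para" [("role", "pararole")]))
  print (getRoleAttr (Element "para" []))
  print (inlineRoleAttr (Element "emphasis" [("role", "strong")]))
  print (inlineRoleAttr (Element "phrase" [("role", "phraserole")]))
```

Returning a list rather than a `Maybe` lets the result be passed straight to a function that takes key-value pairs, with the empty list meaning that nothing is added.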
42 changes: 38 additions & 4 deletions test/docbook-reader.docbook
@@ -27,9 +27,9 @@
This is a set of tests for pandoc. Most of them are adapted from John
Gruber’s markdown test suite.
</para>
-<sect1 id="headers">
+<sect1 id="headers" role="sect1role">
<title>Headers</title>
-<sect2 id="level-2-with-an-embedded-link">
+<sect2 id="level-2-with-an-embedded-link" role="sect2role">
<title>Level 2 with an <ulink url="/url">embedded link</ulink></title>
<sect3 id="level-3-with-emphasis">
<title>Level 3 with <emphasis>emphasis</emphasis></title>
@@ -74,6 +74,9 @@
<para>
Here’s a regular paragraph.
</para>
+<para role="pararole">
+And here’s a regular paragraph with a role.
+</para>
<para>
In Markdown 1.0.0 and earlier. Version 8. This line turns into a list
item. Because a hard-wrapped line in the middle of a paragraph looked like
@@ -93,6 +96,11 @@
This is a block quote. It is pretty short.
</para>
</blockquote>
+<blockquote role="roleblockquote">
+<para>
+This is a block quote with a role.
+</para>
+</blockquote>
<blockquote>
<para>
Code in a block quote:
@@ -233,6 +241,26 @@ These should not be escaped: \$ \\ \&gt; \[ \{
</para>
</listitem>
</orderedlist>
+<para>
+with role:
+</para>
+<orderedlist role="listrole" numeration="arabic">
+<listitem>
+<para>
+First
+</para>
+</listitem>
+<listitem>
+<para>
+Second
+</para>
+</listitem>
+<listitem>
+<para>
+Third
+</para>
+</listitem>
+</orderedlist>
<para>
and tight:
</para>
@@ -702,6 +730,12 @@ These should not be escaped: \$ \\ \&gt; \[ \{
<para>
So is <emphasis role="strong"><emphasis>this</emphasis></emphasis> word.
</para>
+<para>
+So is <emphasis role="emphasisrole"><emphasis>this</emphasis></emphasis> word with a role.
+</para>
+<para>
+So is <phrase role="phraserole"><phrase>this</phrase></phrase> phrase with a role.
+</para>
<para>
This is code: <literal>&gt;</literal>, <literal>$</literal>,
<literal>\</literal>, <literal>\$</literal>,
@@ -1408,7 +1442,7 @@ or here: &lt;http://example.com/&gt;
<para>
Table with attributes
</para>
-<table xml:id="mytableid1" class="mytableclass1 mytableclass2" tabstyle="mytabstyle1">
+<table xml:id="mytableid1" class="mytableclass1 mytableclass2" tabstyle="mytabstyle1" role="tablerole1">
<title>
Attribute table caption
</title>
@@ -1444,7 +1478,7 @@ or here: &lt;http://example.com/&gt;
<para>
Table with attributes, without caption
</para>
-<informaltable xml:id="mytableid2" class="mytableclass3 mytableclass4" tabstyle="mytabstyle2">
+<informaltable xml:id="mytableid2" class="mytableclass3 mytableclass4" tabstyle="mytabstyle2" role="tablerole2">
<tgroup>
<thead>
<th>