I found that the same problem exists for guillemets, « and », which are commonly used in French texts.
Guillemets are also written with a space on either side of the quoted text, as in « Bonjour ! ». I found that this also causes incorrect segmentation when the guillemets are simply replaced with double quotes in a pre-processing step.
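One possible workaround for the spacing case is to drop the inner spaces while swapping the guillemets for double quotes, so the quote marks end up directly against the quoted text. The snippet below is only a sketch: `normalize_guillemets` is a hypothetical helper name, the set of space characters is an assumption about typical French typography, and I have not verified that this output is segmented correctly.

```python
import re

# Assumption: the spaces found inside guillemets are a regular space,
# a no-break space (U+00A0) or a narrow no-break space (U+202F).
_INNER_SPACE = "[ \u00a0\u202f]*"

# Opening guillemet plus any spacing that follows it.
_OPENING = re.compile("«" + _INNER_SPACE)
# Closing guillemet plus any spacing that precedes it.
_CLOSING = re.compile(_INNER_SPACE + "»")


def normalize_guillemets(text: str) -> str:
    """Replace guillemets with straight double quotes, removing inner spacing.

    Hypothetical pre-processing helper, not part of any library.
    """
    text = _OPENING.sub('"', text)
    return _CLOSING.sub('"', text)


print(normalize_guillemets("« Bonjour ! »"))
```

Note that this changes character offsets, so any spans returned by the segmenter would need to be mapped back to the original text.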
German texts often use a pair of „ and “ to delineate quoted text. These cause issues, for example in the text below:
Nach einem kurzen Zögern näherte sie sich Louis. „Darf ich mitspielen?“, fragte sie schüchtern.
Where the segmentation is
where for other languages a similar sentence would retain the second sentence as a single entity: