Description
I wanted to remove any extraneous characters from the initials and only have the initials with no punctuation or whitespace. In the process I stumbled upon two shortcomings with the formatting of initials.
Setting initials delimiter to empty string
>>> from nameparser import HumanName
>>> HumanName('Doe, John A.').initials()
'J. A. D.'
>>> HumanName('Doe, John A.', initials_delimiter='').initials()
'J. A. D.' <= EXPECTED 'J A D'
>>> from nameparser.config import CONSTANTS
>>> CONSTANTS.initials_delimiter = ''
>>> HumanName('Doe, John A.').initials()
'J A D'
>>> HumanName('Doe, John A.', initials_format='{first}{middle}{last}').initials()
'JAD'
It seems that while one can set initials_delimiter to an empty string via CONSTANTS, it is not possible via the keyword argument on HumanName. Presumably, this is because an empty string evaluates to False here:
python-nameparser/nameparser/parser.py
Line 99 in 759a131
self.initials_delimiter = initials_delimiter or self.C.initials_delimiter
I would expect this could be fixed by changing that line to:
self.initials_delimiter = initials_delimiter if initials_delimiter is not None else self.C.initials_delimiter
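The difference between the two expressions can be reproduced without nameparser. This is only a minimal sketch: DEFAULT_DELIMITER, pick_with_or and pick_with_none_check are hypothetical names, and the default of '.' is an assumption inferred from the 'J. A. D.' output above.

```python
# Stand-in for self.C.initials_delimiter (assumed default, not read from the library).
DEFAULT_DELIMITER = "."


def pick_with_or(initials_delimiter=None):
    # Mirrors the current line: any falsy value, including '', falls back to the default.
    return initials_delimiter or DEFAULT_DELIMITER


def pick_with_none_check(initials_delimiter=None):
    # Mirrors the proposed line: only None falls back to the default.
    return initials_delimiter if initials_delimiter is not None else DEFAULT_DELIMITER


print(repr(pick_with_or("")))          # empty string is replaced by the default
print(repr(pick_with_none_check("")))  # empty string is kept
print(repr(pick_with_none_check()))    # omitted argument still uses the default
```

With the explicit None check, an omitted keyword still picks up the configured default, so existing callers should be unaffected.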
Removing all whitespace from initials is not possible with multi-part names.
>>> from nameparser import HumanName
>>> from nameparser.config import CONSTANTS
>>> CONSTANTS.initials_delimiter = ''
>>> HumanName('Doe, John A. Kenneth', initials_format='{first}{middle}{last}').initials()
'JA KD' <= EXPECTED 'JAKD'
>>> HumanName('Doe, John A. Kenneth', initials_delimiter='.', initials_format='{first}{middle}{last}').initials()
'J.A. K.D.' <= EXPECTED 'J.A.K.D.'
This one is not so easy to fix. The code joins the parts together with a space hard-coded in.
python-nameparser/nameparser/parser.py
Lines 270 to 277 in 759a131
initials_dict = {
    "first": (self.initials_delimiter + " ").join(first_initials_list) + self.initials_delimiter
    if len(first_initials_list) else self.C.empty_attribute_default,
    "middle": (self.initials_delimiter + " ").join(middle_initials_list) + self.initials_delimiter
    if len(middle_initials_list) else self.C.empty_attribute_default,
    "last": (self.initials_delimiter + " ").join(last_initials_list) + self.initials_delimiter
    if len(last_initials_list) else self.C.empty_attribute_default
}
You could require the space to be part of the delimiter, but that might result in odd output for certain formats (e.g., {last}, {first} {middle}), and it would be a backward-incompatible change for anyone who has already defined custom delimiters. Maybe another setting needs to be defined for this, although I have no idea what name to give it.
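To make the idea of a separate setting concrete, here is a rough sketch written outside the library. The function join_initials and its separator parameter are hypothetical names, not part of nameparser, and the empty string stands in for self.C.empty_attribute_default. Defaulting separator to a single space reproduces the current joining behaviour, so existing custom delimiters would keep their output.

```python
def join_initials(initials, delimiter=".", separator=" "):
    """Join initials like the quoted code, with the hard-coded space made configurable.

    `separator` is a hypothetical setting; its default of " " matches today's output.
    """
    if not initials:
        # Stand-in for self.C.empty_attribute_default.
        return ""
    return (delimiter + separator).join(initials) + delimiter


middle = ["A", "K"]
print(join_initials(middle))                              # current behaviour
print(join_initials(middle, delimiter=""))                # space remains between parts
print(join_initials(middle, delimiter="", separator=""))  # no punctuation, no whitespace
print(join_initials(middle, separator=""))                # punctuation only
```

Keeping the separator apart from the delimiter avoids changing the meaning of initials_delimiter for anyone already relying on it.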
In the end, I worked around both issues with ''.join(name.initials_list()), but it would be nice to have full control with the provided formatting options.