Initials Formatting #152

@waylan

I wanted to remove any extraneous characters from the initials and only have the initials with no punctuation or whitespace. In the process I stumbled upon two shortcomings with the formatting of initials.

Setting initials delimiter to empty string

>>> from nameparser import HumanName
>>> HumanName('Doe, John A.').initials()
'J. A. D.'
>>> HumanName('Doe, John A.', initials_delimiter='').initials()
'J. A. D.'    <=  EXPECTED 'J A D'
>>> from nameparser.config import CONSTANTS
>>> CONSTANTS.initials_delimiter = ''
>>> HumanName('Doe, John A.').initials()
'J A D'
>>> HumanName('Doe, John A.', initials_format='{first}{middle}{last}').initials()
'JAD'

It seems that while one can set initials_delimiter to an empty string via CONSTANTS, it is not possible via the keyword argument on HumanName. Presumably, this is because an empty string evaluates to False here:

self.initials_delimiter = initials_delimiter or self.C.initials_delimiter

I would expect this could be fixed by changing that line to:

self.initials_delimiter = initials_delimiter if initials_delimiter is not None else self.C.initials_delimiter
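To illustrate the difference outside of nameparser, here is a minimal standalone sketch. DEFAULT_DELIMITER and both function names are hypothetical stand-ins (the former for self.C.initials_delimiter); none of this is the library's actual code.

```python
# Hypothetical stand-in for self.C.initials_delimiter.
DEFAULT_DELIMITER = "."


def resolve_with_or(initials_delimiter=None):
    # `or` falls back on any falsy value, so "" is silently replaced.
    return initials_delimiter or DEFAULT_DELIMITER


def resolve_with_none_check(initials_delimiter=None):
    # Only a missing argument (None) triggers the fallback; "" is kept.
    if initials_delimiter is not None:
        return initials_delimiter
    return DEFAULT_DELIMITER


print(repr(resolve_with_or("")))          # '.'
print(repr(resolve_with_none_check("")))  # ''
```

Both variants behave the same for None and for any non-empty string; they only differ for the empty string.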

Removing all whitespace from initials is not possible with multi-part names.

>>> from nameparser import HumanName
>>> from nameparser.config import CONSTANTS
>>> CONSTANTS.initials_delimiter = ''
>>> HumanName('Doe, John A. Kenneth', initials_format='{first}{middle}{last}').initials()
'JA KD'                                                                  <=  EXPECTED 'JAKD'
>>> HumanName('Doe, John A. Kenneth', initials_delimiter='.', initials_format='{first}{middle}{last}').initials()
'J.A. K.D.'    <=  EXPECTED 'J.A.K.D.'

This one is not so easy to fix. Within each name part, the code joins the initials using the delimiter plus a hard-coded space:

initials_dict = {
    "first": (self.initials_delimiter + " ").join(first_initials_list) + self.initials_delimiter
        if len(first_initials_list) else self.C.empty_attribute_default,
    "middle": (self.initials_delimiter + " ").join(middle_initials_list) + self.initials_delimiter
        if len(middle_initials_list) else self.C.empty_attribute_default,
    "last": (self.initials_delimiter + " ").join(last_initials_list) + self.initials_delimiter
        if len(last_initials_list) else self.C.empty_attribute_default
}

You could require the space to be part of the delimiter, but that might result in weird output for certain formats (e.g., {last}, {first} {middle}), and it would be a backward-incompatible change for anyone who has already defined custom delimiters. Maybe another setting needs to be defined for this, although I have no idea what name to give it.
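As a rough sketch of what such a setting might look like, the standalone function below takes the space as a separate argument. The names join_initials, separator, and empty are all hypothetical and are not part of nameparser; the default separator of " " is chosen so existing output would stay unchanged.

```python
def join_initials(initials, delimiter=".", separator=" ", empty=""):
    # Mirrors the join logic quoted above, but with the hard-coded " "
    # replaced by a configurable (hypothetical) separator argument.
    if not initials:
        return empty
    return (delimiter + separator).join(initials) + delimiter


# Default arguments reproduce the current spacing.
print(join_initials(["A", "K"]))  # A. K.

# Empty delimiter and separator strip all punctuation and whitespace.
parts = {
    "first": join_initials(["J"], delimiter="", separator=""),
    "middle": join_initials(["A", "K"], delimiter="", separator=""),
    "last": join_initials(["D"], delimiter="", separator=""),
}
print("{first}{middle}{last}".format(**parts))  # JAKD
```

Keeping the separator apart from the delimiter means a format such as {last}, {first} {middle} would not pick up stray trailing spaces from the delimiter itself.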

In the end, I worked around both issues with ''.join(name.initials_list()), but it would be nice to be able to have full control with the provided formatting options.
