Description
Current behavior:
-> 2.7.15 narrow
>>> import sysconfig
>>> sysconfig.get_config_vars('Py_UNICODE_SIZE')
[2]
>>> from builtins import str
>>> str
<class 'future.types.newstr.newstr'>
>>> foo = u'abc\U0001F4A9'
>>> foo
u'abc\U0001f4a9'
>>> bar = str(foo)
>>> bar
'abc\U0001f4a9'
>>> type(bar)
<class 'future.types.newstr.newstr'>
>>> type(foo)
<type 'unicode'>
>>> len(bar)
5
>>> len(foo)
5
>>>
-> 2.7.15 wide
>>> import sysconfig
>>> sysconfig.get_config_vars('Py_UNICODE_SIZE')
[4]
>>> from builtins import str
>>> str
<class 'future.types.newstr.newstr'>
>>> foo = u'abc\U0001F4A9'
>>> foo
u'abc\U0001f4a9'
>>> type(foo)
<type 'unicode'>
>>> bar = str(foo)
>>> bar
'abc\U0001f4a9'
>>> type(bar)
<class 'future.types.newstr.newstr'>
>>> len(bar)
4
>>> len(foo)
4
>>>
-> 3.7.0
>>> import sysconfig
>>> sysconfig.get_config_vars('Py_UNICODE_SIZE')
[None]
>>> from builtins import str
>>> str
<class 'str'>
>>> foo = u'abc\U0001F4A9'
>>> foo
'abc💩'
>>> type(foo)
<class 'str'>
>>> bar = str(foo)
>>> type(bar)
<class 'str'>
>>> bar
'abc💩'
>>> len(bar)
4
>>> len(foo)
4
>>>
Desired behavior: a newstr should have the same semantics as the Python 3.3+ str, regardless of the underlying storage encoding. For the example above, len(bar) should be 4 on a narrow build as well, as it is on the wide build and on 3.7.0.
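
To illustrate the difference between the two lengths, here is a minimal sketch that counts code points rather than UTF-16 code units. The helper names `utf16_code_units` and `count_code_points` are hypothetical and are not part of `future`; this is only one way such a length could be computed, not a proposed patch.

```python
import struct


def utf16_code_units(text):
    """Return the UTF-16 code units of text, as a narrow build stores them.

    Hypothetical helper for illustration only.
    """
    data = text.encode('utf-16-le')
    return list(struct.unpack('<%dH' % (len(data) // 2), data))


def count_code_points(units):
    """Count code points in a sequence of UTF-16 code units.

    A high surrogate immediately followed by a low surrogate is counted
    as a single code point; anything else counts as one.
    """
    count = 0
    i = 0
    while i < len(units):
        is_pair = (
            0xD800 <= units[i] <= 0xDBFF
            and i + 1 < len(units)
            and 0xDC00 <= units[i + 1] <= 0xDFFF
        )
        i += 2 if is_pair else 1
        count += 1
    return count


units = utf16_code_units(u'abc\U0001F4A9')
print(len(units))                # code units, what a narrow build reports
print(count_code_points(units))  # code points, what Python 3.3+ reports
```

For `u'abc\U0001F4A9'` the first number corresponds to the narrow-build `len()` shown above, and the second to the Python 3 result.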