Name normalization happens when pip is used to install a Python package. Dash '-', underscore '_', and period '.' are conflated based on certain rules when pip searches for packages.
For example, suppose we install a package with the following command:
    pip install aaa.bbb_ccc
pip will search for the package aaa.bbb_ccc on PyPI. The “best” match for the requirement is selected (see the pip guide and source code for details). Loosely speaking, the “best” match is the newest version of the package that satisfies the requirement.
- The package name aaa.bbb_ccc is transformed to aaa-bbb-ccc by the following function:
    # extracted from pip source code
    import re

    _canonicalize_regex = re.compile(r"[-_.]+")

    def canonicalize_name(name):
        return _canonicalize_regex.sub("-", name).lower()
The wheel name is transformed in the same way (source code).
aaa.bbb_ccc is converted to aaa.bbb-ccc by the following function:
    # extracted from pip source code
    def safe_name(name):
        """Convert an arbitrary string to a standard distribution name

        Any runs of non-alphanumeric/. characters are replaced with a single '-'.
        """
        return re.sub('[^A-Za-z0-9.]+', '-', name)
tarball name is replaced by
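To see how the two functions quoted above differ, here is a minimal sketch that restates them locally and applies them to a few spellings of the same name. The functions are copied from the snippets rather than imported from pip, because pip's internal module layout is not a stable API. The list of spellings is made up for illustration.

```python
import re

# Restated from the snippets quoted above so this example runs on its own.
_canonicalize_regex = re.compile(r"[-_.]+")


def canonicalize_name(name):
    # Collapse every run of '-', '_' and '.' into one '-', then lowercase.
    return _canonicalize_regex.sub("-", name).lower()


def safe_name(name):
    # Replace every run of characters other than letters, digits and '.'
    # with one '-'. Case and periods are preserved.
    return re.sub("[^A-Za-z0-9.]+", "-", name)


# Hypothetical spellings of one project name, chosen for illustration.
spellings = ["aaa.bbb_ccc", "aaa-bbb-ccc", "AAA_bbb.ccc", "aaa..bbb__ccc"]

canonical = {canonicalize_name(s) for s in spellings}
print(canonical)                 # {'aaa-bbb-ccc'}
print(safe_name("aaa.bbb_ccc"))  # aaa.bbb-ccc
print(safe_name("AAA_bbb.ccc"))  # AAA-bbb.ccc
```

All four spellings collapse to a single canonical name, which is why they refer to the same project when pip searches the index. safe_name is less aggressive: it keeps periods and case, so it only rewrites the underscore.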
Package name convention in PEP 8
PEP 8 discourages long, complicated package names.
    # extracted from PEP 8
    Python packages should also have short, all-lowercase names, although the use of underscores is discouraged.