Senf introduces a new platform native string type called fsnative. It adds functions to convert text, bytes and paths to and from that new type and helper functions to integrate it nicely with the Python stdlib.

Senf supports Python 2.7 and 3.3+, works with PyPy, runs on Linux, Windows and macOS, is MIT licensed, and depends only on the stdlib. It does not monkey patch anything in the stdlib.

pip install senf

https://github.com/lazka/senf

Why?

OS strings are used in many different places across the Python stdlib. They are used for filesystem paths, for environment variables (os.environ), for program arguments (sys.argv and subprocess), for printing to the console (sys.stdout, sys.stderr) and more.

The problem with them is that they come in many shapes and forms and handling them has changed significantly between Python 2 and Python 3.

A valid platform native string is either bytes, unicode, str + surrogates (either through the surrogatepass or the surrogateescape error handler) or anything implementing the __fspath__ protocol. The values of those types depend on the Python version, the platform and the environment the program was started in. Ideally we don’t want to care about any of those details.
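To make the "str + surrogates" case concrete, here is a minimal sketch that uses only the Python 3 stdlib and no Senf API. The file name is a made-up example. It shows how the surrogateescape error handler smuggles an undecodable byte through a str and why a strict encode of that str fails:

```python
# Python 3 only; stdlib behaviour, not Senf's API.
raw = b"caf\xe9.txt"  # hypothetical Latin-1 encoded name, not valid UTF-8

# surrogateescape maps each undecodable byte to a lone surrogate code point.
name = raw.decode("utf-8", "surrogateescape")

# A strict encode rejects the lone surrogate...
try:
    name.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

# ...while surrogateescape restores the original bytes exactly.
roundtrip = name.encode("utf-8", "surrogateescape")
```

Here name is 'caf\udce9.txt', the strict encode fails, and roundtrip equals raw again.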


For example, assume you want to check the extension of a file name:

import os
from senf import path2fsn

def has_extension(filename, ext):
    root, filename_ext = os.path.splitext(path2fsn(filename))
    return filename_ext == path2fsn(ext)

This will just work everywhere. path2fsn() will convert anything which is considered a valid path by Python to an fsnative and then we can just compare by value. Note that Python stdlib functions will always return the same type which was passed in, so os.path.splitext() will return two fsnative values.
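The type-preserving behaviour of the stdlib can be checked without Senf. The following sketch assumes Python 3 and a made-up file name. It shows that a str path and a bytes path yield results of different types, which never compare equal unless both sides are normalized to one type first:

```python
import os

# The stdlib returns the same type it was given.
text_root, text_ext = os.path.splitext("song.flac")
bytes_root, bytes_ext = os.path.splitext(b"song.flac")

# In Python 3, str and bytes never compare equal, even for ASCII content.
mixed_equal = text_ext == bytes_ext

# Normalizing both sides to one type makes the comparison meaningful.
normalized_equal = text_ext == os.fsdecode(bytes_ext)
```

This is the reason for converting both arguments with path2fsn() before comparing them.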


Or you want to send a filename over some binary interface:

from senf import fsnative, fsn2bytes, bytes2fsn

def send(filename):
    assert isinstance(filename, fsnative)
    data = fsn2bytes(filename, "utf-8")
    return data

def receive(data):
    filename = bytes2fsn(data, "utf-8")
    return filename

fsn2bytes() converts the path to binary (“utf-8” is used on Windows, or “wtf-8” to be exact) and the receiving end re-creates the filename with bytes2fsn().


Another example is printing filenames and text to a console:

import os
from senf import print_, argv

for filename in os.listdir(argv[1]):
    print_(u"File: ", filename)

Senf provides its own print function which can output platform strings as is and mix them with text. No more encoding/decoding errors.

In addition, Senf emulates ANSI escape sequence handling when using the Windows console and extends Python 2 under Windows with Unicode support for sys.argv and os.environ.
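As a rough illustration of what mixing text with platform strings involves on a Unix-like system under Python 3, the sketch below writes a label and a surrogate-escaped file name to a binary stream. It is not how print_() is implemented. The write_line() helper and the file name are hypothetical, and an in-memory buffer stands in for the console:

```python
import io

def write_line(stream, label, name):
    """Write ordinary text followed by a name that may hold escaped bytes."""
    # label is plain text; name may contain lone surrogates from surrogateescape
    data = label.encode("utf-8") + name.encode("utf-8", "surrogateescape")
    stream.write(data + b"\n")

buf = io.BytesIO()  # stands in for a binary console stream
write_line(buf, "File: ", "caf\udce9.txt")
output = buf.getvalue()
```

The original byte of the file name reaches the stream unchanged, whereas a strict text write of the same name would raise UnicodeEncodeError.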

Who?

Senf is used by the following software:

  • Quod Libet - A multi platform music player
  • mutagen - A Python multimedia tagging library