Does not handle badly encoded filenames transparently

When a filename on disk is invalid, in the sense that its bytes cannot be decoded with the character encoding Python is using for filenames, pylibacl breaks.

Since Python 3.1, there is a mechanism for supporting filenames like these, using surrogate code points (PEP 383): http://legacy.python.org/dev/peps/pep-0383/

With this mechanism, these invalid filenames are converted to normal unicode strings, with "surrogate code points" standing in for the undecodable bytes. However, when such a filename is passed to pylibacl (through the `file` argument of the `ACL` constructor), an exception occurs:

```
    actual = posix1e.ACL(file=path)
UnicodeEncodeError: 'utf-8' codec can't encode character '\udce9' in position 99: surrogates not allowed
```
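The failure can be reproduced without pylibacl. Here is a minimal sketch, assuming a filename stored on disk as Latin-1 rather than UTF-8 (the name `caf\xe9` is made up for illustration):

```python
# Bytes as they might be stored on disk: Latin-1 "é" (0xE9) is not valid UTF-8.
raw = b"caf\xe9"

# PEP 383: each undecodable byte becomes a lone surrogate (0xE9 -> U+DCE9).
name = raw.decode("utf-8", "surrogateescape")

# A strict UTF-8 encode rejects the surrogate, as in the traceback above.
try:
    name.encode("utf-8")
    strict_error = None
except UnicodeEncodeError as exc:
    strict_error = exc

# Encoding with the same error handler restores the original bytes.
restored = name.encode("utf-8", "surrogateescape")
```

So the string itself is fine; it only needs to be encoded back with the `surrogateescape` handler instead of strict UTF-8.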

Looking at the code, I suspect this error comes from [the call to `PyArg_ParseTupleAndKeywords`](https://github.com/iustin/pylibacl/blob/master/acl.c#L155), which is told to convert into a string using utf-8, which then fails.

Looking [at the docs for `PyArg_ParseTupleAndKeywords`](https://docs.python.org/3.4/c-api/arg.html#strings-and-buffers), it seems that this problem can be solved by using the `O&` format:

> Note: This format does not accept bytes-like objects. If you want to accept filesystem paths and convert them to C character strings, it is preferable to use the O& format with [PyUnicode_FSConverter()](https://docs.python.org/3.4/c-api/unicode.html#c.PyUnicode_FSConverter) as converter.
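For comparison, `os.fsencode()` does at the Python level roughly what `PyUnicode_FSConverter()` does in C: a `str` is encoded with the filesystem encoding and its error handler, and `bytes` are returned unchanged. The sketch below assumes a POSIX system and a made-up filename; it only illustrates the conversion and is not pylibacl's code.

```python
import os

# Hypothetical on-disk name containing a byte that is not valid UTF-8.
raw = b"caf\xe9"

# What Python hands back for such a name, e.g. from os.listdir().
name = os.fsdecode(raw)

# Encoding with the filesystem encoding and error handler round-trips the name.
encoded = os.fsencode(name)

# bytes input is returned as-is.
passthrough = os.fsencode(raw)
```

Whatever the locale's filesystem encoding is, `encoded` should equal the original bytes, which is the property the C code needs when it passes the path to libacl.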

