On 08/12/14 14:31, Jeff Morriss wrote:
On 08/09/14 22:41, Evan Huus wrote:
http://buildbot.wireshark.org/trunk/builders/Clang%20Code%20Analysis/builds/2911/steps/check-abi/logs/stdio
I took a quick look at the recent check-abi buildbot failure, which
appears to be manpage related:
wireshark.pod around line 3525: Non-ASCII character seen before
=encoding in 'KovE<aacute>ř'. Assuming UTF-8
POD document had syntax errors at /usr/bin/pod2man line 71.
Which is curious, because wireshark.pod.template *does* have an
=encoding line...
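
One way to narrow this down is to check where the first non-ASCII line sits relative to the =encoding directive in the generated wireshark.pod. The following is only a minimal Python sketch under that assumption; the helper name non_ascii_before_encoding is made up for illustration and is not part of the Wireshark build or of pod2man:

```python
def non_ascii_before_encoding(pod_text):
    """Return (line_number, line) pairs for lines containing non-ASCII
    characters that appear before the first =encoding directive.

    If the text has no =encoding directive at all, every non-ASCII
    line is reported. Hypothetical helper, for illustration only.
    """
    hits = []
    for line_no, line in enumerate(pod_text.splitlines(), start=1):
        if line.startswith("=encoding"):
            break  # anything after this point is covered by the directive
        if not line.isascii():
            hits.append((line_no, line))
    return hits


if __name__ == "__main__":
    # Made-up sample input: a non-ASCII name ahead of the directive.
    sample = "=head1 NAME\n\nKov\u00e1\u0159\n\n=encoding utf8\n\nJ\u00f6rg\n"
    for line_no, line in non_ascii_before_encoding(sample):
        print(f"line {line_no}: {line}")
```

An empty result would suggest the directive is placed early enough and that the warning has some other cause.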
[As discussed later on this thread] the current master doesn't give this
warning, but I did notice that the generated man page doesn't contain the
actual UTF-8 characters required by some people's names. E.g.,
"doc/wireshark.1" on my system contains:
XXXXX XXXXXXXX <dpb[AT]...]
though, interestingly, Joerg's name got "translated" from what's in
AUTHORS (which contains an o-umlaut) to:
Joerg Mayer <jmayer[AT]...]
Ah, OK, pod2man does that unless you specify "-u":
-u, --utf8
    By default, pod2man produces the most conservative possible *roff
    output to try to ensure that it will work with as many different
    *roff implementations as possible. Many *roff implementations
    cannot handle non-ASCII characters, so this means all non-ASCII
    characters are converted either to a *roff escape sequence that
    tries to create a properly accented character (at least for troff
    output) or to "X".

    This option says to instead output literal UTF-8 characters. If
    your *roff implementation can handle it, this is the best output
    format to use and avoids corruption of documents containing
    non-ASCII characters. However, be warned that *roff source with
    literal UTF-8 characters is not supported by many implementations
    and may even result in segfaults and other bad behavior.
At least on my system here, "-u" makes for an ugly rendering of the man
page (though this system seems to have LANG/locale issues).
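
To see quickly which of the two behaviours a given generated man page shows, one could scan it for literal non-ASCII characters and for names replaced by runs of "X". This is only a rough Python sketch: the function name summarize_roff_output is hypothetical, and the "X" check is a heuristic that would also match any genuine all-X word in the page:

```python
import re


def summarize_roff_output(roff_text):
    """Summarize how non-ASCII text appears in generated *roff source.

    Returns a dict with:
      non_ascii: sorted list of distinct non-ASCII characters found
                 (expected when the page was built with "-u")
      x_runs:    words made up only of two or more "X" characters
                 (a hint that characters were replaced by "X")
    Hypothetical helper, for illustration only.
    """
    non_ascii = sorted({ch for ch in roff_text if ord(ch) > 127})
    x_runs = re.findall(r"\bX{2,}\b", roff_text)
    return {"non_ascii": non_ascii, "x_runs": x_runs}


if __name__ == "__main__":
    # The line quoted above from doc/wireshark.1.
    print(summarize_roff_output("XXXXX XXXXXXXX <dpb[AT]...]"))
```

For a real check, the contents of doc/wireshark.1 would be read (as UTF-8) and passed to the function instead of the inline string.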