Python / Chinese Encodings

  • Achim Domma

    Python / Chinese Encodings

    Hi,

    I need to convert Big5 or GB encoded Chinese strings to Unicode. It would
    also be nice to be able to detect the encoding of the original string.
    Searching with groups.google.com, I found some links to different projects,
    but none of them look very active. Can somebody give me a short overview of
    the status of processing Chinese texts with Python?

    regards,
    Achim


  • Martin v. Löwis

    #2
    Re: Python / Chinese Encodings

    "Achim Domma" <domma@procoders.net> writes:

    > I need to convert Big5 or GB encoded Chinese strings to Unicode. It would
    > also be nice to be able to detect the encoding of the original string.
    > Searching with groups.google.com, I found some links to different projects,
    > but none of them look very active. Can somebody give me a short overview of
    > the status of processing Chinese texts with Python?

    The very short summary: Use the CJK codecs package; it supports all
    encodings you might encounter, and it is actively maintained.
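
As a rough illustration of the conversion step, here is a minimal sketch. It assumes a modern Python 3, where the CJK codecs ship with the standard library (they were merged in Python 2.4), so no separate install is needed. The helper name `to_unicode` is made up for this example and is not part of any package.

```python
# Sketch only: decode Big5 / GB2312 bytes to Unicode text.
# Assumes Python 3, where the "big5" and "gb2312" codecs are built in.
# `to_unicode` is a hypothetical helper name used for illustration.

def to_unicode(raw: bytes, declared_encoding: str) -> str:
    """Decode raw bytes using the encoding declared by the sender."""
    return raw.decode(declared_encoding)


# Build sample input by encoding a known string both ways.
sample = "中文"
big5_bytes = sample.encode("big5")
gb_bytes = sample.encode("gb2312")

# Decoding with the correct, declared encoding recovers the original text.
big5_text = to_unicode(big5_bytes, "big5")
gb_text = to_unicode(gb_bytes, "gb2312")
```

Both calls give back the original string as long as the declared encoding matches the one actually used.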

    As for detecting the encoding of the original string: Forget it. Tell
    your communication partners to always properly declare the encoding.
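
To illustrate why guessing is unreliable, here is a small sketch, again assuming Python 3 with the built-in codecs. The function `candidate_encodings` is a hypothetical name for this example. It simply tries each codec and reports which ones decode without raising an error; for the sample below, more than one does, and only one of the results is the intended text.

```python
# Sketch only: show that "does it decode?" is not a reliable detector.
# `candidate_encodings` is a hypothetical helper, not a library function.

def candidate_encodings(raw: bytes, encodings: list) -> dict:
    """Return {encoding: decoded_text} for every codec that accepts `raw`."""
    results = {}
    for name in encodings:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            # This codec rejects the byte sequence; skip it.
            pass
    return results


# Bytes that were really produced as Big5 ...
raw = "中文".encode("big5")

# ... may still be accepted by another codec, yielding different text.
candidates = candidate_encodings(raw, ["big5", "gb2312"])
```

Since the byte ranges of the two encodings overlap, a successful decode proves nothing about which encoding was intended, which is why an explicit declaration is the safer route.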

    Regards,
    Martin
