
Step by step: Solving Python's UnicodeDecodeError: 'gb2312' codec can't decode bytes

Problem: UnicodeDecodeError: 'gb2312' codec can't decode bytes in position 2-3: illegal multibyte sequence

Reason: The error is raised while Python is converting an ordinary byte string to a unicode object, for example: u_string = unicode(string, "gb2312")

If the string contains traditional characters, such as the "Jiao" in "Hejiao Primary School", gb2312 cannot decode it, because gb2312 is a Simplified Chinese encoding. The national standard extension encoding gbk has to be used instead. gbk supports traditional Chinese characters and Japanese kana.
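
The failure can be reproduced with a minimal sketch. The sample text here is an assumption chosen for illustration, not the string from the original report: U+5B78 is the traditional form of the character for "study", which gbk can encode and gb2312 cannot. The sketch uses bytes.decode, which behaves like unicode(string, "gb2312") in Python 2 and also runs on Python 3.

```python
# -*- coding: utf-8 -*-
# Illustrative sample (an assumption, not the text from the original report):
# U+5B78 is a traditional character, U+6821 is shared by both scripts.
sample = u"\u5b78\u6821"

# Stand-in for bytes that arrive from a file or socket in gbk encoding.
raw = sample.encode("gbk")

try:
    raw.decode("gb2312")
    error_message = None
except UnicodeDecodeError as exc:
    # gb2312 has no mapping for the traditional character's byte pair.
    error_message = str(exc)

print(error_message)
```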


Solution: Use gbk instead of gb2312, for example: u_string = unicode(string, "gbk")
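
On Python 3, where unicode() no longer exists, the same fix is written with bytes.decode. The sketch below suggests that switching to gbk should not break text that was already valid gb2312, since gbk extends gb2312. The helper name decode_gbk and the sample strings are hypothetical, introduced only for illustration.

```python
# -*- coding: utf-8 -*-
def decode_gbk(raw):
    """Hypothetical helper: decode Chinese-encoded bytes using gbk."""
    return raw.decode("gbk")

simplified = u"\u5b66\u6821"   # simplified characters, representable in gb2312
traditional = u"\u5b78\u6821"  # contains a traditional character, needs gbk

# Bytes produced by the gb2312 codec still decode correctly as gbk.
from_gb2312 = decode_gbk(simplified.encode("gb2312"))

# Bytes containing the traditional character decode without an error.
from_gbk = decode_gbk(traditional.encode("gbk"))

print(from_gb2312 == simplified, from_gbk == traditional)
```

The two codecs are known to differ on a few punctuation code points, so this check covers ordinary Chinese characters only.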