Question: UnicodeDecodeError: 'gb2312' codec can't decode bytes in position 2-3: illegal multibyte sequence
Reason: The error is raised while Python converts an ordinary byte string to a unicode object, for example: u_string = unicode(string, "gb2312").
If the string contains a traditional character, such as the "Jiao" in "Hejiao Primary School", it cannot be decoded as gb2312, because gb2312 only covers Simplified Chinese. The national standard extension encoding gbk is needed instead. gbk supports Traditional Chinese characters and Japanese kana.
Solution: Use gbk instead of gb2312, for example: u_string = unicode(string, "gbk")
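The difference between the two codecs can be reproduced with a short sketch. The sample text (the traditional characters U+5B78 U+6821) and the variable names are assumptions chosen for illustration, not taken from the original question. The snippet uses bytes.decode() so that it runs on both Python 2 and Python 3.

```python
# Sample text: two characters, the first of which (U+5B78) is a
# traditional form that gbk can encode but gb2312 cannot.
# Unicode escapes keep the snippet independent of the file encoding.
original = u"\u5b78\u6821"
raw = original.encode("gbk")  # byte string as it might arrive from a file

# Decoding with gb2312 fails on the first character.
try:
    raw.decode("gb2312")
    gb2312_failed = False
except UnicodeDecodeError:
    gb2312_failed = True

# Decoding with gbk succeeds and returns the original text.
text = raw.decode("gbk")
```

On Python 2, raw.decode("gbk") gives the same result as unicode(raw, "gbk").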