Problems installing lxml for Python, and ways to solve them

Whether you use the Scrapy crawling framework or plain requests and parse the responses yourself, you will inevitably need an HTML parsing library.

Regular expressions can of course replace some of the lookups. But regex syntax is obscure and falls short in other scenarios, so HTML parsing is essential.

lxml is widely recommended online. Its advantages: stable and efficient.

However, it is hard to get lxml installed successfully on the first try.

pip install lxml
If this finishes without an error, count yourself very lucky...
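
To see whether the install really worked, a small check like the one below can help. It is only a sketch: the function name lxml_status is made up for this example, and it relies on the LXML_VERSION and LIBXML_VERSION tuples that lxml.etree exposes.

```python
import importlib.util


def lxml_status():
    """Return a short description of the lxml installation, if any."""
    if importlib.util.find_spec("lxml") is None:
        return "lxml is not installed"
    try:
        # The compiled extension; this import fails if the build is broken.
        from lxml import etree
    except ImportError as exc:
        return "lxml is present but cannot be imported: %s" % exc
    lxml_version = ".".join(str(part) for part in etree.LXML_VERSION)
    libxml_version = ".".join(str(part) for part in etree.LIBXML_VERSION)
    return "lxml %s (running against libxml2 %s)" % (lxml_version, libxml_version)


print(lxml_status())
```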

On Windows, you can download a prebuilt wheel file from the page listed below and run pip install on it from the download directory.
pip install
/~gohlke/pythonlibs/#lxml
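
A wheel only installs if its filename tags match the interpreter, so it helps to know which file to pick. The sketch below guesses the two tags to look for. The function name expected_wheel_tags is hypothetical, and it assumes CPython on 32-bit or 64-bit x86 Windows.

```python
import struct
import sys


def expected_wheel_tags():
    """Return the (python tag, platform tag) to look for in a wheel filename."""
    # e.g. "cp39" for CPython 3.9
    python_tag = "cp%d%d" % (sys.version_info.major, sys.version_info.minor)
    # Pointer size tells us whether the interpreter itself is 32-bit or 64-bit.
    bits = struct.calcsize("P") * 8
    # Assumption: x86 Windows only; other platforms use different tags.
    platform_tag = "win_amd64" if bits == 64 else "win32"
    return python_tag, platform_tag


print("look for a wheel tagged %s and %s" % expected_wheel_tags())
```

Note that the bitness that matters is that of the Python interpreter, not of Windows itself.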

Usually the error says that libxml2 is missing, or that the version shipped with the system is too old. In that case libxml2 needs to be installed or upgraded.

Ubuntu
apt-get install libxml2-dev libxslt-dev python-dev

CentOS
yum install libxml2-devel libxslt-devel python-devel
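
As a rough check of what the system already has, the standard library can look up the shared libraries. This is a sketch with a made-up helper name, find_xml_libraries. It only detects the runtime libraries, not the development headers that the packages above install, so a hit here does not guarantee that lxml will compile.

```python
from ctypes.util import find_library


def find_xml_libraries():
    """Map each library name to the file the system linker finds, or None."""
    return {name: find_library(name) for name in ("xml2", "xslt")}


for name, found in find_xml_libraries().items():
    print("lib%s: %s" % (name, found or "not found"))
```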

If this doesn't work, you can download the source code and compile it locally.

And then I ran into this damned problem:
error: command 'gcc' failed with exit status 1

Environment: the VPS has only 256 MB of memory, which is not enough. Every time gcc is invoked to compile, the system kills it. On top of that, the VPS runs on OpenVZ, so you cannot add temporary swap unless you migrate to another data center.
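
Before starting a build on a small machine, it can be worth checking how much memory is actually free. The sketch below parses text in the format of /proc/meminfo. The helper names and the 512 MB threshold are assumptions for illustration, not measured requirements of the lxml build.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of values in kB."""
    values = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def enough_memory(meminfo, needed_mb=512):
    """Guess whether free RAM plus free swap covers needed_mb (hypothetical threshold)."""
    free_kb = meminfo.get("MemAvailable", meminfo.get("MemFree", 0))
    free_kb += meminfo.get("SwapFree", 0)
    return free_kb >= needed_mb * 1024


# Made-up sample resembling a 256 MB machine without swap.
sample = "MemTotal: 262144 kB\nMemFree: 40960 kB\nSwapTotal: 0 kB\nSwapFree: 0 kB\n"
print(enough_memory(parse_meminfo(sample)))  # False
```

On a real Linux machine, the text would come from open("/proc/meminfo").read() instead of the sample string.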

Method 1: Find a system with the same environment, compile lxml into a wheel there, then copy the wheel over and install it.
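
For anyone who does have such a machine, Method 1 could look roughly like this. It is a sketch: the helper names and the wheels directory are made up, while pip wheel, --wheel-dir, --no-index and --find-links are standard pip options. The code only builds and prints the commands, it does not run them.

```python
import sys


def wheel_build_command(package="lxml", wheel_dir="wheels"):
    """Command to run on the build machine: compile the package into a wheel."""
    return [sys.executable, "-m", "pip", "wheel", package, "--wheel-dir", wheel_dir]


def wheel_install_command(package="lxml", wheel_dir="wheels"):
    """Command to run on the target after copying wheel_dir: install offline."""
    return [sys.executable, "-m", "pip", "install",
            "--no-index", "--find-links", wheel_dir, package]


print(" ".join(wheel_build_command()))
print(" ".join(wheel_install_command()))
```

The two machines need the same Python version, architecture and system libraries, otherwise the wheel will not install or will not load.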

That was not an option for me, since I had no such environment at hand.

So I simply copied the lxml package from my local site-packages folder to the server.

And the program actually ran...
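
Copying the package like this only works when the compiled files inside it match the target interpreter. A sketch of how one might check that, mirroring the way Python looks up extension modules. The function name and the example path are hypothetical.

```python
import importlib.machinery
import os


def find_loadable_extension(package_dir, module_name="etree"):
    """Return the compiled module file this interpreter would load, or None."""
    # Python tries module_name plus each accepted suffix, in this order.
    for suffix in importlib.machinery.EXTENSION_SUFFIXES:
        candidate = os.path.join(package_dir, module_name + suffix)
        if os.path.isfile(candidate):
            return candidate
    return None


# Hypothetical location of the copied package; adjust to the real path.
print(find_loadable_extension(os.path.join("site-packages", "lxml")))
```

A match here does not prove that the system libraries are compatible, so a quick test import is still the real check.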