lxml is a nice Python library for XML processing. Its etree module is really quick, which makes things interesting if you have a large number of XML files (or one big one) to process. Installation on Linux/Mac is painless (OK, you need Homebrew on Mac to make it painless, but you get my point...).

The other day I had to install it on Windows, which is not really painless. I managed to do it, so here are the steps.

Prerequisites

  • Python 2.7
  • VC++ 2008, the Express edition (freely downloadable from Microsoft), because Python 2.7 is compiled against VS 2008
  • Create a virtual environment (e.g. in C:\Users\myuser\virtualenvs\myvenv). I'll refer to this path as virtual_env_path later in the article.
  • In your virtual environment directory, create:
    • A directory PC (e.g. virtual_env_path\PC). This will contain the headers (equivalent to include).
    • A directory PCbuild (e.g. virtual_env_path\PCbuild). This will contain the libraries (equivalent to lib).
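
If you'd rather script the directory setup than click through Explorer, here is a minimal sketch. The function name create_build_dirs is my own invention for illustration, and you pass in whatever your virtual_env_path is.

```python
import os


def create_build_dirs(virtual_env_path):
    """Create the PC (headers) and PCbuild (libraries) directories.

    Hypothetical helper: returns the two paths, creating them if missing.
    """
    created = []
    for name in ("PC", "PCbuild"):
        path = os.path.join(virtual_env_path, name)
        if not os.path.isdir(path):
            os.makedirs(path)
        created.append(path)
    return created
```

You would call it once with your own path, e.g. create_build_dirs(r"C:\Users\myuser\virtualenvs\myvenv"). Running it a second time does no harm, because existing directories are left alone.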

Installing dependencies

Before lxml will build, you need to install several dependencies. The procedure is identical for each of them:

  1. Download the archive (.zip file). All archives have a similar structure, typically containing directories named bin, lib and include.
  2. Place the files from the archive's include directory in the PC directory created above. Note that your path should read e.g. virtual_env_path\PC\libxml, not virtual_env_path\PC\include\libxml.
  3. Place the files of the lib directory in virtual_env_path\PCbuild.
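
The three steps above can be sketched in Python. This is only an illustration under a few assumptions: the archive really does contain directories named include and lib (possibly nested under one top-level folder), and the names merge_tree, find_dir and install_archive are hypothetical helpers I made up, not part of any library.

```python
import os
import shutil
import tempfile
import zipfile


def merge_tree(src, dst):
    """Copy the contents of src into dst, keeping subdirectories."""
    if not os.path.isdir(dst):
        os.makedirs(dst)
    for name in os.listdir(src):
        s = os.path.join(src, name)
        d = os.path.join(dst, name)
        if os.path.isdir(s):
            merge_tree(s, d)
        else:
            shutil.copy2(s, d)


def find_dir(root, name):
    """Return the first directory called `name` found under root, or None."""
    for current, dirs, _files in os.walk(root):
        if name in dirs:
            return os.path.join(current, name)
    return None


def install_archive(zip_path, virtual_env_path):
    """Unpack a dependency zip: include -> PC, lib -> PCbuild."""
    tmp = tempfile.mkdtemp()
    try:
        with zipfile.ZipFile(zip_path) as archive:
            archive.extractall(tmp)
        for src_name, dst_name in (("include", "PC"), ("lib", "PCbuild")):
            src = find_dir(tmp, src_name)
            if src is not None:
                # Copy the *contents* of include/lib, so headers land in
                # PC\libxml rather than PC\include\libxml.
                merge_tree(src, os.path.join(virtual_env_path, dst_name))
    finally:
        shutil.rmtree(tmp)
```

Copying the contents of include rather than the directory itself is what gives you virtual_env_path\PC\libxml instead of virtual_env_path\PC\include\libxml.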

lxml has four dependencies, each of which needs to be installed following the steps described above:

  1. Install libxml2 from xmlsoft.org
  2. Install libxslt from xmlsoft.org
  3. Install zlib from GNU
  4. Install libiconv from GNU, namely the "developer files" zip. Important: copy libiconv.lib to iconv.lib in PCbuild.
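
The libiconv.lib to iconv.lib copy is easy to forget, so here is a small sketch that does it and complains if the source file is missing. The helper name alias_file is hypothetical; it only wraps shutil.copyfile.

```python
import os
import shutil


def alias_file(directory, existing_name, alias_name):
    """Copy directory/existing_name to directory/alias_name and return the new path."""
    src = os.path.join(directory, existing_name)
    dst = os.path.join(directory, alias_name)
    if not os.path.isfile(src):
        raise IOError("expected file is missing: " + src)
    shutil.copyfile(src, dst)
    return dst
```

For this step you would call alias_file(pcbuild_path, "libiconv.lib", "iconv.lib"), where pcbuild_path is your virtual_env_path\PCbuild. The original file stays in place, so both names resolve.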

Once this is done, you can install lxml in your virtual environment:

pip install lxml
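
A quick way to check whether the install actually worked is to import lxml from the virtual environment's Python and print the versions it was built against. The sketch below assumes nothing beyond lxml's own version constants (LXML_VERSION, LIBXML_VERSION, LIBXSLT_VERSION); the function name lxml_status is just for illustration.

```python
def lxml_status():
    """Return a one-line report on whether lxml can be imported."""
    try:
        from lxml import etree
    except ImportError as exc:
        # On Windows a missing DLL also surfaces as an ImportError.
        return "lxml import failed: %s" % exc

    def dotted(version):
        return ".".join(str(part) for part in version)

    return "lxml %s (libxml2 %s, libxslt %s)" % (
        dotted(etree.LXML_VERSION),
        dotted(etree.LIBXML_VERSION),
        dotted(etree.LIBXSLT_VERSION),
    )


if __name__ == "__main__":
    print(lxml_status())
```

If the report says the import failed even though pip finished without errors, the usual suspect on Windows is a DLL that cannot be found at runtime.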

One more thing: lxml also needs the DLLs at runtime, so you need to perform the following steps:

  • Get the default iconv archive, because it contains the .dll file
  • Copy all contents of the bin directories in the downloaded archives to your virtual environment's bin directory
  • Copy libiconv.dll to iconv.dll
  • Copy all the .dll files into your virtual_env_path\Lib\site-packages\lxml. This is definitely not nice :(.
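
Since copying DLLs around by hand gets old quickly, the last bullet can be scripted. This is a rough sketch: copy_dlls is a hypothetical helper, and the source and destination directories are whatever they happen to be on your machine.

```python
import os
import shutil


def copy_dlls(src_dir, dst_dir):
    """Copy every .dll file (any letter case) from src_dir into dst_dir.

    Returns the list of file names that were copied.
    """
    copied = []
    for name in sorted(os.listdir(src_dir)):
        path = os.path.join(src_dir, name)
        if os.path.isfile(path) and name.lower().endswith(".dll"):
            shutil.copy2(path, os.path.join(dst_dir, name))
            copied.append(name)
    return copied
```

You would point src_dir at the directory holding the collected DLLs and dst_dir at virtual_env_path\Lib\site-packages\lxml. The returned list makes it easy to see at a glance whether iconv.dll made it across.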