David Megginson edited this page Feb 14, 2015 · 6 revisions

When you install libhxl-python, the following commands are available from your regular command line (e.g. the Linux shell prompt). Most of the commands accept HXL datasets on standard input and write them to standard output, so you can combine them in a processing pipeline, like this:

hxlselect -q adm1=Conakry my-data.csv | hxlbounds -b conakry-bounds.json

The HXL cookbook contains a collection of recipes for using these tools together to accomplish simple and complex analysis, transformation, and validation tasks on HXL-encoded datasets.

These commands are designed to work with very large datasets: they perform efficiently even with hundreds of thousands of rows of data, with minimal memory overhead, so they are suitable for use in high-demand, multitasking environments. The exceptions are hxlcount and hxlsort, which can consume memory in linear proportion to the size of the dataset.
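The difference between a streaming command and one whose memory use grows with the dataset can be illustrated with a small Python sketch. This is not libhxl-python's actual implementation; the function names (select_rows, sort_rows) and the sample data are hypothetical, chosen only to show why a row filter can run in constant memory while a sort cannot.

```python
import csv
import io


def select_rows(rows, column, value):
    """Streaming filter: passes matching rows through one at a time.

    Only the current row is held in memory, so usage stays flat
    no matter how many rows the input contains.
    """
    rows = iter(rows)
    header = next(rows)
    yield header
    idx = header.index(column)
    for row in rows:
        if row[idx] == value:
            yield row


def sort_rows(rows, column):
    """Sort: must read every row before it can emit the first one.

    The whole dataset is materialised by sorted(), so memory use
    grows linearly with the number of rows.
    """
    rows = iter(rows)
    header = next(rows)
    idx = header.index(column)
    body = sorted(rows, key=lambda r: r[idx])
    yield header
    yield from body


# Hypothetical sample data with a hashtag row as its header.
SAMPLE = "#adm1,#affected\nConakry,100\nBoke,50\nConakry,25\n"

selected = list(select_rows(csv.reader(io.StringIO(SAMPLE)), "#adm1", "Conakry"))
ordered = list(sort_rows(csv.reader(io.StringIO(SAMPLE)), "#adm1"))
```

Because select_rows is a generator, it can sit in the middle of a pipeline and hand rows downstream as they arrive; sort_rows produces no output until its input is exhausted.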

To get a reminder of any command's usage, provide the "-h" option, e.g.

hxlvalidate -h

HXL commands

  • hxl2geojson — create a GeoJSON layer from a HXL dataset containing lat/lon columns (e.g. for showing on a web map).
  • hxladd — add extra columns with constant values (e.g. to add a fixed country or reporting-date value to every row for interoperability).
  • hxlbounds — test whether locations in a HXL dataset appear within a boundary shape (e.g. are they all in the expected country?).
  • hxlclean — create a new version of a HXL dataset which is fully normalised (untagged columns and extra headers removed, all compact-disaggregated data expanded).
  • hxlcount — produce simple aggregate statistics from a HXL dataset.
  • hxlcut — create a new version of a HXL dataset with some columns removed (e.g. to strip personally-identifiable information).
  • hxlmerge — merge fields from one HXL dataset into another one, based on common keys (like a code).
  • hxlrename — change the tags on one or more columns in a HXL dataset.
  • hxlselect — create a new version of a HXL dataset with some rows removed (e.g. those not matching a specific sector or organisation).
  • hxlsort — sort a dataset based on one or more hashtagged columns.
  • hxlvalidate — validate a HXL dataset against the default HXL schema or a customised one, and get a list of warnings and errors.