David Megginson edited this page Aug 25, 2015 · 13 revisions

The most common way of working with HXL-tagged datasets in libhxl is through filters. A filter is a mini-program that performs a single operation on incoming HXL data, then passes the result on, possibly to other filters. libhxl supports the following filters:

Filter chains

Filters often work in chains. For example, the following command-line sequence selects rows where #org is "Red Cross", counts the number of rows for each #adm1, then renames the generic #meta+count column to #output+activities (assuming that each row represents an activity):

hxlselect -q 'org=Red Cross' | hxlcount -t adm1 | hxlrename -r 'meta+count:output+activities'

Here is the same sequence inside a Python program:

source = hxl.data(url).with_rows('org=Red Cross').count('adm1').rename_columns('meta+count:output+activities')
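
To make the chaining idea concrete without relying on libhxl itself, here is a minimal sketch in plain Python using generators. It is not how libhxl is implemented: the function names (`select_rows`, `count_rows`, `rename_column`) and the sample data are hypothetical, invented only for this illustration. Each function plays the role of one filter, consuming rows and passing its result on to the next.

```python
# Illustrative only: these helpers are hypothetical and are NOT part of libhxl.
from collections import Counter


def select_rows(rows, tag, value):
    """Pass through only the rows whose `tag` column equals `value`."""
    for row in rows:
        if row.get(tag) == value:
            yield row


def count_rows(rows, tag):
    """Count rows per distinct value of `tag`, adding a #meta+count column."""
    counts = Counter(row.get(tag) for row in rows)
    for value, n in sorted(counts.items()):
        yield {tag: value, '#meta+count': n}


def rename_column(rows, old, new):
    """Rename one column, leaving all other columns untouched."""
    for row in rows:
        yield {(new if key == old else key): val for key, val in row.items()}


# Made-up sample data: each row represents one activity.
data = [
    {'#org': 'Red Cross', '#adm1': 'Coast'},
    {'#org': 'Red Cross', '#adm1': 'Coast'},
    {'#org': 'UNICEF', '#adm1': 'Coast'},
    {'#org': 'Red Cross', '#adm1': 'Plains'},
]

# Chain the three filters: select, then count, then rename.
chain = rename_column(
    count_rows(select_rows(data, '#org', 'Red Cross'), '#adm1'),
    '#meta+count',
    '#output+activities',
)
result = list(chain)
```

Because each step only consumes rows from the step before it, filters can be reordered or swapped without touching the others, which is the property the command-line pipeline and the chained Python method calls both rely on.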

The HXL Proxy is a web application that lets you define these filter chains in your browser, then apply them to any online HXL dataset.
