Releases: ssl-hep/ServiceX_frontend

Version 2.4.1

27 Oct 16:45

Provides the same maximum version range for pyarrow as coffea does.

PyHEP Upgrade

13 Aug 00:34
0f78b8c

Many issues were fixed after trying to run against the 70 TB of CMS Run 1 data.

  • Supports Python 3.9
  • Will now report supported return types (parquet, root)
  • Long filenames are hashed in the local cache to avoid OS limitations
  • A request title can be passed to services to "name" the transform
  • Supports a list of URLs or a single URL as a file source, as well as the more traditional dataset identifiers
  • Supports deleting a single data file or a query status file
  • api_endpoints now have names, not just types
  • A local file can be written that matches query hashes with request IDs. It can safely be checked into a repo so that other people's queries can be quickly re-used.
  • Better status updates while running and downloading, and support for Python widgets in VS Code.

Bug fix: ignore title when calculating hash

12 Aug 01:41
490b2e9

  • Make sure the title is ignored when calculating the query hash; it makes no difference to how the data is calculated
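
As a rough illustration of the idea (not the library's actual implementation), a request hash that ignores the title could look like the sketch below. The function name `query_hash` and the request keys `did`, `selection`, and `title` are hypothetical and chosen only for this example.

```python
import hashlib
import json


def query_hash(request: dict) -> str:
    """Hash a transform request, ignoring the purely cosmetic title.

    Hypothetical helper: the request layout here is an assumption.
    """
    # Drop the title before hashing: it names the transform but does not
    # change the data that comes back.
    relevant = {key: value for key, value in request.items() if key != "title"}
    # sort_keys gives a stable serialization regardless of insertion order.
    payload = json.dumps(relevant, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


first = {"did": "some:dataset", "selection": "(query)", "title": "First try"}
second = {"did": "some:dataset", "selection": "(query)", "title": "Second try"}

# Same dataset and selection, different titles: the hashes match, so the
# second request can re-use the cached result of the first.
assert query_hash(first) == query_hash(second)
```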

Dataset Name

08 Aug 00:51
8592816

Pre-release
  • Add a human-readable ("English") property to the ServiceX dataset that contains its name. The name can be quite long, depending on the dataset.

Bug Fix: Default Resolution

07 Aug 17:07
16af18a

Pre-release

Streamline and fix the logic we introduced to parse users' config files and integrate default values

Post PyHEP Release

04 Aug 16:20
f4a2f41

Pre-release

This is the first beta of 2.4. While we believe it is feature complete, there is still some wider testing that needs to happen. The goal of this release is to support the full re-analysis of the CMS Run 1 Higgs.

New Features:

  • You can specify a single http:// or root:// file as input for a single file dataset.

  • You can specify a list of http:// and/or root:// files. They will be processed by ServiceX as long as it has permission to access the data.

  • A title can be given to each transform

  • Add the ability to query a dataset for the data types it will return. This enables automatic data type discovery (required to keep the interface sensible in coffea and other upstream libraries).

  • Python 3.9 now supported

  • Add support for the CMS Run 1 AOD backend type.

  • Caching

    • Analysis Cache - one can create and check in a JSON file that maps queries to backend request IDs. This means that others can re-run and just download the data, rather than having to re-transform the data for the same queries.
    • A user can delete a data file from the local cache and it will automatically be re-downloaded
    • If a query status cache file is removed, it will be automatically re-fetched
  • Configuration:

    • Endpoints can now have names rather than just types, supporting more than one backend of a single type (e.g. two uproot backends)
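
The Analysis Cache item above can be pictured as a small JSON file that maps a query hash to a backend request ID. The sketch below shows that idea only; the helper names, the file name, and the flat `{hash: request_id}` layout are assumptions, not the format the library actually writes.

```python
import json
from pathlib import Path
from typing import Optional


def lookup_request_id(cache_file: Path, query_hash: str) -> Optional[str]:
    """Return the request ID recorded for this query hash, if any."""
    if not cache_file.exists():
        return None
    mapping = json.loads(cache_file.read_text())
    return mapping.get(query_hash)


def remember_request_id(cache_file: Path, query_hash: str, request_id: str) -> None:
    """Record a query hash -> request ID pair in the shared cache file."""
    mapping = json.loads(cache_file.read_text()) if cache_file.exists() else {}
    mapping[query_hash] = request_id
    # Sorted keys and indentation keep diffs small when the file is
    # checked into a repository.
    cache_file.write_text(json.dumps(mapping, indent=2, sort_keys=True))
```

A collaborator who checks out the repository would call `lookup_request_id` first and only submit a new transform when it returns `None`.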

Bug Fixes:

  • If the backend has lost the data, automatically resubmit the query. This was broken when streaming URLs or files.
  • Transforms that are marked Fatal are now correctly cleared from the local cache, so they can be re-run
  • When a transform with lots of files fails, the error report will be truncated to the result from 20 different files, rather than... all 3000.
  • When a notebook is run under Visual Studio Code, the progress bars are now shown correctly (for processing and download).
  • StreamInfoUrl is now exported
  • Protect against filenames that are so long that the OS can't handle them. In particular, fix the current implementation so it has a more robust hashing mechanism for the modified filename.
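
One way to protect against over-long filenames, as described in the last item above, is to replace the front of the name with a digest of the full name. This is a minimal sketch of that general technique and not the library's code; `safe_cache_name` and the 200-character limit are assumptions made for the example.

```python
import hashlib

# Assumed limit for this sketch. Many filesystems cap a single name at
# 255 bytes, so 200 characters leaves some headroom for ASCII names.
MAX_NAME_LENGTH = 200
DIGEST_LENGTH = 32


def safe_cache_name(name: str, max_length: int = MAX_NAME_LENGTH) -> str:
    """Return a name no longer than max_length characters.

    Short names pass through unchanged. Long names become
    "<digest>-<tail of original name>", so the file extension survives
    and two different long names do not collide on a shared tail.
    """
    if max_length <= DIGEST_LENGTH + 1:
        raise ValueError("max_length is too small to hold the digest")
    if len(name) <= max_length:
        return name
    # Hash the full original name so the result is deterministic.
    digest = hashlib.sha256(name.encode("utf-8")).hexdigest()[:DIGEST_LENGTH]
    keep = max_length - DIGEST_LENGTH - 1
    return f"{digest}-{name[-keep:]}"
```

Note that the sketch counts characters, not bytes; a byte-accurate version would measure the encoded name instead.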

In Progress:

  • Added logging information to support debugging the local machine downloading. We aren't saturating good connections and it isn't clear why that is happening yet.

Fixing up a new include

28 Jun 04:08

Pre-release

Trying to track down import errors, cleaning up how we include other items

Export DatasetType properly

28 Jun 03:46

Pre-release

So that others downstream can fetch it correctly

Title and File List

26 Jun 08:45

Pre-release

Two new features:

  • Can add a title to each request using the title argument with the get_xxx methods.
  • Instead of a dataset identifier (DID), one can specify a list of http:// or root:// files to access directly.
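
To make the second feature concrete, the sketch below shows one way a client could tell a dataset identifier apart from a file list before building a request. It is an illustration only: `normalize_source` and the `did` / `file-list` keys are hypothetical names, not the library's API or wire format.

```python
from typing import Dict, List, Sequence, Union

# Only the two schemes named in these notes are recognized in this sketch.
URL_PREFIXES = ("http://", "root://")


def _is_url(text: str) -> bool:
    return text.startswith(URL_PREFIXES)


def normalize_source(
    source: Union[str, Sequence[str]]
) -> Dict[str, Union[str, List[str]]]:
    """Classify a source as either a DID or a list of file URLs."""
    if isinstance(source, str):
        # A single URL is treated as a one-file dataset.
        if _is_url(source):
            return {"file-list": [source]}
        return {"did": source}
    files = list(source)
    if not files or not all(_is_url(item) for item in files):
        raise ValueError("a file list must contain only http:// or root:// URLs")
    return {"file-list": files}
```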

Add cms run1 aod default formats

07 Jun 10:40

Pre-release

Add root as the default format that is returned