Overview: Anonymized job-level records from the Eagle high-performance computing (HPC) system at the National Laboratory of the Rockies (NLR). Each record represents a Slurm batch job with scheduling metadata, resource requests, resource utilization, CPU/GPU energy consumption, and efficiency metrics. Sensitive fields (user, account, job name) are replaced with cryptographic hashes.
System & Timeframe: Eagle was a 2,000-node, 8-petaflop system operated at NLR from 2019 to 2024. The data cover the full operational lifetime of the system. Slurm data was processed nightly; timestamps are in Mountain Time. Funding was provided by the U.S. Department of Energy, EERE.
Files:
- esif.hpc.eagle.job-anon.zip — Core anonymized job records (Hive-partitioned Parquet)
- esif.hpc.eagle.job-anon-energy-metrics.zip — Same records with additional iLO and Ganglia energy metrics
- datacard.md — Full dataset documentation
~13.8 million rows, 62 variables. Readable with PyArrow, pandas, DuckDB, Apache Spark, or any Parquet-compatible tool.
Data Collection: Jobs collected via sacct through a pipeline: Eagle Jobs API → Redpanda → StreamSets → HPCMON API → PostgreSQL. Node-level power from iLO (HP Integrated Lights-Out); GPU power from Ganglia monitoring, joined to jobs via node lists and time ranges.
Preprocessing:
- Anonymization of name, user, and account fields via cryptographic hashing
- Derived columns: queue_wait, cpu_eff, max_mem_eff
- Simplified job state mapping (e.g., "CANCELLED BY 12345" → "CANCELLED")
- QoS accounting rules (buy-in, standby, or Slurm QoS value)
- CPU energy estimated from TDP (200 W, Intel Xeon Gold 6154, 18 cores)
- Timezone-aware columns (_tz) sourced from LEX accounting database to correctly handle DST transitions
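The derived columns and state simplification listed above can be illustrated with a minimal sketch. The formulas here are assumptions, not the documented pipeline logic: `cpu_eff` is taken to be consumed CPU time divided by available CPU time (elapsed time multiplied by processors used), and the TDP estimate is taken to scale the 200 W package TDP by the fraction of an 18-core CPU in use. All function and parameter names are hypothetical.

```python
from datetime import datetime, timedelta

# Values from the TDP note above (200 W, 18-core Intel Xeon Gold 6154).
TDP_WATTS = 200.0
CORES_PER_CPU = 18


def simplify_state(raw_state):
    """Keep only the first word, e.g. 'CANCELLED BY 12345' -> 'CANCELLED'."""
    return raw_state.strip().split(" ", 1)[0]


def queue_wait(submit_time, start_time):
    """Time a job spent waiting in the queue before it started."""
    return start_time - submit_time


def cpu_eff(total_cpu_seconds, elapsed_seconds, processors_used):
    """Assumed definition: CPU time consumed / CPU time available."""
    available = elapsed_seconds * processors_used
    # Jobs that never ran have no available CPU time; return None for them.
    return total_cpu_seconds / available if available > 0 else None


def tdp_energy_wh(processors, wallclock_hours):
    """Assumed estimate: per-core share of the CPU TDP times runtime, in Wh."""
    return processors / CORES_PER_CPU * TDP_WATTS * wallclock_hours


if __name__ == "__main__":
    print(simplify_state("CANCELLED BY 12345"))
    print(queue_wait(datetime(2021, 1, 1, 8, 0), datetime(2021, 1, 1, 8, 30)))
    print(cpu_eff(1800.0, 100.0, 36))
    print(tdp_energy_wh(36, 2.0))
```

Consult datacard.md for the definitions actually used to compute these columns before comparing against values produced by a sketch like this one.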
Key Variables:
Scheduling: job_id, partition, state_simple, submit_time_tz, start_time_tz, end_time_tz, queue_wait
Resources: nodes_req/used, processors_req/used, memory_req, wallclock_req/used, gpus_requested
Efficiency: cpu_eff, max_mem_eff
Energy: cpu_energy_tdp_estimated_max/used_watt_hours, node_energy_total_watt_hours (iLO), gpu0/1_energy_total_watt_hours (Ganglia)
Partitions: bigmem, bigmem-8600, bigscratch, csc, dav, ddn, debug, gpu, haswell, long, mono, short, standard
Job States: CANCELLED, COMPLETED, FAILED, NODE_FAIL, OUT_OF_MEMORY, PENDING, RUNNING, TIMEOUT
QoS Levels: Unknown, normal, buy-in, debug, penalty, high, standby
Important Notes:
- Non-_tz timestamp columns may be off by one hour across DST boundaries; use _tz columns for time difference calculations
- Energy fields are null for jobs without monitoring coverage
- Job step records and raw Slurm JSONB fields are excluded from this extract
- Do not attempt to re-identify individuals from hashed fields
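The DST note above can be illustrated with a small, self-contained sketch. It assumes the non-_tz columns behave like naive local wall-clock times, which is an interpretation rather than a documented fact, and it uses a hypothetical job that spans the spring-forward transition on 2021-03-14. Fixed UTC offsets stand in for Mountain Standard and Mountain Daylight Time so the example does not depend on a time zone database.

```python
from datetime import datetime, timedelta, timezone

# Fixed UTC offsets for Mountain Time (illustration only).
MST = timezone(timedelta(hours=-7), "MST")
MDT = timezone(timedelta(hours=-6), "MDT")

# Hypothetical job: starts 01:30 MST, ends 03:30 MDT on the day
# that 02:00 MST became 03:00 MDT.
start_naive = datetime(2021, 3, 14, 1, 30)
end_naive = datetime(2021, 3, 14, 3, 30)

# Attach the offset that was in effect at each instant.
start_aware = start_naive.replace(tzinfo=MST)
end_aware = end_naive.replace(tzinfo=MDT)

# Naive subtraction compares wall-clock readings and ignores the skipped hour.
naive_elapsed = end_naive - start_naive
# Aware subtraction converts both instants to UTC first.
aware_elapsed = end_aware - start_aware

print("naive:", naive_elapsed)
print("aware:", aware_elapsed)
```

The naive difference is two hours while the job actually ran for one, which is the kind of one-hour error the _tz columns are meant to avoid.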
| Name | Size | Type | Resource Description | History |
|---|---|---|---|---|
| esif.hpc.eagle.job-anon.zip | 870.1 MB | Archive | Eagle Jobs Dataset (Zipped Parquet/Hive Dataset). Range: 11/2018 - 06/2024. MD5sum: a06527de28cbf207d1e743822978b9b4. | |
| esif.hpc.eagle.job-anon-energy-metrics.zip | 1.3 GB | Archive | Eagle Jobs + Additional Energy Metrics Dataset (Zipped Parquet/Hive Dataset). Range: 11/2018 - 06/2024. MD5sum: cc60eac4d10b38a1bbfe3ef7dede5590. | |
| datacard.md | 18 KB | Document | Genesis formatted datacard that describes this dataset. | |
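The MD5 sums listed in the table can be used to check a download before unzipping it. The following is a minimal sketch using only the Python standard library; the helper names `md5sum` and `verify_download` are hypothetical, and the file paths assume the archives were saved under their original names.

```python
import hashlib
from pathlib import Path

# Checksums as listed in the table above.
EXPECTED_MD5 = {
    "esif.hpc.eagle.job-anon.zip": "a06527de28cbf207d1e743822978b9b4",
    "esif.hpc.eagle.job-anon-energy-metrics.zip": "cc60eac4d10b38a1bbfe3ef7dede5590",
}


def md5sum(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with Path(path).open("rb") as fh:
        # Read fixed-size chunks so large archives are not loaded into memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_md5):
    """True if the file's MD5 matches the expected hex digest."""
    return md5sum(path) == expected_md5.strip().lower()
```

For example, `verify_download("esif.hpc.eagle.job-anon.zip", EXPECTED_MD5["esif.hpc.eagle.job-anon.zip"])` should return `True` for an intact copy of the core archive in the current directory.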
Clark, Struan, Matt Selensky, and Kevin Menear. 2025. "NLR HPC Eagle Jobs Data and Additional Energy Metrics." NLR Data Catalog. Golden, CO: National Laboratory of the Rockies. Last updated: April 22, 2026. DOI: 10.7799/3023273.
