NERSC: Powering Scientific Discovery Since 1974

Storage Resource Unit (SRU) Formula Coefficients

The coefficients in the Storage Resource Unit (SRU) formula were arrived at from the following considerations:

- The formula should help influence user behaviour towards efficient use of the storage resource.
- The formula should reflect the relative costs of "doing business".

From these considerations we adopted file counts, bytes stored and I/O transfers as the three minimum factors that needed to be included in the formula. Hardware costs are related to these three areas in the following ways:

1.  Costs driven by number of files:
  - Metadata CPUs, disks and backup systems
  - Additional tape drives required to overcome the latency of small file sizes
2.  Costs driven by the amount of space used:
  - Library
  - Media
  - Repack CPUs and drives
3.  Costs driven by bandwidth requirements:
  - Multiple tape drives
  - Large capacity high speed disks
  - High speed networking and network switches
  - Data transfer CPUs ("movers")

We considered NERSC's costs of storage operation and roughly assigned them to these three areas. Ignoring file counts, we found that storage and I/O accounted for roughly 40% and 60% of our costs respectively. We decided that storage and I/O would keep these rough proportions and that we would introduce file counts to be 10-20% of the overall formula.

At the time the cost of monitoring space usage was very high and required a complete directory listing.  Therefore we limited these operations to one or two per month.  This led to an accounting granularity of one month.  Based on this granularity the coefficients in the formula were set to generate 1 SRU per month for the average user with 1GB stored in the system.

The Formula

At the time the formula was set (1999) there were about 20 to 50 TB in NERSC storage.  The amount of I/O, on a monthly basis, was observed to be about 10% of the amount stored (as of 2003 it is 13% to 14%) so the formula was

SRUs = (GB stored) + 10 x (GB of I/O)

The average file size was about 10 MB, so there were about 100 files per GB stored. For the file charge to amount to about 1/4 of the space charge, the file factor times 100 must equal about 0.25, which led to a file factor of 0.0025. This was subsequently raised slightly to 0.003 to have more influence on users, so the hypothetical user who stores 1 GB in 100 files and does 0.12 GB of I/O per month sees the following charges:

Number of files times 0.003 = 0.3
Space stored                = 1.0
I/O times 10                = 1.2
Total                       = 2.5
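
The arithmetic for this hypothetical user can be checked with a short Python sketch. The function and constant names are illustrative only and are not taken from any NERSC accounting software; the coefficients are the unscaled ones quoted in the text (0.003 per file, 1 per GB stored, 10 per GB of I/O).

```python
# Unscaled coefficients from the text (the names are hypothetical).
FILE_FACTOR = 0.003   # SRUs per file
SPACE_FACTOR = 1.0    # SRUs per GB stored
IO_FACTOR = 10.0      # SRUs per GB of I/O


def unscaled_charges(files, gb_stored, gb_io):
    """Return the per-component charges and their total, before scaling."""
    charges = {
        "files": FILE_FACTOR * files,
        "space": SPACE_FACTOR * gb_stored,
        "io": IO_FACTOR * gb_io,
    }
    # The total is computed before it is added to the dictionary.
    charges["total"] = sum(charges.values())
    return charges


# Hypothetical user: 1 GB stored in 100 files, 0.12 GB of I/O per month.
example = unscaled_charges(files=100, gb_stored=1.0, gb_io=0.12)
for name, value in example.items():
    print(f"{name:>5}: {value:.2f}")
```

The three components should come out to roughly 0.3, 1.0 and 1.2, for a total of about 2.5.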

It was also decided that it would be convenient if the "average" user with 1 GB stored would accrue 1 SRU/month, so the above rates were scaled by 0.4 (that is, 1/2.5), leading to

monthly user SRUs = 0.0012*files + 0.4*GB_stored + 4*GB_I/O
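
As a minimal sketch, the monthly formula can be written as a small Python function. The function name and argument names are assumptions made for illustration; only the three coefficients come from the text.

```python
def monthly_srus(files, gb_stored, gb_io):
    """Monthly SRU charge using the scaled coefficients quoted in the text."""
    return 0.0012 * files + 0.4 * gb_stored + 4.0 * gb_io


# The hypothetical "average" user: 100 files, 1 GB stored, 0.12 GB of I/O.
average_user = monthly_srus(files=100, gb_stored=1.0, gb_io=0.12)
print(round(average_user, 3))

# Share of the total contributed by the file-count term for this user.
file_share = (0.0012 * 100) / average_user
print(f"file share: {file_share:.0%}")
```

For this user the charge works out to about 1 SRU per month, with the file-count term contributing about 12% of the total, which falls inside the 10-20% range targeted for file counts.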

Later, when the accounting granularity became daily, the above file and space rates were divided by 30.5 days per month to get

daily user SRUs = 0.0000393*files + 0.0131147*space(GB) + 4.0*I/O(GB)

Multiplying the daily file and space rates by a typical 365-day year gives:

yearly user SRUs = 0.01436*files + 4.787*space(GB) + 4*I/O(GB)
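
The conversion between the monthly, daily and yearly coefficients can be reproduced with the sketch below. The dictionary and constant names are hypothetical. The code also assumes, based on the formulas above in which the I/O coefficient stays at 4, that only the file and space terms are rescaled with the accounting period, while I/O is charged per GB transferred regardless of period.

```python
DAYS_PER_MONTH = 30.5
DAYS_PER_YEAR = 365

# Monthly coefficients from the text.
MONTHLY = {"files": 0.0012, "space": 0.4, "io": 4.0}

# Assumption: the files and space terms are charged per unit of time, so
# they are rescaled; the I/O term is charged per GB transferred, so it is not.
daily = {
    "files": MONTHLY["files"] / DAYS_PER_MONTH,
    "space": MONTHLY["space"] / DAYS_PER_MONTH,
    "io": MONTHLY["io"],
}
yearly = {
    "files": daily["files"] * DAYS_PER_YEAR,
    "space": daily["space"] * DAYS_PER_YEAR,
    "io": daily["io"],
}

for label, coeffs in (("daily", daily), ("yearly", yearly)):
    print(
        f"{label} SRUs = {coeffs['files']:.7f}*files"
        f" + {coeffs['space']:.7f}*space(GB)"
        f" + {coeffs['io']:.1f}*I/O(GB)"
    )
```

The computed values should agree with the published coefficients to within rounding.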