"
NCP and GPUaaS provider hardware environments can be large. When a fault occurs, the provider needs to know how quickly it must be corrected to meet the availability SLA, which is typically expressed as a number of nines, such as five nines (99.999%) or six nines (99.9999%). This worksheet takes the expected availability SLA as input and calculates the mean-time-to-repair (MTTR) required for each type of GPUaaS/NCP hardware infrastructure fault. The mean-time-between-failure (MTBF) numbers are taken from a study by Facebook that lists the fault types and the number of faults of each type observed over a specific number of GPU-hours.
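
The relationship between the SLA, MTBF, and the required MTTR can be sketched in a few lines of Python. This is a minimal illustration, not the worksheet's actual formulas: it assumes the standard steady-state definition availability = MTBF / (MTBF + MTTR), and it assumes MTBF for a fault type is estimated as observed GPU-hours divided by the number of faults of that type. The function names and the sample figures (1,000,000 GPU-hours, 50 faults) are hypothetical and are not taken from the Facebook study.

```python
def nines_to_availability(nines: int) -> float:
    """Convert a count of nines (e.g. 5 for 99.999%) to a fraction."""
    if nines < 1:
        raise ValueError("nines must be a positive integer")
    return 1.0 - 10.0 ** (-nines)


def mtbf_hours(observed_gpu_hours: float, fault_count: int) -> float:
    """Estimate MTBF as observed GPU-hours per fault (assumed estimator)."""
    if fault_count <= 0:
        raise ValueError("fault_count must be positive")
    return observed_gpu_hours / fault_count


def required_mttr_hours(mtbf: float, availability: float) -> float:
    """Solve availability = MTBF / (MTBF + MTTR) for MTTR."""
    if not 0.0 < availability <= 1.0:
        raise ValueError("availability must be in (0, 1]")
    return mtbf * (1.0 - availability) / availability


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    mtbf = mtbf_hours(observed_gpu_hours=1_000_000, fault_count=50)
    for nines in (5, 6):
        target = nines_to_availability(nines)
        mttr = required_mttr_hours(mtbf, target)
        print(f"{nines} nines: MTTR must not exceed {mttr * 60:.2f} minutes")
```

With these hypothetical inputs the MTBF is 20,000 GPU-hours, so a five-nines target allows roughly 12 minutes of repair time per fault. Each additional nine shrinks the allowed MTTR by about a factor of ten, which is why the target SLA drives the repair-time budget so strongly.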