Response of HPC hardware to neutron radiation at the dawn of exascale

dc.contributor.author: Bustos, A.
dc.contributor.author: Rubio-Montero, A.J.
dc.contributor.author: Méndez, R.
dc.contributor.author: Rivera, S.
dc.contributor.author: González, F.
dc.contributor.author: Campo, X.
dc.contributor.author: Asorey, H.
dc.contributor.author: Mayo-García, R.
dc.date.accessioned: 2024-01-29T08:52:25Z
dc.date.available: 2024-01-29T08:52:25Z
dc.date.issued: 2023
dc.description.abstract: Every computation presents a small chance that an unexpected phenomenon ruins or modifies its output. Computers are prone to errors that, although they may be very unlikely, are hard, expensive or simply impossible to avoid. In the exascale, with thousands of processors involved in a single computation, those errors are especially harmful because they can corrupt or distort the results, wasting human and material resources. In the present work, we study the effect of ionizing radiation on several pieces of commercial hardware that are very common in modern supercomputers. Aiming to reproduce the natural radiation that could arise, CPUs (Xeon, EPYC) and GPUs (A100, V100, T4) are subjected to a known flux of neutrons coming from two radioactive sources, namely Cf and Am-Be, in a special irradiation facility. The working hardware is irradiated under supervision to quantify any errors that appear. Once the hardware response is characterised, we are able to scale down the radiation intensity and to estimate the effects on standard data centres. This can help administrators and researchers to develop their contingency plans and protocols. [es_ES]
dc.identifier.citation: Bustos, A., Rubio-Montero, A.J., Méndez, R. et al. Response of HPC hardware to neutron radiation at the dawn of exascale. J Supercomput 79, 13817–13838 (2023). [es_ES]
dc.identifier.uri: https://hdl.handle.net/20.500.14855/2220
dc.language.iso: eng [es_ES]
dc.publisher: Springer [es_ES]
dc.rights.accessRights: open access [es_ES]
dc.subject: Exascale [es_ES]
dc.subject: silent errors [es_ES]
dc.subject: neutron radiation [es_ES]
dc.subject: atmospheric radiation [es_ES]
dc.subject: induced failures [es_ES]
dc.subject: HPC [es_ES]
dc.title: Response of HPC hardware to neutron radiation at the dawn of exascale [es_ES]
dc.type: journal article [es_ES]

Files

Original bundle

Name: articleRev_05.pdf
Size: 8.36 MB
Format: Adobe Portable Document Format