1.4.2.9. Fatigue Life of Aluminum Alloy Specimens

## Background and Data

Generation: This data set comprises measurements of fatigue life (thousands of cycles until rupture) of rectangular strips of 6061-T6 aluminum sheeting, subjected to periodic loading with maximum stress of 21,000 psi (pounds per square inch), as reported by Birnbaum and Saunders (1958).

Purpose of Analysis: The goal of this case study is to select a probabilistic model, from among several reasonable alternatives, to describe the dispersion of the resulting measured values of life-length.

The original study, in the field of statistical reliability analysis, was concerned with predicting the failure times of a material subjected to a load that varies in time. It was well known that a structure designed to withstand a particular static load may fail sooner than expected under a dynamic load.

If a realistic model for the probability distribution of lifetime can be found, then it can be used to estimate the time by which a part or structure needs to be replaced to guarantee that the probability of failure does not exceed some maximum acceptable value, for example 0.1 %, while it is in service.
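In symbols, the replacement time described above is a low quantile of the lifetime distribution. The notation below ($T$ for the lifetime, $F$ for its cumulative distribution function, $p$ for the maximum acceptable failure probability) is introduced here for illustration and is not taken from the original study. Assuming $F$ is continuous and strictly increasing:

$$
F(t_p) = \Pr(T \le t_p) = p, \qquad t_p = F^{-1}(p), \qquad \text{for example } p = 0.001 .
$$

A part replaced no later than $t_p$ then has a probability of at most $p$ of failing while in service. Because $p$ is small, $t_p$ lies far in the lower tail of the distribution, which is why the choice of model matters so much for this kind of estimate.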

The chapter of this eHandbook that is concerned with the assessment of product reliability contains additional material on statistical methods used in reliability analysis. This case study is meant to complement that chapter by showing the use of graphical and other techniques in the model selection stage of such analysis.

When there is no cogent reason to adopt a particular model, or when none of the models under consideration seems adequate for the purpose, one may opt for a non-parametric statistical method, for example to produce tolerance bounds or confidence intervals.

A non-parametric method does not rely on the assumption that the data are like a sample from a particular probability distribution that is fully specified up to the values of some adjustable parameters. For example, the Gaussian probability distribution is a parametric model with two adjustable parameters.

The price to be paid when using non-parametric methods is loss of efficiency, meaning that they may require more data for statistical inference than a parametric counterpart would, if applicable. For example, non-parametric confidence intervals for model parameters may be considerably wider than the intervals that could be obtained if the underlying distribution were identified correctly. Such identification is what we will attempt in this case study.
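The efficiency trade-off can be illustrated with the fatigue-life data listed at the end of this section. The following is a minimal sketch in Python (used here for illustration only; the analyses in this case study were produced with R). It compares a distribution-free 95 % confidence interval for the median, built from order statistics, with an interval for the same quantity computed under a Gaussian model, where the median equals the mean. The Gaussian model is an assumption made solely for this comparison and is not a model recommended by the case study. The normal quantile is used in place of Student's t, which is a large-sample approximation.

```python
from math import comb, sqrt
from statistics import NormalDist, mean, median, stdev

# Fatigue life (thousands of cycles) of the 101 specimens in this case study.
life = sorted([
    370, 1016, 1235, 1419, 1567, 1820,
    706, 1018, 1238, 1420, 1578, 1868,
    716, 1020, 1252, 1420, 1594, 1881,
    746, 1055, 1258, 1450, 1602, 1890,
    785, 1085, 1262, 1452, 1604, 1893,
    797, 1102, 1269, 1475, 1608, 1895,
    844, 1102, 1270, 1478, 1630, 1910,
    855, 1108, 1290, 1481, 1642, 1923,
    858, 1115, 1293, 1485, 1674, 1940,
    886, 1120, 1300, 1502, 1730, 1945,
    886, 1134, 1310, 1505, 1750, 2023,
    930, 1140, 1313, 1513, 1750, 2100,
    960, 1199, 1315, 1522, 1763, 2130,
    988, 1200, 1330, 1522, 1768, 2215,
    990, 1200, 1355, 1530, 1781, 2268,
    1000, 1203, 1390, 1540, 1782, 2440,
    1010, 1222, 1416, 1560, 1792,
])
n = len(life)


def binom_cdf(m, size):
    """P(B <= m) for B ~ Binomial(size, 1/2), computed exactly."""
    return sum(comb(size, j) for j in range(m + 1)) / 2 ** size


# Distribution-free interval [x_(k), x_(n-k+1)] for the median.
# Its coverage is 1 - 2 * P(B <= k - 1); take the largest k keeping it >= 95 %.
k = 1
while binom_cdf(k, n) <= 0.025:
    k += 1
lower, upper = life[k - 1], life[n - k]
coverage = 1 - 2 * binom_cdf(k - 1, n)

# Interval under the (assumed, illustrative) Gaussian model: mean +/- z * s / sqrt(n).
z = NormalDist().inv_cdf(0.975)
half = z * stdev(life) / sqrt(n)
center = mean(life)

print(f"distribution-free: [{lower}, {upper}], width {upper - lower}, "
      f"coverage {coverage:.3f}")
print(f"Gaussian model:    [{center - half:.1f}, {center + half:.1f}], "
      f"width {2 * half:.1f}")
```

For these data the distribution-free interval comes out wider than the Gaussian-based one. The narrower interval is only trustworthy if the Gaussian assumption is reasonable, which is exactly the kind of question that model selection has to address.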

It should be noted, and this point will be stressed later in the case study, that the very exercise of selecting a model often contributes substantially to the uncertainty of the conclusions derived after the selection has been made.

Software: The analyses used in this case study can be generated using R code. The reader can download the data as a text file.

Data: The following data are used for this case study.

```
370 1016 1235 1419 1567 1820
706 1018 1238 1420 1578 1868
716 1020 1252 1420 1594 1881
746 1055 1258 1450 1602 1890
785 1085 1262 1452 1604 1893
797 1102 1269 1475 1608 1895
844 1102 1270 1478 1630 1910
855 1108 1290 1481 1642 1923
858 1115 1293 1485 1674 1940
886 1120 1300 1502 1730 1945
886 1134 1310 1505 1750 2023
930 1140 1313 1513 1750 2100
960 1199 1315 1522 1763 2130
988 1200 1330 1522 1768 2215
990 1200 1355 1530 1781 2268
1000 1203 1390 1540 1782 2440
1010 1222 1416 1560 1792
```
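As a first check that the data have been transcribed correctly, the listing above can be parsed and summarized. The following is a minimal sketch in Python (for illustration only; the case study's own analyses use R). The variable names are hypothetical, and the data are pasted inline rather than read from the downloadable text file, whose name is not given here. The sorted values run down the columns of the listing, so the parsed values are sorted again after reading.

```python
from statistics import mean, median, stdev

# The six-column listing, pasted verbatim (thousands of cycles until rupture).
raw = """
370 1016 1235 1419 1567 1820
706 1018 1238 1420 1578 1868
716 1020 1252 1420 1594 1881
746 1055 1258 1450 1602 1890
785 1085 1262 1452 1604 1893
797 1102 1269 1475 1608 1895
844 1102 1270 1478 1630 1910
855 1108 1290 1481 1642 1923
858 1115 1293 1485 1674 1940
886 1120 1300 1502 1730 1945
886 1134 1310 1505 1750 2023
930 1140 1313 1513 1750 2100
960 1199 1315 1522 1763 2130
988 1200 1330 1522 1768 2215
990 1200 1355 1530 1781 2268
1000 1203 1390 1540 1782 2440
1010 1222 1416 1560 1792
"""

# Split on whitespace, convert to integers, and sort.
life = sorted(int(token) for token in raw.split())

summary = {
    "n": len(life),
    "min": life[0],
    "median": median(life),
    "mean": round(mean(life), 2),
    "sd": round(stdev(life), 2),
    "max": life[-1],
}
for name, value in summary.items():
    print(f"{name:>6}: {value}")
```

The listing contains 101 observations ranging from 370 to 2440, with median 1416 and mean approximately 1400.91. The smallest value, 370, is well separated from the next smallest, 706, which is worth keeping in mind when models are compared.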