BOOTSTRAPPING & DISTRIBUTION FITTING

Working with measurements is challenging. Any test or measurement campaign provides insight and confusion at the same time. Variability is always present and unavoidable. Current practice often oversimplifies the data analysis by using a histogram, assuming a normal distribution, or even just picking a single value. Bootstrapping and distribution fitting both help answer important questions:

How sure am I that I have measured enough outliers?
What is a proper parameter estimate for further use in my work?
What is the likelihood that I will encounter the worst-case situation in the field?
How can I describe the variability of my parameter correctly?

Want to learn more or perform a distribution fitting operation yourself? Subscribe to the newsletter and use the API yourself for free.

Bootstrap analysis

A bootstrap analysis shows how accurately you can estimate the mean value of a parameter of interest. Measurements are often expensive, so only a limited number can be performed. Figure 1 shows an example of 17 measurements of the Young's modulus (stiffness) of a soil type. This is an important parameter for estimating deformations of, for example, a temporarily placed foundation. Bootstrapping is a technique for estimating the possible variability in the mean value. It uses random sampling with replacement: the measurements are resampled many times, and the mean of each resample is recorded. Since a mean value is often used to estimate deformations, it is important to know how well it can actually be estimated from the 17 measurements available.

Figure 1, Bootstrap analysis performed on Young's moduli estimated during a site investigation campaign
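The resampling idea can be sketched in a few lines of Python. This is a minimal illustration of a percentile bootstrap for the mean, not the implementation behind the API. The function name `bootstrap_mean_interval`, the number of resamples and the measurement values are all assumptions made for the example (the values are hypothetical, not the data behind Figure 1).

```python
import random
import statistics

def bootstrap_mean_interval(values, n_resamples=10_000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean of `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Each resample draws n values with replacement from the measurements.
    means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_resamples)
    )
    tail = (1.0 - level) / 2.0
    lower = means[int(tail * n_resamples)]
    upper = means[int((1.0 - tail) * n_resamples) - 1]
    return lower, upper

# Hypothetical Young's modulus measurements in MPa (17 values, illustration only).
e_measurements = [2.1, 2.8, 3.0, 3.3, 3.6, 3.9, 4.1, 4.2, 4.4,
                  4.6, 4.9, 5.2, 5.5, 5.9, 6.3, 7.0, 8.4]

low, high = bootstrap_mean_interval(e_measurements)
print(f"sample mean: {statistics.fmean(e_measurements):.2f} MPa")
print(f"95% bootstrap interval of the mean: {low:.2f} - {high:.2f} MPa")
```

The width of the printed interval indicates how much the mean could shift if the campaign were repeated with the same number of measurements.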

Table 1 shows that the best estimate of the mean can vary over a range of 40-50% of its value (3.72 - 5.75 MPa around a conventional estimate of 4.54 MPa). As a result, a settlement anywhere between 76 mm and 148 mm can be expected. Conventional estimates based on the mean alone might lead to an unsafe design when only limited data is available. Additionally, any safety factor applied on top of such an estimate is arbitrary and only provides perceived safety. A bootstrap analysis is essential for a proper interpretation of calculations based on measurements.

Description	Conventional approach	Bootstrap analysis
Force (F)	4000 kN	4000 kN
Foundation diameter (D)	3 m	3 m
Poisson's ratio (ν)	0.25	0.25
Young's modulus (E)	4.54 MPa	3.72 - 5.75 MPa
Settlement (ε)	117 mm	76 - 148 mm

Table 1, Comparison between conventional analysis and a bootstrap analysis

Distribution fitting

Fitting a probability density function allows everyone to describe theoretically the variability of a parameter that is measured multiple times. Unlike looking at the mean alone, it captures the amount and type of variation in the data. This makes it possible to estimate upper and lower bounds objectively, based on a predefined safety level. These analyses are often forgotten in practice and, if used, are performed in an oversimplified and incorrect manner. One step is often missed: determining which type of distribution fits best. Figure 2 shows a Quantile-Quantile (QQ) analysis of 30 measurements of the angle of internal friction (describing the shear strength of sand). The closer the dots are to the 'unit line', the better the fit of the distribution.

Currently only four distribution types are supported (Uniform, Normal, Lognormal and Exponential). Many other distributions exist, and these can be implemented to make the tool more complete. Reach out if you would like an update.

Figure 2, Assessing which distribution is most suitable for further use with a Quantile-Quantile (QQ) analysis
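A QQ comparison of this kind can be sketched as follows. The code fits each of the four candidate distributions with simple estimators, computes the theoretical quantile for each sorted measurement, and ranks the candidates by their root-mean-square distance from the unit line. The choice of estimators, the distance measure, the function names and the friction angle values are assumptions for illustration, and may differ from what the tool does.

```python
import math
import statistics

def qq_rmse(data, quantile_fn):
    """Root-mean-square distance of the QQ points from the unit line."""
    xs = sorted(data)
    n = len(xs)
    # Plotting positions (i + 0.5) / n for i = 0..n-1 avoid probabilities 0 and 1.
    probs = [(i + 0.5) / n for i in range(n)]
    return math.sqrt(sum((quantile_fn(p) - x) ** 2 for p, x in zip(probs, xs)) / n)

def candidate_quantile_functions(data):
    """Quantile functions of four candidates, fitted with simple estimators."""
    mean, sd = statistics.fmean(data), statistics.stdev(data)
    logs = [math.log(x) for x in data]  # requires strictly positive data
    log_normal = statistics.NormalDist(statistics.fmean(logs), statistics.stdev(logs))
    lo, hi = min(data), max(data)
    return {
        "uniform": lambda p: lo + p * (hi - lo),
        "normal": statistics.NormalDist(mean, sd).inv_cdf,
        "lognormal": lambda p: math.exp(log_normal.inv_cdf(p)),
        "exponential": lambda p: -mean * math.log(1.0 - p),
    }

def rank_distributions(data):
    """Return (rmse, name) pairs, best fit first."""
    fits = candidate_quantile_functions(data)
    return sorted((qq_rmse(data, q), name) for name, q in fits.items())

# Hypothetical friction angles in degrees (30 values, illustration only).
angles = [28.4, 29.6, 30.3, 30.9, 31.4, 31.8, 32.2, 32.5, 32.9, 33.2,
          33.5, 33.7, 34.0, 34.2, 34.4, 34.6, 34.8, 35.1, 35.3, 35.6,
          35.9, 36.2, 36.5, 36.9, 37.3, 37.8, 38.4, 39.0, 39.8, 41.1]

ranking = rank_distributions(angles)
for rmse, name in ranking:
    print(f"{name:12s} RMS distance from unit line: {rmse:.2f} deg")
```

A smaller distance means the dots lie closer to the unit line. Note that the ranking only says which candidate is best relative to the others, not whether it is good enough.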

A QQ analysis always identifies a best candidate, even when none of the candidates fits well, so on its own it cannot show whether sufficient data is available for a proper description of the variability. After the QQ analysis, a 'goodness of fit' test (a Kolmogorov-Smirnov test) is therefore performed on the best-fitting distribution. This test checks whether the distribution is actually good enough, or whether it might be unconservative or overconservative. When the test is successful (as stated in Figure 3), the distribution can be used to estimate upper and lower bounds of the parameter.

Figure 3, Distribution fitting of the best fitting type for the angle of internal friction
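The Kolmogorov-Smirnov check can be sketched as below. The statistic is the largest vertical gap between the empirical cumulative distribution and the fitted one, and it is compared with the standard asymptotic critical value. The function names, the 5% significance level and the data are assumptions for illustration. Note also that when the distribution parameters are estimated from the same data, as here, the standard critical value makes the test conservative, so this is a rough check rather than an exact one.

```python
import math
import statistics

def ks_statistic(data, cdf):
    """Largest vertical gap between the empirical CDF and a fitted CDF."""
    xs = sorted(data)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The empirical CDF jumps from i/n to (i+1)/n at x; check both sides.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def ks_critical_value(n, alpha=0.05):
    """Asymptotic critical value sqrt(-ln(alpha/2) / 2) / sqrt(n)."""
    return math.sqrt(-0.5 * math.log(alpha / 2.0)) / math.sqrt(n)

# Hypothetical friction angles in degrees (30 values, illustration only).
angles = [28.4, 29.6, 30.3, 30.9, 31.4, 31.8, 32.2, 32.5, 32.9, 33.2,
          33.5, 33.7, 34.0, 34.2, 34.4, 34.6, 34.8, 35.1, 35.3, 35.6,
          35.9, 36.2, 36.5, 36.9, 37.3, 37.8, 38.4, 39.0, 39.8, 41.1]

# Assumed best-fitting type for this sketch: a normal distribution.
fitted = statistics.NormalDist(statistics.fmean(angles), statistics.stdev(angles))

d = ks_statistic(angles, fitted.cdf)
critical = ks_critical_value(len(angles))
print(f"KS statistic D = {d:.3f}, critical value = {critical:.3f}")
print("fit accepted" if d < critical else "fit rejected")
```

If the statistic exceeds the critical value, the fitted distribution should not be used to derive bounds, however good it looked relative to the other candidates.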

A conventional method to determine the ultimate sliding capacity of a foundation would use a lower-bound estimate of 29 degrees and a safety factor of 1.2. A probabilistic approach, on the other hand, uses an objective confidence interval (95% in this case). This gives considerably more information. Table 2 shows the lower-bound, mean and upper-bound horizontal load that can be applied, based on the quantified variability of the angle of internal friction. The lower-bound allowed horizontal load (2144 kN) is about 16% higher than the conventional value (1848 kN), which makes application in tougher conditions feasible.

Description	Conventional approach	Lower bound	Best estimate	Upper bound
Vertical load (V)	4000 kN	4000 kN	4000 kN	4000 kN
Angle of internal friction (φ)	29.0 deg (24.8 factored)	28.2 deg	34.5 deg	40.9 deg
Allowed horizontal load (H)	1848 kN	2144 kN	2751 kN	3460 kN

Table 2, Comparison of conventional approach to a detailed study using distribution fitting
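The values in Table 2 appear to be consistent with a simple sliding relation H = V tan(φ), with the safety factor of 1.2 applied to tan(φ) in the conventional case. This relation is inferred from the numbers in the table and is not stated explicitly in the text, so the sketch below should be read as an assumption. The function name `allowed_horizontal_load` is hypothetical.

```python
import math

def allowed_horizontal_load(vertical_load_kn, friction_angle_deg):
    """Assumed sliding relation H = V * tan(phi); result in kN."""
    return vertical_load_kn * math.tan(math.radians(friction_angle_deg))

V = 4000.0  # vertical load in kN, from Table 2

# Conventional approach: safety factor 1.2 applied to tan(phi) (assumed).
phi_factored = math.degrees(math.atan(math.tan(math.radians(29.0)) / 1.2))
h_conventional = allowed_horizontal_load(V, phi_factored)

# Probabilistic approach: friction angles taken from Table 2.
h_lower = allowed_horizontal_load(V, 28.2)
h_best = allowed_horizontal_load(V, 34.5)
h_upper = allowed_horizontal_load(V, 40.9)

# Relative gain of the probabilistic lower bound over the conventional value.
gain = h_lower / h_conventional - 1.0

print(f"factored angle: {phi_factored:.1f} deg")
print(f"conventional H: {h_conventional:.0f} kN")
print(f"lower / best / upper H: {h_lower:.0f} / {h_best:.0f} / {h_upper:.0f} kN")
print(f"gain of lower bound over conventional: {gain:.1%}")
```

The results agree with Table 2 to within a few kN. The small differences presumably come from the rounding of the angles shown in the table.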

Truncated distributions

The analysis above is already a major improvement compared to current practice. Nonetheless, further improvements are possible, for example by limiting the fitted distribution to parameter values that are physically possible and expected. This is done by means of truncation, in which the tails of the distribution are removed. Truncation avoids sampling from regions of the parameter space that are physically infeasible (in this case, for example, a value of 25 degrees). Figure 4 shows a truncated distribution.

Figure 4, Truncated distribution fit using only expected values of the angle of internal friction for the specific soil unit
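Sampling from a truncated distribution can be sketched with inverse-CDF sampling: uniform random probabilities are drawn only between the cumulative probabilities of the two bounds, so no sample can fall in the removed tails. The normal shape, its mean and standard deviation, the bounds of 28 and 42 degrees and the function name are all hypothetical values chosen for illustration, not the parameters behind Figure 4.

```python
import random
import statistics

def truncated_normal_sampler(mean, sd, lower, upper, seed=0):
    """Return a function that draws from a normal truncated to [lower, upper]."""
    dist = statistics.NormalDist(mean, sd)
    # Probability mass below each bound in the untruncated distribution.
    p_low, p_high = dist.cdf(lower), dist.cdf(upper)
    rng = random.Random(seed)

    def draw():
        # Inverse-CDF sampling restricted to the mass between the bounds.
        x = dist.inv_cdf(rng.uniform(p_low, p_high))
        # Clamp to guard against floating-point rounding at the bounds.
        return min(max(x, lower), upper)

    return draw

# Hypothetical parameters in degrees (illustration only).
draw_angle = truncated_normal_sampler(mean=34.5, sd=3.2, lower=28.0, upper=42.0)
samples = [draw_angle() for _ in range(5000)]

print(f"min sample: {min(samples):.2f} deg")
print(f"max sample: {max(samples):.2f} deg")
print(f"mean of samples: {statistics.fmean(samples):.2f} deg")
```

Compared with rejection sampling, this approach never discards a draw, which helps when the bounds cut away a large part of the distribution.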

Learn more

Want to learn more? Subscribe to the newsletter & get access to the API for free. This allows you to implement these analyses in your own workflow.
