
Lognormal distributions (with two parameters) have a central role in human and ecological risk assessment for at least three reasons. First, many physical, chemical, biological, toxicological, and statistical processes tend to create random variables that follow Lognormal distributions (Hattis and Burmaster, 1994).
For example, the physical dilution of one material (say, a miscible or soluble contaminant) into another material (say, surface water in a bay) tends to create non-equilibrium concentrations which are Lognormal in character (Ott, 1995; Ott, 1990). Second, when the conditions of the Central Limit Theorem obtain (Mood, Graybill, and Boes, 1974), the mathematical process of multiplying a series of random variables will produce a new random variable (the product) which tends (in the limit) to be Lognormal in character, regardless of the distributions from which the input variables arise (Benjamin and Cornell, 1970). Finally, Lognormal distributions are self-replicating under multiplication and division, i.e., products and quotients of independent Lognormal random variables are themselves Lognormal (Crow and Shimizu, 1988; Aitchison and Brown, 1957).

The Standard Normal Distribution

Since the lognormal is based on the normal distribution, every property of the lognormal can be derived from the properties of the normal distribution. A random variable Z is said to have the standard normal distribution if it has the probability density function φ given by

φ(z) = exp(−z^{2} / 2) / (2π)^{1/2}, for z in R.

The normal distribution with location parameter μ in R and scale parameter σ > 0 has probability density function f given by
f(x) = exp[−(x − μ)^{2} / (2σ^{2})] / [(2π)^{1/2} σ], for x in R.

The Lognormal Distribution

A random variable X is said to have the lognormal distribution, with parameters μ and σ, if ln(X) has the normal distribution with mean μ and standard deviation σ. Equivalently, X = exp(Y) where Y is normally distributed with mean μ and standard deviation σ. The parameter μ can be any real number, while the parameter σ must be a positive real number. The lognormal distribution is used to model continuous random quantities when the distribution is believed to be skewed, such as certain income and lifetime variables. The lognormal density function, with parameters μ and σ, is given by
f(x) = exp{−[ln(x) − μ]^{2} / (2σ^{2})} / [x (2π)^{1/2} σ], for x > 0.

The parameter μ is the mean and σ is the standard deviation of the normal random variable ln[X], not of the lognormal random variable X. Somewhat confusingly, μ is also the median of ln[X], because the mean and the median of N(μ, σ) coincide. The equation above represents the lognormal random variable X in "logarithmic space": the random variable ln[X] follows a normal distribution, while X itself follows a lognormal distribution. Note that the lognormal pdf differs from the normal pdf not only by the replacement of x with ln(x) but also by an additional factor x in the denominator x (2π)^{1/2} σ, which comes from the change of variables from X to ln[X].

The moments of the lognormal distribution can be computed from the moment generating function of the normal distribution:

E(X^{n}) = exp(nμ + n^{2}σ^{2} / 2).

The mean and variance of X are

E(X) = exp(μ + σ^{2} / 2),
Var(X) = exp(2μ + σ^{2}) [exp(σ^{2}) − 1].
The median is exp(μ) and the mode is exp(μ − σ^{2}); see the Appendix. Even though the lognormal distribution has finite moments of all orders, its moment generating function is infinite at every positive argument. This property is one of the reasons for the fame of the lognormal distribution.

The lognormal distribution arises from many small, multiplicative random effects, in contrast to the additive random effects that lead to the normal distribution. It is used extensively in reliability applications to model failure times; the lognormal and Weibull distributions are probably the most commonly used distributions in such applications. The lognormal is skewed to the right, and for a given μ the skewness increases as σ increases. This long right tail is of great importance when modeling the survival of cancer patients.
Caution! Some authors use the base-10 logarithm instead of the natural logarithm, for example because they derive the parameters graphically on logarithmic paper; that is, they assume that log10(X) is normally distributed. The parameters are then μ_{10} and σ_{10}, which are related to μ and σ by

μ = ln(10) μ_{10}
σ = ln(10) σ_{10}

Example: Describing survival of cancer patients observed in a truncated time interval

We consider a medical example with cancer patients. Observation starts at T0 and ends at TN, and in each time interval we record the number of patients who died. We require that the integral over the truncated range (the number of patients) is the same for the observed and the theoretical distribution. If the observation range extended from 0 to infinity, this integral would be the number of all patients considered in the study, and again the theoretical and observed totals would be the same. The only parameters to be found are then σ and μ.

We consider a real case with cancer patients (mesothelioma). The data set has the form of a table: for example, 10 patients died in the interval from 5 to 15 weeks, 8 in the interval from 15 to 25 weeks, and so on, or in tabular form for nT times ..
Is this described by a lognormal distribution? The purpose of the work is to provide a baseline for comparison: we are interested in comparing the results for this data set with another set of patients who take a new drug that is supposed to "extend their lifetime".

We transform the T values to x values of the standard normal distribution in order to calculate the cumulative value (log is the natural logarithm):

x = (log(T) − μ) / σ;

The polynomial f(x) (see Appendix) is provided by Abramowitz and Stegun but works only for non-negative x. For negative x we have to use 1 − f(−x). We use a function, say PInt(x), to calculate the cumulative probability from −infinity to x for the normal distribution N(0,1). It goes from 0 to 1 as x runs from −infinity to +infinity, and by symmetry f(0) = 0.5.

    #include <math.h>

    // Cumulative probability of N(0,1) from -infinity to x
    double PInt(double x)
    {
        // Coefficients from Abramowitz and Stegun
        double d1 = 0.0498673470;
        double d2 = 0.0211410061;
        double d3 = 0.0032776263;
        double d4 = 0.0000380036;
        double d5 = 0.0000488906;
        double d6 = 0.0000053830;
        double Pval;

        // The polynomial is only valid for non-negative values;
        // for negative x use the relation P(x) = 1 - P(-x)
        double y;
        if (x < 0.0)
            y = -1.0*x;
        else
            y = x;

        // Use the Horner scheme for the polynomial f(x)
        Pval = 1.0 + y*(d1 + y*(d2 + y*(d3 + y*(d4 + y*(d5 + y*d6)))));
        Pval = 1.0 - 0.5/pow(Pval, 16);
        if (x < 0.0)
            Pval = 1.0 - Pval;

        return Pval;
    }
Next we calculate the Chi2, and from it the probability, for the data set given the parameters σ and μ. We assume that some method is used to minimize this Chi2.

    double SumE = 0.0;
    double SumO = 0.0;
    double SumChi2 = 0.0;

    // Calculate the integrals RVal over the time ranges and the expectations EVal.
    // PxVal[i] is PInt(x) at the i-th boundary of the time intervals,
    // with x = (log(T) - mu)/sigma.
    for (i = 0; i < nT-1; i++)
    {
        RVal[i] = PxVal[i+1] - PxVal[i];                        // Integral in the i-th time range
        EVal[i] = NFitCases*RVal[i]/(PxVal[nT-1] - PxVal[0]);   // Expected value in the i-th time interval
    }

    SumChi2 = 0.0;
    for (i = 0; i < nT-1; i++)
    {
        Chi2 = pow((ObsVal[i] - EVal[i]), 2)/EVal[i];
        SumChi2 += Chi2;
        SumE += EVal[i];
        SumO += ObsVal[i];   // These are the observed values from the table
    }
For this particular case the minimization algorithm (not provided here) gives the following minimum for the lognormal parameters:

μ = 3.605840, σ = 0.970540
Sum Chi**2 = 0.586 // Chi2
Sum E = 56.000 // Expected sum
Sum O = 56.000 // Observed sum
DOF (degrees of freedom) = nT − 4 = 3 (nT − 1 ranges and 2 parameters)
Probability = 0.899604 for this Chi2 and DOF
The results show that the survival times of the patients can be modeled well by the lognormal distribution.
We calculate the probability P that a patient will survive a given time in weeks. We use 1 − PInt(x) to find the fraction of patients still alive after a time T.
The results in the table show that after 20 weeks (about 4.6 months) around 26% of the patients will have died, and only approximately 8% will survive 140 weeks (2.7 years).
References
Abramowitz, M. and Stegun, I.A., Eds. 1964. Handbook of Mathematical Functions with Formulas, Graphs, and Mathematical Tables, National Bureau of Standards, Applied Mathematics Series Number 55, Issued June 1964, Tenth Printing with corrections in December 1972, US Government Printing Office, Washington, DC.
Aitchison, J. and Brown, J.A.C. 1957. The Lognormal Distribution, Cambridge University Press, Cambridge, UK.
Benjamin, J.R. and Cornell, C.A. 1970. Probability, Statistics, and Decision for Civil Engineers, McGraw Hill, New York, NY.
Crow, E.L. and Shimizu, K., Eds. 1988. Lognormal Distributions: Theory and Application, Dekker, New York.
McAlister, D. 1879. Proc. Roy. Soc. 29, 367.
Limpert, E., Stahel, W.A. and Abbt, M. 2001. Lognormal distributions across the sciences: keys and clues, Bioscience 51 (5), 341–352.
Mould, R.F., Lederman, M., Tai, P. and Wong, J.K.M. 2002. Methodology to predict long-term cancer survival from short-term data using Tobacco Cancer Risk and Absolute Cancer Cure models, Phys. Med. Biol. 47, 3893–3924. (Note the typing error on page 3901: d6 = 0.0007 005 383 0 should be d6 = 0.0000 05 383; remove the 7.)
Asselain, B., De Rycke, Y., Savignon, A. and Mould, R.F. 2003. Parametric modelling to predict survival time to first recurrence for breast cancer, Phys. Med. Biol. 48, L31–L33.
Mould, R.F., Lahanas, M. et al. Lognormal modelling of malignant pleural mesothelioma reference baseline survival rates: a study of 5563 cases, submitted for publication February 2004.
Hattis, D.B. and Burmaster, D.E. 1994. Assessment of Variability and Uncertainty Distributions for Practical Risk Assessments, Risk Analysis, Volume 14, Number 5, pp 713–730.
Ott, W.R. 1990. A Physical Explanation of the Lognormality of Pollutant Concentrations, Journal of the Air and Waste Management Association, Volume 40, pp 1378 et seq.
Ott, W.R. 1995. Environmental Statistics and Data Analysis, Lewis Publishers, Boca Raton, FL.
Mood, A.M., Graybill, F.A., and Boes, D.C. 1974. Introduction to the Theory of Statistics, Third Edition, McGraw Hill, New York, NY.
Appendix
Polynomial approximation P(x)

Polynomial approximation P(x) for the cumulative normal distribution, which can also be used to calculate the cumulative probability of the lognormal distribution. Note that the polynomial f(x) itself is valid only for non-negative x values:

P(x) = f(x) if x is positive or 0, else P(x) = 1 − f(−x).

f(x) = 1 − 0.5 (1 + d_{1} x + d_{2} x^{2} + d_{3} x^{3} + d_{4} x^{4} + d_{5} x^{5} + d_{6} x^{6})^{−16} + eps(x), abs(eps(x)) < 1.5*10^{−7}

d_{1} = 0.0498673470
d_{2} = 0.0211410061
d_{3} = 0.0032776263
d_{4} = 0.0000380036
d_{5} = 0.0000488906
d_{6} = 0.0000053830

Mean, median and mode of the lognormal distribution

mean = exp(μ + σ^{2} / 2)
median = exp(μ)
mode = exp(μ − σ^{2})
Lognormal Generalization with 3 parameters
f(x) = exp{−[ln(x − a) − μ]^{2} / (2σ^{2})} / [(2π)^{1/2} σ (x − a)], for x > a (0 otherwise)

Parameters

x in (a, infinity)

Generation Algorithm

Generate a normal random number r = N(μ, σ^{2})
Return a + exp(r)

