Estimation of a linear model with two-parameter symmetric platykurtic distributed errors
© Andargie and Rao; licensee Springer. 2013
Received: 21 July 2013
Accepted: 6 November 2013
Published: 26 November 2013
A linear regression model with Gaussian-distributed error terms is the most widely used method for describing the possible relationship between outcome and predictor variables. However, Gaussian errors have some drawbacks, one being that the distribution is mesokurtic. In many practical situations, the variables under study are not mesokurtic but platykurtic. Hence, to analyze such platykurtic variables, a multiple regression model with symmetric platykurtic (SP) distributed errors is needed. In this paper, we introduce and develop a multiple linear regression model with symmetric platykurtic distributed errors for the first time.
We use the methods of ordinary least squares (OLS) and maximum likelihood (ML) to estimate the model parameters. The properties of the ML estimators with respect to the symmetric platykurtic distributed errors are discussed. Model selection criteria, namely the Akaike information criterion (AIC) and the Bayesian information criterion (BIC), are used to compare the models. The utility of the proposed model is demonstrated with both simulated and real data.
A comparative study of the symmetric platykurtic linear regression model with the Gaussian model revealed that the former gives a good fit to some data sets. The results also revealed that the ML estimators are more efficient than the OLS estimators in terms of the relative efficiency of the one-step-ahead forecast mean square error.
The study shows that the symmetric platykurtic distribution serves as an alternative to the normal distribution. The developed model is useful for analyzing data sets arising from agricultural experiments, portfolio management, space experiments, and a wide range of other practical problems.
Keywords: Maximum likelihood; Multiple linear regression model; Simulation; Symmetric platykurtic distribution
Regression analysis is one of the most commonly used statistical methodologies in many branches of science and engineering for discovering functional relationships between variables. The most typical example is multiple linear regression modeling, which is used for predicting the values of one or more response variables from a set of independent variables. It has found applications in almost every area of science, engineering, and medicine. Comprehensive accounts of the theory and applications of the linear regression model are given in Seber, Montgomery et al., Grob, Sengupta and Jammalamadaka, Seber and Lee, Weisberg, and Yan and Su. This technique is usually based on a statistical model in which the error terms are assumed to be independent and identically distributed random variables whose distribution is multivariate normal with a zero mean vector and a positive definite covariance matrix. However, in many disciplines, empirical studies or theoretical reasoning support the presence of skewness or heavy tails in the distribution of the error terms. Departures from normality may also be caused by outlying values in the responses. Examples can be found, amongst others, in Fama and Sutton. For these reasons, several researchers have proposed performing multivariate regression analysis with a model that assumes a different parametric distribution family for the error terms.
Zeckhauser and Thompson studied a linear regression model with power distributions. Zellner and Sutradhar and Ali studied a regression model with a multivariate t error variable. Tiku et al. [14–16] investigated a linear regression model with symmetric innovations, discussed a first-order autoregressive model with symmetric innovations, and presented a linear model with t distribution, respectively. Sengupta and Jammalamadaka studied linear models. Liu and Bozdogan studied power exponential (PE) multiple regression. Wong and Bian [19, 20] studied multiple regression coefficients in a linear model whose errors follow Student's t distribution and a linear regression model whose underlying distribution is a generalized logistic distribution, respectively. Liu and Bozdogan studied multivariate regression models with PE random errors under various assumptions. Soffritti and Galimberti discussed a multivariate linear regression model under the assumption that the error terms follow a finite mixture of normal distributions. Jafari and Hashemi studied linear regression with a skew-normal error term. Jahan and Khan investigated the g-and-k distribution as the underlying assumption for the distribution of the error in a simple linear regression model. Bian et al. studied a multiple linear regression model with an underlying Student's t distribution.
No serious attempt has been made to develop and analyze multiple regression models with symmetric platykurtic (SP) errors. For this reason, and to achieve more flexibility in statistical modeling and model selection and to robustify many multiple statistical procedures, the purpose of this paper is to introduce and develop a multivariate linear regression model in which the error terms are assumed to be independent and identically distributed SP random errors with mean 0 and constant variance σ².
The model can be written in matrix form as y = Xβ + ε, where y is a column vector of n elements, X is an n × (k + 1) (with k + 1 < n) nonrandom design matrix of covariates (its first column has all elements equal to 1, its second column holds the observed values of x1, and its (k + 1)th column holds the observed values of xk), β is a column vector of (k + 1) elements, σ is an unknown scale parameter, and ε is an n × 1 column vector of error terms with zero mean and covariance matrix σ²I.
The errors satisfy E(ε) = 0 and Cov(ε) = σ²I, where the notation E stands for the expected value and Cov represents an n × n variance-covariance matrix. The vector 0 is a column vector with n zero elements, and I is the identity matrix of order n × n. The parameter σ² is unspecified, along with the vector parameter β. The elements of β are real-valued, while σ is positive. The covariates are either nonrandom or independent of the errors. Consequently, E(y) = Xβ and Cov(y) = σ²I. We shall use the triplet (y, Xβ, σ²I) for linear model (2).
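The model setup can be illustrated with a short simulation. This is only a sketch: the density of the SP errors is not reproduced in this copy of the text, so the code assumes the standardized form (2 + z²)φ(z)/3, where φ is the standard normal pdf. The sampler `sample_sp`, the coefficient values, and the seed are all hypothetical choices made for illustration.

```python
import numpy as np

def sample_sp(rng, size, sigma=1.0):
    """Draw SP(0, sigma) errors under the ASSUMED density (2 + z^2) * phi(z) / 3.

    That density is a mixture: weight 2/3 on N(0, 1) and weight 1/3 on a
    sign-symmetrised chi distribution with 3 degrees of freedom.
    """
    normal = rng.standard_normal(size)
    signed_chi3 = np.sqrt(rng.chisquare(3, size)) * rng.choice([-1.0, 1.0], size)
    z = np.where(rng.random(size) < 2.0 / 3.0, normal, signed_chi3)
    return sigma * z

rng = np.random.default_rng(2013)             # arbitrary seed
n, k = 200_000, 2                             # hypothetical sample size and covariates
covariates = rng.standard_normal((n, k))
X = np.column_stack([np.ones(n), covariates]) # n x (k + 1), first column all ones
beta = np.array([1.0, 2.0, -0.5])             # hypothetical coefficient vector
eps = sample_sp(rng, n, sigma=1.0)
y = X @ beta + eps                            # y = X beta + eps

print(X.shape, eps.mean(), eps.var())
```

Under the assumed density the simulated errors are centred at zero and, for σ = 1, have variance close to 5/3, so σ acts as a scale parameter rather than as the standard deviation itself.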
The manuscript is organized in eight sections. The ‘Introduction’ section frames the objective of the paper and reviews the related literature. In the ‘Properties of the two-parameter symmetric platykurtic distribution’ section, we introduce the two-parameter SP distribution, denoted SP(μ,σ). We derive the maximum likelihood (ML) estimators in the ‘Maximum likelihood estimation of the model parameters’ section, where the Newton–Raphson (NR) iterative method is used to obtain numerical solutions to the ML estimation problem. In the ‘Properties of the estimators through simulation study’ section, we show the asymptotic properties of the estimators. In the ‘Least squares estimation of the model parameters’ section, OLS estimation of the model parameters is studied. The ML estimators are compared with the OLS estimators, and the proposed model with the Gaussian model, in the ‘Comparative study of the model’ section. The ‘Application of the model’ section demonstrates the usefulness of the present model on real data. Finally, the ‘Summary and conclusions’ section concludes the paper.
Properties of the two-parameter symmetric platykurtic distribution
The distribution depends on three parameters μ, σ, and γ. These parameters can be interpreted as follows:
μ is a real number and may be thought of as a location measure.
σ is positive and measures the dispersion or the scale of the distribution.
γ is a kurtosis parameter which determines the shape of the distribution, taking values γ = 0, 1, 2, …, n. If γ = 0, we retrieve the normal distribution N(μ, σ²). If we take γ = 1, we get a two-parameter symmetric platykurtic distribution.
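The role of γ can be checked numerically. The density itself did not survive in this copy, so the sketch below assumes a standardized family whose density is proportional to (2 + z²)^γ exp(−z²/2); this form is an assumption, chosen because it reduces to the normal density at γ = 0. The function names are hypothetical.

```python
import numpy as np

GRID = np.linspace(-15.0, 15.0, 300001)   # wide grid; the tails beyond are negligible
DZ = GRID[1] - GRID[0]

def sp_family_pdf(z, gamma=1):
    """ASSUMED standardised density, proportional to (2 + z^2)^gamma * exp(-z^2 / 2)."""
    def kernel(t):
        return (2.0 + t**2) ** gamma * np.exp(-0.5 * t**2)
    norm_const = np.sum(kernel(GRID)) * DZ  # numerical normalising constant
    return kernel(np.asarray(z, dtype=float)) / norm_const

def kurtosis(gamma):
    """Ratio of the fourth central moment to the squared variance, by grid summation."""
    pdf = sp_family_pdf(GRID, gamma)
    m2 = np.sum(GRID**2 * pdf) * DZ
    m4 = np.sum(GRID**4 * pdf) * DZ
    return m4 / m2**2

for g in (0, 1, 2):
    print(g, kurtosis(g))
```

Under this assumption γ = 0 gives the mesokurtic value 3, and larger γ gives kurtosis below 3, which is the platykurtic behaviour described above.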
- 1. The distribution function of the random variable Y specified by the probability density function (7) is obtained by integrating the density and, on simplification, reduces to Equation (8),
where the first term is the distribution function of the normal random variable with mean μ and variance σ², while the second term is not itself a distribution function (it cannot be a cumulative distribution function) since it is negative for Y > μ.
- 2. Numerical approximations for the two-parameter symmetric platykurtic cumulative distribution function (CDF): Marsaglia suggested a simple algorithm, based on a Taylor series expansion, for evaluating the standard normal distribution, and we approximate the values F(y;μ,σ) in the same way. For the standard normal distribution, Φ(y) = 1/2 + φ(y) Σ yⁿ/n!! to arbitrary precision, where φ and Φ are the pdf and CDF of the standard normal distribution (evaluated at the standardized value (y − μ)/σ in the general case). Accordingly, after a little algebra, the standard symmetric platykurtic CDF, F(y), is approximated by Equation (9),
where the sum runs over n = 1, 3, 5, … and n!! denotes the double factorial, that is, the product of every odd number from 1 to n.
- 3. The cumulant-generating function g(t) is the logarithm of the moment-generating function, Equation (10). The cumulants k_n are extracted from the cumulant-generating function via differentiation of g(t) at zero. That is, the cumulants appear as the coefficients in the Maclaurin series of g(t), Equation (11).
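That is, the first two cumulants are equal to the mean μ and the variance of the two-parameter symmetric platykurtic distribution, respectively. The odd-order cumulants beyond the first vanish by symmetry, whereas the fourth cumulant is negative because the distribution is platykurtic; only the normal distribution has all cumulants beyond the second equal to zero.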
- 4. Hazard rate function of the distribution: The hazard function h(y;μ,σ) of the two-parameter symmetric platykurtic distribution is used to characterize life phenomena and is given by Equation (12),
where φ is the pdf of the normal distribution with mean μ and variance σ², whereas the Q-function Q(y) is the complement of the standard normal CDF, Q(y) = 1 − Φ(y). Recently, Gupta and Gupta observed that the reversed hazard function plays an important role in reliability analysis. The reversed hazard function of the two-parameter SP(μ,σ) distribution is given by Equation (13).
It is well known that the hazard function or the reversed hazard function uniquely determines the corresponding probability density function.
- 5. Entropy: The entropy of a two-parameter symmetric platykurtic random variable y with probability density function f(y) on the real line is defined by Equation (14). Recall that the entropy of the normal distribution is (1/2) ln(2πeσ²). If f and g are the probability densities of the symmetric platykurtic and normal distributions, respectively, then the relative entropy D(f||g) from f to g is given by Equation (15).
This gives a measure of something like the distance between the two probability distributions, in the sense that the relative entropy is always nonnegative, is zero if and only if the two distributions are the same, and increases as the distributions diverge. Some of the more important properties of the SP distribution are summarized in the Appendix.
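Properties 1 to 5 can be explored numerically with a small sketch. It rests on an assumption: the SP(μ,σ) density is taken to be (2 + z²)φ(z)/(3σ) with z = (y − μ)/σ, which integrates to the CDF Φ(z) − zφ(z)/3. The normal CDF is evaluated with the odd-power Taylor series attributed to Marsaglia in the text; all function names are hypothetical.

```python
import math
import numpy as np

SQRT_2PI = math.sqrt(2.0 * math.pi)

def normal_cdf_series(x):
    """Phi(x) = 1/2 + phi(x) * sum over odd n of x^n / n!! (Taylor series)."""
    total, previous, term, n = x, 0.0, x, 1
    while total != previous:          # stop once adding a term changes nothing
        previous = total
        n += 2
        term *= x * x / n             # x^(n+2) / (n+2)!! from x^n / n!!
        total = previous + term
    return 0.5 + total * math.exp(-0.5 * x * x) / SQRT_2PI

def sp_pdf(y, mu=0.0, sigma=1.0):
    z = (y - mu) / sigma
    return (2.0 + z * z) * math.exp(-0.5 * z * z) / (3.0 * sigma * SQRT_2PI)

def sp_cdf(y, mu=0.0, sigma=1.0):
    """Normal CDF minus a correction term that is negative for y > mu."""
    z = (y - mu) / sigma
    return normal_cdf_series(z) - z * math.exp(-0.5 * z * z) / (3.0 * SQRT_2PI)

def sp_hazard(y, mu=0.0, sigma=1.0):
    return sp_pdf(y, mu, sigma) / (1.0 - sp_cdf(y, mu, sigma))

def sp_reversed_hazard(y, mu=0.0, sigma=1.0):
    return sp_pdf(y, mu, sigma) / sp_cdf(y, mu, sigma)

def sp_entropy_and_kl(sigma=1.0, half_width=12.0, points=200001):
    """Entropy of SP(mu, sigma) and D(f || g) against N(mu, sigma^2), by grid sums."""
    z = np.linspace(-half_width, half_width, points)
    dz = z[1] - z[0]
    phi = np.exp(-0.5 * z**2) / SQRT_2PI
    f = (2.0 + z**2) * phi / 3.0                      # standardised SP density
    entropy = -np.sum(f * np.log(f)) * dz + math.log(sigma)
    kl = np.sum(f * np.log(f / phi)) * dz             # scale-free when mu, sigma match
    return entropy, kl

print(sp_cdf(0.0), sp_hazard(1.0), sp_reversed_hazard(1.0))
print(sp_entropy_and_kl())
```

The series form is convenient for moderate |z|; for values far in the tails a library routine such as `math.erfc` would be a more robust choice.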
Maximum likelihood estimation of the model parameters
The Newton–Raphson update is θ(n+1) = θ(n) − H⁻¹S, where H is the matrix of second derivatives and S is the vector of first derivatives of the log-likelihood function, both evaluated at the current value θ(n) of the parameter vector.
Here we begin with some starting value, say θ(0), and improve it by finding some better approximation θ(1) to the required root. This procedure can be iterated to go from a current approximation θ(n) to a better approximation θ(n+1).
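The iteration can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes the SP error density (2 + z²)φ(z)/(3σ), holds σ fixed at a known value, and updates only the regression coefficients β. The data, seed, and function names are hypothetical.

```python
import numpy as np

def sample_sp(rng, size):
    """SP(0, 1) draws under the ASSUMED density: 2/3 N(0,1) + 1/3 signed chi(3)."""
    normal = rng.standard_normal(size)
    signed_chi3 = np.sqrt(rng.chisquare(3, size)) * rng.choice([-1.0, 1.0], size)
    return np.where(rng.random(size) < 2.0 / 3.0, normal, signed_chi3)

def score_and_hessian(beta, X, y, sigma):
    """First (S) and second (H) derivatives of the log-likelihood in beta.

    Per observation the log-density is log(2 + z^2) - z^2 / 2 + const,
    with z = (y - x'beta) / sigma.
    """
    z = (y - X @ beta) / sigma
    psi = -z**3 / (2.0 + z**2)                        # d/dz of the log-density
    dpsi = -(6.0 * z**2 + z**4) / (2.0 + z**2) ** 2   # d/dz of psi
    S = -(X.T @ psi) / sigma
    H = (X.T * dpsi) @ X / sigma**2
    return S, H

def newton_raphson(beta0, X, y, sigma, tol=1e-10, max_iter=50):
    """Iterate beta <- beta - H^{-1} S until the step is negligible."""
    beta = np.asarray(beta0, dtype=float)
    for _ in range(max_iter):
        S, H = score_and_hessian(beta, X, y, sigma)
        step = np.linalg.solve(H, S)
        beta = beta - step
        if np.max(np.abs(step)) < tol:
            break
    return beta

rng = np.random.default_rng(0)                        # arbitrary seed
n = 2000
X = np.column_stack([np.ones(n), rng.standard_normal((n, 2))])
beta_true = np.array([1.0, 2.0, -0.5])                # hypothetical true values
sigma = 1.0
y = X @ beta_true + sigma * sample_sp(rng, n)

beta_start = np.linalg.lstsq(X, y, rcond=None)[0]     # OLS as starting value
beta_hat = newton_raphson(beta_start, X, y, sigma)
print(beta_hat)
```

Starting from the OLS estimate keeps the first iterate close to the root; a full implementation would also update σ and would typically add a step-size safeguard.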
The off-diagonal term vanishes because its integrand is an odd function of u.
Thus, the square roots of the diagonal elements of this matrix give the standard errors associated with the coefficients.
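As a minimal sketch of this step, the code below inverts the negative Hessian of the log-likelihood at the estimate, takes the square roots of the diagonal, and forms Wald limits. The function name, the normal critical value 1.96, and the numbers in the usage example are illustrative assumptions.

```python
import numpy as np

def wald_summary(estimates, hessian, z_crit=1.96):
    """Standard errors and Wald confidence limits from a log-likelihood Hessian."""
    estimates = np.asarray(estimates, dtype=float)
    covariance = np.linalg.inv(-np.asarray(hessian, dtype=float))  # inverse observed information
    std_err = np.sqrt(np.diag(covariance))
    return std_err, estimates - z_crit * std_err, estimates + z_crit * std_err

# Hypothetical estimates and Hessian, for illustration only
est = np.array([1.0, 2.0])
hess = np.array([[-4.0, 0.0], [0.0, -25.0]])
se, lower, upper = wald_summary(est, hess)
print(se, lower, upper)
```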
Simulation and results
[Table: Summary of ML estimation of the regression model for the simulated data. Columns: sample size (n), Wald 95% confidence limits, Pr > chi-square.]
Properties of the estimators through simulation study
If certain regularity conditions of the density are met, the MLEs are most attractive because they possess many asymptotic or large sample properties. Derivations of the asymptotic properties require some fairly intricate mathematics. The three properties of the regular densities (moments of the derivatives of the log-likelihood) are used in establishing the properties of MLEs. The properties of the ML estimators are as follows.
From (27), it is clear that the variance tends to zero as n → ∞ in each case, so we conclude that the estimators are consistent since they are composed of i.i.d. observations.
where I_μ and the accompanying information term are as defined in (22) and (25), respectively.
This means that any unbiased estimator that achieves this lower bound is efficient and no better unbiased estimator is possible. Now look back at the variance-covariance matrix (27). The variances of the estimators in the variance-covariance matrix coincide asymptotically with the Cramér–Rao lower bound (42). This means that our MLEs are 100% asymptotically efficient. The asymptotic variance of the MLE is, in fact, equal to the Cramér–Rao lower bound for the variance of a consistent and asymptotically normally distributed estimator.
Last, the invariance property is a mathematical result of the method of computing MLEs; it is not a statistical result as such. If it is desired to analyze a continuous and continuously differentiable function of θ, then that function evaluated at the MLE of θ is itself the MLE of the function, since the MLE is invariant to one-to-one transformations of θ.
These four properties explain the prevalence of the ML technique. The second is a particularly powerful result. The third greatly facilitates hypothesis testing and the construction of interval estimates. The MLE has the minimum variance achievable by a consistent and asymptotically normally distributed estimator.
Least squares estimation of the model parameters
[Table: Nonlinear OLS summary of residual errors.]
[Table: Nonlinear OLS parameter estimates. Columns: approximate standard error, approximate Pr > |t|.]
From Table 3, it is observed that the OLS estimates differ significantly from the ML estimates and the ML estimators are closer to the true values of the parameters compared to the OLS estimators.
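For reference, an OLS fit of the linear model can be sketched in a few lines. This uses NumPy's least-squares solver rather than the nonlinear OLS procedure named in the table captions; the function name and the toy data are hypothetical.

```python
import numpy as np

def ols_fit(X, y):
    """OLS coefficients and the residual variance estimate RSS / (n - p)."""
    beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
    residuals = y - X @ beta_hat
    dof = X.shape[0] - X.shape[1]          # n minus number of coefficients
    sigma2_hat = residuals @ residuals / dof
    return beta_hat, sigma2_hat

# Toy data lying exactly on y = 1 + 2x, for illustration only
x = np.array([0.0, 1.0, 2.0, 3.0])
X = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x
beta_hat, sigma2_hat = ols_fit(X, y)
print(beta_hat, sigma2_hat)
```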
Comparative study of the model
Comparison of estimators of the linear regression model
[Table: Comparison of the OLS with ML estimators of the SP regression model. Surviving entries: 1.81953E−05, 1.11E−05, −1.7E−05, 5.14E−10, 2E−08.]
As expected, the results reported in Table 4 show that the ML estimators have both a smaller one-step-ahead forecast bias and a smaller MSE than the OLS estimators. This reveals that the ML estimators perform better than the OLS estimators and confirms that deviations from normality cause OLS estimators to perform poorly.
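The comparison criteria can be sketched as below. The sign convention for the bias (forecast minus actual) and the definition of relative efficiency as the ratio of the OLS MSE to the ML MSE are assumptions made for illustration; the numbers are invented.

```python
import numpy as np

def forecast_bias_mse(actual, forecast):
    """Mean forecast error (bias) and mean squared forecast error."""
    errors = np.asarray(forecast, dtype=float) - np.asarray(actual, dtype=float)
    return errors.mean(), np.mean(errors**2)

def relative_efficiency(mse_ols, mse_ml):
    """Values above 1 indicate that the ML forecasts have the smaller MSE."""
    return mse_ols / mse_ml

# Hypothetical one-step-ahead forecasts
actual = [1.0, 2.0, 3.0]
bias, mse = forecast_bias_mse(actual, [1.1, 1.9, 3.3])
print(bias, mse, relative_efficiency(0.04, 0.02))
```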
Comparison of the SP-LRM with the N-LRM
[Table: Information criteria and model diagnostics of ML estimators of the parameters. Columns include sample size (n).]
Application of the model
Variables of the Australian Institute of Sport data frame with 202 observations: Y = BMI, X1 = RCC, X2 = WCC, X3 = PFC.
[Table: Tests for departure from normality of the response variable (BMI).]
[Table: ML estimates of model parameters calculated from the real data set. Rows include AIC (smaller is better) and BIC (smaller is better).]
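The two criteria follow the standard definitions AIC = 2p − 2ℓ and BIC = p ln n − 2ℓ, where ℓ is the maximized log-likelihood, p the number of estimated parameters, and n the sample size. The sketch below uses hypothetical log-likelihood values; only n = 202 is taken from the data description above.

```python
import math

def aic(loglik, n_params):
    """Akaike information criterion; smaller is better."""
    return 2.0 * n_params - 2.0 * loglik

def bic(loglik, n_params, n_obs):
    """Bayesian information criterion; smaller is better."""
    return n_params * math.log(n_obs) - 2.0 * loglik

# Hypothetical maximised log-likelihoods for two competing error models
candidates = {"SP errors": -100.0, "normal errors": -103.5}
for name, loglik in candidates.items():
    print(name, aic(loglik, 5), bic(loglik, 5, 202))
```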
Summary and conclusions
A multiple linear regression model generalizes the simple linear regression model by allowing the response variable to depend on more than one explanatory variable. In this paper, we have explored the idea of using a symmetric platykurtic distribution for analyzing nonnormal errors in the multivariate linear regression model. The symmetric platykurtic distribution serves as a platykurtic alternative to the normal distribution. The maximum likelihood estimators of the model parameters are derived and found to be feasible. The properties of these estimators are studied through simulation. Traditional OLS estimation is carried out in parallel, and the results are compared. The simulated results reveal that the ML estimators are more efficient than the OLS estimators in terms of the relative efficiency of the one-step-ahead forecast mean square error. A comparative study of the developed regression model with the Gaussian model revealed that this model gives a good fit to some data sets. The asymptotic properties of the maximum likelihood estimators are studied, and the large sample theory with respect to the regression coefficients is also presented. The utility of the proposed model is demonstrated with real data. This regression model is useful for analyzing data sets arising from agricultural experiments, portfolio management, space experiments, and a wide range of other practical problems. The calculations in this paper make considerable use of a combination of three popular statistical packages: Mathematica 9.0, Matlab R2012b, and SAS 9.0.
Summary of properties of two-parameter symmetric platykurtic distribution
Notation: NS(μ, σ²)
Support: y ∈ ℝ
Central moments: [expressions not recoverable]
Fisher information: [expressions not recoverable]
The authors are grateful to the Editor of JUAA, two anonymous referees, and SpringerOpen Copyediting Management for the helpful comments and suggestions on the earlier version of this article. The present version of the paper owes much to their precise and kind remarks.
- Seber GAF: Linear Regression Analysis. New York: Wiley; 1977.
- Montgomery DC, Peck EA, Vining GG: Introduction to Linear Regression Analysis. 3rd edition. New York: Wiley; 2001.
- Grob J: Linear Regression. In Lecture Notes in Statistics, vol. 175. Berlin: Springer; 2003.
- Sengupta D, Jammalamadaka SR: Estimation in the linear model. In Linear Models: An Integrated Approach. River Edge: World Scientific; 2003:93–131.
- Seber GAF, Lee AJ: Linear Regression Analysis. 2nd edition. New York: Wiley; 2003.
- Weisberg S: Applied Linear Regression. 3rd edition. New York: Wiley; 2005.
- Yan X, Su XG: Linear Regression Analysis: Theory and Computing. Hackensack: World Scientific; 2009.
- Srivastava MS: Methods of Multivariate Statistics. New York: Wiley; 2002.
- Fama EF: The behaviour of stock market prices. J. Bus. 1965, 38:34–105. doi:10.1086/294743
- Sutton J: Gibrat’s legacy. J. Econ. Lit. 1997, 35:40–59.
- Zeckhauser R, Thompson M: Linear regression with non-normal error terms. Rev. Econ. Stat. 1970, 52(3):280–286. doi:10.2307/1926296
- Zellner A: Bayesian and non-Bayesian analysis of the regression model with multivariate Student-t error terms. J. Am. Stat. Assoc. 1976, 71(354):400–405.
- Sutradhar BC, Ali MM: Estimation of the parameters of a regression model with a multivariate t error variable. Commun. Stat. Theory 1986, 15:429–450. doi:10.1080/03610928608829130
- Tiku ML, Wong WK, Bian G: Estimating parameters in autoregressive models in non-normal situations: symmetric innovations. Commun. Stat. Theory Methods 28(2):315–341.
- Tiku ML, Wong WK, Vaughan DC, Bian G: Time series models with non-normal situations: symmetric innovations. J. Time Ser. Anal. 2000, 2(5):571–596.
- Tiku ML, Islam MQ, Selcuk AS: Non-normal regression, II: symmetric distributions. Commun. Stat. Theory Methods 2001, 30(6):1021–1045. doi:10.1081/STA-100104348
- Sengupta D, Jammalamadaka SR: The symmetric non-normal case. In Linear Models: An Integrated Approach. River Edge: World Scientific; 2003:131–133.
- Liu M, Bozdogan H: Power exponential multiple regression model selection with ICOMP and genetic algorithms. Springer, Tokyo: Working paper; 2004.
- Wong WK, Bian G: Estimation of parameters in autoregressive models with asymmetric innovations. Stat. Prob. Lett. 2005, 71(1):61–70. doi:10.1016/j.spl.2004.10.022
- Wong WK, Bian G: Robust estimation of multiple regression model with asymmetric innovations and its applicability on asset pricing model. Euras. Rev. Econ. Financ. 2005, 1(4):7.
- Liu M, Bozdogan H: Multivariate regression models with power exponential random errors and subset selection using genetic algorithms with information complexity. Eur. J. Pure Appl. Math. 2008, 1(1):4–37.
- Soffritti G, Galimberti G: Multivariate linear regression with non-normal errors: a solution based on mixture models. Stat. Comput. 2011, 21(4):523–536. doi:10.1007/s11222-010-9190-3
- Jafari H, Hashemi R: Optimal designs in a simple linear regression with skew-normal distribution for error term. J. Appl. Math. 2011, 1(2):65–68.
- Jahan S, Khan A: Power of t-test for simple linear regression model with non-normal error distribution: a quantile function distribution approach. J. Sci. Res. 2012, 4(3):609–622.
- Bian G, McAleer M, Wong WK: Robust estimation and forecasting of the capital asset pricing model. Ann. Financ. Econ. 2013, in press.
- Srinivasa Rao K, Vijay Kumar CVSR, Lakshmi Narayana J: On a new symmetrical distribution. J. Indian Soc. Agric. Stat. 1997, 50(1):95–102.
- Seshashayee M, Srinivas Rao K, Satyanarayana CH, Srinivasa Rao P: Image segmentation based on a finite generalized symmetric platykurtic mixture model with K-means. Int. J. Comput. Sci. Issu. 2011, 8(3):2.
- Marsaglia G: Evaluating the normal distribution. J. Stat. Softw. 2004, 11(4):1–7.
- Gupta RC, Gupta RD: Proportional reversed hazard rate model and its applications. J. Stat. Plann. Inference 2007, 137(11):3525–3536. doi:10.1016/j.jspi.2007.03.029
- Greene W: Econometric Analysis. 5th edition. Upper Saddle River: Prentice-Hall; 2003.
- Clements MP, Hendry DF: An empirical study of seasonal unit roots in forecasting. Int. J. Forecast. 1997, 13(3):341–355. doi:10.1016/S0169-2070(97)00022-8
- Chiang TC, Qiao Z, Wong WK: New evidence on the relation between return volatility and trading volume. J. Forecast. 2010, 29(5):502–515.
- Ferreira JTAS, Steel MF: Bayesian multivariate regression analysis with a new class of skewed distributions. Statistics Research Report 419. University of Warwick: Department of Statistics; 2004.
- Lachos VH, Bolfarine H, Arellano-Valle RB, Montenegro LC: Likelihood based inference for multivariate skew-normal regression models. Commun. Stat. Theory Methods 2007, 36:1769–1786. doi:10.1080/03610920601126241
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.