Search published articles |
Showing 34 results for Regression
Javad Ahmadi, Volume 23, Issue 2 (3-2019)
Abstract
A simultaneous confidence band gives useful information on the reasonable range of the unknown regression model. In this note, when the predictor variables are constrained to a special ellipsoidal region, hyperbolic and constant-width confidence bands for a multiple linear regression model are compared under the minimum volume confidence set (MVCS) criterion. The size of one special angle that determines the size of the predictor variable region is used to find out which band is better than the other. When the angle, and consequently the size of the predictor variable region, is small, the constant-width band is better than the hyperbolic band. When the angle, and hence the size of the predictor variable region, is large, the hyperbolic band is considerably better than the constant-width band.
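For reference, in standard notation (hedged as background, not the paper's own derivation), the two competing bands over a restricted covariate region $\mathcal{X}$ typically take the forms
$$x'\hat{\beta} \;\pm\; c_{1}\,\hat{\sigma}\,\sqrt{x'(X'X)^{-1}x}, \qquad x \in \mathcal{X} \quad \text{(hyperbolic band)},$$
$$x'\hat{\beta} \;\pm\; c_{2}\,\hat{\sigma}, \qquad x \in \mathcal{X} \quad \text{(constant-width band)},$$
where $c_1$ and $c_2$ are chosen to achieve simultaneous coverage over $\mathcal{X}$; the MVCS criterion compares the volumes of the resulting confidence sets.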
Mrs Azam Rastin, Dr Mohmmadreza Faridrohani, Dr Amirabbas Momenan, Dr Fatemeh Eskandari, Dr Davood Khalili, Volume 23, Issue 2 (3-2019)
Abstract
Cardiovascular diseases (CVDs) are the leading cause of death worldwide. To build an appropriate model for determining the risk of CVD and predicting survival, a functional form relating the outcome variables to the input variables must be specified. In this paper, we propose a dimension reduction method using a general model, which includes many widely used survival models as special cases. Using an appropriate combination of dimension reduction and the Cox proportional hazards model, we obtain a method that is effective for survival prediction.
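A minimal sketch of this kind of pipeline, hedged as an illustration only: PCA stands in for the paper's dimension-reduction step, the data are synthetic, and the column names are assumptions.

```python
# Hedged sketch: a generic dimension-reduction step (PCA) feeding a Cox PH model.
# The paper's own reduction method may differ; data and column names are synthetic.
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA
from lifelines import CoxPHFitter

rng = np.random.default_rng(0)
n, p, k = 200, 30, 3                              # many covariates, reduce to k directions
X = rng.normal(size=(n, p))
time = rng.exponential(scale=np.exp(-X[:, 0]))    # synthetic survival times
event = rng.integers(0, 2, size=n)                # synthetic censoring indicator

scores = PCA(n_components=k).fit_transform(X)     # dimension reduction
df = pd.DataFrame(scores, columns=[f"z{i+1}" for i in range(k)])
df["time"], df["event"] = time, event

cph = CoxPHFitter()
cph.fit(df, duration_col="time", event_col="event")   # Cox PH on the reduced scores
print(cph.summary[["coef", "p"]])
```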
Seyedeh Mona Ehsani Jokandan, Behrouz Fathi Vajargah, Volume 24, Issue 2 (3-2020)
Abstract
In this paper, the difference between classical regression and fuzzy regression is discussed. In fuzzy regression, both non-fuzzy and fuzzy data can be used for modeling, while in classical regression only non-fuzzy data are used. The purpose of this study is to investigate fuzzy regression methods, including a least squares regression approach and a fuzzy linear least squares regression method based on fuzzy weight calculation, for non-fuzzy input and fuzzy output using symmetric triangular fuzzy numbers. In addition, reliability, confidence intervals, and a goodness-of-fit criterion are presented for choosing the optimal model. Finally, through examples of the behavior of the proposed methods, the optimality of the hybrid fuzzy least squares regression model is shown.
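As a rough illustration (not the paper's hybrid method), one common simplification of fuzzy least squares with crisp inputs and symmetric triangular fuzzy outputs fits the centers and the spreads separately; the data and the nonnegativity treatment of the spreads below are assumptions.

```python
# Minimal sketch of fuzzy least squares with crisp inputs and symmetric triangular
# fuzzy outputs (center m_i, spread s_i): fit the center line by OLS and the spread
# line by nonnegative least squares. Illustrative assumption, not the paper's method.
import numpy as np
from scipy.optimize import nnls

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
m = np.array([2.1, 3.9, 6.2, 7.8, 10.1])       # centers of the fuzzy outputs
s = np.array([0.3, 0.4, 0.5, 0.6, 0.8])        # spreads of the fuzzy outputs

X = np.column_stack([np.ones_like(x), x])
center_coef, *_ = np.linalg.lstsq(X, m, rcond=None)   # crisp regression for centers
spread_coef, _ = nnls(X, s)                            # spreads constrained nonnegative

print("center line:", center_coef)
print("spread line:", spread_coef)
```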
Akram Heidari Garmianaki, Mehrdad Niaparast, Volume 24, Issue 2 (3-2020)
Abstract
In the present era, the classification of data is one of the most important issues in various sciences for detecting and predicting events. In statistics, the traditional view of classification is based on classical methods and statistical models such as logistic regression. In this era of information explosion, we are often faced with data whose exact distribution cannot be determined; therefore, the use of data mining and machine learning methods, which do not require predetermined models, can be useful. In many countries, the exact identification of the type of groundwater resources is one of the important issues in the field of water science. In this paper, the results of classifying a groundwater-resources data set using regression, a neural network, and a support vector machine are compared. The results of these classifications showed that machine learning methods were effective in determining the exact type of springs.
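A minimal sketch of such a three-way comparison, on synthetic data (the groundwater data set itself is not available here); all settings are illustrative assumptions.

```python
# Hedged sketch: compare logistic regression, a neural network, and an SVM classifier.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=500, n_features=8, n_informative=5, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=1)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "neural network": MLPClassifier(hidden_layer_sizes=(16,), max_iter=2000, random_state=1),
    "support vector machine": SVC(kernel="rbf"),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    print(name, accuracy_score(y_te, model.predict(X_te)))
```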
, , Volume 24, Issue 2 (3-2020)
Abstract
The minimum density power divergence method provides a robust estimate when the dataset includes a number of outliers. In this study, we introduce and use the robust minimum density power divergence estimator to estimate the parameters of the linear regression model, and then, with some numerical examples of the linear regression model, we show the robustness of this estimator when the dataset includes a number of outliers.
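A hedged sketch of this idea for a normal linear model, minimizing the Basu et al. (1998) density power divergence objective numerically; the tuning constant alpha and the synthetic data with injected outliers are assumptions.

```python
# Hedged sketch of the minimum density power divergence (MDPD) estimator for a
# normal linear regression model. Tuning constant alpha and data are illustrative.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(2)
n = 100
x = rng.uniform(0, 10, n)
y = 1.0 + 2.0 * x + rng.normal(0, 1, n)
y[:5] += 25.0                                   # a few gross outliers

X = np.column_stack([np.ones(n), x])
alpha = 0.5                                     # robustness tuning parameter

def dpd_objective(theta):
    beta, log_sigma = theta[:2], theta[2]
    sigma = np.exp(log_sigma)
    f = norm.pdf(y, loc=X @ beta, scale=sigma)
    # closed-form integral of f^(1+alpha) for the normal density
    integral_term = (2 * np.pi * sigma**2) ** (-alpha / 2) / np.sqrt(1 + alpha)
    return integral_term - (1 + 1 / alpha) * np.mean(f ** alpha)

ols = np.linalg.lstsq(X, y, rcond=None)[0]
fit = minimize(dpd_objective, x0=np.array([*ols, 0.0]), method="Nelder-Mead")
print("OLS  :", ols)                            # pulled toward the outliers
print("MDPD :", fit.x[:2])                      # close to the true (1, 2)
```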
, , , Volume 24, Issue 2 (3-2020)
Abstract
Sometimes, in practice, data are functions of another variable; such data are called functional data. If the scalar response variable is categorical or discrete and the covariates are functional, a generalized functional linear model is used to analyze this type of data. In this paper, a truncated generalized functional linear model is studied and a maximum likelihood approach is used to estimate the model parameters. Finally, the model and methods presented are implemented in a simulation study and two practical examples.
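A hedged sketch of the truncation idea: project each observed curve onto a few functional principal components and fit a logistic (generalized linear) model to the truncated scores. The truncation level, the synthetic curves, and the use of a near-unpenalized logistic fit as a stand-in for maximum likelihood are all assumptions.

```python
# Hedged sketch of a truncated generalized functional linear model via FPC scores.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
n, m, K = 150, 100, 4                        # n curves observed on m grid points, truncate at K
t = np.linspace(0, 1, m)
curves = (rng.normal(size=(n, 1)) * np.sin(2 * np.pi * t)
          + rng.normal(size=(n, 1)) * np.cos(2 * np.pi * t)
          + rng.normal(scale=0.1, size=(n, m)))
y = (curves @ np.sin(np.pi * t) / m + rng.normal(scale=0.2, size=n) > 0).astype(int)

scores = PCA(n_components=K).fit_transform(curves)            # truncated FPC scores
glm = LogisticRegression(C=1e6, max_iter=1000).fit(scores, y) # near-unpenalized ~ MLE
print("estimated coefficients on the first", K, "components:", glm.coef_.ravel())
```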
Miss Zahra Eslami, Miss Mina Norouzirad, Mr Mohammad Arashi, Volume 25, Issue 1 (1-2021)
Abstract
The proportional hazards Cox regression models play a key role in analyzing censored survival data. We use penalized methods in high-dimensional scenarios to achieve more efficient models. This article reviews penalized Cox regression for some frequently used penalty functions. Analysis of a medical data set, namely "mgus2", confirms that penalized Cox regression performs better than the ordinary Cox regression model. Among all penalty functions, LASSO provides the best fit.
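A hedged sketch of a LASSO-penalized Cox fit with lifelines. The real analysis uses the "mgus2" data shipped with R's survival package; here a small synthetic frame with mgus2-like column names stands in so the snippet runs on its own, and the penalty strength is an assumption.

```python
# Hedged sketch of LASSO-penalized Cox regression (l1_ratio=1.0) using lifelines.
import numpy as np
import pandas as pd
from lifelines import CoxPHFitter

rng = np.random.default_rng(1)
n = 300
df = pd.DataFrame({
    "age": rng.normal(70, 10, n),
    "sex": rng.integers(0, 2, n),
    "hgb": rng.normal(12, 2, n),
    "creat": rng.normal(1.2, 0.4, n),
    "mspike": rng.normal(1.5, 0.5, n),
})
df["futime"] = rng.exponential(scale=100 * np.exp(-0.02 * (df["age"] - 70)))
df["death"] = rng.integers(0, 2, n)

# l1_ratio=1.0 gives a pure LASSO penalty; `penalizer` sets its strength.
cph = CoxPHFitter(penalizer=0.1, l1_ratio=1.0)
cph.fit(df, duration_col="futime", event_col="death")
print(cph.summary["coef"])            # shrunken coefficients, several driven toward zero
```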
Alireza Rezaee, Mojtaba Ganjali, Ehsan Bahrami, Volume 25, Issue 1 (1-2021)
Abstract
Nonresponse is a source of error in survey results, and national statistical organizations are always looking for ways to control and reduce it. Predicting which sampling units will not respond, before the survey is conducted, is one of the solutions that can help greatly in reducing and treating survey nonresponse. Recent advances in technology and the facilitation of complex calculations have made it possible to apply machine learning methods, such as regression and classification trees or support vector machines, to many problems, including predicting the nonresponse of sampling units in official statistics. In this article, while reviewing the above methods, we predict nonresponding sampling units in an establishment survey using these methods, and we show that a combination of the methods predicts nonresponse more accurately than any of the individual methods.
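A hedged sketch of this kind of comparison: a classification tree, an SVM, and a simple combination (soft voting) predicting unit nonresponse. The features are synthetic stand-ins for the establishment-survey frame variables.

```python
# Hedged sketch: predict nonresponse with a tree, an SVM, and their combination.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.svm import SVC
from sklearn.ensemble import VotingClassifier
from sklearn.metrics import accuracy_score

X, y = make_classification(n_samples=1000, n_features=10, weights=[0.7, 0.3], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

tree = DecisionTreeClassifier(max_depth=5, random_state=0)
svm = SVC(kernel="rbf", probability=True, random_state=0)
combo = VotingClassifier([("tree", tree), ("svm", svm)], voting="soft")

for name, model in [("tree", tree), ("svm", svm), ("combination", combo)]:
    model.fit(X_tr, y_tr)
    print(name, accuracy_score(y_te, model.predict(X_te)))
```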
Ehsan Bahrami Samani, Samira Bahramian, Volume 26, Issue 1 (12-2021)
Abstract
The occurrence of lifetime data is commonly encountered in various kinds of research, including surveys, clinical trials, and epidemiological studies. Recently there has been extensive methodological research on analyzing lifetime data. However, because usually little information from the data is available for correct estimation, the inferences might be sensitive to untestable assumptions, which calls for a sensitivity analysis to be performed.
In this paper, we describe how to evaluate the effect of perturbations on the responses of the log-beta Weibull regression model. We also review and extend the application and interpretation of influence analysis methods for censored data. A full likelihood-based approach that yields maximum likelihood estimates of the model parameters is used. Some simulation studies are conducted to evaluate the performance of the proposed indices in detecting the sensitivity of key model parameters. We illustrate the methods by analyzing the cancer data.
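As a rough, hedged illustration of one common influence diagnostic (case deletion), a Weibull AFT model from lifelines stands in for the paper's log-beta Weibull model; the data and columns are synthetic assumptions.

```python
# Hedged sketch of a case-deletion influence diagnostic for a parametric survival model.
import numpy as np
import pandas as pd
from lifelines import WeibullAFTFitter

rng = np.random.default_rng(4)
n = 80
df = pd.DataFrame({"x": rng.normal(size=n)})
df["time"] = rng.weibull(1.5, size=n) * np.exp(0.5 * df["x"])
df["event"] = rng.integers(0, 2, size=n)

full = WeibullAFTFitter().fit(df, duration_col="time", event_col="event")

# Refit with each observation deleted and record the shift in the parameter estimates.
influence = []
for i in range(n):
    reduced = WeibullAFTFitter().fit(df.drop(index=i), duration_col="time", event_col="event")
    influence.append(np.linalg.norm(full.params_.values - reduced.params_.values))

print("most influential cases:", np.argsort(influence)[-5:])
```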
Mahsa Markani, Manije Sanei Tabas, Habib Naderi, Hamed Ahmadzadeh, Javad Jamalzadeh, Volume 26, Issue 2 (3-2022)
Abstract
When working with regression data, situations arise in which the data impose limitations on us; in other words, the data do not satisfy a set of requirements. The generalized maximum entropy method is able to estimate the parameters of a regression model without imposing any conditions on the error probability distribution. This method is capable even in cases where the problem is ill-posed (for example, when the sample size is very small, or the data exhibit high collinearity). Therefore, the purpose of this study is to estimate the parameters of the logistic regression model using generalized maximum entropy. A random sample of bank customers was collected, and the parameters of the binary logistic regression model were estimated using two methods: generalized maximum entropy (GME) and maximum likelihood (ML). Finally, the two methods were compared. Based on the MSE criterion for predicting customer demand for opening a long-term account, obtained from logistic regression using both the GME and ML methods, the GME method was more accurate than the ML method.
Dr Mahdi Roozbeh, Mr Arta Rouhi, Fatemeh Jahadi, Saeed Zalzadeh, Volume 26, Issue 2 (3-2022)
Abstract
In this research, the aim is to assess and analyze a method to predict the stock market. It is not easy to predict the capital market, due to its high dependence on politics, but through data modeling it is possible, to some extent, to predict the stock market over a long period of time. In this regard, using semi-parametric regression models and support vector regression with different kernels, measuring the prediction errors for one stock based on daily fluctuations, and comparing the methods with the root mean squared error and mean absolute percentage error criteria, the support vector regression model with radial kernel provided the most appropriate fit to the real stock market data, with an error equal to 0.1.
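A hedged sketch of such a kernel comparison, scored by RMSE and MAPE on a held-out set; the price series is synthetic, not the paper's daily stock data.

```python
# Hedged sketch: support vector regression with several kernels, scored by RMSE and MAPE.
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, mean_absolute_percentage_error

rng = np.random.default_rng(5)
t = np.arange(500).reshape(-1, 1)
price = 100 + 0.05 * t.ravel() + 5 * np.sin(t.ravel() / 20) + rng.normal(0, 1, 500)
X_tr, X_te, y_tr, y_te = train_test_split(t, price, test_size=0.2, shuffle=False)

for kernel in ["linear", "poly", "sigmoid", "rbf"]:
    pred = SVR(kernel=kernel, C=10.0).fit(X_tr, y_tr).predict(X_te)
    rmse = np.sqrt(mean_squared_error(y_te, pred))
    mape = mean_absolute_percentage_error(y_te, pred)
    print(f"{kernel}: RMSE={rmse:.2f}, MAPE={mape:.3f}")
```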
Dr Majid Jafari Khaledi, Mr Hassan Mirzavand, Volume 26, Issue 2 (3-2022)
Abstract
To make statistical inferences about regression model parameters, it is necessary to assume a specific distribution for the random error term. A basic assumption in a linear regression model is that the random error term follows a normal distribution. However, in some statistical research, data simultaneously display skewness and bimodality, and the normality assumption is then violated. A common approach to avoiding this problem is to use a mixture of skew-normal distributions, but such models involve many parameters, which makes it difficult to fit the models to the data; moreover, these models face a non-identifiability issue.
In this situation, a suitable solution is to use flexible distributions that can accommodate the skewness and bimodality observed in the data distribution. In this direction, various methods based on extensions of the skew-normal distribution have been proposed in recent years. In this paper, these methods are used to introduce regression models that are more flexible than those based on the skew-normal distribution and on a mixture of two skew-normal distributions. Their performance is compared using a simulation example. The methodology is then illustrated with a practical example involving a dataset on horses.
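For background (the standard Azzalini form, not the paper's specific extensions), the skew-normal density that these flexible families build on is
$$f(y;\mu,\sigma,\lambda) = \frac{2}{\sigma}\,\phi\!\left(\frac{y-\mu}{\sigma}\right)\Phi\!\left(\lambda\,\frac{y-\mu}{\sigma}\right), \qquad y\in\mathbb{R},$$
where $\phi$ and $\Phi$ are the standard normal density and distribution function, and $\lambda$ controls the skewness ($\lambda = 0$ recovers the normal model).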
Mr Arta Roohi, Ms Fatemeh Jahadi, Dr Mahdi Roozbeh, Volume 27, Issue 1 (3-2023)
Abstract
The most popular technique for functional data analysis is the functional principal component approach, which is also an important tool for dimension reduction. Support vector regression is a branch of machine learning and a strong tool for data analysis. In this paper, using functional principal component regression based on second-derivative, ridge, and lasso penalties, together with support vector regression with four kernels (linear, polynomial, sigmoid, and radial), the dependent variable was modeled on the predictor variables in spectroscopic data. According to the obtained results, based on the proposed goodness-of-fit criteria, support vector regression with a linear kernel and an error equal to $0.2$ provided the most appropriate fit to the data set.
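A hedged sketch of feeding functional principal component scores into a linear-kernel support vector regression; the spectroscopic curves are synthetic stand-ins and the number of components is an illustrative choice.

```python
# Hedged sketch: FPC scores of synthetic spectra fed to linear-kernel SVR.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVR
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(6)
n, m = 120, 200                                   # n spectra observed on m wavelengths
wavelengths = np.linspace(0, 1, m)
spectra = rng.normal(size=(n, 1)) * np.exp(-((wavelengths - 0.5) ** 2) / 0.02) \
          + rng.normal(scale=0.05, size=(n, m))
y = spectra.mean(axis=1) * 10 + rng.normal(scale=0.1, size=n)   # scalar response

scores = PCA(n_components=5).fit_transform(spectra)             # FPC scores
cv_error = -cross_val_score(SVR(kernel="linear"), scores, y,
                            scoring="neg_root_mean_squared_error", cv=5).mean()
print("cross-validated RMSE with linear-kernel SVR:", round(cv_error, 3))
```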
Ms. Zahra Jafarian Moorakani, Dr. Heydar Ali Mardani-Fard, Volume 27, Issue 1 (3-2023)
Abstract
The ordinary linear regression model is $Y = X\beta + \varepsilon$ and the estimator of the parameter $\beta$ is $\hat{\beta} = (X'X)^{-1}X'Y$. However, when this estimator is used in practice, problems such as variable selection, collinearity, high dimensionality, dimension reduction, and measurement error may arise, which make it difficult to use the above estimator. In most of these cases, the main problem is the singularity of the matrix $X'X$. Many solutions have been proposed to overcome these problems. In this article, while reviewing these problems, we present a set of common solutions as well as some special and advanced methods (which are less commonly favored, but still have the potential to solve these problems intelligently).
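As one standard example of such a remedy (background only; the article reviews several), the ridge estimator replaces the OLS estimator with
$$\hat{\beta}_{\mathrm{ridge}}(\lambda) = (X'X + \lambda I)^{-1}X'Y, \qquad \lambda > 0,$$
which is always well defined because $X'X + \lambda I$ is positive definite even when $X'X$ is singular.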
Dr Mahdi Roozbeh, , , Volume 27, Issue 2 (3-2023)
Abstract
Functional data analysis develops statistical approaches for data sets that are essentially functional and continuous, and because these functions belong to infinite-dimensional spaces, using conventional methods of classical statistics to analyze such data sets is challenging.
The most popular technique for analyzing such data is the functional principal components approach, which is an important tool for dimension reduction. In this research, using functional principal component regression based on second-derivative, ridge, and lasso penalties, the Canadian climate and spectrometric data sets are analyzed. To obtain the optimal values of the penalty parameter in the proposed methods, generalized cross-validation, a valid and efficient criterion, is applied.
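A hedged sketch of the generalized cross-validation criterion used to pick the penalty parameter, shown here for a plain ridge-type smoother (the paper applies the same criterion inside functional principal component regression); data and grid are assumptions.

```python
# Hedged sketch: choose a ridge penalty by generalized cross-validation (GCV).
import numpy as np

rng = np.random.default_rng(7)
n, p = 100, 20
X = rng.normal(size=(n, p))
y = X[:, 0] - 2 * X[:, 1] + rng.normal(size=n)

def gcv(lam):
    H = X @ np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)   # hat matrix of the ridge fit
    resid = y - H @ y
    return (resid @ resid / n) / (1 - np.trace(H) / n) ** 2

grid = np.logspace(-3, 3, 50)
best = grid[np.argmin([gcv(l) for l in grid])]
print("penalty parameter chosen by GCV:", best)
```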
Seyyed Roohollah Roozegar, Amir Reza Mahmoodi, Volume 27, Issue 2 (3-2023)
Abstract
Many regression estimation techniques are strongly affected by outlying data and can incur large estimation errors. In recent years, robust methods have been developed to address this issue. The minimum density power divergence estimator is an estimation method based on the minimum distance between two density functions, which provides a robust estimate in situations where the data contain a number of outliers. In this research, we present the robust minimum density power divergence method for estimating the parameters of the Poisson regression model, which can produce robust estimators with the least loss in efficiency. We also investigate the performance of the proposed estimators using a real example.
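For reference (standard form, given as background rather than the paper's exact notation), the density power divergence between the data-generating density $g$ and the model $f_\theta$, with tuning parameter $\alpha > 0$, is
$$d_{\alpha}(g, f_{\theta}) = \int f_{\theta}^{1+\alpha} - \left(1 + \tfrac{1}{\alpha}\right)\int f_{\theta}^{\alpha}\, g + \tfrac{1}{\alpha}\int g^{1+\alpha},$$
and the MDPD estimator minimizes its empirical version
$$H_n(\theta) = \int f_{\theta}^{1+\alpha} - \left(1 + \tfrac{1}{\alpha}\right)\frac{1}{n}\sum_{i=1}^{n} f_{\theta}(y_i)^{\alpha};$$
small $\alpha$ keeps the efficiency close to maximum likelihood, while larger $\alpha$ gives more robustness. For count models such as Poisson regression, the integrals become sums over the support.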
Dr Manije Sanei Tabass, Volume 27, Issue 2 (3-2023)
Abstract
Regression analysis using the method of least squares requires certain basic assumptions to hold. One of the major problems that regression analysis faces in this setting is collinearity among the regression variables. Many methods have been introduced to solve the problems caused by collinearity; one of these methods is ridge regression. In this article, a new estimate of the ridge parameter based on generalized maximum Tsallis entropy is presented, which we call the generalized maximum Tsallis entropy ridge estimator. This estimator is calculated for the Portland cement dataset, which exhibits strong collinearity and for which different estimators have been presented since 1332, and we compare the generalized maximum Tsallis entropy ridge estimator, the generalized maximum entropy ridge estimator, and the least squares estimator.
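For background (standard definition only; the paper's ridge construction is not reproduced here), the Tsallis entropy of order $q$ of a probability vector $p$ is
$$S_{q}(p) = \frac{1}{q-1}\left(1 - \sum_{k} p_{k}^{\,q}\right), \qquad q \neq 1,$$
which recovers the Shannon entropy $-\sum_k p_k \ln p_k$ in the limit $q \to 1$.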
Dr Ehsan Bahrami Samani, Ms Kiyana Javidi Anaraki, Volume 28, Issue 1 (9-2023)
Abstract
Given the limited energy resources globally, energy optimization is crucial, and a significant portion of this energy is consumed by buildings. Therefore, the aim of this research is to explore the factors that simultaneously affect the heating and cooling of buildings. In the current research, 768 different residential buildings simulated with the Ecotect software were investigated. A joint regression model and exploratory data analysis methods were used to identify the factors influencing the heating and cooling of buildings. Based on variables such as relative compactness, overall height, surface area, and roof area of the buildings, a new variable called "type" (building model) was introduced and shown to be one of the strongest factors affecting the heating and cooling of buildings; this variable is related to the shape of the building. In the joint regression model, it is assumed that the responses follow a multivariate normal distribution. This model is then compared with separate regression models (which assume no correlation between the responses) using Akaike's information criterion and the deviance information criterion, both of which point to the superiority of the joint regression model. The model parameters are estimated by the maximum likelihood method; the Akaike criterion of the joint model decreases by 0.0072% compared to the separate models, and the deviance information criterion equals 0.001736%, so that in comparison with the chi-square distribution the null hypothesis is rejected in the test of model superiority, which again points to the superiority of the joint model.
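A hedged sketch of this joint-versus-separate comparison: a bivariate normal joint regression fitted by maximum likelihood against two separate univariate fits, compared via AIC. The data and the correlation between the two responses are synthetic assumptions.

```python
# Hedged sketch: joint (bivariate normal) regression vs. separate models, compared by AIC.
import numpy as np
from scipy.stats import multivariate_normal, norm

rng = np.random.default_rng(8)
n = 300
x = rng.normal(size=n)
X = np.column_stack([np.ones(n), x])
E = rng.multivariate_normal([0, 0], [[1.0, 0.7], [0.7, 1.0]], size=n)   # correlated errors
Y = np.column_stack([1 + 2 * x, -1 + 0.5 * x]) + E                       # two correlated responses

# MLE of the joint model: equation-wise least squares coefficients, MLE residual covariance.
B = np.linalg.lstsq(X, Y, rcond=None)[0]
R = Y - X @ B
Sigma = R.T @ R / n
ll_joint = multivariate_normal(mean=[0, 0], cov=Sigma).logpdf(R).sum()
aic_joint = -2 * ll_joint + 2 * (B.size + 3)      # 4 coefficients + 3 covariance parameters

# Separate models ignore the correlation between the two responses.
ll_sep = sum(norm(0, np.sqrt(Sigma[j, j])).logpdf(R[:, j]).sum() for j in range(2))
aic_sep = -2 * ll_sep + 2 * (B.size + 2)          # 4 coefficients + 2 separate variances
print("AIC joint:", round(aic_joint, 1), " AIC separate:", round(aic_sep, 1))
```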
Maryam Maleki, Hamid Reza Nili-Sani, Dr. M.gh. Akari, Volume 28, Issue 2 (3-2024)
Abstract
In this article, logistic regression models are studied in which the response variables take two (or more) values and the explanatory (predictor or independent) variables are ordinary variables, but the errors have a vague (fuzzy) nature in addition to being random. On this basis, we formulate the proposed model and determine the estimates of the coefficients for the case of a single explanatory variable using the method of least squares. Finally, we illustrate the results with an example.
Dr Mahdieh Bayati, Volume 28, Issue 2 (3-2024)
Abstract
We live in the information age, constantly surrounded by vast amounts of data from the world around us. To utilize this information effectively, it must be mathematically expressed and analyzed using statistics.
Statistics play a crucial role in various fields, including text mining, which has recently garnered significant attention. Text mining is a research method used to identify patterns in texts, which can be in written, spoken, or visual forms.
The applications of text mining are diverse, including text classification, clustering, web mining, sentiment analysis, and more. Text mining techniques are utilized to assign numerical values to textual data, enabling statistical analysis.
Since working with data requires a solid foundation in statistics, statistical tools are employed in text analysis to make predictions, such as forecasting changes in stock prices or currency exchange rates based on current textual data.
By leveraging statistical methods, text mining can uncover, confirm, or refute the truths hidden within textual content. Today, this topic is widely used in machine learning. This paper aims to provide a basic understanding of statistical tools in text mining and demonstrates how these powerful tools can be used to analyze and interpret events.
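A hedged sketch of how text is "assigned numerical values" for statistical analysis: TF-IDF vectorization followed by a simple classifier. The tiny corpus and its labels are illustrative assumptions, not data from the article.

```python
# Hedged sketch: turn texts into numbers (TF-IDF) and fit a simple statistical model.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

texts = [
    "shares rallied after strong quarterly earnings",
    "the currency weakened amid political uncertainty",
    "stock prices climbed on upbeat economic data",
    "exchange rate fell as markets turned cautious",
]
labels = [1, 0, 1, 0]                       # 1 = positive market tone, 0 = negative

vec = TfidfVectorizer()
X = vec.fit_transform(texts)                # numerical representation of the texts
model = LogisticRegression().fit(X, labels)
print(model.predict(vec.transform(["earnings beat expectations and shares rose"])))
```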