Search published articles
Showing 34 results for Regression
Miss Zahra Eslami, Miss Mina Norouzirad, Mr Mohammad Arashi, Volume 25, Issue 1 (1-2021)
Abstract
The Cox proportional hazards regression model plays a key role in analyzing censored survival data. We use penalized methods in high-dimensional scenarios to achieve more efficient models. This article reviews penalized Cox regression for some frequently used penalty functions. Analysis of a medical data set, "mgus2", confirms that penalized Cox regression performs better than the ordinary Cox regression model. Among all penalty functions, LASSO provides the best fit.
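As a rough illustration of the kind of penalized fit compared in this abstract, the sketch below fits a LASSO-type penalized Cox model with the Python lifelines package. The rossi recidivism data shipped with lifelines stands in for the "mgus2" data, and the penalty value 0.1 is an arbitrary assumption, not the authors' choice.

```python
# Minimal sketch: LASSO-penalized Cox regression with lifelines
# (rossi data used as a stand-in for the "mgus2" data of the abstract).
from lifelines import CoxPHFitter
from lifelines.datasets import load_rossi

df = load_rossi()  # duration 'week', event 'arrest', plus covariates

# l1_ratio=1.0 gives a pure LASSO penalty; penalizer=0.1 is an arbitrary choice
cph = CoxPHFitter(penalizer=0.1, l1_ratio=1.0)
cph.fit(df, duration_col="week", event_col="arrest")

cph.print_summary()              # coefficients shrunk toward zero by the penalty
print(cph.concordance_index_)    # rough measure of predictive fit
```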
Alireza Rezaee, Mojtaba Ganjali, Ehsan Bahrami, Volume 25, Issue 1 (1-2021)
Abstract
Nonresponse is a source of error in survey results, and national statistical organizations are always looking for ways to control and reduce it. Predicting which sampling units will not respond before the survey is conducted is one solution that can help considerably in reducing and treating survey nonresponse. Recent advances in technology and the ease of performing complex computations have made it possible to apply machine learning methods, such as regression and classification trees or support vector machines, to many problems, including predicting the nonresponse of sampling units. In this article, while reviewing these methods, we use them to predict nonresponding sampling units in an establishment survey and show that a combination of the methods predicts nonresponse more accurately than any single method.
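A minimal sketch of the idea described above, using scikit-learn: a classification tree and an SVM are fitted to predict nonresponse, and a simple soft-voting combination of the two is compared with each single model. Synthetic data stands in for the establishment survey, and the specific combination rule is an illustrative assumption, not necessarily the one used in the article.

```python
# Minimal sketch: predicting unit nonresponse with a tree, an SVM, and a
# simple combination of the two (synthetic data, y = 1 means nonrespondent).
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

tree = DecisionTreeClassifier(max_depth=5, random_state=0)
svm = SVC(probability=True, random_state=0)
combo = VotingClassifier([("tree", tree), ("svm", svm)], voting="soft")

for name, clf in [("tree", tree), ("svm", svm), ("combined", combo)]:
    clf.fit(X_tr, y_tr)
    print(name, clf.score(X_te, y_te))   # accuracy of predicted nonresponse
```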
Ehsan Bahrami Samani, Samira Bahramian, Volume 26, Issue 1 (12-2021)
Abstract
Lifetime data are commonly encountered in various types of research, including surveys, clinical trials, and epidemiological studies, and there has recently been extensive methodological research on analyzing such data. However, because the data usually carry little information for correct estimation, the inferences may be sensitive to untestable assumptions, which calls for a sensitivity analysis.
In this paper, we describe how to evaluate the effect of perturbations on the responses of the log-beta Weibull regression model. We also review and extend the application and interpretation of influence analysis methods for censored data. A full likelihood-based approach that yields maximum likelihood estimates of the model parameters is used. Simulation studies are conducted to evaluate the performance of the proposed indices in detecting the sensitivity of key model parameters. We illustrate the methods by analyzing cancer data.
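To make the likelihood-based machinery concrete, the sketch below fits a plain Weibull regression with right censoring by maximum likelihood; this is a simplified stand-in for the log-beta Weibull model of the abstract, with synthetic data and an arbitrary true parameter setting.

```python
# Minimal sketch: maximum-likelihood Weibull regression with right censoring,
# used here only as a simplified stand-in for the log-beta Weibull model.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
n = 300
x = rng.normal(size=n)
scale = np.exp(0.5 + 0.8 * x)                 # true scale depends on x
t = scale * rng.weibull(1.5, size=n)          # true shape 1.5
c = rng.uniform(0, 4, size=n)                 # censoring times
time = np.minimum(t, c)
event = (t <= c).astype(float)                # 1 = observed, 0 = censored

def negloglik(par):
    b0, b1, log_k = par
    k = np.exp(log_k)
    lam = np.exp(b0 + b1 * x)
    z = (time / lam) ** k
    logf = np.log(k) - k * np.log(lam) + (k - 1) * np.log(time) - z
    logS = -z
    return -np.sum(event * logf + (1 - event) * logS)

fit = minimize(negloglik, x0=np.zeros(3), method="BFGS")
print(fit.x)   # estimated intercept, slope, log-shape
```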
Mahsa Markani, Manije Sanei Tabas, Habib Naderi, Hamed Ahmadzadeh, Javad Jamalzadeh, Volume 26, Issue 2 (3-2022)
Abstract
When working with regression data, situations arise in which the data constrain us; in other words, the data do not meet a set of requirements. The generalized maximum entropy method can estimate the parameters of a regression model without imposing any conditions on the error probability distribution. It works even when the problem is ill-posed (for example, when the sample size is very small or the data exhibit strong collinearity). The purpose of this study is therefore to estimate the parameters of a logistic regression model using generalized maximum entropy. A random sample of bank customers was collected, and the parameters of a binary logistic regression model were estimated using two methods: maximum generalized entropy (GME) and maximum likelihood (ML). Finally, the two methods were compared. Based on the MSE criterion for predicting customer demand for opening long-term accounts, the GME method was more accurate than the ML method.
Dr Mahdi Roozbeh, Mr Arta Rouhi, Fatemeh Jahadi, Saeed Zalzadeh, Volume 26, Issue 2 (3-2022)
Abstract
In this research, the aim is to assess and analyze a method to predict the stock market. It is not easy to predict the capital market because of its strong dependence on politics; nevertheless, through data modeling it is possible, to some extent, to predict the stock market over the long term. In this regard, using semi-parametric regression models and support vector regression with different kernels, measuring prediction errors for a single stock based on daily fluctuations, and comparing the methods with the root mean squared error and mean absolute percentage error criteria, the support vector regression model with a radial kernel and error equal to 0.1 provided the most appropriate fit to the real stock market data.
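The sketch below shows the kind of comparison the abstract reports: a support vector regression with a radial (RBF) kernel is fitted and evaluated with RMSE and MAPE. The data are synthetic, standing in for the daily stock series, and epsilon=0.1 merely mirrors the reported error value rather than reproducing the authors' tuning.

```python
# Minimal sketch: SVR with a radial kernel, evaluated by RMSE and MAPE
# (synthetic data in place of the real daily stock series).
import numpy as np
from sklearn.metrics import mean_absolute_percentage_error, mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(500, 1))
y = np.sin(X.ravel()) + 0.1 * rng.normal(size=500)   # stand-in for daily prices

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
svr = SVR(kernel="rbf", epsilon=0.1)                 # epsilon mirrors the reported 0.1
svr.fit(X_tr, y_tr)
pred = svr.predict(X_te)

rmse = np.sqrt(mean_squared_error(y_te, pred))
mape = mean_absolute_percentage_error(y_te, pred)
print(rmse, mape)
```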
Dr Majid Jafari Khaledi, Mr Hassan Mirzavand, Volume 26, Issue 2 (3-2022)
Abstract
To make statistical inferences about regression model parameters, it is necessary to assume a specific distribution for the random error term. A basic assumption in a linear regression model is that the random error term follows a normal distribution. However, in some statistical research, data simultaneously display skewness and bimodality, and the normality assumption is violated. A common approach to avoiding this problem is to use a mixture of skew-normal distributions. But such models involve many parameters, which makes it difficult to fit them to the data; moreover, these models face a non-identifiability issue.
In this situation, a suitable solution is to use flexible distributions that can capture the skewness and bimodality observed in the data. In this direction, various methods based on extensions of the skew-normal distribution have been proposed in recent years. In this paper, these methods are used to introduce regression models that are more flexible than those based on the skew-normal distribution or a mixture of two skew-normal distributions. Their performance is compared using a simulation example. The methodology is then illustrated with a practical example involving a dataset on horses.
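For reference, the skew-normal density that these extensions build on (a standard result, not specific to this article) is $f(x;\mu,\sigma,\alpha)=\frac{2}{\sigma}\,\phi\!\left(\frac{x-\mu}{\sigma}\right)\Phi\!\left(\alpha\,\frac{x-\mu}{\sigma}\right)$, where $\phi$ and $\Phi$ are the standard normal density and distribution function and $\alpha$ controls the skewness. A two-component mixture $g(x)=\pi f(x;\mu_1,\sigma_1,\alpha_1)+(1-\pi)f(x;\mu_2,\sigma_2,\alpha_2)$, $0<\pi<1$, can additionally capture bimodality, at the cost of the extra parameters and identifiability issues mentioned above.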
Mr Arta Roohi, Ms Fatemeh Jahadi, Dr Mahdi Roozbeh, Volume 27, Issue 1 (3-2023)
Abstract
The most popular technique for functional data analysis is the functional principal component approach, which is also an important tool for dimension reduction. Support vector regression is a branch of machine learning and a strong tool for data analysis. In this paper, the dependent variable was modeled on the predictor variables in spectroscopic data using functional principal component regression based on second-derivative, ridge, and lasso penalties, and support vector regression with four kernels (linear, polynomial, sigmoid, and radial). According to the results, based on the proposed goodness-of-fit criteria, support vector regression with a linear kernel and error equal to $0.2$ provided the most appropriate fit to the data set.
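A simplified sketch of this pipeline is given below: curves are reduced to principal component scores and a support vector regression is fitted with each of the four kernels named in the abstract. Synthetic curves stand in for the spectroscopic data, the second-derivative roughness penalty of the article is omitted, and epsilon=0.2 only mirrors the reported error value.

```python
# Minimal sketch: principal-component dimension reduction followed by SVR
# with four kernels (synthetic curves stand in for the spectroscopic data).
import numpy as np
from sklearn.decomposition import PCA
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

rng = np.random.default_rng(0)
grid = np.linspace(0, 1, 100)                 # "wavelengths"
scores = rng.normal(size=(200, 3))
basis = np.vstack([np.sin(np.pi * grid), np.cos(np.pi * grid), grid])
X = scores @ basis                            # 200 synthetic curves
y = 2 * scores[:, 0] - scores[:, 1] + 0.1 * rng.normal(size=200)

Z = PCA(n_components=3).fit_transform(X)      # principal component scores
Z_tr, Z_te, y_tr, y_te = train_test_split(Z, y, random_state=0)

for kernel in ["linear", "poly", "sigmoid", "rbf"]:   # rbf = radial kernel
    svr = SVR(kernel=kernel, epsilon=0.2).fit(Z_tr, y_tr)
    print(kernel, mean_squared_error(y_te, svr.predict(Z_te)))
```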
Ms. Zahra Jafarian Moorakani, Dr. Heydar Ali Mardani-Fard, Volume 27, Issue 1 (3-2023)
Abstract
The ordinary linear regression model is $Y=X\beta+\varepsilon$, and the estimator of the parameter $\beta$ is $\hat\beta=(X'X)^{-1}X'Y$. However, when using this estimator in practice, problems may arise, such as variable selection, collinearity, high dimensionality, dimension reduction, and measurement error, which make it difficult to use the above estimator. In most of these cases, the main problem is the singularity of the matrix $X'X$. Many solutions have been proposed. In this article, while reviewing these problems, we present a set of common solutions as well as some special and advanced methods (which are less well known, but still have the potential to solve these problems intelligently).
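A small numerical illustration of the singularity problem and of the ridge-type remedy that is among the common solutions reviewed here; the data are synthetic and the ridge constant is an arbitrary assumption.

```python
# Minimal sketch: when X'X is nearly singular due to collinearity, the OLS
# solution is unstable, while a ridge-type estimator (X'X + kI)^{-1} X'y
# remains well defined. Synthetic data.
import numpy as np

rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = x1 + 1e-6 * rng.normal(size=n)            # almost perfectly collinear with x1
X = np.column_stack([np.ones(n), x1, x2])
y = 1 + 2 * x1 + rng.normal(size=n)

XtX = X.T @ X
print(np.linalg.cond(XtX))                     # huge condition number: near-singular

beta_ols = np.linalg.solve(XtX, X.T @ y)       # numerically unstable
k = 0.1                                        # ridge constant (arbitrary choice)
beta_ridge = np.linalg.solve(XtX + k * np.eye(3), X.T @ y)
print(beta_ols, beta_ridge)
```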
Dr Mahdi Roozbeh, Volume 27, Issue 2 (3-2023)
Abstract
Functional data analysis is used to develop statistical approaches for data sets that are essentially functional and continuous; because these functions belong to infinite-dimensional spaces, analyzing such data sets with conventional methods of classical statistics is challenging.
The most popular technique for statistical analysis of such data is the functional principal components approach, which is an important tool for dimension reduction. In this research, the Canadian climate and spectrometric data sets are analyzed using functional principal component regression based on second-derivative, ridge, and lasso penalties. To obtain the optimal values of the penalty parameter in the proposed methods, generalized cross-validation, a valid and efficient criterion, is applied.
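As a point of reference, one common form of the generalized cross-validation criterion for choosing the penalty parameter $\lambda$ (a standard definition, not necessarily the exact variant used by the authors) is $$\mathrm{GCV}(\lambda)=\frac{n^{-1}\left\|(I-H_{\lambda})y\right\|^{2}}{\left(n^{-1}\,\mathrm{tr}(I-H_{\lambda})\right)^{2}},$$ where $H_{\lambda}$ is the hat matrix of the penalized fit; the value of $\lambda$ minimizing $\mathrm{GCV}(\lambda)$ is selected.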
Seyyed Roohollah Roozegar, Amir Reza Mahmoodi, Volume 27, Issue 2 (3-2023)
Abstract
Many regression estimation techniques are strongly affected by outlying data, and large errors can occur in their estimates. In recent years, robust methods have been developed to address this issue. The minimum density power divergence estimator is an estimation method based on minimizing the distance between two density functions, and it provides a robust estimate when the data contain a number of outliers. In this research, we present the robust minimum density power divergence method for estimating the parameters of the Poisson regression model, which can produce robust estimators with the least loss in efficiency. We also investigate the performance of the proposed estimators using a real example.
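A minimal sketch of a minimum density power divergence fit for Poisson regression is shown below. The tuning constant alpha, the truncation of the sum over counts, and the synthetic data with a few gross outliers are all illustrative assumptions; as alpha tends to zero the criterion approaches maximum likelihood.

```python
# Minimal sketch: minimum density power divergence estimation for Poisson
# regression on synthetic data contaminated with a few outliers.
import numpy as np
from scipy.optimize import minimize
from scipy.stats import poisson

rng = np.random.default_rng(0)
n = 200
x = rng.normal(size=n)
y = rng.poisson(np.exp(0.5 + 0.7 * x))
y[:5] = 40                                   # a few gross outliers

alpha = 0.5                                  # arbitrary tuning constant
y_grid = np.arange(0, 200)                   # truncation of the sum over counts

def dpd_objective(beta):
    mu = np.exp(beta[0] + beta[1] * x)       # Poisson means
    pmf_obs = poisson.pmf(y, mu)
    # per-observation sum over y of f(y)^(1+alpha), truncated at y_grid.max()
    full = poisson.pmf(y_grid[:, None], mu[None, :]) ** (1 + alpha)
    return np.mean(full.sum(axis=0) - (1 + 1 / alpha) * pmf_obs ** alpha)

fit = minimize(dpd_objective, x0=np.zeros(2), method="Nelder-Mead")
print(fit.x)                                 # robust coefficient estimates
```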
Dr Manije Sanei Tabass, Volume 27, Issue 2 (3-2023)
Abstract
Regression analysis using the method of least squares requires certain basic assumptions to hold. One of the major problems regression analysis faces in this setting is collinearity among the regression variables. Many methods have been introduced to solve the problems caused by collinearity; one of them is ridge regression. In this article, a new estimate of the ridge parameter based on generalized maximum Tsallis entropy is presented, which we call the generalized maximum Tsallis entropy ridge estimator. This estimator is calculated for the Portland cement dataset, which exhibits strong collinearity and for which different estimators have been presented since 1332, and we compare the generalized maximum Tsallis entropy ridge estimator, the generalized maximum entropy ridge estimator, and the least squares estimator.
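For context, the classical ridge estimator whose tuning constant the article re-estimates is $\hat\beta_{\mathrm{ridge}}(k)=(X'X+kI)^{-1}X'y$ with $k>0$; what the proposed method changes is how $k$ is chosen, namely through generalized maximum Tsallis entropy, rather than this closed form itself.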
Dr Ehsan Bahrami Samani, Ms Kiyana Javidi Anaraki, Volume 28, Issue 1 (9-2023)
Abstract
Given the limited energy resources globally, energy optimization is crucial. A significant portion of this energy is consumed
by buildings. Therefore, the aim of this research is to explore the simultaneous factors affecting the heating and cooling of
buildings. In the current research, 768 different residential buildings simulated with Ecotect software have been investigated.
A joint regression model and exploratory data analysis methods were used to identify the factors influencing the heating and cooling of buildings. Based on variables such as relative compactness, overall height, surface area, and roof area of the buildings, a new variable called "type" (building model) was introduced and shown to be one of the strongest factors affecting the heating and cooling of buildings; this variable is related to the shape of the building. In the joint regression model, the responses are assumed to follow a multivariate normal distribution. This model is then compared with separate regression models (which do not assume correlation between the responses) using Akaike's information criterion and the deviance information criterion, both pointing to the superiority of the joint regression model. The model parameters are estimated by maximum likelihood; relative to the separate model, the Akaike criterion of the joint model shows a decrease of 0.0072%, indicating the superiority of the joint regression model. The deviance information criterion equals 0.001736%, and in comparison with the chi-squared distribution the null hypothesis is rejected in the test of model superiority, which again points to the superiority of the joint model.
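The sketch below illustrates the comparison described above in a stripped-down form: a joint bivariate normal regression (with correlated responses) and two separate univariate regressions are fitted by maximum likelihood and compared with AIC. Synthetic data stands in for the 768 simulated buildings, and the covariate set and parameter counts are illustrative.

```python
# Minimal sketch: joint (bivariate normal) regression for two responses
# versus separate univariate regressions, compared by AIC. Synthetic data.
import numpy as np
from scipy.stats import multivariate_normal, norm

rng = np.random.default_rng(0)
n = 768
X = np.column_stack([np.ones(n), rng.normal(size=(n, 3))])        # intercept + 3 covariates
B = rng.normal(size=(4, 2))
E = rng.multivariate_normal([0, 0], [[1, 0.7], [0.7, 1]], size=n)  # correlated errors
Y = X @ B + E                                                      # heating, cooling responses

Bhat = np.linalg.lstsq(X, Y, rcond=None)[0]                        # per-equation OLS = MLE of B
R = Y - X @ Bhat

# Joint model: full 2x2 error covariance (responses allowed to be correlated)
S_joint = R.T @ R / n
ll_joint = multivariate_normal(cov=S_joint).logpdf(R).sum()
aic_joint = -2 * ll_joint + 2 * (Bhat.size + 3)                    # 3 free covariance terms

# Separate models: correlation between responses ignored
ll_sep = sum(norm(scale=R[:, j].std()).logpdf(R[:, j]).sum() for j in range(2))
aic_sep = -2 * ll_sep + 2 * (Bhat.size + 2)                        # 2 free variances
print(aic_joint, aic_sep)
```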
Maryam Maleki, Hamid Reza Nili-Sani, Dr. M.gh. Akari, Volume 28, Issue 2 (3-2024)
Abstract
In this article, logistic regression models are studied in which the response variables are binary (or multinomial) and the explanatory (predictor or independent) variables are ordinary variables, but the errors have a vague (fuzzy) nature in addition to being random. On this basis, we formulate the proposed model and determine estimates of the coefficients, for the case with only one explanatory variable, using the method of least squares. Finally, we illustrate the results with an example.
Dr Mahdieh Bayati, Volume 28, Issue 2 (3-2024)
Abstract
We live in the information age, constantly surrounded by vast amounts of data from the world around us. To utilize this information effectively, it must be mathematically expressed and analyzed using statistics.
Statistics play a crucial role in various fields, including text mining, which has recently garnered significant attention. Text mining is a research method used to identify patterns in texts, which can be in written, spoken, or visual forms.
The applications of text mining are diverse, including text classification, clustering, web mining, sentiment analysis, and more. Text mining techniques are utilized to assign numerical values to textual data, enabling statistical analysis.
Since working with data requires a solid foundation in statistics, statistical tools are employed in text analysis to make predictions, such as forecasting changes in stock prices or currency exchange rates based on current textual data.
By leveraging statistical methods, text mining can uncover, confirm, or refute the truths hidden within textual content. Today, this topic is widely used in machine learning. This paper aims to provide a basic understanding of statistical tools in text mining and demonstrates how these powerful tools can be used to analyze and interpret events.
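A minimal sketch of the step mentioned above, assigning numerical values to textual data so that statistical tools can be applied: a TF-IDF representation of a few toy documents is fed to an ordinary classifier. The documents, labels, and the choice of logistic regression are illustrative assumptions only.

```python
# Minimal sketch: turning raw text into numbers with TF-IDF so that standard
# statistical / machine-learning tools can be applied (toy documents).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

docs = ["stock prices rose sharply today",
        "the market fell amid currency fears",
        "shares rallied after strong earnings",
        "exchange rates dropped against the dollar"]
labels = [1, 0, 1, 0]                        # toy sentiment: 1 = positive, 0 = negative

X = TfidfVectorizer().fit_transform(docs)    # documents -> numeric feature matrix
clf = LogisticRegression().fit(X, labels)
print(clf.predict(X))                        # predictions on the same toy texts
```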