Nastaran Sharifian, Ehsan Bahrami Samani, Volume 15, Issue 2 (3-2022)
Abstract
One of the most frequently encountered issues in longitudinal studies is data with missed appointments or censoring. In such cases, the subjects do not all share the same set of observation times. Missingness is also common in the analysis of longitudinal mixed discrete and continuous data and may occur in one or both responses. Ignoring the cause of the missingness (the missingness mechanism) leads to biased estimates and invalid inferences. Therefore, in this paper, we investigate nonignorable missingness mechanisms in set-inflated continuous and zero-inflated power series mixed responses, as well as continuous and k-inflated ordinal mixed responses. A full likelihood-based approach is used to obtain the maximum likelihood estimates of the model parameters. Simulation studies are performed to assess the performance of the models. Two applications of our models are illustrated with the Americans' Changing Lives survey and the Peabody Individual Achievement Test data set.
Bibi Maryam Taheri, Hadi Jabbari, Mohammad Amini, Volume 16, Issue 1 (9-2022)
Abstract
Using copula functions to model the dependence structure of data has become very common in recent decades. Three estimation methods, the moment method, the mixture method, and the copula moment method, are considered for estimating the dependence parameter of a copula function in the presence of outlier data. Although the moment method is an old method, it sometimes leads to inaccurate estimation; the two other moment-based methods are intended to improve it. The simulation study results show that the copula moment and mixture methods yield smaller MSEs when estimating the dependence parameter of a copula function in the presence of outliers, and that the copula moment method performs best in terms of MSE. Finally, the numerical results are applied in a practical example.
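As an illustration (not the paper's copula-moment estimator), a minimal sketch of the moment-style idea for a Clayton copula is inverting Kendall's tau, since tau = theta/(theta + 2) for that family; the outlier injection below is purely hypothetical.

```python
import numpy as np
from scipy.stats import kendalltau

rng = np.random.default_rng(0)

def sample_clayton(theta, n, rng):
    """Sample (u, v) from a Clayton copula via the Marshall-Olkin method."""
    v = rng.gamma(1.0 / theta, 1.0, size=n)          # latent gamma frailty
    e = rng.exponential(1.0, size=(n, 2))
    return (1.0 + e / v[:, None]) ** (-1.0 / theta)  # psi(E/V)

u = sample_clayton(theta=2.0, n=500, rng=rng)
u[:5, 1] = 0.999                                     # inject a few outliers

tau, _ = kendalltau(u[:, 0], u[:, 1])
theta_hat = 2.0 * tau / (1.0 - tau)                  # invert tau = theta/(theta+2)
print(f"estimated theta: {theta_hat:.3f} (true 2.0)")
```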
Mousa Golalizadeh, Sedigheh Noorani, Volume 16, Issue 1 (9-2022)
Abstract
Nowadays, observations in many scientific fields, including the biological sciences, are often high dimensional, meaning that the number of variables exceeds the number of samples. One of the problems in model-based clustering of such data is the estimation of too many parameters. To overcome this problem, the dimension of the data must first be reduced before clustering, which can be done through dimension reduction methods. In this context, an approach that has recently received increasing attention is the random projections method. This method is studied from theoretical and practical perspectives in this paper, and its superiority over conventional approaches such as principal component analysis and variable selection is shown in the analysis of three real data sets.
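A minimal sketch of the idea, assuming synthetic data and scikit-learn (the paper's data sets and tuning are not reproduced): project the high-dimensional observations onto a random low-dimensional subspace, then fit the mixture model there.

```python
import numpy as np
from sklearn.random_projection import GaussianRandomProjection
from sklearn.mixture import GaussianMixture
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(1)
# Synthetic high-dimensional data: 60 samples, 500 variables, two groups
# separated along the first 10 coordinates.
X = rng.normal(size=(60, 500))
labels = np.repeat([0, 1], 30)
X[labels == 1, :10] += 3.0

# Random projection to 5 dimensions, then model-based clustering.
Z = GaussianRandomProjection(n_components=5, random_state=1).fit_transform(X)
pred = GaussianMixture(n_components=2, random_state=1).fit_predict(Z)
print("adjusted Rand index:", adjusted_rand_score(labels, pred))
```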
Zahra Zandi, Hossein Bevrani, Volume 16, Issue 2 (3-2023)
Abstract
This paper suggests Liu-type shrinkage estimators for the linear regression model in the presence of multicollinearity under subspace information. The performance of the proposed estimators is compared with that of the Liu-type estimator in terms of relative efficiency via a Monte Carlo simulation study and a real data set. The results reveal that the proposed estimators outperform the Liu-type estimator.
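For background, a minimal sketch of the classical Liu estimator (Liu, 1993), beta_d = (X'X + I)^{-1}(X'X + dI) beta_OLS with 0 < d < 1; the paper's shrinkage estimators under subspace restrictions are more elaborate than this basic step.

```python
import numpy as np

def liu_estimator(X, y, d):
    """Classical Liu estimator; d controls the shrinkage toward OLS."""
    p = X.shape[1]
    XtX = X.T @ X
    beta_ols = np.linalg.solve(XtX, X.T @ y)
    return np.linalg.solve(XtX + np.eye(p), (XtX + d * np.eye(p)) @ beta_ols)
```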
Dr Alireza Chaji, Volume 16, Issue 2 (3-2023)
Abstract
High interpretability and ease of understanding have made decision trees one of the most widely used machine learning algorithms. The key to building efficient and effective decision trees is a suitable splitting method. This paper proposes a new splitting approach that grows the tree using the T-entropy criterion. The proposed method is examined on three data sets using 11 evaluation criteria. The results show that the introduced method builds decision trees more accurately than the well-known Gini index and the Shannon, Tsallis, and Rényi entropies, and can be used as an alternative method for producing decision trees.
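For reference, the competing impurity measures named above can be written down directly; the T-entropy itself is defined in the paper and is not reproduced here.

```python
import numpy as np

# Split-selection impurity measures on a class-probability vector p.
def gini(p):
    return 1.0 - np.sum(p ** 2)

def shannon(p):
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def tsallis(p, q=2.0):
    return (1.0 - np.sum(p ** q)) / (q - 1.0)

def renyi(p, alpha=2.0):
    return np.log(np.sum(p ** alpha)) / (1.0 - alpha)

p = np.array([0.7, 0.2, 0.1])
print(gini(p), shannon(p), tsallis(p), renyi(p))
```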
Dr. Robab Afshari, Volume 16, Issue 2 (3-2023)
Abstract
Although the multiple dependent state (MDS) sampling plan is preferred over other conditional plans due to the small sample size required, it cannot be used when the quality of manufactured products depends on more than one quality characteristic. In this study, to improve the performance of the mentioned method, an S^T_{pk}-based MDS plan is proposed, which is applicable for inspecting products with independent, multivariate normally distributed characteristics. The principal component analysis technique is used to extend the proposed plan to the case of dependent variables. Moreover, optimal values of the plan parameters are obtained from a nonlinear optimization problem. Findings indicate that, compared to S^T_{pk}-based variable single sampling and repetitive group sampling plans, the proposed method is best in terms of required sample size and the OC curve. Finally, an industrial example is given to explain how to use the proposed plan.
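As background, a hedged sketch of the operating characteristic of a classical attributes MDS plan (not the paper's S^T_{pk}-based variables plan): accept outright if the defect count d <= c1, reject if d > c2, and in between accept only when the preceding i lots were accepted outright.

```python
from scipy.stats import binom

def oc_mds(p, n, c1, c2, i):
    """OC of an attributes MDS plan under a binomial(n, p) defect count."""
    pa = binom.cdf(c1, n, p)            # outright acceptance
    pc = binom.cdf(c2, n, p) - pa       # conditional-acceptance zone
    return pa + pc * pa ** i

print(oc_mds(p=0.02, n=50, c1=1, c2=3, i=2))
```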
Mehdi Kiani, Volume 17, Issue 1 (9-2023)
Abstract
In the 1980s, Genichi Taguchi, a Japanese quality consultant, claimed that most of the variability associated with the response could be attributed to the presence of uncontrollable (noise) factors. In some practical cases, his modeling proposal leads quality improvement efforts to many runs in a crossed array. Hence, several researchers have embraced noteworthy aspects of response surface methodology combined with robust parameter design as alternatives to Taguchi's plan. These alternatives model the response's mean and variance as functions of the combination of control and noise factors in a combined array to achieve a robust process or product. Indeed, applying response surface methods to robust parameter design minimizes the influence of noise factors on manufacturing processes or products. This paper develops further modeling of the predicted response and variance in the presence of noise factors based on unbiased and robust estimators. Another goal is to design the experiments according to optimal designs to improve these estimators' accuracy and precision simultaneously.
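For reference, the standard combined-array model from the response surface literature, and the mean and variance models it induces (background notation, not taken from the paper):

```latex
y(\mathbf{x},\mathbf{z}) = \beta_0 + \mathbf{x}'\boldsymbol{\beta}
  + \mathbf{x}'\mathbf{B}\,\mathbf{x} + \mathbf{z}'\boldsymbol{\gamma}
  + \mathbf{x}'\boldsymbol{\Delta}\,\mathbf{z} + \varepsilon,
\qquad
\begin{aligned}
E_{\mathbf{z}}[y] &= \beta_0 + \mathbf{x}'\boldsymbol{\beta} + \mathbf{x}'\mathbf{B}\,\mathbf{x},\\
\operatorname{Var}_{\mathbf{z}}[y] &= (\boldsymbol{\gamma} + \boldsymbol{\Delta}'\mathbf{x})'\,
  \boldsymbol{\Sigma}_{\mathbf{z}}\,(\boldsymbol{\gamma} + \boldsymbol{\Delta}'\mathbf{x}) + \sigma^2,
\end{aligned}
```

where x collects the control factors and z the noise factors, with E[z] = 0 and Cov(z) = Sigma_z.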
Dr. Me'raj Abdi, Dr. Mohsen Madadi, Volume 17, Issue 1 (9-2023)
Abstract
This paper proposes a different approach to analyzing three-way contingency tables using conditional independence. We show that the different types of independence explored in log-linear models can be achieved without using these models, by relying on conditional independence alone. Some numerical examples are presented to illustrate the proposed methods.
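For reference, conditional independence of X and Y given Z in a three-way table with cell probabilities p_{ijk} is the standard factorization

```latex
X \perp\!\!\!\perp Y \mid Z
\quad\Longleftrightarrow\quad
p_{ijk} = \frac{p_{i\cdot k}\; p_{\cdot jk}}{p_{\cdot\cdot k}}
\quad \text{for all } i, j, k,
```

where a dot denotes summation over the corresponding index.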
Mr. Mohsen Motavaze, Dr. Hooshang Talebi, Volume 17, Issue 1 (9-2023)
Abstract
Producing high-quality products necessitates identifying the most influential factors, among many factors, for controlling and reducing quality variation. In such a setting, factorial designs are utilized to determine the active factors with maximal information and to model an appropriate relation between the factors and the variable of interest. In this regard, robust parameter designs, which divide the factors into control and noise factors, are efficient methods of off-line quality control for stabilizing quality variation in the presence of noise factors. Interestingly, this can be achieved by exploiting active control-by-noise interactions. Detecting the active interaction effects requires experimenting with numerous treatments; search designs are suggested to save treatments, and a superior design is recommended among the appropriate ones. Determining the superior design requires a design criterion; however, the existing criteria are of limited use for robust parameter designs. In this paper, we propose a criterion to rank search designs and determine the superior one.
Miss Forouzan Jafari, Dr. Mousa Golalizadeh, Volume 17, Issue 2 (2-2024)
Abstract
The mixed effects model is one of the powerful statistical approaches for modeling the relationship between the response variable and predictors when analyzing data with a hierarchical structure. Parameter estimation in these models is often carried out via least squares or maximum likelihood; however, the resulting estimators are inefficient when the error distributions are non-normal. In such cases, mixed effects quantile regression can be used. Moreover, when the number of variables under study increases, penalized mixed effects quantile regression is one of the best methods for gaining prediction accuracy and model interpretability. In this paper, under the assumption of an asymmetric Laplace distribution for the random effects, we propose a double-penalized model in which the random and fixed effects are penalized independently. The performance of this new method is evaluated in simulation studies, and the results are discussed along with a comparison with some competing models. In addition, its application is demonstrated by analyzing a real example.
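A schematic version of such a double-penalized objective (the paper's formulation under the asymmetric Laplace likelihood carries additional structure; the lasso-type penalties below are illustrative):

```latex
\min_{\boldsymbol{\beta},\,\mathbf{b}}\;
\sum_{i} \rho_\tau\!\left(y_i - \mathbf{x}_i'\boldsymbol{\beta} - \mathbf{z}_i'\mathbf{b}\right)
+ \lambda_1 \sum_{j} |\beta_j| + \lambda_2 \sum_{k} |b_k|,
\qquad
\rho_\tau(u) = u\left(\tau - I(u < 0)\right),
```

where rho_tau is the quantile check loss at level tau, beta the fixed effects, and b the random effects.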
Fatemeh Ghapani, Babak Babadi, Volume 17, Issue 2 (2-2024)
Abstract
In this paper, we introduce weighted ridge estimators of the fixed and random effects in stochastically restricted linear mixed measurement error models in the presence of collinearity. The asymptotic properties of the resulting estimators are examined. The necessary and sufficient conditions for the superiority of the weighted ridge estimators over the weighted estimator, used to select the ridge parameter based on the mean squared error matrix of the estimators, are investigated. Finally, the theoretical results are augmented with a simulation study and a numerical example.
Miss Nilia Mosavi, Dr. Mousa Golalizadeh, Volume 17, Issue 2 (2-2024)
Abstract
Cancer progression among patients can be assessed by creating a set of gene markers using statistical data analysis methods. Still, one of the main problems in the statistical study of such data is the large number of genes relative to the small number of samples. It is therefore essential to use dimensionality reduction techniques to find the optimal number of genes for accurately predicting the desired classes. On the other hand, choosing an appropriate method can help extract valuable information and improve the efficiency of the machine learning model. This article uses an ensemble learning approach, the random support vector machine cluster, to find the optimal feature set. Dealing with real data, we show that by randomly projecting the original high-dimensional feature space onto multiple lower-dimensional feature subspaces and combining support vector machine classifiers, not only are the genes essential in causing prostate cancer identified, but the classification precision is also increased.
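A hedged sketch of a random-subspace SVM ensemble with majority voting, in the spirit of the random SVM cluster; the synthetic data and subspace sizes are illustrative, and the paper's gene-selection step is not reproduced.

```python
import numpy as np
from sklearn.svm import LinearSVC
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=120, n_features=1000,
                           n_informative=20, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

n_models, k = 25, 50                     # ensemble size, subspace size
fits = []
for _ in range(n_models):
    idx = rng.choice(X.shape[1], size=k, replace=False)  # random subspace
    fits.append((idx, LinearSVC().fit(Xtr[:, idx], ytr)))

votes = np.array([clf.predict(Xte[:, idx]) for idx, clf in fits])
pred = (votes.mean(axis=0) > 0.5).astype(int)            # majority vote
print("test accuracy:", (pred == yte).mean())
```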
Sara Bayat, Sakineh Dehghan, Volume 17, Issue 2 (2-2024)
Abstract
This paper presents a nonparametric multi-class depth-based classification approach for multivariate data. Unlike most existing nonparametric methods, which are computationally complex, this approach is easy to implement. If the assumption of elliptical symmetry holds, the method is equivalent to the Bayes optimal rule. Some simulated data sets, as well as a real example, are used to evaluate the performance of these depth-based classifiers.
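A minimal sketch of the max-depth idea using Mahalanobis depth, D(x; F) = 1 / (1 + (x - mu)' S^{-1} (x - mu)); the depth function and two-group data here are illustrative, not the paper's choices.

```python
import numpy as np

def mahalanobis_depth(x, mu, S_inv):
    d = x - mu
    return 1.0 / (1.0 + d @ S_inv @ d)

def classify(x, params):
    """Assign x to the class whose empirical depth at x is largest."""
    depths = [mahalanobis_depth(x, mu, S_inv) for mu, S_inv in params]
    return int(np.argmax(depths))

rng = np.random.default_rng(0)
groups = [rng.normal(0, 1, size=(100, 2)), rng.normal(2, 1, size=(100, 2))]
params = [(g.mean(axis=0), np.linalg.inv(np.cov(g.T))) for g in groups]
print(classify(np.array([1.8, 2.1]), params))   # expected: 1
```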
Mozhgan Moradi, Shaho Zarei, Volume 18, Issue 1 (8-2024)
Abstract
Model-based clustering is the most widely used statistical clustering method, in which heterogeneous data are divided into homogeneous groups using inference based on mixture models. The presence of measurement error in the data can reduce the quality of clustering, for example by causing overfitting and producing spurious clusters. To solve this problem, model-based clustering assuming a normal distribution for the measurement errors has been introduced. However, too large or too small (outlier) values of the measurement errors cause existing clustering methods to perform poorly. To tackle this problem and build a model that is stable in the presence of outlier measurement errors, this article proposes a symmetric $\alpha$-stable distribution as a replacement for the normal distribution of the measurement errors; the model parameters are estimated using the EM algorithm and numerical methods. Through simulation and real data analysis, the new model is compared with the MCLUST-based model, considering cases with and without measurement errors, and the performance of the proposed model for clustering data in the presence of various outlier measurement errors is demonstrated.
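A schematic version of such a measurement-error clustering model (the notation is illustrative, not the paper's exact specification):

```latex
\mathbf{y}_i = \mathbf{x}_i + \mathbf{u}_i, \qquad
\mathbf{x}_i \sim \sum_{k=1}^{K} \pi_k\, N(\boldsymbol{\mu}_k, \boldsymbol{\Sigma}_k), \qquad
\mathbf{u}_i \sim S\alpha S,
```

where the symmetric $\alpha$-stable error law $S\alpha S$ has heavier tails for $\alpha < 2$ and recovers the Gaussian measurement-error model at $\alpha = 2$.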
Roghayeh Ghorbani Gholi Abad, Gholam Reza Mohtashami Borzadaran, Mohammad Amini, Zahra Behdani, Volume 18, Issue 2 (2-2025)
Abstract
The use of tail risk measures has attracted attention in recent decades, especially in the financial and banking industry. The most common are value at risk and expected shortfall. The tail Gini risk measure, a composite risk measure, was introduced recently. The primary purpose of this article is to find the relationship between concepts of economic risk, especially the expected shortfall and the tail Gini risk measure, and the concepts of inequality indices in economics and reliability. Examining the relationships between these concepts allows researchers to use the concepts of one field to investigate the other. The mathematical relationships between the tail risk measures and the mentioned indices are derived and calculated for several distributions. Finally, real data from the Iranian Stock Exchange are used to illustrate the concept of this tail risk measure.
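For reference, the two standard tail measures mentioned above are

```latex
\mathrm{VaR}_p(X) = \inf\{x \in \mathbb{R} : F_X(x) \ge p\},
\qquad
\mathrm{ES}_p(X) = \frac{1}{1-p}\int_p^{1} \mathrm{VaR}_u(X)\, du ,
```

while the tail Gini measure is defined analogously as a variability measure over the tail region beyond $\mathrm{VaR}_p$; see the paper for its exact form.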
Mehrnoosh Madadi, Kiomars Motarjem, Volume 18, Issue 2 (2-2025)
Abstract
Due to the volume and complexity of emerging data in survival analysis, it is necessary to use statistical learning methods in this field. These methods can estimate the survival probability and the effect of various factors on patients' survival. In this article, the performance of the Cox model, as a common model in survival analysis, is compared with penalized methods such as ridge and lasso Cox regression, as well as statistical learning methods such as random survival forests and neural networks. The simulation results show that under linear conditions the performance of the above models is similar to that of the Cox model, whereas under non-linear conditions methods such as the lasso Cox, random survival forests, and neural networks perform better. These models were then evaluated on data from patients with atheromatous disease, and the results showed that, when faced with data with many explanatory variables, statistical learning approaches generally perform better than the classical survival model.
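A hedged sketch with the lifelines library: a standard Cox fit and a lasso-type penalized Cox fit on the library's bundled Rossi recidivism data (the paper's atheroma data and competing learners are not reproduced here).

```python
from lifelines import CoxPHFitter
from lifelines.datasets import load_rossi

df = load_rossi()                       # built-in example data set

# Unpenalized Cox proportional hazards fit.
cox = CoxPHFitter().fit(df, duration_col='week', event_col='arrest')

# Lasso-type fit: penalizer > 0 with l1_ratio = 1 gives an L1 penalty.
cox_lasso = CoxPHFitter(penalizer=0.1, l1_ratio=1.0)
cox_lasso.fit(df, duration_col='week', event_col='arrest')
print(cox_lasso.params_)                # shrunken coefficients
```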
Somayeh Mohebbi, Ali M. Mosammam, Volume 19, Issue 1 (9-2025)
Abstract
Systemic risk, as one of the challenges of the financial system, has attracted special attention from policymakers, investors, and researchers. Identifying and assessing systemic risk is crucial for enhancing the financial stability of the banking system. In this regard, this article uses the conditional value at risk (CoVaR) method to evaluate the systemic risk of simulated data and of Iran's banking system. In this method, the conditional mean and conditional variance are modeled using autoregressive moving average and generalized autoregressive conditional heteroskedasticity models, respectively. The data studied comprise the daily stock prices of 17 Iranian banks from April 8, 2019, to May 1, 2023, with missing values in some periods; the Kalman filter approach is used to interpolate the missing values. Additionally, vine copulas with a hierarchical tree structure are employed to describe the nonlinear dependencies and the hierarchical risk structure of the studied banks' returns. The results indicate that Bank Tejarat has the highest systemic risk, and that an increase in systemic risk, in addition to causing financial crises, has adverse effects on macroeconomic performance. These results can significantly help in predicting and mitigating the effects of financial crises and managing them effectively.
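A hedged sketch of the marginal modeling step with the arch package: fit an AR(1)-GARCH(1,1) model to one simulated return series and read off a one-step-ahead 5% VaR via a normal-quantile approximation. The vine-copula CoVaR computation in the paper is substantially more involved and is not shown.

```python
import numpy as np
from arch import arch_model
from scipy.stats import norm

rng = np.random.default_rng(0)
r = rng.standard_t(df=5, size=1000)     # stand-in daily return series

# AR(1) conditional mean, GARCH(1,1) conditional variance, Student-t errors.
res = arch_model(r, mean='AR', lags=1, vol='GARCH', p=1, q=1,
                 dist='t').fit(disp='off')

f = res.forecast(horizon=1)
mu = f.mean.values[-1, 0]
sigma = np.sqrt(f.variance.values[-1, 0])
var_5 = mu + sigma * norm.ppf(0.05)     # normal-quantile approximation
print("one-step-ahead 5% VaR:", var_5)
```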
Tara Mohammadi, Hadi Jabbari, Sohrab Effati, Volume 19, Issue 1 (9-2025)
Abstract
The support vector machine (SVM), a supervised algorithm, was initially invented for the binary case; due to its applications, multi-class algorithms were designed later and remain an active research topic. Recently, models have been presented to improve multi-class methods. Most of them examine cases in which the inputs are non-random, while in the real world we are faced with uncertain and imprecise data. Therefore, this paper examines a model in which the inputs are uncertain and the problem's constraints are probabilistic. Using statistical theorems and mathematical expectation, the probabilistic constraints are converted into deterministic ones, and the moment estimation method is then used to estimate the mathematical expectation. Synthetic data are generated using Monte Carlo simulation, bootstrap resampling is used to provide samples as input to the model, and the model's accuracy is examined. Finally, the proposed model is trained with real data and its accuracy is evaluated with statistical indicators. The results from the simulation and the real examples show the superiority of the proposed model over the model based on deterministic inputs.
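A minimal sketch of the bootstrap-resampling step around a standard SVM (the paper's chance-constrained formulation itself is not reproduced): resample the training inputs repeatedly and gauge the variability of the classifier's accuracy.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=200, n_features=10, random_state=0)
rng = np.random.default_rng(0)

accs = []
for _ in range(100):
    idx = rng.integers(0, len(y), size=len(y))      # bootstrap sample
    oob = np.setdiff1d(np.arange(len(y)), idx)      # out-of-bag indices
    clf = SVC(kernel='rbf').fit(X[idx], y[idx])
    accs.append(clf.score(X[oob], y[oob]))

print(f"accuracy: {np.mean(accs):.3f} +/- {np.std(accs):.3f}")
```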
Mehrdad Ghaderi, Zahra Rezaei Ghahroodi, Mina Gandomi, Volume 19, Issue 1 (9-2025)
Abstract
Researchers often face the problem of how to handle missing data. Multiple imputation by chained equations (MICE) is one of the most common imputation methods. In theory, any imputation model can be used to predict the missing values; however, if the predictive models are incorrect, this can lead to biased estimates and invalid inferences. Among the latest solutions for dealing with missing data are machine learning methods and the SuperMICE method. In this paper, we present a set of simulations indicating that this approach produces final parameter estimates with lower bias and better coverage than other commonly used imputation methods. We also discuss the implementation of several machine learning methods and the ensemble algorithm SuperMICE on data from the Industrial Establishment Survey, in which several variables are imputed simultaneously. The various methods are evaluated, and the method with the best performance is identified.
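A hedged sketch of chained-equations imputation with a machine-learning learner via scikit-learn's IterativeImputer; SuperMICE additionally combines several learners with Super Learner weighting, which is not shown here, and the data below are synthetic.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
X[:, 3] += 2.0 * X[:, 0]                    # dependence for the imputer to exploit
mask = rng.random(X.shape) < 0.15           # 15% of cells set to missing
X_miss = np.where(mask, np.nan, X)

imp = IterativeImputer(estimator=RandomForestRegressor(n_estimators=50),
                       max_iter=10, random_state=0)
X_imp = imp.fit_transform(X_miss)
print("RMSE on imputed cells:",
      np.sqrt(np.mean((X_imp[mask] - X[mask]) ** 2)))
```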
Mehran Naghizadeh Qomi, Zohre Mahdizadeh, Volume 19, Issue 1 (9-2025)
Abstract
This paper investigates repetitive acceptance sampling inspection plans for lots based on type I censoring when the lifetime has a Tsallis q-exponential distribution. A repetitive acceptance sampling inspection plan is introduced, and its components, along with the optimal average sample number (ASN) and the operating characteristic value of the plan, are calculated under specified values of the distribution parameter and the consumer's and producer's risks, using a nonlinear programming optimization problem. Comparing the results of the proposed repetitive acceptance sampling plan with the optimal single sampling inspection plan demonstrates the efficiency of the repetitive plan over the single sampling plan. Moreover, repetitive sampling plans with a limited linear combination of risks are introduced and compared with the existing plan. The results, shown in tables and figures, indicate that the introduced plan has a lower ASN and is therefore more efficient than the existing plan. A practical example from the textile industry is used to apply the proposed schemes.
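A hedged sketch of the acceptance probability and ASN of a generic repetitive group sampling plan under a binomial model: accept if the failure count d <= c1, reject if d > c2, otherwise draw a fresh sample. The lot failure probability p would come from the Tsallis q-exponential lifetime distribution evaluated at the censoring time; that step is not shown.

```python
from scipy.stats import binom

def rgs_oc_asn(p, n, c1, c2):
    """Acceptance probability and ASN of a repetitive group sampling plan."""
    pa = binom.cdf(c1, n, p)            # accept outright
    pr = 1.0 - binom.cdf(c2, n, p)      # reject outright
    p_accept = pa / (pa + pr)           # repeat sampling until a decision
    asn = n / (pa + pr)                 # expected total sample size
    return p_accept, asn

print(rgs_oc_asn(p=0.05, n=40, c1=1, c2=4))
```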