Showing 123 results for Type of Study: Applied
Ali Najafi Majid Abadi, Nader Nematollahi, Volume 14, Issue 2 (2-2021)
Abstract
Judgment post-stratification is a method that uses additional ranking information in simple random sampling to increase the efficiency of estimators of population parameters. In this paper, we use judgment post-stratification instead of simple random sampling within the strata of a stratified sample and present new estimators for the population mean. We then compare the proposed estimators with the stratified random mean estimator using a simulation study. The simulation results show that the proposed estimators perform better than the stratified random mean estimator in most cases.
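A minimal sketch of the idea, not the authors' exact estimators: within each stratum, a judgment post-stratified mean (units ranked inside small comparison sets via a concomitant variable) replaces the simple random sample mean. The population, set size H, and concomitant variable below are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def jps_mean(y_pop, x_pop, n, H):
    # Judgment post-stratification within one stratum: each measured unit
    # is ranked (via the concomitant x) inside a comparison set of H units;
    # the estimator averages the per-rank-group means.
    N = len(y_pop)
    ranks, ys = [], []
    for _ in range(n):
        comp = rng.choice(N, size=H, replace=False)   # comparison set
        r = np.sum(x_pop[comp] <= x_pop[comp[0]])     # judgment rank of unit comp[0]
        ranks.append(r)
        ys.append(y_pop[comp[0]])                     # only comp[0] is measured
    ranks, ys = np.array(ranks), np.array(ys)
    return np.mean([ys[ranks == r].mean() for r in range(1, H + 1)
                    if np.any(ranks == r)])

strata = [rng.normal(10, 2, 5000), rng.normal(20, 4, 5000)]  # overall mean 15
est_jps, est_srs = [], []
for _ in range(1000):
    jps = srs = 0.0
    for w, y in zip((0.5, 0.5), strata):
        x = y + rng.normal(0, 1, len(y))              # concomitant ranking variable
        jps += w * jps_mean(y, x, n=30, H=3)
        srs += w * rng.choice(y, 30, replace=False).mean()
    est_jps.append(jps)
    est_srs.append(srs)
print("MSE JPS:", np.mean((np.array(est_jps) - 15) ** 2))
print("MSE SRS:", np.mean((np.array(est_srs) - 15) ** 2))
```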
Mohammad Reaz Kazemi, Volume 14, Issue 2 (2-2021)
Abstract
In this paper, we investigate confidence intervals for the common correlation coefficient of several bivariate normal populations using the confidence distribution approach. Through simulation studies, using coverage probability and expected length, we compare this method with the generalized variable approach. The results show that the coverage probability of the proposed method is close to the nominal level in all situations and that, in most cases, its expected length is less than that of the generalized variable approach. Finally, we present two real examples to illustrate the approach.
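One classical way to build a confidence distribution for a common correlation, not necessarily the paper's exact construction, is to combine the per-sample Fisher-z confidence distributions; a sketch with synthetic data:

```python
import numpy as np
from scipy import stats

def common_rho_ci(samples, level=0.95):
    # Each sample contributes z_i = atanh(r_i), approximately
    # N(atanh(rho), 1/(n_i - 3)); the combined confidence distribution
    # is normal with precision-weighted mean.
    z, w = [], []
    for xy in samples:
        r = np.corrcoef(xy[:, 0], xy[:, 1])[0, 1]
        z.append(np.arctanh(r))
        w.append(len(xy) - 3)
    z, w = np.array(z), np.array(w)
    zbar = np.sum(w * z) / np.sum(w)
    half = stats.norm.ppf(0.5 + level / 2) / np.sqrt(np.sum(w))
    return np.tanh([zbar - half, zbar + half])    # back-transform to rho scale

rng = np.random.default_rng(1)
rho = 0.6
cov = [[1, rho], [rho, 1]]
samples = [rng.multivariate_normal([0, 0], cov, size=n) for n in (20, 35, 50)]
print(common_rho_ci(samples))
```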
Mozhgan Taavoni, Mohammad Arashi, Volume 14, Issue 2 (2-2021)
Abstract
This paper considers simultaneous variable selection and estimation in a semiparametric mixed-effects model for longitudinal data with normal errors. We approximate the nonparametric function by a regression spline and simultaneously estimate and select the variables by optimizing a penalized objective function. Under some regularity conditions, the asymptotic behaviour of the resulting estimators is established in a high-dimensional framework where the number of parametric covariates increases with the sample size. For practical implementation, we use an EM algorithm to select the significant variables and estimate the nonzero coefficients. Simulation studies are carried out to assess the performance of the proposed method, and a real data set is analyzed to illustrate the procedure.
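A much-simplified sketch of the two ingredients, spline approximation of the nonparametric part and penalized selection of the parametric covariates; it ignores the random-effects structure and uses an L1 penalty in place of the paper's penalized-EM procedure:

```python
import numpy as np
from sklearn.preprocessing import SplineTransformer
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(2)
n, p = 200, 20
t = rng.uniform(0, 1, n)                       # observation times
X = rng.normal(size=(n, p))                    # parametric covariates, mostly irrelevant
f = np.sin(2 * np.pi * t)                      # nonparametric component
y = 1.5 * X[:, 0] - 2.0 * X[:, 1] + f + rng.normal(0, 0.5, n)

B = SplineTransformer(degree=3, n_knots=8).fit_transform(t[:, None])
Z = np.hstack([X, B])                          # spline columns approximate f
fit = LassoCV(cv=5).fit(Z, y)
print("selected covariates:", np.nonzero(np.abs(fit.coef_[:p]) > 1e-8)[0])
```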
Meysam Mohammadpour, Hossein Bevrani, Reza Arabi Belaghi, Volume 15, Issue 1 (9-2021)
Abstract
Wind speed probability distributions are one of the main wind characteristics for evaluating wind energy potential in a specific region. In this paper, the 3-parameter Log-Logistic distribution is introduced and compared with six commonly used statistical models for modeling actual wind speed data recorded at the Tabriz and Orumiyeh stations in Iran. Maximum likelihood estimation via the Nelder-Mead algorithm is used to estimate the model parameters. The flexibility of the candidate distributions is measured by the coefficient of determination, the Chi-square test, the Kolmogorov-Smirnov test, and the root mean square error criterion. The analysis shows that the 3-parameter Log-Logistic distribution provides the best fit to the annual and seasonal wind speed data at the Orumiyeh station, and at the Tabriz station for all seasons except summer. The wind power density error is also estimated for the different candidate distributions.
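A minimal sketch of the fitting step, assuming scipy's fisk parameterization of the log-logistic (shape, location, and scale give the 3-parameter form) and synthetic stand-in data, since the station data are not reproduced here:

```python
import numpy as np
from scipy import stats, optimize

rng = np.random.default_rng(3)
wind = stats.weibull_min.rvs(2.0, scale=6.0, size=500, random_state=rng)  # stand-in data

def negloglik(theta, x):
    c, loc, scale = theta
    if c <= 0 or scale <= 0 or loc >= x.min():
        return np.inf                           # keep the support valid
    return -np.sum(stats.fisk.logpdf(x, c, loc=loc, scale=scale))

res = optimize.minimize(negloglik, x0=[2.0, 0.0, wind.mean()],
                        args=(wind,), method="Nelder-Mead")
c, loc, scale = res.x
ks = stats.kstest(wind, lambda x: stats.fisk.cdf(x, c, loc=loc, scale=scale))
print("MLE:", res.x, "KS p-value:", ks.pvalue)
```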
Morteza Mohammadi, Mahdi Emadi, Mohammad Amini, Volume 15, Issue 1 (9-2021)
Abstract
Divergence measures can be considered criteria for analyzing dependency and can be rewritten in terms of the copula density function. In this paper, the Jeffrey and Hellinger dependency criteria are estimated using the improved probit transformation method, and their asymptotic consistency is proved. In addition, a simulation study is performed to measure the accuracy of the estimators. The simulation results show that for small sample sizes or weak dependence, the Hellinger dependency criterion performs better than the Kullback-Leibler and Jeffrey dependency criteria. Finally, an application of the studied methods in hydrology is presented.
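A rough sketch of the Hellinger dependency estimate using the plain (not improved) probit-transformation kernel estimate of the copula density; the grid size and bandwidth defaults are arbitrary choices:

```python
import numpy as np
from scipy import stats

def hellinger_dep(x, y, grid=64):
    # Hellinger dependency 0.5 * int (sqrt(c(u,v)) - 1)^2 du dv, with the
    # copula density c estimated by kernel smoothing on the probit scale.
    n = len(x)
    u = stats.rankdata(x) / (n + 1)              # pseudo-observations
    v = stats.rankdata(y) / (n + 1)
    kde = stats.gaussian_kde(np.vstack([stats.norm.ppf(u), stats.norm.ppf(v)]))
    q = np.linspace(0.5 / grid, 1 - 0.5 / grid, grid)
    U, V = np.meshgrid(q, q)
    S, T = stats.norm.ppf(U).ravel(), stats.norm.ppf(V).ravel()
    c = kde(np.vstack([S, T])) / (stats.norm.pdf(S) * stats.norm.pdf(T))
    return 0.5 * np.mean((np.sqrt(c) - 1) ** 2)  # grid mean approximates the integral

rng = np.random.default_rng(4)
z = rng.multivariate_normal([0, 0], [[1, .7], [.7, 1]], 300)
print("dependent pair:", hellinger_dep(z[:, 0], z[:, 1]))
print("independent   :", hellinger_dep(rng.normal(size=300), rng.normal(size=300)))
```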
Zahra Rezaei Ghahroodi, Hasan Ranji, Alireza Rezaei, Volume 15, Issue 1 (9-2021)
Abstract
In most surveys, occupation and industry questions are asked as open-ended questions, and coding this information into thousands of categories is done manually, which is very time-consuming and costly. Given the need to modernize national statistical systems, it is necessary to use statistical learning methods in official statistics for primary and secondary data analysis. Statistical learning classification methods are also useful in the process of producing official statistics. The purpose of this article is to automate the coding step of some statistical processes using statistical learning methods and to familiarize executive managers with the possibility of using these methods in the production of official statistics. Two applications of statistical learning classification methods are investigated: automatic coding of economic activities and coding of open-ended questions in Statistical Center questionnaires, using four methods. The studied methods include the duplication method, support vector machines (SVM) with multi-level aggregation, a combination of the duplication method and SVM, and the nearest neighbor method.
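A toy sketch of the SVM-based automatic coding step, with invented job descriptions and ISIC-like codes; real systems are trained on many thousands of manually coded responses:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

# toy training set: free-text activity descriptions with activity codes
texts = ["repairs car engines", "teaches high school mathematics",
         "sells vegetables in a shop", "fixes automobile gearboxes",
         "teaches primary school", "retail grocery sales"]
codes = ["4520", "8531", "4721", "4520", "8520", "4721"]

coder = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
coder.fit(texts, codes)
print(coder.predict(["mathematics teacher at a school",
                     "engine repair workshop"]))
```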
Mohammad Hossein Poursaeed, Volume 15, Issue 1 (9-2021)
Abstract
In this paper, based on an appropriate pivotal quantity, two methods are introduced to determine a confidence region for the mean and standard deviation of a two-parameter uniform distribution without requiring numerical methods. In the first method, the smallest region is obtained by minimizing the confidence region's area, and in the second, a simultaneous Bonferroni confidence region is constructed from the smallest confidence intervals. Comparing the area and coverage probability of the two methods, as well as the width of the strip containing the standard deviation, shows that the first method is more efficient. Finally, an approximation is presented for the quantile of the F distribution used in calculating the confidence regions in a special case.
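A sketch in the spirit of the second (Bonferroni) method, using the standard range and midrange pivots for the uniform distribution; the paper's exact pivotal quantity and smallest-interval construction may differ:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)
n, alpha = 20, 0.05
a = alpha / 2                                   # Bonferroni split over mu and sigma

# parameter-free pivot quantiles, computed once by simulation:
u = rng.uniform(size=(100000, n))
piv = ((u.min(1) + u.max(1)) / 2 - 0.5) / (u.max(1) - u.min(1))
qp = np.quantile(piv, [a / 2, 1 - a / 2])       # pivot (midrange - mu)/W
qb = stats.beta.ppf([a / 2, 1 - a / 2], n - 1, 2)  # W/(b - a) ~ Beta(n-1, 2)

def region(x):
    w, mid = x.max() - x.min(), (x.max() + x.min()) / 2
    mu = (mid - qp[1] * w, mid - qp[0] * w)
    sig = (w / (np.sqrt(12) * qb[1]), w / (np.sqrt(12) * qb[0]))  # b-a = sqrt(12)*sigma
    return mu, sig

hits = 0
for _ in range(2000):
    x = rng.uniform(2, 8, n)                    # true mu = 5, sigma = 6/sqrt(12)
    mu, sig = region(x)
    hits += mu[0] <= 5 <= mu[1] and sig[0] <= 6 / np.sqrt(12) <= sig[1]
print("joint coverage ~", hits / 2000, "(target >=", 1 - alpha, ")")
```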
Nastaran Sharifian, Ehsan Bahrami Samani, Volume 15, Issue 2 (3-2022)
Abstract
One of the most frequently encountered issues in longitudinal studies is missed visits or censoring, in which case the subjects do not all share the same set of observation times. Missingness is also common in the analysis of longitudinal mixed discrete and continuous data and may occur in one or both responses. Ignoring the cause of the missingness (the missingness mechanism) leads to biased estimates and inferences. Therefore, in this paper, we investigate nonignorable missingness mechanisms for set-inflated continuous and zero-inflated power series mixed responses, as well as for continuous and k-inflated ordinal mixed responses. A full likelihood-based approach is used to obtain the maximum likelihood estimates of the model parameters. Some simulation studies are performed to assess the performance of the models. Two applications of our models are illustrated with the Americans' Changing Lives survey and the Peabody Individual Achievement Test data set.
Mehdi Balui, Einolah Deiri, Farshin Hormozinejad, Ezzatallah Baloui Jamkhaneh, Volume 15, Issue 2 (3-2022)
Abstract
In most practical cases, to increase the accuracy of parameter estimation, we need an estimator with the least risk, and shrinkage estimators play a critical role here. Our main purpose is to evaluate two classes of shrinkage estimators of the shape parameter of the Pareto-Rayleigh distribution. The efficiency of the proposed estimators is compared with that of the unbiased estimator obtained under the quadratic loss function. The relationship between the two classes of shrinkage estimators is examined, and the relative efficiency of the proposed estimators is then assessed via a Monte Carlo simulation.
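The Pareto-Rayleigh model is not available in standard libraries, so as a stand-in this sketch runs the same kind of Monte Carlo efficiency comparison on an exponential rate: an unbiased estimator is shrunk toward a prior guess and the estimators are compared by MSE:

```python
import numpy as np

rng = np.random.default_rng(6)
lam, lam0, n, reps = 2.0, 1.5, 10, 50000       # true rate and a prior guess

x = rng.exponential(1 / lam, size=(reps, n))
unb = (n - 1) / x.sum(axis=1)                   # unbiased estimator of the rate
for k in (1.0, 0.8, 0.5):                       # k = 1 recovers the unbiased estimator
    shr = k * unb + (1 - k) * lam0              # shrink toward the guess lam0
    print(f"k={k:.1f}  MSE={np.mean((shr - lam) ** 2):.4f}")
```

When the guess lam0 is close to the truth, heavier shrinkage trades a little bias for a large variance reduction, which is the efficiency gain the abstract describes.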
Shaho Zarei, Volume 15, Issue 2 (3-2022)
Abstract
The most widely used model in small area estimation is the area-level or Fay-Herriot model. In this model, it is typically assumed that both the area-level random effects (model errors) and the sampling errors have a Gaussian distribution. However, considerable variation in the error components (model errors and sampling errors) can cause poor performance in small area estimation. In this paper, to overcome this problem, the symmetric α-stable distribution is used to deal with outliers in the error components. The model parameters are estimated with the empirical Bayes method. The performance of the proposed model is investigated in different simulation scenarios and compared with existing classic and robust empirical Bayes methods. The proposed model can improve estimation results, in particular when both error components are normal or have heavy-tailed distributions.
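A sketch of the simulation setting: a Fay-Herriot model whose area effects are symmetric α-stable (heavy-tailed for α < 2), together with the classic normality-based empirical Bayes estimator against which a robust proposal would be compared; the variance estimate here is a rough moment version:

```python
import numpy as np
from scipy.stats import levy_stable

rng = np.random.default_rng(7)
m = 50                                          # small areas
x = np.column_stack([np.ones(m), rng.uniform(0, 1, m)])
beta = np.array([1.0, 2.0])
D = rng.uniform(0.3, 0.8, m)                    # known sampling variances
# symmetric alpha-stable model errors (heavy tails since alpha < 2)
v = levy_stable.rvs(alpha=1.7, beta=0.0, scale=0.5, size=m, random_state=rng)
theta = x @ beta + v                            # true area means
y = theta + rng.normal(0, np.sqrt(D))           # direct estimates

# classic Fay-Herriot empirical Bayes under normality:
beta_ols, *_ = np.linalg.lstsq(x, y, rcond=None)
resid = y - x @ beta_ols
sig2v = max(0.0, (np.sum(resid ** 2) - np.sum(D)) / (m - x.shape[1]))
gamma = sig2v / (sig2v + D)                     # shrinkage weights
eb = gamma * y + (1 - gamma) * (x @ beta_ols)
print("MSE direct:", np.mean((y - theta) ** 2))
print("MSE EB    :", np.mean((eb - theta) ** 2))
```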
Issac Almasi, Mehdi Omidi, Volume 15, Issue 2 (3-2022)
Abstract
Identifying the best prediction of an unobserved value is one of the most critical issues in spatial statistics, and various methods have been proposed, each with its own advantages and limitations. Although the kriging method yields the best linear predictor, it is tailored to Gaussian random fields. Uncertainty about the distribution of a random field motivates methods that make non-Gaussian prediction possible. In this paper, using the projection theorem, a nonparametric method is presented for predicting a random field, and some models are proposed for predicting non-Gaussian random fields using nearest neighbors. The accuracy and precision of the predictors are then examined in a simulation study. Finally, the introduced models are applied to the prediction of rainfall data in Khuzestan province.
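A distribution-free nearest-neighbour predictor of the kind the abstract points to, sketched as an inverse-distance weighted k-NN on a synthetic field (the projection-theorem construction itself is not reproduced):

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(8)
# synthetic spatial field: smooth surface plus noise at random locations
pts = rng.uniform(0, 10, size=(400, 2))
z = np.sin(pts[:, 0]) + np.cos(pts[:, 1]) + rng.normal(0, 0.2, 400)

def knn_predict(train_pts, train_z, new_pts, k=8):
    # inverse-distance weighted mean of the k nearest observed values
    tree = cKDTree(train_pts)
    d, idx = tree.query(new_pts, k=k)
    w = 1.0 / np.maximum(d, 1e-9)
    return np.sum(w * train_z[idx], axis=1) / np.sum(w, axis=1)

test = rng.uniform(0, 10, size=(100, 2))
truth = np.sin(test[:, 0]) + np.cos(test[:, 1])
pred = knn_predict(pts, z, test)
print("RMSE:", np.sqrt(np.mean((pred - truth) ** 2)))
```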
Roshanak Aliakbari Saba, Nasrin Ebrahimi, Lida Kalhori Nadrabadi, Asieh Abbasi, Volume 15, Issue 2 (3-2022)
Abstract
The ranked set sampling method uses ranking information on the units to provide survey designers with a more representative sample of the population, so that the sampling distribution is closer to the actual distribution of the population. In this article, to assess the effectiveness of ranked set sampling in the extensive surveys conducted to produce official statistics, we use this sampling method to improve the efficiency of key estimates of the household expenditure and income survey of the Statistical Center of Iran. The results show that ranked set sampling in the design of household expenditure and income surveys can improve the efficiency of the survey's key estimates, provided that the ranking variable is highly correlated with the main survey variables. In the absence of a suitable and available ranking variable, the information in the sampling frame can be used to construct a ranking variable correlated with the key survey variables.
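A small simulation of the point about ranking-variable quality, assuming a gamma-shaped stand-in for an expenditure variable: the relative efficiency of a ranked set sample mean over SRS under a strong and a weak auxiliary ranking variable:

```python
import numpy as np

rng = np.random.default_rng(9)

def rss_mean(pop, aux, m, r):
    # one cycle: draw m sets of m units, rank each set by the auxiliary
    # variable, measure the i-th ranked unit of the i-th set; repeat r times
    vals = []
    for _ in range(r):
        for i in range(m):
            idx = rng.choice(len(pop), m, replace=False)
            order = idx[np.argsort(aux[idx])]
            vals.append(pop[order[i]])
    return np.mean(vals)

pop = rng.gamma(3.0, 2.0, 100000)               # stand-in expenditure variable
for noise in (0.2, 2.0):                        # strong vs weak ranking variable
    aux = pop + rng.normal(0, noise * pop.std(), len(pop))
    est_rss = [rss_mean(pop, aux, m=4, r=5) for _ in range(1000)]   # n = 20
    est_srs = [rng.choice(pop, 20, replace=False).mean() for _ in range(1000)]
    print(f"noise={noise}: RE(RSS vs SRS) =", np.var(est_srs) / np.var(est_rss))
```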
Sakineh Dehghan, Mohamadreza Faridrohani, Volume 15, Issue 2 (3-2022)
Abstract
The concept of data depth has provided a helpful tool for nonparametric multivariate statistical inference by taking into account the geometry of multivariate data and ordering them. Indeed, depth functions provide a natural centre-outward ordering of multivariate points relative to a multivariate distribution or a given sample. Since outlyingness is inevitably related to data ranks, this centre-outward ordering yields an algorithm for outlier detection. In this paper, based on the data depth concept, an affine-invariant method is defined to identify outlying observations. The affine invariance property ensures that the identification of outliers does not depend on the underlying coordinate system or measurement scales. The method is easier to implement than most other multivariate methods. Simulation studies examine the performance of the proposed method under different depth functions. Finally, the method is applied to the financial values of residential houses in some cities of Iran in 1397 (Iranian calendar).
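A minimal affine-invariant example using Mahalanobis depth, one of several depth functions; it is affine invariant but, unlike some depths the paper studies, not robust, since it relies on the sample mean and covariance:

```python
import numpy as np

rng = np.random.default_rng(10)

def mahalanobis_depth(X):
    # affine-invariant depth: 1 / (1 + squared Mahalanobis distance)
    mu = X.mean(axis=0)
    S_inv = np.linalg.inv(np.cov(X, rowvar=False))
    d2 = np.einsum('ij,jk,ik->i', X - mu, S_inv, X - mu)
    return 1.0 / (1.0 + d2)

X = rng.multivariate_normal([0, 0], [[1, .8], [.8, 2]], 200)
X[:5] += np.array([6, -6])                      # plant five outliers
depth = mahalanobis_depth(X)
cut = np.quantile(depth, 0.05)                  # flag the most outlying 5%
print("flagged points:", np.where(depth <= cut)[0])
```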
Bibi Maryam Taheri, Hadi Jabbari, Mohammad Amini, Volume 16, Issue 1 (9-2022)
Abstract
Using the copula function to model the dependence structure of data has become very common in recent decades. Three estimation methods, the moment method, the mixture method, and the copula moment method, are considered for estimating the dependence parameter of a copula function in the presence of outliers. Although the moment method is old, it sometimes leads to inaccurate estimates, so the two other moment-based methods are intended to improve it. The simulation results show that the copula moment and mixture methods yield smaller MSEs when estimating the dependence parameter in the presence of outliers, with the copula moment method giving the best estimates in terms of MSE. Finally, the numerical results are applied in a practical example.
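A sketch of a moment-type estimate (Kendall's tau inversion) of a Clayton copula parameter under contamination; the paper's copula moment and mixture estimators are refinements not reproduced here:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)

def clayton_sample(theta, n):
    # Clayton copula via a gamma frailty (Marshall-Olkin algorithm)
    g = rng.gamma(1 / theta, 1.0, n)
    e = rng.exponential(size=(n, 2))
    return (1 + e / g[:, None]) ** (-1 / theta)

theta = 2.0                                     # true dependence parameter
uv = clayton_sample(theta, 200)
uv[:10] = rng.uniform(size=(10, 2))             # contaminate with independent outliers

tau, _ = stats.kendalltau(uv[:, 0], uv[:, 1])
theta_mom = 2 * tau / (1 - tau)                 # invert tau = theta / (theta + 2)
print("tau-inversion estimate:", theta_mom, "(true:", theta, ")")
```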
Mousa Golalizadeh, Sedigheh Noorani, Volume 16, Issue 1 (9-2022)
Abstract
Nowadays, observations in many scientific fields, including the biological sciences, are often high dimensional, meaning the number of variables exceeds the number of samples. One problem in model-based clustering of such data is the estimation of too many parameters. To overcome this, the dimension of the data must first be reduced before clustering, which can be done through dimension reduction methods. In this context, an approach that has recently received increasing attention is the random projections method. This paper studies the method from theoretical and practical perspectives and shows its superiority over some conventional approaches, such as principal component analysis and variable selection, in analyzing three real data sets.
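A minimal sketch of project-then-cluster with a Gaussian random projection, on synthetic data where n is far smaller than p; the component counts are arbitrary choices:

```python
import numpy as np
from sklearn.random_projection import GaussianRandomProjection
from sklearn.mixture import GaussianMixture
from sklearn.metrics import adjusted_rand_score

rng = np.random.default_rng(12)
n, p = 60, 500                                  # n samples, p >> n variables
labels = np.repeat([0, 1], n // 2)
X = rng.normal(size=(n, p))
X[labels == 1, :20] += 3.0                      # clusters differ in 20 coordinates

# project to a low dimension, then fit the model-based clustering there
Z = GaussianRandomProjection(n_components=10, random_state=0).fit_transform(X)
pred = GaussianMixture(n_components=2, random_state=0).fit_predict(Z)
print("ARI after projection:", adjusted_rand_score(labels, pred))
```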
Dr Zahra Rezaei Ghahroodi, Zhina Aghamohamadi, Volume 16, Issue 1 (9-2022)
Abstract
With the advent of big data over the last two decades, exploiting this type of data requires integrating databases to build a stronger evidence base for policy and service development, a need felt more than ever. Familiarity with record linkage methodology, as one of the methods of data integration, and with machine learning methods that facilitate the linkage process is therefore essential. In this paper, in addition to introducing the record linkage process and some related methods, machine learning algorithms are used to increase the speed of database integration, reduce costs, and improve record linkage performance. Two databases, from the Statistical Center of Iran and the Social Security Organization, are linked.
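A toy sketch of supervised record linkage: candidate pairs are turned into similarity features and a classifier scores the match probability. The names, years, and labels are invented, and real pipelines add blocking and many more comparators:

```python
import numpy as np
from difflib import SequenceMatcher
from sklearn.linear_model import LogisticRegression

def sim(a, b):
    # string similarity in [0, 1]
    return SequenceMatcher(None, a, b).ratio()

# toy candidate pairs (name_A, name_B, birth_year_A, birth_year_B, match?)
pairs = [("ali rezaei", "ali rezaee", 1360, 1360, 1),
         ("sara ahmadi", "sarah ahmadi", 1355, 1355, 1),
         ("ali rezaei", "hossein karimi", 1360, 1342, 0),
         ("maryam moradi", "maryam moradi", 1349, 1349, 1),
         ("sara ahmadi", "reza ghasemi", 1355, 1371, 0),
         ("hossein karimi", "maryam moradi", 1342, 1349, 0)]

X = np.array([[sim(a, b), float(ya == yb)] for a, b, ya, yb, _ in pairs])
y = np.array([m for *_, m in pairs])
clf = LogisticRegression().fit(X, y)            # match / non-match classifier
print(clf.predict_proba([[sim("ali rezaie", "ali rezaei"), 1.0]])[:, 1])
```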
Lida Kalhori Nadrabadi, Zohreh Fallah Mohsekhani, Volume 16, Issue 1 (9-2022)
Abstract
In countries where labour force surveys are based on rotating samples, with sample units partially shared across periods, the number of status changes can be estimated and presented as flow statistics. Response error is one of the essential non-sampling errors in labour force statistics, and it is compounded in flow statistics. Usually, the classification error of flow statistics is estimated using re-interview methods, which are costly and complex. This paper presents the process of estimating flow statistics and appropriate models for calculating their classification error, and examines the feasibility of each model under Iran's sample rotation pattern. Finally, the Markov latent class model, assuming unequal transition probabilities based on Iran's rotation pattern for labour force samples, is introduced as a suitable model for estimating the classification error of flow statistics in Iran, using labour force survey data from 2019 and 2020.
Mr Reza Zabihi Moghadam, Dr Masoud Yarmohammadi, Dr Hossein Hassani, Dr Parviz Nasiri, Volume 16, Issue 2 (3-2023)
Abstract
The Singular Spectrum Analysis (SSA) method is a powerful nonparametric method for time series analysis that has attracted attention for features such as requiring no stationarity assumptions and no minimum number of observations. The main purpose of SSA is to decompose a time series into interpretable components such as trend, oscillatory components, and unstructured noise. In recent years, researchers in various fields have made continuous efforts to improve this method, especially for time series prediction. In this paper, a new method for improving SSA prediction using the Kalman filter algorithm in structural models is introduced. The performance of this method and some generalizations of SSA are then compared with basic SSA using the root mean square error criterion, on data simulated from structural models and on real data on gas consumption in the UK. The results show that the newly introduced method is more accurate than the other methods.
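A sketch of the basic SSA decomposition the paper builds on (embedding, SVD, Hankelization); the Kalman-filter refinement itself is not reproduced:

```python
import numpy as np

def ssa_components(x, L):
    # embed the series into an L x K trajectory matrix, take its SVD, and
    # reconstruct one series per singular triple by anti-diagonal averaging
    N = len(x)
    K = N - L + 1
    T = np.column_stack([x[i:i + L] for i in range(K)])   # trajectory matrix
    U, s, Vt = np.linalg.svd(T, full_matrices=False)
    comps = []
    for j in range(len(s)):
        Tj = s[j] * np.outer(U[:, j], Vt[j])
        # Hankelization: average each anti-diagonal back into a series value
        comps.append(np.array([np.mean(Tj[::-1].diagonal(k))
                               for k in range(-(L - 1), K)]))
    return np.array(comps)

t = np.arange(200)
x = 0.02 * t + np.sin(2 * np.pi * t / 12) \
    + np.random.default_rng(13).normal(0, 0.3, 200)
c = ssa_components(x, L=48)
signal = c[:3].sum(axis=0)                      # leading triples: trend + season
print("residual std:", np.std(x - signal))
```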
Dr Einolah Deiri, Dr Ezzatallah Jamkhaneh, Volume 16, Issue 2 (3-2023)
Abstract
In this paper, a new integer-valued autoregressive process based on the discrete exponential-Weibull distribution is introduced to model integer-valued time series data. Given the importance of discrete distributions in modeling count data, the discrete counterpart of the exponential-Weibull distribution is introduced, and some of its statistical properties, such as the survival function, hazard rate, moment generating function, skewness, and kurtosis, are investigated. The Fisher dispersion, skewness, and kurtosis indices show the flexibility and efficiency of the discrete exponential-Weibull distribution in fitting different types of count data: it accommodates overdispersion, underdispersion, and equidispersion, as well as long right tails (right skewness) and heavy tails. The model parameters are estimated using three approaches: conditional maximum likelihood, generalized conditional least squares, and Yule-Walker. Finally, the efficiency and superiority of the process in fitting counts of deaths due to COVID-19 are compared with other competing models.
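A sketch of an INAR(1) process with binomial thinning and the Yule-Walker estimate of the thinning parameter; Poisson innovations stand in for the paper's discrete exponential-Weibull innovation distribution:

```python
import numpy as np

rng = np.random.default_rng(14)

def inar1(n, alpha, innov_sampler):
    # INAR(1): X_t = alpha o X_{t-1} + eps_t, where 'o' is binomial
    # thinning and eps_t are iid count innovations
    x = np.zeros(n, dtype=int)
    for t in range(1, n):
        x[t] = rng.binomial(x[t - 1], alpha) + innov_sampler()
    return x

# Poisson stand-in innovations (the paper uses discrete exponential-Weibull)
x = inar1(300, alpha=0.5, innov_sampler=lambda: rng.poisson(2.0))

# Yule-Walker: for INAR(1) the lag-1 autocorrelation equals alpha
xc = x - x.mean()
alpha_yw = np.sum(xc[1:] * xc[:-1]) / np.sum(xc ** 2)
print("Yule-Walker alpha:", alpha_yw)
```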
Dr. Me'raj Abdi, Dr. Mohsen Madadi, Volume 17, Issue 1 (9-2023)
Abstract
This paper proposes a different approach to analyzing three-way contingency tables using conditional independence. We show that the different types of independence explored in log-linear models can be obtained without these models, using only conditional independence. Some numerical examples illustrate the proposed methods.
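A small worked example of testing X independent of Y given Z directly from a three-way table, without fitting a log-linear model, by summing the per-slice likelihood-ratio statistics; the counts are invented:

```python
import numpy as np
from scipy import stats

# toy 2 x 2 x 2 table: counts[i, j, k] for variables (X, Y, Z)
counts = np.array([[[20, 12], [8, 15]],
                   [[18, 10], [9, 14]]], dtype=float)

# X independent of Y given Z: within each Z-slice the expected counts are
# (row total * column total) / slice total; sum the G^2 contributions
g2, df = 0.0, 0
for k in range(counts.shape[2]):
    t = counts[:, :, k]
    exp = np.outer(t.sum(1), t.sum(0)) / t.sum()
    g2 += 2 * np.sum(t * np.log(t / exp))
    df += (t.shape[0] - 1) * (t.shape[1] - 1)
print("G2 =", g2, " p-value =", stats.chi2.sf(g2, df))
```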