DOI 10.17586/0021-3454-2024-67-9-751-758
UDC 658.562.012.7, 519.233.33, 519.6
OPTIMAL AGGREGATION OF CLUSTERED SAMPLE INTERVALS FOR APPLYING THE χ2 TEST
P. M. Vinnik
D. F. Ustinov Baltic State Technical University VOENMEH, Department of Higher Mathematics;
T. V. Vinnik
St. Petersburg State Institute of Technology, Department of Mathematics; Associate Professor
E. A. Eskova
D. F. Ustinov Baltic State Technical University VOENMEH, Department of Higher Mathematics; Assistant
Reference for citation: Vinnik P. M., Vinnik T. V., Eskova E. A. Optimal aggregation of clustered sample intervals for applying the χ2 test. Journal of Instrument Engineering. 2024. Vol. 67, N 9. P. 751–758 (in Russian). DOI: 10.17586/0021-3454-2024-67-9-751-758.
Abstract. The choice between intervals of equal length and intervals of equal probability when applying χ2-type tests is discussed. Intervals of equal probability are determined by the distribution law being tested. When an initial sample is formed from real production data, it is often already grouped, with grouping boundaries that are fixed in production and cannot be changed, and so it may not satisfy the recommendations for applying χ2-type tests. A method is proposed for constructing a set of optimal grouping intervals by merging some of the intervals available in the initial sample. A set of intervals is considered optimal if it has the least squared deviation of the weighted hit frequencies from a discrete uniform distribution. This definition makes it unnecessary to change the set of intervals when the hypothesized distribution law changes, and it automatically solves the problem of choosing the optimal number of intervals. Some properties of such sets are listed, situations arising during their construction are illustrated with examples, and an example of forming such an optimal set is given.
Keywords: distribution law, empirical data, χ2 test, grouping intervals, grouped samples, grouping optimality
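The merging idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the abstract does not specify how the hit frequencies are weighted, so the sketch assumes plain relative frequencies, measures the deviation as the sum of squared differences from 1/k for k merged groups, and searches all contiguous merges by brute force. The function names and the `min_groups` parameter are hypothetical; `min_groups` is needed here only because, under this simplified objective, merging everything into a single group trivially gives zero deviation.

```python
from itertools import combinations


def merge_counts(counts, cuts):
    """Sum the counts inside each contiguous group.

    `cuts` holds the indices at which a new group starts.
    """
    bounds = [0, *cuts, len(counts)]
    return [sum(counts[a:b]) for a, b in zip(bounds, bounds[1:])]


def uniform_deviation(group_counts):
    """Squared deviation of relative frequencies from the discrete
    uniform distribution on k groups (assumed objective, unweighted)."""
    n = sum(group_counts)
    k = len(group_counts)
    return sum((c / n - 1 / k) ** 2 for c in group_counts)


def best_aggregation(counts, min_groups=2):
    """Brute-force search over all merges of adjacent intervals.

    Returns (deviation, cuts, merged_counts), or None if no merge
    with at least `min_groups` groups exists. Ties are resolved in
    favour of the first candidate found, i.e. fewer groups.
    """
    m = len(counts)
    best = None
    for k in range(min_groups, m + 1):
        # Choose k - 1 cut positions among the m - 1 interior boundaries.
        for cuts in combinations(range(1, m), k - 1):
            groups = merge_counts(counts, cuts)
            dev = uniform_deviation(groups)
            if best is None or dev < best[0]:
                best = (dev, cuts, groups)
    return best


if __name__ == "__main__":
    # Hypothetical pre-grouped production counts with fixed boundaries.
    print(best_aggregation([1, 1, 2, 4]))
```

The search visits every subset of the m − 1 interior boundaries, so it is practical only for a modest number of original intervals. Because the objective uses only the observed counts, the chosen grouping does not depend on the distribution law under test, which matches the property stated in the abstract.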
References:
- Mittag H.-J., Rinne H. Statistische Methoden der Qualitätssicherung, München, Wien, 1993.
- Kobzar' A.I. Prikladnaya matematicheskaya statistika. Dlya inzhenerov i nauchnykh rabotnikov (Applied Mathematical Statistics. For Engineers and Scientists), Moscow, 2006, 816 p. (in Russ.)
- Lemeshko B.Yu., Postovalov S.N. Izvestiya vuzov. Fizika, 1995, no. 9, pp. 39–45. (in Russ.)
- Novitskiy P.V., Zograf I.A. Otsenka pogreshnostey rezul'tatov izmereniy (Evaluation of Errors in Measurement Results), Leningrad, 1991, 304 p. (in Russ.)
- Hald A. Statistical Theory with Engineering Applications, NY, Wiley, 1952, 783 p.
- Mann H.B., Wald A. Ann. Math. Stat., 1942, vol. 13, pp. 306–317.
- Williams C.A., Jr. Journal of the American Statistical Association, 1950, no. 249(45), pp. 77–86.
- Lemeshko B.Yu., Postovalov S.N. Industrial Laboratory. Diagnostics of Materials, 1998, no. 5(64), pp. 56–63. (in Russ.)
- Lemeshko B.Yu., Chimitova E.V. Industrial Laboratory. Diagnostics of Materials, 2003, vol. 69, pp. 61–67. (in Russ.)
- Kulldorff G. Contributions to the theory of estimation from grouped and partially grouped samples, NY, John Wiley, 1963, 144 p.
- https://03.rosstat.gov.ru/storage/mediabank/05_tom1(1).pdf.
- Vinnik P.M., Vinnik T.V., Eskova E.A. Bulletin of Education and Development of Science of the Russian Academy of Natural Sciences, 2022, no. 4, pp. 24–30, DOI: 10.26163/RAEN.2022.32.23.003. (in Russ.)
- Andrews G.E. The theory of partitions, Addison-Wesley Pub. Co., 1976, 255 p.
- Korf R. IJCAI'09, Proceedings of the 21st International Joint Conference on Artificial Intelligence, Pasadena, California, USA, July 11–17, 2009, pp. 538–543.
- Bardasov S.A. Bulletin of Tyumen State University, 2003, no. 5, pp. 217–219. (in Russ.)