Appendix

The program code in R is provided both for continuous target prediction (the Query-Based Regression Algorithm, QBRA) and for binary target prediction (the Query-Based Classification Algorithm, QBCA), as a prototype. It must be mentioned that for-loops are inefficient in R, so the code below serves as a working proof of concept rather than a production implementation. R was chosen because it is a vectorized language and therefore lets the reader quickly grasp the algorithms discussed in this thesis. For application in real systems, one is strongly recommended to use the doParallel and future packages from CRAN or, better, a different language such as Java or C.

The QBRA code is as follows:
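As a minimal sketch of the doParallel recommendation above: an embarrassingly parallel loop, such as the premise-mining iterations in the code below, can be rewritten with foreach/%dopar% without changing the per-iteration body. The toy body here (a bootstrap mean over random subsamples) and all variable names are illustrative only, not part of the thesis code.

```r
# Sketch: replacing a slow R for-loop with a parallel foreach/%dopar% loop.
# The per-iteration body (a bootstrap mean over a random subsample) is a
# stand-in for the real per-iteration work of the algorithm.
library(doParallel)   # attaches foreach and parallel as dependencies

cl <- makeCluster(2)      # start 2 worker processes
registerDoParallel(cl)    # register them as the foreach backend

set.seed(1)
x <- rnorm(1000)
n.iter <- 100

# Iterations run concurrently on the workers; .combine = c gathers a vector
boot.means <- foreach(i = 1:n.iter, .combine = c) %dopar% {
  mean(sample(x, size = 0.1 * length(x), replace = TRUE))
}

stopCluster(cl)
print(length(boot.means))
```

The future ecosystem (e.g. the future.apply package) offers a similar drop-in pattern; both approaches leave the loop body unchanged, which makes them a low-effort upgrade before porting to Java or C.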
meet.operator = function(x, y) { # x and y are data.frames with named columns
  res = data.frame(lb = 0, ub = 0, stringsAsFactors = FALSE)[-1, ]
  for (i in 1:ncol(x)) {
    res[nrow(res) + 1, ] = c(min(rbind(x, y)[, i]), max(rbind(x, y)[, i]))
  }
  row.names(res) = names(x)
  res
}

image.find = function(desc, data) { # desc is a pattern (interval description)
  res = which(data[, 1] >= desc[1, "lb"] & data[, 1] <= desc[1, "ub"])
  for (i in 2:nrow(desc)) {
    res = intersect(res,
                    which(data[, i] >= desc[i, "lb"] & data[, i] <= desc[i, "ub"]))
  }
  res
}

lazy.evaluator = function(vl, tr, # Data
                          target = "rr_av",
                          vars = names(tr)[names(tr) != target], # Key fields
                          subsample.size = 0.01, n.iter = 1000,
                          alpha.threshold = 0.01, allowed.dropout = 0,
                          capped = TRUE, account.for.anti.support = TRUE,
                          penalize.for.high.deviation = TRUE # Tuning parameters
) {
  naive_avg = mean(tr[[target]])
  naive_med = median(tr[[target]])
  t0 = Sys.time()
  for (k in 1:nrow(vl)) {
    gt = vl[k, vars]
    premises = list()
    # ############################################
    # Mining of the premises for test object gt:
    # ############################################
    for (i in 1:n.iter) {
      g.random.set = tr[sample(x = 1:nrow(tr), size = subsample.size * nrow(tr)),
                        c(vars, target)]
      # Intersect the objects from the random extraction and the test object gt:
      desc0 = meet.operator(gt, g.random.set[, -ncol(g.random.set)]) # -ncol() drops the target variable
      # Save all target values for the histogram:
      h0 = g.