1998. P. 169–173.
[42] Collier N., Nobata C., Tsujii J. Extracting the names of genes and gene products with a hidden Markov model // In: Proc. of the 18th Conference on Computational Linguistics, Volume 1. Association for Computational Linguistics, 2000. P. 201–207.
[43] Cortes C., Vapnik V. Support-vector networks // Machine Learning. 1995. Vol. 20. No. 3. P. 273–297.
[44] Coyotl-Morales R. M., Villasenor-Pineda L., Montes-y-Gomez M., Rosso P. Authorship attribution using word sequences // In: Proc. of the Iberoamerican Congress on Pattern Recognition. 2006. P. 844–853.
[45] Coyotl-Morales R. M., Villasenor-Pineda L., Montes-y-Gomez M., Rosso P. Grouping multidimensional data: Recent Advances in Clustering. Springer, 2006.
[46] Deza M. M., Deza E. Encyclopedia of Distances. Springer, 2009.
[47] Dhillon I., Guan Y., Kogan J. Iterative clustering of high dimensional text data augmented by local search // In: Proc. of the 2nd IEEE Data Mining Conference. 2002.
[48] Diederich J., Kindermann J., Leopold E., Paas G. Authorship attribution with support vector machines // Applied Intelligence. 2003. Vol. 19. No. 1. P. 109–123.
[49] Drucker H., Wu D., Vapnik V. N. Support vector machines for spam categorization // IEEE Transactions on Neural Networks. 1999. Vol. 10. No. 5. P. 1048–1054.
[50] Duda R. O., Hart P. E., Stork D. G. Pattern classification. John Wiley & Sons, 2012.
[51] Dudoit S., Fridlyand J. A prediction-based resampling method for estimating the number of clusters in a dataset // Genome Biology. 2002. Vol. 3. No. 7. P. 112–129.
[52] Dumais S. T. et al. Latent semantic indexing (LSI) and TREC-2 // NIST Special Publication SP. 1994. P. 105.
[53] Dunn J. C. Well-separated clusters and optimal fuzzy partitions // Journal of Cybernetics. 1974. Vol. 4. No. 1. P. 95–104.
[54] Feng G. et al. A Bayesian feature selection paradigm for text classification // Information Processing & Management. 2012. Vol. 48. No. 2. P. 283–302.
[55] Filippone M. et al. A survey of kernel and spectral methods for clustering // Pattern Recognition. 2008. Vol. 41. No. 1. P. 176–190.
[56] Forgy E. W. Cluster analysis of multivariate data: efficiency vs interpretability of classifications // Biometrics. 1965. Vol. 21. P. 768–769.
[57] Frery J., Largeron C., Juganaru-Mathieu M. UJM at CLEF in author verification based on optimized classification trees // In: Proc. of the CLEF 2014.
[58] Fukunaga K. Introduction to Statistical Pattern Recognition. New York: Academic Press, 1972. 618 p.
[59] Gordon A. D. Identifying genuine clusters in a classification // Computational Statistics & Data Analysis. 1994. Vol. 18. No. 5. P. 561–581.
[60] Granichin O., Kizhaeva N., Shalymov D., Volkovich Z. Writing style determination using the KNN text model // In: Proc. of the 2015 IEEE International Symposium on Intelligent Control. Sydney, Australia, 2015. September 21–23. P. 900–905.
[61] Granichin O., Volkovich V., Toledano-Kitai D. Randomized Algorithms in Automatic Control and Data Mining. Springer-Verlag: Heidelberg, New York, Dordrecht, London, 2015. 251 p.
[62] Gregor H. Parameter Estimation for Text Analysis. Technical report. 2005.
[63] Griffiths T. L., Steyvers M. Finding scientific topics // In: Proc. of the National Academy of Sciences. 2004. Vol. 101. Suppl. 1. P. 5228–5235.
[64] Günal S. et al. On feature extraction for spam e-mail detection // International Workshop on Multimedia Content Representation, Classification and Security. Springer, Berlin, Heidelberg, 2006. P. 635–642.
[65] Halvani O., Steinebach M. An efficient intrinsic authorship verification scheme based on ensemble learning // In: Proc. of the 9th International Conference on Availability, Reliability and Security. 2014. P. 571–578.
[66] Han E. H. S., Karypis G., Kumar V. Text categorization using weight adjusted k-nearest neighbor classification // Pacific-Asia Conference on Knowledge Discovery and Data Mining. Springer, Berlin, Heidelberg, 2001. P. 53–65.
[67] Han E. H. S., Karypis G. Centroid-based document classification: analysis and experimental results // European Conference on Principles of Data Mining and Knowledge Discovery. Springer, Berlin, Heidelberg, 2000. P. 424–431.
[68] Han J., Pei J., Kamber M. Data Mining: Concepts and Techniques. Elsevier, 2011.
[69] Hartigan J. A. Clustering Algorithms (Probability & Mathematical Statistics). New York: Wiley, 1975. 351 p.
[70] Hofmann T. Probabilistic latent semantic indexing // ACM SIGIR Forum. ACM, 2017. Vol. 51. No. 2. P. 211–218.
[71] Hoover D. L. Testing Burrows's delta // Literary and Linguistic Computing. 2004. Vol. 19. No. 4. P. 453–475.
[72] Hopfield J. Neurons with graded response have collective computational properties like those of two-state neurons // In: Proc. of the National Academy of Sciences. 1984. Vol. 81. P. 3088–3092.
[73] Hubert L., Arabie P. Comparing partitions // Journal of Classification. 1985. Vol. 2. No. 1. P. 193–218.
[74] Hubert L., Schultz J. Quadratic assignment as a general data analysis strategy // British Journal of Mathematical and Statistical Psychology. 1976. Vol. 29. No. 2. P. 190–241.
[75] Hughes J. M., Foti N. J., Krakauer D. C., Rockmore D. N. Quantitative patterns of stylistic influence in the evolution of literature // In: Proc. of the National Academy of Sciences. 2012. Vol. 109. No. 20. P. 7682–7686.
[76] James M. Classification Algorithms. Wiley-Interscience, 1985.
[77] Jankowska M., Keselj V., Milios E. E. Proximity based one-class classification with common N-gram dissimilarity for authorship verification task // In: Proc. of the CLEF 2013 Evaluation Labs and Workshop. 2013. P. 23–26.
[78] Joachims T. A statistical learning model of text classification for support vector machines // In: Proc. of the 24th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 2001. P. 128–136.
[79] Joachims T. Text categorization with support vector machines: Learning with many relevant features // European Conference on Machine Learning. Springer, Berlin, Heidelberg, 1998. P. 137–142.
[80] Juola P. Authorship attribution // Foundations and Trends in Information Retrieval. 2006. Vol. 1. No. 3. P. 233–334.
[81] Kalt T., Croft W. B. A new probabilistic model of text classification and retrieval. Technical Report IR-78, University of Massachusetts Center for Intelligent Information Retrieval. 1996.
[82] Kaufman L., Rousseeuw P. J. Finding Groups in Data: An Introduction to Cluster Analysis. John Wiley, 1990.
[83] Kaufman L., Rousseeuw P. J. Finding Groups in Data: An Introduction to Cluster Analysis. John Wiley & Sons, 2009. Vol. 344.
[84] Kendall M. G., Gibbons J. D. Rank Correlation Methods. Edward Arnold, 1990.
[85] Kestemont M., Luyckx K., Daelemans W., Crombez T. Cross-genre authorship verification using unmasking // English Studies. 2012. Vol. 93. No. 3. P. 340–356.
[86] Kestemont M., Luyckx K., Daelemans W. Intrinsic plagiarism detection using character trigram distance scores // In: Proc. of the PAN 2012 Lab Uncovering Plagiarism, Authorship, and Social Software Misuse held in conjunction with the CLEF 2012 Conference. 2011. P. 8.
[87] Kizhaeva N., Shalymov D., Granichin O., Volkovich Z. Studying of KNN two-sample test approach applications for writing style comparison of English and Russian text collections // In: Proc. of the AINL-ISMW FRUCT (Artificial Intelligence and Natural Language & Information Extraction, Social Media and Web Search). ITMO University, FRUCT Oy, Finland. Saint-Petersburg, Russia, 2015. November 9–14. P. 163–166.
[88] Kizhaeva N., Volkovich Z., Granichin O., Granichina O., Kiyaev V. Spectral profiling of writing process // In: Proc. of the 2017 IEEE Conference on Control Technology and Applications. Kohala Coast, Hawaii, USA, 2017. August 27–30. P. 2063–2068.
[89] Koppel M., Schler J., Argamon S. Computational methods in authorship attribution // Journal of the American Society for Information Science and Technology. 2009. Vol. 60. No. 1. P. 9–26.
[90] Koppel M., Winter Y. Determining if two documents are written by the same author // Journal of the American Society for Information Science and Technology. 2014. Vol. 65. No. 1. P. 178–187.
[91] Krzanowski W. J., Lai Y. T. A criterion for determining the number of groups in a data set using sum-of-squares clustering // Biometrics. 1988. P. 23–34.
[92] Kulkarni V., Al-Rfou R., Perozzi B., Skiena S. Statistically significant detection of linguistic change // In: Proc. of the 24th International Conference on World Wide Web. 2015. P. 11.
[93] Lam W., Ho C. Y. Using a generalized instance set for automatic text categorization // In: Proc. of the 21st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 1998. P. 81–89.
[94] Lance G. N., Williams W. T. A general theory of classification sorting strategies: Hierarchical System // The Computer Journal. 1967. Vol. 9. P. 373–380.
[95] Lance G. N., Williams W. T. Computer programs for hierarchical polythetic classification ("similarity analyses") // The Computer Journal. 1966. Vol. 9. No. 1. P. 60–64.
[96] Lemberg D., Soffer A., Volkovich Z. New approach for plagiarism detection // International Journal of Applied Mathematics. 2016. Vol. 29. No. 3. P. 365–371.
[97] Lewis D. D. Naive (Bayes) at forty: The independence assumption in information retrieval // European Conference on Machine Learning. Springer, Berlin, Heidelberg, 1998. P. 4–15.
[98] Lovins J. B. Development of a stemming algorithm // Mech. Translat. & Comp. Linguistics. 1968. Vol. 11. No. 1–2. P. 22–31.
[99] Luyckx K., Daelemans W. Authorship attribution and verification with many authors and limited data // In: Proc. of the 22nd International Conference on Computational Linguistics. 2008. P. 513–520.
[100] Manning C., Schutze H. Foundations of Statistical Natural Language Processing. MIT Press.