— 2003.
[101] McCallum A. K. Bow: A toolkit for statistical language modeling, text retrieval, classification and clustering. — 1996.
[102] McCallum A. K. Mallet: A machine learning for language toolkit. — 2002.
[103] McCallum A. et al. A comparison of event models for naive Bayes text classification // AAAI-98 workshop on learning for text categorization. — 1998. — Vol. 752. — No. 1. — P. 41–48.
[104] McCallum A. et al. Improving text classification by shrinkage in a hierarchy of classes // ICML. — 1998. — Vol. 98. — P. 359–367.
[105] Mcauliffe J. D., Blei D. M. Supervised topic models // Advances in neural information processing systems. — 2008. — P. 121–128.
[106] Mika S. et al. Kernel PCA and de-noising in feature spaces // Advances in neural information processing systems. — 1999. — P. 536–542.
[107] Milligan G. W., Cooper M. C. An examination of procedures for determining the number of clusters in a data set // Psychometrika. — 1985. — Vol. 50. — No. 2. — P. 159–179.
[108] Mitchell T. M. et al. Machine learning // Burr Ridge, IL: McGraw-Hill. — 1997. — Vol. 45. — No. 37. — P. 870–877.
[109] Murtagh F. A survey of recent advances in hierarchical clustering algorithms // The Computer Journal. — 1983. — Vol. 26. — No. 4. — P. 354–359.
[110] Murtagh F. Complexities of hierarchic clustering algorithms: state of the art // Computational Statistics Quarterly. — 1984. — Vol. 1. — No. 2. — P. 101–113.
[111] Ng A. Y., Jordan M. I. On discriminative vs. generative classifiers: A comparison of logistic regression and naive Bayes // Advances in neural information processing systems. — 2002. — P. 841–848.
[112] Nigam K. et al. Learning to classify text from labeled and unlabeled documents // AAAI/IAAI. — 1998. — Vol. 792.
[113] Oliveira W., Justino E., Oliveira L. S. Comparing compression models for authorship attribution // Forensic Science International. — 2013. — Vol. 228, No. 1. — P. 100–104.
[114] Osuna E., Freund R., Girosi F. Training support vector machines: an application to face detection // In: Proc. of the IEEE computer society conference on Computer vision and pattern recognition. — 1997. — P. 130–136.
[115] Peng F., Schuurmans D., Keselj V., Wang S. Augmenting naive Bayes classifiers with statistical language models // Information Retrieval. — 2004. — Vol. 7. — P. 317–345.
[116] Popkov Yu. S., Dubnov Yu. A., Popkov A. Yu. Randomized machine learning:
[117] Porter M. F. An algorithm for suffix stripping // Program. — 1980. — Vol. 14. — No. 3. — P. 130–137.
[118] Rachev S. Probability Metrics and the Stability of Stochastic Models // John Wiley & Sons Ltd. — 1991.
[119] Rand W. Objective criteria for the evaluation of clustering methods // Journal of the American Statistical Association. — 1971. — Vol. 66, No. 336. — P. 846–850.
[120] Rocchio J. J. Relevance feedback in information retrieval // The SMART Retrieval System: Experiments in Automatic Document Processing. — 1971. — P. 313–323.
[121] Rosenblatt F. Principles of Neurodynamics. — New York: Spartan Press. — 1962. — 616 p.
[122] Rousseeuw P. J. Silhouettes: a graphical aid to the interpretation and validation of cluster analysis // Journal of computational and applied mathematics. — 1987. — Vol. 20. — P. 53–65.
[123] Rudman J. The state of authorship attribution studies: some problems and solutions // Computers and the Humanities. — 1998. — Vol. 31. — P. 351–365.
[124] Salton G., Buckley C. Term-weighting approaches in automatic text retrieval // Information processing & management. — 1988. — Vol. 24. — No. 5. — P. 513–523.
[125] Salton G., McGill M. J. Introduction to Modern Information Retrieval. — New York: McGraw-Hill. — 1983.
[126] Salton G., Wong A., Yang C. S. A vector space model for automatic indexing // Communications of the ACM. — 1975. — Vol. 18. — No. 11. — P. 613–620.
[127] Schoenberg I. J. Metric spaces and positive definite functions // Transactions of the American Mathematical Society. — 1938. — Vol. 44. — No. 3. — P. 522–536.
[128] Scholkopf B., Smola A. J. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond. — MIT Press, 2001.
[129] Shalymov D., Granichin O., Klebanov L., Volkovich Z. Literary writing style recognition via a minimal spanning tree-based approach // Expert Systems with Applications. — 2016. — Vol. 61. — P. 145–153.
[130] Sidorov G., Velasquez F., Stamatatos E., Gelbukh A., Chanona-Hernandez L. Non-continuous syntactic N-grams // Expert Systems with Applications. — 2014. — Vol. 41. — No. 3. — P. 853–860.
[131] Sidorov G. Non-continuous Syntactic N-grams // International Journal of Computational Linguistics and Applications. — 2014. — Vol. 5, No. 1. — P. 139–158.
[132] Sidorov G. Non-continuous syntactic N-grams // Polibits. — 2013. — Vol. 48. — No. 1. — P. 67–75.
[133] Stamatatos E., Daelemans W., Verhoeven B., Juola P., Lopez A., Potthast M., Stein B. Overview of the Author Identification Task at PAN 2015 // In: Proc. of the CLEF (Working Notes). — 2015.
[134] Stamatatos E. A Survey of modern authorship attribution methods // Journal of the American Society for Information Science and Technology. — 2009. — Vol. 60. — No. 3. — P. 538–556.
[135] Stamatatos E. Intrinsic plagiarism detection using character N-gram profiles // In: Proc. of the SEPLN 2009 Workshop on Uncovering Plagiarism, Authorship, and Social Software Misuse. — 2009. — P. 38–46.
[136] Stein S., Argamon S. A mathematical explanation of Burrows’s delta // In: Proc. of the Digital Humanities Conference. — 2006. — P. 207–209.
[137] Sugar C. A., James G. M. Finding the number of clusters in a dataset: An information-theoretic approach // Journal of the American Statistical Association. — 2003. — Vol. 98. — No. 463. — P. 750–763.
[138] Tan S., Wang Y., Wu G. Adapting centroid classifier for document categorization // Expert Systems with Applications. — 2011. — Vol. 38. — No. 8. — P. 10264–10273.
[139] Thompson R. A note on restricted maximum likelihood estimation with an alternative outlier model // Journal of the Royal Statistical Society, Series B: Methodological. — 1985. — Vol. 47. — P. 53–55.
[140] Vapnik V. N., Kotz S. Estimation of dependences based on empirical data. — New York: Springer-Verlag, 1982. — Vol. 40.
[141] Veltkamp R. C., Hagedoorn M. Shape similarity measures, properties and constructions // International Conference on Advances in Visual Information Systems. — Springer, Berlin, Heidelberg, 2000. — P. 467–476.
[142] Vidyasagar M. Randomized algorithms for robust controller synthesis using statistical learning theory // Automatica. — 2001. — Vol. 37. — No. 10. — P. 1515–1528.
[143] Willett P. Recent trends in hierarchic document clustering: a critical review // Information Processing & Management. — 1988. — Vol. 24. — No. 5. — P. 577–597.
[144] Wu H., Bu J., Chen C., Zhu J., Zhang L., Liu H., Wang C., Cai D. Locally discriminative topic modeling. — Elsevier. — 2012.
[145] Yang Y., Chute C. G. An example-based mapping method for text categorization and retrieval // ACM Transactions on Information Systems (TOIS). — 1994. — Vol. 12. — No. 3. — P. 252–277.
[146] Yang Y., Liu X. A re-examination of text categorization methods // In: Proc. of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. — ACM, 1999. — P. 42–49.
[147] Zhang H., Chow T. W. S. A coarse-to-fine framework to efficiently thwart plagiarism // Pattern Recognition. — 2011. — Vol. 44, No. 2. — P. 471–487.
[148] Zhang J., Yang Y. Robustness of regularized linear classification methods in text categorization // In: Proc. of the 26th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval. — ACM, 2003. — P. 190–197.
[149] Zhao Y., Zobel J. Effective and scalable authorship attribution using function words // In: Proc. of the Asia Information Retrieval Symposium. — 2000. — P. 174–189.
[150] Zolotarev V. M. Modern Theory of Summation of Random Variables. — Walter de Gruyter. — 1997.