Aug 26

Useful quotes – noted as I come across them

it is well-known that k-means has the major drawback of not being able to separate data points that are not linearly separable in the given feature space (e.g., see Dhillon et al. (2004))

Source: Mine the Easy, Classify the Hard: A Semi-Supervised Approach to Automatic Sentiment Classification
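
To make the quoted claim concrete, here is a minimal sketch in plain Python. It assumes a toy dataset of two concentric rings (my own construction, not data from the paper) and a hand-written Lloyd's-style k-means. With two centroids the assignment boundary is always a straight line, so the rings cannot be recovered from the raw coordinates. Re-running the same routine on a single non-linear feature, the distance from the origin, separates them. That radius feature is only an illustrative stand-in chosen for this example; it is not the kernel k-means method of Dhillon et al. (2004).

```python
import math


def kmeans(points, k, init, iters=100):
    """Plain Lloyd's algorithm; returns one cluster index per point."""
    centroids = [list(c) for c in init]
    labels = None
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        new_labels = [
            min(range(k),
                key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        if new_labels == labels:  # assignments stopped changing
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:  # keep the old centroid if a cluster is empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def accuracy(labels, truth):
    """Agreement with the true ring labels, up to swapping the two cluster ids."""
    agree = sum(l == t for l, t in zip(labels, truth))
    return max(agree, len(truth) - agree) / len(truth)


# Toy data (an assumption for illustration): rings of radius 1 and 5.
n = 100
angles = [2 * math.pi * i / n for i in range(n)]
inner = [(math.cos(a), math.sin(a)) for a in angles]
outer = [(5 * math.cos(a), 5 * math.sin(a)) for a in angles]
points = inner + outer
truth = [0] * n + [1] * n

# k-means on the raw (x, y) coordinates: the split is a straight line.
raw_labels = kmeans(points, 2, init=[points[0], points[-1]])
raw_acc = accuracy(raw_labels, truth)

# k-means on a non-linear feature: distance from the origin.
radii = [(math.hypot(x, y),) for x, y in points]
radius_labels = kmeans(radii, 2, init=[radii[0], radii[-1]])
radius_acc = accuracy(radius_labels, truth)

print(f"raw coordinates: {raw_acc:.2f}")
print(f"radius feature:  {radius_acc:.2f}")
```

On this toy data no straight line can label more than roughly 72% of the points correctly, so the raw-coordinate run stays well below perfect agreement, while the radius-feature run matches the ring labels exactly.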

 

For example, Foster et al. (2011) report a drastic drop in performance when moving from the Wall Street Journal (WSJ) domain (training set) to the Twitter dataset (used for evaluation)

Cited: Foster, Jennifer, Özlem Çetinoğlu, Joachim Wagner, Joseph Le Roux, Stephen Hogan, Joakim Nivre, Deirdre Hogan, and Josef van Genabith. 2011. #hardtoparse: POS Tagging and Parsing the Twitterverse. In Proceedings of the Workshop on Analyzing Microtext (AAAI 2011), pp. 20-25.