Social media, such as Twitter and online forums of all kinds, have become platforms where people express their opinions, implicitly or explicitly. On Twitter, for example, people follow those they trust and retweet the tweets they agree with. In online forums such as PoliticalForum.com, people state their opinions explicitly and interact with one another through text. Our goal is to understand people's stances on a given (political) issue from the online behaviors that social media captures, including the links they have created and the content they have generated. Understanding these stances is essential for national security and policy making. Existing attempts in this direction, however, oversimplify the problem in several respects. First, they usually treat user stance prediction as a binary classification problem, e.g., left or right, or positive or negative, even though the degree of a person's attitude is critical. Second, most existing work depends heavily on labels, which are impractical to obtain for large-scale social media data and cannot realistically be assigned when user stance is modeled as a numerical value. Third, most methods do not attempt to understand the rationale behind users' online behaviors. In contrast, (1) our proposed methods predict user stance as numerical values; (2) our methods are unsupervised and require no labels; and (3) our models are carefully designed to account for the rationality of users' choices. In particular, this keynote covers two specific user stance prediction problems: (1) political ideology detection for ordinary Twitter users via their heterogeneous types of links; and (2) user stance prediction in news commenting systems. These methodologies may benefit applications across a wide spectrum of domains.