Answer Quality Characteristics and Prediction on an Academic Q&A Site: A Case Study on ResearchGate
Lei Li, Daqing He, Wei Jeng, Spencer Goodwin, Chengzhi Zhang (IW3C2-2015)
The paper “Answer Quality Characteristics and Prediction on an Academic Q&A Site: A Case Study on ResearchGate” presents a study of ResearchGate, an online academic Q&A site. Social Q&A sites are platforms that allow users to (1) interact with others by asking questions and providing answers, and (2) evaluate others’ contributions. The quality of user-generated content on such sites is therefore very important. This is especially true on academic Q&A sites, where some questions have no widely accepted right answer and scholars engage in discussion to explore knowledge. Answer quality is usually judged by peers: scholars recommend answers to a question by voting for them. In the study, Li et al. examine the relationship between answer characteristics and peer-judged answer quality on ResearchGate, and use the informative characteristics as features to predict answer quality.
Li et al. present a framework for their study. First, they collect data from 107 Q&A threads in three disciplines (LIS, History of Art, and Astrophysics), comprising 1128 posts. Next, they derive answer characteristics and an answer quality score based on peer judgments (i.e., answer vote information). The answer characteristics include:
- Web-captured features: the responder’s participation score (named RG score on the site), the responder’s publication impact (called impact points), the participation score and publication impact of the responder’s institution, answer length, and response time.
- Human-coded features: social elements, consensus building (i.e., agreement), and the provision of factual information, resources, opinions, personal experience, or references to other researchers.
The quality label for each answer is based on its number of votes. They categorize all the answers into three groups: 505 low-quality answers (0 upvotes), 438 medium-quality answers (1 or 2 upvotes), and 78 high-quality answers (> 3 upvotes).
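The vote-based labelling can be illustrated with a minimal Python sketch. The function name `quality_label` and the sample vote counts are hypothetical and are not taken from the paper. The thresholds as summarized here leave an answer with exactly 3 upvotes unassigned, so the sketch assumes that such an answer counts as high quality.

```python
from collections import Counter


def quality_label(upvotes: int) -> str:
    """Map an answer's upvote count to a quality label (illustrative only)."""
    if upvotes < 0:
        raise ValueError("upvote count cannot be negative")
    if upvotes == 0:
        return "low"
    if upvotes <= 2:
        return "medium"
    # Assumption: exactly 3 upvotes is treated as high quality, because the
    # thresholds in the summary do not say where that case belongs.
    return "high"


# Hypothetical vote counts for illustration; these are not data from the paper.
example_votes = [0, 0, 1, 2, 5, 0, 4]
label_counts = Counter(quality_label(v) for v in example_votes)
print(label_counts)
```

The reported group sizes (505, 438, and 78) show that such a labelling is strongly imbalanced toward low and medium quality, which is worth keeping in mind when reading the classification results.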
After constructing the final answer data set, they first analyze the relationship between answer characteristics and answer quality. They find that the characteristics most strongly associated with higher quality are longer answers, earlier responses, responders with a higher RG score, and responders from institutions with a higher RG score. Other interesting findings are that social elements decrease the quality of an answer and that scholars’ impact points contribute little to it. Based on this analysis, Li et al. build several classification models to predict answer quality: logistic regression, Naïve Bayes, and SVM. For SVM, they compare the experimental results of four kernel functions to choose the best one. They evaluate the models with AUC, precision, recall, and F1 under 10-fold cross-validation. The results show that:
- Among the SVM kernel functions, the RBF kernel performs best.
- Comparing the three prediction models, the optimized SVM outperforms both Naïve Bayes and logistic regression.
They also analyze the importance of human-coded and web-captured features in the SVM model. Based on the results, they claim that web-captured features yield better performance than human-coded features, and that the model performs best when both kinds of features are combined.
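To make the evaluation setup more concrete, the following Python sketch shows how precision, recall, and F1 can be computed for one quality class, and how a data set can be split into 10 folds. It is an illustration under stated assumptions, not the authors' implementation: the helper names `precision_recall_f1` and `k_fold_indices` are hypothetical, the labels are made up, the metrics are computed one class at a time (the summary does not say how the three classes were averaged), and the folds are neither shuffled nor stratified. AUC is omitted because it requires classifier scores rather than hard labels.

```python
def precision_recall_f1(y_true, y_pred, positive):
    """Compute precision, recall, and F1, treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation."""
    # Assign sample i to fold i % k so that fold sizes differ by at most one.
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    for held_out, test_idx in enumerate(folds):
        train_idx = [j for f, fold in enumerate(folds) if f != held_out for j in fold]
        yield train_idx, test_idx


# Hypothetical labels for illustration; these are not results from the paper.
y_true = ["high", "low", "high", "medium", "high", "low"]
y_pred = ["high", "high", "low", "medium", "high", "low"]
print(precision_recall_f1(y_true, y_pred, positive="high"))

# Each of the 10 folds serves once as the held-out test set.
for train_idx, test_idx in k_fold_indices(20, k=10):
    print(len(train_idx), len(test_idx))
```

In a cross-validated comparison like the one described, each model would be trained on the training indices of a fold and scored on the held-out indices, with the scores averaged over the 10 folds.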
In this paper, Li et al. analyze the relationships between answer characteristics and answer quality. They identify several factors that contribute strongly to the quality of an answer, such as a higher responder RG score and greater answer length. Using the human-coded and web-captured features presented in the paper, they build models to predict answer quality. These results can help in designing better academic Q&A sites such as ResearchGate. However, because some characteristics in this study are coded by humans, the dataset available for building and testing the prediction models is limited. Those features could instead be extracted automatically, so the authors could improve the study by replacing human-coded features with machine-coded ones. Additionally, the reputation of the scholars who vote for an answer should be a strong indicator when labelling the quality of that answer.