OT-SC-WS-05 | Computational social sciences
Dr. Nikolitsa Grigoropoulou
Computational social science is a cross-disciplinary field that applies computational statistics to large-scale data, typically from digital (or digitized) sources, in order to understand human behavior. It encompasses a variety of observational, experimental, and simulation research designs and is popular in industry, government agencies, and academia alike. Quantitative text analytics, in particular, leverages the abundance of historical and real-time text produced by humans to generate insights, observe trends, and uncover dynamic patterns in ways that were previously unattainable with small-scale, qualitative analysis of texts or other forms of data. As such, text analytics is widely deployed to examine phenomena such as cultural change, population-level health indicators, disease outbreaks, discrimination in the legal system, and political behavior, and to support domains such as customer service, product development, disaster response, decision-making, and policy-making. Researchers across disciplines can therefore benefit from engaging with these methods and becoming familiar with their intricacies and their potential applications in academic research and industry.
- What is computational social science?
- An overview of text analytics and its workflow, including:
  - Theoretical and methodological foundations
  - Types of data sources, potentials and pitfalls
  - Data cleaning and data transformation
  - Methods of quantitative text analysis, with an emphasis on supervised and unsupervised text classification, topic modeling, and sentiment analysis
- Brief overview of text analysis software and statistical packages
- Application of quantitative text analysis in R and SAS Enterprise Miner
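To make the "data cleaning and data transformation" step concrete, here is a minimal sketch in plain Python. It is not taken from the workshop materials, which use R and SAS Enterprise Miner; the stop-word list, the function names, and the two example sentences are all assumptions made up for illustration. The sketch lowercases each text, keeps only alphabetic tokens, drops stop words, and then builds a small document-term matrix of word counts.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (illustration only).
STOP_WORDS = {"the", "a", "an", "is", "are", "and", "of", "to", "in"}

def clean(text):
    """Lowercase, keep alphabetic tokens only, and drop stop words."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def document_term_matrix(docs):
    """Return (vocabulary, matrix); matrix[i][j] counts term j in document i."""
    counts = [Counter(clean(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocab] for c in counts]
    return vocab, matrix

# Made-up example documents.
docs = [
    "The vaccine is safe and effective.",
    "Vaccine rumours are spreading in the city!",
]
vocab, dtm = document_term_matrix(docs)
```

Each row of `dtm` describes one document and each column one vocabulary term. Methods such as classification and topic modeling typically take a numeric representation of this kind as input.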
By the end of the workshop, the participants should be able to:
- define what computational social science (CSS) is and how it is related to other fields
- define text analytics and have a good grasp of its theoretical and methodological foundations
- understand the core workflow for text analysis
- perform basic data cleaning and data transformation
- identify different methods of quantitative text analysis, their fundamental characteristics, advantages, and limitations
- conduct basic text classification (and/or sentiment analysis)
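As an illustration of what "basic text classification" can look like, the following is a minimal sketch of a supervised classifier in plain Python: a multinomial naive Bayes model with add-one (Laplace) smoothing. It is one common baseline, not necessarily the method taught in the workshop, which works in R and SAS Enterprise Miner. The function names and the four training sentences are assumptions made up for illustration.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())

def train_naive_bayes(labeled_docs):
    """Count documents per class and word occurrences per class."""
    class_docs = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labeled_docs:
        tokens = tokenize(text)
        class_docs[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_docs, word_counts, vocab

def classify(text, model):
    """Return the class with the highest log posterior score."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    scores = {}
    for label in class_docs:
        total_words = sum(word_counts[label].values())
        score = math.log(class_docs[label] / total_docs)  # log prior
        for token in tokenize(text):
            if token in vocab:  # words never seen in training are ignored
                smoothed = (word_counts[label][token] + 1) / (total_words + len(vocab))
                score += math.log(smoothed)
        scores[label] = score
    return max(scores, key=scores.get)

# Made-up toy training data (illustration only).
training = [
    ("great service and friendly staff", "positive"),
    ("I love this product", "positive"),
    ("terrible service and rude staff", "negative"),
    ("I hate this product", "negative"),
]
model = train_naive_bayes(training)
prediction = classify("friendly staff and great product", model)
```

With positive/negative labels, as here, the same setup doubles as a simple supervised sentiment classifier. A training set of four sentences is only enough to show the mechanics; real applications need far more labeled data and a held-out test set to estimate accuracy.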
Fundamentals of statistics and quantitative research
- For the online format, a second screen may be beneficial
- R (optionally with RStudio)
- SAS Enterprise Miner (accessed for free through SAS OnDemand for Academics https://www.sas.com/en_us/software/on-demand-for-academics.html)
- Microsoft Excel
- Benoit, Ken. 2020. “Text as Data: An Overview.” Pp. 461–97 in The SAGE Handbook of Research Methods in Political Science and International Relations. London: SAGE Publications Ltd.
- Dumais, Susan T. 2004. “Latent Semantic Analysis.” Annual Review of Information Science and Technology 38(1):188–230. doi: 10.1002/aris.1440380105.
- Feldman, Ronen. 2013. “Techniques and Applications for Sentiment Analysis.” Communications of the ACM 56(4):82–89. doi: 10.1145/2436256.2436274.
- Medhat, Walaa, Ahmed Hassan, and Hoda Korashy. 2014. “Sentiment Analysis Algorithms and Applications: A Survey.” Ain Shams Engineering Journal 5(4):1093–1113. doi: 10.1016/j.asej.2014.04.011.
- Mohr, John W., and Petko Bogdanov. 2013. “Introduction—Topic Models: What They Are and Why They Matter.” Poetics 41(6):545–69. doi: 10.1016/j.poetic.2013.10.001.
Extended readings (available for free online):
- Aggarwal, Charu C., and Cheng Xiang Zhai, eds. 2012. Mining Text Data. Boston, MA: Springer US.
- Srivastava, Ashok N., and Mehran Sahami, eds. 2009. Text Mining: Classification, Clustering, and Applications. CRC Press.