Computational Social Science

Hanna Wallach

**Concept Graph & Summary using Claude 3.5 Sonnet | Chat GPT4o | Llama 3:**

graph LR
classDef main fill:#f9d4d4, font-weight:bold, font-size:14px
classDef intro fill:#d4f9d4, font-weight:bold, font-size:14px
classDef methods fill:#d4d4f9, font-weight:bold, font-size:14px
classDef models fill:#f9f9d4, font-weight:bold, font-size:14px
classDef inference fill:#f9d4f9, font-weight:bold, font-size:14px
classDef applications fill:#d4f9f9, font-weight:bold, font-size:14px
Main[Computational Social Science]
Main --> A[Introduction to Computational Social Science]
A --> A1[Computational social science: digital data, computational methods 1]
A --> A2[Research examples: recommendation systems, ideological positions, networks 2]
A --> A3[CS vs social sciences: study objects, methods 3]
A --> A4[Intersection: combining CS and social sciences 4]
A --> A5[Explanation vs prediction: causal theories, predictions 5]
Main --> B[Research Methods]
B --> B1[Exploratory analysis: uncovering patterns, informing research 6]
B --> B2[Exploratory models: unobservable constructs, adapted techniques 7]
B --> B3[Bayesian modeling: latent variables, probabilistic methods 8]
B --> B4[Blei's loop: specify, infer, assess, revise 9]
B --> B5[Model variables: observed, latent, theoretically justified 10]
B --> B6[Theoretical justification: operationalize theory for explanation 11]
Main --> C[Models and Components]
C --> C1[Example: inferring issue proportions from words 12]
C --> C2[Graphical models: visualize variables, relationships, replication 13]
C --> C3[Common components: regression, mixture, admixture, factorization 14]
Main --> D[Statistical Inference]
D --> D1[Statistical inference: computing posterior distribution 15]
D --> D2[Posterior inference: computational challenge, approximation 16]
D --> D3[MCMC sampling: draws posterior samples 17]
D --> D4[Geweke's test: validates MCMC implementation 18]
D --> D5[Variational inference: approximate posterior, optimization 19]
D --> D6[MCMC vs VI: age, foundations, exactness 20]
Main --> E[Model Validation and Theory]
E --> E1[Model validation: assess, improve using posterior 21]
E --> E2[Construct validation: appropriateness of variables 22]
E --> E3[Model checking: fit, comparison, selection 23]
E --> E4[Theory role: guides questions, construction, validation 24]
E --> E5[Balancing computation and theory: collaboration 25]
Main --> F[Applications and Collaboration]
F --> F1[Political science examples: author's research 26]
F --> F2[Ideal point models: measure ideological positions 27]
F --> F3[Representation research: text data, lawmakers' priorities 28]
F --> F4[Partisanship research: survey data, social identities 29]
F --> F5[Interdisciplinary collaboration: combining methods, theory 30]
class Main main
class A,A1,A2,A3,A4,A5 intro
class B,B1,B2,B3,B4,B5,B6 methods
class C,C1,C2,C3 models
class D,D1,D2,D3,D4,D5,D6 inference
class E,E1,E2,E3,E4,E5,F,F1,F2,F3,F4,F5 applications

**Summary:**

**1.-** Introduction to computational social science: Studying social phenomena using digitized information and computational/statistical methods.

**2.-** Examples of computational social science research: Estimating causal impact of recommendation systems, issue-adjusted ideological positions, faculty hiring networks.

**3.-** Differences between computer science and social sciences: Object of study, driving force (methods vs questions), data types, research goals.

**4.-** Computational social science at the intersection: Combining elements of computer science and social sciences in specific ways.

**5.-** Explanation vs prediction in research: Causal theories and evidence vs making predictions; interpretable models, validation, variable choice.

**6.-** Exploratory data analysis: Uncovering patterns, informing explanatory or predictive analysis; differs for computer scientists and social scientists.

**7.-** Exploratory and measurement models in social sciences: Dealing with unobservable theoretical constructs; often adapted from computer science models.

**8.-** Bayesian latent variable modeling framework: Represents unobserved patterns using latent variables; combines probabilistic models, Bayesian methods, validation.

**9.-** Blei's loop for Bayesian latent variable modeling: Iterative process - specify model, perform inference, assess validity, revise. Grounded in theory.

**10.-** Specifying variables in Bayesian latent variable models: Observed (data, covariates), latent (model parameters, quantities of interest), justified theoretically.

**11.-** Justifying model choices theoretically for explanatory analysis: Variables and relationships must operationalize theory; unnecessary for prediction/exploration.

**12.-** Example of Bayesian latent variable model: Inferring latent issue proportions in congressional bills from observed words. Defines variables, relationships.
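
To make the example in item 12 concrete, here is a minimal generative sketch of an admixture model in which each bill has its own latent issue proportions and each word is drawn from one issue. The vocabulary, the three issues, the word probabilities, and the Dirichlet parameters are all hypothetical values chosen for illustration; the talk's actual model may differ.

```python
import random


def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_categorical(probs, rng):
    """Draw an index with probability proportional to probs."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_bill(num_words, alpha, issue_word_probs, vocab, rng):
    """Generate one bill: latent issue proportions, then one issue per word."""
    theta = sample_dirichlet(alpha, rng)  # latent issue proportions for this bill
    words = []
    for _ in range(num_words):
        issue = sample_categorical(theta, rng)                  # latent issue of this word
        word = sample_categorical(issue_word_probs[issue], rng)  # observed word
        words.append(vocab[word])
    return theta, words


# Hypothetical toy vocabulary and issue-specific word distributions.
VOCAB = ["tax", "budget", "hospital", "insurance", "school", "teacher"]
ISSUE_WORD_PROBS = [
    [0.45, 0.45, 0.025, 0.025, 0.025, 0.025],  # an "economy"-like issue
    [0.025, 0.025, 0.45, 0.45, 0.025, 0.025],  # a "health"-like issue
    [0.025, 0.025, 0.025, 0.025, 0.45, 0.45],  # an "education"-like issue
]

rng = random.Random(0)
theta, words = generate_bill(20, [0.5, 0.5, 0.5], ISSUE_WORD_PROBS, VOCAB, rng)
print("issue proportions:", [round(t, 3) for t in theta])
print("words:", words)
```

Inference runs this process in reverse: only `words` is observed, and the goal is the posterior over `theta` (and, in a full model, over the issue-word distributions as well).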

**13.-** Graphical models: Visually represent variables (nodes), relationships (edges) and replication (plates). Equivalent to joint probability equations.

**14.-** Common model components in computational social science: Linear regression, mixture models, admixture models, matrix factorization. Standalone or combined.

**15.-** Statistical inference in Bayesian latent variable models: Computing posterior distribution of latent variables given data. Allows estimating quantities of interest.

**16.-** Posterior inference as key computational challenge: The posterior equals the joint distribution divided by the marginal likelihood (evidence), which is usually intractable, so approximation methods are required.
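
Written out, item 16 is Bayes' rule. The symbols are chosen here for illustration and are not taken from the talk: $x$ denotes the observed data and $z$ the latent variables.

```latex
p(z \mid x) = \frac{p(x, z)}{p(x)},
\qquad
p(x) = \int p(x, z)\, dz
```

The integral (a sum when $z$ is discrete) ranges over every configuration of the latent variables, which is why the denominator is rarely computable in closed form.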

**17.-** Markov chain Monte Carlo (MCMC) sampling: Draws samples from posterior to approximate it. Conceptually simple but implementation can be tricky.
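
As one illustration of item 17, the sketch below uses random-walk Metropolis, one of the simplest MCMC algorithms, on a toy coin-flip model whose posterior is known exactly, so the sampler's output can be checked. The model (a Beta(2, 2) prior with 7 heads in 10 flips), the step size, and the burn-in length are assumptions made for this example.

```python
import math
import random


def log_unnormalized_posterior(theta, heads, flips, a, b):
    """Log of Beta(a, b) prior times Binomial likelihood, up to a constant."""
    if not 0.0 < theta < 1.0:
        return -math.inf  # zero density outside the unit interval
    return ((heads + a - 1) * math.log(theta)
            + (flips - heads + b - 1) * math.log(1.0 - theta))


def metropolis(num_samples, heads, flips, a=2.0, b=2.0,
               step=0.2, burn_in=1000, seed=0):
    """Random-walk Metropolis sampler for the coin bias theta."""
    rng = random.Random(seed)
    theta = 0.5
    current = log_unnormalized_posterior(theta, heads, flips, a, b)
    samples = []
    for i in range(num_samples + burn_in):
        proposal = theta + rng.gauss(0.0, step)  # symmetric proposal
        proposed = log_unnormalized_posterior(proposal, heads, flips, a, b)
        # Accept with probability min(1, p(proposal) / p(theta)).
        if rng.random() < math.exp(min(0.0, proposed - current)):
            theta, current = proposal, proposed
        if i >= burn_in:
            samples.append(theta)
    return samples


samples = metropolis(20000, heads=7, flips=10)
sample_mean = sum(samples) / len(samples)
exact_mean = (7 + 2.0) / (10 + 2.0 + 2.0)  # mean of the exact Beta(9, 5) posterior
print("sample mean:", round(sample_mean, 3), "exact mean:", round(exact_mean, 3))
```

Only the unnormalized posterior is needed, because the intractable normalizing constant cancels in the acceptance ratio.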

**18.-** Geweke's "Getting it Right" test for MCMC: Compares samples from generative process vs samples involving inference algorithm. Validates implementation.
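
The comparison in item 18 can be sketched on a toy Beta-Binomial model. This is a simplified version that compares sample means only; the full test uses formal test statistics over several functions of the samples. The prior parameters, the number of flips, and the tolerance of the comparison are assumptions, and the "inference step" here is an exact conditional draw standing in for whatever MCMC transition is being validated.

```python
import random

A, B, N_FLIPS = 2.0, 3.0, 10  # hypothetical prior parameters and data size


def sample_prior(rng):
    """Draw theta from the Beta(A, B) prior."""
    return rng.betavariate(A, B)


def sample_data(theta, rng):
    """Draw the number of heads in N_FLIPS flips with bias theta."""
    return sum(rng.random() < theta for _ in range(N_FLIPS))


def transition(theta, heads, rng):
    """Inference step under test: an exact draw from p(theta | heads)."""
    return rng.betavariate(A + heads, B + N_FLIPS - heads)


def marginal_conditional(num_samples, rng):
    """Independent joint samples straight from the generative process."""
    out = []
    for _ in range(num_samples):
        theta = sample_prior(rng)
        out.append((theta, sample_data(theta, rng)))
    return out


def successive_conditional(num_samples, rng):
    """Joint samples that alternate the inference step with resampling data."""
    theta = sample_prior(rng)
    heads = sample_data(theta, rng)
    out = []
    for _ in range(num_samples):
        theta = transition(theta, heads, rng)
        heads = sample_data(theta, rng)
        out.append((theta, heads))
    return out


def mean(values):
    return sum(values) / len(values)


rng = random.Random(0)
mc = marginal_conditional(20000, rng)
sc = successive_conditional(20000, rng)
theta_gap = abs(mean([t for t, _ in mc]) - mean([t for t, _ in sc]))
heads_gap = abs(mean([h for _, h in mc]) - mean([h for _, h in sc]))
print("gap in mean theta:", round(theta_gap, 4))
print("gap in mean heads:", round(heads_gap, 4))
```

If `transition` were buggy, for example by using the wrong posterior parameters, the two sets of samples would target different joint distributions and the gaps would not shrink as the number of samples grows.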

**19.-** Variational inference algorithms: Approximate intractable posterior with tractable distribution. Turn inference into optimization of divergence/lower bound.
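
The "lower bound" in item 19 comes from a standard identity. With notation assumed here rather than taken from the talk ($x$ observed data, $z$ latent variables, $q(z)$ the tractable approximating distribution):

```latex
\log p(x)
= \underbrace{\mathbb{E}_{q(z)}\!\left[\log p(x, z) - \log q(z)\right]}_{\text{evidence lower bound (ELBO)}}
+ \mathrm{KL}\!\left(q(z) \,\|\, p(z \mid x)\right)
```

Because $\log p(x)$ does not depend on $q$ and the KL divergence is non-negative, maximizing the ELBO over $q$ is equivalent to minimizing the divergence between $q$ and the true posterior, without ever computing the intractable evidence.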

**20.-** Differences between MCMC and variational inference: Age, theoretical foundations, exactness, convergence, speed. VI is newer, generally faster, and better suited to large datasets.

**21.-** Model validation and criticism: Use posterior, theory, other data to assess and improve model. Important step in Blei's loop.

**22.-** Construct validation: Ensure appropriateness of variables for representing theoretical constructs. Crucial for explanatory models, unnecessary for predictive models.

**23.-** Model checking, comparison, selection: Assess model fit, compare to alternative models, select best one. Methods differ for prediction vs explanation.

**24.-** Role of theory in computational social science: Guides research questions, model construction, validation. Crucial for explanation, less so for prediction.

**25.-** Balancing computation and theory: Computational techniques enable analysis of new data; social theories provide meaning and context. Require collaboration.

**26.-** Introduction to political science examples: Author's research involves political questions and data; will demo one model in detail.

**27.-** Ideal point models: Measure lawmakers' ideological positions from votes. Example of political science model drawing on CS techniques.
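
A minimal sketch of the idea in item 27, assuming a common one-dimensional logistic formulation: the probability of a "yes" vote depends on the lawmaker's ideal point and two bill-specific parameters. The parameter names (`polarity`, `popularity`), the lawmakers, and all numeric values are hypothetical; the talk may use a different parameterization.

```python
import math


def vote_yes_probability(ideal_point, polarity, popularity):
    """Logistic ideal point model: sigmoid(polarity * ideal_point + popularity)."""
    return 1.0 / (1.0 + math.exp(-(polarity * ideal_point + popularity)))


# Hypothetical lawmakers placed on a single left-right dimension.
ideal_points = {"lawmaker_a": -1.5, "lawmaker_b": 0.0, "lawmaker_c": 1.5}

# A hypothetical bill that appeals to the right (positive polarity)
# and has no baseline popularity shift.
probabilities = {
    name: vote_yes_probability(x, polarity=2.0, popularity=0.0)
    for name, x in ideal_points.items()
}
for name, p in probabilities.items():
    print(name, round(p, 3))
```

In practice the ideal points and bill parameters are latent and are inferred jointly from the observed vote matrix; this snippet only shows the likelihood that ties them to a single vote.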

**28.-** Author's research on representation and political agendas: Uses text data to understand how lawmakers present themselves and priorities.

**29.-** Author's research on partisanship and social identity: Uses survey data to explore evolving role of partisanship in social identities.

**30.-** Importance of interdisciplinary collaboration: Combining computational methods and social science theory requires researchers to work together closely.

Knowledge Vault built by David Vivancos 2024