**Understanding Statistics in Corpus Linguistics [#cf852098]
-Variables
--categorical
---binary (2 categories)
---multiple (n > 2): (a) nominal vs. (b) ordinal
--quantitative
---interval
---discrete
**Another aspect of variables: [#x6bfd7ee]
-explanatory/predictor/independent variables
-response/outcome/dependent variables
**Univariate vs. Bivariate analyses [#c85829ac]
-univariate
--an examination of a single variable
-bivariate
--an examination of the relationship between two variables
**Concept of "significance" [#u10790cf]
-Difference in proportions
-significance = a difference large enough that it is unlikely to be due to chance alone
***Null Hypothesis [#lcc1139e]
-There is no difference between the groups being compared
***Expected frequencies [#a23f67b4]
-the frequencies we would get if the two proportions were identical, i.e. if both probabilities equaled 0.5
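As a rough illustration of the expected frequencies described above, the following Python sketch splits the total count evenly across categories. The function name `expected_frequencies` and the counts are hypothetical, and the even split assumes the two corpora being compared are the same size; with corpora of unequal size the expected counts would instead be scaled by corpus size.

```python
def expected_frequencies(observed):
    """Expected counts if every category were equally likely (the null hypothesis)."""
    total = sum(observed)
    k = len(observed)
    # Under the null hypothesis each of the k categories gets total / k.
    return [total / k for _ in observed]


# Hypothetical counts of one word in two equally sized corpora.
observed = [60, 40]
expected = expected_frequencies(observed)
print(expected)  # [50.0, 50.0]
```

With two categories this is the "both probabilities equal 0.5" case: 100 occurrences in total give an expected 50 in each corpus.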
**Chi-square [#t18709d3]
-sum of the squared differences between observed and expected frequencies, each divided by the expected frequency, across all cells
-The probability distribution of the chi-square statistic is known for each number of degrees of freedom (number of groups - 1; for a table with r rows and c columns, (r - 1) x (c - 1))
-Advantages
--easy to understand
--used widely
-Disadvantages
--For small observed frequencies (O) in a 2x2 table, apply Yates's correction or use Fisher's exact test
--Dunning shows that chi-square is not a reliable test when the observed frequencies (O) are small and N is large.
--The log-likelihood test does basically the same job without these limitations
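The statistics listed above can be sketched in plain Python as follows. This is a minimal illustration, not a reference implementation: the function names and the 2x2 table of counts are made up for the example, expected frequencies are derived from the row and column totals, and the log-likelihood statistic is written in its usual G2 form (2 x sum of O x ln(O/E)).

```python
import math


def expected_table(table):
    """Expected counts for each cell: row total * column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(table, yates=False):
    """Sum of (O - E)^2 / E over all cells, optionally with Yates's correction."""
    stat = 0.0
    for obs_row, exp_row in zip(table, expected_table(table)):
        for o, e in zip(obs_row, exp_row):
            diff = abs(o - e)
            if yates:
                # Yates's correction shrinks each |O - E| by 0.5 (2x2 tables).
                diff = max(diff - 0.5, 0.0)
            stat += diff ** 2 / e
    return stat


def log_likelihood(table):
    """Log-likelihood statistic G2 = 2 * sum of O * ln(O / E) over all cells."""
    g2 = 0.0
    for obs_row, exp_row in zip(table, expected_table(table)):
        for o, e in zip(obs_row, exp_row):
            if o > 0:  # a cell with O = 0 contributes nothing to the sum
                g2 += o * math.log(o / e)
    return 2 * g2


def degrees_of_freedom(table):
    """(rows - 1) * (columns - 1)."""
    return (len(table) - 1) * (len(table[0]) - 1)


# Hypothetical counts: rows are two corpora, columns are
# "target word" vs. "all other words" (kept small for readability).
table = [[60, 40], [40, 60]]
print(chi_square(table))              # 8.0
print(chi_square(table, yates=True))  # about 7.22
print(log_likelihood(table))          # about 8.05
print(degrees_of_freedom(table))      # 1
```

With 1 degree of freedom the critical value at p = 0.05 is about 3.84, so all three statistics in this made-up example would count as significant at that level.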
**Multivariate Analysis [#t7720dba]
-Control of variables
-Interaction of multiple predictor variables in their effect on response variables
--Log-linear analysis
--Generalized linear model