R notes
Data frame operations
library(dplyr)
newdataset <- select(dataset, variable1, variable2, ...) # Choose necessary variables only
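From a single data frame dataset, pick out only the variables you need (variable1, variable2, ...) and store them as newdataset.
filter(newdataset, variable1 > 56) %>% select(variable2, variable1)
Filter newdataset on the value of variable1 (here, greater than 56) and display only variable2 and variable1, in that order.
%>% passes the result of the expression on its left as the first argument of the function on its right, so the commands run one after another.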
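To see what the pipe does, here is a minimal sketch on a toy data frame. The data values are hypothetical; only the column names follow the placeholders used in these notes. It compares the piped form with the equivalent nested call.

```r
library(dplyr)

# Hypothetical toy data; column names mirror the placeholders in the notes.
dataset <- data.frame(
  variable1 = c(40, 57, 80),
  variable2 = c("a", "b", "c"),
  variable3 = c(1, 0, 1)
)

# Keep only the columns of interest.
newdataset <- select(dataset, variable1, variable2)

# Piped form: the output of filter() becomes the first argument of select().
piped <- newdataset %>%
  filter(variable1 > 56) %>%
  select(variable2, variable1)

# Equivalent nested form without the pipe.
nested <- select(filter(newdataset, variable1 > 56), variable2, variable1)

# Both forms should give the same data frame.
print(identical(piped, nested))
print(piped)
```

The piped form reads left to right in the order the steps are applied, which is the main reason to prefer it once more than two steps are chained.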
- mosaic library
- To check for missing values, tabulate is.na(variable) with the tally() function from the mosaic library
library(mosaic)
tally(~ is.na(variable), data=dataset)
- The count reported for TRUE is the number of missing values
- The same check can be done with the favstats() function
favstats(~ variable, data=dataset)
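min Q1 median Q3 max mean sd n missing
- A summary with the columns above is printed. The last column, missing, is the number of missing values. Base summary() adds an NA's entry only when missing values are present, whereas favstats() always reports the count, which is convenient.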
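The missing-value counts can also be cross-checked with base R alone. This is a sketch on hypothetical toy data that uses table() and sum() instead of the mosaic functions, so its output format differs from that of tally() and favstats().

```r
# Hypothetical toy data with two missing values in `variable`.
dataset <- data.frame(variable = c(3, NA, 7, 12, NA, 5))

# Counts of FALSE (observed) and TRUE (missing) values.
na_table <- table(is.na(dataset$variable))
print(na_table)

# Number of missing values as a single number.
n_missing <- sum(is.na(dataset$variable))
print(n_missing)

# Base summary() adds an "NA's" entry here because missing values are present.
print(summary(dataset$variable))
```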
Recoding variables
- memisc library
- Combined with dplyr, code like the following assigns a label to each numeric condition on the variables, based on logical expressions:
library(dplyr)
library(memisc)
# cases() returns a factor whose labels are the names given to the conditions
newdataset <- mutate(newdataset, new_variablename =
  cases(
    "LABEL A" = variable1 == 0,
    "LABEL B" = (variable1 > 0 & variable1 <= 1 & variable2 <= 3 & variable3 == 1) |
                (variable1 > 0 & variable1 <= 2 & variable2 <= 4 & variable3 == 0),
    "LABEL C" = ((variable1 > 1 | variable2 > 3) & variable3 == 1) |
                ((variable1 > 2 | variable2 > 4) & variable3 == 0)))
- LABEL A, B, and C are defined by combining logical expressions on the values of variable1, variable2, and variable3. The actual conditions depend on the study.
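A small worked example may help in checking such conditions before applying them to real data. The sketch below uses a hypothetical four-row data frame and the same three conditions. As an assumption made for brevity, it calls cases() through base with() rather than dplyr's mutate().

```r
library(memisc)

# Hypothetical toy data chosen so that each row satisfies exactly one condition.
toy <- data.frame(
  variable1 = c(0, 1, 3, 2),
  variable2 = c(1, 2, 5, 4),
  variable3 = c(1, 1, 0, 0)
)

# with() evaluates the conditions using the columns of `toy`.
toy$label <- with(toy, cases(
  "LABEL A" = variable1 == 0,
  "LABEL B" = (variable1 > 0 & variable1 <= 1 & variable2 <= 3 & variable3 == 1) |
              (variable1 > 0 & variable1 <= 2 & variable2 <= 4 & variable3 == 0),
  "LABEL C" = ((variable1 > 1 | variable2 > 3) & variable3 == 1) |
              ((variable1 > 2 | variable2 > 4) & variable3 == 0)
))

# Inspect the assigned labels and how many rows fall under each one.
print(toy)
print(table(toy$label, useNA = "ifany"))
```

Tabulating the new variable with useNA = "ifany" is a quick way to spot rows that match none of the conditions, since those come out as NA.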