Free Statistics of Irreproducible Research!

Author: Unverified author
R Software Module: --
Title produced by software: Recursive Partitioning (Regression Trees)
Date of computation: Mon, 12 Dec 2011 14:52:15 -0500
Cite this page as follows: Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?v=date/2011/Dec/12/t1323719547ht0iyhu9tunpxv5.htm/, Retrieved Thu, 31 Oct 2024 23:23:35 +0000
Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?pk=154180, Retrieved Thu, 31 Oct 2024 23:23:35 +0000
IsPrivate? No (this computation is public)
Estimated Impact: 176
Family? (F = Feedback message, R = changed R code, M = changed R Module, P = changed Parameters, D = changed Data)
-     [Recursive Partitioning (Regression Trees)] [] [2010-12-05 19:50:12] [b98453cac15ba1066b407e146608df68]
-   PD  [Recursive Partitioning (Regression Trees)] [WS 10 Cross Valid...] [2010-12-11 14:44:18] [8081b8996d5947580de3eb171e82db4f]
-         [Recursive Partitioning (Regression Trees)] [Workshop 10, Cros...] [2010-12-11 15:06:30] [3635fb7041b1998c5a1332cf9de22bce]
- RM          [Recursive Partitioning (Regression Trees)] [] [2011-12-12 19:52:15] [d41d8cd98f00b204e9800998ecf8427e] [Current]
Dataseries X:
12008	4.0
9169	5.9
8788	7.1
8417	10.5
8247	15.1
8197	16.8
8236	15.3
8253	18.4
7733	16.1
8366	11.3
8626	7.9
8863	5.6
10102	3.4
8463	4.8
9114	6.5
8563	8.5
8872	15.1
8301	15.7
8301	18.7
8278	19.2
7736	12.9
7973	14.4
8268	6.2
9476	3.3
11100	4.6
8962	7.1
9173	7.8
8738	9.9
8459	13.6
8078	17.1
8411	17.8
8291	18.6
7810	14.7
8616	10.5
8312	8.6
9692	4.4
9911	2.3
8915	2.8
9452	8.8
9112	10.7
8472	13.9
8230	19.3
8384	19.5
8625	20.4
8221	15.3
8649	7.9
8625	8.3
10443	4.5
10357	3.2
8586	5.0
8892	6.6
8329	11.1
8101	12.8
7922	16.3
8120	17.4
7838	18.9
7735	15.8
8406	11.7
8209	6.4
9451	2.9
10041	4.7
9411	2.4
10405	7.2
8467	10.7
8464	13.4
8102	18.3
7627	18.4
7513	16.8
7510	16.6
8291	14.1
8064	6.1
9383	3.5
9706	1.7
8579	2.3
9474	4.5
8318	9.3
8213	14.2
8059	17.3
9111	23.0
7708	16.3
7680	18.4
8014	14.2
8007	9.1
8718	5.9
9486	7.2
9113	6.8
9025	8.0
8476	14.3
7952	14.6
7759	17.5
7835	17.2
7600	17.2
7651	14.1
8319	10.4
8812	6.8
8630	4.1




Summary of computational transaction
Computing time: 3 seconds
R Server: 'AstonUniversity' @ aston.wessa.net
Source: https://freestatistics.org/blog/index.php?pk=154180&T=0







10-Fold Cross Validation
          Prediction (training)                 Prediction (testing)
Actual    C1     C2     C3     C4     CV        C1     C2     C3     C4     CV
C1        186    16     16     1      0.8493    14     3      4      0      0.6667
C2        96     53     72     1      0.2387    8      7      2      1      0.3889
C3        45     12     124    33     0.5794    4      4      11     7      0.4231
C4        0      10     71     136    0.6267    2      1      10     10     0.4348
Overall   -      -      -      -      0.5722    -      -      -      -      0.4773
Source: https://freestatistics.org/blog/index.php?pk=154180&T=1
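
The CV columns of the 10-Fold Cross Validation table can be recomputed from the counts: each class rate is the diagonal count divided by its row total, and the overall rate is the sum of the diagonal divided by the table total. The following is a minimal sketch of that arithmetic in base R; the counts are copied from the table, while the object and function names (`train`, `test`, `class_rate`, `overall_rate`) are assumptions made for illustration.

```r
# Pooled counts from the 10-Fold Cross Validation table
# (rows = actual class, columns = predicted class)
cls <- paste0("C", 1:4)
dn  <- list(actual = cls, predicted = cls)

train <- matrix(c(186, 16,  16,   1,
                   96, 53,  72,   1,
                   45, 12, 124,  33,
                    0, 10,  71, 136),
                nrow = 4, byrow = TRUE, dimnames = dn)

test <- matrix(c(14, 3,  4,  0,
                  8, 7,  2,  1,
                  4, 4, 11,  7,
                  2, 1, 10, 10),
               nrow = 4, byrow = TRUE, dimnames = dn)

# Per-class rate: correctly classified cases / all cases of that actual class
class_rate <- function(tab) round(diag(tab) / rowSums(tab), 4)

# Overall rate: all correctly classified cases / all cases
overall_rate <- function(tab) round(sum(diag(tab)) / sum(tab), 4)

print(class_rate(train))
print(overall_rate(train))
print(class_rate(test))
print(overall_rate(test))

# Total number of pooled cases across both tables
print(sum(train) + sum(test))
```

The two tables together hold 872 + 88 = 960 cases, which is ten times the 96 rows of Dataseries X, so every row is counted once per split, either as a training or as a testing case.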







Confusion Matrix (predicted in columns / actuals in rows)
      C1    C2    C3    C4
C1    21    1     2     0
C2    10    5     9     0
C3    5     1     15    3
C4    0     1     9     14
Source: https://freestatistics.org/blog/index.php?pk=154180&T=2
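
Because predictions are in columns and actuals in rows, row totals give the number of cases per actual class and column totals give how often each class was predicted. As a sketch of how to read the matrix in that orientation, the base R snippet below computes both margins and, per predicted class, the share of predictions that were correct. The counts come from the table; the variable names are assumptions for illustration.

```r
# Confusion matrix of the tree fitted on all rows
# (rows = actual class, columns = predicted class)
cls  <- paste0("C", 1:4)
conf <- matrix(c(21, 1,  2,  0,
                 10, 5,  9,  0,
                  5, 1, 15,  3,
                  0, 1,  9, 14),
               nrow = 4, byrow = TRUE,
               dimnames = list(actual = cls, predicted = cls))

actual_totals    <- rowSums(conf)  # cases per actual class
predicted_totals <- colSums(conf)  # how often each class was predicted

# Share of correct predictions among all predictions of a class
correct_share <- round(diag(conf) / predicted_totals, 4)

print(actual_totals)
print(predicted_totals)
print(correct_share)
```

The row totals are equal because the response was binned into four quantile classes, while the column totals show that the tree rarely predicts C2.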



Parameters (Session):
par1 = 1 ; par2 = quantiles ; par3 = 4 ; par4 = yes ;
Parameters (R input):
par1 = 1 ; par2 = quantiles ; par3 = 4 ; par4 = yes ; par5 = ; par6 = ; par7 = ; par8 = ; par9 = ; par10 = ; par11 = ; par12 = ; par13 = ; par14 = ; par15 = ; par16 = ; par17 = ; par18 = ; par19 = ; par20 = ;
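
With par1 = 1, par2 = quantiles and par3 = 4, the first data column is cut into four equal-frequency classes before the tree is fit. The module does this with `cut2()` from Hmisc. The sketch below swaps in base R's `quantile()` and `cut()` so that it runs without extra packages, and it uses only the first twelve values of the first column of Dataseries X. The `C1` to `C4` labels and the variable names are assumptions for illustration, and `cut2()` may treat boundary values slightly differently.

```r
# First twelve values of the first column of Dataseries X
resp <- c(12008, 9169, 8788, 8417, 8247, 8197,
          8236, 8253, 7733, 8366, 8626, 8863)
groups <- 4  # plays the role of par3

# Quartile break points (R's default quantile type)
breaks <- quantile(resp, probs = seq(0, 1, length.out = groups + 1))

# Bin the values; include.lowest keeps the minimum in the first class
classes <- cut(resp, breaks = breaks, include.lowest = TRUE,
               labels = paste0("C", seq_len(groups)))

print(breaks)
print(table(classes))
```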
R code (references can be found in the software module):
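library(party)  # ctree(): conditional inference trees
library(Hmisc)  # cut2(): quantile-based binning
# par1..par4 arrive as strings; y is not defined in this script and is presumably supplied by the server
par1 <- as.numeric(par1)  # index of the response column
par3 <- as.numeric(par3)  # number of classes
# transpose so that each data series becomes a column
x <- data.frame(t(y))
is.data.frame(x)
# drop rows with a missing response
x <- x[!is.na(x[,par1]),]
k <- length(x[1,])
n <- length(x[,1])
colnames(x)[par1]
x[,par1]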
if (par2 == 'kmeans') {
cl <- kmeans(x[,par1], par3)
print(cl)
clm <- matrix(cbind(cl$centers,1:par3),ncol=2)
clm <- clm[sort.list(clm[,1]),]
for (i in 1:par3) {
cl$cluster[cl$cluster==clm[i,2]] <- paste('C',i,sep='')
}
cl$cluster <- as.factor(cl$cluster)
print(cl$cluster)
x[,par1] <- cl$cluster
}
if (par2 == 'quantiles') {
x[,par1] <- cut2(x[,par1],g=par3)
}
if (par2 == 'hclust') {
hc <- hclust(dist(x[,par1])^2, 'cen')
print(hc)
memb <- cutree(hc, k = par3)
dum <- c(mean(x[memb==1,par1]))
for (i in 2:par3) {
dum <- c(dum, mean(x[memb==i,par1]))
}
hcm <- matrix(cbind(dum,1:par3),ncol=2)
hcm <- hcm[sort.list(hcm[,1]),]
for (i in 1:par3) {
memb[memb==hcm[i,2]] <- paste('C',i,sep='')
}
memb <- as.factor(memb)
print(memb)
x[,par1] <- memb
}
if (par2=='equal') {
ed <- cut(as.numeric(x[,par1]),par3,labels=paste('C',1:par3,sep=''))
x[,par1] <- as.factor(ed)
}
table(x[,par1])
colnames(x)
colnames(x)[par1]
x[,par1]
if (par2 == 'none') {
m <- ctree(as.formula(paste(colnames(x)[par1],' ~ .',sep='')),data = x)
}
# 'createtable' is a server-side file; it presumably defines the table.* helpers used below
load(file='createtable')
if (par2 != 'none') {
m <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data = x)
if (par4=='yes') {
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'10-Fold Cross Validation',3+2*par3,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'',1,TRUE)
a<-table.element(a,'Prediction (training)',par3+1,TRUE)
a<-table.element(a,'Prediction (testing)',par3+1,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Actual',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
a<-table.element(a,'CV',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
a<-table.element(a,'CV',1,TRUE)
a<-table.row.end(a)
# Note: these are ten independent random splits (about 90% training, 10% testing),
# not a partition of the rows into ten disjoint folds
for (i in 1:10) {
ind <- sample(2, nrow(x), replace=TRUE, prob=c(0.9,0.1))
# refit the tree on the training rows of this split
m.ct <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data =x[ind==1,])
if (i==1) {
m.ct.i.pred <- predict(m.ct, newdata=x[ind==1,])
m.ct.i.actu <- x[ind==1,par1]
m.ct.x.pred <- predict(m.ct, newdata=x[ind==2,])
m.ct.x.actu <- x[ind==2,par1]
} else {
# pool predictions and actuals across splits
m.ct.i.pred <- c(m.ct.i.pred,predict(m.ct, newdata=x[ind==1,]))
m.ct.i.actu <- c(m.ct.i.actu,x[ind==1,par1])
m.ct.x.pred <- c(m.ct.x.pred,predict(m.ct, newdata=x[ind==2,]))
m.ct.x.actu <- c(m.ct.x.actu,x[ind==2,par1])
}
}
print(m.ct.i.tab <- table(m.ct.i.actu,m.ct.i.pred))
numer <- 0
for (i in 1:par3) {
print(m.ct.i.tab[i,i] / sum(m.ct.i.tab[i,]))
numer <- numer + m.ct.i.tab[i,i]
}
print(m.ct.i.cp <- numer / sum(m.ct.i.tab))
print(m.ct.x.tab <- table(m.ct.x.actu,m.ct.x.pred))
numer <- 0
for (i in 1:par3) {
print(m.ct.x.tab[i,i] / sum(m.ct.x.tab[i,]))
numer <- numer + m.ct.x.tab[i,i]
}
print(m.ct.x.cp <- numer / sum(m.ct.x.tab))
for (i in 1:par3) {
a<-table.row.start(a)
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
for (jjj in 1:par3) a<-table.element(a,m.ct.i.tab[i,jjj])
a<-table.element(a,round(m.ct.i.tab[i,i]/sum(m.ct.i.tab[i,]),4))
for (jjj in 1:par3) a<-table.element(a,m.ct.x.tab[i,jjj])
a<-table.element(a,round(m.ct.x.tab[i,i]/sum(m.ct.x.tab[i,]),4))
a<-table.row.end(a)
}
a<-table.row.start(a)
a<-table.element(a,'Overall',1,TRUE)
for (jjj in 1:par3) a<-table.element(a,'-')
a<-table.element(a,round(m.ct.i.cp,4))
for (jjj in 1:par3) a<-table.element(a,'-')
a<-table.element(a,round(m.ct.x.cp,4))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable3.tab')
}
}
m
bitmap(file='test1.png')
plot(m)
dev.off()
bitmap(file='test1a.png')
plot(x[,par1] ~ as.factor(where(m)),main='Response by Terminal Node',xlab='Terminal Node',ylab='Response')
dev.off()
if (par2 == 'none') {
forec <- predict(m)
result <- as.data.frame(cbind(x[,par1],forec,x[,par1]-forec))
colnames(result) <- c('Actuals','Forecasts','Residuals')
print(result)
}
if (par2 != 'none') {
print(cbind(as.factor(x[,par1]),predict(m)))
myt <- table(as.factor(x[,par1]),predict(m))
print(myt)
}
bitmap(file='test2.png')
if(par2=='none') {
op <- par(mfrow=c(2,2))
plot(density(result$Actuals),main='Kernel Density Plot of Actuals')
plot(density(result$Residuals),main='Kernel Density Plot of Residuals')
plot(result$Forecasts,result$Actuals,main='Actuals versus Predictions',xlab='Predictions',ylab='Actuals')
plot(density(result$Forecasts),main='Kernel Density Plot of Predictions')
par(op)
}
if(par2!='none') {
plot(myt,main='Confusion Matrix',xlab='Actual',ylab='Predicted')
}
dev.off()
if (par2 == 'none') {
detcoef <- cor(result$Forecasts,result$Actuals)
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Goodness of Fit',2,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Correlation',1,TRUE)
a<-table.element(a,round(detcoef,4))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'R-squared',1,TRUE)
a<-table.element(a,round(detcoef*detcoef,4))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'RMSE',1,TRUE)
a<-table.element(a,round(sqrt(mean((result$Residuals)^2)),4))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable1.tab')
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Actuals, Predictions, and Residuals',4,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'#',header=TRUE)
a<-table.element(a,'Actuals',header=TRUE)
a<-table.element(a,'Forecasts',header=TRUE)
a<-table.element(a,'Residuals',header=TRUE)
a<-table.row.end(a)
for (i in 1:length(result$Actuals)) {
a<-table.row.start(a)
a<-table.element(a,i,header=TRUE)
a<-table.element(a,result$Actuals[i])
a<-table.element(a,result$Forecasts[i])
a<-table.element(a,result$Residuals[i])
a<-table.row.end(a)
}
a<-table.end(a)
table.save(a,file='mytable.tab')
}
if (par2 != 'none') {
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Confusion Matrix (predicted in columns / actuals in rows)',par3+1,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'',1,TRUE)
for (i in 1:par3) {
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
}
a<-table.row.end(a)
for (i in 1:par3) {
a<-table.row.start(a)
a<-table.element(a,paste('C',i,sep=''),1,TRUE)
for (j in 1:par3) {
a<-table.element(a,myt[i,j])
}
a<-table.row.end(a)
}
a<-table.end(a)
table.save(a,file='mytable2.tab')
}
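
The cross-validation loop in the R code above draws a fresh random split of roughly 90% training and 10% testing rows on each of its ten passes, so a given row can land in the test set several times or not at all. For comparison, here is a minimal sketch of a partitioning fold assignment in which every row is tested exactly once. The seed, the variable names, and the use of 96 rows (the length of Dataseries X) are assumptions for illustration, and no tree is fit.

```r
set.seed(1)  # arbitrary seed, only to make the sketch repeatable
n <- 96      # number of rows in Dataseries X
k <- 10      # number of folds

# Assign each row to one of k folds of near-equal size, in random order
fold <- sample(rep(seq_len(k), length.out = n))

# Count how often each row ends up in a test set
test_counts <- integer(n)
for (i in seq_len(k)) {
  test_idx  <- which(fold == i)
  train_idx <- which(fold != i)
  # every row is either in the training or the testing part of this fold
  stopifnot(length(train_idx) + length(test_idx) == n)
  # a model would be fit on train_idx and evaluated on test_idx here
  test_counts[test_idx] <- test_counts[test_idx] + 1L
}

print(table(fold))
print(table(test_counts))
```

With this scheme the pooled testing table would contain exactly 96 cases, one per row, rather than the 88 cases that the random splits happened to produce.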