
Recursive Partitioning - Kijkcijfers

*The author of this computation has been verified*
R Software Module: /rwasp_regression_trees1.wasp (opens new window with default values)
Title produced by software: Recursive Partitioning (Regression Trees)
Date of computation: Tue, 21 Dec 2010 17:52:54 +0000
 
Cite this page as follows:
Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL http://www.freestatistics.org/blog/date/2010/Dec/21/t1292953883fkaj5fhdui4nicb.htm/, Retrieved Tue, 21 Dec 2010 18:51:23 +0100
 
BibTeX entries for LaTeX users:
@Manual{KEY,
    author = {{YOUR NAME}},
    publisher = {Office for Research Development and Education},
    title = {Statistical Computations at FreeStatistics.org, URL http://www.freestatistics.org/blog/date/2010/Dec/21/t1292953883fkaj5fhdui4nicb.htm/},
    year = {2010},
}
@Manual{R,
    title = {R: A Language and Environment for Statistical Computing},
    author = {{R Development Core Team}},
    organization = {R Foundation for Statistical Computing},
    address = {Vienna, Austria},
    year = {2010},
    note = {{ISBN} 3-900051-07-0},
    url = {http://www.R-project.org},
}
 
Original text written by user:
 
IsPrivate?
No (this computation is public)
 
User-defined keywords:
 
Dataseries X:
15561600 15.73 3.56 142.86
14917500 16.17 1.33 380.71
14805920 12.00 0.00 460.00
16958000 12.86 0.69 361.43
17605000 10.30 10.05 140.00
17131200 12.97 0.51 275.00
18474600 12.06 0.91 274.29
17286700 10.49 2.67 212.86
18574400 5.97 1.39 172.86
18056000 9.26 1.24 186.43
19701600 9.74 2.79 77.14
19061700 5.46 3.37 17.86
19681900 2.71 1.60 37.14
34521200 3.90 4.73 42.86
19922700 1.51 0.79 85.00
20177900 5.01 0.67 45.00
19759900 2.96 0.00 206.43
23076700 -1.97 0.60 178.57
22532000 -4.61 0.40 285.71
22029400 4.27 2.24 58.57
22587000 4.01 5.74 88.57
23256600 0.04 0.06 309.29
22680300 3.04 0.87 58.57
21916400 2.29 4.91 132.14
19640200 4.37 1.93 3.57
18813100 6.39 0.41 102.86
18730000 5.74 1.21 185.71
18154700 7.64 2.01 177.14
17848800 7.07 0.00 530.00
18077500 6.23 6.49 162.86
17133100 10.20 0.00 553.57
16602600 14.07 0.31 258.57
15878900 12.83 4.87 326.43
15789100 12.04 1.37 580.00
15422000 11.97 0.19 286.43
14661400 12.63 0.34 310.71
15879200 13.56 3.60 148.57
14339300 15.66 0.1 etc...
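As a rough illustration of how such a whitespace-delimited series can be loaded in R, the sketch below parses the first three rows shown above with `read.table`. The object names `raw`, `x` and `response` and the default column names `V1` to `V4` are assumptions; the page does not state the variable names.

```r
# First three rows of the data series, one period per line (values copied from above).
raw <- "15561600 15.73 3.56 142.86
14917500 16.17 1.33 380.71
14805920 12.00 0.00 460.00"

# Parse whitespace-delimited text; columns get the default names V1..V4.
x <- read.table(text = raw, header = FALSE)

str(x)                # 3 observations of 4 variables
response <- x[, 1]    # par1 = 1 selects the first column as the response
```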
 
Output produced by software:



Summary of computational transaction
Raw Input: view raw input (R code)
Raw Output: view raw output of R engine
Computing time: 6 seconds
R Server: 'Sir Ronald Aylmer Fisher' @ 193.190.124.24


Goodness of Fit
Correlation: 0.8876
R-squared: 0.7879
RMSE: 1720765.7442
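The three figures above can be reproduced from any pair of actual and predicted vectors. The sketch below mirrors the formulas in the module's R code (the correlation between predictions and actuals, its square, and the root mean squared residual). The function name `goodness_of_fit` and the toy vectors are illustrative assumptions, not part of the original computation.

```r
# Compute the three goodness-of-fit figures from actuals and forecasts.
goodness_of_fit <- function(actuals, forecasts) {
  residuals <- actuals - forecasts
  r <- cor(forecasts, actuals)
  list(correlation = r,
       r_squared   = r^2,                       # squared correlation
       rmse        = sqrt(mean(residuals^2)))   # root mean squared error
}

# Toy data (not from the data series): two groups predicted by their means.
fit <- goodness_of_fit(actuals   = c(1, 2, 3, 4),
                       forecasts = c(1.5, 1.5, 3.5, 3.5))
print(fit)
```

Because R-squared is computed here as the squared correlation, the reported 0.8876 and 0.7879 agree up to rounding.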


Actuals, Predictions, and Residuals
#    Actuals   Forecasts         Residuals
1    15561600  17065558.75       -1503958.75
2    14917500  12899698.2857143  2017801.71428571
3    14805920  15320817.3571429  -514897.357142856
4    16958000  15320817.3571429  1637182.64285714
5    17605000  17896694.0454545  -291694.045454547
6    17131200  17065558.75       65641.25
7    18474600  17065558.75       1409041.25
8    17286700  17896694.0454545  -609994.045454547
9    18574400  17896694.0454545  677705.954545453
10   18056000  17896694.0454545  159305.954545453
11   19701600  19331636          369964
12   19061700  19331636          -269936
13   19681900  21978075.3043478  -2296175.30434782
14   34521200  21978075.3043478  12543124.6956522
15   19922700  21978075.3043478  -2055375.30434782
16   20177900  19331636          846264
17   19759900  21978075.3043478  -2218175.30434782
18   23076700  21978075.3043478  1098624.69565218
19   22532000  21978075.3043478  553924.695652176
20   22029400  21978075.3043478  51324.6956521757
21   22587000  21978075.3043478  608924.695652176
22   23256600  21978075.3043478  1278524.69565218
23   22680300  21978075.3043478  702224.695652176
24   21916400  21978075.3043478  -61675.3043478243
25   19640200  19331636          308564
26   18813100  19331636          -518536
27   18730000  17896694.0454545  833305.954545453
28   18154700  17896694.0454545  258005.954545453
29   17848800  17896694.0454545  -47894.0454545468
30   18077500  17896694.0454545  180805.954545453
31   17133100  17896694.0454545  -763594.045454547
32   16602600  17065558.75       -462958.75
33   15878900  15320817.3571429  558082.642857144
34   15789100  15320817.3571429  468282.642857144
35   15422000  15320817.3571429  101182.642857144
36   14661400  15320817.3571429  -659417.357142856
37   15879200  17065558.75       -1186358.75
38   14339300  15320817.3571429  -981517.357142856
39   13169600  12899698.2857143  269901.714285715
40   14528900  15320817.3571429  -791917.357142856
41   13375800  15320817.3571429  -1945017.35714286
42   12309900  12899698.2857143  -589798.285714285
43   11933900  12899698.2857143  -965798.285714285
44   10061900  12899698.2857143  -2837798.28571429
45   12609600  12899698.2857143  -290098.285714285
46   11156500  12899698.2857143  -1743198.28571429
47   12187200  12899698.2857143  -712498.285714285
48   11284300  12899698.2857143  -1615398.28571429
49   10177000  12899698.2857143  -2722698.28571429
50   10970720  12899698.2857143  -1928978.28571429
51   10820680  12899698.2857143  -2079018.28571429
52   11492390  12899698.2857143  -1407308.28571429
53   14573750  12899698.2857143  1674051.71428571
54   13992820  12899698.2857143  1093121.71428571
55   14727070  12899698.2857143  1827371.71428571
56   15685360  15320817.3571429  364542.642857144
57   16736210  17065558.75       -329348.75
58   17950180  17065558.75       884621.25
59   17002730  17896694.0454545  -893964.045454547
60   17415160  17896694.0454545  -481534.045454547
61   17929810  17896694.0454545  33115.9545454532
62   17865790  17896694.0454545  -30904.0454545468
63   19202360  19331636          -129276
64   19085000  17896694.0454545  1188305.95454545
65   18188880  17065558.75       1123321.25
66   18466410  19331636          -865226
67   18520400  19331636          -811236
68   20025500  21978075.3043478  -1952575.30434782
69   20636100  21978075.3043478  -1341975.30434782
70   20672000  21978075.3043478  -1306075.30434782
71   22589100  21978075.3043478  611024.695652176
72   21864800  21978075.3043478  -113275.304347824
73   22750100  21978075.3043478  772024.695652176
74   22548746  21978075.3043478  570670.695652176
75   21325495  21978075.3043478  -652580.304347824
76   21556563  21978075.3043478  -421512.304347824
77   21415269  21978075.3043478  -562806.304347824
78   20401054  19331636          1069418
79   19062253  21978075.3043478  -2915822.30434782
80   19085706  21978075.3043478  -2892369.30434782
81   19279967  17896694.0454545  1383272.95454545
82   18552045  17896694.0454545  655350.954545453
83   17800733  17896694.0454545  -95961.0454545468
84   17142490  17896694.0454545  -754204.045454547
85   17593173  17896694.0454545  -303521.045454547
86   17633859  17896694.0454545  -262835.045454547
87   17336613  15320817.3571429  2015795.64285714
88   17008347  17896694.0454545  -888347.045454547
89   17951965  17896694.0454545  55270.9545454532
90   14520929  15320817.3571429  -799888.357142856
91   16941217  15320817.3571429  1620399.64285714
92   15436824  12899698.2857143  2537125.71428571
93   14744261  12899698.2857143  1844562.71428571
94   14248004  15320817.3571429  -1072813.35714286
95   11540953  12899698.2857143  -1358745.28571429
96   12881661  12899698.2857143  -18037.2857142854
97   15185757  12899698.2857143  2286058.71428571
98   13554339  12899698.2857143  654640.714285715
99   13575106  12899698.2857143  675407.714285715
100  12238400  12899698.2857143  -661298.285714285
101  13303614  12899698.2857143  403915.714285715
102  14151478  12899698.2857143  1251779.71428571
103  14172009  12899698.2857143  1272310.71428571
104  14022320  12899698.2857143  1122621.71428571
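Only six distinct values appear in the Forecasts column, which is consistent with a tree that predicts the mean response of each terminal node. As a check under that assumption, the sketch below takes the nine rows whose forecast is 19331636 (rows 11, 12, 16, 25, 26, 63, 66, 67 and 78) and compares the mean of their actuals with the forecast. The variable names are illustrative.

```r
# Actuals of the rows whose forecast is 19331636 (copied from the table above).
actuals <- c(19701600, 19061700, 20177900, 19640200, 18813100,
             19202360, 18466410, 18520400, 20401054)

node_mean <- mean(actuals)          # prediction for this node if it is the node mean
residuals <- actuals - node_mean    # residuals within the node

print(node_mean)
print(sum(residuals))
```

If the node-mean reading is right, the residuals within each terminal node sum to zero, so a large residual such as the one in row 14 reflects an unusual observation rather than a shifted node prediction.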
 
Charts produced by software:
http://www.freestatistics.org/blog/date/2010/Dec/21/t1292953883fkaj5fhdui4nicb/218va1292953966.png
http://www.freestatistics.org/blog/date/2010/Dec/21/t1292953883fkaj5fhdui4nicb/218va1292953966.ps
http://www.freestatistics.org/blog/date/2010/Dec/21/t1292953883fkaj5fhdui4nicb/3tiud1292953966.png
http://www.freestatistics.org/blog/date/2010/Dec/21/t1292953883fkaj5fhdui4nicb/3tiud1292953966.ps
http://www.freestatistics.org/blog/date/2010/Dec/21/t1292953883fkaj5fhdui4nicb/4mrcg1292953966.png
http://www.freestatistics.org/blog/date/2010/Dec/21/t1292953883fkaj5fhdui4nicb/4mrcg1292953966.ps


 
Parameters (Session):
par1 = 1 ; par2 = none ; par4 = no ;
 
Parameters (R input):
par1 = 1 ; par2 = none ; par4 = no ;
 
R code (references can be found in the software module):
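# party provides ctree(); Hmisc provides cut2() for quantile binning
library(party)
library(Hmisc)
# par1: column index of the response; par3: number of categories (used only when par2 != 'none')
par1 <- as.numeric(par1)
par3 <- as.numeric(par3)
# Transpose the input so that each column is one variable
x <- data.frame(t(y))
is.data.frame(x)
# Drop rows with a missing response
x <- x[!is.na(x[,par1]),]
k <- length(x[1,])
n <- length(x[,1])
colnames(x)[par1]
x[,par1]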
# Optional discretisation of the response into par3 categories
if (par2 == 'kmeans') {
  # k-means clusters, relabelled C1..Cn in increasing order of their centers
  cl <- kmeans(x[,par1], par3)
  print(cl)
  clm <- matrix(cbind(cl$centers,1:par3),ncol=2)
  clm <- clm[sort.list(clm[,1]),]
  for (i in 1:par3) {
    cl$cluster[cl$cluster==clm[i,2]] <- paste('C',i,sep='')
  }
  cl$cluster <- as.factor(cl$cluster)
  print(cl$cluster)
  x[,par1] <- cl$cluster
}
if (par2 == 'quantiles') {
  # Quantile groups via Hmisc::cut2
  x[,par1] <- cut2(x[,par1],g=par3)
}
if (par2 == 'hclust') {
  # Centroid-linkage hierarchical clustering on squared distances,
  # relabelled C1..Cn in increasing order of the cluster means
  hc <- hclust(dist(x[,par1])^2, 'cen')
  print(hc)
  memb <- cutree(hc, k = par3)
  dum <- c(mean(x[memb==1,par1]))
  for (i in 2:par3) {
    dum <- c(dum, mean(x[memb==i,par1]))
  }
  hcm <- matrix(cbind(dum,1:par3),ncol=2)
  hcm <- hcm[sort.list(hcm[,1]),]
  for (i in 1:par3) {
    memb[memb==hcm[i,2]] <- paste('C',i,sep='')
  }
  memb <- as.factor(memb)
  print(memb)
  x[,par1] <- memb
}
if (par2=='equal') {
  # Equal-width intervals labelled C1..Cn
  ed <- cut(as.numeric(x[,par1]),par3,labels=paste('C',1:par3,sep=''))
  x[,par1] <- as.factor(ed)
}
table(x[,par1])
colnames(x)
colnames(x)[par1]
x[,par1]
if (par2 == 'none') {
  # Regression tree: numeric response modelled on all other columns
  m <- ctree(as.formula(paste(colnames(x)[par1],' ~ .',sep='')),data = x)
}
# Load the saved workspace 'createtable'; it is expected to provide the table.* helpers used below
load(file='createtable')
if (par2 != 'none') {
  # Classification tree on the discretised response
  m <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data = x)
  if (par4=='yes') {
    # Header rows of the cross-validation table
    a<-table.start()
    a<-table.row.start(a)
    a<-table.element(a,'10-Fold Cross Validation',3+2*par3,TRUE)
    a<-table.row.end(a)
    a<-table.row.start(a)
    a<-table.element(a,'',1,TRUE)
    a<-table.element(a,'Prediction (training)',par3+1,TRUE)
    a<-table.element(a,'Prediction (testing)',par3+1,TRUE)
    a<-table.row.end(a)
    a<-table.row.start(a)
    a<-table.element(a,'Actual',1,TRUE)
    for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
    a<-table.element(a,'CV',1,TRUE)
    for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
    a<-table.element(a,'CV',1,TRUE)
    a<-table.row.end(a)
    # Note: each of the 10 rounds draws an independent random split (about 90% training,
    # 10% testing) rather than partitioning the data into 10 disjoint folds
    for (i in 1:10) {
      ind <- sample(2, nrow(x), replace=TRUE, prob=c(0.9,0.1))
      m.ct <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data =x[ind==1,])
      # Accumulate in-sample (i) and out-of-sample (x) predictions and actuals
      if (i==1) {
        m.ct.i.pred <- predict(m.ct, newdata=x[ind==1,])
        m.ct.i.actu <- x[ind==1,par1]
        m.ct.x.pred <- predict(m.ct, newdata=x[ind==2,])
        m.ct.x.actu <- x[ind==2,par1]
      } else {
        m.ct.i.pred <- c(m.ct.i.pred,predict(m.ct, newdata=x[ind==1,]))
        m.ct.i.actu <- c(m.ct.i.actu,x[ind==1,par1])
        m.ct.x.pred <- c(m.ct.x.pred,predict(m.ct, newdata=x[ind==2,]))
        m.ct.x.actu <- c(m.ct.x.actu,x[ind==2,par1])
      }
    }
    # Training confusion table, per-class accuracy, and overall accuracy
    print(m.ct.i.tab <- table(m.ct.i.actu,m.ct.i.pred))
    numer <- 0
    for (i in 1:par3) {
      print(m.ct.i.tab[i,i] / sum(m.ct.i.tab[i,]))
      numer <- numer + m.ct.i.tab[i,i]
    }
    print(m.ct.i.cp <- numer / sum(m.ct.i.tab))
    # Same figures for the testing data
    print(m.ct.x.tab <- table(m.ct.x.actu,m.ct.x.pred))
    numer <- 0
    for (i in 1:par3) {
      print(m.ct.x.tab[i,i] / sum(m.ct.x.tab[i,]))
      numer <- numer + m.ct.x.tab[i,i]
    }
    print(m.ct.x.cp <- numer / sum(m.ct.x.tab))
    # One table row per actual class
    for (i in 1:par3) {
      a<-table.row.start(a)
      a<-table.element(a,paste('C',i,sep=''),1,TRUE)
      for (jjj in 1:par3) a<-table.element(a,m.ct.i.tab[i,jjj])
      a<-table.element(a,round(m.ct.i.tab[i,i]/sum(m.ct.i.tab[i,]),4))
      for (jjj in 1:par3) a<-table.element(a,m.ct.x.tab[i,jjj])
      a<-table.element(a,round(m.ct.x.tab[i,i]/sum(m.ct.x.tab[i,]),4))
      a<-table.row.end(a)
    }
    # Overall accuracy row
    a<-table.row.start(a)
    a<-table.element(a,'Overall',1,TRUE)
    for (jjj in 1:par3) a<-table.element(a,'-')
    a<-table.element(a,round(m.ct.i.cp,4))
    for (jjj in 1:par3) a<-table.element(a,'-')
    a<-table.element(a,round(m.ct.x.cp,4))
    a<-table.row.end(a)
    a<-table.end(a)
    table.save(a,file='mytable3.tab')
  }
}
# Print and plot the fitted tree
m
bitmap(file='test1.png')
plot(m)
dev.off()
# Distribution of the response within each terminal node
bitmap(file='test1a.png')
plot(x[,par1] ~ as.factor(where(m)),main='Response by Terminal Node',xlab='Terminal Node',ylab='Response')
dev.off()
if (par2 == 'none') {
  # Regression case: fitted values and residuals
  forec <- predict(m)
  result <- as.data.frame(cbind(x[,par1],forec,x[,par1]-forec))
  colnames(result) <- c('Actuals','Forecasts','Residuals')
  print(result)
}
if (par2 != 'none') {
  # Classification case: confusion table (actuals in rows, predictions in columns)
  print(cbind(as.factor(x[,par1]),predict(m)))
  myt <- table(as.factor(x[,par1]),predict(m))
  print(myt)
}
bitmap(file='test2.png')
if(par2=='none') {
  # 2 x 2 panel of diagnostic plots
  op <- par(mfrow=c(2,2))
  plot(density(result$Actuals),main='Kernel Density Plot of Actuals')
  plot(density(result$Residuals),main='Kernel Density Plot of Residuals')
  plot(result$Forecasts,result$Actuals,main='Actuals versus Predictions',xlab='Predictions',ylab='Actuals')
  plot(density(result$Forecasts),main='Kernel Density Plot of Predictions')
  par(op)
}
if(par2!='none') {
  plot(myt,main='Confusion Matrix',xlab='Actual',ylab='Predicted')
}
dev.off()
if (par2 == 'none') {
  # Correlation between predictions and actuals; R-squared is reported as its square
  corr <- cor(result$Forecasts,result$Actuals)
  a<-table.start()
  a<-table.row.start(a)
  a<-table.element(a,'Goodness of Fit',2,TRUE)
  a<-table.row.end(a)
  a<-table.row.start(a)
  a<-table.element(a,'Correlation',1,TRUE)
  a<-table.element(a,round(corr,4))
  a<-table.row.end(a)
  a<-table.row.start(a)
  a<-table.element(a,'R-squared',1,TRUE)
  a<-table.element(a,round(corr*corr,4))
  a<-table.row.end(a)
  a<-table.row.start(a)
  a<-table.element(a,'RMSE',1,TRUE)
  a<-table.element(a,round(sqrt(mean((result$Residuals)^2)),4))
  a<-table.row.end(a)
  a<-table.end(a)
  table.save(a,file='mytable1.tab')
  # Table of actuals, predictions, and residuals, one row per observation
  a<-table.start()
  a<-table.row.start(a)
  a<-table.element(a,'Actuals, Predictions, and Residuals',4,TRUE)
  a<-table.row.end(a)
  a<-table.row.start(a)
  a<-table.element(a,'#',header=TRUE)
  a<-table.element(a,'Actuals',header=TRUE)
  a<-table.element(a,'Forecasts',header=TRUE)
  a<-table.element(a,'Residuals',header=TRUE)
  a<-table.row.end(a)
  for (i in 1:length(result$Actuals)) {
    a<-table.row.start(a)
    a<-table.element(a,i,header=TRUE)
    a<-table.element(a,result$Actuals[i])
    a<-table.element(a,result$Forecasts[i])
    a<-table.element(a,result$Residuals[i])
    a<-table.row.end(a)
  }
  a<-table.end(a)
  table.save(a,file='mytable.tab')
}
if (par2 != 'none') {
  # Confusion matrix table for the classification case
  a<-table.start()
  a<-table.row.start(a)
  a<-table.element(a,'Confusion Matrix (predicted in columns / actuals in rows)',par3+1,TRUE)
  a<-table.row.end(a)
  a<-table.row.start(a)
  a<-table.element(a,'',1,TRUE)
  for (i in 1:par3) {
    a<-table.element(a,paste('C',i,sep=''),1,TRUE)
  }
  a<-table.row.end(a)
  for (i in 1:par3) {
    a<-table.row.start(a)
    a<-table.element(a,paste('C',i,sep=''),1,TRUE)
    for (j in 1:par3) {
      a<-table.element(a,myt[i,j])
    }
    a<-table.row.end(a)
  }
  a<-table.end(a)
  table.save(a,file='mytable2.tab')
}
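
For readers unfamiliar with the discretisation options, the `equal` branch of the code above relies on `cut` to split the response into equal-width bins labelled C1, C2, and so on. A minimal base R sketch with made-up values follows; the vector `v`, the name `n_bins` and the choice of three bins are assumptions for illustration only.

```r
v <- c(10, 12, 15, 20, 22, 30)   # toy response values (not from the data series)
n_bins <- 3                      # plays the role of par3

# Equal-width bins over the range of v, labelled C1..C3 as in the module code.
bins <- cut(as.numeric(v), n_bins, labels = paste('C', 1:n_bins, sep = ''))

print(table(bins))
```

This computation ran with par2 = none, so none of the discretisation branches was executed and the tree was fitted to the numeric response directly.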
 





Copyright

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 License.

Software written by Ed van Stee & Patrick Wessa


