
ws 10

*The author of this computation has been verified*
R Software Module: /rwasp_regression_trees1.wasp
Title produced by software: Recursive Partitioning (Regression Trees)
Date of computation: Tue, 14 Dec 2010 18:38:26 +0000
 
Cite this page as follows:
Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL http://www.freestatistics.org/blog/date/2010/Dec/14/t1292351793erdu495p94fy8vy.htm/, Retrieved Tue, 14 Dec 2010 19:36:34 +0100
 
BibTeX entries for LaTeX users:
@Manual{KEY,
    author = {{YOUR NAME}},
    publisher = {Office for Research Development and Education},
    title = {Statistical Computations at FreeStatistics.org, URL http://www.freestatistics.org/blog/date/2010/Dec/14/t1292351793erdu495p94fy8vy.htm/},
    year = {2010},
}
@Manual{R,
    title = {R: A Language and Environment for Statistical Computing},
    author = {{R Development Core Team}},
    organization = {R Foundation for Statistical Computing},
    address = {Vienna, Austria},
    year = {2010},
    note = {{ISBN} 3-900051-07-0},
    url = {http://www.R-project.org},
}
 
Original text written by user:
 
IsPrivate?
No (this computation is public)
 
User-defined keywords:
 
Dataseries X:
14544,5 94,6 -3,0 14097,8 15116,3 95,9 -3,7 14776,8 17413,2 104,7 -4,7 16833,3 16181,5 102,8 -6,4 15385,5 15607,4 98,1 -7,5 15172,6 17160,9 113,9 -7,8 16858,9 14915,8 80,9 -7,7 14143,5 13768 95,7 -6,6 14731,8 17487,5 113,2 -4,2 16471,6 16198,1 105,9 -2,0 15214 17535,2 108,8 -0,7 17637,4 16571,8 102,3 0,1 17972,4 16198,9 99 0,9 16896,2 16554,2 100,7 2,1 16698 19554,2 115,5 3,5 19691,6 15903,8 100,7 4,9 15930,7 18003,8 109,9 5,7 17444,6 18329,6 114,6 6,2 17699,4 16260,7 85,4 6,5 15189,8 14851,9 100,5 6,5 15672,7 18174,1 114,8 6,3 17180,8 18406,6 116,5 6,2 17664,9 18466,5 112,9 6,4 17862,9 16016,5 102 6,3 16162,3 17428,5 106 5,8 17463,6 17167,2 105,3 5,1 16772,1 19630 118,8 5,1 19106,9 17183,6 106,1 5,8 16721,3 18344,7 109,3 6,7 18161,3 19301,4 117,2 7,1 18509,9 18147,5 92,5 6,7 17802,7 16192,9 104,2 5,5 16409,9 18374,4 112,5 4,2 17967,7 20515,2 122,4 3,0 20286,6 18957,2 113,3 2,2 19537,3 16471,5 100 2,0 18021,9 18746,8 110,7 1,8 20194,3 19009,5 112,8 1,8 19049,6 19211,2 109 etc...
 
Output produced by software:



Summary of computational transaction
Computing time: 4 seconds
R Server: 'RServer@AstonUniversity' @ vre.aston.ac.uk


Goodness of Fit
Correlation: 0.7989
R-squared: 0.6383
RMSE: 5.9402
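The three figures above are related: the module's R code reports R-squared as the square of the correlation between forecasts and actuals, and RMSE as the root of the mean squared residual. Squaring the rounded correlation gives 0.7989^2 ≈ 0.6382, so the reported 0.6383 presumably comes from squaring the unrounded value. The following is a minimal sketch of the same calculations; `fit_metrics` is a hypothetical helper name and the input vectors are made-up illustration data, not the series analysed here.

```r
# Hypothetical helper (not part of the original module): computes the
# three goodness-of-fit figures the same way the module's R code does.
fit_metrics <- function(actuals, forecasts) {
  r <- cor(forecasts, actuals)        # Pearson correlation
  residuals <- actuals - forecasts    # same sign convention as the table
  list(
    correlation = r,
    r_squared   = r * r,              # squared correlation
    rmse        = sqrt(mean(residuals^2))
  )
}

# Made-up illustration data
m <- fit_metrics(c(1, 2, 3, 4), c(1.5, 1.5, 3.5, 3.5))
print(m)
```

Note that this R-squared is a squared correlation, which need not equal 1 - SSE/SST for models in general.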


Actuals, Predictions, and Residuals
#    Actuals   Forecasts          Residuals
1    94.6      91.5384615384615   3.06153846153846
2    95.9      91.5384615384615   4.36153846153847
3    104.7     100.924242424242   3.77575757575758
4    102.8     100.924242424242   1.87575757575758
5    98.1      100.924242424242   -2.82424242424243
6    113.9     100.924242424242   12.9757575757576
7    80.9      91.5384615384615   -10.6384615384615
8    95.7      91.5384615384615   4.16153846153847
9    113.2     100.924242424242   12.2757575757576
10   105.9     100.924242424242   4.97575757575758
11   108.8     100.924242424242   7.87575757575758
12   102.3     100.924242424242   1.37575757575758
13   99        100.924242424242   -1.92424242424242
14   100.7     100.924242424242   -0.224242424242419
15   115.5     113.413636363636   2.08636363636364
16   100.7     100.924242424242   -0.224242424242419
17   109.9     100.924242424242   8.97575757575758
18   114.6     113.413636363636   1.18636363636364
19   85.4      100.924242424242   -15.5242424242424
20   100.5     91.5384615384615   8.96153846153847
21   114.8     113.413636363636   1.38636363636364
22   116.5     113.413636363636   3.08636363636364
23   112.9     113.413636363636   -0.513636363636351
24   102       100.924242424242   1.07575757575758
25   106       100.924242424242   5.07575757575758
26   105.3     100.924242424242   4.37575757575758
27   118.8     113.413636363636   5.38636363636364
28   106.1     100.924242424242   5.17575757575757
29   109.3     113.413636363636   -4.11363636363636
30   117.2     113.413636363636   3.78636363636365
31   92.5      100.924242424242   -8.42424242424242
32   104.2     100.924242424242   3.27575757575758
33   112.5     113.413636363636   -0.913636363636357
34   122.4     113.413636363636   8.98636363636365
35   113.3     113.413636363636   -0.11363636363636
36   100       100.924242424242   -0.924242424242422
37   110.7     113.413636363636   -2.71363636363635
38   112.8     113.413636363636   -0.61363636363636
39   109.8     113.413636363636   -3.61363636363636
40   117.3     113.413636363636   3.88636363636364
41   109.1     113.413636363636   -4.31363636363636
42   115.9     113.413636363636   2.48636363636365
43   96        113.413636363636   -17.4136363636364
44   99.8      100.924242424242   -1.12424242424242
45   116.8     113.413636363636   3.38636363636364
46   115.7     113.413636363636   2.28636363636365
47   99.4      100.924242424242   -1.52424242424242
48   94.3      91.5384615384615   2.76153846153846
49   91        91.5384615384615   -0.538461538461533
50   93.2      91.5384615384615   1.66153846153847
51   103.1     100.924242424242   2.17575757575757
52   94.1      91.5384615384615   2.56153846153846
53   91.8      91.5384615384615   0.261538461538464
54   102.7     100.924242424242   1.77575757575758
55   82.6      91.5384615384615   -8.93846153846154
56   89.1      91.5384615384615   -2.43846153846154
57   104.5     100.924242424242   3.57575757575758
58   105.1     100.924242424242   4.17575757575757
59   95.1      100.924242424242   -5.82424242424243
60   88.7      100.924242424242   -12.2242424242424
61   86.3      91.5384615384615   -5.23846153846154
62   91.8      100.924242424242   -9.12424242424242
63   111.5     113.413636363636   -1.91363636363636
64   99.7      100.924242424242   -1.22424242424242
65   97.5      100.924242424242   -3.42424242424242
66   111.7     113.413636363636   -1.71363636363635
67   86.2      100.924242424242   -14.7242424242424
68   95.4      100.924242424242   -5.52424242424242
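The forecasts take only three distinct values, which suggests a fitted tree with three terminal nodes, each predicting the mean of the actuals that fall into it. As a quick check under that assumption, the sketch below averages the 13 actuals whose forecast is 91.5384615384615 (rows 1, 2, 7, 8, 20, 48, 49, 50, 52, 53, 55, 56 and 61 of the table). The variable names are illustrative only.

```r
# Actuals of the 13 rows whose forecast is 91.5384615384615
node_actuals <- c(94.6, 95.9, 80.9, 95.7, 100.5, 94.3, 91,
                  93.2, 94.1, 91.8, 82.6, 89.1, 86.3)

# Under the node-mean assumption, the forecast is simply their average
node_forecast <- mean(node_actuals)
print(node_forecast)

# Residuals for these rows are actual minus node mean
print(node_actuals - node_forecast)
```

The result agrees with the tabulated forecast for those rows, and the first residual matches row 1 of the table.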
 
Charts produced by software:
http://www.freestatistics.org/blog/date/2010/Dec/14/t1292351793erdu495p94fy8vy/2gm881292351900.png
http://www.freestatistics.org/blog/date/2010/Dec/14/t1292351793erdu495p94fy8vy/2gm881292351900.ps

http://www.freestatistics.org/blog/date/2010/Dec/14/t1292351793erdu495p94fy8vy/3gm881292351900.png
http://www.freestatistics.org/blog/date/2010/Dec/14/t1292351793erdu495p94fy8vy/3gm881292351900.ps

http://www.freestatistics.org/blog/date/2010/Dec/14/t1292351793erdu495p94fy8vy/49wqt1292351900.png
http://www.freestatistics.org/blog/date/2010/Dec/14/t1292351793erdu495p94fy8vy/49wqt1292351900.ps


 
Parameters (Session):
par1 = 2 ; par2 = none ; par3 = 3 ; par4 = no ;
 
Parameters (R input):
par1 = 2 ; par2 = none ; par3 = 3 ; par4 = no ;
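Judging from the R code of the module, par1 is the column index of the response variable, par2 selects how the response is discretized ('none', 'kmeans', 'quantiles', 'hclust' or 'equal'), par3 is the number of classes, and par4 switches the cross-validation table on. In this run par2 is 'none', so the response stays numeric and a regression tree is fitted. For comparison, here is a small sketch of what the 'equal' option would do with par3 = 3; it reuses the module's `cut` call on the first eight values of the Actuals column, chosen purely as sample input.

```r
# First eight actuals from the results table, used only as sample input
response <- c(94.6, 95.9, 104.7, 102.8, 98.1, 113.9, 80.9, 95.7)
par3 <- 3  # number of classes, as in this session's parameters

# Same call pattern as the module's 'equal' branch:
# par3 equal-width intervals over the range of the data, labelled C1..C3
classes <- cut(as.numeric(response), par3, labels = paste('C', 1:par3, sep = ''))
print(data.frame(response, classes))
print(table(classes))
```

With a discretized response the module fits a classification tree and reports a confusion matrix instead of the goodness-of-fit table.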
 
R code (references can be found in the software module):
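# Note: y and par1..par4 are not defined in this listing; they are
# presumably supplied by the server (see the Parameters sections above).
library(party)  # ctree(), where()
library(Hmisc)  # cut2()
par1 <- as.numeric(par1)  # column index of the response variable
par3 <- as.numeric(par3)  # number of classes used when discretizing
x <- data.frame(t(y))  # transpose so that each column is one variable
is.data.frame(x)
x <- x[!is.na(x[,par1]),]  # drop rows with a missing response
k <- length(x[1,])  # number of variables
n <- length(x[,1])  # number of observations
colnames(x)[par1]
x[,par1]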
# Optional discretization of the response into par3 classes
if (par2 == 'kmeans') {
  cl <- kmeans(x[,par1], par3)
  print(cl)
  # relabel clusters as C1..C<par3> in increasing order of cluster center
  clm <- matrix(cbind(cl$centers,1:par3),ncol=2)
  clm <- clm[sort.list(clm[,1]),]
  for (i in 1:par3) {
    cl$cluster[cl$cluster==clm[i,2]] <- paste('C',i,sep='')
  }
  cl$cluster <- as.factor(cl$cluster)
  print(cl$cluster)
  x[,par1] <- cl$cluster
}
if (par2 == 'quantiles') {
  # par3 quantile groups of roughly equal size
  x[,par1] <- cut2(x[,par1],g=par3)
}
if (par2 == 'hclust') {
  # centroid linkage on squared Euclidean distances
  hc <- hclust(dist(x[,par1])^2, 'cen')
  print(hc)
  memb <- cutree(hc, k = par3)
  # relabel groups as C1..C<par3> in increasing order of group mean
  dum <- c(mean(x[memb==1,par1]))
  for (i in 2:par3) {
    dum <- c(dum, mean(x[memb==i,par1]))
  }
  hcm <- matrix(cbind(dum,1:par3),ncol=2)
  hcm <- hcm[sort.list(hcm[,1]),]
  for (i in 1:par3) {
    memb[memb==hcm[i,2]] <- paste('C',i,sep='')
  }
  memb <- as.factor(memb)
  print(memb)
  x[,par1] <- memb
}
if (par2=='equal') {
  # par3 equal-width intervals
  ed <- cut(as.numeric(x[,par1]),par3,labels=paste('C',1:par3,sep=''))
  x[,par1] <- as.factor(ed)
}
table(x[,par1])
colnames(x)
colnames(x)[par1]
x[,par1]
# Numeric response: fit a regression tree on all other columns
if (par2 == 'none') {
  m <- ctree(as.formula(paste(colnames(x)[par1],' ~ .',sep='')),data = x)
}
load(file='createtable')  # presumably defines the table.* helpers used below
# Discretized response: fit a classification tree
if (par2 != 'none') {
  m <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data = x)
  if (par4=='yes') {
    # header rows of the cross-validation table
    a<-table.start()
    a<-table.row.start(a)
    a<-table.element(a,'10-Fold Cross Validation',3+2*par3,TRUE)
    a<-table.row.end(a)
    a<-table.row.start(a)
    a<-table.element(a,'',1,TRUE)
    a<-table.element(a,'Prediction (training)',par3+1,TRUE)
    a<-table.element(a,'Prediction (testing)',par3+1,TRUE)
    a<-table.row.end(a)
    a<-table.row.start(a)
    a<-table.element(a,'Actual',1,TRUE)
    for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
    a<-table.element(a,'CV',1,TRUE)
    for (jjj in 1:par3) a<-table.element(a,paste('C',jjj,sep=''),1,TRUE)
    a<-table.element(a,'CV',1,TRUE)
    a<-table.row.end(a)
    # Ten rounds, each with an independent random split (about 90% training,
    # 10% testing). This is repeated random subsampling rather than a strict
    # partition of the data into 10 folds.
    for (i in 1:10) {
      ind <- sample(2, nrow(x), replace=TRUE, prob=c(0.9,0.1))
      m.ct <- ctree(as.formula(paste('as.factor(',colnames(x)[par1],') ~ .',sep='')),data =x[ind==1,])
      if (i==1) {
        m.ct.i.pred <- predict(m.ct, newdata=x[ind==1,])
        m.ct.i.actu <- x[ind==1,par1]
        m.ct.x.pred <- predict(m.ct, newdata=x[ind==2,])
        m.ct.x.actu <- x[ind==2,par1]
      } else {
        # pool predictions and actuals over all rounds
        m.ct.i.pred <- c(m.ct.i.pred,predict(m.ct, newdata=x[ind==1,]))
        m.ct.i.actu <- c(m.ct.i.actu,x[ind==1,par1])
        m.ct.x.pred <- c(m.ct.x.pred,predict(m.ct, newdata=x[ind==2,]))
        m.ct.x.actu <- c(m.ct.x.actu,x[ind==2,par1])
      }
    }
    # training confusion table, per-class and overall proportion correct
    print(m.ct.i.tab <- table(m.ct.i.actu,m.ct.i.pred))
    numer <- 0
    for (i in 1:par3) {
      print(m.ct.i.tab[i,i] / sum(m.ct.i.tab[i,]))
      numer <- numer + m.ct.i.tab[i,i]
    }
    print(m.ct.i.cp <- numer / sum(m.ct.i.tab))
    # testing confusion table, per-class and overall proportion correct
    print(m.ct.x.tab <- table(m.ct.x.actu,m.ct.x.pred))
    numer <- 0
    for (i in 1:par3) {
      print(m.ct.x.tab[i,i] / sum(m.ct.x.tab[i,]))
      numer <- numer + m.ct.x.tab[i,i]
    }
    print(m.ct.x.cp <- numer / sum(m.ct.x.tab))
    for (i in 1:par3) {
      a<-table.row.start(a)
      a<-table.element(a,paste('C',i,sep=''),1,TRUE)
      for (jjj in 1:par3) a<-table.element(a,m.ct.i.tab[i,jjj])
      a<-table.element(a,round(m.ct.i.tab[i,i]/sum(m.ct.i.tab[i,]),4))
      for (jjj in 1:par3) a<-table.element(a,m.ct.x.tab[i,jjj])
      a<-table.element(a,round(m.ct.x.tab[i,i]/sum(m.ct.x.tab[i,]),4))
      a<-table.row.end(a)
    }
    a<-table.row.start(a)
    a<-table.element(a,'Overall',1,TRUE)
    for (jjj in 1:par3) a<-table.element(a,'-')
    a<-table.element(a,round(m.ct.i.cp,4))
    for (jjj in 1:par3) a<-table.element(a,'-')
    a<-table.element(a,round(m.ct.x.cp,4))
    a<-table.row.end(a)
    a<-table.end(a)
    table.save(a,file='mytable3.tab')
  }
}
m
# plot of the fitted tree
bitmap(file='test1.png')
plot(m)
dev.off()
# distribution of the response within each terminal node
bitmap(file='test1a.png')
plot(x[,par1] ~ as.factor(where(m)),main='Response by Terminal Node',xlab='Terminal Node',ylab='Response')
dev.off()
if (par2 == 'none') {
  # regression case: fitted values and residuals (actual minus forecast)
  forec <- predict(m)
  result <- as.data.frame(cbind(x[,par1],forec,x[,par1]-forec))
  colnames(result) <- c('Actuals','Forecasts','Residuals')
  print(result)
}
if (par2 != 'none') {
  # classification case: confusion table (actuals in rows, predictions in columns)
  print(cbind(as.factor(x[,par1]),predict(m)))
  myt <- table(as.factor(x[,par1]),predict(m))
  print(myt)
}
bitmap(file='test2.png')
if(par2=='none') {
  op <- par(mfrow=c(2,2))
  plot(density(result$Actuals),main='Kernel Density Plot of Actuals')
  plot(density(result$Residuals),main='Kernel Density Plot of Residuals')
  plot(result$Forecasts,result$Actuals,main='Actuals versus Predictions',xlab='Predictions',ylab='Actuals')
  plot(density(result$Forecasts),main='Kernel Density Plot of Predictions')
  par(op)
}
if(par2!='none') {
  plot(myt,main='Confusion Matrix',xlab='Actual',ylab='Predicted')
}
dev.off()
if (par2 == 'none') {
  # detcoef holds the Pearson correlation; it is squared below for R-squared
  detcoef <- cor(result$Forecasts,result$Actuals)
  a<-table.start()
  a<-table.row.start(a)
  a<-table.element(a,'Goodness of Fit',2,TRUE)
  a<-table.row.end(a)
  a<-table.row.start(a)
  a<-table.element(a,'Correlation',1,TRUE)
  a<-table.element(a,round(detcoef,4))
  a<-table.row.end(a)
  a<-table.row.start(a)
  a<-table.element(a,'R-squared',1,TRUE)
  a<-table.element(a,round(detcoef*detcoef,4))
  a<-table.row.end(a)
  a<-table.row.start(a)
  a<-table.element(a,'RMSE',1,TRUE)
  a<-table.element(a,round(sqrt(mean((result$Residuals)^2)),4))
  a<-table.row.end(a)
  a<-table.end(a)
  table.save(a,file='mytable1.tab')
  # one row per observation: index, actual, forecast, residual
  a<-table.start()
  a<-table.row.start(a)
  a<-table.element(a,'Actuals, Predictions, and Residuals',4,TRUE)
  a<-table.row.end(a)
  a<-table.row.start(a)
  a<-table.element(a,'#',header=TRUE)
  a<-table.element(a,'Actuals',header=TRUE)
  a<-table.element(a,'Forecasts',header=TRUE)
  a<-table.element(a,'Residuals',header=TRUE)
  a<-table.row.end(a)
  for (i in 1:length(result$Actuals)) {
    a<-table.row.start(a)
    a<-table.element(a,i,header=TRUE)
    a<-table.element(a,result$Actuals[i])
    a<-table.element(a,result$Forecasts[i])
    a<-table.element(a,result$Residuals[i])
    a<-table.row.end(a)
  }
  a<-table.end(a)
  table.save(a,file='mytable.tab')
}
if (par2 != 'none') {
  # confusion matrix table for the classification case
  a<-table.start()
  a<-table.row.start(a)
  a<-table.element(a,'Confusion Matrix (predicted in columns / actuals in rows)',par3+1,TRUE)
  a<-table.row.end(a)
  a<-table.row.start(a)
  a<-table.element(a,'',1,TRUE)
  for (i in 1:par3) {
    a<-table.element(a,paste('C',i,sep=''),1,TRUE)
  }
  a<-table.row.end(a)
  for (i in 1:par3) {
    a<-table.row.start(a)
    a<-table.element(a,paste('C',i,sep=''),1,TRUE)
    for (j in 1:par3) {
      a<-table.element(a,myt[i,j])
    }
    a<-table.row.end(a)
  }
  a<-table.end(a)
  table.save(a,file='mytable2.tab')
}
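The block labelled '10-Fold Cross Validation' in the code above draws a fresh random training/testing label for every row in each of its ten rounds, so a given row may land in the test set several times or never. The sketch below, using base R only, contrasts that scheme with a strict 10-fold assignment in which every row is tested exactly once. The seed and the variable names `folds`, `test_rows` and `train_rows` are illustrative assumptions, and no tree is fitted here.

```r
set.seed(1)  # arbitrary seed, only to make this sketch reproducible
n <- 68      # number of observations in this computation

# Scheme used in the module: independent labels, 1 = training, 2 = testing
ind <- sample(2, n, replace = TRUE, prob = c(0.9, 0.1))
print(table(ind))  # test-set size varies from draw to draw

# Alternative: strict 10-fold assignment, each row in exactly one fold
folds <- sample(rep(1:10, length.out = n))
print(table(folds))

for (f in 1:10) {
  test_rows  <- which(folds == f)
  train_rows <- which(folds != f)
  # a model would be fitted on train_rows and evaluated on test_rows here
}
```

Which scheme is preferable depends on the goal; the strict partition guarantees that each observation contributes to the test results exactly once.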
 





Copyright

Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-Share Alike 3.0 License.

Software written by Ed van Stee & Patrick Wessa

