Repository of Reproducible Computations

Free Statistics

of Irreproducible Research!

*The author of this computation has been verified*

R Software Module

rwasp_edauni.wasp

Title produced by software

Univariate Explorative Data Analysis

Date of computation

Sun, 26 Oct 2008 09:37:07 -0600

Cite this page as follows

Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?v=date/2008/Oct/26/t12250355244zv3g7q8pw8bqym.htm/, Retrieved Fri, 17 May 2024 03:41:29 +0000

Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?pk=18942, Retrieved Fri, 17 May 2024 03:41:29 +0000

IsPrivate?

No (this computation is public)

Estimated Impact

167

Family? (F = Feedback message, R = changed R code, M = changed R Module, P = changed Parameters, D = changed Data)

F     [Univariate Explorative Data Analysis] [Investigating dis...] [2007-10-22 19:45:25] [b9964c45117f7aac638ab9056d451faa]
F   PD    [Univariate Explorative Data Analysis] [taak3 autocorrelatie] [2008-10-26 15:37:07] [1aceffc2fa350402d9e8f8edd757a2e8] [Current]
-   PD      [Univariate Explorative Data Analysis] [Verbetering Q7] [2008-10-30 23:38:53] [6816386b1f3c2f6c0c9f2aa1e5bc9362] 
- RMPD      [Central Tendency] [Verbetering Q7] [2008-10-30 23:58:30] [6816386b1f3c2f6c0c9f2aa1e5bc9362] 

Feedback Forum

2008-10-31 00:04:02 [Kenny Simons]
For assumption 1 you do indeed have to look at the plot of the autocorrelation function. You can see there that quite a few values lie outside the confidence interval, so there is certainly correlation. If we now set the number of lags to 36, you may be able to read off whether this correlation is seasonal.
http://www.freestatistics.org/blog/index.php?v=date/2008/Oct/31/t1225409984nyi6jq7lhvk5g5j.htm
We clearly see here that the correlation is not seasonal.

For assumption 2 you drew the right conclusion. If you look at the histogram, you will see that it is skewed right, so there is no fixed distribution.

For assumption 3 you looked at the normal QQ plot, but the answer to this question has to be derived from the run sequence plot. If you look at that plot, you will notice that the level rises slightly over the long term: first the level rises strongly and then it falls again. If we now compute the central tendency of this time series, we can see that the mean also rises a little. You can see this in the plot of the winsorized mean, where the mean starts at a value of about 8.7 and ends at a value of about 8.8. The same holds for the plot of the trimmed mean.
http://www.freestatistics.org/blog/index.php?v=date/2008/Oct/31/t12254112742a46zm4x6p7a8n8.htm

For assumption 4 we look at the run sequence plot again. It is hard to tell from it whether the spread of the curve is constant. If we split the graph in two, we see large fluctuations in the first part and somewhat smaller fluctuations in the second part, so the spread is not constant.

We can therefore conclude that the time series does not satisfy the 4 conditions.
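
The checks described in this feedback can be sketched in R roughly as follows. This is only an illustration, not the module's own code: the values are copied from "Dataseries X" with decimal commas turned into points, the variable names (v, rho, bound, outside, sd_first, sd_second) are made up for this sketch, the 10% trimming fraction is an arbitrary choice, and the bound is the usual approximate 95% white-noise limit.

```r
# Series copied from "Dataseries X" (decimal commas converted to points)
x <- ts(c(
  8.9, 8.9, 8.5, 8.1, 7.5, 7.1, 6.9, 7.1, 7, 6.7,
  7, 7.3, 7.7, 8.4, 8.4, 8.8, 9.1, 9, 8.6, 7.9,
  7.7, 7.8, 9.1, 9.4, 9.3, 8.7, 8.4, 8.6, 9, 9.1,
  8.7, 8.2, 7.9, 7.9, 9.1, 9.4, 9.5, 9.1, 9, 9.3,
  9.9, 9.8, 9.4, 8.3, 8, 8.5, 10.4, 11.1, 10.9, 9.9,
  9.2, 9.2, 9.5, 9.6, 9.5, 9.1, 8.9, 9, 10.1, 10.3,
  10.2, 9.6, 9.2, 9.3, 9.4, 9.4, 9.2, 9, 9, 9,
  9.8, 10, 9.9, 9.3, 9, 9, 9.1, 9.1, 9.1, 9.2,
  8.8, 8.3, 8.4, 8.1, 7.8, 7.9, 7.9, 8, 7.9, 7.5,
  7.2, 6.9, 6.6, 6.7
))
v <- as.numeric(x)   # plain numeric copy for the summary statistics
n <- length(v)

# Assumption 1: autocorrelations up to lag 36, computed without drawing the plot
r <- acf(x, lag.max = 36, plot = FALSE)
rho <- as.numeric(r$acf)[-1]          # drop lag 0, which is always 1
bound <- qnorm(0.975) / sqrt(n)       # approximate 95% white-noise bound
outside <- which(abs(rho) > bound)    # lags whose autocorrelation exceeds the bound
print(outside)

# Assumption 3: plain mean next to a 10% trimmed mean (trimming fraction is an assumption)
print(c(mean = mean(v), trimmed = mean(v, trim = 0.1)))

# Assumption 4: compare the spread of the first and the second half of the series
half <- n %/% 2
sd_first  <- sd(v[1:half])
sd_second <- sd(v[(half + 1):n])
print(c(first_half = sd_first, second_half = sd_second))
```

Peaks in rho at lags 12, 24 and 36 would point to a yearly pattern in monthly data; whether the two standard deviations differ enough to call the spread non-constant remains a judgment call.
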
2008-11-03 19:56:56 [An De Koninck]
Here too, as with Q, some mistakes were made.
Assumption 1: It is good that the student set a specific number of lags, namely 12. Many of the autocorrelations fall outside the confidence interval, but this does not immediately point to seasonality. For that you need to set a higher number of lags (e.g. 24 or 36), so that you can check over a longer horizon whether there is seasonality.
If you look at the second-to-last graph, you see that many points deviate strongly from the straight line drawn through the point cloud. This points to a very low autocorrelation, one of almost 0.

Assumption 2: I would not dare to say that the figures are abnormally distributed. If you look at the density plot, you do see a strong resemblance to a Gaussian curve, which corresponds to a normal distribution. It is true that there are outliers, as you can see on the normal QQ plot, but most points still lie around the mean.

Assumption 3: If you look at the run sequence plot, you see that the graph goes up and down a lot. However, this is a view over a short time span, and it is better to look at a longer period. You can examine this better by using the central tendency. At a mean of 8.7 the graph is constant if you take away the extremes. The outliers therefore do not directly influence it.

Assumption 4: Here a wrong answer is given. You have to look at the run sequence plot, because it shows the spread of the series over time. The left part shows a larger spread than the right part.
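
The way a lag plot is read under Assumption 1 rests on a general rule: the closer the points lie to the fitted line, the stronger the lag-1 autocorrelation. A small sketch on simulated data may make that rule concrete. Nothing here comes from the dataset on this page: the two series, the seed, the AR coefficient of 0.9 and the helper lag1_cor are all assumptions chosen purely for illustration.

```r
set.seed(1)   # arbitrary seed, only so the sketch is repeatable
n <- 200

# Two simulated series: one strongly autocorrelated, one without autocorrelation
ar1   <- arima.sim(model = list(ar = 0.9), n = n)
noise <- rnorm(n)

# Correlation between each observation and its predecessor,
# i.e. the correlation of the point cloud shown in a lag-1 plot
lag1_cor <- function(s) {
  s <- as.numeric(s)
  cor(s[-1], s[-length(s)])
}

r_ar1   <- lag1_cor(ar1)
r_noise <- lag1_cor(noise)
print(c(ar1 = r_ar1, noise = r_noise))

# Side-by-side lag plots: a tight band for ar1, a shapeless cloud for noise
op <- par(mfrow = c(1, 2))
plot(as.numeric(ar1)[-n], as.numeric(ar1)[-1],
     xlab = "x[t-1]", ylab = "x[t]", main = "AR(1), coefficient 0.9")
plot(noise[-n], noise[-1],
     xlab = "x[t-1]", ylab = "x[t]", main = "White noise")
par(op)
```

Applying lag1_cor to the series on this page gives the number that the lag plot visualises, which is a quick way to check a reading made by eye.
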

Dataseries X:

8,9
8,9
8,5
8,1
7,5
7,1
6,9
7,1
7
6,7
7
7,3
7,7
8,4
8,4
8,8
9,1
9
8,6
7,9
7,7
7,8
9,1
9,4
9,3
8,7
8,4
8,6
9
9,1
8,7
8,2
7,9
7,9
9,1
9,4
9,5
9,1
9
9,3
9,9
9,8
9,4
8,3
8
8,5
10,4
11,1
10,9
9,9
9,2
9,2
9,5
9,6
9,5
9,1
8,9
9
10,1
10,3
10,2
9,6
9,2
9,3
9,4
9,4
9,2
9
9
9
9,8
10
9,9
9,3
9
9
9,1
9,1
9,1
9,2
8,8
8,3
8,4
8,1
7,8
7,9
7,9
8
7,9
7,5
7,2
6,9
6,6
6,7
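
The series above is written with decimal commas, while R expects decimal points. One possible way to read such values is shown below; it is a sketch that uses only observations 6 to 9 of the series, and the variable names raw and x are illustrative.

```r
# Observations 6 to 9 of the series, exactly as listed (decimal commas)
raw <- "7,1
6,9
7,1
7"

# scan() can read from a string; dec = "," sets the decimal separator
x <- scan(text = raw, dec = ",", quiet = TRUE)
x <- as.ts(x)   # same conversion the module applies before plotting
print(x)
```

For a downloaded file, read.csv2() uses the same comma-as-decimal convention.
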

Summary of computational transaction
Raw Input	view raw input (R code)
Raw Output	view raw output of R engine
Computing time	2 seconds
R Server	'Gwilym Jenkins' @ 72.249.127.135

Descriptive Statistics
# observations	94
minimum	6.6
Q1	8
median	9
mean	8.71808510638298
Q3	9.3
maximum	11.1

Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6

Parameters (Session):

par1 = 0 ; par2 = 12 ;

Parameters (R input):

par1 = 0 ; par2 = 12 ;

R code (references can be found in the software module):

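# par1: bandwidth for the density plot (0 = use the default)
# par2: maximum lag for the ACF (0 = skip the lag plot and the ACF)
par1 <- as.numeric(par1)
par2 <- as.numeric(par2)
x <- as.ts(x)
library(lattice)  # provides densityplot()
# Run sequence plot
bitmap(file='pic1.png')
plot(x,type='l',main='Run Sequence Plot',xlab='time or index',ylab='value')
grid()
dev.off()
# Histogram
bitmap(file='pic2.png')
hist(x)
grid()
dev.off()
# Density plot; print() makes sure the lattice object is drawn even if the script is source()d
bitmap(file='pic3.png')
if (par1 > 0)
{
print(densityplot(~x,col='black',main=paste('Density Plot   bw = ',par1),bw=par1))
} else {
print(densityplot(~x,col='black',main='Density Plot'))
}
dev.off()
# Normal QQ plot
bitmap(file='pic4.png')
qqnorm(x)
grid()
dev.off()
if (par2 > 0)
{
# Lag plot: pair each observation with its neighbour one time step away
bitmap(file='lagplot.png')
dum <- cbind(lag(x,k=1),x)
dum
# Keep rows 2..n only; the first and the last row of dum contain an NA
dum1 <- dum[2:length(x),]
dum1
z <- as.data.frame(dum1)
z
plot(z,main=paste('Lag plot, lowess, and regression line'))
lines(lowess(z))
abline(lm(z))
dev.off()
# Autocorrelation function up to lag par2
bitmap(file='pic5.png')
acf(x,lag.max=par2,main='Autocorrelation Function')
grid()
dev.off()
}
# Five-number summary plus mean, written to the raw output
summary(x)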
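# Build the 'Descriptive Statistics' table; the table.* helpers are loaded from the server-side file 'createtable'
load(file='createtable')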
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Descriptive Statistics',2,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'# observations',header=TRUE)
a<-table.element(a,length(x))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'minimum',header=TRUE)
a<-table.element(a,min(x))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Q1',header=TRUE)
a<-table.element(a,quantile(x,0.25))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'median',header=TRUE)
a<-table.element(a,median(x))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'mean',header=TRUE)
a<-table.element(a,mean(x))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Q3',header=TRUE)
a<-table.element(a,quantile(x,0.75))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'maximum',header=TRUE)
a<-table.element(a,max(x))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable.tab')
