Repository of Reproducible Computations

Free Statistics

of Irreproducible Research!

Author's title

Author

*The author of this computation has been verified*

R Software Module

rwasp_edauni.wasp

Title produced by software

Univariate Explorative Data Analysis

Date of computation

Sun, 26 Oct 2008 10:21:39 -0600

Cite this page as follows

Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?v=date/2008/Oct/26/t12250381401eazqtirh0vmtxa.htm/, Retrieved Fri, 17 May 2024 15:49:50 +0000

Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?pk=18969, Retrieved Fri, 17 May 2024 15:49:50 +0000

Original text written by user:

IsPrivate?

No (this computation is public)

User-defined keywords

Estimated Impact

161

Family? (F = Feedback message, R = changed R code, M = changed R Module, P = changed Parameters, D = changed Data)

F     [Univariate Explorative Data Analysis] [Investigation Dis...] [2007-10-21 17:06:37] [b9964c45117f7aac638ab9056d451faa]
F    D    [Univariate Explorative Data Analysis] [Reproduction Q2] [2008-10-26 16:21:39] [8a1195ff8db4df756ce44b463a631c76] [Current]
-   P       [Univariate Explorative Data Analysis] [Q2 Univariate Exp...] [2008-11-01 15:00:51] [7d3039e6253bb5fb3b26df1537d500b4] 
-   P       [Univariate Explorative Data Analysis] [Taak 4 correctie q2] [2008-11-02 13:00:19] [aa5573c1db401b164e448aef050955a1] 

Feedback Forum

2008-11-01 16:09:19 [Stéphanie Claes] [reply] 
The student examined the validity of the model using the four assumptions, but did not adjust the lags, which is necessary to check whether correlation is present. 
 
1. For the first assumption the student looked at the run sequence plot, but this is not sufficient to conclude that there is seasonality. For that we have to look at the lag plots => http://www.freestatistics.org/blog/index.php?v=date/2008/Nov/01/t1225551702th0gd5mxytjvudz.htm 
In the first lag plot the fitted line is almost horizontal and the points are scattered around it => the correlation is almost 0. 
In the second lag plot we see a curve. If we draw a straight line, the points fit it much more closely = positive seasonal correlation. This means that the industrial production of this month tells us something about production 12 months later, because production in a given month is consistently high or low. 
The autocorrelation function summarizes all of these correlations. 
The time series is not random; it contains seasonal correlation. 
 
2. For the second assumption we look at the histogram and possibly the density plot. The student did this and concluded that the distribution is not an ideal normal distribution. This is not correct: judging by the histogram it is more or less a normal distribution, and the kink in the density plot (like the deviation in the histogram) is not very pronounced, so there is no reason to assume that the distribution is not normal. 
 
3. For the third assumption we can look at the run sequence plot and consider the long-term trend. We see a decline, so we may suspect that the level does not stay constant in the long run, but this is hard to see. We can also look at the mean by applying central tendency. The student looked at the Q-Q plot, but I do not think much can be derived from it. 
 
4. For the fourth assumption we look at the run sequence plot again and examine the spread. The spread of the first part is larger than that of the second. The student again looked at the Q-Q plot. 
 
The student draws the right conclusion, namely that the model is not valid.
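
The lag-plot reasoning in this feedback can be illustrated with a minimal R sketch. It assumes a synthetic monthly series with a 12-month cycle (not the student's data, which is not reproduced on this page), and the helper `lag_cor` is a hypothetical name introduced here for illustration. Only base R functions are used.

```r
# Synthetic monthly series: an assumption for illustration, NOT the student's data.
set.seed(1)
n <- 61
t <- seq_len(n)
x <- ts(90 + 8 * sin(2 * pi * t / 12) + rnorm(n, sd = 1), frequency = 12)

# Hypothetical helper: correlation between the series and itself k observations earlier.
lag_cor <- function(x, k) {
  v <- as.numeric(x)
  m <- length(v)
  cor(v[(k + 1):m], v[1:(m - k)])
}

r1 <- lag_cor(x, 1)    # correlation with the previous month
r12 <- lag_cor(x, 12)  # correlation with the same month one year earlier

# Lag plot at lag 12 with a lowess curve and a regression line,
# mirroring what the module draws for lag 1.
v <- as.numeric(x)
prev <- v[1:(n - 12)]
curr <- v[13:n]
plot(prev, curr, xlab = "x[t-12]", ylab = "x[t]", main = "Lag plot, k = 12")
lines(lowess(prev, curr))
abline(lm(curr ~ prev))

# Autocorrelation function over three seasonal cycles.
a <- acf(x, lag.max = 36, main = "Autocorrelation Function")
```

On a series built like this one, `r12` is strongly positive, which is the pattern the reviewer describes as seasonal correlation. The value at lag 1 depends on the shape of the seasonal pattern, so it should be read from the plot rather than assumed.
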
2008-12-01 18:56:58 [0762c65deec3d397cd9f26b3749a0847] [reply] 
Good analysis, except that the lags were not adjusted. For assumption 1 you use the run sequence plot, whereas this assumption is tested with the autocorrelation function or the lag plot.  
 
For the second assumption it is also said that there is no 'ideal normal distribution', but the outliers are practically negligible.  
 
To assess the third assumption you should work with the run sequence plot rather than the Q-Q plot; large fluctuations can be read from it.
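
The checks for assumptions 3 and 4 discussed in the feedback (constant level and constant spread) can be made less dependent on eyeballing the run sequence plot by comparing the two halves of the series. The sketch below is one simple way to do that. It assumes a constructed series with a downward trend and shrinking swings (not the student's data), and the names `half`, `location`, and `variation` are illustrative.

```r
# Constructed series: an assumption for illustration, NOT the student's data.
n <- 61
t <- seq_len(n)
amplitude <- seq(10, 2, length.out = n)  # swings shrink over time
x <- 95 - 0.25 * t + amplitude * sin(2 * pi * t / 12)

# Label each observation as belonging to the first or second half.
first_len <- floor(n / 2)
half <- factor(rep(c("first", "second"), times = c(first_len, n - first_len)))

# Assumption 3 (fixed location): compare the mean of each half.
location <- tapply(x, half, mean)

# Assumption 4 (fixed variation): compare the standard deviation of each half.
variation <- tapply(x, half, sd)

print(round(rbind(location, variation), 2))

# Run sequence plot with a dashed line marking the split.
plot(x, type = "l", main = "Run Sequence Plot", xlab = "time or index", ylab = "value")
abline(v = first_len + 0.5, lty = 2)
```

A clearly lower mean in the second half points to a changing level, and a clearly smaller standard deviation points to a changing spread. Splitting in halves is a rough device; a formal test would be needed before drawing firm conclusions.
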

Summary of computational transaction
Raw Input	view raw input (R code)
Raw Output	view raw output of R engine
Computing time	4 seconds
R Server	'Herman Ole Andreas Wold' @ 193.190.124.10:1001

Descriptive Statistics
# observations	61
minimum	66.5
Q1	80.6
median	87.3
mean	86.8934426229508
Q3	94.1
maximum	109.7
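
The reported quartiles allow a quick check of the remark in the feedback that outliers are negligible. The sketch below applies Tukey's 1.5 × IQR rule to the values from the table above. It is only an approximation of what a boxplot would flag, since R's `boxplot` uses hinges that can differ slightly from `quantile` output, and the variable names are illustrative.

```r
# Values copied from the Descriptive Statistics table.
reported <- c(minimum = 66.5, Q1 = 80.6, median = 87.3,
              mean = 86.8934426229508, Q3 = 94.1, maximum = 109.7)

# Interquartile range and Tukey fences.
iqr <- reported[["Q3"]] - reported[["Q1"]]
lower_fence <- reported[["Q1"]] - 1.5 * iqr
upper_fence <- reported[["Q3"]] + 1.5 * iqr

# TRUE if the most extreme observations fall outside the fences.
outside <- reported[["minimum"]] < lower_fence || reported[["maximum"]] > upper_fence

cat("IQR:", iqr, "\n")
cat("Fences:", lower_fence, "to", upper_fence, "\n")
cat("Any value outside the fences:", outside, "\n")
```

With these numbers the fences are 60.35 and 114.35, so both the minimum (66.5) and the maximum (109.7) fall inside them.
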

Figures 1 to 6 (each available as PNG, Postscript, and PDF)

Parameters (Session):

par1 = 0 ; par2 = 0 ;

Parameters (R input):

par1 = 0 ; par2 = 0 ;

R code (references can be found in the software module):

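# Convert the module parameters to numeric values.
# par1: bandwidth for the density plot (0 = default bandwidth)
# par2: maximum lag for the autocorrelation function (0 = skip lag plot and ACF)
par1 <- as.numeric(par1)
par2 <- as.numeric(par2)
# x is the data series, defined outside this code; treat it as a time series.
x <- as.ts(x)
library(lattice)
# Run sequence plot
bitmap(file='pic1.png')
plot(x,type='l',main='Run Sequence Plot',xlab='time or index',ylab='value')
grid()
dev.off()
# Histogram
bitmap(file='pic2.png')
hist(x)
grid()
dev.off()
# Density plot, with a user-defined bandwidth if par1 > 0
bitmap(file='pic3.png')
if (par1 > 0)
{
densityplot(~x,col='black',main=paste('Density Plot   bw = ',par1),bw=par1)
} else {
densityplot(~x,col='black',main='Density Plot')
}
dev.off()
# Normal Q-Q plot
bitmap(file='pic4.png')
qqnorm(x)
grid()
dev.off()
# Lag plot (lag 1) and autocorrelation function; skipped when par2 = 0
if (par2 > 0)
{
bitmap(file='lagplot.png')
# Pair each observation with its neighbour; keep only the complete rows.
dum <- cbind(lag(x,k=1),x)
dum
dum1 <- dum[2:length(x),]
dum1
z <- as.data.frame(dum1)
z
plot(z,main=paste('Lag plot, lowess, and regression line'))
lines(lowess(z))
abline(lm(z))
dev.off()
bitmap(file='pic5.png')
acf(x,lag.max=par2,main='Autocorrelation Function')
grid()
dev.off()
}
# Five-number summary and mean, printed to the raw output
summary(x)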
load(file='createtable')
a<-table.start()
a<-table.row.start(a)
a<-table.element(a,'Descriptive Statistics',2,TRUE)
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'# observations',header=TRUE)
a<-table.element(a,length(x))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'minimum',header=TRUE)
a<-table.element(a,min(x))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Q1',header=TRUE)
a<-table.element(a,quantile(x,0.25))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'median',header=TRUE)
a<-table.element(a,median(x))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'mean',header=TRUE)
a<-table.element(a,mean(x))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'Q3',header=TRUE)
a<-table.element(a,quantile(x,0.75))
a<-table.row.end(a)
a<-table.row.start(a)
a<-table.element(a,'maximum',header=TRUE)
a<-table.element(a,max(x))
a<-table.row.end(a)
a<-table.end(a)
table.save(a,file='mytable.tab')
