Repository of Reproducible Computations

Free Statistics of Irreproducible Research!

Author's title

Author

*Unverified author*

R Software Module

rwasp_pairs.wasp

Title produced by software

Kendall tau Correlation Matrix

Date of computation

Wed, 05 Nov 2008 05:22:31 -0700

Cite this page as follows

Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?v=date/2008/Nov/05/t12258878110laxthtgr5el115.htm/, Retrieved Sun, 26 May 2024 06:09:59 +0000

Statistical Computations at FreeStatistics.org, Office for Research Development and Education, URL https://freestatistics.org/blog/index.php?pk=21736, Retrieved Sun, 26 May 2024 06:09:59 +0000

IsPrivate?

No (this computation is public)

User-defined keywords

Severijns Britt

Estimated Impact

162

Family? (F = Feedback message, R = changed R code, M = changed R Module, P = changed Parameters, D = changed Data)

F       [Kendall tau Correlation Matrix] [part 2 Q1] [2008-11-05 12:22:31] [78308c9f3efc33d1da821bcd963df161] [Current]

Feedback Forum

2008-11-09 13:54:00 [Nathalie Daneels] [reply] 
Evaluation of assignment 2, Task 1:

I agree with the student's conclusion, namely that cash flow (RCF) is the best accounting predictor of RNR. This can be seen from the correlation between the two variables (0.01) and from the plot of RCF against RNR (almost all points lie on a straight line).
I would add the following to the conclusion:
Looking at the spreadsheet with the data, we can see that time is on the horizontal axis and the variables are on the vertical axis. Time must always be on the vertical axis, so the table has to be transposed. (This table should also be included in the paper.) To solve this assignment we used the Kendall tau rank correlation. This plot gives an overview of all correlations between the variables at once: the main diagonal shows a histogram of each variable; above the main diagonal, on the right, are the scatterplots (one per pair of variables); and below the main diagonal, on the left, is the probability for each pair of variables (not the correlation coefficient), in other words the probability that the relationship can be attributed to chance (the higher the number, the more likely it is that the relationship is due to chance). This plot is also a very good measure: it is much less affected by outliers and is much more robust.
We can conclude that cash flow (RCF) is the best accounting predictor of RNR (as the student also concluded). The probability that the relationship between RCF and RNR is due to chance is almost nil (0.01); in other words, we can be practically certain that RCF is the best predictor. We can also infer this from the scatterplot of these two variables: it is almost possible to draw a straight line through the observations (as the student also mentioned in her conclusion), which means that there is a linear relationship between the two series. Only a few observations do not lie on this line, so we can clearly assume a strong positive relationship between the variables (a small x-value of one variable corresponds to a small x-value of the other variable; the same holds for the y-values) and therefore also a positive correlation between x and y.
2008-11-11 21:29:06 [Peter Van Doninck] [reply] 
The conclusion is correct. However, the 0.01 should be handled with care: it is the p-value. If the p-value is smaller than 0.05, which is the case here, we can say that the relationship is not due to chance. When we look at the correlation of cash flow with the return, we get a correlation of 80%, which is the highest value. Cash flow is therefore the best predictor of the return.
2008-11-12 09:59:45 [Maarten Van Gucht] [reply] 
Q1: the student does give the right answer, with the right computation, but gives no further explanation of why this answer is correct. The student does not state what the small p-value means. The p-value is the probability (how strongly the relationship can be attributed to chance); the lower the probability, the higher the correlation. So if you want, for example, 95% confidence, the p-value must be below 0.05. The student used a Kendall tau correlation, which is good: it lets you compute several correlations at once.
2008-11-12 10:47:26 [Jolien Van Landeghem] [reply] 
The student's answer is correct. You use the tau correlation matrix because it is not sensitive to outliers. This way you find all possible correlations. The p-value is lowest for RCF, so cash flow has the smallest chance (1%) that the correlation is attributed to chance. You may conclude this because 0.01 is smaller than 0.05 (the threshold for a significant correlation). The tau coefficient also indicates correlation: it is close to 1, hence a significant correlation. RCF is therefore the best predictor of RNR. The Kendall tau correlation plot leads to the same conclusion.

Dataseries X:

4.2	4.8	20.8	0.9	39.6
2.6	-4.2	17.1	0.85	36.1
3	1.6	22.3	0.83	34.4
3.8	5.2	25.1	0.84	33.4
4	9.2	27.7	0.85	34.8
3.5	4.6	24.9	0.83	33.7
4.1	10.6	29.5	0.83	36.3
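
The reviewers above disagree on whether 0.01 is a correlation or a p-value. A minimal R sketch that recomputes both numbers for the RNR/RCF pair from the data series may help. The column names are an assumption: they are not listed with the data and are inferred here from the order of the pairs in the results table.

```r
# Data series X as listed above: 7 observations (rows) of 5 series (columns).
x <- matrix(c(
  4.2,  4.8, 20.8, 0.90, 39.6,
  2.6, -4.2, 17.1, 0.85, 36.1,
  3.0,  1.6, 22.3, 0.83, 34.4,
  3.8,  5.2, 25.1, 0.84, 33.4,
  4.0,  9.2, 27.7, 0.85, 34.8,
  3.5,  4.6, 24.9, 0.83, 33.7,
  4.1, 10.6, 29.5, 0.83, 36.3
), ncol = 5, byrow = TRUE)

# Assumption: column order inferred from the results table, not stated with the data.
colnames(x) <- c("RNVM", "RNR", "RCF", "RLEZ", "REV")

res <- cor.test(x[, "RNR"], x[, "RCF"], method = "kendall")
tau <- unname(res$estimate)  # strength of the monotone association (about 0.81)
p   <- res$p.value           # chance of a tau at least this extreme if the series were unrelated
print(c(tau = tau, p.value = p))
```

The estimate is the rank correlation itself; the p-value only says how surprising that estimate would be under independence. With seven observations, the two should be read together rather than interchanged.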

Summary of computational transaction
Raw Input	view raw input (R code)
Raw Output	view raw output of R engine
Computing time	2 seconds
R Server	'Gwilym Jenkins' @ 72.249.127.135
R Framework error message	The field 'Names of X columns' contains a hard return which cannot be interpreted. Please, resubmit your request without hard returns in the 'Names of X columns'.

Kendall tau rank correlations for all pairs of data series
pair	tau	p-value
tau( RNVM , RNR )	0.714285714285714	0.0301587301587301
tau( RNVM , RCF )	0.523809523809524	0.136111111111111
tau( RNVM , RLEZ )	0.264628062012482	0.427262856745706
tau( RNVM , REV )	0.333333333333333	0.381349206349206
tau( RNR , RCF )	0.80952380952381	0.0107142857142857
tau( RNR , RLEZ )	-0.0529256124024963	0.873844698517373
tau( RNR , REV )	0.0476190476190476	1
tau( RCF , RLEZ )	-0.264628062012482	0.427262856745706
tau( RCF , REV )	-0.142857142857143	0.772619047619048
tau( RLEZ , REV )	0.370479286817474	0.266379923342483
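
To make the tau column less opaque, the value for the RNR/RCF pair can be rebuilt by counting concordant and discordant pairs of observations. This is only an illustrative sketch: it uses the simple formula (concordant minus discordant, divided by the number of pairs), which matches the reported value only because neither series contains ties. The vectors are copied from the data series; treating them as RNR and RCF is an assumption inferred from the table.

```r
# Assumed to be the RNR and RCF series (second and third data columns).
rnr <- c(4.8, -4.2, 1.6, 5.2, 9.2, 4.6, 10.6)
rcf <- c(20.8, 17.1, 22.3, 25.1, 27.7, 24.9, 29.5)

idx <- combn(length(rnr), 2)  # all 21 pairs of observations (i < j)

# +1 when both series move in the same direction between i and j, -1 otherwise.
s <- sign(rnr[idx[1, ]] - rnr[idx[2, ]]) * sign(rcf[idx[1, ]] - rcf[idx[2, ]])

concordant <- sum(s > 0)  # 19
discordant <- sum(s < 0)  # 2
tau <- (concordant - discordant) / ncol(idx)
print(tau)
```

Only two of the 21 pairs of observations are ordered differently by the two series, which is why tau is close to 1.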

Figure 1: scatterplot matrix of the data series (histograms on the diagonal, smoothed scatterplots above it, Kendall tau p-values below it).

R code (references can be found in the software module):

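# x, y and main are supplied by the R framework and are not defined here.
# y is indexed with one data series per row; main is the plot title.

# Lower panel: print the p-value of the Kendall tau test for one pair of series.
panel.tau <- function(x, y, digits=2, prefix='', cex.cor, ...)
{
  usr <- par('usr'); on.exit(par(usr = usr))  # restore the user coordinates on exit
  par(usr = c(0, 1, 0, 1))
  rr <- cor.test(x, y, method='kendall')
  r <- round(rr$p.value, 2)
  txt <- format(c(r, 0.123456789), digits=digits)[1]
  txt <- paste(prefix, txt, sep='')
  # Scale the text to the panel unless an explicit size is given
  # (the original left cex undefined when cex.cor was supplied).
  cex <- if (missing(cex.cor)) 0.5/strwidth(txt) else cex.cor
  text(0.5, 0.5, txt, cex = cex)
}

# Diagonal panel: histogram of one series, rescaled to fit the panel.
panel.hist <- function(x, ...)
{
  usr <- par('usr'); on.exit(par(usr = usr))
  par(usr = c(usr[1:2], 0, 1.5))
  h <- hist(x, plot = FALSE)
  breaks <- h$breaks; nB <- length(breaks)
  y <- h$counts; y <- y/max(y)
  rect(breaks[-nB], 0, breaks[-1], y, col='grey', ...)
}

# Scatterplot matrix: the series are the rows of y, so transpose before plotting.
bitmap(file='test1.png')
pairs(t(y), diag.panel=panel.hist, upper.panel=panel.smooth, lower.panel=panel.tau, main=main)
dev.off()

# The table.* helpers are loaded from the framework file 'createtable'.
load(file='createtable')
a <- table.start()
a <- table.row.start(a)
a <- table.element(a, 'Kendall tau rank correlations for all pairs of data series', 3, TRUE)
a <- table.row.end(a)
a <- table.row.start(a)
a <- table.element(a, 'pair', 1, TRUE)
a <- table.element(a, 'tau', 1, TRUE)
a <- table.element(a, 'p-value', 1, TRUE)
a <- table.row.end(a)

n <- nrow(y)  # number of data series
n
cor.test(y[1,], y[2,], method='kendall')  # diagnostic output for the first pair

# Strip stray line breaks from the series names (see the framework error message).
series <- trimws(dimnames(t(x))[[2]])

# One table row per unordered pair of series (i < j).
for (i in 1:(n-1))
{
  for (j in (i+1):n)
  {
    a <- table.row.start(a)
    dum <- paste('tau(', series[i], ',', series[j], ')')
    a <- table.element(a, dum, header=TRUE)
    r <- cor.test(y[i,], y[j,], method='kendall')
    a <- table.element(a, r$estimate)
    a <- table.element(a, r$p.value)
    a <- table.row.end(a)
  }
}
a <- table.end(a)
table.save(a, file='mytable.tab')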
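
The module above depends on framework objects (x, y, main) and on the table.* helpers loaded from 'createtable', so it cannot be run on its own. The sketch below shows one way the same pairwise loop could be reproduced in plain R. The function name kendall_pairs is hypothetical, and the row names are assumed from the results table. Only the three series without tied values are used, so that cor.test can compute exact p-values without warnings.

```r
# Hypothetical helper: Kendall tau test for every pair of rows of a matrix
# (one data series per row, as in the module above).
kendall_pairs <- function(y) {
  stopifnot(is.matrix(y), nrow(y) >= 2)
  idx <- combn(nrow(y), 2)  # same (i < j) order as the nested loops
  rows <- lapply(seq_len(ncol(idx)), function(k) {
    i <- idx[1, k]
    j <- idx[2, k]
    r <- cor.test(y[i, ], y[j, ], method = "kendall")
    data.frame(
      pair = paste0("tau(", rownames(y)[i], ", ", rownames(y)[j], ")"),
      tau = unname(r$estimate),
      p.value = r$p.value,
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

# Assumed series names; values copied from the first three data columns.
y <- rbind(
  RNVM = c(4.2, 2.6, 3, 3.8, 4, 3.5, 4.1),
  RNR  = c(4.8, -4.2, 1.6, 5.2, 9.2, 4.6, 10.6),
  RCF  = c(20.8, 17.1, 22.3, 25.1, 27.7, 24.9, 29.5)
)

result <- kendall_pairs(y)
print(result)
```

Returning a data frame rather than writing to a table object keeps the sketch independent of the framework; the rows come out in the same order as in the results table.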
