Wednesday, March 11, 2015

Extracting Heatmap

Inspired by this tweet, I wanted to try to do something similar in JavaScript.

Fortunately, I had this old post Chart from R + Color from Javascript to serve as a reference, and I got lots of help from these links.

In a couple of hours, I got this crude but working rendering, complete with a d3.js brush to get the scale.  Then, since this is sort of a finance blog, I imagined we found an old correlation heatmap like the one in Pretty Correlation Map of PIMCO Funds.  Although we could guess at the correlation values, I thought it would be a lot more fun to get live values.  Try it out below.

  1. Brush over the scale / legend
  2. Input scale min and max
  3. Mouseover color areas in the chart

As I said, it is rough, but it works. It needs a little UI work :)
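The three steps above boil down to mapping a color back to a number through the legend. Here is a minimal sketch of that idea in R rather than the JavaScript of the demo. The blue-white-red legend, the -1 to 1 range, and the helper name color_to_value are all assumptions for illustration, not what the demo actually uses.

```r
# hypothetical legend: 101 colors from blue through white to red,
# standing in for the colors read off the brushed scale
legend_cols <- colorRampPalette(c("blue", "white", "red"))(101)

# map a color back to a value by finding the nearest legend color in RGB space
# and converting its position on the legend to the scale min/max
color_to_value <- function(col, legend_cols, scale_min, scale_max) {
  legend_rgb <- col2rgb(legend_cols)        # 3 x n matrix of legend colors
  target     <- col2rgb(col)[, 1]           # RGB of the color under the mouse
  d   <- colSums((legend_rgb - target)^2)   # squared distance to each legend color
  pos <- (which.min(d) - 1) / (length(legend_cols) - 1)
  scale_min + pos * (scale_max - scale_min)
}

# a color three quarters of the way along a -1 to 1 scale
value <- color_to_value(legend_cols[76], legend_cols, -1, 1)
```

Nearest-color lookup is the simplest choice; a legend with repeated colors would need something smarter.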

Thursday, March 5, 2015

Is Time Series Clustering Meaningless? (lots of dplyr)

A kind reader directed me in a comment on Experiments in Time Series Clustering to this paper.

Clustering of Time Series Subsequences is Meaningless: Implications for Previous and Future Research

Eamonn Keogh  and Jessica Lin

Computer Science & Engineering Department University of California – Riverside

http://www.cs.ucr.edu/~eamonn/meaningless.pdf

As I said in my last post, I don’t know what I’m doing, so I have no basis for discussing or arguing time series clustering.  After reading the paper a couple of times, I think I understand their points, and I do not think what I am doing is “meaningless”.  In their financial time series examples, they use prices and speak of trying to find patterns.  I simply want to classify which years are most alike by various characteristics, such as autocorrelation of returns (not prices), distribution of returns, and all sorts of other classifiers.
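To make the prices versus returns distinction concrete, here is a small sketch on simulated data (not the S&P 500, and not the paper's data). The series length, the seed, and the helper lag1_acf are assumptions for illustration only.

```r
set.seed(42)
# hypothetical data: 1000 simulated daily returns and the price path they imply
returns <- rnorm(1000, mean = 0, sd = 0.01)
prices  <- 100 * cumprod(1 + returns)

# lag-1 autocorrelation; element 1 of $acf is lag 0, element 2 is lag 1
lag1_acf <- function(x) acf(x, lag.max = 1, plot = FALSE)$acf[2]

acf_prices  <- lag1_acf(prices)   # a price level is dominated by its own past
acf_returns <- lag1_acf(returns)  # independent returns show little autocorrelation
```

Prices are almost perfectly autocorrelated simply because each price is the previous price plus a small change, so the ACF of prices says very little about how one year differs from another.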

More than anything, this whole exercise gave me a good excuse to dig much, much deeper.  Longtime readers might be wondering where the interactive plots are.  I wanted to share what I have done so far, hoping that readers might elaborate, argue, or point me in good directions.

Regardless of your interest in time series clustering, you might enjoy the dplyr and piping that I used to generate the results.  Also, I have not seen dplyr’s do applied to autocorrelation (ACF), so you might want to check that out in the last snippet of code.

All of the code for this post and last post is in this Github repo.

library(TSclust)
library(quantmod)
library(dplyr)
library(pipeR)
library(tidyr)
library(epiwidgets)  # provides treewidget

sp5 <- getSymbols("^GSPC",auto.assign=F,from="1900-01-01")[,4]

sp5 %>>%
  # dplyr doesn't like xts, so make a data.frame
  (
    data.frame(
      date = index(.)
      ,price = .[,1,drop=T]
    )
  ) %>>%
  # add a column for Year
  mutate( year = as.numeric(format(date,"%Y"))) %>>%
  # group by our new Year column
  group_by( year ) %>>%
  # within each year, find what day in the year so we can join
  mutate( pos = rank(date) ) %>>%
  # 1-day simple rate of change; dplyr's lag takes n rather than k
  mutate( roc = price/lag(price,n=1) - 1 ) %>>%
  # can remove date
  select( -c(date,price) ) %>>%
  as.data.frame %>>%
  # years as columns and pos as rows
  spread( year, roc ) %>>%
  # remove last year since assume not complete
  ( .[,-ncol(.)] ) %>>%
  # remove pos since index will be same
  select( -pos ) %>>%
  # fill NAs with 0
  na.fill( 0 ) %>>%
  t %>>%
  # save the wide matrix for later
  (~sp_wide) %>>%
  # use TSclust diss; note lots of METHOD options
  diss( METHOD="ACF" ) %>>%
  hclust %>>%
  # save the clustering for later
  (~hc) %>>%
  ape::as.phylo() %>>%
  treewidget #%>>%
  #htmlwidgets::as.iframe(file="index.html",selfcontained=F,libdir = "./lib")

library(lattice)
library(ggplot2)
# get wide to long the hard way
# could have easily changed to above pipe to save long
# as an intermediate step
# but this makes for a fun lapply
# and also we can add in our cluster here
sp_wide %>>%
  (
    lapply(
      rownames(.)
      ,function(yr){
        data.frame(
          year = as.Date(paste0(yr,"-01-01"),"%Y-%m-%d")
          ,cluster = cutree(hc,10)[yr]
          ,pos = 1:length(.[yr,])
          ,roc = .[yr,]
        )
      }
    )
  ) %>>%
  (do.call(rbind,.)) %>>%
  (~sp_long)


sp_long %>>%
  ggplot( aes( x = roc, group = year, color = factor(cluster) ) ) %>>%
  + geom_density() %>>%
  + facet_wrap( ~ cluster, ncol = 1 ) %>>%
  + xlim(-0.05,0.05) %>>%
  + labs(title='Density of S&P 500 Years Clustered by TSclust') %>>%
  + theme_bw() %>>%
  # thanks to my friend Zev Ross for his cheatsheet
  + theme( plot.title = element_text(size=15, face="bold", hjust=0) ) %>>%
  + theme( legend.position="none" ) %>>%
  + scale_color_brewer( palette="Paired" )




# explore autocorrelations
sp5 %>>%
  # dplyr doesn't like xts, so make a data.frame
  (
    data.frame(
      date = index(.)
      ,price = .[,1,drop=T]
    )
  ) %>>%
  # add a column for Year
  mutate( year = as.numeric(format(date,"%Y"))) %>>%
  # group by our new Year column
  group_by( year ) %>>%
  # within each year, find what day in the year so we can join
  mutate( pos = rank(date) ) %>>%
  # 1-day simple rate of change; dplyr's lag takes n rather than k
  mutate( roc = price/lag(price,n=1) - 1 ) %>>%
  # can remove date
  select( -c(date,price) ) %>>%
  as.data.frame %>>%
  # years as columns and pos as rows
  spread( year, roc ) %>>%
  # remove last year since assume not complete
  ( .[,-ncol(.)] ) %>>% t -> sP

sp_long %>>%
  group_by( cluster, year ) %>>%
  do(
    . %>>%
      (
        clustd ~
          acf(clustd$roc,plot=F) %>>%
          (a ~
            data.frame(
              cluster = clustd[1,2]
              ,year = clustd[1,1]
              ,lag = a$lag[-1]
              ,acf = a$acf[-1]
            )
          )
      )
  ) %>>%
  as.data.frame %>>%
  ggplot( aes( x = factor(cluster), y = acf, color = factor(cluster) ) ) %>>%
  + geom_point() %>>%
  + facet_wrap( ~lag, ncol = 4 ) %>>%
  + labs(title='ACF of S&P 500 Years Clustered by TSclust') %>>%
  + theme_bw() %>>%
  # thanks to my friend Zev Ross for his cheatsheet
  + theme(
    plot.title = element_text(size=15, face="bold", hjust=0)
    ,legend.title=element_blank()
  ) %>>%
  + theme(legend.position="none") %>>%
  + scale_color_brewer(palette="Paired")
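For anyone who finds the nested pipeR lambdas in that last snippet hard to follow, here is a stripped-down sketch of the same idea (one ACF per group with do) on simulated data. The toy data frame, the five-lag cutoff, and the use of dplyr's own %>% pipe are my assumptions for illustration; this is not a drop-in replacement for the code above.

```r
library(dplyr)

set.seed(1)
# hypothetical data: three "years" of 250 simulated daily returns each
toy <- data.frame(
  year = rep(2001:2003, each = 250),
  roc  = rnorm(750, sd = 0.01)
)

acf_by_year <- toy %>%
  group_by(year) %>%
  # do() runs the block once per group; . is that group's data
  do({
    a <- acf(.$roc, lag.max = 5, plot = FALSE)
    # drop lag 0, which is always 1
    data.frame(lag = as.vector(a$lag)[-1], acf = as.vector(a$acf)[-1])
  }) %>%
  as.data.frame()
```

The result has one row per year and lag, which is the long shape ggplot2 wants for faceting by lag.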


If you’ve made it this far, I would love to hear from you.

Monday, March 2, 2015

Experiments in Time Series Clustering

Last night I spotted this tweet about the R package TSclust.

I should start by saying that I really don’t know what I’m doing, so be warned.  I thought it would be interesting to apply TSclust to the S&P 500 price time series.  I took the 1-day simple rate of change, grouped by year with dplyr, and then indexed by the day of the year, all in one pipeR pipeline.  Since the TSclust paper

TSclust: An R Package for Time Series Clustering

Journal of Statistical Software, Volume 62, Issue 1

November 2014

http://www.jstatsoft.org/v62/i01/paper

demonstrates interoperability with hclust in its OECD interest rate example (Section 5.2), I thought I could visualize the results nicely with treewidget from the epiwidgets package.  Just because the htmlwidget was designed for phylogeny doesn’t mean we can’t use it for finance.  Here is the result.

For reference and searching, I’ll copy the code below, but all of this can be found in this Github repo.


library(TSclust)
library(quantmod)
library(dplyr)
library(pipeR)
library(tidyr)
library(epiwidgets)

sp5 <- getSymbols("^GSPC",auto.assign=F,from="1900-01-01")[,4]

sp5 %>>%
  # dplyr doesn't like xts, so make a data.frame
  (
    data.frame(
      date = index(.)
      ,price = .[,1,drop=T]
    )
  ) %>>%
  # add a column for Year
  mutate( year = as.numeric(format(date,"%Y"))) %>>%
  # group by our new Year column
  group_by( year ) %>>%
  # within each year, find what day in the year so we can join
  mutate( pos = rank(date) ) %>>%
  # 1-day simple rate of change; dplyr's lag takes n rather than k
  mutate( roc = price/lag(price,n=1) - 1 ) %>>%
  # can remove date
  select( -c(date,price) ) %>>%
  as.data.frame %>>%
  # years as columns and pos as rows
  spread( year, roc ) %>>%
  # remove last year since assume not complete
  ( .[,-ncol(.)] ) %>>%
  # remove pos since index will be same
  select( -pos ) %>>%
  # fill NAs with 0
  na.fill( 0 ) %>>%
  t %>>%
  # use TSclust diss; note lots of METHOD options
  diss( METHOD="ACF" ) %>>%
  hclust %>>%
  ape::as.phylo() %>>%
  treewidget
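If you do not have TSclust installed, the clustering step can be roughly approximated in base R. This sketch is my own assumption about the spirit of an ACF-based dissimilarity (Euclidean distance between vectors of autocorrelation coefficients), not TSclust's implementation, and the simulated "years" and the 10-lag cutoff are hypothetical.

```r
set.seed(123)
# hypothetical toy data: 6 "years" of 250 observations each,
# three with AR(1) persistence and three of pure noise
series <- rbind(
  t(replicate(3, as.numeric(arima.sim(list(ar = 0.6), n = 250)))),
  t(replicate(3, rnorm(250)))
)
rownames(series) <- 2001:2006

# one row of ACF coefficients (lags 1 to 10) per year
acf_features <- t(apply(
  series, 1,
  function(x) acf(x, lag.max = 10, plot = FALSE)$acf[-1]
))

# Euclidean distance between ACF vectors, then ordinary hierarchical clustering
hc_toy   <- hclust(dist(acf_features))
clusters <- cutree(hc_toy, k = 2)
```

Because the result is a plain hclust object, it can be passed to ape::as.phylo() and treewidget just like the TSclust version above.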

Tuesday, February 3, 2015

Financial Charts | Pan and Zoom

The htmlwidget for Week 2 over at Building Widgets claims to add pan and zoom interactivity to almost all R charts.  Since there were no tests on financial charts, I thought I would try it out on a couple.  It really does work.

Here is an example on an efficient frontier plotted from fPortfolio.

When we combine pipeR and htmlwidgets, we get a solid result from what I think is fairly elegant and understandable code.

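library(svgPanZoom)
library(SVGAnnotation)          # svgPlot
library(PerformanceAnalytics)   # chart.SnailTrail
library(pipeR)

# returns is assumed to be an xts object of periodic returns (see the Gist)
svgPanZoom(
  svgPlot({
    returns %>>%
      (cumprod( 1 + . )) %>>%
      (.[endpoints(.,"months")]) %>>%
      ( ./lag(.,k=1) - 1 ) %>>%
      chart.SnailTrail(
        colorset = RColorBrewer::brewer.pal(9,"Set1")[-6]
        ,add.names="none"
        ,width = 36
        ,step = 36
        ,legend.loc = "topright"
      )
  },height= 10, width = 16)
)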

An even more challenging test was chartSeries, and svgPanZoom still passed the test beautifully. See if it works on your machine.


library(quantmod)
getSymbols("SPY")
svgPanZoom(svgPlot({chartSeries(SPY)},width = 12, height = 8))

If you would like to reproduce the plots, all the code is in this Gist.

Friday, January 2, 2015

Will I fail?

I have committed to building an htmlwidget a week in 2015.  To isolate and separate the commitment from this blog, I set up a new site Building Widgets and Github repo.  The first post Can I Commit? provides meta introspection on commitment.

Can I commit to building an htmlwidget a week in the year 2015?

 

It seems we humans all struggle internally with commitment, and at the beginning of each year, we often become even more aware of this struggle in the form of New Year's Resolutions.  This site is not really a New Year's Resolution.  It is more a resolution that coincidentally falls at the beginning of the year, since htmlwidgets was released December 17.

 

I know through plenty of experiences with commitment failure that the pattern of commitment failure will assert itself throughout the life of this project…  Building Widgets “Can I Commit?”

I promise this will be the only crosspost.  Any future posts on this blog about htmlwidgets will only cover applications of the widgets, most likely for finance.

Tuesday, December 30, 2014

Widgets For Christmas

For Christmas, I generally want electronic widgets, but after six months of development, all I wanted this Christmas was htmlwidgets, and Santa RStudio/jj,joe,yihui and Santa Ramnath delivered early with this RStudio tweet on December 17th.

The major benefit of htmlwidgets is that it provides all three methods of bridging R with JavaScript/HTML mentioned in my Aug. 16, 2013 post I Want ggplot2/lattice and d3 (gridSVG–The Glue).  For htmlwidgets to be successful, though, it is not enough for the widgets to work; creating them must also be easy.

As a quick example, we can look at the DiagrammeR package released yesterday by Richard Iannone.  DiagrammeR launched in non-htmlwidgets form, severely hampering its ability to be used easily in multiple contexts.  Converting it to htmlwidgets seemed like a great opportunity to illustrate both the ease of htmlwidgets creation and the powerful infrastructure offered by htmlwidgets.  So yesterday, in a couple of hours (easy to create, check), most of them spent on examples, documentation, and testing, and with only a couple of lines of JavaScript (easy to create, check again), I was able to transform the DiagrammeR package into htmlwidgets.

I thought a finance diagram would be a great example for this blog, so off to Google Images I went looking for a good and also simple application and chose this from the Department of Finance Canada.

Here is what it looks like with DiagrammeR + mermaid.js.
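For a feel of how little code this takes, here is a hedged sketch. The edge list is made up for illustration (it is not the Department of Finance Canada diagram), and I am assuming the mermaid() function that current versions of DiagrammeR export; the string-building part is plain base R and works without the package.

```r
# hypothetical edge list, not the actual diagram from the post
edges <- data.frame(
  from = c("Households", "Banks"),
  to   = c("Banks", "Businesses"),
  stringsAsFactors = FALSE
)

# build a mermaid.js flowchart spec: one "A-->B" line per edge, left to right
spec <- paste0(
  "graph LR\n",
  paste0("  ", edges$from, "-->", edges$to, collapse = "\n")
)

# render only if DiagrammeR is installed; print the widget to view it
if (requireNamespace("DiagrammeR", quietly = TRUE)) {
  widget <- DiagrammeR::mermaid(spec)
}
```

Keeping the edges in a data.frame means the diagram can be generated from data instead of typed by hand.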

 

If I can come up with the resolve and commitment, I might have an announcement for 2015 – the year of the widget.

Happy New Year, and thanks for 4 good years of TimelyPortfolio.

Thursday, December 11, 2014

Out of Nowhere–Explore Text on a Path

I had not really stopped to think of this until I listened to this The Web Ahead podcast with Sara Soueidan.  What is really interesting about the tech world is how experts can seemingly pop up out of nowhere and become the authority on a topic.  In the podcast, this was the case with the interviewee Sara Soueidan.  We can find a similar example in Joni "Bologna" Trythall with SVG.

I find it even more fun when I can incorporate these experts’ content into R.  Let’s animate some text on a path as Joni does in her article "Animating SVG text On A Path", but instead of an arbitrary path, let’s use a line in a plot.

some text on a path

I’ll copy the code below.  Let me know if a tutorial would be helpful.

library(SVGAnnotation)
library(pipeR)
library(htmltools)

# make as basic a line plot as I know how in R
svg = svgPlot(plot(sin(seq(0,pi*3,0.2)),type="l")) %>>%
  # extract the XML and use htmlParse
  # to overcome namespace confusion and difficulty
  saveXML %>>% htmlParse

# with base R plots, we get clues with clip-path attributes
# in this case we know with some inspection
# there will be one g with a clip-path attribute
# and that g will contain our plotted line
getNodeSet(svg,"//g[contains(@clip-path,'url')]//path")[[1]] %>>%
  # let's add an id so we can reference this later
  ( addAttributes( node=., id = "ourline" ) )

# first step in adding text to a path
# make a new text node
textOnPath = newXMLNode("text")
# now the critical part to join the text to the path
addChildren(
  textOnPath
  , newXMLNode(
    "textPath"
    ,attrs=c( "xlink:href" = "#ourline" ) #our id given above
    ,"some text on a path" #some very creative saying
  )
)

# add our text node to the svg plot
addChildren(
  getNodeSet(svg,"//svg")[[1]]
  ,kids = list(textOnPath)
)
# see if it works by sending to our viewer/browser
getNodeSet(svg,"//svg")[[1]] %>>%
  saveXML %>>% HTML %>>% html_print

# let's continue our journey by exploring the startOffset attribute
# startOffset says where on the path to start our text
# what happens if we add startOffset = 30%
getNodeSet(svg, "//textPath")[[1]] %>>%
  ( addAttributes( node = . , startOffset = "30%" ) )
# find out the effect of startOffset by browsing
getNodeSet(svg,"//svg")[[1]] %>>%
  saveXML %>>% HTML %>>% html_print

# for our grand finale we can animate the text
# note: this might not work in your browser, so use Chrome
# add a child animate node with the same attributes as Joni's tutorial
getNodeSet(svg, "//textPath")[[1]] %>>%
  (
    addChildren(
      node = .
      , newXMLNode(
        "animate"
        ,attrs = c(
          attributeName="startOffset"
          ,values = "0;0.7;1"
          ,dur = "8s"
          ,repeatCount = "indefinite"
          ,keyTimes = "0;0.2;1"
        )
      )
    )
  )
# see the animated text
getNodeSet(svg,"//svg")[[1]] %>>%
  saveXML %>>% HTML %>>% html_print(viewer=utils::browseURL)

Thursday, December 4, 2014

No Reason to Read, Just Need an Outlet

I don’t intend for this to be a bitch and moan post, and I’m not sure there is any real objective other than that I feel like I need an outlet.  This happens to be my only one.

For those out there not engaged in money management, it can be pleasantly simple and maybe even entertaining to poke fun at those of us who foolishly choose to call ourselves portfolio managers.  However, this business can be excruciating, depressing, and frustrating.  Generally, our biggest benefit to our clients is insulating them from their own stupidity, but often this task becomes impossible, usually at the time when client stupidity does the most damage to themselves.

While distracting myself with my insatiable curiosity through academic research, technology, and data visualization (just look at the last couple of years of posts) helps, I cannot forget that I get paid to manage money, which generally just ain’t no fun as failing is the norm, and the brief moments of “success” go unnoticed and disappear with no lasting memory or permanent effect. 

Most would naively say go do something else, but I still feel this delusional quest really can help those few clients who trust and endure.

Tuesday, December 2, 2014

Much Better Animated Paths | Christmas SVG

Just after I made my really ugly animated turkey sketch (see post), I saw this much better set of Christmas icons in the Smashing Magazine Article Freebie Christmas Icon Set from Manuela Langella.  While I still remember how to do this, I thought I would use the same techniques in R using rvest + XML + htmltools to animate the paths with vivus.js.  In the iframe below is the result on the Santa icon.

Code: http://gist.github.com/39394d6e37a7fd878cab#file-code-R

Wednesday, November 26, 2014

Happy Thanksgiving | More Examples of XML + rvest with SVG

I did not intend for this little experiment to become a post, but I think the code builds nicely on the XML + rvest combination (also see yesterday’s post) for working with XML/HTML/SVG documents in R.

It all started when I was playing on my iPhone in the Sketchbook app and drew a really bad turkey.  Even though the turkey was bad, I thought it would be fun to combine with vivus.js.  However, Sketchbook does not export SVG, so I exported as PDF and imported into Inkscape.  The end result was a still very messy SVG file, so I thought it would be a great test / application of my new skills with rvest + XML. The code opens the SVG, grabs all the path nodes, assembles those into an svg tag with id = "turkey", and then adds a script to use the addDependency for vivus.js.
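The grab-the-paths-and-reassemble step can be sketched with the XML package alone. The inline SVG string below is a made-up stand-in for the messy Inkscape file, and the vivus.js step is left out, so treat this as an illustration under those assumptions rather than the code behind the post.

```r
library(XML)

# hypothetical stand-in for the messy exported SVG: paths buried in nested groups
svg_text <- paste0(
  '<svg><g><path d="M0 0 L10 10"/>',
  '<g><path d="M5 5 L20 5"/></g></g></svg>'
)

# htmlParse sidesteps SVG namespace headaches, as in the text-on-a-path post
doc <- htmlParse(svg_text, asText = TRUE)

# grab every path node no matter how deeply it is nested
paths <- getNodeSet(doc, "//path")

# assemble copies of the paths into a fresh svg node with id = "turkey"
turkey <- newXMLNode("svg", attrs = c(id = "turkey"))
addChildren(turkey, kids = lapply(paths, xmlClone))
```

From here, saveXML(turkey) gives a flat SVG string that is much easier to hand to an animation library than the original nested file.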

Happy Thanksgiving to all the US readers out there.