Why forecast sales?
Humans have the magical ability to plan for future events, for future gain. It’s not quite a uniquely human trait, though: apparently ravens can match a 4-year-old.
An abundance of data, and some very nice R packages, make our ability to plan all the more powerful.
A couple of months ago we looked at sales from an historical perspective in “Digital Marketplace. Six months later.” In this post, we’ll use the sales data to March 31st to model a time-series forecast for the next two years. The techniques apply to any time series with characteristics of trend, seasonality or longer-term cycles.
Continue reading “But can ravens forecast?”
With tensions heightened recently at the United Nations, one might wonder whether we’ve drawn closer, or farther apart, over the decades since the UN was established in 1945.
We’ll see if we can garner a clue by performing cluster analysis on the General Assembly voting of five of the founding members. We’ll focus on the five permanent members of the Security Council. Then later on we can look at whether Security Council vetoes corroborate our findings.
Continue reading “An East-West less divided?”
Revisiting an old post
Last September I wrote a post entitled “Is the Government realising its ambition for SMEs on G-Cloud?” Six months on, I wanted to revisit and update this article, fold in a second Digital Marketplace framework, and share the R code here. Revisiting an old post also provides an opportunity to see if one can simplify and improve older code.
Continue reading “Digital Marketplace. Six months later.”
Supervised machine learning
In the “cluster of six”, we used unsupervised machine learning to reveal hidden structure in unlabelled data and analyse the voting patterns of Labour Members of Parliament. In this blog post, we’ll use supervised machine learning to see how well we can predict crime in London. Perhaps not specific crimes. But we can use recorded crime summary data at London borough level, non-personal aggregated data licensed under the Open Government Licence, to predict crime counts.
Along the way, we’ll see the pay-off from an exploration of multiple models.
Continue reading “Criminal goings-on in a random forest”
Unsupervised machine learning
Hansard reports what’s said in the UK Parliament, sets out details of divisions, and records decisions taken during a sitting. The hansard R package provides functions to import its data.
Using the Hansard API (Application Programming Interface), we’ll apply unsupervised machine learning to analyse the voting patterns of 219 Labour Members of Parliament (MPs). We’ll consider all divisions (results of the votes) in the UK House of Commons since the 2017 general election.
Continue reading “The “cluster of six””
Experimentation with geospatial mapping
Recently I experimented with geospatial mapping techniques in R. I looked at both static and interactive maps. Embedding the media into a WordPress blog would be simple enough with a static map. An interactive map would require (for me) a new technique to retain the interactivity inside a blog post.
My website visitor log, combined with longitude and latitude data from MaxMind’s GeoLite2, offered a basis for analysis. Although GeoLite2 is less precise than the GeoIP2 database, it would be more than adequate for my purpose of getting to country and city level. I settled on the Leaflet package for visualisation given the interactivity and pleasing choice of aesthetics.
The results, however, were a little puzzling.
Continue reading “Surprising stories hide in seemingly mundane data”
Why take a deeper look at G-Cloud categories?
The last blog – “The key to unlocking services on G-Cloud” – touched briefly upon their overlap. And as the concept of G-Cloud categories was newly introduced in the current iteration (G9), it may be worth taking a deeper look at their impact in advance of the next.
So, in this blog, I want to explore the extent and effects of category overlap, and see what insights may be drawn. For example, are some categories of less value than others? Could some suppliers gain an advantage, perhaps by aligning each service to many categories so that buyers find them irrespective of their carefully crafted search criteria?
Continue reading “Do G-Cloud categories need a tweak?”