I’m not sure how to say this, so I think I’ll just spit it out: there are some major changes on the horizon for kStats. These are good changes, and I will go into more detail further on, but they are changes that may negatively impact current users of the plugin. Out of 6,000+ downloads, this could mean anywhere between two and three dozen web sites! </tongue-in-cheek>
Humble Beginnings
As I’ve mentioned a few hundred times, kStats began as a simple fork of StatPress Reloaded, meant to speed things up and create a plugin better suited to larger WordPress sites.
Because of how StatPress chose to store statistics and report on them, a StatPress table tended to grow extremely large, extremely fast. Not only did this create a severe bottleneck when visitors tried to access your site, but it was sadly even worse when you tried to retrieve the resulting data on the administrative side.
Because of this, I figured the best approach for kStats was to restructure the existing format to use aggregated data: much smaller, more accessible records that gave you a fast look at real-time numbers combined with past totals.
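For anybody who hasn’t seen aggregation in practice, here’s a rough sketch of the idea. This is not the kStats code: the `raw_hits` and `daily_totals` tables and their columns are made up for illustration, and I’m using Python with SQLite in place of PHP and MySQL just to keep it self-contained.

```python
import sqlite3


def build_daily_totals(conn):
    """Roll raw hit rows up into one small row per day (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS daily_totals ("
        " day TEXT PRIMARY KEY,"
        " hits INTEGER NOT NULL,"
        " visitors INTEGER NOT NULL)"
    )
    # One row per calendar day: total hits and distinct visitor IPs.
    conn.execute(
        "INSERT OR REPLACE INTO daily_totals (day, hits, visitors) "
        "SELECT date(ts), COUNT(*), COUNT(DISTINCT ip) "
        "FROM raw_hits GROUP BY date(ts)"
    )
    conn.commit()


def demo():
    """Load a few made-up hits and return the aggregated rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE raw_hits (ts TEXT, ip TEXT, url TEXT)")
    conn.executemany(
        "INSERT INTO raw_hits VALUES (?, ?, ?)",
        [
            ("2009-01-05 08:00:00", "10.0.0.1", "/"),
            ("2009-01-05 09:30:00", "10.0.0.1", "/about"),
            ("2009-01-05 10:15:00", "10.0.0.2", "/"),
            ("2009-01-06 07:45:00", "10.0.0.3", "/"),
        ],
    )
    build_daily_totals(conn)
    return conn.execute(
        "SELECT day, hits, visitors FROM daily_totals ORDER BY day"
    ).fetchall()
```

Four raw rows collapse into two summary rows, which is where the speed comes from. It’s also where the detail goes: once the raw rows are discarded, there’s no way to tell from `daily_totals` which URLs were hit.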
Statistics are important
This approach worked great at first. kStats was fast: it recorded data quickly and reported it to the site administrator just as fast. But over time, it has started to show its weakness. It gives you the numbers, but what about the meat of your stats? What if you want to see what happened last month? What about 3 weeks ago? Or 16 weeks ago?
I started to develop a new aggregate process which would store almost four times as much data, allowing the existing system to remain in place, and the reports to grow in features considerably. I could still sense impending doom with this approach though. What about further down the road? Would there be a ceiling to how much data I could store in an aggregate form? Would I just move the bloat from one storage method to another, winding up in the same pit that StatPress did?
Learning from our mistakes
In the end, I think StatPress was on the right track. I think I was too, but instead of forking down a completely different road, I think the best approach would be to learn a lesson from both approaches, and to turn that into a single new approach.
I was drowning in work over the holidays, while simultaneously trying to keep up with family functions. I didn’t get a lot of time to work on kStats, and I must apologize to anybody I left hanging as a result. There were some bug fixes desperately needed, and while they’re out now, they should’ve been out then. However, while I didn’t have as much time to work on kStats as I would have liked, I did find some time here and there in the mornings and evenings to get some reading in.
It’s all about the database
I consider myself fairly competent when it comes to MySQL and, more importantly, database design. The more I learn though (as with life), the more I realize I don’t know. SQL on its own, without the various layers that bring it to life on the web, is an art form in its own right.
I’ve been studying up on the subject, because I know it’s a very important part of this design process, as well as the design and development of a few other projects I’m working on right now. I think I’ve drawn an outline for a new database structure that will allow kStats to break the mold a little when it comes to statistics recording and analysis for WordPress.
Using a combination of MySQL engines, such as the ARCHIVE engine, and a completely redesigned schema, I think I can bring the speed without sacrificing any data. There’s no reason you shouldn’t be able to look up historical periods of time and see exactly what occurred. There’s no reason you shouldn’t be able to click a button and produce a detailed report using any piece of recorded data as the focus (as opposed to being restricted to predefined reports). This is what statistics are for after all, right?
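To give a feel for what “any piece of recorded data as the focus” could look like, here’s a small sketch. Everything in it is an assumption for illustration: the `raw_hits` table, its columns, and the `report` function are hypothetical, not the new kStats schema, and SQLite stands in for MySQL (so the ARCHIVE engine itself isn’t shown).

```python
import sqlite3

# Columns a report may be grouped by (hypothetical; whitelisted so the
# column name can be placed into the SQL text safely).
ALLOWED_FOCUS = {"url", "referrer", "browser"}


def report(conn, focus, start, end):
    """Count hits per value of `focus` for start <= ts < end."""
    if focus not in ALLOWED_FOCUS:
        raise ValueError("unsupported report focus: %s" % focus)
    sql = (
        "SELECT {col}, COUNT(*) AS hits FROM raw_hits "
        "WHERE ts >= ? AND ts < ? "
        "GROUP BY {col} ORDER BY hits DESC, {col}"
    ).format(col=focus)
    return conn.execute(sql, (start, end)).fetchall()


def demo_report(focus):
    """Load a few made-up hits and report on December 2008."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE raw_hits (ts TEXT, url TEXT, referrer TEXT, browser TEXT)"
    )
    conn.executemany(
        "INSERT INTO raw_hits VALUES (?, ?, ?, ?)",
        [
            ("2008-12-01 10:00:00", "/", "google.com", "Firefox"),
            ("2008-12-02 11:00:00", "/about", "google.com", "Safari"),
            ("2008-12-15 12:00:00", "/", "example.org", "Firefox"),
            ("2009-01-03 09:00:00", "/", "google.com", "Firefox"),
        ],
    )
    return report(conn, focus, "2008-12-01", "2009-01-01")
```

Calling `demo_report("browser")` and `demo_report("url")` runs the same query over the same rows with a different focus, and the date range can be any period you like. That only works because the raw rows are still around, which is exactly what the aggregate-only design gave up.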
The bad news
While the plugin is still technically considered a beta, I would rather not release a version that destroys all the data that’s been recorded to date. This may be unavoidable to some degree though.
While data that currently exists in the raw table of your kStats install is easy enough to move over to the new format with no data loss, the data that’s already been summarized might not be so easy to transfer.
Before I dive headlong into this new strategy I’m going to do everything in my power to determine how we can avoid such a loss. It may be as easy as providing a legacy utility which will store the historical data in a different format, and retrieve it when building certain reports. The problem I’m facing is that this aggregate data may be unusable in producing certain reports that require a particular level of accuracy.
It’s tough. I have a project board dedicated to brainstorming this, so I’ll be sure to keep everybody up to date on how it’s going to go down, well before it does.



