mark.watero.us

Wordpress stuff, a statistics plugin, and jello

Articles found for the word ‘plugins’

The future of kStats Reloaded statistics for Wordpress

4 comments

I’m not sure how to say this, so I think I’ll just spit it out; there are some major changes on the horizon for kStats. These are good changes, and I will go into more detail farther on, but they are changes that may negatively impact current users of the plugin. Out of 6,000+ downloads, this could mean anywhere between 2-3 dozen web sites! </tongue-in-cheek>

Humble Beginnings

As I’ve mentioned a few hundred times, kStats began as a simple fork of StatPress Reloaded to speed things up and create a plugin more suited to larger applications of Wordpress.

Due to the nature of how StatPress chose to store statistics and report on them, a StatPress table had a tendency of growing extremely large, extremely fast. Not only did this approach create a severe bottleneck when visitors tried to access your site, but it was sadly even worse when you tried to retrieve the resulting data on the administrative side.

Because of this, I figured the best approach for kStats was to restructure the existing format to use aggregated data: much smaller, more accessible records that give you a fast look at real-time numbers combined with past totals.
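To make the idea concrete, here is a minimal sketch of that kind of roll-up. It is not the kStats code: the table and column names (raw_hits, daily_totals, and so on) are made up for illustration, and SQLite stands in for MySQL so the example runs on its own.

```python
import sqlite3

# Hypothetical schema for illustration; not the real kStats tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE raw_hits (hit_date TEXT, ip TEXT, url TEXT);
    CREATE TABLE daily_totals (
        hit_date TEXT PRIMARY KEY, visitors INTEGER, pageviews INTEGER
    );
""")

def aggregate(conn, before_date):
    """Roll raw hits older than before_date into one row per day, then drop them."""
    with conn:  # one transaction: the summary and the delete succeed or fail together
        conn.execute("""
            INSERT INTO daily_totals (hit_date, visitors, pageviews)
            SELECT hit_date, COUNT(DISTINCT ip), COUNT(*)
            FROM raw_hits
            WHERE hit_date < ?
            GROUP BY hit_date
        """, (before_date,))
        conn.execute("DELETE FROM raw_hits WHERE hit_date < ?", (before_date,))

conn.executemany("INSERT INTO raw_hits VALUES (?, ?, ?)", [
    ("2009-12-01", "10.0.0.1", "/"),
    ("2009-12-01", "10.0.0.1", "/about"),
    ("2009-12-01", "10.0.0.2", "/"),
    ("2009-12-02", "10.0.0.3", "/"),
])
aggregate(conn, "2009-12-02")
totals = conn.execute("SELECT * FROM daily_totals").fetchall()
remaining = conn.execute("SELECT COUNT(*) FROM raw_hits").fetchone()[0]
```

The raw table stays small because only the current day's hits survive each run. The catch is that once the raw rows are gone, the totals can only answer the questions their columns were designed for.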

Statistics are important

This approach worked great at first. kStats was fast, it recorded data quickly and it reported it to the site administrator just as fast. But over time, it has started to show its weakness. It gives you the numbers, but what about the meat of your stats? What if you want to see what happened last month? What about 3 weeks ago? Or 16 weeks ago?

I started to develop a new aggregate process which would store almost four times as much data, allowing the existing system to remain in place, and the reports to grow in features considerably. I could still sense impending doom with this approach though. What about further down the road? Would there be a ceiling to how much data I could store in an aggregate form? Would I just move the bloat from one storage method to another, winding up in the same pit that StatPress did?

Learning from our mistakes

In the end, I think StatPress was on the right track. I think I was too, but instead of forking down a completely different road, I think the best approach would be to learn a lesson from both approaches, and to turn that into a single new approach.

I was drowning in workload over the holidays, while simultaneously trying to keep up with family functions. I didn’t get a lot of time to work on kStats, and I must apologize to anybody I left out to dry as a result. There were some bug fixes desperately needed, and while they’re out now, they should’ve been out then. However, while I didn’t have as much time to work on kStats as I would have liked, I did find some time here and there in the mornings and evenings to get some reading in.

It’s all about the database

I consider myself fairly competent when it comes to MySQL, and more importantly database design. The more I learn though (as with life), the more I realize I don’t know. SQL on its own, without the various layers that bring it to life on the web, is an art form.

I’ve been studying up on the subject, because I know it’s a very important part of this design process, as well as the design and development of a few other projects I’m working on right now, and I think I’ve drawn an outline for a new database structure that will allow kStats to break the mold a little when it comes to statistics recording and analysis for Wordpress.

Using a combination of MySQL engines, such as the ARCHIVE engine, and a completely redesigned schema, I think I can bring the speed without sacrificing any data. There’s no reason you shouldn’t be able to look up historical periods of time and see exactly what occurred. There’s no reason you shouldn’t be able to click a button and produce a detailed report using any piece of recorded data as the focus (as opposed to being restricted to predefined reports). This is what statistics are for after all, right?
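As a rough sketch of what "any piece of recorded data as the focus" could look like in practice: if every hit is kept, a report is just a GROUP BY on whichever column you pick. The column names and the REPORTABLE whitelist below are assumptions for illustration, not the planned kStats schema, and SQLite stands in for MySQL here. (MySQL's ARCHIVE engine is append-oriented, with no UPDATE or DELETE, which is why it suits a hit log that is only ever inserted into and read.)

```python
import sqlite3

# Hypothetical reportable columns; the real schema is not described in the post.
REPORTABLE = ("browser", "os", "url")

def report_query(focus):
    """Build a hits-per-value query for one whitelisted column."""
    if focus not in REPORTABLE:
        # Column names cannot be bound as parameters, so whitelist them instead.
        raise ValueError(f"cannot report on {focus!r}")
    return (
        f"SELECT {focus}, COUNT(*) AS hits FROM hits "
        "WHERE hit_date BETWEEN ? AND ? "
        f"GROUP BY {focus} ORDER BY hits DESC, {focus}"
    )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hits (hit_date TEXT, browser TEXT, os TEXT, url TEXT)")
conn.executemany("INSERT INTO hits VALUES (?, ?, ?, ?)", [
    ("2009-12-01", "Firefox", "Linux", "/"),
    ("2009-12-02", "Firefox", "Windows", "/about"),
    ("2009-12-03", "IE", "Windows", "/"),
    ("2010-01-05", "IE", "Windows", "/"),
])

december = ("2009-12-01", "2009-12-31")
by_browser = conn.execute(report_query("browser"), december).fetchall()
by_os = conn.execute(report_query("os"), december).fetchall()
```

The same function serves any historical date range and any whitelisted column, which is the flexibility a predefined set of totals can't offer.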

The bad news

While the plugin is still technically considered a beta, I would rather not release a version that destroys all the data that’s been recorded to date. This may be unavoidable to some degree though.

While data that currently exists in the raw table of your kStats install is easy enough to move over to the new format with no data loss, the data that’s already been summarized might not be so easy to transfer.

Before I dive headlong into this new strategy I’m going to do everything in my power to determine how we can avoid such a loss. It may be as easy as providing a legacy utility which will store the historical data in a different format, and retrieve it when building certain reports. The problem I’m facing is that this aggregate data may be unusable in producing certain reports that require a particular level of accuracy.

It’s tough. I have a project board dedicated to brainstorming this, so I’ll be sure to keep everybody up to date on how it’s going to go down, well before it does.

Written by mark

December 29th, 2009 at 8:02 pm

Asynchronous and kStats; delivering fast statistics


I don’t know why, but this blog has been really hard to write. Could be the fact that I’m still extremely sore from ripping the garage apart and cleaning it top to bottom, or the fact that I’m bummed out about my new intake for my car not being in the mail today, but I just don’t find writing easy at the moment. So I’ll just try and spit it out, and eventually it will get lost in my archives anyways…

What’s new in 0.7.1?

You won’t notice any major visual changes or fancy new features in this release. I fixed a possible vulnerability in the way that some of the data was stored and retrieved and added a new opt-in program which benefits the plugin and another program, both of which I’ll go into further detail on below.

I did however bump up the versioning from 0.6.x to 0.7.x because there’s something new going on behind the scenes that will be a long term benefit to kStats and the people who use it on their blogs.

The Old Way

The aggregate is tripped every night by somebody visiting your web site. Long story short, this would be better accomplished via a cron process run directly off the server, but due to the nature of plugins and Wordpress, expecting a user to set such a thing up just to use kStats would be asking a little too much.
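For anyone unfamiliar with the idea, a visitor-triggered schedule boils down to a due-time check on each page load. This is only a sketch of the general pattern, not how kStats or wp-cron is actually written; the PseudoCron class and its method names are invented for the example.

```python
from datetime import datetime, timedelta

class PseudoCron:
    """Runs a job at most once per interval, triggered by page loads."""

    def __init__(self, interval, first_run):
        self.interval = interval
        self.next_run = first_run

    def on_page_load(self, now, job):
        if now < self.next_run:
            return False  # not due yet; the visitor just gets their page
        # Move the schedule forward first so the next visitor doesn't trip it too.
        self.next_run = now + self.interval
        job()
        return True

runs = []
cron = PseudoCron(timedelta(days=1), first_run=datetime(2009, 12, 2, 0, 0))
early = cron.on_page_load(datetime(2009, 12, 1, 23, 59), lambda: runs.append("ran"))
due = cron.on_page_load(datetime(2009, 12, 2, 0, 5), lambda: runs.append("ran"))
again = cron.on_page_load(datetime(2009, 12, 2, 0, 6), lambda: runs.append("ran"))
```

Note that nothing happens on a site with no visitors, and whoever arrives first after the due time is the one who pays for the job.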

When the aggregate was tripped, previous to this release, the process would run as fast as it could and sort your data from the raw table into the separate totals and charts tables. This of course allows kStats to run faster on a regular basis, and store more information with a much smaller footprint than its predecessor did. The pitfall was that the poor sap who tripped the process had to wait anywhere from 1-3 seconds extra for their page to load (possibly even longer on high traffic web sites).

In this age of broadband expectations, 3 seconds is an eternity.

The New Way

kStats now uses what is called an Asynchronous HTTP Request to run the aggregate. When the scheduled time comes, kStats fires off an HTTP request to an interface that runs the whole process in the background. This means that poor sap we were talking about above no longer notices a delay in their page load, no matter what the size of your database is or how much traffic you’re getting.
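The trick behind a request like this is simple: open a connection, send the request, and hang up without waiting for the response. Here is a minimal sketch of that pattern, assuming a plain HTTP endpoint; the function names and the /aggregate path are invented for the example and are not the kStats interface.

```python
import socket

def build_request(host, path):
    """Build a bare-bones HTTP/1.1 GET request as bytes."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")

def fire_and_forget(host, port, path, timeout=1.0):
    """Send the request and return immediately, never reading the response."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(host, path))
    # The socket is closed here; the server is left to finish the work on its own.

request = build_request("example.com", "/aggregate")
```

The caller only pays for the connect and the send. For this to work, the script on the receiving end has to keep running after the caller disconnects; in PHP that is what ignore_user_abort() is for.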

I promised when I started this project that the primary focus, regardless of features and capabilities, was to bring you the fastest plugin I could. I believe this update goes a long way to solidifying the groundwork of that promise.

Odds and Ends

There’s a new opt-in program that can be found on the Options page under the Definitions Utility – while I’m still looking for a more reliable Geolocation API (hint, hint), the user agent facility (determining OS, Browser, etc) is powered by the API provided by user-agent-string.info.

Should you choose to participate, what happens is when kStats stumbles across a user agent it can’t identify, it will immediately fire it off to user-agent-string.info so that they can identify it and include it in the next update of their API. The more user agents we can identify, the more accurate the process will be in determining exactly what people are using when they visit your site.

In addition, a possible security vulnerability has been closed up in the way that some data was being stored and returned from the database. The upgrade process will clean your current database and all information entered from now on is completely verified and sanitized. Please note that this was not an SQL injection vulnerability but instead a much smaller XSS vulnerability.
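The post doesn't spell out the fix, so take this as a general illustration of the problem class rather than the actual patch: visitor-supplied values such as referrers and user agents end up in the admin pages, so they need escaping before being written into HTML. The function name below is hypothetical.

```python
from html import escape

def render_referrer_cell(referrer):
    """Escape a visitor-supplied value before placing it in an HTML table cell."""
    return f"<td>{escape(referrer, quote=True)}</td>"

hostile = "<script>alert(1)</script>"
cell = render_referrer_cell(hostile)
```

Without the escape call, that referrer would run as script in the administrator's browser the next time the report was viewed.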


Written by mark

December 2nd, 2009 at 7:19 pm

0.6.x feature freeze in effect.

3 comments

I have yet to track down the source of the problem. It’s hard to fix a problem that you can’t reproduce, and even if I could reproduce it, the process runs so deep behind the scenes that I couldn’t step through it.

So other than some changes I made to the bar graph, I’m putting a complete freeze on adding new features to kStats until I’ve got this one nailed down. I’m going to go through each and every file and line of code one by one, and see where I can make improvements, fixes, or just finish commenting if nothing else.

Tracking down the beast

At the same time, since the only feasible source of the problem I can see is the nightly cleanup routines (as nothing else interacts directly with the totals table), I’ve taken a few steps. I’m now running a logger on it, so every time it trips I have a neat little text file that spits out every query, object and variable that’s part of the routine. I’m also running the wp-cron hook on an hourly basis, instead of nightly like regular users, so that it runs in a day what would normally take almost a month.
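The arithmetic on the hourly schedule: 24 runs a day is the equivalent of 24 nightly runs, so one day of testing covers a little over three weeks of normal use. As for the logger, something along these lines is all it takes. This is a sketch of the idea, not the logger kStats uses; the function names are made up and the queries are passed in as plain (sql, params) pairs.

```python
import io
import logging

def make_trace_logger(stream):
    """Create a logger that writes one plain line per event to the given stream."""
    logger = logging.getLogger("cleanup_trace")
    logger.setLevel(logging.DEBUG)
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(message)s"))
    logger.addHandler(handler)
    logger.propagate = False
    return logger

def run_cleanup(queries, execute, logger):
    """Run each query, logging it before execution and its result afterwards."""
    for sql, params in queries:
        logger.debug("QUERY %s PARAMS %r", sql, params)
        result = execute(sql, params)
        logger.debug("RESULT %r", result)

stream = io.StringIO()  # a real run would use a file instead
trace = make_trace_logger(stream)
run_cleanup(
    [("DELETE FROM raw WHERE hit_date < ?", ("2009-11-23",))],
    lambda sql, params: 17,  # stand-in for the database call; returns rows affected
    trace,
)
lines = stream.getvalue().splitlines()
```

Logging the query before running it matters here: if the routine dies halfway through, the last line in the file shows where it stopped.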

So far I haven’t seen the data get truncated again, so it may have been as simple as adding the call to ignore_user_abort().

In the meantime

There may be a few rapid fire bugfix releases of the 0.6.x series in the next few days.

Due to the feature freeze, they will not include any database changes or major updates that require you to run the upgrade utility. Each one should drop into place and self-update without any headache caused on your part.

Written by mark

November 24th, 2009 at 8:06 pm

Over half way there, kStats Reloaded v0.6.0

2 comments

I have to stop making changes to the user interface. Every time I’m about ready to tag a new release of kStats, I finish checking in the last of the files to Trunk, and just as I start typing svn cp trunk tags/x.x.x… Oops, forgot the new screenshots.

Would it be funny if I just left the screenshots from 0.1.0 up forever?

The path to stable

So far I’ve been managing to hit all of my goals, even surpassing a few during the beta period of kStats. I fubbed up the nightly aggregates for the first little bit, until I realized half my problem was my attempt to reinvent my own wheel. The other half fell under the same umbrella, I was just making things complicated that could have been simplified. Once I crested that hill, things started falling into place.

Written by mark

November 23rd, 2009 at 10:24 pm

Posted in Plugins, kStats Reloaded


kStats 0.5.0 – we’ve been widgetized!

one comment

Completely customizable widget interface

The new kStats widget!

I’m feeling a lot better, which means kStats is going to be getting a lot of attention over the next little while.

As per a recent feature request, kStats now comes with widget capabilities.

Other updates are noted in the changelog, and include some odds and ends bug fixes as well as complete integration of the user-agent-string.info API for fast and accurate identification of crawlers, as well as visitors’ operating systems and browsers.

The search engine query string definition file has also been updated and changed to an INI format in preparation for making future updates of this file available for direct download from your administrative interface.
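To show why INI is a convenient fit for this, here is a small sketch of reading such a file and using it to pull the search terms out of a referrer. The layout (one section per engine, with host and param keys) is my guess at a workable format for the example, not the actual kStats definition file.

```python
import configparser
from urllib.parse import urlparse, parse_qs

# Hypothetical definition file: one section per engine.
DEFINITIONS = """
[Google]
host = google.
param = q

[Yahoo]
host = search.yahoo.
param = p
"""

def load_engines(text):
    """Parse the INI text into (name, host fragment, query parameter) tuples."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return [(name, parser[name]["host"], parser[name]["param"])
            for name in parser.sections()]

def search_terms(referrer, engines):
    """Return (engine name, search terms) for a referrer, or None if no match."""
    url = urlparse(referrer)
    query = parse_qs(url.query)
    for name, host, param in engines:
        if host in url.netloc and param in query:
            return name, query[param][0]
    return None

engines = load_engines(DEFINITIONS)
hit = search_terms("http://www.google.com/search?q=wordpress+stats&hl=en", engines)
miss = search_terms("http://example.com/page?q=nothing", engines)
```

Adding a new search engine then means adding a section to a text file rather than shipping new code, which is what makes downloadable updates practical.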

Blog Statistics Widget

The current widget implementation is fairly straightforward in nature and completely user customizable using built-in macro codes to display the desired information.

I decided to go with this format in order to allow end users complete control over how the widget is displayed, and what information is displayed. Other than being wrapped in a single DIV element with the class ‘kstats_widget’, all formatting is left in your control.

Available data includes numerical data such as your all time total visitors, pageviews, spiders or feed accesses. You can also access this data by today, yesterday, or this month. In addition, five other macros were added, including information about your current visitor (ip, host, os, browser), and how many visitors are currently viewing your site.
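For the curious, macro substitution of this kind can be sketched in a few lines. The %name% syntax and the macro names below are placeholders for illustration (check the widget settings for the real codes); only the wrapping DIV with the ‘kstats_widget’ class comes from the description above.

```python
import re

def render_widget(template, data):
    """Replace %macro% codes with values, leaving unknown macros untouched."""
    def substitute(match):
        key = match.group(1)
        return str(data.get(key, match.group(0)))

    body = re.sub(r"%(\w+)%", substitute, template)
    return f'<div class="kstats_widget">{body}</div>'

# Hypothetical macro names and values.
stats = {"visitors_today": 42, "visitors_online": 3}
html = render_widget(
    "Today: %visitors_today%, online now: %visitors_online% (%unknown%)",
    stats,
)
```

Everything between the macros is passed through as written, so the markup and styling stay entirely in the user's hands.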

More Feature Requests

I’ve got lots of ideas and future plans for kStats, but what’s really going to make it the statistics plugin of choice for Wordpress users is your input.

I hope this serves to demonstrate that I am listening, and if you have a good idea for kStats, I would be more than happy to implement it. Other such suggestions that have been implemented include the ability to define your own ignore list via the options page, as well as the ability to display more than just the most recent 20 hits.

Let me know what you want to see!

Written by mark

November 15th, 2009 at 5:07 pm