If you are looking for consistency or a theme here, you're in the wrong place. That's something you will find elsewhere. This is another one of those geek entries.
For many years I have had a serious affinity for cacti and its underlying engine, rrdtool. Before that, I was pretty fond of MRTG -- the predecessor of rrdtool. I spent the requisite time graphing the performance of computers and routers -- you know... the normal stuff.
But that's so cut and paste. There's no challenge. At some point I started playing with graphing other things. My first big project was taking the open-source financial package gnucash and creating about 70 different graphs on various financial aspects. But over the years I have graphed hurricane progress (actually weather buoy data), doggie potty training, prepaid cell phone statistics, TiVo statistics, various Netflix usage statistics, usage statistics for my ISP (since I have a monthly data cap) -- just to name a handful. I am a graphing weenie.
The original gnucash processing ran on a 166MHz processor. It used to take 30 minutes to crunch the gnucash file and spit out stuff -- and that was before I added "features" to track more things. Needless to say, I couldn't use the "normal" cacti mechanisms that would process the file every 5 minutes. Add to this the fact that I had 70-something graphs and god only knows how many distinct data sources. I needed a way to process once, read many.
What I originally came up with was a "recipe." I processed the data, wrote perl code on the fly, then wrote a "read my data" bit of code that included the on-the-fly perl code and spit out answers cacti could understand. This code would re-process the gnucash file every 12 hours or so.
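The original recipe was in perl, but the idea translates to a short sketch in any language: crunch the expensive file once, emit code on the fly, and have a cheap reader "include" that generated code and answer in the `field:value` form cacti expects from a script. Everything here -- the path, the function names, the account totals -- is hypothetical, not the actual code.

```python
# Hypothetical sketch of the "process once, read many" recipe.
GENERATED = "/tmp/gnucash_data.py"  # made-up path for the on-the-fly code

def crunch_gnucash():
    """Stand-in for the 30-minute gnucash crunch; runs every 12 hours or so."""
    totals = {"checking": 1234.56, "savings": 789.00}  # fake numbers
    with open(GENERATED, "w") as f:
        for name, value in totals.items():
            f.write(f"{name} = {value!r}\n")  # write data *as code*

def read_my_data(field):
    """Cheap reader; runs every cacti poll. Includes the generated code."""
    ns = {}
    with open(GENERATED) as f:
        exec(f.read(), ns)  # the "include" step
    return f"{field}:{ns[field]}"  # cacti-style field:value output

crunch_gnucash()                 # the slow half, done rarely
print(read_my_data("checking"))  # the fast half, done constantly
```

The trick is that the expensive pass and the cheap pass never run at the same cadence -- cacti only ever talks to the fast half.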
I then cut and pasted this bit of code many, many times as I developed other applications to process things that didn't fit the normal cacti model.
The bad side of this was that every time I hit something new and different, the cut-and-paste code diverged from the original. And there were multiple copies of it, each with tweaks. Want a new feature? You have to add it 10 times to 10 bits of diverging code.
So, I finally developed a library to handle this.
The whole idea here is that you write one script that creates data. And you write a second script that spits out cacti-approved data. The library marries the two and manages the cache. It provides cache statistics. It rebuilds the cache in the background when it is out of date. It notifies you via SMTP when things fail.
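The shape of that library might look something like the sketch below -- again in Python rather than the original perl, with invented names throughout. The producer script is just a callable; the library decides when the cache is stale, rebuilds it atomically so a reader never sees a half-written file, and keeps hit/miss counts for the cache statistics.

```python
import json
import os
import time

class DataCache:
    """Hypothetical sketch of the cache-marrying library, not the real one."""

    def __init__(self, path, producer, max_age=12 * 3600):
        self.path = path          # where the cached data lives
        self.producer = producer  # the expensive "creates data" script
        self.max_age = max_age    # rebuild after this many seconds
        self.hits = 0             # cache statistics
        self.misses = 0

    def stale(self):
        """Cache is stale if it is missing or older than max_age."""
        return (not os.path.exists(self.path)
                or time.time() - os.path.getmtime(self.path) > self.max_age)

    def rebuild(self):
        """Run the producer and swap the new cache in atomically."""
        data = self.producer()
        tmp = self.path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(data, f)
        os.replace(tmp, self.path)  # readers never see a partial file

    def fetch(self, key):
        """The 'spits cacti-approved data' half reads through this."""
        if self.stale():
            self.misses += 1
            self.rebuild()
        else:
            self.hits += 1
        with open(self.path) as f:
            return json.load(f)[key]
```

A real version would also fork the rebuild into the background and wrap the producer in error handling that fires off the SMTP notification; this sketch keeps it synchronous for clarity.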
Anyway, if you want it, let me know. I would post it to CPAN, but there is such a limited audience... and it is pretty beta.
Right now the performance is, well... not nearly as good as before. My cacti run queue used to take about 45 seconds. Now it runs about 70 seconds. Sure, I could make it more parallel, but that wouldn't really be a fair benchmark. [BTW: those spiky things in the graph are bugs where I accidentally spawned a zillion subprocesses.] I expected a performance drop. Before, it was a simple file include; now it is a chatty disk operation. I would like to think it is more robust, though. Before, when things hiccuped, the graph would stop. Now little errors don't stop it -- and all errors generate an email.