
dustin's Blog Posts

4/11/08 6:23 PM

The meaning of life according to Google News: 86640

We have had some long-standing arguments about data integrity here.  My colleagues and other users often point out data on a graph that makes no sense.  I have always argued that we should provide the data we receive as-is, i.e. if Google News provides the wrong data, then the graph will show the wrong data.  Until now we have not altered any of the data we received (I guess you'd say I won :) -- if only because I would be the one who would have to write the data-normalization code).

Now, many of you will have noticed that the vast majority of Google News graphs have gone haywire.  Google News is now randomly returning the value 86640 for any query.  I think they are trying to tell us something.  I'm no numerologist, but I'm guessing that 86640 is the meaning of life we have all been looking for.  As such, I have deleted all references to it, and you will never see it again.  You're welcome.
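
For the curious, a cleanup like this can be as small as a one-line filter. The sketch below is only an illustration, not the code we actually ran: it assumes the data is available as one "date value" pair per line, the function name drop_sentinel is made up, and the sample readings are invented.

```shell
# Hypothetical helper: drop every point whose value is the bogus 86640.
# Assumes one "date value" pair per line on stdin (an assumed format).
drop_sentinel() {
    awk '$2 != 86640'
}

# Made-up sample readings, for illustration only.
printf '%s\n' \
    '2008-04-10T12:00:00 1200' \
    '2008-04-11T12:00:00 86640' \
    '2008-04-12T12:00:00 1250' | drop_sentinel
```

The obvious trade-off is that a genuine reading of exactly 86640 would be thrown away too.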

-Dustin
4/1/08 4:12 PM

Introducing SGS - Simple Graph Syndication

One of our goals while designing Trendrr was to make data both easy to import and easy to export.  We like webapps that give you the option to take your data elsewhere if you so choose, i.e. we hate lock-in.  Above and beyond moving your data elsewhere, we wanted it to be easy to develop applications with the data you are tracking.  I looked around for a preexisting XML standard that would work for our purposes, but I was unable to find anything suitable.  What we came up with was SGS (Simple Graph Syndication; big ups to RSS :)).

The schema itself should be easily readable/digestible for anyone familiar with XML.  It provides all the data necessary to reproduce the exported graph.

Here is an example feed:


<sgs version="0.1">
    <graph>
        <title>iraq (Number of Photos on Flickr)</title>
        <createdOn>2008-03-26T15:09:07</createdOn>
        <link>http://trendrr.com/public/graphs/376803</link>
        <creator>dustin</creator>
        <creatorUrl>http://trendrr.com/user/dustin</creatorUrl>
        <graphType>overlay</graphType>
        <graphScale>absolute</graphScale>
        <timeFrameStart>2008-03-02T12:44:03</timeFrameStart>
        <timeFrameEnd>2008-04-01T13:44:03</timeFrameEnd>
        <datasets>
            <dataset>
                <dataSource>Number of Photos on Flickr</dataSource>
                <lastUpdated>2008-04-01T02:55:58</lastUpdated>
                <percentTrend>0.006306043062614326</percentTrend>
                <dataStartedCollecting>2008-03-26T15:09:07</dataStartedCollecting>
                <color>#D03E6F</color>
                <input>iraq</input>
                <legendText>iraq</legendText>
                <renderType>line</renderType>
                <invert>false</invert>
                <values>
                    <point>
                        <date>2008-03-26T18:09:32</date>
                        <value>193735.0</value>
                    </point>
                    <point>
                        <date>2008-03-27T22:51:47</date>
                        <value>193938.0</value>
                    </point>
                    ....
                </values>
            </dataset>
        </datasets>
    </graph>
</sgs>




All dates are XML-encoded (e.g. 2008-03-26T15:09:07).
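
To give a feel for how little it takes to consume a feed, here is a minimal sketch that pulls the date/value pairs out of an SGS document with standard shell tools. It is not an official client: the function name sgs_points is made up, and it assumes each <date> and <value> element sits on its own line with every date followed by its value, as in the example feed. A real XML parser would be the safer choice for anything serious.

```shell
# Hypothetical helper: read an SGS feed on stdin and print one
# "date value" pair per line for every <point>.
# Assumes <date> and <value> each sit on their own line, in that order.
sgs_points() {
    sed -n -e 's:.*<date>\(.*\)</date>.*:\1:p' \
           -e 's:.*<value>\(.*\)</value>.*:\1:p' | paste -d ' ' - -
}

# The two points from the example feed.
sgs_points <<'EOF'
<values>
    <point>
        <date>2008-03-26T18:09:32</date>
        <value>193735.0</value>
    </point>
    <point>
        <date>2008-03-27T22:51:47</date>
        <value>193938.0</value>
    </point>
</values>
EOF
```

For the two points above this prints "2008-03-26T18:09:32 193735.0" and "2008-03-27T22:51:47 193938.0", which is a convenient shape for feeding into awk or a plotting tool.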

We are releasing the schema under a Creative Commons license, so you are free to use it without any fear.

The schema as it stands now only relates to time-series graphs, but it will be expanded in the future to deal with other types of data.  Enjoy.
3/27/08 12:38 PM

RESTful api usage - Continued

As Trendrr usage continues to rise, I have been closely monitoring the memory usage of JBoss (the app server that Trendrr runs on).  I typically just ssh to the server and use top or ps to check the memory and CPU usage.  I realized that this would be a perfect use of the Trendrr API.  It only took a slight modification of my original CPU usage script and we are good to go.


# Take a sample every ten minutes
sampleSeconds=600
# Your API key would go here
apiKey={removed}

while true
do
    # Memory usage (%MEM column of ps) of the JBoss process.
    # The [p] keeps grep from matching its own command line.
    memUsage=`ps aux | grep -m 1 '[p]rogram.name=run.sh' | nawk '{print $4}'`
    echo "$memUsage"
    apiUrl="http://www.trendrr.com/api/simple?key=$apiKey&value=$memUsage"
    # Discard the response so wget does not leave a file behind
    wget -q -O /dev/null "$apiUrl"
    # Sleep until we take the next sample
    sleep $sampleSeconds
done
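
One thing the loop does not guard against is an empty reading: if the JBoss process is not running, the ps pipeline prints nothing and the URL ends up with an empty value. Here is a small sketch of a sanity check that could sit in front of the wget call. The helper name is_number is made up for illustration and is not part of the original script.

```shell
# Hypothetical helper: succeed only if $1 looks like a non-negative
# decimal number such as 3 or 3.4 (digits with at most one dot).
is_number() {
    case "$1" in
        ''|*[!0-9.]*|*.*.*) return 1 ;;   # empty, stray characters, or two dots
        *[0-9]*) return 0 ;;              # at least one digit
        *) return 1 ;;                    # a lone "."
    esac
}

# Try it on a good reading, an empty one, and a malformed one.
for sample in "3.4" "" "n/a"; do
    if is_number "$sample"; then
        echo "would send: $sample"
    else
        echo "skipping bad sample: '$sample'"
    fi
done
```

In the monitoring loop, the same test would simply wrap the wget line so that bad samples are skipped rather than posted.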


2/7/08 7:29 PM

RESTful api example : cpu monitor



Here is an example of using the RESTful API to track your own data. This simple script tracks the CPU usage of my development box at ten-minute intervals. It is somewhat frivolous (now you can all see how hard I, err, my computer, works every day), but I think it shows how easy it is to use the Trendrr API.

From the Trendrr home page:
  1. Log in
  2. Click 'Add Data' on the left-hand side
  3. Click 'Add Custom Data Set'
The defaults work just fine for most of the fields. Just fill in 'Data Set Title' and 'Legend Text'.  In this case I used:
Data Set Title = 'CPU Monitor'
Legend Text = 'Dustins Linux Box'



That's it! Press save and you will see your API key on the next screen.


To get my current CPU usage I am using this simple shell script:


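# Take a sample every ten minutes
sampleSeconds=600
# Your API key would go here
apiKey="{your key here}"

while true
do
    # Get the current CPU usage as a decimal number. top runs two
    # iterations $sampleSeconds apart; grep -n prefixes each Cpu line
    # with its line number, and the nawk filter keeps only the second
    # one (line number > 15), which covers the sampling interval.
    cpuUsage=`top -b -n 2 -d $sampleSeconds | grep Cpu -n | nawk '{if ( $1 + 0 > 15 ) print $2 + 0;}'`

    # This is the Trendrr RESTful URL
    apiUrl="http://www.trendrr.com/api/simple?key=$apiKey&value=$cpuUsage"

    wget "$apiUrl"
    echo "$apiUrl"
done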



Using the API is très simple.  You just construct a URL of the form:

http://www.trendrr.com/api/simple?key={your key}&value={your value}


which will inject the value at time=right now.  That is, you will see a point on the graph that corresponds to that value at the current time.  You could also pass a date parameter if you wish.  More documentation here
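
If you end up posting from more than one script, it can be handy to wrap the URL form in a tiny function. This is just a sketch: the function name trendrr_simple_url and the placeholder key are made up, and only the key and value parameters shown above are used.

```shell
# Hypothetical helper: build the simple-API URL for a key and a value.
# Only the URL form comes from the post; the function name is made up.
trendrr_simple_url() {
    printf 'http://www.trendrr.com/api/simple?key=%s&value=%s\n' "$1" "$2"
}

# Example with a placeholder key; nothing is sent over the network here.
trendrr_simple_url "YOUR_KEY" "12.5"
```

When handing the result to wget, keep it in double quotes (wget "$url"). That also matters when you type the URL directly at a prompt: unquoted, the shell would treat the & as "run in the background" and cut the URL short.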


I am using the newest Ubuntu release and the script works out of the box; it might work as-is on a Mac as well. What it does is take the average CPU usage over each ten-minute interval and send that number to Trendrr via the RESTful URL. Very simple.
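
The line-number trick in the script depends on how many lines top prints before the second summary. As a sketch of an alternative (not what the script above does), you can simply keep the last Cpu summary line, which is the one covering the interval between the two iterations. The function name last_cpu_sample is made up, it uses plain awk rather than nawk, and the sample top output below is invented for illustration.

```shell
# Hypothetical helper: read `top -b -n 2` output on stdin and print the
# user-CPU figure from the last "Cpu(s)" summary line.
last_cpu_sample() {
    awk '/Cpu\(s\)/ { v = $2 + 0 } END { print v }'
}

# Invented sample output standing in for two top iterations.
printf '%s\n' \
    'Cpu(s):  5.3%us,  1.0%sy,  0.0%ni, 93.7%id' \
    '  PID USER      PR  NI %CPU COMMAND' \
    'Cpu(s):  7.5%us,  0.8%sy,  0.0%ni, 91.7%id' | last_cpu_sample
```

For the invented sample this prints 7.5. In the real loop the input would come from top -b -n 2 -d $sampleSeconds instead of printf.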
