Tag Archives: ruby

Playing with Arduino: real HashtagBattle Twitter display

I just started a few days ago to play with Arduino.

First, I must say it’s a great platform: setup is easy, samples are great. I’ve made some embeeded development by the past, and I would dreamed of such platform!

So here is my first small project: an “Hashtagbattle display”

Real World hashtagBattle
The idea is to show the result of an Hashtagbattle on Twitter using a something more physical, in that case an arrow indicating the tendance of the results.
So we are couting the number of tweets containing hashtag a (let’s say “iphone”) and compare it to the numbers of tweet containing hashtag b (let’s say “#android”) and the direction of the arrow depend of the result. If we have the same amount of hashtag b and hashtag a, the arrow will indicate the center.

The small difficulty here was to use the streaming API of Twitter to do this real time

The board:

- the board is quite simple: there is a servo that will display the arrow direction, and two leds, a red and green, each one flashing when there is on tweet on HashTag A (red) or HashTag B are coming

sketch
The software:

As I currently have only an Arduino UNO, there is no direct connection with the internet. But I use the PC for this task, and communicate with the board through the serial line.
So the PC sends every 100 ms a line with 3 informations:
- the angle of the servo
- 0 or 1 to indicate if first leds need to be enabled
- 0 or 1 to indicate if second leds needs to be enabled

For instance:

32 1 0

Will tell that the servo need to be of 32° and red led needs to blink

So the program is the following:

require 'tweetstream'
require "serialport"

input=ARGV.shift || "#bordeaux,#strasbourg"
#params for serial port
port_str = "/dev/tty.usbmodem1411"  #may be different for you

sp = SerialPort.new(port_str, 9600, 8, 1, SerialPort::NONE)

TweetStream.configure do |config|
  config.consumer_key       = '<YOUR KEY>'
  config.consumer_secret    = '<YOUR CONSUMER KEY>'
  config.oauth_token        = '<OAUTH TOKEN>'
  config.oauth_token_secret = '<TOKEN SECRET>'
  config.auth_method        = :oauth
end

@client=TweetStream::Client.new;

total=0
last=Time.now
words=input.split(',')
flags=Array.new(words.size)
buffers=words.map{|w| []}
sp.puts "512 0 0" # reset the servo
@client.track(words) do |tweet|
  begin
    search=tweet.text.downcase
    words.each_with_index do |word,i|
      if search.include? word
        flags[i]=1
        buffers[i]<<tweet
      end
    end
    if (Time.now-last)>0.11
      begin
        ratio=(180*buffers[0].size/(buffers[0].size+buffers[1].size)).to_i
        str=ratio.to_s+" "+flags.join(' ')
        puts str
        sp.puts str
        last=Time.now
        flags=Array.new(words.size,0)
      end
    end
  end
end

Note that you canstart with parameters:

ruby notifyTweets.rb “#apple,#orange”
or use words instead of hashtags:
ruby notifTweets.rb “apple,orange”
The program needs to be run on the PC where the arduino is connected!

The Arduino part is here
We just se the serial line, wait for a line, and extract the informations from it.

#include <Servo.h> 
int ledPin=13;    // select the input pin for the potentiometer
int nbLed=2;
Servo myservo;  // create servo object to control a servo 

void setup() {
  // Open serial communications and wait for port to open:
  Serial.begin(9600);
  while (!Serial) {
    ; // wait for serial port to connect. Needed for Leonardo only
  }
  for(int i=0;i<nbLed;i++){
   pinMode(ledPin-i, OUTPUT); 
  }
 myservo.attach(9);  // attaches the servo on pin 9 to the servo object 
}

# Utility function to get a value from a string at a given pos
String getValue(String data, char separator, int index)
{
  int found = 0;
  int strIndex[] = {0, -1};
  int maxIndex = data.length()-1;

  for(int i=0; i<=maxIndex && found<=index; i++){
    if(data.charAt(i)==separator || i==maxIndex){
        found++;
        strIndex[0] = strIndex[1]+1;
        strIndex[1] = (i == maxIndex) ? i+1 : i;
    }
  }
  return found>index ? data.substring(strIndex[0], strIndex[1]) : "";
}


void loop() {
  if(Serial.available() >0) {
    String str=Serial.readStringUntil('\n');
    Serial.println("Read:"+str);
    for(int i=0;i<nbLed;i++){
      // look for the next valid integer in the incoming serial stream:
      if(!getValue(str,' ',i+1).equals("0")){
        digitalWrite(ledPin-i, HIGH);  
      }
    myservo.write(getValue(str,' ',0).toInt());
    }
  }
  delay(100);  
  for(int i=0;i<nbLed;i++){
    digitalWrite(ledPin-i, LOW); 
  } 
}

That’s all for this first project, was really fast to develop, thanks to Arduino (and Ruby!)

SpaceSaving algorithm (heavyhitter): compute real time data in streams

I’ve just discovered recently the SpaceSaving algorithm ( ) which is very interesting.

The purpose of this algorithm is to approximate computation of data from an infinite stream using a data structure with a finite size. The typical problem we try to solve is to count the number of occurrences of items coming from an infinite stream.

The obvious approach, is to have a map of items,count somewhere (memory, database) and to increment the count (or create it if not present) in the map

Depending of the distribution, the map size can be as big as the number of items in the feed (so infinite!)

Thanks to the heavy hitter algorithm, you just maintain a small subset of items, and an error associate with them.

If an item is not yet present in the map, you remove the previous minus and replace it with this one, and set the error count to the new one to the total count of the previous one.

So basically, you can have an item appears in the map but with an high error count.

The result depends of the number chosen for the size of the map , and of the distribution of the data, but I wanted to try by myself and Ive done a small implementation in ruby and a sample using the Twitter real time stream to cunt URL’s occurrence in the feed

class SpaceSaving
  attr_accessor :counts,:errors
  def initialize in_max_entries=1_000
    @in_max_entries=in_max_entries
    @counts={}
    @errors={}
  end
  #Add a value in the array
  def add_entry value,inc_count=1
     count=@counts[value]
     if count==nil   #newentry
      if @counts.size>=@in_max_entries
        min=counts.values.min
        old=counts.key min
        @counts.delete old
        @errors.delete old
        @errors[value]=min
        count=min
      else
        count=0
      end
     end
     @counts[value]=count+inc_count
  end
  
end

I just wanted to keep it as simple as possible for now, but in a real world example, it would be good to encapsulate the way to access to the @counts field.
By default, the size of the map is set to 1.000, but this can be modified….

And the sample.rb

require 'bundler/setup'
require 'tweetstream'
require './SpaceSaving.rb'

TweetStream.configure do |config|
  config.consumer_key       = <CONSUMER_KEY>
  config.consumer_secret    = <CONSUMER_SECRET>
  config.oauth_token        = <OAUTH_TOKEN>
  config.oauth_token_secret = <OAUTH_SECRET>
  config.auth_method        = :oauth
end

@client=TweetStream::Client.new;

urls=SpaceSaving.new

total=0

@client.track('http') do |status|
  begin
    # add each URL in the tweet , and count it once
    status.urls.each do |url|
     urls.add_entry(url.expanded_url,1)
    end
    total+=1
    if total%1000==0 then
      puts "Treated:#{total}"
      limit=urls.counts.values.sort[-30]
      urls.counts.each do|u,c|
        if c>limit 
          puts "Counts #{c} #{u} errors:#{urls.errors[u]}"
        end
      end
    end
  end
end

The complete sourcecode is available on Github: https://github.com/tomsoft1/SpaceSaving

So I’ve made a few tests and comparaison with an exact approach, and the result for the top 20 was similar, even after 600.000 tweets parsed, with a “small” 1.000 size comparing to around 300.000 entries for the exact algo.

Twitter Site Stream using EventMachine

I’ve spend some time to try to use Twitter Site Streaming API with event machine, so here is just a small snipet of code on how to do it. In fact it’s pretty simple

Just go to your app page on twitter ( http://www.twitter.com/apps ) and go to your application. Get the access token here. You will find also a button “get my access token” where you can get the oauth access token and access token secret.

With these, you will be able to sign the request

require 'rubygems'
require 'eventmachine'
require 'em-http'
require 'json'
require 'oauth'
require 'oauth/client/em_http'

# Edit in your details.
CONSUMER_KEY = "<put your consumer secret key here>"
CONSUMER_SECRET = "<put your consumer key here>"
ACCESS_TOKEN = "<put your access token here>"
ACCESS_TOKEN_SECRET = "<put your access token secrethere>"

def twitter_oauth_consumer
  @twitter_oauth_consumer ||= OAuth::Consumer.new(CONSUMER_KEY, CONSUMER_SECRET, :site => "http://twitter.com")
end

def twitter_oauth_access_token
  @twitter_oauth_access_token ||= OAuth::AccessToken.new(twitter_oauth_consumer, ACCESS_TOKEN, ACCESS_TOKEN_SECRET)
end

EventMachine.run do
		  # now, let's subscribe to twitter site stream
		  # check informaiton on twitter site
		  # here we are followig to user that have signed to our app...
		  toFollow=[17590452,2071231]
		 http = EventMachine::HttpRequest.new('https://betastream.twitter.com/2b/site.json'
).post(:body=>{"follow"=>toFollow.join(",")},
		 	:head => {"Content-Type" => "application/x-www-form-urlencoded"},
		 	:timeout => -1) do |client|
    		twitter_oauth_consumer.sign!(client, twitter_oauth_access_token)
  		end

	  	buffer = ""

		http.stream do |chunk|
    		buffer += chunk
   			while line = buffer.slice!(/.+\r?\n/)
   				puts "handling a new event:"+line
    		end
  		end
   		http.errback { puts "oops" }
   		http.disconnect { puts "oops, dropped connection?" }

 end

For this, I use the eventmachine plugins for ruby, as well as the oauth and em-http plugins.

Please note that you must you https with twitter in order for this to work.