Node.js tutorial : real time geolocalized tweets

Recently I’ve played a little with Node.js around real-time visualization of tweets, and I want to share one of my experiments with you.
The idea is to display a heatmap of tweets in real time.
For this, I’ve used Node.js. I am a long-time Ruby lover but always open to new technology, even if the same objective could have been achieved with EventMachine.
You can see the result on twittmap.


Here is the code. It has two main parts: the server and the client.

The server

The server has two parts: the connection to the real-time Twitter stream, and the connection between the client and the server. Each part is around 100 lines, so pretty small. See the end of the article if you want to install it yourself.

I didn’t want to force users to log in to Twitter to use the application, so I just use a single key for everybody. To get your own key and access token, go to dev.twitter.com, register an app, and generate your own account access token for this app; everything is explained there.
For Twitter, I use the “Twit” package.
Let’s look at the code:

var Twit = require('twit');

var T = new Twit({  // You need to set up your own Twitter configuration here!
  consumer_key:        process.env.CONSUMER_KEY,
  consumer_secret:     process.env.CONSUMER_SECRET,
  access_token:        process.env.ACCESS_TOKEN,
  access_token_secret: process.env.ACCESS_TOKEN_SECRET
});
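
Since the configuration is read from environment variables, a forgotten variable only shows up later as an authentication error. Here is a minimal sketch of an early check; the helper name missingCredentials is hypothetical and not part of TweetOMap or Twit.

```javascript
// Hypothetical helper (not part of TweetOMap or Twit): list the
// credential variables that are missing from an environment object.
var REQUIRED_KEYS = ['CONSUMER_KEY', 'CONSUMER_SECRET',
                     'ACCESS_TOKEN', 'ACCESS_TOKEN_SECRET'];

function missingCredentials(env) {
  return REQUIRED_KEYS.filter(function (key) {
    return !env[key];  // undefined or empty string counts as missing
  });
}

// Warn at startup instead of failing later when the stream is opened.
var missing = missingCredentials(process.env);
if (missing.length > 0) {
  console.error('Missing Twitter credentials: ' + missing.join(', '));
}
```

Taking the environment as a parameter keeps the check easy to try with a plain object instead of the real process.env.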

Now, I open a stream with a filter on location, but I filter on [-180,-90,180,90], which is basically all locations. The only risk is being rate limited by Twitter (the “limit” event), which occurs if the rate of the stream is higher than 5% of total tweets.

var world = [-180,-90,180,90];  // [west, south, east, north]: all locations
var stream = T.stream('statuses/filter', { locations: world});
stream.on('error',function(error){
  console.log(error);
});
stream.on('limit', function (limitMessage) {
  console.log("Limit:"+JSON.stringify(limitMessage));
});
stream.on('tweet', function (tweet) {
  if(tweet.geo){
    var coords=tweet.geo.coordinates;
    clients.forEach(function(socket){
      var currentBounds=bounds_for_socket[socket.id];

      if(currentBounds&&(coords[1]>currentBounds[0])
                      &&(coords[0]>currentBounds[1])
                      &&(coords[1]<currentBounds[2])
                      &&(coords[0]<currentBounds[3])){

        totalSent+=1;
        if(totalSent%100==0)console.log("Sent:"+totalSent);
        var smallTweet={
          text:tweet.text,
          user:{   screen_name:       tweet.user.screen_name,
                   profile_image_url: tweet.user.profile_image_url,
                   id_str:            tweet.user.id_str},
          geo: tweet.geo
        };
        socket.emit('stream',smallTweet);
      }
    });
  }
});

When a tweet is received, I first check that it has a geo field (this should always be the case because I filter on location, but GeoJSON allows other types of representation). Then I convert the tweet to a “smallTweet”, which is basically a subset of the tweet’s fields, in order to reduce its size. Finally, I send it (using socket.io) to every connected client that is “looking” at this area.
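
The bounds test and the size reduction can be pulled out into two small functions, which makes the coordinate order easier to see. This is only a sketch: the names inBounds and toSmallTweet are mine, and it assumes the same conventions as the code above, namely tweet.geo.coordinates as [lat, lng] and the client bounds as [west, south, east, north].

```javascript
// coords is [lat, lng] (the order of tweet.geo.coordinates);
// bounds is [west, south, east, north] (the order sent by the client).
function inBounds(coords, bounds) {
  if (!bounds) return false;  // client has not sent any bounds yet
  var lat = coords[0], lng = coords[1];
  return lng > bounds[0] && lat > bounds[1] &&
         lng < bounds[2] && lat < bounds[3];
}

// Keep only the fields the client needs, to reduce the payload size.
function toSmallTweet(tweet) {
  return {
    text: tweet.text,
    user: { screen_name:       tweet.user.screen_name,
            profile_image_url: tweet.user.profile_image_url,
            id_str:            tweet.user.id_str },
    geo: tweet.geo
  };
}

// Example: a point in New York against a box around New York.
var newYorkBox = [-78.29, 38.87, -70.51, 41.80];
console.log(inBounds([40.71, -74.00], newYorkBox));  // true
console.log(inBounds([48.85, 2.35], newYorkBox));    // false (Paris)
```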

This is managed by the second part, the “socket.io” part.

var bounds_for_socket={}; 
var clients=[];  // the list of connected clients
io.sockets.on('connection', function (socket) {

  socket.on('recenter',function(msg){
    bounds_for_socket[this.id]=JSON.parse("["+msg+"]");
  });

  socket.on('disconnect',function(){
    //  Find this socket in the client list and remove it
    for(var i=0;i<clients.length;i++){
      var client=clients[i];
      if(client.client.id==this.id){clients.splice(i,1);break;}
    }
    delete bounds_for_socket[this.id];

  });
  clients.push(socket); // Update the list of connected clients
  var currentBounds=JSON.parse(socket.handshake.query.bounds);
  bounds_for_socket[socket.id]=currentBounds;
});

The small difficulty is to keep track, for every connected client, of the socket id and of the bounding box for that connection. The bounding box is sent either when the connection is opened or in a message from the client to the server, for instance when the user moves the map.
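
To make the bookkeeping explicit, here is a minimal sketch of the same idea without socket.io: a small registry keyed by socket id. BoundsRegistry is a hypothetical name, not something from the project. It assumes the two formats used above: the connection query carries "[w,s,e,n]" with brackets, while the “recenter” message carries "w,s,e,n" without them.

```javascript
// Hypothetical registry: one bounding box per connected socket id.
function BoundsRegistry() {
  this.bounds = {};
}

// Called on connection; queryBounds looks like "[-5,42,8,51]".
BoundsRegistry.prototype.connect = function (id, queryBounds) {
  this.bounds[id] = JSON.parse(queryBounds);
};

// Called on a 'recenter' message; msg looks like "-5,42,8,51".
BoundsRegistry.prototype.recenter = function (id, msg) {
  this.bounds[id] = JSON.parse('[' + msg + ']');
};

// Called on disconnect, so the table does not grow forever.
BoundsRegistry.prototype.disconnect = function (id) {
  delete this.bounds[id];
};

BoundsRegistry.prototype.get = function (id) {
  return this.bounds[id];
};

var registry = new BoundsRegistry();
registry.connect('abc', '[-5,42,8,51]');
registry.recenter('abc', '-78,38,-70,41');
console.log(registry.get('abc'));  // [ -78, 38, -70, 41 ]
```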

The client

Now let’s look at the client. I use Leaflet, a general-purpose mapping library, together with a heatmap layer that can be displayed on top of the map. As the heatmap is very colorful, I chose black & white tiles to make the difference visible. The Twitter rate (tens of tweets per second) doesn’t allow displaying a full world heatmap (there are other means for that), so I limit the minimum zoom level to 5.

  var baseLayer = L.tileLayer(
  'http://korona.geog.uni-heidelberg.de:8008/tms_rg.ashx?z={z}&x={x}&y={y}',{
    attribution: 'Map data &copy; <a href="http://openstreetmap.org">OpenStreetMap</a> contributors, <a href="http://creativecommons.org/licenses/by-sa/2.0/">CC-BY-SA</a>, Imagery © <a href="http://cloudmade.com">CloudMade</a>',
    maxZoom: 18
  });

  var cfg = {
    "radius": 5,
    "maxOpacity": .8,
    "scaleRadius": false,
    "useLocalExtrema": true,
    latField: 'lat',
    lngField: 'lng',
    valueField: 'count'
  };

  heatmapLayer = new HeatmapOverlay(cfg);
  map = new L.Map('map-canvas', {
    center: new L.LatLng(46.99,2.58),
    zoom:   7,
    minZoom:5,
    layers: [baseLayer, heatmapLayer]
  });
  map.on('zoomend',updateSocket);
  map.on('dragend',updateSocket);
  if(bounds=$.urlParam('bounds')){
    rect=JSON.parse(bounds);
    map.fitBounds(L.latLngBounds([rect[1],rect[0]],
                                 [rect[3],rect[2]]));
  }

bounds is the bounding box of the view, passed in the URL so that a specific location can be bookmarked and reused later.
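
The $.urlParam helper used above is defined elsewhere and is not shown here. As an illustration only, here is a sketch of how the bounds parameter could be read and validated with the standard URLSearchParams API; the function names boundsFromQuery and toLeafletCorners are hypothetical.

```javascript
// Read "?bounds=[w,s,e,n]" from a query string.
// Returns an array of four numbers, or null if absent or malformed.
function boundsFromQuery(search) {
  var raw = new URLSearchParams(search).get('bounds');
  if (!raw) return null;
  try {
    var rect = JSON.parse(raw);
    var ok = Array.isArray(rect) && rect.length === 4 &&
             rect.every(function (v) {
               return typeof v === 'number' && isFinite(v);
             });
    return ok ? rect : null;
  } catch (e) {
    return null;  // not valid JSON
  }
}

// Leaflet expects [lat, lng] corners, while the URL stores
// [west, south, east, north], hence the index swap.
function toLeafletCorners(rect) {
  return [[rect[1], rect[0]], [rect[3], rect[2]]];
}

var rect = boundsFromQuery('?bounds=[-78.29,38.87,-70.51,41.8]');
console.log(toLeafletCorners(rect));
```

In the browser this would be called with window.location.search, and the two corners passed to L.latLngBounds as in the code above.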

As you can see, there is a callback, “updateSocket”, called when the user has finished zooming in or out or moving the map. This callback tells the server, through the socket, to update the client’s location zone, and also updates the “bounds” parameter of the URL:

function updateSocket(){
  window.history.pushState("TweetOMap","TweetOMap","/?bounds=["+map.getBounds().toBBoxString()+"]");
  if(socket)socket.emit("recenter",map.getBounds().toBBoxString());
}

The last part is the initialisation of the socket. This part connects to the server (or reconnects when the connection is lost and then recovered).

function startSocket(){
  socket = io.connect('/', {query: "bounds=["+map.getBounds().toBBoxString()+"]"});
  socket.on('stream', function(tweet){
    addPoint(tweet);
  });
  socket.on('reconnect',function(){
    console.log("Reconnect");
    updateSocket();
  });
}

When a new “stream” event, carrying a tweet, is received from the server, the addPoint procedure is called:

function addPoint(tweet)
{
  if(tweet.geo){
    var pt={lng:tweet.geo.coordinates[1],lat:tweet.geo.coordinates[0],count:1};
    if(showTweets){
      var bubble=hover_bubble.shift();  // reuse the oldest bubble
      bubble.setContent("<img src=\""+tweet.user.profile_image_url+"\" align=left><b>@"+tweet.user.screen_name+"</b><br>"+tweet.text)
      .setLatLng(tweet.geo.coordinates)
      .addTo(map);
      hover_bubble.push(bubble);
    }
    heatmapLayer.addData(pt);
  }
}

We add the point to the heatmap layer (heatmapLayer.addData(pt)), but we also add a bubble to the map with some information about the tweet. This can be switched off using the showTweets flag. Only the last 10 tweets are displayed.
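
The “last 10 tweets” limit comes from the shift/push pair on hover_bubble: the oldest bubble is taken from the front of the array, reused for the new tweet, and put back at the end. Here is a minimal sketch of that recycling pattern on plain objects. It assumes hover_bubble is pre-filled with a fixed number of popups, which is not shown in the code above; makePool and recycle are hypothetical names.

```javascript
// Build a fixed-size pool; create(i) makes the i-th item.
function makePool(size, create) {
  var pool = [];
  for (var i = 0; i < size; i++) pool.push(create(i));
  return pool;
}

// Take the oldest item, put it back as the newest, and return it
// so that the caller can overwrite its content.
function recycle(pool) {
  var item = pool.shift();
  pool.push(item);
  return item;
}

// With a pool of 3, the 4th tweet reuses the bubble of the 1st.
var pool = makePool(3, function (i) { return { id: i, content: null }; });
['t1', 't2', 't3', 't4'].forEach(function (text) {
  recycle(pool).content = text;
});
console.log(pool.map(function (b) { return b.content; }));
```

The pool never grows, so the number of bubbles on the map stays bounded however long the page stays open.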

The full source code is available here: https://github.com/tomsoft1/TweetOMap

I’ve tried a new provider to deploy the app, nodejitsu. It’s free for the first application, with some limitations. It’s very close to Heroku and it worked quite well.

To install it:

git clone 'https://github.com/tomsoft1/TweetOMap'
cd TweetOMap
vi tweets.js # put your own credentials here, or set them as environment variables
npm install
node tweets.js

Then open localhost:8080 and you should see the map.

Note that the URL modification allows you to “remember” the state of the map. For instance, to look around New York, use this link: http://twittmap.nodejitsu.com/?bounds=[-78.2940673828125,38.86965182408357,-70.51025390625,41.79998325207397]

Conclusion
As a Ruby lover, this was a nice experience. The Node.js ecosystem is obviously very good, especially in this area. The ability to easily set up a socket connection between the client and the server is a real plus.

I like it as a tool for some tasks, one that can be used in combination with other tools. I personally use it, for instance, in conjunction with other Rails or EventMachine programs. Rails is much more mature in terms of framework, library and ORM support, but Node.js is more advanced for evented I/O (though sometimes “too much”, see my Evented dictature article).

Note that the same principle has been used in my TweetMap experiment.