Azure platform: real-time big data analysis of Twitter tweet keywords

Learn how to do real-time sentiment analysis of big data using HBase in an HDInsight (Hadoop) cluster.

Social web sites are one of the major driving forces for Big Data adoption. Public APIs provided by sites like Twitter are a useful source of data for analyzing and understanding popular trends. In this tutorial, you will develop a console streaming service application and an ASP.NET Web application to perform the following:

  • Get geo-tagged Tweets in real-time using the Twitter streaming API.
  • Evaluate the sentiment of these Tweets.
  • Store the sentiment information in HBase using the Microsoft HBase SDK.
  • Plot the real-time statistical results on Bing maps using an ASP.NET Web application.

    You will be able to query tweets by keyword to get a sense of whether the opinion expressed in the tweets is positive, negative, or neutral.

A complete Visual Studio solution sample can be found at https://github.com/maxluk/tweet-sentiment.

Prerequisites

Before you begin this tutorial, you must have the following:

  • An HBase cluster in HDInsight. For instructions on provisioning a cluster, see Get started using HBase with Hadoop in HDInsight. You will need the following data to go through the tutorial:

    CLUSTER PROPERTY         DESCRIPTION
    HBase cluster name       Your HDInsight HBase cluster name. For example: https://myhbase.azurehdinsight.net/
    Cluster user name        The Hadoop user account name. The default Hadoop user name is admin.
    Cluster user password    The Hadoop cluster user password.
  • A workstation with Visual Studio 2013 installed. For instructions, see Installing Visual Studio.

Create a Twitter application ID and secrets

The Twitter Streaming APIs use OAuth to authorize requests.

The first step to use OAuth is to create a new application on the Twitter Developer site.

To create Twitter application ID and secrets:

  1. Sign in to https://apps.twitter.com/. Click the Sign up now link if you don't have a Twitter account.
  2. Click Create New App.
  3. Enter Name, Description, and Website. The Website field is not really used. It doesn't have to be a valid URL. The following table shows some sample values to use:
    FIELD          VALUE
    Name           MyHDInsightHBaseApp
    Description    MyHDInsightHBaseApp
    Website        http://www.myhdinsighthbaseapp.com
  4. Check Yes, I agree, and then click Create your Twitter application.
  5. Click the Permissions tab. The default permission is Read only. This is sufficient for this tutorial.
  6. Click the API Keys tab.
  7. Click Create my access token.
  8. Click Test OAuth in the upper right corner of the page.
  9. Write down the API key, API secret, Access token, and Access token secret. You will need the values later in the tutorial.

Create a simple Twitter streaming service

Create a console application that gets Tweets, calculates a sentiment score for each Tweet, and sends the processed Tweet words to HBase.

To create the Visual Studio solution:

  1. Open Visual Studio.
  2. From the File menu, point to New, and then click Project.
  3. Type or select the following values:
    • Templates: Visual C#
    • Template: Console Application
    • Name: TweetSentimentStreaming
    • Location: C:\Tutorials
    • Solution name: TweetSentimentStreaming
  4. Click OK to continue.

To install NuGet packages and add SDK references:

  1. From the Tools menu, click NuGet Package Manager, and then click Package Manager Console. The console panel will open at the bottom of the page.
  2. Use the following commands to install the Tweetinvi package, which is used to access the Twitter API, and the Protobuf-net package, which is used to serialize and deserialize objects.
    Install-Package TweetinviAPI
    Install-Package protobuf-net
    NOTE:

    The Microsoft HBase SDK NuGet package is not available as of August 26th, 2014. The GitHub repo is https://github.com/hdinsight/hbase-sdk-for-net. Until the SDK is available, you must build the DLL yourself. For instructions, see Get started using HBase with Hadoop in HDInsight.

  3. From Solution Explorer, right-click References, and then click Add Reference.
  4. In the left pane, expand Assemblies, and then click Framework.
  5. In the right pane, select the checkbox in front of System.Configuration, and then click OK.

To define the Twitter streaming service class:

  1. From Solution Explorer, right-click TweetSentimentStreaming, point to Add, and then click Class.
  2. In Name, type HBaseWriter, and then click Add.
  3. In HBaseWriter.cs, add the following using statements at the top of the file:
    using System.IO;
    using System.Threading;
    using System.Globalization;
    using Microsoft.HBase.Client;
    using Tweetinvi.Core.Interfaces;
    using org.apache.hadoop.hbase.rest.protobuf.generated;
  4. Inside HBaseWriter.cs, add a new class called DictionaryItem:
    public class DictionaryItem
    {
        public string Type { get; set; }
        public int Length { get; set; }
        public string Word { get; set; }
        public string Pos { get; set; }
        public string Stemmed { get; set; }
        public string Polarity { get; set; }
    }

    This class structure is used to parse the sentiment dictionary file. The data is used to calculate sentiment score for each Tweet.

  5. Inside the HBaseWriter class, define the following constants and variables:
    // HDInsight HBase cluster and HBase table information
    const string CLUSTERNAME = "https://<HBaseClusterName>.azurehdinsight.net/";
    const string HADOOPUSERNAME = "<HadoopUserName>"; // the default name is "admin"
    const string HADOOPUSERPASSWORD = "<HadoopUserPassword>";
    const string HBASETABLENAME = "tweets_by_words";
    
    // Sentiment dictionary file and the punctuation characters
    const string DICTIONARYFILENAME = @"..\..\data\dictionary\dictionary.tsv";
    private static char[] _punctuationChars = new[] {
        ' ', '!', '\"', '#', '$', '%', '&', '\'', '(', ')', '*', '+', ',', '-', '.', '/',   // ASCII 32-47
        ':', ';', '<', '=', '>', '?', '@', '[', ']', '^', '_', '`', '{', '|', '}', '~' };   // ASCII 58-64 + misc.
    
    // For writing to HBase
    HBaseClient client;
    
    // A sentiment dictionary for estimating sentiment. It is loaded from a physical file.
    Dictionary<string, DictionaryItem> dictionary;
    
    // Use a separate thread for writing
    Thread writerThread;
    Queue<ITweet> queue = new Queue<ITweet>();
    bool threadRunning = true;
  6. Set the constant values, including <HBaseClusterName>, <HadoopUserName>, and <HadoopUserPassword>. If you want to change the HBase table name, you must change the table name in the Web application accordingly.

    You will download and move the dictionary.tsv file to a specific folder later in the tutorial.

  7. Define the following functions inside the HBaseWriter class:
    // This constructor connects to HBase, loads the sentiment dictionary, and starts the thread for writing.
    public HBaseWriter()
    {
        ClusterCredentials credentials = new ClusterCredentials(new Uri(CLUSTERNAME), HADOOPUSERNAME, HADOOPUSERPASSWORD);
        client = new HBaseClient(credentials);
    
        // create the HBase table if it doesn't exist
        if (!client.ListTables().name.Contains(HBASETABLENAME))
        {
            TableSchema tableSchema = new TableSchema();
            tableSchema.name = HBASETABLENAME;
            tableSchema.columns.Add(new ColumnSchema { name = "d" });
            client.CreateTable(tableSchema);
            Console.WriteLine("Table \"{0}\" is created.", HBASETABLENAME);
        }
    
        // Load sentiment dictionary from a file
        LoadDictionary();
    
        // Start a thread for writing to HBase
        writerThread = new Thread(new ThreadStart(WriterThreadFunction));
        writerThread.Start();
    }
    
    ~HBaseWriter()
    {
        threadRunning = false;
    }
    
    // Enqueue the Tweets received
    public void WriteTweet(ITweet tweet)
    {
        lock (queue)
        {
            queue.Enqueue(tweet);
        }
    }
    
    // Load sentiment dictionary from a file
    private void LoadDictionary()
    {
        List<string> lines = File.ReadAllLines(DICTIONARYFILENAME).ToList();
        var items = lines.Select(line =>
        {
            var fields = line.Split('\t');
            var pos = 0;
            return new DictionaryItem
            {
                Type = fields[pos++],
                Length = Convert.ToInt32(fields[pos++]),
                Word = fields[pos++],
                Pos = fields[pos++],
                Stemmed = fields[pos++],
                Polarity = fields[pos++]
            };
        });
    
        dictionary = new Dictionary<string, DictionaryItem>();
        foreach (var item in items)
        {
            if (!dictionary.ContainsKey(item.Word))
            {
                dictionary.Add(item.Word, item);
            }
        }
    }
    
    // Calculate sentiment score
    private int CalcSentimentScore(string[] words)
    {
        Int32 total = 0;
        foreach (string word in words)
        {
            if (dictionary.ContainsKey(word))
            {
                switch (dictionary[word].Polarity)
                {
                    case "negative": total -= 1; break;
                    case "positive": total += 1; break;
                }
            }
        }
        if (total > 0)
        {
            return 1;
        }
        else if (total < 0)
        {
            return -1;
        }
        else
        {
            return 0;
        }
    }
    
    // Populate a CellSet object to be written into HBase
    private void CreateTweetByWordsCells(CellSet set, ITweet tweet)
    {
        // Split the Tweet into words, dropping the empty strings produced by consecutive punctuation
        string[] words = tweet.Text.ToLower().Split(_punctuationChars, StringSplitOptions.RemoveEmptyEntries);
    
        // Calculate sentiment score based on the words
        int sentimentScore = CalcSentimentScore(words);
        // Build pairs of adjacent words so that two-word phrases can be queried too
        var word_pairs = words.Take(words.Length - 1)
                              .Select((word, idx) => string.Format("{0} {1}", word, words[idx + 1]));
        var all_words = words.Concat(word_pairs).ToList();
    
        // For each word in the Tweet add a row to the HBase table
        foreach (string word in all_words)
        {
            string time_index = (ulong.MaxValue - (ulong)tweet.CreatedAt.ToBinary()).ToString().PadLeft(20) + tweet.IdStr;
            string key = word + "_" + time_index;
    
            // Create a row
            var row = new CellSet.Row { key = Encoding.UTF8.GetBytes(key) };
    
            // Add columns to the row, including Tweet identifier, language, coordinates (if available), and sentiment
            var value = new Cell { column = Encoding.UTF8.GetBytes("d:id_str"), data = Encoding.UTF8.GetBytes(tweet.IdStr) };
            row.values.Add(value);
    
            value = new Cell { column = Encoding.UTF8.GetBytes("d:lang"), data = Encoding.UTF8.GetBytes(tweet.Language.ToString()) };
            row.values.Add(value);
    
            if (tweet.Coordinates != null)
            {
                var str = tweet.Coordinates.Longitude.ToString() + "," + tweet.Coordinates.Latitude.ToString();
                value = new Cell { column = Encoding.UTF8.GetBytes("d:coor"), data = Encoding.UTF8.GetBytes(str) };
                row.values.Add(value);
            }
    
            value = new Cell { column = Encoding.UTF8.GetBytes("d:sentiment"), data = Encoding.UTF8.GetBytes(sentimentScore.ToString()) };
            row.values.Add(value);
    
            set.rows.Add(row);
        }
    }
    
    // Write a Tweet (CellSet) to HBase
    public void WriterThreadFunction()
    {
        try
        {
            while (threadRunning)
            {
                if (queue.Count > 0)
                {
                    CellSet set = new CellSet();
                    lock (queue)
                    {
                        do
                        {
                            ITweet tweet = queue.Dequeue();
                            CreateTweetByWordsCells(set, tweet);
                        } while (queue.Count > 0);
                    }
    
                    // Write the Tweet by words cell set to the HBase table
                    client.StoreCells(HBASETABLENAME, set);
                    Console.WriteLine("\tRows written: {0}", set.rows.Count);
                }
                Thread.Sleep(100);
            }
        }
        catch (Exception ex)
        {
            Console.WriteLine("Exception: " + ex.Message);
        }
    }

    The code provides the following functionality:

    • Connect to HBase [ HBaseWriter() ]: Use the HBase SDK to create a ClusterCredentials object with the cluster URL and the Hadoop user credentials, and then create an HBaseClient object using the ClusterCredentials object.
    • Create HBase table [ HBaseWriter() ]: The method call is HBaseClient.CreateTable().
    • Write to HBase table [ WriterThreadFunction() ]: The method call is HBaseClient.StoreCells().
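
    The word splitting in CreateTweetByWordsCells() and the scoring in CalcSentimentScore() can be looked at independently of HBase. The following is a minimal JavaScript sketch of that logic (JavaScript is also the language of the heatmap script used by the Web application), not a port of the tutorial's C# code. The three-word inline dictionary, the function names, and the simplified "split on anything that is not a letter or digit" rule are assumptions made for illustration.

```javascript
// Hypothetical miniature dictionary; the tutorial loads the real one from dictionary.tsv.
const dictionary = new Map([
  ["love", "positive"],
  ["great", "positive"],
  ["hate", "negative"],
]);

// Lower-case the text, split on runs of non-letter/non-digit characters,
// and drop empty strings. (Simplified rule, assumed for this sketch.)
function tokenize(text) {
  return text
    .toLowerCase()
    .split(/[^\p{L}\p{N}]+/u)
    .filter((word) => word.length > 0);
}

// Sum +1 / -1 per known word, then collapse the total to -1, 0, or 1.
function sentimentScore(words) {
  let total = 0;
  for (const word of words) {
    const polarity = dictionary.get(word);
    if (polarity === "positive") total += 1;
    else if (polarity === "negative") total -= 1;
  }
  return Math.sign(total);
}

// Single words followed by pairs of adjacent words; each entry gets its own row.
function wordsAndPairs(words) {
  const pairs = words.slice(0, -1).map((word, i) => word + " " + words[i + 1]);
  return words.concat(pairs);
}

const words = tokenize("I love Azure, it is great!");
console.log(sentimentScore(words)); // prints 1
console.log(wordsAndPairs(["big", "data", "rocks"]));
```

    Because the score is collapsed to its sign, a Tweet with five positive words and one with a single positive word are stored with the same sentiment value of 1.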

To complete the Program.cs:

  1. From Solution Explorer, double-click Program.cs to open it.
  2. At the beginning of the file, add the following using statements:
    using System.Configuration;
    using System.Diagnostics;
    using Tweetinvi;
  3. Inside the Program class, define the following constants:
    const string TWITTERAPPACCESSTOKEN = "<TwitterApplicationAccessToken>";
    const string TWITTERAPPACCESSTOKENSECRET = "<TwitterApplicationAccessTokenSecret>";
    const string TWITTERAPPAPIKEY = "<TwitterApplicationAPIKey>";
    const string TWITTERAPPAPISECRET = "<TwitterApplicationAPISecret>";
  4. Set the constant values to match your Twitter application values.
  5. Modify the Main() function, so it looks like:
    static void Main(string[] args)
    {
        TwitterCredentials.SetCredentials(TWITTERAPPACCESSTOKEN, TWITTERAPPACCESSTOKENSECRET, TWITTERAPPAPIKEY, TWITTERAPPAPISECRET);
    
        Stream_FilteredStreamExample();
    }
  6. Add the following function to the class:
    private static void Stream_FilteredStreamExample()
    {
        for (; ; )
        {
            try
            {
                HBaseWriter hbase = new HBaseWriter();
                var stream = Stream.CreateFilteredStream();
                stream.AddLocation(Geo.GenerateLocation(-180, -90, 180, 90));
    
                var tweetCount = 0;
                var timer = Stopwatch.StartNew();
    
                stream.MatchingTweetReceived += (sender, args) =>
                {
                    tweetCount++;
                    var tweet = args.Tweet;
    
                    // Write Tweets to HBase
                    hbase.WriteTweet(tweet);
    
                    if (timer.ElapsedMilliseconds > 1000)
                    {
                        if (tweet.Coordinates != null)
                        {
                            Console.ForegroundColor = ConsoleColor.Green;
                            Console.WriteLine("\n{0}: {1} {2}", tweet.Id, tweet.Language.ToString(), tweet.Text);
                            Console.ForegroundColor = ConsoleColor.White;
                            Console.WriteLine("\tLocation: {0}, {1}", tweet.Coordinates.Longitude, tweet.Coordinates.Latitude);
                        }
    
                        timer.Restart();
                        Console.WriteLine("\tTweets/sec: {0}", tweetCount);
                        tweetCount = 0;
                    }
                };
    
                stream.StartStreamMatchingAllConditions();
            }
            catch (Exception ex)
            {
                Console.WriteLine("Exception: {0}", ex.Message);
            }
        }
    }
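
The event handler above only hands each Tweet to HBaseWriter.WriteTweet(), which enqueues it; the writer thread later drains the queue and stores everything collected so far in one call. A minimal single-threaded JavaScript sketch of that batching idea follows. The BatchQueue class and the writeBatch callback are hypothetical names, and there is no locking because this sketch runs on a single thread.

```javascript
// Collects items and hands them to a writer callback one batch at a time.
class BatchQueue {
  constructor(writeBatch) {
    this.items = [];
    this.writeBatch = writeBatch; // hypothetical callback standing in for the HBase write
  }

  // Producer side: called once per received item.
  enqueue(item) {
    this.items.push(item);
  }

  // Consumer side: write everything queued so far as one batch.
  // Returns the number of items written.
  flush() {
    if (this.items.length === 0) return 0;
    const batch = this.items;
    this.items = [];
    this.writeBatch(batch);
    return batch.length;
  }
}

const written = [];
const queue = new BatchQueue((batch) => written.push(batch));
queue.enqueue("tweet 1");
queue.enqueue("tweet 2");
console.log(queue.flush()); // prints 2
console.log(queue.flush()); // prints 0
```

Batching keeps the number of round trips to the cluster low when Tweets arrive faster than they can be written one by one.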

To download the sentiment dictionary file:

  1. Browse to https://github.com/maxluk/tweet-sentiment.
  2. Click Download ZIP.
  3. Extract the file locally.
  4. Copy the file from ../tweet-sentiment/SimpleStreamingService/data/dictionary/dictionary.tsv.
  5. Paste the file to your solution under TweetSentimentStreaming/TweetSentimentStreaming/data/dictionary/dictionary.tsv.
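
LoadDictionary() reads each line of dictionary.tsv as six tab-separated fields in the order Type, Length, Word, Pos, Stemmed, and Polarity. As a rough illustration of that layout, the JavaScript sketch below parses lines of this shape. The three sample lines are invented for the sketch and are not copied from the real file, and the function names are assumptions.

```javascript
// Parse one tab-separated dictionary line into an object.
// Field order follows the DictionaryItem class in HBaseWriter.cs.
function parseDictionaryLine(line) {
  const [type, length, word, pos, stemmed, polarity] = line.split("\t");
  return { type, length: Number.parseInt(length, 10), word, pos, stemmed, polarity };
}

// Build a word -> item lookup, keeping the first entry when a word repeats
// (the C# LoadDictionary method also keeps the first occurrence).
function buildDictionary(lines) {
  const lookup = new Map();
  for (const line of lines) {
    if (line.trim() === "") continue; // skipping blank lines is an addition in this sketch
    const item = parseDictionaryLine(line);
    if (!lookup.has(item.word)) lookup.set(item.word, item);
  }
  return lookup;
}

// Hypothetical sample lines, invented for this sketch.
const sampleLines = [
  "strong\t1\thappy\tadj\tn\tpositive",
  "weak\t1\tsad\tadj\tn\tnegative",
  "weak\t1\thappy\tverb\tn\tnegative",
];

const dict = buildDictionary(sampleLines);
console.log(dict.size); // prints 2
console.log(dict.get("happy").polarity); // prints positive
```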

To run the streaming service:

  1. From Visual Studio, press F5 to start the console application.
  2. Keep the streaming console application running while you develop the Web application, so that you have more data to use.

Create an Azure Website to visualize Twitter sentiment

In this section, you will create an ASP.NET MVC Web application to read the real-time sentiment data from HBase and plot the data on Bing maps.

To create an ASP.NET MVC Web application:

  1. Open Visual Studio.
  2. Click File, click New, and then click Project.
  3. Type or enter the following:
    • Template category: Visual C#/Web
    • Template: ASP.NET Web Application
    • Name: TweetSentimentWeb
    • Location: C:\Tutorials
  4. Click OK.
  5. In Select a template, click MVC.
  6. In Windows Azure, click Manage Subscriptions.
  7. From Manage Windows Azure Subscriptions, click Sign in.
  8. Enter your Azure credential. Your Azure subscription information will be shown on the Accounts tab.
  9. Click Close to close the Manage Windows Azure Subscriptions window.
  10. From New ASP.NET Project - TweetSentimentWeb, click OK.
  11. From Configure Windows Azure Site Settings, select the Region that is closest to you. You don't need to specify a database server.
  12. Click OK.

To install NuGet packages:

  1. From the Tools menu, click NuGet Package Manager, and then click Package Manager Console. The console panel is opened at the bottom of the page.
  2. Use the following command to install the Protobuf-net package, which is used to serialize and deserialize objects.
    Install-Package protobuf-net
    NOTE:

    The Microsoft HBase SDK NuGet package is not available as of August 20th, 2014. The GitHub repo is https://github.com/hdinsight/hbase-sdk-for-net. Until the SDK is available, you must build the DLL yourself. For instructions, see Get started using HBase with Hadoop in HDInsight.

To add the HBaseReader class:

  1. From Solution Explorer, expand TweetSentimentWeb.
  2. Right-click Models, click Add, and then click Class.
  3. In Name, enter HBaseReader.cs, and then click Add.
  4. Replace the code with the following:
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Web;
    
    using System.Configuration;
    using System.Threading.Tasks;
    using System.Text;
    using Microsoft.HBase.Client;
    using org.apache.hadoop.hbase.rest.protobuf.generated;
    
    namespace TweetSentimentWeb.Models
    {
        public class HBaseReader
        {
            // For reading Tweet sentiment data from HDInsight HBase
            HBaseClient client;
    
            // HDinsight HBase cluster and HBase table information
            const string CLUSTERNAME = "https://<HBaseClusterName>.azurehdinsight.net/";
            const string HADOOPUSERNAME = "<HBaseClusterHadoopUserName>";
            const string HADOOPUSERPASSWORD = "<HBaseClusterUserPassword>";
            const string HBASETABLENAME = "tweets_by_words";
    
            // The constructor
            public HBaseReader()
            {
                ClusterCredentials creds = new ClusterCredentials(
                                new Uri(CLUSTERNAME),
                                HADOOPUSERNAME,
                                HADOOPUSERPASSWORD);
                client = new HBaseClient(creds);
            }
    
            // Query Tweets sentiment data from the HBase table asynchronously
            public async Task<IEnumerable<Tweet>> QueryTweetsByKeywordAsync(string keyword)
            {
                List<Tweet> list = new List<Tweet>();
    
                // Demonstrate filtering the data from the past 6 hours by using the row key
                string timeIndex = (ulong.MaxValue -
                    (ulong)DateTime.UtcNow.Subtract(new TimeSpan(6, 0, 0)).ToBinary()).ToString().PadLeft(20);
                string startRow = keyword + "_" + timeIndex;
                string endRow = keyword + "|";
                Scanner scanSettings = new Scanner
                {
                    batch = 100000,
                    startRow = Encoding.UTF8.GetBytes(startRow),
                    endRow = Encoding.UTF8.GetBytes(endRow)
                };
    
                // Make async scan call
                ScannerInformation scannerInfo =
                    await client.CreateScannerAsync(HBASETABLENAME, scanSettings);
    
                CellSet next;
    
                while ((next = await client.ScannerGetNextAsync(scannerInfo)) != null)
                {
                    foreach (CellSet.Row row in next.rows)
                    {
                        // find the cell with string pattern "d:coor"
                        var coordinates =
                            row.values.Find(c => Encoding.UTF8.GetString(c.column) == "d:coor");
    
                        if (coordinates != null)
                        {
                            string[] lonlat = Encoding.UTF8.GetString(coordinates.data).Split(',');
    
                            var sentimentField =
                                row.values.Find(c => Encoding.UTF8.GetString(c.column) == "d:sentiment");
                            Int32 sentiment = 0;
                            if (sentimentField != null)
                            {
                                sentiment = Convert.ToInt32(Encoding.UTF8.GetString(sentimentField.data));
                            }
    
                            list.Add(new Tweet
                            {
                                Longtitude = Convert.ToDouble(lonlat[0]),
                                Latitude = Convert.ToDouble(lonlat[1]),
                                Sentiment = sentiment
                            });
                        }
    
                    }
                }
    
                return list;
            }
        }
    
        public class Tweet
        {
            public string IdStr { get; set; }
            public string Text { get; set; }
            public string Lang { get; set; }
            public double Longtitude { get; set; }
            public double Latitude { get; set; }
            public int Sentiment { get; set; }
        }
    }
  5. Inside the HBaseReader class, change the constant values:
    • CLUSTERNAME: The HBase cluster URL. For example, https://<HBaseClusterName>.azurehdinsight.net/.
    • HADOOPUSERNAME: The HBase cluster Hadoop user name. The default name is admin.
    • HADOOPUSERPASSWORD: The HBase cluster Hadoop user password.
    • HBASETABLENAME: The HBase table name. The default is "tweets_by_words".

    The values must match the values you set in the streaming service, so that the Web application reads the data from the same HBase table.
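
    Both applications depend on the same row key layout: the word, an underscore, a reversed time index padded to 20 characters, and the Tweet ID. The JavaScript sketch below illustrates why subtracting the timestamp from the largest 64-bit value makes newer Tweets sort first, and why a start row of keyword + "_" and an end row of keyword + "|" bracket the keys for a keyword. It is a simplified model: it uses millisecond timestamps rather than the DateTime.ToBinary() value used in the C# code, it assumes the usual scan convention of an inclusive start row and an exclusive end row, and the helper names are hypothetical.

```javascript
const MAX_UINT64 = 2n ** 64n - 1n;

// Reversed time index, left-padded to 20 characters like PadLeft(20) in the C# code.
function timeIndex(timestampMs) {
  return (MAX_UINT64 - BigInt(timestampMs)).toString().padStart(20, " ");
}

// Row key: word, underscore, reversed time index, Tweet ID.
function rowKey(word, timestampMs, tweetId) {
  return word + "_" + timeIndex(timestampMs) + tweetId;
}

// Model of a scan: keys k with startRow <= k < endRow are returned.
function inScanRange(key, startRow, endRow) {
  return startRow <= key && key < endRow;
}

const older = rowKey("azure", Date.UTC(2014, 7, 26, 10), "1001");
const newer = rowKey("azure", Date.UTC(2014, 7, 26, 12), "1002");

console.log(newer < older); // prints true: the newer Tweet sorts first
console.log(inScanRange(newer, "azure_", "azure|")); // prints true
console.log(inScanRange(rowKey("bing", 0, "1"), "azure_", "azure|")); // prints false
```

    The end row works because "_" sorts before "|" in byte order, so every key that starts with the keyword and an underscore falls below keyword + "|".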

To add the TweetsController controller:

  1. From Solution Explorer, expand TweetSentimentWeb.
  2. Right-click Controllers, click Add, and then click Controller.
  3. Click Web API 2 Controller - Empty, and then click Add.
  4. In Controller name, type TweetsController, and then click Add.
  5. From Solution Explorer, double-click TweetsController.cs to open the file.
  6. Modify the file, so it looks like the following:
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Net;
    using System.Net.Http;
    using System.Web.Http;
    
    using System.Threading.Tasks;
    using TweetSentimentWeb.Models;
    
    namespace TweetSentimentWeb.Controllers
    {
        public class TweetsController : ApiController
        {
            HBaseReader hbase = new HBaseReader();
    
            public async Task<IEnumerable<Tweet>> GetTweetsByQuery(string query)
            {
                return await hbase.QueryTweetsByKeywordAsync(query);
            }
        }
    }
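
A client page can call this controller over HTTP and work with the returned list. The JavaScript sketch below shows one way to build the request URL and summarize a response. It assumes the default Web API route template "api/{controller}/{id}", so that GetTweetsByQuery is reached at /api/tweets?query=<keyword>, and it assumes the JSON keeps the property names of the Tweet class. The base URL, the sample response, and the function names are hypothetical.

```javascript
// Build the request URL for a keyword. The "/api/tweets" path assumes the
// default Web API route template "api/{controller}/{id}".
function buildTweetsUrl(baseUrl, keyword) {
  return baseUrl.replace(/\/+$/, "") + "/api/tweets?query=" + encodeURIComponent(keyword);
}

// Count positive, neutral, and negative Tweets in a response array.
function countBySentiment(tweets) {
  const counts = { positive: 0, neutral: 0, negative: 0 };
  for (const tweet of tweets) {
    if (tweet.Sentiment > 0) counts.positive += 1;
    else if (tweet.Sentiment < 0) counts.negative += 1;
    else counts.neutral += 1;
  }
  return counts;
}

// Hypothetical response body, shaped like the Tweet class in HBaseReader.cs.
const sampleResponse = [
  { Longtitude: -122.3, Latitude: 47.6, Sentiment: 1 },
  { Longtitude: 2.35, Latitude: 48.85, Sentiment: -1 },
  { Longtitude: 139.7, Latitude: 35.7, Sentiment: 1 },
];

console.log(buildTweetsUrl("http://localhost:1234/", "big data"));
// prints http://localhost:1234/api/tweets?query=big%20data
console.log(countBySentiment(sampleResponse));
```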

To add heatmap.js:

  1. From Solution Explorer, expand TweetSentimentWeb.
  2. Right-click Scripts, click Add, and then click JavaScript File.
  3. In Item name, enter heatmap.js.
  4. Copy and paste the following code into the file. The code was written by Alastair Aitchison. For more information, see http://alastaira.wordpress.com/2011/04/15/bing-maps-ajax-v7-heatmap-library/.
    /*******************************************************************************
    * Author: Alastair Aitchison
    * Website: http://alastaira.wordpress.com
    * Date: 15th April 2011
    *
    * Description:
    * This JavaScript file provides an algorithm that can be used to add a heatmap
    * overlay on a Bing Maps v7 control. The intensity and temperature palette
    * of the heatmap are designed to be easily customisable.
    *
    * Requirements:
    * The heatmap layer itself is created dynamically on the client-side using
    * the HTML5 <canvas> element, and therefore requires a browser that supports
    * this element. It has been tested on IE9, Firefox 3.6/4 and
    * Chrome 10 browsers. If you can confirm whether it works on other browsers or
    * not, I'd love to hear from you!
    
    * Usage:
    * The HeatMapLayer constructor requires:
    * - A reference to a map object
    * - An array of Microsoft.Maps.Location items
    * - Optional parameters to customise the appearance of the layer
    *  (Radius, Unit, Intensity, and ColourGradient), and a callback function
    *
    */
    
    var HeatMapLayer = function (map, locations, options) {
    
        /* Private Properties */
        var _map = map,
          _canvas,
          _temperaturemap,
          _locations = [],
          _viewchangestarthandler,
          _viewchangeendhandler;
    
        // Set default options
        var _options = {
            // Opacity at the centre of each heat point
            intensity: 0.5,
    
            // Affected radius of each heat point
            radius: 1000,
    
            // Whether the radius is an absolute pixel value or meters
            unit: 'meters',
    
            // Colour temperature gradient of the map
            colourgradient: {
                "0.00": 'rgba(255,0,255,20)',  // Magenta
                "0.25": 'rgba(0,0,255,40)',    // Blue
                "0.50": 'rgba(0,255,0,80)',    // Green
                "0.75": 'rgba(255,255,0,120)', // Yellow
                "1.00": 'rgba(255,0,0,150)'    // Red
            },
    
            // Callback function to be fired after heatmap layer has been redrawn
            callback: null
        };
    
        /* Private Methods */
        function _init() {
            var _mapDiv = _map.getRootElement();
    
            if (_mapDiv.childNodes.length >= 3 && _mapDiv.childNodes[2].childNodes.length >= 2) {
                // Create the canvas element
                _canvas = document.createElement('canvas');
                _canvas.style.position = 'relative';
    
                var container = document.createElement('div');
                container.style.position = 'absolute';
                container.style.left = '0px';
                container.style.top = '0px';
                container.appendChild(_canvas);
    
                _mapDiv.childNodes[2].childNodes[1].appendChild(container);
    
                // Override defaults with any options passed in the constructor
                _setOptions(options);
    
                // Load array of location data
                _setPoints(locations);
    
                // Create a colour gradient from the supplied colourstops
                _temperaturemap = _createColourGradient(_options.colourgradient);
    
                // Wire up the event handler to redraw heatmap canvas
                _viewchangestarthandler = Microsoft.Maps.Events.addHandler(_map, 'viewchangestart', _clearHeatMap);
                _viewchangeendhandler = Microsoft.Maps.Events.addHandler(_map, 'viewchangeend', _createHeatMap);
    
                _createHeatMap();
    
                delete _init;
            } else {
                setTimeout(_init, 100);
            }
        }
    
        // Resets the heat map
        function _clearHeatMap() {
            var ctx = _canvas.getContext("2d");
            ctx.clearRect(0, 0, _canvas.width, _canvas.height);
        }
    
        // Creates a colour gradient from supplied colour stops on initialisation
        function _createColourGradient(colourstops) {
            var ctx = document.createElement('canvas').getContext('2d');
            var grd = ctx.createLinearGradient(0, 0, 256, 0);
            for (var c in colourstops) {
                grd.addColorStop(c, colourstops[c]);
            }
            ctx.fillStyle = grd;
            ctx.fillRect(0, 0, 256, 1);
            return ctx.getImageData(0, 0, 256, 1).data;
        }
    
        // Applies a colour gradient to the intensity map
        function _colouriseHeatMap() {
            var ctx = _canvas.getContext("2d");
            var dat = ctx.getImageData(0, 0, _canvas.width, _canvas.height);
            var pix = dat.data; // pix is a CanvasPixelArray containing height x width x 4 bytes of data (RGBA)
            for (var p = 0, len = pix.length; p < len;) {
                var a = pix[p + 3] * 4; // get the alpha of this pixel
                if (a != 0) { // If there is any data to plot
                    pix[p] = _temperaturemap[a]; // set the red value of the gradient that corresponds to this alpha
                    pix[p + 1] = _temperaturemap[a + 1]; //set the green value based on alpha
                    pix[p + 2] = _temperaturemap[a + 2]; //set the blue value based on alpha
                }
                p += 4; // Move on to the next pixel
            }
            ctx.putImageData(dat, 0, 0);
        }
    
        // Sets any options passed in
        function _setOptions(options) {
            for (var attrname in options) {
                _options[attrname] = options[attrname];
            }
        }
    
        // Sets the heatmap points from an array of Microsoft.Maps.Locations
        function _setPoints(locations) {
            _locations = locations;
        }
    
        // Main method to draw the heatmap
        function _createHeatMap() {
            // Ensure the canvas matches the current dimensions of the map
            // This also has the effect of resetting the canvas
            _canvas.height = _map.getHeight();
            _canvas.width = _map.getWidth();
    
            _canvas.style.top = -_canvas.height / 2 + 'px';
            _canvas.style.left = -_canvas.width / 2 + 'px';
    
            // Calculate the pixel radius of each heatpoint at the current map zoom
            var radiusInPixel; // declared locally to avoid creating an implicit global
            if (_options.unit == "pixels") {
                radiusInPixel = _options.radius;
            } else {
                radiusInPixel = _options.radius / _map.getMetersPerPixel();
            }
    
            var ctx = _canvas.getContext("2d");
    
            // Convert lat/long to pixel location
            var pixlocs = _map.tryLocationToPixel(_locations, Microsoft.Maps.PixelReference.control);
            var shadow = 'rgba(0, 0, 0, ' + _options.intensity + ')';
            var mapWidth = 256 * Math.pow(2, _map.getZoom());
    
            // Create the Intensity Map by looping through each location
            for (var i = 0, len = pixlocs.length; i < len; i++) {
                var x = pixlocs[i].x;
                var y = pixlocs[i].y;
    
                if (x < 0) {
                    x += mapWidth * Math.ceil(Math.abs(x / mapWidth));
                }
    
                // Create radial gradient centred on this point
                var grd = ctx.createRadialGradient(x, y, 0, x, y, radiusInPixel);
                grd.addColorStop(0.0, shadow);
                grd.addColorStop(1.0, 'transparent');
    
                // Draw the heatpoint onto the canvas
                ctx.fillStyle = grd;
                ctx.fillRect(x - radiusInPixel, y - radiusInPixel, 2 * radiusInPixel, 2 * radiusInPixel);
            }
    
            // Apply the specified colour gradient to the intensity map
            _colouriseHeatMap();
    
            // Call the callback function, if specified
            if (_options.callback) {
                _options.callback();
            }
        }
    
        /* Public Methods */
    
        this.Show = function () {
            if (_canvas) {
                _canvas.style.display = '';
            }
        };
    
        this.Hide = function () {
            if (_canvas) {
                _canvas.style.display = 'none';
            }
        };
    
        // Sets options for intensity, radius, colourgradient etc.
        this.SetOptions = function (options) {
            _setOptions(options);
        }
    
        // Sets an array of Microsoft.Maps.Locations from which the heatmap is created
        this.SetPoints = function (locations) {
            // Reset the existing heatmap layer
            _clearHeatMap();
            // Pass in the new set of locations
            _setPoints(locations);
            // Recreate the layer
            _createHeatMap();
        }
    
        // Removes the heatmap layer from the DOM
        this.Remove = function () {
            _canvas.parentNode.parentNode.removeChild(_canvas.parentNode);
    
            if (_viewchangestarthandler) { Microsoft.Maps.Events.removeHandler(_viewchangestarthandler); }
            if (_viewchangeendhandler) { Microsoft.Maps.Events.removeHandler(_viewchangeendhandler); }
    
            _locations = null;
            _temperaturemap = null;
            _canvas = null;
            _options = null;
            _viewchangestarthandler = null;
            _viewchangeendhandler = null;
        }
    
        // Call the initialisation routine
        _init();
    };
    
    // Call the Module Loaded method
    Microsoft.Maps.moduleLoaded('HeatMapModule');
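
Two calculations inside _createHeatMap are easy to check in isolation: converting a radius given in meters to pixels, and shifting a negative pixel x-coordinate back onto the map by whole world widths. The following is a minimal sketch of both and is not part of heatmap.js. The function names radiusToPixels and wrapPixelX are hypothetical, and the 256-pixel base width is taken from the mapWidth expression in the module.

```javascript
// Hypothetical helpers that mirror two calculations in _createHeatMap.

// Convert the configured radius to pixels. When the unit is "pixels" the
// radius is used as-is; otherwise it is treated as meters and divided by
// the map's current meters-per-pixel resolution.
function radiusToPixels(radius, unit, metersPerPixel) {
    if (unit === "pixels") {
        return radius;
    }
    return radius / metersPerPixel;
}

// Shift a negative x-coordinate right by a whole number of world widths.
// The world is 256 * 2^zoom pixels wide at a given zoom level.
function wrapPixelX(x, zoom) {
    var mapWidth = 256 * Math.pow(2, zoom);
    if (x < 0) {
        x += mapWidth * Math.ceil(Math.abs(x / mapWidth));
    }
    return x;
}
```

For example, at zoom level 1 the world is 512 pixels wide, so an x-coordinate of -10 is moved to 502, while non-negative coordinates are left unchanged.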

To add twitterStream.js:

  1. From Solution Explorer, expand TweetSentimentWeb.
  2. Right-click Scripts, click Add, click JavaScript File.
  3. In Item name, enter twitterStream.js.
  4. Copy and paste the following code into the file:
    var liveTweetsPos = [];
    var liveTweets = [];
    var liveTweetsNeg = [];
    var map;
    var heatmap;
    var heatmapNeg;
    var heatmapPos;
    
    function initialize() {
        // Initialize the map
        var options = {
            credentials: "AvFJTZPZv8l3gF8VC3Y7BPBd0r7LKo8dqKG02EAlqg9WAi0M7la6zSIT-HwkMQbx",
            center: new Microsoft.Maps.Location(23.0, 8.0),
            mapTypeId: Microsoft.Maps.MapTypeId.ordnanceSurvey,
            labelOverlay: Microsoft.Maps.LabelOverlay.hidden,
            zoom: 2.5
        };
        // Assign to the global map variable declared above instead of shadowing it
        map = new Microsoft.Maps.Map(document.getElementById('map_canvas'), options);
    
        // Heatmap options for positive, neutral and negative layers
    
        var heatmapOptions = {
            // Opacity at the centre of each heat point
            intensity: 0.5,
    
            // Affected radius of each heat point
            radius: 15,
    
            // Whether the radius is an absolute pixel value or meters
            unit: 'pixels'
        };
    
        var heatmapPosOptions = {
            // Opacity at the centre of each heat point
            intensity: 0.5,
    
            // Affected radius of each heat point
            radius: 15,
    
            // Whether the radius is an absolute pixel value or meters
            unit: 'pixels',
    
            colourgradient: {
                0.0: 'rgba(0, 255, 255, 0)',
                0.1: 'rgba(0, 255, 255, 1)',
                0.2: 'rgba(0, 255, 191, 1)',
                0.3: 'rgba(0, 255, 127, 1)',
                0.4: 'rgba(0, 255, 63, 1)',
                0.5: 'rgba(0, 127, 0, 1)',
                0.7: 'rgba(0, 159, 0, 1)',
                0.8: 'rgba(0, 191, 0, 1)',
                0.9: 'rgba(0, 223, 0, 1)',
                1.0: 'rgba(0, 255, 0, 1)'
            }
        };
    
        var heatmapNegOptions = {
            // Opacity at the centre of each heat point
            intensity: 0.5,
    
            // Affected radius of each heat point
            radius: 15,
    
            // Whether the radius is an absolute pixel value or meters
            unit: 'pixels',
    
            colourgradient: {
                0.0: 'rgba(0, 255, 255, 0)',
                0.1: 'rgba(0, 255, 255, 1)',
                0.2: 'rgba(0, 191, 255, 1)',
                0.3: 'rgba(0, 127, 255, 1)',
                0.4: 'rgba(0, 63, 255, 1)',
                0.5: 'rgba(0, 0, 127, 1)',
                0.7: 'rgba(0, 0, 159, 1)',
                0.8: 'rgba(0, 0, 191, 1)',
                0.9: 'rgba(0, 0, 223, 1)',
                1.0: 'rgba(0, 0, 255, 1)'
            }
        };
    
        // Register and load the Client Side HeatMap Module
        Microsoft.Maps.registerModule("HeatMapModule", "scripts/heatmap.js");
        Microsoft.Maps.loadModule("HeatMapModule", {
            callback: function () {
                // Create heatmap layers for positive, neutral and negative tweets
                heatmapPos = new HeatMapLayer(map, liveTweetsPos, heatmapPosOptions);
                heatmap = new HeatMapLayer(map, liveTweets, heatmapOptions);
                heatmapNeg = new HeatMapLayer(map, liveTweetsNeg, heatmapNegOptions);
            }
        });
    
        $("#searchbox").val("xbox");
        $("#searchBtn").click(onsearch);
        $("#positiveBtn").click(onPositiveBtn);
        $("#negativeBtn").click(onNegativeBtn);
        $("#neutralBtn").click(onNeutralBtn);
        $("#neutralBtn").button("toggle");
    }
    
    function onsearch() {
        var uri = 'api/tweets?query=';
        var query = $('#searchbox').val();
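        // URL-encode the keyword so that spaces and characters such as '#' or '&' are sent intact
        $.getJSON(uri + encodeURIComponent(query))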
            .done(function (data) {
                liveTweetsPos = [];
                liveTweets = [];
                liveTweetsNeg = [];
    
                // On success, 'data' contains a list of tweets.
                $.each(data, function (key, item) {
                    addTweet(item);
                });
    
                if (!$("#neutralBtn").hasClass('active')) {
                    $("#neutralBtn").button("toggle");
                }
                onNeutralBtn();
            })
            .fail(function (jqXHR, textStatus, err) {
                $('#statustext').text('Error: ' + err);
            });
    }
    
    function addTweet(item) {
        //Add tweet to the heat map arrays.
        var tweetLocation = new Microsoft.Maps.Location(item.Latitude, item.Longtitude);
        if (item.Sentiment > 0) {
            liveTweetsPos.push(tweetLocation);
        } else if (item.Sentiment < 0) {
            liveTweetsNeg.push(tweetLocation);
        } else {
            liveTweets.push(tweetLocation);
        }
    }
    
    function onPositiveBtn() {
        if ($("#neutralBtn").hasClass('active')) {
            $("#neutralBtn").button("toggle");
        }
        if ($("#negativeBtn").hasClass('active')) {
            $("#negativeBtn").button("toggle");
        }
    
        heatmapPos.SetPoints(liveTweetsPos);
        heatmapPos.Show();
        heatmapNeg.Hide();
        heatmap.Hide();
    
        $('#statustext').text('Tweets: ' + liveTweetsPos.length + "   " + getPosNegRatio());
    }
    
    function onNeutralBtn() {
        if ($("#positiveBtn").hasClass('active')) {
            $("#positiveBtn").button("toggle");
        }
        if ($("#negativeBtn").hasClass('active')) {
            $("#negativeBtn").button("toggle");
        }
    
        heatmap.SetPoints(liveTweets);
        heatmap.Show();
        heatmapNeg.Hide();
        heatmapPos.Hide();
    
        $('#statustext').text('Tweets: ' + liveTweets.length + "   " + getPosNegRatio());
    }
    
    function onNegativeBtn() {
        if ($("#positiveBtn").hasClass('active')) {
            $("#positiveBtn").button("toggle");
        }
        if ($("#neutralBtn").hasClass('active')) {
            $("#neutralBtn").button("toggle");
        }
    
        heatmapNeg.SetPoints(liveTweetsNeg);
        heatmapNeg.Show();
        heatmap.Hide();
        heatmapPos.Hide();
    
        $('#statustext').text('Tweets: ' + liveTweetsNeg.length + "   " + getPosNegRatio());
    }
    
    function getPosNegRatio() {
        if (liveTweetsNeg.length == 0) {
            return "";
        }
        else {
            var ratio = liveTweetsPos.length / liveTweetsNeg.length;
            var str = parseFloat(Math.round(ratio * 10) / 10).toFixed(1);
            return "Positive/Negative Ratio: " + str;
        }
    }
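
The sentiment handling in twitterStream.js comes down to two steps: sort each tweet into a positive, neutral, or negative list by the sign of its Sentiment value, then report the positive-to-negative ratio rounded to one decimal place. The following is a minimal sketch of that logic without the map dependencies. The names bucketBySentiment and posNegRatio are hypothetical, and the sample items are made up for illustration.

```javascript
// Hypothetical helper: group tweets by the sign of their Sentiment value,
// mirroring the branches in addTweet.
function bucketBySentiment(items) {
    var buckets = { positive: [], neutral: [], negative: [] };
    items.forEach(function (item) {
        if (item.Sentiment > 0) {
            buckets.positive.push(item);
        } else if (item.Sentiment < 0) {
            buckets.negative.push(item);
        } else {
            buckets.neutral.push(item);
        }
    });
    return buckets;
}

// Hypothetical helper: positive/negative ratio as text with one decimal
// place, or an empty string when there are no negative tweets.
function posNegRatio(positiveCount, negativeCount) {
    if (negativeCount === 0) {
        return "";
    }
    var ratio = positiveCount / negativeCount;
    return (Math.round(ratio * 10) / 10).toFixed(1);
}

// Made-up sample data: three positive, one neutral, two negative.
var sample = [
    { Sentiment: 1 }, { Sentiment: 1 }, { Sentiment: 1 },
    { Sentiment: 0 },
    { Sentiment: -1 }, { Sentiment: -1 }
];
var buckets = bucketBySentiment(sample);
var ratioText = posNegRatio(buckets.positive.length, buckets.negative.length);
```

Returning an empty string when there are no negative tweets avoids a division by zero, which is the same guard getPosNegRatio uses.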

To modify the _Layout.cshtml file:

  1. From Solution Explorer, expand TweetSentimentWeb, expand Views, expand Shared, and then double-click _Layout.cshtml.
  2. Replace the content with the following:
    <!DOCTYPE html>
    <html>
    <head>
        <meta charset="utf-8" />
        <meta name="viewport" content="width=device-width, initial-scale=1.0">
        <title>@ViewBag.Title</title>
        @Styles.Render("~/Content/css")
        @Scripts.Render("~/bundles/modernizr")
        <!-- Bing Maps -->
        <script type="text/javascript" src="http://ecn.dev.virtualearth.net/mapcontrol/mapcontrol.ashx?v=7.0&mkt=en-gb"></script>
        <!-- Spatial Dashboard JavaScript -->
        <script src="~/Scripts/twitterStream.js" type="text/javascript"></script>
    </head>
    <body onload="initialize()">
        <div class="navbar navbar-inverse navbar-fixed-top">
            <div class="container">
                <div class="navbar-header">
                    <button type="button" class="navbar-toggle" data-toggle="collapse" data-target=".navbar-collapse">
                        <span class="icon-bar"></span>
                        <span class="icon-bar"></span>
                        <span class="icon-bar"></span>
                    </button>
                </div>
                <div class="navbar-collapse collapse">
                    <div class="row">
                        <ul class="nav navbar-nav col-lg-5">
                            <li class="col-lg-12">
                                <div class="navbar-form">
                                    <input id="searchbox" type="search" class="form-control">
                                    <button type="button" id="searchBtn" class="btn btn-primary">Go</button>
                                </div>
                            </li>
                        </ul>
                        <ul class="nav navbar-nav col-lg-7">
                            <li>
                                <div class="navbar-form">
                                    <div class="btn-group" data-toggle="buttons-radio">
                                        <button type="button" id="positiveBtn" class="btn btn-primary">Positive</button>
                                        <button type="button" id="neutralBtn" class="btn btn-primary">Neutral</button>
                                        <button type="button" id="negativeBtn" class="btn btn-primary">Negative</button>
                                    </div>
                                </div>
                            </li>
                            <li><span id="statustext" class="navbar-text"></span></li>
                        </ul>
                    </div>
                </div>
            </div>
        </div>
        <div class="map_container">
            @RenderBody()
        </div>
        @Scripts.Render("~/bundles/jquery")
        @Scripts.Render("~/bundles/bootstrap")
        @RenderSection("scripts", required: false)
    </body>
    </html>

To modify the Index.cshtml file:

  1. From Solution Explorer, expand TweetSentimentWeb, expand Views, expand Home, and then double-click Index.cshtml.
  2. Replace the content with the following:
    @{
        ViewBag.Title = "Tweet Sentiment";
    }
    
    <div class="map_container">
        <div id="map_canvas"></div>
    </div>

To modify the Site.css file:

  1. From Solution Explorer, expand TweetSentimentWeb, expand Content, and then double-click Site.css.
  2. Append the following code to the file:
    /* make container, and thus map, 100% width */
    .map_container {
        width: 100%;
        height: 100%;
    }
    
    #map_canvas{
      height:100%;
    }
    
    #tweets{
      position: absolute;
      top: 60px;
      left: 75px;
      z-index:1000;
      font-size: 30px;
    }

To modify the Global.asax file:

  1. From Solution Explorer, expand TweetSentimentWeb, and then double-click Global.asax.
  2. Add the following using statement:
    using System.Web.Http;
  3. Add the following lines inside the Application_Start() function:
    // Register API routes
    GlobalConfiguration.Configure(WebApiConfig.Register);

    This registers the Web API routes so that the Web API controller works inside the MVC application.

To run the Web application:

  1. Verify that the streaming service console application is still running so that you can see the real-time changes.
  2. Press F5 to run the web application.
  3. In the text box, enter a keyword, and then click Go. Depending on the data collected in the HBase table, some keywords might not be found. Try some common keywords, such as "love", "xbox", "playstation" and so on.
  4. Toggle among Positive, Neutral, and Negative to compare sentiment on the subject.
  5. Let the streaming service run for another hour, then search for the same keyword and compare the results.
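
Clicking Go sends the keyword to the Web API at api/tweets?query=. Keywords that contain spaces or reserved characters such as "#" need to be URL-encoded before they are appended to the query string. The following is a minimal sketch of that step, assuming a hypothetical helper named buildTweetQueryUrl that is not part of the sample solution.

```javascript
// Hypothetical helper: build the relative Web API URL for a keyword search.
// Surrounding whitespace is trimmed and the keyword is URL-encoded.
function buildTweetQueryUrl(keyword) {
    return 'api/tweets?query=' + encodeURIComponent(keyword.trim());
}
```

With this helper, a keyword such as "xbox one" is sent as api/tweets?query=xbox%20one, so the space reaches the server as part of the query value.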

Optionally, you can deploy the application to an Azure Web site. For instructions, see Get started with Azure Web Sites and ASP.NET.

Next Steps

In this tutorial, you learned how to get Tweets, analyze their sentiment, save the sentiment data to HBase, and present the real-time Twitter sentiment data on Bing maps. To learn more, see:

Time: 2024-07-28 18:04:15
