Feel free to like, link, friend, follow, forward, rate, review, reply, comment, or connect!
Marc, a very cool topic. Social networking / big data analysis / cool tools. All good stuff. Thanks for the info...
Awesome. Any other questions? If not, have a great afternoon everybody!
Also: See: http://www.smrfoundation.org
And: http://connectedaction.net
And: http://netbadges.com
And: http://nodexlgraphgallery.org
Thanks!
Marc
http://nodexl.codeplex.com
Thanks, Marc. I believe you shared a link to the NodeXL download, correct?
Thanks for all the interest.
"Think Link"!
If I missed anyone's question, feel free to email me: marc@smrfoundation.org
Thanks, Marc. Any other big questions from the community in the minutes remaining?
Who has access to the All Analytics back end? If they can generate an "edge list" with this format:
User ID #1, User ID #2, Relationship Type (EX: Replies), Date, Time, etc.
We can import, analyze, and visualize.
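As a rough illustration of the edge-list format described above, here is a minimal Python sketch that reads rows in that column order and counts repeated connections. The sample rows, the field names, and the helper functions are all hypothetical; this is not how NodeXL itself imports data.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows in the column order described above:
# user_id_1, user_id_2, relationship_type, date, time
SAMPLE = """\
alice,bob,Replies,2012-06-01,14:05
bob,carol,Replies,2012-06-01,14:09
alice,bob,Replies,2012-06-02,09:30
"""

def load_edge_list(text):
    """Parse CSV text into a list of edge dicts (field names are assumptions)."""
    fields = ["source", "target", "relationship", "date", "time"]
    reader = csv.DictReader(io.StringIO(text), fieldnames=fields)
    return [row for row in reader]

def edge_weights(edges):
    """Count how many times each directed (source, target) pair occurs."""
    weights = defaultdict(int)
    for e in edges:
        weights[(e["source"], e["target"])] += 1
    return dict(weights)

edges = load_edge_list(SAMPLE)
weights = edge_weights(edges)
```

Each row becomes one edge, and repeated pairs can be collapsed into a weight before analysis or visualization.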
Wow the stuff just makes my head swim.
Most sites with social features contain a network.
Getting that network data out is fairly straightforward.
I like to say, whenever two GUIDs can be joined, a network is born.
Netbadges will introduce new types of Netbadges. Right now we only award the "Bridge Builder Badge" for being the most like a bridge (max value of the Betweenness Centrality score). Next we may award the "Newcomer" or "Hub" award.
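To make the "Bridge Builder" idea concrete, here is a small Python sketch that computes betweenness centrality with Brandes' algorithm on an unweighted, undirected graph and picks the person with the highest score. The names and the toy network are hypothetical, and this is only an assumption about how such a badge could be computed, not the Netbadges implementation.

```python
from collections import deque

def betweenness(adj):
    """Brandes' algorithm for unweighted graphs; adj maps node -> set of neighbors."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        # BFS from s, counting shortest paths (sigma) and recording predecessors.
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each undirected pair is counted from both endpoints, so halve the totals.
    return {v: c / 2 for v, c in score.items()}

# Hypothetical network: two triangles joined only through "dana".
adj = {
    "ann": {"bob", "cat"}, "bob": {"ann", "cat"}, "cat": {"ann", "bob", "dana"},
    "dana": {"cat", "eve"},
    "eve": {"dana", "fay", "gus"}, "fay": {"eve", "gus"}, "gus": {"eve", "fay"},
}
scores = betweenness(adj)
bridge_builder = max(scores, key=scores.get)
```

In this toy network "dana" has only two connections, yet every shortest path between the two triangles runs through her, so she gets the top score.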
I wonder what interesting observations one could make of the allanalytics postings... this site has got pretty large audience... (I'm not sure how you'd get the raw data...)
Looking at networks with a real-time element sounds very exciting, Marc!
Thanks Marc. You've got ambitious plans. :-)
Netbadges: we will expand to include data from beyond Twitter. Facebook, YouTube, the WWW, G+ are all targets.
We are looking carefully at Activity Stream as a new "one data structure to rule them all".
As we're winding down here with just a few minutes until the end of the hour, I'm wondering if there are any more burning questions out there.
Next for us:
NodeXL: more content analysis (see recent maps for examples of summaries of top URLs, Hashtags, and @usernames mentioned in each group in the network). We will focus on ease of use and access to data sources (G+ importer coming in a week or two, for example).
NodeXL will focus more on Time. Animation and better analysis of change and contrast between networks is coming.
When NodeXL is set up to run automatically, I mostly just look at the contents of its report output directory (which is synched to all my machines via Dropbox). I just look at the most recent maps and reports. In minutes I can review 100+ maps.
Sorry guys, lost my 'Net connection for a moment. In case that happens again, I want to be sure to sneak one more question in: Marc, so you've got Netbadges and NodeXL. What's missing? Or, maybe I should say, what's next?
Since the goal of these tools is, at least in part, to get actionable results, I'm imagining all of these shapes require a different approach for engagement.
Polarized: see: http://www.flickr.com/photos/marc_smith/4971926421/ which is the map of people talking about US Politics. Two or more dense clusters with relatively little interconnection.
Brand: mostly isolates, few connections, lots of turnover
Community: few isolates, lots of density (many connections among the people), some clusters but lots of interconnection
Odd, the URL will not appear. You'll have to cut it and paste it.
Shapes:
Broadcast: a hub with many spokes, most of which do not connect to one another. Often centered on the brand account.
Flickr URL: ![]()
Can you define those shapes a bit more for us, Marc?
Forgot to wrap the URL:
![]()
We have learned a lot about structure and social media. The variations you can see in Graph Gallery are one example. Also see the collection on Flickr (the images are higher resolution on Flickr): http://www.flickr.com/photos/marc_smith/sets/72157622437066929/
The shapes we see often are "broadcast", "community", "brand", "polarized".
Our current focus is on automating the process from end to end. These maps are now automatically collected, analyzed, visualized and published with a text summary. The results are what you can see in some of my maps in NodeXLGraphGallery.org. We want to make it so that making a network map is as easy as making a pie chart.
Marc, your tool also looks at YouTube, correct? After mapping these networks for some time, have you noticed a significant difference in how communities are formed based on network structure?
Marc, with some of your own "connection" research, what are some of the more interesting things that you've found?
KenAa: I do monitor these maps daily. I look for big changes in their structures. I look for new people in the top 10 lists. I look for new topics.
LOL, re "influential," Marc. Hard to get around, huh?
Shawn: If you can get data out of FourSquare, I have no doubt that we can get it into NodeXL!
Marc, being a dashboard geek, I was wondering if you ever keep any on-going 'dashboard-style' observations of any particular social media sites? Something that you keep an eye on over a period of time?
What should you do when you know the mayor of your hashtag? Yes, in some cases it is at least a good idea to follow them. They may follow you. You may even want to judiciously retweet them. They will almost certainly follow you then. And you may find that over time your tweets get greater visibility among these influential (there I said it) folks.
Marc, about Foursquare, just wondering. I suspect the data would be interesting.
Shawn: No Foursquare importer for NodeXL yet, sorry. But we do import data from a wide variety of formats that you may be able to get Foursquare data in: GraphML, CSV, etc.
So Marc, this might not be something you're privy to, but when they do find out who the "mayor of the hashtag" is -- then what? Do they try to influence that person, and use further mapping to measure success, for example?
Speaking of mayors and getting back to location-based networks, does NodeXL import from Foursquare?
But also, how do we compare?
For example, see: http://nodexlgraphgallery.org/Pages/Default.aspx?search=foundation
For a collection of maps for "Foundations" in twitter.
They vary a great deal - contrast Gates Foundation with Ford Foundation.
Brands want to know: who is the mayor of my hashtag?
Brand examples:
Yes - lots of companies now request maps for various topics on a daily basis.
Shawn: volatility is an issue that can be addressed by controlling the history window. We like volatility if we want "instantaneous" data - who is the center of the conversation *NOW*. But we may also want to know about the centrality of people over a longer window, to rule out infrequent fluctuations. These measures are based on whatever you feed them.
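A minimal Python sketch of the history-window idea: the same data gives a different "center of the conversation" depending on how far back the window reaches. The edges, timestamps, and the use of in-degree as the centrality measure are all assumptions made for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical reply edges: (source, target, timestamp).
EDGES = [
    ("ann", "bob", datetime(2012, 6, 1, 9, 0)),
    ("cat", "bob", datetime(2012, 6, 1, 10, 0)),
    ("dan", "bob", datetime(2012, 6, 2, 11, 0)),
    ("ann", "eve", datetime(2012, 6, 7, 8, 0)),
    ("cat", "eve", datetime(2012, 6, 7, 9, 0)),
]

def in_degree(edges, now, window):
    """Count incoming edges per user, using only edges inside the window."""
    start = now - window
    return Counter(t for _, t, ts in edges if start <= ts <= now)

now = datetime(2012, 6, 7, 12, 0)
recent = in_degree(EDGES, now, timedelta(days=1))  # "who is central NOW"
longer = in_degree(EDGES, now, timedelta(days=7))  # smooths out short-lived spikes
```

With the one-day window "eve" is on top; with the seven-day window "bob" is. Neither answer is wrong; they answer different questions.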
And networking is the heart and soul of this process. Great point, Marc.
Collective action dilemma theory is a useful framework for describing phenomena like wikipedia, message boards, photo archives, where many contribute to assemble and author the material and even more people come and take from the resource (often without ever making a contribution). The magic of the Internet is that it reduces the size of the "minimal contributing set" - the fewest number of people needed to generate a resource, while increasing the ability of a group to find each other, and simultaneously making the resources reusable and findable. That is a lot to change for a social process!
Marc, can you share some specific examples of the type of brand and/or customer research companies have used NodeXL to do?
Marc, how permanent are these centralities? Google and Klout ranks take time to change. But when exporting data from Twitter with NodeXL, or badges on a topic on Netbadges, how different could my results be day to day?
wow, this is impressive, i got to try it
Network data importers for NodeXL: http://nodexl.codeplex.com/wikipage?title=Third-Party%20NodeXL%20Graph%20Data%20Importers
Ning Song:
- Flexible Import and Export Import and export graphs in GraphML, Pajek, UCINet, and matrix formats.
- Direct Connections to Social Networks Import social networks directly from Twitter, YouTube, Flickr and email, or use one of several available plug-ins to get networks from Facebook, Exchange and WWW hyperlinks.
KenAa: Prediction is an often-sought goal. I have questions about the predictive value of social media. I think social media is largely reactive. But reactions can be predictive, so I do not rule out its potential there. But many want to, for example, predict the US Presidential race with "Tweet" votes.
Hi, Mr. Smith, I see in your video, NodeXL can easily extract information from Twitter, how about facebook and Google+?
The number of bridges is a function of how divided the network is. Some topics are highly divided, with separate groups that have few if any links to one another. Political topics often look like this.
Marc, regarding a macro-sociological topic such as 'Collective Action' in social media analysis, have you encountered techniques that would enable one to predict events like 'flash mobs'? (Maybe there's an emergent metric that indicates this.) This would probably be based on Twitter data...
Have a look at NodeXLGraphGallery, compare a number of the maps. Note how they vary in terms of the ratio of isolates. The maps with many isolates represent topics that could be called "brands" or "public" topics.
Network growth does require a continuous influx of newcomers.
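The ratio of isolates mentioned above is easy to compute once you have a node list and an edge list. Here is a hedged Python sketch, with density added as a second measure; the two toy networks and the function names are hypothetical, and no cutoff values for calling something a "brand" or a "community" are implied.

```python
def isolate_ratio(nodes, edges):
    """Share of nodes that have no edge to any other node."""
    connected = set()
    for a, b in edges:
        if a != b:  # a self-loop does not connect a person to anyone else
            connected.update((a, b))
    return sum(1 for n in nodes if n not in connected) / len(nodes)

def density(nodes, edges):
    """Fraction of possible undirected pairs that are actually linked."""
    n = len(nodes)
    pairs = {frozenset((a, b)) for a, b in edges if a != b}
    return len(pairs) / (n * (n - 1) / 2)

# Hypothetical "brand"-like topic: six people mention it, only two talk to each other.
brand_nodes = ["a", "b", "c", "d", "e", "f"]
brand_edges = [("a", "b")]

# Hypothetical "community"-like topic: four people, almost everyone linked.
community_nodes = ["p", "q", "r", "s"]
community_edges = [("p", "q"), ("p", "r"), ("q", "r"), ("r", "s"), ("q", "s")]

brand_isolates = isolate_ratio(brand_nodes, brand_edges)
community_isolates = isolate_ratio(community_nodes, community_edges)
```

Here four of the six people in the brand-like map are isolates, while the community-like map has none and a much higher density.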
BUT, step #1 is then to connect them to someone already connected to the big clump of interlinked people.
Marc, what have you learned about the hub/bridge relationship? For example, I'm wondering how many bridges might we expect to stem from each hub? (The map you linked to is astounding!)
To your point, Marc, efforts to grow a network must almost always start outside it.
Network analysis is used to generate the routes and directions in mapping tools.
Network analysis is recommending books to you on Amazon and friends to you on Facebook.
Network analysis is everywhere. Google is all network analysis - PageRank is a refinement of Eigenvector Centrality - which is a network measure of how connected the people you connect to are.
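To show what "how connected the people you connect to are" means in practice, here is a small Python sketch of eigenvector centrality computed by power iteration. The toy graph and names are hypothetical. One labeled assumption: each step adds a node's own score to its neighbors' scores, which leaves the ranking unchanged but keeps the iteration from oscillating. This is plain eigenvector centrality, not PageRank.

```python
def eigenvector_centrality(adj, iterations=100):
    """Power iteration on an undirected graph; adj maps node -> set of neighbors."""
    score = {v: 1.0 for v in adj}
    for _ in range(iterations):
        # New score = own score + sum of neighbors' scores (the own-score term
        # is a stabilizing assumption; it does not change the final ordering).
        new = {v: score[v] + sum(score[w] for w in adj[v]) for v in adj}
        norm = max(new.values())
        score = {v: x / norm for v, x in new.items()}
    return score

# Hypothetical network: "x" and "d" each have exactly one connection,
# but x is linked to the hub while d is linked to the less central "c".
adj = {
    "hub": {"a", "b", "c", "x"},
    "a": {"hub", "b"},
    "b": {"hub", "a"},
    "c": {"hub", "d"},
    "d": {"c"},
    "x": {"hub"},
}
score = eigenvector_centrality(adj)
```

Both "x" and "d" have a single link, yet "x" scores higher because the one person it connects to is the best-connected node in the network.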
Yes, when you want to change an entrenched belief about the product or service, the hubs are important.
But even there, the hubs may have less value than the bridges.
So, I am simply advocating for the use of more measures of network location and the inclusion of more locations in the network as having possible value - all depending on your goal. It is just that the goal is not *always* get the big shot to talk about me.
Marc, not sure if we got to Ning Song's question, but I wanted to insert it here. She asks: what are the main fields where social network analysis is applied, in addition to customer sentiment analysis and fraud detection?
We could say that current models of influence detection only look for mountain peaks. But valleys have value.
Marc, what if your goal isn't new business? What if it's forging better customer relationships or some such? Then I would think the hubs would be of higher value?
BethSchultz: well said. I like that.
Here is an example:
http://nodexlgraphgallery.org/Pages/Graph.aspx?graphID=447
Marc, from a marketing research standpoint, this may be the most interesting part of the tool, and there are obvious actionable approaches you could use with a social media campaign at that point.
Marc, so I suppose another way of looking at that is to say the isolate is the most capable of being influenced?
Have a look at http://nodexlgraphgallery.org
Most influencer scores suggest that hubs with many links are the most valuable people. And they are valuable. If you want to get a new message out to your existing customers the hubs are great broadcasters. But they are likely to already be customers, and thus have little to add in terms of new business.
Most "influencer" systems ignore the isolate.
But I will argue that for the strategic goal of "new customer acquisition" the isolate is the most "influential" person in the room. They have no connections to your community but they just said your name.
It also seems to me that if one wanted to pay attention to geographical location, another option might be to use a place name as a keyword or phrase. This might be particularly useful with smaller communities where people using the name, say on Twitter, would likely be residents.
Marc, you lost me there on the "influence" of the "isolate." Can you elaborate?
So, I suggest that we consider the "influence" of the "isolate". Isolates have no connections.
Re: Location. While sparse, the lat/long data that does exist is illuminating and remarkably precise.
Yes, NodeXL can import a wide range of attribute data from Facebook (where available). (http://socialnetimporter.codeplex.com/)
Oh, OK, thanks Marc -- lat/long would be of limited value, then.
Then how about Facebook's geographic info. Can this be imported as well?
Re: Location. Many tweets are from desktops, or from mobiles that have not enabled GPS. So only a few tweets are authored with Lat/Long data.
Ooops. Black box. Thanks, smkinoshita!
Yes: NodeXL has a Facebook importer. See: http://socialnetimporter.codeplex.com/
Marc, I like that comparison of Klout vs. NodeXL. I get it!
I would say that http://Netbadges.com is more directly competing with Klout.
And there is plenty of room for more than one: note the existence of FICO, Equifax, TransUnion, and Experian.
Marc, can you explain about the lat/long? Does that mean you get coordinates from a Tweet but that Twitter users themselves do not generally mention where they're tweeting from? If you've got lat/long does the absence of location mention in a Tweet matter?
Marc, NodeXL also can import data from Facebook?
NodeXL <> Klout
Klout is a kind of FICO score for the web.
NodeXL is more like Excel or Google Maps: it's a tool for examining data.
You could build a Klout score with NodeXL (if you knew their ingredients). But you could not build NodeXL out of Klout.
Marc, in short, Klout is what social media experts like Marshall Sponder in a recent post here might call a "block box."
Geographic data is an interesting dimension.
NodeXL does download lat/long data from Twitter, but only about 5% of tweets have location data on them that I see.
So Marc, do you see NodeXL as competing with Klout or is there room for both types of tools?
So, NodeXL (and my other [commercial] project, http://netbadges.com) are all about building a local network of people who share an attribute, for example, they all say a keyword.
Interesting question, Beth, since digital networks now have geographic components.
Klout is useful, but I have some critiques. > It is opaque: I do not know how they calculate their results
> It is global: I want a more focused measure of a person's importance within a defined topical area
Marc, since you bring up "geography," I'm wondering if you can overlay physical maps on top of the virtual maps NodeXL creates, enabling you to pinpoint geographically where influence originates to where it spreads?
Bridges may have fairly few connections, but their connections matter more since they have the few links to some other group. Social media is a collection of islands of conversation. If you want to jump from cluster to cluster, you need bridges.
Marc, measuring social influence is all the rage now with tools like Klout. Can you share with us how your work differs from that.
I like to think of all the "strategic locations" that are in a network: who is the hub (traditionally recognized as an "influencer") - but also, who is the bridge?
In network theory we imagine a landscape, a human topography.
Like geography, some points on the map are more "central" or "peripheral".
We're started, fountainhead. Jump in!
What are the main fields where social network analysis is applied, in addition to customer sentiment analysis and fraud detection?
Influence is more complicated than a single number.
I propose an alternative term for what we seek: "strategic locations".
It seems Beth and I are on the same track here. As a side discussion, Marc and I were talking about some of the tools we'll be talking about. Maybe you'd like to bring us up to speed on some of the tools you've developed as folks arrive, Marc.
Data sets can be very large: billions of edges (edges = connections between two entities).
But in practice, many interesting data sets are in the low thousands of connections.
The tool we have built, NodeXL, is able to handle about what Excel can handle (varies by hardware, on the order of 100Ks to 4M rows).
Marc, to add to what KenAa asked, how well do others -- marketers, primarily -- in your opinion understand social media as "a set of collections of connections?"
In sociology there are many traditions. I was trained in three: Collective Action (aka Dilemma Theory), Interactionist sociology, and Social Network Analysis.
Hi Marc. These webs of links you build... how big can these data sets be?
Hi, all. I recently saw a SAS social network analysis tool used in fraud detection; the power of social network analysis is amazing.
I was trained as a sociologist. For me the Internet is the greatest opportunity for the social sciences in 100 years.
Marc. Just to get us started as others arrive, I wondered if you could start off with just the basics of some of your research leading up to some of the tools we'll be discussing today.
I see social media as a set of collections of connections. I use social network analysis tools to collect and analyze these webs of links.
I'm curious Marc about your background. What prompted you to focus in on social media?
I am a sociologist studying the social structure of social media. I make tools to collect, analyze, and visualize these data sets.
If anyone has already signed in and would like to start sharing questions about the online networking tools mentioned in the post, you can do that now as we get the conversation started.
Hi all. I'm looking forward to today's chat as well. Understanding social media influence and activity is such a challenge!
Hi folks. Here's where we'll be chatting with Marc A. Smith of the Social Media Research Foundation beginning in just a few minutes.
Here's where we will chat with Marc A. Smith, a sociologist specializing in the social organization of online communities and in other computer mediated interaction. He is co-founder of the Social Media Research Foundation. The organization develops tools to measure and analyze the connection and construction of social media communities.