Some thoughts on working out who to trust online

The deplorable attempts to use social media to find the Boston Marathon bombers (and much of the mainstream media’s response to them), followed by the tweets coming out of the Social Media Summit in New York, got me thinking again about how we might get a better understanding of who and what to trust online.

When it comes to online trust I think there are two related questions we should be asking ourselves as technologists:

  1. can we help people better evaluate the accuracy, trustworthiness or validity of a given news story, tweet, blogpost or other publication?
  2. and can we use social media to better filter those publications to find the most trustworthy sources or articles?

This second point is also relevant in scientific publishing (a thing I’m trying to help out with these days) where there is keen interest in ‘altmetrics’ as a mechanism to help readers discover and filter research articles.

In academic publishing the need for altmetrics has been driven in part by the rise in the number of articles published which in turn is being fuelled by the uptake of Open Access publishing. However, I would like to think that we could apply similar lessons to mainstream media output.

MEDLINE literature growth chart

Historically a publisher’s brand has, at least in theory, helped its readers to judge the value and trustworthiness of an article. If I see an article published in Nature or the New York Times, or broadcast by the BBC, the chances are I’m more likely to trust it than an article published in, say, the Daily Mail.

Academic publishing has even gone so far as to codify this in a journal’s Impact Factor (IF), a citation-counting idea of the kind that Larry Page later drew on for his PageRank algorithm.

The premiss behind the Impact Factor is that you can identify the best journals and therefore the best content by measuring the frequency with which the average article in that journal has been cited in a particular year or period.
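To make the arithmetic concrete, here is a minimal sketch in Python. It assumes the commonly used two-year form of the Impact Factor (citations received in one year to items published in the previous two years, divided by the number of citable items published in those two years). The function name and the journal figures are made up for illustration.

```python
def impact_factor(citations_this_year: int, citable_items_prev_two_years: int) -> float:
    """Two-year Impact Factor: average citations per recent citable item.

    citations_this_year: citations received this year to items the journal
        published in the previous two years.
    citable_items_prev_two_years: number of citable items the journal
        published in those two years.
    """
    if citable_items_prev_two_years <= 0:
        raise ValueError("a journal needs at least one citable item")
    return citations_this_year / citable_items_prev_two_years


# Hypothetical journal: 200 citable items over two years,
# cited 900 times in the following year.
print(impact_factor(900, 200))  # 4.5
```

Note that this is an average over the whole journal: it says nothing about how often any individual article was cited.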

Simplistically then, a journal can improve its Impact Factor by ensuring it only publishes the best research. ‘Good Journals’ can then act as trusted guides to their readership – pre-filtering the world’s research output to bring their readers only the best.

Obviously this can go wrong. Good research is published outside of high impact factor journals; journals can publish poor research; and mainstream media is so rife with examples of published piffle that the likes of Ben Goldacre can make a career out of exposing it.

As is often noted the web has enabled all of us to be publishers. It scarcely needs saying that it is now trivially easy for anyone to broadcast their thoughts or post a video or photograph to the Web.

This means that social media is now able to ‘break’ a story before the mainstream media. However, it also presents a problem: how do you know if it’s true? Without brands (or IF) to help guide you how do you judge if a photo, tweet or blogpost should be trusted?

There are plenty of services out there that aggregate tweets, comments, likes, +1s etc. to help you find the most talked about story. Indeed most social media services themselves let you find ‘what’s hot’ or most talked about. All these services, however, seem to assume that there is wisdom in crowds – that the more talked about something is, the more trustworthy it is. But as Oliver Reichenstein pointed out:

“There is one thing crowds have a flair for, and it is not wisdom, it’s rage.”

Relying on point data (most tweeted, commented etc.) to help filter content or evaluate its trustworthiness whether that be social media or mainstream media seems to me to be foolish.

It seems to me that a better solution would be to build a ‘trust graph’ which in turn could be used to assign a score to each person for a given topic based on their network of friends and followers. It could work something like this…

If a person is followed by a significant number of people who have published peer reviewed papers on a given topic, or if they have published in that field themselves, then we should trust what that person says about that topic more than we would the average person.

Equally if a person has posted a large number of photos, tweets etc. over a long period of time from a given city and they are followed by other people from that city (as defined by someone who has a number of posts, over a period of time from that city) then we might conclude that their photographs are going to be from that city if they say they are.

Or if a person is retweeted by someone whom you trust for other reasons (e.g. because you know them) then that might give you more confidence that their comments and posts are truthful and accurate.
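The first of these rules can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the scoring rule (the fraction of a person’s followers who have published on the topic, or full marks if the person has published on it themselves) is one arbitrary way to read “a significant number”, and the names and data are invented.

```python
def topic_trust(person, topic, followers, published):
    """One-hop trust score in [0, 1] for `person` on `topic`.

    followers: dict mapping a person to the set of people who follow them.
    published: dict mapping a topic to the set of people who have
        published peer reviewed work on it.
    """
    experts = published.get(topic, set())
    if person in experts:
        return 1.0  # assumption: publishing in the field earns full trust
    fans = followers.get(person, set())
    if not fans:
        return 0.0  # no followers means no evidence either way
    # Share of this person's followers who are themselves published experts.
    return len(fans & experts) / len(fans)


# Invented example data.
followers = {
    "alice": {"bob", "carol", "dave", "erin"},
    "mallory": {"xavier", "yvonne"},
}
published = {"genomics": {"bob", "carol", "frank"}}

print(topic_trust("alice", "genomics", followers, published))    # 0.5
print(topic_trust("frank", "genomics", followers, published))    # 1.0
print(topic_trust("mallory", "genomics", followers, published))  # 0.0
```

The same shape would work for the city example by swapping “has published on the topic” for “has a history of posts from that city”. A score like this only looks one hop into the network, which is its main weakness.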

PageRank is Google’s link analysis algorithm, which assigns a numerical weighting to each element of a hyperlinked set of documents with the purpose of “measuring” its relative importance within the set.

Whatever the specifics the point I’m trying to make is that rather than relying on a single number or count we should try to build a directed graph where each person can be assigned a trust or knowledge score based on the strength of their network in that subject area. This is somewhat analogous to Google’s PageRank algorithm.

Before Google, search engines effectively counted the frequency of a given word on a Webpage to assign it a relevancy score – much as we do today when we count the number of comments, tweets etc. to help filter content.

What Larry Page realised was that by assigning a score based on the number and weight of inbound links, he and Sergey Brin were able to design and build a much better search engine – one that relies not just on what the publisher tells us, nor simply on the number of links, but on the quality of those links. A link from a trusted source is worth more than a link from an average webpage.
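The difference between counting links and weighing them can be shown with a small sketch. What follows is a simplified, textbook-style PageRank power iteration in Python, not Google’s actual implementation; the damping factor of 0.85, the iteration count and the page names are assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Simplified PageRank. `links` maps each page to the pages it links to."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every page starts each round with a small baseline score.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            targets = links.get(node, [])
            if targets:
                # A page splits its score equally among the pages it links to.
                share = damping * rank[node] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Pages with no outbound links spread their score evenly.
                share = damping * rank[node] / n
                for other in nodes:
                    new_rank[other] += share
        rank = new_rank
    return rank


# Invented web: "popular" has three inbound links from obscure pages;
# "endorsed" has a single inbound link, but from a well-linked "authority".
links = {f"fan{i}": ["authority"] for i in range(6)}
links.update({f"obscure{i}": ["popular"] for i in range(3)})
links["authority"] = ["endorsed"]

inbound = {"popular": 3, "endorsed": 1}
rank = pagerank(links)
print(inbound["popular"] > inbound["endorsed"])  # True: a raw count favours "popular"
print(rank["endorsed"] > rank["popular"])        # True: weighted links favour "endorsed"
```

Replace pages with people and links with follows or retweets, and the same calculation gives a crude version of a trust score that depends on who is vouching for you rather than on how many.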

Building a trust graph along similar lines – where we evaluate not just the frequency of (re)tweets, comments, likes and blogposts but also consider who those people are, who’s in their network and what their network of followers think of them – could help us filter and evaluate content whether it be social or mainstream media and minimise the damage of those who don’t tweet responsibly.

One response to “Some thoughts on working out who to trust online”

  1. We trust the mainstream news agencies and broadcasters with good reason — they are legal entities that are legally accountable, and generally can be relied upon to tell the truth for purely selfish reasons. Adam Smith aside, there are plenty of first-rate journalists in the field, and their services are sought out by the likes of CNN and The Times for good reason — the actuality of events, or ring of opinion on actual events, viewed in anything other than the immediate present, is valuable to people who wish to plan their course in the world.

    The value of so-called ‘social media’ in reporting is that it ‘carries news items’ other broadcasters cannot or will not, for any number of reasons too tedious to relate here. But of primary value to me is that I can personally receive reports from individuals with whom I share a personal bond of trust through personal acquaintance, plus (or, rather, squared with) the point that I can receive reports virtually reliable from friends of personal acquaintances.

    Of course, when social media is choked by reports from people with whom you have no personal acquaintance, then it will be said that the news reports of social media as a whole lose credibility. Yet, as the same personal acquaintances and acquaintances of acquaintances remain, so the actual level of truth is static, and depends entirely upon who you know.

    How can anyone possibly trust anyone they do not know, or who is not vouched for by someone known personally — such a state would not be trusting, it would be as gullible as trusting a single, virtually anonymous online product review, without consideration that the seller or manufacturer may have written the review.
