Beyond Brute Force: Unexpected Lessons from Crowdsourcing Satellite Imagery Analysis for UNHCR in Somalia

The recent SBTF effort to identify IDP shelters in Somalia for the UNHCR has been notable for several reasons beyond the fantastic work by our very, very hard-working volunteers, some of whom may now need an eye exam and glasses… And I feel that what I’m seeing is an inflection point in the development of crisis mapping (or indeed any form of “live” or “crowdsourced” mapping). It’s the point at which we move beyond the “brute force” method of chopping large tasks into little pieces and disseminating them among a distributed human network, and begin reaping the rewards of the process itself as a collaborative space for learning and outreach. For me, this has been the most unexpected dimension of this project so far, and I wanted to share my thoughts here for feedback.

I am always skeptical of crowdsourced data or, indeed, any data. As a geographer and remote sensor whose focus is enumerating displaced populations, I have to be: skepticism is part of my job. All data contain error, so it is best to acknowledge it and decide what that error means. There is still a lot of uncertainty around these types of volunteered geographic information, specifically questions over the positional accuracy, precision, and validity of these data, among a wide variety of other issues. These quantitative issues are important because the general assumption is that these data will be operationalized somehow; it is therefore imperative that they add value to already confusing situations if this enterprise is to be taken seriously in an operational sense. The good news is that research so far shows that these “asserted” data are not, a priori, necessarily any worse than “authoritative” data and can in fact be quite good, thanks to the greater number of individuals available to catch and correct error.

It was with this thinking in mind that I joined the current SBTF effort, and I very much appreciate the willingness of our great colleagues at Tomnod, DigitalGlobe, JRC, and UNHCR to treat this as an experiment to see how well a very large amount of very specific imagery analysis could be performed with crowdsourcing. We are beginning to analyze the data now and will likely be doing so for the next month or so. What has been surprising, however, are a few new twists along the way that I feel are probably lost in the exclusively quantitative concerns that so many (myself included) focus on.

  1. There is huge potential here for stakeholder engagement and broadening your outreach: In a time of plummeting budgets, building a constituency for what your organization does is paramount. Efforts such as this give the public a chance to get engaged, to take part in your mission in a fairly easy way. Speaking about the involvement of students from her Air Photo classes at the University of Georgia, Dr. Marguerite Madden said that the engagement “raised awareness of this grave situation and many [students] got online to find out more information about why this is happening and what is being done to help…” Today there are almost 200 more people who are familiar with the UNHCR and its mission in Somalia than there were two weeks ago. That’s one heck of an ancillary benefit, especially considering that the vast majority of the volunteers are students with the energy and the desire to contribute to a project such as this. Which brings me to the second point…
  2. The collaboration may be as important as the data. I have been consistently (and pleasantly) surprised by the rich discussion among the volunteers about virtually every aspect of this project. We specifically set out to include the academic community and, especially, the remote sensing community by engaging with the student chapters of the American Society for Photogrammetry and Remote Sensing (ASPRS) due to their higher level of familiarity with imagery analysis. Columbia University’s New Media Task Force and the University of Wisconsin-Madison’s Department of Geography were major contributors, and the geography departments at both George Mason University and the University of Georgia hosted mapping events (tip of the hat to Lawrence Jefferson and Chris Strother for making those happen). As a result, we created a very rich environment for exchange and learning. Dr. Madden jumped at the chance to use her class as an opportunity to introduce unorthodox platforms for imagery analysis to her students, and everyone benefited. They learned how crowdsourcing for imagery analysis could work in a live environment, and we got tons of good feedback on everything from our rule-set to the platform from her very bright students. It’s this meeting of “professional science” and “citizen science” that helps foster new developments in how we approach these emerging practices.
  3. It’s not always about fixes, it’s about concepts: While I believe that Linus’ Law is a powerful argument for crowdsourcing, it’s important to note that it applies not only to technical bugs but to conceptual ones. Part of remote sensing involves creating a rule-set to aggregate features on the surface of the Earth into meaningful classes that allow you to say something about how the world is or works. While there are robust, scientific ways to go about this, it is worth emphasizing that every classification scheme (in any science) is situated within a specific context and point of view. It’s entirely possible to have very well thought out classification schemes that have little to do with the lived reality on the ground. It was with profound humility that I read the very insightful questions posted to our working Google Doc by volunteers, some of whom had zero experience with remote sensing and yet had very perceptive insights regarding the assumptions made by our classification scheme. Far more than just a steady workforce placing dots on a map, the volunteers really put their thinking caps on and got under the hood of both the technical and the conceptual aspects of the effort. It was precisely their perspective as non-experts that gave them the ability to see things in a new way.

We remain committed to a critical analysis of the substantive contribution of our effort to UNHCR operations, but the sense of community in our dedicated channels of communication, which allowed for such vibrant discussion, should also be understood as valuable. While the operational use of projects like this cannot go unexamined, it bears repeating that these types of projects offer much more than just an additional set of data: they present a unique forum and opportunity for creative collaboration, engagement, and learning.

By keeping these thoughts in mind we can begin to move beyond the “brute force” period in crisis mapping, in which complex and – generally – machine-driven functions are simply distributed to a human network and, instead, expand the meaning of geographic data in these new spaces of engagement.

Many thanks to all who have participated in the project thus far. As always, you are fantastic teachers.

Crowdsourcing Satellite Imagery Analysis for UNHCR-Somalia: Latest Results


253,711

That is the total number of tags created by 168 volunteers after processing 3,909 satellite images in just five days. A quarter of a million tags in 120 hours; that’s more than 2,000 tags per hour. Wow. As mentioned in this earlier blog post, volunteers specifically tagged three different types of informal shelters to provide UNHCR with an estimate of the IDP population in the Afgooye Corridor. So what happens now?

Our colleagues at Tomnod are going to use their CrowdRank algorithm to triangulate the data. About 85% of 3,000+ images were analyzed by at least 3 volunteers. So the CrowdRank algorithm will determine which tags had the most consensus across volunteers. This built-in quality control mechanism is a distinct advantage of using micro-tasking platforms like Tomnod. The tags with the most consensus will then be pushed to a dedicated UNHCR Ushahidi platform for further analysis. This project represents an applied research & development initiative. In short, we certainly don’t have all the answers. This next phase is where the assessment and analysis begins.
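
For readers curious about what consensus-based validation can look like in practice, here is a minimal sketch (in Python) of the general idea: cluster tags that fall close together within the same image and keep only the locations confirmed by enough distinct volunteers. To be clear, this is not the actual CrowdRank algorithm, whose internals are Tomnod’s; the Tag fields, the 10-pixel radius and the three-volunteer threshold are illustrative assumptions only.

```python
# A minimal sketch of consensus-based tag validation. NOT the actual CrowdRank
# algorithm; the Tag fields, the 10-pixel radius and the three-volunteer
# threshold below are illustrative assumptions only.
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class Tag:
    image_id: str     # which satellite image chip the tag belongs to
    x: float          # pixel column of the tag within that image
    y: float          # pixel row of the tag within that image
    volunteer: str    # ID of the volunteer who placed the tag

RADIUS = 10           # assumed: tags within 10 px of each other mark the same shelter
MIN_AGREEMENT = 3     # assumed: require three distinct volunteers to confirm a location

def consensus_locations(tags):
    """Group tags per image and keep locations confirmed by enough volunteers."""
    by_image = defaultdict(list)
    for t in tags:
        by_image[t.image_id].append(t)

    confirmed = []
    for image_id, image_tags in by_image.items():
        used = set()
        for i, anchor in enumerate(image_tags):
            if i in used:
                continue
            # Naive O(n^2) clustering: gather every tag close to this one.
            cluster = [j for j, other in enumerate(image_tags)
                       if (other.x - anchor.x) ** 2 + (other.y - anchor.y) ** 2 <= RADIUS ** 2]
            volunteers = {image_tags[j].volunteer for j in cluster}
            if len(volunteers) >= MIN_AGREEMENT:
                # Report the cluster centroid as the confirmed shelter location.
                cx = sum(image_tags[j].x for j in cluster) / len(cluster)
                cy = sum(image_tags[j].y for j in cluster) / len(cluster)
                confirmed.append((image_id, cx, cy, len(volunteers)))
                used.update(cluster)
    return confirmed
```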

In the meantime, I’ve been in touch with the EC’s Joint Research Center about running their automated shelter detection algorithm on the same set of satellite imagery. The purpose is to compare those results with the crowdsourced tags in order to improve both methodologies. Clearly, none of this would be possible without the imagery and invaluable support from our colleagues at DigitalGlobe, so huge thanks to them.

And of course, there would be no project at all were it not for our incredible volunteers, the best “Mapsters” on the planet. Indeed, none of those 200,000+ tags would exist were it not for the combined effort between the Standby Volunteer Task Force (SBTF) and students from the American Society for Photogrammetry and Remote Sensing (ASPRS); Columbia University’s New Media Task Force (NMTF) who were joined by students from the New School; the Geography Departments at the University of Wisconsin-Madison, the University of Georgia, and George Mason University, and many other volunteers including humanitarian professionals from the United Nations and beyond.

As many already know, my colleague Shadrock Roberts played a pivotal role in this project. Shadrock is my fellow co-lead on the SBTF Satellite Team and he took the important initiative to draft the feature-key and rule-sets for this mission. He also answered numerous questions from many volunteers throughout the past five days. Thank you, Shadrock!

It appears that word about this innovative project has gotten back to UNHCR’s Deputy High Commissioner, Professor Alexander Aleinikoff. Shadrock and I have just been invited to meet with him in Geneva on Monday, just before the 2011 International Conference of Crisis Mappers (ICCM 2011) kicks off. We’ll be sure to share with him how incredible this volunteer network is and we’ll definitely let all volunteers know how the meeting goes. Thanks again for being the best Mapsters around!

[Cross-posted from Patrick Meier’s iRevolution blog]

Crowdsourcing Satellite Imagery Tagging to Support UNHCR in Somalia

[Cross-posted from Patrick Meier’s iRevolution blog]

The Standby Volunteer Task Force (SBTF) recently launched a new team called the Satellite Imagery Team. This team has been activated twice within the past few months: the first activation was to carry out this trial run in Somalia and the second was in partnership with AI-USA for this human rights project in Syria. We’re now back in Somalia thanks to a new and promising partnership between UNHCR, DigitalGlobe, Tomnod, the SBTF and Ushahidi.

The purpose of this joint project is to crowdsource the geolocation of shelters in Somalia’s Afgooye corridor. This resembles our first trial run, only this time we have developed a formal and more specialized rule-set and feature-key in direct collaboration with our colleagues at UNHCR. As noted in this document, “Because access to the ground is difficult in Somalia, it is hard to know how many people, exactly, are affected and in what areas. By using satellite imagery to identify different types of housing/shelters, etc., we can make a better and more rapid population estimate of the number of people that live in these shelters. These estimates are important for logistics and planning purposes but are also important for understanding how the displaced population is moving and changing over time.” Hence the purpose of this project.

We’ll be tagging three different types of shelters: (1) Large permanent structures; (2) Temporary structures with a metal roof; and (3) Temporary shelters without a metal roof. Each of these shelter types is described in more detail in the rule-set, along with real satellite imagery examples—the feature-key. The rule-set describes the shape, color, tone and clustering of the different shelter types. As per previous SBTF Satellite Team deployments, we will be using Tomnod’s excellent micro-tasking platform for satellite imagery analysis.
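
As a purely illustrative aside, here is one way the three shelter classes could be encoded for a tagging workflow, for example as a small lookup used to remind volunteers of the rules for each class. The attribute descriptions below are placeholders of my own invention, not the actual UNHCR/SBTF rule-set, which lives in the linked documents.

```python
# Hypothetical encoding of the three shelter classes; the attribute values are
# illustrative placeholders only, not the actual UNHCR/SBTF rule-set.
SHELTER_CLASSES = {
    "large_permanent": {
        "shape": "rectangular, regular outline",
        "roof": "solid construction",
        "clustering": "often aligned along roads or within compounds",
    },
    "temporary_metal_roof": {
        "shape": "small, roughly rectangular",
        "roof": "metal (bright, reflective tone)",
        "clustering": "dense, irregular groupings",
    },
    "temporary_no_metal_roof": {
        "shape": "small, often rounded or irregular",
        "roof": "tarpaulin, cloth or organic material (dull tone)",
        "clustering": "dense, irregular groupings",
    },
}

def describe(tag_type: str) -> str:
    """Return a one-line reminder of the tagging rules for a given shelter class."""
    rules = SHELTER_CLASSES[tag_type]
    return "; ".join(f"{attr}: {value}" for attr, value in rules.items())

print(describe("temporary_metal_roof"))
```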

Over 100 members of the SBTF have joined the Satellite Team to support this project. One member of this team, Jamon, is an associate lecturer in the Geography Department at the University of Wisconsin-Madison. He teaches a broad array of technologies and applications of Geographic Information Science, including GPS and satellite imagery analysis. He got in touch today to propose offering this project for class credit to his 36 undergraduate students, whom he will supervise during the exercise.

In addition, my colleague and fellow Satellite Team coordinator at the SBTF, Shadrock, has recruited many graduate students who are members of the American Society for Photogrammetry and Remote Sensing (ASPRS) to join the SBTF team on this project. The experience that these students bring to the team will be invaluable. Shadrock has also played a pivotal role in making this project happen: thanks to his extensive expertise in remote sensing and satellite imagery, he took the lead in developing the rule-set and feature-key in collaboration with UNHCR.

The project officially launches this Friday. The triangulated results will be pushed to a dedicated UNHCR Ushahidi map for review. This will allow UNHCR to add additional contextual data to the maps for further analysis. We also hope that our colleagues at the European Commission’s Joint Research Center (JRC) will run their automated shelter tagging algorithm on the satellite imagery for comparative analysis purposes. This will help us better understand the strengths and shortcomings of both approaches and, more importantly, provide us with insights on how best to improve each, individually and in combination.
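
To give a flavour of what that comparative analysis could look like, here is a rough sketch of one simple way to measure agreement between the crowdsourced tags and the automated detections: for each location from one source, check whether the other source found something within a small distance. The coordinate convention, the 50-metre threshold and the match_rate helper are my own assumptions for illustration; the JRC’s actual evaluation methodology may differ entirely.

```python
# Hedged sketch of comparing crowdsourced tags with automated detections by
# counting how many locations from one source have a counterpart from the
# other within a small distance. Threshold and coordinate convention assumed.
import math

def match_rate(locations_a, locations_b, max_dist_m=50.0):
    """Fraction of points in locations_a with a point in locations_b within
    max_dist_m. Points are (easting_m, northing_m) tuples in a projected
    coordinate system, so Euclidean distance is in metres."""
    if not locations_a:
        return 0.0
    matched = 0
    for ax, ay in locations_a:
        if any(math.hypot(ax - bx, ay - by) <= max_dist_m for bx, by in locations_b):
            matched += 1
    return matched / len(locations_a)

# Agreement in both directions gives a rough precision/recall-style picture:
# match_rate(crowd_tags, automated_detections)  -> crowd tags the algorithm also found
# match_rate(automated_detections, crowd_tags)  -> algorithm detections the crowd also found
```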

A Thank You Note from AI-USA for the Syria Satellite Imagery Project

We are very grateful to Amnesty International USA (AI-USA) for their support and partnership on the recent Syria Satellite Imagery project. This initiative leveraged high-resolution satellite imagery kindly provided by DigitalGlobe, the advanced Tomnod platform, and over 70 volunteers from the SBTF Satellite Team to crowdsource evidence of mass atrocities in three key Syrian cities. Here is a very kind thank-you note from our counterpart at AI-USA who spearheaded the project with us. We look forward to continuing our collaboration and support of AI-USA’s important work moving forward.

 

Dear SBTF volunteers–

Though there is much left on our end to be done in relation to the Syria pilot project, I wanted to take a moment and write to express deep gratitude.

The SBTF and Amnesty International are natural partners. Amnesty operates under the principle that—given the tools—people everywhere can act in concert to protect the fundamental rights and inherent dignity of each of us.

In AI’s 50-year history, the methods and means by which we agitate as a crowd have evolved beyond community-based Amnesty chapters to a truly global movement that is no longer artificially separated into groups of activists/advocates and the people whose rights are at risk.

Article 27 of the Universal Declaration of Human Rights guarantees all people the right to share in scientific advancement, and its benefits. Technology has always changed our world. But represented in your work and efforts is a truly new paradigm of social action for social good…one that at once transcends the structural, geographic, and economic barriers that have segmented the human family for too long, and also leverages the power of Article 27 back onto itself in a reinforcing model of technological innovation and group action.

Sadly, as we know, our efforts in this project will not bring about the end of the widespread and systematic abuses occurring in Syria. Indeed, over the course of the project, the situation on the ground has evolved for the worse, and large swaths of the country are effectively crime scenes. The collection of evidence—and the path to justice—will be a long-term endeavor.

But the fruits of the time and energy you committed to in the Syria pilot will have lasting and permanent implications for how AI and other human rights watchdogs approach documentation of war crimes and crimes against humanity. Through this pilot, we have already learned a great deal about the immense leverage social computation can have in the fight for human rights. I look forward to working with you and our other partners on this pilot to incorporate those valuable lessons into future plans.

And above all else, I look forward to the opportunity to work with you in the very near future, and to great effect.

On behalf of Amnesty International—our staff, our volunteers, our activists, and our partners in Syria and everywhere else we can and must collectively make a difference with this approach—I want to express profound gratitude. And I personally and humbly give thanks.

In Solidarity,
Scott Edwards
Advocacy, Policy, and Research Department
Amnesty International, US

Crowdsourcing Satellite Imagery Analysis for Somalia: Results of Trial Run

Cross-posted on Patrick Meier’s iRevolution blog

We’ve just completed our very first trial run of the Standby Volunteer Task Force (SBTF) Satellite Team. As mentioned in this blog post last week, the UN approached us a couple of weeks ago to explore whether basic satellite imagery analysis for Somalia could be crowdsourced using a distributed mechanical turk approach. I had actually floated the idea in this blog post during the floods in Pakistan a year earlier. In any case, a colleague at DigitalGlobe (DG) read my post on Somalia and said: “Let’s do it.”

So I reached out to Luke Barrington at Tomnod to set up a distributed micro-tasking platform for Somalia. To learn more about Tomnod’s neat technology, see this previous blog post. Within just a few days we had high-resolution satellite imagery from DG and a dedicated crowdsourcing platform for imagery analysis, courtesy of Tomnod. All that was missing were some willing and able “mapsters” from the SBTF to tag the location of shelters in this imagery. So I sent out an email to the group and some 50 mapsters signed up within 48 hours. We ran our pilot from September 26th to September 30th. The idea here was to see what would go wrong (and right!) and thus learn as much as we could before doing this for real in the coming weeks.

It is worth emphasizing that the purpose of this trial run (and the entire exercise) is not to replicate the kind of advanced and highly skilled satellite imagery analysis that professionals already carry out. This is not just about Somalia over the next few weeks and months. This is about Libya, Syria, Yemen, Afghanistan, Iraq, Pakistan, North Korea, Zimbabwe, Burma, etc. Professional satellite imagery experts who have plenty of time to volunteer their skills are few and far between. Meanwhile, a staggering amount of new satellite imagery is produced every day; millions of square kilometers’ worth, according to one knowledgeable colleague.

This is a big data problem that needs mass human intervention until the software can catch up. Moreover, crowdsourcing has proven to be a workable solution in many other projects and sectors. The “crowd” can indeed scan vast volumes of satellite imagery data and tag features of interest. A number of these crowdsourcing platforms also have built-in quality assurance mechanisms that take into account the reliability of the taggers and tags. Tomnod’s CrowdRank algorithm, for example, only validates imagery analysis if a certain number of users have tagged the same image in exactly the same way. In our case, only shelters that get tagged identically by three SBTF mapsters have their locations sent to experts for review. The point here is not to replace the experts but to take some of the easier (but time-consuming) tasks off their shoulders so they can focus on applying their skill set to the harder work of imagery interpretation and analysis.
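
Since the exact workings of CrowdRank are Tomnod’s, here is only a hedged sketch of the broader idea of scoring tagger reliability: rate each volunteer by the fraction of their tags that ended up at locations confirmed by the required number of peers. The data layout and the 10-pixel radius are illustrative assumptions, not a description of the real platform.

```python
# A hedged sketch of weighting taggers by reliability. This is not Tomnod's
# CrowdRank, just one simple way a quality-assurance score could be computed:
# a volunteer's reliability is the fraction of their tags that landed within
# an assumed 10-pixel radius of a consensus-confirmed location.
from collections import defaultdict

def tagger_reliability(all_tags, confirmed_locations, radius=10):
    """all_tags: iterable of (volunteer, image_id, x, y) tuples.
    confirmed_locations: iterable of (image_id, x, y) consensus locations.
    Returns {volunteer: reliability in [0, 1]}."""
    totals = defaultdict(int)
    agreed = defaultdict(int)
    confirmed = list(confirmed_locations)
    for volunteer, image_id, x, y in all_tags:
        totals[volunteer] += 1
        for c_image, cx, cy in confirmed:
            if c_image == image_id and (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2:
                agreed[volunteer] += 1
                break  # count each tag at most once
    return {v: agreed[v] / totals[v] for v in totals}
```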

The purpose of this initial trial run was simply to give SBTF mapsters the chance to test drive the Tomnod platform and to provide feedback both on the technology and on the work flows we put together. They were asked to tag a specific type of shelter in the imagery they received via the web-based Tomnod platform.

There’s much that we would do differently in the future but that was exactly the point of the trial run. We had hoped to receive a “crash course” in satellite imagery analysis from the Satellite Sentinel Project (SSP) team but our colleagues had hardly slept in days because of some very important analysis they were doing on the Sudan. So we did the best we could on our own. We do have several satellite imagery experts on the SBTF team though, so their input throughout the process was very helpful.

Our entire work flow along with comments and feedback on the trial run is available in this open and editable Google Doc. You’ll note the pages (and pages) of comments, questions and answers. This is gold and the entire point of the trial run. We definitely welcome additional feedback on our approach from anyone with experience in satellite imagery interpretation and analysis.

The result? SBTF mapsters analyzed a whopping 3,700+ individual images and tagged more than 9,400 shelters in the green-shaded area below. Known as the “Afgooye corridor,” this area marks the road between Mogadishu and Afgooye which, due to displacement from war and famine in the past year, has become one of the largest urban areas in Somalia.

Last year, UNHCR used “satellite imaging both to estimate how many people are living there, and to give the corridor a concrete reality. The images of the camps have led the UN’s refugee agency to estimate that the number of people living in the Afgooye Corridor is a staggering 410,000. Previous estimates, in September 2009, had put the number at 366,000” (1).

The yellow rectangles depict the 3,700+ individual images that SBTF volunteers individually analyzed for shelters. And here’s the output of three days’ worth of shelter tagging: more than 9,400 tags.

Thanks to Tomnod’s CrowdRank algorithm, we were able to analyze consensus between mapsters and pull out the triangulated shelter locations. In total, we got 1,423 confirmed locations for the types of shelters described in our work flows. A first cursory glance at a handful (a “random sample”) of these confirmed locations indicates they are spot on. As a next step, we could crowdsource (or SBTF-source, rather) the analysis of just these 1,423 images to triple-check consensus.

We’ve learned a lot during this trial run and Luke got really good feedback on how to improve their platform moving forward. The data collected should also help us provide targeted feedback to SBTF mapsters in the coming days so they can further refine their skills. On my end, I should have been a lot more specific and detailed on exactly what types of shelters qualified for tagging. As the Q&A section on the Google Doc shows, many mapsters weren’t exactly sure at first because my original guidelines were simply too vague. So moving forward, it’s clear that we’ll need a far more detailed “code book” with many more examples of the features to look for along with features that do not qualify. A colleague of mine suggested that we set up an interactive, online quiz that takes volunteers through a series of examples of what to tag and not to tag. Only when a volunteer answers all questions correctly do they move on to live tagging. I have no doubt whatsoever that this would significantly increase consensus in subsequent imagery analysis.
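
To make the quiz idea concrete, here is a minimal sketch of such a qualification gate, assuming a set of training image chips labelled in advance by an expert; a volunteer only unlocks live tagging once every answer matches the expert label. The Example type and the ask() callback are hypothetical, not features of any existing platform.

```python
# Hypothetical sketch of a qualification quiz: show the volunteer expert-labelled
# example chips ("tag" vs. "do not tag") and unlock live tagging only when every
# answer matches the expert label. Not a feature of any existing platform.
from dataclasses import dataclass

@dataclass
class Example:
    chip_id: str        # ID of the training image chip
    should_tag: bool    # expert ground truth: does the chip contain a qualifying shelter?

def run_quiz(examples, ask):
    """ask(chip_id) -> bool: the volunteer's answer for a given chip.
    Returns True only if every answer matches the expert label."""
    for ex in examples:
        if ask(ex.chip_id) != ex.should_tag:
            return False   # one wrong answer: the volunteer repeats the training
    return True            # all correct: unlock live tagging

# Example usage with canned answers standing in for a real web form:
training_set = [Example("chip_001", True), Example("chip_002", False)]
answers = {"chip_001": True, "chip_002": False}
print(run_quiz(training_set, lambda chip: answers[chip]))  # True -> volunteer qualifies
```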

In related news, the Humanitarian Open Street Map Team (HOT) provided SBTF mapsters with an introductory course on the OSM platform this past weekend. The HOT team has been working hard since the response to Haiti to develop an OSM Tasking Server that would allow them to micro-task the tracing of satellite imagery. They demo’d the platform to me last week and I’m very excited about this new tool in the OSM ecosystem. As soon as the system is ready for prime time, I’ll get access to the backend again and will write up a blog post specifically on the Tasking Server.