Open Source Communities, What Are the Problems? Open Climate Models (3)
I want to return to the series that I started about community approaches to climate modeling. Just to help me get started I am going to repeat the last two paragraphs from the previous entry in the series. (#1 in series, #2 in series)
I managed large weather and climate modeling activities when I was at NASA. On a good day, I maintain that I managed successfully. When I was a manager I sought control, and I grimaced at some naïve ideas of community. My experience tells me that we need to investigate new ways of model development and model use. This need arises because the complexity is too large to control, and this is especially true as we extend the need to use climate models to investigate energy policy decisions and adaptation to climate change.
In the past decade we have seen the emergence of community approaches to complex problem solving. Within these communities we see the convergence of creativity and the emergence of solution paths. We see self-organizing and self-correcting processes evolve. Counterintuitively, perhaps, we see not anarchy, but the emergence of governance in these open communities. The next entry in the series will focus more on describing open communities.
Open Communities, Open Innovation: The past 10 years have seen the emergence of open communities that do everything from building software, to collecting information about birds, to building large knowledge bases. An example that often comes to mind is Wikipedia. Wikipedia represents an immense knowledge base. Experts (and non-experts) can write and modify entries. And while anyone can modify the entries, that does not mean that there is complete anarchy. There are rules of governance, which in this case translate to editorial standards that assure some level of evaluation of information and affirm some level of accuracy. Such a standard is exemplified in, for example, Wikipedia’s policy of no original research. Wikipedia is even evolving as a place to provide documentation about Earth system modeling infrastructure.
Open communities also include efforts to build software. One of the most famous examples is the development of the computer operating system Linux. Another example of software development is the Apache Foundation. The Apache Foundation represents many software projects and, according to its website, is “not simply a group of projects sharing a server, but rather a community of developers and users.” “The Apache projects are defined by collaborative consensus based processes, an open, pragmatic software license and a desire to create high quality software that leads the way in its field.” If you explore these websites, you see that the communities are open, but that there are rules and values shared by those working in them. There is a process by which individuals' contributions migrate into the products that are branded and provided by the community. That is, there is a governance model.
The two previous paragraphs give examples of two types of community approaches, and there are other types of communities such as Project Budburst and the Encyclopedia of Life. There are grassroots communities such as the atmospheric chemistry community GEOS-CHEM. Some communities have been remarkably successful. They inspire and harvest creative solutions to complex problems. They provide a culture in which ideas and solutions converge and emerge; they are self-organizing and self-correcting. And in many cases people contribute to these communities without traditional compensation; that is, they do it for free.
What is the motivation to participate in such a community for free? And are such communities sustainable and reliable? Participation without being paid is contrary to the intuition of traditional managers. There are people who study the motivation and governance of communities, for example, Sonali Shah and Matthias Stürmer. Some participants are motivated by contributing to knowledge, and others by making their mark in some large effort. Others are motivated because they need something that is otherwise not available, and the existing efforts in the community provide the foundation for filling that need. In this case the participation in the community lets them do something that is not otherwise possible. A reason that I often find amongst scientists is the feeling that there are certain tools that should be free, and therefore, they are willing to spend the time to make the tool free, with the expectation that there are others who will also contribute their efforts. Within the federal research community, there is often the value that if tax dollars paid for the generation of data or knowledge, then that data or knowledge should be as widely available as possible (see National Institutes of Health Public Access Policy). In the same vein, sponsors of research are constantly advocating more community interaction in order to enhance capabilities and, potentially, reduce unneeded duplication of efforts.
Community-based approaches and open access to information are concepts that have been around for far longer than what we might call the internet age. Paul Edwards in his book A Vast Machine talks about how the need to share information emerged in the study of weather, because observations must be shared for weather forecasts to be useful. Throughout my career at NASA, we would occasionally be asked to do model experiments of what would happen if we (the U.S.) or some other country decided to start charging for all or part of weather data. Sometimes the studies were motivated by the argument that if “our” weather data is “so” important, then others should be paying for it. Well, it turns out everyone’s data is important; for forecasts to be good in the U.S. we need to know what is happening in Canada and out in the Pacific Ocean. So we benefit from open access to the basic information about the Earth’s environment.
I have been exploring the need for open community approaches to addressing climate change in general. The subject of the current set of articles is climate models, and whether or not we could have climate models that are not only accessible, but that could be correctly configured and run by a wide range of what might be non-expert customers of climate information. To note once again, there are numerous climate models that are accessible, and which can be altered and run by the user, for example, the Community Earth System Model. These models, however, require highly specialized expertise and computational resources.
Sticking with just the focus on climate models, arguably the open source software communities named above provide what might be called an existence criterion. That is, there is the existence of a solution. With this existence, there seem to be two questions to motivate how to go forward.
1) What are the important elements of successful open source development communities that would be required in an open innovation climate modeling community?
2) What are the similarities and differences of climate modeling to these communities that might help to advance or prohibit the development of a broader, more inclusive, climate modeling activity? Or stated in another way: is climate modeling in some way unique?
I have already hinted at one of the elements of successful software communities: namely, there must exist governance. When I first started discussing open communities with my manager colleagues in national laboratories, their first response was that climate models could not be developed, evaluated, and implemented in an uncontrolled, anarchist environment. In case you have forgotten, I started this blog with the statement that I was a government manager, and that I felt control was important for me to deliver evaluated systems on time and within budget. It is important to realize, to inculcate, that open communities are not ungoverned, and if they are functional, they are not anarchist. So the development of governance approaches is an essential element, one that will be addressed more fully in future entries.
Approaching the second question posed above, how is climate modeling different from the software developed in the successful software communities mentioned above? One difference is the need to express complex phenomena with quantitative, scientific expressions. In an earlier entry I suggested that you could imagine a climate model by posing the following questions: If you were to look around at the clouds, sky, the plants, the people, the landscape, the streams, how would you represent these things as numbers? How would you represent how these things will change? How would you represent how these things interact with each other?
If you imagine developing the operating system for a computer, there are certain well defined tasks that need to be done, and it is possible to check with some precision whether or not you have accomplished the task. In climate modeling such precise definition is not possible, which means there is always an element of scientific judgment that is needed in the evaluation of whether or not the development of a component or sub-component has been successful. And there is no reason to expect that combining successful sub-components and components yields a functioning climate model. Some would state that building, evaluating and deploying a successful climate model is not just a matter of building software, but a combined science-software activity. There is concern that community approaches that have been successful for task-oriented software projects cannot adequately incorporate the scientific integrity needed for proper climate model evaluation. This need to maintain science-based evaluation is perhaps the most formidable hurdle that must be addressed, not only for the ambitious goal I outline of configurable models for use by non-experts, but even for broader inclusion of the expert community.
I will end this entry here. Note a couple of new things below.
Another Big Flood
There have been a lot of big floods in the past year. Now we have the record flood in Australia (a great summary in the Boston Globe). I argued that the 2010 flood in Pakistan brought together people, geography, societal assets, wealth, weather and climate in a way that made it a case study in a climate disaster. So does the Australian flood, but it is, perhaps, on the opposite side of the scale.
Figure 1. From the Australian flood. Taken from the excellent summary at the Boston Globe.
Pakistani Flood Relief Links
Doctors Without Borders
The International Red Cross
MERLIN medical relief charity
U.S. State Department Recommended Charities
The mobile giving service mGive allows one to text the word "SWAT" to 50555. The text will result in a $10 donation to the UN Refugee Agency (UNHCR) Pakistan Flood Relief Effort.
Portlight Disaster Relief at Wunderground.com
An impressive list of organizations