WERA_OLD1015: Developing the US National Virtual Herbarium

(Multistate Research Coordinating Committee and Information Exchange Group)

Status: Inactive/Terminating

Duration: 10/01/2009 to 09/30/2014

Administrative Advisor(s):


NIFA Reps:


Non-Technical Summary

Statement of Issues and Justification

In 2008, the Western Association of Agricultural Experiment Station Directors supported the WDC 12 project for Integrating Access to Information from Herbaria. This project brought together more than 40 representatives from herbaria across the country at the Botany 2008 conference. The group unanimously supported the creation of a US Virtual Herbarium (USVH) that will provide, through collaboration with regional networks, a single portal to information in the nation's herbaria. This proposal outlines the goals and objectives of a five-year project to develop the USVH, led by a coordinating committee with representatives from herbaria and informatics. Overall, the project will support AES by enabling more efficient access to the wealth of information that resides in herbaria, thereby increasing the information available about plants that contribute to and/or impact the US agriculture industry.

Justification

Herbaria are rich sources of fine-grained information about plant diversity and biogeography and a key resource for educating students. USVH will make the resources of the country's more than 625 herbaria freely accessible to scientists, consultants, students, and members of the public via the Web. This will revolutionize research and education in systematics, ecology, land management, conservation biology, biogeography, and biodiversity informatics, just as the creation of the first herbaria in the 1540s transformed plant taxonomy (Pavord 2005) by providing, for the first time, an effective means of documenting the meaning of a name and the plants of a region. Access to more specimen information, such as USVH will provide, will enable better use of analytical tools for identifying the ecological and temporal factors that determine species distributions, prediction of areas to which an introduced species will spread or additional populations of a native species be found, and exploration of the similarity in ecological parameters determining the distribution of insect pests to the distribution of potential host species.

Placing images of herbarium specimens online will facilitate accurate identification of specimens collected in the field, a process that currently requires visiting a herbarium with a well-managed collection. Tools that draw on both distributional information and images can be used in developing powerful online identification resources, a development that will be welcomed by all those required to identify plants.

Enabling rapid visualization of species distributions will highlight the areas for which there are few collections. This can be used to encourage greater participation in documenting the nation's flora and developing greater interest in acquiring taxonomic skills. Representatives of several federal agencies, including the US Forest Service, have expressed dismay at the lack of graduates with the ability to identify plants. Development of USVH will not, in itself, solve the problem but it will make resources available that can be used to address it.

A multi-state approach to the development of USVH is essential. Neither plants nor collectors recognize state boundaries; information about a species based on its distribution in a single state or the herbaria of a single state is likely to be inaccurate. Even finding the state-level distribution of a species may require searching out-of-state herbaria. Holmgren and Holmgren (1977) reported that Stipa lemmonii (Vasey) Scribner [≡ Achnatherum lemmonii (Vasey) Barkworth] did not grow in Nevada or Utah. Barkworth and Linman (1984) discovered specimens of it from both states, one of the two from Utah being in the Gray Herbarium of Harvard University, the other in the herbarium of Northern Arizona University. Those from Nevada had simply been misidentified. In addition to ensuring that records documented only in out-of-state herbaria are not overlooked, a multi-state approach will facilitate shared development of the infrastructure required, thereby eliminating redundancy of effort and reducing the overall cost of its construction.

The goal of USVH is technologically feasible. Indeed, a few regional networks are already operational, e.g., SEINet (http://seinet.asu.edu/seinet/index.php) and the Consortium of California Herbaria (http://ucjeps.berkeley.edu/consortium/). Representatives from these networks have agreed to assist in developing USVH. A larger challenge is sociological: engaging the taxonomists in charge of herbaria, building bridges between taxonomists and computer scientists, and expanding the pool of individuals able to work at the interface between these two areas. This project focuses on development of the human resources and interactions needed to convert the idea of USVH into reality by improving dissemination of information on the processes involved. Support for software development, hardware purchases, and data entry will be sought from a variety of other sources. This project, while not contributing directly to the financial cost of digitizing herbaria and building the networks on which USVH will depend, will contribute indirectly, and substantially, by reducing redundancy in software development and accelerating dissemination of information about new and improved protocols for completing the tasks required for building USVH.

It is hard to estimate the consequences of failing to establish this coordinating committee. A national portal to US herbaria probably will be developed eventually, but it will probably take much longer and/or include only the large herbaria as does, for instance, Australia's Virtual Herbarium (http://www.chah.gov.au/avh/avh.html), which ignores all but the official state herbaria. The abundance of herbaria in the US is one of the country's strengths. The purpose of the coordinating committee is to ensure that the country benefits from the wealth of resources, both informational and human, they represent.

The US National Virtual Herbarium will draw information from regional networks and individual herbaria. Consequently, this project focuses on three different levels: aiding people at individual herbaria in digitizing collections and making these resources Web-accessible, integrating information from multiple herbaria at a regional portal, and sharing data through a national portal, including enabling regional portals to reflect records for their region from extra-regional herbaria.

Previous work and current resources


Tracking progress. The number of active herbaria within the US is not known. Over 625 are registered with Biodiversity Collections Index (BCI; http://www.biodiversitycollectionsindex.org/static/index.html), but many smaller herbaria are not registered and some of those listed are inactive. It is also not known how many individual herbaria have begun to database and/or image their collections. About 25 or 4% are listed as providing information to GBIF (inquiry sent on Jan 7, 2009). One of the first tasks of the committee will be to determine the number of herbaria in each of these categories so that it may track progress and identify bottlenecks in creating USVH.

The potential importance of small herbaria to this enterprise is hard to overestimate. They are often located in areas that, because of their remoteness, have not been well collected; among their collections are many specimens that are the only known record of a species occurring in a particular region (Edward Gilbert, pers. comm., 2009). Equally important, they are often at undergraduate institutions that provide more research opportunities to their students than is feasible at large research universities.

It is also important to ensure that those teaching plant systematics teach their students to record the kind and quality of information now required for conformity to international record standards. Supporting the regional networks will aid in achieving this.

There are two resources for recording the existence of herbaria, Index Herbariorum (IH; http://sweetgum.nybg.org/ih/) and the Biodiversity Collections Index (BCI; http://www.biodiversitycollectionsindex.org/static/index.html). These two projects are working to improve their interface so that information stored by both will only need to be entered once, but they serve different audiences and each has some unique fields. One difference, critical to this project, is that BCI accepts information from small collections; thus linking to BCI rather than IH will best serve the project's needs.

Digitizing collections. Digitizing a specimen comprises two tasks: imaging the specimen as a whole and storing the label and annotation information in a database. There is little uniformity in the procedures used for these tasks. This has led to redundancy of effort and confusion on the part of those wishing to start digitizing. At the 2008 meeting, it was clear that everyone wanted to use the most efficient methods for digitizing their herbaria, but there is great uncertainty as to the optimal combination of equipment, work flow, and software. The answer varies with the size and purpose of each collection, but factual data and clear instructions would greatly aid those seeking to start or speed up the process in their collection.

Sharing digitized information. The Global Biodiversity Information Facility (GBIF) and Taxonomic Databases Working Group (TDWG) have sponsored the development of internationally recognized standards for sharing biodiversity information. Most US herbaria export their data according to the DarwinCore standard, with or without its extensions. Software for exporting information from a herbarium database must be tailored to the database used. There are tutorials for accomplishing this that can be readily understood by computer scientists. Clearly, there will be economies if the number of different herbarium database systems is minimized, but the primary requirement is that any system used be able to accommodate the desired fields. No databasing system is perfect. Information on the strengths and weaknesses of each system needs to be accessible. Some of this information already exists, but a summary that is regularly updated would be beneficial.
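As a rough illustration of what tailoring an export to a particular database involves, the Python sketch below maps the column names of a hypothetical local herbarium database to DarwinCore terms and writes the result as CSV. The local column names (accession_no, taxon, and so on), the institution code "XYZ", the function name, and the sample record are all invented for this example; only the DarwinCore term names come from the standard. It is not the export method of any particular herbarium or network.

```python
import csv
import io

# Hypothetical local column names (left) mapped to DarwinCore terms (right).
LOCAL_TO_DWC = {
    "accession_no": "catalogNumber",
    "taxon": "scientificName",
    "collector": "recordedBy",
    "coll_date": "eventDate",
    "state": "stateProvince",
    "county": "county",
}


def to_darwin_core(records, institution_code):
    """Return CSV text with DarwinCore column headers for the given records."""
    fieldnames = ["institutionCode", "basisOfRecord"] + list(LOCAL_TO_DWC.values())
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for rec in records:
        row = {
            "institutionCode": institution_code,
            "basisOfRecord": "PreservedSpecimen",
        }
        # Copy each local field into its DarwinCore column; missing fields stay empty.
        for local_name, dwc_term in LOCAL_TO_DWC.items():
            row[dwc_term] = rec.get(local_name, "")
        writer.writerow(row)
    return buffer.getvalue()


# Invented placeholder record, not a real specimen.
sample = [
    {
        "accession_no": "12345",
        "taxon": "Achnatherum lemmonii",
        "collector": "A. Collector",
        "coll_date": "1982-06-14",
        "state": "State A",
        "county": "County B",
    }
]
dwc_csv = to_darwin_core(sample, "XYZ")
print(dwc_csv)
```

A different database would need only a different mapping table, which is one reason fewer database systems mean less duplicated export work.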

There are two protocols for sharing digitized information, Digir and Tapir (http://www.tdwg.org/activities/tapir/). Tapir is the more recent and the only one that can accommodate images; consequently, it is the one on which this project will focus. Implementing these protocols requires a background in Information Technology (IT) or computer science. The National Biological Information Infrastructure (NBII) program within the US Geological Survey has sponsored workshops on using these two protocols for people with an IT background and is willing to sponsor more such workshops. As a first step, every network must have at least one person, and preferably more, who knows how to set up, use, and maintain Tapir.

Integrating resources. For integrating the information from multiple herbaria there is another set of necessary and/or desirable software, e.g., software for data clean up, accommodating differing taxonomic treatments, georeferencing, image examination and measuring. At present, there is little if any sharing of software, nor an easy way to determine what software already exists. A clearing house that provides access to open source software for such procedures will accelerate creation of operational networks, a critical step in USVH development. Other resources are needed at a national level, e.g., a list of species of concern and the states in which they are of concern, shape files for counties that include the dates for which they are valid, software for automating the sharing of records among regions, etc. The immediate need is to determine what is needed and what is available and make this information available.
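To make the idea of accommodating differing taxonomic treatments concrete, the Python sketch below combines records from several herbaria and translates synonyms to a single name before summarizing the states in which a taxon has been recorded. The synonym pair is the one mentioned earlier in this proposal; which of the two names a network treats as accepted is an assumption here, and the herbarium names, records, and function name are invented for illustration.

```python
from collections import defaultdict

# Name as written on the label -> name used by the (hypothetical) network.
# Treating Achnatherum lemmonii as the accepted name is an assumption.
SYNONYMS = {"Stipa lemmonii": "Achnatherum lemmonii"}


def states_by_taxon(records):
    """Group the states reported for each taxon after translating synonyms."""
    distribution = defaultdict(set)
    for rec in records:
        label_name = rec["scientificName"]
        name = SYNONYMS.get(label_name, label_name)
        distribution[name].add(rec["stateProvince"])
    return {name: sorted(states) for name, states in distribution.items()}


# Invented records from two placeholder herbaria.
records = [
    {"herbarium": "Herbarium A", "scientificName": "Stipa lemmonii", "stateProvince": "Utah"},
    {"herbarium": "Herbarium B", "scientificName": "Achnatherum lemmonii", "stateProvince": "Nevada"},
    {"herbarium": "Herbarium B", "scientificName": "Achnatherum lemmonii", "stateProvince": "Utah"},
]
merged = states_by_taxon(records)
print(merged)
```

Without the translation step, the two names would be reported as separate taxa with incomplete distributions, which is the kind of error a shared synonym resource would help networks avoid.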

Objectives

The goal of this project is to accelerate development of a USVH that provides access to the specimen information in the nation's herbaria by promoting collaboration among herbaria, existing and proposed regional networks, and providers of national resources. To achieve this goal, the steering committee has established the following objectives:

  1. Increase the number of US herbaria digitizing their collections and making their data available to GBIF through the US Node that is maintained by NBII from about 50 to at least 500.
  2. Help all 12 proposed regional networks become operational by 2013, i.e., have the ability to integrate data from multiple herbaria in their region, provide it to the national node, and provide tools for answering questions about the plants in their region using data from intra- and extra-regional herbaria. At present, three networks integrate information from herbaria in their region; none integrate information from extra-regional herbaria.
  3. Identify the critical functions for the national portal as perceived by stakeholders and promote development of the relevant software.
  4. Create a highly visual and public resource for reporting participation in and progress toward developing USVH.

Procedures and Activities

Procedures

1. Develop mechanisms for tracking the progress individual herbaria are making in digitizing and sharing their information.


This information is essential for tracking progress towards our objectives. Progress will be reported via both a map and tables in order to draw attention to the project. Two resources, Index Herbariorum and the Biodiversity Collections Index, currently store information on herbaria. They are working towards seamless sharing of information, but only BCI accepts records from small herbaria, and its willingness to register smaller herbaria is critical to this project. Tracking progress in developing USVH will require some fields that are in neither resource. These could be established in an external database that is linked to BCI or, if BCI is agreeable, added to BCI.

2. Arrange workshops on efficient methods for digitizing herbarium specimens. These workshops will target botanists teaching plant systematics and/or in charge of herbaria.

At the 2008 meeting, it was evident that curators are eager to adopt efficient procedures for data capture but were not clear which currently available methods are most effective. In addition, verbal explanations of suggested methods left many unanswered questions. We shall arrange video demonstrations of various approaches currently in use, together with a breakdown of their financial and temporal costs. The first of these workshops will be held as part of the project's annual meeting. Subsequent workshops will be modified to reflect what curators identify as their needs.

Potential presenters for the first workshop are individuals NSF has funded for collection databasing and others at herbaria who, without NSF support, have made their collections available online. Other potential presenters include those who have developed software that will accelerate collection databasing.


Methods for wider dissemination of the information will be explored. Possible outlets include YouTube and SciVee in addition to Web sites.

To provide a rapid overview of the benefits of sharing information, we shall ask herbaria to enter information from all their records of a few, easily identified species, including at least one introduced species. This will help display the information density that can be obtained by combining data from multiple resources and make possible creation of a movie showing how the introduced species has spread across the country.
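One way the frames of such a movie might be derived from pooled records is sketched below in Python: for each decade, it lists the states with at least one record dated in or before that decade. The field names follow DarwinCore (eventDate, stateProvince), but the function name, the decade granularity, and the records themselves are assumptions made for illustration; the states are placeholders, not real distribution data.

```python
def cumulative_states_by_decade(records):
    """Map each decade to the sorted states first recorded in or before it."""
    first_seen = {}
    for rec in records:
        year = int(rec["eventDate"][:4])  # assumes dates start with a 4-digit year
        state = rec["stateProvince"]
        if state not in first_seen or year < first_seen[state]:
            first_seen[state] = year
    if not first_seen:
        return {}
    start = min(first_seen.values()) // 10 * 10
    end = max(first_seen.values()) // 10 * 10
    frames = {}
    for decade in range(start, end + 10, 10):
        # A state appears in a frame once its earliest record falls within or before the decade.
        frames[decade] = sorted(s for s, y in first_seen.items() if y < decade + 10)
    return frames


# Invented records for a hypothetical introduced species.
records = [
    {"eventDate": "1905-07-01", "stateProvince": "State A"},
    {"eventDate": "1923-05-12", "stateProvince": "State B"},
    {"eventDate": "1921-08-30", "stateProvince": "State A"},
    {"eventDate": "1948-06-03", "stateProvince": "State C"},
]
frames = cumulative_states_by_decade(records)
for decade, states in frames.items():
    print(decade, states)
```

Each entry corresponds to one frame of the animation; a finer time step or county-level units could be substituted where the records support it.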

Educating collectors concerning the information that should be on a label requires that those teaching systematics are aware of what is needed and why. We shall ask the networks to assist in disseminating this knowledge because those teaching at small institutions are more likely to be able to obtain funding to attend a regional meeting than a national one.

3. Arrange with NBII and other key informatics organizations for workshops on making information and images available, installing the internationally recognized Tapir protocol, and integrating records from multiple collections.

The initiative for building USVH arose within the taxonomic community. A primary need in developing USVH is to establish a group of individuals with strengths in computer science and information technology to identify the software needed to make USVH a reality. This can best be accomplished by forming an implementation committee for the project. Some individuals have already agreed to serve on such a committee (Curtis Dyreson, Computer Science, Utah State University; Richard Moe, Herbarium, University of California-Berkeley; Edward Gilbert, Southwest Environmental Information Network).

NBII hosts the US node for GBIF and, as such, has individuals familiar with the problems of integrating information from a wide range of biodiversity collections. This project will work with NBII in its goal of making biodiversity information available. In addition, the committee will tap into the expertise of other herbarium and informatics organizations for workshops and training. It will seek support for these sessions from other agencies and organizations.

USVH will store information in accordance with the DarwinCore standard, including its extensions. There are several steps before that for which appropriate software is needed. The project will encourage the development of open source software for each step. During the first year of the project, we will, with the assistance of the implementation committee, identify existing software and the willingness of its authors to make it freely available. Examples of the software needed at different stages are data capture, data cleanup, taxonomy translation, and notification of proposed taxonomic changes. Resources needed are county-level shape files that reflect changes in boundaries over time and lists of taxa of concern in individual states and/or counties.

USVH will employ internationally recognized protocols. NBII, as the official US representative to GBIF, has already offered workshops on these protocols that enable computer science professionals to appreciate the complexities of biodiversity information and how the protocols address them. To accelerate the development of USVH, we shall ask NBII and their partners to offer additional workshops that will, in addition, provide a forum for discussing the software needed for developing USVH, identifying existing resources, and identifying individuals or groups of individuals willing to help fill the unmet needs. The portion of the workshops that focuses on implementation of protocols will be designed primarily for computer science professionals. The second part, during which the software needed will be discussed, will be expanded to include individuals working at the interface of taxonomy and bioinformatics and some who are primarily botanists. The first workshop will be held in the fall of 2009. Early in the spring, networks will be asked to identify individuals willing to serve as the primary IT coordinator for their network.

Expected Outcomes and Impacts

  • A Web site that enables visitors to obtain records for individual taxa from all herbaria in the country, see their distribution on a map, or obtain a list of all species known to occur in an area selected by the visitor.
  • Increased knowledge of plant distributions in the US. This will enable more accurate ecological modeling, better land management, and improved assessment of the impact of climate change.
  • Greater involvement in documentation of the importance of the nation's plant diversity and the importance of this diversity to human welfare.
  • Creation of new, relatively easy-to-use tools for plant identification.
  • Integration of bioinformatics into plant systematics education.

Projected Participation

View Appendix E: Participation

Educational Plan

Making herbarium information Web accessible will vastly expand access to it, both for those for whom visiting a herbarium, any herbarium, is logistically impractical, for whatever reason, and for those working in a major herbarium wishing to examine specimens at another herbarium. At present, herbarium Web sites are most heavily used by systematists, who relish the access to resources at other institutions; public and private agencies; educators; and students taking courses at the institution owning the herbarium. As USVH develops, the project will encourage development of resources that draw on its information, particularly those that attract new stakeholders. Part of each annual meeting will be devoted to disseminating information about such activities.

Outreach
The resources of USVH, like those of the network nodes on which it will build, are expected to be of value to multiple audiences. To ensure that these audiences are aware of the resource, and that we know how USVH could better meet their needs, we shall:
  • Arrange workshops to disseminate information about building online herbaria to those developing the regional networks on which USVH will depend. An important component of these networks will be encouraging students to work in the area of biodiversity informatics.
  • Distribute newsletters, both as hardcopy and as pdf files, to potential users. These newsletters will highlight the features that we consider particularly valuable to the target audience, but will also include an overview of all that USVH contains.
  • Encourage members to tell colleagues in other disciplines about USVH, partly by encouraging them to co-author posters for presentation at annual meetings.
  • Work with colleagues and teachers to develop educational modules that employ the resources made available by USVH. For instance, students could be shown how to use USVH to compare the ecological preferences of two or more taxa, form and test hypotheses concerning species densities, or examine the distribution of different clades.
The outreach resources developed and techniques employed by USVH will be made available via a Creative Commons license so that others may use and modify them for their own needs.


Organization/Governance

The current proposal evolved from the work of a steering committee. It is anticipated that a more formal governance structure will be discussed and agreed to at the next meeting (summer 2009). A rotational scheme for leadership of the project will be put in place in the third year. The structure will be designed to encourage participation by individuals with different strengths and backgrounds while providing for continuity of the coordination effort.

AES Directors are requested to support the attendance at the annual meeting of a station director or herbarium curator from their state. The meetings will be held in conjunction with the annual meeting of the American Society of Plant Taxonomists, which lasts 3.5 days. This means that there will be registration and accommodation costs associated with the annual meeting of the project. It will also ensure that the maintenance of herbaria is seen as an integral part of plant taxonomy and systematics and provide those attending, for most of whom teaching and/or research are their primary obligations, an opportunity to share findings and ideas in these areas with colleagues from throughout the country.

The coordinating committee members will contribute in-kind services to the activities outlined in this proposal.

Literature Cited


Barkworth, M.E. and J. Linman. 1984. Stipa lemmonii (Vasey) Scribner (Poaceae): a taxonomic and distributional study. Madroño 31:48-56.

Holmgren, A.H. and N.H. Holmgren. 1977. Poaceae in A. Cronquist et al., Intermountain Flora, vol. 6. New York Botanical Garden Press.

Pavord, A. 2005. The naming of names. Bloomsbury Publishing, New York, New York.

Attachments

Land Grant Participating States/Institutions

CA, HI, IA, KS, LA, MI, NC, NH, NJ, OR, UT, VT, WV

Non Land Grant Participating States/Institutions

Academy of Natural Sciences of Philadelphia, Appalachian State University, Arizona State University, Arkansas Tech University, Auburn University, Black Hills State University, Boise State University, Botanical Research Institute of Texas, Center for Biological Informatics, Delaware State University, Fairmont State University, Friesner Herbarium, George Mason University, Idaho Museum of Natural History Herbarium, James Madison University, Lynchburg College, Mississippi State University, Northern Arizona University, Oregon State University, Portland State University, Smithsonian Institution, The Morton Arboretum, Troy University, Truman State University (NEMO), University of Alabama, University of Georgia, University of Hawaii at Manoa, University of Louisiana at Monroe, University of Michigan, University of Mississippi, University of Nevada, Las Vegas, University of North Carolina Wilmington, University of Oklahoma, University of Tennessee at Chattanooga, University of the Cumberlands, University of Washington, USGS, USGS/University of Wisconsin, Utah State University, Utah Valley University, Valdosta State University, Vanderbilt University Dept. of Biological Sciences, Virginia Tech, Western Carolina University