Biodiversity Data Journal 1: e984
doi: 10.3897/BDJ.1.e984
Open access

Interactive key

A visual identification key utilizing both gestalt and analytic approaches to identification of Carices present in North America (Plantae, Cyperaceae)

Timothy Mark Jones
Louisiana State University, Baton Rouge, United States of America

Corresponding author: Timothy Mark Jones (tione54@tigers.lsu.edu)
Academic editor: Thomas Couvreur
Received: 07 Aug 2013 | Accepted: 13 Sep 2013 | Published: 16 Sep 2013
Citation: Jones T (2013) A visual identification key utilizing both gestalt and analytic approaches to identification of Carices present in North America (Plantae, Cyperaceae). Biodiversity Data Journal 1: e984. doi: 10.3897/BDJ.1.e984

Abstract

Images are a critical part of the identification process because they enable direct, immediate and relatively unmediated comparisons between a specimen being identified and one or more reference specimens. The Carices Interactive Visual Identification Key (CIVIK) is a novel tool for identification of North American Carex species, the largest vascular plant genus in North America, and two less numerous closely related genera, Cymophyllus and Kobresia. CIVIK incorporates 1288 high-resolution tiled image sets that allow users to zoom in to view minute structures that are crucial at times for identification in these genera. Morphological data are derived from the earlier Carex Interactive Identification Key (CIIK), which in turn used data from the Flora of North America treatments. In this new iteration, images can be viewed in a grid or histogram format, allowing multiple representations of data. In both formats the images are fully zoomable.

Keywords

Visual key, identification, Carex, Cymophyllus, Kobresia, interactive identification, sedges

© Jones T.
This is an open access article distributed under the terms of the Creative Commons Attribution License 3.0 (CC-BY), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Introduction

The last ten years may be remembered for the rebirth of plant taxonomy and systematics in a new guise, computational biodiversity informatics. For much of the earth, and North America in particular, botanical information that once required substantial effort to acquire is now reliably provided in seconds by such websites as the Global Biodiversity Information Facility (GBIF), Flora of North America, Missouri Botanical Garden's Tropicos, Encyclopedia of Life, United States Plants Database, and emerging regional herbarium networks. Plant biodiversity is now literally at everyone’s fingertips.

State of the art plant identification systems

Traditional biological identification systems today are of two primary types: analytic and gestalt (K. Thiele, pers. comm. 2013). Two forms of analytic keys commonly used today are dichotomous keys and interactive matrix-based keys. Both are primarily text-based question systems that can yield static images upon the final determination. Conversely, gestalt keys use an identifiable image of the organism in question, similar to what is seen in field guides.

Analytic matrix-based keys are considered to be state of the art today (The University Of Queensland 2006) due to their ability to scale up across hundreds of taxa. Users select characters to achieve a determination of the unknown taxon using a four-panel informational interface. The information panels often represented are ‘characters available’, ‘characters chosen’, ‘entities available’, and ‘entities discarded’. Within this format, it is possible to insert thumbnail-sized, static images to accompany the text if the number of taxa is relatively small (< 100).
But when the number of taxa is higher (> 100), their inclusion makes the information panel too long to be usable; the Carices used here, for example, would require copious scrolling across a panel many meters long.

Visual keys borrow from both gestalt and analytic methods. They use character matrices for initial analytic pruning of the image set. After a few character choices, the many hundreds of small images are reduced to a manageable set of bigger images. Gestalt methods then take over as the images become larger and truly informative. This hybrid, featuring the best of both gestalt and analytic approaches, yields a novel identification method that can cater to the neophyte as well as the expert.

Carex, Kobresia, and Cymophyllus: a model for scalability

Carex is the largest vascular plant genus in North America (Ball and Reznicek 2002). With two closely related genera, Kobresia and Cymophyllus, it forms the Carices of North America; all three are members of the family Cyperaceae, commonly called sedges but often erroneously referred to as grasses. These three genera share a number of basic morphological characteristics, including linear leaves and a fruit enclosed in a bag-like structure called a perigynium. All have small flowers that lack large, colorful petals and sepals. They also share one other important characteristic: they are difficult to identify. Nevertheless, they are morphologically distinct and relatively easily recognizable as a group.

The new visual key

The data used in this project are primarily derived from an interactive identification program to Carex that has been online since 2006 at both Utah State University and Louisiana State University (http://www.herbarium.lsu.edu/keys/carex/carex.html). During this time it has been consistently revised and is currently in version 21 (Suppl. materials 3, 4).
Web statistics have been tracked since 2007. The data show that numerous individuals worldwide, government agencies, students in classrooms, and participants in identification workshops have repeatedly used the keys. Many users have graciously suggested revisions and clarifications that have increased their usability and performance. The key presented here reflects contributions from several individuals, innumerable field trips, and countless hours in herbaria both identifying and imaging specimens. It is only with such collaboration and effort that an image key to such a large genus can be created.

Goals

My goal in this project was to create an easy-to-use identification resource that maximized the value of high-resolution images while enabling users to explore the distribution of morphological diversity within the genera. The images are queryable and can answer questions such as: How are species with trigonous achenes geographically distributed across Canada by province or territory? How common are species with two-sided achenes among species with leaf blades more than 10 mm wide? These sorts of questions are easily answered in histogram mode (Fig. 4). For the first time, side-by-side image comparisons are possible across species, permitting comparative examination and discrimination among closely related members of any complex, of which there are many within the Carices. CIVIK is available at http://www.herbarium2.lsu.edu/aba/

Project description

Title: Development of visual identification tool

Study area description: This key is designed for use in North America, including Mexico. The original descriptive data were derived from Flora of North America (Ball and Reznicek 2002) and Mackenzie (1940). My images come from fieldwork focused in eastern North America, while other individuals have contributed images from other locations across North America.

Design description:

1. IMAGES

1.1.
Contributors

Steve Matson and Tony Reznicek each sent a DVD copy of their Carex field images. Lowell Urbatsch contributed his teaching microscopy images (http://www.herbarium.lsu.edu/keys/ eee/b52.html). My images were collected from many field sites, primarily in the northeastern United States. The New York Botanical Garden Press granted the use of the plates of both North American Cariceae volumes (Mackenzie 1940). The remaining images were found on the World Wide Web (WWW), and their owners (Forest Starr, Kim Starr, Nhy Nyugen, Ann Debolt) were contacted by email to request permission for their use. The remaining image contributor, Robert Mohlenbrock, had made the image used here available on http://www.plants.usda.gov/ so it could be used without seeking permission.

1.2. Processing of images

To manage the large number of images (e.g., Matson, hundreds of images; Jones, many thousands), each owner's set of images was kept separate on a local drive. Predictably, with this many image contributors, naming conventions differed greatly, so significant renaming of image files was required. The basic convention was to include the taxon name, the type of image, and the author in the file name. Many of these images had also been prepared for delivery via the WWW and had been resized. Larger files were selected for inclusion, while those originally designed as thumbnails were not used. Rarely, older images that were scanned from slides were cropped or otherwise manipulated with Photoshop CS3. Lastly, images often had to be rotated for appropriate orientation.

1.2.1 Image sizes

Image sizes are variable and range from 40 KB to over 13 MB. Line drawings and most images by Jones are 2848 x 4288 pixels with a maximal bit depth of 24. Matson's images were more variable, as some had been prepared for web use. They range from 2592 x 3888 to 550 x 689 pixels with variable bit depths.
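The renaming convention from section 1.2 (taxon name, type of image, author) and the exclusion of thumbnail-sized files can be sketched in Python. This is only an illustration, not the procedure actually used: the function names, the underscore and hyphen separators, and the 500-pixel cutoff are all assumptions introduced here.

```python
import re


def build_image_name(taxon, image_type, author, extension=".jpg"):
    """Compose a file name from taxon, image type, and author (hypothetical layout)."""

    def slug(text):
        # Collapse every run of non-alphanumeric characters into one hyphen.
        return re.sub(r"[^A-Za-z0-9]+", "-", text.strip()).strip("-")

    parts = [slug(taxon), slug(image_type), slug(author)]
    if not all(parts):
        raise ValueError("taxon, image type, and author are all required")
    return "_".join(parts) + extension.lower()


def is_thumbnail(width, height, min_long_edge=500):
    """Flag images whose longer edge falls below an assumed cutoff (500 px here)."""
    return max(width, height) < min_long_edge


if __name__ == "__main__":
    print(build_image_name("Carex scirpoidea", "perigynium", "Jones", ".JPG"))
    print(is_thumbnail(550, 689))  # smallest image size reported in section 1.2.1
    print(is_thumbnail(120, 90))
```

Under these assumptions, a perigynium image of Carex scirpoidea by Jones would be named Carex-scirpoidea_perigynium_Jones.jpg, and the smallest size reported in section 1.2.1 (550 x 689) would still be retained.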
Other contributed images are of intermediate sizes.

1.3. Imaging of Mackenzie's plates

New York Botanical Garden Press gave permission to image the plates in K. K. Mackenzie's two-volume treatment of the Carices of North America (Mackenzie 1940) for use in this project. All plates were imaged in JPEG format on a traditional copy stand, using a Nikon 300D camera with a 1:1 macro lens and two halogen desk lamps for illumination. All images required batch processing in Photoshop CS3 to correct color and a minor skew defect. Additionally, to limit the total file size of the project, the images were resized from approximately three megabytes to approximately one megabyte.

2. DATA FOR CXML CREATION

2.1. Primary data via export

The dataset was derived from an export of CIIK (http://www.herbarium.lsu.edu/keys/carex/carex.html) in comma-separated values (CSV) format from LUCID 3.4 Identification Software (The University Of Queensland 2006). These data were the template for the new secondary dataset (Fig. 1). The exported data were imported into Excel 2010, and the Excel PivotViewer plug-in generated the Collection XML (CXML) version of the data (Suppl. material 1). This plug-in has since been deprecated in favor of a command line tool, Pauthor (Microsoft 2010a, Microsoft 2010b).

Figure 1. Workflow of project. Panel labels: Images for gestalt; Data for analytic; Combined; Internal data; External data.

2.2. Dependent software

.NET Framework (Microsoft 2007)
Visual Studio 2010 / 2012
Silverlight 4 Tools for Visual Studio 2010
Silverlight Software Development Kit (SDK)
Silverlight 4 Toolkit
PivotViewer SDK

2.3. Interface considerations in a micro-ontology

In PivotViewer with the Silverlight 4 format, the characters and states (C&S) are located in the searchable information pane on the left, with the displayable information pane on the right.
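The characters and states shown in the left pane correspond to facet categories in the CXML file. As a rough illustration of the export-to-CXML step in section 2.1, the Python sketch below turns rows of a CSV export into a CXML-like document. It is not the Excel plug-in or Pauthor. The column names and sample rows are invented for illustration, and the namespace, element, and attribute names are written from the published Pivot collection schema as assumptions that should be checked before use. Image (Deep Zoom) references are omitted.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Assumed namespace of the Pivot collection schema; verify before relying on it.
CXML_NS = "http://schemas.microsoft.com/collection/metadata/2009"

# Invented sample standing in for a CSV export; not data from the key.
SAMPLE_CSV = """Taxon,Achene shape,Leaf blade width
Example taxon A,trigonous,more than 10 mm
Example taxon B,two-sided,
"""


def rows_to_cxml(rows, name_column, collection_name):
    """Build a CXML-like tree: one Item per row, one Facet per non-empty cell."""
    rows = list(rows)
    facet_names = [c for c in rows[0] if c != name_column] if rows else []

    root = ET.Element("Collection", {
        "xmlns": CXML_NS,
        "Name": collection_name,
        "SchemaVersion": "1.0",
    })

    # Every remaining column becomes a filterable character (facet category).
    categories = ET.SubElement(root, "FacetCategories")
    for facet in facet_names:
        ET.SubElement(categories, "FacetCategory", {"Name": facet, "Type": "String"})

    items = ET.SubElement(root, "Items")
    for index, row in enumerate(rows):
        item = ET.SubElement(items, "Item", {"Id": str(index), "Name": row[name_column]})
        facets = ET.SubElement(item, "Facets")
        for facet in facet_names:
            value = (row.get(facet) or "").strip()
            if not value:
                continue  # unscored character states are simply left out
            facet_el = ET.SubElement(facets, "Facet", {"Name": facet})
            ET.SubElement(facet_el, "String", {"Value": value})
    return root


if __name__ == "__main__":
    reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
    tree = rows_to_cxml(reader, "Taxon", "Demo collection")
    print(ET.tostring(tree, encoding="unicode"))
```

Each facet category written this way would appear as one filter group in the left pane, which is why short character and state strings matter.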
This left pane is of a fixed width and lacks word wrapping (Fig. 2). If all of the data-mined C&S information were used, extensive scrolling would be required, reducing the usability of the key. For this reason, long text strings in the C&S were edited for brevity. A ‘less is more’ approach was taken, with C&S being restricted to those that would be appropriate in an ontology.

    xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
    xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
    xmlns:d="http://schemas.microsoft.com/expression/blend/2008"
    xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006"
    xmlns:local="clr-namespace:System.Windows.Pivot;assembly=System.Windows.Pivot"
    mc:Ignorable="d" d:DesignHeight="300" d:DesignWidth="400"
    Loaded="UserControl_Loaded">

XAML.CS or Code behind

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Net;
    using System.Windows;
    using System.Windows.Controls;
    using System.Windows.Documents;
    using System.Windows.Media;
    using System.Windows.Media.Animation;
    using System.Windows.Shapes;
    using System.Windows.Pivot;
    using System.Windows.Input;

    namespace A10
    {
        public partial class MainPage : UserControl
        {
            public MainPage()
            {
                InitializeComponent();
                // Load the CXML collection into the PivotViewer control named Pivot;
                // the empty string means no initial viewer state.
                Pivot.LoadCollection("http://www.herbarium2.lsu.edu/aba/A10.cxml", string.Empty);
            }

            private void UserControl_Loaded(object sender, RoutedEventArgs e)
            {
            }
        }
    }

Additional information

Later examples of visual keys deal with the clustering problem differently. Both the Silverlight and HTML 5 based keys to the grass genera of Louisiana use existing herbarium specimen images to normalize, with one herbarium specimen per taxon, leveraging recent, physical, and vetted sources. This normalization character is selectable as ‘one-to-one comparisons’ at the bottom of the character information panel (http://www.herbarium2.lsu.edu/grass2/). Secondly, Kingdom Plantae in HTML 5 is normalized by image number only, without a selectable character state, across divisions (http://www.herbarium2.lsu.edu/aca/).
Magnoliophyta is taken at a log value because its number of taxa is so much larger than that of the other divisions.

Acknowledgements

The author sincerely appreciates the ground-breaking work completed by others before this project even began. Without these prior efforts, the current project could not have been completed in the same time frame. A sincere thank you to all the editors of Flora of North America, Volume 23, and the image contributors. To G. Wilder, J. Bissell, M. Barkworth, A. Reznicek, K. Niklas, and my Ph.D. advisor, L. Urbatsch, thank you for sharing your wisdom and support. Also, I wish to thank W. Thomas and K. Thiele for editorial commentary on this manuscript.

Author contributions

Jones developed the project and contacted the other contributors for images. S. Matson and T. Reznicek each mailed a DVD copy of their Carex field images. L. Urbatsch's teaching microscopy images were copied and saved to USB thumb drives. New York Botanical Garden Press permitted the use of the images of both North American Cariceae volumes by K. K. Mackenzie. The remaining image owners were found on the WWW and contacted by email. Thankfully, they granted permission for usage; they include F. Starr and K. Starr, N. Nyugen, and A. Debolt. R. Mohlenbrock's image was gathered from plants.usda.gov.

References

* Ball P, Reznicek A (Ed.) (2002) Flora of North America. Magnoliophyta: Commelinidae (in part); Cyperaceae. Vol. 23. Oxford University Press, New York, 608 pp. [In English]. URL: http://www.efloras.org/florataxon.aspx?flora_id=1&taxon_id=10246 [ISBN 0-19-515207-7].
* Mackenzie K (1940) North American Cariceae. 1 & 2. New York Botanical Garden Press, New York, 539 pp.
* Microsoft (2007) .NET Framework. 3.5. Microsoft. Release date: 2007 11 20. URL: http://www.microsoft.com/en-us/download/details.aspx?id=21
* Microsoft (2008) Deep Zoom. 0.9.000.5. Microsoft. Release date: 2008 10 13.
URL: http://msdn.microsoft.com/en-us/library/cc645050%28VS.95%29.aspx
* Microsoft (2010a) Microsoft Silverlight PivotViewer. Microsoft. Release date: 2010 8 09. URL: http://www.microsoft.com/en-us/download/details.aspx?id=17747
* Microsoft (2010b) Pivot Collection Tool for the Command Line. 1.2. Microsoft. Release date: 2010 7 13. URL: http://pauthor.codeplex.com/
* The University Of Queensland (2006) Lucid Software 3.4. URL: http://www.lucidcentral.com/

Supplementary materials

Suppl. material 1: Tertiary file structure for Carices CXML file
Authors: Jones, T. M.
Data type: occurrences, morphological
Filename: A10.cxml - Download file (4.19 MB)

Suppl. material 2: Secondary Carex morphology data; cleaned and truncated for building CXML
Authors: Jones, T. M.
Data type: occurrences, morphological, images
Brief description: This file is an example of a build file for the creation of the CXML file.
Filename: 957am fixed scirpoidea space issue.xlsx - Download file (483.24 kb)

Suppl. material 3: Website data from Utah State University
Authors: Google Analytics
Data type: PDF
Brief description: Data sheet for visitation to CIIK by country
Filename: Analytics utc.usu.edu_keys_Carex_Carex.html Location 20060531-20130630.pdf - Download file (180.01 kb)

Suppl. material 4: Website data from Louisiana State University
Authors: Google Analytics
Data type: PDF
Brief description: Data sheet for visitation to CIIK by country
Filename: Analytics Carex key LSU Location 20060531-20130630.pdf - Download file (178.08 kb)

Suppl. material 5: Primary Carex morphology data from Lucid 3.4
Authors: Jones, T. M.
Data type: XLSX
Brief description: Export from CIIK 2013 in CSV format
Filename: Carex-all-CSV.xlsx - Download file (732.82 kb)

Suppl. material 6: CIVIK usage 2011 - 2013
Authors: Google Analytics
Data type: PDF
Brief description: This includes all visual keys developed.
Here CIVIK is represented by both /aba/ and /aaa/ and their iterations.
Filename: Analytics www.herbarium2.lsu.edu_aaa_A5TestPage.html Pages 20100531-20130630.pdf - Download file (168.54 kb)

Suppl. material 7: Visual keys usage with Google Analytics
Authors: Google
Data type: analytics
Brief description: Compilation of all visual keys using Google Analytics
Filename: Analytics www.herbarium2.lsu.edu-aaa-A5 TestPage.html Language 20100809-20130908.pdf - Download file (189.65 kb)