Monday, September 16, 2024

GIS 5935 Module 2.1 - Surfaces [Triangulated Irregular Networks and Digital Elevation Models]

Module 2.1 of Special Topics in GIS was based on surfaces, particularly Triangulated Irregular Networks [TINs] and Digital Elevation Models [DEMs]. The first portion of the lab was an opportunity to import elevation data, set the ground source [enabling 3D visualization], and learn how to exaggerate vertical distances to enhance the visual aesthetics of the landscape. Once these fundamental concepts were practiced, an analytical problem was presented.

The second portion of the lab was to create a Suitability Map for a study area that illustrates the best locations for a ski resort and its associated ski run. Suitability was determined based on the slope, elevation, and aspect [directional face] of the landscape. The dark green areas of the map below display the most suitable locations for the resort, and the red areas signify areas that are unsuitable for this tourist destination.
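Although the lab was completed with ArcGIS Pro's reclassification and overlay tools, the underlying logic can be sketched in Python with NumPy. The thresholds, scores, and synthetic rasters below are hypothetical; the sketch only illustrates scoring each factor and combining the scores with equal weights.

import numpy as np

def reclassify(raster, bins, scores):
    """Map raster values into suitability scores using interval breaks."""
    idx = np.digitize(raster, bins)               # interval index for each cell
    return np.asarray(scores)[idx].astype(float)

# Hypothetical input surfaces: slope in degrees, elevation in meters, aspect in azimuth degrees
slope = np.random.uniform(0, 45, (200, 200))
elevation = np.random.uniform(1500, 3000, (200, 200))
aspect = np.random.uniform(0, 360, (200, 200))

# Hypothetical scoring [the lab's actual criteria determine which ranges score highest]
slope_score = reclassify(slope, bins=[10, 25], scores=[3, 2, 1])
elev_score = reclassify(elevation, bins=[2000, 2500], scores=[1, 2, 3])
aspect_score = reclassify(aspect, bins=[45, 315], scores=[3, 1, 3])   # azimuths near 0/360 [north] share a score

# Equal-weight combination; the darkest green areas on the map correspond to high totals
suitability = (slope_score + elev_score + aspect_score) / 3.0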


The final portion of this lab dealt with TINs and DEMs. The Environmental Systems Research Institute [ESRI] describes Triangulated Irregular Networks as 'a form of vector-based digital geographic data [that] are constructed by triangulating a set of vertices [points]. The vertices are connected with a series of edges to form a network of triangles' [ESRI, 2024]. The documentation for this model type is quite extensive and is listed in the sources below. Furthermore, ESRI defines Digital Elevation Models as 'a raster representation of a continuous surface, usually referencing the surface of the earth' [ESRI, 2024]; the complete DEM documentation is also listed in the sources below. Once these two models were created, contour lines were derived at 100-meter increments [see map below]. Contours represent lines of equal elevation that are spaced at equal intervals [Bolstad & Manson, 2022].


Due to the data types of the models used to create the contour lines, the output of each varies in appearance. Since TINs are based on points, lines, and polygons, their contour lines take on a very segmented appearance. Conversely, a raster-based DEM creates contour lines with a much smoother, more organic appearance.
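For comparison, deriving contour lines at a fixed interval can also be sketched outside of ArcGIS. The snippet below uses Matplotlib on a small synthetic elevation surface; it is only meant to illustrate how contour levels are generated every 100 meters, not to reproduce the lab's DEM.

import numpy as np
import matplotlib.pyplot as plt

# Synthetic elevation surface in meters [purely illustrative]
x, y = np.meshgrid(np.linspace(0, 10, 400), np.linspace(0, 10, 400))
dem = 800 + 400 * np.exp(-((x - 5) ** 2 + (y - 5) ** 2) / 8)

# Contour levels every 100 meters between the surface minimum and maximum
interval = 100
levels = np.arange(np.floor(dem.min() / interval) * interval, dem.max() + interval, interval)

cs = plt.contour(x, y, dem, levels=levels, colors="black", linewidths=0.5)
plt.clabel(cs, fmt="%d m")
plt.title("Contours derived at 100 m increments")
plt.show()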


Sources:

Bolstad, Paul & Manson, Steven. (2022). GIS Fundamentals: A First Text on Geographic Information Systems (7th Edition). Eider Press.

Environmental Systems Research Institute. (2024). Exploring Digital Elevation Models

Environmental Systems Research Institute. (2024). What is a TIN Surface?  


Tuesday, September 10, 2024

GIS 5935 Module 1.3 - Data Quality Assessment

Module 1.3 of Special Topics in GIS was a continuation of data quality; this module focused on the completeness of datasets, particularly roadway networks. Two datasets were provided for the completeness assessment: one was obtained from Jackson County, Oregon, and the other was downloaded from the United States Census Bureau TIGER shapefile repository. While both datasets contain roadway centerlines, their overall lengths are significantly different. The spatial analysis performed on these datasets was to ascertain which one is more complete, based on length alone. Initially, before any processing was performed, the TIGER shapefile consisted of 11,382.7 kilometers of roadway centerlines while the Jackson County dataset accounted for 10,873.3 kilometers, making the TIGER dataset more complete by this measure.
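As a rough illustration of this comparison, the total centerline length of each dataset can be computed with GeoPandas. The file names and projected coordinate system below are placeholders; the only assumption is that both layers use a projection with linear units of meters so lengths can be converted to kilometers.

import geopandas as gpd

# Hypothetical file names; EPSG:26910 [NAD83 / UTM zone 10N] is an assumed projection for southern Oregon
tiger = gpd.read_file("tiger_roads.shp").to_crs(epsg=26910)
county = gpd.read_file("jackson_county_roads.shp").to_crs(epsg=26910)

tiger_km = tiger.geometry.length.sum() / 1000.0
county_km = county.geometry.length.sum() / 1000.0

print(f"TIGER centerlines:          {tiger_km:,.1f} km")
print(f"Jackson County centerlines: {county_km:,.1f} km")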

The next process of this lab was to analyze completeness according to the method described by Haklay [2010]. Essentially, this method consists of overlaying a grid index on top of the datasets and creating a thematic map of their percentage differences. For this lab, the grid consisted of 5-kilometer squares set within the confines of the county border. Next, the roadways were clipped to the grid index, removing any segments lying outside of it. After this, the roadways had to be split at the grid cell boundaries, and the individual roadway segments within each cell had to be dissolved into one multipart feature. Once these processes were completed for each dataset, a comparison between the two could be made on a cell-by-cell basis [see map below].
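A GeoPandas sketch of the per-cell step is shown below as a stand-in for the clip / split / dissolve sequence performed in ArcGIS Pro: intersecting each road layer with the grid splits the roads at cell boundaries, and summing segment lengths per cell replaces the dissolve. The layer names and the cell_id field are assumptions.

import geopandas as gpd

grid = gpd.read_file("grid_5km.shp")                   # 5-kilometer index cells with a cell_id field
roads = gpd.read_file("jackson_county_roads.shp").to_crs(grid.crs)

# Overlay splits every road at cell boundaries and tags each piece with its cell_id
pieces = gpd.overlay(roads, grid[["cell_id", "geometry"]], how="intersection")

# 'Dissolving' per cell reduces to summing segment lengths within each cell
length_per_cell = (pieces.assign(length_km=pieces.geometry.length / 1000.0)
                         .groupby("cell_id")["length_km"].sum())
print(length_per_cell.head())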

For this particular assignment, the Jackson County dataset was used as the baseline for all comparisons. Haklay [2010] suggests that the completeness of datasets obtained from local jurisdictions often surpasses that of datasets downloaded from national agencies or volunteered geographic information sources, such as OpenStreetMap. Once the baseline dataset was determined, executing the following formula on each grid cell provided a percentage difference that was used to create the map above:

[[Jackson County Length - TIGER Length] / Jackson County Length] * 100
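Applied per grid cell [for instance to the per-cell length sums computed above], the formula is a one-line calculation; the values below are hypothetical, and cells with no baseline roadways would need special handling.

import pandas as pd

# Hypothetical per-cell lengths in kilometers for three grid cells
county_km = pd.Series({1: 42.0, 2: 10.5, 3: 0.0})
tiger_km = pd.Series({1: 40.3, 2: 12.8, 3: 0.0})

# [[Jackson County Length - TIGER Length] / Jackson County Length] * 100
percent_diff = ((county_km - tiger_km) / county_km * 100.0).fillna(0.0)
print(percent_diff)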

As displayed in the map's legend, positive percentages represent cells where the Jackson County dataset is more complete than the TIGER dataset. Conversely, negative percentages represent cells where the TIGER dataset is more complete than the Jackson County shapefiles. Lastly, it is worth mentioning that the distribution of the percentage differences was skewed, with some drastic outliers in the negative percentage range; because of this skewed distribution, I decided to apply a manual interval classification scheme rather than an equal interval or quantile scheme. This decision prevented the few extreme outliers from misrepresenting the overall pattern.

Overall, this assignment was another exciting opportunity to explore the ongoing issue of spatial data quality in the realm of Geographic Information Sciences.


Source:

Haklay, M. (2010). How Good is Volunteered Geographic Information? A Comparative Study of OpenStreetMap and Ordnance Survey Datasets. Environment and Planning B: Planning and Design, 37(4), 682-703.

Monday, September 2, 2024

GIS 5935 Module 1.2 - Spatial Data Quality

Lab assignment 1.2 of Special Topics in GIS involved performing an accuracy assessment according to the National Standard for Spatial Data Accuracy [NSSDA]. The Positional Accuracy Handbook states that 'the National Standard for Spatial Data Accuracy describes a way to measure and report positional accuracy of features found within a geographic dataset. Approved in 1998, the NSSDA recognizes the growing need for digital spatial data and provides a common language for reporting accuracy' [Minnesota Planning, 1999]. For this assignment, two datasets were provided for a study area located in the City of Albuquerque, New Mexico. The first dataset was obtained from the City of Albuquerque, and the second was a StreetMap USA dataset, which is a product of TeleAtlas and is distributed by ESRI with the ArcGIS software package. Both datasets consist of roadway networks and can be seen in the map below. The green lines represent the City of Albuquerque [ABQ] dataset, and the red lines represent the StreetMap USA dataset.

The process for this accuracy assessment is clearly outlined in the Positional Accuracy Handbook [Minnesota Planning, 1999] and is summarized through the remainder of this post. The first step of the assessment was to determine whether the test involves horizontal accuracy, vertical accuracy, or both; for this assignment, we focused on horizontal accuracy only. Second, test points were selected throughout the study area; these points are displayed as black X's in the map above. Specific guidelines provided in the handbook aid in the appropriate selection of test points. Next, an independent dataset of higher accuracy had to be chosen in order to complete the assessment. To accomplish this, 2006 Digital Orthophoto Quadrangles [United States Geological Survey] were used to identify intersections as reference points, and horizontal accuracy assessments were performed on the two provided datasets. It is noteworthy, however, that a quick visual analysis suggests that the City of Albuquerque dataset aligns more consistently with the USGS DOQs than its StreetMap USA counterpart; this visual analysis should be consistent with the results calculated at the end of the NSSDA assessment. Measurements were taken from each dataset to the digitized reference points located at the chosen street intersections on the USGS DOQs. Once these measurements were obtained, the NSSDA Horizontal Accuracy Statistic Worksheet could be completed for each of the datasets [a brief code sketch of this calculation is included at the end of this post]; the results for the City of Albuquerque are shown below:
Here are the results obtained from the StreetMap USA dataset:
As predicted, the value calculated for the City of Albuquerque dataset is much lower than the value calculated for the StreetMap USA dataset. The next, or sixth, step of the assessment is to construct an accuracy statement that clearly reports the dataset's accuracy at the 95% confidence level. The accuracy statements for the two datasets are written below:

ABQ Dataset:

Using the National Standard for Spatial Data Accuracy, the data set tested 14.27ft horizontal accuracy at 95% confidence level.

StreetMap Dataset:

Using the National Standard for Spatial Data Accuracy, the data set tested 379.66ft horizontal accuracy at 95% confidence level.

This provides the user with a clearly defined radius within which 95% of all features are expected to fall. As determined above, a user can expect that 95% of all features in the City of Albuquerque dataset will fall within 14.27 feet of their true geographic location, and 95% of all features in the StreetMap USA dataset will fall within 379.66 feet of their true geographic location. In conclusion, the horizontal accuracy of the City of Albuquerque dataset is substantially higher than that of the StreetMap USA dataset. The final step of the NSSDA Horizontal Accuracy Assessment is to include the report in a comprehensive description of the dataset called metadata, or 'data about the data' [Minnesota Planning, 1999]. An example of this comprehensive description is provided below:

Example of Detailed positional accuracy statements as reported in metadata:

Digitized features of the roadway infrastructure located within the study area of Albuquerque, New Mexico were obtained from the City of Albuquerque and from StreetMap USA, a product of TeleAtlas and distributed by ESRI with ArcGIS. Those obtained from the City of Albuquerque tested at 14.27ft horizontal accuracy at the 95% confidence level, and those obtained from StreetMap USA tested at 379.66ft horizontal accuracy at the 95% confidence level using modified NSSDA testing procedures. See Section 5 for entity information of digitized feature groups. See also Lineage portion of Section 2 for additional background. For a complete report of the testing procedures used, contact the University of West Florida GIS Department as noted in Section 6, Distribution Information.

Levels of vertical relief were not considered throughout the entire accuracy assessment of these two datasets.

All other features are generated by coordinate geometry and are based on a visually identified framework of roadway intersections. Computed positions of roadway intersections, or test points, are not based on individual field surveys. Although tests of randomly selected points for comparison may show varying degrees of accuracy between the provided datasets, overall visual analysis confirms higher levels of accuracy throughout the entire City of Albuquerque dataset. However, caution is necessary when using the roadway intersections as shown, due to the location process employed throughout this assessment. For more information, contact the GIS department at the University of West Florida.
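As a supplement to the worksheet described earlier, the NSSDA arithmetic [radial RMSE between test and reference points multiplied by 1.7308] can be sketched in a few lines of Python. The coordinates below are placeholders rather than the lab data.

import numpy as np

# Test [dataset] and reference [DOQ-digitized] coordinates in feet; placeholder values only
test = np.array([[1000.0, 2000.0], [1500.0, 2500.0], [1800.0, 2100.0]])
ref = np.array([[1004.2, 1997.1], [1496.8, 2503.5], [1803.9, 2098.2]])

dx = test[:, 0] - ref[:, 0]
dy = test[:, 1] - ref[:, 1]

rmse_r = np.sqrt(np.mean(dx**2 + dy**2))     # radial Root Mean Square Error
nssda_95 = 1.7308 * rmse_r                   # NSSDA horizontal accuracy at 95% confidence

print(f"Tested {nssda_95:.2f} ft horizontal accuracy at 95% confidence level")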


Source:

Minnesota Planning. (1999). Positional Accuracy Handbook: Using the National Standard for Spatial Data Accuracy to Measure and Report Geographic Data Quality. Minnesota Planning, St. Paul, MN.

Monday, August 26, 2024

GIS 5935 Module 1 - Data Precision and Accuracy

Module 1 of Special Topics in GIS dealt with the precision and accuracy of waypoints gathered with a GPS data collection unit. The International Organization for Standardization defines precision as 'the closeness of agreement between independent test results obtained under stipulated conditions' [ISO, 2006]. With regard to the lab assignment, precision was assessed by determining the proximity of fifty waypoints gathered at a single location using a Garmin GPSMAP 76 data collection unit. As shown in the map below, many of the waypoints are in close proximity while others deviate from the majority. For this part of the lab, the mean was calculated for the x, y, and z coordinates of all fifty waypoints; this 'average' location is denoted on the map as a red 'X'. Once this average location was calculated, an analysis could be performed on the distance between each waypoint and the calculated average location. This precision analysis concludes that 50% of all gathered waypoints fall within 3.1 meters of the average location, 68% fall within 4.5 meters, and 95% fall within 14.8 meters. Whether these precision results would suffice varies widely between applications. These percentile distances may be acceptable and appropriate for one scenario and widely unacceptable in another; precision, therefore, is relative and must be defined at the beginning of each project.
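The precision calculation itself is straightforward and can be sketched in Python with NumPy; the random points below simply stand in for the fifty GPS waypoints.

import numpy as np

rng = np.random.default_rng(0)
waypoints = rng.normal(loc=[500000.0, 3300000.0], scale=4.0, size=(50, 2))   # placeholder coordinates in meters

mean_xy = waypoints.mean(axis=0)                          # the 'average' location [red X]
distances = np.linalg.norm(waypoints - mean_xy, axis=1)   # horizontal distance of each waypoint from it

for p in (50, 68, 95):
    print(f"{p}% of waypoints fall within {np.percentile(distances, p):.1f} m of the average location")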


The second part of the lab assignment dealt with accuracy and the different tools that can be employed to determine the extent of accuracy within a dataset. According to GIS Fundamentals: A First Text on Geographic Information Systems, an accurate observation 'reflects the true shape, locations, or characteristics of the phenomena represented in a GIS', meaning that accuracy is a 'measure of how often or by how much our data values are in error' [Bolstad & Manson, 2022, p. 609]. For this portion of the lab, a dataset was provided and analyzed entirely in Microsoft Excel. This was very beneficial, as it provided an excellent opportunity to step away from the comfort of ArcGIS Pro and into a less familiar program. The first approach used to summarize the dataset's accuracy was a series of manual formulas, including minimum, maximum, mean, median, and Root Mean Square Error [RMSE]. The second method used to display the accuracy of the dataset was a Cumulative Distribution Function [CDF] graph, which is displayed below.
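The Excel work can be mirrored in Python as well: basic statistics plus RMSE summarize the positional errors, and sorting the errors yields the empirical CDF. The error values below are synthetic placeholders.

import numpy as np
import matplotlib.pyplot as plt

errors = np.abs(np.random.default_rng(1).normal(0, 3.0, 200))   # hypothetical error magnitudes in meters

stats = {
    "min": errors.min(),
    "max": errors.max(),
    "mean": errors.mean(),
    "median": np.median(errors),
    "RMSE": np.sqrt(np.mean(errors**2)),
}
print(stats)

# Empirical Cumulative Distribution Function
sorted_err = np.sort(errors)
cumulative = np.arange(1, len(sorted_err) + 1) / len(sorted_err)
plt.plot(sorted_err, cumulative)
plt.xlabel("Error (m)")
plt.ylabel("Cumulative proportion")
plt.title("Cumulative Distribution Function of positional error")
plt.show()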


Additional information regarding Root Mean Square Error and the Cumulative Distribution Function can be found in the ScienceDirect topic pages listed in the sources below; both provide an extensive overview of their respective topics.


Sources:

Bolstad, Paul & Manson, Steven. (2022). GIS Fundamentals: A First Text on Geographic Information Systems (7th Edition). Eider Press.

International Organization for Standardization. (2006). Statistics - Vocabulary and Symbols. Part 1: General Statistical Terms and Terms Used in Probability. International Organization for Standardization.

ScienceDirect. (2024). Cumulative Distribution Function. https://www.sciencedirect.com/topics/mathematics/cumulative-distribution-function.

ScienceDirect. (2024). Root Mean Square Error. https://www.sciencedirect.com/topics/engineering/root-mean-square-error.

Thursday, August 8, 2024

GIS 5100 Module 6 - Part II: Least Cost Path and Corridor Analysis

In the second half of Module 6, we built upon the knowledge obtained in Scenarios One and Two: reclassifying raster datasets to generate a Suitability Map. From there, we were able to continue analyzing the datasets to obtain a Least Cost Path. A Least Cost Path analysis is a geoprocessing workflow that determines the lowest-cost route from a source location to a destination across a cost surface. As stated in the first post for Module 6, 'cost' is a relative term that does not necessarily refer to monetary units; in the map below, a Least Cost Path analysis was performed for an oil company wishing to install a pipeline through a southwestern portion of the state of Oregon. Once the datasets have been reclassified and combined using the Weighted Overlay tool, a Cost Distance function can be run to determine the accumulated cost from a set source location. As shown in the map below, the Least Cost Path is derived from the Cost Distance output, and the dashed line represents the lowest-cost route through the study area. Like the Suitability Map, factors can be weighted to give priority to some criteria over others. For instance, the bottom-left map in the image below provides a Least Cost Path based on slope alone. However, if a cost is added for river crossings, the Least Cost Path is drastically altered to minimize the number of times the pipeline crosses the rivers in the area [see bottom-right map below].
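The idea behind the Cost Distance / Cost Path tools can be illustrated with a compact Dijkstra search over a small cost grid. This is only a stand-in for the ArcGIS workflow [which also handles real cell sizes, weighted overlays, and back-link rasters]; the 5 x 5 cost values below are purely illustrative.

import heapq
import math

cost = [
    [1, 1, 2, 9, 9],
    [1, 3, 2, 9, 1],
    [2, 9, 1, 1, 1],
    [2, 9, 9, 9, 1],
    [1, 1, 1, 1, 1],
]

def least_cost_path(cost, source, target):
    rows, cols = len(cost), len(cost[0])
    accumulated = {source: 0.0}     # lowest accumulated cost found so far for each cell
    previous = {}                   # back-links used to rebuild the path
    heap = [(0.0, source)]
    while heap:
        acc, cell = heapq.heappop(heap)
        if cell == target:
            break
        if acc > accumulated.get(cell, math.inf):
            continue
        r, c = cell
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr, dc) == (0, 0) or not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                step = math.hypot(dr, dc)                                 # sqrt(2) for diagonal moves
                new_cost = acc + step * (cost[r][c] + cost[nr][nc]) / 2.0
                if new_cost < accumulated.get((nr, nc), math.inf):
                    accumulated[(nr, nc)] = new_cost
                    previous[(nr, nc)] = cell
                    heapq.heappush(heap, (new_cost, (nr, nc)))
    path, cell = [target], target
    while cell != source:           # walk the back-links from target to source
        cell = previous[cell]
        path.append(cell)
    return list(reversed(path)), accumulated[target]

path, total = least_cost_path(cost, source=(0, 0), target=(0, 4))
print(total, path)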

Another type of map that provides least-cost information is a Corridor Analysis map; a Corridor Analysis map determines an area, or corridor, of least cost between two points, and the analysis does not take direction into consideration. As illustrated in the Corridor Analysis map, the Least Cost Path falls within the lowest classification of the corridor area. This would provide the oil company with an area within which to build the pipeline, allowing them to make final decisions regarding pipeline installation while maintaining cost efficiency.
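Conceptually, a corridor surface is the cell-by-cell sum of two accumulated-cost surfaces, one computed from each endpoint; the corridor is then every cell whose summed cost stays within a chosen tolerance of the least-cost total. The small arrays below are hypothetical accumulated-cost surfaces used only to illustrate the sum-and-threshold step.

import numpy as np

cost_from_a = np.array([[0.0, 1.0, 2.5],
                        [1.0, 2.0, 3.0],
                        [2.5, 3.0, 4.0]])
cost_from_b = np.array([[4.0, 3.0, 2.5],
                        [3.0, 2.0, 1.0],
                        [2.5, 1.0, 0.0]])

corridor_surface = cost_from_a + cost_from_b
least_cost_total = corridor_surface.min()                 # cost along the best route between the endpoints
corridor = corridor_surface <= least_cost_total * 1.1     # cells within 10% of the optimum

print(corridor_surface)
print(corridor)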


Lastly, in Scenario 4, all of these processes were applied to create a corridor analysis of bear movement throughout the Coronado National Forest. While no new processes were added to the workflow involved in creating this map, it was a great opportunity to apply all of the tools and functions in a cumulative effort to produce the deliverable below. I was quite happy with this deliverable and believe it to be an outstanding representation of my comprehensive understanding of Suitability Mapping, Least Cost Path Mapping, and Corridor Analysis Mapping.


Tuesday, August 6, 2024

GIS 5100 Module 6 - Part I: Suitability Mapping


Module 6 of Applications in GIS was very extensive and loaded with information; the module was divided into two parts, with two scenarios for each part. The first portion of the module focused on Suitability Mapping; a suitability map identifies areas [within a study area] that meet some, or all, of the criteria for a given problem. For example, the map below highlights regions within the study area that are optimal cougar habitat; these areas were determined by meeting four criteria: distance from roadways, proximity to rivers, areas with increased slope [mountainous or canyon terrain], and forested areas. To determine which areas meet all [or none] of these criteria, a reclassification process had to be run on the provided datasets. For instance, the Digital Elevation Model was processed with the Slope function and reclassified into two distinct classes: areas with < 9° slope and areas with > 9° slope. Next, a reclassification was applied to the landcover dataset, distinguishing forested areas from every other landcover type. The same process was completed for proximity to roadways and rivers [a Euclidean Distance raster dataset was created prior to the reclassification process].
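The distance-then-reclassify step for the roads criterion can be sketched with SciPy's Euclidean distance transform; the grid, cell size, and distance threshold below are hypothetical.

import numpy as np
from scipy.ndimage import distance_transform_edt

cell_size = 500.0                        # meters per cell [assumed]
roads = np.zeros((60, 60), dtype=bool)
roads[30, :] = True                      # a single east-west road for illustration

# Euclidean distance [in meters] from every cell to the nearest road cell
dist_to_road = distance_transform_edt(~roads, sampling=cell_size)

# Binary reclassification: 1 where the distance criterion is met, 0 otherwise
far_from_roads = (dist_to_road >= 2500.0).astype(np.uint8)
print(far_from_roads.sum(), "cells meet the distance-from-roads criterion")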

After all datasets had been reclassified, the values for each cell could be added together using the Raster Calculator tool, and the analysis could begin. Referencing the map above, three variations on the analysis are displayed. The two bottom maps illustrate regions within the study area that meet ALL four requirements, thus being determined as the likeliest places for cougars to dwell; the only difference between them is that the bottom-left map is vector-based [composed of polygon geometries], while the bottom-right is raster-based [a grid of pixel values]. The top map, however, displays the study area using a graduated symbology; this tells the map viewer how many criteria are met at each location across the study area. A visual comparison of the top map with the bottom two clearly illustrates that all three maps highlight the same optimal areas for a cougar to live.

The second scenario of Part 1 was a similar analysis, this time determining which areas within the study area would be best for a future development. The criteria used in this analysis were proximity to roadways, proximity to water, current land classification / land use, and the slope of the land. While the criteria, and what was considered optimal for development, varied somewhat, the workflow for Scenario 2 was the same. As displayed in the map below, a graduated symbology was utilized to show how many criteria were met for each pixel within the study area.


The difference between the two maps in Scenario 2 is the weighting of the criteria. The map on the left assigns an equal weight to each criterion, while the map on the right does not. This is significant because some factors affect the suitability of potential sites more than others. For instance, while proximity to water will have some effect on the location of a future development, its effect is not as strong as the lack of any roadways to access the potential site.
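The effect of weighting can be sketched by combining the same criterion scores two ways; the scores and weights below are hypothetical and simply contrast an equal-weight sum with one that emphasizes road access.

import numpy as np

rng = np.random.default_rng(2)
roads_score = rng.integers(1, 4, (100, 100)).astype(float)     # placeholder 1-3 suitability scores
water_score = rng.integers(1, 4, (100, 100)).astype(float)
slope_score = rng.integers(1, 4, (100, 100)).astype(float)
landuse_score = rng.integers(1, 4, (100, 100)).astype(float)

equal_weighted = (roads_score + water_score + slope_score + landuse_score) / 4.0

weights = {"roads": 0.4, "water": 0.1, "slope": 0.25, "landuse": 0.25}     # hypothetical weights summing to 1
weighted = (weights["roads"] * roads_score + weights["water"] * water_score
            + weights["slope"] * slope_score + weights["landuse"] * landuse_score)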

As outlined in this post, Suitability Mapping has a wide variety of applications. Although this lab was very time-consuming and covered a tremendous wealth of information, it was a great experience to continue learning about the potential applications of Geographic Information Systems.

GIS 5935 Module 2.2 - Surface Interpolation

  Post in progress - please check back soon...