Tuesday, September 24, 2024

GIS 5935 Module 2.2 - Surface Interpolation

Module 2.2 of Special Topics in GIS explored surface interpolation and several of the methods that can be used to produce a dataset of estimated values between known [sample or testing] points. [Bolstad & Manson, 2022] define surface interpolation as a 'prediction of variables at unmeasured locations, and based on a sampling of the same variables at known locations' [p. 510]. In this lab, four different techniques were applied and their results compared to discern which model most accurately portrayed the data.

The first exercise was a comparison of Digital Elevation Models produced by Inverse Distance Weighted [IDW] and Spline interpolation. The IDW interpolation model estimates unknown values using weights that are inversely proportional to the distance from the known values at sample, or testing, points. Essentially, the greater the distance from a sample point, the less influence that point has in determining a cell's estimated value. The Spline interpolation model employs mathematical functions, or polynomials, to form a smooth curved surface between the known sample points [Bolstad & Manson, 2022].
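To make the IDW idea concrete, here is a minimal sketch in plain Python. It is not the ArcGIS Pro tool used in the lab; the function name, the power value of 2, and the sample coordinates are all assumptions chosen for illustration.

```python
import math


def idw_estimate(samples, x, y, power=2.0):
    """Estimate a value at (x, y) from (sx, sy, value) samples using IDW.

    power=2.0 is an assumed default; the lab's actual setting is not stated.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in samples:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            # The location sits exactly on a sample point, so use its value.
            return value
        weight = 1.0 / distance ** power  # farther samples get smaller weights
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical elevation samples: (x, y, elevation)
samples = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0)]
midpoint = idw_estimate(samples, 5.0, 0.0)    # equal distance to both samples
near_first = idw_estimate(samples, 1.0, 0.0)  # dominated by the first sample
```

At the midpoint both samples carry equal weight, so the estimate is their average. Close to the first sample the estimate stays near that sample's value, which is the 'less influence with distance' behavior described above.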

After these models were constructed, the Raster Calculator geoprocessing tool was run to determine the mathematical difference between the two; the results are shown in the map below.


As illustrated in the legend, areas that are shaded brown represent raster cells where the Spline model had higher elevation values, and areas that are shaded purple represent cells where the IDW model had higher elevation values. It is noteworthy that nearly all the white areas are places where sample points were collected. Also, mathematically, the purple and brown areas account for 52% and 48% of the shaded cells, respectively.
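The cell-by-cell subtraction and the resulting split between the two colors can be sketched as follows. This is only an illustration in plain Python, not the Raster Calculator expression from the lab. The subtraction order [Spline minus IDW], the function names, and the tiny grids are assumptions.

```python
def difference_grid(spline, idw):
    """Subtract the IDW grid from the Spline grid cell by cell (assumed order)."""
    return [
        [s - i for s, i in zip(spline_row, idw_row)]
        for spline_row, idw_row in zip(spline, idw)
    ]


def share_by_sign(diff):
    """Return (percent of cells where Spline is higher, percent where IDW is higher).

    Cells with a difference of exactly zero are left out of the percentages.
    """
    nonzero = [value for row in diff for value in row if value != 0]
    if not nonzero:
        return (0.0, 0.0)
    spline_higher = sum(1 for value in nonzero if value > 0)
    idw_higher = len(nonzero) - spline_higher
    return (100.0 * spline_higher / len(nonzero),
            100.0 * idw_higher / len(nonzero))


# Hypothetical 2 x 2 elevation grids
spline_dem = [[5.0, 3.0], [2.0, 8.0]]
idw_dem = [[4.0, 4.0], [2.0, 9.0]]
diff = difference_grid(spline_dem, idw_dem)
spline_pct, idw_pct = share_by_sign(diff)
```

Positive cells correspond to the brown areas on the map and negative cells to the purple areas.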

The remainder of the lab was an analysis of Biochemical Oxygen Demand [BOD] levels in Tampa Bay, Florida using the following models: Nearest Neighbor [Thiessen Polygons], Inverse Distance Weighted, Regularized Spline, and Tension Spline. While IDW and Spline were described above, we took spline interpolation one step further to explore the difference between the Regularized and Tension types. The two types differ in how they treat the surface and in what the weight parameter controls: the Regularized Spline produces a smooth, gradually changing surface, while the Tension Spline produces a less smooth surface whose values are more closely constrained by the sample data range [ESRI, 2024]. The other model used was Nearest Neighbor, or Thiessen Polygon Interpolation. Mathematically speaking, this is the least intensive method, as it assigns any unsampled location a value equal to the value found at the nearest sample location [Bolstad & Manson, 2022, p. 516]. This method creates polygons that extend out [in all directions] from each known sample point until they meet the polygons of neighboring sample points, so every location inside a polygon is closer to its own sample point than to any other. The results for all four methods are shown in the maps below.
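Of the four methods, Nearest Neighbor is simple enough to sketch in a few lines of plain Python. The function name and the sample values below are hypothetical and are not the Tampa Bay measurements.

```python
import math


def nearest_neighbor_estimate(samples, x, y):
    """Return the value of the sample point closest to (x, y).

    samples is a list of (sx, sy, value) tuples.
    """
    nearest = min(samples, key=lambda s: math.hypot(x - s[0], y - s[1]))
    return nearest[2]


# Hypothetical BOD samples: (x, y, BOD value)
bod_samples = [(0.0, 0.0, 1.5), (10.0, 0.0, 3.0)]
west_side = nearest_neighbor_estimate(bod_samples, 4.0, 0.0)
east_side = nearest_neighbor_estimate(bod_samples, 6.0, 0.0)
```

Every location on one side of the halfway line takes the same value, which is why the mapped result looks like flat polygons with abrupt steps at their edges.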







The interpolation method I would choose to best represent BOD concentrations in Tampa Bay is the Spline technique, specifically the Regularized Spline type. The reasoning behind this decision is the nature of any substance being diluted in water. Regardless of the substance, once introduced into a body of water, it will disperse evenly and continuously throughout the adjacent areas. The IDW method tends to create 'hot spots' with peaks occurring at the testing points, while the Tension Spline type will create a continuous surface, but a less smooth one. Finally, Nearest Neighbor will provide an estimated value of BOD concentrations, but these are generalized over discrete regions of the bay and will not provide a smooth, continuous interpolated model. However, the Regularized Spline type 'creates a smooth, gradually changing surface with values that may lie outside the sample data range' [ESRI, 2024]. This, in my opinion, would provide a much more accurate estimate of BOD concentrations in Tampa Bay than the other three methods discussed in this assignment.


Sources:

Bolstad, Paul & Manson, Steven. (2022). GIS Fundamentals: A First Text on Geographic Information Systems (7th Edition). Eider Press.

Environmental Systems Research Institute. (2024). How Spline Works. https://pro.arcgis.com/en/pro-app/latest/tool-reference/3d-analyst/how-spline-works.htm

Monday, September 16, 2024

GIS 5935 Module 2.1 - Surfaces [Triangulated Irregular Networks and Digital Elevation Models]

Module 2.1 of Special Topics in GIS was based on surfaces, particularly Triangulated Irregular Networks [TINs] and Digital Elevation Models [DEMs]. The first portion of the lab was an opportunity to import elevation data, set the ground source [giving it 3D visualization], and learn how to exaggerate the vertical distances to enhance the visual aesthetics of the landscape. Once these fundamental concepts were practiced, an analytical problem was presented.

The second portion of the lab was to create a Suitability Map for a study area that illustrates the best locations for a ski resort and its associated ski run. The suitability was determined based on slope, elevation, and aspect [directional face] of the landscape. The dark green areas of the map below display the most suitable locations of the resort, and the red areas signify areas that are unsuitable for this tourist destination.
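One common way to combine criteria like these is a weighted overlay, sketched below for a single raster cell. The lab's actual thresholds, weights, and tools are not stated in this post, so every number and name here is a hypothetical placeholder.

```python
def suitability_score(slope_deg, elevation_m, aspect_deg,
                      weights=(0.4, 0.4, 0.2)):
    """Score one cell from 0 (unsuitable) to 1 (most suitable).

    All thresholds and weights are hypothetical, not the lab's criteria.
    """
    # Hypothetical: slopes between 15 and 35 degrees suit a ski run.
    slope_ok = 1.0 if 15.0 <= slope_deg <= 35.0 else 0.0
    # Hypothetical: elevations at or above 2,500 m hold snow longer.
    elevation_ok = 1.0 if elevation_m >= 2500.0 else 0.0
    # Hypothetical: aspects within 45 degrees of north get less sun.
    aspect_ok = 1.0 if aspect_deg >= 315.0 or aspect_deg <= 45.0 else 0.0
    slope_w, elevation_w, aspect_w = weights
    return slope_w * slope_ok + elevation_w * elevation_ok + aspect_w * aspect_ok


best_cell = suitability_score(25.0, 3000.0, 0.0)    # meets all three criteria
worst_cell = suitability_score(5.0, 1000.0, 180.0)  # meets none of them
```

Cells scoring near 1 would fall in the dark green class on the map, and cells scoring near 0 in the red class.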


The final portion of this lab dealt with TINs and DEMs. Environmental Systems Research Institute [ESRI] defines Triangulated Irregular Networks as 'a form of vector-based digital geographic data and are constructed by triangulating a set of vertices [points]. The vertices are connected with a series of edges to form a network of triangles.' The documentation for this model type is quite extensive; see the ESRI sources listed at the end of this post. Furthermore, ESRI defines Digital Elevation Models as 'a raster representation of a continuous surface, usually referencing the surface of the earth' [ESRI, 2024]; the complete documentation for DEMs is also listed in the sources. Once these two models were created, contour lines were derived at 100-meter increments [see map below]. Contours represent lines of equal elevation that are spaced at equal intervals [Bolstad & Manson, 2022].


Due to the data types of the models used to create the contour lines, the output of each varies in appearance. Since TINs are based on points, lines, and polygons, the contour lines take on a very segmented appearance. Conversely, a raster-based DEM will create contour lines with a smoother, more organic appearance.


Sources:

Bolstad, Paul & Manson, Steven. (2022). GIS Fundamentals: A First Text on Geographic Information Systems (7th Edition). Eider Press.

Environmental Systems Research Institute. (2024). Exploring Digital Elevation Models

Environmental Systems Research Institute. (2024). What is a TIN Surface?  


Tuesday, September 10, 2024

GIS 5935 Module 1.3 - Data Quality Assessment

Module 1.3 of Special Topics in GIS was a continuation of data quality; this module focused on the completeness of datasets, particularly roadway networks. Two datasets were provided for the completeness assessment; one was obtained from Jackson County, Oregon and the other was downloaded from the United States Census Bureau TIGER shapefile repository. While both datasets contained roadway centerlines, their total lengths were significantly different. The spatial analysis performed on these datasets was to ascertain which one was more complete, based on length alone. Initially, before any processing was performed, the TIGER shapefile consisted of 11,382.7 kilometers of roadway centerlines while the Jackson County dataset accounted for 10,873.3 kilometers, making the TIGER dataset more complete by this measure.

The next process of this lab was to analyze completeness according to [Haklay, 2010]. Essentially, this method consists of overlaying a grid index on top of the datasets and creating a thematic map according to their percentage differences. For this lab, the grid consisted of 5-kilometer squares that were set within the confines of the county border. Next, all roadways that lay outside of the grid index were clipped; this deleted any extra roadways outside the confines of the grid. After this, the roadways had to be split at the boundary of each grid cell, and then the individual roadway sections within each cell had to be dissolved into one multi-part feature. Once these processes were completed for each dataset, a comparison between the two could be made on a cell-by-cell basis [see map below].
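The split-and-dissolve steps above were done with geoprocessing tools, but the underlying idea of totaling road length per grid cell can be sketched in plain Python. This is a stand-in for those tools, not a reproduction of them. The function name, the straight-line segments, and the coordinates are assumptions, with units in kilometers.

```python
import math
from collections import defaultdict


def length_per_cell(segments, cell_size):
    """Total the length of straight segments inside each square grid cell.

    segments is a list of ((x1, y1), (x2, y2)) pairs. Each segment is split
    wherever it crosses a grid line, and each piece is credited to the cell
    containing its midpoint. Cells are keyed by (column, row) index.
    """
    totals = defaultdict(float)
    for (x1, y1), (x2, y2) in segments:
        full_length = math.hypot(x2 - x1, y2 - y1)
        cuts = {0.0, 1.0}  # fractions along the segment where it is split
        for start, end in ((x1, x2), (y1, y2)):
            if start == end:
                continue  # no grid lines crossed along this axis
            low, high = sorted((start, end))
            k = math.floor(low / cell_size) + 1
            while k * cell_size < high:
                cuts.add((k * cell_size - start) / (end - start))
                k += 1
        ordered = sorted(cuts)
        for t0, t1 in zip(ordered, ordered[1:]):
            tm = (t0 + t1) / 2.0
            mid_x = x1 + tm * (x2 - x1)
            mid_y = y1 + tm * (y2 - y1)
            cell = (math.floor(mid_x / cell_size), math.floor(mid_y / cell_size))
            totals[cell] += (t1 - t0) * full_length
    return dict(totals)


# One hypothetical 10 km road crossing three 5 km cells
road_lengths = length_per_cell([((2.0, 1.0), (12.0, 1.0))], 5.0)
```

Running this once per dataset gives two per-cell totals that can then be compared cell by cell.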

For this particular assignment, the Jackson County dataset was chosen as the baseline for all comparisons. [Haklay, 2010] suggests that the completeness of datasets obtained from local jurisdictions often surpasses that of datasets downloaded from national bureaucracies or volunteered geographic information systems, such as OpenStreetMap. Once the baseline dataset was determined, executing the following formula on each grid cell provided a percentage difference that was used to create the map above:

[[Jackson County Length - TIGER Length] / Jackson County Length] * 100
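As a small sketch, the formula above translates directly into Python. The function name is hypothetical, and returning None for a cell with no Jackson County roads is an assumption, since the post does not say how such cells were handled.

```python
def percent_difference(jackson_length, tiger_length):
    """Percentage difference for one grid cell, with Jackson County as baseline."""
    if jackson_length == 0:
        # Assumed handling: the formula is undefined without baseline roads.
        return None
    return (jackson_length - tiger_length) / jackson_length * 100.0


# Hypothetical cell lengths in kilometers
jackson_more = percent_difference(10.0, 8.0)  # positive: Jackson County longer
tiger_more = percent_difference(8.0, 10.0)    # negative: TIGER longer

# The same formula applied to the county-wide totals quoted earlier in the post
overall = percent_difference(10873.3, 11382.7)
```

Applied to the county-wide totals, the result is roughly -4.7%, which matches the earlier observation that the TIGER dataset is longer overall.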

As displayed in the map's legend, the positive percentages represent cells where the Jackson County dataset is more complete than the TIGER dataset. Conversely, negative percentages represent cells where the TIGER dataset is more complete than the Jackson County dataset. Lastly, it is worth mentioning that the distribution of the percentage differences was skewed, with a long tail of drastic outliers in the negative percentage range; because of this skewed distribution, I decided to apply a manual interval data classification system rather than an equal interval or quantile system. This decision kept the few extreme outliers from distorting the map.

Overall, this assignment was another exciting opportunity to explore the ongoing issue of spatial data quality in the realm of Geographic Information Sciences.


Source:

Haklay, M. (2010). How Good is Volunteered Geographic Information? A Comparative Study of OpenStreetMap and Ordnance Survey Datasets. Environment and Planning B: Planning and Design, 37(4), 682-703.

Monday, September 2, 2024

GIS 5935 Module 1.2 - Spatial Data Quality

Lab assignment 1.2 of Special Topics in GIS involved performing an accuracy assessment according to the National Standard for Spatial Data Accuracy [NSSDA]. The Positional Accuracy Handbook states 'the National Standard for Spatial Data Accuracy describes a way to measure and report positional accuracy of features found within a geographic dataset. Approved in 1998, the NSSDA recognizes the growing need for digital spatial data and provides a common language for reporting accuracy' [Planning, 1999]. For this assignment, two datasets were provided for a study area located in the City of Albuquerque, New Mexico. The first dataset was obtained from the City of Albuquerque and the second was a StreetMap USA dataset, which is a product of TeleAtlas and is distributed by ESRI with the ArcGIS software package. Both datasets consist of roadway networks and can be seen in the map below. The green lines represent the City of Albuquerque [ABQ] dataset, and the red lines represent the StreetMap USA dataset.

The process for this accuracy assessment is clearly outlined in the Positional Accuracy Handbook [Planning, 1999] but will be briefly summarized through the remainder of this post. The first step of the assessment was to determine if the test involves horizontal accuracy, vertical accuracy, or both; for this assignment, we focused on horizontal accuracy only. Second, testing points were determined throughout the study area; these points are displayed as black X's in the map above. Specific guidelines are provided in [Planning, 1999] that aid in the appropriate determination of testing points. Next, an independent dataset of higher accuracy needs to be chosen in order to complete the assessment. To accomplish this, 2006 Digital Orthophoto Quadrangles [DOQs] from the United States Geological Survey [USGS] were used to identify intersections as reference points, and horizontal accuracy assessments were performed on the two provided datasets. It is noteworthy, however, that a quick visual analysis suggests that the City of Albuquerque dataset is aligned more consistently with the USGS DOQs than its StreetMap USA counterpart; this visual analysis should be consistent with the results calculated at the end of the NSSDA assessment. Measurements were taken from each dataset to the digitized reference points located at the chosen street intersections on the USGS DOQs. Once these measurements were obtained, the NSSDA Horizontal Accuracy Statistic Worksheet could be completed for each of the datasets; the results for the City of Albuquerque are shown below:

Here are the results obtained from the StreetMap USA dataset:

As predicted, the value for the City of Albuquerque dataset is much lower than the value calculated for the StreetMap USA dataset. The next, or sixth, step of the assessment is to construct an accuracy statement that clearly defines the dataset's accuracy at the 95% confidence level. The accuracy statements for the two datasets are written below:

ABQ Dataset:

Using the National Standard for Spatial Data Accuracy, the data set tested 14.27ft horizontal accuracy at 95% confidence level.

StreetMap Dataset:

Using the National Standard for Spatial Data Accuracy, the data set tested 379.66ft horizontal accuracy at 95% confidence level.

These statements provide the user with a clearly defined radius within which 95% of positions are expected to fall. As determined above, a user can expect that 95% of all features in the City of Albuquerque dataset will fall within 14.27 feet of their true geographical location, and 95% of all features in the StreetMap USA dataset will fall within 379.66 feet of their true geographical location. In conclusion, the horizontal accuracy of the City of Albuquerque dataset is substantially higher than that of the StreetMap USA dataset. The final step of the NSSDA Horizontal Accuracy Assessment is to include the report in a comprehensive description of the dataset called metadata, or 'data about the data' [Planning, 1999]. An example of this comprehensive description is provided below:

Example of Detailed positional accuracy statements as reported in metadata:

Digitized features of the roadway infrastructure located within the study area of Albuquerque, New Mexico were obtained from the City of Albuquerque and from StreetMap USA, a product of TeleAtlas and distributed by ESRI with ArcGIS. Those obtained from the City of Albuquerque tested at 14.27ft horizontal accuracy at the 95% confidence level, and those obtained from StreetMap USA tested at 379.66ft horizontal accuracy at the 95% confidence level using modified NSSDA testing procedures. See Section 5 for entity information of digitized feature groups. See also Lineage portion of Section 2 for additional background. For a complete report of the testing procedures used, contact the University of West Florida GIS Department as noted in Section 6, Distribution Information.

Levels of vertical relief were not considered throughout the entire accuracy assessment of these two datasets.

All other features are generated by coordinate geometry and are based on a visually based framework of roadway intersections. Computed positions of roadway intersections, or testing points, are not based on individual field surveys. Although tests of randomly selected points for comparison may show varying degrees of accuracy between the provided datasets, overall visual analysis confirms higher levels of accuracy throughout the entire City of Albuquerque dataset. However, caution is necessary in use of roadway intersections as shown, due to the location process employed throughout this assessment. For more information, contact the GIS department at the University of West Florida.
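For readers who want to see the arithmetic behind accuracy statements like the ones above, the NSSDA horizontal statistic can be sketched in plain Python. This is not the worksheet used in the lab. The function name and coordinates are hypothetical, and the 1.7308 multiplier is the NSSDA factor that applies when errors in x and y are of similar size.

```python
import math


def nssda_horizontal_accuracy(test_points, reference_points):
    """Return the NSSDA horizontal accuracy at the 95% confidence level.

    Both arguments are equal-length lists of (x, y) coordinates in the same
    units. The result is 1.7308 times the radial root mean square error.
    """
    squared_errors = [
        (tx - rx) ** 2 + (ty - ry) ** 2
        for (tx, ty), (rx, ry) in zip(test_points, reference_points)
    ]
    rmse_r = math.sqrt(sum(squared_errors) / len(squared_errors))
    return 1.7308 * rmse_r


# Hypothetical coordinates in feet; each test point is 5 ft from its reference
tested = [(3.0, 4.0), (4.0, 3.0)]
reference = [(0.0, 0.0), (0.0, 0.0)]
accuracy_95 = nssda_horizontal_accuracy(tested, reference)
```

With every test point exactly 5 feet from its reference point, the radial RMSE is 5 and the reported accuracy is 1.7308 times that value.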


Source:

Planning, M. (1999). Positional Accuracy Handbook: Using the National Standard for Spatial Data Accuracy to measure and report geographic data quality. Minnesota Planning, St. Paul, MN.

GIS 6005 Module 1 - Map Design and Typography

Module One of GIS 6005 - Communicating GIS revolved around cartographic design principles and typographical principles that should be follow...