Apr 8, 2018 - Tools on map-making at Data Days CLE

I gave a workshop/presentation on tools for map-making at Data Days CLE on Friday. One of my favorite moments was when a city employee asked me about alternatives to ArcGIS/ESRI, and specifically about being able to offer other departments read access to geodatabases without using ESRI (hope I remember that correctly).

My slides are at http://skorasaur.us/ddc18 and below is a long list of resources, most of which I mentioned in my talk. This list is also available in my GitHub repository for the talk - https://github.com/skorasaurus/ddc18

This list is by no means comprehensive, but it is a starting point for map-making tools, primarily focusing on web maps (maps that are viewable online) outside of the ESRI ecosystem.

mapschool - As brief as it is, it’s an extremely useful overview of modern maps and some theory. I don’t know of any other document on maps that is as short yet as informative.

mapmaking suites (SaaS, software as a service):

carto

mapbox

shinyapps - R-based

Quicker and simpler web map templates:

All of these simpler web map templates work best with a relatively small amount of data (not a very rigid rule, but I’d say fewer than a couple hundred points/features, without a lot of properties on them). If you have more than this, you’ll need to upload your data to one of the above services.

mapzap - fewer styling options but easier to use

mapstarter - also has print options

leaflet + google sheets

umap - If you want a map to share with others with some custom icons quickly and aren’t picky about the basemap; can embed as well.

data manipulation/gis in browser:

As above, these may not work (or will work very slowly) if you’re using files that have hundreds of features or are larger than, say, 10mb.

geojson.io - quickly edit and save to numerous formats; works on files < 10mb

mapshaper - relatively simple yet powerful, also has command-line based tool

dropchop - do some common GIS operations within the browser

turf.js - do some common GIS operations within the browser (javascript)
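If you're not sure whether a file is small enough for these in-browser tools, a quick check like the one below can help. This is only a minimal sketch in Python: the thresholds are my rough rules of thumb from above (about 10mb and a couple hundred features), not hard limits of any tool, and the function names are made up for illustration.

```python
import json
import os

# Rough rules of thumb from the text above, NOT hard limits of any tool.
MAX_BYTES = 10 * 1024 * 1024   # "say, 10mb"
MAX_FEATURES = 200             # "a couple hundred points/features"


def summarize_geojson(path):
    """Return (size_in_bytes, feature_count) for a GeoJSON file."""
    size = os.path.getsize(path)
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    if data.get("type") == "FeatureCollection":
        count = len(data.get("features", []))
    else:
        count = 1  # a single Feature or a bare geometry
    return size, count


def fits_in_browser_tools(path):
    """True if the file is under both illustrative thresholds."""
    size, count = summarize_geojson(path)
    return size <= MAX_BYTES and count <= MAX_FEATURES
```

Calling `fits_in_browser_tools("my-data.geojson")` (a hypothetical file name) returns `True` or `False`; if it's `False`, one of the hosted services or a command-line tool is probably the better route.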

utilities for printing web maps:

portmap

staticmapmaker.com - limited options; but usable

LA Times’ Web Map Maker

Petroff’s Print Maps

https://www.mapbox.com/help/static-api-playground/

geocoding:

smartstreets - Not free, but it does a relatively great job and has a relatively easy-to-use interface; good if you’re on a time crunch and/or have limited skills.

Meta (a list of other lists):

robin’s list

awesome-spatial - great list of all types of spatial tools; many of these require knowledge of a particular programming language and comfort with the command line.

awesome-geojson - great utilities for working with GeoJSON.

color-tools - all resources on colors

dataviz-tools’ list - thorough list, somewhat out of date

theory:

maptime - An informal association of meetup groups that teach geospatial concepts and maps. They have accessible tutorials. I co-organized Cleveland’s maptime from 2012-2014ish.

mapmakers-cheatsheet

Advanced:

csvkit - Python library and command-line tools for manipulating CSV files

qgis - geospatial analysis, map-making, and so much more; comparable to ArcGIS.

cheat-sheet for fiona and rasterio - Cheatsheet for using the Python libraries fiona and rasterio to manipulate geospatial data.

miller - command-line based; very powerful and advanced; specifically for parsing CSV files.

GDAL cheatsheet - GDAL is a geospatial library at the core of many geospatial applications; data conversion; reprojection; analysis, and more. Cheatsheet for using some of its command-line based tools.

d3 - extremely powerful javascript library for dataviz and maps

observable HQ - a sandbox for experimenting with javascript and D3

Sites/Articles mentioned in talk:

Most famous set in every US state

when it shouldn’t be a map

data sources:

Guide to Cleveland Data sources - A list of places to get available open civic data for the Cleveland area

If you want to start with the command line: https://github.com/jlevy/the-art-of-command-line

Highly recommended books:

Interactive Data Visualization for the Web: An Introduction to Designing with D3 (2nd Edition) - Scott Murray - clearly written with examples; good not just for D3 but also as a refresher or extremely concise overview of HTML, CSS, and JavaScript.

GIS Cartography - Gretchen Peterson - Great design influence for making print and web maps.

cat photo by Walid Mahfoudh


Mar 11, 2018 - Recently

What I’ve been up to (outside of my work):

I used to spend a lot of time listening to, finding, and buying new music. I don’t listen to nearly as much as I used to; my priorities in my free time have changed. Tracking down or knowing that there’s a great song or album to be found just doesn’t give me as much excitement as it once did.

However, these songs were my favorite ones to listen to in 2017 and will remind me of that year for the rest of my life (alphabetical order):

Broken Social Scene - Halfway Home
Broken Social Scene - Anthems for a Seventeen Year Old Girl
Dday One - Contact
Dirty Projectors - Little Bubble
Dolly Spartans - I Hear the Dead
Doves - Rise
Gomez - Options
Noname - Diddy Bop (feat. Raury & Cam O’bi)
Orbital - Belfast
pronoun - a million other things
Sammus - 1080p
Talking Heads - Once in a Lifetime
The Go-Betweens - Love Goes On!
The War On Drugs - Pain
Ultimate Painting - Song for Brian Jones

Some of the favorite albums that I listened to for the first time in 2018: AndyFellaz - BeatBop Street; Kendrick Lamar - DAMN; The Go-Betweens - 16 Lovers Lane; Broken Social Scene - You Forgot It in People.


I’ve been stewing on the rest of this post for almost a year now. Deleting portions. I’ve scrapped multiple versions of it.

2017 had been the most successful year for me, professionally. Personally, it’s been one of the hardest, battling anxiety and, to a lesser extent, depression. I know I’m pretty fortunate; my struggles are a lot less burdensome than others’ and I have a lot of privilege.

Articulating my thoughts into sustained, multiple paragraphs in a coherent fashion that is also grammatically correct and well-polished for general audiences is relatively difficult for me.

I’ve been spending less time on twitter and trying to spend the time that I’ve devoted to that on reading books or actually reading articles that I’ve saved (liked/favorited) on twitter. I made a conscious effort to go through my twitter likes a couple weeks ago: I had 4200; now down to ~3,500 (~3,200 now). I found some articles worth reading and it was a nice window into my internet consumption over the years. It also reminded me how prevalent link rot is.

It also reminded me that I spend less time in the open source geospatial community because my full-time job is in general web development nowadays (primarily WordPress and CSS, language/CMS-wise; making sure that cpl.org is functional). In my experience, the open source geo community was generally quite welcoming to new people and respectful in its behavior at conferences and online; its members would work together and would sometimes prioritize developing documentation (and corporate users would fund it).

Reading these saved tweets also reminded me that many of my peers, especially those I professionally admire, had unfinished projects and blog posts.

There’s a lot more to write, especially my experiences with open data and civic technology in the past couple years.


Oct 22, 2017 - The role of open data and libraries

(This is an ongoing draft/manifesto of thoughts that I have; I am a web developer at Cleveland Public Library but these are my views and not those of my employer.) (Writing this out made me think of even more questions than answers. I may be critical but I’m critical because I care about libraries.)

As a participant in the open data and civic tech movement as a brigade captain of Open Cleveland and web developer at the Cleveland Public Library, I see the potential for libraries to play a much larger part in the open data movement.

Public libraries can be and should be (?) stewards of digital open data because they’ve historically been stewards, have public trust and neutrality, have subject domain experts, and are connected with the community.

What’s open data? (https://opengovdata.org/ is great.)

Why:

Public Libraries have historically been stewards of data:
Historically, we librarians are already open data stewards. The Cleveland Public Library’s Public Administration Library (https://cpl.org/locations/public-administration-library/) has been designated as “the most complete collection of material on Cleveland city government available anywhere” including City Council legislation, budgets, and more. This stewardship and sharing has only been on paper or microfilm.

CPL is also a Federal Depository Library. These libraries were the ‘data portals’ - a centralized access point, an on-paper data portal - guaranteeing public access to federal data (Census, reports, contact information, and more). (With this data being managed on a federal level on data.gov, how should Federal Depository Libraries continue their function? I don’t know.)

City data portals and the open data movement haven’t been focused on maintaining or sharing historical data, often only sharing the most current version of a data set. Who is archiving and saving that? As archivists, libraries can fill this role too.

Public Libraries have subject-based experts and know how to find knowledge:
We know that just because there’s an open data portal doesn’t mean that people will use it. Cities hosting open data portals are realizing that a portal isn’t enough. For open data to have any effect for the public good, it needs to be used like any other resource: a tool, a means to an end, a source to answer a question, a source to analyze. Open data is just another source of knowledge that needs to be interpreted. Someone has to know how to filter the data, how to structure queries technically, how to use technical tools, etc., to find and then further analyze the information that a patron is trying to access, so that the patron gets their answer/knowledge.
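To make "knowing how to filter the data" a little more concrete, here's a minimal sketch in Python of the kind of small query a librarian might run against a CSV downloaded from a portal. Everything in it is an assumption for illustration: the sample rows and column names are made up and don't come from any real data set, and `filter_rows` is a hypothetical helper, not part of any library.

```python
import csv
import io

# Made-up sample rows standing in for a CSV downloaded from a data portal.
# The columns and values are illustrative only, not real data.
SAMPLE_CSV = """permit_id,ward,year,type
1,3,2016,demolition
2,3,2017,new construction
3,7,2017,demolition
"""


def filter_rows(csv_text, **criteria):
    """Return rows (as dicts) whose columns equal all of the given values."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row for row in reader
        # csv gives every value back as a string, so compare as strings
        if all(row.get(col) == str(val) for col, val in criteria.items())
    ]


# A patron's question, "what permits were issued in ward 3 in 2017?",
# becomes a filter on two columns.
matches = filter_rows(SAMPLE_CSV, ward=3, year=2017)
```

The code itself is short; the work is in knowing which columns to filter on and what the values mean, which is where subject expertise comes in.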

Connected with the community:
Our public libraries are still in the community and do a relatively better job of working with all communities and being places welcoming to all. They’re one of the few organizations that still have a wide reach and collaborate with entities across different sectors. They’re one of the only third places left for people to meet. They’re one of the few places where people who normally don’t interact with each other can.

Libraries including CPL have been teaching people how to utilize the then-new sources of information (the internet), tools to make sense of it (Excel), and basic digital literacy courses. Carnegie Library in Pittsburgh is teaching data literacy courses and how to use data sources. These are good starts for libraries to help people and institutions, especially those from marginalized backgrounds, learn how to access the data. Because it’s not enough to just provide the raw material (books, databases) or open government data, libraries can help people make sense of these materials and enhance patrons’ use of them, as they do with book discussions, instructing patrons to access and use databases, and offering genealogy clinics to use and understand those resources. The library would be the data intermediary, perhaps doing the data analysis and helping people and institutions understand the data.

Have neutrality, public trust:
(Perhaps the most contentious point and least fleshed out?) Libraries are luckily generally well funded in NE Ohio and generally have the public’s trust. Because their staff are not elected and, at least so far, are relatively free of political influence, they could continue to share data if a government administration cuts access to its data (just see what’s happening on a federal level). They could help present the material to patrons in ways that the government may find critical of it.

The challenges here and ahead:
Even from my limited experience at CPL, I can see that we’re limited by capacity. Librarians have the subject expertise but don’t generally have the technical expertise to do the extracting, transforming, and loading of raw data sets into forms that patrons can access. What’s needed is a combination of better technical training for staff members and developers making it easier to fulfill patrons’ common data requests. Perhaps a bad analogy: for word processing and formatting, there’s LaTeX on one end of the spectrum, extremely powerful for custom, esoteric needs, and Word on the other, suitable for common needs. Libraries should have staff members who know both kinds of tools to accommodate the variety of patrons’ needs.

Although libraries are already sharing some historical open data, the process of migrating the data from paper records into a digital format is laborious. A first part of the digitization process - creating digital images of these items (look at all of the digital collections!) - has been generally embraced by libraries, but the next step, transforming those images into open or structured data, generally hasn’t been done (except perhaps OCR’ing some text of books), and the knowledge and tools for it are lacking. Budgets would need to be increased to increase staff/institutional capacity to migrate the data on paper into a digital/structured, open format.

For example, CPL has plenty of digital maps available but no spatial data sets yet (for example: boundaries, building footprints) that could be created from these digital maps. I’m working on creating a geospatial data set of Cleveland’s annexations from these two maps.

We sometimes go halfway in preservation and need to make sure that we’re keeping these capabilities for open data. For example, digitizing maps at too low a resolution makes it difficult for someone to georeference and orthorectify them; licensing is another issue. NYPL’s Space/Time Directory is creating tools to improve digitization processes/workflows to create these data sets. Perhaps we should offer our maps already georeferenced (we don’t).

Administrations also need to recognize the value (and limits) of open data to fund this and, if they haven’t already, establish partnerships with the holders of the data, the local governments.

Libraries have historically been stewards of data and I think they generally may have somewhat missed the initial curve of the growing open data sector/ecosystem. They are an established third party with a mission and history of maintaining and sharing knowledge for the broader community with minimal restrictions; they can also be the intermediary between the raw data and the patrons, helping patrons find the meaning and interpretation of the data. In the meantime, libraries should begin working with local governments and groups like Code for America brigades (volunteer groups using open data and civic tech to improve their communities and local governments) to learn how they can partner to serve the community needs fulfilled in part by open data and to be stewards of data.

(Thank you to everyone who’s inspired me to write this and laid some groundwork by writing, studying, or talking about this, notably Leila Slutz, Anastasia Diamond-Ortiz, and Mita Williams.)