DOGS OF NEW YORK:
A Look at the Most Popular Dog Names by Neighborhood in NYC
By Fletcher Berryman
Why I Made This:
To even pretend like I woke up one day with a clear and God-given intention of swan-diving into New York City dog licensing data would be not just a crime but likely a Class A felony. In fact, until several weeks ago I was still intent on investigating the mean decibel levels in the interior of Prospect Park to question whether or not levels of noise comparable to natural spaces outside the city could be found within the bounds of Brooklyn.
After finding out that I’d be out of town for the weekends I’d intended to spend recording decibel levels, I instead took a look through the NYC Open Data portal for inspiration. After initially looking at NYC Dog Bite records (found here, another fantastically enthralling set of data in case you’re feeling exploratory and/or bored to tears), I accidentally encountered the much larger and more comprehensive 2016 NYC Dog License dataset. At 144,000 rows and chock-full of fascinatingly useless information about New York’s best friends, I felt that I was finally sniffing in the right direction. Rather than trying too hard to create a non-existent problem that I could solve, I chose to scroll up and down through some of the rows to see if I myself would have any questions arise.
Quite quickly it became apparent that by far the most interesting part of the data set was the names of our city’s dogs. As the cultural capital of the world, it goes without saying that the 8.9 million residents of this great city do not take the extremely spiritual act of naming a canine lightly.
As I scrolled on, I found myself wondering if there were patterns in the naming. New York is the most linguistically diverse city on earth; therefore it almost goes without saying that dogs’ names would in some way reflect the backgrounds of their owners…or so I thought?
To examine this hunch that was evolving slowly into a formalized and testable series of questions, I thought I’d start with something concrete. With limited confidence in my coding, I decided to set out to determine the most popular dog name(s) per neighborhood. I chose this with the eventual goal of reexamining this data at a more complex level as my time in Pratt Institute’s GIS & Design certificate program rolls on. An end goal would be to have a fully functional search engine that allows users to essentially profile dogs across New York, filtering for name, breed, neighborhood, and even when their birthday is (just picture it, a map of all the dogs in New York who share your birthday…glorious).
HOW I DID IT:
For the Foundations of Spatial Thinking version:
To begin with, I knew I’d need to drastically trim this data down. I didn’t want to take it too far (as I couldn’t anticipate what data I may need down the road) but the sheer size of the original dataset made using it in Carto impractical. Even after trimming it several times I found it would crash.
My key obstacle in the beginning was finding a feasible way to geocode the whole thing (the licensing info does not contain exact addresses of the dogs’ residences). I tried several different plugins to geocode within Google Sheets; all of them crashed my browser. I had many options in the way of geo-codable columns to use as the data is quite thorough. Still, most of the options lacked the fidelity I’d need to examine the dog names in a meaningful way on a map. Congressional and senatorial districts were too broad while the census tract data was filled with typos, and though zip codes were closer to my goal I chose to use a type of subdivision unique to New York City called Neighborhood Tabulation Areas (NTAs). They equate roughly to neighborhoods and are used by the City of New York for its own internal demographic analyses beyond that of the federal census.
Unfortunately, Carto’s built-in geocoding features were unable to geocode by NTA (expected as they are a fairly arbitrary sub-division type). I was left with limited options given my less-than-impressive coding skills.
I opted instead to first get the NTAs on the map. I downloaded the NTA shapefile from the Open Data portal. I decided to open it in ArcMap as that’s become an instinctual first step for me over the years when working with spatial data. I checked out the attribute table and happily found that it was well-organized and straightforward. This gave me the idea of simply appending the table with the values of the most popular dog names per NTA. My only problem from there was to figure out what those values were, and oh what a problem that became!
I tried several different methods of filtering the data to display the top three or so most common occurrences (modes) in the name field per NTA. I actually did end up writing a successful (albeit messy) SQL query. It was at that point that I discovered that the vast majority of NTAs have “UNKNOWN” as their most common name. I thought I could just trim that, but many have “NO NAME PROVIDED” and similar variations as their most popular “name”. I tried to find a way to write code to omit these results and then filter and order the remaining values per NTA but could not wrap my head around it.
Instead, and at the suggestion of Eric Brelsford in SAVI 780, I ended up using Excel’s Pivot Tables. Pivot tables allow for GUI-based filtering, and in my case I was able to create a new spreadsheet that had the top 10 values per NTA.
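For anyone curious, the filtering step described above (drop the placeholder "names", then rank what is left per NTA) could be sketched in a few lines of Python. This is only a sketch under stated assumptions, not the SQL query or the pivot table actually used: the placeholder list, the function name top_names_by_nta, and the sample rows are all made up for illustration.

```python
from collections import Counter, defaultdict

# Assumed list of placeholder values that are not real names; extend as needed.
PLACEHOLDERS = {"UNKNOWN", "NO NAME PROVIDED", "NAME NOT PROVIDED", "NULL", ""}


def top_names_by_nta(rows, n=10):
    """Return {nta: [(name, count), ...]} with placeholder names dropped.

    rows is an iterable of (nta, name) pairs. Names are upper-cased and
    stripped so that "Bella" and " BELLA" count as the same name.
    """
    counts = defaultdict(Counter)
    for nta, name in rows:
        cleaned = (name or "").strip().upper()
        if cleaned in PLACEHOLDERS:
            continue  # skip UNKNOWN and friends before counting
        counts[nta][cleaned] += 1
    return {nta: counter.most_common(n) for nta, counter in counts.items()}


# Hypothetical sample rows, not taken from the real dataset.
sample_rows = [
    ("Example North", "UNKNOWN"),
    ("Example North", "UNKNOWN"),
    ("Example North", "Bella"),
    ("Example North", " bella"),
    ("Example North", "Max"),
    ("Example South", "NO NAME PROVIDED"),
    ("Example South", "Rocky"),
]

top = top_names_by_nta(sample_rows, n=3)
for nta, names in top.items():
    print(nta, names)
```

Filtering before counting is the key design choice here: once the placeholders are gone, the ordinary "most common" ranking gives the real top names without any special-casing.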
With the answers to my question in mind, I took a mind-numbing route of attack and created the new dataset myself. I went through the attribute table of the already uploaded and geocoded NTA layer and in a newly added column I entered the top name(s) for each NTA. To make it easier on myself, I had the .xls to the side in another window and used CTRL + F to quickly find each NTA as I went through Carto in the other window. All in all, that manual entry took about four hours but was well worth it to get the data I wanted.
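The manual entry above is, in effect, a table join on the NTA name. As a rough sketch of how that lookup might be scripted instead, the Python below attaches a top name to each NTA row from a dictionary. The field names (nta_code, nta_name, top_name) and the sample values are hypothetical and are not taken from the real shapefile or spreadsheet.

```python
def attach_top_names(nta_rows, top_names):
    """Return copies of nta_rows with a 'top_name' field added.

    nta_rows: list of dicts, each with an 'nta_name' key (assumed field name).
    top_names: dict mapping NTA name -> most popular dog name.
    NTAs with no match get an empty string rather than raising an error.
    """
    joined = []
    for row in nta_rows:
        new_row = dict(row)  # copy so the original attribute rows are untouched
        new_row["top_name"] = top_names.get(row["nta_name"], "")
        joined.append(new_row)
    return joined


# Hypothetical attribute-table rows and lookup values, for illustration only.
nta_rows = [
    {"nta_code": "X01", "nta_name": "Example North"},
    {"nta_code": "X02", "nta_name": "Example South"},
]
top_names = {"Example North": "BELLA"}

joined = attach_top_names(nta_rows, top_names)
for row in joined:
    print(row)
```

A join like this only works if the NTA names match exactly in both tables, which is the same thing the CTRL + F approach was checking by eye.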
For the SAVI 780 version:
My intention with SAVI 780 was to try to tackle the same problem but through code, ideally pulling in as much coding knowledge gathered throughout the course as possible. I’ll be the first to admit that I did not personally feel like I ever reached a point of seeing where the dots connect in regards to the various coding languages and tools we learned. Quite simply, I just did not have enough time outside of class to entertain any auxiliary resources or exercises. I will say that working with HTML felt quite straightforward as I put together the page used for both classes and credit that to SAVI 780 (we didn’t touch code in Foundations of Spatial Thinking). The concept of how the same goal might be accomplished in Leaflet made sense, and would in fact be the far better solution if I had stronger coding skills (I look forward to revisiting the concept with other classes as I progress in the certificate program) since it allows for the large datasets to be hosted elsewhere in APIs rather than requiring an import of any kind.
The process for getting the data into the map was actually pretty straightforward. The map pulls from the city’s Open Data Portal API, first fetching the NTA spatial layer itself. From there it relies on an event listener that, upon clicking, will pull from the NTA and Dog License Data Set and then display the data in a popup. Also in the pop-up code is a line telling it to pull the second item in the row (position 1); this is an intentional and not-perfect solution to my problem with all of the unknown data. Every NTA has UNKNOWN as the most common name, while only some have additional popular results like “NULL” or “NAME NOT PROVIDED”. By at least shaving off what I knew to be not helpful results in the first line per NTA I was able to create a semblance of a map instead of none at all. I did try displaying the first ten or so values in each pop-up but I could never get the data to display in a manner legible enough for my standards (something I’d like to revisit). Ultimately, I had to come to terms with the fact that my coding skills were far too limited to yield the perfect and clean data set that I wanted.
For aesthetics (and fun!), I made my own light-hearted basemap in Mapbox that is publicly available (see bottom of page) and dog-themed, based initially off of Mapbox’s Ice Cream theme. To turn the [water] in my basemap into dog prints, I thought I could just select that icon in the Mapbox GUI and I’d be all good (as I’d seen Eric do this briefly with bicycle icons as a filler, or what Mapbox calls a “pattern”, in an in-class demonstration). I tried to do the same and hit an unexpected snag: the icons could not be stylistically edited nor could the colors be changed. Luckily I know Adobe Illustrator well and opted to export the SVGs from Mapbox’s open source icon package (Maki) and then loaded them into Illustrator. From there I edited the SVG and used hex codes to get the color of it in line with the palette of the basemap. I uploaded my new SVG into Mapbox and was able to use it in the map.
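To show why pulling position 1 is only a partial fix, here is a small Python sketch of that selection logic next to one possible alternative. The real map does this in the pop-up's JavaScript; the function names, the placeholder list, and the ranked list below are assumptions made up for illustration.

```python
# Assumed list of placeholder values that are not real names.
PLACEHOLDERS = {"UNKNOWN", "NO NAME PROVIDED", "NAME NOT PROVIDED", "NULL", ""}


def second_item(ranked_names):
    """Mimic the approach described above: skip the top result, take position 1."""
    return ranked_names[1] if len(ranked_names) > 1 else None


def first_real_name(ranked_names):
    """Alternative: walk down the ranking until a non-placeholder name appears."""
    for name in ranked_names:
        if name.strip().upper() not in PLACEHOLDERS:
            return name
    return None


# Hypothetical ranking for one NTA, most common first.
ranked = ["UNKNOWN", "NAME NOT PROVIDED", "BELLA", "MAX"]

print(second_item(ranked))      # position 1 is still a placeholder here
print(first_real_name(ranked))  # skips every placeholder, however many there are
```

When an NTA has only UNKNOWN ahead of its real names, both functions agree; they differ only where a second placeholder sits in position 1.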
Lastly, I used the iFrame HTML element to embed the Carto Map, the Leaflet Map and the basemap preview in my page.
Dog-Name Anomalies Encountered Along the Way:
There are 78 dogs named “Brooklyn”…in Brooklyn. Perhaps even more concerning, there are 2 dogs named “Brooklyn” in Co-Op City, which is in the Bronx? I have to wonder if these are the first documentable examples of “internally displaced dogs” and wish them the best in their efforts to return to their borough of origin.
There are not one but two dogs named “Blue Gotti” in South Jamaica, Queens and two dogs named “Boss Lady” in the southern portion of Crown Heights, Brooklyn. South Jamaica also has several dogs simply named “Pupi”, and yes that is how it was spelled on the license.
One of my favorite Brooklyn neighborhoods, Prospect Lefferts Gardens, contains not one nor two but three living dogs named after Thor, the Norse god of thunder.
There are many, many princesses in New York City. The vast majority of them are canines. Mott Haven, in the South Bronx, has fifteen all on its own.
There are two dogs named “NO” in Claremont-Bathgate, Bronx and a dog named “YEAH” in Marine Park, Brooklyn.
Greenpoint’s top ten might be my personal favorite, with four dogs simply named “Sir”, five dogs with the painfully generic white human male name “Spencer” and best of all, four dogs named “Taco”.
Belmont in the Bronx seems to have a strong appreciation for hip hop and gang culture, with two dogs named “Kilo” and two named “Triggah”. Also in the Bronx are two dogs named “Paper”, presumably a reference to getting money, in Crotona Park East. I feel compelled to mention that Belmont also has two dogs named “Cranberry”.
Cambria Heights in Queens may well be the most glamorous neighborhood in New York City, at least according to its dog names, with two “Princesses”, two “Duchesses”, and two “Mercedes”.
In looking across the city, a general trend that I found surprising and that you won’t see reflected in the maps is the popularity of the name “Gizmo”. It’s rarely the number one name in a neighborhood but is often in the top ten, especially in the Bronx.
The West Village might be the most basic neighborhood in New York, with a whopping thirty-five dogs named Charlie within its bounds.
Lastly, and please do not proceed if you’d prefer to not be repulsed or offended, there are not one but two dogs with the unfortunate name of “BIGBOIPLUGGER” in Williamsburg, God help them.
In closing, I look with both trepidation and excitement towards the future as I build upon my coding and design skills to examine NYC dog data in the courses to come.
Mark my words, I will leave no paw unturned.
For questions or to contact in general, please reach out to me at:
fletcherberryman [at] gmail [dot] com