Schools, sex offenders and mapping

by Andy Boyle.

Map made by Mike Stucka

From The Taunton Call

In 2006, The Taunton (Mass.) Call used mapping software to discover that a proposed city ordinance would make most of the city off-limits to sex offenders. (Link to a PDF copy of the stories.)

Mike Stucka, who worked on this as a Northeastern University master’s project with former colleagues from the Taunton Gazette, is now a staff writer at The Salem (Mass.) News. He was kind enough to answer some of my questions via e-mail recently. My questions are in bold:

Explain to me how the story idea came about and why you decided to pursue it.

One of my former editors, the inestimable Frank Mulligan, mentioned over lunch one day that the city council of Taunton, Mass., was about to approve sex offender registry restrictions that summer, but the city’s mapping department wouldn’t be able to evaluate those restrictions until that fall. In other words, the city was about to pass a law without knowing what it meant. The size of the proposed restrictions, some 2,500 feet, meant that roughly one-mile circles would be carved out of the town as banishment zones. The law also included references to places “where children may regularly congregate,” which was never defined.
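As a quick check on the "roughly one-mile circles" figure, here is a minimal arithmetic sketch. The only input taken from the interview is the 2,500-foot distance; the variable names are illustrative.

```python
import math

FEET_PER_MILE = 5280
radius_ft = 2500  # buffer distance cited for the proposed ordinance

# A 2,500-foot radius gives a circle a little under a mile across.
diameter_mi = 2 * radius_ft / FEET_PER_MILE

# Area of one circular zone, converted from square feet to square miles.
area_sq_mi = math.pi * radius_ft ** 2 / FEET_PER_MILE ** 2

print(f"diameter: {diameter_mi:.2f} miles")               # diameter: 0.95 miles
print(f"area per zone: {area_sq_mi:.2f} square miles")    # area per zone: 0.70 square miles
```

Each restricted site therefore removes about seven-tenths of a square mile, before any overlap between zones is taken into account.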

Frank tasked another former colleague of mine, the esteemed Rebecca Hyman, to work with me on the story. I did most of the data work, with advice from Frank and Rebecca, particularly on presentation. Rebecca and I split up the experts. Rebecca did a bunch of the background research. And then Rebecca did the part that made the real difference — she took my conclusions, my statistics, and presented them to the city council members in phone interviews. To a person, they all agreed the proposed restrictions covered too large an area and would have to be reduced. We changed public policy before publication, which was a neat experience.

Your explanation mentions that you received data about Class III sex offenders from state agencies — was it from the state police? And did they provide you with a database of names and other information, or did you use a searchable Web site and then plot the names from there?

Massachusetts lists “Level 3” offenders publicly online, as well as in the police stations. These are the sex offenders deemed most likely to re-offend. Level 1 and Level 2 registries are held confidentially. Note, too, that these lists are never to be used to harass anyone.

I manually scraped the names, addresses and worst offenses from the Web site. There weren’t many for a single city, about two dozen. One of the first judgment calls I had to make was what to do with their work addresses. I entered those, but ultimately did nothing with them, as there were no proposed restrictions on where the sex offenders could work. Likewise, we didn’t try for an online component, but today, with tools like FMAtlas, we could have easily posted them online. Live and learn.

The state mapping agency that you received the other data from — how did you process the data and pull it into the story?

I manually built all of my databases.

The largest spreadsheet, with roughly six dozen lines, was the list of registered childcare providers. I ultimately copied-and-pasted these individually from the state’s early childhood education licensing agency, including the day care provider’s name and phone number, which also never got used. It was easier to do it at one time than to risk having to go back and replicate it all.

A point about these daycare providers: many are run from private homes, which would make entire neighborhoods off-limits. At the time I did the research, I was working from an apartment underneath … a daycare provider. The pitter-patter of running feet sometimes helped me focus on the goal.

For the parks, I tried to plot each one on Google Maps, find the centerline, and use the link generation feature to get the latitude and longitude. There were two or three “pocket” parks, such as those on a single house lot, that I could not find. At this point, I knew I had an “at least” story, where I could say that “at least so much land was excluded.”
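Pulling coordinates out of a generated map link can be sketched as follows. The interview does not give the link format, so this assumes the coordinates travel in an `ll=lat,lon` query parameter; the URL, the function name and the coordinates are all hypothetical.

```python
from urllib.parse import urlparse, parse_qs

def lat_lon_from_link(url):
    """Return (lat, lon) from a map link that carries an 'll=lat,lon' parameter."""
    query = parse_qs(urlparse(url).query)
    lat_text, lon_text = query["ll"][0].split(",")
    return float(lat_text), float(lon_text)

# Hypothetical link and coordinates, for illustration only.
example = "https://maps.example.com/?ll=41.9001,-71.0898&z=16"
print(lat_lon_from_link(example))
```

A link without the assumed parameter raises a `KeyError`, which is a useful signal that the point needs to be entered by hand.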

The schools’ addresses I pulled from the city’s school department. I had tried to get a map of the city’s schools but the city was unresponsive. Certainly the city had the schools mapped, but I couldn’t get them. This was the biggest source of error in the story: I had to count a school as a single point. Schools aren’t points. The smallest schools have a tiny playground and a parking lot. The biggest schools have vast acres of parking for students and faculty, sprawling buildings, athletic fields and stadiums, as was the case at the city’s high school, which on a Google Maps photo appears to stretch some 2,000 feet in one direction. Treating this as a single point obviously knocked off quite a bit of accuracy from the story and again led back to the “at least” idea.

I also had to remember the city’s private schools.

And there ultimately was no way to handle that weird phrasing — “and other places where children may gather” or whatever it was — because the lack of a definition meant I couldn’t do it. Would a Boys and Girls Club count? What about the house of the neighborhood woman who likes to bake cookies for kids? Again, more support for the “at least” story. (And the Boys and Girls Club was a daycare provider.)

I typed each set of data into separate spreadsheets in OpenOffice.org Calc, a Microsoft Excel competitor, and exported them into CSV, or comma-separated value, text files for processing.
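Reading such a CSV export back in for processing might look like the sketch below. The column names (`name`, `kind`, `lat`, `lon`) and the sample rows are assumptions made for illustration; the interview does not describe the actual spreadsheet layout.

```python
import csv
import io

# Hypothetical sample data. A real export would be opened with
# open(path, newline="") instead of this in-memory stand-in.
sample_csv = io.StringIO(
    "name,kind,lat,lon\n"
    "Example Park,park,41.9001,-71.0898\n"
    "Example School,school,41.9050,-71.1002\n"
)

def load_points(handle):
    """Read rows into dicts, converting the coordinate columns to floats."""
    points = []
    for row in csv.DictReader(handle):
        points.append({
            "name": row["name"],
            "kind": row["kind"],
            "lat": float(row["lat"]),
            "lon": float(row["lon"]),
        })
    return points

points = load_points(sample_csv)
print(len(points), points[0]["name"])  # 2 Example Park
```

Keeping one file per category (schools, parks, daycare providers) and tagging each row with its kind makes it easy to combine them later without losing track of why a location is restricted.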

Map-wise, I used the state GIS department for an outline of the city, and the Census Bureau for the Census blocks.

Walk me through the process of using Census data to see how much of the population would be affected.

The first trick was to get the most accurate information possible, which involved getting the smallest regions — Census blocks, typically with some hundreds of people. In dense areas, these were small. Because Massachusetts has no unincorporated land — land outside a city or town’s limits — some of the rural Census blocks were annoyingly large.

I generated “buffer zones,” or exclusionary areas, for each of my sections. Then I had to figure out how to compare these buffer zones to the population.
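To make the idea of a buffer zone concrete, here is a minimal sketch that approximates a 2,500-foot circular buffer as a polygon, which is roughly what GIS buffer tools produce. It assumes coordinates already projected to a flat system measured in feet; the function names and the 64-segment choice are illustrative, not the ArcGIS workflow used for the story.

```python
import math

def buffer_polygon(cx, cy, radius_ft=2500, segments=64):
    """Approximate a circular buffer around (cx, cy) as a list of vertices."""
    return [
        (cx + radius_ft * math.cos(2 * math.pi * i / segments),
         cy + radius_ft * math.sin(2 * math.pi * i / segments))
        for i in range(segments)
    ]

def polygon_area(vertices):
    """Shoelace formula for the area of a simple polygon."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2

zone = buffer_polygon(0, 0)
approx_area = polygon_area(zone)
true_area = math.pi * 2500 ** 2
print(f"polygon covers {approx_area / true_area:.1%} of the true circle")
```

With 64 segments the polygon falls short of the true circle by well under 1 percent, which is small next to the error from treating a school as a single point.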

From a suggestion on the NICAR mailing list, I knew I could try to cheat and grab only the Census blocks whose centers fell inside a buffer zone. Basic geometry suggests you can have a circle intercepting less than half a square and still have the circle get the center of the square. This made me nervous.
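The center-point shortcut can be sketched in a few lines. The sites, block centroids and populations below are invented for illustration, and coordinates are assumed to be in feet on a flat plane.

```python
import math

BUFFER_FT = 2500

# Hypothetical restricted sites and Census blocks (coordinates in feet).
sites = [(0, 0), (6000, 0)]
blocks = [
    {"id": "A", "centroid": (1000, 500), "population": 300},
    {"id": "B", "centroid": (3000, 0), "population": 200},
    {"id": "C", "centroid": (6000, 2000), "population": 500},
]

def centroid_in_zone(centroid, sites, radius=BUFFER_FT):
    """True if the block's center lies within the buffer of any site."""
    cx, cy = centroid
    return any(math.hypot(cx - sx, cy - sy) <= radius for sx, sy in sites)

# Count a block's entire population if its center is inside a zone, else none of it.
covered = sum(b["population"] for b in blocks if centroid_in_zone(b["centroid"], sites))
total = sum(b["population"] for b in blocks)
print(f"{covered} of {total} people ({covered / total:.0%})")  # 800 of 1000 people (80%)
```

The all-or-nothing counting is the source of the nervousness: block B sits between two zones and is counted as wholly unaffected even if its edges are covered.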

I hit on another idea: Calculating how much of the Census block was covered by the buffer zones. If the population was pretty evenly distributed across the city, this would be the most accurate way of doing it.
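A minimal sketch of this proportional approach follows. It is not the ArcGIS procedure from the tutorial: it assumes rectangular blocks, invented populations and a single hypothetical site, and it estimates the covered share of each block by sampling a grid of points rather than intersecting polygons.

```python
import math

BUFFER_FT = 2500

def covered_fraction(rect, sites, radius=BUFFER_FT, steps=50):
    """Estimate the share of a rectangle inside any buffer by sampling grid cell centers."""
    x0, y0, x1, y1 = rect
    hits = 0
    for i in range(steps):
        x = x0 + (i + 0.5) * (x1 - x0) / steps
        for j in range(steps):
            y = y0 + (j + 0.5) * (y1 - y0) / steps
            if any(math.hypot(x - sx, y - sy) <= radius for sx, sy in sites):
                hits += 1
    return hits / (steps * steps)

# Hypothetical site and blocks; rect is (min_x, min_y, max_x, max_y) in feet.
sites = [(0, 0)]
blocks = [
    {"id": "inside", "rect": (-500, -500, 500, 500), "population": 400},
    {"id": "straddling", "rect": (2000, -500, 3000, 500), "population": 100},
    {"id": "outside", "rect": (5000, 5000, 6000, 6000), "population": 250},
]

# Allot each block's population in proportion to its covered area.
affected = sum(b["population"] * covered_fraction(b["rect"], sites) for b in blocks)
total = sum(b["population"] for b in blocks)
print(f"estimated affected share: {affected / total:.1%}")
```

The straddling block contributes a little under half of its residents here, where the center-point method would have to count either all of them or none.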

I ultimately made a tutorial for this process using ArcGIS, available through the Reporting Cookbook wiki.

In the end, despite my worries, the two methods had nearly the same results: roughly 83 percent of the city (83 percent!) would be prohibited from residency.

When using Census 2000 data, is there ever any fear that the data could be outdated, or do you use population estimates?

The most accurate way might’ve been to get the city census or the voter registration list and map every address. Given that the city was nonresponsive in even giving me a map of the schools, I did what I could on deadline and made do with 2000 estimates. One benefit from a negative: The city’s financial heyday came some decades earlier, so there was relatively little new housing that would have shifted population centers.

Did you try to contact any of the sex offenders who would be most impacted by the proposed ordinance?

Nope. I’d thought about it, and had even wanted to see which residents were covered by the most overlapping banishment zones. There was a home for recovering alcoholics near a private school, a day care, a library and other locations.

Running low on time as I scrambled to get the data together, I, well, weaseled. No sex offender could be banished from a current home; they’d be grandfathered in at that residency. They’d simply be prohibited from moving to a prohibited zone.

How long did the whole process take from getting all of the data, importing it and mapping it until you had something you could use?

A couple of weeks, I think?

What was the hardest part about doing this story?

The data problems were frustrating, the mapping software wasn’t always intuitive, and I was apparently trying to do some things, like the proportional allotment of residency, that hadn’t been tried (or at least written about) before by journalists.

In the end, the timeline was the toughest, combined with the basic premise of the story and law: As we talked to experts, we came to understand that the vast majority of molestations are done by people who are known to the children — an uncle, a trusted family friend, a neighbor. Talking about these convicted strangers so much in a way distracts from the most likely source of molestation. In effect, we were forced to focus on a small part of the problem without much opportunity to educate the public about the most common risks and how to recognize them. Unfortunately, the law set the agenda, and we had to write about the law.

Ultimately, in a very real way, we came down on the side of the sex offenders. That’s an unusual place to be. But the experts told us that the more the sex offenders were cut off — from society, from jobs, from transportation, from family — the more likely they would be to reoffend. They’d also be more likely to “go underground” and not register — thus becoming completely unknown to the police and neighbors.

Some cliches come to mind — “the devil is in the details” and “… paved with good intentions.” The city council was trying to protect children, which is naturally a politically astute and humanitarian thing to do. Yet the sex offenders had no lobby to protect their interests, and no one else had done the mapping or had talked to the experts. By doing the research and writing the stories, we were able to actually protect more kids than the law would have. Let me rephrase. We reduced the likelihood of children getting molested because of overeager residency restrictions that would have made sex offenders more likely to reoffend.

To re-rephrase, we may have prevented some kids from getting molested. At the end of the day, despite the technical hangups and moral questions, that’s a pretty damned good day.

If someone else wanted to try doing a similar story in their community, what hints or tricks would you have that they could learn from?

Start early. Get all the data you can possibly need on the first pass. Think now, especially as online mapping tools have gotten easier, about multiple platforms for your results and how they may need to be presented differently for different media and different audiences.

Think of those little details — like, should someone be able to type in an address and see where and why it’s prohibited? Do you post the daycare provider’s names — which are already elsewhere on the Web — in your online map, or is that too invasive since so many are run from homes?

And be damned careful with your data. By using points for the schools and leaving out some parks, I underestimated the scope of the banishment zones — giving me a consistent “at least” story. But on the first few attempts to play with the data, I made some rookie mistakes — such as not flattening all of the banned zones into a single layer. When I calculated the area of banned zones, I had more square footage of the city banned … than was actually in the city. I also had to learn how to use the mapping tools properly. For example, banishment zones near the city limits would be cut off at the city limits, because one city has no power over the neighboring town; so I had to cut out the flattened buffer layer from the city’s borders.
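The flattening and clipping pitfalls can be illustrated with a small sketch. Everything here is hypothetical (a square city, six invented sites), and the grid sampling is a stand-in for the dissolve and clip steps a GIS package would perform.

```python
import math

BUFFER_FT = 2500

# Hypothetical square city, (min_x, min_y, max_x, max_y) in feet, and six sites.
city = (0, 0, 10000, 10000)
sites = [(2000, 5000), (2500, 5000), (3000, 5500),
         (3000, 4500), (9500, 5000), (9800, 5200)]

city_area = (city[2] - city[0]) * (city[3] - city[1])

# Rookie mistake: add up every circle separately, overlaps and overhang included.
naive_area = len(sites) * math.pi * BUFFER_FT ** 2

def flattened_clipped_area(city, sites, radius=BUFFER_FT, steps=200):
    """Count each sampled cell inside the city once, however many buffers cover it."""
    x0, y0, x1, y1 = city
    cell_area = ((x1 - x0) / steps) * ((y1 - y0) / steps)
    hits = 0
    for i in range(steps):
        x = x0 + (i + 0.5) * (x1 - x0) / steps
        for j in range(steps):
            y = y0 + (j + 0.5) * (y1 - y0) / steps
            if any(math.hypot(x - sx, y - sy) <= radius for sx, sy in sites):
                hits += 1
    return hits * cell_area

flat_area = flattened_clipped_area(city, sites)
print(f"naive sum: {naive_area / city_area:.0%} of the city")
print(f"flattened and clipped: {flat_area / city_area:.0%} of the city")
```

In this example the naive sum exceeds the city's total area, the same impossible result described above, while the flattened and clipped figure cannot, because sampling only inside the city limits counts each spot at most once.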

Such stories are going to get a lot of scrutiny. Do them — and do them right.