Archive for November, 2008

Cool Facebook visualization

Someone on a listserv I’m on sent out a link to this: http://www.facebook.com/video/video.php?v=37403547074&ref=nf

It’s a visual representation of different Facebook data, such as when people interact and when people add new friends. It’s a really nifty way to represent data. Perhaps someone could do something similar with the University of Nebraska-Lincoln’s Internet usage and visually pinpoint where the most traffic is coming from?

More tools to help local journalists

I just added a new page on the blog, Public Records Law Links, so readers have a one-stop place for many state and local Web sites to help them find background information. I’ve included links to the local assessor’s office, court information, budget database searches and even how to track an airplane.

I’ll be adding more in the future, so be sure to check back.

A ‘Show Me State’ after a lawsuit

One of my favorite newspapers is the Kansas City Star, and on Sunday it had an awesome story about its fight for some e-mails from the Missouri governor’s office. The paper, in conjunction with the Associated Press and the St. Louis Post-Dispatch, eventually sued to get the records. Here’s the lede of the story, which you can read here:

ST. LOUIS | The release of thousands of e-mails from top officials in Missouri Gov. Matt Blunt’s administration appears to partially vindicate a disgraced staff lawyer who was fired last year.

Scott Eckersley’s rise and fall — and proof of his longstanding claim that he had warned the administration over e-mail secrecy — are documented in a cache of 60,166 pages of e-mails released to an unprecedented consortium of media organizations, including The Kansas City Star.

I highly recommend you read the rest of the story. After my own disagreements with the administration of the University of Nebraska-Lincoln about some public records, I’m happy that it didn’t get to this level.

This type of journalism is what needs to happen more in Nebraska. I’d like to see someone request the e-mails sent by some of our governor’s staff and see what the response is. Hopefully they won’t threaten to have you escorted out of the Capitol by security.

Atlanta’s parking scofflaws

I haven’t updated the blog in quite a while, but I’m back with a vengeance. This time it’s about a story by Mike Maciag, one of my former roommates in Erie, Pa., while we both interned at the Erie Times-News. He set out to investigate Atlanta’s collection of parking fines, and he discovered the city had lost about $10.5 million in unpaid tickets.

He eventually discovered parking enforcement was extremely weak, and he managed to track down and interview one person who had $4,100 in unpaid tickets.

You can view his story here.

Mike’s always been a good pal, and more importantly a smart reporter who I can always depend on for help when I’ve got something data-related I’m working on. He has one major flaw: He likes the Bengals (“you know, America’s team?”) too much.

Here are his e-mailed responses to some questions I had for him:

1. Explain to me how the story idea came about and why you decided to pursue it.

I had a list of ideas when I arrived at the AJC, and a story on parking tickets was one of the first I decided to pursue. I particularly thought it was worth investigating since the city had recently cut its parking management staff to only nine people. However, I wasn’t quite sure which direction to take the story until I got my hands on the data.


2. Where did you get the data from to do the story, and describe the process of getting it (how long did it take to receive it, what costs, what form did it come in)?

The paper requested records for all parking citations issued since 2003 from the Atlanta Municipal Court. It took many phone calls and e-mails, but we eventually received the data after about a month. Some reporters might not have the luxury of waiting this long, but I was fortunate enough to have editors who were patient and willing to let me spend time getting the data.

The records were exported out of the court’s database into a delimited text file. The fields included in the file were the date, citation number, license plate number (for the more recent data), name (only a small handful had names), fine amount, penalty amount, paid amount, payment date and payment type.
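For readers who would rather script this step than import the file into Access, here is a minimal Python sketch of reading a file like the one Mike describes. The pipe delimiter, the header row, the column names and the two sample records are all my assumptions for illustration; the interview doesn’t say what the court’s export actually looked like.

```python
import csv
import io
from decimal import Decimal

# Hypothetical sample: the real delimiter, header names and values are assumptions.
SAMPLE = """date|citation_number|plate|name|fine|penalty|paid|payment_date|payment_type
2006-03-14|A1001|ABC1234||25.00|10.00|0.00||
2006-03-15|A1002|XYZ9876|J. DOE|50.00|0.00|50.00|2006-04-01|CHECK
"""

MONEY_FIELDS = ("fine", "penalty", "paid")


def load_citations(handle, delimiter="|"):
    """Parse delimited text into dicts, converting money columns to Decimal."""
    rows = []
    for row in csv.DictReader(handle, delimiter=delimiter):
        for field in MONEY_FIELDS:
            # Treat a blank money column as zero so later arithmetic doesn't fail.
            row[field] = Decimal(row[field] or "0")
        rows.append(row)
    return rows


citations = load_citations(io.StringIO(SAMPLE))
print(len(citations), "rows loaded")
```

With a real export you would pass an open file instead of the in-memory sample, and swap in whatever delimiter the agency used.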

The cost was about $95.

3. What did you use to analyze the data and how long did it take?

I used Access for nearly all of the analysis. Some Excel spreadsheets were also used for smaller data sets that were generated from queries.

The analysis and data cleaning took no more than two days. The reporting took much longer.

4. Walk me through your analysis process.

The first step – often the most difficult with these types of projects – was making sense of the data. This required a few phone calls to the IT people at the court.

A couple of things came out of our conversation. First, I found out that all data prior to 2005 was unreliable and couldn’t be used for the story. In addition, a new system had been implemented in 2007, and only unpaid citations were transferred over. This meant I wouldn’t be able to calculate how much money was actually collected during most years.

But I still had more than enough good data for the story. In all, the file had about 725,000 rows (about 410,000 unique citations) for tickets issued since 2005.
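The gap between rows and unique citations is worth checking early, because one ticket can show up on more than one line. A rough Python sketch of that check, using made-up records and assuming each row carries an issue date and a citation number:

```python
from datetime import date

# Hypothetical rows of (issue_date, citation_number); not real court data.
rows = [
    (date(2004, 11, 2), "A0001"),   # before the cutoff, so treated as unreliable
    (date(2005, 1, 10), "A0002"),
    (date(2005, 1, 10), "A0002"),   # the same citation on a second row
    (date(2006, 7, 4), "A0003"),
]

CUTOFF = date(2005, 1, 1)  # keep only tickets issued in 2005 or later

usable = [row for row in rows if row[0] >= CUTOFF]
unique_citations = {number for _, number in usable}

print(len(usable), "rows,", len(unique_citations), "unique citations")
```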

After cleaning up the records, I performed a series of Access queries – everything from making a list of top offenders to determining how much was owed by out-of-state motorists.

To get the top scofflaws, I subtracted the paid amount from the sum of the fines and penalties. Using the ‘Group by’ function for each plate number, I added the totals (with the SUM function) to get the amount owed. This showed at least 179 people had debts of $1,000 or more.
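Mike ran this in Access, but the same group-and-sum idea can be sketched in Python with SQLite. Treat this as an illustration, not his actual query: the table name, column names and dollar amounts below are invented, and a real file might need blank paid amounts converted to zero first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE citations (plate TEXT, fine REAL, penalty REAL, paid REAL)"
)
# Hypothetical sample records, not real court data.
conn.executemany(
    "INSERT INTO citations VALUES (?, ?, ?, ?)",
    [
        ("ABC1234", 500, 250, 0),
        ("ABC1234", 300, 150, 0),
        ("XYZ9876", 50, 0, 50),
        ("JKL5555", 100, 50, 30),
    ],
)

# Amount owed per plate = sum of (fine + penalty - paid), grouped by plate.
top_scofflaws = conn.execute(
    """
    SELECT plate, SUM(fine + penalty - paid) AS owed
    FROM citations
    GROUP BY plate
    HAVING SUM(fine + penalty - paid) >= ?
    ORDER BY owed DESC
    """,
    (1000,),
).fetchall()

print(top_scofflaws)
```

The 1,000 passed to the query is the same $1,000 threshold mentioned above; lower it to see more plates in the list.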

Tracking down vehicle owners with outstanding citations was more difficult. Although the file contained a field for the name of licensees, nearly all the entries were blank. If I remember correctly, only about 5,000 actual names were in the whole file.

In some states, not having names isn’t a problem because you can go to the DMV and request information on an owner by providing the license plate number. But in Georgia, the law doesn’t allow for this, so I had to look for alternatives.

I ended up taking my list of top scofflaws and running queries in Access to display all records for each plate number. For most plate numbers, there were more than 100 records, none of which had names. But some license plates did have one or two records with names, and that was all I needed.
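The trick of hunting for the one or two named records per plate is easy to script as well. Here is a hedged Python sketch; the plates, citation numbers, names and the `named_leads` helper are all hypothetical stand-ins for whatever the Access queries returned.

```python
from collections import defaultdict

# Hypothetical list of plates from the top-offender step.
top_plates = {"ABC1234", "JKL5555"}

# Hypothetical records; most real rows had a blank name field.
records = [
    {"plate": "ABC1234", "citation": "A1", "name": ""},
    {"plate": "ABC1234", "citation": "A2", "name": "J. DOE"},
    {"plate": "JKL5555", "citation": "A3", "name": "  "},
    {"plate": "XYZ9876", "citation": "A4", "name": "R. ROE"},
]


def named_leads(records, plates):
    """Return {plate: [(citation, name), ...]} for rows that carry a name."""
    leads = defaultdict(list)
    for record in records:
        name = record["name"].strip()  # whitespace-only names count as blank
        if record["plate"] in plates and name:
            leads[record["plate"]].append((record["citation"], name))
    return dict(leads)


leads = named_leads(records, top_plates)
print(leads)
```

Plates that never appear in the result, like JKL5555 here, are the ones with no named record at all and would need another route.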

Using the city’s online parking violation payment site, I entered the plate and citation numbers for the tickets that had names attached. Some of the search results included addresses for vehicle owners. I then handed over a couple of the names and addresses to our news researcher, who compiled reports with contact information by running reverse address searches.

For one individual who I eventually contacted for the story, the phone numbers in the report were all incorrect. Instead, I decided to use the man’s employment history to track him down. Fortunately, he just happened to be listed as working for a company on LinkedIn. I called the phone number listed on the company’s Web site and sure enough, he answered.

5. The reporting process — what was the most difficult part about using the data when trying to contact people? Did they have trouble understanding what you found? Did anyone not believe your findings or dispute them?

After computing the amounts for unpaid citations, I went back to the court to check the numbers. I explained my process for adding and subtracting dollar amounts in different columns to compute the totals, and they verified my methodology. This really should be a required step for any type of data analysis.

The major finding was that drivers owed the city a whopping $10.5 million in fines and penalties issued since 2005. No one ever disputed this number. The city Department of Public Works had no stats for comparison, so there really wasn’t much they could do. A man with $4,100 in unpaid tickets was pretty shocked, though, when I told him how much he owed.

6. What was the hardest part about doing this story?

The most difficult part was gathering all the facts regarding the city’s collection process and parking enforcement. There are many players involved. Both parking enforcement officers and police can punish drivers, but it varies greatly depending on the circumstances. The city court collects the fines, but a collection agency is used after 90 days. There are also all the details about how notices are sent out and the specifics of the contract with the city.


7. If someone else wanted to try doing a similar story in their community, what hints or tricks would you have for them, and what mistakes did you make that they could learn from?

The story possibilities are endless if you’re able to get parking ticket data. You could do a story on unpaid citations if the numbers are high enough. If the data includes enough names, another possibility might be to see if any public officials have unpaid citations. A reporter also could determine where the most tickets are issued. I would’ve liked to do that for this story, but addresses weren’t included in the data.

The best advice I can give is to just make sure you understand the data and what is or isn’t reliable. This can save you a lot of time and headaches.