Learning to make the internets — a journalist’s guide

by Andy Boyle.

Following my previous post, a few folks had some questions on where to get started doing web development with a journalistic bent. So I thought I’d write down a few thoughts and give a basic roadmap of resources online and off that can help you get your feet wet.

The big problem with learning how to do this stuff is you need to know a little about a lot just to get a basic project off the ground. And not every tutorial tells you that you need to know X, Y and Z. So I’ll try to mention everything you should do some basic reading on.

But before I start babbling, you need to know that I come at this with a bias toward the programming framework called Django. Just because I like Django does not mean you have to like it. When possible, I will point out other frameworks/languages you can and should check out. The important thing to remember is that you can get the job done using many different tools. It’s your job to find the one that’s best suited for you.

[edit] Oh, and here’s the tl;dr from Brian Boyer of the skills to learn:

Server ops, SQL, a scripting language (Python, etc.), MVC + templating (Django, etc.), HTML, CSS, JavaScript + libraries (jQuery, etc.)

Just FYI.[/edit]

And here’s a cool graphic, called Engineering The Internet, that’s also a good primer.

Now that that’s out of the way, let’s begin!


Did I just blow your mind? Okay, well, in order to build anything on the internet, you need to know how the damn thing works. This is a simple tutorial that can explain it to you. You can also scroll down to “How The Internet Works” in this post for an even more layman’s version.

The basics are these: You’ve got servers. On them are databases. A website is basically a database that your computer is asking to talk to. Your browser renders the code that makes the website look pretty.

So the basic areas you need to understand in order to get a project live are these: servers, databases and website code.
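If you want to watch that conversation happen, here is a minimal sketch in Python using only the standard library. It assumes nothing from this post: the page content, the CatHandler class and the fetch_from_local_server function are made-up names for illustration. It starts a tiny server on your own machine, then plays the part of the browser by asking that server for a page.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# A made-up page for the server to hand out.
PAGE = b"<html><body><h1>Cats rule</h1></body></html>"


class CatHandler(BaseHTTPRequestHandler):
    """Answers every GET request with the same tiny HTML page."""

    def do_GET(self):
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(PAGE)))
        self.end_headers()
        self.wfile.write(PAGE)

    def log_message(self, format, *args):
        pass  # silence the default request logging


def fetch_from_local_server():
    # Port 0 asks the operating system for any free port.
    server = HTTPServer(("127.0.0.1", 0), CatHandler)
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # This is the "browser" half: connect, ask for "/", read the reply.
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", "/")
        response = conn.getresponse()
        body = response.read().decode("utf-8")
        conn.close()
        return response.status, body
    finally:
        server.shutdown()
        server.server_close()


status, html = fetch_from_local_server()
print(status)
print(html)
```

A real browser does the same thing and then draws the HTML instead of printing it. A real server would usually build that HTML from a database rather than a hard-coded string.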

Let’s start with the code on the website. When you go to a website (like this one), something makes it look pretty. You may have heard of this. The basic building blocks are HTML and CSS. If you right-click on the page and hit “view source,” the stuff that pops up is the HTML, or hypertext markup language.

After viewing the source, on line 11 you’ll see a link to this stuff. That’s the CSS. It basically makes it quicker and easier to style stuff. So instead of writing “make the background of this box red, make the font of any text in it size 14 and make this box 300 pixels wide,” you just call something you previously defined.

This follows one of the basic rules of programming: Don’t Repeat Yourself. You could skip the separate stylesheet and write the styling directly on every element, which is called using inline styles, such as on this amazing website, but you’d be repeating things over and over. Hence the use of a shared stylesheet. CSS, by the way, stands for cascading style sheets.

So these are some basic building blocks you should learn. You don’t have to be a master at it — I’m certainly not. But you need to know the rudimentary basics so you can chat with more experienced folks and learn from them. Oh, and you could try this tutorial, or countless others on the internet.


So knowing HTML/CSS will show you how to make flat websites where basically every individual page has to be hand-coded. Think the internet from 1997. (This should refresh your memory.) Another way websites work is by hitting databases, asking for information and spitting it out on a page. This blog, like most blogs, works this way.

Think of it like this: Each part of a blog post, or a story on a newspaper website, is made up of different parts. A headline, byline, text field, related links, etc. All of that is stored in a big spreadsheet, basically. And the website is programmed in such a way that when it sees www.thiswebsitewhatever.com/2011/04/22/best-cat-photos it knows to pull information for that post or story.
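Here is a rough sketch of that lookup in Python, using the sqlite3 module that ships with the language. The posts table, its columns and the find_post function are all hypothetical names I made up for illustration, and a real blog engine does a lot more than this. The idea is the same, though: take the URL apart, then ask the database for the matching row.

```python
import sqlite3
from urllib.parse import urlparse


def find_post(conn, url):
    """Look up the post that a /year/month/day/slug URL points at."""
    # urlparse only recognises the host if the URL has a scheme, so add one.
    if "://" not in url:
        url = "http://" + url
    path = urlparse(url).path  # e.g. "/2011/04/22/best-cat-photos"
    year, month, day, slug = path.strip("/").split("/")
    return conn.execute(
        "SELECT headline, byline, body FROM posts "
        "WHERE published = ? AND slug = ?",
        (f"{year}-{month}-{day}", slug),
    ).fetchone()


# A throwaway in-memory database with one made-up post in it.
blog_db = sqlite3.connect(":memory:")
blog_db.execute(
    "CREATE TABLE posts "
    "(published TEXT, slug TEXT, headline TEXT, byline TEXT, body TEXT)"
)
blog_db.execute(
    "INSERT INTO posts VALUES (?, ?, ?, ?, ?)",
    ("2011-04-22", "best-cat-photos", "Best cat photos",
     "A. Reporter", "Look at these cats."),
)

post = find_post(
    blog_db, "www.thiswebsitewhatever.com/2011/04/22/best-cat-photos"
)
print(post)
```

If no row matches, fetchone() returns None, which is the point where a real site would show a "page not found" error.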

So you need to learn the basics of something called SQL, which is the language most databases speak. It stands for “structured query language,” just so you know. A basic line looks like:

 SELECT * FROM awesome_pet_names WHERE pet = 'cat';

That may be rusty. But basically, you talk to a table named awesome_pet_names, asking for every row in it where the field “pet” is exactly “cat.” That’s a basic SQL query, and that’s more or less how the entire internet runs. Especially the part about cats.
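If you want to try that query without installing a database server, here is a small sketch using Python’s built-in sqlite3 module. The table name comes from the query above. The name column and the sample rows are my own assumptions, added so there is something to find.

```python
import sqlite3

# An in-memory database disappears when the script ends, so it is safe to play with.
pets_db = sqlite3.connect(":memory:")
pets_db.execute("CREATE TABLE awesome_pet_names (name TEXT, pet TEXT)")
pets_db.executemany(
    "INSERT INTO awesome_pet_names VALUES (?, ?)",
    [
        ("Mr. Whiskers", "cat"),
        ("Rex", "dog"),
        ("Chairman Meow", "cat"),
    ],
)

# The same idea as the query in the text, sorted so the order is predictable.
cat_rows = pets_db.execute(
    "SELECT * FROM awesome_pet_names WHERE pet = 'cat' ORDER BY name"
).fetchall()

for name, pet in cat_rows:
    print(name, pet)
```

Only the two cat rows come back. Rex the dog stays in the table but does not match the WHERE clause.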

SQL is also used a lot in database-oriented journalism, so it doesn’t hurt to know. Here’s a good place to start learning some of the basics. The wonderful Derek Willis suggests one should read this book, The Art of SQL by Stephane Faroult, which is more advanced.


OMG you’re learning so fast! Yes, they are more than that. Instead of writing our websites in SQL queries, most of us write them in what’s called a server-side scripting language. These are languages that run on the server, hit the database and spit out data onto websites. Some examples include Python (some don’t agree), Perl, Ruby and PHP. Python is what powers Django, Ruby powers Ruby on Rails and PHP can be used on its own. This blog is written in PHP, for instance. Other folks may use Java, which a lot of the internet is built on.
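To make “hit the database and spit out data” concrete, here is a bare-bones sketch in Python. It is not how Django or any real framework does it. The cats table and the render_cat_page function are hypothetical, and a framework would use templates instead of gluing strings together. But the job is the same: query the database, then wrap the results in HTML.

```python
import sqlite3
from html import escape


def render_cat_page(conn):
    """Build a small HTML fragment from whatever is in the cats table."""
    rows = conn.execute("SELECT name FROM cats ORDER BY name").fetchall()
    # escape() keeps characters like & and < from breaking the page.
    items = "\n".join(f"  <li>{escape(name)}</li>" for (name,) in rows)
    return f"<h1>My favorite cats</h1>\n<ul>\n{items}\n</ul>"


# Made-up data in a throwaway in-memory database.
cats_db = sqlite3.connect(":memory:")
cats_db.execute("CREATE TABLE cats (name TEXT)")
cats_db.executemany(
    "INSERT INTO cats VALUES (?)",
    [("Tom & Jerry's rival",), ("Chairman Meow",)],
)

page = render_cat_page(cats_db)
print(page)
```

Add a row to the table and the page changes the next time it is rendered, without anyone touching the HTML by hand. That is the whole appeal over flat, hand-coded pages.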

Also, don’t learn Perl. Just trust me. (This is where I point out that many web developers have biases against programming languages that are usually pointless and silly.)

Before diving into a framework, I would highly suggest learning the basics of PHP and how it interacts with MySQL, which is database software, among the most common on the internet. A good book is Head First PHP and MySQL. It teaches you the basics and it doesn’t get too boring. You can also find some good tutorials online.

Once you get the basics of PHP down, I suggest learning how to install WordPress and setting up your own blog on your own server. Which we shall now sort of discuss.


I’ve always maintained that the hardest part about making a website is setting it to run on a server. So if you make cats_rule.php that lists your favorite cats, you want it to live somewhere that people can see. Just so you’re aware, if you’re lucky enough to work at a place that has server operations people — server ops, ops, whatever they get called — you may not have to deal with this stuff as much. But it certainly doesn’t hurt to learn it.

So, a server runs on an operating system, just like your computer. One that I prefer is Ubuntu 10.04 LTS, which is one of many variations of Linux, an open source operating system that people work on out of the goodness of their hearts. Or you could use Windows Server, which was made by people who want you to have to pay lots of money to use it. Not that making money is bad or anything, but I tend to go with free stuff made by people who want to help the world.

The server, which has that operating system I mentioned, has to actually live somewhere. For many, many years, the only way you could really run a server was to either build your own box or rent space somewhere. Or have awesome ops people who set up stuff in a server room.

You may have heard of this “cloud” thing. Basically, big companies like Amazon have so much server space that they had a bunch left over that they can turn into virtual machines. Instead of setting up an individual physical server for each customer, they just partition their hardware into multiple virtual ones.

This means you can set up stuff really easily. This also means you can store media really cheaply, too. I would suggest using either Amazon Web Services or Rackspace. They are both relatively cheap, with their cheapest server space costing about $10 a month. Rackspace may be a bit easier because you don’t have to deal with SSH keys, so perhaps start out with that.

You can also use WebFaction, which is about $10 a month. What’s awesome is they do a lot of the server setup for you. And you won’t really have to do much server setup, which I explain in a few grafs. So WebFaction is an option, and so are many others.

My Django tutorials walk through some of the software you need to install if you’re making a Django-y project, but you basically need web server software for your server, no matter what you’re coding. Apache is the standard one many people use. It’s open source and nice. I’m a bit fancier, so I use nginx. Whatever you prefer, it’s just another tool to get the job done. Either one works all right with PHP/MySQL.


This is where you learn about FTP, SFTP and SSHing into your server. FTP and SFTP are two ways of uploading/deleting/changing files on your live server. SFTP is much more secure than FTP, and other news developers will make fun of you if you use FTP. It’s not necessarily wrong, it’s just not as safe. And if you’re learning, there’s no reason not to use FTP for now.

But before that, you need to know how to SSH into your server. This involves using the terminal, which is scary for some folks. It’s all text, with no shiny buttons to click. Yet this is the most powerful part of your computer. If you’re using Windows, download PuTTY and scroll down in this link. If you’re using Apple products, then search for Terminal, open it and follow the instructions sort of here. If you’re using Ubuntu, why are you reading this?

Before you can FTP, you need to set up software on the server to let you FTP in. Here’s a quick walk-through of installing vsftpd, which is FTP software. Boom. Now you can FTP into shit. You’re welcome.


Frameworks are sweet. They make it easier to quickly develop for the web. I use Django, like I previously said. If you want to get the intense walk-through of setting up Django (INCLUDING LEARNING ABOUT SERVERS!), you can go through my almost-finished tutorial starting here. You can also go through the Django creators’ own tutorial here.

If you want to learn more about Ruby (but not making websites), the wonderful Dan Nguyen has some great walk-throughs here. This could be a good Rails walk-through. Who knows.

So that’s some basic info on frameworks. If you want more, use The Googles.


If you’re going to set up a website, you need to learn how to buy a domain. My pal Emily Ingram walks you through how to use GoDaddy.com to buy them at this page. That’s what I use, but I would definitely suggest you do NOT use them for hosting. Do not do not do not. You can use other service providers to buy .coms, but hey, it’s quick and easy. (Edit: You should use Namecheap.com instead of the previously mentioned GoDaddy. Just do it.)

Just remember to set your .com’s DNS settings to point toward your server, which is stuff you get to learn on your own, because I am getting tired from explaining the entire internet to you, my dearest friend.
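Once you have changed the DNS settings, you can check whether they took with a few lines of Python. This is only a sketch: points_at is a made-up helper, and the domain and IP address in the usage comment are placeholders you would swap for your own. It leans on socket.gethostbyname from the standard library, which asks your computer’s resolver which IPv4 address a name currently maps to.

```python
import socket


def points_at(domain, server_ip, resolve=socket.gethostbyname):
    """Return True if the domain currently resolves to the given IPv4 address."""
    try:
        return resolve(domain) == server_ip
    except socket.gaierror:
        # The name could not be resolved at all (typo, or DNS not set up yet).
        return False


# Usage, with placeholder values (replace both with your own):
#   points_at("example.com", "203.0.113.10")
```

DNS changes can take a while to spread, so a False right after you save the settings does not necessarily mean you did it wrong. The resolve parameter is only there so the function can be tried without a network connection.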


Now, you may have learned that HTML/CSS stuff. But guess what? There’s more! JavaScript and Ajax are neat.

Just so you’re aware, lots of places are looking for people to be kick-ass at JavaScript and CSS/HTML. They call this front-end development, because it mostly deals with stuff that happens on the client side, aka in your browser. My main focus, and all that stuff about PHP and Django and servers and whatnot, is back-end development.

JavaScript, and its widely used library jQuery, are what lots of the cool internets are built out of. When you click on something on a webpage, and it moves, or changes color, or slides around, usually that’s JavaScript/jQuery. These are things totally worth learning. Basic JavaScript tutorials can be found here. Here’s the page for jQuery, which also includes tutorials. Here’s even more in-depth stuff for jQuery. A good JavaScript book I’ve been going through is Head First JavaScript.

This includes learning about stuff like Ajax, which basically allows you to reload stuff on the page by hitting the server without having to actually load a new page. Think about how when you’re on Facebook and it pops up, saying your Aunt Glady Crapplebottom just liked that photo of those new shoes you bought. That’s Ajax. Learn about some of it here.
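The browser half of an Ajax exchange is JavaScript, but there is a server half too, and that part can be sketched in Python. Everything below is hypothetical: the event list, the field names and the notifications_since function are my own inventions. The idea is that the page quietly asks “anything new since item 1?” and the server answers with a small chunk of JSON instead of a whole new page.

```python
import json


def notifications_since(events, last_seen_id):
    """Return a JSON string holding only the events newer than last_seen_id."""
    fresh = [event for event in events if event["id"] > last_seen_id]
    return json.dumps({"notifications": fresh})


# Made-up activity that a real site would pull from its database.
events = [
    {"id": 1, "text": "Someone commented on your post"},
    {"id": 2, "text": "Your aunt liked your photo of new shoes"},
]

# The page already showed event 1, so only event 2 comes back.
reply = notifications_since(events, last_seen_id=1)
print(reply)
```

The JavaScript on the page would read that JSON and pop up the notification, all without reloading anything.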

These are skills I need to get better at, as they allow you to make awesome web graphics. People who are good at this stuff make me envious, because it’s like they can somehow do magic. Wouldn’t you like to know magic? Yes you would. So figure it out, folks.


Okay. This was nowhere near an exhaustive list. I apologize. I didn’t realize what a monumental task this would be. I know I’ve forgotten stuff, so hopefully nice/mean people will write in the comments about other sites, and I’ll update this post with their wonderful suggestions.

If you’re going into this, it’s also good to keep up to speed on the goings-on in the tech world. You can do that by checking Hacker News every once in a while. It’s a pretty good site. Another good place to look for answers to questions is Stack Overflow.

Get an account at GitHub and start looking at other people’s code. That’ll be the best way to learn, honestly. And odds are if you’re going to make a project, someone else has already made something similar and they’ve put up free code for you to use. So search for it, figure out how someone else made the similar project and incorporate it into your own.

Join the NICAR listserv. This is the National Institute for Computer-Assisted Reporting. Lots of news developers lurk on the NICAR-L, as we call it. And you get to join the Investigative Reporters and Editors, or IRE, which is also a great organization that you should already be a member of. Why aren’t you? JOIN TODAY.

Also, the most important thing you need to learn is that someone has probably had the same error or problem you’re having while building something. So either you can read the official docs — something I am known to not do and then Derek Willis yells at me — or just Google the error. Sometimes people post similar questions/answers on various message boards or Stack Overflow. If all else fails, tweet at some of us news developers on Twitter.

Most of us learned what we learned through the helpfulness of others, so we’re glad to help others learn more.


First off, crack open a beer, because you’re probably thinking OMGWTFLOL THAT IS SO MUCH CRAP. Well, yeah. Making the internet is kind of hard. Hell, it is hard. But it’s a little less emotionally draining than most traditional journalism jobs. And the first time you get something to work, I guarantee you will throw your arms up and shout, “YES!”

Every time I make something work, I feel as though I have reinvented fire. It’s a great feeling, and it’s especially great when you can show someone else the tools of the trade. So if you decide to go down this path, please be vocal about what you’re working on. Write about it. Explain how you did stuff. Be social.

So here’s what you do. Come up with a project. It can be as simple as making a website that tracks the movies or books you own. (FYI, I already made that. See here.) You can even go more advanced, perhaps making something that scrapes all the legislative votes in your state. Hell, go even super easy: Just learn how to use Google Fusion Tables to make something like this.

The point is, go out and make something. Then let us know how you did it. If you get stuck, ask us for help. We’d be glad to see you kick some ass.