How to Create a Static Archive of Old Community Features You No Longer Use

Posted by Patrick on April 23rd, 2015 in Resources

Until recently, at KarateForums.com, we had an old photo album feature. I say old because no one used it; it was just there. About 13 years ago, we launched the photo album, and people used it. But as time went on, they used it less and less until 2012, when they stopped using it altogether.

What happened? People adopted the most practical way to share photos: posting them in the forums. More people would see them, and you could have a better discussion about them. It only made sense. I embraced that idea long ago and stopped actively promoting the photo album.

And yet, it was still there, which isn’t really a good thing. Even if I removed all references to it, the fact that it was still online, powered by old PHP code and connecting to our database, added to the likelihood of a security issue. At the same time, I didn’t want to just delete the album completely, as if it had never been there.

HTTrack

I turned to HTTrack, which is free and open source. The software allows you to download an offline copy of a website (or a portion of a website). I have used it several times before. For example, if you visit my old Bad Boy Entertainment blog, it might look like a live, database-driven blog. But it’s not. It’s a static archive. Just HTML, CSS, images and JavaScript.

I did the same thing with the photo album and now we have this archive. The archive is standalone. I could delete the rest of the website, including the databases, and the archive would still work.

HTTrack may seem a bit overwhelming at first (it does a lot of things), but I’d recommend taking your time and maybe testing it on a small blog you run or a similar project. There are filters you can put in place to only archive certain links, certain types of files, etc. It’s really powerful. When I ran it on the album, I set it to archive just the album files and to ignore all other files.
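To make the filter idea concrete, here is a minimal sketch in Python that assembles a command line for the httrack client. It is not the exact setup used for the album: the site address, the output folder and the helper name build_httrack_command are all made up for illustration. It relies only on HTTrack's documented -O (output path) option and its +pattern / -pattern scan rules.

```python
import shlex


def build_httrack_command(start_url, output_dir, include=(), exclude=()):
    """Return an argument list for the httrack command-line client."""
    command = ["httrack", start_url, "-O", output_dir]
    # HTTrack scan rules: "-pattern" excludes matching URLs,
    # "+pattern" includes them. Excludes are listed first here so the
    # more specific includes come after the broad exclude.
    command += [f"-{pattern}" for pattern in exclude]
    command += [f"+{pattern}" for pattern in include]
    return command


# Hypothetical site and paths, for illustration only.
command = build_httrack_command(
    "https://www.example.com/album/",
    "./album-archive",
    include=["www.example.com/album/*"],
    exclude=["*"],
)

# Print a shell-safe version you can review before running it yourself.
print(shlex.join(command))
```

The sketch only prints the command. Reviewing it before you run anything is a cheap way to catch a filter that is too broad or too narrow.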

Fine-Tuning the Archive

I usually need to run it a few times because I’ll forget to do something or will decide I want the archive to be a little cleaner and/or more self-contained. For example, I ran it once and realized I hadn’t told it to download all of the header images on the album. The header images are part of our larger template system, which means they are not hosted within the album folders. I wanted the archive to keep looking the way it does now, even if I changed the design of the community at a later date. To achieve this, I ran HTTrack again with a filter change so that it would download any image used on those pages, even if it was hosted outside the album folders.

When I ran it on Bad Boy Blog, I found areas where I had improperly formatted a link (if you write 10,000 blog posts, there will be a few malformed links). I went back, fixed those, and ran it again. I discovered more issues and fixed them until I had the archive I wanted.
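One way to spot problems like these between runs is to scan the mirrored files for links that point nowhere. The sketch below is an assumption-laden example, not part of HTTrack: it walks a local archive folder, collects href and src values, and reports relative links whose target file is missing. The function and class names are hypothetical, and it skips absolute and root-relative links entirely.

```python
from html.parser import HTMLParser
from pathlib import Path
from urllib.parse import unquote, urlsplit


class LinkCollector(HTMLParser):
    """Collect href and src attribute values from one HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in ("href", "src") and value:
                self.links.append(value)


def find_broken_local_links(archive_root):
    """Return (page, link) pairs whose relative target is missing on disk."""
    root = Path(archive_root)
    broken = []
    for page in sorted(root.rglob("*.html")):
        collector = LinkCollector()
        collector.feed(page.read_text(encoding="utf-8", errors="replace"))
        page_name = page.relative_to(root).as_posix()
        for link in collector.links:
            try:
                parts = urlsplit(link)
            except ValueError:
                # Links that cannot even be parsed are reported as broken.
                broken.append((page_name, link))
                continue
            # Skip absolute URLs (http:, mailto:, //host/...) and
            # same-page anchors such as "#top".
            if parts.scheme or parts.netloc or not parts.path:
                continue
            # Root-relative links depend on where the archive is hosted,
            # so this sketch does not try to resolve them.
            if parts.path.startswith("/"):
                continue
            target = page.parent / unquote(parts.path)
            if not target.exists():
                broken.append((page_name, link))
    return broken
```

A malformed link such as "http//example.com" (missing the colon) has no scheme, so it is treated as a relative path and shows up in the report. Calling find_broken_local_links("./album-archive") on a hypothetical output folder returns a list you can work through before the next run.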

It’s a fine-tuning process, but it’s worth it. And once it’s done, it’s done. You don’t have to worry about it breaking for some technical reason, like your server software being updated to a new version it doesn’t support.

Final Thoughts

After you have the archive you want, you can upload the files to the proper area of your server and they’ll work, because the files link to one another with relative paths rather than to a particular folder or subdomain.
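Before uploading, it can be worth checking that nothing in the archive still points back at the live site. The following is a rough sketch under stated assumptions: the function name, the folder and the host name are hypothetical, and it does a plain text search, so it flags any mention of the host, including ones inside comments or visible text. Treat the result as a list to review by hand, not as a list of errors.

```python
from pathlib import Path


def find_live_site_references(archive_root, live_host,
                              suffixes=(".html", ".css", ".js")):
    """Return (file, line_number) pairs for lines that mention live_host."""
    root = Path(archive_root)
    hits = []
    for path in sorted(root.rglob("*")):
        # Only look at text-based files; images and other binaries are skipped.
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if live_host in line:
                hits.append((path.relative_to(root).as_posix(), number))
    return hits
```

For example, find_live_site_references("./album-archive", "www.example.com") would list every line in the hypothetical archive folder that still mentions that host.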

In some cases, it makes sense to merge old, long-forgotten features into your current community and whatever way it handles that functionality now. In others, however, I think archiving like this can make a ton of sense. Of course, you can also use it for a community you have closed, so that you can keep it online without having to worry about keeping the software updated.

HTTrack (and software like it) is a great tool to keep in mind for those scenarios.