code:mac


Paperless Nirvana!

On a single night recently, I read an article that explains how to set up the swish-e XML document indexer, Kevin Cupp's review of the Drobo, a ready-made RAID box, and a linux.com article about turning a networked computer into a PDF-making machine. You can see where my mind was headed.

I've decided it would be an awesome thing to have a "paperless" office. Now, I'm not going to throw out my printer or anything silly like that. A digital filing cabinet just seems more useful than a paper one: easier to search, easier to back up, and reachable from anywhere. So what do I plan to do? Well, first there is the question of what I actually need out of this project.

Storage

I plan to store everything on RAID 1, 5, or maybe even 6. I plan to use these drives ONLY for document storage. No music, no movies. I hope to have a separate RAID 1 setup with MythTV for my music and movies. More on the DVR project later...
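To get a feel for the trade-off between those RAID levels, here is a rough sketch of the usable space each one leaves. It assumes equal-sized disks and ignores filesystem overhead; the function name and the example numbers are my own, not from any particular tool.

```python
def usable_capacity(level, disks, disk_size_tb):
    """Rough usable capacity in TB for an array of equal-sized disks.

    RAID 1 mirrors every disk, so only one disk's worth is usable.
    RAID 5 spends one disk's worth on parity, RAID 6 spends two.
    """
    minimum_disks = {1: 2, 5: 3, 6: 4}
    if level not in minimum_disks:
        raise ValueError("only RAID 1, 5 and 6 are handled in this sketch")
    if disks < minimum_disks[level]:
        raise ValueError(f"RAID {level} needs at least {minimum_disks[level]} disks")
    if level == 1:
        return disk_size_tb
    parity_disks = 1 if level == 5 else 2
    return (disks - parity_disks) * disk_size_tb


# Hypothetical example: four 1 TB drives.
for level in (1, 5, 6):
    print(f"RAID {level}: {usable_capacity(level, 4, 1.0)} TB usable")
```

With the same four drives, RAID 5 keeps the most space and survives one failed disk, RAID 6 gives up another disk's worth to survive two, and RAID 1 keeps the least.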

Physical Input

I need a way to get physical bills, letters, etc. into digital storage. The solution is obviously a scanner of some kind. I might need two scanners though: a flatbed for things that cannot go through a sheet feeder, and a sheet-fed one for letter/A4-sized papers. Suggestions for either kind of scanner would be great.

Physical Output

Getting things out of digital storage into the physical world is just as important. I have a printer, so problem solved. I imagine I would share this printer on the network so any computer can print to it.

Digital Input

Setting up CUPS-PDF seems like the obvious choice. I could hit print on any page I want and just choose the CUPS-PDF printer. Then I could have a process that runs every 5 minutes, looks at recently created documents, indexes them, and does any cleaning I deem necessary.
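The "every 5 minutes" process could start out as something like the sketch below: a small script that lists the PDFs that showed up since the last run. The spool path is a placeholder I made up, and the actual indexing and cleaning are left as a print statement, so treat this as a starting point rather than a finished job.

```python
import time
from pathlib import Path


def recent_pdfs(spool_dir, max_age_seconds=300, now=None):
    """Return the PDFs in spool_dir modified within the last max_age_seconds."""
    now = time.time() if now is None else now
    cutoff = now - max_age_seconds
    return sorted(
        path
        for path in Path(spool_dir).glob("*.pdf")
        if path.is_file() and path.stat().st_mtime >= cutoff
    )


if __name__ == "__main__":
    # Placeholder path: point this at wherever CUPS-PDF writes its output.
    for pdf in recent_pdfs("/path/to/cups-pdf/output"):
        print(f"would index and clean: {pdf}")
```

Run from cron with a schedule of `*/5 * * * *`, the 300-second window lines up with the 5-minute interval. A file that lands right on the boundary could be seen twice or missed, so a more careful version would remember what it has already processed.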

Digital Output

And finally, what I would consider the most important part of this little project: getting the documents I have stored back out in digital form. I would need remote and local access to my documents, and some way to easily search them.

My ideal interface for this would be a nice web interface that I could log into to access, search, and download any of the files stored on the machine. Then I could use the same interface for these documents no matter the machine or the location. I'd probably have SSH/NFS access set up as well, but that is a simple and no-fun setup.

I still don't have a good idea of the best document indexing service to set up.
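Whatever indexer I end up with, the core idea is an inverted index: a map from each word to the documents that contain it. The toy sketch below shows just that idea. It is not how swish-e or any real indexer works internally, it assumes the text has already been extracted from the PDFs, and the file names are made up.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in documents.items():
        for word in tokenize(text):
            index[word].add(name)
    return index


def search(index, query):
    """Return the documents that contain every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Made-up documents standing in for extracted PDF text.
docs = {
    "electric-march.pdf": "Electric bill for March",
    "water-march.pdf": "Water bill for March",
}
index = build_index(docs)
print(search(index, "electric bill"))
```

A real service adds the parts that make search pleasant, such as stemming, ranking, and incremental updates, which is why I'd rather pick an existing one than grow this.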

Final Thoughts

The more I think about this, the more I can see myself setting up a quick Django/Merb/Thin interface for the server and having it be accessible from anywhere. Most likely it'd end up being Django, so I could just use the admin interface for a quick solution. Ideally I'd be able to upload files from any computer to the document server as well.

This is going to be a fun project, and I'll keep you updated! Email me if you have any suggestions for indexing services, names for the project, or possibly projects that already do most of the things described.