UPDATE: I forgot to mention that this is my first LaTeX text ever, that it’s based on one of my first Python projects, and that I wrote what you find below in about 3 days while dorking around on Vellum and Idiopidae. Since I’m both a Python and LaTeX newb please feel free to school me. Better yet, if you think the typesetting sucks, then show me your samples. See if you can beat this one sent to me by Kashif Rasul. I still consider what he’s created as the bar to get over.

I think I’ve gone insane recently because I’ve been obsessed with making the documentation for Vellum so great angels will cry tears of gold. I’m treating it as a practice book for the big book I’m writing for A/W by writing a fully LaTeX typeset PDF about Vellum using all the tools I’ve created for more modern literate programming.

The big “feature” of the book is insanely well typeset code with color. The typesetting is so good that everyone I’ve shown it to seems shocked by it. I’m not sure if that’s because they think it’s ridiculous to document such a small project that well, or because it’s way better than many other books. I’ll let you people tell me what you think too. I tried to make it fit what many people use for their editor looks, but without making the book hard to read. This turned out to be easy as hell with Idiopidae and Pygments and in the process I learned quite a lot about TeX and LaTeX. The end result is good so far, but I’m looking to improve it even further with:
The Vellum book is a work in progress, but check out how it looks already in PDF form. Definitely look at Appendix A to see how well the source highlighting works using Idiopidae to snarf in live code from the Vellum source.

Vellum and Idiopidae have made creating the book for Vellum incredibly easy. I no longer worry about how the code is included or formatted, and I can access it using logical sections. I just tell Idiopidae the file and section and it makes all the moves. A bit of LaTeX glue to wrap it in something pretty and I’m off.

I can also do some really interesting things with sample outputs and incrementally built source files in the book. When you write a book about code you typically want to start with a simple example of some source file, and then build it up to more complex levels as the prose progresses. Traditionally this meant generating a series of individual files, each with slight modifications. Problem is this means maintaining all those little files for just one chapter. With Idiopidae I can actually put multiple files into one larger file, and put the code snippets for the progressive discussion into different sections. I hadn’t thought of this until I did it yesterday, so I may try to formalize this in Idiopidae sometime this weekend.

There are a few things I want to add to Idiopidae which will probably involve a rewrite. If you use it, don’t get too attached to the syntax. What I’m thinking is that I’d like a feature to run commands, clean the output up, and then inject that into the book. I’d also like Idiopidae to keep track of md5sums for all the sections it includes. If the md5sum of a section changes then it’ll report an error, assuming that you have to change the text around that section to match the code. After you change it, there’ll be a command line option to tell Idiopidae to update the md5sums. I think this “md5sum tracking” feature could really help keep books about code in sync with the code.
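The md5sum tracking idea is simple enough to sketch in a few lines of Python. This is only an illustration of the concept, not how Idiopidae does or will do it: the `### @export name` / `### @end` marker comments, the function names, and the sample source are all made up for the example.

```python
import hashlib
import re

# Hypothetical marker syntax (NOT Idiopidae's real one): a comment line like
# "### @export greet" opens a section and "### @end" closes it.
SECTION_START = re.compile(r"^\s*### @export (\w+)\s*$")
SECTION_END = re.compile(r"^\s*### @end\s*$")


def extract_sections(source):
    """Return {section_name: section_text} for every marked section."""
    sections = {}
    current = None
    for line in source.splitlines():
        start = SECTION_START.match(line)
        if start:
            current = start.group(1)
            sections[current] = []
        elif SECTION_END.match(line):
            current = None
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(lines) for name, lines in sections.items()}


def checksums(sections):
    """Map each section name to the md5 hex digest of its text."""
    return {
        name: hashlib.md5(text.encode("utf-8")).hexdigest()
        for name, text in sections.items()
    }


def changed_sections(recorded, sections):
    """Names of sections whose current md5 differs from the recorded one."""
    current = checksums(sections)
    return sorted(
        name for name, digest in current.items()
        if recorded.get(name) != digest
    )


# Made-up sample source: two sections living in one file.
OLD_SOURCE = """\
### @export greet
def greet(name):
    return "hello " + name
### @end

### @export main
print(greet("world"))
### @end
"""

# Simulate someone editing the code after the prose was written.
NEW_SOURCE = OLD_SOURCE.replace('"hello "', '"hi "')

recorded = checksums(extract_sections(OLD_SOURCE))
stale = changed_sections(recorded, extract_sections(NEW_SOURCE))
print(stale)  # the sections whose surrounding prose needs another look
```

A real tool would store `recorded` in a file next to the book and refresh it only when told to, which is the job the command line option would do.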
The assumption is, if the code changes, then the prose probably has to change as well.

Another feature I want in Idiopidae is to use the Pygments parser or ctags on the files so that it can find logical sections based on the already structured source. If all you need to do is include a series of functions, then having to sprinkle the source with export comments that match the function names is stupid.

As I think about these features though, what I find is a lot of overlap between how Idiopidae works at building books and how Vellum automates building anything. It might be worth looking at Idiopidae reusing Vellum’s parser and build engine so that you can write a nice little description of all the pieces that come together to make a book. It would be a higher level description of the book, designed to also embed Vellum build tasks inside import statements so that you can run code and have the output injected.

Check out how the book looks so far (grammar errors and spelling included) and let me know what you think. My plan is to wrap all these little tools up into one nice tool that helps programmers write better documentation, or at least helps them find someone who can write it without much fuss.

Knuth On Literate Programming

The ideas in Idiopidae are nothing new, as Donald Knuth has been actively promoting Literate Programming for over two decades. The main problem with LP though is its insistence on combining the code and the prose into a single file. There are serious technical and social issues with this design choice, but in this very good interview with Knuth he mentions a quote that summarizes things quite well:

“Jon Bentley probably hit the nail on the head when he once was asked why literate programming hasn’t taken the whole world by storm. He observed that a small percentage of the world’s population is good at programming, and a small percentage is good at writing; apparently I am asking everybody to be in both subsets.”
However, this doesn’t cover the absolute hardest technical problem with combining code and prose in the same source file: syntax escape overload. When you use most LP systems you’re taking the source of two whole languages with completely different lexical and semantic meaning, and then trying to cram them together. This means juggling in your head two different syntax structures and how to merge them together with escape sequences. The technical difficulty of this probably borders on a grammar that would rival Perl’s or Ruby’s.

This is a major technical hurdle for both the people implementing such systems and for the users. As an implementer, I have to create a parser or some hack that can accurately handle both languages. I also have to figure out how the escape sequences are given to the users, and how they can encode elements of one language in the other (say for giving constants a bold font). This is a huge task, so it’s no wonder all the literate tools out there also do a shitty job at finally typesetting the resulting code.

The users, for their part, have to use their brains in ways that don’t make sense for either a programmer or a writer. A programmer has to work in terms of structure and algorithmics, with the hardest problem on many projects being finding things of interest. A writer has to work on the creativity of the prose, entertainment, whether it reads so that it makes sense, and how best to keep the reader engaged. When I code I don’t give a damn if the reader is engaged, I care if he can understand what’s written, but I’ll be damned if I’m supposed to entertain some dork who’s paid to know this stuff.

These flaws then extend to every other aspect of computer science. When people do LP they typically keep all the code and prose in one giant file, maybe a few medium sized ones if you’re lucky. This is a nightmare for revision control since people on a team will constantly be conflicting with other team members.
This makes it hard to break out the work, share it with others, hell, even find where something is in the source. Can you imagine trying to grep for a normal C idiom but having to modify what you look for to handle TeX escaping just in case it shows up in a spot that’s inside TeX?

In the end, LP seems to have missed out on the biggest advancement in software development since it started: compilers. A compiler no longer expects you to keep your source in one file, but instead has insanely good (or in C’s case, okey-dokey) support for inclusion, exporting, combination, and sharing of code. Modularity and encapsulation into libraries, modules, objects, and functions have all been supported by compilers since they started being used to create larger projects. Yet LP almost entirely misses this and does an “inverse compiler”: you write the final combined document and then it writes what you’d normally include in the document. Insane.

It makes much more sense and works a hell of a lot better with modern tools (“modern” being after 1980) if the way programmers do Literate Programming works the way they do regular programming.

But, TeX is still pretty damn sweet.