The question above is the only question on the Google Labs Aptitude Test that relates to an operating system. It’s a good question: it immediately gets rid of fanboys who can’t see anything wrong with any tool they love, and allows people who are passionate and knowledgeable enough about Unix to demonstrate that they can either:
- see Unix’s faults
- tell Google they prefer Unix in uppercase, just like MULTICS was.
This is part one of a multipart series, published over the next two weeks. This week: text processing.
Part I: Text Processing
There are plenty of Unix gurus who argue for separating content from presentation when they recommend structured formats (like TeX) over Microsoft Word. The same argument applies to the shell.
- Commands don’t return structured information. They return text marked as either output or errors. Errors are accompanied by error codes to help determine the type of error. There is a vague convention a few tools follow of classifying output into warnings and information using [WW] and [II], but it’s not particularly popular. There is no standard field separator. To find particular data in your output, you’re forced to think about where that data sits in the output. For example, getting interface names and IP addresses out of ifconfig:
  ifconfig | egrep -o '^[a-z0-9]{1,12}|inet addr:[0-9.]+'
- Each new tool has its own config file format, which users must learn and software must parse. This places an additional burden on tool developers, who often do horrible things like store configuration in their own files, causing data loss when files modified by other methods are overwritten. Red Hat did this a lot – system-config-named would overwrite BIND’s own configuration files (does anyone know if Red Hat still does this?). Yes, it was terrible, especially from a company that ships and is responsible for both the config tools and the app. But it’s quite hard to create a new parser for every new file format.
- Adding information breaks parsers. For example, it’d be handy for ifconfig to show an ethernet card’s link status and speed, or the child cards used by bonded interfaces. Yet adding this information can break existing scripts that rely on the current presentation – so you have one tool to check the current IP address and the transmit and receive counters, etc. and another tool to check the link status and speed. It’s terribly inefficient.
- Text editors don’t handle structured information. This is why people complain when a config file is in XML. Yes, it’s horrible to edit, but that’s because vi isn’t built for XML. vi has keyboard macros to delete characters, words, lines and paragraphs. You can’t delete a Samba share declaration, or an Apache virtual host. You can’t copy a share declaration to another share declaration. Or skip to the next tag-enclosed value. Instead, you must treat these items as characters, paragraphs, or strings, rather than as the objects they are.
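The fragility described in the list above is easy to reproduce. The sketch below runs the pattern from the ifconfig example over two hard-coded samples: one shaped like classic ifconfig output, and one shaped like the layout that later net-tools releases print. Both samples are illustrative text typed in for this example, not captured from a real machine, and `extract_names_and_addrs` is a hypothetical helper name.

```shell
#!/bin/sh
# Sample text only: shaped like classic ifconfig output (not captured live).
old_style='eth0      Link encap:Ethernet  HWaddr 00:11:22:33:44:55
          inet addr:192.168.1.10  Bcast:192.168.1.255  Mask:255.255.255.0'

# Sample text only: shaped like the layout later net-tools releases print.
new_style='eth0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.1.10  netmask 255.255.255.0  broadcast 192.168.1.255'

# Hypothetical helper wrapping the pattern from the article.
# grep -E is the current spelling of egrep.
extract_names_and_addrs() {
    grep -E -o '^[a-z0-9]{1,12}|inet addr:[0-9.]+'
}

echo "old layout:"
printf '%s\n' "$old_style" | extract_names_and_addrs

echo "new layout:"
printf '%s\n' "$new_style" | extract_names_and_addrs
```

On the old layout the pattern finds both the interface name and the address. On the new layout the address line no longer contains the literal text "inet addr:", so only the name comes back, and nothing reports an error. The script that depended on the presentation simply goes quiet.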
So how to fix this?
- Allow users to fetch particular information from output and config files by specifying what they want, rather than its location. So:
  ifconfig | egrep -o '^[a-z0-9]{1,12}|inet addr:[0-9.]+'
  becomes:
  interfaces | filter name ipaddress
- A smart shell can complete ‘name’ and ‘ipaddress’ after a few characters, since it knows all the available fields that ‘interfaces’ returns. This is similar to the object pipelining that Perl 6, PowerShell, and Hotwire allow.
- Allow output to work as input. So I can save the current interface settings of a system to a config file, or another system.
- Allow editing tools to treat content as content – adding, copying, and deleting sections, variables and values, rather than treating them as a series of characters, lines or paragraphs. Sure, deleting paragraphs makes vi nice and fast. But not as fast as if you could delete rows, columns, and sections.
- Take advantage of modern displays through user configurable stylesheets. Allow proper highlighting of errors and warnings, or have intermediate records shown in different colors, or have headings shown as headings. This is something Hotwire already does.
- Allow better reporting (config to documentation) or building (turning documentation to config) by using XML as the base format. This doesn’t mean you ever have to see any actual XML code – remember, editing tools should treat content as content – see GConf and DConf for examples. It should be easy to build a machine from a word processing document specifying the machine’s configuration. Or build a simple document from a host’s configuration the next time someone asks for a report of some kind.
- Mandatory documentation of all settings as part of the schema. This allows all kinds of exciting possibilities, such as contextual help – see a definition of what you’re editing. GConf and the upcoming DConf do this too.
- A hierarchical structure, for simple organization and other exciting concepts like settings mounts, to allow easy sharing of configuration between machines – again, see GConf.
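The first few proposals above (ask for fields by name, and let output work as input) can be approximated even in a plain POSIX shell. What follows is only a rough sketch under stated assumptions: `interfaces` and `filter` are hypothetical commands that do not exist on any real system, the records are canned sample data, and the space-separated key=value record format is invented for this example rather than taken from PowerShell, Hotwire, or GConf.

```shell
#!/bin/sh
# Hypothetical stand-in for the proposed `interfaces` command. It prints one
# record per line as space-separated key=value pairs. The data is canned.
interfaces() {
    printf '%s\n' \
        'name=eth0 ipaddress=192.168.1.10 link=up speed=1000' \
        'name=eth1 ipaddress=10.0.0.5 link=down speed=100'
}

# Hypothetical `filter`: keep only the named fields, in the order requested.
# Fields are looked up by name, so their position in the record is irrelevant.
filter() {
    awk -v wanted="$*" '
        BEGIN { n = split(wanted, want, " ") }
        {
            split("", rec)                      # clear the previous record
            for (i = 1; i <= NF; i++) {
                eq = index($i, "=")
                if (eq > 0)
                    rec[substr($i, 1, eq - 1)] = substr($i, eq + 1)
            }
            line = ""
            sep = ""
            for (j = 1; j <= n; j++) {
                if (want[j] in rec) {
                    line = line sep want[j] "=" rec[want[j]]
                    sep = " "
                }
            }
            print line
        }'
}

# Ask for what you want, not where it is.
interfaces | filter name ipaddress
```

Because `filter` emits the same record format it reads, its output can be fed straight into another `filter`, or saved to a file and read back later. The `link` and `speed` fields in the sample records are simply ignored by a consumer that asks only for `name` and `ipaddress`, which is the property the ifconfig scripts above lack.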
That’s it for this week. Have your own thoughts on designing a better shell? Played with Perl 6, PowerShell, or Hotwire? What do you think of GConf? Comments are below as usual.