
Open document formats

You probably would not buy a car if its manufacturer could shut it off after a few years
unless you paid $10,000 or more for it again.

You probably would not buy a fridge that stops working after a few years just because the
manufacturer decided it is obsolete. There is nothing wrong with the fridge; it simply stops
working after 5 years so that the manufacturer can force you to buy another one.
Ripoff? Yes.

To take this a step further, you would not want to buy a book, only to find it unreadable
in 10 years because the ink dissolved.

Then you should think about your digital documents the same way.
Why should your documents be at the whim of Microsoft, Adobe, Apple, or some other company?
They do their best to tie you to their products, so that you have to keep buying
their software to open your own documents!

Microsoft, with its .doc and .docx formats, and Adobe, with its .pdf, are the worst but not
the only offenders.

Both keep adding more "features" to their software, and they keep changing the document formats
"under the hood". Anyone trying to implement readers for those formats is caught in a
never-ending uphill race.

Muddling of content and presentation

Another cardinal sin is mixing content (text, images) and presentation
(font sizes, colors, ...) in a single binary file (Word, Adobe, ...).

Those should ideally be kept separate and clean, as they are with plain text or HTML files
(.txt, .html) for content and .css files for presentation.

Solutions

Use open document formats, which are well standardized, documented, and supported by many
software packages.
This way you can write your documents today and be confident that you will be able
to open them in 10, 30 or 100 years.

Use plain text for writing, HTML for publishing, and open image formats for images.
Open formats exist for video as well.

List of open file formats.

Some examples

Some text from The Bible:


1:1 In the beginning God created the heaven and the earth.

1:2 And the earth was without form, and void; and darkness was upon
the face of the deep. And the Spirit of God moved upon the face of the
waters.


This text can easily be created in any plain text editor, such as Notepad or Notepad++ on Windows,
gedit, Vim or Emacs on Linux/BSD, or Atom, Brackets or Vim on Mac.

It can be opened in hundreds of different text editors and other programs, and it will always
be an open format.

Sure, sure ... but how do I publish this?

On the web, it's very easy. You don't even have to convert it to HTML.
Plain text is fine. Look at this example ...

Entire Bible on the web, as plain text.
That's right: if you just want to publish text on the web, you can simply upload it
as plain text. You don't need HTML at all.

But plain text is so 19th century ... I need html for links, pictures, etc.

Not a problem.

HTML is easy to create. 
You just need a header, a footer, and off you go.
Header:

<!DOCTYPE html>
<html>
<head>
<title>Html template</title>
</head>
<body>

And footer:

</body>
</html>

If you want to preserve the formatting of a text file, just enclose the text
in <pre> and </pre> tags. The browser then keeps the line breaks and spacing of the plain text.

So your entire HTML web page can be as simple as:

<!DOCTYPE html>
<html>
<head>
<title>Your title here!</title>
</head>
<body>

<pre>
The First Book of Moses:  Called Genesis

1:1 In the beginning God created the heaven and the earth.

...
</pre>

</body>
</html>
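
If you have many text files, you don't have to paste the header and footer by hand. Here is a minimal Python sketch of the same idea. The function name text_to_html and the sample text are made up for illustration. One thing it does beyond the template above: it escapes the characters &, < and >, because a browser would otherwise try to read them as markup even inside pre tags.

```python
import html


def text_to_html(text: str, title: str = "Your title here!") -> str:
    """Wrap plain text in a minimal HTML page, keeping its layout with <pre>."""
    # Escape &, < and > so the browser displays them instead of parsing them.
    body = html.escape(text, quote=False)
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        "<head>\n"
        f"<title>{html.escape(title)}</title>\n"
        "</head>\n"
        "<body>\n"
        "<pre>\n"
        f"{body}\n"
        "</pre>\n"
        "</body>\n"
        "</html>\n"
    )


# Hypothetical sample text; the last line shows why escaping matters.
sample = "1:1 In the beginning God created the heaven and the earth.\n\na < b & c"
page = text_to_html(sample, "Genesis")
print(page)
```

Save the printed output with an .html extension and open it in any browser. The text file itself stays untouched, so the plain text remains your master copy.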

From there it's easy to add links, images, and whatever else HTML offers, if you need them.
For a great HTML tutorial, visit:

w3schools HTML tutorial.



Problems? Mental blocks? Denial?

But, but ... I need an HTML editor to write HTML? I need to buy that!

No, please don't.

To write HTML you just need a plain text editor. 
This will give you the best understanding of HTML and document structure in general.

However, if you like, you can use any of dozens of free HTML editors.
Be forewarned, though: what they produce is often ugly and confusing when opened in
a plain text editor later, and cleaning up that mess is quite difficult.

But text cannot be used for tables, data, etc! Touché!

Yes it can, and it is.
Excel, Gnumeric, LibreOffice and other spreadsheet files are often converted to the plain text
.csv or .tsv formats for exchange with other people and programs.

What's more, plain text is easy to read in practically any programming language, which makes
data manipulation, statistics, etc. straightforward.
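
To give a feel for how little code that takes, here is a small sketch using only Python's standard library. The city and temperature values are invented for illustration. In practice the text would come from a .csv file rather than from a string in the script.

```python
import csv
import io
import statistics

# Hypothetical data; in practice this would be the contents of a .csv file.
data = """city,temperature
Oslo,4.5
Cairo,28.0
Lima,18.5
"""

# DictReader uses the first line as column names.
rows = list(csv.DictReader(io.StringIO(data)))

# Values arrive as text, so convert them to numbers before doing statistics.
temperatures = [float(row["temperature"]) for row in rows]
mean_temp = statistics.mean(temperatures)
warmest = max(rows, key=lambda row: float(row["temperature"]))["city"]

print("mean temperature:", mean_temp)
print("warmest city:", warmest)
```

No spreadsheet program is involved at any point: the data is text, and the program that reads it is text too.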

There are two popular plain-text data formats.
The first one (.csv) uses commas to separate values,
while the second one (.tsv) uses tabs, which is useful if your data contains commas.
Comma-separated values (.csv), Tab-separated values (.tsv)
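
The difference shows up as soon as a value itself contains a comma. The sketch below writes the same made-up record both ways with Python's standard csv module. In the .csv line the field has to be wrapped in quotes, while the .tsv line needs no quoting at all.

```python
import csv
import io

# Hypothetical record: the first field contains a comma.
row = ["Doe, Jane", "Springfield", "42"]

# Write the record as one comma-separated line.
csv_buf = io.StringIO()
csv.writer(csv_buf, lineterminator="\n").writerow(row)
csv_line = csv_buf.getvalue()

# Write the same record as one tab-separated line.
tsv_buf = io.StringIO()
csv.writer(tsv_buf, delimiter="\t", lineterminator="\n").writerow(row)
tsv_line = tsv_buf.getvalue()

# Reading each line back with the matching delimiter recovers the fields.
csv_back = next(csv.reader(io.StringIO(csv_line)))
tsv_back = next(csv.reader(io.StringIO(tsv_line), delimiter="\t"))

print(repr(csv_line))
print(repr(tsv_line))
```

Both lines carry the same information. Tabs simply avoid the quoting, which makes .tsv a little friendlier to simple tools that split lines on the separator.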

Surely, programs don't use plain text?

Computer programs are almost exclusively written in plain text, understandable to humans.
To be understandable to machines, they are then sometimes converted (compiled) into machine code,
that is, executable files.

This is optional, however: interpreters (which are themselves computer programs) often read
and execute programs stored as plain text.

But I want to change the looks (presentation) of my documents!

Not a problem. 
HTML separates content (HTML/text) from presentation (looks).

To change the looks of one document, or of all of them, you just need to create a single plain
text file, a so-called CSS file, and the looks change immediately.

You can read more about it at the first link below, and find a great tutorial at the second:
Cascading Style Sheets (CSS), w3schools CSS tutorial.

You need math formulas, ancient Greek, Sanskrit?

Not a problem. LaTeX has you covered: LaTeX.

Now you are some sort of plain text prophet, right?

No, but I have come to appreciate the simplicity, ease of use and freedom from encumbrance
that plain text files offer.

Furthermore, smart people saw that long before me, so they created entire systems where most
operations were done on text files.
Binary files were reserved, mostly, for machine code that the computer can execute directly.

This led to the early UNIX philosophy, where everything is a file, and to a suite of small programs.
Each program performs one limited operation on a file, but the programs can be connected together
to achieve truly astounding results with plain text files.

Searching, extraction, sorting, word/letter replacements and filtering of any kind became possible.
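
To illustrate the idea of small filters connected together, here is a rough sketch in Python. The functions grep, cut and replace are hypothetical helpers, loosely named after UNIX tools, and are not the real programs. Each one does a single simple job on lines of text, and the output of one feeds the next.

```python
import re
from typing import Iterable, Iterator


def grep(pattern: str, lines: Iterable[str]) -> Iterator[str]:
    """Searching: keep only the lines that match a regular expression."""
    regex = re.compile(pattern)
    return (line for line in lines if regex.search(line))


def cut(field: int, lines: Iterable[str], sep: str = "\t") -> Iterator[str]:
    """Extraction: take one separator-delimited field from each line."""
    return (line.split(sep)[field] for line in lines)


def replace(old: str, new: str, lines: Iterable[str]) -> Iterator[str]:
    """Replacement: swap one piece of text for another on every line."""
    return (line.replace(old, new) for line in lines)


# Hypothetical tab-separated data: a fruit name and its color.
text = "apple\tred\nbanana\tyellow\ncherry\tred\n"
lines = text.splitlines()

# Connect the filters: find red fruits, keep the name, change e to E,
# then sort the result in reverse alphabetical order.
result = sorted(replace("e", "E", cut(0, grep(r"red$", lines))), reverse=True)
print(result)
```

None of the three functions knows anything about the others. That is the whole trick: because everything is plain text, any filter can be placed after any other.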

Today, those tools/filters (small programs) are in use in modern UNIX-like systems, namely Linux and the BSDs.
Their importance is such that even other operating systems (macOS, etc.) adopted them.
Text filters in UNIX/Linux/BSD. Everything is a file.

You value freedom?

Freedom from proprietary file formats liberates your documents, as we saw above.
You don't need to pay anyone to access or change your documents.

In the same manner, you should consider taking this a step further and start using
free and open source software, and even free and open source operating systems, like
Linux, BSD, FreeDOS, etc.

Remember, it's important for closed-source companies to keep you tied to their systems.
It's up to you whether you will decide to break free or not, as always.

Freedom requires an educated population, and free software requires some learning as well.
But, as with any education, the results are worth it.
