Science Library - free educational site

HTML documents

What is HTML?

HyperText Markup Language.

Hypertext: as the name suggests, this is text with a little extra. The 'extra' is the ability to carry instructions and other information within text on a screen. A hyperlink, for example, holds the address of another location on the World Wide Web, together with the instruction to the browser to go there when the link is clicked.

Markup: how the text will be displayed in a browser window. Markup is mainly structure: how the text is broken into sections, paragraphs and other elements, such as headings and images. HTML itself gives only very basic styling capabilities. Styling a page requires CSS (Cascading Style Sheets).

Language: in computing, a distinction is often made between scripting languages and programming languages. Scripts are easier to 'personalise', so they find more application in web development. They are written for use in a specific run-time environment, which interprets the code rather than compiling it in advance. Python is commonly given as an example of a scripting language, and Java as an example of a compiled programming language.

HTML is a markup language, rather than a programming language. It instructs a browser how to lay out text and images on a web page, so the client computer does not need to compile the code (translate it into a lower-level language). HTML and its related developments, XML (Extensible Markup Language) and XHTML, describe the various elements that make up a webpage, such as text, images and audio-visual items, which the browser then interprets and composes. A browser will impose its own defaults, such as colors, typeface (font), borders and underlines for links, but these can be overridden by styling (CSS).

History of HTML

The history of HTML is really also the history of the World Wide Web. Tim Berners-Lee wrote the first version of HTML in 1990, to facilitate the transfer of data at CERN, Switzerland. Together with Dan Connolly, he soon produced HTML 2.0.

Berners-Lee and Connolly went on to found the World Wide Web Consortium (W3C), the Internet's standardising organisation. The W3C soon found itself blue-helmeting between various antagonists, such as Internet Explorer and Netscape, as the browser wars broke out in the mid-1990s. The (debatable) universality of rules and standards that designers enjoy today was largely won by this quasi-United Nations effort to prevent the balkanisation of the Internet.

HTML 4.0

The initial standardisation attempts, HTML 2 and 3, came to the conclusion that the best way forward was to separate the structure and formatting of web documents. The transition was managed by a system of deprecation, in which older version formatting elements still worked, but were destined for later removal from the standards.


The separation of structure and formatting led to the creation of CSS (Cascading Style Sheets) in December 1996. CSS Level 2 was recommended by W3C in May 1998, and took the cascading concept into the realm of site-wide styling, allowing high precision in locating and controlling elements on any and all pages of a site. CSS Level 3 has been under development since 1998, but took until 2011-12 to be refined enough to become the recommended version.


When it was decided in 1998 to stop development of HTML at version 4.01, W3C released an XML version of HTML called XHTML. This was HTML requiring XML syntax compliance. It is at this point that self-closing tags appeared (e.g. <br> became <br />), and attribute values had to be quoted (e.g. class="newClass"). On the assumption that this would become the universal standard, a version called XHTML Strict was issued and, until all browsers accepted it, there was the more forgiving XHTML Transitional.


However, a competing movement did not accept XML/XHTML as the long-term standard, and, to make an interesting story dull, by 2009 W3C stopped development of XHTML 2.0, and HTML5 became the new contender for the pan-browser standard.

HTML5 has the ambition of meeting the requirements of rapidly evolving web applications, a scenario under which sites are not just locations of mainly static documents in standard layouts, but actually online interfaces for worktools, such as photo editing, and complex applications driven by AJAX and similar technologies.

HTML Page Structure

An HTML page is set up as follows:



<html>
<head>

Head declarations and included files go here

</head>
<body>

Page content goes here

</body>
</html>



Note that the entire page is within the html tags, and the page has two main sections: the head and the body, both of which have opening and closing tags.
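The nesting described above can be checked mechanically. The following is only a sketch, using Python's standard html.parser module on a small page skeleton; the title and paragraph text are placeholders invented for the illustration, not taken from any real page.

```python
from html.parser import HTMLParser

# A minimal page skeleton (the title and paragraph are placeholder content).
PAGE = """<html>
<head>
<title>Example page</title>
</head>
<body>
<p>Page content goes here</p>
</body>
</html>"""


class OutlineParser(HTMLParser):
    """Records each opening tag together with its nesting depth."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.outline = []

    def handle_starttag(self, tag, attrs):
        self.outline.append((self.depth, tag))
        self.depth += 1

    def handle_endtag(self, tag):
        self.depth -= 1


parser = OutlineParser()
parser.feed(PAGE)
parser.close()

# Print an indented outline: html contains head and body, side by side.
for depth, tag in parser.outline:
    print("  " * depth + tag)
```

The outline shows html at the outermost level, with head and body as its two direct children.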

Head Section

The head is the section at the top of every HTML page. The opening <head> tag comes after the opening <html> tag. The markup between the opening <head> and closing </head> tags is not displayed in the browser window when the page is loaded. What the user sees in her window is the content between the <body> and </body> tags.

Instead, what appears in the head provides the browser and server with important information about the page it is loading.

The head is a convenient place to put links to external files every page needs, such as CSS and .js files. If the head is put in an external include file, it need only be edited once and every page that calls this include will automatically be updated.

Other items placed in the head include Google Analytics code, meta tags, charset declarations, and author and copyright declarations.

Let us look at each of these in turn:


HTML5 has simplified the doctype declaration. All that is required to declare an HTML page that uses the W3C HTML5 standard is:

<!doctype html>

<html lang=en>


<meta charset=utf-8>

<title>Science Library</title>



Meta tags are placed in the head section, but only the charset and title are compulsory. Here is an example of typical meta and title tags which declare an HTML page:

<meta charset="UTF-8">
<meta name="description" content="Science Education Resources and Open Knowledge Platform">
<meta name="keywords" content="Physics, Mathematics, Science, Vitruvian Boy">
<meta name="author" content="Andrew Bone and Sean Bone">
<meta name="format-detection" content="telephone=no" />

<title>Science Library .Info</title>

Charset declarations

Character sets declare the character encoding for the document. This is of particular importance if the site is not in English, as special characters for European and other languages may be lost 'in translation' if there is not a match between the document charset and, for example, the database charset the information comes from.

The HTML5 meta declaration <meta charset="UTF-8"> replaces the older HTML4 declaration: <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

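UTF-8 is the most widely used encoding of Unicode, and can represent the characters of practically every language. ISO-8859-1 (Latin-1) is an older single-byte encoding that covers the Latin alphabet as used in Western European languages.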

The 'meta', 'charset' and 'utf' may be written in uppercase, lowercase or a combination of both.
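What a charset mismatch looks like in practice can be sketched in a few lines of Python. The word used here is just an illustrative sample: it is encoded as UTF-8, then decoded once with the matching charset and once with ISO-8859-1.

```python
# Sample text containing a non-English character (illustrative only).
text = "café"

# Store the text as UTF-8 bytes, as a database or file might.
utf8_bytes = text.encode("utf-8")

# Decoding with the matching charset recovers the original text.
correct = utf8_bytes.decode("utf-8")

# Decoding with the wrong charset turns the two bytes of 'é'
# into two separate Latin-1 characters.
garbled = utf8_bytes.decode("iso-8859-1")

print(correct)   # café
print(garbled)   # cafÃ©
```

The plain letters survive because they are encoded identically in both charsets; only the accented character is lost 'in translation'.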

Included files

Files that are used by every page can be included in the head section.

<!doctype html>
<meta charset="UTF-8">
<title>Title of the Page</title>

<link rel="stylesheet" type="text/css" href="" />

<script type="text/javascript" src="included_js_file.js"></script>
<script type="text/javascript" src="included_js_file_no_2.js"></script>

Page content goes here


Notice that .js files should be included separately - each with its own pair of opening and closing <script> tags.

Thought should be given to the order in which the .js files are included, as they will be executed in that order. Large JavaScript files, or those which carry out some operation immediately, may also cause a delay in the loading of a page. It may be more efficient to load these files after a successful loading of the page. This can be achieved by putting them at the end of the HTML, such as in the footer include, instead of in the <head>.


Favicons are the tiny images used in browser tabs for a website, such as the spaceshot of a sunrise through the Earth's atmosphere used for

To load a favicon, use this code in the head section:

<link rel="Shortcut Icon" href="" />


It can be useful to know how to redirect users from one site to another. This may be desired for a number of reasons: the site is old, no longer existing, or the content has been moved, but there are many links to the old reference 'out there'.

The simplest method on an Apache server is a 301 redirect, set up in a file called .htaccess. A 301 is a permanent redirect: anyone who attempts to access the old URL will open the redirect URL instead. The file may be removed or edited at any time.

Create a normal text file and name it .htaccess, and put this content in it:

RewriteEngine On
RewriteRule ^(.*)$ [R=301]

Then upload the file to the domain the redirect will operate on.

In this way, any request made for a file within the folder (or any sub-folder) in which the .htaccess appears, the request will be automatically redirected to For example, a request to will be redirected to It is also possible to redirect users to a specific file on the new domain. Just change the URL to that new address:

Given that the .htaccess file operates on the folder in which it is located and all its sub-folders, it is also possible to redirect all the requests within to a single page (even on the same site), for example In this way, it is possible to create a folder to which the public has no access.
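The effect of a catch-all pattern such as ^(.*)$ can be imitated in Python to see what the server does with a request. This is only a rough sketch of the idea, not of Apache itself: the target addresses are hypothetical (www.example.com), and the function apply_rule is invented for the illustration. The $1 in a target stands for whatever the bracketed part of the pattern captured.

```python
import re

# The pattern from the rule: capture the whole requested path.
PATTERN = r"(.*)"


def apply_rule(requested_path, target):
    """Return the (status, location) a 301 redirect would produce.

    '$1' in the target is replaced by the text captured by the pattern.
    Both arguments are illustrative; no server is contacted.
    """
    match = re.fullmatch(PATTERN, requested_path)
    if match is None:
        return None  # the rule does not apply
    location = target.replace("$1", match.group(1))
    return 301, location


# Hypothetical target that keeps the requested file name.
print(apply_rule("docs/page.html", "http://www.example.com/$1"))

# Hypothetical target that sends every request to one specific page.
print(apply_rule("docs/page.html", "http://www.example.com/moved.html"))
```

In the first call each old address maps to the same path on the new site; in the second, every request in the folder ends up on a single page.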

Note: in certain operating systems, files whose names begin with a . are considered system files, and are therefore hidden. In this case, you may instead name the file htaccess.txt, or something similar, and rename it .htaccess once it has been uploaded to the server. Some FTP clients also hide this file type, in which case it will be necessary to change the client settings to reveal hidden files.

Sean Bone

ZumGuy Network webmaster


For an Italian version of this article see: Reindirizzamento con file .htaccess


The Uniform Resource Locator (URL) is the address of a file on the Internet. It contains information about the name of the file, where it is, and what a browser should do with it. Every file on the Internet has a unique address.


http:// is the scheme.

http:// means HyperText Transfer Protocol, used to access World Wide Web pages.

Other schemes are:

ftp:// (File Transfer Protocol), which gives the location of a file to be downloaded (often a .pdf).

mailto: (note the missing forward slashes), which sets up the address an email will be sent to from a hyperlink.

file:/// which locates a file on the local machine.

https:// is HTTP Secure: the same protocol as http://, but over an encrypted connection.

Schemes should always be written in lower case.


/library/computing/html/ is the path in the example. It indicates the sub-folder route from the root to the document.

File name

html_documents.php is the filename in the example. Its extension informs the browser how it should handle the document. Webpages will have the extension html or htm, unless the page is in a specialised format, such as php or pdf.

Server name is the server name in the example. It is also called the root of the domain.

If is called, the file (or .php) is assumed.

IP address

The domain has an IP address of

This code is unique for every domain and is how servers find the site.
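The parts of a URL described above (scheme, server name, path and file name) can be separated with Python's standard urllib.parse module. In this sketch the path and file name are the ones used in the example, while the server name www.example.com is a stand-in chosen for the illustration.

```python
from urllib.parse import urlsplit

# Path and file name follow the example; the server name is a placeholder.
url = "http://www.example.com/library/computing/html/html_documents.php"

parts = urlsplit(url)

# Split the full path into the folder route and the file name.
folder, _, filename = parts.path.rpartition("/")

print(parts.scheme)   # http
print(parts.netloc)   # www.example.com
print(folder + "/")   # /library/computing/html/
print(filename)       # html_documents.php
```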

Relative and Absolute URLs

Documents or pages may be referenced in two ways. An absolute URL provides all of the URL information, starting with http://. Absolute URLs must always be used when calling a document stored on another server. They should also be used in included files, such as and type files, since these will be called from all or many pages of the site, from all levels.

Relative URLs give the path to the document relative to the page calling that document. e.g. ../images/funnyman.jpg. The ../ informs the browser to go back up the path one folder to the next highest level, then look in a folder called images for a jpeg image called funnyman.jpg, and load it inline where the relative URL call was made.
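How a browser turns a relative URL into a full address can be reproduced with urljoin from Python's standard library. As a sketch, assume the calling page lives at a hypothetical address on www.example.com; the relative URL is the one from the example above.

```python
from urllib.parse import urljoin

# Hypothetical address of the page that contains the relative URL.
page = "http://www.example.com/library/computing/html/html_documents.php"

# '../' climbs one folder above the page's own folder, then descends into images/.
resolved = urljoin(page, "../images/funnyman.jpg")
print(resolved)
# http://www.example.com/library/computing/images/funnyman.jpg

# An absolute URL is used as it is, whatever page it is called from.
absolute = urljoin(page, "http://other.example.org/logo.png")
print(absolute)
# http://other.example.org/logo.png
```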

Relative URLs make coding less cumbersome, and must be used in some circumstances, such as calling a file from outside the root. This is the case with files that must be secured from unauthorised access, such as ../../mysqli_connect.php.

Binary Code

Computers and powers of two

Normally, numbers are written in base 10. There is no single digit to represent ten; instead we have a position-sensitive system for communicating values. For example, adding 1 to 9 results in 10: a 1 in the tens position and a 0 in the units position.

In base-2, or binary, only two digits are used, 0 and 1. We count 0, 1, 10, 11, 100, 101, 110, 111, 1000. [$1000_{2}$ = (1 x 8) + (0 x 4) + (0 x 2) + (0 x 1) = 8 in base-10]

Computers use base-2, which explains why memory sizes are given as powers of 2. For example, in 256 MB the number 256 = $2^8$, which in binary is written 100000000.
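The place-value rule for binary can be written out as a short Python function. This is a minimal sketch; the function name to_decimal is made up for the illustration, and Python's built-in format is used for the conversion in the other direction.

```python
def to_decimal(binary_digits):
    """Convert a string of 0s and 1s to its base-10 value.

    Each step doubles the running total (shifting every digit one
    place to the left) and adds the next digit.
    """
    total = 0
    for digit in binary_digits:
        total = total * 2 + int(digit)
    return total


# Counting from zero to eight in binary.
counting = [format(n, "b") for n in range(9)]
print(counting)

print(to_decimal("1000"))   # 8
print(format(256, "b"))     # 100000000
print(2 ** 8)               # 256
```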

Content © Renewable.Media. All rights reserved. Created: June 3, 2014. Last updated: March 8, 2016.
