What is HTML?

HTML stands for HyperText Markup Language. It is the first thing your browser will try to identify when opening a web page (an actual .html file), either from a local drive or a web address. The markup itself is not meant to be displayed within the browser window; instead it defines *how* the actual content of the page is organized for display.

At its core, HTML is about organizing information for presentation.

Throughout this section, keep in mind that the actual "look" of a page (shapes and sizes, colors and images) is defined by another language, CSS. Furthermore, complex functionality such as interactivity and scripted animations is typically defined by yet another language, JavaScript. Both of these will typically either reside within the HTML document or be linked to from it.

A small section of code from CNN.com. The column on the right contains a zoomed-out view that shows only part of the nearly 2000 lines used for the front page alone. Luckily, most web sites won't be as complex as a news site that displays hundreds of links, articles and media clips like this one. (Captured in Sublime Text.)

Basic page parts

The top elements in a document will typically be the doctype declaration and the html element.

  1. The Document Type Declaration or "Doctype" tells a web browser how to render a file. Simply declaring it as HTML, as shown below, is the standard choice for modern web pages. A different (or missing) declaration can signal the browser to render the page in a backward-compatible mode intended for older pages, amongst other things.
  2. The HTML section itself. This will typically contain at least 2 more elements.
    • The Head

      Keep in mind that the "head" of the document (often loosely called its "header") is different from the "header" you'll hear about in layout design. This section does NOT have to do with the "top menu", banner or "masthead" used in some page designs.

      The head might contain the following elements itself, though these are all optional.
      • Metadata. This might contain information to help with searches or page identification. Like most things in the head, it will not actually be shown on the page itself.
      • The title of the page (usually displayed in either the program window title bar or the page tab).
      • Links to art or code resources in other files.
      • Styling information explaining how the page should be drawn.
    • The Body

      The body is where the vast majority of information in any given page will be. It defines the visible parts of the page: the masthead, navigation menus, the footer, readable text, links, possible images, and so on.
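
The outline above can be sketched as a complete file. This is only an illustration: the file paths, the description text and the single style rule are made up for the example, and every item inside the head is optional.

```html
<!DOCTYPE html>
<html>
    <head>
        <!-- Metadata: never drawn on the page itself -->
        <meta charset="utf-8">
        <meta name="description" content="A short summary of the page for search engines.">
        <!-- Title: shown in the browser tab or window title bar -->
        <title>Page Title</title>
        <!-- Link to a stylesheet kept in another file (hypothetical path) -->
        <link rel="stylesheet" href="styles/main.css">
        <!-- Link to a script kept in another file (hypothetical path) -->
        <script src="scripts/main.js"></script>
        <!-- Styling information written directly in the head -->
        <style>
            body { font-family: sans-serif; }
        </style>
    </head>
    <body>
        <p>Visible content goes here.</p>
    </body>
</html>
```

Only the paragraph inside the body would appear in the browser window; the title would appear in the tab.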

Syntax & rules

An Element is a single piece of information that starts with an opening tag and ends with a closing tag. Each tag is a keyword wrapped in angle brackets ("<" and ">"), and the content of the element (typically text) lies between the two tags.

In HTML, information is kept within elements, which are individual pieces of content, each defined by an "opening" and "closing" tag. Both of these tags use the same predefined keyword, and the only difference between them is a forward slash within the closing tag. A single element looks like the following...


<html> Text might go here </html>

Notice how the keyword in each of the opening and closing tags is itself bracketed by "<" and ">".

Some elements will contain visible info (most likely text), some might hold other elements, and some will be only a beginning and end tag with absolutely nothing between them. There are a few reasons an element might have nothing between its tags. It might merely be a reference for other objects, it might be a placeholder for an object added after the page has loaded, or it might be a purely visual object with no text or content of its own.
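
As a rough illustration of those three cases, consider the fragment below. The text and the "placeholder" identifier are invented for the example (the "id" attribute itself is covered under "Element attributes").

```html
<!-- An element holding visible text -->
<p>Some readable text.</p>

<!-- An element holding other elements -->
<div>
    <p>First paragraph.</p>
    <p>Second paragraph.</p>
</div>

<!-- An empty element: a placeholder to be filled in or styled later -->
<div id="placeholder"></div>
```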

It's incredibly important to always make sure an opening tag has an accompanying closing tag (except in the case of a few elements such as img, as you'll see). Missing either the opening or closing tag is one of the most common mistakes, leading to missing web page pieces or parts of a page not appearing as you think they should.

Almost every element follows these rules, including the aforementioned head and body elements. In actual code the outline list from above, when stripped of descriptions, would translate into the code shown here. First is an easy-to-read outline of the page, followed by the actual code as it would appear within an ".html" file.

  1. The Doctype declaration.
  2. The HTML.
    • The Head
      • Title
    • The Body
      • Content (Divs, text, etc)

<!DOCTYPE HTML>
<html>
    <head>
        <title> Page Title </title>
    </head>
    <body>
        <div>
            <p>
            Text and other content written here.
            </p>
        </div>
    </body>
</html>

In fact, if you were to copy the above code, paste it into a blank text file, save it with an ".html" extension, and open that file in any given web browser, you'd see nothing on the page except "Text and other content written here", as all the bracketed code would be interpreted as the mechanics of how the text should be shown and not as content to be shown in itself.

The importance of using tags

It's important to not only use tags, but to use the appropriately named tags, which are standardized. There is a fixed set of accepted element keywords that one can use. Aside from the "html" and "body" tags, which should start and end every new document, the content area could be filled with an assortment of elements depending on the nature of that content. When in doubt you can typically use the "div" keyword to define a generic element that will always use default properties.

Furthermore, enclosing text in element tags helps to dictate where new paragraphs appear, and allows us to style them later. Consider the code below.


<!DOCTYPE HTML>
<html>
    <body>
        <div>
            Dog               Cat
            Bird
        </div>
        <div>
            Mouse
        </div>
    </body>
</html>

When displayed in a browser, runs of whitespace in HTML are collapsed into a single space, and line breaks that you add in plain text are ignored, no matter how many there are. So the code above would display like so...

Dog Cat Bird
Mouse

...instead of being spaced apart or on separate lines as you might have written them.

Whitespace refers to spaces, tab strokes, and line breaks. Runs of these are collapsed into a single space when the page is rendered by the browser.
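
To actually get each word on its own line, one option is to wrap each in its own element. The sketch below uses paragraph tags, reusing the animal names from the example above; this is just one possible approach.

```html
<!DOCTYPE html>
<html>
    <body>
        <!-- Each paragraph starts on its own line, however the source is spaced -->
        <p>Dog</p>
        <p>Cat</p>
        <p>Bird</p>
        <p>Mouse</p>
    </body>
</html>
```

Because the line breaks now come from the elements rather than from the plain text, the spacing and indentation of the source file no longer matter.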

Order & Relationships

Elements are nested when one is placed within another. The inner is the "child", the outer the "parent".

Elements are often nested, meaning they fall within each other. This denotes a Parent / Child relationship, mostly meaning that the properties of the parent are passed on to the child. For instance, of the three div (division) tags below, two have a parent / child relationship with one another while the third is solitary.


		
<div>
    <div></div>
</div>

<div></div>
        
                

In this case the inner div is a child of the outer div, its parent. The third div might be unaffected by the properties of the first, such as color or shape and size, but without special parameters it will always be pushed further down the page just because it comes after the first two. This is the default rendering approach of all web browsers.

Traditionally, child elements are indented (one tab press) further away from the side (and from the position of their parent), as seen above. The further away from the side an element is, the more ancestors (parents, grandparents and so on) it has. This helps with readability for the sake of human eyes and will not change the way a computer reads the code.
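
A slightly deeper nesting shows how the indentation tracks the relationships. The comments and the paragraph text are added purely for illustration.

```html
<div>                      <!-- parent -->
    <div>                  <!-- child of the outer div -->
        <p>Nested text</p> <!-- child of the inner div, grandchild of the outer div -->
    </div>
</div>
<div></div>                <!-- comes after the outer div; neither its parent nor its child -->
```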

Element attributes

In some cases you'll see extra information within tags. It might look something like this, though it is not limited to this form.


<div id="name"></div>

Attributes add extra data to individual elements. This can indicate how the element should be rendered or used by another section of code like CSS or JavaScript.

In this case the div has been given an identifier with a value of "name". This "id" is an "attribute". Note that the extra attribute has appeared inside the opening tag, after the element type, but is not inside the closing tag. Elements can have none, one, or more than one attribute.
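
For comparison, here is a sketch of elements with no, one and several attributes. The identifier and class names are invented for the example; "id", "class" and "title" are standard attributes.

```html
<!-- No attributes -->
<div></div>

<!-- One attribute -->
<div id="sidebar"></div>

<!-- Several attributes, separated by spaces, all inside the opening tag -->
<div id="main-menu" class="menu" title="Site navigation"></div>
```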

You'll see an example of attributes being used for inline styling below, but by far the most common use will be to associate elements with CSS classes, as you'll see in coming sections.

Most commonly used HTML elements

There are many, many, many possible element tags that can be used within an HTML document. Full lists can be found in online HTML element references.

Semantics refers to a label's ability to indicate its use to a human viewer through name only.

Which tags you use depends on the use of the new element. It's important to understand the "semantic" use of various keywords almost as much as the literal use of them. By that I mean what these tags mean to other developers who might look at, or even work on, your page with you, on top of how they will be interpreted by a computer. "img" obviously denotes an image, while a "div" could be anything and is therefore fairly "non-semantic".

Let's narrow the focus of this page and look at only the most commonly used elements.

<div>

Probably the most common element seen in any page. The "div" tag is simply a division of the page. Its sole purpose is to define the starting and stopping points of whatever content you want to put within it.

<br>

The "br" tag (aka the "break" tag) is one of the few elements where it is acceptable to *not* have a closing tag. It will insert a line break and force text to the next line instead of wrapping across the width of the available space.
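
A minimal sketch of a break inside a paragraph (the verse is just sample text):

```html
<p>
    Roses are red,<br>
    Violets are blue.
</p>
```

The two phrases appear on separate lines while remaining part of the same paragraph.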

<a>

This tag defines a link to either another HTML page, downloadable media, or some other kind of asset. See "Linking to external resources" below for expanded information about links.

<img>

This tag refers specifically to graphic images that are not linked to, but actually rendered within the page, and loaded when the page is loaded. JPG, PNG, and SVG are the most common and acceptable file types to link. It is another of the few elements that do not require a closing tag. The most basic possible form of use is as follows.


<img src="folder/file.jpg"/>

<p>

This is the most commonly used tag for paragraph text.

<h1>,<h2>,<h3>

The "h" tags are headings. h1 will probably be used to describe an entire chapter or web page, h2 might be used for bylines or section titles, and h3 would be for subsection and figure titles.
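
As a rough sketch of that hierarchy, with invented titles:

```html
<h1>Animals</h1>
<h2>Dogs</h2>
<h3>Working breeds</h3>
<p>Paragraph text about working breeds.</p>
```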

<section>

Section tags denote a division within the page. If you were making a site about animals and had one web page devoted to dogs, then you might further divide that page by breeds. It's perfectly possible to just continue using "divs" to divide up the sections, but if you were to use "section" then you could later apply styling information to just the sections. For instance, a starting margin that only applies to sections and not every div on the page.
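
The dog page described above might be sketched as follows. The breed names, the text and the 40px margin are assumptions made for the example, and the style rule is written in CSS, which is covered in its own chapter.

```html
<!DOCTYPE html>
<html>
    <head>
        <title>Dogs</title>
        <style>
            /* Applies to every section, but not to plain divs */
            section { margin-top: 40px; }
        </style>
    </head>
    <body>
        <h1>Dogs</h1>
        <section>
            <h2>Beagles</h2>
            <p>Text about beagles.</p>
        </section>
        <section>
            <h2>Collies</h2>
            <p>Text about collies.</p>
        </section>
        <div>This div is not affected by the section rule.</div>
    </body>
</html>
```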

<video>,<audio>

As you'd expect, these allow one to embed video and audio files within a document. Some compatibility issues will force you to use certain file formats for certain browsers, so double check that you have the code for all major browsers when using these tags.
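
One common way to handle the format differences is to list several source files and let the browser pick the first one it can play. The file paths below are hypothetical, and which formats you need depends on the browsers you are targeting.

```html
<video controls width="320" height="240">
    <source src="media/clip.mp4" type="video/mp4">
    <source src="media/clip.webm" type="video/webm">
    Your browser does not support the video element.
</video>

<audio controls>
    <source src="media/sound.mp3" type="audio/mpeg">
    <source src="media/sound.ogg" type="audio/ogg">
    Your browser does not support the audio element.
</audio>
```

The text inside each element is only shown by browsers that do not support the tag at all.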

Images

It should first be noted that images do not have to make use of HTML to be shown. They can also be linked to via CSS as you'll see in the CSS section on color. To decide whether to use HTML or CSS to link to an image you can ask yourself a simple question.

Do I want to tile, scale, skew, rotate, resize, or otherwise change the way an image is displayed? And does this same image need to be displayed in multiple locations on the same page?

If the answer to either of those questions is yes then you probably want to use CSS to display the image as a background-image.
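
A minimal sketch of the CSS route, assuming a hypothetical image at "images/pattern.png" that should tile across a box. The styling is written inline here only to keep the example short; the CSS chapters show better places to keep it.

```html
<div style="
    width:200px;
    height:200px;
    background-image:url('images/pattern.png');
    background-repeat:repeat;
    ">
</div>
```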

Otherwise there's nothing wrong with defining images within HTML and it is relatively simple to do.


<img src="LINK" />

As you can see, images are another tag which does not need a closing "</img>" tag. The forward slash at the end is optional in HTML5, though many authors include it to make clear that the tag closes itself.

Only the "src" attribute is absolutely necessary for an image to be drawn, but you may want to use "width" and "height" attributes when making the link. Their values are plain numbers of pixels, written without a "px" unit, like so...


<img src="LINK" width="20" height="20" />

You might wonder why you'd use the width and height attributes when they are not technically required. Your browser will be able to read the size of an image once it is downloaded and display it in the correct dimensions unless otherwise told by CSS. The main difference between using them and not is seen when the page is first loaded.

If height and width are defined then blank space will appear while the images are still being loaded so that the layout can appear as intended. If they are not defined then the layout may change after the image is loaded, depending on whether the image displaces other content.

If you do use these attributes then make sure they are correct. If you input the wrong dimensions for an image you can easily end up with a skewed image where you may not want one.

Inline styles

Styling information, meaning information in the document that tells the browser HOW to draw a file, can be kept within the HTML itself. Style info appearing within an element (inline styling) is written like so.


<div style='color:red;'></div>

Multiple style properties can be defined at once:


<div style="
    color:red; 
    width:200px; 
    height:200px;
    ">
</div>

In the div element above we've added the attribute "style" and given it a value that includes "color:red". This means that any text within the element will appear red. The width and height define the size of the div in pixels.

This is generally the wrong way to change styles.

But why is that? Imagine creating a site where many of the pages are the same. Perhaps you want the menu header text to appear red. This menu header appears on every page. Now what happens when you make most of the site, create many pages of HTML where the header has these style attributes, and THEN decide you want the header to be blue? What if you've styled every element on your page this way? You now have to go back and change every style attribute in every element on every page.

This is where CSS comes into play. See the following chapter for how to implement it.
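
As a small preview, here is a sketch of the same red menu text controlled from a single rule instead of from each element. The class name "menu-header" and the menu labels are invented for the example.

```html
<!DOCTYPE html>
<html>
    <head>
        <style>
            /* One rule controls every element with class="menu-header" */
            .menu-header { color: red; }
        </style>
    </head>
    <body>
        <div class="menu-header">Home</div>
        <div class="menu-header">About</div>
    </body>
</html>
```

Changing "red" to "blue" in that one rule recolors every menu header on the page. To share the rule across many pages it would typically live in a separate stylesheet file that each page links to.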

It's also important to know that the elements you write out will have default values for some properties other than "none" or "0px". The cause of this goes back years to the days when only text was sent from server to client, no images or styling info, and so the earliest browsers included built-in values for properties such as margin and text styling. For the sake of backward compatibility these values have persisted through the years, even to today. Because of these default values you'll notice a few properties are applied to all elements without you having to define them first. For instance...

  • The page body will have a margin defined (this is why the elements you make are pushed away from the edges of the screen).
  • Some text elements like headings are given different sizes. Text within an "h1" element appears larger and bolder than text within an "h2" element, and so forth.

Linking to external resources

Making a link is simple. The most basic form is written like so:


<a href="url"> Displayed text </a>

Let’s look at the parts of the line above.

  • The “a” is the element used. It is sometimes called an anchor and typically indicates a hyperlink.
  • The “href” is the hypertext reference, meaning the address the link points to.
  • The displayed text can be anything. This is what the user will see on your page.

Optionally, you can define a "target" attribute to signal how a browser should open a link. target="_blank" typically opens the link in a new tab. (The default is "_self".)
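
A short sketch of both behaviors, using the placeholder address example.com:

```html
<!-- Opens in the same tab (the default, same as target="_self") -->
<a href="https://example.com/">Visit the example site</a>

<!-- Typically opens in a new tab -->
<a href="https://example.com/" target="_blank">Visit the example site in a new tab</a>
```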

Note: As of HTML5 you can enclose a div within a link to make the entire div clickable, like a button.
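
For instance, something along these lines, where the address and the box dimensions are placeholders:

```html
<a href="https://example.com/">
    <div style="width:200px; height:50px;">
        Click anywhere in this box
    </div>
</a>
```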

External Resources

Below are some links to video tutorials you may find helpful as well.

"Channel 9" has a series covering everything from basic HTML implementation to CSS styling.