What is a URL (Uniform Resource Locator)?

URL stands for Uniform Resource Locator. A URL specifies a World Wide Web (WWW) address: it identifies the network location of a resource that is connected to the web. Such resources include web pages or HTML (HyperText Markup Language) files, image files, and video or sound files.

When a web page address (a URL like http://www.techgrapple.com) is typed into the address bar of a web browser, it specifies the location of a particular file that is being requested from a server on the network. If the URL is correct, the result of that request is the transfer of the file from the server to the web browser.

A URL is made up of multiple components. Each component serves its own purpose: together they specify how and where the requested data is to be located, what type of data is being requested, and how that data should be handled once it is found.
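As a rough illustration of these components, the sketch below splits a URL using Python's standard urllib.parse module. The example.com address, port, path, query and fragment are hypothetical values chosen only so that every component is present; they do not come from the article.

```python
from urllib.parse import urlsplit

# Hypothetical URL, chosen only so that every component is present.
url = "https://www.example.com:8080/docs/page.html?lang=en#intro"

parts = urlsplit(url)

print(parts.scheme)    # the protocol: https
print(parts.hostname)  # the host name: www.example.com
print(parts.port)      # the port number: 8080
print(parts.path)      # the path: /docs/page.html
print(parts.query)     # the query string: lang=en
print(parts.fragment)  # the fragment: intro
```

Most everyday URLs leave out several of these parts, in which case the corresponding attribute is an empty string (or None for the port).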

The Protocol

The first part of a URL designates the protocol that is to be used to locate the file or resource being requested. A protocol is a standardized set of rules by which computers communicate with each other across a network, and naming it in the URL distinguishes between the various protocols available. Among other things, protocols define how data is packaged for transmission and how the receiving computer is to reassemble that data so that it can be read.

In some cases, it is not necessary to specify a protocol when typing in a URL, because web browsers assume a default protocol for web addresses. For example, the URL https://whatis.techgrapple.com will open the same page as whatis.techgrapple.com. This is because the browser assumes that the proper protocol is HTTP or HTTPS, and the protocol is essentially “added” to the URL automatically.
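The idea of a protocol being “added” automatically can be sketched in a few lines of Python. This is only a simplified illustration, not how any particular browser is implemented; the function name with_default_scheme and the choice of https as the default are assumptions made for the example.

```python
def with_default_scheme(address: str, default: str = "https") -> str:
    """Return the address unchanged if it already names a protocol,
    otherwise prepend an assumed default protocol."""
    if "://" in address:
        return address
    return f"{default}://{address}"


print(with_default_scheme("whatis.techgrapple.com"))
# https://whatis.techgrapple.com
print(with_default_scheme("http://www.techgrapple.com"))
# http://www.techgrapple.com
```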

There are a number of different protocols in use on the internet, several of which can be designated in a URL:

  1. HTTP, or HyperText Transfer Protocol, defines how much of the information on the web is formatted and accessed, and it forms much of the underlying structure of the web. As such, if HTTP is the protocol being used, it generally does not need to be typed as part of the URL.
  2. HTTPS is a secure form of HTTP. The S in HTTPS designates that the connection between computers is secured by encryption. Encryption prevents third parties from “listening in” or “eavesdropping” on communications between computers.
  3. FTP (File Transfer Protocol) is a commonly used protocol for transferring files across a network with a client-server architecture. SFTP (SSH File Transfer Protocol) is a secure alternative to FTP that transfers files over an encrypted connection.
  4. News is a protocol that designates communication with internet newsgroups. A newsgroup is simply a forum that is used by users with common interests to hold discussions and communicate with one another. Unlike email, newsgroup messages can be seen and read by anyone viewing the group.
  5. Email is sent across a network using a number of different protocol types. IMAP (Internet Message Access Protocol) is a standard email protocol in which users’ email is stored on a server and accessed by a client in order to read a message. Because messages are stored on the server, an active internet connection is generally required to read each message.
  6. POP, or Post Office Protocol, is an email protocol that allows users to download all of their messages onto a computer and then read them without an internet connection.
  7. SMTP (Simple Mail Transfer Protocol) is a protocol for sending email from a client to a mail server and between mail servers. An SMTP message is sent with an identification of the sender and of the recipient.

There are a number of additional protocols that are used in some URLs. The protocol in a URL is followed by the symbols ://, which are in turn followed by the host name of the computer where the information is stored. Understanding how to identify the protocol component of a URL, and what each protocol is, helps in understanding the nature of the website being visited or of the information being received.
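Identifying the protocol component can be sketched as follows, using Python's standard urllib.parse module to read the text before the :// symbols. The helper name describe_protocol and the set of protocols treated as encrypted are assumptions made for this example, and the ftp.example.com address is hypothetical.

```python
from urllib.parse import urlsplit

# Protocols treated as encrypted in this sketch (an illustrative assumption).
ENCRYPTED_SCHEMES = {"https", "sftp"}


def describe_protocol(url: str) -> tuple:
    """Return the protocol named in the URL and whether this sketch
    considers it to be an encrypted one."""
    scheme = urlsplit(url).scheme.lower()
    return scheme, scheme in ENCRYPTED_SCHEMES


print(describe_protocol("http://www.techgrapple.com"))       # ('http', False)
print(describe_protocol("https://whatis.techgrapple.com"))   # ('https', True)
print(describe_protocol("ftp://ftp.example.com/file.txt"))   # ('ftp', False)
```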

The Host Name

Following the protocol section of a URL is the host name. The host name identifies the specific computer that a browser is attempting to access. A host name ends with a “dot” (.) followed by a designator that specifies the type of organization providing the information being accessed. This designator is called the Top Level Domain. For example, google.com is the host name for Google, and the “.com” Top Level Domain designates it as a commercial institution. Similar to the protocol part of a URL, the “www.” portion of a host name is often not required: techgrapple.com will return the same result as www.techgrapple.com.
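A minimal sketch of pulling the host name and its Top Level Domain out of a URL is shown below, again with Python's standard urllib.parse module. The function name host_and_tld is an assumption for the example, and the approach is deliberately naive: it simply takes the text after the last dot of the host name.

```python
from urllib.parse import urlsplit


def host_and_tld(url: str) -> tuple:
    """Return the host name of a URL and the text after its last dot."""
    host = urlsplit(url).hostname or ""
    # The Top Level Domain is the final dot-separated label of the host name.
    tld = host.rsplit(".", 1)[-1] if "." in host else ""
    return host, tld


print(host_and_tld("http://www.techgrapple.com"))  # ('www.techgrapple.com', 'com')
print(host_and_tld("https://google.com/"))         # ('google.com', 'com')
```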

There are a number of different types of institutions on the web, each with different types of information or different purposes:

  • .com or “dot com” is perhaps the most common, and it designates a commercial institution, one that is doing business commercially. Other designations commonly used by businesses include .biz and .net.
  • .edu designates an educational institution, such as a college or university.
  • .org designates a non-profit organization.
  • .gov and .mil designate government and military institutions, respectively.
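The designations listed above can be expressed as a simple lookup table. The sketch below is only an illustration of that list: the names TLD_MEANINGS and describe_tld are made up for the example, and the table reflects the intended meaning of each Top Level Domain rather than a guarantee about who actually owns a given domain name.

```python
# Intended meanings of the Top Level Domains listed above.
TLD_MEANINGS = {
    "com": "commercial",
    "edu": "educational",
    "org": "non-profit organization",
    "gov": "government",
    "mil": "military",
}


def describe_tld(hostname: str) -> str:
    """Look up the intended meaning of a host name's Top Level Domain."""
    tld = hostname.rsplit(".", 1)[-1].lower()
    return TLD_MEANINGS.get(tld, "unknown")


print(describe_tld("google.com"))       # commercial
print(describe_tld("example.museum"))   # unknown (not in this small table)
```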

There are also Top Level Domains that designate other kinds of organizations, such as museums, professional organizations or sectors of industry, as well as mobile device compatibility. Together, the host name and the Top Level Domain make up the Domain Name. Domain names can be owned by either individuals or organizations, and can be very valuable. The value of certain domain names lies in their relevance to the name of an organization that may wish to own them. For example, the domain name cocacola.com has particular relevance to the Coca-Cola Company. As of 2014, a number of privately registered domain names had been sold to organizations ranging from Bank of America ($3.0 million for loans.com) to QuinStreet, Inc. ($35.6 million for insurance.com).

The Path

The last part of a URL specifies the path through which a web page or resource can be located within a particular website’s directory or file structure. This part of the URL tells the server precisely where to find the resource, much like the number in a street address tells a visitor which building to look for on a particular street. In the case of a URL, the final string of text can locate not only the “building”, but also the specific room and the coordinates within that room at which the object of interest can be found. For example, the /#q=url in the URL http://www.techgrapple.com/#q=url asks the site to display any web pages that are relevant to a query about URLs. Strictly speaking, the / is the path and the text after the # is a fragment, which the browser does not send to the server; the page itself interprets it.
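To make the “building, room and coordinates” picture more concrete, the sketch below takes apart the final portion of the example URL with Python's standard urllib.parse module. The second address, with its /guides/web/url.html path, is a hypothetical URL added only to show a path with several levels.

```python
from urllib.parse import urlsplit, parse_qs

# The example URL from the text above.
parts = urlsplit("http://www.techgrapple.com/#q=url")

print(parts.path)      # /      (the path: the root of the site)
print(parts.fragment)  # q=url  (the text after the # sign)

# The fragment here happens to use key=value form, so it can be decoded
# the same way as a query string.
fragment_params = parse_qs(parts.fragment)
print(fragment_params)  # {'q': ['url']}

# Hypothetical URL with a deeper path, split into its directory levels.
deep = urlsplit("http://www.example.com/guides/web/url.html")
segments = [s for s in deep.path.split("/") if s]
print(segments)  # ['guides', 'web', 'url.html']
```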

Conclusion

Understanding what a URL is and how it works is helpful in knowing how to navigate the internet. A URL is simply a set of instructions that tells a computer how and where to find, decode and display electronically stored information.