Sample Computer
Book Proposal
Developing Visual Basic 4.0 Programs for the Internet with
ActiveX Controls
by Wayne S. Freeze
SAMPLE MATERIAL:
Chapter 2: TCP/IP Concepts
Overview
In this chapter, we are going to explore some of the basic concepts of TCP/IP and the Internet. While a complete study of TCP/IP is beyond the scope of this book, this chapter is intended to give the reader a flavor of the more important concepts necessary for understanding how to use the ActiveX controls.
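By the end of this chapter, you should be able to answer the following questions:
What is a client?
What is a server?
What is a host name?
What is a domain name?
What is an IP address?
What is a name server?
What is a port number?
What is a socket?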
The Internet
In the late 1960s, the Department of Defense's Advanced Research Projects Agency (then known as ARPA and later known as DARPA) began funding research into computer networks by several university computer science departments and a few private companies. In December 1969, four different computers were connected together, thus forming a network that eventually became known as the ARPANET. Throughout the 1970s and early 1980s, ARPA encouraged more and more of its research organizations to connect to this network. In 1983, DARPA split the ARPANET into two pieces: the research part, which retained the name ARPANET, and the military part, which became known as MILNET.
In 1985, the National Science Foundation established a high-speed network to facilitate communication among researchers and to provide access to NSF's six supercomputer centers. This network became known as NSFNET. When this network was tied into the existing ARPANET, they jointly became known as the Internet. Since then, the growth of the Internet has been phenomenal. In 1987, there were approximately 20,000 computers attached to the Internet. Just three short years later, in 1990, the number of computers attached to the Internet had increased tenfold to over 200,000. As of January 1996, one source estimated that more than 37 million people over the age of 16 in the US and Canada alone had access to the Internet.
Despite the enormous size of the Internet, it operates using the same principles as any other computer network. So whether the computer network has only two or three computers connected together or the millions found on the Internet, they all have to work together to perform common tasks.
TCP/IP
To understand the tasks required in a computer network, let us begin by defining "computer network." A computer network is a collection of computers that have connections between them. These computers are also known as nodes. The simplest case is one in which two computers are connected by a single link. The network can grow more complex by adding another computer with a path to one of the computers already in the network. It can also grow more complex by adding a second path between any two existing computers in the network.
** figure 2.1: three pictures showing 2 computers connected together, 3 computers connected together, many computers connected together **
Note that not every computer attached to the network is directly attached to all of the other computers. To pass information between computers that are not directly attached, the first computer examines each of the computers that are directly connected to it, chooses the one that is closest to the destination computer, and passes the information on to it. This process repeats on each computer until the information eventually reaches its destination.
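The hop-by-hop forwarding described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the five-node network in LINKS is hypothetical, and "closest" is taken to mean "fewest links away," which the text does not specify. Real Internet routing is considerably more involved.

```python
from collections import deque

# Hypothetical network: each node lists the nodes it is directly linked to.
LINKS = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}

def hops_to(destination):
    """Breadth-first search: count the links from every node to destination."""
    distance = {destination: 0}
    pending = deque([destination])
    while pending:
        node = pending.popleft()
        for neighbor in LINKS[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                pending.append(neighbor)
    return distance

def route(source, destination):
    """Forward hop by hop; each node picks its neighbor closest to the destination."""
    distance = hops_to(destination)
    path = [source]
    while path[-1] != destination:
        here = path[-1]
        path.append(min(LINKS[here], key=lambda n: distance[n]))
    return path
```

Calling route("A", "E") walks from A to E one directly connected neighbor at a time, without A ever needing a direct link to E.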
Passing the information from one node to another requires that one computer talk to the next in a common language. One of the most common languages in use today is TCP/IP (Transmission Control Protocol/Internet Protocol). More formally known as the TCP/IP Internet Protocol Suite, this language defines a series of protocols that standardize how computers talk to each other in order to exchange information.
The protocols in TCP/IP are arranged in a series of layers known as a protocol stack. What happens within a layer is isolated from the layers above and below it. At the bottom is a Physical Layer that deals with the actual communications hardware (modems, Ethernet cards, etc.). Above this layer is the Internet Protocol (IP) layer, which deals with moving data from one node in the network to the next. Above the Internet Protocol layer is a third layer, the Transmission Control Protocol (TCP) layer, which deals with how to move data from the source to the destination, ignoring any of the nodes in between. Above this layer is the Application Layer, which deals with functions we are more familiar with, such as FTP and HTTP.
** figure 2.2: picture of protocol stacks showing layer to layer communications between multiple computer systems **
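One way to picture the protocol stack is as a series of envelopes: each layer wraps what it receives from the layer above on the way out, and removes only its own wrapper on the way in. The Python sketch below illustrates that idea. The bracketed text format and the function names are invented for illustration and do not resemble the real TCP or IP header formats.

```python
# Order in which wrappers are added as a message travels down the stack.
LAYERS = ["TCP", "IP", "PHYSICAL"]

def wrap(layer, payload):
    """Put the payload inside an envelope labeled with the layer name."""
    return f"[{layer}|{payload}]"

def unwrap(layer, frame):
    """Remove one envelope, checking that it belongs to the expected layer."""
    prefix = f"[{layer}|"
    if not (frame.startswith(prefix) and frame.endswith("]")):
        raise ValueError(f"not a {layer} frame: {frame!r}")
    return frame[len(prefix):-1]

def send_down(message):
    """Sender side: the application message passes down through each layer."""
    frame = message
    for layer in LAYERS:
        frame = wrap(layer, frame)
    return frame

def pass_up(frame):
    """Receiver side: each layer strips its own envelope, bottom layer first."""
    for layer in reversed(LAYERS):
        frame = unwrap(layer, frame)
    return frame
```

Because each layer touches only its own envelope, the application on the receiving side gets back exactly the message the sending application handed down.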
Clients and Servers
When two computers work together, they generally operate as a client and a server. Basically, the client generates requests and sends them to the server. The server then processes the request and sends its response back to the client.
For example, imagine two computer systems called Chris and Sam. Assume that Chris does not have a way to tell time, but Sam does. So the following exchange could occur when Chris wants to know the time.
Chris: "What time is it?"
Sam: "It is 11:15am."
In this case, Chris is the client and Sam is the server. Chris sends a request to Sam ("What time is it?"), which Sam must receive and in turn translate into the functions required to get the time and return it back to Chris ("It is 11:15am"). Finally, Chris must be able to translate the response back into something useful.
While this seems simple, remember that a number of things must exist for this exchange of information to occur. First, a common language or protocol must be in place so that Sam can understand Chris's request and Chris can understand Sam's answer. Second, if Sam is busy and does not hear Chris's request, Chris needs a way to recognize that Sam did not hear him so he can repeat the request. The same applies to Sam's response to Chris. Finally, methods are needed to handle the situation in which Chris says, "What time is it?" and Sam hears "What mumble mumble mumble".
A more complete example would be:
Chris: "Sam, can you help me?"
Sam: no response
Chris: "Sam, can you help me?"
Sam: "Yes I can, Chris."
Chris: "What time is it?"
Sam: "You want to know what time it is?"
Chris: "Yes"
Sam: "It is 11:15am"
Chris: "Thank you for telling me it is 11:15am."
Sam: "You are welcome."
As you can see, this exchange is much more complicated than the first one, yet no additional information was really exchanged. While this example is still somewhat simplistic, this concept forms the basis for nearly all client/server based computing programs.
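The "ask again if nobody answers" part of this exchange can be sketched as follows. This is a simplified simulation, not a real network program: the Sam class, its deaf_for parameter, and the ask function are all hypothetical names invented for the example, and a missing reply is modeled by returning None rather than by an actual timeout.

```python
class Sam:
    """A server that fails to hear the first `deaf_for` requests."""

    def __init__(self, deaf_for=1):
        self.deaf_for = deaf_for

    def hear(self, request):
        if self.deaf_for > 0:
            self.deaf_for -= 1
            return None                      # busy: no response at all
        if request == "What time is it?":
            return "It is 11:15am"
        return "Please repeat that"          # heard only "mumble mumble"

def ask(server, request, max_tries=3):
    """Repeat the request until a reply arrives or the tries run out."""
    for attempt in range(1, max_tries + 1):
        reply = server.hear(request)
        if reply is not None:
            return reply, attempt
    raise TimeoutError(f"no reply after {max_tries} tries")
```

With a server that misses one request, ask returns the answer on the second attempt. Limiting the number of tries keeps the client from waiting forever on a server that never responds.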
Domain Names and IP addresses
The above example is somewhat simplistic from yet another angle. It was implied that Chris and Sam knew how to reach each other. With millions of computers attached to the Internet, this is not as easy as it may seem. For example, what happens if two people want to name their computers Chris or Sam? A solution to this problem, which has evolved over time, is a naming concept called the Domain Name System.
In general, all computers attached to the Internet have a unique hierarchical domain name associated with them. Consider the name WWW.JustPC.COM (pronounced W W W dot JustPC dot COM). The name consists of three parts: WWW, JustPC, and COM. Working from the right, COM is known as the top level domain. In this case, COM indicates that the entire name belongs to a commercial organization. Other valid top level domain names are EDU for educational institutions, GOV for government organizations, MIL for military organizations, NET for major network support centers, and ORG for other types of organizations. A country code may also be used as the top level domain. Thus US, UK, and CA are valid top level domain names for places in the United States, the United Kingdom, and Canada respectively.
The second level domain name, JustPC in this case, indicates the name of a business, organization, or institution. Unlike top level domain names, the second level domain name has no real format, other than being limited to 22 characters in length. I should also point out that the names are case insensitive. That is, JustPC is the same as justpc and JUSTPC. Companies like IBM and Toyota usually use their corporate name as the second level domain. Other organizations may choose names that relate to particular projects or products.
Country codes are generally used for organizations that exist outside of the United States. However, many state governments, libraries, and elementary and secondary schools use .US as their top level domain. For example, in Maryland, the Baltimore County Public Library's computer system used for mail would be known as mail.bcpl.lib.md.us.
The lowest level of the domain name is often referred to as the host name. In the case of WWW.JustPC.COM, WWW is the host name of a specific computer system that is operated by JustPC.COM. The host name can be one level, as in this example, or multiple levels, as determined by the organization. Some host names are very common, based on the function provided by the computer. WWW indicates a machine that has information that can be used by a web browser. FTP indicates a computer that operates an FTP server. Mail may simply be directed to JustPC.COM. Note that a third level name is not necessary; however, only one machine connected to the Internet may be labeled JustPC.COM.
While domain names are easy for humans to understand, they are difficult for computers to use. Computers translate the domain name into something known as an IP address. The IP address consists of four numbers separated by periods. In the case of WWW.JustPC.COM, the IP address is 206.153.49.129. Each of the numbers can range in value from 0 to 255 (2^8 - 1, the largest value you can store in 8 bits). Just as the domain name must be unique within the Internet, so must the IP address, and just as with domain names, there are organizations that are responsible for ensuring that IP addresses do not overlap.
Why is an IP address easier for a computer to use? Consider the case of your telephone number. You have a three digit area code, followed by a three digit exchange, followed by a four digit number. When someone dials your telephone number, the number itself carries information about exactly where you are located in the telephone network. Dialing the area code directs the call to your state (or an area inside of a state when there is more than one area code in a state). Dialing the exchange directs the call to your community within your state. Finally, the last part of the number directs the call to your home. The same process is followed by the computers that pass information around the Internet.
Since the lower layers of TCP/IP only move information around using IP addresses, a facility known as a Domain Name System (DNS) server, or name server for short, is used to translate the domain name into an IP address. Each domain name translates to a single IP address. However, the reverse is not always true: multiple domain names may point to the same IP address. Consider the case of a company that provides information to people on the Internet. They may have a World Wide Web site called WWW.JustPC.COM, an FTP site called FTP.JustPC.COM, and may receive electronic mail at JustPC.COM. While these represent three different types of services, there may be no reason for them to reside on different computers. Thus the three different domain names all point to the same computer system. This also gives the organization that is responsible for the domain names the flexibility to move various functions to a different machine without having to notify everyone who may access that service.
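The way a domain name breaks into parts can be sketched in Python. The function name split_domain and the dictionary keys are made up for this illustration. The sketch simply treats the last label as the top level domain, the one before it as the second level domain, and anything remaining as the host name.

```python
def split_domain(name):
    """Split a domain name into host, second level, and top level parts."""
    labels = name.lower().split(".")          # names are case insensitive
    if len(labels) < 2 or "" in labels:
        raise ValueError(f"not a full domain name: {name!r}")
    return {
        "host": ".".join(labels[:-2]),        # may be empty or span several levels
        "second_level": labels[-2],
        "top_level": labels[-1],
    }
```

For WWW.JustPC.COM this yields the host www, the second level domain justpc, and the top level domain com. For JustPC.COM the host part is simply empty.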
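To see why four numbers in the range 0 to 255 suit a computer, note that each fits in 8 bits, so the whole address fits in a single 32-bit value. The Python sketch below shows the idea. The function names parse_ip and to_integer are invented for this example.

```python
def parse_ip(text):
    """Split a dotted address into its four numbers, checking each range."""
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError("an IP address has exactly four numbers")
    numbers = [int(part) for part in parts]
    if any(n < 0 or n > 255 for n in numbers):
        raise ValueError("each number must be in the range 0 to 255")
    return numbers

def to_integer(text):
    """Pack the four 8-bit numbers into one 32-bit integer."""
    value = 0
    for n in parse_ip(text):
        value = (value << 8) | n              # shift left 8 bits, add next number
    return value
```

Comparing or looking up a single integer is much cheaper for a computer than comparing strings of letters, which is part of why the lower layers work with addresses rather than names.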
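The many-names-to-one-address relationship can be sketched as a simple lookup table in Python. This is a toy stand-in for a name server, not how DNS is actually implemented. The table contents follow the JustPC example above, but the assumption that all three names share the address 206.153.49.129 is only for illustration, and the function names are hypothetical.

```python
# Toy name table: several domain names may share one IP address.
NAME_TABLE = {
    "www.justpc.com": "206.153.49.129",
    "ftp.justpc.com": "206.153.49.129",   # assumed to share the same machine
    "justpc.com": "206.153.49.129",       # assumed to share the same machine
}

def resolve(name):
    """Translate a domain name into its IP address."""
    try:
        return NAME_TABLE[name.lower()]
    except KeyError:
        raise LookupError(f"unknown domain name: {name}") from None

def names_for(address):
    """List every domain name that points at the given IP address."""
    return sorted(n for n, ip in NAME_TABLE.items() if ip == address)
```

Moving the FTP service to a different machine would mean changing one entry in the table. Everyone who asks for FTP.JustPC.COM would then be directed to the new address without being told anything.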
Port numbers and Sockets
By using domain names and IP addresses, we can now find any computer on the Internet. But that does not mean that we have all of the information necessary to have a client and server talk to one another. Most computers today have the capability of running more than one program at a time. The clients and servers we have talked about previously are really computer programs that run on a particular computer. Therefore we need a method that uniquely identifies not only a computer, but also the particular program running on that computer. This is done through the use of a port.
A port is simply another number. Consider it an extension of the IP address. This value can range from 0 to 65,535 (2^16 - 1, the largest value you can store in 16 bits). Port numbers are used to uniquely identify a message box on the computer. For standard Internet servers, there is a set of well known port numbers. For example, FTP conventionally uses port 21 for commands and port 20 for data transfer. World Wide Web servers usually use port 80. Mail is exchanged on port 25 by SMTP.
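The well known ports mentioned above can be collected in a small table, as in the Python sketch below. The service labels used as keys, the function names, and the "address:port" text form are choices made for this illustration, not something the chapter defines.

```python
# Well known ports named in the text; the key names are illustrative labels.
WELL_KNOWN_PORTS = {
    "ftp-data": 20,
    "ftp": 21,
    "smtp": 25,
    "http": 80,
}

MAX_PORT = 2**16 - 1    # 65,535, the largest value that fits in 16 bits

def is_valid_port(port):
    """A port is a whole number from 0 to 65,535."""
    return isinstance(port, int) and 0 <= port <= MAX_PORT

def endpoint(ip_address, service):
    """Combine an IP address with the well known port for a service."""
    return f"{ip_address}:{WELL_KNOWN_PORTS[service]}"
```

The pair of IP address and port, such as the one returned by endpoint("206.153.49.129", "http"), identifies one particular message box on one particular computer.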
A server program will periodically check for messages in its assigned message box. When it finds a message, the server program will remove the message from the message box, then either process it directly, or will start another program to process the message. Finally, the server program goes back to periodically checking for messages.
A client program will prepare and send a message using an IP address and port number of the desired server. As part of the message, the client will tell the server to send its response back to a specific port. Then, like the server program, the client program will wait until it receives a message before continuing to process.
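The message box idea can be simulated in Python with one queue per address and port pair. This is a sketch of the concept only: nothing here touches a real network, the addresses and port numbers are hypothetical, and the function names are invented for the example.

```python
import queue

MAILBOXES = {}   # (ip_address, port) -> queue of (reply_to, text) messages

def open_port(ip_address, port):
    """Create an empty message box for one program on one computer."""
    MAILBOXES[(ip_address, port)] = queue.Queue()

def post(to, reply_to, text):
    """Drop a message in a box, noting where the response should go."""
    MAILBOXES[to].put((reply_to, text))

def check_once(address, handler):
    """Server side: take one message from the box, if any, and answer it."""
    try:
        reply_to, text = MAILBOXES[address].get_nowait()
    except queue.Empty:
        return False                       # nothing waiting; check again later
    post(reply_to, address, handler(text))
    return True

def exchange():
    """One request and one response between a client and a server."""
    chris = ("128.128.128.1", 888)         # hypothetical client address and port
    sam = ("128.128.128.2", 999)           # hypothetical server address and port
    open_port(*chris)
    open_port(*sam)
    post(sam, chris, "What time is it?")
    check_once(sam, lambda text: "It is 11:15am")
    _sender, answer = MAILBOXES[chris].get_nowait()
    return answer
```

A real server would call something like check_once in a loop, which corresponds to the periodic checking described above.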
Starting with our previous example and adding the new information we just learned, we arrive at this:
Chris: "Calling Domain Name Server, port 53. Where is Sam? Respond to 128.128.128.001, port 888"
Domain Name Server: "Calling 128.128.128.001, port 888. Chris, you can reach Sam at 128.128.128.002."
Chris: "Calling 128.128.128.002, port 999. Sam, can you help me? Respond to 128.128.128.001, port 888."
Sam: no response
Chris: "Calling 128.128.128.002, port 999. Sam, can you help me? Respond to 128.128.128.001, port 888."
Sam: "Calling 128.128.128.001, port 888. Yes I can, Chris."
Chris: "Calling 128.128.128.002, port 999. What time is it?"
Sam: "Calling 128.128.128.001, port 888. You want to know what time it is?"
Chris: "Calling 128.128.128.002, port 999. Yes"
Sam: "Calling 128.128.128.001, port 888. It is 11:15am"
Chris: "Calling 128.128.128.002, port 999. Thank you for telling me it is 11:15am."
Sam: "Calling 128.128.128.001, port 888. You are welcome."
This time, before Chris can talk to Sam, he must first talk to the Domain Name Server. Chris does not need to look up the Domain Name Server's IP address, as he does for most other computers on the network, because that address was defined in Windows 95 (or whatever system is in use) when the TCP/IP software was first configured. Also, in this example, the IP addresses and ports are now included as part of the communication.
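For readers curious how such an exchange looks with a real programming interface, here is a minimal sketch using the socket module from Python's standard library. It makes several assumptions: both programs run on the same machine using the loopback address 127.0.0.1, the operating system picks a free port instead of the port numbers in the example, the name server step is skipped, and the function names are invented for illustration.

```python
import socket
import threading

def serve_once(listener, reply):
    """Sam: wait for one caller, read the request, and send back a reply."""
    conn, _caller = listener.accept()
    with conn:
        request = conn.recv(1024)
        if request == b"What time is it?":
            conn.sendall(reply)
        else:
            conn.sendall(b"Please repeat that")

def ask_time(host, port):
    """Chris: call the server's address and port, then wait for the answer."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(b"What time is it?")
        return conn.recv(1024).decode("ascii")

def demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.settimeout(5)                 # do not wait forever for a caller
    listener.bind(("127.0.0.1", 0))        # port 0: let the system choose one
    listener.listen(1)
    port = listener.getsockname()[1]
    server = threading.Thread(target=serve_once,
                              args=(listener, b"It is 11:15am"))
    server.start()
    try:
        return ask_time("127.0.0.1", port)
    finally:
        server.join()
        listener.close()
```

Notice that the client never states a port for the response. With TCP sockets, the return address and port travel with the connection itself, so the "Respond to" part of each message is handled for you.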
Summary
In this chapter, we briefly discussed a little history and some of the basic concepts of TCP/IP and the Internet. We developed an example of how a client and a server cooperate to perform a task. In the next chapter, we will continue to explore TCP/IP, clients, and servers by looking at some of the "standard" clients and servers available on the Internet.
In case you want to check your answers to the questions in the Overview section, here are my answers:
A client is a computer program that makes requests of another computer program.
A server is a computer program that receives requests from another computer program, then processes each request and returns the result to the program that made it.
A host name is the name of a computer.
A domain name is a hierarchical name that uniquely describes a computer on the Internet.
An IP address is a series of four numbers, each ranging from 0 to 255 in value, that uniquely identifies a computer on the Internet.
A name server is used to translate a domain name into an IP address.
A port number is a value ranging from 0 to 65,535 that uniquely identifies a program running on a computer.
A socket is a programming interface between an application program and a communications facility.
Copyright © 1997 Wayne S. Freeze, Adler and Robin Books Updated: 16 January 1997.