A network is a group of devices that can communicate with each other.
Networks consist of hosts, each of which has a unique address, and the various network equipment needed to establish interconnections among the hosts. The hosts use network-specific host addresses to transmit messages to each other. Depending upon whether the network is circuit switched or packet switched, these messages take the form of either data streams or datagrams.
In a circuit switched network, a host uses some means to signal that it wishes to communicate with another host. This causes a dedicated connection to be established between the two hosts over which a continuous sequence of data may be written. Since there is no real starting or stopping within this sequence of data (other than those delays that are added by the sender), this virtually continuous flow of data is called a stream.
Each byte that the sender transmits arrives at the recipient in the same order it was transmitted. Furthermore, since the connection is dedicated to just the communication between those two hosts, one side or the other generally knows with certainty if the connection has been lost, a feature that is (sometimes painfully) absent in the more flexible but unreliable packet switched environment.
First conceived of in a landmark paper by Leonard Kleinrock of MIT (Information Flow in Large Communication Nets, 1961), a packet switched network differs significantly from a circuit switched network. When one host wishes to communicate with another in a packet switched network, it constructs a datagram with the information it wishes to transmit along with appropriate addressing information. It then delivers this datagram to the network for receipt by the destination host.
Since there is no dedicated connection between the two hosts, there is generally no way for a sender to be sure that the receiver actually saw the message that was sent to it unless the recipient sends some sort of confirmation that the message was received. However, this confirmation message is just as likely to get lost as the original message. This is how we arrive at the two generals problem.
There are many ways to connect hosts together when it comes to the physical cabling. Some of the more popular methods are the bus, ring and star configurations.
In the bus configuration, all the hosts are connected to a common link of some sort, and each host is an equal peer of every other host. The bus network takes its name from electrical engineering, because it resembles an electronics bus.
In the star configuration, every host connects to a central node of some sort (which may be a host itself) and every connection to every other host is routed through that central node. It is called this because the diagram for this kind of network looks somewhat like a star.
The ring topology has each host connected to a left and a right neighbor until eventually the last host is connected to the first, thus forming a ring. This topology is popular when the physical medium being used is not capable of being shared, e.g. fiber optic.
You can read about several other topologies at the Wikipedia page on network topology.
A datagram consists of some quantity of data along with all the information needed to deliver that data from the sender to the recipient. Since all the information necessary to take the given data through the network and to its destination is contained within it, a datagram can be considered an autonomous entity.
An idealized datagram
| To: | that host over there |
|---|---|
| From: | this host right here |
| Message: | Four score and ... |
Addressing a datagram does require that a unique name exist for every participant in the network.

One can make the case that a continuous series of reliably delivered datagrams is indistinguishable from a stream, and therefore one can often get the benefits of a packet switched network with the usefulness of a stream for communicating data. Doing so requires that some sort of virtual stream or circuit be established between the two hosts. This is exactly what a protocol like TCP does.
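As a rough illustration of that argument, the Python sketch below cuts a byte string into numbered datagrams, delivers them out of order with one duplicate, and rebuilds the original sequence at the receiving end. The `Datagram` class, its `seq` field, and both helper functions are hypothetical names invented for this example. A real protocol such as TCP is considerably more involved, since it must also detect and retransmit datagrams that never arrive.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Datagram:
    seq: int        # hypothetical sequence number added for this sketch
    payload: bytes


def split_into_datagrams(data: bytes, size: int) -> list[Datagram]:
    """Cut a byte string into numbered datagrams of at most `size` bytes."""
    return [
        Datagram(seq=i, payload=data[start:start + size])
        for i, start in enumerate(range(0, len(data), size))
    ]


def reassemble(received: list[Datagram]) -> bytes:
    """Rebuild the byte stream: drop duplicates, then restore the original order."""
    unique = {d.seq: d.payload for d in received}
    return b"".join(unique[seq] for seq in sorted(unique))


message = b"a continuous sequence of bytes"
datagrams = split_into_datagrams(message, size=8)

# Simulate a packet switched network: duplicate one datagram and shuffle the order.
in_flight = datagrams + [datagrams[1]]
random.shuffle(in_flight)

stream = reassemble(in_flight)
print(stream == message)
```

The sequence number is what lets the receiver present an ordered stream no matter how the network delivered the individual datagrams. This sketch does not handle a datagram that is lost outright; that would require acknowledgements and retransmission.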
An internet is a connected set of networks that is designed to operate as one big virtual network. Two networks are joined by designating a host as a router and giving it a connection to each of them. This enables the router to pass messages between the two networks based upon which hosts are on which networks. This is referred to as routing. If you combine routing with globally unique host addresses, you can create an effective illusion that there is only one big virtual network upon which all hosts reside. The only visible difference to end users (if any) will be the differences in performance when communicating with various hosts. The advantages of a single virtual network running on top of many real networks are manifold.
Since the protocol that is used in an internet is generally designed specifically to hide the particulars of the physical networks underneath it, an internet can be constructed out of a variety of different technologies, each of which has its own particular benefits and drawbacks.
Even in an environment that is constructed entirely of one networking technology (e.g. Ethernet), internetworking can reduce the amount of traffic on busy links. If you have 500 people in a building all sharing the same physical network, they all have to share its bandwidth. By instead creating multiple internetworked segments, you segregate the traffic so that nobody has to see all the traffic at once, a process called segmenting. If you are on a broadcast-oriented network (e.g. Ethernet), then segmenting is a requirement for networks with hundreds or thousands of hosts.
Because you are physically separating networks via routers, you can use the routers to enforce policies regarding what data is and is not passed between the two networks. This gives you tremendous power to secure both the communications between hosts and access to hosts. Familiar forms of this are firewalls, which are policy-based routers, and content filtering proxies.
Because you can use heterogeneous network technologies in an internetwork, you can construct your network to best fit your environment. For example, you might use Ethernet-based LANs for users in the same building or floor, and fiber optic backbones to channel traffic to centralized or distributed servers, which are in turn connected to distant networks via long-distance links provided by a telco.
All of these advantages do come at a cost, even if it is a slight one. The first cost is complexity. When you combine multiple networks together and then add a new protocol layered on top of them to hide the real networks, you inherently increase the complexity, overhead and maintenance cost of the internetwork.
In talking about computer networks, the net cost increase is usually minor when the advantages above are taken into consideration. It is even possible, if not common, that internetworking technologies provide benefits that far exceed the costs of increased complexity. This is especially true at the enterprise level where improved communication efficiency nearly always produces a benefit.
The biggest internetwork of them all is the Internet, powered
by IPv4. That's Internet with
a capital 'I'. This is the internet that is so big you almost can't
miss it. While the technologies defined by the IETF, W3C and all those
other standards bodies could be deployed in entirely private internetworks,
it is most often true that even these deployments are gatewayed in some
fashion to the global internetwork called The Internet.
The Internet Protocol, or IP, is the protocol that the modern Internet is built on, specifically IPv4 as defined in RFC 791. In this network model, every host is assigned a unique 32-bit host address by a central authority, which serves to ensure that no two hosts are given the same address. In practice, the central authority assigns blocks of addresses (network blocks) to organizations, who then ensure that the addresses within their block are not assigned more than once.
These globally unique host addresses are the first of two steps needed to construct a single, world-wide virtual network. The second is routing, but that is discussed elsewhere. At the local level, a single network generally has a range of addresses assigned to it, and a host within that network is assigned one of them. An Internet address is a 32-bit number. When it is used on a network, it is accompanied by a bitmask which determines which bits of the address describe the network and which describe the host.
In general, all the hosts that share the same network mask and network address reside on the same network. Because binary is so cumbersome, and because netmasks historically could only be 8, 16 or 24 bits long, IP addresses are typically written as a dotted quad: the full address is broken into four octets, which are written in decimal form with dots between them. The length of the network mask is then appended after a slash, thereby uniquely describing the host and the network it is on. As you can see in the table, for the netmasks 8 and 24 the boundary between the network and host address falls right on a dot in the dotted quad.
| 32 bit address | dotted quad/netmask | network address | host address |
|---|---|---|---|
| 00001010010110100110001110001110 | 10.90.99.142/0 | 0.0.0.0 | 10.90.99.142 |
| 00001010001010111001110101011100 | 10.43.157.92/3 | 0.0.0.0 | 10.43.157.92 |
| 00001010101011110001010100011101 | 10.175.21.29/5 | 8.0.0.0 | 2.175.21.29 |
| 00001010110011000001111000110011 | 10.204.30.51/8 | 10.0.0.0 | 0.204.30.51 |
| 00001010010110111101000100011010 | 10.91.209.26/12 | 10.80.0.0 | 0.11.209.26 |
| 00001010101101110110101001011000 | 10.183.106.88/13 | 10.176.0.0 | 0.7.106.88 |
| 00001010000100111110011101110011 | 10.19.231.115/14 | 10.16.0.0 | 0.3.231.115 |
| 00001010010110100100110000001110 | 10.90.76.14/15 | 10.90.0.0 | 0.0.76.14 |
| 00001010111100010110001010011100 | 10.241.98.156/18 | 10.241.64.0 | 0.0.34.156 |
| 00001010010111100001100110101000 | 10.94.25.168/19 | 10.94.0.0 | 0.0.25.168 |
| 00001010110000000110100010101101 | 10.192.104.173/20 | 10.192.96.0 | 0.0.8.173 |
| 00001010001000000011100000101011 | 10.32.56.43/23 | 10.32.56.0 | 0.0.0.43 |
| 00001010111101101111111000111101 | 10.246.254.61/24 | 10.246.254.0 | 0.0.0.61 |
| 00001010100111100111011011000101 | 10.158.118.197/25 | 10.158.118.128 | 0.0.0.69 |
| 00001010100000100001011000100010 | 10.130.22.34/26 | 10.130.22.0 | 0.0.0.34 |
| 00001010011000001001001111000011 | 10.96.147.195/28 | 10.96.147.192 | 0.0.0.3 |
| 00001010100100101111110001000110 | 10.146.252.70/29 | 10.146.252.64 | 0.0.0.6 |
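The rows of the table can be reproduced with a few bitwise operations. The following Python sketch is one possible way to do it; the function name `split_address` is invented for this example, and only the standard library `ipaddress` module is used to convert between dotted quads and integers.

```python
import ipaddress


def split_address(cidr: str) -> tuple[str, str]:
    """Return (network address, host address) for a string like '10.90.76.14/15'."""
    address_text, prefix_text = cidr.split("/")
    address = int(ipaddress.IPv4Address(address_text))
    prefix = int(prefix_text)

    # A mask of length `prefix` has that many one-bits followed by zero-bits.
    mask = (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF

    network = address & mask             # keep only the network bits
    host = address & ~mask & 0xFFFFFFFF  # keep only the host bits
    return str(ipaddress.IPv4Address(network)), str(ipaddress.IPv4Address(host))


print(split_address("10.175.21.29/5"))    # ('8.0.0.0', '2.175.21.29')
print(split_address("10.241.98.156/18"))  # ('10.241.64.0', '0.0.34.156')
```

Two hosts are on the same network when they share the same mask length and their network addresses, computed this way, are equal.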
Using these 32-bit addresses, IP creates datagrams of up to 65,535 bytes (header included), each carrying the source address, the destination address and a few additional header fields, in order to communicate between any two hosts anywhere on the internetwork.
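To make the header fields concrete, here is a minimal Python sketch that packs the fixed 20-byte IPv4 header described in RFC 791, without options. The function name `build_header` and the example field values (time to live, protocol) are assumptions chosen for illustration, and the checksum is deliberately left as zero rather than computed, so this is not a header a real network stack would accept.

```python
import socket
import struct

# Fixed 20-byte IPv4 header layout (no options), in network byte order.
IPV4_HEADER = struct.Struct("!BBHHHBBH4s4s")


def build_header(source: str, destination: str, payload_length: int) -> bytes:
    """Pack an illustrative IPv4 header for a payload of the given length."""
    version_and_ihl = (4 << 4) | 5  # version 4, header length of five 32-bit words
    total_length = IPV4_HEADER.size + payload_length
    if total_length > 0xFFFF:
        raise ValueError("an IPv4 datagram cannot exceed 65,535 bytes")
    return IPV4_HEADER.pack(
        version_and_ihl,
        0,                 # type of service
        total_length,      # 16-bit field, which is where the size limit comes from
        0,                 # identification
        0,                 # flags and fragment offset
        64,                # time to live (arbitrary example value)
        17,                # protocol number of the payload (17 is UDP)
        0,                 # header checksum, left as zero in this sketch
        socket.inet_aton(source),
        socket.inet_aton(destination),
    )


header = build_header("10.90.99.142", "10.43.157.92", payload_length=100)
print(len(header))  # 20
```

Because the total length is stored in a 16-bit field, header and data together can never exceed 65,535 bytes.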
The User Datagram Protocol is a means for programs to make use of the IP network with very little additional complexity. It addresses the issue of multiplexing by using port numbers at the source and destination. Because IP only provides packets addressed to specific hosts, there is no way using pure IP to tell a host which program running on it is intended to receive a particular datagram. Therefore the concept of ports is used to sort incoming packets.
A program on a host which wishes to receive data on a particular port notifies the host's operating system which port it is interested in. So long as no other running program has registered interest in that port, any incoming packets to that port are routed to that program on the host for processing.
A port number is a 16-bit number in the range 0-65535. The IANA maintains a list of registered and well-known ports to aid in the smooth operation of the Internet. Since UDP is essentially IP with port numbers, UDP makes no guarantees of delivery or non-duplication. This means a UDP packet may be silently dropped somewhere in the network, or it may be delivered any number of times. It is therefore up to the program which uses UDP to ensure that it properly handles missing and duplicated UDP datagrams.
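The port mechanism can be seen in a short Python sketch using the standard `socket` module. It assumes both programs run on the same machine and talk over the loopback address `127.0.0.1`; the payload and the two-second timeout are arbitrary example values.

```python
import socket

# The receiver registers interest in a port. Port 0 asks the operating system
# to pick any free port, which avoids clashing with another program.
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(2.0)  # UDP gives no delivery guarantee, so never wait forever
host, port = receiver.getsockname()

# The sender addresses its datagram to the receiver's (host, port) pair.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sender.sendto(b"hello", (host, port))

try:
    data, source = receiver.recvfrom(2048)
    print(data, "from port", source[1])
except socket.timeout:
    data = None
    print("datagram was lost")
finally:
    sender.close()
    receiver.close()
```

The timeout branch is the program's own responsibility: nothing in UDP itself will report that a datagram went missing.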
Like UDP, TCP is built on top of IP. The Transmission Control Protocol provides a virtual stream-oriented connection between two hosts. This means that transmitted bytes either arrive in order or do not arrive at all, and that an essentially unlimited amount of data may be transferred. TCP packets are multiplexed and demultiplexed in a similar fashion to UDP packets; that is to say, TCP uses ports as well.
TCP differs significantly from UDP in that it has a far more complicated (and time consuming) setup and teardown process for establishing and breaking connections. This is required in order to implement reliable, ordered transport over the unreliable mechanism of IP. Regardless, for many if not most applications, reliability outweighs performance, at least on the scale of UDP vs. TCP performance.
It is important to note that while there is a client/server relationship during the setup of a TCP session, those original roles may become irrelevant once the session is established. Both sides may transmit and receive simultaneously. A TCP session is therefore effectively two opposing virtual streams operating over the unreliable internet.
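The following Python sketch illustrates this two-way behavior on a single machine, using the standard `socket` module over the loopback address. The helper `receive_exactly` and the message contents are invented for this example.

```python
import socket


def receive_exactly(connection: socket.socket, count: int) -> bytes:
    """Read exactly `count` bytes; a stream has no message boundaries of its own."""
    chunks = []
    while count > 0:
        chunk = connection.recv(count)
        if not chunk:
            raise ConnectionError("connection closed early")
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)


# Server role: listen on a loopback port chosen by the operating system.
listener = socket.create_server(("127.0.0.1", 0))

# Client role: connecting to the listening port triggers TCP's setup process.
client = socket.create_connection(listener.getsockname())
server, _ = listener.accept()

# Once the session is established, either end may send without waiting for the other.
client.sendall(b"from client")
server.sendall(b"from server")

at_server = receive_exactly(server, len(b"from client"))
at_client = receive_exactly(client, len(b"from server"))
print(at_server, at_client)

for sock in (client, server, listener):
    sock.close()
```

After `accept` returns, nothing in the code distinguishes the two ends: each holds a socket it can both write to and read from, which matches the picture of two opposing streams.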