Tuesday, March 11, 2008

Ch. 8 Key Concept-Web Server Hardware and Software

Web Server Basics
Elements of a web server: hardware (computers/related components), operating system software, and web server software.

Types of web sites
The first step in planning a web server is to determine what the company wants to accomplish with the server. Decisions about server hardware and software should be driven by the volume and type of web activities expected. Types of sites include:
-Development sites: simple sites that companies use to evaluate different web designs with little initial investment. A development site can reside on an existing PC running web server software.
-Intranets: corporate networks that house internal memos, corporate policy handbooks, expense account worksheets, budgets, newsletters, and a variety of other corporate documents
-Extranets: allow certain authorized parties outside the company to access parts of the information in the system
-Transaction-processing sites: business-to-business and business-to-consumer electronic commerce sites that must be available 24 hours a day, seven days a week. These sites must have spare server computers for handling high traffic volumes and must run web and commerce software that is efficient and easily upgraded.
-Content delivery sites: deliver content such as news, histories, summaries, and other digital information. Content must be presented rapidly on the visitor’s screen. Sites must be available 24 hours a day, seven days a week, and hardware requirements are similar to those of transaction-processing sites.
Web clients and web servers
When people use their internet connections to become part of the web, their computers become web client computers on a worldwide client/server network. Client/server architectures are used in LANs, WANs, and the web. The client computer requests services from the server. Web browser software, also called web client software, is the software that makes computers work as web clients. Web software is platform neutral: it lets computers running different platforms communicate with each other easily and effectively.
Dynamic content
A dynamic page is a web page whose content is shaped by a program in response to user requests, whereas a static page is an unchanging page retrieved from disk. Delivering static pages requires less computing power than delivering dynamic pages. Dynamic content is nonstatic information constructed in response to a web client’s request, for example by querying a database.
On a web site that is a collection of HTML pages, the content on the site can be changed only by editing the HTML in the pages. This doesn’t allow customized pages to be produced in response to specific queries. To create customized pages, web sites use one of two basic approaches: server-side scripting or a dynamic page generation technology.
Server-side scripting: programs running on the web server create the web pages before sending them back to the requesting web clients as parts of response messages.
Dynamic page generation technologies: server-side scripts are mixed with HTML-tagged text to create dynamic web pages.
The future of dynamic web page generation: critics argue that dynamic page creation technologies do not really solve the problem of dynamic web page generation. In their view, these approaches merely shift the task of creating dynamic pages from people who write HTML code to ASP programmers. The Apache Cocoon Project outlined a more complex model of the web page generation process that identifies four areas of concern (logic, content, style, and management). It lets web page developers divide the work into these four areas of concern, and it breaks the direct connection between logic and style. By separating the logic (the work of programmers) and the style (the work of graphic artists) that are combined in the structure of HTML, web designers could make dynamic web page design easier in the future.
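The contrast between a static page and a page generated by mixing program output with HTML-tagged text can be sketched in a few lines of Python. This is only an illustration using the standard library's string templates, not any particular dynamic page technology; the names STATIC_PAGE, PAGE_TEMPLATE, and render_dynamic_page are made up for this example.

```python
from html import escape
from string import Template

# A static page: the same text is returned for every request.
STATIC_PAGE = "<html><body><h1>Catalog</h1></body></html>"

# HTML-tagged text with placeholders that a server-side program
# fills in for each request (hypothetical template for this sketch).
PAGE_TEMPLATE = Template(
    "<html><body><h1>Results for $query</h1><ul>$items</ul></body></html>"
)


def render_dynamic_page(query, matches):
    """Build a customized page in response to one specific query."""
    items = "".join(f"<li>{escape(name)}</li>" for name in matches)
    return PAGE_TEMPLATE.substitute(query=escape(query), items=items)


print(render_dynamic_page("fruit", ["mango", "papaya"]))
```

The static page never changes unless someone edits its HTML, while the dynamic page differs for every query passed to the function.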
Various Meanings of “Server”
A server is any computer used to provide files or make programs available to other computers connected to it through a network. The software that the server computer uses to make these files and programs available is called server software. Some servers are connected through a router to the internet and can run software, called web server software, that makes files on those servers available to other computers on the internet. When a server computer is connected to the internet and is running web server software, it is called a web server. The server computer that handles incoming and outgoing email is usually called an email server, and the software that manages email activity on that server is frequently called email server software. The server computer on which database management software runs is often called a database server.
Web client/ server communication
A web page containing many graphics and other objects can be slow to appear in the client’s web browser window because each page element (each graphic or multimedia file) requires a separate request and response.
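As a rough illustration of why element-heavy pages are slow, the sketch below counts how many requests a page would need if every embedded object required its own request. It checks only a few common tags and ignores caching and repeated URLs, so it is an estimate under those assumptions; the class and function names are invented for this example.

```python
from html.parser import HTMLParser


class EmbeddedObjectCounter(HTMLParser):
    """Collect the URLs of page elements that need their own request."""

    def __init__(self):
        super().__init__()
        self.resources = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        # Only a few common tags are checked in this sketch.
        if tag in ("img", "script", "embed") and attributes.get("src"):
            self.resources.append(attributes["src"])


def count_requests(html_text):
    """One request for the page itself plus one per embedded object."""
    parser = EmbeddedObjectCounter()
    parser.feed(html_text)
    return 1 + len(parser.resources)


sample = (
    '<html><body><img src="a.gif"><img src="b.gif">'
    '<embed src="clip.mov"></body></html>'
)
print(count_requests(sample))  # 4
```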
Two tier client/server architecture
The basic web client/server model is a two tier model because it has only one client and one server. The message that a web client sends to request a file or files from a web server is called a request message. It consists of three major parts:
-request line (contains a command, the name of the target resource (a filename and a description of the path to that file on the server), and the protocol name and version number)
-optional request headers (contain info about the types of files that the client will accept in response to this request)
-optional entity body (sometimes used to pass bulk info to the server)
When the server receives the request message, it executes the command included in the message by retrieving the web page file from its disk and then creating a properly formatted response message to send back to the client. A server’s response consists of three parts that are identical in structure to a request message: a response header line, response header fields, and an entity body. The response header line indicates the HTTP version used by the server, the status of the response, and an explanation of the status information. Response header fields follow the response header line and return info describing the server’s attributes. The entity body returns the HTML page requested by the client machine.
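The three-part structure of request and response messages can be made concrete with a small sketch. The helper names build_request and parse_response are hypothetical, and the parser assumes a well-formed HTTP/1.1 text message; a real server or browser handles many more cases.

```python
def build_request(command, path, headers=None, body=""):
    """Assemble a request message: request line, headers, entity body."""
    lines = [f"{command} {path} HTTP/1.1"]
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    # A blank line separates the headers from the optional entity body.
    return "\r\n".join(lines) + "\r\n\r\n" + body


def parse_response(message):
    """Split a response into header line parts, header fields, and body."""
    head, _, body = message.partition("\r\n\r\n")
    status_line, *header_lines = head.split("\r\n")
    version, status, explanation = status_line.split(" ", 2)
    headers = dict(line.split(": ", 1) for line in header_lines)
    return version, int(status), explanation, headers, body


request = build_request(
    "GET", "/catalog/index.html",
    {"Host": "www.example.com", "Accept": "text/html"},
)
response = "HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"
print(request)
print(parse_response(response))
```

Here the request line carries the command (GET), the target resource path, and the protocol version, while the parsed response yields the HTTP version, the status code, and its explanation.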
Three tier and N-tier client/server architectures
A three tier architecture extends the two tier architecture to allow additional processing (ex: collecting the info from a database needed to generate a dynamic web page) to occur before the web server responds to the web client’s request. For example, suppose a client requests catalog information about exotic fruit. The client request is formulated into an HTTP message by the web browser, sent over the internet to the web server, and examined by the web server. The web server analyzes the request and determines that responding to the request requires the help of the server’s database. The server sends a request to the database management software to search for, retrieve, and return all information about exotic fruit in the catalog database. The database info flows back through the database management software system to the server, which formats the response into an HTML document and sends that document inside an HTTP response message back to the client over the internet.
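The exotic fruit example can be sketched with Python's built-in sqlite3 module standing in for the database tier. The table layout, the sample rows, and the function names are assumptions made for this illustration; a real site would use a separate database server and a full web server.

```python
import sqlite3
from html import escape


def make_catalog_db():
    """Third tier: an in-memory database standing in for the catalog."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE catalog (name TEXT, category TEXT)")
    db.executemany(
        "INSERT INTO catalog VALUES (?, ?)",
        [
            ("mango", "exotic fruit"),
            ("papaya", "exotic fruit"),
            ("apple", "common fruit"),
        ],
    )
    return db


def handle_request(db, category):
    """Second tier: the web server queries the database, then builds HTML."""
    rows = db.execute(
        "SELECT name FROM catalog WHERE category = ? ORDER BY name",
        (category,),
    ).fetchall()
    items = "".join(f"<li>{escape(name)}</li>" for (name,) in rows)
    body = f"<html><body><ul>{items}</ul></body></html>"
    # The HTML document travels inside an HTTP response message.
    return f"HTTP/1.1 200 OK\r\nContent-Type: text/html\r\n\r\n{body}"


# First tier: the client's request, reduced here to a category name.
print(handle_request(make_catalog_db(), "exotic fruit"))
```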

Software for web servers
Operating systems for web servers
Operating system tasks include running programs and allocating computer resources such as memory and disk space to programs. Open source software is developed by a community of programmers who make the software available for download at no cost.
The performance of one web server differs from that of another based on workload, operating system, and the size and type of web pages served.

Electronic Mail
Email is the most popular form of business communication.
Email conveys messages from one destination to another in a few seconds. One feature of email is that documents, pictures, movies, worksheets, or other information can be sent along with the message itself.
Email drawbacks: annoyance and the amount of time that businesspeople spend answering their email today, about 5 minutes per message.
A computer virus is a program that attaches itself to another program and can cause damage when the host program is activated. The most frustrating and expensive problem associated with email today is unsolicited commercial email, known as spam.
Individual user antispam tactics
-Reduce the likelihood that a spammer can automatically generate an email address: by using an email address that is more complex, individuals can reduce the chances that a spammer can randomly generate their address. A second way to reduce spam is to control the exposure of an email address.
Basic content filtering: all content filtering solutions require software that identifies content elements in an incoming email message that indicate the message is (or is not) spam. Most basic content filters examine the email headers and look for indications that the message might be spam. The software can be placed on individual users’ computers (client-level filtering) or on mail server computers (server-level filtering). The most common basic content filtering techniques are black lists and white lists. A black list spam filter looks for From addresses in incoming messages that belong to known spammers; it can delete such a message or put it into a separate mailbox for review. The biggest drawback to the black list approach is that spammers frequently change their email servers, which means that a black list must be continually updated to be effective. A white list spam filter examines From addresses and compares them to a list of known good sender addresses. It is usually applied at the individual user level, although it is possible to do the filtering at the organization level if the email administrator has access to all individuals’ address books. The main drawback to this approach is that it filters out any messages sent by unknown parties, not just spam.
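The two list-based filters can be sketched as simple set lookups on the From address. The addresses and function names below are invented for illustration, and real filters inspect more header fields than this.

```python
def black_list_filter(from_address, black_list):
    """Flag a message only if its sender is a known spammer."""
    return "spam" if from_address.lower() in black_list else "deliver"


def white_list_filter(from_address, white_list):
    """Deliver a message only if its sender is a known good address."""
    return "deliver" if from_address.lower() in white_list else "review"


# Hypothetical lists for this sketch.
BLACK_LIST = {"offers@spam.example"}
WHITE_LIST = {"colleague@company.example"}

unknown = "new.customer@shop.example"
print(black_list_filter(unknown, BLACK_LIST))  # deliver
print(white_list_filter(unknown, WHITE_LIST))  # review
```

The unknown sender shows both drawbacks at once: the black list lets through any address it has not seen yet, while the white list holds back a legitimate message simply because the sender is new.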
Challenge-response content filtering: one content filtering technique uses a white list as the basis for a confirmation procedure called challenge-response. It compares all incoming messages to a white list. If a message is from a sender who isn’t on the white list, an automated email response is sent to the sender. This message (the challenge) asks the sender to reply to the email (the response). These challenges are designed so that a human can respond easily, but a computer would have difficulty formulating the response. One major drawback to challenge-response systems is that they can be abused. Another issue will arise if challenge-response systems become widespread. Most mail that any individual receives from unknown senders is spam, so a challenge-response system doubles the amount of useless email messages that must be handled by the Internet’s infrastructure.
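The challenge-response procedure can be sketched as a small state machine. As a simplification, the challenge here is a random token that the sender must echo back, whereas a real system would pose a task that is easy for a human and hard for a program. The class and method names are hypothetical.

```python
import secrets


class ChallengeResponseFilter:
    def __init__(self, white_list):
        self.white_list = set(white_list)
        self.pending = {}  # sender -> (token, held messages)
        self.inbox = []

    def receive(self, sender, message):
        """Deliver known senders; hold others and return a challenge token."""
        if sender in self.white_list:
            self.inbox.append(message)
            return None
        token, held = self.pending.get(sender, (secrets.token_hex(4), []))
        held.append(message)
        self.pending[sender] = (token, held)
        return token  # stands in for the automated challenge email

    def respond(self, sender, token):
        """Accept a correct response: white-list the sender, release mail."""
        expected, held = self.pending.get(sender, (None, []))
        if expected is None or token != expected:
            return False
        self.white_list.add(sender)
        self.inbox.extend(held)
        del self.pending[sender]
        return True
```

A sender who answers the challenge once is added to the white list, so later messages from that sender are delivered without another challenge.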
Advanced content filtering: advanced content filters that examine the entire email message can be more effective than basic content filters that only examine the message headers or the IP address of the email sender. When the filter identifies an indicator in a message, it increases that message’s spam “score”. Bayesian revision is a statistical technique in which additional knowledge is used to revise earlier estimates of probabilities. In software that contains a naïve Bayesian filter, the software begins by not classifying any messages. The user reviews messages and indicates to the software which messages are spam and which aren’t. The software gradually learns (by revising its estimates of the probability that a message element appears in a spam message) to identify spam messages.
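A naïve Bayesian filter of the kind described can be sketched as follows. This is a minimal word-count version with add-one smoothing, written for illustration; the class name, the "spam"/"ham" labels, and the whitespace tokenization are assumptions, and production filters use richer message elements and tuning.

```python
import math
from collections import Counter


class NaiveBayesSpamFilter:
    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        """Record the user's verdict ('spam' or 'ham') for one message."""
        self.message_counts[label] += 1
        self.word_counts[label].update(text.lower().split())

    def spam_probability(self, text):
        """Estimate P(spam | words), treating words as independent."""
        total = sum(self.message_counts.values())
        if total == 0:
            return 0.5  # nothing learned yet, so no opinion either way
        vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        log_score = {}
        for label in ("spam", "ham"):
            # Prior from the messages seen so far, with add-one smoothing.
            log_p = math.log((self.message_counts[label] + 1) / (total + 2))
            n_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                count = self.word_counts[label][word]
                log_p += math.log((count + 1) / (n_words + len(vocab) + 1))
            log_score[label] = log_p
        # Convert the two log scores back into a probability,
        # clamping the difference to avoid overflow in exp().
        diff = max(min(log_score["ham"] - log_score["spam"], 700), -700)
        return 1 / (1 + math.exp(diff))
```

Each call to train revises the word counts, which is the "gradual learning" step: the estimated probability that a word appears in spam shifts as the user labels more messages.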
