Welcome to the 667spring06 mini wiki at Scratchpad!


1. Explain Terms Or Concepts

a) Conditional Get

Explain what it is for and how it is used. Show at least 2 examples of HTTP headers, and the difference between the 2 mechanisms [validation & expiration]. Also explain which mechanism you implemented and tested in your server. (16 points)

 The GET method supports a caching mechanism: the browser stores the page in its
 cache and has to make sure the cached page is still fresh.
 Conditional GET (validation) involves the "Last-Modified" response header and the 
 "If-Modified-Since" request header.  
 1) the *server* responds to a request with the Last-Modified date of the file
    a) this can be today's date, the date the file was physically modified, or
    b) totally made up by some lazy 667 student
 2) the browser sends an 'If-Modified-Since' header with that same date 
    on its next request (refresh)
 3) the server decides if the file is newer than the date sent. 
    a) if yes : sends back the whole file (200 OK)
    b) if no  : sends back a 304 Not Modified with no body 
 Expiration involves the "Expires" response header.
 1) the server sends an 'Expires' header telling the browser when the file is no longer valid
 2) on subsequent requests, the browser checks the Expires date of the file and
 -- a) if the date has passed, it requests a new copy of the page/image
 -- b) if not, the browser does not send a request
 The difference: validation still costs a round trip to the server (but not the file
 transfer), while expiration avoids the request entirely until the date passes.
 We implemented validation in our server.  We actually send both the Last-Modified 
 and the Expires headers, but we leave Expires blank.
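The two checks above can be sketched in Java roughly like this. It is only an illustration, not our actual server code: the class and method names (ConditionalGet, statusFor, isFresh) are made up, and it assumes dates arrive in the standard RFC 1123 HTTP format.

```java
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

// Hypothetical helper, not taken from the course server.
class ConditionalGet {
    // HTTP dates look like "Sat, 03 Jun 2006 19:30:00 GMT" (RFC 1123).
    static final DateTimeFormatter HTTP_DATE = DateTimeFormatter.RFC_1123_DATE_TIME;

    // Validation (server side): 304 if the file has not changed since the
    // date sent in If-Modified-Since, otherwise 200 and the whole file.
    static int statusFor(ZonedDateTime lastModified, String ifModifiedSince) {
        if (ifModifiedSince == null) {
            return 200; // first request, nothing to validate against
        }
        try {
            ZonedDateTime since = ZonedDateTime.parse(ifModifiedSince, HTTP_DATE);
            return lastModified.isAfter(since) ? 200 : 304;
        } catch (DateTimeParseException e) {
            return 200; // unreadable date: fall back to sending the file
        }
    }

    // Expiration (browser side): the cached copy may be reused without any
    // request as long as the Expires date has not passed.
    static boolean isFresh(String expires, ZonedDateTime now) {
        if (expires == null) {
            return false;
        }
        try {
            return now.isBefore(ZonedDateTime.parse(expires, HTTP_DATE));
        } catch (DateTimeParseException e) {
            return false; // a blank or invalid Expires counts as already expired
        }
    }
}
```

With a blank Expires header, like the one our server sends, isFresh returns false, so the browser falls back to validation on every request.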

b) Caching

When supporting a "caching mechanism", validation is very important. Why? And what are the two primary categories that support validation, and the related HTTP headers?

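 To make sure you are showing fresh data.
 It is very important because we need to know whether the copy on the server is
 actually newer than the cached page.
 The two primary categories are time-based expiration (date) and size (length=), as in 
 "Last-Modified: 2006-06-03 7:30PM; length=1020"
 (HTTP/1.1 itself defines two validators: Last-Modified, checked with If-Modified-Since,
 and ETag, checked with If-None-Match.)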


(i) There are two ways to detect CGI invocation when using the Apache web server. What are they?

 ScriptAlias: files in this directory are considered scripts
 Modules:  Apache sends all .php files to the PHP processor when they are requested

(ii) What distinguishes the GET and POST methods?

 GET  : No body, and no content length (GET <file> HTTP/1.1)
 POST : Has a body, and content length (POST <file> HTTP/1.1)
        The server writes the content of request.body to the CGI process's
        standard input via proc.getOutputStream().write(...)

(iii) Explain the procedure to invoke a subprocess using Java classes (maybe she means this kind of thing)

 Runtime.getRuntime().exec(" command ", env_variables);
 Every Java application has a single instance of class Runtime that allows the 
 application to interface with the environment in which the application is running.
 The current runtime can be obtained from the getRuntime method.
 This is how an external process is invoked by Java.
 To run work on its own thread, there are three ways that work pretty much the same:
 myClass implements Runnable {}
  Thread t = new Thread(new myClass());
 myClass2 extends Thread {} 
  Thread t = new myClass2();
 myClass3 implements Callable {}
  submit it to an ExecutorService (a Callable cannot be passed to a Thread directly)
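A small sketch of the three threading styles side by side, in case it helps. All the class names here (ThreadStyles, MyRunnable, MyThread, MyCallable) are invented for the example; each task just bumps a shared counter.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical demo class, not part of the course server.
class ThreadStyles {
    static final AtomicInteger counter = new AtomicInteger();

    // 1) A class that implements Runnable is handed to a Thread.
    static class MyRunnable implements Runnable {
        public void run() { counter.incrementAndGet(); }
    }

    // 2) A class that extends Thread overrides run() directly.
    static class MyThread extends Thread {
        @Override public void run() { counter.incrementAndGet(); }
    }

    // 3) A Callable returns a value and is submitted to an ExecutorService.
    static class MyCallable implements Callable<Integer> {
        public Integer call() { return counter.incrementAndGet(); }
    }

    // Runs one task of each style, waits for all of them, and returns
    // how many tasks ran.
    static int runAll() {
        counter.set(0);
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Thread t1 = new Thread(new MyRunnable());
            Thread t2 = new MyThread();
            t1.start();
            t2.start();
            t1.join();
            t2.join();
            pool.submit(new MyCallable()).get(); // get() blocks until call() returns
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
        return counter.get();
    }
}
```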

(iv) How does the web server pass information to the subprocess in case of GET and POST? (16 points)

 Both : Pass information in the environment variables (string array) via : 
   Runtime.getRuntime().exec(" command ", new String[] { "SERVER=blah", "ETC=MORE" });
   Pass all headers sent by the client to the process by prefixing them with HTTP_ 
 GET  : Does not pass info any other way (the query string goes in QUERY_STRING)
 POST : The body of the request is passed to standard input
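Roughly, the environment array and the stdin hand-off could look like the sketch below. CgiLauncher, buildEnv and launch are hypothetical names. The header-to-variable mapping follows the usual CGI convention, where Content-Length and Content-Type become CONTENT_LENGTH and CONTENT_TYPE and every other header gets the HTTP_ prefix.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not taken from the course server.
class CgiLauncher {
    // Builds the "NAME=value" array that is passed to Runtime.exec().
    static String[] buildEnv(String method, String query, Map<String, String> headers) {
        List<String> env = new ArrayList<>();
        env.add("REQUEST_METHOD=" + method);
        env.add("QUERY_STRING=" + (query == null ? "" : query));
        for (Map.Entry<String, String> h : headers.entrySet()) {
            // "User-Agent" becomes "USER_AGENT"
            String name = h.getKey().toUpperCase(Locale.ROOT).replace('-', '_');
            if (name.equals("CONTENT_LENGTH") || name.equals("CONTENT_TYPE")) {
                env.add(name + "=" + h.getValue());
            } else {
                env.add("HTTP_" + name + "=" + h.getValue());
            }
        }
        return env.toArray(new String[0]);
    }

    // Starts the CGI process. For POST the request body is written to the
    // process's standard input; for GET body is null and stdin is just closed.
    static Process launch(String[] command, String[] env, String body) throws IOException {
        Process proc = Runtime.getRuntime().exec(command, env);
        try (OutputStream stdin = proc.getOutputStream()) {
            if (body != null) {
                stdin.write(body.getBytes(StandardCharsets.UTF_8));
            }
        }
        return proc;
    }
}
```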

Proxy Server vs. Cache

Explain how a proxy server and the cache in your browser enhance performance, and the difference between them

 Both enhance performance by not requesting data that they do not need.  This saves on
 transfer time and server processing time.
 Proxies sit between you and the server.  They accept your request and *forward* it
 on.  They may *cache* the pages, but not all of them do.  
 Your browser maintains a local cache of files it has downloaded.  It decides, based on
 Expires, if it should request a new copy.  It sends If-Modified-Since so the server will
 not send files that have not changed, improving perceived performance.

J: the cache in the browser and the cache in a proxy server both save bandwidth and reduce latency.

 cache in browser: only supports an individual user
 cache in proxy server: on the network, supports everyone connected to that proxy.

e) Explain how your web server supports multithreading with JAVA code.

 Our webserver does this by creating a new thread to handle each incoming socket request.  
 When a new request is received, a new instance of the HttpHandler is created
 and the socket is assigned to it.  The server process then starts the thread it just
 created, and waits for another connection.

f) Explain how MIME types are used within HTTP and web applications.

Explain also how your web server supports diverse MIME types.

 MIME types are used to tell either the server or the browser what type of content
 is in the request/response (via the Content-Type header).
 Our server handles diverse MIME types by mapping each file extension to a 
 particular MIME type, or allowing the CGI to set it. 
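The extension lookup can be sketched as a simple map. The class name MimeTypes, the handful of entries, and the application/octet-stream fallback are assumptions for the example; a real server would usually load the table from a config file.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical lookup table, not taken from the course server.
class MimeTypes {
    static final Map<String, String> BY_EXTENSION = new HashMap<>();
    static {
        BY_EXTENSION.put("html", "text/html");
        BY_EXTENSION.put("txt", "text/plain");
        BY_EXTENSION.put("gif", "image/gif");
        BY_EXTENSION.put("jpg", "image/jpeg");
    }

    // Returns the Content-Type for a file path based on its extension.
    // Unknown or missing extensions fall back to a generic binary type.
    static String forFile(String path) {
        int dot = path.lastIndexOf('.');
        if (dot < 0 || dot == path.length() - 1) {
            return "application/octet-stream";
        }
        String ext = path.substring(dot + 1).toLowerCase(Locale.ROOT);
        return BY_EXTENSION.getOrDefault(ext, "application/octet-stream");
    }
}
```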

g) What is a persistent connection? When are connections closed?

 Persistent connections are kept open after a request/response exchange is done. 
 They became the default in HTTP/1.1 to avoid the overhead of opening an
 expensive TCP connection for each request. 
 They are closed when either the browser or the server requests it (Connection: close), or 
 when they time out.
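The keep-open decision can be sketched like this. KeepAlive and shouldKeepOpen are hypothetical names. The rule assumed is the usual one: HTTP/1.1 connections stay open unless someone says "Connection: close", and older versions close unless the client asks for keep-alive. Timeouts are not shown.

```java
import java.util.Locale;

// Hypothetical helper, not taken from the course server.
class KeepAlive {
    // Decides whether the server should keep the socket open after responding.
    static boolean shouldKeepOpen(String httpVersion, String connectionHeader) {
        String conn = connectionHeader == null
                ? ""
                : connectionHeader.trim().toLowerCase(Locale.ROOT);
        if (conn.equals("close")) {
            return false; // explicit request to close
        }
        if (conn.equals("keep-alive")) {
            return true;  // explicit request to stay open
        }
        // No Connection header: persistent by default only in HTTP/1.1.
        return "HTTP/1.1".equals(httpVersion);
    }
}
```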

CGI Script and HTTP

full request message, including body

 POST /cgi-yoon/ HTTP/1.1
 Accept: text/xhtml,text/html,text/plain,*/*
 Content-Length: 17
 Content-Type: application/x-www-form-urlencoded
 -blank line-

Read the Perl script below and explain what it does

Basically, it parses the form and redirects the user to the location indicated by the 'url' form parameter.

 1) reads *from* standard input the # of bytes indicated by the CONTENT_LENGTH
    environment variable
 2) It splits what it read into name/value pairs (delimited by &)
 3) Loops over each pair
 -- 1) splits that into name, value (separator is "=")
 -- 2) replaces each + with a space ( + is how space is encoded by get/post)
 -- 3) decodes hex encoded values
 -- 4) sets the name = the value inside the form data array
 4) gets the 'url' value from the array
 5) redirects the browser (via Location:<url>) to that URL.
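For comparison, the same steps can be sketched in Java. This is not the Perl script from the exam, just a rough equivalent; FormParser and its methods are made-up names, and URLDecoder does the "+ to space" and hex decoding in one call.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical Java equivalent of the form-parsing steps.
class FormParser {
    // Splits "a=1&b=2" into name/value pairs and decodes each side.
    static Map<String, String> parse(String body) {
        Map<String, String> form = new LinkedHashMap<>();
        if (body == null || body.isEmpty()) {
            return form;
        }
        for (String pair : body.split("&")) {
            String[] nv = pair.split("=", 2); // split on the first "=" only
            String value = nv.length > 1 ? nv[1] : "";
            form.put(decode(nv[0]), decode(value));
        }
        return form;
    }

    // Replaces "+" with a space and decodes %XX hex escapes.
    static String decode(String s) {
        try {
            return URLDecoder.decode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Builds the redirect header from the 'url' form value.
    static String redirectHeader(Map<String, String> form) {
        return "Location: " + form.get("url");
    }
}
```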

Using Cookie with HTTP

Request A

The cookie is not sent, because it does not exist yet! The form posts to the CGI file like this:

 POST /cgi-bin/A.cgi HTTP/1.1
 Content-Length: XXX
 Accept: text/html
 Referer: <place>
 ... see previous example...
 -blank line-

Response B

 Server responds with ok, sending the HTML page and all the cookies back.
 HTTP/1.1 200 OK
 Content-Type: text/html
 Content-Length: 2080
 Set-Cookie: last=yoon; <other info>
 Set-Cookie: first=llmi;
 Set-Cookie: country=HKSAR
 Set-Cookie: multi=val1,val2,val3; (array of values)
 -blank line-
 <html><head>...Thank you for your information...</body></html>

Request C

Now the browser sends another normal request, but this time it has the Cookie: header

  GET /cgi-bin/B.cgi HTTP/1.1
  Referer: http://..../A.cgi
  Cookie: last=yoon;first=llmi;country=HKSAR
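On the server side, the value of the Cookie: header has to be split back into name/value pairs. A minimal sketch, with a made-up class name CookieParser and no handling of quoted values:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not taken from the course server.
class CookieParser {
    // Parses the value of a Cookie: request header, e.g.
    // "last=yoon;first=llmi;country=HKSAR", into a name -> value map.
    static Map<String, String> parse(String headerValue) {
        Map<String, String> cookies = new LinkedHashMap<>();
        if (headerValue == null) {
            return cookies;
        }
        for (String part : headerValue.split(";")) {
            String[] nv = part.trim().split("=", 2);
            if (nv.length == 2 && !nv[0].isEmpty()) {
                cookies.put(nv[0], nv[1]);
            }
        }
        return cookies;
    }
}
```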


What are the benefits of using JavaScript on the client side? (How does JavaScript contribute to improving the performance/usability of a web application?) List two.

 1) Advanced functionality (form validation, AJAX)
 2) Image rotation, other effects
 3) Takes some load off the server 
 4) Allows quick feedback to user input, no round trip required
 5) someone add more, cuz I hate JavaScript

Session Tracking

What is it, why is it necessary

A session is data that is stored by the server, tied to a particular client, and can span multiple connections. It is usually implemented as a key stored in a cookie, which the server uses to look up that client's data.

It is used for shopping carts because

  • it lets you store data securely on the server (prices, item id) to prevent tampering
  • cookies are of limited size
  • keep sensitive user information (cc#, login name, etc) out of easily read cookies

How to create a session using a cookie

When a connection is made and no 'session' cookie is found, the CGI program should send one back

 Set-Cookie: ASPNET_SessionId=12351616416

The CGI can then store information in a database/file/memory that references that particular key. On all subsequent requests the browser will send this key back to the server:

  GET /somepage.aspx HTTP/1.1
  Cookie: ASPNET_SessionId=12351616416

And then the CGI can retrieve any data it has stored for that session.
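A very small in-memory version of this idea, as a sketch only. SessionStore is a made-up class, the id is a random UUID instead of the number shown above, and a real store would also expire old sessions.

```java
import java.util.Map;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory session store.
class SessionStore {
    private final Map<String, Map<String, String>> sessions = new ConcurrentHashMap<>();

    // Creates a new, empty session and returns its key.
    String create() {
        String id = UUID.randomUUID().toString();
        sessions.put(id, new ConcurrentHashMap<>());
        return id;
    }

    // Returns the data for the key sent back by the browser,
    // or null if the key is missing or unknown.
    Map<String, String> lookup(String id) {
        return id == null ? null : sessions.get(id);
    }

    // The header sent to the browser when no session cookie was found.
    static String setCookieHeader(String id) {
        return "Set-Cookie: ASPNET_SessionId=" + id;
    }
}
```

Only the key travels in the cookie; the cart contents, prices and so on stay in the map on the server.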

Important Examples of Cookies (2)

 1) Storing login information & status so users do not need to re-login
    every single page request (unsafe though)
 2) Tracking which ads someone has viewed via a tracking cookie
 3) Storing user preferences ( background color, etc) 
 4) Allowing a user to return at a later date and still have the same
    info present (go back to gmail,  you are still logged in)

Limited size cookie.txt, why

There are several reasons

  • Do not want to fill up the hard drive (malicious websites)
  • Keep request size down. We don't want to send many huge cookies with EVERY request!

Missing #6

Missing #7

How do you handle Authentication

Explain how your server supported authentications in handling headers & setting the secure directory, and responses. (Focus on your explanation on your own server rather than Apache or other web server.)

  • Our server takes each request and checks to see if a .htaccess file exists in the directory the file lives in, or any one of its parent directories.
  • If it does, it loads the HTAccess and parses it & associated users file.
  • It checks the Authorization: header passed by the browser to see if the user entered a name/pwd
  • if yes
    • checks to see if the user exists, has same pwd in users file
      • if yes: processing continues as if nothing had happened
      • if no: throws a 403 exception, returning the 'access denied' page
  • if no
    • sets required headers (WWW-Authenticate: Basic realm="Password Required")
    • then throws a 401 exception, which returns the standard error page with 401 status
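The decision tree above can be sketched as one method that returns the status code. BasicAuth and check are hypothetical names, and the users map stands in for the parsed users file. Like the description above, it compares plain-text passwords, which is a simplification.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Map;

// Hypothetical helper, not taken from the course server.
class BasicAuth {
    // Returns 200 if the credentials match, 401 if none were sent
    // (the browser should prompt), 403 if they were sent but are wrong.
    static int check(String authorization, Map<String, String> users) {
        if (authorization == null || !authorization.startsWith("Basic ")) {
            return 401;
        }
        String decoded;
        try {
            // The header carries base64("name:pwd") after the "Basic " prefix.
            byte[] raw = Base64.getDecoder().decode(authorization.substring(6).trim());
            decoded = new String(raw, StandardCharsets.UTF_8);
        } catch (IllegalArgumentException e) {
            return 401; // not valid base64, treat as missing credentials
        }
        int colon = decoded.indexOf(':');
        if (colon < 0) {
            return 403;
        }
        String user = decoded.substring(0, colon);
        String pwd = decoded.substring(colon + 1);
        return pwd.equals(users.get(user)) ? 200 : 403;
    }
}
```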

How to Handle Multiple Requests

Briefly explain how you design your server to handle multiple requests. Please specify by drawing a class diagram and simple flow chart to show what class you design as a multithreaded class and when each thread is launched. How can you test it?

 Our Worker implements Runnable
 When a request comes in, a new worker is created and the socket 
 and conf file are assigned to it.
 The worker is then handed off to a thread pool
     10 waiting threads, queue of 10
     thread pool assigns the runnable to one of the workers & executes
 Control returns to the main server
     request is still being processed
     waits for new connection
       starts again at the beginning
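The flow above might look roughly like this with java.util.concurrent. It is a sketch under assumptions: WorkerPool and serve are invented names, the Worker here ignores the conf file and just sends an empty 200 response, and a real worker would read and parse the request first.

```java
import java.io.IOException;
import java.net.ServerSocket;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch of the pool and accept loop.
class WorkerPool {
    // Stand-in for the real Worker: answers and closes the socket.
    static class Worker implements Runnable {
        private final Socket socket;
        Worker(Socket socket) { this.socket = socket; }

        public void run() {
            try (Socket s = socket) {
                s.getOutputStream().write(
                    "HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n"
                        .getBytes(StandardCharsets.US_ASCII));
            } catch (IOException e) {
                // client went away, nothing to do
            }
        }
    }

    // 10 threads, plus a queue holding up to 10 waiting workers.
    static ThreadPoolExecutor create() {
        return new ThreadPoolExecutor(10, 10, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<Runnable>(10));
    }

    // Main server loop: accept, hand off to the pool, go back to waiting.
    static void serve(ServerSocket server, ThreadPoolExecutor pool) throws IOException {
        while (!server.isClosed()) {
            Socket socket = server.accept();  // blocks until a connection arrives
            pool.execute(new Worker(socket)); // returns right away
        }
    }
}
```

With a bounded queue, the pool rejects new workers once all 10 threads are busy and 10 more are queued, so a real server would also need to decide what to do in that case.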
How did we test it?

We used a very simple method :

  • there is a {static boolean shouldBlock} on the worker.
  • The first time a request is made this evaluates to 'true'
    • the thread enters a very long loop
    • this effectively keeps this thread 'waiting'
    • couldn't find a way to do an elegant 'sleep', so just a huge loop
  • during this time, we make a second request.
  • if multi-threading works, the second request should be processed
    • the first request will still be waiting for the timer to expire
    • eventually it will get to do what it was supposed to.
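The same kind of check can be sketched without a busy loop by parking the first task on a CountDownLatch (Thread.sleep would also work). This is an alternative to what we did, not our actual test; ConcurrencyCheck and the half-second timeout are assumptions.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical test helper.
class ConcurrencyCheck {
    // Returns true if a second task finishes while the first one is
    // still blocked, which only happens if the pool is multi-threaded.
    static boolean secondRunsWhileFirstBlocks(ExecutorService pool) {
        CountDownLatch release = new CountDownLatch(1);
        CountDownLatch secondDone = new CountDownLatch(1);

        // First "request": blocks until the test releases it.
        pool.execute(() -> {
            try {
                release.await();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        // Second "request": finishes immediately if it gets a thread.
        pool.execute(secondDone::countDown);

        try {
            return secondDone.await(500, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            release.countDown(); // let the first task finish
        }
    }
}
```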

Web Browser -> Proxy -> Cgi Graph

10. Using a graph-like notation, explain the relationships between a Web Server, Browser, Proxy Server and CGI script. (Show clearly where documents (A.html, B.cgi, C.DB) are stored, and where requests and responses are sent.) (15 points)

CGI Script & HTTP (60 pts )

Please write down a Full Request Message (include a body if necessary) generated by browser when the Submit Query button is clicked. (Provide 3 proper headers) (20 points) 'is this even valid html?' =>
 <FORM METHOD="POST" ACTION="/cgi-bin/post-query">
    Which School would you like to apply to?
    <SELECT NAME="school" SIZE=5>
        <OPTION> Letters&Science</option>
        <OPTION SELECTED> Engineering<OPTION> Business
        <OPTION>Law<OPTION> Medicine</SELECT> 
    What semester do you wish to start?
    <SELECT NAME="semester">
       <OPTION> Spring<OPTION> Summer</SELECT>
 POST /cgi-bin/post-query HTTP/1.1
 Content-Length: 100
 Content-Type: application/x-www-form-urlencoded
 User-Agent: Mozilla 
 Accept: text/xml, text/html, text/plain, */*
 Accept-Encoding: deflate, gzip
 -blank line-

Explain what the Python script does

This CGI script prints out a list of all environment variables in a table, with rows in alternating colors.

 import cgi                --------- (1)
 #  imports the CGI module, to handle escape(), reading input, etc
 def printHeader( title ):  -------- (2)
 #  defines a function that will print the HTML header with a specified title
 for item in os.environ.keys():   ---------  (3) 
 #  loops over each 'key' in the environment collection
 if rowNumber % 2 == 0: # even row numbers are white  ----- (4)
 #  alternates the color between white/gray.  1st is gray
 print """<tr style = "background-color: %s">----- (5)
 #  Prints the Table Row with the background color, containing the 
 #  environment key & value pair

Describe what this program does

Basically, it prints out a form that asks you to enter a word. When you click submit, it POSTs back to itself. In the script it checks to see if a form was submitted with the 'word' parameter from the form it printed out before. If it is there, it echoes that word back to you.

 1) defines a printHeader method to print out the start of the html
 2) Calls it to print out the page & title
 3) Prints out a form that asks you to enter a favorite word
 4) Checks the CGI form collection
      if the form contains the key 'word'
        this means you submitted the form that it printed in (3)
        it prints out the word you entered
 5) ends the html
