An approach for robust multilingual parsing

Posted on: 2005-10-12
Degree: M.Sc
Type: Thesis
University: Queen's University at Kingston (Canada)
Candidate: Synytskyy, Mykyta
Full Text: PDF
GTID: 2454390008994417
Subject: Computer Science
Abstract/Summary:
Since the World Wide Web first became popular, websites have grown dramatically in size, complexity, and the number of technologies they use. Maintenance of such large websites not only poses the problems inherent to large software systems, but also presents unique challenges that other software systems do not. Web documents contain not only HTML, which is potentially ill-formed, but also instructions in several different programming languages. Because of these complexities, maintaining large dynamic websites can be an extremely difficult task. It is therefore imperative to develop an approach to website analysis that can cross language boundaries to extract design-level artifacts from website source code.

This thesis demonstrates a parsing approach that is suitable for website analysis. The approach uses an extensible island grammar to robustly process multilingual software artifacts. The analysis conducted on the parse trees is not inhibited by language or client/server boundaries, and can be used to extract design information from all of the code at once. The thesis also shows the use of this approach for two common software maintenance tasks: (1) refactoring a website to reduce or eliminate excessive cloning, with a view to making the website more maintainable; (2) discovering the database aspect of the website, and performing webpage-to-database-to-webpage impact analysis.
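The island-grammar idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the thesis's grammar or tooling; it is a regex-based approximation in Python, assuming two hypothetical island kinds (`<% ... %>` server-side blocks and `<script>` client-side blocks). Everything that matches no island pattern is kept as uninterpreted "water", so ill-formed HTML does not make the scan fail.

```python
import re

# Hypothetical island patterns (assumptions for illustration only):
# each names one language that may be embedded in a web page.
ISLANDS = [
    ("server_script", re.compile(r"<%(.*?)%>", re.DOTALL)),
    ("client_script", re.compile(r"<script\b[^>]*>(.*?)</script\s*>",
                                 re.DOTALL | re.IGNORECASE)),
]

def parse_islands(source):
    """Split source into (kind, text) segments; unmatched text is 'water'."""
    segments = []
    pos = 0
    while pos < len(source):
        # Pick the island that starts earliest at or after the current position.
        best = None
        for kind, pattern in ISLANDS:
            match = pattern.search(source, pos)
            if match and (best is None or match.start() < best[1].start()):
                best = (kind, match)
        if best is None:
            # No more islands: the remainder is water.
            segments.append(("water", source[pos:]))
            break
        kind, match = best
        if match.start() > pos:
            # Text before the island is water, even if it is ill-formed HTML.
            segments.append(("water", source[pos:match.start()]))
        segments.append((kind, match.group(1)))
        pos = match.end()
    return segments

page = "<p>Hi<b><% x = 1 %><script>alert(1)</script>tail"
for kind, text in parse_islands(page):
    print(kind, repr(text))
```

Adding a language in this sketch means appending one pattern to `ISLANDS`, which loosely mirrors the extensibility the abstract describes. A real island grammar would parse each island with a full grammar for its language instead of capturing it as raw text.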
Keywords/Search Tags:Web, Approach