Legal research
Legal Aid Case Based Reasoning Retrieval Information Systems
In "Legal Aid Case-Based Reasoning Retrieval Information Systems", Kevin Curran and Lee Higgins provide us with useful insight into the more technical aspects of legal research. Curran and Higgins explain that modern electronic legal research systems face two important but related challenges, namely, how to improve the efficiency of a lawyer's research tasks and how to make legal research systems widely and publicly available to lawyers. Curran and Higgins investigate some of the problems with electronic legal databases and argue that case-based reasoning principles can be used to tackle these problems. They argue that the efficiency and usefulness of legal research can be improved by indexing material according to the issues involved rather than on a keyword basis.
Kevin Curran (BSc, PhD) is a Lecturer at the University of Ulster in Northern Ireland. He has written over 80 academic research papers on areas such as distributed computing, emerging trends within wireless ad-hoc networks, dynamic protocol stacks, and mobile systems. Lee Higgins is an IT consultant working within the field of telecommunications. His research interests include law and legal aid decision support systems.
1 Introduction
Since the early 1990s, there has been a marked increase in demand for legal services in Northern Ireland, particularly in the area of commercial law.[1] At the same time, empirical studies indicate that an increasing number of cases are being decided in negotiations rather than being determined by the courts.[2] The result of this, combined with a dramatic decrease in the availability of legal aid, the growth in popularity of contingency fees and a sharp increase in the cost of specialised legal services is that clients' rights in specialised areas are more frequently being represented by lawyers not specialised in particular fields. For such professionals to represent their clients effectively, they need access to a wide range of up-to-date legal materials in order to complete their legal research tasks. However, at the same time, the large and continuous increase in the volume of such legal materials means that the problem for non-specialist lawyers is not just one of information availability but also one of efficient information retrieval. The highest legal authorities in Northern Ireland have recognised that information technology (IT) can and should play a role in this area.[3]
The Internet has been used by lawyers as both a general business tool and, more specifically, as a research tool for some time. Indications are that this trend is set to increase and extend to all areas of legal practice. It should, therefore, be cultivated and exploited.[4] Nevertheless, while the Internet remains a powerful and efficient means of exchanging and communicating information, the standard web-site approach of obtaining documents through predetermined hyperlinks or via keyword searches through search engines is unlikely to be particularly helpful in satisfying the sophisticated information needs of lawyers. It is crucial to recognise that most modern legal information retrieval applications (whether web-based or not) are, for some basic reasons, failing to meet the requirements of non-specialist lawyers. The current situation poses two distinct but related challenges for IT, namely:
how to improve the efficiency of a lawyer's research tasks; and
how to make legal research systems widely and publicly available to lawyers.
This article aims to investigate the problems inherent in existing approaches to information retrieval and demonstrate how simple but powerful case-based reasoning (CBR) principles can be used to meet the challenges outlined above.
2 Research task
2.1 The lawyer's research task
The overriding goal for the lawyer is to access potentially relevant legal resources that can help the lawyer better understand and address the relevant legal issues. A lawyer inexperienced in a particular field may approach researching a problem as follows:
The lawyer might familiarise himself or herself with the general area and the legal issues involved by consulting textbooks or leading cases in the area;
From this the lawyer will hope to get a general understanding of the type of facts the lawyer is looking for;
Having elicited facts which could potentially lead to a successful outcome, the lawyer needs to find decided cases which describe and discuss the legal issues raised by these facts. It might prove helpful if the lawyer can find decided cases which closely match the case at hand (Problem Case), since it is likely that similar issues will have been discussed or illustrated. However, even if 'similar' cases are discovered, the lawyer will need to appreciate how these cases can be distinguished from the Problem Case. The lawyer may also need to find cases which, on the facts, would be distinguished from the Problem Case; and
Once the lawyer has completed this work the lawyer will use the results to begin the legal reasoning process and, in doing so, will inevitably need to revisit these research tasks.
Overall this approach can be exceedingly time-consuming and wasteful whether carried out manually in a law library or through the use of electronic databases.
2.2 Improving the efficiency of the research task
Making information publicly available, that is, via the Internet, obviously overcomes the problem of obtaining sources which have been identified as relevant. However, efficiently identifying materials available on the Internet that could be relevant to a particular case can be difficult. The efficiency of a lawyer in this context could be greatly improved if an information retrieval system could boast, at the least, the following functionality:
The system has an interface, designed with the guidance of a legal expert, which 'walks' the lawyer through the various possible issues in the case through a series of yes/no questions that builds a profile of the Problem Case;
Once a basic profile is built up, the system indicates (through a process of basic pattern matching) the important cases in the field which best match the profile of the Problem Case and which serve as a springboard into more intensive and informed research;
If the system indicates which cases best match the Problem Case, it should explain how this match occurs and also indicate how the retrieved cases are distinguishable from the current case;
Lawyers search by concept (i.e. legal issues) not potentially random keywords. Any index of the document repository allows the lawyer to find, inter alia: the leading case on a given issue; the latest case on a given issue; important cases where a given issue is discussed; and also, articles where a given issue is discussed; and
When looking at a given case, the lawyer can quickly identify: other cases where the case was distinguished; cases similar to the Problem Case; and articles where the actual case is discussed.
3 Legal Information Retrieval Systems
3.1 Legal Databases
The most widely used research tool is the electronic database. Most legal databases exist in CD-ROM format usually purchased through a one-off subscription payment. However, with the growth in networked computing and more specifically, the Internet, an increasing number of databases are available on-line. On-line databases are capable of immediate updates and are paid for on an 'as-you-use' basis. Examples of this latter category include LEXIS [5] and Smith Bernal's[6] on-line case bases.
Usually, the full text of documents or an abstract of the full document is stored in the database and the index to the database is built on this. While a variety of indexing possibilities exist, searching (by text or an index referencing the text) is most commonly carried out using keywords and Boolean processing. This functionality may be extended through the use of truncation or thesauri services (used, for example, in the KLUWER database). [7] Users may also have the option of searching within specific fields such as date or jurisdiction or even presiding judge(s).[8] A growing number of tools now offer the ability to limit search results by legal topic and a common trend is to include hypertext links within retrieved documents so that users can easily access other materials referenced in the retrieved text.[9]
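The keyword-and-Boolean style of search described above can be illustrated with a short sketch. The three documents and the helper names (`build_index`, `lookup`, `search_and`) are invented for illustration and do not correspond to any of the databases cited; a trailing `*` stands in for truncation.

```python
import re
from collections import defaultdict

# Hypothetical miniature "database": document id -> full text.
DOCS = {
    "D1": "The director was removed from the board of the company.",
    "D2": "The board game company sold shares to a new director.",
    "D3": "Shareholders petitioned after the directors breached their duties.",
}

def build_index(docs):
    """Index every word of the full text (the free-text approach)."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            index[word].add(doc_id)
    return index

def lookup(index, term):
    """Documents containing a term; a trailing '*' matches any word ending."""
    term = term.lower()
    if term.endswith("*"):
        stem = term[:-1]
        hits = set()
        for word, ids in index.items():
            if word.startswith(stem):
                hits |= ids
        return hits
    return set(index.get(term, set()))

def search_and(index, *terms):
    """Boolean AND: documents that match every term."""
    results = [lookup(index, t) for t in terms]
    return set.intersection(*results) if results else set()

index = build_index(DOCS)
print(search_and(index, "director", "board"))   # D1 and D2 both match
print(search_and(index, "director*", "dut*"))   # only D3 matches
```

The first query returns D2 alongside D1 simply because both contain the two words, although D2 has nothing to do with a director's removal from a board.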
3.2 Problems with legal databases
It goes without saying that electronic legal databases have greatly improved the efficiency of legal research, at least in so far as they make materials more freely available. However, some basic problems with these tools still exist, which reduce their usefulness to lawyers, particularly if they are unfamiliar with a specific area of law. The fundamental problem derives from the combination of free-text and Boolean processing as a means of searching and retrieval. [10] Usually, all the words in the full-text (or abstract) are indexed. Given the open-text nature of law, this means that while relevant documents may be returned on a keyword search, they are subsumed in a wealth of irrelevant material (that is, levels of recall and, more importantly, relevance are deceptively low) and the user must sift through these to identify useful materials. With the volume of legal materials rapidly increasing, the number of random and meaningless associations made on a keyword search is likely to increase (despite some progress in limiting searches to a particular legal area). Furthermore, Matthijssen[11] has noted that for the optimal use of text retrieval systems users must:
know and be able to clearly articulate their information need; and
given that information represented in an index is based on the contents of a database, know the content and storage structure of the documents in the database.
Expressing an information need satisfactorily in Boolean terms has proved difficult for lawyers and assembling and applying effective search requests remains a specialist job. [12] Lawyers who are unfamiliar with a particular area of law are unlikely to know what 'keywords' will be of use. It is also important to note that lawyers do not formulate their information needs in terms of 'keywords' but instead use abstract concepts such as legal issues involved in a case. Furthermore, lawyers are unlikely to be familiar with the complexities of content and storage. The result is that lawyers wishing to make effective use of the database must overcome a 'conceptual gap' - translating their information needs formulated in legal terms into a query in technical database terms - which distorts the semantics of their request.[13]
3.3 Improving the efficiency of legal databases
We suggest that the problems with legal databases outlined above might be rectified by the following simple means:
Free-text should not be used as the basis for an index. Instead, an index should be created which sits on top of the text and acts as a type of document management system, providing intelligent guidance to the relevant texts;
To minimize the likelihood of returning irrelevant material, an initial effort should be made to compile a human-created index;
To make the index more efficient, it should be based on the user's task-domain and structured so that it can be easily searched to meet the most likely information needs; and
The interface to this database should hide the complexities of the index and allow queries to be made through the user's 'own language', to avoid the 'conceptual gap' described above.
3.4 AI-Legal Applications
A goal far more ambitious than speeding up a lawyer's research process is the use of artificial intelligence (AI) techniques to emulate the substantive legal jobs performed by a legal expert. If this goal could be achieved then the results would not only show relevant documents, but also provide guidance on how to use the retrieved material. AI machines were initially developed to provide solutions to a legal problem as would a real-life legal expert. These systems are known as legal expert systems (legal EBS). However, AI systems have moved from this expert solution goal and now instead aim only to incorporate legal knowledge with a view to providing guidance or 'decision support' to lawyers. These systems are referred to as legal knowledge-based systems (legal KBS) or legal decision support systems (legal DSS). [14] Most systems in this field have adopted techniques based on one of two dominant legal theoretical paradigms - a positivistic or a realistic view.
Rule-based systems basically adopt a positivistic view of the law as a determined set of rules. In such systems, the law is symbolically encoded as a set of production (if/then) rules (Production Rules) which are manipulated through a process of forward (or, less commonly, backward) chaining with rules being fired depending on the input facts of the Problem Case. This approach is generally unsuitable for the more flexible areas of law.
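Forward chaining over Production Rules can be sketched in a few lines. The rule contents and fact names below are hypothetical placeholders chosen only to show the mechanism; they are not a statement of the law and are not taken from any system discussed here.

```python
def forward_chain(facts, rules):
    """Fire if/then rules repeatedly until no new conclusion can be added."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # A rule fires when all of its premises are already established.
            if conclusion not in known and premises <= known:
                known.add(conclusion)
                changed = True
    return known

# Hypothetical rules: (set of premises, conclusion).
RULES = [
    (frozenset({"lost_board_position", "informal_agreement"}), "expectation_defeated"),
    (frozenset({"expectation_defeated", "clean_hands"}), "petition_may_succeed"),
]

derived = forward_chain(
    {"lost_board_position", "informal_agreement", "clean_hands"}, RULES
)
print(sorted(derived))
```

Removing a single input fact (for example `informal_agreement`) stops both rules from firing, which illustrates how rigidly such a system depends on its encoded premises.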
Other systems adopt a more realistic view of the law and place emphasis on recognizing that an important component of legal reasoning is identifying precedents for decisions in particular circumstances. CBR basically involves reasoning from collected examples of previous problem-solving experiences,[15] these experiences being actual or hypothetical legal decisions. Typically, cases are represented as frames, with slots representing factors or legal issues. Cases may then be compared and analogies created and manipulated on the basis of the presence or absence of factors. Complex weighting algorithms may be incorporated into the process to help determine the difficult issue of similarity of cases.
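A frame with slots, and one very simple way of weighting factors when comparing two frames, might look as follows. The `CaseFrame` class, the slot names and the weights are all assumptions made for illustration; real CBR systems use considerably more elaborate similarity measures.

```python
from dataclasses import dataclass, field

@dataclass
class CaseFrame:
    """A case as a frame: each slot records whether a factor is present."""
    name: str
    slots: dict = field(default_factory=dict)  # factor name -> True/False

def weighted_similarity(a, b, weights):
    """Share of total weight carried by factors on which both cases agree."""
    shared = [f for f in weights if f in a.slots and f in b.slots]
    total = sum(weights[f] for f in shared)
    if total == 0:
        return 0.0
    agree = sum(weights[f] for f in shared if a.slots[f] == b.slots[f])
    return agree / total

# Hypothetical weights: a heavier factor counts for more in the comparison.
WEIGHTS = {"lost_livelihood": 2.0, "clean_hands": 1.0, "informal_agreement": 1.0}

problem = CaseFrame("Problem Case", {
    "lost_livelihood": True, "clean_hands": True, "informal_agreement": True,
})
stored = CaseFrame("Stored case", {
    "lost_livelihood": True, "clean_hands": False, "informal_agreement": True,
})

print(weighted_similarity(problem, stored, WEIGHTS))  # 0.75
```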
This CBR approach may be employed at several levels. Retrieved cases can be used to form the basis of an argument for the Problem Case, or as the input into algorithms for constructing legal arguments using cases. Increasingly, the CBR approach is being used within information retrieval on a large body of materials (usually existing free-text case repositories). At the simplest level, such techniques are used for basic information retrieval using factors and a matching process to select materials likely to be relevant to the user. Increasingly, both techniques (Production Rules and CBR) are employed in hybrid systems. Popple's SHYSTER system,[16] for instance, treats its Production Rules and CBR systems as co-reasoners, each capable of operating on its own. A number of heuristics control how the two work together. For example, some of the heuristics concern how to "broaden" a near miss rule (i.e. one in which all but one conjunct can be established): it uses CBR to find cases where the rule did not fire, but the consequent of the rule still held; and it uses CBR to find cases where the rule did fire and points out the similarities between those cases and the present case.
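The idea of a near miss rule, and of using stored cases to support broadening it, can be shown with a toy sketch. This is not SHYSTER's code or its actual heuristics; the function names, the rule and the stored cases are all hypothetical.

```python
def missing_conjuncts(rule_premises, facts):
    """Premises of the rule that the given facts do not establish."""
    return set(rule_premises) - set(facts)

def is_near_miss(rule_premises, facts):
    """A near miss: all but one conjunct of the rule can be established."""
    return len(missing_conjuncts(rule_premises, facts)) == 1

def cases_supporting_broadening(rule_premises, consequent, stored_cases):
    """Stored cases where the rule would not fire, yet its consequent held."""
    return [
        name
        for name, (facts, outcomes) in stored_cases.items()
        if missing_conjuncts(rule_premises, facts) and consequent in outcomes
    ]

# Hypothetical rule and case base: name -> (facts found, outcomes reached).
PREMISES = {"lost_board_position", "informal_agreement", "clean_hands"}
CONSEQUENT = "relief_granted"
STORED = {
    "Case A": ({"lost_board_position", "informal_agreement"}, {"relief_granted"}),
    "Case B": (set(PREMISES), {"relief_granted"}),
    "Case C": ({"lost_board_position"}, set()),
}

problem_facts = {"lost_board_position", "informal_agreement"}
print(is_near_miss(PREMISES, problem_facts))
print(cases_supporting_broadening(PREMISES, CONSEQUENT, STORED))
```

Here only Case A qualifies: the rule would not have fired on its facts, but the consequent was reached anyway.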
3.5 Problems With AI-Legal Applications
Despite the intensive and laborious research conducted into AI machines, the machines have largely failed to attain their goals. In fact, very few have made the transition from research ventures to applied systems. This failure is due to fundamental problems both at the theoretical level and the practical level.
First, all such systems involve the creation of a model of the legal domain - referred to as an 'ontology'. The overriding goal here is one of representing the knowledge in a manner that is computer encodeable but accurately reflects the meaning of the original source material. Making this knowledge computer encodeable almost always involves viewing the law as a fixed set of rules. However the law is, of course, slightly more complex than this. The law is not self-contained and autonomous; instead its meaning must be interpreted in the light of many implicit and ever-changing assumptions in the political and social context. It is seriously doubted whether current technologies can handle such a complex model. It follows that this process of isomorphism is yet to be achieved and representing legal reasoning in a computer encodeable form involves a certain distortion of the subject material.
Secondly, given the work involved in building a satisfactory model and the complexity of the underlying system, AI machines are costly to develop, and extremely difficult to maintain and update (ease of maintenance being one of the cornerstones of any applied system). [17] Developing intelligent systems that can easily handle change is no trivial matter and this problem is certainly worse if we accept that the law is changeable.[18] Furthermore, unlike other areas of AI, the complexity involved in automating or providing support for legal reasoning means that no generic commercial shells are available and most systems (capable of covering only one or two legal problem areas) must be built from scratch.[19] Another issue is whether such machines have a large enough target audience to justify the massive effort required to build them.
Thirdly, AI applications (whether EBS, KBS or DSS) fail to recognize the realities of legal practice because they tend to place too much emphasis on the law as an entity embodied in written texts rather than as the product of an oral tradition. Computer technologies should therefore assist with mechanical research and retrieval tasks and not delve into the more creative (and less certain) task of legal reasoning.
Furthermore, to make sense of the complicated output they produce, the user must already have considerable knowledge of the target area of the law and sophisticated IT skills.
Finally, the complex reasoning strategies and output that a successful AI application would produce are likely to be of use only in cases decided in the highest courts (about 1% of cases).[20] Most lawyers, especially in the lower courts and in negotiations or mediations, only consult case materials to understand the basics of the law and to find illustrations of situations which might justify the client's pleas.
Attempts to automate legal reasoning have not proved very successful as yet. This is not to say that the automation of legal reasoning is impossible nor that work conducted in this area has been in vain. But given the changeable nature of the law, the difficulties in modelling legal reasoning accurately and the small number of potential users, such machines would not be cost-effective. That said, if we keep in mind the limitations of these systems (i.e. they are only sophisticated pattern matchers [21] and our development efforts should reflect this) then we can derive some use from this particular field of research. These limitations should also be clearly communicated to users.[22] The CBR process of comparing cases based on the notion of factors is relatively easy to replicate. It is also quite useful to, and a common strategy adopted by, lawyers who use it not for any substantive purpose of legal reasoning but to identify cases that could help them better understand the issues involved in a particular case.
Bearing this limited goal in mind, our example system shall attempt to implement some form of basic pattern-matching mechanism.
4 Modelling
In order to make legal resources susceptible to treatment by an information retrieval application, the target area of the law needs to be represented in a computer searchable fashion. Basically, this involves creating a 'model' of the legal area. Understanding the model is crucial to understanding how our proposed application works. The representation should remain simple, easy to update or alter, efficient, easy to understand, and intuitive to lawyers.
One common approach used by lawyers researching an area of the law is to describe the area of the law in terms of factors or legal issues which arise in the area, then analyse a Problem Case and express their information needs in terms of the legal issues involved. This simple strategy also forms the fundamental building blocks of the most praised legal CBR systems - HYPO [23] and CATO.[24] Basically, a legal case can be described in terms of the factors it exhibits. Each factor can either:
At the simplest level, factors can be binary, that is, yes or no. At a more sophisticated level, factors can be quantifiable having a strength and direction (we call this category of factors 'dimensions'). This is a simple prototype application so we shall stick to simple binary factors. We will use, as an example for our model, section 459 of the Companies Act 1985 in the context of a quasi-partnership. [25] For demonstration purposes, we will determine the factors based on the classification found in the most popular textbooks.[26] The quality of the model will greatly influence the quality of the results, so in any polished system the modelling task should be an intensive one carried out by a lawyer experienced in the field. In our example, there are several identifiable factors or legal issues, namely:
Factor 1: Has the plaintiff lost his position on the board of directors?
Factor 2: Has the plaintiff lost his livelihood?
Factor 3: Has the plaintiff come to court with clean hands?
Factor 4: Has the defendant acted mala fides?
Factor 5: Does an informal agreement exist between the parties?
Factor 6: Do the articles confer special rights?
Factor 7: Has there been a breach of director's duties?
Factor 8: Would a successful petition not harm the company?
A case can be described in terms of the presence or absence of these factors or alternatively in terms of the factors it discusses. As we are using binary factors, if a factor is present in a case it is called a p-factor. If a factor is not present, it is called a d-factor. For example, if the facts were such that the plaintiff had lost his livelihood (Factor 2), then Factor 2 would be a p-factor in our Problem Case.
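The binary representation can be sketched directly from the eight factors listed above. The `build_profile` helper and the sample answers are illustrative assumptions, not part of the example application itself.

```python
# The eight factors from the section 459 example, keyed by factor number.
FACTORS = {
    1: "Has the plaintiff lost his position on the board of directors?",
    2: "Has the plaintiff lost his livelihood?",
    3: "Has the plaintiff come to court with clean hands?",
    4: "Has the defendant acted mala fides?",
    5: "Does an informal agreement exist between the parties?",
    6: "Do the articles confer special rights?",
    7: "Has there been a breach of director's duties?",
    8: "Would a successful petition not harm the company?",
}

def build_profile(answers):
    """Split yes/no answers into p-factors (present) and d-factors (absent)."""
    missing = set(FACTORS) - set(answers)
    if missing:
        raise ValueError(f"unanswered factors: {sorted(missing)}")
    p_factors = frozenset(n for n in FACTORS if answers[n])
    d_factors = frozenset(n for n in FACTORS if not answers[n])
    return p_factors, d_factors

# Hypothetical answers for a Problem Case (True = yes, False = no).
answers = {1: True, 2: True, 3: True, 4: False, 5: True, 6: False, 7: False, 8: True}
p_factors, d_factors = build_profile(answers)
print(sorted(p_factors), sorted(d_factors))
```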
4.1 Using Our Basic Representation
We will use a classification based upon an approach adopted by Bench-Capon [27] in our example application by "performing information retrieval on an index (stored in a database) that describes and references information held in a legal case and doctrinal writing repository". Using this model, we can describe or identify a case by the factors it exhibits. For example we could describe a certain case as being the leading case on Factor 6, or indicate in our index that the main factors dealt with in the case are Factors 3, 4 and 6 etc. Doing so will allow a lawyer to retrieve cases on the basis of legal issues. Similarly, an article could be defined in terms of the legal issues it discusses.
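Retrieval by legal issue against such an index can be sketched as follows. The entries, identifiers (`X1` to `X3`), dates and function names are made up for illustration; only the idea of recording leading and main factors per case comes from the text.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class IndexEntry:
    cnum: str          # unique case identifier
    decided: date      # date of judgment (hypothetical values below)
    lead: frozenset    # factors on which this is the leading case
    main: frozenset    # main factors dealt with in the case

INDEX = [
    IndexEntry("X1", date(1990, 1, 1), frozenset({6}), frozenset({3, 4, 6})),
    IndexEntry("X2", date(1995, 1, 1), frozenset(), frozenset({4, 6})),
    IndexEntry("X3", date(1998, 1, 1), frozenset({2}), frozenset({2, 5})),
]

def leading_case(factor):
    """Identifier of the leading case on a factor, or None if none is marked."""
    return next((e.cnum for e in INDEX if factor in e.lead), None)

def cases_discussing(factor):
    """Cases in which the factor is one of the main issues."""
    return [e.cnum for e in INDEX if factor in e.main]

def latest_case(factor):
    """Most recently decided case in which the factor is a main issue."""
    hits = [e for e in INDEX if factor in e.main]
    return max(hits, key=lambda e: e.decided).cnum if hits else None

print(leading_case(6), cases_discussing(6), latest_case(6))
```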
4.2 Pattern Matching To Retrieve Similar Cases
As noted earlier, this is a very common strategy employed by lawyers to help find and retrieve factual examples in cases which could help explain their Problem Case. We use simple pattern matching for this latter purpose. Pattern matching in the most basic terms involves comparing cases based on the presence or absence of factors. Each case is defined in terms of p-factors or d-factors. Thus, if we compare 2 cases, with one being our Problem Case (C1); and the other being a stored case (C2), the set of factors exhibited when comparing the cases is classified into 4 groups:
- p-factors common to the 2-cases. We call these pro-plaintiff similarities (PPS);
- factors that make C1 stronger for the plaintiff than C2 (p-factors in C1 but not in C2; d factors in C2 but not in C1). We refer to these as our-case-stronger factors (OCS);
- factors which make C2 stronger for plaintiff than C1 (p-factors in C2 but not in C1; d factors in C1 but not in C2). We refer to these as this-case-stronger factors (TCS); and
- d-factors common to both cases. We call these pro-defendant similarities (PDS).
Using this classification, a profile of our Problem Case can be created and matched against a set of stored cases, which are defined in terms of the factors they exhibit, thus allowing us to find the closest matching cases. We can also describe usefully how this match occurs (for example, by pinpointing where similarities lie) and helpfully explain where distinguishing points between compared cases lie.
As demonstrated above, a target area of the law can be broken down into a series of factors symbolising legal issues. These factors may or may not be present in the stored case or article.[28] A stored document can be described in terms of the issues it deals with. Importantly, a document can also be described in terms of other cases. In our example application, the database contains tables made up of rows, which correspond to stored legal documents. Each document is represented as a tuple, having a unique identifier (Cnum/[AName]) with the other attributes being used to describe various facets of the document referenced. The attribute values are used for query and retrieval purposes.
The column headers include a number of self-explanatory fields including (for cases), the unique identifier, name, law reports citation, date of judgement, the full address of the document on the server and the verdict of the case. However, there are a number of other fields that require the opinion of an expert in the legal area including:
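The four-way classification translates naturally into set operations. In the sketch below each case is a pair of sets (p-factors, d-factors); the sample cases, the identifiers `S1` and `S2`, and the ranking score (the number of shared factors) are assumptions made for illustration rather than the scoring used in the example application.

```python
def compare(problem, stored):
    """Classify factors when comparing the Problem Case (C1) with a stored case (C2)."""
    p1, d1 = problem
    p2, d2 = stored
    return {
        "PPS": p1 & p2,                # p-factors common to both cases
        "OCS": (p1 - p2) | (d2 - d1),  # our case stronger for the plaintiff
        "TCS": (p2 - p1) | (d1 - d2),  # stored case stronger for the plaintiff
        "PDS": d1 & d2,                # d-factors common to both cases
    }

def rank(problem, stored_cases):
    """Order stored cases by how many factors they share with the Problem Case."""
    def score(item):
        groups = compare(problem, item[1])
        return len(groups["PPS"]) + len(groups["PDS"])
    return [name for name, _ in sorted(stored_cases.items(), key=score, reverse=True)]

# Hypothetical profiles over Factors 1 to 8: (p-factors, d-factors).
problem_case = ({1, 2, 5}, {3, 4, 6, 7, 8})
stored_cases = {
    "S1": ({3, 4}, {1, 2, 5, 6, 7, 8}),
    "S2": ({2, 5, 6}, {1, 3, 4, 7, 8}),
}

print(compare(problem_case, stored_cases["S2"]))
print(rank(problem_case, stored_cases))
```

The OCS and TCS groups supply the distinguishing points, while PPS and PDS explain where the match lies.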
Lead - Is this case the leading case on an issue? If so, the identifier of the appropriate issue is inserted;
Main - What are the main issues discussed in this case? The identifier(s) for the most important issue(s) discussed are inserted;
Distinguished - Has this case been distinguished in any other case? If so, the unique identifier(s) for the appropriate case(s) are inserted; and
Similar - Are there any cases which closely resemble this one?[29] If so the unique identifier(s) are inserted for the appropriate case(s).
For example, case C1 above is the leading case on Factor 1, the main issues discussed in the case are Factors 2, 3 & 6, it hasn't been distinguished in any case but is similar to case C3.
The article table is structured along the same lines as the 'case' table with similar column headers for name, author etc. The 'Issue' field indicates what legal issues are dealt with in the article. If any cases are discussed this is stated in the 'Case' field. For example article A1 above deals with Factors 4 and 7 and discusses case C1. Overall this structure allows us to move away from the irrelevant results produced by a keyword search.
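One possible relational layout for the case and article tables is sketched below using Python's built-in `sqlite3` module. The table and column names, and the choice to hold issues and links in separate tables rather than in multi-valued columns, are our assumptions for this sketch; the rows reproduce only the C1 and A1 examples given in the text, with the other descriptive fields left empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cases (cnum TEXT PRIMARY KEY, name TEXT, citation TEXT,
                    judged TEXT, url TEXT, verdict TEXT);
CREATE TABLE case_issue (cnum TEXT, factor INTEGER,
                         role TEXT CHECK (role IN ('lead', 'main')));
CREATE TABLE case_link (cnum TEXT, other TEXT,
                        kind TEXT CHECK (kind IN ('distinguished', 'similar')));
CREATE TABLE articles (aname TEXT PRIMARY KEY, author TEXT, url TEXT);
CREATE TABLE article_issue (aname TEXT, factor INTEGER);
CREATE TABLE article_case (aname TEXT, cnum TEXT);
""")

# C1: leading case on Factor 1; main issues Factors 2, 3 and 6; similar to C3.
conn.executemany("INSERT INTO cases (cnum) VALUES (?)", [("C1",), ("C3",)])
conn.executemany(
    "INSERT INTO case_issue VALUES (?, ?, ?)",
    [("C1", 1, "lead"), ("C1", 2, "main"), ("C1", 3, "main"), ("C1", 6, "main")],
)
conn.execute("INSERT INTO case_link VALUES ('C1', 'C3', 'similar')")

# A1: deals with Factors 4 and 7 and discusses case C1.
conn.execute("INSERT INTO articles (aname) VALUES ('A1')")
conn.executemany("INSERT INTO article_issue VALUES (?, ?)", [("A1", 4), ("A1", 7)])
conn.execute("INSERT INTO article_case VALUES ('A1', 'C1')")

def cases_on_issue(factor, role):
    """Cases for which the factor is recorded with the given role."""
    rows = conn.execute(
        "SELECT cnum FROM case_issue WHERE factor = ? AND role = ? ORDER BY cnum",
        (factor, role),
    )
    return [r[0] for r in rows]

def articles_discussing(cnum):
    """Articles in which the given case is discussed."""
    rows = conn.execute(
        "SELECT aname FROM article_case WHERE cnum = ? ORDER BY aname", (cnum,)
    )
    return [r[0] for r in rows]

print(cases_on_issue(1, "lead"), articles_discussing("C1"))
```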
5 Conclusion
Current legal research applications largely fail to efficiently fulfil the information needs of lawyers. Most legal databases lack valuable structure that can help lawyers quickly access information that may be of use in their research tasks. The complexity and cost involved in creating AI-legal applications far outweigh the value they can provide to most lawyers. However, taking some of the structure inherent in the AI-legal applications (that is, classifying legal cases in terms of the 'factors' they exhibit) can greatly benefit lawyers and, provided the limitations of AI-legal applications are recognised, could help form the basis of useful, cost-effective and usable applications.
Indexing legal cases and articles according to the issues involved rather than on a keyword basis could help lawyers quickly access information that may be of use to them and also help them to better focus and express their information needs. By applying CBR principles to a very simplified model of the law, we can provide lawyers with a useful springboard into the more uncertain and unpredictable task of legal reasoning while at the same time avoiding the overwhelming complexity involved in creating AI-legal applications.
Footnotes
1 Wall, "Information Technology & The Shaping of Legal Practice in the U.K" , 13th BILETA Conference http://www.bileta.ac.uk.
2 Galanter, "Law Abounding: Legalisation Around the North Atlantic", [1992] 55 Modern Law Review.
3 Lord Woolf, "Access to Justice Final Report", (1996) HMSO.
4 Electronic Law Practice, "An Exercise in Legal Futurology", [1997] 60 Modern Law Review.
5 LEXIS/NEXIS: http://www.lexis-nexis.com/lncc/custserv/html/lexis/intlaw.html.
6 SMITH-BERNAL on-line casebase: http://www.smithbernal.com.
7 Kluwer Database: http://epms.nl.
8 Smith-Bernal, supra.
9 O'Shea & Wilson, "Using Hypertext and Parallel Processing to integrate a European law database" (1997), 12th BILETA Conference: http://www.bileta.ac.uk.
10 Sturdy, "Wisps of Smoke? The Electronic Library, New Information Retrieval Techniques and Diminishing Returns" (1994), 9th BILETA Conference: http://www.bileta.co.uk.
11 Matthijssen, "A Task-based Interface to Legal Databases" (1998), 7 Artificial Intelligence Law, Vol.6 No.1.
12 De Mulder et al, "The Concept of Concept in Conceptual Legal Information Retrieval" (1993), 8th BILETA Conference pre-proceedings: http://www.bileta.ac.uk.
13 Matthijssen, "An Intelligent Interface for Legal Databases", (1995), Proceedings of 5th International Conference On Artificial Intelligence and Law, Kluwer.
14 Aikenhead, "A Discourse on Law & Artificial Intelligence" 1997, 5 JILT, 1: http://www.law.warwick.ac.uk/ltj/5-1c.html.
15 Luger & Stubblefield, "AI - Structures & Strategies for Complex Problem Solving" (1997), Addison-Wesley.
16 Popple, "SHYSTER: A Pragmatic Legal Expert System", PhD thesis, Australian National University, Canberra (1993): http://cs.anu.edu.au/~James.Popple/publications.
17 Poulin et al, "Coping with Change" (1991): http://www.confpriv.qc.ca/crdp/en/equipes/technologie/textes/ia/bratley91a.html.
18 Bratley et al, "The effect of change on legal applications" (1991): http://liguria.crdp.umontreal.ca/crdp/en/equipes/technologie/textes/ia/bratley91b.html.
19 Zeleznikov J & Hunter D, "Building Intelligent Legal Information Systems", Kluwer, Law & Taxation Publishers ISBN 90 6544 833 0.
20 Morrisson & Leith "The Barristers World and The Nature of Law" (1992), Open University Press.
21 Greinke, Andrew, "Legal Expert Systems: A humanistic Critique of Mechanical Legal Interface". E Law, Volume 1, Number 4, December 1994.
22 Leith, "The Computerised Lawyer" (2nd Ed.) (1998), Springer-Verlag.
23 Ashley, "Modelling Legal Argument" (1990), MIT Press Cambridge.
24 Aleven, "Teaching Case-Based Argumentation Through a Model & Examples", PhD Dissert, University of Pittsburgh Graduate Program in Intelligent Systems.
25 S.459 of the Companies Act 1985 aims at offering minority shareholders in companies a judicial remedy whenever they wish to complain of the behaviour of (usually) the majority/controlling shareholders in a company.
26 Davies & Prentice, supra. Boyle & Bird, supra.
27 Bench-Capon, "Arguing with Cases" (1997), Proceedings of JURIX '97.
28 Please note, here the presence of a factor does not necessarily make it a p-factor.
29 i.e. the facts were similar or issues dealt with were similar.
September 2004 contents