PHP Cross Reference of MantisBT |
Tools to help with ASCII in UTF-8
| Version: | $Id: ascii.php,v 1.5 2006/10/16 20:38:12 harryf Exp $ |
| File Size: | 220 lines (9 kb) |
| Included or required: | 0 times |
| Referenced: | 0 times |
| Includes or requires: | 0 files |
| utf8_is_ascii($str) X-Ref |
| Tests whether a string contains only 7-bit ASCII bytes. You can use this to decide whether a string needs UTF-8 handling at all; when it is plain ASCII, the native PHP equivalent may be faster. For example:
<code>
if ( utf8_is_ascii($someString) ) {
    // It's just ASCII - use the native PHP version
    $someString = strtolower($someString);
} else {
    $someString = utf8_strtolower($someString);
}
</code>
param: string return: boolean TRUE if it's all ASCII |
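The check described above can be sketched with a byte-level regular expression. This is only an illustration under stated assumptions: `sketch_is_ascii` is a hypothetical name, and the pattern is not necessarily the one the library uses.

```php
<?php
// Hypothetical helper for illustration; not the library's own code.
function sketch_is_ascii(string $str): bool
{
    // Without the /u modifier PCRE matches byte by byte, so any byte
    // in 0x80-0xFF means the string is not pure 7-bit ASCII.
    return preg_match('/[\x80-\xFF]/', $str) !== 1;
}
```

An empty string contains no high bytes, so this sketch reports it as ASCII.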
| utf8_is_ascii_ctrl($str) X-Ref |
| Tests whether a string contains only 7-bit ASCII bytes, excluding device control codes. The device control codes are listed in the second table here: http://www.w3schools.com/tags/ref_ascii.asp param: string return: boolean TRUE if it's all ASCII without device control codes |
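A possible shape for this stricter check is an allow-list of bytes. The sketch below assumes that tab, line feed, and carriage return count as acceptable and that everything else below 0x20, plus DEL (0x7F), counts as a device control code; the page does not state the exact set, so treat both that choice and the name `sketch_is_ascii_no_ctrl` as assumptions.

```php
<?php
// Hypothetical helper; the allowed byte set is an assumption.
function sketch_is_ascii_no_ctrl(string $str): bool
{
    // Allowed: TAB (0x09), LF (0x0A), CR (0x0D) and printable 0x20-0x7E.
    // Any other byte (control code or non-ASCII) fails the test.
    return preg_match('/[^\x09\x0A\x0D\x20-\x7E]/', $str) !== 1;
}
```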
| utf8_strip_non_ascii($str) X-Ref |
| Strips out all bytes that are not 7-bit ASCII. Use this when you need to send a string to a system that you know supports only 7-bit ASCII. param: string return: string with non-ASCII bytes removed |
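As a rough sketch of the stripping behaviour, the same byte-level pattern can drive a replacement. `sketch_strip_non_ascii` is a hypothetical name used only for this example.

```php
<?php
// Hypothetical helper for illustration; not the library's own code.
function sketch_strip_non_ascii(string $str): string
{
    // Remove every run of bytes in 0x80-0xFF. Each multi-byte UTF-8
    // character consists solely of such bytes, so it disappears whole.
    return preg_replace('/[\x80-\xFF]+/', '', $str) ?? '';
}
```

Note that characters are dropped rather than transliterated, so "naïve" becomes "nave".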
| utf8_strip_ascii_ctrl($str) X-Ref |
| Strips out device control codes in the ASCII range that are not permitted in XML. Note that this leaves multi-byte characters untouched; it only removes device control codes. param: string return: string with control codes removed |
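One way to picture this is to remove exactly the ASCII bytes that the XML 1.0 character range excludes. The sketch below makes that assumption: it strips 0x00-0x08, 0x0B, 0x0C and 0x0E-0x1F, and leaves DEL (0x7F) alone because XML 1.0 permits it. Whether the library also removes DEL is not stated on this page, and `sketch_strip_ascii_ctrl` is a hypothetical name.

```php
<?php
// Hypothetical helper; the stripped set follows XML 1.0 as an assumption.
function sketch_strip_ascii_ctrl(string $str): string
{
    // TAB, LF and CR are legal in XML and are kept. UTF-8 lead and
    // continuation bytes are all >= 0x80, so they are never matched.
    return preg_replace('/[\x00-\x08\x0B\x0C\x0E-\x1F]+/', '', $str) ?? '';
}
```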
| utf8_strip_non_ascii_ctrl($str) X-Ref |
| Strips out all bytes that are not 7-bit ASCII, as well as ASCII device control codes. For a list of ASCII device control codes, see the second table here: http://www.w3schools.com/tags/ref_ascii.asp param: string return: string with non-ASCII bytes and control codes removed |
| utf8_accents_to_ascii( $str, $case=0 ) X-Ref |
| Replaces accented UTF-8 characters with unaccented 7-bit ASCII "equivalents". The purpose of this function is to replace characters commonly found in Latin alphabets with something more or less equivalent from the ASCII range. This can be useful for converting a UTF-8 string to something suitable for a filename, for example. After using this function, you would probably also pass the string through utf8_strip_non_ascii to clean out any other non-ASCII characters. Use the optional parameter to deaccent only lowercase ($case = -1) or only uppercase ($case = 1) letters. The default is to deaccent both cases ($case = 0). For a more complete implementation of transliteration, see the utf8_to_ascii package available from the phputf8 project downloads: http://prdownloads.sourceforge.net/phputf8 author: Andreas Gohr <andi@splitbrain.org> param: string UTF-8 string param: int (optional) -1 lowercase only, +1 uppercase only, 0 both cases return: string accented chars replaced with ASCII equivalents |
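The role of the $case parameter can be illustrated with two small lookup tables applied through `strtr`. This is a minimal sketch, not the library's implementation: `sketch_accents_to_ascii` is a hypothetical name, and the tables cover only a handful of characters chosen for the example.

```php
<?php
// Hypothetical helper; the tables are a tiny illustrative subset.
function sketch_accents_to_ascii(string $str, int $case = 0): string
{
    // Keys are written as code point escapes so the example does not
    // depend on the encoding of the source file (requires PHP 7+).
    $lower = [
        "\u{00E0}" => 'a', "\u{00E9}" => 'e', "\u{00F1}" => 'n',
        "\u{00F6}" => 'o', "\u{00FC}" => 'u', "\u{00E7}" => 'c',
    ];
    $upper = [
        "\u{00C0}" => 'A', "\u{00C9}" => 'E', "\u{00D1}" => 'N',
        "\u{00D6}" => 'O', "\u{00DC}" => 'U', "\u{00C7}" => 'C',
    ];

    // $case <= 0 covers "lowercase only" (-1) and "both" (0).
    if ($case <= 0) {
        $str = strtr($str, $lower);
    }
    // $case >= 0 covers "uppercase only" (1) and "both" (0).
    if ($case >= 0) {
        $str = strtr($str, $upper);
    }
    return $str;
}
```

Characters missing from the tables pass through unchanged, which is why the description recommends stripping any remaining non-ASCII bytes afterwards.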
| Generated: Thu Jul 28 15:48:31 2011 | Cross-referenced by PHPXref 0.7 |