PHP – Accurately detecting the type of a file

When files are being uploaded, you cannot rely on the MIME type the Web browser sends. This data is entirely under the control of the user and it will not necessarily be accurate. This may be malicious (e.g. a user uploading a .php file and pretending it’s an image) or accidental (e.g. a user thinking that simply renaming .tiff to .jpeg makes it such). Either way, your code should check files uploaded to get a better idea of their type.

Many files have a signature – sometimes called a magic number. These are the first few bytes of the file and should fairly accurately represent what type of file it is. The number of bytes depends on the format. For example, the hexadecimal representation of the signature of Java bytecode files is CAFEBABE.

Say we had some image files uploaded and wanted to be sure they were JPEG, GIF or PNG. We could use the following function to do this:

    <?php

    function check_type($filename) {
        // PNG, GIF, JFIF JPEG, EXIF JPEG (respectively)
        $allowed = array('89504E47', '47494638', 'FFD8FFE0', 'FFD8FFE1');

        // Open in binary mode; give up if the file cannot be read
        $handle = fopen($filename, 'rb');
        if ($handle === false) {
            return false;
        }

        // Read the first 4 bytes and convert them to upper-case hex
        $bytes = strtoupper(bin2hex(fread($handle, 4)));
        fclose($handle);

        return in_array($bytes, $allowed, true);
    }

In essence, I manually specify a list of allowed 4-byte signatures as hexadecimal, read the first 4 bytes of the file, convert them to hexadecimal and compare the result against the list. This function obviously needs some error checking and could be an awful lot smarter and more adaptable, but it’s hopefully a start for you. You’ll notice that I define two different types of JPEG – JFIF and EXIF. The latter is a newer format and is used by iPhones (as well as other devices). There are also proprietary JPEG formats, for example Samsung’s, which have different signatures.
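As a rough idea of what “smarter and more adaptable” might look like, here is a minimal sketch that matches signatures of different lengths and returns a type label rather than true/false. The function name detect_type() and the labels it returns are my own assumptions for illustration; the signatures are the standard PNG, GIF87a/GIF89a and generic JPEG (FFD8FF) prefixes.

```php
<?php

// Hypothetical helper: returns 'png', 'gif' or 'jpeg', or null if unknown.
function detect_type($filename) {
    // Pairs of (hex signature, label). A list of pairs is used rather than
    // signature => label, because all-digit keys would be cast to integers.
    $signatures = array(
        array('89504E470D0A1A0A', 'png'),
        array('474946383761', 'gif'),   // GIF87a
        array('474946383961', 'gif'),   // GIF89a
        array('FFD8FF', 'jpeg'),        // any JPEG variant
    );

    $handle = @fopen($filename, 'rb');
    if ($handle === false) {
        return null;
    }

    // 8 bytes is enough for the longest signature above
    $head = fread($handle, 8);
    fclose($handle);

    if ($head === false) {
        return null;
    }

    $hex = strtoupper(bin2hex($head));

    foreach ($signatures as $pair) {
        list($signature, $type) = $pair;

        // Does the file start with this signature?
        if (strncmp($hex, $signature, strlen($signature)) === 0) {
            return $type;
        }
    }

    return null;
}
```

Matching on the shorter FFD8FF prefix accepts every JPEG flavour, which is looser than the JFIF/EXIF list above; whether that is desirable depends on how strict you want to be.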

Sadly, this technique cannot be used to detect the type of files which don’t have a signature – e.g. plain text files such as PHP scripts.

PHP – Why you should use the Factory Method Pattern and how you should do it

This article assumes you’re using PHP version 5.3 or above. If you are not, you should note that “the PHP 5.2 series is NOT supported anymore” and you should really upgrade.

The Factory method pattern is a marvelous idea. It is designed to reduce the number of places in which you use the ‘new’ keyword in your code. An example of its use is as follows:

    <?php

    class SomeClass {
        protected $foo;

        protected function __construct($foo) {
            $this->foo = $foo;
        }

        public static function factory($foo = 'default') {
            return new static($foo);
        }

        public function get_foo() {
            return $this->foo;
        }
    }

    echo SomeClass::factory()->get_foo();

There are a number of points to note in this code. The first is that the constructor is protected, which prevents the class being instantiated with the new keyword from outside the class. Protected constructors are available throughout PHP 5; it is the ‘new static’ expression in factory() that requires PHP 5.3, as it relies on late static binding. You’ll also notice that the default value for $foo is only specified on the factory() method. Specifying it on the constructor as well would be redundant, as the constructor will only ever be called by the factory method. The factory() method itself does nothing but return a new instance of ‘static’ (the class on which it was called). Finally, there’s an example of how the factory is used, and you’ll see that the get_foo() method is chained on. This is a useful and elegant feature which PHP 5.3 does not allow directly on a ‘new’ expression (PHP 5.4 added the (new Foo)->bar() syntax).
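To make the difference between ‘static’ and ‘self’ concrete, here is a small sketch. The class names Base and Child and the method self_factory() are made up for the illustration.

```php
<?php

class Base {
    protected function __construct() {
    }

    // Late static binding: instantiates the class the call was made on
    public static function factory() {
        return new static();
    }

    // For comparison: always instantiates the class this code is written in
    public static function self_factory() {
        return new self();
    }
}

class Child extends Base {
}

echo get_class(Child::factory()), "\n";      // Child
echo get_class(Child::self_factory()), "\n"; // Base
```

This is why the factory uses ‘new static’: subclasses inherit a factory() that returns instances of themselves without having to override it.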

So what are the other advantages? Well, the constructor cannot return a value, whereas the factory method can. This gives you significantly more control over what you return. For example, if object construction failed, you can return a false or null value. I can’t think of a use case offhand for this and it’s probably better to throw an exception (which can also be done from the constructor) so that the successful return of a factory is always an object.

Consider a scenario like this… You have an API which you have been using for a while and decide it’s time to make a new/better featured one. Many of your sites rely on this API and, as such, you need to migrate them slowly. If you had used a factory() method in the first instance, there would be no issues doing this as you can simply change the type of object which it returns. For example…

    <?php

    class API {
        protected $site_id;

        protected function __construct($site_id) {
            $this->site_id = $site_id;
        }

        public static function factory($site_id) {
            if (Site::uses_new_api($site_id)) {
                return NewAPI::factory($site_id);
            }

            return new static($site_id);
        }

        public function get_info() {
            return file_get_contents("http://www.info.com/?site={$this->site_id}");
        }
    }

    class NewAPI extends API {
        public static function factory($site_id) {
            return new static($site_id);
        }

        public function get_more_info() {
            return file_get_contents("http://www.moreinfo.com/?site={$this->site_id}");
        }
    }

You’ll see here that we’ve created a NewAPI class which extends the original API and adds functionality to it. We’ve also modified the factory() method of API to use a fictitious Site::uses_new_api() function to determine if the site uses the new API – if it does, we return an instance of it. This prevents us having to modify the code of the sites which use this class in order to migrate them to the new API.
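To see the migration from the caller’s side, here is a cut-down, runnable sketch. The Site class is a stub I have invented (the rule that site IDs of 100 and above use the new API is purely an assumption), and describe() is a hypothetical stand-in for the network calls so that the example runs offline.

```php
<?php

// Hypothetical stub: only the name Site::uses_new_api() comes from the
// example above. The rule inside it is made up.
class Site {
    public static function uses_new_api($site_id) {
        return $site_id >= 100;
    }
}

class API {
    protected $site_id;

    protected function __construct($site_id) {
        $this->site_id = $site_id;
    }

    public static function factory($site_id) {
        if (Site::uses_new_api($site_id)) {
            return NewAPI::factory($site_id);
        }

        return new static($site_id);
    }

    // Stand-in for get_info(), avoiding a real HTTP request
    public function describe() {
        return get_class($this)." for site {$this->site_id}";
    }
}

class NewAPI extends API {
    // Overridden so that the migration check in API::factory() is skipped
    public static function factory($site_id) {
        return new static($site_id);
    }
}

// The calling code is identical for migrated and unmigrated sites
echo API::factory(1)->describe(), "\n";   // API for site 1
echo API::factory(100)->describe(), "\n"; // NewAPI for site 100
```

Note that NewAPI has to override factory(): if it simply inherited the one from API, the migration check would run again and call NewAPI::factory() forever.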

Factory methods also make your life a little easier as you can have multiple methods which instantiate the object in different ways. For example:

    <?php

    class Image {
        protected $type;
        protected $filename;

        protected function __construct($type, $filename) {
            $this->type = $type;
            $this->filename = $filename;
        }

        public static function jpeg_factory($filename) {
            return new static('jpeg', $filename);
        }

        public static function gif_factory($filename) {
            return new static('gif', $filename);
        }

        public function get_readable_type() {
            if ($this->type == 'jpeg') {
                return 'The image type is JPEG';
            }
            if ($this->type == 'gif') {
                return 'The image type is Graphics Interchange Format';
            }
        }
    }

You’ll see in this example we have two factory methods – jpeg_factory() and gif_factory(). These only take in the filename and serve to ensure that only ‘gif’ and ‘jpeg’ can end up in the internal $type variable. Again, you could use exceptions to signal an error in what the caller passes into the class, but this is a very basic example.
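For comparison, the exception-based approach might look something like the sketch below. The general-purpose factory($type, $filename) method and the get_type() accessor are hypothetical additions, not part of the class above.

```php
<?php

class Image {
    protected $type;
    protected $filename;

    protected function __construct($type, $filename) {
        $this->type = $type;
        $this->filename = $filename;
    }

    // Hypothetical single factory that validates the type itself
    public static function factory($type, $filename) {
        $allowed = array('jpeg', 'gif');

        if (!in_array($type, $allowed, true)) {
            throw new InvalidArgumentException("Unsupported image type: {$type}");
        }

        return new static($type, $filename);
    }

    public function get_type() {
        return $this->type;
    }
}

try {
    Image::factory('bmp', 'photo.bmp');
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), "\n"; // Unsupported image type: bmp
}
```

The trade-off is that a typo in the type is only caught at runtime, whereas a call to a non-existent method such as bmp_factory() fails immediately and obviously.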

So, the advantages we’ve established are:

  • You can prevent the use of ‘new’ and thus have more control over how a class is instantiated
  • You can chain methods onto the end of a factory() call to avoid the need for multiple lines of code (or even storing a reference to an object at all)
  • Code instantly becomes more maintainable as you have control over what is returned from the factory methods
  • You can have multiple factory methods to better control and simplify the ways in which the object is instantiated

Horribly amazing PHP – check if you’re on the last iteration of a foreach loop

The code is so horrible, it’s gone full circle and become amazing once again. It’s a method of finding out whether you’re on the last iteration of a foreach loop. As with anything that uses array pointers in PHP, be careful how you use it as the results may not be as expected. Note also that each() was deprecated in PHP 7.2 and removed in PHP 8.0, so this only runs on older versions:

    $arr = array(1, 2, 3, 4, 5, 6, 7, 8, 9, 10);

    foreach ($arr as $a) {

        // This is the line that does the checking
        if (!each($arr)) echo "End!\n";

        echo $a."\n";

    }
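If you want something less horrible (and less amazing), a more predictable approach is to work out the last key up front and compare against it. This is a sketch of a different technique, not the pointer trick above; the function name lines_with_end_marker() is made up, and it returns the lines rather than echoing them so that it is easy to check.

```php
<?php

// Hypothetical helper: inserts 'End!' just before the last element
function lines_with_end_marker(array $arr) {
    $keys = array_keys($arr);

    // end() returns false for an empty array, but then the loop never runs
    $last_key = end($keys);

    $lines = array();

    foreach ($arr as $key => $a) {
        if ($key === $last_key) {
            $lines[] = 'End!';
        }

        $lines[] = (string) $a;
    }

    return $lines;
}

// Prints 1, 2, End!, 3 on separate lines
echo implode("\n", lines_with_end_marker(array(1, 2, 3))), "\n";
```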

PHP – Converting bytes to a readable format

The following function takes in a number of bytes and the number of decimal places required in the output and returns the result in a readable format:

    function readable_filesize($bytes, $dp = 1) {
        // Units used
        $extensions = array('B', 'KB', 'MB', 'GB', 'TB', 'PB');
        $last = count($extensions) - 1;

        foreach ($extensions as $i => $ext) {
            // Stop if we can't divide again without getting < 1,
            // or if we have run out of larger units
            if ($bytes < 1024 || $i === $last) {
                return number_format($bytes, $dp).$ext;
            }

            $bytes /= 1024;
        }
    }

Use like this:

    echo readable_filesize(123456); // 120.6KB

Solved! phpMyAdmin cannot start session without errors

This was a bit of a ridiculous error. It occurred after a PHP upgrade to 5.2.17 running under Apache with mod_php. Initially phpMyAdmin just rejected the login, which clearing the browser’s cookies fixed. Following this, the error “cannot start session without errors” was produced. Sessions in PHP were otherwise working fine, as other sites were setting them successfully. session.save_path in php.ini was set to /var/lib/php/session, and the Apache user already had write access to this directory. Even so, the problem was fixed by changing the permissions on this directory to 0777 (world-writable):

    chmod 0777 /var/lib/php/session

I haven’t bothered to investigate the real reason behind the problem but the aforementioned permissions fix seems to have solved it. Do let me know if you figure this out.
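If you do want to dig into it, a small diagnostic along these lines might be a starting point. It is only a sketch: the helper name describe_directory() is made up, and it does not handle the “N;/path” form of session.save_path. Run it through the Web server rather than the command line, since the two usually run as different users.

```php
<?php

// Hypothetical helper: summarises a directory's permissions as seen by
// the current PHP process.
function describe_directory($path) {
    if (!is_dir($path)) {
        return array('exists' => false);
    }

    return array(
        'exists'   => true,
        'writable' => is_writable($path),
        // Last four octal digits of the mode, e.g. "0770"
        'mode'     => substr(sprintf('%o', fileperms($path)), -4),
        // Numeric user and group IDs of the directory's owner
        'owner'    => fileowner($path),
        'group'    => filegroup($path),
    );
}

// Inspect whatever session.save_path is currently configured
print_r(describe_directory((string) ini_get('session.save_path')));
```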