PHP Unicode UTF-16 BOM processing with iconv and mb_convert_encoding

Basically, PHP's iconv doesn't handle the U+FEFF BOM, while mb_convert_encoding does!

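        $fh = fopen($s, "r");
        // watch out for the Notepad "Unicode" BOM (U+FEFF, bytes FF FE on disk)
        $utf16 = fgets($fh, 1024);
        fclose($fh);

        // 'UTF-16': no byte order given, mbstring has to work it out from the BOM
        $utf8 = mb_convert_encoding($utf16, 'UTF-8', 'UTF-16');
        echo $utf8, PHP_EOL;

        // 'UTF-16LE': byte order given explicitly
        $utf8 = mb_convert_encoding($utf16, 'UTF-8', 'UTF-16LE');
        echo $utf8, PHP_EOL;

        // iconv: drop the first character (the BOM) before converting
        $utf8 = iconv('UTF-16LE', 'UTF-8', mb_substr($utf16, 1, null, 'UTF-16LE'));
        echo $utf8, PHP_EOL;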
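
One thing to watch with the fgets call above: in UTF-16LE a line feed is the two bytes 0A 00, so fgets stops right after the 0A and leaves the 00 for the next read. A way around that is to convert the whole buffer first and split into lines afterwards. The following is only a minimal sketch, assuming the data fits in memory and the mbstring extension is loaded; utf16_to_utf8_lines is a made-up helper name, and falling back to little endian when there is no BOM is my assumption (it matches what Notepad writes).

```php
<?php
// Hypothetical helper: decode a whole UTF-16 buffer and return UTF-8 lines.
function utf16_to_utf8_lines($bytes)
{
    // Assumption: no BOM means little endian, as written by Notepad.
    $encoding = 'UTF-16LE';
    $bom = substr($bytes, 0, 2);
    if ($bom === "\xFF\xFE" || $bom === "\xFE\xFF") {
        // FE FF on disk is big endian, FF FE is little endian.
        $encoding = ($bom === "\xFE\xFF") ? 'UTF-16BE' : 'UTF-16LE';
        $bytes = (string) substr($bytes, 2); // drop the BOM itself
    }
    $utf8 = mb_convert_encoding($bytes, 'UTF-8', $encoding);
    // Split only after conversion, when a newline is a single byte again.
    return preg_split('/\r\n|\r|\n/', rtrim($utf8, "\r\n"));
}

// Build a small Notepad-style sample in memory instead of reading a file.
$sample = "\xFF\xFE" . mb_convert_encoding("Hot\r\n42\r\n", 'UTF-16LE', 'UTF-8');
echo implode('|', utf16_to_utf8_lines($sample)), PHP_EOL; // Hot|42
```

For a real file, the argument would be something like file_get_contents($s) in place of $sample.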

PHP and Notepad Unicode files

Notepad "Unicode" CSV files start with a 2-byte BOM (U+FEFF, stored on disk as the bytes FF FE) and are UTF-16/UCS-2 little-endian encoded.

$ od -t x1z hot.csv | head
0000000 ff fe 48 00 6f 00 74 00 65 00 6c 00 49 00 44 00  >..H.o.t

$ od -t x2z hot.csv | head
0000000 feff 0048 006f 0074 0065 006c 0049 0044  >..H.o.t
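
The same check can be made from inside PHP when od is not at hand. This is a rough sketch only; first_bytes_hex is a hypothetical helper name, and the sample bytes are typed in by hand to mirror the start of the dump above.

```php
<?php
// Hypothetical helper: show the first bytes of a string as spaced hex pairs,
// similar to the byte-wise od output.
function first_bytes_hex($bytes, $count = 16)
{
    $hex = bin2hex(substr($bytes, 0, $count));
    return implode(' ', str_split($hex, 2));
}

// BOM (FF FE) followed by "Hot" in UTF-16LE.
$sample = "\xFF\xFE" . "H\x00o\x00t\x00";
echo first_bytes_hex($sample), PHP_EOL; // ff fe 48 00 6f 00 74 00
```

For a file on disk, file_get_contents('hot.csv', false, null, 0, 16) would supply just the first 16 bytes.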

Detecting UTF BOM (byte order mark) using PHP

When integrating systems with many different data sources and systems across Europe, you are bound to eventually run into issues with UTF-8 and national character sets such as ISO-8859-1, which is common in Sweden. Even when parsing simple UTF-8 files with comma-separated values, things might pop up to bite you.
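
As an illustration of what BOM detection can look like, here is a minimal sketch that compares the first bytes of a string against the well-known BOM signatures. detect_bom is a made-up name, not a built-in PHP function, and the returned encoding labels are simply the names mb_convert_encoding and iconv accept.

```php
<?php
// Hypothetical helper: return the encoding implied by a leading BOM,
// or null when the data does not start with one.
function detect_bom($bytes)
{
    // Longer signatures first: the UTF-32LE BOM begins with the UTF-16LE one.
    $signatures = array(
        "\xEF\xBB\xBF"     => 'UTF-8',
        "\xFF\xFE\x00\x00" => 'UTF-32LE',
        "\x00\x00\xFE\xFF" => 'UTF-32BE',
        "\xFF\xFE"         => 'UTF-16LE',
        "\xFE\xFF"         => 'UTF-16BE',
    );
    foreach ($signatures as $bom => $encoding) {
        // strncmp is binary safe, so the NUL bytes compare correctly.
        if (strncmp($bytes, $bom, strlen($bom)) === 0) {
            return $encoding;
        }
    }
    return null;
}

echo detect_bom("\xFF\xFE" . "H\x00"), PHP_EOL; // UTF-16LE
```

A UTF-16LE file whose first character after the BOM is U+0000 would be reported as UTF-32LE by this check; that is a known ambiguity of BOM sniffing and rare in practice.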

Oracle Linux 6 PHP 5.6 install

For an Oracle Linux 6 PHP 5.6 install, you will need to add some extra yum repos.



# yum install php56-cli

Set up a Bitnami Ubuntu VM after install

Once you download and start a Bitnami VM (say, to run a Redmine or Drupal server), you may want to install VMware Tools and other applications. If the VM is Windows, this is easy.

But on Ubuntu, it is not so easy. I rarely use Linux nowadays, and although I used it a lot many years ago, I’ve forgotten many command names and how to do various basic tasks that would be trivial to someone familiar with Ubuntu. Here is what I have to do every time I download a new Bitnami Ubuntu VM:

Update apt-get

First update the package lists used by apt-get, the application that gets packages for Ubuntu. If you don’t do that, you will likely get an error when using apt-get, as it won’t be able to find certain packages. It appears the VMware Tools installation procedure requires headers to be installed, and installing those requires apt-get.

