Streaming POST data through PHP cURL Using CURLOPT_READFUNCTION

Well, I haven’t posted here in quite some time… I’m not dead, and don’t plan on completely ditching this blog, but well…

Anyway, onto the article.

I had a PHP application where I wanted to upload part of a large file to some other server.  The naive method would be to simply split the file and upload the piece through cURL, but I wanted to do this without any splitting.  So I needed a way to send a POST request while building the request body on the fly (note: you'll still need to know the total size up front to be able to send the Content-Length header).

The obvious decision would be to use sockets rather than cURL, but I felt like seeing if it was possible with cURL anyway.  Although I’ll still probably use sockets (because it’s easier in the end), I thought this might (well, not really) be useful to one of the three readers I get every month.

Anyway, if you look at the curl_setopt documentation, you'll see a CURLOPT_READFUNCTION constant; however, how to actually use it isn't made clear (especially with the boundaries for the multipart/form-data encoding type).  Also, the documentation is wrong.

Without further ado, here’s some sample code:

<?php

// note: in the request body, each part is prefixed with '--' plus the boundary,
// and the final delimiter is '--' plus the boundary plus '--'
$boundary = '---------------------------168279961491';
// our request body
$str = "--$boundary\r\nContent-Disposition: form-data; name=\"how_do_i_turn_you\"\r\n\r\non\r\n--$boundary--\r\n";

// set up cURL
$ch = curl_init('http://example.com/');
curl_setopt_array($ch, array(
    CURLOPT_HEADER => false,
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_POST => true,
    CURLOPT_HTTPHEADER => array( // we need to send these two headers ourselves
        'Content-Type: multipart/form-data; boundary='.$boundary,
        'Content-Length: '.strlen($str)
    ),
    // note, do not set the CURLOPT_POSTFIELDS setting
    CURLOPT_READFUNCTION => 'myfunc'
));

// function to stream data
// $ch is the cURL handle, $fp is the stream set via CURLOPT_INFILE (unused here),
// and $len is the maximum number of bytes we're allowed to return
function myfunc($ch, $fp, $len) {
    static $pos = 0; // keep track of position
    global $str;
    // grab the next chunk of data
    $data = (string)substr($str, $pos, $len);
    // increment $pos
    $pos += strlen($data);
    // return the data to send in the request; an empty string signals the end
    return $data;
}

// execute request, and show output for lolz
echo curl_exec($ch);
curl_close($ch);

Hopefully the comments give you enough idea how it all works.
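For my original use case (sending a slice of a large file without splitting it first), the same callback idea works with fseek/fread instead of an in-memory string.  Here's a rough, untested sketch of how I'd go about it – the URL, field name, offsets and chunk sizes are made up purely for illustration:

<?php

// Rough sketch (untested): stream bytes [$offset, $offset+$length) of a big
// file as a multipart/form-data file field, without splitting the file first.
// The URL, paths, field name and sizes here are made up for illustration.
$file   = '/path/to/huge.iso';
$offset = 100 * 1048576; // where the slice starts
$length = 50 * 1048576;  // how many bytes of it to send

$boundary = '---------------------------168279961491';
$head = "--$boundary\r\n"
      . "Content-Disposition: form-data; name=\"chunk\"; filename=\"huge.part\"\r\n"
      . "Content-Type: application/octet-stream\r\n\r\n";
$tail = "\r\n--$boundary--\r\n";

$fh = fopen($file, 'rb');
fseek($fh, $offset);

$buffer    = $head;   // data waiting to be handed to cURL
$remaining = $length; // file bytes still to be read

$ch = curl_init('http://example.com/upload');
curl_setopt_array($ch, array(
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_POST           => true,
    CURLOPT_HTTPHEADER     => array(
        'Content-Type: multipart/form-data; boundary='.$boundary,
        // the total size has to be known up front for Content-Length
        'Content-Length: '.(strlen($head) + $length + strlen($tail))
    ),
    CURLOPT_READFUNCTION   => function($ch, $fp, $len) use ($fh, $tail, &$buffer, &$remaining) {
        // top up the buffer from the file until we have $len bytes or run out
        while (strlen($buffer) < $len && $remaining > 0) {
            $read = fread($fh, (int)min($remaining, 65536));
            if ($read === false) $read = '';
            $remaining -= strlen($read);
            $buffer    .= $read;
            if ($remaining <= 0 || $read === '') {
                $remaining = 0;
                $buffer   .= $tail; // close off the multipart body
            }
        }
        $out    = (string)substr($buffer, 0, $len);
        $buffer = (string)substr($buffer, strlen($out));
        return $out; // returning '' tells cURL we're done
    }
));
echo curl_exec($ch);
curl_close($ch);

The callback only ever hands back up to $len bytes at a time, and returning an empty string is what tells cURL the body is finished.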

PMPs – Why do People Ignore Compression?

One thing I notice is that, for many portable devices, companies sell higher capacity versions at exorbitant premiums, when flash memory really isn't that expensive.  It seems to be less of an issue for players which do include a (mini/micro)SDHC expansion slot, as you can effectively increase capacity with a cheap add-on card.

But despite this, it seems that many people really do pay these excessive premiums for this increased storage.  I sometimes do wonder how people fill up so much space, eg getting a 32GB player over a 16GB one.  Surely these people have lots of videos and music, probably more than they need, and obviously, a higher capacity player allows them to carry more on the same device.

Whilst this is fine for the majority who aren't so technically inclined, I do wonder about the people who are more technically inclined, yet still overlook the other side of the equation.  For example:

Amount of music that can be stored = Storage capacity ÷ per song size

Now we want to be able to store more music (again, even if it’s a lot more than we need), but the general approach of simply upping storage capacity is only one part of the equation – most people, even more technically inclined people, seem to ignore the fact that you can also store more stuff by reducing the file sizes of media!

Admittedly, compressing stuff can take effort.  In fact, I've had a number of motivations that most probably never had, including the old days of trying to fit MP3s on floppies, squeezing as much as I could out of my 4GB hard drive, cramming music onto a 256MB MP3 player, and packing videos onto my 1GB PSP memory stick.  However, with a bit of reading, it's mostly a matter of sticking your music/videos into a batch converter and then copying everything across.  It's slightly less convenient when you add stuff (you probably need to pass that through a converter too), though, personally, I'm used to doing this, so I don't mind.
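For what it's worth, the "batch converter" doesn't have to be anything fancy.  A small script that shells out to an encoder does the job; here's a rough sketch of the kind of thing I mean (the paths are made up, and the lame/neroAacEnc flags are from memory, so check each encoder's own help before trusting them):

<?php

// Rough batch-conversion sketch: decode each MP3 with LAME, then re-encode to
// AAC with NeroAAC.  Paths and flags below are illustrative, not gospel.
$src = 'C:/Music';          // where the original MP3s live
$dst = 'C:/Music-portable'; // converted copies for the player

foreach (glob($src.'/*.mp3') as $mp3) {
    $name = basename($mp3, '.mp3');
    $wav  = $dst.'/'.$name.'.wav';
    $m4a  = $dst.'/'.$name.'.m4a';
    // decode to WAV (LAME can decode as well as encode)
    exec('lame --decode '.escapeshellarg($mp3).' '.escapeshellarg($wav));
    // re-encode to AAC; at low -q settings NeroAAC switches to HE-AAC by itself
    exec('neroAacEnc -q 0.25 -if '.escapeshellarg($wav).' -of '.escapeshellarg($m4a));
    unlink($wav); // don't keep the intermediate WAV around
}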

But does compression really yield much benefit?  From what I've seen, I'd say so.  It seems most people just dump their 128/192/256/320kbps MP3s (with 320kbps being especially common, as it's a popular choice on P2P) on the device and that's all they care about.  From the fact that most people cannot pick defects in 128kbps MP3s (let's just say LAME encoded), and from my own listening tests, I'd say that most people cannot hear defects in 56-64kbps HE-AAC (encoded with NeroAAC).  Support for this format is limited though (SBR is difficult to implement on embedded devices), although I believe Rockbox supports it, along with the latest iDevices (pre-late-2009 models do not support HE-AAC).  Next in line would be 80-96kbps OGG Vorbis, if your player supports it.  In fact, I cannot personally hear defects in 128kbps Vorbis, so even audiophiles could get a big space saving by using higher-bitrate Vorbis.  But support for Vorbis is surprisingly low, considering that it's a royalty-free codec.

An audio format with a fair bit of support would be LC-AAC (aka plain "AAC"), which achieves similar quality to 128kbps MP3 at around 96-112kbps (using NeroAAC or iTunes).  Failing that, using LAME to encode MP3s with a variable bitrate can yield decent quality at average bitrates around 112kbps.

Now if we assume that the average song is a 320kbps MP3 and the listener really can’t hear defects in 128kbps MP3s, and the underlying player supports HE-AAC, we could get a massive 320-56 = 264kbps saving (82.5% smaller!) by being a bit smarter in storing our music.  This equates to being able to store over 5 times more music in the same amount of space.  But of course, this is an optimal situation, and may not always work.  Even if we’re more conservative, and say that the average MP3 is 192kbps, and the underlying player only supports LC-AAC, we can still get a 50% reduction in size by converting the 192kbps MP3 to 96kbps LC-AAC, which equates to a doubling in storage space.

Videos are perhaps more difficult to get right, as the parameters involved in video encoding are significantly more complex than in audio encoding (and note that videos usually include an audio track too).  From what I've seen, significant space savings can be gained by encoding videos more intelligently, but it's hard to give rough figures, since most people do convert videos for their portable devices, just with a wide variety of applications and settings.  For reference, I see a lot of >100MB PSP-encoded anime episodes; I can personally get them down to around 30-40MB using an x264 CRF of 25 and a ~8MB audio stream (allowing me to easily fit a 12-episode anime series on a 1GB stick, with plenty of space to spare).

So for those who don’t compress their media, maybe give it a bit of a shot and see what space savings you can get.  You may be surprised at how much 16GB can really store.


Why would anyone buy an iMac?

People who know me probably know that I'm a lot more anti-Apple than I am anti-Microsoft, but that's beside the point here.

I was browsing some ads that got sent to my house today and saw one for an iMac (as Apple tightly controls prices, I'd expect them to be similar across stores), and I was seriously quite shocked at what was on offer.  The cheapest system had:

Intel i3 3GHz CPU
4GB RAM (probably DDR3)
500GB Harddisk
256MB ATI Radeon HD 4670 GPU
21.5in screen
MacOSX 10.6

All for AU$1598!  To put this in perspective, my current computer, which I bought in 2008 when the AUD crashed, cost me less and is still more powerful than the above.  This is what I paid:

Intel Core2Quad Q6600 [$295] (FYI: a C2D E8500 was about $285 at the time – comparison with i3)
4GB DDR2 1066MHz Kingmax RAM [$95]
640GB Samsung F1 7200rpm HDD [$89]
512MB ATI RadeonHD 4670 GPU [$119]
Gigabyte EP45-DS4P motherboard [$199] (that’s a rather expensive motherboard BTW)
Antec NSK6580 case with 430W Earthwatts PSU [$128]
Logitech Desktop 350 (basic kb+mouse) [$22]

…which totals $947.  If we add in a 21.5in screen [probably under $200 at the time] and a DVD burner [around $30 at the time], and even add in a copy of Windows (around $200), it's still significantly cheaper than the iMac is today, even disregarding the fact that the AUD was worth about 60% of what it's worth now, relative to the USD.  Oh, and yes, my system pretty much beats the iMac in every way, not to mention it's far more customisable and not as locked down as anything Apple makes.

Okay, Apple’s stuff is absurdly expensive; this is probably nothing new.  From what I’ve heard, people may buy Apple stuff for its design.  But is the design really any good?  I personally don’t think so.

Our Uni recently replaced all the library computers with iMacs (a different model to the one advertised, so I may be a little misinformed here) and I really don’t like their design in a number of ways.  After using one for a while, these are my thoughts so far:

The Screen and Machine

  • It’s big, heavy and somewhat cumbersome.  It appears you can only tilt the screen forward and backwards.  Although most screens (especially cheaper ones) don’t seem to be terribly adjustable, I much prefer the Dells in the IT labs, where you can adjust the height, swivel horizontally and rotate the screen itself on the stand.
  • It’s glossy.  I don’t know WTF people make glossy screens.  If I wanted to see my own face, I’d look in a mirror.  If I wanted to see that bright light behind me, which is reflecting off this stupid glossy screen, I’d look directly at it (but I wouldn’t, I’m not that stupid).  But when I’m looking at a screen, I want to see what’s actually on there.
  • I can’t seem to find any controls on the screen.  Maybe there’s some on the back, but I didn’t look too much.  Not that screen controls should be on the back anyway.
  • USB ports.  The last time I used a computer which didn’t have USB ports at the front was made about 10 years ago.  Apple helps you bring back those memories by not putting USB ports at the front (or sides).  As for the back USB ports, the number of them is somewhat limited…
    I did actually later realise that there were USB ports on the side of the keyboard.  I guess that’s a reasonable way to do things, though I still would be concerned whether these ports supply enough power for a portable HDD.
  • Actually, make it that there’s nothing useful on the front or sides of the screen.  The power button is conveniently located at the back of the screen, so if you want to turn it on, you’re going to have to pull the screen forward, and then turn it around so you can reach the button (making sure you don’t pull out any cords), then do the reverse to return the screen to its original position.
  • The back doesn’t appear to have that many ports, though I didn’t check much (not easy to), and certainly looks a lot less than what my Gigabyte EP45-DS4P motherboard supplies.
  • I still haven’t managed to find where the optical drive is…

The Keyboard

  • Is small and flat – very much like a laptop keyboard.  Maybe some people prefer laptop keyboards, but I don’t.
  • Has very few extra keys.  Fair enough I guess, but overall it seems like a cheapish keyboard and hardly anything I’d pay a premium for.  Quite usable though.
  • Doesn’t have a Windows key, for all those planning to install Windows on it (the Uni library iMacs all run Windows).  Fair enough from an Apple standpoint I guess.

The Mouse

  • The trackball is quite small.  At first I didn’t like it, but after a while of using it, it seems okay.  In fact, it being a ball allows you to horizontally scroll quite nicely, despite many applications not supporting horizontal scrolling, but I guess that’s not the mouse’s fault.
  • One-button design.  Despite its looks, the mouse can actually distinguish left, centre (the ball) and right button clicks reasonably well, however, only if you push your fingers in the right place.  Unfortunately, as this is a single button design, there isn’t really any clear way to feel where the right place is without looking, apart from finding the ball with your fingers and distinguishing left and right portions from there.  If you push too close to the centre though, you can inadvertently get the mouse to press the wrong button.
  • Following from the above, you cannot click the left and right mouse buttons at the same time.  Not important for most applications perhaps, though I know some games require both buttons to be pressed at the same time (or can be enhanced by it).
  • Like the keyboard, the mouse is fairly basic and has no extra side buttons and the like.  Hardly anything I’d pay a premium for.

So those are my thoughts on the iMac: seriously overpriced and badly designed.  Unless you absolutely must use OSX (and are unwilling to build a Hackintosh), or are just an avid Apple fanboi, I can’t see why anyone would rationally buy this hunk of junk.

New USB Stick

I’ve had a number of USB sticks in the past, and historically they tend to last around 2 years for me.  My current (well, actually, previous, now) stick is a Transcend 8GB, and I’ve already been using it for over 2.5 years, so I’ve been wondering if this thing is going to die.  Maybe it’s better made, maybe it’s just luck, but I decided to remove that risk factor and get myself a new stick just in case.  (Yes, I do manually back up data, but backups are only so good.)

Anyway, one of the things bothering me with this Transcend stick is its horrible speeds.  Portable apps like Firefox Portable take forever to load, and saving anything on the stick has a noticeable lag.  As USB drives are really cheap these days, I decided to look for a faster stick rather than a larger one.  I’m only using around 300-500MB anyway, and rarely go above 700MB unless I’m in the rare situation of transferring some large files (in which case, I don’t mind bringing my USB HDD along), so I could easily live with a 2GB stick, perhaps 4GB for good measure.

Unfortunately, it seems all the faster USB drives are also large.  Looking around, the best that appealed to me were the 8GB Corsair Voyager and Patriot XT Xporter Boost from Umart (which now sell for around $25).  Drives like the OCZ Throttle and Corsair Voyager GT I could only find in at least 16GB sizes, which cost significantly more, and I seriously don’t need all that space.

Then I saw that MSY were selling a Patriot Xporter Rage 8GB for $25, so I decided to get one of them.  After some Googling though, I was a little worried about whether it delivered its advertised speeds, having found a thread where users were complaining about the 16GB version’s write speeds, and hinting that only the larger drives (64GB) may actually deliver the advertised speeds (and I’m getting the smaller 8GB one).  But anyway, I went ahead and bought it (after they managed to get one in stock) for $24 (yay, $1 saving!).

Bringing it home, I found it’s formatted as FAT32 with a 64KB cluster (allocation unit) size by default.  I do seem to get around 25MB/sec on sequential writes (woot!).  A 64KB cluster size is a bit excessive, but as I don’t really care about space, I don’t mind it.

As for the physical drive itself, it’s slightly smaller than the Transcend, and I actually like its capless design.  On my old stick, there’s a little slider at the side, which you push forward to push out the USB connector.  On this one, you push the entire back part of the casing forward to reveal the connector.  One thing about capless designs is that applying pressure to the USB plug can cause it to retract (a pain if it gets loose and you don’t quite fit the connector in properly), but with the new Patriot drive you’re naturally applying pressure from the back of the stick, so it doesn’t really matter.  The outside is also slightly rubbery, though I don’t think the additional grip is of much importance.  The thing I don’t like is that it no longer has an activity indicator LED.

So, now that I have an 8GB stick, what to fill it up with?  As this is supposedly a fast drive, I decided to stick some bootable stuff on it, just in case I ever need it (unlikely, but oh well).  I’m too lazy to read up on making Linux boot drives, so I just used this and added some stuff that might come in handy – UBCD, System RescueCD and Ubuntu 10.10 (Knoppix and Bart’s PE might’ve been nice; it would also be nice to have a quick-booting text-based Linux distro which runs a shell script at bootup – might be useful for quickly performing some offline actions on a PC).

Unfortunately, the formatting process also reverts the drive’s cluster size to 4KB, but it seems that Acronis Disk Director, which I happened to have installed, is able to convert cluster sizes, so I upped it to 64KB.  The first time I tried, it didn’t work (maybe because I didn’t reboot the PC as it asked me to).  Out of interest, I noticed that Disk Director allows creating multiple filesystems on a USB stick (Windows disk management doesn’t allow this); however, it seems that Windows just ignores the other filesystems on the drive…  Anyway, I reformatted and recreated the drive a second time, upping the cluster size to 64KB, and it worked.  Except that I got some warnings in the bootloader about the cluster size being > 32KB.  Despite everything working, I decided to just convert the thing down to 32KB for good measure anyway.

So that’s the wondrous story of my new USB stick, where Firefox Portable doesn’t take forever to load.  It’ll probably mean I take up more space, since on my old drive I used to stick everything in self-extracting EXEs (which would extract to the C: drive and run from there, as sequential reads on the stick were reasonable, as opposed to random reads).

Oh, and I’m also running a git repo on there too, with SmartGit as my portable Git client. (tip, you don’t need the full MSYS Git for it to work, just git.exe and libiconv.dll seem to be enough)

Delving into Git

A few days ago, I decided to start a git repository (+ github account) for my XThreads MyBB plugin.  I never really believed this plugin to be complex enough to really require version control, but as I have never used git before, I decided to use it as an opportunity to test it out and gain some experience with it.

I’ve previously had experience with SVN, and felt that it was rather clunky for handling most personal projects, and the idea of a decentralised version control system (DVCS) somewhat appealed to me.  I did some research before diving into git – apparently, one of git’s criticisms is its steep learning curve.  But I still chose it over other systems such as Mercurial, mainly due to its popularity.  I decided to counter the difficult learning curve by using TortoiseGit to do most of the stuff for me.

So far, I really do quite like this system over SVN:

  • I no longer have to run an SVN server on my computer (don’t really like running background services that much), and/or have to always commit to a remote server
  • Potentially, I could put it on a USB and commit stuff when away from home (except no TortoiseGit; maybe there’s a nice Windows based git GUI somewhere…)
  • It doesn’t have pesky .svn folders all over the place, so you can simply copy out or package the stuff you’re distributing without having to worry about performing an export
  • Being all contained within a folder, you can easily make a backup copy of the repo if you wish, and restore it if you totally trash the repo
  • It seems a lot faster than SVN.  Maybe it’s because I’ve only got a very small amount of code, but it does certainly seem faster, despite the MSYS builds of Git apparently being slower than the Linux ones.

One negative is that it doesn’t have revision numbers like SVN does.  Possibly this is due to it being designed for non-linear development.  Maybe the feature is there somewhere (as git is said to be powerful) or can be implemented in some way, but it appears that the stock system doesn’t support revision numbers (I guess commit dates can be a reasonable proxy for linear development).

Using public key authentication over passwords is an interesting one – I haven’t really thought about it, but my 2 second intuition doesn’t seem to show much benefit.  Maybe it’s because it uses SSH rather than HTTP (SVN)?

As for GitHub, its issue tracker seems basic compared to something like Redmine.  I also noticed that it doesn’t seem to have the ability to diff between arbitrary revisions (which would be really useful IMO).  But I guess it’s probably sufficient for a personal project.

Overall, I somewhat like this system, and in fact may put a lot of my code under Git version control, if I can get a better USB drive and maybe learn the Git command line (or find a good portable Git GUI).

Compressing PSP ISOs with 7z deflate

Many applications which compress data do so using the free zlib library.  It’s relatively fast and provides a good speed/compression trade-off for most applications, but above anything else, I imagine its huge popularity is due to it being licensed under a very liberal license.

zlib implements the deflate compression algorithm (same algorithm used in the popular ZIP file format, as well as GZip, PNG, PDF etc), and supports 9 compression levels (10, if you include “no compression” as a level), 9 being the highest compression level (at the expense of compression speed).

It may be somewhat known that 7-Zip implements its own deflate encoder (not to be confused with the .7z format or its default LZMA algorithm), which tends to outperform zlib in terms of compression, at the expense of speed.  Some applications, such as AdvanceCOMP, have thus leveraged this to give some files slightly higher compression ratios than those created with most applications.

Now, all the CISO/CSO (compressed ISO) makers for compressing PSP ISOs use zlib, so I imagined that replacing it with 7-Zip’s deflate implementation would allow smaller CSOs to be made.

The CSO format compresses the ISO in blocks of 2048 bytes each (probably to match the ISO sector size), to allow the format to actually be read on the fly and played on the PSP.  This does, unfortunately, mean that there’s a hit to compression, and therefore a CSO will be larger than the same ISO compressed with GZip (assuming the same deflate algorithm and settings).

But the above structure also means that compression can easily be made multi-threaded, simply by handing blocks to different threads to compress.  So I decided to write such an application, not only to get the benefit of smaller CSOs, but also as an exercise in multi-threaded programming (I haven’t really done this before).  The latter was a bit of a challenge, and I may write a separate blog post on it, but I have managed to compress some ISOs without corrupting anything or deadlocking during the process.
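Stripped of the threading and the 7z-specific parts, the per-block logic is essentially “deflate each 2048-byte block, and store it raw if it doesn’t shrink enough”.  Here’s a single-threaded PHP illustration using plain zlib – not the real tool, and the actual CISO header/index writing is glossed over:

<?php

// Illustration only: the real compressor is a multi-threaded native app using
// 7-Zip's deflate; this just shows the per-block idea with zlib's gzdeflate.
$blockSize = 2048;   // matches the ISO sector size
$ncRatio   = 0.98;   // store a block raw if compression doesn't beat this ratio

$in    = fopen('game.iso', 'rb');
$out   = fopen('game.cso.body', 'wb'); // block data only; header/index omitted
$index = array();                      // offset of each block in the output

while (($raw = fread($in, $blockSize)) !== false && $raw !== '') {
    $packed   = gzdeflate($raw, 9);    // raw deflate, level 9
    $storeRaw = strlen($packed) >= strlen($raw) * $ncRatio;
    // (in the real format, I believe a flag bit in the index entry marks raw blocks)
    $index[]  = ftell($out);
    fwrite($out, $storeRaw ? $raw : $packed);
}
fclose($in);
fclose($out);
// a real implementation would now write the CISO header and the block index,
// then append the block data after them

Multi-threading it is then mostly a matter of farming the deflate calls out to worker threads while keeping the output blocks in order.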

Unfortunately, the results seem to be somewhat disappointing.  I recall trying to compress the European release of Loco Roco and only getting a CSO about 500KB smaller than with zlib level 9.  I just tried the DJ Max Portable 2 ISO, using 7z’s deflate, 5 passes, fastbytes = 255 and an NC ratio of 98% (if a block’s compressed size is above this ratio of the original, it’s stored uncompressed; the default ciso compressor doesn’t do this):

Uncompressed ISO: 1,778,319,360 bytes
Compressed with zlib 9: 1,637,928,332 bytes
Compressed with 7z: 1,636,999,514 bytes

So a disappointing gain of less than 1MB or 0.057%.  Granted, more can be done to achieve higher ratios (test with various fastbytes sizes, check with kzip, apply DeflOpt, maybe even reference duplicate blocks) but I doubt any of these would give even close to the gain I’ve gotten here, not to mention that they’d increase compression time by a lot.  The above process, with 6 threads, already took around 18 minutes on my Core 2 Quad Q6600 CPU @2.4GHz (was consuming 90%+ CPU throughout the process; I haven’t implemented I/O threading, which may allow it to get closer to 100%).  Compare this to the standard ciso compressor which (okay, I didn’t actually measure this) only took a few minutes to do, and this is only a single threaded app.

Oh well, at least I got some experience with multi-threaded programming, and I guess 1MB is still 1MB…  Maybe it’ll work better for popstation, as that seems to use a larger block size of 37,632 bytes (= 16×2352, i.e. sixteen CD-sized sectors).

Safe PHP expressions in templates

In regards to my PHP in Templates MyBB plugin, I’ve been thinking about the possibility of using “safe expressions”, that is, allowing <if> conditionals without allowing admins to enter “undesirable” PHP.

I came up with an idea yesterday, which I’m giving a shot at.

Basically, there’s probably only two main types of undesirable code:

  • arbitrary modifications of any kind
  • retrieving restricted information

So, any arbitrary PHP code which does neither of the above should be considered “safe”, although I admit that I feel a little edgy over this assumption.

For the first point, there are really only three ways to perform any modifications in PHP:

  • Assignment operations (=, +=, |=, ++ etc) – this can easily be blocked by finding them in the code (after removing strings); interestingly, PHP doesn’t allow expressions such as $a--$b, instead, they need to be written (IMO properly) as $a-(-$b)
  • Functions/statements (unset, fopen, mysql_query etc) – a whitelist of allowable functions could feasibly block this, although there’d need to be a huge list of allowable functions >_>
  • Executing processes (backtick operator, eg `ls -a`) – just simply block this operator

For the second point, the MyBB template system already allows some information gathering by printing variables (eg $_SERVER[...]), so I won’t consider that to be an issue; instead, I’ll block some constants, such as __FILE__ and PHP_OS, which don’t otherwise seem to be easily printable through the MyBB template system.  The other avenue is through functions/statements, which we’re already going to whitelist, so that shouldn’t be an issue.

After all that, we just have to consider a few “backdoors”:

  • Executing code within PHP strings, eg "{${phpinfo()}}"
  • Variable function calls, eg $func()

Hopefully, this catches all the bad expressions.
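To make the idea a bit more concrete, here’s a rough sketch of the kind of check I have in mind, built on PHP’s tokenizer.  This is nowhere near the actual plugin code, and the function whitelist below is just a stub for illustration:

<?php

// Rough sketch only: reject a template expression if it contains anything
// that looks like a modification, a non-whitelisted call, or a known backdoor.
function is_safe_expression($expr)
{
    // stub whitelist purely for illustration; the real list would be much longer
    $allowed_funcs = array('intval', 'strlen', 'strpos', 'in_array');
    $banned_tokens = array(
        T_PLUS_EQUAL, T_MINUS_EQUAL, T_MUL_EQUAL, T_DIV_EQUAL, T_MOD_EQUAL,
        T_CONCAT_EQUAL, T_AND_EQUAL, T_OR_EQUAL, T_XOR_EQUAL, T_SL_EQUAL, T_SR_EQUAL,
        T_INC, T_DEC,                                  // assignment/increment operators
        T_EVAL, T_INCLUDE, T_INCLUDE_ONCE, T_REQUIRE, T_REQUIRE_ONCE, T_UNSET,
        T_FILE, T_DIR, T_LINE,                         // magic constants to block
        T_DOLLAR_OPEN_CURLY_BRACES, T_CURLY_OPEN       // "{${...}}" style code-in-string
    );

    $tokens = token_get_all('<?php '.$expr.';');
    for ($i = 0, $n = count($tokens); $i < $n; $i++) {
        $tok = $tokens[$i];

        // single-character tokens come through as plain strings
        if (!is_array($tok)) {
            if ($tok === '=' || $tok === '`') return false; // assignment or backtick operator
            continue;
        }

        list($id, $text) = $tok;
        if (in_array($id, $banned_tokens, true)) return false;

        // find the next non-whitespace/comment token, to spot function calls
        $next = null;
        for ($j = $i + 1; $j < $n; $j++) {
            if (!is_array($tokens[$j]) ||
                ($tokens[$j][0] !== T_WHITESPACE && $tokens[$j][0] !== T_COMMENT)) {
                $next = $tokens[$j];
                break;
            }
        }

        // variable function call, eg $func()
        if ($id === T_VARIABLE && $next === '(') return false;

        // ordinary function call: only allow whitelisted names
        if ($id === T_STRING && $next === '(' &&
            !in_array(strtolower($text), $allowed_funcs, true)) {
            return false;
        }
    }
    return true; // nothing obviously bad found
}

Scanning tokens like this also takes care of the “after removing strings” part for free, since the tokenizer already knows what’s a string literal and what isn’t.  Named constants (eg PHP_OS) would show up as a T_STRING that isn’t followed by a parenthesis, so they could be checked against a constant whitelist in the same pass.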

I’m planning on releasing a separate version of the plugin which will not accept these bad expressions.

So busy…

Well, I haven’t updated this blog in quite a while.  I haven’t forgotten about it; rather, I’ve been hung up with so much to do lately.  I just finished my mid-semester exams, so at least that’s done – not that it means I’ll update this any time soon anyway, for the 2 people or 100 spammers who actually visit my blog.

/useless post

Horrible Excel save times on USB

I don’t have the fastest USB drive; in fact, it’s probably crappish, primarily due to the horrible latencies it has.

But when saving a ~450KB .xls file in Excel 2007 takes 2 minutes, something can’t be right.  Copying the file to a local hard drive and saving there only takes one second.  Copying the file back, another second (more or less).  Evidently Excel is doing some crazy seeking whilst writing the file, or something similar.  But why is it so bad???  Surely, these days, it could easily build the entire file in memory before physically writing it to disk?

Switched internet plans

Stuck in a churn request to switch internet plans a few days ago.  I’m moving from TPG’s 30+30GB (on/off peak) 512kbps plan to Exetel’s 15+”120″GB (on/off peak) 1.5Mbps plan (it’s more like 100GB off peak, since it’s like impossible to get 120GB through a 1.5Mbps line in 6 hours per day for 31 days).  Both plans cost AU$40/mo.

The transfer has completed now, but unfortunately I still seem to be getting slow speeds at night 🙁 so I guess the speeds are really a Telstra issue rather than an ISP one, which sucks, ’cause Telstra never fixes anything.

It does seem to be mostly going fine during the off peak time though, peaking at around 156KB/sec, averaging around 135KB/sec (which I guess is kinda crap, but probably Telstra’s issue again).

Unfortunately, TPG seemed to want a 30 day notification period or something, so we get charged a bit for that 🙁