PHP array performance when adding elements one by one vs. adding all of the data at once -- [Question Asked]

Issue

For the sake of simplicity, a simple example: I have an $array and a few keys and values that I want to add to it. Which is better, primarily from a performance perspective:

  1. adding all of those key-value pairs to the array in one statement, or
  2. adding them one by one, assuming it does no harm?

Option 1:

$array = [
    $key1 => $value1, 
    $key2 => $value2
];

Option 2:

$array[$key1] = $value1;
$array[$key2] = $value2;

Answer we found from sources

If you have a handful of keys/values, it will make absolutely no difference. If you deal in arrays with 100K+ members, it does actually make a difference. Let’s build some data first:

$r = [];
for($i = 1; $i <= 100000; $i++) {
    $r[] = $i; // for numerically indexed array
    // $r["k_{$i}"] = $i; // for associative array
    // array_push($r, $i); // with function call
}

This generates an array with 100,000 members, one by one. When added with a numeric (auto-)index, this loop takes ~0.0025 sec on my laptop, with memory usage at ~6.8MB. If I use array_push, it takes ~0.0065 sec due to the function-call overhead. When $i is added with a named key, it takes ~0.015 sec, with memory usage at ~12.8MB. So named keys are slower to define.
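For reference, the timings above can be reproduced with a minimal harness along these lines (absolute numbers will vary by machine and PHP version; the relative pattern is what matters):

```php
// Minimal benchmark harness for the loop above.
$t0 = microtime(true);
$m0 = memory_get_usage();

$r = [];
for ($i = 1; $i <= 100000; $i++) {
    $r[] = $i;              // numeric (auto-)index: fastest
    // $r["k_{$i}"] = $i;   // named key: slower, roughly double the memory
}

printf(
    "%.4f sec, %.1f MB\n",
    microtime(true) - $t0,
    (memory_get_usage() - $m0) / 1048576
);
```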

But would it make a difference if you shaved 0.015 sec down to 0.012 sec? Or, at 10x the volume, 0.15 sec down to 0.12 sec, or even 0.075 sec? Not really. It would only start becoming noticeable if you had 1M+ members. What you actually do with that volume of data will take much longer, and should be the primary focus of your optimization efforts.


Update: I prepared three files: one with the 100K integers from above defined in one set; another with the same 100K integers defined separately, one assignment per line; and a third with the data serialized as JSON. I loaded each and logged the time. It turns out that there is a difference: the "in one set" definition is 50% faster and more memory-efficient than the separate assignments, and deserializing the data from JSON is 3x faster still than including a "native array".

  • “In One Set”: 0.075 sec, 9.9MB
  • “As Separate”: 0.150 sec, 15.8MB
  • “From JSON”: 0.025 sec, 9.9MB
  • “From MySQL”: 0.110 sec, 13.8MB*

In short: if you define large arrays in native PHP format, define them in one go rather than bit by bit. If you load bulk array data from a file, loading JSON with json_decode(file_get_contents('data.json'), true) is significantly faster than include 'data.php'; with a native PHP array definition. Your mileage may vary with more complex data structures; however, I wouldn't expect the basic performance pattern to change. For reference: Source data at BitBucket.
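The two loading strategies can be sketched as follows; the file contents are written here so the example is self-contained, whereas real bulk-data files would already exist on disk:

```php
// Sketch of the two bulk-load strategies compared above.
$data = ['length' => 100, 'width' => 200, 'foobar' => 'possibly'];

$jsonFile = sys_get_temp_dir() . '/data.json';
$phpFile  = sys_get_temp_dir() . '/data.php';
file_put_contents($jsonFile, json_encode($data));
file_put_contents($phpFile, '<?php return ' . var_export($data, true) . ';');

// Loading JSON -- measurably faster at 100K+ members:
$fromJson = json_decode(file_get_contents($jsonFile), true);

// Including a native PHP array definition -- slower to parse at that volume:
$fromPhp = include $phpFile;

// Both strategies yield identical arrays.
```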

• A curious observation: generating the data from scratch, in the loop above, was actually much faster than loading/parsing it from a file with a ready-made array!

• MySQL: key-value pairs were fetched from a two-column table with PDO into an array matching the sample data, using fetchAll(PDO::FETCH_UNIQUE|PDO::FETCH_COLUMN).
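A hedged sketch of that key-value fetch: an in-memory SQLite database stands in for the MariaDB table so the example runs standalone, and the table/column names (kv, k, v) are illustrative, but the fetch-mode flags are the same ones used with MySQL:

```php
// Sketch: fetch key-value pairs into an array keyed by the first column.
// SQLite stands in for MariaDB; names kv, k, v are illustrative.
$pdo = new PDO('sqlite::memory:');
$pdo->exec('CREATE TABLE kv (k TEXT, v INTEGER)');
$pdo->exec("INSERT INTO kv (k, v) VALUES ('k_1', 1), ('k_2', 2), ('k_3', 3)");

// FETCH_UNIQUE makes the first selected column the array key;
// FETCH_COLUMN collapses the remaining column into the value.
$data = $pdo->query('SELECT k, v FROM kv')
            ->fetchAll(PDO::FETCH_UNIQUE | PDO::FETCH_COLUMN);

// $data is now e.g. ['k_1' => 1, 'k_2' => 2, 'k_3' => 3]
```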


Best practice: when defining data you actually need to work with (as opposed to "crude export/import" data that is never read or manually edited), construct your arrays in a manner that keeps your code easy to maintain. I personally find it "cleaner" to keep simple arrays "contained":

$data = [
    'length' => 100,
    'width' => 200,
    'foobar' => 'possibly'
];

Sometimes your array needs to “refer to itself” and the “bit-by-bit” format is necessary:

$data['length'] = 100;
$data['width'] = 200;
$data['square'] = $data['length'] * $data['width'];

If you build multidimensional arrays, I find it “cleaner” to separate each “root” dataset:

$data = [];
$data['shapes'] = ['square', 'triangle', 'octagon'];
$data['sizes'] = [100, 200, 300, 400];
$data['colors'] = ['red', 'green', 'blue'];

On a final note, by far the more limiting performance factor with PHP arrays is memory usage (see: array hashtable internals), which is unrelated to how you build your arrays. If you have massive datasets in arrays, make sure you don’t keep unnecessary modified copies of them floating around beyond their scope of relevance. Otherwise your memory usage will rocket.
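To illustrate that last point, here is a sketch of releasing a large source array once only the result is needed; the array_map call merely stands in for any transformation that produces a modified copy:

```php
// Sketch: drop large intermediate copies once they're no longer relevant.
$big = range(1, 100000);

$doubled = array_map(fn ($n) => $n * 2, $big); // full modified copy in memory
$withBoth = memory_get_usage();

unset($big); // release the original once only the result matters
$afterUnset = memory_get_usage();

// $afterUnset is now noticeably lower than $withBoth.
```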


• Tested on Win10 / PHP 8.1.1 / MariaDB 10.3.11 @ Thinkpad L380.

Answered By – Markus AO

This answer was collected from Stack Overflow and is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.


