In various parts of my code I want to remove a set of indices from a mapping. According to the documentation, the index passed to m_delete is of type mixed, so an array counts as a single index. It therefore seems I can only remove several indices from a mapping by calling m_delete on them one by one.
Code like:
  array removable_indices = ({ "a", "b", "c" });
  mapping data = ([ "a" : "A", "b" : "B", "d" : "D" ]);
  m_delete(data, removable_indices);
will not have the same result as removing indices "a", "b" and "c" through three separate m_delete statements.
Is there a way to tell Pike that I wish to delete a list of indices, so that Pike can perhaps perform the operation more efficiently than through separate statements? Or is there any chance such a statement or syntax might appear in the future?
(Hmmmm, I haven't checked m_delete(data, @removable_indices); )
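To illustrate the difference described above, here is a minimal sketch. It assumes a Pike version where the global write() is available and reuses the variable names from the example. It only prints the remaining indices after each attempt.

```pike
int main()
{
  array removable_indices = ({ "a", "b", "c" });
  mapping data = ([ "a" : "A", "b" : "B", "d" : "D" ]);

  // Passing the array itself: m_delete looks for a single index that is
  // this array, finds none, and leaves the mapping as it was.
  m_delete(data, removable_indices);
  write("array as index: %O\n", sort(indices(data)));

  // Deleting one index at a time; "c" is not present and is simply skipped.
  foreach (removable_indices, string ind)
    m_delete(data, ind);
  write("one by one:     %O\n", sort(indices(data)));
  return 0;
}
```

After the first call all three entries should still be there; after the loop only "d" should remain.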
Regards,
Arjan
I don't think there is a faster way than doing it the obvious way:
  foreach(removable_indices;; string ind) m_delete(data, ind);
You can do it with shorter code, but it will be somewhat slower, I guess:
  m_delete(data, removable_indices[*]);
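For readers unfamiliar with the [*] (automap) syntax, a small sketch of what the short form does, reusing the names from the question. The variable removed is only there for illustration: automap calls m_delete once per element and collects the return values into a new array, which is a plausible reason to expect some overhead compared to the plain foreach.

```pike
int main()
{
  array removable_indices = ({ "a", "b", "c" });
  mapping data = ([ "a" : "A", "b" : "B", "d" : "D" ]);

  // Automap: one m_delete call per element of removable_indices; the
  // values returned by those calls end up in the array "removed".
  array removed = m_delete(data, removable_indices[*]);

  write("remaining indices: %O\n", indices(data));
  write("calls made:        %d\n", sizeof(removed));
  return 0;
}
```

When the return values are not needed, as in the one-liner above, that array is built and thrown away.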
I did some timings under Pike 7.4, and the last one actually appears to be the faster way (timings in the next e-mail).
Thanks,
Arjan
Like this?
  ([1:2,3:4,17:42])-({1});
  (2) Result: ([ /* 2 elements */
                3: 4,
                17: 42
              ])
More like ([ 1:2, 3:4, 17:42 ]) - ({ 1, 3 }); but as far as the result goes, yes. As for performance (okay, I could only test under Pike 7.4), m_delete is much faster.
Testcode:

  array qresult = db->query("SELECT * FROM cvs_db_edit.search AS s ORDER BY utime_modified DESC LIMIT 1000");
  array remove_indices = predef::filter(indices(qresult[0]), has_value, '.');

  int aa_time;
  array qresult2;

  // In every test the time spent in copy_value() is added to the start
  // time of that test, so that it is excluded from the reported duration.

  // 1: mapping - array
  int a_time = gethrtime();
  for (int i=0; i<1000; i++) {
    aa_time = gethrtime();
    qresult2 = copy_value(qresult);
    a_time += (gethrtime()-aa_time);
    for (int i=0; i<sizeof(qresult2); i++) {
      qresult2[i] -= remove_indices;
    }
  }

  // 2: map() with a lambda that calls m_delete
  int b_time = gethrtime();
  for (int i=0; i<1000; i++) {
    aa_time = gethrtime();
    qresult2 = copy_value(qresult);
    b_time += (gethrtime()-aa_time);
    function l = lambda(string s, mapping m){m_delete(m,s);};
    for (int i=0; i<sizeof(qresult2); i++) {
      map(remove_indices, l, qresult2[i]);
    }
  }

  // 3: foreach(array, value)
  int c_time = gethrtime();
  for (int i=0; i<1000; i++) {
    aa_time = gethrtime();
    qresult2 = copy_value(qresult);
    c_time += (gethrtime()-aa_time);
    for (int i=0; i<sizeof(qresult2); i++) {
      foreach(remove_indices, string idx) {
        m_delete(qresult2[i], idx);
      }
    }
  }

  // 4: foreach(array;; value)
  int d_time = gethrtime();
  for (int i=0; i<1000; i++) {
    aa_time = gethrtime();
    qresult2 = copy_value(qresult);
    d_time += (gethrtime()-aa_time);
    for (int i=0; i<sizeof(qresult2); i++) {
      foreach(remove_indices;; string idx) {
        m_delete(qresult2[i], idx);
      }
    }
  }

  // 5: automap
  int e_time = gethrtime();
  for (int i=0; i<1000; i++) {
    aa_time = gethrtime();
    qresult2 = copy_value(qresult);
    e_time += (gethrtime()-aa_time);
    for (int i=0; i<sizeof(qresult2); i++) {
      m_delete(qresult2[i], remove_indices[*]);
    }
  }
  int f_time = gethrtime();

  Stdio.stdout->write("1 took %.2f msec\n", (b_time-a_time)/1000.00);
  Stdio.stdout->write("2 took %.2f msec\n", (c_time-b_time)/1000.00);
  Stdio.stdout->write("3 took %.2f msec\n", (d_time-c_time)/1000.00);
  Stdio.stdout->write("4 took %.2f msec\n", (e_time-d_time)/1000.00);
  Stdio.stdout->write("5 took %.2f msec\n", (f_time-e_time)/1000.00);
Output:
  1 took 19373.84 msec
  2 took 11036.73 msec
  3 took 7440.38 msec
  4 took 7604.60 msec
  5 took 7065.02 msec
I must admit I did not verify that the results are what they are supposed to be.
Regards,
Arjan
It seems I have to attach a warning to the timings I reported. When I added another testcase, without changing any of the other loops, testcase 5 became slower (approx. 7800 msec) than testcases 3 and 4 (still approx. 7500 msec). The new testcase 6, in which I changed the loop code for the qresult2 array but not the m_delete, is now consistently reported as slightly faster.
It would appear that the order in which the tests appear in the source code also has some influence on their performance. :-(
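One possible way to make such a comparison less sensitive to test order, sketched under assumptions: each variant is a separate function, the rounds are interleaved, and the best time per variant is kept. The names time_variant, with_foreach and with_automap, the stand-in rows and the count of 20 rounds are all made up for this illustration; they are not part of the test above.

```pike
// Hypothetical helper: run one variant on a fresh copy of the rows and
// return the elapsed microseconds, not counting the copy itself.
int time_variant(function(mapping, array:void) variant,
                 array(mapping) rows, array remove_indices)
{
  array(mapping) work = copy_value(rows);
  int start = gethrtime();
  foreach (work, mapping row)
    variant(row, remove_indices);
  return gethrtime() - start;
}

void with_foreach(mapping m, array idx)
{
  foreach (idx, mixed i) m_delete(m, i);
}

void with_automap(mapping m, array idx)
{
  m_delete(m, idx[*]);
}

int main()
{
  // Made-up stand-in data, not the database rows from the original test.
  array(mapping) rows = map(enumerate(1000), lambda(int n) {
    return ([ "s.id" : n, "s.name" : "x", "id" : n ]);
  });
  array remove_indices = ({ "s.id", "s.name" });

  array(function) variants = ({ with_foreach, with_automap });
  array(int) best = allocate(sizeof(variants), 0x7fffffff);

  // Interleave the variants and keep the best time seen for each one.
  for (int round = 0; round < 20; round++)
    foreach (variants; int n; function v)
      best[n] = min(best[n], time_variant(v, rows, remove_indices));

  foreach (best; int n; int usec)
    write("variant %d: best %.2f msec\n", n, usec / 1000.0);
  return 0;
}
```

This does not remove every source of noise, but differences that only come from where a loop sits in the source file should show up less.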
Regards,
Arjan
That doesn't run m_delete on the original mapping but instead copies all remaining entries to a new one.
-----Original message----- From: pike-devel-bounces@lists.lysator.liu.se [mailto:pike-devel-bounces@lists.lysator.liu.se] On behalf of Jonas Walldén @ Pike developers forum Sent: Wednesday, September 09, 2009 10:30 AM To: pike-devel@lists.lysator.liu.se Subject: multi-m_delete
That doesn't run m_delete on the original mapping but instead copies all remaining entries to a new one.
(This has to be a reply to the ([1:2,3:4,17:42])-({1}); version.)
Jonas, yes, that is correct, and it probably explains why it is the slowest of the variants I tested. Subtracting an array with several indices does express the idea I meant, only I would want it done destructively, on the original mapping.
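A small sketch of the difference between the two, assuming nothing beyond the built-in operators. The names original, alias and trimmed are only for illustration.

```pike
int main()
{
  mapping original = ([ 1 : 2, 3 : 4, 17 : 42 ]);
  mapping alias = original;   // second reference to the same mapping

  // Subtraction builds a new mapping and leaves the original alone.
  mapping trimmed = original - ({ 1, 3 });
  write("trimmed has %d entries\n", sizeof(trimmed));
  write("original still has %d entries\n", sizeof(original));

  // m_delete changes the mapping in place, so the alias sees it too.
  m_delete(original, 1);
  m_delete(original, 3);
  write("alias now has %d entries\n", sizeof(alias));
  return 0;
}
```

The subtraction leaves original with all three entries, while the m_delete calls also shrink what alias refers to.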
Regards,
Arjan
pike-devel@lists.lysator.liu.se